HERE Anonymizer Preprocessor configuration

Service configuration

You configure HERE Anonymizer Preprocessor by setting environment variables in its runtime environment.

For example, the Docker quick start uses this configuration:

  • Indexer:

    SOURCE_FORMAT=HERE_PROBE
    SOURCE_URI=s3+index://minioadmin:minioadmin@input-bucket/input?region=eu-somewhere&endpoint=http://s3:9000&cache_endpoint=ignite://ignite:10800
  • Preprocessor:

    SOURCE_FORMAT=HERE_PROBE
    SOURCE_URI=s3+preprocess://minioadmin:minioadmin@input-bucket/input?region=eu-somewhere&endpoint=http://s3:9000&cache_endpoint=ignite://ignite:10800
    SINK_URI=s3+batch+files://minioadmin:minioadmin@output-bucket/output?region=eu-somewhere&endpoint=http://s3:9000
    SINK_FORMAT=HERE_PROBE
📘

Note

The sample configuration above is intended for local in-Docker deployments. When deploying the application in other environments, set the environment variables according to the instructions in the relevant deployment guide.

SOURCE_URI

The SOURCE_URI variable defines the location of the data source.

In the example configuration, the input data source runs in Docker, local to the application. The data is read from the input-bucket bucket on the s3 host through the Amazon S3 data connector, using the minioadmin:minioadmin credentials.

For a full list of supported URI options, see Data connectors.
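To make the structure of the example SOURCE_URI easier to see, the following Python sketch splits it into its parts with the standard library. It is only an illustration: the field names (access_key, bucket, prefix, options) are labels chosen here, and reading the URI path as an object prefix is an assumption. The Data connectors page remains the authority on the supported options.

```python
from urllib.parse import urlsplit, parse_qs


def describe_uri(uri: str) -> dict:
    """Split a connector URI into labelled parts (labels are illustrative)."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,                  # e.g. "s3+index"
        "access_key": parts.username,            # text before ":" in the credentials
        "secret_key": parts.password,            # text after ":" in the credentials
        "bucket": parts.hostname,                # text between "@" and the first "/"
        "prefix": parts.path.lstrip("/"),        # assumption: path is an object prefix
        "options": {k: v[0] for k, v in parse_qs(parts.query).items()},
    }


example = (
    "s3+index://minioadmin:minioadmin@input-bucket/input"
    "?region=eu-somewhere&endpoint=http://s3:9000"
    "&cache_endpoint=ignite://ignite:10800"
)
info = describe_uri(example)
print(info["bucket"], info["options"]["endpoint"])
```

In this reading, the bucket name sits in the host position of the URI, while the actual S3 host is given by the endpoint option.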

📘

Note

You can use environment variable substitution when setting SOURCE_URI. For example, this is how you pass the accessKey and secretKey through environment variables:

SOURCE_URI=s3+index://${AWS_ACCESS_KEY_ID}:${AWS_SECRET_ACCESS_KEY}@input-bucket/input?region=eu-west-1
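The effect of the substitution can be previewed with a short Python sketch. This is not how the Preprocessor implements substitution internally (that is not described here); it only shows the value one would expect after the ${...} placeholders are replaced. The helper name expand_uri and the sample key values are hypothetical.

```python
import os
from string import Template


def expand_uri(template: str, env=None) -> str:
    """Replace ${NAME} placeholders with values from env (defaults to os.environ).

    Raises KeyError if a referenced variable is not set, which helps
    catch a missing credential before the service is started.
    """
    env = os.environ if env is None else env
    return Template(template).substitute(env)


template = (
    "s3+index://${AWS_ACCESS_KEY_ID}:${AWS_SECRET_ACCESS_KEY}"
    "@input-bucket/input?region=eu-west-1"
)
# Hypothetical placeholder values, not real credentials.
sample_env = {
    "AWS_ACCESS_KEY_ID": "EXAMPLEKEY",
    "AWS_SECRET_ACCESS_KEY": "exampleSecret",
}
print(expand_uri(template, sample_env))
```

Passing an explicit dictionary instead of the real environment keeps the preview free of actual secrets.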

SINK_URI

The SINK_URI variable defines the destination of the preprocessed output data.

In the example configuration, the output data is saved in Docker, local to the application. The data is written to the output-bucket bucket on the s3 host through the Amazon S3 data connector, using the minioadmin:minioadmin credentials.

📘

Note

You can use environment variable substitution when setting SINK_URI, in the same way as for SOURCE_URI.

SOURCE_FORMAT

The SOURCE_FORMAT variable defines the expected format of input data messages. The supported source formats are the same as for HERE Anonymizer Self-Hosted. To learn more, see the Data message formats documentation of HERE Anonymizer Self-Hosted.

SINK_FORMAT

The SINK_FORMAT variable defines the format of the output data. The supported sink formats are the same as for HERE Anonymizer Self-Hosted. To learn more, see the Data message formats documentation of HERE Anonymizer Self-Hosted.

POINTS_PER_FILE

The POINTS_PER_FILE variable defines the average number of points that the preprocessed files should contain.

Default value: 5000
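As a rough planning aid, the number of preprocessed files can be estimated from the total number of input points. The Python sketch below assumes nothing about how the Preprocessor actually splits data. Because POINTS_PER_FILE is an average, real file counts and sizes will differ. The function name estimate_file_count is hypothetical.

```python
import math


def estimate_file_count(total_points: int, points_per_file: int = 5000) -> int:
    """Estimate how many output files to expect for a given number of points."""
    if points_per_file <= 0:
        raise ValueError("points_per_file must be positive")
    # Round up: a partially filled file still counts as a file.
    return math.ceil(total_points / points_per_file)


print(estimate_file_count(1_000_000))        # 200 with the default of 5000
print(estimate_file_count(1_000_000, 20000)) # 50 with larger files
```

A higher value yields fewer, larger files, and a lower value yields more, smaller files.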

TILE_ZOOM_LEVEL

The TILE_ZOOM_LEVEL variable defines the zoom level of the map tiles used to partition the data. Map tiles aggregate traces by their origin, which allows HERE Anonymizer Self-Hosted to distribute input data across compute nodes tile by tile.

Default value: 12
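The idea of aggregating traces by the tile of their origin can be sketched in Python. The tiling scheme used by the Preprocessor is not specified here, so this example substitutes the widely used Web Mercator ("slippy map") tile formula purely as a stand-in. The function names and the trace representation (a list of latitude/longitude pairs) are likewise assumptions made for illustration.

```python
import math
from collections import defaultdict


def tile_for(lat: float, lon: float, zoom: int) -> tuple:
    """Return (x, y) tile indices under the Web Mercator scheme (stand-in only)."""
    n = 2 ** zoom  # number of tile columns (and rows) at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y


def group_by_origin_tile(traces, zoom: int = 12) -> dict:
    """Group traces by the tile that contains their first point."""
    groups = defaultdict(list)
    for trace in traces:
        lat, lon = trace[0]  # the origin is the first point of the trace
        groups[tile_for(lat, lon, zoom)].append(trace)
    return dict(groups)


traces = [
    [(52.5200, 13.4050), (52.5300, 13.4200)],  # starts in central Berlin
    [(52.5201, 13.4051), (52.5000, 13.3000)],  # starts a few metres away
    [(48.1370, 11.5750), (48.1500, 11.6000)],  # starts in Munich
]
groups = group_by_origin_tile(traces, zoom=12)
print(len(groups))  # 2: the two Berlin traces share an origin tile
```

Under this stand-in scheme, each additional zoom level halves the tile width, so a higher value spreads the same data over more, smaller tiles.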