How to run HERE Anonymizer Self-Hosted locally in Docker

This section explains how to quickly evaluate HERE Anonymizer Self-Hosted by running the application locally in Docker, so that you can get familiar with the product before deploying it to production.

Prerequisites

  • Recommended: Linux-based or macOS machine
  • Docker installed with docker and docker-compose commands in your PATH
  • Downloaded HERE Anonymizer Self-Hosted ZIP archive

Note

To get started, you must run the supplied shell scripts on your machine. Windows machines can't run these scripts natively. To run them on Windows, install WSL2.
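
As a quick way to confirm the prerequisites above, the following sketch reports which of the required commands are missing from your PATH. The helper name missing_commands is made up for this illustration and is not part of the product.

```shell
#!/bin/sh
# Hypothetical helper: print every command name given as an argument
# that cannot be found in PATH, one per line.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

missing=$(missing_commands docker docker-compose)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing from PATH:" $missing
else
  echo "All prerequisites found."
fi
```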

Start the application

To run HERE Anonymizer Self-Hosted locally in Docker, open a terminal window and navigate to the directory where you unpacked the HERE Anonymizer Self-Hosted ZIP archive. Then run ./demo-start.sh (or demo-start.ps1 on Windows). The script sets up a local Docker cluster and starts the application.

To run in Docker with monitoring enabled, use ./demo-start.sh --profile monitoring on macOS or Linux, or demo-start.ps1 monitoring on Windows.

Note

If you can't run the script on macOS or Linux, try fixing the executable permission by running chmod a+x ./demo-*.sh or run $SHELL ./demo-start.sh instead.
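
The fallback described in this note can be sketched as a small wrapper that runs a script directly when it is executable and passes it to your current shell otherwise. The function name run_script is hypothetical and not part of the archive.

```shell
#!/bin/sh
# Hypothetical wrapper: run the script directly if it has the executable
# bit, otherwise hand it to the current shell (sh if $SHELL is unset).
run_script() {
  script=$1
  shift
  if [ -x "$script" ]; then
    "$script" "$@"
  else
    "${SHELL:-sh}" "$script" "$@"
  fi
}

# Start the demo only when the script is present in the current directory.
if [ -f ./demo-start.sh ]; then
  run_script ./demo-start.sh
else
  echo "demo-start.sh not found; change to the unpacked archive directory first." >&2
fi
```

Any arguments after the script name are passed through, so run_script ./demo-start.sh --profile monitoring would start the demo with monitoring enabled.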

When the application starts, you can access the following links:

When monitoring is enabled, you can also access the Prometheus and Grafana links.

Process data

To try processing some data, use the supplied example in HERE_PROBE format. You can publish it from the RabbitMQ console under Queues -> input-queue -> Publish message, or directly from the command line by running:

docker exec -i demo_rabbit_1 rabbitmqadmin publish exchange=amq.default routing_key="input-queue" < deployments/common/here-probe-example.json
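
The container name can differ between setups. For example, newer Docker Compose releases join name parts with hyphens instead of underscores. The following sketch looks up the running RabbitMQ container by name before publishing. It assumes that the container name contains the string rabbit, and the helper first_matching_name is made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the first line from stdin that contains
# the given substring.
first_matching_name() {
  grep -F -- "$1" | head -n 1
}

payload=deployments/common/here-probe-example.json

if command -v docker >/dev/null 2>&1 && [ -f "$payload" ]; then
  # Assumption: the RabbitMQ container has "rabbit" in its name.
  rabbit=$(docker ps --format '{{.Names}}' 2>/dev/null | first_matching_name rabbit || true)
  if [ -n "$rabbit" ]; then
    docker exec -i "$rabbit" rabbitmqadmin publish \
      exchange=amq.default routing_key="input-queue" < "$payload"
  else
    echo "No running container with 'rabbit' in its name was found." >&2
  fi
else
  echo "Docker or $payload is not available; run this from the unpacked archive directory." >&2
fi
```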

To check the processed data:

  • Check the Flink web console: Running Job List -> OPA -> <choose the tasks> -> Accumulators. You should get OPA_decoding_point_info_all=425 and other metrics. Additionally, check the OPA_output_chunk_info_all value and verify that it falls in the expected range of 10-15.
  • Check the anonymized, chunked data in the RabbitMQ console: Queues -> input-queue -> Get messages. Enter the OPA_output_chunk_info_all value in the Messages: field to reveal all chunks.
  • Check logs of *_jobmanager and *_taskmanager containers.
  • See the demo_notebook.ipynb file in JupyterLab for additional visualizations.

Stop the app

Run ./demo-stop.sh to stop the app and shut down all running containers.

Further reading

To better understand how Docker deployments work and take this concept to production, see Standalone deployment.

Troubleshooting

Consult these procedures if you run into issues when running the example.

Can't start the example

  • For macOS and Linux: Ensure that the demo-start.sh and demo-stop.sh scripts have the +x permission by running chmod +x ./demo-*.sh.

  • Make sure you installed Docker and Docker Compose:

    $ docker -v
    # Docker version 20.10.23, build 7155243
    $ docker-compose -v
    # docker-compose version 1.29.2, build 5becea4c
  • Ensure that no containers are left running from a previous run of the example. Run ./demo-stop.sh and list all running containers:

    $ docker ps
    # CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
    $

Example starts, can't access Flink console

  1. In the demo logs view, enable Options -> Show dropped containers.
  2. Find the ***-job-manager container.
  3. Check its logs for ERROR or Exception messages
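
If you prefer the command line, the log check above can be sketched as follows. The name pattern job-?manager is an assumption that covers both the jobmanager and job-manager spellings, and the helper error_lines is made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: keep only log lines that mention ERROR or Exception.
# "|| true" keeps the exit status at 0 when nothing matches.
error_lines() {
  grep -E 'ERROR|Exception' || true
}

if command -v docker >/dev/null 2>&1; then
  # -a also lists stopped containers, which matters if the job manager exited.
  for name in $(docker ps -a --format '{{.Names}}' 2>/dev/null | grep -E 'job-?manager' || true); do
    echo "== $name =="
    docker logs "$name" 2>&1 | error_lines
  done
else
  echo "docker was not found in PATH." >&2
fi
```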

A common cause of errors is a mistake in the anonymization configuration. In that case, you see an error similar to this:

java.lang.IllegalStateException: Unable to load anonymization library configuration
...
com.fasterxml.jackson.databind.exc.UnrecognizedPropertyException:
Unrecognized field "someBadName" (class com.here.anonymization.core.config.
anonymization.SplitAndGapConfig$SplitAndGapConfigBuilder),
not marked as ignorable (11 known properties: "startCut", "samplingRate",
"outputTraceDuration", "type", "removeZeroSpeedPoints", "endCutExpression",
"stayPoint", "startCutExpression", "destinationObfuscationDistance",
"gapDuration", "endCut"])

Fix the anonymization configuration, restart the example, and try accessing the console again.

Data anonymization doesn't work

  1. Note the current values of the anonymization metrics HERE_decoding_message_info_all, HERE_decoding_point_info_all, HERE_decoding_message_dropped_corrupted, HERE_output_message_info_all, and HERE_output_point_info_all. Then publish a message to the input queue and check how these metrics changed.
  2. If HERE_decoding_message_dropped_corrupted increased by 1, check that SOURCE_FORMAT is configured correctly and that the input data follows the configured format. Enable the DEBUG logging level and check the logs for a detailed error message.
  3. If HERE_decoding_message_info_all and HERE_decoding_point_info_all changed but HERE_output_message_info_all and HERE_output_point_info_all did not, check the HERE_anonymization_point_dropped_* metrics to find out why the points were dropped. To rule out the anonymization settings, you can temporarily disable anonymization by applying the no-anonymization.conf configuration to simple-anonymization.conf.
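
To make the before-and-after comparison in these steps less error-prone, you can copy the metric values into two text files and compare them. This is only a sketch: the name=value snapshot format, the file names, and the helper metric_deltas are assumptions made for this example, and the sample values are invented.

```shell
#!/bin/sh
# Hypothetical helper: print "name delta" for every metric whose value
# differs between two snapshot files with one "name=value" line each.
# Assumes the first snapshot is not empty.
metric_deltas() {
  awk -F= '
    NR == FNR { before[$1] = $2; next }   # first file: remember old values
    ($2 - before[$1]) != 0 { printf "%s %+d\n", $1, $2 - before[$1] }
  ' "$1" "$2"
}

# Made-up sample snapshots taken before and after publishing one message.
tmp=$(mktemp -d)
printf '%s\n' \
  'HERE_decoding_message_info_all=3' \
  'HERE_decoding_message_dropped_corrupted=0' \
  'HERE_output_message_info_all=3' > "$tmp/before.txt"
printf '%s\n' \
  'HERE_decoding_message_info_all=4' \
  'HERE_decoding_message_dropped_corrupted=1' \
  'HERE_output_message_info_all=3' > "$tmp/after.txt"

metric_deltas "$tmp/before.txt" "$tmp/after.txt"
rm -r "$tmp"
```

With these sample values, the output lists HERE_decoding_message_info_all and HERE_decoding_message_dropped_corrupted with a delta of +1 each, which corresponds to the corrupted-message case in step 2.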

Manifest file size exceeds JDK limits

Certain versions of the Java Development Kit (JDK) require the manifest file to stay within a file size limit. When the file exceeds the limit, you get this error:

Unsupported size: xxx for JarEntry META-INF/MANIFEST.MF. Allowed max size: 8000000 bytes

To fix this issue, increase the maximum manifest file size by setting the jdk.jar.maxSignatureFileSize system property through the JAVA_TOOL_OPTIONS environment variable, which passes startup options and arguments to the Java Virtual Machine (JVM).

For example, use the following configuration to set the maximum file size to approximately 22 MB:

JAVA_TOOL_OPTIONS=-Djdk.jar.maxSignatureFileSize=22000000
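
In a shell session, the setting above could be applied as in the following sketch, which appends the property to any options that are already set instead of overwriting them. Where the variable has to be set for the containers in your deployment is not covered here, so treat this as an illustration of the mechanism only.

```shell
#!/bin/sh
# Append the property to JAVA_TOOL_OPTIONS, keeping any existing options.
JAVA_TOOL_OPTIONS="${JAVA_TOOL_OPTIONS:+$JAVA_TOOL_OPTIONS }-Djdk.jar.maxSignatureFileSize=22000000"
export JAVA_TOOL_OPTIONS

# If a JVM is installed, it reports the options it picked up on startup.
if command -v java >/dev/null 2>&1; then
  java -version 2>&1 | grep 'Picked up JAVA_TOOL_OPTIONS' || true
fi
```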