How to run HERE Anonymizer Preprocessor locally in Docker
This section outlines how to quickly get started with and evaluate HERE Anonymizer Preprocessor by running the application locally in Docker. This allows you to get familiar with the product before deploying to a production environment.
Prerequisites
- Recommended: Linux-based or macOS machine.
- Windows machines: install WSL2 to run the supplied shell scripts.
- Install Docker with the docker and docker-compose commands in your PATH.
- Download the HERE Anonymizer Preprocessor ZIP archive.
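To confirm the command-line prerequisites before unpacking the archive, a short check along these lines may help. It is a minimal sketch: check_tools is a hypothetical helper name, and it only verifies that the commands are on your PATH, not that the Docker daemon is running.

```shell
#!/bin/sh
# Report, for each command name given, whether it is on PATH.
# Returns 0 only when every command is found.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

check_tools docker docker-compose || echo "Install the missing tools before starting the demo."
```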
Note
To get started, you must run the supplied shell scripts on your machine. Windows machines can't run these scripts natively. To run them on Windows, install WSL2.
Start the application
To run HERE Anonymizer Preprocessor locally in Docker, open a terminal window and navigate to the directory where you unpacked the HERE Anonymizer Preprocessor ZIP archive. Then run ./demo-start.sh (or demo-start.ps1 on Windows). The script sets up a local Docker cluster and starts the application.
Note
If you can't run the script on macOS or Linux, try fixing the executable permission by running chmod a+x ./demo-*.sh, or run $SHELL ./demo-start.sh instead.
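The two workarounds in this note can be folded into one small wrapper. The following is a sketch only: run_script is a hypothetical helper, and it falls back to sh when $SHELL is not set.

```shell
#!/bin/sh
# Run a script directly when it is executable; otherwise hand it to the
# current shell, which does not need the executable bit.
run_script() {
    script=$1
    shift
    if [ ! -f "$script" ]; then
        echo "not found: $script" >&2
        return 1
    fi
    if [ -x "$script" ]; then
        "$script" "$@"
    else
        "${SHELL:-sh}" "$script" "$@"
    fi
}

# Start the demo only when the script is present in the current directory.
if [ -f ./demo-start.sh ]; then
    run_script ./demo-start.sh
fi
```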
Once the application has started, you can access the following services:
- Flink web console is available at http://localhost:8081 as long as the preprocessor job is running.
- MINIO web console is available at http://localhost:9002. Use minioadmin:minioadmin for credentials.
- Dozzle monitoring and logging is available at http://localhost:8888.
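A quick way to see which of these consoles is currently reachable is to probe each URL. The sketch below assumes curl is installed; endpoint_status is a hypothetical helper that treats any HTTP response as "up". Expect the Flink console to report "down" once the preprocessor job has finished.

```shell
#!/bin/sh
# Print "up" when curl receives any HTTP response from the URL, else "down".
endpoint_status() {
    if curl --silent --output /dev/null --max-time 5 "$1"; then
        echo "up"
    else
        echo "down"
    fi
}

# Flink, MINIO, and Dozzle consoles, in that order.
for url in http://localhost:8081 http://localhost:9002 http://localhost:8888; do
    echo "$url $(endpoint_status "$url")"
done
```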
Input data
When you run the demo, HERE probe example data is copied to the MINIO S3 container. This data is then used as the input
data set. The example data is included in the HERE Anonymizer Preprocessor package on
the dist/deployments/common/here-probe-example-data path.
Preprocessed data
The HERE Anonymizer Preprocessor is a batch data processing application. It keeps the Flink cluster online only for the time required to process the input data set.
The cluster is shut down immediately after the input data set is processed.
To check the preprocessed data:
- Check the MINIO web console at http://localhost:9002:
  - To check the report of the indexing phase, go to Object Browser > input-bucket > input > INDEXER_REPORT.json. You should see HERE_indexer_input_files_total: 2 and HERE_indexer_input_files_dropped_corrupted: 0.
  - To check the report of the preprocessing phase, go to Object Browser > output-bucket > output > PREPROCESSOR_REPORT.json. You should see HERE_preprocessor_input_traces_total: 2, HERE_preprocessor_input_points_total: 200, HERE_preprocessor_output_points_total: 200, HERE_preprocessor_output_files_total: 2, and HERE_preprocessor_output_traces_total: 2.
  - To see the preprocessed data, go to Object Browser > output-bucket > output > preprocessed-data.
- Check Dozzle at http://localhost:8888 to see the logs of the *_jobmanager and *_taskmanager containers.
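If you download a report from the MINIO console, the expected values can also be checked from the command line. This is a sketch under assumptions: check_metric is a hypothetical helper, the local file path is illustrative, and the report is assumed to list each metric as a name, a colon, and an integer (with or without JSON quotes).

```shell
#!/bin/sh
# Succeed when FILE contains METRIC with exactly the EXPECTED integer value.
# The pattern tolerates optional JSON quotes and whitespace around the colon,
# and rejects longer numbers (2 does not match 200).
check_metric() {
    file=$1
    metric=$2
    expected=$3
    if grep -Eq "\"?${metric}\"?[[:space:]]*:[[:space:]]*${expected}([^0-9]|\$)" "$file"; then
        echo "ok:       $metric = $expected"
    else
        echo "mismatch: $metric (expected $expected)"
        return 1
    fi
}

# Assumed location of a report downloaded from the MINIO console.
report=./INDEXER_REPORT.json
if [ -f "$report" ]; then
    check_metric "$report" HERE_indexer_input_files_total 2 || true
    check_metric "$report" HERE_indexer_input_files_dropped_corrupted 0 || true
fi
```

The same helper can be pointed at a downloaded PREPROCESSOR_REPORT.json with the preprocessor metrics listed above.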
Stop the app
Run ./demo-stop.sh to stop the app and shut down all running containers.
Further reading
To better understand how Docker deployments work and take this concept to production, see Standalone deployment.
Troubleshooting
Consult these procedures if you run into issues when running the example.
Can't start the example
- For macOS and Linux: ensure that the demo-start.sh and demo-stop.sh scripts have the +x permission. Run chmod +x ./demo-*.sh.
- Make sure you installed Docker and Docker Compose.
    $ docker -v
    # Docker version 20.10.23, build 7155243
    $ docker-compose -v
    # docker-compose version 1.29.2, build 5becea4c
- Ensure that there are no containers left running from a previous time you worked with the example. Run ./demo-stop.sh and list all running containers.
    $ docker ps
    # CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
    $
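The leftover-container check can be scripted so that it gives a clear yes or no answer. The sketch below uses a hypothetical helper, running_container_count, and counts every running container on the machine, not only those started by the demo.

```shell
#!/bin/sh
# Count running containers: docker ps -q prints one container ID per line.
# "|| true" keeps the exit status at 0 when grep counts zero lines.
running_container_count() {
    docker ps -q 2>/dev/null | grep -c . || true
}

count=$(running_container_count)
if [ "$count" -eq 0 ]; then
    echo "No containers are running; it is safe to start the demo."
else
    echo "$count container(s) still running; run ./demo-stop.sh first."
fi
```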
Demo starts, but the ***-jobmanager container fails
- Configure the demo logs at http://localhost:8888/settings to show dropped containers.
- Find the ***-jobmanager container.
- Check its logs for ERROR or Exception messages.
Error connecting to IgniteCache container
If you get the IgniteCache container error:
Exception in thread "main" com.here.anonymization.data.preprocessor.cache.IgniteCacheException: Fail to initiate cache for the preprocessor.
at com.here.anonymization.data.preprocessor.Indexer.initiateCache(Indexer.java:84)
at com.here.anonymization.data.preprocessor.Indexer.main(Indexer.java:46)
// ...
Check that the container is running with the correct port mapping (10800) and that the cache_endpoint parameter is correctly defined in the SOURCE_URI.
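One way to verify the port mapping is to search the Ports column of docker ps for the published port. This is a sketch: containers_publishing_port is a hypothetical helper, and it relies on Docker's usual Ports notation, such as 0.0.0.0:10800->10800/tcp.

```shell
#!/bin/sh
# List running containers that publish the given host port.
# Each output line has the form "<name> <ports>".
containers_publishing_port() {
    docker ps --format '{{.Names}} {{.Ports}}' 2>/dev/null | grep -E ":$1->" || true
}

match=$(containers_publishing_port 10800)
if [ -n "$match" ]; then
    echo "$match"
else
    echo "No running container publishes port 10800."
fi
```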
Error connecting to MINIO S3 container
If you get the MINIO S3 container error:
Exception in thread "main" java.util.concurrent.ExecutionException: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999)
at com.here.anonymization.data.preprocessor.Indexer.main(Indexer.java:67)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144)
// ...
Check that the s3 container is running with the correct port mapping (9000), that the init-s3 container has completed successfully, and that the endpoint parameter is correctly defined in both the SOURCE_URI and SINK_URI.
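To see whether init-s3 completed successfully, you can read the Status column of docker ps -a, where a clean exit appears as "Exited (0) ...". The sketch below uses a hypothetical helper, classify_status, and assumes the container names contain "s3", as the names s3 and init-s3 suggest.

```shell
#!/bin/sh
# Map one "<name> <status>" line from docker ps -a to a short verdict.
classify_status() {
    case $1 in
        *"Exited (0)"*) echo "completed" ;;
        *" Up "*)       echo "running" ;;
        *"Exited"*)     echo "failed" ;;
        *)              echo "other" ;;
    esac
}

# The name filter matches substrings, so "s3" also lists init-s3.
docker ps -a --filter "name=s3" --format '{{.Names}} {{.Status}}' 2>/dev/null |
while read -r line; do
    echo "$(classify_status "$line"): $line"
done
```

With this sketch, the s3 container should be reported as "running" and init-s3 as "completed".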
Data preprocessing doesn't work
Check the INDEXER_REPORT in the MINIO web console.
- If HERE_indexer_input_files_dropped_corrupted has increased (+2), check if the SOURCE_FORMAT is configured correctly and check if the input data follows the configured format. Enable the DEBUG logging level and check the logs for more details.
- If HERE_indexer_input_files_dropped_corrupted is 0 and HERE_indexer_input_files_total is 2, the indexer should have run correctly. Check the PREPROCESSOR_REPORT.
- If the PREPROCESSOR_REPORT doesn't exist or shows a deviation in any of the following metrics, check the configuration entities (SOURCE_URI, SINK_URI) for the preprocessor and investigate the job manager logs for errors or exceptions.
HERE_preprocessor_input_traces_total: 2
HERE_preprocessor_input_points_total: 200
HERE_preprocessor_output_traces_total: 2
HERE_preprocessor_output_points_total: 200
HERE_preprocessor_output_files_total: 2
Manifest file size exceeds JDK limits
Certain versions of the Java Development Kit (JDK) require the manifest file to stay within a file size limit. When the file exceeds the limit, you get this error:
Unsupported size: xxx for JarEntry META-INF/MANIFEST.MF. Allowed max size: 8000000 bytes
To fix this issue, increase the maximum manifest file size by setting the jdk.jar.maxSignatureFileSize system property through the JAVA_TOOL_OPTIONS environment variable, which passes startup options and arguments to the Java Virtual Machine (JVM).
For example, use the following configuration to set the maximum file size to approximately 22 MB:
JAVA_TOOL_OPTIONS=-Djdk.jar.maxSignatureFileSize=22000000
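For a shell session, the setting could be applied as sketched below. This assumes the JVM is started from the same shell; how the value reaches a JVM running inside the demo containers is not covered here and depends on the deployment configuration. The sketch appends to any existing JAVA_TOOL_OPTIONS value instead of overwriting it.

```shell
#!/bin/sh
# Append the property to JAVA_TOOL_OPTIONS, keeping any options already set.
export JAVA_TOOL_OPTIONS="${JAVA_TOOL_OPTIONS:+$JAVA_TOOL_OPTIONS }-Djdk.jar.maxSignatureFileSize=22000000"

# A JVM that reads the variable prints "Picked up JAVA_TOOL_OPTIONS: ..."
# at startup, which confirms the option was passed.
if command -v java >/dev/null 2>&1; then
    java -version || true
fi
echo "JAVA_TOOL_OPTIONS=$JAVA_TOOL_OPTIONS"
```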