Introduction to HERE Anonymizer Preprocessor

HERE Anonymizer Preprocessor is an Apache Flink-based application that structures and organizes data into the format that works best with HERE Anonymizer Self-Hosted.

By ensuring the best quality of the input data, HERE Anonymizer Preprocessor significantly enhances the performance, reliability, and accuracy of HERE Anonymizer Self-Hosted.

HERE Anonymizer Preprocessor ensures that the data meets the following criteria required by HERE Anonymizer Self-Hosted:

A single file can contain one or more complete traces, but a single trace can't be split across multiple files.
Data must be in one of the supported formats.
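The first criterion can be illustrated with a small check that reports any trace whose data appears in more than one file. This is a minimal sketch and not part of the product: the class name TraceSplitCheck, the method findSplitTraces, and the in-memory map of filenames to trace IDs are all hypothetical stand-ins for real input files.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of HERE Anonymizer Preprocessor.
class TraceSplitCheck {

    // Returns every trace ID that occurs in two or more files,
    // mapped to the sorted set of filenames that contain it.
    static Map<String, Set<String>> findSplitTraces(Map<String, List<String>> traceIdsByFile) {
        Map<String, Set<String>> filesByTrace = new TreeMap<>();
        for (Map.Entry<String, List<String>> file : traceIdsByFile.entrySet()) {
            for (String traceId : file.getValue()) {
                filesByTrace.computeIfAbsent(traceId, id -> new TreeSet<>()).add(file.getKey());
            }
        }
        // Keep only the traces that violate the "one trace, one file" criterion.
        filesByTrace.values().removeIf(files -> files.size() < 2);
        return filesByTrace;
    }

    public static void main(String[] args) {
        // Assumed sample data: filename -> trace IDs found in that file.
        Map<String, List<String>> files = new LinkedHashMap<>();
        files.put("a.csv", Arrays.asList("t1", "t2"));
        files.put("b.csv", Arrays.asList("t1"));
        files.put("c.csv", Arrays.asList("t3"));

        System.out.println(findSplitTraces(files)); // prints {t1=[a.csv, b.csv]}
    }
}
```

An empty result means that every trace is contained in exactly one file, which is the state the input must be in before it reaches HERE Anonymizer Self-Hosted.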

The application runs in batch execution mode, in which the data source defines a fixed number of files to preprocess. It's delivered as the here-anonymizer-preprocessor.jar archive, which contains the application that you deploy to the target environment.

How it works

To ensure efficient handling of a large number of input files, HERE Anonymizer Preprocessor works in two phases:

Indexing: In the first phase, HERE Anonymizer Preprocessor examines all input files and creates an index based on filenames and trace IDs. The index determines how the data is read and organized into groups of related data in the next phase.
Preprocessing: In the second phase, HERE Anonymizer Preprocessor uses the index to combine related trace chunks. All data that belongs to the same trace is consolidated into a single file, so that no trace is split across multiple files.

As a result, HERE Anonymizer Self-Hosted can consume the data without any further restructuring.

The output of the HERE Anonymizer Preprocessor is a set of structured files, each containing one or more complete traces.
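The two phases described above can be sketched in plain Java. This is an illustration under stated assumptions, not the product's implementation: it doesn't use Apache Flink, in-memory maps stand in for input and output files, and the names TwoPhaseSketch, Chunk, buildIndex, and consolidate are hypothetical. For simplicity, the sketch produces one output group per trace, whereas the actual output files can each contain more than one complete trace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not part of HERE Anonymizer Preprocessor.
class TwoPhaseSketch {

    // One piece of trace data: the trace it belongs to and its content.
    static final class Chunk {
        final String traceId;
        final String payload;

        Chunk(String traceId, String payload) {
            this.traceId = traceId;
            this.payload = payload;
        }
    }

    // Phase 1 (indexing): map each trace ID to the files that hold its chunks.
    static Map<String, List<String>> buildIndex(Map<String, List<Chunk>> inputFiles) {
        Map<String, List<String>> index = new TreeMap<>();
        for (Map.Entry<String, List<Chunk>> file : inputFiles.entrySet()) {
            for (Chunk chunk : file.getValue()) {
                List<String> names = index.computeIfAbsent(chunk.traceId, id -> new ArrayList<>());
                if (!names.contains(file.getKey())) {
                    names.add(file.getKey());
                }
            }
        }
        return index;
    }

    // Phase 2 (preprocessing): for each trace, read only the indexed files
    // and collect all of its chunks into a single output group.
    static Map<String, List<String>> consolidate(Map<String, List<Chunk>> inputFiles,
                                                 Map<String, List<String>> index) {
        Map<String, List<String>> output = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : index.entrySet()) {
            List<String> payloads = new ArrayList<>();
            for (String fileName : entry.getValue()) {
                for (Chunk chunk : inputFiles.get(fileName)) {
                    if (chunk.traceId.equals(entry.getKey())) {
                        payloads.add(chunk.payload);
                    }
                }
            }
            output.put(entry.getKey(), payloads);
        }
        return output;
    }

    public static void main(String[] args) {
        // Assumed sample input: trace t1 is split across two files.
        Map<String, List<Chunk>> inputFiles = new LinkedHashMap<>();
        inputFiles.put("a.csv", Arrays.asList(new Chunk("t1", "p1"), new Chunk("t2", "q1")));
        inputFiles.put("b.csv", Arrays.asList(new Chunk("t1", "p2")));

        Map<String, List<String>> index = buildIndex(inputFiles);
        System.out.println(index);                          // prints {t1=[a.csv, b.csv], t2=[a.csv]}
        System.out.println(consolidate(inputFiles, index)); // prints {t1=[p1, p2], t2=[q1]}
    }
}
```

Building the index first means that the second pass only opens the files known to contain a given trace, which is the motivation for the two-phase design when the number of input files is large.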

Input and output formats

HERE Anonymizer Preprocessor supports the same input and output data formats as HERE Anonymizer Self-Hosted. To learn about the supported formats, see the Data message formats documentation.

Connectors

Built-in connectors allow you to integrate with different data sources, such as Amazon Simple Storage Service (S3) or Azure Blob Storage.

To learn more about the available connectors, see this topic.

Learn more

For more information on:

How to quickly set up and evaluate HERE Anonymizer Preprocessor, see How to run HERE Anonymizer Preprocessor locally in Docker.
The terms and conditions covering this documentation, see the HERE Documentation License.

HERE Privacy Charter

Data privacy is of fundamental importance to HERE and our customers. We practice data minimization and don’t collect data we don’t need. We promote pseudonymity for data subjects wherever a service does not require personal information to function. We employ privacy by design in services we develop. We strive to go beyond mere regulatory compliance and make privacy an integral part of our corporate culture. We believe that our approach to privacy is vital to earning and retaining the trust of our customers – and the bedrock of our future success as a data-driven location platform.

For more information about how HERE approaches data privacy, see the HERE Privacy Charter.