Guides

Double processing

When working with Docker deployments of HERE Anonymizer Self-Hosted, you can create a setup in which two Anonymizer instances process the same raw probe data stream. This allows you to compare the results of the two anonymization processes and select the output data that matches your criteria.

                        ----- Anonymizer A --- Anonymized stream A
                        |
Raw probes stream ------|
                        |
                        ----- Anonymizer B --- Anonymized stream B

The implementation depends on the type of source connector used.

RabbitMQ

An example for RabbitMQ is available in the deployments/docker/ab-case/docker-compose.yml file. It contains two Flink JobManagers and two Flink TaskManagers, with each TaskManager pointing to its corresponding JobManager.

Follow these steps to run the example and see the setup in action:

  1. From the application directory, build the here-anonymizer:latest image.

    docker build -t here-anonymizer:latest .
  2. Run the example.

    cd ./deployments/docker/ab-case/
    docker-compose up -d
  3. Access the consoles of the two Anonymizer instances.

  4. Deliver every incoming message to both instances by declaring a fanout exchange in RabbitMQ and binding multiple queues to it. To do this, run the supplied script.

    ./configure-routing-and-publish.sh

    Running this script:

    • Declares two queues: input-queue-a and input-queue-b
    • Declares fanout exchange input-queue
    • Binds input-queue to both input-queue-a and input-queue-b
    • Publishes deployments/common/here-probe-example.json to input-queue exchange
  5. In each console, go to Running Jobs -> HERE Anonymizer -> Accumulators to check the anonymization statistics of that Anonymizer instance.

    Because the instances use different configuration files, the values of HERE_output_point_info_all and HERE_anonymization_point_dropped_start differ between the instances. The difference comes from the different startCut.publishAfter values.

  6. Stop the example.

    docker-compose down
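
The routing that the script in step 4 declares can be sketched with the legacy rabbitmqadmin command-line tool. This is an illustration under stated assumptions, not the contents of configure-routing-and-publish.sh: the supplied script may use a different tool, and the RABBIT variable and the setup_fanout function are hypothetical names introduced here. By default the sketch only prints the commands (dry run); it leaves out the publishing step.

```shell
#!/bin/sh
# Sketch only: assumes the rabbitmqadmin CLI is installed and can reach the broker.
# RABBIT is a hypothetical variable. Its default prints each command instead of
# running it; set RABBIT=rabbitmqadmin to apply the commands to a broker.
RABBIT="${RABBIT:-echo rabbitmqadmin}"

setup_fanout() {
  # One input queue per Anonymizer instance.
  for q in input-queue-a input-queue-b; do
    $RABBIT declare queue name="$q"
  done

  # A fanout exchange copies every message to all queues bound to it.
  $RABBIT declare exchange name=input-queue type=fanout

  # Bind both queues so that each instance receives the full stream.
  for q in input-queue-a input-queue-b; do
    $RABBIT declare binding source=input-queue destination="$q"
  done
}

setup_fanout
```

Because a fanout exchange ignores routing keys, any message published to input-queue lands in both input-queue-a and input-queue-b, which is what lets the two instances process identical input.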

HERE platform stream layers

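HERE platform stream layers are based on Kafka. As a result, you can use the same approach as when configuring a HERE platform stream URI: give each Anonymizer instance its own consumerGroup value so that each instance reads the full stream independently.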

jobmanager-a:
  image: here-anonymizer:latest
  ### ...
  environment:
    - SOURCE_URI=olp+stream://platform.here.com/hrn:here:data::my-org:my-catalog/my-stream-layer?consumerGroup=anonA
    ### ...

jobmanager-b:
  image: here-anonymizer:latest
  ### ...
  environment:
    - SOURCE_URI=olp+stream://platform.here.com/hrn:here:data::my-org:my-catalog/my-stream-layer?consumerGroup=anonB
    ### ...
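
Before starting both instances, it can help to confirm that the two URIs really name different consumer groups, since Kafka consumers that share a group split the messages between them instead of each receiving all of them. The following is a minimal sketch of such a check. The consumer_group and distinct_groups functions are hypothetical helpers introduced for illustration and are not part of HERE Anonymizer; the URIs are the placeholder values from the example above.

```shell
#!/bin/sh
# Hypothetical helper: print the consumerGroup query parameter of a stream URI.
consumer_group() {
  printf '%s\n' "$1" | sed -n 's/.*[?&]consumerGroup=\([^&]*\).*/\1/p'
}

# Hypothetical helper: succeed only if both URIs set consumerGroup
# and the two values differ.
distinct_groups() {
  group_a="$(consumer_group "$1")"
  group_b="$(consumer_group "$2")"
  [ -n "$group_a" ] && [ -n "$group_b" ] && [ "$group_a" != "$group_b" ]
}

# Placeholder URIs copied from the example configuration.
URI_A='olp+stream://platform.here.com/hrn:here:data::my-org:my-catalog/my-stream-layer?consumerGroup=anonA'
URI_B='olp+stream://platform.here.com/hrn:here:data::my-org:my-catalog/my-stream-layer?consumerGroup=anonB'

if distinct_groups "$URI_A" "$URI_B"; then
  echo "OK: each instance uses its own consumer group"
else
  echo "Warning: consumer groups are missing or identical" >&2
fi
```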