
Azure Event Hubs

The Azure Event Hubs data connector supports both source and sink connections and streams data using the AMQP protocol. Because Azure Event Hubs also supports the Apache Kafka protocol, you can alternatively use the Apache Kafka connector to connect to an event hub.

To learn more, see the Kafka connector documentation.

This is a minimal example of the Event Hubs connector URI scheme:

evh://namespace/event-hub

Authentication

This connector supports two authentication methods:

  • Default authentication: When credentials aren't specified, the system uses the default Azure credentials provider.

    For example: evh://namespace/event-hub...

  • Explicit (static) credentials provider: Uses credentials provided explicitly in the URI.

    For example: evh://SharedAccessKeyName:SharedAccessKey@namespace/event-hub...
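
The two URI forms above differ only in the credentials part. As a rough illustration, the following Python sketch splits the example URI into its components with the standard urllib.parse module. The values are the placeholders from the example, not real credentials, and how the connector itself parses the URI is not specified in this guide.

```python
from urllib.parse import urlsplit

# Placeholder URI taken from the example above; not a real credential.
uri = "evh://SharedAccessKeyName:SharedAccessKey@namespace/event-hub"

parts = urlsplit(uri)

scheme = parts.scheme               # "evh"
key_name = parts.username           # shared access key name, or None when omitted
key = parts.password                # shared access key, or None when omitted
namespace = parts.hostname          # Event Hubs namespace
event_hub = parts.path.lstrip("/")  # event hub name

# When the URI carries no credentials, default authentication applies.
uses_default_credentials = key_name is None and key is None
```

Real shared access keys can contain characters such as "/", "+", or "=", which may need percent-encoding before they are placed in a URI. Whether the connector expects encoded keys is an assumption to verify.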

Configuration

Use the following parameters to configure the connector.

The consumer_group parameter

Optional parameter. A consumer group is a view of the entire Event Hub. Consumer groups allow multiple apps connected to the hub to:

  • Have separate views of the event stream.
  • Read the event stream independently.
  • Read the event stream at app-specific pace and from app-specific position.

A partition supports at most five concurrent readers per consumer group. However, it's recommended to have only one active consumer for a given partition and consumer group pairing.

Each active reader receives the events from its partition. If there are multiple readers on the same partition, they receive duplicate events.

The default value is $Default. To learn more, see the Consumer groups documentation.

The offset parameter

Optional parameter. Defines the position within the Event Hub partition to begin consuming events.

Supported values: earliest and latest.

Default value: latest.
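
The difference between the two values can be pictured with a small model. This is a sketch only: the start_index function and the in-memory event list are hypothetical, and it assumes that latest means "only events that arrive after the reader starts".

```python
def start_index(offset, event_count):
    """Return the index of the first event to consume from a partition.

    Hypothetical helper: models a partition as a list of event_count events.
    """
    if offset == "earliest":
        return 0            # replay everything still retained in the partition
    if offset == "latest":
        return event_count  # skip the backlog, read only newly arriving events
    raise ValueError(f"unsupported offset: {offset!r}")


events = ["e0", "e1", "e2"]  # events already in the partition at start-up

backlog_earliest = events[start_index("earliest", len(events)):]
backlog_latest = events[start_index("latest", len(events)):]
```

With earliest, the reader first receives the three existing events. With latest, it receives nothing until a new event arrives.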

The partitions parameter

Optional parameter. Configures a list of partitions to be read. Event Hubs organize the received data into one or more partitions. For an event hub with four partitions, partition IDs 0, 1, 2, and 3 are valid.

Relevant only when the connector is used as a source connector. By default, the connector consumes all partitions of the event hub.
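
Before configuring the parameter, it can help to check the requested IDs against the partition count of the event hub. The following Python sketch does that. The validate_partitions helper is hypothetical, and the exact syntax the connector expects for the list is not shown in this guide.

```python
def validate_partitions(requested, partition_count):
    """Check requested partition IDs against an event hub's partition count.

    Hypothetical helper: valid IDs run from 0 to partition_count - 1.
    Returns the IDs as strings, or raises ValueError for unknown IDs.
    """
    valid = {str(i) for i in range(partition_count)}
    invalid = [p for p in requested if str(p) not in valid]
    if invalid:
        raise ValueError(f"unknown partition IDs: {invalid}")
    return [str(p) for p in requested]


# For an event hub with four partitions, IDs 0 to 3 are accepted.
selected = validate_partitions([0, 3], partition_count=4)
```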

The max_wait_time parameter

Optional parameter. Relevant only when the connector is used as a source connector. Value defined using the Java Duration format.

Defines the maximum amount of time the system waits for an event to become available when reading before it returns an empty event. If a message is available, the call returns immediately without waiting for the time defined in max_wait_time.

If no messages are available and the wait time expires, the call returns successfully with an empty event.

To learn more, see the EventHubConsumerClient Class documentation.

Examples:

  • PT1M: 1 minute
  • PT1H: 1 hour
  • PT30S: 30 seconds
  • PT23H56M4.091S: the sidereal day

The default value is PT0.5S.
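
To sanity-check a value before using it, the duration strings above can be converted to seconds. Python has no built-in parser for this format, so the sketch below uses a small regular expression that covers only the day, hour, minute, and second fields. It is a simplified stand-in for illustration, not the parser the connector uses, and parse_duration is a hypothetical name.

```python
import re
from datetime import timedelta

# Covers a subset of the format: optional days, then optional H/M/S after "T".
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def parse_duration(text):
    """Convert a duration string such as 'PT30S' to a timedelta."""
    match = _DURATION.match(text)
    # Reject strings that match the skeleton but carry no field, such as "PT".
    if match is None or not any(match.groupdict().values()):
        raise ValueError(f"unsupported duration: {text!r}")
    fields = {name: float(value)
              for name, value in match.groupdict().items() if value}
    return timedelta(**fields)


default_wait = parse_duration("PT0.5S").total_seconds()
sidereal_day = parse_duration("PT23H56M4.091S").total_seconds()
```

Here default_wait is half a second, and sidereal_day is 23 × 3600 + 56 × 60 + 4.091 seconds.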

The max_message_count parameter

Optional parameter. Relevant only when the connector is used as a source connector.

Defines the maximum number of messages to receive in a batch.

To learn more, see the EventHubConsumerClient Class documentation.

The default value is 1.
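
The interplay between max_wait_time and max_message_count can be modelled with a local queue. The sketch below is an illustration under stated assumptions, not the connector's implementation: receive_batch is a hypothetical function, the queue stands in for a partition, and the call is assumed to return as soon as at least one event is available.

```python
import queue


def receive_batch(events, max_message_count=1, max_wait_seconds=0.5):
    """Return up to max_message_count events, waiting at most max_wait_seconds.

    An empty list models the "empty event" returned when the wait expires.
    """
    batch = []
    try:
        # Block until the first event arrives or the wait time expires.
        batch.append(events.get(timeout=max_wait_seconds))
        # Take further events that are already available, without waiting.
        while len(batch) < max_message_count:
            batch.append(events.get_nowait())
    except queue.Empty:
        pass
    return batch


partition = queue.Queue()
for event in ["a", "b", "c"]:
    partition.put(event)

first = receive_batch(partition, max_message_count=2, max_wait_seconds=0.05)
second = receive_batch(partition, max_message_count=2, max_wait_seconds=0.05)
third = receive_batch(partition, max_message_count=2, max_wait_seconds=0.05)
```

In this model, the first call returns two events, the second returns the remaining one, and the third returns an empty batch after the wait time expires.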

The prefetch_count parameter

Optional parameter. Relevant only when the connector is used as a source connector.

Defines the number of events that are eagerly requested from the Event Hubs service and queued locally, regardless of whether a read operation is currently taking place. Prefetching maximizes throughput by allowing events to be read from a local cache rather than waiting for a service request.

To learn more, see the EventHubClientBuilder Class documentation.
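
The effect of a prefetch cache can be shown with a simplified, synchronous model. In the sketch below, PrefetchingReader is a hypothetical class and a plain iterable stands in for the Event Hubs service. A real client is assumed to prefetch in the background, which this model does not attempt.

```python
from collections import deque
from itertools import islice


class PrefetchingReader:
    """Serve reads from a local cache that is refilled in bulk."""

    def __init__(self, source, prefetch_count):
        self._source = iter(source)  # stands in for the Event Hubs service
        self._prefetch_count = prefetch_count
        self._cache = deque()
        self.service_requests = 0    # number of round trips to the "service"

    def _refill(self):
        wanted = self._prefetch_count - len(self._cache)
        if wanted > 0:
            self.service_requests += 1
            self._cache.extend(islice(self._source, wanted))

    def read(self):
        """Return the next event, or None when no event is available."""
        if not self._cache:
            self._refill()
        return self._cache.popleft() if self._cache else None


reader = PrefetchingReader(range(6), prefetch_count=3)
received = [reader.read() for _ in range(6)]
```

Reading six events with a prefetch count of 3 takes two service requests in this model instead of six.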

Using multi-partitioned sources

HERE Anonymizer Self-Hosted expects all probe data of a single vehicle to be placed into a single dedicated partition of a multi-partitioned event hub. This means that data from a specific vehicle always arrives in the same partition.

If the data can be categorized by a vehicle/trajectory/trace identifier or by a geospatial identifier, such as a state or a big enough map tile, all the messages must be directed into a single partition that is mapped to this specific identifier.

The easiest way to fulfil this requirement is to have all messages keyed using keys that are equal to category identifiers. For example, all messages for vehicle A must have the key A and all messages from the region California must have the California key.
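
Keyed routing works because the partition is derived from the key alone, so equal keys always map to the same partition. The following Python sketch illustrates the idea. It uses CRC32 as a stand-in hash, which is an assumption made for illustration: Event Hubs applies its own hashing to partition keys, and partition_for_key is a hypothetical helper.

```python
import zlib


def partition_for_key(key, partition_count):
    """Map a message key to a partition index, stable for equal keys."""
    return zlib.crc32(key.encode("utf-8")) % partition_count


# Hypothetical messages as (key, payload) pairs.
messages = [
    ("A", "probe 1"),
    ("B", "probe 2"),
    ("A", "probe 3"),
    ("California", "probe 4"),
]

by_partition = {}
for key, payload in messages:
    index = partition_for_key(key, partition_count=4)
    by_partition.setdefault(index, []).append((key, payload))
```

Both messages keyed A end up in the same list, whichever partition index the hash selects.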

To learn more, see Features and terminology in Azure Event Hubs.