Data message formats in HERE Anonymizer Self-Hosted

HERE Anonymizer Self-Hosted supports multiple data message formats. Use them by providing the format's system name in the SOURCE_FORMAT and SINK_FORMAT configuration variables. To learn more, see Configuration of HERE Anonymizer Self-Hosted.

When choosing the data format for your implementation, remember that binary formats, such as Protobuf, produce smaller messages than text formats, such as JSON or XML, but are harder to maintain.

Note: Even if the message format supports multiple traces per message, it's recommended to have a single trace per message, as it works better for data streaming.

| Data format | System name | Encoding type | Multiple traces per message |
| --- | --- | --- | --- |
| SDII | SDII | Protobuf | |
| SDII list | SDII_MESSAGE_LIST | Protobuf | |
| SENSORIS | SENSORIS | Protobuf | |
| HERE Probe | HERE_PROBE | JSON | |
| GPX | GPX | XML | |
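
As a minimal sketch of how a deployment script might check the two format variables before starting the service, the snippet below compares them against the system names listed in the table above. The helper name `read_format` and the idea of validating in Python are assumptions made for illustration; they are not part of HERE Anonymizer Self-Hosted, and the product may apply its own rules (for example, about letter case).

```python
import os
from typing import Mapping

# System names copied from the data format table above.
SUPPORTED_FORMATS = {"SDII", "SDII_MESSAGE_LIST", "SENSORIS", "HERE_PROBE", "GPX"}


def read_format(variable: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the system name stored in `variable`, or raise ValueError.

    `read_format` is a hypothetical helper, not a HERE API.
    """
    value = env.get(variable)
    if value not in SUPPORTED_FORMATS:
        allowed = ", ".join(sorted(SUPPORTED_FORMATS))
        raise ValueError(f"{variable}={value!r} is not one of: {allowed}")
    return value


if __name__ == "__main__":
    # Example values only; a real deployment reads them from its environment.
    example_env = {"SOURCE_FORMAT": "SDII", "SINK_FORMAT": "SDII"}
    print(read_format("SOURCE_FORMAT", example_env))
    print(read_format("SINK_FORMAT", example_env))
```

Failing early on an unknown name keeps a typo in the configuration from surfacing later as a harder-to-diagnose parsing error.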

Terminology

The following table defines the most important terms used when talking about data messages.

Note: The term "double number" refers to the Java double data type.

| Term | Definition |
| --- | --- |
| probe | A probe is a data set that records the state and attributes of a (usually moving) device at a certain point in time. |
| chunk | A chunk is a collection of probe points and/or events. Each chunk is identified by a unique ID (trace chunk ID). |
| trace | A trace is a collection of chunks. |
| trajectory | Trajectory is used as a synonym for a trace. |
| location | A location is defined by longitude and latitude. |
| point | A point is defined by longitude, latitude, altitude, and timestamp. |
| event | An event is defined by type, sub-type, timestamp, raw event bytes, and attributes. |
| timestamp | A timestamp is the number of seconds or microseconds (depending on the required granularity) since the epoch when a certain event happened or a probe was collected. |
| longitude | The longitude is a geographic coordinate that specifies the east–west position of a point on the surface of the Earth. It's a double number in the range from -180.0 to 180.0. |
| latitude | The latitude is a geographic coordinate that specifies the north–south position of a point on the surface of the Earth. It's a double number in the range from -90.0 to 90.0. |
| elevation | The elevation is the height of a point above the defined sea level. Sometimes referred to as altitude. It's a double number value expressed in meters. |
| speed | The speed is the speed of the device at the time when the probe was created. It's a double number value expressed in meters per second. |
| heading | The heading is the compass direction in which the device pointed at the time when the probe was created. It ranges from 0 to 360 degrees, where 0 and 360 are north, 90 is east, 180 is south, and 270 is west. |
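
To make the value ranges in the terminology table concrete, here is a small sketch that holds the position-related terms and rejects out-of-range values. The class `ProbePosition` and the function `compass_direction` are hypothetical names chosen for this example; they do not correspond to any HERE data model or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProbePosition:
    """Hypothetical holder for the position-related terms defined above."""

    longitude: float                   # degrees, -180.0 to 180.0
    latitude: float                    # degrees, -90.0 to 90.0
    heading: float                     # degrees, 0 to 360
    elevation: Optional[float] = None  # meters above the defined sea level

    def __post_init__(self) -> None:
        # Reject values outside the ranges given in the terminology table.
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError(f"longitude out of range: {self.longitude}")
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError(f"latitude out of range: {self.latitude}")
        if not 0.0 <= self.heading <= 360.0:
            raise ValueError(f"heading out of range: {self.heading}")


def compass_direction(heading: float) -> str:
    """Map a heading to the nearest cardinal direction (0 and 360 are north)."""
    names = ("north", "east", "south", "west")
    # Shift by half a quadrant so that each name covers 45 degrees either side.
    return names[int((heading + 45.0) // 90.0) % 4]
```

For example, `compass_direction(ProbePosition(longitude=13.4, latitude=52.5, heading=90.0).heading)` returns `"east"`, while `ProbePosition(longitude=200.0, latitude=0.0, heading=0.0)` raises a `ValueError`.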

Mandatory data entities

The following probe data entities are required for meaningful anonymization:

  • Unique trace chunk ID
  • Longitude
  • Latitude
  • Timestamp
  • Speed
  • Heading
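
The list above can be turned into a simple pre-flight check on incoming data. The sketch below assumes each probe has already been decoded into a Python dictionary; the key names (`trace_chunk_id`, `longitude`, and so on) are hypothetical and will differ depending on the message format in use.

```python
from typing import Any, List, Mapping

# Hypothetical key names for the mandatory entities listed above.
MANDATORY_FIELDS = (
    "trace_chunk_id",
    "longitude",
    "latitude",
    "timestamp",
    "speed",
    "heading",
)


def missing_mandatory_fields(probe: Mapping[str, Any]) -> List[str]:
    """Return the mandatory fields that are absent or set to None."""
    return [name for name in MANDATORY_FIELDS if probe.get(name) is None]


if __name__ == "__main__":
    # Illustrative values only; speed and heading are deliberately left out.
    probe = {
        "trace_chunk_id": "chunk-001",
        "longitude": 13.4,
        "latitude": 52.5,
        "timestamp": 1700000000,
    }
    print(missing_mandatory_fields(probe))
```

Returning the list of missing names, rather than a plain true/false result, makes it easier to report exactly which entities a data provider needs to add.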

Compression formats

Data messages can be compressed with Gzip or left uncompressed. You can adjust the compression settings for source and sink connectors independently.

To see how to configure data message compression, see Configuration of HERE Anonymizer Self-Hosted.
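
For producers or consumers that sit next to the Anonymizer, the following sketch shows one way to Gzip-compress a message payload and to read a payload that may or may not be compressed, using Python's standard `gzip` module. The function names are hypothetical, and the magic-byte check is a heuristic chosen for this example; in the product itself, compression is selected through configuration rather than detected from the data.

```python
import gzip

# Every Gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def compress_message(payload: bytes) -> bytes:
    """Gzip-compress a serialized data message."""
    return gzip.compress(payload)


def decompress_if_gzip(payload: bytes) -> bytes:
    """Return the raw message bytes, decompressing only Gzip payloads.

    Heuristic: an uncompressed payload that happens to start with the
    same two bytes would be misdetected, so prefer explicit configuration.
    """
    if payload[:2] == GZIP_MAGIC:
        return gzip.decompress(payload)
    return payload


if __name__ == "__main__":
    message = b'{"example": "payload"}'  # placeholder bytes, not a real probe
    packed = compress_message(message)
    assert decompress_if_gzip(packed) == message
    assert decompress_if_gzip(message) == message
```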