Stream pipeline (Flink) Metrics
Flink Metrics is a default dashboard available in Grafana. It shows the standard metrics listed below, which are available for all Flink pipelines. You can also add custom metrics in your pipeline code. See the official Flink documentation for more information about Flink metrics.
Flink Accumulators
Flink allows the creation of custom numerical metrics using accumulators. Stream pipelines using Apache Flink support the following accumulator types: Long and Double. Once created, these accumulators become available as named metrics that Grafana can query and add to dashboards. The metric names are commonly prefixed with flink_accumulators_.
For more information on using accumulators, see Custom Metrics and the documentation on Flink Accumulators.
Standard Metrics
CPU/Memory Metrics
| METRIC | UNIT | DESCRIPTION |
|---|---|---|
flink_jobmanager_Status_JVM_CPU_Load | Percentage | JobManager - recent CPU usage of the JVM. This metric does not currently function as expected, for reasons that are unclear. (For workarounds, see: How can I see the percentage CPU usage of jobmanager or taskmanagers of a Stream pipeline.) |
flink_jobmanager_Status_JVM_CPU_Time | Nanoseconds | JobManager - CPU Time used by the JVM |
flink_jobmanager_Status_JVM_Memory_Heap_Used | Bytes | JobManager - amount of heap memory currently used |
flink_jobmanager_Status_JVM_Memory_Heap_Committed | Bytes | JobManager - amount of heap memory guaranteed to be available to the JVM |
flink_jobmanager_Status_JVM_Memory_Heap_Max | Bytes | JobManager - maximum amount of heap memory that can be used for memory management |
flink_jobmanager_Status_JVM_Memory_NonHeap_Used | Bytes | JobManager - amount of non-heap memory currently used |
flink_jobmanager_Status_JVM_Memory_NonHeap_Committed | Bytes | JobManager - amount of non-heap memory guaranteed to be available to the JVM |
flink_jobmanager_Status_JVM_Memory_NonHeap_Max | Bytes | JobManager - maximum amount of non-heap memory that can be used for memory management |
flink_jobmanager_Status_JVM_Memory_Direct_Count | Count | JobManager - number of buffers in the direct buffer pool |
flink_jobmanager_Status_JVM_Memory_Direct_MemoryUsed | Bytes | JobManager - amount of memory used by the JVM for the direct buffer pool |
flink_jobmanager_Status_JVM_Memory_Direct_TotalCapacity | Bytes | JobManager - total capacity of all buffers in the direct buffer pool |
flink_jobmanager_Status_JVM_Memory_Mapped_Count | Count | JobManager - number of buffers in the mapped buffer pool |
flink_jobmanager_Status_JVM_Memory_Mapped_MemoryUsed | Bytes | JobManager - amount of memory used by the JVM for the mapped buffer pool |
flink_jobmanager_Status_JVM_Memory_Mapped_TotalCapacity | Bytes | JobManager - total capacity of all buffers in the mapped buffer pool |
flink_taskmanager_Status_JVM_CPU_Load | Percentage | TaskManager - recent CPU usage of the JVM. This metric does not currently function as expected, for reasons that are unclear. (For workarounds, see: How can I see the percentage CPU usage of jobmanager or taskmanagers of a Stream pipeline.) |
flink_taskmanager_Status_JVM_CPU_Time | Nanoseconds | TaskManager - CPU Time used by the JVM |
flink_taskmanager_Status_JVM_Memory_Heap_Used | Bytes | TaskManager - amount of heap memory currently used |
flink_taskmanager_Status_JVM_Memory_Heap_Committed | Bytes | TaskManager - amount of heap memory guaranteed to be available to the JVM |
flink_taskmanager_Status_JVM_Memory_Heap_Max | Bytes | TaskManager - maximum amount of heap memory that can be used for memory management |
flink_taskmanager_Status_JVM_Memory_NonHeap_Used | Bytes | TaskManager - amount of non-heap memory currently used |
flink_taskmanager_Status_JVM_Memory_NonHeap_Committed | Bytes | TaskManager - amount of non-heap memory guaranteed to be available to the JVM |
flink_taskmanager_Status_JVM_Memory_NonHeap_Max | Bytes | TaskManager - maximum amount of non-heap memory that can be used for memory management |
flink_taskmanager_Status_JVM_Memory_Direct_Count | Count | TaskManager - number of buffers in the direct buffer pool |
flink_taskmanager_Status_JVM_Memory_Direct_MemoryUsed | Bytes | TaskManager - amount of memory used by the JVM for the direct buffer pool |
flink_taskmanager_Status_JVM_Memory_Direct_TotalCapacity | Bytes | TaskManager - total capacity of all buffers in the direct buffer pool |
flink_taskmanager_Status_JVM_Memory_Mapped_Count | Count | TaskManager - number of buffers in the mapped buffer pool |
flink_taskmanager_Status_JVM_Memory_Mapped_MemoryUsed | Bytes | TaskManager - amount of memory used by the JVM for the mapped buffer pool |
flink_taskmanager_Status_JVM_Memory_Mapped_TotalCapacity | Bytes | TaskManager - total capacity of all buffers in the mapped buffer pool |
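
Because the CPU_Load metrics are unreliable, CPU usage can instead be estimated from the CPU_Time counter, which reports cumulative nanoseconds. The sketch below is one generic way to do this and is not necessarily the workaround described in the linked article. It also shows heap utilization as Heap_Used divided by Heap_Max. The function names and sample values are hypothetical. In PromQL, a comparable estimate would be `rate(flink_taskmanager_Status_JVM_CPU_Time[5m]) / 1e9`.

```python
NANOS_PER_SECOND = 1_000_000_000


def cpu_cores_used(cpu_time_start_ns, cpu_time_end_ns, interval_seconds):
    """Average number of CPU cores kept busy between two CPU_Time samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta_ns = cpu_time_end_ns - cpu_time_start_ns
    if delta_ns < 0:
        # The counter went backwards, so the JVM probably restarted.
        # Assume it restarted from zero and count only the new value.
        delta_ns = cpu_time_end_ns
    return delta_ns / NANOS_PER_SECOND / interval_seconds


def heap_utilization(heap_used_bytes, heap_max_bytes):
    """Fraction of the maximum heap in use, or None if the maximum is unknown."""
    if heap_max_bytes <= 0:
        return None
    return heap_used_bytes / heap_max_bytes


# Hypothetical samples taken 60 seconds apart: 30 s of CPU time were consumed.
print(cpu_cores_used(120_000_000_000, 150_000_000_000, 60))  # 0.5
# Hypothetical heap readings: 512 MiB used out of a 1 GiB maximum.
print(heap_utilization(512 * 1024**2, 1024**3))  # 0.5
```

A result of 0.5 from `cpu_cores_used` means the JVM kept half a core busy on average; divide by the number of cores assigned to the container to obtain a percentage.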
Flink Cluster Metrics
| METRIC | DESCRIPTION |
|---|---|
flink_jobmanager_numRegisteredTaskManagers | Total Number of Registered Task Managers |
flink_jobmanager_numRunningJobs | Total Number of Running Jobs |
flink_jobmanager_taskSlotsTotal | Total Number of Task Slots Allocated |
flink_jobmanager_taskSlotsAvailable | Total Number of Task Slots Available |
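
The two slot metrics above can be combined into a single utilization figure. The following is a minimal sketch with a hypothetical function name and hypothetical sample values; it assumes both values were read at the same time.

```python
def slot_utilization(task_slots_total, task_slots_available):
    """Fraction of task slots in use, or None when no slots are registered."""
    if task_slots_total <= 0:
        return None
    used = task_slots_total - task_slots_available
    return used / task_slots_total


# Hypothetical cluster: 8 slots in total, 2 still available.
print(slot_utilization(8, 2))  # 0.75
```

A value that stays at 1.0 suggests the cluster has no spare slots for restarts or rescaling.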
Flink I/O Metrics
| METRIC | DESCRIPTION |
|---|---|
flink_taskmanager_job_task_currentLowWatermark | Task - currentLowWatermark: the lowest watermark this task has received |
flink_taskmanager_job_task_numBytesInLocal | Task - numBytesInLocal: the total number of bytes this task has read from a local source |
flink_taskmanager_job_task_numBytesInLocalPerSecond | Task - numBytesInLocalPerSecond: the number of bytes this task reads from a local source per second |
flink_taskmanager_job_task_numBytesInRemote | Task - numBytesInRemote: the total number of bytes this task has read from a remote source |
flink_taskmanager_job_task_numBytesInRemotePerSecond | Task - numBytesInRemotePerSecond: the number of bytes this task reads from a remote source per second |
flink_taskmanager_job_task_numBytesOut | Task - numBytesOut: the total number of bytes this task has emitted |
flink_taskmanager_job_task_numBytesOutPerSecond | Task - numBytesOutPerSecond: the number of bytes this task emits per second |
flink_taskmanager_job_task_numRecordsIn | Task/Operator - numRecordsIn: the total number of records this operator/task has received |
flink_taskmanager_job_task_numRecordsInPerSecond | Task/Operator - numRecordsInPerSecond: the number of records this operator/task receives per second |
flink_taskmanager_job_task_numRecordsOut | Task/Operator - numRecordsOut: the total number of records this operator/task has emitted |
flink_taskmanager_job_task_numRecordsOutPerSecond | Task/Operator - numRecordsOutPerSecond: the number of records this operator/task sends per second |
flink_taskmanager_job_task_operator_latency | Operator - latency: the latency distributions from all incoming sources |
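
A common use of currentLowWatermark is to estimate how far event time lags behind wall-clock time. The sketch below assumes that watermarks are expressed in milliseconds since the Unix epoch, which holds only if the pipeline assigns event timestamps that way. It also relies on Flink reporting Long.MIN_VALUE until the first watermark arrives. The function name and sample values are hypothetical.

```python
import time

# Flink uses Long.MIN_VALUE as the watermark before any real watermark arrives.
NO_WATERMARK = -(2**63)


def watermark_lag_ms(current_low_watermark, now_ms=None):
    """Milliseconds between now and the watermark, or None if none was received."""
    if current_low_watermark <= NO_WATERMARK:
        return None
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - int(current_low_watermark)


# Hypothetical reading: the watermark is 4.5 seconds behind the current time.
print(watermark_lag_ms(1_700_000_000_000, now_ms=1_700_000_004_500))  # 4500
```

A lag that grows steadily usually means the task is falling behind its sources or that one input has stopped advancing its watermark.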
Kafka Producer and Consumer Metrics
Standard Kafka metrics are available when enabled in the configuration settings of the HERE platform Data Client, and their names are prefixed with:
| METRIC | DESCRIPTION |
|---|---|
flink_taskmanager_job_task_operator_KafkaProducer | Kafka Producer metrics |
flink_taskmanager_job_task_operator_KafkaConsumer | Kafka Consumer metrics |
The complete list of Kafka Producer and Consumer metrics can be found in the Apache Kafka documentation (see links below).
Querying Prometheus
When querying these metrics with PromQL (Prometheus Query Language), you can use label matchers on the metric name by matching against the internal `__name__` label. For example, the metric `flink_taskmanager_job_task_operator_KafkaConsumer_client_id_consumer_fetch_manager_metrics_fetch_rate` can also be selected with `{__name__=~".*consumer_fetch_manager_metrics_fetch_rate"}`, which matches every metric whose name ends with that suffix.
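
To illustrate what a `__name__` regex matcher selects, the sketch below builds such a selector and applies the same pattern to a few metric names locally. Prometheus anchors regex matchers at both ends, so `re.fullmatch` is used to mimic that behavior. The helper names are hypothetical, and the second metric name in the list is invented for the demonstration.

```python
import re


def name_selector(suffix):
    """Build a PromQL selector for every metric whose name ends with suffix."""
    # Only plain metric-name characters are accepted, so no escaping is needed.
    if not re.fullmatch(r"[A-Za-z0-9_:]+", suffix):
        raise ValueError("suffix must contain only metric-name characters")
    return '{__name__=~".*' + suffix + '"}'


def matches(pattern, metric_name):
    """Mimic a Prometheus regex matcher, which must match the whole value."""
    return re.fullmatch(pattern, metric_name) is not None


suffix = "consumer_fetch_manager_metrics_fetch_rate"
names = [
    "flink_taskmanager_job_task_operator_KafkaConsumer_client_id_" + suffix,
    # Invented name: same text, but not at the end, so it must not match.
    "flink_taskmanager_job_task_operator_KafkaConsumer_" + suffix + "_extra",
    "flink_jobmanager_numRunningJobs",
]

print(name_selector(suffix))
selected = [n for n in names if matches(".*" + suffix, n)]
print(selected)  # only the first name is selected
```

Because the pattern is anchored, a suffix match needs the leading `.*`; without it, only a metric named exactly like the suffix would be selected.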
See Also