Version: 2.x

Zio-Kafka Metrics

Zio-kafka exposes all the metrics of the wrapped Java-based consumer and producer, plus some additional metrics about the zio-kafka consumer and producer themselves.

Java client metrics

The metrics of the Java clients can be obtained via the Consumer.metrics and Producer.metrics methods. Both return a live view on the internal metrics of the consumer/producer. We currently do not expose these metrics elsewhere; a PR to copy them to the zio-metrics API is welcome.
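As a rough sketch of how these methods might be used, the following logs every Java client metric of a consumer. It assumes that Consumer.metrics returns a Task containing a Map from Kafka's MetricName to Metric; the names JavaClientMetricsExample and logJavaMetrics are made up for this illustration.

```scala
import zio._
import zio.kafka.consumer.Consumer

object JavaClientMetricsExample {
  // Hypothetical helper: logs the current value of each Java client metric.
  // Assumes `consumer.metrics` yields a Map[MetricName, Metric].
  def logJavaMetrics(consumer: Consumer): Task[Unit] =
    consumer.metrics.flatMap { metrics =>
      ZIO.foreachDiscard(metrics) { case (name, metric) =>
        ZIO.logInfo(s"${name.group()}/${name.name()} = ${metric.metricValue()}")
      }
    }
}
```

The same approach should work for a producer via Producer.metrics.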

Zio-kafka metrics

The zio-kafka consumer and producer collect some additional metrics using the zio-metrics API. This allows any zio-metrics backend to access and process the observed values.

By default, no tags are added. Tags can be configured via ConsumerSettings.withMetricsLabels / ProducerSettings.withMetricsLabels.
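A minimal configuration sketch, assuming both methods accept a Set of zio-metrics MetricLabel values. The label names and values, and the bootstrap server address, are placeholders rather than recommendations.

```scala
import zio.metrics.MetricLabel
import zio.kafka.consumer.ConsumerSettings
import zio.kafka.producer.ProducerSettings

object MetricsLabelsExample {
  // Hypothetical labels; pick names that fit your own deployment.
  val labels: Set[MetricLabel] = Set(
    MetricLabel("app", "orders-service"),
    MetricLabel("env", "staging")
  )

  // Assumption: withMetricsLabels takes a Set[MetricLabel] on both settings types.
  val consumerSettings: ConsumerSettings =
    ConsumerSettings(List("localhost:9092")).withMetricsLabels(labels)

  val producerSettings: ProducerSettings =
    ProducerSettings(List("localhost:9092")).withMetricsLabels(labels)
}
```

Labels like these make it possible to tell apart the metrics of several consumers or producers running in the same application.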

Like zio-metrics, we follow Prometheus conventions. This means that:

  • durations are expressed in seconds,
  • counters can only increase,
  • metric names use snake_case and end in the unit where possible.

The histograms each use 10 buckets. To reach a decent range while keeping sufficient accuracy at the low end, most bucket boundaries use an exponential series based on 𝑒.
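To give a feeling for what an exponential series based on 𝑒 looks like, here is a small sketch that computes such boundaries. The helper name, the start value of 0.01 and the count of 10 are illustrative assumptions; the boundaries zio-kafka actually uses are defined in the metrics source files.

```scala
object ExponentialBuckets {
  // Hypothetical helper: returns `count` boundaries start, start*e, start*e^2, ...
  def boundaries(start: Double, count: Int): Vector[Double] =
    Vector.tabulate(count)(i => start * math.exp(i.toDouble))

  def main(args: Array[String]): Unit =
    // Illustrative values only: 10 boundaries starting at 0.01 (seconds).
    boundaries(0.01, 10).foreach(b => println(f"$b%.3f"))
}
```

With these assumed values the boundaries run from 0.01 seconds to roughly 81 seconds, each one about 2.7 times the previous one, which keeps the buckets narrow at the low end while still covering long durations.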

Zio-kafka consumer metrics

Poll metrics

| Type | Name | Description |
|------|------|-------------|
| counter | ziokafka_consumer_polls | The number of polls. |
| histogram | ziokafka_consumer_poll_latency_seconds | The duration of a single poll in seconds. |
| histogram | ziokafka_consumer_poll_size | The number of records fetched by a single poll. |
| gauge | ziokafka_consumer_partitions_resumed_in_latest_poll | The number of partitions resumed in the latest poll call. |
| gauge | ziokafka_consumer_partitions_paused_in_latest_poll | The number of partitions paused in the latest poll call (because of backpressure). |
| counter | ziokafka_consumer_poll_auth_errors | The number of polls that ended with an authentication or authorization error. |

Partition stream metrics

These metrics are updated after every poll.

| Type | Name | Description |
|------|------|-------------|
| histogram | ziokafka_consumer_pending_requests | The number of partitions that ran out of records (the queue is empty). |
| histogram | ziokafka_consumer_queue_size | The number of records queued for a partition. |
| histogram | ziokafka_consumer_all_queue_size | The total number of records queued for all partitions. |
| histogram | ziokafka_consumer_queue_polls | The number of polls during which records are idling in a queue. |

Commit metrics

These metrics measure the individual commit requests issued through zio-kafka's API.

| Type | Name | Description |
|------|------|-------------|
| histogram | ziokafka_consumer_pending_commits | The number of commits that are awaiting completion. |
| counterInt | ziokafka_consumer_commits | The number of commits. |
| histogram | ziokafka_consumer_commit_latency_seconds | The duration of a commit in seconds. |

Aggregated commit metrics

After every poll, zio-kafka combines all outstanding commit requests into one aggregated commit. These metrics are for the aggregated commits.

| Type | Name | Description |
|------|------|-------------|
| counterInt | ziokafka_consumer_aggregated_commits | The number of aggregated commits. |
| histogram | ziokafka_consumer_aggregated_commit_latency_seconds | The duration of an aggregated commit in seconds. |
| histogram | ziokafka_consumer_aggregated_commit_size | An approximation of the number of records (offsets) per aggregated commit. |
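The idea of combining outstanding commit requests into one aggregated commit can be pictured with the following sketch. It is not zio-kafka's implementation: the TopicPartition case class, the CommitRequest alias and the aggregate function are hypothetical, and the sketch assumes that the highest offset per partition wins.

```scala
object AggregateCommits {
  // Hypothetical stand-in for Kafka's TopicPartition.
  final case class TopicPartition(topic: String, partition: Int)

  // One commit request: the offset to commit for each partition it covers.
  type CommitRequest = Map[TopicPartition, Long]

  // Combines the outstanding requests into a single request,
  // keeping the highest offset seen for each partition.
  def aggregate(requests: Seq[CommitRequest]): CommitRequest =
    requests.foldLeft(Map.empty[TopicPartition, Long]) { (acc, request) =>
      request.foldLeft(acc) { case (merged, (tp, offset)) =>
        merged.updated(tp, math.max(offset, merged.getOrElse(tp, offset)))
      }
    }
}
```

Under this assumption, three requests that commit offsets 5, 9 and 7 for the same partition collapse into one commit of offset 9, so a single aggregated commit can stand in for several individual commits.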

Rebalance metrics

| Type | Name | Description |
|------|------|-------------|
| counterInt | ziokafka_consumer_rebalances | The number of rebalances. |
| gauge | ziokafka_consumer_partitions_currently_assigned | The number of partitions currently assigned to the consumer. |
| counterInt | ziokafka_consumer_partitions_assigned | The number of partitions assigned to the consumer. |
| counterInt | ziokafka_consumer_partitions_revoked | The number of partitions revoked from the consumer. |
| counterInt | ziokafka_consumer_partitions_lost | The number of partitions lost by the consumer. |

Runloop metrics

These metrics are updated after every poll.

| Type | Name | Description |
|------|------|-------------|
| gauge | ziokafka_consumer_subscription_state | Whether the consumer is subscribed (1) or not (0). |
| histogram | ziokafka_consumer_command_queue_size | The number of commands queued in the consumer. |
| histogram | ziokafka_consumer_commit_queue_size | The number of commits queued in the consumer. |

See ConsumerMetrics.scala for the exact details.

Example

Here is an example dashboard that could be built with these metrics:

(Image: metrics-dashboard.png)

Zio-kafka producer metrics

To understand the producer metrics, it is useful to know how the producer works.

When you call the producer, it first serializes the given batch of records (a single record is treated as a batch of one). The batch is then offered to an internal send queue. Once the queue accepts the batch, the produce*Async methods return immediately. The produce*Sync methods return later, when all acknowledgements are in. In the background, a fiber pulls a batch from the queue whenever it is ready for the next one and sends the records to the Kafka broker one by one. Meanwhile, the broker acknowledges records.
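The flow described above can be modelled with a deliberately simplified sketch. It uses a plain JVM thread and a blocking queue instead of ZIO fibers, there is no real broker, and every name in it (SendQueueSketch, Batch, SketchProducer, produceAsync, produceSync) is hypothetical. It only shows the difference between returning once the queue accepts a batch and returning once all acknowledgements are in.

```scala
import java.util.concurrent.{CompletableFuture, LinkedBlockingQueue, TimeUnit}

object SendQueueSketch {
  // Hypothetical batch: serialized records plus a future that completes
  // with the number of acknowledged records.
  final case class Batch(records: Vector[Array[Byte]], acked: CompletableFuture[Int])

  final class SketchProducer(capacity: Int) {
    private val sendQueue = new LinkedBlockingQueue[Batch](capacity)

    // Stand-in for the sending fiber: pulls batches and "sends" records one by one.
    private val sender = new Thread(new Runnable {
      def run(): Unit =
        while (true) {
          val batch = sendQueue.take()
          // No real broker here: every record counts as acknowledged immediately.
          val acks = batch.records.foldLeft(0)((n, _) => n + 1)
          batch.acked.complete(acks)
        }
    })
    sender.setDaemon(true)
    sender.start()

    // Like the produce*Async methods: serializes, then returns once the queue accepts the batch.
    def produceAsync(records: Vector[String]): CompletableFuture[Int] = {
      val batch = Batch(records.map(_.getBytes("UTF-8")), new CompletableFuture[Int]())
      sendQueue.put(batch) // blocks while the queue is full
      batch.acked
    }

    // Like the produce*Sync methods: returns when all acknowledgements are in.
    def produceSync(records: Vector[String]): Int =
      produceAsync(records).get(5, TimeUnit.SECONDS)
  }
}
```

In terms of this model, the time a batch spends between put and take corresponds to the send queue latency, and the number of records per call corresponds to the batch size.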

Producer metrics

| Type | Name | Description |
|------|------|-------------|
| counter | ziokafka_producer_calls | The number of times a produce method is invoked. |
| histogram | ziokafka_producer_latency_seconds | The duration of a single produce, from after serialization to acknowledged, in seconds. |
| counter | ziokafka_producer_records | The number of records produced. |
| histogram | ziokafka_producer_batch_size | The number of records per produce call. |

Queue metrics

| Type | Name | Description |
|------|------|-------------|
| histogram | ziokafka_producer_send_queue_size | The number of records in the zio-kafka send queue. |
| histogram | ziokafka_producer_send_queue_latency_seconds | Time in send queue, including waiting for capacity, in seconds. |

Send metrics

| Type | Name | Description |
|------|------|-------------|
| counter | ziokafka_producer_auth_errors | The number of record sends that resulted in an authentication or authorization error. |

See ProducerMetrics.scala for the exact details.