reference-metrics.md

November 8, 2022

KoP exposes the following metrics in Prometheus format. You can monitor your clusters with those metrics.

The following types of metrics are available:

  • Counter: a cumulative metric that represents a single monotonically increasing counter. The value can only increase, or be reset to zero when you restart your cluster.
  • Gauge: a metric that represents a single numerical value that can arbitrarily go up and down.
  • Histogram: a histogram samples observations (usually things like request durations or response sizes) and counts them in configurable buckets.
  • Summary: similar to a histogram, a summary samples observations (usually things like request durations and response sizes). While it also provides a total count of observations and a sum of all observed values, it calculates configurable quantiles over a sliding time window.
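Because a counter only grows and drops back to zero on a restart, a consumer of the metric usually has to account for resets when computing how much the counter increased over a period. The following is a minimal sketch of that idea, not part of KoP; the function name and the sample readings are hypothetical.

```python
def counter_increase(samples):
    """Return the total increase across consecutive counter samples.

    A drop between two samples is treated as a reset to zero, so the
    whole post-reset value counts as new increase.
    """
    total = 0.0
    for prev, curr in zip(samples, samples[1:]):
        if curr >= prev:
            total += curr - prev
        else:
            # The counter went backwards: assume a restart reset it to zero.
            total += curr
    return total


# Hypothetical readings of one counter taken at regular intervals;
# the drop from 150 to 20 stands for a cluster restart.
readings = [100, 130, 150, 20, 45]
print(counter_increase(readings))  # prints 95.0
```

A gauge needs no such handling, because its value is meaningful on its own at every sample.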

KoP metrics

The KoP metrics are exposed under "/metrics" at port 8080 along with Pulsar metrics. You can use a different port by configuring the stats_server_port system property.
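As a rough illustration, the endpoint can be read with any HTTP client and the KoP samples separated from the Pulsar ones by their kop_server_ prefix. The sketch below uses only the Python standard library; the host name localhost and the helper names are assumptions, and the port must match your stats_server_port setting.

```python
from urllib.request import urlopen


def kop_metric_lines(exposition_text, prefix="kop_server_"):
    """Keep only the sample lines whose metric name starts with prefix."""
    return [
        line
        for line in exposition_text.splitlines()
        if line.startswith(prefix)  # also skips '# HELP' and '# TYPE' comments
    ]


def fetch_metrics(host="localhost", port=8080, timeout=5.0):
    """Download the raw metrics text from http://host:port/metrics."""
    with urlopen(f"http://{host}:{port}/metrics", timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example, assuming a broker is reachable on localhost:8080:
#     for line in kop_metric_lines(fetch_metrics()):
#         print(line)
```

Keeping the filtering separate from the download makes the filter easy to try on saved output without a running broker.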

Channel metrics

| Name | Type | Description |
|------|------|-------------|
| kop_server_ALIVE_CHANNEL_COUNT | Gauge | The number of alive request channels. |
| kop_server_ACTIVE_CHANNEL_COUNT | Gauge | The number of active request channels. |

Request metrics

| Name | Type | Description |
|------|------|-------------|
| kop_server_REQUEST_QUEUE_SIZE | Gauge | The number of requests in the KoP request processing queue across all request channels. |
| kop_server_REQUEST_QUEUED_LATENCY | Summary | The time that requests spend queued, in milliseconds. Available labels: request (ApiVersions, Metadata, Produce, FindCoordinator, ListOffsets, OffsetFetch, OffsetCommit, Fetch, JoinGroup, SyncGroup, Heartbeat, LeaveGroup, DescribeGroups, ListGroups, DeleteGroups, SaslHandshake, SaslAuthenticate, CreateTopics, InitProducerId, AddPartitionsToTxn, AddOffsetsToTxn, TxnOffsetCommit, EndTxn, WriteTxnMarkers, DescribeConfigs, DeleteTopics). |
| kop_server_REQUEST_PARSE_LATENCY | Summary | The latency of parsing requests from ByteBuf to MemoryRecords, in milliseconds. |
| kop_server_REQUEST_LATENCY | Summary | The total request processing latency for all Kafka APIs. Available labels: request (ApiVersions, Metadata, Produce, FindCoordinator, ListOffsets, OffsetFetch, OffsetCommit, Fetch, JoinGroup, SyncGroup, Heartbeat, LeaveGroup, DescribeGroups, ListGroups, DeleteGroups, SaslHandshake, SaslAuthenticate, CreateTopics, InitProducerId, AddPartitionsToTxn, AddOffsetsToTxn, TxnOffsetCommit, EndTxn, WriteTxnMarkers, DescribeConfigs, DeleteTopics). |
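Since the latency metrics above are summaries labeled by request, one common use is the mean latency per request type. The sketch below assumes that each summary is exposed with the conventional Prometheus _sum and _count series (an assumption about the output, not something this page states); the function name and the sample values are hypothetical.

```python
import re

# Matches a sample line such as: name{label="value",...} 12.5
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def mean_latency_by_request(exposition_text, metric="kop_server_REQUEST_LATENCY"):
    """Return {request label: sum / count} for one summary metric."""
    sums, counts = {}, {}
    for line in exposition_text.splitlines():
        m = SAMPLE_RE.match(line)
        if not m:
            continue  # comment lines and blank lines do not match
        name = m.group("name")
        if name == metric + "_sum":
            target = sums
        elif name == metric + "_count":
            target = counts
        else:
            continue  # quantile series and unrelated metrics are ignored
        labels = dict(LABEL_RE.findall(m.group("labels") or ""))
        request = labels.get("request", "")
        target[request] = target.get(request, 0.0) + float(m.group("value"))
    # Skip request types that have no observations yet.
    return {r: sums[r] / counts[r] for r in sums if counts.get(r)}


# Hypothetical scrape output; the values are made up.
sample = """\
kop_server_REQUEST_LATENCY_sum{request="Fetch"} 120.0
kop_server_REQUEST_LATENCY_count{request="Fetch"} 40
kop_server_REQUEST_LATENCY_sum{request="Produce"} 25.0
kop_server_REQUEST_LATENCY_count{request="Produce"} 10
"""
print(mean_latency_by_request(sample))  # prints {'Fetch': 3.0, 'Produce': 2.5}
```

The same function can be pointed at kop_server_REQUEST_QUEUED_LATENCY by passing a different metric argument.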

Response metrics

| Name | Type | Description |
|------|------|-------------|
| kop_server_RESPONSE_BLOCKED_TIMES | Counter | The number of times a response was blocked while waiting for request processing to complete. |
| kop_server_RESPONSE_BLOCKED_LATENCY | Summary | The time that responses spend blocked, in milliseconds. |

Producer metrics

| Name | Type | Description |
|------|------|-------------|
| kop_server_PENDING_TOPIC_LATENCY | Summary | The latency until a pending topic future completes. |
| kop_server_PRODUCE_ENCODE | Summary | The memory record encode latency. |
| kop_server_MESSAGE_PUBLISH | Summary | The latency of publishing messages to the Pulsar ManagedLedger. |
| kop_server_MESSAGE_QUEUED_LATENCY | Summary | The time that messages spend queued in the KoP message publish queue. |
| kop_server_BYTES_IN | Counter | The number of bytes received from producers. Available labels: topic (the name of the topic produced to), partition (the partition ID of the topic produced to). |
| kop_server_MESSAGE_IN | Counter | The number of messages received from producers. Available labels: topic (the name of the topic produced to), partition (the partition ID of the topic produced to). |
| kop_server_BATCH_COUNT_PER_MEMORYRECORDS | Gauge | The number of batches in each memory records. |
| kop_server_PRODUCE_MESSAGE_CONVERSIONS | Counter | The number of producer message conversions. Available labels: topic (the name of the topic produced to), partition (the partition ID of the topic produced to). |
| kop_server_PRODUCE_MESSAGE_CONVERSIONS_TIME_NANOS | Summary | The producer message conversion latency in nanoseconds. Available labels: topic (the name of the topic produced to), partition (the partition ID of the topic produced to). |
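The per-partition counters above can be rolled up when only a per-topic view is needed. The following sketch sums kop_server_BYTES_IN samples over the partition label; it works on already-parsed (labels, value) pairs, and the function name, topic names, and values are hypothetical.

```python
from collections import defaultdict


def bytes_in_per_topic(samples):
    """Sum per-partition counter values into one total per topic.

    samples: iterable of (labels, value) pairs, where labels is a dict
    holding at least the 'topic' and 'partition' labels.
    """
    totals = defaultdict(float)
    for labels, value in samples:
        totals[labels["topic"]] += value
    return dict(totals)


# Hypothetical kop_server_BYTES_IN samples; topic names and values are made up.
samples = [
    ({"topic": "orders", "partition": "0"}, 1024.0),
    ({"topic": "orders", "partition": "1"}, 2048.0),
    ({"topic": "payments", "partition": "0"}, 512.0),
]
print(bytes_in_per_topic(samples))  # prints {'orders': 3072.0, 'payments': 512.0}
```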

Consumer metrics

| Name | Type | Description |
|------|------|-------------|
| kop_server_PREPARE_METADATA | Summary | The latency, in milliseconds, of preparing metadata before a fetch from the Pulsar ManagedLedger starts. |
| kop_server_TOTAL_MESSAGE_READ | Summary | The total message read latency, in milliseconds, for a fetch request. |
| kop_server_MESSAGE_READ | Summary | The message read latency, in milliseconds, for one cursor read entry request. |
| kop_server_FETCH_DECODE | Summary | The message decode latency in milliseconds. |
| kop_server_BYTES_OUT | Counter | The number of bytes sent to consumers. Available labels: topic (the name of the topic consumed from), partition (the partition ID of the topic consumed from), group (the group ID of the consumer that consumes messages from the topic partition). |
| kop_server_MESSAGE_OUT | Counter | The number of messages sent to consumers. Available labels: topic (the name of the topic consumed from), partition (the partition ID of the topic consumed from), group (the group ID of the consumer that consumes messages from the topic partition). |
| kop_server_ENTRIES_OUT | Counter | The number of entries sent to consumers. Available labels: topic (the name of the topic consumed from), partition (the partition ID of the topic consumed from), group (the group ID of the consumer that consumes messages from the topic partition). |
| kop_server_CONSUME_MESSAGE_CONVERSIONS | Counter | The number of consumer message conversions. Available labels: topic (the name of the topic consumed from), partition (the partition ID of the topic consumed from). |
| kop_server_CONSUME_MESSAGE_CONVERSIONS_TIME_NANOS | Summary | The consumer message conversion latency in nanoseconds. Available labels: topic (the name of the topic consumed from), partition (the partition ID of the topic consumed from). |
| kop_server_WAITING_FETCHES_TRIGGERED | Counter | The number of fetches that were delayed because not enough data was available and were later unblocked when a message was produced. |

KoP event metrics

| Name | Type | Description |
|------|------|-------------|
| kop_server_KOP_EVENT_QUEUE_SIZE | Gauge | The total number of events in the KoP event processing queue. |
| kop_server_KOP_EVENT_QUEUED_LATENCY | Summary | The time that events spend queued, in milliseconds. Available labels: event (DeleteTopicsEvent, BrokersChangeEvent, ShutdownEventThread). |
| kop_server_KOP_EVENT_LATENCY | Summary | The total event processing latency for all KoP event types. Available labels: event (DeleteTopicsEvent, BrokersChangeEvent, ShutdownEventThread). |