Metric

June 5, 2025

Spiderpool can be configured to serve OpenTelemetry metrics, which provide insight into the Spiderpool Agent and the Spiderpool Controller.

spiderpool controller

The metrics of the Spiderpool controller are configured through the following pod environment variables:

| Environment | Description | Default |
| --- | --- | --- |
| SPIDERPOOL_ENABLED_METRIC | enable metrics | false |
| SPIDERPOOL_ENABLED_DEBUG_METRIC | enable debug level metrics | false |
| SPIDERPOOL_METRIC_HTTP_PORT | metrics port | 5721 |

spiderpool agent

The metrics of the Spiderpool agent are configured through the following pod environment variables:

| Environment | Description | Default |
| --- | --- | --- |
| SPIDERPOOL_ENABLED_METRIC | enable metrics | false |
| SPIDERPOOL_ENABLED_DEBUG_METRIC | enable debug level metrics | false |
| SPIDERPOOL_METRIC_HTTP_PORT | metrics port | 5711 |

Get Started

Enable Metric support

Check whether the environment variable SPIDERPOOL_ENABLED_METRIC of the spiderpool-agent daemonset is already set to true.

Check whether the environment variable SPIDERPOOL_ENABLED_METRIC of the spiderpool-controller deployment is already set to true.

kubectl -n kube-system get daemonset spiderpool-agent -o yaml
------
kubectl -n kube-system get deployment spiderpool-controller -o yaml
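
Scanning the full YAML for a single variable can be tedious. As a minimal sketch, the helper below reads the same object in JSON form (as produced by kubectl with -o json instead of -o yaml) and reports the value of the variable for each container. The function name `metric_env` and the sample manifest, including its container name, are illustrative assumptions and not part of Spiderpool.

```python
import json


def metric_env(workload_json, name="SPIDERPOOL_ENABLED_METRIC"):
    """Return {container name: value or None} for one environment variable.

    `workload_json` is the JSON text of a DaemonSet or Deployment.
    """
    workload = json.loads(workload_json)
    containers = workload["spec"]["template"]["spec"]["containers"]
    result = {}
    for container in containers:
        value = None
        for env in container.get("env", []):
            if env.get("name") == name:
                value = env.get("value")
        result[container["name"]] = value
    return result


# Hypothetical, heavily trimmed manifest used only to exercise the helper.
SAMPLE = json.dumps({
    "spec": {"template": {"spec": {"containers": [{
        "name": "spiderpool-agent",
        "env": [
            {"name": "SPIDERPOOL_ENABLED_METRIC", "value": "true"},
            {"name": "SPIDERPOOL_METRIC_HTTP_PORT", "value": "5711"},
        ],
    }]}}}
})

print(metric_env(SAMPLE))
```

A value of None means the variable is not set on that container, in which case the default from the tables above (false) presumably applies.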

You can set one or both of them to true. For example, to enable Spiderpool agent metrics, run helm upgrade with --set spiderpoolAgent.prometheus.enabled=true.
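
Once metrics are enabled, one way to confirm that they are served is to scrape the metrics port and look for a known metric name. The sketch below parses Prometheus text exposition output into a dictionary. It rests on several assumptions: the `/metrics` path and the pod address in the usage note are guesses, the sample text and its values are made up, and series in a live scrape may carry labels or suffixes that differ from the bare names used here.

```python
from urllib.request import urlopen


def fetch_metrics(url, timeout=5.0):
    """Download raw metrics text from an endpoint (not called in this sketch)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_metrics(text):
    """Parse text exposition lines into {series: value}.

    Comment lines (# HELP, # TYPE) are skipped. Optional timestamps
    after the value are not handled in this simplified parser.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, value = line.rsplit(None, 1)
        samples[series] = float(value)
    return samples


# Made-up sample standing in for a real scrape.
SAMPLE = """\
# TYPE spiderpool_ipam_allocation_counts counter
spiderpool_ipam_allocation_counts 42
spiderpool_ipam_allocation_failure_counts 3
"""

samples = parse_metrics(SAMPLE)
print("metrics exposed:", "spiderpool_ipam_allocation_counts" in samples)
```

Against a live agent, the sample text could be replaced by something like `fetch_metrics("http://<agent-pod-ip>:5711/metrics")`, where the address is a placeholder and the path is an assumption.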

Metric reference

Spiderpool Agent

Spiderpool agent exports metrics related to IPAM allocation and release. Currently, these include:

| Name | Description |
| --- | --- |
| spiderpool_ipam_allocation_counts | Number of IPAM allocation requests received by Spiderpool Agent, prometheus type: counter |
| spiderpool_ipam_allocation_failure_counts | Number of Spiderpool Agent IPAM allocation failures, prometheus type: counter |
| spiderpool_ipam_allocation_update_ippool_conflict_counts | Number of Spiderpool Agent IPAM allocation update IPPool conflicts, prometheus type: counter |
| spiderpool_ipam_allocation_err_internal_counts | Number of Spiderpool Agent IPAM allocation internal errors, prometheus type: counter |
| spiderpool_ipam_allocation_err_no_available_pool_counts | Number of Spiderpool Agent IPAM allocation errors caused by no available IPPool, prometheus type: counter |
| spiderpool_ipam_allocation_err_retries_exhausted_counts | Number of Spiderpool Agent IPAM allocation errors caused by exhausted retries, prometheus type: counter |
| spiderpool_ipam_allocation_err_ip_used_out_counts | Number of Spiderpool Agent IPAM allocation errors caused by IP addresses being used up, prometheus type: counter |
| spiderpool_ipam_allocation_average_duration_seconds | The average duration of all Spiderpool Agent allocation processes, prometheus type: gauge |
| spiderpool_ipam_allocation_max_duration_seconds | The maximum duration of a Spiderpool Agent allocation process (per-process), prometheus type: gauge |
| spiderpool_ipam_allocation_min_duration_seconds | The minimum duration of a Spiderpool Agent allocation process (per-process), prometheus type: gauge |
| spiderpool_ipam_allocation_latest_duration_seconds | The latest duration of a Spiderpool Agent allocation process (per-process), prometheus type: gauge |
| spiderpool_ipam_allocation_duration_seconds | Histogram of IPAM allocation duration in seconds, prometheus type: histogram |
| spiderpool_ipam_allocation_average_limit_duration_seconds | The average duration of all Spiderpool Agent allocation queuing, prometheus type: gauge |
| spiderpool_ipam_allocation_max_limit_duration_seconds | The maximum duration of Spiderpool Agent allocation queuing, prometheus type: gauge |
| spiderpool_ipam_allocation_min_limit_duration_seconds | The minimum duration of Spiderpool Agent allocation queuing, prometheus type: gauge |
| spiderpool_ipam_allocation_latest_limit_duration_seconds | The latest duration of Spiderpool Agent allocation queuing, prometheus type: gauge |
| spiderpool_ipam_allocation_limit_duration_seconds | Histogram of IPAM allocation queuing duration in seconds, prometheus type: histogram |
| spiderpool_ipam_release_counts | Number of IPAM release requests received by Spiderpool Agent, prometheus type: counter |
| spiderpool_ipam_release_failure_counts | Number of Spiderpool Agent IPAM release failures, prometheus type: counter |
| spiderpool_ipam_release_update_ippool_conflict_counts | Number of Spiderpool Agent IPAM release update IPPool conflicts, prometheus type: counter |
| spiderpool_ipam_release_err_internal_counts | Number of Spiderpool Agent IPAM release internal errors, prometheus type: counter |
| spiderpool_ipam_release_err_retries_exhausted_counts | Number of Spiderpool Agent IPAM release errors caused by exhausted retries, prometheus type: counter |
| spiderpool_ipam_release_average_duration_seconds | The average duration of all Spiderpool Agent release processes, prometheus type: gauge |
| spiderpool_ipam_release_max_duration_seconds | The maximum duration of a Spiderpool Agent release process (per-process), prometheus type: gauge |
| spiderpool_ipam_release_min_duration_seconds | The minimum duration of a Spiderpool Agent release process (per-process), prometheus type: gauge |
| spiderpool_ipam_release_latest_duration_seconds | The latest duration of a Spiderpool Agent release process (per-process), prometheus type: gauge |
| spiderpool_ipam_release_duration_seconds | Histogram of IPAM release duration in seconds, prometheus type: histogram |
| spiderpool_ipam_release_average_limit_duration_seconds | The average duration of all Spiderpool Agent release queuing, prometheus type: gauge |
| spiderpool_ipam_release_max_limit_duration_seconds | The maximum duration of Spiderpool Agent release queuing, prometheus type: gauge |
| spiderpool_ipam_release_min_limit_duration_seconds | The minimum duration of Spiderpool Agent release queuing, prometheus type: gauge |
| spiderpool_ipam_release_latest_limit_duration_seconds | The latest duration of Spiderpool Agent release queuing, prometheus type: gauge |
| spiderpool_ipam_release_limit_duration_seconds | Histogram of IPAM release queuing duration in seconds, prometheus type: histogram |
| spiderpool_debug_auto_pool_waited_for_available_counts | Number of times a Spiderpool Agent IPAM allocation waited for an auto-created IPPool to become available, prometheus type: counter. (debug level metric) |
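
The counters and histograms above are most useful when combined. The following sketch shows two derived values: the allocation failure ratio between two scrapes, and an approximate duration quantile computed from cumulative histogram buckets by linear interpolation, similar in spirit to PromQL's histogram_quantile. The function names, the sample counter values, and the bucket boundaries are all illustrative assumptions; the real bucket layout of spiderpool_ipam_allocation_duration_seconds is not stated here.

```python
import math

REQUESTS = "spiderpool_ipam_allocation_counts"
FAILURES = "spiderpool_ipam_allocation_failure_counts"


def failure_ratio(prev, curr):
    """Share of failed allocations between two scrapes of the counters."""
    requests = curr[REQUESTS] - prev[REQUESTS]
    failures = curr[FAILURES] - prev[FAILURES]
    if requests <= 0:
        return 0.0  # no new requests (or a counter reset) in the interval
    return failures / requests


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from (upper_bound, cumulative_count) pairs.

    `buckets` must be sorted by upper bound and end with the +Inf bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound) or count == prev_count:
                return prev_bound
            # Interpolate linearly inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Made-up scrape values and bucket boundaries.
earlier = {REQUESTS: 100, FAILURES: 5}
later = {REQUESTS: 150, FAILURES: 10}
buckets = [(0.1, 10), (0.5, 30), (1.0, 40), (math.inf, 40)]

ratio = failure_ratio(earlier, later)
p50 = histogram_quantile(0.5, buckets)
print(f"failure ratio: {ratio:.2f}, approximate p50: {p50:.2f}s")
```

The interpolation assumes observations are spread evenly inside each bucket, so the result is only as precise as the bucket boundaries allow.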

Spiderpool Controller

Spiderpool controller exports metrics related to SpiderIPPool IP garbage collection. Currently, these include:

| Name | Description |
| --- | --- |
| spiderpool_ip_gc_counts | Number of Spiderpool Controller IP garbage collections, prometheus type: counter. |
| spiderpool_ip_gc_failure_counts | Number of Spiderpool Controller IP garbage collection failures, prometheus type: counter. |
| spiderpool_total_ippool_counts | Number of Spiderpool IPPools, prometheus type: gauge. |
| spiderpool_debug_ippool_total_ip_counts | Total number of IPs in a Spiderpool IPPool (per-IPPool), prometheus type: gauge. (debug level metric) |
| spiderpool_debug_ippool_available_ip_counts | Number of available IPs in a Spiderpool IPPool (per-IPPool), prometheus type: gauge. (debug level metric) |
| spiderpool_total_subnet_counts | Number of Spiderpool Subnets, prometheus type: gauge. |
| spiderpool_debug_subnet_ippool_counts | Number of IPPools in a Spiderpool Subnet (per-Subnet), prometheus type: gauge. (debug level metric) |
| spiderpool_debug_subnet_total_ip_counts | Total number of IPs in a Spiderpool Subnet (per-Subnet), prometheus type: gauge. (debug level metric) |
| spiderpool_debug_subnet_available_ip_counts | Number of available IPs in a Spiderpool Subnet (per-Subnet), prometheus type: gauge. (debug level metric) |
| spiderpool_debug_auto_pool_waited_for_available_counts | Number of times allocation waited for an auto-created IPPool to become available, prometheus type: counter. (debug level metric) |
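
The per-IPPool gauges can be combined into a utilization figure, which is often easier to alert on than raw counts. The sketch below assumes the values of spiderpool_debug_ippool_total_ip_counts and spiderpool_debug_ippool_available_ip_counts have already been collected into dictionaries keyed by pool name. The pool names, the numbers, the 90% threshold, and the function name are hypothetical, and how a pool is identified in the real series (for example, which label carries its name) is not specified here.

```python
def pool_utilization(total_ips, available_ips):
    """Return {pool: fraction of IPs in use} from per-IPPool gauge values."""
    usage = {}
    for pool, total in total_ips.items():
        if total <= 0:
            continue  # skip empty pools to avoid dividing by zero
        available = available_ips.get(pool, 0)
        usage[pool] = (total - available) / total
    return usage


# Hypothetical gauge values keyed by IPPool name.
total_ips = {"default-v4-ippool": 200, "auto-pool-example": 10}
available_ips = {"default-v4-ippool": 50, "auto-pool-example": 0}

usage = pool_utilization(total_ips, available_ips)
nearly_full = sorted(pool for pool, used in usage.items() if used >= 0.9)
print(usage, nearly_full)
```

The same calculation applies to the per-Subnet gauges, since they expose the same total and available counts.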

RDMA exporter

Spiderpool also provides an RDMA exporter that exports RDMA metrics. The RDMA metrics include:

| Metric Name | Type | Description | Remarks |
| --- | --- | --- | --- |
| rdma_rx_write_requests | Counter | Number of received write requests | |
| rdma_rx_read_requests | Counter | Number of received read requests | |
| rdma_rx_atomic_requests | Counter | Number of received atomic requests | |
| rdma_rx_dct_connect | Counter | Number of received DCT connection requests | |
| rdma_out_of_buffer | Counter | Number of buffer insufficiency errors | |
| rdma_out_of_sequence | Counter | Number of out-of-sequence packets received | |
| rdma_duplicate_request | Counter | Number of duplicate requests | |
| rdma_rnr_nak_retry_err | Counter | Count of RNR NAK packets not exceeding QP retry limit | |
| rdma_packet_seq_err | Counter | Number of packet sequence errors | |
| rdma_implied_nak_seq_err | Counter | Number of implied NAK sequence errors | |
| rdma_local_ack_timeout_err | Counter | Number of times the sender's QP ack timer expired | RC, XRC, DCT QPs only |
| rdma_resp_local_length_error | Counter | Number of times a respondent detected a local length error | |
| rdma_resp_cqe_error | Counter | Number of response CQE errors | |
| rdma_req_cqe_error | Counter | Number of times a requester detected CQE completion with errors | |
| rdma_req_remote_invalid_request | Counter | Number of remote invalid request errors detected by requester | |
| rdma_req_remote_access_errors | Counter | Number of requested remote access errors | |
| rdma_resp_remote_access_errors | Counter | Number of response remote access errors | |
| rdma_resp_cqe_flush_error | Counter | Number of response CQE flush errors | |
| rdma_req_cqe_flush_error | Counter | Number of request CQE flush errors | |
| rdma_roce_adp_retrans | Counter | Number of RoCE adaptive retransmissions | |
| rdma_roce_adp_retrans_to | Counter | Number of RoCE adaptive retransmission timeouts | |
| rdma_roce_slow_restart | Counter | Number of RoCE slow restarts | |
| rdma_roce_slow_restart_cnps | Counter | Number of CNP packets generated during RoCE slow restart | |
| rdma_roce_slow_restart_trans | Counter | Number of times state transitioned to slow restart | |
| rdma_rp_cnp_ignored | Counter | Number of CNP packets received and ignored by Reaction Point HCA | |
| rdma_rp_cnp_handled | Counter | Number of CNP packets handled by Reaction Point HCA to reduce transmission rate | |
| rdma_np_ecn_marked_roce_packets | Counter | Number of ECN-marked RoCE packets indicating path congestion | |
| rdma_np_cnp_sent | Counter | Number of CNP packets sent when congestion is experienced in RoCEv2 IP header | |
| rdma_rx_icrc_encapsulated | Counter | Number of RoCE packets with ICRC errors | |
| rdma_rx_vport_rdma_unicast_packets | Counter | Number of received unicast RDMA packets | |
| rdma_tx_vport_rdma_unicast_packets | Counter | Number of transmitted unicast RDMA packets | |
| rdma_rx_vport_rdma_multicast_packets | Counter | Number of received multicast RDMA packets | |
| rdma_tx_vport_rdma_multicast_packets | Counter | Number of transmitted multicast RDMA packets | |
| rdma_rx_vport_rdma_unicast_bytes | Counter | Number of bytes received in unicast RDMA packets | |
| rdma_tx_vport_rdma_unicast_bytes | Counter | Number of bytes transmitted in unicast RDMA packets | |
| rdma_rx_vport_rdma_multicast_bytes | Counter | Number of bytes received in multicast RDMA packets | |
| rdma_tx_vport_rdma_multicast_bytes | Counter | Number of bytes transmitted in multicast RDMA packets | |
| rdma_vport_speed_mbps | Gauge | Speed of the port in Mbps | |
| rdma_device_tos | Gauge | RDMA device traffic class (TOS) value | |
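
Because almost all RDMA metrics are counters, their raw values only become meaningful as rates over a scrape interval. As a rough sketch, the code below turns two samples of a byte counter such as rdma_rx_vport_rdma_unicast_bytes into a per-second rate and compares it with rdma_vport_speed_mbps. The function names and all sample numbers are assumptions, and treating a decreasing counter as a reset is a common convention rather than documented exporter behaviour.

```python
def counter_rate(prev, curr, interval_seconds):
    """Per-second increase of a counter between two samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr - prev
    if delta < 0:
        # The counter went backwards, so assume it restarted from zero.
        delta = curr
    return delta / interval_seconds


def link_utilization(prev_bytes, curr_bytes, interval_seconds, speed_mbps):
    """Fraction of the port speed consumed by the observed byte rate."""
    bits_per_second = counter_rate(prev_bytes, curr_bytes, interval_seconds) * 8
    return bits_per_second / (speed_mbps * 1_000_000)


# Made-up samples: 1.25 GB received in 10 seconds on a 100000 Mbps port.
rx_rate = counter_rate(0, 1_250_000_000, 10)
utilization = link_utilization(0, 1_250_000_000, 10, 100_000)
print(f"{rx_rate:.0f} bytes/s, {utilization:.2%} of port speed")
```

The same `counter_rate` helper applies to the error counters, where a sustained non-zero rate of, for example, rdma_out_of_sequence is usually more telling than the absolute count.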