Settings for config.yml

July 6, 2023

Almost everything in tenderduty is controlled via the config.yml file. There are many options, and this document attempts to explain them.

ProTip: If you only have a binary, or are using the docker image, the -example-config flag will have tenderduty dump the example-config.yml file to STDOUT and exit. This can be used to get started without needing to download anything from GitHub. Example:

$ tenderduty -example-config > config.yml

Or if using the docker image:

$ docker run --rm ghcr.io/blockpane/tenderduty:latest -example-config >config.yml

A few notes on how Go handles YAML:

  • Booleans can be specified with either true/false or yes/no
  • If a setting is omitted, it will default to an empty string for strings, zero for numbers, false for booleans, and nil for arrays and structures.
  • Relying on these defaults can be useful for building a more compact config file.

For example, if not using telegram or discord, and only alerting on consecutive missed blocks, the config for a chain could be condensed to:

chains:
  "Osmosis":
    chain_id: osmosis-1
    valoper_address: osmovaloper1xxxxxxx...
    alerts:
      consecutive_enabled: yes
      consecutive_missed: 5
      pagerduty:
        enabled: yes
    nodes:
      - url: tcp://localhost:26657

General Settings

| Config Setting | Description |
| --- | --- |
| `enable_dashboard` | Controls whether the dashboard is enabled. |
| `listen_port` | What TCP port the dashboard will listen on. Only the port is controllable for now. |
| `hide_logs` | `hide_logs` is useful if the dashboard will be posted publicly. It disables the log feed, and obscures most node-related details. Be aware this isn't fully vetted for preventing info leaks about node names, etc. |
| `node_down_alert_minutes` | How long to wait before alerting that a node is down. |
| `prometheus_enabled` | Should the prometheus exporter be enabled? See the prometheus doc for information about what endpoints are available. |
| `prometheus_listen_port` | What port should it listen on? For now only the port is configurable. |
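
As a rough illustration, the general settings sit at the top level of config.yml. The values below are placeholders chosen for this sketch (the port numbers and the three-minute threshold are assumptions, not documented defaults):

```yaml
# Top-level general settings (illustrative values only)
enable_dashboard: yes
listen_port: 8888               # assumed dashboard port
hide_logs: no                   # consider yes before exposing the dashboard publicly
node_down_alert_minutes: 3      # assumed threshold
prometheus_enabled: yes
prometheus_listen_port: 28686   # assumed exporter port
```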

PagerDuty Settings

| Config Setting | Description |
| --- | --- |
| `pagerduty.enabled` | Should we use PD? Be aware that if this is set to no it overrides individual chain alerting settings. |
| `pagerduty.api_key` | This is an API key, not an oauth token. See the pagerduty doc for specific setup details. |
| `pagerduty.default_severity` | Not currently used, but will be soon. This allows setting escalation priorities etc. |
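
A minimal sketch of the global PagerDuty block, assuming the dotted names in the table map to nested YAML keys (as the condensed chain example earlier suggests). The key value is a placeholder:

```yaml
pagerduty:
  enabled: yes
  api_key: xxxxxxxxxxxxxxxx   # placeholder; an API key, not an oauth token
  # default_severity is left out here because it is not currently used
```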

Discord Settings

| Config Setting | Description |
| --- | --- |
| `discord.enabled` | Alert to discord? Also overrides chain-specific alerts if "no". |
| `discord.webhook` | See the discord setup document for how to get this information. |
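
The Discord block might look like the following. The webhook URL is a placeholder; the real one comes from the steps in the discord setup document:

```yaml
discord:
  enabled: yes
  # placeholder webhook URL, replace with your own
  webhook: https://discord.com/api/webhooks/xxxxxxxx/yyyyyyyy
```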

Telegram Settings

| Config Setting | Description |
| --- | --- |
| `telegram.enabled` | Alert via telegram? Note: also supersedes chain-specific settings. |
| `telegram.api_key` | API key ... talk to @BotFather. More setup info in the telegram doc. |
| `telegram.channel` | See the telegram doc for how to get this value. |
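
A hedged sketch of the Telegram block. Both values are made-up placeholders, and quoting them is a precaution assumed here so that YAML reads them as strings rather than numbers:

```yaml
telegram:
  enabled: yes
  api_key: "0000000000:XXXXXXXXXXXXXXXX"   # placeholder token from @BotFather
  channel: "-1000000000"                   # placeholder channel id
```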

Health Check Settings

| Config Setting | Description |
| --- | --- |
| `healthcheck.enabled` | Send pings to an external URL so that it can determine whether the monitor is running? |
| `healthcheck.ping_url` | URL to send pings to. |
| `healthcheck.ping_rate` | How often pings are sent, in seconds. |
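
For illustration, a health check that pings once a minute could be written as below. The URL is a hypothetical endpoint and the 60-second rate is an assumed value:

```yaml
healthcheck:
  enabled: yes
  ping_url: https://example.com/ping/your-check-id   # hypothetical endpoint
  ping_rate: 60                                      # seconds between pings (assumed)
```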

Chain Specific Settings

This section can be repeated for monitoring multiple chains.

| Config Setting | Description |
| --- | --- |
| `chain."name"` | The user-friendly name that will be used for labels. Wrapping it in quotes is highly suggested, to prevent YAML parsing issues if it contains a space or special characters. |
| `chain."name".chain_id` | The chain-id for the chain. This is verified to match when connecting to an RPC server. |
| `chain."name".valoper_address` | The validator's operator (valoper) address. Hooray, in v2 we derive the valcons from abci queries so you don't have to jump through hoops to figure out how to convert ed25519 keys to the appropriate bech32 address. |
| `chain."name".public_fallback` | Should the monitor revert to using public API endpoints if all supplied RPC nodes fail? This isn't always reliable, since not all public nodes have websocket proxying set up correctly. Endpoints are sourced from the cosmos directory. |
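
To show how the section repeats, here is a sketch with two chains under the `chains` key used in the condensed example earlier. The second chain is hypothetical, both addresses are truncated placeholders, and the alerts and nodes blocks are left out for brevity:

```yaml
chains:
  "Osmosis":
    chain_id: osmosis-1
    valoper_address: osmovaloper1xxxxxxx...
    public_fallback: no
  "Juno":                       # hypothetical second chain
    chain_id: juno-1
    valoper_address: junovaloper1xxxxxxx...
    public_fallback: no
```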

Chain Alerting Settings

| Config Setting | Description |
| --- | --- |
| `chain."name".alerts.stalled_enabled` | If the chain stops seeing new blocks, should an alert be sent? |
| `chain."name".alerts.stalled_minutes` | How many minutes a chain must be halted before an alarm is generated. |
| `chain."name".alerts.consecutive_enabled` | Most basic alarm, you just missed x blocks ... would you like to know? |
| `chain."name".alerts.consecutive_missed` | How many missed blocks should trigger a notification? |
| `chain."name".alerts.consecutive_priority` | NOT USED: future hint for pagerduty's routing. |
| `chain."name".alerts.percentage_enabled` | For each chain there is a specific window of blocks and a percentage of missed blocks that will result in a downtime jail infraction. Should an alert be sent if a certain percentage of this window is exceeded? |
| `chain."name".alerts.percentage_missed` | What percentage should trigger the alert? |
| `chain."name".alerts.percentage_priority` | NOT USED: future hint for pagerduty's routing. |
| `chain."name".alerts.alert_if_inactive` | Should an alert be sent if the validator is not in the active set: jailed, tombstoned, or unbonding? |
| `chain."name".alerts.alert_if_no_servers` | Should an alert be sent if no RPC servers are responding? (Note this alarm uses the `node_down_alert_minutes` setting.) |
| `chain."name".alerts.pagerduty.*` | This section is the same as the pagerduty structure above. It allows disabling or enabling specific settings on a per-chain basis, including routing to a different destination. If the api_key is blank it will use the settings defined in `pagerduty.*`. Note both `pagerduty.enabled` and `chain."name".alerts.pagerduty.enabled` must be 'yes' to get alerts. |
| `chain."name".alerts.discord.*` | This section is the same as the discord structure above. It allows disabling or enabling specific settings on a per-chain basis, including routing to a different destination. If the webhook is blank it will use the settings defined in `discord.*`. Note both `discord.enabled` and `chain."name".alerts.discord.enabled` must be 'yes' to get alerts. |
| `chain."name".alerts.telegram.*` | This section is the same as the telegram structure above. It allows disabling or enabling specific settings on a per-chain basis, including routing to a different destination. If the api_key and channel are blank it will use the settings defined in `telegram.*`. Note both `telegram.enabled` and `chain."name".alerts.telegram.enabled` must be 'yes' to get alerts. |
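
The per-chain overrides can be sketched as follows. All thresholds are assumed values for illustration, and the Discord webhook is a placeholder standing in for a chain-specific destination. PagerDuty is enabled with no api_key, so it would fall back to the global `pagerduty.*` settings:

```yaml
chains:
  "Osmosis":
    chain_id: osmosis-1
    valoper_address: osmovaloper1xxxxxxx...
    alerts:
      stalled_enabled: yes
      stalled_minutes: 10        # assumed threshold
      consecutive_enabled: yes
      consecutive_missed: 5
      percentage_enabled: yes
      percentage_missed: 10      # assumed threshold
      alert_if_inactive: yes
      alert_if_no_servers: yes
      pagerduty:
        enabled: yes             # blank api_key, so the global key is used
      discord:
        enabled: yes
        # placeholder webhook routing this chain to a different destination
        webhook: https://discord.com/api/webhooks/xxxxxxxx/yyyyyyyy
      # telegram is omitted, so it defaults to disabled for this chain
```

Remember that the matching global `enabled` flags must also be 'yes' for any of these to fire.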

Node Settings

Note: if this section is omitted and public fallbacks are enabled, tenderduty will only use public endpoints. This is not encouraged for a few reasons: public nodes can be unreliable, some proxy servers do not support websockets (which tenderduty relies on for watching blocks), and it consumes resources from other validators.

| Config Setting | Description |
| --- | --- |
| `chain."name".nodes[]` | This is an array of nodes to use as RPC servers. |
| `chain."name".nodes[].url` | Should include the protocol://hostname:port. For now only http (tcp is an alias) and https (with a valid certificate) are supported. UDS and insecure TLS support is planned. |
| `chain."name".nodes[].alert_if_down` | Should an alert be sent if this host isn't responding? Uses the `node_down_alert_minutes` setting to determine the threshold. |
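
A short sketch of a chain with two RPC nodes and public fallback as a last resort. The second hostname is hypothetical, and which node alerts when down is an arbitrary choice for the example:

```yaml
chains:
  "Osmosis":
    chain_id: osmosis-1
    valoper_address: osmovaloper1xxxxxxx...
    public_fallback: yes
    nodes:
      - url: tcp://localhost:26657
        alert_if_down: yes
      - url: https://rpc.example.com:443   # hypothetical backup node
        alert_if_down: no
```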