XLabs/dead-mans-switch

 
 


Dead Man's Switch

Forked from pingcap/dead-mans-switch

Dead Man's Switch is a simple Prometheus Alertmanager webhook service. It provides a basic mechanism to ensure that the alerting pipeline is healthy.

It will send notifications to PagerDuty when it detects that the alerting pipeline is unhealthy.

How it works

Prometheus provides a mechanism to define alerts that are always firing, using the expr: vector(1) rule. For example:

- alert: Watchdog
  annotations:
    message: |
      This is an alert meant to ensure that the entire alerting pipeline is functional.
      This alert is always firing, therefore it should always be firing in Alertmanager
      and always fire against a receiver. There are integrations with various notification
      mechanisms that send a notification when this alert is not firing. For example the
      "DeadMansSnitch" integration in PagerDuty.
  expr: vector(1)
  labels:
    severity: none

We call this the Watchdog alert. With it, we can monitor the alerting pipeline: if the alert is not firing, the alert system is unhealthy and alerts are not being sent.

Dead Man's Switch exposes a webhook endpoint at /webhook, which should receive the Watchdog alert from Alertmanager.

It evaluates the alerts it receives against the rules defined in the configuration file.

You need to define your own Watchdog alert in your monitoring infrastructure.

Configuration

The application is configured via command-line flags and a simple YAML configuration file.

Command line flags

  • -port: Port on which the server listens for webhooks, health checks, and metrics.
  • -config: Path to the configuration file.

Environment variables

  • PAGERDUTY_API_KEY: Overrides notify.pagerDuty.key, so the key may be omitted from the configuration file.
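The override described above could look roughly like the following sketch. The resolveKey helper is hypothetical, and treating an empty environment value as "not set" is an assumption made for this example; the project's own handling may differ.

```go
package main

import (
	"fmt"
	"os"
)

// resolveKey returns the PAGERDUTY_API_KEY value when it is set and
// non-empty, and falls back to the key read from the configuration file.
// Hypothetical helper; ignoring an empty value is an assumption.
func resolveKey(fileKey string, lookup func(string) (string, bool)) string {
	if v, ok := lookup("PAGERDUTY_API_KEY"); ok && v != "" {
		return v
	}
	return fileKey
}

func main() {
	// "key-from-config-file" stands in for notify.pagerDuty.key.
	key := resolveKey("key-from-config-file", os.LookupEnv)
	fmt.Println("PagerDuty key resolved:", key != "")
}
```

Passing the lookup function as a parameter keeps the helper easy to test without touching the process environment.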

Configuration file

The configuration file defines the evaluation interval and the rules used to check whether the alerts received are the expected ones and whether a notification should be triggered.

There are also options to configure the notification sent to PagerDuty.

You can find an example of configuration in config.example.yaml.

Alertmanager config

Make sure the Watchdog alert is sent to Dead Man's Switch as an Alertmanager Receiver:

route:
  routes:
    - receiver: dead-mans-switch
      match:
        alertname: 'Watchdog'

Make sure the Alertmanager Receiver is defined:

receivers:
- name: dead-mans-switch
  webhook_configs:
  - url: http://dead-mans-switch:8080/webhook

Develop

Build binary in local environment:

make build

Run in local environment:

make run

Send an Alertmanager webhook payload:

curl -H "Content-Type: application/json" --data @payload.json http://localhost:8080/webhook

Release Process

The release process is triggered by tags. To trigger a new image build and release, use the following:

git tag <version>
git push origin --tags

The Docker image should be available at ghcr.io/XLabs/dead-mans-switch with the tags <version> and latest.

Pull requests trigger the pipeline but will not publish the built container image.

About

A bypass monitoring prober
