OpenAlerts

An alerting layer for agentic frameworks.

Quickstart · Alert Rules · LLM Enrichment · Dashboard · Commands


AI agents fail silently. LLM errors, stuck sessions, gateway outages — nobody knows until a user complains.

OpenAlerts watches your agent in real time and alerts you the moment something goes wrong. It has a framework-agnostic core with adapter plugins, starting with OpenClaw.

Quickstart

Currently supports OpenClaw only. More framework adapters are coming soon. Note: this project is being revamped over the next few hours.

1. Install

openclaw plugins install @steadwing/openalerts

2. Configure

If you already have a channel paired with OpenClaw (e.g. Telegram via openclaw pair), no config is needed — OpenAlerts auto-detects where to send alerts.

Otherwise, set it explicitly in openclaw.json:

{
	"plugins": {
		"entries": {
			"openalerts": {
				"enabled": true,
				"config": {
					"alertChannel": "telegram", // telegram | discord | slack | whatsapp | signal
					"alertTo": "YOUR_CHAT_ID",
				},
			},
		},
	},
}

Auto-detection priority: explicit config > static allowFrom in channel config > pairing store.
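
The priority order above can be read as a simple fallback chain. The sketch below is illustrative only: the `AlertTarget` and `TargetSources` types and the `resolveAlertTarget` function are hypothetical names invented for this example, not OpenAlerts internals.

```typescript
// Hypothetical shapes for illustration; not the plugin's real types.
interface AlertTarget {
  channel: string;
  to: string;
}

interface TargetSources {
  explicit?: Partial<AlertTarget>; // alertChannel / alertTo from plugin config
  staticAllowFrom?: AlertTarget;   // taken from a channel's static allowFrom
  pairingStore?: AlertTarget;      // learned when a channel was paired
}

// Returns the first fully specified target, in priority order:
// explicit config > static allowFrom > pairing store.
function resolveAlertTarget(sources: TargetSources): AlertTarget | undefined {
  const { explicit, staticAllowFrom, pairingStore } = sources;
  if (explicit?.channel && explicit?.to) {
    return { channel: explicit.channel, to: explicit.to };
  }
  return staticAllowFrom ?? pairingStore;
}

// With no explicit config, the paired channel is used.
const resolved = resolveAlertTarget({
  pairingStore: { channel: "telegram", to: "YOUR_CHAT_ID" },
});
console.log(resolved);
```

If all three sources are empty, the function returns `undefined`; in that case a caller would have nowhere to send alerts.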

3. Restart & verify

openclaw gateway stop && openclaw gateway run

Send /health to your bot. You should get a live status report back — zero LLM tokens consumed.

That's it. OpenAlerts is now watching your agent.

Dashboard

A real-time web dashboard is embedded in the gateway at:

http://127.0.0.1:18789/openalerts

  • Activity: live event timeline with session flows, tool calls, LLM usage
  • System Logs: filtered, structured logs with search
  • Health: rule status, alert history, system stats

Alert Rules

Eight rules run against every event in real time. All thresholds and cooldowns are configurable.

| Rule | Watches for | Severity | Threshold (default) |
| --- | --- | --- | --- |
| llm-errors | LLM/agent failures in 1 min window | ERROR | 1 error |
| infra-errors | Infrastructure errors in 1 min window | ERROR | 1 error |
| gateway-down | No heartbeat received | CRITICAL | 30000 ms (30s) |
| session-stuck | Session idle too long | WARN | 120000 ms (2 min) |
| high-error-rate | Message failure rate over last 20 | ERROR | 50% |
| queue-depth | Queued items piling up | WARN | 10 items |
| tool-errors | Tool failures in 1 min window | WARN | 1 error |
| heartbeat-fail | Consecutive heartbeat failures | ERROR | 3 failures |
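
To make the windowed rules more concrete, here is a minimal sketch of two ways such checks could be computed: a count inside a time window (as in llm-errors) and a failure rate over the most recent N outcomes (as in high-error-rate). The class names are hypothetical, and details such as comparing with `>=` or evaluating before 20 messages have been seen are assumptions, not documented behavior.

```typescript
// Illustrative sketch only; class and method names are hypothetical.
class SlidingWindowCounter {
  private timestamps: number[] = [];
  private windowMs: number;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  // Records an event and returns how many events fall inside the window.
  record(nowMs: number): number {
    this.timestamps.push(nowMs);
    const cutoff = nowMs - this.windowMs;
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    return this.timestamps.length;
  }
}

// Failure rate over the most recent `size` outcomes (e.g. the last 20 messages).
class RecentFailureRate {
  private outcomes: boolean[] = []; // true = failed
  private size: number;

  constructor(size: number) {
    this.size = size;
  }

  // Records an outcome and returns the failure fraction in [0, 1].
  record(failed: boolean): number {
    this.outcomes.push(failed);
    if (this.outcomes.length > this.size) this.outcomes.shift();
    const failures = this.outcomes.filter(Boolean).length;
    return failures / this.outcomes.length;
  }
}

const llmErrors = new SlidingWindowCounter(60_000); // 1 min window
const llmErrorThreshold = 1;                        // default from the table
const errorCount = llmErrors.record(Date.now());
console.log(errorCount >= llmErrorThreshold ? "llm-errors would fire" : "ok");
```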

Every rule also accepts:

  • enabled: set to false to disable the rule (default: true)
  • cooldownMinutes: minutes before the same rule can fire again (default: 15)
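
The cooldown option can be pictured as a per-rule gate that suppresses repeat alerts until enough time has passed. This is a hedged sketch with a hypothetical `CooldownGate` class; whether the boundary is inclusive, and how state is stored, are assumptions made for the example.

```typescript
// Hypothetical helper; illustrates cooldown gating, not the real implementation.
class CooldownGate {
  private lastFired = new Map<string, number>();
  private cooldownMs: number;

  constructor(cooldownMinutes: number = 15) {
    this.cooldownMs = cooldownMinutes * 60_000;
  }

  // Returns true if the rule may fire now, and records the firing time.
  tryFire(ruleId: string, nowMs: number): boolean {
    const last = this.lastFired.get(ruleId);
    if (last !== undefined && nowMs - last < this.cooldownMs) {
      return false; // still cooling down
    }
    this.lastFired.set(ruleId, nowMs);
    return true;
  }
}

const gate = new CooldownGate(); // 15 minute default
const now = Date.now();
console.log(gate.tryFire("llm-errors", now));          // first firing is allowed
console.log(gate.tryFire("llm-errors", now + 60_000)); // one minute later: suppressed
```

Each rule ID is tracked separately, so a suppressed llm-errors alert does not delay a gateway-down alert.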

To tune rules, add a rules object in your plugin config:

{
	"plugins": {
		"entries": {
			"openalerts": {
				"config": {
					"cooldownMinutes": 10,
					"rules": {
						"llm-errors": { "threshold": 5 },
						"infra-errors": { "cooldownMinutes": 30 },
						"high-error-rate": { "enabled": false },
						"gateway-down": { "threshold": 60000 },
					},
				},
			},
		},
	},
}
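
One plausible reading of this config is that each rule's effective settings come from a layered lookup: the per-rule override first, then the config-level `cooldownMinutes`, then the built-in default. The sketch below assumes exactly that precedence; `resolveRule` and the interfaces are hypothetical names, and the default thresholds are copied from the table above.

```typescript
// Hypothetical types mirroring the config shape shown above.
interface RuleOverride {
  enabled?: boolean;
  threshold?: number;
  cooldownMinutes?: number;
}

interface PluginConfig {
  cooldownMinutes?: number;
  rules?: Record<string, RuleOverride>;
}

interface ResolvedRule {
  enabled: boolean;
  threshold: number;
  cooldownMinutes: number;
}

// Default thresholds copied from the Alert Rules table.
const DEFAULT_THRESHOLDS: Record<string, number> = {
  "llm-errors": 1,
  "infra-errors": 1,
  "gateway-down": 30000,
  "session-stuck": 120000,
  "high-error-rate": 50,
  "queue-depth": 10,
  "tool-errors": 1,
  "heartbeat-fail": 3,
};

// Assumed precedence: per-rule override > config-level value > built-in default.
function resolveRule(ruleId: string, config: PluginConfig): ResolvedRule {
  const defaultThreshold = DEFAULT_THRESHOLDS[ruleId];
  if (defaultThreshold === undefined) {
    throw new Error(`Unknown rule: ${ruleId}`);
  }
  const override = config.rules?.[ruleId] ?? {};
  return {
    enabled: override.enabled ?? true,
    threshold: override.threshold ?? defaultThreshold,
    cooldownMinutes: override.cooldownMinutes ?? config.cooldownMinutes ?? 15,
  };
}

const exampleConfig: PluginConfig = {
  cooldownMinutes: 10,
  rules: { "llm-errors": { threshold: 5 }, "infra-errors": { cooldownMinutes: 30 } },
};
console.log(resolveRule("llm-errors", exampleConfig));
console.log(resolveRule("infra-errors", exampleConfig));
```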

Set "quiet": true at the config level for log-only mode (no messages sent).

LLM-Enriched Alerts

OpenAlerts can optionally use your configured LLM to enrich alerts with a human-friendly summary and an actionable suggestion. This feature is disabled by default — opt in by setting "llmEnriched": true in your plugin config:

{
	"plugins": {
		"entries": {
			"openalerts": {
				"config": {
					"llmEnriched": true,
				},
			},
		},
	},
}

When enabled, alerts include an LLM-generated summary and action:

1 agent error(s) on unknown in the last minute. Last: 401 Incorrect API key...

Summary: Your OpenAI API key is invalid or expired — the agent cannot make LLM calls.
Action: Update your API key in ~/.openclaw/.env with a valid key from platform.openai.com/api-keys

  • Model: reads from agents.defaults.model.primary in your openclaw.json (e.g. "openai/gpt-4o-mini")
  • API key: reads from the corresponding environment variable (OPENAI_API_KEY, ANTHROPIC_API_KEY, GROQ_API_KEY, etc.)
  • Supported providers: OpenAI, Anthropic, Groq, Together, DeepSeek (and any OpenAI-compatible API)
  • Graceful fallback: if the LLM call fails or times out (10s), the original alert is sent unchanged
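
The graceful fallback described above is a common pattern: race the enrichment call against a timer and return the original alert if anything goes wrong. The following is a minimal sketch under that assumption. `enrichOrFallback` and the `enrich` callback are hypothetical stand-ins for whatever LLM call the plugin actually makes.

```typescript
// Hypothetical wrapper: `enrich` stands in for the real LLM call.
async function enrichOrFallback(
  alertText: string,
  enrich: (text: string) => Promise<string>,
  timeoutMs: number = 10_000, // 10s, matching the documented timeout
): Promise<string> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("enrichment timed out")), timeoutMs);
  });
  try {
    // Whichever settles first wins: the enriched text or the timeout.
    return await Promise.race([enrich(alertText), timeout]);
  } catch {
    // LLM failure or timeout: send the original alert unchanged.
    return alertText;
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}

// Usage with a stub enricher that always fails.
enrichOrFallback("1 agent error(s) in the last minute.", async () => {
  throw new Error("401 Incorrect API key");
}).then((text) => console.log(text));
```

Clearing the timer in `finally` keeps a fast, successful enrichment from leaving a pending 10 second timer behind.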

Commands

Zero-token chat commands available in any connected channel:

| Command | What it does |
| --- | --- |
| /health | System health snapshot: uptime, active alerts, stats |
| /alerts | Recent alert history with severity and timestamps |
| /dashboard | Returns the dashboard URL |
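
"Zero-token" here means the reply is built from local state rather than by calling the LLM. One way to picture that is a small dispatcher that answers known commands directly and passes everything else through. The `MonitorState` shape, the `handleCommand` function, and the reply formats below are all hypothetical; only the command names and the dashboard URL come from this README.

```typescript
// Hypothetical in-memory state; field names are illustrative.
interface MonitorState {
  startedAtMs: number;
  activeAlerts: number;
  history: { rule: string; severity: string; atMs: number }[];
}

const DASHBOARD_URL = "http://127.0.0.1:18789/openalerts";

// Returns a reply for known commands, or undefined so the message
// can continue on to the agent as usual. No LLM call is involved.
function handleCommand(
  text: string,
  state: MonitorState,
  nowMs: number,
): string | undefined {
  const command = text.trim().split(/\s+/)[0];
  switch (command) {
    case "/health": {
      const uptimeSec = Math.floor((nowMs - state.startedAtMs) / 1000);
      return `Uptime: ${uptimeSec}s, active alerts: ${state.activeAlerts}`;
    }
    case "/alerts":
      if (state.history.length === 0) return "No recent alerts.";
      return state.history
        .map((a) => `[${a.severity}] ${a.rule} at ${new Date(a.atMs).toISOString()}`)
        .join("\n");
    case "/dashboard":
      return DASHBOARD_URL;
    default:
      return undefined;
  }
}

const demoState: MonitorState = { startedAtMs: Date.now(), activeAlerts: 0, history: [] };
console.log(handleCommand("/dashboard", demoState, Date.now()));
```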

Development

npm install        # install dependencies
npm run build      # compile TypeScript
npm run typecheck  # type-check without emitting
npm run clean      # remove dist/

License

Apache-2.0