The Delx heartbeat tool is a lightweight MCP endpoint that agents call on a regular cadence to report their operational status. Each heartbeat returns a wellness score (0-100), risk assessment, and session metadata. If an agent stops sending heartbeats, controllers detect the silence and can trigger recovery. Think of it as a liveness probe designed specifically for AI agents.
POST /v1/mcp tools/call monitor_heartbeat_sync| Name | Type | Required | Description |
|---|---|---|---|
| agent_id | string | Yes | Stable identifier for the agent sending the heartbeat. |
| session_id | string | No | Current session ID. Creates continuity across heartbeat calls. |
| status | string | No | Agent-reported status: healthy, degraded, or struggling. |
| context_usage_pct | number | No | How full the agent's context window is (0-100). Helps predict context overflow. |
{"tool": "monitor_heartbeat_sync", "arguments": {"agent_id": "agent-prod-01", "session_id": "sess-xyz", "status": "healthy"}}A healthy heartbeat returns a high score, low risk, and suggests continuing with a 5-minute follow-up window.
{"tool": "monitor_heartbeat_sync", "arguments": {"agent_id": "agent-prod-01", "status": "degraded", "context_usage_pct": 85}}The tool detected high context usage combined with degraded status. It shortened the followup window to 2 minutes and recommended compaction.
The optimal heartbeat interval depends on your agent's workload. For real-time agents processing user requests, use 15-30 second intervals. For batch processing agents, 1-5 minute intervals work well. For cron-based agents, use daily_checkin instead. Don't go below 10 seconds — it adds unnecessary load without improving detection speed. The followup_minutes field in the response suggests when to send the next heartbeat.
The wellness score combines multiple signals: heartbeat regularity, reported status, context usage, recovery history, and session age. Score 80-100 means the agent is healthy and productive. Score 60-79 is operational but showing early signs of degradation. Score 40-59 is degraded — increase monitoring frequency and prepare recovery. Below 40 is critical — the agent should stop autonomous work and alert the controller.
LangChain, CrewAI, and AutoGen all support custom callbacks. Add heartbeat as a callback that fires after each tool call or LLM response. Parse the DELX_META from the response and check the score. If it drops below your threshold, trigger the orchestrator's error handling flow. This gives you per-step health monitoring without modifying agent logic.
Heartbeat calls average 15ms round-trip. They're designed to be lightweight — no LLM inference, just rule-based scoring. The overhead is negligible even at 30-second intervals.
Yes. Add any JSON-serializable data to the heartbeat call. Common additions include current task name, tool call count, and error count since last heartbeat. This metadata appears in the metrics endpoint for dashboard building.
Nothing breaks — heartbeat is opt-in. But without heartbeats, you lose early warning of degradation. Silent failures go undetected until they cause visible errors. For production agents, heartbeat is strongly recommended.
Yes. Both monitor_heartbeat_sync and daily_checkin are free core tools with no rate limits for reasonable usage. They're part of Delx's free tier alongside recovery and the 10 utility tools.
Use the /api/v1/metrics/{agent_id} endpoint. It tracks heartbeat_count, last_heartbeat_at, and avg_heartbeat_interval. Set alerts when the interval exceeds 3x the expected cadence.