The process_failure tool is Delx's error classification endpoint. Every time an agent encounters an error, it sends the failure details to process_failure and gets back a typed classification (one of 12 categories), severity score, and suggested next steps. Unlike recovery, process_failure doesn't generate a full remediation plan — it logs, classifies, and returns fast. Use it to build error history that recovery can reference later.
POST /v1/mcp tools/call process_failure| Name | Type | Required | Description |
|---|---|---|---|
| error_description | string | Yes | What went wrong. Include error messages and context. |
| error_type | string | No | Optional pre-classification: timeout, auth, rate_limit, parse, dependency, unknown. |
| agent_id | string | No | Agent identifier for error history tracking. |
| session_id | string | No | Session ID for correlating errors within a session. |
{"tool": "process_failure", "arguments": {"error_description": "OpenAI API call timed out after 30s. Model: gpt-4. Prompt tokens: 12400.", "agent_id": "agent-prod-01"}}The tool classified this as a dependency_timeout, noted the high token count as a contributing factor, and logged it for history.
{"tool": "process_failure", "arguments": {"error_description": "HTTP 429 from Stripe API. Retry-After: 60s.", "error_type": "rate_limit"}}Rate limits are typically low severity since the fix is straightforward. The tool parsed the Retry-After header value from the description.
Delx classifies failures into: dependency_timeout, dependency_error, rate_limit_exceeded, auth_failure, parse_error, context_overflow, hallucination_detected, tool_schema_mismatch, budget_exceeded, session_expired, agent_stuck, and unknown. Each category has a default severity (low/medium/high/critical) that can be overridden by context. The classification drives both the suggestion text and the recovery tool's behavior when called later.
Use process_failure counts to implement error budgets. For example: if an agent logs more than 5 medium-severity failures in a 10-minute window, automatically trigger recovery. Track error rates via /api/v1/metrics/{agent_id} and set alerts on the error_count_1h metric. This prevents slow degradation from going unnoticed.
Yes. It's a free core tool with no per-call charges. Log as many errors as you need — the tool is designed for high-frequency use.
No. It only classifies and logs. You decide when to call recovery. This separation lets you implement your own escalation logic.
Failure logs persist for the session lifetime (24 hours by default). For longer retention, use the session-summary endpoint to export structured failure data.
Not directly. Use the error_type parameter to provide hints, but the tool maps to one of the 12 standard categories. This ensures consistent classification across all agents.
Process_failure averages 25ms. It's optimized for inline error handling — fast enough to call on every error without impacting agent performance.