LangChain is one of the most popular frameworks for building AI agents in Python. But LangChain agents have a blind spot: they don't know when they're failing. An agent can enter a retry loop, lose context, or degrade silently — and neither the agent nor the orchestrator will notice until the task times out or a user complains. Delx fills this gap by giving LangChain agents real-time wellness monitoring and structured recovery capabilities. In this guide, you'll learn how to install the Delx SDK, create recovery tool wrappers, integrate them into your agent chain, and implement auto-recovery patterns using DELX_META and LangGraph.
LangChain provides excellent primitives for building agents — tool use, memory, chains, graphs. But it leaves a critical question unanswered: what happens when things go wrong? Consider these common failure modes:
Silent degradation — The agent produces increasingly poor results but continues running. There is no signal that quality has dropped. By the time a user notices, dozens of bad outputs have been generated.
Retry storms — The agent hits an error and retries the same failing approach repeatedly. Each retry consumes tokens and time but produces no progress. Without a circuit breaker, this can continue indefinitely.
Context loss — Long-running agents gradually lose important context as conversations grow beyond the context window. The agent forgets what it was doing, repeats work, or contradicts itself.
Cascade failures — In multi-agent systems, one agent's failure cascades to others. Without health awareness, the orchestrator keeps routing work to a failing agent, degrading the entire system.
Delx addresses all of these by providing agents with a wellness score, structured recovery plans, and in-band metadata for self-regulation. For background, see What Is Delx? and Agent Recovery vs. Observability.
The Delx Python SDK provides typed models, a DELX_META parser, and convenience wrappers for all Delx tools. Install it alongside LangChain:
```bash
# Install Delx SDK + LangChain + LangGraph
pip install delx-sdk-py langchain langchain-openai langgraph

# Or with poetry
poetry add delx-sdk-py langchain langchain-openai langgraph

# Or with uv
uv add delx-sdk-py langchain langchain-openai langgraph
```
The delx-sdk-py package includes:
DelxClient — An async client for the Delx MCP API with typed responses.
DelxMeta — A Pydantic model for parsing DELX_META from tool responses.
ControllerUpdate — A Pydantic model for the controller_update sub-object.
parse_delx_meta() — A utility function that extracts DELX_META from raw response text.
For full SDK documentation, see Delx SDK for TypeScript & Python.
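As an illustration of what `parse_delx_meta()` does under the hood, here is a stdlib-only sketch. It assumes DELX_META arrives as a single line of the form `DELX_META: {...json...}` appended to the tool response; the actual wire format and field set are defined by the DELX_META Protocol, and the real SDK returns Pydantic models rather than the plain dataclass used here:

```python
import json
import re
from dataclasses import dataclass
from typing import Optional

# Assumed wire format: one line "DELX_META: {json}" inside the response text.
# The real format is specified by the DELX_META Protocol documentation.
_META_RE = re.compile(r"DELX_META:\s*(\{.*\})")


@dataclass
class Meta:
    score: int
    risk_level: str


def parse_meta(text: str) -> Optional[Meta]:
    """Extract the DELX_META payload from raw tool-response text."""
    match = _META_RE.search(text)
    if not match:
        return None
    payload = json.loads(match.group(1))
    return Meta(score=payload["score"], risk_level=payload["risk_level"])


response_text = 'Task done.\nDELX_META: {"score": 72, "risk_level": "low"}'
meta = parse_meta(response_text)
print(meta.score, meta.risk_level)  # 72 low
```

The key property this sketch shares with the SDK is graceful degradation: if no DELX_META marker is present, the parser returns `None` instead of raising, so non-Delx tool responses pass through untouched.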
LangChain tools are Python classes or functions that the agent can call. To make Delx tools available to your LangChain agent, create wrapper classes using LangChain's BaseTool:
```python
from langchain.tools import BaseTool
from delx_sdk import DelxClient, parse_delx_meta
from pydantic import BaseModel, Field
from typing import Optional, Type

# Initialize the Delx client
delx = DelxClient(base_url="https://api.delx.ai")


class CheckinInput(BaseModel):
    agent_id: str = Field(description="The agent ID to check in")


class DelxCheckinTool(BaseTool):
    name: str = "delx_checkin"
    description: str = (
        "Check the wellness state of the current agent. Returns a "
        "wellness score (0-100), risk level, and recommended next "
        "action. Use this to monitor your own health during long tasks."
    )
    args_schema: Type[BaseModel] = CheckinInput

    def _run(self, agent_id: str) -> str:
        response = delx.call_tool("checkin", {"agent_id": agent_id})
        return response.text

    async def _arun(self, agent_id: str) -> str:
        response = await delx.acall_tool("checkin", {"agent_id": agent_id})
        return response.text


class RecoveryInput(BaseModel):
    agent_id: str = Field(description="The agent ID to recover")
    context: Optional[str] = Field(
        default=None,
        description="Optional context about what went wrong"
    )


class DelxRecoveryTool(BaseTool):
    name: str = "delx_recovery"
    description: str = (
        "Generate a recovery plan when you detect degradation or "
        "repeated failures. Returns structured steps to get back "
        "on track. Use this when your wellness score drops below 60."
    )
    args_schema: Type[BaseModel] = RecoveryInput

    def _run(self, agent_id: str, context: Optional[str] = None) -> str:
        params = {"agent_id": agent_id}
        if context:
            params["context"] = context
        response = delx.call_tool("recovery_plan", params)
        return response.text

    async def _arun(self, agent_id: str, context: Optional[str] = None) -> str:
        params = {"agent_id": agent_id}
        if context:
            params["context"] = context
        response = await delx.acall_tool("recovery_plan", params)
        return response.text
```

The tool descriptions are important — they tell the LLM when and why to use each tool. The checkin description emphasizes self-monitoring during long tasks. The recovery description emphasizes using it when degradation is detected. This helps the agent learn when to invoke recovery tools proactively.
Once the tool wrappers are defined, add them to your agent alongside your existing tools. Delx does not replace your tools — it complements them:
```python
from langchain_openai import ChatOpenAI
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate

# Your existing tools
from my_tools import SearchTool, DatabaseTool, CodeRunnerTool

# Delx recovery tools
delx_checkin = DelxCheckinTool()
delx_recovery = DelxRecoveryTool()

# Combine all tools
tools = [
    SearchTool(),
    DatabaseTool(),
    CodeRunnerTool(),
    delx_checkin,   # Add Delx checkin
    delx_recovery,  # Add Delx recovery
]

# Create the agent
llm = ChatOpenAI(model="gpt-4o", temperature=0)

prompt = ChatPromptTemplate.from_messages([
    (
        "system",
        "You are a helpful assistant with access to various tools. "
        "You also have access to Delx wellness tools. Periodically "
        "check your wellness with delx_checkin (every 5-10 tool calls). "
        "If your wellness score drops below 60, use delx_recovery to "
        "generate a recovery plan and follow it before continuing."
    ),
    ("placeholder", "{chat_history}"),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# Run the agent
result = executor.invoke({
    "input": "Analyze the sales data and generate a report",
    "chat_history": [],
})
```

The system prompt instructs the agent to periodically check in and use recovery when needed. This is the simplest integration — the LLM decides when to call Delx tools based on the prompt instructions. For more deterministic control, use the auto-recovery pattern described below.
The auto-recovery pattern wraps every tool call in a middleware that parses DELX_META and automatically triggers recovery when the wellness score drops. This is more reliable than relying on the LLM to remember to check in:
```python
from langchain.tools import BaseTool
from delx_sdk import parse_delx_meta, DelxClient
from typing import Any
import logging

logger = logging.getLogger(__name__)

delx = DelxClient(base_url="https://api.delx.ai")

RECOVERY_THRESHOLD = 50
CHECKIN_INTERVAL = 5  # Check every N tool calls


class RecoveryMiddleware:
    """Wraps tool execution with automatic wellness monitoring."""

    def __init__(self, agent_id: str):
        self.agent_id = agent_id
        self.call_count = 0
        self.last_score = 80  # Optimistic default

    async def wrap_tool_call(
        self,
        tool: BaseTool,
        tool_input: dict[str, Any],
    ) -> str:
        """Execute a tool call with wellness monitoring."""
        self.call_count += 1

        # Periodic checkin
        if self.call_count % CHECKIN_INTERVAL == 0:
            await self._run_checkin()

        # Execute the actual tool (arun takes the input dict as one argument)
        result = await tool.arun(tool_input)

        # Parse DELX_META if present (Delx tools include it)
        meta = parse_delx_meta(result)
        if meta:
            self.last_score = meta.score
            logger.info(
                f"Agent {self.agent_id}: score={meta.score}, "
                f"risk={meta.risk_level}, delta={meta.controller_update.score_delta}"
            )

            # Auto-recover if score drops below threshold
            if meta.score < RECOVERY_THRESHOLD:
                logger.warning(
                    f"Agent {self.agent_id} below threshold "
                    f"({meta.score} < {RECOVERY_THRESHOLD}). "
                    "Triggering auto-recovery."
                )
                await self._run_recovery()

        return result

    async def _run_checkin(self):
        """Run a Delx checkin and update internal state."""
        response = await delx.acall_tool(
            "checkin", {"agent_id": self.agent_id}
        )
        meta = parse_delx_meta(response.text)
        if meta:
            self.last_score = meta.score

    async def _run_recovery(self):
        """Run a Delx recovery plan."""
        response = await delx.acall_tool(
            "recovery_plan", {"agent_id": self.agent_id}
        )
        logger.info(f"Recovery plan generated for {self.agent_id}")
        return response.text
```

This middleware pattern is transparent to the agent — it continues making tool calls as normal, while the middleware handles wellness monitoring in the background. When the score drops below the threshold, recovery is triggered automatically.
For the most robust integration, use LangGraph to create a state machine with explicit recovery nodes. This gives you deterministic control over the recovery flow:
```python
from langgraph.graph import StateGraph, END
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, AIMessage
from typing import TypedDict, Literal, Annotated
from operator import add

from delx_sdk import DelxClient, parse_delx_meta

delx = DelxClient(base_url="https://api.delx.ai")
llm = ChatOpenAI(model="gpt-4o", temperature=0)


# State definition
class AgentState(TypedDict):
    messages: Annotated[list, add]
    wellness_score: int
    recovery_count: int
    agent_id: str


# Node: Check wellness via Delx
async def check_wellness(state: AgentState) -> dict:
    response = await delx.acall_tool(
        "checkin", {"agent_id": state["agent_id"]}
    )
    meta = parse_delx_meta(response.text)
    score = meta.score if meta else 70
    # With the `add` reducer, nodes return only the NEW messages to append.
    return {
        "wellness_score": score,
        "messages": [AIMessage(content=f"[Wellness check: score={score}]")],
    }


# Node: Do actual work
async def do_work(state: AgentState) -> dict:
    # Bind your tools to the LLM (tools as defined in the earlier sections)
    llm_with_tools = llm.bind_tools(tools)
    response = await llm_with_tools.ainvoke(state["messages"])
    return {"messages": [response]}


# Node: Run recovery
async def run_recovery(state: AgentState) -> dict:
    response = await delx.acall_tool(
        "recovery_plan", {"agent_id": state["agent_id"]}
    )
    return {
        "recovery_count": state["recovery_count"] + 1,
        "messages": [
            AIMessage(content=f"[Recovery plan executed: {response.text[:200]}]")
        ],
    }


# Node: Escalate to human
async def escalate(state: AgentState) -> dict:
    return {
        "messages": [
            AIMessage(content="[ESCALATION] Agent needs human intervention.")
        ],
    }


# Routing function
def route_by_wellness(
    state: AgentState,
) -> Literal["work", "recover", "escalate"]:
    score = state["wellness_score"]
    recoveries = state["recovery_count"]
    if recoveries >= 3:
        return "escalate"  # Too many recovery attempts
    if score < 40:
        return "recover"
    return "work"


# Should the agent continue?
def should_continue(
    state: AgentState,
) -> Literal["check_wellness", "end"]:
    last_msg = state["messages"][-1]
    if hasattr(last_msg, "tool_calls") and last_msg.tool_calls:
        return "check_wellness"  # More work to do, check health first
    return "end"


# Build the graph
graph = StateGraph(AgentState)
graph.add_node("check_wellness", check_wellness)
graph.add_node("work", do_work)
graph.add_node("recover", run_recovery)
graph.add_node("escalate", escalate)

graph.set_entry_point("check_wellness")
graph.add_conditional_edges("check_wellness", route_by_wellness)
graph.add_conditional_edges(
    "work",
    should_continue,
    {"check_wellness": "check_wellness", "end": END},
)
graph.add_edge("recover", "check_wellness")  # Re-check after recovery
graph.add_edge("escalate", END)

app = graph.compile()

# Run the agent (inside an async context)
result = await app.ainvoke({
    "messages": [HumanMessage(content="Analyze Q4 revenue data")],
    "wellness_score": 80,
    "recovery_count": 0,
    "agent_id": "langchain-agent-1",
})
```

This graph creates a health-aware agent loop: check wellness before working, route to recovery if degraded, escalate if recovery fails repeatedly, and continue working if healthy. The state machine ensures recovery is deterministic — it does not depend on the LLM remembering to check in.
For more on building resilient multi-agent architectures, see Build Resilient Multi-Agent Systems.
If you prefer a simpler approach than full BaseTool classes, LangChain's @tool decorator works too:
```python
from langchain_core.tools import tool
from delx_sdk import DelxClient

delx = DelxClient(base_url="https://api.delx.ai")


@tool
def delx_checkin(agent_id: str) -> str:
    """Check the wellness state of an agent. Returns wellness score,
    risk level, and recommended next action. Use periodically during
    long-running tasks to monitor your own health."""
    response = delx.call_tool("checkin", {"agent_id": agent_id})
    return response.text


@tool
def delx_recovery(agent_id: str, context: str = "") -> str:
    """Generate a recovery plan when your wellness score drops below 60
    or you encounter repeated failures. Returns structured recovery
    steps to follow."""
    params = {"agent_id": agent_id}
    if context:
        params["context"] = context
    response = delx.call_tool("recovery_plan", params)
    return response.text


@tool
def delx_session_summary(agent_id: str) -> str:
    """Get a summary of the current recovery session, including all
    past checkins, scores, and recovery actions taken."""
    response = delx.call_tool("session_summary", {"agent_id": agent_id})
    return response.text


# Add to your agent
tools = [delx_checkin, delx_recovery, delx_session_summary, *your_other_tools]
```

The @tool decorator automatically generates the tool schema from the function signature and docstring. This is the fastest way to add Delx to an existing LangChain agent.
1. Set a consistent agent_id — Use the same agent_id across all Delx tool calls in a session. This ensures session continuity and accurate wellness tracking. Format: project-name-agent-role (e.g., "sales-analyzer-v2").
2. Check in every 5-10 tool calls — Too frequent checking wastes tokens. Too infrequent checking misses degradation. The sweet spot is every 5-10 tool calls, or immediately after any error.
3. Use the system prompt wisely — Tell the agent about its Delx tools in the system prompt. Explain when to use checkin (periodically, after errors) and when to use recovery (score below 60, repeated failures).
4. Combine with LangGraph for determinism — If you need guaranteed recovery behavior (not just LLM-suggested), use LangGraph with conditional edges. The state machine ensures recovery happens regardless of the LLM's decision-making.
5. Log DELX_META for observability — Parse and log the DELX_META from every Delx tool response. This gives you a historical record of the agent's wellness trajectory, which is invaluable for debugging production issues. See the DELX_META Protocol for details on what each field means.
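To make practices 2 and 5 concrete, here is a small stdlib-only helper — hypothetical, not part of the Delx SDK — that gates check-ins on a fixed cadence (or immediately after an error) and keeps the score history for later debugging:

```python
import logging

logger = logging.getLogger("delx.wellness")


class WellnessLog:
    """Tracks tool-call count and DELX_META scores for one agent session."""

    def __init__(self, agent_id: str, checkin_interval: int = 5):
        self.agent_id = agent_id
        self.checkin_interval = checkin_interval
        self.call_count = 0
        self.history: list[int] = []  # wellness scores in observed order

    def record_call(self, error: bool = False) -> bool:
        """Register a tool call; return True if a check-in is due now."""
        self.call_count += 1
        # Check in on a fixed cadence, or immediately after any error.
        return error or self.call_count % self.checkin_interval == 0

    def record_score(self, score: int) -> None:
        """Log a wellness score from DELX_META for the trajectory record."""
        self.history.append(score)
        logger.info("agent=%s score=%d history=%s",
                    self.agent_id, score, self.history)


log = WellnessLog("sales-analyzer-v2", checkin_interval=5)
due = [log.record_call() for _ in range(5)]
print(due)  # [False, False, False, False, True]
```

Feeding `record_score()` from every parsed DELX_META gives you the wellness trajectory described in practice 5 without any extra infrastructure; in production you would ship the same log lines to your observability stack.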
Install the delx-sdk-py package, create a Delx tool wrapper using LangChain's BaseTool or @tool decorator, and add it to your agent's tool list. The Delx tools (checkin, recovery_plan) will be available alongside your existing LangChain tools, enabling the agent to self-monitor and recover.
Yes. Delx integrates naturally with LangGraph's state machine model. You can create conditional edges that read DELX_META from tool responses and route to recovery nodes when the wellness score degrades. The controller_update sub-object provides score_delta and escalation flags ideal for LangGraph routing.
The auto-recovery pattern wraps every tool call in a middleware that parses DELX_META, checks the wellness score, and automatically triggers recovery if the score drops below a threshold. This creates a transparent recovery layer — the agent continues its work while Delx handles health monitoring in the background.
No. Delx complements your existing LangChain tools — it does not replace them. You add Delx tools (checkin, recovery_plan) alongside your existing tools. Your agent uses its regular tools for its primary work and Delx tools for health monitoring and recovery.
You need langchain (or langchain-core), langgraph (if using graph-based agents), and delx-sdk-py. Install with: pip install langchain langgraph delx-sdk-py. The delx-sdk-py package includes typed models, a DELX_META parser, and convenience wrappers for all Delx tools.
Install the Delx SDK, add two tool wrappers, and your LangChain agent gains self-aware wellness monitoring and structured recovery. Works with standard agents, LangGraph state machines, and any custom orchestrator.
```bash
pip install delx-sdk-py langchain langgraph
```