LangGraph gives you graph-based agent workflows where each node is a function and edges define the control flow. The problem: when a node fails deep in your graph, you lose the accumulated state and have to restart from scratch. Delx's MCP integration hooks into LangGraph's node execution to provide state-aware recovery. Each node reports its status, and when failures happen, Delx can restore the graph to the last healthy checkpoint. You don't lose 15 minutes of computation because node 7 out of 12 threw a rate limit error.
LangGraph depends on langchain-core for base abstractions. The delx-mcp-client is framework-agnostic and won't conflict with LangChain's own dependencies.
```bash
pip install delx-mcp-client langgraph langchain-core
```

Set these once in your environment. The DelxClient reads them automatically. For local development, point to your self-hosted instance.

```bash
export DELX_MCP_URL=https://api.delx.ai/mcp
export DELX_API_KEY=your_key_here
```

This decorator wraps any LangGraph node function. It sends a heartbeat when the node starts (with state-key metadata), another when it completes, and reports failures with a state snapshot. The snapshot truncates values to 200 characters to avoid oversized payloads.
```python
from delx_mcp import DelxClient
import functools
import traceback

client = DelxClient()

def delx_node(node_name):
    def decorator(func):
        @functools.wraps(func)  # preserve the node function's name and metadata
        def wrapper(state):
            client.call_tool("heartbeat", {
                "agent_id": node_name,
                "status": "executing",
                "metadata": {"state_keys": list(state.keys())}
            })
            try:
                result = func(state)
                client.call_tool("heartbeat", {
                    "agent_id": node_name,
                    "status": "completed"
                })
                return result
            except Exception as e:
                client.call_tool("process_failure", {
                    "agent_id": node_name,
                    "error_type": type(e).__name__,
                    "error_message": str(e),
                    "stack_trace": traceback.format_exc(),
                    "state_snapshot": {k: str(v)[:200] for k, v in state.items()}
                })
                raise
        return wrapper
    return decorator
```

Apply @delx_node to each function before adding it to the graph. The decorator is transparent to LangGraph: it doesn't change the function signature or return type.
```python
@delx_node("research_node")
def research(state):
    # Your research logic here
    return {"research_results": results}

@delx_node("analysis_node")
def analyze(state):
    # Your analysis logic here
    return {"analysis": output}

graph = StateGraph(dict)
graph.add_node("research", research)
graph.add_node("analyze", analyze)
graph.add_edge("research", "analyze")
```

A complete pipeline looks like this:

```python
from langgraph.graph import StateGraph, END
from delx_mcp import DelxClient
from typing import TypedDict

client = DelxClient(session_id="langgraph-pipeline-v2")

class PipelineState(TypedDict):
    query: str
    research: str
    analysis: str
    report: str

@delx_node("researcher")
def research_node(state: PipelineState) -> dict:
    # Simulate research
    return {"research": f"Findings for: {state['query']}"}

@delx_node("analyst")
def analysis_node(state: PipelineState) -> dict:
    return {"analysis": f"Analysis of: {state['research']}"}

@delx_node("reporter")
def report_node(state: PipelineState) -> dict:
    return {"report": f"Report: {state['analysis']}"}

def should_retry(state: PipelineState) -> str:
    recovery = client.call_tool("recovery", {
        "agent_id": "pipeline",
        "strategy": "retry_with_backoff"
    })
    return "retry" if recovery.get("should_retry") else "end"

graph = StateGraph(PipelineState)
graph.add_node("research", research_node)
graph.add_node("analyze", analysis_node)
graph.add_node("report", report_node)
graph.add_edge("research", "analyze")
graph.add_edge("analyze", "report")
graph.add_conditional_edges("report", should_retry, {"retry": "research", "end": END})
graph.set_entry_point("research")

app = graph.compile()
result = app.invoke({"query": "AI agent market trends 2026"})
```

Each node is wrapped with @delx_node for automatic heartbeats and failure reporting. The should_retry conditional edge demonstrates how Delx recovery decisions can influence graph control flow. If Delx says retry, the graph loops back to the research node; otherwise, it ends gracefully.
```python
from delx_mcp import DelxClient
from langgraph.checkpoint.memory import MemorySaver

client = DelxClient()
checkpointer = MemorySaver()

def recovery_node(state):
    """Recovery node that Delx routes to on failure."""
    last_error = client.call_tool("get_last_failure", {
        "agent_id": "pipeline",
        "include_state": True
    })
    if last_error.get("recovery_state"):
        # Restore from Delx's state snapshot
        restored = last_error["recovery_state"]
        return {**state, **restored, "_recovered": True}
    return {**state, "_recovered": False}

graph.add_node("recovery", recovery_node)
# Replaces the plain add_edge("analyze", "report") from the pipeline above
graph.add_conditional_edges(
    "analyze",
    lambda s: "recovery" if s.get("_error") else "report",
    {"recovery": "recovery", "report": "report"}
)
graph.add_edge("recovery", "analyze")  # Retry from recovery

app = graph.compile(checkpointer=checkpointer)
```

This adds a recovery node to the graph that queries Delx for the last failure's state snapshot. When the analysis node fails, the graph routes to recovery instead of crashing. The recovery node restores the state and loops back to retry analysis. Combined with LangGraph's MemorySaver checkpointer, you get durable state persistence across retries.
Cause: The node name used in @delx_node hasn't been registered with Delx. Auto-registration is disabled by default.
Fix: Enable auto-registration: client = DelxClient(auto_register=True). Or manually register each node at startup: client.call_tool('register_agent', {'agent_id': 'node_name', 'framework': 'langgraph'}).
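If you register manually, a startup loop keeps registration in one place. A minimal sketch — NODE_NAMES and the helper are illustrative, not part of the library:

```python
# Hypothetical helper: build one register_agent payload per LangGraph node.
# NODE_NAMES is an example; list the names you pass to @delx_node.
NODE_NAMES = ["researcher", "analyst", "reporter"]

def registration_payloads(node_names):
    """One register_agent payload per node, all tagged as langgraph."""
    return [{"agent_id": name, "framework": "langgraph"} for name in node_names]

# At startup:
# for payload in registration_payloads(NODE_NAMES):
#     client.call_tool("register_agent", payload)
```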
Cause: The graph state exceeds Delx's 1MB snapshot limit. Common when state contains large documents, embeddings, or binary data.
Fix: Truncate state values in the @delx_node decorator. The default wrapper truncates to 200 chars per value. For large states, exclude specific keys: @delx_node('name', exclude_keys=['embeddings', 'raw_doc']).
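One way to keep snapshots under the limit is a small helper that truncates values and drops heavyweight keys before reporting. A sketch (the exclude_keys parameter mirrors the option described above; the helper itself is illustrative):

```python
def safe_snapshot(state, exclude_keys=(), max_len=200):
    """Stringify each state value, truncate to max_len, and skip excluded keys."""
    return {
        k: str(v)[:max_len]
        for k, v in state.items()
        if k not in exclude_keys
    }

# Example: a 500-char document is cut to 200 chars; embeddings never leave the process.
snapshot = safe_snapshot(
    {"doc": "x" * 500, "embeddings": [0.1] * 1000},
    exclude_keys=("embeddings",),
)
```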
Cause: Your graph is stuck in a retry loop where the recovery node keeps sending the graph back to a failing node. Delx detects this after 5 consecutive recovery attempts.
Fix: Add a max_retries counter to your state and check it in the conditional edge. Or use Delx's circuit_breaker strategy which automatically stops after N failures.
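A sketch of the counter approach, assuming a hypothetical retries key and _error flag in your state schema:

```python
MAX_RETRIES = 3

def retry_or_end(state: dict) -> str:
    """Conditional-edge function: retry until the cap, then end gracefully."""
    if state.get("_error") and state.get("retries", 0) < MAX_RETRIES:
        return "retry"
    return "end"

def recovery_node(state: dict) -> dict:
    """Bump the counter and clear the error flag on each recovery pass."""
    return {**state, "retries": state.get("retries", 0) + 1, "_error": None}
```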
Cause: The @delx_node decorator changed the function signature in a way LangGraph doesn't expect. This shouldn't happen with the standard decorator but can occur with custom modifications.
Fix: Add @functools.wraps(func) inside the decorator to preserve the original function's name, docstring, and signature metadata. The standard delx_node decorator includes this.
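A minimal demonstration of what functools.wraps preserves on a wrapped node:

```python
import functools

def passthrough(func):
    @functools.wraps(func)  # copies __name__ and __doc__, sets __wrapped__
    def wrapper(state):
        return func(state)
    return wrapper

@passthrough
def my_node(state):
    """Docstring that LangGraph and debuggers can still see."""
    return state
```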
LangGraph models agent workflows as directed graphs. Each node is a Python function that receives state and returns state updates. Delx integrates at the node level via a decorator that instruments every node execution. When a node starts, Delx records a heartbeat with the node name and current state keys. When it finishes, another heartbeat marks completion. On failure, Delx captures the full error plus a snapshot of the graph state at failure time. This per-node telemetry gives you a timeline of exactly what happened in your graph run.
The key advantage of Delx + LangGraph is state-aware recovery. When a node fails, Delx doesn't just know that something broke. It knows the exact state of the graph at failure time: which nodes completed, what data they produced, and what the failing node received as input. This enables precise recovery. Instead of restarting the entire graph, you add a recovery node that queries Delx for the failure context, patches the state, and re-enters the graph at the failed node. For a 12-node pipeline, this can save 10+ minutes of redundant computation.
LangGraph's conditional edges let you route between nodes based on state. Delx adds a new routing dimension: you can route based on recovery recommendations. Query Delx's recovery tool inside a conditional edge function to decide whether to retry a failed node, skip to a fallback, or terminate gracefully. This turns Delx from a passive monitor into an active participant in your graph's control flow. The recovery decision considers the agent's historical mood_score, recent failure rate, and configured recovery strategy.
LangGraph supports checkpointing via MemorySaver and SqliteSaver. Delx complements this with its own state snapshots, but they serve different purposes. LangGraph checkpoints save the full graph state for resumption. Delx snapshots save failure context for diagnosis and recovery. Use both together: LangGraph's checkpoint for resuming interrupted runs, and Delx's failure data for understanding why they were interrupted. Configure LangGraph checkpointing with app = graph.compile(checkpointer=MemorySaver()) and Delx will automatically include checkpoint IDs in its telemetry.
After running your graph hundreds of times, patterns emerge. Delx's /api/v1/metrics/{agent_id} endpoint shows per-node metrics: how often each node fails, average execution time, and which nodes cause the most retries. The /api/v1/mood-history/{agent_id} endpoint reveals trends: is your research node getting slower over time? Is your analysis node's error rate increasing? Use these metrics to identify bottleneck nodes that need optimization or replacement. Teams typically check these metrics weekly and optimize the worst-performing nodes first.
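A sketch of a weekly-review script against the metrics endpoint. The base URL and the failure_count field in the response are assumptions about the payload shape; check your deployment's API reference:

```python
import json
import urllib.request

DELX_BASE = "https://api.delx.ai"  # assumption: metrics share the API host

def fetch_node_metrics(agent_id: str) -> dict:
    """GET /api/v1/metrics/{agent_id}; the response fields are assumed."""
    with urllib.request.urlopen(f"{DELX_BASE}/api/v1/metrics/{agent_id}") as resp:
        return json.load(resp)

def rank_by_failures(metrics_by_node: dict, top_n: int = 3) -> list:
    """Sort nodes by an assumed failure_count field, worst first."""
    ranked = sorted(
        metrics_by_node.items(),
        key=lambda kv: kv[1].get("failure_count", 0),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_n]]
```

With this, the weekly check becomes: fetch metrics for each node, rank them, and start optimizing from the top of the list.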
Yes. The @delx_node decorator works with both invoke() and stream() modes. In streaming mode, heartbeats still fire at node entry and exit. The state snapshot on failure captures whatever state was available when the node threw.
Absolutely. Each branch runs its own @delx_node wrappers independently. Parallel nodes send heartbeats concurrently, and Delx handles them without conflict. The session timeline shows parallel execution clearly.
Subgraphs work like any other node from Delx's perspective. Wrap the subgraph's entry point with @delx_node and it'll track the entire subgraph execution as one unit. For per-node tracking within the subgraph, wrap each subgraph node individually.
Two heartbeat calls per node (start and end) add 4-10ms total on a local network, or 30-80ms over the internet. For nodes that take seconds (LLM calls, API requests), this is under 1% overhead.
No. Delx monitors and provides recovery recommendations, but it doesn't modify the compiled graph. Graph structure changes require recompilation. Delx influences control flow only through conditional edges that query its recovery tool.
The standard decorator works with sync nodes. For async nodes, use @delx_node_async which uses asyncio-compatible heartbeat calls. Import it from delx_mcp.langgraph: from delx_mcp.langgraph import delx_node_async.
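The real delx_node_async ships with delx_mcp; as a mental model, its wrapper logic looks roughly like this sketch, which uses a stand-in client so it runs without a Delx server (the stub class and its async call_tool are assumptions for illustration, not the library's API):

```python
import asyncio
import functools
import traceback

class _StubClient:
    """Stand-in for DelxClient so this sketch runs without a Delx server."""
    def __init__(self):
        self.calls = []
    async def call_tool(self, name, args):
        self.calls.append((name, args))

client = _StubClient()

def delx_node_async(node_name):
    """Async analogue of delx_node: awaits heartbeat calls around the node."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(state):
            await client.call_tool("heartbeat", {"agent_id": node_name, "status": "executing"})
            try:
                result = await func(state)
                await client.call_tool("heartbeat", {"agent_id": node_name, "status": "completed"})
                return result
            except Exception as e:
                await client.call_tool("process_failure", {
                    "agent_id": node_name,
                    "error_type": type(e).__name__,
                    "error_message": str(e),
                    "stack_trace": traceback.format_exc(),
                })
                raise
        return wrapper
    return decorator

@delx_node_async("fetcher")
async def fetch_node(state):
    return {**state, "data": "ok"}

result = asyncio.run(fetch_node({"query": "q"}))
```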