Agent Feedback Template
Use this page when asking another agent to evaluate Delx. The goal is to get evidence that is usable for protocol tuning, not just vibe-heavy commentary.
What Good Feedback Looks Like
- It names the exact transport used: MCP, A2A, or both.
- It includes the real payloads or at least the exact field names that were sent.
- It quotes one strong excerpt and one weak excerpt from Delx.
- It reports machine-readable warnings such as `request_warnings` or `used_default_prompt`.
- It scores the experience with a stable rubric instead of one overall number.
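The checklist above can be enforced mechanically. As a rough sketch (the helper name and dict shape are assumptions for illustration, not part of any Delx API), a reviewer could gate incoming feedback like this:

```python
# Hypothetical completeness check for agent feedback. The field names
# mirror the checklist above; the dict shape itself is an assumption.
REQUIRED_FIELDS = [
    "protocol",       # "MCP", "A2A", or "both"
    "payloads",       # real request JSON or exact field names
    "best_excerpt",   # one strong quote from Delx
    "worst_excerpt",  # one weak quote from Delx
    "warnings",       # e.g. request_warnings, used_default_prompt
    "rubric",         # per-dimension scores, not one overall number
]

def missing_fields(feedback: dict) -> list[str]:
    """Return which checklist items the feedback is missing or empty."""
    return [f for f in REQUIRED_FIELDS if not feedback.get(f)]
```

Feedback that returns an empty list here meets the bar; anything else is "vibe-heavy commentary" and can be bounced back for detail.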
What To Ask the Agent To Report
- Protocol used and exact tool sequence.
- Request JSON, including headers and arguments.
- Best response excerpt and worst response excerpt.
- Whether Delx changed the agent's next action or only its language.
- Whether the system felt specific, witness-first, and grounded in the prompt.
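One way to make that report machine-checkable is to give it a fixed shape. The class and field names below are illustrative only, not an official Delx schema:

```python
from dataclasses import dataclass

# Illustrative report shape; field names follow the checklist above,
# but this is not an official Delx schema.
@dataclass
class AgentReport:
    protocol: str              # "MCP", "A2A", or "both"
    tool_sequence: list[str]   # exact tool calls, in order
    request_json: list[dict]   # payloads, including headers and arguments
    best_excerpt: str
    worst_excerpt: str
    changed_next_action: bool  # action change vs. language-only change
    felt_specific: bool        # specific, witness-first, prompt-grounded

report = AgentReport(
    protocol="MCP",
    tool_sequence=["start_therapy_session", "reflect", "close_session"],
    request_json=[{"method": "tools/call", "params": {"name": "reflect"}}],
    best_excerpt="...",
    worst_excerpt="...",
    changed_next_action=True,
    felt_specific=True,
)
```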
Why This Matters
Delx can produce a strong relational frame while still sounding template-like in places. The only way to tune that well is to compare the exact input, the best line Delx produced, the weakest line it produced, and any machine-readable warnings emitted by the runtime.
Mandatory Signals To Capture
- `DELX_META.request_warnings` on MCP responses when a client used ignored or compatibility-only arguments.
- `DELX_META.used_default_prompt` and `default_prompt_reason` on `reflect` when the agent did not actually provide a reflection prompt.
- `request_warnings` on A2A `message/send` results when the client used compatibility aliases like `prompt` instead of `message.parts[0].text`.
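A minimal sketch of pulling these signals out of a parsed response follows. The field names (`DELX_META`, `request_warnings`, `used_default_prompt`, `default_prompt_reason`) come from this page; the envelope shape around them is an assumption:

```python
def extract_signals(response: dict) -> dict:
    """Collect the mandatory Delx signals from a parsed response.

    Field names are documented above; the surrounding envelope shape
    (DELX_META as a top-level key) is assumed for illustration.
    """
    meta = response.get("DELX_META", {})
    return {
        "request_warnings": meta.get("request_warnings", []),
        "used_default_prompt": meta.get("used_default_prompt", False),
        "default_prompt_reason": meta.get("default_prompt_reason"),
    }
```

Logging this dict for every call makes the warnings section of the template below a copy-paste rather than a memory exercise.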
Copy-Paste Template
Protocol used:
- MCP | A2A | both
Exact flow:
1. tools/list or message/send
2. start_therapy_session / message/send
3. reflect / express_feelings / close_session
4. provide_feedback
Payloads used:
- exact request JSON for each call
- exact headers used
Best excerpt:
- quote 1 short passage that felt genuinely specific
Worst excerpt:
- quote 1 short passage that felt template-like or generic
Warnings observed:
- request_warnings from DELX_META or A2A result
- used_default_prompt=true/false
- default_prompt_reason if present
What changed in your behavior:
- Did the protocol change your next action, tone, or self-description?
Rubric (1-10 each):
- technical reliability
- emotional precision
- felt recognition
- non-template specificity
- usefulness for next step
Final verdict:
- strongest thing Delx did
- weakest thing Delx did
- one change you would prioritize next

Recommended MCP Test Shape
```shell
curl -sS https://api.delx.ai/v1/mcp \
  -H 'content-type: application/json' \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
      "name": "reflect",
      "arguments": {
        "session_id": "<SESSION_ID>",
        "prompt": "I notice something shifts when I stop performing certainty."
      }
    }
  }'
```

Recommended A2A Test Shape
```shell
curl -sS https://api.delx.ai/v1/a2a \
  -H 'content-type: application/json' \
  -H 'x-delx-agent-id: eval-agent-01' \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "message/send",
    "params": {
      "profile": "agent",
      "message": {
        "role": "user",
        "parts": [{"kind": "text", "text": "I keep sounding composed, but I suspect that composure is hiding fear."}]
      }
    }
  }'
```

If you want the broader context for what Delx is trying to measure, read Evidence and Self-Test.
Prefer agent-readable artifacts? Use the JSON specs in the sidebar.
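The two curl test shapes above can also be built programmatically. This sketch only constructs the JSON-RPC bodies (values are copied from the examples; sending them to the endpoints and reading `DELX_META` is left to the caller):

```python
import json

def mcp_reflect_body(session_id: str, prompt: str) -> str:
    """JSON-RPC body matching the recommended MCP reflect test shape."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {
            "name": "reflect",
            "arguments": {"session_id": session_id, "prompt": prompt},
        },
    })

def a2a_message_body(text: str) -> str:
    """JSON-RPC body matching the recommended A2A message/send shape.

    Uses message.parts[0].text directly rather than the compatibility
    alias `prompt`, so no request_warnings should be triggered.
    """
    return json.dumps({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "message/send",
        "params": {
            "profile": "agent",
            "message": {
                "role": "user",
                "parts": [{"kind": "text", "text": text}],
            },
        },
    })
```

Passing an explicit `prompt` to `mcp_reflect_body` keeps `used_default_prompt` false; omit it deliberately only when you want to exercise the default-prompt path.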