Errors an agent can fix, operations an agent can't break

An agent's retry loop is only as good as the error it reads. A human sees 400 Bad Request and opens the docs. An agent sees it and guesses.

Most API errors are written for people who can go read something else: a status code, a sentence, maybe a link. Agents don't browse. Whatever the error body carries is the entire repair budget for the next attempt. If it says "validation failed," the agent mutates its input at random and tries again. Three retries later you've paid for four LLM calls and the request still doesn't parse.

So AgentBack makes every failure carry its own repair manual. One envelope, on both surfaces: REST responses and MCP tool errors serialize the same shape, so one parser self-corrects against either.

{
  "error": {
    "statusCode": 422,
    "code": "invalid_body",
    "message": "The request body is invalid.",
    "issues": [
      {"path": ["amount"], "code": "too_small",
       "message": "Number must be greater than 0"}
    ],
    "schema": {
      "type": "object",
      "properties": {"amount": {"type": "number", "exclusiveMinimum": 0}},
      "required": ["amount"]
    },
    "retryable": true,
    "hint": "Fix the listed issues (each has a path and expected type) and retry; the violated section's JSON Schema is included as 'schema'."
  }
}

Each field has one job. code is stable and machine-readable; the contract is that nobody ever parses message. issues point at exact fields with what was expected and what arrived. schema inlines the violated section's JSON Schema, so the agent can re-shape its input without a second round-trip to /openapi.json. retryable answers the only question a retry loop actually has: can the same operation succeed with corrected input? Validation failures say yes. Auth failures say no, because retrying with the same credentials is a waste of everyone's tokens.
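To make the division of labor concrete, here is a minimal sketch of the client side, assuming the envelope shape shown above. The helper name planNextStep and the NextStep type are hypothetical, not part of AgentBack; the point is only that the loop branches on retryable and code and never touches message.

```typescript
// Envelope fields as shown in the example above.
interface Issue {
  path: (string | number)[];
  code: string;
  message: string;
}

interface ErrorEnvelope {
  error: {
    statusCode: number;
    code: string;
    message: string;
    issues?: Issue[];
    schema?: unknown;
    retryable: boolean;
    hint?: string;
  };
}

// Hypothetical result type for this sketch.
type NextStep =
  | { action: "give_up"; reason: string }
  | { action: "repair"; fields: string[]; schema: unknown };

// Hypothetical helper: decide the next move from the error body alone.
function planNextStep(envelope: ErrorEnvelope): NextStep {
  const { code, retryable, issues = [], schema } = envelope.error;

  // retryable: false means corrected input cannot help (for example, auth failures).
  if (!retryable) {
    return { action: "give_up", reason: code };
  }

  // Each issue names the exact field to fix; the inlined schema says what shape it needs.
  const fields = issues.map((issue) => issue.path.join("."));
  return { action: "repair", fields, schema };
}
```

An agent would feed fields and schema back into its next generation step instead of mutating the input blindly.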

The operations that must not run twice

Self-correcting errors solve the failure path. The scarier path is success: an agent that can call delete_environment will eventually call it with conviction and the wrong argument. Two decorator options handle this.

The confirm option makes an operation a two-step handshake. The first call is refused with a single-use token; the identical retry with that token executes.

@post('/deploy', {body: DeployIn, response: DeployOut, confirm: true})
async deploy(input: {body: z.infer<typeof DeployIn>}) { … }

// 1st call            -> 409 {code: "confirmation_required",
//                            confirmationToken: "…", retryable: true}
// retry + token header -> executes

The detail that matters: the token is bound to a fingerprint of the exact payload, not just to the route. Confirm {target: "staging"} and then send {target: "prod-db"} with the same token, and the request is refused. An agent cannot confirm one mutation and execute another, which is precisely the failure mode you imagine at 2am. Tokens are single-use and expire after five minutes. MCP tools get the same flow through @tool(..., {confirm: true}), with the token riding in a confirmationToken input property that is advertised in the tool's schema, so the agent learns the protocol from tools/list rather than from its first failure.
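The payload binding can be sketched as a small in-memory store. This is an illustration under stated assumptions, not AgentBack's implementation: the class and method names are hypothetical, SHA-256 over the route plus the raw body is one plausible fingerprint, and burning the token on a mismatched attempt is a design choice the text above does not specify.

```typescript
import { createHash, randomUUID } from "node:crypto";

const TTL_MS = 5 * 60 * 1000; // tokens expire after five minutes

interface PendingConfirmation {
  fingerprint: string;
  expiresAt: number;
}

// Hypothetical store; a real deployment would use a shared implementation.
class InMemoryConfirmationStore {
  private pending = new Map<string, PendingConfirmation>();

  // Assumption: fingerprint = SHA-256 of the route and the byte-exact body.
  private fingerprint(route: string, rawBody: string): string {
    return createHash("sha256").update(route).update("\n").update(rawBody).digest("hex");
  }

  // First call: remember what was proposed and hand back a token.
  issue(route: string, rawBody: string, now: number = Date.now()): string {
    const token = randomUUID();
    this.pending.set(token, {
      fingerprint: this.fingerprint(route, rawBody),
      expiresAt: now + TTL_MS,
    });
    return token;
  }

  // Retry: succeeds only for the same route and payload, once, before expiry.
  redeem(token: string, route: string, rawBody: string, now: number = Date.now()): boolean {
    const entry = this.pending.get(token);
    if (!entry) return false;
    this.pending.delete(token); // single-use (assumption: consumed even on mismatch)
    return now < entry.expiresAt && entry.fingerprint === this.fingerprint(route, rawBody);
  }
}
```

Because the comparison is on the fingerprint rather than the token alone, a token issued for {target: "staging"} is useless for {target: "prod-db"}.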

The idempotency option covers the duplicate problem. Agents retry. Networks fail after the server commits. With the option set, a route honors the idempotency-key header: replaying a key returns the original result without re-executing, concurrent duplicates share one execution, and errors are never cached, so a retry after a failure genuinely retries.

@post('/charge', {body: ChargeIn, response: ChargeOut, idempotency: true})
async charge(input: {body: z.infer<typeof ChargeIn>}) { … }
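The three guarantees (replay, shared execution, no cached errors) fall out naturally if the store keeps the in-flight promise per key. The following is a rough sketch of that idea, assuming an in-memory map; InMemoryIdempotencyStore and run are hypothetical names, and nothing here claims to match AgentBack's internals.

```typescript
// Hypothetical in-memory store keyed by the idempotency-key header value.
class InMemoryIdempotencyStore<T> {
  private results = new Map<string, Promise<T>>();

  run(key: string, execute: () => Promise<T>): Promise<T> {
    // Replay or concurrent duplicate: reuse the stored promise, do not re-execute.
    const existing = this.results.get(key);
    if (existing) return existing;

    const promise = execute();
    this.results.set(key, promise);

    // Errors are never cached: drop the entry so the next attempt really runs.
    promise.catch(() => {
      if (this.results.get(key) === promise) this.results.delete(key);
    });

    return promise;
  }
}
```

Storing the promise rather than the resolved value is what lets two simultaneous requests with the same key wait on a single execution.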

Ordering is part of the contract

The confirmation gate runs after authentication and authorization, so an unauthorized caller never learns the operation exists, and before input validation, so what gets confirmed is the byte-exact proposal. Both stores, the one for confirmation tokens and the one for idempotency results, are DI bindings with in-memory defaults; multi-instance deployments swap in a shared implementation without touching route code. Everything above is also documented in the emitted OpenAPI document: the 409 flow, the headers, the x-confirmation-required marker. The contract an agent discovers is the contract the dispatcher enforces.
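The ordering can be read as a chain of early returns. The sketch below is a toy model, not the real dispatcher: GateRequest, GateResult, and dispatchInOrder are hypothetical, the 401 status and "unauthorized" code are assumptions, and only the 409 and 422 responses come from the examples above.

```typescript
// Hypothetical, already-parsed view of a request for this sketch.
interface GateRequest {
  authorized: boolean; // result of authentication + authorization
  confirmed: boolean;  // a valid confirmation token was presented
  body: { amount?: unknown };
}

interface GateResult {
  status: number;
  code: string;
}

function dispatchInOrder(req: GateRequest): GateResult {
  // 1. Auth first: unauthorized callers never reach the confirmation gate.
  if (!req.authorized) return { status: 401, code: "unauthorized" }; // assumed status/code

  // 2. Confirmation before validation: the raw proposal is what gets confirmed.
  if (!req.confirmed) return { status: 409, code: "confirmation_required" };

  // 3. Validation last, on the confirmed payload.
  const amount = req.body.amount;
  if (typeof amount !== "number" || amount <= 0) {
    return { status: 422, code: "invalid_body" };
  }

  return { status: 200, code: "ok" };
}
```

One consequence of this order: an unconfirmed request with an invalid body gets the 409 first, and only the confirmed retry surfaces the 422.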