Testing

How to test an AgentBack application: the harness, the four client surfaces it hands you, and the conventions the workspace itself follows.

The one rule: tests run against dist/

vitest.config.ts globs packages/*/dist/__tests__/**/*.{test,spec,unit,integration,acceptance}.js, so Vitest runs only compiled JavaScript, never the .ts sources. The loop is: edit a .ts file → pnpm build (or keep pnpm build:watch running) → pnpm test. If a change "isn't being picked up," the usual cause is a stale dist/.

Naming conventions: *.unit.ts for tests of one module in isolation and *.integration.ts for tests that boot servers, placed under src/__tests__/unit/ and src/__tests__/integration/ respectively.
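As a rough illustration of which files the runner actually sees, the include glob can be approximated with a regular expression. This is a sketch, not how Vitest matches files internally. The package name orders and the file names are made up, and it assumes the build mirrors src/__tests__/ into dist/__tests__/.

```typescript
// Approximate regex translation of the include glob
//   packages/*/dist/__tests__/**/*.{test,spec,unit,integration,acceptance}.js
// "*" is one path segment, "**/" is zero or more directories.
const DIST_TEST_FILE =
  /^packages\/[^\/]+\/dist\/__tests__\/(?:[^\/]+\/)*[^\/]+\.(?:test|spec|unit|integration|acceptance)\.js$/;

// Returns true when a workspace-relative path would be matched by the glob.
function isPickedUp(path: string): boolean {
  return DIST_TEST_FILE.test(path);
}

// Hypothetical paths, for illustration only.
const compiledUnit = 'packages/orders/dist/__tests__/unit/price.unit.js';
const sourceUnit = 'packages/orders/src/__tests__/unit/price.unit.ts';
const compiledHelper = 'packages/orders/dist/__tests__/unit/helpers.js';

const report = {
  compiledUnit: isPickedUp(compiledUnit),     // compiled test under dist/
  sourceUnit: isPickedUp(sourceUnit),         // .ts source, outside dist/
  compiledHelper: isPickedUp(compiledHelper), // no recognized test suffix
};
```

Only the compiled file under dist/ matches, which is why an edited .ts test has no effect until the next build.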

createTestApp — boot once, get every surface

@agentback/testing boots your real application class with test-safe overrides: an ephemeral REST port, MCP stdio disabled, and your bindings swapped where you need fakes.

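import {expect, it} from 'vitest';
import {createTestApp} from '@agentback/testing';
import {getOrder} from 'my-service/routes'; // a defineRoute/routeGroup handle
// MyApplication, DB_KEY, and fakeDb come from your own service code.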

it('serves an order end to end', async () => {
  await using t = await createTestApp(MyApplication, {
    overrides: {[DB_KEY]: fakeDb}, // rebinding by key wins
  });

  const order = await t.call(getOrder, {path: {id: '42'}});
  expect(order.status).toBe('shipped'); // typed: z.infer of the response schema
});

await using (explicit resource management) stops the app when the block exits — no afterEach bookkeeping. On runtimes without await using, call t.stop() in a finally.
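On a runtime without await using, the try/finally pattern can be wrapped once and reused. The helper below is hypothetical (it is not part of @agentback/testing) and assumes only that the booted object exposes a stop() method that may return a promise.

```typescript
// Minimal shape this sketch relies on: anything with a stop() method.
interface Stoppable {
  stop(): Promise<void> | void;
}

// Hypothetical helper: boot a resource, run the test body, and always stop
// the resource afterwards, whether the body resolves or throws.
async function withStoppable<T extends Stoppable, R>(
  boot: () => Promise<T>,
  body: (resource: T) => Promise<R>,
): Promise<R> {
  const resource = await boot();
  try {
    return await body(resource);
  } finally {
    await resource.stop();
  }
}

// Usage, reusing the names from the example above:
//   await withStoppable(
//     () => createTestApp(MyApplication, {overrides: {[DB_KEY]: fakeDb}}),
//     async t => {
//       const order = await t.call(getOrder, {path: {id: '42'}});
//       expect(order.status).toBe('shipped');
//     },
//   );
```

Because stop() runs in finally, a failing assertion still shuts the app down before the error reaches the test runner.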

The returned TestApp carries four client surfaces plus the application itself (t.app); pick the lowest one that can express the assertion:

| Surface | What it is | Use for |
| --- | --- | --- |
| t.call | typed route-handle execution (schema-shared client) | most behavior tests; input and output types come from z.infer |
| t.client | a @agentback/client Client at the test URL | safeCall, custom handles, error-result shapes |
| t.http | raw supertest | status codes, headers, malformed-input cases |
| t.mcp | in-memory MCP SDK client | tool/resource/prompt behavior, visibility, envelopes |
| t.app | the application (a Context) | DI assertions: t.app.getSync(KEY) |

Examples of the non-typed surfaces:

// Wire-level: assert the agent error envelope on a validation failure.
const r = await t.http.post('/orders').send({}).expect(422);
expect(r.body.error.code).toBe('invalid_body');

// MCP: same process, no transport, real dispatch pipeline.
const result = await t.mcp.callTool({name: 'get_order', arguments: {id: '42'}});
expect(result.isError).toBeFalsy();

Testing the policy layer

mcpScopes builds the in-memory MCP session exactly like an authenticated HTTP session, so scope-gated visibility is testable without standing up OAuth:

await using t = await createTestApp(MyApp, {mcpScopes: ['orders:read']});
const {tools} = await t.mcp.listTools();
expect(tools.map(x => x.name)).not.toContain('refund_order'); // needs orders:write

For REST auth, drive the real strategies through t.http with real headers — the test app runs the same authenticate → authorize → validate pipeline as production.

Overriding configuration

The configurations option is merged, per binding key, over whatever the app itself configured:

await using t = await createTestApp(MyApplication, {
  configurations: {
    'servers.RestServer': {basePath: '/api'},
    'servers.MCPServer': {name: 'test-server'},
  },
});

(The harness always forces port: 0 and transports: {stdio: false} on top — tests must not grab fixed ports or hijack stdio.)
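The layering described above can be modeled with plain objects. This is only a mental model, not the harness's implementation: it assumes a shallow merge per binding key, it assumes port is forced on the REST server key and transports on the MCP server key, and the app-configured values are invented for the example.

```typescript
type Config = Record<string, unknown>;
type ConfigMap = Record<string, Config>;

// Merge layers left to right; for each binding key, later layers win
// property by property (shallow merge, an assumption of this sketch).
function layerConfigs(...layers: ConfigMap[]): ConfigMap {
  const out: ConfigMap = {};
  for (const layer of layers) {
    for (const key of Object.keys(layer)) {
      out[key] = {...(out[key] || {}), ...layer[key]};
    }
  }
  return out;
}

// 1. What the application configured (hypothetical values).
const appConfigured: ConfigMap = {
  'servers.RestServer': {port: 3000, basePath: '/'},
  'servers.MCPServer': {name: 'orders', transports: {stdio: true}},
};

// 2. The configurations option passed to createTestApp.
const testConfigurations: ConfigMap = {
  'servers.RestServer': {basePath: '/api'},
  'servers.MCPServer': {name: 'test-server'},
};

// 3. What the harness forces on top, per the note above.
const harnessForced: ConfigMap = {
  'servers.RestServer': {port: 0},
  'servers.MCPServer': {transports: {stdio: false}},
};

const effective = layerConfigs(appConfigured, testConfigurations, harnessForced);
```

Under these assumptions the REST server ends up with the test's basePath and an ephemeral port, and the MCP server ends up with the test's name and stdio disabled, whatever the app or the test asked for.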

What to test at which level

One startup behavior worth relying on instead of testing: URL placeholders are cross-checked against path: schemas at app.start() — so a single "the app boots" integration test catches every route/schema mismatch in the codebase at once.
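To make the idea of that cross-check concrete, here is a minimal sketch of comparing URL placeholders with the keys of a path schema. It is not AgentBack's implementation: the function names are hypothetical, the schema is reduced to a list of key names, and because the placeholder syntax is not stated here the sketch accepts both :id and {id} forms.

```typescript
// Collect placeholder names from a route URL. Accepts ":id" and "{id}".
function placeholdersOf(url: string): string[] {
  const names: string[] = [];
  const pattern = /:([A-Za-z_]\w*)|\{([A-Za-z_]\w*)\}/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(url)) !== null) {
    names.push(match[1] || match[2]);
  }
  return names;
}

// Report placeholders with no schema key, and schema keys with no placeholder.
function placeholderMismatches(url: string, pathSchemaKeys: string[]): string[] {
  const inUrl = new Set(placeholdersOf(url));
  const inSchema = new Set(pathSchemaKeys);
  const problems: string[] = [];
  inUrl.forEach(name => {
    if (!inSchema.has(name)) problems.push(`URL placeholder "${name}" has no path schema key`);
  });
  inSchema.forEach(name => {
    if (!inUrl.has(name)) problems.push(`path schema key "${name}" has no URL placeholder`);
  });
  return problems;
}

// Hypothetical routes: the first is consistent, the second is not.
const consistent = placeholderMismatches('/orders/:id', ['id']);
const inconsistent = placeholderMismatches('/orders/{id}/items/{itemId}', ['id']);
```

A check of this kind running at startup is what lets one boot test stand in for a per-route test of the same property.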

Testing time, randomness, and queues