Testing
How to test an AgentBack application: the harness, the four client surfaces it hands you, and the conventions the workspace itself follows.
The one rule: tests run against dist/
vitest.config.ts globs
packages/*/dist/__tests__/**/*.{test,spec,unit,integration,acceptance}.js.
Edit a .ts file → pnpm build (or keep pnpm build:watch running) →
pnpm test. If a change "isn't being picked up," this is why.
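For orientation, a config carrying that glob might look like the sketch below. Only the include pattern comes from this page; everything else is assumed, and the workspace's actual vitest.config.ts may set further options.

```ts
// vitest.config.ts (sketch; only the include glob is taken from the text above)
import {defineConfig} from 'vitest/config';

export default defineConfig({
  test: {
    // Compiled JavaScript only: .ts sources under src/ never match,
    // which is why an unbuilt edit is invisible to the test run.
    include: [
      'packages/*/dist/__tests__/**/*.{test,spec,unit,integration,acceptance}.js',
    ],
  },
});
```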
Naming conventions: *.unit.ts for tests of one module in isolation and
*.integration.ts for tests that boot servers, kept under
src/__tests__/unit/ and src/__tests__/integration/ respectively.
createTestApp — boot once, get every surface
@agentback/testing boots your real application class with test-safe
overrides: an ephemeral REST port, MCP stdio disabled, and your bindings
swapped where you need fakes.
import {createTestApp} from '@agentback/testing';
import {getOrder} from 'my-service/routes'; // a defineRoute/routeGroup handle
it('serves an order end to end', async () => {
  await using t = await createTestApp(MyApplication, {
    overrides: {[DB_KEY]: fakeDb}, // rebinding by key wins
  });
  const order = await t.call(getOrder, {path: {id: '42'}});
  expect(order.status).toBe('shipped'); // typed: z.infer of the response schema
});
await using (explicit resource management) stops the app when the block
exits — no afterEach bookkeeping. On runtimes without await using, call
t.stop() in a finally.
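The try/finally fallback can be wrapped once and reused. The helper below is a hypothetical sketch, not part of @agentback/testing; it assumes only that the object you boot exposes the stop() method mentioned above.

```ts
// Minimal shape needed for cleanup; the real TestApp has more members.
interface Stoppable {
  stop(): Promise<void>;
}

// Hypothetical helper: boots, runs the test body, and always stops the
// app afterwards, even when the body throws.
async function withTestApp<T extends Stoppable, R>(
  boot: () => Promise<T>,
  body: (t: T) => Promise<R>,
): Promise<R> {
  const t = await boot();
  try {
    return await body(t);
  } finally {
    await t.stop();
  }
}
```

In a test this would be called as withTestApp(() => createTestApp(MyApplication), async t => { ... }), with the assertions inside the callback.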
The returned TestApp carries four client surfaces plus the application
itself; pick the lowest one that can express the assertion:
| Surface | What it is | Use for |
|---|---|---|
| t.call | typed route-handle execution (schema-shared client) | most behavior tests; input and output are typed via z.infer |
| t.client | a @agentback/client Client at the test URL | safeCall, custom handles, error-result shapes |
| t.http | raw supertest | status codes, headers, malformed-input cases |
| t.mcp | in-memory MCP SDK client | tool/resource/prompt behavior, visibility, envelopes |
| t.app | the application (a Context) | DI assertions: t.app.getSync(KEY) |
Examples of the non-typed surfaces:
// Wire-level: assert the agent error envelope on a validation failure.
const r = await t.http.post('/orders').send({}).expect(422);
expect(r.body.error.code).toBe('invalid_body');
// MCP: same process, no transport, real dispatch pipeline.
const result = await t.mcp.callTool({name: 'get_order', arguments: {id: '42'}});
expect(result.isError).toBeFalsy();
Testing the policy layer
mcpScopes builds the in-memory MCP session exactly like an authenticated
HTTP session, so scope-gated visibility is testable without standing up
OAuth:
await using t = await createTestApp(MyApp, {mcpScopes: ['orders:read']});
const {tools} = await t.mcp.listTools();
expect(tools.map(x => x.name)).not.toContain('refund_order'); // needs orders:write
For REST auth, drive the real strategies through t.http with real
headers — the test app runs the same authenticate → authorize → validate
pipeline as production.
Overriding configuration
configurations merges over whatever the app configured per binding key:
await using t = await createTestApp(MyApplication, {
  configurations: {
    'servers.RestServer': {basePath: '/api'},
    'servers.MCPServer': {name: 'test-server'},
  },
});
(The harness always forces port: 0 and transports: {stdio: false} on top
— tests must not grab fixed ports or hijack stdio.)
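The precedence described here (app configuration, then the test's configurations, then the forced test-safe values) can be pictured as layered object spreads. This is an illustration of the ordering only, not the harness's code: the function name, the app's starting values, the binding key each forced value lands on, and the one-level-deep merge are all assumptions.

```ts
type Config = Record<string, unknown>;
type ConfigMap = Record<string, Config>;

// Hypothetical illustration of precedence; later layers win per binding key:
// app config < test `configurations` < forced test-safe values.
function layerConfig(app: ConfigMap, test: ConfigMap, forced: ConfigMap): ConfigMap {
  const out: ConfigMap = {};
  for (const key of Object.keys({...app, ...test, ...forced})) {
    // Spreading a missing layer (undefined) contributes nothing.
    out[key] = {...app[key], ...test[key], ...forced[key]};
  }
  return out;
}

const effective = layerConfig(
  // What the app configured (made-up values).
  {'servers.RestServer': {port: 3000, basePath: '/'}},
  // What the test passed as `configurations`.
  {'servers.RestServer': {basePath: '/api'}, 'servers.MCPServer': {name: 'test-server'}},
  // What the harness forces on top (assumed placement per key).
  {'servers.RestServer': {port: 0}, 'servers.MCPServer': {transports: {stdio: false}}},
);
// effective['servers.RestServer'] is {port: 0, basePath: '/api'}
```

The point of the ordering is that a test can reshape the server but cannot opt back into a fixed port or stdio.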
What to test at which level
- Unit: pure logic, decorators' metadata, a hook's behavior with a fake info object. No app boot. Fast enough to run on every save.
- Integration (createTestApp): the contract. Routes validate and serialize as declared, tools appear/disappear by scope, error envelopes carry the right code. This is where boundary coherence pays off: shape mistakes are already impossible by compile time or startup, so these tests assert behavior, not bookkeeping.
- Don't test the framework: re-asserting that Zod validates or that OpenAPI emits is the workspace's job (2,000+ tests here). Your tests own your handlers' behavior.
One startup behavior worth relying on instead of testing: URL placeholders
are cross-checked against path: schemas at app.start() — so a single
"the app boots" integration test catches every route/schema mismatch in the
codebase at once.
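To make "mismatch" concrete, the sketch below shows the kind of comparison such a cross-check performs. It is not the framework's implementation: the function name is hypothetical, and the {name} placeholder syntax (one placeholder per whole path segment) is an assumption, since this page does not show the URL template format.

```ts
// Hypothetical illustration: compare placeholders in a URL template with
// the keys declared by a route's path schema, in both directions.
function pathMismatches(url: string, schemaKeys: string[]): string[] {
  const placeholders = url
    .split('/')
    .filter(seg => seg.startsWith('{') && seg.endsWith('}'))
    .map(seg => seg.slice(1, -1));
  const missingInSchema = placeholders.filter(p => !schemaKeys.includes(p));
  const missingInUrl = schemaKeys.filter(k => !placeholders.includes(k));
  return [
    ...missingInSchema.map(p => `placeholder {${p}} has no path schema key`),
    ...missingInUrl.map(k => `path schema key "${k}" has no placeholder`),
  ];
}

const ok = pathMismatches('/orders/{id}', ['id']); // no problems reported
const bad = pathMismatches('/orders/{id}', ['orderId']); // one problem per direction
```

Because the real check runs at startup, your own suite never needs a helper like this; the boot test exercises it for every route.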
Testing time, randomness, and queues
- Stores and meters take injectable clocks/id generators (MeterOptions.now/genId); bind deterministic ones rather than sleeping.
- The in-memory messaging adapter (@agentback/messaging) runs jobs and events in-process; integration tests can await a job's completion without Redis. The BullMQ adapter has its own conformance suite that runs only when REDIS_URL is present. Follow that pattern for tests needing external services: skip, don't mock the world.
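A deterministic clock and id generator can be very small. The sketch below assumes now returns epoch milliseconds and genId returns a string; this page names MeterOptions.now/genId but not their signatures, and the record function is a made-up stand-in for a store or meter.

```ts
// Assumed shapes for the injectable pair.
interface ClockAndIds {
  now: () => number; // epoch milliseconds (assumption)
  genId: () => string;
}

// A clock the test advances by hand, plus sequential ids.
function fakeClockAndIds(startMs = 0): ClockAndIds & {advance(ms: number): void} {
  let current = startMs;
  let next = 1;
  return {
    now: () => current,
    genId: () => `id-${next++}`,
    advance(ms: number) {
      current += ms;
    },
  };
}

// Hypothetical consumer standing in for a store or meter.
function record(opts: ClockAndIds, name: string) {
  return {id: opts.genId(), name, at: opts.now()};
}

const fake = fakeClockAndIds(1000);
const first = record(fake, 'a'); // {id: 'id-1', name: 'a', at: 1000}
fake.advance(60000); // "wait" a minute without sleeping
const second = record(fake, 'b'); // {id: 'id-2', name: 'b', at: 61000}
```

With both values under the test's control, assertions on timestamps and ids become exact equality checks instead of ranges or regexes.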