Runback captures everything an AI agent does and lets you reproduce any decision, gate releases against policy, and keep an audit-ready record. Here's the whole lifecycle in plain language, no prior knowledge assumed.
A normal program follows fixed instructions. An AI agent is different: you give it a goal and it decides for itself how to get there — searching, reading, calling tools, sending messages — one step at a time, choosing each move from what it has learned so far.
That autonomy is the point, and the problem. When it does something wrong, there's no single line of code to blame — the mistake is buried in a chain of decisions it made on its own. Logs show what happened, never why. To govern an agent you need to see the decision, reproduce it, and prove it. That's the lifecycle below.
Every step records the precise context the model was handed (system prompt, conversation, tools), captured at the model boundary, with PII redacted in-process. A failed run opens on the step that broke.
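To make "redacted in-process" concrete, here is a minimal sketch of scrubbing a step's context before it is stored. The `CapturedStep` shape, the `captureStep` function, and the two patterns are hypothetical illustrations, not Runback's actual capture format or redaction rules.

```typescript
// Hypothetical shape of one captured step (an assumption for illustration).
type CapturedStep = {
  index: number;
  systemPrompt: string;
  messages: { role: string; content: string }[];
  toolNames: string[];
};

// Illustrative patterns only; a real redactor would cover many more PII types.
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const CARD = /\b\d(?:[ -]?\d){12,15}\b/g; // 13 to 16 digits, optional separators

function redact(text: string): string {
  return text.replace(EMAIL, "[EMAIL]").replace(CARD, "[CARD]");
}

// Returns a copy of the step with text fields scrubbed, so raw PII
// is never written to the record.
function captureStep(step: CapturedStep): CapturedStep {
  return {
    ...step,
    systemPrompt: redact(step.systemPrompt),
    messages: step.messages.map((m) => ({ ...m, content: redact(m.content) })),
  };
}
```

Because the scrubbing happens inside the same process that calls the model, the unredacted text never has to leave it.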
Re-run a single step exactly as it happened — or change the prompt or model and compare. An investigation takes minutes, not a war room.
Escalated the disputed charge to a specialist.
Refunded the full $250 — broke policy.
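The replay idea can be sketched in a few lines: call the model once with the recorded context, once with a change, and compare. Everything here (`StepContext`, `ModelCall`, `replayStep`) is a hypothetical name for illustration, and the sketch assumes the model call is deterministic for a given context, which a real replay has to arrange for.

```typescript
// Hypothetical recorded context for one step (an assumption for illustration).
type StepContext = { model: string; systemPrompt: string; messages: string[] };

// Any function that turns a context into the model's output.
type ModelCall = (ctx: StepContext) => Promise<string>;

// Re-run the step as recorded, then again with overrides, and compare outcomes.
async function replayStep(
  recorded: StepContext,
  call: ModelCall,
  overrides: Partial<StepContext> = {},
): Promise<{ original: string; variant: string; changed: boolean }> {
  const original = await call(recorded);
  const variant = await call({ ...recorded, ...overrides });
  return { original, variant, changed: original !== variant };
}
```

Passing no overrides re-runs the step unchanged; passing a new `systemPrompt` or `model` shows whether the change would have produced a different decision.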
Save the step to a dataset with the checks it must keep passing, then run it as an eval before release. A policy breach shows up as a red row — not a production incident.
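As a rough picture of what "a dataset with checks" and "a red row" could mean, the sketch below runs an agent over saved cases and blocks the release if any check fails. The types and function names are hypothetical, and the agent is a plain synchronous function to keep the example short.

```typescript
// Hypothetical dataset shapes (assumptions for illustration).
type Check = { name: string; passes: (output: string) => boolean };
type DatasetCase = { id: string; input: string; checks: Check[] };
type Row = { caseId: string; check: string; ok: boolean };

// Run every saved case through the agent and record one row per check.
function runEval(cases: DatasetCase[], agent: (input: string) => string): Row[] {
  const rows: Row[] = [];
  for (const c of cases) {
    const output = agent(c.input);
    for (const k of c.checks) {
      rows.push({ caseId: c.id, check: k.name, ok: k.passes(output) });
    }
  }
  return rows;
}

// The gate: a release passes only when no row is red.
function releaseAllowed(rows: Row[]): boolean {
  return rows.every((r) => r.ok);
}

// Example check based on the refund case: full refunds break policy.
const noFullRefund: Check = {
  name: "no-full-refund",
  passes: (output) => !/refunded the full/i.test(output),
};
```

A build script could call `releaseAllowed` and exit non-zero on `false`, which is what turns a policy breach into a failed check instead of a production incident.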
Export a complete, tamper-evident record of any run — every event in a SHA-256 hash chain, signed. Change one byte and verification fails. The artifact an auditor actually asks for.
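A hash chain is simple to sketch: each event's hash covers its own content plus the previous event's hash, so editing any event breaks every link after it. The example below uses Node's built-in `crypto` module; the `ChainedEvent` shape and function names are hypothetical, and signing is left out.

```typescript
import { createHash } from "node:crypto";

// Hypothetical event shape (an assumption for illustration).
type ChainedEvent = { payload: string; prevHash: string; hash: string };

// Starting value for the first event's prevHash.
const GENESIS = "0".repeat(64);

function sha256(text: string): string {
  return createHash("sha256").update(text).digest("hex");
}

// Append an event whose hash covers the previous hash and its own payload.
function appendEvent(chain: ChainedEvent[], payload: string): ChainedEvent[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : GENESIS;
  return [...chain, { payload, prevHash, hash: sha256(prevHash + payload) }];
}

// Recompute every link; any edited byte makes a hash stop matching.
function verifyChain(chain: ChainedEvent[]): boolean {
  let prev = GENESIS;
  for (const e of chain) {
    if (e.prevHash !== prev || e.hash !== sha256(prev + e.payload)) return false;
    prev = e.hash;
  }
  return true;
}
```

Signing the final hash would then vouch for the entire run, because that one value depends on every event before it.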
Use the SDK for the deepest capture, or point existing OpenTelemetry traces at Runback — it sits above whatever framework you build in.
import { withDebugger } from "@runback/sdk";
import { generateText, stepCountIs } from "ai";

// `model`, `myTools`, and `task` are the model, tool set, and prompt your agent already uses.
const dbg = withDebugger(model, { runName: "support-agent", redact: "standard" });

const res = await generateText({
  model: dbg.model,
  tools: dbg.tools(myTools),
  stopWhen: stepCountIs(8), // stop once the run reaches 8 steps
  prompt: task,
});

await dbg.finish({ output: res.text, status: "success" });

Every run now shows up ready to observe, replay, gate, and audit.
Walk a real failing run step by step — no signup.