Build agents that run for hours, not seconds

Long-horizon, multi-step agent loops on OpenRouter. Cost and step ceilings, resumable state, streaming progress, voice input, and human-in-the-loop approvals, all in one SDK.

Multi-Hour Loops

Run agent loops with high step counts and long timeouts. The Agent SDK orchestrates tool calls, retries, and turn-by-turn state for you.

Cost & Step Ceilings

Cap runs with stop conditions like maxCost and stepCountIs. Combine multiple guards so agents end gracefully when budgets are exhausted.
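As a rough illustration of how combined guards behave, the sketch below models stop conditions as predicates over run state. The names stepCountIs and maxCost come from the text above, but the local definitions, their signatures, and the RunState shape are assumptions made for this example, not the SDK's actual API.

```typescript
// Hypothetical stand-ins: only the names stepCountIs and maxCost come from the
// page; these definitions and signatures are assumptions for illustration.
interface RunState {
  steps: number;   // completed agent steps
  costUsd: number; // accumulated spend in US dollars
}

type StopCondition = (state: RunState) => boolean;

const stepCountIs = (limit: number): StopCondition => (s) => s.steps >= limit;
const maxCost = (limitUsd: number): StopCondition => (s) => s.costUsd >= limitUsd;

// The run ends as soon as any single guard fires.
function shouldStop(guards: StopCondition[], state: RunState): boolean {
  return guards.some((guard) => guard(state));
}

// Simulated loop in which every step costs a fixed amount.
function runUntilStopped(guards: StopCondition[], costPerStepUsd: number): RunState {
  const state: RunState = { steps: 0, costUsd: 0 };
  while (!shouldStop(guards, state)) {
    state.steps += 1;
    state.costUsd += costPerStepUsd;
  }
  return state;
}
```

With guards [stepCountIs(100), maxCost(1)] and a step cost of 0.25, the cost guard fires first and the simulated run ends after four steps.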

Resumable State

Persist conversation messages, tool results, and shared context. Replay or resume long runs after a crash, deploy, or human review.
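One way to picture resuming after a crash is a checkpoint written after every step. The sketch below is a minimal, synchronous model of that idea; CheckpointStore, InMemoryStore, and runSteps are hypothetical names invented for this example and do not describe the SDK's own persistence interface.

```typescript
// Hypothetical checkpointing sketch; none of these types come from the SDK.
interface Checkpoint {
  messages: { role: "user" | "assistant" | "tool"; content: string }[];
  nextStep: number; // index of the first step that has not completed yet
}

interface CheckpointStore {
  load(runId: string): Checkpoint | undefined;
  save(runId: string, checkpoint: Checkpoint): void;
}

// Serializes to JSON so a loaded checkpoint is a fresh copy, as it would be
// when read back from a database or file.
class InMemoryStore implements CheckpointStore {
  private data = new Map<string, string>();

  load(runId: string): Checkpoint | undefined {
    const raw = this.data.get(runId);
    return raw === undefined ? undefined : (JSON.parse(raw) as Checkpoint);
  }

  save(runId: string, checkpoint: Checkpoint): void {
    this.data.set(runId, JSON.stringify(checkpoint));
  }
}

// Runs steps from the last saved checkpoint; crashAt simulates a failure
// just before the given step executes.
function runSteps(
  store: CheckpointStore,
  runId: string,
  totalSteps: number,
  crashAt?: number,
): Checkpoint {
  const cp = store.load(runId) ?? { messages: [], nextStep: 0 };
  while (cp.nextStep < totalSteps) {
    if (cp.nextStep === crashAt) throw new Error("simulated crash");
    cp.messages.push({ role: "assistant", content: `step ${cp.nextStep} done` });
    cp.nextStep += 1;
    store.save(runId, cp); // persist after every completed step
  }
  return cp;
}
```

Because the checkpoint is saved only after a step completes, a second call to runSteps with the same run ID picks up at the first unfinished step and does not repeat earlier work.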

Streaming Progress

Stream tokens, tool calls, and step events in real time so dashboards and UIs can show progress on long-running jobs.
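A dashboard typically folds a stream of events into one progress snapshot. The sketch below shows that reduction; the AgentEvent shapes and the event type names are assumptions for this example, since the page does not specify the SDK's event format.

```typescript
// Hypothetical event shapes; the real stream format may differ.
type AgentEvent =
  | { type: "token"; text: string }
  | { type: "tool_call"; name: string }
  | { type: "step_end"; step: number };

interface Progress {
  text: string;           // model output streamed so far
  toolCalls: string[];    // names of tools invoked so far
  stepsCompleted: number; // highest step number reported as finished
}

const emptyProgress: Progress = { text: "", toolCalls: [], stepsCompleted: 0 };

// Pure reducer: returns a new snapshot for each incoming event.
function applyEvent(p: Progress, e: AgentEvent): Progress {
  switch (e.type) {
    case "token":
      return { ...p, text: p.text + e.text };
    case "tool_call":
      return { ...p, toolCalls: [...p.toolCalls, e.name] };
    case "step_end":
      return { ...p, stepsCompleted: Math.max(p.stepsCompleted, e.step) };
  }
}
```

Keeping the reducer pure means a UI can replay a stored event log to rebuild the same view it showed live.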

Retry with Backoff

Automatic retries on transient API errors keep multi-hour agents alive through provider hiccups without manual intervention.
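For readers curious what such a retry policy can look like, here is a minimal sketch of capped exponential backoff. It is not the SDK's implementation: the helper names, the default delays, and the choice to treat HTTP 429 and 5xx responses as transient are all assumptions for illustration.

```typescript
// Capped exponential backoff: baseMs * 2^attempt, never more than capMs.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Assumed classification: rate limits and server errors are worth retrying.
function isTransient(status: number): boolean {
  return status === 429 || (status >= 500 && status < 600);
}

// Reads a numeric `status` property from a thrown value, if one exists.
function statusOf(err: unknown): number | undefined {
  if (typeof err === "object" && err !== null && "status" in err) {
    const status = (err as { status: unknown }).status;
    return typeof status === "number" ? status : undefined;
  }
  return undefined;
}

// Retries fn on transient failures, making at most maxAttempts calls in total.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const status = statusOf(err);
      const retryable = status !== undefined && isTransient(status);
      if (!retryable || attempt + 1 >= maxAttempts) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

Non-transient errors, such as a 400 or 404, are rethrown immediately so a bad request does not burn through the retry budget.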

Voice Input

Add a call to /api/v1/audio/transcriptions to drive the same agent loop from a phone call, push-to-talk app, or live mic.

Human-in-the-Loop

Pause for approval on high-stakes tool calls, inject corrections, then resume the same run with full context.
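The pause-and-resume flow can be modeled as a small state machine. The sketch below is one possible shape, assuming a fixed list of high-stakes tools; the tool names, the Run and Decision types, and both functions are hypothetical and are not taken from the SDK.

```typescript
// Hypothetical approval gate; the types and tool names are illustrative only.
type ToolCall = { name: string; args: Record<string, unknown> };
type Decision = { approve: true } | { approve: false; correction: string };

interface Run {
  status: "running" | "awaiting_approval";
  pending?: ToolCall; // set only while waiting for a human
  log: string[];      // context carried across the pause
}

const HIGH_STAKES = new Set(["send_payment", "delete_records"]);

// Low-risk calls execute immediately; high-stakes calls pause the run.
function proposeToolCall(run: Run, call: ToolCall): Run {
  if (HIGH_STAKES.has(call.name)) {
    return { ...run, status: "awaiting_approval", pending: call };
  }
  return { ...run, log: [...run.log, `executed ${call.name}`] };
}

// Applies the human decision and resumes the same run with its full log.
function resolveApproval(run: Run, decision: Decision): Run {
  if (run.status !== "awaiting_approval" || !run.pending) {
    throw new Error("no tool call is awaiting approval");
  }
  const entry = decision.approve
    ? `executed ${run.pending.name}`
    : `rejected ${run.pending.name}: ${decision.correction}`;
  return { status: "running", log: [...run.log, entry] };
}
```

A rejection records the reviewer's correction in the log, so the next model turn can see why the call was refused.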

Self-Ask & Review

Wrap callModel in an adversarial self-review loop that runs until the agent emits [DONE]. Catch gaps, hallucinations, and unverified claims before shipping.
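The control flow of such a review loop might look like the sketch below. To keep it self-contained, the reviewer and reviser are plain synchronous functions passed in by the caller; in a real agent each would be an asynchronous model call. The function names and the maxRounds safety limit are assumptions for this example.

```typescript
const DONE = "[DONE]";

// Stand-ins for model calls: a reviewer returns a critique or "[DONE]",
// and a reviser returns an improved draft given that critique.
type Reviewer = (draft: string, round: number) => string;
type Reviser = (draft: string, critique: string) => string;

interface ReviewResult {
  draft: string;
  rounds: number;     // number of review passes performed
  converged: boolean; // true if the reviewer emitted [DONE]
}

// Alternates review and revision until the reviewer signs off or the
// round budget runs out.
function selfReview(
  initial: string,
  review: Reviewer,
  revise: Reviser,
  maxRounds = 5,
): ReviewResult {
  let draft = initial;
  for (let round = 1; round <= maxRounds; round++) {
    const critique = review(draft, round);
    if (critique.includes(DONE)) {
      return { draft, rounds: round, converged: true };
    }
    draft = revise(draft, critique);
  }
  return { draft, rounds: maxRounds, converged: false };
}
```

The round limit matters for long runs: without it, a reviewer that never emits [DONE] would loop until a cost or step ceiling stopped it.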

One Loop, Hours of Work

The Agent SDK keeps the code for long-horizon runs compact: you set ceilings, persist state, and stream progress with the same primitives.

Voice In, Voice Out

Drive the same agent loop from a phone call, push-to-talk app, or live mic. OpenRouter ships dedicated /api/v1/audio/transcriptions and /api/v1/audio/speech endpoints.

Transcribe

Send audio to /api/v1/audio/transcriptions and pick from any speech-to-text (STT) model on OpenRouter, including Whisper.

Run the Agent

Hand the transcript to callModel with the same tools, stop conditions, and state you use for text-driven runs.

Speak Back

Pipe the result through /api/v1/audio/speech to return a spoken response or stream audio back to the caller.
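The first and last of these steps can be sketched as plain HTTP requests. The endpoint paths are the ones named above; everything else here is an assumption made for illustration, including the multipart fields (file, model), the JSON fields (model, input, voice), the input.wav file name, and the helper function names. Check the API reference for the actual request format before relying on it.

```typescript
const OPENROUTER_BASE = "https://openrouter.ai";

// Assumed request shape: multipart form with `file` and `model` fields.
function buildTranscriptionRequest(audio: Blob, model: string, apiKey: string): Request {
  const form = new FormData();
  form.append("file", audio, "input.wav");
  form.append("model", model);
  return new Request(`${OPENROUTER_BASE}/api/v1/audio/transcriptions`, {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form, // the multipart Content-Type header is set automatically
  });
}

// Assumed request shape: JSON body with `model`, `input`, and `voice` fields.
function buildSpeechRequest(text: string, model: string, voice: string, apiKey: string): Request {
  return new Request(`${OPENROUTER_BASE}/api/v1/audio/speech`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model, input: text, voice }),
  });
}
```

Each Request can be passed directly to fetch. The agent step in the middle takes the transcript text from the first response and produces the text that goes into the second request.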

Voice-Driven Agent in Three Calls

Capture

Take input as text or voice. Whisper-grade STT is one API call away when you need it.

Run

Hand the request to callModel with tools and stop conditions. The SDK handles the loop, retries, and streaming.

Resume

Persist state with a StateAccessor. Pause for human review or restart after a deploy and continue exactly where you left off.

Ship a long-running agent today

Get an API key, copy the cookbook, and have a multi-hour, voice-aware agent in production this week.