Subagent

Beta
Delegate tasks to a smaller, faster model as a server tool
Beta

Server tools are currently in beta. The API and behavior may change.

The openrouter:subagent server tool lets a model delegate self-contained tasks to a smaller, cheaper, faster worker model mid-generation. When your model has a piece of work that doesn’t need its full capability — summarizing a document, extracting structured data, drafting boilerplate, reformatting text — it invokes the tool with a task_name and a task_description. The worker model executes the task and returns its result as the tool’s outcome, and your model continues, integrating the result.

The worker can be any OpenRouter model, and it can optionally run as a sub-agent with its own tools (for example openrouter:web_search). Each task is independent: the worker sees only the task description (not the parent conversation) and keeps no memory between tasks.

Quick start

1const response = await fetch('https://openrouter.ai/api/v1/chat/completions', {
2 method: 'POST',
3 headers: {
4 Authorization: 'Bearer {{API_KEY_REF}}',
5 'Content-Type': 'application/json',
6 },
7 body: JSON.stringify({
8 model: '{{MODEL}}',
9 messages: [
10 {
11 role: 'user',
12 content: 'Audit this release: summarize the changelog, list breaking changes, and draft the announcement.',
13 },
14 ],
15 tools: [
16 {
17 type: 'openrouter:subagent',
18 parameters: { model: '~anthropic/claude-haiku-latest' },
19 },
20 ],
21 }),
22});
23
24const data = await response.json();
25console.log(data.choices[0].message.content);

Choosing the worker model

The worker model is resolved with the following precedence:

  1. parameters.model on the tool definition, if set.
  2. The model from the outer API request, as a fallback.

Unlike the advisor tool, the delegating model does not choose its worker per call — the worker is fixed by the tool definition. The subagent tool itself can never be the worker model.

When does the model invoke it?

The tool’s description steers the model to delegate focused sub-tasks that don’t need its full capability — and to skip delegation for work that is faster to do directly than to describe. Because the worker has no access to the parent conversation, the model is instructed to include all relevant context and the expected output format in the task_description.

Parameters

Pass an optional parameters object on the tool entry:

1{
2 "tools": [
3 {
4 "type": "openrouter:subagent",
5 "parameters": {
6 "model": "~anthropic/claude-haiku-latest",
7 "instructions": "You are a fast, focused worker. Complete the task exactly as described.",
8 "tools": [{ "type": "openrouter:web_search" }]
9 }
10 }
11 ]
12}
FieldDefaultDescription
modelOuter request modelThe worker model that executes delegated tasks (any OpenRouter model). Typically smaller, cheaper, and faster than the delegating model.
toolsNoneTools made available to the worker. Only OpenRouter server tools (such as openrouter:web_search) are supported; function tools are rejected with a 400 because the worker has no way to execute them. The subagent may not list itself.
instructionsNoneSystem instructions for the worker.
max_tool_callsProvider defaultMax tool-calling steps the worker may take. Only relevant when the worker has tools. Range 1–25. Accepted and validated but not yet enforced on the worker call.
max_completion_tokensProvider defaultMax output tokens (including reasoning) for the worker call.
reasoningProvider defaultReasoning config for the worker call — an object with optional effort and max_tokens. effort is forwarded to the worker; max_tokens is accepted and validated but not yet forwarded.
temperatureProvider defaultSampling temperature (02) forwarded to the worker call.

Tool-call arguments

When invoking the tool, the model passes:

ArgumentDescription
task_nameA short identifier for the delegated task (e.g. summarize-changelog).
task_descriptionEverything the worker needs to complete the task: full context, inputs, constraints, and the expected output format. The worker sees only this description.

What the tool returns

On success the tool result contains the outcome text, the task name, and the model that produced it:

1{
2 "status": "ok",
3 "model": "anthropic/claude-haiku-4.5",
4 "task_name": "summarize-changelog",
5 "outcome": "Release 2.4 highlights: 1) New streaming API..."
6}

On failure the result has status: "error" with a message; the calling model continues without the outcome:

1{
2 "status": "error",
3 "task_name": "summarize-changelog",
4 "error": "Subagent call failed: ..."
5}

Worker tools

When you pass tools, the worker runs as an agentic sub-agent over them before producing its outcome — for example, giving the worker openrouter:web_search lets it ground its result in fresh sources. The worker’s tool use happens inside the tool call; only its final text is returned to your model.

Nested tools must be OpenRouter server tools (for example openrouter:web_search or openrouter:web_fetch). Function tools ({ "type": "function" }) are rejected with a 400: the worker call has no client-side executor, so a function tool call could never be fulfilled.

Recursion protection

The subagent tool cannot invoke itself. Two guards enforce this:

  • A self-reference check rejects a subagent entry inside the subagent’s own tools array (and rejects the subagent tool name as the worker model).
  • Each inner subagent call carries an x-openrouter-subagent-depth header; the subagent tool is stripped from any sub-call, so a worker can never re-enter the subagent.

Task executions are also capped per request to bound cost and latency.