# Auto Router

The [Auto Router](https://openrouter.ai/openrouter/auto) (`openrouter/auto`) automatically selects the best model for your prompt, powered by [NotDiamond](https://www.notdiamond.ai/).

## Overview

Instead of manually choosing a model, let the Auto Router analyze your prompt and select the optimal model from a curated set of high-quality options. The router considers factors like prompt complexity, task type, and model capabilities.

## Usage

Set your model to `openrouter/auto`:

```typescript title="TypeScript SDK"
import { OpenRouter } from '@openrouter/sdk';

const openRouter = new OpenRouter({
  apiKey: '<OPENROUTER_API_KEY>',
});

const completion = await openRouter.chat.send({
  model: 'openrouter/auto',
  messages: [
    {
      role: 'user',
      content: 'Explain quantum entanglement in simple terms',
    },
  ],
});

console.log(completion.choices[0].message.content);
// Check which model was selected
console.log('Model used:', completion.model);
```

```typescript title="TypeScript (fetch)"
const response = await fetch('https://openrouter.ai/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer <OPENROUTER_API_KEY>',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openrouter/auto',
    messages: [
      {
        role: 'user',
        content: 'Explain quantum entanglement in simple terms',
      },
    ],
  }),
});

const data = await response.json();
console.log(data.choices[0].message.content);
// Check which model was selected
console.log('Model used:', data.model);
```

```python title="Python"
import requests
import json

response = requests.post(
  url="https://openrouter.ai/api/v1/chat/completions",
  headers={
    "Authorization": "Bearer <OPENROUTER_API_KEY>",
    "Content-Type": "application/json",
  },
  data=json.dumps({
    "model": "openrouter/auto",
    "messages": [
      {
        "role": "user",
        "content": "Explain quantum entanglement in simple terms"
      }
    ]
  })
)

data = response.json()
print(data['choices'][0]['message']['content'])
# Check which model was selected
print('Model used:', data['model'])
```

## Response

The response includes the `model` field showing which model was actually used:

```json
{
  "id": "gen-...",
  "model": "anthropic/claude-sonnet-4.5",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "..."
      }
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 150,
    "total_tokens": 165
  }
}
```
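To log or audit routing decisions, it can help to read the `model` field defensively. The following is a minimal sketch: `selected_model` is a hypothetical helper name, not part of any OpenRouter SDK, and it assumes only the response shape shown above.

```python
from typing import Any, Optional


def selected_model(response_body: dict[str, Any]) -> Optional[str]:
    """Return the slug of the model that served the request, if present."""
    model = response_body.get("model")
    # Treat a missing or non-string value as "unknown" rather than raising.
    return model if isinstance(model, str) and model else None


# Sample body with the same shape as the response above (values are illustrative).
sample = {
    "id": "gen-...",
    "model": "anthropic/claude-sonnet-4.5",
    "choices": [{"message": {"role": "assistant", "content": "..."}}],
}
print("Model used:", selected_model(sample))
```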

## How It Works

1. **Prompt Analysis**: Your prompt is analyzed by NotDiamond's routing system
2. **Model Selection**: The optimal model is selected based on the task requirements
3. **Request Forwarding**: Your request is forwarded to the selected model
4. **Response Tracking**: The response includes metadata showing which model was used

## Session Stickiness

The Auto Router pins both the selected **model** and **provider** so that subsequent requests in the same conversation route to the same place. This ensures consistent behavior within a conversation and maximizes [prompt cache](/docs/guides/best-practices/prompt-caching) hits.

Stickiness applies at two levels:

* **Implicit (automatic)**: OpenRouter derives a conversation fingerprint from your messages (hashing the first system message and first user message). Once the provider reports prompt cache usage, the model and provider are pinned for that conversation. No configuration needed.
* **Explicit (`session_id`)**: When you include a `session_id`, stickiness kicks in on the first successful response — even before cache usage is observed. This is recommended for multi-turn conversations and agent workflows where you want consistent routing from the start.

In both cases, the pinned route expires after **5 minutes** of inactivity. Each successful request resets the timer. If the pinned provider returns an error, the pin is not refreshed, so the next request can be re-routed.
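The implicit mechanism can be pictured as a small time-limited lookup table keyed by a conversation fingerprint. The sketch below is illustrative only: the SHA-256 hash, the `StickyCache` class, and every function name are assumptions made for this example, not OpenRouter's actual implementation. It mirrors two behaviors described above: the fingerprint depends only on the first system message and first user message, and a pin lapses after 5 minutes without a successful request.

```python
import hashlib
import time
from typing import Callable, Optional

TTL_SECONDS = 5 * 60  # inactivity window described above


def conversation_fingerprint(messages: list[dict]) -> str:
    """Hash the first system message and first user message (assumed scheme)."""
    first_system = next((m["content"] for m in messages if m["role"] == "system"), "")
    first_user = next((m["content"] for m in messages if m["role"] == "user"), "")
    return hashlib.sha256(f"{first_system}\x00{first_user}".encode("utf-8")).hexdigest()


class StickyCache:
    """Maps a fingerprint (or session_id) to a pinned (model, provider) pair."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: dict[str, tuple[str, str, float]] = {}

    def record_success(self, key: str, model: str, provider: str) -> None:
        # Called only after a successful response, so errors never refresh a pin.
        self._entries[key] = (model, provider, self._clock() + TTL_SECONDS)

    def lookup(self, key: str) -> Optional[tuple[str, str]]:
        entry = self._entries.get(key)
        if entry is None or self._clock() >= entry[2]:
            self._entries.pop(key, None)  # expired or never pinned
            return None
        return entry[0], entry[1]
```

Because later turns keep the same first system and first user message, they produce the same fingerprint and therefore find the same pin while it is still live.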

For full details on how sticky routing works, cache key granularity, and the `x-session-id` header, see [Provider Sticky Routing](/docs/guides/best-practices/prompt-caching#provider-sticky-routing).

### Example with `session_id`

```typescript title="TypeScript SDK"
const completion = await openRouter.chat.send({
  model: 'openrouter/auto',
  session_id: 'my-conversation-123',
  messages: [
    {
      role: 'user',
      content: 'Explain quantum entanglement',
    },
  ],
});

// Subsequent requests with the same session_id will use the same model and provider
const followUp = await openRouter.chat.send({
  model: 'openrouter/auto',
  session_id: 'my-conversation-123',
  messages: [
    { role: 'user', content: 'Explain quantum entanglement' },
    { role: 'assistant', content: completion.choices[0].message.content ?? '' },
    { role: 'user', content: 'Now explain it to a 5-year-old' },
  ],
});
```

```typescript title="TypeScript (fetch)"
const response = await fetch('https://openrouter.ai/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer <OPENROUTER_API_KEY>',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openrouter/auto',
    session_id: 'my-conversation-123',
    messages: [
      {
        role: 'user',
        content: 'Explain quantum entanglement',
      },
    ],
  }),
});
```

```python title="Python"
import json
import requests

response = requests.post(
  url="https://openrouter.ai/api/v1/chat/completions",
  headers={
    "Authorization": "Bearer <OPENROUTER_API_KEY>",
    "Content-Type": "application/json",
  },
  data=json.dumps({
    "model": "openrouter/auto",
    "session_id": "my-conversation-123",
    "messages": [
      {
        "role": "user",
        "content": "Explain quantum entanglement"
      }
    ]
  })
)
```
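The Python example above shows only the first turn. A follow-up turn reuses the same `session_id` and resends the conversation so far. The sketch below only builds the request bodies, so it runs without a network call; `build_turn` is a hypothetical helper, and each body would be sent with `requests.post` as in the examples above.

```python
import json


def build_turn(session_id: str, history: list[dict], user_text: str) -> dict:
    """Return a request body that appends a user message to the history."""
    return {
        "model": "openrouter/auto",
        "session_id": session_id,  # same value on every turn of the conversation
        "messages": [*history, {"role": "user", "content": user_text}],
    }


first = build_turn("my-conversation-123", [], "Explain quantum entanglement")

# After the first response arrives, append the assistant reply to the history.
assistant_reply = "..."  # placeholder for data['choices'][0]['message']['content']
history = [*first["messages"], {"role": "assistant", "content": assistant_reply}]

follow_up = build_turn("my-conversation-123", history, "Now explain it to a 5-year-old")
print(json.dumps(follow_up, indent=2))
```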

### Why It Matters for the Auto Router

Unlike a fixed model, the Auto Router chooses a model per request based on your prompt, so the choice can change from one turn to the next. Session stickiness matters more here because it pins the **model selection** as well as the provider. Without it, successive turns of a conversation could be served by different models, leading to inconsistent behavior and wasted prompt cache.

## Supported Models

Model slugs change as new versions are released. The examples below are current as of December 4, 2025. Check the [models page](https://openrouter.ai/models) for the latest available models.

The Auto Router selects from a curated set of high-quality models, including:

* Claude Sonnet 4.5 (`anthropic/claude-sonnet-4.5`)
* Claude Opus 4.5 (`anthropic/claude-opus-4.5`)
* GPT-5.1 (`openai/gpt-5.1`)
* Gemini 3.1 Pro (`google/gemini-3.1-pro-preview`)
* DeepSeek V3.2 (`deepseek/deepseek-v3.2`)
* And other top-performing models

The exact model pool may be updated as new models become available.

## Configuring Allowed Models

You can restrict which models the Auto Router can select from using the `plugins` parameter. This is useful when you want to limit routing to specific providers or model families.

### Via API Request

Use wildcard patterns to filter models. For example, `anthropic/*` matches all Anthropic models:

```typescript title="TypeScript SDK"
const completion = await openRouter.chat.send({
  model: 'openrouter/auto',
  messages: [
    {
      role: 'user',
      content: 'Explain quantum entanglement',
    },
  ],
  plugins: [
    {
      id: 'auto-router',
      allowed_models: ['anthropic/*', 'openai/gpt-5.1'],
    },
  ],
});
```

```typescript title="TypeScript (fetch)"
const response = await fetch('https://openrouter.ai/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer <OPENROUTER_API_KEY>',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openrouter/auto',
    messages: [
      {
        role: 'user',
        content: 'Explain quantum entanglement',
      },
    ],
    plugins: [
      {
        id: 'auto-router',
        allowed_models: ['anthropic/*', 'openai/gpt-5.1'],
      },
    ],
  }),
});
```

```python title="Python"
import json
import requests

response = requests.post(
  url="https://openrouter.ai/api/v1/chat/completions",
  headers={
    "Authorization": "Bearer <OPENROUTER_API_KEY>",
    "Content-Type": "application/json",
  },
  data=json.dumps({
    "model": "openrouter/auto",
    "messages": [
      {
        "role": "user",
        "content": "Explain quantum entanglement"
      }
    ],
    "plugins": [
      {
        "id": "auto-router",
        "allowed_models": ["anthropic/*", "openai/gpt-5.1"]
      }
    ]
  })
)
```

### Via Settings UI

You can also configure default allowed models in your [Plugin Settings](https://openrouter.ai/settings/plugins):

1. Navigate to **Settings > Plugins**
2. Find **Auto Router** and click the configure button
3. Enter model patterns (one per line)
4. Save your settings

These defaults apply to all your API requests unless overridden per-request.

### Pattern Syntax

| Pattern          | Matches                                |
| ---------------- | -------------------------------------- |
| `anthropic/*`    | All Anthropic models                   |
| `openai/gpt-5*`  | All GPT-5 variants                     |
| `google/*`       | All Google models                      |
| `openai/gpt-5.1` | Exact match only                       |
| `*/claude-*`     | Any provider with claude in model name |

When no patterns are configured, the Auto Router uses all supported models.
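To preview which slugs a pattern list would admit before sending a request, the wildcard rules in the table can be approximated locally. This is a rough sketch that assumes `*` behaves like a shell-style wildcard, as implemented by Python's `fnmatch.fnmatchcase`; the router's exact matching rules may differ, and `filter_models` is a hypothetical helper. The candidate slugs are taken from the Supported Models list.

```python
from fnmatch import fnmatchcase


def filter_models(slugs: list[str], patterns: list[str]) -> list[str]:
    """Keep slugs matching at least one pattern; no patterns means no filtering."""
    if not patterns:
        return list(slugs)
    return [s for s in slugs if any(fnmatchcase(s, p) for p in patterns)]


candidates = [
    "anthropic/claude-sonnet-4.5",
    "anthropic/claude-opus-4.5",
    "openai/gpt-5.1",
    "google/gemini-3.1-pro-preview",
    "deepseek/deepseek-v3.2",
]

print(filter_models(candidates, ["anthropic/*", "openai/gpt-5.1"]))
print(filter_models(candidates, ["*/claude-*"]))
```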

## Cost / Quality Tradeoff

Control how aggressively the Auto Router optimizes for cost vs. quality using the `cost_quality_tradeoff` parameter (integer, 0–10):

* **0** = pure quality: always picks the most capable model, regardless of cost
* **10** = pure cost: always picks the cheapest model
* Intermediate values blend the quality and cost signals continuously

The default is **7**, which balances cost savings with strong output quality.
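As an intuition for how intermediate values trade one signal against the other, consider a toy linear blend. This is purely illustrative: the formula, the `blended_score` and `pick` functions, and the quality and cost numbers are all invented for the example and are not the router's actual scoring.

```python
def blended_score(quality: float, cost: float, tradeoff: int) -> float:
    """Toy score for quality and cost in [0, 1]; higher quality and lower cost score higher."""
    if not 0 <= tradeoff <= 10:
        raise ValueError("cost_quality_tradeoff must be an integer from 0 to 10")
    cost_weight = tradeoff / 10
    return (1 - cost_weight) * quality + cost_weight * (1 - cost)


# Hypothetical candidates: (name, quality, normalized cost). All values are made up.
candidates = [("large", 0.95, 0.90), ("medium", 0.85, 0.30), ("small", 0.50, 0.05)]


def pick(tradeoff: int) -> str:
    """Return the name of the candidate with the highest toy score."""
    return max(candidates, key=lambda c: blended_score(c[1], c[2], tradeoff))[0]


for t in (0, 5, 10):
    print(t, pick(t))
```

With these made-up numbers, a tradeoff of 0 selects the large model, 5 the medium one, and 10 the small one.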

### Via API Request

```typescript title="TypeScript SDK"
const completion = await openRouter.chat.send({
  model: 'openrouter/auto',
  messages: [
    {
      role: 'user',
      content: 'Summarize this paragraph',
    },
  ],
  plugins: [
    {
      id: 'auto-router',
      cost_quality_tradeoff: 3, // Favor quality over cost
    },
  ],
});
```

```typescript title="TypeScript (fetch)"
const response = await fetch('https://openrouter.ai/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer <OPENROUTER_API_KEY>',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openrouter/auto',
    messages: [
      {
        role: 'user',
        content: 'Summarize this paragraph',
      },
    ],
    plugins: [
      {
        id: 'auto-router',
        cost_quality_tradeoff: 3,
      },
    ],
  }),
});
```

```python title="Python"
import json
import requests

response = requests.post(
  url="https://openrouter.ai/api/v1/chat/completions",
  headers={
    "Authorization": "Bearer <OPENROUTER_API_KEY>",
    "Content-Type": "application/json",
  },
  data=json.dumps({
    "model": "openrouter/auto",
    "messages": [
      {
        "role": "user",
        "content": "Summarize this paragraph"
      }
    ],
    "plugins": [
      {
        "id": "auto-router",
        "cost_quality_tradeoff": 3
      }
    ]
  })
)
```

### Via Settings UI

You can also set a default tradeoff in your [Plugin Settings](https://openrouter.ai/settings/plugins) under **Auto Router**. The per-request value overrides this default.
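Since `allowed_models` and `cost_quality_tradeoff` are both set on the `auto-router` plugin entry, a small helper can validate them before a request is sent. The sketch below is one possible approach: `auto_router_plugin` is a hypothetical name, and combining both fields in a single plugin entry is an assumption, since the examples above set them separately.

```python
from typing import Optional


def auto_router_plugin(
    allowed_models: Optional[list[str]] = None,
    cost_quality_tradeoff: Optional[int] = None,
) -> dict:
    """Build an `auto-router` plugin entry, omitting options left unset."""
    plugin: dict = {"id": "auto-router"}
    if allowed_models:
        plugin["allowed_models"] = list(allowed_models)
    if cost_quality_tradeoff is not None:
        # The documented range is an integer from 0 (quality) to 10 (cost).
        if isinstance(cost_quality_tradeoff, bool) or not isinstance(cost_quality_tradeoff, int):
            raise TypeError("cost_quality_tradeoff must be an integer")
        if not 0 <= cost_quality_tradeoff <= 10:
            raise ValueError("cost_quality_tradeoff must be between 0 and 10")
        plugin["cost_quality_tradeoff"] = cost_quality_tradeoff
    return plugin


payload = {
    "model": "openrouter/auto",
    "messages": [{"role": "user", "content": "Summarize this paragraph"}],
    "plugins": [auto_router_plugin(["anthropic/*"], cost_quality_tradeoff=3)],
}
```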

## Pricing

You pay the standard rate for whichever model is selected. There is no additional fee for using the Auto Router.

## Use Cases

* **General-purpose applications**: When you don't know what types of prompts users will send
* **Cost optimization**: Let the router choose efficient models for simpler tasks
* **Quality optimization**: Ensure complex prompts get routed to capable models
* **Experimentation**: Discover which models work best for your use case

## Limitations

* The router requires `messages` format (not `prompt`)
* Streaming is supported
* All standard OpenRouter features (tool calling, etc.) work with the selected model
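
If existing code builds a single `prompt` string, it can be wrapped in a one-message `messages` array before calling the Auto Router. A minimal sketch, where `to_messages` is a hypothetical helper and the optional system message is an illustrative extra:

```python
from typing import Optional


def to_messages(prompt: str, system: Optional[str] = None) -> list[dict]:
    """Wrap a plain prompt string in the chat `messages` format."""
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return messages


payload = {
    "model": "openrouter/auto",
    "messages": to_messages("Explain quantum entanglement in simple terms"),
}
```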

## Related

* [Body Builder](/docs/guides/routing/routers/body-builder) - Generate multiple parallel API requests
* [Latest Model Resolution](/docs/guides/routing/routers/latest-resolution) - Always target the newest version of a model family
* [Model Fallbacks](/docs/guides/routing/model-fallbacks) - Configure fallback models
* [Provider Selection](/docs/guides/routing/provider-selection) - Control which providers are used