POST /messages
Create a message
curl --request POST \
  --url https://openrouter.ai/api/v1/messages \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "max_tokens": 1024,
  "messages": [
    {
      "content": "Hello, how are you?",
      "role": "user"
    }
  ],
  "model": "anthropic/claude-sonnet-4"
}
'
{
  "content": [
    {
      "text": "I'm doing well, thank you for asking! How can I help you today?",
      "type": "text"
    }
  ],
  "id": "msg_abc123",
  "model": "anthropic/claude-sonnet-4",
  "role": "assistant",
  "stop_reason": "end_turn",
  "type": "message",
  "usage": {
    "input_tokens": 12,
    "output_tokens": 18
  }
}

Authorizations

Authorization
string
header
required

API key as bearer token in Authorization header
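
For readers calling the endpoint from Python, the curl request above can be sketched with the standard library alone. This is a minimal sketch rather than an official client: `build_request` is a hypothetical helper name, `<token>` is the same placeholder used in the curl example, and the request is only constructed here, not sent.

```python
import json
import urllib.request

API_URL = "https://openrouter.ai/api/v1/messages"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request with bearer authentication."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {
    "max_tokens": 1024,
    "messages": [{"content": "Hello, how are you?", "role": "user"}],
    "model": "anthropic/claude-sonnet-4",
}
request = build_request("<token>", payload)
# To actually send it: urllib.request.urlopen(request)
```

Keeping construction separate from sending makes the headers and body easy to inspect before any network call is made.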

Headers

X-OpenRouter-Metadata
enum<string>

Opt in to surfacing routing metadata on the response under openrouter_metadata. Defaults to disabled. The legacy header X-OpenRouter-Experimental-Metadata is also accepted for backward compatibility.

Available options:
disabled,
enabled
Example:

"enabled"

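As an illustration of the opt-in header, the sketch below builds the header dictionary and rejects values outside the two listed options. The helper name `metadata_header` is hypothetical, and the client-side check is an assumption added for convenience, not something the API requires.

```python
VALID_LEVELS = ("disabled", "enabled")


def metadata_header(level: str = "disabled") -> dict:
    """Return the X-OpenRouter-Metadata header for the given opt-in level."""
    if level not in VALID_LEVELS:
        raise ValueError(f"level must be one of {VALID_LEVELS}, got {level!r}")
    return {"X-OpenRouter-Metadata": level}


headers = {
    "Content-Type": "application/json",
    **metadata_header("enabled"),
}
```
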
Body

application/json

Request schema for the Anthropic Messages API endpoint

messages
object[] | null
required
model
string
required
cache_control
object

Enable automatic prompt caching. When set at the top level, the system automatically applies cache breakpoints to the last cacheable block in the request. Currently supported for Anthropic Claude models.

Example:
{ "type": "ephemeral" }
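
A small sketch of setting the top-level field on a request body, using the example value shown above. `with_prompt_caching` is a hypothetical helper, and copying the body instead of mutating it is a design choice of this sketch only.

```python
import copy


def with_prompt_caching(body: dict) -> dict:
    """Return a copy of the request body with top-level cache_control set."""
    cached = copy.deepcopy(body)
    cached["cache_control"] = {"type": "ephemeral"}
    return cached


base_body = {
    "max_tokens": 1024,
    "messages": [{"content": "Hello, how are you?", "role": "user"}],
    "model": "anthropic/claude-sonnet-4",
}
cached_body = with_prompt_caching(base_body)
```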
context_management
object | null
fallbacks
object[] | null

Fallback models to try if the primary model fails or refuses, in order. Handled by OpenRouter multi-model routing rather than Anthropic server-side fallbacks; cannot be combined with models. Each entry accepts only model. Maximum of 3 entries.

Example:
[{ "model": "claude-opus-4-8" }]
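
The constraints described for fallbacks (at most 3 entries, each entry holding only model, and no combination with models) can be checked before sending a request. The sketch below is a client-side illustration under those stated rules; `validate_fallbacks` is a hypothetical name, and treating an empty or null models value as "not combined" is an assumption.

```python
MAX_FALLBACKS = 3


def validate_fallbacks(body: dict) -> None:
    """Raise ValueError if the fallbacks field breaks the documented rules."""
    fallbacks = body.get("fallbacks")
    if fallbacks is None:
        return
    # Assumption: only a non-empty models list counts as a conflict.
    if body.get("models"):
        raise ValueError("fallbacks cannot be combined with models")
    if len(fallbacks) > MAX_FALLBACKS:
        raise ValueError(f"at most {MAX_FALLBACKS} fallback entries are allowed")
    for entry in fallbacks:
        if set(entry) != {"model"}:
            raise ValueError("each fallback entry accepts only 'model'")


body = {
    "model": "anthropic/claude-sonnet-4",
    "fallbacks": [{"model": "claude-opus-4-8"}],
}
validate_fallbacks(body)  # passes silently
```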
max_tokens
integer
metadata
object
models
string[]
output_config
object

Configuration for controlling output behavior. Supports the effort parameter and structured output format.

Example:
{ "effort": "medium" }
plugins
object[]

Plugins you want to enable for this request, including their settings.

Example:
{
"allowed_models": ["anthropic/*", "openai/gpt-4o"],
"cost_quality_tradeoff": 7,
"enabled": true,
"id": "auto-router"
}
provider
object | null

When multiple model providers are available, optionally indicate your routing preference.

Example:
{ "allow_fallbacks": true }
route
enum<string> | null
deprecated

Deprecated: use providers.sort.partition instead. This field is a backwards-compatible alias for providers.sort.partition and accepts the legacy values "fallback" (maps to "model") and "sort" (maps to "none").

Available options:
fallback,
sort,
null
Example:

"fallback"

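When migrating away from route, the legacy-to-partition mapping described above can be written as a lookup table. This is only a sketch of that mapping; `partition_for_route` is a hypothetical helper, and where the resulting value is placed in the request body is left to the caller.

```python
from typing import Optional

# Mapping stated in the field description: legacy route value -> partition value.
LEGACY_ROUTE_TO_PARTITION = {"fallback": "model", "sort": "none"}


def partition_for_route(route: Optional[str]) -> Optional[str]:
    """Translate a legacy route value into its partition equivalent."""
    if route is None:
        return None
    if route not in LEGACY_ROUTE_TO_PARTITION:
        raise ValueError(f"unknown legacy route value: {route!r}")
    return LEGACY_ROUTE_TO_PARTITION[route]
```
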
service_tier
string
session_id
string

A unique identifier for grouping related requests (e.g., a conversation or agent workflow). When provided, OpenRouter uses it as the sticky routing key, routing all requests in the session to the same provider to maximize prompt cache hits. Also used for observability grouping. If provided in both the request body and the x-session-id header, the body value takes precedence. Maximum of 256 characters.

Maximum string length: 256
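
The precedence rule (body value over the x-session-id header) and the 256-character limit can be illustrated as follows. This sketch mirrors the documented behavior on the client side for clarity; it is not OpenRouter's implementation, and `effective_session_id` is a hypothetical name.

```python
from typing import Optional

MAX_SESSION_ID_LENGTH = 256


def effective_session_id(body: dict, headers: dict) -> Optional[str]:
    """Return the session ID that wins when body and header may both be set."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    header_value = next(
        (value for name, value in headers.items() if name.lower() == "x-session-id"),
        None,
    )
    body_value = body.get("session_id")
    session_id = body_value if body_value is not None else header_value
    if session_id is not None and len(session_id) > MAX_SESSION_ID_LENGTH:
        raise ValueError("session_id exceeds 256 characters")
    return session_id
```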
speed
enum<string> | null

Controls output generation speed. When set to fast, uses a higher-speed inference configuration at premium pricing. Defaults to standard when omitted.

Available options:
fast,
standard,
null
Example:

"fast"

stop_sequences
string[]
stop_server_tools_when
object[]

Stop conditions for the server-tool agent loop. Any condition firing halts the loop (OR logic). When set, this overrides max_tool_calls.

Minimum array length: 1

A single condition that, when met, halts the server-tool agent loop.

Example:
{ "step_count": 5, "type": "step_count_is" }
Example:
[
{ "step_count": 5, "type": "step_count_is" },
{
"max_cost_in_dollars": 0.5,
"type": "max_cost"
}
]
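
To make the OR logic concrete, the sketch below evaluates a list of conditions against a running step count and cost. It is purely illustrative and not the server's implementation: the function names are hypothetical, and the exact thresholds (a condition fires once the step count or cost reaches its limit) are assumptions.

```python
def condition_fired(condition: dict, steps_taken: int, cost_in_dollars: float) -> bool:
    """Check one condition; threshold semantics here are assumed, not documented."""
    kind = condition["type"]
    if kind == "step_count_is":
        return steps_taken >= condition["step_count"]
    if kind == "max_cost":
        return cost_in_dollars >= condition["max_cost_in_dollars"]
    raise ValueError(f"unknown condition type: {kind}")


def should_stop(conditions: list, steps_taken: int, cost_in_dollars: float) -> bool:
    """Halt as soon as any single condition fires (OR logic)."""
    if not conditions:
        raise ValueError("at least one condition is required")
    return any(condition_fired(c, steps_taken, cost_in_dollars) for c in conditions)


conditions = [
    {"step_count": 5, "type": "step_count_is"},
    {"max_cost_in_dollars": 0.5, "type": "max_cost"},
]
```

With these conditions, a loop at 2 steps and $0.10 keeps going, while reaching either 5 steps or $0.50 is enough to stop it.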
stream
boolean
system
temperature
number<double>
thinking
object
tool_choice
object
tools
object[]
top_k
integer
top_p
number<double>
trace
object

Metadata for observability and tracing. Known keys (trace_id, trace_name, span_name, generation_name, parent_span_id) have special handling. Additional keys are passed through as custom metadata to configured broadcast destinations.

Example:
{
"trace_id": "trace-abc123",
"trace_name": "my-app-trace"
}
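
The split between known keys and pass-through keys can be sketched as below. The set of known keys comes from the description above; `split_trace` is a hypothetical helper, and the `environment` key in the usage example is made up to show a custom entry.

```python
KNOWN_TRACE_KEYS = {
    "trace_id",
    "trace_name",
    "span_name",
    "generation_name",
    "parent_span_id",
}


def split_trace(trace: dict) -> tuple:
    """Separate keys with special handling from custom pass-through metadata."""
    known = {k: v for k, v in trace.items() if k in KNOWN_TRACE_KEYS}
    custom = {k: v for k, v in trace.items() if k not in KNOWN_TRACE_KEYS}
    return known, custom


known, custom = split_trace(
    {
        "trace_id": "trace-abc123",
        "trace_name": "my-app-trace",
        "environment": "staging",  # hypothetical custom key
    }
)
```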
user
string

A unique identifier representing your end-user, which helps distinguish between different users of your app. This allows your app to identify specific users in case of abuse reports, preventing your entire app from being affected by the actions of individual users. Maximum of 256 characters.

Maximum string length: 256

Response

Successful response

Non-streaming response from the Anthropic Messages API with OpenRouter extensions

container
object | null
required
Example:
{
"expires_at": "2026-04-08T00:00:00Z",
"id": "ctr_01abc"
}
content
object[]
required
Example:
{
"citations": null,
"text": "Hello, world!",
"type": "text"
}
id
string
required
model
string
required
role
enum<string>
required
Available options:
assistant
stop_details
object | null
required

Structured information about a refusal

Example:
{
"category": "cyber",
"explanation": "The request was refused due to policy.",
"type": "refusal"
}
stop_reason
enum<string> | null
required
Available options:
end_turn,
max_tokens,
stop_sequence,
tool_use,
pause_turn,
refusal,
compaction,
null
Example:

"end_turn"

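A client will typically read the text blocks from content and branch on stop_reason. The sketch below assumes a response shaped like the examples on this page; `extract_text` and `describe_stop` are hypothetical helpers, and the wording of the returned strings is arbitrary.

```python
def extract_text(message: dict) -> str:
    """Concatenate the text of all content blocks whose type is 'text'."""
    return "".join(
        block["text"] for block in message["content"] if block.get("type") == "text"
    )


def describe_stop(message: dict) -> str:
    """Summarize why generation stopped, reading stop_details on refusal."""
    reason = message.get("stop_reason")
    if reason is None:
        return "no stop reason reported"
    if reason == "refusal":
        details = message.get("stop_details") or {}
        return f"refused: {details.get('explanation', 'no explanation given')}"
    if reason == "max_tokens":
        return "truncated at max_tokens"
    return f"stopped: {reason}"


message = {
    "content": [{"text": "Hello, world!", "type": "text", "citations": None}],
    "role": "assistant",
    "stop_reason": "end_turn",
    "stop_details": None,
    "type": "message",
}
```
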
stop_sequence
string | null
required
type
enum<string>
required
Available options:
message
usage
object
required
Example:
{
"cache_creation": null,
"cache_creation_input_tokens": null,
"cache_read_input_tokens": null,
"inference_geo": null,
"input_tokens": 100,
"output_tokens": 50,
"output_tokens_details": null,
"server_tool_use": null,
"service_tier": "standard"
}
context_management
object | null
openrouter_metadata
object
Example:
{
"attempt": 1,
"endpoints": {
"available": [
{
"model": "openai/gpt-4o",
"provider": "OpenAI",
"selected": true
}
],
"total": 1
},
"is_byok": false,
"region": "iad",
"requested": "openai/gpt-4o",
"strategy": "direct",
"summary": "available=1, selected=OpenAI"
}
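
When the metadata header is enabled, the selected endpoint can be read from this object. The sketch below assumes the structure shown in the example above; `selected_provider` is a hypothetical helper that returns None when the object is absent or nothing is marked as selected.

```python
from typing import Optional


def selected_provider(response: dict) -> Optional[str]:
    """Return the provider of the endpoint flagged as selected, if any."""
    metadata = response.get("openrouter_metadata")
    if not metadata:
        return None
    for endpoint in metadata.get("endpoints", {}).get("available", []):
        if endpoint.get("selected"):
            return endpoint.get("provider")
    return None


response = {
    "openrouter_metadata": {
        "attempt": 1,
        "endpoints": {
            "available": [
                {"model": "openai/gpt-4o", "provider": "OpenAI", "selected": True}
            ],
            "total": 1,
        },
        "strategy": "direct",
    }
}
```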
provider
enum<string>
Available options:
AkashML,
AI21,
AionLabs,
Alibaba,
Ambient,
Baidu,
Amazon Bedrock,
Amazon Nova,
Anthropic,
Arcee AI,
AtlasCloud,
Avian,
Azure,
BaseTen,
BytePlus,
Black Forest Labs,
Cerebras,
Chutes,
Cirrascale,
Clarifai,
Cloudflare,
Cohere,
Crucible,
Crusoe,
Darkbloom,
Decart,
DeepInfra,
DeepSeek,
DekaLLM,
DigitalOcean,
Featherless,
Fireworks,
Friendli,
GMICloud,
Google,
Google AI Studio,
Groq,
HeyGen,
Inception,
Inceptron,
InferenceNet,
Ionstream,
Infermatic,
Io Net,
Inferact vLLM,
Inflection,
Liquid,
Mara,
Mancer 2,
Minimax,
ModelRun,
Mistral,
Modular,
Moonshot AI,
Morph,
NCompass,
Nebius,
Nex AGI,
NextBit,
Novita,
Nvidia,
OpenAI,
OpenInference,
Parasail,
Poolside,
Perceptron,
Perplexity,
Phala,
Recraft,
Reka,
Relace,
Sakana AI,
SambaNova,
Seed,
SiliconFlow,
Sourceful,
StepFun,
Stealth,
StreamLake,
Switchpoint,
Tenstorrent,
Together,
Upstage,
Venice,
Wafer,
WandB,
Quiver,
Xiaomi,
xAI,
Z.AI,
FakeProvider
Example:

"OpenAI"