Docs
The future will bring us hundreds of language models and dozens of providers for each. How will you choose the best?
- Standardized API. No need to change your code when switching between models or providers.
- Transparent metrics. Compare models by how often they're used, and soon, for which purposes. Track throughput and latency, and use the lowest-cost providers.
- Flexible auth. Use traditional API keys, let users pay for their own models via OAuth PKCE, or connect via the Window AI extension.
Quick Start
fetch("https://openrouter.ai/api/v1/chat/completions", {
method: "POST",
headers: {
"Authorization": `Bearer ${OPENROUTER_API_KEY}`,
"HTTP-Referer": `${YOUR_SITE_URL}`, // Optional, for including your app on openrouter.ai rankings.
"X-Title": `${YOUR_SITE_NAME}`, // Optional. Shows in rankings on openrouter.ai.
"Content-Type": "application/json"
},
body: JSON.stringify({
"model": "openai/gpt-3.5-turbo", // Optional (user controls the default),
"messages": [
{"role": "user", "content": "What is the meaning of life?"}
]
})
});You can also use OpenRouter with OpenAI's client API:
import OpenAI from "openai"

const openai = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: $OPENROUTER_API_KEY,
  defaultHeaders: {
    "HTTP-Referer": $YOUR_SITE_URL, // Optional, for including your app on openrouter.ai rankings.
    "X-Title": $YOUR_SITE_NAME, // Optional. Shows in rankings on openrouter.ai.
  },
  // dangerouslyAllowBrowser: true, // Uncomment if calling from a browser environment
})

async function main() {
  const completion = await openai.chat.completions.create({
    model: "openai/gpt-3.5-turbo",
    messages: [
      { role: "user", content: "Say this is a test" }
    ],
  })

  console.log(completion.choices[0].message)
}

main()

Supported Models
Model usage can be paid by users, developers, or both, and models may shift in availability. You can also fetch models, prices, and limits via the API.

If you'd like to add an open-source model directly to OpenRouter, visit our GitHub.
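For example, a minimal sketch of fetching the live model list with the `/api/v1/models` endpoint (described further under Token Limits below; this assumes the list is returned under a `data` key and that no authentication headers are required for the public listing):

const res = await fetch("https://openrouter.ai/api/v1/models");
const { data } = await res.json(); // assumption: the model list is under "data"
for (const model of data) {
  console.log(model.id, model.pricing, model.context_length);
}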
Text models
| Model Name & ID | Prompt cost ($ per 1k tokens) | Completion cost ($ per 1k tokens) | Context (tokens) | Moderation Whether content filtering is applied by OpenRouter, per the model provider's Terms of Service. Developers should adhere to the terms of the model regardless. |
|---|---|---|---|---|
Auto (best for prompt)openrouter/auto | Depending on their size, subject, and complexity, your prompts will be sent to [MythoMax 13B](/models/gryphe/mythomax-l2-13b), [MythoMax 13B 8k](/models/gryphe/mythomax-l2-13b-8k) or [GPT-4 Turbo](/models/openai/gpt-4-1106-preview). To see which model was used, visit [Activity](/activity). Pricing depends on the final model chosen. | Depending on their size, subject, and complexity, your prompts will be sent to [MythoMax 13B](/models/gryphe/mythomax-l2-13b), [MythoMax 13B 8k](/models/gryphe/mythomax-l2-13b-8k) or [GPT-4 Turbo](/models/openai/gpt-4-1106-preview). To see which model was used, visit [Activity](/activity). Pricing depends on the final model chosen. | 128,000 | N/A |
Mistral 7B Instruct (beta)mistralai/mistral-7b-instruct | $0 100% off | $0 100% off | 8,192 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Hugging Face: Zephyr 7B (beta)huggingfaceh4/zephyr-7b-beta | $0 100% off | $0 100% off | 4,096 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Toppy M 7B (beta)undi95/toppy-m-7b | $0 100% off | $0 100% off | 32,768 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Nous: Hermes 13B (beta)nousresearch/nous-hermes-llama2-13b | $0.00015 50% off | $0.00015 50% off | 4,096 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Meta: CodeLlama 34B Instruct (beta)meta-llama/codellama-34b-instruct | $0.0004 50% off | $0.0004 50% off | 8,192 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Phind: CodeLlama 34B v2 (beta)phind/phind-codellama-34b | $0.0004 50% off | $0.0004 50% off | 4,096 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Llava 13B (beta)haotian-liu/llava-13b | $0.005 50% off | $0.005 50% off | 2,048 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Meta: Llama v2 13B Chat (beta)meta-llama/llama-2-13b-chat | $0.0002345 33% off | $0.0002345 33% off | 4,096 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Goliath 120B (beta)alpindale/goliath-120b | $0.00703125 25% off | $0.00703125 25% off | 6,144 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
lzlv 70B (beta)lizpreciatior/lzlv-70b-fp16-hf | $0.00056 20% off | $0.00076 20% off | 4,096 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
OpenAI: GPT-3.5 Turboopenai/gpt-3.5-turbo | $0.001 | $0.002 | 4,095 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Filtered |
OpenAI: GPT-3.5 Turbo 16k (preview)openai/gpt-3.5-turbo-1106 | $0.001 | $0.002 | 16,385 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Filtered |
OpenAI: GPT-3.5 Turbo 16kopenai/gpt-3.5-turbo-16k | $0.003 | $0.004 | 16,385 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Filtered |
OpenAI: GPT-4 Turbo (preview)openai/gpt-4-1106-preview | $0.01 | $0.03 | 128,000 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Filtered |
OpenAI: GPT-4openai/gpt-4 | $0.03 | $0.06 | 8,191 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Filtered |
OpenAI: GPT-4 32kopenai/gpt-4-32k | $0.06 | $0.12 | 32,767 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Filtered |
OpenAI: GPT-4 Vision (preview)openai/gpt-4-vision-preview | $0.01 | $0.03 | 128,000 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Filtered |
OpenAI: GPT-3.5 Turbo Instructopenai/gpt-3.5-turbo-instruct | $0.0015 | $0.002 | 4,095 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Filtered |
Google: PaLM 2 Chatgoogle/palm-2-chat-bison | $0.0005 | $0.0005 | 9,216 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Google: PaLM 2 Code Chatgoogle/palm-2-codechat-bison | $0.0005 | $0.0005 | 7,168 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Google: PaLM 2 Chat 32kgoogle/palm-2-chat-bison-32k | $0.0005 | $0.0005 | 32,000 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Google: PaLM 2 Code Chat 32kgoogle/palm-2-codechat-bison-32k | $0.0005 | $0.0005 | 32,000 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Meta: Llama v2 70B Chat (beta)meta-llama/llama-2-70b-chat | $0.0007 | $0.00095 | 4,096 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Nous: Hermes 70B (beta)nousresearch/nous-hermes-llama2-70b | $0.0009 | $0.0009 | 4,096 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Nous: Capybara 34B (beta)nousresearch/nous-capybara-34b | $0.02 | $0.02 | 32,000 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Airoboros 70B (beta)jondurbin/airoboros-l2-70b | $0.0007 | $0.00095 | 4,096 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Synthia 70B (beta)migtissera/synthia-70b | $0.009375 | $0.009375 | 8,192 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Mistral OpenOrca 7B (beta)open-orca/mistral-7b-openorca | $0.0002 | $0.0002 | 8,192 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
OpenHermes Mistral 7B (beta)teknium/openhermes-2-mistral-7b | $0.0002 | $0.0002 | 4,096 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
OpenHermes Mistral 7B 2.5 (beta)teknium/openhermes-2.5-mistral-7b | $0.0002 | $0.0002 | 4,096 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Pygmalion: Mythalion 13B (beta)pygmalionai/mythalion-13b | $0.001125 | $0.001125 | 8,192 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
ReMM SLERP 13B (beta)undi95/remm-slerp-l2-13b | $0.001125 | $0.001125 | 6,144 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Xwin 70B (beta)xwin-lm/xwin-lm-70b | $0.009375 | $0.009375 | 8,192 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
MythoMax 13B 8k (beta)gryphe/mythomax-l2-13b-8k | $0.001125 | $0.001125 | 8,192 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Anthropic: Claude v2.1anthropic/claude-2 | $0.008 | $0.024 | 200,000 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Filtered |
Anthropic: Claude v2.0anthropic/claude-2.0 | $0.008 | $0.024 | 100,000 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Filtered |
Anthropic: Claude Instant v1anthropic/claude-instant-v1 | $0.00163 | $0.00551 | 100,000 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Filtered |
Mancer: Weaver (alpha)mancer/weaver | $0.0045 | $0.0045 | 8,000 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
MythoMax 13Bgryphe/mythomax-l2-13b | $0.0006 | $0.0006 | 4,096 OpenRouter defaults to allowing prompts of "unlimited" length for this model, using a middle-out transform, which you can disable (and does not affect prompts of size less than the context length). | Unfiltered |
Media models
More coming soon. Learn about making 3D object requests in our Discord.
Note: Different models tokenize text in different ways. Some models break up text into chunks of multiple characters (GPT, Claude, Llama, etc.), while others tokenize by character (PaLM). This means that the token count for the same text may vary from model to model.
Fallback Models
OpenRouter allows you to automatically try other models if the primary model is down, rate-limited, or refuses to reply due to content moderation required by the provider:
{
  models: ["anthropic/claude-2.1", "gryphe/mythomax-l2-13b"],
  route: "fallback",
  ... // Other params
}

If the model you selected returns an error, OpenRouter will try to use the fallback model instead. If the fallback model is down or returns an error, OpenRouter will return that error.
By default, any error can trigger the use of a fallback model, including context length validation errors, moderation flags for filtered models, rate-limiting, and downtime.
Requests are priced using the model that was actually used, which is returned in the `model` attribute of the response body.
If no fallback model is specified but `route: "fallback"` is still included, OpenRouter will try the most appropriate open-source model available, priced lower than (or very close to) the primary model.
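Putting it together, a complete fallback request might look like the following sketch (the message content is a placeholder):

fetch("https://openrouter.ai/api/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${OPENROUTER_API_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    "models": ["anthropic/claude-2.1", "gryphe/mythomax-l2-13b"],
    "route": "fallback",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  })
});
// Afterwards, check the "model" attribute of the response body
// to see which model actually served the request.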
OAuth PKCE
Users can connect to OpenRouter in one click using Proof Key for Code Exchange (PKCE). Here's an example, and here's a step-by-step:

1. Send your user to https://openrouter.ai/auth?callback_url=YOUR_SITE_URL. You can optionally include a `code_challenge` (a random string of up to 256 characters) for extra security. For maximum security, we recommend also setting `code_challenge_method` to `S256`, and then setting `code_challenge` to the base64 encoding of the SHA-256 hash of the `code_verifier`, which you will submit in Step 2. More info in Auth0's docs.

2. Once logged in, the user will be redirected back to your site with a `code` query parameter in the URL (e.g. `?code=...`). Make an API call (from the frontend or backend) to exchange the code for a user-controlled API key:

   fetch("https://openrouter.ai/api/v1/auth/keys", {
     method: "POST",
     body: JSON.stringify({
       code: $CODE_FROM_QUERY_PARAM,
       code_verifier: $CODE_VERIFIER // Only needed if you sent a code_challenge in Step 1
     })
   });

   And that's it for PKCE!

3. A fresh API key will be in the result under "key". Store it securely and make OpenAI-style requests (streaming is supported as well):
fetch("https://openrouter.ai/api/v1/chat/completions", {
method: "POST",
headers: {
"Authorization": `Bearer ${OPENROUTER_API_KEY}`,
"HTTP-Referer": `${YOUR_SITE_URL}`, // Optional, for including your app on openrouter.ai rankings.
"X-Title": `${YOUR_SITE_NAME}`, // Optional. Shows in rankings on openrouter.ai.
"Content-Type": "application/json"
},
body: JSON.stringify({
"model": "anthropic/claude-2", // Optional (user controls the default),
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"}
]
})
});You can use JavaScript or any server-side framework, like Streamlit . The linked example shows multiple models and file Q&A.
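As a sketch of Step 1's optional challenge, here is one way to derive an S256 `code_challenge` from a `code_verifier` in the browser. This assumes the Web Crypto API is available, and that the URL-safe base64 variant (per RFC 7636) is what the server expects:

// Generate a random code_verifier and its S256 code_challenge.
async function createPkcePair() {
  const codeVerifier = crypto.randomUUID() + crypto.randomUUID(); // any sufficiently random string
  const digest = await crypto.subtle.digest(
    "SHA-256",
    new TextEncoder().encode(codeVerifier)
  );
  const codeChallenge = btoa(String.fromCharCode(...new Uint8Array(digest)))
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, ""); // base64url encoding, per RFC 7636
  return { codeVerifier, codeChallenge };
}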
API Keys
Users or developers can cover model costs with normal API keys. This allows you to use curl or the OpenAI SDK directly with OpenRouter. Just create an API key, set the `api_base`, and set a referrer header to make your app discoverable to others on OpenRouter.
Note: API keys on OpenRouter are more powerful than keys used directly for model APIs. They allow users to set credit limits for apps, and they can be used in OAuth flows.
Example code:

import openai

openai.api_base = "https://openrouter.ai/api/v1"
openai.api_key = $OPENROUTER_API_KEY

response = openai.ChatCompletion.create(
    model="openai/gpt-3.5-turbo",  # Optional (user controls the default)
    messages=[...],
    headers={
        "HTTP-Referer": $YOUR_SITE_URL,  # Optional, for including your app on openrouter.ai rankings.
        "X-Title": $YOUR_APP_NAME,  # Optional. Shows in rankings on openrouter.ai.
    },
)

reply = response.choices[0].message

To extend the Python code for streaming, see this example from OpenAI.
Requests & Responses
More docs coming. In the meantime, see the OpenAI Chat API, which is compatible with OpenRouter. One note on headers:
Request Headers
OpenRouter allows you to specify an optional HTTP-Referer header to identify your app and make it discoverable to users on openrouter.ai. You can also include an optional X-Title header to set or modify the title of your app. Example:
fetch("https://openrouter.ai/api/v1/chat/completions", {
method: "POST",
headers: {
"Authorization": `Bearer ${OPENROUTER_API_KEY}`,
"HTTP-Referer": `${YOUR_SITE_URL}`, // Optional, for including your app on openrouter.ai rankings.
"X-Title": `${YOUR_SITE_NAME}`, // Optional. Shows in rankings on openrouter.ai.
"Content-Type": "application/json"
},
body: JSON.stringify({
"messages": [
{"role": "user", "content": "Who are you?"}
]
})
});Request Body
More docs coming. In the meantime, see the OpenAI Chat API, which OpenRouter extends.
Model routing: If the `model` parameter is omitted, the user or payer's default is used. Otherwise, remember to select a value for `model` from the supported models or API, and include the organization prefix. OpenRouter will select the least expensive and best GPUs available to serve the request, and fall back to other providers or GPUs if it receives a 5xx response code or if you are rate-limited.

Streaming: Server-Sent Events (SSE) are supported as well, to enable streaming for all models. Simply send `stream: true` in your request body. The SSE stream will occasionally contain a "comment" payload, which you should ignore (noted below).

Non-standard parameters: If the chosen model doesn't support a request parameter (such as `logit_bias` in non-OpenAI models, or `top_k` for OpenAI), then the parameter is ignored. The rest are forwarded to the underlying model API.
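As a sketch, here is one way to consume a streamed response with `fetch`, skipping SSE comment lines. The `data: [DONE]` sentinel follows the OpenAI convention and is an assumption here:

const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${OPENROUTER_API_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "openai/gpt-3.5-turbo",
    stream: true,
    messages: [{ role: "user", content: "Hello!" }]
  })
});

const reader = res.body.getReader();
const decoder = new TextDecoder();
let buffer = "";
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split("\n");
  buffer = lines.pop(); // keep any partial line for the next chunk
  for (const line of lines) {
    if (line.startsWith(":")) continue; // SSE comment; safe to ignore
    if (!line.startsWith("data: ") || line === "data: [DONE]") continue;
    const chunk = JSON.parse(line.slice("data: ".length));
    console.log(chunk.choices[0].delta?.content ?? "");
  }
}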
Response Body
Responses are largely consistent with OpenAI. This means that `choices` is always an array, even if the model only returns one completion. Each choice will contain a `delta` property if a stream was requested, and a `message` property otherwise. This makes it easier to use the same code for all models. Note that `finish_reason` will vary depending on the model provider.

The `model` property tells you which model was used inside the underlying API. Example:
{
  "id": "gen-xxxxxxxxxxxxxx",
  "choices": [
    {
      "finish_reason": "stop", // Different models provide different reasons here
      "message": { // will be "delta" if streaming
        "role": "assistant",
        "content": "Hello there!"
      }
    }
  ],
  "model": "openai/" // Could also be "claude-1.3-100k", "chat-bison@001", etc., depending on the model that ends up being used
}

Querying Cost and Stats
You can use the returned id to query for the generation stats (including token counts and cost) after the request is complete:
const generation = await fetch("https://openrouter.ai/api/v1/generation?id=$GENERATION_ID", { headers })

await generation.json()
// OUTPUT:
{
  "id": "gen-nNPYi0ZB6GOK5TNCUMHJGgXo",
  "model": "openai/gpt-4-32k",
  "streamed": false,
  "generation_time": 2,
  "created_at": "2023-09-02T20:29:18.574972+00:00",
  "tokens_prompt": 24,
  "tokens_completion": 29,
  "native_tokens_prompt": null,
  "native_tokens_completion": null,
  "num_media_prompt": null,
  "num_media_completion": null,
  "origin": "https://localhost:47323/",
  "usage": 0.00492
}

For SSE streams, we occasionally need to send an SSE comment to indicate that OpenRouter is processing your request. This prevents the connection from timing out. The comment will look like this:
: OPENROUTER PROCESSING

Comment payloads can be safely ignored per the SSE spec. However, you can leverage them to improve UX as needed, for example by showing a dynamic loading indicator.

Some SSE client implementations might not parse the payload according to spec, which leads to an uncaught error when you JSON.parse the non-JSON payloads. We recommend using a client that skips comment lines before parsing, as in the streaming sketch above.
Prompt Transforms
OpenRouter has a simple rule for choosing between sending a prompt and sending a list of ChatML messages:
- Choose `messages` if you want OpenRouter to apply a recommended instruct template to your prompt, depending on which model serves your request.
- Choose `prompt` if you want to send a custom prompt to the model. This is useful if you want to use a custom instruct template or maintain full control over the prompt submitted to the model.
To help with prompts that exceed the maximum context size of a model, OpenRouter supports a custom parameter called `transforms`:
{
  transforms: ["middle-out"], // Compress prompts > context size. This is the default for all models.
  messages: [...], // "prompt" works as well
  model // Works with any model
}

The `transforms` param is an array of strings that tell OpenRouter to apply a series of transformations to the prompt before sending it to the model. Transformations are applied in order. Available transforms are:

- `middle-out`: compress prompts and message chains to the context size. This helps users extend conversations, in part because LLMs pay significantly less attention to the middle of sequences anyway. Works by compressing or removing messages in the middle of the prompt.

Note: All OpenRouter models default to using `middle-out`, unless you exclude this transform by e.g. setting `transforms: []` in the request body.
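For example, a sketch of opting out for a single request (presumably an over-length prompt would then fail validation instead of being compressed):

fetch("https://openrouter.ai/api/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${OPENROUTER_API_KEY}`,
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    model: "openai/gpt-4",
    transforms: [], // disable the default middle-out transform
    messages: [{ role: "user", content: "..." }]
  })
});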
Error Handling
For errors, OpenRouter returns a JSON response with the following shape:
type ErrorResponse = {
  error: {
    code: number
    message: string
  }
}

The HTTP response will have the same status code as `error.code`, forming a request error if:
- Your original request is invalid
- Your API key/account is out of credits
- You did not set `stream: true` and the LLM returned an error within 15 seconds.
Otherwise, the returned HTTP response status will be 200, and any error that occurred while the LLM was producing the output will be emitted in the response body or as an SSE data event.
Example code for printing errors in JavaScript:
const request = await fetch("https://openrouter.ai/...")
console.log(request.status) // Will be an error code unless the model started processing your request
const response = await request.json()
console.error(response.error?.code) // Will be an error code
console.error(response.error?.message)

Error Codes
- 400: Bad Request (invalid or missing params, CORS)
- 401: Invalid credentials (OAuth session expired, disabled/invalid API key)
- 402: Out of credits
- 403: Your chosen model requires moderation and your input was flagged
- 408: Your request timed out
- 429: You are being rate limited
- 502: Your chosen model is down or we received an invalid response from it
User Limits
Rate Limits and Credits Remaining
To check the rate limit or credits left on an API key, make a GET request to https://openrouter.ai/api/v1/auth/key.
fetch("https://openrouter.ai/api/v1/auth/key", {
method: 'GET',
headers: {
'Authorization': 'Bearer $OPENROUTER_API_KEY'
},
});If you submit a valid API key, you should get a response of the form:
type Key = {
  data: {
    label: string,
    usage: number, // Number of credits used
    limit: number | null, // Credit limit for the key, or null if unlimited
    rate_limit: {
      requests: number, // Number of requests allowed...
      interval: string // in this interval, e.g. "10s"
    }
  }
}

Rate limits are a function of the number of credits remaining on the key or account. In short, your per-second request limit is roughly equal to your credit balance, with a minimum of one request per second. To be exact:
requests_per_10_seconds = 10 * (1 + Math.floor(Math.max(credits, 0)))
Example 1: if you have 9.9 credits remaining, you can make 100 requests every 10 seconds.
Example 2: if you have -0.1 credits remaining, you can make 10 requests every 10 seconds (but you may see 402 errors).
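The same formula as a runnable sketch:

// Requests allowed per 10-second window, given remaining credits.
function requestsPer10Seconds(credits) {
  return 10 * (1 + Math.floor(Math.max(credits, 0)));
}

requestsPer10Seconds(9.9);  // 100
requestsPer10Seconds(-0.1); // 10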
Note that every user/ip-address pair is also subject to a rate limit of 60 requests per second, to defend against denial-of-service attacks.
Token Limits
Some users may have too few credits on their account to make expensive requests. OpenRouter provides a way to know that before making a request to any model.
To get the maximum tokens that a user can generate and the maximum tokens allowed in their prompt, add authentication headers in your request to https://openrouter.ai/api/v1/models:
fetch("https://openrouter.ai/api/v1/models", {
method: 'GET',
headers: {
'Authorization': 'Bearer $OPENROUTER_API_KEY'
},
});Each model will include an per_request_limits property:
type Model = {
  id: string,
  pricing: {
    prompt: number,
    completion: number
  },
  context_length: number,
  per_request_limits: {
    prompt_tokens: number,
    completion_tokens: number
  }
}
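For example, a sketch that checks whether a planned prompt fits within a user's per-request limits before sending. It assumes the model list is returned under a `data` key; `estimatedPromptTokens` is a hypothetical count you compute yourself:

const res = await fetch("https://openrouter.ai/api/v1/models", {
  headers: { "Authorization": `Bearer ${OPENROUTER_API_KEY}` }
});
const { data } = await res.json(); // assumption: the model list is under "data"
const model = data.find((m) => m.id === "openai/gpt-4-32k");
if (model.per_request_limits.prompt_tokens < estimatedPromptTokens) {
  // Not enough credits for this prompt; pick a cheaper model or ask the user to top up.
}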
Other Frameworks

You can find a few examples of using OpenRouter with other frameworks in this GitHub repository. Here are some examples:

- Using `npm i openai`: github. Tip: You can also use Grit to automatically migrate your code; simply run `npx @getgrit/launcher openrouter`.
- Using Streamlit, a way to build and share Python apps: github
- Using LangChain for Python, a composable LLM framework: github
- Using LangChain.js (first example below): github
- Using the Vercel AI SDK (second example below):
// LangChain.js example:
const chat = new ChatOpenAI({
  modelName: "anthropic/claude-instant-v1",
  temperature: 0.8,
  streaming: true,
  openAIApiKey: $OPENROUTER_API_KEY,
}, {
  basePath: $OPENROUTER_BASE_URL + "/api/v1",
  baseOptions: {
    headers: {
      "HTTP-Referer": "https://yourapp.com/", // Optional, for including your app on openrouter.ai rankings.
      "X-Title": "Langchain.js Testing", // Optional. Shows in rankings on openrouter.ai.
    },
  },
});

// Vercel AI SDK example (Configuration/OpenAIApi as in the openai-edge or openai v3 client):
const config = new Configuration({
  basePath: $OPENROUTER_BASE_URL + "/api/v1",
  apiKey: $OPENROUTER_API_KEY,
  baseOptions: {
    headers: {
      "HTTP-Referer": "https://yourapp.com/", // Optional, for including your app on openrouter.ai rankings.
      "X-Title": "Vercel Testing", // Optional. Shows in rankings on openrouter.ai.
    }
  }
})
const openrouter = new OpenAIApi(config)

3D Objects (beta)
OpenRouter supports text-to-3D object generation, currently in beta. See supported media models and try a demo. To generate 3D objects, send a POST request to https://openrouter.ai/api/v1/objects/generations:
curl https://openrouter.ai/api/v1/objects/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "HTTP-Referer: $YOUR_SITE_URL" \
  -H "X-Title: $YOUR_SITE_NAME" \
  -d '{
    "prompt": "a chair shaped like an avocado",
    "num_inference_steps": 32,
    "num_outputs": 1,
    "extension": "ply",
    "model": "openai/shap-e"
  }'

// Each generation will contain either a base64 string or a hosted URL, or both.
interface MediaOutput {
  uri?: string; // base64 string
  url?: string; // hosted URL
}

interface MediaResponse {
  generations: MediaOutput[];
}
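A sketch of consuming the response, preferring the hosted URL when both fields are present (assuming the response body follows the interfaces above):

const res = await fetch("https://openrouter.ai/api/v1/objects/generations", {
  /* method, headers, and body as in the curl example above */
});
const { generations } = await res.json();
for (const g of generations) {
  // Prefer the hosted URL; otherwise fall back to the raw base64 string.
  const source = g.url ?? g.uri;
  console.log(source);
}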