Provider Routing
Route requests to the best provider
OpenRouter routes requests to the best available providers for your model. By default, requests are load balanced across the top providers to maximize uptime.
You can customize how your requests are routed using the provider object in the request body for Chat Completions and Completions. The provider object can contain the fields described in the sections below.
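As a rough sketch (assuming the standard Chat Completions endpoint at https://openrouter.ai/api/v1/chat/completions; the model, message, and preference values below are placeholders), provider preferences are passed alongside the usual request fields:

```typescript
// Sketch of a Chat Completions request with provider routing preferences.
// The model and preference values are placeholders, not recommendations.
const response = await fetch('https://openrouter.ai/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'mistralai/mixtral-8x7b-instruct',
    messages: [{ role: 'user', content: 'Hello' }],
    provider: {
      // Routing preferences described in the sections below, e.g.:
      sort: 'price',
    },
  }),
});
console.log(await response.json());
```

The later examples show only the request body; they are sent the same way.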
Load Balancing (Default Strategy)
For each model in your request, OpenRouter’s default behavior is to load balance requests across providers.
When you send a request with tools or tool_choice, OpenRouter will only route to providers that natively support tool use. This is currently a beta feature.
Here is OpenRouter’s default load balancing strategy:
- Prioritize providers that have not seen significant outages in the last 30 seconds.
- For the stable providers, look at the lowest-cost candidates and select one weighted by inverse square of the price (example below).
- Use the remaining providers as fallbacks.
A Load Balancing Example
Suppose Provider A costs $1 per million tokens, Provider B costs $2, Provider C costs $3, and Provider B recently saw a few outages.
- Your request is routed to Provider A first. Provider A is 9x more likely than Provider C to be tried first, because selection is weighted by the inverse square of the price: 1/1² = 1 for Provider A versus 1/3² = 1/9 for Provider C.
- If Provider A fails, then Provider C will be tried next.
- If Provider C also fails, Provider B will be tried last.
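To make the weighting concrete, here is a small illustrative calculation of the inverse-square weights for the scenario above (this is just the arithmetic, not OpenRouter's implementation):

```typescript
// Inverse-square price weighting for the stable providers (B is excluded
// because of its recent outages). Illustrative arithmetic only.
const stableProviders = [
  { name: 'A', pricePerMTok: 1 },
  { name: 'C', pricePerMTok: 3 },
];

const weighted = stableProviders.map((p) => ({
  name: p.name,
  weight: 1 / p.pricePerMTok ** 2, // A: 1, C: 1/9
}));

const total = weighted.reduce((sum, p) => sum + p.weight, 0);
for (const p of weighted) {
  console.log(`${p.name}: ${((100 * p.weight) / total).toFixed(1)}% chance of being tried first`);
}
// A: 90.0%, C: 10.0% -> A is 9x more likely to be tried first than C.
```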
If you have sort or order set in your provider preferences, load balancing will be disabled.
Provider Sorting
As described above, OpenRouter tries to strike a balance between price and uptime by default.
If you instead want to prioritize throughput, you can include the sort field in the provider preferences, set to "throughput". Load balancing will be disabled, and the router will prioritize providers that have the highest median throughput over the last day.
To always prioritize low prices, and not apply any load balancing, set sort to "price".
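For example (a sketch; the model slug is a placeholder), a request body using the sort preference might look like this:

```typescript
// Sketch of a request body using the sort preference (placeholder model).
// Send it as JSON to the Chat Completions endpoint as in the earlier example.
const body = {
  model: 'meta-llama/llama-3.1-70b-instruct',
  messages: [{ role: 'user', content: 'Hello' }],
  provider: {
    sort: 'throughput', // or 'price' to always prioritize low cost
  },
};
```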
Ordering Specific Providers
You can set the providers that OpenRouter will prioritize for your request using the order field.
The router will prioritize providers in this list, and in this order, for the model you’re using. If you don’t set this field, the router will load balance across the top providers to maximize uptime.
OpenRouter will try them one at a time and proceed to other providers if none are operational. If you don’t want to allow any other providers, you should disable fallbacks as well.
Example: Specifying providers with fallbacks
This example skips over OpenAI (which doesn’t host Mixtral), tries Together, and then falls back to the normal list of providers on OpenRouter:
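A request body along these lines expresses that (sketch; the provider slugs are assumptions):

```typescript
// Sketch: try the listed providers in order, then fall back to the normal list.
// OpenAI is listed first but gets skipped because it does not host Mixtral.
const body = {
  model: 'mistralai/mixtral-8x7b-instruct',
  messages: [{ role: 'user', content: 'Hello' }],
  provider: {
    order: ['openai', 'together'],
    // allow_fallbacks defaults to true, so other providers remain available.
  },
};
```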
Example: Specifying providers with fallbacks disabled
Here’s an example with allow_fallbacks set to false that skips over OpenAI (which doesn’t host Mixtral), tries Together, and then fails if Together fails:
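A sketch of the corresponding request body (provider slugs are assumptions):

```typescript
// Sketch: same ordering, but with fallbacks disabled the request fails if
// none of the listed providers can serve it.
const body = {
  model: 'mistralai/mixtral-8x7b-instruct',
  messages: [{ role: 'user', content: 'Hello' }],
  provider: {
    order: ['openai', 'together'],
    allow_fallbacks: false,
  },
};
```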
Requiring Providers to Support All Parameters (beta)
You can restrict requests only to providers that support all parameters in your request using the require_parameters field.
With the default routing strategy, providers that don’t support all the LLM parameters specified in your request can still receive the request, but will ignore unknown parameters. When you set require_parameters to true, the request won’t even be routed to that provider.
Example: Excluding providers that don’t support JSON formatting
For example, to only use providers that support JSON formatting:
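One way to express this is sketched below (the model slug is a placeholder, and response_format is the OpenAI-style JSON mode parameter):

```typescript
// Sketch: only route to providers that support every parameter in this
// request, including response_format for JSON output.
const body = {
  model: 'openai/gpt-4o',
  messages: [{ role: 'user', content: 'Reply with a JSON object.' }],
  provider: {
    require_parameters: true,
  },
  response_format: { type: 'json_object' },
};
```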
Requiring Providers to Comply with Data Policies
You can restrict requests only to providers that comply with your data policies using the data_collection field.
- allow: (default) allow providers which store user data non-transiently and may train on it
- deny: use only providers which do not collect user data
Some model providers may log prompts, so we display them with a Data Policy tag on model pages. This is not a definitive source of third party data policies, but represents our best knowledge.
Account-Wide Data Policy Filtering
This is also available as an account-wide setting in your privacy settings. You can disable third party model providers that store inputs for training.
Example: Excluding providers that don’t comply with data policies
To exclude providers that don’t comply with your data policies, set data_collection to deny:
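A sketch of such a request body (the model slug is a placeholder):

```typescript
// Sketch: only use providers that do not collect or store user data.
const body = {
  model: 'openai/gpt-4o',
  messages: [{ role: 'user', content: 'Hello' }],
  provider: {
    data_collection: 'deny',
  },
};
```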
Disabling Fallbacks
To guarantee that your request is only served by the top (lowest-cost) provider, you can disable fallbacks.
This can be combined with the order field from Ordering Specific Providers to restrict the providers that OpenRouter will use to just your chosen list.
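For example (sketch; the provider slug is an assumption), combining order with disabled fallbacks pins the request to a single provider:

```typescript
// Sketch: only Together may serve this request; if it cannot, the request fails.
const body = {
  model: 'mistralai/mixtral-8x7b-instruct',
  messages: [{ role: 'user', content: 'Hello' }],
  provider: {
    order: ['together'],
    allow_fallbacks: false,
  },
};
```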
Ignoring Providers
You can ignore providers for a request by setting the ignore field in the provider object.
Ignoring multiple providers may significantly reduce fallback options and limit request recovery.
Account-Wide Ignored Providers
You can ignore providers for all account requests by configuring your preferences. This configuration applies to all API requests and chatroom messages.
Note that when you ignore providers for a specific request, the list of ignored providers is merged with your account-wide ignored providers.
Example: Ignoring Azure for a request calling GPT-4 Omni
Here’s an example that will ignore Azure for a request calling GPT-4 Omni:
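A sketch of the request body (the exact provider slug for Azure is an assumption):

```typescript
// Sketch: exclude Azure from the candidate providers for GPT-4 Omni.
const body = {
  model: 'openai/gpt-4o',
  messages: [{ role: 'user', content: 'Hello' }],
  provider: {
    ignore: ['azure'], // slug assumed; check the model page for exact provider names
  },
};
```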
Quantization
Quantization reduces model size and computational requirements while aiming to preserve performance.
Quantized models may exhibit degraded performance for certain prompts, depending on the method used.
Providers can support various quantization levels for open-weight models.
Quantization Levels
By default, requests are load-balanced across all available providers, ordered by price. To filter providers by quantization level, specify the quantizations field in the provider parameter, listing the quantization levels you want to allow (for example, fp8).
Example: Requesting FP8 Quantization
Here’s an example that will only use providers that support FP8 quantization:
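A sketch of the request body (the model slug is a placeholder for any open-weight model with an FP8-quantized provider):

```typescript
// Sketch: only use providers serving an FP8-quantized version of the model.
const body = {
  model: 'meta-llama/llama-3.1-8b-instruct',
  messages: [{ role: 'user', content: 'Hello' }],
  provider: {
    quantizations: ['fp8'],
  },
};
```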
JSON Schema for Provider Preferences
For a complete list of options, see this JSON schema: