List on OpenRouter

Integrate your inference API with the OpenRouter network.

OpenRouter routes requests across 70+ providers to serve 10M+ developers. We review every application to maintain API reliability and performance standards across the network.

We currently have a large backlog of provider applications and are prioritizing providers with proprietary models.

How the network works

Unified API Surface

Your endpoints are accessible to 10M+ developers through a single OpenAI-compatible API — no additional integration work on their side.

Performance-Based Routing

Requests are routed based on latency, throughput, uptime, and price. Providers that perform well receive proportionally more traffic.

Public Performance Metrics

TTFT, throughput, and uptime are tracked publicly on every model page. These metrics are transparent to developers choosing providers.

Automated Payments

Usage-based billing handled via monthly invoicing. Token counts are reconciled automatically.

Uptime Monitoring

Endpoint reliability is continuously monitored. Providers maintaining 95%+ uptime retain standard routing priority.

Geographic Routing

Declare your datacenter locations in the /models endpoint. Routing respects geographic preferences and data residency requirements.

Technical requirements

All providers must meet these requirements before being considered for the network. Applications that don't meet these criteria will not be reviewed.

View full technical requirements →

OpenAI-Compatible API

Your /chat/completions endpoint must be OpenAI-compatible, return usage tokens for both stream and non-stream requests, and support streaming.

List Models Endpoint

Publish a /models endpoint returning your available models with pricing, context length, max output tokens, supported features, and datacenter locations.

Automated Payment

Support monthly invoicing so OpenRouter can pay for inference without manual intervention.

Privacy & Data Policy

Have a published privacy policy and clear data retention terms. Providers must disclose whether prompts are logged and if data is used for training.

How it works

Submit your application

Provide details about your infrastructure, API endpoints, supported models, and data policies.

Technical review

Our team evaluates your API compatibility, endpoint reliability, pricing, and performance against network standards.

Integration & testing

Accepted providers are onboarded with test traffic to validate latency, throughput, and error handling.

Go live

Your models become available on the network and begin receiving production requests routed by performance and price.

FAQ

Can't find what you need?
Reach out to [email protected]

Your API must be OpenAI-compatible, supporting /chat/completions with streaming. You must return usage (token counts) for both stream and non-stream requests.

You set your own per-token pricing in USD. OpenRouter pays you for inference based on actual token usage through monthly invoicing.

OpenRouter routes requests to providers based on price, latency, throughput, and reliability. High-performing providers receive more traffic automatically. For tool-calling requests, our Auto Exacto system further optimizes routing based on tool-call success rates.

Providers with 95%+ uptime receive normal routing priority. Between 80-94% uptime, traffic is reduced. Below 80%, your endpoints are used only as fallback.

No. You declare which features you support (tools, structured outputs, JSON mode, etc.) in your /models endpoint. OpenRouter only routes feature-specific traffic to providers that support those features.

OpenRouter pays providers through monthly invoicing. Payment details are set up during onboarding.

Yes. Each model in your /models endpoint has its own pricing object. You can also provide tiered pricing for long-context requests with different rates above a token threshold.

OpenRouter publicly tracks TTFT (time to first token) and throughput (tokens/second) for all providers. These metrics are visible on each model page and directly influence how much traffic you receive.

Submit your application

We review applications on a rolling basis. Due to high demand, not all providers will be accepted. Priority is given to providers that fill gaps in our current network.

List on OpenRouter

Integrate your inference API with the OpenRouter network.

OpenRouter routes requests across 70+ providers to serve 10M+ developers. We review every application to maintain API reliability and performance standards across the network.

We currently have a large backlog of provider applications and are prioritizing providers with proprietary models.

How the network works

Unified API Surface

Your endpoints are accessible to 10M+ developers through a single OpenAI-compatible API — no additional integration work on their side.

Performance-Based Routing

Requests are routed based on latency, throughput, uptime, and price. Providers that perform well receive proportionally more traffic.

Public Performance Metrics

TTFT, throughput, and uptime are tracked publicly on every model page. These metrics are transparent to developers choosing providers.

Automated Payments

Usage-based billing handled via monthly invoicing. Token counts are reconciled automatically.

Uptime Monitoring

Endpoint reliability is continuously monitored. Providers maintaining 95%+ uptime retain standard routing priority.

Geographic Routing

Declare your datacenter locations in the /models endpoint. Routing respects geographic preferences and data residency requirements.

Technical requirements

All providers must meet these requirements before being considered for the network. Applications that don't meet these criteria will not be reviewed.

View full technical requirements →

OpenAI-Compatible API

Your /chat/completions endpoint must be OpenAI-compatible, return usage tokens for both stream and non-stream requests, and support streaming.

List Models Endpoint

Publish a /models endpoint returning your available models with pricing, context length, max output tokens, supported features, and datacenter locations.

Automated Payment

Support monthly invoicing so OpenRouter can pay for inference without manual intervention.

Privacy & Data Policy

Have a published privacy policy and clear data retention terms. Providers must disclose whether prompts are logged and if data is used for training.

How it works

Submit your application

Provide details about your infrastructure, API endpoints, supported models, and data policies.

Technical review

Our team evaluates your API compatibility, endpoint reliability, pricing, and performance against network standards.

Integration & testing

Accepted providers are onboarded with test traffic to validate latency, throughput, and error handling.

Go live

Your models become available on the network and begin receiving production requests routed by performance and price.

FAQ

Can't find what you need?
Reach out to [email protected]

Your API must be OpenAI-compatible, supporting /chat/completions with streaming. You must return usage (token counts) for both stream and non-stream requests.

You set your own per-token pricing in USD. OpenRouter pays you for inference based on actual token usage through monthly invoicing.

Providers with 95%+ uptime receive normal routing priority. Between 80-94% uptime, traffic is reduced. Below 80%, your endpoints are used only as fallback.

OpenRouter pays providers through monthly invoicing. Payment details are set up during onboarding.

Yes. Each model in your /models endpoint has its own pricing object. You can also provide tiered pricing for long-context requests with different rates above a token threshold.

Submit your application

We review applications on a rolling basis. Due to high demand, not all providers will be accepted. Priority is given to providers that fill gaps in our current network.