
    Azure

    Browse models provided by Azure (Terms of Service)

    19 models

    [Chart: tokens processed on OpenRouter]

  1. OpenAI: GPT-5

    GPT-5 is OpenAI's most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and accuracy in high-stakes use cases. It supports test-time routing features and advanced prompt understanding, including user-specified intent like "think hard about this." Improvements include reduced hallucination, reduced sycophancy, and better performance in coding, writing, and health-related tasks. (A minimal request sketch follows this entry.)

    by openai · 400K context · $1.25/M input tokens · $10/M output tokens
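
For orientation, the sketch below shows how a model from this list can be called through OpenRouter's OpenAI-compatible chat completions endpoint. The model slug "openai/gpt-5" and the example prompt are assumptions for illustration; check the model page for the exact identifier.

```python
# Minimal sketch: calling GPT-5 through OpenRouter's OpenAI-compatible
# chat completions endpoint. The model slug "openai/gpt-5" is an
# assumption based on this listing; verify it on the model page.
import os
import requests

response = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        "model": "openai/gpt-5",
        "messages": [
            # Per the description, intent such as "think hard about this"
            # can be stated directly in the prompt.
            {"role": "user", "content": "Think hard about this: plan a zero-downtime database migration."},
        ],
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```
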
  2. OpenAI: GPT-5 Mini

    GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It provides the same instruction-following and safety-tuning benefits as GPT-5, but with reduced latency and cost. GPT-5 Mini is the successor to OpenAI's o4-mini model.

    by openai · 400K context · $0.25/M input tokens · $2/M output tokens
  3. OpenAI: GPT-5 Nano

    GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments. While limited in reasoning depth compared to its larger counterparts, it retains key instruction-following and safety features. It is the successor to GPT-4.1-nano and offers a lightweight option for cost-sensitive or real-time applications.

    by openai · 400K context · $0.05/M input tokens · $0.40/M output tokens
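
The per-million-token prices in these listings convert to a per-request cost by simple multiplication; the sketch below uses the GPT-5 Nano rates shown in this entry.

```python
# Minimal sketch: estimating request cost from per-million-token rates,
# using the GPT-5 Nano prices listed above ($0.05/M input, $0.40/M output).
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_rate_per_m: float, output_rate_per_m: float) -> float:
    """Cost in USD for one request at the given per-million-token rates."""
    return (input_tokens * input_rate_per_m + output_tokens * output_rate_per_m) / 1_000_000

# Example: a 2,000-token prompt with a 500-token completion on GPT-5 Nano.
print(estimate_cost_usd(2_000, 500, 0.05, 0.40))  # 0.0003 USD
```
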
  4. OpenAI: GPT-4.1

    GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

    by openai · 1.05M context · $2/M input tokens · $8/M output tokens
  5. OpenAI: GPT-4.1 Mini

    GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

    by openai · 1.05M context · $0.40/M input tokens · $1.60/M output tokens
  6. OpenAI: GPT-4.1 Nano

    For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.

    by openai · 1.05M context · $0.10/M input tokens · $0.40/M output tokens
  7. DeepSeek: R1

    DeepSeek R1 delivers performance on par with OpenAI o1, but it is open-sourced and exposes fully open reasoning tokens. It has 671B parameters, with 37B active in an inference pass. The model and its technical report are fully open source and MIT licensed: distill and commercialize freely.

    by deepseek · 164K context · $1.485/M input tokens · $5.94/M output tokens
  8. Microsoft: Phi-3.5 Mini 128K Instruct

    Phi-3.5 models are lightweight, state-of-the-art open models. They were trained on the Phi-3 datasets, which include both synthetic data and filtered, publicly available website data, with a focus on high-quality, reasoning-dense properties. Phi-3.5 Mini uses 3.8B parameters and is a dense decoder-only transformer model using the same tokenizer as Phi-3 Mini. The models underwent a rigorous enhancement process incorporating supervised fine-tuning, proximal policy optimization, and direct preference optimization to ensure precise instruction adherence and robust safety measures. When assessed against benchmarks that test common sense, language understanding, math, code, long context, and logical reasoning, Phi-3.5 models showed robust, state-of-the-art performance among models with fewer than 13 billion parameters.

    by microsoft · 128K context · $0.10/M input tokens · $0.10/M output tokens
  9. OpenAI: GPT-4o (2024-08-06)

    The 2024-08-06 version of GPT-4o offers improved performance in structured outputs, with the ability to supply a JSON schema in the response_format parameter (a request sketch follows this entry). GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of GPT-4 Turbo while being twice as fast and 50% more cost-effective. GPT-4o also offers improved performance in processing non-English languages and enhanced visual capabilities. For benchmarking against other models, it was briefly called "im-also-a-good-gpt2-chatbot".

    by openai · 128K context · $2.50/M input tokens · $10/M output tokens · $3.613/K input images
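
As referenced in the entry above, this snapshot accepts a JSON schema via response_format. The sketch below assumes the OpenAI-style structured-output shape passed through OpenRouter; the model slug "openai/gpt-4o-2024-08-06" and the schema itself are illustrative.

```python
# Minimal sketch: requesting structured output from GPT-4o (2024-08-06)
# by supplying a JSON schema in response_format, as described above.
# The model slug and the schema are assumptions for illustration.
import os
import requests

payload = {
    "model": "openai/gpt-4o-2024-08-06",
    "messages": [
        {"role": "user", "content": "Extract the city and country from: 'I live in Lyon, France.'"},
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "location",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "country": {"type": "string"},
                },
                "required": ["city", "country"],
                "additionalProperties": False,
            },
        },
    },
}

response = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json=payload,
    timeout=120,
)
response.raise_for_status()
# The message content should be a JSON string matching the schema.
print(response.json()["choices"][0]["message"]["content"])
```
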
  10. Mistral: Mistral Nemo

    A 12B-parameter model with a 128k-token context length, built by Mistral in collaboration with NVIDIA. The model is multilingual, supporting English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and Hindi. It supports function calling (a sketch follows this entry) and is released under the Apache 2.0 license.

    by mistralai · 131K context · $0.30/M input tokens · $0.30/M output tokens
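
The function-calling support mentioned above is typically exercised through the OpenAI-compatible "tools" format. The sketch below assumes the model slug "mistralai/mistral-nemo" and uses a hypothetical get_weather tool purely for illustration.

```python
# Minimal sketch: function calling with Mistral Nemo via the
# OpenAI-compatible "tools" format. The model slug and the
# get_weather tool are assumptions for illustration only.
import os
import requests

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "mistralai/mistral-nemo",
        "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
        "tools": tools,
    },
    timeout=120,
)
response.raise_for_status()
# If the model chooses to call the tool, the call appears under tool_calls.
print(response.json()["choices"][0]["message"].get("tool_calls"))
```
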
  11. OpenAI: GPT-4o-mini

    GPT-4o mini is OpenAI's newest model after GPT-4 Omni, supporting both text and image inputs with text outputs. As their most advanced small model, it is many multiples more affordable than other recent frontier models and more than 60% cheaper than GPT-3.5 Turbo. It maintains SOTA intelligence while being significantly more cost-effective. GPT-4o mini achieves an 82% score on MMLU and presently ranks higher than GPT-4 on common chat-preference leaderboards. #multimodal

    by openai · 128K context · $0.15/M input tokens · $0.60/M output tokens
  12. Microsoft: Phi-3 Mini 128K Instruct

    Phi-3 Mini is a powerful 3.8B-parameter model designed for advanced language understanding, reasoning, and instruction following. Optimized through supervised fine-tuning and preference adjustments, it excels in tasks involving common sense, mathematics, logical reasoning, and code processing. At time of release, Phi-3 Mini demonstrated state-of-the-art performance among lightweight models. This model is static, trained on an offline dataset with an October 2023 cutoff date.

    by microsoft · 128K context · $0.10/M input tokens · $0.10/M output tokens
  13. Microsoft: Phi-3 Medium 128K Instruct

    Phi-3 128K Medium is a powerful 14-billion parameter model designed for advanced language understanding, reasoning, and instruction following. Optimized through supervised fine-tuning and preference adjustments, it excels in tasks involving common sense, mathematics, logical reasoning, and code processing. At time of release, Phi-3 Medium demonstrated state-of-the-art performance among lightweight models. In the MMLU-Pro eval, the model even comes close to a Llama3 70B level of performance. For 4k context length, try Phi-3 Medium 4K.

    by microsoft · 128K context · $1/M input tokens · $1/M output tokens
  14. OpenAI: GPT-4o

    GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of GPT-4 Turbo while being twice as fast and 50% more cost-effective. GPT-4o also offers improved performance in processing non-English languages and enhanced visual capabilities. For benchmarking against other models, it was briefly called "im-also-a-good-gpt2-chatbot" #multimodal

    by openai · 128K context · $2.50/M input tokens · $10/M output tokens · $3.613/K input images
  15. OpenAI: GPT-4o (2024-05-13)

    GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of GPT-4 Turbo while being twice as fast and 50% more cost-effective. GPT-4o also offers improved performance in processing non-English languages and enhanced visual capabilities. For benchmarking against other models, it was briefly called "im-also-a-good-gpt2-chatbot" #multimodal

    by openai · 128K context · $5/M input tokens · $15/M output tokens · $7.225/K input images
  16. Mistral Large

    This is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407). It is a proprietary, weights-available model that excels at reasoning, code, JSON, chat, and more. It supports dozens of languages including French, German, Spanish, Italian, Portuguese, Arabic, Hindi, Russian, Chinese, Japanese, and Korean, along with 80+ coding languages including Python, Java, C, C++, JavaScript, and Bash. Its long context window allows precise information recall from large documents.

    by mistralai · 128K context · $3/M input tokens · $9/M output tokens
  17. OpenAI: GPT-3.5 Turbo (older v0613)

    GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and traditional completion tasks. Training data up to Sep 2021.

    by openai · 4K context · $1/M input tokens · $2/M output tokens
  18. OpenAI: GPT-3.5 Turbo 16k

    This model offers four times the context length of gpt-3.5-turbo, allowing it to support approximately 20 pages of text in a single request at a higher cost. Training data: up to Sep 2021.

    by openai · 16K context · $3/M input tokens · $4/M output tokens
  19. OpenAI: GPT-4

    OpenAI's flagship model, GPT-4 is a large-scale multimodal language model capable of solving difficult problems with greater accuracy than previous models due to its broader general knowledge and advanced reasoning capabilities. Training data: up to Sep 2021.

    by openai · 8K context · $30/M input tokens · $60/M output tokens