  • © 2023 – 2025 OpenRouter, Inc
      Alibaba

      Browse models provided by Alibaba (Terms of Service)

      7 models

      • Qwen: Qwen2.5 VL 32B Instruct (free variant)

        Qwen2.5-VL-32B is a multimodal vision-language model fine-tuned through reinforcement learning for enhanced mathematical reasoning, structured outputs, and visual problem-solving capabilities. It excels at visual analysis tasks, including object recognition, textual interpretation within images, and precise event localization in extended videos. Qwen2.5-VL-32B demonstrates state-of-the-art performance across multimodal benchmarks such as MMMU, MathVista, and VideoMME, while maintaining strong reasoning and clarity in text-based tasks like MMLU, mathematical problem-solving, and code generation.

        by qwen | 33K context | $0/M input tokens | $0/M output tokens
      • Qwen: Qwen VL Plus

        Qwen's enhanced large vision-language model. It is significantly upgraded for detailed image recognition and text recognition, and it accepts image input at resolutions of up to millions of pixels and at extreme aspect ratios. It delivers strong performance across a broad range of visual tasks.

        by qwen | 8K context | $0.21/M input tokens | $0.63/M output tokens | $0.269/K input imgs
      • Qwen: Qwen VL Max

        Qwen VL Max is a visual understanding model with a 7,500-token context length. It is aimed at strong performance across a broad spectrum of complex tasks.

        by qwen | 8K context | $0.80/M input tokens | $3.20/M output tokens | $1.024/K input imgs
      • Qwen: Qwen-Turbo

        Qwen-Turbo, based on Qwen2.5, is a 1M context model that offers fast, low-cost inference and is suited to simple tasks.

        by qwen | 1M context | $0.05/M input tokens | $0.20/M output tokens
      • Qwen: Qwen2.5 VL 72B Instruct (free variant)

        Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects. It is also highly capable of analyzing texts, charts, icons, graphics, and layouts within images.

        by qwen | 131K context | $0/M input tokens | $0/M output tokens
      • Qwen: Qwen-Plus

        Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K context model that balances performance, speed, and cost.

        by qwen | 131K context | $0.40/M input tokens | $1.20/M output tokens
      • Qwen: Qwen-Max

        Qwen-Max, based on Qwen2.5, provides the best inference performance among Qwen models, especially for complex multi-step tasks. It's a large-scale MoE model that has been pretrained on over 20 trillion tokens and further post-trained with curated Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) methodologies. The parameter count is unknown.

        by qwen | 33K context | $1.60/M input tokens | $6.40/M output tokens
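
The per-million-token prices listed above translate into a request cost by simple arithmetic. The following is a minimal sketch of that calculation, not an official pricing tool: the dictionary keys are hypothetical labels chosen for illustration (they are not confirmed API model identifiers), the function name `estimate_cost` is an assumption, and per-image charges for the VL models are left out.

```python
# Prices in USD per million tokens, copied from the listings above.
# Keys are hypothetical labels for illustration, not confirmed API slugs.
PRICES_PER_M = {
    "qwen-vl-plus": (0.21, 0.63),
    "qwen-vl-max": (0.80, 3.20),
    "qwen-turbo": (0.05, 0.20),
    "qwen-plus": (0.40, 1.20),
    "qwen-max": (1.60, 6.40),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated text-token cost in USD for one request.

    Image input charges (listed per thousand images for the VL models)
    are not included in this sketch.
    """
    input_price, output_price = PRICES_PER_M[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Example: 10,000 input tokens and 2,000 output tokens on each model.
    for name in PRICES_PER_M:
        print(f"{name}: ${estimate_cost(name, 10_000, 2_000):.6f}")
```

Because input and output tokens are priced separately, a response-heavy workload can cost noticeably more than a prompt-heavy one with the same total token count.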