Skip to content
  1. Status
  2. Announcements
  3. Docs
  4. Support
  5. About
  6. Partners
  7. Enterprise
  8. Careers
  9. Pricing
  10. Privacy
  11. Terms
  12.  
  13. © 2025 OpenRouter, Inc

    DeepSeek: DeepSeek R1 Zero

    deepseek/deepseek-r1-zero

    Created Mar 6, 2025163,840 context

    DeepSeek-R1-Zero is a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step. It's 671B parameters in size, with 37B active in an inference pass.

    It demonstrates remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors.

    DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing. See DeepSeek R1 for the SFT model.

    Providers for DeepSeek R1 Zero

    OpenRouter routes requests to the best providers that are able to handle your prompt size and parameters, with fallbacks to maximize uptime.