Search/
Skip to content
/
OpenRouter
© 2026 OpenRouter, Inc

Product

  • Chat
  • Rankings
  • Apps
  • Models
  • Providers
  • Pricing
  • Enterprise
  • Labs

Company

  • About
  • Announcements
  • CareersHiring
  • Privacy
  • Terms of Service
  • Support
  • State of AI
  • Works With OR
  • Data

Developer

  • Documentation
  • API Reference
  • SDK
  • Status

Connect

  • Discord
  • GitHub
  • LinkedIn
  • X
  • YouTube

Google: Gemma 4 26B A4B

google/gemma-4-26b-a4b-it

Released Apr 3, 2026262,144 context
$0.13/M input tokens$0.40/M output tokens

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B quality at a fraction of the compute cost. Supports multimodal input including text, images, and video (up to 60s at 1fps). Features a 256K token context window, native function calling, configurable thinking/reasoning mode, and structured output support. Released under Apache 2.0.

Providers for Gemma 4 26B A4B

OpenRouter routes requests to the best providers that are able to handle your prompt size and parameters, with fallbacks to maximize uptime.

Performance for Gemma 4 26B A4B

Compare different providers across OpenRouter

Effective Pricing for Gemma 4 26B A4B

Actual cost per million tokens across providers over the past hour

Apps using Gemma 4 26B A4B

Top public apps this month

Recent activity on Gemma 4 26B A4B

Total usage per day on OpenRouter

Prompt
274M
Completion
15.6M
Reasoning
2.88M

Prompt tokens measure input size. Reasoning tokens show internal thinking before a response. Completion tokens reflect total output length.

Uptime stats for Gemma 4 26B A4B

Uptime stats for Gemma 4 26B A4B across all providers

Sample code and API for Gemma 4 26B A4B

OpenRouter normalizes requests and responses across providers for you.

OpenRouter supports reasoning-enabled models that can show their step-by-step thinking process. Use the reasoning parameter in your request to enable reasoning, and access the reasoning_details array in the response to see the model's internal reasoning before the final answer. When continuing a conversation, preserve the complete reasoning_details when passing messages back to the model so it can continue reasoning from where it left off. Learn more about reasoning tokens.

In the examples below, the OpenRouter-specific headers are optional. Setting them allows your app to appear on the OpenRouter leaderboards.

Using third-party SDKs

For information about using third-party SDKs and frameworks with OpenRouter, please see our frameworks documentation.

See the Request docs for all possible fields, and Parameters for explanations of specific sampling parameters.