Search/
Skip to content
/
OpenRouter
© 2026 OpenRouter, Inc

Product

  • Chat
  • Rankings
  • Apps
  • Models
  • Providers
  • Pricing
  • Enterprise
  • Labs

Company

  • About
  • Announcements
  • CareersHiring
  • Privacy
  • Terms of Service
  • Support
  • State of AI
  • Works With OR
  • Data

Developer

  • Documentation
  • API Reference
  • SDK
  • Status

Connect

  • Discord
  • GitHub
  • LinkedIn
  • X
  • YouTube

Google: Gemma 4 26B A4B (free)Free variant

google/gemma-4-26b-a4b-it:free

Released Apr 3, 2026262,144 context$0/M input tokens$0/M output tokens

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B quality at a fraction of the compute cost. Supports multimodal input including text, images, and video (up to 60s at 1fps). Features a 256K token context window, native function calling, configurable thinking/reasoning mode, and structured output support. Released under Apache 2.0.

Performance for Gemma 4 26B A4B (free)

Compare different providers across OpenRouter

Effective Pricing for Gemma 4 26B A4B (free)

Actual cost per million tokens across providers over the past hour

Apps using Gemma 4 26B A4B (free)

Top public apps this month

Recent activity on Gemma 4 26B A4B (free)

Total usage per day on OpenRouter

Prompt
751M
Completion
11.1M
Reasoning
1.45M

Prompt tokens measure input size. Reasoning tokens show internal thinking before a response. Completion tokens reflect total output length.

Uptime stats for Gemma 4 26B A4B (free)

Uptime stats for Gemma 4 26B A4B (free) across all providers

Providers for Gemma 4 26B A4B (free)

OpenRouter routes requests to the best providers that are able to handle your prompt size and parameters, with fallbacks to maximize uptime.

Sample code and API for Gemma 4 26B A4B (free)

OpenRouter normalizes requests and responses across providers for you.

OpenRouter supports reasoning-enabled models that can show their step-by-step thinking process. Use the reasoning parameter in your request to enable reasoning, and access the reasoning_details array in the response to see the model's internal reasoning before the final answer. When continuing a conversation, preserve the complete reasoning_details when passing messages back to the model so it can continue reasoning from where it left off. Learn more about reasoning tokens.

In the examples below, the OpenRouter-specific headers are optional. Setting them allows your app to appear on the OpenRouter leaderboards.

Using third-party SDKs

For information about using third-party SDKs and frameworks with OpenRouter, please see our frameworks documentation.

See the Request docs for all possible fields, and Parameters for explanations of specific sampling parameters.