Image Generation Models

Model rankings updated August 2026 based on real usage data.

Image generation models create high-quality images from text descriptions, and OpenRouter exposes them through one unified API. This collection ranks models by their usage on OpenRouter over the past week, surfacing the tools developers rely on most. The current top models are Nano Banana (Gemini 2.5 Flash Image), Nano Banana 2 (Gemini 3.1 Flash Image), and Seedream 4.5. Compare pricing, resolutions, and capabilities to pick the best fit.

Image Generation Models on OpenRouter

Google: Nano Banana (Gemini 2.5 Flash Image)

5.96B tokens

Gemini 2.5 Flash Image, a.k.a. "Nano Banana," is now generally available. It is a state of the art image generation model with contextual understanding. It is capable of image generation, edits, and multi-turn conversations. Aspect ratios can be controlled with the image_config API Parameter

by google33K context$0.30/M input tokens$2.50/M output tokens

Google: Nano Banana 2 (Gemini 3.1 Flash Image)

4.99B tokens

Gemini 3.1 Flash Image, a.k.a. "Nano Banana 2," is Google’s latest state of the art image generation and editing model, delivering Pro-level visual quality at Flash speed. It combines advanced contextual understanding with fast, cost-efficient inference, making complex image generation and iterative edits significantly more accessible. Aspect ratios can be controlled with the image_config API Parameter

by google131K context$0.50/M input tokens$3/M output tokens

ByteDance Seed: Seedream 4.5

3.21B tokens

Seedream 4.5 is the latest in-house image generation model developed by ByteDance. Compared with Seedream 4.0, it delivers comprehensive improvements, especially in editing consistency, including better preservation of subject details, lighting, and color tone. It also enhances portrait refinement and small-text rendering. The model’s multi-image composition capabilities have been significantly strengthened, and both reasoning performance and visual aesthetics continue to advance, enabling more accurate and artistically expressive image generation.

Pricing is $0.04 per output image, regardless of size.

by bytedance-seed4K context$0.04/image

OpenAI: GPT Image 2

2.92B tokens

OpenAI's latest image generation model. Supports high-fidelity image generation and editing via the dedicated Images API.

by openai400K context$8/M input tokens$8/M output tokens

OpenAI: GPT-5.4 Image 2

1.8B tokens

GPT-5.4 Image 2 combines OpenAI's GPT-5.4 model with state-of-the-art image generation capabilities from GPT Image 2. It enables rich multimodal workflows, allowing users to seamlessly move between reasoning, coding, and visual generation within the same interaction.

by openai272K context$8/M input tokens$15/M output tokens

Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview)

1.76B tokens

Gemini 3.1 Flash Image Preview, a.k.a. "Nano Banana 2," is Google’s latest state of the art image generation and editing model, delivering Pro-level visual quality at Flash speed. It combines advanced contextual understanding with fast, cost-efficient inference, making complex image generation and iterative edits significantly more accessible. Aspect ratios can be controlled with the image_config API Parameter

by google66K context$0.50/M input tokens$3/M output tokens

xAI: Grok Imagine Image Quality

1.59B tokens

Grok Imagine Image Quality is xAI's fast, high-fidelity image generation and editing model. It accepts text prompts and optional reference images, producing photorealistic outputs at 1K or 2K across a range of aspect ratios, including flexible adjustment of reference images.

The model emphasizes realistic detail — natural lighting and physics, accurate textures, and consistent rendering of named entities such as brands, public figures, and specific locations. It supports clean multilingual text rendering inside images, making it the top choice for posters, packaging, ads, menus, and social graphics. When given reference images, it preserves identity and structure for product placement, brand-aligned variations, and character continuity across scenes.

by x-ai66K contextfrom $0.05/image

Google: Nano Banana Pro (Gemini 3 Pro Image)

1.54B tokens

Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro. It extends the original Nano Banana with significantly improved multimodal reasoning, real-world grounding, and high-fidelity visual synthesis. The model generates context-rich graphics, from infographics and diagrams to cinematic composites, and can incorporate real-time information via Search grounding.

It offers industry-leading text rendering in images (including long passages and multilingual layouts), consistent multi-image blending, and accurate identity preservation across up to five subjects. Nano Banana Pro adds fine-grained creative controls such as localized edits, lighting and focus adjustments, camera transformations, and support for 2K/4K outputs and flexible aspect ratios. It is designed for professional-grade design, product visualization, storyboarding, and complex multi-element compositions while remaining efficient for general image creation workflows.

by google131K context$2/M input tokens$12/M output tokens

Google: Nano Banana Pro (Gemini 3 Pro Image Preview)

1.46B tokens

by google66K context$2/M input tokens$12/M output tokens

Google: Nano Banana 2 Lite (Gemini 3.1 Flash Lite Image)

1.03B tokens

Nano Banana 2 Lite (Gemini 3.1 Flash Lite Image) is Google's fastest, most cost-efficient Gemini image model, built for high-velocity developer pipelines and rapid-fire visual exploration. It delivers text-to-image generation in roughly 4 seconds — about 2.7× faster than Gemini 3.1 Flash Image — while keeping the character consistency, precise editing, and real-world knowledge of the Nano Banana family.

A single drop-in API handles text-to-image, image editing, and multi-image composition. As a multimodal model it also returns text alongside images. Outputs are generated at 1K resolution across 14 aspect ratios and carry an invisible SynthID watermark so they can be identified as AI-generated.

Positioned as the best balance of quality and speed in the Nano Banana 2 line, it lets you generate thousands of images at a fraction of the cost of heavier production models — ideal for prototyping, real-time apps, and visual workflows at scale.

by google66K context$0.25/M input tokens$1.50/M output tokens

Black Forest Labs: FLUX.2 Pro

759M tokens

A high-end image generation and editing model focused on frontier-level visual quality and reliability. It delivers strong prompt adherence, stable lighting, sharp textures, and consistent character/style reproduction across multi-reference inputs. Designed for production workloads, it balances speed and quality while supporting text-to-image and image editing up to 4 MP resolution.

Pricing is as follows, per the docs: Input: We charge $0.015 for each megapixel on the input (i.e. reference images for editing) Output: The first megapixel is charged $0.03 and then each subsequent MP will be charged $0.015.

by black-forest-labs47K context$0.03/megapixel

Black Forest Labs: FLUX.2 Klein 4B

457M tokens

FLUX.2 [klein] 4B is the fastest and most cost-effective model in the FLUX.2 family, optimized for high-throughput use cases while maintaining excellent image quality.

Pricing is based on the output image. The first generated megapixel is charged $0.014. Each subsequent megapixel is charged $0.001.

by black-forest-labs41K context$0.014/megapixel

Image Generation Models

Model rankings updated August 2026 based on real usage data.