Skip to content
  1.  
  2. © 2023 – 2025 OpenRouter, Inc

    Image Generation Models

    OpenRouter provides access to leading image generation models through a single, unified API gateway. Compare pricing, capabilities, and performance across multiple image generation APIs to find the best fit for models that transform text descriptions into high-quality images.

    Image Generation Models on OpenRouter

    Google: Nano Banana Pro (Gemini 3 Pro Image Preview)

    3.55B tokens

    Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro. It extends the original Nano Banana with significantly improved multimodal reasoning, real-world grounding, and high-fidelity visual synthesis. The model generates context-rich graphics, from infographics and diagrams to cinematic composites, and can incorporate real-time information via Search grounding.

    It offers industry-leading text rendering in images (including long passages and multilingual layouts), consistent multi-image blending, and accurate identity preservation across up to five subjects. Nano Banana Pro adds fine-grained creative controls such as localized edits, lighting and focus adjustments, camera transformations, and support for 2K/4K outputs and flexible aspect ratios. It is designed for professional-grade design, product visualization, storyboarding, and complex multi-element compositions while remaining efficient for general image creation workflows.

    by google66K context$2/M input tokens$12/M output tokens$120/M tokens

    Google: Gemini 2.5 Flash Image (Nano Banana)

    2.58B tokens

    Gemini 2.5 Flash Image, a.k.a. "Nano Banana," is now generally available. It is a state of the art image generation model with contextual understanding. It is capable of image generation, edits, and multi-turn conversations. Aspect ratios can be controlled with the image_config API Parameter

    by google33K context$0.30/M input tokens$2.50/M output tokens$30/M tokens

    Google: Gemini 2.5 Flash Image Preview (Nano Banana)

    982M tokens

    Gemini 2.5 Flash Image Preview, a.k.a. "Nano Banana," is a state of the art image generation model with contextual understanding. It is capable of image generation, edits, and multi-turn conversations.

    by google33K context$0.30/M input tokens$2.50/M output tokens$30/M tokens

    OpenAI: GPT-5 Image Mini

    234M tokens

    GPT-5 Image Mini combines OpenAI's advanced language capabilities, powered by GPT-5 Mini, with GPT Image 1 Mini for efficient image generation. This natively multimodal model features superior instruction following, text rendering, and detailed image editing with reduced latency and cost. It excels at high-quality visual creation while maintaining strong text understanding, making it ideal for applications that require both efficient image generation and text processing at scale.

    by openai400K context$2.50/M input tokens$2/M output tokens$8/M tokens

    OpenAI: GPT-5 Image

    224M tokens

    GPT-5 Image combines OpenAI's most advanced language model with state-of-the-art image generation capabilities. It offers major improvements in reasoning, code quality, and user experience while incorporating GPT Image 1's superior instruction following, text rendering, and detailed image editing.

    by openai400K context$10/M input tokens$10/M output tokens$40/M tokens

    Black Forest Labs: FLUX.2 Pro

    30.3M tokens

    A high-end image generation and editing model focused on frontier-level visual quality and reliability. It delivers strong prompt adherence, stable lighting, sharp textures, and consistent character/style reproduction across multi-reference inputs. Designed for production workloads, it balances speed and quality while supporting text-to-image and image editing up to 4 MP resolution.

    Pricing is as follows, per the docs: Input: We charge $0.015 for each megapixel on the input (i.e. reference images for editing) Output: The first megapixel is charged $0.03 and then each subsequent MP will be charged $0.015.

    by black-forest-labs47K context$3.66/M input tokens$3.66/M output tokens

    Sourceful: Riverflow V2 Max Preview

    11.6M tokens

    Riverflow V2 Max Preview is the most powerful variant of Sourceful's Riverflow V2 preview lineup. This preview version exceeds the performance of Riverflow 1 Family and is Sourceful's first unified text-to-image and image-to-image model family.

    Pricing is $0.075 per output image, regardless of size.

    Sourceful imposes a 4.5MB request size limit, therefore it is highly recommended to pass image URLs instead of Base64 data.

    by sourceful8K context$0/M input tokens$17.96/M output tokens$17.96/M tokens

    Black Forest Labs: FLUX.2 Flex

    7.22M tokens

    FLUX.2 [flex] excels at rendering complex text, typography, and fine details, and supports multi-reference editing in the same unified architecture.

    Pricing is as follows, per the docs: We charge $0.06 for each megapixel on both input and output side.

    by black-forest-labs67K context$14.64/M input tokens$14.64/M output tokens

    Sourceful: Riverflow V2 Fast Preview

    6.32M tokens

    Riverflow V2 Fast Preview is the fastest variant of Sourceful's Riverflow V2 preview lineup. This preview version exceeds the performance of Riverflow 1 Family and is Sourceful's first unified text-to-image and image-to-image model family.

    Pricing is $0.03 per output image, regardless of size.

    Sourceful imposes a 4.5MB request size limit, therefore it is highly recommended to pass image URLs instead of Base64 data.

    by sourceful8K context$0/M input tokens$7.19/M output tokens$7.19/M tokens

    Sourceful: Riverflow V2 Standard Preview

    4.12M tokens

    Riverflow V2 Standard Preview is the standard variant of Sourceful's Riverflow V2 preview lineup. This preview version exceeds the performance of Riverflow 1 Family and is Sourceful's first unified text-to-image and image-to-image model family.

    Pricing is $0.035 per output image, regardless of size.

    Sourceful imposes a 4.5MB request size limit, therefore it is highly recommended to pass image URLs instead of Base64 data.

    by sourceful8K context$0/M input tokens$8.38/M output tokens$8.38/M tokens