Google: Gemini Pro Vision (preview)


Updated Dec 13 | 65,536 context
$0.125 / 1M input tokens | $0.375 / 1M output tokens | $2.5 / 1K input images
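
The listed rates can be turned into a rough per-request estimate. The sketch below is illustrative only: the function and constant names are hypothetical, the rates are copied from this listing, and it assumes billing is a simple linear sum of token and image charges with no minimums or rounding.

```python
# Rates copied from the listing (USD). Names here are illustrative, not an API.
INPUT_USD_PER_MILLION_TOKENS = 0.125
OUTPUT_USD_PER_MILLION_TOKENS = 0.375
IMAGE_USD_PER_THOUSAND_IMAGES = 2.5


def estimate_cost_usd(input_tokens: int, output_tokens: int, input_images: int = 0) -> float:
    """Estimate the cost of one request, assuming purely linear pricing."""
    token_cost = (
        input_tokens * INPUT_USD_PER_MILLION_TOKENS
        + output_tokens * OUTPUT_USD_PER_MILLION_TOKENS
    ) / 1_000_000
    image_cost = input_images * IMAGE_USD_PER_THOUSAND_IMAGES / 1_000
    return token_cost + image_cost


if __name__ == "__main__":
    # One image, a 2,000-token prompt, and a 500-token reply.
    print(f"${estimate_cost_usd(2_000, 500, input_images=1):.6f}")
```

Under these assumptions, a request with one image, 2,000 input tokens, and 500 output tokens comes to roughly $0.0029, most of which is the image charge.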

Google's flagship multimodal model, supporting image and video in text or chat prompts for a text or code response.

See the benchmarks and prompting guidelines from DeepMind.

Note: Preview models are offered for testing purposes and should not be used in production apps.


OpenRouter tries the primary provider first and falls back to other providers if it encounters an error. Token prices are displayed per million tokens.
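
The fallback behavior described above happens on OpenRouter's side, and its actual implementation is not documented here. As a minimal sketch of the general pattern only, the following tries a list of provider callables in order and returns the first successful result. The `call_with_fallback` name and the provider callables are hypothetical stand-ins, not part of any OpenRouter API.

```python
from typing import Callable, Optional, Sequence


def call_with_fallback(providers: Sequence[Callable[[str], str]], prompt: str) -> str:
    """Try each provider in order; return the first successful response."""
    last_error: Optional[Exception] = None
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as err:  # any error triggers a fallback to the next provider
            last_error = err
    # Every provider failed (or none were given); surface the last error as the cause.
    raise RuntimeError("all providers failed") from last_error
```

The first callable in the list plays the role of the primary provider; later ones are only reached if an earlier call raises.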