Google: Gemini Pro Vision (preview)

google/gemini-pro-vision

Updated Dec 1365,536 context
$0.125 / 1M input tokens$0.375 / 1M output tokens$2.5 / 1K input images

Google's flagship multimodal model, supporting image and video in text or chat prompts for a text or code response.

See the benchmarks and prompting guidelines from Deepmind.

Note: Preview models are offered for testing purposes and should not be used in production apps.

#multimodal

OpenRouter first attempts the primary provider, and falls back to others if it encounters an error. Prices displayed per million tokens.