Google: Gemini Pro Vision 1.0
google/gemini-pro-vision
Updated Oct 1
16,384 context
$0.5/M input tokens
$1.5/M output tokens
$2.5/K input imgs
Google's flagship multimodal model, supporting image and video in text or chat prompts for a text or code response.
See the benchmarks and prompting guidelines from DeepMind.
Usage of Gemini is subject to Google's Gemini Terms of Use.
#multimodal
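
The listed rates can be combined into a rough per-request cost estimate. The sketch below is illustrative only: the function name `estimate_cost_usd` and its parameters are hypothetical, and it assumes the three listed prices simply add together with no minimums, rounding, or other fees.

```python
# Rates taken from the listing above (USD).
INPUT_USD_PER_M_TOKENS = 0.5    # $0.5 per million input tokens
OUTPUT_USD_PER_M_TOKENS = 1.5   # $1.5 per million output tokens
IMAGE_USD_PER_K_IMAGES = 2.5    # $2.5 per thousand input images


def estimate_cost_usd(input_tokens: int, output_tokens: int, input_images: int = 0) -> float:
    """Hypothetical helper: sum the three listed rates for one request.

    Assumes purely linear pricing; actual billing may differ.
    """
    if min(input_tokens, output_tokens, input_images) < 0:
        raise ValueError("counts must be non-negative")
    token_cost = (
        input_tokens * INPUT_USD_PER_M_TOKENS / 1_000_000
        + output_tokens * OUTPUT_USD_PER_M_TOKENS / 1_000_000
    )
    image_cost = input_images * IMAGE_USD_PER_K_IMAGES / 1_000
    return token_cost + image_cost


if __name__ == "__main__":
    # 10,000 input tokens, 2,000 output tokens, 4 images
    cost = estimate_cost_usd(10_000, 2_000, 4)
    print(f"${cost:.4f}")  # prints $0.0180
```

At these rates a single image costs as much as 5,000 input tokens, so image-heavy prompts are dominated by the per-image charge.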