FireLLaVA 13B

fireworks/firellava-13b

Updated Apr 26 | 4,096 context
$0.2/M input tokens | $0.2/M output tokens | $0.1152/K input images
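
As a rough illustration of how the listed prices combine, here is a minimal Python sketch that estimates the cost of one request. It assumes "M" means one million tokens, "K" means one thousand images, and that billing is strictly linear with no minimum charge or rounding; none of that is confirmed by this listing. The function name `estimate_cost` is hypothetical and not part of any provider API.

```python
# Listed prices in USD, copied from the pricing line above.
PRICE_PER_M_INPUT_TOKENS = 0.2      # per 1,000,000 input tokens (assumed meaning of "/M")
PRICE_PER_M_OUTPUT_TOKENS = 0.2     # per 1,000,000 output tokens
PRICE_PER_K_INPUT_IMAGES = 0.1152   # per 1,000 input images (assumed meaning of "/K")


def estimate_cost(input_tokens: int, output_tokens: int, input_images: int = 0) -> float:
    """Estimate the USD cost of one request (hypothetical helper, linear billing assumed)."""
    token_cost = (
        input_tokens * PRICE_PER_M_INPUT_TOKENS
        + output_tokens * PRICE_PER_M_OUTPUT_TOKENS
    ) / 1_000_000
    image_cost = input_images * PRICE_PER_K_INPUT_IMAGES / 1_000
    return token_cost + image_cost


if __name__ == "__main__":
    # Example: one image, a 3,000-token prompt, and a 500-token reply.
    print(f"${estimate_cost(3_000, 500, 1):.7f}")
```

Under these assumptions, a single image costs about $0.0001152, so for short prompts the image and token charges are of similar magnitude.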

A blazing-fast vision-language model, FireLLaVA quickly understands both text and images. It shows strong chat skills in tests and was designed to mimic multimodal GPT-4.

It is the first commercially permissive open-source LLaVA model, trained entirely on instruction-following data generated by open-source LLMs.