FireLLaVA 13B
fireworks/firellava-13b
Updated Apr 26 | 4,096 context
$0.2/M input tokens | $0.2/M output tokens | $0.1152/K input images
FireLLaVA is a fast vision-language model that understands both text and images. It shows strong chat skills in tests and was designed to mimic multimodal GPT-4.
It is the first commercially permissive open-source LLaVA model, trained entirely on open-source LLM-generated instruction-following data.
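
As a rough illustration of the listed pricing, the sketch below estimates the cost of a request from its token and image counts. The function and constant names are hypothetical, and it assumes the three listed rates are simply added together with no rounding, minimum charge, or other billing rules, which the listing does not specify.

```python
from decimal import Decimal

# Rates copied from the listing above (USD).
INPUT_USD_PER_M_TOKENS = Decimal("0.2")     # per million input tokens
OUTPUT_USD_PER_M_TOKENS = Decimal("0.2")    # per million output tokens
IMAGE_USD_PER_K_IMAGES = Decimal("0.1152")  # per thousand input images


def estimate_cost_usd(input_tokens: int, output_tokens: int, input_images: int = 0) -> Decimal:
    """Estimate request cost in USD (hypothetical helper, not an official formula).

    Assumes the listed rates are summed linearly with no rounding or minimums.
    """
    token_cost = (
        INPUT_USD_PER_M_TOKENS * input_tokens
        + OUTPUT_USD_PER_M_TOKENS * output_tokens
    ) / Decimal(1_000_000)
    image_cost = IMAGE_USD_PER_K_IMAGES * input_images / Decimal(1_000)
    return token_cost + image_cost


if __name__ == "__main__":
    # Example: a prompt with one image, 1,000 input tokens, and 500 output tokens.
    print(estimate_cost_usd(input_tokens=1_000, output_tokens=500, input_images=1))
```

Decimal is used instead of float so that small per-request amounts are not distorted by binary rounding when many requests are summed.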