Question 1

What is Llama 3.2 11B Vision Instruct?

Accepted Answer

Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle tasks combining visual and textual data. It excels in tasks such as image captioning and visual question answering, bridging the gap between language generation and visual reasoning. Pre-trained on a massive dataset of image-text pairs, it performs well in complex, high-accuracy image analysis. Its ability...

Question 2

What is the context length of Llama 3.2 11B Vision Instruct?

Accepted Answer

Llama 3.2 11B Vision Instruct has a 131,072 token context window.

Question 3

What modalities does Llama 3.2 11B Vision Instruct support?

Accepted Answer

Llama 3.2 11B Vision Instruct accepts text, image input and produces text output.

Question 4

When was Llama 3.2 11B Vision Instruct released?

Accepted Answer

Llama 3.2 11B Vision Instruct was released on 2024-09-25. Its knowledge cutoff is 2023-12-31.

Meta: Llama 3.2 11B Vision Instruct

meta-llama/llama-3.2-11b-vision-instruct

Meta: Llama 3.2 11B Vision Instruct

meta-llama/llama-3.2-11b-vision-instruct

Activity

Frequently asked questions

Activity

Frequently asked questions