Meta: LlamaGuard 2 8B
meta-llama/llama-guard-2-8b
This safeguard model has 8B parameters and is based on the Llama 3 family. Just like is predecessor, LlamaGuard 1, it can do both prompt and response classification.
LlamaGuard 2 acts as a normal LLM would, generating text that indicates whether the given input/output is safe/unsafe. If deemed unsafe, it will also share the content categories violated.
For best results, please use raw prompt input or the /completions
endpoint, instead of the chat API.
It has demonstrated strong performance compared to leading closed-source models in human evaluations.
To read more about the model release, click here. Usage of this model is subject to Meta's Acceptable Use Policy.