For the free endpoint, please do not upload any confidential information or personal data (such as voices or faces of people). Your use is logged for security purposes and to improve NVIDIA products and services. The logged session data for improvement purposes is not linked to your identity or any persistent identifier. For more information about NVIDIA's data processing practices, see Privacy Policy(opens in new tab). By using this free endpoint, you consent to NVIDIA's collection, recording, and use of such information and the NVIDIA API Trial Terms of Service(opens in new tab).

NVIDIA: Nemotron 3 Nano Omni

nvidia/nemotron-3-nano-omni-30b-a3b-reasoning

Model weights

NVIDIA Nemotron™ 3 Nano Omni is a 30B-A3B open multimodal model designed to function as a perception and context sub-agent in enterprise agent systems. It accepts text, image, video, and audio inputs and produces text output, enabling agents to perceive and reason across modalities in a single inference loop.

Built on a hybrid MoE Transformer-Mamba architecture with Conv3D video layers and Efficient Video Sampling (EVS), it delivers approximately 2× higher throughput and 2.5× lower compute for video reasoning versus separate vision + speech pipelines. It supports up to 300K context length and a 16,384 reasoning budget, with extended thinking enabled via reasoning.enabled on OpenRouter.

Modalities

Context

256K

Released

Apr 28, 2026

For the free endpoint, please do not upload any confidential information or personal data (such as voices or faces of people). Your use is logged for security purposes and to improve NVIDIA products and services. The logged session data for improvement purposes is not linked to your identity or any persistent identifier. For more information about NVIDIA's data processing practices, see Privacy Policy(opens in new tab). By using this free endpoint, you consent to NVIDIA's collection, recording, and use of such information and the NVIDIA API Trial Terms of Service(opens in new tab).

NVIDIA: Nemotron 3 Nano Omni