NVIDIA: Llama 3.3 Nemotron Super 49B v1 (free)

nvidia/llama-3.3-nemotron-super-49b-v1:free

Created Apr 8, 2025 | 131,072 context
$0/M input tokens | $0/M output tokens

Llama-3.3-Nemotron-Super-49B-v1 is a large language model (LLM) optimized for advanced reasoning, conversational interactions, retrieval-augmented generation (RAG), and tool-calling tasks. Derived from Meta's Llama-3.3-70B-Instruct, it employs a Neural Architecture Search (NAS) approach, significantly enhancing efficiency and reducing memory requirements. This allows the model to support a context length of up to 128K tokens and fit efficiently on single high-performance GPUs, such as NVIDIA H200.

Note: you must include "detailed thinking on" in the system prompt to enable reasoning. See Usage Recommendations for more.
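As a minimal sketch of the note above, the reasoning toggle can be placed in a system message ahead of the user prompt. The helper name `build_messages` is hypothetical, and the "detailed thinking off" string used to disable reasoning is an assumption based on NVIDIA's model card; only "detailed thinking on" is stated on this page.

```python
MODEL = "nvidia/llama-3.3-nemotron-super-49b-v1:free"


def build_messages(user_prompt, reasoning=True):
    """Return a chat `messages` list whose system prompt sets the reasoning toggle.

    `build_messages` is a hypothetical helper, not part of any SDK.
    """
    # "detailed thinking on" is taken from the note above.
    # "detailed thinking off" is an assumption drawn from NVIDIA's model card.
    toggle = "detailed thinking on" if reasoning else "detailed thinking off"
    return [
        {"role": "system", "content": toggle},
        {"role": "user", "content": user_prompt},
    ]


messages = build_messages("How many prime numbers are there below 50?")
for message in messages:
    print(message["role"], "->", message["content"])
```

The resulting list can be passed as the `messages` argument of any OpenAI-compatible chat completion call, together with `model=MODEL`.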

Providers for Llama 3.3 Nemotron Super 49B v1 (free)

OpenRouter routes requests to the best providers that are able to handle your prompt size and parameters, with fallbacks to maximize uptime.

Recent activity on Llama 3.3 Nemotron Super 49B v1 (free)

Tokens processed per day

[Chart: tokens processed per day, Apr 8 to Apr 17; vertical axis from 0 to 60M]

Uptime stats for Llama 3.3 Nemotron Super 49B v1 (free)

Sample code and API for Llama 3.3 Nemotron Super 49B v1 (free)

OpenRouter normalizes requests and responses across providers for you.

OpenRouter provides an OpenAI-compatible completion API to 300+ models & providers that you can call directly, or using the OpenAI SDK. Additionally, some third-party SDKs are available.

In the examples below, the OpenRouter-specific headers are optional. Setting them allows your app to appear on the OpenRouter leaderboards.

from openai import OpenAI

client = OpenAI(
  base_url="https://openrouter.ai/api/v1",
  api_key="<OPENROUTER_API_KEY>",
)

completion = client.chat.completions.create(
  extra_headers={
    "HTTP-Referer": "<YOUR_SITE_URL>", # Optional. Site URL for rankings on openrouter.ai.
    "X-Title": "<YOUR_SITE_NAME>", # Optional. Site title for rankings on openrouter.ai.
  },
  extra_body={},
  model="nvidia/llama-3.3-nemotron-super-49b-v1:free",
  messages=[
    {
      "role": "user",
      "content": "What is the meaning of life?"
    }
  ]
)
print(completion.choices[0].message.content)
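
The same request can also be made directly over HTTP, without the OpenAI SDK. The sketch below uses only the Python standard library and assumes the chat completions endpoint lives at `/chat/completions` under the base URL shown above. The helper names `build_request` and `send` are hypothetical. The optional OpenRouter-specific headers are omitted.

```python
import json
import urllib.request

# Assumed endpoint: the base URL from the SDK example plus "/chat/completions".
API_URL = "https://openrouter.ai/api/v1/chat/completions"
MODEL = "nvidia/llama-3.3-nemotron-super-49b-v1:free"


def build_request(api_key, prompt):
    """Build (but do not send) a POST request for a single-turn chat completion."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the first choice's message text."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


request = build_request("<OPENROUTER_API_KEY>", "What is the meaning of life?")
print(request.get_method(), request.full_url)
```

Calling `send(request)` with a real API key performs the network call; building the request separately makes it easy to inspect the payload first.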

Using third-party SDKs

For information about using third-party SDKs and frameworks with OpenRouter, please see our frameworks documentation.

See the Request docs for all possible fields, and Parameters for explanations of specific sampling parameters.
