Anthropic: Claude 3.5 Haiku

anthropic/claude-3.5-haiku

Created Nov 4, 2024 · 200,000 context
$0.8/M input tokens · $4/M output tokens

Providers for Claude 3.5 Haiku

OpenRouter routes requests to the top-ranked providers able to handle your prompts.

Context    Max Output    Input    Output    Latency    Throughput
200K       8K            $0.8     $4        1.31s      54.84t/s
200K       8K            $0.8     $4        1.38s      58.62t/s
200K       8K            $0.8     $4        2.07s      60.43t/s

Apps using Claude 3.5 Haiku

Top public apps this week using this model

1. Cline - Autonomous coding agent right in your IDE (205M tokens)
2. 90.4M tokens
3. 75.1M tokens
4. Aider - AI pair programming in your terminal (42.2M tokens)
5. novelcrafter - Your personal novel writing toolbox. Plan, write and tinker with your story. (35.4M tokens)
6. SillyTavern - LLM frontend for power users (20M tokens)
7. OpenRouter: Chatroom - Chat with multiple LLMs at once (17.8M tokens)
8. Chub AI - GenAI for everyone (17.6M tokens)
9. 9.36M tokens
10. 7.65M tokens

Recent activity on Claude 3.5 Haiku

Tokens processed per day

[Chart: tokens processed per day, Nov 4, 2024 through Jan 9, y-axis 0 to 1.6B]

Versions by Token Share

[Chart: token share by version, Dec 12, 2024 through Jan 8, y-axis 0% to 100%]
Anthropic: Claude 3.5 Haiku (currently viewing) - created November 4, 2024, 200,000 context
Anthropic: Claude 3.5 Haiku (2024-10-22) - created November 4, 2024, 200,000 context
Anthropic: Claude 3 Haiku - created March 13, 2024, 200,000 context, 1.16B tokens

Recommended parameters for Anthropic: Claude 3.5 Haiku

Median values from users on OpenRouter

temperature

This setting influences the variety in the model's responses. Lower values lead to more predictable and typical responses, while higher values encourage more diverse and less common responses. At 0, the model always gives the same response for a given input.

  • Optional, float, 0.0 to 2.0

  • Default: 1.0

p10
0
p50
0.40
p90
1
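
As a quick sketch, the median temperature above can be dropped into a request body like this (the `buildBody` helper and its default are illustrative, not part of the OpenRouter API; the model slug is from this page):

```javascript
// Build a chat-completion request body using the median temperature
// (0.4) from the stats above. buildBody is a hypothetical helper,
// not part of any SDK.
function buildBody(prompt, temperature = 0.4) {
  return {
    model: "anthropic/claude-3.5-haiku",
    messages: [{ role: "user", content: prompt }],
    temperature, // 0.0 = always the same answer, 2.0 = most varied
  };
}

console.log(buildBody("What is the meaning of life?").temperature); // 0.4
```

POST the resulting object to the chat completions endpoint exactly as in the samples further down the page.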
top_p

This setting limits the model's choices to a percentage of likely tokens: only the top tokens whose probabilities add up to P. A lower value makes the model's responses more predictable, while the default setting allows for a full range of token choices. Think of it like a dynamic Top-K.

  • Optional, float, 0.0 to 1.0

  • Default: 1.0

p10
1
p50
1
p90
1
top_k

This limits the model's choice of tokens at each step, making it choose from a smaller set. A value of 1 means the model will always pick the most likely next token, leading to predictable results. By default this setting is disabled, allowing the model to consider all choices.

  • Optional, integer, 0 or above

  • Default: 0

p10
0
p50
0
p90
0
frequency_penalty

This setting controls the repetition of tokens based on how often they appear in the input. Tokens that occur more frequently are penalized in proportion to their frequency, making the model less likely to reuse them. The penalty scales with the number of occurrences. Negative values encourage token reuse.

  • Optional, float, -2.0 to 2.0

  • Default: 0.0

p10
0
p50
0
p90
0
presence_penalty

Adjusts how likely the model is to repeat tokens that already appear in the input. Higher values make such repetition less likely, while negative values do the opposite. Unlike frequency_penalty, the penalty does not scale with the number of occurrences. Negative values encourage token reuse.

  • Optional, float, -2.0 to 2.0

  • Default: 0.0

p10
0
p50
0
p90
0
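
One common convention (the one OpenAI-style APIs document) subtracts both penalties from a token's raw logit before sampling. Sketching that convention as an assumption about upstream behavior, not as OpenRouter code:

```javascript
// Sketch of OpenAI-style penalty application (an assumption about
// how upstream providers apply it): the frequency penalty scales
// with the token's occurrence count, while the presence penalty is
// a flat deduction once the token has appeared at all.
function penalize(logit, count, frequencyPenalty, presencePenalty) {
  return (
    logit -
    count * frequencyPenalty -            // scales with occurrences
    (count > 0 ? 1 : 0) * presencePenalty // flat, once present
  );
}

// A token seen twice, with frequency_penalty 0.5 and presence_penalty 0.25:
console.log(penalize(2, 2, 0.5, 0.25)); // 0.75
```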
repetition_penalty

Helps reduce the repetition of tokens from the input. A higher value makes the model less likely to repeat tokens, but too high a value can make the output less coherent (often producing run-on sentences that lack small words). The penalty scales with the original token's probability.

  • Optional, float, 0.0 to 2.0

  • Default: 1.0

p10
1
p50
1
p90
1
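
A widely used implementation of this penalty comes from the CTRL paper and appears in several open-source inference stacks; whether each upstream provider applies it this way is an assumption. The scaling with the original probability falls out of the formula:

```javascript
// Sketch of the common repetition_penalty rule (from the CTRL paper;
// an assumption about upstream behavior): for tokens already seen,
// divide positive logits by the penalty and multiply negative ones
// by it, so high-probability tokens are punished hardest. A penalty
// of 1 is a no-op, matching the default above.
function applyRepetitionPenalty(logit, penalty) {
  return logit > 0 ? logit / penalty : logit * penalty;
}

console.log(applyRepetitionPenalty(2, 2));  // 1  (likely token, halved)
console.log(applyRepetitionPenalty(-2, 2)); // -4 (unlikely token, pushed further down)
```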
min_p

Represents the minimum probability for a token to be considered, relative to the probability of the most likely token. (The cutoff therefore shifts with how confident the model is in its top choice.) If min_p is set to 0.1, only tokens at least 1/10th as probable as the most likely option are allowed.

  • Optional, float, 0.0 to 1.0

  • Default: 0.0

p10
0
p50
0
p90
0
top_a

Consider only the top tokens with "sufficiently high" probabilities based on the probability of the most likely token. Think of it like a dynamic Top-P. A lower Top-A value focuses the choices based on the highest probability token but with a narrower scope. A higher Top-A value does not necessarily affect the creativity of the output, but rather refines the filtering process based on the maximum probability.

  • Optional, float, 0.0 to 1.0

  • Default: 0.0

p10
0
p50
0
p90
0

Sample code using the median

fetch("https://openrouter.ai/api/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer <OPENROUTER_API_KEY>",
    "HTTP-Referer": "<YOUR_SITE_URL>", // Optional. Site URL for rankings on openrouter.ai.
    "X-Title": "<YOUR_SITE_NAME>", // Optional. Site title for rankings on openrouter.ai.
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    "model": "anthropic/claude-3.5-haiku",
    "messages": [
      {"role": "user", "content": "What is the meaning of life?"}
    ],
    "top_p": 1,
    "temperature": 0.4,
    "repetition_penalty": 1
  })
});

Uptime stats for Claude 3.5 Haiku

Uptime stats for Claude 3.5 Haiku across all providers

When an error occurs in an upstream provider, we recover by routing to another healthy provider.
If a model only has one host or the request filters only match a single provider, the request is "irrecoverable."
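
OpenRouter's routing docs also describe a models array that lists fallbacks to try in order when the primary model's providers all fail; treating its exact semantics as an assumption, a request body using it might look like this (the fallback slug is illustrative):

```javascript
// Sketch of a request body with fallback models (per OpenRouter's
// routing docs; exact semantics are an assumption). If every
// provider for the first model errors, routing retries with the
// next model in the list. Claude 3 Haiku here is an illustrative
// fallback choice.
const body = {
  model: "anthropic/claude-3.5-haiku",
  models: ["anthropic/claude-3.5-haiku", "anthropic/claude-3-haiku"],
  messages: [{ role: "user", content: "What is the meaning of life?" }],
};

console.log(JSON.stringify(body, null, 2));
```

POST this body to the chat completions endpoint exactly as in the samples below.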

Learn more about our load balancing and customization options.

Sample code and API for Claude 3.5 Haiku

OpenRouter normalizes requests and responses across providers for you.

OpenRouter provides an OpenAI-compatible completion API to 294 models & providers that you can call directly, or using the OpenAI SDK. Additionally, some third-party SDKs are available.

In the examples below, the OpenRouter-specific headers are optional. Setting them allows your app to appear on the OpenRouter leaderboards.

Using the OpenAI SDK

import OpenAI from "openai"

const openai = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: "<OPENROUTER_API_KEY>",
  defaultHeaders: {
    "HTTP-Referer": "<YOUR_SITE_URL>", // Optional. Site URL for rankings on openrouter.ai.
    "X-Title": "<YOUR_SITE_NAME>", // Optional. Site title for rankings on openrouter.ai.
  }
})

async function main() {
  const completion = await openai.chat.completions.create({
    model: "anthropic/claude-3.5-haiku",
    messages: [
      {
        "role": "user",
        "content": "What is the meaning of life?"
      }
    ]
  })

  console.log(completion.choices[0].message)
}
main()

Using the OpenRouter API directly

fetch("https://openrouter.ai/api/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer <OPENROUTER_API_KEY>",
    "HTTP-Referer": "<YOUR_SITE_URL>", // Optional. Site URL for rankings on openrouter.ai.
    "X-Title": "<YOUR_SITE_NAME>", // Optional. Site title for rankings on openrouter.ai.
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    "model": "anthropic/claude-3.5-haiku",
    "messages": [
      {
        "role": "user",
        "content": "What is the meaning of life?"
      }
    ]
  })
});

Using third-party SDKs

For information about using third-party SDKs and frameworks with OpenRouter, please see our frameworks documentation.

See the Request docs for all possible parameters, and Parameters for recommended values.

More models from Anthropic