OpenAI: Whisper Large V3

openai/whisper-large-v3

Whisper Large V3 is OpenAI's open-source automatic speech recognition model offering both audio transcription and translation. It supports 99+ languages and accepts common audio formats including mp3, mp4, wav, webm, flac, and ogg. With 1,550M parameters, it achieves a 10.3% word error rate and is well-suited for noise-robust, multilingual transcription in demanding conditions. It supports timestamps at word-level and segment-level granularity.

Modalities

Price

$0.111 per hour

Released

May 1, 2026

Sample code and API for Whisper Large V3

OpenRouter normalizes requests and responses across providers for you.

OpenRouter provides a speech-to-text API that transcribes audio into text. Send base64-encoded audio together with a model ID, and receive the transcribed text as JSON.
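As a rough illustration of the request described above, the following Python sketch base64-encodes an audio file and posts it as JSON. The endpoint path, the `input_audio` field, and its `data` and `format` keys are assumptions made for this example and are not confirmed by this page; check the API reference for the actual request schema before relying on it.

```python
import base64
import json
import urllib.request

# ASSUMPTION: this endpoint path is illustrative and not confirmed by this page.
API_URL = "https://openrouter.ai/api/v1/audio/transcriptions"


def build_transcription_payload(
    audio_bytes: bytes,
    audio_format: str,
    model: str = "openai/whisper-large-v3",
) -> dict:
    """Build a JSON-serializable request body carrying base64-encoded audio.

    ASSUMPTION: the "input_audio", "data", and "format" field names are
    hypothetical placeholders for whatever the real schema requires.
    """
    return {
        "model": model,
        "input_audio": {
            # JSON cannot carry raw bytes, so the audio is base64-encoded text.
            "data": base64.b64encode(audio_bytes).decode("ascii"),
            "format": audio_format,
        },
    }


def transcribe(audio_path: str, api_key: str) -> dict:
    """Read an audio file, send it, and return the parsed JSON response."""
    with open(audio_path, "rb") as f:
        audio_bytes = f.read()

    # Use the file extension (for example "wav" or "mp3") as the format hint.
    audio_format = audio_path.rsplit(".", 1)[-1].lower()
    body = json.dumps(
        build_transcription_payload(audio_bytes, audio_format)
    ).encode("utf-8")

    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping the payload construction separate from the network call makes the encoding step easy to inspect or test without sending any audio.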

The generation ID is returned in the X-Generation-Id response header for tracking.
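One way to pick up that header is a small helper that looks it up case-insensitively, since HTTP header names are not case-sensitive. This is a minimal sketch: the function name `get_generation_id` is hypothetical, and it accepts any mapping of header names to values, such as a plain dict or the `headers` attribute of a `urllib` response.

```python
from typing import Mapping, Optional


def get_generation_id(headers: Mapping[str, str]) -> Optional[str]:
    """Return the X-Generation-Id header value, or None if it is absent."""
    for name, value in headers.items():
        # Compare in lowercase so "x-generation-id" and "X-Generation-Id" both match.
        if name.lower() == "x-generation-id":
            return value
    return None
```

Logging the returned value next to each transcription makes it possible to trace a specific request later.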

Using third-party SDKs

For information about using third-party SDKs and frameworks with OpenRouter, please see our frameworks documentation.