Mistral: Mistral 7B Instruct v0.2


Updated Dec 28
32,768 context
$0.07/M input tokens
$0.07/M output tokens

A high-performing, industry-standard 7.3B parameter model, with optimizations for speed and context length.

An improved version of Mistral 7B Instruct, with the following changes:

  • 32k context window (vs 8k context in v0.1)
  • RoPE base frequency (rope_theta) = 1e6
  • No Sliding-Window Attention
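
To make the rope_theta change above more concrete, here is a minimal sketch of how the RoPE base affects the rotation frequencies used for positional encoding. It uses the standard RoPE formula, where the inverse frequency for dimension pair i is theta^(-2i/d). The head dimension of 128 and the comparison base of 1e4 are assumptions chosen for illustration and are not stated in this listing. The helper names are hypothetical.

```python
import math

def rope_inv_freq(head_dim: int, theta: float) -> list[float]:
    """Standard RoPE inverse frequencies: theta^(-2i/d) for i in 0..d/2-1."""
    return [theta ** (-2 * i / head_dim) for i in range(head_dim // 2)]

def longest_wavelength(head_dim: int, theta: float) -> float:
    """Wavelength, in token positions, of the slowest-rotating dimension pair."""
    slowest = rope_inv_freq(head_dim, theta)[-1]
    return 2 * math.pi / slowest

HEAD_DIM = 128          # assumption: per-head dimension, not given in the listing
CONTEXT = 32_768        # context length from the listing

# 1e4 is an assumed comparison base; 1e6 is the value listed above.
for theta in (1e4, 1e6):
    wl = longest_wavelength(HEAD_DIM, theta)
    print(f"theta={theta:.0e}: longest wavelength ~ {wl:,.0f} positions "
          f"({wl / CONTEXT:.1f}x the context window)")
```

A larger base slows down the rotation of every dimension pair except the first, so positions that are far apart stay distinguishable over a longer range. This is one common way to support a longer context window; the sketch does not claim to reproduce the model's actual implementation.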