Databricks: DBRX 132B Instruct

databricks/dbrx-instruct

Updated Mar 2932,768 context
$1.08/M input tkns$1.08/M output tkns

DBRX is a new open source large language model developed by Databricks. At 132B, it outperforms existing open source LLMs like Llama 2 70B and Mixtral-8x7b on standard industry benchmarks for language understanding, programming, math, and logic.

It uses a fine-grained mixture-of-experts (MoE) architecture. 36B parameters are active on any input. It was pre-trained on 12T tokens of text and code data. Compared to other open MoE models like Mixtral-8x7B and Grok-1, DBRX is fine-grained, meaning it uses a larger number of smaller experts.

See the launch announcement and benchmark results here.

#moe