Databricks: DBRX 132B Instruct
databricks/dbrx-instruct
DBRX is a new open-source large language model developed by Databricks. With 132B total parameters, it outperforms existing open-source LLMs such as Llama 2 70B and Mixtral-8x7B on standard industry benchmarks for language understanding, programming, math, and logic.
It uses a fine-grained mixture-of-experts (MoE) architecture in which 36B of the 132B parameters are active on any given input. It was pre-trained on 12T tokens of text and code data. Compared to other open MoE models such as Mixtral-8x7B and Grok-1, DBRX is fine-grained in the sense that it uses a larger number of smaller experts.
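To make "a larger number of smaller experts" concrete, here is a minimal arithmetic sketch in Python. The routing configurations are an assumption taken from the Databricks launch announcement rather than from this card: DBRX is described there as selecting 4 of 16 experts per token, while Mixtral-8x7B and Grok-1 select 2 of 8. The names `ROUTING` and `expert_combinations` are illustrative only and are not part of any DBRX API.

```python
from math import comb

# Assumed (num_experts, experts_chosen_per_token) pairs, taken from the
# DBRX launch announcement; they are not stated in this model card.
ROUTING = {
    "DBRX": (16, 4),
    "Mixtral-8x7B": (8, 2),
    "Grok-1": (8, 2),
}


def expert_combinations(num_experts: int, top_k: int) -> int:
    """Count the distinct sets of experts a router could pick for one token."""
    return comb(num_experts, top_k)


combos = {name: expert_combinations(n, k) for name, (n, k) in ROUTING.items()}
ratio = combos["DBRX"] / combos["Mixtral-8x7B"]

# Parameter counts in billions, as stated in this card.
TOTAL_PARAMS_B = 132
ACTIVE_PARAMS_B = 36
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

if __name__ == "__main__":
    for name, count in combos.items():
        print(f"{name}: {count} possible expert combinations")
    print(f"DBRX vs. Mixtral-8x7B: {ratio:.0f}x more combinations")
    print(f"Fraction of DBRX parameters active per input: {active_fraction:.1%}")
```

Under these assumptions, DBRX has 1820 possible expert combinations per token against 28 for an 8-choose-2 router, a 65x difference, while only about 27% of its parameters are active on a given input.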
See the Databricks launch announcement for details and benchmark results.
#moe