1.
MythoMax 13B
One of the highest-performing fine-tunes of Llama 2 13B, with rich descriptions and roleplay. #merge • 4096 context
460.1M tokens
15%
2.
Toppy M 7B
A wild 7B parameter model that merges several models using the new task_arithmetic merge method from mergekit.
List of merged models:
- NousResearch/Nous-Capybara-7B-V1.9
- [HuggingFaceH4/zephyr-7b-beta](/models/huggingfaceh4/zephyr-7b-beta)
- lemonilia/AshhLimaRP-Mistral-7B
- Vulkane/120-Days-of-Sodom-LoRA-Mistral-7b
- Undi95/Mistral-pippa-sharegpt-7b-qlora
#merge • 32768 context
299.4M tokens
13%
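Task arithmetic, as used in mergekit, treats each fine-tune as a "task vector" (its weight delta from the shared base model) and adds a weighted sum of those deltas back onto the base. A minimal illustrative sketch in plain Python; mergekit's actual implementation operates tensor-by-tensor over full checkpoints, and the toy values below are made up:

```python
def task_arithmetic_merge(base, finetunes, weights):
    # merged = base + sum_i w_i * (finetune_i - base), applied elementwise.
    merged = list(base)
    for ft, w in zip(finetunes, weights):
        for j in range(len(merged)):
            merged[j] += w * (ft[j] - base[j])
    return merged

# Toy "layer" with two fine-tunes blended at equal weight.
base = [0.0, 0.0, 0.0, 0.0]
ft_a = [1.0, 0.0, 0.0, 0.0]
ft_b = [0.0, 2.0, 0.0, 0.0]
merged = task_arithmetic_merge(base, [ft_a, ft_b], [0.5, 0.5])
# merged == [0.5, 1.0, 0.0, 0.0]
```

Because each delta is added independently, weights need not sum to 1, which is what lets a merge like Toppy combine five donors at once.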
3.
ReMM SLERP 13B
A recreation trial of the original MythoMax-L2-13B but with updated models. #merge • 6144 context
88.9M tokens
8%
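SLERP (spherical linear interpolation) merges two models by interpolating along the arc between their weight vectors instead of the straight line, which better preserves weight magnitudes than plain averaging. A hedged sketch of the formula on plain Python lists; real merges apply this per-tensor across two full checkpoints:

```python
import math

def slerp(a, b, t):
    """Spherically interpolate between weight vectors a and b at fraction t."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    dot = sum(x * y for x, y in zip(a, b)) / (na * nb)
    dot = max(-1.0, min(1.0, dot))       # guard against rounding drift
    theta = math.acos(dot)               # angle between the two vectors
    if theta < 1e-6:
        # Nearly parallel: plain linear interpolation is numerically safer.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    s = math.sin(theta)
    return [(math.sin((1 - t) * theta) * x + math.sin(t * theta) * y) / s
            for x, y in zip(a, b)]

# Midpoint between two orthogonal unit vectors stays on the unit circle.
mid = slerp([1.0, 0.0], [0.0, 1.0], 0.5)
# mid is roughly [0.707, 0.707]
```

Note that a straight average of those two vectors would have norm ~0.707, shrinking the weights; SLERP avoids that shrinkage.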
4.
OpenHermes 2.5 Mistral 7B
A continuation of [OpenHermes 2 model](/models/teknium/openhermes-2-mistral-7b), trained on additional code datasets.
Potentially the most interesting finding from training on a good ratio of code instruction data (est. around 7-14% of the total dataset) is that it boosted several non-code benchmarks, including TruthfulQA, AGIEval, and the GPT4All suite. It did, however, reduce the BigBench score, but the net gain overall is significant. • 4096 context
56.5M tokens
13%
5.
Mistral OpenOrca 7B
A fine-tune of Mistral using the OpenOrca dataset. First 7B model to beat all other models <30B. • 8192 context
51.7M tokens
8%
6.
MythoMax 13B 8k
One of the highest-performing fine-tunes of Llama 2 13B, with rich descriptions and roleplay. Extended to 8k context length. #merge • 8192 context
23.8M tokens
17%
7.
Nous: Hermes 13B
A state-of-the-art language model fine-tuned on over 300k instructions by Nous Research, with Teknium and Emozilla leading the fine-tuning process. • 4096 context
22.0M tokens
10%
8.
OpenHermes 2 Mistral 7B
Trained on 900k instructions, it surpasses all previous versions of Hermes 13B and below, and matches 70B models on some benchmarks. Hermes 2 has strong multiturn chat skills and system prompt capabilities. • 4096 context
20.8M tokens
20%
9.
Goliath 120B
A large LLM created by combining two fine-tuned Llama 70B models into one 120B model. Combines Xwin and Euryale. #merge • 6144 context
20.5M tokens
4%
10.
Anthropic: Claude v2.1
Claude 2.1 delivers advancements in key capabilities for enterprises, including an industry-leading 200K-token context window, significant reductions in rates of model hallucination, system prompts, and a new beta feature: tool use. • 200000 context
17.9M tokens
31%
11.
OpenAI: GPT-4 Turbo (preview)
The latest GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Training data: up to Apr 2023.
**Note:** heavily rate limited by OpenAI while in preview. • 128000 context
16.8M tokens
20%
12.
Anthropic: Claude Instant v1
Anthropic's model for low-latency, high throughput text generation. Supports up to 100k tokens in one pass, or hundreds of pages of text. • 100000 context
16.2M tokens
21%
13.
lzlv 70B
A Mythomax/MLewd_13B-style multi-model merge of several LLaMA2 70B fine-tunes, selected for roleplaying and creative work. The goal was to create a model that combines creativity with intelligence for an enhanced experience.
#merge • 4096 context
16.1M tokens
21%
14.
OpenChat 3.5
OpenChat is a library of open-source language models, fine-tuned with "C-RLFT (Conditioned Reinforcement Learning Fine-Tuning)" - a strategy inspired by offline reinforcement learning. It has been trained on mixed-quality data without preference labels. • 8192 context
12.5M tokens
23%
15.
MythoMist 7B
From the creator of [MythoMax](/models/gryphe/mythomax-l2-13b), this merge combines a suite of models to reduce overused words like "anticipation" and "ministrations" that are common in ChatGPT roleplaying data.
It combines [Neural Chat 7B](/models/intel/neural-chat-7b), Airoboros 7B, [Toppy M 7B](/models/undi95/toppy-m-7b), [Zephyr 7B beta](/models/huggingfaceh4/zephyr-7b-beta), [Nous Capybara 34B](/models/nousresearch/nous-capybara-34b), [OpenHermes 2.5](/models/teknium/openhermes-2.5-mistral-7b), and many others.
#merge • 32768 context
10.2M tokens
14%
16.
OpenAI: GPT-4
OpenAI's flagship model, GPT-4 is a large-scale multimodal language model capable of solving difficult problems with greater accuracy than previous models due to its broader general knowledge and advanced reasoning capabilities. Training data: up to Sep 2021. • 8191 context
9.9M tokens
0%
17.
Mistral 7B Instruct
A 7.3B parameter model that outperforms Llama 2 13B on all benchmarks, with optimizations for speed and context length. • 8192 context
9.5M tokens
37%
18.
OpenAI: GPT-3.5 Turbo
GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and traditional completion tasks. Training data: up to Sep 2021. • 4095 context
8.6M tokens
8%
19.
Anthropic: Claude 100k v1
Anthropic's model for low-latency, high throughput text generation. Supports up to 100k tokens in one pass, or hundreds of pages of text. • 100000 context
6.5M tokens
41%
20.
Meta: Llama v2 13B Chat
A 13 billion parameter language model from Meta, fine-tuned for chat completions. • 4096 context
6.5M tokens
19%