OpenRouter

1.

Llama 3 Lumimaid 8B

The NeverSleep team is back, with a Llama 3 8B finetune trained on their curated roleplay data. Striking a balance between eRP and RP, Lumimaid was designed to be serious, yet uncensored when necessary. To enhance its overall intelligence and chat capability, roughly 40% of the training data was not roleplay. This provides a breadth of knowledge to access, while still keeping roleplay as the primary strength. Usage of this model is subject to [Meta's Acceptable Use Policy](https://llama.meta.com/llama3/use-policy/). • 24576 context

117M tokens

new

2.

Llama 3 Lumimaid 8B (extended)

The NeverSleep team is back, with a Llama 3 8B finetune trained on their curated roleplay data. Striking a balance between eRP and RP, Lumimaid was designed to be serious, yet uncensored when necessary. To enhance its overall intelligence and chat capability, roughly 40% of the training data was not roleplay. This provides a breadth of knowledge to access, while still keeping roleplay as the primary strength. Usage of this model is subject to [Meta's Acceptable Use Policy](https://llama.meta.com/llama3/use-policy/). Note: this is an extended-context version of [this model](/models/neversleep/llama-3-lumimaid-8b). It may have higher prices and different outputs. • 24576 context

45.9M tokens

new

3.

Yi 6B (base)

The Yi series models are large language models trained from scratch by developers at [01.AI](https://01.ai/). • 4096 context

3.61M tokens

2295%

4.

Google: Gemma 7B

Gemma by Google is an advanced, open-source language model family, leveraging the latest in decoder-only, text-to-text technology. It offers English language capabilities across text generation tasks like question answering, summarization, and reasoning. The Gemma 7B variant is comparable in performance to leading open source models. Usage of Gemma is subject to Google's [Gemma Terms of Use](https://ai.google.dev/gemma/terms). • 8192 context

166M tokens

1110%

5.

Neural Chat 7B v3.1

A model fine-tuned from [mistralai/Mistral-7B-v0.1](/models/mistralai/mistral-7b-instruct) on the open-source dataset [Open-Orca/SlimOrca](https://huggingface.co/datasets/Open-Orca/SlimOrca), then aligned with the DPO algorithm. For more details, refer to the blog: [The Practice of Supervised Fine-tuning and Direct Preference Optimization on Habana Gaudi2](https://medium.com/@NeuralCompressor/the-practice-of-supervised-finetuning-and-direct-preference-optimization-on-habana-gaudi2-a1197d8a3cd3). • 4096 context

552K tokens

382%

6.

Google: PaLM 2 Code Chat 32k

PaLM 2 fine-tuned for chatbot conversations that help with code-related questions. • 91750 context

1.22M tokens

379%

7.

Google: PaLM 2 Chat

PaLM 2 is a language model by Google with improved multilingual, reasoning and coding capabilities. • 25804 context

15.2M tokens

336%

8.

Toppy M 7B (nitro)

A wild 7B parameter model that merges several models using the new task_arithmetic merge method from mergekit.

List of merged models:

- NousResearch/Nous-Capybara-7B-V1.9
- [HuggingFaceH4/zephyr-7b-beta](/models/huggingfaceh4/zephyr-7b-beta)
- lemonilia/AshhLimaRP-Mistral-7B
- Vulkane/120-Days-of-Sodom-LoRA-Mistral-7b
- Undi95/Mistral-pippa-sharegpt-7b-qlora

#merge #uncensored

Note: this is a higher-throughput version of [this model](/models/undi95/toppy-m-7b), and may have higher prices and slightly different outputs. • 4096 context

4.61M tokens

269%

9.

Mixtral 8x7B (base)

A pretrained generative Sparse Mixture of Experts, by Mistral AI. Incorporates 8 experts (feed-forward networks) for a total of 47B parameters. Base model (not fine-tuned for instructions) - see [Mixtral 8x7B Instruct](/models/mistralai/mixtral-8x7b-instruct) for an instruct-tuned model. #moe • 32768 context

4.85M tokens

186%

10.

Meta: Llama 3 8B Instruct

Meta's latest class of model (Llama 3) launched with a variety of sizes & flavors. This 8B instruct-tuned version was optimized for high-quality dialogue use cases. It has demonstrated strong performance compared to leading closed-source models in human evaluations. To read more about the model release, [click here](https://ai.meta.com/blog/meta-llama-3/). Usage of this model is subject to [Meta's Acceptable Use Policy](https://llama.meta.com/llama3/use-policy/). • 8192 context

6.13B tokens

153%

11.

Perplexity: Sonar 7B

Sonar is Perplexity's latest model family. It surpasses their earlier models in cost-efficiency, speed, and performance. The version of this model with Internet access is [Sonar 7B Online](/models/perplexity/sonar-small-online). • 16384 context

592K tokens

142%

12.

Google: Gemini Pro 1.0

Google's flagship text generation model. Designed to handle natural language tasks, multi-turn text and code chat, and code generation. See the benchmarks and prompting guidelines from [DeepMind](https://deepmind.google/technologies/gemini/). Usage of Gemini is subject to Google's [Gemini Terms of Use](https://ai.google.dev/terms). • 91728 context

615M tokens

102%

13.

Anthropic: Claude v1

Anthropic's model for low-latency, high-throughput text generation. Supports hundreds of pages of text. • 100000 context

7.22M tokens

94%

14.

Google: Gemini Pro Vision 1.0

Google's flagship multimodal model, supporting image and video in text or chat prompts for a text or code response. See the benchmarks and prompting guidelines from [DeepMind](https://deepmind.google/technologies/gemini/). Usage of Gemini is subject to Google's [Gemini Terms of Use](https://ai.google.dev/terms). #multimodal • 45875 context

10.1M tokens

88%

15.

Perplexity: PPLX 7B Chat

The smaller chat model by Perplexity Labs, with 7 billion parameters. Based on [Mistral 7B](/models/mistralai/mistral-7b-instruct). • 8192 context

776K tokens

74%

16.

Mistral OpenOrca 7B

A fine-tune of Mistral using the OpenOrca dataset. First 7B model to beat all other models <30B. • 8192 context

3.51M tokens

68%

17.

OpenAI: GPT-4 Turbo (older v1106)

The latest GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Training data: up to Apr 2023. **Note:** heavily rate limited by OpenAI while in preview. • 128000 context

337M tokens

66%

18.

OpenAI: GPT-4 (older v0314)

GPT-4-0314 is the first version of GPT-4 released, with a context length of 8,192 tokens, and was supported until June 14. Training data: up to Sep 2021. • 8191 context

3.05M tokens

55%

19.

OpenAI: GPT-4 Turbo Preview

The latest GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Training data: up to Dec 2023. **Note:** heavily rate limited by OpenAI while in preview. • 128000 context

166M tokens

51%

20.

Anthropic: Claude (older v1)

Anthropic's model for low-latency, high-throughput text generation. Supports hundreds of pages of text. • 100000 context

17.2M tokens

51%

LLM Rankings

Compare models by tokens processed

Weekly active models
