- Perplexity: PPLX 70B Online
The larger, internet-connected chat model by Perplexity Labs, based on Llama 2 70B. The online models are focused on delivering helpful, up-to-date, and factual responses. #online
by perplexity · 4k context · $0.00/M input tkns · $2.80/M output tkns · $5.00 / 1K requests · 410.2K tokens this week
- Perplexity: PPLX 7B Online
The smaller, internet-connected chat model by Perplexity Labs, based on Mistral 7B. The online models are focused on delivering helpful, up-to-date, and factual responses. #online
by perplexity · 4k context · $0.00/M input tkns · $0.28/M output tkns · $5.00 / 1K requests · 81.5K tokens this week
- Perplexity: PPLX 7B Chat
The smaller chat model by Perplexity Labs, with 7 billion parameters. Based on Mistral 7B.
by perplexity · 8k context · $0.07/M input tkns · $0.28/M output tkns · 36.1K tokens this week
- Perplexity: PPLX 70B Chat
The larger chat model by Perplexity Labs, with 70 billion parameters. Based on Llama 2 70B.
by perplexity · 4k context · $0.70/M input tkns · $2.80/M output tkns · 286.5K tokens this week
- Psyfighter 13B
A #merge model based on Llama-2-13B and made possible thanks to the compute provided by the KoboldAI community. It's a merge between:
- KoboldAI/LLaMA2-13B-Tiefighter
- chaoyi-wu/MedLLaMA_13B
- Doctor-Shotgun/llama-2-13b-chat-limarp-v2-merged
#merge
by jebcarter · 4k context · $1.00/M input tkns · $1.00/M output tkns · 3.9M tokens this week
- OpenChat 3.5
OpenChat is a library of open-source language models, fine-tuned with "C-RLFT (Conditioned Reinforcement Learning Fine-Tuning)" - a strategy inspired by offline reinforcement learning. It has been trained on mixed-quality data without preference labels.
by openchat · 8k context · $0.00/M input tkns · $0.00/M output tkns · 28.9M tokens this week
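To make the C-RLFT description above concrete, here is a toy sketch of the conditioning idea (one reading of the approach; the tag names and class rewards are illustrative assumptions, not OpenChat's actual code):

```python
# Toy illustration of C-RLFT-style conditioning: mixed-quality examples carry
# no preference labels, so each one is tagged with its coarse source class and
# its loss is weighted by a class-level reward. Tags/rewards are made up here.
SOURCE_REWARD = {"expert": 1.0, "suboptimal": 0.1}  # illustrative class rewards

def prepare(example):
    # Condition the model on the data source via a tag in the prompt.
    tagged_prompt = f"<|{example['source']}|> {example['prompt']}"
    return tagged_prompt, example["response"], SOURCE_REWARD[example["source"]]

def weighted_nll(per_token_nll, reward):
    # Standard supervised next-token loss on the response, scaled per class.
    return reward * sum(per_token_nll) / len(per_token_nll)

prompt, target, w = prepare(
    {"source": "expert", "prompt": "Explain SLERP.", "response": "..."})
print(prompt, w)  # '<|expert|> Explain SLERP.' 1.0
```

- MythoMist 7B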
From the creator of MythoMax, this model merges a suite of models to reduce word anticipation, ministrations, and other undesirable words in ChatGPT roleplaying data.
It combines Neural Chat 7B, Airoboros 7B, Toppy M 7B, Zephyr 7B beta, Nous Capybara 34B, OpenHermes 2.5, and many others.
#merge
by gryphe · 33k context · $0.00/M input tkns · $0.00/M output tkns · 73.2M tokens this week
- Noromaid 20B
A collab between IkariDev and Undi. This merge is suitable for RP, ERP, and general knowledge. #merge
by neversleep · 8k context · $2.25/M input tkns · $2.25/M output tkns · 25.9M tokens this week
- Neural Chat 7B v3.1
A fine-tuned model based on mistralai/Mistral-7B-v0.1, trained on the open-source dataset Open-Orca/SlimOrca and aligned with the DPO algorithm (a sketch of the loss follows this entry). For more details, refer to the blog: The Practice of Supervised Fine-tuning and Direct Preference Optimization on Habana Gaudi2.
by intel · 4k context · $5.00/M input tkns · $5.00/M output tkns · 2.3M tokens this week
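Since the card names DPO without defining it, here is a minimal sketch of the published DPO objective (a generic illustration, not Intel's training code; the tensor values are dummies):

```python
# Minimal sketch of the DPO (Direct Preference Optimization) loss. Assumes
# precomputed log-probs of the chosen and rejected completions under both the
# policy being trained and a frozen reference model.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Log-ratio of policy to reference for each completion.
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # Implicit reward margin; minimizing the loss prefers chosen completions,
    # with beta controlling regularization toward the reference model.
    logits = beta * (chosen_logratio - rejected_logratio)
    return -F.logsigmoid(logits).mean()

# Dummy log-probabilities for a toy batch of two preference pairs:
loss = dpo_loss(torch.tensor([-12.0, -9.5]), torch.tensor([-14.0, -11.0]),
                torch.tensor([-12.5, -9.8]), torch.tensor([-13.0, -10.5]))
print(loss)  # scalar loss
```

- Anthropic: Claude v2.1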
Claude 2.1 delivers advancements in key capabilities for enterprises, including an industry-leading 200K token context window, significant reductions in rates of model hallucination, system prompts, and a new beta feature: tool use.
by anthropic · 200k context · $8.00/M input tkns · $24.00/M output tkns · 139.5M tokens this week
- OpenHermes 2.5 Mistral 7B
A continuation of the OpenHermes 2 model, trained on additional code datasets. Potentially the most interesting finding from training on a good ratio of code instruction data (estimated at around 7-14% of the total dataset) was that it boosted several non-code benchmarks, including TruthfulQA, AGIEval, and the GPT4All suite. It did, however, reduce the BigBench benchmark score, but the net gain overall is significant.
by teknium · 4k context · $0.20/M input tkns · $0.20/M output tkns · 242.1M tokens this week
- Llava 13B
LLaVA is a large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding, achieving impressive chat capabilities mimicking GPT-4 and setting a new state-of-the-art accuracy on Science QA.
by haotian-liu · 2k context · $5.00/M input tkns · $5.00/M output tkns · 621.8K tokens this week
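For readers unfamiliar with this design, a minimal sketch of the general pattern follows. The dimensions and the single linear projection are assumptions for illustration, not the released checkpoints' exact configuration:

```python
# Sketch of a LLaVA-style architecture: a vision encoder produces patch
# features, a learned projection maps them into the language model's embedding
# space, and the resulting visual tokens are prepended to the text tokens.
import torch
import torch.nn as nn

vision_dim, llm_dim = 1024, 4096             # assumed feature sizes
projector = nn.Linear(vision_dim, llm_dim)   # learned vision-language bridge

image_feats = torch.randn(1, 256, vision_dim)  # e.g. 256 patch features
text_embeds = torch.randn(1, 32, llm_dim)      # embedded prompt tokens

visual_tokens = projector(image_feats)                    # map into token space
inputs = torch.cat([visual_tokens, text_embeds], dim=1)   # (1, 288, 4096)
# `inputs` would then be fed to the language decoder in place of plain embeddings.
print(inputs.shape)
```

- Nous: Capybara 34B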
This model is a fine-tune of Yi-34B, trained for 3 epochs on the Capybara dataset. It's the first 34B Nous model and the first 200K context length Nous model.
Note: This endpoint currently supports 32k context.
by nousresearch · 32k context · $2.00/M input tkns · $2.00/M output tkns · 3.6M tokens this week
- OpenAI: GPT-4 Vision (preview)
Ability to understand images, in addition to all other GPT-4 Turbo capabilities. Training data: up to Apr 2023.
Note: heavily rate limited by OpenAI while in preview.
by openai · 128k context · $10.00/M input tkns · $30.00/M output tkns · $14.45 / 1K input images · 3.2M tokens this week
- lzlv 70B
A Mythomax/MLewd_13B-style merge of selected 70B models. A multi-model merge of several LLaMA2 70B finetunes for roleplaying and creative work. The goal was to create a model that combines creativity with intelligence for an enhanced experience.
#merge
by lizpreciatior · 4k context · $0.56/M input tkns · $0.76/M output tkns · 78.5M tokens this week
- Toppy M 7B
A wild 7B parameter model that merges several models using the new task_arithmetic merge method from mergekit (a sketch of the idea follows this entry). List of merged models:
- NousResearch/Nous-Capybara-7B-V1.9
- HuggingFaceH4/zephyr-7b-beta
- lemonilia/AshhLimaRP-Mistral-7B
- Vulkane/120-Days-of-Sodom-LoRA-Mistral-7b
- Undi95/Mistral-pippa-sharegpt-7b-qlora
#merge
by undi95 · 33k context · $0.00/M input tkns · $0.00/M output tkns · 1.6B tokens this week
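As referenced in the Toppy M entry above, here is a minimal sketch of what a task-arithmetic merge does conceptually (an illustration of the method's idea, not mergekit's implementation; the toy tensors are made up):

```python
# Task-arithmetic merging: each fine-tune contributes its delta from the
# shared base model, and the scaled deltas are summed back onto the base.
import torch

def task_arithmetic_merge(base_state, finetuned_states, weights):
    """base_state / finetuned_states: dicts of parameter tensors sharing the
    same keys (e.g. from model.state_dict())."""
    merged = {}
    for name, base_param in base_state.items():
        delta = sum(w * (ft[name] - base_param)
                    for ft, w in zip(finetuned_states, weights))
        merged[name] = base_param + delta
    return merged

# Toy example with one "parameter" per model (both fine-tunes hypothetical):
base = {"w": torch.tensor([1.0, 2.0])}
ft_a = {"w": torch.tensor([1.5, 2.0])}
ft_b = {"w": torch.tensor([1.0, 3.0])}
print(task_arithmetic_merge(base, [ft_a, ft_b], weights=[0.5, 0.5]))
# {'w': tensor([1.2500, 2.5000])}
```

- Goliath 120B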
A large LLM created by combining two fine-tuned Llama 70B models into one 120B model. Combines Xwin and Euryale. #merge
by alpindale · 6k context · $7.03/M input tkns · $7.03/M output tkns · 108.6M tokens this week
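The mechanics behind stacking two 70B models into a 120B one are commonly called layer interleaving (a "frankenmerge"); the sketch below is a hypothetical illustration of that idea, with made-up layer ranges rather than Goliath's actual recipe:

```python
# Layer-stacking sketch: contiguous ranges of decoder layers from each parent
# model are interleaved into one deeper stack. The plan below is illustrative.
def stack_layers(xwin_layers: list, euryale_layers: list) -> list:
    plan = [("xwin", 0, 16), ("euryale", 8, 24), ("xwin", 16, 32)]  # made up
    merged = []
    for source, start, end in plan:
        parent = xwin_layers if source == "xwin" else euryale_layers
        merged.extend(parent[start:end])  # copy a contiguous layer range
    return merged

# With two toy 32-layer parents, this plan yields a 48-layer stack:
print(len(stack_layers(list(range(32)), list(range(32)))))  # 48
```

- Auto (best for prompt)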
Depending on their size, subject, and complexity, your prompts will be sent to MythoMax 13B, MythoMax 13B 8k, or GPT-4 Turbo. To see which model was used, visit Activity.
by openrouter · 128k context
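A hypothetical sketch of what such prompt-based routing could look like; the token heuristic and thresholds are assumptions for illustration, since the listing does not publish the actual routing logic:

```python
# Toy router choosing among the three models named above, by rough prompt
# size and a caller-supplied complexity flag. Heuristics are made up.
def route(prompt: str, complex_task: bool) -> str:
    approx_tokens = len(prompt) // 4       # rough chars-per-token estimate
    if complex_task:
        return "GPT-4 Turbo"               # hardest prompts
    if approx_tokens > 3000:               # unlikely to fit a 4k window
        return "MythoMax 13B 8k"
    return "MythoMax 13B"

print(route("Write a short scene between two rivals.", complex_task=False))
# MythoMax 13B
```

- OpenAI: GPT-3.5 Turbo 16k (preview)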
The latest GPT-3.5 Turbo model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Training data: up to Sep 2021.
by openai · 16k context · $1.00/M input tkns · $2.00/M output tkns · 19.2M tokens this week
- OpenAI: GPT-4 Turbo (preview)
The latest GPT-4 model with improved instruction following, JSON mode, reproducible outputs, parallel function calling, and more. Training data: up to Apr 2023.
Note: heavily rate limited by OpenAI while in preview.
by openai · 128k context · $10.00/M input tkns · $30.00/M output tkns · 82.1M tokens this week
- Hugging Face: Zephyr 7B
Zephyr is a series of language models that are trained to act as helpful assistants. Zephyr-7B-β is the second model in the series, and is a fine-tuned version of mistralai/Mistral-7B-v0.1 that was trained on a mix of publicly available, synthetic datasets using Direct Preference Optimization (DPO).
by huggingfaceh4 · 4k context · $0.00/M input tkns · $0.00/M output tkns · 22.8M tokens this week
- Google: PaLM 2 Chat 32k
PaLM 2 is Google's flagship language model with improved multilingual, reasoning and coding capabilities.
by google · 32k context · $0.50/M input tkns · $0.50/M output tkns · 2.8M tokens this week
- Google: PaLM 2 Code Chat 32k
PaLM 2 fine-tuned for chatbot conversations that help with code-related questions.
by google · 32k context · $0.50/M input tkns · $0.50/M output tkns · 2.9M tokens this week
- OpenHermes 2 Mistral 7B
Trained on 900k instructions, surpasses all previous versions of Hermes 13B and below, and matches 70B on some benchmarks. Hermes 2 has strong multiturn chat skills and system prompt capabilities.
by teknium · 4k context · $0.20/M input tkns · $0.20/M output tkns · 112.7M tokens this week
- Mistral OpenOrca 7B
A fine-tune of Mistral using the OpenOrca dataset. The first 7B model to beat all other models under 30B parameters.
by open-orca · 8k context · $0.20/M input tkns · $0.20/M output tkns · 285.0M tokens this week
- Airoboros 70B
A Llama 2 70B fine-tune using synthetic data (the Airoboros dataset).
by jondurbin · 4k context · $0.70/M input tkns · $0.95/M output tkns · 31.5M tokens this week
- MythoMax 13B 8k
One of the highest performing fine-tunes of Llama 2 13B, with rich descriptions and roleplay. Extended to 8k context length. #merge
by gryphe · 8k context · $1.12/M input tkns · $1.12/M output tkns · 126.1M tokens this week
- Nous: Hermes 70B
A state-of-the-art language model fine-tuned on over 300k instructions by Nous Research, with Teknium and Emozilla leading the fine tuning process.
by nousresearch · 4k context · $0.90/M input tkns · $0.90/M output tkns · 10.6M tokens this week
- Xwin 70B
Xwin-LM aims to develop and open-source alignment technology for LLMs. Our first release, built upon the Llama 2 base models, ranked top-1 on AlpacaEval. Notably, it's the first to surpass GPT-4 on this benchmark. The project will be continuously updated.
by xwin-lm · 8k context · $6.56/M input tkns · $6.56/M output tkns · 27.7M tokens this week
- Mistral 7B Instruct
A 7.3B parameter model that outperforms Llama 2 13B on all benchmarks, with optimizations for speed and context length.
by mistralai · 8k context · $0.00/M input tkns · $0.00/M output tkns · 48.2M tokens this week
- OpenAI: GPT-3.5 Turbo Instruct
This model is a variant of GPT-3.5 Turbo tuned for instructional prompts, omitting chat-related optimizations. Training data: up to Sep 2021.
by openai · 4k context · $1.50/M input tkns · $2.00/M output tkns · 5.2M tokens this week
- Synthia 70B
SynthIA (Synthetic Intelligent Agent) is a Llama 2 70B model trained on Orca-style datasets. It has been fine-tuned for instruction following as well as long-form conversation.
by migtissera · 8k context · $6.56/M input tkns · $6.56/M output tkns · 5.0M tokens this week
- Pygmalion: Mythalion 13B
A blend of the new Pygmalion-13b and MythoMax. #merge
by pygmalionai · 8k context · $1.12/M input tkns · $1.12/M output tkns · 28.6M tokens this week
- OpenAI: GPT-3.5 Turbo 16k
This model offers four times the context length of gpt-3.5-turbo, allowing it to support approximately 20 pages of text in a single request at a higher cost. Training data: up to Sep 2021.
by openai · 16k context · $3.00/M input tkns · $4.00/M output tkns · 34.7M tokens this week
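As a rough check on such page estimates (back-of-the-envelope arithmetic, not an official spec): 16,384 tokens at roughly 0.75 words per token is about 12,000 words, which at a typical 500-600 words per page comes to roughly 20-24 pages; the same arithmetic puts the 32k models below at around 40 pages.

- OpenAI: GPT-4 32k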
GPT-4-32k is an extended version of GPT-4, with the same capabilities but quadrupled context length, allowing for processing up to 40 pages of text in a single pass. This is particularly beneficial for handling longer content like interacting with PDFs without an external vector database. Training data: up to Sep 2021.
by openai · 33k context · $60.00/M input tkns · $120.00/M output tkns · 23.3M tokens this week
- OpenAI: GPT-4 32k (older v0314)
GPT-4-32k is an extended version of GPT-4, with the same capabilities but quadrupled context length, allowing for processing up to 40 pages of text in a single pass. This is particularly beneficial for handling longer content like interacting with PDFs without an external vector database. Training data: up to Sep 2021.
by openai · 33k context · $60.00/M input tkns · $120.00/M output tkns · 1.3M tokens this week
- Nous: Hermes 13B
A state-of-the-art language model fine-tuned on over 300k instructions by Nous Research, with Teknium and Emozilla leading the fine tuning process.
by nousresearch · 4k context · $0.15/M input tkns · $0.15/M output tkns · 117.2M tokens this week
- Phind: CodeLlama 34B v2
A fine-tune of CodeLlama-34B on an internal dataset that helps it exceed GPT-4 on some benchmarks, including HumanEval.
by phind · 4k context · $0.40/M input tkns · $0.40/M output tkns · 2.2M tokens this week
- Meta: CodeLlama 34B Instruct
Code Llama is built upon Llama 2 and excels at filling in code, handling extensive input contexts, and following programming instructions without prior training for various programming tasks.
by meta-llama · 16k context · $0.35/M input tkns · $1.40/M output tkns · 2.5M tokens this week
- Mancer: Weaver (alpha)
An attempt to recreate Claude-style verbosity, but don't expect the same level of coherence or memory. Meant for use in roleplay/narrative situations.
by mancer · 8k context · $4.50/M input tkns · $4.50/M output tkns · 32.9M tokens this week
- Anthropic: Claude v2.0
Anthropic's flagship model. Superior performance on tasks that require complex reasoning. Supports up to 100k tokens in one pass, or hundreds of pages of text.
by anthropic · 100k context · $8.00/M input tkns · $24.00/M output tkns · 25.4M tokens this week
- Anthropic: Claude Instant v1
Anthropic's model for low-latency, high throughput text generation. Supports up to 100k tokens in one pass, or hundreds of pages of text.
by anthropic · 100k context · $1.63/M input tkns · $5.51/M output tkns · 73.5M tokens this week
- Anthropic: Claude v1
Anthropic's model for low-latency, high throughput text generation. Supports up to 100k tokens in one pass, or hundreds of pages of text.
by anthropic · 9k context · $8.00/M input tkns · $24.00/M output tkns · 2.0M tokens this week
- Anthropic: Claude (older v1)
Anthropic's model for low-latency, high throughput text generation. Supports up to 100k tokens in one pass, or hundreds of pages of text.
by anthropic · 9k context · $8.00/M input tkns · $24.00/M output tkns · 1.9M tokens this week
- Anthropic: Claude Instant 100k v1
Anthropic's model for low-latency, high throughput text generation. Supports up to 100k tokens in one pass, or hundreds of pages of text.
by anthropic · 100k context · $1.63/M input tkns · $5.51/M output tkns · 1.1M tokens this week
- Anthropic: Claude 100k v1
Anthropic's model for low-latency, high throughput text generation. Supports up to 100k tokens in one pass, or hundreds of pages of text.
by anthropic · 100k context · $8.00/M input tkns · $24.00/M output tkns · 19.8M tokens this week
- Anthropic: Claude Instant (older v1)
Anthropic's model for low-latency, high throughput text generation. Supports up to 100k tokens in one pass, or hundreds of pages of text.
by anthropic · 9k context · $1.63/M input tkns · $5.51/M output tkns · 296.2K tokens this week
- ReMM SLERP 13B
A recreation trial of the original MythoMax-L2-13B, but with updated models. #merge
by undi95 · 6k context · $1.12/M input tkns · $1.12/M output tkns · 511.1M tokens this week
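The SLERP in this model's name refers to spherical linear interpolation of model weights. Below is a generic sketch of that formula applied per-tensor (an illustration of the mechanism in general, not this model's exact merge recipe):

```python
# SLERP between two flattened weight tensors a and b at fraction t: interpolate
# along the great circle between their directions rather than along a straight
# line, which better preserves weight magnitude.
import numpy as np

def slerp(a, b, t):
    a_n, b_n = a / np.linalg.norm(a), b / np.linalg.norm(b)
    omega = np.arccos(np.clip(np.dot(a_n, b_n), -1.0, 1.0))  # angle between
    if np.isclose(omega, 0.0):
        return (1 - t) * a + t * b       # nearly parallel: fall back to lerp
    return (np.sin((1 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)

print(slerp(np.array([1.0, 0.0]), np.array([0.0, 1.0]), 0.5))
# [0.70710678 0.70710678]
```

- Google: PaLM 2 Chat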
PaLM 2 is Google's flagship language model with improved multilingual, reasoning and coding capabilities.
by google · 9k context · $0.50/M input tkns · $0.50/M output tkns · 34.1M tokens this week
- Google: PaLM 2 Code Chat
PaLM 2 fine-tuned for chatbot conversations that help with code-related questions.
by google · 7k context · $0.50/M input tkns · $0.50/M output tkns · 2.4M tokens this week
- MythoMax 13B
One of the highest performing fine-tunes of Llama 2 13B, with rich descriptions and roleplay. #merge
by gryphe · 4k context · $0.60/M input tkns · $0.60/M output tkns · 3.1B tokens this week
- Meta: Llama v2 13B Chat
A 13 billion parameter language model from Meta, fine-tuned for chat completions.
by meta-llama · 4k context · $0.23/M input tkns · $0.23/M output tkns · 48.0M tokens this week
- Meta: Llama v2 70B Chat
A 70 billion parameter language model from Meta, fine-tuned for chat completions.
by meta-llama · 4k context · $0.70/M input tkns · $0.95/M output tkns · 19.5M tokens this week
- OpenAI: GPT-3.5 Turbo
GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and traditional completion tasks. Training data: up to Sep 2021.
by openai · 4k context · $1.00/M input tkns · $2.00/M output tkns · 122.2M tokens this week
- OpenAI: GPT-3.5 Turbo (older v0301)
GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and traditional completion tasks. Training data: up to Sep 2021.
by openai · 4k context · $1.00/M input tkns · $2.00/M output tkns · 822.8K tokens this week
- OpenAI: GPT-4
OpenAI's flagship model, GPT-4 is a large-scale multimodal language model capable of solving difficult problems with greater accuracy than previous models due to its broader general knowledge and advanced reasoning capabilities. Training data: up to Sep 2021.
by openai · 8k context · $30.00/M input tkns · $60.00/M output tkns · 71.7M tokens this week
- OpenAI: GPT-4 (older v0314)
GPT-4-0314 is the first version of GPT-4 released, with a context length of 8,192 tokens, and was supported until June 14. Training data: up to Sep 2021.
by openai · 8k context · $30.00/M input tkns · $60.00/M output tkns · 2.3M tokens this week
- OpenAI: Davinci 2
An InstructGPT model derived from the code-davinci-002 model, designed to follow instructions in prompts to provide detailed responses. Training data: up to Sep 2021.
by openai · 4k context · $20.00/M input tkns · $20.00/M output tkns · 21.7K tokens this week