
      Featherless

      Browse models provided by Featherless (Terms of Service)

      24 models

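      Every model in this catalog is reachable through OpenRouter's OpenAI-compatible chat completions API, so one request shape covers the whole list (a note on the per-token pricing follows the list). Below is a minimal sketch in Python, assuming an OPENROUTER_API_KEY environment variable and assuming the Llemma 7B slug is eleutherai/llemma_7b; the exact model ID is shown on each model's page, so treat the slug here as a placeholder.

```python
# Minimal sketch: call a Featherless-served model via OpenRouter's
# OpenAI-compatible chat completions endpoint.
# Assumptions: OPENROUTER_API_KEY is set, and the model slug below
# ("eleutherai/llemma_7b") matches the ID on the model's page.
import os

import requests

response = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        "model": "eleutherai/llemma_7b",  # assumed slug
        "messages": [
            {"role": "user", "content": "Prove that the sum of two even integers is even."},
        ],
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```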

      • EleutherAI: Llemma 7b

        Llemma 7B is a language model for mathematics. It was initialized with Code Llama 7B weights, and trained on the Proof-Pile-2 for 200B tokens. Llemma models are particularly strong at chain-of-thought mathematical reasoning and using computational tools for mathematics, such as Python and formal theorem provers.

        by eleutherai · 4K context · $0.80/M input tokens · $1.20/M output tokens
      • AlfredPros: CodeLLaMa 7B Instruct Solidity

        A 7-billion-parameter Code LLaMA Instruct model finetuned to generate Solidity smart contracts, using 4-bit QLoRA finetuning provided by the PEFT library.

        by alfredpros · 4K context · $0.80/M input tokens · $1.20/M output tokens
      • OpenHands LM 32B V0.1

        OpenHands LM v0.1 is a 32B open-source coding model fine-tuned from Qwen2.5-Coder-32B-Instruct using reinforcement learning techniques outlined in SWE-Gym. It is optimized for autonomous software development agents and achieves strong performance on SWE-Bench Verified, with a 37.2% resolve rate. The model supports a 128K token context window, making it well-suited for long-horizon code reasoning and large codebase tasks. OpenHands LM is designed for local deployment and runs on consumer-grade GPUs such as a single 3090. It enables fully offline agent workflows without dependency on proprietary APIs. This release is intended as a research preview, and future updates aim to improve generalizability, reduce repetition, and offer smaller variants.

        by all-hands · 131K context · $2.60/M input tokens · $3.40/M output tokens
      • Qwerky 72B (free variant)

        Qwerky-72B is a linear-attention RWKV variant of the Qwen 2.5 72B model, optimized to significantly reduce computational cost at scale. Leveraging linear attention, it achieves substantial inference speedups (>1000x) while retaining competitive accuracy on common benchmarks like ARC, HellaSwag, Lambada, and MMLU. It inherits knowledge and language support from Qwen 2.5, supporting approximately 30 languages, making it suitable for efficient inference in large-context applications.

        by featherless · 33K context · $0/M input tokens · $0/M output tokens
      • DeepSeek: R1

        DeepSeek R1 is here: Performance on par with OpenAI o1, but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass. Fully open-source model & technical report. MIT licensed: Distill & commercialize freely!

        by deepseek · 164K context · $6.50/M input tokens · $8/M output tokens
      • EVA Llama 3.33 70B

        EVA Llama 3.33 70B is a roleplay and storywriting specialist model. It is a full-parameter finetune of Llama-3.3-70B-Instruct on a mixture of synthetic and natural data. It uses the Celeste 70B 0.1 data mixture, greatly expanding it to improve the versatility, creativity, and "flavor" of the resulting model. This model was built with Llama by Meta.

        by eva-unit-01 · 16K context · $4/M input tokens · $6/M output tokens
      • EVA Qwen2.5 72B

        EVA Qwen2.5 72B is a roleplay and storywriting specialist model. It is a full-parameter finetune of Qwen2.5-72B on a mixture of synthetic and natural data. It uses the Celeste 70B 0.1 data mixture, greatly expanding it to improve the versatility, creativity, and "flavor" of the resulting model.

        by eva-unit-01 · 32K context · $4/M input tokens · $6/M output tokens
      • Infermatic: Mistral Nemo Inferor 12B

        Inferor 12B is a merge of top roleplay models, specializing in immersive narratives and storytelling. It was merged with the Model Stock method, using anthracite-org/magnum-v4-12b as a base.

        by infermatic · 32K context · $0.80/M input tokens · $1.20/M output tokens
      • Qwen2.5 Coder 32B Instruct

        Qwen2.5-Coder is the latest series of code-specific Qwen large language models (formerly known as CodeQwen). Qwen2.5-Coder brings the following improvements over CodeQwen1.5: significant improvements in code generation, code reasoning, and code fixing, and a more comprehensive foundation for real-world applications such as code agents. It not only enhances coding capabilities but also maintains its strengths in mathematics and general competencies. To read more about its evaluation results, check out the Qwen 2.5 Coder blog.

        by qwen · 128K context · $2.60/M input tokens · $3.40/M output tokens
      • EVA Qwen2.5 32B

        EVA Qwen2.5 32B is a roleplaying/storywriting specialist model. It is a full-parameter finetune of Qwen2.5-32B on a mixture of synthetic and natural data. It uses the Celeste 70B 0.1 data mixture, greatly expanding it to improve the versatility, creativity, and "flavor" of the resulting model.

        by eva-unit-01 · 32K context · $2.60/M input tokens · $3.40/M output tokens
      • Magnum v4 72B

        This is a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet (https://openrouter.ai/anthropic/claude-3.5-sonnet) and Opus (https://openrouter.ai/anthropic/claude-3-opus). The model is fine-tuned on top of Qwen2.5 72B.

        by anthracite-org · 33K context · $4/M input tokens · $6/M output tokens
      • NeverSleep: Lumimaid v0.2 70B

        Lumimaid v0.2 70B is a finetune of Llama 3.1 70B with a "HUGE step up dataset wise" compared to Lumimaid v0.1. Sloppy chat outputs were purged. Usage of this model is subject to Meta's Acceptable Use Policy.

        by neversleep · 131K context · $4/M input tokens · $6/M output tokens
      • Magnum v2 72B

        From the maker of Goliath, Magnum 72B is the seventh in a family of models designed to achieve the prose quality of the Claude 3 models, notably Opus & Sonnet. The model is based on Qwen2 72B and trained with 55 million tokens of highly curated roleplay (RP) data.

        by anthracite-org · 33K context · $4/M input tokens · $6/M output tokens
      • Rocinante 12B

        Rocinante 12B is designed for engaging storytelling and rich prose. Early testers have reported an expanded vocabulary with unique and expressive word choices, enhanced creativity for vivid narratives, and adventure-filled, captivating stories.

        by thedrummer · 33K context · $0.80/M input tokens · $1.20/M output tokens
      • NeverSleep: Lumimaid v0.2 8B

        Lumimaid v0.2 8B is a finetune of Llama 3.1 8B with a "HUGE step up dataset wise" compared to Lumimaid v0.1. Sloppy chat outputs were purged. Usage of this model is subject to Meta's Acceptable Use Policy.

        by neversleep · 131K context · $0.80/M input tokens · $1.20/M output tokens
      • Aetherwiing: Starcannon 12B

        Starcannon 12B v2 is a creative roleplay and story writing model, based on Mistral Nemo, using nothingiisreal/mn-celeste-12b as a base, with intervitens/mini-magnum-12b-v1.1 merged in using the TIES method. Although more similar to Magnum overall, the model remains very creative, with a pleasant writing style. It is recommended for people wanting more variety than Magnum, and yet more verbose prose than Celeste.

        by aetherwiing · 12K context · $0.80/M input tokens · $1.20/M output tokens
      • Mistral Nemo 12B Celeste

        A specialized story writing and roleplaying model based on Mistral's NeMo 12B Instruct. Fine-tuned on curated datasets including Reddit Writing Prompts and Opus Instruct 25K. This model excels at creative writing, offering improved NSFW capabilities, with smarter and more active narration. It demonstrates remarkable versatility in both SFW and NSFW scenarios, with strong Out of Character (OOC) steering capabilities, allowing fine-tuned control over narrative direction and character behavior. Check out the model's HuggingFace page for details on what parameters and prompts work best!

        by nothingiisreal · 32K context · $0.80/M input tokens · $1.20/M output tokens
      • Magnum 72B

        From the maker of Goliath, Magnum 72B is the first in a new family of models designed to achieve the prose quality of the Claude 3 models, notably Opus & Sonnet. The model is based on Qwen2 72B and trained with 55 million tokens of highly curated roleplay (RP) data.

        by alpindale · 16K context · $4/M input tokens · $6/M output tokens
      • NeverSleep: Llama 3 Lumimaid 70B

        The NeverSleep team is back, with a Llama 3 70B finetune trained on their curated roleplay data. Striking a balance between eRP and RP, Lumimaid was designed to be serious, yet uncensored when necessary. To enhance its overall intelligence and chat capability, roughly 40% of the training data was not roleplay. This provides a breadth of knowledge to access, while still keeping roleplay as the primary strength. Usage of this model is subject to Meta's Acceptable Use Policy.

        by neversleep · 8K context · $4/M input tokens · $6/M output tokens
      • NeverSleep: Llama 3 Lumimaid 8B

        The NeverSleep team is back, with a Llama 3 8B finetune trained on their curated roleplay data. Striking a balance between eRP and RP, Lumimaid was designed to be serious, yet uncensored when necessary. To enhance its overall intelligence and chat capability, roughly 40% of the training data was not roleplay. This provides a breadth of knowledge to access, while still keeping roleplay as the primary strength. Usage of this model is subject to Meta's Acceptable Use Policy.

        by neversleep · 25K context · $0.80/M input tokens · $1.20/M output tokens
      • Fimbulvetr 11B v2

        Creative writing model, routed with permission. It's fast, it keeps the conversation going, and it stays in character. If you submit a raw prompt, you can use Alpaca or Vicuna formats.

        by sao10k · 8K context · $0.80/M input tokens · $1.20/M output tokens
      • Toppy M 7B

        A wild 7B parameter model that merges several models using the new task_arithmetic merge method from mergekit. Merged models: NousResearch/Nous-Capybara-7B-V1.9, HuggingFaceH4/zephyr-7b-beta, lemonilia/AshhLimaRP-Mistral-7B, Vulkane/120-Days-of-Sodom-LoRA-Mistral-7b, and Undi95/Mistral-pippa-sharegpt-7b-qlora. #merge #uncensored

        by undi95 · 4K context · $0.80/M input tokens · $1.20/M output tokens
      • Pygmalion: Mythalion 13B

        A blend of the new Pygmalion-13b and MythoMax. #merge

        by pygmalionai · 8K context · $0.80/M input tokens · $1.20/M output tokens
      • ReMM SLERP 13B

        A recreation trial of the original MythoMax-L2-13B but with updated models. #merge

        by undi95 · 4K context · $0.80/M input tokens · $1.20/M output tokens
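
      The rates quoted on each entry above are dollars per million tokens, billed separately for input (prompt) and output (completion) tokens. As a quick sanity check on what a single request costs, here is a small sketch using DeepSeek R1's listed rates ($6.50/M input, $8/M output); the token counts are invented for illustration.

```python
# Estimate the dollar cost of one request from per-million-token rates.
# Rates taken from the DeepSeek R1 entry above; token counts are illustrative.

def request_cost(input_tokens: int, output_tokens: int,
                 input_rate_per_m: float, output_rate_per_m: float) -> float:
    """Cost in dollars, given rates quoted in $ per million tokens."""
    return (input_tokens * input_rate_per_m
            + output_tokens * output_rate_per_m) / 1_000_000

# A 2,000-token prompt with a 500-token completion on DeepSeek R1:
# 2,000 * $6.50/M + 500 * $8/M = $0.013 + $0.004 = $0.017
print(f"${request_cost(2_000, 500, 6.50, 8.00):.4f}")  # -> $0.0170
```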