    NextBit

    Browse models provided by NextBit (Terms of Service)

    18 models

    [Chart: Tokens processed on OpenRouter]

  1. OpenAI: gpt-oss-20b

      gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimized for lower-latency inference and deployability on consumer or single-GPU hardware. The model is trained in OpenAI’s Harmony response format and supports reasoning level configuration, fine-tuning, and agentic capabilities including function calling, tool use, and structured outputs.

      by openai | 131K context | $0.10/M input tokens | $0.45/M output tokens
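
The entry above mentions function calling, tool use, and structured outputs. As a rough sketch of how such a model might be exercised through OpenRouter's OpenAI-compatible chat completions endpoint, the snippet below sends a single tool-enabled request. The `get_weather` tool schema and the `OPENROUTER_API_KEY` environment variable are assumptions for illustration, not details taken from this listing.

```python
import json
import os

import requests

# Hedged sketch: one tool-enabled request to gpt-oss-20b via OpenRouter's
# OpenAI-compatible chat completions endpoint. The get_weather tool is
# purely illustrative and would be implemented client-side.
response = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "openai/gpt-oss-20b",
        "messages": [
            {"role": "user", "content": "What's the weather in Berlin right now?"}
        ],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical tool name
                    "description": "Look up the current weather for a city",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    },
    timeout=60,
)
# The model should either answer directly or return a tool call for get_weather.
print(json.dumps(response.json(), indent=2))
```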
  2. Qwen: Qwen3 30B A3B

    Qwen3, the latest generation in the Qwen large language model series, features both dense and mixture-of-experts (MoE) architectures to excel in reasoning, multilingual support, and advanced agent tasks. Its unique ability to switch seamlessly between a thinking mode for complex reasoning and a non-thinking mode for efficient dialogue ensures versatile, high-quality performance. Significantly outperforming prior models like QwQ and Qwen2.5, Qwen3 delivers superior mathematics, coding, commonsense reasoning, creative writing, and interactive dialogue capabilities. The Qwen3-30B-A3B variant includes 30.5 billion parameters (3.3 billion activated), 48 layers, 128 experts (8 activated per task), and supports up to 131K token contexts with YaRN, setting a new standard among open-source models.

    by qwen | 131K context | $0.14/M input tokens | $0.55/M output tokens
  3. Qwen: Qwen3 14B

    Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series, designed for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for tasks like math, programming, and logical inference, and a "non-thinking" mode for general-purpose conversation. The model is fine-tuned for instruction-following, agent tool use, creative writing, and multilingual tasks across 100+ languages and dialects. It natively handles 32K token contexts and can extend to 131K tokens using YaRN-based scaling.

    by qwen | 132K context | $0.06/M input tokens | $0.24/M output tokens
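
Both Qwen3 entries above note that the native 32K window can be stretched to roughly 131K tokens with YaRN. A minimal sketch of what that looks like when loading the weights yourself with Hugging Face transformers, assuming the `Qwen/Qwen3-14B` checkpoint and the `rope_scaling` override documented on Qwen model cards (this is not NextBit's serving configuration):

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face repo; the listing itself does not name a checkpoint.
model_id = "Qwen/Qwen3-14B"

# YaRN rope scaling: a factor of 4.0 stretches the native 32,768-token window
# toward ~131K positions. Typically only worth enabling for genuinely long
# prompts, since static scaling can slightly degrade short-context quality.
config = AutoConfig.from_pretrained(model_id)
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
}

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, config=config)
```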
  4. Qwen: QwQ 32B

    QwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tuned models, QwQ, which is capable of thinking and reasoning, can achieve significantly enhanced performance in downstream tasks, especially hard problems. QwQ-32B is the medium-sized reasoning model, which is capable of achieving competitive performance against state-of-the-art reasoning models, e.g., DeepSeek-R1, o1-mini.

    by qwen | 131K context | $0.15/M input tokens | $0.40/M output tokens
  5. DeepSeek: R1 Distill Qwen 32B

    DeepSeek R1 Distill Qwen 32B is a distilled large language model based on Qwen 2.5 32B, using outputs from DeepSeek R1. It outperforms OpenAI's o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.

    Other benchmark results include:

    - AIME 2024 pass@1: 72.6
    - MATH-500 pass@1: 94.3
    - CodeForces Rating: 1691

    The model leverages fine-tuning from DeepSeek R1's outputs, enabling competitive performance comparable to larger frontier models.

    by deepseek | 128K context | $0.29/M input tokens | $0.29/M output tokens
  6. Microsoft: Phi 4

    Microsoft Research Phi-4 is designed to perform well in complex reasoning tasks and can operate efficiently in situations with limited memory or where quick responses are needed. At 14 billion parameters, it was trained on a mix of high-quality synthetic datasets, data from curated websites, and academic materials. It has undergone careful improvement to follow instructions accurately and maintain strong safety standards. It works best with English language inputs. For more information, please see the Phi-4 Technical Report.

    by microsoft | 16K context | $0.06/M input tokens | $0.14/M output tokens
  7. Sao10K: Llama 3.3 Euryale 70B

    Euryale L3.3 70B is a model from Sao10k focused on creative roleplay. It is the successor of Euryale L3 70B v2.2.

    by sao10k | 8K context | $0.65/M input tokens | $0.75/M output tokens
  8. TheDrummer: UnslopNemo 12B

    UnslopNemo v4.1 is the latest addition from the creator of Rocinante, designed for adventure writing and role-play scenarios.

    by thedrummer | 32K context | $0.40/M input tokens | $0.40/M output tokens
  9. TheDrummer: Rocinante 12B

    Rocinante 12B is designed for engaging storytelling and rich prose. Early testers have reported:

    - Expanded vocabulary with unique and expressive word choices
    - Enhanced creativity for vivid narratives
    - Adventure-filled and captivating stories

    by thedrummer | 33K context | $0.17/M input tokens | $0.43/M output tokens
  10. NeverSleep: Lumimaid v0.2 8B

    Lumimaid v0.2 8B is a finetune of Llama 3.1 8B with a "HUGE step up dataset wise" compared to Lumimaid v0.1. Sloppy chat outputs were purged. Usage of this model is subject to Meta's Acceptable Use Policy.

    by neversleep | 131K context | $0.09/M input tokens | $0.60/M output tokens
  11. Sao10K: Llama 3.1 Euryale 70B v2.2

    Euryale L3.1 70B v2.2 is a model from Sao10k focused on creative roleplay. It is the successor of Euryale L3 70B v2.1.

    by sao10k | 131K context | $0.65/M input tokens | $0.75/M output tokens
  12. Nous: Hermes 3 70B Instruct

    Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the board. Hermes 3 70B is a competitive, if not superior, finetune of the Llama-3.1 70B foundation model, focused on aligning LLMs to the user, with powerful steering capabilities and control given to the end user. The Hermes 3 series builds and expands on the Hermes 2 set of capabilities, including more powerful and reliable function calling and structured output capabilities, generalist assistant capabilities, and improved code generation skills.

    by nousresearch | 131K context | $0.30/M input tokens | $0.30/M output tokens
  13. Google: Gemma 2 27B

    Gemma 2 27B by Google is an open model built from the same research and technology used to create the Gemini models. Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. See the launch announcement for more details. Usage of Gemma is subject to Google's Gemma Terms of Use.

    by google | 8K context | $0.65/M input tokens | $0.65/M output tokens
  14. NousResearch: Hermes 2 Pro - Llama-3 8B

    Hermes 2 Pro is an upgraded, retrained version of Nous Hermes 2, consisting of an updated and cleaned version of the OpenHermes 2.5 Dataset, as well as a newly introduced Function Calling and JSON Mode dataset developed in-house.

    by nousresearch | 8K context | $0.025/M input tokens | $0.08/M output tokens
  15. Noromaid 20B

    A collab between IkariDev and Undi. This merge is suitable for RP, ERP, and general knowledge. #merge #uncensored

    by neversleep | 8K context | $1/M input tokens | $1.75/M output tokens
  16. Goliath 120B

    A large LLM created by combining two fine-tuned Llama 70B models into one 120B model. Combines Xwin and Euryale.

    Credits to:

    - @chargoddard for developing mergekit, the framework used to merge the model
    - @Undi95 for helping with the merge ratios

    #merge

    by alpindale | 6K context | $4/M input tokens | $5.50/M output tokens
  17. ReMM SLERP 13B

    An attempt to recreate the original MythoMax-L2-13B with updated models. #merge

    by undi95 | 4K context | $0.45/M input tokens | $0.65/M output tokens
  18. MythoMax 13B

    One of the highest performing and most popular fine-tunes of Llama 2 13B, with rich descriptions and roleplay. #merge

    by gryphe | 4K context | $0.06/M input tokens | $0.06/M output tokens
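
All prices in this listing are quoted per million tokens, so the cost of a single request is just a weighted sum of its prompt and completion lengths. A small sketch of that arithmetic (the token counts are invented for illustration):

```python
def request_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> float:
    """Estimate one request's cost from per-million-token prices."""
    return (
        input_tokens / 1_000_000 * input_price_per_m
        + output_tokens / 1_000_000 * output_price_per_m
    )


# Example with the gpt-oss-20b rates above ($0.10/M input, $0.45/M output):
# a 2,000-token prompt plus an 800-token completion costs about $0.00056.
print(f"${request_cost_usd(2_000, 800, 0.10, 0.45):.6f}")
```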