NextBit

Browse models provided by NextBit (Terms of Service)

13 models

Tokens processed on OpenRouter

DeepSeek: DeepSeek V4 Pro
DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated parameters, supporting a 1M-token context window. It is designed for advanced reasoning, coding, and long-horizon agent workflows, with strong performance across knowledge, math, and software engineering benchmarks. Built on the same architecture as DeepSeek V4 Flash, it introduces a hybrid attention system for efficient long-context processing. Reasoning efforts `high` and `xhigh` are supported; `xhigh` maps to max reasoning. It is well suited for complex workloads such as full-codebase analysis, multi-step automation, and large-scale information synthesis, where both capability and efficiency are critical.
by deepseek | Apr 24, 2026 | 1.05M context | $1.55/M input tokens | $3/M output tokens

Google: Gemma 4 26B A4B

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B quality at a fraction of the compute cost. Supports multimodal input including text, images, and video (up to 60s at 1fps). Features a 256K token context window, native function calling, configurable thinking/reasoning mode, and structured output support. Released under Apache 2.0.

by google | Apr 3, 2026 | 262K context | $0.13/M input tokens | $0.40/M output tokens

Qwen: Qwen3.5-35B-A3B

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear attention mechanisms and a sparse mixture-of-experts model, achieving higher inference efficiency. Its overall performance is comparable to that of the Qwen3.5-27B.

by qwen | Feb 25, 2026 | 256K context | $0.23/M input tokens | $1.60/M output tokens

Mistral: Ministral 3 14B 2512

The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistral Small 3.2 24B counterpart. A powerful and efficient language model with vision capabilities.

by mistralai | Dec 2, 2025 | 128K context | $0.35/M input tokens | $0.35/M output tokens

Mistral: Ministral 3 8B 2512

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities.

by mistralai | Dec 2, 2025 | 128K context | $0.30/M input tokens | $0.30/M output tokens

Mistral: Ministral 3 3B 2512

The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision capabilities.

by mistralai | Dec 2, 2025 | 128K context | $0.15/M input tokens | $0.15/M output tokens

Qwen: Qwen3 30B A3B

Qwen3, the latest generation in the Qwen large language model series, features both dense and mixture-of-experts (MoE) architectures to excel in reasoning, multilingual support, and advanced agent tasks. It can switch between a thinking mode for complex reasoning and a non-thinking mode for efficient dialogue. Significantly outperforming prior models like QwQ and Qwen2.5, Qwen3 delivers superior mathematics, coding, commonsense reasoning, creative writing, and interactive dialogue capabilities. The Qwen3-30B-A3B variant has 30.5 billion parameters (3.3 billion activated), 48 layers, and 128 experts (8 activated per token), and supports contexts of up to 131K tokens with YaRN.

by qwen | Apr 28, 2025 | 131K context | $0.14/M input tokens | $0.55/M output tokens

Qwen: Qwen3 14B

Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series, designed for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for tasks like math, programming, and logical inference, and a "non-thinking" mode for general-purpose conversation. The model is fine-tuned for instruction-following, agent tool use, creative writing, and multilingual tasks across 100+ languages and dialects. It natively handles 32K token contexts and can extend to 131K tokens using YaRN-based scaling.

by qwen | Apr 28, 2025 | 132K context | $0.10/M input tokens | $0.24/M output tokens

Sao10K: Llama 3.3 Euryale 70B

Euryale L3.3 70B is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k). It is the successor of [Euryale L3 70B v2.2](/models/sao10k/l3-euryale-70b).

by sao10k | Dec 18, 2024 | 8K context | $0.65/M input tokens | $0.75/M output tokens

TheDrummer: UnslopNemo 12B

UnslopNemo v4.1 is the latest addition from the creator of Rocinante, designed for adventure writing and role-play scenarios.

by thedrummer | Nov 8, 2024 | 32K context | $0.40/M input tokens | $0.40/M output tokens

Google: Gemma 2 27B

Gemma 2 27B by Google is an open model built from the same research and technology used to create the [Gemini models](/models?q=gemini). Gemma models are well-suited for a variety of text generation tasks, including question answering, summarization, and reasoning. See the [launch announcement](https://blog.google/technology/developers/google-gemma-2/) for more details. Usage of Gemma is subject to Google's [Gemma Terms of Use](https://ai.google.dev/gemma/terms).

by google | Jul 13, 2024 | 8K context | $0.65/M input tokens | $0.65/M output tokens

ReMM SLERP 13B

An attempted recreation of the original MythoMax-L2-13B using updated models. #merge

by undi95 | Jul 22, 2023 | 4K context | $0.45/M input tokens | $0.65/M output tokens

MythoMax 13B

One of the highest performing and most popular fine-tunes of Llama 2 13B, with rich descriptions and roleplay. #merge

by gryphe | Jul 2, 2023 | 4K context | $0.06/M input tokens | $0.06/M output tokens