Discover AI language models for conversations, coding, and creative writing
Gemma 4 31B
Google's Gemma 4 31B instruction-tuned model on Tinfoil's TEE-backed inference path with provider attestation support. This route keeps reasoning disabled for faster direct answers while preserving structured output support.
Features
Context
262.1K
Max Output
131.1K
Date Added
Apr 4, 2026
Pricing
Input:
$0.14/1M
Output:
$0.40/1M
Est./msg:
$0.0003
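The page does not say how Est./msg is derived. The listed figures are consistent with a message of about 1,000 input tokens and 500 output tokens, so the sketch below uses those sizes as an assumption rather than a documented formula. The function name and default token counts are hypothetical.

```python
def estimate_message_cost(input_per_million, output_per_million,
                          input_tokens=1000, output_tokens=500):
    """Return the USD cost of one message from per-1M-token prices.

    The default token counts are an assumption inferred from the listed
    Est./msg values, not a figure published on the page.
    """
    return (input_tokens * input_per_million
            + output_tokens * output_per_million) / 1_000_000


# Gemma 4 31B: $0.14/1M input, $0.40/1M output
gemma_cost = estimate_message_cost(0.14, 0.40)
print(f"${gemma_cost:.4f}")  # prints $0.0003
```

Passing different token counts gives a closer estimate for long prompts or long replies, where the listed per-message figure will understate the cost.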
Gemma 4 26B A4B Thinking
Google's Gemma 4 26B A4B instruction-tuned model with OpenRouter reasoning enabled, exposing structured reasoning for more deliberate coding, multimodal analysis, and long-context problem solving.
Features
Context
262.1K
Max Output
131.1K
Date Added
Apr 3, 2026
Pricing
Input:
$0.13/1M
Output:
$0.40/1M
Est./msg:
$0.0003
GLM 5V Turbo Thinking
Thinking-enabled GLM 5V Turbo for image, video, and text inputs. Uses the same multimodal foundation model, but keeps OpenRouter reasoning enabled for more deliberate vision-grounded analysis, planning, and tool use. Not included in the subscription.
Benchmarks (Artificial Analysis)
Intelligence
42.9
Coding
36.2
Features
Context
202.8K
Max Output
131.1K
Date Added
Apr 2, 2026
Pricing
Input:
$1.20/1M
Output:
$4.00/1M
Est./msg:
$0.0032
Qwen 3.6 Plus
Qwen 3.6 Plus is a commercial, natively multimodal agent model that improves substantially on the 3.5 series in agentic coding, front-end programming, OCR, object localization, and general vision-language performance. It supports text, image, and video input with a context window of roughly 1M tokens.
Features
Context
991.8K
Max Output
65.5K
Date Added
Apr 2, 2026
Pricing
Input:
$0.45/1M
Output:
$2.70/1M
Cache:
Read $0.05/1M · Write $0.56/1M (5m) / $0.90/1M (1h)
Est./msg:
$0.0018
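For models that list cache prices, the cost of a request depends on how much of the prompt is read from or written to the cache. The sketch below assumes cached tokens are billed at the Read rate, newly cached tokens at the 5-minute Write rate, and everything else at the normal Input rate. That billing model is an assumption, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prices:
    """Per-1M-token prices in USD, as listed on a model card."""
    input_rate: float
    output_rate: float
    cache_read_rate: float
    cache_write_rate: float  # 5-minute write tier


def request_cost(prices, uncached_input, cache_read, cache_write, output):
    """Return the USD cost of one request; all token counts are plain integers."""
    total = (uncached_input * prices.input_rate
             + cache_read * prices.cache_read_rate
             + cache_write * prices.cache_write_rate
             + output * prices.output_rate)
    return total / 1_000_000


# Qwen 3.6 Plus prices from the listing above
qwen = Prices(input_rate=0.45, output_rate=2.70,
              cache_read_rate=0.05, cache_write_rate=0.56)

# First call writes a 10,000-token prompt to the cache; the repeat call reads it back.
first_call = request_cost(qwen, uncached_input=0, cache_read=0,
                          cache_write=10_000, output=500)
repeat_call = request_cost(qwen, uncached_input=0, cache_read=10_000,
                           cache_write=0, output=500)
print(f"first: ${first_call:.5f}, repeat: ${repeat_call:.5f}")
```

Under these assumptions the first call costs more than an uncached one because the Write rate exceeds the Input rate, and the saving only appears on later calls that reuse the cached prompt.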
Gemma 4 26B A4B
Google's Gemma 4 26B A4B instruction-tuned model built for scalable reasoning, coding, long-context, and multimodal workflows. This route keeps OpenRouter reasoning disabled for faster direct answers while preserving multimodal and structured output support.
Features
Context
262.1K
Max Output
131.1K
Date Added
Apr 2, 2026
Pricing
Input:
$0.13/1M
Output:
$0.40/1M
Est./msg:
$0.0003
Gemma 4 31B
Google's Gemma 4 31B instruction-tuned model for heavier reasoning, coding, agentic workflows, and long-context multimodal understanding. This route keeps tokenizer thinking disabled for faster direct answers.
Features
Context
262.1K
Max Output
131.1K
Date Added
Apr 2, 2026
Pricing
Input:
$0.14/1M
Output:
$0.40/1M
Est./msg:
$0.0003
Gemma 4 31B Thinking
Google's Gemma 4 31B instruction-tuned model with NVIDIA tokenizer thinking explicitly enabled, exposing reasoning traces for complex multimodal and coding workflows.
Features
Context
262.1K
Max Output
131.1K
Date Added
Apr 2, 2026
Pricing
Input:
$0.14/1M
Output:
$0.40/1M
Est./msg:
$0.0003
GLM 5V Turbo
Z.ai's native multimodal agent model for vision-based coding and agent workflows. This is the standard non-thinking variant for image, video, and text inputs, tuned for perceive-plan-execute loops, complex coding, and tool-driven task execution. Not included in the subscription.
Benchmarks (Artificial Analysis)
Intelligence
46.8
Coding
36.8
Features
Context
202.8K
Max Output
131.1K
Date Added
Apr 1, 2026
Pricing
Input:
$1.20/1M
Output:
$4.00/1M
Est./msg:
$0.0032
Trinity Large Thinking
Open-source reasoning model from Arcee with a 262K context window, 80K max output, and native reasoning and tool support for agentic workloads.
Features
Context
262.1K
Max Output
80.0K
Date Added
Apr 1, 2026
Pricing
Input:
$0.25/1M
Output:
$0.90/1M
Est./msg:
$0.0007
Grok 4.20
xAI's Grok 4.20 flagship release with tool calling, multimodal input support, and a 2M-token context window.
Benchmarks (Artificial Analysis)
Intelligence
48.5
Coding
42.2
Speed
243.3
Features
Context
2.0M
Max Output
131.1K
Date Added
Mar 31, 2026
Pricing
Input:
$2.00/1M
Output:
$6.00/1M
Est./msg:
$0.0050
Grok 4.20 Multi-Agent
Grok 4.20 Multi-Agent is tuned for collaborative agentic workflows while keeping the same 2M-token context window and multimodal support.
Benchmarks (Artificial Analysis)
Intelligence
48.5
Coding
42.2
Speed
243.3
Features
Context
2.0M
Max Output
131.1K
Date Added
Mar 31, 2026
Pricing
Input:
$2.00/1M
Output:
$6.00/1M
Est./msg:
$0.0050
Qwen3.5 Omni Flash
Qwen3.5 Omni Flash is Qwen's fast multimodal model. We verified live support for text prompts, images, audio files, and direct video URLs on Alibaba's chat-completions-compatible API. Alibaba describes Flash as a fully evolved version of Qwen3 Omni with audio input support across 60+ languages.
Features
Context
49.2K
Max Output
16.4K
Date Added
Mar 30, 2026
Pricing
Input:
$0.43/1M
Output:
$1.66/1M
Cache:
Read $0.04/1M · Write $0.54/1M (5m) / $0.86/1M (1h)
Est./msg:
$0.0013
Qwen3.5 Omni Plus
Qwen3.5 Omni Plus is Qwen's stronger general multimodal model. We verified live support for text prompts, images, audio files, and direct video URLs on Alibaba's chat-completions-compatible API. Alibaba describes Plus as a comprehensive evolution of Qwen3 Omni with support for over 10 hours of audio input.
Benchmarks (Artificial Analysis)
Intelligence
38.6
Coding
27.6
Speed
50.4
Features
Context
983.6K
Max Output
65.5K
Date Added
Mar 30, 2026
Pricing
Input:
$0.40/1M
Output:
$2.40/1M
Cache:
Read $0.04/1M · Write $0.50/1M (5m) / $0.80/1M (1h)
Est./msg:
$0.0016
Claude Sonnet Latest
Compatibility alias that routes to the newest version of Claude Sonnet. Currently routes to Claude Sonnet 4.6 Thinking.
Benchmarks (Artificial Analysis)
Intelligence
51.7
Coding
50.9
Speed
50.4
Features
Context
1.0M
Max Output
128.0K
Date Added
Mar 29, 2026
Pricing
Input:
$2.99/1M
Output:
$14.99/1M
Cache:
Read $0.30/1M · Write $3.74/1M (5m) / $5.98/1M (1h)
Est./msg:
$0.0105
GPT Latest
Compatibility alias that routes to the newest version of GPT. Currently routes to GPT 5.4.
Benchmarks (Artificial Analysis)
Intelligence
57.2
Coding
57.3
Speed
74.0
Features
Context
922.0K
Max Output
128.0K
Date Added
Mar 29, 2026
Pricing
Input:
$2.50/1M
Output:
$15.00/1M
Cache:
Read $0.25/1M
Est./msg:
$0.0100
Gemini Flash Latest
Compatibility alias that routes to the newest version of Gemini Flash. Currently routes to Gemini 3 Flash Preview Thinking.
Benchmarks (Artificial Analysis)
Intelligence
46.4
Coding
42.6
Math
97.0
Speed
186.9
Features
Context
1.0M
Max Output
65.5K
Date Added
Mar 29, 2026
Pricing
Input:
$0.50/1M
Output:
$3.00/1M
Cache:
Read $0.05/1M
Est./msg:
$0.0020
Gemini Pro Latest
Compatibility alias that routes to the newest version of Gemini Pro. Currently routes to Gemini 3.1 Pro Preview High.
Benchmarks (Artificial Analysis)
Intelligence
57.2
Coding
55.5
Speed
122.6
Features
Context
1.0M
Max Output
65.5K
Date Added
Mar 29, 2026
Pricing
Input:
$2.00/1M
Output:
$12.00/1M
Cache:
Read $0.20/1M
Est./msg:
$0.0080
Gemini Flash Lite Latest
Compatibility alias that routes to the newest version of Gemini Flash Lite. Currently routes to Gemini 3.1 Flash Lite Preview.
Benchmarks (Artificial Analysis)
Intelligence
33.5
Coding
30.1
Speed
207.1
Features
Context
1.0M
Max Output
65.5K
Date Added
Mar 29, 2026
Pricing
Input:
$0.25/1M
Output:
$1.50/1M
Cache:
Read $0.03/1M
Est./msg:
$0.0010
Claude Haiku Latest
Compatibility alias that routes to the newest version of Claude Haiku. Currently routes to Claude Haiku 4.5 Thinking.
Benchmarks (Artificial Analysis)
Intelligence
37.1
Coding
32.6
Math
83.7
Speed
124.6
Features
Context
200.0K
Max Output
64.0K
Date Added
Mar 29, 2026
Pricing
Input:
$1.00/1M
Output:
$5.00/1M
Cache:
Read $0.10/1M · Write $1.25/1M (5m) / $2.00/1M (1h)
Est./msg:
$0.0035
Claude Opus Latest
Compatibility alias that routes to the newest version of Claude Opus. Currently routes to Claude Opus 4.6 Thinking.
Benchmarks (Artificial Analysis)
Intelligence
53.0
Coding
48.1
Speed
49.8
Features
Context
1.0M
Max Output
128.0K
Date Added
Mar 29, 2026
Pricing
Input:
$5.00/1M
Output:
$25.01/1M
Cache:
Read $0.50/1M · Write $6.25/1M (5m) / $10.00/1M (1h)
Est./msg:
$0.0175
KAT Coder Pro V2
Latest high-performance coding model in Kwaipilot's KAT Coder series, built for complex software engineering, SaaS integration, and large-scale production workflows.
Context
256.0K
Max Output
80.0K
Date Added
Mar 28, 2026
Pricing
Input:
$0.30/1M
Output:
$1.20/1M
Est./msg:
$0.0009
GLM 5.1
GLM-5.1 appears to be an improvement over GLM 5, but few other details are public yet. It is available as a pay-as-you-go model.
Features
Context
200.0K
Max Output
131.1K
Date Added
Mar 27, 2026
Pricing
Input:
$0.90/1M
Output:
$2.70/1M
Est./msg:
$0.0022
GLM 5.1 Thinking
GLM-5.1 with thinking enabled. It appears to be an improvement over GLM 5, but public details are still limited. It is available as a pay-as-you-go model.
Features
Context
200.0K
Max Output
131.1K
Date Added
Mar 27, 2026
Pricing
Input:
$0.90/1M
Output:
$2.70/1M
Est./msg:
$0.0022
Qwen3.5 27B Musica v1
Creative Qwen3.5 27B roleplay, story generation, and conversational finetune built on ArliAI's derestricted base with reasoning and vision support.
Benchmarks (Artificial Analysis)
Intelligence
40.1
Coding
37.4
Speed
63.4
Features
Context
262.1K
Date Added
Mar 27, 2026
Pricing
Input:
$0.31/1M
Output:
$0.31/1M
Est./msg:
$0.0005
TheDrummer Skyfall 31B v4.2
TheDrummer's Skyfall 31B v4.2 is a roleplay and storytelling model with a 131K context window.
Features
Context
131.1K
Date Added
Mar 26, 2026
Pricing
Input:
$0.55/1M
Output:
$0.80/1M
Est./msg:
$0.0010
AionLabs: Aion-2.5
Aion-2.5 is a GLM-5 variant tuned for immersive roleplay and storytelling with stronger tension, conflict, and darker thematic nuance.
Context
131.1K
Max Output
32.8K
Date Added
Mar 20, 2026
Pricing
Input:
$1.00/1M
Output:
$3.00/1M
Est./msg:
$0.0025
MiMo V2 Omni
MiMo V2 Omni is Xiaomi's frontier omni model for multimodal agent workflows. It natively handles image, video, audio, and text in one shared architecture, with strong grounding, planning, and tool-use behavior.
Benchmarks (Artificial Analysis)
Intelligence
43.4
Coding
35.5
Features
Context
262.1K
Max Output
65.5K
Date Added
Mar 19, 2026
Pricing
Input:
$0.40/1M
Output:
$2.00/1M
Est./msg:
$0.0014
MiMo V2 Pro
MiMo V2 Pro is Xiaomi's flagship foundation model for agent systems, built for coding, tool use, orchestration, and long-horizon workflows. It uses a trillion-parameter-class architecture with 42B active parameters and supports up to 1M context.
Benchmarks (Artificial Analysis)
Intelligence
49.2
Coding
41.4
Features
Context
1.0M
Max Output
131.1K
Date Added
Mar 19, 2026
Pricing
Input:
$1.00/1M
Output:
$3.00/1M
Est./msg:
$0.0025
MiniMax M2.7
MiniMax describes M2.7 as the first model deeply involved in iterating on its own training. It excels in real-world software engineering (SWE-Pro 56.22%), end-to-end project delivery (VIBE-Pro 55.6%), and complex office workflows, with strong tool-use compliance and agentic capabilities.
Benchmarks (Artificial Analysis)
Intelligence
49.6
Coding
41.9
Speed
41.3
Features
Context
204.8K
Max Output
131.1K
Date Added
Mar 18, 2026
Pricing
Input:
$0.30/1M
Output:
$1.20/1M
Est./msg:
$0.0009
MiniMax M2.7 Turbo
MiniMax M2.7 Turbo is the high-speed, higher-priced route for M2.7.
Benchmarks (Artificial Analysis)
Intelligence
49.6
Coding
41.9
Speed
41.3
Features
Context
204.8K
Max Output
131.1K
Date Added
Mar 18, 2026
Pricing
Input:
$0.60/1M
Output:
$2.40/1M
Est./msg:
$0.0018