Nano GPT logo

NanoGPT

Explore Text Models

Discover AI language models for conversations, coding, and creative writing

google

Gemma 4 31B

Google's Gemma 4 31B instruction-tuned model on Tinfoil's TEE-backed inference path with provider attestation support. This route keeps reasoning disabled for faster direct answers while preserving structured output support.

Features

Context

262.1K

Max Output

131.1K

Date Added

Apr 4, 2026

Pricing

Input:

$0.14/1M

Output:

$0.40/1M

Est./msg:

$0.0003

Try Gemma 4 31B
google

Gemma 4 26B A4B Thinking

Google's Gemma 4 26B A4B instruction-tuned model with OpenRouter reasoning enabled, exposing structured reasoning for more deliberate coding, multimodal analysis, and long-context problem solving.

Features

Context

262.1K

Max Output

131.1K

Date Added

Apr 3, 2026

Pricing

Input:

$0.13/1M

Output:

$0.40/1M

Est./msg:

$0.0003

Try Gemma 4 26B A4B Thinking
zhipu

GLM 5V Turbo Thinking

Thinking-enabled GLM 5V Turbo for image, video, and text inputs. Uses the same multimodal foundation model, but keeps OpenRouter reasoning enabled for more deliberate vision-grounded analysis, planning, and tool use. Not included in the subscription.

Benchmarks (Artificial Analysis)

Intelligence

42.9

Coding

36.2

Features

Context

202.8K

Max Output

131.1K

Date Added

Apr 2, 2026

Pricing

Input:

$1.20/1M

Output:

$4.00/1M

Est./msg:

$0.0032

Try GLM 5V Turbo Thinking
qwen

Qwen 3.6 Plus

Qwen 3.6 Plus is a commercial native multimodal agent model with a major step up over the 3.5 series in agentic coding, front-end programming, OCR, object localization, and general vision-language performance. Supports text, image, and video input with a 1M context window.

Features

Context

991.8K

Max Output

65.5K

Date Added

Apr 2, 2026

Pricing

Input:

$0.45/1M

Output:

$2.70/1M

Cache:

Read $0.05/1M · Write $0.56/1M (5m) / $0.90/1M (1h)

Est./msg:

$0.0018

Try Qwen 3.6 Plus
google

Gemma 4 26B A4B

Google's Gemma 4 26B A4B instruction-tuned model built for scalable reasoning, coding, long-context, and multimodal workflows. This route keeps OpenRouter reasoning disabled for faster direct answers while preserving multimodal and structured output support.

Features

Context

262.1K

Max Output

131.1K

Date Added

Apr 2, 2026

Pricing

Input:

$0.13/1M

Output:

$0.40/1M

Est./msg:

$0.0003

Try Gemma 4 26B A4B
google

Gemma 4 31B

Google's Gemma 4 31B instruction-tuned model for heavier reasoning, coding, agentic workflows, and long-context multimodal understanding. This route keeps tokenizer thinking disabled for faster direct answers.

Features

Context

262.1K

Max Output

131.1K

Date Added

Apr 2, 2026

Pricing

Input:

$0.14/1M

Output:

$0.40/1M

Est./msg:

$0.0003

Try Gemma 4 31B
google

Gemma 4 31B Thinking

Google's Gemma 4 31B instruction-tuned model with NVIDIA tokenizer thinking explicitly enabled, exposing reasoning traces for complex multimodal and coding workflows.

Features

Context

262.1K

Max Output

131.1K

Date Added

Apr 2, 2026

Pricing

Input:

$0.14/1M

Output:

$0.40/1M

Est./msg:

$0.0003

Try Gemma 4 31B Thinking
zhipu

GLM 5V Turbo

Z.ai's native multimodal agent model for vision-based coding and agent workflows. This is the standard non-thinking variant for image, video, and text inputs, tuned for perceive-plan-execute loops, complex coding, and tool-driven task execution. Not included in the subscription.

Benchmarks (Artificial Analysis)

Intelligence

46.8

Coding

36.8

Features

Context

202.8K

Max Output

131.1K

Date Added

Apr 1, 2026

Pricing

Input:

$1.20/1M

Output:

$4.00/1M

Est./msg:

$0.0032

Try GLM 5V Turbo
arcee

Trinity Large Thinking

Open source Arcee reasoning model with a 262K context window, 80K max output, and native reasoning and tool support for agentic workloads.

Features

Context

262.1K

Max Output

80.0K

Date Added

Apr 1, 2026

Pricing

Input:

$0.25/1M

Output:

$0.90/1M

Est./msg:

$0.0007

Try Trinity Large Thinking

Grok 4.20

xAI's Grok 4.20 flagship release with tool calling, multimodal input support, and a 2M-token context window.

Benchmarks (Artificial Analysis)

Intelligence

48.5

Coding

42.2

Speed

243.3

Features

Context

2.0M

Max Output

131.1K

Date Added

Mar 31, 2026

Pricing

Input:

$2.00/1M

Output:

$6.00/1M

Est./msg:

$0.0050

View Providers

Try Grok 4.20

Grok 4.20 Multi-Agent

Grok 4.20 Multi-Agent is tuned for collaborative agentic workflows while keeping the same 2M-token context window and multimodal support.

Benchmarks (Artificial Analysis)

Intelligence

48.5

Coding

42.2

Speed

243.3

Features

Context

2.0M

Max Output

131.1K

Date Added

Mar 31, 2026

Pricing

Input:

$2.00/1M

Output:

$6.00/1M

Est./msg:

$0.0050

View Providers

Try Grok 4.20 Multi-Agent
qwen

Qwen3.5 Omni Flash

Qwen3.5 Omni Flash is Qwen's fast multimodal model. We verified live support for text prompts, images, audio files, and direct video URLs on Alibaba's chat-completions-compatible API. Alibaba describes Flash as a fully evolved version of Qwen3 Omni with audio input support across 60+ languages.

Features

Context

49.2K

Max Output

16.4K

Date Added

Mar 30, 2026

Pricing

Input:

$0.43/1M

Output:

$1.66/1M

Cache:

Read $0.04/1M · Write $0.54/1M (5m) / $0.86/1M (1h)

Est./msg:

$0.0013

Try Qwen3.5 Omni Flash
qwen

Qwen3.5 Omni Plus

Qwen3.5 Omni Plus is Qwen's stronger general multimodal model. We verified live support for text prompts, images, audio files, and direct video URLs on Alibaba's chat-completions-compatible API. Alibaba describes Plus as a comprehensive evolution of Qwen3 Omni with support for over 10 hours of audio input.

Benchmarks (Artificial Analysis)

Intelligence

38.6

Coding

27.6

Speed

50.4

Features

Context

983.6K

Max Output

65.5K

Date Added

Mar 30, 2026

Pricing

Input:

$0.40/1M

Output:

$2.40/1M

Cache:

Read $0.04/1M · Write $0.50/1M (5m) / $0.80/1M (1h)

Est./msg:

$0.0016

Try Qwen3.5 Omni Plus
anthropic

Claude Sonnet Latest

Compatibility alias that routes to the newest version of Claude Sonnet. Currently routes to Claude Sonnet 4.6 Thinking.

Benchmarks (Artificial Analysis)

Intelligence

51.7

Coding

50.9

Speed

50.4

Features

Context

1.0M

Max Output

128.0K

Date Added

Mar 29, 2026

Pricing

Input:

$2.99/1M

Output:

$14.99/1M

Cache:

Read $0.30/1M · Write $3.74/1M (5m) / $5.98/1M (1h)

Est./msg:

$0.0105

Try Claude Sonnet Latest
openai

GPT Latest

Compatibility alias that routes to the newest version of GPT. Currently routes to GPT 5.4.

Benchmarks (Artificial Analysis)

Intelligence

57.2

Coding

57.3

Speed

74.0

Features

Context

922.0K

Max Output

128.0K

Date Added

Mar 29, 2026

Pricing

Input:

$2.50/1M

Output:

$15.00/1M

Cache:

Read $0.25/1M

Est./msg:

$0.0100

Try GPT Latest
gemini

Gemini Flash Latest

Compatibility alias that routes to the newest version of Gemini Flash. Currently routes to Gemini 3 Flash Preview Thinking.

Benchmarks (Artificial Analysis)

Intelligence

46.4

Coding

42.6

Math

97.0

Speed

186.9

Features

Context

1.0M

Max Output

65.5K

Date Added

Mar 29, 2026

Pricing

Input:

$0.50/1M

Output:

$3.00/1M

Cache:

Read $0.05/1M

Est./msg:

$0.0020

Try Gemini Flash Latest
gemini

Gemini Pro Latest

Compatibility alias that routes to the newest version of Gemini Pro. Currently routes to Gemini 3.1 Pro Preview High.

Benchmarks (Artificial Analysis)

Intelligence

57.2

Coding

55.5

Speed

122.6

Features

Context

1.0M

Max Output

65.5K

Date Added

Mar 29, 2026

Pricing

Input:

$2.00/1M

Output:

$12.00/1M

Cache:

Read $0.20/1M

Est./msg:

$0.0080

Try Gemini Pro Latest
google

Gemini Flash Lite Latest

Compatibility alias that routes to the newest version of Gemini Flash Lite. Currently routes to Gemini 3.1 Flash Lite Preview.

Benchmarks (Artificial Analysis)

Intelligence

33.5

Coding

30.1

Speed

207.1

Features

Context

1.0M

Max Output

65.5K

Date Added

Mar 29, 2026

Pricing

Input:

$0.25/1M

Output:

$1.50/1M

Cache:

Read $0.03/1M

Est./msg:

$0.0010

Try Gemini Flash Lite Latest
anthropic

Claude Haiku Latest

Compatibility alias that routes to the newest version of Claude Haiku. Currently routes to Claude Haiku 4.5 Thinking.

Benchmarks (Artificial Analysis)

Intelligence

37.1

Coding

32.6

Math

83.7

Speed

124.6

Features

Context

200.0K

Max Output

64.0K

Date Added

Mar 29, 2026

Pricing

Input:

$1.00/1M

Output:

$5.00/1M

Cache:

Read $0.10/1M · Write $1.25/1M (5m) / $2.00/1M (1h)

Est./msg:

$0.0035

Try Claude Haiku Latest
anthropic

Claude Opus Latest

Compatibility alias that routes to the newest version of Claude Opus. Currently routes to Claude Opus 4.6 Thinking.

Benchmarks (Artificial Analysis)

Intelligence

53.0

Coding

48.1

Speed

49.8

Features

Context

1.0M

Max Output

128.0K

Date Added

Mar 29, 2026

Pricing

Input:

$5.00/1M

Output:

$25.01/1M

Cache:

Read $0.50/1M · Write $6.25/1M (5m) / $10.00/1M (1h)

Est./msg:

$0.0175

Try Claude Opus Latest

KAT Coder Pro V2

Latest high-performance coding model in Kwaipilot's KAT Coder series, built for complex software engineering, SaaS integration, and large-scale production workflows.

Context

256.0K

Max Output

80.0K

Date Added

Mar 28, 2026

Pricing

Input:

$0.30/1M

Output:

$1.20/1M

Est./msg:

$0.0009

Try KAT Coder Pro V2
zhipu

GLM 5.1

GLM-5.1 appears to be an improvement over GLM 5. Not much else is publicly clear yet. It is available as a pay-as-you-go model.

Features

Context

200.0K

Max Output

131.1K

Date Added

Mar 27, 2026

Pricing

Input:

$0.90/1M

Output:

$2.70/1M

Est./msg:

$0.0022

Try GLM 5.1
zhipu

GLM 5.1 Thinking

GLM-5.1 with thinking enabled. It appears to be an improvement over GLM 5, but public details are still limited. It is available as a pay-as-you-go model.

Features

Context

200.0K

Max Output

131.1K

Date Added

Mar 27, 2026

Pricing

Input:

$0.90/1M

Output:

$2.70/1M

Est./msg:

$0.0022

Try GLM 5.1 Thinking
qwen

Qwen3.5 27B Musica v1

Creative Qwen3.5 27B roleplay, story generation, and conversational finetune built on ArliAI's derestricted base with reasoning and vision support.

Benchmarks (Artificial Analysis)

Intelligence

40.1

Coding

37.4

Speed

63.4

Features

Context

262.1K

Date Added

Mar 27, 2026

Pricing

Input:

$0.31/1M

Output:

$0.31/1M

Est./msg:

$0.0005

Try Qwen3.5 27B Musica v1

TheDrummer Skyfall 31B v4.2

TheDrummer's Skyfall 31B v4.2 is a roleplay and storytelling model with a 131k context window.

Features

Context

131.1K

Date Added

Mar 26, 2026

Pricing

Input:

$0.55/1M

Output:

$0.80/1M

Est./msg:

$0.0010

Try TheDrummer Skyfall 31B v4.2
aionlabs

AionLabs: Aion-2.5

Aion-2.5 is a GLM-5 variant tuned for immersive roleplay and storytelling with stronger tension, conflict, and darker thematic nuance.

Context

131.1K

Max Output

32.8K

Date Added

Mar 20, 2026

Pricing

Input:

$1.00/1M

Output:

$3.00/1M

Est./msg:

$0.0025

Try AionLabs: Aion-2.5

MiMo V2 Omni

MiMo V2 Omni is Xiaomi's frontier omni model for multimodal agent workflows. It natively handles image, video, audio, and text in one shared architecture, with strong grounding, planning, and tool-use behavior.

Benchmarks (Artificial Analysis)

Intelligence

43.4

Coding

35.5

Features

Context

262.1K

Max Output

65.5K

Date Added

Mar 19, 2026

Pricing

Input:

$0.40/1M

Output:

$2.00/1M

Est./msg:

$0.0014

Try MiMo V2 Omni

MiMo V2 Pro

MiMo V2 Pro is Xiaomi's flagship foundation model for agent systems, built for coding, tool use, orchestration, and long-horizon workflows. It uses a trillion-parameter-class architecture with 42B active parameters and supports up to 1M context.

Benchmarks (Artificial Analysis)

Intelligence

49.2

Coding

41.4

Features

Context

1.0M

Max Output

131.1K

Date Added

Mar 19, 2026

Pricing

Input:

$1.00/1M

Output:

$3.00/1M

Est./msg:

$0.0025

Try MiMo V2 Pro
minimax

MiniMax M2.7

MiniMax M2.7 is the first model deeply involved in iterating on its own training. It excels in real-world software engineering (SWE-Pro 56.22%), end-to-end project delivery (VIBE-Pro 55.6%), and complex office workflows with strong tool-use compliance and agentic capabilities.

Benchmarks (Artificial Analysis)

Intelligence

49.6

Coding

41.9

Speed

41.3

Features

Context

204.8K

Max Output

131.1K

Date Added

Mar 18, 2026

Pricing

Input:

$0.30/1M

Output:

$1.20/1M

Est./msg:

$0.0009

View Providers

Try MiniMax M2.7
minimax

MiniMax M2.7 Turbo

MiniMax M2.7 Turbo is the highspeed and higher priced route for M2.7.

Benchmarks (Artificial Analysis)

Intelligence

49.6

Coding

41.9

Speed

41.3

Features

Context

204.8K

Max Output

131.1K

Date Added

Mar 18, 2026

Pricing

Input:

$0.60/1M

Output:

$2.40/1M

Est./msg:

$0.0018

View Providers

Try MiniMax M2.7 Turbo