Discover AI language models for conversations, coding, and creative writing
Hermes 3 70B
nousresearch/hermes-3-llama-3.1-70b
Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, better roleplaying, reasoning, multi-turn conversation, and long context coherence. This 70B model is a competitive finetune of Llama-3.1-70B focused on aligning LLMs to the user with powerful steering capabilities.
Context
65.5K
Date Added
Jan 7, 2026
Pricing
Input:
$0.41/1M
Output:
$0.41/1M
MiroThinker v1.5 235B
miromind-ai/MiroThinker-v1.5-235B
MiroThinker is the official implementation of the MiroMind Research Agent Project. It is an open-source search agent designed to advance tool-augmented reasoning and information-seeking capabilities, enabling complex real-world research workflows across diverse challenges.
Date Added
Jan 7, 2026
Pricing
Input:
$0.30/1M
Output:
$1.20/1M
Gemini 3 Pro Image
gemini-3-pro-image-preview
Gemini 3 Pro with native image generation support. It can generate images inline from text prompts.
Context
1.0M
Date Added
Jan 5, 2026
Pricing
Input:
$2.00/1M
Output:
$12.00/1M
The Drummer Cydonia 24B v4.3
TheDrummer/Cydonia-24B-v4.3
Cydonia 24B v4.3 continues TheDrummer's Cydonia series with updated tuning on Mistral Small.
Context
16.4K
Date Added
Dec 25, 2025
Pricing
Input:
$0.10/1M
Output:
$0.12/1M
The Drummer Magidonia 24B v4.3
TheDrummer/Magidonia-24B-v4.3
Magidonia 24B v4.3 is a new 24B Drummer finetune built for rich, creative roleplay.
Context
16.4K
Date Added
Dec 25, 2025
Pricing
Input:
$0.10/1M
Output:
$0.12/1M
GLM 4.6 Derestricted v5
GLM-4.6-Derestricted-v5
Derestricted GLM 4.6 tuned for open-ended creative writing and roleplay with relaxed filters.
Context
131.1K
Date Added
Dec 23, 2025
Pricing
Input:
$0.40/1M
Output:
$1.50/1M
GLM 4.7 Original
zai-org/glm-4.7-original
GLM-4.7 is a next-gen GLM series text model with stronger reasoning, long-context chat, and reliable tool use. Routed directly via Z-AI (Zhipu).
Context
200.0K
Date Added
Dec 22, 2025
Pricing
Input:
$0.15/1M
Output:
$0.80/1M
GLM 4.7 Original Thinking
zai-org/glm-4.7-original:thinking
GLM-4.7 original with extended thinking capabilities for complex reasoning.
Context
200.0K
Date Added
Dec 22, 2025
Pricing
Input:
$0.15/1M
Output:
$0.80/1M
GLM 4.7
zai-org/glm-4.7
GLM-4.7 is a next-gen GLM series text model with stronger reasoning, long-context chat, and reliable tool use.
Context
200.0K
Date Added
Dec 22, 2025
Pricing
Input:
$0.15/1M
Output:
$0.80/1M
GLM 4.7 Thinking
zai-org/glm-4.7:thinking
GLM-4.7 with extended thinking capabilities for enhanced reasoning on complex tasks.
Context
200.0K
Date Added
Dec 22, 2025
Pricing
Input:
$0.15/1M
Output:
$0.80/1M
Manta Mini 1.0
meganova-ai/manta-mini-1.0
Lightweight tier optimized for speed and cost.
Context
8.2K
Date Added
Dec 20, 2025
Pricing
Input:
$0.02/1M
Output:
$0.16/1M
Manta Flash 1.0
meganova-ai/manta-flash-1.0
The flagship model for balanced reasoning and context-rich dialogue. Perfect for AI roleplay, storytelling, and assistant tasks with a 16K window.
Context
16.4K
Date Added
Dec 20, 2025
Pricing
Input:
$0.02/1M
Output:
$0.16/1M
Manta Pro 1.0
meganova-ai/manta-pro-1.0
Tailored for deep reasoning, long-form generation, and RAG workloads, with a 32K-token context window.
Context
32.8K
Date Added
Dec 20, 2025
Pricing
Input:
$0.06/1M
Output:
$0.50/1M
MiniMax M2.1
minimax/minimax-m2.1
MiniMax M2.1 builds on M2 with enhanced context understanding and improved complex tool use. 230B parameter MoE model (10B active) optimized for agentic workflows and long-horizon tasks.
Context
200.0K
Date Added
Dec 19, 2025
Pricing
Input:
$0.33/1M
Output:
$1.32/1M
Gemini 3 Flash (Preview)
google/gemini-3-flash-preview
Google's Gemini 3 Flash preview model optimized for speed while maintaining high capability. Features sub-second response times with strong multimodal understanding and reasoning.
Context
1.0M
Date Added
Dec 17, 2025
Pricing
Input:
$0.50/1M
Output:
$3.00/1M
Gemini 3 Flash Thinking
google/gemini-3-flash-preview-thinking
Google's Gemini 3 Flash preview model with thinking mode enabled for enhanced reasoning and chain-of-thought capabilities.
Context
1.0M
Date Added
Dec 17, 2025
Pricing
Input:
$0.50/1M
Output:
$3.00/1M
Mistral Small Creative
mistralai/mistral-small-creative
Mistral Small Creative is an experimental model designed for creative writing, narrative generation, roleplay and character-driven dialogue, general-purpose instruction following, and conversational agents.
Context
32.8K
Date Added
Dec 16, 2025
Pricing
Input:
$0.10/1M
Output:
$0.30/1M
Nvidia Nemotron 3 Nano 30B
nvidia/nemotron-3-nano-30b-a3b
Nvidia's latest Nemotron 3 Nano model with 30B total parameters (3B active) using hybrid Mamba-Transformer MoE architecture. Features excellent throughput and strong reasoning capabilities.
Context
256.0K
Date Added
Dec 15, 2025
Pricing
Input:
$0.17/1M
Output:
$0.68/1M
RNJ-1 Instruct 8B
essentialai/rnj-1-instruct
Essential AI's RNJ-1 Instruct 8B model. A capable instruction-following model optimized for general chat and task completion.
Context
128.0K
Date Added
Dec 13, 2025
Pricing
Input:
$0.15/1M
Output:
$0.15/1M
GLM 4.5 Air Derestricted Iceblink ReExtract
GLM-4.5-Air-Derestricted-Iceblink-ReExtract
ReExtract variant of the Iceblink LoRA for GLM 4.5 Air with refined extractions for creative writing.
Context
131.1K
Date Added
Dec 12, 2025
Pricing
Input:
$0.31/1M
Output:
$0.31/1M
GLM 4.5 Air Derestricted Iceblink v2 ReExtract
GLM-4.5-Air-Derestricted-Iceblink-v2-ReExtract
ReExtract variant of the v2 Iceblink LoRA for GLM 4.5 Air with enhanced creative extractions.
Context
131.1K
Date Added
Dec 12, 2025
Pricing
Input:
$0.31/1M
Output:
$0.31/1M
GLM 4.5 Air Derestricted Steam ReExtract
GLM-4.5-Air-Derestricted-Steam-ReExtract
ReExtract variant of the Steam LoRA for GLM 4.5 Air with refined uncensored extractions for creative RP.
Context
131.1K
Date Added
Dec 12, 2025
Pricing
Input:
$0.31/1M
Output:
$0.31/1M
Mixtral 8x7B
mistralai/mixtral-8x7b-instruct-v0.1
Mixtral 8x7B is a high-quality sparse Mixture of Experts (MoE) model with about 47B total parameters but only about 13B active per token. It excels at text generation, summarization, question answering, and code generation. Supports English, French, German, Spanish, and Italian. Apache 2.0 licensed.
Context
32.8K
Date Added
Dec 11, 2025
Pricing
Input:
$0.27/1M
Output:
$0.27/1M
Mixtral 8x22B
mistralai/mixtral-8x22b-instruct-v0.1
Mixtral 8x22B is a powerful sparse Mixture of Experts (MoE) model with 141B total parameters and 39B active per token. Features a 64K context window, exceptional math performance, and cost-efficient inference. Supports English, French, German, Spanish, and Italian. Apache 2.0 licensed.
Context
65.5K
Date Added
Dec 11, 2025
Pricing
Input:
$0.90/1M
Output:
$0.90/1M
GLM 4.6 Original
zai-org/glm-4.6-original
GLM-4.6 is Zhipu's flagship text model, with a 256K context window and advanced reasoning capabilities. Routed directly via Z-AI (Zhipu).
Context
256.0K
Date Added
Dec 11, 2025
Pricing
Input:
$0.40/1M
Output:
$1.50/1M
GLM 4.6V
zai-org/glm-4.6v
GLM-4.6V extends its context window to 128K tokens during training and achieves state-of-the-art visual understanding among models of similar parameter scale. It integrates native function calling, bridging visual perception and executable action for multimodal agents. Quantized to FP8.
Context
128.0K
Date Added
Dec 11, 2025
Pricing
Input:
$0.30/1M
Output:
$0.90/1M
DeepSeek V3.1 Nex N1
nex-agi/deepseek-v3.1-nex-n1
Nex-AGI's flagship 671B agentic model built on DeepSeek V3.1. Optimized for programming, tool use, web search, multi-hop reasoning, and mini-app development. Industry-leading results on SWE-bench Verified (70.6%), τ2-Bench (80.2%), and BFCL v4 (65.3%). Full production-ready agent capabilities with 128K context.
Context
128.0K
Date Added
Dec 10, 2025
Pricing
Input:
$0.28/1M
Output:
$0.42/1M
Devstral 2 123B
mistralai/devstral-2-123b-instruct-2512
Devstral 2 123B is a 123 billion parameter model from Mistral AI optimized for coding and development tasks. Features advanced reasoning capabilities for software engineering workflows.
Context
262.1K
Date Added
Dec 9, 2025
Pricing
Input:
$0.40/1M
Output:
$1.40/1M
GLM 4.6V Flash
zai-org/glm-4.6v-flash-original
GLM-4.6V-Flash (9B) is a lightweight model optimized for local deployment and low-latency applications. It extends the context window to 128K tokens and achieves state-of-the-art visual understanding among similar-scale models.
Context
128.0K
Date Added
Dec 8, 2025
Pricing
Input:
$0.10/1M
Output:
$0.40/1M
GLM 4.6V Original
zai-org/glm-4.6v-original
GLM-4.6V extends its context window to 128K tokens during training and achieves state-of-the-art visual understanding among models of similar parameter scale. It integrates native function calling, bridging visual perception and executable action for multimodal agents. Routed directly via Z-AI (Zhipu).
Context
128.0K
Date Added
Dec 8, 2025
Pricing
Input:
$0.60/1M
Output:
$0.90/1M
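
Estimating request cost

Prices in this listing are quoted in US dollars per 1M tokens, separately for input and output. The sketch below shows one way to turn those rates into a rough per-request estimate. It assumes billing is strictly linear in token count, with no minimum charge, caching discount, or separate rate for thinking tokens; the listing does not state any of this. The `ModelPricing` class and `estimate_cost` method are hypothetical names used only for illustration, and the two price pairs are copied from the MiniMax M2.1 and GLM 4.7 entries above.

```python
from dataclasses import dataclass
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


@dataclass(frozen=True)
class ModelPricing:
    """Hypothetical container for one listing's per-1M-token rates (USD)."""

    model_id: str
    input_usd_per_million: Decimal
    output_usd_per_million: Decimal

    def estimate_cost(self, input_tokens: int, output_tokens: int) -> Decimal:
        """Return an estimated cost in USD, assuming linear per-token billing."""
        if input_tokens < 0 or output_tokens < 0:
            raise ValueError("token counts must be non-negative")
        total = (
            self.input_usd_per_million * input_tokens
            + self.output_usd_per_million * output_tokens
        )
        return total / TOKENS_PER_MILLION


# Rates copied from the listing; Decimal avoids binary floating-point drift.
MINIMAX_M2_1 = ModelPricing("minimax/minimax-m2.1", Decimal("0.33"), Decimal("1.32"))
GLM_4_7 = ModelPricing("zai-org/glm-4.7", Decimal("0.15"), Decimal("0.80"))

if __name__ == "__main__":
    # Example request: 20,000 prompt tokens and 2,000 completion tokens.
    for model in (MINIMAX_M2_1, GLM_4_7):
        cost = model.estimate_cost(input_tokens=20_000, output_tokens=2_000)
        print(f"{model.model_id}: ${cost:.5f}")
```

For a request of that size, the estimate comes to $0.00924 on MiniMax M2.1 and $0.00460 on GLM 4.7. Output tokens dominate the bill on models whose output rate is several times the input rate, so generation length matters more than prompt length when comparing such entries.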