Private AI
Browse and discover the best AI language models for conversations, coding, and creative writing.
Doubao Seed 2.1 Pro
Higher-capability model in the Doubao Seed 2.1 family for agentic coding, long-context analysis, complex instruction following, and productivity workflows. Supports a 256k context window and up to 128k output tokens. Note: privacy and logging guarantees may be limited.
Context: 256.0K
Max Output: 128.0K
Date Added: Jun 23, 2026
Pricing:
Input: $1.00/1M
Output: $5.00/1M
Cache: Read $0.50/1M
Est./msg: $0.0035
Subscription: Not included in subscription
Doubao Seed 2.1 Turbo
Fast, lower-cost model in the Doubao Seed 2.1 family for everyday chat, coding assistance, document work, and high-throughput productivity tasks. Supports a 256k context window and up to 128k output tokens. Note: privacy and logging guarantees may be limited.
Context: 256.0K
Max Output: 128.0K
Date Added: Jun 23, 2026
Pricing:
Input: $0.50/1M
Output: $2.50/1M
Cache: Read $0.25/1M
Est./msg: $0.0018
Subscription: Not included in subscription
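The per-token prices listed for the two Doubao Seed 2.1 models can be turned into a rough per-request cost. The sketch below is illustrative only: the function name `estimate_cost` and its parameters are hypothetical, the listing does not say how Est./msg is computed, and billing cached input tokens at the cache-read rate instead of the input rate is an assumption. A message of 1,000 input tokens and 500 output tokens is also an assumption; it happens to reproduce the listed Est./msg figures.

```python
def estimate_cost(input_tokens, output_tokens, input_per_m, output_per_m,
                  cached_tokens=0, cache_read_per_m=0.0):
    """Estimate the USD cost of one request from per-1M-token prices.

    Assumption: cached input tokens are billed at the cache-read rate
    and the remaining input tokens at the normal input rate.
    """
    uncached_tokens = input_tokens - cached_tokens
    total = (uncached_tokens * input_per_m
             + cached_tokens * cache_read_per_m
             + output_tokens * output_per_m)
    return total / 1_000_000


# Assumed message size: 1,000 input tokens and 500 output tokens.
pro = estimate_cost(1000, 500, input_per_m=1.00, output_per_m=5.00)
turbo = estimate_cost(1000, 500, input_per_m=0.50, output_per_m=2.50)
print(f"Pro:   ${pro:.5f}")
print(f"Turbo: ${turbo:.5f}")
```

Under these assumptions the Pro figure matches the listed $0.0035, and the Turbo figure of $0.00175 is consistent with the listed $0.0018 after rounding.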
Fugu Ultra
Sakana AI's higher-quality Fugu model. It coordinates a deeper pool of expert agents for hard, high-stakes reasoning and coding tasks.
Greg 2 Super
Greg 2 Super is CrofAI's balanced Greg 2 model for strong UI design, frontend iteration, coding, writing, and everyday agent tasks at a lower cost than Ultra.
Greg 2 Ultra
Greg 2 Ultra is CrofAI's most capable Greg 2 model, tuned for premium UI design, agentic coding, creative writing, and higher-end general reasoning tasks.
Sofya Research
Sofya Research runs a web research agent and returns a structured report with sources, sub-queries, and token usage metadata.
Cohere North Mini Code 1.0
Cohere's compact coding model for fast code generation, code editing, and agentic coding prompts. It supports a 256K-token input context, up to 64K output tokens, and configurable thinking.
GLM 5.2 TEE
GLM-5.2 is Z.AI's flagship model for long-horizon tasks with a 1M-token context window. Served as a text-only TEE deployment via Phala, with provider attestation support.
GLM 5.2
GLM-5.2 is Z.AI's flagship model for long-horizon autonomous coding and engineering workflows. It is built to plan, execute, iterate, and optimize complex development tasks over extended runs. This variant keeps thinking disabled for faster direct responses.
GLM 5.2 Thinking
GLM-5.2 with thinking enabled for harder long-horizon coding, autonomous agent workflows, complex engineering optimization, and real-world development tasks.
Kimi K2.7 Code High-Speed
Kimi K2.7 Code High-Speed is the accelerated coding-focused variant tuned for fast agentic software engineering. It targets roughly 180 tokens per second, with short-context responses reaching up to about 260 tokens per second for rapid coding iterations.
Kimi K2.7 Code
Kimi K2.7 Code is Moonshot AI's coding-focused agentic model built for long-horizon software engineering workflows. It supports native image input, tool calling, and forced thinking mode; instant/non-thinking mode is not supported.
MiMo V2.5 Pro UltraSpeed
MiMo V2.5 Pro UltraSpeed is Xiaomi's speed-focused 1T-parameter MiMo V2.5 Pro mode, built for near-instant coding assistance, real-time chat, live edits, and low-latency agent loops. Xiaomi reports up to roughly 1,000 tokens per second using its TileRT serving stack, FP4 expert quantization, and DFlash speculative decoding.
DeepSeek V4 Flash TEE
DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with a 1M-token context window. This deployment runs inside a TEE (Trusted Execution Environment) with provider attestation support.
Qwen3.6 35B A3B TEE
Qwen3.6 35B A3B is an open-weight MoE model from Alibaba's Qwen team with 35B total parameters and 3B active parameters per token. This deployment runs inside a TEE (Trusted Execution Environment) with provider attestation support.
Linkup Research High
Linkup Research with high reasoning depth. Runs an async web research agent and returns a sourced answer for complex research tasks. Responses can take several minutes.
Linkup Research Low
Linkup Research with low reasoning depth. Runs an async web research agent and returns a sourced answer for factual and multi-source questions. Responses can take several minutes.
Linkup Research Medium
Linkup Research with medium reasoning depth. Runs an async web research agent and returns a sourced answer for factual and multi-source questions. Responses can take several minutes.