API
API keys
Generate up to 5 API keys to use NanoGPT in other applications. If you require more keys, please contact us at support@nano-gpt.com and we will help you out.
Authenticate by including your API key as an HTTP header, either "Authorization": "Bearer API_KEY" or "api-key": "API_KEY", depending on the endpoint.
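The two header styles described above can be sketched as a small Python helper. This is a minimal sketch rather than official client code: the function name `build_auth_headers` and its `style` parameter are hypothetical, and which style a particular endpoint expects is not stated here, so it should be checked per endpoint.

```python
def build_auth_headers(api_key: str, style: str = "bearer") -> dict[str, str]:
    """Return the HTTP header that carries the API key.

    style="bearer"  -> {"Authorization": "Bearer <key>"}
    style="api-key" -> {"api-key": "<key>"}
    Both the function name and the `style` values are hypothetical.
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    if style == "bearer":
        return {"Authorization": f"Bearer {api_key}"}
    if style == "api-key":
        return {"api-key": api_key}
    raise ValueError(f"unknown header style: {style!r}")
```

The returned dictionary can be passed as the `headers` argument of whichever HTTP client you use.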
Get notified about API updates.
We will only use this to contact you about updates to how the API works. You can unsubscribe at any time.
If you are a (potentially) large user of our website or our API, we are glad to have you. Reach out to us at support@nano-gpt.com or join our Discord for a discount.
API Reference
The example code below can be used in Python; NanoGPTjs is a great starting point for JS users.
If you encounter issues or need further information, please contact support@nano-gpt.com.
Text models
POST https://nano-gpt.com/api/talk-to-gpt
Name | Model | Description |
---|---|---|
Auto model | recommended-model | Automatically uses the best model for your task. Categorizes the prompt, then uses the model that performs best in that particular category according to global user preferences. Scores updated daily. Ability to set pricing tier in Adjust Settings. |
Grok 3 Thinking | grok-3-reasoner | Grok 3 Thinking adds chain of thought into the Grok 3 model leading to even better results across a wide range of tasks. Displays its thinking. Note: currently heavily rate limited. |
Grok 3 | grok-3 | Grok 3 is the newest xAI model and current leader of most leaderboards. Comes with a massive 1 million token context window. |
Grok 3 Deepsearch | grok-3-deepsearch | Grok 3 Deepsearch is a lightning-fast AI agent built to relentlessly seek the truth across the entire corpus of human knowledge. DeepSearch is designed to synthesize key information, reason about conflicting facts and opinions, and distill clarity from complexity. Note: currently heavily rate limited. |
OpenAI o3-mini | o3-mini | OpenAI's newest flagship model. |
OpenAI o3-mini high | o3-mini-high | OpenAI's newest flagship model with reasoning effort set to high. |
OpenAI o3-mini low | o3-mini-low | OpenAI's newest flagship model with reasoning effort set to low. |
OpenAI o1 Pro | o1-plus | Note: Degraded service. Not fully stable. Occasionally fails to respond. The Pro version of OpenAI's flagship reasoning model for solving hard problems. Useful when tackling complex problems in science, coding, math, and similar fields. Minimum cost is ~$0.20; we are temporarily charging a maximum of $0.50 regardless of how big your prompt is. This model can think for a long time - please be patient. |
OpenAI o1 | o1 | OpenAI's flagship reasoning model for solving hard problems. Useful when tackling complex problems in science, coding, math, and similar fields. |
ChatGPT 4o | chatgpt-4o-latest | OpenAI's current recommended model, the well-known ChatGPT. |
Gemini 2.0 Pro Exp 0205 | gemini-2.0-pro-exp-02-05 | Gemini 2.0 Pro Experimental, the latest version of the Gemini 2.0 Pro model. |
Gemini 2.0 Pro Exp 1206 | gemini-exp-1206 | Gemini 2.0 Pro Experimental, the latest version of the Gemini 2.0 Pro model. |
Model Recommender | model-selector | Model Recommender - input your query to have it recommend the best model for your task |
Kimi K1.5 Preview | kimi-k1.5-preview | Kimi K1.5 is an o1-level multimodal model by Moonshot AI, outperforming o1 on many benchmarks. |
Gemini 2.0 Flash Thinking 0121 | gemini-2.0-flash-thinking-exp-01-21 | Google's newest model, outperforming even Gemini 1.5 Pro, now with a thinking mode enabled similar to the o1 series of OpenAI. |
Gemini 2.0 Flash Thinking 1219 | gemini-2.0-flash-thinking-exp-1219 | Google's newest model, outperforming even Gemini 1.5 Pro, now with a thinking mode enabled similar to the o1 series of OpenAI. |
Gemini 2.0 Flash Exp | gemini-2.0-flash-exp-search | Google's newest model, outperforming even Gemini 1.5 Pro. Now with web access. |
Gemini 2.0 Flash Exp | gemini-2.0-flash-exp | Google's newest model, outperforming even Gemini 1.5 Pro. |
OpenAI o1 preview | o1-preview | OpenAI's new flagship series of reasoning models for solving hard problems. Useful when tackling complex problems in science, coding, math, and similar fields |
OpenAI o1-mini | o1-mini | A fast, cost-efficient version of OpenAI's o1 reasoning model tailored to coding, math, and science use cases. |
Open Deep Research | deep-research | o3-mini-powered research assistant that performs deep analysis across multiple sources. Open source version of OpenAI's Deep Research. Warning: thinks for a while - it has to write an entire report! Based on open-deep-research by fdarkaou. |
Grok 2 1212 | grok-2-1212 | Grok 2 1212 introduces significant enhancements to accuracy, instruction adherence, and multilingual support, making it a powerful and flexible choice for developers seeking a highly steerable, intelligent model. |
Grok 2 1212 | x-ai/grok-2-1212 | Grok 2 1212 introduces significant enhancements to accuracy, instruction adherence, and multilingual support, making it a powerful and flexible choice for developers seeking a highly steerable, intelligent model. |
Doubao 1.5 Pro 256k | doubao-1.5-pro-256k | Doubao's (Bytedance) flagship model with a 256k token context window |
Gemini 2.0 Flash | gemini-2.0-flash-001 | Upgraded version of Gemini Flash 1.5. Faster, with higher output, and overall increase in intelligence. |
Gemini 2.0 Flash Lite Preview | gemini-2.0-flash-lite-preview-02-05 | Upgraded version of Gemini Flash 1.5. Faster, with higher output, and overall increase in intelligence. |
Grok 2 Beta | grok-beta | Grok-2 is xAI's frontier language model, the one used on X. Claims state-of-the-art reasoning capabilities, best for complex and multi-step use cases. |
Grok 2 Beta | x-ai/grok-beta | Grok-2 is xAI's frontier language model, the one used on X. Claims state-of-the-art reasoning capabilities, best for complex and multi-step use cases. |
Claude 3.5 Sonnet | claude-3-5-sonnet-20241022 | Anthropic's updated most intelligent model, offering even better results on many subjects than GPT-4o. |
DeepSeek R1 | deepseek-r1-nano | DeepSeek's R1 is a thinking model, rivalling OpenAI's o1. This version is run via Azure, with fallbacks to Azure, Fireworks and Together, never routing through DeepSeek themselves. |
Aion 1.0 mini (DeepSeek) | aion-labs/aion-1.0-mini | A distilled version of the DeepSeek-R1 model that excels in reasoning domains like mathematics, coding, and logic. |
Aion 1.0 | aion-labs/aion-1.0 | Aion Labs' most powerful reasoning model with high performance across reasoning and coding. |
DeepClaude | deepclaude | Harness the power of DeepSeek R1's reasoning combined with Claude's creativity and code generation. Feeds your query into DeepSeek R1, then feeds the query + thinking process into Claude 3.5 Sonnet and returns an answer. Note: this routes through original DeepSeek meaning your data may be stored and used by DeepSeek. |
DeepSeek V3/Deepseek Chat | deepseek-chat | Latest model from DeepSeek, trained on nearly 15 trillion tokens, matches leading closed-source models at a far lower price. |
DeepSeek V3/Deepseek Chat | deepseek/deepseek-chat | Latest model from DeepSeek, trained on nearly 15 trillion tokens, matches leading closed-source models at a far lower price. |
DeepSeek R1 Llama 70b | deepseek-r1-llama-70b | DeepSeek R1 Llama 70b is a fine-tuned version of DeepSeek R1 on Llama 70B. |
MiniMax 01 | minimax/minimax-01 | MiniMax's flagship model with a 1M token context window |
MiniMax 01 | MiniMax-Text-01 | MiniMax's flagship model with a 1M token context window |
GLM Zero Preview | glm-zero-preview | GLM Zero Preview is a thinking model like o1, but with a smaller context window |
Step-2 16k Exp | step-2-16k-exp | Step-2 16k Exp is a 16k context window model |
GLM 4 Plus 0111 | glm-4-plus-0111 | GLM 4 Plus 0111 is a 1M token context window model |
GLM 4 Air 0111 | glm-4-air-0111 | MiniMax's flagship model with a 1M token context window |
Step-2 Mini | step-2-mini | MiniMax's flagship model with a 1M token context window |
Doubao 1.5 Pro 32k | doubao-1.5-pro-32k | Doubao's (Bytedance) pro model with a 32k token context window |
Doubao 1.5 Vision Pro 32k | doubao-1.5-vision-pro-32k | Doubao's (Bytedance) vision-enabled pro model (JPG only) with a 32k token context window |
Qwen QwQ 32B Preview | Qwen/QwQ-32B-Preview | Experimental release of Qwen's reasoning model. Great at coding and math, but still in development so may exhibit odd bugs. Not production-ready. |
Kimi Latest | kimi-latest | Always points to the latest stable Kimi model. |
Step-2 16k | step-2-16k | Chinese-based trillion-parameter model by StepFun that scores extremely well on Livebench for a broad range of tasks. Supports a variety of languages, but has a relatively small context window (~8000 words). |
Llama 3.3 70b Instruct | llama-3.3-70b | Llama 3.3 is optimized for multilingual dialogue use cases and outperforms many of the available open source and closed chat models on common industry benchmarks. |
Llama 3.3 70b Instruct | meta-llama/llama-3.3-70b-instruct | Llama 3.3 is optimized for multilingual dialogue use cases and outperforms many of the available open source and closed chat models on common industry benchmarks. |
Dolphin 72b | dolphin-2.9.2-qwen2-72b | Dolphin is the most uncensored model yet, built on top of Qwen's 72b model. |
Nvidia Nemotron 70b | nvidia/Llama-3.1-Nemotron-70B-Instruct-HF | Nvidia's latest Llama fine-tune optimized for instruction following. Early results hint that it might outperform models such as GPT-4o and Claude 3.5 Sonnet. |
Claude 3.5 Sonnet Old | claude-3-5-sonnet-20240620 | Anthropic's most intelligent model, offering even better results on many subjects than GPT-4o. |
Claude 3.5 Haiku | claude-3-5-haiku-20241022 | Anthropic's updated faster and cheaper model, offering good results on chatbots and coding. |
Claude 3 Opus | claude-3-opus-20240229 | Anthropic's flagship model, outperforming GPT-4 on most benchmarks. |
Yi Lightning | yi-lightning | Chinese-developed multilingual (English, Chinese and others) model by 01.ai that's very fast and cheap, yet scores high on independent leaderboards. |
Amazon Nova Pro 1.0 | amazon/nova-pro-v1 | Amazon's new flagship model. Can handle up to 300k input tokens, with comparable performance to ChatGPT and Claude 3.5 Sonnet. |
GPT 4o mini | gpt-4o-mini | OpenAI's most cost-efficient small model. Cheaper and smarter than GPT-3.5 (the original ChatGPT), but less performant than gpt-4o |
GLM-4 Plus | glm-4-plus | GLM high-intelligence flagship model with 128K context window |
Gemini LearnLM Experimental | learnlm-1.5-pro-experimental | LearnLM is a task-specific model trained to align with learning science principles when following system instructions for teaching and learning use cases. For instance, the model can take on tasks to act as an expert or guide to educate users on specific topics. |
Llama 3.1 Large | Meta-Llama-3-1-405B-Instruct-FP8 | Note: comes with a 90% discount currently, enjoy! Meta's largest Llama 3.1 405B model. Open-source, run through an open permissionless crypto network (no central provider). |
Hermes 3 Large | nousresearch/hermes-3-llama-3.1-405b | Llama 3.1 405b with the brakes taken off. Less censored than the regular version, but not abliterated |
Qwen Turbo | qwen-turbo | Alibaba's fastest and cheapest model. Suitable for simple tasks, fast and low cost, with a 1 million token context window. |
Qwen 2.5 Max | qwen-max | Qwen 2.5 Max is the upgraded version of Qwen Max, beating GPT-4o, Deepseek V3 and Claude 3.5 Sonnet in benchmarks. |
Qwen Plus | qwen-plus | Alibaba's balanced model. Fast, cheap, yet still very powerful. |
Qwen Long 10M | qwen-long | Alibaba's huge context window model. Takes in up to 10 million tokens, which is equivalent to dozens of books. |
Qwen 2.5 Coder 32b | Qwen/Qwen2.5-Coder-32B-Instruct | The latest series of Code-Specific Qwen large language models. |
Amazon Nova Lite 1.0 | amazon/nova-lite-v1 | Amazon's new lower cost model. Can handle up to 300k input tokens, with faster output but less thorough understanding than Amazon's Nova Pro. |
Amazon Nova Micro 1.0 | amazon/nova-micro-v1 | Amazon's lowest cost model. Comparable to GPT-4o-mini and Gemini 1.5 Flash, with the fastest output. |
Yi Large | yi-large | Large version of Yi Lightning with a 32k context window, but more expensive. |
Yi Medium 200k | yi-medium-200k | Medium version of Yi with a 200k context window. |
Nemo Arli 12b RPMa V1.2 | Mistral-Nemo-12B-ArliAI-RPMax-v1.2 | A Mistral Nemo 12b finetuned for roleplay and storytelling. |
LatitudeGames WayFarer 12B | Mistral-Nemo-12B-Wayfarer | Latitude Games Wayfarer 12B |
The Drummer Cydonia 24B | TheDrummer/Cydonia-24B-v2 | Cydonia 24B v2 is a finetune of Mistral's latest 'Small' model (2501). Aliases: Cydonia 24B, Cydonia v2, Cydonia on that broken base. |
EVA Llama 3.33 70B | EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.0 | A RP/storywriting specialist model, full-parameter finetune of Llama-3.3-70B-Instruct on mixture of synthetic and natural data. It uses Celeste 70B 0.1 data mixture, greatly expanding it to improve versatility, creativity and flavor of the resulting model. |
Llama 3.1 70B Dracarys 2 | abacusai/Dracarys-72B-Instruct | Llama 3.1 70b finetune that offers improvements on coding. |
Mistral Large 2411 | mistralai/mistral-large | Upgrade to Mistral's flagship model. It is fluent in English, French, Spanish, German, and Italian, with high grammatical accuracy and a long context window. |
Lumimaid v0.2 | NeverSleep/Lumimaid-v0.2-70B | Upgrade to Llama-3 Lumimaid 70B. A Llama 3.1 70B finetune trained on curated roleplay data. Extremely uncensored and suitable for NSFW. |
DeepSeek V3/Chat Cheaper | deepseek-chat-cheaper | Cheaper version of Deepseek V3/Chat. Note: may be routed through Deepseek itself. |
Inflection 3 Pi | inflection/inflection-3-pi | A chatbot with emotional intelligence. Has access to recent news, excels in scenarios like customer support and roleplay. Mirrors your conversation style. |
Inflection 3 Productivity | inflection/inflection-3-productivity | Optimized for instruction following. Good at tasks that require precise adherence to provided guidelines. Has access to recent news. |
WizardLM-2 8x22B | microsoft/wizardlm-2-8x22b | Microsoft's advanced Wizard model. The most popular role-playing model. |
SorcererLM 8x22B | raifle/sorcererlm-8x22b | Advanced roleplaying model with reasoning and emotional intelligence for engaging interactions, contextual awareness and enhanced narrative depth |
Llama 3.1 Large | accounts/fireworks/models/llama-v3p1-405b-instruct | Meta's largest and most capable Llama model. Competitive with GPT-4o and Claude 3.5 Sonnet. |
GPT 4o 08 06 | gpt-4o-2024-08-06 | OpenAI's precursor to ChatGPT-4o. Great on English text and code, with significant improvements on text in non-English languages. |
GPT 4o 11 20 | gpt-4o-2024-11-20 | OpenAI's precursor to ChatGPT-4o. Great on English text and code, with significant improvements on text in non-English languages. |
Llama 3.2 Medium | meta-llama/llama-3.2-90b-vision-instruct | Medium-size (and capability) version of Meta's newest model (3.2 series). |
Llama 3.1 Medium | accounts/fireworks/models/llama-v3p1-70b-instruct | Meta's updated version of their medium Llama model. Slightly lesser performance than Llama Large, but cheaper. |
Llama 3.3 70B Instruct abliterated | huihui-ai/Llama-3.3-70B-Instruct-abliterated | An abliterated (removed restrictions and censorship) version of Llama 3.3 70b. |
Perplexity Pro | sonar-pro | Sonar Pro tackles complex questions that need deeper research and provides more sources. |
Perplexity Reasoning Pro | sonar-reasoning-pro | Perplexity's Sonar Reasoning Pro uses DeepSeek R1's thinking process combined with looking up on the web to tackle complex questions that need deeper research and provides more sources. |
Perplexity Reasoning | sonar-reasoning | Perplexity's Sonar Reasoning uses DeepSeek R1's thinking process combined with looking up on the web to tackle complex questions that need deeper research and provides more sources. |
Llama 3.1 Large | meta-llama/llama-3.1-405b-instruct | Meta's largest and most capable Llama model. Competitive with GPT-4o and Claude 3.5 Sonnet. |
Gemini 1.5 Flash | google/gemini-flash-1.5 | Google's fastest multimodal model with great performance for diverse, repetitive tasks and a 2 million words context window. |
Perplexity Simple | sonar | A Perplexity model that gives fast, straightforward answers. |
MythoMax 13B | Gryphe/MythoMax-L2-13b | One of the highest performing and most popular fine-tunes of Llama 2 13B, with rich descriptions and roleplay. |
GLM-4 | glm-4 | High-intelligence model with 128K context window |
GLM-4 Long | glm-4-long | Extended context model supporting up to 1M tokens |
Qwen2.5 72B | qwen/qwen-2.5-72b-instruct | Great multilingual support, strong at mathematics and coding, supports roleplay and chatbots. |
EVA Qwen2.5 72B | eva-unit-01/eva-qwen-2.5-72b | Full-parameter finetune of Qwen2.5-72B on mixture of synthetic and natural data. It uses Celeste 70B 0.1 data mixture, greatly expanding it to improve versatility, creativity and flavor of the resulting model. |
Yi Medium 200k | yi-34b-chat-200k | Medium version of Yi Lightning with a huge 200k context window |
Yi Spark | yi-34b-chat-0205 | Small and powerful, lightweight and fast model. Provides enhanced mathematical operation and code writing capabilities. |
Yi Large Turbo | yi-large-turbo | Super cost-effective, excellent performance. Balanced high-precision tuning based on performance, inference speed, and cost. |
Dolphin 2.6 Mixtral 8x7b | cognitivecomputations/dolphin-mixtral-8x7b | Designed for instruction following, conversational, and coding. |
GPT 4 Turbo Preview | gpt-4-turbo-preview | Can take in the largest messages (up to 300 pages of context), and is widely seen as one of the best-in-class models. |
GPT 4 Turbo | gpt-4-turbo | Can take in the largest messages (up to 300 pages of context), and is widely seen as one of the best-in-class models. |
GPT 4o | gpt-4o | OpenAI's precursor to ChatGPT-4o. Great on English text and code, with significant improvements on text in non-English languages. |
GPT 3.5 Turbo | gpt-3.5-turbo | Older model. Brought ChatGPT to the mainstream, seen as dated nowadays. 90% cheaper than GPT-4-Turbo, recommended for very simple tasks. |
Gemini 1.5 Flash | gemini-1.5-flash-001 | Google's fastest multimodal model with great performance for diverse, repetitive tasks and a 1 million token context window. |
Gemini 1.5 Pro | gemini-1.5-pro-001 | Google's next-generation model with a breakthrough 1 million token context window. Comparable to GPT-4o. |
Free model | free-model | Free model to try out our service with. Currently Llama 3.3 70B, but this might change at any time. |
Magnum v4 72B | anthracite-org/magnum-v4-72b | Upgraded model of Magnum V2 72B. From the creators of Goliath. Aimed at achieving prose quality similar to Claude Opus 3, trained on 55 million tokens of curated Roleplay data. |
EVA-Qwen2.5-32B-v0.2 | EVA-UNIT-01/EVA-Qwen2.5-32B-v0.2 | A RP/storywriting specialist model, full-parameter finetune of Qwen2.5-32B on mixture of synthetic and natural data. It uses Celeste 70B 0.1 data mixture, greatly expanding it to improve versatility, creativity and flavor of the resulting model. |
DeepSeek R1 Distill 70b | deepseek/deepseek-r1-distill-llama-70b | DeepSeek-R1 distilled version on Llama 70B. |
Llama 3.1 405B Instruct | nvidia/Llama-3.1-405B-Instruct-FP8 | NVIDIA's optimized version of Llama 3.1 405B with FP8 precision. |
MN-LooseCannon-12B-v1 | GalrionSoftworks/MN-LooseCannon-12B-v1 | Merge of Starcannon and Sao Lyra. |
EVA-Qwen2.5-72B-v0.2 | EVA-UNIT-01/EVA-Qwen2.5-72B-v0.2 | A RP/storywriting specialist model, full-parameter finetune of Qwen2.5-72B on mixture of synthetic and natural data. It uses Celeste 70B 0.1 data mixture, greatly expanding it to improve versatility, creativity and flavor of the resulting model. |
EVA-LLaMA-3.33-70B-v0.1 | EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1 | A RP/storywriting specialist model, full-parameter finetune of Llama-3.3-70B-Instruct on mixture of synthetic and natural data. It uses Celeste 70B 0.1 data mixture, greatly expanding it to improve versatility, creativity and flavor of the resulting model. |
Dolphin 2.9.2 Mixtral 8x22B | cognitivecomputations/dolphin-mixtral-8x22b | Successor to Dolphin 2.6 Mixtral 8x7b. Great for instruction following, conversational, and coding. |
Llama 3.1 70b Instruct | meta-llama/llama-3.1-70b-instruct | Optimized for high-quality dialogue use cases. |
Llama 3.1 8b Instruct | meta-llama/llama-3.1-8b-instruct | Fast and efficient for simple purposes. |
ReMM SLERP 13B | undi95/remm-slerp-l2-13b | A recreation trial of the original MythoMax-L2-B13 but merged with updated models. |
Mistral Tiny | mistralai/mistral-tiny | Powered by Mistral-7B-v0.2, best used for large batch processing tasks where cost is a significant factor but reasoning capabilities are not crucial. |
Mistral Saba | mistralai/mistral-saba | Mistral Saba is a 24B-parameter language model specifically designed for the Middle East and South Asia, delivering accurate and contextually relevant responses while maintaining efficient performance. Trained on curated regional datasets, it supports multiple Indian-origin languages—including Tamil and Malayalam—alongside Arabic. This makes it a versatile option for a range of regional and multilingual applications. |
Mistral 7B Instruct | mistralai/mistral-7b-instruct | Optimized for speed with decent context length |
Llama 3 70b Instruct | meta-llama/llama-3-70b-instruct | Optimized for high-quality dialogue use cases. |
WizardLM-2 7B | microsoft/wizardlm-2-7b | Finetune of Mistral 7B Instruct, very fast. |
DeepSeek R1 Zero Preview | deepseek-ai/DeepSeek-R1-Zero | Preview version of Deepseek R1, also known as DeepSeek R1 Zero. Deepseek R1 without the supervised finetuning. |
Cohere: Command R | cohere/command-r | 35B parameter model that performs conversational language tasks at a higher quality, more reliably, and with a longer context than previous models. It can be used for complex workflows like code generation, retrieval augmented generation (RAG), tool use, and agents |
Cohere: Command R+ | cohere/command-r-plus-08-2024 | 104B parameter model that performs conversational language tasks at a higher quality, more reliably, and with a longer context than previous models. It can be used for complex workflows like code generation, retrieval augmented generation (RAG), tool use, and agents |
Neural Daredevil 8B abliterated | mlabonne/NeuralDaredevil-8B-abliterated | The best performing 8B abliterated model according to most benchmarks. |
Llama 3 70B abliterated | failspy/Meta-Llama-3-70B-Instruct-abliterated-v3.5 | An abliterated (removed restrictions and censorship) version of Llama 3 70b. |
Nemotron 3.1 70B abliterated | huihui-ai/Llama-3.1-Nemotron-70B-Instruct-HF-abliterated | An abliterated (removed restrictions and censorship) version of Llama 3.1 70b Nemotron. |
Magnum V2 72B | anthracite-org/magnum-v2-72b | Magnum V2 72B |
Damascus R1 | Steelskull/L3.3-Damascus-R1 | Damascus-R1 builds upon some elements of the Nevoria foundation but represents a significant step forward with a completely custom-made DeepSeek R1 Distill base: Hydroblated-R1-V3. Constructed using the new SCE (Select, Calculate, and Erase) merge method, Damascus-R1 prioritizes stability, intelligence, and enhanced awareness. |
Mistral Nemo | mistralai/Mistral-Nemo-Instruct-2407 | 12B parameter model with multilingual support. |
DeepSeek Reasoner | deepseek-reasoner | DeepSeek-R1 is now live and open source, rivaling OpenAI's Model o1. |
Llama 3.1 70B ArliAI RPMax v1.3 | Llama-3.3+3.1-70B-ArliAI-RPMax-v1.3 | RPMax is a series of models trained on a diverse set of curated creative writing and RP datasets with a focus on variety and deduplication. This model is designed to be highly creative and non-repetitive by making sure no two entries in the dataset have repeated characters or situations, so that the model does not latch on to a certain personality and can understand and act appropriately for any character or situation. |
Llama 3.05 Storybreaker Ministral 70b | Envoid/Llama-3.05-NT-Storybreaker-Ministral-70B | Much more inclined to output adult content than its predecessor. Great choice for novelty roleplay scenarios. |
Nemotron Tenyxchat Storybreaker 70b | Envoid/Llama-3.05-Nemotron-Tenyxchat-Storybreaker-70B | Overall it provides a solid option for RP and creative writing while still functioning as an assistant model, if desired. If used to continue a roleplay it will generally follow the ongoing cadence of the conversation. |
Mag Mell R1 | inflatebot/MN-12B-Mag-Mell-R1 | Mag Mell demonstrates worldbuilding capabilities unlike any model in its class, comparable to old adventuring models like Tiefighter, and prose that exhibits minimal slop. |
Evayale 70b | Steelskull/L3.3-MS-Evayale-70B | Combination of EVA and Euryale. |
Lumimaid 70b | NeverSleep/Llama-3-Lumimaid-70B-v0.1 | Neversleep Llama 3 Lumimaid 70B |
MS Evalebis 70b | Steelskull/L3.3-MS-Evalebis-70b | Combination of EVA, Euryale and Anubis. |
Anubis 70B v1 | TheDrummer/Anubis-70B-v1 | L3.3 finetune for roleplaying. |
Qwen 2.5 32b EVA | Qwen2.5-32B-EVA-v0.2 | A Qwen 2.5 32b finetuned for roleplay and storytelling. |
Llama 3.3 70b Mirai Fanfare | Llama-3.3-70B-MiraiFanfare | A Llama 3.3 70b finetuned for roleplay and storytelling. |
Dazzling Star Aurora 32b | Qwen2.5-32B-Dazzling-Star-Aurora-32b-v0.0 | A Qwen 2.5 32b finetuned for roleplay and storytelling. |
Gemini 1.5 Pro | google/gemini-pro-1.5 | Google's next-generation model with a breakthrough 4 million context window. Comparable to GPT-4o. |
Gemini 2.0 Flash Search | gemini-2.0-flash-search | Gemini 2.0 Flash Search is a version of Gemini 2.0 Flash that has been finetuned for search tasks. |
Gemini 2.0 Flash Exp Free | google/gemini-2.0-flash-exp:free | Small model optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization |
Llama 3.2 3b Instruct | meta-llama/llama-3.2-3b-instruct | Small model optimized for advanced natural language processing tasks like dialogue generation, reasoning, and summarization |
Llama 3.1 8B (decentralized) | Meta-Llama-3-1-8B-Instruct-FP8 | Meta's Llama 3.1 8B model via an open permissionless network |
GLM-4 AirX | glm-4-airx | Fastest GLM-4 variant with 8K context window |
GLM-4 Air | glm-4-air | High-performance model with 128K context window |
GLM-4 Flash | glm-4-flash | Extremely cheap model with 128K context window |
Llama 3.1 70B Hanami | Sao10K/L3.1-70B-Hanami-x1 | Euryale v2.2-based finetune. |
Rocinante 12b | TheDrummer/Rocinante-12B-v1.1 | Designed for engaging storytelling and rich prose. Expanded vocabulary with unique and expressive word choices, enhanced creativity and captivating stories. |
Llama 3.3 70B Euryale | Sao10K/L3.3-70B-Euryale-v2.3 | A 70B parameter model from SAO10K based on Llama 3.3 70B, offering high-quality text generation. |
Llama 3.1 70B Euryale | Sao10K/L3.1-70B-Euryale-v2.2 | A 70B parameter model from SAO10K based on Llama 3.1 70B, offering high-quality text generation. |
UnslopNemo 12b v4 | TheDrummer/UnslopNemo-12B-v4.1 | UnslopNemo v4 is the previous version from the creator of Rocinante, designed for adventure writing and role-play scenarios. |
Nous Hermes 3 70B | NousResearch/Hermes-3-Llama-3.1-70B | Generalist language model including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the board. |
NemoMix 12B Unleashed | MarinaraSpaghetti/NemoMix-Unleashed-12B | Great for RP and storytelling. |
Mistral Nemo Starcannon 12b v1 | VongolaChouko/Starcannon-Unleashed-12B-v1.0 | Mistral Nemo finetune that offers improvements on roleplay. |
Llama 3.1 70B Celeste v0.1 | nothingiisreal/L3.1-70B-Celeste-V0.1-BF16 | Creative model based on Llama 3.1 70B |
Mistral Nemo Inferor 12B | Infermatic/MN-12B-Inferor-v0.0 | Inferor is a merge of top roleplay models, expert on immersive narratives and storytelling. |
Claude 3.5 Sonnet | anthropic/claude-3.5-sonnet | Anthropic's updated most intelligent model, offering even better results on many subjects than GPT-4o. |
DeepSeek R1 | aihubmix-DeepSeek-R1 | DeepSeek's R1 model, offering even better results on many subjects than GPT-4o. |
DeepSeek R1 | deepseek/deepseek-r1 | DeepSeek-R1 is now live and open source, rivaling OpenAI's Model o1. |
Athene V2 Chat | Nexusflow/Athene-V2-Chat | An open-weights LLM on-par with GPT-4o across benchmarks. |
Deepseek R1 Qwen Abliterated | huihui-ai/DeepSeek-R1-Distill-Qwen-32B-abliterated | Uncensored version of the Deepseek R1 Qwen 32B model |
Deepseek R1 Llama 70b Abliterated | huihui-ai/DeepSeek-R1-Distill-Llama-70B-abliterated | Uncensored version of the Deepseek R1 Llama 70B model |
DeepSeek R1 671B | deepseek-r1-671b | DeepSeek R1 671B model |
Deepseek R1 Cheaper | deepseek-reasoner-cheaper | Cheaper version of DeepSeek R1. Note: may be routed through Chinese providers. |
Deepseek R1 Cheaper | ark-deepseek-r1-250120 | Cheaper version of DeepSeek R1. Note: may be routed through Chinese providers. |
Llama 3.1 8b (uncensored) | aion-labs/aion-rp-llama-3.1-8b | This is a truly uncensored model, trained to excel at roleplaying and creative writing. However, it can also do other things! |
Azure o1 | azure-o1 | Azure version of OpenAI o1 |
Azure o3-mini | azure-o3-mini | Azure version of OpenAI o3-mini |
Azure gpt-4o | azure-gpt-4o | Azure version of OpenAI gpt-4o |
Azure gpt-4o-mini | azure-gpt-4o-mini | Azure version of OpenAI gpt-4o-mini |
Azure gpt-4-turbo | azure-gpt-4-turbo | Azure version of OpenAI gpt-4-turbo |
Llama 3.1 Tulu 3 405B | Llama-3.1-Tulu-3-405B | Tülu 3 405B, a fine-tune of Llama 405B that performs better than DeepSeek V3 on SambaNova Cloud. This powerful open-source model, developed by the Allen Institute for AI (Ai2), represents a significant leap forward in large language model capabilities. Thanks to the SambaNova RDU, we are able to efficiently support this model at over 90 tokens/second. |
DeepSeek R1 Sambanova | deepseek-r1-sambanova | DeepSeek R1 via Sambanova: the full model with very fast output. Note: max 4k output tokens. |
Grok 2 Vision 1212 | grok-2-vision-1212 | Grok 2 Vision 1212 introduces significant enhancements to accuracy, instruction adherence, and multilingual support, making it a powerful and flexible choice for developers seeking a highly steerable, intelligent model. |
DeepSeek V3/Deepseek Chat | deepseek/deepseek-chat:free | DeepSeek Chat is a model that is a good choice for general purpose chat. |
DeepSeek R1 | deepseek/deepseek-r1:free | DeepSeek R1 is a model that is a good choice for general purpose chat. |
Perplexity R1 1776 | r1-1776 | R1 1776 is a version of the DeepSeek R1 model that has been post-trained by Perplexity to provide uncensored, unbiased, and factual information. |
Llama 3.3 70B Wayfarer | LatitudeGames/Wayfarer-Large-70B-Llama-3.3 | Llama 3.3 70B Wayfarer is a fine-tuned version of Llama 3.3 70B, trained on a diverse set of creative writing and RP datasets with a focus on variety and deduplication. This model is designed to be highly creative and non-repetitive by making sure no two entries in the dataset have repeated characters or situations, so that the model does not latch on to a certain personality and can understand and act appropriately for any character or situation. |
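A request to the text endpoint above might be assembled as follows. This is a minimal sketch under stated assumptions, not the documented request format: the JSON field names `model` and `prompt` are illustrative guesses, the helper name `build_text_request` is hypothetical, and the `Authorization` header may need to be swapped for `api-key` depending on the endpoint. The sketch only builds the request object; sending it is left to the caller.

```python
import json
import urllib.request

# Endpoint taken from the "Text models" section.
TEXT_ENDPOINT = "https://nano-gpt.com/api/talk-to-gpt"


def build_text_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a text model.

    `model` is an ID from the Model column of the table, e.g. "free-model".
    """
    # ASSUMPTION: "model" and "prompt" are illustrative field names only;
    # the actual request schema is not given in this section.
    payload = {"model": model, "prompt": prompt}
    return urllib.request.Request(
        TEXT_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # ASSUMPTION: some endpoints may expect "api-key" instead.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_text_request("API_KEY", "free-model", "Hello!")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Once the real field names are confirmed, the request can be sent with `urllib.request.urlopen(req)`; long-running models such as `deep-research` would likely need a generous timeout.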
Image models
POST https://nano-gpt.com/api/generate-image
Name | Model | Description |
---|---|---|
Recraft V3 | recraft-v3 | The current best scoring model across all image models tested. |
Shorts Generator | longstories | Uses LongStories AI to generate high-quality content from text prompts. Generates engaging short stories, similar to YouTube Shorts, TikTok clips, etc., on any subject you want. Offers many customization options. Note: generation can take from 30 seconds to a few minutes. |
Shorts Generator for Kids | longstories-kids | Generates engaging short stories for kids on any subject you want. Offers many customization options. Note: generation can take from 30 seconds to a few minutes. |
Flux Pro V1.1 | flux-pro/v1.1 | Excellent image quality, prompt adherence, and output diversity. |
Imagen V3 | imagen-3.0-generate-002 | Google's highest quality text-to-image model with fine detail, rich lighting, and excellent text rendering capabilities. |
Flux Pro V1.1 Ultra | flux-pro/v1.1-ultra | 4K version of Flux Pro V1.1. Excellent image quality, prompt adherence, and output diversity. |
Ideogram V2 | ideogram-ai/ideogram-v2 | An excellent image model with state of the art inpainting, prompt comprehension and especially text rendering. |
Flux Lora | flux-lora | FLUX.1 [dev] with LoRA support, fast and high-quality image generation with the option to use LORAs for specific styles. |
Flux Dev | flux-dev | Slightly faster and much cheaper than Flux Pro with similar output quality. |
Flux Schnell | flux/schnell | Fast and high-quality image generation - the cheaper version of the Flux range of models. |
SD 3.5 Large | stable-diffusion-v35-large | Stable Diffusion's newest model. Generates a wide variety of images reflecting different styles without complex prompting. |
Ideogram V2 Turbo | ideogram-ai/ideogram-v2-turbo | A fast image model with state of the art inpainting, prompt comprehension and especially text rendering. |
Flux Realism | flux-realism | Incredibly photorealistic image generation. Generate people, animals, landscapes that are hard to distinguish from reality. |
DALL-E-3 | dall-e-3 | OpenAI's most well-known image model. |
DALL-E-3 HD | dall-e-3-hd | OpenAI's most well-known image model, now in HD quality. |
SD 3.5 Large Turbo | stable-diffusion-v35-large/turbo | Turbo version of Stable Diffusion's newest model. Faster and cheaper performance while still maintaining great prompt adherence and quality. |
Playground V2.5 | playground-v25 | Playground V2.5 outperforms SDXL in many user tests. Suitable for a broad range of images. |
Proteus | proteus-v0.2 | A versatile image generation model with high-quality outputs. |
Promptchan | promptchan | The best NSFW image generation. High-quality image generation with lots of customization options. |
Flux Dev Uncensored | flux-dev-uncensored | Flux Dev Uncensored version for unrestricted image generation |
Fluently | fluently-xl | Fluently model for high-quality image generation |
Lustify SDXL | lustify-sdxl | High-quality NSFW image generation model based on SDXL architecture |
Uber Realistic | uberRealisticPornMerge_urpmv12_4979.safetensors | Generates realistic-looking NSFW images. |
Stable Diffusion 3 Medium | sd3_base_medium.safetensors | Excels at photorealism, typography, and prompt following. Works best in 1024x1024. |
Dreamshaper XL | dreamshaper_8_93211.safetensors | Dreamshaper generates realistic and anime/illustration-style images, and is best suited to sci-fi and fantasy scenes. |
ReV Animated | revAnimated_v122.safetensors | ReV Animated specializes in fantasy, anime and semi-realistic landscapes. |
Stable Diffusion XL | fast-sdxl | Cheap and powerful text-to-image model that generates pictures rapidly. |
Flux Pro V1 | flux-pro | Older version of Flux V1.1. Exceptional quality and prompt adherence. |
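Because image model IDs in the table above mix several naming styles (slashes, version suffixes, `.safetensors` file names), it can help to check an ID locally before building a request. The sketch below does that for a handful of IDs copied from the table. It is illustrative only: the helper name `build_image_request`, the JSON field names `model` and `prompt`, and the choice of the `api-key` header are all assumptions, not the documented request format.

```python
import json
import urllib.request

# Endpoint taken from the "Image models" section.
IMAGE_ENDPOINT = "https://nano-gpt.com/api/generate-image"

# A few model IDs copied from the table; extend this set as needed.
KNOWN_IMAGE_MODELS = {
    "recraft-v3",
    "flux-pro/v1.1",
    "flux-dev",
    "flux/schnell",
    "dall-e-3",
    "fast-sdxl",
}


def build_image_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for an image model."""
    if model not in KNOWN_IMAGE_MODELS:
        raise ValueError(f"unrecognized image model ID: {model!r}")
    # ASSUMPTION: "model" and "prompt" are illustrative field names only.
    payload = {"model": model, "prompt": prompt}
    return urllib.request.Request(
        IMAGE_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # ASSUMPTION: this endpoint may expect "Authorization: Bearer ..." instead.
            "api-key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Rejecting unknown IDs before any network call keeps typos such as `flux-schnell` (instead of `flux/schnell`) from turning into failed requests.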