Suggestions and bugs
Share what you'd like to see next in NanoGPT or upvote ideas from the community.
Add a suggestion
Let us know what would make NanoGPT more useful to you.
Disclose providers for open-source models
I understand that you rely on many different providers for models, including open-source models in the subscription. However, it isn't clearly written anywhere which models are hosted by which providers. (Or maybe I'm just not looking hard enough.) Any chance you could disclose this information somewhere on the site? Like, "Primary provider for X is Fireworks, fallback Chutes", as an example. Adding a note about it for the "Available Models" list under the "</> API" tab would feel intuitive, for example.
Team reply
1/11/2026
Hi - we have added provider selection for (most of) the open source models now, so you can select your provider of choice. It's also still possible to use the "auto" routing, which works the same as before. For that, we do not display the provider and likely will not.
Updated 1/11/2026
1
Remove CloudFlare
I noticed your site and API are behind the Cloudflare proxy/CDN. Cloudflare inspects and routes all visitor traffic, which can expose users’ IP addresses and request data to a third party and centralizes metadata that could be retained or accessed. For better privacy and to avoid third‑party interception of user requests, please consider removing the Cloudflare proxy (or configure DNS only / self‑hosted CDN) so visitors connect directly to your origin servers. Thank you.
Team reply
2/15/2026
You’re right to flag the privacy implications of reverse proxies/CDNs and we’re evaluating options to reduce third-party exposure where feasible. For clarity: NanoGPT is not currently behind a Cloudflare reverse proxy. We use our hosting provider’s global edge network (running on AWS) for performance and DDoS protection, which means AWS/our edge provider may still process similar connection metadata.
Updated 2/15/2026
1
Clarify provider data retention agreements
I remember that NanoGPT used to advertise "zero data retention" agreements with their LLM providers. At some point this language has been removed (or I'm misremembering) and replaced with "minimum data retention", which is a far more nebulous term, though even "zero data retention" does not have a standardized definition among providers (especially regarding abuse detection). Can you disclose which providers you have special data retention agreements with and what those terms are? It would go a long way to help the transparency of this platform.
Updated 1/7/2026
0
Show TokensPerSecond for each model?
Is there a way to see supported TPS for each model like OpenRouter displays?
Team reply
2/6/2026
We now do this for most (if not all?) models. - Milan
Updated 2/6/2026
1
Add more API settings
Please consider adding generation settings such as Min-P. This would be really helpful, since most prompts and presets for roleplay come with generation settings, and it's a shame that the NanoGPT API does not support the Min-P setting.
Team reply
Hi! We already supported most of those but didn't have it in the documentation properly. https://docs.nano-gpt.com/api-reference/endpoint/chat-completion has now been updated!
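For anyone landing here from search: passing sampling parameters like Min-P through an OpenAI-compatible chat completions endpoint is just a matter of adding them to the request body. A minimal sketch follows; the model id and parameter names (`min_p`, `top_k`) are assumptions based on common open-source conventions, so check the documentation link above for the authoritative list.

```python
import json

# Hypothetical request body for NanoGPT's OpenAI-compatible endpoint.
# min_p keeps only tokens whose probability is at least 5% of the
# top token's probability; top_k limits to the 40 most likely tokens.
payload = {
    "model": "deepseek/deepseek-chat",  # example model id, an assumption
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.8,
    "min_p": 0.05,
    "top_k": 40,
}

body = json.dumps(payload)
# POST this body to https://nano-gpt.com/api/v1/chat/completions
# with an "Authorization: Bearer <key>" header.
```

Clients built on the official OpenAI SDK typically need these non-standard fields passed via an escape hatch such as `extra_body`, since the SDK's typed parameters don't include them.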
Updated 11/21/2025
0
More filter options
More filter options for models (all of them) would be useful, such as "can be coaxed easily into NSFW", "does not guarantee privacy", etc...
Updated 12/4/2025
0
Add Paypal as payment option
I'm primarily using PayPal, so having it as a payment option would be nice
Team reply
12/11/2025
We're working on this but this is proving quite a bit more difficult than we had hoped. Sorry - we'd love to add this in ourselves as well! - Milan
Updated 12/11/2025
1
Continue generating text
Sometimes a reasoning model gets interrupted and does not finish the task. Currently I make it continue by sending a message that just says "continue". It might be better to provide a dedicated button to continue generating.
Updated 12/20/2025
0
Add Qwen3 Embedding 0.6B to Subscription Models
Qwen3 Embedding 0.6B is a powerful and cheap embedding model that is also open source, available here: https://huggingface.co/Qwen/Qwen3-Embedding-0.6B. It would also be nice to obtain vectors without spending API requests.
Team reply
1/15/2026
Sorry, but we're not going to add embedding models to the subscription. - Milan
Updated 1/15/2026
1
Pro / coding Subscription
As open source models become an alternative to commercial models, especially for coding, reliability needs are also increasing. The subscription provides a good way to access such models. For normal chat completion requests, "cheap" backend providers are usually sufficient. But coding requires high reliability (reasoning, tool calling), and some providers apply settings that make models nearly impossible to use for coding. I therefore suggest a pro or coding subscription with the same or similar limits as the normal subscription, but with providers who don't heavily modify model settings (usually the model developers themselves).
Updated 2/3/2026
0
Unified changelog
You guys are a little too fast at adding new features to the website/API. I often only find very useful features (like this new suggestion board) months after they launch because they don't seem to be announced anywhere other than your Discord. Can you please merge what you announce on Discord and the updates section on the website into a unified changelog? Ideally this would be backdated to include everything you've added thus far as well.
Team reply
1/11/2026
https://nano-gpt.com/updates - does something like this work for you, or were you thinking of more of a changelog of code/backend changes as well? - Milan
Updated 1/11/2026
1
API Key Usage Control
Allow finer control of API key usage, including daily RPD and daily NANO consumption allowed.
Team reply
Check out https://nano-gpt.com/api - we've added the ability to set daily RPD and spend limits per key. If this is not what you meant please do let us know via any support channel (or a new suggestion, of course)!
Updated 12/2/2025
0
Opencode
Can you contact the OpenCode team so we can use all our subscription models, as they aren't currently listed.
Team reply
2/6/2026
Could it be that you are using api.nano-gpt.com? As far as we know they just use the v1/models endpoint which should return all. But the URL should be nano-gpt.com/api/v1. - Milan
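The base-URL fix above can be sanity-checked by listing models directly. A hypothetical sketch, assuming the OpenAI-compatible base URL `nano-gpt.com/api/v1` (not `api.nano-gpt.com`) and an environment variable name, `NANOGPT_API_KEY`, chosen here for illustration:

```python
import json
import os
import urllib.request

# Build a request against the /v1/models-style endpoint the reply
# above refers to. The key name NANOGPT_API_KEY is an assumption.
base_url = "https://nano-gpt.com/api/v1"
api_key = os.environ.get("NANOGPT_API_KEY", "sk-example")

req = urllib.request.Request(
    f"{base_url}/models",
    headers={"Authorization": f"Bearer {api_key}"},
)
# Uncomment to actually query the endpoint with a real key:
# with urllib.request.urlopen(req) as resp:
#     model_ids = [m["id"] for m in json.load(resp)["data"]]
```

If a client like OpenCode only shows a partial list, pointing its base URL at `nano-gpt.com/api/v1` and re-fetching models is the first thing to try.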
Updated 2/6/2026
1
Z-Image Turbo in subscription
Hello! Z-Image Turbo is super cheap to run and is good enough quality for most of my needs, where the other models in the subscription don't quite meet my requirements. It would be super awesome if that model could be included in the subscription please! Thank you
Team reply
1/11/2026
This is now included in the subscription! - Milan
Updated 1/11/2026
1
Model Date
I think it would be a good addition to include details about when a model was last updated or uploaded. Why? For example, the recent appearance of the Mistral Large 3 model. It appears in the available models list, but I'm not sure if it's the new version or an older one. While it might have "2512" at the end of the model name, this doesn't guarantee a specific date, as the other existing models don't have anything like that.
Team reply
We've now also added the date-added into the hover-over description, as well as in all other places that could be relevant!
Updated 12/8/2025
0
Interleaved Thinking
Models like GLM 4.7 and MiniMax M2/M2.1 now recommend passing `reasoning_content` or `reasoning_details` back into the context for enhanced reasoning performance. See https://docs.z.ai/guides/capabilities/thinking-mode and https://platform.minimax.io/docs/guides/text-m2-function-call. Is this possible with NanoGPT's OpenAI-compatible API? Or does it perhaps need an Anthropic-compatible API? And if so, can I suggest some clarity in the docs on how this might work?
Team reply
12/23/2025
Hi! This is how we currently do this - any reasoning_content or reasoning_details are converted to the standard that the specific provider uses, nothing is dropped. - Milan
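The pass-through described in the reply can be sketched as follows: the assistant turn goes back into the context with its `reasoning_content` attached, and (per the reply) NanoGPT converts it to whatever format the upstream provider expects. The message contents here are made up for illustration; the field name follows the GLM/MiniMax docs linked in the question.

```python
# Hedged sketch of interleaved thinking over an OpenAI-compatible API:
# the previous assistant turn is resent with its reasoning attached.
messages = [
    {"role": "user", "content": "Plan the refactor."},
    {
        "role": "assistant",
        "content": "Step 1: extract the parser...",
        # Reasoning captured from the previous response, passed back verbatim:
        "reasoning_content": "The user wants a multi-step plan, so...",
    },
    {"role": "user", "content": "Continue with step 2."},
]
# This messages list then goes into the normal chat-completions payload.
```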
Author
12/23/2025
Wow, thank you for the reply! Glad to hear that you pass everything through. I'll give it a shot in my proxy layer.
Author
12/23/2025
It looks like passing `reasoning_content` works fine, however I'm getting 503 errors when I pass the reasoning in the `reasoning_details` field, specifically to MiniMax M2.1. It feels a bit like a timeout. On GLM 4.7 Thinking using the same interleaved thinking technique with `reasoning_content`, everything works fine. No 503 errors. Is it something wrong on my side, or is the `reasoning_details` field actually causing real issues?
Updated 12/23/2025
3
Embedding models
Would it be possible to add embedding models to the subscription?
Team reply
2/6/2026
Hi - we're not going to do this, there are not enough providers hosting embedding models for this to make sense for us to do. Sorry :/ - Milan
Updated 2/6/2026
1
Chat Organization
For those of us who are less organized, it would be nice to have an option in the all-conversations tab to select multiple chats on a page and apply a function from the three-dot menu to all of them at once. Currently those functions can only be applied to one chat at a time, and you can't even select another chat without returning to the all-conversations screen.
Author
12/11/2025
Yes, this is perfect thank you so much!
Updated 12/11/2025
1
Add Qwen3 Embedding und Reranker models to subscription
It would be great to add all Qwen3 embedding and reranking models to the subscription: Qwen3-Embedding-8B Qwen3-Embedding-4B Qwen3-Embedding-0.6B Qwen3-Reranker-8B Qwen3-Reranker-4B Qwen3-Reranker-0.6B Currently, such models are becoming increasingly necessary for codebase indexing and RAG tasks. As far as I have seen, competitors already offer these models within their subscriptions.
Team reply
2/6/2026
Hi - we're not going to do this, there are not enough providers hosting embedding models for this to make sense for us to do. Sorry :/ - Milan
Updated 2/6/2026
1
Add Google Drive
Would be nice to add Google drive files straight to the model instead of downloading first
Team reply
12/11/2025
We have this working - you can actually already try it now. We do however need to do extra verification with Google, since otherwise every user who tries to use this is shown a security warning, which is quite a pain (essentially Google's way of saying "this company did not yet prove what they want to use this for"). This might take a while - we've sent in our side but are now dependent on Google's review. We're leaving it up in the meantime for people to try out.
Updated 12/11/2025
1
Prompt Caching for more models
Many providers do offer prompt caching nowadays and utilizing that yields great discounts. Unfortunately, through NanoGPT, we can't benefit from those (except with Claude, which is the most inconvenient to implement). That remains a big advantage of OpenRouter. For me personally, the most interesting providers are Moonshot AI, Google (Gemini) and DeepSeek. I am aware there are some suggestions like this already, but they can't be voted on, so this is my way of expressing demand. I'm also aware this might hurt your profits, if you only pay discounted rates while you charge base costs, but that would also mean the pricing isn't fair.
Team reply
1/29/2026
Thanks! So, a few aspects to this. We actually implemented it for OpenAI yesterday, and we want to add it for more models (likely Gemini next). For Deepseek and co this is a bit more complicated: we generally do not run these models through the direct provider (Deepseek/Moonshot), mostly for privacy reasons. Some open source providers do offer prompt caching, so when those are used we can pass it on, but they are not always the cheapest providers, so we have to figure out what to do there. Maybe people could pass a "caching" header for those models/providers, or something similar.
Updated 1/29/2026
1
Add Toggleable time-awareness
Either on a per-conversation level (where the web-search/thinking/memory toggles are), a per-preset level, globally, or through a prompt tag, e.g. <CURRENT_TIME> (in the manner you already have <CURRENT_DATE>). Plz I beg, I need my character to stop hallucinating timestamps and arguing when I say it's X time but they think it's Y. It's also a big boost for immersion in general when prompt engineering ❤️
Team reply
1/18/2026
Sorry, but we don't really like to put this into the Settings/system prompt and such. Mostly because it feels like something that is quite.. specific in a way, and every option we add adds complexity and takes up space. Sorry :/ - Milan
Updated 1/18/2026
1
Another separate subscription tier that includes Gemini and Claude?
Maybe people will pay for it, monthly limits can be toned down for those two models.
Team reply
1/18/2026
We will not do this, sorry. It's simply way too hard to figure out how much we would need to charge, and we'd have a lot of risk that people use it too much. We understand it's very attractive for users hah, but it's not for us. - Milan
Updated 1/18/2026
1
Pay with cash (by mail)
Or gift cards/voucher codes that can be purchased with cash from somewhere.
Updated 1/4/2026
0
Add Suno AI
It would be great to have suno available for music generation in nano-gpt. I love your work!
Updated 12/8/2025
0
Anonymous vouchers
Hello NanoGPT team! A suggestion I have is to provide vouchers like Mullvad does, so people can deposit money anonymously without needing crypto. I didn't want to deal with the tax headache of cryptocurrencies, so I used the Stripe process to add funds. I was under the impression that my personal information would not be knowable by NanoGPT, but I can see it under billing, so you surely must have access, right? Vouchers would enable a clean separation for those like myself who do not want to rely on trust.
Updated 12/4/2025
0
Custom HuggingFace support
Your application already supports over 500 LLMs, and the Custom CivitAI update got me thinking - it would be incredible if users could access HuggingFace models through NanoGPT. Because RAM and GPU prices are rising, an AI rig is impossible for the average consumer, myself included. Adding this feature would be perfect because users could access more models and use them without getting expensive hardware. The basic idea would be to find a HF URL, like https://huggingface.co/mradermacher/Huihui-GLM-4.7-Flash-abliterated-i1-GGUF (a currently inaccessible model) that contains GGUFs and then to paste and submit it. Like Custom CivitAI, the LLM would have to be resolved to be used, but once finished you could start using it. Likely all users would enjoy this feature.
Team reply
1/29/2026
Thanks! We'd love to add something like this but as far as we know this is not possible to add in a cheap way - as in while this is possible for image models there is no "load on demand" for text models. Maybe we are wrong - if you did find it somewhere we'd definitely love to know! - Milan
Author
1/29/2026
Take a look at https://synthetic.new and their documentation. It appears that they offer this service using vLLM. It might not hurt to look at their API and see if it can be integrated, even if it's more expensive than a normal provider
Updated 1/29/2026
2
Highlight open-source / open-weight models
It would be cool to have a badge in the model list for open-source / open-weight models, similar to the (old?) badges for vision-enabled models, or maybe another way to highlight them (colored model names perhaps), as the description for models can’t really be seen on a mobile device. Why? Because I like to run open models most of the time and consider them as having better privacy, but still want to be able to use proprietary models should I want to. It’s a bit of a pain to know whether they are open source or not. This might clutter the UI, so I guess it can also be an opt-in/opt-out setting?
Updated 1/20/2026
0
Memory without relying on Google/Gemini
There are understandable privacy concerns with relying on Google/Gemini behind Polychat for Memory. I believe there should be an option to use the regular open source model providers, which have far better privacy practices.
Team reply
1/17/2026
We've forwarded your suggestion to the Polychat folks, since they are the ones who build this service. - Milan
Updated 1/18/2026
1
Add support for google grounding api + other native google tools
Gemini natively supports a grounding/Google Search API that is arguably more powerful than third-party search tools such as the supported Tavily or Linkup tools. Currently, it cannot be used through the nano-gpt API. Supporting the native grounding API would allow for better search performance. Additionally, supporting more tools like Google's code execution tool would enable more use cases, matching the official API's capabilities.
Updated 1/11/2026
0
Protect conversations with pincode or 2FA
Please add an option to protect past conversations with pincode or 2FA
Updated 1/5/2026
0
Rejected submissions placeholder
Currently there are no rejected submissions and the placeholder text says "Be the first to submit an idea!", implying to suggest an idea to get rejected. Would be better to replace the text with something like "All of your ideas are good so far!". Or reject this idea. That would be funny.
Team reply
1/18/2026
We have solved this by now rejecting many submissions, hah. - Milan
Updated 1/18/2026
1
GLM 4.7 - Original Option With Reasoning
Due to the free, subscription-based GLM providers sometimes being way too slow, I'd like the thinking/reasoning version of the original GLM 4.7 to be offered as well. I *could* just get it directly from z.ai, but... I'd rather get all my APIs from here, neat and organized.
Team reply
12/25/2025
Hiya! We have zai-org/glm-4.7:thinking, is that not exactly what you want? - Milan
Updated 12/25/2025
1
Consensus Across Models
Sup.Ai is doing something really cool that I bet you guys could implement. They give the user the ability to submit to multiple models, which you have, but instead of just returning the results, they pass all the results through a model that evaluates the responses and constructs a consensus using the best aspects of each. Then they return the final consensus to the user. The idea is that while one model might hallucinate or write poorly, it's unlikely that they all would at the same time. I would love, for example, to be able to combine Deepseek, GLM, and Kimi K2 or something like that. The biggest trick here would be how you prompt the final model to generate a useful consensus, but if you pulled it off, it would be incredible. Another advantage is that if a model is unavailable or unwilling to complete the task, another model will be, so the user never experiences a dropped response.
Updated 12/9/2025
0
Ministral & Mixtral
I would love to see Ministral 8b and Ministral 3b added, as well as the Mixtral 8x7b & 8x22b Models, if that would be possible
Updated 12/4/2025
0
Simplified API Pricing Description
I want to make the API pricing easier to understand. This includes showing the price for image inputs and for video generation with or without audio. Some models do not have pricing, so I also want to clearly show which features each model can access, such as whether image input is allowed.
Updated 11/25/2025
0
More Usage Information
As a user, I'd like to see more detailed information for each log in the Usage page, including provider, latency, throughput of each log/usage.
Updated 2/14/2026
0
Add latency and TPS counters to subscription models
Some of the subscription models take forever to start replying. Would be nice to know how fast they are in advance.
Updated 1/28/2026
0
Option for chime sound when model finishes response
Staged work is often needed when working with reasoning models, as it is impractical to sit and wait for the reply to be completed. However, it is easy to forget you were expecting a reply while you do other things, ultimately wasting time that could be used to have the LLM do more work or iterate. It would be quite useful to have an optional (very gentle, not annoying) little chime sound that plays immediately after message completion. Several sound options would be nice, so a pleasing one can be chosen.
Team reply
1/12/2026
Great idea! We already have the notification possibility (the little bell, bottom left) but not with a chime. We'll implement - Milan
Team reply
1/12/2026
This is now live - you can pick from a few different sounds. - Milan
Author
1/12/2026
It works perfectly. Thank you. For me this is highly useful and makes a big difference, thanks so much for the excellent implementation.
Updated 1/12/2026
3
Chat Page on Mobile
The chat page on mobile is far too crowded, particularly from the top bar down. It's hard to read and navigate because the bottom bar for text input is so tall.
Team reply
1/12/2026
First off - we agree. Did you know you can turn off the sticky top bar in Settings? https://nano-gpt.com/settings#conversation Not saying that fully solves it, and we do need to improve, but that might be a temporary improvement for you already. - Milan
Updated 1/12/2026
1
Auto-timestamp prompts (and dynamic titles)
AI isn't aware by default of the user's current date, time, and timezone. For time-sensitive suggestions, an option to auto-add the user's current time to prompts would be helpful. Then the AI could respond: for this interaction in this business communication, get back to the client on the Xth of Jan at ~14:00. As an extension to this: that response from the AI could be used to sort the conversations list, so that contacts requiring earlier interaction show above those with later dates. A more general approach to dynamic titles would be letting the user instruct the AI to respond in a certain format - some sort of front matter - that holds the next follow-up date. NanoGPT could interpret hints like this (and others) and adapt its views accordingly. These are effectively three suggestions that could be considered separately; feel free to split as you see fit.
Updated 1/7/2026
0
Auto aspect ratio for Nano Banana Pro
Automatically choose the aspect ratio for generated image based on the original image.
Team reply
12/29/2025
Thanks! This should happen now. - Milan
Author
12/29/2025
Nice, thanks!
Updated 12/29/2025
2
Conversation title in browser tab
I believe you said you're going to bring this back, that was a while ago though
Team reply
12/26/2025
We're pushing this live! - Milan
Author
12/27/2025
It only changes the title when navigating via the conversation list, but even then it's always "Untitled conversation"
Team reply
12/27/2025
I think that for old conversations it might still show this now, but for new ones it should now take the title from either the conversation title or if that is unavailable from the first message - Milan
Updated 12/27/2025
3
Perplexity as Search Option (BYOK)
Team reply
12/25/2025
Great idea, will do. Both as BYOK and directly through us.
Team reply
12/26/2025
This is now live - Milan
Updated 12/26/2025
2
Implementing 2FA
Please consider adding 2FA using authenticator apps. This would act as an additional layer of security.
Updated 12/9/2025
0
Support for base64 encoded embeddings
Problem: some client libraries for OpenAI compatible APIs expect format specification for embeddings to work (base64 is an option officially). Example for such a library: OpenAI official nuget package
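For context on what `base64` support entails: in the OpenAI embeddings API, `encoding_format="base64"` returns the vector as a base64 string of little-endian float32s instead of a JSON array of numbers. A minimal sketch of the decode step a compatible server's clients perform (the round-trip data here is fabricated for the demo):

```python
import base64
import struct

def decode_embedding(b64: str) -> list[float]:
    """Decode a base64 string of little-endian float32s into floats."""
    raw = base64.b64decode(b64)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))

# Round-trip demo with a fake 3-dimensional embedding:
fake = base64.b64encode(struct.pack("<3f", 0.1, 0.2, 0.3)).decode()
vec = decode_embedding(fake)
```

Supporting this format is mostly a serialization change on the server side, which is why client libraries (like the official OpenAI NuGet package) assume it is available.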
Updated 12/8/2025
0
Seedream Multiple Images
Hi, it would be very useful if you allowed uploading multiple images as references for Seedream, since the model supports it. Currently I have to do multi-reference generation elsewhere, but I would prefer NanoGPT.
Team reply
Hi! This should already be possible - you can upload up to 10. If this is not working for you could you send us a message via support? Can be via email, Discord, or the bug report button.
Updated 11/27/2025
0
Increased weekly limits for long-term subscribers
I read about the situation with those 3-5% of users who abused the service and its unlimited tokens, and I understand the introduction of the 30M-token weekly limit. But even among the other 95% there may be a few percent of loyal users who need extra tokens per week (especially when using agents or reasoning models). I would like to suggest making the weekly limit depend on how long a user has been subscribed. It could be a flat bonus after X months (e.g. 60M after 3-6 months) or incremental (e.g. +3-5M per month) - kind of a loyalty reward system. Abusers usually create many new accounts and probably wouldn't be patient enough to cause issues this way.
Updated 2/15/2026
0
Can Christian Bible Expert v2 24b be added?
Another random model request (sorry!) but would you consider adding Christian Bible Expert v2.0 24B? Should be on Featherless AI. It should be small and cheap and would help me a lot 🙏 https://huggingface.co/sleepdeprived3/Christian-Bible-Expert-v2.0-24B
Updated 2/14/2026
0



