Bring Your Own Key (BYOK)
Use your own provider credentials. Keys are encrypted at rest; we only display the last 4 characters.
Provider credentials
OpenAI
OpenAI Responses
Anthropic
Openrouter
Chutes
AWS Bedrock
Paste JSON: {"accessKeyId":"...","secretAccessKey":"...","region":"us-east-1"}
Azure
Paste JSON: {"endpoint":"https://<endpoint>","apiKey":"<key>","deploymentName":"<deployment>","apiVersion":"2024-10-21-preview"}
Azure Responses
Paste JSON: {"endpoint":"https://<endpoint>","apiKey":"<key>","apiVersion":"2024-10-21-preview"}
Groq
SambaNova
Vercel
Novita
Akash
NVIDIA
Google AI Studio
Hyperbolic
Morpheus
Fireworks
Together
Example: Call v1/chat/completions with BYOK
Replace YOUR_API_KEY and choose one of the providers: openai | anthropic | aws | openrouter | chutes | akash | nvidia | google
BYOK FAQ
What is the pricing?
We charge a 5% markup on the normal at-cost API rates. Your provider bills you directly; we only charge the markup.
What if I use a free model via BYOK?
If the model that you use is free via the original provider where you use it from we base our 5% markup on what the lowest rate would be if you used the paid version of it. If you for example use an API key that has free requests for gemini-2.5-pro, we will charge the 5% markup based on regular gemini-2.5-pro pricing.
Which endpoint supports BYOK?
BYOK is currently supported only on the OpenAI-compatible endpoint /api/v1/chat/completions.
Do I have to set x-byok-provider?
No. It’s optional. We auto-map based on the model. Setting it is recommended if you want to force a specific provider.
How do I use Google Gemini AI Studio?
Add your Google AI Studio key above, then set the header x-byok-provider: google when calling /api/v1/chat/completions. This forces routing through Google AI Studio for Gemini models.
What usage counts are used for billing?
We prefer provider-reported usage tokens when available (streaming or final usage). If unavailable, we estimate conservatively.
How are keys stored?
Keys are encrypted at rest. We only store the last 4 characters for display and never send keys back to the client.
