
Trusted Execution Environments & Confidential AI on NanoGPT

Posted on 5/18/2025

Ever wondered how NanoGPT’s TEE models can generate answers without us, or anyone else, ever being able to read your data? The secret is a Trusted Execution Environment (TEE). This post walks you through:

1. What a TEE is and how it works.

2. Why it matters for your privacy and compliance.

3. Why TEEs beat other privacy techniques for real-time AI inference.

4. How to use TEE-hosted models on NanoGPT.

5. How to verify—cryptographically—that your request was handled inside an enclave.

6. Extra resources & FAQ.

1. What is a TEE?

A Trusted Execution Environment is a hardware‐isolated area of a CPU or GPU where code can run while:

  • being isolated from the host OS, hypervisors, and even cloud administrators,
  • being measured at boot—any change in code or weights changes the measurement hash, and
  • producing a remote attestation quote that anyone can verify against the chip vendor's root keys (Intel, NVIDIA, …).

Think of it as a secure vault inside the processor: you hand in encrypted data, the vault decrypts it internally, runs the model, and sends back only the answer (plus a proof that it really happened inside the vault).
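
The "measured at boot" idea can be sketched in a few lines of Python: the enclave's identity is simply a hash over its code and model weights, so changing even a single byte yields a different measurement. (The function and inputs here are illustrative, not NanoGPT's actual measurement scheme.)

```python
import hashlib

def measurement(code: bytes, weights: bytes) -> str:
    """Toy enclave 'measurement': a SHA-256 digest over code + weights."""
    h = hashlib.sha256()
    h.update(code)
    h.update(weights)
    return h.hexdigest()

baseline = measurement(b"model-server-v1", b"\x00\x01\x02\x03")
tampered = measurement(b"model-server-v1", b"\x00\x01\x02\x04")  # one byte flipped

print(baseline == tampered)  # → False: a tampered build yields a new measurement
```

This is why attestation catches hidden weight swaps: the quote signed by the hardware embeds this digest, and a verifier compares it to the expected value.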

2. Why does it matter?

1. Data confidentiality – Your prompts and proprietary context never leave the enclave unencrypted. Even if the server OS is compromised, attackers can't read your data in RAM.

2. Model integrity – You can verify the exact build of our model that is running. No hidden weights, no last-minute swaps.

3. Regulatory compliance – TEEs map cleanly to GDPR/CCPA requirements for data minimisation and confidentiality.

4. Zero-trust architecture – You no longer have to trust us (or our cloud provider); you trust the hardware vendor's root of trust.

3. TEEs vs. other privacy techniques

Technique              | Latency                   | Scalability | Leakage risk
-----------------------|---------------------------|-------------|-------------------------------
TLS only               | ✅ low                    | ✅ high     | ⚠️ server can read plaintext
Homomorphic encryption | ❌ high (seconds–minutes) | ❌ limited  | ✅ none
Differential privacy   | ✅ low                    | ✅ high     | ⚠️ adds noise, lowers accuracy
TEE (NanoGPT)          | ✅ low                    | ✅ high     | ✅ none

For real-time chat, TEEs provide the best trade-off—hardware-enforced privacy without the massive latency penalty of homomorphic encryption.

4. Using a TEE model on NanoGPT

Every TEE-hosted model is prefixed with TEE/:

curl -X POST https://nano-gpt.com/api/v1/chat/completions \
     -H "Authorization: Bearer <YOUR_API_KEY>" \
     -H "Content-Type: application/json" \
     -d '{
           "model":"TEE/hermes-3-llama-3.1-70b",
           "messages":[{"role":"user","content":"Hello!"}]
         }'
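
If you prefer Python over curl, the same request can be built with the standard library alone. The endpoint, model name, and payload shape are taken from the curl example above; everything else is ordinary `urllib` plumbing.

```python
import json
import urllib.request

API_KEY = "<YOUR_API_KEY>"

def build_chat_request(model: str, messages: list) -> urllib.request.Request:
    """Build a POST request for NanoGPT's chat completions endpoint."""
    payload = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        "https://nano-gpt.com/api/v1/chat/completions",
        data=payload,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request(
    "TEE/hermes-3-llama-3.1-70b",
    [{"role": "user", "content": "Hello!"}],
)
# To actually send it:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```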

5. Verifying that your session was private

a) Fetch the attestation report

curl -H "Authorization: Bearer <YOUR_API_KEY>" \
     https://nano-gpt.com/api/v1/tee/attestation?model=TEE/hermes-3-llama-3.1-70b

The endpoint fetches a raw attestation quote, validates it server-side, and returns the parsed fields (including the enclave's measurement and public signing key) so you can run your own checks.
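
Once you have the parsed report, a basic check boils down to comparing the measurement against a value you trust. A minimal sketch (the field name `mr_enclave` is an assumption for illustration; match it to the actual response fields):

```python
import hmac

def check_measurement(report: dict, expected_mr_enclave: str) -> bool:
    """Compare the enclave measurement against a known-good value.

    hmac.compare_digest performs a constant-time comparison,
    avoiding timing side channels.
    """
    got = report.get("mr_enclave", "")
    return hmac.compare_digest(got.lower(), expected_mr_enclave.lower())

report = {"mr_enclave": "AB12CD"}           # stand-in for the parsed attestation
print(check_measurement(report, "ab12cd"))  # → True
```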

b) Get your chat answer (optionally streaming)

REQUEST_ID=$(curl -s -X POST https://nano-gpt.com/api/v1/chat/completions \
  -H "Authorization: Bearer <YOUR_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"model":"TEE/hermes-3-llama-3.1-70b","messages":[{"role":"user","content":"Who won the 2022 Nobel Prize in Physics?"}]}' \
  | jq -r '.id')

c) Retrieve the enclave signature

curl -H "Authorization: Bearer <YOUR_API_KEY>" \
     "https://nano-gpt.com/api/v1/tee/signature/$REQUEST_ID?model=TEE/hermes-3-llama-3.1-70b&signing_algo=ecdsa"

The response contains the ECDSA signature produced inside the enclave. Combine it with the public key from step (a) to verify end-to-end integrity.
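
The verification step itself is standard ECDSA. Here is a sketch using the third-party `cryptography` package; the exact byte encoding of NanoGPT's signature and public key is an assumption here, so consult the attestation response for the real formats. The demo signs with a throwaway key purely so the example is self-contained.

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

def verify_response(public_key, message: bytes, signature: bytes) -> bool:
    """Return True iff `signature` over `message` checks out under `public_key`."""
    try:
        public_key.verify(signature, message, ec.ECDSA(hashes.SHA256()))
        return True
    except InvalidSignature:
        return False

# Self-contained demo: sign with a throwaway key, then verify.
priv = ec.generate_private_key(ec.SECP256R1())
msg = b"chat completion bytes as returned by the API"
sig = priv.sign(msg, ec.ECDSA(hashes.SHA256()))
print(verify_response(priv.public_key(), msg, sig))  # → True
```

In the real flow, the public key comes from the attestation report in step (a) and `msg` is the response you received, which is what ties the answer back to the enclave.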

d) Simple test script

We provide a simple Python script example on our API page and in our documentation. Fill in your API key, run it, and then verify the result on an independent website such as Etherscan.

FAQs

Isn't an enclave just a VM?

No. A VM can still be inspected by the hypervisor. A TEE's memory is encrypted by hardware keys inaccessible to the host.

Does TEE add latency?

Our benchmarks show ≤100 ms of extra overhead for Llama-3-70B, well below the latency humans notice in a chat session.

Can I use my own verifier?

Absolutely. The attestation endpoint exposes all fields required by Intel's and NVIDIA's verifier libraries. Our JWT-based tee-verify endpoint also gives you a minimal option of roughly 15 lines of code.
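
To see what a JWT-based verifier works with, here is how the payload segment of a JWT can be decoded with the standard library alone. This only inspects the claims; a real verifier must also check the token's signature against the enclave's key. The demo token is constructed in place, so the claim names are illustrative.

```python
import base64
import json

def jwt_payload(token: str) -> dict:
    """Decode the middle (payload) segment of a JWT without verifying it."""
    payload_b64 = token.split(".")[1]
    # JWTs use unpadded base64url; restore padding before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Demo token built in place (header.payload.signature):
segs = [
    base64.urlsafe_b64encode(json.dumps(p).encode()).rstrip(b"=").decode()
    for p in ({"alg": "ES256"}, {"model": "TEE/hermes-3-llama-3.1-70b"})
]
token = f"{segs[0]}.{segs[1]}.fakesig"
print(jwt_payload(token)["model"])  # → TEE/hermes-3-llama-3.1-70b
```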

Can you explain TEEs like I’m 5?

It’s a complicated subject, but here is an AI’s best attempt:

Of course! Imagine you have a toy box where you keep your favorite toys. You don’t want anyone else to take or play with your toys unless you say it’s okay. A TEE, which stands for Trusted Execution Environment, is like a special, super-secure toy box inside a computer or phone.

When your computer or phone needs to keep a piece of secret information (like a password or important game data) really safe, it puts it inside this special toy box. Only programs that have permission can open it, and no one else can peek in.

Even if someone tries to break into your device, they can't get into the special toy box because it’s locked up tight with advanced magic (well, grown-up math and science magic). That way, all the secret stuff stays safe!

Can you explain TEEs like I’m 18 and know a bit about computers?

Sure! A Trusted Execution Environment (TEE) is a secure part of your device’s processor designed to protect sensitive data. Think of your computer or smartphone like a building where apps and programs work in shared spaces. Most of the building is accessible to everything—including viruses or hackers if they manage to get in.

Now picture the TEE as a top-secret, locked vault inside that building. It's isolated and shielded from the rest of the system, ensuring that only authorized code and trusted data can enter. Even if the device gets hacked, anything inside the TEE stays safe.

Why does this matter? The TEE handles sensitive operations, like storing passwords, encryption keys, or biometric data (e.g., fingerprints or face scans). When you unlock your phone with a fingerprint, the scan is processed only in the TEE. That way, no app—or hacker—can access it or tamper with it.

Examples of TEE technology include Apple’s Secure Enclave and ARM’s TrustZone. These are built into the device’s hardware using encryption and security measures to guarantee added protection.

In simple terms, TEE is the ultra-secure “safe space” in your device where critical data stays private and protected, even if the rest of the system gets compromised.