Swift Guide for Text Generation APIs
Posted on 2/25/2025
Want to add AI-powered text generation to your Swift app? This guide covers everything you need to know: setup, implementation, error handling, and more. Here’s what you’ll learn:
- Top APIs for Swift: OpenAI GPT, Google Gemini, and NanoGPT.
- How to integrate APIs: Tools like Xcode, Swift Package Manager, and OpenAPI Generator.
- Authentication methods: Keychain storage, OAuth 2.0, and server-side key retrieval.
- Error handling: Manage rate limits, network issues, and authentication errors.
- Performance tips: Caching, streaming responses, and hybrid on-device/cloud solutions.
- Privacy protection: Store data locally, use encryption, and comply with Apple’s privacy rules.
Quick Comparison of Text Generation APIs
API Name | Key Features | Privacy | Cost Efficiency |
---|---|---|---|
OpenAI GPT | Large context, versatile models | Standard | $0.01/1K tokens |
Google Gemini | Multimodal (text, voice, video) | Standard | Free up to limits |
NanoGPT | On-device, privacy-first | Local storage | Pay-as-you-go ($0.10 minimum)
Start building smarter Swift apps today!
Setup Requirements
Prepare your Swift environment for integrating a text generation API by configuring the necessary development tools and security protocols.
Required Tools Setup
Xcode serves as the main development environment, while Swift Package Manager (SPM) manages external dependencies within Swift's build system and Xcode.
Component | Purpose | Key Features |
---|---|---|
Xcode | Main IDE | Code completion, debugging tools, simulator integration |
Swift Package Manager | Dependency Management | Automated dependency resolution, version control integration |
OpenAPI Generator | API Integration | Swift client code generation, OpenAPI 3.0/3.1 support |
To streamline Swift client code generation, use the latest version of swift-openapi-generator (Swift 5.9+).
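As a rough sketch, a Package.swift along these lines pulls in the generator plugin together with its runtime and URLSession transport. The package and product names come from Apple's swift-openapi repositories; the package name, platforms, and version requirements here are illustrative, so pin whatever your project actually needs:

```swift
// swift-tools-version: 5.9
import PackageDescription

let package = Package(
    name: "TextGenClient",                      // placeholder package name
    platforms: [.iOS(.v16), .macOS(.v13)],
    dependencies: [
        .package(url: "https://github.com/apple/swift-openapi-generator", from: "1.0.0"),
        .package(url: "https://github.com/apple/swift-openapi-runtime", from: "1.0.0"),
        .package(url: "https://github.com/apple/swift-openapi-urlsession", from: "1.0.0"),
    ],
    targets: [
        .target(
            name: "TextGenClient",
            dependencies: [
                .product(name: "OpenAPIRuntime", package: "swift-openapi-runtime"),
                .product(name: "OpenAPIURLSession", package: "swift-openapi-urlsession"),
            ],
            plugins: [
                // Generates the Swift client from your OpenAPI document at build time.
                .plugin(name: "OpenAPIGenerator", package: "swift-openapi-generator"),
            ]
        )
    ]
)
```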
API Authentication Setup
Multiple methods are available to secure API access. Here's an overview:
Authentication Method | Security Level | Implementation Complexity |
---|---|---|
CloudKit Key Management | High | Medium |
Server Retrieval + Keychain | High | High |
OAuth 2.0 with Access Tokens | Very High | Medium |
.plist Injection via .xcconfig | Medium | Low |
For production environments, follow these best practices:
- Store sensitive credentials in the Keychain.
- Use OAuth 2.0 access tokens for API authentication.
- Implement server-side key retrieval to keep sensitive data secure.
- Enable key rotation mechanisms to maintain long-term security.
The most secure setup combines CloudKit key management with Keychain storage. This approach leverages Apple's built-in security features and keeps raw keys out of your app bundle and source control. When implementing OAuth 2.0, treat access tokens as Bearer tokens and include them in the Authorization header of your API requests.
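To illustrate the Keychain recommendation, here is a minimal sketch that saves and reads an API key as a generic password item using the Security framework. The service and account identifiers are placeholders; use your own bundle-specific values:

```swift
import Foundation
import Security

enum KeychainError: Error {
    case unexpectedStatus(OSStatus)
}

/// Minimal helper that stores and reads an API key as a generic password item.
struct APIKeyStore {
    // Placeholder identifiers; replace with your own values.
    let service = "com.example.textgeneration"
    let account = "api-key"

    func save(_ key: String) throws {
        let query: [String: Any] = [
            kSecClass as String: kSecClassGenericPassword,
            kSecAttrService as String: service,
            kSecAttrAccount as String: account
        ]
        // Delete any existing item first so the save behaves like an upsert.
        SecItemDelete(query as CFDictionary)

        var attributes = query
        attributes[kSecValueData as String] = Data(key.utf8)
        let status = SecItemAdd(attributes as CFDictionary, nil)
        guard status == errSecSuccess else { throw KeychainError.unexpectedStatus(status) }
    }

    func load() throws -> String? {
        let query: [String: Any] = [
            kSecClass as String: kSecClassGenericPassword,
            kSecAttrService as String: service,
            kSecAttrAccount as String: account,
            kSecReturnData as String: true,
            kSecMatchLimit as String: kSecMatchLimitOne
        ]
        var result: AnyObject?
        let status = SecItemCopyMatching(query as CFDictionary, &result)
        if status == errSecItemNotFound { return nil }
        guard status == errSecSuccess, let data = result as? Data else {
            throw KeychainError.unexpectedStatus(status)
        }
        return String(data: data, encoding: .utf8)
    }
}
```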
These security measures will be crucial for the API call routines covered in the next section.
API Implementation Steps
Learn how to set up text generation API calls in Swift using URLSession, handle JSON responses, and manage errors effectively.
URLSession API Calls
Here's how to make API calls in your Swift app to generate AI-driven text:
```swift
struct TextGenerationAPI {
    // Request body encoded as JSON; explicit keys keep the snake_case field name.
    struct RequestBody: Encodable {
        let prompt: String
        let maxTokens: Int

        enum CodingKeys: String, CodingKey {
            case prompt
            case maxTokens = "max_tokens"
        }
    }

    let apiKey: String
    let baseURL = "https://api.textgeneration.com/v1/generate"

    func generateText(prompt: String) async throws -> GeneratedText {
        var request = URLRequest(url: URL(string: baseURL)!)
        request.httpMethod = "POST"
        request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
        request.setValue("application/json", forHTTPHeaderField: "Content-Type")

        // Encode the request parameters as JSON.
        request.httpBody = try JSONEncoder().encode(RequestBody(prompt: prompt, maxTokens: 100))

        let (data, response) = try await URLSession.shared.data(for: request)

        guard let httpResponse = response as? HTTPURLResponse,
              httpResponse.statusCode == 200 else {
            throw APIError.invalidResponse
        }

        return try JSONDecoder().decode(GeneratedText.self, from: data)
    }
}
```
This function sends a POST request, encodes parameters as JSON, and includes the API key in the headers for authorization. It also checks for a valid server response.
JSON Response Handling
Decode the JSON response to map it into Swift models:
```swift
struct GeneratedText: Decodable {
    let id: String
    let text: String
    let createdAt: Date

    enum CodingKeys: String, CodingKey {
        case id
        case text
        case createdAt = "created_at"
    }
}

let decoder = JSONDecoder()

// The explicit CodingKeys above already map "created_at", so no keyDecodingStrategy is needed;
// combining .convertFromSnakeCase with explicit snake_case CodingKeys would break key matching.
// Parse ISO 8601-style timestamps such as "2025-02-25T10:30:00+0000".
let dateFormatter = DateFormatter()
dateFormatter.locale = Locale(identifier: "en_US_POSIX")
dateFormatter.dateFormat = "yyyy-MM-dd'T'HH:mm:ssZ"
decoder.dateDecodingStrategy = .formatted(dateFormatter)
```
The `GeneratedText` struct maps the API response, while the explicit `CodingKeys` and the decoder's date strategy handle key formatting and date conversion.
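A quick check of the decoder with a hypothetical response payload (the field values are made up for illustration):

```swift
// Hypothetical JSON payload matching the fields above.
let sampleJSON = Data("""
{ "id": "gen_123", "text": "Hello from the model.", "created_at": "2025-02-25T10:30:00+0000" }
""".utf8)

let result = try decoder.decode(GeneratedText.self, from: sampleJSON)
print(result.text)   // "Hello from the model."
```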
Error Management
Handle errors gracefully to improve user experience:
```swift
enum APIError: Error {
    case invalidResponse
    case rateLimitExceeded
    case authenticationFailed
    case networkError

    var errorMessage: String {
        switch self {
        case .rateLimitExceeded:
            return "API rate limit exceeded. Please try again in 60 seconds."
        case .authenticationFailed:
            return "Invalid API key or authentication failed."
        case .networkError:
            return "Network connection error. Please check your internet connection."
        case .invalidResponse:
            return "Invalid response from server."
        }
    }
}
```
This structured error enumeration provides clear messages for each possible issue.
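One way to wire these cases up, shown here as a sketch rather than anything dictated by a specific API, is to map HTTP status codes onto the enum before decoding the body:

```swift
// Sketch: translate HTTP status codes into APIError cases.
func mapStatusCode(_ statusCode: Int) -> APIError? {
    switch statusCode {
    case 200...299: return nil                    // success: no error to report
    case 401, 403:  return .authenticationFailed  // bad or missing credentials
    case 429:       return .rateLimitExceeded     // too many requests
    default:        return .invalidResponse
    }
}
```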
"URLSession is a class from the Foundation framework you use to perform network requests. It allows you to download data from and upload data asynchronously to endpoints identified by URLs."
For retries, implement exponential backoff to handle rate-limiting:
```swift
// Assumes this lives in TextGenerationAPI alongside generateText(prompt:).
func performRequestWithRetry(prompt: String, maxAttempts: Int = 3) async throws -> GeneratedText {
    var attempts = 0
    while attempts < maxAttempts {
        do {
            return try await generateText(prompt: prompt)
        } catch APIError.rateLimitExceeded {
            attempts += 1
            if attempts == maxAttempts { throw APIError.rateLimitExceeded }
            // Wait 2, 4, 8... seconds before the next attempt.
            try await Task.sleep(nanoseconds: UInt64(pow(2.0, Double(attempts)) * 1_000_000_000))
        }
    }
    throw APIError.networkError
}
```
This function retries failed requests with increasing delays, ensuring better handling of temporary issues like rate limits.
Implementation Guidelines
Use these guidelines to improve performance, protect data, and manage load effectively in your Swift app, building on the API implementation steps.
Performance Optimization
Minimize latency by caching API responses:
```swift
final class APIResponseCache {
    static let shared = APIResponseCache()

    // NSCache requires class values, so wrap the GeneratedText struct in a small box.
    private final class Entry {
        let response: GeneratedText
        init(_ response: GeneratedText) { self.response = response }
    }

    private let cache = NSCache<NSString, Entry>()

    func getCachedResponse(for prompt: String) -> GeneratedText? {
        cache.object(forKey: prompt as NSString)?.response
    }

    func cacheResponse(_ response: GeneratedText, for prompt: String) {
        cache.setObject(Entry(response), forKey: prompt as NSString)
    }
}
```
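A possible way to combine the cache with the earlier TextGenerationAPI call (the helper name here is illustrative):

```swift
// Illustrative helper: consult the cache before calling the API, then store the result.
func generateTextCached(api: TextGenerationAPI, prompt: String) async throws -> GeneratedText {
    if let cached = APIResponseCache.shared.getCachedResponse(for: prompt) {
        return cached
    }
    let fresh = try await api.generateText(prompt: prompt)
    APIResponseCache.shared.cacheResponse(fresh, for: prompt)
    return fresh
}
```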
Other ways to boost performance include:
- Shortening prompts: Use concise prompts like "Summarize US economy: inflation, unemployment, GDP, trends" instead of lengthy ones.
- Streaming partial responses for real-time updates:
```swift
func streamResponse(prompt: String) async throws -> AsyncStream<String> {
    AsyncStream { continuation in
        Task {
            do {
                // makeStreamingRequest is assumed to return an AsyncSequence of text chunks.
                for try await chunk in try await makeStreamingRequest(prompt: prompt) {
                    continuation.yield(chunk)
                }
            } catch {
                // Stop the stream on failure instead of leaving consumers waiting.
            }
            continuation.finish()
        }
    }
}
```
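Consuming the stream might look like this; it's a sketch, so push the accumulated text into whatever view state your app uses:

```swift
// Sketch of consuming the stream inside an async context.
func displayStreamedText(prompt: String) async throws {
    var displayedText = ""
    let stream = try await streamResponse(prompt: prompt)
    for await chunk in stream {
        displayedText += chunk
        // Update your SwiftUI @State property or UILabel with displayedText here.
    }
}
```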
Privacy Protection
Starting May 1, 2024, Apple requires explicit privacy declarations for API usage:
```
// PrivacyInfo.xcprivacy (shown as JSON for brevity; the file itself is a property list)
{
  "NSPrivacyAccessedAPITypes": [
    {
      "NSPrivacyAccessedAPIType": "NSPrivacyAccessedAPICategoryUserDefaults",
      "NSPrivacyAccessedAPITypeReasons": ["CA92.1"]
    }
  ]
}
```
To safeguard user data, consider:
- Storing data locally (e.g., NanoGPT's on-device storage).
- Enforcing App Transport Security (ATS).
- Using Secure Transport API for encryption.
- Employing DeviceCheck and App Attest APIs to prevent fraud.
These practices work alongside the authentication methods previously outlined.
Rate Limit Management
Handle API load effectively with rate-limiting algorithms:
Algorithm | Use Case | Implementation |
---|---|---|
Token Bucket | Handles bursts | Gradually refills tokens |
Sliding Window | Adapts to traffic | Uses rolling time windows |
Fixed Window | Simple patterns | Resets at set intervals |
Here’s an example of a basic rate limiter:
```swift
final class RateLimiter {
    private let requestsPerMinute: Int
    private var requestTimestamps: [Date] = []

    init(requestsPerMinute: Int = 60) {
        self.requestsPerMinute = requestsPerMinute
    }

    func canMakeRequest() -> Bool {
        let now = Date()
        // Keep only the timestamps from the last 60 seconds.
        requestTimestamps = requestTimestamps.filter { now.timeIntervalSince($0) < 60 }
        guard requestTimestamps.count < requestsPerMinute else {
            return false
        }
        requestTimestamps.append(now)
        return true
    }
}
```
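Used together with the API client from earlier, the limiter can gate each request. This is a sketch using the initializer shown above; the limit value and helper name are illustrative:

```swift
// Illustrative gate: refuse the call locally before the server has to.
let limiter = RateLimiter(requestsPerMinute: 20)

func generateIfAllowed(api: TextGenerationAPI, prompt: String) async throws -> GeneratedText {
    guard limiter.canMakeRequest() else {
        throw APIError.rateLimitExceeded
    }
    return try await api.generateText(prompt: prompt)
}
```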
"API rate limiting is, in a nutshell, limiting access for people (and bots) to access the API based on the rules/policies set by the API's operator or owner." - DataDome
To fine-tune limits, monitor factors like peak usage times, request patterns, server load, and error rates.
API Comparison
Let's break down how these APIs stack up in terms of integration, features, performance, and cost.
Feature Comparison
Each text generation API offers its own set of strengths and trade-offs. Here's a quick look:
Feature | OpenAI GPT-4 | Google Gemini 1.5 Pro | NanoGPT |
---|---|---|---|
Context Window | 128K tokens | 2M tokens | Varies by model |
Max Output | 16.4K tokens | 8,192 tokens | Model-dependent |
Input Cost | $2.50/M tokens | $3.50/M tokens | Pay-as-you-go ($0.10 minimum) |
Output Cost | $10.00/M tokens | $10.50/M tokens | Model-dependent |
Integration | GPTSwift wrapper | REST API | REST API |
Privacy | Standard | Standard | Local data storage |
Multimodal Support | Text-focused | Voice, Video | Text and image generation |
Integration Options
For developers, GPTSwift makes integrating OpenAI services into Apple platforms a breeze. Supported on iOS 15+, macOS 12+, watchOS 8+, and tvOS 15+, it enables straightforward calls like this:
```swift
let answer = try await chatGPT.ask("What is the answer to life, the universe and everything in it?")
```
Performance Considerations
Performance varies significantly between these APIs. OpenAI GPT-4 processes 77.4 tokens per second, while Google Gemini 1.5 Pro supports a massive 2M-token context. NanoGPT's speed depends on the specific model being used.
Privacy and Security
Privacy is where NanoGPT takes a unique approach. Unlike others, it doesn't store prompts or conversations and keeps all data local:
"We store no prompts and conversations. Data is stored on your device. NanoGPT is committed to protecting your privacy and data sovereignty" .
Cost Efficiency
Pricing is another key factor to consider:
- OpenAI recently dropped GPT-4 pricing to $0.01 per 1K prompt tokens for 128K context models.
- Google Gemini offers free usage up to certain limits.
- NanoGPT's pay-as-you-go model starts at just $0.10, without requiring subscriptions.
Choosing the Right API
- OpenAI GPT-4: Ideal for advanced text generation and large context windows.
- Google Gemini 1.5 Pro: Best for multimodal tasks like voice and video.
- NanoGPT: Perfect if privacy and flexible pricing are top priorities.
Common Problems and Solutions
When integrating text generation APIs into Swift applications, developers often face a range of technical hurdles. Here's a breakdown of common issues and practical ways to tackle them.
Network Issues
Connectivity problems can disrupt communication with text generation APIs. A common error is `NSURLErrorNetworkConnectionLost` (error code -1005), which happens when the connection drops during data transfer.
To troubleshoot:
- Set the `Content-Type` header correctly.
- Use tools like Charles Proxy to inspect API responses.
- Log error details, including HTTP status codes.
- Test with different URLs or restart the simulator.
Here's a Swift example to help identify network issues:
```swift
let task = URLSession.shared.dataTask(with: request) { data, response, error in
    if let error = error {
        print("Error: \(error.localizedDescription)")
    }
    if let httpResponse = response as? HTTPURLResponse {
        print("Status code: \(httpResponse.statusCode)")
    }
}
task.resume()   // The task never starts without this call.
```
If the network setup seems fine but issues persist, double-check your authentication details.
Authentication Errors
Authentication problems often stem from incorrect headers or casing. A notable example comes from a Swift developer working with Google Cloud APIs:
"It also seems that casing matters.
x-goog-api-key
works,X-Goog-Api-Key
doesn't -> invalid header key." - eaigner, GitHub Issue #234
To avoid errors, use precise casing and header names. Here's how to set up authentication properly:
```swift
var request = URLRequest(url: url)
request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "authorization")
request.setValue("application/json", forHTTPHeaderField: "content-type")
```
Once authentication is sorted, it's important to monitor response times for smooth app performance.
Response Time Issues
Slow response times can result from model processing delays or suboptimal configurations. For instance:
- GPT-3.5 processes about 3× faster than GPT-4.
- Azure GPT APIs tend to outperform OpenAI's GPT-3.5 models by around 20%.
To reduce latency, consider these strategies:
- Prompt caching: Store frequently used prompts to avoid redundant processing.
- Streaming responses: Display API output in real-time for a better user experience.
- Parallel requests: Handle independent tasks simultaneously for efficiency (see the sketch after this list).
- Monitor performance: Track key metrics like average response time and failure rates.
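For the parallel-requests point, a minimal sketch with async let, reusing the TextGenerationAPI type from earlier (the prompts are illustrative):

```swift
// Sketch: run two independent prompts concurrently with async let.
func generateSummaries(api: TextGenerationAPI) async throws -> (GeneratedText, GeneratedText) {
    async let economy = api.generateText(prompt: "Summarize US economy: inflation, unemployment, GDP")
    async let markets = api.generateText(prompt: "Summarize this week's market movements")
    return try await (economy, markets)
}
```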
These steps can help ensure your app runs smoothly while leveraging the full potential of text generation APIs.
Future Developments
Addressing current challenges paves the way for exciting advancements in text generation APIs.
New API Features
Apple Intelligence is blending generative models with personal context to deliver tools like real-time rewriting, automated proofreading, and content summarization.
Retrieval-Augmented Generation (RAG) is another game-changer. It boosts text generation by integrating language models with external knowledge sources. Here's a quick comparison:
Feature | Traditional LLM | RAG-Enhanced LLM |
---|---|---|
Knowledge Base | Fixed at training | Dynamically updated |
Response Accuracy | Limited to training data | Improved with external sources |
Information Freshness | Static | Real-time updates possible |
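To make the pattern concrete, here is a highly simplified sketch of the RAG flow; `searchKnowledgeBase` is a hypothetical retrieval step (for example, a vector search over your own documents):

```swift
// Highly simplified RAG flow; searchKnowledgeBase is a hypothetical retrieval step.
func generateWithRAG(api: TextGenerationAPI, question: String) async throws -> GeneratedText {
    // 1. Retrieve relevant context from an external knowledge source.
    let context = try await searchKnowledgeBase(query: question)

    // 2. Augment the prompt with that context before calling the model.
    let augmentedPrompt = """
    Context:
    \(context)

    Question: \(question)
    """
    return try await api.generateText(prompt: augmentedPrompt)
}
```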
Swift API Updates
The swift-transformers package now includes native transformer support, tokenization, Hugging Face Hub integration, and Core ML abstraction. Plus, ChatGPT features are built directly into iOS 18, iPadOS 18, and macOS Sequoia.
Agentic AI advancements, like Adept AI's ACT-1, hint at APIs with more advanced interaction capabilities. These updates aim to simplify API integration and expand the potential of Swift apps even further.
Summary
Text generation APIs play a central role in bringing AI capabilities to Swift apps. To integrate these APIs successfully, developers must focus on performance, error handling, and privacy considerations.
Key Components of Implementation:
Swift apps rely on URLSession for handling asynchronous HTTP tasks, ensuring the app remains responsive. The Codable protocol simplifies JSON parsing, providing a secure way to handle API responses. These tools form the foundation of the practices outlined below.
"Asynchronous calls are the backbone of responsive and dynamic apps"
Balancing Performance and Privacy:
Aspect | On-Device Processing | Cloud-Based Processing |
---|---|---|
Performance | Faster for smaller models | Better for complex tasks |
Privacy | Stronger data protection | Requires data transmission |
Resource Usage | Limited by device hardware | Scalable infrastructure |
Implementation | Direct Core ML integration | API endpoint interactions |
Best Practices for API Integration:
- Handle errors and rate limits effectively to ensure smooth functionality.
- For privacy compliance, declare data collection practices in privacy manifests, especially when dealing with sensitive user information.
Tips for Optimization:
Leveraging GPU acceleration can greatly enhance performance. A hybrid approach - combining on-device capabilities with cloud-based processing - often strikes the right balance between speed, privacy, and functionality.
As frameworks like Core ML continue to evolve, developers have more tools than ever to create powerful, AI-driven Swift apps. By following these guidelines, you can build applications that are not only efficient but also prioritize user data protection.