Explore Audio Models

Discover AI audio generation models for speech and music

google

Gemini 2.5 Flash Preview TTS

Text to Speech

Google Gemini native TTS. Supports single-speaker and multi-speaker output, controlled via the prompt.

openai

GPT-4o Mini TTS

Text to Speech

Ultra-low-cost text-to-speech model with support for voice instructions

Kokoro 82M

Text to Speech

High-quality multilingual text-to-speech model

fal

ElevenLabs Turbo V2.5

Text to Speech

High-quality speech with the lowest latency, ideal for real-time applications. Supports 32 languages while maintaining natural voice quality.

openai

OpenAI TTS

Text to Speech

Standard-quality text-to-speech model with low latency

ElevenLabs Music v1

Music

Eleven Music is a studio-grade text-to-music model. Generate music from natural-language prompts in any style, suited to game soundtracks, podcast backgrounds, and marketing reels. Control genre, style, and structure, with vocals or as an instrumental. Outputs MP3 audio from 10 to 300 seconds long, with selectable sample rate and bitrate.

Wavespeed ACE-Step

Music

ACE-Step composes complete songs from text descriptions using Wavespeed’s music foundation model. Guide genre, mood, and structure with style tags and optional custom lyrics. Generates up to 4 minutes of multi-track audio with vocals.

MiniMax Music 02

Music

MiniMax Music 02 is a compact MoE music generator (230B params, 10B active) tuned for speedy, cost-effective song creation. Provide a creative prompt plus optional formatted lyrics to render polished, full-length tracks with configurable bitrate and sample rate.

MiniMax Speech 02 HD

Text to Speech

High-definition text-to-speech with natural pronunciation and multiple voices.

MiniMax Speech 2.6 HD

Text to Speech

Fast, natural-sounding TTS with under 250 ms latency, voice cloning, multilingual support across 40+ languages, and text normalization for expressive speech.

MiniMax Speech 2.6 Turbo

Text to Speech

High-definition Text-to-Speech with natural pronunciation and crisp articulation. Supports multiple built-in voices and custom cloned voices, adjustable speed, volume, and pitch, and coverage of 40+ languages for professional audio creation.

google

Gemini 2.5 Pro Preview TTS

Text to Speech

Higher-quality Gemini TTS with controllable style and tone.

fal

ElevenLabs v3

Text to Speech

High-quality text-to-speech with enhanced controls and natural voices.

openai

OpenAI TTS HD

Text to Speech

High-definition text-to-speech model with superior quality

openai

GPT-4o Mini TTS (2025-03-20)

Text to Speech

Original release snapshot of GPT-4o Mini TTS with voice instructions support

openai

GPT-4o Mini TTS (2025-12-15)

Text to Speech

Latest snapshot of GPT-4o Mini TTS with voice instructions support
