Explore Video Models
Discover AI video generation models for stunning animations
Kling Video O1 Standard
Kuaishou's unified multi-modal video model (Standard tier) optimized for cost efficiency. Automatically routes based on your inputs: text-only for text-to-video, image for image-to-video, or reference images/video for reference-based generation.
Wan 2.6
Alibaba WanXiang 2.6 - cinematic text-to-video, image-to-video, and reference-to-video generation with multi-shot storytelling support. 720p/1080p, 5-15s clips.
Veo 3.1 Extend
Extend Veo 3.1 videos by 7 seconds per call with smooth motion, preserved style, and strong scene coherence. Input must be Veo 3.1 generated. Supports up to 20 extensions for max 148 seconds total. 16:9 or 9:16 aspect ratio, 720p or 1080p.
Veo 3.1 Fast Extend
Fast video extension for Veo 3.1 clips. Adds 7 seconds per call with optimized speed for quick iteration. Input must be Veo 3.1 generated. Supports up to 20 extensions for max 148 seconds total. 16:9 or 9:16, 720p or 1080p.
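The 148-second cap quoted for both extend models is consistent with an 8-second starting clip plus 20 extensions of 7 seconds each. A minimal sketch of that arithmetic, assuming an 8-second base clip (the longest Veo 3.1 duration listed on this page); the function and constant names are illustrative and not part of any API:

```python
EXTENSION_SECONDS = 7   # seconds added per extend call (from the listing)
MAX_EXTENSIONS = 20     # maximum number of extend calls (from the listing)


def total_duration(base_seconds: int, extensions: int) -> int:
    """Return the clip length in seconds after a number of extend calls."""
    if not 0 <= extensions <= MAX_EXTENSIONS:
        raise ValueError(f"extensions must be between 0 and {MAX_EXTENSIONS}")
    return base_seconds + extensions * EXTENSION_SECONDS


print(total_duration(8, 20))  # 8 + 20 * 7 = 148
```

A shorter starting clip (4 or 6 seconds) would top out below 148 seconds under the same 20-call limit.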
SeedVR2 Video
Upscale videos with SeedVR2 for crisp details, reduced artifacts, and strong frame-to-frame consistency. Supports 720p, 1080p, 2K, and 4K output for clips up to 10 minutes.
Kling V2 Avatar (Standard)
Turns a single portrait and one audio track into a realistic talking avatar with accurate lip sync, expressive facial motion, and consistent identity. Optional prompt can guide mood or energy.
Kling V2 Avatar (Pro)
Creates social-ready talking avatars from one portrait and your audio with sharper detail, stable motion, and strong identity consistency. Optional prompt to nudge camera feel, expression, or mood.
LatentSync
State-of-the-art audio-to-video lip synchronization using latent diffusion. Upload a talking-head video (480p+) and target audio to generate perfectly synchronized lip movements while preserving identity, pose, and background.
Pixverse v5.5 Effects
Pixverse v5.5 Effects via Wavespeed. Apply cinematic effect presets (Kiss Me AI, Venom, Holy Wings, Muscle Surge, etc.) to portraits. 360p-1080p resolution, 5/8/10s durations.
Pixverse v5.5
Pixverse v5.5 via Wavespeed. Text-to-video, image-to-video, and transition mode (first+last frame morphing). 360p-1080p resolution, 5/8/10s durations. Supports prompt optimization and audio generation.
Kling 2.6 Pro
Latest Kling model with text-to-video and image-to-video capabilities. Supports native audio/voiceover generation. 5s and 10s durations with multiple aspect ratios.
Kling Video O1
Kuaishou's unified multi-modal video model with MVL technology. Automatically routes based on your inputs: text-only for text-to-video, image for image-to-video, video for editing, or both for reference-based generation.
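The routing rule described above can be sketched as a small decision function. This only mirrors the wording of the listing; the real routing logic is not documented here, and `pick_mode` and its parameters are hypothetical names:

```python
from typing import Optional


def pick_mode(prompt: str,
              image: Optional[str] = None,
              video: Optional[str] = None) -> str:
    """Guess the generation mode from which inputs are supplied."""
    if image and video:
        return "reference-based generation"  # both image and video given
    if video:
        return "video editing"               # video only
    if image:
        return "image-to-video"              # image only
    return "text-to-video"                   # prompt only


print(pick_mode("a fox running through snow"))
print(pick_mode("make it night", video="clip.mp4"))
```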
Hunyuan Video 1.5
Hunyuan Video 1.5 generates 5–10 second clips from text or an input image. Supports 480p/720p in landscape or portrait. Runs are routed automatically to text-to-video or image-to-video based on whether an image is attached.
Seedance Upscaler
Enhance existing videos with ByteDance’s Seedance super-resolution for cleaner 1080p, 2K, or 4K output with strong temporal consistency. Supports clips up to 10 minutes.
MiniMax Hailuo 2.3 Standard
MiniMax Hailuo 2.3 Standard produces 1080p cinematic clips with realistic motion and smooth scene transitions from text prompts or reference images. Choose 6 or 10 second runs for quick drafts or extended shots.
MiniMax Hailuo 2.3 Pro
MiniMax Hailuo 2.3 Pro delivers cinematic 1080p 5-second clips with advanced physics, prompt fidelity, and character consistency. Supports both pure text prompts and image-conditioned motion.
Avatar Omni Human 1.5
Animate a portrait using ByteDance's cognitive avatar model. Upload a static image and an audio track for expressive lip-sync and emotion.
Video upscaler
Upscale existing videos with FlashVSR for sharper details, reduced compression artifacts, and improved temporal stability. Supports 720p, 1080p, 2K, and 4K output for clips up to 10 minutes.
Wan 2.2 Spicy Extend
Extend existing videos by 5 or 8 seconds with smooth motion and vivid color. Supports 480p or 720p output and preserves temporal coherence.
Seedance V1 Pro Fast
Fast Seedance Pro variant. Generates cinematic clips from text or a single reference image with durations up to 12 seconds.
Kling 2.5 Turbo Standard
Image-to-video only version of Kling 2.5 Turbo delivering cinematic motion at 720p with 5s and 10s clips. Optimized for fast, affordable production with 25% lower pricing than Kling 2.1 Standard.
Lightricks LTX-2 Fast
High-speed LTX-2 pipeline tuned for rapid iterations. Convert text or a single image into cinematic clips with synchronized audio in seconds.
Lightricks LTX-2 Pro
Flagship LTX-2 stack for production-ready motion. Generates synchronized audio and rich camera moves from text prompts or reference images.
Veo 3.1
Text-to-video and image-to-video with optional end frame control. Native audio generation, cinematic realism, and consistent subjects. Supports 4/6/8 seconds at 720p or 1080p.
Kling 2.5 Turbo Pro
Text-to-video and image-to-video with ultra-smooth motion, cinematic visuals, and precise prompt control. Supports 5s and 10s outputs and multiple aspect ratios.
Sora 2
Create highly realistic videos. Toggle Pro for higher quality. Supports text-to-video and image-to-video (image becomes the first frame). Choose orientation and seconds; size must match orientation.
Wan 2.5
Text or image to video with one‑pass audio/voiceover sync. Supports optional custom audio input. 480p/720p/1080p, 5s or 10s.
VEED Fabric 1.0
Turn a static image + an audio track into a natural talking video. Supports 480p/720p output. Audio is required.
Wan 2.2 Plus
Advanced text-to-video and image-to-video model. Supports 480p, 720p, and 1080p output with a fixed 5-second duration.
Wan 2.2 (V2V)
Edit an existing video using a natural language prompt. Examples: "Change the color of the clothes to yellow", "Change the woman to a handsome boy". Supports 480p or 720p output, up to 120 seconds.
Bytedance Waver 1.0
Image-to-video. Requires an input image. Supports 5s duration only.
Wan 2.2 S2V
Generate a video from a static image and an audio track with realistic lip/body sync.
Pixverse v5
Pixverse v5 video generation model via Runware. Supports text-to-video and image-to-video with customizable styles, effects, camera movements, and sound effects. Resolutions from 360p to 1080p and durations of 5 or 8 seconds.
Wan 2.2 5b
The Wan 2.2 5b model produces up to 5 seconds of 720p video at 24 FPS with fluid motion and strong prompt understanding.
Wan 2.2 Turbo
Wan 2.2 Turbo is a faster, simplified version with fewer settings for both text-to-video and image-to-video generation. Variable pricing based on resolution.
Wan 2.2 14b
Wan 2.2 14b is the full version of the Wan 2.2 video model. It generates videos with high visual quality and motion diversity from text prompts or images.
Vidu Q1
Vidu Q1 video generation model. Creates high-quality 5-second videos. Supports both text-to-video and image-to-video generation with customizable visual styles (general or anime), movement amplitude control, and multiple aspect ratios.
Pixverse v4.5
Pixverse v4.5 video generation model. Creates high-quality videos with customizable styles, effects, camera movements, and sound effects. Supports multiple resolutions from 360p to 1080p with durations of 5 or 8 seconds.
Veo 3 Fast
Google's fast Veo 3 model. Creates high-quality 8-second videos from text or images. Supports audio generation ($1.60 with audio, $1.20 without). Supports 16:9 and 9:16 aspect ratios. For best results, prompts should be descriptive and clear.
Midjourney Video
Midjourney Image-to-Video generator creates 4 videos of 5 seconds each from an input image with adjustable motion intensity.
MiniMax Hailuo 02 Pro
MiniMax Hailuo-02 Pro video generation model with 1080p resolution. Creates high-quality videos from text prompts or images. Supports both text-to-video and image-to-video generation.
MiniMax Hailuo 02
MiniMax Hailuo-02 Advanced video generation model with 768p resolution. Creates high-quality videos from text prompts or images. Supports both text-to-video and image-to-video generation.
Seedance 1.0 Pro
ByteDance's Seedance video generation model. Supports both text-to-video and image-to-video generation with 5 and 10 second durations. Supports multiple aspect ratios including 16:9, 1:1, 3:4, 9:16, 21:9.
Seedance 1.0 Lite
ByteDance's Seedance Lite video generation model. Fast and efficient model that supports both text-to-video and image-to-video generation with 5 and 10 second durations. Supports multiple aspect ratios including 16:9, 1:1, 4:3, and 9:21.
Veo 3
Google's latest Veo 3 model. Creates high-quality 8-second videos from text or images. Supports audio generation ($4.80 with audio, $3.20 without). For best results, prompts should be descriptive and clear: include the subject, context, action, style, camera motion, composition, and ambiance details. Note: This model has strict content filters and may reject NSFW or sensitive content. We issue refunds for content policy rejections.
Kling 2.1 Master
Kling 2.1 Master text-to-video and image-to-video model. Premium quality video generation from text or images powered by Runware.
Kling 2.1 Standard
Kling 2.1 Standard image-to-video model. Creates high-quality videos from images with text prompts. Requires an input image.
Kling 2.1 Pro
Kling 2.1 Pro image-to-video model. Higher quality video generation from images with text prompts. Requires an input image.
Kling 2.0 Master
Kling 2.0 Master text-to-video and image-to-video model. Produces blockbuster-quality scenes, lifelike characters, and smooth motion from text prompts or images.
Hunyuan Video
Hunyuan Video text-to-video generator creates high-quality 720p videos with customizable resolution, aspect ratio, and frame count. Features pro mode for enhanced quality.
Longstories Movie
Generate AI mini-movies from 1 to 10 minutes. Bring any story to life with animated video and voice.
Longstories Pixel Art
Generate pixel‑art mini‑movies from 1 to 10 minutes. A second universe with a stylized pixel art aesthetic.
Wan 2.5 Extend
Extend short clips to 3–10 seconds while preserving motion, lighting, and audio sync. Upload a base video, optional custom audio, and a prompt. Supports 480p/720p/1080p output.
Kling Lipsync T2V
Text-to-video lipsync. Upload a 2–10 second focal video and provide a script. Kling synthesizes a matching voiceover and animates lips/micro-expressions to the dialogue.
Kling Lipsync A2V
Audio-to-video lipsync. Upload a 2–10 second focal video and a clean vocal track (≤5 MB). Kling aligns mouth shapes and facial muscles to the audio while preserving the original footage.
Lucy Edit Dev
Ultra-fast text-guided video editor. Upload a source clip and describe the desired edit, and Lucy will transform the content while preserving timing, camera motion, and overall composition. Supports clips up to 120 seconds.
Lucy Edit Pro
High-fidelity text-guided video editor focused on cinematic quality, temporal stability, and 720p-ready output. Upload a source clip and a clear prompt to restyle outfits, props, and scenes while preserving motion, timing, and composition. Supports clips up to 120 seconds at 480p or 720p.
Runway Gen-4 Aleph (V2V)
Video-to-video generation. Requires input video + prompt. Optional reference image. Only the first 5 seconds of the input are used by the model; we charge per-second based on your input video length.
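Because billing follows the input length while the model only reads the first 5 seconds, the billed and used durations can differ. A rough sketch of that gap, assuming a hypothetical per-second rate (the listing does not state one) and no rounding of the input length:

```python
from typing import Tuple

MODEL_WINDOW_SECONDS = 5.0  # only the first 5 seconds are used (from the listing)


def aleph_usage(input_seconds: float, rate_per_second: float) -> Tuple[float, float]:
    """Return (seconds the model uses, charge for the full input length)."""
    if input_seconds <= 0:
        raise ValueError("input_seconds must be positive")
    used = min(input_seconds, MODEL_WINDOW_SECONDS)
    charge = input_seconds * rate_per_second  # billed on the full input length
    return used, charge


# Hypothetical rate of 0.25 per second, 12-second input clip.
used, charge = aleph_usage(12, 0.25)
print(used, charge)
```

Trimming the input to 5 seconds before upload would avoid paying for footage the model ignores.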
Wan 2.2 Animate
Animate or replace a character: provide a reference image and a driver video. Best results when composition, camera, and pose are consistent; keep image and video aspect ratios identical. Supports 480p/720p, up to 120s.
Veo 2
Google's Veo 2 text-to-video model. Creates 720p videos (5-8 seconds) from detailed text descriptions; image-to-video is also supported. Supports both 16:9 (landscape) and 9:16 (portrait) aspect ratios. For best results, prompts should be descriptive and clear: include the subject, the context, the action, and the style. Note: As a Google model it is heavily content-filtered, and we see more than 50% of prompts refused. We issue refunds for content policy rejections.
Veo 2 Image-to-Video
Google's Veo 2 image-to-video model. Animates an input image using detailed prompts, producing 720p videos.
MiniMax T2V-01
Hailuo T2V-01-Live text-to-video API: Transform static art into dynamic masterpieces. Creates vivid 6-second videos with enhanced smoothness and motion. Optimized for stability and subtle expression, supporting a wide range of artistic styles. Has a built-in prompt optimizer that makes it easy to use.
Kling 1.5 Pro
Kling 1.5 Pro text-to-video model. Creates high-quality videos from detailed text descriptions. Supports 16:9 (landscape), 9:16 (portrait), and 1:1 (square) aspect ratios with durations of 5 or 10 seconds.
Video Face Swap
High-quality video face swapping. Swap faces between an image and a video with realistic results. Supports videos up to 4 minutes. Pricing varies by video resolution: 480p (half price), 720p/1080p (standard), 4K+ (1.5x price).
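The resolution-based pricing can be read as a multiplier on a standard price. A minimal sketch under that reading; the standard price itself is not given in the listing, so `standard_price` is a placeholder, and resolutions the listing does not name are rejected rather than guessed:

```python
# Multipliers taken from the listing; "4k" stands in for the "4K+" tier.
PRICE_MULTIPLIERS = {"480p": 0.5, "720p": 1.0, "1080p": 1.0, "4k": 1.5}


def face_swap_price(standard_price: float, resolution: str) -> float:
    """Scale a (hypothetical) standard price by the resolution tier."""
    key = resolution.strip().lower()
    if key not in PRICE_MULTIPLIERS:
        raise ValueError(f"no multiplier listed for resolution {resolution!r}")
    return standard_price * PRICE_MULTIPLIERS[key]


print(face_swap_price(2.0, "480p"))
print(face_swap_price(2.0, "4K"))
```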
You've seen all 65 models
