Fast Veo 3.1 generation for text-to-video, image-to-video, and reference-to-video with up to 3 total images. Supports optional end-frame control for image-to-video, native audio, and durations of 4, 6, or 8 seconds at 720p or 1080p.
Added May 14, 2026
Approx. Price
$0.640 per video
Model Type
Both (text-to-video and image-to-video)
Preview Examples
2
Generation controls available for this model.
Resolution/Aspect Options
2
Default Duration
8 seconds
3 duration options
Tunable Settings
4
Aspect Ratio (T2V)
Default
16:9
Options (2)
Landscape (16:9), Portrait (9:16)
Applies to text-to-video
Duration
Default
8 seconds
Options (3)
4 seconds, 6 seconds, 8 seconds
4/6/8 seconds
Generate Audio
Default
Yes
Enable native audio generation
Resolution
Default
720p
Options (2)
720p, 1080p
Output resolution
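The four tunable settings above can be captured as a small validated configuration object. The following is a minimal sketch, not an official client: the class name `Veo31FastSettings` and its field names are hypothetical and chosen only for illustration. Only the allowed values and defaults come from the listing above.

```python
from dataclasses import dataclass

# Allowed values transcribed from the settings listed above.
ASPECT_RATIOS = ("16:9", "9:16")   # listed as applying to text-to-video
DURATIONS = (4, 6, 8)              # seconds
RESOLUTIONS = ("720p", "1080p")


@dataclass(frozen=True)
class Veo31FastSettings:
    """Hypothetical settings container; names are illustrative, not a real API."""

    aspect_ratio: str = "16:9"     # default per the listing
    duration: int = 8              # default per the listing, in seconds
    generate_audio: bool = True    # default "Yes" per the listing
    resolution: str = "720p"       # default per the listing

    def __post_init__(self) -> None:
        # Reject any value that is not among the listed options.
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")
        if self.duration not in DURATIONS:
            raise ValueError(f"duration must be one of {DURATIONS}")
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"resolution must be one of {RESOLUTIONS}")
        if not isinstance(self.generate_audio, bool):
            raise ValueError("generate_audio must be True or False")


if __name__ == "__main__":
    # Defaults match the listing; overrides are checked on construction.
    print(Veo31FastSettings())
    print(Veo31FastSettings(aspect_ratio="9:16", duration=4, resolution="1080p"))
```

Validating on construction means an unsupported value, such as a 5-second duration, fails before any request is sent to whichever service hosts the model.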
Human preference benchmarks sourced from Artificial Analysis.
Text to Video
#19 / 81
ELO
1209.0
Appearances
5,042
95% CI
-9/9
Image to Video
#15 / 74
ELO
1268.0
Appearances
5,246
95% CI
-10/10
Release Date 2026-01 · Matched as Veo 3.1 Fast
Artificial Analysis API
Time-lapse of the city transitioning through the day. Shadows shift across buildings, traffic flows in streaks of light, clouds race across the sky. The scene shifts from bright midday to warm golden hour. Smooth hyperlapse feel.
A ballerina performing an elegant pirouette on a moonlit outdoor stage. Rose petals swirl around her in slow motion. Camera orbits slowly around her as she spins.