GLM-4.6V-Flash (9B) is a lightweight model optimized for local deployment and low-latency applications. It scales the context window to 128K tokens and achieves state-of-the-art performance in visual understanding among models of similar scale.
Added Dec 8, 2025
Context Window: 128.0K tokens
Max Output: 24.0K tokens
Input Price (Auto): $0.10 per 1M tokens
Output Price (Auto): $0.40 per 1M tokens
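To make the listed rates concrete, here is a minimal sketch of a per-request cost estimate. It assumes billing is strictly linear in token count at the auto-routing prices shown above; the helper name `estimate_cost` and the constants are hypothetical and not part of any provider API.

```python
from decimal import Decimal

# Listed auto-routing prices, in US dollars per 1M tokens (taken from this page).
INPUT_PRICE_PER_M = Decimal("0.10")
OUTPUT_PRICE_PER_M = Decimal("0.40")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimated USD cost of a single request.

    Assumes purely linear per-token billing with no minimum charge,
    caching discount, or separate pricing for image inputs.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # 100,000 input tokens and 2,000 output tokens.
    print(estimate_cost(100_000, 2_000))
```

Under these assumptions, a request with 100,000 input tokens and 2,000 output tokens comes to about $0.0108. `Decimal` is used instead of floats so that small per-request amounts add up without rounding drift.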
Performance metrics and benchmarks
Sourced from Artificial Analysis.
No benchmark data is available yet for this model.
Auto routing is available for this model. Explicit provider selection is not available.