Private AI
GLM-4.6V-Flash (9B) is a lightweight model optimized for local deployment and low-latency applications. It scales the context window to 128k tokens and achieves state-of-the-art (SoTA) performance in visual understanding among models of similar scale.
Added Dec 8, 2025
Context Window: 128.0K tokens
Max Output: 24.0K tokens
Input Price (Auto): $0.10/1M tokens
Output Price (Auto): $0.40/1M tokens
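As a rough illustration of how the listed prices translate into a per-request cost, here is a minimal sketch. The function name `estimate_cost` and the constants are hypothetical and not part of any provider API. It assumes that "128.0K" and "24.0K" mean 128,000 and 24,000 tokens, and that input and output tokens together must fit in the context window; neither assumption is confirmed by the listing.

```python
# Prices taken from the listing (USD per 1M tokens, auto routing).
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.40

# Assumption: "128.0K" and "24.0K" are read as 128,000 and 24,000 tokens.
CONTEXT_WINDOW = 128_000
MAX_OUTPUT = 24_000


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if output_tokens > MAX_OUTPUT:
        raise ValueError("output exceeds the listed max output")
    # Assumption: input and output share the same context window.
    if input_tokens + output_tokens > CONTEXT_WINDOW:
        raise ValueError("request exceeds the listed context window")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # 100,000 input tokens and 20,000 output tokens.
    print(f"${estimate_cost(100_000, 20_000):.4f}")
```

At these rates output tokens cost four times as much as input tokens, so long generations dominate the bill even when the prompt is large.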
Capabilities
Performance metrics and benchmarks
No benchmark data is available yet for this model.
Auto routing is available for this model. Explicit provider selection is not available.