Vercel

AI Gateway Launches GLM 5.2 Fast Model with 2x Higher Throughput


Executive Summary

The GLM 5.2 Fast language model is now available on AI Gateway, served via a new high-performance infrastructure called Wafer. Vercel claims this implementation delivers double the throughput of other serverless providers serving the same model. The offering is aimed at developers who need higher speed and reliability for sustained text generation tasks.

Key Takeaways

* New Model Availability: GLM 5.2 Fast is now accessible through the AI Gateway platform.

* Performance Claims: The "Wafer" infrastructure is claimed to provide 2x higher throughput than other serverless providers serving the same model, with benchmarked speeds of 170+ tokens/second for small contexts and 200+ tokens/second for large contexts.

* Platform Features: AI Gateway provides a unified API for various models, built-in usage and cost tracking, configurable retries, and Zero Data Retention support.

* Pricing: The service reflects provider pricing with no markup or platform fees on inference.
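
To put the throughput figures above in concrete terms, the following is a minimal sketch that converts tokens/second into wall-clock generation time. It assumes the quoted rates are steady per-request decode rates and ignores time to first token; the 2,000-token response length is a hypothetical value chosen for illustration, and the comparison rate is derived only from reading "2x higher throughput" as "a comparison provider runs at half the rate".

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to stream output_tokens at a steady decode rate.

    Ignores time to first token and any network overhead.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# Figures quoted in the summary, treated as lower bounds ("170+" and "200+").
SMALL_CONTEXT_TPS = 170.0
LARGE_CONTEXT_TPS = 200.0

# Hypothetical response length, chosen for illustration only.
OUTPUT_TOKENS = 2_000

small = generation_seconds(OUTPUT_TOKENS, SMALL_CONTEXT_TPS)
large = generation_seconds(OUTPUT_TOKENS, LARGE_CONTEXT_TPS)

# Assumption: "2x higher throughput" means a comparison provider
# runs at half the small-context rate.
baseline = generation_seconds(OUTPUT_TOKENS, SMALL_CONTEXT_TPS / 2)

print(f"small context: {small:.1f} s")        # small context: 11.8 s
print(f"large context: {large:.1f} s")        # large context: 10.0 s
print(f"half-rate comparison: {baseline:.1f} s")  # half-rate comparison: 23.5 s
```

Under these assumptions, a 2,000-token response takes roughly 10 to 12 seconds at the quoted rates, versus about 23.5 seconds at half the rate. Real latency also depends on prompt length, queueing, and time to first token, none of which the summary reports.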

Strategic Importance

This launch positions AI Gateway as a competitive, high-performance option for developers, using optimized infrastructure to differentiate itself on speed and cost-effectiveness for popular AI models.
