Nvidia's Nemotron 3 Ultra Model Now Available on Vercel AI Gateway

Jun 5, 2026, 7:19 PM UTC

Executive Summary

Vercel has integrated Nvidia's Nemotron 3 Ultra, a powerful open Mixture-of-Experts (MoE) reasoning model, into its AI Gateway platform. This gives developers access to a model specifically designed for complex, multi-turn AI agent workflows, featuring a one-million-token context window. The integration allows developers to leverage Nemotron 3 Ultra through Vercel's unified API, which simplifies usage, cost tracking, and performance optimization.

Key Takeaways

* Product: Nvidia's Nemotron 3 Ultra model.

* Primary Function: An open MoE reasoning model built to orchestrate long-running, multi-step agent workflows, including planning, tool use, and error recovery.

* Key Features & Capabilities:

* 1 million token context window.

* High throughput of up to 350 tokens per second.

* Up to 30% lower cost on agentic tasks.

* Target Audience: Developers building applications with advanced AI agents.

* Availability: Available now on the Vercel AI Gateway.

* Pricing & Tiers: Vercel's AI Gateway charges no markup or platform fees on inference, reflecting the direct provider pricing.

Strategic Importance

This integration strengthens the Vercel AI Gateway by adding a specialized, high-performance model for agentic AI, making the platform more compelling for developers building sophisticated AI applications.

Original article

Nvidia's Nemotron 3 Ultra Model Now Available on Vercel AI Gateway

Executive Summary

Key Takeaways

Strategic Importance

Related Posts