The NVIDIA Vera CPU marks an inflection point for the company: it is NVIDIA's first data center CPU built entirely on its own custom Arm microarchitecture, departing from the licensed Arm Neoverse IP used in the predecessor Grace CPU. Featuring 88 custom Olympus cores based on the Armv9.2 architecture, Vera is engineered specifically for the trillion-parameter AI models and agentic AI workloads that define enterprise computing in 2026. Memory bandwidth reaches 1.2 terabytes per second, double that of the Grace Hopper platform, which supports the heavy data movement required by large language model inference and training pipelines.

The processor ships as the CPU component of DGX Vera Rubin systems, NVIDIA's next-generation AI supercomputer platform, which is expected to reach customer deployments in the second half of 2026. A single rack of DGX Vera Rubin hardware supports 22,500 or more concurrent CPU environments, letting hyperscalers run AI model serving at densities that previously required multiple facilities. Vera is confirmed to be available through all four major cloud providers (AWS, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure), so enterprise AI teams can access it without investing in on-premises hardware.

The custom Olympus core design lets NVIDIA tune microarchitectural decisions for AI workload patterns, including sparse matrix operations, attention mechanism execution, and the memory access patterns characteristic of transformer architectures, to a degree that licensed IP does not allow. The tight coupling between the Vera CPU and the Rubin GPU over NVLink-C2C creates a coherent memory architecture that avoids the PCIe bottlenecks that constrain heterogeneous computing performance. For organizations building AI inference infrastructure at scale, Vera changes the cost and density calculus of GPU cluster design.
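To put the 1.2 TB/s figure in perspective, the following is a rough back-of-the-envelope sketch, not a benchmark. It estimates a lower bound on the time needed to stream the weights of a trillion-parameter model once through CPU memory. The 8-bit weight size, the function name `stream_time_seconds`, and the "half bandwidth" comparison point (derived only from the "double" claim above) are assumptions made for illustration; real workloads are also shaped by caching, compute, and interconnect effects that this ignores.

```python
# Back-of-the-envelope estimate: time to read every model weight once
# at a given memory bandwidth. Only the 1.2 TB/s and 88-core figures come
# from the article; everything else is an illustrative assumption.

def stream_time_seconds(num_params: float,
                        bytes_per_param: float,
                        bandwidth_bytes_per_s: float) -> float:
    """Lower bound on one full pass over the weights, ignoring caches and overheads."""
    return num_params * bytes_per_param / bandwidth_bytes_per_s

VERA_BANDWIDTH = 1.2e12                      # bytes/s, figure quoted in the article
HALF_BANDWIDTH = VERA_BANDWIDTH / 2          # implied by "double"; not a measured value
CORES = 88                                   # Olympus core count quoted in the article

PARAMS = 1e12                                # assumption: one trillion parameters
BYTES_PER_PARAM = 1.0                        # assumption: 8-bit weights

t_vera = stream_time_seconds(PARAMS, BYTES_PER_PARAM, VERA_BANDWIDTH)
t_half = stream_time_seconds(PARAMS, BYTES_PER_PARAM, HALF_BANDWIDTH)
per_core_gb_s = VERA_BANDWIDTH / CORES / 1e9  # average share per core, in GB/s

print(f"One pass at 1.2 TB/s:  {t_vera:.2f} s")
print(f"One pass at 0.6 TB/s:  {t_half:.2f} s")
print(f"Average bandwidth per core: {per_core_gb_s:.1f} GB/s")
```

Under these assumptions, doubling bandwidth halves the streaming time for a bandwidth-bound pass, which is the sense in which memory bandwidth matters for large-model pipelines.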