NVIDIA has unveiled its Vera Rubin superchip module, integrating a custom CPU and dual GPUs, poised to significantly advance artificial intelligence and high-performance computing in data centers.
The system was publicly demonstrated for the first time at the GTC 2025 conference in Washington, D.C. It signals NVIDIA's strategic shift toward integrated CPU-GPU modules for data centers, moving beyond discrete graphics cards.
Each Vera Rubin module is designed to deliver approximately 100 petaflops of FP4 performance, combining two “Rubin” GPUs with a customized “Vera” CPU. This substantial processing power is specifically aimed at accelerating AI inference, model training, and high-performance computing workloads.
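As a hedged back-of-the-envelope illustration of what the quoted figure implies, the sketch below splits the module-level throughput across the two GPUs and times a hypothetical workload; the even split and the workload size are assumptions for illustration, not NVIDIA numbers.

```python
# Back-of-the-envelope math on the quoted ~100 petaflops FP4 per module.
# The even per-GPU split and the 1e18-op workload are illustrative assumptions.

MODULE_FP4_FLOPS = 100e15   # ~100 petaflops FP4 per Vera Rubin module (quoted)
GPUS_PER_MODULE = 2

# Assuming throughput divides evenly across the two Rubin GPUs
# (NVIDIA has not broken this out here):
per_gpu_flops = MODULE_FP4_FLOPS / GPUS_PER_MODULE

# Ideal-case time to sustain 1e18 FP4 operations, ignoring memory,
# communication, and utilization losses:
workload_ops = 1e18
ideal_seconds = workload_ops / MODULE_FP4_FLOPS

print(f"per GPU: ~{per_gpu_flops / 1e15:.0f} PFLOPS FP4")
print(f"1e18 ops in ~{ideal_seconds:.0f} s (ideal)")
```

Real workloads would land well below this ceiling once memory bandwidth and interconnect limits come into play.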
Mass production for the Vera Rubin system, initially designated as “NVL144” for rack-scale deployments, is scheduled to begin in mid-2026. Adoption in data centers is targeted for 2027.
The “Vera” CPU features 88 customized Arm cores, designed to work in tandem with the GPUs. Each “Rubin” GPU incorporates a multi-chiplet architecture and includes eight stacks of HBM4 memory, alongside SOCAMM LPDDR modules, all integrated onto a dense server board.
Within the module, the CPU and GPUs are interconnected via NVLink-C2C, with bandwidth estimated at 1.8 terabytes per second. For rack-scale integration and inter-module communication, the board uses dedicated NVLink backplane connectors rather than traditional PCIe x16 slots.
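To put the quoted NVLink-C2C figure in context, a quick comparison against a conventional PCIe 5.0 x16 slot (commonly cited at roughly 64 GB/s per direction, an outside reference rather than a number from this article) suggests why the board skips PCIe-class links:

```python
# Rough bandwidth comparison: NVLink-C2C (as estimated in the article)
# vs. a PCIe 5.0 x16 slot (~64 GB/s per direction, commonly cited figure).

NVLINK_C2C_GBPS = 1800   # ~1.8 TB/s NVLink-C2C
PCIE5_X16_GBPS = 64      # ~64 GB/s, PCIe 5.0 x16, per direction

ratio = NVLINK_C2C_GBPS / PCIE5_X16_GBPS
print(f"NVLink-C2C is roughly {ratio:.0f}x a PCIe 5.0 x16 link")
```

Even allowing for how the figures are measured, the gap is more than an order of magnitude, which is consistent with the move to dedicated NVLink backplane connectors.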
This integrated architecture is expected to be particularly beneficial for cloud computing providers, researchers, and system integrators. It represents a potential redefinition of top-tier computing solutions for demanding AI and HPC environments.
