Google is aggressively migrating tens of thousands of internal production workloads to its custom Arm-based Axion processors, aiming for substantial performance and energy efficiency improvements that could reshape its data center operations and influence the cloud computing industry.
The technology giant has already transitioned approximately 30,000 production packages to the Arm architecture. It plans to convert the remaining internal workloads to run on both its Axion silicon and x86 processors.
Key services such as YouTube, Gmail, and BigQuery are among those already operating across both x86 and Google’s Axion CPUs. This strategic shift is driven by a focus on optimizing cost and energy efficiency.
According to Google's internal data, Axion delivers up to 65% better price-performance and roughly 60% greater energy efficiency than comparable x86 instances.
The operational objective is to enable Borg, Google’s proprietary cluster manager, to efficiently allocate internal workloads between Arm and x86 servers. This allocation will be based on real-time cost and performance metrics.
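To make the placement idea concrete, here is a minimal sketch of choosing an architecture by cost per unit of work. It is not Borg's actual logic, which the article does not describe. The names `ArchQuote`, `cost_per_unit`, and `pick_arch` and all of the prices and throughput figures are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchQuote:
    """Hypothetical snapshot of one server pool's current cost and speed."""
    arch: str             # e.g. "arm" or "x86"
    cost_per_hour: float  # illustrative price of one machine-hour
    throughput: float     # work units completed per machine-hour


def cost_per_unit(quote: ArchQuote) -> float:
    """Cost of completing one unit of work on this pool."""
    return quote.cost_per_hour / quote.throughput


def pick_arch(quotes, supported):
    """Return the cheapest architecture (per unit of work) the workload is built for."""
    candidates = [q for q in quotes if q.arch in supported]
    if not candidates:
        raise ValueError("workload has no build for any offered architecture")
    return min(candidates, key=cost_per_unit).arch


# Made-up numbers: the Arm pool is cheaper per hour and slightly faster.
quotes = [ArchQuote("x86", 1.00, 100.0), ArchQuote("arm", 0.80, 110.0)]

print(pick_arch(quotes, {"x86", "arm"}))  # multi-arch workload can take the cheaper pool
print(pick_arch(quotes, {"x86"}))         # x86-only workload has no choice
```

The second call shows why the migration matters for scheduling: a workload built for only one architecture cannot be moved to the cheaper pool, whatever the metrics say.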
To manage the unprecedented scale of this migration, Google is employing a combination of existing automation tools and a new artificial intelligence (AI) solution called CogniPort.
CogniPort steps in when a library, binary, or test fails to compile or run successfully. Working from the resulting compilation errors and test failures, its “Blueprint” editing mode automatically generates migration commits for repetitive but non-trivial adjustments.
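The general shape of such a tool, a loop that checks, patches, and checks again, can be sketched as follows. This is not CogniPort's implementation, which the article does not detail. The function `migrate_package`, its parameters, and the toy checker and fixer (including the `cpu_relax` replacement name) are hypothetical.

```python
from typing import Callable, Optional, Tuple


def migrate_package(
    run_checks: Callable[[str], Optional[str]],
    propose_fix: Callable[[str, str], Optional[str]],
    source: str,
    max_attempts: int = 3,
) -> Tuple[str, bool]:
    """Try to make `source` pass its checks; return (source, fixed).

    `run_checks` returns an error message, or None when build and tests pass.
    `propose_fix` returns patched source, or None when it has no suggestion.
    fixed is False when the package should be handed to a human.
    """
    for _ in range(max_attempts):
        error = run_checks(source)
        if error is None:
            return source, True
        patched = propose_fix(source, error)
        if patched is None:
            break  # the fixer gave up; stop retrying
        source = patched
    # Out of attempts or suggestions: report whether the last state passes.
    return source, run_checks(source) is None


# Toy stand-ins for a real build system and a real fixer.
def toy_checks(src: str) -> Optional[str]:
    return "x86-only intrinsic" if "_mm_pause" in src else None


def toy_fix(src: str, error: str) -> Optional[str]:
    if "intrinsic" in error:
        return src.replace("_mm_pause()", "cpu_relax()")
    return None


print(migrate_package(toy_checks, toy_fix, "spin: _mm_pause()"))
```

Capping the number of attempts and returning an explicit flag keeps the human in the loop: anything the fixer cannot resolve is surfaced rather than retried indefinitely.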
Google says CogniPort fixes about 30% of cases automatically under specific conditions. It is particularly effective at correcting tests, platform-specific conditionals, and data representation issues.
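For readers unfamiliar with these two categories, the sketch below shows one generic example of each. Neither is taken from Google's codebase. The alias table and the `normalized_arch` helper are illustrative assumptions. The byte example mirrors a well-known C difference: plain `char` is signed under the usual x86-64 Linux ABI and unsigned under the AArch64 Linux ABI.

```python
import platform
import struct

# Platform-specific conditional: code that compares against "x86_64" alone
# breaks on Arm hosts. Mapping known spellings to one name avoids that.
ARCH_ALIASES = {
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def normalized_arch(machine=None):
    """Map a machine string (default: this host's) to a canonical name."""
    raw = (machine or platform.machine()).lower()
    try:
        return ARCH_ALIASES[raw]
    except KeyError:
        raise ValueError(f"unsupported architecture: {raw!r}") from None


# Data representation: the same byte read as signed versus unsigned.
# Stating the signedness explicitly removes the dependence on the platform.
raw_byte = b"\xff"
as_signed = struct.unpack("b", raw_byte)[0]
as_unsigned = struct.unpack("B", raw_byte)[0]

print(normalized_arch("AMD64"), normalized_arch("aarch64"))
print(as_signed, as_unsigned)
```

Fixes of this kind are small and mechanical, but they recur across many packages, which is the sort of repetitive work the article says automation handles well.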
While this success rate is insufficient for the AI to complete the entire migration alone, it significantly accelerates manual work. It also reduces repetitive human interventions across thousands of software packages.
The company emphasizes that AI serves as a complement to human oversight in critical systems, not a replacement. Final production stabilization still requires additional human validation.
Google has documented its extensive migration process in a preprint titled “Instruction Set Migration at Warehouse Scale” and in a subsequent company publication. The study authors are Parthasarathy Ranganathan and Wolff Dobson.
Early migrations for critical tasks like F1, Spanner, and Bigtable involved traditional methods, including weekly meetings, dedicated engineers, and manual adjustments to compilation and deployment systems.
Engineers discovered that the primary challenges included fixing tests over-tuned for x86 servers, updating legacy build and release systems, and resolving production deployment issues without destabilizing critical services.
Although engineers initially expected architectural differences to cause significant problems, Google encountered fewer surprises than anticipated. Modern compilers and tools such as sanitizers had already addressed many historical incompatibilities.
With approximately 70,000 additional packages still requiring porting, the comprehensive migration project has broad implications. Completing this shift could reduce the reliance on x86 processors in Google’s data centers in the coming years.
Google’s public documentation of its experience provides a valuable case study for technical teams and industry analysts tracking architecture transitions in cloud computing. This shift could affect operational costs and the competitive landscape for chip manufacturers.
