2 ExaFLOPS, Tens of Thousands of CPUs and GPUs

admin June 22, 2023


Argonne National Laboratory and Intel announced Thursday that they have installed all 10,624 blades of the Aurora supercomputer, a machine that was announced in 2015 and has had a particularly eventful history. The system uses tens of thousands of Xeon Max “Sapphire Rapids” CPUs with on-package HBM2E memory and Data Center GPU Max “Ponte Vecchio” compute GPUs, and it promises peak theoretical compute performance in excess of 2 FP64 ExaFLOPS. The system is expected to go live later this year.

“Aurora is the first deployment of Intel’s Max Series GPUs, the largest Xeon Max CPU-based system, and the world’s largest GPU cluster,” said Jeff McVeigh, Intel corporate vice president and general manager of the Super Compute Group.

The Aurora supercomputer is also impressive by the numbers. The machine features 21,248 general-purpose processors with over 1.1 million cores for workloads that require traditional CPU horsepower, and 63,744 compute GPUs for AI and HPC workloads. In terms of memory, Aurora comes with 1.36 PB of on-package HBM2E and 19.9 PB of DDR5 memory used by the CPUs, as well as 8.16 PB of HBM2E carried by the Ponte Vecchio compute GPUs.
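The totals above can be broken down per blade and per device with a little arithmetic. The sketch below uses only figures quoted in this article; the one assumption is that the petabyte figures use decimal units (1 PB = 1,000,000 GB).

```python
# Figures quoted in the article.
BLADES = 10_624
CPUS = 21_248
GPUS = 63_744
CPU_HBM_PB = 1.36  # on-package HBM2E attached to the CPUs
GPU_HBM_PB = 8.16  # HBM2E carried by the compute GPUs

# Assumption: decimal units, 1 PB = 1,000,000 GB.
GB_PER_PB = 1_000_000

# Devices per blade (the totals divide evenly by the blade count).
cpus_per_blade = CPUS // BLADES
gpus_per_blade = GPUS // BLADES

# Average HBM2E capacity per device, in GB.
hbm_gb_per_cpu = CPU_HBM_PB * GB_PER_PB / CPUS
hbm_gb_per_gpu = GPU_HBM_PB * GB_PER_PB / GPUS

print(f"CPUs per blade: {cpus_per_blade}")
print(f"GPUs per blade: {gpus_per_blade}")
print(f"HBM2E per CPU:  {hbm_gb_per_cpu:.0f} GB")
print(f"HBM2E per GPU:  {hbm_gb_per_gpu:.0f} GB")
```

This works out to two CPUs and six GPUs per blade, with roughly 64 GB of HBM2E per CPU and 128 GB per GPU.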

The Aurora machine uses 166 racks with 64 blades each. Spread out in eight rows, it occupies the space of two basketball courts. That does not include Aurora’s storage subsystem, however. Aurora employs 1,024 all-flash storage nodes that provide 220 PB of storage capacity and 31 TB/s of total bandwidth. At this time, Argonne National Laboratory has not published official power consumption figures for Aurora or its storage subsystem.
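As a quick consistency check, the 10,624 blades from the announcement can be divided across the 166 racks, and the storage bandwidth can be averaged per storage node. This is a rough sketch built on the article's figures; it assumes decimal units (1 TB/s = 1,000 GB/s) and an even spread of bandwidth across nodes, which the article does not state.

```python
# Figures quoted in the article.
BLADES = 10_624
RACKS = 166
STORAGE_NODES = 1_024
TOTAL_BANDWIDTH_TB_S = 31

# Blades per rack, plus any remainder if the split were uneven.
blades_per_rack, leftover_blades = divmod(BLADES, RACKS)

# Assumption: bandwidth is spread evenly, and 1 TB/s = 1,000 GB/s.
bandwidth_per_node_gb_s = TOTAL_BANDWIDTH_TB_S * 1_000 / STORAGE_NODES

print(f"Blades per rack: {blades_per_rack} (remainder {leftover_blades})")
print(f"Average bandwidth per storage node: {bandwidth_per_node_gb_s:.2f} GB/s")
```

The blade count divides evenly across the racks, and each storage node would need to sustain about 30 GB/s on average to reach the quoted total.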

The supercomputer will be used for a wide range of workloads, from fusion simulations to weather prediction and from aerodynamics to medical research. It uses HPE’s Shasta supercomputer architecture with Slingshot interconnects. Meanwhile, before the system passes ANL’s acceptance tests, it will be used to train large-scale generative AI models for science.

“As we work towards acceptance testing, we plan to use Aurora to train large-scale open-source generative AI models for science,” said Rick Stevens, associate laboratory director at Argonne National Laboratory. “With over 60,000 Intel Max GPUs, a blazingly fast I/O system, and an all-solid-state mass storage system, Aurora is the perfect environment to train these models.”

Even with all of Aurora’s blades installed, the supercomputer must still undergo and pass a series of acceptance tests, which is a common procedure for supercomputers. Once it passes them and goes live later this year, the system is projected to deliver a theoretical performance of over 2 ExaFLOPS (two billion billion floating point operations per second). That performance is expected to secure the top spot on the Top500 list.
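For a sense of scale, the SI prefix exa means 10^18, so the headline figure can be converted as in the short sketch below. The comparison value is the roughly 180 petaFLOPS originally planned for Aurora, as described later in this article.

```python
# SI prefixes as powers of ten.
PETA = 10**15
EXA = 10**18

# Projected peak performance from the article: just over 2 ExaFLOPS.
peak_flops = 2 * EXA

# Original 2015 target from the article: around 180 petaFLOPS.
original_target_flops = 180 * PETA

peak_in_petaflops = peak_flops // PETA
speedup_vs_original = peak_flops / original_target_flops

print(f"2 ExaFLOPS = {peak_flops:,} FLOPS")
print(f"           = {peak_in_petaflops:,} petaFLOPS")
print(f"Ratio to the original target: {speedup_vs_original:.1f}x")
```

In other words, 2 ExaFLOPS equals 2,000 petaFLOPS, a little over eleven times the original target.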

The installation of the Aurora supercomputer marks several milestones. It is the industry’s first supercomputer with performance exceeding 2 ExaFLOPS and the first Intel-based ExaFLOPS-class machine. Finally, it brings an end to the Aurora saga, which began eight years ago and has had its fair share of setbacks along the way.

When it was announced in 2015, Aurora was expected to feature Intel’s Xeon Phi coprocessors and to deliver around 180 petaFLOPS in 2018. However, Intel decided to abandon Xeon Phi in favor of compute GPUs and, as a result, had to renegotiate its contract with Argonne National Laboratory to provide an ExaFLOPS system by 2021.

System delivery was then delayed further by the complexity of Ponte Vecchio’s compute tiles, by delays to Intel’s 7nm (now known as Intel 4) production node, and by the need to redesign the tiles for TSMC’s N5 (5nm-class) process technology. Intel finally introduced its Data Center GPU Max products late last year and has now shipped over 60,000 of these compute GPUs to ANL.