Intel Talks Falcon Shores Flub, Merges Habana Gaudi Roadmap
Intel originally planned to put both GPU and CPU cores inside its Falcon Shores chips, creating the company’s first “XPU” for high-performance computing. But its surprise announcement a few months ago that it would shift to a GPU-only design and delay the chip to 2025 shocked industry insiders. This leaves Intel without a direct answer to AMD’s Instinct MI300 and Nvidia’s Grace Hopper processors, both of which use a CPU+GPU design.
Today, Intel revealed some of the somewhat dubious rationale behind its decision to scale back its plans for Falcon Shores, which will now be a GPU-only successor to its Data Center GPU Max series. Intel has also sketched out some early details of the new GPU-only Falcon Shores design. More on this below.
Intel has also issued a new HPC and AI roadmap that doesn’t show a successor to the Gaudi3 processor. Instead, the Gaudi and GPU lines will merge into the Falcon Shores GPU, which will take over as Intel’s premier HPC and AI chip. Intel says it plans to integrate the Habana and AXG (GPU) product roadmaps, but details of the integration are scarce.
Gaudi’s computational architecture is so different from that of standard GPUs that it is unlikely to be fully integrated into a GPU. As such, Intel may instead incorporate smaller parts of the Gaudi design into its GPUs, such as network interfaces and other IP blocks. We expect to hear more details today from Jeff McVeigh, vice president and general manager of Intel’s Accelerated Computing Group. Recall that Intel paid $2 billion for Habana Labs and scrapped the products from its $350 million Nervana acquisition in order to focus on the Gaudi chips.
Intel has shared some basic details about its new Falcon Shores design. The design continues to focus on HPC and AI workloads, but it now employs only GPU cores. The HPC-focused Falcon Shores XPUs were designed for supercomputing applications, with plans to combine both CPU and GPU technology in a single mix-and-match chip package, but the architecture will now debut in 2025 as a GPU-only design.
Falcon Shores will employ standard Ethernet switching, much like Intel’s AI-focused Gaudi architecture, along with an unspecified amount of HBM3 memory and “I/O designed to scale,” which presumably means that Falcon Shores will come with different memory capacity options. Intel does say that Falcon Shores will feature up to 288 GB of HBM3 and a total memory throughput of 9.8 TB/s. As expected, smaller data types such as FP8 and BF16 are supported.
The rough sketch of the device also includes oneAPI, Intel’s cross-architecture programming model, which enables broad compatibility with other CPUs and architectures. Intel also cites CXL support as a key differentiator, which ties into its rationale for taking CPU cores out of the Falcon Shores package.
Intel says its original goal of mixing CPU and GPU cores in the same Falcon Shores package was premature. As shown in the slide above, Intel has seen the optimal mix of CPU and GPU cores change over time as workloads have evolved, and it expects the optimal CPU/GPU ratio to change even more rapidly given the explosion of fundamental change that generative AI and LLMs are bringing to the HPC space. As such, Intel says it doesn’t feel like the right time to lock customers into specific CPU-to-GPU ratios.
However, as shown above, the original plans for Falcon Shores included the ability to adjust the CPU/GPU ratio by dropping different numbers of CPU tiles or GPU tiles into a four-tile design, which would allow Intel to compose an optimal blend for various workloads. Moreover, state-of-the-art supercomputers are, by design, purpose-built for the task at hand, and tuning software to their architecture is a routine part of running a supercomputer. These factors suggest that the CPU/GPU ratio isn’t the only reason Intel removed CPU cores from the design.
Intel also points out that letting customers pair a variety of CPUs with its GPU designs (which would logically include AMD’s x86 chips and Nvidia’s Arm chips) means they aren’t tied to choosing Intel’s x86 cores. But again, Intel’s original plans also included GPU-only and CPU-only versions of Falcon Shores, so this rationale doesn’t seem compelling either.
Intel says it will leverage the CXL interface to enable a composable architecture that allows customers to combine different CPU/GPU ratios in custom designs. However, the CXL interface offers only 64 GB/s of throughput between elements, while custom CPU+GPU designs such as Nvidia’s Grace Hopper can offer up to 1 TB/s of memory throughput between the CPU and GPU. This gives the integrated designs both performance and efficiency advantages over CXL implementations for many types of workloads, especially memory bandwidth-intensive AI workloads, not to mention the inherently lower-latency connections between elements and other benefits such as increased performance density.
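To put the gap between those two figures in perspective, here is a rough back-of-the-envelope sketch in Python. It takes the 64 GB/s and 1 TB/s numbers quoted above at face value as peak rates and ignores latency, protocol overhead, and contention. The payload is a hypothetical choice made for illustration (the 288 GB of HBM3 that Intel quotes for Falcon Shores), and the function name is our own.

```python
# Idealized comparison of the two link speeds quoted in the article.
# Peak figures only: real transfers also pay latency and protocol overhead.
CXL_GB_PER_S = 64.0          # CXL throughput figure cited in the article
COHERENT_GB_PER_S = 1000.0   # "up to 1 TB/s", treated here as 1000 GB/s


def transfer_seconds(payload_gb: float, bandwidth_gb_per_s: float) -> float:
    """Return payload / bandwidth, the best-case time to move the payload."""
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return payload_gb / bandwidth_gb_per_s


# Hypothetical payload: the full 288 GB HBM3 capacity quoted for Falcon Shores.
payload_gb = 288.0
cxl_time = transfer_seconds(payload_gb, CXL_GB_PER_S)
coherent_time = transfer_seconds(payload_gb, COHERENT_GB_PER_S)

print(f"CXL link:    {cxl_time:.2f} s")
print(f"1 TB/s link: {coherent_time:.3f} s")
print(f"Ratio:       {cxl_time / coherent_time:.1f}x")
```

Under these simplifying assumptions, moving 288 GB takes about 4.5 seconds over the CXL link versus roughly 0.29 seconds over the 1 TB/s link, a gap of about 15.6 times. Workloads that shuttle little data between CPU and GPU would see far less of a difference.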
In other words, Intel’s idea of a more configurable architecture is fine for some workloads, but for certain applications it likely won’t be able to compete with AMD’s MI300 or Nvidia’s Grace in terms of power, cost, or performance.
Likewise, Intel’s decision to slow the pace of its GPU releases is not ideal, because the company will need to use older products to compete with much more advanced HPC architectures arriving in 2023, like Nvidia’s Grace Superchips and AMD’s upcoming exascale APU, the Instinct MI300.
Despite Intel’s rationale for changing its goals, we can’t help but think that by redefining Falcon Shores as a GPU-only product, the company is missing an architectural inflection point, and that this will put it at a competitive disadvantage in the future.