Nvidia CEO Comments on Grace CPU Delay, Teases Sampling Silicon

Nvidia teased its upcoming Arm-based Grace CPUs at GTC 2023, but the company’s announcement that the systems will ship later this year represents a delay from its original launch schedule, which targeted the first half of 2023. CEO Jensen Huang addressed the delay in the Q&A session at today’s press conference, which we cover below. Nvidia also showed off its Grace silicon for the first time and made many new performance claims during its GTC keynote, including that its Arm-based Grace chips are up to 1.3x faster than x86 competitors at 60% of the power, which we also cover.
We asked Jensen Huang about the delay in shipping the Grace CPU and Grace Hopper Superchip systems to the end market. After playfully pushing back about the expected release date (it was definitely 1H23 and is now 2H23), he replied:
“First, Grace and Grace Hopper are both in production and silicon is flying around the fab. Systems are being created and we made a lot of announcements. We’re building it.” Huang also said Nvidia had only been working on the chip for two years. This is a relatively short period of time given the typical multi-year design cycle of modern chips.
The definition of a “shipping” system can be ambiguous these days. AMD’s and Intel’s first systems often ship to hyperscalers for deployment long before the chips hit the general market. Nvidia, however, says it is sampling chips to customers but has not said that Grace has been deployed in production yet. So, by the company’s own projections, the chips are late, but to be fair, it’s not uncommon for chip launches from companies like Intel to slip year after year. That highlights the difficulty of launching a new chip, even when building on the dominant x86 architecture with its well-established hardware and software platform.
In contrast, Nvidia’s Grace and Grace+Hopper chips are a radical rethink of many fundamental aspects of chip design, with innovative new chip-to-chip interconnects. Nvidia’s use of the Arm instruction set also means far more software optimization and porting work, and it requires the company to build an entirely new platform.
Huang alluded to some of that in his extended response, stating: “We showed some numbers during the keynote. We didn’t want to burden the keynote with a lot of numbers, but there will be a lot of numbers available for people to enjoy. It was wonderful.”
And Nvidia’s claims are impressive. For example, in the album above, you can see the Grace Hopper chip, which Nvidia showed live at GTC for the first time (technical details here).
During the presentation, Huang claimed that the chip is 1.2x faster on the memory-intensive HiBench Apache Spark benchmark and 1.3x faster on a Google microservices communication benchmark than the “average” next-gen x86 server chip, while drawing 60% of the power.
Nvidia claims this will allow data centers to deploy 1.7x more Grace servers in power-limited installations, each offering 25% more throughput. The company also claims Grace is 1.9x faster on computational fluid dynamics (CFD) workloads.
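As a rough sanity check on how those figures relate to one another, the short Python sketch below works through the arithmetic. It uses only numbers quoted in this article. The function names are hypothetical, and the two modeling assumptions (that server count scales inversely with per-server power, and that aggregate throughput is simply server count times per-server speedup) are ours, not Nvidia’s stated methodology.

```python
# Back-of-the-envelope check using only figures quoted in this article.
# Assumptions (ours, not Nvidia's): server count scales inversely with
# per-server power, and aggregate throughput is count x per-server speedup.


def servers_per_power_budget(relative_power: float) -> float:
    """Return how many times more servers fit in a fixed power budget
    when each server draws `relative_power` of the baseline."""
    if relative_power <= 0:
        raise ValueError("relative_power must be positive")
    return 1.0 / relative_power


def aggregate_throughput(server_ratio: float, per_server_speedup: float) -> float:
    """Return total throughput relative to the baseline installation."""
    return server_ratio * per_server_speedup


if __name__ == "__main__":
    # 60% of the power per server, as claimed in the keynote.
    density = servers_per_power_budget(0.60)
    # 1.7x more servers, each with 25% more throughput, as claimed.
    total = aggregate_throughput(1.7, 1.25)
    print(f"Servers per fixed power budget: {density:.2f}x")
    print(f"Aggregate throughput: {total:.1f}x")
```

Under those assumptions, 60% of the power works out to roughly 1.67x the server count, close to the quoted 1.7x, and the combined throughput comes to a little over 2x for the same power budget.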
However, while Grace chips are very capable and efficient for some workloads, Nvidia isn’t targeting the general-purpose server market. Instead, the company is tailoring the chip to specific use cases such as AI and cloud workloads, pairing strong single-threaded and memory performance with excellent power efficiency.
“[..] Almost all data centers are now power constrained and we designed Grace to deliver exceptional performance in power constrained environments. [..] and you have to be very low power and incredibly efficient. As a result, Grace systems are approximately 2x more power/performance efficient than the best CPUs of the latest generation.”
“And it’s designed for different design points, so it makes a lot of sense,” Huang continued. “It’s very important for cloud service providers, and it’s very important for data centers with limited power supply.”
With chips like the recently reviewed AMD EPYC Genoa and Intel’s Sapphire Rapids climbing to 400 and 350 watts, respectively, energy efficiency is more important than ever. Those chips draw a great deal of power at standard settings and call for unusual new air cooling solutions, with liquid cooling for the highest-performance options.
In contrast, Grace’s lower power consumption makes the chip easier to cool. As first revealed at GTC, Nvidia’s 144-core Grace package is 5″ x 8″ and fits in a surprisingly compact passive cooling module. These modules still rely on air cooling, but two of them can be air cooled in a single slim 1U chassis.
Nvidia also showcased its Grace Hopper Superchip silicon for the first time at GTC. The Superchip combines a Grace CPU and a Hopper GPU in the same package. As you can see in the album above, two of these modules can also fit into one server chassis. Learn more about this design here.
A key aspect of this design is CPU + GPU memory coherency, enabled by a high-bandwidth, low-latency chip-to-chip connection that is 7x faster than the PCIe interface. This allows the CPU and GPU to share information held in memory with speed and efficiency not possible with previous designs.
Huang explained that this approach is ideal for AI, databases, recommendation systems, and large language models (LLMs), all of which are in great demand. By giving the GPU direct access to the CPU’s memory, it streamlines data transfers and improves performance.
Nvidia’s Grace chips may be running a little behind schedule, but the company has a number of partners, with Asus, Atos, Gigabyte, HPE, Supermicro, QCT, Wistron, and ZT all preparing OEM systems for the market. These systems are currently expected to arrive in the second half of this year, though Nvidia has not revealed whether they will come early or late in that window.