Tachyum said on Tuesday that it had submitted a bid to the Department of Energy to build a 20 exaflops supercomputer in 2025. The machine is based on the company’s next-generation Prodigy processor, which features a unique microarchitecture that can be used for different kinds of workloads.
The U.S. Department of Energy wants the system to be installed at Oak Ridge National Laboratory (ORNL), where it will complement the lab’s Frontier system, which went online earlier this year.
Tachyum has not disclosed which hardware it proposed to the DoE. It says only that it has the 128-core Prodigy processor now and a more capable Prodigy 2 processor on its roadmap, so it may have the latter on hand by 2025 for future systems.
Tachyum’s Prodigy is a universal homogeneous processor with up to 128 custom 64-bit VLIW cores, each with two 1024-bit vector units and one 4096-bit matrix unit. Tachyum expected its flagship Prodigy T16128-AIX processor to provide up to 90 FP64 teraflops for HPC and up to 12 “AI petaflops” for AI inference and training (probably when running INT8 or FP8 workloads). The Prodigy draws up to 950W and uses liquid cooling.
However, Tachyum has sued intellectual property provider Cadence over the Prodigy’s underperformance, so the chip’s current performance expectations are unknown.
In theory, Tachyum could use over 11,000 Prodigy processors to power an exaflops system, but the power consumption of such a machine would be enormous. Arguably, the Prodigy 2 is more likely than the original Prodigy to meet the needs of next-gen exascale systems.
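The processor count above follows from the figures quoted earlier. The short sketch below reproduces that arithmetic and extends it to the 20 exaflops target. It assumes every chip sustains its claimed peak of 90 FP64 teraflops and draws its maximum 950W, and it ignores memory, interconnect, storage, and cooling overhead, so the results are rough lower bounds rather than a system design. The function names are illustrative only.

```python
# Back-of-envelope sizing using only the figures quoted in the article.
# Assumption: every chip runs at its claimed peak, which real workloads rarely reach.
PEAK_FP64_TFLOPS_PER_CHIP = 90    # "up to 90 FP64 teraflops"
MAX_POWER_W_PER_CHIP = 950        # "draws up to 950W"
TFLOPS_PER_EXAFLOPS = 1_000_000   # 1 exaflops = 10^6 teraflops


def chips_needed(target_exaflops: int) -> int:
    """Smallest number of chips whose combined peak reaches the target."""
    target_tflops = target_exaflops * TFLOPS_PER_EXAFLOPS
    # Ceiling division on integers avoids floating-point rounding.
    return -(-target_tflops // PEAK_FP64_TFLOPS_PER_CHIP)


def cpu_power_megawatts(chips: int) -> float:
    """Processor power only, excluding every other system component."""
    return chips * MAX_POWER_W_PER_CHIP / 1_000_000


for exaflops in (1, 20):
    n = chips_needed(exaflops)
    print(f"{exaflops} EF: {n:,} chips, ~{cpu_power_megawatts(n):.1f} MW")
# prints:
# 1 EF: 11,112 chips, ~10.6 MW
# 20 EF: 222,223 chips, ~211.1 MW
```

Under these assumptions, a 20 exaflops machine built from first-generation Prodigy chips would need more than 200 MW for the processors alone, which illustrates why a more capable successor looks like the more plausible basis for such a bid.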
The United States currently has one exaflops-class supercomputer: Oak Ridge National Laboratory’s (ORNL) 1.1 exaflops Frontier system, based on AMD’s 64-core EPYC CPUs and Instinct MI250X compute GPUs. Two more exascale systems are being built in the US: the 2 exaflops Aurora machine, powered by Intel’s 4th Generation Xeon Scalable processors and Xe-HPC compute GPUs (aka Ponte Vecchio), and the 2 exaflops El Capitan supercomputer, based on AMD’s Zen 4 architecture EPYC CPUs and Instinct MI300 GPUs.
One of the interesting things about the DoE’s supercomputing plan is that it wants to upgrade its high-performance computing capabilities every 12-24 months, not every 4-5 years. As a result, the DoE may be more eager than it is now to adopt exotic architectures like Tachyum’s Prodigy.
“We also want to move away from monolithic acquisitions to a model that allows faster upgrade cycles for deployed systems, and we are considering approaches that allow for faster innovation in hardware and software,” a DoE document reads. “One possible strategy is to increase reuse of existing infrastructure so that upgrades are modular. The goal is to rethink system architecture and an efficient acquisition process (e.g., every 12-24 months rather than every 4-5 years). Understanding the trade-offs of these approaches is essential to this RFI, and we seek responses that include the advantages and/or disadvantages of this modular upgrade approach.”
One of the advantages Tachyum’s Prodigy has over traditional CPUs and GPUs is that it is tuned for both AI and HPC workloads. As such, Prodigy can be used for AI workloads when its HPC capabilities are not in use, and vice versa. The DoE may or may not use Tachyum hardware for any of its upcoming supercomputers, but the company clearly hopes to win a deal.