432-Core Chiplet-Based RISC-V Chip Nearly Ready to Blast Into Space
The Occamy processor, which uses a chiplet architecture, packs 432 RISC-V cores and AI accelerators along with 32 GB of HBM2E memory, and has taped out. The chip is backed by the European Space Agency and is being developed by engineers from ETH Zurich and the University of Bologna, reports HPCwire.
The ESA-backed Occamy processor uses two chiplets, each with 216 32-bit RISC-V cores and an unknown number of 64-bit FPUs for matrix computation, and is paired with two 16 GB HBM2E memory packages from Micron. The chiplets are linked through a silicon interposer, and the dual-tile CPU can deliver 0.75 FP64 TFLOPS and 6 FP8 TFLOPS of compute performance.
Neither ESA nor its development partners have revealed the power consumption of the Occamy CPU, but the chip is said to be passively cooled, which suggests a low-power design.
Each Occamy chiplet has 216 RISC-V cores and matrix FPUs, totaling about 1 billion transistors on 73 mm^2 of silicon. The tiles are manufactured by GlobalFoundries on its 14LPP process.
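As a quick sanity check, the per-chiplet figures can be multiplied out to the package-level totals quoted in the headline. The short Python sketch below uses only numbers stated in this article; the variable names are illustrative, and the transistor density is a rough estimate because the transistor count is given only as "about 1 billion."

```python
# Figures quoted in the article; names are illustrative, not official terms.
CHIPLETS = 2
CORES_PER_CHIPLET = 216
HBM_PACKAGES = 2
HBM_GB_PER_PACKAGE = 16
TRANSISTORS_PER_CHIPLET = 1e9  # "about 1 billion", so treat as approximate
CHIPLET_AREA_MM2 = 73

total_cores = CHIPLETS * CORES_PER_CHIPLET
total_hbm_gb = HBM_PACKAGES * HBM_GB_PER_PACKAGE
# Millions of transistors per square millimetre, per chiplet
density_m_per_mm2 = TRANSISTORS_PER_CHIPLET / CHIPLET_AREA_MM2 / 1e6

print(f"Total RISC-V cores: {total_cores}")
print(f"Total HBM2E: {total_hbm_gb} GB")
print(f"Approx. density: {density_m_per_mm2:.1f} M transistors/mm^2")
```

The totals line up with the 432 cores and 32 GB of memory cited for the full package.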
A 73 mm^2 chiplet is not a particularly large die. For comparison, Intel's Alder Lake (with six high-performance cores) has a die size of 163 mm^2. As for performance, Nvidia's A30 GPU with 24 GB of HBM2 memory offers 5.2 FP64/10.3 FP64 Tensor TFLOPS and 330/660 (sparse) INT8 TOPS, well above Occamy's 0.75 FP64 TFLOPS.
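To put those comparisons in proportion, the ratios can be worked out directly from the quoted figures. This is a back-of-the-envelope sketch only: it compares peak numbers for parts built on different process nodes with very different power budgets, so it is not a like-for-like benchmark. Variable names are illustrative.

```python
# Peak figures as quoted in the article (not measured results).
OCCAMY_FP64_TFLOPS = 0.75
A30_FP64_TFLOPS = 5.2
A30_FP64_TENSOR_TFLOPS = 10.3
OCCAMY_CHIPLET_MM2 = 73
ALDER_LAKE_MM2 = 163

fp64_ratio = A30_FP64_TFLOPS / OCCAMY_FP64_TFLOPS
fp64_tensor_ratio = A30_FP64_TENSOR_TFLOPS / OCCAMY_FP64_TFLOPS
area_ratio = ALDER_LAKE_MM2 / OCCAMY_CHIPLET_MM2

print(f"A30 FP64 vs Occamy FP64: {fp64_ratio:.1f}x")
print(f"A30 FP64 Tensor vs Occamy FP64: {fp64_tensor_ratio:.1f}x")
print(f"Alder Lake die vs one Occamy chiplet: {area_ratio:.1f}x")
```

On these peak numbers the A30 has roughly seven times Occamy's FP64 throughput, while the Alder Lake die is a little over twice the area of a single Occamy chiplet.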
On the other hand, one of the advantages of the chiplet design is that ESA and its partners at ETH Zurich and the University of Bologna can add other chiplets to the package to speed up specific workloads if needed.
The Occamy CPU is being developed as part of the EuPilot program and is one of many chips the ESA is considering for spaceflight computing. However, there is no guarantee that this processor will actually fly on board a spacecraft.
Occamy’s design is intended to support high-performance and AI workloads through a bare-metal runtime, though it is not yet clear whether the final runtime will sit at the container or bare-metal level. Occamy processors can also be emulated on FPGAs; the implementation has been tested on two AMD Xilinx Virtex UltraScale+ HBM FPGAs and a Virtex UltraScale+ VCU1525 FPGA.