TSMC’s 3nm Node: No SRAM Scaling Implies More Expensive CPUs and GPUs

With any brand-new manufacturing node, we expect better performance, lower power consumption and higher transistor density. But while logic circuits have scaled well with recent process technologies, SRAM cells have lagged behind and appear to have almost stopped scaling at TSMC’s 3nm-class production nodes. This is a major problem for future CPUs, GPUs and SoCs, which are likely to become more expensive because SRAM cell area is scaling so slowly.
Slow SRAM scaling
When TSMC officially introduced its N3 manufacturing technology earlier this year, it said the new node would offer 1.6x and 1.7x improvements in logic density compared to its N5 (5nm-class) process. What it did not reveal is that the SRAM cells of the new technology barely scale compared to N5, according to WikiChip, which obtained the information from a TSMC paper published at the International Electron Devices Meeting (IEDM).
TSMC’s N3 has an SRAM bitcell size of 0.0199 µm², only about 5% smaller than the 0.021 µm² SRAM bitcell of N5. The revamped N3E is even worse: it comes with a 0.021 µm² SRAM bitcell (roughly equivalent to 31.8 Mib/mm²), which means no scaling at all compared to N5.
Meanwhile, Intel’s Intel 4 (originally called 7nm EUV) reduces the SRAM bitcell size to 0.024 µm² from 0.0312 µm² in Intel 7 (previously called 10nm Enhanced SuperFin), which still leaves it slightly behind TSMC’s HD SRAM density.
Furthermore, WikiChip recalls an Imec presentation that showed SRAM densities of about 60 Mib/mm² at a “beyond 2nm node” using forksheet transistors. Such process technology is years away, and until then chip designers will have to develop processors with the SRAM densities advertised by Intel and TSMC (although Intel 4 is unlikely to be used by anyone other than Intel anyway).
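To make the comparison concrete, the short Python sketch below recomputes the bitcell shrink from the areas quoted above. The dictionary and function names are illustrative only, and the figures are simply the ones given in the text.

```python
# SRAM bitcell areas in square micrometres, as quoted in the text above.
BITCELL_UM2 = {
    "TSMC N5": 0.021,
    "TSMC N3": 0.0199,
    "TSMC N3E": 0.021,
    "Intel 7": 0.0312,
    "Intel 4": 0.024,
}


def shrink_percent(old_um2: float, new_um2: float) -> float:
    """Return how much smaller the new bitcell is, as a percentage of the old one."""
    return (old_um2 - new_um2) / old_um2 * 100.0


if __name__ == "__main__":
    transitions = [
        ("TSMC N5", "TSMC N3"),
        ("TSMC N5", "TSMC N3E"),
        ("Intel 7", "Intel 4"),
    ]
    for old, new in transitions:
        pct = shrink_percent(BITCELL_UM2[old], BITCELL_UM2[new])
        print(f"{old} -> {new}: {pct:.1f}% smaller bitcell")
```

With these inputs, N3 comes out roughly 5% smaller than N5, N3E shows no shrink at all, and Intel 4 is roughly 23% smaller than Intel 7.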
Lots of SRAM in modern chips
Modern CPUs, GPUs, and SoCs use large amounts of SRAM for various caches because they process large amounts of data and fetching that data from memory is very inefficient, especially for artificial intelligence (AI) and machine learning (ML) workloads. But even general-purpose processors, graphics chips, and application processors for smartphones carry huge caches these days. AMD’s Ryzen 9 7950X packs a total of 81MB of cache, while Nvidia’s AD102 uses at least 123MB of SRAM for the various caches that Nvidia has publicly disclosed.
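For a rough sense of scale, the sketch below converts those cache capacities into silicon area at the 31.8 Mib/mm² density quoted earlier for a 0.021 µm² bitcell. It rests on loud assumptions that are not from the article: that 1MB is treated as 1 MiB, and that every cache is built from high-density cells at that single density. Real designs mix cell types and add tags and control logic, so these numbers are best read as a lower-bound estimate, not as measured die areas.

```python
# Density quoted earlier in the article for a 0.021 µm² bitcell (N5 / N3E).
MIB_PER_MM2 = 31.8


def cache_area_mm2(cache_megabytes: float, density_mib_per_mm2: float = MIB_PER_MM2) -> float:
    """Estimate SRAM area in mm².

    Assumption (not from the article): 1 MB is treated as 1 MiB, i.e. 8 Mib,
    and all of the cache is built at the given density.
    """
    mebibits = cache_megabytes * 8
    return mebibits / density_mib_per_mm2


if __name__ == "__main__":
    for name, megabytes in [("Ryzen 9 7950X", 81), ("AD102", 123)]:
        print(f"{name}: {megabytes} MB of cache is at least {cache_area_mm2(megabytes):.1f} mm²")
```

Under these assumptions, 81MB works out to a little over 20 mm² and 123MB to roughly 31 mm² of SRAM.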
Going forward, the need for cache and SRAM will only increase, but with N3 (which is set to be used for only a small number of products) and N3E there is no way to reduce the die area occupied by SRAM and so offset the higher cost of the new node compared to N5. Fundamentally, this means that the die sizes of high-performance processors will increase, and so will their costs. Meanwhile, just like logic cells, SRAM cells are prone to defects. To some degree, chip designers can use N3’s FinFlex innovation (mixing different types of FinFETs within a block to optimize it for performance, power, or area) to alleviate the larger SRAM cells. But at the moment we can only guess what kind of fruit this will bear.
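The effect on die size can be illustrated with a toy model. The sketch below ports a hypothetical die to a node where logic gets denser but SRAM does not. The 70/30 split between logic and SRAM is invented purely for illustration, the 1.6x gain is the lower of the two logic density figures quoted for N3 and is only assumed to apply here, and analog and I/O blocks are ignored.

```python
def port_die(logic_mm2: float, sram_mm2: float,
             logic_density_gain: float, sram_density_gain: float = 1.0) -> dict:
    """Scale logic and SRAM areas separately and report the resulting die.

    A density gain of 1.0 means the block does not shrink at all.
    """
    new_logic = logic_mm2 / logic_density_gain
    new_sram = sram_mm2 / sram_density_gain
    total = new_logic + new_sram
    return {
        "logic_mm2": new_logic,
        "sram_mm2": new_sram,
        "total_mm2": total,
        "sram_share": new_sram / total,
        "effective_shrink": (logic_mm2 + sram_mm2) / total,
    }


if __name__ == "__main__":
    # Hypothetical 100 mm² die: 70 mm² of logic and 30 mm² of SRAM (assumed values).
    result = port_die(logic_mm2=70.0, sram_mm2=30.0, logic_density_gain=1.6)
    print(f"New die area: {result['total_mm2']:.2f} mm²")
    print(f"SRAM share of die: {result['sram_share']:.1%}")
    print(f"Effective shrink: {result['effective_shrink']:.2f}x")
```

In this made-up case the die only shrinks from 100 mm² to about 74 mm², an effective gain of roughly 1.36x instead of 1.6x, and SRAM grows from 30% to about 41% of the die.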
TSMC plans to introduce a density-optimized N3S process technology that promises to shrink the SRAM bitcell size compared to N5, but this is expected to happen around 2024, and it remains to be seen whether it will offer enough logic performance for chips designed by AMD, Apple, Nvidia and Qualcomm.
Mitigations?
One way to mitigate the slowdown in SRAM area scaling from a cost perspective is to adopt a multi-chiplet design and move large caches onto separate dies made on cheaper nodes. This is what AMD does with its 3D V-Cache, albeit for slightly different reasons (for now). Another option is to use alternative memory technologies such as eDRAM or FeRAM for caches, though these have their own peculiarities.
In any case, the slowdown of SRAM scaling on FinFET-based nodes at 3nm and beyond looks like a major challenge for chip designers in the years to come.