Zhaoxin’s 12- and 16-Core CPUs Tested: Centaur Lives On
Zhaoxin, a China-based CPU developer with an x86 license, has yet to officially introduce its next-generation KaiSheng KH-40000 processor with up to 16 cores for data centers. However, benchmark results for the chip have already begun to appear in the Geekbench 5 database. The new CPU offers a notable microarchitectural performance improvement over its predecessor, but it falls well short of AMD’s and Intel’s latest CPUs.
Mysterious CPU
Jointly owned by Via Technologies and the Shanghai Municipal Government, Zhaoxin has been gradually evolving a microarchitecture designed by Via (or rather Centaur) since the mid-2010s. Its upcoming KaiSheng KH-40000 series processors for data centers are based on the CentaurHauls microarchitecture, which some claim resembles Intel’s Haswell microarchitecture from 2013.
The KaiSheng KH-40000/16 and KaiSheng KH-40000/12 CPUs run at 2.20 GHz and have 16 and 12 cores with 32MB and 24MB of L3 cache, respectively. Additionally, the 16-core model appears to feature simultaneous multithreading (SMT) and can handle up to 32 threads at once, assuming Geekbench 5 reads that feature correctly. Based on the specifications of the KaiSheng KH-40000/16 and KaiSheng KH-40000/12 published in the Geekbench 5 database, these CPUs look very similar to Centaur’s unreleased CHA processor, which was spotted earlier this year.
However, there are differences. The CHA has 8 cores, does not support SMT, and was designed for TSMC’s N16 node, while the KaiSheng KH-40000 has up to 16 cores, appears to support SMT, and is believed to be designed for TSMC’s N7 manufacturing process. Additionally, the processor ID for both KH-40000 CPUs reads “CentaurHauls Family 7 Model 11 Stepping 3”, while the CHA’s processor ID is “CentaurHauls Family 6 Model 71 Stepping 2”, so the CPUs in question use different silicon.
The odd thing, though, is that both the CHA and the KH-40000 operate at 2.20 GHz. So, without knowing the CPU IDs, one could assume that the KH-40000/16 uses two 8-core CHA dies manufactured on TSMC’s N16 node and glued together with an interconnect.
Mediocre Performance
For Zhaoxin, CentaurHauls should be a significant advance over the LuJiazui microarchitecture from 2019. Now let’s take a look at the performance numbers, which were apparently submitted by the CPU developers.
| | KaiSheng KH-40000/16 | KaiSheng KH-40000/12 | Centaur CHA | KaiXian KX-U6780A | AMD FX-8350 | Core i9-12900K | Ryzen 9 5950X |
|---|---|---|---|---|---|---|---|
| General specifications | 16C/32T, 2.20GHz, 32MB L3 | 12C/12T, 2.20GHz, 24MB L3 | 8C/8T, 2.20GHz, 16MB L3 | 8C/8T, 2.70GHz, 8MB L3 | 4C/8T | 8P, 8E, 3.20-5.10GHz, 30MB | 16C, 3.40-5.0GHz, 64MB |
| Microarchitecture | Centaur | Centaur | Centaur | LuJiazui | Bulldozer / Piledriver | Golden Cove + Gracemont | Zen 3 |
| OS | Uniontech OS DT 20 Pro | Windows 10 Pro | Windows 10 Pro | Windows 10 Pro | ? | Windows 11 Pro | Windows 10 Pro |
| Single-Core Integer | 450 | 439 | 476 | 366 | 670 | 1830 | 1435 |
| Single-Core Floating Point | 559 | 538 | 541 | 318 | 607 | 2189 | 1881 |
| Single-Core Crypto | 1039 | 934 | 782 | 583 | 1040 | 6064 | 4089 |
| Single-Core Score | 512 | 493 | 511 | 362 | 670 | 2149 | 1702 |
| Multi-Core Integer | 9293 | 3452 | 3307 | 2364 | 3570 | 20631 | 16695 |
| Multi-Core Floating Point | 11875 | 4176 | 3723 | 2089 | 3563 | 23205 | 18695 |
| Multi-Core Crypto | 5233 | 2119 | 4825 | 3390 | 2431 | 17413 | 8145 |
| Multi-Core Score | 9915 | 3603 | 3508 | 2333 | 3511 | 21242 | 16868 |
| Link | https://browser.geekbench.com/v5/cpu/15706425 | https://browser.geekbench.com/v5/cpu/16875254 | https://browser.geekbench.com/v5/cpu/12878360 | https://browser.geekbench.com/v5/cpu/12878360 | https://browser.geekbench.com/v5/cpu/15900997 | https://browser.geekbench.com/v5/cpu/15911328 | https://browser.geekbench.com/v5/cpu/9506672 |
In terms of single-threaded performance, Zhaoxin’s (or Centaur’s) CentaurHauls microarchitecture significantly outperforms the company’s previous-generation LuJiazui microarchitecture, even though the older chip runs at 2.70 GHz. The FPU performance boost looks pretty dramatic, but keep in mind that we are dealing with a synthetic benchmark.
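To put the clock-speed difference in perspective, the single-core scores can be normalized by frequency. The sketch below uses the overall Geekbench 5 single-core scores from the table and assumes each chip actually ran at its listed clock during the test (2.20 GHz and 2.70 GHz), which the database entries do not guarantee. The `score_per_ghz` helper is a hypothetical name introduced here for illustration, and points per GHz is only a crude proxy for per-clock performance.

```python
# Single-core Geekbench 5 scores and listed clocks from the table.
# Assumption: each chip ran at its listed clock for the whole test.
chips = {
    "KH-40000/16": (512, 2.20),  # (single-core score, clock in GHz)
    "KX-U6780A": (362, 2.70),
}


def score_per_ghz(score, clock_ghz):
    """Crude per-clock metric: single-core score divided by clock in GHz."""
    return score / clock_ghz


per_clock = {name: score_per_ghz(s, c) for name, (s, c) in chips.items()}
uplift = per_clock["KH-40000/16"] / per_clock["KX-U6780A"]

for name, value in per_clock.items():
    print(f"{name}: {value:.1f} points per GHz")
print(f"Per-clock uplift of the newer chip: {uplift:.2f}x")
```

Under that assumption, the newer design delivers roughly 1.7 times the single-core score per GHz of the LuJiazui-based part, despite its lower clock.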
The new microarchitecture is significantly better than the previous one, but the 12- and 16-core KaiSheng KH-40000 CPUs cannot compete with the latest CPUs. Moreover, their single-threaded performance is even lower than that of AMD’s ill-fated Bulldozer/Piledriver architecture from mid-2012.
When it comes to multi-threaded performance, we see an odd advantage for Zhaoxin’s 16-core KaiSheng KH-40000/16 with SMT over the 12-core KaiSheng KH-40000/12. In theory, a 16C/32T chip can handle about 2.67 times as many threads as a 12C/12T chip, and we have never seen SMT scale that efficiently on any well-known CPU microarchitecture. Yet the measured advantage is even larger than that theoretical ceiling: 2.69X for integer and 2.84X for floating point. Since one CPU has only 4 more cores than the other but delivers almost 3x the performance, we believe that factors other than core count are affecting the results.
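The scaling anomaly is easy to check against the table. The short sketch below compares the thread-count ratio of the two chips with the measured multi-core speedups. All numbers come from the Geekbench 5 entries above, and the variable names are chosen here purely for illustration.

```python
# Thread counts from the Geekbench 5 listings (16C/32T vs. 12C/12T).
threads_16c, threads_12c = 32, 12

# Multi-core subscores: (KH-40000/16, KH-40000/12).
multicore = {
    "integer": (9293, 3452),
    "floating point": (11875, 4176),
}

# Upper bound on scaling if every hardware thread counted as a full core.
thread_ratio = threads_16c / threads_12c
speedups = {test: big / small for test, (big, small) in multicore.items()}

print(f"Thread-count ratio: {thread_ratio:.2f}x")
for test, speedup in speedups.items():
    verdict = "exceeds" if speedup > thread_ratio else "is within"
    print(f"{test}: {speedup:.2f}x ({verdict} the thread-count ratio)")
```

Both measured speedups land above the thread-count ratio, which would only be an upper bound if each SMT thread performed like a full core. That is why the two results should not be read as a clean core-count comparison.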
Bearing in mind that the Windows 10/11 scheduler does not always work optimally with unfamiliar multi-core CPUs, we believe the results of the 12-core KaiSheng KH-40000/12 under Windows 10 Pro do not reflect its true potential.
Still, even without SMT on Windows 10 Pro, CentaurHauls is significantly faster than LuJiazui in multi-threaded integer (40%) and multi-threaded floating-point (78%) workloads. The problem is that the absolute performance numbers demonstrated by both the KaiSheng KH-40000 and the Centaur CHA CPUs fall short by today’s standards.
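For reference, the percentages can be reproduced from the multi-core subscores in the table. The sketch below computes the uplift over the LuJiazui-based KX-U6780A for both SMT-less parts tested on Windows 10 Pro. The `uplift_pct` helper is a hypothetical name used only for this illustration.

```python
def uplift_pct(new_score, old_score):
    """Percentage improvement of new_score over old_score."""
    return (new_score / old_score - 1) * 100


# Multi-core subscores from the table: (integer, floating point).
lujiazui = (2364, 2089)  # KX-U6780A, 8C/8T
contenders = {
    "Centaur CHA (8C/8T)": (3307, 3723),
    "KH-40000/12 (12C/12T)": (3452, 4176),
}

results = {
    name: (uplift_pct(i, lujiazui[0]), uplift_pct(f, lujiazui[1]))
    for name, (i, f) in contenders.items()
}

for name, (int_pct, fp_pct) in results.items():
    print(f"{name}: integer +{int_pct:.0f}%, floating point +{fp_pct:.0f}%")
```

With these inputs, the 40% and 78% figures line up with the 8-core Centaur CHA entry, while the 12-core KaiSheng KH-40000/12 comes out slightly further ahead of LuJiazui in both tests.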
Interestingly, the multi-threaded performance numbers that Zhaoxin’s 12-core KaiSheng KH-40000/12 demonstrated on Windows without SMT are comparable to those of AMD’s FX-8350 (4 modules, 8 threads). The performance of a 10-year-old processor is not competitive by today’s standards, at least in Geekbench 5, which is not the best benchmark around.
Some Thoughts
While 12- and 16-core configurations look fine for desktops and entry-level servers, Zhaoxin’s 12- and 16-core CPUs do not offer performance comparable to AMD’s and Intel’s 12- or 16-core processors. On Windows, Zhaoxin seems to be a decade behind AMD and Intel when it comes to performance, judging solely by its Geekbench 5 scores. Even if Zhaoxin enables SMT on its upcoming CentaurHauls-based CPUs (for client and server applications) and Windows “learns” how to use those cores properly, the KaiSheng KH-40000/16 will still be about two times slower than AMD’s and Intel’s 2021 processors with the same number of cores.