Nvidia’s Ada Lovelace architecture brings new levels of performance to the top of the stack. The RTX 4090 outperforms its predecessor, the RTX 3090 Ti, by an average of 52% in rasterization benchmarks and 70% in ray tracing benchmarks (both at 4K). The 4090 currently sits near the top of the GPU benchmark hierarchy, ranking as one of the best graphics cards, at least if you have enough money.
Unfortunately, the step down from the 4090 to the RTX 4080 is pretty steep, with a 23% drop in rasterization performance and a 30% drop in ray tracing. Drop down again to the new RTX 4070 Ti and there’s another 22% decline relative to the 4080. In case you’re keeping score, that makes it the third-tier Ada card, built on the AD104 GPU. It’s slower than the previous-generation 3090 Ti, despite Nvidia’s claims to the contrary, which rely on benchmarks using DLSS 3 frame generation.
Perhaps even more surprising about the RTX 4070 Ti is that it only has a 192-bit memory interface. It still carries 12GB of GDDR6X, and the massive L2 cache generally means a narrower bus isn’t a deal breaker. But when we look ahead to lower-tier RTX 40-series parts like the 4060 and 4050, things don’t look so good.
Nvidia recently announced a full line of RTX 40-series laptop GPUs, from the mobile RTX 4090, which uses the AD103 GPU (essentially a mobile version of the desktop 4080), down to the anemic-sounding RTX 4050. Here’s the full list of mobile part specifications:
| Graphics Card | RTX 4090 Laptop | RTX 4080 Laptop | RTX 4070 Laptop | RTX 4060 Laptop | RTX 4050 Laptop |
|---|---|---|---|---|---|
| Process Technology | TSMC 4N | TSMC 4N | TSMC 4N | TSMC 4N | TSMC 4N |
| Die Size (mm^2) | 378.6 | 294.5 | ? | ? | ? |
| Ray Tracing Cores | 76 | 58 | 36 | 24 | 20 |
| Boost Clock (MHz) | 1455-2040 | 1350-2280 | 1230-2175 | 1470-2370 | 1605-2370 |
| VRAM Speed (Gbps) | 18? | 18? | 18? | 18? | 18? |
| VRAM Bus Width (bits) | 256 | 192 | 128 | 128 | 96 |
| L2 Cache (MB) | 64 | 48 | 32 | 32 | 24 |
| FP32 TFLOPS (Boost) | 28.3-39.7 | 20.0-33.9 | 11.3-20.0 | 9.0-14.6 | 8.2-12.1 |
| FP16 TFLOPS (FP8) | 226-318 (453-635) | 160-271 (321-542) | 91-160 (181-321) | 72-116 (145-233) | 66-97 (131-194) |
| TDP (Watts) | 80-150 | 60-150 | 35-115 | 35-115 | 35-115 |
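The FP32 TFLOPS figures in the table follow directly from the SM counts (which match the ray tracing core counts, since Ada has one RT core per SM) and the boost clocks. Here’s a minimal sketch of the arithmetic, assuming Nvidia’s usual 128 FP32 CUDA cores per Ada SM and two FLOPS (one fused multiply-add) per core per clock:

```python
# FP32 throughput: 2 FLOPS (one FMA) per shader per clock.
# Each Ada SM carries one RT core and 128 FP32 CUDA cores,
# so the table's RT core counts double as SM counts.

def fp32_tflops(sms: int, clock_mhz: float) -> float:
    shaders = sms * 128
    return 2 * shaders * clock_mhz * 1e6 / 1e12

# Mobile RTX 4090: 76 SMs, 1455-2040 MHz boost range
low = fp32_tflops(76, 1455)   # ~28.3 TFLOPS
high = fp32_tflops(76, 2040)  # ~39.7 TFLOPS
print(f"{low:.1f}-{high:.1f} TFLOPS")
```

Plug in any row of the table (e.g., 20 SMs at 1605–2370 MHz for the mobile 4050) and you get the listed 8.2–12.1 TFLOPS range.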
It’s a pretty safe bet that the desktop RTX 4070 uses the same AD104 as the RTX 4070 Ti, with fewer SMs and shaders. The desktop RTX 4060 Ti may or may not use AD104; the only other likely option is the AD106 GPU used in the mobile 4070/4060. And that’s the problem.
The previous-generation RTX 3060 Ti featured 8GB of GDDR6 on a 256-bit interface. That amount of VRAM drew criticism, especially once AMD started shipping its RX 6700 XT (and later the 6750 XT) with 12GB. Nvidia essentially course-corrected with the RTX 3060, giving it 12GB of VRAM, a nice step up from its predecessor, the RTX 2060. The 2060 also eventually got a 12GB model, but the price made it largely unattractive.
Now we’re looking at the RTX 4060 most likely dropping back to 8GB, and that’s a problem. Plenty of games already use more than 8GB of VRAM, and that number will only grow over the next couple of years. But with GDDR6 and GDDR6X topping out at 2GB per 32-bit channel, Nvidia doesn’t have many other options.
There’s the potential to run a “clamshell” mode with two memory chips per channel (one on each side of the PCB), but that’s rather cumbersome and not something you’d expect on a mainstream GPU. It would allow up to 16GB of VRAM on a 128-bit interface, which would also be strange, since higher-tier parts like the 4070 Ti only have 12GB. It still sounds better to me than an 8GB RTX 4060!
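The capacity math here is simple enough to sketch: with 2GB per 32-bit channel as the current GDDR6/GDDR6X ceiling, bus width dictates maximum VRAM, and clamshell mode doubles it by hanging two chips off each channel.

```python
# Max VRAM from bus width, assuming the current GDDR6/GDDR6X ceiling
# of 2GB per 32-bit channel. Clamshell mode doubles capacity by
# placing two memory chips on each channel (one per side of the PCB).

def max_vram_gb(bus_width_bits: int, clamshell: bool = False) -> int:
    channels = bus_width_bits // 32
    return channels * 2 * (2 if clamshell else 1)

print(max_vram_gb(192))                  # 4070 Ti's 192-bit bus: 12GB
print(max_vram_gb(128))                  # rumored 4060 bus: 8GB
print(max_vram_gb(128, clamshell=True))  # clamshell 128-bit: 16GB
```

The same formula explains why a 96-bit part is stuck at 6GB without clamshell memory.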
What about the RTX 4050? Perhaps Nvidia will stick with the 128-bit interface on the AD106 GPU and skip using AD107 in desktop parts. That’s basically what happened with GA107, which was used almost exclusively in the laptop RTX 3050. Otherwise, a desktop part on a 96-bit bus would top out at just 6GB of VRAM, though clamshell memory remains a potential way out.
Memory capacity isn’t the only concern. In our review of the RTX 4070 Ti, we said the performance wasn’t bad, but it wasn’t amazing either. It’s basically a cheaper RTX 3090 with half the VRAM and lower power draw. The 4070 Ti features 60 streaming multiprocessors (SMs) and 7680 CUDA cores (GPU shaders), somewhat more than the RTX 3070 Ti. But AD106 could top out at just 40 SMs, or even 36 SMs, putting it in similar territory to the RTX 3060 Ti on core count and leaving only GPU clocks to deliver a performance boost.
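To make the core-count comparison concrete: both Ampere and Ada expose 128 FP32 CUDA cores per SM, so shader counts fall straight out of SM counts. The AD106 figures below are the rumored 36–40 SM configurations discussed above, not confirmed specs.

```python
# Shader counts from SM counts: 128 FP32 CUDA cores per SM on both
# Ampere and Ada. AD106 SM counts are rumored, not confirmed.
CORES_PER_SM = 128

parts = {
    "RTX 4070 Ti (AD104, 60 SMs)": 60,
    "RTX 3070 Ti (GA104, 48 SMs)": 48,
    "RTX 3060 Ti (GA104, 38 SMs)": 38,
    "AD106 full? (40 SMs)": 40,
    "AD106 trimmed? (36 SMs)": 36,
}
for name, sms in parts.items():
    print(f"{name}: {sms * CORES_PER_SM} shaders")
```

A 36 SM AD106 would land at 4608 shaders, barely below the 3060 Ti’s 4864, which is why clock speed would have to do most of the generational lifting.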
Combine the two, a lack of VRAM and a relatively small increase in GPU shader counts, and we could see only modest performance improvements over the previous-generation Ampere GPUs.
Nvidia will then tout DLSS 3 performance improvements, but those only apply to a subset of games and don’t deliver a true performance boost. One of the benefits of buying a new GPU is that it can keep most games running at 60 fps for years, even as games become more demanding. But what happens when those frames aren’t the genuine article?
Let’s assume a game runs at 120 fps with DLSS 3’s frame generation technology, off a base performance of 70 fps. All is well for now, but as games become more demanding, that base performance will drop to 40 fps and eventually below 30. Frame generation on a sub-30 fps base still feels like 30 fps or less, even though the screen updates more often.
The same logic applies at higher frame rates: DLSS 3 at 120 fps on a 70 fps base feels like 70 fps, even though it looks a bit smoother to the eye. I can’t tell the difference between input sampled 70 times per second and 120 times per second. But at 40 fps and below, even if you’re not a pro gamer, you start to feel the difference.
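The intuition above can be sketched in a few lines. This is a deliberately rough model, assuming an idealized doubling of displayed frames (real games land lower, e.g., 120 fps from a 70 fps base): input is only sampled once per rendered frame, so responsiveness tracks the base frame time no matter how many generated frames are shown.

```python
# Rough model of frame generation: displayed fps roughly doubles
# (idealized 2x here), but input is sampled once per *rendered* frame,
# so input latency tracks the base frame rate.

def frame_gen(base_fps: float) -> dict:
    return {
        "displayed_fps": base_fps * 2,        # what the fps counter shows
        "input_latency_ms": 1000 / base_fps,  # what the game feels like
    }

print(frame_gen(70))  # looks like ~140 fps, feels like ~14.3 ms per input
print(frame_gen(30))  # looks like ~60 fps, still feels like ~33.3 ms
```

At a 30 fps base you’re seeing a 60 fps counter but reacting on a 33 ms input cadence, which is exactly the sub-60 fps sluggishness described above.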
To put it more bluntly, DLSS 3 frame generation is not a panacea. It can smooth out the visuals and improve the feel of a game a bit, but especially when base performance is below 60 fps, the benefits aren’t comparable to actual fully rendered frames that take new user input into account.
I’m not saying it’s a bad technique (it’s actually quite clever), and I don’t mind that it exists. But Nvidia needs to stop comparing DLSS 3 scores against non-DLSS 3 results as if they were the same thing. Frame generation makes a game feel like you added maybe 10-20% to the base framerate, not the 60-100% higher fps the benchmark charts show.
Coming back to the topic at hand, future mainstream and budget RTX 40-series GPUs will almost certainly beat existing models in pure performance while also offering DLSS 3 support. Hopefully Nvidia will also return to prices closer to the previous generation’s.