Intel Details Inner Workings of XeSS
Intel has released an explainer video for its upcoming XeSS AI upscaling technology, showcasing how it will work on the company's Arc Alchemist GPUs, which are almost ready to go public. Intel used its fastest Arc A770 for the demonstration, but it's hard to say how performance will stack up against the best graphics cards based on the limited details provided.
If you're even slightly familiar with Nvidia's DLSS, which has been around for four years in one form or another, this video should inspire a keen sense of déjà vu. Tom Petersen, who previously worked at Nvidia and delivered some of its DLSS presentations, explains the fundamentals of XeSS. In a nutshell, XeSS is very much a mirrored version of Nvidia's DLSS, except that it's designed to run on Intel's XMX deep learning cores instead of Nvidia's Tensor cores. The tech works on other GPUs as well via a DP4a mode, which could make it an interesting alternative to AMD's FSR 2.0 upscaler.
In the demo shown by Intel, XeSS seemed to work well. Of course, it's difficult to say for certain when the source video is a compressed 1080p version of the actual content, so we'll save a detailed image quality comparison for another time. Performance improvements appear to be similar to those seen with DLSS, with gains of over 100% in some cases.
How XeSS Works
If you already know how DLSS works, Intel’s solution is pretty much the same, with a few minor tweaks. XeSS is an AI-accelerated resolution upscaling algorithm designed to boost frame rates in video games.
It starts with training, the first step of most deep learning algorithms. The AI network takes low-resolution sample frames from a game and processes them to produce an upscaled output image. The network then compares the result to the desired target image and backpropagates weight adjustments to try to correct the "error." The resulting images don't look good at first, but the algorithm slowly learns from its mistakes. After thousands (or more) of training images, the network eventually converges toward weights that "magically" produce the desired result.
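The train/compare/adjust cycle described above can be sketched with a toy example. This is not XeSS's actual network, which is a deep convolutional model; here a single weight matrix learns a stand-in "upscaling" operator (nearest-neighbor doubling of a small vector) purely to show the loop of predicting, measuring the error, and backpropagating a weight adjustment.

```python
import numpy as np

rng = np.random.default_rng(0)

LOW, HIGH = 4, 8                       # low-res and high-res sizes (stand-ins for images)
W = rng.normal(0, 0.1, (HIGH, LOW))    # "network" weights, randomly initialized

def ideal_upscale(x):
    # Ground-truth operator the network should learn (nearest-neighbor 2x).
    return np.repeat(x, 2)

losses = []
for step in range(500):
    x = rng.random(LOW)                # a low-res "frame"
    target = ideal_upscale(x)          # the desired high-res result
    pred = W @ x                       # the network's upscaled output
    err = pred - target                # the "error"
    losses.append(float(np.mean(err ** 2)))
    W -= 0.1 * np.outer(err, x)        # backpropagate: nudge weights downhill

print(f"loss: {losses[0]:.4f} -> {losses[-1]:.6f}")
```

Because the toy target is exactly learnable, the loss shrinks toward zero; a real network trained on varied game content only approximates its targets, which is why artifacts can remain.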
Once the algorithm is fully trained using examples from different games, it is theoretically possible to take any image input from any video game and upscale it almost perfectly. Similar to DLSS (and FSR 2.0), the XeSS algorithm also plays a role in anti-aliasing and replaces traditional solutions such as temporal AA.
Again, nothing particularly noteworthy so far. DLSS and FSR 2.0, and even standard temporal AA algorithms, have many of the same core features (minus the AI elements of FSR and TAA). The game integrates XeSS into its rendering pipeline. This is typically after the main rendering and initial effects are complete, but before any post-processing effects and GUI/HUD elements are drawn. That way, the UI stays crisp even when doing the difficult task of 3D rendering at lower resolutions.
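The placement described above, after main rendering but before post-processing and the UI, can be expressed as a simple pass ordering. The pass names here are illustrative, not from Intel's SDK:

```python
# Sketch of where an upscaler like XeSS sits in a frame's pass order:
# everything before it renders at the lower internal resolution, and
# everything after it (post-processing, UI/HUD) runs at the target
# resolution so text and HUD elements stay crisp.
def build_frame_passes(use_xess: bool):
    passes = ["geometry", "lighting", "initial_effects"]   # low resolution
    if use_xess:
        passes.append("xess_upscale")                      # low res -> target res
    passes += ["post_processing", "ui_hud"]                # target resolution
    return passes

print(build_frame_passes(True))
```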
XeSS runs on the XMX cores of Intel's Arc GPUs, but it can also run on other GPUs in a slightly different mode. The DP4a instruction basically performs four INT8 (8-bit integer) computations using a single 32-bit register, and is typically accessible via a GPU's shader cores. The XMX cores, on the other hand, natively support INT8 and can process 128 values at once.
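To make the DP4a idea concrete, here's a sketch that models its behavior in Python: two 32-bit words are treated as four signed 8-bit lanes each, the lanes are multiplied pairwise, and the products are added to an accumulator. This mirrors the concept, not the exact GPU semantics.

```python
import struct

def dp4a(packed_a: int, packed_b: int, acc: int) -> int:
    """Model of a DP4a-style op: unpack four signed INT8 lanes from each
    32-bit word, multiply lane-wise, and add the products to a 32-bit
    accumulator (a behavior sketch, not real hardware semantics)."""
    a = struct.unpack("4b", struct.pack("<i", packed_a))
    b = struct.unpack("4b", struct.pack("<i", packed_b))
    return acc + sum(x * y for x, y in zip(a, b))

# Pack four INT8 values into each 32-bit "register"; a single dp4a call
# then performs all four multiply-adds at once.
a_word = struct.unpack("<i", struct.pack("4b", 1, 2, 3, 4))[0]
b_word = struct.unpack("<i", struct.pack("4b", 5, -6, 7, 8))[0]
print(dp4a(a_word, b_word, 100))   # 100 + (5 - 12 + 21 + 32) = 146
```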
This may sound lopsided, and as an example, the Arc A380 has 1,024 shader cores, each capable of doing four INT8 operations at the same time. Alternatively, the A380 has 128 XMX units, each capable of 128 INT8 operations. That makes XMX throughput four times higher than DP4a's, but DP4a should still be good enough to deliver some XeSS goodness on other GPUs.
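Spelling out the arithmetic from the unit counts quoted above confirms the 4x figure:

```python
# Per-clock INT8 throughput comparison for the Arc A380's quoted unit counts.
shader_cores, dp4a_ops_per_core = 1024, 4
xmx_units, int8_ops_per_xmx = 128, 128

dp4a_int8_per_clock = shader_cores * dp4a_ops_per_core   # 4,096 INT8 ops
xmx_int8_per_clock = xmx_units * int8_ops_per_xmx        # 16,384 INT8 ops
print(xmx_int8_per_clock // dp4a_int8_per_clock)         # 4
```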
DP4a mode uses a different trained network, possibly one that's less computationally intensive, and how that translates into actual performance and image quality remains to be seen. If game developers want to support non-Arc GPUs, they'll need to explicitly include support for both the XMX and DP4a modes.
Intel XeSS Performance Expectations
Intel showed off several game tests running XeSS, including a development build of Shadow of the Tomb Raider and a new 3DMark benchmark made specifically for XeSS. The end of the video also showed short clips of Arcadegeddon, Redout II, Ghostwire Tokyo, The DioField Chronicle, Chivalry II, Naraka Bladepoint, and Super People running with and without XeSS.
In Shadow of the Tomb Raider, running on an Arc A770 graphics card at 2560×1440 and near-maximum settings including ray-traced shadows, XeSS delivered a performance boost ranging from around 25% with the Ultra Quality setting to over 100% with the Performance setting. The Quality and Balanced settings land in the middle, with improvements of about 50% and 75%, respectively.
These gains will of course vary depending on the game engine, settings, and base performance; the more demanding the game and the lower the base frame rate, the more XeSS stands to help. Using Performance mode, Intel showed typical gains of 40% to 110% at 1440p, while Balanced mode saw improvements of around 25% to as much as 75%.
3DMark also adds an Intel XeSS feature test to its Advanced Edition. It includes a benchmark mode and a frame inspector that lets users zoom in on frames from the benchmark to compare visual quality. It looks easier to use than Nvidia's ICAT utility, though it's of course limited to frames from a single synthetic benchmark.
3DMark uses the demanding Port Royal ray tracing scene for the XeSS feature test, so the performance gains are particularly impressive. Benchmarked at 1440p, XeSS increased FPS by 145% in Performance mode, 109% in Balanced mode, 81% in Quality mode, and 49% in Ultra Quality mode.
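To put those percentages in perspective: a "+145%" gain means 2.45x the native frame rate. The 30 fps baseline below is an arbitrary example for illustration, not an Intel figure.

```python
# Convert the quoted 3DMark percentage gains into resulting frame rates
# for a hypothetical 30 fps native baseline.
native_fps = 30.0
gains = {"performance": 1.45, "balanced": 1.09, "quality": 0.81, "ultra_quality": 0.49}
for mode, gain in gains.items():
    print(f"{mode}: {native_fps * (1 + gain):.1f} fps")
```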
The frame inspector also shows some good results: XeSS reconstructs the image very well, and Intel's Tom Petersen claims the XeSS output actually looks better than native rendering with TAA. Of course, that has to be taken with a grain of salt, and images from a single canned sequence may not fully represent a real-world gaming experience.
XeSS SDK and 20+ games in progress
Intel provides an easy-to-use SDK for implementing XeSS in game engines. The interface and requirements are very similar to those of DLSS, FSR 2.0, and TAA implementations, making it relatively easy to add to modern graphics engines.
Like TAA, FSR 2.0, and DLSS, XeSS requires motion vectors along with the current frame, and it maintains its own collection of previous frames. All of these feed into the AI network, which ultimately produces the upscaled result. XeSS also uses camera jitter to help eliminate aliasing in the scene.
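The camera jitter mentioned above is a standard temporal technique: each frame the camera is offset by a tiny sub-pixel amount so successive frames sample different positions, giving the network more information to reconstruct from. A common generic choice is a low-discrepancy Halton sequence; this is a sketch of that general technique, not Intel's actual implementation.

```python
def halton(index: int, base: int) -> float:
    """Low-discrepancy Halton sequence value in [0, 1)."""
    result, f = 0.0, 1.0
    while index > 0:
        f /= base
        result += f * (index % base)
        index //= base
    return result

def jitter_offset(frame: int, phases: int = 8):
    """Sub-pixel (x, y) camera offset in [-0.5, 0.5) for this frame,
    cycling through a short Halton(2, 3) pattern as many temporal
    AA/upscaling implementations do."""
    i = (frame % phases) + 1
    return halton(i, 2) - 0.5, halton(i, 3) - 0.5

for f in range(4):
    print(jitter_offset(f))
```

Because the offsets are well distributed rather than random, the accumulated samples cover each pixel evenly over the jitter cycle.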
Intel currently plans for over 20 games to ship with XeSS support in the coming months. Some of them may slip or get delayed, but it's at least a decent start for a newcomer. Nvidia, for comparison, has shipped well over 100 games with DLSS 2.0 and later. How many developers will be willing to add all three options to give gamers a choice of the best algorithm? Many games will likely support only one or two of the possible upscalers.
XeSS will officially launch in the near future, when Intel releases its Arc Alchemist GPUs worldwide. The Arc A380 has effectively launched at this point, and Intel is now hinting at the A750 and A770. Hopefully, in the not-too-distant future, we'll be able to experience XeSS in both XMX and DP4a modes. At the moment, it's well behind the competition from AMD and Nvidia.