Intel XeSS should work on AMD RDNA 2 and Nvidia Ampere, Turing, and Pascal GPUs

An image depicting two graphics cards, one AMD and one Intel, over the XeSS golden logo

(Image credit: Future)

Intel XeSS is an AI-augmented upscaling technology that Intel hopes will be a feather in the cap of its Alchemist graphics cards when they finally go head-to-head with AMD and Nvidia. It’s a relatively well-understood proposition for gamers, not unlike DLSS or FidelityFX Super Resolution, but we also know Intel intends to offer this frame rate-boosting technology on its competitors’ GPUs.

Exactly which competitors’ graphics cards will benefit from the technology will come down to support for DP4A instructions.

So let’s get right into the thick of it. On Intel Alchemist GPUs, Intel XeSS will operate via XMX acceleration: specific matrix engines built into the architecture that perform a role similar to that of Nvidia’s Tensor Cores. It’s this deep learning acceleration that’ll help XeSS boost your performance in-game.

For graphics cards other than Intel’s Xe-HPG, however, this acceleration can instead be done via DP4A instructions at a somewhat greater expense (meaning it won’t be quite as performant as XMX-accelerated XeSS). A DP4A instruction takes four 8-bit integers (one byte each, INT8) packed into a single 32-bit integer, multiplies them pairwise with a second packed set, and adds the results to a 32-bit accumulator, all on a GPU’s regular ALUs. These instructions are used to accelerate operations that don’t require high precision, such as deep learning.
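To make the packed INT8 arithmetic concrete, here is a minimal Python sketch that emulates what a single DP4A operation computes. The function names (`pack_int8x4`, `unpack_int8x4`, `dp4a`) and the low-byte-first lane order are illustrative assumptions, not part of XeSS or any vendor API, and the wrap-around on overflow is likewise assumed for the sketch. Real GPUs do this in one instruction rather than a loop.

```python
def pack_int8x4(values):
    """Pack four signed 8-bit integers into one 32-bit word.

    Lane 0 goes in the lowest byte (an assumption made for this sketch).
    """
    if len(values) != 4:
        raise ValueError("expected exactly four values")
    word = 0
    for lane, value in enumerate(values):
        if not -128 <= value <= 127:
            raise ValueError(f"{value} does not fit in a signed 8-bit integer")
        word |= (value & 0xFF) << (8 * lane)
    return word


def unpack_int8x4(word):
    """Split a 32-bit word back into four signed 8-bit integers."""
    lanes = []
    for lane in range(4):
        byte = (word >> (8 * lane)) & 0xFF
        # Reinterpret the unsigned byte as a two's-complement signed value.
        lanes.append(byte - 256 if byte >= 128 else byte)
    return lanes


def dp4a(a_word, b_word, accumulator):
    """Emulate a signed dot product of four elements with accumulate."""
    products = (x * y for x, y in zip(unpack_int8x4(a_word), unpack_int8x4(b_word)))
    total = accumulator + sum(products)
    # Assumption: wrap the result to a signed 32-bit integer on overflow.
    total &= 0xFFFFFFFF
    return total - (1 << 32) if total >= (1 << 31) else total


a = pack_int8x4([1, -2, 3, -4])
b = pack_int8x4([5, 6, -7, 8])
print(dp4a(a, b, 10))  # 5 - 12 - 21 - 32 + 10 = -50
```

The appeal for deep learning is that four multiply-adds happen per 32-bit register pair, which is why hardware without dedicated matrix engines can still run quantized inference at a useful speed.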

Modern GPUs support DP4A (Signed Integer Dot-Product of 4 Elements and Accumulate) as per Microsoft Shader Model 6.4, while older GPUs don’t. That means some GPUs can accelerate certain instructions required by XeSS, while others are unable to.

Expected XeSS compatibility

Architecture                                  Compatible
Intel Xe-HPG                                  Yes – XMX (confirmed)
Intel Xe-LP                                   Yes – DP4A (confirmed)
Older Intel architectures (Gen 11, Gen 9)     No
AMD RDNA 2                                    Yes – DP4A (confirmed)
AMD Vega                                      No
AMD Vega 7nm                                  Possible – DP4A
AMD Polaris                                   No
Nvidia Ampere                                 Yes – DP4A
Nvidia Turing                                 Yes – DP4A
Nvidia Pascal                                 Yes – DP4A
Older Nvidia architectures                    No

AMD has confirmed that RDNA 2 graphics cards support DP4A, in a comment to ComputerBase. This is achieved via Rapid Packed Math, a feature originally introduced with AMD’s Vega architecture, though AMD says it’s only the RX 6000-series that can support DP4A instructions.

You don’t only have to take AMD’s word for it; the RDNA 1 ISA reference guide [PDF warning] doesn’t list DP4A compatibility, nor does the one for first-gen Vega. The RDNA 2 guide, meanwhile, notes “Dot product ALU operations added accelerate inference and deep-learning” as a feature change with the newer architecture.

It’s worth noting that 7nm Vega, found within the Radeon VII, lists support for DP4A operations, which could mean it too supports XeSS, and owners of that card can finally feel justified in buying it instead of an RX 5700 XT.

A graph showing rough frame render times between native rendering (top), XeSS with XMX (second), XeSS with DP4A (third), traditional upscaling (fourth), and low quality image (fifth)

Intel doesn’t expect a major performance loss between XMX acceleration and DP4A. (Image credit: Intel)

AMD does, however, note that older GPUs may in theory be able to run XeSS on FP32/FP16 ALUs, but this would be slower. We’ll probably never find out if that’s the case, though, as Intel has already largely shot down a more general FP32/FP16 implementation of XeSS in a recent interview with WCCFTech, citing possible performance concerns.

I’ve reached out to AMD for any further clarification on DP4A support and will update this article if I hear anything back.

Onwards to Nvidia, and it appears that the company’s Ampere and Turing architectures will support DP4A. No qualms there. In fact, the Pascal architecture should also support DP4A, as it was actually the first architecture to support the instruction for team green.

That’s potentially great news for 10-series owners, as Nvidia’s own DLSS technology is supported only on 20-series RTX cards or newer.

In the aforementioned interview, Intel’s Karthik Vaidyanathan also states that Intel will not be looking to program specifically for Nvidia’s Tensor Cores, as that would require bespoke programming.

I’ve also reached out to Nvidia to see if it will share any more information on supported GPUs.

As for older Intel kit, it appears that Xe-LP based iGPUs found in 11th Gen mobile chips should support the instruction and therefore work with XeSS. I’ve not seen any evidence to prove any earlier iGPUs or desktop iGPU models will be supported, however, so as far as I’m concerned today the cut-off will be newer Intel Xe-based chips.

Still, it’s looking like large numbers of older discrete graphics cards will be open to using Intel’s XeSS technology. That should be a big win for Intel, but it does rely on XeSS being able to deliver higher frame rates without too much loss in clarity, even with DP4A rather than XMX. Success will also depend on game support, and Nvidia has a multi-year head start in that regard.

If it all goes Intel’s way, though, XeSS is sure to crank up the pressure on AMD and Nvidia.

Jacob Ridley

Jacob earned his first byline writing for his own tech blog from his hometown in Wales in 2017. From there, he graduated to professionally breaking things at PCGamesN, where he would later win command of the kit cupboard as hardware editor. Nowadays, as senior hardware editor at PC Gamer, he spends his days reporting on the latest developments in the technology and gaming industry. When he isn’t writing about GPUs and CPUs, you’ll find him trying to get as far away from the modern world as possible by wild camping.