Without a matrix MAC unit in AMD GPUs, how do you support AI?


I didn’t find any tech support contact, so I am asking here.
I design boards and systems, and need to give some feedback about what kind of GPU to use in some products, and that depends on software support too.
At this year’s Hot Chips conference, both Nvidia and AMD presented their latest GPUs, and it seems AMD’s GPU lacks the matrix-MAC unit that Nvidia’s already has. I also asked the Khronos people at their booth how this is handled; they said AI (machine learning) on AMD GPUs is done with software libraries.
What I want to figure out is whether it is okay to use an AMD GPU. The advantage for me is that AMD sells chips to system vendors, while Nvidia only sells mezzanine boards that fit only certain kinds of systems. Some form factors (not data center rack servers, but aerospace or automotive) need raw BGA chips designed onto the motherboard.

How is it possible to do machine learning on an AMD GPU without the chip having the supporting logic?
Do you have a performance comparison between an Nvidia GPU running whatever software and an AMD GPU running the Khronos software libraries? Are they comparable?


Take it like you do any other engineering problem:

  • What’s the problem you are trying to solve?
  • What are 1) the absolute requirements, and 2) the nice-to-haves?
  • What are your options?
  • Which options meet your requirements?
  • If multiple, which have the most nice-to-haves?

Low-precision matrix multiply+accumulate (MAC) can be implemented with an ordinary GPU, or on a CPU. The only question is performance.
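
To make that concrete, here is a minimal sketch of an 8-bit matrix multiply with wider accumulation, written as plain Python loops. The function name `matmul_int8_acc32` is made up for illustration, and this is not how any vendor library implements it; a dedicated matrix unit performs many of these multiply-accumulate steps per clock, while general-purpose hardware issues them as ordinary instructions.

```python
def matmul_int8_acc32(a, b):
    """Multiply int8 matrices a (m x k) and b (k x n) using plain loops.

    Hypothetical helper for illustration only. Inputs are lists of lists of
    Python ints in the int8 range; sums are kept in a wider accumulator,
    mimicking the int32 accumulation typical of low-precision MAC hardware.
    """
    m, k, n = len(a), len(b), len(b[0])
    for row in a + b:
        for v in row:
            if not -128 <= v <= 127:
                raise ValueError("operand outside int8 range")
    c = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0  # wide accumulator (Python ints do not overflow)
            for p in range(k):
                acc += a[i][p] * b[p][j]  # one multiply-accumulate (MAC)
            c[i][j] = acc
    return c


result = matmul_int8_acc32([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

The result is the same whichever hardware runs it; only the number of MACs completed per second differs.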

You need to define what the throughput requirements are for the problem you are trying to solve. Then evaluate the possible solutions based on those requirements. You may find that you do (or that you don’t) need a specific hardware solution to meet those requirements.
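
As a rough sketch of that evaluation, the arithmetic is just MACs per inference times inferences per second, compared against what a candidate device can sustain. The function names, the `headroom` parameter, and every number below are hypothetical placeholders, not measurements of any real model or GPU.

```python
def required_macs_per_second(macs_per_inference, inferences_per_second):
    """Throughput the workload demands, in MAC operations per second."""
    return macs_per_inference * inferences_per_second


def meets_requirement(sustained_macs_per_second, macs_per_inference,
                      inferences_per_second, headroom=1.0):
    """True if a device's sustained MAC rate covers the workload.

    headroom > 1.0 demands extra margin, since sustained throughput on a
    real workload is usually below the datasheet peak.
    """
    needed = required_macs_per_second(macs_per_inference, inferences_per_second)
    return sustained_macs_per_second >= headroom * needed


# Hypothetical workload: 4e9 MACs per inference at 30 inferences per second.
needed = required_macs_per_second(4_000_000_000, 30)

# Hypothetical device sustaining 200e9 MAC/s: enough as-is, not with 2x margin.
ok_plain = meets_requirement(200_000_000_000, 4_000_000_000, 30)
ok_margin = meets_requirement(200_000_000_000, 4_000_000_000, 30, headroom=2.0)
```

If a part without a matrix unit passes this check with margin, the missing unit stops being a requirement and becomes a nice-to-have.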

However, if some form factors require a specific hardware packaging or interface format that certain hardware solutions don’t offer, then those solutions are off the table. This may cause you to rethink what your requirements are vs. what your nice-to-haves are, or to refine them based on form factor.


Since I develop hardware, I cannot really predict what exactly the customers would do with it.
I was looking for some random but typical application where a performance comparison was already done, something related to inference or image recognition.
Can MLPerf be used for this? Are there MLPerf results for both Nvidia GPUs and AMD GPUs (with Khronos software)?