Getting the most powerful device

Hi there,

I’m wondering how to pick the most powerful device from the list of devices I get by querying OpenCL. My problem is that the values OpenCL reports are only hints, and they can point to the wrong compute device. Example:

Device                  GTX580     X990
Max_clock (MHz)         1564       3470
compute_units           16         12
clock x compute_units   25024      41640

So by these numbers the most powerful compute device would be my i7. In this simple case I could distinguish the devices by CL_DEVICE_TYPE. But when it comes to Sandy Bridge, both units are of TYPE_CPU, so it is even harder to estimate the possible FLOPS.
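
To make the arithmetic concrete, here is a minimal Python sketch of that naive ranking. The numbers are the ones from the table above; the function and variable names are made up for illustration, and in a real program the values would come from the OpenCL device queries.

```python
# Naive ranking: clock frequency times compute units.
# Values are copied from the table in this post; names are hypothetical.

def naive_score(max_clock_mhz, compute_units):
    """Product of clock and compute units, ignoring how wide each unit is."""
    return max_clock_mhz * compute_units

devices = {
    "GTX580": {"max_clock_mhz": 1564, "compute_units": 16},
    "X990": {"max_clock_mhz": 3470, "compute_units": 12},
}

scores = {
    name: naive_score(d["max_clock_mhz"], d["compute_units"])
    for name, d in devices.items()
}
best = max(scores, key=scores.get)
print(scores, "->", best)
```

The score says nothing about how much work one compute unit does per clock, which is why the CPU comes out ahead here.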

Any suggestions as to what information I can get without having to run a small kernel on every device?

Thanks in advance,

You could try parsing the platform vendor name and device name strings for NVIDIA, AMD, Intel, etc. Depending on what that gives you, you could look for implementation-specific extensions. For example, NVIDIA GPUs support an extension (cl_nv_device_attribute_query) for getting the compute capability. From that and the number of compute units, you can work out how many CUDA cores you have. There isn’t a generic method, and this approach could fail if a vendor changes its name strings.
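
A rough sketch of that idea in Python, working on plain strings and integers so it does not depend on any particular OpenCL binding. The function names are hypothetical, the vendor substrings are assumptions about what drivers typically report, and the cores-per-multiprocessor table only covers a few older NVIDIA architectures, so treat it as a starting point rather than a complete lookup.

```python
# Cores per multiprocessor, keyed by compute capability (major, minor).
# Deliberately incomplete: anything not listed yields None.
CORES_PER_SM = {
    (1, 0): 8, (1, 1): 8, (1, 2): 8, (1, 3): 8,
    (2, 0): 32, (2, 1): 48,
    (3, 0): 192, (3, 5): 192,
}

def classify_vendor(name_string):
    """Guess the vendor from a platform vendor or device name string.

    Plain substring matching; this breaks if a vendor changes its strings.
    """
    text = name_string.lower()
    if "nvidia" in text:
        return "nvidia"
    if "advanced micro devices" in text or "amd" in text:
        return "amd"
    if "intel" in text:
        return "intel"
    return "unknown"

def cuda_cores(compute_units, cc_major, cc_minor):
    """Compute units times cores per multiprocessor, or None if unknown."""
    per_sm = CORES_PER_SM.get((cc_major, cc_minor))
    if per_sm is None:
        return None
    return compute_units * per_sm

# A GTX 580 reports 16 compute units and compute capability 2.0.
print(classify_vendor("NVIDIA Corporation"), cuda_cores(16, 2, 0))
```

Note that the platform vendor and the device name can disagree (one vendor's platform may expose another vendor's CPU), so it is probably worth classifying both strings separately.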

As a side note, running a few kernels beforehand could help you to write auto-tuning code, if you want to do that. That could happen the first time your program runs, after which it stores the configuration settings and device names. If the same device names come up next time the program runs, reuse the configuration settings. If not (i.e. hardware change) rerun the benchmarks. Just an idea.
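
Something along these lines, as a minimal sketch. The function name load_or_tune, the cache file layout and the run_benchmarks callback are all made up for illustration; the callback stands in for whatever benchmark kernels you run and is assumed to return JSON-serializable settings.

```python
import json
import os

def load_or_tune(device_names, cache_path, run_benchmarks):
    """Reuse stored settings if the device names match, else re-benchmark."""
    key = sorted(device_names)  # order of enumeration should not matter

    if os.path.exists(cache_path):
        with open(cache_path) as f:
            cached = json.load(f)
        if cached.get("devices") == key:
            return cached["settings"]  # same hardware as last run

    # First run or hardware change: benchmark and store the result.
    settings = run_benchmarks(device_names)
    with open(cache_path, "w") as f:
        json.dump({"devices": key, "settings": settings}, f)
    return settings
```

The first call with a given set of device names runs the benchmarks and writes the file; later calls with the same names just read it back.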

Nice idea with the auto-tuning, sad news about the computing power. The problem with the auto-tuning setup is that there won’t be any feedback from users, so it would only help those with a PC like mine (or maybe like one of my colleagues’).

I’ll have to sleep on the problem.