I'm wondering how to pick the most powerful device from the list of devices I get by querying OpenCL. My problem is that OpenCL only gives me a few hints, and they point to the wrong compute device. Example:
Device          GTX580   X990
Max_clock       1564     3470
compute_units   16       12
clock sum       25024    41640
So it seems that the most powerful compute device would be my i7. In this simple case I could distinguish the devices by CL_DEVICE_TYPE. But when it comes to Sandy Bridge, both units are of type CL_DEVICE_TYPE_CPU, so it is even harder to estimate the possible FLOPS.
Any suggestions on what information I can get without running a small kernel on every device?
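To make the arithmetic in the table explicit, here is a minimal Python sketch of that heuristic. The "clock sum" row is Max_clock multiplied by compute_units. The function name naive_score and the dictionary layout are made up for illustration; in a real program the two numbers would come from clGetDeviceInfo with CL_DEVICE_MAX_CLOCK_FREQUENCY and CL_DEVICE_MAX_COMPUTE_UNITS.

```python
def naive_score(max_clock_mhz, compute_units):
    """Clock frequency times compute units, i.e. the 'clock sum' row."""
    return max_clock_mhz * compute_units


# Values copied from the table above; the dict layout is hypothetical.
devices = {
    "GTX580": {"max_clock": 1564, "compute_units": 16},
    "X990": {"max_clock": 3470, "compute_units": 12},
}

scores = {
    name: naive_score(d["max_clock"], d["compute_units"])
    for name, d in devices.items()
}
best = max(scores, key=scores.get)

print(scores)  # {'GTX580': 25024, 'X990': 41640}
print(best)    # X990
```

The score ignores how many arithmetic units sit inside each compute unit, which is probably why it ranks the CPU above the GPU.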
Thanks in advance,
You could try parsing the platform vendor name and device name strings for nvidia, amd, intel, etc. Depending on what that gives you, you could look for implementation-specific extensions. For example, Nvidia GPUs support an extension (cl_nv_device_attribute_query) for getting the compute capability. From that, you can work out how many CUDA cores you have. There isn't a generic method, and this approach could fail if a vendor changes its name strings.
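A rough sketch of this idea in Python, working on plain strings and numbers so it does not depend on any OpenCL binding. The function names classify_vendor and cuda_cores are hypothetical. The inputs are assumed to come from clGetDeviceInfo (CL_DEVICE_VENDOR, CL_DEVICE_NAME, CL_DEVICE_MAX_COMPUTE_UNITS) and, on Nvidia, from CL_DEVICE_COMPUTE_CAPABILITY_MAJOR_NV and CL_DEVICE_COMPUTE_CAPABILITY_MINOR_NV. The cores-per-multiprocessor table only covers compute capability 1.x and 2.x and would need extending for newer hardware.

```python
# CUDA cores per compute unit (multiprocessor), keyed by compute capability.
# Only 1.x and 2.x are listed here; anything else is treated as unknown.
CORES_PER_SM = {
    (1, 0): 8, (1, 1): 8, (1, 2): 8, (1, 3): 8,
    (2, 0): 32, (2, 1): 48,
}


def classify_vendor(vendor, device_name=""):
    """Guess the vendor from the name strings; fragile by design."""
    text = (vendor + " " + device_name).lower()
    if "nvidia" in text:
        return "nvidia"
    if "amd" in text or "advanced micro devices" in text:
        return "amd"
    if "intel" in text:
        return "intel"
    return "unknown"


def cuda_cores(compute_units, cc_major, cc_minor):
    """Return the CUDA core count, or None for an unlisted capability."""
    per_sm = CORES_PER_SM.get((cc_major, cc_minor))
    if per_sm is None:
        return None
    return compute_units * per_sm


if classify_vendor("NVIDIA Corporation", "GeForce GTX 580") == "nvidia":
    # The GTX 580 has compute capability 2.0 and 16 compute units.
    print(cuda_cores(16, 2, 0))  # 512
```

Returning None for an unknown compute capability keeps the caller from trusting a guessed core count.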
As a side note, running a few kernels beforehand could help you to write auto-tuning code, if you want to do that. That could happen the first time your program runs, after which it stores the configuration settings and device names. If the same device names come up the next time the program runs, reuse the configuration settings. If not (e.g. after a hardware change), rerun the benchmarks. Just an idea.
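The caching part of that idea could look roughly like the Python sketch below. Everything here is an assumption for illustration: the function name load_or_tune, the JSON cache file, and the benchmark callback, which stands in for whatever kernels you would actually time and must return JSON-serializable settings.

```python
import json
import os


def load_or_tune(device_names, benchmark, cache_path):
    """Reuse stored settings for a known device set, else run the benchmark."""
    # The key is the sorted device names, so enumeration order does not matter.
    key = "|".join(sorted(device_names))

    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            cache = json.load(f)

    if key in cache:
        return cache[key]  # same hardware as a previous run

    # New or changed hardware: benchmark once and remember the result.
    settings = benchmark(device_names)
    cache[key] = settings
    with open(cache_path, "w") as f:
        json.dump(cache, f)
    return settings
```

A call such as load_or_tune(names, run_benchmarks, "tuning.json") would then only pay the benchmark cost on the first run or after the device list changes. The benchmarks run locally on each user's machine, so no results need to be reported back.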
Nice idea with the auto-tuning, but sad news about the computing power. The problem with the auto-tuning setup is that there won't be any feedback from users, so it would only help those who have a PC like mine (or like one of my colleagues').
I'll have to sleep on the problem.