OpenCL benchmark

Hi all,

I am new to OpenCL and to this forum. For my own purposes I wrote a simple application that measures the peak FLOP/s of a device in OpenCL. I wanted to check how the data types affect performance.
Basically I just ported my older CUDA code so that I could compare the results, as I am considering OpenCL as an alternative. It is based on a simple FMAD code to maximize throughput.
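
For anyone unfamiliar with this kind of benchmark, here is a rough sketch of how a FLOP/s figure is usually derived from an FMAD-based kernel. It is not the actual code of the program; the function and parameter names are made up for illustration, and it assumes that each FMAD counts as 2 floating-point operations and that the kernel time has already been measured.

```python
def measured_gflops(work_groups, work_group_size, fmads_per_work_item, elapsed_seconds):
    """Estimate achieved GFLOP/s for a kernel in which every work-item runs a chain of FMADs.

    All names are hypothetical. Each FMAD is counted as 2 FLOPs (one multiply, one add).
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    total_work_items = work_groups * work_group_size
    total_flops = 2 * fmads_per_work_item * total_work_items
    return total_flops / elapsed_seconds / 1e9


# Made-up numbers: 1024 work-groups of 256 work-items,
# 4096 FMADs per work-item, kernel time of 10 ms.
print(measured_gflops(1024, 256, 4096, 0.010))
```

The number of work-groups is the only term that changes the total amount of work without changing the kernel itself, which is why it makes a convenient problem-size knob.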

I programmed it under Windows as I couldn’t make NVIDIA and ATI GPUs work together in Linux.
So here’s my first question: has anyone succeeded in getting an OpenCL-capable GeForce/Tesla and a Radeon to work together in Linux? I’ve noticed that the drivers do something with the kernel, hence the problem.
The second one is related to OpenCL itself. Sometimes when I run the program on weaker GPUs (e.g. mobile ones) I get an error indicating that the device is out of resources.
Is there any way of getting information about that in advance, before actually running a kernel?
The GPU driver sometimes crashes because of that, so it would be good to avoid it.

Anyway, you can download the exe here

or here … jects.html

and test it. So far I’ve run it on the AMD, Intel and NVIDIA platforms, on several ATI/NVIDIA GPUs and on Intel Sandy Bridge/Ivy Bridge CPUs. It seems to give correct results. Any comments or suggestions are appreciated.

The problem size can be adjusted through the number of workgroups (I named it ‘Blocks’ as in CUDA). If you are planning to use a weaker GPU you might want to decrease it. Larger problems give more precise measurements on average.

Feel free to use and share it if you find it useful.


It’s been really useful for me to check the effects of overclocking on a GPU/CPU.

I increased the clock of the 6990 from 830 to 900 MHz and the improvement could be seen immediately in FlopsCL.
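
As a rough sanity check, the gain to expect from such a clock change can be estimated like this. It is only a sketch with a made-up function name, and it assumes the benchmark is purely compute-bound, so that throughput scales linearly with the core clock, and that both figures are in the same unit.

```python
def expected_speedup(old_clock, new_clock):
    """Ideal speedup of a compute-bound kernel when only the core clock changes.

    Hypothetical helper; assumes throughput is proportional to the clock.
    """
    if old_clock <= 0:
        raise ValueError("old_clock must be positive")
    return new_clock / old_clock


gain = expected_speedup(830, 900)
print(f"expected gain: {(gain - 1) * 100:.1f}%")
```

So going from 830 to 900 should give roughly 8.4% more FLOP/s at best. A measured gain clearly below that would suggest something other than the core clock is limiting the run.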

Please send me screenshots/logs from your GPUs/CPUs. I would like to compare them and see if there are any errors.

I apologize if the previously posted program didn’t work in some cases.
I forgot to include the necessary MFC DLLs in the executable. Now it works properly.