OpenCL and CUDA on the same GPU (GTX 260)

Hi there,

I have written both an OpenCL version and a CUDA version of my CFD code. Both use the same algorithm and, I assume, the same optimization level, since I didn’t pass any extra flags when compiling.

I expected them to run at roughly the same speed on the same machine. However, I found that CUDA is about 2 times faster than OpenCL on the GTX 260 (3:1 speed).

Did I do something wrong, or is this expected? Could someone give me some suggestions? Any thoughts would be appreciated!


There have been a lot of reports that Nvidia’s OpenCL beta is slow compared to CUDA. If you can figure out which part is slower, you should file a bug report with Nvidia; otherwise you’ll just have to wait for Nvidia to optimize their OpenCL implementation.
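
One way to narrow down which part is slower is to time each phase of the host code separately in both builds. The helper below is only a rough sketch: `PhaseTimer` and the phase names are made up for illustration and are not part of either SDK. It assumes each timed piece of work blocks until the GPU has finished (for example by ending with `clFinish` in the OpenCL build or `cudaDeviceSynchronize` in the CUDA build), because kernel launches are asynchronous and would otherwise appear almost free.

```cpp
#include <chrono>
#include <cstdio>
#include <functional>
#include <map>
#include <string>

// Hypothetical helper: accumulates wall-clock time per named phase so the
// OpenCL and CUDA builds of the same solver can be compared phase by phase.
class PhaseTimer {
public:
    // Runs `work` and adds its elapsed wall-clock time to `phase`.
    // Assumption: `work` blocks until the GPU is done (e.g. it ends with
    // clFinish or cudaDeviceSynchronize); otherwise only the launch cost
    // is measured.
    void time(const std::string& phase, const std::function<void()>& work) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        totals_[phase] +=
            std::chrono::duration<double, std::milli>(stop - start).count();
        ++calls_[phase];
    }

    // Total milliseconds recorded for `phase` (0 if never timed).
    double total_ms(const std::string& phase) const {
        const auto it = totals_.find(phase);
        return it == totals_.end() ? 0.0 : it->second;
    }

    // Number of times `phase` was timed (0 if never timed).
    int calls(const std::string& phase) const {
        const auto it = calls_.find(phase);
        return it == calls_.end() ? 0 : it->second;
    }

    // Prints one line per phase, sorted by phase name.
    void report() const {
        for (const auto& entry : totals_) {
            std::printf("%-12s %10.3f ms over %d call(s)\n",
                        entry.first.c_str(), entry.second,
                        calls_.at(entry.first));
        }
    }

private:
    std::map<std::string, double> totals_;
    std::map<std::string, int> calls_;
};
```

Wrapping setup, host-to-device copies, the kernel loop, and device-to-host copies as separate phases in both versions should show where the gap comes from. It is worth checking in particular whether the OpenCL timing includes `clBuildProgram`: OpenCL normally compiles kernels at run time, while CUDA kernels are typically compiled ahead of time by `nvcc`, so a total run time that includes the build step is not a like-for-like comparison.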