OpenCL on CUDA architecture


I’m implementing a problem using OpenCL 1.0 on an NVIDIA Fermi-architecture GPU. How will the NVIDIA driver version affect the performance of my kernel? I also don’t really understand how OpenCL programs are mapped to the CUDA architecture.


From my understanding, NVIDIA’s OpenCL implementation is built on top of CUDA. When you compile an OpenCL kernel with NVIDIA’s toolchain, it generates PTX, the same intermediate assembly that CUDA uses, and the driver then compiles that PTX to machine code for your GPU.

Unfortunately, NVIDIA’s OpenCL support is minimal at best. They seem to have made a marketing decision to really only support CUDA, and they haven’t released an OpenCL 1.2 implementation yet. I wouldn’t expect the last few driver versions to differ much in performance, so I would just use the latest drivers.
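If you want to see which OpenCL version your installed driver actually reports, you can query `CL_DEVICE_VERSION` with `clGetDeviceInfo`. The OpenCL specification fixes the format of that string as `OpenCL <major>.<minor> <vendor-specific information>`. Below is a minimal sketch of parsing such a string in plain C. The function names `parse_opencl_version` and `is_cuda_backed` are my own, not part of any API, and the sketch assumes you have already fetched the string from the driver.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not an OpenCL API call).
 * Parses a version string of the form
 *   "OpenCL <major>.<minor> <vendor-specific information>"
 * as returned for CL_DEVICE_VERSION or CL_PLATFORM_VERSION.
 * Returns 1 and fills *major / *minor on success, 0 otherwise. */
int parse_opencl_version(const char *s, int *major, int *minor)
{
    if (s == NULL || major == NULL || minor == NULL)
        return 0;
    /* The literal prefix must match; both numbers must be present. */
    if (sscanf(s, "OpenCL %d.%d", major, minor) != 2)
        return 0;
    return 1;
}

/* Hypothetical helper: returns 1 if the string mentions CUDA anywhere,
 * which is what one would expect in the vendor-specific part on an
 * NVIDIA platform. This is a heuristic, not a guaranteed check. */
int is_cuda_backed(const char *s)
{
    return s != NULL && strstr(s, "CUDA") != NULL;
}
```

For an illustrative string such as `"OpenCL 1.0 CUDA 3.2.1"` (a made-up sample, your driver may report something different), `parse_opencl_version` yields major 1 and minor 0, and `is_cuda_backed` returns 1. Comparing the parsed version before and after a driver update is a quick way to see whether the update changed the OpenCL level you are getting.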