OpenCL profiling tools on Windows for Nvidia GPUs


I am trying to profile a Windows application that runs some OpenCL kernels on a computer with an Nvidia GPU.
Is there a profiling tool that supports this configuration?
From my testing and searching so far, it seems that:

  • AMD CodeXL works only for AMD GPUs
  • Nvidia Visual Profiler works only for CUDA kernels
  • Nvidia Nsight Visual Studio Edition allows profiling only for CUDA kernels

Any help would be enormously appreciated.
Thank you.

NVIDIA Nsight for Visual Studio can trace OpenCL API calls and can also show you GPU memory transfer and kernel execution times. I do it all the time. It can also give you details of kernel execution (avg. execution time, occupancy, etc.). It cannot single-step through kernels. I would think that Visual Profiler can do this too, but I haven’t tried it.

True, Nvidia Nsight for Visual Studio can trace OpenCL applications, and it does provide useful information. However, tracing is different from what a profiler does and is not very helpful when trying to find bottlenecks and optimize code.
I did try using Visual Profiler, but it fails to generate the timeline. This happens with both Visual Profiler 5.5 and 6.0, and I get the same result with an OpenCL demo application from NVIDIA.

Any news?

Is there any way to profile an OpenCL application on Nvidia GPUs in Windows, or is such a tool simply not available yet?