Profiling OpenCL using Tracy

Hi,

I found myself in need of a profiler for OpenCL on an AMD machine running Linux and learned that Tracy has native support for OpenCL. The only steps missing are plugging into the profiling info APIs and writing wrappers around the APIs you want to trace. This sounded painful, but after a bit of reading I learned that the OpenCL ICD Loader supports OpenCL layers, so I decided to write a layer that wraps the relevant functions using Tracy’s OpenCL APIs.
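To illustrate the layering idea, here is a minimal, self-contained sketch. It is a simplified model, not the code from the layer: the dispatch table, the zone recorder and every name in it are hypothetical stand-ins. A real layer is initialized by the ICD Loader with the actual OpenCL dispatch table and would open Tracy zones instead of pushing records into a vector.

```cpp
// Simplified model of API layering. All names are hypothetical stand-ins;
// no OpenCL or Tracy headers are used.
#include <chrono>
#include <cstdint>
#include <string>
#include <vector>

// Stand-in for a driver dispatch table: one function pointer per API call.
struct DispatchTable {
    int (*enqueue_kernel)(int queue, int kernel);
    int (*finish)(int queue);
};

// Stand-in for a profiler zone: call name plus CPU-side duration.
struct ZoneRecord {
    std::string name;
    std::int64_t cpu_ns;
};

static std::vector<ZoneRecord> g_zones;          // collected trace
static const DispatchTable* g_target = nullptr;  // next table in the chain

// RAII timer: measures from construction to destruction and records a zone.
class ScopedZone {
    using Clock = std::chrono::steady_clock;
public:
    explicit ScopedZone(const char* name) : name_(name), start_(Clock::now()) {}
    ~ScopedZone() {
        const std::int64_t ns =
            std::chrono::duration_cast<std::chrono::nanoseconds>(
                Clock::now() - start_).count();
        g_zones.push_back(ZoneRecord{name_, ns});
    }
private:
    const char* name_;
    Clock::time_point start_;
};

// Layer wrappers: open a zone, then forward to the wrapped implementation.
static int layer_enqueue_kernel(int queue, int kernel) {
    ScopedZone zone("enqueue_kernel");
    return g_target->enqueue_kernel(queue, kernel);
}

static int layer_finish(int queue) {
    ScopedZone zone("finish");
    return g_target->finish(queue);
}

static const DispatchTable g_layer_table = {layer_enqueue_kernel, layer_finish};

// Layer initialization: remember the target table, hand back our own.
const DispatchTable* init_layer(const DispatchTable* target) {
    g_target = target;
    return &g_layer_table;
}

// Fake "driver" so the sketch can be exercised without a GPU.
static int driver_enqueue_kernel(int /*queue*/, int kernel) { return kernel * 2; }
static int driver_finish(int /*queue*/) { return 0; }
static const DispatchTable g_driver_table = {driver_enqueue_kernel, driver_finish};
```

Calling `init_layer(&g_driver_table)` and then going through the returned table forwards every call to the fake driver while appending one record per call to `g_zones`. The application never has to change, which is the whole appeal of the layer approach.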

The result is pretty good: I can now run any OpenCL application and have the runtime of an OpenCL function (both CPU and GPU side) show up in Tracy’s interactive GUI! I get especially good and reliable results with the rusticl and pocl drivers. The ROCm OpenCL driver (v6.3.1) likely does something funny with the timestamps, which keeps them from showing up in Tracy’s interactive GUI.
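For context, the GPU-side numbers come from the per-event profiling timestamps (queued, submit, start, end, in nanoseconds) that OpenCL exposes through clGetEventProfilingInfo. Below is a minimal sketch of a sanity check one could run on those four values when a driver's zones go missing. The struct and function names are hypothetical, the code is not taken from the layer, and it is not a diagnosis of what the ROCm driver actually does.

```cpp
// Sanity check for one event's profiling timestamps.
// Names are hypothetical; values are assumed to be device nanoseconds.
#include <cstdint>

struct EventTimestamps {
    std::uint64_t queued;  // command enqueued by the host
    std::uint64_t submit;  // command submitted to the device
    std::uint64_t start;   // execution started
    std::uint64_t end;     // execution finished
};

// Writes the execution time (end - start) to *out_ns and returns true when
// the timestamps are in non-decreasing order; returns false otherwise and
// leaves *out_ns untouched.
bool gpu_duration_ns(const EventTimestamps& t, std::uint64_t* out_ns) {
    if (t.queued > t.submit || t.submit > t.start || t.start > t.end) {
        return false;
    }
    *out_ns = t.end - t.start;
    return true;
}
```

Logging the events for which this returns false would be one way to narrow down whether a driver reports out-of-order or zeroed timestamps.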

The implementation is available at [RFC] Add a reference OpenCL layer implementation by martymichal · Pull Request #1190 · wolfpld/tracy · GitHub. You need to compile the layer to get a shared object, which you can then load via the OPENCL_LAYERS environment variable (see GitHub - Kerilk/OpenCL-Layers-Tutorial for the whole workflow).

Currently the implementation wraps only a subset of the APIs. The supported ones cover my particular use case, but adding support for others (Pipes, SVM, Images, …) is easy enough.

I would like to hear if this could be helpful for you when profiling your OpenCL programs.

Cheers!