Reliable device identifier for caching binary

I’m caching the binary from a cl_program object so that I can efficiently recreate other cl_program objects from that binary. What is a reliable identifier to later match devices back to the appropriate binary? Is CL_DEVICE_NAME enough? Or should I use a combination of CL_DEVICE_NAME, CL_DEVICE_VENDOR, CL_DEVICE_VERSION, and CL_DRIVER_VERSION?


I’ve noticed different drivers from ATI and NVIDIA exposing different bugs on different platforms (Windows, Linux, x86, x64, across a few different CPUs and GPUs). My guess is that the more information you attach to your binary, the better. Different manufacturers' devices may have different specs, and different versions of OpenCL may have different bugs in their compilers. You might also want your program to refuse to run on driver versions or devices that your code has not been tested against (whether older or newer). At the very least, if your program attempts to run on a device it has not been tested against, you should have some mechanism to validate that the results you get on that device are correct (or correct enough) for your application.
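One way to picture the "only run on tested configurations" idea is a small allowlist lookup. This is only a sketch: the function name IsTestedConfig and the idea of keying on the device name plus driver version string are assumptions made for illustration, and the strings would come from CL_DEVICE_NAME and CL_DRIVER_VERSION in a real program.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Hypothetical allowlist entry: (device name, driver version) as reported
// by CL_DEVICE_NAME and CL_DRIVER_VERSION.
using TestedConfig = std::pair<std::string, std::string>;

// Returns true only if this exact device/driver pair was tested.
// A caller could refuse to run, or run extra result validation, otherwise.
bool IsTestedConfig(const std::set<TestedConfig>& tested,
                    const std::string& device_name,
                    const std::string& driver_version) {
    return tested.count(TestedConfig(device_name, driver_version)) > 0;
}
```

When the lookup fails, falling back to a slower validated path may be friendlier than refusing to run at all.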

I agree with coleb that it is better to use as much of the available info as possible. I would also suggest appending your own application version, at least when dealing with production code, because your CL source code and the cached binary could get out of sync, e.g. an old binary being used after the kernel code has been updated. Here is what my code does (using the C++ bindings):

std::string result = "myapplication ";
cl_device_type type = d.getInfo<CL_DEVICE_TYPE>();
if (type==CL_DEVICE_TYPE_CPU) result += "CPU ";
else if (type==CL_DEVICE_TYPE_GPU) result += "GPU ";
else if (type==CL_DEVICE_TYPE_ACCELERATOR) result += "ACC ";
else result += "??? ";
result += d.getInfo<CL_DEVICE_NAME>() + " " + d.getInfo<CL_DEVICE_VERSION>() + " " + d.getInfo<CL_DRIVER_VERSION>();

What I thought as I read the spec: why isn’t a unique identifier stored in the binaries themselves? OpenCL could then just take the binaries and dispatch them to the right device…

This would also have advantages when you are running an SLI system. In that case you wouldn’t need N binaries for N devices (OpenCL could just return one binary instead of N if the implementation is able to optimize away all the redundant binaries).
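Until the runtime does something like that, an application can remove the redundancy itself by grouping devices that report the same identifier string. The sketch below assumes (and this is only an assumption) that two devices with identical identifier strings can share one binary. GroupDevicesByIdentifier is a made-up name, and the input is whatever identifier string you build per device.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Maps each distinct identifier string to the indices of the devices that
// reported it, so only one binary per map entry needs to be cached.
std::map<std::string, std::vector<std::size_t>>
GroupDevicesByIdentifier(const std::vector<std::string>& identifiers) {
    std::map<std::string, std::vector<std::size_t>> groups;
    for (std::size_t i = 0; i < identifiers.size(); ++i) {
        groups[identifiers[i]].push_back(i);
    }
    return groups;
}
```

On a system with two identical chips this yields a single group holding both device indices.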

It isn’t clear from the spec, but you are supposed to use the CL_DEVICE_VENDOR_ID.

I ended up writing this

const char *CLGetDeviceType(cl_device_type type)
{
  switch (type)
  {
  case (CL_DEVICE_TYPE_CPU):
    return "CL_DEVICE_TYPE_CPU";
  case (CL_DEVICE_TYPE_GPU):
    return "CL_DEVICE_TYPE_GPU";
  case (CL_DEVICE_TYPE_ALL):
    return "CL_DEVICE_TYPE_ALL";
  default:
    return "Unknown";
  }
}

static std::string GetUniqueDeviceName(const cl::Device &device)
{
  // Concatenate every property that might distinguish one device/driver from another.
  std::string uname;
  uname += CLGetDeviceType(device.getInfo<CL_DEVICE_TYPE>());
  uname += device.getInfo<CL_DEVICE_VENDOR>();
  uname += NumberToString(device.getInfo<CL_DEVICE_VENDOR_ID>()); // NumberToString: helper not shown
  uname += device.getInfo<CL_DEVICE_NAME>();
  uname += device.getInfo<CL_DEVICE_VERSION>();
  uname += device.getInfo<CL_DEVICE_PROFILE>();
  uname += device.getInfo<CL_DEVICE_EXTENSIONS>();

  return uname;
}

This is probably way overkill, but at this point I’d rather be safe (I have to demo this software next week :-).
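An identifier built this way gets long (the extensions string alone can be hundreds of characters), so it may be handy to hash it into a short cache file name. Here is a rough sketch using 64-bit FNV-1a. The names Fnv1a64 and CacheFileName and the ".clbin" suffix are made up for illustration. Fields are joined with a newline so that "ab"+"c" and "a"+"bc" do not collide, assuming no field itself contains a newline.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// 64-bit FNV-1a hash over the bytes of s.
std::uint64_t Fnv1a64(const std::string& s) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL;                  // FNV 64-bit prime
    }
    return h;
}

// Joins the identifier fields with '\n' and returns a 16-hex-digit file name.
std::string CacheFileName(const std::vector<std::string>& fields) {
    std::string joined;
    for (const std::string& f : fields) {
        joined += f;
        joined += '\n';  // separator keeps field boundaries unambiguous
    }
    char buf[17];
    std::snprintf(buf, sizeof buf, "%016llx",
                  static_cast<unsigned long long>(Fnv1a64(joined)));
    return std::string(buf) + ".clbin";
}
```

A hash can collide in principle, so storing the full identifier string next to the binary and comparing it on load would be the cautious variant.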

It should be noted that although this saves me some time on the NVIDIA Linux implementation, creating a program from the binary and then building it still takes ~0.3 seconds. Not cheap, but cheaper than the ~1 second it takes to compile directly from OpenCL source.
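For the disk side of such a cache, something along these lines could work. It is only a sketch with hypothetical helper names (ReadCachedBinary, WriteCachedBinary), and the OpenCL calls are deliberately left out: on a successful read you would hand the bytes to clCreateProgramWithBinary and build, and on a miss (or a failed build) fall back to compiling from source and write the fresh binary back.

```cpp
#include <cassert>
#include <fstream>
#include <ios>
#include <iterator>
#include <string>
#include <vector>

// Reads the whole cached binary into 'out'. Returns false when the file
// cannot be opened, which the caller treats as a cache miss.
bool ReadCachedBinary(const std::string& path, std::vector<unsigned char>& out) {
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in) return false;
    out.assign(std::istreambuf_iterator<char>(in),
               std::istreambuf_iterator<char>());
    return true;
}

// Writes (or overwrites) the cached binary. Returns false on any I/O error.
bool WriteCachedBinary(const std::string& path,
                       const std::vector<unsigned char>& data) {
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(data.data()),
              static_cast<std::streamsize>(data.size()));
    out.flush();
    return static_cast<bool>(out);
}
```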

What about the devices of a dual-GPU card like the GT 9800 GX2? Both chips report the same information.

Edit: Just ignore this… I didn’t read the whole thread properly.