It would be really useful if the OpenCL standard included a CPU-side converter from float to half. To work around this, I’ve written a kernel specifically to convert floats to halfs, but it is now a bottleneck in my application. I imagine that is because of the overhead of going to the device for such a small amount of work.
Does anyone have some CPU code that will convert from a float to a cl_half without having to touch the device? The inverse (half to float) would be useful for completeness.
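To make the request concrete, here is a minimal C sketch of the kind of host-side conversion I have in mind. It assumes `cl_half` is just a 16-bit unsigned integer holding IEEE 754 binary16 bits and that round-to-nearest-even is the desired rounding. The names `half_bits`, `float_to_half`, and `half_to_float` are hypothetical and not part of OpenCL.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: stands in for cl_half, treated as raw 16-bit storage. */
typedef uint16_t half_bits;

/* float -> half, round-to-nearest-even (hypothetical helper). */
static half_bits float_to_half(float f)
{
    uint32_t x;
    memcpy(&x, &f, sizeof x);              /* read the float's bit pattern */
    uint32_t sign = (x >> 16) & 0x8000u;
    uint32_t mag  = x & 0x7FFFFFFFu;       /* magnitude without sign bit */

    if (mag > 0x7F800000u)                 /* NaN -> quiet NaN */
        return (half_bits)(sign | 0x7E00u);
    if (mag >= 0x47800000u)                /* >= 65536 or infinity -> infinity */
        return (half_bits)(sign | 0x7C00u);
    if (mag < 0x33000000u)                 /* below 2^-25 -> signed zero */
        return (half_bits)sign;

    uint32_t exp  = mag >> 23;             /* biased float exponent */
    uint32_t mant = mag & 0x007FFFFFu;
    uint32_t h, rem, halfway;

    if (exp < 113) {
        /* Result is a subnormal half: shift in the implicit leading 1. */
        uint32_t shift = 126 - exp;        /* 14..24 */
        mant |= 0x00800000u;
        h       = mant >> shift;
        rem     = mant & ((1u << shift) - 1u);
        halfway = 1u << (shift - 1);
    } else {
        /* Normal half: rebias exponent (127 -> 15), keep top 10 mantissa bits. */
        h       = ((exp - 112) << 10) | (mant >> 13);
        rem     = mant & 0x1FFFu;
        halfway = 0x1000u;
    }
    /* Round to nearest, ties to even. A carry may bump the exponent,
       which correctly yields the next binade or infinity. */
    if (rem > halfway || (rem == halfway && (h & 1u)))
        h++;
    return (half_bits)(sign | h);
}

/* half -> float; always exact (hypothetical helper). */
static float half_to_float(half_bits h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x03FFu;
    uint32_t x;

    if (exp == 0x1Fu) {                    /* infinity or NaN */
        x = sign | 0x7F800000u | (mant << 13);
    } else if (exp != 0) {                 /* normal number */
        x = sign | ((exp + 112) << 23) | (mant << 13);
    } else if (mant == 0) {                /* signed zero */
        x = sign;
    } else {                               /* subnormal half: normalize it */
        exp = 113;
        while (!(mant & 0x0400u)) {
            mant <<= 1;
            exp--;
        }
        x = sign | (exp << 23) | ((mant & 0x03FFu) << 13);
    }
    float f;
    memcpy(&f, &x, sizeof f);
    return f;
}
```

With something like this, a host buffer could be converted in a plain loop (`dst[i] = float_to_half(src[i]);`) before being passed to `clEnqueueWriteBuffer`, avoiding the extra kernel launch. I have not verified that this matches what any particular device produces for its own float-to-half conversions.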