Why is the work group size specified inside the shader (local_size_x)?

aqnuep · August 15, 2012, 8:01am

The reason is that the local work group size actually affects the shader code. Think about it: the local work group size affects the thread scheduling scheme and the shared memory usage pattern. The driver could hide this and let the developer supply the size at dispatch time, but that would probably still require a shader recompile, so the cost of a compute dispatch would not be predictable from a performance point of view, even if the driver caches the shaders.

However, you can easily manage multiple local work group sizes yourself by simply creating multiple shaders with the local work group sizes of your choice and selecting the appropriate one when needed. This way there are no hidden costs and you can expect optimum dispatch speed.
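To make the "multiple shaders" idea concrete, here is a rough sketch of how I would generate the variants on the application side. Everything named here (makeComputeSource, variantFor, pickLocalSize, groupCount, the 4096 threshold and the shader body itself) is made up for illustration and not taken from any driver or library. The sketch only builds the GLSL source strings; in a real application you would compile each string once with glCreateShader(GL_COMPUTE_SHADER) and glLinkProgram, cache the resulting program object instead of the string, and launch it with glDispatchCompute.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Builds GLSL compute shader source with the local size baked in.
// Note that the shared array size depends on the same constant, which is
// one reason the local size has to be known when the shader is compiled.
std::string makeComputeSource(std::uint32_t localSizeX) {
    const std::string n = std::to_string(localSizeX);
    return
        "#version 430\n"
        "layout(local_size_x = " + n + ") in;\n"
        "layout(std430, binding = 0) buffer Data { float values[]; };\n"
        "shared float tile[" + n + "];\n"
        "void main() {\n"
        "    uint gid = gl_GlobalInvocationID.x;\n"
        "    uint lid = gl_LocalInvocationID.x;\n"
        "    bool inRange = gid < uint(values.length());\n"
        "    tile[lid] = inRange ? values[gid] : 0.0;\n"
        "    barrier();\n"
        "    if (inRange) values[gid] = tile[lid] * 2.0;\n"
        "}\n";
}

// Returns the cached variant for a local size, creating it on first use.
// Real code would store a linked GL program handle here, not the source.
const std::string& variantFor(std::map<std::uint32_t, std::string>& cache,
                              std::uint32_t localSizeX) {
    auto it = cache.find(localSizeX);
    if (it == cache.end()) {
        it = cache.emplace(localSizeX, makeComputeSource(localSizeX)).first;
    }
    return it->second;
}

// Hypothetical selection heuristic: small jobs get a small group size.
// The threshold is arbitrary; tune it by measuring on your hardware.
std::uint32_t pickLocalSize(std::uint32_t itemCount) {
    return itemCount < 4096u ? 64u : 256u;
}

// Number of work groups needed to cover itemCount items (rounds up).
// This is the value you would pass as num_groups_x to glDispatchCompute.
std::uint32_t groupCount(std::uint32_t itemCount, std::uint32_t localSizeX) {
    assert(localSizeX > 0);
    return (itemCount + localSizeX - 1u) / localSizeX;
}
```

Since all variants are created up front (or on first use), picking one at dispatch time is just a map lookup, so no recompile can sneak into the dispatch path.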

Regarding GL compute shaders instead of CL-GL interop: you should definitely be able to gain some performance by using GL compute shaders. No matter how nice CL-GL interop is, developers often complain about its performance hit due to synchronization between the contexts. GL compute shaders are not affected by such cross-context synchronization issues.