I posted this question on StackOverflow, but while researching the answer, I saw someone there recommend this forum as the best place to get OpenCL questions answered. So hopefully you guys can give me some guidance.
In my OpenCL program, I am going to end up with 60+ global memory buffers that each kernel will need to be able to access. What’s the recommended way of letting each kernel know the location of each of these buffers?
The buffers themselves are stable throughout the life of the application – that is, we will allocate the buffers at application start, call multiple kernels multiple times, then only deallocate the buffers at application end. Their contents, however, may change as the kernels read from and write to them.
In CUDA, the way I did this was to create 60+ program-scope global variables in my CUDA code. I would then, on the host, write the addresses of the device buffers I allocated into these global variables. The kernels would then simply use these global variables to find the buffers they needed to work with.
What’s the recommended way to do this in OpenCL? It seems that CL’s global variables are a bit different from CUDA’s, but I can’t find a clear answer on whether my CUDA method will work, and if so, how to go about transferring the buffer pointers into global variables. If that won’t work, what’s the best alternative?
OpenCL buffers are quite different from global memory allocated with cudaMalloc(): there is no 1:1 mapping between an OpenCL buffer and a specific global memory area on the device. In theory, the OpenCL runtime could unmap a buffer and remap it to a different global memory address between kernel calls, although I doubt this happens in practice. So you cannot reliably do what you can do in CUDA, i.e. throw the global memory addresses returned by cudaMalloc() into some constant space and forget about passing them as parameters to kernels. You will have to pass each buffer as a separate kernel argument to each kernel.
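With 60+ buffers, setting the arguments by hand gets tedious, so here is a minimal sketch of one way to do it on the host: keep the cl_mem handles in an array and bind them in a loop. The helper name bind_buffers is made up for illustration, and the sketch assumes every kernel declares its __global pointer parameters first, in the same order as the array.

```c
#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif

/* Hypothetical helper (not part of OpenCL): binds buffers[0..count-1]
 * to kernel arguments 0..count-1, in order.
 * Assumption: the kernel's first `count` parameters are the __global
 * pointers, declared in the same order as the `buffers` array. */
static cl_int bind_buffers(cl_kernel kernel, const cl_mem *buffers, cl_uint count)
{
    for (cl_uint i = 0; i < count; ++i) {
        /* For a buffer argument, pass the size of the cl_mem handle
         * and a pointer to the handle itself. */
        cl_int err = clSetKernelArg(kernel, i, sizeof(cl_mem), &buffers[i]);
        if (err != CL_SUCCESS)
            return err; /* stop at the first argument that fails */
    }
    return CL_SUCCESS;
}
```

Since your buffers live for the whole application, it should be enough to call this once per kernel object after clCreateKernel(): argument values set with clSetKernelArg() are retained across enqueues until you change them. It is probably also worth checking CL_DEVICE_MAX_PARAMETER_SIZE via clGetDeviceInfo() to confirm your device accepts that many arguments.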