What benefits can I expect if I use OpenMP with OpenCL instead of the OpenCL asynchronous API in a multi-GPGPU environment (i.e., a system with more than one GPGPU installed)?
OpenMP (Open Multi-Processing) is an API that supports multi-platform shared-memory multiprocessing programming in C, C++ and Fortran on many architectures, including Unix and Microsoft Windows platforms. It consists of a set of compiler directives, library routines, and environment variables that influence run-time behavior.
I’m certain the benefits are implementation-specific.
A system with multiple GPUs should be able to use all of them, and OpenCL can handle multiple contexts, devices, and command queues. You would need to do that management yourself in your host code, though, because OpenCL does not distribute work across devices automatically.
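To illustrate the kind of host-side management meant here, below is a minimal sketch in plain C of one piece of it: splitting a 1-D range of work items evenly across devices. The names `work_slice` and `slice_for_device` are hypothetical and not part of OpenCL. In a real program the device count would come from `clGetDeviceIDs`, and each slice's offset and count could be passed as the `global_work_offset` and `global_work_size` arguments of `clEnqueueNDRangeKernel` (OpenCL 1.1 or later) on that device's command queue.

```c
#include <stddef.h>

/* Hypothetical helper type: the part of a 1-D range assigned to one device. */
typedef struct {
    size_t offset; /* index of the first work item in the slice */
    size_t count;  /* number of work items in the slice */
} work_slice;

/* Split `total` work items across `num_devices` devices as evenly as
 * possible. The first (total % num_devices) devices get one extra item.
 * Returns an empty slice for invalid arguments. */
work_slice slice_for_device(size_t total, size_t num_devices, size_t device_index)
{
    work_slice s = {0, 0};
    if (num_devices == 0 || device_index >= num_devices)
        return s;

    size_t base  = total / num_devices;
    size_t extra = total % num_devices;

    s.count  = base + (device_index < extra ? 1 : 0);
    s.offset = device_index * base + (device_index < extra ? device_index : extra);
    return s;
}
```

For 10 work items and 3 devices this gives slices of 4, 3 and 3 items starting at offsets 0, 4 and 7. An even split assumes the devices are roughly equally fast; with mixed GPUs you would probably want to weight the slices.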
I’ve not much experience with OpenMP, but from what I gather, it is useful for both multi-core and multi-system programs. One thought I had was that, if you had multiple systems each with multiple GPUs, you could use OpenMP to distribute a main program out to the sub-systems (nodes), and then use OpenCL on the nodes.
The only benefit I can think of is that you have one thread for each GPU, so you can have a dedicated CPU core for each GPU, which might be faster than managing all the GPUs with just one core.
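A rough sketch of that one-thread-per-GPU pattern with OpenMP, in C. This is only an illustration under assumptions: `run_on_device` is a hypothetical stand-in that just sums its slice on the CPU, where real code would enqueue work on the command queue for device `d` and block until it finishes (for example with `clFinish`). `run_on_all_devices` is also a made-up name.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-device work. Real code would enqueue a
 * kernel on the command queue of device `device_index` and wait for it. */
static long run_on_device(int device_index, const int *data, size_t count)
{
    long sum = 0;
    (void)device_index; /* unused in this CPU-only placeholder */
    for (size_t i = 0; i < count; ++i)
        sum += data[i];
    return sum;
}

/* Drive every device from its own host thread: one loop iteration per
 * device, one OpenMP thread per iteration. */
long run_on_all_devices(int num_devices, const int *data, size_t total)
{
    long grand_total = 0;
    if (num_devices <= 0)
        return 0;

    #pragma omp parallel for num_threads(num_devices) schedule(static, 1) \
            reduction(+:grand_total)
    for (int d = 0; d < num_devices; ++d) {
        /* Each thread handles a contiguous slice of the input. */
        size_t begin = total * (size_t)d / (size_t)num_devices;
        size_t end   = total * (size_t)(d + 1) / (size_t)num_devices;
        grand_total += run_on_device(d, data + begin, end - begin);
    }
    return grand_total;
}
```

Build with OpenMP enabled (e.g. `-fopenmp` for GCC or Clang). Without it the pragma is ignored and the loop simply visits the devices one after another, which is a handy way to compare against single-threaded management. Whether the threaded version is actually faster will depend on the OpenCL implementation and on how much host-side work each device needs.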
I think with some implementations you have to use one thread per device. At least that was the case with an old ATI Stream SDK version. Not sure if you still have to do it in the most recent one or with the NVIDIA implementation…
@HolyGeneralK: OpenMP only works within a single shared-memory machine (multi-processor or multi-core), not across systems. To distribute work across nodes you would have to use something like MPI.