Can GPUs be leveraged differently?

My question is: instead of doing high-performance computing on the GPU, can it be used to run multiple processes at one time, like a multi-core server or a cluster of servers would? Is there a way to control how much of the resources (cores, for example) a process gets? If this is possible, can you tell me where to look, such as an API manual or a programming guide?

Thanks guys

I have the same question … :oops:

I’m an OpenCL neophyte, so take what I say with a big grain of salt. Besides the obvious point that this type of processing would probably run better on a multi-core CPU, I’m not sure that GPUs were designed for it. You could enqueue your kernel with a global work size of 1, but I think you will have problems with the local work size. I think it needs to at least match the number of stream processors in each multiprocessor (this might only apply to Nvidia hardware?). You might be able to limit each kernel to one multiprocessor, but then you are limited by the number of multiprocessors your card has. My card (a GeForce 9400 GT) only has two multiprocessors, so if you ran on it, you would be limited to just two processes. That is my understanding anyway; it could be wrong. If it is, please someone let me know. :slight_smile:
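
To put rough numbers on that, here is a toy Python estimate (not OpenCL code). It takes the assumption above at face value: each single-work-item "process" occupies a whole multiprocessor. The function name `estimate_task_parallelism` and the figure of 8 lanes per multiprocessor are assumptions made for illustration only, not something read from a real device.

```python
import math


def estimate_task_parallelism(num_tasks, compute_units, lanes_per_unit):
    """Toy estimate for independent single-work-item tasks on a GPU.

    Assumes each task occupies one whole compute unit (multiprocessor),
    so at most `compute_units` tasks can run at the same time.
    Returns (tasks running at once, waves needed, fraction of lanes busy).
    """
    if num_tasks < 0 or compute_units < 1 or lanes_per_unit < 1:
        raise ValueError("expected num_tasks >= 0 and positive unit/lane counts")
    concurrent = min(num_tasks, compute_units)
    # Tasks beyond the number of compute units have to wait for a later wave.
    waves = math.ceil(num_tasks / compute_units)
    # A single work-item keeps only one lane of its compute unit busy.
    lane_utilization = 1 / lanes_per_unit if num_tasks else 0.0
    return concurrent, waves, lane_utilization


# 8 "processes" on a card with 2 multiprocessors, assuming 8 lanes each.
print(estimate_task_parallelism(num_tasks=8, compute_units=2, lanes_per_unit=8))
# prints (2, 4, 0.125)
```

Under these assumptions only two tasks run at once, the eight tasks need four waves, and seven of every eight lanes sit idle, which is the main reason a multi-core CPU looks like the better fit here.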


The OpenCL concept of a task (enqueueTask) is meant for this usage, but I think GPUs are not well suited to it…
As grimm says, on a GPU you get better performance by using a complete multiprocessor. I think this extension will help with that: … ission.txt.

I’m not even an OpenCL newbie yet, but…

Happened to notice a while back while reading the NVIDIA Fermi whitepaper that it supports Concurrent Kernel Execution. This sounds like what you’re asking for, though I have no clue if/how you can actually access this capability in OpenCL and/or CUDA. Like I said, I’m a pre-newbie :roll:

Thanks guys. The reason I am asking is that the new huge GPUs have so many cores and dedicated memory. Thanks for the white paper; I will go read that.


So many thanks for the Fermi pointer. I think the multiple kernels feature is exactly what I was looking for; this will be huge. :smiley:

Keep in mind that concurrent kernel execution on Fermi has to be from the same context. So this isn’t quite multi-processing yet. Furthermore, it’s still SIMT (single instruction multiple threads), so mapping “processes” in the traditional operating system sense still won’t be very efficient.
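
A toy Python model (not real hardware behavior) can show why the SIMT point matters. Assume a warp issues one instruction per step and only the lanes whose next instruction matches it make progress. The function `simt_steps` and the instruction names are hypothetical, and real GPUs handle divergence with more sophisticated reconvergence than this sketch does.

```python
def simt_steps(lane_programs):
    """Count lockstep steps for one warp in a simplified SIMT model.

    `lane_programs` holds one instruction list per lane. At each step the
    warp issues the next instruction of the first unfinished lane; every
    lane waiting on that same instruction advances, all others idle.
    """
    pcs = [0] * len(lane_programs)  # per-lane program counters
    steps = 0
    while any(pc < len(prog) for pc, prog in zip(pcs, lane_programs)):
        issued = next(
            prog[pc] for pc, prog in zip(pcs, lane_programs) if pc < len(prog)
        )
        for lane, prog in enumerate(lane_programs):
            if pcs[lane] < len(prog) and prog[pcs[lane]] == issued:
                pcs[lane] += 1
        steps += 1
    return steps


# Four lanes running the same kernel: every lane advances together.
uniform = [["load", "add", "store"]] * 4
# Four lanes acting as unrelated "processes" with nothing in common.
divergent = [[f"p{i}_op{j}" for j in range(3)] for i in range(4)]

print(simt_steps(uniform))    # prints 3
print(simt_steps(divergent))  # prints 12
```

In this model the four unrelated instruction streams take as many steps as running them one after another, so the extra lanes buy nothing. That is the sense in which OS-style processes map poorly onto SIMT hardware.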