How can I reduce GPU kernel performance?

chizhan · October 25, 2012, 12:37pm

Hi. I need to limit how much of the GPU my kernel uses. While the OpenCL kernel is running, the OS stops responding (even to Ctrl+Alt+Del) and I have to restart the computer. I would like the load to be not 100% but, let’s say, 90%. When I run one well-known benchmark, I can see that its GPU load stays below 100%.

I’m sorry, but I’m new to OpenCL. Where in the code can I set this limit?

notzed · October 25, 2012, 2:27pm

You just have to break your code up so it runs as more, but shorter, kernels. At this time, even when the hardware supports concurrent scheduling, the load balancing isn’t very good.

Bugs can also crash the computer.
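
As a rough illustration of splitting the work into more but shorter kernels, here is a minimal host-side sketch in C. It only plans the chunks: `run_in_chunks`, `launch_fn` and `count_items` are hypothetical names, and the callback stands in for a real launch. In actual OpenCL host code, each call would presumably map to clEnqueueNDRangeKernel with `offset` passed as global_work_offset (OpenCL 1.1 or later) and `count` as the global work size, followed by clFinish. When a local work size is given, `chunk` would normally need to be a multiple of it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback standing in for one short kernel launch.
 * Assumption: real code would enqueue the kernel for the work-items
 * [offset, offset + count) and wait for it to finish before returning. */
typedef void (*launch_fn)(size_t offset, size_t count, void *user);

/* Split `total` work-items into launches of at most `chunk` items each.
 * Returns the number of launches issued. */
static size_t run_in_chunks(size_t total, size_t chunk,
                            launch_fn launch, void *user)
{
    assert(chunk > 0);
    size_t launches = 0;
    for (size_t offset = 0; offset < total; offset += chunk) {
        size_t remaining = total - offset;
        size_t count = remaining < chunk ? remaining : chunk;
        launch(offset, count, user);
        ++launches;
    }
    return launches;
}

/* Stand-in "kernel": only adds up how many items it was asked to process. */
static void count_items(size_t offset, size_t count, void *user)
{
    (void)offset;
    *(size_t *)user += count;
}
```

Calling `run_in_chunks(1000, 300, count_items, &processed)` issues four launches covering 300, 300, 300 and 100 items. A smaller `chunk` means shorter individual launches, at the cost of more launch overhead.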

chizhan · October 25, 2012, 7:21pm

notzed, thank you! The code is a matrix multiplication. When the matrices are not big, say 1000x1000, the GPU has time to compute them without a crash. With larger ones, the problems begin.

I think there should be a way to avoid taking over all of the GPU’s resources. For example, in ordinary CPU programming (not OpenCL) I can start a thread and assign it THREAD_PRIORITY_LOWEST.

Maybe there is some directive to avoid using all of the compute units (CL_DEVICE_MAX_COMPUTE_UNITS)?

notzed · October 25, 2012, 9:13pm

OpenCL 1.2 has some APIs to partition the device. See section 4.3 of the OpenCL specification.

chizhan · October 25, 2012, 11:43pm

Thanks, but device partitioning isn’t supported on GPUs (CPU only).
http://devgurus.amd.com/thread/159523

chippies · October 28, 2012, 4:02am

Since you are doing matrix multiplication, the kernel can be broken down into smaller parts. You could take the first hundred rows of your left matrix and multiply them by the first hundred columns of your right matrix, giving the first 100x100 block of your output matrix. Repeating this for every block of rows and columns fills in the whole result.

Currently, there aren’t any other methods supported by all vendors.
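
The block decomposition described above can be sketched in plain C. This is a CPU-only illustration under stated assumptions, not OpenCL code: `multiply_tile` and `multiply_blocked` are hypothetical names, the matrices are assumed square and row-major, and each call to `multiply_tile` stands in for one short kernel launch that produces a single output block.

```c
#include <assert.h>
#include <stddef.h>

/* Computes one block of C = A * B for square n x n row-major matrices:
 * output rows [r0, r1) and output columns [c0, c1).
 * Assumption: in OpenCL this would be a single short kernel launch. */
static void multiply_tile(const float *a, const float *b, float *c, size_t n,
                          size_t r0, size_t r1, size_t c0, size_t c1)
{
    for (size_t i = r0; i < r1; ++i) {
        for (size_t j = c0; j < c1; ++j) {
            float sum = 0.0f;
            for (size_t k = 0; k < n; ++k)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* Walks the output matrix in tile x tile blocks (edge blocks may be
 * smaller) and returns the number of blocks computed. */
static size_t multiply_blocked(const float *a, const float *b, float *c,
                               size_t n, size_t tile)
{
    assert(tile > 0);
    size_t tiles = 0;
    for (size_t r0 = 0; r0 < n; r0 += tile) {
        size_t r1 = (n - r0 < tile) ? n : r0 + tile;
        for (size_t c0 = 0; c0 < n; c0 += tile) {
            size_t c1 = (n - c0 < tile) ? n : c0 + tile;
            multiply_tile(a, b, c, n, r0, r1, c0, c1);
            ++tiles;
        }
    }
    return tiles;
}
```

With `n = 1000` and `tile = 100`, this produces 100 separate blocks, each small enough to finish quickly. The right tile size for a given GPU is something to find by experiment.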