I am running performance tests of opencv-cl on an ODROID-XU board (GPU: PowerVR SGX544MP3). Comparing the GPU results with the CPU results, I found that the CPU performs better. The reason appears to be that clFlush blocks the CPU: when I measured the cycles spent in clFlush alone, it accounted for most of the total cycles of the GPU path. As I understand it, though, clFlush should not block the CPU. I would like to see the internal behavior of the clFlush function. Can anyone guide me on this?
clFlush can certainly block the CPU: it won’t return until the command queue has been completely flushed to the hardware, so if the hardware queue is full, the CPU will block until there is room.
Except for CL/GL interop, you hardly ever need clFlush.
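If the goal is to time the kernel itself rather than the host call, one option is to attach an event to the enqueue and read the device timestamps from it. The following is only a rough sketch: it assumes `queue` was created with `CL_QUEUE_PROFILING_ENABLE`, that the kernel arguments are already set, and that the kernel is one-dimensional. `do_other_cpu_work` and `run_kernel_overlapped` are hypothetical names used for illustration, not part of OpenCL or OpenCV.

```c
#include <CL/cl.h>
#include <stdio.h>

/* Hypothetical placeholder for whatever the application does on the CPU. */
static void do_other_cpu_work(void) { }

/* Sketch: enqueue a 1-D kernel, let the CPU do other work, then wait on the
 * kernel's event and report the time the device spent executing it.
 * Assumes the queue was created with CL_QUEUE_PROFILING_ENABLE. */
cl_int run_kernel_overlapped(cl_command_queue queue, cl_kernel kernel,
                             size_t global_size)
{
    cl_event done = NULL;
    cl_int err = clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &global_size,
                                        NULL, 0, NULL, &done);
    if (err != CL_SUCCESS)
        return err;

    /* Ask the runtime to submit the queued commands. The kernel is not
     * required to have finished when this returns, but how long the call
     * itself takes depends on the driver. */
    err = clFlush(queue);
    if (err != CL_SUCCESS) {
        clReleaseEvent(done);
        return err;
    }

    do_other_cpu_work();

    /* Block only at the point where the result is actually needed. */
    err = clWaitForEvents(1, &done);
    if (err == CL_SUCCESS) {
        cl_ulong t0 = 0, t1 = 0;
        if (clGetEventProfilingInfo(done, CL_PROFILING_COMMAND_START,
                                    sizeof t0, &t0, NULL) == CL_SUCCESS &&
            clGetEventProfilingInfo(done, CL_PROFILING_COMMAND_END,
                                    sizeof t1, &t1, NULL) == CL_SUCCESS) {
            /* Profiling timestamps are reported in nanoseconds. */
            printf("kernel time on device: %.3f ms\n",
                   (double)(t1 - t0) * 1e-6);
        }
    }

    clReleaseEvent(done);
    return err;
}
```

The device-side time reported this way excludes whatever the host spends inside clFlush or clWaitForEvents, so it can be compared against the host timer to see how much of the total is submission and waiting overhead.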
Thanks for your reply. I have a question: what is the use of blocking the CPU there? After assigning a task to the GPU, the CPU should be able to do other work until the GPU completes it, but both clFlush and clFinish block the CPU. How can the CPU be used for other tasks? I just want to see the load on the CPU being reduced when a task is assigned to the GPU, and I am not able to measure that. I start a timer before the clEnqueueNDRangeKernel call and stop it after clFinish or clFlush, and I get a larger time value because the CPU is blocked by the flush or finish. Please guide me on how to reduce the CPU's load by using the GPU, and how to see that the CPU load is reduced when using the GPU instead of the CPU.
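One possible way to see whether the CPU is actually busy during a blocking call is to compare wall-clock time with the CPU time the process consumed over the same interval. The sketch below assumes a POSIX system (such as Linux on the board) and uses two stand-in functions, `sleeping_wait` and `spinning_wait`, in place of a real blocking OpenCL call; these names, `measure`, and `timing_t` are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Current reading of the given POSIX clock, in seconds. */
static double now_seconds(clockid_t clock_id)
{
    struct timespec ts;
    int rc = clock_gettime(clock_id, &ts);
    assert(rc == 0);
    (void)rc;
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

typedef struct {
    double wall_s; /* elapsed real time */
    double cpu_s;  /* CPU time consumed by this process */
} timing_t;

/* Run fn(arg) and report both elapsed time and CPU time spent inside it. */
static timing_t measure(void (*fn)(void *), void *arg)
{
    timing_t t;
    double wall0 = now_seconds(CLOCK_MONOTONIC);
    double cpu0 = now_seconds(CLOCK_PROCESS_CPUTIME_ID);
    fn(arg);
    t.cpu_s = now_seconds(CLOCK_PROCESS_CPUTIME_ID) - cpu0;
    t.wall_s = now_seconds(CLOCK_MONOTONIC) - wall0;
    return t;
}

/* Stand-in for a blocking call that sleeps while it waits (low CPU load). */
static void sleeping_wait(void *arg)
{
    struct timespec delay = { 0, 200 * 1000 * 1000 }; /* 200 ms */
    (void)arg;
    nanosleep(&delay, NULL);
}

/* Stand-in for a blocking call that busy-waits (high CPU load). */
static void spinning_wait(void *arg)
{
    double end = now_seconds(CLOCK_MONOTONIC) + 0.2;
    (void)arg;
    while (now_seconds(CLOCK_MONOTONIC) < end) {
        /* spin until 200 ms have passed */
    }
}
```

In a real measurement the callback passed to `measure` would wrap the enqueue plus clFinish (or clFlush) sequence. If `cpu_s` comes out close to `wall_s`, the driver is probably busy-waiting and the CPU is not being freed; if `cpu_s` is much smaller than `wall_s`, the CPU was mostly idle during the wait and could have been given other work, even though the host timer shows a long elapsed time.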