problem with clFinish


I have an application that runs OpenCL code, and I'm seeing some timing behavior I can't explain.

I have a function similar to the one below, which I call every few milliseconds. I measure how long each clFinish takes. Since there is a long gap between calls, I expected no delay, but every time the second clFinish (line 3) takes about 1 millisecond. I tried a few experiments: when I remove the second clFinish, the third one (line 4) shows the delay instead. I also tried adding another clFinish (on m_clRunKernelCommandQueue) at the end, after clEnqueueNDRangeKernel, but the delay on the second clFinish (line 3) remained.

Can someone help me figure out what the problem is?

void f()
{
1 clFinish(m_clWriteCommandQueue);
2 clEnqueueWriteBuffer(m_clWriteCommandQueue, …, CL_FALSE, …);
3 clFinish(m_clRunKernelCommandQueue);
4 clFinish(m_clReadCommandQueue);
5 clEnqueueReadBuffer(m_clReadCommandQueue, …, CL_FALSE, …);
6 clEnqueueNDRangeKernel(m_clRunKernelCommandQueue, …);
}


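An enqueued command might not be submitted to the device for execution until either a flush (clFlush) or a subsequent blocking call on the queue. This means the work enqueued in one call to f() may not actually start until a clFinish in the next call forces submission, in which case that clFinish waits for the work to run rather than returning immediately.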

General recommendation: don’t call clFinish() unless you can explain in detail exactly why you need it.