Better performance when compiling with /DDEBUG than without it

I’ve found that when I compile my OpenCL code with cl.exe and pass /DDEBUG, I get better performance than when I leave the flag out.
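As far as I understand, /DDEBUG only defines the preprocessor macro DEBUG and does not change the optimization level by itself, so any difference would have to come from code or headers that test for that macro. A quick sanity check along these lines shows which build-related macros a translation unit actually sees. This is a sketch, not taken from my project, and build_macros is a made-up name:

```cpp
#include <string>

// Hypothetical helper: lists the build-related macros that are defined
// in this translation unit, or "(none)" if none of them are.
std::string build_macros() {
    std::string s;
#ifdef DEBUG
    s += "DEBUG ";    // set by /DDEBUG on the cl.exe command line
#endif
#ifdef _DEBUG
    s += "_DEBUG ";   // set by MSVC when a debug runtime (/MTd, /MDd) is selected
#endif
#ifdef NDEBUG
    s += "NDEBUG ";   // conventionally set for release builds; disables assert()
#endif
    return s.empty() ? std::string("(none)") : s;
}
```

Printing the result of build_macros() from both builds makes it easy to confirm that DEBUG really is the only difference between them.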

I tried to find out why, and my testing points to the code that calls the OpenCL functions as the cause.

Specifically, two calls appear to be responsible. One is a call to clEnqueueNDRangeKernel(), which takes longer when DEBUG is not defined; the other is a call to clEnqueueAcquireGLObjects().

I’ve made sure to call clFinish() after both calls and before measuring the time.
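The measurement pattern described above can be sketched roughly as follows. This is an illustration under assumptions rather than my exact code: time_synchronized_ms is a made-up helper, and the two callables stand in for the enqueue call and for clFinish() on the command queue.

```cpp
#include <chrono>

// Hypothetical helper: measures how long `work` takes, in milliseconds,
// synchronizing before the clock starts and again before it stops.
template <typename Work, typename Sync>
double time_synchronized_ms(Work&& work, Sync&& sync) {
    using clock = std::chrono::steady_clock;
    sync();                            // drain anything already queued
    const auto start = clock::now();
    work();                            // the enqueue call being measured
    sync();                            // wait until the enqueued work has finished
    const auto stop = clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

With a real queue, sync would be a lambda that calls clFinish(queue) and work would wrap the clEnqueueNDRangeKernel() or clEnqueueAcquireGLObjects() call. The extra synchronization before the clock starts keeps earlier queued commands from being counted in the measured interval.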

Has anyone else run into a similar issue, or does anyone have an idea why this could be?

I’ve googled around, and most of the time the cause given for release builds being slower than debug builds is memory alignment issues or something similar.