Hi, I found there is almost no speed difference between -O3 and -O0 for my OpenCL code. Is this normal? Thanks!
Optimization flags for the CL compiler - the part of the runtime that actually compiles your kernel and could therefore affect kernel performance - are passed during the call to clBuildProgram(). I’m guessing that you’re referring to your command line or Makefile optimization flags?
Check out section 5.4.2 and the subsection on build options (optimization options) in the CL spec; they should clarify the situation. Note that there is a standard set of optimization flags a conforming implementation is required to provide, but individual vendors could add all kinds of goodies in there.
thanks! yes, I forgot that part…