I’m testing my application with NVIDIA’s OpenCL Visual Profiler, and I’m noticing that the memory buffer I allocate for the device using this code:
_cmDevBins = cl::Buffer(_context, CL_MEM_WRITE_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(cl_int) * ndX*64*ndY*64*ndZ*64, outBI, &err);
is being transferred twice for every kernel iteration: once when it is written to device memory, and once when it is read back from device memory.
Now, I noticed that the Apple guide (http://developer.apple.com/mac/library/ … penCL.html)
mentions that the two devices (host and GPU) are kept synchronised when using CL_MEM_USE_HOST_PTR. I assume this is also the case for CL_MEM_COPY_HOST_PTR, and that this is what I’m seeing here: the buffer is being synchronised automatically, which is why it is copied to the device before each kernel execution and copied back afterwards.
However, for my application I don’t want the devices to be synchronised. I only want the memory to be written to the GPU; I never want it to be read back. What flags do I need to set for this when creating the buffer?
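To make clear what my current flags actually request, here is a minimal sketch that decodes the bitfield. It is not real OpenCL code: the MEM_* constants are local stand-ins that mirror the CL_MEM_* bit values from CL/cl.h so that it compiles without the headers, and describe_flags is a hypothetical helper of my own. As far as I can tell from the spec, the access flags describe what the kernel may do with the buffer, not what the host does.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Local stand-ins (assumption: values mirror the CL_MEM_* bits in CL/cl.h).
using mem_flags = std::uint64_t;                  // stand-in for cl_mem_flags
constexpr mem_flags MEM_READ_WRITE     = 1u << 0; // CL_MEM_READ_WRITE
constexpr mem_flags MEM_WRITE_ONLY     = 1u << 1; // CL_MEM_WRITE_ONLY
constexpr mem_flags MEM_READ_ONLY      = 1u << 2; // CL_MEM_READ_ONLY
constexpr mem_flags MEM_USE_HOST_PTR   = 1u << 3; // CL_MEM_USE_HOST_PTR
constexpr mem_flags MEM_ALLOC_HOST_PTR = 1u << 4; // CL_MEM_ALLOC_HOST_PTR
constexpr mem_flags MEM_COPY_HOST_PTR  = 1u << 5; // CL_MEM_COPY_HOST_PTR

// Hypothetical helper: spell out what a flag combination asks for.
std::string describe_flags(mem_flags f) {
    // USE_HOST_PTR and COPY_HOST_PTR are mutually exclusive.
    assert(!((f & MEM_USE_HOST_PTR) && (f & MEM_COPY_HOST_PTR)));

    std::string s;
    // Access flags are from the kernel's point of view.
    if (f & MEM_WRITE_ONLY)     s += "kernel: write-only";
    else if (f & MEM_READ_ONLY) s += "kernel: read-only";
    else                        s += "kernel: read-write"; // default access

    if (f & MEM_USE_HOST_PTR)   s += "; host_ptr: used as storage";
    if (f & MEM_COPY_HOST_PTR)  s += "; host_ptr: copied at creation";
    if (f & MEM_ALLOC_HOST_PTR) s += "; storage: host-accessible allocation";
    return s;
}
```

For the call above, describe_flags(MEM_WRITE_ONLY | MEM_COPY_HOST_PTR) gives "kernel: write-only; host_ptr: copied at creation", so I am asking for a buffer that the kernel writes to, initialised from outBI.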
Thanks in advance.