This presentation says that the Map/Unmap APIs should be used with pinned memory:
https://on-demand.gputechconf.com/gtc/2016/presentation/s6382-karthik-ravi-Perf-considerations-for-OpenCL.pdf
So I have code like this to upload data to the GPU:
auto ptr = cmdQueue.enqueueMapBuffer(_gpuBuffer, CL_TRUE,
    CL_MAP_WRITE_INVALIDATE_REGION, 0, numBytes);
memcpy(ptr, _cpuData.data(), numBytes);
ThrowIfFailed(cmdQueue.enqueueUnmapMemObject(_gpuBuffer, ptr, waitEvents, cmdEvent));
My concern is the blocking call to enqueueMapBuffer. Does it create a sync point that waits for all commands enqueued up to this point to finish? Or will it still return fairly quickly (as soon as the mapped pointer can be returned)?