Using CL_MEM_USE_HOST_PTR

doru.popovici · April 16, 2014, 3:38pm

Hi! I have a small question regarding the usage of this flag. When I use it with clCreateBuffer, does that mean the kernel will write its data back to the host pointer when it finishes executing?

So if I create a buffer with clCreateBuffer and CL_MEM_USE_HOST_PTR, run a kernel on it, and have the host wait for the kernel to finish, will the data be in the host vector without my invoking a read on the buffer?

Thanks

utnapishtim · April 17, 2014, 3:07am

The host buffer is not necessarily up to date when your kernel ends, because its contents can be cached in device memory.

You have to use clEnqueueMapBuffer / clEnqueueUnmapMemObject to ensure that the host buffer is updated with the latest values.

doru.popovici · May 5, 2014, 3:19pm

Thanks for the information.

But what is the actual meaning of CL_MEM_USE_HOST_PTR? If I use it with clCreateBuffer, does that create a cl_mem object at the same address as the buffer I allocated on the host, so that the data is placed in the same location as the host-allocated memory? Or does it create a separate, new memory location where the buffer will reside?

The reason I am asking is that on Intel Haswell the GPU and the CPU are on the same die and share the L3 cache. So if the host buffer and the device buffer share the same address, my idea would be to build a mechanism where the two cooperate so that each can get the data it needs faster. I have to test it out, but it would be great if someone could give me pointers, or just tell me: hey stupid, stop wasting your time, because it does not work.

Thanks

Dithermaster · May 5, 2014, 4:11pm

Yes, this is possible. Please see:

and
https://software.intel.com/sites/products/documentation/ioclsdk/2013/OG/Sharing_Resources_Efficiently.htm
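For the shared-memory case to avoid a copy, the host allocation usually has to meet alignment constraints. Here is a minimal sketch in C, assuming the commonly cited Intel guidance for its integrated GPUs (base address aligned to 4096 bytes, size a multiple of 64 bytes); those two numbers and the helper name `alloc_shareable` are assumptions for illustration, so check the guide for your driver before relying on them.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed constraints (not taken from this thread): 4096-byte base
 * alignment and a size that is a multiple of 64 bytes. */
#define SHARE_ALIGN 4096u

/* Round n up to the next multiple of m (m > 0). */
static size_t round_up(size_t n, size_t m)
{
    return (n + m - 1) / m * m;
}

/* Hypothetical helper: allocate host memory suitable for passing to
 * clCreateBuffer with CL_MEM_USE_HOST_PTR. The padded size is returned
 * through padded_out; release the memory with free(). */
void *alloc_shareable(size_t bytes, size_t *padded_out)
{
    if (bytes == 0)
        return NULL;

    /* aligned_alloc (C11) wants the size to be a multiple of the
     * alignment; a multiple of 4096 is also a multiple of 64. */
    size_t padded = round_up(bytes, SHARE_ALIGN);
    void *p = aligned_alloc(SHARE_ALIGN, padded);

    if (p != NULL && padded_out != NULL)
        *padded_out = padded;
    return p;
}
```

The returned pointer and padded size would then be passed as the `host_ptr` and `size` arguments of clCreateBuffer. Whether the driver really shares the allocation instead of copying it is implementation-dependent.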

doru.popovici · May 5, 2014, 8:37pm

But then do I have to use clEnqueueMapBuffer to remap the buffer, or can I write directly into the array and expect the GPU to see it?

Thanks

Dithermaster · May 6, 2014, 5:53am

Yes, in OpenCL 1.x you must call clEnqueueMapBuffer before accessing the buffer on the host side, and clEnqueueUnmapMemObject before using it on the device side.

doru.popovici · May 6, 2014, 10:55am

Thanks for all the information. This has helped me a lot.

Thanks