Cache the data in global/local memory


There are two kernels (A and B) that are enqueued to the GPU in order.
One argument (C) is shared between A and B (enqueue A -> use data C -> enqueue B -> use data C).
I want data C to stay in global memory after A has brought it from the host to global memory and finished its job,
so that B can access C directly instead of bringing C from the host to the GPU again.

Could anyone help me solve this problem?


You should create C as a CL_MEM_READ_WRITE buffer (or CL_MEM_READ_ONLY, if possible) and fill it with clEnqueueWriteBuffer so that it is device resident (the standard does not require this, but it is how existing implementations work). Then pass the same cl_mem object as the argument to both A and B; the buffer stays valid until you release it, so B does not need a second upload. But if you need to update C on the host side often, you'd better stick with CL_MEM_USE_HOST_PTR. On many platforms this means your GPU has direct access to RAM and doesn't need to copy memory at all. And even if it does copy, your CL runtime most likely knows what you wish to accomplish.