Constant memory and buffer size in OpenCL


I am confused about constant memory in OpenCL. As far as I know, my current hardware supports 64 KB of constant memory, which I understood to mean that a kernel cannot allocate more than 64 KB of constant memory and may fail if it tries. However, the OpenCL documentation says that the maximum constant buffer size is 64 KB, which may imply that we can create multiple 64 KB constant buffers for one kernel.

To test this, I created a standalone application (Linux/OpenCL).

clGetDeviceInfo with CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE returned 65536 bytes.

I then created two read-only buffers:

// max_buffer = 65536
cl_mem buffer1 = clCreateBuffer(ctx, CL_MEM_READ_ONLY, max_buffer, NULL, NULL);
cl_mem buffer2 = clCreateBuffer(ctx, CL_MEM_READ_ONLY, max_buffer, NULL, NULL);

// set them as kernel args
err |= clSetKernelArg(kernel, 4, sizeof(cl_mem), &buffer1);
err |= clSetKernelArg(kernel, 5, sizeof(cl_mem), &buffer2);

// write some data into the buffers;
// buffer is a char array of max_buffer bytes, each filled with 1
clEnqueueWriteBuffer(cmdQ, buffer1, CL_TRUE, 0, max_buffer, buffer, 0, NULL, NULL);
clEnqueueWriteBuffer(cmdQ, buffer2, CL_TRUE, 0, max_buffer, buffer, 0, NULL, NULL);

// execute kernel

The kernel accepts these parameters as __constant char* buffers, adds the two, and writes the result to an output buffer.

The kernel ran fine and produced the expected output, accepting two 64 KB constant buffers.
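For reference, the "expected output" check can be sketched on the host in plain C. This is only an illustration: it assumes the kernel computes out[i] = a[i] + b[i] on char elements, and the function name first_mismatch is hypothetical, not something from the original application.

```c
#include <stddef.h>

/* Hypothetical host-side check, assuming the kernel computes
 * out[i] = a[i] + b[i] element by element on char data.
 * Returns the index of the first mismatching element,
 * or -1 if all n elements match. */
long first_mismatch(const char *a, const char *b, const char *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (out[i] != (char)(a[i] + b[i]))
            return (long)i;
    }
    return -1;
}
```

With both inputs filled with 1, as in the test above, every output element should be 2, so the function should return -1 when called on the data read back from the output buffer.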

My questions are:

  1. How does the kernel accept two 64 KB buffers as constant? Does this overflow to global memory? The OpenCL documentation says that 64 KB is the maximum buffer size, which is not the same as the maximum constant memory size.
  2. Is there any way to make sure that the memory is indeed constant?
  3. Or am I missing something or doing something wrong?

Thanks in advance

  1. As you noticed, CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE gives the max size of a const buffer, but you can declare several const buffers with a total size exceeding this limit. OpenCL abstracts the notion of const buffer. Const buffers are handled differently on different architectures.
    AMD GPUs have several true const buffers of 64K each. Const allocations are made in these buffers as long as there is memory; small const allocations can be coalesced in the same buffer. Once the hardware const buffers are filled, const allocations are made in global memory. They also have specific instructions to load data from uniform locations.
    NVIDIA GPUs allocate const memory in global memory but have one 64K const cache to accelerate reads of const data.

  2. No, there’s no method in OpenCL to tell whether a buffer is really in a const buffer, but as previously said, the notion of const buffer is not necessarily concrete on a given architecture and can be handled by hardware as a mere hint.
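
As a rough illustration of point 1: the limits a host program can query are per buffer (CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE) and per kernel argument count (CL_DEVICE_MAX_CONSTANT_ARGS), not a total size. The sketch below is a plain C helper that checks a set of __constant argument sizes against those two values. The helper name and its parameters are hypothetical and not part of the OpenCL API; the caller is assumed to have obtained the two limits from clGetDeviceInfo.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the OpenCL API.
 * sizes:           byte sizes of the buffers passed as __constant arguments
 * count:           number of such arguments
 * max_buffer_size: value reported for CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE
 * max_args:        value reported for CL_DEVICE_MAX_CONSTANT_ARGS
 * Returns 1 if every buffer fits the per-buffer limit and the argument
 * count fits the per-kernel limit, 0 otherwise. The sum of the sizes is
 * deliberately not checked, because the size limit applies per buffer. */
int constant_args_within_limits(const size_t *sizes, size_t count,
                                size_t max_buffer_size,
                                unsigned int max_args)
{
    if (count > max_args)
        return 0;
    for (size_t i = 0; i < count; ++i) {
        if (sizes[i] > max_buffer_size)
            return 0;
    }
    return 1;
}
```

With a 65536-byte limit and, as an assumed example, 8 allowed constant arguments, two 65536-byte buffers pass this check while a single 65537-byte buffer does not. Passing the check says nothing about where the data ends up physically; as noted above, that is left to the implementation.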