What's the deal with clEnqueueWriteBufferRect?

vincentfpgarcia · January 29, 2013, 5:08am

Hi,

I have a buffer (float*) that represents an image of let’s say 120x120 pixels.
I create on the device a buffer that represents an image of 100x100.
What I want to do is to take the center of the first image (host) to fill the device one.
clEnqueueWriteBufferRect seems to be the perfect solution…

Let’s have a look at the documentation of clEnqueueWriteBufferRect.


cl_int clEnqueueWriteBufferRect(
    cl_command_queue command_queue,
    cl_mem           buffer,
    cl_bool          blocking_write,
    const size_t     buffer_origin[3],
    const size_t     host_origin[3],
    const size_t     region[3],
    size_t           buffer_row_pitch,
    size_t           buffer_slice_pitch,
    size_t           host_row_pitch,
    size_t           host_slice_pitch,
    void            *ptr,
    cl_uint          num_events_in_wait_list,
    const cl_event  *event_wait_list,
    cl_event        *event)

I have no comments about command_queue, buffer, blocking_write, buffer_row_pitch, buffer_slice_pitch, host_row_pitch, host_slice_pitch, ptr, num_events_in_wait_list, event_wait_list, and event. Now, because the device buffer will be entirely filled, we must have:


    size_t buffer_origin[3] = {0, 0, 0};

Only two parameters remain: host_origin and region. Here is what the documentation says about them:

host_origin : The (x, y, z) offset in the memory region pointed to by ptr. For a 2D rectangle region, the z value given by host_origin[2] should be 0. The offset in bytes is computed as host_origin[2] * host_slice_pitch + host_origin[1] * host_row_pitch + host_origin[0].

region : The (width, height, depth) in bytes of the 2D or 3D rectangle being read or written. For a 2D rectangle copy, the depth value given by region[2] should be 1.

So, in my case, I should use:


size_t input_offset[3]  = {10, 10, 0};
size_t region[3]        = {100*sizeof(float), 100*sizeof(float), sizeof(float)};

Of course, it doesn’t work. Let’s focus on the input_offset parameter.
If we consider their formula, “the offset in bytes is computed as host_origin[2] * host_slice_pitch + host_origin[1] * host_row_pitch + host_origin[0]”. Since host_slice_pitch and host_row_pitch are given in bytes, host_origin[2] and host_origin[1] must be plain counts (slices and rows) and host_origin[0] must be in bytes! No? Otherwise the offset is wrong, or host_slice_pitch and host_row_pitch must not be given in bytes. Why are the parameters not consistent?
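To make the unit question concrete, here is a minimal sketch of the quoted formula in plain C. The helper name rect_offset_bytes is hypothetical (it is not an OpenCL function), and the assert is only a sanity check added for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of OpenCL: evaluates the offset formula
   quoted from the documentation. Pitches are in bytes. */
size_t rect_offset_bytes(const size_t origin[3],
                         size_t row_pitch, size_t slice_pitch)
{
    /* Sanity check for this sketch only: the x offset (in bytes)
       should fall inside one row. */
    assert(row_pitch > 0 && origin[0] < row_pitch);

    /* origin[2] counts slices, origin[1] counts rows, origin[0] is bytes;
       that is the only reading under which the sum is a byte offset. */
    return origin[2] * slice_pitch + origin[1] * row_pitch + origin[0];
}
```

For a 120x120 float image, the pixel at column 10, row 10 sits at element index 10 * 120 + 10, so the byte offset should be (10 * 120 + 10) * sizeof(float). The helper gives that value only when origin[0] is passed as 10 * sizeof(float) and origin[1] as 10.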

Now, the region parameter. I agree that region[0] must be in bytes so that we know how many bytes to copy per row.
However, region[1] and region[2] must be given as a “number of rows” and a “number of slices”; otherwise, how would we know how many lines to copy? Anyway, if region[1] and region[2] are given in bytes, the program crashes. Again, why are the parameters not consistent?

Based on these remarks, I now have:


size_t input_offset[3]  = {10*sizeof(float), 10, 0};
size_t region[3]        = {100*sizeof(float), 100, 1};

and it works perfectly.

So my question is: what am I doing wrong? And if I’m not doing anything wrong, don’t you think the documentation is wrong?

Thanks

clint3112 · January 29, 2013, 11:56pm

The first region definition is not correct, I think.

The doc tells you that for a 2D rectangle region[2] should be 1, but you passed sizeof(float), which is 4 on most systems.

vincentfpgarcia · January 30, 2013, 1:33am

You’re right, region[2] must be 1. However, this doesn’t answer my question: even with region[2]=1, the problem still remains.

utnapishtim · January 30, 2013, 1:51am

The specification states that:

“region defines the (width in bytes, height in rows, depth in slices) of the 2D or 3D rectangle”

so the definition:

size_t region[3] = {100*sizeof(float), 100, 1};

makes perfect sense.
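One way to see why these units are the natural ones is to emulate a 2D rectangular write on the host with a row-by-row memcpy. This is only a sketch of the semantics described in the specification text above, not how any OpenCL implementation actually works; copy_rect_2d is a hypothetical name, and both pitches are assumed to be given explicitly in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical host-only stand-in for a 2D rectangular write.
   origin[0] and region[0] are in bytes; origin[1] and region[1] count rows. */
void copy_rect_2d(void *dst, const size_t dst_origin[3], size_t dst_row_pitch,
                  const void *src, const size_t src_origin[3], size_t src_row_pitch,
                  const size_t region[3])
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    assert(region[2] == 1); /* this sketch handles the 2D case only */

    for (size_t row = 0; row < region[1]; ++row) {
        /* Each row starts at (origin row + row) * pitch + x offset in bytes. */
        const unsigned char *src_row =
            s + (src_origin[1] + row) * src_row_pitch + src_origin[0];
        unsigned char *dst_row =
            d + (dst_origin[1] + row) * dst_row_pitch + dst_origin[0];

        memcpy(dst_row, src_row, region[0]); /* region[0] bytes per row */
    }
}
```

With a 120x120 float source, a 100x100 float destination, a source origin of {10*sizeof(float), 10, 0}, a destination origin of {0, 0, 0} and region = {100*sizeof(float), 100, 1}, the loop runs 100 times and copies 100 floats each time, which is the center crop asked about in the first post.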

vincentfpgarcia · January 30, 2013, 1:55am

Yep, that’s it, you’re right. So my solution was correct after all. Maybe the online documentation should be updated. Thanks for your help.