Garbage output from an array of structs in OpenCL

I’m a total beginner with OpenCL and I’m trying to get the following kernel to work. I am passing an array of structures to the kernel and trying to modify its values. My structure declaration is:

#define LIST_SIZE 10
#pragma pack(push, 1)

typedef struct pairt {
    int a;
    int b;
} pairt;

#pragma pack(pop)

My host code to create the buffer and copy the array into it is:

pairt p[LIST_SIZE];
p_mem_obj = clCreateBuffer(context, CL_MEM_READ_WRITE, LIST_SIZE*sizeof(struct pairt), NULL, &ret);
ret = clEnqueueWriteBuffer(command_queue, p_mem_obj, CL_TRUE, 0, LIST_SIZE*sizeof(struct pairt), p, 0, NULL, NULL);

My code for setting the kernel argument, running the kernel, and reading the result back is:

ret = clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *)&p_mem_obj);
size_t global_item_size = LIST_SIZE;
size_t local_item_size = 2;
ret = clEnqueueNDRangeKernel(command_queue, kernel, 1, NULL, &global_item_size, &local_item_size, 0, NULL, &events[0]);
ret = clWaitForEvents(1, &events[0]);
ret = clReleaseEvent(events[0]);
ret = clEnqueueReadBuffer(command_queue, p_mem_obj, CL_TRUE, 0, LIST_SIZE*sizeof(struct pairt), p, 0, NULL, &events[1]);
ret = clWaitForEvents(1, &events[1]);

My kernel is:

struct __attribute__ ((packed)) pairt {
    int a;
    int b;
};

__kernel void simple_diff( __global struct pairt* p)
{
    int i = get_global_id(0);
    __global struct pairt *tmp = &p[i];

    tmp->a = tmp->a * -1;
    tmp->b = tmp->b * -1;

}

I initialized the array with the following values:

1 2
3 4
5 6
7 8
9 10
11 12
13 14
15 16
17 18
19 20

But the values returned by the kernel are:

-298660672 -32767
0 0
-4198172 0
-298660832 -32767
-4200052 0
-1 -2
-3 -4
-5 -6
-7 -8
-9 -10

I have no idea why this is happening.
P.S. I have omitted the error-checking code.

Just guessing here, but this could be an alignment issue. I think OpenCL requires 16-byte alignment, so try padding your struct with an extra 8 bytes (two more ints will do the trick).

When declaring the structure on the host side, you should use the cl_* types (cl_int in your case) to ensure that the members have the same size on the host and on the device.
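
A minimal sketch of what that could look like on the host, with compile-time checks on the layout. This is an illustration, not your exact code: the names pairt_host and pairt_buffer_bytes are made up, and int32_t stands in for cl_int (a 32-bit signed integer) only so the snippet compiles without the OpenCL headers. In real host code you would include the OpenCL header and write cl_int.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: int32_t stands in for cl_int so that this compiles without
   the OpenCL headers. In real host code, use cl_int from the OpenCL header. */
typedef struct pairt_host {
    int32_t a;
    int32_t b;
} pairt_host;

/* The kernel's packed struct of two 32-bit ints is 8 bytes, with a at
   offset 0 and b at offset 4. Fail the build if the host disagrees. */
_Static_assert(sizeof(pairt_host) == 8, "pairt_host must be 8 bytes");
_Static_assert(offsetof(pairt_host, a) == 0, "a must be at offset 0");
_Static_assert(offsetof(pairt_host, b) == 4, "b must be at offset 4");

/* Hypothetical helper: size in bytes of a buffer holding n elements. */
size_t pairt_buffer_bytes(size_t n)
{
    return n * sizeof(pairt_host);
}
```

With LIST_SIZE equal to 10, pairt_buffer_bytes(LIST_SIZE) gives 80, which is the size you would pass to clCreateBuffer, clEnqueueWriteBuffer and clEnqueueReadBuffer.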

If you suspect alignment problems, you can have the kernel write the byte offset of each member relative to the start of the buffer. Reading the buffer back then tells you how the ints are placed in device memory.


__kernel void simple_diff( __global struct pairt* p)
{
    int i = get_global_id(0);
    __global struct pairt *tmp = &p[i];

    /* Store each member's byte offset from the start of the buffer. */
    tmp->a = (int)(  (__global char*)&(tmp->a) - (__global char*)p  );
    tmp->b = (int)(  (__global char*)&(tmp->b) - (__global char*)p  );

}

Not sure if that is exactly right, but something along those lines.
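
On the host, the values read back from that kernel could then be compared against what a tightly packed layout would give: element i should hold a = 8*i and b = 8*i + 4. A rough sketch of such a check follows. The names pairt_readback and first_layout_mismatch are made up for illustration, and int32_t again stands in for cl_int so the snippet compiles without the OpenCL headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: int32_t stands in for cl_int (hypothetical host-side mirror
   of the kernel's struct). */
typedef struct pairt_readback {
    int32_t a;
    int32_t b;
} pairt_readback;

/* Returns the index of the first element whose stored offsets differ from
   a tightly packed layout (a at 8*i, b at 8*i + 4), or -1 if all match. */
int first_layout_mismatch(const pairt_readback *p, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        int32_t expect_a = (int32_t)(i * sizeof(pairt_readback));
        int32_t expect_b = expect_a + (int32_t)offsetof(pairt_readback, b);
        if (p[i].a != expect_a || p[i].b != expect_b)
            return (int)i;
    }
    return -1;
}
```

If this returns -1 for the array filled by clEnqueueReadBuffer, the device sees the same layout as the host, and the problem is more likely elsewhere, for example in how the array is initialized or in a call whose error code was not checked.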