Problem debugging kernel

hi,

I found a strange behaviour while debugging a kernel.
I use this

struct my_debIdx {
    int tableau[1];
};
// Single host-side counter, zero-initialised. It must stay alive as long as the CL buffer uses it.
static my_debIdx debIdx = {{0}};

to retrieve the number of times I set a special value in out[global index].x

I create the GPU buffer

debugIdx = cl::Buffer(gContext, CL_MEM_READ_WRITE|CL_MEM_USE_HOST_PTR, 1*sizeof(int), debIdx.tableau, NULL);

I pass the debug buffer to the kernel

__global int* __restrict__ debugIdx

and each time I set out[global index].x, I count it.

(*debugIdx)++;

Then I retrieve the information using

    gQueue.enqueueReadBuffer(debugIdx, CL_TRUE, 0, 1*sizeof(int), debIdx.tableau);
    LOGI(" debug0: indIdx value %5d \n",debIdx.tableau[0]);
    
    // reset the buffer to zero for another use.
    debIdx.tableau[0] = 0;
    gQueue.enqueueWriteBuffer(debugIdx, CL_TRUE, 0, 1*sizeof(int), debIdx.tableau);

I work with cl::NDRange(1024,1024), cl::NDRange(2,2), so the number of work-groups to be processed is (1024 / 2)^2 = 512^2 = 262144 work-groups.

When I work with a small buffer (36 * 36) everything is fine, but with (1024 * 1024) the count reported by the kernel is a lot smaller than what I get when I retrieve the data later.

This is what I get from the kernel:
debug0: indIdx value 1601
and this is what I retrieve from the kernel's output buffer once it is passed back to the CPU:
void Extraction_Point: buf.bufligne Rouge: 25467

So the kernel gives 1601, while extracting the value from the output buffer gives 25467, and that is the correct value, no doubt.

Maybe I made an error somewhere?

hi,
It looks like __global int* __restrict__ debugIdx is not shared across all the work-groups. I tried with printf and I find the same index number many times. Is that what we call a race condition?
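To illustrate the question, here is a minimal host-side C++ sketch (not the kernel itself) of how a plain `(*debugIdx)++` can lose counts when many work-items run at once. `std::thread` stands in for the work-items, and all function names are hypothetical. The increment is a read followed by a write, so two threads can read the same old value and one update is lost. In OpenCL C the usual counterpart of the second function is `atomic_inc(debugIdx)` on a `volatile __global int*`.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for a kernel launch: runs `body` on `num_threads`
// host threads, each calling it `per_thread` times.
template <typename Body>
void run_work_items(int num_threads, int per_thread, Body body) {
    assert(num_threads > 0 && per_thread >= 0);
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([=] {
            for (int i = 0; i < per_thread; ++i) body();
        });
    }
    for (auto& w : workers) w.join();
}

// Split read-modify-write, like a plain (*debugIdx)++ in the kernel:
// two threads can read the same old value, so increments may be lost.
int count_with_plain_increment(int num_threads, int per_thread) {
    // Atomic type only to keep this C++ free of undefined behaviour;
    // the load and the store are still two separate steps.
    std::atomic<int> counter{0};
    run_work_items(num_threads, per_thread, [&counter] {
        int seen = counter.load(std::memory_order_relaxed);
        counter.store(seen + 1, std::memory_order_relaxed);
    });
    return counter.load();
}

// Indivisible read-modify-write: no increment can be lost.
int count_with_atomic_increment(int num_threads, int per_thread) {
    std::atomic<int> counter{0};
    run_work_items(num_threads, per_thread, [&counter] {
        counter.fetch_add(1, std::memory_order_relaxed);
    });
    return counter.load();
}
```

With 8 threads and 10000 increments each, the atomic version always returns 80000. The plain version can return any smaller value, and the result changes from run to run. That matches the symptom of a kernel count far below the real one.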

I just did a new test. I changed the way of testing and I can now get all the info from each work-group, even from each thread inside a work-group.

I just changed the struct so that it has one entry per work-group, with a second dimension for the threads inside each work-group.
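A rough host-side C++ sketch of that idea, assuming a flat layout of `group * items_per_group + local` (the real struct layout is not shown in this thread, and the function name is hypothetical): every work-item writes only to its own slot, so no two threads touch the same memory, and the host adds the slots up afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// One host thread per "work-group"; each simulated work-item owns exactly one slot.
// Returns the total number of hits summed over all slots.
long long count_with_private_slots(int num_groups, int items_per_group, int hits_per_item) {
    assert(num_groups > 0 && items_per_group > 0 && hits_per_item >= 0);
    std::vector<int> slots(static_cast<std::size_t>(num_groups) * items_per_group, 0);

    std::vector<std::thread> groups;
    for (int g = 0; g < num_groups; ++g) {
        groups.emplace_back([&slots, g, items_per_group, hits_per_item] {
            for (int l = 0; l < items_per_group; ++l) {
                // Assumed layout: slot index = group * items_per_group + local id.
                int& mine = slots[static_cast<std::size_t>(g) * items_per_group + l];
                for (int h = 0; h < hits_per_item; ++h) ++mine;  // private slot, no race
            }
        });
    }
    for (auto& t : groups) t.join();

    // Host-side reduction after all "work-items" have finished.
    return std::accumulate(slots.begin(), slots.end(), 0LL);
}
```

For the NDRange(1024,1024) / NDRange(2,2) launch above, this layout would need 262144 * 4 = 1048576 int slots, about 4 MiB, which is the price paid for avoiding any shared counter.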

Incrementing a single shared index with a plain ++ does not work. I think that is normal: without an atomic operation it would need synchronisation between all the work-groups and threads.

I learned something ;))