Concurrent access to global memory

I’m lost on this issue… maybe it’s a trivial problem, but I have no clue how to solve it.
Btw, I’m new to OpenCL.

kernel void Update(global float* MeasuredProjection, global float* CurrentVolume, global float* numinatorBuffer, global float* denominatorBuffer)
{
    // HERE IS SOME RayResampleStuff
    // If a ray hits a voxel, the voxel index is stored in an array called VoxelOffsetList

    for (int j = 0; j < HitCount; j++)
    {
        int UpdateOffset = VoxelOffsetList[j];

        numinatorBuffer[UpdateOffset] += d_value; // numinatorBuffer is a global float*
        denominatorBuffer[UpdateOffset] += 1;     // denominatorBuffer is a global float*
    }
}

I’m getting some sort of artefacts…

Do I have to synchronize the access to the global buffers?
Is there another way to solve this problem?

Thanks in advance!

Problem solved… I implemented an atomic_add for float, but it’s very slow!
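
For readers who land here: core OpenCL 1.x has no built-in atomic add for float, so the usual workaround is a compare-and-swap loop on the value's bit pattern. A minimal sketch, assuming OpenCL 1.1 or later (where atomic_cmpxchg on 32-bit integers in global memory is a core feature). The helper name atomic_add_float is hypothetical and is not the poster's actual code:

```c
// Hypothetical helper: emulates an atomic float add with a compare-and-swap loop.
inline void atomic_add_float(volatile global float *addr, float val)
{
    union { unsigned int u; float f; } expected, desired;
    do {
        expected.f = *addr;             // snapshot the current value
        desired.f  = expected.f + val;  // value we would like to store
        // The store only succeeds if *addr still holds the snapshot;
        // if another work-item got in between, loop and try again.
    } while (atomic_cmpxchg((volatile global unsigned int *)addr,
                            expected.u, desired.u) != expected.u);
}
```

Inside the loop of the Update kernel this would be called as atomic_add_float(&numinatorBuffer[UpdateOffset], d_value). The retry loop is also a likely reason for the slowdown: when many rays hit the same voxel, their work-items keep invalidating each other's snapshots and have to repeat the update.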

Any suggestions for something faster?

Use an algorithm that doesn’t require global sync?

Unfortunately, you don’t give enough information to suggest more than that…
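
One possible reading of that suggestion is to turn the scatter (each ray adds into the voxels it hits) into a gather (each voxel sums its own contributions), so that every buffer element is written by exactly one work-item and no atomics are needed. A rough sketch, assuming the hits have already been grouped by voxel in an earlier pass. The kernel name GatherUpdate and the buffers voxelStart and hitValue are hypothetical and do not appear in the original code:

```c
// Hypothetical gather kernel: one work-item per voxel, so no two work-items
// ever write the same element of numinatorBuffer or denominatorBuffer.
kernel void GatherUpdate(global const int   *voxelStart,  // numVoxels + 1 offsets into hitValue
                         global const float *hitValue,    // d_value of every hit, grouped by voxel
                         global float *numinatorBuffer,
                         global float *denominatorBuffer,
                         const int numVoxels)
{
    const int voxel = (int)get_global_id(0);
    if (voxel >= numVoxels)
        return;

    // Range of hits that belong to this voxel.
    const int begin = voxelStart[voxel];
    const int end   = voxelStart[voxel + 1];

    // Accumulate privately, then write each buffer once.
    float sum = 0.0f;
    for (int i = begin; i < end; i++)
        sum += hitValue[i];

    numinatorBuffer[voxel]   += sum;
    denominatorBuffer[voxel] += (float)(end - begin);
}
```

The trade-off is that the per-voxel hit list has to be built and sorted first, which costs extra memory and an extra pass. Whether that beats the atomic version depends on how many rays typically share a voxel.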