ComputeShader atomicCounter Performance issue

Hey there,

In order to generate “pointer” addresses, I need to use an atomic counter inside my compute shader. Unfortunately, I measure a massive drop in performance when using atomicCounterIncrement in the shader:

subroutine(mipmapFunction)
void mipmapStepOne()
{
	// mipmapVolumeTex() does the texture reads and an imageStore once mipmapping is done
	if (mipmapVolumeTex())
	{
		// get the next pointer address
		uint atomicCount = atomicCounterIncrement(atomicCounter);
		imageStore(ptrTex, ivec3(gl_GlobalInvocationID), uvec4(atomicCount));
	}
}
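
For context, here is a minimal sketch of the declarations a snippet like this would need in order to compile. The binding points, the r32ui format, the writeonly qualifier, the subroutine uniform name mipmapStep, and the stub body of mipmapVolumeTex are all assumptions made for illustration and are not taken from the actual shader; the work-group size of 8 per axis follows from the invocation counts given in the post.

```glsl
#version 430

// 8 invocations per axis per work group (32 groups * 8 = 256 per axis).
layout(local_size_x = 8, local_size_y = 8, local_size_z = 8) in;

// Assumed bindings: atomic counter buffers and image units use separate
// binding namespaces, so both can sit at binding 0.
layout(binding = 0, offset = 0) uniform atomic_uint atomicCounter;
layout(binding = 0, r32ui) uniform writeonly uimage3D ptrTex;

// Subroutine type and the uniform that selects the active step
// (the name mipmapStep is hypothetical).
subroutine void mipmapFunction();
subroutine uniform mipmapFunction mipmapStep;

// Stub standing in for the real mipmapping routine, which does texture
// reads and an imageStore and reports whether a pointer is needed.
bool mipmapVolumeTex()
{
    return true;
}

subroutine(mipmapFunction)
void mipmapStepOne()
{
    if (mipmapVolumeTex())
    {
        // Every invocation that reaches this point increments the same counter.
        uint atomicCount = atomicCounterIncrement(atomicCounter);
        imageStore(ptrTex, ivec3(gl_GlobalInvocationID), uvec4(atomicCount));
    }
}

void main()
{
    mipmapStep();
}
```

Note that in this layout every increment targets one single counter, regardless of which work group the invocation belongs to.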

With a static value for ‘atomicCount’, my compute shader calls add up to 2.16 ms. With atomicCounterIncrement, the same calls now take 5 ms!

I’m running 256³ invocations (32³ work groups of 8³ invocations each); however, the increment is only executed by a rather small share of the invocations (~1%), so there shouldn’t be too many collisions.

Would be really grateful for any kind of advice! :)

The GPU is a GTX 780 running the latest drivers on Windows 8.1.