Having issues with critical section implementation

Hello,

Without stating further details about the problem at hand, I need to implement some kind of a critical section.

I have some experience with CUDA and a pattern equivalent to this used to work well:

[[loop]] while (true) {
    [[branch]] if (atomicCompSwap(lockRef, 0, 1) == 0) {
        // Critical section here... (placing it outside the 'if' statement would guarantee a deadlock if two threads from the same group attempted to use the same lock)
        atomicExchange(lockRef, 0); // release the lock
        break;
    }
}

lockRef corresponds to some 0-initialized buffer element and multiple threads may use the same one.
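
To make lockRef concrete, here is a minimal sketch of how the pattern above might sit in a complete compute shader. The buffer names (LockBuffer, ProtectedBuffer), the bindings, the workgroup size, and the mapping from invocation to lock are all assumptions made for illustration and are not taken from the original code.

```glsl
#version 450
#extension GL_EXT_control_flow_attributes : require  // enables [[loop]] and [[branch]]

layout(local_size_x = 64) in;  // hypothetical workgroup size

// Assumed: one zero-initialized int per lock (0 = free, 1 = held).
layout(std430, set = 0, binding = 0) coherent buffer LockBuffer {
    int locks[];
};

// Assumed: the data that the locks protect.
layout(std430, set = 0, binding = 1) coherent buffer ProtectedBuffer {
    uint values[];
};

void main() {
    // Hypothetical mapping; several invocations may land on the same lock.
    uint lockIndex = gl_GlobalInvocationID.x % uint(locks.length());

    [[loop]] while (true) {
        [[branch]] if (atomicCompSwap(locks[lockIndex], 0, 1) == 0) {
            values[lockIndex] += 1u;              // critical section
            memoryBarrierBuffer();                // order the write before the release
            atomicExchange(locks[lockIndex], 0);  // release the lock
            break;
        }
    }
}
```

This only shows where the lock lives and how it is addressed; it is the same spin loop as above and shares its behavior.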

Unfortunately, when I try to use this in GLSL and compile it to SPIR-V, I get deadlocks in a unit test with a compute shader.

Here are my observations so far:

  1. If I set group size to 1, no deadlock occurs (expected behavior even if the code was not divergent);
  2. Also no error occurs if I run only one thread group and give each thread the same lock address;
  3. Any other case results in a deadlock. I am especially surprised that using an individual lock per gl_LocalInvocationID with many workgroups also locks up…
  4. As far as I can tell, the problems are the same on NVIDIA and AMD.

Can anyone suggest what I am doing wrong? What is the fundamental difference between CUDA and SPIR-V that makes the two use cases different?

Update:
With a bit more experimentation, the following code started working:

bool notAcquired = true;
[[loop]] while (notAcquired) {
    [[branch]] if (atomicCompSwap(lockRef, 0, 1) == 0) {
        // Critical section here...
        atomicExchange(lockRef, 0); // release the lock
        notAcquired = false;        // leave the loop via the condition instead of 'break'
    }
}

I am more confused now… Is this just luck, or am I missing something?

Since you’re using Vulkan, I assume that lockRef is a work group shared variable. Well, Vulkan has exactly zero forward progress guarantees (within a work group). So there is no requirement that this works at all.

If you manage to make it work, that’s only by accident. It’s still undefined behavior.

@Alfonse_Reinheart, thanks for the response,

It’s not a groupshared variable; it is an element of a buffer binding. The name was an attempt to simplify, but I suspect I made it more confusing.

Anyway, since there are no forward-progress guarantees, is there any legitimate way to implement a critical section?

(My actual required use case is within a fragment shader with per-pixel locks. I have found an extension named VK_EXT_fragment_shader_interlock, and it is more or less what I actually need, but it does not seem to be supported on AMD)

That’s not possible. That answer talks about OpenGL, but the same is true of the Vulkan memory model too, for largely the same reasons.

Fragment shader interlock exists in part because the memory model has no other way of achieving that effect.
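
For reference, a minimal sketch of what a per-pixel critical section can look like with the interlock extension, written against GL_ARB_fragment_shader_interlock (which glslang can compile to SPIR-V for use with VK_EXT_fragment_shader_interlock). The image name perPixelData, its binding and format, and the read-modify-write done inside the section are assumptions chosen only for illustration.

```glsl
#version 450
#extension GL_ARB_fragment_shader_interlock : require

// Ordered, per-pixel interlock; pixel_interlock_unordered drops the ordering guarantee.
layout(pixel_interlock_ordered) in;

// Hypothetical per-pixel storage; binding and format are assumptions.
layout(set = 0, binding = 0, r32ui) coherent uniform uimage2D perPixelData;

layout(location = 0) out vec4 outColor;

void main() {
    ivec2 pixel = ivec2(gl_FragCoord.xy);

    // Both calls must appear in main(), once each, outside any flow control.
    beginInvocationInterlockARB();

    // Critical section: invocations covering the same pixel do not overlap here.
    uint previous = imageLoad(perPixelData, pixel).r;
    imageStore(perPixelData, pixel, uvec4(previous + 1u));

    endInvocationInterlockARB();

    outColor = vec4(0.0);
}
```

On the Vulkan side, this assumes the device exposes the extension and that the fragmentShaderPixelInterlock feature is enabled at device creation.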


Thank you once again. That explains a lot.

Unfortunately, I very much need my code to work on hardware from all vendors and until AMD decides to add support for interlock, it looks to me like I’m kind of screwed.

I’ll be using the modified code for now, since it seems to be working all right, even if it is technically unsafe. If AMD ever decides to support interlock, a new GLSL standard provides better guarantees, or some driver update starts crashing or locking up my application, I’ll switch to whatever is safer down the line.
