Hi everyone,
While reading “OpenGL Programming Guide, 9th Edition” (which covers up to OpenGL 4.5), I stumbled upon this compute shader example (Example 12.11):
layout (local_size_x = 1024) in;
// input and output images
// ...
// Shared memory
shared vec4 scanline[1024];
void main(void)
{
// Get the current position in the image.
ivec2 pos = ivec2(gl_GlobalInvocationID.xy);
// Read an input pixel and store it in the shared array
scanline[pos.x] = imageLoad(input_image, pos);
// Ensure that all other invocations have reached this point
// and written their shared data by calling barrier()
barrier();
vec4 result = scanline[min(pos.x + 1, 1023)] - scanline[max(pos.x - 1, 0)];
imageStore(output_image, pos.xy, result);
}
I am not sure about the comment
// Ensure that all other invocations have reached this point
// and written their shared data by calling barrier().
Does barrier() ensure that? In particular, I don’t understand whether barrier() should be used only to synchronize instruction execution across all the invocations within a work group, or whether it also makes memory operations visible to all the invocations within the work group (synchronizing execution + memory).
From the barrier() docs:
barrier provides a partially defined order of execution between shader invocations.
…
For any given static instance of barrier in a compute shader, all invocations within a single work group must enter it before any are allowed to continue beyond it. This ensures that values written by one invocation prior to a given static instance of barrier can be safely read by other invocations after their call to the same static instance of barrier.
“…values written by one invocation…”: written to what? All possible writes? Shared variables, imageStore, …?
Besides, if barrier() ensures that all invocations have written their shared data (so it’s safe to read), why and when should anyone use memoryBarrierShared()?
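To make the question concrete, here is a minimal sketch of the variant I have in mind: the same shader as Example 12.11, but with memoryBarrierShared() called right before barrier(). The image declarations (formats and binding points) are my own assumptions, since the book elides them, and the sketch assumes a single work group covers each row so that pos.x stays within the 1024-element array. Whether the extra call is required, redundant, or merely harmless is exactly what I am unsure about.

```glsl
#version 450 core

layout (local_size_x = 1024) in;

// Assumed declarations: the book elides these, so the formats and
// binding points below are placeholders, not the book's actual code.
layout (rgba32f, binding = 0) uniform readonly image2D input_image;
layout (rgba32f, binding = 1) uniform writeonly image2D output_image;

// Shared memory: one entry per invocation in the work group
shared vec4 scanline[1024];

void main(void)
{
    // Get the current position in the image.
    ivec2 pos = ivec2(gl_GlobalInvocationID.xy);

    // Read an input pixel and store it in the shared array.
    scanline[pos.x] = imageLoad(input_image, pos);

    // Memory barrier on shared variables (the call in question),
    // followed by the execution barrier from the original example.
    memoryBarrierShared();
    barrier();

    // Difference between the right and left neighbors, clamped at the edges.
    vec4 result = scanline[min(pos.x + 1, 1023)] - scanline[max(pos.x - 1, 0)];
    imageStore(output_image, pos, result);
}
```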
Regarding memoryBarrierShared(), from the reference:
In particular, any modifications made in one shader stage are guaranteed to be visible to accesses performed by shader invocations in subsequent stages when those invocations were triggered by the execution of the original shader invocation (e.g., fragment shader invocations for a primitive resulting from a particular geometry shader invocation).
Aren’t shared variables only available in compute shaders? How could subsequent stages access those?
Thanks in advance for the help!