What that means is “in parallel compared to not using GS instancing”.
Layered rendering, for the purpose the Wiki suggests, means taking the same primitive and sending it to multiple different layers of a layered framebuffer, probably using a different transformation matrix for each layer.
If you’re not using GS instancing, then your GS looks something like this:
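for (int layer_ix = 0; layer_ix < number_of_layers; ++layer_ix) {
    gl_Position = transform_vertex(layer_ix, vertex); // done for each vertex, followed by EmitVertex()
    gl_Layer = layer_ix;
}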
So each geometry shader invocation will generate number_of_layers primitives. This generation happens sequentially: each GS invocation computes the primitives for all of the layers, one after the other. Multiple GS invocations can of course be running at the same time, but each one will be outputting its own set of number_of_layers primitives.
This also means that each GS invocation needs a much larger buffer to store output primitive data into, since each invocation is writing a lot of primitive data.
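To make the non-instanced case concrete, here is a minimal sketch of a complete geometry shader along those lines. It assumes triangle input and 4 layers. The `layer_matrices` uniform and the body of `transform_vertex` are hypothetical stand-ins for however you actually transform per layer. Note how `max_vertices` has to cover every layer, which is where the larger output storage comes from.

```glsl
#version 400 core

// Assumption: 4 layers, matching the layered framebuffer attachment.
const int number_of_layers = 4;

layout(triangles) in;
// 3 vertices per triangle * 4 layers. Written as a literal because
// constant expressions in layout qualifiers need GLSL 4.40 or later.
layout(triangle_strip, max_vertices = 12) out;

// Hypothetical: one transformation matrix per layer.
uniform mat4 layer_matrices[number_of_layers];

// Hypothetical helper: transform one input vertex for one layer.
vec4 transform_vertex(int layer_ix, int vertex)
{
    return layer_matrices[layer_ix] * gl_in[vertex].gl_Position;
}

void main()
{
    // One invocation walks every layer in turn.
    for (int layer_ix = 0; layer_ix < number_of_layers; ++layer_ix)
    {
        for (int vertex = 0; vertex < 3; ++vertex)
        {
            gl_Position = transform_vertex(layer_ix, vertex);
            gl_Layer = layer_ix; // must be written before each EmitVertex()
            EmitVertex();
        }
        EndPrimitive();
    }
}
```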
By using GS instancing, your GS now can look like this:
layout(invocations = number_of_layers) in;
gl_Position = transform_vertex(gl_InvocationID, vertex);
gl_Layer = gl_InvocationID;
Each GS invocation only emits a single primitive. So multiple GS invocations can be working with the same input primitive and emit data for different output primitives for different layers in parallel.
This also means that each GS invocation is much shorter and doesn’t need nearly as much storage for its outputs (since it is only outputting a single primitive).
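For comparison, a minimal sketch of the instanced version as a complete shader, under the same assumptions (triangle input, 4 layers, a hypothetical `layer_matrices` uniform and `transform_vertex` helper). The layer loop is gone, and `max_vertices` drops to a single triangle's worth.

```glsl
#version 400 core

// The GS runs 4 times per input triangle, once per layer.
// Literal value: constant expressions here need GLSL 4.40 or later.
layout(triangles, invocations = 4) in;
// Each invocation emits only one triangle.
layout(triangle_strip, max_vertices = 3) out;

// Hypothetical: one transformation matrix per layer.
uniform mat4 layer_matrices[4];

// Hypothetical helper: transform one input vertex for one layer.
vec4 transform_vertex(int layer_ix, int vertex)
{
    return layer_matrices[layer_ix] * gl_in[vertex].gl_Position;
}

void main()
{
    // gl_InvocationID ranges over [0, invocations) and selects the layer.
    for (int vertex = 0; vertex < 3; ++vertex)
    {
        gl_Position = transform_vertex(gl_InvocationID, vertex);
        gl_Layer = gl_InvocationID; // must be written before each EmitVertex()
        EmitVertex();
    }
    EndPrimitive();
}
```

GS instancing needs OpenGL 4.0 or ARB_gpu_shader5, and the invocation count is capped by GL_MAX_GEOMETRY_SHADER_INVOCATIONS, so check that limit if you have many layers.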