[Help] How to lay out SSBO data in shader program?

Zpointer · January 9, 2016, 7:37pm

Hello, I am using an SSBO to hold my shape data for a real-time raytracer project that I am working on. I tested it first with uniforms and it rendered as expected, but I decided to switch to an SSBO since they can hold significantly more data. However, my scene renders differently and incorrectly with SSBOs, and I believe I am laying out the data in the SSBO wrong.

I am using basic “Whitted” raytracing to create reflections and, as I mentioned previously, I am fairly certain the raytracing logic is correct since I tested it with uniforms. I created a Gist: https://gist.github.com/zryan3/874d81cc733b01f6ea99 which hopefully explains most of the important parts of my code. I am using 2 shader programs: Program A contains a vertex and fragment shader and simply renders a 2D quad and textures it; Program B is a compute shader that uses imageStore to write to the texture, owns the SSBO, and does all the raytracing.

I have read that SSBOs cannot hold vec3 data types and must use vec4, but I have seen and tested code that uses vec4 in an SSBO which works. For example this code: https://github.com/zryan3/glslcookbook/blob/master/chapter10/shader/cloth.cs. I also changed my vec3s to vec4s and it did not fix the issue.

Lastly, just for a sanity check. I used the glMapBufferRange function to get the data after I sent it to the SSBO and it appears to match the data I have in my C++ program.

GClements · January 9, 2016, 9:35pm

It’s not that they can’t use vec3, it’s that it makes calculating the layout a bit more complicated; unlike the other types, the required alignment of a vec3 is larger than its size.

std430 requires a vec3 to be aligned the same as a vec4 (i.e. 16 bytes), while an equivalent C structure will typically only be aligned to a 4-byte boundary.

But this issue doesn’t only affect the use of vec3, it also applies if you mix vec4 with vec2 or scalars, or vec2 with scalars. In short, GLSL has stronger alignment requirements than C, meaning that a GLSL structure is more likely to contain padding than an equivalent C structure.

The simplest option is to only use vec4, vec2 and float, and to arrange the members in order of decreasing size (i.e. vec4 -> vec2 -> scalar). If you’re creating arrays of structures, you should also add dummy scalars to the end of the structure in order to make its size a multiple of a vec4. This ensures that there will be no padding anywhere within the structure, either in the GLSL or C versions. If you don’t want to do that, then you need to add dummy members within the C structure to force its members to be aligned according to std430.
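A minimal C++ sketch of that ordering rule, assuming a 4-byte float and a compiler that adds no padding between float members. The `Particle` struct, its member names and the `Vec4`/`Vec2` helper types are hypothetical and are not taken from the Gist; they stand in for whatever types the host code actually uses.

```cpp
#include <cstddef>  // offsetof

// Hypothetical host-side vector types (plain floats, 4-byte alignment in C++).
struct Vec4 { float x, y, z, w; };
struct Vec2 { float x, y; };

// Matching GLSL declaration (std430), members in decreasing size:
//   struct Particle { vec4 position; vec4 color; vec2 uv; float mass; float pad0; };
struct Particle {
    Vec4  position;  // byte offset  0
    Vec4  color;     // byte offset 16
    Vec2  uv;        // byte offset 32 (a multiple of 8, as vec2 requires)
    float mass;      // byte offset 40
    float pad0;      // byte offset 44, dummy scalar so the size is a multiple of a vec4
};

// Compile-time checks that the C++ layout has no hidden padding.
static_assert(offsetof(Particle, color) == 16, "color must start at byte 16");
static_assert(offsetof(Particle, uv) == 32, "uv must start at byte 32");
static_assert(offsetof(Particle, mass) == 40, "mass must start at byte 40");
static_assert(sizeof(Particle) == 48, "size must be a multiple of 16 bytes");
```

Because every member already sits on a multiple of its own size, neither side needs padding, and an array of `Particle` can be copied into the buffer as-is.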

[QUOTE=Zpointer;1281069]
Lastly, just for a sanity check. I used the glMapBufferRange function to get the data after I sent it to the SSBO and it appears to match the data I have in my C++ program.[/QUOTE]
But it probably doesn’t match what GLSL is expecting. Here is your structure layout:


Member               C   GLSL

vec3 center;          0    0
float radius;         3    3
float radius2;        4    4
vec3 ambient;         5    8
vec3 diffuse;         8   12
vec3 specular;       11   16
float shininess;     14   19
float reflectivity;  15   20
(total size)         16   24

The “C” column is the likely offset of the field within a C/C++ structure, measured in multiples of sizeof(float) (i.e. multiply by 4 for bytes). The “GLSL” column is the offset according to std430. Writing N for the size of one scalar (4 bytes), each vec3 other than the first has between 1 and 3 scalars of padding before it in order to satisfy the 4N alignment constraint. Additionally, there is another 3N of padding at the end of the structure to ensure that any following structure is correctly aligned. Altogether, the GLSL structure is 8N (32 bytes) larger than the equivalent C structure.
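One way to make the C++ side agree with the GLSL column is to spell the padding out explicitly. The following is only a sketch, assuming a 4-byte float; the `Vec3` helper, the `SphereStd430` name and the `pad*` members are made up for illustration, while the real member names and offsets come from the listing above.

```cpp
#include <cstddef>  // offsetof

// Hypothetical host-side vec3: three floats, 12 bytes, 4-byte alignment in C++.
struct Vec3 { float x, y, z; };

// Offsets in the comments are in floats (N) and follow the GLSL column.
struct SphereStd430 {
    Vec3  center;        //  0
    float radius;        //  3
    float radius2;       //  4
    float pad0[3];       //  5..7  padding so ambient starts on a multiple of 4
    Vec3  ambient;       //  8
    float pad1;          // 11
    Vec3  diffuse;       // 12
    float pad2;          // 15
    Vec3  specular;      // 16
    float shininess;     // 19
    float reflectivity;  // 20
    float pad3[3];       // 21..23 tail padding, total size 24
};

// Compile-time checks against the std430 offsets.
static_assert(offsetof(SphereStd430, ambient) == 8 * sizeof(float), "ambient at 8N");
static_assert(offsetof(SphereStd430, diffuse) == 12 * sizeof(float), "diffuse at 12N");
static_assert(offsetof(SphereStd430, specular) == 16 * sizeof(float), "specular at 16N");
static_assert(offsetof(SphereStd430, reflectivity) == 20 * sizeof(float), "reflectivity at 20N");
static_assert(sizeof(SphereStd430) == 24 * sizeof(float), "total size 24N");
```

The GLSL declaration stays unchanged; only the host structure grows by the 8 padding floats so that an array of it can be uploaded directly.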

Zpointer · January 10, 2016, 10:16am

Ahhhh that fixed it… Thanks for the detailed response. This was my first voyage into SSBOs (and also uniform buffer packing) so I think I have a much better sense for the next trip. I read the Superbible (latest edition) but I do not remember seeing decreasing size mentioned (although I was aware of the packing issues mentioned).

GClements · January 10, 2016, 12:39pm

Well, decreasing size isn’t necessary, it’s just a simple way to ensure that padding isn’t needed (provided that no vec3’s are involved).

The required alignment for a structure is the largest alignment of any of its members, so the start of a structure (offset zero) is correctly aligned for any type. The alignment for vec4, vec2 and scalars is the same as their size, so the location after such a type automatically has the correct alignment for another instance of the type, or for any smaller type. vec3 is the exception; it’s required to be aligned to 4N but is only 3N in size, so if it’s followed by anything other than a scalar, 1N padding must be added.
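These rules can be turned into a small offset calculator. The sketch below assumes C++14 or later and only covers scalars, vec2, vec3 and vec4 members (no arrays, matrices or nested structures); all names in it are hypothetical. Sizes and alignments are expressed in units of N, the size of one float.

```cpp
#include <cstddef>

// Size and required alignment of one member, in units of N.
struct Member {
    std::size_t size;
    std::size_t align;
};

constexpr Member kScalar{1, 1};
constexpr Member kVec2{2, 2};
constexpr Member kVec3{3, 4};  // the exception: alignment is larger than size
constexpr Member kVec4{4, 4};

// Round offset up to the next multiple of align.
constexpr std::size_t alignUp(std::size_t offset, std::size_t align) {
    return (offset + align - 1) / align * align;
}

// Total std430 size of a structure, including tail padding up to the
// largest member alignment.
template <std::size_t Count>
constexpr std::size_t std430StructSize(const Member (&members)[Count]) {
    std::size_t offset = 0;
    std::size_t maxAlign = 1;
    for (std::size_t i = 0; i < Count; ++i) {
        offset = alignUp(offset, members[i].align) + members[i].size;
        if (members[i].align > maxAlign) maxAlign = members[i].align;
    }
    return alignUp(offset, maxAlign);
}

// The sphere structure from earlier in the thread: 24N in total.
constexpr Member kSphere[] = {kVec3, kScalar, kScalar, kVec3,
                              kVec3, kVec3,   kScalar, kScalar};
static_assert(std430StructSize(kSphere) == 24, "matches the GLSL column");
```

Feeding it a vec4 -> vec2 -> scalar -> scalar ordering gives 8N with no internal padding, which is the decreasing-size case described above.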

Zpointer · January 10, 2016, 2:02pm

[QUOTE=GClements;1281079]Well, decreasing size isn’t necessary, it’s just a simple way to ensure that padding isn’t needed (provided that no vec3’s are involved).

The required alignment for a structure is the largest alignment of any of its members, so the start of a structure (offset zero) is correctly aligned for any type…[/QUOTE]

OHHH, now I get it. OK that makes sense. Wow.