SSBO alignment question

nickyc95 · March 14, 2017, 5:26am

Hi,

I am trying to implement SSBO (Shader Storage Buffer Object) support into my application.

When passing data between C++ and the GL buffer, what is the alignment for the data?

For example if I use a struct like the one below, then everything is fine

struct Data
{
vec4 colour;
vec4 position;
};

However if I add data to the end like so:

struct Data
{
vec4 colour;
vec4 position;
uint type;
};

then the data doesn’t get passed over to the OpenGL side correctly until I add additional padding (but only to the C++ struct, not the GL struct).

So… Do the structs need to be aligned to the highest alignment in the struct or what?

As a side note, I would like to simply memcpy the data from the c++ side to the GL data side, which I cannot currently do…
If there is a better way please let me know

Thanks

Nick

GClements · March 14, 2017, 11:37am

It depends upon the layout format (shared, packed, std140, or std430) you’re using. For shared (the default) and packed, you have to query the offsets (as well as array and matrix strides) at run time. For packed, you have to query these values separately for each program, even if they use the same structure, as the compiler is free to optimise the layout (e.g. removing unused variables). For shared, the layout will be the same for all programs using the same structure (so you can use a single SSBO or UBO with multiple programs), but is otherwise undefined.

If you want the layout to be fixed (so that you can e.g. memcpy() between an array of C structs and a buffer), you need to use std140 or std430. std430 has slightly looser alignment constraints than std140, so it typically uses less padding. The exact layout of these formats is described in §7.6.2.2 of the OpenGL 4.5 specification.
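As a rough illustration of the fixed-layout route, the sketch below mirrors the Data struct from the first post on the C++ side. The Vec4 type, the alignas(16) choice, and the uploadToStaging / readType helpers are assumptions made for this example, and a std::vector of bytes stands in for the pointer a real program would get from mapping the GL buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a 16-byte vec4 (not a type from any GL header).
struct Vec4 { float x, y, z, w; };

// Mirrors the GLSL struct { vec4 colour; vec4 position; uint type; }.
// alignas(16) makes the compiler round sizeof(Data) up to a multiple of 16,
// which matches the array stride std140 and std430 use for this struct.
struct alignas(16) Data {
    Vec4 colour;
    Vec4 position;
    std::uint32_t type;
};

static_assert(offsetof(Data, colour) == 0, "colour expected at offset 0");
static_assert(offsetof(Data, position) == 16, "position expected at offset 16");
static_assert(offsetof(Data, type) == 32, "type expected at offset 32");
static_assert(sizeof(Data) == 48, "array stride expected to be 48 bytes");

// Copies the whole array with a single memcpy into a byte buffer that
// stands in for the mapped shader storage buffer.
std::vector<unsigned char> uploadToStaging(const std::vector<Data>& items) {
    std::vector<unsigned char> staging(items.size() * sizeof(Data));
    if (!items.empty()) {
        std::memcpy(staging.data(), items.data(), staging.size());
    }
    return staging;
}

// Reads the `type` field of element `index` back out of the byte buffer,
// using the same stride and offset the shader would use.
std::uint32_t readType(const std::vector<unsigned char>& staging, std::size_t index) {
    const std::size_t offset = index * sizeof(Data) + offsetof(Data, type);
    assert(offset + sizeof(std::uint32_t) <= staging.size());
    std::uint32_t value = 0;
    std::memcpy(&value, staging.data() + offset, sizeof(value));
    return value;
}
```

If the static_asserts hold on your compiler, the C++ array and the std140/std430 array of this struct should agree byte for byte, so one memcpy covers the whole array.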

nickyc95 · March 14, 2017, 1:17pm

Hi,

I have tried with both std140 and std430.

With both of these I have found that I need to add additional padding to the C struct in order for GL to register the second element in the array.

If I use a C struct and a matching GL struct, like so:

struct BufferData
{
vec4 position;
vec4 direction;
uint type;
};

then use a SSBO as:

layout(std430, binding = 0) readonly buffer BufferedData
{
BufferData data[];
}bufferedData;

With the above, I have found that the second element in bufferedData is not read correctly / doesn’t show up…
If I replace the uint with a vec4, then it works fine, but that requires manual padding and doesn’t allow me to simply memcpy.

Thanks

john_connor · March 14, 2017, 9:48pm

you have this struct in your shader defined:

struct BufferData {
vec4 position;
vec4 direction;
uint type;
};

and your shader storage block looks like that:


layout (std140, binding = 0) buffer MyBufferDataBlock {
BufferData bufferdata[];
};

then you have to make (according to rule 9 in the specs) the corresponding cpp struct’s size rounded up to a multiple of a vec4 (= 16 bytes), that means you have to pad the struct like that:

struct BufferData {
vec4 position;
vec4 direction;
unsigned int type;
float padding[3]; // explicit tail padding (names containing a double underscore are reserved in C++)
};

the “float padding[3];” together with the “unsigned int type;” are 16 bytes in size

i’m not sure about “layout (std430 …)”, but i think you can skip the padding in this case

EDIT: nope, you also have to do the padding in this case
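To see why the second array element goes missing without the padding, it may help to compare strides on the C++ side. This is only a sketch: Vec4 is a hypothetical stand-in for four floats, the names Unpadded, Padded and elementOffset are made up for the comparison, and the sizes assume 4-byte float and unsigned int.

```cpp
#include <cstddef>

// Hypothetical stand-in for a 16-byte vec4.
struct Vec4 { float x, y, z, w; };

// What you get if the C++ struct simply copies the GLSL member list.
struct Unpadded {
    Vec4 position;
    Vec4 direction;
    unsigned int type;
};

// The same members plus 12 bytes of explicit tail padding.
struct Padded {
    Vec4 position;
    Vec4 direction;
    unsigned int type;
    float padding[3];
};

// Byte offset at which element `index` of a plain C++ array of T starts.
template <typename T>
constexpr std::size_t elementOffset(std::size_t index) {
    return index * sizeof(T);
}

// The shader reads element 1 at byte 48. Without padding, C++ writes it at
// byte 36, so everything after element 0 is shifted.
static_assert(sizeof(Unpadded) == 36, "unpadded struct is 36 bytes");
static_assert(sizeof(Padded) == 48, "padded struct is 48 bytes");
static_assert(elementOffset<Unpadded>(1) == 36, "element 1 starts too early");
static_assert(elementOffset<Padded>(1) == 48, "element 1 matches the GLSL stride");
```

Element 0 looks fine in both cases because its members start at the same offsets; the mismatch only shows from element 1 onwards, which matches what was described above.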

nickyc95 · March 15, 2017, 2:07am

Haha thanks man.

This is the same conclusion that I have come to.

I thought that using std430 would give me tight packing that wouldn’t require me to manually pad it.
However I found this is not the case (as you did).

Do you know if this is documented in the specification?

EDIT: Also, do you know whether adding the padding to the C struct will allow me to memcpy into the buffer?

Cheers
Nick

arekkusu · March 15, 2017, 2:35am

Did you read it?

GL4.5:

If the member is a structure, the base alignment of the structure is N, where N is the largest base alignment value of any of its members, and rounded up to the base alignment of a vec4. The individual members of this sub-structure are then assigned offsets by applying this set of rules recursively, where the base offset of the first member of the sub-structure is equal to the aligned offset of the structure. The structure may have padding at the end; the base offset of the member following the sub-structure is rounded up to the next multiple of the base alignment of the structure.

…

When using the std430 storage layout, shader storage blocks will be laid out in buffer storage identically to uniform and shader storage blocks using the std140 layout, except that the base alignment and stride of arrays of scalars and vectors in rule 4 and of structures in rule 9 are not rounded up a multiple of the base alignment of a vec4.

So: with std430, the rounding up to vec4 alignment doesn’t happen. BUT the largest member in your structure is a vec4, so the rounding doesn’t matter, the whole structure is always aligned to vec4 anyway.
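The arithmetic behind that can be sketched in a few lines. This is a simplified reading of rule 9 that only handles structs whose members are scalars or vectors (no nested structs, arrays or matrices); Member, roundUp and structStride are names invented for the example, and the sizes are the usual byte sizes (uint 4, vec4 16).

```cpp
#include <cstddef>
#include <initializer_list>

// Size and base alignment of one scalar or vector member, in bytes.
struct Member {
    std::size_t size;
    std::size_t align;
};

// Rounds `value` up to the next multiple of `multiple`.
constexpr std::size_t roundUp(std::size_t value, std::size_t multiple) {
    return (value + multiple - 1) / multiple * multiple;
}

// Array stride of a struct: place each member at its aligned offset, take
// the largest member alignment as the struct alignment, round that up to
// 16 (vec4) only for std140, then pad the size to the struct alignment.
constexpr std::size_t structStride(std::initializer_list<Member> members, bool std140) {
    std::size_t end = 0;    // offset just past the last member placed
    std::size_t align = 1;  // largest base alignment seen so far
    for (const Member& m : members) {
        end = roundUp(end, m.align) + m.size;
        if (m.align > align) {
            align = m.align;
        }
    }
    if (std140) {
        align = roundUp(align, 16);
    }
    return roundUp(end, align);
}

// { vec4; vec4; uint; }: the vec4 already forces 16-byte alignment, so both
// layouts end up with a 48-byte stride.
static_assert(structStride({{16, 16}, {16, 16}, {4, 4}}, true) == 48, "std140");
static_assert(structStride({{16, 16}, {16, 16}, {4, 4}}, false) == 48, "std430");

// { float; uint; }: here the vec4 rounding is the only difference.
static_assert(structStride({{4, 4}, {4, 4}}, true) == 16, "std140");
static_assert(structStride({{4, 4}, {4, 4}}, false) == 8, "std430");
```

So std430 only saves padding when no member of the struct has vec4 alignment, which is not the case for the struct in this thread.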

nickyc95 · March 15, 2017, 2:47am

[QUOTE=arekkusu;1286369]Did you read it?

So: with std430, the rounding up to vec4 alignment doesn’t happen. BUT the largest member in your structure is a vec4, so the rounding doesn’t matter, the whole structure is always aligned to vec4 anyway.[/QUOTE]

EDIT: Nevermind.

Figured it out.

Thanks