SSBO std430 layout rules

I am reading the OpenGL Programming Guide, Version 4.5 with SPIR-V (Kindle edition). Its Table H.2, std430 layout rules, states for:

“Three-component vectors (e.g., vec3) and Four-component vectors (e.g., vec4)”

“Both the size and alignment are four times the size of the underlying scalar type. However, this is true only when the member is not a part of an array or nested structure.”

So far I am programming OpenGL with GLFW and Python 3.8. I understand that an SSBO needs to be created in much the same way as a VBO or EBO. I have a drawing system where I would like to put vec4s into the SSBO from a GLM array (of float32 values, I guess). I have to admit that, after reading the std430 table, I don’t really understand what it means. What does that description actually mean in terms of loading a GLM array of vec4s into the buffer? Is there some kind of packing to reconsider, compared with just buffering a GLM float32 array like I normally do for a VBO in Python?

An array of vec4 has the same layout as float[][4] in C: packed 32-bit floats with no padding. An array of vec3 has essentially the same layout (every fourth float is unused). This is true for both std140 and std430. They differ in that std140 also pads the elements of float and vec2 arrays out to the size of a vec4, while std430 doesn’t.
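To make the array case concrete, here is a minimal sketch using only Python's standard struct module. The helper names pack_vec4_array and pack_vec3_array are made up for illustration, and plain tuples stand in for GLM vectors; native byte order is assumed to match what the GPU expects.

```python
import struct

def pack_vec4_array(vectors):
    """Pack 4-float tuples as tightly packed float32 values (16 bytes each)."""
    flat = [component for vec in vectors for component in vec]
    # "=" means native byte order with standard sizes and no implicit padding.
    return struct.pack(f"={len(flat)}f", *flat)

def pack_vec3_array(vectors):
    """Pack 3-float tuples with a 16-byte stride; the 4th float is padding."""
    # "3f4x" is three floats followed by four zero pad bytes.
    return b"".join(struct.pack("=3f4x", *vec) for vec in vectors)

vec4_bytes = pack_vec4_array([(1.0, 2.0, 3.0, 4.0), (5.0, 6.0, 7.0, 8.0)])
vec3_bytes = pack_vec3_array([(1.0, 2.0, 3.0), (5.0, 6.0, 7.0)])

# Both arrays take 16 bytes per element.
print(len(vec4_bytes), len(vec3_bytes))
```

So for an array of vec4 nothing changes compared with a VBO: the bytes can be uploaded as they are. Only vec3 data needs the extra padding float per element.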

In structures, vec3 and vec4 are aligned to 4-word (16-byte) boundaries, but std430 lets you pack a vec3 followed by a float or a pair of vec2s into a 16-byte block.
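As a rough illustration of that packing, the sketch below fills one 32-byte struct from Python. The GLSL struct Item and its field names are hypothetical, invented only for this example.

```python
import struct

# Hypothetical GLSL struct (std430), used only for illustration:
#   struct Item { vec3 position; float radius; vec2 uv0; vec2 uv1; };
# position occupies bytes 0..11, radius fills the 4th word at offset 12,
# and the two vec2s share the second 16-byte block at offsets 16 and 24.
ITEM = struct.Struct("=3f f 2f 2f")

def pack_item(position, radius, uv0, uv1):
    """Return the 32 bytes for one Item, with no padding bytes needed."""
    return ITEM.pack(*position, radius, *uv0, *uv1)

item_bytes = pack_item((1.0, 2.0, 3.0), 0.5, (0.0, 0.0), (1.0, 1.0))
print(ITEM.size)  # 8 floats of 4 bytes each: 32
```

If radius were dropped from the struct, the fourth word would have to be written as explicit padding (for example "3f4x" in the format string) so that uv0 still starts at offset 16.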

Essentially, std140 was designed for hardware where a vec4 was the smallest addressable unit; anything else was just a vec4 with unused fields. std430 relaxes those restrictions, allowing objects to be aligned to 1-word, 2-word or 4-word boundaries (but this still means that vec3s can’t be packed; the fourth word must be padding or a single float field).
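The alignment rules described above can be sketched as a small offset calculator. This is a simplified model, not a full implementation of the layout rules: it only covers float, vec2, vec3 and vec4 members of 32-bit floats, and the names BASE, member_offsets and array_stride are made up for the example.

```python
# (size, alignment) in bytes; these base values are the same in std140 and std430.
BASE = {"float": (4, 4), "vec2": (8, 8), "vec3": (12, 16), "vec4": (16, 16)}

def round_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

def member_offsets(members):
    """Offsets and total size of a std430 struct of non-array members.

    members is a list of (name, type_name) pairs in declaration order.
    """
    offsets = {}
    cursor = 0
    max_align = 1
    for name, type_name in members:
        size, align = BASE[type_name]
        cursor = round_up(cursor, align)  # move to the member's boundary
        offsets[name] = cursor
        cursor += size
        max_align = max(max_align, align)
    # The struct size is padded to the largest member alignment.
    return offsets, round_up(cursor, max_align)

def array_stride(type_name, layout):
    """Byte distance between consecutive array elements."""
    size, align = BASE[type_name]
    stride = round_up(size, align)
    if layout == "std140":
        stride = round_up(stride, 16)  # std140 pads every element to a vec4
    return stride

print(member_offsets([("a", "vec3"), ("b", "float")]))
print(member_offsets([("a", "vec3"), ("b", "vec3")]))
print(array_stride("float", "std140"), array_stride("float", "std430"))
```

The first call places the float in the vec3's unused fourth word, while the second shows that two vec3s in a row cannot share a 16-byte block.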

Related to std140

Have noticed that OpenGL GLSL support for std430 packing in Uniform Buffer Objects (UBOs) is finally available, for NVIDIA at least:

  • GL_NV_uniform_buffer_std430_layout

It’s definitely in NVIDIA’s recent drivers, but I haven’t seen this new extension show up in the OpenGL Registry yet.

FWIW, Vulkan GLSL has had std430 support for UBOs since ~2015, but OpenGL GLSL has been stuck with std140 for UBOs until recently.

Related:

Speaking of GL_NV_uniform_buffer_std430_layout.

Since driver release 22.7.1, AMD exposes the existing GL_EXT_scalar_block_layout extension in OpenGL GLSL which also allows using std430 on uniform blocks.
So both NVIDIA and AMD have a way to do it now!

Knowing GL_EXT_scalar_block_layout, you might be tempted to try applying the even more useful scalar layout. However, attempting to do so gives me the following compile error:

ERROR: 0:140: 'scalar' : only allowed when using GLSL for Vulkan

Here is a small thread regarding that. Sad that we STILL don’t have scalar in OpenGL.

Interesting! Didn’t know about that one.

Like the NVIDIA extension, it’s not on the registry yet.

Plenty of matches for this on the Vulkan side.

But oddly enough, no GL reports alleging GL_EXT_scalar_block_layout support.

Maybe you could upload one 🙂

GL_EXT_scalar_block_layout is technically for Vulkan-GLSL only (requires GL_KHR_vulkan_glsl), so at least in its current state I don’t think it will ever be added to the OpenGL registry.

I have also noticed that GL_NV_uniform_buffer_std430_layout is not in the OpenGL registry despite being 7 months old; it should be there by now. Might actually be worth opening a ticket for this.

Yep, the AMD driver does not report (most?) extensions added in release 22.7.1.
It’s a bug.


Edit:
I opened a ticket about the missing extension spec in the NVIDIA forum.