UBO poor performance [GL 3.1]

I think the general idea is to keep the data as tightly packed as possible and to access it sequentially.

See the vendor perf docs for details, but I don’t think small, partial updates to the middle of a buffer are going to buy you much in any case.

If the layout of your original data follows that of a C struct, then yes, being able to mirror that data directly with a plain struct in your code would be a win for std140.
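To make the struct-mirroring point concrete, here is a minimal sketch in C. The block and its member names (Example, model, uvOffset, eye, scale) are made up for illustration and are not from anyone's actual shader; it also assumes a 4-byte float, which holds on the usual desktop platforms. The std140 rules fix the offsets, so the CPU-side struct only needs explicit padding where the rules leave a gap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical GLSL block this struct is meant to mirror:
 *
 *   layout(std140) uniform Example {
 *       mat4  model;     // offset  0, 64 bytes
 *       vec2  uvOffset;  // offset 64,  8 bytes
 *       vec3  eye;       // offset 80 (vec3 aligns to 16 bytes)
 *       float scale;     // offset 92 (fits in the tail of the vec3 slot)
 *   };
 */
typedef struct ExampleBlock {
    float model[16];
    float uv_offset[2];
    float pad0[2];      /* explicit padding so eye starts at byte 80 */
    float eye[3];
    float scale;
} ExampleBlock;

/* Checks that the C struct lines up with the std140 offsets listed above. */
static void check_example_block_layout(void)
{
    assert(offsetof(ExampleBlock, model) == 0);
    assert(offsetof(ExampleBlock, uv_offset) == 64);
    assert(offsetof(ExampleBlock, eye) == 80);
    assert(offsetof(ExampleBlock, scale) == 92);
    assert(sizeof(ExampleBlock) == 96);
}
```

With a struct like this, the whole block can be uploaded in one call, since its bytes already match what the shader expects under std140.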

The packed form is going to be more useful if there is a significant quantity or size of named uniforms that will not get ‘touched’ by a given linked program. It really depends on your code.

Where this becomes particularly noticeable is if you are running on hardware with a limited number of register slots for program parameters, and your total uniform storage exceeds it - going to ‘packed’ could well get you back under the limit.

layout(std140):

Name: matLocal
Index: 2
Offset: 0
Size: 1

Name: matMVP
Index: 3
Offset: 64 // mat4 offset
Size: 1

Name: uvBase
Index: 18
Offset: 128 // mat4 offset
Size: 1

Name: perlinMovement
Index: 5
Offset: 136 // vec2 offset
Size: 1

Name: localEye
Index: 1
Offset: 144 // vec2 offset
Size: 1

All ok…
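For reference, the std140 offsets listed above can be reproduced by hand from the alignment rules. A minimal sketch in C, assuming matLocal and matMVP are mat4, uvBase and perlinMovement are vec2 (as the offset comments suggest), and localEye is a vec3 (an assumption, its type is not stated). The helper names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* One block member: its std140 base alignment and size, in bytes. */
typedef struct Std140Member {
    const char *name;
    size_t      align;
    size_t      size;
} Std140Member;

/* Rounds offset up to the next multiple of alignment. */
static size_t align_up(size_t offset, size_t alignment)
{
    assert(alignment > 0);
    return (offset + alignment - 1) / alignment * alignment;
}

/* Walks the members in declaration order, writing each offset to out[i].
 * Returns the first byte past the last member (not rounded up). */
static size_t std140_offsets(const Std140Member *m, size_t count, size_t *out)
{
    size_t cursor = 0;
    for (size_t i = 0; i < count; ++i) {
        cursor = align_up(cursor, m[i].align);
        out[i] = cursor;
        cursor += m[i].size;
    }
    return cursor;
}

/* Members of the block from the listing; types are assumed as noted above. */
static const Std140Member kBlock[] = {
    { "matLocal",       16, 64 },  /* mat4: four vec4 columns        */
    { "matMVP",         16, 64 },  /* mat4                           */
    { "uvBase",          8,  8 },  /* vec2                           */
    { "perlinMovement",  8,  8 },  /* vec2                           */
    { "localEye",       16, 12 },  /* vec3 (assumed), aligns like vec4 */
};
```

Running std140_offsets over kBlock gives 0, 64, 128, 136 and 144, which matches the values the driver reported for the std140 case.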

layout(packed):

Name: matLocal
Index: 2
Offset: 0
Size: 1

Name: matMVP
Index: 3
Offset: 64 // mat4 offset
Size: 1

Name: uvBase
Index: 18
Offset: 128 // mat4 offset
Size: 1

Name: perlinMovement
Index: 5
Offset: 128 // wtf?
Size: 1

Name: localEye
Index: 1
Offset: 128 // wtf?
Size: 1

Why do three uniforms have the same offset?

Up…