Passing an array of mat4s per instance

I need to pass 30 bone matrices (4x4 each) per instance of a model, because every instance has a different animation.

I was looking at my options.

I was going to pass them as vertex attribute data, but the limit there is 16 attributes and one matrix takes up 4, so that’s not an option. Even if a matrix took up only 1 attribute, I have around 30, so it wouldn’t fit either way.

Then I thought I would have to pass them in a shader storage block read by the vertex shader, but my GPU returns 0 when I query GL_MAX_VERTEX_SHADER_STORAGE_BLOCKS, and it doesn’t look like the required minimum is at least 1, so I’m guessing my GPU does not support it.

So my final option was uniforms or a UBO, but then I can’t use instance rendering…

Any advice? What gets me is I know Blender3D uses OpenGL and it runs just fine, so there’s got to be something I am missing.

Also, what does GL_MAX_VARYING_COMPONENTS mean? Does that mean there is a datatype in GLSL that can have more than 4 components?

So my final option was uniforms or a UBO, but then I can’t use instance rendering.

… why not?

The way you would use SSBOs with instances is identical to the way you would use UBOs with instances. You put an array in the buffer, and you use the instance index to access the particular instance’s matrix data from the array.
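As a rough sketch of that indexing scheme (not taken from any code in this thread): the block below embeds a hypothetical vertex shader in a C++ string and mirrors its index arithmetic on the CPU side. The names BoneBlock, bones, boneIndices, boneWeights, viewProjection and boneSlot are made up for illustration, and the batch size of 8 instances assumes 30 mat4 bones per instance inside a 16384-byte uniform block.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumed sizes: 30 bones per instance (from the question), 8 instances per batch.
constexpr std::size_t kBonesPerInstance = 30;
constexpr std::size_t kInstancesPerBatch = 8;

// Hypothetical vertex shader. Every instance reads its own 30-matrix slice of
// one flat array, selected with gl_InstanceID.
const std::string kVertexShaderSource = R"glsl(
#version 330 core
layout(location = 0) in vec3 position;
layout(location = 1) in ivec4 boneIndices;  // assumes an integer attribute
layout(location = 2) in vec4 boneWeights;

layout(std140) uniform BoneBlock {
    mat4 bones[240];  // 8 instances * 30 bones = 15360 bytes
};
uniform mat4 viewProjection;

void main() {
    int base = gl_InstanceID * 30;  // first bone of this instance
    mat4 skin = boneWeights.x * bones[base + boneIndices.x]
              + boneWeights.y * bones[base + boneIndices.y]
              + boneWeights.z * bones[base + boneIndices.z]
              + boneWeights.w * bones[base + boneIndices.w];
    gl_Position = viewProjection * skin * vec4(position, 1.0);
}
)glsl";

// CPU-side mirror of the shader's indexing: the slot in the buffer's matrix
// array where a given bone of a given instance has to be written.
inline std::size_t boneSlot(std::size_t instance, std::size_t bone) {
    assert(instance < kInstancesPerBatch);
    assert(bone < kBonesPerInstance);
    return instance * kBonesPerInstance + bone;
}
```

With a layout like this, drawing more than 8 instances would mean several instanced draw calls, each with a different range of the buffer bound to the block.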

The only difference is the amount of data that can be accessed (and the performance of accessing that data). UBO size limits are smaller than those for SSBOs. With UBOs, OpenGL only guarantees 16384 bytes per block (though many implementations support 65536 bytes or more), which in your case gives you only about 8 instances per block.
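The "about 8 instances" figure comes from simple division, which can be sketched as below. The function name instancesPerBlock is made up for this example; the 64-byte matrix size assumes a mat4 of 4-byte floats in a std140 block. The real limit for a given GPU is what glGetIntegerv returns for GL_MAX_UNIFORM_BLOCK_SIZE.

```cpp
#include <cassert>
#include <cstddef>

// A mat4 is 16 floats of 4 bytes each; std140 stores it as four vec4 columns.
constexpr std::size_t kMat4Bytes = 16 * 4;

// How many whole instances fit in one uniform block of the given size.
inline std::size_t instancesPerBlock(std::size_t blockBytes,
                                     std::size_t bonesPerInstance) {
    assert(bonesPerInstance > 0);
    return blockBytes / (bonesPerInstance * kMat4Bytes);
}
```

With 30 bones, each instance needs 1920 bytes, so a 16384-byte block holds 8 instances and a 65536-byte block holds 34.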

What gets me is I know Blender3D uses OpenGL and it runs just fine, so there’s got to be something I am missing.

What makes you think Blender3D uses instancing? Remember: instancing is primarily useful when drawing many copies of the same object.

Also, what does GL_MAX_VARYING_COMPONENTS mean?

It’s an old query that isn’t strictly relevant anymore. It only applies when linking a program that contains a vertex shader and a fragment shader (i.e. it was defined before other shader stages existed).

In that case, this constant defines the maximum number of “components” that can be output by the vertex shader and input by the fragment shader. Such variables are called “varying” (the current meaning of that word is any variable output from one stage and input by the fragment shader, but the above query only applies to VS-to-FS linkage).

A “component” is essentially a non-vector basic type (float, int, bool, etc). So a vec4 takes up 4 components. If you output a vec2 and a vec3, then you use up 5 components in total. Structs and arrays of these count the sum of all of their components.
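To make the counting rule concrete, here is a small sketch that sums components the way the paragraph above describes. The Varying struct and both function names are invented for illustration, and an actual driver may pack or pad variables differently when it checks the limit.

```cpp
#include <cassert>
#include <vector>

// One vertex-shader output, described only by its size.
struct Varying {
    int componentsPerElement;  // 1 for float, 2 for vec2, 3 for vec3, 4 for vec4
    int arrayLength;           // 1 for a non-array variable
};

// Sum of components across all varyings, arrays counted element by element.
inline int totalComponents(const std::vector<Varying>& varyings) {
    int total = 0;
    for (const Varying& v : varyings) {
        assert(v.componentsPerElement > 0 && v.arrayLength > 0);
        total += v.componentsPerElement * v.arrayLength;
    }
    return total;
}

// The example from the text: one vec2 plus one vec3.
inline std::vector<Varying> vec2AndVec3() {
    return std::vector<Varying>{Varying{2, 1}, Varying{3, 1}};
}
```

The vec2 plus vec3 case adds up to 5, and an array such as vec4[3] would contribute 12 on its own.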

Unless your bone matrices include projective transformations, they only need to be 3x4 (in the absence of a projective transformation, the bottom row will always be [0 0 0 1]).
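A minimal sketch of dropping that bottom row on the CPU, assuming the usual OpenGL column-major float[16] layout. The names Mat4, PackedBone, isAffine, packAffine and translationMatrix are hypothetical. The three kept rows are written out one after another, so each one fills a full vec4; if they are uploaded as the three columns of a GLSL mat3x4, the shader can apply the bone as vec4(position, 1.0) * bone.

```cpp
#include <array>
#include <cassert>

// Column-major 4x4: the element at row r, column c is m[c * 4 + r].
using Mat4 = std::array<float, 16>;
// Rows 0..2 of the matrix, four floats each (12 floats instead of 16).
using PackedBone = std::array<float, 12>;

// True when the bottom row is exactly [0 0 0 1]. Real code might prefer a tolerance.
inline bool isAffine(const Mat4& m) {
    return m[3] == 0.0f && m[7] == 0.0f && m[11] == 0.0f && m[15] == 1.0f;
}

// Copy the top three rows and discard the bottom one.
inline PackedBone packAffine(const Mat4& m) {
    assert(isAffine(m));
    PackedBone out{};
    for (int r = 0; r < 3; ++r) {
        for (int c = 0; c < 4; ++c) {
            out[r * 4 + c] = m[c * 4 + r];
        }
    }
    return out;
}

// Helper for trying it out: identity rotation plus a translation.
inline Mat4 translationMatrix(float x, float y, float z) {
    Mat4 m{};
    m[0] = m[5] = m[10] = m[15] = 1.0f;
    m[12] = x;
    m[13] = y;
    m[14] = z;
    return m;
}
```

Storing rows rather than columns is a deliberate choice here: in a std140 block a vec3 column is padded to 16 bytes, so a mat4x3 would still occupy 64 bytes, while three vec4 rows take 48.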

In theory, you could get it down to 6 elements per bone (3 for the translation, 3 for the rotation), but the computational overhead increases.
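One possible 6-element encoding (the post does not say which one is meant) keeps the x, y, z parts of a unit quaternion plus the translation, and rebuilds w when the bone is applied. The sketch below assumes the quaternion is normalized and stored with w >= 0; CompactBone, transformPoint and the helpers are hypothetical names. The square root and the two cross products are the extra per-vertex work that a plain matrix multiply avoids.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<float, 3>;

struct CompactBone {
    Vec3 rotationXYZ;  // x, y, z of a unit quaternion; w is implied and assumed >= 0
    Vec3 translation;
};

inline Vec3 cross(const Vec3& a, const Vec3& b) {
    return Vec3{{a[1] * b[2] - a[2] * b[1],
                 a[2] * b[0] - a[0] * b[2],
                 a[0] * b[1] - a[1] * b[0]}};
}

// Rotate p by the bone's quaternion, then translate it.
inline Vec3 transformPoint(const CompactBone& bone, const Vec3& p) {
    const Vec3& q = bone.rotationXYZ;
    const float lengthSq = q[0] * q[0] + q[1] * q[1] + q[2] * q[2];
    assert(lengthSq <= 1.0f + 1e-4f);
    // Rebuild the missing component from the unit-length constraint.
    const float w = std::sqrt(std::fmax(0.0f, 1.0f - lengthSq));
    // p' = p + w * t + cross(q, t), where t = 2 * cross(q, p)
    Vec3 t = cross(q, p);
    t = Vec3{{2.0f * t[0], 2.0f * t[1], 2.0f * t[2]}};
    const Vec3 u = cross(q, t);
    return Vec3{{p[0] + w * t[0] + u[0] + bone.translation[0],
                 p[1] + w * t[1] + u[1] + bone.translation[1],
                 p[2] + w * t[2] + u[2] + bone.translation[2]}};
}

// Example bone: a quarter turn about the z axis, then a shift of 10 along x.
inline CompactBone quarterTurnAboutZ() {
    return CompactBone{Vec3{{0.0f, 0.0f, 0.70710678f}}, Vec3{{10.0f, 0.0f, 0.0f}}};
}

inline bool nearlyEqual(const Vec3& a, const Vec3& b) {
    return std::fabs(a[0] - b[0]) < 1e-4f &&
           std::fabs(a[1] - b[1]) < 1e-4f &&
           std::fabs(a[2] - b[2]) < 1e-4f;
}
```

For the example bone, the point (1, 0, 0) is rotated to (0, 1, 0) and then shifted to (10, 1, 0).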

Also, if the translation component is constant for all instances (which is typically the case for human models), only the rotation needs to be per-instance.

If you only send the rotation as a quaternion, then for every vertex you would need to either compute the translation matrix or use quaternion multiplication, which is more expensive than a mat4x3 multiplication. I did hear something before on these forums about a compute shader