In C++ with OpenGL version 4.6, I’m rendering a handful of models (about 4-5). Each model has two SSBOs associated with it, named MainData and SubData. My approach has been to allocate SSBOs at separate binding points for each model, and use a switch case in the shader:
// data for model 0
layout(std430, binding = 0) buffer MainData0 {
Mat2c main_data0[];
};
layout(std430, binding = 1) buffer SubData0 {
Mat2c sub_data0[];
};
// data for model 1
layout(std430, binding = 2) buffer MainData1 {
Mat2c main_data1[];
};
layout(std430, binding = 3) buffer SubData1 {
Mat2c sub_data1[];
};
// data for model 2 ...
uniform uint ssbo_idx;
void main() {
switch (ssbo_idx) {
// code
}
}
This works, but doesn’t scale. Is there an elegant solution to this?
Notes:
I can’t preallocate everything up front, because new models are added to the scene at runtime. I can limit the maximum number of objects (e.g. 5).
but couldn’t make this work - specifically, when rendering less than 5 models, I’ve run into problems with sub_data indexing.
SSBOs were chosen over UBOs due to memory requirements (several MBs of data each). SSBOs were also preferred over textures, since each model also contains a variable number of textures that are needed.
It seems to me that the goal of what you’re doing is to avoid changing buffer bindings between draw calls. But you do have to change this ssbo_idx uniform between draw calls. While that’s almost certainly cheaper than a buffer binding change, you can reduce the number of state changes even further.
It seems like your data structures are things like modelview matrices and the like. As such, it would generally be better to just have one big SSBO with all of your models in it. And if that big SSBO isn’t big enough for the scene… allocate a bigger one. How much of the buffer you’re using should be dynamic.
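As a rough CPU-side sketch of the one-big-buffer idea (not a definitive implementation): each model's data is appended to a single staging array, and the range it occupies is recorded. The Mat2c layout here is a hypothetical stand-in, since the real struct isn't shown, and PackedBuffer, Range and required_capacity are names made up for illustration. The doubling growth policy is just one reasonable choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for the shader's Mat2c; the real layout is not shown.
struct Mat2c { float v[4]; };

// Where one model's elements live inside the shared array.
struct Range {
    std::uint32_t offset; // index of the model's first element
    std::uint32_t count;  // number of elements belonging to the model
};

// CPU-side staging copy of the contents of one large SSBO.
class PackedBuffer {
public:
    // Appends one model's data and returns the range it occupies.
    Range append(const std::vector<Mat2c>& model_data) {
        assert(data_.size() + model_data.size() <=
               std::numeric_limits<std::uint32_t>::max());
        Range r{static_cast<std::uint32_t>(data_.size()),
                static_cast<std::uint32_t>(model_data.size())};
        data_.insert(data_.end(), model_data.begin(), model_data.end());
        return r;
    }

    // Capacity (in elements) the GPU buffer should have. Doubles from the
    // current capacity so reallocation stays rare as models are added.
    std::size_t required_capacity(std::size_t current_capacity) const {
        std::size_t cap = current_capacity ? current_capacity : 1;
        while (cap < data_.size()) cap *= 2;
        return cap;
    }

    const std::vector<Mat2c>& data() const { return data_; }

private:
    std::vector<Mat2c> data_;
};
```

If required_capacity returns more than the GPU buffer currently holds, that is the point where you would allocate a bigger buffer and upload data() into it.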
Since you’re using GL 4.6, you have access to persistent mapped buffers. So use them. Allocate two buffers and map them. On one frame, write all of the per-mesh data for all of your objects into buffer 1. Then bind it as an SSBO and render all the objects (I’ll explain how later). On the next frame, write all of the per-mesh data into buffer 2. Bind it and render everything. On the next frame, synchronize with your original rendering using a fence sync object to make sure that the GPU is finished with buffer 1. Then write the next frame’s data into that buffer. And repeat.
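The bookkeeping for that rotation might look something like the sketch below. It only tracks which buffer to write and whether a fence wait is due; the actual GL calls are left as comments, and BufferRing and its members are hypothetical names. It assumes one begin_frame/end_frame pair per frame.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Tracks which of N persistently mapped buffers to write this frame and
// whether the GPU might still be reading it.
template <std::size_t N>
class BufferRing {
public:
    struct Slot {
        std::size_t index; // buffer to write into and bind this frame
        bool must_wait;    // true if a fence from an earlier frame is pending
    };

    // Call at the start of a frame, before writing per-mesh data.
    Slot begin_frame() {
        const std::size_t i = frame_ % N;
        Slot s{i, fence_pending_[i]};
        // Real code, if s.must_wait: glClientWaitSync(fence[i], ...) and
        // then glDeleteSync(fence[i]) before touching the mapped memory.
        fence_pending_[i] = false;
        return s;
    }

    // Call after issuing all draw calls that read this frame's buffer.
    void end_frame() {
        const std::size_t i = frame_ % N;
        assert(!fence_pending_[i]); // begin_frame must have run first
        // Real code: fence[i] = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
        fence_pending_[i] = true;
        ++frame_;
    }

private:
    std::array<bool, N> fence_pending_{};
    std::size_t frame_ = 0;
};
```

With BufferRing<2>, the first two frames write buffers 0 and 1 without waiting, and the third frame comes back to buffer 0 with must_wait set.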
You will also want all of your vertex data in one buffer. So all of your meshes need to use the same vertex format, and as you load and unload meshes, you’ll need to play around with which mesh is where.
When it comes time to render, you want to use multi-draw rendering, with each draw representing a single mesh (using the base vertex and other parameters to select which set of data you’re using from the vertex buffers). Since you’re using GL 4.6, your shader will have access to gl_DrawID. That is what you use to index into the single buffer array to get your per-mesh data. If the fragment shader needs to access this data too, then have the VS pass gl_DrawID to the FS.
This means that the order of meshes in your draw calls must match the order of meshes written into the buffer.
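To make the ordering requirement concrete, here is a minimal sketch of building the indirect command list on the CPU. DrawElementsIndirectCommand follows the layout the GL spec defines for glMultiDrawElementsIndirect; MeshSlot and build_commands are hypothetical names, and it assumes indexed drawing with one instance per mesh.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Per-draw layout read by glMultiDrawElementsIndirect.
struct DrawElementsIndirectCommand {
    std::uint32_t count;         // number of indices to draw
    std::uint32_t instanceCount; // number of instances
    std::uint32_t firstIndex;    // offset into the shared index buffer, in indices
    std::int32_t  baseVertex;    // value added to every index
    std::uint32_t baseInstance;  // first instance ID
};
static_assert(sizeof(DrawElementsIndirectCommand) == 20,
              "commands must be tightly packed");

// Hypothetical record of where one mesh lives in the shared vertex/index buffers.
struct MeshSlot {
    std::uint32_t index_count;
    std::uint32_t first_index;
    std::int32_t  base_vertex;
};

// Emits one command per mesh, in the order given. Command i is the draw for
// which the shader sees gl_DrawID == i, so the per-mesh data has to be
// written into the SSBO in this same order.
std::vector<DrawElementsIndirectCommand>
build_commands(const std::vector<MeshSlot>& meshes) {
    std::vector<DrawElementsIndirectCommand> cmds;
    cmds.reserve(meshes.size());
    for (const MeshSlot& m : meshes) {
        cmds.push_back({m.index_count, 1u, m.first_index, m.base_vertex, 0u});
    }
    return cmds;
}
```

The resulting array would be uploaded to a buffer bound to GL_DRAW_INDIRECT_BUFFER and drawn with a single glMultiDrawElementsIndirect call.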
Thanks for the answer. Indeed, I am trying to avoid changing buffer bindings between draw calls.
My renderer supports adding meshes to the scene at runtime, and I can’t constrain the mesh sizes. My scenes are variable, meaning they might contain only 2 meshes each with 10k faces, but they might have 4 meshes each with 100k faces. Allocating for the worst case is one solution. If I understand correctly, for each model, I’ll have to send the shader the offset of that model in the large SSBO, so it’ll know where to start reading.
Thanks for all the pointers to new techniques as well.