Hello,
I’m currently refactoring my rendering pipeline to batch multiple draw calls into as few MultiDrawIndirect calls as possible. This has meant pretty big changes in how I manage things, especially vertex buffers, uniform buffers and textures. I run a “typical” render loop where I submit “render tasks”, each of which holds all the information for a single draw call. I then order them by a sort key that emphasises the shader and vertex buffers used, because these can’t be changed in the middle of an indirect draw, so I end up with geometry ordered in a way that (theoretically) lets me put ranges of it into indirect buffers. I thought it would be good to put all my per-object uniforms into a single buffer, even if the items come from different “queues” (my geometry is split between a few buckets: opaque, alpha_tested, 2d_overlay etc., which let me fetch specific lists of tasks to process in separate passes). So I have several queues containing RenderItems, and each RenderItem has an entry in the ObjectUniforms[] array, which holds data like the model matrix, normal matrix and other per-object info.
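To make the sorting and batching step concrete, here is a minimal CPU-side sketch. The struct layout, the field names and the 64-bit key packing are assumptions for illustration only; the real tasks carry much more state. It sorts tasks by shader and vertex buffer, then collects each contiguous run with the same key into one candidate range for an indirect draw.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical task layout; the field names are illustrative only.
struct RenderTask {
    std::uint32_t shaderId;
    std::uint32_t vertexBufferId;
    std::uint32_t objectIndex;  // index into ObjectUniforms[]
};

// A contiguous run of sorted tasks sharing shader and vertex buffer,
// i.e. one candidate multi-draw-indirect call.
struct Batch {
    std::size_t first;  // position of the first task in the sorted list
    std::size_t count;  // number of draws in the run
};

// Assumed key layout: shader in the high 32 bits, vertex buffer in the low 32.
inline std::uint64_t makeSortKey(const RenderTask& task) {
    return (std::uint64_t{task.shaderId} << 32) | task.vertexBufferId;
}

// Sorts the tasks in place and returns the runs that share a key.
inline std::vector<Batch> sortAndBatch(std::vector<RenderTask>& tasks) {
    std::stable_sort(tasks.begin(), tasks.end(),
                     [](const RenderTask& a, const RenderTask& b) {
                         return makeSortKey(a) < makeSortKey(b);
                     });

    std::vector<Batch> batches;
    for (std::size_t i = 0; i < tasks.size(); ++i) {
        const bool startsNewRun =
            batches.empty() || makeSortKey(tasks[i]) != makeSortKey(tasks[i - 1]);
        if (startsNewRun) {
            batches.push_back({i, 1});
        } else {
            ++batches.back().count;
        }
    }
    return batches;
}
```

Each Batch::first is also the position where that run starts in the sorted list, which matters once per-draw data has to follow the same order.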
Having one big ObjectUniforms array lets me easily upload it to a persistently mapped buffer, and each RenderItem keeps an index into this array. This works nicely in my initial implementation, which does not use indirect draws yet: I just provide this index via glUniform1i. Even when I sort my render tasks to minimise state changes, the indices stay valid (I do not sort the ObjectUniforms array). But with indirect draws, I won’t be able to provide these indices via glUniform1i. From what I’ve read so far, gl_DrawID is available, but it starts at 0 for every call to glMultiDrawIndirect, which means I need to offset it somehow so I can index into the global ObjectUniforms array. How is this usually done? Should I provide a single integer uniform as a baseDrawId for each glMultiDrawIndirect call and then resolve the index into the global buffer as baseDrawId + gl_DrawID? In that case I’ll also have to sort the per-object uniform array so it has the same order as the RenderTask array: since I can provide only one offset per indirect draw, the draw calls inside it have to index the global uniform array contiguously as base + 0, base + 1, base + 2, and can’t use arbitrary indices anymore.
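The baseDrawId idea can be sketched on the CPU side as below. This is only an illustration under assumptions: ObjectUniforms is reduced to a stand-in struct, and gatherInDrawOrder and resolveObjectIndex are hypothetical helper names. The first function copies the uniforms into the same order as the sorted tasks; the second mirrors the lookup the shader would do with baseDrawId + gl_DrawID.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for the real per-object block (model matrix, normal matrix, ...).
struct ObjectUniforms {
    float modelMatrix[16];
};

// Copies uniforms into draw order: result[i] belongs to the i-th sorted task.
// sortedObjectIndices[i] is that task's index into the unsorted source array.
inline std::vector<ObjectUniforms> gatherInDrawOrder(
    const std::vector<ObjectUniforms>& source,
    const std::vector<std::uint32_t>& sortedObjectIndices) {
    std::vector<ObjectUniforms> ordered;
    ordered.reserve(sortedObjectIndices.size());
    for (std::uint32_t index : sortedObjectIndices) {
        assert(index < source.size());  // a stale index would read out of bounds
        ordered.push_back(source[index]);
    }
    return ordered;
}

// CPU-side mirror of the shader lookup: baseDrawId + gl_DrawID.
// baseDrawId is the position of the batch's first draw in the sorted list.
inline std::uint32_t resolveObjectIndex(std::uint32_t baseDrawId,
                                        std::uint32_t drawId) {
    return baseDrawId + drawId;
}
```

With this layout a single buffer upload still covers every batch; only the one baseDrawId uniform changes between multi-draw calls.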
I saw a reference saying this can be achieved by providing the offset as baseInstance, so I would basically abuse this property to pass my index offset. But I’m not sure how this affects vertex attribute divisors: they probably use the baseInstance value, and this would break them, as they won’t follow the same offset as the global uniform buffer. They’ll want to start from 0…
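A rough sketch of the baseInstance variant, for indexed draws: each command in the indirect buffer carries its own object index, so the ObjectUniforms array would not need to follow draw order. The command struct follows the five-field layout the indirect buffer is read with; MeshDraw and buildCommands are hypothetical names. In the shader the value would be read through gl_BaseInstance (GLSL 4.60) or gl_BaseInstanceARB (ARB_shader_draw_parameters), assuming one of them is available.

```cpp
#include <cstdint>
#include <vector>

// Layout of one indexed draw command as read from the indirect buffer.
struct DrawElementsIndirectCommand {
    std::uint32_t count;          // number of indices
    std::uint32_t instanceCount;  // number of instances
    std::uint32_t firstIndex;     // offset into the index buffer, in indices
    std::int32_t  baseVertex;     // value added to each fetched index
    std::uint32_t baseInstance;   // repurposed here as the object index
};
static_assert(sizeof(DrawElementsIndirectCommand) == 20,
              "commands must be tightly packed");

// Hypothetical per-task description of which mesh range to draw.
struct MeshDraw {
    std::uint32_t indexCount;
    std::uint32_t firstIndex;
    std::int32_t  baseVertex;
    std::uint32_t objectIndex;  // index into the unsorted ObjectUniforms[]
};

// Builds one command per draw, one instance each, with the object index
// stored in baseInstance.
inline std::vector<DrawElementsIndirectCommand> buildCommands(
    const std::vector<MeshDraw>& draws) {
    std::vector<DrawElementsIndirectCommand> commands;
    commands.reserve(draws.size());
    for (const MeshDraw& draw : draws) {
        commands.push_back({draw.indexCount, 1u, draw.firstIndex,
                            draw.baseVertex, draw.objectIndex});
    }
    return commands;
}
```

On the divisor worry: as far as the spec wording goes, attributes with a nonzero divisor are fetched from element instance / divisor + baseInstance, so they are shifted by this value, while gl_InstanceID itself is not. The trick therefore looks safe only when no other instanced attributes depend on starting at 0.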
Do any other options exist? I’ve never seen this mentioned. Everyone says “upload per-object uniforms to a single buffer”, but this would mean doing it separately for every indirect call group: I can’t just upload ONE single buffer, as I have no reliable way of indexing into it when gl_DrawID always starts from 0.