More flexible DrawElementsInstanced()

I am trying something a little different. I want a command that is like glDrawElementsInstanced(), but I want to draw multiple ranges of elements, all packed into a single command.

Here is what I want to do (for example):

-Draw three instances of the element array from position 0-324
-Draw five instances of the element array from position 325-574
-Draw two instances of the element array from position 575-624

Is there any new OpenGL functionality that supports this, or something that gives me a little more flexibility with the element array? I would like to just provide a structure with the draw parameters, something like this:
-int 0 (start element)
-int 324 (end element)
-int 3 (number of instances)
-int 325
-int 574
-int 5
-int 575
-int 624
-int 2

[QUOTE=glnoob;1293701]I am trying something a little different. I want a command that is like glDrawElementsInstanced(), but I want to draw multiple ranges of elements, all packed into a single command.[/QUOTE]
glMultiDrawElementsIndirect(). It’s essentially equivalent to multiple calls to glDrawElementsInstancedBaseVertexBaseInstance() (which is a more general version of glDrawElementsInstanced()), with the parameters taken from an array of structures (which can be in client memory or in the buffer bound to GL_DRAW_INDIRECT_BUFFER). It requires OpenGL 4.3 or the ARB_multi_draw_indirect extension.

/* Each row: count, instanceCount, firstIndex, baseVertex, baseInstance */
GLuint indirect[3][5] = {
    {325-  0, 3,   0, 0, 0},
    {575-325, 5, 325, 0, 0},
    {625-575, 2, 575, 0, 0}
};
glMultiDrawElementsIndirect(mode, type, indirect, 3, 0);

One minor point (check me on this). With Indirect draw calls, I’m pretty sure that the indirect parameter is always a buffer offset. That is (in this case), the array of structures is always pulled from the bound GL_DRAW_INDIRECT_BUFFER (barring NV bindless use, of course).

Also, one thing to watch out for if your instances are very small (in terms of number of vertices): see this post for details.

For the core profile, you’re definitely correct (and the online reference page is incorrect, as that says that client memory can be used with no mention of profiles).

For the compatibility profile, my reading is that client memory is an option, although this part of the specification is quite hard to read (even more so than usual).

On the one hand, the core profile specification says (§10.4):

An [var]INVALID_OPERATION[/var] error is generated if zero is bound to [var]DRAW_INDIRECT_BUFFER[/var], or if no element array buffer is bound.

while the equivalent language in the compatibility profile specification is:

An [var]INVALID_OPERATION[/var] error is generated if no element array buffer is bound.

On the other hand, both versions say

indirect contains the offset of the first element of the array within the buffer currently bound to the [var]DRAW_INDIRECT[/var] buffer binding.

where I would expect the compatibility profile to have some red text regarding the use of client-side memory.

But then the preceding section (§10.3.11, “Indirect Commands in Buffer Objects”) says that

Arguments to the indirect commands … may be sourced from the buffer object currently bound to the corresponding indirect buffer target

(emphasis mine), and the compatibility profile specification has the red text

If zero is bound to [var]DRAW_INDIRECT_BUFFER[/var], the corresponding DrawIndirect commands instead source their arguments directly from the indirect pointer in client memory.

Good catch on that last one. Definitely ambiguous behavior and worthy of a spec bug report.

Now, how can I get the correct instance ID in the shader? I think gl_InstanceID gets reset to 0 with each internal draw call. gl_DrawID will give you the draw call number, but in the example here, we are drawing 3, 5, then 2 instances.

So I believe gl_InstanceID will give this sequence:

0 1 2 0 1 2 3 4 0 1

And gl_DrawID will give this sequence:

0 0 0 1 1 1 1 1 2 2

But what I want is this sequence:

0 1 2 3 4 5 6 7 8 9

I can think of one way to do that, but is there a simple built-in solution?

[QUOTE=glnoob;397965]But what I want is this sequence:[/QUOTE]

I get the thinking behind that, but you’re not going to get that as a simple number. Instead, you have to get clever.

What you’re trying to do is have a big array of per-model data. For each draw command, you add another entry into the array for each instance in that draw command. And in your shader, you just fetch from the array by that index.

Since the index you want doesn’t exist, what you need to do in order to pull this off is to build an array, where each element in the array corresponds to a draw command. The value of each element is an index into the per-model data array where that draw command’s data starts.

So your per-model fetching in the shader would look like:

perModelArray[drawIndex[gl_DrawID] + gl_InstanceID]

I wouldn’t worry too much about the performance of this extra layer of indirection. I mean, it’s not going to be good for performance, but it shouldn’t be bad either. Since every vertex of each rendering command will share the same gl_DrawID, drawIndex[gl_DrawID] will get stored in the cache pretty quickly, and it’ll stay there for most of the vertices in that command. So given a reasonable number of vertices, it shouldn’t be a problem.

BTW: one of the reasons you can’t just get the kind of index you’re talking about is that gl_DrawID has to be handled specially. Or more to the point, there are things that gl_InstanceID can’t do.

For example, if you want to access a texture based on gl_DrawID by fetching from an array of bindless textures, you can do that. However, if you do it based on gl_InstanceID, you cannot. The expression using a bindless texture is required to be dynamically uniform, and gl_InstanceID is not dynamically uniform. So the index sequence you’re wanting isn’t dynamically uniform.

So while you could have each draw call in a multi-draw use different textures through an array of bindless textures, you cannot have different instances within the same draw call do the same.

This is why array textures are still important even in a bindless texture world. Array textures don’t have this requirement, so instances can do as they see fit.

Oh and FYI:

Bugs go here now.