Setting a vertex base index in a glDrawElements call

I’m a little puzzled about why there’s no way to specify an origin index in the glDrawElements/glDrawRangeElements functions.
It is possible in d3d9 (DrawIndexedPrimitive has it as a parameter), so why not in GL?
It would seem a better way of squeezing more out of a VBO without switching binds.

Doesn’t BUFFER_OFFSET do just that? Am I misunderstanding you?

-Won

You can do this in both VA and VBO.

VA:
glDrawRangeElements(…, &index[start]);

VBO:

glDrawRangeElements(…, START_ADDRESS_IN_BYTES);

Yet again, GL is more efficient by having fewer parameters per function :slight_smile:

No, that specifies a different starting position in the index array. What I’m talking about is specifying a different ZERO index into the vertex array. So if you specify, say, 4 in the index array with a buffer origin of 10, the index will be interpreted as 14 rather than 4.

For that you need to use an offset in the vertex buffer rather than the index buffer. Just offset 10 vertices into the vertex buffer and you’re set. As simple as that. :slight_smile:

No, you don’t understand. I know there are ways around it. What I’m saying is that to change the ‘offset’ into the VBO you have to call glVertexPointer, which, if you read the various documents scattered around, is the most expensive operation you can do in the VBO extension. It is emphasised that you should only do it once per buffer object, so doing it several times while rendering a single model is obviously not going to win awards for efficiency.
HOWEVER, Direct3D 9 has a parameter in its version of glDrawRangeElements (called DrawIndexedPrimitive) which enables you to do effectively what glVertexPointer does but without the expense (it’s not an optional parameter, so I have to assume it’s implemented more efficiently than the GL workaround).
So, it’s in the drivers/hardware, why isn’t it in our favourite API?

Which parameter does that?
Is it the second one --> BaseVertexIndex

minIndex and numVertices are like the range in glDrawRangeElements

StartIndex is like an offset into the index buffer.

…so I was guessing it’s BaseVertexIndex.

I’m not an authority on GL, but my guess is the reason why this isn’t available is that glDrawRangeElements predates DX9 (maybe even DX8, I don’t remember) and nobody bothered to add the feature.

Are you doing a kind of animation or something that requires this?
Do you have multiple geometries in a single VBO with a single index buffer for them all?

Originally posted by V-man:
Which parameter does that?
Is it the second one --> BaseVertexIndex

Correct.

Originally posted by V-man:
I’m not an authority on GL, but my guess is the reason why this isn’t available is that glDrawRangeElements predates DX9 (maybe even DX8, I don’t remember) and nobody bothered to add the feature.

I realise that. My question was meant to provoke some wondering as to why an extension to glDrawRangeElements wasn’t introduced at the same time as VBO to add this important parameter.

Originally posted by V-man:
Do you have multiple geometries in a single VBO with a single index buffer for them all?
Not a single index buffer, but a single vertex buffer for a particular sector isn’t an insane proposition, is it?

To be honest, it wasn’t added in VBO because, AFAIK, nobody on the WG had ever heard anyone ask for that functionality in OpenGL. :slight_smile: This is certainly the first time I’ve ever heard anyone ask for it.

Would adding a pair of functions like ElementIndexBase( uint base ) and an associated ‘get’ do the trick? I think I’d prefer that to adding yet another DrawElements entry point. The advantage being that setting the base index would automatically apply to all the various drawing functions (ArrayElement, DrawElements, DrawRangeElements, MultiModeDrawElements, etc.), while adding just one new entry point.

The disadvantage being that it would apply to all calls. That might make things complicated for display lists. Imagine, for example, calling ElementIndexBase inside a display list versus calling it just before calling the display list.

Thoughts?

It depends on whether it carries the same cost as changing the VBO offsets. In other words, if calling the proposed ElementIndexBase just disguised a sequence of gl*Pointer calls, then there’s no point, although that would be implementation dependent, I suppose.
I guess what I’m trying to ascertain is whether D3D9 actually IS paying some price for this handy extra parameter, or whether it’s a driver shortcut that isn’t exposed in OpenGL simply because of semantics.

There are a couple of differences versus making a bunch of gl*Pointer calls, at least from the app’s perspective. If an app is going to make a bunch of gl*Pointer calls, it has to know which ones to make. Depending on the GL state, that may be non-trivial. It would certainly be easy to miss one and wonder why things looked screwy. :slight_smile: Additionally, the app would have to make multiple GL calls in the gl*Pointer method versus one call in the glElementIndexBase method. In the vast majority of cases that wouldn’t cause any performance impact, but it could. Finally, we could spec it so that glElementIndexBase could be compiled into display lists, whereas gl*Pointer calls cannot.

Given the way that VBOs are supposed to be cached in video memory, there may be other performance wins with glElementIndexBase. I’ll have to look at the spec again and look through the Mesa code. Since I’m going to OLS next week, I might not get to it for a while…

Given the way that VBOs are supposed to be cached in video memory, there may be other performance wins with glElementIndexBase.
I should elaborate on that a bit. For a VBO, gl*Pointer is the moral equivalent to glBindTexture. It’s the point when the VBO really gets bound into the state vector. As such, it can be a fairly expensive operation. Like I said, I’ll have to think about it and look through the spec & code.

It would seem a better way of squeezing more out of a VBO without switching binds.
Could you explain what could be done this way, please ?
The only thing I can think of would be to store two nearly identical meshes in a single VBO (consecutively), sharing the same index buffer between the two models.

SeskaPeel.

Basic example: the vertices for an entire game level are stored in a single vertex buffer (say you trust the driver to do a good job of memory management, like some crazy fool), while each sector of your world is drawn using a local set of index arrays connecting up vertices found in the global vertex buffer. If ushorts are used for the indices then you’re obviously going to be limited to 65536 vertices for the WHOLE level, unless you do one of the following:
1) split the big vertex buffer into smaller ones (err…);
2) re-specify your buffer offsets (gl*Pointer) for every bloody attribute before your glDrawRangeElements (very inefficient);
3) hassle the vendors or whoever to get with the programme and add the ability to specify an origin index.

Originally posted by SeskaPeel:
[b]Could you explain what could be done this way, please ?
The only thing I can think of would be to store two nearly identical meshes in a single VBO (consecutively), and sharing for these two models the same index buffer.

SeskaPeel.[/b]
Yes, if both models can share an identical index buffer. You could use it to do keyframe animation and all the keyframes could be in the same VBO.

This should be cheap. Why would applying an offset be expensive for the card?

The big question is: why did MS decide to put the Base parameter in DrawIndexedPrimitive in D3D9?
D3D8 has a SetIndices function.

Both examples are correct. But…

1/ Entire level :
The limited experience I have with such batching is that you’ll have to switch a lot of state between each call anyway, which should hide the glVertexPointer() latency. This may be level specific, and I suppose you are thinking of a test application, with low rendering options and heavy geometry.

2/ Keyframe animation :
Then you won’t have any interpolation between keyframes.

I still think that the correct answer to such problems would be to expose an even more programmable pipeline. Like a memory shader …

SeskaPeel.

This difference between OpenGL’s draw-indexed-primitive function and its Direct3D counterpart probably has something to do with their performance difference: Microsoft’s variant is famous for its slowness and high cost compared to OpenGL’s. I don’t know the details of the different hardware implementations, but I can imagine that specifying a new vertex origin forces the driver to re-set-up the hardware’s vertex pullers to a new start address (and probably all the other state, like data formats/types/etc., if those operations can’t be separated). After all, that’s exactly what glVertexPointer etc. do, and they are heavyweight operations. So the lack of such a parameter on glDrawElements (and the fact that no extension adds one) probably has a good reason.

Originally posted by idr:
Finally, we could spec it so that glElementIndexBase could be compiled into display lists, whereas gl*Pointer calls cannot.
Would this be desirable? I think it wouldn’t make much sense …

The element base state would be closely tied to the “stride” argument(s) to gl*Pointer and, obviously, to the “pointer” argument(s). It could be a new source of confusion if one would compile into DLs, and not the other.

If you have your vertex data in an array, you shouldn’t need to copy it into a display list. VBOs make geometry in DLs rather redundant anyway. I think the element base should be purely client state, and not compile into a DL.

I have a proposed extension spec available for review. If this looks about right, I’ll whip up an implementation for Mesa, and I’ll see if one of the other DRI developers can get it supported in some of the hardware drivers. Thoughts?

You forgot to define what happens when base + index overflows.

Keep in mind the user can use ubyte (8-bit), ushort (16-bit), or uint (32-bit) for the indices.

OPTION 1 : let it wrap back to 0
OPTION 2 : flag GL_OVERFLOW
OPTION 3 : promote to higher precision