About small-size VBOs

Yes, I know that nothing heavy happens until you call glVertexPointer.
“An offset would not change this”.
Yes it would. With an offset, you could bind a single VBO, set the buffer offsets using calls to gl*Pointer, then for each object you render you would call glElementOffset(m_objectIndexOrigin). No more buffer setup is needed. But as I said, this won’t be introduced, because the offset can be prebaked into the indices, at the expense of making them 32-bit rather than 16-bit. It’s very rare for a single draw call to exceed 65535 vertices, but common for a whole scene.
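A minimal sketch of what that workflow could look like. glElementOffset is the hypothetical entry point being proposed (it does not exist in GL), and the Vertex/Object types, field names, and buffer handles are placeholders:

```cpp
struct Vertex { float pos[3]; float normal[3]; float uv[2]; };   // assumed 32-byte interleaved layout
struct Object { GLuint m_objectIndexOrigin; GLsizei indexCount; GLsizei firstIndex; };

// Buffer setup happens exactly once for the whole shared VBO/IBO pair.
glBindBuffer(GL_ARRAY_BUFFER, sharedVBO);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, sharedIBO);
glVertexPointer(3, GL_FLOAT, sizeof(Vertex), (const GLvoid*)0);
glNormalPointer(GL_FLOAT, sizeof(Vertex), (const GLvoid*)12);
glTexCoordPointer(2, GL_FLOAT, sizeof(Vertex), (const GLvoid*)24);

for (size_t i = 0; i < objectCount; ++i)
{
    const Object& obj = objects[i];
    glElementOffset(obj.m_objectIndexOrigin);   // hypothetical: bias every fetched index by this amount
    glDrawElements(GL_TRIANGLES, obj.indexCount, GL_UNSIGNED_SHORT,
                   (const GLvoid*)(obj.firstIndex * sizeof(GLushort)));
}
```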

I have the same issue. I was doing terrain rendering a while ago and I subdivided the terrain into patches, which would use different levels of detail. Pretty much the same way Far Cry does it.

The thing is, I precalculated and stored the relative indices to render each patch at a given LOD.

I COULD have stored these in an index buffer and rendered each patch with two lines of code, if there were such functionality to set an index offset. But since there isn’t, for each patch I need to take the offset myself and add it to each index in the precalculated relative index array, thus generating a new buffer, this time with 32-bit indices, that I now need to send over the bus to the GPU.

If there were functionality to set an index offset, I could have stored all the data on the GPU. It would reduce memory footprint, CPU cycles, and bus bandwidth.
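A rough sketch of that per-patch work, just to make it concrete; the Patch type, g_lodIndices, and baseVertex are placeholder names:

```cpp
#include <vector>

extern std::vector<std::vector<GLushort> > g_lodIndices;   // precalculated relative indices, one set per LOD
struct Patch { GLuint baseVertex; };

void DrawPatch(const Patch& patch, int lod)
{
    const std::vector<GLushort>& relative = g_lodIndices[lod];

    // No index-offset call exists, so the patch's base vertex has to be added by hand,
    // widening the indices to 32 bit in the process...
    std::vector<GLuint> indices(relative.size());
    for (size_t i = 0; i < relative.size(); ++i)
        indices[i] = patch.baseVertex + relative[i];

    // ...and since the result lives in client memory, it crosses the bus on every draw.
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);
    glDrawElements(GL_TRIANGLES, (GLsizei)indices.size(), GL_UNSIGNED_INT, &indices[0]);
}
```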

Jan.

Cool, another good use of an index offset.
Just did a search for the discussion we had when I first suggested it; can you believe it was 2 years ago? Still nothing has been added to the API.
http://www.opengl.org/discussion_boards/ubb/ultimatebb.php?ubb=get_topic;f=3;t=012219

No more buffer setup is needed.
And is basic buffer setup (i.e. the buffer object(s) are not changing, so they’re already loaded, and you’re just dropping a few tokens into the bitstream) an actual performance issue?

If there were functionality to set an index offset, I could have stored all the data on the GPU.
Unless, of course, the GPU didn’t support it, in which case it’d just be the API making the gl*Pointer calls for you.

There’s no guaranteed performance benefit from having this offset. It may be just as heavyweight as the gl*Pointer calls, in which case you have gained nothing.

But you could gain a lot. If the GL driver were a simple GL-to-D3D wrapper (just if), it would be searching for a matching vertex-declaration format, sending the vertex-declaration change in the bitstream, configuring the streaming setup and so on, even though the application knows it’s the same vertex format, just a different area of the vertex buffer.

I don’t get it, Korval; you were in favour of this when it was last discussed. What made you change your mind?
We know this offset is already in hardware because of the existence of the parameter in d3d->DrawIndexedPrimitive().
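For reference, the Direct3D 9 call in question; the offset is the BaseVertexIndex parameter, which is added to every index before the vertices are fetched:

```cpp
// IDirect3DDevice9 method, as documented for Direct3D 9
HRESULT DrawIndexedPrimitive(
    D3DPRIMITIVETYPE PrimitiveType,
    INT              BaseVertexIndex,   // the offset being discussed here
    UINT             MinVertexIndex,
    UINT             NumVertices,
    UINT             StartIndex,
    UINT             PrimitiveCount);
```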

I don’t get it, Korval; you were in favour of this when it was last discussed. What made you change your mind?
I’m not against it; what I’m against is getting your hopes up that this is going to be a big performance win.

Maybe it will be, maybe it won’t. There’s no way to know (which I said on the original thread), and the people who do know won’t say anything about it.

If it’s there, I’ll use it. But if it’s not, I won’t complain about its absence.

We know this offset is already in hardware because of the existence of the parameter in d3d->DrawIndexedPrimitive().
A fair point.

Originally posted by knackered:

We know this offset is already in hardware because of the existence of the parameter in d3d->DrawIndexedPrimitive().

Existence of that parameter does not mean that the hardware must have explicit support for it. In the case of an indexing offset, that feature can be easily emulated by the driver. All the driver has to do is advance the address it gives to the hardware as the buffer’s beginning by offset * stride. Since many things that might need to be validated were already validated when the buffer was initially selected, a change of offset might be significantly cheaper than the full bind.
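Illustrative only: how a driver could emulate the offset without any dedicated hardware support. Every type and function name here is invented for the sketch:

```cpp
#include <cstdint>

struct VertexStream  { uint64_t gpuBaseAddress; uint32_t stride; };
struct DriverContext { VertexStream streams[16]; uint32_t numStreams; };

// Hardware-specific write of a stream's start address (declaration only).
void ProgramStreamAddress(DriverContext* ctx, uint32_t stream, uint64_t address);

void SetBaseVertexOffset(DriverContext* ctx, uint32_t offset)
{
    for (uint32_t i = 0; i < ctx->numStreams; ++i)
    {
        const VertexStream& s = ctx->streams[i];
        // Formats and strides were already validated at bind time; only the start
        // address moves, so this can be much cheaper than a full rebind.
        ProgramStreamAddress(ctx, i, s.gpuBaseAddress + (uint64_t)offset * s.stride);
    }
}
```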

Originally posted by opengl_fan:

I am just wondering if it’s possible to use function “glVertexAttribPointer” to set vertex data for conventional attributes.

It is not possible. You have to use the gl*Pointer functions corresponding to the individual conventional attributes.
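For example (the stride, byte offsets, and attribute name are placeholders for whatever layout and shader are actually used):

```cpp
const GLsizei stride = 44;          // assumed interleaved layout: pos + normal + uv + tangent
const char*   base   = (const char*)0;   // byte offset into the bound VBO

// Conventional attributes go through their own entry points:
glVertexPointer(3, GL_FLOAT, stride, base + 0);     // feeds gl_Vertex
glNormalPointer(GL_FLOAT, stride, base + 12);       // feeds gl_Normal
glTexCoordPointer(2, GL_FLOAT, stride, base + 24);  // feeds gl_MultiTexCoord0

// glVertexAttribPointer only drives generic attributes declared in the shader:
GLint tangentLoc = glGetAttribLocation(program, "tangent");
glVertexAttribPointer(tangentLoc, 3, GL_FLOAT, GL_FALSE, stride, base + 32);
```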

Originally posted by Komat:
Originally posted by opengl_fan:

I am just wondering if it’s possible to use function “glVertexAttribPointer” to set vertex data for conventional attributes.

It is not possible. You have to use the gl*Pointer functions corresponding to the individual conventional attributes.
Thanks. I figured out the answer myself too and deleted my naive post :slight_smile: What confused me is that I found that glGetActiveAttribARB will also return the conventional attribs, though that is good for identifying what kind of vertex format the shader is using (really only possible when the app and the shader share some attribute naming convention).

It would be wonderful if OpenGL could reserve some fixed attribute indices for the conventional attribs, so that we could use just one common API function :slight_smile:
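A sketch of that introspection using the ARB entry points of this era; the "gl_" prefix check reflects what was observed above, and the fixed-size name buffer is an arbitrary choice:

```cpp
#include <cstdio>
#include <cstring>

GLint count = 0;
glGetObjectParameterivARB(program, GL_OBJECT_ACTIVE_ATTRIBUTES_ARB, &count);

for (GLint i = 0; i < count; ++i)
{
    char   name[256];
    GLint  size = 0;
    GLenum type = 0;
    glGetActiveAttribARB(program, i, sizeof(name), NULL, &size, &type, name);

    // Conventional attribs come back with a "gl_" prefix (gl_Vertex, gl_Normal, ...);
    // generic ones carry whatever name the app/shader convention dictates.
    bool conventional = (strncmp(name, "gl_", 3) == 0);
    printf("attrib %d: %s (%s)\n", i, name, conventional ? "conventional" : "generic");
}
```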

Existence of that parameter does not mean that the hardware must have explicit support for it.
Wow, I was all set to rebut this, but then I realized the fault in my logic.

D3D already has a gigantic performance penalty from every Draw* call. A little thing like offsetting the pointer to the bound vertex buffers would be pretty meaningless compared to a switch to kernel mode.

In short, it could be emulated without the performance penalty seen in OpenGL, because OpenGL is already faster :wink:

So, I guess it goes back to the “We don’t have enough information, but would like to have it if it’s available,” stand.

So far, OpenGL’s philosophy seems to be that if a feature is useful and a driver could emulate it, then the feature is added. That way, on hardware that does support it you benefit from it, and on everything else it’s at least not a performance issue.

Also, when we have a feature that’s very useful but not yet widely supported in hardware, the vendors will think about implementing it directly in their hardware.

Even if D3D sometimes needs to emulate it, there is still the possibility that some hardware can speed it up.

When VBOs were introduced, one selling point was that switching VBOs should be “lightweight”. OK, so far so good, but if the gl*Pointer calls are still heavyweight and we need to call them every time we switch a VBO, then it’s all a bit pointless, IMO.

Jan.

Originally posted by Korval:
D3D already has a gigantic performance penalty from every Draw* call. A little thing like offsetting the pointer to the bound vertex buffers would be pretty meaningless compared to a switch to kernel mode.

In short, it could be emulated without the performance penalty seen in OpenGL, because OpenGL is already faster :wink:

So, I guess it goes back to the “We don’t have enough information, but would like to have it if it’s available,” stand.
It was my impression that D3D only supports what the hardware supports. I’m guessing that since this version of DrawIndexedPrimitive was added in DX9, all DX9 GPUs support offsetting.
On lesser GPUs, they may emulate it.

In general, D3D (the HAL) emulates almost nothing. The REF device is the one that emulates.

Also, people keep repeating that glVertexPointer is expensive. I only saw an NV document say it’s expensive for their own drivers, and that was long ago. It may have been 3 years ago.

Even if glVertexPointer did not cause any expensive operation to happen on the GPU, it’s very rare that people render meshes from only one attribute (position), so in order to respecify a position in the vertex buffer the app has to issue 4 or 5 gl*Pointer calls. Now, considering most apps are CPU-bound, is it not a good idea to reduce the call overhead when switching meshes?
idr stated in the other thread that glVertexPointer is almost the equivalent of swapping textures. I can imagine that there’s a unit on the GPU that asynchronously caches the next part of the currently specified area of the vertex buffer, so a switch to another area of the vertex buffer is going to mean interrupting that parallel process and telling it of the change, whereas encountering an index that is beyond the cache is a much more natural and efficient mechanism, sort of autonomous… I’m guessing here, blindly guessing.
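For illustration, the kind of per-mesh call churn being described; the Mesh type, byteOffset field, and 36-byte interleaved layout are made up:

```cpp
#include <cstddef>

struct Mesh { size_t byteOffset; };   // where this mesh's vertices start inside the shared VBO

void SelectMeshInSharedVBO(const Mesh& m)
{
    const GLsizei stride = 36;
    const char*   base   = (const char*)0 + m.byteOffset;   // byte offset into the bound VBO

    // Even with one big VBO bound, every mesh switch currently re-specifies all the
    // attribute pointers just to move the base offset:
    glVertexPointer(3, GL_FLOAT, stride, base + 0);
    glNormalPointer(GL_FLOAT, stride, base + 12);
    glColorPointer(4, GL_UNSIGNED_BYTE, stride, base + 24);
    glTexCoordPointer(2, GL_FLOAT, stride, base + 28);
    // ...four or five calls in total, all of which a single index-offset call could replace.
}
```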

It was my impression that D3D only supports what the hardware supports. I’m guessing that since this version of DrawIndexedPrimitive was added in DX9, all DX9 GPUs support offsetting.
That’s a possibility, but there’s no guarantee of that. The driver itself can emulate the functionality by offsetting the pointers before actually executing the draw calls. The D3D layer doesn’t need to know anything about it.

Also, people keep repeating that glVertexPointer is expensive. I only saw an NV document say it’s expensive for their own drivers, and that was long ago. It may have been 3 years ago.
That’s a fair point too; we don’t have recent tests that demonstrate the problem. And even when we did, it was limited to nVidia hardware/drivers.

If the case for this is going to be made to the ARB, it’d be a good idea to have some actual profiling data (and appropriate test cases) to show the IHVs.