Poly Transformation with Octree/Frustum culling

Greets all, first post here. I’m a software engineer at EA Canada, mostly C/C++ stuff; I’ve done a moderate amount of work with OpenGL and am starting to do more. Here’s my Q:

I’m implementing Octree + Frustum culling for viewing of static + dynamic geometry in a scene. For the dynamic geometry (i.e. position and orientation constantly changing), I would need to transform the vertices of the models so that they can properly be inserted into the octree struct and culled against the frustum (if I’m wrong here, someone please tell me). Should I build my own transformation matrix for each model and then transform the polys with it before inserting them into the octree? Or is there a better way to do this?

Thanks in advance for any help you can provide.

If what you’re talking about is regular
moveable animatable meshes, why do you need
to cull them at all? All you need to do is
decide if some part of the thing (or its
bounding sphere) intersects the viewing frustum.

Assuming they do, just throw 'em all at the
card and be happy. The extra transform work
(even if you don’t have HT&L) is fairly
minimal, and may be less than the expense of
trying to keep an octree up to date. With
HT&L, it certainly WILL be cheap to render
rather than update the tree, unless you have
massive polygon areas with repeated fills.
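The bounding-sphere test bgl describes boils down to six signed-distance checks. A minimal sketch, assuming the frustum planes have already been extracted and normalized with their normals pointing into the frustum (the struct and function names here are made up for illustration):

```cpp
#include <cassert>

struct Plane { float a, b, c, d; };   // normalized plane: a*x + b*y + c*z + d = 0

// True if the sphere (center cx,cy,cz, radius r) is at least partly inside
// the frustum.  Assumes all six plane normals point into the frustum.
bool sphereInFrustum(const Plane planes[6], float cx, float cy, float cz, float r)
{
    for (int i = 0; i < 6; ++i) {
        float dist = planes[i].a * cx + planes[i].b * cy
                   + planes[i].c * cz + planes[i].d;
        if (dist < -r)                // sphere entirely behind this plane
            return false;             // culled
    }
    return true;                      // inside or intersecting
}
```

If the sphere straddles a plane it still returns true, which is what you want: a partly visible mesh has to be drawn anyway.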

Originally posted by bgl:
[b]If what you’re talking about is regular
moveable animatable meshes, why do you need
to cull them at all? All you need to do is
decide if some part of the thing (or its
bounding sphere) intersects the viewing frustum.

Assuming they do, just throw 'em all at the
card and be happy. The extra transform work
(even if you don’t have HT&L) is fairly
minimal, and may be less than the expense of
trying to keep an octree up to date. With
HT&L, it certainly WILL be cheap to render
rather than update the tree, unless you have
massive polygon areas with repeated fills.[/b]

Ok, thanks, that’s what I thought, but I wasn’t 100% sure. I’m already using the octree for the static geometry and sphere/frustum culling for the moving models, so I’ll leave it as it is.

As a side note, I know that state changes are bad, and I’ve heard that frequent texture binding per frame is very bad. Does making a list of polys to render, adding them to the list in order of texture so as to minimize texture binding, sound decent? Are there any other common state changes that are particularly bad?
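For what it’s worth, the texture-sorted list can be sketched like this (the Tri record and its field names are hypothetical, just standing in for whatever the renderer actually stores per poly):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical per-poly record: the texture object it uses plus whatever
// the renderer needs to actually draw it.
struct Tri { unsigned texId; int firstVert; };

// Sort so polys sharing a texture end up contiguous; the draw loop can then
// call glBindTexture once per run instead of once per poly.
void sortByTexture(std::vector<Tri>& tris)
{
    std::stable_sort(tris.begin(), tris.end(),
        [](const Tri& a, const Tri& b) { return a.texId < b.texId; });
}
```

stable_sort keeps the original order within each texture run, which matters if you also care about submission order for other reasons.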

Thanks bgl for the help!

I think that changing light settings is the next worst thing to changing textures.

Making a list of triangles with the same texture is a good idea, and you probably would want to maximize the re-use of vertices at the same time.

j

j’s on the ball here. In order of cost, it
goes something like:
Uploading textures (glTex(Sub)Image)
Binding textures
Changing lighting

The specifics do vary from implementation
to implementation, and on what your limiting
factor is (fill rate, texture upload, geometry
upload, call overhead, …) so judicious
benchmarking with different target hardware
is recommended.

Originally posted by bgl:
[b]j’s on the ball here. In order of cost, it
goes something like:
Uploading textures (glTex(Sub)Image)
Binding textures
Changing lighting

The specifics do vary from implementation
to implementation, and on what your limiting
factor is (fill rate, texture upload, geometry
upload, call overhead, …) so judicious
benchmarking with different target hardware
is recommended.[/b]

Perfect, thanks guys. I was thinking about segregating lit/unlit polys as well; thanks for the confirmation that that’s a good idea.

What I had in mind was this: making three lists, one of normal lit polys, one of normal unlit polys (for my purposes there will be a decent number of them), and one of transparent polys. I’d go through all the scene objects and, after octree/frustum culling, add all visible polys to one of the three lists, each sorted by texture number, with the transparent list also depth-sorted for proper blending. I’m just looking into using vertex arrays and glDrawElements; I’ve never used them before, but I’ve read they provide a nice performance increase over good ol’ glBegin/glEnd triangles. Is there anything better than VA’s and glDrawElements?
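A side note on the transparent list: for blending to look right it has to go out back-to-front. A minimal depth-sort sketch (the TransTri record is hypothetical; depth would be the eye-space depth of the poly’s centroid):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical record: eye-space depth of the poly's centroid plus an id
// standing in for whatever the renderer needs to draw it.
struct TransTri { float depth; int id; };

// Blended polys must be drawn farthest-first, so sort by descending depth.
void depthSortBackToFront(std::vector<TransTri>& tris)
{
    std::sort(tris.begin(), tris.end(),
        [](const TransTri& a, const TransTri& b) { return a.depth > b.depth; });
}
```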

Thanks for the help, greatly appreciated!

Is there anything better than VA’s and glDrawElements?

If you want to go hardware specific, GeForce hardware has an extension (NV_vertex_array_range) that lets you store your vertex arrays in video or AGP memory, and it is the fastest geometry transfer method right now.

Other than that, if your geometry is static, display lists would probably be faster than VA’s.

If you are not reusing vertices at all, glDrawArrays would be a bit faster than glDrawElements.

j

Originally posted by j:
[b] If you want to go hardware specific, GeForce hardware has an extension that lets you store your vertex arrays in video or AGP memory, and it is the fastest geometry transfer method right now.

Other than that, if your geometry is static, display lists would probably be faster than VA’s.

If you are not reusing vertices at all, glDrawArrays would be a bit faster than glDrawElements.

j
[/b]

Can’t use display lists; I have my own customized lighting model and it doesn’t work with display lists.

So glDrawArrays would be better than glDrawElements for geometry with changing vertices? Do you know how that works? (I like to know the nuts and bolts)

glDrawArrays is better if you are not re-using any vertices in the model, because all it does is go straight through the array and render the vertices in order. The best case for DrawArrays would be a whole bunch of separate triangles, like in a particle system.

glDrawElements is best if you share vertices between polygons. For example, in a landscape mesh, you would use glDrawElements, because each vertex in the landscape is part of several triangles. What you need to remember to take the best advantage of this is that if you reuse a vertex, do it as soon after the first use as possible, which keeps it in the vertex cache. Otherwise you will not benefit that much. As an example, on a GeForce card, when glDrawElements is used, the GPU keeps vertices it has calculated in a cache. However, the cache holds only ten vertices at a time, so you can’t draw a whole lot of vertices, draw the first one again, and expect the GPU to find it in cache.

I hope that made sense.

For a model with animated vertices, the same basic rules apply. glDrawElements if polygons share vertices between them, glDrawArrays if no vertices are shared.

j
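j’s landscape example is easy to see in index terms: in a regular grid, every interior vertex belongs to up to six triangles, which is exactly the reuse glDrawElements exploits. A sketch of building such an index array, assuming a row-major (w+1) x (h+1) vertex layout and two triangles per grid cell (the function name is made up):

```cpp
#include <cassert>
#include <vector>

// Build a triangle index list for a (w+1) x (h+1) vertex grid -- the kind
// of mesh where indexed drawing pays off, since interior vertices end up
// shared by up to six triangles.
std::vector<unsigned> gridIndices(int w, int h)
{
    std::vector<unsigned> idx;
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            unsigned i = y * (w + 1) + x;            // top-left of this cell
            // two CCW triangles per cell
            idx.push_back(i);     idx.push_back(i + w + 1); idx.push_back(i + 1);
            idx.push_back(i + 1); idx.push_back(i + w + 1); idx.push_back(i + w + 2);
        }
    return idx;      // hand this to glDrawElements(GL_TRIANGLES, ...)
}
```

Note how consecutive cells reuse indices they just emitted, which is the “reuse a vertex as soon after the first use as possible” pattern that keeps it in the vertex cache.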

Originally posted by j:
[b]glDrawArrays is better if you are not re-using any vertices in the model, because all it does is go straight through the array and render the vertices in order. The best case for DrawArrays would be a whole bunch of separate triangles, like in a particle system.

glDrawElements is best if you share vertices between polygons. For example, in a landscape mesh, you would use glDrawElements, because each vertex in the landscape is part of several triangles. What you need to remember to take the best advantage of this is that if you reuse a vertex, do it as soon after the first use as possible, which keeps it in the vertex cache. Otherwise you will not benefit that much. As an example, on a GeForce card, when glDrawElements is used, the GPU keeps vertices it has calculated in a cache. However, the cache holds only ten vertices at a time, so you can’t draw a whole lot of vertices, draw the first one again, and expect the GPU to find it in cache.

I hope that made sense.

For a model with animated vertices, the same basic rules apply. glDrawElements if polygons share vertices between them, glDrawArrays if no vertices are shared.

j[/b]

Great explanation, thanks a lot j!

Where can one read about the caches of the nvidia cards? I mean, how do you know that the GeForce holds the ten most recent vertices in cache?

Oh, and would it be possible to say that a certain array of vertices will stay in the same order over several frames, so the cache could switch to an LRU mode? That way the card would know which indices of the array are best to cache.

Read the GeForce OpenGL performance FAQs (version 2) at their website. They contain a great deal of information, including how to make the best use of the GeForce vertex cache, optimal texture formats, etc…

Sven

Well, I have found my GeForce2 card to be mind-blowing enough that I don’t get that low-level with vertex caches.

In the GeForce I have found that, from fastest to slowest, the primitives are:

-begin/end GL_TRIANGLE_STRIP in Display Lists (Jesus!!!)
-glDrawElements(GL_TRIANGLE_STRIP) (still pretty fast)
-begin/end GL_TRIANGLE_STRIP (fast)
-glDrawElements(GL_TRIANGLES) (pretty inefficient)
-begin/end GL_TRIANGLES (embarrassing)

I must point out that even in the slower modes the geforce2 kills cards without T&L rendering high polygon count models.

What about glDrawArrays( GL_TRIANGLE_STRIPS ) ?

That would be nice, if it existed (GL_TRIANGLE_STRIPS). This could be a new topic for the “Suggestions for new features” thread… (a lot less overhead than the glBegin/glEnd-intensive GL_TRIANGLE_STRIP).

There would need to be an additional array with indices of the starting and end points of the strips.

yep

> That would be nice, if it existed (GL_TRIANGLE_STRIPS). This could be a new topic for the “Suggestions for new features” thread…

You are late…

GL_EXT_multi_draw_arrays
GL_IBM_multimode_draw_arrays

Do the nVidia drivers support it? Can’t check for it right now in my engine, sitting in front of a different computer.