Improve performance for OBJ models?

I have a complex OBJ model as part of the geometry I need to render (a heightmap plus some OBJ models placed in the terrain). The model is large enough that even though I use display lists for the OBJ models and keep the terrain in a VBO, rendering only manages around 10 FPS on an 8600GT.

Since only one model is very complex, that model alone drops the frame rate from ~50-60 FPS to ~10 FPS, so I was wondering: what should I do to optimize rendering performance?

Is there some algorithm or technique, readily available and not too hard to implement, that can simplify an OBJ model? I already use GL_CULL_FACE and don’t know what else to try.

Why are you using display lists for the OBJ models and not a VBO as well? Are you using GL_POLYGON? If you have to stick with display lists, you might get some speedup by converting everything to quads or tris so you only need one glBegin/glEnd call.
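A minimal sketch of that idea: fan-triangulate each (convex) face while compiling the list, so the whole model fits in a single glBegin/glEnd pair. The Face struct and the model arrays here are hypothetical stand-ins for however you store the parsed OBJ data:

#include <GL/gl.h>

/* Hypothetical storage for a parsed OBJ face: vertex count plus positions. */
typedef struct { int nverts; float (*v)[3]; } Face;

GLuint buildModelList(const Face *faces, int nfaces)
{
    GLuint list = glGenLists(1);
    glNewList(list, GL_COMPILE);
    glBegin(GL_TRIANGLES);  /* one begin/end for the whole model */
    for (int f = 0; f < nfaces; ++f) {
        /* fan triangulation (0,1,2), (0,2,3), ... is valid for convex faces */
        for (int i = 1; i + 1 < faces[f].nverts; ++i) {
            glVertex3fv(faces[f].v[0]);
            glVertex3fv(faces[f].v[i]);
            glVertex3fv(faces[f].v[i + 1]);
        }
    }
    glEnd();
    glEndList();
    return list;
}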

As with all pipeline optimization, always ALWAYS determine which stage you’re bottlenecked in first, so that you optimize the actual bottleneck. It’s very frustrating to spend time optimizing something you know could be done faster, only to find that the resulting code doesn’t run any faster at all!

And when it’s “fast enough”, stop! You can always optimize more. But why bother if you don’t need to (unless you’re just bored :wink: ).

Here’s a pointer to an old presentation that describes this. The modern pipeline is more complex and uses unified shaders, but a lot of the same basic principles still apply:

GDC06: OpenGL Performance (Hart, ATI)

See pg. 8.

One of the easiest things you can do first: shrink your window. If performance improves, then it’s likely you’re fill limited for some reason (frag shader, frag tex fetch, framebuffer, etc.). Note that I’m assuming you always update your frustum to fit the same FOV within the resized window, and don’t do any LODing based on pixel coverage per batch (probably not, so just ignore that mention if it doesn’t make sense).
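In case the “same FOV” note is unclear, a minimal sketch of what that means in fixed-function GL (fovY, zNear, and zFar are placeholders for whatever your app already uses):

#include <GL/gl.h>
#include <GL/glu.h>

static double fovY = 60.0, zNear = 0.1, zFar = 1000.0;  /* placeholders */

/* Same vertical FOV at any window size, so only the fill cost changes. */
void resizeViewport(int w, int h)
{
    glViewport(0, 0, w, h);
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    gluPerspective(fovY, (double)w / (double)h, zNear, zFar);
    glMatrixMode(GL_MODELVIEW);
}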

If that didn’t make a difference, try using fewer verts in your mesh but the same number of batches. If that improves things, it’s likely you’re vertex bound. Very useful for visualizing these kinds of problems is rendering your scene in wireframe mode:

glPolygonMode(GL_FRONT_AND_BACK, GL_LINE);

Are there an insane number of tiny triangles?

If that didn’t make a difference, try using fewer batches. If that improves things, then you’re likely CPU bound (batches/state changes/etc.).

Do all this on the exact same view of the object. Don’t go changing anything except what you need to vary for the specific test.

Are you doing view frustum culling? If unsure, rotate the camera so that no objects are in view. Does your frame rate go insanely high? If so, then you probably are, which is good.

I’m using GL_TRIANGLE_STRIP. The reason I don’t use a VBO for the OBJ models is that I found a huge performance penalty when doing multiple glDrawArrays() calls on a VBO with split-up offsets: since the various faces of the OBJ use different textures, I can’t get by with a single glDrawArrays() call; I have to constantly bind a new texture and then issue another glDrawArrays() for the respective offset range.
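For concreteness, one common mitigation for this pattern: if the faces are sorted by material at load time, each texture owns one contiguous range of the VBO, so the per-frame cost drops to one bind plus one glDrawArrays() per texture. A sketch only, assuming GL 1.5+, independent triangles rather than strips, and an interleaved position+texcoord layout:

#include <GL/gl.h>

/* One draw range per texture, built once at load time after sorting
 * the OBJ faces by material (hypothetical struct, not from the thread). */
typedef struct { GLuint tex; GLint first; GLsizei count; } Range;

void drawModel(GLuint vbo, const Range *ranges, int nranges)
{
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glEnableClientState(GL_VERTEX_ARRAY);
    glEnableClientState(GL_TEXTURE_COORD_ARRAY);
    /* interleaved layout assumed: 3 floats position, 2 floats texcoord */
    glVertexPointer(3, GL_FLOAT, 5 * sizeof(float), (void *)0);
    glTexCoordPointer(2, GL_FLOAT, 5 * sizeof(float),
                      (void *)(3 * sizeof(float)));
    for (int i = 0; i < nranges; ++i) {
        glBindTexture(GL_TEXTURE_2D, ranges[i].tex);
        glDrawArrays(GL_TRIANGLES, ranges[i].first, ranges[i].count);
    }
    glDisableClientState(GL_TEXTURE_COORD_ARRAY);
    glDisableClientState(GL_VERTEX_ARRAY);
    glBindBuffer(GL_ARRAY_BUFFER, 0);
}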

Thanks for the reply.

It seems that it is at least somewhat CPU bound: if I change the CPU frequency from 2.4 GHz to 800 MHz there is a huge performance drop, from 10-11 FPS to 3-4 FPS.

Fill limitation: when resizing the window from full screen to very tiny, there is no difference in FPS.

View frustum culling: I don’t do this, unfortunately. I’m trying to find some material on how to do it, as it makes sense that this should improve performance a lot.
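One common approach that material on this covers: extract the six frustum planes from the combined modelview-projection matrix, then test each object’s bounding sphere against them. A sketch under those assumptions, not code from this thread:

#include <math.h>
#include <GL/gl.h>

static float planes[6][4];   /* a, b, c, d for each frustum plane */

/* Rebuild the planes whenever the camera or projection changes.
 * Matrices are column-major, exactly as glGetFloatv returns them. */
void extractFrustumPlanes(void)
{
    float p[16], mv[16], m[16];
    glGetFloatv(GL_PROJECTION_MATRIX, p);
    glGetFloatv(GL_MODELVIEW_MATRIX, mv);
    for (int c = 0; c < 4; ++c)               /* m = projection * modelview */
        for (int r = 0; r < 4; ++r)
            m[c*4+r] = p[0*4+r]*mv[c*4+0] + p[1*4+r]*mv[c*4+1]
                     + p[2*4+r]*mv[c*4+2] + p[3*4+r]*mv[c*4+3];
    for (int i = 0; i < 6; ++i) {             /* left/right, bottom/top, near/far */
        int row = i / 2;
        float sign = (i % 2 == 0) ? 1.0f : -1.0f;
        for (int j = 0; j < 4; ++j)
            planes[i][j] = m[j*4+3] + sign * m[j*4+row];
        float len = sqrtf(planes[i][0]*planes[i][0]
                        + planes[i][1]*planes[i][1]
                        + planes[i][2]*planes[i][2]);
        for (int j = 0; j < 4; ++j)
            planes[i][j] /= len;
    }
}

/* Nonzero if a sphere at (x,y,z) with radius r may be visible;
 * skip drawing the object when this returns 0. */
int sphereInFrustum(float x, float y, float z, float r)
{
    for (int i = 0; i < 6; ++i)
        if (planes[i][0]*x + planes[i][1]*y + planes[i][2]*z + planes[i][3] < -r)
            return 0;
    return 1;
}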

When I change between glPolygonMode(GL_FRONT_AND_BACK, GL_FILL) and glPolygonMode(GL_FRONT_AND_BACK, GL_LINE) there is no difference in FPS.

Lots of places you can look. A few that spring to mind immediately:

That wasn’t meant as a performance test. It was just so you could see how coarse or how dense your polygon mesh is.