30.000 triangles

Hi all,

I am rendering a scene that contains 30.000 triangles.

  • I use glDrawElements() and display lists

On my GeForce 4 I get 100-200 FPS, but on TNT2 cards I only get 1 to 3 FPS. I tried without textures and the results are the same.

Are 30.000 polys enough to bring a TNT2 to its knees? How can I make the scene render faster? (I know about BSP, but I don’t know where to download some useful code.)

Thanks.

The easiest way is to use some sort of culling. Try view frustum culling and a bounding volume hierarchy.
For further optimization it’s important to know where in the pipeline the bottleneck is (application, geometry or rasterization), and to try to optimize that part.

//Ninja
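
For anyone who wants a starting point, here is a minimal sketch of the view frustum test suggested above, using one bounding sphere per chunk of the scene. The names `Plane`, `Sphere`, `sphereInFrustum` and `visibleChunks` are made up for illustration, and the sketch assumes you already have the six frustum planes with normalized, inward-pointing normals. It is not tied to any particular engine and does no OpenGL calls itself.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Plane nx*x + ny*y + nz*z + d = 0. Assumption: the normal is normalized
// and points toward the inside of the view frustum.
struct Plane { float nx, ny, nz, d; };

// Hypothetical bounding sphere around one chunk of the scene.
struct Sphere { float x, y, z, radius; };

// Returns true if the sphere is at least partly inside all six planes.
bool sphereInFrustum(const std::array<Plane, 6>& frustum, const Sphere& s) {
    for (const Plane& p : frustum) {
        float dist = p.nx * s.x + p.ny * s.y + p.nz * s.z + p.d;
        if (dist < -s.radius) return false;  // completely behind this plane
    }
    return true;
}

// Collects the indices of the chunks that survive culling; only these
// chunks would then be submitted for drawing.
std::vector<std::size_t> visibleChunks(const std::array<Plane, 6>& frustum,
                                       const std::vector<Sphere>& bounds) {
    std::vector<std::size_t> out;
    for (std::size_t i = 0; i < bounds.size(); ++i) {
        if (sphereInFrustum(frustum, bounds[i])) out.push_back(i);
    }
    return out;
}
```

A bounding volume hierarchy applies the same test to parent volumes first, so a whole subtree can be rejected with a single check.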

My guess is it’s mostly time spent with the CPU doing T&L.

mtm

I use colored vertices, no lights, no normals, no transformations, only the vertex arrays and the camera transformation.

I tried just now on an ATI Radeon with 64 MB and I got 4 FPS.

I am frustrated. What’s the matter? Are 30.000 triangles enough to kill a “standard” graphics card, or what am I missing?

Ninja is right. You should use some kind of culling to render your scene in smaller chunks. You are choking the card! I also have a TNT and I have seen that to maintain a constant 60 fps display you can’t send more than 3000-4000 triangles a second. Visit gametutorials at http://www.gametutorials.com for info on how to implement an octree or a BSP tree.

I have seen that to maintain a constant 60 fps display you can’t send more than 3000-4000 triangles a second.
Hehe. That’s 50-67 tris/frame. I think you mean 3000-4000 tris/FRAME.

Hey, I could hand-paint the polygons on paper faster than that :) I’m sure a TNT is faster than me :)

Y.

A TNT2 is faster than that; you should get more than 1 to 3 fps for 30k triangles. This most likely has something to do with how you pass the geometry (given that you don’t do much else). Try compiling your display lists in immediate mode, or using glDrawArrays as opposed to glDrawElements. Do you stripify the geometry?
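
To try a non-indexed draw instead of an indexed one, the indexed mesh first has to be expanded into a flat vertex list. A rough sketch of that step is below; `Vertex` and `flattenIndexed` are illustrative names, not anything from the original poster's code, and the sketch deliberately contains no OpenGL calls.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical position-only vertex; add color fields as needed.
struct Vertex { float x, y, z; };

// Expands an indexed triangle list into a flat one: output vertex i is
// vertices[indices[i]]. The result is laid out the way a non-indexed
// draw (for example glDrawArrays with GL_TRIANGLES) reads its data.
std::vector<Vertex> flattenIndexed(const std::vector<Vertex>& vertices,
                                   const std::vector<unsigned int>& indices) {
    std::vector<Vertex> flat;
    flat.reserve(indices.size());
    for (unsigned int idx : indices) {
        // at() throws std::out_of_range on a bad index instead of
        // silently reading past the end of the vertex array.
        flat.push_back(vertices.at(idx));
    }
    return flat;
}
```

The flat list is larger than the indexed one because shared vertices are duplicated, so this is only a quick way to check whether the indexed path is what is slow on a given driver.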

I will try glDrawArrays instead of glDrawElements, as you suggest.

I use display lists. What do you mean by an immediate-mode display list?

Thanks for the replies.

Is your TNT2 running at an absurdly high resolution? Running out of texture memory perhaps?

– Tom

[This message has been edited by Tom Nuydens (edited 10-08-2002).]

If he were fill limited, the GeForce 4 wouldn’t do 200 fps. It can’t be a texture memory problem either, because he gets the same performance without texturing.

Immediate mode means passing geometry using glVertex etc. calls rather than using vertex arrays. I would probably try this first. I think there are some issues with display lists and vertex arrays (I think you have to be selective about what goes in the display list?)

Cheers,
Madoc

I have some performance problems on some systems too. I don’t know why, but it’s like that: http://openair.free.fr/public/TestProg.zip
It runs @ 60 FPS on a TNT2/ATHLON 700 and @ 24 on a GF2GTS/ATHLON 1000…
Crazy isn’t it?

I’m guessing fill rate. To begin with, the fill rate on the gf4 smokes the gf2 (I would imagine the gf2 smokes the TNT). Second, the gf4 does a depth check before the fragment ops, right? That may help a little too.

John.

Btw: I compare the gf4 and gf2 because I have never used a TNT (sometimes I’m not very clear).

[This message has been edited by john_at_kbs_is (edited 10-08-2002).]

My first guess would be that you’re using an OpenGL feature that isn’t supported by TNT, and the driver is falling back to software. OpenGL has evolved to 1.4, but TNT is probably closest to 1.1 in capability. In particular, features like cube maps, 3D textures, shadow maps, and much, much more, are not supported on TNT hardware.

Even so, there are “new” features in 1.4 that are supported natively on TNT hardware.

In a case like this, I usually recommend the 5-step program:

Step 1: Obtain Intel’s VTune
Step 2: Install Intel’s VTune
Step 3: Run Intel’s VTune while your app is running
Step 4: Intel’s VTune will tell you where you spend all the time
Step 5: You have to figure out how not to spend time there

Good luck!

Well,

I’ve tested the application with display lists built from immediate-mode triangles. On the TNT2 I now get 8 FPS (4 times more than before). I will try to test on the newer ATI; I hope I at least get a respectable FPS number there now.

Surely the 30.000 triangles are what slows everything down on old cards without T&L.

I would be happy to find some useful C++ source code for building and rendering BSP trees.

Thanks to all.

What does your scene look like? Is it a big open area or a series of rooms? Try to visualize the scene: if you’re looking at a wall, how many walls are behind it? Are you rendering all of those walls?

FYI: I can bring the frame rate on my gf2 down to 20 fps with just 4x overdraw across the whole screen (1024x768x32).

John.
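
To put the overdraw figure above in numbers, here is a small back-of-the-envelope helper. The function name `pixelsPerSecond` is made up for this example, and it only counts color writes, ignoring depth reads and writes and texture fetches.

```cpp
#include <cstdint>

// Pixels written per second = width * height * overdraw * frames per second.
constexpr std::uint64_t pixelsPerSecond(std::uint64_t width,
                                        std::uint64_t height,
                                        std::uint64_t overdraw,
                                        std::uint64_t fps) {
    return width * height * overdraw * fps;
}

// The case quoted above: 1024x768, 4x overdraw, 20 fps.
constexpr std::uint64_t quotedCase = pixelsPerSecond(1024, 768, 4, 20);
static_assert(quotedCase == 62914560ULL, "about 63 million pixels per second");
```

So 4x full-screen overdraw at 20 fps is roughly 63 million pixel writes per second at that resolution, which gives a feel for how quickly hidden walls eat into fill rate.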

Originally posted by jwatte:
[b]In a case like this, I usually recommend the 5-step program:

Step 1: Obtain Intel’s VTune
Step 2: Install Intel’s VTune
Step 3: Run Intel’s VTune while your app is running
Step 4: Intel’s VTune will tell you where you spend all the time
Step 5: You have to figure out how not to spend time there

Good luck![/b]

Great Advice!!! For OpenGL programs, I would also like to add:
Step 6: Replace glVertex, glNormal calls with glColor to find out if you are app-limited
Step 7: Disable lighting, to see if you’re transform-limited
Step 8: remove glColor, glNormal to see if AGP is the bottleneck
Step 9: Render at a lower resolution to see if you’re fill-limited
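
The idea behind steps 6 to 9 is the same each time: take work away from one stage, measure the frame time again, and see whether it drops. A small sketch of that comparison follows. `Experiment`, `likelyBottleneck`, the stage labels and the 10% threshold are all assumptions made for illustration; the frame times would come from your own measurements.

```cpp
#include <string>
#include <vector>

// One measurement taken with the work of a single stage reduced.
struct Experiment {
    std::string stage;  // e.g. "fill" for the lower-resolution run
    double frameMs;     // measured frame time in milliseconds
};

// Returns the stage whose experiment sped the frame up the most, or
// "none" if no run beat the baseline by more than minGain (a fraction).
std::string likelyBottleneck(double baselineMs,
                             const std::vector<Experiment>& runs,
                             double minGain = 0.10) {
    std::string best = "none";
    double bestGain = minGain;
    if (baselineMs <= 0.0) return best;  // no usable baseline
    for (const Experiment& e : runs) {
        double gain = (baselineMs - e.frameMs) / baselineMs;
        if (gain > bestGain) {
            bestGain = gain;
            best = e.stage;
        }
    }
    return best;
}
```

For example, if the baseline is 500 ms per frame, the step 6 run drops to 125 ms and the step 9 run stays at 495 ms, `likelyBottleneck(500.0, {{"application", 125.0}, {"fill", 495.0}})` returns "application".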

Step 1: Obtain Intel’s VTune
Step 2: Install Intel’s VTune
Step 3: Run Intel’s VTune while your app is running
Step 4: Intel’s VTune will tell you where you spend all the time
Step 5: You have to figure out how not to spend time there
I would like to add two steps:
Step 0: Get high-bandwidth connection (demo is 107.75 Mb)
Step 0.5: Pay $700 license fee.

Step 0: Get high-bandwidth connection (demo is 107.75 Mb)
Step 0.5: Pay $700 license fee.

You ain’t got broadband? Ring Intel up and try asking for a demo CD. It might work!

OT, but VTune is amazing, and the Intel C compiler produces (in my cases) code that’s regularly 10-20% faster than MSVC++. That’s just plain vanilla (no SIMD) code.

But the price is high.