Simple question: Triangles per second


I have a fairly simple question for people with experience on current hardware. I have moved my renderer over to use VBOs; however, it still seems to be vertex limited (my rough test is that whether I use a complex or a simple fragment shader, the performance is nearly identical). I am drawing a whole lot of very small triangles, so the actual fill is quite small.

My question is simple: Does 1 426 474 triangles drawn using GL_TRIANGLES using glDrawRangeElements with interleaved vertex data in one VBO and index data in another VBO sound like a lot? What sort of performance should I be expecting in the ideal scenario? It seems likely that I am missing a bottleneck somewhere and I am not really vertex limited per se, but I figured I should sanity check it since it’s been a while since I was aware of the hardware.

Many Thanks, Sean

If these triangles do not take up much area, then you will get

Your first step in optimizing this should be to turn these triangles into a single triangle strip. There are plenty of programs for doing this. The good ones take into account the post vertex shader cache.
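
To make the post vertex shader cache point concrete, here is a minimal sketch that estimates how many vertex shader runs an index buffer costs per triangle (often called ACMR, average cache miss ratio). The FIFO replacement policy, the `cache_size` parameter, and the name `acmr_fifo` are assumptions for illustration; real GPUs may behave differently, so treat the result as a rough comparison tool between two orderings of the same mesh rather than a prediction.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Estimated vertex-shader invocations per triangle for an indexed triangle
// list, assuming a FIFO post-transform cache of `cache_size` entries.
// (Hypothetical model: the actual cache policy and size are hardware specific.)
double acmr_fifo(const std::vector<std::uint32_t>& indices, std::size_t cache_size) {
    assert(cache_size > 0 && indices.size() % 3 == 0);
    if (indices.empty()) return 0.0;

    std::deque<std::uint32_t> cache;  // oldest entry at the front
    std::size_t misses = 0;
    for (std::uint32_t idx : indices) {
        if (std::find(cache.begin(), cache.end(), idx) == cache.end()) {
            ++misses;                  // vertex must be shaded again
            cache.push_back(idx);
            if (cache.size() > cache_size) cache.pop_front();  // evict oldest
        }
    }
    return static_cast<double>(misses) / static_cast<double>(indices.size() / 3);
}
```

Under this model a list with no shared vertices scores 3.0, and reordering the triangles so that shared vertices are reused while still in the cache pushes the score down. That reuse is what the better stripping and reordering tools are trying to maximize.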

Your next step is LOD. If the object is so small on screen, then you should draw a less detailed version of it.

On a 600MHz GPU, you can draw up to 600 million tris/sec (one triangle set up per clock). Provided that the vtx-shader isn’t very complex, you can easily reach those numbers if all tris get culled by the GPU. Rasterization, changing of shaders, changing of textures and uploading of uniforms all decrease the triangles/sec rate.

Since GPU core clocks aren’t increasing much, the upper bound to keep in mind is about 10 million tris/frame @ 60Hz.
If you’re drawing a huge terrain, use tristrips, but for general meshes indexed triangles can, AFAIK, be faster.
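
The arithmetic behind that per-frame bound can be sketched as below. It assumes the setup rate is a fixed number of triangles per core clock (1 by default), which is a rule of thumb rather than a published spec, and the function names are made up for illustration.

```cpp
#include <cassert>

// Peak triangles per frame, assuming the GPU sets up `tris_per_clock`
// triangles on every core clock (assumption: 1 tri/clk rule of thumb).
double peak_tris_per_frame(double core_clock_hz, double frame_rate_hz,
                           double tris_per_clock = 1.0) {
    assert(frame_rate_hz > 0.0);
    return core_clock_hz * tris_per_clock / frame_rate_hz;
}

// Fraction of that peak budget that a scene of `scene_tris` triangles uses.
double budget_fraction(double scene_tris, double core_clock_hz, double frame_rate_hz) {
    return scene_tris / peak_tris_per_frame(core_clock_hz, frame_rate_hz);
}
```

For example, `peak_tris_per_frame(600e6, 60.0)` gives the 10 million tris/frame figure, and `budget_fraction(1426474.0, 600e6, 60.0)` puts the 1 426 474 triangle scene from the question at roughly 14% of that theoretical peak, if it is drawn once per frame at 60Hz.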

What card are you getting 600 million tris/sec with?!

I mentioned that’s the upper limit :)
On my GF8600GT (540MHz), 480 million tris/sec is easily achievable (but those tris get culled in the benchmark!) with a < 20 macro-instruction vtx shader.

Strange, the most I’ve ever got out of a Quadro 5800 (the top NVIDIA card) is 500 million tris/sec, and that’s well over what they claim it can do.
Are you sure your timings are correct?

Yes, those timings were simply “while keeping 60Hz, how many batches of how many tris each can we draw”; error is <2%

Quadro FX 5600 runs at 600MHz, but there’s no info on the FX 5800, so I’ll assume it’s at least 600MHz. I have read in many places that a GeForce can set up 1 tri/clk, and these rough experiments seem to confirm that notion. So the “300mtri/sec” spec NVIDIA slaps on it is probably more of a real-case scenario, where triangles are visible onscreen (some of them getting backface-culled, others getting drawn completely with nice shading).
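
As a quick way to compare the figures quoted in this thread against the 1 tri/clk notion, here is a small sketch that converts a measured rate into triangles per clock. The helper name is made up, and the 600MHz clock for the 5800 is the assumption stated above, not a confirmed spec.

```cpp
#include <cassert>

// Triangles set up per core clock, from a measured (or quoted) triangle rate.
double tris_per_clock(double tris_per_sec, double core_clock_hz) {
    assert(core_clock_hz > 0.0);
    return tris_per_sec / core_clock_hz;
}
```

With the numbers from the thread, `tris_per_clock(480e6, 540e6)` is about 0.89 for the GF8600GT, `tris_per_clock(500e6, 600e6)` is about 0.83 for the Quadro (at the assumed clock), and the 300mtri/sec spec works out to 0.5 at 600MHz.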