Wait, wait! That’s very misleading!
Triangle strips are by no means slower than independent triangles in his case! Also, I don’t really see a setup bottleneck in this data.
Consider this:
In 1. (strips), Moshe is pushing 47 Million indices/s. Since it is one large strip, this equals about the same number of triangles, i.e., 47 Million triangles/s.
In 2. (tris, 6xM), Moshe is pushing 93 Million indices/s, but each triangle needs 3 indices instead of just 1, so the actual triangle rate is 31 Million triangles/s.
In 3. (tris, 5xM), it is 114 Million indices/s with 3 indices/tri, so we get 38 Million triangles/s.
So you see that triangle strips actually give you the very best performance for this mesh, and independent triangles are way slower. Obviously, this cannot be explained by a setup bottleneck, because setup is per triangle, and so for strips he is doing 47 Million setups/s, and for independent triangles only 31 or 38…
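To make the conversion explicit, here is a small sketch of the arithmetic above. The function name and the dictionary labels are made up for illustration; the only inputs are the index rates quoted in this thread, and the strip case uses the approximation that one long strip produces about one triangle per index.

```python
def triangles_per_second(indices_per_second, mode):
    """Convert an index rate into a triangle rate (hypothetical helper)."""
    if mode == "strip":
        # One long strip: n indices give n - 2 triangles, so for large n
        # the triangle rate is approximately the index rate.
        return indices_per_second
    if mode == "triangles":
        # Independent triangles: exactly 3 indices per triangle.
        return indices_per_second / 3
    raise ValueError("unknown mode: " + mode)

# Index rates as quoted in the thread.
rates = {
    "strips": triangles_per_second(47e6, "strip"),
    "tris 6xM": triangles_per_second(93e6, "triangles"),
    "tris 5xM": triangles_per_second(114e6, "triangles"),
}

for name, rate in rates.items():
    print(f"{name}: {rate / 1e6:.0f} Million triangles/s")

fastest = max(rates, key=rates.get)
```

With these numbers, `fastest` comes out as the strip case, which is the point being made above.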
I wonder why the 6xM mesh is slower than 5xM. If you send the triangle indices in the correct order on a GeForce 3 (which has 18 effective cached vertices), you get maximum reuse, i.e., you transform each vertex in the whole mesh exactly once, even in the 6xM mesh (well, actually, there is exactly one vertex you need to transform twice). This should go for both strips and independent triangles.
Moshe: the dual pipeline + clock speed increase lets me go from 30 Million vertices/s to 75 Million vertices/s, but not to 134! There’s something wrong here.
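If you want to check how index order affects reuse, a rough simulation helps. The sketch below assumes the post-transform cache behaves as a plain FIFO with 18 entries; that replacement policy is my assumption, not something confirmed for the hardware. The grid mesh and both function names are hypothetical and are not Moshe's meshes.

```python
from collections import deque

def count_transforms(indices, cache_size=18):
    """Count vertices that must be transformed, assuming a FIFO cache."""
    cache = deque(maxlen=cache_size)  # oldest entry drops out when full
    transforms = 0
    for index in indices:
        if index not in cache:
            transforms += 1      # cache miss: the vertex gets transformed
            cache.append(index)  # FIFO: hits do not refresh an entry
    return transforms

def grid_triangle_indices(width, height):
    """Independent-triangle indices for a width x height vertex grid,
    emitted row by row (two triangles per quad)."""
    indices = []
    for y in range(height - 1):
        for x in range(width - 1):
            v00 = y * width + x
            v10 = v00 + 1
            v01 = v00 + width
            v11 = v01 + 1
            indices += [v00, v01, v10, v10, v01, v11]
    return indices

# Narrow grid: a whole row of vertices stays in the cache until it is reused.
narrow = count_transforms(grid_triangle_indices(4, 5))
# Wide grid: rows are longer than the cache, so vertices get evicted
# before the next row needs them and are transformed again.
wide = count_transforms(grid_triangle_indices(40, 3))
print(narrow, "transforms for", 4 * 5, "vertices")
print(wide, "transforms for", 40 * 3, "vertices")
```

In the narrow case every vertex is transformed exactly once; in the wide case the same row-by-row order transforms more vertices than the mesh contains, which is why the ordering matters.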
Ok, I think this topic is getting very confusing, especially for other readers.
What everybody has to keep in mind is that we are measuring three different entities here:
- actual transformed vertices/s (i.e., a vertex taken from the vertex cache does NOT count here)
- triangles/s (this could show up setup bottlenecks), to keep in mind how much geometry you are actually creating
- sent indices/s (this basically measures how effectively your geometry is organized and how well you exploit the vertex cache). This ranges from a maximum of 3 times the triangle rate (if you use independent triangles) down to exactly the triangle rate (if you use strips).
Now for the meshes discussed here, we have about twice as many triangles as vertices, but counting it exactly, you find that you send 5 indices per vertex if you use independent triangles, and 2 indices per vertex if you use strips. Actually, in the example above, 47 * 5/2 = 118 is not so far from the 114 so you can see where this comes from…
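Spelled out, the estimate works like this, assuming (my reading of the numbers) that the card sustains the same vertex rate in both modes, so only the number of indices sent per vertex changes. The variable names are just for illustration.

```python
# Figures quoted in the thread.
strip_index_rate = 47e6        # indices/s measured with strips
measured_tris_rate = 114e6     # indices/s measured with independent tris (5xM)

# Indices sent per mesh vertex, as counted above.
indices_per_vertex_strip = 2
indices_per_vertex_tris = 5

# Assumption: same vertex throughput in both modes.
vertex_rate = strip_index_rate / indices_per_vertex_strip
predicted_tris_rate = vertex_rate * indices_per_vertex_tris

relative_gap = (predicted_tris_rate - measured_tris_rate) / measured_tris_rate
print(f"predicted {predicted_tris_rate / 1e6:.1f} Million indices/s, "
      f"measured {measured_tris_rate / 1e6:.0f} Million indices/s")
```

The prediction lands within a few percent of the measured rate, which is what suggests the two measurements describe the same underlying vertex throughput.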
I hope this clarifies things a bit, and please, let’s talk about indices/s from now on, not vertices/s, when we talk about the second parameter to glDrawElements…
Michael