ATI VAO performance problems

Sorry…Managed to hit Reply instead of Edit:

More stuff:

What I realize now (I hate writing here while being away from my code) :slight_smile: is that glMultiDrawElementsEXT from the EXT_multi_draw_arrays extension isn’t really compatible with the ATI_element_array extension, is it?

I mean, normally I use glDrawElementArrayATI to draw each strip. The strip index data for the current mesh has been uploaded, together with the vertex and normal arrays, to the card according to the ATI_vertex_array_object extension.

I assume that this means that I can’t use glMultiDrawElementsEXT, since it’s not part of the ATI_element_array extension and therefore (reasonably) not aware of the VAO stuff.
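To make it concrete, my setup is roughly along these lines (the names are made up, the buffer handling is simplified, and all the ATI/EXT entry points are of course fetched with wglGetProcAddress first):

GLuint vbuf = glNewObjectBufferATI(vertexBytes, vertices, GL_STATIC_ATI);
GLuint nbuf = glNewObjectBufferATI(normalBytes, normals, GL_STATIC_ATI);

glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_NORMAL_ARRAY);
glArrayObjectATI(GL_VERTEX_ARRAY, 3, GL_FLOAT, 0, vbuf, 0);  /* arrays live on the card */
glArrayObjectATI(GL_NORMAL_ARRAY, 3, GL_FLOAT, 0, nbuf, 0);

/* one call per strip through ATI_element_array */
for (int s = 0; s < stripCount; ++s) {
    glElementPointerATI(GL_UNSIGNED_INT, stripIndices[s]);
    glDrawElementArrayATI(GL_TRIANGLE_STRIP, stripLengths[s]);
}

glMultiDrawElementsEXT, on the other hand, takes its own array of index pointers and counts and (as far as I can tell) knows nothing about the element array set up above.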

So, I guess I’m sh*t out of luck until the ATI_element_array extension gets something similar to glMultiDrawElementsEXT :slight_smile:

Unless, as usual, I’ve failed to understand the whole extension hullabaloo :slight_smile:

/Henrik

[This message has been edited by CAD_Swede (edited 03-12-2003).]

Originally posted by Ozzy:
cool

(very solid, but fixed to one type of vertex format if you want to get nice performance)

Did you use float + byte colors?
What about GL_ATI_vertex_attrib_array_object?
What about performance while enabling/disabling lighting?

Just curious… I know you’ve done your homework!

For that 300.1 Mv/s figure I used plain vertices only. Unsigned bytes should work fine too; I’ve tested that with good performance. vertex_attrib_array_object only makes sure you can use the vertex attributes from GL_ARB_vertex_program with VAO as well. VAO was designed back when EXT_vertex_shader was around, and since it uses its own set of draw routines, a new call for attributes was needed when ARB_vertex_program came along. This will not be a problem with VBO.
Lighting will of course reduce performance. The R9700 doesn’t have any fixed-function lighting hardware, so all lighting ends up as a long vertex shader. The more lights you enable, the longer the vertex shader running under the hood. Enabling things like double-sided lighting will make it slower too.
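Just to make the format question concrete, a float position + unsigned byte color layout through VAO looks roughly like this (struct and variable names are only for illustration):

typedef struct {
    GLfloat pos[3];    /* 12 bytes */
    GLubyte color[4];  /*  4 bytes -> 16-byte vertex */
} Vertex;

GLuint buf = glNewObjectBufferATI(vertexCount * sizeof(Vertex), verts, GL_STATIC_ATI);

glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_COLOR_ARRAY);
glArrayObjectATI(GL_VERTEX_ARRAY, 3, GL_FLOAT,         sizeof(Vertex), buf, 0);
glArrayObjectATI(GL_COLOR_ARRAY,  4, GL_UNSIGNED_BYTE, sizeof(Vertex), buf, 12); /* byte offset of color */

/* For a generic ARB_vertex_program attribute you use the entry point from
 * ATI_vertex_attrib_array_object instead, e.g. for attribute 3: */
glVertexAttribArrayObjectATI(3, 4, GL_UNSIGNED_BYTE, GL_TRUE, sizeof(Vertex), buf, 12);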

Well, CAD, you could try the other thing I suggested:

attach all your strips together with degenerate triangles and render them in a single glDrawElements call
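Something like this, assuming your strips are kept as separate index lists (merged, stripIndices, stripLengths and stripCount are just placeholders, and merged is assumed to be big enough):

int n = 0;
for (int s = 0; s < stripCount; ++s) {
    if (s > 0) {
        unsigned int tail = merged[n - 1];
        merged[n++] = tail;                 /* repeat last index of previous strip  */
        merged[n++] = stripIndices[s][0];   /* repeat first index of the next strip */
        if (n & 1)                          /* odd start would flip the winding, so pad once more */
            merged[n++] = stripIndices[s][0];
    }
    for (int i = 0; i < stripLengths[s]; ++i)
        merged[n++] = stripIndices[s][i];
}
glDrawElements(GL_TRIANGLE_STRIP, n, GL_UNSIGNED_INT, merged);

The joining triangles all use a repeated index, so they have zero area and produce no fragments in fill mode.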

Originally posted by Korval:
Well, CAD, you could try the other thing I suggested: Putting them together with degenerate triangles.

As we’re working with a CAD program, where GL_LINES will be used as much as GL_FILL to draw the triangles, I think degenerate triangles would still show up and look weird, especially when they connect triangles that lie in separate planes and are “far” away from each other. Maybe there’s some logic I could slap on to make it less of a problem, but I believe it would still exist… Thanks for the idea, though. Maybe some kind of mix of the two approaches can be achieved for the separate cases.

Thanks!

/Henrik

Originally posted by CAD_Swede:
Nichlas Lelong, who commented further up, says that he’s run into the same problem with short strips and that draw_multi didn’t help in that case. So, since neither he nor I have had any success with draw_multi, I’ll have to try something else, right?

Just for the record, I’d like to add that I’ve now tried padding the strips so that they’re always at least 32 vertices long. That didn’t help either. (Which is good; that’d be a weird bug :slight_smile: ) So what’s left for me to do is look into some kind of strip-merging technique, such as the degenerate triangles someone suggested, when that’s a possibility.

Or, I could just wait for the ATI drivers to get as good/fast as the nVidia and WildCat drivers are when it comes to rendering short strips. :wink:

Thanks, everyone, for a good discussion and for taking the time to respond.

/Henrik

[This message has been edited by CAD_Swede (edited 03-13-2003).]

As we’re working with a CAD program, where GL_LINES will be used as much as GL_FILL to draw the triangles,

Well, there’s no such thing as a “degenerate line”, so there’s nothing you can do with those primitives.

I think degenerate triangles would still show up and look weird

Degenerate triangles (triangles that use the same index twice) will never appear. The GL spec, and virtually every hardware renderer I know of, guarantees this.

Also, if your data is really that poorly strippable, you should really consider a triangle list (GL_TRIANGLES) rather than a bunch of strips (GL_TRIANGLE_STRIP).

Also, when you say “triangle list”, what do you mean? Just non-indexed triangles? Or is it some magic I’ve never heard of? Again, as I wrote in my previous post I don’t recognize this way of drawing triangles.

It’s a list of triangles. If you’re drawing indexed with glDrawElements, every 3 indices represent one triangle. It’s just like a face list. If you’re not drawing indexed (and unless you have significant vertex reuse, I’m not sure I would suggest that), then you just arrange your vertex data into groups of 3 vertices, duplicating data where necessary.
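For example, unrolling your existing strips into one indexed triangle list might look like this (tris is assumed to be a large enough scratch array; the strip names are placeholders as before):

int n = 0;
for (int s = 0; s < stripCount; ++s) {
    const unsigned int *strip = stripIndices[s];
    for (int i = 2; i < stripLengths[s]; ++i) {
        if (i & 1) {                 /* odd triangles in a strip are wound backwards */
            tris[n++] = strip[i - 1];
            tris[n++] = strip[i - 2];
        } else {
            tris[n++] = strip[i - 2];
            tris[n++] = strip[i - 1];
        }
        tris[n++] = strip[i];
    }
}
glDrawElements(GL_TRIANGLES, n, GL_UNSIGNED_INT, tris);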

Originally posted by Korval:
Well, there’s no such thing as a “degenerate line”, so there’s nothing you can do with those primitives.

I wrote “GL_LINES”, not “GL_LINE”. GL_LINES draws the edges of the triangles instead of filling them as GL_FILL does. Thus, degenerate triangles can’t be used when I’m drawing GL_LINES, as those lines will be clearly visible… or so I believe. We have an outline mode where degenerate triangles definitely make things look like crap. :slight_smile:


Also, if your data is really that poorly strippable, you should really consider a triangle list (GL_TRIANGLES) rather than a bunch of strips (GL_TRIANGLE_STRIP).

The data is in general very nicely strippable. We often get strips over 2000 vertices long. Very nice. Very fast. However, we stumble across shorter strips at times, and on those models rendering speed drops from 50 fps for 1 million triangles to 20 fps for 150,000 triangles… or, in a few horror examples, way below that.

So degenerate triangles will work nicely for the rendering modes where we only use GL_FILL (which is most of the time, so it should be OK). In a few cases, though, we’re going to have to render with GL_LINES instead, and in those cases we’ll have to ditch the degenerate triangles.

Thanks, though! I really appreciate it!

Edited to add: Now, the real problem isn’t the tristrip techniques used, to be honest.

The problem is that the Radeon 9700 Pro kicks nVidia’s and the Wildcat’s butts when being fed good strips. However, in the “bad” cases, the Wildcat and the nVidia GeForce 4 render the models a lot faster than the 9700 Pro does. Why, ATI, why!?!? :slight_smile:

Why can’t the ATI card be just as content as the nVidia card is when rendering many short strips? :slight_smile:

/Henrik

[This message has been edited by CAD_Swede (edited 03-13-2003).]

The simplest solution to your problem is to render the long triangle strips as strips, and then render all the triangles making up strips of fewer than 32 vertices in a single GL_TRIANGLES call. I would try this before anything else.
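Roughly like this (the 32-vertex cutoff is just the number mentioned in this thread; shortTris/shortTriCount stand for a triangle list built once up front from the short strips, not rebuilt every frame):

for (int s = 0; s < stripCount; ++s) {
    if (stripLengths[s] >= 32) {                      /* long strips stay as strips */
        glElementPointerATI(GL_UNSIGNED_INT, stripIndices[s]);
        glDrawElementArrayATI(GL_TRIANGLE_STRIP, stripLengths[s]);
    }
}
/* all strips shorter than 32 vertices, unrolled into independent triangles */
if (shortTriCount > 0)
    glDrawElements(GL_TRIANGLES, shortTriCount, GL_UNSIGNED_INT, shortTris);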

When doing lines, you might find it’s much faster to actually pass line geometry rather than changing the polygon mode. I found this to be especially true on Wildcats.
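For example, building a GL_LINES index list from the triangles once and drawing that directly (triIndices/lineIndices/triCount are placeholders; a shared-edge filter would roughly halve the index count, but I’ve left that out):

int n = 0;
for (int t = 0; t < triCount; ++t) {
    unsigned int a = triIndices[3 * t + 0];
    unsigned int b = triIndices[3 * t + 1];
    unsigned int c = triIndices[3 * t + 2];
    lineIndices[n++] = a;  lineIndices[n++] = b;   /* edge ab */
    lineIndices[n++] = b;  lineIndices[n++] = c;   /* edge bc */
    lineIndices[n++] = c;  lineIndices[n++] = a;   /* edge ca */
}
glDrawElements(GL_LINES, n, GL_UNSIGNED_INT, lineIndices);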