vertex formatting

Jeff_Russell · June 1, 2007, 1:48pm

Hey, quick question:

Is there some documentation available anywhere that details what vertex formats are ‘fast’ and which require a slow conversion?

For example, say I want to pass in something like color for each vertex. If I format the colors as 4 component GL_UNSIGNED_BYTE it runs fine, but if I make it 3 component GL_UNSIGNED_BYTE I see a rather substantial performance hit. Ditto for 3 or 4 component GL_BYTE. Plus there’s still signed and unsigned shorts, halfs, floats and doubles (although floats always seem to work well).

Using glVertexAttribPointer seems to open up more possibilities for formatting than glTexCoordPointer, glColorPointer etc - it’d be nice to know what works well!

I would expect different cards/vendors to behave differently in this regard, but I would love to find out what’s fast and what’s not on any recent hardware.

Komat · June 1, 2007, 3:05pm

For ATI cards there is (or at least was when I last looked) a table in the Radeon SDK. I do not know of a similar table for Nvidia HW; however, since their cards are less picky about vertex formats, it is relatively safe to assume that if a format works fast on a Radeon card, it will work on a GeForce too.

Jan · June 1, 2007, 3:29pm

Using GL_BYTE on nVidia will result in bad performance in general. On ATI it seems not to. However, if you need a sign bit, simply pack the value into an unsigned byte and unpack it in a shader.
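
A minimal sketch of that packing trick in C. The function names are made up for illustration, and it assumes the attribute is submitted as normalized GL_UNSIGNED_BYTE, so the shader receives a value in [0, 1] and can undo the bias with `v * 2.0 - 1.0`. The second function only mirrors that shader arithmetic on the CPU so the round trip can be checked:

```c
#include <stdint.h>

/* Map a signed value in [-1, 1] to an unsigned byte in [0, 255].
 * Hypothetical helper, not part of OpenGL. */
static uint8_t pack_signed_to_ubyte(float v)
{
    if (v < -1.0f) v = -1.0f;   /* clamp out-of-range input */
    if (v >  1.0f) v =  1.0f;
    /* bias to [0, 1], scale to [0, 255], round to nearest */
    return (uint8_t)((v * 0.5f + 0.5f) * 255.0f + 0.5f);
}

/* CPU mirror of the shader-side unpack. With a normalized attribute
 * the shader already sees b / 255, so only the "* 2 - 1" step is
 * needed there. */
static float unpack_ubyte_to_signed(uint8_t b)
{
    float n = (float)b / 255.0f;  /* what normalization yields */
    return n * 2.0f - 1.0f;
}
```

Note that 0.0 is not exactly representable with this mapping (it lands on byte 128, which unpacks to roughly 0.004), which may or may not matter for data such as normals.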

When using GL_UNSIGNED_BYTE, always use 4 components, so that the element is 32-bit aligned. 3 bytes per element results in bad performance on all hardware.
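
One way to get from 3-component to 4-component colors is to pad them on the CPU before uploading. A rough sketch in C; `pad_rgb_to_rgba` is a hypothetical helper, not part of OpenGL:

```c
#include <stddef.h>
#include <stdint.h>

/* Expand tightly packed RGB bytes into RGBA so that every color
 * occupies exactly 4 bytes. `rgba` must have room for count * 4
 * bytes; `alpha` is written into the added fourth component. */
static void pad_rgb_to_rgba(const uint8_t *rgb, uint8_t *rgba,
                            size_t count, uint8_t alpha)
{
    for (size_t i = 0; i < count; ++i) {
        rgba[i * 4 + 0] = rgb[i * 3 + 0];
        rgba[i * 4 + 1] = rgb[i * 3 + 1];
        rgba[i * 4 + 2] = rgb[i * 3 + 2];
        rgba[i * 4 + 3] = alpha;
    }
}
```

The padded array can then be submitted with a size of 4 and type GL_UNSIGNED_BYTE, for example through glColorPointer or glVertexAttribPointer.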

I don’t know about 16-bit and 32-bit integer formats; I’ve never used them. Floats are usually fine (no matter whether you use 1, 2, 3 or 4 components).

Some papers state that one should try to make a whole element (position, texcoords, normals, etc.) 32-byte aligned, at least when your data is interleaved (which it should be). I haven’t found that to make a difference, though. In general, I found that less data per element is always better.

One thing I found out for nVidia cards: glVertexAttribPointer works fine on both nVidia and ATI when you use a shader. When you disable GLSL shaders and simply use the fixed-function pipeline, it doesn’t work correctly on nVidia. So either always use a shader (which is the preferred way) or be prepared to use the old functions when you have shaders disabled.

That’s all I can think of at the moment.
Jan.

AlexN · June 1, 2007, 6:14pm

On nvidia, floats, signed shorts (normalized or unnormalized), unsigned bytes (2 or 4 components, normalized or unnormalized) are fast. The newer unified architecture graphics cards might support a wider range of formats, though.

Jackis · June 2, 2007, 3:11am

Actually, 3-component unsigned bytes work well on nVidia if the total data is well aligned.
I mean, I use data like this:
3 floats - 12 bytes
3 halfs - 6 bytes
3 unsigned bytes - 3 bytes
3 unsigned bytes - 3 bytes
4 halfs - 8 bytes
The total structure size is 32 bytes, and no software fallback is observed.
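
For illustration, that layout could be written as a C struct like the following. The field names are placeholders (the post doesn’t say what each attribute holds), halfs are stored as raw 16-bit values, and a 4-byte float is assumed. Under those assumptions the members pack to 32 bytes with no padding:

```c
#include <stddef.h>
#include <stdint.h>

/* Interleaved vertex matching the sizes listed above.
 * Field names are hypothetical. */
typedef struct {
    float    attr0[3];  /* 3 floats,         12 bytes, offset  0 */
    uint16_t attr1[3];  /* 3 halfs,           6 bytes, offset 12 */
    uint8_t  attr2[3];  /* 3 unsigned bytes,  3 bytes, offset 18 */
    uint8_t  attr3[3];  /* 3 unsigned bytes,  3 bytes, offset 21 */
    uint16_t attr4[4];  /* 4 halfs,           8 bytes, offset 24 */
} Vertex32;

/* Fails at compile time if the compiler inserts padding. */
_Static_assert(sizeof(Vertex32) == 32, "vertex must be 32 bytes");
```

With a struct like this, `sizeof(Vertex32)` would serve as the stride and `offsetof(Vertex32, attrN)` as the per-attribute offset when setting up the pointers.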

But I would also like to have a table of hardware-supported formats for nVidia.

Jeff_Russell · June 4, 2007, 7:57pm

Quite helpful, particularly the ATI SDK table - thanks everyone