VBO performance initially very slow

I’m using a VBO in conjunction with a vertex program. Initially the rendering is very slow (slower than using glVertex calls), but after rotating the object around for a while the performance suddenly and dramatically improves.

As I mentioned, performance is OK when using the vertex program with glVertex calls instead of the VBO. So, at least for now, I think the problem lies in how I’m using VBOs and not in the vertex program.

HW: Geforce 7800
Driver: 93.71

On startup the VBO is initialized as:

glGenBuffersARB( 1, &cyl_buffer );
glBindBufferARB( GL_ARRAY_BUFFER_ARB, cyl_buffer );
glBufferDataARB( GL_ARRAY_BUFFER_ARB, TOTAL_NUM_CYL_PTS * 4 * sizeof( float ),
glBindBufferARB( GL_ARRAY_BUFFER_ARB, 0 );

When the VBO is to be used:

glEnableClientState( GL_VERTEX_ARRAY );
glBindBufferARB( GL_ARRAY_BUFFER_ARB, cyl_buffer );
glVertexPointer( 4, GL_FLOAT, 0, NULL );

Using the VBO: The glDrawArrays call is inside a loop that also makes several calls to glVertexAttrib4fvARB to set up the vertex program.


Upon completion:

glBindBufferARB( GL_ARRAY_BUFFER_ARB, 0 );
glDisableClientState( GL_VERTEX_ARRAY );

Hello Foxbat,

are you using a dual core machine? I had the same issues on a Laptop with Centrino Duo and NVidia 7950 GTX graphics card. Apparently this has something to do with the NVidia drivers running on a dual core machine.

I did some research and monitored fps over time. Interestingly, the slowdown affects only the first 120 frames or so. My workaround was to render 200 frames right after OpenGL context creation (just glClear, gluLookAt and glFlush). Performance is back to normal after doing this.
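A minimal sketch of that warm-up workaround, assuming the per-frame work is wrapped in a callback. The names `frame_fn`, `warm_up_frames` and `count_frame` are hypothetical and not from the post; in a real program the callback would issue the glClear, gluLookAt and glFlush calls mentioned above once the context exists.

```c
#include <stddef.h>

/* Hypothetical callback type: renders one throwaway frame. */
typedef void (*frame_fn)(void *user);

/* Calls render_frame `count` times and returns how many frames were issued.
 * Intended to be called once, right after the OpenGL context is created. */
static int warm_up_frames(frame_fn render_frame, void *user, int count)
{
    int i;
    int issued = 0;

    if (render_frame == NULL)
        return 0;

    for (i = 0; i < count; ++i) {
        render_frame(user);
        ++issued;
    }
    return issued;
}

/* Stand-in callback that only counts frames. A real callback would call
 * glClear, gluLookAt (with the application's camera) and glFlush here. */
static void count_frame(void *user)
{
    ++*(int *)user;
}
```

With the 200 frames from the post this would be called as `warm_up_frames(my_gl_frame, NULL, 200)`, where `my_gl_frame` is whatever function wraps the GL calls. The count is a guess based on the roughly 120 slow frames observed, not a documented driver threshold.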



Interesting bug. Have you reported this to nVidia?

I have the same problem; my machine is a Pentium D 830 (dual core @ 3 GHz) with a 7800 GTX. That would confirm your theory…


Yes, my machine is dual core. Interesting.

When I get a chance I’ll check how many frames need to be rendered before the performance increases.
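One way to find that frame, sketched under the assumption that per-frame times have already been recorded in milliseconds. The function name `first_fast_frame`, the baseline window and the ratio are all illustrative choices, not anything from the thread.

```c
#include <stddef.h>

/* Returns the index of the first frame after the baseline window whose time
 * is below ratio * (mean time of the first `baseline` frames), or -1 if no
 * such frame exists or the arguments are unusable. */
static long first_fast_frame(const double *frame_ms, size_t n,
                             size_t baseline, double ratio)
{
    double sum = 0.0;
    double threshold;
    size_t i;

    if (frame_ms == NULL || baseline == 0 || baseline > n)
        return -1;

    /* Average the slow start-up frames to get a reference time. */
    for (i = 0; i < baseline; ++i)
        sum += frame_ms[i];
    threshold = ratio * (sum / (double)baseline);

    /* Look for the first frame that is clearly faster than the reference. */
    for (i = baseline; i < n; ++i) {
        if (frame_ms[i] < threshold)
            return (long)i;
    }
    return -1;
}
```

A ratio of 0.5 would flag the first frame that takes less than half the initial average, which is enough to spot a sudden jump like the one described here.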

Foxbat: When setting it up, could you test if reordering the calls makes a difference (like, only enabling after all state is set up)?

If you’re on Windows, could you try to set the process affinity to just one CPU and see if it makes a difference?

(I just got a dual core and haven’t got the time to test these things myself just yet)

Running some tests…

Running the program without setting the affinity, performance seems to increase around frame 80-90.

Setting the affinity after the windows/contexts have been created, it’s around frame 30-40.

Setting a breakpoint at the first line in main and setting the affinity there (this is before the windows/contexts have been created), everything works normally.

So this definitely is a dual core issue.
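For anyone who wants to repeat the affinity test without a debugger, here is a rough sketch of doing it from code on Windows. `SetProcessAffinityMask` and `GetCurrentProcess` are real Win32 calls; the wrapper name `pin_process_to_first_cpu` is made up for illustration, and the non-Windows branch is only a stub.

```c
#ifdef _WIN32
#include <windows.h>
#endif

/* Restricts the current process to logical processor 0.
 * Returns 1 on success, 0 on failure or when not built for Windows. */
static int pin_process_to_first_cpu(void)
{
#ifdef _WIN32
    /* Bit 0 of the affinity mask selects the first logical processor. */
    return SetProcessAffinityMask(GetCurrentProcess(), (DWORD_PTR)1) != 0;
#else
    return 0; /* stub: this sketch does not cover other platforms */
#endif
}
```

Going by the results above, the call would have to be the first thing in main, before any window or context is created. Whether that fully matches setting the affinity from a breakpoint has not been tested here.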

Moving the call to glEnableClientState after the calls to glBindBuffer and glVertexPointer has no effect.

Has somebody already submitted a bug to nVidia?

Foxbat: Whether or not someone already has, I think this justifies a report from you too (more reports could make them react faster).

Funny, first 64-bit (h/w) issues, now dual-core… :)

Foxbat: I haven’t reported the bug yet, because I had only two dual core nvidia machines to test the theory.

Funny that this topic hasn’t come up more often, since the problem first appeared on my machine around the end of June 2006 (see http://www.gamedev.net/community/forums/topic.asp?topic_id=400956).

Is there a registered developer here who could report the bug?