A question about VBO

shelleyHerb · June 22, 2009, 11:26pm

My video card is an ATI X300. When I use a dynamic VBO on this card, rendering is slower than rendering from system memory. The code is below:

Let’s define “Method 1” as using a VBO
and “Method 2” as not using a VBO.

// Method 1,
// code that creates the vertex buffers as VBOs

glGenBuffersARB( 1, &m_nVBOVertices );
glBindBufferARB( GL_ARRAY_BUFFER_ARB, m_nVBOVertices );
glBufferDataARB( GL_ARRAY_BUFFER_ARB, m_nVertexCount * 3 * sizeof(float), m_pVertices, GL_DYNAMIC_DRAW_ARB );

glGenBuffersARB( 1, &m_nVBOTexCoords );
glBindBufferARB( GL_ARRAY_BUFFER_ARB, m_nVBOTexCoords );
glBufferDataARB( GL_ARRAY_BUFFER_ARB, m_nVertexCount * 2 * sizeof(float), m_pTexCoords, GL_DYNAMIC_DRAW_ARB );

// rendering code
glEnableClientState( GL_VERTEX_ARRAY );
glEnableClientState( GL_TEXTURE_COORD_ARRAY );

glBindBufferARB( GL_ARRAY_BUFFER_ARB, m_nVBOVertices );
float* ver = (float*)glMapBufferARB(GL_ARRAY_BUFFER_ARB, GL_WRITE_ONLY_ARB);
memcpy( ver, m_pVertices, m_nVertexCount * 3 * sizeof(float) );
GLboolean b = glUnmapBufferARB(GL_ARRAY_BUFFER_ARB);
glVertexPointer( 3, GL_FLOAT, 0, (char *) NULL );

glBindBufferARB( GL_ARRAY_BUFFER_ARB, m_nVBOTexCoords );
ver = (float*)glMapBufferARB(GL_ARRAY_BUFFER_ARB, GL_WRITE_ONLY_ARB);
memcpy( ver, m_pTexCoords, m_nVertexCount * 2 * sizeof(float) );
b = glUnmapBufferARB(GL_ARRAY_BUFFER_ARB);
glTexCoordPointer( 2, GL_FLOAT, 0, (char *) NULL );

glDrawArrays( GL_TRIANGLES, 0, m_nVertexCount );

glDisableClientState( GL_VERTEX_ARRAY );
glDisableClientState( GL_TEXTURE_COORD_ARRAY );

// Method 2,
// the vertex data lives in system memory

// m_pVertices and m_pTexCoords are valid system-memory
// addresses and hold valid data

// rendering code
// before drawing, their data is also updated

glEnableClientState( GL_VERTEX_ARRAY );
glEnableClientState( GL_TEXTURE_COORD_ARRAY );

glVertexPointer( 3, GL_FLOAT, 0, g_pMesh->m_pVertices );
glTexCoordPointer( 2, GL_FLOAT, 0, g_pMesh->m_pTexCoords );

glDrawArrays( GL_TRIANGLES, 0, g_pMesh->m_nVertexCount );

glDisableClientState( GL_VERTEX_ARRAY );
glDisableClientState( GL_TEXTURE_COORD_ARRAY );

When rendering the same thing on the ATI X300, “Method 1” is much slower than “Method 2”. But on an Nvidia GeForce 7300, I found that “Method 1” is faster than “Method 2”. I don’t know why. Is it a problem with the video driver, or something else? Does the ATI driver not support VBOs well?

scratt · June 23, 2009, 12:35am

GL_DYNAMIC_DRAW_ARB may have something to do with it.
Try the static version.

AFAIK, if you use Dynamic, you may find that the buffer is not stored in VRAM, depending on architecture, OS, drivers, etc.

Certainly try the different storage modes and see what difference you see.

Also what size are your arrays? This has a bearing on performance also.

Here is a link I have always found very helpful…
http://www.songho.ca/opengl/gl_vbo.html

_NK47 · June 23, 2009, 2:07am

AFAIK, if it's DYNAMIC_DRAW and WRITE_ONLY, the buffer should be placed in video memory and should not be slower than sending data from system memory to the GPU. If you need to update the buffer, WRITE_ONLY is correct; otherwise drivers achieve the best performance with STATIC_DRAW. If you have a test application, I could run it on a GF 8800 with Omega drivers and see how it behaves.

dletozeun · June 23, 2009, 4:08am

GL_DYNAMIC_DRAW_ARB and GL_STATIC_DRAW_ARB are hints that suggest to the driver how to manage buffer objects. Switching between these two hints should not affect performance that much. I mean, you should get at least vertex array performance with your VBO.

The problem here is that you are mapping the vertex buffer objects and uploading data every frame. Mapping a buffer may stall the program for as long as the buffer is in use by the hardware. Since you are doing it every frame, the stall may happen every frame.

I see in your code that you are replacing the entire buffer data. In this case, mapping the buffer is not the most efficient way. You would do better to discard the VBO data by calling glBufferData again.
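To illustrate the "call glBufferData again" suggestion, here is a minimal sketch. It is not code from this thread: `respecifyBuffer`, `kArrayBuffer` and `kDynamicDraw` are hypothetical names, and the two GL entry points are passed in as parameters only so that the snippet compiles without OpenGL headers. Whether a given driver actually avoids the stall this way is implementation-dependent.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Values of GL_ARRAY_BUFFER_ARB and GL_DYNAMIC_DRAW_ARB as defined in glext.h,
// repeated here only so the sketch builds without OpenGL headers.
const unsigned kArrayBuffer = 0x8892;
const unsigned kDynamicDraw = 0x88E8;

// Replace the whole buffer store in one call instead of map + memcpy + unmap.
// bindBuffer and bufferData are expected to behave like glBindBufferARB and
// glBufferDataARB (assumption: the caller passes those two functions in).
template <class BindFn, class DataFn>
void respecifyBuffer(BindFn bindBuffer, DataFn bufferData,
                     unsigned vbo, const std::vector<float>& src)
{
    assert(vbo != 0);  // buffer name 0 would unbind instead of selecting a VBO
    const std::ptrdiff_t bytes =
        static_cast<std::ptrdiff_t>(src.size() * sizeof(float));
    bindBuffer(kArrayBuffer, vbo);
    // Passing the new data here lets the driver hand out fresh storage
    // rather than wait for the GPU to finish reading the old contents.
    bufferData(kArrayBuffer, bytes, src.data(), kDynamicDraw);
}
```

In a real program the call might look like `respecifyBuffer(glBindBufferARB, glBufferDataARB, m_nVBOVertices, vertices);`, where `vertices` is a hypothetical `std::vector<float>` holding this frame's positions.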

But, do you really need to update BO data every frame? In the method 2 code, you are not doing it.

shelleyHerb · June 23, 2009, 9:45pm

thanks for the help!

I know GL_STATIC_DRAW_ARB is best if the data never changes, so I didn’t compare it with the others.

But if the data needs to change every frame, as in a particle system, I need GL_DYNAMIC_DRAW_ARB.

So, just consider the case where the data needs to change every frame.
Here I only want to compare the speed of the two methods.

Someone told me that VBOs on the ATI X300 may be emulated in software. I don’t know whether that is true.

scratt · June 23, 2009, 11:36pm

A simple way to check for that is to profile it.

I am not sure what platform you are on, but gDEBugger has a free 30-day trial, works on all platforms, and is very, very good.
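Alongside a profiler, a rough wall-clock comparison of the two methods can be done by hand. The sketch below is only an illustration under assumptions: `averageMillisPerFrame` is a hypothetical helper, and the frame function passed in is expected to end with glFinish() so that the GPU work is included in the measurement.

```cpp
#include <cassert>
#include <chrono>

// Run drawFrame() `frames` times and return the average time per call in
// milliseconds. drawFrame is any callable; for a GL comparison it would draw
// one frame with Method 1 or Method 2 and then call glFinish().
template <class DrawFn>
double averageMillisPerFrame(DrawFn drawFrame, int frames)
{
    assert(frames > 0);
    typedef std::chrono::steady_clock Clock;
    const Clock::time_point start = Clock::now();
    for (int i = 0; i < frames; ++i) {
        drawFrame();
    }
    const std::chrono::duration<double, std::milli> elapsed =
        Clock::now() - start;
    return elapsed.count() / frames;
}
```

Timing a few hundred frames of each method on the same scene and comparing the two averages gives a first indication; a profiler is still needed to see where the time actually goes.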

dletozeun · June 24, 2009, 3:01am

Someone told me that VBOs on the ATI X300 may be emulated in software. I don’t know whether that is true.

I found in the realtech-vr glView database that the X300 supports the vertex_buffer_object extension.

I think you are just misusing VBOs here. If you need to update the entire vertex buffer every frame, or almost every frame (depending on how much data needs to be uploaded), without stalling the application, you can:

Simply call glBufferData with the vertex array pointer to discard the current buffer object's storage and set its new data for the next draw call. This way you are sure not to stall the program because the hardware is still using your BO. But watch the memory consumption.
If you need to do more sophisticated things, like writing small chunks of data, you can mirror your data in two vertex buffers. The first is used for rendering while the second is mapped for data updates. Once the second buffer can be unmapped, switch between them and use the second one for rendering. Ping-pong between the VBOs each time you need to update vertex data.

Note that with these two approaches the vertex buffers are not guaranteed to be updated every frame, but they are updated as soon as possible, almost without affecting the framerate.
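The ping-pong bookkeeping for the second approach can be sketched as follows. This is only an illustration: `PingPongBuffers` is a hypothetical helper, the two ids are assumed to come from glGenBuffersARB, and the GL calls themselves appear only in the comments.

```cpp
#include <cassert>

// Tracks which of two VBOs is drawn from and which one is being updated.
struct PingPongBuffers {
    unsigned ids[2];  // two buffer names (assumed to come from glGenBuffersARB)
    int drawIndex;    // index of the buffer currently used for rendering

    PingPongBuffers(unsigned a, unsigned b) : drawIndex(0)
    {
        assert(a != 0 && b != 0 && a != b);
        ids[0] = a;
        ids[1] = b;
    }

    unsigned drawBuffer() const   { return ids[drawIndex]; }
    unsigned updateBuffer() const { return ids[1 - drawIndex]; }

    // Call once the update buffer has been unmapped: the roles are exchanged.
    void swap() { drawIndex = 1 - drawIndex; }
};

// Per-frame outline (GL calls shown as comments only):
//   bind buffers.updateBuffer(), map it, write the new vertices, unmap it
//   bind buffers.drawBuffer(), set the vertex pointer, glDrawArrays(...)
//   buffers.swap() when the updated data should become visible
```

The buffer being written is never the one the hardware is reading from in the same frame, which is the point of the scheme; the cost is twice the memory for the vertex data.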

shelleyHerb · June 24, 2009, 11:57pm

I found that ATI video cards don’t support dynamic VBOs well.
I tested the two methods on an ATI X300, an ATI X700 and an ATI X1950; on all of them, rendering from system memory is faster than rendering from a dynamic VBO. But when I tested them on an Nvidia card, I found the two methods to be almost equally fast.

shelleyHerb · June 25, 2009, 12:14am

I don’t know how to attach files here. I want to post my code so that you can test it.