No more client-side vertex arrays in OpenGL 3: why?

I have written a benchmark to compare the rendering performance between dynamic vertex arrays stored in a VBO (updated with glBufferSubData or glMapBuffer) and dynamic vertex arrays stored in client memory. In my bench, vertex position, color and texture coordinates are updated every frame.

My benchmark results show that the VBO approach is less efficient:
dynamic VA in VBO (glBufferSubData): 1031 FPS
dynamic VA in VBO (glMapBuffer): 805 FPS
dynamic VA in client memory: 1479 FPS

This is on an Nvidia 7800GTX with the latest ForceWare driver (162.18). The geometry contains 10,000 triangles.

So I wonder: why will client-side vertex arrays disappear in OpenGL 3?

The graphics card/driver has to copy the data into video memory anyway, before rendering it.

There’s no point in offering or emulating functionality that does not actually exist.

It is quite possible that the client-side-VA version of your code turned into a double-buffered solution: the driver renders from one temporary VBO while it uploads the new data into another temporary VBO. glBufferSubData and glMapBuffer, by contrast, always refer to the same memory in video RAM and can therefore offer less overlap between rendering and uploading.

Also, the difference between 1031 FPS and 1479 FPS is only about 0.3 ms per frame. Try more geometry to measure significant differences.
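To make the arithmetic behind that figure explicit, here is a minimal sketch that converts frame rates into per-frame times. The function names are made up for illustration and are not part of any benchmark posted in this thread.

```c
#include <assert.h>

/* Convert a frame rate (frames per second) into milliseconds per frame. */
double frame_time_ms(double fps)
{
    assert(fps > 0.0);
    return 1000.0 / fps;
}

/* Per-frame time saved when going from slow_fps to fast_fps, in milliseconds. */
double frame_time_delta_ms(double slow_fps, double fast_fps)
{
    return frame_time_ms(slow_fps) - frame_time_ms(fast_fps);
}
```

With the numbers from the first post, frame_time_delta_ms(1031.0, 1479.0) comes out just under 0.3 ms, which is small enough that timer resolution and driver overhead could easily dominate it.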

Try it with an 84.xx driver. I suggest 84.21. We have found VBOs used in this manner to have significant performance issues for all (so far) >= 90 series drivers.

How are you copying your data? Are you taking into account that the memory you’re writing to might be non-cached memory? If that’s the case, then if you’re using the wrong CPU instructions (non-streaming instructions) during your memory copy, this can degrade performance and might explain what you’re seeing.

Kevin B
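As a rough illustration of the streaming-store point, the sketch below copies data with SSE2 non-temporal stores and falls back to memcpy where SSE2 is not available. It rests on several assumptions: stream_copy is a hypothetical helper, the destination (in a real program, the pointer returned by glMapBuffer) is assumed to be 16-byte aligned, and the size is assumed to be a multiple of 16. Whether mapped buffer memory is actually uncached or write-combined depends on the driver, so any gain would have to be measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Copy size bytes from src to dst using streaming stores where possible.
   Assumptions: dst is 16-byte aligned and size is a multiple of 16. */
void stream_copy(void *dst, const void *src, size_t size)
{
    assert(((uintptr_t)dst % 16) == 0);
    assert(size % 16 == 0);
#if defined(__SSE2__)
    __m128i *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;
    for (size_t i = 0; i < size / 16; ++i) {
        __m128i v = _mm_loadu_si128(s + i); /* ordinary load, source may be unaligned */
        _mm_stream_si128(d + i, v);         /* non-temporal store, bypasses the cache */
    }
    _mm_sfence(); /* order the streamed stores before whatever follows the copy */
#else
    memcpy(dst, src, size); /* fallback when SSE2 is not available */
#endif
}
```

In a real upload path the call would sit between mapping and unmapping the buffer, and it should never read back from the destination, since reads from uncached memory are typically very slow.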

Care to post the source code for this benchmark?

"(updated with glBufferSubData or glMapBuffer)"
Your benchmark is irrelevant, because GL 3.0 has all of these ways to make dynamic uploading faster.

I had read that glBufferData is better than glBufferSubData in case you will fill the entire buffer.
In my benchmark, I did not see a difference.

"I had read that glBufferData is better than glBufferSubData in case you will fill the entire buffer.
In my benchmark, I did not see a difference."
It all depends on what you’re doing. If you’re still using that buffer when you want to fill it, then yes, it would be a performance boost. But if you don’t, then no, it would not.

The idea is that if the buffer is still in use, calling glBufferData allows the driver to simply allocate a new slice of memory instead of using the same one and waiting for rendering from it to cease.
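The pattern described above is often written as two calls: glBufferData with a NULL data pointer to request fresh storage, then glBufferSubData to fill it. The sketch below shows that call order. So that it builds without GL headers or a context, the two entry points are passed in through a hypothetical BufferApi table, the enum values are copied from the OpenGL headers, and the recording stubs exist only for trying it out. In a real program the table would point at glBufferData and glBufferSubData (calling conventions permitting) and the buffer would already be bound to GL_ARRAY_BUFFER.

```c
#include <assert.h>
#include <stddef.h>

/* Values of GL_ARRAY_BUFFER and GL_STREAM_DRAW, repeated to avoid GL headers. */
enum { SKETCH_ARRAY_BUFFER = 0x8892, SKETCH_STREAM_DRAW = 0x88E0 };

/* Hypothetical table of entry points, shaped like glBufferData / glBufferSubData. */
typedef struct {
    void (*buffer_data)(unsigned target, ptrdiff_t size, const void *data, unsigned usage);
    void (*buffer_sub_data)(unsigned target, ptrdiff_t offset, ptrdiff_t size, const void *data);
} BufferApi;

/* Refill the whole bound vertex buffer, giving up the old storage first. */
void upload_orphaned(const BufferApi *gl, const void *vertices, ptrdiff_t size)
{
    assert(gl && vertices && size > 0);
    /* NULL data with the same size and usage: the driver may allocate new
       storage while pending draw calls keep reading the old one. */
    gl->buffer_data(SKETCH_ARRAY_BUFFER, size, NULL, SKETCH_STREAM_DRAW);
    /* Fill the new storage; nothing should be pending on it. */
    gl->buffer_sub_data(SKETCH_ARRAY_BUFFER, 0, size, vertices);
}

/* Recording stubs, only so the sketch can be exercised without a GL context. */
typedef struct { int is_sub_data; ptrdiff_t size; const void *data; } StubCall;
StubCall stub_log[4];
int stub_count = 0;

void stub_buffer_data(unsigned target, ptrdiff_t size, const void *data, unsigned usage)
{
    (void)target; (void)usage;
    if (stub_count < 4) stub_log[stub_count++] = (StubCall){ 0, size, data };
}

void stub_buffer_sub_data(unsigned target, ptrdiff_t offset, ptrdiff_t size, const void *data)
{
    (void)target; (void)offset;
    if (stub_count < 4) stub_log[stub_count++] = (StubCall){ 1, size, data };
}
```

If the buffer is not in use when it is refilled, the first call buys nothing, which would be consistent with a benchmark showing no difference between the two approaches.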