I know I can use “glMapBufferARB()” and “glUnmapBufferARB()” to update a VBO’s data, but does it work asynchronously, like a PBO? If my program creates a VBO and updates its data every frame, is that better than using a VA (vertex array) directly?
I have found it is better to use a vertex array if the data is getting constantly updated.
Since the PBO spec was written on top of VBOs, I’d say the behavior should be similar.
As for whether VAs are better, I’d say it depends on where the update is coming from. If you can arrange things so that the information needed for the update is on the card, then VBOs are probably better; if you need information in system RAM, then VAs may be better.
If you only update portions of the arrays, the fastest solution may be to keep the VA in main memory and create a VBO copy in VRAM using GL_DYNAMIC_DRAW. Then update the arrays in main memory and synchronize only the dirtied regions using glBufferSubData().
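A rough sketch of the "synchronize only dirtied regions" idea, in case it helps. DirtyRange, dirty_mark and dirty_take are made-up names for illustration, not part of OpenGL. The sketch assumes one merged range per frame is good enough, which can upload some clean bytes that sit between two dirty spots.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: tracks one merged dirty byte range of a
   system-memory vertex array that is mirrored in a VBO. */
typedef struct {
    size_t begin;  /* first dirty byte */
    size_t end;    /* one past the last dirty byte; begin == end means clean */
} DirtyRange;

/* Record that bytes [offset, offset + size) were modified. */
static void dirty_mark(DirtyRange *d, size_t offset, size_t size)
{
    assert(d != NULL);
    if (size == 0) return;
    if (d->begin == d->end) {            /* currently clean */
        d->begin = offset;
        d->end = offset + size;
    } else {                             /* grow to cover both ranges */
        if (offset < d->begin) d->begin = offset;
        if (offset + size > d->end) d->end = offset + size;
    }
}

/* If anything is dirty: report the range, reset to clean, return 1.
   Otherwise return 0 and leave the outputs untouched. */
static int dirty_take(DirtyRange *d, size_t *offset, size_t *size)
{
    assert(d != NULL && offset != NULL && size != NULL);
    if (d->begin == d->end) return 0;
    *offset = d->begin;
    *size = d->end - d->begin;
    d->begin = d->end = 0;
    return 1;
}

/* Per-frame use, with the VBO bound to GL_ARRAY_BUFFER and `verts`
   pointing at the system-memory copy:

       size_t off, len;
       if (dirty_take(&dirty, &off, &len))
           glBufferSubData(GL_ARRAY_BUFFER, (GLintptr)off, (GLsizeiptr)len,
                           (const char *)verts + off);
*/
```

Call dirty_mark() wherever the application writes into the array. If updates tend to land far apart, a small list of ranges would avoid uploading the gap between them.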
Actually, glMapBufferARB() causes a sync issue. glMapBuffer() will wait (stall) until OpenGL has finished its pending work on the bound buffer.
The asynchronous transfer with PBOs comes from pixel operation calls, such as glDrawPixels() or glTexImage2D(), not from glMapBuffer(). For these calls OpenGL can schedule a DMA transfer and return immediately.
I don’t see an asynchronous advantage with VBOs. The performance gain of a VBO comes from storing the vertex data in VRAM, so it can be fed to the rendering pipeline quickly.
You won’t get a stall if you call glBufferData with NULL before mapping (presuming you’re only writing to the buffer, which is all you should ever do).
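For anyone who wants to see the call order, here is a minimal sketch of that "glBufferData with NULL, then map" sequence. To keep it compilable without a GL context, the three entry points are passed in as function pointers. BufferApi, orphan_and_fill, the SKETCH_* macros and the fake_* stand-ins are all made-up names. In a real program the pointers would refer to (or wrap) glBufferData, glMapBuffer and glUnmapBuffer. Whether the driver really hands back fresh storage is up to the implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Enum values as defined in the OpenGL headers. */
#define SKETCH_ARRAY_BUFFER 0x8892  /* GL_ARRAY_BUFFER */
#define SKETCH_WRITE_ONLY   0x88B9  /* GL_WRITE_ONLY   */
#define SKETCH_STREAM_DRAW  0x88E0  /* GL_STREAM_DRAW  */

/* Hypothetical table of the three entry points the technique needs. */
typedef struct {
    void (*buffer_data)(unsigned target, ptrdiff_t size,
                        const void *data, unsigned usage);
    void *(*map_buffer)(unsigned target, unsigned access);
    unsigned char (*unmap_buffer)(unsigned target);
} BufferApi;

/* Replace the whole contents of the currently bound VBO.
   Returns 1 on success, 0 if mapping or unmapping failed. */
static int orphan_and_fill(const BufferApi *gl, const void *src, size_t size)
{
    void *dst;
    /* NULL data: the old store is dropped, so the driver need not wait
       for draws that still read it. */
    gl->buffer_data(SKETCH_ARRAY_BUFFER, (ptrdiff_t)size, NULL,
                    SKETCH_STREAM_DRAW);
    dst = gl->map_buffer(SKETCH_ARRAY_BUFFER, SKETCH_WRITE_ONLY);
    if (dst == NULL) return 0;
    memcpy(dst, src, size);              /* write only, never read back */
    /* Unmap reports failure if the data store was lost meanwhile. */
    return gl->unmap_buffer(SKETCH_ARRAY_BUFFER) ? 1 : 0;
}

/* Stand-in backend so the sketch runs without a GL context. */
static unsigned char fake_store[256];
static int fake_orphaned = 0;

static void fake_buffer_data(unsigned target, ptrdiff_t size,
                             const void *data, unsigned usage)
{
    (void)target; (void)usage;
    assert(size >= 0 && (size_t)size <= sizeof fake_store);
    if (data == NULL) fake_orphaned++;   /* count orphaning calls */
}

static void *fake_map(unsigned target, unsigned access)
{
    (void)target; (void)access;
    return fake_store;
}

static unsigned char fake_unmap(unsigned target)
{
    (void)target;
    return 1;
}
```

With the stand-ins, `BufferApi api = { fake_buffer_data, fake_map, fake_unmap };` followed by `orphan_and_fill(&api, verts, size)` copies the vertices into fake_store. The size and usage passed with NULL should match what the buffer was created with.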
You could also get a performance boost if you use two VBOs and ping-pong between them.
APPLE_flush_buffer_range provides… a method to asynchronously modify buffer object data.
mmm, this range flushing should have been part of the vbo spec from the beginning. But it took long enough for them to agree on the heap of trash that is the vbo spec we have now.
BTW, Apple has some really nice extensions… If only everyone could implement them…
On my system (ATI X3850 + Cat 8.2), map/unmap slows rendering down considerably. glBufferSubData is always faster.
Even if I have to modify the data before the upload, it’s faster to copy it to temporary memory, modify it there and then use glBufferSubData.
Could be an ATI or application specific thing though.
And are the VBOs we control really in VRAM? I thought they are in system memory managed by the driver. The driver might upload the data to VRAM later, asynchronously, when the bus is idle. Or the GPU might DMA it from “driver” memory. (All speculation, of course.)