I’m currently using keyframe animation in my engine. Each frame I interpolate the data between keyframes and render it using normal vertex arrays with glDrawElements.
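For readers following along, the per-frame interpolation described here might look roughly like the sketch below. It assumes plain linear blending of tightly packed float data; the function name `lerp_keyframes` and its parameters are hypothetical and not taken from the poster's engine.

```c
#include <assert.h>
#include <stddef.h>

/* Linearly blend two keyframes into `out`.
 * key_a, key_b and out each hold `count` floats (e.g. 3 per vertex).
 * t is the blend factor in [0, 1]: 0 yields key_a, 1 yields key_b.
 * Hypothetical helper, assuming simple linear interpolation. */
void lerp_keyframes(const float *key_a, const float *key_b,
                    float *out, size_t count, float t)
{
    assert(key_a != NULL && key_b != NULL && out != NULL);
    assert(t >= 0.0f && t <= 1.0f);
    for (size_t i = 0; i < count; ++i) {
        out[i] = key_a[i] + t * (key_b[i] - key_a[i]);
    }
}
```

The `out` array is what would then be handed to the vertex array pointer before the draw call, which is why the geometry has to be resent every frame.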
Would using glDrawRangeElements be faster? If so, why? Note that I always use the full range of vertices.
Would using VBO be faster? If so, why? Note that I must send the geometry every frame.
Any other ideas to render geometry faster?
Thanks in advance. Martiño.
Streaming VBO is meant for this.
glDrawRangeElements would be faster than what you have because the driver wouldn’t need to calculate the range of data that needs to be sent down.
VBO would be faster if you use that data multiple times during the frame. If you don’t use the data multiple times, it wouldn’t be faster and could possibly be slower due to the overhead associated with VBO (memory surface creation and tracking).
If you use the VBO extension, you can map the buffer and write your data into it every frame. This memory is likely to be uncached, so you avoid the cache pollution caused by writing out the variable data. This can be a significant gain; the “memory management” cost of re-allocating the buffer each frame with NULL input data should be insignificant in comparison if you’re streaming heavily.
Regarding DrawRangeElements(), just because YOU know that you’re using all the vertices, doesn’t mean that the driver knows how many vertices there are (unless you use VBO). You just give it a pointer, and then a bunch of indices. The driver would have to scan the index list, calculating min/max, if it wanted to know which area was actually used. If you use DrawRangeElements(), you’re telling the driver this data up front, which allows it to make better decisions.
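To make the cost concrete, here is a minimal sketch of the kind of scan described above, assuming 16-bit indices. The names `IndexRange` and `scan_index_range` are hypothetical; this illustrates the idea and is not actual driver code.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest and largest vertex index referenced by an index list. */
typedef struct {
    unsigned int min_index;
    unsigned int max_index;
} IndexRange;

/* Walk the entire index list to find which vertices are referenced.
 * A driver that only receives a pointer and a list of indices would
 * need a pass like this to learn how much vertex data is in use. */
IndexRange scan_index_range(const unsigned short *indices, size_t count)
{
    assert(indices != NULL && count > 0);
    IndexRange r = { indices[0], indices[0] };
    for (size_t i = 1; i < count; ++i) {
        if (indices[i] < r.min_index) r.min_index = indices[i];
        if (indices[i] > r.max_index) r.max_index = indices[i];
    }
    return r;
}
```

With glDrawRangeElements(mode, start, end, count, type, indices) the application supplies `start` and `end` itself. If every vertex is always used, as in the original question, passing 0 and the vertex count minus one requires no scan at all.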
I wouldn’t call the memory management insignificant unless you’re using larger buffers than the average size I’ve seen so far. When you constantly pass a NULL data pointer, you’re asking the driver to continually allocate new chunks of memory (most people try to avoid lots of system memory mallocs in performance paths, since they’re known to be slow). This can lead to fragmentation of video memory. On top of that, a timestamp tracking system has to be used to track when surfaces can be freed. Don’t get me wrong, it’s not the worst thing to do, but calling it insignificant without a little benchmark targeting your usage is risky.
If you’re worried about cache pollution, you might want to try using non-temporal copies.
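As a rough illustration of a non-temporal copy, the sketch below uses the SSE streaming-store intrinsic `_mm_stream_ps` where the compiler defines `__SSE__`, and falls back to `memcpy` elsewhere. The function name `stream_copy_floats` is hypothetical, and whether this actually helps depends on the hardware and on where the destination memory lives, so it should be benchmarked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Copy `count` floats from src to dst, using streaming (non-temporal)
 * stores when available so the written data bypasses the CPU cache.
 * Hypothetical helper; falls back to memcpy when SSE is unavailable
 * or when dst is not 16-byte aligned. */
void stream_copy_floats(float *dst, const float *src, size_t count)
{
    assert(dst != NULL && src != NULL);
    size_t i = 0;
#if defined(__SSE__)
    /* _mm_stream_ps requires a 16-byte aligned destination. */
    if (((uintptr_t)dst & 15u) == 0) {
        for (; i + 4 <= count; i += 4) {
            _mm_stream_ps(dst + i, _mm_loadu_ps(src + i));
        }
        _mm_sfence(); /* order the streaming stores before later stores */
    }
#endif
    /* Remaining tail, unaligned destinations, and non-SSE builds. */
    if (i < count) {
        memcpy(dst + i, src + i, (count - i) * sizeof(float));
    }
}
```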
Thanks for the info. I think I will use glDrawRangeElements, since it has been available since GL 1.2 and I will have fewer compatibility problems.
When the buffer is allocated with the STREAM_DRAW hint, then the driver can do magic things behind the scenes to make it much more efficient. For example, it may put all STREAM_DRAW buffers in one big, cyclic buffer, and allocate by just moving the top pointer up (and then around, at the end).
Yes, the driver would have to fall back to “standard” memory allocations if you had a bunch of STREAM_DRAW buffers that weren’t actually used according to the STREAM_DRAW performance hint suggestions, but then, that’d be your own fault.
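The cyclic-buffer idea can be sketched in a few lines of plain C. This is only an illustration of the allocation pattern, under the assumption that offsets are handed out by bumping a head pointer; `StreamRing` and `stream_ring_alloc` are hypothetical names, and nothing here is confirmed to match what any driver does internally.

```c
#include <assert.h>
#include <stddef.h>

/* Bookkeeping for one large streaming buffer used cyclically. */
typedef struct {
    size_t capacity; /* total bytes in the backing buffer */
    size_t head;     /* offset where the next allocation starts */
} StreamRing;

/* Reserve `size` bytes and return their offset in the backing buffer.
 * If the request does not fit in the remaining tail, wrap around to
 * offset 0. NOTE: a real implementation must also make sure the GPU
 * has finished reading the region being reused (fences/timestamps);
 * that synchronization is deliberately omitted from this sketch. */
size_t stream_ring_alloc(StreamRing *ring, size_t size)
{
    assert(ring != NULL && size > 0 && size <= ring->capacity);
    if (ring->head + size > ring->capacity) {
        ring->head = 0; /* wrap to the start of the buffer */
    }
    size_t offset = ring->head;
    ring->head += size;
    return offset;
}
```

Allocation is then a comparison and an addition, which is why this pattern can be much cheaper than a general-purpose allocator for short-lived per-frame data.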
Have you measured this? Does the driver really do it? I ask because I was planning to do a similar thing myself (allocating one big dynamic VBO and doing the circular-buffer management by hand).
I have also seen other apps doing it themselves…
Nvidia drivers as early as version 45 did this, I suppose, because otherwise it wouldn’t have been possible to get peak performance through stream draw. It was just fine allocating a new VBO buffer (with a NULL data pointer) for 500-2000 vertices several dozen times per frame.