compiled vertex array - no performance gain?

I changed my terrain engine to use vertex arrays, and then compiled vertex arrays (CVA).
I’ve tested my game on Linux with DRI.

first I get glLockArraysEXT address:

glstuff->glLockArraysEXT = (PFNGLLOCKARRAYSEXTPROC)glXGetProcAddressARB((GLubyte*)"glLockArraysEXT");

then I enable states:

glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_TEXTURE_COORD_ARRAY);

then I define and lock arrays:

glInterleavedArrays(GL_T2F_V3F, 0, vertices);
glstuff->glLockArraysEXT(0, hsizex*hsizez);

my engine uses the ROAM method, so the indices change every frame:

indices[0]=leftx+leftz*hsizex;
indices[1]=rightx+rightz*hsizex;
indices[2]=centerx+centerz*hsizex;
glDrawElements(GL_TRIANGLES,3,GL_UNSIGNED_INT,indices);

Everything is displayed correctly, but I see no performance gain.
What is wrong?

  1. I draw a single TRIANGLE per call - is it bad?
  2. vertex array is huge (1024*1024) - too big?
  3. I am doing something wrong with the compiled vertex array extension - how can I check it?

“2. vertex array is huge (1024*1024) - too big?”
is likely to be the correct answer

CVA usually gives you a perf gain when small batches are locked at a time.
Try locking a few hundred to a few thousand tris at a time.
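
To put rough numbers on the “too big” guess: GL_T2F_V3F interleaves 2 texture-coordinate floats and 3 position floats per vertex. The sketch below only does that arithmetic; the helper name locked_bytes is made up for illustration, and what a driver actually does with a locked range (copy it, pre-transform it, or ignore the lock) is implementation dependent.

```c
#include <stddef.h>

/* GL_T2F_V3F: 2 texture-coordinate floats + 3 position floats per vertex. */
enum { FLOATS_PER_T2F_V3F_VERTEX = 5 };

/* Hypothetical helper: bytes of vertex data covered by a lock of `count`
   vertices, i.e. by glLockArraysEXT(first, count). */
size_t locked_bytes(size_t count)
{
    return count * FLOATS_PER_T2F_V3F_VERTEX * sizeof(float);
}
```

Assuming 4-byte floats, locking all 1024*1024 vertices covers 20 MiB of data, while a 64*64 patch covers 80 KiB.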

OK, now I’ve put glLockArraysEXT and glUnlockArraysEXT in Patch::Draw, and the size of a patch could be, for example, 64*64. I tested it and still see no performance boost.

How can I check if glLockArraysEXT was successful?
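
One partial answer, as a sketch: glLockArraysEXT returns nothing, so there is no direct success flag, but you can at least confirm that the driver advertises GL_EXT_compiled_vertex_array. A non-NULL pointer from glXGetProcAddressARB is not proof on its own, since it may return a pointer for functions the implementation does not support. The helper below (has_extension is a name made up here) does a whole-token search of the space-separated string that glGetString(GL_EXTENSIONS) returns.

```c
#include <string.h>

/* Returns 1 if `name` appears as a whole space-separated token in
   `ext_list` (the format of glGetString(GL_EXTENSIONS)), else 0. */
int has_extension(const char *ext_list, const char *name)
{
    size_t len;
    const char *p;

    if (ext_list == NULL || name == NULL)
        return 0;
    len = strlen(name);
    if (len == 0 || strchr(name, ' ') != NULL)
        return 0;

    p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        /* Reject partial matches such as "GL_EXT_foo" inside "GL_EXT_foobar". */
        int starts_token = (p == ext_list) || (p[-1] == ' ');
        int ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return 1;
        p += len;
    }
    return 0;
}
```

With a current GL context you would call it as has_extension((const char *)glGetString(GL_EXTENSIONS), "GL_EXT_compiled_vertex_array"). The extension also defines GL_ARRAY_ELEMENT_LOCK_FIRST_EXT and GL_ARRAY_ELEMENT_LOCK_COUNT_EXT for glGetIntegerv, which report the locked range but not whether the driver does anything useful with it.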

You should note that you will only see a performance gain with CVAs under certain conditions, namely when drawing multiple passes between lock/unlock pairs.
e.g.

lock arrays
draw diffuse light
draw beauty
draw specular
unlock

And if memory serves me, even then you may not notice any significant change on some hardware.

Yep, CVAs are pretty much pointless if you’re not multipassing on your vertex data.

Hi Jacek,

If vertex transfer is never a bottleneck in your app, then you won’t see a perf improvement by making it faster.

Are you sure you’re vertex transfer limited some of the time?

Thanks -
Cass

hello DFrey, you wrote:

“You should note that you will only see a performance gain with CVAs under certain conditions, namely when drawing multiple passes between lock/unlock pairs.”

But what about vertex sharing? I use the ROAM method - I don’t use every vertex in every frame. So I can’t use FANS/STRIPS (there are methods for creating fans/strips with ROAM, but I don’t know them). I just used immediate mode. Shouldn’t performance increase after switching from immediate mode GL_TRIANGLES to Vertex Arrays?
BTW I’ve read different information on newsgroups about vertex arrays. Are all vertices calculated when locking, or just the vertices I use?

hello cass, you wrote:

“Are you sure you’re vertex transfer limited some of the time?”

I am not sure, and I don’t know how to test it. What I am trying to do is draw 7km of terrain distance with points every 10m. I don’t use every vertex every frame - that would kill my Voodoo3. I have ROAM, which takes only a few vertices and creates triangles: GL_TRIANGLES, not strips or fans. So I transfer 3 vertices for every triangle. I thought Vertex Arrays would fix this, and that vertex transfer would be 4-6x smaller.
The problem is - I still get only about 5fps on an Athlon XP 1800 and Voodoo3. And I have only texturing, without lights (but it’s not “one texture per patch”; I use different textures for mountains, hills or lakes).
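
As a rough check on the 4-6x figure, here is the arithmetic for the best case: a fully tessellated regular grid of n x n cells, two triangles per cell. The function names are made up for illustration, and a ROAM mesh shares vertices less regularly than this, so the real ratio would be lower.

```c
/* Vertices submitted when every triangle sends its own 3 corners
   (immediate-mode GL_TRIANGLES): 2 triangles per cell, 3 vertices each. */
unsigned long unshared_vertices(unsigned long n)
{
    return 6UL * n * n;
}

/* Unique vertices in the same n x n cell grid. */
unsigned long shared_vertices(unsigned long n)
{
    return (n + 1UL) * (n + 1UL);
}
```

For a 64x64-cell patch this gives 24576 submitted vertices against 4225 unique ones, a ratio of about 5.8. Whether that reduction shows up as frame rate depends on vertex transfer actually being the bottleneck.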

UPDATE: I’ve added lighting based on normals. It looks terrible (LOD!), but performance is exactly the same as before (without lighting).
What does it mean?
I will probably release this code to the public (http://decopter.sf.net), but I am afraid it’s too slow to use.

Two problems:

  1. Your vertex array should be chunked in 33x33 vertex chunks or something like that for locality.

  2. Drawing a single triangle per DrawElements() call is a fair bit of waste. Do what you can to issue larger lists.
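
A minimal sketch of the chunking in point 1, assuming chunks of 32x32 cells (33x33 vertices) stored one after another, with border vertices duplicated between neighbouring chunks. All names here are hypothetical; this is one possible layout, not necessarily the one meant above.

```c
enum {
    CHUNK_CELLS = 32,                           /* cells per chunk side    */
    CHUNK_VERTS = CHUNK_CELLS + 1,              /* vertices per chunk side */
    VERTS_PER_CHUNK = CHUNK_VERTS * CHUNK_VERTS /* 33 * 33 = 1089          */
};

/* First vertex of chunk (cx, cz) in a chunk-major vertex array. */
int chunk_first_vertex(int cx, int cz, int chunks_per_side)
{
    return (cz * chunks_per_side + cx) * VERTS_PER_CHUNK;
}

/* Index of vertex (lx, lz) inside its chunk, with lx and lz in 0..32. */
int chunk_local_index(int lx, int lz)
{
    return lz * CHUNK_VERTS + lx;
}

/* Global grid coordinate of local coordinate l in chunk column/row c.
   The last vertex of one chunk coincides with the first of the next. */
int chunk_to_grid(int c, int l)
{
    return c * CHUNK_CELLS + l;
}
```

With this layout a patch could lock just its own block, for example glLockArraysEXT(chunk_first_vertex(cx, cz, n), VERTS_PER_CHUNK), and use indices of chunk_first_vertex(...) + chunk_local_index(...), so each lock touches 1089 contiguous vertices instead of the whole terrain.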

I’ve had very little luck seeing a real performance improvement with compiled vertex arrays as well.

I finally sat down and wrote a synthetic test that definitely should show a performance improvement with compiled vertex arrays and sure enough I did, but only on some platforms. On others, it seemed as though the lock was ignored entirely.

If anyone’s interested I’ll see if I can clean up the code so it’s readable. The main point though is that you’re probably not doing anything wrong, it’s just that compiled vertex arrays seem to be very sensitive to when and where they’re a benefit over regular vertex arrays.

hello jwatte, you wrote:

“Drawing a single triangle per DrawElements() call is a fair bit of waste. Do what you can to issue larger lists.”

I created a few index arrays and sorted triangles by texture. Works faster now :)
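
For readers who want to try the same thing, here is a sketch of one index array per texture. The type and function names and the batch size are assumptions made for illustration, not the poster’s code; only the x + z*hsizex indexing comes from the original post.

```c
#include <stddef.h>

enum { MAX_BATCH_INDICES = 3 * 4096 }; /* hypothetical cap: 4096 triangles */

/* One of these per texture. */
typedef struct {
    unsigned int indices[MAX_BATCH_INDICES];
    size_t count; /* number of indices currently stored */
} IndexBatch;

/* Row-major vertex index, as in the original post: x + z * hsizex. */
unsigned int grid_index(unsigned int x, unsigned int z, unsigned int hsizex)
{
    return x + z * hsizex;
}

/* Appends one triangle. Returns 0 if the batch is full, in which case
   the caller should draw and reset the batch before retrying. */
int batch_add_triangle(IndexBatch *b, unsigned int i0, unsigned int i1,
                       unsigned int i2)
{
    if (b->count + 3 > MAX_BATCH_INDICES)
        return 0;
    b->indices[b->count++] = i0;
    b->indices[b->count++] = i1;
    b->indices[b->count++] = i2;
    return 1;
}
```

Once per frame, after ROAM has emitted its triangles, each non-empty batch would be drawn by binding its texture, calling glDrawElements(GL_TRIANGLES, (GLsizei)b->count, GL_UNSIGNED_INT, b->indices), and resetting count to 0. That is one draw call per texture instead of one per triangle.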