Advice on a couple of things

IneQuation.pl · September 20, 2008, 12:44pm

Hi,

I’ve started work on a free game engine and I have a couple of questions, mainly regarding renderer design and performance. You see, I’d like my renderer to take full advantage of OpenGL 2.1. The problem is that even though there are many tutorials that show how to use VBOs, FBOs, GLSL etc., they do not explain how to use them efficiently.

Some background on my engine: I’d like it to be a large, realistic outdoor environment engine, ARMA: Armed Assault being the closest of inspirations, with a target geometry throughput of about 1M triangles per frame. Model polycounts are to be at levels similar to ARMA as well.

So, here are my questions:

1. How do I organise the scene’s geometry into VBOs to ensure optimal performance? Which parts of the world do I put in which kind of a VBO? How many VBOs should I create?
Here’s my sloppy go at the issue: I was thinking of creating a number of static VBOs and uploading all the world geometry to them.

- Terrain VBO: I’d have a fixed pool of vertices for use with the terrain; they’d be put into their positions using a vertex shader VTF-ing the heightmap. A single terrain patch would consist of a fixed amount of vertices/triangles, but would cover a varying amount of space, depending on the LOD.
- Map objects VBO: all the map objects (i.e. buildings, trees, all kinds of static props) stuffed into one VBO, one instance of each. Have a stream IBO assembled each frame and rendered.

I have yet to think of what to do with entities (players, vehicles, all the moving parts of the world). Dynamic or stream VBOs?
Do these ideas sound reasonable? Or is there a better way?

2. What’s more expensive, texture binding or program switching? Should I sort surfaces by textures or programs to minimise the number of state changes?
I’ve introduced the concept of materials in my engine. A material object consists of a texture object (an encapsulation of one or more texture names/IDs) and a GPU program ID (all compiled and linked, ready to be glUseProgram’d). I know that binding textures and switching programs are both expensive operations, so I’m going to introduce surface sorting. I’m guessing that in a scene there are bound to be fewer program changes than texture changes, so I could sort first by the program, then by the texture objects. How does that sound?
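To make the program-then-texture ordering concrete, here is a minimal C++ sketch. The `Surface` struct, its fields and the helper functions are hypothetical names made up for illustration; no OpenGL calls are made, and the counter only shows how many switches a given draw order would cause.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical draw record; field names are illustrative, not from a real engine.
struct Surface {
    std::uint32_t program;  // GPU program ID (what would go to glUseProgram)
    std::uint32_t texture;  // texture object ID
    std::uint32_t mesh;     // whatever identifies the geometry to draw
};

// Program goes in the high bits so it dominates the ordering; texture in the low bits.
inline std::uint64_t sortKey(const Surface& s) {
    return (static_cast<std::uint64_t>(s.program) << 32) | s.texture;
}

// Group surfaces by program first, then by texture within each program.
inline void sortSurfaces(std::vector<Surface>& surfaces) {
    std::stable_sort(surfaces.begin(), surfaces.end(),
                     [](const Surface& a, const Surface& b) {
                         return sortKey(a) < sortKey(b);
                     });
}

// Count how many program and texture switches a given draw order would need.
struct StateChanges {
    int programs;
    int textures;
};

inline StateChanges countStateChanges(const std::vector<Surface>& surfaces) {
    StateChanges c{0, 0};
    for (std::size_t i = 0; i < surfaces.size(); ++i) {
        if (i == 0 || surfaces[i].program != surfaces[i - 1].program) ++c.programs;
        if (i == 0 || surfaces[i].texture != surfaces[i - 1].texture) ++c.textures;
    }
    return c;
}
```

Swapping the two halves of the key would sort by texture first instead, so comparing `countStateChanges` for both orders on a real scene is a cheap way to see which one suits the content.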

3. Is GPU-based skeletal animation (skinning) worth a try? Say, having the bone motion data dumped into a floating point texture VTF-ed in the vertex shader? Is it widely used or is it still done by the CPU in today’s engines?

Thanks a lot in advance!

IneQuation

Ilian_Dinev · September 21, 2008, 6:53am

I’m just a newb enthusiast, but maybe I can help a bit:

By “making full use of gl2.1”, I’m assuming you target at least GF8x00, with ATi support to come later by adjusting to GL3. VTF runs nicely only on unified-shaders cards, after all. [and users can’t buy a shaderModel3/AGP card anymore]

The VTF terrains are a good direction, imho. Use several VBOs (2x2 … 1024x1024), with each vertex consisting of only two 16-bit integers (gl_position.xy), and have an index buffer for each VBO (avoid tri-strips and stuff). You’ll need to calculate normals either in the shader, or have a precomputed texture for that.
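A rough C++ sketch of building one such grid patch on the CPU, before uploading it. `GridVertex`, `GridPatch` and `buildGridPatch` are hypothetical names, and the triangle winding and the use of 32-bit indices are assumptions (a 1024x1024 patch has more vertices than 16-bit indices can address).

```cpp
#include <cstdint>
#include <vector>

// One terrain vertex: only a 2D grid coordinate in two 16-bit integers.
// The height would be fetched from the heightmap in the vertex shader.
struct GridVertex {
    std::uint16_t x;
    std::uint16_t y;
};

struct GridPatch {
    std::vector<GridVertex> vertices;     // contents for the VBO
    std::vector<std::uint32_t> indices;   // contents for the index buffer (triangle list)
};

// Build a patch of n x n quads, i.e. (n + 1) x (n + 1) vertices.
inline GridPatch buildGridPatch(std::uint32_t n) {
    GridPatch patch;
    const std::uint32_t side = n + 1;
    patch.vertices.reserve(static_cast<std::size_t>(side) * side);
    for (std::uint32_t y = 0; y < side; ++y) {
        for (std::uint32_t x = 0; x < side; ++x) {
            patch.vertices.push_back({static_cast<std::uint16_t>(x),
                                      static_cast<std::uint16_t>(y)});
        }
    }
    patch.indices.reserve(static_cast<std::size_t>(n) * n * 6);
    for (std::uint32_t y = 0; y < n; ++y) {
        for (std::uint32_t x = 0; x < n; ++x) {
            const std::uint32_t i0 = y * side + x;  // corner at (x, y)
            const std::uint32_t i1 = i0 + 1;        // corner at (x + 1, y)
            const std::uint32_t i2 = i0 + side;     // corner at (x, y + 1)
            const std::uint32_t i3 = i2 + 1;        // corner at (x + 1, y + 1)
            // Two triangles per quad; the winding order here is an assumption.
            patch.indices.insert(patch.indices.end(), {i0, i2, i1, i1, i2, i3});
        }
    }
    return patch;
}
```

Each vertex is 4 bytes, so even the largest patch stays small compared with a full position/normal/texcoord layout.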

You can stuff many meshes’ vertices in one uber VBO if the data-size of each vertex is the same for each mesh. I.e. you can’t put together two meshes, where one uses “3 floats for position, 3 for normal, 2 for texcoords” and the other uses “3 floats for pos, 3 for normal, 8 for texcoords/attribs”.
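As a sketch of the uber-VBO idea, the C++ below appends meshes that share one vertex layout into a single pair of CPU-side arrays and remembers where each mesh landed. `Vertex`, `MeshRange` and `SharedBuffer` are hypothetical names; the indices are rebased on the CPU here, on the assumption that each mesh will be drawn with a plain indexed draw call over its own index range.

```cpp
#include <cstdint>
#include <vector>

// The one vertex layout every mesh in this buffer must share:
// 3 floats position, 3 floats normal, 2 floats texcoords (32 bytes).
struct Vertex {
    float position[3];
    float normal[3];
    float texcoord[2];
};

// Where a mesh ended up inside the shared buffers.
struct MeshRange {
    std::uint32_t baseVertex;  // index of the mesh's first vertex
    std::uint32_t firstIndex;  // offset of the mesh's first index
    std::uint32_t indexCount;  // number of indices to draw
};

struct SharedBuffer {
    std::vector<Vertex> vertices;        // contents for the single VBO
    std::vector<std::uint32_t> indices;  // contents for the single index buffer

    // Append one mesh and return the range needed to draw it later.
    MeshRange append(const std::vector<Vertex>& meshVertices,
                     const std::vector<std::uint32_t>& meshIndices) {
        MeshRange range{static_cast<std::uint32_t>(vertices.size()),
                        static_cast<std::uint32_t>(indices.size()),
                        static_cast<std::uint32_t>(meshIndices.size())};
        vertices.insert(vertices.end(), meshVertices.begin(), meshVertices.end());
        // Rebase indices so they point at the mesh's vertices in the shared array.
        for (std::uint32_t i : meshIndices) {
            indices.push_back(range.baseVertex + i);
        }
        return range;
    }
};
```

A mesh with a different layout (say, 8 floats of texcoords/attribs) would need its own `SharedBuffer` with its own vertex struct.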

All moving entities can be put in static VBOs, too. If they’re moved/rotated - no difference from those meshes being trees. If they’re skin-meshed, you can simply use a shader to do the animation. No need to get the CPU involved.
But creating the IBO every time… I think you’re better off using instancing. I bet it’ll be many times faster than a streamed IBO.

I think texture-binding used to be on-par with program-switching before. But I bet program-switching will become more expensive. Especially when now we have texture-arrays. (you change a texture simply via a uniform, or via an instanced-object’s instanced attribute, or via a simple attribute).
As for GPU skinning: yep, it’s almost useless to compute that on the CPU now. Even more interesting, you can try using quaternions for the rotations - I saw a neat publication on this that showed how superior those transformations are (it makes the triangles move nicely, instead of twisting in an ugly way). You can even do complex animation like in “Uncharted: Drake’s Fortune” with multiple VTFetches.
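For reference, here is a small C++ sketch of rotating a vector by a unit quaternion, the basic operation a quaternion-based bone rotation would rely on. This is only the plain textbook rotation, not the technique from the publication mentioned above; `Vec3`, `Quat`, `fromAxisAngle` and `rotate` are hypothetical names, and the same arithmetic would have to be ported to the vertex shader.

```cpp
#include <cmath>

struct Vec3 {
    float x, y, z;
};

// Unit quaternion: (x, y, z) is the vector part, w the scalar part.
struct Quat {
    float x, y, z, w;
};

inline Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

// Build a rotation of `angle` radians about `axis` (axis must be normalised).
inline Quat fromAxisAngle(const Vec3& axis, float angle) {
    const float s = std::sin(angle * 0.5f);
    return {axis.x * s, axis.y * s, axis.z * s, std::cos(angle * 0.5f)};
}

// Rotate v by unit quaternion q using
//   t  = 2 * cross(q.xyz, v)
//   v' = v + q.w * t + cross(q.xyz, t)
inline Vec3 rotate(const Quat& q, const Vec3& v) {
    const Vec3 u{q.x, q.y, q.z};
    const Vec3 c = cross(u, v);
    const Vec3 t{2.0f * c.x, 2.0f * c.y, 2.0f * c.z};
    const Vec3 ut = cross(u, t);
    return {v.x + q.w * t.x + ut.x,
            v.y + q.w * t.y + ut.y,
            v.z + q.w * t.z + ut.z};
}
```

A rotation stored this way takes four floats per bone instead of nine for a 3x3 matrix, which matters when the bone data is packed into a texture.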

I think I’d suggest feeding OpenGL with ASM shaders (precompiled by cgc). And use semantics - no need to ask OpenGL or guess where your uniforms/attributes are.

Also, make good use of z-cull and move as much computation as possible to the fragment-shader. For z-cull, you do a preliminary depth-pass (disable writing to color, use a bare-bones vertex-shader and frag-shader). The reason for moving computation from vertex-shaders to frag-shaders is that there are more polygons than pixels. The faster you process triangles, the faster you can discard all those triangles that are hidden by the z-buffer.

IneQuation.pl · September 21, 2008, 2:17pm

Thanks, man! Rolling up my sleeves, getting to work. I’ll post if I need a hint.