100,000 polys w/ lighting @ 30 fps - how?

Yooyo,

thank you. I had thought of using VBOs, but I didn’t know I could use two simultaneously. AFAIK I can’t, or am I wrong? I am already using them for 3D objects (robots, player ships, powerups etc.), so I know the basics about these.

Faces already get sorted by textures.

I wouldn’t know how to further optimize occlusion culling (which costs about 40% of the entire rendering time). Currently the cuboids are walked, starting at the viewer’s segment; cuboid faces are transformed and projected to determine what they occlude, until it can be safely said that all further segments’ faces are occluded.

I am not using glGet…

Will look at glMultiDrawElementsEXT.

The vertex pointers are set per TMU. Don’t the additional TMUs need to know the vertices, too?

skynet,

I never said I’d come even remotely close to 100K polys @ 30 fps. I know other engines do, that’s why I am asking.

A typical Descent 2 mine has dozens and dozens of dynamic (i.e. moving, destructible, flashing) lights. There are often 16 or more lights affecting a single face (particularly during fire fights, which can spam the area with lights), and using fewer leads to lighting flaws. Blame it on the stone age engine. That’s why lighting takes so long: I have to determine the closest lights to each face (currently doing this per segment). That means: the fewer faces, the less work in this area, so I need software occlusion culling. I am already using precomputed lightmaps for static lights, but unless I have a stroke of genius (or use deferred lighting), I will not get around that type of light handling.

So while VBOs might help the draw calls, they wouldn’t really help much overall, given that the draw calls only make up 25% of the entire rendering process.

As I said, even using VBOs for the 3D models only doubled their rendering speed. The only other thing I know of that I could apply here is face reordering to optimize the gfx hardware’s vertex cache usage.

Edit: I was wrong. It’s almost 8 times faster.

Man… you have to read the OpenGL spec before you start coding!

Yes… you can use more than one VBO. Read the VBO spec and the nvidia document listed above.

A TMU doesn’t care about vertices, colors and normals… again… take a look at this: http://www.opengl.org/documentation/specs/version1.1/state.pdf
or better, download one of the PDFs from http://www.opengl.org/documentation/specs/ and READ!
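
Something like this (buffer names are just an example; each gl*Pointer call captures whatever buffer is currently bound to GL_ARRAY_BUFFER, so different attributes can live in different VBOs):

// hypothetical buffer objects, created earlier with glGenBuffers/glBufferData
glBindBuffer(GL_ARRAY_BUFFER, vboPositions);
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(3, GL_FLOAT, 0, (void *) 0);

glBindBuffer(GL_ARRAY_BUFFER, vboTexCoords);          // a second, separate VBO
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glTexCoordPointer(2, GL_FLOAT, 0, (void *) 0);

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, vboIndices);    // indices in a third VBO
glDrawElements(GL_TRIANGLES, indexCount, GL_UNSIGNED_SHORT, (void *) 0);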

A question for you… do you want per-vertex or per-pixel lighting?

If you are doing per-pixel lighting via shaders, then you are not limited to the standard 8 lights. If you manage to push 16 (or however many you need) light positions (and colors) into uniforms, you can calculate their contribution to the pixel color.
Of course, too many uniforms can be a performance hit of its own, but only experimenting will tell.
Another idea: encode the light parameters into a texture.
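
Rough, untested fragment shader sketch of the uniform approach (the 16-light limit and all names are just an example; on older SM2 hardware the loop may have to be unrolled or the light count compiled in):

uniform int  numLights;           // how many array entries are valid this frame
uniform vec3 lightPos[16];        // light positions (eye space assumed)
uniform vec3 lightColor[16];

varying vec3 fragPos;             // interpolated from the vertex shader
varying vec3 fragNormal;

void main()
{
    vec3 n = normalize(fragNormal);
    vec3 light = vec3(0.0);
    for (int i = 0; i < 16; i++) {
        if (i >= numLights)
            break;
        vec3  toLight = lightPos[i] - fragPos;
        float atten   = 1.0 / (1.0 + dot(toLight, toLight));   // crude distance attenuation
        light += lightColor[i] * max(dot(n, normalize(toLight)), 0.0) * atten;
    }
    gl_FragColor = vec4(light, 1.0);
}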

Ouch, if occlusion culling eats 40% of the CPU time spent on rendering each frame, then you definitely need to do something (are you sure about that number?). You can optimize the geometry layout all day long, but if the CPU doesn’t have time to submit your buffers because it’s doing OC, then nothing is gained (only time is lost). Maybe you should look into “bounding volume hierarchies” and “spatial data structures”?

Nicolai,

definitely. :smiley: I am pretty sure about that number. For lack of a working profiler I have added some simple time measuring code, and the result is plausible, as the OC’ing does a lot of vertex transformation and projection in software.

Yooyo,

afaik you must specify the TMU to which a color or tex coord buffer is bound, or how else would you properly do multitexturing with vertex arrays or buffers? I know for sure that I can assign different tex coords to TMU0 and TMU1 this way, because I am doing exactly that. Color probably not; that wouldn’t make sense anyway. Thanks for the hint, I’ll try to leave out the color calls for any TMU other than TMU0.
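
I.e. what I am doing is roughly this (simplified, buffer name and offsets made up; BUFFER_OFFSET is the usual macro from the VBO spec), and I will drop the per-TMU color calls:

#define BUFFER_OFFSET(i) ((char *) NULL + (i))

glBindBuffer(GL_ARRAY_BUFFER, faceVBO);            // hypothetical buffer with the face data

glEnableClientState(GL_VERTEX_ARRAY);              // positions and colors: once, not per TMU
glVertexPointer(3, GL_FLOAT, 0, BUFFER_OFFSET(0));
glEnableClientState(GL_COLOR_ARRAY);
glColorPointer(4, GL_UNSIGNED_BYTE, 0, BUFFER_OFFSET(colorOffset));

glClientActiveTexture(GL_TEXTURE0);                // base texture coords
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glTexCoordPointer(2, GL_FLOAT, 0, BUFFER_OFFSET(texCoordOffset0));

glClientActiveTexture(GL_TEXTURE1);                // second set of coords, e.g. for the lightmap
glEnableClientState(GL_TEXTURE_COORD_ARRAY);
glTexCoordPointer(2, GL_FLOAT, 0, BUFFER_OFFSET(texCoordOffset1));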

void,

I am doing per-pixel lighting via shaders, but there are pretty hard limits on what I can pass. I have been getting a lot of GLSL linker errors or even just rendering flaws when exceeding them. So I am passing the light sources via the built-in HW lights. I had thought about putting light source data in a texture, but I failed to implement it. There’s too much that is unclear about it. You don’t want that texture to be interpolated, for example, so you need an orthographic projection for it, which would collide with rendering the other textures (I had once been playing around with GPGPU stuff).

Just specify no filtering/no mipmaps for it. Orthographic projection has nothing to do with textures here…
Then you can sample it by choosing texcoords tied to the light’s index.
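
E.g. a sketch (lightDataTex is a placeholder for your texture object):

glBindTexture(GL_TEXTURE_2D, lightDataTex);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);   // no mipmaps, no filtering
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);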

I will definitely try that. It would be awesome to be able to get around that 8 HW lights limitation. Thanks.

I still have a question though: If I address elements of a light data texture using texture coords, then these tex coords are float. I need to compute them properly, don’t I? I.e. light #9 (counting from one) would be at


uniform sampler2D lightTex;
vec4 lightPos = texture2D(lightTex, vec2(1.0 / 8.0, 1.0 / 8.0));

for an 8x8 texture. Would that work?

No…

In the case of the fixed function pipeline you have to specify vertex colors, normals and positions once, and texcoords for each TMU that you want to use.

In the case of shaders (programmable pipeline) you can fetch texels from a texture using any coordinates known in the shader. This means you can sample at the passed-in texcoords, or use the result of some calculation in the shader as the coordinates for the texture fetch.
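
For example, something along these lines (untested sketch; the 8x8 layout, the names and the idea of one light position per texel are just assumptions):

uniform sampler2D lightTex;     // 8x8 light-data texture, GL_NEAREST filtering

vec3 fetchLightPos(float lightIndex)
{
    // column = index mod 8, row = index / 8; the +0.5 hits the texel center
    float col = mod(lightIndex, 8.0);
    float row = floor(lightIndex / 8.0);
    return texture2D(lightTex, vec2((col + 0.5) / 8.0, (row + 0.5) / 8.0)).xyz;
}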

Because you are doing lighting, split your rendering into several passes:

  1. Enable depth write & depth test. Render the visible geometry using the static lightmaps.
  2. Disable depth write. Enable additive blending. Render the light contributions. Sort your faces by lights and render only the faces that are affected by some light. You can optimize this pass using shaders with multiple lights.
  3. Switch blending to multiply. Render the visible geometry with its textures.

After the first pass you should see the lightmaps only. After the second pass you should see lightmaps + lighting result. After the 3rd pass you should see an almost complete frame…
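
A rough skeleton of the GL state for those passes (the draw* functions are placeholders for your own rendering code; note the depth func must allow equal depth values, otherwise passes 2 and 3 would fail the depth test against the geometry laid down in pass 1):

// Pass 1: fill the depth buffer and lay down the static lightmaps
glEnable(GL_DEPTH_TEST);
glDepthMask(GL_TRUE);
glDepthFunc(GL_LEQUAL);
glDisable(GL_BLEND);
drawVisibleGeometryWithLightmaps();      // placeholder

// Pass 2: add the dynamic light contributions
glDepthMask(GL_FALSE);                   // keep the depth from pass 1
glEnable(GL_BLEND);
glBlendFunc(GL_ONE, GL_ONE);             // additive
drawFacesAffectedByLights();             // placeholder, faces sorted by light

// Pass 3: modulate the accumulated lighting with the diffuse textures
glBlendFunc(GL_DST_COLOR, GL_ZERO);      // multiplicative
drawVisibleGeometryWithTextures();       // placeholder

glDepthMask(GL_TRUE);
glDisable(GL_BLEND);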

karx11erx: light #8 (counting from 0) in an 8x8 texture would be at (0, 1/8) (the first texel in the second row). Also, there are rectangle (NPOT, non-power-of-two) textures, where you can specify unnormalized texcoords in the range [0; n-1] (see the spec for details).
And the texture should have more than 8 bits per component if you are going to interpret its components as coordinates in space (or derive a coordinate from multiple components).

void,

I would use a texture with float components. Regarding addressing the light texture elements: of course you are right, I was just typing too fast. I remember that NPOT stuff, but does it work on older hardware? Many Descent fans do not have the latest and greatest (or even second-latest) hardware.

yooyo,

which blend mode is multiplicative (I know that’s an absolute noob question…)? Or are you talking about handling that in a shader (where I of course would know how to multiply texture and light values)?

karx11erx: see http://delphi3d.net/hardware/allexts.php for hardware info. Using NPOT is not strictly necessary, but GL_ARB_texture_float has some hardware requirements, too.
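
For example, uploading an 8x8 float texture could look roughly like this (assuming GL_ARB_texture_float is available; lightDataTex and lightData are placeholders):

glBindTexture(GL_TEXTURE_2D, lightDataTex);
// GL_RGBA32F_ARB comes from GL_ARB_texture_float; lightData points to 8*8*4 floats
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F_ARB, 8, 8, 0, GL_RGBA, GL_FLOAT, lightData);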

Additive blending:
glBlendFunc(GL_ONE, GL_ONE);
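// i.e. result = incoming fragment + framebuffer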

Color multiply (modulate):
glBlendFunc(GL_DST_COLOR, GL_ZERO);
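// i.e. result = incoming fragment * framebuffer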