Register combiners vs. multi-pass performance

I’ve got a Radeon 9800 Pro, and I’m experiencing some weird performance issues. I have a terrain where I’m blending 2 textures using an alpha map. I first blend them by drawing the terrain once with the first texture, then drawing the entire terrain again with the 2nd texture, this time with GL_BLEND enabled and the alpha values taken from the alpha map. Doing it this way, with about 12,000+ triangles, I get about 300 fps.

I then switch to a single pass and use combiners (the GL_COMBINE_ARB texture environment mode) to do the same thing. But instead of going ‘faster’ like it’s supposed to, I get a lower frame rate, about 208 fps!

Isn’t this wrong? Shouldn’t it go faster with combiners? I’ll post the initialization code if anybody has any ideas.

I’ve just run the program on a GeForce 4 Ti 4400, and there it’s the 2-pass version that is slower, by about 20 fps: it gets 130 fps with combiners and 110 fps with 2 passes.
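Since fps is not a linear scale, it can help to compare the two cards in milliseconds per frame instead. The sketch below only does that conversion on the figures quoted in this thread; the function name is made up for illustration. It suggests that the 92 fps gap on the Radeon and the 20 fps gap on the GeForce are both roughly 1.4 to 1.5 ms per frame, just in opposite directions.

```python
def frame_time_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps

# Figures quoted in this thread.
# Radeon 9800 Pro: 300 fps with two passes, 208 fps with combiners.
# GeForce 4 Ti 4400: 110 fps with two passes, 130 fps with combiners.
radeon_delta = frame_time_ms(208.0) - frame_time_ms(300.0)
geforce_delta = frame_time_ms(110.0) - frame_time_ms(130.0)

print(f"Radeon 9800 Pro: combiners cost {radeon_delta:.2f} ms more per frame")
print(f"GeForce 4 Ti 4400: two passes cost {geforce_delta:.2f} ms more per frame")
```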

Is this a driver issue with the Radeon? I have those beta 3.10 drivers somebody posted a while back that supported GLSLANG; maybe that has something to do with it.

Modern cards (especially high-end cards) have very efficient memory subsystems. This means that you can probably get blending “for free” if you’re limited by something else. Setting up three texture environment combiners so that they compute A*(1-B)+C*B may turn out to use more fragment shading cycles than your two separate passes. If neither blending nor transform is the bottleneck, multi-pass comes out ahead on this hardware.
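To make it concrete that the two approaches produce the same pixel, here is a small CPU-side sketch of the arithmetic for one color channel. It is not GL code, and the function names are hypothetical. It assumes the second pass blends with glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA), which the thread does not state, and that the single pass ends in a GL_INTERPOLATE_ARB style combine (Arg0*Arg2 + Arg1*(1-Arg2)).

```python
def two_pass(a, c, b):
    """Emulate multi-pass blending for one channel, values in [0, 1].

    a: first texture, c: second texture, b: alpha-map value.
    Assumes blending is src*alpha + dst*(1 - alpha).
    """
    framebuffer = a                                # pass 1: draw first texture
    framebuffer = c * b + framebuffer * (1.0 - b)  # pass 2: blend second texture
    return framebuffer

def single_pass(a, c, b):
    """Emulate the combiner result A*(1-B) + C*B for one channel."""
    return a * (1.0 - b) + c * b

# Same output either way; only the cost on the GPU differs.
for b in (0.0, 0.25, 0.5, 1.0):
    print(b, two_pass(0.2, 0.8, b), single_pass(0.2, 0.8, b))
```

So the choice between the two is purely a performance question: the single pass trades framebuffer read/write traffic for extra per-fragment combiner work.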

Now try it on a Radeon IGP or 7200 and see what the result is…

Well, I don’t have a Radeon 7200 or below, but I tested it on a GeForce 2 MX 400, and it got 40 fps. That’s with a 1 GHz AMD T-Bird, but if the blending is being done on the GPU, I guess that shouldn’t matter.