MSI GeForce FX 5200: major slowdown using fragment programs

I wonder if there is any shader program that runs faster on an FX-series card than on the Radeon 9500+ cards?

Sure. The ones written using NV_fragment_program, or Cg through NV_fp (which provides the basic types needed for performance).

Anything with lots of filtered shadow map lookups should be faster on the GeForce FX, since it has dedicated hardware for PCF (percentage-closer filtering). The Radeon has to spend extra cycles and texture fetches doing the same thing manually in the fragment program. Another candidate is anything that makes heavy use of the sincos or lit instructions. Those are native on the FX but emulated by expansion into several instructions on the Radeon, IIRC.
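To make the "doing the same thing manually" part concrete, here is a rough CPU-side sketch in Python of what a 2x2 PCF lookup involves: four depth fetches, four comparisons, and a bilinear blend of the comparison results. It is only an illustration of the general technique, not how either card implements it. The function name `pcf_2x2`, the list-of-lists shadow map and the clamp-to-edge addressing are all assumptions made for the example.

```python
import math

def pcf_2x2(shadow_map, u, v, ref_depth):
    """Manual 2x2 percentage-closer filtering (illustrative sketch).

    shadow_map: rows of depth values (hypothetical CPU-side stand-in
                for a depth texture)
    u, v:       normalized texture coordinates in [0, 1]
    ref_depth:  depth of the fragment as seen from the light
    Returns the lit fraction in [0, 1].
    """
    height = len(shadow_map)
    width = len(shadow_map[0])

    # Move to texel space; texel centres sit at integer + 0.5.
    x = u * width - 0.5
    y = v * height - 0.5
    x0 = math.floor(x)
    y0 = math.floor(y)
    fx = x - x0  # horizontal blend weight
    fy = y - y0  # vertical blend weight

    def visible(ix, iy):
        # Clamp-to-edge addressing (an assumption for this sketch).
        ix = min(max(ix, 0), width - 1)
        iy = min(max(iy, 0), height - 1)
        # One fetch plus one compare: 1.0 if lit, 0.0 if in shadow.
        return 1.0 if ref_depth <= shadow_map[iy][ix] else 0.0

    # Compare first, then filter the four results bilinearly.
    top = visible(x0, y0) * (1.0 - fx) + visible(x0 + 1, y0) * fx
    bottom = visible(x0, y0 + 1) * (1.0 - fx) + visible(x0 + 1, y0 + 1) * fx
    return top * (1.0 - fy) + bottom * fy
```

In a fragment program without hardware PCF, each `visible` call costs a texture fetch and a compare, and the blend costs a few more ALU instructions. With dedicated PCF hardware, the whole thing collapses into a single filtered shadow lookup.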

While true, that isn’t the question at hand here. The question is why the 5200 seems slower than it ought to be, i.e. doing things that should speed up the card does not help. My best guess is that the 5200 lacks some of the hardware that the higher-end cards rely on to boost their performance.

Well, he did ask about specific cases where the GeForce FX would beat the 9700. Anyway, there’s a nice article on the FX’s architecture here: http://www.3dcenter.de/artikel/cinefx/
It’s not official word from NVIDIA by any means, but it does seem to be consistent with what NVIDIA have said and what people have measured. In short, to get the most out of the FX, use few registers, lots of textures, and utilise the combiners at the end.