Hand tuning NV Fragment Program from Cg


I am hand-tuning a Cg fragment program
to reduce the number of instructions
generated. The problem is that the frame rate
drops (121 fps -> 115 fps) even though the
instruction count of the fragment program went
down from 128 to 108. Any idea why?

FYI, the texture accesses are the same. Only
the math part changes. I grouped two scalar
calls to atan2 into a single atan2 on a
float2, plus a few other math changes.
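
To illustrate the atan2 change, here is a minimal sketch rather than my actual shader. The function and parameter names are made up, and it assumes the Cg standard library's component-wise atan2 overload for float2.

```cg
// Hypothetical helpers, not the real shader code.

// Before: two separate scalar atan2 calls.
float2 anglesScalar(float2 a, float2 b)
{
    return float2(atan2(a.y, a.x), atan2(b.y, b.x));
}

// After: one atan2 call on float2 operands, evaluated per component.
float2 anglesVector(float2 a, float2 b)
{
    return atan2(float2(a.y, b.y), float2(a.x, b.x));
}
```

Both versions should return the same two angles; only the generated instruction count differs.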

Thanks in advance.


Could you post the Cg and ASM shaders?


Sorry, but I appreciate your help.
It’s fixed.
After I rewrote some of the conditional code that
follows the two atan2’s, the problem went away and
performance went up.


Remember that there’s quite a lot of optimization that goes on inside the driver. Reducing the number of instructions won’t necessarily improve performance. These days, the assembly generated by the compiler bears only a slight resemblance to what’s actually executed by the hardware.

But sometimes there is a problem, such as a texture lookup with comparison (shadow) at half precision, which isn’t supported in Cg (possibly a bug) but is well supported in assembly. It’s a funny situation when I have to use ASM instead of GLSL (or Cg) :)