Power: NV_fragment_program, DX9, ARB_fragment_program

Correct me if I’m wrong. If I were to order the “conditional power” of various fragment shading instruction sets, they’d come out (from best to worst) as:

NV_fragment_program > DX9 PS2.0 > ARB_fragment_program

I base this on the nVIDIA extension having extensive predication implemented through write-masking based on a set of condition codes, PS 2.0 having a very simple version of this, and ARB not having any predication or conditional write-masking.

While you can implement these things using SGE and multiplication, it seems like a piece of hardware that can “do more” must have a very good optimizer to actually recognize what’s going on, and even so, it would be hit or miss whether the instruction sequence could be collapsed to a single conditional write mask.

Is this a correct interpretation, or have I over- or under-estimated the power of any of these instruction sets?
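To make the SGE-and-multiply idea above concrete, here is a minimal sketch in Python that emulates 4-component registers as lists. The helper names (`sge`, `select_with_mul`, `predicated_write`) are hypothetical and only illustrate the arithmetic; this is not real shader code, and the instruction sequence in the comment is only a rough guess at how it might be spelled.

```python
# Hypothetical emulation of 4-component fragment registers as Python lists.

def sge(a, b):
    """Component-wise set-on-greater-or-equal: 1.0 where a >= b, else 0.0."""
    return [1.0 if x >= y else 0.0 for x, y in zip(a, b)]

def select_with_mul(mask, if_true, if_false):
    """Emulated conditional: result = mask * if_true + (1 - mask) * if_false.

    On hardware this would take several arithmetic instructions
    (roughly a MUL, a SUB and a MAD) instead of one masked write.
    """
    return [m * t + (1.0 - m) * f for m, t, f in zip(mask, if_true, if_false)]

def predicated_write(dest, cond, value):
    """What a conditional write mask does: components whose condition
    fails simply keep their old contents."""
    return [v if c else d for d, c, v in zip(dest, cond, value)]

a = [0.5, 0.2, 1.0, 0.0]
b = [0.3, 0.3, 1.0, 0.1]
old = [0.25, 0.25, 0.25, 0.25]   # current contents of the destination
new = [0.75, 0.75, 0.75, 0.75]   # value we want to write where a >= b

mask = sge(a, b)
emulated = select_with_mul(mask, new, old)
predicated = predicated_write(old, [x >= y for x, y in zip(a, b)], new)
```

Both paths should leave the same values in the destination; the difference is that the emulated one spends extra instructions and a temporary, which is what an optimizer would have to recognize and collapse.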

I wouldn’t know enough to order the power of ARB_fragment_program vs. DX9, but in general that ordering probably holds (and for more than conditionals).

  • Matt

Sounds about right to me.


Well, here’s hoping there will be some commonly implemented extensions that add commonly occurring functionality (including conditionals) to the spec.

ARB_fragment_program_conditional_assignment perhaps :)

Evan Hart from ATI said the ARB was working on ARB_vertex_program2 for flow control. I’m sure they are working on something similar for fragment programs. Since the 9700 supports DX9 pixel shaders, if those are more powerful than what’s currently available in GL, it seems logical that ATI wants to expose this.
Maybe other companies have hardware that can support the ARB_fp extension, but only without the things you mention. Is that perhaps the reason it wasn’t included?


The current crop of cards (GF3, Radeon 8500) can probably support large chunks of the current fragment program spec. Note the queries for dependency depth, instruction counts, etc. I guess that’s the good news (assuming they actually want to spend the considerable effort necessary to make this usable). That said, most DX8-level cards would probably have to fall back to software for some instructions, such as the SCALARops.

It was not a design goal of ARB_fragment_program to be compatible with “DX8” hardware.

  • Matt

I haven’t read the spec thoroughly just yet, but it doesn’t seem possible to have multiple render targets. Is that correct?
Is that for a future extension too, or is it already exposed?

Possible future extension.