Future of ARB_fragment_program?

It is possible to use all the new ps30 features of the GeForce 6 with OPTION NV_fragment_program2 and ARB_fp. Is a vendor-independent version of this extension (or something similar) planned, or is GLSL the only solution?

Other than Nvidia, the vendors that I have talked to aren’t very enthusiastic about keeping the “assembly language” interfaces alive.

This came up at the last ARB meeting, but none of the vendors other than NVIDIA seemed interested in new assembly language programmability extensions (i.e. ARB_fragment_program2).

I personally think this is a shame. It’s true that the assembly doesn’t bear much resemblance to what is actually executed by the hardware these days, but I still think it has value as an intermediate representation.

You seem to like writing optimising compilers, Simon :smiley: Oh ok, I forgot, you already have it for NV_fragment_program2 :slight_smile:

Why do you need two different things to do the same work? Agreed, an intermediate language would be very handy (for example, if one wants to write his own shading language), but it shouldn’t be an assembly language but rather some sort of compiler-friendly language with a fairly universal grammar. I can understand your concern, as this brings a lot of problems to Cg. But I don’t see how you can convince other vendors to extend GPU assembly.

As long as MS continues to define extended ASM for vertex and fragment, it will be valuable to support the equivalent in OpenGL, IMO.

Frankly, I don’t see the point of an assembly language. Today’s drivers have built-in optimisers that typically do as good a job as hand-writing the code in assembly would.

Frankly, I don’t see the point of an assembly language. Today’s drivers have built-in optimisers that typically do as good a job as hand-writing the code in assembly would.
Have you tried to use ATi’s drivers recently via glslang? They’ve got compiler bugs in there that are still, to this day, limiting what developers can do. The hardware isn’t getting in the way; the driver is.

From this crash bug to oddball bugs to miscomputing the dependency chain, coupled with driver bugs like these, how can you expect us to believe that your driver produces optimal code? Even today, many months after the first glslang implementations started showing up, I bet we can beat the ATi glslang compiler with ARB_fp most of the time. And with fewer bugs too.

I’m not a big fan of having 10 ways to do the same thing, but I prefer that to having one sub-optimal path and no way to bypass it.

I bet we can beat the ATi glslang compiler with ARB_fp most of the time. And with fewer bugs too
Ah, but then it’s hardcoded in and, as such, isn’t as likely to benefit from future graphics driver upgrades.
A good example is Quake 3 (roughly the details): there was a function in it, fast_sqrt() or something, written in asm; unfortunately, a few months later it was actually slower than sqrt(), because, as it was asm, it ‘had to be obeyed’.

found this
http://www.icarusindie.com/DoItYourSelf/rtsr/ffi/ffi.sqrt.php
Actually, now I think about it, another analogy would be Doom 3 and the specular (or something, I forget), where a texture was used as a ‘hack’ to look up the specular; ATI changed the shaders to use their interpretation of the actual function and it was quicker (and probably more accurate).

I use fragment programs about equally between 3D-renderer-type applications and GPGPU applications like image processing and filtering.

I have to say that some high-level languages really do a crappy job of optimizing GPGPU code. In some cases involving branching, or longer code using a lot of variables (e.g. certain color conversions), I’ve produced hand-optimized assembly that was 3 to 4 times shorter than the assembly being produced by the compiler.
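
As an illustration of the kind of kernel in question (hypothetical code, not my actual shader), here is a simple full-range BT.601 RGB-to-YCbCr conversion in GLSL; written as one matrix multiply plus an offset it should map to just a few dot-product instructions, whereas a naive per-channel version gives a weak compiler much more room to produce bloated code:

  uniform sampler2D image;

  void main()
  {
      vec3 rgb = texture2D(image, gl_TexCoord[0].xy).rgb;

      // Full-range BT.601 RGB -> YCbCr, expressed as a single mat3 multiply
      // (columns listed in GLSL's column-major order) plus an offset.
      mat3 toYCbCr = mat3( 0.299, -0.169,  0.500,
                           0.587, -0.331, -0.419,
                           0.114,  0.500, -0.081);
      vec3 ycbcr = toYCbCr * rgb + vec3(0.0, 0.5, 0.5);

      gl_FragColor = vec4(ycbcr, 1.0);
  }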

For typical 3D applications, like games, where you’re mostly doing lighting and transform stuff, GLSL/Cg/HLSL is cool and definitely worth it, provided you don’t want to do frequent on-the-fly compilation (some high-level language shaders compile 100 to 1000 times slower than the corresponding ASM shaders!).

Also, getting the GLSL/Cg/HLSL compiler to produce fast code usually involves a few iterations of:

  1. look at the assembly for the GLSL/Cg/HLSL program

  2. add some hints using swizzles, masks, reorganizing the shader and/or variables, or packing variables differently, then go back to (1) (a rough sketch of step 2 follows below)
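
A rough, hypothetical GLSL sketch of what step 2 can look like in practice (not anyone’s actual shader): packing two independent 2D terms into one vec4 and using explicit swizzles so the compiler is more likely to turn the setup into a single vector MAD:

  // Hypothetical example: pack two texture-coordinate transforms into one
  // vec4 so a single vector MAD updates both, instead of two half-used ops.
  uniform vec4 scaleBias;   // .xy = scale, .zw = bias (assumed layout)
  uniform sampler2D tex0;
  uniform sampler2D tex1;
  varying vec4 uv01;        // .xy = first coord, .zw = second coord

  void main()
  {
      vec4 coords = uv01 * scaleBias.xyxy + scaleBias.zwzw;  // one MAD for both
      gl_FragColor = texture2D(tex0, coords.xy) * texture2D(tex1, coords.zw);
  }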

The high level compilers need to get a lot better before we’re ready to do away with ARB_fp.

EDIT - still waiting for rectangular texture sampling support in GLSL from ATI here… has this been implemented yet?
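
For reference, and assuming the GL_ARB_texture_rectangle interaction with GLSL (sampler2DRect with unnormalized, per-texel coordinates), this is roughly what’s being asked for; whether a given ATI driver accepts it is exactly the open question:

  // Assumes the driver exposes the ARB_texture_rectangle GLSL interaction.
  #extension GL_ARB_texture_rectangle : enable

  uniform sampler2DRect image;   // coordinates are in texels, not [0,1]

  void main()
  {
      // Simple two-tap horizontal box filter, handy for GPGPU-style passes.
      vec4 sum = texture2DRect(image, gl_TexCoord[0].xy)
               + texture2DRect(image, gl_TexCoord[0].xy + vec2(1.0, 0.0));
      gl_FragColor = 0.5 * sum;
  }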

Doom 3 used that specular function texture so that the cards that support fragment programs would look the same as the older cards. One of the big deals with Doom 3 was that id wanted it to look the same on all cards.

-SirKnight

Ah, but then it’s hardcoded in and, as such, isn’t as likely to benefit from future graphics driver upgrades.
A good example is Quake 3 (roughly the details): there was a function in it, fast_sqrt() or something, written in asm; unfortunately, a few months later it was actually slower than sqrt(), because, as it was asm, it ‘had to be obeyed’.
If your game doesn’t sell because it’s too slow, what happens in the future is irrelevant.

Plus, with better hardware, even the “less optimal” path will still be faster than it used to be. It may not be as fast as it could be, but, as I said, there’s no reason to have any faith in ATi’s driver development team.

EDIT - still waiting for rectangular texture sampling support in GLSL from ATI here… has this been implemented yet?
You want them to implement something? We should hope they get their bugs fixed before bothering to implement things… :wink:

It is possible to use all the new ps30 features of the GeForce 6 with OPTION NV_fragment_program2
That’s the idea, plus NV_fp2 has more instructions:
Pack and unpack.
Extended swizzle.
LIT.
Trig functions.
Reflect.
Various set functions.

The same should be exposed in GLSL a la Cg language.

Originally posted by V-man:

The same should be exposed in GLSL a la Cg language.

It is exposed. NVIDIA has a paper concerning it (I couldn’t find it). All of my fragment shaders use:

  #ifndef __GLSL_CG_DATA_TYPES
  # define half  float
  # define half2 vec2
  # define half3 vec3
  # define half4 vec4
  #endif
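
With those defines in place, shaders can carry Cg-style half-precision hints and still compile as plain floats on other implementations; a trivial made-up example:

  uniform sampler2D baseMap;

  void main()
  {
      // On NVIDIA's compiler these stay half precision; with the #defines
      // above they silently become float/vec4 everywhere else.
      half4 base  = half4(texture2D(baseMap, gl_TexCoord[0].xy));
      half  scale = half(0.5);
      gl_FragColor = vec4(base * scale);
  }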

Doom 3 uses a special, complex specular function, hence the lookup texture. Replacing it with a common specular function (which was the idea of Humus :slight_smile: - yes, if someone doesn’t know it, Humus works for ATI :smiley: ) is not the correct way, as stated by Carmack himself.

It’s just a matter of time before assembly languages disappear. Compilers can create portable and faster code than humans, optimized for present and future hardware. If they don’t do so right now, they will in the near future (near enough for projects that start today).

Originally posted by martinho_:
It’s just a matter of time before assembly languages disappear. Compilers can create portable and faster code than humans, optimized for present and future hardware. If they don’t do so right now, they will in the near future (near enough for projects that start today).
Well, the question is how long it will take to get there. How many years? As Korval already mentioned, up to now (after more than a year of driver support) the IHVs unfortunately are not even able to provide bug-free support.

Another funny thing is that, IMO, the high-levelness of GLSL is very limited. Several things which would be needed for good high-level use are not supported: attribute arrays and invariant uniform calculation optimization. Plus there are some annoying spec decisions, like the missing literal conversion and varying attributes.
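
Two of those complaints in concrete terms (hypothetical GLSL, just for illustration): the missing int-to-float literal conversion, and a uniform-only subexpression that an “invariant uniform calculation” optimization could hoist out to once per draw call:

  uniform mat4 projection;
  uniform mat4 modelView;
  attribute vec4 position;

  void main()
  {
      // Missing literal conversion: a strict compiler rejects 'float w = 1;'
      // because 1 is an int; you have to write 1.0.
      float w = 1.0;

      // projection * modelView depends only on uniforms, so it is identical
      // for every vertex and could be evaluated once per draw call by the
      // driver rather than per vertex.
      gl_Position = (projection * modelView) * (position * w);
  }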

Till now, GLSL has just provided me with lots of reasons to try an alternative 3D API next time or, if that does not fix the problems, to create my own high-level language and compile it to GLSL (or assembler, or whatever).

btw: Does any driver already support invariant uniform calculation optimizations? This would be an important optimization, but last time I checked it wasn’t supported by ATI/nvidia.

Doom 3 to look the same on all cards? Why didn’t they limit Quake 3 to 16-bit color then (as the Voodoo couldn’t do 32-bit color)?
Doom 3 uses a special, complex specular function, hence the lookup texture
That’s the official word, but with the relations between ATI and id I’m not 100% sure.

Personally (I have an aversion to asm, having used it in the early 80s when there was zero documentation (no internet), learning by trial and error; it was fun... not! Actually, at the start, for a year I didn’t even have a compiler, so code was just numbers. Arrrgh, whoops, I typed BA instead of B9.)
But speed is not the main issue; ease of use etc. are. How much faster is it to write a shader with GLSL vs asm? At least 10x, I’m guessing. In the end, time is what it all comes down to: my engine could be a lot better if I had an extra hour to spend on the material system, an extra hour on the particle system, etc., but that’s not going to happen if I’m farting around with asm.

There’s a distinction between whether you should use a high level language for shader development and whether there’s utility in having a low level language.

From a stability and robustness perspective, low-level languages are easier to get implemented quickly and correctly.

As people have pointed out, language choice isn’t an either/or proposition. With a low level language, you can troubleshoot and analyze what kind of code the high level compiler is generating.

At the end of the day, the compiler is a software tool, not a hardware abstraction. I think MS got this right in the D3D design. And for better or worse (depending on your inclination) there will be a common shader ISA for some time to come.

Time will tell which model will be most successful, of course. But how many software developers do you know who like supporting multiple C++ compilers, and all revisions of those compilers, over multiple years? And C++ has been “done” for a long time now. The shading languages will continue to evolve for the foreseeable future.

This is my personal view of the situation. I’m not “advocating” anything. I just believe that existing market forces will keep ARB_fp around for some time to come. And I believe those same forces will result in multi-vendor extensions to that ISA over time.

Time will tell for sure though.

[edit: fix crappy formatting]

Originally posted by Korval:
Have you tried to use ATi’s drivers recently via glslang?
Every day.

Originally posted by Korval:
They’ve got compiler bugs in there that are still, to this day, limiting what developers can do. The hardware isn’t getting in the way; the driver is.

From this crash bug to oddball bugs to miscomputing the dependency chain, coupled with driver bugs like these, how can you expect us to believe that your driver produces optimal code?
I don’t see what these bugs have to do with optimal code. GCC refuses to compile some valid C++ files occasionally. Doesn’t mean it’s not producing fast code.

Originally posted by Stephen_H:
I have to say that some high-level languages really do a crappy job of optimizing GPGPU code. In some cases involving branching, or longer code using a lot of variables (e.g. certain color conversions), I’ve produced hand-optimized assembly that was 3 to 4 times shorter than the assembly being produced by the compiler.
I hope you don’t take Cg as being representative of compiler quality.