Future of ARB_fragment_program?

Originally posted by Zengar:
Doom3 uses a special, complex specular function, hence the lookup texture. Replacing it with a common specular function (which was the idea of Humus :slight_smile: - yes, if someone doesn’t know it, Humus works for ATI :smiley: ) is not the correct way - as stated by Carmack himself.
Actually, my first implementation was incorrect, since it wasn’t a pow() function as I had initially thought. Later revisions were fully equivalent though (and a good deal faster). He used saturate(4 * x - 3)^2, which could be implemented on Radeon/GF2-level hardware as well.
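Expressed as GLSL just for illustration - the helper name is made up, and I’m assuming x stands for the usual saturated N·H term, which isn’t stated above:

[code]
// Sketch only: the name doom3Spec is invented, and the assumption that
// x = dot(N, H) is mine, not taken from the actual Doom3 shader.
float doom3Spec(vec3 N, vec3 H)
{
    float x = clamp(dot(N, H), 0.0, 1.0);      // saturate(N.H)
    float t = clamp(4.0 * x - 3.0, 0.0, 1.0);  // saturate(4*x - 3)
    return t * t;                              // ...squared
}
[/code]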

Originally posted by martinho_:
It’s just a matter of time before assembly languages disappear. Compilers can create portable and faster code than humans, optimized for present and future hardware. If they don’t do so right now, they will in the near future (near enough for projects that start today).
Out of curiosity, when you say that it’s just a matter of time before assembly languages disappear, do you mean from OpenGL or from the CPU world as well?

Originally posted by valoh:
btw: Does any driver already support invariant uniform calculation optimizations? This would be an important optimization, but last time I checked it wasn’t supported by ATI/nvidia.
Are you asking for something like the preshader in DX?

Originally posted by cass:
Out of curiosity, when you say that it’s just a matter of time before assembly languages disappear, do you mean from OpenGL or from the CPU world as well?
It’s not gone already? :wink:
Even on the CPU, you’re probably better off using C++ code most of the time. Exceptions are 3DNow/SSE and the like. But even then, using intrinsics will probably work better, be quicker to write and debug, and allow the compiler to optimize better across function calls and blocks, etc.

Kronos, you just use Cg syntax in GLSL.
The GL_EXT_Cg_shader spec, if it exists, should mention this.
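For example (hedged, since I haven’t seen the spec either), Cg-style declarations like the ones below are reportedly accepted by NVIDIA’s GLSL compiler, while other vendors reject them as non-standard:

[code]
// Illustration of Cg syntax inside an otherwise ordinary GLSL shader.
// Whether a given driver accepts it is an assumption, not a guarantee.
uniform float4 tint;         // Cg spelling of GLSL's vec4

void main()
{
    half4 c = half4(tint);   // Cg half-precision type, not standard GLSL
    gl_FragColor = vec4(c);
}
[/code]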

Originally posted by simongreen:
From a stability and robustness perspective, low-level languages are easier to get implemented quickly and correctly
Aren’t most of the problems related to optimization?
The original expression is too complex, so you try to reorder it to make it fit the GPU’s resources and you end up ****ing everything up?

Originally posted by Humus:
Are you asking for something like the preshader in DX?
Exactly.

I haven’t used DirectX, but I think this behaviour is called preshaders and is handled by the FX framework. For GLSL this should have been specified, with a query to check whether it is supported and a flag to enable/disable it. IMO it’s a very important feature for a high-level application/shader interface.
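Something like the following (all names made up), where the marked expressions depend only on uniforms and could be folded out to the CPU once per draw call:

[code]
// Sketch with invented uniform names, just to show the idea.
uniform mat4  viewMatrix;
uniform mat4  projMatrix;
uniform float time;

void main()
{
    // Both right-hand sides depend only on uniforms, so a "preshader"
    // could evaluate them once per draw call on the CPU instead of
    // once per vertex on the GPU.
    mat4  viewProj = projMatrix * viewMatrix;
    float pulse    = 0.5 + 0.5 * sin(time);

    gl_Position   = viewProj * gl_Vertex;
    gl_FrontColor = gl_Color * pulse;
}
[/code]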

btw: as you are working with ATI/GLSL every day, when can we expect stable GLSL/pbuffer support? These two bugs still haven’t been fixed…

Originally posted by Humus:
[quote]Originally posted by cass:
Out of curiosity, when you say that it’s just a matter of time before assembly languages disappear, do you mean from OpenGL or from the CPU world as well?
It’s not gone already? :wink:
Even on the CPU, you’re probably better off using C++ code most of the time. Exceptions are 3DNow/SSE and the like. But even then, using intrinsics will probably work better, be quicker to write and debug, and allow the compiler to optimize better across function calls and blocks, etc.
[/QUOTE]I assume that no significant number of programmers use CPU assembly anymore…

Again, it’s not a question of what you should write your code in, it’s a question of whether an assembly-level target is useful.

Tools like compilers generate assembly. Having a common ISA target allows for different languages to link together into a single executable. It allows programs to generate fast machine code on-the-fly.

Could we really get by just expressing everything in high-level languages? I don’t see how. As shaders get more complex (and begin to look more like CPU ISAs) it seems natural that the tool chain will want to go the way that it has on the CPU side.

Will the GPGPU folks want to have their shader code look like GLSL, or will they want to tailor it more toward their computational goals?

Will we have two separate HLSLs?

In a lot of ways “less is more”. With low-level access, you can provide an abstraction suitable to your goals. Software developers are free to extend the language constructs however they choose.

Of course HLSLs are the future of shading. Whether they’re based on an underlying assumption of an ISA is the question. ISA as a foundation for compiler tools and programmability is definitely the “tried and true” path.

I don’t see what these bugs have to do with optimal code. GCC refuses to compile some valid C++ files occasionally. Doesn’t mean it’s not producing fast code.
I guess ISVs are funny about that kind of thing. They’ve got this crazy notion that, before you start optimizing something, it should probably work first. I guess IHVs like ATi have different notions about whether something is useful.

I, for one, don’t agree that making something fast and broken is good.

Tools like compilers generate assembly. Having a common ISA target allows for different languages to link together into a single executable. It allows programs to generate fast machine code on-the-fly.
Admittedly true, but then again, you’re not 3DLabs with their weird vertex and fragment architecture.

We don’t share ISAs across PowerPCs and x86 chips. We don’t ask a DEC Alpha to natively run x86 code. But shader hardware can be just as varied; as such, creating an ISA that provides for easy optimizations (which none of our current ones do) is not easy. Just look at the compiled code that comes out of your Cg compiler. It’s fine for an nVidia card, but it does things that an ATi fragment program doesn’t need to. High-level constructs that a (theoretically functional) ATi compiler could have used to generate more optimal code have been lost. As such, passing the results of a compilation around is of no great value.
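To give one simplified illustration of the kind of intent that gets flattened away (the exact instruction sequence varies by compiler; this is a sketch, not real compiler output):

[code]
uniform vec3 v;  // made-up input, purely for illustration

void main()
{
    // High-level form: the compiler still knows this is a single normalize.
    vec3 n = normalize(v);

    // A typical ARB_fp expansion is roughly:
    //   DP3 tmp.w, v, v;
    //   RSQ tmp.w, tmp.w;
    //   MUL n.xyz, v, tmp.w;
    // Once it has been flattened like that, a driver for hardware with a
    // native normalize (or different precision/register trade-offs) has to
    // pattern-match the sequence back into the original intent.

    gl_FragColor = vec4(n, 1.0);
}
[/code]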

That’s not to say that I don’t ultimately agree with you. The problem is that even ARB_vp/fp are simply too low level to create good optimizations from. Perhaps there’s a happy medium between ARB_vp/fp and glslang, but nobody’s currently working on it, so it won’t get developed.

Will the GPGPU folks want to have their shader code look like GLSL, or will they want to tailor it more toward their computational goals?

On a personal note, OpenGL and the ARB should spend absolutely no time finding solutions for the GPGPU people. If they want to hack their graphics cards into CPUs, fine, but they shouldn’t expect a graphics library to help them at all.

Originally posted by Korval:
On a personal note, OpenGL and the ARB should spend absolutely no time finding solutions for the GPGPU people. If they want to hack their graphics cards into CPUs, fine, but they shouldn’t expect a graphics library to help them at all.
:mad: Thanks a lot :stuck_out_tongue: Personally I don’t see a need for anything special for GPGPU. It’s slowly coming together anyway. GLSL is fine for it - I don’t use GPU asm personally. FBO will be fine when it arrives. Higher floating-point precision would be nice, but I won’t hold my breath for that - I’d expect it in maybe 5-10 years. The main thing that’ll allow GPGPU to really come into its own is scattering. Will that arrive with DX10/OpenGL x.x? Dunno, but I wouldn’t be surprised either way. If scattering is seen to be useful for graphics, then maybe. It’d certainly be a much appreciated OpenGL extension (hint, hint Cass and Simon :wink: ). Other useful things - tools for OpenGL. Maybe a free shader debugger like Visual Studio’s HLSL debugger (but better - it’s a bit cumbersome for my liking).
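For anyone unfamiliar with the term, here’s a rough GLSL-flavoured sketch of the gather/scatter distinction (all names made up):

[code]
uniform sampler2D data;  // hypothetical input data packed into a texture
varying vec2 coord;      // this fragment's own position in the output

void main()
{
    // Gather: a fragment program may READ from any locations it likes.
    vec4 a = texture2D(data, coord);
    vec4 b = texture2D(data, coord + vec2(0.1, 0.0));

    // ...but it may only WRITE to its own output position. "Scatter"
    // would mean computing the destination address as well, which
    // fragment programs of this generation cannot do.
    gl_FragColor = a + b;
}
[/code]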

ffish: How would a shader debugger work? Would you run in some software mode (Mesa?) with a custom extension to access variables as a shader is running?

The closest I can come with GLIntercept is to allow you to edit and re-compile the shaders back into the program at runtime (i.e. so you can move a variable into the output color to “see” its value).
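Something like this, where the in-place edit routes an intermediate value to the output (the variable names are made up; only the trick itself comes from GLIntercept’s runtime re-compile):

[code]
varying vec3 lighting;   // hypothetical intermediate you want to inspect
uniform vec4 baseColor;

void main()
{
    // Original line:
    // gl_FragColor = vec4(lighting, 1.0) * baseColor;

    // Temporary debugging edit, re-compiled at runtime, so the
    // intermediate value can be "seen" on screen:
    gl_FragColor = vec4(lighting, 1.0);
}
[/code]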

Yeah, I guess. The D3D version you run with the Microsoft reference (software) rasterizer. I’ve only used it with C# and there are a number of pitfalls, but it mostly works. Maybe the unmanaged C++ shader debugger works better. Unfortunately I can’t see why anyone would want to create a tool like the HLSL version for free, but it’d be nice if they would.

Since I’m doing GPGPU stuff, it’s kind of hard to see what changing values does at runtime. My textures are just a bunch of numbers and sometimes it’s hard to see what’s happening. So your GLIntercept solution wouldn’t work for me.

Correct me if I’m wrong, but I think what Cass is talking about is a Java-like bytecode representation. That is, an almost 1-to-1 mapping of GLSL to “bytecodes”, with no optimizations whatsoever (like D3D does) and probably higher level than ARB_vp/fp. The result could then be (greatly optimized and) executed on any architecture. Additionally, more languages (non-GLSL) could be created, as long as they could be compiled to these “bytecodes”.

Hi spasi,

Yes, that’s more or less what I mean. From a tool chain perspective, there’s a target for compiling and linking. Then the driver performs a JIT phase of mapping the generic ISA onto the native architecture.

The generic ISA provides a reasonable hardware abstraction (but does not dictate actual underlying microarchitecture), and software developers can develop any sort of language tools they like on top of it.

This seems to be the way MS has gone, so any IHV hoping to sell D3D parts will already have to support this generic ISA.

Thanks -
Cass

I fully agree with Mr. Everitt on this point.
We need a low-level shading interface to make it possible to provide different HL shading languages and toolkits.
But I still believe that an assembly-like language is not sufficient for such a goal, as you still need too many compilation steps to get the code to a native representation. As a matter of fact, I am working on a virtual machine (just for fun), something like Java or .NET but with an abstract intermediate language. This language merely describes state transformation chains. I have found it to be much more effective with respect to further optimisation than the usual assembly representation. Another advantage of such a system is that it is easy to extend. Something like this could be used for shading too.
Just my 2 cents :smiley: I realise that no company will suddenly develop a new method because I said so :wink:

Cass,

I can see the advantages of such an architecture, but have a few questions:

  1. Has this been discussed with other ARB members? Not in a “let’s extend the low-level language” manner, but given the mentioned advantages.

  2. Are ARB_vp/fp (and whatever D3D uses) sufficiently abstract, or would something new be necessary? Also, I would expect this to work completely internally, as something developers have no access to (no temptation to write low-level code, just as a Java/.NET developer never writes bytecode).

  3. Is NVIDIA willing, or does it have plans, to pursue such an architecture for OpenGL? Not necessarily implement it independently, but at least try to prove its usefulness.

Originally posted by sqrt[-1]:
ffish: How would a shader debugger work? would you run in some software mode(mesa?) with a custom extension to access variables as a shader is running?
Yes, you have to run it in software, and no, you don’t need a custom extension. D3D does it with its reference rasterizer.

Unfortunately, this is only good for debugging your shader.

Dumping the assembly to a file with the hw driver would be useful.

With ATI (3DLabs too), it’s a black box.
NV simply has the right solution.

Originally posted by spasi:
Cass,

I can see the advantages of such an architecture, but have a few questions:

  1. Has this been discussed with other ARB members? Not in a “let’s extend the low-level language” manner, but given the mentioned advantages.

There have been discussions, they just don’t seem to go anywhere. I don’t think enough ARB members are interested to have critical mass (yet).

  2. Are ARB_vp/fp (and whatever D3D uses) sufficiently abstract, or would something new be necessary? Also, I would expect this to work completely internally, as something developers have no access to (no temptation to write low-level code, just as a Java/.NET developer never writes bytecode).

No. Had we gone down the path of extending the ASM, it would have eventually been general enough, I think. NVIDIA will continue to generalize this path with vendor extensions. For external tools to take root, there needs to be multi-vendor support for these general paths though. There’s no knowing when (or even if) that will happen.

  3. Is NVIDIA willing, or does it have plans, to pursue such an architecture for OpenGL? Not necessarily implement it independently, but at least try to prove its usefulness.

If other IHVs and ISVs were interested, I’m sure NVIDIA would be involved. But there’s no groundswell of interest/support for this direction today.

The opinions I have expressed on this thread are about the way I think things are likely to go. I’m not trying to advocate action. After all, I could be wrong. :slight_smile:

Thanks -
Cass

Originally posted by valoh:
[b]Exactly.

I haven’t used DirectX, but I think this behaviour is called preshaders and is handled by the FX framework. For GLSL this should have been specified, with a query to check whether it is supported and a flag to enable/disable it. IMO it’s a very important feature for a high-level application/shader interface.[/b]
Yeah, it’s probably a useful thing going forward. I think what we need is not an assembly language (they break down occasionally as well), but rather more control for the application in terms of optimization flags, enabling/disabling preshaders and so on. Simply disabling optimizations should hopefully let many broken shaders run (although slowly) until the bug is fixed.
If we’re talking about an intermediate language, I might be in, if and only if it doesn’t destroy the semantics of the original shader. But then again, we’re not really saving a whole lot more than the parsing.

Originally posted by valoh:
btw: as you are working with ATI/GLSL every day, when can we expect stable GLSL/pbuffer support? These two bugs still haven’t been fixed…
I’m working with, but not on, ATI’s GLSL (I’m no driver writer), so I don’t know when certain bugs will get fixed. But the bug you mention was just recently fixed; I don’t know when it will make it into a public driver though.

Originally posted by Korval:
I guess ISVs are funny about that kind of thing. They’ve got this crazy notion that, before you start optimizing something, it should probably work first. I guess IHVs like ATi have different notions about whether something is useful.
On the other hand, the ISV people don’t risk losing millions in sales because their bar was a few pixels shorter than the competition’s on a chart on Tom’s Hardware Guide.