ARB FP 1.1

Originally posted by Zengar:
@Humus: by the way, how did your German exam go?

I don’t know. The result isn’t ready yet. But I think it went well. It felt pretty good.

But ARB_fp/ARB_vp aren’t…

ehh,… sorry, so I was saying:
But ARB_fp/vp are not assembly languages, just assembly-like. If all GPUs had a standard instruction set (not sure this would necessarily be a good thing), then I would agree that exposing the GPU’s native instruction set and writing high-level language compilers that compile high-level code to machine code would be the way to go, no doubt about it. But since ARB_fp/vp is just an assembly-like, virtual-machine-like layer on top of the GPU’s native instruction set, building the compiler into the driver and bypassing the assembly-like layer actually makes more sense (I mean, it’s closer to how things work on the CPU). With ARB_fp/vp you have:
HLSL -> VM asm code -> native asm code
while by eliminating the assembly-like layer you would get:
HLSL -> native asm code
What’s the use of the intermediate code layer? You just don’t have access to the machine code layer, and that’s, in essence, what I’m trying to say: being able to write machine code via some assembler would be quite useful, but since instruction sets differ, it must be handled by the driver (for now). Keeping an ASM-like VM layer in between isn’t the same thing. Of course, I’m assuming that this VM layer (fp/vp) exposes no more functionality than the high-level language, which I’m not sure of, as I don’t know much about GLSL yet.
And of course there are other problems with the HLSL-compiler-in-the-driver approach, but I just wanted to say that the relationship between fp/vp and an HLSL isn’t the same as that between assembly and C, so keeping (and wasting effort maintaining) ARB_fp/vp wouldn’t make that much sense.
But then again I may be completely wrong…

I’ve heard they’re changing the way shadow maps work to something more logical, so you don’t have to do the comparison manually yourself.

How about adding a ps2.0-like per-instruction partial-precision hint?

As I already mentioned somewhere, I would prefer it if hardware exposed its native assembly, so that third-party users could write their own HLSL. I need to see GLslang in action to make up my mind, though.

Yay! Please fix the shadowmap mess so things work like in glslang, the way it should’ve been from the beginning.

I’d like branching in VP, and predication in FP; if I get those two things, the rest of the feature requests aren’t all that important. Although filtered shadow maps would be nice, especially if there’s sufficient ability to tell whether this costs me instructions or not.

I like the idea of having a high level API and a low level API - even if the low level API is a “virtual machine” in itself.

If ARB_VP and ARB_FP supported conditional branching and jumps (as some vendor-specific extensions already do), they would be no less powerful than GLSL. And maybe it would be easier for video card vendors to use an assembler-like language as a testbed for new capabilities than to implement everything in a more complex C-like language (which could be done later on, when everything works).

If one doesn’t want to use a C-style HLSL, it would seem more natural to me to write a compiler that generates “assembler-like” output than to have to create, for example, a Pascal-to-C cross-compiler.

So in my opinion ARB_FP and ARB_VP are a viable solution for the future and should be maintained in parallel with GLSL, maybe even integrated into the core specification, since they should be easier to maintain than GLSL. That said, a kind of inline assembler with native hardware support in GLSL would be the icing on the cake (code generation for different video cards, and a fallback to standard GLSL, could be steered with #ifdefs).

The problem with LLSLs is that they can never be optimal for all hardware, while an HLSL can, since the driver decides how best to map it onto the actual hardware.

C was originally designed as a sort of “portable assembler”. When you try to abstract out the hardware it’s pretty much what you end up with.

This is why all the various shading languages are C-like instead of Lisp-like or Java-like or what have you. The idea of something lower-level yet still portable doesn’t really work unless you want to underexpose the hardware. This is why there are many different languages at different levels “above” C, from C++ to Java to Lisp to scripting languages like Perl and Python, but no common language in between C and assembler. If you want to get low-level but portable, it’s a C-like model or the highway (excepting perhaps Forth).

[This message has been edited by Pop N Fresh (edited 11-22-2003).]

Having two language APIs which are completely separate (i.e. they both compile to native GPU machine code) is fine by me, although it would take more effort to maintain them. I just wanted to point out that layering an HLSL over an assembly SL is not the obvious way to go (as I initially believed), because in essence you just layer the HL language over a lower-level VM, which doesn’t make much sense (since you can compile to native machine code right away), introduces added complexity, and probably makes target-specific code optimization impossible. So much for the “why not have access to the assembly code” argument (because you don’t have it anyway) and the “it’s been done this way with CPUs for decades” argument (because it hasn’t). Of course, in real life other, non-technical factors matter, like the effort to write, maintain, and debug drivers (and an HLSL in there certainly adds complexity), but that is a whole different story.

well, ARBfp is actually a rather high-level language. it is in asm style, but i think it is a rather high-level language, even compared to NVfp.

why?

because you don’t have to mess with registers. you have variables.

the only thing a c-style language has over it, is a stack. and the possibility to group instructions into so called “functions”

but the fact that we don’t have to mess with registers makes ARBfp quite a step higher level actually.

ARBfp isn’t particularly HL. It’s asm with some candy. It’s more HL than DX’s ps2.0, but much closer to ps2.0 than to HLSL.

ahem - if you write an assembler program on standard CPUs, you have a stack too, since everything any high level language can do, is provided from the machine the language runs on.

ARB_VP/FP are a high level language (and can of course be optimized, since a driver is not restricted in generating any native code).
They are not the native assembler of any existing CPU. However, compared to a “real” CPU, the model provided by ARB_VP/FP has a very limited instruction set (no stack, no jumps, etc., which could be changed in the next version), and so ARB_VP/FP are limited compared to a “real” assembler (I’m referring to PC compatibles). But GLSL is very limited compared to “real” C as well (and in some ways, e.g. inverse matrices, even limited compared to ARB_VP/FP), so that’s not an argument.

What I tried to say, is that it may be easier for video card vendors to maintain a “simple” assembler like language, than having to support a more complex high level language - therefore both, a low level language and a high level language make sense to me - even more since in some time most shaders won’t be coded manually anymore, but created interactively by some advanced tools (and maybe optimized manually afterwards) - so it’s not as important, that OpenGL provides a very advanced high level language, but that the language used is standardized and compatible between video cards and drivers.

The asm-like code, however, is neither easier nor more difficult for a driver to post-optimize than C-like code, since most optimizers won’t foresee what you “try” to achieve anyway (at least not correctly :^) ) but work as peephole optimizers, exchanging some pieces of code for other pieces with the same functionality but fewer or faster instructions.
The asm-like code could (I may be wrong) however be easier to get working in a video card driver.
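
To make the peephole idea above a bit more concrete, here is a rough Python sketch, not anything taken from a real driver. It assumes a made-up instruction format of `(opcode, dest, source, ...)` tuples and a hypothetical `live_out` set naming the registers that must survive to the end of the program. The one rewrite it knows is folding a `MUL` followed by an `ADD` into a single `MAD` (multiply-add, an opcode ARB_fp does have) when nothing else needs the intermediate result.

```python
def is_read_later(program, start, reg, live_out):
    """True if `reg` is read at or after index `start` before being overwritten."""
    for _op, dest, *srcs in program[start:]:
        if reg in srcs:
            return True
        if dest == reg:
            return False  # overwritten before anyone reads it
    return reg in live_out  # still needed after the program ends?


def fuse_mul_add(program, live_out=("result.color",)):
    """Peephole pass: MUL t, a, b ; ADD d, t, c  ->  MAD d, a, b, c."""
    out = []
    i = 0
    while i < len(program):
        ins = program[i]
        nxt = program[i + 1] if i + 1 < len(program) else None
        if nxt is not None and ins[0] == "MUL" and nxt[0] == "ADD":
            tmp = ins[1]
            add_srcs = list(nxt[2:])
            # Fuse only if exactly one ADD operand is the MUL result and the
            # MUL result is dead afterwards (overwritten by the ADD itself,
            # or never read again).
            if add_srcs.count(tmp) == 1 and (
                nxt[1] == tmp or not is_read_later(program, i + 2, tmp, live_out)
            ):
                other = add_srcs[0] if add_srcs[1] == tmp else add_srcs[1]
                out.append(("MAD", nxt[1], ins[2], ins[3], other))
                i += 2
                continue
        out.append(ins)
        i += 1
    return out


# Hypothetical two-instruction shader fragment.
example = [
    ("MUL", "tmp", "a", "b"),
    ("ADD", "result.color", "tmp", "c"),
]
fused = fuse_mul_add(example)
```

Note that the pass only ever looks at a two-instruction window plus a liveness check; it has no idea what the shader is "trying" to do, which is exactly the limitation described above.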

(Besides I like the simple and straight ahead interface to ARB_VP/FP, however, if my video card driver will support GLSL someday, I may like that too…)

Originally posted by Humus:
ARBfp isn’t particularly HL. It’s asm with some candy. It’s more HL than DX’s ps2.0, but much closer to ps2.0 than to HLSL.

well… syntax is asm-like and not c-like. but that doesn’t make a language high or low level. ARBfp has variables, as HLSL does. it doesn’t have functions, as HLSL does. ps2.0 doesn’t have variables. so ARBfp is in between the two.

and i prefer the asm syntax then, and compile from a higher level language down to it. i don’t believe in all that realtime-highlevel-compile stuff… install time possibly, yes… but definitely not “everytime i change a shader” time…

yes we have 3gig pc’s today. but no i don’t want to waste my cycles all the time to have all sorts of recompilations running. .NET platform, opengl/dx HLSL compilations, all the fuss…

you can solve every problem with another layer, except the problem of too many layers.

we have much too many today.

Yep, layers are pretty bad because of that (.NET). But they allow you to create more cross-platform programs. I don’t know which way is better. It’s a lengthy job to write a driver that compiles GLSL to native instructions, plus fp to native instructions, plus all the proprietary vp’s/fp’s to the same instructions. But it’s hard to do anything without layers nowadays.

Originally posted by mw:
The asm-like code, however, is neither easier nor more difficult for a driver to post-optimize than C-like code, since most optimizers won’t foresee what you “try” to achieve anyway (at least not correctly :^) ) but work as peephole optimizers, exchanging some pieces of code for other pieces with the same functionality but fewer or faster instructions.
The asm-like code could (I may be wrong) however be easier to get working in a video card driver.

I have very good reasons to think you’re wrong on both points. I don’t want to write lengthy posts about it, but will rather refer to this 20 page thread on beyond3d: http://www.beyond3d.com/forum/viewtopic.php?t=8589

Originally posted by mw:
(Besides I like the simple and straight ahead interface to ARB_VP/FP, however, if my video card driver will support GLSL someday, I may like that too…)

Trust me, once you’ve tried an HLSL you’re not going back.

Originally posted by davepermen:
well… syntax is asm-like and not c-like. but that doesn’t make a language high or low level. ARBfp has variables, as HLSL does. it doesn’t have functions, as HLSL does. ps2.0 doesn’t have variables. so ARBfp is in between the two.

and i prefer the asm syntax then, and compile from a higher level language down to it. i don’t believe in all that realtime-highlevel-compile stuff… install time possibly, yes… but definitely not “everytime i change a shader” time…

yes we have 3gig pc’s today. but no i don’t want to waste my cycles all the time to have all sorts of recompilations running. .NET platform, opengl/dx HLSL compilations, all the fuss…

you can solve every problem with another layer, except the problem of too many layers.

we have much too many today.

Calling your variables “lightVec” and “viewVec” instead of r0 and r1 doesn’t make it high level. It’s a convenience, but it’s still low level.

Compilation takes place at load time, just like with asm shaders (which also need to be compiled). Not when you switch shaders.

humus: not having to manage your registers yourself but getting that done by the compiler DOES make it higher level.

in ps2.0 you can #define around to get the namings.

but ARBfp has VARIABLES. you don’t need to bother about which register you now have in use where and how. this CAN be done for ARBfp by the compiler. it can NOT if you code yourself with the registers directly.

register allocation got abstracted away.

that DOES make it higher level.
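
for anyone wondering what "register allocation got abstracted away" means in practice, here is a rough python sketch of the kind of work a driver could do behind the scenes. it is only an illustration under assumptions: the `(opcode, dest, source, ...)` tuple format, the `outputs` parameter and the `r0`, `r1`, … register names are all made up, and real drivers surely do something far more elaborate.

```python
def allocate_registers(program, num_regs, outputs=("result.color",)):
    """Rewrite named temporaries to registers r0..r{num_regs-1}.

    Instructions are hypothetical (opcode, dest, *sources) tuples. Any name
    that is written and is not listed in `outputs` counts as a temporary;
    names that are only read (inputs, constants) are left untouched.
    """
    temps = {ins[1] for ins in program} - set(outputs)

    # Index of the last instruction that reads each temporary.
    last_read = {}
    for i, (_op, _dest, *srcs) in enumerate(program):
        for s in srcs:
            if s in temps:
                last_read[s] = i

    free = [f"r{n}" for n in reversed(range(num_regs))]  # pop() hands out r0 first
    mapping = {}
    out = []
    for i, (op, dest, *srcs) in enumerate(program):
        new_srcs = [mapping.get(s, s) for s in srcs]
        # A register becomes reusable once its variable was read for the last time.
        for s in srcs:
            if s in mapping and last_read[s] == i:
                free.append(mapping.pop(s))
        if dest in temps:
            if dest not in mapping:
                if not free:
                    raise RuntimeError(f"out of registers at instruction {i}")
                mapping[dest] = free.pop()
            new_dest = mapping[dest]
            if last_read.get(dest, -1) <= i:
                free.append(mapping.pop(dest))  # value is never read again
        else:
            new_dest = dest
        out.append((op, new_dest, *new_srcs))
    return out


# Three named variables, written the way you would in a language with variables.
shader = [
    ("MUL", "nDotL", "normal", "lightVec"),
    ("MUL", "spec", "halfVec", "normal"),
    ("ADD", "lit", "nDotL", "spec"),
    ("MOV", "result.color", "lit"),
]
allocated = allocate_registers(shader, num_regs=2)
```

the three variables end up sharing two registers because `lit` can reuse a slot as soon as `nDotL` and `spec` have been read for the last time. with raw registers you do that bookkeeping by hand; with variables the compiler is free to do it per target.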

i know when shaders get compiled, humus. i know very well what i’m talking about. you don’t need to bother about my issues. i do

Well Humus, I stepped through your 20p+ monster - and while it wasn’t entirely ontopic (ahem), it was (sometimes) interesting nevertheless.

One thing I hadn’t thought of was that the ARB shader assembler might map to a subset of current hardware capabilities, whereas I thought it was an entirely virtual machine.
If it IS virtual, optimizing it is no different from optimizing GLSL (high-level structures are not needed for optimization; in the end every kind of code structure is a set of conditional jumps, which can be optimized just as well in an assembler-like language).
The current specification would have to be changed, however (at least to add conditional jumps), and maybe you’re right that this is the time to step up to a higher-level language, where it’s easier to define scalars and vectors, letting the driver work out how best to implement them in hardware (you sound quite convincing on this point, as I have to admit, without being too happy about it).
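
As a small illustration of the "every code structure is a set of conditional jumps" point above, here is a Python sketch that flattens a structured if/else into a linear instruction list. The opcodes `JMPZ`, `JMP` and `LABEL`, the tuple format and the label naming are all hypothetical; nothing like them exists in the current ARB_VP/FP specification, which is exactly the gap mentioned above.

```python
def lower_if_else(cond_reg, then_body, else_body, label_id=0):
    """Flatten `if cond_reg: then_body else: else_body` into jumps and labels."""
    else_label = f"else_{label_id}"
    end_label = f"end_{label_id}"
    return (
        [("JMPZ", cond_reg, else_label)]  # skip the then-branch if cond is zero
        + list(then_body)
        + [("JMP", end_label), ("LABEL", else_label)]
        + list(else_body)
        + [("LABEL", end_label)]
    )


# if (c) x = a; else x = b;
flat = lower_if_else("c", [("MOV", "x", "a")], [("MOV", "x", "b")])
```

Once the code is in this flat form, the same kind of pattern-based optimization applies whether it started out as C-like source or was written as assembler by hand.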