Does Spir-V stick too close to the glsl model?

ratchet_freak · June 11, 2015, 3:30am

The provisional spec uses the glsl model of declaring variables outside function scope (globally if you will) that you have to load and store to access. They also limit you to 1 entrypoint in a shader module.

This means that you need at least one module per stage in the pipeline.

On the other hand HLSL has the input/output as parameter and return value of the entry-point function of the stage. This has the distinct advantage of being able to specify the entire shader pipeline in a single file and be sure that the output of one stage will match the input of the next by not having to duplicate the declaration and possibly making a typo.

Should graphical shader spir-V be allowed to have multiple entrypoints defined in a single module in HLSL style?

Salabar · June 11, 2015, 5:20am

I bet this is how hardware works. Multiple shaders or kernels in one file is a language and compiler feature.

ratchet_freak · June 11, 2015, 6:20am

Yet most of the time they have to be combined into a single “program” object that you manipulate monolithically.

Alfonse_Reinheart · June 11, 2015, 7:54am

[QUOTE=ratchet freak;37679]The provisional spec uses the glsl model of declaring variables outside function scope (globally if you will) that you have to load and store to access. They also limit you to 1 entrypoint in a shader module.

This means that you need at least one module per stage in the pipeline.

On the other hand HLSL has the input/output as parameter and return value of the entry-point function of the stage. This has the distinct advantage of being able to specify the entire shader pipeline in a single file and be sure that the output of one stage will match the input of the next by not having to duplicate the declaration and possibly making a typo.[/quote]

SPIR-V is not intended to be written by humans; it’s intended to be the output format from some kind of front-end compiler. So if a front-end wants to have multiple stages in one file, it can. If a front-end wants to check for the user “possibly making a typo”, it can.

That’s not SPIR-V’s job; it needs to be lower-level. A higher-level front-end that wants to do these things simply spits out multiple SPIR-V shaders.

Even in Mantle, where the Mantle Programming Guide explicitly tells you “don’t change pipelines often; use ubershaders instead,” they build pipelines out of individually compiled program objects. They want the user to be able to stitch together separate stages and combine them into a pipeline for use.

Furthermore, SPIR-V is intended for use by both Vulkan (and possibly OpenGL) and OpenCL. In OpenCL, you don’t have “stages” in a “pipeline”. So SPIR-V adding such a concept would be of benefit only to Vulkan (and possibly OpenGL) use. To the degree that it would be of benefit at all.

Not [since OpenGL 4.1](https://www.opengl.org/wiki/Separate Program). And it’s important to note that said feature has been something the OpenGL community has been begging for since… well, since GLSL first appeared with its whole “monolithic program object” approach.

ratchet_freak · June 11, 2015, 3:48pm

[QUOTE=Alfonse Reinheart;37682]SPIR-V is not intended to be written by humans; it’s intended to be the output format from some kind of front-end compiler. So if a front-end wants to have multiple stages in one file, it can. If a front-end wants to check for the user “possibly making a typo”, it can.

That’s not SPIR-V’s job; it needs to be lower-level. A higher-level front-end that wants to do these things simply spits out multiple SPIR-V shaders.

Even in Mantle, where the Mantle Programming Guide explicitly tells you “don’t change pipelines often; use ubershaders instead,” they build pipelines out of individually compiled program objects. They want the user to be able to stitch together separate stages and combine them into a pipeline for use.
[/QUOTE]

But it then has to output several spir-V files to represent that one input file. That puts more burden on the programmer, who has to state that multiple modules together represent his programmable shaders.

You would still be able to link together modules for combining shaders. I fully expect the pipeline creation to have int** spirv and int* length parameters so you can pass in multiple modules.

There is only 1 type of entrypoint, “kernel”, in openCL, whereas for vulkan (and the inevitable spir-V openGL extension) there are 6. There are several things that are graphics-shader specific in spir-V that openCL isn’t even allowed to touch. They aren’t shy about putting in things just for the benefit of one side of the coin. Just look at the native matrix capability; openCL isn’t granted that.

I’m not familiar enough with openCL to say anything about whether it would benefit from having the entrypoints get the thread-specific values as parameters to the entrypoint or not.

[QUOTE=Alfonse Reinheart;37682]
Not [since OpenGL 4.1](https://www.opengl.org/wiki/Separate Program). And it’s important to note that said feature has been something the OpenGL community has been begging for since… well, since GLSL first appeared with its whole “monolithic program object” approach.[/QUOTE]

I know, that’s why I said “most of the time”. Besides, in vulkan the pipeline will be fixed after creation and you won’t be able to swap out shaders like you can with separate programs.

Alfonse_Reinheart · June 11, 2015, 10:11pm

… so? It’s only burdensome for a programmer who wants multiple modules in one file. And let’s be honest: it’s not that burdensome.

I fully expect the pipeline creation API to not take SPIR-V directly; it’d instead take individual objects that represent compiled shader stages. Compiling SPIR-V and creating pipelines would be separate steps. Just like Mantle.

And we should examine Mantle in this regard further. The GR_PIPELINE_SHADER object has a lot more than just a pointer to the shader object. For each shader stage you provide, you also have to specify a descriptor set mapping (which in Vulkan would probably be some kind of descriptor layout object). This describes how that stage’s resources map to the descriptor set.

These are per-stage properties. So if you have a single “shader” that represents multiple stages, then how exactly do you map its resources? Would this not mean that all shader stages in that shader use the same descriptor set mapping, and thus all of the resources must be loaded, even if a particular stage doesn’t actually use them?

If you look at the descriptor set setup code in Mantle, it’s very clear that the API expects descriptor sets to function independently of shader stage characteristics. All the shader stage creation data says about a particular descriptor set resource is that it uses it and what general type it is. How big it is in the shader and so forth is entirely defined by the descriptor set binding, not the shader stage data. And likely not the shader itself.

With your way, it is possible for a multi-stage shader to reference resources that are only used by one of the stages in the shader. But since it is mentioned in the descriptor binding for that shader (which is again not broken out by stage), the API has to do one of two things.

1. Know, for each shader stage within a multi-stage program, whether it actually uses a particular resource. This creates a dependency between the actual contents of a shader stage in a pipeline and the descriptor sets that it uses. Such a dependency did not exist before. Also, this requires more SPIR-V compiler work, since it needs to extract that data and store it in a place accessible by the CPU, so that descriptor set binds can read it and be adjusted accordingly.
2. Follow the data for the descriptor set. This means that it must assume that every shader stage will use every resource in the descriptor set layout defined when those shader stages were built into the pipeline.

This is a lot more complicated than just letting every shader stage be separate.

People generally don’t swap out shaders within pipelines. The point of the OpenGL feature is the lightweight “linking” between different programs. If you have 4 vertex shaders and 3 fragment shaders, and all are going to be used with each other, under the old system, you need 12 programs, which have 12 completely independent sets of uniform state (among other state). If you want to change uniforms for vertex shader #2, you have to change it in 3 programs so that it stays synchronized.

Under separate programs, you have only 7 programs. If you want to change uniforms in vertex shader #2, that’s what you do. Oh yes, you have 12 program pipelines. But pipelines are cheap; indeed, they’re so cheap that they’re not even shareable.

But most important of all in OpenGL (and for your proposed Vulkan suggestion) is compilation performance. Program linking is almost as complex in OpenGL as shader compilation. Indeed, in some implementations, they’re an almost identical process, which means that you’re spending a lot of performance on nothing.

In the above example, under the old system, you have 7 shader compilations and 12 linkings. With separate programs, you have 7 program linkings, and 12 pipeline constructions (which are far less invasive).
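To make the counting in the 4-vertex-shader, 3-fragment-shader example concrete, here is a small Python sketch. The shader names (`vs1`, `fs1`, and so on) are placeholders invented for illustration and do not correspond to any real API. It only reproduces the arithmetic, not actual compilation or linking.

```python
from itertools import product

# Placeholder names for the 4 vertex and 3 fragment shaders in the example.
vertex_shaders = [f"vs{i}" for i in range(1, 5)]
fragment_shaders = [f"fs{i}" for i in range(1, 4)]

# Every vertex shader is used with every fragment shader.
combinations = list(product(vertex_shaders, fragment_shaders))

# Monolithic programs: one linked program per combination,
# each carrying its own independent uniform state.
monolithic = {
    "shader_compiles": len(vertex_shaders) + len(fragment_shaders),
    "program_links": len(combinations),
}

# Separate programs: one single-stage program per shader,
# plus one lightweight pipeline object per combination.
separable = {
    "program_links": len(vertex_shaders) + len(fragment_shaders),
    "pipeline_objects": len(combinations),
}

# How many programs must be touched to change a uniform of vertex shader #2.
monolithic_updates_for_vs2 = sum(1 for vs, _ in combinations if vs == "vs2")
separable_updates_for_vs2 = 1

print(monolithic, separable, monolithic_updates_for_vs2, separable_updates_for_vs2)
```

The monolithic count grows with the product of the stage counts, while the number of separately linked programs grows only with their sum.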

Vulkan will be different, in that pipeline construction is intended to be more invasive than OpenGL’s pipeline construction. Even so, the clear focus of this process is on inter-stage optimization.

Inter-stage program linking in OpenGL, by contrast, was far more invasive. It had to deal with merging identical resource references together, building introspection databases, and so forth.

The process of building a Vulkan pipeline, while likely more expensive than an OpenGL pipeline, will still be cheaper than building a fully linked OpenGL program.

ratchet_freak · September 14, 2015, 4:36am

Some updates have happened since. One of them is multiple entry points per module, with an interesting set of properties:

https://www.khronos.org/registry/spir-v/specs/1.0/SPIRV.html#_entry_point_and_execution_model

The static function call graphs rooted at two entry points are allowed to overlap, so that function definitions can be shared. The execution model and any execution modes associated with an entry point apply to the entire static function call graph rooted at that entry point. This rule implies that a function appearing in both call graphs of two distinct entry points may behave differently in each case. Similarly, variables whose semantics depend on properties of an entry point, e.g. those using the Input Storage Class, may behave differently when used in call graphs rooted in two different entry points.

The globals-for-IO model remains. I get that it emulates some type of memory-mapped IO. But seriously, what is so bad about the Cg IO model, besides that microsoft uses it as well for HLSL? What’s worse is that there is no way to say that you (want to) use a certain variable in only 1 entry point. To find that out, a tool needs to build the static call graph for each entry point defined, which is not even enough if imported/exported functions are used.

Please don’t cripple your standard just to avoid copying a large “evil” corporation. That happened once already and it’s still biting people in the arse.

That uniforms, textures/samplers, UBO and SSBO are global I can understand but per invocation data and its output are tied to the invocation. Following normal programming standards I expect them to be bound to the entry point’s parameters using them.
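The call-graph walk described in this post (finding out which Input/Output variables belong to which entry point) can be sketched in Python as below. The module layout is a made-up dictionary, not a real SPIR-V parser: the function names, variable names, and the `calls`/`vars` fields are all assumptions for illustration.

```python
from collections import deque

# Hypothetical module: for each function, the functions it calls and the
# Input/Output variables it references directly.
functions = {
    "vs_main":   {"calls": ["transform"], "vars": ["in_position", "out_clip_pos"]},
    "fs_main":   {"calls": ["shade"],     "vars": ["out_color"]},
    "transform": {"calls": ["helper"],    "vars": []},
    "shade":     {"calls": ["helper"],    "vars": ["in_normal"]},
    "helper":    {"calls": [],            "vars": []},
}

def interface_of(entry_point, functions):
    """Collect the interface variables referenced anywhere in the
    static call graph rooted at entry_point (breadth-first walk)."""
    seen = {entry_point}
    queue = deque([entry_point])
    used = set()
    while queue:
        name = queue.popleft()
        body = functions[name]  # KeyError if the body is not in this module
        used.update(body["vars"])
        for callee in body["calls"]:
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return used

print(sorted(interface_of("vs_main", functions)))
print(sorted(interface_of("fs_main", functions)))
```

Note that `helper` is shared by both call graphs, as the spec text allows. A callee whose body lives in another module cannot be walked this way (here it would raise `KeyError`), which is the imported-function limitation mentioned above.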

Alfonse_Reinheart · September 14, 2015, 6:09am

But seriously, what is so bad about the Cg IO model, besides that microsoft uses it as well for HLSL?

I could ask you the reverse question: what’s so good about it?

If I use global variables for the interface, it makes it easier to communicate with imported code. Indeed, it allows imported code to define its own part of the interface while remaining independent from its source code. Its interface is merely an implementation detail to the calling module. Now obviously it can’t be totally independent, since the library can’t have its inputs/outputs conflict with the resource locations used by the main code. But so long as you apportion those resources correctly, there is no need to have a direct connection.

What are the advantages of having shader stage IO work via parameters? That the SPIR-V compiler has to do slightly less work? Building a static call graph is something that the compiler has to do anyway, so that it can tell if a variable is used or contributes to an output (and therefore can be optimized out).

Also, there are practical issues. SPIR-V has no notion of “output” parameters; all function parameters are inputs. To have an output parameter on a SPIR-V function, you explicitly have to make a pointer to a variable held by the external code. So in order to have shader stage IO be done by function parameters, your entry points are going to have to take a bunch of pointers.

Furthermore, interface variables in hardware are likely to be pieces of work-group-local memory, accessible from any function in any shader stage. As such, if you start having to pass them around from function to function as formal parameters, that could involve a lot of needless copying. To avoid that where possible, the SPIR-V compiler has to build… a static call graph, so that it can turn every use of that parameter, across function calls, into a work-group-local memory access rather than passing the actual parameter.

So it’s not clear what good that would do.

Please don’t cripple your standard just to avoid copying a large “evil” corporation.

I fail to see how this would in any way be “crippling” the standard.

ratchet_freak · September 14, 2015, 7:23am

[QUOTE=Alfonse Reinheart;39434]I could ask you the reverse question: what’s so good about it?

If I use global variables for the interface, it makes it easier to communicate with imported code. Indeed, it allows imported code to define its own part of the interface while remaining independent from its source code. Its interface is merely an implementation detail to the calling module. Now obviously it can’t be totally independent, since the library can’t have its inputs/outputs conflict with the resource locations used by the main code. But so long as you apportion those resources correctly, there is no need to have a direct connection.
[/QUOTE]
In the current state IO variables have to be duplicated.

Pointers can only have a single storage class. That means that every stage’s output needs to also be redeclared as the next stage’s input.

Return by pointer is IMO still better than return by global. (despite my dislike of return by pointer but that’s a rant for another place)

spir-V has return. So you can get around that by grouping everything in a structure and returning it.

Going further you can put the input to an invocation in a structure as a single param and returning the output as a single structure.

It’s not just the driver that needs to keep track of it all. Other tools (like optimizers and verifiers) would be helped. With input by parameters you don’t need to build the callgraph; instead you just need to ensure the function calls are correct.

By that point calls will get mostly inlined and the call graph rendered moot.

[QUOTE=Alfonse Reinheart;39434]

I fail to see how this would in any way be “crippling” the standard.[/QUOTE]

I’ll admit “cripple” may be a bit too strongly worded. However the plea to avoid the knee-jerk reaction of doing anything but copying microsoft even if it means sacrificing usability still stands.

Alfonse_Reinheart · September 14, 2015, 8:09am

In the current state IO variables have to be duplicated.

First, no they don’t. They only have to be duplicated if you put all the shaders in one SPIR-V file.

Second, even then, they’re not necessarily duplicated. Different stages have different qualifiers on the variables.

Third, your way doesn’t remove duplication; you just put it in a different place. Rather than defining input variables, you define input parameters. Same for outputs. Each stage needs its own definitions for those. As for parameters of structs:

spir-V has return. So you can get around that by grouping everything in a structure and returning it.

Going further you can put the input to an invocation in a structure as a single param and returning the output as a single structure.

How exactly will that avoid duplication? Different stages have different sets of applicable decorators, like “Stream”, “XfbBuffer” and so forth. In particular, those are only allowed on outputs. So even if you have an input struct and an output struct, the output struct can’t necessarily be taken as the input to the next stage. You need two struct definitions: one as the output, and one as the input.

Not only that, what if you don’t want the next stage to take all of the values output from the previous one? That’s a perfectly legitimate thing to do. That’s why OpenGL Program Pipelines have rendezvous-by-resource via inter-stage location qualifiers.
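As a rough illustration of rendezvous-by-location, here is a minimal Python sketch of matching one stage's outputs against the next stage's inputs. It is a deliberate simplification: real interface-matching rules also consider things like components and interpolation qualifiers, and the `match_interface` helper and the type names are assumptions made up for this example.

```python
def match_interface(outputs, inputs):
    """outputs and inputs map location number -> type name.

    Returns (missing, mismatched, unused):
      missing    - locations the consumer reads but the producer never writes
      mismatched - locations present on both sides with different types
      unused     - locations the producer writes but the consumer ignores
    """
    missing = sorted(loc for loc in inputs if loc not in outputs)
    mismatched = sorted(
        loc for loc in inputs if loc in outputs and outputs[loc] != inputs[loc]
    )
    unused = sorted(loc for loc in outputs if loc not in inputs)
    return missing, mismatched, unused

# The fragment stage deliberately consumes only a subset of the vertex outputs.
vertex_outputs = {0: "vec4", 1: "vec3", 2: "vec2"}
fragment_inputs = {0: "vec4", 2: "vec2"}

result = match_interface(vertex_outputs, fragment_inputs)
print(result)
```

In this sketch an unused output is not treated as an error, which is the case where a single shared struct between the two stages would not fit.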

Other tools (like optimizers and verifiers) would be helped. With input by parameters you don’t need to build the callgraph instead you just need to ensure the function calls are correct.

Any optimizer worthy of the name has to build a callgraph in order to determine which global constructs (Uniforms, buffers, textures, etc) are not being statically used. Adding inputs and outputs to that list is just another global construct. The same goes for verifiers.

By that point calls will get mostly inlined and the call graph rendered moot.

Premature optimization is just as bad when dealing with intermediate languages as when writing programs. Drivers are going to have to do inlining, and they will know a lot better when inlining is good and when it is not.

A SPIR-V optimizer, which is blind to the implementation, can only make a limited number of optimizations. Removing dead code, some small expression optimizations, and so forth. Anything else, such as inlining, needs to be done by a competent driver.

I’ll admit “cripple” may be a bit too strongly worded. However the plea to avoid the knee-jerk reaction of doing anything but copying microsoft even if it means sacrificing usability still stands.

But the difference between parameters and globals is exceedingly minor. It’s minor for tools that generate SPIR-V, it’s minor for tools processing SPIR-V, and it’s minor for users who have to work with SPIR-V.

There is no “sacrifice” of usability here. It may make one or two of your personal use cases slightly more inconvenient. But it makes other use cases more convenient.

You act like there are no advantages at all to the global model, that it’s the one right answer, that the only reason to use it is because it’s not what Microsoft does.

ratchet_freak · September 15, 2015, 7:05am

I’d be glad if I had some indication that they took a long and hard look at the entire thing instead of blindly copying from glsl and then adding stuff to spir-V to make it fit the glsl model.

They did with the textures and have since rejigged those to use separate images and samplers which get (optionally) combined at runtime.

Alfonse_Reinheart · September 15, 2015, 10:43am

Let’s ignore for the moment the question of whether SPIR-V is “blindly copying” anything and ask something more important:

Objectively speaking, does it matter?

SPIR-V is an intermediate language. You won’t be writing in it. You won’t be reading it, except when writing a shader compiler and checking what it generates. So long as the IR provides reasonable access to what the hardware can do, so long as it provides a solid abstraction, and so long as tools can manipulate it without much trouble… what does it matter if it looks more like GLSL than is strictly necessary?

Is that going to inhibit the ability of compilers to optimize code? Is that going to inhibit the ability of users to develop shading languages with whatever behavior they want to expose to the user? Is that going to break offline optimizers or reduce the functionality of other offline processing tools? Will that in any significant way negatively impact SPIR-V’s utility as an intermediate representation?

No.

So why does it matter?

On your actual question, you wanted “some indication that they took a long and hard look at the entire thing instead of blindly copying from glsl”. Well, what would such an indication look like? You seem to be suggesting that the only reason SPIR-V uses global interface variables is because GLSL did, that if they built SPIR-V without looking at GLSL at all, they would obviously make them function parameters.

That’s assuming your own conclusion. Namely, that function parameters are obviously the one correct way to go, and the only reason not to use them is because you’re copying from GLSL.

HLSL was created from scratch, based on whatever Microsoft wanted. GLSL was initially developed by 3D Labs, who unlike Microsoft actually bothered to investigate other shading languages in use at the time. They patterned all of GLSL after Renderman. Remember varying? That was from Renderman, as is the term uniform.

I would hope that you aren’t suggesting that Pixar was preemptively avoiding HLSL when they wrote Renderman. No, GLSL and SPIR-V did not pick this model because the developers hate HLSL.

There are good reasons why Renderman, and its derivatives, use this model. The most important of which is module linking.

It makes linking modules easier, with implicit rather than explicit interfaces between modules. If a module needs some input parameter, it just declares it; the module that uses it doesn’t have to change. This means that the interface variables are an implementation detail of other modules; the main module doesn’t have to know or care about them. Just like with other elements (images/textures/buffers/etc), you don’t have to shuffle values around from the main module to submodules if they require one.

This is just encapsulation, at the module level.

SPIR-V is not “blindly copying” from GLSL. It is simply making GLSL’s choices because it has similar needs. Module linking is an important feature of SPIR-V, and global interfaces are the only way to have fully encapsulated, reusable modules.

Please note: GLSL and SPIR-V (and Renderman) have module linkage; HLSL does not.

So I fail to see the need for any “indication” here. The choice is being made because it makes the most sense for the features of the system.

Yes, and that provides access to a genuine hardware feature of shaders, which would not be available in the old model (not that you can access it via Vulkan, since you have to put your OpImageSampler commands up-front).

What you’re asking about is ultimately just syntactic sugar.

Or to put it another way, separation between images and samplers is good because it’s useful, not because it matches HLSL. It’s not something you could layer over top of SPIR-V in your source shading language.

ratchet_freak · November 16, 2015, 8:54am

Looks like they did listen:

Changed OpEntryPoint to take a list of Input and Output <id> for declaring the entry point’s interface.

This means that you can immediately find out which inputs and outputs belong to which pipeline stage without building the call graph.
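With the interface listed on `OpEntryPoint` (execution model, entry point id, name, then the interface ids), a tool can read each stage's inputs and outputs straight off that instruction. Below is a minimal Python sketch that does so from disassembly-style text. The listing is hand-written for illustration, with made-up ids and names, and the `entry_point_interfaces` helper is hypothetical rather than part of any SPIR-V tool.

```python
import shlex

# Illustrative disassembly-style text; the ids and names are invented.
listing = '''
OpEntryPoint Vertex %vs_main "vs_main" %in_position %out_clip_pos
OpEntryPoint Fragment %fs_main "fs_main" %in_normal %out_color
'''

def entry_point_interfaces(text):
    """Map each entry point name to (execution model, interface ids)."""
    result = {}
    for line in text.splitlines():
        tokens = shlex.split(line)  # handles the quoted entry point name
        if not tokens or tokens[0] != "OpEntryPoint":
            continue
        model, _function_id, name, *interface = tokens[1:]
        result[name] = (model, interface)
    return result

interfaces = entry_point_interfaces(listing)
for name, (model, ids) in interfaces.items():
    print(name, model, ids)
```

No traversal of function bodies is needed here, in contrast to the per-entry-point call-graph walk that the earlier revision of the spec required.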