Detecting if a gl_LightSource is disabled in compatibility profile

Don’t shader subroutines already give you the performance you’d expect from this hypothetical const recompilation? I’ve found the cost of recompiling and relinking a shader to outweigh dynamic strategies like this. Granted, shader subroutines are GL4-only, but at least they already exist :)

[QUOTE=atb123;1239924]2 more questions:

  • How is this idea different than branching on uniform bools? Couldn’t the compiler optimize uniform branches in the same way?[/QUOTE]

It shouldn’t, no. Uniforms are variables that are just uniform (constant) for a batch. You can change them freely between batches and that change should be relatively cheap.

Further, suppose you are using an int (instead of a bool) to define enumerated shader permutations (e.g. type of lighting, type of fog mode, type of texturing on texunit0, etc.). The compiler doesn’t know about these, much less which subset you’re using (without deep expression analysis). So how does it precompile all of the permutations? Then, mix in 10-20 “shader permutation” states (int, bool, whatever) and a bunch of shader code using them and you have a permutation space that’s completely unreasonable to precompile exhaustively for.

So anyway, for uniforms, the compiler cannot reasonably “compile out” code for branches depending on mere uniforms.

Constants however are different. They cannot change across a specific compiled shader – ever. You’re effectively telling the compiler up-front “I want you to compile for this specific permutation”.
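To make the “compile for this specific permutation” idea concrete, here is a minimal sketch of how the C++ side might generate the constants. Everything in it is made up for illustration: `buildPermutationSource`, the `LIGHT0_ENABLED` and `FOG_MODE` names, and the shader body itself. It only assembles the source text; you would still hand the result to `glShaderSource` and compile it as usual.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: returns GLSL source with one `const` declaration per
// permutation setting, placed between the #version line and the shader body.
std::string buildPermutationSource(int glslVersion,
                                   const std::map<std::string, int>& intConstants,
                                   const std::map<std::string, bool>& boolConstants,
                                   const std::string& body)
{
    // #version has to be the first directive, so the body must not carry one.
    assert(body.find("#version") == std::string::npos);

    std::string src = "#version " + std::to_string(glslVersion) + "\n";
    for (const auto& kv : boolConstants)
        src += "const bool " + kv.first + " = " + (kv.second ? "true" : "false") + ";\n";
    for (const auto& kv : intConstants)
        src += "const int " + kv.first + " = " + std::to_string(kv.second) + ";\n";
    return src + body;
}

// Illustrative fragment shader body. The branches test constants, so the
// compiler is free to drop the ones that can never be taken.
const char* const kLightingBody =
    "void main() {\n"
    "    vec4 color = gl_FrontMaterial.emission;\n"
    "    if (LIGHT0_ENABLED)\n"
    "        color += gl_LightSource[0].ambient * gl_FrontMaterial.ambient;\n"
    "    if (FOG_MODE == 1)\n"
    "        color.rgb = mix(gl_Fog.color.rgb, color.rgb, 0.5); // placeholder blend\n"
    "    gl_FragColor = color;\n"
    "}\n";
```

Calling it with version 120, `FOG_MODE` set to 1 and `LIGHT0_ENABLED` set to false yields a source whose light-0 branch is dead code by construction. Whether a given driver actually strips it is still something to verify on your hardware.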

  • Previously I was under the impression that branches based on uniforms were relatively efficient, how much of a performance boost might I get using const branches set up by the preprocessor instead? …I know the correct answer will be “test and see”…

Yeah, I would suspect as you do that it’s going to depend a lot on your specific usage. If the shaders are simple/trivial, it probably won’t make any difference. But suppose (for instance) complex shaders where, depending on the “shader permutation” activated, you might need anywhere from 0 to 200 uniform values populated per shader bind (some uniform arrays, some textures, etc., with some effort needed to “get them ready” to bind). If you use conditional logic based on uniforms in the shader to “switch on” the correct shader path, those 200 max uniforms are still “active”, even if you’re only using 1, so to be correct you still need to populate them all. With hard-coded assumptions in your C++ code about how your GLSL code works you might be able to avoid populating the ones you think won’t be used in the shader, but this is cumbersome and fragile. Whereas if you compile out the switched-off code with conditional logic based on const settings (or preprocessor directives), those uniforms are truly inactive, and GL knows this and will tell you as much.
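As a rough sketch of “GL will tell you as much”: `glGetUniformLocation` returns -1 for a name that is not an active uniform in the linked program, so the C++ side can ask instead of assuming. To keep the snippet runnable without a GL context, the query is passed in as a callback; `UniformLocator` and `activeUniformsOf` are hypothetical names, and in real code the callback would wrap `glGetUniformLocation(program, name.c_str())`.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Stand-in for glGetUniformLocation(program, name): expected to return -1
// when the name is not an active uniform in the linked program.
using UniformLocator = std::function<int(const std::string&)>;

// Hypothetical helper: returns only the (name, location) pairs that are worth
// preparing and uploading for the currently bound permutation.
std::vector<std::pair<std::string, int>>
activeUniformsOf(const std::vector<std::string>& candidates,
                 const UniformLocator& locate)
{
    std::vector<std::pair<std::string, int>> active;
    for (const std::string& name : candidates) {
        const int location = locate(name);
        if (location != -1)  // -1: not active in this program, nothing to populate
            active.emplace_back(name, location);
    }
    return active;
}
```

Note that this applies to your own uniforms. `glGetUniformLocation` also returns -1 for any name starting with the reserved `gl_` prefix, so built-ins such as `gl_LightSource` can’t be probed this way.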

Beyond active uniforms, think interpolators. Suppose in the general case you have 20 different named interpolators, of which you’d only ever use 5 at a time, depending on the specific shader permutation that’s active. With the “conditional logic based on uniforms” route, the compiler would have to compile in all 20 interpolators. AFAIK that means far fewer verts in your vertex cache and fewer shaders running in parallel. With the “conditional logic based on constants” route (or the preprocessor), the inactive code is actually removed from the shader, so the GLSL compiler builds in support for only those interpolators you need, yielding more parallelism and a larger vertex cache.
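For the preprocessor flavor of the same thing, a sketch along these lines declares each optional varying only when its feature is switched on. The feature macros (`USE_FOG`, `USE_ENV_MAP`), the varying names and both helper functions are invented for the example, and the shader is only meant to show the shape of the idea.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: turns enabled feature names into #define lines.
std::string buildDefines(const std::vector<std::string>& enabledFeatures)
{
    std::string defines;
    for (const std::string& feature : enabledFeatures)
        defines += "#define " + feature + " 1\n";
    return defines;
}

// Illustrative vertex shader body: each optional varying is declared and
// written only when its feature macro is defined.
const char* const kVertexBody =
    "#ifdef USE_FOG\n"
    "varying float fogFactor;\n"
    "#endif\n"
    "#ifdef USE_ENV_MAP\n"
    "varying vec3 reflectDir;\n"
    "#endif\n"
    "void main() {\n"
    "    vec4 eyePos = gl_ModelViewMatrix * gl_Vertex;\n"
    "    gl_Position = ftransform();\n"
    "#ifdef USE_FOG\n"
    "    fogFactor = clamp((gl_Fog.end + eyePos.z) * gl_Fog.scale, 0.0, 1.0);\n"
    "#endif\n"
    "#ifdef USE_ENV_MAP\n"
    "    reflectDir = reflect(normalize(eyePos.xyz), normalize(gl_NormalMatrix * gl_Normal));\n"
    "#endif\n"
    "}\n";

// Hypothetical helper: #version first, then the defines, then the body.
std::string assembleVertexSource(const std::vector<std::string>& enabledFeatures)
{
    return "#version 120\n" + buildDefines(enabledFeatures) + kVertexBody;
}
```

Building with only `USE_FOG` enabled gives a shader in which `reflectDir` never exists, so there is no interpolator for the compiler to reserve for it.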

In my case I only have the uniforms in gl_LightSource[gl_MaxLights] that may or may not be needed depending on their state, but no conditional interpolators or textures, so maybe I’m OK with uniform branching? It certainly makes the code a lot simpler…

Once you get into the interplay between different “shader permutation” settings, subroutines just don’t map well (in my opinion).

The analog to this approach in C++ is to turn each of your constant value permutations (inputs) into one or more virtual functions (outputs).

This is after the shader permutation constant combination fan-out, so there are a lot of them (whereas the constant values themselves come before it, so there are few). Seems to me applying subroutines to this is the wrong solution to the problem and leads to spaghetti code unless you have trivially few shader permutation settings.

Whereas using constants in expressions leads to very clean, readable code, even with a lot of shader permutation settings.
