Apparently that’s the case, since the latest NVIDIA GL4 drivers don’t advertise this extension on a GTX285.
What prevents this on older hardware?
That is an excellent question, and that it’s not available on GL3 hardware (apparently) strongly suggests that this isn’t just doing a silly “recompile/reoptimize the program” on uniform change under-the-hood.
Perhaps it’s dynamically changing call instructions in the program code and physically doing subroutine calls (which suggests the presence of a stack, or at least a 1-level jump-back mechanism). If so, I wonder about the max subroutine nesting level (e.g. main->A->B->C). Haven’t read the spec, but since I find no mention of “nest” or “level” in it, this probably isn’t it…
Or is it more like a “switch” statement, where the “jump address” is patched in dynamically based on the uniform value selected, and all alternative subroutine paths rejoin (“jump back”) at the same spot regardless? If so, this should be like switching on a uniform, except that the branch is unconditional rather than conditional, so it should be more efficient.
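For reference, here is a minimal sketch of what the shader side looks like with GLSL 4.00 subroutines. The names ShadeModel, shadeLambert, shadeHalfLambert, shade, lightDir and normal are made up for illustration, and this says nothing about how the driver implements the dispatch, which is exactly the open question here. On the application side, the index for a function would come from glGetSubroutineIndex and be selected with glUniformSubroutinesuiv.

```glsl
#version 400 core

// Subroutine type: the signature every selectable implementation shares.
subroutine vec3 ShadeModel(vec3 n, vec3 l);

// Two hypothetical implementations of that type.
subroutine(ShadeModel) vec3 shadeLambert(vec3 n, vec3 l)
{
    return vec3(max(dot(n, l), 0.0));
}

subroutine(ShadeModel) vec3 shadeHalfLambert(vec3 n, vec3 l)
{
    float d = dot(n, l) * 0.5 + 0.5;
    return vec3(d * d);
}

// The application picks which implementation this refers to,
// without rebinding or recompiling the program.
subroutine uniform ShadeModel shade;

uniform vec3 lightDir;  // hypothetical light direction
in vec3 normal;         // hypothetical interpolated normal
out vec4 fragColor;

void main()
{
    // The call site is the same whichever implementation is selected.
    fragColor = vec4(shade(normalize(normal), normalize(lightDir)), 1.0);
}
```

Note that every alternative returns to the same call site in main(), which is what makes the “patched jump address” reading at least plausible.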
And this is versus the usual ubershader scheme, where each code path switches based on const bool/int/etc. values in the shader which are varied per ubershader variation, and the GLSL compiler “compiles out” the dead paths not used in that ubershader permutation:
const bool FEATURE_ON = false;  // Varies per ubershader permutation
if ( FEATURE_ON )  // Constant condition, so the compiler can drop the dead branch
{
    ...feature_implementation...
}
I too am curious how this compares not only to static branches, but to “no” branches (i.e. the traditional ubershader approach – that is, rebinding a different shader vs. presumably hot-tweaking some jump values in the currently bound shader). But alas, no GL4 card on my desk to play with just yet…