OpenGL 3 and GLSL

An interesting issue came up with regard to GL 3.0 and its shading language, GLSL.

The OpenGL 3.0 API is designed to, among other things, keep the user on the fast, hardware-supported path. Image format objects were added to let the user know, for example, whether a particular kind of texture is supported by the hardware. Vertex array objects allow the implementation to tell the user that unsigned shorts are not supported as vertex inputs. And so on.

It all sounds pretty solid. But there’s a gigantic hole in this fortress: GLSL.

GLSL was never really designed to adequately communicate to the user whether certain features are supported by the hardware. Because VAOs and IFOs are fairly simple constructs, a pass/fail mechanism is all that is necessary to test for certain features (though I expect an appropriate glError-like mechanism to be available too, for more in-depth testing). GLSL compilation, however, can fail for innumerable reasons.

The forums are littered with posts where a shader worked in hardware, but after a driver revision no longer did. Some of these issues are related to driver bugs, but some aren’t. Some are simply the vagaries of compiler design showing themselves.

GLSL has very few mechanisms for telling whether you are getting close to hardware-defined limits. You can ask for the number of uniforms and varyings, but that’s it. Instruction count, made much more nebulous by the C-style nature of the language, is indeterminate. And something even more nebulous, the number of temporary registers, cannot be queried at all.
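For reference, this is about the full extent of what GL 2.x lets you query up front. A minimal sketch (the enums are the standard GL 2.0 ones; depending on your platform headers you may need glext.h or an extension loader):

```c
#include <GL/gl.h>   /* plus <GL/glext.h> or a loader for the GL 2.0 enums */
#include <stdio.h>

/* Print the handful of GLSL-related limits GL 2.x exposes. There is no
 * query for instruction count or temporary registers, the limits most
 * likely to sink a fragment shader on older hardware. */
static void print_glsl_limits(void)
{
    GLint vert_uniforms = 0, frag_uniforms = 0, varyings = 0, tex_units = 0;

    glGetIntegerv(GL_MAX_VERTEX_UNIFORM_COMPONENTS,   &vert_uniforms);
    glGetIntegerv(GL_MAX_FRAGMENT_UNIFORM_COMPONENTS, &frag_uniforms);
    glGetIntegerv(GL_MAX_VARYING_FLOATS,              &varyings);
    glGetIntegerv(GL_MAX_TEXTURE_IMAGE_UNITS,         &tex_units);

    printf("vertex uniform components:   %d\n", vert_uniforms);
    printf("fragment uniform components: %d\n", frag_uniforms);
    printf("varying floats:              %d\n", varyings);
    printf("texture image units:         %d\n", tex_units);
}
```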

As such, it is impossible to know a priori whether a shader will compile. But that’s not the big problem. After all, VAOs and IFOs are the same way; you have to ask the implementation if it will work. The problem is that, with VAOs and IFOs, if they fail, you know why. More importantly, you know how to correct it.

The absolute worst case with a failing VAO is that you revert to floats for everything. The absolute worst case with a failing IFO is that you either can’t use the image at all or you revert to RGBA8, with power-of-two sizes.

Shaders are a much more nebulous beast. If a shader doesn’t compile to hardware, you can’t really tell why. Not in any algorithmic way, that is. Even if you had a human there, it wouldn’t be a guarantee of being able to make a shader work.

Furthermore, you can’t even guarantee that a worst-case fallback will compile. Obviously, you can be reasonably sure that it will work, but you can’t be positive about it.

Is there a solution to this? Ultimately, even a lower-level shader language wouldn’t be a full solution (since instructions can be expanded into multiple opcodes in an implementation-defined way, counting opcodes is unreliable). Is the ARB working on finding a way to alleviate this problem?

strstr the log for keywords… :wink:
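Only half joking, that amounts to something like the sketch below; the keyword being grepped for is pure guesswork, since nothing in the spec says what the info log contains.

```c
#include <GL/gl.h>   /* GL 2.0 entry points; use your extension loader if needed */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Compile the shader, then grep its info log for driver-specific hints.
 * The "software" keyword is a heuristic, not anything the spec promises. */
static int compile_and_grep_log(GLuint shader)
{
    GLint status = GL_FALSE, log_len = 0;

    glCompileShader(shader);
    glGetShaderiv(shader, GL_COMPILE_STATUS, &status);
    glGetShaderiv(shader, GL_INFO_LOG_LENGTH, &log_len);

    if (log_len > 1) {
        char *log = (char *)malloc((size_t)log_len);
        glGetShaderInfoLog(shader, log_len, NULL, log);
        if (strstr(log, "software") != NULL)
            fprintf(stderr, "driver hints at a software fallback:\n%s\n", log);
        else if (status != GL_TRUE)
            fprintf(stderr, "compile failed:\n%s\n", log);
        free(log);
    }
    return status == GL_TRUE;
}
```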

I think there are some reasonable fallbacks though, such as fixed-function-style features that you know the hardware can do. I agree that there is a problem with trying to figure out at run time why something didn’t compile; that would be a pain.

I do think, though, that offline compilation is a possible solution to this problem. Ship already-compiled shaders for a variety of hardware?
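There was no standard way to do that at the time of this thread. For what it’s worth, the closest thing OpenGL eventually grew is ARB_get_program_binary (core in GL 4.1), and even that hands back a driver-specific blob you cache per machine rather than a portable binary you could ship. A rough sketch, assuming a context that exposes those entry points:

```c
#include <GL/gl.h>   /* glGetProgramBinary et al. come from your loader */
#include <stdio.h>
#include <stdlib.h>

/* Save a linked program's driver-specific binary so the next run can skip
 * the GLSL compiler. Reload it later with glProgramBinary() and fall back
 * to compiling from source if the driver rejects the blob. */
static void save_program_binary(GLuint program, const char *path)
{
    GLint  len    = 0;
    GLenum format = 0;

    glGetProgramiv(program, GL_PROGRAM_BINARY_LENGTH, &len);
    if (len <= 0)
        return;

    void *blob = malloc((size_t)len);
    glGetProgramBinary(program, len, NULL, &format, blob);

    FILE *f = fopen(path, "wb");
    if (f != NULL) {
        fwrite(&format, sizeof format, 1, f);   /* format id is per-driver */
        fwrite(blob, 1, (size_t)len, f);
        fclose(f);
    }
    free(blob);
}
```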

Personally, I’m more scared about falling back to software rendering for whatever reason, without any way to detect it (other than measuring the performance).

If you try to compile a shader and it fails, at least you can try to compile a simpler version of it, with fewer instructions/temporaries.

Y.

Personally, I’m more scared about falling back to software rendering for whatever reason, without any way to detect it (other than measuring the performance).
I wouldn’t be concerned about it. Odds are, considering the general design of 3.0, if the hardware can’t run it, then the object simply isn’t created and an error results.

What approach would you like to see, Korval?
It seems to me that even if I can query the number of temporaries allowed, or create a proxy shader and query how many instructions it generated, that still doesn’t solve my problem. I’ve still got to have a fallback, and there will usually only be one fallback, so if that fails we’re stuffed regardless. The shader failing to build is enough for me: build the most advanced shader; if that fails, build a slightly less advanced shader; if that fails, build the last-resort shader; if that fails, throw up a message box.
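A minimal sketch of that build-or-degrade loop, assuming the fragment shader variants are kept in an array ordered from most advanced to last resort (the function name and its parameters are made up for illustration):

```c
#include <GL/gl.h>   /* GL 2.0 entry points via your extension loader */
#include <stddef.h>
#include <stdio.h>

/* Try each fragment shader variant in order and return the first program
 * that links; 0 means even the last resort failed and the caller should
 * put up the message box. */
static GLuint build_best_program(const char *vs_src,
                                 const char *const *fs_variants,
                                 size_t variant_count)
{
    for (size_t i = 0; i < variant_count; ++i) {
        GLuint vs = glCreateShader(GL_VERTEX_SHADER);
        GLuint fs = glCreateShader(GL_FRAGMENT_SHADER);
        GLuint prog = glCreateProgram();
        const char *fs_src = fs_variants[i];
        GLint linked = GL_FALSE;

        glShaderSource(vs, 1, &vs_src, NULL);
        glShaderSource(fs, 1, &fs_src, NULL);
        glCompileShader(vs);
        glCompileShader(fs);
        glAttachShader(prog, vs);
        glAttachShader(prog, fs);
        glLinkProgram(prog);                 /* fails if either compile failed */
        glGetProgramiv(prog, GL_LINK_STATUS, &linked);

        glDeleteShader(vs);                  /* program keeps them alive while attached */
        glDeleteShader(fs);

        if (linked == GL_TRUE)
            return prog;                     /* this variant works on this driver */

        glDeleteProgram(prog);
        fprintf(stderr, "shader variant %u failed, trying a simpler one\n",
                (unsigned)i);
    }
    return 0;
}
```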

Korval, you will see why the shader fails:

Longs Peak is going to include a debug-mode layer exactly like Direct3D’s… You will enable debug mode in a control panel, and OutputDebugString will put messages in the IDE debug window or into a debug_log.txt.

At run time, you can do what knack says (advanced shader -> basic shader -> message box)… But I always liked offline-compiled shaders plus a strict minimum set of capabilities. I don’t like run-time shader compilation at all; it just causes problems like this one.

About the software-mode fallback… I hate it. That’s all I can say. I would a hundred times prefer an error and an abort to entering a super-slow software rendering mode… so LP’s error approach is OK with me.

I don’t like the run-time compilation scheme that much either, but it’s not the worst thing in the world, in my opinion. This scheme makes it easier for the hardware vendors to optimize the compiled code for their hardware.

I’ll also comment on SW fallback by saying that it should at least be optional. Mandatory fallback on insufficient capabilities is a huge nuisance. But it is trackable (I haven’t written shaders that would have fallen back to SW in a while, but at least the early ATI beta drivers, which triggered it often, noted it in the info log, so by retrieving the log you could see when it was happening).

A simple “is this shader HW accelerated?” query, uniform across all driver/card combinations, would make a “Force HW shaders” option in graphics settings possible, so people who don’t have a fast enough CPU (pun intended) can tell their program to avoid CPU fallbacks. :stuck_out_tongue:

Apple’s approach to this (in current GL 2.0) has two points worth mentioning:

  • software fallback can be detected at any time with the CGLGetParameter API (see the sketch after this list). It will tell you if vertex or fragment processing fell back to software, although with no indication of why (it could be GLSL, or other factors like NPOT texturing, the drawable being too big, etc.)

  • software fallback can be disabled at context creation time with a NoRecovery pixel format attribute. This affects fragment fallback (drawing will no-op) but not vertex (e.g. the GMA 950 needs SW TCL no matter what.)
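For the first point, the query presumably looks something like the sketch below (my assumption of the relevant CGL context parameters; check CGLTypes.h on your SDK):

```c
#include <OpenGL/OpenGL.h>   /* CGL lives here on Mac OS X */
#include <stdio.h>

/* Ask CGL whether the current context is still doing vertex and fragment
 * work on the GPU; a zero means that stage has fallen back to software. */
static void report_software_fallback(void)
{
    CGLContextObj ctx = CGLGetCurrentContext();
    GLint gpu_vertex = 0, gpu_fragment = 0;

    CGLGetParameter(ctx, kCGLCPGPUVertexProcessing,   &gpu_vertex);
    CGLGetParameter(ctx, kCGLCPGPUFragmentProcessing, &gpu_fragment);

    if (!gpu_vertex)
        fprintf(stderr, "vertex processing fell back to software\n");
    if (!gpu_fragment)
        fprintf(stderr, "fragment processing fell back to software\n");
}
```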

What would you do to improve on this situation?
If you could have any API you want, to figure out why you fell back to software, what would it look like?

I think most users would want to avoid the overhead of needing to constantly query about that (or other GL errors, for that matter).

Which suggests to me that the “register a handler function” approach would be cleanest.
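Nothing like that exists in GL today, but purely as an illustration of the idea, the shape such a handler API eventually took years later (KHR_debug, core in GL 4.3) is roughly this; treat it as a sketch, not something a GL 2.x/3.0 program can use:

```c
#include <glad/glad.h>   /* any loader that exposes GL 4.3 / KHR_debug */
#include <stdio.h>

/* The driver calls this whenever it has something to say: errors,
 * performance warnings, messages about software paths, etc. */
static void APIENTRY on_gl_debug(GLenum source, GLenum type, GLuint id,
                                 GLenum severity, GLsizei length,
                                 const GLchar *message, const void *user)
{
    (void)source; (void)type; (void)id;
    (void)severity; (void)length; (void)user;
    fprintf(stderr, "GL says: %s\n", message);
}

/* Register once, right after creating the context and loading entry points. */
static void install_debug_handler(void)
{
    glEnable(GL_DEBUG_OUTPUT);
    glDebugMessageCallback(on_gl_debug, NULL);
}
```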

“overhead of needing to constantly query about that” <- I fail to see any need to query - one can just ignore it “safely”?

Btw, I have to note that the IHVs’ unwritten rule of exposing only HW-accelerated extensions does 95% of the “avoiding SW fallback” thing auto-magically. The last 5% keeps us users detecting ATI / NV / others (yuck) and using the most suitable code paths. For my current use case, GLSL shader SW-fallback coverage would not do the trick completely :frowning:

One can, but one would prefer not having to choose between frequent queries and complete ignorance.

If we’re talking strictly about shaders, a single query after creation could be enough, if that’s at all possible to implement with the “compile at the last moment” shader caching going on?

My main concern is silly compiler bugs.

Having multiple fallbacks for a complicated shader is par for the course. My concern is that even a fairly minor shader might fail to compile if a driver gets an unpleasant revision.

In GL 2.1, it’s easy for driver developers to test the glslang compiler paths their fixed-function emulation relies on; they just run any pre-shader code. It’s hard to ship a driver that breaks every pre-shader application.

The theoretical problem is this: once there is no pre-shader code anymore, will they still test small-shader code as effectively? I’m more concerned about fragment shaders, particularly on R300-class hardware, which has some fairly strict limits. I’d hate to see basic DOT3 bump-mapping break on those cards due to a driver bug.