Rendering without Data: Bugs for AMD and NVIDIA

I had discovered the problem while writing a custom parser and code generator for a Cg-like syntax that hides the design of GLSL. Consequently, I was generating multiple GLSL sources from a single original source.

And of course you are right - given the design of the shader compilation model, it would be hard to come up with a reasonable solution to this. In my case it was logical because it came from a well-ordered source, but in general it is not, so this is probably not a good case.

OK, maybe it's just me, but I see the design as unnecessarily alien, generating problems that could have been avoided.

I'm having some trouble imagining these circumstances. Could you describe a place in the spec where you could look at the API and get a really wrong idea about what is legal and what is not?

Nowhere did I say that the spec is poorly written with regard to exactness, just that the undefined behaviors (and implementation-specific behavior is the same to me) make it a harder API to live with when there's also no error detection. In this regard the design of D3D is indeed better, IMO, with the independent layer that sits between the developer and the vendor code. Not that it solves the problem per se, but together with specs designed to support the development process from the start, it makes it possible to write general error-detection code. Sure, if it's left to the vendors to implement, nobody would. I'm just trying to find how it could be done better, not how it could not be done.
So what do you suggest? Do you see it as a problem at all?
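
To illustrate the kind of hook I mean: the closest thing GL itself offers to such an error-reporting layer is the ARB_debug_output extension, where the driver reports errors and warnings through a callback instead of failing silently. A minimal sketch, assuming GLEW as the loader and a context that actually exposes the extension:

```c
/* Minimal sketch, assuming GLEW and a context exposing
 * GL_ARB_debug_output: the driver calls us back with its own
 * error/warning text instead of failing silently. */
#include <stdio.h>
#include <GL/glew.h>

static void APIENTRY debug_cb(GLenum source, GLenum type, GLuint id,
                              GLenum severity, GLsizei length,
                              const GLchar *message, GLvoid *userParam)
{
    (void)source; (void)type; (void)id;
    (void)severity; (void)length; (void)userParam;
    fprintf(stderr, "GL says: %s\n", message);
}

void install_debug_hook(void)
{
    if (GLEW_ARB_debug_output) {
        /* Synchronous mode: the callback runs inside the offending
         * GL call, so a breakpoint in debug_cb lands on the culprit. */
        glEnable(GL_DEBUG_OUTPUT_SYNCHRONOUS_ARB);
        /* The cast smooths over const differences between loader
         * versions' GLDEBUGPROCARB typedefs. */
        glDebugMessageCallbackARB((GLDEBUGPROCARB)debug_cb, NULL);
    }
}
```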

If something can be caught via a debugger, that's a lot more convenient than going through the code step by step while consulting the specs in parallel.
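
For instance, a small wrapper can make any pending GL error assert right at the offending call, so the debugger stops exactly there. GL_CHECK is a hypothetical name of my own, not part of GL:

```c
/* Illustrative helper: wrap suspect GL calls so any pending error
 * trips an assert where it happened, which a debugger will catch. */
#include <assert.h>
#include <stdio.h>
#include <GL/glew.h>

#define GL_CHECK(call)                                           \
    do {                                                         \
        call;                                                    \
        GLenum err_ = glGetError();                              \
        if (err_ != GL_NO_ERROR) {                               \
            fprintf(stderr, "%s failed: 0x%04X (%s:%d)\n",       \
                    #call, err_, __FILE__, __LINE__);            \
            assert(0 && "GL error");                             \
        }                                                        \
    } while (0)

/* Usage: GL_CHECK(glBindBuffer(GL_ARRAY_BUFFER, vbo)); */
```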

Nowhere did I say that the spec is poorly written with regard to exactness, just that the undefined behaviors (and implementation-specific behavior is the same to me) make it a harder API to live with when there's also no error detection.

I was asking if there was a part of the OpenGL API where it looks like the API says "X is possible" but the spec says "X is undefined/implementation-dependent." That would suggest the kind of miscommunication you're talking about.

Do you see it as a problem at all?

Do I see which as a problem? You seem to be talking about two entirely separate issues. One issue is driver quality, which is what the D3D model helps improve. As the topic of this thread indicates (as well as the need for this sub-forum on drivers), this is clearly a problem.

The other issue you talk about is with too much “undefined behavior”, which you seem to suggest is lurking all over the place, waiting to swallow hapless OpenGL programmers.

In order for that to be true, the specification and the API must be at odds: there would have to be a lot of pitfalls in the API that don't raise glGetError errors but can lead one into "undefined behavior" that varies from implementation to implementation. The user mistakenly assumes that this behavior is compliant and comes to rely on it.

You have yet to show such a case, even though the topic of the thread provides one for the compatibility spec. Namely, that nothing in the array rendering API suggests that you need to actually bind anything to render, but the compatibility spec says otherwise. The special treatment of attribute 0 is another example.

But since the core spec (the required one) gets this right, it is at best a half-example. Even less, considering that a compliant compatibility implementation is not supposed to actually render anything if you try it; it should give a GL_INVALID_OPERATION error. So the only way you could come to rely on it is by relying not on "undefined behavior" but on non-conforming behavior. The spec is not too loose or too "undefined"; there's just a bad OpenGL implementation out there.
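
To make the pitfall concrete, here is roughly what it looks like in code. This is a sketch; the error check simply mirrors what the compatibility spec requires per the above:

```c
/* A sketch of the pitfall: nothing about the draw call below hints
 * that an enabled attribute array is required. Per the compatibility
 * spec (as described above), a conforming implementation must refuse
 * this with GL_INVALID_OPERATION instead of rendering. */
#include <stdio.h>
#include <GL/glew.h>

void draw_with_nothing_bound(void)
{
    /* Deliberately no glEnableVertexAttribArray, no buffer, no
     * attribute 0 setup; a program is assumed to be current. */
    glDrawArrays(GL_TRIANGLES, 0, 3);

    GLenum err = glGetError();
    if (err == GL_INVALID_OPERATION)
        printf("conforming: the draw was rejected\n");
    else if (err == GL_NO_ERROR)
        printf("non-conforming: the driver let it through\n");
}
```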

The problem is clearly on NVIDIA, not on the specification or the API.

To me, the biggest problems in OpenGL are, in order:

1: Driver quality. I’m looking at you, AMD. Sampler Objects are not a river in Egypt.

2: Driver quality. Yes, I’m looking at you too, NVIDIA, Mr. “I’ll let anything through.”

3: Driver quality. And that means you, Intel, Mr. “I don’t know how to write code!”

4: Specification quality. coughseparate_program_objectscough. And some of the other recent specs, like shader subroutines, have been very oddly specified. Unlike in the past, when features would often get a dry run at the EXT or ARB level before being promoted to core (geometry shaders in core are much better than even the ARB version), features are now born directly in core. Which is fine when the spec doesn't have problems, but when it does…

5: API quality. Seriously, have you looked at the texturing API lately? It's like a madhouse in there. If there were one place I could take a sledgehammer to, it would be every non-FBO function that deals with creating, managing, and binding textures. The combination of explicit attribute locations, shader inclusion (when AMD gets around to implementing it), and separate programs (the functionality, not the spec, which is equal parts buggy and stupid) goes a long way toward saving the shader specification part of the API.
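
For illustration, sampler objects (the feature from #1 above, and part of the texturing cleanup #5 asks for) are exactly this kind of fix: sampling state lives in its own object instead of inside the texture. A minimal sketch, assuming a GL 3.3 context and a loader like GLEW:

```c
/* Sketch: sampler objects (GL 3.3 / ARB_sampler_objects) separate
 * sampling state from texture storage. */
#include <GL/glew.h>

GLuint make_linear_clamp_sampler(void)
{
    GLuint s;
    glGenSamplers(1, &s);
    glSamplerParameteri(s, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR);
    glSamplerParameteri(s, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glSamplerParameteri(s, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glSamplerParameteri(s, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    return s;
}

/* At draw time, glBindSampler(unit, s) overrides whatever sampling
 * parameters the texture bound to that unit carries. */
```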

It's quite possible that most of these things come down to driver quality, as you say. I was speaking from my experience: everything seemed to work normally on NVIDIA, as I would expect (the problem is, I didn't study the specs in detail when things worked right away, lazy me), and it later failed silently on ATI. Now what do you do when no error is reported anywhere, and everything works as expected on one card but you get a black screen or an utter mess on another? Some of these were plain driver bugs; I went and reported them after a painful process of isolating them. But some were things the spec may well define as errors, yet nobody ever catches them.

For example, if shaders don't link right with their attributes, I get no errors anywhere. The program exhibits strange behavior, but oftentimes it somehow magically works on NVIDIA, which tolerates many errors, sometimes even mismatched types and names. Nowadays the generator I'm using verifies everything by itself and produces extra errors and warnings that I would not get otherwise, saving me a lot of trouble.

The spec may be exact, but I'm asking whether it wouldn't be a good idea for it to impose some quality control on the drivers, with the spec itself going the extra mile to help with the driver-quality issues you listed. Who else would?

if shaders don't link right with their attributes, I get no errors anywhere.

What do you mean by that? Attributes are vertex shader inputs; they don’t link to anything.

The spec may be exact, but I'm asking whether it wouldn't be a good idea for it to impose some quality control on the drivers, with the spec itself going the extra mile to help with the driver-quality issues you listed.

What do you want the spec to do? It already says, “names between vertex and fragment shaders must match, and an error is given when they don’t.” If an implementation has a bug and doesn’t handle this correctly, what can the spec do about it? Should the ARB write “pretty please, with sugar on top” into the spec? :wink:
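
To be clear about where that error actually surfaces: it shows up in the link status and the program info log, not via glGetError. A minimal sketch, assuming a loader provides the GL 2.0+ entry points:

```c
/* Minimal sketch: a failed link is reported through GL_LINK_STATUS
 * and the program info log. If a driver wrongly links a mismatched
 * pair anyway, this reports success and the bug stays silent --
 * which is exactly the complaint being discussed. */
#include <stdio.h>
#include <GL/glew.h>

int link_succeeded(GLuint program)
{
    GLint ok = GL_FALSE;
    glGetProgramiv(program, GL_LINK_STATUS, &ok);
    if (!ok) {
        char log[4096];
        GLsizei len = 0;
        glGetProgramInfoLog(program, sizeof(log), &len, log);
        fprintf(stderr, "link failed: %.*s\n", (int)len, log);
    }
    return ok == GL_TRUE;
}
```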

Remember: a specification is ultimately just a sheet of paper (not even that; just a PDF file). It cannot actually do anything. It cannot force compliance.

That’s why the ARB is re-instituting a conformance testing suite. They plan to have it ready by SIGGRAPH, hopefully up to at least GL 3.3 core.

Maybe by then, AMD will have implemented sampler objects…

Sorry, I meant the names/types of varyings between the shaders.
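
Something like this pair is what bites me: the names and types don't match between the stages, so a strict linker must reject it, but a permissive one may let it through and render garbage. A sketch, assuming GLSL 3.30 syntax:

```c
/* Sketch of the mismatch: the vertex shader writes "color" as vec3,
 * the fragment shader reads "colour" as vec4. A conforming linker
 * must fail this pair; a permissive one may "magically" accept it. */
static const char *vs_src =
    "#version 330 core\n"
    "layout(location = 0) in vec4 position;\n"
    "out vec3 color;\n"
    "void main() {\n"
    "    color = vec3(1.0);\n"
    "    gl_Position = position;\n"
    "}\n";

static const char *fs_src =
    "#version 330 core\n"
    "in vec4 colour;\n"
    "out vec4 frag_color;\n"
    "void main() { frag_color = colour; }\n";
```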

What do you want the spec to do? It already says, “names between vertex and fragment shaders must match, and an error is given when they don’t.” If an implementation has a bug and doesn’t handle this correctly, what can the spec do about it? Should the ARB write “pretty please, with sugar on top” into the spec? :wink:

I don't know, actually. Maybe I was just wishing for a better design of the API, with an intermediate layer built by an independent body that would be able to guarantee something.
