Official feedback on OpenGL 4.4 thread

Which functionality exactly?

This on its own is cause for joy. Any chance of mandatory full certification being brought back to earlier versions as time goes by and drivers mature?

No. Khronos can’t require certification for specifications that have already gone through ratification and have shipping implementations. Mandatory certification starts with GL 4.4 implementations. But some vendors who are shipping earlier versions are likely to choose to make conformance submissions on those drivers. All the major desktop IHVs (and one IHV who isn’t on desktop at the moment) have been actively contributing to the conformance tests and running them against their drivers during the test development process.

The conformance tests are not going to solve every behavior difference or address every bug - that level of testing is far out of scope relative to the resources Khronos can put into developing tests - but they should be a significant improvement over SGI’s OpenGL CTS, which was last updated about 14 years ago.

You don’t get it, do you? Aside from there being a principle to adhere to here (yeah I know, stupid principle, duh), Mesa is an important entity to pretty much any open-source driver on Linux, as it is an OpenGL reference implementation. But the project is open-source, maintained by various people from various companies and independent developers, and not by a single, loaded corporation. One might consider that a problem.

Couldn’t agree more…

So, what is to be expected? Just out of interest.

Such things as image load/store, SSBOs, atomic operations and texture views.
30% of the question is about the actual things you do with this functionality in real projects (except for texture views; that one’s kinda obvious), and 70% is about what kind of projects they are and why you decided to invest your time implementing functionality using those extensions.
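
For reference, the “obvious” texture-view case I mean looks roughly like this; a minimal sketch against the GL 4.3 / ARB_texture_view entry point, with placeholder formats and sizes, assuming a context and function loader are already set up:

```c
/* Minimal sketch: reinterpret level 0 of an RGBA8 texture as sRGB through a
   texture view. Views require the original texture to have immutable storage. */
GLuint tex, view;
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
glTexStorage2D(GL_TEXTURE_2D, 4, GL_RGBA8, 256, 256);   /* immutable storage */

glGenTextures(1, &view);                 /* the view name must be unused */
glTextureView(view, GL_TEXTURE_2D, tex, GL_SRGB8_ALPHA8,
              0, 1,                      /* minlevel, numlevels */
              0, 1);                     /* minlayer, numlayers */
```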

So, what is to be expected? Just out of interest.

The GL tests share a codebase with the OpenGL ES 2/3 tests, with a bunch of stuff added for GL-specific features. So there are API coverage and functional tests, in some cases quite comprehensive and in other cases not, and a bunch of shading language tests. There is work underway to integrate a more comprehensive set of GLSL tests contributed by drawElements. We’re still light on tests for the most recently added GL features, but making progress.

That’s reasonable and understandable (particularly in light of vendors who may have 3.x hardware that they no longer ship drivers for), if regrettable. The fear is that some vendors may freeze their implementations at a pre-4.4 level in order to avoid the certification requirement, but given your last sentence the picture seems a lot more positive.

I don’t think many people expect every behaviour difference or bug to be fixed (even D3D with WHQL can’t do that); it’s a good thing to push them in the direction of more consistent and predictable behaviour though (and also good to hear that they’re all on-board with this). I think everyone wants to avoid another Rage, and this is a solid step in the right direction.

Khronos is committed to the creation of royalty-free specifications for use by the entire industry. It is our stated mission, and our actions throughout our history demonstrate that commitment.

We achieve that goal in a number of reasoned legal steps that also protect the IP of the Khronos membership. If we are not careful to create a structure that protects members’ IP, as well as the use of the specification in the industry, many of the members would not be able to participate in the creation of these standards for the good of the industry.

The wording in the Khronos IP framework grants a reciprocal royalty-free license to other members. The goal is not to exclude non-members; rather, it is not acceptable to grant a valuable IP license to an unknown entity or entities (e.g. ‘the whole world’) that do not explicitly agree to reciprocal terms. So, the Khronos IP framework establishes the largest ‘raft’ of written reciprocal contractual obligations possible - i.e. between the entire Khronos membership.

Behind this is the stated commitment that anyone can implement a Khronos spec royalty free. In practice this means that if a non-member is tacitly following the terms of the written reciprocal agreement between the members, i.e. not suing Khronos members over the use of a Khronos specification, then Khronos welcomes their using the specification. Now, this stated commitment is not a written contract, but if a non-member requires a written contract between itself and the entire Khronos membership for implementing any Khronos specification, it just has to join Khronos. As Khronos membership is guaranteed (by our bylaws) to be open to any company that wishes to join, any implementer may gain access to a written reciprocal license for the cost of a Khronos membership - $10K.

For companies implementing a complete specification, $10K is very inexpensive (and we do need membership fees to keep the lights on). But the good point was made that open source communities cannot afford a Khronos membership. To address this, Khronos has a proud history of waiving membership fees to open source practitioners who are undertaking bona fide efforts to construct open source implementations of Khronos specifications. This enables them to enjoy the same protection as other Khronos members for free.

Finally, a comment was made that a member possessing a patent on a Khronos specification was a bad thing. The reverse is true. Under the Khronos IP Framework, all members with patents that are essential to a ratified Khronos specification reciprocally license those patents royalty-free. Importantly, the more patents Khronos members possess that are reciprocally licensed, the larger and stronger the patent ‘raft’ that protects implementers of the specification against non-members asserting patents against the spec. Patents that are licensed to you for your protection are a very good thing.

I hope this helps explain the Khronos IP Framework.

Neil Trevett
Vice President Mobile Content, NVIDIA | President, Khronos Group
ntrevett@nvidia.com

@Jon: first of all, that’s definitely a very good start. I wonder, would it be possible to open up the test descriptions and let the community participate? I guess we’ve got enough manpower around here to at least fill some of the gaps with suggestions for viable tests. In general, an open conformance test suite would be awesome - one would probably need one to n people signing off on contributions.

I’ve read the ARB_sparse_texture spec, and I noticed that the AMD_sparse_texture spec has functions to fetch from a sparse texture and return information about whether any texture data is present. Is there something similar to this in ARB_sparse_texture?
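
For reference, the part I do see on the C side of ARB_sparse_texture is page commitment; a rough sketch of how I read it (the region size below is a placeholder - real code should use the page sizes queried via glGetInternalformativ with GL_VIRTUAL_PAGE_SIZE_X_ARB and friends):

```c
/* Rough sketch: allocate a sparse texture, then commit physical storage for
   one region of level 0 (ARB_sparse_texture). Error handling omitted. */
GLuint tex;
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_SPARSE_ARB, GL_TRUE);  /* set before storage */
glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA8, 16384, 16384);

/* Commit a 512x512 region at the origin; GL_FALSE would decommit it again.
   512x512 is a placeholder - align to the queried virtual page size. */
glTexPageCommitmentARB(GL_TEXTURE_2D, 0,
                       0, 0, 0,          /* xoffset, yoffset, zoffset */
                       512, 512, 1,      /* width, height, depth */
                       GL_TRUE);
```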

@neilt: Thanks for the extensive answer. Good to hear from the Prez. :wink:

One thing, however, sounds very vague:

How does Khronos determine that some independent, non-member entity is trustworthy enough? Plus, I assume you mean not only trustworthy but also promising, in the sense that it has to be a potentially successful endeavor?

Khronos will offer certification of drivers from version 3.3, and full certification is mandatory for OpenGL 4.4 and onwards. This will help reduce differences between multiple vendors’ OpenGL drivers, resulting in enhanced portability for developers.

This is fantastic news! I’m more excited about this than any of the new core 4.4 features or extensions. Dealing with broken features debuting in drivers, spec interpretation differences, and driver regressions is a particularly unpleasant part of cross-platform OpenGL development. Working around these issues drains resources from otherwise ‘useful’ development. While I don’t expect the situation to magically improve overnight, nor drivers to become perfect, this is a good start.

Is there a process for submitting conformance tests to be reviewed by the ARB? Or is this limited to ARB members?

[QUOTE=mhagain;1252907]Issue #9 for GL_ARB_buffer_storage makes for fairly grim reading, unfortunately… :frowning:

It’s a pity as this could have been the kick up the jacksie that GL’s buffer object API really needed, and the issue in question should really have been resolved by just saying “this is client memory, full stop, using incompatible flags generates an error, here are the flags that are incompatible and the vendors will have to just live with it”, but it seems another case of shooting too high and missing the basic requirement as a result.[/QUOTE]

OK, so… what is the “basic requirement?”

That’s what I don’t understand about this whole issue. What exactly would you like “CLIENT_STORAGE_BIT” to mean that is in any way binding? You say that certain flags would be incompatible. OK… which ones? And why would they be incompatible?

If client/server memory is a significant issue for some hardware, then that would mean something more than just “incompatible bits”. If client memory exists, then why would the driver be unable to “map” it? Why would it be unable to map it for reading or writing? Or to allow it to be used while mapped or to make it coherent?

The only limitations I could think of for such memory would be functional. Not merely accessing it, but uses of it. Like an implementation that couldn’t use a client buffer for transform feedback storage or image load/store. It’s not the access pattern that is the problem in those cases; it’s the inability to allow them to be used as buffers in certain cases.

So the ARB could have specified that client buffer objects couldn’t be used for some things. That list would be the union of the limitations of all the IHVs who implement it, which would exclude any new IHVs or new hardware that comes along. They could provide some queries so that implementations could disallow certain uses of client buffers.

But is that really something we want to encourage?

BTW, if you want to trace the etymology of CLIENT_STORAGE_BIT, it was apparently not in the original draft from January. According to the revision history (seriously ARB, use Git or something so that we can really see the revisions, not just a log. That’s what version control is for), the ancestor of CLIENT_STORAGE_BIT was BUFFER_STORAGE_SERVER_BIT (i.e. the reverse of the current meaning), which was added two months ago.

Also, from reading the issue it sounds very much like they didn’t really want to add it, but had to. Granted, since “they” are in charge of the extension, I have no idea why they would be forced to add something they didn’t want.

But as I said before, you can just ignore the bit and the extension is fine.
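
In other words, something like this, never mentioning CLIENT_STORAGE_BIT at all (a minimal sketch; the flag combination is just one reasonable choice, and a 4.4 context plus function loader are assumed):

```c
/* Sketch: immutable buffer storage (GL 4.4 / ARB_buffer_storage) with
   CLIENT_STORAGE_BIT simply left out. The remaining flags are the
   API-enforced contract: any access not requested here is an error later. */
GLuint buf;
glGenBuffers(1, &buf);
glBindBuffer(GL_ARRAY_BUFFER, buf);
glBufferStorage(GL_ARRAY_BUFFER, 64 * 1024, NULL,
                GL_DYNAMIC_STORAGE_BIT |   /* glBufferSubData allowed   */
                GL_MAP_WRITE_BIT);         /* mapping for write allowed */
```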

How does Khronos determine that some independent, non-member entity is trustworthy enough?

Generally speaking, a member company recommends them, the affected working group talks about it and makes a recommendation to the Board of Promoters, and the BoP discusses and votes on the recommendation. Which is pretty much the way most things are decided in Khronos.

use Git or something so that we can really see the revisions, not just a log. That’s what version control is for

The extension specifications are in a public part of Khronos’ Subversion tree and you can see the history of public updates after a spec has been ratified. We’re not going to publish the entire history of a spec through its internal development, though.

Also, from reading the issue it sounds very much like they didn’t really want to add it, but had to. Granted, since “they” are in charge of the extension, I have no idea why they would be forced to add something they didn’t want.

Well, that’s precisely what the problem is. It’s nothing specifically to do with CLIENT_STORAGE_BIT itself; it could have been about anything. It’s the introduction of more vague, woolly behaviour and more driver shenanigans, via another “one of those silly hint things”.

What’s grim about issue #9 is the prediction that the extension will make no difference, irrespective of whether or not the bit is used:

In practice, applications will still get it wrong (like setting it all the time or never setting it at all, for example), implementations will still have to second guess applications and end up full of heuristics to figure out where to put data and gobs of code to move things around based on what applications do, and eventually it’ll make no difference whether applications set it or not.

It seems to me that if behaviour can’t be specified precisely, then it’s better off not being specified at all. I’ve no particular desire for CLIENT_STORAGE_BIT to mean that the buffer storage is allocated in client memory; that’s irrelevant. I have a desire for specified functionality to mean something specific, and to put an end to the merry-go-round of “well it doesn’t matter what hints you set, the driver’s just going to do its own thing anyway”. If that’s going to be the way things are, then why even have usage bits at all? That’s not specification, that’s throwing chicken bones in the air.

What’s grim about issue #9 is the prediction that the extension will make no difference, irrespective of whether or not the bit is used:

That section said “set it”, referring to the bit. Not to all of the flags, just CLIENT_STORAGE_BIT.

I have a desire for specified functionality to mean something specific, and put an end to the merry-go-round of “well it doesn’t matter what hints you set, the driver’s just going to do it’s own thing anyway”.

Ultimately, drivers are going to have to pick where these buffers go. The point of this extension is to allow the user to provide sufficient information for drivers to know how the user is going to use that buffer. And, unlike the hints, these represent binding contracts that the user cannot violate.

Drivers are always “going to do their own thing anyway.” A driver could stick them all in GPU memory, or all of them in client memory, or whatever, and still be functional. But by allowing the user to specify access patterns up front, and then enforcing those access patterns, the driver has sufficient information to decide up front where to put them.

The only way to get rid of any driver heuristics is to just name memory pools and tell the user to pick one. And that’s just not going to happen. OpenGL is not D3D, and buffer objects will never work that way. OpenGL must be more flexible than that.

It also said “or not”. So take the situation where you don’t set CLIENT_STORAGE_BIT and explain how that text doesn’t apply.

[QUOTE=Alfonse Reinheart;1253023]Ultimately, drivers are going to have to pick where these buffers go. The point of this extension is to allow the user to provide sufficient information for drivers to know how the user is going to use that buffer. And, unlike the hints, these represent binding contracts that the user cannot violate.

Drivers are always “going to do their own thing anyway.” A driver could stick them all in GPU memory, or all of them in client memory, or whatever, and still be functional. But by allowing the user to specify access patterns up front, and then enforcing those access patterns, the driver has sufficient information to decide up front where to put them.

The only way to get rid of any driver heuristics is to just name memory pools and tell the user to pick one. And that’s just not going to happen. OpenGL is not D3D, and buffer objects will never work that way. OpenGL must be more flexible than that.[/QUOTE]

Again, I’m not talking about CLIENT_STORAGE_BIT specifically, I’m talking about specification vagueness and woolliness in general. You say that “OpenGL is not D3D”, yet D3D 10+ (which by the way doesn’t have memory pools; it has usage indicators just like ARB_buffer_storage) has no problem whatsoever specifying explicit behaviour while working on a wide range of hardware. This isn’t theory, this is something that’s already out there and proven to work, and “OpenGL must be more flexible than that” just doesn’t cut it as an excuse.

Referring specifically to CLIENT_STORAGE_BIT now, go back and read the stated intention of this extension:

If an implementation is aware of a buffer’s immutability, it may be able to make certain assumptions or apply particular optimizations in order to increase performance or reliability. Furthermore, this extension allows applications to pass additional information about a requested allocation to the implementation which it may use to select memory heaps, caching behavior or allocation strategies.

Now go back and read issue #9:

In practice, applications will still get it wrong (like setting it all the time or never setting it at all, for example), implementations will still have to second guess applications and end up full of heuristics to figure out where to put data and gobs of code to move things around based on what applications do, and eventually it’ll make no difference whether applications set it or not.

Realise that it’s being predicted to not make the blindest bit of difference even if applications don’t set CLIENT_STORAGE_BIT.

This extension would have been great if CLIENT_STORAGE_BIT was more strictly specified.
This extension would have been great if CLIENT_STORAGE_BIT was not specified at all.

Right now the best case is that implementations will just ignore CLIENT_STORAGE_BIT and act as if it never even existed. MAP_READ_BIT | MAP_WRITE_BIT seem enough to clue the driver in on what you want to do with the buffer. The worst case is that we’ve got an exciting new way of specifying buffers that does nothing to resolve a major problem with the old way.

Realise that it’s being predicted to not make the blindest bit of difference even if applications don’t set CLIENT_STORAGE_BIT.

You’re really blowing this way out of proportion.

The mere existence of the bit changes nothing about how the implementation will handle implementing the rest, because it changes nothing about any of the other behavior that is specified. If you say that you won’t upload to the buffer by not making it DYNAMIC, you cannot upload to it. If you don’t say that you will map it for writing, you can’t. If you don’t say that you will map the buffer while it is in use, you can’t.

All of that information still exists, is reliable, and is based on an API-enforced contract. Therefore, implementations can still make accurate decisions based on it.
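
Concretely, as I read the spec, the access flags you pass to glMapBufferRange have to be a subset of the storage flags, so the enforcement looks something like this (a sketch with placeholder sizes):

```c
GLuint buf;
glGenBuffers(1, &buf);
glBindBuffer(GL_ARRAY_BUFFER, buf);

/* Storage whose contract only allows mapping for read... */
glBufferStorage(GL_ARRAY_BUFFER, 64 * 1024, NULL, GL_MAP_READ_BIT);

/* ...so asking for a write mapping is rejected up front: this returns NULL
   and generates GL_INVALID_OPERATION, rather than being silently honoured. */
void *ptr = glMapBufferRange(GL_ARRAY_BUFFER, 0, 64 * 1024, GL_MAP_WRITE_BIT);
```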

Worst case is that we’ve an exciting new way of specifying buffers that does nothing to resolve a major problem with the old way.

Um, how?

The fundamental problem with the current method is that the hints you provide are not guaranteed usage patterns. The API can’t stop you from using them the wrong way, nor can the documentation explain the right access pattern for the hints. Therefore, those hints will frequently be misused. Since they are misused, driver developers cannot rely upon them to be accurate. So driver developers are forced to completely ignore them and simply watch how you use the buffer, shuffling it around until they figure out a place for it.

With the exception of CLIENT_STORAGE_BIT, all of the hints are enforced by the API. You cannot use them wrong. Therefore they represent real, actionable information about how you intend to use the buffer. Information that driver developers can use when wanting to allocate the storage for it.
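
Side by side, the difference looks roughly like this (a sketch, showing two alternative ways to create a buffer, with `size` and `vertices` standing in for the application’s data):

```c
/* Old way: the usage token is only a hint. Nothing stops an application from
   rewriting a "static" buffer every frame, so drivers learn to ignore the hint
   and watch what the application actually does. */
glBufferData(GL_ARRAY_BUFFER, size, vertices, GL_STATIC_DRAW);

/* New way (ARB_buffer_storage): flags of 0 mean no glBufferSubData, no mapping,
   contents fixed at creation - and the API enforces it, so the driver can rely
   on it when choosing where to place the storage. */
glBufferStorage(GL_ARRAY_BUFFER, size, vertices, 0);
```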

The mere existence of CLIENT_STORAGE_BIT changes nothing at all about how useful the other bits are. The discussion in Issue 9 is specifically about those cases where the other usage bits alone cannot decide between different memory stores.

And, as far as the DX10 comparisons go, I checked the DX10 API. The only functional difference between the two is that CLIENT_STORAGE_BIT exists in GL (that, and the GL version gives you more options, such as using the GPU to update non-dynamic buffers). So why should I believe that the mere existence of an option suddenly turns an API that is functionally equivalent to DX10 into the wild west of current OpenGL buffer objects?

Or let me put it another way. If DX10’s usage and access flags are sufficient information to place buffer objects in memory, why is the current set of bits provided by this extension not equally sufficient for this task? And if those bits are not sufficient, then there must already exist “heuristics to figure out where to put data and gobs of code to move things around based on what applications do” in D3D implementations, so why would that code not apply equally well to OpenGL implementations?

I think you’re really blowing that issue way out of proportion.

Is it possible for implementations to just ignore all of these bits (outside of enforcing the contract) and rely entirely on heuristics? Absolutely. But the information is there and it is reliable. So why would they? Just because there’s one bit that may not be reliable?