Official feedback on OpenGL 4.4 thread

#1

July 22nd 2013 – SIGGRAPH - Anaheim, CA – The Khronos™ Group today announced the immediate release of the OpenGL® 4.4 specification, bringing the very latest graphics functionality to the most advanced and widely adopted cross-platform 2D and 3D graphics API (application programming interface). OpenGL 4.4 unlocks capabilities of today’s leading-edge graphics hardware while maintaining full backwards compatibility, enabling applications to incrementally use new features while portably accessing state-of-the-art graphics processing units (GPUs) across diverse operating systems and platforms. Also, OpenGL 4.4 defines new functionality to streamline the porting of applications and titles from other platforms and APIs. The full specification and reference materials are available for immediate download at http://www.opengl.org/registry.

In addition to the OpenGL 4.4 specification, the OpenGL ARB (Architecture Review Board) Working Group at Khronos has created the first set of formal OpenGL conformance tests since OpenGL 2.0. Khronos will offer certification of drivers from version 3.3, and full certification is mandatory for OpenGL 4.4 and onwards. This will help reduce differences between multiple vendors’ OpenGL drivers, resulting in enhanced portability for developers.

New functionality in the OpenGL 4.4 specification includes:

Buffer Placement Control (GL_ARB_buffer_storage)
Significantly enhances memory flexibility and efficiency through explicit control over the position of buffers in the graphics and system memory, together with cache behavior control - including the ability of the CPU to map a buffer for direct use by a GPU.
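As a rough sketch of how this might look in code (assuming a GL 4.4 context and a function loader; the flag combination shown is just one plausible use, not the only one):

```c
/* Allocate immutable storage and keep it mapped for the buffer's lifetime,
 * so the CPU can write data the GPU reads without repeated map/unmap calls. */
GLuint buf;
glGenBuffers(1, &buf);
glBindBuffer(GL_ARRAY_BUFFER, buf);
glBufferStorage(GL_ARRAY_BUFFER, 1 << 20, NULL,
                GL_MAP_WRITE_BIT |
                GL_MAP_PERSISTENT_BIT |   /* stays mapped while GL uses it  */
                GL_MAP_COHERENT_BIT);     /* writes visible without flushes */
void *ptr = glMapBufferRange(GL_ARRAY_BUFFER, 0, 1 << 20,
                             GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT |
                             GL_MAP_COHERENT_BIT);
/* Writes through ptr remain valid even while the buffer is in use for
 * drawing; fences (glFenceSync) should guard reuse of in-flight regions. */
```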

Efficient Asynchronous Queries (GL_ARB_query_buffer_object)
Buffer objects can be the direct target of a query, avoiding a CPU wait for the result and a stall of the graphics pipeline. This provides a significant performance boost for applications that intend to subsequently use the results of queries on the GPU, such as dynamic quality-reduction strategies based on performance metrics.
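A sketch of how this might be used (assuming a GL 4.4 context; error handling omitted):

```c
/* Write an occlusion-query result straight into a buffer object, so a later
 * GPU pass can consume it without the CPU ever reading it back. */
GLuint query, qbo;
glGenQueries(1, &query);
glGenBuffers(1, &qbo);
glBindBuffer(GL_QUERY_BUFFER, qbo);
glBufferData(GL_QUERY_BUFFER, sizeof(GLuint), NULL, GL_DYNAMIC_COPY);

glBeginQuery(GL_SAMPLES_PASSED, query);
/* ... draw the geometry being tested ... */
glEndQuery(GL_SAMPLES_PASSED);

/* With a buffer bound to GL_QUERY_BUFFER, the final argument is interpreted
 * as an offset into that buffer rather than a client-memory pointer: */
glGetQueryObjectuiv(query, GL_QUERY_RESULT, (GLuint *)0);
/* A shader can then read the result, e.g. to drive a quality-reduction
 * decision, without stalling the pipeline. */
```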

Shader Variable Layout (GL_ARB_enhanced_layouts)
Detailed control over placement of shader interface variables, including the ability to pack vectors efficiently with scalar types. Includes full control over variable layout inside uniform blocks and enables shaders to specify transform feedback variables and buffer layout.
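To give a flavour of the qualifiers involved, a hypothetical GLSL fragment (identifier names invented for illustration) might look like:

```glsl
// Pack a scalar into the otherwise-unused 4th component of location 0:
layout(location = 0)                in vec3  position;
layout(location = 0, component = 3) in float pointSize;

// Explicit member offsets inside a uniform block:
layout(std140) uniform Params {
    layout(offset = 0)  vec4  tint;
    layout(offset = 16) float exposure;
};

// Transform feedback declared directly in the shader:
layout(xfb_buffer = 0, xfb_offset = 0) out vec4 worldPos;
```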

Efficient Multiple Object Binding (GL_ARB_multi_bind)
New commands enable an application to bind or unbind sets of objects with one API call instead of separate commands for each bind operation, amortizing the function call, name-space lookup, and potential locking overhead. The core rendering loop of many graphics applications frequently binds different sets of textures, samplers, images, vertex buffers, and uniform buffers, so this can significantly reduce CPU overhead and improve performance.
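A minimal sketch (assuming a GL 4.4 context; the texture and sampler names are illustrative):

```c
/* Rebind four textures to units 2..5 with one call instead of four
 * glActiveTexture + glBindTexture pairs. */
GLuint textures[4] = { diffuseTex, normalTex, specularTex, envTex };
glBindTextures(2, 4, textures);          /* first unit, count, names */

/* The same pattern exists for samplers, images, and buffer bindings: */
glBindSamplers(2, 4, samplers);
glBindVertexBuffers(0, 2, vbos, offsets, strides);

/* Passing NULL unbinds the whole range in one go: */
glBindTextures(2, 4, NULL);
```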

Streamlined Porting of Direct3D applications

A number of core functions contribute to easier porting of applications and games written in Direct3D, including GL_ARB_buffer_storage for buffer placement control; GL_ARB_vertex_type_10f_11f_11f_rev, which creates a vertex data type that packs three components into a 32-bit value, providing a performance improvement for lower-precision vertices and matching a format used by Direct3D; and GL_ARB_texture_mirror_clamp_to_edge, which provides a texture clamping mode also used by Direct3D.

Extensions released alongside the OpenGL 4.4 specification include:

Bindless Texture Extension (GL_ARB_bindless_texture)
Shaders can now access an effectively unlimited number of texture and image resources directly by virtual addresses. This bindless texture approach avoids the application overhead due to explicitly binding a small window of accessible textures. Ray tracing and global illumination algorithms are faster and simpler with unfettered access to a virtual world’s entire texture set.
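A sketch of the handle-based flow (assuming a driver exposing ARB_bindless_texture):

```c
/* Turn a texture object into a 64-bit handle the shader can use directly. */
GLuint64 handle = glGetTextureHandleARB(tex);
glMakeTextureHandleResidentARB(handle);   /* must be resident before use */

/* The handle can be passed to shaders through a uniform or storage buffer;
 * in GLSL it is consumed as a sampler2D without any glBindTexture call. */
/* ... render ... */
glMakeTextureHandleNonResidentARB(handle);
```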

Sparse Texture Extension (GL_ARB_sparse_texture)
Enables handling of huge textures that are much larger than the GPU's physical memory by allowing an application to select which regions of the texture are resident, for 'mega-texture' algorithms and very large data-set visualizations.
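A sketch of the commitment API (assuming a driver exposing ARB_sparse_texture, and that the 512×512 region is a multiple of the implementation's page size):

```c
/* Allocate a huge virtual texture, then commit physical pages only for the
 * region actually needed. */
GLuint tex;
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_SPARSE_ARB, GL_TRUE);
glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA8, 16384, 16384); /* virtual only */

/* Back one tile with memory, then upload into it: */
glTexPageCommitmentARB(GL_TEXTURE_2D, 0, 0, 0, 0, 512, 512, 1, GL_TRUE);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 512, 512,
                GL_RGBA, GL_UNSIGNED_BYTE, pixels);
```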

OpenGL BOF at SIGGRAPH, Anaheim, CA July 24th 2013
There is an OpenGL BOF “Birds of a Feather” Meeting on Wednesday July 24th at 7-8PM at the Hilton Anaheim, California Ballroom A & B, where attendees are invited to meet OpenGL implementers and developers and learn more about the new OpenGL 4.4 specification.

#2

full certification is mandatory for OpenGL 4.4 and onwards

This on its own is cause for joy. Any chance of mandatory full certification being brought back to earlier versions as time goes by and drivers mature?

#3

The ARB extensions (bindless texture & sparse texture) sound way more interesting/useful compared to the core ones. Also, having updated specs and new extensions backed by ARB every year is really great for the GL developers.

#4

GLEW 1.10.0 is now available, including GL 4.4 support.
http://glew.sourceforge.net/

  • Nigel
#5

and full certification is mandatory for OpenGL 4.4 and onwards

This is so awesome. We can only hope this won’t slow down spec adoption even further.

The other features sound cool as well, but we’ll see how it works out in practice. GL_ARB_buffer_storage, GL_ARB_query_buffer_object, GL_ARB_multi_bind … very interesting.

#6

Issue #9 for GL_ARB_buffer_storage makes for fairly grim reading, unfortunately… :frowning:

It’s a pity, as this could have been the kick up the jacksie that GL’s buffer object API really needed. The issue in question should really have been resolved by just saying “this is client memory, full stop; using incompatible flags generates an error, here are the flags that are incompatible, and the vendors will have to just live with it”. Instead it seems another case of shooting too high and missing the basic requirement as a result.

#7

[QUOTE=mhagain;1252907]Issue #9 for GL_ARB_buffer_storage makes for fairly grim reading, unfortunately… :frowning:
[/QUOTE]

Have to agree…

#8

[QUOTE=thokra;1252904]This is so awesome. We can only hope this won’t slow down spec adoption even further.

The other features sound cool as well, but we’ll see how it works out in practice. GL_ARB_buffer_storage, GL_ARB_query_buffer_object, GL_ARB_multi_bind … very interesting.[/QUOTE]

How do you figure? The spec is mulled over by members from all the GPU vendors; they are the ones who signed off on it. This strikes me as an official commitment by the vendors to make OpenGL a solid spec they are fully committed to.

#9

My 2 cents on Issue #9 of GL_ARB_buffer_storage: the ultimate cause is that there are so, so many ways that buffer object data may reside. Indeed, there is the traditional dedicated video card where the client-server split makes sense. But there are lots of other situations in UMA land: memory unified but not cached by the CPU, cached by the CPU, a shared cache between CPU and GPU [whatever that exactly means], whether the GPU can page memory… the list goes on and on.

At the end of the day, I think the new console folks are laughing at the whole thing, because in that environment exactly how the memory behaves can be precisely specified by the developer. Oh well. Life goes on.

#10

True, but one will have to see how it plays out in practice. It should still work out pretty nicely with non-UMA setups.

AFAIK, AMD sadly doesn’t have a fully compliant GL4.3 driver out by the time GL4.4 is released … that’s how I figure. Let’s not even speak of Intel. I’m not bashing them, it’s just an observation. Also, we have no idea if the conformance tests are only specified or already fully implemented and whatnot. Talking the talk isn’t walking the walk …

#11

[QUOTE=thokra;1252913]
AFAIK, AMD sadly doesn’t have a fully compliant GL4.3 driver out by the time GL4.4 is released … that’s how I figure. Let’s not even speak of Intel. I’m not bashing them, it’s just an observation. Also, we have no idea if the conformance tests are only specified or already fully implemented and whatnot. Talking the talk isn’t walking the walk …[/QUOTE]

AMD is effin’ weird: they own 33% of the GPU market, yet their driver development division and support feel like some tiny indie company of 5-10 enthusiasts that just started its life.

Anyway, subscribing to epic thread. Glad to see OpenGL evolving. I hope those extensions are gonna be available from every major GPU vendor in finite time, so they’re of any use for mainstream development.

#12

The Second Annual Unofficial OpenGL Feature Awards!

I hereby hand out the following awards:

We (Finally) Did What We Said We Were Gonna Award

The conformance test suite.

I’ll just quote Tychus Findlay: “Hell, it’s about time!”

One Little Mistake Award

ARB_multi_bind

This was good functionality, until I saw glBindImageTextures. I can’t really think of a more useless way to specify that. It applies the “defaults”, based on the texture target. Which means that an array texture will always be bound layered.

OK, to be fair, you can use texture views to effectively create single textures whose defaults match what you would pass to glBindImageTexture. And then you just bind them all in one go.

3D Labs Is Really, Really Dead Award

ARB_enhanced_layouts

So, not only can we specify uniform locations in the shader, we can even specify packing behavior. To the point that we can steal components from other vectors and make them look just like other variables.

That one must be a nightmare to implement. I hope the ARB has a really comprehensive conformance test for it…

Oh, and this also wins Most Comprehensive Extension Award. It lets us steal components from other interface elements, specify the uniform/storage block layout, define locations for interface block members, and define transform feedback parameters directly in the shader.

Is OpenGL Still Open? Award

ARB_bindless_texture

So. NVIDIA comes out with NV_bindless_texture. And unlike bindless attributes and pointers in GLSL, they actually patent this.

And now it’s an ARB extension. It’s not core… but it’s not a proprietary extension. Yet anyone who implements it will almost certainly be stepping on US20110242117, and therefore must pay whatever NVIDIA says they have to pay. Unless NVIDIA has some agreement with the ARB, granting anyone a license to implement ARB_bindless_texture without paying a fee.

The really disconcerting part is that the patent encumbrance issue… isn’t mentioned in the spec. Other extensions like EXT_texture_compression_s3tc mention their patent issues. But not this one.

Last Kid Picked for Baseball Award

EXT_direct_state_access

When bindless texturing gets the nod from the ARB, and this doesn’t, something interesting is happening behind the scenes. How much does the ARB not want this in core GL, for them to deal with sparse and bindless textures first?

Then again, apparently NVIDIA wants to support DSA so badly that they may be updating the DSA extension with new stuff… and not telling anyone else who’s implementing it. If true, that’s not playing fair, guys. There’s clearly some kind of huge fight happening around this functionality within the ARB.

So I hope nobody’s holding their breath on this one.

Fragmenting The World Award

ARB_compute_variable_group_size

I understand why ARB_bindless_texture and ARB_sparse_texture aren’t core. That reason being (besides the patent issues) that we don’t want IHVs to have to say that <insert hardware here> can do 4.3, but not 4.4. There are lower classes of 4.x hardware that just can’t do this stuff. So we leave them as extensions until such time as the ARB decides that the market penetration of higher-end hardware is sufficient to incorporate them.

Or until we finally decide to have 5.0 (ie: when Microsoft decides to go to D3D 12).

But compute_variable_group_size really seems like something any 4.x-class hardware should be able to handle. Something similar goes for ARB_indirect_parameters.

Hey, That’s Actually Useful Now Award

ARB_buffer_storage

This extension adds functionality to glFlushMappedBufferRange, one of the more useless functions from ARB_map_buffer_range. Now, you can effectively keep a buffer mapped indefinitely and simply synchronize yourself with the server by flushing written ranges.
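The pattern described above might look something like this (a sketch assuming a GL 4.4 context; `frame_offset`/`frame_bytes` are illustrative, and fence handling is elided):

```c
/* Persistently map once, then flush only the ranges actually written. */
glBufferStorage(GL_ARRAY_BUFFER, size, NULL,
                GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT);
char *base = glMapBufferRange(GL_ARRAY_BUFFER, 0, size,
                              GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT |
                              GL_MAP_FLUSH_EXPLICIT_BIT);

/* Per frame: write this frame's region, then flush just that range. */
memcpy(base + frame_offset, vertices, frame_bytes);
glFlushMappedBufferRange(GL_ARRAY_BUFFER, frame_offset, frame_bytes);
/* A glFenceSync/glClientWaitSync pair should guard reuse of the region. */
```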

You Were Right Award

Um, me. Admittedly, it’s old, but I still called it: immutable buffer object storage, with better, enforced behavior. I even called the name (which admittedly is derivative and therefore obvious). Though to be fair, the whole “render while mapped” was not something I predicted.

I was going to say that it seemed odd that not specifying GL_DYNAMIC_STORAGE_BIT still allowed you to map the buffer for writing. But I did see a notation that glBufferStorage will fail if you use GL_MAP_WRITE_BIT without also specifying GL_DYNAMIC_STORAGE_BIT. And of course, you can’t map an immutable buffer for writing without GL_MAP_WRITE_BIT. :wink:

It doesn’t have all of the bits I would have liked to see. But it has all the bits the IHVs wanted (they even said so in the “issues” section). So I’ll call that “good enough.”

Oh, and as for the complaints about GL_CLIENT_STORAGE_BIT being a hint… then don’t use it. Remember: the problem with the current buffer object setup isn’t merely that they’re hints (that contributes, but that alone isn’t the full problem). It’s that the hints don’t really communicate what you are going to do with the buffer. Buffer Storage lets you do that.

You describe exactly how you intend to use it. The bits answer the important questions up-front: will I write to it, will I map it for reading or writing, do I want OpenGL to access it while it’s mapped, etc. And the API enforces every single one of these methods of use.

I have no idea why they even bothered to put GL_CLIENT_STORAGE_BIT there, since that doesn’t describe how you will use the buffer. But as the issue rightly stated, drivers will ultimately just ignore the hint.

So encourage them to do so by ignoring it yourself.

#13

Yeah, that’s what happens when your company is slowly imploding thanks to their failing CPU division. Whatever profits might have been made off of GPUs are eaten by the CPU division.

#14

You probably want to read http://www.khronos.org/members/ip-framework

Your logic for the former extensions applies to the latter ones as well.

#15

Here’s the relevant paragraph:

It states very clearly that Khronos members only agree not to sue “other Khronos members”. That doesn’t sound very “open” to me. It sounds more like “pay Khronos money (by becoming a member) or you can’t implement our specifications.”

Are the people behind Mesa a “company” who can afford the “nominal price” of membership?

I’m not concerned about just whether AMD or Intel could implement it. It can’t rightly be called an “open specification” if you have to join an industry consortium to implement the specification.

My point was that they don’t seem like it. Bindless texture support requires something very substantial from the hardware, which is only found in the more modern shader-based systems. The ability to get compute dispatch and rendering call parameters from arbitrary locations seems like something that every 4.x piece of hardware ought to be able to do.

#16

My 2 thingies on the DSA shenanigans, and other thoughts off the top of my head.

Right now DSA is not really needed any more (it never was from a purely technical perspective; I’m talking API cleanliness here).

Most of the functionality covered by the DSA extension is dead functionality in modern OpenGL. The only really relevant areas where this actually matters anymore are texture objects and buffer objects, and with vertex attrib binding, buffer objects don’t really need it (PBOs are a special case that can be passed over here). DSA as it stands will never be fully implemented in modern OpenGL because modern OpenGL doesn’t need all of it; the ARB can pick the best bits (as they have done before with the glProgramUniform calls) and respecify elsewhere to avoid the requirement for it.

I’m slightly disappointed that buffer storage didn’t specify a DSA API in core, but it’s not a big deal.

I would have liked to have seen glBindMultiTextureEXT go core, but it hardly seems worth it for a single entry point. GL_ARB_multi_bind covers the needed functionality anyway.

It’s unclear how GL_ARB_multi_bind interacts with a subsequent glTexImage/glTexSubImage/glTexParameter/etc call, or even a subsequent glBindTexture call. It seems obvious that since the original active texture selector remains unmodified, that’s the one that gets used. The resolution to issue #10 makes it clear for buffers, and it would have been nice to see similar for textures. This just makes texture objects even messier, and to be honest it’s looking as though junking the whole API and specifying a new GL_ARB_texture_objects2 (or whatever) from scratch may have been a better approach. That’s my prediction for OpenGL 5.

#17

[QUOTE=mhagain;1252907]Issue #9 for GL_ARB_buffer_storage makes for fairly grim reading, unfortunately… :frowning:

It’s a pity as this could have been the kick up the jacksie that GL’s buffer object API really needed, and the issue in question should really have been resolved by just saying “this is client memory, full stop, using incompatible flags generates an error, here are the flags that are incompatible and the vendors will have to just live with it”, but it seems another case of shooting too high and missing the basic requirement as a result.[/QUOTE]

This could be something to improve: the OpenCL 2.0 announcement mentions a six-month feedback window, but the OpenGL 4.4 announcement does not seem to mention such a thing. It could be a good idea to do this for OpenGL too.

#18

It’s unclear how GL_ARB_multi_bind interacts with a subsequent glTexImage/glTexSubImage/glTexParameter/etc call, or even a subsequent glBindTexture call.

The reason the issue for buffer objects needed clarification is that buffer objects have a separation between an indexed bind point and the target bind point. Binding with glBindBufferRange binds to both the target and the indexed bind point, while glBindBuffer only binds to the target bind point. So it wasn’t clear whether glBindBuffersRange would bind to the target bind point the way glBindBufferRange does. Hence the clarification.

Textures have no such dichotomy; there is no “target bind point”. There are only texture image unit binding points. So there’s nothing that needs to be said about them. glBindTextures binds the textures to those texture image units, period.
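A short sketch of that asymmetry (assuming a GL 4.4 context; the buffer and texture names are illustrative):

```c
/* glBindBufferRange binds ubo to BOTH indexed binding point 2 and the
 * generic GL_UNIFORM_BUFFER target: */
glBindBufferRange(GL_UNIFORM_BUFFER, 2, ubo, 0, 256);

/* The multi-bind version touches ONLY the indexed binding points; per the
 * resolution of issue #10, the generic target binding is left unmodified: */
GLuint     bufs[2]    = { uboA, uboB };
GLintptr   offsets[2] = { 0, 0 };
GLsizeiptr sizes[2]   = { 256, 256 };
glBindBuffersRange(GL_UNIFORM_BUFFER, 0, 2, bufs, offsets, sizes);

/* Textures have only per-unit binding points, so there is no such
 * ambiguity for glBindTextures: */
glBindTextures(0, 2, textures);
```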

#19

This topic actually makes me curious about some of the active posters here, and this forum in general. Those new functions get a lot of discussion. What are you using this recent (4.2 and above) functionality for? What is the scope of application? Because this functionality is barely supported, you must either target really specific hardware or have the time to maintain additional code paths.

#20

[QUOTE=Alfonse Reinheart;1252924]Here’s the relevant paragraph:

It states very clearly that Khronos members only agree not to sue “other Khronos members”. That doesn’t sound very “open” to me. It sounds more like “pay Khronos money (by becoming a member) or you can’t implement our specifications.”

Are the people behind Mesa a “company” who can afford the “nominal price” of membership?

I’m not concerned about just whether AMD or Intel could implement it. It can’t rightly be called an “open specification” if you have to join an industry consortium to implement the specification.

My point was that they don’t seem like it. Bindless texture support requires something very substantial from the hardware, which is only found in the more modern shader-based systems. The ability to get compute dispatch and rendering call parameters from arbitrary locations seems like something that every 4.x piece of hardware ought to be able to do.[/QUOTE]

Boo hoo. Perhaps you have millions of dollars and are willing to build the standards body wherein all hardware manufacturers come together and build specs, all for free gratis. Have at it.

Becoming a member is chump change.