Official feedback on OpenGL 4.2 thread

ARB_shader_image_load_store is a really big deal! Shaders can scatter as well as gather which is going to enable some interesting techniques.
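
For the unfamiliar, “scatter” here means a shader invocation can write to an arbitrary texel rather than only to its own output location. A minimal GLSL sketch (the image format, binding point, and coordinate math are my own illustrative choices):

```glsl
#version 420
// Each fragment writes to a destination it computes itself,
// not just to its own framebuffer location.
layout(rgba32f, binding = 0) uniform image2D destImage;

in vec2 texCoord;

void main()
{
    ivec2 target = ivec2(texCoord * 255.0); // arbitrary computed destination
    imageStore(destImage, target, vec4(1.0, 0.0, 0.0, 1.0));
}
```
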
I’m just wondering how it will perform - has anyone given it a try yet?

It’s been tried for OIT stuff; see http://www.opengl.org/discussion_boards/ubbthreads.php?ubb=showflat&Number=296393#Post296393

Wow, I’m just back from holiday and here we have GL 4.2.

Very nice update with a lot of useful features, even though only about half of the features I wanted to see in GL 4.2 were actually added.

By repeating this process over time, the feature set will asymptotically approach your needs!

I know that. Obviously they had some real tough extensions to discuss this time (I mean especially ARB_shader_image_load_store and ARB_shader_atomic_counters, even though the latter was already almost complete half a year ago).

Anyway, what we can perhaps say now with confidence is that GL 4.2 has achieved feature parity with DX11 (at least from the point of view of exposed hardware capabilities).

Obviously they had some real tough extensions to discuss this time (I mean especially ARB_shader_image_load_store and ARB_shader_atomic_counters, even though the latter was already almost complete half a year ago).

Actually, I’d guess that most of the discussion was around ARB_shading_language_420pack and ARB_texture_storage. The former is really a merger of what seems like a half-dozen extensions, which must have been in development for a while.

And ARB_texture_storage introduces functionality that I’m sure some in the ARB saw as superfluous. After all, there’s nothing you can do with ARB_texture_storage that you can’t with regular texture creation. And the immutability must have rubbed some of the more conservative elements the wrong way.
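
To make the overlap concrete (a sketch with arbitrary sizes; assumes a texture is bound to GL_TEXTURE_2D):

```c
/* ARB_texture_storage: allocate a complete, immutable 256x256 RGBA8
 * mip chain in a single call. */
glTexStorage2D(GL_TEXTURE_2D, 9, GL_RGBA8, 256, 256);

/* Roughly equivalent mutable-storage setup before 4.2: */
for (int level = 0; level < 9; ++level) {
    glTexImage2D(GL_TEXTURE_2D, level, GL_RGBA8,
                 256 >> level, 256 >> level, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, NULL);
}
```
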

Anyway, what we can perhaps say now with confidence is that GL 4.2 has achieved feature parity with DX11 (at least from the point of view of exposed hardware capabilities).

Honestly, it’s not “feature parity” that interests me so much as “niceness parity”. And 420pack really, really helps. Shaders are much more robust and stand-alone. And the last of the horrible 3D Labs-based nonsense is finally expunged from GLSL. It only took 10 revisions of the language and 6 years.

Also, I’ve come to appreciate why it’s one extension instead of a half-dozen. If it were many extensions, then those using GL 3.3 would have to put a giant list of #extension enable lines at the top of every shader. That would be pretty terrible. This way, you just have one extension (or two if you’re using 420pack) and that’s it.
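
On GL 3.3, that single line plus the new syntax would look like this (a sketch; GL_ARB_shading_language_420pack is the real extension name, the binding numbers are arbitrary):

```glsl
#version 330
#extension GL_ARB_shading_language_420pack : require

// Binding points set in the shader itself; no glUniform1i
// calls needed from the application to wire up samplers.
layout(binding = 0) uniform sampler2D diffuseMap;
layout(binding = 1) uniform sampler2D normalMap;
```
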

I agree that ARB_shading_language_420pack is very important, as I was also very frustrated by, for example, the way you attach bindings to uniforms. However, atomic counters were even more important, I think, as without them we simply cannot implement OIT and other stuff efficiently. And before anybody tries to argue: the atomic read/writes exposed by EXT_shader_image_load_store, and now also by the ARB version, are not equivalent to atomic counters. Atomic counters are more than 10 times faster than atomic read/writes on AMD hardware, as they have dedicated hardware for them, and they are faster on NVIDIA as well.
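
For reference, the counter usage in question is trivial to express (a sketch; the binding and offset are arbitrary):

```glsl
#version 420
// One increment per fragment, e.g. to allocate a slot in a
// per-pixel linked list for order-independent transparency.
layout(binding = 0, offset = 0) uniform atomic_uint fragmentCount;

void main()
{
    uint slot = atomicCounterIncrement(fragmentCount);
    // ... store this fragment's data at 'slot' via image load/store ...
}
```
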

Honestly, it’s not “feature parity” that interests me so much as “niceness parity”. And 420pack really, really helps. Shaders are much more robust and stand-alone. And the last of the horrible 3D Labs-based nonsense is finally expunged from GLSL. It only took 10 revisions of the language and 6 years.

Yeah, that’s also important; however, I’m more interested first in being able to solve a problem at all than in how nicely I can solve it (no, I’m not saying that solving a problem elegantly is less important; however, functioning software has priority over that).

Also, I’ve come to appreciate why it’s one extension instead of a half-dozen. If it were many extensions, then those using GL 3.3 would have to put a giant list of #extension enable lines at the top of every shader. That would be pretty terrible. This way, you just have one extension (or two if you’re using 420pack) and that’s it.

Good point as well; even though the name is kind of funky, it’s better to have just a single extension.

Update: Actually, I’m also very happy to see ARB_base_instance, as this functionality has been present in hardware for quite some time, yet even ARB_draw_indirect didn’t expose it earlier. This is one of my favorites, even though the name glDrawElementsInstancedBaseVertexBaseInstance sounds a bit ridiculous :slight_smile: Btw, I don’t know why they didn’t simply create two new functions with names like glDrawElementsDirect and glDrawArraysDirect (analogous to the new indirect draw commands) that do the same as glDrawElementsInstancedBaseVertexBaseInstance and glDrawArraysInstancedBaseInstance. You can express all the glDraw* commands with these two by using the proper parameters, so people could migrate to the new commands and all the rest of the glDraw* commands could be marked as deprecated. Also, it would be great to see glMultiDraw* versions as well (which we already have at least for indirect drawing, thanks to AMD_multi_draw_indirect).
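
For what it’s worth, the new call already subsumes the simpler draws if you pass neutral parameters (a sketch with an arbitrary index count):

```c
/* Equivalent to glDrawElements(GL_TRIANGLES, 36, GL_UNSIGNED_INT, 0):
 * one instance, no base vertex, no base instance. */
glDrawElementsInstancedBaseVertexBaseInstance(
    GL_TRIANGLES, 36, GL_UNSIGNED_INT, (void *)0,
    1,   /* instance count */
    0,   /* base vertex    */
    0);  /* base instance  */
```
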

Thanks for promoting ARB_shader_image_load_store and the atomic counters, while adding useful features for older hardware. Expanding the shader bindings is great, though I’m still not a big fan of the layout syntax.

However, the one ARB extension I do take issue with is ARB_internalformat_query - not for what it does, but for how it’s named. In all instances of “query” in the registry, the use of query as a noun indicates a query object (ARB_occlusion_query, ARB_timer_query), whereas its use as a verb indicates a synchronous, object-less fetch of the information (ARB_texture_query_lod, OES_query_matrix). So at least it should be ARB_query_internalformat.

Further to this, the extension doesn’t actually query the internal format of a texture, which is what I was really hoping for when I saw it. Rather, it queries the supported multisample counts for a given internal format. In my opinion, the extension should have been called ARB_query_multisample. Sorry if this comes across as a bit pedantic, and I realize it’s likely too late to do much about it, but it felt misleading enough to me to point out in the feedback thread.
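
For context, the entire current surface of the extension is essentially this (the variable names are mine):

```c
/* ARB_internalformat_query: ask how many multisample counts GL_RGBA8
 * supports for renderbuffers, then fetch them (largest first). */
GLint numCounts = 0;
glGetInternalformativ(GL_RENDERBUFFER, GL_RGBA8,
                      GL_NUM_SAMPLE_COUNTS, 1, &numCounts);

GLint samples[16]; /* sketch assumes at most 16 distinct counts */
glGetInternalformativ(GL_RENDERBUFFER, GL_RGBA8,
                      GL_SAMPLES, numCounts, samples);
```
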

If you check the issues for that extension, it was going to be much more ambitious, but they scaled it back for the current release.

The name makes sense because it’s providing a query mechanism for internal format information. It just doesn’t do much with that mechanism yet.

I saw that, but that still implies a future extension with a different name. Perhaps ARB_internalformat_query2, 3, etc. are coming, but the name still doesn’t do much for the current extension.

I wish more people considered this. Direct state access is one of the prerequisites for OpenGL to surpass DirectX, and yet there has been a lot of talk about “OpenGL finally catching up” of late.

Direct state access is there in AMD’s and NVIDIA’s drivers. I also agree that it would be nice to have it in core, but I’m pretty sure the issue is that adding direct state access would require a lot of language changes and editorial work on the spec. I think we can expect DSA to get into core only when the promised restructuring of the whole spec happens.

EXT_direct_state_access is there, but the extension has too many issues to be used in production code. Many entry points are missing, so you are stuck at OpenGL 2.1. Plus, we are supposed to bind an object to create it… erm… not good.

We need a serious update on direct state access!

TexStorage almost did it for textures, but still did not. Time to whine.

Many entry points are missing, so you are stuck at OpenGL 2.1.

What entry points are missing? You have full DSA control over textures and their parameters, VAOs, program uniforms (programs were DSA before outside of uniform setting), samplers, and program pipeline.
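
For instance, texture setup with no binds at all (a sketch; the *EXT entry points are from EXT_direct_state_access, and tex, program, location, and pixels are assumed to come from earlier setup):

```c
/* Set parameters and upload data on 'tex' without binding it. */
glTextureParameteriEXT(tex, GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER,
                       GL_LINEAR_MIPMAP_LINEAR);
glTextureImage2DEXT(tex, GL_TEXTURE_2D, 0, GL_RGBA8, 256, 256, 0,
                    GL_RGBA, GL_UNSIGNED_BYTE, pixels);

/* Set a sampler uniform on a program that isn't currently in use. */
glProgramUniform1iEXT(program, location, 0);
```
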

The only missing entrypoints I can think of are transform feedback objects and a couple of the newer FBO features (TextureLayer). Hardly something that stops people from using it in “production code.”

Really, what we need is to make state objects for those last bits of non-object state: the post-fragment tests (depth+stencil), blending/masking, and the viewport transform.

  • We are supposed to bind an object to create it… erm… not good.

What are you talking about? The DSA extension specifically says otherwise.

In what case do you need to bind an object to create it?

As an example, element buffer setting and vertex attrib divisor setting.
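
That is, even with DSA you still have to do something like this (a sketch; vao, elementBuffer, and attribIndex are assumed from earlier setup):

```c
/* No DSA entry point attaches an element buffer to a VAO or sets an
 * attribute divisor on it directly; you must bind the VAO first. */
glBindVertexArray(vao);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, elementBuffer);
glVertexAttribDivisor(attribIndex, 1);
glBindVertexArray(0);
```
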

Okay, on this point you are right, you don’t need to bind an object to create it, but the object is not generated at the time you call glGen*. This is an issue.

What I wanted to say is that EXT_direct_state_access is acceptable for now, and I can understand why DSA didn’t make it into core. Actually, I’d rather go without DSA forever than have the current EXT_direct_state_access in core, because it is ill-made (of course, this is just my opinion). I would be happy to see an ARB_direct_state_access extension put into core in the near future, though.

Direct state access is there in AMD’s and NVIDIA’s drivers. I also agree that it would be nice to have it in core, but I’m pretty sure the issue is that adding direct state access would require a lot of language changes and editorial work on the spec. I think we can expect DSA to get into core only when the promised restructuring of the whole spec happens.

I’m in favor of an OpenGL 4.3 that is just the reworked spec, WITHOUT any new extensions brought into core.

Then they can start building upon this new spec architecture.

Also, I think that while merging a spec refactoring with bringing new extensions into core may be doable, they should avoid refactoring and releasing the next big OpenGL at the same time.

OpenGL 3.0 was such a mess because they wanted to do too much in too short a time. I hope they have learned something.

As an example, element buffer setting and vertex attrib divisor setting.

Fair enough. Though the divisor was more recent than DSA, there was no excuse for not having element buffer attachment in DSA.

but the object is not generated at the time you call glGen*. This is an issue.

Is it? What exactly is the problem? You cannot tell the difference between glGen* creating the object and calls to a DSA function doing it.

And even if it were a problem, implementations are in fact allowed to create the object at glGen* time, since there is no possible way for you to detect that this is happening. Indeed, I understand that NVIDIA’s implementation does this.
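
In other words, as I understand the EXT spec, these two paths are indistinguishable from the application’s side (a sketch):

```c
GLuint tex;
glGenTextures(1, &tex); /* reserves a name; the spec doesn't promise the
                           object itself exists yet */

/* Under EXT_direct_state_access, this call creates the object if it
 * doesn't exist -- and nothing observable changes if the driver already
 * created it inside glGenTextures. */
glTextureParameteriEXT(tex, GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER,
                       GL_NEAREST);
```
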

You caught me again :slight_smile:
However, it would still be nice if the spec explicitly required the object to be generated at glGen* time, and always required objects passed to modify or bind commands to have been previously generated with glGen*.

However, it would still be nice if the spec explicitly required the object to be generated at glGen* time, and always required objects passed to modify or bind commands to have been previously generated with glGen*.

It already requires the latter. Well, if you use a core context at any rate.