Official feedback on OpenGL 3.2 thread

Are you sure? I already used that in GLSL 1.30.

It was always possible. I cannot see why 3.2 should motivate more game developers to go this way than before. If you don’t target the Mac or, even more unlikely, Linux, there is simply no good reason to write an additional OpenGL rendering path. While the pure API spec is getting better, there are still so many holes when it comes to the infrastructure.

Are there plans to use ARB_sync with VBOs?

For instance, after calling glBufferData the data would not be copied immediately, and it could be safely changed or deleted only after the sync object is signaled. Calling ClientWaitSync would force the copy right away (which is the current behavior of glBufferData).

Is there a way to do this with the current API or an extension?

I’m not certain what you’re asking, but if it is whether we would make sync objects affect the behavior of previously existing API calls, there are no plans to do that. BufferData in GL 3.2 does just what BufferData always did before. Placing a fence after a GL command and waiting on the corresponding fence sync object in another context simply provides a (potentially) more efficient way to know that the command has completed from the point of view of the other context.

Are ARB extensions #74 and #75 meant to be labelled as “WGL_ARB_create_context” and “GLX_ARB_create_context”, or “WGL_ARB_create_context_profile” and “GLX_ARB_create_context_profile”, or are they a special case because they modify existing extensions?

Also, I just noticed that with wglGetExtensionsStringARB, the NVIDIA beta drivers 190.56 report the extension string “WGL_ARB_create_context_profile”, but not “WGL_ARB_create_context”.
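Because one of these names is a prefix of the other, a plain substring search over the extension string would report “WGL_ARB_create_context” as present whenever “WGL_ARB_create_context_profile” is. A minimal sketch of a token-wise check follows; `has_extension` is a hypothetical helper name, and the code assumes the usual space-separated format of the string returned by wglGetExtensionsStringARB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of WGL or GLX).
 * Returns true only if `name` occurs in `list` as a complete,
 * space-delimited token, so "WGL_ARB_create_context" does not match
 * inside "WGL_ARB_create_context_profile". */
bool has_extension(const char *list, const char *name)
{
    size_t len;
    const char *p;

    assert(name != NULL);          /* the name is supplied by the caller */
    if (list == NULL)              /* the query itself may have failed */
        return false;

    len = strlen(name);
    if (len == 0 || strchr(name, ' ') != NULL)
        return false;              /* not a valid single token */

    for (p = list; (p = strstr(p, name)) != NULL; p += len) {
        bool starts_token = (p == list) || (p[-1] == ' ');
        bool ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return true;
    }
    return false;
}
```

With a string such as the one described above, `has_extension(list, "WGL_ARB_create_context_profile")` would be true while `has_extension(list, "WGL_ARB_create_context")` would be false.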

OK, then let’s call it glBufferDataRetained / glBufferSubDataRetained.

Currently sending data to VBO (or any other buffer data object) works like this:

  1. Allocate and fill the data
  2. Give it to glBufferData
  3. glBufferData immediately makes a copy of the data (or sends it directly to the card which is unlikely)
  4. When glBufferData returns the data is no longer needed and I can delete or modify it

What I want is to skip #3 to avoid the extra copy done by glBufferData. When using glBufferDataRetained the data will not be immediately copied and it cannot be changed/deleted till OpenGL signals that this data is no longer needed.

The ARB_Sync API is probably not designed for this case but something similar can be very useful. Using the ARB_Sync semantics this functionality will work like this:

  1. Allocate and fill the data
  2. Give the data and a sync object to glBufferDataRetained
  3. glBufferDataRetained returns immediately without copying anything
  4. The next time I need to change or delete the data I check if the sync object is signaled - and if so I can use it right away. If the object is not signaled - I can either call ClientWaitSync (or whatever) to ensure the data is copied right away or I can allocate the changed data in a new place.

Currently the only way (that I know of) to avoid the extra copying is to use glMapBuffer. And this is another can of worms…
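The retained-upload steps proposed above can be modelled without any GL at all. The sketch below is a toy simulation, not an OpenGL API: every name in it (`MockSync`, `MockBuffer`, `mock_buffer_data_retained`, `mock_driver_flush`, `mock_client_wait_sync`) is hypothetical, and the "driver" is just a function the caller invokes by hand. It only illustrates the intended ownership rule: the client memory must stay untouched until the sync is signaled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model only; none of these names exist in OpenGL. */
typedef struct {
    bool signaled;              /* true once the client data has been consumed */
} MockSync;

typedef struct {
    unsigned char storage[256]; /* stands in for driver/GPU-side storage */
    size_t size;
    const void *pending;        /* retained client pointer, not yet copied */
    MockSync *pending_sync;     /* sync to signal when the copy happens */
} MockBuffer;

/* The "driver" consumes the retained client memory and signals the sync. */
void mock_driver_flush(MockBuffer *buf)
{
    if (buf->pending != NULL) {
        memcpy(buf->storage, buf->pending, buf->size);
        buf->pending_sync->signaled = true;
        buf->pending = NULL;
        buf->pending_sync = NULL;
    }
}

/* Steps 2 and 3: remember the pointer and return without copying. */
void mock_buffer_data_retained(MockBuffer *buf, const void *data,
                               size_t size, MockSync *sync)
{
    assert(size <= sizeof buf->storage);
    mock_driver_flush(buf);     /* finish any earlier retained upload first */
    buf->pending = data;
    buf->size = size;
    buf->pending_sync = sync;
    sync->signaled = false;
}

/* Step 4, non-blocking: may the client reuse its memory yet? */
bool mock_sync_is_signaled(const MockSync *sync)
{
    return sync->signaled;
}

/* Step 4, blocking: force the copy now (stand-in for ClientWaitSync). */
void mock_client_wait_sync(MockBuffer *buf, MockSync *sync)
{
    if (!sync->signaled) {
        assert(buf->pending_sync == sync);
        mock_driver_flush(buf);
    }
}
```

In this model the caller polls `mock_sync_is_signaled` before touching its array again, and falls back to `mock_client_wait_sync` (or to a freshly allocated array) when the sync is still unsignaled.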

Waiting for OGL 4 to get features from ten years ago? So stupid…

There is no need to wait. The functionality is available for any vendor anyway. Just use the extension, why would that be a problem?

I want to use core functionality only, but I am forced to use extensions too, because core does not have the functionality I need. This is sad.

We folded the profile extension into the same documents as the original create_context extensions, largely because it was necessary to get them out on time for SIGGRAPH, for arcane reasons involving the Khronos spec approval process. So they are two extensions, in one file.

OK, I understand now. I think both Sun and Apple have done vendor extensions along these lines, and we have sometimes discussed it as a future use case in the ARB. If we do something like this in a future release I hope it would use sync objects to signal the driver being done with the client buffer, but at present it’s not being actively discussed in the group.

Now I’m just waiting for extension-loading libraries like GLEW to catch up with a major release.

Does anyone know of any libraries like GLEW that allow for the easy handling of extensions in OpenGL 3.1/3.2?

Now that we have sync objects, the API to implement this functionality is a no-brainer, so I hope to see it implemented sooner rather than later.

At least for PBOs the glMapBuffer API works fine.

  • Mapping a buffer is a pretty straightforward usage pattern.
  • NVIDIA drivers provide good performance.
  • Loading the data using glTexImage and similar calls is async.
  • The current ARB_sync should fill the remaining gap.

In my experience glMapBuffer works very well for the initial loading of any buffer. It’s once you start to use that buffer that you suffer sync issues. Certainly it’s much faster than multiple calls to glBufferSubData.

Yes, calling glBufferData with data set to NULL is painless.

Maybe glCopyBufferSubData (ARB_copy_buffer) for updating subdata can help.

C# wrappers for OpenGL 3.2 available here. Also usable by VB.Net, F# and the rest of the Mono/.Net family of languages.

Very well done, Khronos! Now let’s see some drivers! NVIDIA (officially) released 3.1 just a few weeks ago.
Again, great release!

Straightforward usage pattern? Mapping a buffer to update data will most certainly lead to sync issues. Even discarding the buffer with glBufferData(…, NULL) and updating the whole thing might not help. And of course updating the whole buffer might not be what you want.
glTexImage being async comes at a cost - it will most likely create a copy of your image data right away. And this is what I want to avoid.

This very much depends on the sizes of your data/subdata.
The intricacies of glMapBuffer have been discussed to death. The solution to use buffer objects with sync objects is both elegant and simple.


It is hard to argue about this. Only the driver guys can tell. But I see no reason why the driver needs to make a copy. It probably starts a DMA transfer to copy the data from system memory to GPU memory.

Using buffers for small chunks of data is questionable. The overhead could be larger than using simple memory pointers and letting the driver copy the data.

I am using buffers with glBufferData(…, NULL) for streaming several hundred megabytes per second without problems. Of course I am not using SubData. Driver developers should write some performance hints and tips for using smaller chunks with SubData. I can imagine it does not come for free.

If you are afraid of copying data in system memory, try using write-combined memory and/or SSE instructions with a non-temporal hint so the copy does not pollute the cache.
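As a rough illustration of the non-temporal suggestion, here is a minimal sketch of a streaming copy using SSE2 intrinsics. `stream_copy` is a hypothetical helper name; the code assumes a 16-byte-aligned destination and a size that is a multiple of 16, and it falls back to `memcpy` when SSE2 is not available at compile time. Whether it actually beats a plain `memcpy` depends on the hardware and the data size, so measure before relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: copies `bytes` from src to dst using stores that
 * carry a non-temporal hint, so the written data should bypass the cache.
 * Assumptions: dst is 16-byte aligned and bytes is a multiple of 16. */
void stream_copy(void *dst, const void *src, size_t bytes)
{
    assert((uintptr_t)dst % 16 == 0);
    assert(bytes % 16 == 0);
#if defined(__SSE2__)
    {
        __m128i *d = (__m128i *)dst;
        const __m128i *s = (const __m128i *)src;
        size_t i, n = bytes / 16;

        for (i = 0; i < n; ++i) {
            __m128i v = _mm_loadu_si128(s + i); /* source may be unaligned */
            _mm_stream_si128(d + i, v);         /* non-temporal store */
        }
        _mm_sfence(); /* order the streaming stores before later writes */
    }
#else
    memcpy(dst, src, bytes); /* portable fallback without the hint */
#endif
}
```

A destination obtained from a mapped buffer would be the natural target for such a copy, provided its alignment satisfies the assumption above.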