Feature or Loophole in Specs?

The GL 4.1 spec, like earlier versions, states:

Effects of Mapping Buffers on Other GL Commands
Most, but not all GL commands will detect attempts to read data from a mapped
buffer object. When such an attempt is detected, an INVALID_OPERATION error
will be generated. Any command which does not detect these attempts, and performs
such an invalid read, has undefined results and may result in GL interruption
or termination.

But what about write operations? Think of glReadPixels(), glGetTexImage(), glCopyBufferSubData() and the like…
Are GL commands that write into a buffer object well defined while that buffer object is mapped, either as a whole or in part?

I believe the spec was meant to apply to both reads and writes.

on the AMD implementation, there is actually a really interesting feature that you get out of this:

  • allocate your buffer and bind it as pack/unpack
  • map the buffer with the unsynchronized bit
  • do not unmap it
  • reads from or writes to that buffer will not raise an INVALID_OPERATION error
  • use sync objects to find out when a specific read/write operation has completed on the GPU

this helps to achieve the best performance for data streaming in/out of the GPU.
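
The bookkeeping side of the steps above can be sketched in plain C. This is only a sketch under assumptions: `StreamRing`, `ring_acquire`, `ring_submit` and `ring_retire` are hypothetical names, the buffer is split into three regions for illustration, and a plain flag stands in for the sync object. The GL calls are named only in comments because they need a live context; in real code the flag would be a GLsync created with glFenceSync() and polled with glClientWaitSync().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for a buffer that stays mapped while GL commands
   read from or write into it. A flag stands in for a GLsync object. */
#define REGION_COUNT 3

typedef struct {
    size_t offset;     /* byte offset of the region inside the mapped buffer */
    bool   in_flight;  /* stand-in for a fence that has not signaled yet */
} Region;

typedef struct {
    Region regions[REGION_COUNT];
    size_t region_size;
    int    next;       /* region handed out by the next acquire */
} StreamRing;

void ring_init(StreamRing *r, size_t region_size) {
    r->region_size = region_size;
    r->next = 0;
    for (int i = 0; i < REGION_COUNT; ++i) {
        r->regions[i].offset = (size_t)i * region_size;
        r->regions[i].in_flight = false;
    }
}

/* Returns the index of a region the CPU may touch, or -1 if the GPU may
   still be using the next region. Real code would poll glClientWaitSync()
   with a zero timeout here instead of reading a flag. */
int ring_acquire(StreamRing *r) {
    if (r->regions[r->next].in_flight)
        return -1;
    int idx = r->next;
    r->next = (r->next + 1) % REGION_COUNT;
    return idx;
}

/* Call right after issuing the GL command that touches the region, for
   example glReadPixels() into the bound pack buffer. Real code would store
   the result of glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0). */
void ring_submit(StreamRing *r, int idx) {
    assert(idx >= 0 && idx < REGION_COUNT);
    r->regions[idx].in_flight = true;
}

/* Call once the fence for the region has signaled. */
void ring_retire(StreamRing *r, int idx) {
    assert(idx >= 0 && idx < REGION_COUNT);
    r->regions[idx].in_flight = false;
}
```

The point of the ring is that the CPU never touches a region whose fence has not signaled, which is the only ordering guarantee left once the buffer is mapped with the unsynchronized bit and never unmapped.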

Wow, that manages to be both scary and unintuitive at the same time. OpenGL states that mapped buffers can be unceremoniously lost. The spec doesn’t say under what circumstances this can happen, but you can only find out that it has happened when you unmap the buffer. This suggests that mapping should be considered at least somewhat temporary; you map and unmap as needed.

Equally important, how would users of OpenGL reasonably expect that this usage pattern, which relies on clearly violating the OpenGL specification (a glDrawElements call that pulls from a mapped buffer should result in GL_INVALID_OPERATION), would result in optimal streaming performance? Perhaps there should be an extension governing this kind of behavior.

Pierre, it was actually your post in AMD’s forums that made me start this thread ( http://forums.amd.com/devforum/messageview.cfm?catid=392&threadid=140335&enterthread=y )

I agree with Alfonse - users should not be encouraged to make use of unspecified behaviour (that will almost certainly fail on other hardware) in order to get “optimal performance”. The same applies to Nvidia, of course!
If there is a performance feature such as “permanently mapped buffers”, it should be specified in an extension that also covers all corner cases.

maybe I can explain when a mapped buffer can be lost (on Windows/Linux), which should remove some of the mystery behind why and when a surface can be lost.

the fundamental issue is with the local video memory:

  • it is not managed by the OS using demand paging after a page fault (current implementation is paging based on scheduling context, or pinned allocation)
  • reads from local video memory are really slow (so there is often a copy involved for read access)
  • on system events (hibernate, modeset), video memory can be evicted

what this implies, though: if your buffer is in system memory and directly accessible to the GPU (no copy involved), then you are guaranteed that your memory is never lost (it is pinned by the OS, and correctly paged for ACPI events as part of system memory). actually, the AMD driver uses this exact same feature all the time internally in order to transfer data in/out of the GPU (glTexImage2D, glReadPixels, glDrawPixels, …), and it is guaranteed behavior. this behavior is most likely true for other OGL implementations as well, even with different HW.

about the INVALID_OPERATION error, it would be really inefficient for the driver to check, on every draw, all the cases where a buffer can be read or written to make sure that no attachment is currently mapped, because there are too many cases; one fast implementation would be to check that not a single buffer is currently mapped, which is really restrictive.
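
To make that trade-off concrete, here is a minimal sketch of the two checks in C. Everything in it is hypothetical (`Context`, `Buffer`, `MAX_BINDINGS`, the function names); it is not taken from any real driver and only illustrates why the cheap check is more restrictive than the precise one.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_BINDINGS 8  /* assumed number of binding points a draw can read */

typedef struct {
    bool mapped;                     /* is this buffer currently mapped? */
} Buffer;

typedef struct {
    Buffer *bindings[MAX_BINDINGS];  /* buffers the next draw may touch */
    int     mapped_count;            /* buffers currently mapped in the context */
} Context;

void map_buffer(Context *c, Buffer *b) {
    if (!b->mapped) { b->mapped = true; c->mapped_count++; }
}

void unmap_buffer(Context *c, Buffer *b) {
    if (b->mapped) { b->mapped = false; c->mapped_count--; }
}

/* Precise check: walk every binding point on each draw. */
bool draw_allowed_precise(const Context *c) {
    for (size_t i = 0; i < MAX_BINDINGS; ++i)
        if (c->bindings[i] != NULL && c->bindings[i]->mapped)
            return false;
    return true;
}

/* Cheap check: a single comparison, but it rejects the draw even when the
   mapped buffer is not used by the draw at all. */
bool draw_allowed_coarse(const Context *c) {
    return c->mapped_count == 0;
}
```

With one buffer bound for drawing and an unrelated buffer mapped, the precise check lets the draw through while the coarse check rejects it.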

“if your buffer is in system memory and directly accessible to the GPU (no copy involved), then you are guaranteed that your memory is never lost (it is pinned by the OS, and correctly paged for ACPI events as part of system memory).”

But, since there’s no GL_BUFFER_OBJECT_IN_SYSTEM_MEMORY usage hint, there’s no way to know if this is true or not for any particular buffer object. So the only thing we have to go on is your suggestion that buffer objects don’t get corrupted while mapped anymore. Which is at best a platform dependent recommendation.

“about the INVALID_OPERATION error, it would be really inefficient for the driver to check, on every draw, all the cases where a buffer can be read or written to make sure that no attachment is currently mapped, because there are too many cases; one fast implementation would be to check that not a single buffer is currently mapped, which is really restrictive.”

However inefficient it may be, that’s what the OpenGL specification says. So unless you explicitly change the specification with an appropriately defined extension, this is a driver bug.

Furthermore, this doesn’t explain why the fast path for streaming on ATI hardware is also a path that is in direct violation of the OpenGL specification, and thus no one could possibly stumble across it by accident. In short, if people do not actually ask you, they’d never know it was there.

“So unless you explicitly change the specification with an appropriately defined extension, this is a driver bug.”

actually, the spec clearly indicates that an implementation is not required to check for the error. this was discussed at length when this language was written, so this was not an accident.

“So the only thing we have to go on is your suggestion that buffer objects don’t get corrupted while mapped anymore. Which is at best a platform dependent recommendation.”

this is implementation-specific behavior, that is correct; that does not mean it has no use, though.

we are currently getting positive feedback on this well defined behavior, so I thought that you might want to be aware of it. And I started my post with a big caveat:

on amd implementation…