ARB_Sync please!

A long long time ago, there was work done on ARB_Sync, and it was good.

But then the promises started: it was going to be in 3.0, there was just some work to be done, etc., etc.

In the end, no ARB_Sync. NVidia and Apple have somewhat limited extensions that do some of it; ATI has none.

This is a huge hole in OpenGL for some applications, specifically where it is critical to avoid CPU busy waiting while still maintaining low total system latency.

Is there a reason that ARB_Sync died its death? Can it come back from the dead? I think it would be a significant enhancement these days.

As an indicator, our application gains 30-70% total performance on NVidia cards by using NV_Fence, and ARB_Sync should allow us to gain even more. It would also seem to fit in nicely with the general direction of both GPUs and software developers' expectations.
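To make the idea concrete, here is a toy model in plain C of what a fence gives you: a marker in the command stream that the CPU can test without blocking, instead of stalling until everything is done. This is only a sketch and not real OpenGL or NV_Fence code; the names ToyQueue, ToyFence and the toy_* functions are all made up for illustration, and the "GPU" is simulated with a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy command stream: every submitted command gets the next sequence
   number, and the simulated GPU retires commands in order. */
typedef struct {
    uint64_t submitted; /* sequence number of the last command submitted */
    uint64_t retired;   /* sequence number of the last command completed */
} ToyQueue;

/* A fence remembers the sequence number that was current when it was set. */
typedef struct {
    uint64_t seq;
} ToyFence;

/* Submit one (imaginary) command. */
static void toy_submit(ToyQueue *q) { q->submitted++; }

/* Place a fence after everything submitted so far. */
static ToyFence toy_set_fence(const ToyQueue *q) {
    ToyFence f = { q->submitted };
    return f;
}

/* Non-blocking test: has everything before the fence completed? */
static bool toy_test_fence(const ToyQueue *q, ToyFence f) {
    return q->retired >= f.seq;
}

/* Stand-in for the GPU making progress by up to n commands. */
static void toy_gpu_advance(ToyQueue *q, uint64_t n) {
    assert(q->retired <= q->submitted);
    q->retired += n;
    if (q->retired > q->submitted) q->retired = q->submitted;
}

/* Poll the fence until it passes and return the number of polls.
   A real application would do useful CPU work (or sleep) between polls;
   here each poll just lets the simulated GPU retire one command. */
static int toy_poll_until_done(ToyQueue *q, ToyFence f) {
    int polls = 0;
    while (!toy_test_fence(q, f)) {
        toy_gpu_advance(q, 1);
        polls++;
    }
    return polls;
}
```

The point of the model is that the fence only covers the commands submitted before it, so the CPU can keep queuing later work and check back cheaply, rather than draining the whole pipeline the way a full finish does.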

I think ARB_Sync is something internal to the driver, and buffer objects already have it.

ARB_Sync was a proposal (basically signed off on, I once thought) for synchronisation primitives for OpenGL, similar to but significantly better than NV_Fence.

I can only imagine that there are some hushed-up IP reasons why it has been swept under the rug, as the performance advantages are significant, and implementations effectively need this support internally anyway.

All very disappointing.

We can only speculate, but surely something along these lines is underway already. I think from the general tone of the Siggraph Asia presentation (and the scuttlebutt on the street) we can infer that a revamping of display lists with an eye on concurrency is imminent in the versions to come. Sync in one form or another seems likely to be a central theme going forward.

With OpenCL capable drivers due soon, the obvious thing to do would be to copy the Event objects from the OpenCL spec to the OpenGL spec and make them a shareable resource.

When using a CL context created from a GL context and transferring data between them using buffer objects, it will be necessary to synchronise the two command buffers so that they operate on the shared data in the correct order, without ever stalling the GPU or requiring a CPU/GPU synchronization such as glFinish.

The GL command queue could do an EnqueueWaitForEvents on CL events, and a CL queue could likewise wait for GL to complete some operations and set an event before it continues.
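A rough illustration of that ordering, as a toy model in plain C: two command queues share a table of events, and a queue simply stalls at a wait command until the other queue has signalled the event. None of this is real OpenGL or OpenCL API; Cmd, Queue, queue_step and run_queues are hypothetical names, and the round-robin loop stands in for whatever the hardware scheduler actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { CMD_WORK, CMD_SIGNAL, CMD_WAIT } CmdKind;

typedef struct {
    CmdKind kind;
    int event; /* index into the shared event table (ignored for CMD_WORK) */
    char tag;  /* label written to the execution log for CMD_WORK */
} Cmd;

typedef struct {
    const Cmd *cmds;
    size_t count;
    size_t next; /* index of the next command to execute */
} Queue;

/* Try to run one command from q. Returns true if the queue advanced,
   false if it is empty or stalled on an unsignalled event. */
static bool queue_step(Queue *q, bool *events, char *log, size_t *log_len) {
    if (q->next >= q->count) return false;
    const Cmd *c = &q->cmds[q->next];
    assert(c->kind == CMD_WORK || c->event >= 0);
    if (c->kind == CMD_WAIT && !events[c->event]) return false; /* stalled */
    if (c->kind == CMD_SIGNAL) events[c->event] = true;
    if (c->kind == CMD_WORK) log[(*log_len)++] = c->tag;
    q->next++;
    return true;
}

/* Alternate between both queues until neither can advance.
   log must have room for every CMD_WORK tag plus a terminating NUL.
   Returns the number of work commands executed. */
static size_t run_queues(Queue *a, Queue *b, bool *events, char *log) {
    size_t log_len = 0;
    bool progressed;
    do {
        progressed = queue_step(a, events, log, &log_len);
        progressed = queue_step(b, events, log, &log_len) || progressed;
    } while (progressed);
    log[log_len] = '\0';
    return log_len;
}
```

For example, if the "GL" queue is work A, signal event 0, wait event 1, work D, and the "CL" queue is wait event 0, work B, work C, signal event 1, the work always executes in the order A, B, C, D, and the CPU never has to block to enforce it.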

It would also be nice if we could synchronize the OpenCL work, or a background FBO GL context, so they are resumed when the primary GL context calls SwapBuffers, and suspended when the VSYNC occurs, so the primary GL context always has priority use of the GPU and is much less likely to drop a frame.

Why not extend the query object API?

Apparently sooner than anyone thought. Seems NVIDIA just released CL drivers to developers today.

Good news!

These thoughts are all very well, but what on earth happened to ARB_sync? It appears to be well designed, solves some very big problems, and was apparently accepted by everyone then involved in ARB decisions, as far as I can tell…

The runtime performance penalty we face without it (i.e. anywhere outside NVidia with their basic NV_Fence) is monstrous, and the drivers must implement something similar internally…

Please expose it, it cannot be that hard!