Synchronising SwapBuffers with vertical retrace

Hi all,

I’m trying to minimise the time spent in the SwapBuffers() call by attempting to call it as close to the actual vertical retrace as possible.
However, I have no idea how to accurately determine when the next vertical retrace will occur. I’ve come up with the following approach:

BTW all ‘times’ are recorded by doing a QueryPerformanceCounter() call into an __int64 variable. I’m using Windows 98 and Visual C++ 6. Assume an 85 Hz refresh rate.

  1. remember when the last SwapBuffers() returned (= lastFrame)
  2. next SwapBuffers should be at lastFrame + 1/85 sec minus 1 ms (= nextSwap)
  3. issue frame drawing commands
  4. as long as we’re not at nextSwap: do stuff to prepare next frame
  5. as soon as nextSwap passes, issue SwapBuffers()
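
For what it's worth, steps 1, 2 and 4 above could be sketched roughly like this in C++. The helper names (nextSwapTicks, canKeepPreparing) and their parameters are made up for illustration; ticksPerSecond is assumed to be the value QueryPerformanceFrequency() reports, and all times are raw counter ticks.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the original post): computes the tick
// count at which SwapBuffers() should be issued, i.e. one refresh period
// after the previous swap returned, minus a small safety margin.
std::int64_t nextSwapTicks(std::int64_t lastFrame,
                           std::int64_t ticksPerSecond,
                           int refreshHz,
                           int safetyMarginMs)
{
    assert(ticksPerSecond > 0 && refreshHz > 0 && safetyMarginMs >= 0);
    const std::int64_t framePeriod = ticksPerSecond / refreshHz;
    const std::int64_t margin = ticksPerSecond * safetyMarginMs / 1000;
    return lastFrame + framePeriod - margin;
}

// Step 4: true while there is still time to prepare the next frame.
bool canKeepPreparing(std::int64_t now, std::int64_t nextSwap)
{
    return now < nextSwap;
}
```

With a 1 MHz counter and 85 Hz refresh, the period is 11764 ticks, so a swap that returned at tick 0 puts nextSwap at tick 10764.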

This works fine as long as I don’t miss any frames (i.e. the actual FPS is 85): time spent in 3 is approx. 10%, in 4 approx. 80%, and in 5 approx. 8%.

But as soon as I draw too much (or the window is too large) the FPS drops, as expected, but the time spent in 4 drops to 50% and time spent in 5 increases to 40%.

The question therefore is: is there a better way to know or predict the correct time to call SwapBuffers(), in order to minimise the time spent in SwapBuffers()?

Many TIA

Jean-Marc.

take a look at wglSwapIntervalEXT.

Originally posted by AdrianD:
take a look at wglSwapIntervalEXT.

Thanks Adrian, but wglSwapIntervalEXT only allows me to specify the minimum number of screen refreshes that will occur before the next SwapBuffers finishes. That is not what I’m looking for.
I want to find a way to call SwapBuffers as late as possible while still making it before the next screen refresh…

Anyone else?

Jean-Marc

I know there is a function for X under Unix: glXWaitX. I think (only think) it lets you wait for the refresh of the screen.
There may be something similar under Windoze.

Otherwise, you could find out your monitor’s refresh rate and try to draw once per refresh.

VBLtime (in secs): 1.0 / videoFreq (videoFreq in Hz)
VBLcycles: (1000000 * cpuFreq) / videoFreq (assuming cpuFreq is in MHz)
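
As a rough worked example of those two formulas, assuming cpuFreq is given in MHz and videoFreq in Hz (the function names are hypothetical):

```cpp
#include <cassert>

// Length of one vertical-blank period in seconds.
double vblTimeSeconds(double videoFreqHz)
{
    assert(videoFreqHz > 0.0);
    return 1.0 / videoFreqHz;
}

// CPU cycles available per vertical-blank period.
// Assumption: cpuFreqMHz is in MHz, hence the factor of 1000000.
double vblCycles(double cpuFreqMHz, double videoFreqHz)
{
    assert(cpuFreqMHz > 0.0 && videoFreqHz > 0.0);
    return 1000000.0 * cpuFreqMHz / videoFreqHz;
}
```

At 85 Hz this gives roughly 11.76 ms per refresh, and an 850 MHz CPU would have about 10 million cycles per refresh.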

But I don’t quite understand your problem.

  1. If you can render within the refresh period (which is the best case), then waiting for the VBL is not a problem, because your loop is already done.
  2. If your app’s raster time makes you miss the VBL, then you’re in a world of s.h.i.t, because with or without synced buffer swaps the display will look crappy and/or ugly or whatever. ;)
    (I mean refresh bars for every missed frame, in your face, compared to nice and smooth rendering when you stay within the vertical refresh period. ;) )
    Just don’t miss the frame, that’s all. :)

GL

Thanks all for your replies.

I know it would be best to ensure drawing is completely finished within one frame. However, I’m trying to develop some basic rendering code that I expect to build upon. This basic loop will be:

  • draw frame primitives
  • do other stuff & precalculate next frame(s) as long as refresh is not yet needed
  • swap the buffers
    until the app is done.

I cannot beforehand know that all drawing will always be completely done within a single screen refresh period.

My attempts are aimed at giving the second step as big a slice of the remaining frame time as possible, while keeping the FPS as high as possible.

If I call SwapBuffers the system does three things (I think):

  1. finish up all drawing (if any drawing remains to be done)
  2. wait for right time to swap
  3. perform the swap

AFAIK there is no way for me to know how much time is spent in each of these three steps.

Now, assuming I can predict 2 (based on the previous frame time and the refresh interval), and assuming a fixed time is needed for 3, it would be great if I could find out beforehand whether step 1 (all drawing) has completed.

TIA

Jean-Marc

Originally posted by JML:
This basic loop will be:

  • draw frame primitives
  • do other stuff & precalculate next frame(s) as long as refresh is not yet needed
  • swap the buffers
    until the app is done.

Well, precalculating the next frame may be interesting in a non-T&L situation.
(I mean if you need to do projection, lighting, etc. yourself and then send all your primitives over the bus to the board.)
A 100% hardware-T&L program using the appropriate extensions shouldn’t need precalculation for the next frame because:

  1. The CPU is not the bottleneck.
    (I’m always assuming that display takes up the major part of the raster time.)
  2. The most important thing is to make the GPU start its work as soon as the new frame starts (because you cannot start drawing a new frame while the back buffer is still filled ;)

Originally posted by JML:
I cannot beforehand know that all drawing will always be completely done within a single screen refresh period.

You can ;)

On many GL implementations, switching to asynchronous mode when swapping buffers enables the original and interesting behaviour of glFinish().
Note that in sync mode glFinish() will, most of the time and on most implementations, return just before the end of the VBL (before the swap), which generally doesn’t correspond to any useful idle state,
which of course s.u.c.k.s :(

Anyway…
Let’s say that the GPU will always finish its work after the CPU (you should be working with more than one spinning cube then!! eheheh)

(Before swapping / endFrame)

  1. First check the time/cycles spent by your CPU tasks. (As you know the VBL time/cycles, you can easily calculate the percentage of the frame used.)
  2. In asynchronous mode, just check the elapsed time/cycles after your call to glFinish() returns, and then you’ve got your GPU occupancy.

Well, an approximation of it! ;) But it helps with plenty of … things, as you can imagine! ;)

now you can swap without fear! ;))

Originally posted by Ozzy:
Well, precalculating the next frame may be interesting in a non-T&L situation.

By precalculating I mean determining object positions, interactions, particle updates etc. Not actually pre-rendering the frames!

On many GL implementations, switching to asynchronous mode when swapping buffers enables the original and interesting behaviour of glFinish().
Note that in sync mode glFinish() will, most of the time and on most implementations, return just before the end of the VBL (before the swap), which generally doesn’t correspond to any useful idle state,
which of course s.u.c.k.s

Ozzy, by asynchronous mode you mean disabling vsync right?

So you are saying:

  1. disable vsync
  2. call glFinish
  3. enable vsync
  4. swapbuffers

? That will allow me to spend some extra time in 3, but won’t 2 consume a big chunk of time if there is a lot of GPU work pending?

Perhaps I misunderstand.

Anyways, thinking and talking about this made me remember having heard about an extension called NV_fence and I guess that may help me greatly. (until now I’ve not been using any extensions in my OpenGL dabblings).

Again thanks to all for the help.

Jean-Marc.

Originally posted by JML:
By precalculating I mean determining object positions, interactions, particle updates etc. Not actually pre-rendering the frames!

Why not. :)

Originally posted by JML:

So you are saying:

  1. disable vsync
  2. call glFinish
  3. enable vsync
  4. swapbuffers


Not exactly, you can’t enable/disable vsync within the same frame, so consider this sequence for debug/testing purposes.
btw you can always switch from one mode to another while your app is running. (note that you’ll need to get back to a cool framerate before GL can resynchronise)

so it is like this:
(vsync can be either on/off, it only affects glFinish behavior)

  1. get cpu elapsed time/cycles
  2. call glFinish
  3. get gpu elapsed time/cycles
  4. swapbuffers
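
A minimal sketch of how the two measurements in that sequence could be turned into percentages of the refresh period. This is only one reading of the steps above: the struct, the function name and the choice to measure both values from the start of the frame are assumptions, and the timestamps would come from whatever timer you use (e.g. QueryPerformanceCounter), taken before and after glFinish().

```cpp
#include <cassert>
#include <cstdint>

struct FrameOccupancy {
    double cpuPercent;  // share of the refresh period used by CPU-side work
    double gpuPercent;  // approximate share used until glFinish() returned
};

// Hypothetical helper: all arguments are timer ticks.
// frameStart   = start of the frame
// beforeFinish = taken just before calling glFinish() (step 1)
// afterFinish  = taken right after glFinish() returned (step 3)
FrameOccupancy frameOccupancy(std::int64_t frameStart,
                              std::int64_t beforeFinish,
                              std::int64_t afterFinish,
                              std::int64_t ticksPerRefresh)
{
    assert(ticksPerRefresh > 0);
    assert(frameStart <= beforeFinish && beforeFinish <= afterFinish);
    const double period = static_cast<double>(ticksPerRefresh);
    FrameOccupancy result;
    result.cpuPercent = 100.0 * static_cast<double>(beforeFinish - frameStart) / period;
    result.gpuPercent = 100.0 * static_cast<double>(afterFinish - frameStart) / period;
    return result;
}
```

If gpuPercent creeps towards 100, the frame is about to miss the retrace, so it is only an approximation but a handy early warning.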

Separate the processing in two threads:

Draw thread:

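while (not_done)
{
    sem_wait(s1);       /* wait until the app thread has prepared a frame */
    issue_gl_calls();
    sem_post(s2);       /* let the app thread start on the next frame */
    swap_buffers();     /* may sleep here until the next retrace */
}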

App thread:

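while (not_done)
{
    do_app_stuff();     /* positions, interactions, particles, ... */
    sem_post(s1);       /* hand the prepared frame to the draw thread */
    sem_wait(s2);       /* wait until the draw thread has issued its GL calls */
}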

Thus the app thread will work on the next frame, while the draw thread sleeps in swap_buffers(), waiting for the next retrace.
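
Here is a small self-contained C++11 sketch of that handshake, in case someone wants to try the ordering without a GL context. Everything in it is a stand-in: the Semaphore class replaces POSIX sem_t, a fixed frame count replaces not_done, and the GL/app work is replaced by recording a letter ('A' = app work, 'D' = draw calls issued, 'S' = swap) so the interleaving can be inspected.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Minimal counting semaphore built on a mutex and condition variable.
class Semaphore {
public:
    void post() {
        { std::lock_guard<std::mutex> lock(mutex_); ++count_; }
        cv_.notify_one();
    }
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }
private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int count_ = 0;
};

// Runs `frames` iterations of both loops and returns the event order.
std::vector<char> runFrames(int frames) {
    assert(frames >= 0);
    Semaphore s1, s2;
    std::mutex logMutex;
    std::vector<char> log;
    auto record = [&](char c) {
        std::lock_guard<std::mutex> lock(logMutex);
        log.push_back(c);
    };
    std::thread draw([&] {
        for (int i = 0; i < frames; ++i) {
            s1.wait();     // frame i has been prepared by the app thread
            record('D');   // placeholder for issue_gl_calls()
            s2.post();     // app thread may start on frame i + 1
            record('S');   // placeholder for swap_buffers()
        }
    });
    std::thread app([&] {
        for (int i = 0; i < frames; ++i) {
            record('A');   // placeholder for do_app_stuff()
            s1.post();
            s2.wait();
        }
    });
    draw.join();
    app.join();
    return log;
}
```

The log always starts with 'A' then 'D'; after that, the swap of frame i and the app work for frame i + 1 may appear in either order, which is exactly the overlap the scheme is after.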

Regards,
-velco

I’m not sure you’re aware of this, but SwapBuffers is non-blocking the first time through on many OpenGL implementations on PCs. It only blocks when you have another swap which hasn’t completed.

This is pretty implementation specific but it’s generally understood to be ‘correct’ behaviour on PCs.

If it’s latency you are worried about you might want to explicitly block after swap, but you pay a price.

Originally posted by dorbie:
I’m not sure you’re aware of this, but SwapBuffers is non-blocking the first time through on many OpenGL implementations on PCs. It only blocks when you have another swap which hasn’t completed.

OK, if this is the case, one can trivially implement the synchronization oneself, with CreateWaitableTimer/WaitForSingleObject in WIN32 or pthread_cond_timedwait in POSIX.
(of course, there are other means to do it)
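
As a portable illustration of the idea (not the Win32 or POSIX calls named above), a timed wait until the planned swap time could look something like this. It uses std::this_thread::sleep_until as a stand-in for the waitable timer, and waitForSwapTime is a hypothetical name; a real program would call SwapBuffers() right after it returns.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// Sleeps until `swapTime` and reports how late the wake-up was.
// Returns immediately (with the current lateness) if swapTime has passed.
std::chrono::microseconds waitForSwapTime(Clock::time_point swapTime)
{
    std::this_thread::sleep_until(swapTime);
    const Clock::time_point now = Clock::now();
    assert(now >= swapTime);  // sanity check: we must not wake up early
    return std::chrono::duration_cast<std::chrono::microseconds>(now - swapTime);
}
```

The returned lateness gives a feel for how coarse the OS timer is, which in turn suggests how large the safety margin before the retrace needs to be.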

This is pretty implementation specific but it’s generally understood to be ‘correct’ behaviour on PCs.

Huh? Who in his right mind would call this correct? It only makes sense for benchmarks; otherwise I can see absolutely no sense in wasting time redrawing the screen when the redrawn image will never appear.