How to use triple buffering with OpenGL

Hi

I would like to know how you can do triple buffering with OpenGL. I know it’s no problem with DirectX, but I think all the OpenGL initialization stuff is a bit of a mess because you use a DC of a window instead of just using your own surfaces like in DirectX. I know that you only get double buffering because there’s a flag that enables it, but there’s no flag for triple buffering. Is this possible with OpenGL at all?
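To show what I mean, the usual setup looks roughly like this (error handling trimmed): PFD_DOUBLEBUFFER is the only buffering flag the pixel format offers, and there is nothing like a triple-buffer equivalent:

```cpp
// Minimal sketch of the standard WGL setup: PFD_DOUBLEBUFFER is the only
// buffering flag available -- there is no "PFD_TRIPLEBUFFER" or similar.
#include <windows.h>
#include <GL/gl.h>

bool SetupPixelFormat(HDC hdc)
{
    PIXELFORMATDESCRIPTOR pfd = {};
    pfd.nSize      = sizeof(pfd);
    pfd.nVersion   = 1;
    pfd.dwFlags    = PFD_DRAW_TO_WINDOW | PFD_SUPPORT_OPENGL
                   | PFD_DOUBLEBUFFER;   // double buffering: on or off, nothing in between
    pfd.iPixelType = PFD_TYPE_RGBA;
    pfd.cColorBits = 32;
    pfd.cDepthBits = 24;

    int format = ChoosePixelFormat(hdc, &pfd);
    if (format == 0)
        return false;
    return SetPixelFormat(hdc, format, &pfd) != FALSE;
}
```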

Thanks

Hmm… I’m not sure if you can enable
triple buffering with OpenGL… I can enable it in the drivers of my gfx card, though…

Greets, XBTC!

It’s the swap_control extension, I believe.

However, triple-buffering increases the
latency from when you read the controls to
when you actually see the frame, so it’s very
seldom a win.

I don’t think the win_swap_control extension does triple buffering. To my knowledge this isn’t possible in Windows with OpenGL, but I’d love to be proved wrong.

Can someone tell me what the advantage of triple-buffering over “simple” double-buffering is?

Originally posted by haust:
Can someone tell me what the advantage of triple-buffering over “simple” double-buffering is?

It can improve frame rate in some scenarios when vsynch is enabled. Say your monitor is running at 100 Hz (10 ms between vsynchs), and your rendering takes 11 ms per frame. With double buffering, that means after each frame is rendered you have to wait 9 ms before you can start on the next frame. The end result is that it takes 20 ms per frame, giving you 50 FPS.

With triple buffering you start with buffer 1 visible and render to buffer 2. Then, while waiting for buffer 2 to flip, you begin rendering to buffer 3. By the time you finish buffer 3, buffer 2 should be visible and buffer 1 unused, so you can flip buffer 3 to the front and begin rendering to buffer 1 while waiting for that flip to complete. If you work out all the timings, you will see that this technique gets you about 90 FPS rather than the 50 FPS you got with double buffering.
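If it helps, here is a small throwaway program (just a model of the timing, not code against any real API) that works through those numbers, assuming a fixed 11 ms render time and a 100 Hz refresh with vsynch on:

```cpp
// Sanity check of the arithmetic above: fixed render time, fixed refresh
// period, vsync on.  Purely a timing model, not driver code.
#include <algorithm>
#include <cmath>
#include <cstdio>

// Next vsync at or after time t, with vsyncs at 0, R, 2R, ... (all in ms).
static double nextVsync(double t, double R) { return std::ceil(t / R) * R; }

int main()
{
    const double R = 10.0;  // refresh period (100 Hz)
    const double T = 11.0;  // render time per frame
    const int    N = 500;   // frames to simulate

    // Double buffering: the next frame cannot start until the flip completes.
    double start = 0.0, display = 0.0;
    for (int n = 0; n < N; ++n) {
        display = nextVsync(start + T, R);
        start   = display;
    }
    std::printf("double buffered: ~%.1f FPS\n", 1000.0 * N / display);

    // Triple buffering: frame n+1 starts as soon as frame n is finished and a
    // buffer is free, i.e. once frame n-1 has actually reached the screen.
    start = 0.0;
    double displayPrev = -R;
    for (int n = 0; n < N; ++n) {
        double done = start + T;
        display     = std::max(nextVsync(done, R), displayPrev + R);
        start       = std::max(done, displayPrev);  // earliest start for frame n+1
        displayPrev = display;
    }
    std::printf("triple buffered: ~%.1f FPS\n", 1000.0 * N / display);
    return 0;
}
```

With these numbers it settles at exactly 50 FPS double buffered and roughly 90 FPS triple buffered, which is where the figures above come from.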

Originally posted by bgl:
However, triple-buffering increases the
latency from when you read the controls to
when you actually see the frame, so it’s very
seldom a win.

Not true at all; actually, quite the opposite. Triple buffering (in some cases) improves frame rate, which will decrease latency. Triple buffering is not the same as queueing up multiple frames (which can increase latency). The point of triple buffering is that while one buffer is visible, and the second buffer is waiting for a vsynch to complete its flip, you have a third buffer in which to begin rendering the next frame, so that hopefully you can complete it without missing the following vsynch. In the worst-case scenario (where you always finish rendering just before the vsynch), double buffering and triple buffering will perform identically (not counting the extra memory wasted on the unneeded third buffer in that case).

P.S.
OK, After thinking about it, actually I take some of this back. In cases where you can complete 2 or more frames between 2 consecutive vsynchs, triple buffering will have the effect of queueing up 1 extra frame, and will increase latency by 1 frame.


Dude, don’t even think of smacking me down
on the subject of latency. I may be an idiot
on some topics, but latency ain’t it :)

> OK, After thinking about it, actually I
> take some of this back. In cases where you
> can complete 2 or more frames between 2
> consecutive vsynchs, triple buffering will
> have the effect of queueing up 1 extra
> frame, and will increase latency by 1 frame.

Yes. And, for those who haven’t thought it
through: you have to wait one frame to see
your results at 50 fps (double-buffered),
which means 20 ms. You have to wait TWO
frames to see your result at 90 fps (triple-
buffered), which means 22 ms.

Of course, in the real world, you typically
don’t have this “near miss”; the average
frame time might be between one and two
frames (thus 15 ms in your example) which
results in still 50 fps/20ms for the double
buffer case, but 67 fps/30ms for the triple
buffer case.

Now if you’re benchmarking, triple-buffering
makes lots of sense. But don’t get me
started on benchmarking… :)

Originally posted by bgl:

Yes. And, for those who haven’t thought it
through: you have to wait one frame to see
your results at 50 fps (double-buffered),
which means 20 ms. You have to wait TWO
frames to see your result at 90 fps (triple-
buffered), which means 22 ms.

Wrong. If you’re running at a 100 Hz monitor refresh rate, starting on a timeline that begins at 0 ms:

Assuming 11ms render time with double buffering:
*frame 0 starts at 0ms, completes at 11ms and is displayed at 20ms
*frame 1 starts at 20ms, completes at 31ms and is displayed at 40ms
*frame 2 starts at 40ms, completes at 51ms and is displayed at 60ms

Assuming 11ms render time with triple buffering:
*frame 0 starts at 0ms, completes at 11ms and is displayed at 20ms
*frame 1 starts at 11ms, completes at 22ms and is displayed at 30ms
*frame 2 starts at 22ms, completes at 33ms and is displayed at 40ms
*frame 3 starts at 33ms, completes at 44ms and is displayed at 50ms
*frame 4 starts at 44ms, completes at 55ms and is displayed at 60ms

remembering that any input before a frame starts is viewed when that frame is displayed:
*input at time 0ms is displayed at 20ms with double buffering and at 20ms with triple buffering
*input at time 10ms is displayed at 40ms with double buffering and at 30ms with triple buffering
*input at time 19ms is displayed at 40ms with double buffering and at 40ms with triple buffering
*input at time 21ms is displayed at 60ms with double buffering and at 40ms with triple buffering

Clearly, in this case latency is better with triple buffering. I did pick one of the better-case scenarios for my example, but in summary: any time your render code runs slower than the refresh rate, triple buffering will help, and the smaller the ratio of “(render time) : (render time + flip delay)”, the more triple buffering helps.
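For anyone who wants to check the tables, here is the same timing model extended to input latency, under the assumption that an input arriving at time t is picked up by the first frame that starts after t and becomes visible when that frame is displayed:

```cpp
// Timing model for the latency tables above: fixed render time T, refresh
// period R, vsync on.  An input at time t is shown when the first frame that
// starts at or after t reaches the screen.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

static double nextVsync(double t, double R) { return std::ceil(t / R) * R; }

struct Frame { double start, display; };

// Frame start/display times for double (bufs==2) or triple (bufs==3) buffering.
static std::vector<Frame> simulate(int bufs, double T, double R, int frames)
{
    std::vector<Frame> out;
    double start = 0.0, displayPrev = -R;
    for (int n = 0; n < frames; ++n) {
        double done    = start + T;
        double display = std::max(nextVsync(done, R), displayPrev + R);
        out.push_back({start, display});
        start = (bufs == 2) ? display                     // must wait for the flip
                            : std::max(done, displayPrev); // or just for a free buffer
        displayPrev = display;
    }
    return out;
}

// Display time of the first frame that starts at or after time t.
static double shownAt(const std::vector<Frame>& f, double t)
{
    for (const Frame& fr : f)
        if (fr.start >= t) return fr.display;
    return -1.0;  // ran out of simulated frames
}

int main()
{
    const double R = 10.0, T = 11.0;
    std::vector<Frame> dbl = simulate(2, T, R, 64);
    std::vector<Frame> tri = simulate(3, T, R, 64);

    for (double t : {0.0, 10.0, 19.0, 21.0})
        std::printf("input at %2.0f ms -> visible at %3.0f ms (double) vs %3.0f ms (triple)\n",
                    t, shownAt(dbl, t), shownAt(tri, t));
    return 0;
}
```

It reproduces the display times listed above, and changing T to 15.0 reproduces the 15 ms example further down.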

Originally posted by bgl:
Of course, in the real world, you typically
don’t have this “near miss”; the average
frame time might be between one and two
frames (thus 15 ms in your example) which
results in still 50 fps/20ms for the double
buffer case, but 67 fps/30ms for the triple
buffer case.

Wrong. With triple buffering, you typically don’t buffer up an extra frame unless your rendering code is running faster than the refresh rate. The third buffer just gives you a place to start working on frame X+1 while you are waiting for the last flip (frame X) to complete. By the time you finish frame X+1, frame X’s flip has completed, and your new frame is ready to be displayed on the VERY NEXT vsynch. Since you don’t have to wait any additional vsynchs, but you started on the frame earlier than you would have with double buffering, the result is that your frame gets displayed at the same time as or earlier than it would have with double buffering.

Like I said, a latency of 1 frame IS an issue if you are rendering faster than the refresh rate. Now, this is a debatable issue, but personally I feel that if your code is already rendering at 60 Hz or higher, it’s not that big a deal and I can stand a single frame of latency. However, with a lot of cutting-edge games you are lucky to be able to run at the refresh rate at all, and triple buffering is often a win (if you can afford the extra video memory).

To examine your 15ms example:
Assuming 15ms render time with double buffering:
*frame 0 starts at 0ms, completes at 15ms and is displayed at 20ms
*frame 1 starts at 20ms, completes at 35ms and is displayed at 40ms
*frame 2 starts at 40ms, completes at 55ms and is displayed at 60ms

Assuming 15ms render time with triple buffering:
*frame 0 starts at 0ms, completes at 15ms and is displayed at 20ms
*frame 1 starts at 15ms, completes at 30ms and is displayed at 30ms
*frame 2 starts at 30ms, completes at 45ms and is displayed at 50ms
*frame 3 starts at 45ms, completes at 60ms and is displayed at 60ms

remembering that any input before a frame starts is viewed when that frame is displayed:
*input at time 0ms is displayed at 20ms with double buffering and at 20ms with triple buffering
*input at time 14ms is displayed at 40ms with double buffering and at 30ms with triple buffering
*input at time 16ms is displayed at 40ms with double buffering and at 50ms with triple buffering
*input at time 21ms is displayed at 60ms with double buffering and at 50ms with triple buffering

While in some cases the latency for individual inputs can be higher with triple buffering, the average latency is lower.

Consider yourself somewhat “smacked down” ;)


I’d be glad to accept a smack-down for half
of the story: I forgot the qualifier that
triple buffering gets really bad when you’re
rendering faster than the scan rate. Absolutely
and positively my bad (especially for going
against the current specific example).

However, you are measuring latency in a
different way from me (I almost wrote “wrong” :) ).

When measuring latency, you must always
measure MAXIMUM latency. That’s all that
counts. A latency that jitters but
“averages” or “best-cases” at some smaller
number is WORSE than a steady, constant
slightly higher latency. If you don’t believe
me, look it up in some psychology research
papers or something. For achieving precision
control, jitter is a very bad thing.
(Of course, I’m assuming precision control is
the goal here – you may have other goals).

The maximum latency of triple-buffering is
TWO frames of rendering time; the maximum
latency of double-buffering is ONE frame of
rendering time; the benefit of double-buffering
is rock-solid latency (very low jitter).
Although you can easily construct SPECIFIC
cases where either mechanism comes out on
top, I choose double-buffering for the general
purpose solution because of this.

Thought: I now realize why we’re looking at
this differently. I’ve been working a long
time with systems where your production time
must be shorter than your presentation
interval (pro audio & video). I guess in
games that’s not the common case. Once
again, I realize that my hobby of OpenGL mucking around is different enough that I
have to control my ingrained reflexes :slight_smile:
(which makes it that much more interesting)


Well, going back to the original question, OpenGL in Windows doesn’t support triple buffering. As far as I know, the swap_control extension (WGL_EXT_swap_control) just lets apps set the swap interval, i.e. enable or disable waiting for vsync when swapping the buffers.
That’s it.
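For completeness, this is roughly all that extension gives you (assuming a current rendering context); it sets the swap interval and nothing else:

```cpp
// Minimal sketch of WGL_EXT_swap_control: it only sets the swap (vsync)
// interval -- there is nothing in it that requests a third buffer.
// Assumes an OpenGL rendering context is already current.
#include <windows.h>
#include <GL/gl.h>

typedef BOOL (WINAPI *PFNWGLSWAPINTERVALEXTPROC)(int interval);

void SetVSync(int interval)   // 1 = wait for vsync on SwapBuffers, 0 = don't
{
    PFNWGLSWAPINTERVALEXTPROC wglSwapIntervalEXT =
        (PFNWGLSWAPINTERVALEXTPROC)wglGetProcAddress("wglSwapIntervalEXT");
    if (wglSwapIntervalEXT)   // extension string check omitted for brevity
        wglSwapIntervalEXT(interval);
}
```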

I believe we have triple buffering implemented, but there’s just no way to expose it. Maybe we should have an extension.

  • Matt

Originally posted by bgl:
Thought: I now realize why we’re looking at this differently. I’ve been working a long time with systems where your production time must be shorter than your presentation
interval (pro audio & video). I guess in
games that’s not the common case.

Oh, yes!!! That ABSOLUTELY explains our difference of opinion on the topic. Clear enough. In games, I tend to just say “Well, if it is really running that fast, you can deal with it”. Rendering faster than refresh is not a common thing in 3D games, so I tend to favor the more common case, and this is where triple buffering helps.

If it helps others see things a little more clearly, I did make up some charts that demonstrate, at some common refresh rates, how double and triple buffering compare in terms of FPS and average latency at various rendering times from 1 to 50 ms. You brought up this topic of max latency after I created the charts so I don’t have one for it, but I did run the numbers and look at them, and the shape of the chart for max latency is approximately the same as that for average latency. If you want to see it, it’s at: http://www.ronfrazier.net/buffering/buffering.html

Matt,
Yes, an extension for this would be great.

Also, would it be possible to make it so that you can toggle between double and triple buffering at will? This would allow an application to sample its render time when it starts up and, after a while, switch between double and triple buffering if that would decrease latency on the system. I realize that you would probably have to keep the 3rd buffer’s memory allocated even when you switch to double buffering, but that’s understandable. Just an idea.
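Something like this is what I have in mind on the application side; the toggle itself would need the (currently nonexistent) extension, so it only appears as a comment here, and the measured numbers are made up for the example:

```cpp
// Sketch of the idea above, assuming the app measures its own frame times.
// The actual switch would need the hypothetical extension, so it is only a
// comment placeholder in this sketch.
#include <cstdio>

bool WantTripleBuffering(double avgRenderMs, double refreshPeriodMs)
{
    // Slower than refresh: a third buffer tends to raise FPS and lower latency.
    // Faster than refresh: stay double buffered to avoid the extra queued frame.
    return avgRenderMs > refreshPeriodMs;
}

int main()
{
    double avgRenderMs     = 15.0;  // assumed: measured over the first few hundred frames
    double refreshPeriodMs = 10.0;  // assumed: 100 Hz display mode
    std::printf("triple buffering %s\n",
                WantTripleBuffering(avgRenderMs, refreshPeriodMs)
                    ? "recommended (the hypothetical toggle call would go here)"
                    : "not needed");
    return 0;
}
```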

I’d love to see an extension too if it can’t be exposed in any other way. A driver setting for forcing triple buffering would be nice too.

Since having vsync on is normally what you want, having no triple buffering in GL seems to put it at a disadvantage in real-world performance compared to D3D. And IMHO the latency issue is rather negligible compared to the performance gains.

“The latency issue is negligible compared
to the performance gains” huh?

Depends on how you’re measuring performance.
My preferred measurement is “max latency”,
closely followed by “latency jitter”.

Anyway, it seems you could do the best of
both worlds in the driver. If the program
asks the driver to flip, and the last vsync
already displayed a new buffer, wait until
the next vsync to display AND return to the
program. This ensures that you’re not triple
buffering if you run faster than the frame
refresh, which is important for latency.

If the last vsync did not display a new
buffer, you know that you’re running slower
than the refresh, and thus you should put
the completed buffer in a queue (unless there
is already one in the queue, in which case I
guess you replace it, which may result in an
unresolvable race with the hardware unless
the hardware is well designed – but I’m
digressing :-). In this case, you should
return immediately, letting the program start
on the new frame before the next vsync,
since it’s very likely that new frame will
also take longer than one vsync interval to
render.

This could be done transparently, with no
need for an extension at all, unless you want
to allow the unlimited-speed option for
those people who just HAVE to see a three-
digit FPS counter. The cost is one additional
frame buffer (if you compare to double-
buffering) which may or may not actually be
necessary, depending on rendering speed.
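In pseudocode, the driver-side logic I am describing would look something like this (none of these names are real driver entry points; they just stand in for whatever the driver does internally):

```cpp
// Pseudocode for the swap heuristic described above.
struct SwapState {
    bool newBufferShownAtLastVsync; // updated by the vsync interrupt handler
    int  queuedBuffer;              // -1 if nothing is waiting for the next vsync
};

// Placeholder stubs standing in for the driver's real machinery:
static void WaitForVSync()        { /* block until the next vertical retrace */ }
static void DisplayBuffer(int id) { /* program scanout to show buffer 'id' */ (void)id; }

void OnSwapRequest(SwapState& s, int justFinishedBuffer)
{
    if (s.newBufferShownAtLastVsync) {
        // The app is keeping up with the refresh rate: behave like plain double
        // buffering and block, so it cannot run ahead and build up latency.
        WaitForVSync();
        DisplayBuffer(justFinishedBuffer);
        // ...and only now return to the application.
    } else {
        // The app is slower than the refresh rate: queue the frame (replacing
        // any older queued frame) and return immediately, so rendering of the
        // next frame can begin before the vsync arrives.
        s.queuedBuffer = justFinishedBuffer;
        // The vsync handler displays s.queuedBuffer when it fires.
    }
}
```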

Originally posted by bgl:

This could be done transparently, with no
need for an extension at all, unless you want
to allow the unlimited-speed option for
those people who just HAVE to see a three-
digit FPS counter. The cost is one additional
frame buffer (if you compare to double-
buffering) which may or may not actually be
necessary, depending on rendering speed.

But that extra buffer is why we need the extension: you can’t just go around taking away video memory from an application that isn’t expecting it. Some apps may ship with the minimum requirements on the box listing 16MB, when in fact 16MB may not be sufficient to support the extra buffer. But it is a great idea. We would just need an extension for it.

Hehe, another thing to put on their wish-list at nVidia! Come on guys, we are waiting…

Regards.

Eric

P.S.: if others could do it as well, that would be nice…

> Some apps may ship with the minimum
> requirements on the box listing 16MB, when
> in fact 16MB may not be sufficient to
> support the extra buffer.

But there’s already some variability because
the GDI desktop framebuffer may be 1600x1200
in 32 bits, or just 800x600 in 8 bits, right?

Or is the OpenGL implementation such that a
GL ICD can talk to the driver and throw away
the “desktop” framebuffer when going into
full-screen mode?

At some level, I think this means that an
OpenGL implementation needs to keep a copy
of all textures in memory, because after
upload you might lose the copy on the card,
but the user would still expect it to be
there. So, at load time, you might have THREE
copies of the texture:

  1. in my file-load buffer
  2. in the RAM backing store for when you lose it
  3. in the RAM on the card

Yow!
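For what it’s worth, the upload path being described looks roughly like this; copy 1 is the app’s pixel buffer, and (as far as I know) the ICD keeps its own system-memory copy after glTexImage2D so it can restore the texture if the on-card copy is lost:

```cpp
// Sketch of the upload path, assuming a current GL context and an RGBA image
// already loaded from disk.  Copy 1 is 'pixels'; copies 2 and 3 (the driver's
// system-memory backing store and the on-card copy) are owned by the ICD
// after glTexImage2D returns.
#include <GL/gl.h>

GLuint UploadTexture(const unsigned char* pixels, int width, int height)
{
    GLuint tex = 0;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, pixels);
    // The app's file-load buffer (copy 1) can be freed now; the driver keeps
    // whatever backing copy it needs to survive a lost video-memory surface.
    return tex;
}
```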

Originally posted by bgl:
But there’s already some variability because
the GDI desktop framebuffer may be 1600x1200
in 32 bits, or just 800x600 in 8 bits, right?

Or is the OpenGL implementation such that a
GL ICD can talk to the driver and throw away
the “desktop” framebuffer when going into
full-screen mode?

Well, no, because I think the desktop framebuffer becomes the front buffer when an OpenGL app starts. And some applications just start up and switch straight to 640x480x16 or some other preset resolution without providing the option to change it. I know that’s not the best thing to do, but I believe even Diablo 2 does that sort of thing (not that Diablo 2 was written in OpenGL, just using it as an example). So an app like that can know exactly what its requirements are, and may not be able to cope with less than that.