Stopping driver from buffering frames


In our application we often use glCopyPixels to copy from the back buffer to front buffer (usually only small areas of the window need re-rendering each frame, so we get quite a speed up by only rendering the changed areas).

With most cards/drivers this works fine, however some drivers appear to be trying to buffer rendering calls a few frames ahead. Not a problem if we render the whole frame each time and call SwapBuffers, but since we’re not doing that it appears that the driver can’t figure out where the end of the frame is, and hence we sometimes get up to 2-3 seconds of lag in response to mouse movements.

Is there some way of telling the driver where the end of the frame is? Want to avoid having to call glFinish if possible… Or maybe some other way to stop the driver buffering so far ahead?


OpenGL buffers as many commands as the implementation can handle. It’s designed as a client-server architecture.
If you want rendering to start earlier than it otherwise would, issue glFlush. That triggers the rendering but doesn’t wait until it’s done like glFinish does, so it’s the faster method.

Mind that glCopyPixels is actually a 3D command and can generate depth as well as color data, based on the current glRasterPos state. Set your OpenGL state so that you really only do a color copy operation for maximum speed, effectively a 2D BitBlt.
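As a rough sketch of that state setup (untested; the function and parameter names are placeholders, and it assumes the raster position maps 1:1 to window coordinates, e.g. via a matching ortho projection):

```c
#include <windows.h>
#include <GL/gl.h>

/* Copy a rectangle from the back buffer to the front buffer as a pure
 * color blit, then flush so the driver starts on it immediately. */
void copy_region_back_to_front(int x, int y, int w, int h)
{
    glDisable(GL_DEPTH_TEST);   /* no depth test/writes during the copy */
    glDisable(GL_LIGHTING);
    glDisable(GL_TEXTURE_2D);
    glDisable(GL_BLEND);
    glDisable(GL_FOG);
    glPixelZoom(1.0f, 1.0f);    /* 1:1 copy, no scaling */

    glReadBuffer(GL_BACK);      /* source: back buffer */
    glDrawBuffer(GL_FRONT);     /* destination: front buffer */

    glRasterPos2i(x, y);        /* destination position */
    glCopyPixels(x, y, w, h, GL_COLOR);  /* color only, no depth/stencil */

    glDrawBuffer(GL_BACK);      /* restore normal rendering target */
    glFlush();                  /* kick the driver without blocking */
}
```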

You have to select your pixel format carefully if you want to retain your back buffer for such incremental updates (if you ever swap in other cases). Select a PFD_SWAP_COPY format explicitly. ChoosePixelFormat is too dumb and ignores many flags; use DescribePixelFormat to enumerate the formats manually.
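A manual enumeration might look like this (untested sketch; the function name is a placeholder and hdc is assumed to be a valid device context for the target window):

```c
#include <windows.h>

/* Walk all pixel formats with DescribePixelFormat and return the index
 * of the first double-buffered RGBA format that advertises swap-copy
 * behavior. Returns 0 if none is found. */
int find_swap_copy_format(HDC hdc)
{
    PIXELFORMATDESCRIPTOR pfd;
    const DWORD need = PFD_DRAW_TO_WINDOW | PFD_SUPPORT_OPENGL |
                       PFD_DOUBLEBUFFER | PFD_SWAP_COPY;

    /* Calling with a NULL descriptor returns the number of formats. */
    int count = DescribePixelFormat(hdc, 1, sizeof(pfd), NULL);

    for (int i = 1; i <= count; ++i) {
        DescribePixelFormat(hdc, i, sizeof(pfd), &pfd);
        if ((pfd.dwFlags & need) == need &&
            pfd.iPixelType == PFD_TYPE_RGBA &&
            pfd.cColorBits >= 24)
            return i;   /* pass this index to SetPixelFormat */
    }
    return 0;
}
```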

BTW, from what you said about your update mechanism, you shouldn’t expect your application’s rendering performance to scale with multiple GPUs.

Yeah, we do a glFlush immediately after every glCopyPixels and turn off all the 3D stuff. It does work fine on the vast majority of cards.

We’ve always steered away from the PFD_SWAP_COPY stuff because the spec for that extension says it’s only a hint and the implementation is actually free to throw the backbuffer away if it wants (which I’ve always thought makes the extension pretty useless!). Is that not actually true - in which case we’d probably switch over…?

You may try this:

use an offscreen FBO/pbuffer to render your content (incrementally),
do one fullscreen blit to your back buffer each frame,
and call SwapBuffers.

It has higher memory usage (for the offscreen buffer)
and memory bandwidth, since you also copy your updated areas,

but it requires only one fullscreen blit instead of many glCopyPixels calls,
so I assume it should still be faster than your current approach of partially copying to the front buffer.

It should bypass that lag problem.

And anyway, I don’t recommend copying from the front/back buffer,
because it may get corrupted by the operating system (especially in page-flipped fullscreen apps;
e.g. the Windows GDI shares the back buffer with OpenGL and may forget NOT to write into it),
whereas the data in the FBO should be persistent.
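A minimal sketch of that presentation path (untested; assumes the EXT_framebuffer_object/EXT_framebuffer_blit entry points have already been loaded, and fbo/width/height/hdc are set up elsewhere):

```c
#include <windows.h>
#include <GL/gl.h>
#include <GL/glext.h>

/* Render incremental updates into a persistent offscreen FBO, then blit
 * the whole thing to the back buffer once per frame and swap. */
void present_frame(GLuint fbo, int width, int height, HDC hdc)
{
    /* 1. incremental rendering into the persistent offscreen buffer */
    glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fbo);
    /* ... redraw only the dirty regions here ... */

    /* 2. one fullscreen blit: FBO -> default framebuffer (back buffer) */
    glBindFramebufferEXT(GL_READ_FRAMEBUFFER_EXT, fbo);
    glBindFramebufferEXT(GL_DRAW_FRAMEBUFFER_EXT, 0);
    glBlitFramebufferEXT(0, 0, width, height,
                         0, 0, width, height,
                         GL_COLOR_BUFFER_BIT, GL_NEAREST);

    /* 3. swap; the FBO contents stay valid for the next frame */
    SwapBuffers(hdc);
}
```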

glFinish is not that bad. Used correctly, it will work just fine.

It’s very important to find a good place in your application to put glFinish. It’s also important to have your application well organised.

For example:

  1. CPU calculations
  2. rendering commands
  3. glFinish

This is wrong - the GPU starts executing the commands from #2 while the CPU waits for it in #3. Then, while the CPU does the next frame’s calculations in #1, the GPU sits idle.

This is better:

  1. CPU calculations
  2. glFinish
  3. rendering commands
  4. glFlush / swap buffers

This is of course a general example.
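The “better” ordering above, as a frame loop (a sketch; update_simulation() and draw_scene() are hypothetical application functions, hdc is the window’s device context):

```c
#include <windows.h>
#include <GL/gl.h>

void update_simulation(void);   /* hypothetical: per-frame CPU work   */
void draw_scene(void);          /* hypothetical: issues GL commands    */

void frame_loop(HDC hdc)
{
    for (;;) {
        update_simulation();  /* 1. CPU calculations for this frame */
        glFinish();           /* 2. wait for the PREVIOUS frame's GPU work */
        draw_scene();         /* 3. queue this frame's rendering commands */
        SwapBuffers(hdc);     /* 4. swap; GPU renders while we loop to #1 */
    }
}
```

The point of this ordering is that the glFinish in step 2 overlaps with work the GPU was already doing, instead of stalling the CPU right after it submitted a full frame.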

Another approach is to use some GPU readback command to synchronise the GPU and CPU. I think glGetTexImage is a very nice command for that purpose:

  1. bind dummy texture
  2. glGetTexImage
  3. bind other texture
  4. bind FBO to render to dummy texture
  5. render anything to that texture
  6. bind previous FBO

The steps above, done once per frame, should synchronise the CPU with the GPU on a per-frame basis without stalling either one.
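The steps above can be sketched roughly like this (untested; dummy_tex, dummy_fbo and previous_fbo are placeholder names, the EXT_framebuffer_object entry points are assumed loaded, and the dummy texture is assumed tiny, e.g. 1x1, so the readback is cheap):

```c
#include <GL/gl.h>
#include <GL/glext.h>

void per_frame_sync(GLuint dummy_tex, GLuint dummy_fbo, GLuint previous_fbo)
{
    GLubyte pixel[4];

    /* Steps 1-3: read the dummy texture back. This blocks until LAST
     * frame's render into it has completed, so the CPU never gets more
     * than about one frame ahead of the GPU. */
    glBindTexture(GL_TEXTURE_2D, dummy_tex);
    glGetTexImage(GL_TEXTURE_2D, 0, GL_RGBA, GL_UNSIGNED_BYTE, pixel);
    glBindTexture(GL_TEXTURE_2D, 0);

    /* Steps 4-6: render something cheap into the dummy texture so that
     * NEXT frame's readback has work to wait on. */
    glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, dummy_fbo);
    glBegin(GL_POINTS);
    glVertex2f(0.0f, 0.0f);
    glEnd();
    glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, previous_fbo);
}
```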

Look at GL_WIN_swap_hint. This may be what you need, and it’s well supported.
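For reference, GL_WIN_swap_hint adds a single entry point, glAddSwapHintRectWIN, which tells the driver which rectangles actually changed so that SwapBuffers can copy only those regions. A sketch of how it is typically fetched and used (untested; x/y/w/h and hdc are placeholders):

```c
#include <windows.h>
#include <GL/gl.h>

typedef void (APIENTRY *PFNGLADDSWAPHINTRECTWINPROC)(GLint x, GLint y,
                                                     GLsizei w, GLsizei h);

void swap_with_hints(HDC hdc, int x, int y, int w, int h)
{
    /* The extension function must be fetched at runtime. */
    PFNGLADDSWAPHINTRECTWINPROC glAddSwapHintRectWIN =
        (PFNGLADDSWAPHINTRECTWINPROC)wglGetProcAddress("glAddSwapHintRectWIN");

    if (glAddSwapHintRectWIN)
        glAddSwapHintRectWIN(x, y, w, h); /* one call per dirty rectangle */

    SwapBuffers(hdc);  /* driver may copy only the hinted rectangles */
}
```

Note it is only a hint: drivers that ignore it still swap the full buffer correctly, so it degrades gracefully.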

Thanks all!

Have for now switched to using WGL_ARB_pixel_format and asking for WGL_SWAP_COPY_ARB when first creating the pixel format for a window, which then means I can use SwapBuffers - seems to be working OK so far. Will see how it goes :slight_smile:
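That request looks roughly like this (untested sketch; assumes wglChoosePixelFormatARB was already fetched with wglGetProcAddress via a dummy context, and that wglext.h provides the WGL_*_ARB tokens):

```c
#include <windows.h>
#include <GL/gl.h>
#include <GL/wglext.h>

extern PFNWGLCHOOSEPIXELFORMATARBPROC wglChoosePixelFormatARB; /* loaded elsewhere */

/* Ask for a double-buffered RGBA format whose swap preserves the back
 * buffer. Returns nonzero on success. */
int set_swap_copy_format(HDC hdc)
{
    const int attribs[] = {
        WGL_DRAW_TO_WINDOW_ARB, GL_TRUE,
        WGL_SUPPORT_OPENGL_ARB, GL_TRUE,
        WGL_DOUBLE_BUFFER_ARB,  GL_TRUE,
        WGL_SWAP_METHOD_ARB,    WGL_SWAP_COPY_ARB, /* keep back buffer on swap */
        WGL_PIXEL_TYPE_ARB,     WGL_TYPE_RGBA_ARB,
        WGL_COLOR_BITS_ARB,     24,
        0
    };
    int format = 0;
    UINT count = 0;

    if (!wglChoosePixelFormatARB(hdc, attribs, NULL, 1, &format, &count) ||
        count == 0)
        return 0;  /* no swap-copy format available */

    PIXELFORMATDESCRIPTOR pfd;
    DescribePixelFormat(hdc, format, sizeof(pfd), &pfd);
    return SetPixelFormat(hdc, format, &pfd);
}
```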

I’ve looked at PFD_SWAP_COPY before and found that a lot of ATI cards don’t support it… Maybe WGL_ARB_pixel_format is different?

Booo. Yes, just tried it, you’re right :frowning:

Although the cards that seem to be particularly bad for buffering too far ahead are all nvidia based (at least the ones I’ve seen so far…). So maybe I’ll get away with using SWAP_COPY for nvidia and fall back to glCopyPixels on ATI cards.

Sometimes makes you wonder if they get together and try to make things difficult on purpose :slight_smile:

Look at what I said previously - GL_WIN_swap_hint (and here).

I’d imagine that if the driver doesn’t support swap-copy it won’t work though.

For what it’s worth, I’m doing something similar to you and have decided on the FBO approach. Just render to the FBO and only update the bits that need changing, then copy the whole thing to the back buffer & swap. It’s quick, easy and saves a lot of headaches, assuming a reasonable graphics card :slight_smile:

Sorry if this is a n00b answer but can you use glScissor for this? Each time you need to update a region set the scissor box and only render that part, preserving the rest of what was already there, and then do a normal page flip after all your incremental updates each frame?

…and then have to repeat the same incremental updates to the previously-front-but-now-back buffer to get it in sync, presumably?
I think you might now see the problem with your suggestion, jcipriani2.

Oh, well, I tried! :o If you could theoretically force it to copy buffers instead of swapping them, then glScissor would help?