What's best practice for offscreen rendering?

Bjorn · December 28, 2009, 4:49am

What’s best practice for offscreen rendering in multithreaded apps?

I’m currently creating a general purpose framework for processing video data,
using a combination of GL, GLSL and straight C. The framework is multithreaded,
meaning that multiple threads can run and perform GL/SL tasks simultaneously.

I know very little GL and even less X, but have figured out how to do
texture mapping and color conversion using shaders and GLX. This part of my
framework works fine and I’m ready to move on to more advanced tasks, like
offscreen rendering and writing shaders with multiple output buffers. But GL
is a jungle and I’m rather confused when it comes to choosing the correct
solution to my problems.

Some of the questions I have are:

1. I use GLX 1.4 to manage my onscreen windows. Now I want to create a
new thread which renders data to an offscreen buffer. The OpenGL Super Bible
uses Frame Buffer Objects for this, but the GLX 1.4 doc mentions GLXPixmaps
and GLXPbuffers. Which of these buffer types is appropriate to use for
rendering from shaders?
2. Multithreaded apps seem to have ‘issues’ when it comes to synchronizing
access to the rendering HW. What’s best practice for doing GL/GLSL/GLX
in a multithreaded application? Googling “multithreaded opengl” tells me that
it just won’t work, but is that really true in general? I do plan to use
a separate rendering context for each thread, but I guess that even then I
may run into problems when the OS re-schedules my threads or when the app
runs on a multicore platform.
3. Are there any good and up-to-date tutorials or FAQs around for developing
multithreaded GL apps?

Thanks in advance for any answers.
Boa

Ilian_Dinev · December 28, 2009, 5:54am

Use FBOs in a single context, forget about Pbuffers and pixmaps.
Make the rendering itself single-threaded, but you can (and should) keep the calculation of render-call arguments multithreaded: make a command FIFO of your own that the worker threads append to and that the sole context thread executes.
Modern NVIDIA drivers automatically use multithreading to make all glXXX calls very fast: 50-150 CPU cycles instead of 300-10000 cycles. They simply append commands to an extra FIFO, which a driver-created worker thread uses to actually push the commands to the driver.
With PBOs (pixel buffer objects, not Pbuffers) you can copy data around asynchronously.
If you need to visualize results in multiple windows, you’ll run into trouble. In that case Pbuffers and context sharing may be necessary in some solutions, or you can use some rarely available extensions (iirc).
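
The command-FIFO idea from this reply could look roughly like the sketch below, assuming POSIX threads. Everything in it is hypothetical and only for illustration: the names (cmd_fifo, fifo_push, fifo_pop, fake_draw, run_demo) are made up, and fake_draw stands in for a real GL call so the example runs without a GL context. In a real app the context thread would make its GLX context current before entering the loop.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define FIFO_CAP 64

typedef void (*cmd_fn)(void *arg);
typedef struct { cmd_fn fn; void *arg; } cmd_t;

/* Bounded ring buffer guarded by one mutex and two condition variables. */
typedef struct {
    cmd_t buf[FIFO_CAP];
    size_t head, tail, count;
    int closed;
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
} cmd_fifo;

static void fifo_init(cmd_fifo *q) {
    q->head = q->tail = q->count = 0;
    q->closed = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

/* Called by any worker thread; blocks while the FIFO is full. */
static void fifo_push(cmd_fifo *q, cmd_fn fn, void *arg) {
    assert(fn != NULL);
    pthread_mutex_lock(&q->lock);
    while (q->count == FIFO_CAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->buf[q->tail].fn = fn;
    q->buf[q->tail].arg = arg;
    q->tail = (q->tail + 1) % FIFO_CAP;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Signals that no more commands will arrive. */
static void fifo_close(cmd_fifo *q) {
    pthread_mutex_lock(&q->lock);
    q->closed = 1;
    pthread_cond_broadcast(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Returns 1 with a command, or 0 once the FIFO is closed and drained. */
static int fifo_pop(cmd_fifo *q, cmd_t *out) {
    pthread_mutex_lock(&q->lock);
    while (q->count == 0 && !q->closed)
        pthread_cond_wait(&q->not_empty, &q->lock);
    if (q->count == 0) {
        pthread_mutex_unlock(&q->lock);
        return 0;
    }
    *out = q->buf[q->head];
    q->head = (q->head + 1) % FIFO_CAP;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return 1;
}

/* Hypothetical stand-in for a GL call; only the context thread runs it. */
static void fake_draw(void *arg) { ++*(int *)arg; }

typedef struct { cmd_fifo *q; int *counter; } producer_arg;

/* Worker thread: computes "arguments" and queues 100 commands. */
static void *producer(void *p) {
    producer_arg *a = p;
    for (int i = 0; i < 100; i++)
        fifo_push(a->q, fake_draw, a->counter);
    return NULL;
}

/* The sole thread that would own the GL context and execute commands. */
static void *context_thread(void *p) {
    cmd_fifo *q = p;
    cmd_t c;
    while (fifo_pop(q, &c))
        c.fn(c.arg);
    return NULL;
}

/* Runs 4 producers against 1 context thread; returns commands executed. */
int run_demo(void) {
    cmd_fifo q;
    int executed = 0;
    pthread_t ctx, prod[4];
    producer_arg a = { &q, &executed };
    fifo_init(&q);
    pthread_create(&ctx, NULL, context_thread, &q);
    for (int i = 0; i < 4; i++)
        pthread_create(&prod[i], NULL, producer, &a);
    for (int i = 0; i < 4; i++)
        pthread_join(prod[i], NULL);
    fifo_close(&q);
    pthread_join(ctx, NULL);
    pthread_mutex_destroy(&q.lock);
    pthread_cond_destroy(&q.not_empty);
    pthread_cond_destroy(&q.not_full);
    return executed;
}
```

Because only context_thread ever runs the queued functions, the counter (and, in a real app, the GL context) is touched by a single thread, so the commands themselves need no locking. Build with -pthread.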