Multiple Windows, same context but different MVP matrices

FilipVuk123 · April 4, 2022, 9:07am

Hello good people.

I’m trying to add a window with the same exact meshes (objects), textures, VAOs, VBOs, EBOs, etc. The only differences between those 2 windows are the cameras (positions, orientations, and where the cameras are looking). In other words, MVP matrices are different.

Imagine you are inside a mesh in OpenGL (a car for example) looking forward through the windshield in the first OpenGL window and looking backward in the second OpenGL window.

I managed to implement this using threads: the main thread drives the first (original) window and a second thread drives the other one. This means that every object is allocated twice (once for each thread), because this way I basically run two programs in one .out file.

I was wondering if there is a more elegant and/or more stable solution to this problem.

I’m on Ubuntu 20.04 LTS using GLAD and GLFW in C → OpenGL 3.3+

Thank you for your help and time

Dark_Photon · April 4, 2022, 12:57pm

Separate GPUs rendering each window or same GPU?

With separate GPUs, your approach makes sense and scales well (…assuming you properly target rendering to the correct GPU).
With the same GPU, it’s a harder call. (…more on that below.)

Conceptually, a GPU can only be working on one task at a time, with few special cases. So rendering to it in parallel with 2 different GL contexts (one per CPU thread), or flipping back and forth using a shared GL context (on one or two threads), isn’t likely to be any better than just rendering to both windows serially in one process/thread/context. In fact, it could be worse. However…

Often, particularly in older code (or in your use case; see below), much of draw thread processing isn’t purely GPU limited but rather spends a significant amount of time limited by CPU-side processing, in one or more of:

1. the app’s draw thread,
2. the front-end graphics driver, and/or
3. the back-end graphics driver.

This is code that may benefit from CPU multi-thread processing or at least off-loading CPU ops to another thread (#2 and/or #3 depending on the driver and operation being performed).

Now there are some tricks that can be used to have a GPU render 2 views simultaneously from the same draw call submissions (e.g. for stereo views, w/ or w/o high-res insets), in the form of stereo and multiview rendering extensions. For “nearby overlapping frusta”, these can be pretty efficient – much more so than geom shader replication to 2 or 4 different viewports/layers. But these aren’t what you want for the use case you describe (two frusta, each pointed 180 deg from the other). You really do want separate culling and submission of a completely different set of draw calls and associated GL state changes for each frustum/window here. Otherwise, you’ll waste a ton of time performing needless vertex shader transforms on the GPU for primitives that will be discarded by the GPU’s view frustum culling before rasterization, yielding ~200-400% vertex transform waste for each view and a very heavy load on the GPU culler.

So in your case, there’s likely a fair amount of frustum-specific processing going on for each view/window (culling, state change/draw call list prep). If that’s on the CPU, and the time consumption is non-trivial, there’s probably real benefit to performing this in parallel on separate threads. The GPU submission piece, however, will likely see less benefit from multithreading. It’ll depend on how your app uses the GPU.

Regardless of your draw submission approach, definitely pre-upload static geometry to the driver/GPU and submit from there using the fastest draw method available to you (bindless VBOs, VAOs, display lists, whatever). And for the CPU-side scene rep, if you decide to go multithread for submission, you’ll have to be the judge of whether you need one “scene store” or two. If there are a lot of updates to that scene store that may result in cross-thread sync points, preventing both threads from accessing the scene store in parallel, having two separate scene stores might be the better option. OTOH if the scene is static and there are no updates, sharing one scene store for both might be worth it. All that said, it’s my guess that you’re probably going to want to have one GPU submission thread per GPU, for best performance and simplicity.

For completeness on the multi-GPU side, I should also mention that NVIDIA does have some GL multi-GPU rendering extensions that support simultaneous (broadcast) rendering from a single, shared GL context. However, IIRC these broadcast the same draw calls to all GPUs, so this probably isn’t what you want for the use case you mentioned (frustums oriented 180 deg from each other).