The level above Vulkan - renderers, with emphasis on large scenes

I want to discuss what belongs in the layer above Vulkan, but below the application.

For concreteness, here’s a video of the sort of thing I need to render. This is my all-Rust client for Second Life/Open Simulator. Think of this as a big-world MMO where all the content comes from the server, so all the hard scaling problems apply.

The current approach uses a “renderer”, but not a game engine. A renderer is a library whose API takes in meshes, textures, material properties, and transforms, and produces a good-looking scene by calling Vulkan. The renderer is not a game engine - it doesn’t own the scene graph, do networking, input, audio, or gameplay. The renderer handles keeping track of what’s on the screen, what’s in the GPU, frustum culling, lighting, and shadows - all the heavy bookkeeping you have to do to use Vulkan on a complex scene.

Examples of “renderers” include

  • Three.js (widely used, but not super fast)
  • Three.rs (abandoned)
  • Rend3 (abandoned, but I support a fork)
  • Renderling (in progress, not usable yet)
  • Orbit (abandoned)

They all have rather similar functionality; everybody seems to end up in about the same place. All of them accept glTF content, for example. This list leans towards Rust - C++ programmers probably know of others in C++ land. Let me know, please.

The argument for using a renderer instead of a full game engine such as Unreal or Unity is that game engines want you to do everything their way, and if you have to connect to existing content or server APIs, there’s a big mismatch. If you have a renderer available, it’s reasonably easy to get glTF on the screen.

Now, this approach works fine until you hit a scaling wall. The scaling problems usually arise because the renderer needs some locality information from the scene graph. If you have shadow-casting lights, the renderer needs to know, for each light, roughly which objects can occlude that light. Otherwise, rendering cost is O(lights × objects), which scales very badly. A short-range ceiling light in a room might potentially be occluded by only a few objects. We don’t want to make Vulkan and the GPU test every triangle in a town-sized scene against that one light.
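For concreteness, here’s a minimal sketch of the per-light test involved. The types are invented for illustration, not any real renderer’s API - the point is that a cheap CPU overlap test against the light’s influence volume replaces per-triangle GPU work:

```rust
// Sketch only: hypothetical types, not any real renderer's API.
#[derive(Clone, Copy)]
struct Aabb {
    min: [f32; 3],
    max: [f32; 3],
}

impl Aabb {
    fn overlaps(&self, other: &Aabb) -> bool {
        (0..3).all(|i| self.min[i] <= other.max[i] && self.max[i] >= other.min[i])
    }
}

/// A point light with a finite range only influences a box of side
/// 2 * range centered on it.
fn light_influence_volume(center: [f32; 3], range: f32) -> Aabb {
    Aabb {
        min: [center[0] - range, center[1] - range, center[2] - range],
        max: [center[0] + range, center[1] + range, center[2] + range],
    }
}

/// Gather the objects that could possibly occlude this light.
/// A linear scan is shown; a spatial index in the scene graph would
/// answer the same query without touching every object.
fn shadow_casters(light: &Aabb, objects: &[(u64, Aabb)]) -> Vec<u64> {
    objects
        .iter()
        .filter(|(_, bounds)| light.overlaps(bounds))
        .map(|(id, _)| *id)
        .collect()
}
```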

So the renderer needs to be able to efficiently query the scene graph somehow. Options include:

  • Calling back from the renderer API, via a lambda or something, to query the scene graph with questions such as “Here’s a volume (probably an AABB). Tell me all the objects with any part inside that volume.” That gives us the set needed for culling for a light.
  • Caching similar information within the renderer, in a sort of dumbed-down scene graph of object centers and bounding spheres. (Both options are sketched below.)
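Here’s roughly what those two options could look like in Rust. The names are mine, invented for illustration, not from any of the renderers above:

```rust
// Option 1: the application implements a query trait and hands it to
// the renderer, which calls back whenever it needs locality info.
#[derive(Clone, Copy)]
struct Aabb {
    min: [f32; 3],
    max: [f32; 3],
}

/// Opaque handle the renderer uses to refer to scene objects.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct ObjectId(u64);

trait SpatialQuery {
    /// "Here's a volume. Tell me all the objects with any part inside it."
    fn objects_in_volume(&self, volume: Aabb) -> Vec<ObjectId>;
}

// Option 2: the renderer caches the same answers itself, in a
// dumbed-down scene graph of centers and radii that the application
// updates whenever an object moves.
struct CachedBounds {
    bounds: Vec<(ObjectId, [f32; 3], f32)>, // (object, center, radius)
}
```

Option 1 keeps the renderer a pure library but makes every shadow pass depend on a callback into application code; option 2 duplicates state but keeps the queries in-process.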

If we try to do occlusion culling (we’re in a room, let’s not render the whole town at the Vulkan level) some of the same problems appear.

There’s a school of thought that says general purpose renderers of this type should not exist. Rendering is the job of the application. But that means writing about half a game engine into an application.

Now, this is a general problem. It must have been solved by others. But I’m not finding any solutions that scale. Everybody seems to do My First Renderer and quit, or build a whole game engine.


Well, what you’re talking about is basically a scene graph, which constitutes about 80% of the part of a game engine that deals with graphics.

What you call a “renderer” is ultimately not a high-performance tool. It can’t be. It has a lot of responsibilities with regard to performance, but relatively little power to ask the questions needed to create high performance. They’re nice tools if you want to blast something onto the screen quickly and easily, but if you want performance, the tool needs to be smarter.

And the more high-level concepts you put into a “renderer”, the closer it comes to being the graphical part of a game engine. There simply isn’t a good middle ground here that has the performance of a full-fledged solution.

That’s the conventional wisdom. You could make the same argument about Vulkan. Real graphics programmers should be writing directly to the hardware, for maximum performance, right? Vulkan unified that layer and simplified things considerably.

It’s reasonably clear what a renderer should do. All five of the renderers I list have roughly the same API, so it’s a useful level at which to define one. They’re basically glTF as an API rather than a file format. Now that everybody has pretty much standardized on glTF, it makes sense to have an API built around glTF entities.
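To make that concrete, here’s the rough API shape all five converge on, as a hypothetical Rust trait. The names are invented, but each entity maps onto a glTF concept - meshes, textures, PBR materials, and node transforms:

```rust
// Hypothetical API surface, shaped like glTF entities. Not the API of
// any renderer listed above.
struct MeshHandle(u64);
struct TextureHandle(u64);
struct MaterialHandle(u64);
struct ObjectHandle(u64);

/// Mirrors glTF's metallic-roughness material model.
struct Material {
    base_color_texture: Option<TextureHandle>,
    base_color_factor: [f32; 4],
    metallic_factor: f32,
    roughness_factor: f32,
}

trait Renderer {
    fn add_mesh(&mut self, vertices: &[u8], indices: &[u32]) -> MeshHandle;
    fn add_texture(&mut self, rgba: &[u8], width: u32, height: u32) -> TextureHandle;
    fn add_material(&mut self, material: Material) -> MaterialHandle;
    /// Column-major 4x4 matrix, as in a glTF node.
    fn add_object(
        &mut self,
        mesh: MeshHandle,
        material: MaterialHandle,
        transform: [f32; 16],
    ) -> ObjectHandle;
    fn render(&mut self);
}
```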

The problem is performance. Sometimes you do need scene graph type info.

A renderer doesn’t need the full scene graph. It needs some geometry for culling, but bounding boxes or spheres are probably good enough. The physics engine and game logic are not at that level.
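For example, a bounding sphere is enough for the standard frustum test. A sketch, with invented types:

```rust
struct Sphere {
    center: [f32; 3],
    radius: f32,
}

/// True if any part of the sphere may be inside the frustum.
/// Planes stored as (nx, ny, nz, d), plane equation n . p + d = 0,
/// normals pointing into the frustum.
fn sphere_in_frustum(sphere: &Sphere, planes: &[[f32; 4]; 6]) -> bool {
    planes.iter().all(|p| {
        let signed_distance = p[0] * sphere.center[0]
            + p[1] * sphere.center[1]
            + p[2] * sphere.center[2]
            + p[3];
        // Entirely behind any one plane means fully outside.
        signed_distance >= -sphere.radius
    })
}
```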

There have been scene graph APIs - OpenSceneGraph and its Vulkan-based successor, Vulkan Scene Graph, for example.

Scene graphs, like game engines, tend to dictate the architecture of the program, because the graph owns everything. One advantage of a renderer level is that it can be a library, not a framework. (Misery is trying to combine things which ought to be libraries but want to be frameworks. Each framework wants to own all the objects and the event loop. Fortunately, Vulkan doesn’t do that, and neither should a renderer.)

Vulkan Scene Graph seems to be the only one in active use. The examples look OK, but nothing looks like an AAA title. They’ve been plugging away since 2018. Again, anyone using it?

I’m currently thinking of adding culling for shadows to Rend3, with cached info about bounding boxes or spheres. Ought to work, but performance is a question. Someone tried adding occlusion culling and performance got worse.
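The core test would be something like this - a sketch of the plan, not actual Rend3 code:

```rust
/// Cached per-object bounds kept inside the renderer.
struct BoundingSphere {
    center: [f32; 3],
    radius: f32,
}

/// Can this object possibly occlude a point light of finite range?
fn touches_light(obj: &BoundingSphere, light_center: [f32; 3], light_range: f32) -> bool {
    let dx = obj.center[0] - light_center[0];
    let dy = obj.center[1] - light_center[1];
    let dz = obj.center[2] - light_center[2];
    let reach = obj.radius + light_range;
    // Compare squared distances to avoid the sqrt.
    dx * dx + dy * dy + dz * dz <= reach * reach
}
```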

… yes. You seem to be repeating what I said, just with different words.

You want a system that doesn’t really know about the nature of the scene, but that knowledge is crucial to achieving better performance. So… you have to build a system that has that knowledge.

Because the thing is, bounding volume culling isn’t everything with regard to performance. That’s just one issue among a host of others that ultimately require scene graph knowledge.

We don’t even have to leave the realm of culling to see the possibilities. Portal visibility systems cannot work without scene-graph-level knowledge, and those are absolutely critical for high rendering performance in many domains. Indeed, any kind of culling system that allows geometry to occlude other geometry (before sending it to the GPU) basically requires that the system has a significant understanding of what the scene is.
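Even a heavily simplified portal walk shows how much scene structure it needs - cells, the portals between them, and which cell the camera is in. This sketch is illustrative only, and skips the frustum narrowing a real system does; it just flood-fills the cell graph through portals already known to intersect the view frustum:

```rust
use std::collections::HashSet;

struct Portal {
    to_cell: usize,
    /// Precomputed elsewhere: does this portal intersect the view frustum?
    in_frustum: bool,
}

struct Cell {
    portals: Vec<Portal>,
}

/// Walk outward from the camera's cell, crossing only visible portals.
fn visible_cells(cells: &[Cell], camera_cell: usize) -> HashSet<usize> {
    let mut visible = HashSet::new();
    let mut stack = vec![camera_cell];
    while let Some(cell) = stack.pop() {
        if !visible.insert(cell) {
            continue; // already visited
        }
        for portal in &cells[cell].portals {
            if portal.in_frustum {
                stack.push(portal.to_cell);
            }
        }
    }
    visible
}
```

None of that can be computed without knowing how the scene is partitioned - exactly the knowledge a scene graph holds.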

This is even more true if you’re trying to do shadow mapping or similar techniques, because now you need visibility checks from multiple angles. The better those checks are, the less unnecessary work the GPU has to do.

Consider state change overhead. A “renderer” that has no scene graph knowledge has no control over the order in which work is submitted for rendering. The onus is therefore on the higher-level code to submit work in an order that minimizes state changes. But the higher-level code generally… doesn’t know when a state change will be needed.

If you have two glTF meshes, do they use the same vertex format and shader? Do they use the same textures, or do you need new descriptor sets? These are low level details that bubble up into the scene graph architecture, and those details can be quite important for performance.
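A system that owns the frame’s full draw list can answer those questions by sorting on a state key before submission. A sketch, with invented handle types:

```rust
/// Most expensive state first, so the derived ordering groups draws
/// that share a pipeline, then a descriptor set, then a mesh.
#[derive(PartialEq, Eq, PartialOrd, Ord, Clone, Copy)]
struct DrawKey {
    pipeline: u32,       // shader + vertex format
    descriptor_set: u32, // textures / material data
    mesh: u32,           // vertex and index buffers
}

struct Draw {
    key: DrawKey,
    transform_index: u32,
}

fn sort_for_submission(draws: &mut [Draw]) {
    // Derived Ord compares fields in declaration order, so draws that
    // need the fewest state changes end up adjacent.
    draws.sort_unstable_by_key(|d| d.key);
}
```

A higher-level caller submitting one object at a time has no equivalent place to do this.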

The whole point of Vulkan as an API is to reduce the amount of unexpected code that lives between the scene graph and the GPU as much as is reasonable while still providing some hardware independent abstraction. Reintroducing an intermediary layer between the two things may make it easier to write code for such a “renderer”, but it won’t let you write efficient code. And there’s nothing you can do to change that without making the “renderer” into a scene graph in all but name.

This is not a matter of some “conventional wisdom” which you believe you can overturn. This is well-worn practice because it has been proven to work and alternatives that have been tried have demonstrably not been as effective. If you want ease-of-programming as a library, you can use a “renderer”. But if you need maximum performance within your domain, then you have to have as much control as possible between the scene graph and the GPU.

No one-stop library “renderer” can get around that.

A renderer doesn’t need the full scene graph. It needs some geometry for culling, but bounding boxes or spheres are probably good enough. The physics engine and game logic are not at that level.

I didn’t say anything about a “physics engine and game logic”.