What is the most flexible rendering technique in OpenGL to date?

Imagine you want to create the ultimate renderer using OpenGL rasterization, and you can live with slower frame rates, progressive refinement, or whatever, just to get a more production-like interactive renderer. It should support, for example:

  • a huge number of lights, ideally with projective texturing support and VPLs for area lights
  • a huge number of shadow maps (every light type supported, different texture sizes)
  • arbitrarily complex materials, or maybe a beastly ubershader
  • arbitrary per-object light and shadow linking
  • anti-aliasing
  • transparency
  • DoF, AO, IBL, reflections, some variant of GI
  • you name it…

What would you choose - forward rendering, deferred shading or deferred lighting? Their tiled versions?

I don't think there is a unique answer, but to me deferred shading seems the most reasonable way; you could even try a tiled version. As far as I know, triple-A games use deferred shading. Obviously it depends on the result you want to achieve: at some point you will have to choose between performance and graphics quality, or a halfway solution (Crytek games, for example, are very expensive but have huge graphics quality), and this is not only a question of rendering techniques. The best thing would be to build an engine that supports at least both forward and deferred rendering.

I don’t have a game in mind, but a general visualization application, and I’m interested in how OpenGL compares with a software production renderer. Just as an example, let’s say I have a GTX Titan and I target 24 fps. Even with such a setup, it’s still very difficult to handle a really complex scene. I’ve implemented a forward renderer, and it works, but trying to do everything in one forward pass is complicated and breaks at some complexity level (I have a 20K fragment shader). I’m surprised I got this far with just OpenGL 2.0, but I’m hitting a wall adding more features. The texture arrays in 3.0 are a really helpful addition, and even with the limitation of using the same texture size per array, it’s a lot better. I will definitely test the deferred methods, since it seems to me that their limitations are easier to overcome and they can handle scene complexity more easily.

  • a huge number of lights, ideally with projective texturing support and VPLs for area lights
  • a huge number of shadow maps (every light type supported, different texture sizes)
  • arbitrary per-object light and shadow linking
  • DoF, AO, IBL, reflections, some variant of GI

These features definitely suggest a deferred renderer would be appropriate, as Ruggero says. Transparency is the main feature on your list that is difficult to do with deferred rendering. Most rendering solutions draw transparent objects with a traditional forward pass after the deferred shading of all opaque objects.
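A minimal sketch of that ordering (names hypothetical): deferred-shade the opaque geometry first, then sort the transparent objects back-to-front in view space and draw them with a forward pass and blending enabled. The sort is the part that actually needs code:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical per-object record: only the view-space depth matters here.
struct TransparentObject {
    int id;
    float viewDepth; // distance from the camera along the view axis
};

// Sort transparent objects back-to-front (largest depth first) so that
// standard alpha blending composites correctly in the forward pass.
inline void sortBackToFront(std::vector<TransparentObject>& objs) {
    std::sort(objs.begin(), objs.end(),
              [](const TransparentObject& a, const TransparentObject& b) {
                  return a.viewDepth > b.viewDepth;
              });
}
```

This per-object sort is only an approximation; intersecting or self-overlapping transparent geometry needs order-independent transparency techniques instead.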

The main drawback of deferred rendering is high memory-bandwidth use. A tiled deferred renderer attempts to reduce this by skipping unaffected tiles for each light; with aggressive light culling you can significantly improve performance. Memory-bandwidth use also depends on the size of your G-buffer - you need at least a color, normal, and depth buffer, and antialiasing multiplies the bandwidth/storage costs (a lot of recent AA techniques attempt to reduce this cost). Finally, your material properties may require additional components, such as glossiness, which can be packed into the alpha channels of the other buffers.
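As a sketch of that packing idea (the layout here is an assumption, not a standard): store the normal in the RGB channels of an RGBA8 target and reuse the otherwise empty alpha channel for glossiness.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical RGBA8 G-buffer texel: a normal in RGB and a glossiness
// value packed into the otherwise unused alpha channel.
struct Rgba8 { uint8_t r, g, b, a; };

// Map a normal component from [-1, 1] to [0, 255] and back.
inline uint8_t encodeSnorm(float v) {
    return static_cast<uint8_t>(std::lround((v * 0.5f + 0.5f) * 255.0f));
}
inline float decodeSnorm(uint8_t v) {
    return static_cast<float>(v) / 255.0f * 2.0f - 1.0f;
}

// Pack a unit normal plus glossiness in [0, 1] into one texel.
inline Rgba8 packNormalGloss(float nx, float ny, float nz, float gloss) {
    return { encodeSnorm(nx), encodeSnorm(ny), encodeSnorm(nz),
             static_cast<uint8_t>(std::lround(gloss * 255.0f)) };
}
```

8 bits per normal component is fairly coarse; production engines often use higher-precision formats or two-component encodings (e.g. octahedral) instead, but the packing principle is the same.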

However, the nice thing about a deferred renderer is that the geometry buffers it produces are a natural fit for many SSAO, DoF, and GI algorithms.

Look, there is a good tutorial on deferred rendering (or shading) in the book OpenGL 4.0 Shading Language Cookbook, under “Deferred Rendering”. It’s a basic tutorial, but it’s meant as a starting point. What I suggest is to start from the basics of modern OpenGL, because there is a huge difference between modern OpenGL and the old one.

I use a forward renderer based on tiled light queues to handle large numbers of lights and light types. Generating the light tiles requires hardware capable of atomics and random writes to an output buffer.
I prefer forward rendering because of the flexibility it gives me: it allows me to implement shading on a per-material basis, even for the same type of light. Using MSAA is also straightforward, which is always a plus in my book.
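The core of building such light tiles is deciding which screen tiles a light can touch. Real tiled renderers do this on the GPU (per-tile frustum tests, atomics, append buffers); this CPU sketch, with hypothetical names and a 16-pixel tile size, only shows the tile-index math for a light's screen-space bounding circle:

```cpp
#include <algorithm>

// Hypothetical tile grid: TILE pixels square over a width x height target.
constexpr int TILE = 16;

struct TileRange { int x0, y0, x1, y1; }; // inclusive tile indices

// Compute the rectangle of tiles touched by a light whose screen-space
// bounds are a circle of the given center and radius (in pixels).
inline TileRange lightTileRange(float cx, float cy, float radius,
                                int width, int height) {
    int tilesX = (width  + TILE - 1) / TILE;
    int tilesY = (height + TILE - 1) / TILE;
    TileRange r;
    r.x0 = std::clamp(static_cast<int>((cx - radius) / TILE), 0, tilesX - 1);
    r.y0 = std::clamp(static_cast<int>((cy - radius) / TILE), 0, tilesY - 1);
    r.x1 = std::clamp(static_cast<int>((cx + radius) / TILE), 0, tilesX - 1);
    r.y1 = std::clamp(static_cast<int>((cy + radius) / TILE), 0, tilesY - 1);
    return r;
}
```

The rectangle overestimates the circle (corner tiles may be untouched); per-tile depth bounds or exact sphere-frustum tests tighten the culling further.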

There have been several complexities tied to this system, but I’m satisfied with the performance it gives me. I haven’t compared it to a deferred solution under similar circumstances, mainly because the limitations deferred rendering imposes on material and shading flexibility were not appropriate for me.

Thanks for the answers. I will definitely experiment with deferred rendering. Deferred lighting AFAIK works with MSAA - people say it’s a bit complicated to fix up the lighting pass, but it’s doable. As for material variety, I’m starting to think it’s a bit overrated. With physically based shading, sooner or later you will want all the effects anyway (diffuse, specular, reflections), and an ubershader works pretty well if you can live with the cost. With more powerful GPUs, switching shaders may actually become the more costly part. Production renderers like Mental Ray and V-Ray have used such ultra-materials for a long time now, and artists like to use them. I’ve seen that tiled forward rendering can handle a lot of point lights. However, I’m not sure how well it handles, for example, many shadows, or progressive refinement of many virtual point lights (for GI or area-light approximation) - i.e., deferred methods can iterate over the same geometry many times. Another interesting possibility is ray tracing - true shadows, reflections, AO.

Lights and shadows in a tiled renderer (at least the one I use) are not really that different from deferred; it’s just that you attach the light shading (and shadowing) functionality to the shaders used on the individual entities. This obviously allows you to fine-tune the implementation of the actual lighting down to the model level, which I find to be key for my purposes. You are not limited to point lights, but I’m sure this was obvious.

In the end, the right choice will vary depending on the sacrifices you are willing to make in features or performance, and on your target usage.

Of course forward+ is not limited to point lights, but it can’t optimize global lights. With deferred shading, you still get the benefit of shading only the final visible pixels - but that’s more of a performance issue. Well, forward rendering is always needed and I already have it, so some tiled experiments will definitely not hurt :). I’ve seen a benchmark with many point lights between forward+ and deferred; deferred was faster, but not by much.

Could you elaborate? I’m not sure what exactly you are referring to with optimizing global lights.

Running a depth-only pass before the production render ensures most pixels are not shaded more than once, and, depending on your pipeline, the depth data will be used by other stages, amortizing the cost - which is probably something you’d do in a deferred renderer as well.
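A toy simulation of why that helps, counting expensive shader invocations at a single pixel (assumptions: GL_LESS depth test in the naive path; GL_LEQUAL with depth writes disabled after the prepass):

```cpp
#include <algorithm>
#include <vector>

// Simulated fragments arriving at ONE pixel in submission order.
// Without a prepass, every fragment that passes GL_LESS against the
// depth buffer *as it stood at that moment* runs the shader.
inline int shadedWithoutPrepass(const std::vector<float>& depths) {
    float zbuf = 1.0f; // cleared to the far plane
    int shaded = 0;
    for (float z : depths)
        if (z < zbuf) { zbuf = z; ++shaded; }
    return shaded;
}

// With a depth-only prepass, the buffer already holds the nearest depth,
// so with GL_LEQUAL only the front-most fragment runs the shader.
inline int shadedWithPrepass(const std::vector<float>& depths) {
    float nearest = *std::min_element(depths.begin(), depths.end());
    int shaded = 0;
    for (float z : depths)
        if (z <= nearest) ++shaded;
    return shaded;
}
```

The worst case for the naive path is back-to-front submission, where every fragment shades; the prepass makes the cost independent of submission order, at the price of rasterizing the geometry twice.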

Don’t get me wrong, I’m not trying to specifically advocate for a tiled forward solution for your case, just trying to understand your point. I might be missing something.

Forward shading is in many ways the most flexible, but it has pipeline drawbacks (tons of shader combinations, too much CPU work, etc.). Versions of precomputed lighting combined with a forward pass (light pre-pass, Forward+, clustered deferred, etc.) seem the most flexible from a material point of view. Unfortunately (and while this is an immensely annoying answer) it always depends on your specific use case, and as with so many things, the devil is in the details: an unoptimized implementation of any of these techniques will be much slower than a fast implementation of any of the others.

Yep, this is what I mean: with forward rendering you also need a depth prepass to optimize the lights, even with clustering. I’m continuing with forward rendering though, and now have SSAO working with MSAA. It requires OpenGL 4.0, and performance drops 3.6x with 8xMSAA, but the point is that it’s possible, and the result is pretty good without any artifacts. If the SSAO doesn’t use the multisample texture, the performance drop is only about 0.7x, but of course there are some small artifacts here and there…

Without MSAA: http://i62.tinypic.com/2dw7br.png
With 8xMSAA: http://i62.tinypic.com/289iihi.png

The frame rates are visible in the corner; this is with a GTX 780 and 64 samples for SSAO.