Hi everyone, right now I’m depth sorting my models and rendering them from furthest to closest, so that I can use OpenGL’s built-in alpha blending for transparency. It’s working okay, but it breaks down when one model intersects another or is inside another (imagine a semitransparent sphere bouncing around inside another semitransparent sphere: which one is further back?).
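In case it helps, the sort I’m doing boils down to something like this (a minimal sketch; the `Model` struct and `sort_back_to_front` name are placeholders for my actual scene types):

```cpp
#include <algorithm>
#include <vector>

// Placeholder for an actual model: just a world-space position here.
struct Model {
    float x, y, z;
};

// Squared distance from the camera; good enough for ordering,
// and avoids a sqrt per model.
float dist2(const Model& m, float cx, float cy, float cz) {
    float dx = m.x - cx, dy = m.y - cy, dz = m.z - cz;
    return dx * dx + dy * dy + dz * dz;
}

// Sort transparent models furthest-to-closest so alpha blending
// composites them in the right order.
void sort_back_to_front(std::vector<Model>& models,
                        float cx, float cy, float cz) {
    std::sort(models.begin(), models.end(),
              [&](const Model& a, const Model& b) {
                  return dist2(a, cx, cy, cz) > dist2(b, cx, cy, cz);
              });
}
```

The problem is exactly that this gives one distance per model, so two interpenetrating models can never be ordered correctly for every pixel.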
I did a bit of research and found that OpenRL does real-time ray tracing, so that seems to be an option. Would you guys recommend using something like OpenRL? Is it too processor-intensive for a game?
I also found a technique called depth peeling, which looks really cool. It’s a shame that I have to re-submit the geometry for each peeling pass, though (and I have a nagging suspicion there’s a way around re-sending it like that).
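As I understand it, per pixel each peeling pass just keeps the nearest fragment strictly behind the previously peeled depth. A CPU sketch of that per-pixel idea (my own helper names; the real thing runs in a fragment shader against the previous pass’s depth texture):

```cpp
#include <limits>
#include <vector>

// One pass of depth peeling for a single pixel: among this pixel's
// fragments, keep the nearest one strictly behind `prev_depth`.
// Returns the peeled depth, or +infinity if no layer remains.
float peel_pass(const std::vector<float>& frag_depths, float prev_depth) {
    float best = std::numeric_limits<float>::infinity();
    for (float d : frag_depths) {
        if (d > prev_depth && d < best) best = d;  // test vs. both "depth buffers"
    }
    return best;
}

// Peel all layers front-to-back for one pixel.
std::vector<float> peel_all(const std::vector<float>& frag_depths) {
    std::vector<float> layers;
    float prev = -std::numeric_limits<float>::infinity();
    for (;;) {
        float d = peel_pass(frag_depths, prev);
        if (d == std::numeric_limits<float>::infinity()) break;
        layers.push_back(d);
        prev = d;
    }
    return layers;
}
```

On the GPU each of those passes re-rasterizes the whole scene, which is the cost I’m complaining about.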
Those two options look pretty promising, but before I choose one of those, what are your thoughts on them? Or do you have any other options that I could investigate?
Thanks for the amazing resources! I’m definitely going with the A-buffer, because it’s 8x faster than depth peeling. Two quick questions:
Is this technique compatible with screen space ambient occlusion, screen space deferred shading, or screen space shadow mapping? I don’t need any of these, I’m just curious.
The PowerPoint says that the stencil buffer is being written to (by subtracting from all of the samples) and read from (by comparing the subsamples with the reference value). I thought we couldn’t read from and write to textures at the same time?
First of all, the problem with the A-buffer is that you store several “layers” of the color buffer (one per level of transparency you want to support for a single fragment), so it has a significant memory overhead; that’s why linked-list buffers are preferred if your target hardware supports them. I agree that depth peeling is a definite no from a performance point of view, but weighted average is also a pretty good drop-in technique for OIT. I don’t want to sway you either way; I just wanted to clarify this to help your decision.
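To illustrate why weighted average is such a cheap drop-in: it is a single order-independent accumulation pass plus a resolve against the background. A CPU sketch of the per-pixel math (the resolve formula here is from my memory of Bavoil and Myers’ weighted-average technique, so double-check it against the paper before relying on it):

```cpp
#include <cmath>
#include <vector>

struct Frag { float r, g, b, a; };

// Weighted-average OIT for one pixel: accumulate sums in any order,
// then resolve once. Order of fragments does not matter.
Frag weighted_average(const std::vector<Frag>& frags,
                      float bg_r, float bg_g, float bg_b) {
    float sum_r = 0, sum_g = 0, sum_b = 0, sum_a = 0;
    for (const Frag& f : frags) {        // order-independent accumulation
        sum_r += f.r * f.a;
        sum_g += f.g * f.a;
        sum_b += f.b * f.a;
        sum_a += f.a;
    }
    int n = (int)frags.size();
    if (n == 0 || sum_a == 0) return {bg_r, bg_g, bg_b, 0};
    float avg_a = sum_a / n;
    // Estimated total transmittance as if all n layers had the average alpha.
    float t = std::pow(1.0f - avg_a, (float)n);
    return {sum_r / sum_a * (1 - t) + bg_r * t,
            sum_g / sum_a * (1 - t) + bg_g * t,
            sum_b / sum_a * (1 - t) + bg_b * t,
            1 - t};
}
```

Since only sums are accumulated, it needs a single fixed-size render target instead of per-layer storage, at the cost of being only an approximation of the true sorted blend.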
Theoretically it is possible to do SSAO and deferred shading with A-buffers, but you need to heavily modify how they are evaluated. I don’t know what exactly you mean by screen-space shadow mapping, but shadow mapping in general is also possible, again with modifications so that all the layers are taken into consideration (getting transparent shadows, though, is a more complicated question).
Reading from and writing to the same texture is officially supported only on GL4-class hardware (even though some earlier cards can do it, e.g. the ATI HD 3000). I don’t know exactly how NVIDIA implemented their stencil-routed A-buffer, but I suppose they mean stencil operations here. If you check the stencil operations, they do indeed allow you to read and write in the same pass (stencil test plus stencil write operations). The stencil and depth buffers have allowed read-write operations since the very beginning of OpenGL’s history.
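To make the routing idea concrete, here is a CPU sketch of how I understand the stencil routing works (the initialization values and reference value are my reading of the paper, so treat them as assumptions): each MSAA sample’s stencil starts at 2 plus its sample index, the stencil test is EQUAL with a reference of 2, and the stencil op decrements on both pass and fail, so each incoming fragment lands in exactly one fresh sample.

```cpp
#include <vector>

// One MSAA sample: its stencil value, plus the fragment stored in it.
struct Sample { int stencil; float depth; bool written; };

// Route one incoming fragment into the per-pixel sample array.
// The stencil test (read) and the decrement (write) are both stencil
// operations, not texture fetches, which is why this is legal pre-GL4.
void route_fragment(std::vector<Sample>& samples, float frag_depth) {
    const int ref = 2;                 // stencil reference value
    for (Sample& s : samples) {
        if (s.stencil == ref) {        // stencil test: read
            s.depth = frag_depth;      // fragment stored in this sample
            s.written = true;
        }
        s.stencil -= 1;                // decrement on pass and fail: write
    }
}
```

With 4 samples initialized to 2, 3, 4, 5, the first four fragments each land in a distinct sample, which is the whole trick.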
All this information is a goldmine. Thanks a bunch!
So I did a few calculations, and at a resolution that requires a 2048x1024 texture, I would need 16 MB of space on my graphics card per layer. If I were to have 8 layers, that’s 128 MB of space… I suppose I could include a setting for how many layers to use, but you’re right, that is pretty expensive. Maybe I’ll also include a setting to fall back to weighted average for low-end users.
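For anyone checking my math, the calculation is just texels times bytes per texel (I’m assuming an RGBA16F layer at 8 bytes per texel; RGBA8 would halve it):

```cpp
#include <cstdint>

// Bytes needed for one transparency layer of the given dimensions.
std::uint64_t layer_bytes(std::uint64_t width, std::uint64_t height,
                          std::uint64_t bytes_per_texel) {
    return width * height * bytes_per_texel;
}
```

2048 x 1024 x 8 bytes works out to exactly 16 MB, and 8 such layers to 128 MB.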
So people keep saying that linked-list buffers are only available on “DX11-class hardware”. What exactly does that mean if I’m using OpenGL instead? I’m guessing a lot of DX11-class hardware also supports OpenGL. What percentage of users do you think has DX11-class hardware?
So people keep saying that linked-list buffers are only available on “DX11-class hardware”. What exactly does that mean if I’m using OpenGL instead?
That means OpenGL 4.1 or better, though I’m not sure whether unextended 4.1 can do everything you need in order to implement linked lists. If the ARB keeps adding to GL on their previous schedule, we’ll probably see OpenGL 4.2 at GDC.
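For the curious, the structure the GPU builds is conceptually just this (a CPU sketch with made-up names; the GPU version allocates nodes with an atomic counter and writes them via image load/store):

```cpp
#include <vector>

// One fragment node: depth, color, and the index of the next node.
struct Node { float depth; float rgba[4]; int next; };

// Per-pixel linked lists over one shared node pool.
struct LinkedListBuffer {
    std::vector<Node> pool;   // shared node pool (atomic counter on the GPU)
    std::vector<int>  head;   // per-pixel head index, -1 means empty

    explicit LinkedListBuffer(int pixels) : head(pixels, -1) {}

    // Insert a fragment for `pixel`; insertion order does not matter,
    // because the resolve pass sorts each pixel's list by depth.
    void insert(int pixel, float depth, float r, float g, float b, float a) {
        pool.push_back({depth, {r, g, b, a}, head[pixel]});
        head[pixel] = (int)pool.size() - 1;
    }

    // Walk one pixel's list (the resolve pass would sort and blend it).
    int fragment_count(int pixel) const {
        int n = 0;
        for (int i = head[pixel]; i != -1; i = pool[i].next) ++n;
        return n;
    }
};
```

The appeal over the A-buffer is visible here: memory is one shared pool sized to the actual fragment count, not a fixed number of full-screen layers.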
What percentage of users do you think has DX11 class hardware?
There’s no way to know for sure, but the Steam hardware survey provides some numbers. How useful they are depends on how much your audience overlaps with Steam users.
The benefits of this are that we can have models that intersect other models, and models that are inside other models. In the previous version we had to do things like render only the far sides of tunnels, and split objects in half so the halves could sit at different places in the sorted order.
The costs of this are that for a play area (everything left of the menus) anywhere between 513x1025 and 1024x2048, each “layer” will take 16 MB of space on the video card, and usually around 8 layers are used, totalling 128 MB. Also, anything behind 8 layers of transparency will be dropped, but that shouldn’t be a problem in our case: in most of our program, anything behind 6 layers of transparency is impossible to discern anyway.