Details about handling transparency with depth peeling

Bruce_Sherwood · April 6, 2012, 5:25pm

Pixel-level transparency based on depth peeling

Handling transparency correctly in WebGL is challenging. A standard technique is to order the centers of objects according to their z depth from the camera and render back to front. This approach can make serious errors in the case of intersecting or enclosing transparent objects, where some parts of one object are in front of a second object, and other parts are behind the second object.
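For reference, the object-level sort described above might look like the following minimal sketch. The object shape (a `center` array) and the function name `sortBackToFront` are assumptions made for illustration, and the sketch orders by distance from the camera rather than by z alone. It is not GlowScript code.

```javascript
// Sort objects back to front by the distance of their centers from the camera.
// Assumed shape: each object has a `center` array [x, y, z].
function sortBackToFront(objects, cameraPos) {
  const dist2 = (p) =>
    (p[0] - cameraPos[0]) ** 2 +
    (p[1] - cameraPos[1]) ** 2 +
    (p[2] - cameraPos[2]) ** 2;
  // Copy first so the caller's array is not reordered.
  return objects.slice().sort((a, b) => dist2(b.center) - dist2(a.center));
}

// A sphere enclosing a box: both centers coincide, so the sort cannot
// express that the sphere is partly in front of and partly behind the box.
const scene = [
  { name: "box", center: [0, 0, 0] },
  { name: "sphere", center: [0, 0, 0] },
  { name: "far", center: [0, 0, -10] },
];
const order = sortBackToFront(scene, [0, 0, 5]).map((o) => o.name);
```

The tie between the box and the sphere is exactly the case where a per-object ordering has no right answer.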

A better approach is called “depth peeling”, in which transparency is dealt with at the pixel level rather than at the object level. Consider for example a transparent sphere enclosing an opaque box, and carry out the following operations:

1. Render the pixels of the opaque box to a texture, which we’ll call a color texture. This of course takes into consideration the lights in the scene. Call this color texture C0. Note that all opacity values (the “a” in rgba) are 1.0 for this opaque object.
2. Render the z depths (distance from the camera) of the opaque box to a texture, which we’ll call a depth texture. That is, given the depth of a pixel, the fragment shader stores into the texture a false color whose 4 bytes encode the depth. Call this depth texture D0.
3. Render the transparent object (the sphere) to a texture C1, but using information in the depth texture D0. In the fragment shader, read the (false color) depth from D0, and if the transparent sphere pixel is not in front of the opaque box pixel, discard it (using the “discard” statement in the fragment shader). Upon completion of this render step, C1 contains color and opacity information (rgba) for the front-most transparent surface, a hemisphere. This is called a “depth peel”.
4. We now have two color textures, C0 and C1, which we can merge to form a scene with an opaque box and a transparent hemisphere in front of the box. Render a simple quad object (two triangles) that fills the canvas. In the fragment shader, read the color information from C0 and C1. Store a pixel color that is determined like this, where (1.0-C1.a) is the transparency of the transparent layer:

vec3 color = C1.rgb*C1.a + (1.0-C1.a)*C0.rgb;
gl_FragColor = vec4 (color, 1.0);
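The “false color” depth encoding in the second step can be illustrated on the CPU. The sketch below shows one common way to spread a depth value over four bytes; it is an assumption for illustration, not necessarily the encoding GlowScript uses, and a real fragment shader would have less floating-point precision to work with. The names `packDepth` and `unpackDepth` are made up here.

```javascript
// Pack a depth value in [0, 1] into four bytes, most significant first,
// roughly the way a fragment shader might store it as an rgba "false color".
function packDepth(depth) {
  const max = 4294967295; // 2^32 - 1
  const v = Math.min(max, Math.max(0, Math.floor(depth * 4294967296)));
  return [(v >>> 24) & 255, (v >>> 16) & 255, (v >>> 8) & 255, v & 255];
}

// Recover the depth from the four bytes read back from the depth texture.
function unpackDepth(bytes) {
  const v = bytes[0] * 16777216 + bytes[1] * 65536 + bytes[2] * 256 + bytes[3];
  return v / 4294967296;
}
```

A later pass would unpack the stored depth for the pixel, compare it with the depth of the fragment being drawn, and discard the fragment if it fails the comparison.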

These four steps illustrate the basic idea. In GlowScript, four transparent depth peels are performed rather than one, and 10 separate renders are carried out:

C0 – opaque color texture
D0 – opaque depth texture
C1 – frontmost transparent color texture
D1 – frontmost transparent depth texture, corresponding to C1
C2 – transparent color texture for the next deeper “peel” after C1; if the pixel does not have a depth between that of D0 and D1, the pixel is discarded
D2 – transparent depth texture corresponding to C2
C3 – next deeper transparent color texture; discard a pixel if z not between D0 and D2
D1 – transparent depth texture corresponding to C3; note the “ping-pong” with D2, which is possible because we’re done with the information recorded in D1
C4 – deepest transparent color texture, actually rendered into D2, which is no longer needed

MERGE – The final render is a merge of C0, C1, C2, C3, and C4:

vec3 color = C1.rgb*C1.a +
    (1.0-C1.a)*(C2.rgb*C2.a +
    (1.0-C2.a)*(C3.rgb*C3.a +
    (1.0-C3.a)*(C4.rgb*C4.a +
    (1.0-C4.a)*C0.rgb)));
gl_FragColor = vec4 (color, 1.0);
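To make the peel-and-merge logic concrete, here is a small CPU-side sketch for a single pixel. The real work happens in fragment shaders across whole textures; the function names `peelPixel` and `mergePixel` and the fragment shape `{ depth, rgb, a }` are assumptions made for illustration, with smaller depth meaning closer to the camera.

```javascript
// Peel up to `maxPeels` transparent layers for one pixel.
// opaque: { depth, rgb }   transparent: array of { depth, rgb, a }
function peelPixel(opaque, transparent, maxPeels = 4) {
  const layers = [];
  let nearLimit = -Infinity; // depth of the previously peeled layer
  for (let i = 0; i < maxPeels; i++) {
    let best = null;
    for (const frag of transparent) {
      // "discard" anything not strictly between the last peel and the opaque surface
      if (frag.depth <= nearLimit || frag.depth >= opaque.depth) continue;
      // the ordinary depth test keeps the nearest surviving fragment
      if (best === null || frag.depth < best.depth) best = frag;
    }
    if (best === null) break; // the remaining peels would be empty
    layers.push(best);
    nearLimit = best.depth;
  }
  return layers; // front to back: C1, C2, ...
}

// Merge the peeled layers over the opaque color, working back to front.
// This evaluates the same nested expression as the merge shader.
function mergePixel(opaque, layers) {
  let color = opaque.rgb.slice();
  for (let i = layers.length - 1; i >= 0; i--) {
    const { rgb, a } = layers[i];
    color = color.map((c, k) => rgb[k] * a + (1 - a) * c);
  }
  return color;
}
```

For example, a half-opaque red fragment in front of a half-opaque green one, both in front of an opaque blue surface, merges to `[0.5, 0.25, 0.25]`, while any transparent fragment behind the opaque surface is dropped.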

It may be that C2, C3, and C4 are all empty, but there is no obvious inexpensive way to get the information needed to tell the CPU to avoid scheduling those extra renders, because readPixels is very expensive. If there are more than four transparent layers, this algorithm will not treat them properly. However, note that the fifth and later peels would contribute little to the final pixel color, being partially occluded by the four transparent layers in front.
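How little a fifth layer contributes can be estimated with a bit of arithmetic. The sketch below computes the weight each layer would get in the final color, given the opacities from front to back; the function name `layerWeights` and the choice of opacity 0.5 are illustrative assumptions.

```javascript
// Weight with which each layer contributes to the final pixel color,
// given the opacities of the layers in front-to-back order.
function layerWeights(alphas) {
  const weights = [];
  let transmitted = 1.0; // fraction of light passing through the nearer layers
  for (const a of alphas) {
    weights.push(transmitted * a);
    transmitted *= 1.0 - a;
  }
  return weights;
}

// Five layers, each with opacity 0.5.
const w = layerWeights([0.5, 0.5, 0.5, 0.5, 0.5]);
// w is [0.5, 0.25, 0.125, 0.0625, 0.03125]
```

With these opacities a fifth layer would account for about 3% of the pixel color. With much more transparent layers (small opacity values) the deeper layers matter more, so the four-peel limit is more visible there.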

Before starting the many renders, the objects are sorted into opaque and transparent lists, and if there are no transparent objects a simple C0 render is all that is needed. Moreover, this simple render can exploit antialiasing, whereas the storage into textures for depth peeling unfortunately turns off antialiasing.
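That sorting step might look like the following sketch. The `opacity` property and the function names are assumptions made for illustration; GlowScript's own code may organize this differently.

```javascript
// Split scene objects into opaque and transparent lists.
// Assumed shape: each object has an `opacity` in [0, 1]; 1 means opaque.
function partitionByOpacity(objects) {
  const opaque = [];
  const transparent = [];
  for (const obj of objects) {
    (obj.opacity < 1 ? transparent : opaque).push(obj);
  }
  return { opaque, transparent };
}

// Choose the render path: a single direct render when nothing is transparent
// (antialiasing stays available), otherwise the multi-pass depth peel.
function chooseRenderPath(objects) {
  const { transparent } = partitionByOpacity(objects);
  return transparent.length === 0 ? "direct" : "depth-peel";
}
```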

It is remarkable that doing 10 separate renders runs adequately fast for real-time rendering of moderately complicated scenes. For example, displaying a 10x10x10 grid of rotating transparent boxes can run at around 20 frames per second on ordinary computers, depending on the graphics card.

In OpenGL it is possible to render to several textures in a single pass, but WebGL permits attaching just one texture to a framebuffer object, hence a large number of separate renders is needed.
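For readers who have not set up a render-to-texture target before, a minimal sketch follows. It assumes `gl` is a WebGL 1 rendering context and uses only standard WebGL calls; the function name `createRenderTarget` and the choice of a 16-bit depth renderbuffer are illustrative and not taken from GlowScript.

```javascript
// Create a framebuffer whose single color attachment is a texture,
// plus a depth renderbuffer so the depth test works while rendering into it.
function createRenderTarget(gl, width, height) {
  const texture = gl.createTexture();
  gl.bindTexture(gl.TEXTURE_2D, texture);
  gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, width, height, 0,
                gl.RGBA, gl.UNSIGNED_BYTE, null);
  // Non-power-of-two sizes need these settings in WebGL 1.
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);

  const depthBuffer = gl.createRenderbuffer();
  gl.bindRenderbuffer(gl.RENDERBUFFER, depthBuffer);
  gl.renderbufferStorage(gl.RENDERBUFFER, gl.DEPTH_COMPONENT16, width, height);

  const framebuffer = gl.createFramebuffer();
  gl.bindFramebuffer(gl.FRAMEBUFFER, framebuffer);
  gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0,
                          gl.TEXTURE_2D, texture, 0);
  gl.framebufferRenderbuffer(gl.FRAMEBUFFER, gl.DEPTH_ATTACHMENT,
                             gl.RENDERBUFFER, depthBuffer);
  const complete =
    gl.checkFramebufferStatus(gl.FRAMEBUFFER) === gl.FRAMEBUFFER_COMPLETE;
  gl.bindFramebuffer(gl.FRAMEBUFFER, null); // back to rendering to the canvas
  return { framebuffer, texture, depthBuffer, complete };
}
```

Each of the color and depth textures in the render list would need a target like this, bound before its pass and unbound before the final merge to the canvas.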

To see this algorithm in action:

http://www.glowscript.org/#/user/GlowSc … ansparency

Note that the code for this transparency demo is remarkably short. GlowScript is aimed at making it feasible for nonexpert programmers to exploit WebGL to generate navigable real-time 3D animations. At glowscript.org the Help provides details about this, and there is a link to more Examples.

At https://bitbucket.org/davidscherer/glowscript, in the Source section, the key files dealing with depth peeling are lib/glow/WebGLRender.js and the shader programs in the shaders folder.

Bruce_Sherwood · April 7, 2012, 9:11am

I just came across a discussion of how to implement “Fast Approximate Anti-Aliasing” (FXAA) to get around the problem that the use of framebuffer objects turns off anti-aliasing:

http://www.codinghorror.com/blog/2011/1 … -fxaa.html

I haven’t tried this myself.

Ralph · July 15, 2012, 12:48pm

Nice writeup, made my life easier. I implemented it myself this weekend because I wasn’t too satisfied with my triangle sorting transparency. This works well and at acceptable speed. The only drawback I found so far is that mobile devices don’t seem to have that many texture units. I tested it on a Galaxy S3, which ran all my WebGL stuff so far, and both your demo and my implementation fail there.

Bruce_Sherwood · July 16, 2012, 9:29am

I’m delighted that you found my post to be useful. Thanks for the report on the Galaxy S3. I had wondered how many texture units are provided on mobile devices but didn’t have a device I could try. In your own work, how many depth peels do you do? As you know, I treat the opaque peel plus 4 transparent peels. If you do only the opaque peel and (say) 2 transparent peels, does it work on the Galaxy S3?

Ralph · July 16, 2012, 9:53am

I’m also doing 4 transparent peels. But good idea reducing that number. Will do some testing later and report my findings.

Ralph · July 16, 2012, 11:06am

So I did another test. Doing only two peels doesn’t really help much. You need C0, C1, C2, D0, D1 anyway. gl.texture0 and gl.texture1 are used too, so 7 units are needed. To be honest, I still had D2 bound during my test, so I used 8, but I doubt that if it’s not working with 8 texture units it would work with 7.

Unfortunately my Google skills couldn’t dig up any info regarding this.

Bruce_Sherwood · July 16, 2012, 1:27pm

Thanks for the news report. I guess you could try just one transparent peel, just to see what happens? That would only require C0, C1, and D0, right?

Ralph · July 16, 2012, 2:38pm

Well that wouldn’t make much sense as that would be worse than standard transparency.

Another idea I had was recycling tex0 and tex1. I only use them during the opaque render pass. That would mean

Pass 1:
  target C0 uses C1 D1
  target D0 uses C1 D1
Pass 2:
  target C1 uses D0
  target D1 uses D0
Pass 3:
  target C2 uses D0 D1
  target D2 uses D0 D1
Pass 4:
  target D1 uses D0 D2

That would mean having 3 transparent peels and using 6 texture units. But before I implement that I’ll try to find out how many texture units I have on the phone. Guess I’ll need something like glGet( GL_MAX_TEXTURE_UNITS ).
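In WebGL the corresponding query goes through `gl.getParameter`. A minimal sketch, assuming `gl` is a WebGL rendering context obtained from a canvas; the helper names `textureUnitLimits` and `hasEnoughUnits` are made up for illustration.

```javascript
// Query the texture-unit limits of a WebGL context.
function textureUnitLimits(gl) {
  return {
    // textures a fragment shader can sample at once
    fragment: gl.getParameter(gl.MAX_TEXTURE_IMAGE_UNITS),
    // textures a vertex shader can sample at once (may be 0)
    vertex: gl.getParameter(gl.MAX_VERTEX_TEXTURE_IMAGE_UNITS),
    // total across both shader stages
    combined: gl.getParameter(gl.MAX_COMBINED_TEXTURE_IMAGE_UNITS),
  };
}

// True if the fragment shader can sample `needed` textures at once.
function hasEnoughUnits(gl, needed) {
  return textureUnitLimits(gl).fragment >= needed;
}
```

Calling `hasEnoughUnits(gl, 6)` on the phone would show whether the 6-unit plan fits. If it returns true and the render still fails, the cause is probably something other than the texture-unit count.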