Transparency issues, polygon sorting, headaches...

I am an intermediate OpenGL programmer. I’ve been doing basic OpenGL stuff for years, so I’m not a newbie, but I’m also not familiar with all the advanced techniques that are out there.

One problem that has always come up over the years is efficiently and correctly drawing multiple transparent, complex objects on the screen (mixed with opaque objects as well). These are objects that move around relative to each other, sometimes overlapping, typically with a user controlling the viewpoint location as well. This is a pretty classic problem.

The current application I am writing is really frustrating me. This particular application uses OpenGL to draw 3 complex objects on the screen:

  1. An opaque, convex surface model consisting of approximately 2000 triangles.
  2. A transparent, extremely complex, definitely-not-convex (and possibly not even fully closed) surface model, requiring no culling and two-sided lighting, and consisting of anywhere from 40000 to 200000 triangles.
  3. A transparent, textured quad with an animated texture on it, requiring no culling and two-sided lighting.

In this application, all 3 objects are constantly moving around relative to each other. In particular, the objects can overlap or be inside each other. The user can adjust the camera position and orientation and will do so frequently. The view may be from inside surface #2 looking out. Two things are critical:

  - High frame rates at all times (>= 30 FPS).
  - Correct (i.e. relatively realistic, though not necessarily perfect) appearance of transparency, even when objects overlap. For example, if textured quad #3 is partially inside surface model #2, it must look correct. Note that a single triangle in quad #3 may very likely be partially contained in the surface model, so even sorting triangles back-to-front would not handle the case correctly.

I feel I’m faced with a very common problem. Ideally, I’d want to render all fragments from back to front, applying the blend function (GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA) (or some other blend function, but that one gives the kind of results I’m looking for). I can sort the objects back-to-front, but the scene is too complex to get satisfactory results from this, and it does not handle self-transparency or overlapping objects either. I can sort all triangles from back to front, but this does not handle overlap correctly either, and it is far too slow to be useful. The best implementation I know is a quicksort on triangles keyed on their centroid’s GL window Z coordinate, which means first multiplying all centroids by the current matrix stack by hand; even with optimized math libraries like GSL this is just too slow as triangle counts reach 200000 and up.

It’s definitely a priority for me to get it working for the particular case in my current application, but I’d really like to add some tips and techniques for general case transparency issues to my belt.

So far I’ve mostly tried a lot of depth buffer and blend function “tricks”, none producing completely satisfactory results. Even the simple technique of turning off depth buffer writes doesn’t produce good results with non-commutative blend functions like (GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA). As an example of how weird things have been getting, here is the last thing I tried, where s_BigThing.Render() draws a complex non-convex surface model:

  glColor4f(1, 1, 0, s_Alpha);

  // fill depth buffer with nearest values

  // draw only the farthest faces
  glBlendFunc(GL_SRC_ALPHA, GL_ZERO);

  // refill depth buffer with nearest value

  // draw only the nearest faces, showing far faces through them.

That is ridiculous, of course. Four-pass rendering is not acceptable: although performance is still OK, it doesn’t seem to make sense. It also doesn’t deal with intersecting or obscured objects, and it doesn’t quite look right anyway (many faces are missing when the object obscures itself multiple times). Still, it’s the closest I’ve gotten. I’ve gotten OK results using clipping planes to render ranges of the scene from farthest to nearest, but this requires many passes and still looks pretty ugly unless you have a lot of “depth sections”. Typically when I try to come up with a new solution, I get one of the same types of problems in the result:

  - Too slow. E.g. sorting half a million triangles, or rendering the scene a hundred times per frame.
  - Triangle “flashing” problems as back-to-front drawing order changes when an object is moving (for example, rotating a surface model).
  - Too much contrast. For example, using (GL_SRC_ALPHA, GL_ONE) with depth writes turned off produces exceedingly bright areas where fragments overlap, even though it gets rid of that “flashing” effect.
  - Non-commutative blend functions producing ugly results when polygons are drawn out of order. In the worst case, if you use (GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA) but draw polygons front-to-back, the farther away ones appear closer just because of the way the blending works (their fragment colors are weighted more in the final result).
  - I have yet to find any technique that produces even slightly satisfactory results for two overlapping objects (e.g. triangles intersecting each other) when both are transparent.

I’m really looking for some good techniques that can help me do all of this. I’m willing to take any approach and to spend a good amount of time sitting down and learning if necessary (I’m in no rush; I’d like to figure it out once and for all). The only requirements are the two listed above (performance and relative correctness).

Really what I want is a graphics card feature that stores all fragments in buckets, then sorts them all before blending and drawing, and does it really, really fast. Then I wouldn’t have to think about it at all.

Thanks a lot,


There are several techniques that you can use:

  1. Depth peeling: check the NVIDIA site for more details; they present both front peeling and dual peeling. (High-quality results, but with a performance penalty.)
  2. Weighted average: check the NVIDIA site for more details; it averages based on weight and is order independent. (Results are medium quality.)
  3. Sort the triangles back to front and blend (using associative colors): good quality but with artifacts, very good performance. The sorting should be done with a “bucket” sort: divide the Z range into x buckets (x can be dynamic; I use 4096), put each triangle center in the right bucket, and draw using an index array. This is very fast and gives very good results when triangles are small. The sort takes only a few milliseconds for hundreds of thousands of triangles. You could also refine the “intersected” triangles for better results.
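
The bucket sort in item 3 could look roughly like the following C++ sketch. It is a two-pass counting sort on the depth of each triangle center and is only one possible reading of the description above; the function name, the parameter names and the convention that larger depth means farther from the viewer are assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns triangle numbers ordered back to front (farthest first).
// depth[i] is the distance of triangle i's center from the viewer
// (assumption: larger = farther). Assumes z_far > z_near and bucket_count > 0.
std::vector<std::uint32_t> bucket_sort_back_to_front(
    const std::vector<float>& depth, float z_near, float z_far,
    std::size_t bucket_count = 4096)
{
    const float scale = static_cast<float>(bucket_count) / (z_far - z_near);

    // Bucket 0 holds the farthest triangles, so a plain ascending
    // counting sort yields back-to-front order. Out-of-range depths
    // are clamped into the first or last bucket.
    auto bucket_of = [&](float z) -> std::size_t {
        const float t = (z - z_near) * scale;
        std::size_t b = 0;
        if (t >= static_cast<float>(bucket_count)) b = bucket_count - 1;
        else if (t > 0.0f) b = static_cast<std::size_t>(t);
        return bucket_count - 1 - b;
    };

    // Pass 1: count triangles per bucket (shifted by one for the prefix sum).
    std::vector<std::size_t> offset(bucket_count + 1, 0);
    for (float z : depth) ++offset[bucket_of(z) + 1];

    // Prefix sum: offset[b] becomes the first output slot of bucket b.
    for (std::size_t b = 0; b < bucket_count; ++b) offset[b + 1] += offset[b];

    // Pass 2: scatter triangle numbers into their bucket's range.
    std::vector<std::uint32_t> order(depth.size());
    for (std::size_t i = 0; i < depth.size(); ++i)
        order[offset[bucket_of(depth[i])]++] = static_cast<std::uint32_t>(i);
    return order;
}
```

The returned order can then be expanded into a vertex index array (three indices per triangle) and drawn in one call, for example with glDrawElements. Triangles that land in the same bucket keep their submission order, which is where the artifacts mentioned above can come from.
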


I just wanted to note that the one company/person who solves this in one neat, fast and simple way could become rich.
I have one or two ideas about this, but the hardware is not there yet.
But if I had a blend shader (hint hint :) ) and a bunch of MRTs I could solve it.

I am hoping for a blend shader pipeline stage where I am able to combine all fragments at each pixel however I want. :D

I had the same problem, and sorting the transparent faces was way too slow. Here is how I solved it:

I created a “depth buffer” that every polygon gets added to, using a single point of the polygon (e.g. Z value of center of gravity, or greatest/smallest Z value) as index value. What I have to do for this is to transform these “pivot” points in my own code like OpenGL would do with the current modelview matrix.

To keep the buffer reasonably small, actual Z values are scaled so that they fit in the depth buffer (Z * <depth buffer size> / ((zfar - znear + 1) * <precision>)). <precision> would be how many floating point digits should be preserved.

I am using a secondary table the depth buffer points into to resolve collisions and sort faces coming from different types of objects depending on these objects’ rendering priority.

When my renderer encounters a transparent face, it stuffs it into this depth buffer.

To render the depth buffer contents, walk through it, starting at its end. You can do some optimization on walking it.

No sorting -> very fast.

I hope I was clear enough.

It is an interesting way to sort as you draw, but unless I am missing something, it still requires breaking the draw primitives in order to apply the same “sort” inside primitives (when needed).


My transparency renderer is fed with OpenGL draw primitives (triangles, quads, etc.). Higher render functions (e.g. those rendering a 3D object) will feed the transparency renderer with single transparent faces of the object they’re responsible for.

You can however tell my transp. renderer the “type” of the face, and it will group faces of the same type at the same sort buffer position together. As collisions are rather rare if you choose your sort buffer size properly, not much sorting is involved.

Something on my buffer construction:

It uses two buffers: The depth sort buffer only points into a second buffer holding linked lists of the faces to be rendered. If several faces at the same depth sort buffer are to be rendered, the list of faces pointed to by the depth sort buffer gets longer.
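
As a rough illustration of this two-buffer layout, here is a minimal C++ sketch: one array of list heads indexed by scaled depth, and a second array holding the linked-list nodes. All names are hypothetical, the depth-to-slot mapping is a simple linear scale rather than the exact formula given earlier, and the grouping by face type or rendering priority is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::int32_t kNoNode = -1;  // marks an empty slot / end of list

struct DepthSortBuffer {
    struct Node { std::uint32_t face; std::int32_t next; };

    std::vector<std::int32_t> heads;  // depth sort buffer: first node per slot
    std::vector<Node> nodes;          // second buffer: linked lists of faces
    float z_near, z_far;              // assumes z_far > z_near

    DepthSortBuffer(std::size_t slots, float z_near_, float z_far_)
        : heads(slots, kNoNode), z_near(z_near_), z_far(z_far_) {}

    // Insert a face using the depth of its pivot point (larger z = farther).
    void add(std::uint32_t face, float z) {
        const float size = static_cast<float>(heads.size());
        const float t = (z - z_near) / (z_far - z_near) * size;
        std::size_t slot = 0;
        if (t >= size) slot = heads.size() - 1;
        else if (t > 0.0f) slot = static_cast<std::size_t>(t);
        // A collision just makes the list at this slot one node longer.
        nodes.push_back({face, heads[slot]});
        heads[slot] = static_cast<std::int32_t>(nodes.size() - 1);
    }

    // Visit faces starting at the end of the buffer, i.e. farthest first.
    template <typename Visit>
    void walk_back_to_front(Visit visit) const {
        for (std::size_t s = heads.size(); s-- > 0;)
            for (std::int32_t n = heads[s]; n != kNoNode; n = nodes[n].next)
                visit(nodes[n].face);
    }

    // Reset for the next frame without releasing memory.
    void clear() {
        heads.assign(heads.size(), kNoNode);
        nodes.clear();
    }
};
```

Insertion is constant time per face and no comparison sort is run; faces that share a slot come out in reverse insertion order, so a real implementation would still need some ordering rule inside a slot if that matters.
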

I have also added some code to break very big faces into smaller ones (desired max. size controlled by parameter).

Another thing I have implemented is a “mesh optimizer” splitting the entire mesh of an object (currently just level geometry) into sufficiently small chunks, properly recomputing texture coordinates (could ofc be done by the level editor, but currently isn’t. Also, doing it at level load time allows me to do it for old levels, too).

Thanks for all the replies, guys. I’m doing a lot of research on algorithms that people have come up with. Right now I’m trying to tackle depth peeling. Splitting models up into BSP trees is an option I may look at later, except that in my particular application the models can be very, very complex, and the extra geometry from splitting triangles may not be worth it.

I have also had limited success using the GL_MULTISAMPLE + GL_SAMPLE_ALPHA_TO_COVERAGE trick – but the results are not perfect and only really look good in highly constrained situations (alpha is a multiple of 1/numsamples, no overlapping faces with same alpha value [or “screen door” effect hides the far one], must use textures instead of vertex colors).

I’ll post back here with a depth peeling example once I get it working. Information has been really hard to find, and I am getting familiar with vertex/fragment shaders first.

karx11erx: Your algorithm makes sense and is pretty clever, thanks. Unfortunately in my particular application, intersecting faces are very common. Doing real-time boolean operations on models with 200K+ faces each is not an option I want to pursue right now, though, at least not until I’ve tried some more general solutions. So splitting faces will have to wait.