Sorting of transparent primitives ...

It really annoys me that OpenGL needs to be given the transparent triangles, quads, … whatever, in sorted order to render them correctly with GL_DEPTH_TEST enabled.

Is there a better solution to the problem?

If not, why doesn’t OpenGL supply an adequate rendering pipeline for this?

If yes, TELL ME =)

Well, the same goes for Direct3D as well.

And sorting polygons is not OpenGL’s job. It’s done at a higher level, in a retained-mode API for example. OpenGL is an immediate-mode API, which means that primitives are drawn as you pass them to OpenGL. OpenGL doesn’t know anything about what has been passed or what will be passed. It just draws whatever you pass, whenever you pass it. If OpenGL were to perform depth sorting, it would have to know about the entire scene before anything at all could be drawn.

And as far as I know, depth sorting is about the only way to go.

Try a BSP tree.

But what about the NV20? Something tells me that it does depth testing somewhat differently, with something like a tile algorithm. Couldn’t there be an extension where you tell the driver that a face is transparent, and it sorts those faces along with the opaque ones and renders them afterwards? Mmmh.

OpenGL is a strictly in-order, immediate mode API.

Sorting primitives in the driver would be difficult at best. If we batched them up and an app kept drawing and drawing, we would run out of memory, at which point we’d have to start rendering or fail. But if we started rendering, then we might get the sorting wrong.

This is why low-level 3D APIs are generally not scene-capture APIs. Implementations can choose to capture bits and pieces of scenes and decide what the most efficient way to render them is, but the final result must be identical to that of rendering strictly in-order.

- Matt

Oh, well. Hmmm.
I know that you won’t say anything about it, but I don’t believe that the NV20 will still have a z-buffer; I wouldn’t understand the “fictitious” messages out there otherwise. So, if one does NOT sort per pixel, one possibility is to tile the image into a raster of small polys, where it’s not very likely that they still intersect, and draw them in back-to-front order. And if one used such an approach, it would also be easy to get rid of that transparency problem.
But when can the driver be sure that the frame is completely defined? After the next glClear( GL_COLOR_BUFFER_BIT )? That doesn’t seem very likely.
Are there any other alternatives to z-buffer tests?

Well, I use quicksort once every frame for all transparent primitives, but it drags performance down, and it’s a bit of a hassle when mixing triangles, particles, etc.

I wish that someone could explain in detail why/how alpha-blended transparent primitives fail to render correctly when not drawn in sorted order.

Is it because the alpha value is blended immediately and the data concerning depth of that fragment is lost?

The problem is that the framebuffer doesn’t
remember “how much” of the pixel color comes
from “solid” textures, what depth the “solid”
textures were at, and how much comes from
“translucent” tinting, and where that was at.

You don’t necessarily have to draw them from
back to front, depending on what the textures
look like, and what your texturing mode is.
If you’re drawing in MODULATE mode with a lot
of transparency (for flares or other particle
effects, for instance) you can probably get
away with just turning off Z writing and then
draw in any ol’ order. You still need to have
Z test on for the most part, though.

The problem is that, because you have Z
writing turned off, the non-transparent parts
(if you have opaque, DECAL-style texturing)
of a primitive drawn will always overdraw
what’s on the screen already, even if that
“already” thing was closer in Z. Conversely,
if you do not turn off Z writes, and draw in
random order, something which is supposed to
be drawn “behind” a window will not be drawn
if it’s drawn after the window, because the
Z buffer will clip it out.
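
To make the two failure cases above concrete, here is a rough one-pixel simulation in Python. It is only a sketch: the `draw` and `render` helpers, the grey-level colours and the depth values are all made up for illustration, and the blend is assumed to be the usual SRC_ALPHA / ONE_MINUS_SRC_ALPHA one with a "less than" depth test.

```python
def draw(pixel, frag, depth_test=True, depth_write=True):
    """Apply one fragment to a one-pixel framebuffer (hypothetical helper).

    pixel: dict with 'color' (grey level) and 'depth' (smaller = closer)
    frag:  dict with 'color', 'alpha' and 'depth'
    """
    if depth_test and frag["depth"] >= pixel["depth"]:
        return  # fails the "less than" depth test, fragment is discarded
    a = frag["alpha"]
    # Assumed blend: SRC_ALPHA / ONE_MINUS_SRC_ALPHA
    pixel["color"] = a * frag["color"] + (1.0 - a) * pixel["color"]
    if depth_write:
        pixel["depth"] = frag["depth"]


def render(frags, depth_write):
    """Draw fragments in the given order and return the final grey level."""
    pixel = {"color": 0.0, "depth": 1.0}  # cleared to black at the far plane
    for frag in frags:
        draw(pixel, frag, depth_write=depth_write)
    return pixel["color"]


window = {"color": 1.0, "alpha": 0.5, "depth": 0.2}   # translucent, near
poster = {"color": 0.25, "alpha": 1.0, "depth": 0.6}  # opaque texel, far

correct = render([poster, window], depth_write=True)     # back to front
z_write_on = render([window, poster], depth_write=True)   # poster is rejected
z_write_off = render([window, poster], depth_write=False) # poster overdraws window

print(correct, z_write_on, z_write_off)
```

Only the back-to-front order gives the window tinted over the poster; the other two orders lose either the poster or the window.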

Right. I did it exactly this way: first I rendered the opaque things with DEPTH_TEST enabled, and then the transparent ones without DEPTH_TEST. I didn’t care about the ordering of either, and I always used MODULATE. Everything was OK.

PowerVR up to Series 3 (KYRO) does it for you.
So the Dreamcast auto-sorts transparent polys into the right order.

I love the Dreamcast, the most dev-friendly hardware I’ve seen.

The PowerVR approach is what I meant; I think they implement it the way I described.
And I also think that the NV20 will take a similar approach to depth testing. (I know that only from rumors, no details; I’m not even sure.)

Well, given that Matt said that sorting the polygons in the driver would be a drag, I doubt the NV20 will do it. And for those of you who insist that there is a way to do transparency without sorting transparent polygons:

Blending modes can be thought of as binary operations on RGBA values. The specific operation to be performed is determined by glBlendFunc, but in the end you have some operation, say #, that is used to blend as:

dest = dest # source

Where opaque blending is the obvious operation

dest = source

And drawing a pixel of value source1 and then one of value source2 gives

dest = (dest # source1) # source2

whereas if we draw in the opposite order we get

dest = (dest # source2) # source1

Hopefully it is clear that the order in which objects are drawn matters in general, since the operation # isn’t always (and rarely is) commutative (commutative means a#b=b#a). Additive blending and modulation (componentwise multiplication) are the only two commutative blend modes in common use. Alpha blending is neither commutative nor associative, making it extremely nasty. That means that if we just have a framebuffer, some blending ops and some fragments to draw, we’d better make sure we submit them in the order we want them drawn (back to front). That’s what the painter’s algorithm and most every software renderer is all about. This is true even if we are doing opaque blending, since a#b=b for opaque blending, obviously not a commutative op.
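
A quick numeric check of this, as a Python sketch rather than real GL code. The three functions stand in for the # operation under three blend setups (colour channels only, no clamping); the names and the sample colours are purely illustrative.

```python
from functools import reduce


def alpha_blend(dest, src):
    """dest # src for an assumed SRC_ALPHA / ONE_MINUS_SRC_ALPHA setup."""
    rgb, a = src
    return tuple(a * s + (1.0 - a) * d for s, d in zip(rgb, dest))


def additive(dest, src):
    """dest # src for an assumed SRC_ALPHA / ONE setup (clamping ignored)."""
    rgb, a = src
    return tuple(d + a * s for s, d in zip(rgb, dest))


def modulate(dest, src):
    """dest # src as componentwise multiplication (alpha unused)."""
    rgb, _ = src
    return tuple(d * s for s, d in zip(rgb, dest))


def draw_all(blend, dest, sources):
    """Compute ((dest # s1) # s2) # ... for the given submission order."""
    return reduce(blend, sources, dest)


black, white = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
red = ((1.0, 0.0, 0.0), 0.5)
blue = ((0.0, 0.0, 1.0), 0.5)

alpha_rb = draw_all(alpha_blend, black, [red, blue])
alpha_br = draw_all(alpha_blend, black, [blue, red])
add_rb = draw_all(additive, black, [red, blue])
add_br = draw_all(additive, black, [blue, red])
mod_rb = draw_all(modulate, white, [red, blue])
mod_br = draw_all(modulate, white, [blue, red])

print("alpha:   ", alpha_rb, alpha_br)  # differ, so order matters
print("additive:", add_rb, add_br)      # identical
print("modulate:", mod_rb, mod_br)      # identical
```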

The z-buffer is a special case, in effect, to remove the need to sort opaque objects. It works on the realization that in opaque blending only the frontmost fragment matters at each pixel. Transparency is, almost by definition, the case where more than just the frontmost pixel matters.

A z-buffer-like mechanism that can properly resolve arbitrary blend modes per pixel would have to store all the fragments that would contribute to a pixel, and once rendering was finished it would have to composite them to the framebuffer, each with its own blend function, in back-to-front order. However, this requires potentially unbounded storage per pixel (proportional to the depth complexity of non-opaque objects at that pixel). Doing things at the pixel level doesn’t remove the need to depth-sort; it just burdens the hardware with doing it per pixel, and today’s hardware can’t cope just yet. The solution for now is to keep depth-sorting, or to restrict yourself to opaque and commutative blend modes.
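
For what it’s worth, here is a toy Python version of such a per-pixel fragment store. It is a sketch under simplifying assumptions: the `FragmentBuffer` class is hypothetical, every fragment uses plain alpha blending instead of its own blend function, and larger depth means farther away. Note that the per-pixel list grows with every translucent fragment submitted, which is exactly the unbounded storage problem.

```python
from collections import defaultdict


class FragmentBuffer:
    """Hypothetical buffer that keeps every fragment until the frame ends."""

    def __init__(self):
        # (x, y) -> list of (depth, rgb, alpha); grows without bound
        self.frags = defaultdict(list)

    def submit(self, x, y, depth, rgb, alpha):
        """Record a fragment; nothing is blended yet."""
        self.frags[(x, y)].append((depth, rgb, alpha))

    def resolve(self, background=(0.0, 0.0, 0.0)):
        """Sort each pixel's fragments far to near and composite them."""
        image = {}
        for xy, frags in self.frags.items():
            color = background
            for _, rgb, alpha in sorted(frags, key=lambda f: f[0], reverse=True):
                color = tuple(alpha * s + (1.0 - alpha) * d
                              for s, d in zip(rgb, color))
            image[xy] = color
        return image


far_red = (0.8, (1.0, 0.0, 0.0), 0.5)
near_blue = (0.3, (0.0, 0.0, 1.0), 0.5)

a = FragmentBuffer()
a.submit(0, 0, *far_red)
a.submit(0, 0, *near_blue)

b = FragmentBuffer()
b.submit(0, 0, *near_blue)  # opposite submission order
b.submit(0, 0, *far_red)

image_a, image_b = a.resolve(), b.resolve()
print(image_a == image_b, image_a[(0, 0)])
```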

Sorry for making such lengthy posts.

There are probably a couple of ways to speed up the depth-sorting and rendering you need to do. First off is to utilize your scene graph, BSP, octree or whatever to do some coarse depth-sorting. The traditional approach for a BSP is to traverse from front to back and push transparent polygons onto a stack. Popping them off the stack later gives you back-to-front order. Even though you usually only have approximate depth information (like knowing one character is in front of another), this may be enough to speed up your sorting. I’d think that using a quicksort on all your transparent polygons could be a bad idea. First off, you will hopefully have a nearly sorted list already, and a quicksort with a naive pivot choice will hit its worst case quite often. Second, if you treat all the polygons uniformly you won’t be able to take advantage of things like sorting particles relative to each other by just transforming their centers.
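
One way to exploit a nearly sorted list, sketched in Python: an insertion sort, which does very little work when only a few items are out of place. Insertion sort is my suggestion here, not something named above, and the particle tuples and the `depth_of` callback are made up for the example (larger depth is assumed to mean farther away).

```python
def insertion_sort_back_to_front(items, depth_of):
    """Sort items in place, farthest first; return how many shifts were needed.

    Cheap when the list is already nearly sorted, e.g. when it is the
    previous frame's order and the camera moved only a little.
    """
    shifts = 0
    for i in range(1, len(items)):
        current = items[i]
        d = depth_of(current)
        j = i - 1
        # Move nearer items one slot to the right until current fits.
        while j >= 0 and depth_of(items[j]) < d:
            items[j + 1] = items[j]
            j -= 1
            shifts += 1
        items[j + 1] = current
    return shifts


# Hypothetical particles as (name, depth of transformed center),
# kept in last frame's order.
particles = [("a", 9.0), ("b", 7.5), ("c", 8.0),
             ("d", 4.0), ("e", 5.0), ("f", 1.0)]

shifts = insertion_sort_back_to_front(particles, depth_of=lambda p: p[1])
print([name for name, _ in particles], "shifts:", shifts)
```

Only the center of each particle is used as the sort key, in line with the idea of sorting particles relative to each other by their transformed centers.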

If the sorting is not taking up most of your time (and you might be able to sort concurrently with submitting opaque geometry so the accelerator doesn’t stall waiting for you), then the likely culprit is either the fact that the triangles will be drawn individually rather than as strips or lists (can’t be helped, probably), or the fact that in the worst case you will be swapping textures per polygon.

I don’t know if it’s been explored or attempted, but the texture-swapping could be attacked by trying to get a partial ordering of polygons rather than a full order. This is possible because if the screen projections of two polygons are disjoint, then they may be drawn in arbitrary order. If a partial order were achieved, then at each step there would be one or more polygons that could be drawn without error. By prioritizing those polygons which would not require a texture switch you might be able to speed things up. This would help, for instance, if you had two particle systems with different textures that interleaved a lot in z, but which projected to nearly disjoint areas of screen space. With a partial order you could render large batches of particles from each system in cases where you might otherwise have to switch between the two very frequently.
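
A rough Python sketch of that idea, under several assumptions of mine: each polygon is reduced to a screen-space bounding rectangle (a conservative stand-in for its real projection), a single depth (larger = farther, no interpenetration) and a texture name. The dict layout and helper names are hypothetical, and the brute-force loops are quadratic or worse.

```python
def overlaps(a, b):
    """True if two (x0, y0, x1, y1) screen rectangles intersect."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def order_with_few_texture_switches(polys):
    """Return a draw order that respects back-to-front only where needed."""
    n = len(polys)
    # blockers[i]: farther polygons overlapping i, which must be drawn first
    blockers = [set() for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if (i != j and polys[j]["depth"] > polys[i]["depth"]
                    and overlaps(polys[i]["rect"], polys[j]["rect"])):
                blockers[i].add(j)

    order, drawn, current = [], set(), None
    while len(order) < n:
        ready = [i for i in range(n)
                 if i not in drawn and blockers[i] <= drawn]
        same = [i for i in ready if polys[i]["texture"] == current]
        pick = (same or ready)[0]  # prefer staying on the bound texture
        order.append(pick)
        drawn.add(pick)
        current = polys[pick]["texture"]
    return order


def count_switches(polys, order):
    """Number of texture changes between consecutive draws."""
    textures = [polys[i]["texture"] for i in order]
    return sum(1 for a, b in zip(textures, textures[1:]) if a != b)


# Two particle systems interleaved in depth but disjoint on screen.
polys = [
    {"rect": (0, 0, 10, 10), "depth": 5.0, "texture": "fire"},
    {"rect": (20, 0, 30, 10), "depth": 4.0, "texture": "smoke"},
    {"rect": (0, 0, 10, 10), "depth": 3.0, "texture": "fire"},
    {"rect": (20, 0, 30, 10), "depth": 2.0, "texture": "smoke"},
]

full_sort = sorted(range(len(polys)), key=lambda i: -polys[i]["depth"])
partial = order_with_few_texture_switches(polys)
print(full_sort, count_switches(polys, full_sort))
print(partial, count_switches(polys, partial))
```

In this contrived case the strict depth sort changes texture on every draw, while the partial order batches each system together.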

I fear, however, that taking the projected area of polygons into account while sorting would not be a benefit in the long run.

If you really didn’t care at all how much framebuffer space you use, it would be possible to store every fragment that is drawn, and then sort them when you finish the scene. Either that or sort them as they are processed. That would give order independent alpha. It would also be a huge memory and computation hit, so it isn’t really practical. Yet.


Yes, I will need to optimize the sorting techniques in my engine. I will experiment with hierarchical collision detection to keep each collection of transparent primitives that needs sorting confined to a limited domain with interfering depth values.

It’s not an option to use a BSP (or the like) to optimize my totally dynamic plasma balls flying in between each other when the players battle it out. There is only one true solution to this problem … don’t use too many transparent primitives :P

Transparent primitives like particles are damn cool things. With the right math and the right number of particles you can simulate things like realistic flames and flare bursts. But I guess I have to revert to using single animated transparent fire-bitmaps instead.

If you’re using MODULATE (instead of DECAL)
then the need to sort is not as great,
especially if you turn OFF depth writes, but
KEEP ON depth testing. Try your code with no
sorting and no depth writes (but keep depth
testing on) and see if it isn’t good enough.

Originally posted by Hull:
But I guess I have to revert to using single animated transparent fire-bitmaps instead.

If it’s just fire you’re looking for, remember that not all blending modes require depth sorting. I did a pretty neat fire demo (if I may say so myself) using only additive blending. Okay, so it wouldn’t look too hot on a bright background, but still

Furthermore, you probably wouldn’t even notice the lack of depth sorting for things like your plasma guns, due to the small size and high speed of the projectiles.

- Tom