Single stencil pass shadows

I experimented a bit with stencil shadows after reading about Carmack’s reverse method, and arrived at a single-pass algorithm that goes as follows.

draw geometry
clear stencil to 0
stencilfunc( always )
color/depth mask 0

disable culling
stencilop( keep, invert, keep )
for( i = 0; i < objects; i++ ) {
    stencilmask( 1<<i );
    draw shadow volume
}

enable culling, depth/color mask 1
stencilop( keep, keep, keep )
stencilfunc( not equal, 0, ~0 )
draw shadow

As you can see, the number of convex shadow volumes that can be drawn is limited by the number of bits in the stencil buffer, unless you run an extra pass to convert everything != 0 to 1…
What I want to know is whether this is a common method that I’ve just overlooked or whether it’s new.
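To make the bit-plane idea concrete, here is a minimal software sketch of what happens to a single stencil pixel. It assumes an 8-bit stencil buffer, and the helper names (stencil_invert, touch_face, in_shadow, collapse) are made up for illustration; they are not GL calls. The collapse step is only one possible reading of the "extra pass" mentioned above.

```c
#include <assert.h>
#include <stdint.h>

#define STENCIL_BITS 8 /* assumption: 8-bit stencil buffer */

/* GL_INVERT under glStencilMask(mask): only the bits set in mask flip. */
uint8_t stencil_invert(uint8_t stencil, uint8_t mask)
{
    return (uint8_t)(stencil ^ mask);
}

/* One shadow-volume face of object `object` triggers the invert on this pixel. */
uint8_t touch_face(uint8_t stencil, int object)
{
    assert(object >= 0 && object < STENCIL_BITS); /* one bit per object */
    return stencil_invert(stencil, (uint8_t)(1u << object));
}

/* Final pass: stencilfunc( not equal, 0, ~0 ). */
int in_shadow(uint8_t stencil)
{
    return stencil != 0;
}

/* Possible extra pass: everything != 0 becomes 1, so bits 1..7 can be reused. */
uint8_t collapse(uint8_t stencil)
{
    return (uint8_t)(stencil != 0 ? 1 : 0);
}
```

A pixel touched by one face each from objects 0 and 3 ends up as 0x09 and is shadowed; a second face from the same object flips its bit back, so overlapping shadows from different objects do not cancel each other.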

I haven’t tackled Carmack’s reverse yet (time constraints), but I’ve done standard shadow volumes.
Your new algorithm looks plausible - congratulations if you’ve eliminated one of the passes! Maybe Carmack will offer you a job!

If I understand your code, I think I see an issue: the shadows invert rather than increment/decrement, so complex shadows that overlap on the shadowed surface (not just in projection from the viewer) don’t work.

Sorry, I can’t be sure, but I have trouble with your pseudocode; it seems ambiguous. Could you give me more verbose code? When you say “draw geometry” etc., is this pseudocode in call order, or have you reversed the order and listed the state after the draw? When you draw the shadow volume, is this the outside, the inside, both, or just one? Is a phrase like “draw shadow” a pass for the unlit geometry or another pass on the volume (I only ask because you say “draw shadow volume” elsewhere)? When you say enable culling, do you mean by face winding? Which face, front or back?

Don’t post specific answers, just dumb down the pseudocode for me, put state before the draw (so I can be sure) and explicitly explain what you draw with more verbosity (and consistency).

For convex meshes this works fine, because he sorts by object. The only problem is that you have only 8 different object bits, so if objects which receive the same bit overlap, you get problems…

Yes, dorbie is right. It only works with simple occluders that generate non-intersecting shadow volumes (so if you use the silhouette to compute the shadow volumes, only convex occluders will work; multiple convex occluders should work with your method, though). Mark Kilgard talked about the invert trick in a presentation on NVIDIA’s site.
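A small software sketch of why this restriction matters, assuming depth-fail testing (back faces increment and front faces decrement when they fail the depth test) and a single shared stencil bit for the invert variant. The Facing type and both function names are hypothetical, chosen only for this illustration.

```c
#include <assert.h>

typedef enum { FRONT_FACE, BACK_FACE } Facing;

/* Counting variant: each face that fails the depth test at this pixel
   increments (back face) or decrements (front face) the stencil value. */
int count_stencil(const Facing *failed_faces, int n)
{
    int value = 0;
    int i;
    assert(n >= 0);
    for (i = 0; i < n; i++)
        value += (failed_faces[i] == BACK_FACE) ? 1 : -1;
    return value;
}

/* Invert variant with one shared bit: every failing face toggles the bit. */
int invert_stencil(const Facing *failed_faces, int n)
{
    int bit = 0;
    int i;
    assert(n >= 0);
    for (i = 0; i < n; i++)
        bit ^= 1;
    return bit;
}
```

For a point inside two intersecting volumes, the two back faces behind it both fail the depth test: counting yields 2 (shadowed), while the shared bit is toggled twice and ends at 0 (wrongly lit). With a single convex volume the two variants agree.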


The invert approach for convex shadow volumes is well-known. Unfortunately, for most general purpose uses of shadow volumes, we can’t make the assumption of convex volumes.

The two-sided stencil extension mentioned in this paper is a single pass solution for general shadow volumes. NVIDIA and ATI are currently working on an EXT_ spec (as opposed to an NV_ or ATI_ spec) for supporting this functionality.

Thanks -

My code goes like this:

void draw_it( void )
{
    int i;

    // gl state here is what you normally use to draw your world
    // draw the world as usual
    for( i = 0; i < num_objects; i++ ) draw_object( i );

    // fill the stencil buffer
    glDepthFunc( GL_LESS );
    glStencilMask( ~0 );
    glStencilFunc( GL_ALWAYS, 0, 0xFFFFFFFF );
    glClearStencil( 0 );
    glClear( GL_STENCIL_BUFFER_BIT );   // actually clear the stencil to 0
    glEnable( GL_STENCIL_TEST );
    glColorMask( 0, 0, 0, 0 );
    glDepthMask( 0 );

    glDisable( GL_CULL_FACE );
    // invert on depth fail, as in the pseudocode: stencilop( keep, invert, keep )
    glStencilOp( GL_KEEP, GL_INVERT, GL_KEEP );
    // draw the shadow volumes, that must be convex for this to work
    // the maximum number of those volumes is limited here by the bits
    // in the stencil buffer
    for( i = 0; i < num_objects; i++ ){
        glStencilMask( 1<<i );          // one stencil bit per object
        draw_shadow_volume( i );
    }
    glEnable( GL_CULL_FACE );

    // now draw the overlay using the stencil buffer (either light or shadow)
    glStencilOp( GL_KEEP, GL_KEEP, GL_KEEP );
    glColorMask( 1, 1, 1, 1 );
    glDepthMask( 1 );
    glDepthFunc( GL_EQUAL );
    if( draw_lighted ){
        // add light only where no shadow bit is set
        glStencilFunc( GL_EQUAL, 0, ~0 );
        glBlendFunc( GL_ONE, GL_ONE );
        glEnable( GL_LIGHTING );
        for( i = 0; i < num_objects; i++ ) draw_object( i );
        glDisable( GL_LIGHTING );
    } else {
        // touch only pixels where at least one shadow bit is set
        glStencilFunc( GL_NOTEQUAL, 0, ~0 );
        // draw the shadow overlay here (the "draw shadow" step of the pseudocode)
    }
}

The code works (if I did not forget anything here), although my test program is very simple. The shadows can overlap, because each object’s shadow is stored in a separate bit plane.
The assumption of convex shadow volumes is the key point here.
In the two-pass code you incr/decr, and that way you can have as many shadow volume surfaces overlap as you want. But if you have just two surfaces (the front and the back of a convex volume), you have only three possible values:
-1: only decr
0: nothing or incr and decr
1: only incr
So if neither or both surfaces pass/fail you have light, otherwise shadow. If you invert instead, you get 1 when exactly one surface passes/fails and 0 when none or both do -> same result.
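The argument can be checked by enumerating the four cases for the two surfaces of one convex volume. This is only a sketch; count_result, invert_result and methods_agree are hypothetical names, and incr_hit / decr_hit stand for "the surface that would increment (or decrement) triggered the stencil op on this pixel".

```c
#include <assert.h>

/* Two-pass counting: +1 if the incrementing surface triggers,
   -1 if the decrementing surface triggers. */
int count_result(int incr_hit, int decr_hit)
{
    return (incr_hit ? 1 : 0) - (decr_hit ? 1 : 0);
}

/* Single-pass invert: each triggering surface flips the object's bit. */
int invert_result(int incr_hit, int decr_hit)
{
    return (incr_hit ? 1 : 0) ^ (decr_hit ? 1 : 0);
}

/* Returns 1 if both methods classify all four cases the same way
   (nonzero = shadow, zero = light). */
int methods_agree(void)
{
    int incr_hit, decr_hit;
    for (incr_hit = 0; incr_hit <= 1; incr_hit++) {
        for (decr_hit = 0; decr_hit <= 1; decr_hit++) {
            int counted  = count_result(incr_hit, decr_hit) != 0;
            int inverted = invert_result(incr_hit, decr_hit) != 0;
            if (counted != inverted)
                return 0;
        }
    }
    return 1;
}
```

The equivalence holds only because each bit sees at most two surfaces; with more surfaces per bit the parity and the count can disagree.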

Oops, I think I came a bit late here.

That’s just what I wanted to know. I never came across this method because all the demos and such used the other methods. So unfortunately it was not so well-known to me.
Anyway, thanks for the replies.

Hey cass, are you spending your days hacking in better stencil support? To be honest, the current usefulness of the stencil buffer is a bit weak, to say the least. But if you look at it, it wasn’t long ago that the stencil buffer was considered pretty much useless short of masking away parts of the framebuffer. It’s really only quite recently that more and more people have been trying to use the stencil buffer for shadows and more complex effects (depth peeling and all sorts of things).

CASS !!! I wanted a great stencil buffer, I wanted it in hardware, and I wanted it yesterday!!!


Have fun guys and don’t hurt yourself, opengl can bite ya !


I want programmable framebuffer access with its own reads and writes and such, and bit operators there, too. And ifs, to check depth for example. And more return values than just RGB in the pixel shader part, so that in the VS/PS part we can also generate the correct values for stenciling, for example, and pass them over to the framebuffer program, which then counts the shadows, does the depth testing and blends with complex blending equations.

Or better, let’s finally have hardware raytracing…

Originally posted by cass:
The two-sided stencil extension mentioned in this paper is a single pass solution for general shadow volumes. NVIDIA and ATI are currently working on an EXT_ spec (as opposed to an NV_ or ATI_ spec) for supporting this functionality.

Now that’s good news

* Humus applause for nVidia & ATi cooperating on something *

Humus, you don’t get it… that was a late April Fools’ Day joke from cass…

No, really… wonderful… they are WORKING TOGETHER! If this is the future of GL, GL will rock again…