Zfail or Zpass with stencil shadow volumes?

I’ve looked at several tutorials about stencil shadow volumes (e.g. NeHe).
I saw some people using the Zfail or Zpass technique to cast shadow volumes.
They look quite similar to me. Is that right, or am I totally wrong?

Are there cases where one must be used rather than the other, or can I simply choose one and apply it all the time?

I also saw the term “Carmack’s Reverse”. Is it related to this?

Last point:
I’m a little confused by all the names of shadowing techniques because of my poor English. Is there a more efficient way to make dynamic (not soft) shadows?

Thanks.

http://developer.nvidia.com/docs/IO/2585/ATT/CarmackOnShadowVolumes.txt

The key difference between the algorithms is the capping and clipping requirements. Carmack’s reverse avoids the need to cap the shadow volumes where the near plane intersects and clips them. This near-plane capping is computationally expensive, and so is even the simple ‘eye inside’ test, so it pays to eliminate it.
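The difference described above can be illustrated with a small software model of the stencil counting along a single view ray. This is only a sketch, not code from the linked documents: the function names and the `(depth, front_facing)` representation are made up for illustration, and the zfail case assumes the volume is closed and not clipped by the far plane.

```python
def zpass_count(crossings, scene_depth, near):
    """Stencil value with depth-pass counting.

    crossings: list of (depth, front_facing) shadow volume faces hit by one
    view ray; depth is the distance from the eye.
    """
    count = 0
    for depth, front_facing in crossings:
        if depth < near:
            continue  # face lies in front of the near plane: clipped away
        if depth < scene_depth:  # depth test passes
            count += 1 if front_facing else -1
    return count


def zfail_count(crossings, scene_depth, near):
    """Stencil value with depth-fail counting (Carmack's reverse)."""
    count = 0
    for depth, front_facing in crossings:
        if depth < near:
            continue  # clipped, but these faces would pass the depth test anyway
        if depth > scene_depth:  # depth test fails
            count += -1 if front_facing else 1
    return count


NEAR = 1.0
# Volume entirely beyond the near plane: the ray enters at 2 and exits at 10.
clean = [(2.0, True), (10.0, False)]
# Same volume, but its front face at 0.5 is cut away by the near plane.
clipped = [(0.5, True), (10.0, False)]
```

A non-zero count means "in shadow". With `clean`, both methods agree for a surface at depth 5 (shadowed) and at depth 12 (lit). With `clipped`, zpass reports 0 for the shadowed surface at depth 5 and -1 for the lit surface at depth 12, while zfail still gives 1 and 0.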

see this:
http://developer.nvidia.com/docs/IO/2585/ATT/RobustShadowVolumes.pdf

Actually you should use both techniques. Use zpass whenever possible and zfail only when zpass will give you problems. There is a way to detect which one you should use; Gamasutra has an article on this. Probably the biggest reason to switch to zfail is when the volume intersects the near clip plane. That is when zpass will go bananas on you.

-SirKnight
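One possible way to make the zpass/zfail decision per occluder is sketched here. It is an assumption for illustration, not necessarily the method of the Gamasutra article mentioned above: build the pyramid between a point light and the near-plane rectangle, and fall back to zfail only when the occluder's bounding sphere may touch it. All names are hypothetical, coordinates are in eye space (camera at the origin looking down -z), and the degenerate case of a light lying in the near plane is not handled.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def plane(normal, point, inside):
    """Unit plane (n, d) through `point` with `inside` on the positive side."""
    length = math.sqrt(dot(normal, normal))
    n = (normal[0] / length, normal[1] / length, normal[2] / length)
    d = -dot(n, point)
    if dot(n, inside) + d < 0:
        n, d = (-n[0], -n[1], -n[2]), -d
    return n, d


def near_clip_volume(light, near, half_w, half_h):
    """Planes bounding the region between the light and the near rectangle."""
    center = (0.0, 0.0, -near)
    corners = [(-half_w, -half_h, -near), (half_w, -half_h, -near),
               (half_w, half_h, -near), (-half_w, half_h, -near)]
    planes = []
    for i in range(4):
        a = sub(corners[i], light)
        b = sub(corners[(i + 1) % 4], light)
        planes.append(plane(cross(a, b), light, center))  # side planes
    planes.append(plane((0.0, 0.0, 1.0), center, light))   # near plane
    planes.append(plane(sub(center, light), light, center))  # cap at the light
    return planes


def needs_zfail(planes, center, radius):
    """True if the occluder sphere may shadow part of the near rectangle."""
    return all(dot(n, center) + d >= -radius for n, d in planes)
```

For example, with `planes = near_clip_volume((0.0, 0.0, -10.0), 1.0, 1.0, 1.0)`, an occluder at `(0, 0, -5)` sits between the light and the near plane and needs zfail, while one at `(0, 0, -20)` behind the light can safely use zpass.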

“will go bananas on you”
I definitely have to remember that term :) Thanks.

Thanks to all.
It’s working fine thanks to your doc links.
Now I’m going to improve my code.

I saw no difference in performance between always using zfail and switching between zfail and zpass (maybe my models don’t have enough faces).

I also saw the problem mentioned about self-shadowing along the silhouette (the shadow edge breaks the smooth lighting).

I also have a question:
When using the stencil test, is there a performance difference between rendering a frame (e.g. 800x600) where all stencil values are 0 (with the test set to EQUAL 0) and rendering a frame of the same size where the stencil masks 80% or more of the scene?
(Maybe I’m not being precise enough with my words.)

Thanks.

I don’t think current hardware implements hierarchical stencil, which would be necessary to get a large fill-rate increase based on stencil.

You know, I almost completely forgot, but there is a neat little trick that will help save A LOT of fill when using zfail with the Pinf matrix. There is a demo of this on nvidia’s dev site called “Shadow Volume Intersection Demo.” Using this technique, you probably don’t have to switch back and forth between zpass and zfail. Just use zfail all the time.

You may not see a drop in performance between zpass and zfail right now, but you will. Zfail will most of the time use A LOT more fill than zpass, since with the zfail technique you are projecting the shadow volume to infinity. There are a lot of fragments out there at infinity. But like I say, using the technique from the demo I described will keep that from being a problem.

BTW, I figured “going bananas on you” would describe what happens when the volume in the zpass mode intersects the near plane quite nicely.

-SirKnight
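For readers unfamiliar with the Pinf matrix mentioned above: it is the usual perspective matrix in the limit where the far distance goes to infinity. The sketch below builds it in plain Python to show that a vertex extruded to infinity (w = 0) lands exactly on the far end of the depth range instead of beyond it. The function names are illustrative only, and the layout follows the OpenGL convention (camera looking down -z, NDC depth in [-1, 1]).

```python
import math


def infinite_projection(fovy_deg, aspect, near):
    """Perspective matrix with the far plane pushed to infinity (row-major)."""
    f = 1.0 / math.tan(math.radians(fovy_deg) / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, -1.0, -2.0 * near],
        [0.0, 0.0, -1.0, 0.0],
    ]


def transform(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def ndc_z(m, v):
    """Normalized device depth of eye-space vertex v after the divide by w."""
    clip = transform(m, v)
    return clip[2] / clip[3]


P = infinite_projection(60.0, 4.0 / 3.0, 1.0)
on_near_plane = ndc_z(P, [0.0, 0.0, -1.0, 1.0])     # point on the near plane
at_infinity = ndc_z(P, [3.0, 2.0, -5.0, 0.0])       # extruded vertex, w = 0
far_but_finite = ndc_z(P, [0.0, 0.0, -1000.0, 1.0])  # ordinary distant point
```

Here `on_near_plane` is -1, `at_infinity` is exactly 1, and `far_but_finite` stays just below 1, so the extruded back cap is still inside the depth range and can be counted by the zfail test.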

Another fact to keep in mind is that modern HW often has occlusion culling HW such as hierarchical Z-buffers. At least on the Radeon series they are primarily designed to cull things that fail the depth test, so they tend to operate better with the update-on-Z-pass method.

-Evan

Of course all of those fragments still have to be tested. An approach like the one in that nvidia demo cuts down the number of fragments to be rendered, so fewer go through the occlusion culling hw test, hence faster performance. Sure, that test may be pretty quick, but the more tests there are, the more time it takes to get the final render out. Now, if you don’t use an aggressive per-object scissor box and depth bounds setup like that demo, or no scissor box at all, then having the occlusion culling hw is better than not having it.

-SirKnight
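As a rough illustration of the depth bounds part of that setup, the sketch below converts the depth extent of a light's sphere of influence into window-space depth values, the kind of pair one would pass to `glDepthBoundsEXT` from EXT_depth_bounds_test. It makes several assumptions that are not from the posts above: the light has a finite radius, the projection is the infinite (Pinf) one, the depth range is the default [0, 1], and all function names are made up.

```python
def window_depth_infinite(dist, near):
    """Window depth in [0, 1] of a point `dist` units in front of the eye,
    under an infinite-far-plane projection with near distance `near`."""
    return 1.0 - near / dist


def light_depth_bounds(light_dist, radius, near):
    """Window-space (zmin, zmax) enclosing a light's sphere of influence.

    light_dist is the light's distance along the view axis (-z in eye space).
    Returns None when the whole sphere is in front of the near plane.
    """
    min_dist = max(light_dist - radius, near)  # clamp to the near plane
    max_dist = light_dist + radius
    if max_dist <= near:
        return None
    return (window_depth_infinite(min_dist, near),
            window_depth_infinite(max_dist, near))
```

For a light 10 units away with radius 5 and a near distance of 1, this gives bounds of roughly (0.8, 0.933). Shadow volume fragments whose stored depth falls outside that range cannot be affected by the light, so their stencil updates can be skipped.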

Even with aggressive scissor and depth_bounds, using zpass when you can is good advice. You render significantly fewer polygons, and many of those cap polygons would introduce pipeline bubbles (because they’re outside the aggressive scissor rect).

Thanks -
Cass

Cass:
Even with aggressive scissor and depth_bounds, using zpass when you can is good advice. You render significantly fewer polygons, and many of those cap polygons would introduce pipeline bubbles (because they’re outside the aggressive scissor rect).

I also think that scissor combined with depth_bounds can improve the fill rate, but on my GF2MX I’m still using “standard” zfail or zpass shadow volume rendering, and I noticed no difference in frame rate (with low or high polygon counts).
I think (I might be wrong) that the stencil buffer has no impact on fill rate whether it is used with Zpass or Zfail.
On the Nvidia site they show demos using only Zfail, and explain that for the stencil it makes no difference which mode you render in.
But rendering a highly detailed silhouette into a volume obviously (maybe I’m wrong here too) costs a significant drop in FPS.

will go bananas on you.

yeah that rocks

Now I’ll look further into the NVidia trick and see what it means.

Thanks for the helpful advice.

I see a lot about nVidia saying things about performance. One thing you have to remember, as much as nVidia may like to be, they are not the only accelerator manufacturer. Performance on an nVidia may not be the same as on an ATI, and I doubt there will be any mention of ATI performance on nVidia’s site.

On an IHV web site, you should expect the information to pertain to that vendor’s hardware.

On this forum, I think Evan and I do a good job of qualifying advice if it may be specific to our respective hardware, but we also try to give general-purpose advice when we can.

Thanks -
Cass

[This message has been edited by cass (edited 06-10-2003).]

Originally posted by cass:
Even with aggressive scissor and depth_bounds, using zpass when you can is good advice. You render significantly fewer polygons, and many of those cap polygons would introduce pipeline bubbles (because they’re outside the aggressive scissor rect).

Thanks -
Cass

OK, then I was right the first time. I had the theory that maybe zpass wouldn’t gain you anything while using the aggressive scissor with zfail, but I wasn’t sure at the time. I was mainly just thinking of fill rate and not the number of polygons that need to be rendered.

-SirKnight

Antorian:
on my GF2MX I’m still using “standard” zfail or zpass shadow volume rendering, and I noticed no difference in frame rate

Mea culpa, my code was a bit upside down.
Zpass can improve performance over Zfail because you don’t need to render caps.
So most of the improvement comes from using Zpass, which covers about 80% of the shadow casting time (and of course Zfail the other 20%).

Sorry for keeping the bananas for myself.

Now I’ll try both the Nvidia and ATI points of view on this.
If it rocks, I’ll tell you.

Thanks

  • Antorian goes & kills bananas with scissor… Ahahah

PS: Sorry high temperature is not good for my little brain…
Bunjiiiiii

I’ve noticed something very strange when using Zfail.
If my far clip plane (using gluLookAt) is not very far away, the shadow breaks and shows a second, inverted shadow of the projected object.
I presume that comes from the infinite projection of the shadow volume (with w=0), doesn’t it?

Do you have an idea how to get rid of that artifact without moving the far clip plane to infinity?

I use frustum culling for the whole scene, and the routines need a far clip plane that is not at infinity, but the shadow volumes show this artifact when the far clip plane is not very far away.

Ideas are welcome

Thanks.

You really need an infinite far plane (or NV_depth_clamp) for robust zfail shadows.

Why can’t you simply frustum cull with a non-infinite far plane? Does it really matter if the actual far plane is infinite? You still get the savings from the objects you don’t render.

Cass
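A short numeric sketch of why a finite far plane causes the artifact described above: with the standard finite perspective matrix, a shadow volume vertex extruded to infinity (w = 0) ends up beyond the far plane and is clipped, which opens a hole in the back of the volume that zfail relies on. The helper names are illustrative, and only the depth rows of the matrices are modeled.

```python
def ndc_z_finite(z, w, near, far):
    """NDC depth of eye-space (.., .., z, w) under a standard finite
    perspective projection (OpenGL convention, camera looks down -z)."""
    clip_z = (-(far + near) / (far - near)) * z - (2.0 * far * near / (far - near)) * w
    clip_w = -z
    return clip_z / clip_w


def ndc_z_infinite(z, w, near):
    """Same, with the far plane moved to infinity."""
    return (-z - 2.0 * near * w) / (-z)


finite_at_infinity = ndc_z_finite(-1.0, 0.0, 1.0, 100.0)
infinite_at_infinity = ndc_z_infinite(-1.0, 0.0, 1.0)
```

With near = 1 and far = 100, `finite_at_infinity` is 101/99, which is greater than 1 and therefore outside the view volume, while `infinite_at_infinity` is exactly 1 and survives clipping.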

Yeah, sure! That’s what I think.
I need a non-infinite far clip plane for some other features, but I can set it to infinity just for the shadow casting,
and afterwards go back to a not-so-far clip plane.
So I think that’s the only way for me to do it.
Thanks for the advice (argh, if only I had a GFX5900! The new depth clamping feature looks nice).

I wouldn’t change the projection matrix inside a frame - just do your frustum culling with a finite far plane while rendering with a projection matrix that has an infinite far plane.
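The separation suggested above can be sketched as follows: the culling code keeps its own finite far distance as a plain parameter, entirely independent of whatever projection matrix is handed to OpenGL. This is a minimal sketch with hypothetical names, using bounding spheres in eye space and a symmetric frustum.

```python
import math


def culling_planes(fovy_deg, aspect, near, cull_far):
    """Inward-facing unit planes (n, d) of a symmetric frustum in eye space.

    cull_far is used only for culling; the projection matrix used for
    rendering can still have its far plane at infinity.
    """
    ty = math.tan(math.radians(fovy_deg) / 2.0)
    tx = ty * aspect
    sx = math.hypot(1.0, tx)
    sy = math.hypot(1.0, ty)
    return [
        ((-1.0 / sx, 0.0, -tx / sx), 0.0),  # right
        ((1.0 / sx, 0.0, -tx / sx), 0.0),   # left
        ((0.0, -1.0 / sy, -ty / sy), 0.0),  # top
        ((0.0, 1.0 / sy, -ty / sy), 0.0),   # bottom
        ((0.0, 0.0, -1.0), -near),          # near
        ((0.0, 0.0, 1.0), cull_far),        # finite far, culling only
    ]


def sphere_visible(planes, center, radius):
    """Conservative test: False only if the sphere is fully outside a plane."""
    return all(
        n[0] * center[0] + n[1] * center[1] + n[2] * center[2] + d >= -radius
        for n, d in planes
    )


planes = culling_planes(90.0, 1.0, 1.0, 100.0)
```

An object at `(0, 0, -50)` passes the test, while one at `(0, 0, -200)` is rejected by the finite culling distance even though an infinite projection would still be able to draw it.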

Also, if you want to keep a finite projection matrix (for whatever reason) NV_depth_clamp is available on all GeForce products.

GeForceFX 5900 has EXT_depth_bounds_test, which is a much different extension. It’s all about avoiding unnecessary stencil incrs/decrs.

Thanks -
Cass

[This message has been edited by cass (edited 06-13-2003).]