Early fragment culling using the stencil buffer


I have a fragment shader that writes to the depth buffer (gl_FragDepth) and does some pretty expensive work. This shader only needs to run on a few screen pixels (maybe a thousand or more). They’re scattered everywhere on screen.

Is there a way I can early-cull fragments when rendering my scene, that is, before they even reach my fragment shader? Can I do this with the stencil buffer (I understand the stencil test is done before the depth test)? That way, I wouldn’t have to programmatically discard fragments that I do not want to process.

BTW, I’m exclusively working on nVidia hardware (GeForce 4xx) for the moment.


I can’t manage to get early stencil culling to work.

From what I can see, nVidia has a patent on stencil-based culling, but how can I enable it?

I have also found this page, where the following is written:

“On most hardware early stencil cull is accomplished by storing a low resolution, low bit depth cache of the stencil buffer. Blocks of between 16 and 64 pixels are reduced to a single bit that represents a conservative average of the stencil function across all pixels in the block. This cache is then consulted prior to running the fragment shader to quickly discard pixels which are guaranteed to fail the stencil test”
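To make sure I understand the quoted description, here is a minimal CPU-side sketch of such a coarse cache. Everything in it is my own assumption for illustration: the names CoarseStencilCache, buildCache and canCullEarly are hypothetical, the stencil test is fixed to "stencil == ref", and the block size is a parameter. It is not how any particular GPU actually implements this.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a low-resolution stencil cache (not a vendor design).
// Each block of block x block pixels is reduced to a single bit.
struct CoarseStencilCache {
    int width = 0;              // full-resolution stencil width in pixels
    int height = 0;             // full-resolution stencil height in pixels
    int block = 0;              // block edge length (8 gives 64 pixels per block)
    int blocksX = 0;            // number of blocks per row
    int blocksY = 0;            // number of block rows
    std::vector<bool> mayPass;  // true if at least one pixel in the block passes
};

// Build the cache for an assumed stencil test of the form "stencil == ref".
CoarseStencilCache buildCache(const std::vector<std::uint8_t>& stencil,
                              int width, int height, int block,
                              std::uint8_t ref) {
    CoarseStencilCache c;
    c.width = width;
    c.height = height;
    c.block = block;
    c.blocksX = (width + block - 1) / block;   // round up for partial blocks
    c.blocksY = (height + block - 1) / block;
    c.mayPass.assign(static_cast<std::size_t>(c.blocksX) * c.blocksY, false);

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (stencil[static_cast<std::size_t>(y) * width + x] == ref) {
                // One passing pixel is enough to keep the whole block alive.
                c.mayPass[static_cast<std::size_t>(y / block) * c.blocksX + x / block] = true;
            }
        }
    }
    return c;
}

// True when the fragment at (x, y) is guaranteed to fail the stencil test,
// so it could be rejected before the fragment shader runs.
bool canCullEarly(const CoarseStencilCache& c, int x, int y) {
    return !c.mayPass[static_cast<std::size_t>(y / c.block) * c.blocksX + x / c.block];
}
```

In this model the cull is conservative: a fragment is rejected early only if every pixel in its block fails, so a failing pixel that shares a block with a passing one still reaches the full per-pixel stencil test. With my thousand scattered pixels, that would mean many blocks stay alive even though only one pixel in each is of interest.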

Would appreciate any insight.

It seems early stencil cull is completely disabled as soon as discard is used in the shader.
I don’t understand why, as the stencil test should be done before the depth test and rasterization.
Anyway, enough time spent on this…

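In the logical OpenGL pipeline, the stencil and depth tests run after the fragment shader; running them early is an optimization the implementation applies only when it cannot change the result. Both discard and depth writes (gl_FragDepth) can disable early tests (usually both depth and stencil), so what you see sounds reasonable. The actual behavior may vary with the GPU and driver you are using.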