Stencil routed A-buffer (explicit_multisample)


I want to implement a stencil-routed A-buffer in OpenGL, as in this paper:

Does anyone have any code to post?

Or can anyone give me some pointers on how to set up the stencil test? The first thing I would do is set up a multisample FBO with, for example, a 4-sample color render buffer and a 4-sample depth-stencil render buffer. How do I then set up the stencil test?

// do I have to bind the framebuffer, and a particular render buffer (color / depth-stencil) here first ?


GLbitfield M = 1;
for (unsigned int si = 0; si < num_samples; ++si) {
	glSampleMaskIndexedNV(si, M);
	M = M << 1;
}

glStencilFunc(GL_EQUAL, 0x01, 0xFF);

// rendering ...
// before rendering to the FBO do I glClearStencil(), and with what ?

glDisable(GL_DEPTH_TEST);	// don't need it (but can't create a stencil-only render buffer)

// set up fragment shader (Cg .. using samplerRBUF ..)
// use glTexRenderbufferNV(GL_TEXTURE_RENDERBUFFER_NV, ...) to let the shader access the render buffer

// etc
// Render to FBO

Any help would be much appreciated.


Did you manage to get a working OpenGL implementation of stencil-routed A-buffer?

I’m also trying to implement one but had no success so far at actually disabling multisampling while rendering to multisample textures.

Calling glDisable(GL_MULTISAMPLE) just won’t work: when a fragment is processed, it is not sent to every sub-sample, so the stencil value isn’t decremented for all sub-samples of that fragment at once. The layers in the resulting framebuffer end up scrambled.
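To make that failure mode concrete, here is a small CPU-side sketch in C. It is not OpenGL code, and the names Pixel, pixel_init and route_fragment are made up for illustration. It models a single pixel with 4 samples whose stencil values start at 2, 3, 4, 5, with a stencil test of GL_EQUAL against 2 and GL_DECR on every outcome, which is an assumption about the routing setup.

```c
#include <assert.h>

#define NUM_SAMPLES 4
#define EMPTY (-1)

/* Hypothetical model of one pixel: per-sample stencil value and stored fragment id. */
typedef struct {
    int stencil[NUM_SAMPLES];
    int layer[NUM_SAMPLES];
} Pixel;

/* Sample s starts with stencil 2 + s and no stored fragment. */
void pixel_init(Pixel *p) {
    for (int s = 0; s < NUM_SAMPLES; ++s) {
        p->stencil[s] = 2 + s;
        p->layer[s] = EMPTY;
    }
}

/* Apply one fragment to the samples selected by coverage (bit s = sample s).
   Stencil test: pass if stencil == 2. Stencil op: decrement on pass and on fail. */
void route_fragment(Pixel *p, int frag_id, unsigned coverage) {
    assert(frag_id >= 0);
    for (int s = 0; s < NUM_SAMPLES; ++s) {
        if (!(coverage & (1u << s)))
            continue;                    /* sample not covered: nothing happens */
        if (p->stencil[s] == 2)
            p->layer[s] = frag_id;       /* test passed: color is written */
        if (p->stencil[s] > 0)
            p->stencil[s]--;             /* GL_DECR clamps at 0 */
    }
}
```

If three fragments each reach all four samples (coverage 0xF), they land in samples 0, 1 and 2 in order. If the first fragment only reaches samples 0 and 1 (coverage 0x3), the counters of samples 2 and 3 fall one step behind and the third fragment is never stored anywhere.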

Is there a way to send values to every sub-sample of a processed fragment? Maybe some special fragment shader output? (Provided fragment shaders are run only once per fragment.)

By the way, I guess you figured that out since posting your message but the correct way of filling your stencil buffer with values 2, 3, 4… would be:

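  glEnable( GL_MULTISAMPLE );
  glEnable( GL_SAMPLE_MASK );
  glSampleMaski( 0, 0xFFFFFFFF );
  glClearStencil( 2 );
  glClear( GL_STENCIL_BUFFER_BIT ); // assumed: glClearStencil only sets the clear value

  glDisable( GL_DEPTH_TEST );
  glDepthMask( GL_FALSE );
  glEnable( GL_STENCIL_TEST );
  glStencilOp( GL_REPLACE, GL_REPLACE, GL_REPLACE ); // assumed: needed so the reference value is written
  glMatrixMode( GL_PROJECTION );
  glLoadIdentity(); // assumed: identity matrices so the rectangle covers the whole viewport
  glMatrixMode( GL_MODELVIEW );
  glLoadIdentity();

  // Sample 0 keeps the cleared value 2; sample i gets 2+i.
  for( int i=1 ; i<numSamples ; ++i ) {
    glSampleMaski( 0, 1<<i );
    glStencilFunc( GL_ALWAYS, 2+i, 0xFFFFFFFF );
    glRectf( -2.0f, -2.0f, 2.0f, 2.0f );
  }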

glSampleMaski / glSampleMaskIndexedNV just filters out which samples are processed.

I start with a stencil value of 2 so as to be able to check I had enough “layers” for all the geometry when done rendering.
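One way to read the "start at 2" choice, sketched in plain C rather than OpenGL: if every fragment reaches every sample, sample s starts at 2 + s and loses one per fragment, so the last sample ends at 1 when the pixel received exactly as many fragments as there are samples, and at 0 only when it received more. Starting at 1 would leave 0 in both cases. The function names are hypothetical, and the GL_EQUAL 2 / GL_DECR setup and full broadcast are assumptions.

```c
#include <assert.h>

#define NUM_SAMPLES 4

/* Stencil value left in sample s after n fragments were broadcast to all samples.
   Sample s starts at 2 + s and GL_DECR clamps at 0. */
int stencil_after(int s, int n) {
    assert(s >= 0 && s < NUM_SAMPLES && n >= 0);
    int v = 2 + s - n;
    return v > 0 ? v : 0;
}

/* Index (0-based, submission order) of the fragment stored in sample s
   after n fragments, or -1 if that sample is still empty.
   Sample s reaches stencil value 2 exactly when fragment s arrives. */
int fragment_in_sample(int s, int n) {
    assert(s >= 0 && s < NUM_SAMPLES && n >= 0);
    return s < n ? s : -1;
}

/* Nonzero if the pixel received more fragments than it has samples.
   Only the last sample can tell "exactly full" (1) from "overflowed" (0). */
int pixel_overflowed(int n) {
    return stencil_after(NUM_SAMPLES - 1, n) == 0;
}
```

With 4 samples, pixel_overflowed(4) is 0 and pixel_overflowed(5) is 1, so reading back the last sample's stencil after rendering is enough to spot pixels that needed more layers.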

Does anybody know how to “align” sub-samples?

Help would be greatly appreciated.

Perhaps it is a bug in the driver you are using, because glDisable(GL_MULTISAMPLE) does cause the sample to be rasterized at the pixel center and then broadcast to each sub-sample, with the stencil test happening per-sample. I’ve implemented a technique similar to Stencil Routed A-Buffer and use this successfully.

Also, make sure you have correctly reset your stencil test and stencil mask after initializing your stencil buffer, and prior to rendering transparent geometry.

Hello and thanks for your reply.

Might be a bug, or it could have to do with my render target. I’m rendering to multisample textures (ARB_texture_multisample), RGB16F for color attachments, DEPTH24_STENCIL8 for depth & stencil.
By the way, I’m trying that on an nVidia GTX280 with latest drivers. I also have access to a 8800GTX (at home) but no ATi cards.

I tried disabling stencil test (and enabling depth test), expecting to get the same exact values for every sample of a given fragment. However, when visualizing the image for only one sample index at a time (through a texelFetch with a sampler2DMS), I could clearly observe a sub-pixel offset between images depending on the sample index used for visualization.

Still, I’m not quite sure how to interpret the specs. In Chapter 3 it says:

If MULTISAMPLE_ARB is disabled, multisample rasterization of all
primitives is equivalent to single-sample rasterization, except
that the fragment coverage value is set to full coverage.  The
depth values may all be set to the single value that would have
been assigned by single-sample rasterization, or they may be
assigned as described below for multisample rasterization.

In my understanding, it only says fragment COVERAGE is set to full coverage, that is, the coverage mask is all 1’s for processed fragments. But isn’t that coverage just a mask ANDed with some input generated by the rasterizer in some mysterious (at least non-specified) way, possibly depending on which samples are in, or depending on the type of render target?

I think I got that part OK. Here’s my setup:

  glDisable( GL_MULTISAMPLE );
  glDisable( GL_SAMPLE_MASK );
  glSampleMaski( 0, 0xFFFFFFFF );
  glColorMask( GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE );
  glDepthMask( GL_TRUE );
  glDisable( GL_DEPTH_TEST ); // Could be enabled too, really makes no difference as every sample is written at most once.
  glDepthFunc( GL_LESS );
  glStencilOp( GL_DECR, GL_DECR, GL_DECR );
  glStencilFunc( GL_EQUAL, 2, 0xFFFFFFFF );
  glEnable( GL_STENCIL_TEST );

Any suggestion for a solution or a workaround would be welcome.
Falling back to depth-peeling would require quite some code rewrite…

As the issue I’m experiencing is not specific to stencil-routed A-buffer, I started this other thread to discuss that particular multisampling issue.

I attached a small app to diagnose the problem. If anybody has some time to look at it and try it, I would very much appreciate it.

hey shodocko,
You used glRectf to fill the initial stencil values. Why doesn’t glClear work for this purpose? My experiments show that glClear doesn’t pay attention to the SAMPLE_MASK, and I can’t figure out the proper way to force it.
I’m working in a forward-compatible profile, so it would be easier (and faster, maybe) to not use glRectf.

For what it’s worth, I had to use a screen aligned quad as well, because glClear doesn’t respect the sample mask (or, if it’s supposed to, it doesn’t in practice).

The (3.2) spec says:

When Clear is called, the only per-fragment operations that are applied (if enabled) are the pixel ownership test, the scissor test, and dithering. The masking operations described in section 4.2.2 are also applied.

So, glClear must respect the gl[Color, Depth, Stencil]Mask state which is called out in 4.2.2.

But it does not respect the glSampleMaski state, which is called out in 4.1.3.
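As a rough illustration of what that means for initialising the stencil buffer, here is a CPU-side model in C (not OpenGL, all names hypothetical). The clear writes every sample regardless of any mask, so giving each sample its own value takes one masked full-screen draw per extra sample, assuming GL_ALWAYS with GL_REPLACE for those draws.

```c
#include <assert.h>

#define NUM_SAMPLES 4

/* Models glClear on the stencil buffer of one pixel:
   every sample gets the clear value, the sample mask is ignored. */
void clear_stencil(int stencil[NUM_SAMPLES], int value) {
    for (int s = 0; s < NUM_SAMPLES; ++s)
        stencil[s] = value;
}

/* Models a full-screen quad drawn with GL_ALWAYS / GL_REPLACE:
   only the samples selected by sample_mask receive the reference value. */
void masked_quad(int stencil[NUM_SAMPLES], unsigned sample_mask, int ref) {
    assert(ref >= 0 && ref <= 255);      /* 8-bit stencil assumed */
    for (int s = 0; s < NUM_SAMPLES; ++s)
        if (sample_mask & (1u << s))
            stencil[s] = ref;
}

/* Fills the samples with 2, 3, 4, ... and returns the number of
   full-screen passes needed after the clear. */
int init_stencil(int stencil[NUM_SAMPLES]) {
    int passes = 0;
    clear_stencil(stencil, 2);           /* sample 0 keeps this value */
    for (int i = 1; i < NUM_SAMPLES; ++i) {
        masked_quad(stencil, 1u << i, 2 + i);
        ++passes;
    }
    return passes;
}
```

For 4 samples this leaves 2, 3, 4, 5 in the pixel and uses 3 masked passes on top of the clear.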

Thanks for clearing that up, guys!

It seems a bit too heavy to perform 3 full-screen passes in order to just init the 4x multi-sampled stencil buffer (ok, you can try to use 2 passes, but still…).


I’ve recently found this:

Full source code that uses the latest OpenGL 4 features to implement an A-buffer efficiently (I get 300-400 fps for a 1M-triangle model).
Very nice.