SAVSM Shadows Numeric Instability

OniDaito · April 12, 2011, 3:30pm

Hi guys. Im having issue with loss of precision in my SAVSM setup.

when you see the light moving around the effect is very striking; there is a lot of noise with fragments going black and white all the time. This can be somewhat lessened by using the minvariance (thus ignoring anything below a certain threshold) but then we get even worse effects with incorrect falloff (see my other post).

Im using GLSL 1.2 because I’m on a mac so I dont have access to the modf function in order to split the precision across two channels as described in GPU Gems 3 Chapter 8.

Im using GL_RGBA32F_ARB textures with a Framebuffer object and ping ponging two textures to generate a summed area table which i use with the VSM algorithm.

Shadow Shader.

The Summed tables do seem to be working. I know this because I have a function that converts back from the summed table to the original depth map and the two images do look pretty much the same. Im also using the -0.5 + 0.5 trick in order to get some more precision but it doesnt seem to be helping

uniform sampler2D ShadowMap;
uniform float minVariance;

uniform float ambientLevel;
uniform float lightAttenuation;
uniform int shadowMapSize;
uniform float filterSize;
varying vec4 ShadowCoord;
uniform vec4 lightPosition;
varying vec4 lightDir, eyeVec;
varying vec3 vertexNormal;
varying vec3 vertexNormalWorld;

vec3 ShadowCoordPostW;

float step = 1.0 / float(shadowMapSize);

float g_DistributeFactor = 1024.0;  

// This is somewhat specific to the sub objects within our QC File so might need to be changed
// TODO  - Minus light dir do we think?

float lightLevel() {
	float d = ShadowCoordPostW.z - lightPosition.z;
	float attenuation = 1.0 / (d * d * lightAttenuation);
	return attenuation * max(dot(vertexNormalWorld, -lightDir.xyz), 0.0);
}

float linstep(float min, float max, float v)  
{  
  return clamp((v - min) / (max - min), 0.0, 1.0);  
}  

float ReduceLightBleeding(float p_max, float Amount)  
{  
  // Remove the [0, Amount] tail and linearly rescale (Amount, 1].  
   return linstep(Amount, 1.0, p_max);  
}  

// Box Sample Blur - WITH SUMMED TABLES!

vec4 btex2D(vec2 uv) {
	float ss = step * filterSize / 2.0;
	float xmax = uv.x - ss;
	float xmin = uv.x + ss;
	
	float ymax = uv.y - ss;
	float ymin = uv.y + ss;

	vec4 total = texture2D(ShadowMap, vec2(xmax,ymax)) -  texture2D(ShadowMap, vec2(xmax,ymin)) 
	- texture2D(ShadowMap, vec2(xmin,ymax)) + texture2D(ShadowMap, vec2(xmin,ymin));

	return total / (filterSize * filterSize);
}


// Upper Bound VSM Shadow code

float chebyshevUpperBound()
{
	// We retrive the two moments previously stored (depth and depth*depth)
	// these are split over r,g and b,a
	
	vec4 moments = btex2D(ShadowCoordPostW.xy);
//	float FactorInv = 1.0 / g_DistributeFactor;  

//	vec4 moments = vec4(splits.y * FactorInv + splits.x, splits.w * FactorInv + splits.z, 0.0,0.0);
	
	//vec4 moments = vec4(splits.x * FactorInv + splits.y, splits.z * FactorInv + splits.w,0.0,0.0);
	
	moments.x += 0.5;	// Since using SUMMED Tables we need to adjust
	moments.y += 0.5;
	
	//vec2 moments = texture2D(ShadowMap,ShadowCoordPostW.xy).rg;
	
	// Surface is fully lit. as the current fragment is before the light occluder
	// Hardly ever occurs because the distances will always be greater or very close to equal
	if (ShadowCoordPostW.z <= moments.x)
		return 1.0 ;

	// The fragment is either in shadow or penumbra. We now use chebyshev's upperBound to check
	// How likely this pixel is to be lit (p_max)
	float variance = moments.y - (moments.x * moments.x);
	variance = max(variance, minVariance);

	float d = ShadowCoordPostW.z - moments.x ;
	float p_max = variance / (variance + d * d);
	
	return p_max;
	
	//return max (1.0 - p_max, 0.0);
}


void main()
{	
	ShadowCoordPostW = 0.5 * (ShadowCoord.xyz / ShadowCoord.w + 1.0);

	float shadow = ReduceLightBleeding(chebyshevUpperBound(),0.15);
	float litFactor = (1.0 - ambientLevel) * (shadow) * lightLevel();
	//gl_FragColor = gl_Color * smoothstep(ambientLevel,1.0,max(lightLevel(),shadow));
	gl_FragColor = gl_Color * (litFactor + ambientLevel);
}

Depth Shader



varying vec4 v_position;
varying float tDepth;


float g_DistributeFactor = 1024.0;  
 
 
void main()
{
	// Is this linear depth? I would say yes but one can't be utterly sure.
	// Could try a divide by the far plane?
	
	float depth = v_position.z / v_position.w ;
	depth = depth * 0.5 + 0.5;			//Don't forget to move away from unit cube ([-1,1]) to [0,1] coordinate system

	vec2 moments = vec2(depth, depth * depth);


	// Adjusting moments (this is sort of bias per pixel) using derivative
	float dx = dFdx(depth);
	float dy = dFdy(depth);
	moments.y += 0.25 * (dx*dx+dy*dy);
	
	// Subtract 0.5 off now so we can get this into our summed area table calc
	
	moments -= 0.5;
	
	// Split the moments into rg and ba for EVEN MORE PRECISION
	
//	float FactorInv = 1.0 / g_DistributeFactor;

//	gl_FragColor = vec4(floor(moments.x) * FactorInv, fract(moments.x ) * g_DistributeFactor, 
//					floor(moments.y)  * FactorInv, fract(moments.y)  * g_DistributeFactor);


	gl_FragColor = vec4(moments,0.0,0.0);
}

BionicBytes · April 13, 2011, 3:56am

I can see you have been following the GEMS3 article as your posted code does indeed look correct.
However, GEMS3 does skip over the most inportant part - actually how to implement the Summed Tables (SAT) and suggest you lookup the work of a referenced article (pretty poor really).
How did you perform that part?
Perhaps create a shader which takes your SAT texture and subtracts the real depthmap and outputs the difference to the screen. Any visible pixels show that there is infact a difference.

BionicBytes · April 13, 2011, 4:26am

Also forgot to mention…
There are discussions on the net which mention that precision is a MAJOR factor for SAVSM and there a few things which help.
see this example for instance
The page quotes “One big problem with SAVSM is numeric stability, since both summed-area tables and variance shadow maps eat precision for breakfast. However the error is unbiased, meaning that simply increasing the minimum filter width (“softness”) will get rid of it.”

So what values have you played with for min filter width?

Additionally, what size if the VSM shadowmap?

Dark_Photon · April 13, 2011, 5:07am

Since Andrew Lauritzen was one of the primary guys behind Variance Shadow Maps (VSM), you might find some interesting pointers looking through his Sample Distribution Shadow Maps (SDSM) code here, where he does EVSM, a combination of Exponential Shadow Maps (ESM) and Variance Shadow Maps (VSM):

http://visual-computing.intel-research.net/art/publications/sdsm/

D3D rather than GL but still useful.

And for further info, search for Andrew’s VSM/EVSM posts on the beyond3d.com forums (he hangs out there). If you have a specific question, post it there and he’ll probably follow up.

OniDaito · April 13, 2011, 6:51am

Hi guys. Thanks for the replies.

I should prefix with with the fact that this is for a client who wanted soft shadows that harden on impact. From what I can tell, PCSS should work but didnt for me so I went with VSM with a variable filter, hence the need for SAVSM or similar where changing the softness doesnt result in more processing power needed.

Ok, so, the summed tables. I looked at the original paper and came up with an algorithm that uses a Ping Pong FBO and two shaders (that are almost identical)


-(void) generateSummedTables:(QCOpenGLContext *)context withTex:(GLuint) tex {
	
	CGLContextObj cgl_ctx = [context CGLContextObj];
	// Horizontal Scan

	
	int nm = ceil(log2(mFBOSize)) + 1;
	glUseProgramObjectARB([mSummedTableShader programObject]);
	glUniform1iARB([mSummedTableShader getUniformLocation:"texWidth"],mFBOSize);
	glUniform1iARB([mSummedTableShader getUniformLocation:"texture"],0);
	
	GLuint atex = tex;
	[mSummedFBO bindNoDraw];
	
	glMatrixMode(GL_PROJECTION);
	glPushMatrix();
	glLoadIdentity();
	glOrtho(-1, 1, -1, 1, 0.0, 10.0);
	glMatrixMode(GL_MODELVIEW);
	glPushMatrix();
	glLoadIdentity();
	glColor3f(1.0,1.0,1.0f);

	unsigned int ni = 1;
	BOOL usingA = TRUE;
	for(int i=0; i < nm; i++){

		//glClear(GL_COLOR_BUFFER_BIT);

		if (usingA) {
			glDrawBuffer(GL_COLOR_ATTACHMENT0_EXT);
		} else {
			glDrawBuffer(GL_COLOR_ATTACHMENT1_EXT);
		}

		// Start off with an input texture A
		glUniform1iARB([mSummedTableShader getUniformLocation:"Ni"],ni);
		
		glBindTexture(GL_TEXTURE_2D, atex);
		
		glBegin(GL_QUADS);
		glTexCoord2f(0.0, 0.0);	glVertex3f(-1.0, -1.0, 0.0);
		glTexCoord2f(1.0, 0.0);	glVertex3f(1.0, -1.0, 0.0);
		glTexCoord2f(1.0, 1.0);	glVertex3f(1.0, 1.0, 0.0);
		glTexCoord2f(0.0, 1.0);	glVertex3f(-1.0, 1.0, 0.0);
		glEnd();
		
		ni = ni << 1; // Move up
	
		if (usingA) {
			atex = [mSummedFBO getTextureAtTarget:0];
		} else {
			atex = [mSummedFBO getTextureAtTarget:1];
		}
		
		usingA = !usingA;
	}
	
	// Vertical Scan
	
	usingA = TRUE; // Sure about that? 
	ni = 1;
	atex = [mSummedFBO getTextureAtTarget:1];
	glUseProgramObjectARB([mSummedTableVShader programObject]);
	glUniform1iARB([mSummedTableVShader getUniformLocation:"texWidth"],mFBOSize);
	glUniform1iARB([mSummedTableVShader getUniformLocation:"texture"],0);
	
	for(int i=0; i < nm; i++){
		
		//glClear(GL_COLOR_BUFFER_BIT);
		if (usingA) {
			glDrawBuffer(GL_COLOR_ATTACHMENT0_EXT);
		} else {
			glDrawBuffer(GL_COLOR_ATTACHMENT1_EXT);
		}
		
		// Start off with an input texture A
		glUniform1iARB([mSummedTableVShader getUniformLocation:"Ni"],ni);
		
		glBindTexture(GL_TEXTURE_2D, atex);
		
		glBegin(GL_QUADS);
		glTexCoord2f(0.0, 0.0);	glVertex3f(-1.0, -1.0, 0.0);
		glTexCoord2f(1.0, 0.0);	glVertex3f(1.0, -1.0, 0.0);
		glTexCoord2f(1.0, 1.0);	glVertex3f(1.0, 1.0, 0.0);
		glTexCoord2f(0.0, 1.0);	glVertex3f(-1.0, 1.0, 0.0);
		glEnd();
		
		ni = ni << 1; // Move up
		
		if (usingA) {
			atex = [mSummedFBO getTextureAtTarget:0];
		} else {
			atex = [mSummedFBO getTextureAtTarget:1];
		}
		
		usingA = !usingA;
	}
	
	glPopMatrix();
	glMatrixMode(GL_PROJECTION);
	glPopMatrix();
	
	[mSummedFBO unbindFBO];

	glUseProgramObjectARB(NULL);
}

So that runs the passes over the depth texture that has float(moment, moment squared, 0, 0) for its channels. The shader is quite simple


uniform int texWidth;
uniform int Ni;	// texels along 2 ^ i (so we pre power)

uniform sampler2D texture;

void main (void) {
	// Horizontal Pass
	vec2 s = gl_TexCoord[0].st;
	vec2 sd = s;
	sd.x = sd.x + (1.0/ float(texWidth) * float(Ni));
	vec4 c = texture2D(texture, s) + texture2D(texture, sd);
	
	gl_FragColor = c;
	
	// Now we SWAP textures

}

Its the same for the height pass as well. To make sure that I was getting this right, I decided to create a shader that would reverse the process, keeping in line with the algorithm


// Reverse a summed area table so we may get the texture back
uniform int texSize;
uniform sampler2D texture;

float step = 1.0 / float(texSize);

float g_DistributeFactor = 1024.0;

void main (void) {
	vec2 s = gl_TexCoord[0].st;		// Four corners of the rectangle
	vec2 tl = vec2(s.x-step,s.y-step);
	vec2 bl = vec2(s.x-step,s.y);
	vec2 tr = vec2(s.x,s.y-step);
	
	vec4 c0 = texture2D(texture, s);
	vec4 c1 = texture2D(texture, bl);
	vec4 c2 = texture2D(texture, tr);
	vec4 c3 = texture2D(texture, tl);
	
	vec4 f = c0 - c1 - c2 + c3;

	float FactorInv =  1.0 / g_DistributeFactor;  
	//vec4 moments = vec4(f.y * FactorInv + f.x , f.w * FactorInv  + f.z ,0.0,0.0);
	vec4 moments = f;
	
	moments.x += 0.5;	// Since using SUMMED Tables we need to adjust
	moments.y += 0.5;

	gl_FragColor = moments;

}

I have issues with VSM anyway because the falloff is incorrect, I.e, shadows get stronger the further away they are which seems wrong to me; in fact that isnt what happens with a point source directional light (least thats how it seems when i use my desk lamp). If VSM and PCF arent really appropriate, maybe this ESM is superior?

Oh yes, we are using a filter of between 2 and 10 (in this case above its 8 I think) with a 512 x 512 shadowmap