Incorrect interpolated Z-buffer values causing strange artifacts

thealik · March 25, 2018, 9:45am

I hit a very strange Z-buffer interpolation issue while working on 2D shadow mapping. I’ll try to explain it as best I can, using a simple example that still reproduces the issue.

I render just one 2D line into a shadow map texture with a height of 1px. In the vertex shader, I calculate the X coordinate of each vertex simply as the angle at which the light ray hits this vertex. The Y coordinate is always zero, since the texture is one-dimensional. The Z coordinate is the normalized distance from the vertex to the light source.

Here’s an image that explains this process:
[Attached image 1733]
This “shadow map” texture is then used to draw the actual shadows.

Everything works as expected until the light ray aligns perfectly with the line. See the image below:
[Attached image 1734]

When this happens, both vertices are mapped to the same fragment, which would be fine. But this fragment’s Z coordinate is wrong: somehow it is less than the minimum distance from the light source to either vertex.

If you look at the shadow, the error shows up as a shadow spike that extends towards the light source, past the line segment. Please see this animation as an example:
https://imgur.com/a/qMRz1
or this zoomed screenshot:
[Attached image 1735]

Vertex shader:

#define PI 3.14159265
#define RAY_LEN 1000.0

uniform vec2 u_LightPos;

float CalcAngle(vec2 v, vec2 light)
{
    vec2 r = v - light;
    return atan(r.y, r.x);
}

void main()
{
    float angle = CalcAngle(gl_Vertex.xy, u_LightPos);

    float x = angle / PI; // [-PI, PI] -> [-1, 1]
    float z = length(gl_Vertex.xy - u_LightPos) / RAY_LEN; // [0, 1]

    gl_Position = vec4(x, 0.0, z * 2.0 - 1.0, 1.0);
}

Fragment shader:

#version 330

uniform vec2 u_LightPos;

out vec4 frag_color;

void main()
{
    float z = gl_FragCoord.z;
    frag_color = vec4(z, z, z, 1.0);
}

I’ve been trying to find the cause of this issue for several days now. I’ve tested it on multiple devices and platforms (Linux, Android), using multiple frameworks (SFML, Cocos2D), GL versions etc. I’m fairly certain that this is not a hardware or driver bug, since it always behaves consistently.

I guess that OpenGL somehow fails to interpolate the Z value correctly when both line vertices are mapped very close together. But this seems mathematically impossible: there’s no way to get a value X that is less than min(a, b) while interpolating between a and b.

I’d really appreciate any suggestions or ideas of how to find out why this happens.

GClements · March 25, 2018, 12:35pm

My understanding of the rules for rasterising lines is that attributes (including depth) may be extrapolated if the centre of the fragment containing the starting point is on the opposite side of the starting point to the endpoint.

In the case where both endpoints lie on the same pixel, the derivative of depth with respect to x will be high, so even extrapolating by half a pixel may result in a depth value which is significantly beyond the values at the endpoints.
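A worked example of this, using a simplified model of the rasteriser and made-up numbers (the exact rules are implementation-dependent, so the values are purely illustrative). Suppose the endpoints land at window coordinates $x_a = 100.6$ and $x_b = 100.7$ with depths $z_a = 0.30$ and $z_b = 0.50$, and the depth is evaluated at the pixel centre $x_c = 100.5$:

```latex
t = \frac{x_c - x_a}{x_b - x_a} = \frac{100.5 - 100.6}{100.7 - 100.6} = -1,
\qquad
z(x_c) = (1 - t)\,z_a + t\,z_b = 2 \cdot 0.30 - 0.50 = 0.10 < \min(z_a, z_b)
```

The formula is the usual linear interpolation, but because $t$ falls outside $[0, 1]$ it extrapolates, which is how the result can end up below both endpoint values.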

If this is what’s happening, one possible solution would be to add a vertex attribute which is 0 at one end and 1 at the other end. If a fragment has an interpolated value outside of the range [0,1], discard the fragment (but note that may produce single-pixel gaps at the point where lines should join). Or explicitly assign gl_FragDepth, clamping it to the endpoint values.

Also: have you considered how this approach will handle the discontinuity at the negative X axis (where the angle “jumps” from pi to -pi)?

thealik · March 25, 2018, 1:21pm

I think you are right, and this is exactly what’s happening. I suspected that OpenGL was doing some kind of extrapolation, but didn’t know why or how. Now it makes sense (well, kind of, since the behavior still looks mathematically incorrect).

I tested this hypothesis as you’ve suggested, by assigning dummy 0 and 1 values to vertices of the line, and discarding fragments for which the value is outside [0, 1] range. It indeed fixes the problem.
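The test described above can be sketched roughly as follows. This is a minimal sketch, not the exact code used in the thread: it assumes a `#version 330 core` context, and the names `a_Position` (standing in for `gl_Vertex.xy`), `a_T` and `v_T` are hypothetical. On Android (OpenGL ES) the version line and precision qualifiers would need adjusting.

Vertex shader:

```glsl
#version 330 core

#define PI 3.14159265
#define RAY_LEN 1000.0

uniform vec2 u_LightPos;

in vec2 a_Position; // hypothetical: replaces gl_Vertex.xy
in float a_T;       // hypothetical: 0.0 at one endpoint, 1.0 at the other

out float v_T;      // interpolated (or extrapolated) along the line

void main()
{
    vec2 r = a_Position - u_LightPos;

    float x = atan(r.y, r.x) / PI;  // [-PI, PI] -> [-1, 1]
    float z = length(r) / RAY_LEN;  // [0, 1], assuming distance <= RAY_LEN

    v_T = a_T;
    gl_Position = vec4(x, 0.0, z * 2.0 - 1.0, 1.0);
}
```

Fragment shader:

```glsl
#version 330 core

in float v_T;

out vec4 frag_color;

void main()
{
    // A value outside [0, 1] means the attributes were extrapolated
    // beyond the endpoints, so the depth cannot be trusted.
    if (v_T < 0.0 || v_T > 1.0)
        discard;

    float z = gl_FragCoord.z;
    frag_color = vec4(z, z, z, 1.0);
}
```

As noted in the suggestion, discarding may leave single-pixel gaps where two lines should join.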

The discontinuity between -pi and pi is indeed an issue. In the actual implementation, I addressed it by sending shadow casters as a list of edges (i.e. every vertex also carries the coordinates of its “pair” vertex) instead of a list of points. This allows me to map each edge continuously to the [-pi, 2pi] range, and I do another pass to fold [-pi, 2pi] back to [-pi, pi] (hence my recent question about updating FBO depth values, where you also responded). I just didn’t want to clutter my examples with this code, since it is not related to the interpolation issue.

What’s your opinion on the performance of a conditional discard in the fragment shader vs. clamping the distances? Actually, my workaround was clamping the distances, but I didn’t like it since it requires calculating length() in the vertex shader (well, and also because I didn’t understand what was happening, so it was a “blind” workaround). Your approach doesn’t need length(), but it introduces branching in the fragment shader, so my guess is that I should probably go with clamping.

And thank you a lot for the response, it saved me a lot of time.

thealik · March 25, 2018, 3:18pm

Actually, never mind. I think I can do it without length(), and using just clamp(…) in a fragment shader should definitely be faster than if (…) discard;
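For reference, here is one way the clamping variant could look, following the earlier suggestion to assign gl_FragDepth clamped to the endpoint values. This is only a sketch under assumptions: it is not the length()-free version mentioned above (which is not shown in the thread), it still calls length() for both endpoints, it assumes the default glDepthRange of [0, 1], and `a_Position`, `a_PairPosition` and `v_DepthRange` are hypothetical names based on the edge-list layout described earlier.

Vertex shader:

```glsl
#version 330 core

#define PI 3.14159265
#define RAY_LEN 1000.0

uniform vec2 u_LightPos;

in vec2 a_Position;     // hypothetical: this endpoint of the edge
in vec2 a_PairPosition; // hypothetical: the other endpoint of the same edge

// Both vertices of an edge compute the same (min, max) pair,
// so it does not matter which one is the provoking vertex.
flat out vec2 v_DepthRange;

void main()
{
    vec2 r = a_Position - u_LightPos;

    float x = atan(r.y, r.x) / PI;  // [-PI, PI] -> [-1, 1]
    float z = length(r) / RAY_LEN;  // [0, 1], assuming distance <= RAY_LEN
    float zPair = length(a_PairPosition - u_LightPos) / RAY_LEN;

    v_DepthRange = vec2(min(z, zPair), max(z, zPair));
    gl_Position = vec4(x, 0.0, z * 2.0 - 1.0, 1.0);
}
```

Fragment shader:

```glsl
#version 330 core

flat in vec2 v_DepthRange;

out vec4 frag_color;

void main()
{
    // With the default depth range, window-space depth equals the
    // normalized distance, so it can be clamped to the endpoint values.
    float z = clamp(gl_FragCoord.z, v_DepthRange.x, v_DepthRange.y);

    gl_FragDepth = z;
    frag_color = vec4(z, z, z, 1.0);
}
```

One trade-off to keep in mind: writing gl_FragDepth may disable early depth testing on many GPUs, so whether this beats a conditional discard is worth measuring rather than assuming.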

thealik · March 25, 2018, 4:10pm

I think I found a better way to fix this interpolation/extrapolation issue.

The problem occurs when both line vertices are mapped “close enough”. This happens when the light ray is aligned with the line:
[Attached image 1736]

But instead of “clamping” the incorrectly extrapolated value in the fragment shader, I can push these vertices apart in the vertex shader, so that the interpolation will be correct. I obviously cannot push them apart on the X axis, because this would produce incorrect shadows. But I can push them apart on the Y axis, so that one vertex is below the fragment center and the other is above it:
[Attached image 1737]

I did a quick test and it looks like this solution works, and completely eliminates artifacts.
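A rough sketch of what such a vertex shader might look like, for anyone who finds this later. It is an illustration under assumptions rather than the exact code from the test: it assumes a `#version 330 core` context and a 1px-high render target (so NDC y = 0 is the fragment center), and `a_Position`, `a_Side` and `u_YOffset` are hypothetical names. The offset magnitude is a guess that may need tuning, since whether a very short line produces a fragment at all depends on the implementation's line rasterisation rules.

```glsl
#version 330 core

#define PI 3.14159265
#define RAY_LEN 1000.0

uniform vec2 u_LightPos;
uniform float u_YOffset; // hypothetical: offset in NDC units, e.g. 0.5

in vec2 a_Position; // hypothetical: replaces gl_Vertex.xy
in float a_Side;    // hypothetical: 0.0 for one endpoint, 1.0 for the other

void main()
{
    vec2 r = a_Position - u_LightPos;

    float x = atan(r.y, r.x) / PI;  // [-PI, PI] -> [-1, 1]
    float z = length(r) / RAY_LEN;  // [0, 1], assuming distance <= RAY_LEN

    // One endpoint goes below the fragment center (y < 0), the other
    // above it (y > 0), so the center always lies between the two
    // endpoints along Y and the depth is interpolated, not extrapolated.
    float y = (a_Side * 2.0 - 1.0) * u_YOffset;

    gl_Position = vec4(x, y, z * 2.0 - 1.0, 1.0);
}
```

The fragment shader stays the same as in the original post.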