Nvidia 441.41 driver bug?

textureGather would be even better for that sort of thing.

Texelfetch maybe better but the problem is before that, the actual LOD calculation fails.
I’ll try drawing the LODs as different colours and it might make the problem clearer

Well an update here …

I uncapped this method …

float mip_map_level(in vec2 texture_coordinate) // in texel units
{
    vec2  dx_vtc        = dFdx(texture_coordinate);
    vec2  dy_vtc        = dFdy(texture_coordinate);
    float delta_max_sqr = max(dot(dx_vtc, dx_vtc), dot(dy_vtc, dy_vtc));
    float mml = 0.5 * log2(delta_max_sqr);
    //return max( 0, mml );
    return mml;
}

So it doesn’t clamp at zero.

Then I put in a catch value that draws red for values less than -1000000.
And, well, the result looks like this:
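A minimal sketch of that debug check (the uniform/varying names are illustrative, not from the original shader):

```glsl
// Hypothetical debug visualisation: paint a fragment red wherever the
// unclamped LOD has collapsed to a huge negative value (e.g. -Inf from
// log2(0)), and grey-scale the LOD everywhere else.
uniform sampler2D tex;
in vec2 texCoord;
out vec4 fragColor;

void main()
{
    float mml = mip_map_level(texCoord * vec2(textureSize(tex, 0)));
    if (mml < -1000000.0)
        fragColor = vec4(1.0, 0.0, 0.0, 1.0);   // degenerate/undefined LOD
    else
        fragColor = vec4(vec3(mml / 8.0), 1.0); // shade by mip level
}
```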

The values are undefined for many of the edges of polygons. Strangely you can see the quad outlines, so only the outlines of the quads are undefined … The internal triangles have valid edges.

I definitely think this is a driver bug

Interesting. So I think you’re saying that mml is < -1e6 for these QUAD-edge pixels, correct (possibly -Inf)?

This would seem to suggest that delta_max_sqr is probably zero (or an insanely small positive number, or negative). Which suggests the texcoord derivatives may be zero. Which would indicate that possibly there’s no change in the texcoord between the edge of the quads and the “helper invocations” launched just outside the quad.

You said:

How are you computing those texture coordinates?

Is it possible their values are “clamped” at the edge of the quad primitive? …possibly only in the GL_LINES_ADJACENCY → GL_TRIANGLE_STRIP case?

I didn’t explicitly check, but I’m pretty sure they are infinity.

Yes this is what I suspect. The issue is, how does the hardware calculate these extra invisible pixels? Because technically they are outside of the polygon.

Well … the short answer is I pass the vertex attribs for each of the 4 vertices of the quad to the fragment shader and calculate them based upon area and distance etc. The algorithm is here

No, they aren’t clamped at all. The mipmap calculation is also done before the texture coordinates are wrapped/mirrored etc. It worked perfectly before the driver update.

That doesn’t matter.

Given gl_Position.xyw (Z doesn’t matter here) for the three vertices of a triangle, you get a projective mapping from barycentric coordinates to screen coordinates. This can be inverted to give a projective mapping from screen coordinates to barycentric coordinates, which can then be used to calculate the remaining attributes for any screen position. The mapping covers the plane of the triangle; the triangle itself is the region for which all three barycentric coordinates are non-negative, but the mapping doesn’t care whether you’re interpolating or extrapolating; the calculation is the same either way.
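As a sketch of that inversion (a hypothetical helper, not code from the thread): with the three clip-space `gl_Position.xyw` vectors as the columns of a 3×3 matrix, the matrix maps barycentrics to homogeneous screen coordinates, so its inverse maps a screen point back to barycentrics up to a common scale.

```glsl
// v0, v1, v2 are gl_Position.xyw for the three vertices; p is the screen
// position in the same projected space as the vertices (e.g. NDC x,y).
vec3 screenToBarycentric(vec2 p, vec3 v0, vec3 v1, vec3 v2)
{
    mat3 M = mat3(v0, v1, v2);          // barycentrics -> (sx*W, sy*W, W)
    vec3 b = inverse(M) * vec3(p, 1.0); // homogeneous barycentrics
    return b / (b.x + b.y + b.z);       // normalise so the weights sum to 1
}
// Inside the triangle all three components are non-negative; outside it
// (extrapolation, as for helper invocations) at least one is negative,
// but the attribute reconstruction is the same either way.
```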

When it comes to rasterisation, the hardware just needs to dilate (enlarge) the set of generated fragments by one pixel in each direction in order to be able to calculate partial derivatives. Calculation of fragment shader inputs proceeds without regard to whether the fragment is inside the triangle.

Makes sense …
The question is: is it even possible to fix this in my code?
It looks like this on AMD and all older Nvidia drivers:

Hard to say. If it’s a bug in the compiler, refactoring the code may avoid triggering it. Also: is there anything interesting in the output of glGetShaderInfoLog or glGetProgramInfoLog? These can include warnings even if compilation and linking is successful.


In the presumably earlier version of the code that Ian Curtis posted here:

refed here:

Version 1 of the frag shader code has a discard for fragments that are outside of the quad.

There’s no discard in the version of the code that you’re using there, is there? That would definitely cause problems with derivative computation on the edges of your quads.

Also, in at least one of the versions, some of the fragment inputs have been changed to noperspective so that interpolation occurs linearly in screen space. From the form of the above, I’m assuming you are not calling dFdx() and dFdy() on the screen-space-interpolated texcoord input directly, but rather interpolating the noperspective values for this fragment of the quad (interp_texCoordOverW, interp_oneOverW), applying the perspective correction (dividing by interp_oneOverW), and then calling dFdx() and dFdy() on that perspective-correct texcoord. Is this correct?
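The interpolation scheme being asked about would look something like this (the names are illustrative, not confirmed from the posted sources):

```glsl
// noperspective inputs are interpolated linearly in screen space; dividing
// texCoord*(1/w) by 1/w recovers the perspective-correct texcoord, which
// is then safe to feed to dFdx()/dFdy().
noperspective in vec2  interp_texCoordOverW; // per-vertex texCoord * (1/w)
noperspective in float interp_oneOverW;      // per-vertex 1/w

vec2 perspectiveCorrectTexCoord()
{
    return interp_texCoordOverW / interp_oneOverW;
}
```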

Finally, just something I noticed. In one of the versions of the geom shader, it appears that the color and oneOverW interpolators have been declared with flat interpolation. This doesn’t seem right since the gl_Position.w can vary across the verts in a quad. This would tend to make at least the denominator of interp_texCoordOverW / interp_oneOverW constant across all quad fragments, possibly helping contribute to 0 derivatives. That said, I seriously doubt the texcoord interpolator (numerator) was declared flat, or you couldn’t be getting the results you are. And even so, I think in the case of mismatched qualifiers, it uses the ones in the fragment shader, which it looks like are all noperspective.

The fragment shader executes a discard if the weights have different signs (line 280; commented with “need to revisit this”). It would definitely be worth seeing what happens without that. It also executes a discard for transparent pixels, but the one for the weights seems a more likely candidate. I haven’t analysed the weight calculations thoroughly, but I wouldn’t be surprised if these are negative for points which lie outside of the quad.

In the code linked in the OP, oneOverW is (like most of the fragment shader inputs) a flat-qualified 4-element array, i.e. the geometry shader just passes the four per-vertex values directly to the fragment shader which does its own interpolation. Only v and area are interpolated (in screen space); these are then used to calculate the interpolation weights for the other values.

Ah! Thanks for noticing that! I looked up in the thread for the GLSL earlier, and not seeing it, thought he hadn’t posted more than that tiny snippet above. I’d forgotten that he’d posted links to .H files with the GLSL embedded in C++ strings. Those sources answer several questions, and confirm those concerns I had.

It also answers this question I had:

The answer to this question I asked is “yes”, …except that the texCoord and oneOverW varyings are not being interpolated with noperspective as I suspect they should be, but rather are being qualified as flat.

Hi guys, thanks for the detailed look at this. The flat attribute interpolation is correct. Each pixel in the fragment shader gets a copy of the vertex attributes for each of the 4 vertices that make up the quad. There is nothing to interpolate because each pixel gets the same values. The formula calculates the interpolation between the 4 vertices based upon the interpolated lengths and areas.

I’ll try without the discard. It’s possible it’s causing the pixels outside the quad to not draw, but I haven’t explicitly checked the maths.

The discard allows us to draw complex non-planar quads that could otherwise only be drawn with thousands of triangles. I actually really like the solution, but I’ve bumped into all kinds of hardware and driver bugs trying to get it to work. Older AMD drivers didn’t support interpolation qualifiers on the data coming out of the geometry shader, which meant it didn’t work at all. And passing 4× the number of vertex attributes, you run out of space fast.

Ok, that makes sense. Thanks. So custom interpolation in the frag shader, even for the 1/w term.

Possibly related:

Finally had a chance to sit down at my pc and try these suggestions.
Removing the discard fixes the artifacts!

(Emphasis changed).

My interpretation of “subsequent” is that it should be possible to preserve the existing behaviour without the issue of undefined derivatives by setting a flag for differing signs then performing the discard at the end of the shader (or at least after the point where the derivatives are computed).
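A sketch of that restructuring (identifiers are illustrative): record the rejection condition first, compute anything derivative-dependent while the whole 2×2 quad is still live, and only discard at the end.

```glsl
void main()
{
    // Hypothetical stand-in for the differing-signs test at line 280.
    bool rejected = weightsHaveDifferingSigns();
    // Derivatives are still well defined here: no fragment in the 2x2
    // quad has terminated yet.
    float lod    = mip_map_level(texCoord * vec2(textureSize(tex, 0)));
    vec4  colour = textureLod(tex, texCoord, lod);
    if (rejected)
        discard;        // deferred until after dFdx()/dFdy()
    fragColor = colour;
}
```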

Makes sense. As it stands, a discard statement has two quite distinct effects. One is to abort processing, the other is to “mask” framebuffer updates for that fragment. You can obtain the first effect without the second by just using return instead, but there’s no straightforward way to obtain the second without the first. In the compatibility profile you can enable alpha testing and have the fragment shader output zero alpha. For the core profile: would clearing gl_SampleMask have the desired effect?
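For the core-profile route, zeroing the coverage would look like this (a sketch; whether it matches discard’s behaviour in every case is exactly the open question):

```glsl
// Instead of discarding, keep the invocation alive but report zero sample
// coverage, so none of the fragment's samples update the framebuffer.
void maskOutFragment()
{
    gl_SampleMask[0] = 0;   // requires GLSL 4.00+
}
```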

Awesome! Thanks for following up with the solution you chose.

Final solution I went with was

	if(!gl_HelperInvocation) {
		discard;		// need to revisit this
	}
Which works perfectly. Basically it disables the discard for the helper fragments. Thanks for all your input. Not a driver bug after all 🙂

Did you try reinstalling? Whenever I have problems related to the GPU, a simple reinstall solves it.

This topic was automatically closed 183 days after the last reply. New replies are no longer allowed.