Probable nVidia GLSL compiler bug


Does the discard keyword imply a return? If it does, I believe I have found a bug in the NVIDIA GLSL compiler.

It’s very hard to package a small repro. This is the best I could do so far. My apologies.

The bug occurs when you replace //WORKING1 or //WORKING2 with //NOTWORKING.
It causes an application freeze, or sometimes a BSOD on XP, or a driver recovery in Windows 7.
I expect the NOTWORKING, WORKING1 and WORKING2 code to behave the same.
But somehow, ‘discard’ doesn’t seem to do its job: execution seems to continue.

I would appreciate it if somebody could take a look and check that I am not going nuts…

A few remarks:

  • please ignore the crap at the beginning of main(), above the first ‘z’ loop. These are not the culprit here.
  • also ignore the ‘problem’ flag and its management. I wrote this to try to detect odd code behavior.
  • you can tell something goes wrong from the shader's performance: execution seems to continue after discard, and as a result the shader runs very slowly in my case.

flat in int segmentsPerScanLine;
flat in int primitiveNum; // always 4096

in vec3 normalDir;
in vec3 viewDir;
flat in vec4 finalColor;

void main(void)
{
	bool shouldDiscard = false;
	bool shouldExit = false;
	bool problem = false;
	int bufferSize = primitiveNum; // & 0xFFFF;

	int ttStartOffset = segmentsPerScanLine;
	float fbufferSize = float(bufferSize);
	float fbufferSizeM1 = fbufferSize - 1.0;
	float xx = gl_TexCoord[0].s * fbufferSize;
	float yy = gl_TexCoord[0].t * fbufferSize;
	float vx = clamp(xx, 0.0, fbufferSizeM1);
	float vy = clamp(yy, 0.0, fbufferSizeM1);
	float vxFloating = clamp(xx, 0.0, fbufferSize);
	int iVx = int(vx);
	int iVy = int(vy);
	int lineStartOffset = GET_TEXTURE_IVEC4(ttStartOffset + iVy).r;
	int currentOffset = lineStartOffset;
	int referencePixelIndex = 0;
	int nn=0;
	for (int z=0; z < (primitiveNum * 200); z++)
	{
		// If we get here after shouldExit was set, the loop did not stop as expected.
		if (shouldExit)
			problem = true;
		ivec4 pixelGroups = GET_TEXTURE_IVEC4(ttStartOffset + currentOffset);
		for (int n=0; n < 4; n++)
		{
			int pixelGroup = pixelGroups[n];
			int lastPixel = (pixelGroup & 0xFFF);
			if (iVx <= lastPixel)
			{
				bool canDisplayPixel;
				vec4 theColor = finalColor;
				if ((pixelGroup & 0x40000000) != 0 || (pixelGroup & 0x20000000) != 0) // interior pixel
					canDisplayPixel = true;
				else // exterior pixel
					canDisplayPixel = false;
				if (canDisplayPixel)
				{
					shouldExit = true;
				}
				else // assumed: this 'else' is missing from the post
				{
					shouldExit = true;
					shouldDiscard = true;
				}
			}
		}

		if (shouldExit)
			break; // assumed: the body of this 'if' is missing from the post
	}
	if (problem)
		gl_FragColor = vec4(0.0, 0.0, 0.0, 1.0);
	else if (shouldDiscard)
		gl_FragColor = vec4(1.0, 0.0, 0.0, 1.0);
}


NB: drivers tested: 285.58, 260.99. Windows XP 32-bit, GeForce GT 430 1024MB.

According to my copy of the Orange book:

An implementation might or might not continue executing the shader, but it is guaranteed that there is no effect on the framebuffer.

A quick read through the GLSL spec backs this up: all that it says is that the shader outputs won’t be written, but it doesn’t specify whether or not the shader continues executing.
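
If that reading is right, one defensive pattern is to not rely on discard for control flow at all: record the decision in a flag, leave the loops with an explicit break, and follow the discard with an explicit return. Below is a minimal sketch of that idea. The inputs `maxSteps` and `hitStep` and the output `fragColor` are hypothetical names made up for illustration, not taken from the shader above, and I have not verified that this avoids the driver problem described in this thread.

```glsl
#version 130

// Hypothetical inputs for illustration only; not from the original shader.
flat in int maxSteps;
flat in int hitStep;

out vec4 fragColor;

void main(void)
{
	bool shouldDiscard = false;

	for (int i = 0; i < maxSteps; i++)
	{
		if (i == hitStep)
		{
			shouldDiscard = true;
			break; // leave the loop explicitly instead of relying on discard
		}
	}

	if (shouldDiscard)
	{
		discard;
		return; // explicit exit, in case execution continues after discard
	}

	fragColor = vec4(1.0, 0.0, 0.0, 1.0);
}
```

With this structure the loop terminates the same way whether or not the implementation stops executing at discard, so the running time should not depend on that choice.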
