Software fallback

Is it possible to detect and/or control when software fallback occurs?

I have a simple per-pixel Phong(ish) GLSL shader that mysteriously (to me) sends my CPU usage from nearly 0% to sky high as the number of on-screen pixels it has to shade grows. There seems to be some critical number of shaded pixels (fragments, I guess) at which the CPU jumps from 0% to 100%, with not much in between. So the question is, what happened?

Using an Nvidia 8300GS 512 MB with the latest drivers.

Thanks for any help!

Vertex Shader


varying vec3 L[2], N, E;

void main()
{
	int i;
	
	/* normal in eye space */
	N = gl_NormalMatrix * gl_Normal;

	/* vertex in eye space */
	vec3 P = vec3(gl_ModelViewMatrix * gl_Vertex);

	E = normalize(-P);

	for (i = 0; i < 2; i++)
		{
		/* light vector in eye space */
		L[i] = normalize(gl_LightSource[i].position.xyz - P);
		}

	/* assign the uvs */		
	gl_TexCoord[0] = gl_MultiTexCoord0;
	
	gl_Position = ftransform();
}

Fragment Shader


varying vec3 L[2], N, E;

uniform sampler2D colourmap;

void main()
{
	int i;
	vec3 nN = normalize(N);
	vec3 nE = normalize(E);	/* eye vector, the same for every light */

	vec4 tcol = texture2D(colourmap, gl_TexCoord[0].st);

	vec4 amb = vec4(0.0);
	vec4 diff = vec4(0.0);
	vec4 spec = vec4(0.0);
	for (i = 0; i < 2; i++)
		{
		vec3 nL = normalize(L[i]);

		/* ambient term */
		amb += gl_LightSource[i].ambient;

		/* the diffuse term */
		diff += clamp(dot(nN, nL), 0.0, 1.0)*gl_LightSource[i].diffuse;

		/* specular term (Phong: reflected light vector against eye vector) */
		float RdotE = clamp(dot(reflect(-nL, nN), nE), 0.0, 1.0);
		spec += pow(RdotE, gl_FrontMaterial.shininess)*gl_LightSource[i].specular;
		}

	vec4 color = tcol*(gl_FrontMaterial.ambient*amb+gl_FrontMaterial.diffuse*diff)+gl_FrontMaterial.specular*spec;

	gl_FragColor = color;
}

As far as I know, the 8300GS is a dedicated chip, so there is no software fallback.

What platform are you on?

In my experience, falling off the fast path usually comes down to either texture access in the vertex shader or an unsupported texture format.

What is your colourmap format?

Scratt - a Dell dual-CPU 2.20 GHz machine with 2 GB RAM, running Windows XP SP3. The app is a Java applet using JOGL.

I use a mixture of JPG and PNG textures: JPGs for 3-channel maps and PNGs for 4-channel ones. Are there any formats that can cause software fallback?

Thing is, I never want my app to suffer software fallback, since it pretty much disables the machine. I would rather have my app switch it off altogether and deal with the consequences by dumbing down the rendered scene somehow - framerate is more important to me than image quality (to a limit!).
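Since OpenGL offers no portable switch to forbid a fallback, one way to "dumb down the scene" is to watch frame times and drop to a cheaper shader when they blow the budget. The following is only a minimal sketch of that idea: the class name, the budget and the smoothing factor are all hypothetical, and the frame duration is assumed to be measured by the application (for example with System.nanoTime() around a whole frame).

```java
/** Hypothetical helper: smooths frame times and flags when quality should drop. */
final class FrameBudgetMonitor {
    private final double budgetMillis;    // target frame time, e.g. 33.3 ms for 30 fps
    private final double smoothing;       // weight of the newest sample, in (0, 1]
    private double averageMillis = -1.0;  // negative means "no sample recorded yet"

    FrameBudgetMonitor(double budgetMillis, double smoothing) {
        this.budgetMillis = budgetMillis;
        this.smoothing = smoothing;
    }

    /** Feed the duration of the frame that just finished. */
    void record(double frameMillis) {
        if (averageMillis < 0.0) {
            averageMillis = frameMillis;  // first sample seeds the average
        } else {
            // Exponential moving average: move part of the way towards the new sample.
            averageMillis += smoothing * (frameMillis - averageMillis);
        }
    }

    /** True when the smoothed frame time exceeds the budget. */
    boolean shouldReduceQuality() {
        return averageMillis > budgetMillis;
    }
}
```

The app would call record() once per frame and, when shouldReduceQuality() returns true, bind a simpler shader or the fixed-function path. This reacts to a slowdown whatever its cause; it does not tell you that a software fallback was the reason.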

Generally speaking, 3-channel formats are not recommended, but they will just be inefficient / slightly slower. To be safe, simply pad your textures to 4 channels if they only have 3.
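Padding can be done on the CPU before the pixels are handed to GL. Here is a minimal sketch, assuming tightly packed 8-bit RGB data in a byte array; the class and method names are made up for illustration and are not part of JOGL.

```java
/** Hypothetical helper for padding 3-channel pixel data to 4 channels. */
final class TexturePadding {
    /** Expands tightly packed 8-bit RGB pixels to RGBA with fully opaque alpha. */
    static byte[] padRgbToRgba(byte[] rgb) {
        if (rgb.length % 3 != 0) {
            throw new IllegalArgumentException("length must be a multiple of 3");
        }
        int pixels = rgb.length / 3;
        byte[] rgba = new byte[pixels * 4];
        for (int p = 0; p < pixels; p++) {
            rgba[4 * p]     = rgb[3 * p];       // red
            rgba[4 * p + 1] = rgb[3 * p + 1];   // green
            rgba[4 * p + 2] = rgb[3 * p + 2];   // blue
            rgba[4 * p + 3] = (byte) 0xFF;      // alpha: opaque
        }
        return rgba;
    }
}
```

The result can then be uploaded as a 4-channel texture. Whether this actually helps depends on the driver, which may already pad 3-channel data internally.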

Also make sure your texture dimensions are powers of 2. Always much safer.
That was more what I was interested in when I asked about texture format: you were talking about this happening at certain loads, which I presumed might mean a change in colour texture size.
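A power-of-two check is cheap enough to run on every texture at load time. A small sketch (the class and method names are hypothetical):

```java
/** Hypothetical helper for validating texture dimensions. */
final class TextureSize {
    /** True if n is a positive power of two (1, 2, 4, 8, ...). */
    static boolean isPowerOfTwo(int n) {
        // A power of two has exactly one bit set, so n & (n - 1) clears it to zero.
        return n > 0 && (n & (n - 1)) == 0;
    }

    /** Smallest power of two that is greater than or equal to n. */
    static int nextPowerOfTwo(int n) {
        if (n < 1 || n > (1 << 30)) {
            throw new IllegalArgumentException("n out of range: " + n);
        }
        int p = 1;
        while (p < n) {
            p <<= 1;
        }
        return p;
    }
}
```

For example, a 300 x 256 image fails the check on its width, and nextPowerOfTwo(300) gives 512 as the size to pad or rescale to.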

It’s also relevant which format you use. GL_BGRA with the type GL_UNSIGNED_INT_8_8_8_8_REV is the best choice for the source data AFAIK (GL_BGRA is not itself a valid internal format; pair it with something like GL_RGBA8). GL_RGBA source data gets swizzled and, in my experience, may cause a 10% slowdown.
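With GL_BGRA and GL_UNSIGNED_INT_8_8_8_8_REV, GL reads each pixel as one 32-bit word with blue in the lowest byte and alpha in the highest, i.e. 0xAARRGGBB. The sketch below packs RGBA bytes into that layout; the class and method names are hypothetical, and any speed benefit is the poster's experience rather than a guarantee.

```java
/** Hypothetical helper: packs pixels into the 0xAARRGGBB word layout. */
final class BgraPacking {
    /** Packs four 8-bit channels into one int as 0xAARRGGBB. */
    static int packArgb(int r, int g, int b, int a) {
        return ((a & 0xFF) << 24) | ((r & 0xFF) << 16) | ((g & 0xFF) << 8) | (b & 0xFF);
    }

    /** Converts tightly packed RGBA bytes into one packed int per pixel. */
    static int[] rgbaBytesToArgbInts(byte[] rgba) {
        if (rgba.length % 4 != 0) {
            throw new IllegalArgumentException("length must be a multiple of 4");
        }
        int[] packed = new int[rgba.length / 4];
        for (int p = 0; p < packed.length; p++) {
            packed[p] = packArgb(rgba[4 * p], rgba[4 * p + 1], rgba[4 * p + 2], rgba[4 * p + 3]);
        }
        return packed;
    }
}
```

This is the same word layout as the ARGB ints that java.awt.image.BufferedImage.getRGB returns, so pixels obtained that way are already in this order.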

However, most of this is based on my experience with OpenGL on Macs. It’s still relevant but there are differences between PC and OS X drivers.

Can you profile your GL implementation and see where it’s falling back to SW?

The reason I asked about formats was that some time back drivers I was using were getting big slowdowns with textures depending on how / where you chose to store them, and it was in fact a driver bug. But this was on OS X.

Scratt - I was thinking (maybe naively) that 3 channels must surely be more efficient than 4! If they are not, wouldn’t it be tremendously helpful if the graphics-card/opengl/jogl padded it for you? Maybe they do, maybe they don’t. Short of trawling the code, how will I ever know! I’m using a com.sun.opengl.util.texture.Texture and a com.sun.opengl.util.texture.TextureData and I’m presuming those libraries are optimal!

All my textures are powers of 2.

I would dearly love to profile my GLSL code! But profiling a trivial shader (as mine is) in isolation, away from the rest of the app, doesn’t sound like it will yield much. After all, if GLSL baulks at my trivial shaders, what hope does GLSL programming have generally? Profiling it in situ would be ideal, but I have not yet managed to do this while running a Java applet in a browser without installing crippling instrumented graphics drivers, which render themselves useless for debugging.

All I want is a print statement from the GLSL compiler or something at runtime telling me software fallback has occurred or will occur. Do I have to accept defeat for the want of a print statement!!
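The closest thing to that print statement is the driver's info log, read with glGetShaderInfoLog after compiling and glGetProgramInfoLog after linking. OpenGL defines no standard fallback query, and the log text is entirely driver-specific, but some drivers reportedly mention software execution there. The sketch below only scans a log string that the app is assumed to have already fetched through its GL binding; the class name, method name and keyword are assumptions.

```java
/** Hypothetical helper: looks for a software-fallback hint in a driver info log. */
final class FallbackLogCheck {
    /**
     * Returns true if the log text mentions "software" (case-insensitive).
     * The wording is driver-specific, so this is a heuristic, not a guarantee.
     */
    static boolean mentionsSoftware(String infoLog) {
        if (infoLog == null) {
            return false;
        }
        return infoLog.toLowerCase(java.util.Locale.ROOT).contains("software");
    }
}
```

Printing the whole log after every link is worth doing regardless. An empty or clean log does not prove the shader runs in hardware; it only means the driver chose not to say anything.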

Please Mr. Graphics Card, tell me what you are doing!

It’s the act of padding or dealing with non-uniform chunks of data that causes the slowdown… In effect an RGBA pixel is one chunk of 16 bytes (four 32-bit floats) or one chunk of 4 bytes (four 8-bit channels), depending on the format you choose. Both are native data sizes on the client and GPU side on virtually all hardware right now.

In effect you don’t get a speed-up from RGBA, but you do get a slowdown with RGB. But like I said, it’s a small percentage anyway. Similarly, BGRA means the GPU does not have to swizzle from RGBA, which can also cause a small slowdown.

POW2 is good. :)

gDEBugger is a very good app to look at, and it came with a 30-day trial last time I checked. You can edit shaders on the fly in it, as well as profile etc. It is available for Unix and PC.

If you are on Apple hardware there is a version of gDEBugger in Beta right now. I am testing it. Also the Apple Developer Connection provides free dev tools, which include OpenGL profilers as well.

In all those apps you can trigger breakpoints on SW fallback, as well as examine texture formats internally, and pretty much any aspect of even the most trivial or complex OpenGL programs. You can even link the debugging to your main Compiler / Debugger.
