According to the OpenGL spec, gl_NormalMatrix is calculated as the transpose of the inverse of the upper-left 3x3 of the current ModelView matrix. That is:
gl_NormalMatrix = transpose(inverse(modelview))

Each normal is transformed by this matrix, which is essentially the object transform independent of the camera transform (i.e. the world transform). This is used to transform the vertex normal when evaluating the lighting equations. If you have thousands of objects in the world, each with its own modelview transform, you have a lot of inverse matrices to calculate.
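For reference, the spec formula can be sketched in plain Python for a single 3x3 matrix. The helper name `inverse_transpose3` and the sample matrix are made up for illustration; a real engine would use its own math library.

```python
def inverse_transpose3(m):
    """Return transpose(inverse(m)) for a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    # Cofactor matrix of m. inverse(m) = transpose(cofactors) / det,
    # so transpose(inverse(m)) is simply cofactors / det.
    cof = [
        [e * i - f * h, f * g - d * i, d * h - e * g],
        [c * h - b * i, a * i - c * g, b * g - a * h],
        [b * f - c * e, c * d - a * f, a * e - b * d],
    ]
    det = a * cof[0][0] + b * cof[0][1] + c * cof[0][2]
    if det == 0:
        raise ValueError("matrix is singular")
    return [[x / det for x in row] for row in cof]


# Assumed example: upper-left 3x3 of a modelview that scales X by 2.
modelview3 = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
normal_matrix = inverse_transpose3(modelview3)
```

This is the per-object cost being discussed: one 3x3 inverse per modelview matrix.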

I have stumbled onto a faster method which only performs the inverse calculation once. I apologise in advance if this is common knowledge, but I’ve never seen it before, and my google kung-fu is failing me, so I’ll post this here in case anybody else is interested.

The faster method to calculate the gl_NormalMatrix is to calculate camera^(-1) * modelview. Since modelview = camera * world_transform, camera^(-1) and camera cancel each other out, leaving our desired world transform. Trim this 4x4 matrix down to its upper-left 3x3, and we have our gl_NormalMatrix. We only need to calculate camera^(-1) once and can use it for all scene nodes, and since most engines already calculate it for other purposes, you can get the gl_NormalMatrix without a single additional matrix inverse calculation. When you have thousands of spatial scene nodes, the savings add up.
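The cancellation step on its own can be checked numerically. A minimal sketch, assuming a rigid camera (rotation plus translation only) and made-up sample matrices; `mat_mul4` and `rigid_inverse4` are hypothetical helper names. Note that what comes back is the world matrix itself, not its inverse-transpose.

```python
def mat_mul4(a, b):
    """Multiply two 4x4 matrices stored as nested lists (row-major)."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def rigid_inverse4(m):
    """Invert a 4x4 rigid transform (rotation + translation, no scale)."""
    # Inverse rotation is the transpose; inverse translation is -R^T * t.
    rt = [[m[c][r] for c in range(3)] for r in range(3)]
    t = [m[r][3] for r in range(3)]
    inv_t = [-sum(rt[r][k] * t[k] for k in range(3)) for r in range(3)]
    return [rt[r] + [inv_t[r]] for r in range(3)] + [[0, 0, 0, 1]]


# Assumed sample data: camera rotates 90 degrees about Z and translates;
# the object's world transform has a non-uniform scale and a translation.
camera = [[0, -1, 0, 3], [1, 0, 0, -2], [0, 0, 1, 5], [0, 0, 0, 1]]
world = [[2, 0, 0, 10], [0, 1, 0, 20], [0, 0, 4, 30], [0, 0, 0, 1]]

modelview = mat_mul4(camera, world)
recovered = mat_mul4(rigid_inverse4(camera), modelview)  # camera^(-1) * modelview
upper3x3 = [row[:3] for row in recovered[:3]]
```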

The faster method to calculate the gl_NormalMatrix is to calculate camera^(-1) * modelview. Since modelview = camera * world_transform, camera^(-1) and camera cancel each other out, leaving our desired world transform.

That doesn’t make any sense.

The purpose of using the inverse-transpose is to adjust for non-uniform scales. If your modelview matrix has no non-uniform scales (or shears) in it, then the inverse-transpose and the modelview matrix transform normal directions identically; at most they differ by a uniform scale factor, which renormalizing removes.

However, if your modelview does in fact have non-uniform scales in it, what you have proposed is not mathematically viable, even if you’re doing lighting calculations in world space.
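A small numeric counterexample makes the problem concrete. This is only a sketch with made-up values: a surface whose tangent is (1, 1, 0) and whose normal is (1, -1, 0), scaled by 2 along X. A correctly transformed normal must stay perpendicular to the transformed tangent.

```python
def mat_vec3(m, v):
    """Multiply a 3x3 matrix (nested lists) by a 3-vector."""
    return [sum(m[r][k] * v[k] for k in range(3)) for r in range(3)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


scale = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# For a diagonal matrix the inverse-transpose is just the reciprocals.
inv_transpose = [[0.5, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

tangent = [1.0, 1.0, 0.0]
normal = [1.0, -1.0, 0.0]          # perpendicular to the tangent
new_tangent = mat_vec3(scale, tangent)

# Using the transform's 3x3 directly: no longer perpendicular.
naive = dot(mat_vec3(scale, normal), new_tangent)
# Using the inverse-transpose: still perpendicular.
correct = dot(mat_vec3(inv_transpose, normal), new_tangent)
```

Here `naive` is non-zero while `correct` is zero, so the plain 3x3 tilts the normal away from the surface.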

Furthermore, if your scene is such that non-uniform scales are a common occurrence, you don’t have to take the inverse-transpose of anything. What you need to do is devise a matrix stack class that builds the inverse-transpose of a matrix while it is building the regular one. This doesn’t require any complex math or matrix inverses. You just build two matrices: the regular matrix for positions, and a second one that is the same as the first, except with any scaling operations inverted.
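One way such a stack entry might look, as a rough sketch only: `DualTransform` is a hypothetical name, only the 3x3 part is tracked (translations do not affect normals), and only Z rotations and axis-aligned scales are covered.

```python
import math


def mat_mul3(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


class DualTransform:
    """Hypothetical helper: builds a 3x3 transform and its inverse-transpose together."""

    def __init__(self):
        self.matrix = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
        self.normal_matrix = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

    def rotate_z(self, degrees):
        c = math.cos(math.radians(degrees))
        s = math.sin(math.radians(degrees))
        rot = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
        self.matrix = mat_mul3(self.matrix, rot)
        # A rotation is its own inverse-transpose, so apply it unchanged.
        self.normal_matrix = mat_mul3(self.normal_matrix, rot)

    def scale(self, sx, sy, sz):
        self.matrix = mat_mul3(
            self.matrix, [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, sz]])
        # The inverse-transpose of a diagonal scale is the reciprocal scale.
        self.normal_matrix = mat_mul3(
            self.normal_matrix,
            [[1.0 / sx, 0.0, 0.0], [0.0, 1.0 / sy, 0.0], [0.0, 0.0, 1.0 / sz]])


t = DualTransform()
t.rotate_z(30.0)
t.scale(2.0, 1.0, 4.0)
```

This works because (M * A)^(-T) = M^(-T) * A^(-T), so each operation can be appended to both matrices in the same order, and no general matrix inverse is ever computed.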

I agree with Alfonse on this one: this doesn’t make much sense to do. To expound on and simplify what “I think” you’re saying (please correct me if I’m wrong) and unearth the hidden assumptions…:

So this math just gives you the object’s MODELING transform (OBJECT-space -to- WORLD-space transform). Which you might already have lying around in your application anyway and (if so) could have just passed to the shader. But anyway…

Let’s just go with this. I think what you’re proposing is: what if we use the upper-left 3x3 of this inferred MODELING transform to directly transform OBJECT-space normals to WORLD space? What does that imply? It implies that the transform is orthonormal (its basis vectors are unit vectors that are orthogonal to each other). This means there can be no scales or shears in the MODELING transform, for example.

It also implies that you’re going to light in WORLD space, not EYE space as usual. What does this imply? It implies that you’re going to have to transform vertices (or fragments) that you are lighting to WORLD space, which you otherwise wouldn’t have had to do. You typically have to get them in EYE space to do fogging and such, so usually lighting is just performed in EYE space. Now you’ve got to go to both spaces.

Another thing implied by doing lighting in WORLD space (if you have some point light sources) is that your world is “tiny”. If it’s not, lighting inaccuracies are going to blow up on you. Why? Because with reasonably sized worlds, your WORLD-space positions end up large enough that you spend your 32-bit float precision bits representing the sheer magnitude of the WORLD-space coordinate numbers, which steals from the accuracy. For instance, represent 5m in float, and you end up with around 10^-6 m accuracy. But represent 50,000m in float, and you end up with only around 10^-2 m accuracy. So the bottom line is you end up dealing with and subtracting big WORLD-space numbers in the shader, lose a lot of accuracy, and have your light positions and directions flashing and bouncing around. That’s one reason why lighting in WORLD space isn’t a great idea. I’d avoid dealing with WORLD-space positions on the GPU like the plague! WORLD-space vectors? No problem. They’re tiny as usual. But positions? Ouch!
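The precision figures can be checked by measuring the gap between adjacent 32-bit floats at a given magnitude. A minimal sketch, assuming IEEE 754 single precision and positive finite inputs; `float32_spacing` is a made-up helper name.

```python
import struct


def float32_spacing(x):
    """Gap between x (rounded to float32) and the next larger float32 value.

    Only meant for positive, finite x.
    """
    # Reinterpret the float32 bit pattern as an unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    as_f32 = struct.unpack("<f", struct.pack("<I", bits))[0]
    # Adding 1 to the bit pattern yields the next representable float32.
    next_f32 = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return next_f32 - as_f32


near = float32_spacing(5.0)       # 2**-21, about 4.8e-7
far = float32_spacing(50000.0)    # 2**-8, about 3.9e-3
```

So a position 50 km from the origin can only be stored to within a few millimetres, which is in the same ballpark as the rough figures quoted above.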

Now there is a way to optimize your normal transforms if you know that all of your MODELVIEW transforms (except translates) are orthonormal (e.g. no scales or shears), but yet “still” light in EYE space as usual. Simply use the upper-left 3x3 of the MODELVIEW matrix as-is to transform your normals from OBJECT space to EYE space. No CAMERA^-1 manipulations required! And you’re still lighting in EYE space as usual.

Also, if you know that your MODELVIEW doesn’t have any non-uniform scales or shears, but it “may” have uniform scales applied, you can still avoid a full inverse-transpose. Just extract the scale factor from the MODELVIEW matrix (for example, the length of one of the columns of its upper-left 3x3), and use that to correct for the scaling when you use the upper-left 3x3 of the MODELVIEW to transform the normal.
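A rough sketch of that correction, assuming the 3x3 really is a rotation times a single uniform scale (the function names and sample matrix are made up). With a scale of 1 this reduces to the plain orthonormal case from the previous paragraph.

```python
import math


def mat_vec3(m, v):
    """Multiply a 3x3 matrix (nested lists) by a 3-vector."""
    return [sum(m[r][k] * v[k] for k in range(3)) for r in range(3)]


def uniform_scale_factor(m):
    """Length of the first column; valid only if m = s * rotation."""
    return math.sqrt(m[0][0] ** 2 + m[1][0] ** 2 + m[2][0] ** 2)


def transform_normal(m, n):
    """Transform a unit normal by m and divide out the uniform scale."""
    s = uniform_scale_factor(m)
    return [x / s for x in mat_vec3(m, n)]


# Assumed sample: 90 degree rotation about Z combined with a uniform scale of 3.
modelview3 = [[0.0, -3.0, 0.0], [3.0, 0.0, 0.0], [0.0, 0.0, 3.0]]
eye_normal = transform_normal(modelview3, [1.0, 0.0, 0.0])
```

The scale factor only needs to be computed once per matrix, so it could be passed to the shader as a uniform instead of a full normal matrix. Simply normalizing the transformed normal would achieve the same result at a per-vertex cost.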