Small Normalization Cubemap improvement


I just had an idea for how one could improve the precision of normalization cubemaps a bit. It's no big deal, but I'd like to know your opinion.

The thing is that in a typical cubemap you store the sign of each component by remapping the vector from [-1,1] into [0,1] range. This means that in a texture with 8 bits per channel, one bit is effectively "wasted" on the sign.

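I thought that this one bit could be saved and used for more precision, because we already have the sign: it is in the texture coordinate. Each component of the normalized vector has the same sign as the corresponding component of the lookup vector.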

So I store my normals as absolute values, which means that every channel can now use the full 0…255 range for the magnitude instead of -128…127.

I don't know if this makes a visual difference. I couldn't see much, but I'll have to test it a bit more.
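To put a rough number on the precision gain, here is a small Python sketch (not shader code) that compares the worst-case round-trip error of the two 8-bit encodings for a single component. The function names are made up for illustration, and it ignores texture filtering and cubemap resolution, so treat it as an estimate only.

```python
def encode_biased(x):
    """Classic encoding: remap [-1, 1] to an 8-bit value in 0..255."""
    return round((x * 0.5 + 0.5) * 255)

def decode_biased(b):
    """Inverse of encode_biased (the usual expand step: 2 * t - 1)."""
    return (b / 255) * 2.0 - 1.0

def encode_abs(x):
    """Proposed encoding: store only |x| in 0..255, the sign is dropped."""
    return round(abs(x) * 255)

def decode_abs(b, sign):
    """Inverse of encode_abs; sign (+1 or -1) comes from the lookup vector."""
    return sign * (b / 255)

def max_error(round_trip, samples=10001):
    """Largest round-trip error over evenly spaced values in [-1, 1]."""
    worst = 0.0
    for i in range(samples):
        x = -1.0 + 2.0 * i / (samples - 1)
        worst = max(worst, abs(round_trip(x) - x))
    return worst

err_biased = max_error(lambda x: decode_biased(encode_biased(x)))
err_abs = max_error(lambda x: decode_abs(encode_abs(x), 1.0 if x >= 0 else -1.0))
```

Under these assumptions the biased encoding has a step of 2/255 and the absolute-value encoding a step of 1/255, so the worst-case error is roughly halved, which is exactly the one extra bit.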

Since the ARB_fragment_program extension does not have a SIGN function, this turned out to be a bit more difficult than I thought. Well, here is my code:

#to be normalised: vVector

#normalise it
TEMP vVectorNorm;
TEX vVectorNorm, vVector, texture[2], CUBE;

TEMP sgn, zero;
MOV zero, {0.0, 0.0, 0.0, 0.0};

#set to 1.0 on greater or equal to 0.0 else set to 0.0
SGE sgn, vVector, zero;

#put the signs into [-1, 1] range
MAD sgn, sgn, 2.0, -1.0;

#multiply each component with its sign
MUL vVectorNorm, vVectorNorm, sgn;

Well, most of the code simply emulates the SIGN function. Since glSlang does know this function, the code should be much shorter there.
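For anyone who wants to check the sign trick without a GPU, here is a minimal Python sketch that mimics the SGE / MAD / MUL sequence per component. The helper names are hypothetical and only mirror the instructions used above; the cubemap fetch is replaced by a hand-written absolute-value normal.

```python
def sge(a, b):
    """Per-component SGE: 1.0 where a >= b, else 0.0."""
    return [1.0 if x >= y else 0.0 for x, y in zip(a, b)]

def mad(a, scale, bias):
    """Per-component multiply-add with scalar scale and bias."""
    return [x * scale + bias for x in a]

def mul(a, b):
    """Per-component multiply."""
    return [x * y for x, y in zip(a, b)]

def restore_signs(abs_normal, lookup):
    """Reapply the signs of the lookup vector to the fetched |normal|."""
    sgn = sge(lookup, [0.0, 0.0, 0.0])   # 1.0 or 0.0 per component
    sgn = mad(sgn, 2.0, -1.0)            # remap to +1.0 or -1.0
    return mul(abs_normal, sgn)

# Stand-in for the cubemap fetch: |normalize((-3, 0, 4))| = (0.6, 0.0, 0.8)
result = restore_signs([0.6, 0.0, 0.8], [-3.0, 0.0, 4.0])
```

Note that, unlike a real sign function, this maps a zero component to +1 rather than 0. That should be harmless here, because the stored magnitude for that component is zero as well.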


This sounds like a nice idea.
But compared to the other techniques:

Arithmetic Normalization: no texture is needed, 1 DOT3, 1 RSQ, 1 MUL
Classical Cube Map Normalization: cube map texture is needed, 1 TEX
Your Cube Map Normalization: cube map texture is needed, 1 TEX, 1 SGE, 1 MAD, 1 MUL

Comparing these normalization techniques, I don't think that yours is really efficient. The main advantage of arithmetic normalization is that it is the most precise technique, but it is usually also the slowest (because of RSQ). The advantage of classical cube map normalization is that you can use it on non-pixel-shader hardware, and it is usually faster than arithmetic normalization. The only advantage of your technique over classical normalization is 1 bit of precision. But you also need 3 more instructions, and you cannot use this technique on non-pixel-shader hardware.
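For reference, the arithmetic variant from the list above boils down to three steps. This is a plain Python sketch of the math, not a fragment program; the function name is made up, and a zero-length input is not handled.

```python
import math

def normalize_arithmetic(v):
    """Normalize a 3-component vector the way DOT3 + RSQ + MUL would."""
    d = v[0] * v[0] + v[1] * v[1] + v[2] * v[2]  # DOT3: squared length
    r = 1.0 / math.sqrt(d)                       # RSQ: reciprocal square root
    return [c * r for c in v]                    # MUL: scale each component

n = normalize_arithmetic([3.0, 0.0, -4.0])       # roughly (0.6, 0.0, -0.8)
```

The result is limited only by floating-point precision, not by the bit depth of a texture, which is why this path is the most precise of the three.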

Yes, that's true. But when I got the idea, I believed that a sign function would be supported natively. If it were (and maybe on some more recent hardware it is, in glSlang), then it would be one SIGN and one MUL, which might be nearly as fast as standard cubemapping, which still needs a MAD to expand [0,1] back to [-1,1] (which you forgot).

Of course you need pixel-shader hardware; that's another "limitation". But even on pixel-shader hardware it is usually faster not to normalize everything arithmetically, but to mix those two methods (see NVIDIA's homepage for a paper about that).

I am not saying that I found the holy grail or anything. I just got the idea and wanted to share it. I don't know if this one bit can make a big difference, but it's worth a try if you are not satisfied with the precision.


The extra bit won't be of much help. It only gives us about 512 possible values per component instead of 256. You might as well use 16-bit textures.

The RSQ instruction should take one instruction slot. It is an approximation good enough for graphics.
