Fragment program precision

It would be interesting to know how precise fragment programs really are.
I mean the precision of individual instructions here, and the RCP and RSQ instructions are especially interesting. On my FX5600 I get three digits of precision with fp32 and two digits with fp16 (computing 1/9 and sqrt(1/9) and writing the results to the floating-point buffer).
Has anyone run these tests on ATI cards?

1/9 - 0.111111 (in red)
sqrt(1/9) - 3.000000 (in green)

Reading it back from the fragment output:
float temp[3];

glReadPixels(0, 0, 1, 1, GL_RGB, GL_FLOAT, temp);
printf("%f\n%f\n", temp[0], temp[1]);

Odd... You should be able to get quite a bit more precision with FP32, which has 24 bits of mantissa. FP16 gives you 10 bits, and FP24, which the Radeon 9500 and above use, gives 16 bits of mantissa.
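
To put those bit counts in terms of decimal digits, here is a rough C sketch. It takes the mantissa widths quoted above at face value; whether the implicit leading bit is counted shifts each figure by about 0.3 digits. The function name decimal_digits is made up for illustration.

```c
#include <math.h>

/* Approximate number of decimal digits a mantissa of the given
   bit width can carry: bits * log10(2). */
static double decimal_digits(int mantissa_bits)
{
    return mantissa_bits * log10(2.0);
}
```

With the widths from the post, this gives roughly 7.2 digits for 24 bits, 4.8 digits for 16 bits and 3.0 digits for 10 bits, so register width alone would not explain a three-digit fp32 result.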

It’s a question of the instruction, not the precision of the registers. The RCP and RSQ instructions don’t actually carry out the operation at the full precision of the register. They merely estimate the quantity to some arbitrary precision.

The precision of these instructions is implementation dependent.

Funny… I thought sqrt(1/9) should be 0.33333…

Which proves that RGB8 normalization cubemaps still have better precision on the GFFX.

Blame the recip sqrt…

1/9 - 0.111111
sqrt(1/9) - 0.333332

Is it possible that the precision you get depends on the number you feed in?

Do some more testing and let’s see.

Interesting thread!

The spec recommends doing one iteration of Newton-Raphson root finding if precision is important. This will double the number of digits of precision; it converges very quickly.
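
For reference, one refinement step can be sketched in plain C like this. It is only an illustration of the standard Newton-Raphson formulas for 1/x and 1/sqrt(x), not the actual fragment program instruction sequence; the names refine_rcp and refine_rsq are hypothetical, and y stands for whatever estimate RCP or RSQ returned.

```c
/* One Newton-Raphson step for an estimate y of 1/x:
   y' = y * (2 - x*y). */
static float refine_rcp(float x, float y)
{
    return y * (2.0f - x * y);
}

/* One Newton-Raphson step for an estimate y of 1/sqrt(x):
   y' = y * (1.5 - 0.5*x*y*y). */
static float refine_rsq(float x, float y)
{
    return y * (1.5f - 0.5f * x * y * y);
}
```

Each step roughly squares the relative error, so an estimate that is good to about three digits comes out good to about six, at the cost of a few extra multiply and add instructions.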

Thus, it’s up to you to determine whether you want a precise result, or whether an estimate is good enough. If you have a normal that is no longer unit length because of interpolation (as you’ll get out of a texture or a texcoord interpolant), then normalizing by the RSQ of its squared length will still make the normal quite a bit better than it was.