 # Float vs. Double

This is basically a poll to see which data type people prefer.

How does this relate to OGL?
Due to the extra precision of a double, it takes longer to compute than an equivalent calculation with float.

Why use double at all?
Some applications require the extra precision of double for adequate accuracy.
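To make the accuracy point concrete, here is a minimal sketch that adds 0.1 to an accumulator many times in each type and measures how far the total drifts from the exact product. The helper names `accumulate_steps` and `accumulation_error` are made up for this illustration, and the size of the drift will depend on the compiler and FPU settings.

```cpp
#include <cstddef>

// Add `step` to an accumulator of type T `count` times.
// T is the type under test (float or double).
template <typename T>
T accumulate_steps(T step, std::size_t count) {
    T sum = T(0);
    for (std::size_t i = 0; i < count; ++i) {
        sum += step;
    }
    return sum;
}

// Absolute error of the accumulated sum against the exact product,
// measured in double so both types are compared on the same scale.
template <typename T>
double accumulation_error(std::size_t count) {
    const double exact = 0.1 * static_cast<double>(count);
    const double got = static_cast<double>(accumulate_steps<T>(T(0.1), count));
    return got > exact ? got - exact : exact - got;
}
```

Comparing `accumulation_error<float>(1000000)` with `accumulation_error<double>(1000000)` should show the float total drifting much further, which is the kind of application where double earns its cost.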

I for one always use double in place of float. Although programming isn’t so deep that an inaccuracy of 10^-10 will factor into 10^+10, I like to have the accuracy just so I know things are running as perfectly as possible.

Has anyone noticed a sizeable speed boost using float instead in their applications, or are you all perfectionists who use double for the same reason I do?

Apparently (or so I’ve heard), using doubles on PCs is actually faster than using floats, because most libs just convert the float to double first.

However, when it comes to rendering, you should always use floats. This cuts down on memory bandwidth (the most common bottleneck in PCs today), and it is the “fast path” used in drivers, i.e. current drivers are optimized more for floats than for doubles.

Use doubles as much as you want in your physics/maths code, but always use floats for rendering.

• Nutty
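Nutty's split between math code and rendering code could look roughly like the sketch below: keep the simulation in double and narrow to float only when building the array handed to the renderer. `to_render_vertices` and `coordinate_bytes` are hypothetical helpers, not part of OpenGL, and no GL calls are made here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: narrow simulation coordinates (double) to the
// float array that would be handed to the renderer.
std::vector<float> to_render_vertices(const std::vector<double>& simulated) {
    std::vector<float> out;
    out.reserve(simulated.size());
    for (double value : simulated) {
        out.push_back(static_cast<float>(value));
    }
    return out;
}

// Bytes occupied by `count` coordinates of type T.
template <typename T>
std::size_t coordinate_bytes(std::size_t count) {
    return count * sizeof(T);
}
```

On a typical platform with 4-byte floats and 8-byte doubles, `coordinate_bytes<float>(n)` is half of `coordinate_bytes<double>(n)`, which is where the bandwidth saving comes from.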

>Due to the extra precision of a double, it takes longer to compute than an equivalent calculation with float.

This is actually incorrect. Technically speaking, the FPU in the chip performs all calculations on 10-byte (80-bit) floating-point numbers. GLfloat is 4 bytes and double is 8 bytes. Each is converted into the 10-byte format internally prior to any floating-point calculation, even addition! This conversion process is practically identical for both. (I think only 10-byte numbers take longer.)
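For anyone who wants to check the sizes on their own compiler, a small sketch follows. The constant names are made up for this example. The values assume an IEEE 754 platform; `long double` in particular is only sometimes the 80-bit x87 format, so its precision is not assumed here.

```cpp
#include <cstddef>
#include <limits>

// Storage size in bytes of the two types discussed in the thread.
constexpr std::size_t float_bytes = sizeof(float);
constexpr std::size_t double_bytes = sizeof(double);

// Significand precision in bits, including the implicit leading bit.
constexpr int float_bits = std::numeric_limits<float>::digits;
constexpr int double_bits = std::numeric_limits<double>::digits;
constexpr int long_double_bits = std::numeric_limits<long double>::digits;
```

On common desktop compilers these come out as 4 and 8 bytes with 24 and 53 bits of significand; `long_double_bits` varies by compiler.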

Also, I agree with Nutty in regard to using GLfloat for rendering; the precision provided by GLfloat is more than enough given that pixel positions are integers.

A benchmark is needed. It would be useful to know how much slower, if at all, double is than float on modern hardware. I know a few years ago you would be trying to put everything in an integer format and using fixed-point math; how times have changed.

When I say floats are faster, I’m particularly referring to mathematical functions like sqrt(). Math functions like sqrt() handle floats quicker because their method for calculating the result is by a process, not an equation. This process is recursively called to increase the accuracy of the result, so greater precision = greater # loops. This is where float outperforms double.
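The "process, not an equation" idea can be sketched with Newton's method for square roots, written in software and stopped once successive estimates agree to the precision of the type. This is only an illustration of the software case; `newton_sqrt` is a made-up name, and nothing here says this is how any particular math library or FPU computes sqrt().

```cpp
#include <limits>

// Newton's method for sqrt(a), a > 0, iterating until two successive
// estimates agree to within a few units of T's machine epsilon.
// Returns the number of iterations used; the estimate goes to `result`.
template <typename T>
int newton_sqrt(T a, T& result) {
    const T tolerance = T(4) * std::numeric_limits<T>::epsilon();
    T current = a > T(1) ? a : T(1);  // start at or above the true root
    int iterations = 0;
    while (iterations < 100) {        // safety cap, not expected to be hit
        const T next = (current + a / current) / T(2);
        ++iterations;
        const T diff = next > current ? next - current : current - next;
        current = next;
        if (diff <= tolerance * current) {
            break;
        }
    }
    result = current;
    return iterations;
}
```

Calling it as `newton_sqrt<float>(2.0f, rf)` and `newton_sqrt<double>(2.0, rd)` lets you compare the iteration counts; because Newton's method converges quadratically, the tighter double tolerance typically costs only about one extra pass.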

Have you tried to find single-precision sin and cos functions in a Microsoft compiler?! What the heck is up with that?

Cheers, Angus.
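On the single-precision sin/cos question: standard C++ `<cmath>` declares float overloads of `std::sin` and `std::cos`, so a sketch like the one below stays in float on a conforming library. Whether a given older Microsoft compiler ships those overloads is not something this example claims; `sin_degrees` is a made-up helper.

```cpp
#include <cmath>
#include <type_traits>

// Standard C++ overloads std::sin and std::cos for float, so the
// result type follows the argument type without going through double.
static_assert(std::is_same<decltype(std::sin(1.0f)), float>::value,
              "std::sin(float) should return float");
static_assert(std::is_same<decltype(std::cos(1.0)), double>::value,
              "std::cos(double) should return double");

// Sine of an angle given in degrees, computed in single precision.
inline float sin_degrees(float degrees) {
    const float radians = degrees * 3.14159265f / 180.0f;
    return std::sin(radians);
}
```

If the `static_assert` lines fail to compile on a particular toolchain, that is a sign its library lacks the float overloads.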

From what I’ve heard, you should use floats. Something like: current processors can shuffle around 32 bits in one CPU cycle (32 bits = 4 bytes = sizeof(float)). Not sure where I heard it, but I think it was in a DirectX book. And then I also read (in an OpenGL book) that OpenGL internally works with floats, which would make them faster because your vars are never going to get converted to something else…

Since I’m inherently interested in performance-related topics, I figure I’ll add my 2 cents…

Wrote a quick test app that performs multiplication on 10,000 doubles and then 10,000 floats.

On a P3 800 with 64 MB RAM, using VC++ 6 and its stock profiler:
doubles: 0.255 milliseconds
floats: 0.128 milliseconds

I toyed a bit with the amount of data and pretty much got the same results across the board, i.e. for a million mults, it took 26.320 ms.

Guess this pretty much just sums up what’s already been said.

Dave
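Dave's test app isn't posted, so the following is only a guess at its shape: multiply a block of values by a constant and time the loop with `std::chrono`. The names `MultiplyTiming` and `time_multiplies` are invented for this sketch, and the timings it reports will depend heavily on compiler flags and hardware.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Result of one timing run: elapsed wall-clock time and a checksum
// that keeps the compiler from discarding the multiplications.
struct MultiplyTiming {
    double milliseconds;
    double checksum;
};

// Multiply `count` values of type T by a constant and time the loop.
template <typename T>
MultiplyTiming time_multiplies(std::size_t count) {
    std::vector<T> values(count, T(1.5));
    const auto start = std::chrono::steady_clock::now();
    for (T& value : values) {
        value *= T(2);
    }
    const auto stop = std::chrono::steady_clock::now();

    double checksum = 0.0;
    for (T value : values) {
        checksum += static_cast<double>(value);
    }
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return {elapsed.count(), checksum};
}
```

Running `time_multiplies<float>(10000)` and `time_multiplies<double>(10000)` several times and comparing the `milliseconds` fields gives a rough equivalent of the numbers above; a single run is too noisy to trust.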

I realise that very few, if any, of you are assembly programmers, so you wouldn’t know this, but while the benchmarks give you those results, you don’t actually know what the compiler is doing behind the scenes.

As I already said, the FPU performs all calculations on 10-byte floating-point values.
Therefore the following comment

>Math functions like sqrt() handle floats quicker because their method for calculating the result is by a process, not an equation. This process is recursively called to increase the accuracy of the result, so greater precision = greater # loops

is only true if you’re calculating square roots in software. There hasn’t been a need to do that since the 486DX, or earlier if you count the math coprocessors for the 286 and 386.

Technical note: the math coprocessor (x87) instructions are still used, but since the 486DX the FPU has been built onto the same chip as the x86 CPU.

The instruction for square root is fsqrt, and it takes 83-87 clock cycles on a 486 (I don’t have Pentium values to hand).

A quick scan through all the instructions reveals little or no difference between doubles (8 bytes) and GLfloats or singles (4 bytes). Where there is a difference, it’s only about 2 clock cycles, pretty irrelevant considering that division, for example, takes 73 in total.
Indeed, the only instructions where there is a difference are the loads and stores of values; all arithmetic instructions are the same. And again, as I said, the difference is only 2-3 clock cycles, irrelevant compared to the time spent performing operations.

Just curious, but I always figured that sqrt, sin, cos, etc. were calculated using Taylor polynomials… ergo directly through a formula…

Taylor series are an equation, but they can also be seen as an algorithm, because as you add terms, you make the precision higher.

[This message has been edited by Gorg (edited 03-14-2001).]
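The "more terms, more precision" view of a Taylor series can be sketched for sin(x) as below. `taylor_sin` and `taylor_sin_error` are names made up for this example, and this is not a claim about how any real library or FPU evaluates sine.

```cpp
#include <cmath>

// Partial sum of the Taylor series of sin(x) around 0:
//   x - x^3/3! + x^5/5! - ...   using `terms` terms.
double taylor_sin(double x, int terms) {
    double term = x;   // current term, starts at x^1 / 1!
    double sum = 0.0;
    for (int n = 0; n < terms; ++n) {
        sum += term;
        // Next term: multiply by -x^2 / ((2n+2)(2n+3)).
        term *= -x * x / ((2.0 * n + 2.0) * (2.0 * n + 3.0));
    }
    return sum;
}

// Absolute difference between the partial sum and the library sine.
double taylor_sin_error(double x, int terms) {
    return std::fabs(taylor_sin(x, terms) - std::sin(x));
}
```

Comparing `taylor_sin_error(1.0, 2)` with `taylor_sin_error(1.0, 8)` shows the error shrinking as terms are added, which is the sense in which the series doubles as an algorithm.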

Originally posted by Gorg:
Taylor series are an equation…

I suppose I am getting picky here, but mathematically speaking, writing a Taylor series is not writing an equation!

It is a decomposition of a function on the (1, X, X^2, X^3, X^4, …, X^N) basis of the polynomial space!

And it is not an algorithm either, although, as you said, it can be used as an algorithm…

Regards.

Eric

Also worth mentioning is that, when working with a relatively small set of data, the data is more likely to fit into L1 cache if they are floats rather than doubles.

That being said, if you can fit your data entirely in the L1 cache even as doubles, then I don’t think there will be any performance difference between floats and doubles (barring the fact that you are using more of the cache, and may thus be evicting other important data).
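A back-of-the-envelope version of the cache argument is sketched below. The 16 KiB figure in `assumed_l1_bytes` is an assumption picked for illustration, not a number from this thread, and `fits_in_cache` is a made-up helper that ignores everything else competing for the cache.

```cpp
#include <cstddef>

// Assumed L1 data cache size for illustration only (16 KiB); real sizes
// vary by processor and are not stated in the thread.
constexpr std::size_t assumed_l1_bytes = 16 * 1024;

// True if `count` elements of type T fit in a cache of `cache_bytes`.
template <typename T>
constexpr bool fits_in_cache(std::size_t count, std::size_t cache_bytes) {
    return count * sizeof(T) <= cache_bytes;
}
```

With 4-byte floats and 8-byte doubles, 3000 elements fit under that assumed size as floats (12000 bytes) but not as doubles (24000 bytes), while a handful of variables fits either way.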

I made a variation of lpVoid’s test. However, instead of working on 1000000 floats and 1000000 doubles, I worked on 5 floats and 5 doubles. I made a loop that performs a variety of FP operations (sin, cos, abs, sqrt, mul, div, and add) on these 5 variables. I then ran through this loop 1000000 times. The result was that floats and doubles performed the same.

I looked at the assembly code to make sure the data was being written back out to memory, so the 5 variables weren’t just staying on the FPU stack (initially they were, which I didn’t think would necessarily be an accurate measure of performance, so I had to modify the code to force it to perform fld/fst each time).
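The actual test code isn't shown, so this is only one way the small-data-set loop might be written. It uses `volatile` to ask the compiler to load and store the five variables on every access, which is an assumption about how the fld/fst behaviour was forced; `small_set_workload` is a made-up name, and the operations are arranged so the values stay bounded.

```cpp
#include <cmath>

// Run a mix of floating-point operations over five variables.
// `volatile` asks the compiler to load and store each variable on
// every access instead of keeping it in a register.
template <typename T>
T small_set_workload(int iterations) {
    volatile T v[5] = {T(0.1), T(0.2), T(0.3), T(0.4), T(0.5)};
    for (int i = 0; i < iterations; ++i) {
        const T a = v[1];
        v[0] = std::sin(a);
        const T b = v[2];
        v[1] = std::cos(b);
        const T c = v[3];
        v[2] = std::sqrt(std::abs(c));
        const T d = v[4];
        v[3] = d * T(0.5);
        const T e = v[0];
        const T f = v[1];
        v[4] = e / T(2) + f;
    }
    return v[0] + v[1] + v[2] + v[3] + v[4];
}
```

Timing `small_set_workload<float>(1000000)` against `small_set_workload<double>(1000000)` would be the comparison described above; checking the generated assembly is still the only way to be sure what the compiler did.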

So, for large data sets, float is definitely faster, but for small data sets (not sure up to what size…that would be dependent on the processor) float and double should perform approximately the same.

I use GLfloats because all glInterleavedArrays formats contain floats; there are no formats for doubles. So if you want to use interleaved vertex arrays, use floats. (Red book v1.1 page 77).

Carl
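As an illustration of Carl's point, here is a sketch of a vertex struct matching one of the interleaved formats, GL_N3F_V3F (a normal followed by a position, all floats). The struct name `VertexN3FV3F` is made up, and the GL call is left in a comment so the example compiles without OpenGL headers or a context.

```cpp
#include <cstddef>

// One vertex laid out the way the GL_N3F_V3F interleaved format expects:
// three floats of normal followed by three floats of position.
struct VertexN3FV3F {
    float normal[3];
    float position[3];
};

static_assert(sizeof(VertexN3FV3F) == 6 * sizeof(float),
              "layout should be six tightly packed floats");
static_assert(offsetof(VertexN3FV3F, position) == 3 * sizeof(float),
              "position should follow the normal directly");

// With an OpenGL header included and a context current, an array of
// these could be submitted with:
//   glInterleavedArrays(GL_N3F_V3F, 0, vertices);
```

A stride of 0 tells OpenGL the vertices are tightly packed, which is what the `static_assert` checks are guarding.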