Statistics et al.: Fractional data types

Wednesday, 29 October 2014

Fractional data types

There's likely a good solution to this already, but it eludes me: Why are there no fraction data types at the same basic level at int, char, and float?

A lot of statistical operations, like matrix inversions, are computationally very expensive. Part of the reason they take so long to perform for large matrices is because of all the division of numbers that's going on. Compared to addition and multiplication, division of floating point numbers is very hard for computers. It has to be done in a form of binary long division, which involves a subtraction and multiplication for each bit.

However, division of fractions is simply cross-multiplication, which is cheap.

Consider a double length float in a 32-bit system. It has 64 bits, 52 which store the actual number ( the mantissa), 11 which store how large the number is, and 1 for the plus or minus sign.

Modelled on this, a 64-bit fraction data type could have two 28-bit mantissae, 11 bits for the magnitude, and 1 for the sign.

Note that fractions are not unique representations of numbers. 18/12 is 6/4 is 3/2. We can exploit that to use only a single sign and magnitude for the fraction as a whole.

Conversion from fraction to double is a simple division, so we can't avoid division entirely, but we could do intermediate steps as fractions.

Conversion from double to fraction is harder. It can be done in a arbitrary manner, of setting the denominator to 1, but in the above system, 28 bits of precision are lost, and the whole denominator is wasted storing '1'.

Is there a way to convert a double to fraction to minimize loss? Can it be determined in a way that's cheap enough to make conversion worthwhile? My uninformed hunch is converting x to x^1.5 / x^0.5 would work, but square roots are still a bit costly, and I have no idea if it's particularly good at preserving precision.

Any ideas?

No comments:

Post a Comment

About the Author

Jack Davis is a teaching professor in Statistics at the University of Waterloo, Canada.

Their research spans statistics in sport, data mining, adult education and Bayesian computation.

They have a course called "Statistics, Gambling, and Games of Chance" on Udemy.

They also pretend to be an expert in writing and game design, which is why they wrote a textbook called "Writing for Statisticians" and why they programmed a video game called "Doc Logic" for Xbox 360.

They enjoy chess and chess variants, but they are so, so, very bad at them. They want to try living on a houseboat sometime.

Statistics et al.

Featured post

Textbook: Writing for Statistics and Data Science

Wednesday, 29 October 2014

Fractional data types

No comments:

Post a Comment