## The Joel on Software Discussion Group (CLOSED)

A place to discuss Joel on Software. Now closed.

How can I write floating point numbers to an XML file and read them back again without loss of data? It seems that any conversion to ASCII will cause rounding errors during the trip and I need complete accuracy.
I've thought of converting the IEEE info to hexadecimal, but that means users can't see the numbers in a text editor. The other thing that occurs to me is to write both - an ASCII version and a hex version (with a "precise" tag). If they're very different when I read the file I can assume somebody edited the numbers and use the ASCII version. Or maybe there's a standard solution to this problem...?
Jimmy Jones Saturday, July 12, 2008
When using a standard conforming set of IEEE floating point conversion routines, an IEEE float can be converted from binary to ASCII decimal and back to binary with exact preservation of all bits in the binary representation. It is only necessary to include enough decimal digits of precision in the ASCII form.
Some floating point numbers have non-terminating decimal representations.
Binary dump is your only choice.
me Saturday, July 12, 2008
Not true, you only need enough digits of precision in your ASCII representation to differentiate any two floating point numbers.
Saturday, July 12, 2008
It's not a case of differentiating them, I have some numbers in a binary file which I want to extract and write as XML.
When I read back the XML I need to be able to compare them with the original binary and get "true". Doubles have 16.86 digits of precision so if I write out, say, 18 digits am I guaranteed to get back the original number?
Jimmy Jones Saturday, July 12, 2008
I don't use floating point myself but I suggest you might experiment with the http://msdn.microsoft.com/en-us/library/system.xml.xmlconvert.aspx methods.
As I said before, binary dump is your only choice if you want to preserve all floating point numbers.
For background reading, refer to "What Every Computer Scientist Should Know About Floating Point" http://docs.sun.com/source/806-3568/ncg_goldberg.html
me Saturday, July 12, 2008
Example of different IEEE-754 floating point numbers that will most likely not survive ASCII round trip: +0, -0.
me Saturday, July 12, 2008
> As I said before, binary dump is your only choice if you want to preserve all floating point numbers.
Is that because there's no difference between some numbers: e.g. "zero" and "negative zero"; or "two" and "four halves"? If there is a non-infinitesimal difference between any two numbers then ASCII decimal should suffice, to however many significant digits are needed to distinguish it from the next-nearest number.
Without spending too much time, a form such as
Without spending too much time, a form such as

```xml
<float>
  <sign>positive</sign>
  <exponent bits='8'>100</exponent>
  <mantissa bits='24'>23492302</mantissa>
</float>
```

would be a format that represents the floating point number in a readable form (though similar in spirit to a binary expression). If you wanted to store it as a decimal, you need more than 71 characters devoted to storing the number to ensure you can recover it. Which wouldn't be all that readable anyway.

Saturday, July 12, 2008
"binary dump is your only choice"
This seems true on an intuitive level ... if there are decimal numbers which you can't represent exactly in binary (e.g. 0.1) then there must be binary numbers which you can't represent exactly in decimal. It might all work out in the rounding/truncating but for now I'm going to use hexadecimal just to be on the safe side.
Jimmy Jones Saturday, July 12, 2008
"ASCII decimal should suffice, to however many significant digits are needed to distinguish it from the next-nearest number. "
The question is ... how many digits are needed?

"...you need more than 71 characters devoted to storing the number to ensure you can recover it"

Ahah...(!) I suspected it might be quite a few. That clinches it. Plain ASCII won't work. The best solution for editability seems to be something like the dual format I was thinking of earlier, e.g.:

<num value="0.5" ieee="000000000000144"/>

Saturday, July 12, 2008
> you need more than 71 characters
I think 17, not 71. The section titled "Binary to Decimal Conversion" (or "Theorem 15") at the end of http://docs.sun.com/source/806-3568/ncg_goldberg.html says, "When a binary IEEE single precision number is converted to the closest eight digit decimal number, it is not always possible to uniquely recover the binary number from the decimal one. However, if nine decimal digits are used, then converting the decimal number to the closest binary number will recover the original floating-point number. ... The same argument applied to double precision shows that 17 decimal digits are required to recover a double precision number."

In .NET the result of System.Xml.XmlConvert.ToString(Math.PI) is "3.1415926535897931", and of System.Xml.XmlConvert.ToString(double.Epsilon) is "4.94065645841247E-324".
I question your math....
Single precision numbers have 24 bits of precision which is 7.22 decimal digits (ie. log10(2^24)). Double precision numbers have 56 bits which is 16.86 decimal digits. If 9 digits are required for single precision then 17 can't be enough for double precision. Saturday, July 12, 2008
Ian, Christopher, et al, are correct in that you only need enough ASCII characters to print a different value for each possible floating point value.
A couple of details to be concerned with:

- What do you do about flag values, e.g. NaN?
- Are you sure your conversion algorithms are right?

The last point I note based on a project I worked on a few years ago. We used C++ and stored double values in a database. The store and retrieve process returned values that differed by the least significant bit. An insignificant difference in value for us, but, of course, "==" tests failed.
EMF Saturday, July 12, 2008
Never compare floats for equality, it will fail much more frequently than you expect. For FP work, equality tests should be replaced with difference tests less than some small value: a == b ---> abs(a-b) < e where e is some small value based on the magnitudes you expect to be working with.
So long as you write out a representation that has more sig figs than your FP representation (~7 sig figs for single precision FP, ~15 sig figs for double precision) you should have minimal trouble with roundoff errors between write-out and read-in.

Finally, if the values you are working with are exact (e.g. most monetary values) DON'T USE FLOATING POINT. For exact values use integers or other fixed point values (e.g. BCD strings), where you can be assured that you won't suffer from roundoff errors and that the expected rules of arithmetic will work (associativity, commutativity, etc.).
Here's a little test you can do to check the quality of your conversion routines:
```cpp
#include <iostream>
#include <string>
#include <sstream>
#include <cmath>
#include <cassert>
using namespace std;

int main()
{
    double data[] = { 10.0, 843.4, 1.2, 0.04313, 35.2E56, 3.283E-20 };
    double *v, *vend = data + sizeof(data)/sizeof(double);
    double a, b;
    for (v = data; v != vend; ++v) {
        a = log(*v);
        ostringstream oss;
        oss.precision(18);
        oss << a;
        cout << oss.str() << endl;
        istringstream iss(oss.str());
        iss >> b;
        assert(a == b);
    }
    return 0;
}
```

With VC++ 2005 there are no assertion failures. Of course, this only proves the basic assumption that 18 digits is enough. It doesn't test special cases like NaNs, denormals, +/- zero, etc. But hopefully you won't have such values in clean original data.
"...this only proves the basic assumption that 18 digits is enough"
That's a "proof"? :-) Saturday, July 12, 2008
It looks like Microsoft uses E notation to write only significant digits. When I found the 71 character result, I made the assumption that the author needed a decimal representation.
The E notation seems like a good solution to the problem. Saturday, July 12, 2008
> When I found the 71 character result
There is no 71 character result. Where did you think you found such a thing?

> made the assumption that the author needed a decimal representation

Scientific notation is a decimal representation. It contains decimal (base 10) digits and an exponent.

> It looks like Microsoft uses E notation to write only significant digits

Nothing to do with Microsoft. The E notation is universal.

> The E notation seems like a good solution to the problem

Stop being a silly sausage and go and get an education.

> That's a "proof"?

Well, it's a proof that 18 decimal digits permit the exact binary representation to be recovered in at least some cases. Actually, the IEEE 754-1985 standard says that 17 decimal digits is enough in all cases. However, it seems that the standard is a little vague on whether conforming implementations must guarantee lossless conversion when 17 digits are used. I think newer standards are more likely to make this a requirement rather than an expectation.
"Well, it's a proof that 18 decimal digits permits the exact binary representation to be recovered in at least some cases."
Try this:

```cpp
double a = 0.0065919352608000007, b;
std::ostringstream oss;
oss.precision(18);
oss << a << '\0'; // nb. Needs a trailing zero...
std::istringstream iss(oss.str());
iss >> b;
assert(a == b);
```

Saturday, July 12, 2008
PS: Yes, I know why that fails ... I just post it as a cautionary tale.
You have to be able to represent a value to an accuracy of less than half an ULP. Doubles have 53 bits of precision (Did I say 56 above? My bad...) so we need: log10(2^(53.5)) = 16.105 decimal digits. 17 is more than 16.105 so it's enough. It's not an intuitive result, given all the well known problems with floating point numbers, but math is math.
Jimmy Jones Saturday, July 12, 2008
I don't know if it follows any sort of standard, but .NET has an "R" format specifier for floating point which is specifically designed to convert floats to an ASCII representation that can be round-tripped without loss.
If you're using .NET, I think the "R" format might be worth testing to see if it'll do what you want. In an earlier response, "me" mentioned +0 and -0 as values that might not round trip well - throw those into the test, too.
mwb Saturday, July 12, 2008
The solution has been known in the C world for ages - use a fixed-point format (http://en.wikipedia.org/wiki/Fixed-point_arithmetic). A simple format is to use a 32-bit integer to represent a number: the first two bytes represent the integral part, and the other two bytes the fractional part.

This format guarantees that the precision is retained across calculations. The word 'precision' is quite mis-used in much documentation. In reality, the 16 digits referred to in the 'double' format are the number of significant digits, not the number of digits after the radix point. Certainly you cannot retain it across calculations.
Glitch Saturday, July 12, 2008
"Try this:"
This works though: ;-)

```cpp
double a = 0.0065919352608000007, b;
ostringstream oss;
oss.precision(17);
oss.setf(ios::scientific, ios::floatfield);
oss << a;
cout << oss.str() << endl;
istringstream iss(oss.str());
iss >> b;
assert(a == b);
```

I think there is a bug in the Microsoft C++ compiler, personally. 17 digits of precision is definitely enough for lossless conversion if the compiler doesn't mess up. Or maybe it's the C++ standard. Argh. Don't make me go and look this stuff up...
There's C code at http://netlib2.cs.utk.edu/fp/dtoa.c and a paper at http://cm.bell-labs.com/cm/cs/doc/90/4-10.ps.gz about converting between floats and strings. The code provides "dtoa" and "strtod" functions that convert between an IEEE floating point value and the shortest string that, when converted back, will produce exactly the same binary float. Probably similar to the .NET function mentioned above.
A simple 17 significant figure output, like printf with "%.16e", will result in untidy results like (double) 0.1 -> (char*) "1.0000000000000001e-001". The function provided in this code will correctly produce "0.1", because that is the shortest string that will be converted back into exactly the same binary value. Non-numeric values like NaN and Inf are also preserved correctly. It's not fast, but if you want your XML to be vaguely human-readable without losing precision it's a good choice.
+1 for never comparing two doubles for equality. Always use an epsilon.
+1 for writing the value as HEX.

Powered by FogBugz