The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

designing a language: number type

Floating point numbers (float and double) in C/C++ have major flaws for normal everyday usage. You cannot compare 2 floating point numbers, and you cannot even accurately represent 0.1! These references describe these flaws:
http://support.microsoft.com/kb/125056
http://www.sqlite.org/faq.html#q18
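
For example, the effect is easy to see in Python, whose float type is an IEEE-754 binary double (shown purely as an illustration):

print(0.1 + 0.2)           # 0.30000000000000004
print(0.1 + 0.2 == 0.3)    # False
print("%.20f" % 0.1)       # 0.10000000000000000555 -- the stored value is not exactly 0.1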

The language I am designing (called foal) instead uses a number type implemented internally with an integer value and a number of decimal places. Not only does this provide accurate decimal arithmetic, but I can also conveniently "remember" the number of decimal places so that when you add 0.97 and 0.03 you get 1.00 (nice for currencies). With the integer implementation you can be confident that 43.00 == 43 (unlike in floating point arithmetic).

Excluding the issue of bounds (maximum integer values), no accuracy can ever be lost when you add, subtract or multiply. It keeps the maximum number of decimal places of all the operands when adding or subtracting and the total number of decimal places when multiplying. So 2.01 * 0.3 = 0.603.

For division, there is a trade-off compared to floating-point, which always keeps the result to the maximum precision supported by the floating-point type. With my number type you have to indicate the desired precision. So, for example, when dividing by 3, if you want the result to 7 places, do 10 / 3.0000000.
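
To make the rules concrete, here is a rough Python sketch of the representation described above (an unscaled integer plus a decimal-place count). This is only my own illustration of the idea, not foal itself; in particular the division rule -- truncate to the larger of the operands' decimal places -- is inferred from the examples in this thread.

class Dec:
    """Toy decimal: an unscaled integer plus a count of decimal places."""
    def __init__(self, unscaled, places):
        self.unscaled = unscaled      # integer coefficient, e.g. 487
        self.places = places          # decimal places, e.g. 2 -> 4.87

    @classmethod
    def parse(cls, text):
        if '.' in text:
            whole, frac = text.split('.')
            return cls(int(whole + frac), len(frac))
        return cls(int(text), 0)

    def _rescaled(self, places):
        # unscaled value expressed with a larger (or equal) number of places
        return self.unscaled * 10 ** (places - self.places)

    def __add__(self, other):
        p = max(self.places, other.places)      # keep the max decimal places
        return Dec(self._rescaled(p) + other._rescaled(p), p)

    def __mul__(self, other):
        # exact: total decimal places of the operands
        return Dec(self.unscaled * other.unscaled, self.places + other.places)

    def __truediv__(self, other):
        # truncate to the decimal places implied by the operands
        p = max(self.places, other.places)
        shifted = self._rescaled(p + other.places)
        return Dec(shifted // other.unscaled, p)   # // truncates (for non-negative values)

    def __eq__(self, other):
        p = max(self.places, other.places)
        return self._rescaled(p) == other._rescaled(p)

    def __str__(self):
        digits = str(abs(self.unscaled)).rjust(self.places + 1, '0')
        sign = '-' if self.unscaled < 0 else ''
        if self.places == 0:
            return sign + digits
        return sign + digits[:-self.places] + '.' + digits[-self.places:]

print(Dec.parse('0.97') + Dec.parse('0.03'))       # 1.00
print(Dec.parse('2.01') * Dec.parse('0.3'))        # 0.603
print(Dec.parse('10') / Dec.parse('3.0000000'))    # 3.3333333
print(Dec.parse('43.00') == Dec.parse('43'))       # True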

Are there other languages that use a type like this? Can anyone see any disadvantages in this for a higher level language?
Ben Bryant Send private email
Tuesday, May 29, 2007
 
 
See Visual Basic, the "Currency" data type.  It is a 64-bit 'integer', shifted down 4 decimal places.  It has exactly those virtues you describe.
AllanL5
Tuesday, May 29, 2007
 
 
Thanks, and that led me to the .NET Decimal type, which seems to be exactly what I have described (with a floating "scaling" decimal point).
Ben Bryant Send private email
Tuesday, May 29, 2007
 
 
"You cannot compare 2 floating point numbers, and you cannot even accurately represent 0.1!"

You can accurately represent 0.1.  It is represented to within 0.0000000000001% accuracy.  That's pretty good.  Floating points are designed to be used in cases where the original number was only a representation anyways.  For example, you measure the distance from your house to the nearest park as 0.1 miles using the odometer of your car.  If the language stores it as 0.100000000000001 miles, it is correct enough.

If you ask the question "Is it exactly 0.1 miles to the nearest park?", the answer is likely "No".  Even if you measured it at 0.1 miles, someone with better equipment than you will measure it as a different value.  The correct question is "Is it 0.1 miles to the park with this acceptable tolerance?".  That is not an equality expression.  In my opinion, the == operator should throw an exception on floating point numbers.
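
The usual way to code that is to compare within an explicit tolerance rather than with == (a quick Python illustration):

def approx_equal(a, b, tol=1e-9):
    # "equal" means "within an acceptable tolerance"
    return abs(a - b) <= tol

print(0.1 + 0.2 == 0.3)              # False
print(approx_equal(0.1 + 0.2, 0.3))  # True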
JSmith Send private email
Tuesday, May 29, 2007
 
 
JSmith, I understand your position, but those slight inaccuracies are still inconvenient for everyday programming. Hence the decimal and currency types provided in VB and .NET. It is not that the slight inaccuracy matters; it is just that you have to remember you cannot test whether 0.10 + 0.10 == 0.20.
Ben Bryant Send private email
Tuesday, May 29, 2007
 
 
Reading into the .NET Decimal type, it appears the main difference between it and what I posted is that .NET Decimal does not limit the accuracy of the division result to the implied number of decimal places. This shows a difference in priorities from my design.

Since in integer arithmetic:
3 / 2 = 1
in my design:
0.3 / 2 = 0.1
and
0.30 / 2 = 0.15

So if anyone is still reading this thread, the question becomes: do you think it is better as default behavior in a high level language to:

a) truncate the result of a division operation to the implied number of decimal places, as I suggested.
10.00 / 3 = 3.33

or

b) keep the result to the maximum precision supported and have the programmer explicitly truncate or round the decimal number later before comparing or performing additional operations on it
10.00 / 3 = 3.333333333333333333333333333333
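
For what it's worth, Python's decimal module (which implements the same general style of decimal arithmetic) defaults to option (b), and leaves option (a) to an explicit quantize step:

from decimal import Decimal, ROUND_DOWN

q = Decimal('10.00') / 3
print(q)                                          # 3.3333333... to 28 significant digits (option b)
print(q.quantize(Decimal('0.01'), ROUND_DOWN))    # 3.33 (option a, via explicit truncation)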
Ben Bryant Send private email
Tuesday, May 29, 2007
 
 
There are languages that keep the results of division as two fields: dividend and divisor. That way you can do a lot of math with no loss of precision. You might allow it to collapse whole numbers or decimals that don't lose information.

print 1 / 4 --> 0.25
print 1 / 3 --> 1 / 3
print ( 1 / 3 ) * 3 --> 1
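
Python's fractions module works this way, for comparison -- it keeps exact rationals, though it prints 1/4 rather than collapsing it to 0.25:

from fractions import Fraction

print(Fraction(1, 4))           # 1/4
print(Fraction(1, 3))           # 1/3
print(Fraction(1, 3) * 3)       # 1
print(Fraction(1, 3) * 3 == 1)  # True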
Stan James Send private email
Tuesday, May 29, 2007
 
 
"Is it 0.1 miles to the park with this acceptable tolerance?".

It is a good point, but ultimately is it really the role of the computer to be adding its own inaccuracy to the figure which you have settled upon as within your tolerance? Sure, the number you come up with as the distance between two points is ultimately only an approximation, but then the computer is misrepresenting even that value, and for no good reason, since it could easily represent the exact number you give it.
Ben Bryant Send private email
Tuesday, May 29, 2007
 
 
Stan, yes I've heard of math libraries that track the dividend and divisor. But that is for another purpose.

I'm trying to weigh a design for a high level language, such as one that would underlie the formulas in an Excel spreadsheet. I think that the majority of users would simply like to see 0.1 always look like 0.1 rather than 0.100000000, and to expect that 0.10 + 0.10 == 0.20. This is for normal everyday usage.
Ben Bryant Send private email
Tuesday, May 29, 2007
 
 
I may be mistaken, but isn't this the way that floating point numbers are implemented?  Aren't they already a base value and exponent?
Wayne M.
Tuesday, May 29, 2007
 
 
>in my design:
>0.3 / 2 = 0.1
>and
>0.30 / 2 = 0.15

that feels wrong.

i vote B. let the programmer decide. i'm smarter than your language (when it comes to my intent).
maybe
Tuesday, May 29, 2007
 
 
I'd choose keeping the result to the maximum 'native' resolution, and let the programmer decide where to truncate, round up, or round down, and to how many decimal places.

Base-10 versus Base-2 math on computers is one of those "leaky abstractions" Joel keeps going on about.  The computer REALLY does the math in Base-2, so there are some Base-10 decimals which are 'approximated' at the hairy edge in the least significant bits.  There's overflows, underflows, and loss of precision.

So, when doing math at the hairy edge, it's critically important that the programmer KNOW these insights.  Otherwise, the incremental errors can build up.  Even if the programmer knows, he then needs tools (like truncate, round up, round down) so he knows when the errors are occurring and can take steps to correct, or at least report the error.
AllanL5
Tuesday, May 29, 2007
 
 
Make sure you're not hiding a leaky abstraction yourself.  If 0.1 can't be exactly represented, but 0.099999999999998 can be, pretending that 0.099999999999998 "is the same as" 0.1, and therefore showing 0.1 to the user, promotes this leaky abstraction.

This is why one of the rules for Reliable Software is that you NEVER compare the equality of two floating point numbers -- you ALWAYS include a 'range'.

Because two numbers off by 1e-12 aren't 'equal', but often we don't CARE to go down 12 decimal places.  So your very desire to compare equality of two floating-point numbers by using the equals sign is suspect -- it seems you yourself wish to ignore the leaky abstraction.

Which is not a recipe for reliability.
AllanL5
Tuesday, May 29, 2007
 
 
Do a search on IEEE Floating Point for a lot of articles on the issues with floating point calculations.  Here is one concise reference I got from the above search.  I don't see how the proposal here aids in resolving the problem described in this link. http://www.intel.com/technology/itj/2007/v11i1/s2-decimal/1-sidebar.htm
Wayne M.
Tuesday, May 29, 2007
 
 
But 0.1 CAN be exactly represented. That is my whole point and the point of the .NET Decimal type.

It is essentially represented as the integer 10, which is an exact representation. Floating-point, on the other hand, uses binary fractions, which are inexact.
Ben Bryant Send private email
Tuesday, May 29, 2007
 
 
I should say 0.1 is represented as the integer 1, with a decimal place count of 1.
Ben Bryant Send private email
Tuesday, May 29, 2007
 
 
Wayne, the link you provided is actually for a new standard that deals with the issues you are referring to (and which I also provided references for at the top). This new standard is for *Decimal* Floating-Point Arithmetic, which addresses some of the issues I am concerned with (and others).
Ben Bryant Send private email
Tuesday, May 29, 2007
 
 
Here's more on how my number type works (it is the same concept behind the .NET Decimal type and the Decimal Floating-Point Arithmetic standard).

0.5 is represented as 5 with 1 decimal place.
4.87 is represented as 487 with 2 decimal places.

0.5 * 2 = 1.0
is just 5 * 2 = 10, still with 1 decimal place

Since it boils down to integer arithmetic, there is no inaccuracy, no surprises.
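
Python's decimal module stores numbers in exactly this coefficient-plus-exponent form, which you can inspect directly (shown just to illustrate the point):

from decimal import Decimal

print(Decimal('4.87').as_tuple())   # DecimalTuple(sign=0, digits=(4, 8, 7), exponent=-2)
print(Decimal('0.5') * 2)           # 1.0 -- coefficient 5 * 2 = 10, still one decimal place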
Ben Bryant Send private email
Tuesday, May 29, 2007
 
 
"But 0.1 CAN be exactly represented. that is my whole point and the point of the .NET Decimal type."

But one-third can not be.

"It is essentially represented as the integer 10, which is an exact representation. Floating-point on the other hand uses binary fractions which are inexact."

Decimal fractions are similarly inexact, just with a different set of numbers.

Sincerely,

Gene Wirchenko
Gene Wirchenko Send private email
Tuesday, May 29, 2007
 
 
Gene, the point is not that one third cannot be accurately represented, it is that decimal fractions such as 0.1 cannot be accurately represented in binary fractions, so instead we use decimal arithmetic to deal with decimal fractions accurately. Please read the IEEE standard referenced by Wayne above if you want to see the rationale behind this.
Ben Bryant Send private email
Tuesday, May 29, 2007
 
 
"i vote B"

Thanks, but as you pointed out you want the extra information to decide what to do with it. In my scheme you can get that information, but only if you ask for it, i.e. it is not the default behavior.

I want to know why it is bad for the higher level user if the default is to restrict the result to the decimal places involved in the operation. The main weakness I see is that the average high level user will expect it to be rounded, so 0.31/2 would be expected to equal 0.16, not 0.15.
Ben Bryant Send private email
Tuesday, May 29, 2007
 
 
How would you represent PI? Is 3.14 enough? What about e?

I guess you can go far enough with symbolic computations but will need some approximation at some point anyway. I.e. ((2/3 + e)*PI) is some number too...
WildTiger Send private email
Tuesday, May 29, 2007
 
 
Oracle implements numbers with the precision that you are looking for, and it's the source of much anti-Oracle FUD too. However when you want to know whether 3*2.91 equals 8.73 or not, accept no substitute ... I'll take being accurate over being fast any day.
David Aldridge Send private email
Tuesday, May 29, 2007
 
 
> 2.01 * 0.3 = 0.603.

Ideally, it should be:

2.01 * 0.3 = 0.6
but
2.01 * 0.300 = 0.603

In this way, you wouldn't be introducing false precision into the answer.
dev1
Tuesday, May 29, 2007
 
 
If you have a standard to follow, follow it. They've certainly put more thought into it than you have; this is not a trivial problem.

For division, I'd prefer to see the precision of the result specified explicitly. If you know that you spent $123.45 to produce 24691 parts, you probably want to know your price per part to more than .01 precision.

For multiplication, you'll run into trouble when you multiply by anything other than an integer. What's 21474.83648 * 21474.83648? You won't be able to avoid rounding, if you can calculate it at all.
Mark Ransom Send private email
Tuesday, May 29, 2007
 
 
If you haven't looked already, I suggest a quick look at Java BigInteger and BigDecimal.

If you really want to keep complete accuracy allow your Number class to store fractions as well - then you don't have to worry about specifying an accuracy on division.
DJ Clayworth
Tuesday, May 29, 2007
 
 
Ben,

In terms of "ease of use", this is a problem solved by APL and other interpreted array languages, using the concept of "comparison tolerance". You can read a little about it here:

http://www.aplusdev.org/APlusRefV2_9.html#0

Adding a decimal type, though, is solving a slightly different problem, no?
Paul Mansour Send private email
Tuesday, May 29, 2007
 
 
{showing my age here}

Doesn't COBOL give you the option for truncating digits outside your desired precision range, or doing several types of rounding?
xampl Send private email
Tuesday, May 29, 2007
 
 
>in my design:
>0.3 / 2 = 0.1
>and
>0.30 / 2 = 0.15


Never truncate or round in an intermediate result.

0.3 / 2 = 0.15 +/- 0.1

This conveys both the accuracy and the center.  Since you know the center, don't throw it away.
JSmith Send private email
Tuesday, May 29, 2007
 
 
Thanks for all the great input.

WildTiger, it would be up to the person doing the calculation involving PI to decide what to use. If they use 3.14 or 3.14159 well then that is what it will be. My language would not provide pi if that is what you're suggesting. The fact that pi can't be exactly represented is much like how one third can't be. It is not a problem I am trying to solve.

David, I am not sure whether you are implying decimal arithmetic is faster or slower, but if you're saying you want the precision and intuitiveness, then decimal arithmetic is where you want to be.

dev1, when you say "false precision" I think I know what you are referring to, but I was just showing how the exact result of the multiplication is known, with no rounding or arithmetic inaccuracy. My proposal is not to do any rounding, which would be implied if you tried to impose the correct precision that you are referring to, but to sort of follow the idea of integer arithmetic, which leaves the remainder out of the result.

Mark, I am not sure you understood that with multiplication the product of any two numbers represented in decimal can be determined exactly by simply removing the "periods," doing the integer multiplication, and then putting the decimal point back in the result in the right place. I recognize that with 21474.83648 you are pointing to the limit of a 32-bit integer but still it is not really a problem as the result would fit in a 64-bit integer if it were multiplied by itself.

Clayworth, yes, BigDecimal is essentially what I am describing, but BigDecimal has the added benefit of growing to accommodate big numbers.

Complete accuracy for division is not my goal at all, just accuracy on simple math as might meet higher level users' expectations.

Paul, the problem I am describing is solved by the .NET Decimal type and these other "decimal" technologies mentioned. Comparison tolerance does not sound relevant to everyday math in an Excel worksheet.
Ben Bryant Send private email
Tuesday, May 29, 2007
 
 
xampl, my language has those features for rounding or setting the number of decimal places. The question is about the default behavior for the most common usage.

JSmith, what makes you call it an "intermediate result?" What if my simple program is simply displaying it, and if I didn't want it truncated I would have specified more precision in the divisor? I don't think +/- 0.1 is something they would want to see in a higher level language.
Ben Bryant Send private email
Tuesday, May 29, 2007
 
 
Why don't you just use BCD fixed or floating point? This eliminates the decimal rounding problem.
Sunil Tanna
Tuesday, May 29, 2007
 
 
Just stick with: 1, 2, many

and you can't go wrong.
old.fart
Tuesday, May 29, 2007
 
 
You cannot exclude the issue of bounds (maximum integer values).  If your numbers occupy a fixed number of bytes in memory, like float or double, you have limited precision.  This is bad.  If your numbers occupy a variable number of bytes in memory and you never round things, sooner or later you'll encounter a number that will take more memory than you have available.  This is even worse.  So you can't win.

Number representation is as fundamental as a wheel.  There's no need to reinvent it.
Jeff Zanooda Send private email
Tuesday, May 29, 2007
 
 
I was actually trying to make three points; you only addressed one of them.

I understand that multiplication will maintain exact answers, but that can be maintained only up until a point. Using 64 bit integers instead of 32 bit integers helps, but doesn't remove the problem entirely. If you chain a bunch of multiplications, each of which adds a few decimal places to the required precision, what happens when you run out? You're going to a lot of trouble to maintain the exactness of your results, but I want to know if you've thought through what will happen when the limits are reached and you can't maintain that exactness anymore. And will your users understand?
Mark Ransom Send private email
Wednesday, May 30, 2007
 
 
Sunil, BCD is another way of implementing decimal arithmetic for a lot of the same reasons. Until last week I had always assumed I needed BCD for my goals, but this design takes advantage of binary integer arithmetic and was relatively quick to implement.

Mark, why should seeing the price per part more than .01 precision be the *default* behavior?

I agree about following a standard which has been thought through, but there are still details to weigh -- for example, the .NET Decimal type changed between .NET 1.0 and .NET 1.1, where they added a feature more like what I am describing: remembering that 1.00 was specified with two decimal places when the number is printed.
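
Python's decimal module behaves the same way, for what it's worth -- the number of decimal places given in a literal is carried through printing and addition:

from decimal import Decimal

print(Decimal('1.00'))                     # 1.00 -- trailing zeros are remembered
print(Decimal('0.97') + Decimal('0.03'))   # 1.00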

There is nothing being invented here. The basic design and purpose I described appears to be well-established. AllanL5 pointed that out right off the bat.

I don't think arbitrarily big numbers are an everyday need. It is just one of many things to consider, but I am not trying to design a number type that does everything.
Ben Bryant Send private email
Wednesday, May 30, 2007
 
 
If you want to explore this subject further here are some good articles:

The Perils of Floating Point, by Bruce M. Bush
 http://www.lahey.com/float.htm

Binary & Decimal Floating Point Basics, by Jon Skeet
 http://www.yoda.arachsys.com/csharp/floatingpoint.html
 http://www.yoda.arachsys.com/csharp/decimal.html

Bush uses Fortran examples while Skeet talks about .NET data types. He references an article by Jeffrey Sax which contains further references.
Chris Nahr
Wednesday, May 30, 2007
 
 
Use 128 bit doubles for more precision. Almost equal to decimal precision:

http://www.netlib.org/cephes/128bdoc.html
Donald Duck
Wednesday, May 30, 2007
 
 
Ben,

You stated your original issue as:

"You cannot compare 2 floating point numbers, and you cannot even accurately represent 0.1! These references describe these flaws:
http://support.microsoft.com/kb/125056
http://www.sqlite.org/faq.html#q18
"

Comparison tolerance solves the first problem, that "you cannot compare two floating point numbers". In fact, the Microsoft reference that you note above, and most of the other references that other posters have noted, recommend comparison tolerance whenever comparing two floating point numbers. (If one should NEVER compare two floats for equality, then why not just make the equals function tolerant?)

The second problem of accurately representing a decimal number, for example, ensuring that adding up a bunch of pennies equals the correct dollar amount, is solved as you note by a decimal type.

These are two related but different problems.

To answer your other specific questions, if you do division on a decimal type, I would just return a float and let the programmer do what he wants. There is a long history of doing this in high level interpreted languages. If you implement comparison tolerance as well, the float will not be a problem.
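
Some environments now ship exactly such a tolerant equality as a standard function; Python's math.isclose is one example:

import math

print(0.1 + 0.2 == 0.3)              # False
print(math.isclose(0.1 + 0.2, 0.3))  # True, using the default relative tolerance of 1e-9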
Paul Mansour Send private email
Wednesday, May 30, 2007
 
 
"why should seeing the price per part more than .01 precision be the *default* behavior?"

Because your proposed default behavior is surprising, and will result in errors much larger than the ones you're trying to eliminate.

If you're going to change the way people are used to seeing numbers work, you'd better have a damn good reason.
Mark Ransom Send private email
Wednesday, May 30, 2007
 
 
Chris, good references, thanks

Paul, okay I see -- yes, comparison tolerance could essentially be hidden in the equals function, so the programmer wouldn't have to think about it and he could verify: account balance == 4.23

Mark, thanks. Your point is well taken, though I still wonder whether new programmers and high level programmers find it jarring to see 1.0000000000000000 when they assign a double value of 1. I'm not sure the "normal" way of dealing with numbers in programs is that compelling. There might be an option closer to the intuitive way non-programmers deal with numbers, where you only get a lot of decimal places of precision when you explicitly request it.
Ben Bryant Send private email
Wednesday, May 30, 2007
 
 
I was just thinking, how about going the other direction?

Invent some sort of type that has unlimited precision and keep the precision around after an operation.  Leave it up to the programmer to 'cast' down in precision.  If the program slows to a crawl due to the cycles involved in doing a zillion by zillion bit multiply, so be it.
old.fart
Wednesday, May 30, 2007
 
 
Cobol has good data types for this. The best design for fixed point numbers, however, is in Ada where you can specify the DELTA.

One point that many are missing is that in some applications any deviance, no matter how small, makes the software incorrect.
Karel Thönissen Send private email
Wednesday, May 30, 2007
 
 
>One point that many are missing is that in some
>applications any deviance, no matter how small, makes the
>software incorrect.

Then you better use software that supports storing algebraic formulas instead of numbers.  Otherwise, how do you store the square root of 2?
JSmith Send private email
Wednesday, May 30, 2007
 
 
You do not need the square root of 2 in typical accounting applications. However, you must be sure that the books balance under all circumstances. Rounding of negative and positive numbers must be well-behaved, otherwise you end up with one-cent differences.
Karel Thönissen Send private email
Wednesday, May 30, 2007
 
 
>JSmith, what makes you call it an "intermediate result?"
>What if my simple program is simply displaying it, and if
>I didn't want it truncated I would have specified more
>precision in the divisor? I don't think +/- 0.1 is
>something they would want to see in a higher level
>language.

It's an intermediate result until it leaves the program.  Here is an example of why you don't truncate:

A = 3.
B = 2.
C = 2.

X = A / B
Y = X * C

With your system,
X = 3. / 2. = 1.
Y = 1. * 2. = 2.

With my system,
X = 3. / 2. = 1.5 +/- 1
Y = (1.5 +/- 1) * 2. = 3. +/- 1

Changing the order of operations to
X = C / B
Y = A * X

With your system,
X = 2. / 2. = 1.
Y = 3. * 1. = 3.

With my system,
X = 2. / 2. = 1. +/- 1
Y = 3. * (1. +/- 1) = 3. +/- 1

Not only does your system give iffy results (though they could be considered correct under some assumptions), but more importantly, it breaks the commutative property of multiplication.  That makes it mathematically incorrect and will break a lot of code if evaluated under your system.
JSmith Send private email
Wednesday, May 30, 2007
 
 
>You do not need the square root of 2 in typical accounting
>applications. However, you must be sure that the books
>balance under all circumstances. Rounding of negative and
>positive numbers must be well-behaved, otherwise you end
>up with one-cent differences.

Continuously compounded interest is calculated as:

A(t) = A(0) * e ^ (r * t)

e is impossible to get exactly correct as well.  Some approximations will happen.  I know what you are trying to get at, but you said it way wrong.

BTW, the books don't have to exactly balance at the single-calculation level.  If you take deposits from 100,000 depositors, the interest that you earn on the money will not be totally distributed to all of the depositors due to rounding.  The books will reflect this fact as well as an adjustment for the rounding error.  I think accountants have realized that when settling an account, someone is going to keep the partial penny.
JSmith Send private email
Wednesday, May 30, 2007
 
 
"breaks the commutative property of multiplication"

Well, the multiplication result is accurate regardless of order, but if what you mean is whether you divide before or after you multiply, then yes, absolutely it will make a difference, because the divide can always involve a loss of accuracy. This is the same thing you deal with in integer arithmetic: if you want 2/3rds of a number, you should multiply by 2 before you divide by 3. It is certainly a trade-off, but again you can always explicitly introduce greater precision and get a result that is accurate enough for your purposes.
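
The integer-arithmetic version of that rule, as a quick Python illustration:

n = 5
print(n * 2 // 3)   # 3 -- multiply first, then divide: closest to the exact 3.33
print(n // 3 * 2)   # 2 -- divide first: the truncation error gets doubled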
Ben Bryant Send private email
Wednesday, May 30, 2007
 
 
Karel, using decimal arithmetic is what ensures you won't end up with one-cent differences. Most of balancing books is just addition and subtraction, and having a number type that supports an exact representation of 10 cents means no rounding or approximation/tolerance is involved. As for computing compound interest and any calculations involving division, you must ensure your calculation is done to the precision you require and is compatible with the result you need -- this will be an issue that takes some thought regardless of your programming language.
Ben Bryant Send private email
Wednesday, May 30, 2007
 
 
Check Ada's decimal types. With those you can indeed say something like

type Money is delta 0.01 digits 5;

allowing you to accurately count monetary (or other fixed point) values up to 999.99 such that each 0.01 increment is exactly represented. Obviously, if you need a greater range, change the number of required digits.

See http://www.adaic.com/docs/95style/html/sec_7/7-2-8.html
-L
Wednesday, May 30, 2007
 
 
"Well the multiplication result is accurate regardless of order, but if what you mean is whether you divide before or after you multiply, then yes absolutely it will make a difference because the divide can always involve the loss of accuracy. This is the same thing you deal with in integer arithmetic if you want 2/3rds of a number, you should multiply by 2 before you divide by 3. It is certainly a trade-off, but again you can always explicitly introduce greater precision and get a result that is accurate enough for your purposes."

Although you say it is accurate regardless of order, it is different if you perform the operations in a different order.  How can they both be accurate?  If you want, I can give you an example where, just by changing the order of operations, you get inaccuracies far larger than your lowest significant digit.

For example, if you were specifying three digits of precision all the way through a calculation, this could cause an error in the third digit.  You now have two digits of precision.  By rounding at every step, you are introducing errors that don't belong in the output.  Sure you could tell people that these errors will be there, but that isn't going to make the results any more correct.  Also, to deal with this problem, people will start arbitrarily adding extra zeros to their numbers to get accurate answers which will invalidate all the advantages the model may have had in the first place.
JSmith Send private email
Wednesday, May 30, 2007
 
 
JSmith, your example involves division. There is no inaccuracy or difference in a multiplication result due to order. As I said above I would keep all of the places in the result of a multiplication. I never suggested rounding, only truncation in the case of division.

my original example is exact in either order:
2.01 * 0.3 = 0.603
0.3 * 2.01 = 0.603

"people will start arbitrarily adding extra zeros to their numbers to get accurate answers"

Adding zeros would only keep more digits for division; it would have no effect on the other operations. Division is the operation where you always need to be aware of the accuracy you want.

That said, I am leaning towards keeping all the available digits from the result of the division. I just want to fully consider the option of truncating.
Ben Bryant Send private email
Wednesday, May 30, 2007
 
 
How would you handle a case like this:

number a = 3.0
number b = 3.00
number c = 10

boolean ab = (a == b) // presumably true?

number ca = c / a    // presumably 3.3
number cb = c / b    // presumably 3.33

boolean abc = (ca == cb)  // Bizarro!!

Now you've painted yourself into a corner. Most programmers would expect ab = true, since 3.0 == 3.00 in most parts of the world. But then, when dividing a single number by these two equal values, you end up with two different results.

Very very confusing.

You could avoid that problem by saying that 3.0 != 3.00, but then you've opened up a whole other can of worms.

So, ummmm...I hope you like worms.
BenjiSmith Send private email
Wednesday, May 30, 2007
 
 
+1 for worms.

I see you're considering keeping maximum precision for the results of a divide - that's a step in the right direction. Now let's talk about multiplies again. What will you do with 42949.67296 * 42949.67296 * 3.14159?

There are a lot of possible pitfalls, that's why I suggested finding an already existing standard and sticking to it. You're not the first one to go through this.
Mark Ransom Send private email
Wednesday, May 30, 2007
 
 
guys, decimal arithmetic *is* a standard. I think you are looking for "worms" too hard.

BenjiSmith, you gave a well-explained example that shows an amazing understanding of this whole discussion. The only argument against your point is that perhaps your mathematician side is showing through too strongly, and the practical answer is that 3.3 is simply not equal to 3.33. You arrived at those numbers differently with an interim division, so the expectation that they be equal is not justified. Whenever you involve division you have to allow for differences in what precision was captured. I think you could devise a similar problem with .NET's Decimal type. Once you use division, all bets are off for later equality unless you have selected a number of digits to retain and/or purposefully rounded the intermediate division result. Ultimately to be able to infer later equality due to intermediate equality you'd have to go for one of those math number libraries that retains dividend and divisor as mentioned above.

Mark, I see that very large multiplication as a question either for a "big" type, or simply an undefined loss of precision in a type not built for that many digits. I agreed with you before; have I given the impression I would not follow a standard for implementing bounds or edge handling?
Ben Bryant Send private email
Wednesday, May 30, 2007
 
 
BenjiSmith,

x = 10
y = x / 3
z = y * 3

x == z? I don't think you can assume that in any number type that does not retain the dividend and divisor (unless, I suppose, the base of the number system is a multiple of 3).
Ben Bryant Send private email
Wednesday, May 30, 2007
 
 
Update: I've settled on using max precision which will pass BenjiSmith's test. So:

10 / 3 = 0.333333333
10 / 3.0 = 0.333333333

The rule then is that the max number of decimal places is retained for addition and subtraction. For multiplication and division, the number of decimal places may grow to accommodate the best precision result the type can support. So:

10 / 3.0 == 10 / 3.00
10.0 / 3 == 10 / 3
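
This matches Python's decimal module, where division is carried out to the context precision (28 significant digits by default) regardless of the operands' decimal places:

from decimal import Decimal

print(Decimal(10) / Decimal('3.0'))                                   # 3.3333333... (28 significant digits)
print(Decimal(10) / Decimal('3.0') == Decimal(10) / Decimal('3.00'))  # True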
Ben Bryant Send private email
Wednesday, May 30, 2007
 
 
"guys, decimal arithmetic *is* a standard. I think you are looking for 'worms' too hard."

I got lost in the arguments at about the 20th reply, so I just skipped to the end.  My apologies if anyone else has already said this:

Sounds like YOU are getting lost, too.  Many languages already solve the problem, if others here can be trusted, yet it seems you are designing an entire language to deal with one problem: the "problem" of decimal precision.

Is this an intellectual exercise, or is there some other compelling reason why you have to reinvent the wheel to "solve" this "problem?"  I'm curious ...
Karl Perry Send private email
Wednesday, May 30, 2007
 
 
Not my cuppa, but I can see how it might be useful.  You might want to consider what 10 means.  It is generally considered to have only one digit of precision, but what if the measurement is accurate to two digits?  How will you specify 10 accurate to two digits of precision?

Sincerely,

Gene Wirchenko
Gene Wirchenko Send private email
Wednesday, May 30, 2007
 
 
Floating point is used for scientific computation. But this is only one encoding out of many. Sometimes the fixed decimal is the better option.

Alternatively, one can write and use a class like java.math.BigDecimal at the cost of performance.

PS Cobol has a powerful way of declaring number types.
Dino Send private email
Wednesday, May 30, 2007
 
 
Karl, when you design a language, you pick and choose how to make it work. There are plenty of choices, all of them previously "solved," such as different ways of handling typing, some more suited to certain problems than others. My language has nothing to do with solving the problem of decimal precision. This number type is just one of many things to decide on and implement in the language.

Gene, digits of precision and the digit accuracy of measurements is not something I or my average user would be concerned with. I have referred to number of decimal places, in US dollars two is often used. In quantities like kilograms there may be a number of decimal places that is used -- simple stuff like that.
Ben Bryant Send private email
Wednesday, May 30, 2007
 
 
"Gene, digits of precision and the digit accuracy of measurements is not something I or my average user would be concerned with. I have referred to number of decimal places, in US dollars two is often used. In quantities like kilograms there may be a number of decimal places that is used -- simple stuff like that."

Maybe all you want is calculation to a specified number of decimal places, but this is very close to the precision issue.  What you have posted is very close to how you figure the number of digits of precision.

Sincerely,

Gene Wirchenko
Gene Wirchenko Send private email
Thursday, May 31, 2007
 
 
Gene, I think I just realized the difference.
If decimal places matter then:
7.00 * 10 = 70.00
If digits of precision matter then:
7.00 * 10 = 70.0
Digits of precision are more for scientific purposes, which is why exponential notation is normalized to one digit followed by decimals -- otherwise you would not be expressing the number of significant digits. E.g., 7.00e+2 tells you there are 3 significant digits; 700 does not.
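
Decimal arithmetic of the kind discussed here follows the decimal-places rule; in Python's decimal module, for instance:

from decimal import Decimal

print(Decimal('7.00') * 10)   # 70.00 -- decimal places are tracked, not significant digits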
Ben Bryant Send private email
Thursday, May 31, 2007
 
 
Just put the rounded pennies into my account.
OneMist8k
Thursday, May 31, 2007
 
 
Floating point representation was designed for a very specific purpose: fast calculations for scientific and engineering work where some inaccuracy is acceptable. And float remains just as good for that purpose today as it was when the representation was created. The problem is just when float is used for things it wasn't designed for (like representing currency values).

And you have to keep in mind the historical perspective. Let's say that it is 1967 and you have ten million decimal values (say, measurements of some phenomenon) and you want to find the average. You could use a completely accurate data type, but with 1967 computers, it would take four months to do the calculations. With float, it would only take 3 days. Which would you pick? And that massive performance difference still exists today (only now you have 10 quadrillion values to average).

Float has its uses and accurate data types (Decimal, et al.) have their uses. You want to have both in your toolkit and understand the limitations and advantages of both.
Bill Tomlinson Send private email
Thursday, May 31, 2007
 
 
It's sort of an interesting topic, but it seems to me that how numbers are stored is nearly orthogonal to the design of the language itself.

For example, years ago, Turbo Pascal used to come in 3 flavours:  normal, extended and BCD

The only difference between the 3 was

Normal used what we would today call "float" for the Pascal REAL type.

Extended used  "double" for the REAL type

BCD used floating-BCDS for the REAL type.

The actual language itself was the same. The same programs would compile in all 3 versions of the compiler. 

(Of course you could have a language where you  could declare variables as  float, double, fixed, doublefixed, bcd, doublebcd  - but that's still kind of incidental to the overall design of the language).
Sunil Tanna
Thursday, May 31, 2007
 
 
Bill, agreed. That's why I specifically said "for normal everyday usage." I don't see any reason to support floating-point types like double except for scientific purposes. If my language has one non-integer number type, it will definitely be a decimal type.

Sunil, yes, as I mentioned above, this is just one detail in the design of the language. You're right that it has little impact on other features of the language.
Ben Bryant Send private email
Thursday, May 31, 2007
 
 
This has been educational for sure.

I'd still like to have (1 / 3) * 3 == 1.  I had a programmer friend who came from a math background and couldn't figure out why COBOL gave different answers for

( a / b ) * c
( a * c ) / b

Computers just don't do math very well.

The REXX language provides "arbitrary precision". When you exceed the bitness of the machine it goes into doing math digit by digit. I'm not sure why someone asked, but this is a runnable program:

/* */
numeric digits 20 /* default is 9 */
say 21474.83648 * 21474.83648  /* 461168601.8427387904 */
say 2147483648  * 2147483648  /* 4611686018427387904  */
say ( 1 / 3 ) * 3              /* 0.99999999999999999999 */

From The REXX Language Definition "Before every arithmetic operation the terms are ... truncated to digits+1 significant digits ... The operation is then carried out under (up to) double that precision ... the result is rounded if necessary to the precision specified by NUMERIC DIGITS" It also has a NUMERIC FUZZ setting for comparisons so we could make ( 1 / 3 ) * 3 == 1 if we wanted. :-)
Stan James Send private email
Friday, June 01, 2007
 
 
"have I given the impression I would not follow a standard for implementing bounds or edge handling?"

Actually, I did have that impression, perhaps because you hadn't decided on a standard yet. I won't belabor the point any further.

Someone above had suggested storing each number as a separate numerator and denominator. I've given that some thought, and realized that what you've proposed is just a special case of that method, where the denominators are always a power of 10. Instead of saying that 10.00 is 1000 with 2 decimal places, you can think of it as 1000 with a divisor of 100. Now what if you removed the restriction that divisors be a power of 10? 10.00/3.00 ==> (1000/100) / (300/100) ==> (1000/100) * (100/300) ==> 100000/30000. At any time you feel it's necessary, you can simplify the expression by dividing out the greatest common divisor: 100000/30000 ==> 10/3. You've retained the benefit of exact results for decimal fractions, and extended it to non-decimal fractions.

If I'm starting to sound too much like an architecture astronaut, please let me know.
Mark Ransom Send private email
Friday, June 01, 2007
 
 
Mark, that's what I referred to as keeping the dividend and divisor. Another way of expressing that is basically tracking the lowest common denominator as part of your variable. Your realization that this is related to the fact that decimal arithmetic is base-10 is a good way of looking at it.

Although I didn't look it up, it was mentioned above that this is done in some cases; I assume where algebra is involved. I just don't deem this as important in my case.
Ben Bryant Send private email
Saturday, June 02, 2007
 
 
How about a string number type of arbitrary precision like in PHP, where the numbers are stored as character strings?
Donald Duck
Friday, June 15, 2007
 
 

This topic is archived. No further replies will be accepted.
