ESL Documentation

Q13 How are rounding errors introduced within floating point arithmetic?

ESL is written in the 'C' programming language. 'C' stores integers and floats with a size and precision that vary from machine to machine. For floating point numbers (i.e. numbers that may have a fractional part), this means the stored representation may not be exact. Combined with the fact that integer arithmetic truncates any fractional part, this can lead to some surprising results. Consider the following examples, where F1 is a float:

 F1 = ( 2 / 4 )

F1 will be equal to 0 (because 2 and 4 are integers, the fractional part of the result is truncated).

 F1 = ( 2.0 / 4.0 )

F1 will be equal to 0.5 (because 2.0 and 4.0 are floating point numbers, the fractional part is retained).

 F1 = ( 2.0 / 4 )

F1 will be equal to 0.5 (since only one of the numbers has to be a floating point number for the fractional part to be retained).

When these floating point numbers are stored in memory, they must be converted to binary. Floating point numbers are represented by a fixed number of bits, so a value may have to be approximated. For example, if you had 8 bits to represent a floating point number, you might use the following format: 1 bit for the sign of the number, 3 bits for the exponent, and 4 bits for the significand. This limits both the size and the precision with which a floating point number can be represented (the actual limits depend on the way the binary number is handled within the system). The precision is the smallest amount by which you can increment or decrement the floating point number, that is, the gap between one representable value and the next. With a fixed number of significand bits, this gap grows as the magnitude of the number grows.

The precision can become important when you start to mix floating point and integer numbers in the same calculation.  Say you performed the following calculation:

 Float = ( 10.0 / 2 + 1 )

If you print the variable Float you would get the expected answer of 6.0. However, suppose you then did this:

 Integer = Float

If you now print the variable Integer, you might expect to get the result 6, but you might get 5. What can happen is that the number 6 is really being represented internally as something like 5.999999999767, because the precision of the floating point number is finite. When you print a floating point number, it is shown with fewer places after the decimal point than the internal representation holds, so the number is rounded up to 6.0. But when you put the number 5.999999999767 into an integer, the fractional part is truncated before any rounding can occur, and the answer you get will be 5.

Since ESL is written in 'C' there are a number of implications of this basic issue:

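First, you should not compare floating point numbers for equality (e.g. "if (Float1 = Float2) then..."). Two values that should be equal may differ slightly in their stored representations, so it is safer to test whether the difference between them is smaller than some small tolerance.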

Second, when ESL does its automatic type conversions from float to integer, it will truncate (not round) the result.

For example, the statement: "copy 9.99 to IntVal"

will yield "9" in the integer IntVal, not 10 as might be expected.

However, the "precision" environment declaration should help deal with most cases where this might be a problem, since it will round a value properly for display purposes.