Floating-Point Computations
Computers are finite machines designed to perform basic computations on values stored in registers by a Central Processing Unit (CPU). The size of these registers has evolved as computer architectures have grown from the popular 8-bit Intel processors of the 1970s to today's widespread 64-bit architectures (such as Intel's Itanium and Sun Microsystems' SPARC processors). The CPU often supports basic operations, such as ADD, MULT, DIVIDE, and SUB, over integer values stored within these registers. Floating-point units (FPUs) can efficiently process floating-point computations according to the IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754).
Computations over integer-based values (such as Booleans, 8-bit bytes, 16-bit shorts, and 32-bit integers) have traditionally been the most efficient computations a processor performs. Efficient programs often take advantage of this performance differential between integer and floating-point arithmetic. At the same time, there are subtle issues that developers must be aware of when programming with floating-point arithmetic (Goldberg, 1991). Here we focus on the issues that affect the algorithms and supporting code in this book.
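As a brief illustration (a sketch written for this discussion, not code from the book), the following C program demonstrates two of the pitfalls Goldberg describes: values that are equal in decimal arithmetic may not compare equal as doubles, and floating-point addition is not associative.

#include <stdio.h>

int main(void) {
    /* 0.1 and 0.2 have no exact binary representation, so their computed
       sum is not the double closest to 0.3 and the comparison fails. */
    double x = 0.1 + 0.2;
    printf("0.1 + 0.2 == 0.3 ? %s\n", (x == 0.3) ? "yes" : "no");  /* no */

    /* Addition is not associative when operands differ greatly in magnitude:
       grouping the two large values first preserves the small one, while
       adding the small value to a large one loses it to rounding. */
    double a = 1e16, b = -1e16, c = 1.0;
    printf("(a + b) + c = %.1f\n", (a + b) + c);  /* 1.0 */
    printf("a + (b + c) = %.1f\n", a + (b + c));  /* 0.0 */
    return 0;
}

On an IEEE 754 system both comparisons behave as noted in the comments; the usual remedy is to compare floating-point values to within a small tolerance rather than with ==.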
Rounding Error
Any computation using floating-point values may introduce rounding errors because of the nature of the floating-point representation. In general, a floating-point number is a finite ...
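The effect of this representation can be seen directly. The following C sketch (again an illustration for this discussion, not code from the book) adds 0.1 ten times; because 0.1 has no exact binary representation, the result is close to, but not exactly, 1.0.

#include <stdio.h>

int main(void) {
    /* 0.1 is stored as the nearest representable binary fraction, so each
       addition carries a tiny representation error that accumulates. */
    double sum = 0.0;
    for (int i = 0; i < 10; i++) {
        sum += 0.1;
    }
    printf("sum = %.17f\n", sum);                              /* 0.99999999999999989 */
    printf("sum == 1.0 ? %s\n", (sum == 1.0) ? "yes" : "no");  /* no */
    return 0;
}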