Chapter 2
Computer representation of numbers
2.1 Introduction
Before delving into code development, we first need to think about the basic ingredient
that lies at the core of all scientific computations: numbers. Computer languages generally
provide specific ways of representing fundamental or primitive data types that correspond
to numbers in some direct manner. For C++ these primitive data types include
boolean (or bool) for variables that take on the two values true and false or, equivalently, the values 1 (= true) and 0 (= false),
integer, which translates into specific types such as short, unsigned short, int, long int and unsigned long int that provide different amounts of storage for integer values,
character, indicated by the keyword char in source code, that is used for variables that have character values such as “A” and
floating point, which encompasses basically all non-integer numbers, with the three storage types float, double and long double.
Ultimately, all computer operations, no matter how sophisticated, reduce to working
with representations of numbers that must be stored appropriately for use by a computer’s
central processing units or CPUs. This process is accomplished through the manipulation
of transistors (in CPUs and random access memory or RAM) that have two possible states:
off and on or, equivalently, 0 (= off) and 1 (= on). By combining transistors it becomes
possible to create a dyadic representation for a number that allows it to be stored and
used in arithmetic operations. Of course, the number of transistors is finite, which affects
both how and how much information can actually be stored. To manage overall memory
(i.e., storage) levels effectively it is necessary to restrict the amount of memory that can be
allocated to different kinds of numbers, with the consequence that there are limits on how
big or small a number can be and still be stored in some meaningful fashion. To appreciate
the implications of this we first need to think a bit about computer arithmetic.
In the familiar decimal (or base 10) system, numerical values are represented in units or
powers of 10. For simplicity, let us work with only nonnegative integers for the moment.
Then, the basis representation theorem from number theory (e.g., Andrews 1971) has the
consequence that any such integer, k, can be written as
$$k = \sum_{j=0}^{m} a_j (10)^j \tag{2.1}$$
for some unique (if we require $a_m \neq 0$) integer $m$ and some unique set of integer coefficients $\{a_0, \ldots, a_m\}$. As an example,
$$193 = 1 * (10)^2 + 9 * (10)^1 + 3 * (10)^0$$
with * indicating multiplication. So, in this case $m = 2$, $a_0 = 3$, $a_1 = 9$ and $a_2 = 1$. Note that for the coefficients in (2.1) to be unique their range must be restricted to, e.g., $\{0, \ldots, 9\}$.