Chapter 2

Computer representation of numbers

2.1 Introduction

Before delving into code development, we first need to think about the basic ingredient that lies at the core of all scientific computations: numbers. Computer languages generally provide specific ways of representing fundamental or primitive data types that correspond to numbers in some direct manner. For C++ these primitive data types include

• boolean (or bool) for variables that take on the two values true and false or, equivalently, the values 1 (= true) and 0 (= false),

• integer, which translates into specific types such as short, unsigned short, int, long int and unsigned long int that provide different amounts of storage for integer values,

• character, indicated by the keyword char in source code, that is used for variables that have character values such as “A” and

• floating point, which encompasses basically all non-integer numbers, with the three storage types float, double and long double.

Ultimately, all computer operations, no matter how sophisticated, reduce to working with representations of numbers that must be stored appropriately for use by a computer’s central processing unit or CPU. This process is accomplished through the manipulation of transistors (in CPUs and random access memory or RAM) that have two possible states: off and on or, equivalently, 0 (= off) and 1 (= on). By combining transistors it becomes possible to create a dyadic (i.e., base 2) representation for a number that allows it to be stored and used in arithmetic operations. Of course, the number of transistors is finite, which affects both how and how much information can actually be stored. To manage overall memory (i.e., storage) levels effectively it is necessary to restrict the amount of memory that can be allocated to different kinds of numbers, with the consequence that there are limits on how big or small a number can be and still be stored in some meaningful fashion. To appreciate the implications of this we first need to think a bit about computer arithmetic.

In the familiar decimal (or base 10) system, numerical values are represented in units or powers of 10. For simplicity, let us work with only nonnegative integers for the moment. Then, the basis representation theorem from number theory (e.g., Andrews 1971) has the consequence that any such integer, k, can be written as

\[
k = \sum_{j=0}^{m} a_j (10)^j \tag{2.1}
\]

for some unique (if we require $a_m \neq 0$) integer $m$ and some unique set of integer coefficients $\{a_0, \ldots, a_m\}$. As an example,

\[
193 = 1 * (10)^2 + 9 * (10)^1 + 3 * (10)^0
\]

with * indicating multiplication. So, in this case $m = 2$, $a_0 = 3$, $a_1 = 9$ and $a_2 = 1$. Note that for the coefficients in (2.1) to be unique their range must be restricted to, e.g., $\{0, \ldots, 9\}$.
