
Appendix B: The Algebra of Node Probability Tables
A number of algebraic “tricks” are needed when performing marginalization and using Bayes’ Theorem in
practice. These revolve around the manipulation of node probability tables (NPTs). You need to read this
appendix only if you are curious about the mathematical underpinnings; if you want to understand the
underlying algorithms, however, it is essential reading.
Three operations are of interest: marginalization, multiplication, and division. Together with Bayes’
Theorem, they allow us to compute or derive any measure of interest in a Bayesian network (BN). In addition,
instead of calculating probabilities one at a time, we can operate on tables whose rows and columns are
indexed by variable state values (note that these tables are not the same as matrices).
B.1 Multiplication
Consider the simple model P(A, B, C) = P(B | A) P(C | A) P(A) with the following probability tables (notice
that we are using x, y, z values for the probabilities and lowercase for each of the variable state values):
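$$
P(A) = \begin{array}{cc} a_1 & a_2 \\ \hline z_1 & z_2 \end{array}, \qquad
P(B \mid A) = \begin{array}{c|cc} & a_1 & a_2 \\ \hline b_1 & x_1 & x_2 \\ b_2 & x_3 & x_4 \end{array}, \qquad
P(C \mid A) = \begin{array}{c|cc} & a_1 & a_2 \\ \hline c_1 & y_1 & y_2 \\ c_2 & y_3 & y_4 \end{array}
$$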
If we multiply the two NPTs P(B | A) and P(A), we get a new NPT:
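$$
P(A)\,P(B \mid A)
= \begin{array}{cc} a_1 & a_2 \\ \hline z_1 & z_2 \end{array}
\times
\begin{array}{c|cc} & a_1 & a_2 \\ \hline b_1 & x_1 & x_2 \\ b_2 & x_3 & x_4 \end{array}
= \begin{array}{c|cc} & a_1 & a_2 \\ \hline b_1 & z_1 x_1 & z_2 x_2 \\ b_2 & z_1 x_3 & z_2 x_4 \end{array}
$$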
Next, if we want P(B | A)P(A)P(C | A), we simply multiply again by P(C | A):
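$$
P(B \mid A)\,P(A)\,P(C \mid A)
= \begin{array}{c|cc} & a_1 & a_2 \\ \hline b_1 & z_1 x_1 & z_2 x_2 \\ b_2 & z_1 x_3 & z_2 x_4 \end{array}
\times
\begin{array}{c|cc} & a_1 & a_2 \\ \hline c_1 & y_1 & y_2 \\ c_2 & y_3 & y_4 \end{array}
$$

$$
= \begin{array}{c|cc|cc}
 & a_1 & a_1 & a_2 & a_2 \\
 & c_1 & c_2 & c_1 & c_2 \\ \hline
b_1 & z_1 x_1 y_1 & z_1 x_1 y_3 & z_2 x_2 y_2 & z_2 x_2 y_4 \\
b_2 & z_1 x_3 y_1 & z_1 x_3 y_3 & z_2 x_4 y_2 & z_2 x_4 y_4
\end{array}
$$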
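The two multiplications above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algorithm used by any particular BN tool: the table representation (a tuple of variable names plus a dictionary of cells), the helper name `multiply`, and the numeric values standing in for the symbolic z, x, and y entries are all hypothetical.

```python
from itertools import product


def multiply(t1, t2, states):
    """Multiply two tables cell by cell, matching cells on shared variables."""
    vars1, cells1 = t1
    vars2, cells2 = t2
    # The result ranges over the variables of t1 plus any new ones from t2.
    out_vars = vars1 + tuple(v for v in vars2 if v not in vars1)
    out_cells = {}
    for combo in product(*(states[v] for v in out_vars)):
        assignment = dict(zip(out_vars, combo))
        key1 = tuple(assignment[v] for v in vars1)
        key2 = tuple(assignment[v] for v in vars2)
        out_cells[combo] = cells1[key1] * cells2[key2]
    return out_vars, out_cells


states = {"A": ("a1", "a2"), "B": ("b1", "b2"), "C": ("c1", "c2")}

# Assumed numbers in place of z1, z2 / x1..x4 / y1..y4 (not from the text).
p_a = (("A",), {("a1",): 0.3, ("a2",): 0.7})
p_b_given_a = (("B", "A"), {("b1", "a1"): 0.9, ("b1", "a2"): 0.2,
                            ("b2", "a1"): 0.1, ("b2", "a2"): 0.8})
p_c_given_a = (("C", "A"), {("c1", "a1"): 0.6, ("c1", "a2"): 0.5,
                            ("c2", "a1"): 0.4, ("c2", "a2"): 0.5})

p_ab = multiply(p_a, p_b_given_a, states)      # P(A) P(B | A)
p_abc = multiply(p_ab, p_c_given_a, states)    # P(B | A) P(A) P(C | A)

for key, value in p_abc[1].items():
    print(key, value)
```

Each cell of the result is the product of the matching cells of the two inputs, so the cell for (a1, b2, c1) corresponds to z1 x3 y1 in the symbolic table.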

A table’s weight is the product of the numbers of states of all the variables contained in that table. So the
preceding table for P(B | A)P(A)P(C | A) has weight 8, since its three two-state variables give 2 × 2 × 2 = 8 cells.
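As a small check of the weight calculation, the following Python sketch multiplies the state counts of the variables in a table. The function name `weight` and the `states` dictionary are hypothetical names chosen for this illustration.

```python
from math import prod


def weight(variables, states):
    """Number of cells in a table over the given variables."""
    return prod(len(states[v]) for v in variables)


# Each variable in the example model has two states.
states = {"A": ("a1", "a2"), "B": ("b1", "b2"), "C": ("c1", "c2")}

w_abc = weight(("A", "B", "C"), states)  # 2 * 2 * 2
w_ab = weight(("A", "B"), states)        # 2 * 2
print(w_abc, w_ab)
```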
Multiplying a table by unit values leaves the table unchanged. For example:
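$$
P(A) \times P'(A)
= \begin{array}{cc} a_1 & a_2 \\ \hline z_1 & z_2 \end{array}
\times
\begin{array}{cc} a_1 & a_2 \\ \hline 1 & 1 \end{array}
= \begin{array}{cc} a_1 & a_2 \\ \hline z_1 & z_2 \end{array}
$$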
Note that we could normalize the unit table P(u) to (0.5, 0.5), but when carrying out calculations it is
actually easier to leave the unit values as they are and postpone normalization to the very end of a
sequence of calculation steps.
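The effect of postponing normalization can be illustrated with a short Python sketch. The helper names `multiply_same_domain` and `normalize` and the values used for P(A) are assumptions made for this example only.

```python
def multiply_same_domain(t1, t2):
    """Cell-by-cell product of two tables over the same single variable."""
    return {state: t1[state] * t2[state] for state in t1}


def normalize(table):
    """Rescale a table so that its cells sum to 1."""
    total = sum(table.values())
    return {state: value / total for state, value in table.items()}


p_a = {"a1": 0.3, "a2": 0.7}      # assumed values for z1, z2
unit = {"a1": 1.0, "a2": 1.0}     # unit table

# Multiplying by unit values leaves the table unchanged.
unchanged = multiply_same_domain(p_a, unit)

# Normalizing the unit table first rescales every cell by the same factor,
# so one normalization at the end recovers the same distribution.
scaled = multiply_same_domain(p_a, normalize(unit))
recovered = normalize(scaled)
print(unchanged, scaled, recovered)
```

Because normalization only divides every cell by a common constant, it can safely be applied once, after all the multiplications are done.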
B.2 Marginalization
Marginalization involves simply summing over terms in the table. So if we had a table P(A,B):
$$
P(A, B) = \begin{array}{c|cc} & a_1 & a_2 \\ \hline b_1 & x_1 & x_2 \\ b_2 & x_3 & x_4 \end{array}
$$
Then we marginalize by summing over index values:
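$$
P(A) = \sum_{B} P(A, B) = \begin{array}{cc} a_1 & a_2 \\ \hline x_1 + x_3 & x_2 + x_4 \end{array}
$$

$$
P(B) = \sum_{A} P(A, B) = \begin{array}{cc} b_1 & b_2 \\ \hline x_1 + x_2 & x_3 + x_4 \end{array}
$$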
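Summing over index values can be sketched in Python as follows. The function name `marginalize`, the dictionary layout keyed by (A state, B state), and the numbers standing in for x1 to x4 are assumptions for illustration.

```python
def marginalize(table, variables, keep):
    """Sum out every variable of the table that is not listed in keep."""
    positions = [variables.index(v) for v in keep]
    result = {}
    for key, value in table.items():
        reduced = tuple(key[i] for i in positions)
        result[reduced] = result.get(reduced, 0.0) + value
    return result


# Assumed values: x1 = 0.1, x2 = 0.2, x3 = 0.3, x4 = 0.4.
variables = ("A", "B")
p_ab = {("a1", "b1"): 0.1, ("a2", "b1"): 0.2,
        ("a1", "b2"): 0.3, ("a2", "b2"): 0.4}

p_a = marginalize(p_ab, variables, ("A",))  # a1: x1 + x3, a2: x2 + x4
p_b = marginalize(p_ab, variables, ("B",))  # b1: x1 + x2, b2: x3 + x4
print(p_a)
print(p_b)
```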
Marginalization is distributive (it can be broken into parts). We can group the variables that belong
together and carry out table operations one part at a time. Thus, in this example P(A) can be taken outside
the marginalization over C because it does not involve C:
$$
\sum_{C} P(B \mid C)\,P(C)\,P(A) = P(A) \sum_{C} P(B \mid C)\,P(C)
$$
Doing this makes the tables smaller: instead of calculating one big table covering all the state
combinations of all the variables, we do it for two variables alone and then multiply the others in. This
saves time both in manual calculation and for the computer.
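The saving can be seen in a short Python sketch that computes the same result both ways. The numeric values and variable names are assumptions for illustration; the text itself gives only the symbolic identity.

```python
from itertools import product

A = ("a1", "a2")
B = ("b1", "b2")
C = ("c1", "c2")

# Assumed example tables.
p_a = {"a1": 0.3, "a2": 0.7}
p_c = {"c1": 0.4, "c2": 0.6}
p_b_given_c = {("b1", "c1"): 0.9, ("b2", "c1"): 0.1,
               ("b1", "c2"): 0.2, ("b2", "c2"): 0.8}

# Naive route: build the full table over A, B, and C, then sum out C.
full = {(a, b, c): p_b_given_c[(b, c)] * p_c[c] * p_a[a]
        for a, b, c in product(A, B, C)}
naive = {(a, b): sum(full[(a, b, c)] for c in C)
         for a, b in product(A, B)}

# Factored route: sum out C in the small table over B and C first,
# then multiply P(A) in afterwards.
p_b = {b: sum(p_b_given_c[(b, c)] * p_c[c] for c in C) for b in B}
factored = {(a, b): p_a[a] * p_b[b] for a, b in product(A, B)}

print(len(full), len(p_b_given_c))
print(naive)
print(factored)
```

The naive route builds an intermediate table of weight 8, whereas the factored route never needs a table larger than weight 4; the gap grows quickly as more variables or states are involved.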