10
Hypothesis Testing and Confidence Intervals
10.1 Introduction
Hypothesis testing and condence intervals play a prominent role in clas-
sical statistical texts. We have already indicated (in Chapters1and 5)
some concerns about the way these concepts are often interpreted, but
there is no doubt that they are extremely important for building models for
risk assessment. Although the term hypothesis testing is not prominent in
books and articles on Bayesian reasoning, this is largely because the eval-
uation of alternative hypotheses, in the form of causes, explanations, and
predictions, is such an integral part of the Bayesian approach, that it tends
to be implicit and therefore goes unmentioned. Likewise condence inter-
vals may go unmentioned because with the Bayesian approach we often
have access to the full probability distribution.
It is important to present the Bayesian approach to hypothesis testing (which we do in Section 10.2) and confidence intervals (Section 10.3) explicitly so that you understand the simplicity and clarity of the Bayesian approach compared with the classical approach. By not explicitly tackling the topic, most Bayesian texts may give the reader the impression that Bayesians don't do hypothesis testing or confidence intervals. But we do, and in spades.
10.2 Hypothesis Testing
Hypothesis testing is the process of establishing which out of a number
of competing hypotheses is most plausible (i.e., has the greatest prob-
ability of being true) given the data or judgments available. Examples
of the kind of hypotheses we might be interested in when assessing risk
include
Is a soccer team coach competent or merely lucky?
Is drug “Precision” better than drug “Oomph” in causing
weight loss?
Visit www.bayesianrisk.com for your free Bayesian network software and models in this chapter.
We already introduced the idea of hypothesis testing informally in Chapter 4, where we used Bayes to explain why the classical statistical technique of p-values could lead to conclusions that were demonstrably wrong. A primary objective of this chapter is to show how to do it properly using a Bayesian network (BN) approach.
Is investment in U.S. government bonds riskier than investment
in gold?
Is one scientic weather model better able to predict oods than
another?
Are caloric intake and lack of exercise a joint cause of obesity
or is it caloric intake alone?
In this section we will introduce the simplest forms of hypothesis testing and introduce Bayes factors. Next we will show how we can test for hypothetical "magnitude" differences between groups using the Bayesian approach and contrast this with the potentially error-prone classical approach to significance. We discuss a relevant and very important result called Meehl's conjecture.
This then brings us neatly on to how we choose between competing models (hypotheses) on the basis of how well they predict events or data. We next show how we can accommodate expert opinions into the hypothesis testing process in a way that complements data. For completeness we show how the "Bayesian model choice" approach can be used as a coherent alternative to the "distribution fitting and testing" that comes as a standard part of the classical statistical armory. Finally, we talk about complex causal hypotheses and how we might test different causal explanations of data in hypothesis testing terms.
10.2.1 Bayes Factors
Typically we will assess whether the data/evidence we have on hand supports one hypothesis over another and by how much. In Bayesian terms, for two competing hypotheses, this is simply expressed using the Bayes factor; this is the ratio of the conditional probability of one hypothesis, H1, being true given the data, D, and the conditional probability of the competing hypothesis, H2, being true given the same data:

Bayes factor = P(H1 | D) / P(H2 | D)
The ratio provides the following very simple and straightforward approach to hypothesis testing:

If the ratio equals one, then P(H2 | D) = P(H1 | D), and so we are indifferent between the hypotheses (neither is more likely than the other given the data available).

If the ratio is greater than one, then P(H1 | D) > P(H2 | D), and so the data supports H1.

If the ratio is less than one, then P(H1 | D) < P(H2 | D), and so the data supports H2.
The default Bayesian position is that there may be many different hypotheses, {H1, H2, …, Hn}. For example, competent and lucky might not be the only relevant hypotheses for the soccer coach. He might also be bad or unlucky. However, in many cases there will be just two competing hypotheses which, as we saw in Chapter 4, are normally referred to as the null and alternative hypotheses, written H0 and H1, respectively.
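When there are more than two hypotheses, the natural Bayesian approach is to compute a full posterior distribution over all of them rather than a single ratio. The following sketch, using made-up numbers for the four soccer-coach hypotheses, normalises prior times likelihood over the whole hypothesis set:

```python
def posteriors(priors, likelihoods):
    """Return P(Hi|D) for each hypothesis Hi by normalising
    P(D|Hi) * P(Hi) over the full set of hypotheses."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)  # this is P(D), the normalising constant
    return [j / total for j in joint]

# Illustrative numbers only: four hypotheses for the soccer coach,
# in the order competent, lucky, bad, unlucky.
priors = [0.25, 0.25, 0.25, 0.25]        # equal prior belief
likelihoods = [0.30, 0.05, 0.02, 0.03]   # assumed P(D | Hi)
post = posteriors(priors, likelihoods)
print(post)  # "competent" dominates with posterior 0.75
```

Any Bayes factor can then be read off as the ratio of two of these posteriors, so the two-hypothesis case is just a special case of this calculation.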