Chapter 4. Evaluation of Training Programs 65
Systematic and Systemic Evaluation of Training Programs
Measure vs. evaluate

mea·sure \ˈme-zhər, ˈmā-\ vt
to choose or control with cautious restraint
to ascertain the measurements of
to estimate or appraise by a criterion
(Merriam-Webster’s Collegiate Dictionary, 2003, p. 769)

eval·u·ate \i-ˈval-yə-ˌwāt, -yü-ˌāt\ vt
to determine or fix the value of
to determine the significance, worth, or condition of usually by careful appraisal and study
(Merriam-Webster’s Collegiate Dictionary, 2003, p. 432)
Evaluation

Evaluation is considered a form of applied research, but there is a slight difference between applied research and evaluation:

Applied research is aimed at producing generalizable knowledge relevant to providing a solution to a general problem.

Evaluation focuses on collecting specific information relevant to a particular or specific evaluation object or “evaluand” (Guba and Lincoln, 1981, cited in Worthen, Sanders, and Fitzpatrick, 1997). Specifically,

Evaluation is disciplined inquiry to make a judgment about the worth of the evaluation objects (such as instructional programs).

Evaluation produces information that is used to make decisions.
One example of evaluation: a study of the effectiveness of a new safety training program in reducing work-related injuries in an organization. If the results reveal that the work-related injury rate has been reduced by 12 percent after the training program is implemented, the organization might recognize the worth of the training program and decide to continue to provide it to the employees annually. The organization would still need to consider that other factors might have influenced the results.
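The 12 percent figure in the example above is a simple baseline comparison. As a sketch, it could be computed from before-and-after injury rates; the function name and the rate values here are hypothetical, invented for illustration:

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage reduction of a measure relative to its baseline value."""
    return (before - after) / before * 100

# Hypothetical rates: work-related injuries per 100 employees per year,
# before and after the training program was implemented.
rate_before_training = 25.0
rate_after_training = 22.0

print(f"{percent_reduction(rate_before_training, rate_after_training):.0f}% reduction")
# → prints "12% reduction"
```

Note that the calculation alone does not establish causality; as the paragraph above cautions, other factors might have influenced the results.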
Evaluation is conducted through various measurement processes, using quantitative methods, qualitative methods, or a combination of both.
Systematic and systemic evaluation
This chapter introduces Donald Kirkpatrick’s four-level model of evaluation, which will help practitioners understand the systematic and systemic approaches to evaluating the effectiveness of training programs. This chapter also provides an overview of the main concepts in measurement and evaluation, and describes methods of constructing evaluation instruments.
Donald Kirkpatrick’s Four-Level Model of Evaluation
Four levels of evaluating training programs
About a half century ago, Donald Kirkpatrick, now professor emeritus at the University of Wisconsin, was working on his Ph.D. at the University of Wisconsin. He decided that his doctoral dissertation would focus on evaluating a supervisory training program. He came up with the idea of “measuring participants’ reaction to the program, the amount of learning that took place, the extent of their change in behavior after they returned to their jobs, and any final results that were achieved by participants after they returned to work” (Kirkpatrick, 1996b, p. 55). The practice of using four levels in evaluating training programs came out of his work. The levels are (1) reaction, (2) learning, (3) behavior, and (4) results. Although he originally called them “steps” in the article published in the November 1959 issue of Training and Development, “levels” is more widely used to refer to his evaluation model (Kirkpatrick, 1996b).
Kirkpatrick (1978, 1996a) explains that there are three reasons for evaluating
training programs and that his evaluation model is useful for any of the three
reasons:
1. To know how to improve future programs
2. To determine whether to continue or discontinue the program
3. To justify the existence of the training program or department
Each level of Kirkpatrick’s evaluation model requires one to elicit different
information on the effectiveness of a training program:
1. Reaction: Did the participants like the training program?
2. Learning outcomes: Did they learn what they were supposed to learn? How much did they learn?
3. Behavioral change: Did they change their on-the-job behavior?
4. Results on the organization: Did their knowledge and changed behavior positively affect the organization in terms of resulting in increased production, improved quality, decreased costs, etc.? (Kirkpatrick and Kirkpatrick, 2005a, 2005b)
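As a minimal sketch, the four levels and their guiding questions listed above could be encoded as a simple lookup table; the structure and variable name here are assumptions for illustration, and the wording follows the list above:

```python
# Kirkpatrick's four levels, each paired with the guiding question
# that level asks about a training program.
KIRKPATRICK_LEVELS = {
    1: ("Reaction", "Did the participants like the training program?"),
    2: ("Learning", "Did they learn what they were supposed to learn, and how much?"),
    3: ("Behavior", "Did they change their on-the-job behavior?"),
    4: ("Results", "Did their knowledge and changed behavior positively affect the organization?"),
}

# Walk the levels in order, as a systematic evaluation would.
for level, (name, question) in sorted(KIRKPATRICK_LEVELS.items()):
    print(f"Level {level} ({name}): {question}")
```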
Kirkpatrick’s four-level model of evaluation helps practitioners measure the
effectiveness of a training program from level 1 to level 4 in a systematic way,
but also encourages them to project its systemic impact on long-term outcomes
as well as short-term outcomes.
Kirkpatrick (1996a) explains that “trainers must begin with desired results and then determine what behavior is needed to accomplish them. . . . The four levels of evaluation are considered in reverse. First, we evaluate reaction. Then we evaluate learning, behavior, and results” (p. 26). In other words, the
