The classic model that underlies reliability stipulates that a rating of unit *i*, *i* = 1, 2, …, can be expressed as X_{ij} = ξ_{i} + ε_{ij}, where ξ_{i} is that unit's "true value" (i.e., the value free of rater error) and ε_{ij} is the error made by the *j*th independent rater sampled from the population of raters [1]. The interrater reliability coefficient is defined as:

ρ = σ_{ξ}^{2} / σ_{X}^{2},

where σ_{ξ}^{2} is the variance of the ξ_{i} in the population of interest and σ_{X}^{2} is that of a single observation per subject. Thus, in a sense, reliability relates to the signal-to-noise ratio: σ_{ξ}^{2} represents the "signal," while σ_{X}^{2} = σ_{ξ}^{2} + σ_{ε}^{2} combines "signal" and "noise."
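The signal-to-noise reading can be checked with a small simulation. The sketch below (not from the source; the variance values are illustrative assumptions) draws true values ξ_{i} and rater errors ε_{ij} with known variances and confirms numerically that the variance of a single rating combines both components, so that the ratio recovers ρ:

```python
import numpy as np

# Simulate the classic model X_ij = xi_i + eps_ij with known variances:
# sigma_xi^2 = 4 and sigma_eps^2 = 1, so rho = 4 / (4 + 1) = 0.8.
rng = np.random.default_rng(42)
n_subjects = 100_000
xi = rng.normal(0.0, 2.0, size=n_subjects)    # true values (the "signal")
eps = rng.normal(0.0, 1.0, size=n_subjects)   # rater error (the "noise")
X = xi + eps                                  # one rating per subject

# Variance of a single observation combines signal and noise:
print(np.var(X))               # approximately sigma_xi^2 + sigma_eps^2 = 5
print(np.var(xi) / np.var(X))  # approximately rho = 0.8
```

With 100,000 simulated subjects the sample variances sit close to their population values, illustrating why σ_{X}^{2} alone cannot separate rater error from true between-subject variation.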

According to this definition, the reliability coefficient is zero if and only if subjects in the population are homogeneous in whatever X measures. This situation should almost never pertain when considering measures for use in randomized clinical trials. Consequently, testing the null hypothesis that ρ = 0 is virtually never of interest, although admittedly it is often observed in the research literature. Instead, the tasks of greatest interest to clinical research are (1) obtaining a confidence interval for ρ, (2) judging the adequacy of ρ, and (3) considering how to improve ρ.
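Task (1), a confidence interval for ρ, is commonly addressed with the one-way random-effects (ANOVA) estimator of the intraclass correlation and its exact F-based interval. The sketch below is a minimal illustration, not the source's own procedure; the function name `icc_oneway` and the simulated data are our assumptions, and it presumes every subject is rated by the same number of raters:

```python
import numpy as np
from scipy import stats

def icc_oneway(X, alpha=0.05):
    """Estimate rho = sigma_xi^2 / sigma_X^2 under the one-way
    random-effects model, with an exact F-based confidence interval.
    X is an (n subjects) x (k raters) array of ratings."""
    n, k = X.shape
    row_means = X.mean(axis=1)
    # One-way ANOVA mean squares: between subjects and within subjects
    msb = k * np.sum((row_means - X.mean()) ** 2) / (n - 1)
    msw = np.sum((X - row_means[:, None]) ** 2) / (n * (k - 1))
    icc = (msb - msw) / (msb + (k - 1) * msw)
    # Invert the F ratio MSB/MSW to bound rho
    F = msb / msw
    fl = F / stats.f.ppf(1 - alpha / 2, n - 1, n * (k - 1))
    fu = F * stats.f.ppf(1 - alpha / 2, n * (k - 1), n - 1)
    lower = (fl - 1) / (fl + k - 1)
    upper = (fu - 1) / (fu + k - 1)
    return icc, (lower, upper)

# Simulated ratings: 40 subjects, 3 raters, true rho = 4 / (4 + 1) = 0.8
rng = np.random.default_rng(1)
xi = rng.normal(0.0, 2.0, size=(40, 1))        # true values
X = xi + rng.normal(0.0, 1.0, size=(40, 3))    # add independent rater error
icc, (lo, hi) = icc_oneway(X)
```

Because the estimate is a monotone function of the F ratio, the interval is obtained simply by transforming the F quantiles, which is why no normal approximation is needed.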
