Descriptive Methods for Survival Data
In any applied setting, a statistical analysis should begin with a thoughtful and thorough univariate description of the data. The fundamental building block of this analysis is an estimate of the cumulative distribution function. Typically, little attention is paid to this fact in an introductory course on statistical methods, where directly computed estimators of measures of central tendency and variability are more easily explained and understood. However, routine application of the standard formulas for the sample mean, variance, median, etc., will not yield estimates of the desired parameters when the data include censored or truncated observations. In this situation, we must first obtain an estimator of the cumulative distribution function and then use it to compute statistics that do, in fact, provide estimates of the parameters of interest.
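A small numerical sketch may help show why the standard formulas fail under right censoring. The survival times and the common end of follow-up below are hypothetical values chosen for illustration; they are not taken from the WHAS100 data. Any subject still alive at the end of follow-up contributes only the censoring time, so both the mean of all recorded times and the mean of the uncensored times alone fall below the true mean.

```python
# Hypothetical true survival times (in days); illustrative values only,
# not drawn from any real study.
true_times = [2, 5, 8, 12, 20]

# Assumed common end of follow-up: subjects surviving past this point
# are right censored at this time.
follow_up_end = 10

# What the investigator actually records for each subject.
observed = [min(t, follow_up_end) for t in true_times]
event = [t <= follow_up_end for t in true_times]  # True if the event was seen


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# The quantity we would like to estimate (unknowable in practice).
true_mean = mean(true_times)

# Naive approach 1: treat censoring times as if they were event times.
naive_mean = mean(observed)

# Naive approach 2: discard the censored subjects entirely.
events_only_mean = mean([t for t, e in zip(observed, event) if e])

print(f"true mean        = {true_mean}")
print(f"naive mean       = {naive_mean}")
print(f"events-only mean = {events_only_mean}")
```

In this toy example both naive summaries understate the true mean, because the longest survival times are exactly the ones that are cut off or discarded.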
In the WHAS100 study described in Chapter 1, we saw that the recorded data are continuous and are only subject to right censoring. Remember that time itself is always continuous, but we must deal with our inability to measure it precisely. The cumulative distribution function of the random variable survival time, denoted T, is the probability that a subject selected at random will have a survival time less than or equal to some stated value, t. This is denoted as F(t) = Pr(T ≤ t). The survival function is the probability of observing a survival time greater than some stated value ...
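As a minimal sketch of these two definitions, suppose for the moment that every survival time is fully observed, with no censoring. Under that assumption, F(t) can be estimated by the proportion of subjects with a time less than or equal to t, and the probability of a time greater than t by the proportion exceeding t. The data values and the function names `ecdf` and `empirical_survival` are hypothetical, introduced only for illustration; this simple proportion is not appropriate once censored observations are present.

```python
def ecdf(times, t):
    """Proportion of fully observed times that are <= t (estimates Pr(T <= t))."""
    return sum(1 for x in times if x <= t) / len(times)


def empirical_survival(times, t):
    """Proportion of fully observed times that are > t (estimates Pr(T > t))."""
    return sum(1 for x in times if x > t) / len(times)


# Hypothetical, fully observed survival times (no censoring assumed).
times = [2, 5, 8, 12, 20]

F_8 = ecdf(times, 8)                # 3 of 5 times are <= 8
S_8 = empirical_survival(times, 8)  # 2 of 5 times are > 8

print(f"F(8) estimate = {F_8}")
print(f"Pr(T > 8) estimate = {S_8}")
```

Because every subject falls into exactly one of the two groups, the two proportions sum to one at every value of t.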