Regression Models for Survival Data
As we noted at the end of Chapter 2, our first question when considering regression modeling of survival data is: What are we going to model? Specifically, what will play the role of the systematic component in a regression model? The inherent aging process that is present when subjects are followed over time is what distinguishes survival time from other dependent variables. The presence of censoring in the data makes the study of survival time more interesting from a statistical research perspective, but from a practical point of view, it is an annoying technical detail that must be dealt with when we fit models. Of the functions describing the distribution of survival time discussed in Chapter 2, the hazard function best and most directly captures the essence of the aging process. Thus, a natural place to begin is to explore how to incorporate a regression model-like structure into the hazard function.
The simplest possible hazard function is one that is constant at all values of time. We saw in Figure 2.14 that the kernel-smoothed hazard function from the WHAS100 data was nearly constant for the first 5 1/2 years of follow-up, with a value approximately equal to 0.1. Thus, for the WHAS100 data, we might begin with the following model for the hazard function
$$h(t) = 0.1$$
or more generally, with $\theta_0$ denoting an unspecified constant,
$$h(t) = \theta_0$$
Because the hazard function is a ...