Chapter 7. A Model Workflow
In Chapter 6, we discussed the parsnip package, which can be used to define and fit the model. This chapter introduces a new concept called a model workflow. The purpose of this concept (and the corresponding tidymodels workflow() object) is to encapsulate the major pieces of the modeling process (discussed in Chapter 1). The workflow is important in two ways. First, using a workflow concept encourages good methodology since it is a single point of entry to the estimation components of a data analysis. Second, it enables the user to better organize projects. These two points are discussed in the following sections.
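To make the idea of a single point of entry concrete, here is a minimal sketch of a workflow object. It assumes the parsnip and workflows packages are installed; the built-in mtcars data and the choice of predictors are illustrative assumptions, not an example taken from this chapter.

```r
library(parsnip)
library(workflows)

# Model specification, as in the parsnip chapter: ordinary least squares.
lm_spec <- set_engine(linear_reg(), "lm")

# Bundle the model with the terms it should use.
# The formula and data set below are placeholders for illustration.
lm_wflow <- workflow()
lm_wflow <- add_model(lm_wflow, lm_spec)
lm_wflow <- add_formula(lm_wflow, mpg ~ wt + hp)

# One call estimates everything the workflow contains.
lm_wflow_fit <- fit(lm_wflow, data = mtcars)

# Predictions go through the same object.
preds <- predict(lm_wflow_fit, new_data = head(mtcars, 3))
preds
```

Because the model and its terms travel together in one object, fitting and predicting both go through that object rather than through separately managed pieces.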
Where Does the Model Begin and End?
So far, when we have used the term “the model,” we have meant a structural equation that relates some predictors to one or more outcomes. Let’s consider again linear regression as an example. The outcome data are denoted as $y_i$, where there are $i = 1, \ldots, n$ samples in the training set. Suppose that there are $p$ predictors $x_{i1}, \ldots, x_{ip}$ that are used in the model. Linear regression produces the following model equation:
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \ldots + \hat{\beta}_p x_{ip}$$
While this is a linear model, it is linear only in the parameters. The predictors could be nonlinear terms (such as $\log(x_i)$).
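As a small illustration of a model that is linear in the parameters but not in the raw predictor, the sketch below fits a regression on a log-transformed predictor using base R. The simulated data, the variable names, and the true coefficient values are all assumptions made up for this example.

```r
# Simulated data (hypothetical): the outcome depends on log(x), not on x itself.
set.seed(123)
n <- 200
x <- runif(n, min = 1, max = 100)
y <- 2 + 3 * log(x) + rnorm(n, sd = 0.5)
sim <- data.frame(x = x, y = y)

# The predictor enters as log(x), yet the model is still linear in the
# parameters (intercept and slope), so ordinary least squares applies.
log_fit <- lm(y ~ log(x), data = sim)
coef(log_fit)

# The same fit with the transformation computed ahead of time.
sim$log_x <- log(sim$x)
pre_fit <- lm(y ~ log_x, data = sim)

# Both routes estimate the same coefficients.
same_coefs <- all.equal(unname(coef(log_fit)), unname(coef(pre_fit)))
same_coefs
```

Whether the logarithm is applied inside the formula or beforehand, the estimated equation is the same. The transformation is a step that happens before the coefficients are estimated.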
The conventional way of thinking about the modeling process is that it only includes the model fit.
For some straightforward data sets, fitting the model itself ...