Errata for Tidy Modeling with R

The errata list records errors, and their corrections, that were found after the product was released. If an error was corrected in a later version or reprint, the date of the correction is displayed in the column titled "Date Corrected".

The following errata were submitted by our customers and approved as valid errors by the author or editor.

Version | Location | Description | Submitted By | Date Submitted | Date Corrected
Page 64-65
Between last paragraph on p. 64 and first paragraph on p. 65

Currently the text from p. 64 to p. 65 reads:

- Specify the type of model based on its mathematical structure: Such as linear regression, random forest, KNN, etc. Most often this reflects the software package that should be used, like Stan or glmnet. These are models in their own right, and parsnip provides consistent interfaces by using these as engines for modeling.

- When required, declare the mode of the model: The mode reflects the type of prediction outcome. For numeric outcomes, the mode is regression; for qualitative outcomes, it is classification.1 If a model algorithm can only address one type of prediction outcome, such as linear regression, the mode is already set.

Instead, that text should read as follows:

- Specify the type of model based on its mathematical structure (e.g., linear regression, random forest, KNN, etc).

- Specify the engine for fitting the model: Most often this reflects the software package that should be used, like Stan or glmnet. These are models in their own right, and parsnip provides consistent interfaces by using these as engines for modeling.

- When required, declare the mode of the model: The mode reflects the type of prediction outcome. For numeric outcomes, the mode is regression; for qualitative outcomes, it is classification.13 If a model algorithm can only address one type of prediction outcome, such as linear regression, the mode is already set.
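
The three corrected steps can be sketched with parsnip as follows. This is a minimal illustration rather than code from the book: the choice of models, the `ranger` engine, and the `penalty` value are assumptions made for the example. Defining a specification does not fit anything, so the engine packages themselves are only needed once `fit()` is called.

```r
library(parsnip)

# Step 1: model type (linear regression). Step 2: engine (glmnet).
# Step 3 is not needed here: the mode defaults to "regression".
# The penalty value is arbitrary and chosen only for illustration.
lm_spec <- linear_reg(penalty = 0.1) |>
  set_engine("glmnet")

# A random forest can predict numeric or qualitative outcomes,
# so its mode has to be declared explicitly.
rf_spec <- rand_forest() |>
  set_engine("ranger") |>
  set_mode("regression")

lm_spec
rf_spec
```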

Julia Silge  Aug 26, 2022 
Page 95
3rd paragraph, right below Fig 8.1

The text currently says:
Here we see that two neighborhoods have less than five properties in the training data (Landmark and Green Hills); in this case, no houses at all in the Landmark neighborhood were included in the training set.

The text should instead read:
Here we see that two neighborhoods have less than five properties in the training data (Landmark and Green Hills); in this case, no houses at all in the Landmark neighborhood were included in the testing set.

Julia Silge  Aug 10, 2022 
Page 130
Fig 10-2

The data point labeled "21" should be drawn as a circle (not a square) to show that it belongs to the first held-out fold. It is drawn correctly in Fig 10-3.

Julia Silge  Sep 07, 2022 
Page 133
1st paragraph in section "Leave-One-Out-Validation"

In the first sentence, the phrase:

> where V is the number of data points in the training set

should be omitted.

Julia Silge  Sep 19, 2022 
Printed
Page 209
1st paragraph after Fig 13-9

The sentence that currently reads:

> Any parameter set whose confidence interval includes zero would lack evidence that its performance is not statistically different from the best results.

should have the "not" removed. It should read:

> Any parameter set whose confidence interval includes zero would lack evidence that its performance is statistically different from the best results.
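
To make the corrected sentence concrete, here is a simplified base R sketch of the check it describes. It uses an ordinary paired t-interval on made-up per-resample differences, which is an assumption for illustration only and not necessarily the interval method used in the book. An interval that contains zero gives no evidence that the candidate differs from the best result.

```r
# Hypothetical per-resample differences in RMSE (candidate minus best).
# These values are invented purely for illustration.
rmse_diff <- c(0.12, -0.05, 0.08, -0.10, 0.03, 0.01, -0.02, 0.06, -0.04, 0.02)

# Two-sided 95% confidence interval for the mean difference.
ci <- t.test(rmse_diff, conf.level = 0.95)$conf.int

# TRUE when the interval contains zero: no evidence of a difference
# between this candidate and the best result.
includes_zero <- ci[1] <= 0 && ci[2] >= 0
includes_zero
```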

Julia Silge  Sep 22, 2022
Printed
Page 241-242
Code chunk at the end of page 241 and 242

This code chunk should not have been included:

grid_ctrl <-
   control_grid(
      save_pred = TRUE,
      parallel_over = "everything",
      save_workflow = TRUE
   )

full_results_time <-
   system.time(
      grid_results <-
         all_workflows %>%
         workflow_map(seed = 1503, resamples = concrete_folds, grid = 25,
                      control = grid_ctrl, verbose = TRUE)
   )
#> i 1 of 12 tuning: MARS
#> ✔ 1 of 12 tuning: MARS (12.5s)
#> i 2 of 12 tuning: CART
#> ✔ 2 of 12 tuning: CART (2m 37.6s)
#> i No tuning parameters. `fit_resamples()` will be attempted
#> i 3 of 12 resampling: CART_bagged
#> ✔ 3 of 12 resampling: CART_bagged (1m 33.9s)
#> i 4 of 12 tuning: RF
#> i Creating pre-processing data to finalize unknown parameter: mtry
#> ✔ 4 of 12 tuning: RF (7m 31.8s)
#> i 5 of 12 tuning: boosting
#> ✔ 5 of 12 tuning: boosting (11m 50.6s)
#> i 6 of 12 tuning: Cubist
#> ✔ 6 of 12 tuning: Cubist (10m 30.8s)
#> i 7 of 12 tuning: SVM_radial
#> ✔ 7 of 12 tuning: SVM_radial (3m 36s)
#> i 8 of 12 tuning: SVM_poly
#> ✔ 8 of 12 tuning: SVM_poly (37m 21.3s)
#> i 9 of 12 tuning: KNN
#> ✔ 9 of 12 tuning: KNN (4m 2.1s)
#> i 10 of 12 tuning: neural_network
#> ✔ 10 of 12 tuning: neural_network (8m 8.9s)
#> i 11 of 12 tuning: full_quad_linear_reg
#> ✔ 11 of 12 tuning: full_quad_linear_reg (5m 24.7s)
#> i 12 of 12 tuning: full_quad_KNN
#> ✔ 12 of 12 tuning: full_quad_KNN (17m 34.6s)

num_grid_models <- nrow(collect_metrics(grid_results, summarize = FALSE))

It should not have been rendered when printing.

Julia Silge  Sep 23, 2022
Printed, PDF, ePub, Mobi, Safari Books Online, Other Digital Version
Page 244
Last paragraph of section

The sentence that reads:

The example model screening with our concrete mixture data fits a total of 25,200 models.

should instead read:

The example model screening with our concrete mixture data fits a total of 12,600 models.

Julia Silge  Sep 23, 2022
Printed, PDF, ePub, Mobi, Safari Books Online, Other Digital Version
Page 245
Last paragraph of the page

The sentence which reads:

Overall, the racing approach estimated a total of 4,652 models, 18.46% of the full set of 25,200 models in the full grid. As a result, the racing approach was 4.7-fold faster.

should instead read:

Overall, the racing approach estimated a total of 2,335 models, 18.53% of the full set of 12,600 models in the full grid. As a result, the racing approach was 4.7-fold faster.
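
The percentages in both versions of the sentence follow directly from the quoted model counts. This base R check is an illustration rather than code from the book, and the variable and function names are hypothetical. It reproduces 18.46% for the original counts and 18.53% for the corrected ones.

```r
# Model counts quoted in the original and corrected sentences.
original  <- c(racing = 4652, full_grid = 25200)
corrected <- c(racing = 2335, full_grid = 12600)

# Percentage of the full grid that racing actually fit, to two decimals.
pct <- function(x) round(100 * x[["racing"]] / x[["full_grid"]], 2)

pct_original  <- pct(original)
pct_corrected <- pct(corrected)
print(c(original = pct_original, corrected = pct_corrected))
```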

Julia Silge  Sep 23, 2022