Tidy Modeling with R

by Max Kuhn, Julia Silge
Released September 2022
Publisher(s): O'Reilly Media, Inc.
ISBN: 9781492096481

Book description

Models can be used in almost any domain for purposes including prediction, inference, or simply describing data. In all these cases, the predictive capacity of a model can be used to evaluate it, and we can build better, more useful models by adhering to good statistical practice. The tidymodels framework harmonizes the heterogeneous model interfaces in R and offers a consistent, flexible framework for modeling suitable for beginners as well as the very experienced.

This book provides a practical introduction to how to use R software to create models, focusing on a dialect of the R programming language called the tidyverse. Software that adopts tidyverse principles shares a high-level design philosophy and low-level grammar and data structures, so learning one piece of the ecosystem makes it easier to learn the next. The tidymodels framework for modeling is built to be easily understood and used by a broad range of people.

Table of contents

  1. Preface
    1. Acknowledgments
    2. Using Code Examples
  2. 1. Software for Modeling
    1. Fundamentals for Modeling Software
    2. Types of Models
      1. Descriptive models
      2. Inferential models
      3. Predictive models
    3. Connections Between Types of Models
    4. Some Terminology
    5. How Does Modeling Fit into the Data Analysis Process?
    6. Chapter Summary
  3. 2. A Tidyverse Primer
    1. Principles
      1. Design for humans
      2. Reuse existing data structures
      3. Design for the pipe and functional programming
    2. Examples of Tidyverse Syntax
    3. Chapter Summary
  4. 3. A Review of R Modeling Fundamentals
    1. An Example
    2. What Does the R Formula Do?
    3. Why Tidiness Is Important for Modeling
    4. Combining Base R Models and the Tidyverse
    5. The tidymodels Metapackage
    6. Chapter Summary
  5. 4. The Ames Housing Data
    1. Exploring Important Features
    2. Chapter Summary
  6. 5. Spending Our Data
    1. Common Methods for Splitting Data
    2. What About a Validation Set?
    3. Multi-Level Data
    4. Other Considerations
    5. Chapter Summary
  7. 6. Fitting Models with parsnip
    1. Create a Model
    2. Use the Model Results
    3. Make Predictions
    4. parsnip-Adjacent Packages
    5. Creating Model Specifications
    6. Chapter Summary
  8. 7. A Model Workflow
    1. Where Does the Model Begin and End?
    2. Workflow Basics
    3. Adding Raw Variables to the workflow()
    4. How Does a workflow() Use the Formula?
      1. Tree-based models
      2. Special formulas and in-line functions
    5. Creating Multiple Workflows at Once
    6. Evaluating the Test Set
    7. Chapter Summary
  9. 8. Feature Engineering with recipes
    1. A Simple recipe() for the Ames Housing Data
    2. Using Recipes
    3. How Data Are Used by the recipe()
    4. Examples of recipe() Steps
      1. Encoding qualitative data in a numeric format
      2. Interaction terms
      3. Spline functions
      4. Feature extraction
      5. Row sampling steps
      6. General transformations
      7. Natural language processing
    5. Skipping Steps for New Data
    6. Tidy a recipe()
    7. Column Roles
    8. Chapter Summary
  10. 9. Judging Model Effectiveness
    1. Performance Metrics and Inference
    2. Regression Metrics
    3. Binary Classification Metrics
    4. Multi-Class Classification Metrics
    5. Chapter Summary

