April 2021
Beginner to intermediate
248 pages
6h 26m
English
In this chapter, we’ll apply what we’ve learned about data analysis and visualization in R to explore and test relationships in the familiar mpg dataset. You’ll learn a couple of new R techniques here, including how to conduct a t-test and linear regression. We’ll begin by calling up the necessary packages, reading in mpg.csv from the mpg subfolder of the book repository’s datasets folder, and selecting the columns of interest. We’ve not used tidymodels so far in this book, so you may need to install it.
library(tidyverse)
library(psych)
library(tidymodels)

# Read in the data, select only the columns we need
mpg <- read_csv('datasets/mpg/mpg.csv') %>%
  select(mpg, weight, horsepower, origin, cylinders)
#> -- Column specification -----------------------------------------------------
#> cols(
#>   mpg = col_double(),
#>   cylinders = col_double(),
#>   displacement = col_double(),
#>   horsepower = col_double(),
#>   weight = col_double(),
#>   acceleration = col_double(),
#>   model.year = col_double(),
#>   origin = col_character(),
#>   car.name = col_character()
#> )

head(mpg)
#> # A tibble: 6 x 5
#>     mpg weight horsepower origin cylinders
#>   <dbl>  <dbl>      <dbl> <chr>      <dbl>
#> 1    18   3504        130 USA            8
#> 2    15   3693        165 USA            8
#> 3    18   3436        150 USA            8
#> 4    16   3433        150 USA            8
#> 5    17   3449        140 USA            8
#> 6    15   4341        198 USA            8
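If one of these packages is missing, library() stops with an error. The following is a minimal sketch of one way to check beforehand, not code from the book: the helper name missing_packages is hypothetical, and the install step assumes a reachable CRAN mirror.

```r
# Packages loaded at the start of this chapter
needed <- c("tidyverse", "psych", "tidymodels")

# Hypothetical helper: return the names in `pkgs` that are not installed.
# requireNamespace() returns TRUE or FALSE instead of raising an error.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(needed)

if (length(to_install) == 0) {
  message("All required packages are installed.")
} else if (interactive()) {
  # Install only in an interactive session, so a script never
  # downloads packages unexpectedly (needs an internet connection)
  install.packages(to_install)
} else {
  message("Missing packages: ", paste(to_install, collapse = ", "))
}
```

Once everything is installed, the library() calls above should load without errors.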
Descriptive statistics are a good place to start when exploring data. We’ll do so with the describe() function from psych ...