December 2018
Beginner to intermediate
682 pages
18h 1m
English
The weightlifting dataset, like many datasets, is easy to read in its raw form, but technically it is messy: all but one of the column names encode information about sex and age. Once the variables are identified, we can begin to tidy the dataset. Whenever column names contain variable values, you will need to use the melt (or stack) method. The Weight Category variable is already in the correct position, so we keep it as an identifying variable by passing it to the id_vars parameter. Note that we don't need to name the columns being melted explicitly with value_vars; by default, every column not listed in id_vars gets melted.
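As a minimal sketch of the melt step described above, the following uses a small hypothetical stand-in for the weightlifting data. The column labels other than Weight Category, the numeric values, and the value_name "Qual Total" are assumptions made for illustration; only id_vars and the sex_age column name come from the text.

```python
import pandas as pd

# Hypothetical stand-in for the weightlifting data; labels and values are illustrative only.
wl = pd.DataFrame({
    "Weight Category": [56, 62, 69],
    "M35 35-39": [137, 152, 167],
    "M40 40-44": [130, 145, 160],
})

# Keep "Weight Category" as the identifying variable. value_vars is left at
# its default, so every column not listed in id_vars is melted.
wl_melt = wl.melt(
    id_vars="Weight Category",
    var_name="sex_age",       # name for the column holding the old column labels
    value_name="Qual Total",  # assumed name for the column holding the melted values
)

print(wl_melt)
```

Each original row yields one output row per melted column, so the three-row, two-value-column frame here becomes six rows with the columns Weight Category, sex_age, and Qual Total.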
The sex_age column needs to be parsed and split into two variables. ...
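One possible way to split sex_age is sketched below. It assumes, hypothetically, that each label looks like "M35 35-39": a code whose first character is the sex, then a space, then the age range. The new column names Sex and Age Group, along with all the data values, are likewise assumptions and not taken from the text.

```python
import pandas as pd

# Hypothetical melted frame; labels and values are illustrative only.
wl_melt = pd.DataFrame({
    "Weight Category": [56, 62, 56, 62],
    "sex_age": ["M35 35-39", "M35 35-39", "M80 80+", "M80 80+"],
    "Qual Total": [137, 152, 112, 120],
})

# Split each label on whitespace into two columns (0 and 1),
# for example "M35" and "35-39".
parts = wl_melt["sex_age"].str.split(expand=True)

# Build the tidy frame: drop the combined column, then add one column per variable.
wl_tidy = wl_melt.drop(columns="sex_age")
wl_tidy["Sex"] = parts[0].str[0]   # first character of the code, e.g. "M"
wl_tidy["Age Group"] = parts[1]    # the age range, e.g. "35-39"

print(wl_tidy)
```

Splitting with expand=True returns a DataFrame, which keeps the two pieces aligned with the original rows. If the real labels follow a different pattern, a regular expression passed to str.extract may be a better fit.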