Before passing our data to regression algorithms, we need to take a first look at what we've imported into the R environment and check it for issues. Raw data is often messy and poorly formatted. In other cases, it may lack the details our study requires.
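As a minimal sketch of such a first look, the base R calls below print the structure, the first rows, a per-column summary, and the missing-value counts of a data frame. The name `raw_data` is hypothetical, and the built-in `airquality` dataset is used only as a stand-in for whatever has actually been imported.

```r
# Stand-in for the imported data (an assumption for illustration):
# airquality ships with base R and contains some missing values.
raw_data <- airquality

# Column names, types, and a preview of the values in each column
str(raw_data)

# First six rows, to eyeball the formatting
head(raw_data)

# Per-column summaries; numeric columns also report their NA count
summary(raw_data)

# Number of missing values in each column
na_counts <- colSums(is.na(raw_data))
print(na_counts)
```

Columns with a nonzero count in `na_counts`, or with an unexpected type in the `str()` output, are usually the first candidates for cleaning.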
To get started, it's good practice to preserve the original data, so every change will be performed on a copy of the dataset. Putting the data in order is the first step, and it makes data cleaning easier. But first, let's ask a question: when can we say that our data is tidy? According to Hadley Wickham, a dataset ...