© Thomas Mailund 2022
T. Mailund, Beginning Data Science in R 4, https://doi.org/10.1007/978-1-4842-8155-0_3

3. Data Manipulation

Thomas Mailund, Aarhus, Denmark

Data science is as much about manipulating data as it is about fitting models to data. Data rarely arrives in a form that we can feed directly into the statistical models or machine learning algorithms we want to analyze it with. The first stages of a data analysis are almost always figuring out how to load the data into R and then figuring out how to transform it into a shape you can readily analyze.

Data Already in R

There are some data sets already built into R or available in R packages. Those are useful for learning how to use new methods—if you already know a data set and what it can ...
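As a minimal sketch of what working with built-in data can look like, the snippet below lists the data sets in the datasets package that ships with base R and then loads one of them. The choice of cars and the variable name available are assumptions made for illustration; they are not taken from the text above.

```r
# Names of the data sets that ship with the base 'datasets' package.
# data(package = ...) returns an object whose 'results' matrix has an "Item" column.
available <- data(package = "datasets")$results[, "Item"]
head(available)

# Load one of them into the workspace. 'cars' is an illustrative choice:
# a small data frame of car speeds and stopping distances.
data(cars)

# Inspect its structure and the first few rows.
str(cars)
head(cars)
```

Calling data() with only a package argument lists what is available without loading anything, so it is a cheap way to browse before picking a data set to explore.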
