2
Data Processing with dplyr
In the previous chapter, we covered the basics of the R language itself. Grasping these fundamentals will help us tackle the most common task in data science projects: data processing. Data processing refers to a series of data wrangling steps that transform raw data into the format required for downstream analysis and modeling. We can think of it as a function that accepts the raw data and outputs the desired data. However, we need to explicitly specify the recipe this function follows to process the data.
By the end of this chapter, you will be able to perform common data wrangling steps such as filtering, selection, grouping, and aggregation using dplyr, one ...