Practical Machine Learning with R

by Brindha Priyadarshini Jeyaraman, Ludvig Renbo Olsen, Monicah Wambugu
August 2019
Beginner to intermediate
416 pages
7h 5m
English
Packt Publishing

Chapter 2

Data Cleaning and Pre-processing

Learning Objectives

By the end of this chapter, you will be able to:

  • Perform sort, rank, filter, subset, normalize, scale, and join operations on an R data frame.
  • Identify and handle outliers, missing values, and duplicates gracefully using the MICE and rpart packages.
  • Perform undersampling and oversampling on a dataset.
  • Apply the concepts of ROSE and SMOTE to handle unbalanced data.

This chapter covers the important concepts of handling data and making the data ready for analysis.

Introduction

Data cleaning and preparation takes about 70% of the effort in the entire process of a machine learning project. This step is essential because the quality of the data determines the accuracy of the ...

ISBN: 9781838550134