3 Data Inspection and Cleaning

3.1 Introduction

Data inspection and cleaning is the next important part of the data analysis pipeline. Before starting any analysis, visualization, or machine learning, you should clean the data that is to be analyzed. Although machine learning, exploratory data analysis, and data visualization take up more time in analytical education, in an actual data science project much more time is spent on data inspection and cleaning.

3.2 Data Inspection

Data inspection helps us confirm that the data import executed correctly, that the dataset has the expected length (rows) and breadth (columns), and that the variables (columns) are in the expected format.

3.2.1 Data Inspection in SAS

Let’s try this in SAS.

  • Referring to a column is easier in SAS than in R
  • Referring to a row is more complex in SAS than in R
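
The two points above can be illustrated with a minimal sketch. It assumes the built-in SASHELP.CLASS sample dataset, which is not part of the original text and stands in for whatever dataset was imported. PROC CONTENTS checks the structure of the dataset, a VAR statement refers to columns by name, and the FIRSTOBS= and OBS= dataset options refer to a range of rows.

```sas
/* Assumption: SASHELP.CLASS is used only as a stand-in for your own dataset. */

/* Structure check: variable names, types, lengths, and the observation count */
proc contents data=sashelp.class;
run;

/* Referring to columns: list them by name in a VAR statement */
proc print data=sashelp.class (obs=5);
  var name age height;
run;

/* Referring to rows: limit the observation range with FIRSTOBS= and OBS= */
proc print data=sashelp.class (firstobs=3 obs=6);
run;
```

Here OBS= gives the number of the last observation to read, not a count, so the final step prints observations 3 through 6.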
