© Julian Soh and Priyanshi Singh 2020
J. Soh, P. Singh, Data Science Solutions on Azure, https://doi.org/10.1007/978-1-4842-6405-8_3

3. Data Preparation and Data Engineering Basics

Julian Soh (1) and Priyanshi Singh (2)

(1) Olympia, WA, USA
(2) New Jersey, NJ, USA

There is a common saying that the bulk of the work in data science is data preparation. Data preparation is indeed a crucial part of the process: if it is not done correctly, the results will be inaccurate and may lead to negative consequences. That is why so much time is spent on it. If we want to make the data science process more efficient, reducing the time spent on data preparation is one area to look at.

However, we need to qualify what ...

