Chapter 3: Data Exploration and Preprocessing

In the previous chapter, you examined a hypothetical scenario and learned about the differences between traditional and machine-learning-based approaches, as well as the high-level steps involved in building a machine-learning solution. In this chapter, you will learn to use NumPy, Pandas, and Scikit-learn to explore data, perform common feature engineering tasks, and select the features that you will use to train your models.

Data Preprocessing Techniques

In Chapter 1, you learned about the different types of machine-learning systems and the general process for building a machine-learning solution. It should come as no surprise that the performance of a machine-learning system depends heavily on the quality of its training data. In this section, you will learn some of the common ways in which data is prepared for machine-learning models. The examples in this section will use datasets commonly found on the Internet, and they are included with the downloads that accompany this lesson. ...
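As a first taste of what checking data quality can look like, the following minimal sketch uses Pandas to inspect a small table for missing values. The column names and values are hypothetical and were invented for this illustration; they are not one of the datasets that accompany the lesson.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame({
    "age": [22, 38, np.nan, 35, 54],
    "fare": [7.25, 71.28, 7.92, 53.10, np.nan],
    "embarked": ["S", "C", "S", None, "Q"],
})

# Number of rows and columns in the table.
n_rows, n_cols = df.shape
print(f"{n_rows} rows, {n_cols} columns")

# Data type that Pandas inferred for each column.
print(df.dtypes)

# Count of missing values in each column.
missing_per_column = df.isnull().sum()
print(missing_per_column)

# Summary statistics for the numeric columns.
print(df.describe())
```

Columns that report missing values are typical candidates for the kind of preparation this section describes, such as filling in or dropping the affected rows.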
