4. Data Encoding and Preprocessing
4.1 Introduction
Data preprocessing is an important step in machine learning. It’s the first step, and it leaves a lot of room for subjective decisions, some of which can reduce the information content of your data in ways you’ll learn about in this chapter.
Generally, data preprocessing is the process of mapping raw data into a format that is ready to pass into a machine-learning algorithm. For now, you can assume there’s no uncertainty in the data you’re encoding; you’ll revisit the problem of uncertainty in later chapters. Here, you’ll learn how to encode measurements in a way an algorithm can understand. You can also take steps to make sure your models perform as well as ...
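To make the idea of mapping raw data into an algorithm-ready format concrete, here is a minimal sketch. It is not taken from the chapter: the record fields (temperature_c, color), the encode function, and the choice of standardization plus one-hot encoding are assumptions made purely for illustration, and other encodings may suit your data better.

```python
from statistics import mean, pstdev

# Hypothetical raw records; the field names are illustrative assumptions.
raw_records = [
    {"temperature_c": 20.0, "color": "red"},
    {"temperature_c": 25.0, "color": "blue"},
    {"temperature_c": 30.0, "color": "red"},
]


def encode(records):
    """Map raw records to rows of floats.

    Each row holds the standardized measurement followed by a
    one-hot encoding of the categorical field.
    """
    temps = [r["temperature_c"] for r in records]
    mu, sigma = mean(temps), pstdev(temps)

    # A fixed, sorted category order keeps the column layout reproducible.
    categories = sorted({r["color"] for r in records})

    rows = []
    for r in records:
        # Guard against a constant column, where sigma would be zero.
        scaled = (r["temperature_c"] - mu) / sigma if sigma > 0 else 0.0
        one_hot = [1.0 if r["color"] == c else 0.0 for c in categories]
        rows.append([scaled] + one_hot)
    return categories, rows


categories, features = encode(raw_records)
```

Each row of features is now a plain list of numbers that a learning algorithm can consume. Note that the subjective choices mentioned earlier already appear in this small sketch: the category ordering and the decision to standardize are both judgment calls.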