© Himanshu Singh 2021
H. Singh, Practical Machine Learning with AWS, https://doi.org/10.1007/978-1-4842-6222-1_5

5. Data Processing in AWS

Himanshu Singh
Allahabad, Uttar Pradesh, India

Data processing is one of the first steps of the machine learning pipeline. Because different data sources use different formats, it is impractical to handle every format inside the model. Hence, we first give the data a consistent structure and then deal with its problematic parts. This includes handling null values and outliers, dummifying (one-hot encoding) categorical columns, and standardizing numerical columns. We can use SageMaker to carry out all of these steps. This chapter assumes that you have knowledge about ...
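To make the four steps named above concrete, here is a minimal sketch in plain Python. It is not the chapter's SageMaker workflow and uses no AWS APIs; it relies only on the standard library. The toy records, the column names (`age`, `income`, `city`), and the helper functions are all hypothetical, and the choices of median imputation and IQR-based clipping are assumptions made for illustration only.

```python
from statistics import mean, median, pstdev, quantiles

# Hypothetical toy records; the column names and values are made up.
raw = [
    {"age": 25.0, "income": 40000.0, "city": "Delhi"},
    {"age": None, "income": 52000.0, "city": "Mumbai"},   # missing age
    {"age": 31.0, "income": 48000.0, "city": "Delhi"},
    {"age": 40.0, "income": 1000000.0, "city": "Pune"},   # extreme income
    {"age": 28.0, "income": 45000.0, "city": "Mumbai"},
]

def fill_nulls(rows, col):
    """Replace None in a numeric column with the median of the observed values."""
    fill = median(r[col] for r in rows if r[col] is not None)
    return [{**r, col: fill if r[col] is None else r[col]} for r in rows]

def clip_outliers(rows, col, k=1.5):
    """Clip values that fall outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles([r[col] for r in rows], n=4, method="inclusive")
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [{**r, col: min(max(r[col], low), high)} for r in rows]

def one_hot(rows, col):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[col] for r in rows})
    out = []
    for r in rows:
        new = {k: v for k, v in r.items() if k != col}
        for c in categories:
            new[f"{col}_{c}"] = 1 if r[col] == c else 0
        out.append(new)
    return out

def standardize(rows, col):
    """Rescale a numeric column to zero mean and unit (population) std."""
    values = [r[col] for r in rows]
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # avoid dividing by zero for constant columns
    return [{**r, col: (r[col] - mu) / sigma} for r in rows]

# Apply the steps in order; each helper returns new rows and leaves `raw` intact.
processed = fill_nulls(raw, "age")
processed = clip_outliers(processed, "income")
processed = one_hot(processed, "city")
for numeric_col in ("age", "income"):
    processed = standardize(processed, numeric_col)

for row in processed:
    print(row)
```

The order matters in this sketch: nulls are filled before clipping so the quartiles are computed on complete data, and standardization comes last so the mean and standard deviation are not distorted by the extreme income value.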
