October 2022
Intermediate to advanced
530 pages
11h 57m
English
Data must be analyzed, transformed, and processed before it can be used to train machine learning (ML) models. In the past, data scientists and ML practitioners had to write custom code from scratch, using a variety of libraries, frameworks, and tools (such as pandas and PySpark), to perform the needed analysis and processing work. This custom code often needed tweaking, since different variations of the data processing steps had to be tested on the data before being used for model training. This work takes up a significant portion of an ML practitioner’s time, and because it is manual, it is usually error-prone as well.
One of the more ...