© Ramcharan Kakarla, Sundar Krishnan and Sridhar Alla 2021
R. Kakarla et al., Applied Data Science Using PySpark, https://doi.org/10.1007/978-1-4842-6500-0_8

8. Machine Learning Flow and Automated Pipelines

Ramcharan Kakarla (1), Sundar Krishnan (1) and Sridhar Alla (2)
(1) Philadelphia, PA, USA
(2) New Jersey, NJ, USA

Putting a model into production is one of the most challenging tasks in the data science world. It is one of those last-mile problems that persist in many organizations. Although there are many tools for managing workflows, an organization's needs change as it matures, and managing existing models can become a herculean task. When we take a step back and analyze why it is so challenging, we can see that it is because of the structure that exists ...

