© Pramod Singh 2019
P. Singh, Learn PySpark, https://doi.org/10.1007/978-1-4842-4961-1_5

5. MLlib: Machine Learning Library

Pramod Singh, Bangalore, Karnataka, India

Depending on your requirements, there are multiple ways to build machine learning models with preexisting tools, such as Python’s scikit-learn, R, and TensorFlow. What makes Spark’s Machine Learning library (MLlib) especially useful, however, is its ability to train models at scale through distributed training. This lets users build models quickly on very large datasets while also handling preprocessing and workflow preparation within the Spark framework itself.

This chapter focuses on how to leverage MLlib for building and applying various machine learning models. The first ...
