P. Singh, Machine Learning with PySpark, https://doi.org/10.1007/978-1-4842-7777-5_7

7. Clustering in PySpark

Pramod Singh¹

(1) Bangalore, Karnataka, India

So far, we have covered supervised machine learning, where the target variable or label is known and we try to predict the output from the input features. Unsupervised learning means that there is no labeled data and no output to predict. Instead, we look for interesting patterns and form groups within the data. It is more of an art than a pursuit of prediction accuracy. The values within a group are very similar to each other, whereas any two groups are very distinct ...
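The idea that members of a group are close to each other while separate groups are far apart can be checked numerically. The following is a minimal sketch in plain Python (not PySpark), assuming two hand-made groups of 2-D points and Euclidean distance as the similarity measure. The point values and the helper names `mean_within_distance` and `mean_between_distance` are hypothetical and are not taken from the book.

```python
from itertools import combinations, product
from math import dist
from statistics import mean

# Hypothetical 2-D points, already split into two groups for illustration.
group_a = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1)]
group_b = [(8.0, 8.0), (8.3, 7.9), (7.8, 8.2)]


def mean_within_distance(points):
    """Average Euclidean distance between every pair of points in one group."""
    return mean(dist(p, q) for p, q in combinations(points, 2))


def mean_between_distance(first, second):
    """Average Euclidean distance between points drawn from two different groups."""
    return mean(dist(p, q) for p, q in product(first, second))


within_a = mean_within_distance(group_a)
within_b = mean_within_distance(group_b)
between = mean_between_distance(group_a, group_b)

# A good grouping keeps the within-group distances small relative to the
# between-group distance.
print(f"within A: {within_a:.2f}, within B: {within_b:.2f}, between: {between:.2f}")
```

In this sketch the groups are given up front. A clustering algorithm has to discover such groups from unlabeled data, typically by searching for an assignment that keeps within-group distances small.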


Machine Learning with PySpark: With Natural Language Processing and Recommender Systems by Pramod Singh
