© Pramod Singh 2019
Pramod Singh, Machine Learning with PySpark, https://doi.org/10.1007/978-1-4842-4131-8_8

8. Clustering

Pramod Singh (Bangalore, Karnataka, India)

In the previous chapters, we looked at supervised Machine Learning, where the target variable or label is known and we try to predict the output from the input features. Unsupervised Learning is different in the sense that there is no labeled data and we don’t try to predict any output. Instead, we look for interesting patterns and form groups within the data, so that similar values end up in the same group.
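To make the idea of grouping similar values concrete, here is a minimal sketch in plain Python. It assumes k-means as the grouping method, which the text above has not introduced, and it is not the chapter's PySpark code. The function name `kmeans_1d`, the `ages` values, and the starting centroids are all hypothetical and chosen only for illustration.

```python
# Minimal 1-D k-means sketch (illustrative assumption, not the chapter's PySpark code).

def kmeans_1d(values, centroids, max_iter=100):
    """Group unlabeled numbers around their nearest centroid."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: attach each value to its closest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # Centroids stopped moving, so the grouping is stable.
        centroids = new_centroids
    return centroids, clusters


# Hypothetical data: no labels, just values that happen to form two groups.
ages = [21, 22, 23, 24, 61, 62, 63, 64]
centers, groups = kmeans_1d(ages, centroids=[21, 64])
print(centers)
print(groups)
```

No label is ever supplied here: the two groups come only from how close the values are to one another, which is the contrast with the supervised setting described above.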

When we join a new school or college, we come across many new faces and everyone looks so different. We hardly know anyone in the institute, and there are ...
