© Pramod Singh 2019
Pramod Singh, Machine Learning with PySpark, https://doi.org/10.1007/978-1-4842-4131-8_8

8. Clustering

Pramod Singh (1)
(1) Bangalore, Karnataka, India

In the previous chapters, we looked at supervised Machine Learning, where the target variable or label is known and we try to predict the output from the input features. Unsupervised Learning is different in the sense that there is no labeled data and we do not try to predict an output; instead, we look for interesting patterns in the data and form groups within it, so that similar values end up together.

When we join a new school or college, we come across many new faces and everyone looks so different. We hardly know anyone in the institute, and there are ...
