Summary
In this chapter, we delved even deeper into machine learning and found out how we can take advantage of machine learning to cluster records belonging to a dataset of unsupervised observations. Consequently, you learnt the practical know-how needed to quickly and powerfully apply supervised and unsupervised techniques on available data to new problems through some widely used examples based on the understandings from the previous chapters. The examples we are talking about will be demonstrated from the Spark perspective. For any of the K-means, bisecting K-means, and Gaussian mixture algorithms, it is not guaranteed that the algorithm will produce the same clusters if run multiple times. For example, we observed that running the K-means ...
Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.
Read now
Unlock full access