June 2017
Beginner to intermediate
576 pages
15h 22m
English
In this section, we illustrate how to run a cluster analysis directly in Spark using k-means. K-means is a form of unsupervised learning and a good place to start exploring big data, especially when your dataset is large and you have no clearly defined target variable.
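Before turning to Spark, it may help to see what k-means does on a handful of points. The following is a minimal plain-Python sketch of the usual assign-and-update loop (Lloyd's algorithm), not Spark code. The function name `kmeans`, its parameters, and the six sample points are all assumptions made up for illustration. Note that the input contains no target variable: the algorithm groups points only by their distance to the current cluster centers.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Illustrative k-means on a list of equal-length numeric tuples.

    Hypothetical helper for this sketch only; it is not part of Spark.
    """
    rng = random.Random(seed)
    # Start from k distinct points chosen at random as the initial centers.
    centers = rng.sample(points, k)
    assignments = [0] * len(points)

    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center
        # (squared Euclidean distance).
        assignments = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            for p in points
        ]

        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Keep the old center if no point was assigned to it.
                new_centers.append(centers[c])

        # Stop early once the centers no longer move.
        if new_centers == centers:
            break
        centers = new_centers

    return centers, assignments


# Made-up sample data: two visibly separate groups, with no labels attached.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, assignments = kmeans(points, k=2)
print(centers)
print(assignments)
```

The cluster numbers themselves are arbitrary; what matters is which points end up grouped together. In Spark, the same two steps are carried out by the library across the partitions of a distributed dataset, so you would normally call its built-in k-means rather than write the loop yourself.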