Big Data Analytics with Java

by RAJAT MEHTA
July 2017
Beginner to intermediate
418 pages
9h 46m
English
Packt Publishing
Content preview from Big Data Analytics with Java

Summary

In this chapter, we learnt about clustering and saw how it sorts items into groups whose members are similar to one another in some way. Clustering is an example of unsupervised learning, and several popular clustering algorithms ship by default with Apache Spark. We learnt about two clustering approaches. The first is k-means, which groups together items that are close to each other according to a distance measure such as Euclidean distance. The second is bisecting k-means, which is essentially an improvement on regular k-means clustering and is created by combining hierarchical and k-means clustering. ...
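
To make the k-means idea concrete, here is a minimal sketch in plain Java of grouping points by Euclidean distance. It is not the Apache Spark implementation and not code from the book: the class name `KMeansSketch`, its method names, and the choice to pass the starting centroids in explicitly are all assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of Apache Spark or the book's code.
final class KMeansSketch {

    // Euclidean distance between two points of equal dimension.
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Index of the centroid closest to the given point.
    static int nearest(double[] point, double[][] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (euclidean(point, centroids[c]) < euclidean(point, centroids[best])) {
                best = c;
            }
        }
        return best;
    }

    // Alternates assignment and centroid update until no label changes
    // or maxIterations is reached. Returns one cluster index per point.
    static int[] cluster(double[][] points, double[][] initialCentroids, int maxIterations) {
        int k = initialCentroids.length;
        int dim = points[0].length;

        // Copy the starting centroids so the caller's array is not modified.
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = initialCentroids[c].clone();
        }

        int[] labels = new int[points.length];
        Arrays.fill(labels, -1); // -1 means "not assigned yet"

        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: attach every point to its nearest centroid.
            boolean changed = false;
            for (int p = 0; p < points.length; p++) {
                int best = nearest(points[p], centroids);
                if (best != labels[p]) {
                    labels[p] = best;
                    changed = true;
                }
            }
            if (!changed) {
                break; // assignments are stable, so the centroids would not move
            }

            // Update step: move each centroid to the mean of its points.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int p = 0; p < points.length; p++) {
                counts[labels[p]]++;
                for (int d = 0; d < dim; d++) {
                    sums[labels[p]][d] += points[p][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue; // leave the centroid of an empty cluster where it is
                }
                for (int d = 0; d < dim; d++) {
                    centroids[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return labels;
    }
}
```

Calling `KMeansSketch.cluster(points, initialCentroids, 10)` on two well-separated groups of 2-D points, with one starting centroid placed in each group, assigns every point to the cluster of its own group. Passing fixed starting centroids keeps the sketch deterministic; real implementations usually choose them randomly or with a seeding scheme.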



Publisher Resources

ISBN: 9781787288980