Data Science Bookcamp

by Leonard Apeltsin
November 2021
Beginner to intermediate
704 pages
20h 16m
English
Manning Publications
Content preview from Data Science Bookcamp

10 Clustering data into groups

This section covers

  • Clustering data by centrality
  • Clustering data by density
  • Trade-offs between clustering algorithms
  • Executing clustering using the scikit-learn library
  • Iterating over clusters using Pandas
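
The topics listed above can be previewed with a minimal sketch. This preview does not name specific algorithms, so the choices here are assumptions: K-means stands in for clustering by centrality, and DBSCAN for clustering by density, both from scikit-learn. The data points, the `eps` and `min_samples` values, and the column names are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN, KMeans

# Two small, well-separated groups of 2D points (made-up illustrative data).
points = np.array([
    [0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2],
    [5.0, 5.0], [5.2, 5.1], [5.1, 5.3], [5.3, 5.2],
])

# Centrality (assumed algorithm: K-means): each point joins the nearest
# of k cluster centers. We must choose k up front.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)

# Density (assumed algorithm: DBSCAN): points in dense neighborhoods are
# grouped together. No cluster count is needed, but eps and min_samples are.
dbscan_labels = DBSCAN(eps=0.5, min_samples=3).fit_predict(points)

# Store the labels alongside the coordinates in a Pandas table.
df = pd.DataFrame(points, columns=["x", "y"])
df["kmeans"] = kmeans_labels
df["dbscan"] = dbscan_labels

# Iterate over the K-means clusters with groupby.
cluster_sizes = {}
for cluster_id, group in df.groupby("kmeans"):
    cluster_sizes[int(cluster_id)] = len(group)
    center = group[["x", "y"]].mean()
    print(f"Cluster {cluster_id}: {len(group)} points, "
          f"mean x = {center['x']:.2f}, mean y = {center['y']:.2f}")
```

The sketch also hints at one trade-off: K-means requires the number of clusters in advance, whereas DBSCAN infers it from the density parameters and may label sparse points as noise (label -1).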

Clustering is the process of organizing data points into conceptually meaningful groups. What makes a given group “conceptually meaningful”? There is no easy answer to that question. The usefulness of any clustered output depends on the task we’ve been assigned.

Imagine that we’re asked to cluster a collection of pet photos. Do we cluster fish and lizards in one group and fluffy pets (such as hamsters, cats, and dogs) in another? Or should hamsters, cats, and dogs be assigned three separate clusters of ...

ISBN: 9781617296253