Practical Data Analysis Cookbook

by Tomasz Drabas
April 2016
Content level: Beginner to intermediate
384 pages
8h 36m
English
Packt Publishing

Finding groups of potential subscribers with DBSCAN and BIRCH algorithms

Density-based Spatial Clustering of Applications with Noise (DBSCAN) and Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) were among the first clustering algorithms designed to handle noisy data effectively. Noise here means data points that seem completely out of place compared with the rest of the dataset. DBSCAN puts such observations into an unclassified bucket, while BIRCH treats them as outliers and removes them from the dataset.
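To make the idea of an unclassified bucket concrete, here is a minimal sketch using scikit-learn's DBSCAN, which marks noise points with the label -1. The toy points and the eps and min_samples values are assumptions chosen for illustration; they are not taken from the book's dataset or code.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Two tight groups of points plus one far-away point (illustrative data only).
X = np.array([
    [0.0, 0.0], [0.0, 0.1], [0.1, 0.0], [0.1, 0.1],
    [5.0, 5.0], [5.0, 5.1], [5.1, 5.0], [5.1, 5.1],
    [20.0, 20.0],
])

# eps and min_samples are illustrative choices, not values from the recipe.
labels = DBSCAN(eps=0.5, min_samples=3).fit_predict(X)

# scikit-learn assigns the label -1 to points it considers noise.
noise_mask = labels == -1
print("labels:", labels)
print("noise points:", X[noise_mask])
```

In this sketch, the isolated point has no neighbors within eps, so it is not absorbed into either group and ends up in the noise bucket.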

Getting ready

To execute this recipe, you will need pandas and Scikit (scikit-learn). No other prerequisites are required.

How to do it…

Both algorithms are available in Scikit. To use DBSCAN, use the code found in the clustering_dbscan.py ...
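The contents of clustering_dbscan.py are not shown here, so the following is not the book's script. It is only a rough sketch of how the two estimators are typically called in scikit-learn, assuming synthetic data; the column names feature_1 and feature_2 and all parameter values are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN, Birch

# Synthetic stand-in data: two well-separated groups.
# The column names are hypothetical, not the book's dataset.
rng = np.random.RandomState(42)
group_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(50, 2))
group_b = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(50, 2))
features = ['feature_1', 'feature_2']
data = pd.DataFrame(np.vstack([group_a, group_b]), columns=features)

# DBSCAN: no cluster count is given; eps and min_samples control density.
dbscan = DBSCAN(eps=1.0, min_samples=5)
data['dbscan_label'] = dbscan.fit_predict(data[features])

# BIRCH: threshold and branching_factor shape the tree of subclusters;
# n_clusters sets the number of final clusters.
birch = Birch(threshold=0.5, branching_factor=50, n_clusters=2)
data['birch_label'] = birch.fit_predict(data[features])

# Compare how many observations each algorithm placed in each group.
print(data.groupby('dbscan_label').size())
print(data.groupby('birch_label').size())
```

One design point worth keeping in mind: in scikit-learn, DBSCAN reports noise with the label -1, whereas the Birch estimator assigns every sample to a cluster, so any outlier handling with Birch would need to be done separately.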

ISBN: 9781783551668