Hands-On Unsupervised Learning with Python

by Giuseppe Bonaccorso
February 2019
Intermediate to advanced
386 pages
9h 54m
English
Packt Publishing

Structure of the dataset

In standard supervised (and often also unsupervised) tasks, the dataset is expected to be balanced. In other words, the number of samples belonging to each class should be almost the same. In the tasks we are going to discuss in this chapter, instead, we assume that the dataset X (containing N samples) is very unbalanced:

  • N_outliers << N, in the case of outlier detection (that is, the dataset is partially dirty; therefore, we need to find a way to filter out all the outliers)
  • N_outliers = 0 (or, more realistically, P(N_outliers > 0) → 0), in the case of novelty detection (that is, we can generally trust the existing samples and focus our attention on the new ones)
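
To make the two cases concrete, the following is a minimal sketch that builds one toy dataset for each scenario and measures the fraction of outliers it contains. It is only an illustration under stated assumptions: the helper names make_dataset and contamination, the Gaussian inliers, the uniform range used for the outliers, and the sample counts are all hypothetical choices and are not taken from the book.

```python
import random


def make_dataset(n_samples, n_outliers, seed=1000):
    """Return (values, is_outlier) for a one-dimensional toy dataset.

    Assumption: inliers follow N(0, 1), while outliers are drawn uniformly
    from [6, 10] with a random sign, so they lie far from the bulk.
    """
    rng = random.Random(seed)
    inliers = [rng.gauss(0.0, 1.0) for _ in range(n_samples - n_outliers)]
    outliers = [
        rng.choice((-1.0, 1.0)) * rng.uniform(6.0, 10.0)
        for _ in range(n_outliers)
    ]
    values = inliers + outliers
    labels = [False] * len(inliers) + [True] * len(outliers)
    return values, labels


def contamination(labels):
    """Fraction of samples flagged as outliers."""
    return sum(labels) / len(labels)


# Outlier detection: the dataset is partially dirty, but the outliers
# are few compared with the total number of samples N.
X_dirty, y_dirty = make_dataset(n_samples=1000, n_outliers=20)

# Novelty detection: the existing samples are trusted, so the dataset
# is assumed to contain no outliers at all.
X_clean, y_clean = make_dataset(n_samples=1000, n_outliers=0)

print(f"Outlier detection: contamination = {contamination(y_dirty):.3f}")
print(f"Novelty detection: contamination = {contamination(y_clean):.3f}")
```

With these hypothetical settings, the first dataset has a contamination of 0.020 and the second one of 0.000. In a real problem the labels are, of course, unknown; they are kept here only to show how unbalanced the two structures are.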

The reason for these criteria is quite obvious: let's consider ...


ISBN: 9781789348279