Outlier Detection in Python

by Brett Kennedy
December 2024
Beginner to intermediate
560 pages
19h 50m
English
Manning Publications

Overview

Learn how to identify the unusual, interesting, extreme, or inaccurate parts of your data.

Data scientists have two main tasks: finding patterns in data and finding the exceptions. These outliers are often the most informative parts of data, revealing hidden insights, novel patterns, and potential problems. Outlier Detection in Python is a practical guide to spotting the parts of a dataset that deviate from the norm, even when they're hidden or intertwined among the expected data points.

In Outlier Detection in Python you'll learn how to:

  • Use standard Python libraries to identify outliers
  • Select the most appropriate detection methods
  • Combine multiple outlier detection methods for improved results
  • Interpret your results effectively
  • Work with numeric, categorical, time series, and text data

Outlier detection is a vital tool for modern business, whether it's discovering new products, expanding markets, or flagging fraud and other suspicious activities. This guide presents the core tools for outlier detection, as well as techniques utilizing the Python data stack familiar to data scientists. To get started, you'll only need a basic understanding of statistics and the Python data ecosystem.

About the Technology
Outliers—values that appear inconsistent with the rest of your data—can be the key to identifying fraud, performing a security audit, spotting bot activity, or just assessing the quality of a dataset. This unique guide introduces the outlier detection tools, techniques, and algorithms you’ll need to find, understand, and respond to the anomalies in your data.

About the Book
Outlier Detection in Python illustrates the principles and practices of outlier detection with diverse real-world examples drawn from social media, finance, network logs, and other domains. You’ll explore a comprehensive set of statistical methods and machine learning approaches to identify and interpret the unexpected values in tabular, text, time series, and image data. Along the way, you’ll work with scikit-learn and PyOD, apply key outlier detection (OD) algorithms, and add some high-value techniques for real-world OD scenarios to your toolkit.

What's Inside
  • Python libraries to identify outliers
  • Combine outlier detection methods
  • Interpret your results


About the Reader
For Python programmers familiar with tools like pandas and NumPy, and the basics of statistics.

About the Author
Brett Kennedy is a data scientist with over thirty years’ experience in software development and data science.

Quotes
A wonderful job of covering the expansive topic of anomaly detection.
- Aric LaBarr, Institute for Advanced Analytics

Hands-on and rich! This book should not be missing from any data scientist’s bookshelf!
- Lukas Ruff, Aignostics

Impressively dives into the harder to spot, unknown types of outliers! Very interesting.
- Robert Brunner, Innodative

Masterfully captures the breadth and depth of outlier detection methods. A must-have for data analysts, scientists, and researchers.
- Matias Carrasco Kind, University of Illinois, Urbana-Champaign

Publisher Resources

ISBN: 9781633436473