Essential Math for Data Science

by Thomas Nield
May 2022
Intermediate to advanced
352 pages
9h 15m
English
O'Reilly Media, Inc.

Chapter 3. Descriptive and Inferential Statistics

Statistics is the practice of collecting and analyzing data to discover useful findings or to predict what causes those findings to happen. Probability often plays a large role in statistics, as we use data to estimate how likely an event is to happen.

It may not always get credit, but statistics is at the heart of many data-driven innovations. Machine learning is itself a statistical tool, searching for hypotheses that relate different variables in data. However, statistics has many blind spots, even for professional statisticians. We can easily get so caught up in what the data says that we forget to ask where the data comes from. These concerns become all the more important as big data, data mining, and machine learning accelerate the automation of statistical algorithms. It is therefore important to have a solid foundation in statistics and hypothesis testing so you do not treat these automations as black boxes.

In this chapter we will cover the fundamentals of statistics and hypothesis testing. Starting with descriptive statistics, we will learn common ways to summarize data. After that, we will venture into inferential statistics, where we try to uncover attributes of a population based on a sample.

What Is Data?

It may seem odd to define “data,” something we all use and take for granted. But I think it needs to be done. Chances are if you asked any person what data is, ...


Publisher Resources

ISBN: 9781098102920