Essential Math for Data Science

by Thomas Nield
May 2022
Intermediate to advanced
352 pages
9h 15m
English
O'Reilly Media, Inc.

Chapter 6. Logistic Regression and Classification

In this chapter we cover logistic regression, a type of regression that predicts the probability of an outcome given one or more independent variables. This in turn can be used for classification, which means predicting categories rather than real numbers as we did with linear regression.

We are not always interested in representing variables as continuous, where they can take any of an infinite number of real decimal values. There are situations where we would rather have variables be discrete: whole numbers, integers, or booleans (1/0, true/false). Logistic regression is trained on an output variable that is discrete (a binary 1 or 0) or a categorical number (which is a whole number). It does output a continuous variable in the form of a probability, but that can be converted into a discrete value with a threshold.
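
To make the threshold idea concrete, here is a minimal sketch in Python. It assumes a single input variable and uses the standard logistic function; the coefficients `b0` and `b1`, the function names, and the 0.5 threshold are hypothetical values chosen only for illustration, not fitted to any data or taken from the book.

```python
import math

def predict_probability(x, b0, b1):
    """Logistic function: maps the linear term b0 + b1*x to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def classify(probability, threshold=0.5):
    """Convert a continuous probability into a discrete 1/0 label."""
    return 1 if probability >= threshold else 0

# Hypothetical coefficients for illustration only (not fitted to data)
b0, b1 = -3.0, 1.5

for x in (1.0, 2.0, 3.0):
    p = predict_probability(x, b0, b1)
    print(f"x={x}: probability={p:.3f}, class={classify(p)}")
```

Raising or lowering the threshold trades one kind of misclassification for the other, so 0.5 is a common default rather than a requirement.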

Logistic regression is easy to implement and fairly resilient against outliers and other data challenges. Many machine learning problems are best solved with logistic regression, which can offer more practicality and performance than other types of supervised machine learning.

Just like we did in Chapter 5 when we covered linear regression, we will attempt to walk the line between statistics and machine learning, using tools and analysis from both disciplines. Logistic regression will integrate many concepts we have learned from this book, from probability to linear regression.

Understanding Logistic ...


Publisher Resources

ISBN: 9781098102920