Book description
Master the math needed to excel in data science, machine learning, and statistics. In this book, author Thomas Nield guides you through areas like calculus, probability, linear algebra, and statistics, and shows how they apply to techniques like linear regression, logistic regression, and neural networks. Along the way, you'll also gain practical insights into the state of data science and learn how to use those insights to advance your career.
Learn how to:
- Use Python code and libraries like SymPy, NumPy, and scikit-learn to explore essential mathematical concepts like calculus, linear algebra, statistics, and machine learning
- Understand techniques like linear regression, logistic regression, and neural networks in plain English, with minimal mathematical notation and jargon
- Perform descriptive statistics and hypothesis testing on a dataset to interpret p-values and statistical significance
- Manipulate vectors and matrices and perform matrix decomposition
- Integrate and build upon incremental knowledge of calculus, probability, statistics, and linear algebra, and apply it to regression models including neural networks
- Navigate practically through a data science career and avoid common pitfalls, assumptions, and biases while tuning your skill set to stand out in the job market
Table of contents
- Preface
- 1. Basic Math and Calculus Review
- 2. Probability
- 3. Descriptive and Inferential Statistics
- 4. Linear Algebra
- 5. Linear Regression
  - A Basic Linear Regression
  - Residuals and Squared Errors
  - Finding the Best Fit Line
  - Overfitting and Variance
  - Stochastic Gradient Descent
  - The Correlation Coefficient
  - Statistical Significance
  - Coefficient of Determination
  - Standard Error of the Estimate
  - Prediction Intervals
  - Train/Test Splits
  - Multiple Linear Regression
  - Conclusion
  - Exercises
- 6. Logistic Regression and Classification
  - Understanding Logistic Regression
  - Performing a Logistic Regression
  - Multivariable Logistic Regression
  - Understanding the Log-Odds
  - R-Squared
  - P-Values
  - Train/Test Splits
  - Confusion Matrices
  - Bayes’ Theorem and Classification
  - Receiver Operator Characteristics/Area Under Curve
  - Class Imbalance
  - Conclusion
  - Exercises
- 7. Neural Networks
- 8. Career Advice and the Path Forward
- A. Supplemental Topics
  - Using LaTeX Rendering with SymPy
  - Binomial Distribution from Scratch
  - Beta Distribution from Scratch
  - Deriving Bayes’ Theorem
  - CDF and Inverse CDF from Scratch
  - Use e to Predict Event Probability Over Time
  - Hill Climbing and Linear Regression
  - Hill Climbing and Logistic Regression
  - A Brief Intro to Linear Programming
  - MNIST Classifier Using scikit-learn
- B. Exercise Answers
- Index
- About the Author
Product information
- Title: Essential Math for Data Science
- Author(s): Thomas Nield
- Release date: June 2022
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781098102937