Grokking Machine Learning

Book description

Discover valuable machine learning techniques you can understand and apply using just high-school math.

In Grokking Machine Learning you will learn:

  • Supervised algorithms for classifying and splitting data
  • Methods for cleaning and simplifying data
  • Machine learning packages and tools
  • Neural networks and ensemble methods for complex datasets

Grokking Machine Learning teaches you how to apply ML to your projects using only standard Python code and high-school-level math. No specialist knowledge is required to tackle the hands-on exercises using Python and readily available machine learning tools. Packed with easy-to-follow Python-based exercises and mini-projects, this book sets you on the path to becoming a machine learning expert.

About the Technology
Discover powerful machine learning techniques you can understand and apply using only high school math! Put simply, machine learning is a set of techniques for data analysis based on algorithms that deliver better results as you give them more data. ML powers many cutting-edge technologies, such as recommendation systems, facial recognition software, smart speakers, and even self-driving cars. This unique book introduces the core concepts of machine learning, using relatable examples, engaging exercises, and crisp illustrations.

About the Book
Grokking Machine Learning presents machine learning algorithms and techniques in a way that anyone can understand. This book skips the confusing academic jargon and offers clear explanations that require only basic algebra. As you go, you’ll build interesting projects with Python, including models for spam detection and image recognition. You’ll also pick up practical skills for cleaning and preparing data.

What's Inside
  • Supervised algorithms for classifying and splitting data
  • Methods for cleaning and simplifying data
  • Machine learning packages and tools
  • Neural networks and ensemble methods for complex datasets
About the Reader
For readers who know basic Python. No machine learning knowledge necessary.

About the Author
Luis G. Serrano is a research scientist in quantum artificial intelligence. Previously, he was a Machine Learning Engineer at Google and Lead Artificial Intelligence Educator at Apple.

Quotes
Did you think machine learning is complicated and hard to master? It’s not! Read this book! Serrano demystifies some of the best-kept secrets of the machine learning society.
- Sebastian Thrun, Founder, Udacity

The first step to take on your machine learning journey.
- Millad Dagdoni, Norwegian Labour and Welfare Administration

A nicely written guided introduction, especially for those who want to code but feel shaky in their mathematics.
- Erik D. Sapper, California Polytechnic State University

The most approachable introduction to machine learning I’ve had the pleasure to read in recent years. Highly recommended.
- Kay Engelhardt, devstats

Publisher resources

View/Submit Errata

Table of contents

  1. Grokking Machine Learning
  2. inside front cover
  3. Copyright
  4. contents
  5. front matter
    1. foreword
    2. preface
    3. acknowledgments
    4. about this book
      1. Types of chapters
      2. Recommended learning paths
      3. Appendices
      4. Requirements and learning goals
      5. Other resources
      6. We’ll be writing code
    5. about the author
  6. 1 What is machine learning? It is common sense, except done by a computer
    1. Do I need a heavy math and coding background to understand machine learning?
    2. OK, so what exactly is machine learning?
    3. How do we get machines to make decisions with data? The remember-formulate-predict framework
    4. Summary
  7. 2 Types of machine learning
    1. What is the difference between labeled and unlabeled data?
    2. Supervised learning: The branch of machine learning that works with labeled data
    3. Unsupervised learning: The branch of machine learning that works with unlabeled data
    4. What is reinforcement learning?
    5. Summary
    6. Exercises
  8. 3 Drawing a line close to our points: Linear regression
    1. The problem: We need to predict the price of a house
    2. The solution: Building a regression model for housing prices
    3. How to get the computer to draw this line: The linear regression algorithm
    4. How do we measure our results? The error function
    5. Real-life application: Using Turi Create to predict housing prices in India
    6. What if the data is not in a line? Polynomial regression
    7. Parameters and hyperparameters
    8. Applications of regression
    9. Summary
    10. Exercises
  9. 4 Optimizing the training process: Underfitting, overfitting, testing, and regularization
    1. An example of underfitting and overfitting using polynomial regression
    2. How do we get the computer to pick the right model? By testing
    3. Where did we break the golden rule, and how do we fix it? The validation set
    4. A numerical way to decide how complex our model should be: The model complexity graph
    5. Another alternative to avoiding overfitting: Regularization
    6. Polynomial regression, testing, and regularization with Turi Create
    7. Summary
    8. Exercises
  10. 5 Using lines to split our points: The perceptron algorithm
    1. The problem: We are on an alien planet, and we don’t know their language!
    2. How do we determine whether a classifier is good or bad? The error function
    3. How to find a good classifier? The perceptron algorithm
    4. Coding the perceptron algorithm
    5. Applications of the perceptron algorithm
    6. Summary
    7. Exercises
  11. 6 A continuous approach to splitting points: Logistic classifiers
    1. Logistic classifiers: A continuous version of perceptron classifiers
    2. How to find a good logistic classifier? The logistic regression algorithm
    3. Coding the logistic regression algorithm
    4. Real-life application: Classifying IMDB reviews with Turi Create
    5. Classifying into multiple classes: The softmax function
    6. Summary
    7. Exercises
  12. 7 How do you measure classification models? Accuracy and its friends
    1. Accuracy: How often is my model correct?
    2. How to fix the accuracy problem? Defining different types of errors and how to measure them
    3. A useful tool to evaluate our model: The receiver operating characteristic (ROC) curve
    4. Summary
    5. Exercises
  13. 8 Using probability to its maximum: The naive Bayes model
    1. Sick or healthy? A story with Bayes’ theorem as the hero
    2. Use case: Spam-detection model
    3. Building a spam-detection model with real data
    4. Summary
    5. Exercises
  14. 9 Splitting data by asking questions: Decision trees
    1. The problem: We need to recommend apps to users according to what they are likely to download
    2. The solution: Building an app-recommendation system
    3. Beyond questions like yes/no
    4. The graphical boundary of decision trees
    5. Real-life application: Modeling student admissions with Scikit-Learn
    6. Decision trees for regression
    7. Applications
    8. Summary
    9. Exercises
  15. 10 Combining building blocks to gain more power: Neural networks
    1. Neural networks with an example: A more complicated alien planet
    2. Training neural networks
    3. Coding neural networks in Keras
    4. Neural networks for regression
    5. Other architectures for more complex datasets
    6. Summary
    7. Exercises
  16. 11 Finding boundaries with style: Support vector machines and the kernel method
    1. Using a new error function to build better classifiers
    2. Coding support vector machines in Scikit-Learn
    3. Training SVMs with nonlinear boundaries: The kernel method
    4. Summary
    5. Exercises
  17. 12 Combining models to maximize results: Ensemble learning
    1. With a little help from our friends
    2. Bagging: Joining some weak learners randomly to build a strong learner
    3. AdaBoost: Joining weak learners in a clever way to build a strong learner
    4. Gradient boosting: Using decision trees to build strong learners
    5. XGBoost: An extreme way to do gradient boosting
    6. Applications of ensemble methods
    7. Summary
    8. Exercises
  18. 13 Putting it all in practice: A real-life example of data engineering and machine learning
    1. The Titanic dataset
    2. Cleaning up our dataset: Missing values and how to deal with them
    3. Feature engineering: Transforming the features in our dataset before training the models
    4. Training our models
    5. Tuning the hyperparameters to find the best model: Grid search
    6. Using K-fold cross-validation to reuse our data as training and validation
    7. Summary
    8. Exercises
  19. Appendix A. Solutions to the exercises
    1. Chapter 2: Types of machine learning
    2. Chapter 3: Drawing a line close to our points: Linear regression
    3. Chapter 4: Optimizing the training process: Underfitting, overfitting, testing, and regularization
    4. Chapter 5: Using lines to split our points: The perceptron algorithm
    5. Chapter 6: A continuous approach to splitting points: Logistic classifiers
    6. Chapter 7: How do you measure classification models? Accuracy and its friends
    7. Chapter 8: Using probability to its maximum: The naive Bayes model
    8. Chapter 9: Splitting data by asking questions: Decision trees
    9. Chapter 10: Combining building blocks to gain more power: Neural networks
    10. Chapter 11: Finding boundaries with style: Support vector machines and the kernel method
    11. Chapter 12: Combining models to maximize results: Ensemble learning
    12. Chapter 13: Putting it all in practice: A real-life example of data engineering and machine learning
  20. Appendix B. The math behind gradient descent: Coming down a mountain using derivatives and slopes
    1. Using gradient descent to decrease functions
    2. Using gradient descent to train models
    3. Using gradient descent for regularization
    4. Getting stuck on local minima: How it happens, and how we solve it
  21. Appendix C. References
    1. General references
    2. Courses
    3. Blogs and YouTube channels
    4. Books
    5. Chapter 1
    6. Chapter 2
    7. Chapter 3
    8. Chapter 4
    9. Chapter 5
    10. Chapter 6
    11. Chapter 7
    12. Chapter 8
    13. Chapter 9
    14. Chapter 10
    15. Chapter 11
    16. Chapter 12
    17. Chapter 13
    18. Graphics and image icons
  22. index

Product information

  • Title: Grokking Machine Learning
  • Author(s): Luis Serrano
  • Release date: December 2021
  • Publisher(s): Manning Publications
  • ISBN: 9781617295911