Machine Learning with Python for Everyone

Book Description

The Complete Beginner's Guide to Understanding and Building Machine Learning Systems with Python

Machine Learning with Python for Everyone will help you master the processes, patterns, and strategies you need to build effective learning systems, even if you're an absolute beginner. If you can write some Python code, this book is for you, no matter how little college-level math you know. Principal instructor Mark E. Fenner relies on plain-English stories, pictures, and Python examples to communicate the ideas of machine learning.

Mark begins by discussing machine learning and what it can do; introducing key mathematical and computational topics in an approachable manner; and walking you through the first steps in building, training, and evaluating learning systems. Step by step, you'll fill out the components of a practical learning system, broaden your toolbox, and explore some of the field's most sophisticated and exciting techniques. Whether you're a student, analyst, scientist, or hobbyist, this guide's insights will be applicable to every learning system you ever build or use.

  • Understand machine learning algorithms, models, and core concepts
  • Classify examples with classifiers, and quantify examples with regressors
  • Realistically assess performance of machine learning systems
  • Use feature engineering to smooth rough data into useful forms
  • Chain multiple components into one system and tune its performance
  • Apply machine learning techniques to images and text
  • Connect the core concepts to neural networks and graphical models
  • Leverage the Python scikit-learn library and other powerful tools
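The topics above center on the scikit-learn workflow the book teaches: split data into training and testing sets, fit a classifier, and score it on the held-out examples. As a minimal sketch of that pattern (the iris dataset and the nearest-neighbors classifier here are illustrative choices, not excerpts from the book):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small built-in dataset of labeled examples.
iris = load_iris()

# Hold out a quarter of the data for testing ("don't teach to the test").
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=42)

# Fit a simple nearest-neighbors classifier on the training portion.
knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

# Evaluate on the unseen test portion.
accuracy = knn.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

The same fit/score pattern carries over to regressors, pipelines, and the evaluation techniques covered in Parts II and III.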

Register your book for convenient access to downloads, updates, and corrections as they become available. See inside the book for details.

Table of Contents

  1. Cover
  2. About This E-Book
  3. Half Title
  4. Series Page
  5. Title Page
  6. Copyright Page
  7. Dedication
  8. Contents
  9. Foreword
  10. Preface
    1. Audience
    2. Approach
    3. Overview
    4. Acknowledgments
    5. Publisher’s Note
  11. About the Author
  12. Part I: First Steps
    1. 1. Let’s Discuss Learning
      1. 1.1 Welcome
      2. 1.2 Scope, Terminology, Prediction, and Data
      3. 1.3 Putting the Machine in Machine Learning
      4. 1.4 Examples of Learning Systems
      5. 1.5 Evaluating Learning Systems
      6. 1.6 A Process for Building Learning Systems
      7. 1.7 Assumptions and Reality of Learning
      8. 1.8 End-of-Chapter Material
    2. 2. Some Technical Background
      1. 2.1 About Our Setup
      2. 2.2 The Need for Mathematical Language
      3. 2.3 Our Software for Tackling Machine Learning
      4. 2.4 Probability
      5. 2.5 Linear Combinations, Weighted Sums, and Dot Products
      6. 2.6 A Geometric View: Points in Space
      7. 2.7 Notation and the Plus-One Trick
      8. 2.8 Getting Groovy, Breaking the Straight-Jacket, and Nonlinearity
      9. 2.9 NumPy versus “All the Maths”
      10. 2.10 Floating-Point Issues
      11. 2.11 EOC
    3. 3. Predicting Categories: Getting Started with Classification
      1. 3.1 Classification Tasks
      2. 3.2 A Simple Classification Dataset
      3. 3.3 Training and Testing: Don’t Teach to the Test
      4. 3.4 Evaluation: Grading the Exam
      5. 3.5 Simple Classifier #1: Nearest Neighbors, Long Distance Relationships, and Assumptions
      6. 3.6 Simple Classifier #2: Naive Bayes, Probability, and Broken Promises
      7. 3.7 Simplistic Evaluation of Classifiers
      8. 3.8 EOC
    4. 4. Predicting Numerical Values: Getting Started with Regression
      1. 4.1 A Simple Regression Dataset
      2. 4.2 Nearest-Neighbors Regression and Summary Statistics
      3. 4.3 Linear Regression and Errors
      4. 4.4 Optimization: Picking the Best Answer
      5. 4.5 Simple Evaluation and Comparison of Regressors
      6. 4.6 EOC
  13. Part II: Evaluation
    1. 5. Evaluating and Comparing Learners
      1. 5.1 Evaluation and Why Less Is More
      2. 5.2 Terminology for Learning Phases
      3. 5.3 Major Tom, There’s Something Wrong: Overfitting and Underfitting
      4. 5.4 From Errors to Costs
      5. 5.5 (Re)Sampling: Making More from Less
      6. 5.6 Break-It-Down: Deconstructing Error into Bias and Variance
      7. 5.7 Graphical Evaluation and Comparison
      8. 5.8 Comparing Learners with Cross-Validation
      9. 5.9 EOC
    2. 6. Evaluating Classifiers
      1. 6.1 Baseline Classifiers
      2. 6.2 Beyond Accuracy: Metrics for Classification
      3. 6.3 ROC Curves
      4. 6.4 Another Take on Multiclass: One-versus-One
      5. 6.5 Precision-Recall Curves
      6. 6.6 Cumulative Response and Lift Curves
      7. 6.7 More Sophisticated Evaluation of Classifiers: Take Two
      8. 6.8 EOC
    3. 7. Evaluating Regressors
      1. 7.1 Baseline Regressors
      2. 7.2 Additional Measures for Regression
      3. 7.3 Residual Plots
      4. 7.4 A First Look at Standardization
      5. 7.5 Evaluating Regressors in a More Sophisticated Way: Take Two
      6. 7.6 EOC
  14. Part III: More Methods and Fundamentals
    1. 8. More Classification Methods
      1. 8.1 Revisiting Classification
      2. 8.2 Decision Trees
      3. 8.3 Support Vector Classifiers
      4. 8.4 Logistic Regression
      5. 8.5 Discriminant Analysis
      6. 8.6 Assumptions, Biases, and Classifiers
      7. 8.7 Comparison of Classifiers: Take Three
      8. 8.8 EOC
    2. 9. More Regression Methods
      1. 9.1 Linear Regression in the Penalty Box: Regularization
      2. 9.2 Support Vector Regression
      3. 9.3 Piecewise Constant Regression
      4. 9.4 Regression Trees
      5. 9.5 Comparison of Regressors: Take Three
      6. 9.6 EOC
    3. 10. Manual Feature Engineering: Manipulating Data for Fun and Profit
      1. 10.1 Feature Engineering Terminology and Motivation
      2. 10.2 Feature Selection and Data Reduction: Taking out the Trash
      3. 10.3 Feature Scaling
      4. 10.4 Discretization
      5. 10.5 Categorical Coding
      6. 10.6 Relationships and Interactions
      7. 10.7 Target Manipulations
      8. 10.8 EOC
    4. 11. Tuning Hyperparameters and Pipelines
      1. 11.1 Models, Parameters, Hyperparameters
      2. 11.2 Tuning Hyperparameters
      3. 11.3 Down the Recursive Rabbit Hole: Nested Cross-Validation
      4. 11.4 Pipelines
      5. 11.5 Pipelines and Tuning Together
      6. 11.6 EOC
  15. Part IV: Adding Complexity
    1. 12. Combining Learners
      1. 12.1 Ensembles
      2. 12.2 Voting Ensembles
      3. 12.3 Bagging and Random Forests
      4. 12.4 Boosting
      5. 12.5 Comparing the Tree-Ensemble Methods
      6. 12.6 EOC
    2. 13. Models That Engineer Features for Us
      1. 13.1 Feature Selection
      2. 13.2 Feature Construction with Kernels
      3. 13.3 Principal Components Analysis: An Unsupervised Technique
      4. 13.4 EOC
    3. 14. Feature Engineering for Domains: Domain-Specific Learning
      1. 14.1 Working with Text
      2. 14.2 Clustering
      3. 14.3 Working with Images
      4. 14.4 EOC
    4. 15. Connections, Extensions, and Further Directions
      1. 15.1 Optimization
      2. 15.2 Linear Regression from Raw Materials
      3. 15.3 Building Logistic Regression from Raw Materials
      4. 15.4 SVM from Raw Materials
      5. 15.5 Neural Networks
      6. 15.6 Probabilistic Graphical Models
      7. 15.7 EOC
  16. A. mlwpy.py Listing
  17. Index
  18. Code Snippets