Ensemble Methods for Machine Learning

Book description

Ensemble machine learning combines multiple machine learning approaches that work together to deliver models that are both highly performant and highly accurate.

Inside Ensemble Methods for Machine Learning you will find:

  • Methods for classification, regression, and retrieval
  • Sophisticated off-the-shelf ensemble implementations
  • Random forests, boosting, and gradient boosting
  • Feature engineering and ensemble diversity
  • Interpretability and explainability for ensemble methods

Ensemble machine learning trains a diverse group of machine learning models to work together, aggregating their output to deliver richer results than a single model. Now in Ensemble Methods for Machine Learning you’ll discover core ensemble methods that have proven records in both data science competitions and real-world applications. Hands-on case studies show you how each algorithm works in production. By the time you're done, you'll know the benefits, limitations, and practical methods of applying ensemble machine learning to real-world data, and be ready to build more explainable ML systems.

About the Technology
Automatically compare, contrast, and blend the output from multiple models to squeeze the best results from your data. Ensemble machine learning applies a “wisdom of crowds” method that dodges the inaccuracies and limitations of a single model. By basing responses on multiple perspectives, this innovative approach can deliver robust predictions even without massive datasets.
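To make the idea concrete, here is a minimal sketch (not taken from the book) of blending several models' votes, assuming scikit-learn is installed; it uses the built-in breast cancer data set purely for illustration:

# A minimal "wisdom of crowds" sketch: three different models vote on each
# prediction, and the blended result is often more robust than any single model.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Train a diverse group of base models and aggregate their predictions by voting
ensemble = VotingClassifier(estimators=[
    ("tree", DecisionTreeClassifier(max_depth=3)),
    ("logreg", LogisticRegression(max_iter=5000)),
    ("forest", RandomForestClassifier(n_estimators=100)),
])
ensemble.fit(X_train, y_train)
print("ensemble accuracy:", ensemble.score(X_test, y_test))

The book itself goes well beyond simple voting, covering bagging, boosting, stacking, and other combination strategies in detail.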

About the Book
Ensemble Methods for Machine Learning teaches you practical techniques for applying multiple ML approaches simultaneously. Each chapter contains a unique case study that demonstrates a fully functional ensemble method, with examples including medical diagnosis, sentiment analysis, handwriting classification, and more. There’s no complex math or theory—you’ll learn in a visuals-first manner, with ample code for easy experimentation!

What's Inside
  • Bagging, boosting, and gradient boosting
  • Methods for classification, regression, and retrieval
  • Interpretability and explainability for ensemble methods
  • Feature engineering and ensemble diversity


About the Reader
For Python programmers with machine learning experience.

About the Author
Gautam Kunapuli has over 15 years of experience in academia and the machine learning industry.

Quotes
An excellent guide to ensemble learning with concepts, code, and examples.
- Peter V. Henstock, Machine Learning and AI Lead, Pfizer Inc.; Advanced AI/ML Lecturer, Harvard Extension School

Extremely valuable for more complex scenarios that single models aren’t able to accurately capture.
- McHughson Chambers, Roy Hobbs Diamond Enterprise

Ensemble methods are a valuable tool. I can aggregate the strengths from multiple methods while mitigating their individual weaknesses and increasing model performance.
- Noah Flynn, Amazon

Step by step and with clear descriptions. Very understandable.
- Oliver Korten, ORONTEC

Table of contents

  1. inside front cover
  2. Ensemble Methods for Machine Learning
  3. Copyright
  4. dedication
  5. contents
  6. front matter
    1. preface
    2. acknowledgments
    3. about this book
      1. Who should read this book
      2. How this book is organized: A road map
      3. About the code
      4. liveBook discussion forum
    4. about the author
    5. about the cover illustration
  7. Part 1 The basics of ensembles
  8. 1 Ensemble methods: Hype or hallelujah?
    1. 1.1 Ensemble methods: The wisdom of the crowds
    2. 1.2 Why you should care about ensemble learning
    3. 1.3 Fit vs. complexity in individual models
      1. 1.3.1 Regression with decision trees
      2. 1.3.2 Regression with support vector machines
    4. 1.4 Our first ensemble
    5. 1.5 Terminology and taxonomy for ensemble methods
    6. Summary
  9. Part 2 Essential ensemble methods
  10. 2 Homogeneous parallel ensembles: Bagging and random forests
    1. 2.1 Parallel ensembles
    2. 2.2 Bagging: Bootstrap aggregating
      1. 2.2.1 Intuition: Resampling and model aggregation
      2. 2.2.2 Implementing bagging
      3. 2.2.3 Bagging with scikit-learn
      4. 2.2.4 Faster training with parallelization
    3. 2.3 Random forests
      1. 2.3.1 Randomized decision trees
      2. 2.3.2 Random forests with scikit-learn
      3. 2.3.3 Feature importances
    4. 2.4 More homogeneous parallel ensembles
      1. 2.4.1 Pasting
      2. 2.4.2 Random subspaces and random patches
      3. 2.4.3 Extra Trees
    5. 2.5 Case study: Breast cancer diagnosis
      1. 2.5.1 Loading and preprocessing
      2. 2.5.2 Bagging, random forests, and Extra Trees
      3. 2.5.3 Feature importances with random forests
    6. Summary
  11. 3 Heterogeneous parallel ensembles: Combining strong learners
    1. 3.1 Base estimators for heterogeneous ensembles
      1. 3.1.1 Fitting base estimators
      2. 3.1.2 Individual predictions of base estimators
    2. 3.2 Combining predictions by weighting
      1. 3.2.1 Majority vote
      2. 3.2.2 Accuracy weighting
      3. 3.2.3 Entropy weighting
      4. 3.2.4 Dempster-Shafer combination
    3. 3.3 Combining predictions by meta-learning
      1. 3.3.1 Stacking
      2. 3.3.2 Stacking with cross validation
    4. 3.4 Case study: Sentiment analysis
      1. 3.4.1 Preprocessing
      2. 3.4.2 Dimensionality reduction
      3. 3.4.3 Blending classifiers
    5. Summary
  12. 4 Sequential ensembles: Adaptive boosting
    1. 4.1 Sequential ensembles of weak learners
    2. 4.2 AdaBoost: Adaptive boosting
      1. 4.2.1 Intuition: Learning with weighted examples
      2. 4.2.2 Implementing AdaBoost
      3. 4.2.3 AdaBoost with scikit-learn
    3. 4.3 AdaBoost in practice
      1. 4.3.1 Learning rate
      2. 4.3.2 Early stopping and pruning
    4. 4.4 Case study: Handwritten digit classification
      1. 4.4.1 Dimensionality reduction with t-SNE
      2. 4.4.2 Boosting
    5. 4.5 LogitBoost: Boosting with the logistic loss
      1. 4.5.1 Logistic vs. exponential loss functions
      2. 4.5.2 Regression as a weak learning algorithm for classification
      3. 4.5.3 Implementing LogitBoost
    6. Summary
  13. 5 Sequential ensembles: Gradient boosting
    1. 5.1 Gradient descent for minimization
      1. 5.1.1 Gradient descent with an illustrative example
      2. 5.1.2 Gradient descent over loss functions for training
    2. 5.2 Gradient boosting: Gradient descent + boosting
      1. 5.2.1 Intuition: Learning with residuals
      2. 5.2.2 Implementing gradient boosting
      3. 5.2.3 Gradient boosting with scikit-learn
      4. 5.2.4 Histogram-based gradient boosting
    3. 5.3 LightGBM: A framework for gradient boosting
      1. 5.3.1 What makes LightGBM “light”?
      2. 5.3.2 Gradient boosting with LightGBM
    4. 5.4 LightGBM in practice
      1. 5.4.1 Learning rate
      2. 5.4.2 Early stopping
      3. 5.4.3 Custom loss functions
    5. 5.5 Case study: Document retrieval
      1. 5.5.1 The LETOR data set
      2. 5.5.2 Document retrieval with LightGBM
    6. Summary
  14. 6 Sequential ensembles: Newton boosting
    1. 6.1 Newton’s method for minimization
      1. 6.1.1 Newton’s method with an illustrative example
      2. 6.1.2 Newton’s descent over loss functions for training
    2. 6.2 Newton boosting: Newton’s method + boosting
      1. 6.2.1 Intuition: Learning with weighted residuals
      2. 6.2.2 Intuition: Learning with regularized loss functions
      3. 6.2.3 Implementing Newton boosting
    3. 6.3 XGBoost: A framework for Newton boosting
      1. 6.3.1 What makes XGBoost “extreme”?
      2. 6.3.2 Newton boosting with XGBoost
    4. 6.4 XGBoost in practice
      1. 6.4.1 Learning rate
      2. 6.4.2 Early stopping
    5. 6.5 Case study redux: Document retrieval
      1. 6.5.1 The LETOR data set
      2. 6.5.2 Document retrieval with XGBoost
    6. Summary
  15. Part 3 Ensembles in the wild: Adapting ensemble methods to your data
  16. 7 Learning with continuous and count labels
    1. 7.1 A brief review of regression
      1. 7.1.1 Linear regression for continuous labels
      2. 7.1.2 Poisson regression for count labels
      3. 7.1.3 Logistic regression for classification labels
      4. 7.1.4 Generalized linear models
      5. 7.1.5 Nonlinear regression
    2. 7.2 Parallel ensembles for regression
      1. 7.2.1 Random forests and Extra Trees
      2. 7.2.2 Combining regression models
      3. 7.2.3 Stacking regression models
    3. 7.3 Sequential ensembles for regression
      1. 7.3.1 Loss and likelihood functions for regression
      2. 7.3.2 Gradient boosting with LightGBM and XGBoost
    4. 7.4 Case study: Demand forecasting
      1. 7.4.1 The UCI Bike Sharing data set
      2. 7.4.2 GLMs and stacking
      3. 7.4.3 Random forest and Extra Trees
      4. 7.4.4 XGBoost and LightGBM
    5. Summary
  17. 8 Learning with categorical features
    1. 8.1 Encoding categorical features
      1. 8.1.1 Types of categorical features
      2. 8.1.2 Ordinal and one-hot encoding
      3. 8.1.3 Encoding with target statistics
      4. 8.1.4 The category_encoders package
    2. 8.2 CatBoost: A framework for ordered boosting
      1. 8.2.1 Ordered target statistics and ordered boosting
      2. 8.2.2 Oblivious decision trees
      3. 8.2.3 CatBoost in practice
    3. 8.3 Case study: Income prediction
      1. 8.3.1 Adult Data Set
      2. 8.3.2 Creating preprocessing and modeling pipelines
      3. 8.3.3 Category encoding and ensembling
      4. 8.3.4 Ordered encoding and boosting with CatBoost
    4. 8.4 Encoding high-cardinality string features
    5. Summary
  18. 9 Explaining your ensembles
    1. 9.1 What is interpretability?
      1. 9.1.1 Black-box vs. glass-box models
      2. 9.1.2 Decision trees (and decision rules)
      3. 9.1.3 Generalized linear models
    2. 9.2 Case study: Data-driven marketing
      1. 9.2.1 Bank Marketing data set
      2. 9.2.2 Training ensembles
      3. 9.2.3 Feature importances in tree ensembles
    3. 9.3 Black-box methods for global explainability
      1. 9.3.1 Permutation feature importance
      2. 9.3.2 Partial dependence plots
      3. 9.3.3 Global surrogate models
    4. 9.4 Black-box methods for local explainability
      1. 9.4.1 Local surrogate models with LIME
      2. 9.4.2 Local interpretability with SHAP
    5. 9.5 Glass-box ensembles: Training for interpretability
      1. 9.5.1 Explainable boosting machines
      2. 9.5.2 EBMs in practice
    6. Summary
  19. epilogue
    1. E.1 Further reading
      1. E.1.1 Practical ensemble methods
      2. E.1.2 Theory and foundations of ensemble methods
    2. E.2 A few more advanced topics
      1. E.2.1 Ensemble methods for statistical relational learning
      2. E.2.2 Ensemble methods for deep learning
    3. E.3 Thank You!
  20. index
  21. inside back cover

Product information

  • Title: Ensemble Methods for Machine Learning
  • Author(s): Gautam Kunapuli
  • Release date: May 2023
  • Publisher(s): Manning Publications
  • ISBN: 9781617297137