Ensemble Machine Learning Cookbook

Book description

Implement machine learning algorithms to build ensemble models using Keras, H2O, Scikit-Learn, Pandas and more

Key Features

  • Apply popular machine learning algorithms using a recipe-based approach
  • Implement boosting, bagging, and stacking ensemble methods to improve machine learning models
  • Discover real-world ensemble applications and encounter complex challenges in Kaggle competitions

Ensemble modeling is an approach used to improve the performance of machine learning models. It combines two or more similar or dissimilar machine learning algorithms to deliver superior predictive power. This book will help you to implement popular machine learning algorithms to cover different paradigms of ensemble machine learning, such as boosting, bagging, and stacking.

The Ensemble Machine Learning Cookbook starts by getting you acquainted with the basics of ensemble techniques and exploratory data analysis. You'll then learn to implement tasks related to statistical and machine learning algorithms to understand ensembles of multiple heterogeneous algorithms. The book also ensures that you don't miss out on key topics, such as resampling methods. As you progress, you'll get a better understanding of bagging, boosting, stacking, and the Random Forest algorithm through real-world examples. The book highlights how these ensemble methods use multiple models to improve machine learning results compared to a single model. In the concluding chapters, you'll delve into advanced ensemble models using neural networks, natural language processing, and more. You'll also be able to implement applications such as fraud detection, text categorization, and sentiment analysis.

By the end of this book, you'll be able to harness ensemble techniques and the working mechanisms of machine learning algorithms to build intelligent models using individual recipes.
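As a taste of the bagging paradigm described above, the following sketch compares a single decision tree against a bagged ensemble of trees using scikit-learn. The synthetic dataset and model choices here are illustrative assumptions, not taken from the book's own recipes.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic dataset standing in for a real-world problem
X, y = make_classification(n_samples=1000, n_features=20, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# A single model versus a bagged ensemble of 100 trees trained on
# bootstrap samples of the training data
single = DecisionTreeClassifier(random_state=1).fit(X_train, y_train)
bagged = BaggingClassifier(DecisionTreeClassifier(), n_estimators=100,
                           random_state=1).fit(X_train, y_train)

print("single tree:", single.score(X_test, y_test))
print("bagged ensemble:", bagged.score(X_test, y_test))
```

On most datasets the bagged ensemble's test accuracy matches or beats the single tree, because averaging over bootstrap-trained models reduces variance.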

What you will learn

  • Understand how to use machine learning algorithms for regression and classification problems
  • Implement ensemble techniques such as averaging, weighted averaging, and max-voting
  • Get to grips with advanced ensemble methods, such as bootstrapping, bagging, and stacking
  • Use Random Forest for tasks such as classification and regression
  • Implement an ensemble of homogeneous and heterogeneous machine learning algorithms
  • Learn and implement various boosting techniques, such as AdaBoost, Gradient Boosting Machine, and XGBoost
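The max-voting and weighted-averaging techniques listed above can be sketched with scikit-learn's `VotingClassifier`; the dataset, base models, and weights below are illustrative assumptions rather than the book's exact recipes.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

estimators = [('lr', LogisticRegression(max_iter=1000)),
              ('dt', DecisionTreeClassifier(random_state=0)),
              ('nb', GaussianNB())]

# Max-voting: each base model casts one vote; the majority class wins
hard = VotingClassifier(estimators, voting='hard').fit(X, y)

# Weighted averaging: predicted class probabilities are averaged,
# with per-model weights controlling each model's influence
soft = VotingClassifier(estimators, voting='soft', weights=[2, 1, 1]).fit(X, y)

print("max-voting:", hard.score(X, y))
print("weighted averaging:", soft.score(X, y))
```

Plain averaging is the special case of equal weights (or `weights=None`).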

Who this book is for

This book is designed for data scientists, machine learning developers, and deep learning enthusiasts who want to delve into machine learning algorithms to build powerful ensemble models. Working knowledge of Python programming and basic statistics is a must to help you grasp the concepts in the book.

Table of contents

  1. Title Page
  2. Copyright and Credits
    1. Ensemble Machine Learning Cookbook
  3. About Packt
    1. Why subscribe?
    2. Packt.com
  4. Foreword
  5. Contributors
    1. About the authors
    2. About the reviewers
    3. Packt is searching for authors like you
  6. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the example code files
      2. Download the color images
      3. Conventions used
    4. Sections
      1. Getting ready
      2. How to do it…
      3. How it works…
      4. There's more…
      5. See also
    5. Get in touch
      1. Reviews
  7. Get Closer to Your Data
    1. Introduction
    2. Data manipulation with Python
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Analyzing, visualizing, and treating missing values
      1. How to do it...
      2. How it works...
      3. There's more...
      4. See also
    4. Exploratory data analysis
      1. How to do it...
      2. How it works...
      3. There's more...
      4. See also
  8. Getting Started with Ensemble Machine Learning
    1. Introduction to ensemble machine learning
    2. Max-voting
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
    3. Averaging
      1. Getting ready
      2. How to do it...
      3. How it works...
    4. Weighted averaging
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
  9. Resampling Methods
    1. Introduction to sampling
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    2. k-fold and leave-one-out cross-validation
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Bootstrapping
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
  10. Statistical and Machine Learning Algorithms
    1. Technical requirements
    2. Multiple linear regression
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Logistic regression
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    4. Naive Bayes
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    5. Decision trees
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    6. Support vector machines
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
  11. Bag the Models with Bagging
    1. Introduction
    2. Bootstrap aggregation
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
    3. Ensemble meta-estimators
      1. Bagging classifiers
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    4. Bagging regressors
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
  12. When in Doubt, Use Random Forests
    1. Introduction to random forests
    2. Implementing a random forest for predicting credit card defaults using scikit-learn
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Implementing random forest for predicting credit card defaults using H2O
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
  13. Boosting Model Performance with Boosting
    1. Introduction to boosting
    2. Implementing AdaBoost for disease risk prediction using scikit-learn
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. Implementing a gradient boosting machine for disease risk prediction using scikit-learn
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
    4. Implementing the extreme gradient boosting method for glass identification using XGBoost with scikit-learn 
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
  14. Blend It with Stacking
    1. Technical requirements
    2. Understanding stacked generalization
    3. Implementing stacked generalization by combining predictions
      1. Getting ready
      2. How to do it... 
      3. How it works...
      4. There's more...
      5. See also
    4. Implementing stacked generalization for campaign outcome prediction using H2O
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
  15. Homogeneous Ensembles Using Keras
    1. Introduction
    2. An ensemble of homogeneous models for energy prediction
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
    3. An ensemble of homogeneous models for handwritten digit classification
      1. Getting ready
      2. How to do it...
      3. How it works...
  16. Heterogeneous Ensemble Classifiers Using H2O
    1. Introduction 
    2. Predicting credit card defaulters using heterogeneous ensemble classifiers
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
      5. See also
  17. Heterogeneous Ensemble for Text Classification Using NLP
    1. Introduction
    2. Spam filtering using an ensemble of heterogeneous algorithms
      1. Getting ready
      2. How to do it...
      3. How it works...
    3. Sentiment analysis of movie reviews using an ensemble model
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. There's more...
  18. Homogeneous Ensemble for Multiclass Classification Using Keras
    1. Introduction
    2. An ensemble of homogeneous models to classify fashion products
      1. Getting ready
      2. How to do it...
      3. How it works...
      4. See also
  19. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think

Product information

  • Title: Ensemble Machine Learning Cookbook
  • Author(s): Dipayan Sarkar, Vijayalakshmi Natarajan
  • Release date: January 2019
  • Publisher(s): Packt Publishing
  • ISBN: 9781789136609