Machine Learning for OpenCV

Book Description

Expand your OpenCV knowledge and master key concepts of machine learning using this practical, hands-on guide.

About This Book

  • Load, store, edit, and visualize data using OpenCV and Python
  • Grasp the fundamental concepts of classification, regression, and clustering
  • Understand, perform, and experiment with machine learning techniques using this easy-to-follow guide
  • Evaluate, compare, and choose the right algorithm for any task

Who This Book Is For

This book targets Python programmers who are already familiar with OpenCV. It will give you the tools and understanding required to build your own machine learning systems, tailored to practical real-world tasks.

What You Will Learn

  • Explore and make effective use of OpenCV's machine learning module
  • Learn deep learning for computer vision with Python
  • Master linear regression and regularization techniques
  • Classify objects such as flower species, handwritten digits, and pedestrians
  • Explore the effective use of support vector machines, boosted decision trees, and random forests
  • Get acquainted with neural networks and deep learning to address real-world problems
  • Discover hidden structures in your data using k-means clustering
  • Get to grips with data pre-processing and feature engineering

In Detail

Machine learning is no longer just a buzzword; it is all around us, from protecting your email, to automatically tagging friends in pictures, to predicting what movies you like. Computer vision is one of today's most exciting application fields of machine learning, with Deep Learning driving innovative systems such as self-driving cars and Google’s DeepMind.

OpenCV lies at the intersection of these topics, providing a comprehensive open-source library for classic as well as state-of-the-art computer vision and machine learning algorithms. In combination with Python's Anaconda distribution, you will have access to all the open-source computing libraries you could possibly ask for.

Machine Learning for OpenCV begins by introducing you to the essential concepts of statistical learning, such as classification and regression. Once all the basics are covered, you will start exploring various algorithms such as decision trees, support vector machines, and Bayesian networks, and learn how to combine them with other OpenCV functionality. As the book progresses, so will your machine learning skills, until you are ready to take on today's hottest topic in the field: Deep Learning.

By the end of this book, you will be ready to take on your own machine learning problems, either by building on the existing source code or developing your own algorithm from scratch!

Style and approach

OpenCV machine learning connects the fundamental theoretical principles behind machine learning to their practical applications in a way that focuses on asking and answering the right questions. This book walks you through the key elements of OpenCV and its powerful machine learning classes, while demonstrating how to get to grips with a range of models.

Downloading the example code for this book. You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to receive the code files.

Table of Contents

  1. Preface
    1. What this book covers
    2. What you need for this book
    3. Who this book is for
    4. Conventions
    5. Reader feedback
    6. Customer support
      1. Downloading the example code
      2. Errata
      3. Piracy
      4. Questions
  2. A Taste of Machine Learning
    1. Getting started with machine learning
    2. Problems that machine learning can solve
    3. Getting started with Python
    4. Getting started with OpenCV
    5. Installation
      1. Getting the latest code for this book
      2. Getting to grips with Python's Anaconda distribution
      3. Installing OpenCV in a conda environment
      4. Verifying the installation
      5. Getting a glimpse of OpenCV's ML module
    6. Summary
  3. Working with Data in OpenCV and Python
    1. Understanding the machine learning workflow
    2. Dealing with data using OpenCV and Python
      1. Starting a new IPython or Jupyter session
      2. Dealing with data using Python's NumPy package
        1. Importing NumPy
        2. Understanding NumPy arrays
        3. Accessing single array elements by indexing
        4. Creating multidimensional arrays
      3. Loading external datasets in Python
      4. Visualizing the data using Matplotlib
        1. Importing Matplotlib
        2. Producing a simple plot
        3. Visualizing data from an external dataset
      5. Dealing with data using OpenCV's TrainData container in C++
    3. Summary
  4. First Steps in Supervised Learning
    1. Understanding supervised learning
      1. Having a look at supervised learning in OpenCV
      2. Measuring model performance with scoring functions
        1. Scoring classifiers using accuracy, precision, and recall
        2. Scoring regressors using mean squared error, explained variance, and R squared
    2. Using classification models to predict class labels
      1. Understanding the k-NN algorithm
      2. Implementing k-NN in OpenCV
        1. Generating the training data
        2. Training the classifier
        3. Predicting the label of a new data point
    3. Using regression models to predict continuous outcomes
      1. Understanding linear regression
      2. Using linear regression to predict Boston housing prices
        1. Loading the dataset
        2. Training the model
        3. Testing the model
      3. Applying Lasso and ridge regression
    4. Classifying iris species using logistic regression
      1. Understanding logistic regression
        1. Loading the training data
        2. Making it a binary classification problem
        3. Inspecting the data
        4. Splitting the data into training and test sets
        5. Training the classifier
        6. Testing the classifier
    5. Summary
  5. Representing Data and Engineering Features
    1. Understanding feature engineering
    2. Preprocessing data
      1. Standardizing features
      2. Normalizing features
      3. Scaling features to a range
      4. Binarizing features
      5. Handling the missing data
    3. Understanding dimensionality reduction
      1. Implementing Principal Component Analysis (PCA) in OpenCV
      2. Implementing Independent Component Analysis (ICA)
      3. Implementing Non-negative Matrix Factorization (NMF)
    4. Representing categorical variables
    5. Representing text features
    6. Representing images
      1. Using color spaces
        1. Encoding images in RGB space
        2. Encoding images in HSV and HLS space
      2. Detecting corners in images
      3. Using the Scale-Invariant Feature Transform (SIFT)
      4. Using Speeded Up Robust Features (SURF)
    7. Summary
  6. Using Decision Trees to Make a Medical Diagnosis
    1. Understanding decision trees
      1. Building our first decision tree
        1. Understanding the task by understanding the data
        2. Preprocessing the data
        3. Constructing the tree
      2. Visualizing a trained decision tree
      3. Investigating the inner workings of a decision tree
      4. Rating the importance of features
      5. Understanding the decision rules
      6. Controlling the complexity of decision trees
    2. Using decision trees to diagnose breast cancer
      1. Loading the dataset
      2. Building the decision tree
    3. Using decision trees for regression
    4. Summary
  7. Detecting Pedestrians with Support Vector Machines
    1. Understanding linear support vector machines
      1. Learning optimal decision boundaries
      2. Implementing our first support vector machine
        1. Generating the dataset
        2. Visualizing the dataset
        3. Preprocessing the dataset
        4. Building the support vector machine
        5. Visualizing the decision boundary
    2. Dealing with nonlinear decision boundaries
      1. Understanding the kernel trick
      2. Knowing our kernels
      3. Implementing nonlinear support vector machines
    3. Detecting pedestrians in the wild
      1. Obtaining the dataset
      2. Taking a glimpse at the histogram of oriented gradients (HOG)
      3. Generating negatives
      4. Implementing the support vector machine
      5. Bootstrapping the model
      6. Detecting pedestrians in a larger image
      7. Further improving the model
    4. Summary
  8. Implementing a Spam Filter with Bayesian Learning
    1. Understanding Bayesian inference
      1. Taking a short detour on probability theory
      2. Understanding Bayes' theorem
      3. Understanding the naive Bayes classifier
    2. Implementing your first Bayesian classifier
      1. Creating a toy dataset
      2. Classifying the data with a normal Bayes classifier
      3. Classifying the data with a naive Bayes classifier
      4. Visualizing conditional probabilities
    3. Classifying emails using the naive Bayes classifier
      1. Loading the dataset
      2. Building a data matrix using Pandas
      3. Preprocessing the data
      4. Training a normal Bayes classifier
      5. Training on the full dataset
      6. Using n-grams to improve the result
      7. Using tf-idf to improve the result
    4. Summary
  9. Discovering Hidden Structures with Unsupervised Learning
    1. Understanding unsupervised learning
    2. Understanding k-means clustering
      1. Implementing our first k-means example
    3. Understanding expectation-maximization
      1. Implementing our own expectation-maximization solution
      2. Knowing the limitations of expectation-maximization
        1. First caveat: No guarantee of finding the global optimum
        2. Second caveat: We must select the number of clusters beforehand
        3. Third caveat: Cluster boundaries are linear
        4. Fourth caveat: k-means is slow for a large number of samples
    4. Compressing color spaces using k-means
      1. Visualizing the true-color palette
      2. Reducing the color palette using k-means
    5. Classifying handwritten digits using k-means
      1. Loading the dataset
      2. Running k-means
    6. Organizing clusters as a hierarchical tree
      1. Understanding hierarchical clustering
      2. Implementing agglomerative hierarchical clustering
    7. Summary
  10. Using Deep Learning to Classify Handwritten Digits
    1. Understanding the McCulloch-Pitts neuron
    2. Understanding the perceptron
    3. Implementing your first perceptron
      1. Generating a toy dataset
      2. Fitting the perceptron to data
      3. Evaluating the perceptron classifier
      4. Applying the perceptron to data that is not linearly separable
    4. Understanding multilayer perceptrons
      1. Understanding gradient descent
      2. Training multi-layer perceptrons with backpropagation
      3. Implementing a multilayer perceptron in OpenCV
        1. Preprocessing the data
        2. Creating an MLP classifier in OpenCV
        3. Customizing the MLP classifier
        4. Training and testing the MLP classifier
    5. Getting acquainted with deep learning
      1. Getting acquainted with Keras
    6. Classifying handwritten digits
      1. Loading the MNIST dataset
      2. Preprocessing the MNIST dataset
      3. Training an MLP using OpenCV
      4. Training a deep neural net using Keras
        1. Preprocessing the MNIST dataset
        2. Creating a convolutional neural network
        3. Fitting the model
    7. Summary
  11. Combining Different Algorithms into an Ensemble
    1. Understanding ensemble methods
      1. Understanding averaging ensembles
        1. Implementing a bagging classifier
        2. Implementing a bagging regressor
      2. Understanding boosting ensembles
        1. Implementing a boosting classifier
        2. Implementing a boosting regressor
      3. Understanding stacking ensembles
    2. Combining decision trees into a random forest
      1. Understanding the shortcomings of decision trees
      2. Implementing our first random forest
      3. Implementing a random forest with scikit-learn
      4. Implementing extremely randomized trees
    3. Using random forests for face recognition
      1. Loading the dataset
      2. Preprocessing the dataset
      3. Training and testing the random forest
    4. Implementing AdaBoost
      1. Implementing AdaBoost in OpenCV
      2. Implementing AdaBoost in scikit-learn
    5. Combining different models into a voting classifier
      1. Understanding different voting schemes
      2. Implementing a voting classifier
    6. Summary
  12. Selecting the Right Model with Hyperparameter Tuning
    1. Evaluating a model
      1. Evaluating a model the wrong way
      2. Evaluating a model in the right way
      3. Selecting the best model
    2. Understanding cross-validation
      1. Manually implementing cross-validation in OpenCV
      2. Using scikit-learn for k-fold cross-validation
      3. Implementing leave-one-out cross-validation
    3. Estimating robustness using bootstrapping
      1. Manually implementing bootstrapping in OpenCV
    4. Assessing the significance of our results
      1. Implementing Student's t-test
      2. Implementing McNemar's test
    5. Tuning hyperparameters with grid search
      1. Implementing a simple grid search
      2. Understanding the value of a validation set
      3. Combining grid search with cross-validation
      4. Combining grid search with nested cross-validation
    6. Scoring models using different evaluation metrics
      1. Choosing the right classification metric
      2. Choosing the right regression metric
    7. Chaining algorithms together to form a pipeline
      1. Implementing pipelines in scikit-learn
      2. Using pipelines in grid searches
    8. Summary
  13. Wrapping Up
    1. Approaching a machine learning problem
    2. Building your own estimator
      1. Writing your own OpenCV-based classifier in C++
      2. Writing your own scikit-learn-based classifier in Python
    3. Where to go from here?
    4. Summary