
OpenCV: Computer Vision Projects with Python

Name: OpenCV: Computer Vision Projects with Python
ISBN: 9781787125490

by Joseph Howse, Prateek Joshi, Michael Beyeler

October 2016

Intermediate to advanced

558 pages

12h 39m

English

Packt Publishing


Table of Contents
Credits
Preface
What this learning path covers
What you need for this learning path
Who this learning path is for
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions

1. Module 1
1. Setting up OpenCV
Choosing and using the right setup tools
Making the choice on Windows XP, Windows Vista, Windows 7, or Windows 8
Using binary installers (no support for depth cameras)
Using CMake and compilers
Making the choice on Mac OS X Snow Leopard, Mac OS X Lion, or Mac OS X Mountain Lion
Using MacPorts with ready-made packages
Using MacPorts with your own custom packages
Using Homebrew with ready-made packages (no support for depth cameras)
Using Homebrew with your own custom packages
Making the choice on Ubuntu 12.04 LTS or Ubuntu 12.10
Using the Ubuntu repository (no support for depth cameras)
Using CMake via a ready-made script that you may customize
Making the choice on other Unix-like systems
Running samples
Finding documentation, help, and updates
Summary
2. Handling Files, Cameras, and GUIs
Basic I/O scripts
Reading/Writing an image file
Converting between an image and raw bytes
Reading/Writing a video file
Capturing camera frames
Displaying camera frames in a window
Project concept
An object-oriented design
Abstracting a video stream – managers.CaptureManager
Abstracting a window and keyboard – managers.WindowManager
Applying everything – cameo.Cameo
Summary
3. Filtering Images
Creating modules
Channel mixing – seeing in Technicolor
Simulating RC color space
Simulating RGV color space
Simulating CMV color space
Curves – bending color space
Formulating a curve
Caching and applying a curve
Designing object-oriented curve filters
Emulating photo films
Emulating Kodak Portra
Emulating Fuji Provia
Emulating Fuji Velvia
Emulating cross-processing
Highlighting edges
Custom kernels – getting convoluted
Modifying the application
Summary
4. Tracking Faces with Haar Cascades
Conceptualizing Haar cascades
Getting Haar cascade data
Creating modules
Defining a face as a hierarchy of rectangles
Tracing, cutting, and pasting rectangles
Adding more utility functions
Tracking faces
Modifying the application
Swapping faces in one camera feed
Copying faces between camera feeds
Summary
5. Detecting Foreground/Background Regions and Depth
Creating modules
Capturing frames from a depth camera
Creating a mask from a disparity map
Masking a copy operation
Modifying the application
Summary
A. Integrating with Pygame
Installing Pygame
Documentation and tutorials
Subclassing managers.WindowManager
Modifying the application
Further uses of Pygame
Summary
B. Generating Haar Cascades for Custom Targets
Gathering positive and negative training images
Finding the training executables
On Windows
On Mac, Ubuntu, and other Unix-like systems
Creating the training sets and cascade
Creating <negative_description>
Creating <positive_description>
Creating <binary_description> by running <opencv_createsamples>
Creating <cascade> by running <opencv_traincascade>
Testing and improving <cascade>
Summary
2. Module 2
1. Detecting Edges and Applying Image Filters
2D convolution
Blurring
The size of the kernel versus the blurriness
Edge detection
Motion blur
Under the hood
Sharpening
Understanding the pattern
Embossing
Erosion and dilation
Afterthought
Creating a vignette filter
What's happening underneath?
How do we move the focus around?
Enhancing the contrast in an image
How do we handle color images?
Summary
2. Cartoonizing an Image
Accessing the webcam
Under the hood
Keyboard inputs
Interacting with the application
Mouse inputs
What's happening underneath?
Interacting with a live video stream
How did we do it?
Cartoonizing an image
Deconstructing the code
Summary
3. Detecting and Tracking Different Body Parts
Using Haar cascades to detect things
What are integral images?
Detecting and tracking faces
Understanding it better
Fun with faces
Under the hood
Detecting eyes
Afterthought
Fun with eyes
Positioning the sunglasses
Detecting ears
Detecting a mouth
It's time for a moustache
Detecting a nose
Detecting pupils
Deconstructing the code
Summary
4. Extracting Features from an Image
Why do we care about keypoints?
What are keypoints?
Detecting the corners
Good Features To Track
Scale Invariant Feature Transform (SIFT)
Speeded Up Robust Features (SURF)
Features from Accelerated Segment Test (FAST)
Binary Robust Independent Elementary Features (BRIEF)
Oriented FAST and Rotated BRIEF (ORB)
Summary
5. Creating a Panoramic Image
Matching keypoint descriptors
How did we match the keypoints?
Understanding the matcher object
Drawing the matching keypoints
Creating the panoramic image
Finding the overlapping regions
Stitching the images
What if the images are at an angle to each other?
Why does it look stretched?
Summary
6. Seam Carving
Why do we care about seam carving?
How does it work?
How do we define "interesting"?
How do we compute the seams?
Can we expand an image?
Can we remove an object completely?
How did we do it?
Summary
7. Detecting Shapes and Segmenting an Image
Contour analysis and shape matching
Approximating a contour
Identifying the pizza with the slice taken out
How to censor a shape?
What is image segmentation?
How does it work?
Watershed algorithm
Summary
8. Object Tracking
Frame differencing
Colorspace based tracking
Building an interactive object tracker
Feature based tracking
Background subtraction
Summary
9. Object Recognition
Object detection versus object recognition
What is a dense feature detector?
What is a visual dictionary?
What is supervised and unsupervised learning?
What are Support Vector Machines?
What if we cannot separate the data with simple straight lines?
How do we actually implement this?
What happened inside the code?
How did we build the trainer?
Summary
10. Stereo Vision and 3D Reconstruction
What is stereo correspondence?
What is epipolar geometry?
Why are the lines different as compared to SIFT?
Building the 3D map
Summary
11. Augmented Reality
What is the premise of augmented reality?
What does an augmented reality system look like?
Geometric transformations for augmented reality
What is pose estimation?
How to track planar objects?
What happened inside the code?
How to augment our reality?
Mapping coordinates from 3D to 2D
How to overlay 3D objects on a video?
Let's look at the code
Let's add some movements
Summary
3. Module 3
1. Fun with Filters
Planning the app
Creating a black-and-white pencil sketch
Implementing dodging and burning in OpenCV
Pencil sketch transformation
Generating a warming/cooling filter
Color manipulation via curve shifting
Implementing a curve filter by using lookup tables
Designing the warming/cooling effect
Cartoonizing an image
Using a bilateral filter for edge-aware smoothing
Detecting and emphasizing prominent edges
Combining colors and outlines to produce a cartoon
Putting it all together
Running the app
The GUI base class
The GUI constructor
Handling video streams
A basic GUI layout
A custom filter layout
Summary
2. Hand Gesture Recognition Using a Kinect Depth Sensor
Planning the app
Setting up the app
Accessing the Kinect 3D sensor
Running the app
The Kinect GUI
Tracking hand gestures in real time
Hand region segmentation
Finding the most prominent depth of the image center region
Applying morphological closing to smoothen the segmentation mask
Finding connected components in a segmentation mask
Hand shape analysis
Determining the contour of the segmented hand region
Finding the convex hull of a contour area
Finding the convexity defects of a convex hull
Hand gesture recognition
Distinguishing between different causes of convexity defects
Classifying hand gestures based on the number of extended fingers
Summary
3. Finding Objects via Feature Matching and Perspective Transforms
Tasks performed by the app
Planning the app
Setting up the app
Running the app
The FeatureMatching GUI
The process flow
Feature extraction
Feature detection
Detecting features in an image with SURF
Feature matching
Matching features across images with FLANN
The ratio test for outlier removal
Visualizing feature matches
Homography estimation
Warping the image
Feature tracking
Early outlier detection and rejection
Seeing the algorithm in action
Summary
4. 3D Scene Reconstruction Using Structure from Motion
Planning the app
Camera calibration
The pinhole camera model
Estimating the intrinsic camera parameters
The camera calibration GUI
Initializing the algorithm
Collecting image and object points
Finding the camera matrix
Setting up the app
The main function routine
The SceneReconstruction3D class
Estimating the camera motion from a pair of images
Point matching using rich feature descriptors
Point matching using optic flow
Finding the camera matrices
Image rectification
Reconstructing the scene
3D point cloud visualization
Summary
5. Tracking Visually Salient Objects
Planning the app
Setting up the app
The main function routine
The Saliency class
The MultiObjectTracker class
Visual saliency
Fourier analysis
Natural scene statistics
Generating a Saliency map with the spectral residual approach
Detecting proto-objects in a scene
Mean-shift tracking
Automatically tracking all players on a soccer field
Extracting bounding boxes for proto-objects
Setting up the necessary bookkeeping for mean-shift tracking
Tracking objects with the mean-shift algorithm
Putting it all together
Summary
6. Learning to Recognize Traffic Signs
Planning the app
Supervised learning
The training procedure
The testing procedure
A classifier base class
The GTSRB dataset
Parsing the dataset
Feature extraction
Common preprocessing
Grayscale features
Color spaces
Speeded Up Robust Features
Histogram of Oriented Gradients
Support Vector Machine
Using SVMs for Multi-class classification
Training the SVM
Testing the SVM
Confusion matrix
Accuracy
Precision
Recall
Putting it all together
Summary
7. Learning to Recognize Emotions on Faces
Planning the app
Face detection
Haar-based cascade classifiers
Pre-trained cascade classifiers
Using a pre-trained cascade classifier
The FaceDetector class
Detecting faces in grayscale images
Preprocessing detected faces
Facial expression recognition
Assembling a training set
Running the screen capture
The GUI constructor
The GUI layout
Processing the current frame
Adding a training sample to the training set
Dumping the complete training set to a file
Feature extraction
Preprocessing the dataset
Principal component analysis
Multi-layer perceptrons
The perceptron
Deep architectures
An MLP for facial expression recognition
Training the MLP
Testing the MLP
Running the script
Putting it all together
Summary
A. Bibliography
Index

Content preview from OpenCV: Computer Vision Projects with Python

What is a visual dictionary?

We will use the Bag of Words model to build our object recognizer. Each image is represented as a histogram of visual words. These visual words are the N centroids obtained by clustering the features of all the keypoints extracted from the training images. The pipeline is shown in the image that follows:

From each training image, we detect a set of keypoints and extract a feature descriptor for each of them. Different images yield different numbers of keypoints, but to train a classifier, each image must be represented by a fixed-length feature vector. This feature vector is simply a histogram, where each bin corresponds ...
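The idea of turning a variable number of keypoint descriptors into a fixed-length histogram can be sketched as follows. This is a minimal illustration under stated assumptions, not the book's implementation: the descriptors are synthetic random vectors standing in for real keypoint features, the clustering is a plain NumPy k-means loop, and the names `build_vocabulary` and `bow_histogram` are hypothetical helpers introduced only for this example.

```python
import numpy as np


def build_vocabulary(descriptors, num_words, num_iters=20, seed=0):
    """Cluster descriptors with a plain k-means loop and return the centroids."""
    rng = np.random.default_rng(seed)
    # Start from num_words distinct descriptors chosen at random.
    start = rng.choice(len(descriptors), num_words, replace=False)
    centroids = descriptors[start].astype(float)
    for _ in range(num_iters):
        # Assign every descriptor to its nearest centroid.
        dists = np.linalg.norm(
            descriptors[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(num_words):
            members = descriptors[labels == k]
            if len(members):  # an empty cluster keeps its old centroid
                centroids[k] = members.mean(axis=0)
    return centroids


def bow_histogram(descriptors, centroids):
    """Encode one image's descriptors as a normalized histogram of visual words."""
    dists = np.linalg.norm(
        descriptors[:, None, :] - centroids[None, :, :], axis=2)
    words = dists.argmin(axis=1)  # index of the nearest visual word
    hist = np.bincount(words, minlength=len(centroids)).astype(float)
    return hist / hist.sum()


# Synthetic stand-ins for per-image descriptors (assumption: 8-D features,
# with a different number of keypoints in each image).
rng = np.random.default_rng(42)
image_descriptors = [rng.normal(size=(n, 8)) for n in (30, 45, 60)]

# The vocabulary is built from the descriptors of all training images.
vocabulary = build_vocabulary(np.vstack(image_descriptors), num_words=5)

# Every image becomes a vector of the same length, whatever its keypoint count.
features = np.array([bow_histogram(d, vocabulary) for d in image_descriptors])
print(features.shape)  # (3, 5)
```

Each row of `features` has one bin per visual word, so images with 30, 45, and 60 keypoints all end up as vectors of length 5 that a classifier can consume. In practice the descriptors would come from a keypoint detector and the vocabulary would be far larger.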
