Programming Computer Vision with Python

Name: Programming Computer Vision with Python
Author: Jan Erik Solem
ISBN: 9781449316549

by Jan Erik Solem

June 2012

Beginner to intermediate

260 pages

6h 28m

English

O'Reilly Media, Inc.

Programming Computer Vision with Python
Preface
  Prerequisites and Overview; What You Need to Know; What You Will Learn; Chapter Overview; Introduction to Computer Vision; Python and NumPy; Notation and Conventions; Using Code Examples; How to Contact Us; Safari® Books Online; Acknowledgments
1. Basic Image Handling and Processing
  1.1 PIL—The Python Imaging Library
    Convert Images to Another Format; Create Thumbnails; Copy and Paste Regions; Resize and Rotate
  1.2 Matplotlib
    Plotting Images, Points, and Lines; Image Contours and Histograms; Interactive Annotation
  1.3 NumPy
    Array Image Representation; Graylevel Transforms; Image Resizing; Histogram Equalization; Averaging Images; PCA of Images; Using the Pickle Module
  1.4 SciPy
    Blurring Images; Image Derivatives; Morphology—Counting Objects; Useful SciPy Modules; Reading and writing .mat files; Saving arrays as images
  1.5 Advanced Example: Image De-Noising
  Exercises
  Conventions for the Code Examples
2. Local Image Descriptors
  2.1 Harris Corner Detector
    Finding Corresponding Points Between Images
  2.2 SIFT—Scale-Invariant Feature Transform
    Interest Points; Descriptor; Detecting Interest Points; Matching Descriptors
  2.3 Matching Geotagged Images
    Downloading Geotagged Images from Panoramio; Matching Using Local Descriptors; Visualizing Connected Images
  Exercises
3. Image to Image Mappings
  3.1 Homographies
    The Direct Linear Transformation Algorithm; Affine Transformations
  3.2 Warping Images
    Image in Image; Piecewise Affine Warping; Registering Images
  3.3 Creating Panoramas
    RANSAC; Robust Homography Estimation; Stitching the Images Together
  Exercises
4. Camera Models and Augmented Reality
  4.1 The Pin-Hole Camera Model
    The Camera Matrix; Projecting 3D Points; Factoring the Camera Matrix; Computing the Camera Center
  4.2 Camera Calibration
    A Simple Calibration Method
  4.3 Pose Estimation from Planes and Markers
  4.4 Augmented Reality
    PyGame and PyOpenGL; From Camera Matrix to OpenGL Format; Placing Virtual Objects in the Image; Tying It All Together; Loading Models
  Exercises
5. Multiple View Geometry
  5.1 Epipolar Geometry
    A Sample Data Set; Plotting 3D Data with Matplotlib; Computing F—The Eight Point Algorithm; The Epipole and Epipolar Lines
  5.2 Computing with Cameras and 3D Structure
    Triangulation; Computing the Camera Matrix from 3D Points; Computing the Camera Matrix from a Fundamental Matrix; The uncalibrated case—projective reconstruction; The calibrated case—metric reconstruction
  5.3 Multiple View Reconstruction
    Robust Fundamental Matrix Estimation; 3D Reconstruction Example; Extensions and More Than Two Views; More views; Bundle adjustment; Self-calibration
  5.4 Stereo Images
    Computing Disparity Maps
  Exercises
6. Clustering Images
  6.1 K-Means Clustering
    The SciPy Clustering Package; Clustering Images; Visualizing the Images on Principal Components; Clustering Pixels
  6.2 Hierarchical Clustering
    Clustering Images
  6.3 Spectral Clustering
  Exercises
7. Searching Images
  7.1 Content-Based Image Retrieval
    Inspiration from Text Mining—The Vector Space Model
  7.2 Visual Words
    Creating a Vocabulary
  7.3 Indexing Images
    Setting Up the Database; Adding Images
  7.4 Searching the Database for Images
    Using the Index to Get Candidates; Querying with an Image; Benchmarking and Plotting the Results
  7.5 Ranking Results Using Geometry
  7.6 Building Demos and Web Applications
    Creating Web Applications with CherryPy; Image Search Demo
  Exercises
8. Classifying Image Content
  8.1 K-Nearest Neighbors
    A Simple 2D Example; Dense SIFT as Image Feature; Classifying Images—Hand Gesture Recognition
  8.2 Bayes Classifier
    Using PCA to Reduce Dimensions
  8.3 Support Vector Machines
    Using LibSVM; Hand Gesture Recognition Again
  8.4 Optical Character Recognition
    Training a Classifier; Selecting Features; Multi-Class SVM; Extracting Cells and Recognizing Characters; Rectifying Images
  Exercises
9. Image Segmentation
  9.1 Graph Cuts
    Graphs from Images; Segmentation with User Input
  9.2 Segmentation Using Clustering
  9.3 Variational Methods
  Exercises
10. OpenCV
  10.1 The OpenCV Python Interface
  10.2 OpenCV Basics
    Reading and Writing Images; Color Spaces; Displaying Images and Results
  10.3 Processing Video
    Video Input; Reading Video to NumPy Arrays
  10.4 Tracking
    Optical Flow; The Lucas-Kanade Algorithm; Using the tracker; Using generators
  10.5 More Examples
    Inpainting; Segmentation with the Watershed Transform; Line Detection with a Hough Transform
  Exercises
A. Installing Packages
  A.1 NumPy and SciPy
    Windows; Mac OS X; Linux
  A.2 Matplotlib
  A.3 PIL
  A.4 LibSVM
  A.5 OpenCV
    Windows and Unix; Mac OS X; Linux
  A.6 VLFeat
  A.7 PyGame
  A.8 PyOpenGL
  A.9 Pydot
  A.10 Python-graph
  A.11 Simplejson
  A.12 PySQLite
  A.13 CherryPy
B. Image Datasets
  B.1 Flickr
  B.2 Panoramio
  B.3 Oxford Visual Geometry Group
  B.4 University of Kentucky Recognition Benchmark Images
  B.5 Other
    Prague Texture Segmentation Datagenerator and Benchmark; MSR Cambridge Grab Cut Dataset; Caltech 101; Static Hand Posture Database; Middlebury Stereo Datasets
C. Image Credits
  C.1 Images from Flickr
  C.2 Other Images
  C.3 Illustrations
D. References
E. About the Author
Index
About the Author
Colophon
Copyright

Content preview from Programming Computer Vision with Python

Chapter 4. Camera Models and Augmented Reality

In this chapter, we will look at modeling cameras and how to use such models effectively. In the previous chapter, we covered image-to-image mappings and transforms. To handle mappings between 3D and images, the projection properties of the camera generating the image need to be part of the mapping. Here we show how to determine camera properties and how to use image projections for applications like augmented reality. In the next chapter, we will use the camera model to look at applications with multiple views and mappings between them.

4.1 The Pin-Hole Camera Model

The pin-hole camera model (sometimes called the projective camera model) is a widely used camera model in computer vision. It is simple and accurate enough for most applications. The name comes from the type of camera, like a camera obscura, that admits light through a small hole into a dark box or room. In the pin-hole camera model, light passes through a single point, the camera center, C, before it is projected onto an image plane. Figure 4-1 shows an illustration where the image plane is drawn in front of the camera center. The image plane in an actual camera would be upside down behind the camera center, but the model is the same.
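To make the geometry concrete, here is a minimal sketch of the projection in plain Python. It is an illustration under stated assumptions, not the book's code. It assumes the camera center C sits at the origin, the optical axis runs along the z axis, and the image plane lies at a distance f from C. The function names `project_pinhole` and `project_behind` and the parameter `f` are hypothetical names chosen for this example.

```python
def project_pinhole(point, f=1.0):
    """Project a 3D point (X, Y, Z), given in camera coordinates, onto
    an image plane at Z = f in front of the camera center (the origin).

    Assumption for this sketch: the optical axis is the z axis.
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    # Similar triangles: the ray from the origin through the point
    # meets the plane Z = f at (f*X/Z, f*Y/Z).
    return (f * X / Z, f * Y / Z)


def project_behind(point, f=1.0):
    """Project onto a plane at Z = -f behind the camera center, as in a
    physical camera. The same ray hits that plane at the negated
    coordinates, so the image is flipped."""
    x, y = project_pinhole(point, f)
    return (-x, -y)


point = (2.0, 1.0, 4.0)
front = project_pinhole(point, f=2.0)
behind = project_behind(point, f=2.0)
print(front, behind)  # (1.0, 0.5) (-1.0, -0.5)
```

The two results differ only in sign. This matches the remark that an actual camera forms an upside-down image behind the camera center while the model stays the same.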

The projection properties of a pin-hole camera can be derived from this illustration and the assumption that the image axes are aligned with the x and y axes of a 3D coordinate system. The optical axis of the camera then coincides with ...

Publisher Resources

ISBN: 9781449341916
