Programming Computer Vision with Python

by Jan Erik Solem

June 2012

Beginner to intermediate

260 pages

6h 28m

English

O'Reilly Media, Inc.

Prerequisites and Overview
    What You Need to Know; What You Will Learn; Chapter Overview
Introduction to Computer Vision
Python and NumPy
Notation and Conventions
Using Code Examples
How to Contact Us
Safari® Books Online
Acknowledgments

1.1 PIL—The Python Imaging Library
    Convert Images to Another Format; Create Thumbnails; Copy and Paste Regions; Resize and Rotate
1.2 Matplotlib
    Plotting Images, Points, and Lines; Image Contours and Histograms; Interactive Annotation
1.3 NumPy
    Array Image Representation; Graylevel Transforms; Image Resizing; Histogram Equalization; Averaging Images; PCA of Images; Using the Pickle Module
1.4 SciPy
    Blurring Images; Image Derivatives; Morphology—Counting Objects; Useful SciPy Modules; Reading and writing .mat files; Saving arrays as images
1.5 Advanced Example: Image De-Noising
Exercises
Conventions for the Code Examples

2.1 Harris Corner Detector
    Finding Corresponding Points Between Images
2.2 SIFT—Scale-Invariant Feature Transform
    Interest Points; Descriptor; Detecting Interest Points; Matching Descriptors
2.3 Matching Geotagged Images
    Downloading Geotagged Images from Panoramio; Matching Using Local Descriptors; Visualizing Connected Images
Exercises

3.1 Homographies
    The Direct Linear Transformation Algorithm; Affine Transformations
3.2 Warping Images
    Image in Image; Piecewise Affine Warping; Registering Images
3.3 Creating Panoramas
    RANSAC; Robust Homography Estimation; Stitching the Images Together
Exercises

4.1 The Pin-Hole Camera Model
    The Camera Matrix; Projecting 3D Points; Factoring the Camera Matrix; Computing the Camera Center
4.2 Camera Calibration
    A Simple Calibration Method
4.3 Pose Estimation from Planes and Markers
4.4 Augmented Reality
    PyGame and PyOpenGL; From Camera Matrix to OpenGL Format; Placing Virtual Objects in the Image; Tying It All Together; Loading Models
Exercises

5.1 Epipolar Geometry
    A Sample Data Set; Plotting 3D Data with Matplotlib; Computing F—The Eight Point Algorithm; The Epipole and Epipolar Lines
5.2 Computing with Cameras and 3D Structure
    Triangulation; Computing the Camera Matrix from 3D Points; Computing the Camera Matrix from a Fundamental Matrix; The uncalibrated case—projective reconstruction; The calibrated case—metric reconstruction
5.3 Multiple View Reconstruction
    Robust Fundamental Matrix Estimation; 3D Reconstruction Example; Extensions and More Than Two Views; More views; Bundle adjustment; Self-calibration
5.4 Stereo Images
    Computing Disparity Maps
Exercises

6.1 K-Means Clustering
    The SciPy Clustering Package; Clustering Images; Visualizing the Images on Principal Components; Clustering Pixels
6.2 Hierarchical Clustering
    Clustering Images
6.3 Spectral Clustering
Exercises

7.1 Content-Based Image Retrieval
    Inspiration from Text Mining—The Vector Space Model
7.2 Visual Words
    Creating a Vocabulary
7.3 Indexing Images
    Setting Up the Database; Adding Images
7.4 Searching the Database for Images
    Using the Index to Get Candidates; Querying with an Image; Benchmarking and Plotting the Results
7.5 Ranking Results Using Geometry
7.6 Building Demos and Web Applications
    Creating Web Applications with CherryPy; Image Search Demo
Exercises

8.1 K-Nearest Neighbors
    A Simple 2D Example; Dense SIFT as Image Feature; Classifying Images—Hand Gesture Recognition
8.2 Bayes Classifier
    Using PCA to Reduce Dimensions
8.3 Support Vector Machines
    Using LibSVM; Hand Gesture Recognition Again
8.4 Optical Character Recognition
    Training a Classifier; Selecting Features; Multi-Class SVM; Extracting Cells and Recognizing Characters; Rectifying Images
Exercises

9.1 Graph Cuts
    Graphs from Images; Segmentation with User Input
9.2 Segmentation Using Clustering
9.3 Variational Methods
Exercises

10.1 The OpenCV Python Interface
10.2 OpenCV Basics
    Reading and Writing Images; Color Spaces; Displaying Images and Results
10.3 Processing Video
    Video Input; Reading Video to NumPy Arrays
10.4 Tracking
    Optical Flow; The Lucas-Kanade Algorithm; Using the tracker; Using generators
10.5 More Examples
    Inpainting; Segmentation with the Watershed Transform; Line Detection with a Hough Transform
Exercises

A.1 NumPy and SciPy
    Windows; Mac OS X; Linux
A.2 Matplotlib
A.3 PIL
A.4 LibSVM
A.5 OpenCV
    Windows and Unix; Mac OS X; Linux
A.6 VLFeat
A.7 PyGame
A.8 PyOpenGL
A.9 Pydot
A.10 Python-graph
A.11 Simplejson
A.12 PySQLite
A.13 CherryPy

B.1 Flickr
B.2 Panoramio
B.3 Oxford Visual Geometry Group
B.4 University of Kentucky Recognition Benchmark Images
B.5 Other
    Prague Texture Segmentation Datagenerator and Benchmark; MSR Cambridge Grab Cut Dataset; Caltech 101; Static Hand Posture Database; Middlebury Stereo Datasets

C.1 Images from Flickr
C.2 Other Images
C.3 Illustrations

Content preview from Programming Computer Vision with Python

Chapter 7. Searching Images

This chapter shows how to use text mining techniques to search for images based on their visual content. It presents the basic ideas behind visual words, then explains the details of a complete setup and tests it on an example image data set.

7.1 Content-Based Image Retrieval

Content-based image retrieval (CBIR) deals with the problem of retrieving visually similar images from a (large) database of images. These can be images with similar color, similar textures, or similar objects or scenes: basically any information contained in the images themselves.

For high-level queries, like finding similar objects, it is not feasible to do a full comparison (for example using feature matching) between a query image and all images in the database. It would simply take too much time to return any results if the database is large. In the last couple of years, researchers have successfully introduced techniques from the world of text mining for CBIR problems, making it possible to search millions of images for similar content.

Inspiration from Text Mining—The Vector Space Model

The vector space model is a model for representing and searching text documents. As we will see, it can be applied to essentially any kind of object, including images. The name comes from the fact that text documents are represented with vectors that are histograms of the word frequencies in the text.^[18] In other words, the vector will contain the number of occurrences of every word (at the ...
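
To make the word-frequency idea concrete, here is a minimal sketch of how text documents could be turned into histogram vectors. It is an illustration only, not code from the book: the function names `build_vocabulary` and `word_histogram` are hypothetical, and it assumes the simplest possible tokenization (lowercasing and splitting on whitespace).

```python
from collections import Counter


def build_vocabulary(documents):
    """Return a sorted list of the distinct words across all documents.

    Assumption: words are obtained by lowercasing and splitting on whitespace.
    """
    words = set()
    for doc in documents:
        words.update(doc.lower().split())
    return sorted(words)


def word_histogram(document, vocabulary):
    """Count how often each vocabulary word occurs in one document."""
    counts = Counter(document.lower().split())
    # Counter returns 0 for words that do not occur in the document.
    return [counts[word] for word in vocabulary]


documents = [
    "the cat sat on the mat",
    "the dog chased the cat",
]

vocabulary = build_vocabulary(documents)
vectors = [word_histogram(doc, vocabulary) for doc in documents]

for doc, vec in zip(documents, vectors):
    print(doc, "->", vec)
```

Every document ends up as a vector of the same length, with one entry per vocabulary word, so documents can be compared entry by entry. The same representation carries over to images once something plays the role of a word.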