Programming Computer Vision with Python

by Jan Erik Solem

June 2012

Beginner to intermediate

260 pages

6h 28m

English

O'Reilly Media, Inc.

Content preview from Programming Computer Vision with Python

Chapter 6. Clustering Images

This chapter introduces several clustering methods and shows how to use them to find groups of similar images. Clustering can be used for recognition, for dividing data sets of images, and for organization and navigation. We also look at using clustering to visualize similarity between images.

6.1 K-Means Clustering

K-means is a very simple clustering algorithm that tries to partition the input data into k clusters. It works by iteratively refining an initial estimate of the class centroids as follows:

1. Initialize centroids μ_i, i = 1 . . . k, randomly or with some guess.
2. Assign each data point to the class c_i of its nearest centroid.
3. Update the centroids as the average of all data points assigned to that class.
4. Repeat 2 and 3 until convergence.
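
The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the book's implementation: the names `kmeans`, `squared_distance`, and `mean`, the `max_iter` and `seed` parameters, the choice to initialize from randomly sampled data points, and the rule of keeping the previous centroid when a cluster becomes empty are all hypothetical choices made for the example. Convergence is taken to mean that the assignments stop changing.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise average of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iter=100, seed=0):
    """Return (centroids, labels) for a list of points given as tuples."""
    rng = random.Random(seed)
    # Step 1: initialize the centroids with k distinct data points (an assumption).
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each point to the class of its nearest centroid.
        new_labels = [
            min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        # Step 4: stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: move each centroid to the average of its assigned points.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = mean(members)
    return centroids, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centroids, labels = kmeans(data, k=2)
    print(centroids, labels)
```

For real image data, NumPy arrays and a library routine would normally replace the list-based loops; the sketch only mirrors the four steps one to one.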

K-means tries to minimize the total within-class variance

$$V = \sum_{i=1}^{k} \sum_{x_j \in c_i} \lVert x_j - \mu_i \rVert^2$$

where x_j are the data vectors. The algorithm above is a heuristic refinement algorithm that works well in most cases, but it does not guarantee that the best solution is found. To reduce the effect of a bad centroid initialization, the algorithm is often run several times with different initial centroids, and the solution with the lowest variance V is selected.
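
To make the selection step concrete, the sketch below computes V for two candidate sets of centroids on the same data and keeps the one with the lowest value. The function name `within_class_variance`, the toy data, and the two candidate solutions are assumptions made for illustration; the candidates stand in for the results of two runs with different initializations. The sketch also assumes each point belongs to the class of its nearest centroid, so V reduces to a sum of minimum squared distances.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def within_class_variance(points, centroids):
    """Total within-class variance V, assuming nearest-centroid assignment."""
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


# Toy data: the four corners of a wide rectangle.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]

# Two hypothetical k-means results (k = 2) from different initializations.
# Both are stable under the assign/update steps, but only one is a good split.
candidates = [
    [(5, 0), (5, 2)],   # splits the data into bottom and top rows
    [(0, 1), (10, 1)],  # splits the data into left and right columns
]

# Keep the solution with the lowest variance V.
best = min(candidates, key=lambda c: within_class_variance(points, c))

for c in candidates:
    print(c, within_class_variance(points, c))  # V is 100 for the first, 4 for the second
print("selected:", best)
```

The first candidate shows why restarts matter: it is a fixed point of the iteration, so a single run that lands there would never improve on it.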

The main drawback of this algorithm is that the number of clusters needs to be decided beforehand, and an inappropriate choice will give poor clustering results. The ...