The visual local representation model, built on local features and a visual vocabulary, is a fundamental component of many existing computer vision systems. It is widely applied in object recognition, scene matching, and multimedia content search and analysis, and it is an active focus of current computer vision and multimedia analysis research. The pipeline first extracts local interest points from images, then quantizes them against a visual vocabulary, which acts as a quantization table that divides the feature space into visual words. Each image is then represented as a bag-of-visual-words descriptor and is inverted-indexed into all its corresponding ...
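The pipeline described above (local descriptors, quantization into visual words, a bag-of-visual-words descriptor per image, and an inverted index) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the book's method: the vocabulary is a hand-written list of centroids standing in for one learned by clustering, the descriptors are 2-D tuples rather than real local features, and every name here (`VOCABULARY`, `quantize`, `bag_of_words`, `build_inverted_index`, `query`) is hypothetical.

```python
from collections import Counter, defaultdict
from math import dist

# Assumption: a tiny, hand-written visual vocabulary. In practice the
# centroids would be learned from training descriptors, e.g. by k-means.
VOCABULARY = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]


def quantize(descriptor, vocabulary=VOCABULARY):
    """Return the id of the nearest centroid, i.e. the visual word."""
    return min(range(len(vocabulary)),
               key=lambda w: dist(descriptor, vocabulary[w]))


def bag_of_words(descriptors, vocabulary=VOCABULARY):
    """Count how often each visual word occurs in one image."""
    return Counter(quantize(d, vocabulary) for d in descriptors)


def build_inverted_index(images, vocabulary=VOCABULARY):
    """Map each visual word id to {image_id: occurrence count}."""
    index = defaultdict(dict)
    for image_id, descriptors in images.items():
        for word, count in bag_of_words(descriptors, vocabulary).items():
            index[word][image_id] = count
    return index


def query(index, descriptors, vocabulary=VOCABULARY):
    """Rank indexed images by histogram intersection with the query."""
    scores = Counter()
    for word, count in bag_of_words(descriptors, vocabulary).items():
        # Only images that contain this visual word are ever touched.
        for image_id, indexed_count in index.get(word, {}).items():
            scores[image_id] += min(count, indexed_count)
    return scores.most_common()


# Assumption: toy 2-D "descriptors"; real local features such as SIFT
# are much higher-dimensional (128-D for SIFT).
IMAGES = {
    "img_a": [(0.1, 0.2), (0.9, 1.1), (1.2, 0.8)],
    "img_b": [(4.8, 5.1), (5.2, 4.9), (0.0, 0.1)],
}
INDEX = build_inverted_index(IMAGES)
```

Calling `query(INDEX, [(5.0, 5.0), (4.9, 5.2)])` looks up only the posting list of the visual word those descriptors fall into, which is why an inverted index scales better than comparing the query against every image descriptor. Histogram intersection is used here purely for simplicity; weighted schemes such as TF-IDF are a common alternative.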
