7.2 Attention Based Object Detection and Recognition in a Natural Scene

In pure bottom-up computational visual attention models in the spatial domain (e.g., the BS model mentioned in Chapter 3), local feature extraction is part of the model, and after contrast processing and normalization, the locations of some candidate objects pop out on the resulting saliency map. Thus, we do not need to search the full scene by shifting a detection window pixel by pixel. The idea of visual attention has also enabled computer vision engineers to create fast object detection algorithms. In this section, a pure bottom-up model combined with a conventional object detection method is introduced first. Then some two-region segmentation methods based on the visual attention concept are presented, followed by object detection with a training set. Finally, the BS visual attention model with SIFT features for multiple object recognition is presented.

The methods introduced here mainly offer readers a strategy for incorporating visual attention models into object detection and recognition applications.

7.2.1 Object Detection Combined with Bottom-up Model

1. Simple detection method
The development of bottom-up computational models is closely linked to object detection. A simple method takes the most salient location in a scene as the location of the desired object or of an object candidate. Suppose we have obtained the saliency map (SM) of an image ...
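
The idea stated above, taking the most salient location as the object candidate, can be illustrated with a minimal Python sketch. It is not necessarily the book's own procedure: it assumes the saliency map is already available as a 2-D NumPy array, and the function names `most_salient_location` and `candidate_window`, as well as the fixed `half_size` window parameter, are hypothetical choices made only for this illustration.

```python
import numpy as np


def most_salient_location(saliency_map):
    """Return (row, col) of the maximum value of a 2-D saliency map."""
    sm = np.asarray(saliency_map, dtype=float)
    if sm.ndim != 2:
        raise ValueError("saliency map must be a 2-D array")
    # argmax works on the flattened array; unravel_index maps it back to 2-D.
    row, col = np.unravel_index(np.argmax(sm), sm.shape)
    return int(row), int(col)


def candidate_window(saliency_map, half_size=16):
    """Return (top, bottom, left, right) of a square window centred on the
    most salient location, clipped to the image bounds.

    half_size is an assumed, fixed window radius in pixels; the end
    indices are exclusive so they can be used directly for slicing.
    """
    sm = np.asarray(saliency_map, dtype=float)
    row, col = most_salient_location(sm)
    height, width = sm.shape
    top = max(row - half_size, 0)
    bottom = min(row + half_size + 1, height)
    left = max(col - half_size, 0)
    right = min(col + half_size + 1, width)
    return top, bottom, left, right


if __name__ == "__main__":
    # Toy saliency map with a single peak near the left border.
    sm = np.zeros((64, 64))
    sm[40, 10] = 1.0
    print(most_salient_location(sm))   # (40, 10)
    print(candidate_window(sm, 16))    # (24, 57, 0, 27)
```

A conventional detector would then only need to examine the returned window (for example `image[top:bottom, left:right]`) instead of every window position in the scene.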
