7.2 Attention-Based Object Detection and Recognition in a Natural Scene
In pure bottom-up computational visual attention models in the spatial domain (e.g., the BS model mentioned in Chapter 3), local feature extraction is part of the model itself, and after contrast processing and normalization, the locations of some candidate objects pop out on the resulting saliency map. We therefore do not need to search the full scene by shifting a detection window one pixel at a time. The idea of visual attention has also allowed computer vision engineers to design fast object detection algorithms. In this section, we first introduce a pure bottom-up model combined with a conventional object detection method. We then present several two-region segmentation methods based on the visual attention concept, followed by object detection that makes use of a training set. Finally, we present the BS visual attention model with SIFT features for multiple object recognition.
The methods introduced here mainly offer readers a new strategy for incorporating visual attention models into object detection and recognition applications.
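To make the saving over an exhaustive window search concrete, the following minimal sketch compares the number of window positions visited by a one-pixel sliding search with the number of candidate locations left after thresholding a saliency map. It is an illustration only: the saliency map is a synthetic Gaussian blob standing in for the output of a real attention model (it is not the BS model), and the function names, the 64 x 64 image size, the 16-pixel window, and the relative threshold of 0.5 are all assumptions chosen for the example.

```python
import numpy as np


def toy_saliency_map(height=64, width=64, center=(20, 40), sigma=4.0):
    """Synthetic saliency map: one Gaussian blob (a stand-in for a real model's output)."""
    rows, cols = np.mgrid[0:height, 0:width]
    dist_sq = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    return np.exp(-dist_sq / (2.0 * sigma ** 2))


def exhaustive_window_count(height, width, win):
    """Number of positions visited when a win x win window slides one pixel at a time."""
    return (height - win + 1) * (width - win + 1)


def candidate_centers(saliency, rel_threshold=0.5):
    """(row, col) coordinates whose saliency is at least rel_threshold * max saliency.

    The relative threshold is an assumption made for this sketch; a real system
    might instead pick local maxima or use winner-take-all selection.
    """
    mask = saliency >= rel_threshold * saliency.max()
    return np.argwhere(mask)


if __name__ == "__main__":
    saliency = toy_saliency_map()
    centers = candidate_centers(saliency)
    total = exhaustive_window_count(*saliency.shape, win=16)
    print(f"exhaustive window positions: {total}")
    print(f"salient candidate locations: {len(centers)}")
```

A conventional detector would then be evaluated only on windows centered at the returned candidate locations, rather than at every position in the scene.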