4.5 Pulsed Discrete Cosine Transform Approach

Computational models of visual attention in the frequency domain are faster and perform better than models in the spatial domain. However, it is not clear why they locate salient regions in input scenes so well, or what their biological basis is. Since there is no mechanism in the brain similar to the Fourier transform, frequency domain models have no direct biological basis, though some simple cells in the primary visual cortex may extract frequency features from input stimuli. One explanation comes from the development of connection weights in a feed-forward neural network, as proposed in [7, 46]. It is known that the connection weights between neurons in the human brain are commonly obtained by the Hebbian learning rule [47, 48], and many previous studies have shown that a single-layer feed-forward neural network, trained on large amounts of data with the Hebbian learning rule, can find the principal components of the input data [49, 50]. The adjustment of the connection weights between input and output neurons during the learning stage is analogous to the development stage of the visual system. When the connections are nearly stable, the neural network behaves like a linear transform from the input image to its principal components. Principal components analysis (PCA), mentioned in Section 2.6.3, can capture the main information of the visual inputs, which is probably related to the spatial frequency ...
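To make the link between Hebbian learning and PCA concrete, the following is a minimal sketch (not the book's code) of Oja's variant of the Hebbian rule on a single linear neuron. The data, dimensions, and learning rate are illustrative assumptions; the point is only that the weight vector converges to the first principal component of the inputs, as the studies cited above describe.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic zero-mean data with one dominant direction (assumed inputs).
n_samples, dim = 5000, 8
basis = rng.normal(size=(dim, dim))
scales = np.array([5.0, 2.0, 1.0, 0.5, 0.3, 0.2, 0.1, 0.05])
X = (rng.normal(size=(n_samples, dim)) * scales) @ basis.T
X -= X.mean(axis=0)

# Random initial connection weights, normalized.
w = rng.normal(size=dim)
w /= np.linalg.norm(w)
eta = 1e-3  # learning rate (hypothetical choice)

for x in X:
    y = w @ x                    # neuron output: pre- times post-synaptic term
    w += eta * y * (x - y * w)   # Oja's rule: Hebbian growth plus weight decay

# Compare the learned weights with the top eigenvector of the
# data covariance, i.e., the first principal component from PCA.
cov = X.T @ X / n_samples
eigvals, eigvecs = np.linalg.eigh(cov)
pc1 = eigvecs[:, -1]
print("alignment |cos|:", abs(w @ pc1) / np.linalg.norm(w))
```

After training on enough samples, the printed alignment approaches 1, i.e., the stabilized network implements the same linear transform onto the principal subspace that PCA computes directly.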
