6.5 Quantifying the Performance of a Saliency Model to Human Eye Movement in Static and Dynamic Scenes
It is known that in overt attention, the locations of eye fixations in an image are usually salient places. To estimate the performance of a computational model, another measure is proposed in [22–25]: the difference between the mean saliency value sampled from the saliency map at human fixation locations and the mean value sampled at the locations of random saccades. Because human fixation locations differ from the locations of random saccades, a model whose saliency map yields a larger difference matches human behaviour better. The idea was later applied to dynamic scenes through a more reasonable measure, the Kullback-Leibler (KL) distance between the probability densities obtained for human and for random saccades. The KL divergence (also called KL distance or KL score) compares two probability densities, as described in Sections 2.6 and 3.8. A model with a higher KL score performs better because it predicts human eye movements more accurately, so the KL score is widely used to compare computational models [26–30].
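As a rough illustration of the two measures just described, the following minimal sketch computes the mean-saliency difference and a histogram-based KL score. It rests on several assumptions that are not taken from the cited works: the saliency map is a list of rows of numbers, locations are (row, column) pairs, the densities are estimated with a fixed number of histogram bins, and the one-directional form KL(human || random) is used. The function names and the parameters n_bins and eps are hypothetical.

```python
import math


def sample(saliency_map, points):
    """Read the saliency values at the given (row, col) locations."""
    return [saliency_map[r][c] for r, c in points]


def mean_saliency_difference(saliency_map, human_points, random_points):
    """Mean saliency at human fixations minus mean saliency at random locations."""
    human = sample(saliency_map, human_points)
    rand = sample(saliency_map, random_points)
    return sum(human) / len(human) - sum(rand) / len(rand)


def histogram(values, n_bins, lo, hi, eps):
    """Normalised histogram over [lo, hi]; eps keeps every bin non-zero."""
    counts = [0] * n_bins
    width = (hi - lo) or 1.0  # guard against a constant saliency map
    for v in values:
        idx = min(int((v - lo) / width * n_bins), n_bins - 1)
        counts[idx] += 1
    total = sum(counts) + n_bins * eps
    return [(c + eps) / total for c in counts]


def kl_score(saliency_map, human_points, random_points, n_bins=10, eps=1e-9):
    """KL(human || random) between histograms of sampled saliency values.

    Assumption: one-directional KL with natural logarithm; a symmetric
    variant could be substituted without changing the overall idea.
    """
    flat = [v for row in saliency_map for v in row]
    lo, hi = min(flat), max(flat)
    p = histogram(sample(saliency_map, human_points), n_bins, lo, hi, eps)
    q = histogram(sample(saliency_map, random_points), n_bins, lo, hi, eps)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

With a map that is bright only inside a small region, human points placed inside that region and random points scattered over the whole image give a positive value for both measures, whereas passing the same point set for both arguments gives a KL score of zero.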
To explain how the KL score is computed, we first discuss the correlation between the saliency map and eye movements and give some simple measures; we then present the estimation of the KL score.
A simple performance comparison measure between salience to human eye movement and to computational model in a static scene is proposed in . ...