6.5 Quantifying the Performance of a Saliency Model to Human Eye Movement in Static and Dynamic Scenes

It is known that in overt attention, eye fixation locations in an image are usually salient places. To estimate the performance of a computational model, another measure is proposed in [22–25]: the difference between the mean saliency values sampled from a saliency map at human fixation locations and at the locations of random saccades. Since human fixation locations differ from the locations of random saccades, a computational model that yields a larger difference from random saccades has a saliency map that matches human behaviour better. The idea was later extended to dynamic scenes with a more reasonable measure, the Kullback-Leibler (KL) distance between the probability densities for human and random saccades. The KL divergence (also called the KL distance or KL score) compares two probability densities, as described in Sections 2.6 and 3.8. A model with a higher KL score performs better because it predicts human eye-tracking data better, so the KL score is widely used to compare computational models [26–30].
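The two measures described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure of [22–30]: the function names, the (row, column) point format, the histogram estimate of the densities with 10 bins, the small constant `eps`, and the direction of the divergence (human relative to random) are all assumptions made here for illustration.

```python
import numpy as np


def sample_saliency(saliency_map, points):
    """Return the saliency values at integer (row, col) locations."""
    pts = np.asarray(points, dtype=int)
    return saliency_map[pts[:, 0], pts[:, 1]]


def mean_saliency_difference(saliency_map, human_points, random_points):
    """Mean saliency at human fixations minus mean saliency at random locations."""
    human = sample_saliency(saliency_map, human_points)
    rand = sample_saliency(saliency_map, random_points)
    return float(human.mean() - rand.mean())


def kl_score(saliency_map, human_points, random_points, bins=10, eps=1e-12):
    """KL divergence between histograms of saliency values sampled at
    human and at random locations (assumed direction: human || random)."""
    human = sample_saliency(saliency_map, human_points)
    rand = sample_saliency(saliency_map, random_points)
    # Use a shared value range so both histograms have identical bins.
    value_range = (float(saliency_map.min()), float(saliency_map.max()))
    p, _ = np.histogram(human, bins=bins, range=value_range)
    q, _ = np.histogram(rand, bins=bins, range=value_range)
    # Normalise counts to probabilities; eps avoids log(0) and division by zero.
    p = p / p.sum() + eps
    q = q / q.sum() + eps
    return float(np.sum(p * np.log(p / q)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic saliency map: one Gaussian blob centred at (32, 32).
    rows, cols = np.mgrid[0:64, 0:64]
    sal = np.exp(-((rows - 32) ** 2 + (cols - 32) ** 2) / (2 * 8.0 ** 2))
    # Hypothetical "human" fixations cluster near the blob;
    # random saccade targets are drawn uniformly over the image.
    human_pts = np.clip(rng.normal(32, 6, size=(200, 2)), 0, 63).astype(int)
    random_pts = rng.integers(0, 64, size=(200, 2))
    print("mean difference:", mean_saliency_difference(sal, human_pts, random_pts))
    print("KL score:", kl_score(sal, human_pts, random_pts))
```

With these assumptions, a map whose high-saliency regions coincide with the human fixations gives a positive mean difference and a larger KL score, while identical human and random samples give a KL score of zero.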

To explain how the KL score is computed, we first discuss the correlation between the saliency map and eye movements and describe some simple measurement methods, and then present the KL score estimation.

A simple measure for comparing the salience computed by a model with human eye movements in a static scene is proposed in [22]. ...
