4.8 Modelling from a Bit-stream

As described in Section 4.5, the DCT can be used to build computational models of visual attention. This section introduces a saliency detection model that operates in the compressed domain [10]. Most existing saliency detection models are built in the image (uncompressed) domain, yet images in storage and on the internet are typically kept in compressed formats such as JPEG. The model in [10] extracts the intensity, colour and texture features of an image from the DCT coefficients in a JPEG bit-stream, and obtains the saliency value of each DCT block from Hausdorff distance calculation and feature map fusion.

Because JPEG compression applies the DCT at the 8 × 8-px block level, the features are extracted for each 8 × 8-px block. Although the minimum coded unit (MCU) can be as large as 16 × 16 px (for the 4:2:0 subsampling format), saliency detection in this model is still performed per 8 × 8 DCT block. The saliency map of an image is then calculated from weighted feature differences between DCT blocks.
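The block-level computation described above can be illustrated with a minimal Python sketch. It is not the model of [10]: the DCT is recomputed from pixel data rather than read from an entropy-decoded bit-stream, only the DC coefficient is used (as a stand-in intensity feature), a plain absolute difference replaces the Hausdorff distance, and the Gaussian spatial weighting with its parameter sigma is an assumption made for illustration. The function names dct_matrix, block_dc and block_saliency are hypothetical.

```python
import numpy as np

BLOCK = 8  # JPEG applies the DCT to 8 x 8-px blocks


def dct_matrix(n=BLOCK):
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)        # DC row has its own scale factor
    return m


def block_dc(channel):
    """DC coefficient of every 8 x 8 block of a 2-D array.

    Assumption: the array stands in for one decoded image channel; a real
    compressed-domain model would take the coefficients from the bit-stream.
    """
    h, w = channel.shape
    h8, w8 = h // BLOCK, w // BLOCK
    d = dct_matrix()
    # Split into a (h8, w8, 8, 8) grid of blocks; partial edge blocks are dropped.
    blocks = (channel[:h8 * BLOCK, :w8 * BLOCK]
              .astype(float)
              .reshape(h8, BLOCK, w8, BLOCK)
              .swapaxes(1, 2))
    coeffs = d @ blocks @ d.T         # 2-D DCT of each block via broadcasting
    return coeffs[..., 0, 0]


def block_saliency(feature, sigma=2.0):
    """Saliency per block as a weighted sum of feature differences.

    Assumption: weights fall off as a Gaussian of the distance between
    block positions, measured in block units.
    """
    h8, w8 = feature.shape
    ys, xs = np.mgrid[0:h8, 0:w8]
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    f = feature.ravel().astype(float)
    dist2 = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(axis=2)
    weight = np.exp(-dist2 / (2.0 * sigma ** 2))
    np.fill_diagonal(weight, 0.0)     # a block is not compared with itself
    diff = np.abs(f[:, None] - f[None, :])
    sal = (weight * diff).sum(axis=1).reshape(h8, w8)
    peak = sal.max()
    return sal / peak if peak > 0 else sal  # normalise to [0, 1] when possible
```

Calling block_saliency(block_dc(channel)) on a 2-D array returns one value per 8 × 8 block, so a block whose DC coefficient differs strongly from those of nearby blocks receives a high value. The pairwise comparison needs memory quadratic in the number of blocks, which is acceptable for a sketch but not for large images.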

4.8.1 Feature Extraction from a JPEG Bit-stream

The JPEG Baseline method, which is based on the DCT, is the most widely used image compression method [70]. The JPEG bit-stream is entropy decoded to obtain the quantized DCT coefficients. As Huffman coding ...


Selective Visual Attention: Computational Models and Applications by Liming Zhang, Weisi Lin
