5.4 Video Plus Depth Coding

We have reviewed many different approaches to compressing stereo video sequences. All of the approaches discussed so far take their information directly from the pixels captured by the stereo cameras and rely on the existing video coding framework to deliver the video with the highest possible fidelity. Coding performance can potentially be improved further during the stereo video codec design phase by exploiting the geometry of the camera array and its relation to the 3D scene. One potential solution is the depth-image-based rendering (DIBR) approach [19, 20].

The inputs to the DIBR process are one color image frame, which carries the color information, and one depth map, which gives the depth of each corresponding pixel in the image. With the aid of the depth information, the 3D coordinates of every pixel in the input frame can be recovered. The DIBR process then sets up two virtual cameras in a parallel configuration, representing the left and right views, and projects the reconstructed 3D points back onto the 2D image plane of each virtual camera, as sketched below.
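The following sketch illustrates this warping step for a parallel (shift-sensor) camera configuration, assuming the depth map stores metric depth Z in the same units as the camera baseline. The function name, parameters, and the simple forward-warping scheme are illustrative assumptions rather than the reference implementation; occlusion handling and hole filling, which practical DIBR systems require, are omitted.

```python
import numpy as np

def dibr_warp(color, depth, focal_px, baseline, direction):
    """Forward-warp `color` into a virtual view shifted by +/- baseline/2.

    color     : (H, W, 3) array, the image captured at the original viewpoint
    depth     : (H, W) array of per-pixel depth Z (same units as baseline)
    focal_px  : focal length f expressed in pixels
    baseline  : distance between the two virtual cameras
    direction : +1 for the virtual left view, -1 for the virtual right view
    """
    h, w, _ = color.shape
    virtual = np.zeros_like(color)

    # Horizontal disparity of each pixel: shift = f * (b / 2) / Z.
    disparity = focal_px * (baseline / 2.0) / np.maximum(depth, 1e-6)

    xs = np.arange(w)
    for y in range(h):
        # New column positions on the virtual camera's image plane.
        x_new = np.round(xs + direction * disparity[y]).astype(int)
        valid = (x_new >= 0) & (x_new < w)
        virtual[y, x_new[valid]] = color[y, xs[valid]]

    return virtual  # disoccluded regions remain black (holes)
```

A pair of virtual views would then be produced as, for example, `left = dibr_warp(frame, depth, f, b, +1)` and `right = dibr_warp(frame, depth, f, b, -1)`; the holes left by forward warping are typically filled by inpainting in a complete system.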

Figure 5.13 illustrates the details of the DIBR process. Let CC represent the viewpoint of the original capturing camera, CL the viewpoint of the virtual left camera, and CR the viewpoint of the virtual right camera. Denote f as the focal length of the camera CC, and d as ...
