WAM 14.5
DISPARITY-COMPENSATED CODING USING MAC FOR STEREOSCOPIC VIDEO
Sukhee Cho, Kugjin Yun, Byungjun Bae, and Youngkow Hahm
Electronics and Telecommunications Research Institute, Korea
Abstract
We propose a novel disparity-compensated coding algorithm using the MAC (Multiple Auxiliary Component) of MPEG-4 for stereoscopic video. MAC was added to the MPEG-4 Visual version 2 in order to describe the transparency of the video object. The major difference between the existing coding method and the proposed coding method is the addition of residual texture coding. The proposed coding method assigns the disparity map and the residual texture to 3 components of MAC: one component for the disparity map and the remaining 2 components for the luminance and chrominance data of the residual texture, respectively.
1. Introduction
There are conventional coding methods for stereoscopic video sequences, such as MVP (Multi-View Profile) in MPEG-2 and TS (Temporal Scalability) in MPEG-4 [1][2]. These coding methods consist of a base layer and an enhancement layer: the base layer for the left-view image and the enhancement layer for the right-view image. Left-view images are compressed using motion texture coding, and right-view images are compressed using block-based motion/disparity compensated coding. However, this coding scheme is scarcely used in practical implementations on the market. The reasons are twofold. First, because the scheme has a complex structure, implementing an MVP or TS codec is too costly. Second, the scheme outputs two encoded streams, and the two-stream approach as used in temporal scalability is technically inconvenient because it affects the codec system as a whole. In other words, the two-stream approach needs multiplexing and de-multiplexing support, together with frame synchronization among the different views, from the transport layer. The one-stream approach, on the other hand, affects only the video codec part of the whole system chain and is systematically simpler than the two-stream approach.
MPEG-4 MAC is a good mechanism through which one-stream stereoscopic video coding can be obtained. MAC was added to the MPEG-4 Visual version 2 [1] in order to describe the transparency of the video object. In the current MAC definition, the disparity map is already included in the specification to help generate another view from the VOP texture. Stereoscopic video coding using MAC can use both motion and disparity compensation. However, the current MAC has no syntax or semantics for the residual texture data of the disparity-compensated image, even though real stereoscopic video cannot guarantee the quality of the reconstructed image without the residual texture data. Therefore, we propose a disparity-compensated stereoscopic coding method using the MAC structure. The proposed coding method for stereoscopic video sequences has the merit that multiplexing of the bit-streams is not necessary.
0-7803-7721-4/03 $17.00 © 2003 IEEE
2. Disparity-compensated coding using MAC
Fig. 1. Block diagram of disparity-compensated stereoscopic video coding using the extended MAC semantics.
The disparity-compensated stereoscopic video coding uses the extended MAC with a disparity map and a residual texture for the right image, as shown in Fig. 1.
The disparity map is generated on a pixel-by-pixel basis using a well-known disparity estimation algorithm. The major difference between the conventional coding method and the proposed coding method is the addition of residual texture coding. The residual texture is the difference image between the original right image and the disparity-compensated right image, which is obtained from the locally reconstructed left image and the locally reconstructed disparity map. The proposed coding method assigns the disparity map and the residual texture to 3 components of MAC: one component for the disparity map, one component for the luminance data of the residual texture, and the remaining one for the chrominance data of the residual texture. Since the chrominance data are half the size of the luminance data in the 4:2:0 format, the chrominance data are placed in the first half of the last MAC component. The coding of the last MAC component therefore ignores the second half, which contains garbage data.
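The residual texture and the chrominance packing described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the left image and the disparity map passed in are already the locally reconstructed ones, that disparity is a purely horizontal integer shift with the convention right(x, y) ≈ left(x + d, y), and that Cb is stored before Cr in raster order. The paper specifies none of these details, and the function names are hypothetical.

```python
def disparity_compensate(left, disparity):
    """Predict the right image from the left image and a per-pixel disparity map.

    Assumed convention (not stated in the paper): right(x, y) ~ left(x + d(x, y), y),
    with the source column clamped to the image borders.
    """
    width = len(left[0])
    predicted = []
    for left_row, disp_row in zip(left, disparity):
        row = []
        for x, d in enumerate(disp_row):
            src_x = min(max(x + d, 0), width - 1)
            row.append(left_row[src_x])
        predicted.append(row)
    return predicted


def residual_texture(right, predicted):
    """Signed difference between the original and the disparity-compensated right image."""
    return [[r - p for r, p in zip(r_row, p_row)]
            for r_row, p_row in zip(right, predicted)]


def pack_chroma(cb, cr, height, width):
    """Place 4:2:0 chrominance residuals in the first half of a luminance-sized plane.

    Assumed layout: Cb samples followed by Cr samples in raster order. The second
    half is filled with zeros here; the text treats it as ignored garbage data.
    """
    samples = [v for row in cb for v in row] + [v for row in cr for v in row]
    total = height * width
    if len(samples) * 2 != total:
        raise ValueError("4:2:0 chroma must be exactly half the luminance size")
    flat = samples + [0] * (total - len(samples))
    return [flat[y * width:(y + 1) * width] for y in range(height)]


if __name__ == "__main__":
    left = [[10, 20, 30, 40], [50, 60, 70, 80]]
    disparity = [[1, 1, 1, 0], [0, 0, 0, 0]]
    right = [[21, 29, 41, 40], [50, 61, 70, 79]]
    predicted = disparity_compensate(left, disparity)
    print(residual_texture(right, predicted))
    print(pack_chroma([[3, 4]], [[5, 6]], height=2, width=4))
```

The residual is kept signed in this sketch. How it is mapped into the sample range of a MAC component is not described in the text and is left out.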
3. Experimental results
We evaluate the performance of the proposed coding method by comparing it with three other coding methods in terms of PSNR values. The four coding methods are as follows.
• Coding_method1 (conventional coding using MAC) compresses the left images and the disparity map for the corresponding right images.
• Coding_method2 (the proposed coding using MAC) has two variants. Coding_method2-1 assigns the disparity map and the luminance residual texture to 2 components of MAC. Coding_method2-2 assigns the disparity map and the luminance and chrominance residual texture to 3 components.
• Coding_method3 (independent 2D coding of the two views) compresses the left and right images independently using the existing coding [1].
• Coding_method4 (coding using Temporal Scalability) is a stereoscopic video coding method using Temporal Scalability. It consists of a base layer and an enhancement layer: the base layer for the left-view image and the enhancement layer for the right-view image [1].
Figure 2 shows the average PSNR values of the reconstructed right-view image when the left-view image is compressed with QP(I,P,B)=(4,8,12) for the ‘Puppy’ and ‘Soccer2’ sequences. The bit-rates on the x-axis are obtained by testing several arbitrary QP values. For both sequences, Coding_method4 and Coding_method1 have the highest and the lowest PSNR values, respectively. Coding_method2-1 and Coding_method2-2 also have the same PSNR values. In the comparison of Coding_method1 and Coding_method2, both of which use MAC, Coding_method2 has PSNR values about 4 dB to 7 dB higher than Coding_method1 at all bit-rates for both sequences. Coding_method1 has nearly constant PSNR values at all bit-rates, about 26 dB and 32 dB for the ‘Puppy’ and ‘Soccer2’ sequences, respectively.
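For reference, the PSNR figures discussed above follow the usual definition, which can be sketched as below. The sketch assumes 8-bit samples (peak value 255) and a single image plane. The paper does not state whether its PSNR is computed on luminance only or how it is averaged over frames, so the function name and those choices are assumptions.

```python
import math


def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized image planes."""
    squared_error = 0
    count = 0
    for o_row, r_row in zip(original, reconstructed):
        for o, r in zip(o_row, r_row):
            squared_error += (o - r) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak * peak / mse)


if __name__ == "__main__":
    original = [[10, 10], [200, 200]]
    reconstructed = [[11, 9], [201, 199]]
    print(round(psnr(original, reconstructed), 2))
```

An average PSNR over a sequence, as plotted in Fig. 2, would then be the mean of the per-frame values under the same assumptions.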
Fig. 2. PSNR values of the reconstructed right-view image when the left-view image is compressed by QP(I,P,B)=(4,8,12).
4. Conclusions and future works
We conclude that Coding_method4, which uses Temporal Scalability coding, is the best method in terms of PSNR values for both sequences. However, among the coding methods using MAC, the proposed disparity-compensated coding method with residual texture, Coding_method2, gives better results than the conventional coding method, Coding_method1. Further work is needed on better rate allocation between the disparity map and the residual right-view image, and on better prediction methods such as global disparity/motion estimation, within this one-stream approach based on the disparity-compensated stereoscopic video coding scheme using MAC.
Acknowledgement
This work is supported by the Ministry of Information and Communication of Korea under the title of “The development of SmarTV technology.”
References
[1] Generic Coding of Audio-Visual Objects - Part 2: Visual, ISO/IEC 14496-2:2001.
[2] Chen X. and Luthra A., “MPEG-2 Multi-View Profile and its Application in 3DTV,” International Society for Optical Engineering Proceedings of SPIE, vol. 3021, pp. 212-223, San Diego, USA, 1997.