Low Complexity Video Encoding with One-Bit Transform based Network-Driven Motion Estimation

Hwa-Yong Oh, Sarp Ertürk, Member, IEEE, Tae-Gyu Chang, Member, IEEE

Abstract — Low complexity video encoding is desired in many applications requiring real-time encoding of video, particularly for mobile equipment. In order to reduce encoding complexity without much sacrifice in visual quality, some of the complexity can typically be shifted to the decoder, or the network infrastructure can take part in the process by performing some encoding and/or decoding tasks. In this paper a new hybrid video coding scheme is proposed that utilizes the standard motion compensated predictive coding architecture and uses one-bit transform (1BT) representations of video frames to facilitate remote motion estimation. The one-bit transform of each video frame is computed at the encoder using low complexity operations and sent, after entropy encoding, to the decoder or network infrastructure for motion estimation. After motion estimation is carried out remotely, the motion vectors are sent back to the encoder for motion compensated predictive coding. In order to reduce the overhead introduced by one-bit transforms, differential encoding of 1BTs is also investigated. The proposed approach can provide low complexity motion compensated predictive video encoding by shifting the high computational load of motion estimation to the decoder or network infrastructure.

Index Terms — One-bit Transform, Motion Compensated Predictive Coding, Low-Complexity Video Encoding, Network-Driven Motion Estimation.
I. INTRODUCTION

Emerging consumer electronics applications such as wireless video communications, wireless video cameras, disposable video cameras, and networked camcorders require low complexity encoders because of memory, computation, and power consumption limitations [1]. Motion compensated predictive coding (MCPC) approaches, utilized in the MPEG video coding standards and included in the ITU-T H.26x recommendations, have significantly higher complexity at the encoder side compared to the decoder side, mostly due to the motion estimation that is carried out at the encoder to exploit the temporal redundancy present in successive frames. The video encoder in MCPC systems is regarded to be 5 to 10 times more complex than the decoder [2], and motion estimation
Hwa-Yong Oh is with the School of Electrical and Electronics Engineering, Chung-Ang University, Seoul, Korea (e-mail: [email protected]). Sarp Ertürk is with the IITA Professorship Program, School of Electrical and Electronics Engineering, Chung-Ang University, Seoul, Korea, and also with the Department of Electronics and Telecommunications Engineering, University of Kocaeli, 41040 Kocaeli, Turkey (e-mail: [email protected]). Tae-Gyu Chang is with the School of Electrical and Electronics Engineering, Chung-Ang University, Seoul, Korea (e-mail: [email protected]).

Contributed Paper
Manuscript received February 1, 2007
alone is regarded to comprise 50-70% of the computational load of the entire coding process in these schemes. This asymmetry is particularly desirable for broadcasting type applications where the video is encoded once (usually a priori) and decoded many times at different receivers; however, it is not suitable for applications that require low complexity encoding.

In [3] a video coding scheme for wireless communications is presented, in which it has been proposed to avoid motion estimation at the encoder by predicting motion vectors at the network infrastructure from previous frames, after which the estimated motion vectors are transmitted to both the encoding and the decoding mobile terminal. The approach basically predicts the motion vector of the current frame $I_t$ using the motion information estimated from the previously reconstructed frames $I^*_{t-1}$ and $I^*_{t-2}$. However, this motion prediction approach is not effective, as the motion estimated from previous frames is never ensured to persist in the current frame; furthermore, motion estimation is typically carried out to minimize the error (or maximize the similarity) between a certain set of pixels and a reference pixel set, and this process is not ensured to uncover the true motion. A similar approach is proposed in [4], and referred to as Network-Driven Motion Estimation, which again predicts the motion vectors of the current frame from the motion vectors of the previous frames.

Alternatively, it has been proposed to utilize Wyner-Ziv coding of video, where individual frames are encoded independently (intraframe encoding) but decoded conditionally (interframe decoding), as presented in [2]. While both [2] and [3] have used the term Distributed Video Coding for the process of reducing the encoder complexity, possibly at the cost of increased decoder complexity, recent research has widely explored the Wyner-Ziv encoding approach, and currently the term distributed video coding is basically used to denote a system in which individual frames are encoded independently but decoded conditionally. Many variants of the Wyner-Ziv encoding scheme, as well as performance analyses and comparative evaluations, have been proposed in the recent literature [4-31]. These techniques will not be explained in detail, as Wyner-Ziv coding is not of main concern in this paper, but the reader is referred to [32] for a fairly recent review of such schemes. In [33], a Wyner-Ziv video coding method based on network-driven motion estimation (NDME) has been proposed to reduce the dependency on intra modes so as to improve the efficiency of Wyner-Ziv video coding. In the Wyner-Ziv NDME the
motion estimation is performed at the decoder, and the motion vectors are then sent back to the encoder through a backward channel. The encoder performs motion compensation and the residual coefficients are transform coded. This approach is improved in [33] by using a symmetrical Reversible Variable Length Code (RVLC) for the motion vectors being sent through the backward channel, to provide resiliency of motion vectors to transmission errors and delays. The high-complexity motion search is done at the decoder, and the use of predictive coded frames is increased by transmitting the motion information from the decoder to the encoder, improving coding efficiency without increasing encoder complexity. However, as information about the current frame is not available at the decoder to facilitate motion estimation, motion estimation is performed on the previous reconstructed frames at the decoder, and motion vectors for the current frame are predicted from the motion vectors of the previous decoded frames, similar to [3] and [4]. Therefore, as in [3] and [4], the prediction accuracy of motion vectors is also a problem in this case.

In this paper one-bit transform (1BT) side information based network-driven motion estimation (NDME) is proposed. The encoder computes the one-bit transform of the frame using low-complexity operations and sends the entropy encoded 1BT information to the network infrastructure (or decoder) for motion estimation. The motion vectors are sent back to the encoder through the backward channel (in a similar approach to [3] and [4]) to facilitate motion compensated predictive encoding of video. The proposed approach shifts the high-complexity motion estimation part from the encoder to the decoder, to reduce the computational load of the encoder. As motion vectors are estimated at the decoder using the 1BT of the current frame, acceptable motion vectors are estimated instead of the unreliable predictions obtained using motion vectors of previously decoded frames, which is typically the case in [3], [4], [32], [33]. Hence, because reliable motion vectors are estimated, the encoder can utilize conventional interframe (i.e. MCPC) video coding, with no need for transcoding at the network infrastructure in wireless communication applications, for example.

II. ONE-BIT TRANSFORM BASED NETWORK-DRIVEN MOTION ESTIMATION

Techniques for motion estimation using lower bit-depth representations have been proposed in the literature to reduce the computational complexity compared to Sum of Absolute Differences (SAD) or Mean Squared Error (MSE) matching in motion estimation [34-37]. The one-bit transform (1BT) converts the image plane to an appropriate single bit/pixel representation so that motion estimation can basically be carried out on a binary image, reducing the computational complexity of the matching process. A multiplication-free, low-complexity 1BT that can be implemented using integer arithmetic with addition and shift operations only has recently
been proposed in [37]. In order to obtain the 1BT of a video frame, the frame is basically filtered with the filter kernel

$$K = \frac{1}{16}\begin{bmatrix}
0&0&0&0&0&0&0&0&0&1&0&0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&1&0&0&0&0&0&1&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
0&0&0&1&0&0&0&0&0&1&0&0&0&0&0&1&0&0&0\\
0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
1&0&0&0&0&0&1&0&0&0&0&0&1&0&0&0&0&0&1\\
0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
0&0&0&1&0&0&0&0&0&1&0&0&0&0&0&1&0&0&0\\
0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&1&0&0&0&0&0&1&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0&0\\
0&0&0&0&0&0&0&0&0&1&0&0&0&0&0&0&0&0&0
\end{bmatrix} \qquad (1)$$
and the corresponding 1BT is constructed in the form of
$$B(i,j) = \begin{cases} 1, & \text{if } I(i,j) \ge I_F(i,j) \\ 0, & \text{otherwise} \end{cases} \qquad (2)$$
where $I_F(i,j)$ represents the filtered version of the video frame $I(i,j)$, obtained by applying the convolution kernel $K$ to $I$. Note that this kernel defines a multiband-pass filter, similar to the kernel proposed in [36]. However, because the normalization coefficient is now a power of 2, it becomes possible to carry out this filtering by simply adding the corresponding 16 pixels and shifting the final result. Hence the complete filtering operation can be carried out using integer arithmetic, and the computational load is 16 additions and 1 shift operation per pixel. The construction of the 1BT requires one further comparison operation per pixel to carry out (2). Hence, the one-bit transform can be computed using simple operations and the computational complexity is reasonably low. Fig. 1 shows the 1BT of a sample frame.
Fig. 1. A sample frame of the “Salesman” sequence and the corresponding 1BT representation.
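A minimal sketch of the 1BT construction in (1) and (2) is given below in Python/NumPy (our notation, not from the paper). The 16 unity taps of the kernel are expressed as offsets from the centre pixel, so the filtering reduces to 16 additions and one 4-bit right shift per pixel, as described above; edge padding at the frame borders is our implementation assumption, since the paper does not specify border handling.

```python
import numpy as np

# Offsets of the 16 unity taps of the 19x19 kernel K in (1), measured from
# the centre pixel; they form a diamond lattice.
TAP_OFFSETS = [(-9, 0), (-6, -3), (-6, 3),
               (-3, -6), (-3, 0), (-3, 6),
               (0, -9), (0, -3), (0, 3), (0, 9),
               (3, -6), (3, 0), (3, 6),
               (6, -3), (6, 3), (9, 0)]

def one_bit_transform(frame: np.ndarray) -> np.ndarray:
    """1BT of an 8-bit grayscale frame per (1)-(2): B = (I >= I_F)."""
    h, w = frame.shape
    # Border handling by edge padding is an assumption of this sketch.
    padded = np.pad(frame.astype(np.uint16), 9, mode="edge")
    acc = np.zeros((h, w), dtype=np.uint16)
    for dy, dx in TAP_OFFSETS:           # 16 additions per pixel
        acc += padded[9 + dy:9 + dy + h, 9 + dx:9 + dx + w]
    filtered = acc >> 4                  # divide by 16 with a shift
    return (frame >= filtered).astype(np.uint8)
```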
In order to facilitate 1BT based motion estimation, the 1BT of each image frame is constructed using (1) and (2), and the
motion vector of a block is decided based on the number of non-matching points (NNMP) measure, which can be formulated as

$$\mathrm{NNMP}(m,n) = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left\{ B_t(i,j) \oplus B_{t-1}(i+m,\, j+n) \right\}, \qquad -s \le m, n \le s-1 \qquad (3)$$
where (m, n) shows the candidate displacement, s determines the search range, and ⊕ denotes the exclusive-OR (XOR) operation. The candidate displacement that gives the lowest NNMP is designated to be the block motion vector. For 1BT based motion estimation the basic MCPC structure can be modified so that motion estimation is carried out on the 1BTs of the current frame and the previously reconstructed (i.e. decoded) frame, as shown in Fig. 2.
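The following sketch shows how (3) drives a full-search block matcher on the binary frames, reusing the one_bit_transform() helper above; the block size and search range defaults are hypothetical, and the XOR-and-count inner step is the entire matching cost, which is what makes 1BT matching far cheaper than SAD on 8-bit pixels.

```python
import numpy as np

def nnmp_motion_vector(bt_cur, bt_ref, top, left, N=16, s=16):
    """Full-search block matching on 1BT frames using the NNMP cost of (3).

    bt_cur / bt_ref are 0/1 uint8 bit-planes (e.g. from one_bit_transform).
    Returns the (dy, dx) candidate with the fewest non-matching points.
    """
    block = bt_cur[top:top + N, left:left + N]
    best_mv, best_cost = (0, 0), N * N + 1
    for dy in range(-s, s):              # -s <= m, n <= s - 1 as in (3)
        for dx in range(-s, s):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + N > bt_ref.shape[0] or x + N > bt_ref.shape[1]:
                continue                 # skip candidates leaving the frame
            # XOR marks non-matching points; counting them is the whole cost
            cost = int(np.count_nonzero(block ^ bt_ref[y:y + N, x:x + N]))
            if cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    return best_mv
```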
Fig. 2. Motion Compensated Predictive Coding (MCPC) scheme with 1BT based motion estimation (H.263+, 1BT).
One drawback of this scheme for the intended approach is that the previously decoded frame is required for motion estimation, so the 1BTs of decoded frames have to be constructed. As it is desired in this paper to provide a video coding scheme based on one-bit transform (1BT) side information for network-driven motion estimation (NDME), it would be necessary to decode image frames at the network side and compute the 1BT of the decoded frames for motion estimation. As an alternative, it has been investigated in this paper to carry out the motion estimation directly on the 1BTs of the current and previous frames, and to carry out motion compensation using the previously reconstructed frame only. This system architecture is shown in Fig. 3, and the approach is referred to as direct 1BT motion estimation. Using this approach the need to construct the 1BT of decoded image frames is avoided, reducing the computational load compared to the system structure shown in Fig. 2. Furthermore, it becomes possible to simply shift the motion estimation part to the network infrastructure by transmitting the 1BTs of image frames; motion estimation can then be carried out at the network infrastructure without the need for decoding.
Fig. 3. Motion Compensated Predictive Coding (MCPC) scheme with direct 1BT based motion estimation (H.263+, D1BT).
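As a sketch of the direct variant, reusing the two helpers above: under our reading of Fig. 3, the stored bit-plane comes straight from the previous encoder input, so no 1BT of a decoded frame is ever computed. The block_coords list of block origins is a hypothetical parameter.

```python
def direct_1bt_me_step(frame, prev_bt, block_coords, N=16, s=16):
    """One frame of direct 1BT motion estimation (H.263+, D1BT, Fig. 3).

    prev_bt is the stored 1BT of the previous input frame; motion
    compensation itself (not shown) still uses the previously
    reconstructed frame, as in the standard MCPC loop.
    """
    bt = one_bit_transform(frame)
    mvs = {(top, left): nnmp_motion_vector(bt, prev_bt, top, left, N, s)
           for (top, left) in block_coords}
    return bt, mvs    # bt is stored and becomes prev_bt for the next frame
```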
In the proposed 1BT based network driven motion estimation approach, the encoder computes the 1BT of each frame and transmits this information to the network infrastructure (e.g. base station) or decoder for network driven motion estimation. In order to reduce the amount of data required for transmission, the 1BTs, which are binary representations of the frames (i.e. binary images), are entropy encoded using JBIG [38]. JBIG is a highly effective lossless bi-level image compression algorithm based on context sensitive arithmetic coding, and can be implemented effectively to run in real time. The compression of a single pixel is very fast and simple, and consists of an index construction, a table look-up and a few bit-level operations of the arithmetic coder, without any multiplications. Motion estimation is then carried out at the network infrastructure to obtain the motion vectors. These motion vectors are sent back to the encoder, which carries out motion compensation and transform encodes the residual in a typical motion compensated predictive coding approach. For decoding, the residual data (i.e. the transform coded motion compensated prediction error) and the motion vectors are forwarded to the receiver through the network infrastructure, where the video can be reconstructed. Fig. 4 shows the corresponding system architecture for this approach. The high-complexity motion estimation part is moved to the network infrastructure, and the overhead introduced at the encoder is instead the low-complexity 1BT computation and low-complexity arithmetic encoding of 1BT representations to facilitate NDME.
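The round trip is summarized below as a schematic sketch. The channel and codec interfaces are hypothetical names standing in for the JBIG coder, the backward channel and the H.263+ tools, since the approach builds on those existing components rather than defining new ones.

```python
import numpy as np

def encode_frame_ndme(frame, prev_recon, channel, codec):
    """One NDME-1BT encoding step (Fig. 4), shown schematically.

    channel.send_1bt / channel.recv_motion_vectors model the link to the
    network infrastructure and the backward channel; codec.* stand in for
    JBIG bi-level coding and the H.263+ motion compensation / residual
    coding tools. All of these names are illustrative, not from the paper.
    """
    bt = one_bit_transform(frame)                 # low-complexity 1BT
    channel.send_1bt(codec.jbig_encode(bt))       # entropy coded 1BT out
    mvs = channel.recv_motion_vectors()           # NNMP search done remotely
    prediction = codec.motion_compensate(prev_recon, mvs)
    residual = frame.astype(np.int16) - prediction.astype(np.int16)
    return codec.transform_encode(residual)       # DCT, quantize, entropy code
```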
Fig. 4. 1BT side information based network driven motion estimation (NDME-1BT).
Fig. 5. Differential 1BT based network driven motion estimation (NDME-Diff1BT).
In order to further reduce the amount of data required to encode the 1BTs at the encoder, differential encoding of 1BTs is also investigated. In the differential 1BT case, 1BT representations are JBIG encoded differentially as shown in Fig. 5, reducing the data amount (hence bandwidth) required for the entropy encoded 1BT data, at the cost of slightly higher computational complexity at the encoder (a bit-plane store and one binary addition or XOR operation per pixel) due to the need to compute the differential 1BT representations.
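A minimal sketch of the differencing step, assuming 0/1 uint8 bit-planes as produced by the one_bit_transform() helper above:

```python
def differential_1bt(bt_cur, bt_prev):
    """XOR the current 1BT with the previous frame's stored 1BT (Fig. 5).

    Costs one binary XOR per pixel plus the bit-plane store; in static
    regions the result is mostly zeros, which the context-based JBIG
    coder compresses far better than the raw 1BT. The operation is its
    own inverse, so bt_cur == differential_1bt(diff, bt_prev).
    """
    return bt_cur ^ bt_prev
```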
Because the proposed NDME video coding approach utilizes typical motion compensated predictive coding (MCPC), it is possible to utilize standard encoders (such as H.263, for example) for video coding. The modifications required are that the encoder computes the 1BTs of frames and sends entropy encoded 1BTs to the network infrastructure, motion estimation is carried out at the network infrastructure using 1BT matching, and the motion vectors are sent back to the encoder, which transform encodes the residual between the original frame and the motion compensated frame. The motion vectors computed at the network infrastructure are also sent to the decoding terminal. The encoder transmits the residual data, which is directly forwarded to the decoder.
III. EXPERIMENTAL RESULTS
In order to evaluate the performance of the proposed approach, the rate-distortion performance of 1BT based MCPC is assessed first. The H.263+ reference software tmn3.2 is used to obtain the rate-distortion performance of standard MCPC for various test sequences. Note that H.263+ was preferred so that the performance of the proposed approach can be compared to other techniques already proposed in the literature, because nearly all papers in this area have utilized H.263+ for comparative evaluations. The reference software is modified so that motion estimation is carried out using 1BT representations of the current and previously decoded frames, as shown in Fig. 2, and this approach is referred to as (H.263+, 1BT). Furthermore, the rate-distortion performance is obtained for the case in which motion estimation is carried out directly on the current and previous frame 1BT representations, as shown in Fig. 3, and this case is referred to as (H.263+, D1BT). Fig. 6 shows the rate-distortion performances of these methods for various test sequences. Note that (H.263+) refers to standard MCPC interframe encoding. The test sequences are of QCIF size and are encoded at a rate of 10 frames/s. It is seen that MCPC with 1BT based motion estimation performs close to, but slightly below, conventional H.263+ interframe encoding. It is important to note that MCPC with direct 1BT motion estimation (H.263+, D1BT) performs similarly to, and sometimes even better than, MCPC with standard 1BT motion estimation (H.263+, 1BT). Hence direct 1BT motion estimation based MCPC can be used to obtain a rate-distortion performance close to standard interframe encoding in a conventional MCPC video coding scheme.

Fig. 7 shows the rate-distortion results for various test sequences for H.263+ interframe encoding, H.263+ intraframe encoding, NDME using entropy encoded 1BT information (NDME-1BT) and NDME using entropy encoded differential 1BT information (NDME-Diff1BT). In this case, the bit-rate is the transmit bit-rate from the encoder, and includes the JBIG encoded 1BT or differential 1BT data and all remaining data of the H.263+ encoder except the motion vector data for the NDME cases. It is seen that differential encoding of 1BTs can significantly improve the performance, which mainly results from the higher compression rates of differential 1BT representations. For the Foreman sequence 1BTs are encoded at a rate of 46.25 kbit/s, while differential 1BTs can be encoded at a rate of 31.69 kbit/s. For the Grandma sequence 1BTs are encoded at 46.52 kbit/s and differential 1BTs at 27.07 kbit/s; for the Mother and Daughter sequence 1BTs are encoded at 42.36 kbit/s and differential 1BTs at 16.11 kbit/s; and for the Salesman sequence 1BTs are encoded at 50.74 kbit/s and differential 1BTs at 22.95 kbit/s. Hence an important gain in bit-rate is achieved with the differential encoding process at the cost of a slight increase in encoder complexity. It is observed that NDME-Diff1BT results in a PSNR about 2 dB lower than H.263+ inter encoding at low transmit bit-rates and about 1-1.5 dB lower than H.263+ inter encoding at higher transmit bit-rates.
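As a quick sanity check on these side-information rates (our own arithmetic, not from the paper, assuming the QCIF size and 10 frames/s rate stated above), the raw, uncompressed 1BT side information amounts to

$$176 \times 144 \times 10 \;\text{bit/s} = 253\,440 \;\text{bit/s} \approx 253.4 \;\text{kbit/s},$$

so the 46.25 kbit/s reported for the Foreman 1BTs corresponds to roughly 5.5:1 JBIG compression, and the 31.69 kbit/s differential figure to roughly 8:1.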
Fig. 6. Rate-distortion results for various test sequences for MCPC with different motion estimation approaches: Foreman, Grandma, Mother and Daughter, Salesman sequences.
The rate-distortion results provided in [33] show that in the case of Wyner-Ziv distributed video coding the PSNR is about 2-3 dB lower than for H.263 interframe encoding, and the gap increases significantly above a bit-rate of about 100-150 kbit/s, where the PSNR difference can be over 4 dB. Hence the proposed distributed video coding scheme can provide superior video quality compared to the Wyner-Ziv schemes for which results are given in [33]. Note that the rate-distortion results provided in Fig. 7 are for the transmit rate of the proposed video encoder to the network infrastructure and include the encoded 1BT or differential 1BT data. In the case of a wireless video communication system, the data transmitted from the network infrastructure to the receiver would not include 1BT data but instead motion vector data, which typically has a bit-rate of about 0.2-0.8 kbit/s. At this part of the system the operation is basically equivalent to the (H.263+, D1BT) system, and the rate-distortion results shown in Fig. 6 will be obtained for this part.

The proposed 1BT based NDME video coding approach uses network driven motion estimation with 1BT side information and the conventional MCPC video coding architecture; hence it can be used in conjunction with most standard video coding systems. For a wireless video communication system application no transcoding is required at the network infrastructure in the proposed approach, and the network infrastructure carries out only 1BT based motion estimation without entirely decoding image frames, resulting in a lower computational complexity in this part. Note that it would be possible to further improve the performance using JBIG2 compression of 1BT representations; although JBIG2 is known to outperform JBIG in terms of compression performance, it has not been utilized in this paper due to the lack of publicly available source code.

IV. CONCLUSION
A video coding system with network-driven motion estimation using 1BT side information has been proposed in this paper. The encoder computes the 1BT of image frames, entropy encodes the 1BTs using simple and efficient operations, and sends this data to the network infrastructure. The network infrastructure is responsible for carrying out the high complexity motion estimation process and sends the motion vectors back to the encoder. The encoder then carries out standard motion compensated predictive coding using these motion vectors. With the proposed approach the high computational load of the motion estimation part is shifted from the encoder to the network infrastructure or decoder. Promising rate-distortion results are obtained for the proposed approach, and as this is the first introduction of this scheme, methods for further performance improvements (such as enhanced entropy encoding of 1BT representations, for instance) are to be investigated in future research.
Fig. 7. Rate-distortion results for various test sequences for H.263+ inter, H.263+ intra, NDME-1BT and NDME-Diff1BT: Foreman, Grandma, Mother and Daughter, Salesman sequences.
REFERENCES

[1] X. Artigas, M. Tagliasacchi, L. Torres, and S. Tubaro, "Analysis of the coset statistics in a Distributed Video Coding Scheme," Workshop on Immersive Communication and Broadcast Systems (ICOB 2005), Berlin, pp. 24-28, Oct. 2005.
[2] A. Aaron, R. Zhang, and B. Girod, "Wyner-Ziv coding of motion video," Thirty-Sixth Asilomar Conference on Signals, Systems and Computers, vol. 1, pp. 240-244, Nov. 2002.
[3] H. Li, A. Lundmark, and R. Forchheimer, "Distributed Video Coding for Wireless Communications," Personal Computing and Communication Workshop, 1998.
[4] W. B. Rabiner and A. P. Chandrakasan, "Network-Driven Motion Estimation for Wireless Video Terminals," IEEE Trans. Circuits Syst. Video Technol., vol. 7, no. 4, pp. 644-652, Aug. 1997.
[5] Y. Yang, S. Cheng, Z. Xiong, and W. Zhao, "Wyner-Ziv coding based on TCQ and LDPC codes," Thirty-Seventh Asilomar Conference on Signals, Systems and Computers, vol. 1, pp. 825-829, Nov. 2003.
[6] A. Aaron, E. Setton, and B. Girod, "Towards practical Wyner-Ziv coding of video," Proc. IEEE International Conference on Image Processing, ICIP 2003, vol. 3, pp. 869-872, Sept. 2003.
[7] Z. Liu, S. Cheng, A. D. Liveris, and Z. Xiong, "Slepian-Wolf coded nested quantization (SWC-NQ) for Wyner-Ziv coding: performance analysis and code design," Proc. Data Compression Conference, DCC 2004, pp. 322-331, Mar. 2004.
[8] H. Wang and A. Ortega, "Scalable predictive coding by nested quantization with layered side information," Proc. IEEE International Conference on Image Processing, ICIP '04, vol. 3, pp. 1755-1758, Oct. 2004.
[9] R. Puri and K. Ramchandran, "PRISM: A New Robust Video Coding Architecture based on Distributed Compression Principles," Allerton Conference on Communication, Control and Computing, 2002.
[10] A. Majumdar and K. Ramchandran, "PRISM: an error-resilient video coding paradigm for wireless networks," Proc. First International Conference on Broadband Networks, BroadNets 2004, pp. 478-485, 2004.
[11] M. Tagliasacchi, A. Majumdar, and K. Ramchandran, "A Distributed Source Coding based Spatio-Temporal Scalable Video Codec," Picture Coding Symposium 2004, San Francisco, Dec. 2004.
[12] A. Aaron, S. Rane, E. Setton, and B. Girod, "Transform-domain Wyner-Ziv codec for video," SPIE Visual Communications and Image Processing Conference, 2004.
[13] A. Aaron, S. Rane, and B. Girod, "Wyner-Ziv video coding with hash-based motion compensation at the receiver," Proc. IEEE International Conference on Image Processing, ICIP '04, vol. 5, pp. 3097-3100, Oct. 2004.
[14] I. H. Tseng and A. Ortega, "Motion Estimation at the Decoder Using Maximum Likelihood Techniques for Distributed Video Coding," Thirty-Ninth Asilomar Conference on Signals, Systems and Computers, pp. 756-760, Oct. 2005.
[15] L. Zhen and E. J. Delp, "Wyner-Ziv video side estimator: conventional motion search methods revisited," IEEE International Conference on Image Processing, ICIP 2005, vol. 1, pp. 825-828, Sept. 2005.
[16] X. Artigas and L. Torres, "Iterative Generation of Motion-Compensated Side Information for Distributed Video Coding," IEEE International Conference on Image Processing, vol. 1, pp. 833-836, Sept. 2005.
[17] K. M. Misra, S. Karande, and H. Radha, "Multi-Hypothesis Based Distributed Video Coding using LDPC Codes," Allerton Conference on Communication, Control and Computing, Sept. 2005.
[18] C. Brites and F. Pereira, "Distributed Video Coding: Bringing New Applications to Life," 5th Conference on Telecommunications, ConfTele, Tomar, Portugal, April 2005.
[19] X. Artigas and L. Torres, "A model-based enhanced approach to distributed video coding," Image Analysis for Multimedia Interactive Services, WIAMIS 2005, EPFL, pp. 128-142, 2005.
[20] J. Fowler, M. Tagliasacchi, and B. Pesquet-Popescu, "Wavelet-Based Distributed Source Coding of Video," European Signal Processing Conference, Antalya, Sept. 2005.
[21] A. Majumdar, R. Puri, P. Ishwar, and K. Ramchandran, "Complexity/performance trade-offs for robust distributed video coding," IEEE International Conference on Image Processing, ICIP 2005, vol. 2, pp. 678-681, Sept. 2005.
[22] R. P. Westerlaken, R. K. Gunnewiek, and R. L. Lagendijk, "Turbo-Code Based Wyner-Ziv Video Compression," Twenty-sixth Symposium on Information Theory in the Benelux, pp. 113-120, May 2005.
[23] R. P. Westerlaken, R. K. Gunnewiek, and R. L. Lagendijk, "The role of the virtual channel in distributed source coding of video," IEEE International Conference on Image Processing, ICIP 2005, vol. 1, pp. 581-584, Sept. 2005.
[24] J. Sun and H. Li, "Motion compensated Wyner-Ziv video coding," International Workshop on Multimedia Signal Processing, MMSP 2005, Shanghai, China, Oct. 2005.
[25] X. Artigas and L. Torres, "Improved signal reconstruction and return channel suppression in Distributed Video Coding systems," Proc. 47th International Symposium ELMAR-2005 focused on Multimedia Systems and Applications, Croatia, pp. 53-56, June 2005.
[26] L. Natário, C. Brites, J. Ascenso, and F. Pereira, "Extrapolating Side Information for Low-Delay Pixel-Domain Distributed Video Coding," VLBV 2005, Italy, Sept. 2005.
[27] C. Brites, J. Ascenso, and F. Pereira, "Improving Transform Domain Wyner-Ziv Video Coding Performance," Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2006, Toulouse, France, May 2006.
[28] C. Brites, J. Ascenso, and F. Pereira, "Modeling Correlation Noise Statistics at Decoder for Pixel Based Wyner-Ziv Video Coding," Picture Coding Symposium 2006, Beijing, China, April 2006.
[29] M. Tagliasacchi, A. Trapanese, S. Tubaro, J. Ascenso, C. Brites, and F. Pereira, "Intra Mode Decision Based on Spatio-temporal Cues in Pixel Domain Wyner-Ziv Video Coding," IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2006, Toulouse, France, May 2006.
[30] M. Dalai, R. Leonardi, and F. Pereira, "Improving Turbo Codec Integration in Pixel-Domain Distributed Video Coding," IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2006, Toulouse, France, May 2006.
[31] A. Aaron, D. Varodayan, and B. Girod, "Wyner-Ziv residual coding of video," Proc. Picture Coding Symposium, PCS 2006, Beijing, China, April 2006.
[32] Z. Belkoura and T. Sikora, "Towards Rate-Decoder Complexity Optimisation in Turbo-Coder based Distributed Video Coding," Proc. Picture Coding Symposium, PCS 2006, Beijing, China, April 2006.
[33] B. Girod, A. M. Aaron, S. Rane, and D. Rebollo-Monedero, "Distributed Video Coding," Proceedings of the IEEE, vol. 93, no. 1, pp. 71-83, Jan. 2005.
[34] B. Natarajan, V. Bhaskaran, and K. Konstantinides, "Low-complexity block-based motion estimation via one-bit transforms," IEEE Trans. Circuits Syst. Video Technol., vol. 7, no. 4, pp. 702-706, 1997.
[35] A. Ertürk and S. Ertürk, "Two-Bit Transform for Binary Block Motion Estimation," IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 7, pp. 938-946, July 2005.
[36] O. Urhan and S. Ertürk, "Constrained One-Bit Transform for Low-Complexity Block Motion Estimation," IEEE Trans. Circuits Syst. Video Technol., vol. 52, no. 4, pp. 1275-1279, Nov. 2006.
[37] S. Ertürk, "Multiplication-free one-bit transform for low-complexity block-based motion estimation," IEEE Signal Processing Letters, vol. 14, no. 2, pp. 109-112, Feb. 2007.
[38] ITU-T Recommendation T.82: Information technology - Coded representation of picture and audio information - Progressive bi-level image compression, 1993.
Hwa-Yong Oh received his B.S., M.S. and Ph.D. degrees from Chung-Ang University, Seoul, Korea, in 2003, 2005 and 2007, respectively, all in electrical engineering. He is currently working at the Signals and Communication Systems Lab at Chung-Ang University. His main research interests are multimedia signal processing, wireless communication systems and embedded systems.
Sarp Ertürk (M'99) received his B.Sc. in Electrical and Electronics Engineering from Middle East Technical University, Ankara, in 1995. He received his M.Sc. in Telecommunication and Information Systems and his Ph.D. in Electronic Systems Engineering in 1996 and 1999, respectively, from the University of Essex, U.K. From 1999 to 2001 he carried out his compulsory service at the Army Academy, Ankara. Since 2001 he has been with the University of Kocaeli, Turkey, where he is currently appointed as Associate Professor. His research interests are in the area of digital signal and image processing. Between March and September 2006 he was a visiting professor at the School of Electrical and Electronics Engineering, Chung-Ang University, Seoul, Korea.
Tae-Gyu Chang (M'86) received the B.S. degree from Seoul National University, Seoul, Korea, in 1979, the M.S. degree from the Korea Advanced Institute of Science and Technology, Seoul, in 1981, and the Ph.D. degree from the University of Florida, Gainesville, in 1987, all in electrical engineering. From 1981 to 1984, he was with Hyundai Engineering/Electronics Inc., Seoul, as a Computer Systems Design Engineer. From 1987 to 1990, he was a Faculty Member of Tennessee State University, Nashville, as an Assistant Professor of information systems engineering. In March 1990, he joined the faculty of Chung-Ang University, Seoul, where he is currently a Professor at the Department of Electrical and Electronics Engineering. His research interests include multimedia signal processing and communications.