RGB Video Coding Using Residual Color Transform

SAMSUNG Journal of Innovative Technology

SPECIAL ISSUES

AUGUST 2005


VOLUME 1 / NUMBER 1

RGB Video Coding Using Residual Color Transform

Woo-Shik Kim*, Dae-Sung Cho, Dmitry Birinov, Hyun Mun Kim, Shi-Hwa Lee, and Yang-Seock Seo

Computing Lab., Digital Research Center, Samsung Advanced Institute of Technology, P.O. Box 111, Suwon 440-600, Korea

This paper presents a new paradigm in video compression technology that reforms the conventional video coding process. A camera captures images in the RGB domain, which is suitable for natural color representation, but RGB space is generally regarded as a poor space from a compression point of view. To solve this problem, the RGB space is usually converted into another color space, which inevitably causes color distortion. As display devices become larger and more vivid, it is essential to maintain the original color fidelity. To cope with these problems, we propose an RGB coding method that compresses the video directly in the RGB domain. We exploit the inter-color correlation within the coding loop to increase the coding efficiency. Consequently, the coding efficiency is increased by up to 40% while color distortion is completely avoided. The proposed method has been recognized by engineers from prominent multimedia companies worldwide, such as Sony, Warner Bros., Matsushita, Microsoft, Thomson, Toshiba, and Dolby, and has been adopted into the international standard ISO/IEC MPEG-4 AVC | ITU-T H.264. To further validate the effectiveness of the developed technology, we applied it to Samsung LCD products; it delivers visually lossless quality while substantially reducing memory requirements, giving Samsung products an edge in the competitive display market.

Keywords: Image compression, Video encoding, Color transformation

1. INTRODUCTION

In this paper, we present a new video coding technology in RGB space. Generally, RGB space is regarded as a poor space from a compression point of view [1]. Each of the R, G, and B components contains color information along with luminance information, and this redundancy degrades the coding efficiency. To exploit the inter-color redundancy, the RGB space is usually converted into another color space such as YCbCr. Since most of the spatial information is in the Y component, the Cb and Cr components can be further subsampled to compress the data without significant loss of image quality. But in professional applications such as digital cinema, professional television production, and post-production, it is very important to maintain the original color fidelity. Moreover, as display devices grow in size and resolution with true color support, color fidelity becomes essential in the consumer electronics market as well. However, the color space conversion from RGB to YCbCr and back to RGB involves rounding error [2].

To cope with these problems, we propose a residual color transform (RCT) method that avoids the color distortion caused by color space conversion by coding directly in the given RGB space. This method can achieve higher PSNR than color-space-converted coding methods, as detailed in Section 2, while maintaining the high coding efficiency that YCbCr coding achieves. By applying the RCT to the MPEG-4 AVC/H.264 4:4:4 codec [3], we achieved up to 40% higher coding efficiency than coding without the RCT option.

In Section 2, we analyze the color distortion due to color conversion and introduce recent approaches to this problem. In Section 3, we describe the proposed RGB coding technology in detail. The simulation results are summarized in Section 4, and the achievements from RGB coding and its application to LCD enhancement are


described in Section 5, followed by the conclusion in Section 6.

2. NEW COLOR TRANSFORMATION

2.1. Color Distortion Due to Color Space Conversion

In this section, we analyze the color space conversion error and show that there exists an achievable PSNR limit due to this rounding error. The analysis assumes 8-bit fixed-point RGB data ranging from 0 to 255, with the same number of bits of precision used to map the RGB data to another color space and back to RGB; it generalizes to any N-bit data. The RGB to YCbCr conversion using Rec. BT.601 is performed as follows:

$$\begin{bmatrix} Y \\ Cb \\ Cr \end{bmatrix} = \begin{bmatrix} 0.2126 & 0.7152 & 0.0722 \\ -0.1146 & -0.3854 & 0.5 \\ 0.5 & -0.4542 & -0.0458 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} \qquad (1)$$

After the conversion, a rounding operation is applied, which introduces an error of variance 1/12 per component due to uniform rounding quantization. The YCbCr to RGB conversion is performed as follows:

$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 1.0 & 0.0 & 1.5748 \\ 1.0 & -0.1873 & -0.4681 \\ 1.0 & 1.8556 & 0.0 \end{bmatrix} \begin{bmatrix} Y \\ Cb \\ Cr \end{bmatrix} \qquad (2)$$

The rounding of each component (Y, Cb, Cr) contributes an error of $\frac{1}{12}a^2$, where $a$ is the corresponding coefficient of the conversion matrix in (2); this is the forward rounding error propagated through the backward conversion. For example, the propagated error for the G component is

$$\text{Error} = \frac{1}{12}\left(1^2 + 0.1873^2 + 0.4681^2\right)$$

Including the additional 1/12 error of the final rounding back to integer RGB, the overall conversion error for the G component, $E_G$, is

$$E_G = \frac{1}{12}\left(1^2 + 1^2 + 0.1873^2 + 0.4681^2\right) = 0.1878$$

The conversion errors for the other components are calculated in the same way:

$$E_R = \frac{1}{12}\left(1^2 + 1^2 + 0^2 + 1.5748^2\right) = 0.3733$$

$$E_B = \frac{1}{12}\left(1^2 + 1^2 + 1.8556^2 + 0^2\right) = 0.4536$$

Since we use 8 bits for each component, the achievable PSNR value for each component is

$$\mathrm{PSNR}_G = 10 \cdot \log_{10}\frac{255^2}{E_G} = 55.4\ \mathrm{dB}$$

$$\mathrm{PSNR}_R = 10 \cdot \log_{10}\frac{255^2}{E_R} = 52.4\ \mathrm{dB}$$

$$\mathrm{PSNR}_B = 10 \cdot \log_{10}\frac{255^2}{E_B} = 51.6\ \mathrm{dB}$$

According to this analysis, the backward conversion coefficients matter more for the PSNR, since the sum of squares of each row in (2) is greater than in (1). We also confirmed these PSNR values experimentally by converting RGB to YCbCr and back to RGB and measuring the PSNR, assuming the same probability for all possible color combinations from 0 to 255, i.e., 256³ equiprobable cases. Table 1 summarizes the theoretical analysis along with the experimental results; it confirms the validity of the theoretical analysis.

Table 1. Achievable PSNR values when color space conversion (RGB→YCbCr→RGB) is performed using ITU-R Rec. BT.601 for 8-bit images.

         Theoretical analysis   Experimental results
PSNR-G   55.4 dB                57.2 dB
PSNR-R   52.4 dB                52.2 dB
PSNR-B   51.6 dB                51.5 dB
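As a rough cross-check of this analysis, the round-trip experiment can be reproduced with a short script. The following is a minimal sketch, not the authors' code: it samples random 8-bit RGB triplets (rather than enumerating all 256³ combinations), applies the matrices in (1) and (2) with rounding at each step, and reports the per-channel error energy and the corresponding PSNR ceiling. The chroma bias of 128 and the absence of clipping are assumptions made to isolate the rounding error.

```python
import numpy as np

# Forward (1) and backward (2) conversion matrices as given in the text.
FWD = np.array([[ 0.2126,  0.7152,  0.0722],
                [-0.1146, -0.3854,  0.5   ],
                [ 0.5,    -0.4542, -0.0458]])
BWD = np.array([[1.0,  0.0,     1.5748],
                [1.0, -0.1873, -0.4681],
                [1.0,  1.8556,  0.0   ]])
BIAS = np.array([0.0, 128.0, 128.0])   # assumed chroma offset for 8-bit storage

rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(1_000_000, 3)).astype(np.float64)

ycbcr = np.rint(rgb @ FWD.T + BIAS)          # RGB -> YCbCr, rounded to integers
rgb_back = np.rint((ycbcr - BIAS) @ BWD.T)   # YCbCr -> RGB, rounded again

err = np.mean((rgb - rgb_back) ** 2, axis=0)   # per-channel conversion error E_R, E_G, E_B
psnr = 10.0 * np.log10(255.0 ** 2 / err)       # achievable PSNR ceiling per channel
for name, e, p in zip("RGB", err, psnr):
    print(f"E_{name} = {e:.4f}, PSNR_{name} = {p:.1f} dB")
```

The printed numbers can be compared against the theoretical limits above and the experimental values in Table 1.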

2.2. A New Color Transform YSbSr

In the previous section, we analyzed how the color conversion introduces an achievable PSNR limit. In this section, we introduce a new color transform that yields small conversion and propagation errors. Recently, many new color transforms have been proposed [4]. During the development of the Fidelity Range Extensions of the MPEG-4 AVC/H.264 standard [5], Malvar and Sullivan proposed a new color transform, YCoCg-R, defined as follows [6]:

$$\begin{bmatrix} Y \\ Co \\ Cg \end{bmatrix} = \begin{bmatrix} 1/4 & 1/2 & 1/4 \\ 1 & 0 & -1 \\ -1/2 & 1 & -1/2 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} \qquad (3)$$

Topiwala and Tu also proposed a new color transform, YFbFr, defined as follows [7]:

$$\begin{bmatrix} Y \\ Fb \\ Fr \end{bmatrix} = \begin{bmatrix} 5/16 & 3/8 & 5/16 \\ -1/2 & 1 & -1/2 \\ 1 & 0 & -1 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} \qquad (4)$$

Looking at the coefficients of (3) and (4), we can see that they are very similar transforms. Both are derived from the Karhunen-Loève (KL) transform and then approximated with dyadic coefficients to guarantee integer reversibility, without considering error propagation. They guarantee lossless conversion using the lifting scheme [8] and need one extra bit of depth for the chroma components due to the increased dynamic range. But this is based on the assumption that no coding error is involved; lossless conversion is meaningless once coding noise is present, and error propagation becomes the more important factor. If coding is involved, the coding error in the transformed space is propagated through the backward conversion to the RGB space, in proportion to the sum of squares of the backward conversion coefficients, just like the rounding error analyzed in the previous section.

To cope with these problems, we devise a new color transform that gives high decorrelation gain along with small conversion and propagation errors [9]. We start with the KL transform. For that purpose, we use the Kodak image set [10] to estimate the correlation matrix between the color components. We normalize each component to have zero mean and unit variance as follows:

$$R = \frac{r - E[r]}{\mathrm{std}(r)}, \quad G = \frac{g - E[g]}{\mathrm{std}(g)}, \quad B = \frac{b - E[b]}{\mathrm{std}(b)}$$

Next we estimate the autocorrelation matrix as follows:

$$R_x = \begin{bmatrix} \mathrm{var}(R) & E[RG] & E[RB] \\ E[RG] & \mathrm{var}(G) & E[GB] \\ E[RB] & E[GB] & \mathrm{var}(B) \end{bmatrix} \qquad (5)$$

Using the given training data set, we obtained the following autocorrelation matrix:

$$R_x = \begin{bmatrix} 1 & 0.8525 & 0.7545 \\ 0.8525 & 1 & 0.9225 \\ 0.7545 & 0.9225 & 1 \end{bmatrix} \qquad (6)$$

Next, we find the eigenvectors and eigenvalues of $R_x$ from

$$R_x \Phi = \Phi \Lambda \qquad (7)$$

where $\Phi = [\phi_1\ \phi_2\ \phi_3]$ is the set of eigenvectors and $\Lambda$ is the diagonal matrix whose diagonal terms are the corresponding eigenvalues, ordered by decreasing value. We obtained the following eigenvectors and eigenvalues:

$$\Phi^T = \begin{bmatrix} 0.5587 & 0.5968 & 0.5758 \\ -0.7860 & 0.1597 & 0.5972 \\ -0.2644 & 0.7863 & -0.5584 \end{bmatrix}, \quad \Lambda = \begin{bmatrix} 2.6882 & 0 & 0 \\ 0 & 0.2536 & 0 \\ 0 & 0 & 0.0582 \end{bmatrix} \qquad (8)$$

Now we can implement the KL transform to decorrelate the redundancy among the color components using $\Phi^T$. As shown in (8), $\Phi^T$ is an orthogonal matrix, i.e., each row is normalized to unit L2 norm. So we scale each row by its L1 norm to guarantee the same dynamic range after the color transform. Since the basis vectors are only scaled, the transform still maintains the characteristics of the KL transform. The resulting transform is

$$\Phi^T_{norm} = \begin{bmatrix} 0.3227 & 0.3447 & 0.3326 \\ -0.5095 & 0.1035 & 0.3870 \\ -0.1643 & 0.4887 & -0.3470 \end{bmatrix} \qquad (9)$$

Even though (9) guarantees the same dynamic range before and after the conversion, we slightly modify (9) so that the same bias terms as in the YCbCr transform can be used, making the dynamic range of the chroma components span 0 to 255. By setting the sum of the negative coefficients equal to the sum of the positive coefficients in each chroma row, we can use the same bias term of 128 that YCbCr uses; we implement this by adjusting the coefficients based on their absolute values. This approximation gives the following conversion:

$$\Phi^T_{norm} = \begin{bmatrix} 0.3223 & 0.344 & 0.333 \\ -0.5 & 0.106 & 0.394 \\ -0.161 & 0.5 & -0.339 \end{bmatrix} \qquad (10)$$

The conversion in (10) is very close to the KL transform while maintaining the same dynamic range and bias terms as the well-known YCbCr transform. As mentioned before, the transforms in (3) and (4) use one extra bit of depth for the chroma components. This relies on the assumption that a separate bit depth can be used for each component, as defined in the MPEG-4 AVC/H.264 Professional Extension, i.e., N-bit data can be transformed into (N+1)-bit data. We take advantage of this by scaling the forward transform in (10) by 2, which reduces the backward transform coefficients by half. Since the propagation of rounding error or coding noise is proportional to the sum of squares of the backward conversion coefficients, this reduces that error to a quarter. So we propose the following transform, YSbSr:

$$\begin{bmatrix} Y \\ Sb \\ Sr \end{bmatrix} = \begin{bmatrix} 0.6460 & 0.6880 & 0.6660 \\ -1.0 & 0.2120 & 0.7880 \\ -0.3220 & 1.0 & -0.6780 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} \qquad (11)$$

$$\begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 0.5 & -0.6077 & -0.2152 \\ 0.5 & 0.12 & 0.6306 \\ 0.5 & 0.4655 & -0.4427 \end{bmatrix} \begin{bmatrix} Y \\ Sb \\ Sr \end{bmatrix} \qquad (12)$$

This new transform gives a high decorrelation gain, as the KL transform does, while reducing the conversion error by increasing the dynamic range of the new color space. This generalizes to any bit depth by employing an (N+k)-bit codec for N-bit data.
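The derivation of (6)–(12) can be sketched in a few lines of numpy. This is an illustrative reconstruction rather than the authors' code: the eigenvector signs returned by the solver are arbitrary, and the small coefficient adjustment applied between (9) and (10) to equalize the positive and negative sums is omitted, so the printed matrices will only approximate (11) and (12).

```python
import numpy as np

# Inter-channel correlation matrix of the normalized Kodak data, as in (6).
Rx = np.array([[1.0,    0.8525, 0.7545],
               [0.8525, 1.0,    0.9225],
               [0.7545, 0.9225, 1.0   ]])

# Eigen-decomposition (7); reorder so eigenvalues decrease as in (8).
eigvals, Phi = np.linalg.eigh(Rx)
order = np.argsort(eigvals)[::-1]
eigvals, Phi = eigvals[order], Phi[:, order]

# The KL transform matrix is Phi^T; rescale each row by its L1 norm so every
# transformed component keeps the input dynamic range, as in (9).
kl_norm = Phi.T / np.abs(Phi.T).sum(axis=1, keepdims=True)

# Doubling the forward transform, as done for YSbSr in (11), halves the
# backward coefficients and therefore quarters the propagated error energy.
forward = 2.0 * kl_norm
backward = np.linalg.inv(forward)

print(np.round(eigvals, 4))    # compare with the eigenvalues in (8)
print(np.round(forward, 4))    # compare with (11), up to row signs
print(np.round(backward, 4))   # compare with (12), up to column signs
```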

3. THE PROPOSED RGB CODING METHOD

3.1. Inter-plane Prediction Coding

In the previous section, we showed that coding after converting the given color space causes inevitable color distortion. Based on this observation, we propose to code the data in the given space. But the RGB space is generally regarded as a poor space from a compression viewpoint, as addressed earlier, so to increase the coding efficiency in RGB space we need to exploit the redundancy between the color components.

To eliminate this redundancy without color distortion, we first apply intra/inter prediction to each color component. This exploits the redundancy within each component: the intra prediction utilizes the spatial correlation, while the inter prediction uses the temporal correlation. To see whether redundancy between the color components remains after the intra/inter prediction, we plot the correlation between the residual signals of each component. For these tests, we modified the MPEG-4 AVC/H.264 codec [3], which uses intra/inter prediction, to support the 4:4:4 chroma format so that it takes RGB input directly. As shown in Fig. 1, after the intra/inter prediction there still exists a very strong correlation between the residual images of the color components.

To reduce this redundancy we apply the linear regression predictor [11]

$$F(x) = m_y + r\,\frac{\sigma_y}{\sigma_x}(x - m_x)$$

where $m$ and $\sigma$ denote the mean and standard deviation and $r$ the correlation coefficient. To support an integer implementation, the coefficients can be chosen to be dyadic rationals. We chose these coefficients by analyzing the statistical properties of residual images, and our experimental data showed that the identity function is good enough [12-16]. So we subtract the reconstructed G residual image from the R and B residual images. We name this prediction "inter-plane prediction (IPP)". The block diagram in Fig. 2 shows this process, and the equations below give the case where FR(.) and FB(.) are identity functions:

Encoding:
Δ₂G = ΔG
Δ₂R = ΔR − ΔG′
Δ₂B = ΔB − ΔG′

Decoding:
ΔG′ = Δ₂G′
ΔR′ = Δ₂R′ + ΔG′
ΔB′ = Δ₂B′ + ΔG′

where ΔX represents the intra/inter-predicted residual value, ΔX′ the reconstructed residual value, and Δ₂X and Δ₂X′ the inter-plane predicted and reconstructed signals, respectively. Therefore, the RGB video is coded directly by intra/inter prediction, which uses the spatial/temporal correlation of each color component, followed by an inter-color-component prediction on the residual data.
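For concreteness, a minimal numpy sketch of the identity-function case of the IPP equations above is given below; the function names and the array-based interface are illustrative assumptions, not the reference software.

```python
import numpy as np

def ipp_encode(dR, dG, dB, dG_rec):
    """Inter-plane prediction with identity F_R(.) and F_B(.): the
    reconstructed G residual is subtracted from the R and B residuals."""
    return dR - dG_rec, dG, dB - dG_rec                  # (Delta2_R, Delta2_G, Delta2_B)

def ipp_decode(d2R_rec, d2G_rec, d2B_rec):
    """Invert the prediction at the decoder using the reconstructed G residual."""
    dG_rec = d2G_rec
    return d2R_rec + dG_rec, dG_rec, d2B_rec + dG_rec    # (dR', dG', dB')
```

In the lossless case, where the reconstructed residuals equal the originals, `ipp_decode(*ipp_encode(dR, dG, dB, dG))` returns the input residuals unchanged.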

Fig. 1. Inter-color correlation between residual components. (a) and (b) show the correlation of the intra-predicted residual signals: (a) between the R and G residual values and (b) between the B and G residual values. (c) and (d) show the correlation of the inter-predicted residual signals: (c) between the R and G residual values and (d) between the B and G residual values.

3.2. Residual Color Transform

Using the IPP, the correlation between the color components can be reduced. However, this prediction considers the correlation between only two color components at a time. To eliminate the correlation among all three color components at once, we apply the color transform described in Section 2.2 to the residual data after intra/inter prediction. Although YSbSr gives the best coding efficiency, we adopt the YCoCg-R transform here to take advantage of its integer implementation. Since this transform is applied to the residual data of the RGB color components, we name it the "residual color transform (RCT)" [17-19].

Below are the forward and backward RCT equations.

Forward:
Δ₂B = ΔR − ΔB
t = ΔB + (Δ₂B >> 1)
Δ₂R = ΔG − t
Δ₂G = t + (Δ₂R >> 1)

Backward:
t = Δ₂G′ − (Δ₂R′ >> 1)
ΔG′ = Δ₂R′ + t
ΔB′ = t − (Δ₂B′ >> 1)
ΔR′ = ΔB′ + Δ₂B′


Fig. 2. Block diagram of the inter-plane prediction coding process.

where ΔX and ΔX′ represent the intra/inter-predicted and reconstructed residual values, respectively, and Δ₂X and Δ₂X′ denote the residual-color-transformed and reconstructed signals, respectively. The notation ">>" denotes the right-shift operation, which approximates a division by 2, and the variable t holds a temporary result.
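The RCT equations above map directly onto integer code. The sketch below, with illustrative function names, applies the lifting steps to integer residual arrays; the right shift on signed integers matches the ">>" of the equations, so the backward transform reverses the forward one exactly.

```python
import numpy as np

def rct_forward(dR, dG, dB):
    """Forward residual color transform (YCoCg-R lifting applied to residuals)."""
    d2B = dR - dB
    t = dB + (d2B >> 1)
    d2R = dG - t
    d2G = t + (d2R >> 1)
    return d2R, d2G, d2B

def rct_backward(d2R, d2G, d2B):
    """Backward residual color transform; undoes the lifting steps in reverse order."""
    t = d2G - (d2R >> 1)
    dG = d2R + t
    dB = t - (d2B >> 1)
    dR = dB + d2B
    return dR, dG, dB

# Round-trip check on random integer residuals: the transform is lossless.
rng = np.random.default_rng(1)
res = rng.integers(-255, 256, size=(3, 16, 16), dtype=np.int32)
out = rct_backward(*rct_forward(*res))
assert all(np.array_equal(a, b) for a, b in zip(out, res))
```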

3.3. Intra/Inter Prediction Mode Selection

Since the IPP and RCT process the residual data of the different color components together, it is important to produce residual data with consistent characteristics across all color components. For example, during inter prediction, if every color component had its own motion vector, the residuals of the components would have different characteristics, resulting in lower correlation among them. Therefore, it is important to use the same motion vector and the same prediction mode for all color components to obtain high correlation among the residual data.

This is also true for the intra prediction case. For example, in the MPEG-4 AVC/H.264 video coding method [3], there are 9 modes of 4×4 intra prediction and 4 modes of 16×16 intra prediction. It is better to use the same intra prediction mode for all color components to increase the coding efficiency.
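As a minimal illustration of such a joint decision (the SAD criterion and every name below are assumptions made for the sketch, not taken from the standard or the paper), one intra mode can be chosen by accumulating the prediction cost of each candidate over all three color planes and keeping the minimum:

```python
import numpy as np

def select_shared_intra_mode(blocks, predictors):
    """Choose a single intra prediction mode shared by the R, G, and B planes.

    blocks     -- dict {"R": block, "G": block, "B": block} of original samples
    predictors -- dict mode -> function(plane_name) returning the predicted block
    The cost of a mode is its SAD summed over the three planes, so the selected
    mode is the one that fits all components jointly.
    """
    best_mode, best_cost = None, np.inf
    for mode, predict in predictors.items():
        cost = sum(np.abs(blocks[c] - predict(c)).sum() for c in ("R", "G", "B"))
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode
```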

4. SIMULATION RESULTS

In this section, the simulation results are shown and discussed. For these tests we modified the MPEG-4 AVC/H.264 reference software to support the 4:4:4 chroma format and N-bit input data. The coding efficiency is compared by means of rate-distortion (RD) curves, where the rate is the exact number of bits needed to code the given video and the distortion is the quality of the reconstructed image. To measure the image quality, the peak signal-to-noise ratio (PSNR) is used, which is defined as

$$\mathrm{PSNR} = 20 \cdot \log_{10}\frac{b}{\lVert F - F' \rVert_2}\ \ (\mathrm{dB})$$

where F is the original image, F′ the reconstructed one, and b the maximum value of the signal, e.g., 255 for an 8-bit image; ‖·‖₂ denotes the L2 norm of the difference, i.e., the root mean squared error. By changing the quantization step size, we obtain bitrates and qualities from low to high. Each quantization step size yields one R-D point, and from two or more R-D points the R-D curve is formed by interpolation. The R-D curves of different coding methods readily show their coding efficiency differences: one can compare PSNR at a specific bitrate, or bitrate reduction at a specific image quality. A 1 dB difference in PSNR is usually assumed to be equivalent to a 20% bitrate reduction.

4.1. Results of the New Color Transform YSbSr

This section summarizes the simulation results of the new color transform YSbSr. The test materials are the widely used "Kodak set" of 24 images of size 512×768, captured with a high-quality 3-CCD camera [10]. We coded these 24 images using the modified reference software described above; since there is no temporal correlation, only the intra prediction mode was used. To compare the efficiency of each color transform, we first converted the original RGB input into the new color space, coded the images, and converted the coded data back into RGB space. Fig. 3 shows the simulation results for each color transform. They demonstrate the efficiency of the proposed YSbSr transform, since it reduces the conversion error the most. The coding gain difference between YCoCg-R and YFbFr is negligible even though YFbFr approximates the KL transform more closely, and the well-known YCbCr performs worst, as shown in Fig. 3. Because the input data have different bit depths for each component depending on the color transform, the quantizer must be adjusted per component to compare the coding efficiency fairly; for that purpose, we applied a quantizer that is independent of the input bit depth [20].

Fig. 3. Rate-distortion curves (PSNR in dB versus bitrate in Mbps) of the various color transforms for the Kodak set, comparing YSbSr (Samsung), YCoCg-R, YFbFr, and YCbCr.

4.2. Results of the RGB Coding

This section summarizes the simulation results of the RGB coding. Since the proposed method mainly targets high-quality applications, high-quality, high-resolution images were selected: high-definition (HD) materials at 1280×720, 60 fps, 8 bits per sample, and film materials at 1920×1080, 24 fps, 10 bits per sample. Both are in the 4:4:4 chroma format and captured with progressive scan. The film materials are obtained by scanning analog cinema film and applying gamma correction. The "Analog TV" sequence in Fig. 5 is provided by Technicolor; it originates from 35 mm film and is logarithmically gamma scaled. The "Restaurant" sequence is provided by Warner Bros.; it originates from 65 mm film and is also logarithmically gamma scaled. We also tested the Thomson Viper sequences, which are captured with very high-quality CCD sensors to enable the use of digital images in cinematography. The Thomson Viper sequences have 1920×1080 resolution at 24 fps and 10 bits per sample in the 4:4:4 chroma format.

Fig. 4 shows the simulation results when the IPP is applied to the HD materials. These results show that there exists an achievable PSNR limit if we code images by converting them into another color space, as explained in the previous section. Coding the RGB data directly without the IPP option shows poor coding efficiency even though it can achieve high PSNR values. Note that there is a crossover point between the RD curves of the proposed method and those of the YCbCr coding: the coding efficiency of the proposed method is slightly lower than that of YCbCr coding at low bitrates. However, the proposed method preserves the color fidelity better than YCbCr coding since there is no color space conversion, and the computational complexity is reduced for the same reason.

Fig. 4. IPP simulation results for HD materials. (a) RD curves for the Crew sequence. (b) RD curves for the Harbour sequence.

Fig. 5 shows the results for the film materials, where the coding efficiencies of the RCT and IPP methods are compared with that of YCoCg-R, which was recently developed to increase coding efficiency. The film material contains a great deal of so-called "film grain" noise, which increases the bitrate considerably. The IPP shows the best coding efficiency across the whole bitrate range for the Analog TV sequence, while the RCT performs best at low to middle bitrates for the Restaurant sequence. The IPP reaches the highest PSNR value in both sequences.

Fig. 5. Coding efficiency comparison of the RCT, IPP, and YCoCg-R for film materials. (a) RD curves for the Analog TV sequence from Technicolor. (b) RD curves for the Restaurant sequence from Warner Bros.

Fig. 6 shows the results for the Thomson Viper materials, where the RCT shows the best coding efficiency over the whole bitrate range and the IPP reaches the highest PSNR value. The experimental results confirm the effectiveness of the proposed methods for various kinds of images.

Fig. 6. Coding efficiency comparison of the RCT, IPP, and YCoCg-R for Thomson Viper materials. (a) RD curves for the KungFu sequence. (b) RD curves for the Night sequence.

5. ACHIEVEMENTS FROM RGB CODING AND APPLICATION TO SAMSUNG LCD

In Section 3 we proposed a new coding technology that compresses the RGB image directly without any color conversion, and in Section 4 we showed that the proposed method achieves greater coding efficiency than the conventional color-converted coding methods. Color-converted coding such as YCbCr has prevailed throughout the history of video coding technology: well-known video coding standards such as ISO/IEC MPEG-1/2/4 and ITU-T H.261/262/263 all use YCbCr coding. One reason is that the Y component alone can serve black-and-white TV, since it carries the luminance information; another is that RGB coding had not been able to achieve sufficient coding efficiency compared to YCbCr coding. However, with the evolution of display devices, which render video more vividly thanks to true color support and large, high-resolution screens, video coding technology faced the need to




preserve the true colors of the world, which requires coding the RGB signal as an RGB signal without any color conversion. This requirement has been expressed by several industry organizations and reflected in video coding standardization work such as the ISO/IEC MPEG-4 AVC and ITU-T H.264 Fidelity Range Extensions [21-23]. To meet this new requirement, many companies and organizations competed and carefully verified one another's technologies. We proposed RGB coding technologies, namely the IPP and the RCT, and the RCT has been adopted as an RGB coding tool in the international standard.

Compared to conventional YCbCr coding, the proposed RGB coding methods not only improve the coding efficiency without color distortion, but also reduce the computational complexity, since they remove the color conversion process required at both the encoder and the decoder. Because the color conversion requires pixel-by-pixel floating-point operations, it consumes considerable power in mobile applications; removing it makes RGB coding easy to apply to products.

As a first step, we applied the RGB coding to a semiconductor integrated circuit (IC) devised to enhance the output image quality of Samsung liquid crystal displays (LCDs) by reducing the response time. This process, called dynamic capacitance compensation (DCC), requires storing the image to be displayed in external memory such as dynamic random access memory (DRAM). Previously, the RGB signal was converted into YCbCr and the chroma samples were downsampled to reduce the amount of data to be stored to 2/3, which results in inevitable color distortion. By applying the RGB coding to the DCC process, we compressed the image to 1/3 of its size without any color distortion or image quality degradation. This halved the number of DRAMs and reduced the number of I/O pins of the chip, which lowers the cost and the package size and improves the electromagnetic interference (EMI) characteristics.

6. CONCLUSIONS


In this paper, we introduced RGB coding methods, which open a new path in video coding by eliminating the color conversion process that causes undesirable color distortion. The RGB coding methods achieve high coding efficiency while preserving the true color representation. Through theoretical analysis we derived the achievable PSNR limit when color space conversion is involved, showing that coding in the given space is important for keeping the fidelity of the original data. The proposed coding technology avoids the color distortion and achieves high PSNR values, which are essential for high-quality multimedia applications, and it works well over various image sets such as HD materials, film-scanned data, and Thomson Viper sequences. This makes the proposed RGB coding methods ideal for high-quality multimedia applications such as digital cinema, HD-DVD, and broadcast production. The technology has been verified by many companies and organizations worldwide and adopted as an international standard, and it has also been applied to Samsung LCD products to keep them competitive in the upcoming display market.

REFERENCES

[1] V. Bhaskaran and K. Konstantinides, Image and Video Compression Standards - Algorithms and Architectures, 2nd Edition, Kluwer Academic Publishers (1999).
[2] G. Sullivan, "Approximate theoretical analysis of RGB to YCbCr to RGB conversion error," ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Document JVT-I017 (2003).
[3] "Final Draft International Standard of Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC 14496-10 AVC)," ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Document JVT-G050 (2003).
[4] P. Hao and Q. Shi, "Comparative study of color transforms for image coding and derivation of integer reversible color transform," IEEE Int'l Conf. Pattern Recognition (ICPR'00), vol. 3, Barcelona, Spain (2000).
[5] "Fidelity range extensions amendment to ITU-T Rec. H.264 | ISO/IEC 14496-10 AVC," ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Document JVT-L047 (2004).
[6] H. Malvar and G. Sullivan, "YCoCg-R: A color space with RGB reversibility and low dynamic range," ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Document JVT-I014 (2003).
[7] P. Topiwala and C. Tu, "New invertible integer color transforms based on lifting steps and coding of 4:4:4 video," ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Document JVT-I015 (2003).
[8] G. Strang and T. Nguyen, Wavelets and Filter Banks, Wellesley-Cambridge Press (1997).
[9] H. M. Kim, W.-S. Kim, and D. Cho, "A New Transform for RGB Coding," IEEE Int'l Conf. on Image Processing (ICIP'04), Singapore (2004).
[10] Available at ftp://ftp.ipl.rpi.edu/stills/kodak/color.
[11] A. Papoulis, Probability & Statistics, Prentice Hall (1990).
[12] W.-S. Kim, D. Cho, and H. M. Kim, "Color Format Extension," ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Document JVT-H018 (2003).
[13] W.-S. Kim, D. Cho, and H. M. Kim, "Proposal for the unsolved issues in professional extensions," ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Document JVT-I012 (2003).
[14] W.-S. Kim, D. Cho, and H. M. Kim, "Inter-plane Prediction for RGB Coding," ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Document JVT-I023 (2003).
[15] W.-S. Kim, D. Cho, and H. M. Kim, "Inter-plane Prediction for RGB Coding II," ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Document JVT-J017 (2003).
[16] W.-S. Kim, D. Cho, and H. M. Kim, "Inter-Plane Prediction for RGB Coding," IEEE Int'l Conf. on Image Processing (ICIP'04), Singapore (2004).
[17] W.-S. Kim and H. M. Kim, "Residue transform," ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Document JVT-J038 (2003).
[18] W.-S. Kim, D. Birinov, and H. M. Kim, "Adaptive residue transform and sampling," ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Document JVT-K018 (2004).
[19] W.-S. Kim, D. Birinov, and H. M. Kim, "Residue color transform," ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Document JVT-L025 (2004).
[20] W. Gish and H. Yu, "Extended sample depth: Implementation and characterization," ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Document JVT-H016 (2003).
[21] D. C. Mead, "USNB request new work item for electronic cinema," ISO/IEC JTC1/SC29/WG11, Document W5674 (2000).
[22] WG11 Requirements Group, "Digital cinema requirements," ISO/IEC JTC1/SC29/WG11, Document N5328 (2002).
[23] WG11 Requirements Group, "Call for proposals for extended sample bit depth and chroma format support in the advanced video coding standard (ITU-T H.264 & ISO/IEC 14496-10)," ISO/IEC JTC1/SC29/WG11, Document N5523 (2003).
