Lossless Compression of Pre-Press Images Using a Novel Color Decorrelation Technique

Steven Van Assche, Wilfried Philips, and Ignace Lemahieu
ELIS - MEDISIP - IBITECH, University of Ghent, St.-Pietersnieuwstraat 41, B-9000 Gent, Belgium
ABSTRACT
In the pre-press industry color images have both a high spatial and a high color resolution. Such images require a considerable amount of storage space and impose long transmission times. Data compression is desired to reduce these storage and transmission problems. Because of the high quality requirements in the pre-press industry, only lossless compression is acceptable. Most existing lossless compression schemes operate on gray-scale images. In this case the color components of color images must be compressed independently. However, higher compression ratios can be achieved by exploiting inter-color redundancies. In this paper a new lossless color transform is proposed, based on the Karhunen-Loeve Transform (KLT). This transform removes redundancies in the color representation of each pixel and can be combined with many existing compression schemes. In this paper it is combined with a prediction scheme that exploits spatial redundancies. The results presented in this paper show that the color transform effectively decorrelates the color components and that it typically saves about half a bit to two bits per pixel, compared to a purely predictive scheme.

Keywords: data compression, lossless continuous-tone image compression, color decorrelation
1. INTRODUCTION
In pre-press companies, documents (e.g., magazines, brochures and advertising posters) are nowadays usually composed electronically. These documents contain images with high spatial resolution (typically 2000×2000 pixels) and high color resolution (24 bit per pixel for RGB images and 32 bit per pixel for CMYK images, which are more commonly used). Hence, such images occupy considerable amounts of storage space (typically 16 Mbyte/image) and also pose transmission problems (e.g., transmitting images from a pre-press house to a printing company over ISDN lines takes 15 to 30 minutes). Clearly, data compression can significantly alleviate these problems. Because of the high quality requirements imposed by the customers, the pre-press industry is reluctant to adopt lossy compression techniques (i.e., they do not accept any degradation in image quality) and only lossless compression is acceptable (at least today).

As far as lossless compression is concerned, many techniques are available, at least for gray-scale images. Some of the current state-of-the-art lossless compression techniques for contone gray-scale images are lossless JPEG (the current standard),[1] BTPC (a binary pyramid coder),[2] FELICS (a "fast efficient lossless image coder"),[3] the S+P-transform (an integer wavelet coder)[4-6] and CALIC (a highly optimized technique with a complex prediction scheme).[7] According to our experiments,[8] CALIC yields the highest lossless compression ratios (of the order of 2), but it is also the slowest of the techniques mentioned. In the case of color images, the above techniques usually process each of the color components independently, i.e., the gray-scale images obtained after color separation are encoded separately. In this case, the K-component of a CMYK image usually compresses better than the other components (factor of 3 to 4 instead of 2), but the mean
Author information: (send correspondence to S.V.A.)
S.V.A.: Email: [email protected]; Tel.: ++32-9-264.89.08; Fax: ++32-9-264.35.94; Research Assistant.
W.P.: Email: [email protected]; Tel.: ++32-9-264.33.85; Fax: ++32-9-264.42.95; Professor.
I.L.: Email: [email protected]; Tel.: ++32-9-264.42.32; Fax: ++32-9-264.35.94; Research Associate with the FWO - Vlaanderen, Belgium.
compression ratio for all components remains of the order of 2. It is clear that higher compression ratios can be achieved by exploiting inter-component redundancies. Nowadays, only few proposed techniques exploit such color redundancies. Two noteworthy exceptions[9,10] operate on RGB color images. Both techniques first compress the red component and then use information derived from the red component to predict the green component; subsequently, the green component is used in the same manner to predict the blue component. Unfortunately, this approach provides only a slight improvement in compression ratio (about 5%).

This paper proposes a new technique for exploiting inter-component redundancies. The technique is based on a modified Karhunen-Loeve Transform (KLT) scheme in combination with a novel quantization scheme that guarantees losslessness. The KLT decorrelates the color components. It is recomputed for every image region (more specifically: the image is divided into blocks or segmented) and is therefore spatially adaptive. It can be combined with several spatial prediction techniques. In this paper we investigate the combination with a lossless JPEG predictor. Section 2 describes how the modified KLT is able to achieve both a lossless transform of the image components and a good compression ratio, which the usual KLT, and in fact all floating point transforms, are incapable of (when they achieve losslessness, the compression ratio drops to about 1). Section 3 gives an overview of the complete compression scheme. Section 4 presents experimental results on many pre-press images. These results show that an increase of more than 10% in compression ratio is possible over pure gray-scale techniques.
2. LOSSLESS TRANSFORM CODING
In lossy transform coding a data vector d, consisting of sample values of a signal or image, is multiplied by an orthogonal transform matrix T, which results in a coefficient vector a = T d. In the following we assume that the components of d are integers. A "good" transform yields coefficients a_j which are very little correlated. This is important because rate-distortion theory predicts that the rate-distortion bound can be reached by coding each of the coefficients independently of the others, which greatly simplifies the coding problem. In practical transform coders, the coefficients a_j are quantized, i.e., converted into integers a'_j, which are then entropy-coded. In the simplest case (uniform quantization), a'_j = [a_j/Δ], where [x] denotes the integer nearest to x. In lossy compression, Δ is used to control the compression ratio (e.g., increasing Δ causes a decrease in bit rate at the expense of larger coding errors). Decompression involves transforming the quantized coefficients into an approximation d' = [T^t (Δa')] of the original data vector d.

In principle, all transform coding techniques can be made lossless by sufficiently reducing the quantization step Δ. Indeed, let N be the dimension of d. It is easily shown that d' = d for all possible data vectors d when Δ ≤ 1/√N; this is because ‖T^t(Δa') − d‖ = ‖Δa' − a‖ and because |Δa'_j − a_j| < Δ/2, which implies that |d'_j − d_j| ≤ ‖Δa' − a‖ < √N Δ/2 ≤ 1/2. It can also be shown that, in general, a transform coding technique cannot be lossless when Δ > 1/√N. Unfortunately, the compression ratio that corresponds to Δ = 1/√N is usually lower than 1, which means that lossless transform coding is useless in practice, at least in the manner described. In principle, transform coding techniques can also be made lossless by using higher values of Δ and coding the residual error. However, this approach does not work either, because the residual image is noise-like and cannot be compressed well.
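This bound is easy to check numerically. The following sketch (illustrative only, assuming NumPy; the orthogonal transform and the integer data are randomly generated rather than taken from the paper) verifies that with Δ slightly below 1/√N the round trip is exact:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 3

# A random orthogonal transform T (Q factor of a random matrix).
T, _ = np.linalg.qr(rng.standard_normal((N, N)))

# Quantization step just below the lossless bound 1/sqrt(N).
delta = 0.9 / np.sqrt(N)

d = rng.integers(0, 256, size=N)        # integer data vector
a = T @ d                               # transform coefficients
a_q = np.rint(a / delta)                # quantized coefficients (integers)
d_rec = np.rint(T.T @ (a_q * delta))    # decompressed data vector

# Each reconstruction error is below sqrt(N)*delta/2 < 1/2, so rounding
# recovers the original integers exactly.
assert np.array_equal(d_rec, d.astype(float))
```

With Δ larger than 1/√N the same round trip fails for some data vectors, which is exactly the problem the lossless scheme of this section avoids.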
We now present a novel technique for turning a lossy transform coding scheme into a successful lossless scheme. The procedure is based on the fact that any matrix T (even a non-orthogonal one) can be well approximated by two prediction operators P1[.] and P2[.], which map integers into integers, followed by a floating point scaling step:

    a = T d ≈ D P2[ P1[d] ],                                    (1)

where D is a diagonal matrix, e.g., for N = 3, D = diag(D1, D2, D3).
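A factorization of this kind is easiest to see in two dimensions: a rotation can be written as three shears, and each shear becomes integer-invertible when its floating point update is rounded. The sketch below is an illustrative two-component example of the P1/P2 idea (for a pure rotation the diagonal matrix D reduces to the identity); the identity R(θ) = Sx(p)·Sy(sin θ)·Sx(p) with p = (cos θ − 1)/sin θ is a classical construction, not taken from this paper:

```python
import numpy as np

theta = np.pi / 5
c, s = np.cos(theta), np.sin(theta)
p = (c - 1.0) / s            # shear parameter of the factorization

def forward(x, y):
    # Three rounded shear (lifting) steps approximating a rotation by theta.
    x = x + round(p * y)
    y = y + round(s * x)
    x = x + round(p * y)
    return x, y

def inverse(x, y):
    # Undo the shears in reverse order; exact on integer inputs.
    x = x - round(p * y)
    y = y - round(s * x)
    x = x - round(p * y)
    return x, y

# The mapping is lossless on integers while staying close to the true rotation.
assert inverse(*forward(123, -45)) == (123, -45)
```

Each step adds a rounded function of the *other* coordinate, so it can always be subtracted back off exactly; this is precisely why the prediction operators are invertible on integer vectors.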
An important property of the prediction operators P1[.] and P2[.] is that they are invertible when applied to integer vectors. Now consider the transform a'' = P[d], where P[d] = P2[P1[d]]. This operator maps integer data vectors d into integer coefficient vectors a'' and is lossless. Furthermore, in view of eq. (1), a'' ≈ D⁻¹a. Therefore, the statistical properties of the coefficients a''_j are similar to those of the corresponding coefficients a'_j in an "ordinary" transform scheme, but with a quantization step Δ_j = 1/D_j, which differs from coefficient to coefficient. Note that simply quantizing the coefficients a_j using the quantization steps Δ_j does not produce quantized coefficients a'_j from which the integer data vector d can be recovered error-free.

The experimental results in Sect. 4 clearly demonstrate that the lossless transform that we have just defined effectively compresses real data. In the following, we present some qualitative arguments to show why this is so. In lossy transform coding it is usually assumed that the coefficients a_j are Gaussian variables with zero mean. A crude approximation for the entropy of the quantized coefficient a''_j is then given by H''_j = 0.5 + log2(√(2π) σ_j) − log2(Δ_j), where Δ_j = 1/D_j. Therefore, the total number of bits required to entropy-code a'' equals H'' = H0 − log2(∏_{j=1..N} Δ_j), where H0 is the part of H'' which is independent of the quantization steps Δ_j. Similarly, for the "ordinary" lossy transform coding scheme, the corresponding bit rate is H' = H0 − N log2(Δ). As T is an orthogonal matrix, |det(T)| = 1, which implies that ∏_{j=1..N} D_j = |det(T)| = 1. Therefore, H'' = H0. On the other hand, if Δ = 1/√N (which is required for lossless coding), then H' = H0 + N log2(N)/2. This shows that, at least under these very crude assumptions, the proposed new lossless scheme saves about log2(N)/2 bits per transform coefficient.
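The rate bookkeeping above can be checked with a few lines of arithmetic. The values of σ_j and D_j below are illustrative (chosen so that the product of the D_j is 1, as required by |det(T)| = 1); the gap between the two totals reproduces the predicted N·log2(N)/2 bits:

```python
import numpy as np

# Crude rate model from the text:
#   H(sigma, step) = 0.5 + log2(sqrt(2*pi)*sigma) - log2(step)
def rate(sigma, step):
    return 0.5 + np.log2(np.sqrt(2.0 * np.pi) * sigma) - np.log2(step)

N = 3
sigma = np.array([40.0, 10.0, 2.5])   # per-coefficient std. devs (illustrative)
D = np.array([2.0, 1.0, 0.5])         # diagonal scalings with prod(D) = 1

H_lossless = rate(sigma, 1.0 / D).sum()           # proposed: step_j = 1/D_j
H_ordinary = rate(sigma, 1.0 / np.sqrt(N)).sum()  # ordinary: uniform 1/sqrt(N)

print(H_ordinary - H_lossless)        # N*log2(N)/2 = 2.377... bits for N = 3
```

Because the D_j-dependent terms cancel (their product is 1), the gap depends only on N, matching the qualitative argument in the text.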
3. AN OVERVIEW OF THE LOSSLESS COMPRESSION SCHEME
Figure 1 displays an overview of the proposed coder and fig. 2 of the corresponding decoder. In the following paragraphs we briefly describe the different steps of the coding process. We will not discuss the decoder in detail because it is basically the inverse of the coder.
Figure 1. The encoder of the proposed scheme
In the proposed scheme, spatial redundancies are removed using a spatial prediction scheme, while color redundancy is removed using the lossless KLT (the KLT may be applied before or after the prediction step). Finally, the resulting coefficients are presented to an entropy coder.
3.1. The lossless Karhunen-Loeve Transform
The Karhunen-Loeve Transform is defined as the transform which decorrelates the components of a random vector.[11] In classical compression schemes decorrelation is usually applied in the spatial domain, where the KLT is usually replaced by the DCT, which is close to optimal in practice. In any case, the KLT can be well approximated by a lossless operator, as explained in the previous section. We will refer to this operator as the lossless KLT. In the proposed scheme the lossless KLT is applied in the color domain, i.e., to transform the color values of a pixel into decorrelated numbers. In fact the KLT is computed for a region of pixels and is therefore region-dependent. This is to exploit the fact that the color statistics in an image can vary strongly from region to region. In our current
Figure 2. The decoder of the proposed scheme
implementation, image regions can be constructed in two ways: simply by dividing the image into blocks of the same size, or by segmenting the image into regions of homogeneous color. The applied segmentation algorithm will be discussed in the next paragraph. As the lossless KLT for an image region depends on the color statistics in that region, which are unknown to the receiver, side information must be transmitted so that the receiver can reconstruct the lossless KLT itself. For this purpose, we use the following scheme: we start by computing the orthogonal KLT matrix T for a given region; next, we expand T as the product of rotation matrices, each of which is completely specified by one rotation angle. These rotation angles are quantized to 8 bit and are used to construct an (exactly orthogonal) approximation T' of T. The lossless KLT is then derived from T' (instead of from T). The quantized rotation angles are sent to the receiver, which reconstructs the lossless KLT from them. The overhead involved is negligible in practice because each rotation angle requires only 8 bit and because the number of rotation angles is small (i.e., 3 for N = 3 and 6 for N = 4).
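The side-information scheme can be sketched as follows (a simplified illustration, assuming NumPy; the correlated pixel statistics are synthetic, and the ZYX rotation-angle expansion is one convenient choice of rotation factorization, ignoring gimbal-lock corner cases):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic correlated 3-component (RGB-like) pixel data for one region.
r = rng.normal(128.0, 40.0, 2500)
pix = np.stack([r,
                0.8 * r + rng.normal(0.0, 8.0, 2500),
                0.6 * r + rng.normal(0.0, 12.0, 2500)])

# KLT: rows of T are eigenvectors of the 3x3 color covariance matrix.
_, vecs = np.linalg.eigh(np.cov(pix))
T = vecs.T
if np.linalg.det(T) < 0:
    T[0] = -T[0]                      # force a proper rotation (det = +1)

def Rz(a): return np.array([[np.cos(a), -np.sin(a), 0.0],
                            [np.sin(a),  np.cos(a), 0.0],
                            [0.0, 0.0, 1.0]])

def Ry(a): return np.array([[np.cos(a), 0.0, np.sin(a)],
                            [0.0, 1.0, 0.0],
                            [-np.sin(a), 0.0, np.cos(a)]])

def Rx(a): return np.array([[1.0, 0.0, 0.0],
                            [0.0, np.cos(a), -np.sin(a)],
                            [0.0, np.sin(a),  np.cos(a)]])

# Expand T = Rz(az) Ry(ay) Rx(ax) and quantize each angle to 8 bit.
ay = np.arctan2(-T[2, 0], np.hypot(T[0, 0], T[1, 0]))
az = np.arctan2(T[1, 0], T[0, 0])
ax = np.arctan2(T[2, 1], T[2, 2])
az_q, ay_q, ax_q = (round(a * 256 / (2 * np.pi)) * 2 * np.pi / 256
                    for a in (az, ay, ax))     # 3 x 8 bit of side information

T_rec = Rz(az_q) @ Ry(ay_q) @ Rx(ax_q)   # exactly orthogonal by construction
```

Because T_rec is built as a product of rotation matrices it is exactly orthogonal, while each quantized angle deviates from the true one by at most π/256.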
3.2. Segmentation
Dividing the image into equally-sized blocks is an easy way to crudely isolate image regions with homogeneous colors. Indeed, we expect colors not to differ much in such blocks as long as the block size is small. However, if the same block size is used for the whole image, then this block size must be a compromise between large block sizes, preferable in large image regions with more or less the same color, and very small block sizes, preferable in image regions with frequent color changes. The block size can be made adaptive to the color information in the image by using segmentation. Of course, segmented image regions do not have to be squares, but can take any suitable shape. The segmentation algorithm incorporated in our present implementation was developed by Christopoulos et al.[12] It performs a split-and-merge segmentation: starting with an over-segmented image, gradually more small segments (belonging to the same object) are merged until the desired final number of segments is obtained. This technique is applied to the original image after conversion to the YUV color space. This way, a color segmentation is performed rather than a gray-scale segmentation, yielding a better segmentation. In the case of the segmentation-based technique, the location and the shape of the segmented regions must be coded, which is not the case in the block-based scheme. In our implementation, the contours of the segmented regions are coded losslessly. The overhead involved is considerable, and can only be compensated by an adequate segmentation yielding better coding of the transformed coefficients.
3.3. Spatial prediction and entropy-coding
In most images spatial redundancies are much higher than inter-color redundancies. In any case, spatial redundancies must be removed in order to maximize the compression ratio. In principle, any spatial decorrelation scheme may be combined with the above lossless KLT color transform to further remove (spatial) redundancies. Currently we have implemented lossless JPEG predictor no. 7,[1] which is a simple linear predictor. If Ia and Il are the intensities of the pixel above and the pixel to the left of the pixel to be coded, then the prediction is ⌊(Ia + Il)/2⌋, where ⌊x⌋ denotes the largest integer not greater than x.

The data which remains after the (spatial and color) decorrelation is very non-uniformly distributed. Therefore, it should be entropy-coded. In principle, both Huffman and arithmetic coders can be used. We did not include an entropy coder in our implementation yet. This is not really a drawback, because the entropy figures obtained from the present implementation will be very close to the figures one would obtain using an arithmetic coder.
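The residual computation for predictor no. 7 can be sketched as follows (an illustration, assuming NumPy; border pixels are predicted from zero here, which need not match the exact lossless JPEG border convention):

```python
import numpy as np

def ljpg7_residuals(img):
    """Residuals of lossless JPEG predictor no. 7: floor((above + left) / 2)."""
    img = img.astype(np.int32)
    res = np.empty_like(img)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            a = img[y - 1, x] if y > 0 else 0   # pixel above (border -> 0)
            l = img[y, x - 1] if x > 0 else 0   # pixel to the left
            res[y, x] = img[y, x] - (a + l) // 2
    return res

# The decoder runs the same raster scan and adds floor((Ia + Il)/2) back,
# so the reconstruction is exact.
res = ljpg7_residuals(np.array([[10, 12, 11],
                                [ 9, 13, 14],
                                [10, 11, 12]]))
```

The residuals are strongly peaked around zero for smooth image content, which is what makes the subsequent entropy coding effective.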
4. EXPERIMENTAL RESULTS
We have tested the proposed color decorrelation scheme on several (mainly pre-press) RGB and CMYK images. In each case we reconstructed the image and verified the losslessness of the proposed scheme. In this section, we first describe the results of some preliminary experiments aimed at optimizing the proposed scheme. Next, we present compression results obtained on a large set of images.

In the first experiment we evaluated the color decorrelation technique by itself (i.e., not in combination with a spatial decorrelation technique) on the CMYK image "musicians", which is representative for pre-press applications. Note that in this case the color decorrelation is performed on blocks, rather than on segmented regions. When applying only the KLT, a bit rate of 19.86 bit per pixel (bpp) is achieved. It is interesting to compare this result to bit rates obtained with pure gray-scale techniques, which compress the color separations independently. With the LJPG-based technique a bit rate of 18.87 bpp is obtained. This result shows that the KLT effectively compresses the image data, but it also demonstrates that spatial correlations are more important than inter-color dependencies.

In the second experiment, we investigated the optimal order in which to combine the KLT inter-color redundancy removal and the spatial prediction. When the KLT is applied first, followed by LJPG prediction, the bit rate is 17.84 bpp. However, higher gains are obtained when applying the lossless KLT step after the spatial prediction step, as a kind of error modeling. In this case a bit rate of 16.27 bpp is achieved, which is a 15% improvement compared to the purely spatial scheme. In the following we always performed the lossless KLT step after the spatial prediction step.

In the third experiment, the influence of the block size is investigated for the same image; see fig. 3.
The results show that the color decorrelation is more efficient at small block sizes, because inter-color dependencies tend to be local. For "musicians", the optimal compromise is a block size of 50×50. Our experiments have shown that the optimal block size strongly depends on the image (optimal block sizes vary from 10×10 to 100×100); however, the bit rate does not vary rapidly as a function of the block size.
Figure 3. Influence of the block size on the compression of "musicians" (bit rate in bpp versus block size)
For the segmentation-based technique, important parameters are the maximum number of segmented regions one allows to be constructed and the minimum number of pixels one requires in a segmented region. In order to deal with the very long processing times for segmentation, we did not apply the segmentation to the image as a whole. Instead, segmentation is applied to smaller image blocks, one at a time (typical size 512×512, certainly larger than the block sizes used in the block-based scheme). In this case, the performance of the scheme does not really depend on the parameters mentioned. Typical values for the number of segmented regions in a 512×512 block range from 5 to 30. Note that the segmentation is applied to the original image, while the KLT is applied to the error image.

In conclusion, the preliminary experiments above show that it is best to apply the KLT after prediction and to use a block size of 10×10 to 100×100, depending on the image. In the following evaluation, we compressed the images with a block size of 50×50 (nearly optimal for most images), and for the segmentation-based technique we set the minimum number of pixels per region to 100. Tables 1 and 2 show compression results for some 32 bpp and 24 bpp color images, including the well-known "lena" and "peppers" images. In each case, the bit rate is listed for the LJPG-based scheme and the LJPG+KLT schemes (once block-based and once segmentation-based), and for the pure gray-scale schemes CALIC, FELICS and the S+P-transform.

The FELICS technique[3] is a fast image compressor, which predicts the current pixel's value I based upon the values of the two nearest already known pixels Ia and Il, the smallest one being denoted as L and the largest one as H. Depending on the position of I with respect to the interval [L, H], an adaptive binary coding is used (if I ∈ [L, H]) or a fast exponential Rice coding (if I ∉ [L, H]). The S+P-transform (scaling + prediction)[4-6] performs a kind of subband decomposition and can be seen as a reversible (i.e., lossless) wavelet transform. At each decomposition step, the S-transform converts the latest calculated low-resolution image into four new images which are the combinations of high- and low-resolution versions in the horizontal and vertical directions. Then the P-transform (prediction) is performed on the high-pass versions. In our S+P-transform implementation, entropy coding is performed by an arithmetic coder. The third codec, CALIC,[7] uses both a sophisticated non-linear predictor and an efficient context modeling of the prediction errors to yield high compression. Prediction is performed in two steps.
First, a gradient-adjusted prediction of the pixel to be coded I is calculated. Depending on the local gradients in pixel intensities, an appropriate predictor is selected. This way, the sharpness and the direction of edges are detected and taken into account. In the second prediction step, context modeling is used to refine this prediction further; the context includes spatial texture patterns and the energy of past prediction errors. This two-step prediction scheme is able to greatly reduce spatial redundancies. Note that CALIC incorporates some special performance boosters (e.g., its binary mode for binary-type images, text-image combinations, and images with large uniform areas). The prediction errors are coded by an arithmetic coder. Table 1 shows results for some typical continuous-tone pre-press images for which our KLT technique was intended, while table 2 lists results for less typical color images, mainly images containing line-art or text (on which the KLT is expected to perform rather poorly).
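For reference, the reversible S-step on a single pair of integer samples can be sketched as follows (an illustration of the general idea only; the P prediction step and the separable row/column application are omitted):

```python
def s_transform(a, b):
    """Reversible S-transform of an integer pair: rounded mean and difference."""
    h = a - b                # high-pass: difference
    l = b + (h >> 1)         # low-pass: equals floor((a + b) / 2)
    return l, h

def s_inverse(l, h):
    b = l - (h >> 1)
    return b + h, b          # recovers (a, b) exactly

# Lossless round trip, including negative values.
assert s_inverse(*s_transform(7, 3)) == (7, 3)
assert s_inverse(*s_transform(-5, 8)) == (-5, 8)
```

The truncating shift discards half a bit per pair, yet losslessness is preserved because the discarded bit is recoverable from the difference h, which is why the transform is reversible on integers.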
Table 1. Bit rate (bpp) for some typical continuous-tone pre-press color images

             Number of   Orig.  LJPG   LJPG+KLT  LJPG+KLT    CALIC  FELICS  S+P
             pixels      (bpp)  only   (blocks)  (segment.)
 musicians   1853×2103   32     18.87  16.27     15.66       16.58  18.56   17.28
 scid0       2048×2560   32     16.37  14.28     14.26       14.16  16.15   15.69
 scid3       2048×2560   32     16.76  14.67     14.39       13.85  16.16   16.13
 woman       2048×2560   32     18.92  15.06     15.77       16.16  18.30   17.06
 bike        2048×2056   32     16.41  13.95     13.57       14.04  15.86   15.53
 cafe        2048×2056   32     22.86  19.81     19.64       18.82  21.80   21.11
 water       3072×2048   24     7.48   4.68      4.58        5.21   7.08    7.07
 cats        3072×2048   24     10.90  6.01      6.11        7.54   9.90    9.23

The results in tables 1 and 2 show that the KLT step leads to a considerable decrease in bit rate (0.5 to 2 bpp) on most images compared with the purely spatial LJPG-based technique, even on the non-typical images. Compared to the CALIC scheme, the compression gains on the typical images are quite moderate (< 0.5 bpp), and on the non-typical images our method performs worse than CALIC. Compared to FELICS and the S+P-transform, our technique yields better results, except for the image "tools", which also poses problems to other compression algorithms (e.g., see the bit rate for CALIC).
Table 2. Bit rate (bpp) for some less typical color images

             Number of   Orig.  LJPG   LJPG+KLT  LJPG+KLT    CALIC  FELICS  S+P
             pixels      (bpp)  only   (blocks)  (segment.)
 koekjes1    827×591     32     18.90  17.25     17.03       16.00  18.05   17.57
 koekjes2    839×638     32     18.35  16.80     16.50       15.76  17.58   17.13
 tools       1524×1200   32     22.27  22.11     21.88       19.75  21.66   21.71
 timep x d   339×432     32     19.46  18.23     18.26       16.58  18.64   18.59
 cmpnd2      1024×1400   24     6.76   3.95      3.58        3.72   7.19    8.75
 graphic†    2644×3046   24     8.34   8.38      8.41        6.78   8.55    7.64
 chart s     1688×2347   24     11.81  10.71     12.44       7.99   10.31   10.45
 lena        512×512     24     14.42  13.64     13.61       13.19  14.52   13.70
 peppers     512×512     24     15.27  15.26     15.25       13.87  15.43   14.90

† "graphic" is an image represented in the L*a*b* color space.

It is somewhat surprising that the segmentation-based technique does not outperform the block-based technique, and it is even more striking that it yields worse compression on some images. However, this could be expected: segmentation is beneficial in image regions with many color changes (a better separation of homogeneous regions certainly compensates for the extra contour-coding bits), while in rather smooth image regions segmentation cannot compensate for the extra contour-coding bits.
5. CONCLUSION
This paper introduces a new lossless color transform, based on the Karhunen-Loeve Transform (KLT). The transform removes redundancies in the color representation of each pixel. The KLT is applied to small image regions with a more or less homogeneous color, obtained either by dividing the image into equally-sized blocks or by segmentation. Spatial redundancies are dealt with by means of a simple linear predictor. The proposed scheme was tested on several images, and the results show that it typically saves about half a bit to two bits per pixel, compared to a purely predictive scheme.
Acknowledgments
This work was financially supported by the Belgian National Fund for Scientific Research (NFWO) through a mandate of "postdoctoral research fellow" and through the projects 39.0051.93 and 31.5831.95, and by the Flemish Institute for the Advancement of Scientific-Technological Research in Industry (IWT) through the projects Tele-Visie (IWT 950202) and Samset (IWT 950204).
REFERENCES
1. International Telegraph and Telephone Consultative Committee (CCITT), Digital Compression and Coding of Continuous-Tone Still Images. Recommendation T.81, 1992.
2. J. A. Robinson, "Efficient general-purpose image compression with binary tree predictive coding," IEEE Transactions on Image Processing, vol. 6, pp. 601-608, Apr. 1997.
3. P. G. Howard, The Design and Analysis of Efficient Lossless Data Compression Systems. PhD thesis, Department of Computer Science, Brown University, Providence, Rhode Island, June 1993.
4. A. Said and W. A. Pearlman, "An image multiresolution representation for lossless and lossy compression," in SPIE Symposium on Visual Communications and Image Processing, Cambridge, MA, Nov. 1993.
5. A. Said and W. A. Pearlman, "Image compression via multiresolution representation and predictive coding," in Visual Communications and Image Processing, no. 2094 in SPIE, pp. 664-674, Nov. 1993.
6. S. Dewitte and J. Cornelis, "Lossless integer wavelet transform," IEEE Signal Processing Letters, 1997. To be published.
7. X. Wu and N. Memon, "CALIC - a context based adaptive lossless image codec," in IEEE International Conference on Acoustics, Speech, & Signal Processing, vol. 4, pp. 1890-1893, May 1996.
8. K. Denecker, J. Van Overloop, and I. Lemahieu, "An experimental comparison of several lossless image coders for medical images," in Proceedings of the Data Compression Industry Workshop (S. Hagerty and R. Renner, eds.), (Snowbird, Utah, USA), pp. 67-76, Ball Aerospace & Technologies Corp., Mar. 1997.
9. N. D. Memon and K. Sayood, "Lossless compression of RGB color images," Optical Engineering, vol. 34, no. 6, pp. 1711-1717, 1995.
10. K. Denecker and I. Lemahieu, "Lossless colour image compression using inter-colour error prediction," in Proceedings of the PRORISC IEEE Benelux Workshop on Circuits, Systems and Signal Processing (J.-P. Veen, ed.), (Mierlo, Nederland), pp. 95-100, STW Technology Foundation, Nov. 1996.
11. W. K. Pratt, Digital Image Processing. New York: Wiley-Interscience, second ed., 1991.
12. C. Christopoulos, A. Skodras, W. Philips, J. Cornelis, and A. Constantinidis, "Progressive very low bit rate image coding," in Proceedings of the International Conference on Digital Signal Processing (DSP95), vol. 2, pp. 433-438, Limassol, Cyprus, June 1995.
13. W. Philips and K. Denecker, "A new embedded lossless/quasi-lossless image coder based on the Hadamard transform," in Proceedings of the IEEE International Conference on Image Processing (ICIP97), 1997. To be published.