Efficient lossless compression of colour image data with diverse characteristics

Tilo Strutz, Thomas Schmiedel and Thomas Schatton
Deutsche Telekom AG, Hochschule für Telekommunikation Leipzig, Institute of Communications Engineering
Gustav-Freytag-Str. 43-45, 04277 Leipzig, Germany

Abstract—Lossless compression of data is the attempt to model the data in such a manner that the resulting file contains the pure information without any redundancy. This paper presents a universal compression system which is able to compress colour images with a broad range of characteristics efficiently, from natural images to synthetic images with sparse histograms. The proposed compression scheme uses a novel adaptive parameter selection and a pre-coding stage which utilises not only the local correlation of the signal values, but also exploits long-distance dependencies. The new approach was tested against several state-of-the-art compressors on a set of eighteen images, demonstrating that the average compression efficiency of the proposed scheme is superior.

I. INTRODUCTION

The reversible compression of images requires invertible processing steps. In general, this characteristic is achieved by using processing steps which map integer input samples to integer output values, particularly in the decorrelation step. Successful compression schemes either predict the pixel values based on other pixels which have already been processed or apply integer transforms such as the integer wavelet transform (IWT) [1]. Typically, the lossless compression performance of prediction-based systems is higher than that of wavelet-based systems (see for example [2]). In prediction-based systems, the prediction errors are typically entropy coded. The remaining statistical dependencies are utilised by context-adaptive switching between different coding tables (e.g. [3], [4]). In combination with arithmetic coding, the modelling of the error distribution (e.g. [5]-[11]) and even the blending of the distributions of different predictors are also promising (e.g. [12]).

In the past, research on lossless image coding was mostly focussed on the compression of natural grey-scale images such as photographs. There are, however, many computer-generated images which have a more synthetic characteristic and cannot be compressed efficiently with the same techniques. Typically, the portable network graphics (PNG) format is the container of choice for storing synthetic images. Unfortunately, PNG does not compress natural images very well. Moreover, images are not always purely natural or purely synthetic (such as simple diagrams or charts); the full spectrum also contains mixtures of computer-generated content and photographs, as well as rendered images, which are computer-generated but look like natural scenes.

Fig. 1. Processing stages of the proposed image compression system: colour conversion (RGB, YUVr, YCgCo, HP2, colour map), prediction (block-based selection), interleaving (Y|U|V, YY|uvuv, YuvYuv, YYuuvv), pre-coding (RLC 1 - BWT - MTF/IFC - RLC 2) and entropy coding (Huffman coding), producing the bitstream.

The need for reversible coding techniques stems from several applications, which all share the requirement that alteration of the image content is forbidden or undesirable. The reasons could be legal restrictions, for example, or iterative image processing, where each cycle could degrade the image quality if lossy compression methods were used.

This paper proposes a coding approach which is capable of compressing images successfully independently of their characteristics. This avoids the necessity of manually selecting one of a variety of compression formats. The data compaction is mainly based on a block-sorting algorithm, also known as the Burrows-Wheeler transform (BWT) [13]. The BWT is combined with a ranking technique and run-length coding. An analysis of basic image properties controls the selection of optimal processing parameters.

The paper is organised as follows: Section II explains the entire coding algorithm. The compression results are presented in Section III in comparison to state-of-the-art methods. Section IV discusses the results and concludes the paper.

II. METHOD

A. General overview

Figure 1 shows the stages of the encoding process. It starts with an adaptive colour-space conversion. An analysis tool checks which of the four colour spaces – RGB (i.e. no conversion), YUVr, YCgCo-R or HP2 – yields the best decorrelation of the data in terms of the lowest entropy. YUVr corresponds to the reversible colour transform of JPEG2000 [14]

  Ur = B − G,  Vr = R − G,  Yr = G + ⌊(Ur + Vr)/4⌋ ,   (1)

while YCgCo-R is a proposal for the fidelity range extension of the video coding standard H.264 [15]

  Co = R − B,  t = B + ⌊Co/2⌋,  Cg = G − t,  Y = t + ⌊Cg/2⌋ .   (2)

HP2 is defined in the CharLS project [16] as

  Y = G,  U = B − ⌊(R + G)/2⌋ ,  V = R − G .   (3)

Fig. 2. Example distribution of symbols after successive processing steps (close-up): symbol counts (in counts/10000) over the symbol values 0 to 5 for the outputs of the prediction, RLC 1, IFC and RLC 2 stages.
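For illustration, the following sketch implements the forward and inverse mappings of equations (1) and (2) in Python; the function names and the per-pixel interface are our own choices, not part of any reference implementation. Floor division reproduces the integer rounding ⌊·⌋, which is what makes the transforms exactly reversible.

```python
# Sketch of the reversible colour transforms of eqs. (1) and (2),
# assuming scalar 8-bit inputs; Python's floor division (//) matches
# the floor operator in the equations, so the round trip is exact.

def rgb_to_yuvr(r, g, b):                 # eq. (1), JPEG2000 reversible transform
    u = b - g
    v = r - g
    y = g + (u + v) // 4
    return y, u, v

def yuvr_to_rgb(y, u, v):
    g = y - (u + v) // 4                  # undo the update step first
    return v + g, g, u + g                # r, g, b

def rgb_to_ycgcor(r, g, b):               # eq. (2), YCgCo-R lifting steps
    co = r - b
    t = b + co // 2
    cg = g - t
    y = t + cg // 2
    return y, cg, co

def ycgcor_to_rgb(y, cg, co):
    t = y - cg // 2                       # invert the lifting steps in reverse order
    g = cg + t
    b = t - co // 2
    return b + co, g, b                   # r, g, b

# HP2 (eq. 3) is analogous: Y = G, U = B - (R + G) // 2, V = R - G.
assert yuvr_to_rgb(*rgb_to_yuvr(200, 13, 77)) == (200, 13, 77)
assert ycgcor_to_rgb(*rgb_to_ycgcor(200, 13, 77)) == (200, 13, 77)
```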

All methods convert the image data without loss of information. The decision in favour of a colour-space conversion can be overridden by the automated parameter selection under certain circumstances (see below).

The prediction is performed separately for each colour component. The image components are segmented into blocks of 24 × 24 pixels, and the best out of five predictors is selected:
- Median Edge Detector (MED) [3],
- Gradient-Adjusted Predictor (GAP) [6],
- Paeth Predictor [17],
- simple linear prediction,
- context-based adaptive linear prediction (CoBALP) [4].
The choice of the prediction method must be transmitted as side information along with the coded data.

The processing in the pre-coding stage is one-dimensional and requires a certain interleaving of the colour components. In principle, it would be possible to process the components independently of one another (Y|U|V). The components, however, remain somewhat correlated even after colour-space conversion and prediction, and their joint treatment benefits the compression performance in most cases. The scheme selects between (i) sample-wise interleaving (YuvYuv...), which is especially helpful if the processing is performed in the RGB colour space and existing correlations between the colour components must be exploited, (ii) a concatenation of components (YY..uu..vv..) and (iii) an interleaving with separation of luminance and chrominance components (YY..|uvuv..), where the Y component is processed independently while the chrominance components are interleaved sample by sample. Sketches of the interleaving modes and of the pre-coding steps are given at the end of this subsection.

The pre-coding stage is a sequence of run-length coding (RLC), block sorting, ranking and a second run-length coding (Figure 1). The RLC 1 reduces the number of symbols to be processed and also speeds up the BWT stage. The BWT sorts all symbols according to their context. The ranking step maps the sorted symbols to indices in order to reduce the variance and to produce runs of zeros. Here, either the move-to-front (MTF) coding method or the incremental frequency count (IFC) technique [18] is applied. In contrast to MTF, which moves the last processed symbol to the top of the list, the IFC method counts the occurrences of all symbols processed in the recent past and ranks them according to their frequency. As the distribution of symbols might change within an image, the counts are regularly scaled down by a factor of two. The scaling is triggered when a certain number of pixels has been processed or when the top count exceeds a maximum number. Both thresholds depend on the image size; the corresponding formulae were determined empirically, and more investigations are necessary for optimisation. The second RLC stage compacts the produced runs of identical symbols. This pre-coding queue is based on Nelson's algorithm for text compression, which uses RLC1-BWT-MTF-RLC2 [19] and was later adopted by bzip2 [20]. Both RLCs use an extra symbol (token) to mark runs of at least four identical symbols. Such a run is substituted by three values: token, symbol and run length, which means there is never an expansion of the signal length, but only an extension of the symbol alphabet.

The final processing step encodes the output symbols of the last RLC. Two Huffman codes are adaptively constructed, one for the symbols (including the token) and one for the run lengths. The code information is transmitted together with the coded data. All processing steps are fully reversible, and the corresponding decoding algorithm reconstructs the image data in reverse order.

Figure 2 shows an example of how the distribution becomes narrower during the processing steps. To enable comparison, the prediction-error signal and the RLC 1 output were folded into the non-negative range in an interlaced manner. In this example, the RLC 1 mainly compacts runs of zeros, as can be concluded from the 'Prediction' and 'RLC 1' curves. After block sorting and ranking, the distribution is narrower and the number of zeros is increased. The second run-length coding stage reduces the total number of symbols to be transmitted by compacting the newly created runs of zeros.
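The interleaving modes are simple data rearrangements; the following is a minimal sketch (our own illustration, assuming three equally long, flat component sequences after prediction):

```python
# Sketch of the component interleaving modes; y, u, v are flat symbol
# sequences of equal length. 'YY|uvuv' yields two streams that are
# pre-coded separately, the other joint modes yield a single stream.

def interleave(y, u, v, mode):
    if mode == "YuvYuv":                      # sample-wise interleaving
        return [s for triple in zip(y, u, v) for s in triple]
    if mode == "YYuuvv":                      # concatenation of components
        return list(y) + list(u) + list(v)
    if mode == "YY|uvuv":                     # Y alone, chrominance interleaved
        return [list(y), [s for pair in zip(u, v) for s in pair]]
    raise ValueError("unknown mode: " + mode)
```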
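The token-based run-length coding can be sketched as follows; the threshold of four identical symbols is taken from the text above, whereas the concrete token value is our own stand-in:

```python
# Sketch of the token-based RLC used in both RLC stages: runs of at
# least four identical symbols become (TOKEN, symbol, length), i.e.
# three output values, so the output never gets longer than the input;
# shorter runs are copied verbatim.

TOKEN = 256   # stand-in for the extra alphabet symbol marking a run

def rlc_encode(data):
    out, i = [], 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1                      # extend the current run
        if j - i >= 4:
            out += [TOKEN, data[i], j - i]
        else:
            out += data[i:j]
        i = j
    return out

# Example: six zeros collapse into one token triple.
assert rlc_encode([0, 0, 0, 0, 0, 0, 7]) == [TOKEN, 0, 6, 7]
```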
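Finally, a minimal sketch of the IFC ranking, assuming a fixed maximum count as the only re-scaling trigger (the image-size-dependent formulae mentioned above are not reproduced):

```python
# Sketch of incremental-frequency-count (IFC) ranking: each symbol is
# encoded as its position in a list kept sorted by recent frequency;
# halving the counts lets the ranking adapt to changing statistics.

def ifc_encode(data, alphabet_size=257, max_count=4096):
    counts = [0] * alphabet_size
    order = list(range(alphabet_size))    # symbols, most frequent first
    out = []
    for s in data:
        rank = order.index(s)
        out.append(rank)                  # frequent symbols -> small indices
        counts[s] += 1
        while rank > 0 and counts[order[rank]] > counts[order[rank - 1]]:
            order[rank - 1], order[rank] = order[rank], order[rank - 1]
            rank -= 1                     # bubble the symbol up the list
        if counts[s] > max_count:
            counts = [c // 2 for c in counts]   # simplified re-scaling
    return out
```

Since the decoder maintains the identical counts and list, the mapping from ranks back to symbols is unambiguous.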

Fig. 3. Categorisation of images in terms of compression parameters, dependent on image size N (number of pixels) and number of unique colours C: in the plane spanned by C and C/N, the regions correspond to 'RGB/YUV, prediction-based compression', 'RGB/YUV, no prediction' and 'colour map, no prediction'.

B. Automated parameter selection

The compression of data is optimal when the data model is understood as well as possible and this model can be addressed with an appropriate compression method. The proposed compression system offers some degrees of freedom in selecting suitable processing parameters: the colour space, the interleaving of colour components (both already discussed above), the prediction method, and the pre-coding technique in terms of using either move-to-front (MTF) coding or incremental frequency count (IFC). The parameter selection is controlled by only two properties of the image: the number of colours (and its relation to the size of the image) and the 'syntheticness'. The latter is explained below.

1) Categorisation of images: There are at least three general types of images with respect to data compression: images which should be processed using a colour map, images which should be decorrelated using a colour-space transform and prediction (alternatively, the spatial decorrelation could be performed using a signal transform such as a discrete wavelet or cosine transform), and images which should be processed without any spatial decorrelation step. In principle, the categorisation can be based on the number of pixels N and the number of unique colours C in the image (Figure 3). Please note that the diagram is not true to scale. If, for example, the quotient C/N is below a threshold and the number of colours is also below a certain number, then spatial decorrelation techniques (e.g. prediction) will most likely have adverse effects on the compression performance.

2) Syntheticness of images: Categorisation based on the number of colours C and the image size N is not entirely sufficient for optimal parameter selection. In addition, a measure of syntheticness is proposed, which is based on the characteristics of the colour components' histograms {h[r]}, {h[g]} and {h[b]}, where r, g and b stand for the intensity values of red, green and blue. The measure sums up the absolute differences between adjacent frequency counts h[x] with 0 ≤ x ≤ 255, x ∈ {r, g, b}, for components with 8 bits per pixel

  Sx = ( Σ_{x=0}^{254} |h[x+1] − h[x]| + |h[0] − h[255]| ) / N ,   (4)

where N is the image size. The value S = (Sr + Sg + Sb)/3 lies in the range [0, 2]. The larger S is, the more distinct peaks and gaps the histogram contains and the more likely the image is of a synthetic nature. Some images have a distinct background colour; the corresponding intensity value then produces a significant peak in the histogram and would bias S towards large values. The influence of such a background value is suppressed by excluding the modal value xmod = argmax_x(h[x]) from the summation in equation (4) and correcting the number of pixels accordingly.

3) Derived processing steps: Using the characterisation of images discussed above, the following compression parameters are used in the current implementation. If the number of unique colours C is lower than the fixed number of 100000, C/N < 0.032 and the syntheticness is larger than 0.1, then the image is processed using a colour map (which must be included in the bitstream), without prediction and using plain MTF coding after the BWT. If one of the three conditions is not met, the best colour space is determined as described in Subsection II-A. Further, the parameters are tested against the following thresholds: C < 90000, C/N < 0.055 and S > 0.1. If these conditions are met, the prediction is bypassed, and the interleaving mode YuvYuv and MTF coding are used. In all other cases, the block-based prediction is applied. If a colour-space conversion is enabled, YY|uvuv interleaving and IFC are activated. Otherwise (i.e. if RGB is preferred), the syntheticness determines the other coding parameters: if S > 0.9, then YuvYuv interleaving and MTF are used; otherwise, YYuuvv interleaving and IFC are activated. Sketches of the syntheticness computation and of these decision rules are given below.
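In code, the measure of equation (4) can be sketched as follows; since the text does not fully specify how the modal value is removed from the sum, the sketch simply skips all difference terms that touch the modal bin (an assumption on our part):

```python
# Sketch of the syntheticness S_x of eq. (4) for one 8-bit component
# histogram, with the background (modal) value excluded as described
# in the text; the exact exclusion rule is our interpretation.

def syntheticness(hist, n_pixels):
    x_mod = max(range(256), key=lambda i: hist[i])   # modal intensity
    s = sum(abs(hist[x + 1] - hist[x])
            for x in range(255)
            if x != x_mod and x + 1 != x_mod)        # skip modal terms
    if x_mod not in (0, 255):
        s += abs(hist[0] - hist[255])                # wrap-around term
    n = n_pixels - hist[x_mod]                       # corrected pixel count
    return s / n if n > 0 else 0.0

# Overall measure: S = (S_r + S_g + S_b) / 3, one histogram per component.
```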
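The threshold rules above translate directly into a small decision function; the constants are those given in the text, while the returned parameter names and the way the winning colour space is passed in are our own illustration:

```python
# Sketch of the automated parameter selection (Subsection II-B.3).
# C: number of unique colours, N: number of pixels, S: syntheticness,
# best_colour: winner of the entropy test of Subsection II-A.

def select_parameters(C, N, S, best_colour):
    if C < 100000 and C / N < 0.032 and S > 0.1:
        return {"colour": "map", "prediction": False, "ranking": "MTF"}
    if C < 90000 and C / N < 0.055 and S > 0.1:
        return {"colour": best_colour, "prediction": False,
                "interleaving": "YuvYuv", "ranking": "MTF"}
    if best_colour != "RGB":                  # colour conversion enabled
        return {"colour": best_colour, "prediction": True,
                "interleaving": "YY|uvuv", "ranking": "IFC"}
    if S > 0.9:
        return {"colour": "RGB", "prediction": True,
                "interleaving": "YuvYuv", "ranking": "MTF"}
    return {"colour": "RGB", "prediction": True,
            "interleaving": "YYuuvv", "ranking": "IFC"}
```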

4) Coding of the colour map: Since prediction is disabled when indexed colours are used, the map items can be sorted in a manner that allows efficient compression of the colour map; the order of the colours has only a marginal influence on the performance of the pre-coding steps described above. The new coding scheme computes a single colour value from each (R,G,B) triple

  cRGB = G · 2^16 + R · 2^8 + B .   (5)

The values cRGB are sorted in ascending order, with the effect that the list of G values starts with a small number followed by identical or slightly increasing values. The differences between adjacent G values, which are always non-negative, are fed into an adaptive Rice encoder. The coding of the list of R values follows the same principle as long as the G values do not change. As soon as a new G value appears, the encoder switches to another context and the R value is processed in a non-differential manner. The same holds true for the B values, with the distinction that both R and G must be constant as a prerequisite for the differential method; a sketch of this scheme follows below. The colour map of the image 'meditate fel' (see next section), for example, which contains 32096 unique colours, requires only 29528 bytes (4044 (G) + 4752 (R) + 19732 (B)), making the indexed compression the most efficient method for this image. To the authors' knowledge, the compression of images with colour maps having more than 256 entries has not been considered before.
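A sketch of the map preparation: packing according to equation (5), sorting, and the context-switched differential coding described above. The adaptive Rice coder itself is beyond this sketch, so the function only emits (context, value) pairs:

```python
# Sketch of the colour-map coding: entries are packed as in eq. (5)
# with G in the most significant byte and sorted, so G is non-decreasing
# and can always be coded differentially; R (and analogously B) is coded
# differentially only while the more significant components are constant.

def prepare_colour_map(colours):
    entries = sorted((g << 16) | (r << 8) | b for r, g, b in colours)
    prev_g, prev_r, out = None, None, []
    for c in entries:
        g, r, b = c >> 16, (c >> 8) & 255, c & 255
        out.append(("G.diff", g if prev_g is None else g - prev_g))
        if g == prev_g:
            out.append(("R.diff", r - prev_r))    # non-negative by sorting
        else:
            out.append(("R.raw", r))              # context switch: new G value
        out.append(("B.raw", b))    # B: same rule with R and G constant (omitted)
        prev_g, prev_r = g, r
    return out    # each (context, value) pair feeds an adaptive Rice coder
```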

III. COMPRESSION RESULTS

The performance of the new compression algorithm was tested on 18 images [21] with different characteristics, mostly taken from the internet; 'p08', 'woman' and 'bike' stem from a JPEG test set provided by Thomas Richter. Eight of the images are photographs or scanned samples and three are ray-traced images, while the others are charts, screen dumps or other computer-generated images.

Table I contains the compression results in bits per pixel. Column 'S' shows the value resulting from the analysis of the image histograms, equation (4), and C is the number of unique colours. The proposed algorithm is called TSIP, in the style of 'ZIP' for general lossless compression. 'Auto' means that all parameters of the algorithm are set automatically as described above; 'opt.' corresponds to manually tuned parameters. These are mainly different colour spaces and different block sizes for the block-based prediction, but also other predictors or different interleaving modes. In particular, if an image shows a mixture of synthetic and natural content, the automatic decision for or against a particular strategy might lead to suboptimal compression results. The results are compared to 'PNGOUT' (IrfanView, plugin PNGOUT, default options), LOCO-I [22] and '7zip' (version 9.20 with compression mode 'Ultra'). LOCO-I is the core algorithm of JPEG-LS [3]; its results were obtained using the same colour transform as automatically chosen by TSIP (Table I, column 'colour').

With regard to the tested images, the new TSIP compression scheme is almost always among the top two algorithms and requires the smallest number of bits per pixel on average. The image 'png-8passes' is the only example in which the proposed compression scheme performs distinctly worse than the best one (7zip). This fact is somewhat surprising, since TSIP performs comparably to or better than 7zip for the majority of the tested images. The proposed compression scheme is able to switch automatically to a mode with indexed colours. It should be noted that there are dedicated algorithms for images with a low number of unique colours, such as the piecewise-constant image model [23] and context-tree modelling [24]. These methods could not be included in the test due to the lack of an appropriate software implementation.

The compression time on a standard desktop computer varies for TSIP from a few seconds up to several minutes, when the BWT faces a high sorting depth due to many identical symbols. The source code, however, is not yet optimised for speed.

IV. SUMMARY AND DISCUSSION

We have proposed a novel coding scheme for the lossless compression of arbitrary colour images. In contrast to other compression systems focussing on natural images such as photographs, the coding stage of the new scheme does not rely on the local two-dimensional correlation of prediction errors, but is able to exploit long-distance correlations. This makes the new approach applicable to images with synthetic content as well, and high compression ratios can be obtained regardless of the image type.

The compression scheme described here distinguishes itself through an image-analysis-driven selection of parameter settings (colour space, prediction method, interleaving mode and pre-coding technique). The different interleaving techniques enable the exploitation of remaining correlations between the colour components. When using a YUV-like colour space, the combined treatment of the U and V components (i.e. using the YY|uvuv mode) benefits the compression of most images. Without any colour transform, full interleaving (YuvYuv) typically yields the best results.

The investigations have shown that the compression of images with colour maps containing far more than 256 unique colours can be more efficient than processing with three separate components. Using the proposed coding method for colour maps, even the LOCO-I algorithm compresses the synthetic image 'europa karte de' just as efficiently as 7zip.

The use of the Burrows-Wheeler transform (BWT) inhibits the exploitation of two-dimensional correlation, which means the new scheme cannot outperform specialised systems such as LOCO-I on images without any long-distance correlation. Based on a proper image analysis, however, the LOCO-I algorithm could be selected adaptively. The computation time is a drawback of the BWT: even if the source code were optimised for speed, the sorting would take longer than the simple processing in LOCO-I. Future investigations will consider variants of block sorting with limited sorting depth [25].

Column 'TSIP (opt.)' in Table I shows that there is room for improvement in the automated parameter selection. The block size for the block-based predictor selection should depend at least on the image size, for example, but content-adaptive strategies using quadtrees might also be a promising option [26]. In addition, the re-scaling of symbol counts in the IFC step could be adapted to the image properties. In general, a more sophisticated analysis of the image or of intermediate signals is needed. The segmentation of images whose content has varying properties is another problem which must be dealt with in the future. Finally, it should be mentioned that the implemented entropy-coding stage could be improved by replacing the Huffman coding with an adaptive arithmetic-coding stage. This would not only reduce the amount of compressed data, but also save the bits required for the transmission of the Huffman code tables.

TABLE I: COMPRESSION RESULTS IN BITS PER PIXEL [BPP]

image            size         S     C       PNGOUT  7zip    colour    LOCO-I   TSIP     TSIP
                                                                               (auto)   (opt.)
boots            1018 × 854   0.03    4895   9.040   7.685  HP2        7.332    6.580    6.546
p08              2592 × 1944  0.04  324170  12.442  14.493  HP2        7.884    8.097    7.970
blockMap         1430 × 2012  0.05  329036  14.590  16.937  YUVr      12.508   12.592   12.413
barbara2          720 × 576   0.05  100013  15.915  17.804  YUVr      10.197    9.650    9.584
party 4s         1600 × 1200  0.07  448765  12.685  14.607  YCgCo-R   10.429   10.984   10.822
kodim23           768 × 512   0.11   72079  11.252  12.821  YUVr       8.580    8.859    8.817
png-8passes      1024 × 768   0.12   43032   5.922   2.684  Indexed   12.162    3.742    3.742
woman.rgb        2048 × 2560  0.19   99642  14.851  13.223  Indexed   11.346a  10.946   10.825
bike             2048 × 2560  0.24  271926  14.643  12.333  YCgCo-R   11.509   12.108   11.859
cwheel            800 × 600   0.26   83172   3.713   5.021  RGB        4.004    3.687    3.513
woodgrove        1149 × 852   0.28   56177   3.376   3.309  YCgCo-R    3.873    3.378    3.074
artificial8       800 × 533   0.31   39584   5.106   4.512  YCgCo-R    4.623    4.724    4.305
bee              4825 × 972   0.42  266424   4.786   4.995  YUVr       4.302    4.294    4.163
teacher dev       800 × 619   0.67   17565   1.872   1.424  YCgCo-R    1.803    1.604    1.604
chart pulinco    1122 × 720   1.11   56477   3.062   2.739  YUVr       2.844    2.497    2.444
meditate fel     3000 × 2248  1.53   32096   1.963   1.194  Indexed    1.307b   0.954    0.940
europa karte de  1200 × 1000  1.69      52   0.600   0.663  Indexed    0.676    0.525    0.512
light world-1    4096 × 4096  1.99     401   1.057   0.462  Indexed    2.733    0.366    0.366
average                                      7.604   7.606             6.576    5.866    5.750

a This result was obtained using the YUVr colour space; the bitrate using indexed colours is higher.
b Ditto.

REFERENCES

[1] Calderbank, A.R.; Daubechies, I.; Sweldens, W.; Yeo, B.-L.: Lossless image compression using integer to integer wavelet transforms. Proc. of ICIP, Vol. 1, 26-29 Oct. 1997, 596-599
[2] Santa-Cruz, D.; Ebrahimi, T.; Askelof, J.; Larsson, M.; Christopoulos, C.A.: JPEG 2000 still image coding versus other standards. Proc. of the SPIE, San Diego, California, 2000, Vol. 4115, 446-454
[3] ISO/IEC 14495-1, Information technology - Lossless and near-lossless compression of continuous-tone still images: Baseline (JPEG-LS). International Standard, corrected and reprinted version, 15 September 2000
[4] Strutz, T.: Context-Based Adaptive Linear Prediction for Lossless Image Coding. 4th Int. Conf. on Source and Channel Coding, Berlin, 28-30 Jan. 2002, 105-109
[5] Takamura, S.; Takagi, M.: Lossless Image Compression with Lossy Image Using Adaptive Prediction and Arithmetic Coding. Proc. of Data Compression Conference, DCC'94, 1994, 166-174
[6] Wu, X.; Memon, N.: Context-Based, Adaptive, Lossless Image Codec. IEEE Trans. on Communications, Vol. 45, No. 4, April 1997, 437-444
[7] Li, X.; Orchard, M.T.: Edge directed prediction for lossless compression of natural images. Proc. of ICIP'99, Vol. 4, 58-62
[8] Motta, G.; Storer, J.A.; Carpentieri, B.: Lossless image coding via adaptive linear prediction and classification. Proc. of the IEEE, Vol. 88, No. 11, 2000, 1790-1796
[9] Meyer, B.; Tischer, P.: Glicbawls - Grey Level Image Compression By Adaptive Weighted Least Squares. Proc. of Data Compression Conference, DCC'01, 2001
[10] Knezovic, J.; Kovac, M.; Mlinaric, H.: Classification and blending prediction for lossless image compression. IEEE MELECON 2006, 486-489
[11] Ulacha, G.; Stasinski, R.: Highly effective predictor blending method for lossless image coding. IEEE MELECON 2010, 1099-1104
[12] Meyer, B.; Tischer, P.: TMW - a new method for lossless image compression. Proc. of PCS'97, Berlin, Germany, 10-12 Sep. 1997, 67-72
[13] Burrows, M.; Wheeler, D.J.: A Block-Sorting Lossless Data Compression Algorithm. Digital Systems Research Center, Research Report 124, Palo Alto, CA, USA, 1994, http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-124.html
[14] ISO/IEC JTC1/SC29/WG11 N1890, Information technology - JPEG 2000 Image Coding System. JPEG 2000 Part I, Final Draft International Standard 15444, 25 Sep. 2000
[15] Malvar, H.; Sullivan, G.: YCoCg-R: A color space with RGB reversibility and low dynamic range. ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Document JVT-I014, 2003
[16] http://charls.codeplex.com/, last visited 15.02.2011
[17] http://www.libpng.org/pub/png/spec/1.2/PNG-Filters.html, last visited 12.01.2011
[18] Abel, J.: Incremental frequency count - a post-BWT stage for the Burrows-Wheeler compression algorithm. Software: Practice and Experience, Vol. 37, No. 3, March 2007, 247-265
[19] Nelson, M.: Data Compression with the Burrows-Wheeler Transform. Dr. Dobb's Journal, September 1996
[20] www.bzip.org, last visited 6 June 2011
[21] http://www1.hft-leipzig.de/strutz/Papers/Testimages/TSIP/, last visited 16.01.2011
[22] Weinberger, M.J.; Seroussi, G.; Sapiro, G.: The LOCO-I Lossless Image Compression Algorithm: Principles and Standardization into JPEG-LS. IEEE Trans. on Image Processing, Vol. 9, No. 8, August 2000, 1309-1324
[23] Ausbeck, P.J.: The Piecewise-Constant Image Model. Proc. of the IEEE, Vol. 88, No. 11, November 2000, 1779-1789
[24] Akimov, A.; Fränti, P.; Kolesnikov, A.: Lossless Compression of Color Map Images by Context Tree Modeling. Proc. of Data Compression Conference, DCC'2006, March 2006, Snowbird, Utah, 412-421
[25] Schindler, M.: A Fast Blocksorting Algorithm for Lossless Data Compression. Proc. of Data Compression Conference, DCC'97, March 1997, Snowbird, Utah, 469
[26] Matsuda, I.; Ozaki, N.; Umezu, Y.; Itoh, S.: Lossless Coding Using Variable Block-Size Adaptive Prediction Optimized for Each Image. Proc. of EUSIPCO 2005, 4-8 Sep. 2005, Antalya, Turkey