Graphics Image Compression Using JPEG2000
Ping-Sing Tsai1,2 and Ricardo Suzuki1
1 Department of Computer Science, University of Texas – Pan American
2 Sigma Designs, Inc.
[email protected], [email protected]
Abstract
The unified framework of the JPEG2000 architecture makes practical, high-quality, real-time compression possible even in video mode. Part 3 of the JPEG2000 standard, Motion JPEG2000, uses an intra-frame style for video compression: it compresses a video sequence frame by frame using the JPEG2000 core coding system. However, JPEG2000 performs poorly when compressing images with low color depth, such as graphics images. In this paper, we propose a technique that distinguishes true color images from graphics images and compresses graphics images with a simplified JPEG2000 method that improves compression performance. The proposed method can be easily adopted in the Motion JPEG2000 framework for video sequence compression without modifying the compressed bit-stream syntax.
1. Introduction
Motion JPEG2000, also known as Part 3 of the JPEG2000 standard, specifies how to encode motion sequences with the JPEG2000 Part 1 core coding codec. All images in a Motion JPEG2000 file are compressed frame by frame using the JPEG2000 Part 1 codec, without any inter-frame coding [1 – 4]. However, when a video sequence contains graphics type images, such as animation clips or logos, the compression performance of JPEG2000 degrades, because those images either use a color palette with low color depth or contain objects with solid areas and a limited number of colors.

As discussed in [4], a general lossless image compression framework contains two steps. The first step, image de-correlation or preprocessing, aims to reduce the spatial redundancy of an image. Techniques such as DCT (Discrete Cosine Transform), DWT (Discrete Wavelet Transform), and DPCM (Differential Pulse Code Modulation) are found in many compression standards and methods. In principle, this step yields a more compact representation of the image. In the second step, entropy encoding, the de-correlated image is processed by an entropy encoder using a variable-length coding technique such as Huffman coding or binary arithmetic coding.

Ideally, graphics type images should be compressed using different techniques, such as the GIF format [5] with the LZW [6] compression technique. However, it is neither feasible nor practical to change the compression technique under the Motion JPEG2000 framework. The problem is not only the overhead of having two different coding modules within one system, but also the bit-stream compatibility between two totally different coding schemes.

In this study, we propose a simple method that still works under the Motion JPEG2000 framework and improves the compression performance when graphics type images are encountered in a video sequence. The idea is simply to bypass the de-correlation step when we encounter a graphics type image, which is already in a quite compact form; for such images, the de-correlation step actually degrades the compression performance. The remaining question is how to know that a given frame of a video sequence is a graphics type image. Based on our observation, the “entropy” of the RGB channels of an image frame is a good indicator for distinguishing between true color images and graphics type images.

The rest of the paper is organized as follows. In section 2, we present our key observation regarding the compression performance issue and review the entropy of an image. The proposed method is described in section 3. Simulation results of the proposed approach are shown in section 4. Conclusions and future work are discussed in section 5.
2. Key observation and related works
Fig. 1(a) shows a true color image (Pencils) with 256 colors per channel, which we downloaded from the web. The image has a well-spread color distribution, as the Red, Green, and Blue histograms in Figs. 1(b) – (d) show. In order to mimic the behavior of a graphics image with low color depth, we converted the Pencils image into the image shown in Fig. 1(e), which uses the 216 safe RGB colors (6 colors per channel, as described in [7]), also known as the set of all-systems-safe colors or safe Web/browser colors. As Figs. 1(f) – (h) show, the RGB histograms become very discrete, as one would expect. We compressed both Fig. 1(a) and Fig. 1(e) using the JPEG2000 software “Kakadu” provided by [3]; the compressed sizes of the two images are 466,211 bytes and 570,480 bytes, respectively. The compressed size of Fig. 1(e) is thus 22.4% larger than that of Fig. 1(a), even though Fig. 1(e) uses far fewer colors. This observation inspired the idea of using the “entropy” of the RGB channels to distinguish true color images from graphics images, and of compressing graphics images with a simplified JPEG2000 method that improves the compression performance. We first provide a quick review of the JPEG2000 standard, and then review the “entropy” of an image, in the following sub-sections.
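The paper does not state how the color depth was reduced, so the following is only a minimal sketch under the assumption of uniform per-channel quantization to the nearest allowed level. The function names `reduce_levels` and `reduce_pixel` are hypothetical. With 6 levels, the sketch yields the six per-channel values 0, 51, 102, 153, 204, and 255 of the safe RGB colors described in [7].

```python
def reduce_levels(value, levels):
    """Map an 8-bit value (0..255) to the nearest of `levels` evenly spaced values.

    Assumption: uniform quantization; the paper does not specify its method.
    """
    step = 255 / (levels - 1)              # spacing between allowed values
    return round(round(value / step) * step)


def reduce_pixel(rgb, levels):
    """Apply the same per-channel reduction to an (R, G, B) tuple."""
    return tuple(reduce_levels(c, levels) for c in rgb)


if __name__ == "__main__":
    # Distinct per-channel values that survive a reduction to 6 levels.
    print(sorted({reduce_levels(v, 6) for v in range(256)}))
    print(reduce_pixel((200, 30, 127), 6))
```

With only 6 values per channel, the per-channel entropy cannot exceed log2(6), about 2.58 bits, which is consistent with the 6-color entropy values reported in Table 1.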
2.1 Overview of JPEG2000 standard
In a JPEG2000 encoder, the image components can be divided into rectangular tiles for the purpose of working with huge images. DC level shifting is performed on these tile components, followed by either an irreversible or a reversible component transformation, which helps improve compression performance. Each component of a tile is then independently transformed by the Discrete Wavelet Transform (DWT) [8]. In JPEG2000, the 9/7 irreversible wavelet transform is used for lossy compression, and the 5/3 reversible lifting-based wavelet transform is specified for lossless compression. For lossy compression, uniform scalar quantization with a deadzone at the origin is applied to the samples of each subband in the wavelet domain. The quantization step size can be determined by the dynamic range of the samples in a subband. After quantization, each subband is divided into non-overlapping rectangular blocks, called code blocks, which are the basic coding units for entropy coding. Each code block is encoded independently, and its size is typically 32 × 32 or 64 × 64. A unique feature of JPEG2000 is region of interest (ROI) coding, which allows different regions of an image to be coded with different fidelity criteria. The MAXSHIFT method proposed by Christopoulos et al. [9, 10] is adopted by the JPEG2000 Part 1 standard.

The entropy encoding in JPEG2000 consists of fractional bit-plane coding (BPC) and binary arithmetic coding (BAC). The combination of BPC and BAC is also referred to as Tier 1 coding in the standard. BPC has three passes in each bit plane: the Significance Propagation Pass, the Magnitude Refinement Pass, and the Cleanup Pass. Each pass generates context models and the corresponding binary data. The output of BPC and BAC is the compressed bit stream, so each code block has an independent bit stream. The independent bit streams of all the code blocks are combined into a single bit stream by Tier 2 coding, which is based on the result of rate-distortion optimization. An efficient rate-distortion algorithm provides possible truncation points of the bit streams in an optimal way, minimizing distortion for any given target bit rate. However, in order to obtain an optimal solution such as the EBCOT (Embedded Block Coding with Optimized Truncation) method proposed by D. Taubman [11], one needs to buffer the bit streams of all code blocks. This is a heavy burden for any hardware-based implementation with limited memory. Tier 2 coding multiplexes the independent bit streams generated in Tier 1 coding to compose the final compressed output bit stream. It also efficiently encodes the header information that indicates the ordering of the resulting coded blocks and the corresponding coding passes.
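To make the reversible 5/3 lifting transform mentioned in this overview more concrete, the following is a minimal one-level, one-dimensional sketch with mirrored (symmetric) boundary extension. It is restricted to even-length integer signals, the names `fwd53` and `inv53` are hypothetical, and it is an illustration rather than the implementation used by Kakadu or any other codec.

```python
def fwd53(x):
    """One level of the reversible 5/3 lifting transform on an even-length list.

    Returns (s, d): low-pass (even) and high-pass (odd) integer coefficients.
    """
    n = len(x)
    assert n >= 2 and n % 2 == 0, "sketch handles even-length signals only"
    h = n // 2
    # Predict step: odd samples minus the floor of the mean of their even neighbours.
    # The right neighbour of the last odd sample is mirrored back to x[n - 2].
    d = [x[2 * k + 1] - (x[2 * k] + (x[2 * k + 2] if 2 * k + 2 < n else x[n - 2])) // 2
         for k in range(h)]
    # Update step: even samples plus a rounded quarter of the neighbouring details.
    # The left neighbour of the first even sample is mirrored to d[0].
    s = [x[2 * k] + (d[max(k - 1, 0)] + d[k] + 2) // 4 for k in range(h)]
    return s, d


def inv53(s, d):
    """Exact inverse of fwd53: undo the update step, then the predict step."""
    h = len(s)
    even = [s[k] - (d[max(k - 1, 0)] + d[k] + 2) // 4 for k in range(h)]
    x = []
    for k in range(h):
        right = even[k + 1] if k + 1 < h else even[k]   # mirrored boundary
        x += [even[k], d[k] + (even[k] + right) // 2]
    return x


if __name__ == "__main__":
    for signal in ([10, 10, 10, 10], [0, 255, 0, 255]):
        s, d = fwd53(signal)
        print(signal, "->", s, d, "->", inv53(s, d))
```

A constant signal produces all-zero detail coefficients, whereas a signal that alternates between two values produces coefficients outside the original two-value alphabet. This small example is consistent with the paper's observation that de-correlation can hurt images that are already compact, although it is not a proof of it.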
2.2 Entropy of an image
As defined in the Merriam-Webster dictionary, the term “entropy” means: a measure of the unavailable energy in a closed thermodynamic system that is also usually considered to be a measure of the system's disorder or chaos. Intuitively, in a statistical sense, entropy is a measure of the degree of “surprise” or “uncertainty.” From the Information Theory [12] point of view, however, entropy is the minimum expected length of a binary code over all possible symbols of a discrete memory-less source. In other words, entropy can be considered the average number of bits one needs to represent a symbol in a stationary system, where the finite set of source symbols have fixed probabilities of occurrence. The entropy is expressed as

$$E = -\sum_{i=1}^{N} p(a_i) \log_2 p(a_i),$$

where $N$ is the number of symbols and $p(a_i)$ is the probability of occurrence of symbol $a_i$. This is a very convenient measure for any coding system, and it provides a bound on the compression that can be achieved. The entropy of an image can be easily calculated from the image’s histogram, which is simply the occurrence count of every intensity value (symbol) in the image. For an 8-bit gray scale image, we have the symbols $\{a_1, a_2, \ldots, a_{256}\} = \{0, 1, \ldots, 255\}$ and

$$p(a_i) = \frac{h(a_i)}{N_{pixels}},$$

where $N_{pixels}$ is the total number of pixels in the image and $h(a_i)$ is the histogram count for the intensity value $a_i$. As we will show in a later section, the “entropy” of the RGB channels of an image is a good indicator for distinguishing true color images from graphics type images.
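The histogram-based entropy defined above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function name `entropy` and the flat-list input format are assumptions made for the example.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy, in bits per symbol, of a sequence of intensity values."""
    values = list(values)
    n = len(values)                      # N_pixels
    counts = Counter(values)             # histogram h(a_i)
    # sum of p * log2(1 / p) equals -sum of p * log2(p), with p = h(a_i) / N_pixels
    return sum((c / n) * log2(n / c) for c in counts.values())


if __name__ == "__main__":
    print(entropy([128] * 16))           # a single intensity value
    print(entropy([0, 255] * 8))         # two equally likely values
    print(entropy(range(256)))           # all 256 values equally likely
```

A channel with a single value has zero entropy, two equally likely values give 1 bit, and a uniform distribution over all 256 values gives the maximum of 8 bits for 8-bit data.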
3. Proposed method
Motion JPEG2000 compresses a given video sequence frame by frame using the JPEG2000 standard, and the JPEG2000 coding standard uses the DWT for de-correlation. The proposed method simply bypasses the DWT module in the JPEG2000 codec whenever we encounter an image whose entropy value is less than 4 in all three color channels. The JPEG2000 standard allows the number of DWT decomposition levels to be set to zero, which implies no transform, so there are no bit-stream compatibility issues. We treat the whole image (or tile, in JPEG2000 terminology) as the LL subband and proceed with BPC (bit-plane coding), BAC (binary arithmetic coding), and the rest of the JPEG2000 coding modules. There is no overhead at the decoder side. The only overhead is at the encoder side, where we need to calculate and check the entropy values for each frame.
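The per-frame decision described above can be sketched as follows. This is a minimal illustration, not the authors' code: the names `channel_entropies` and `choose_dwt_levels` are hypothetical, a frame is assumed to be a flat list of (R, G, B) tuples, and the default of 3 decomposition levels is taken from the experimental setup in section 4.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy, in bits per symbol, of a sequence of intensity values."""
    values = list(values)
    n = len(values)
    return sum((c / n) * log2(n / c) for c in Counter(values).values())


def channel_entropies(pixels):
    """Entropy of the R, G, and B channels of a frame given as (R, G, B) tuples."""
    pixels = list(pixels)
    return tuple(entropy(p[ch] for p in pixels) for ch in range(3))


def choose_dwt_levels(pixels, threshold=4.0, default_levels=3):
    """Return 0 (bypass the DWT) when all three channel entropies are below
    `threshold`; otherwise return `default_levels`."""
    if all(e < threshold for e in channel_entropies(pixels)):
        return 0
    return default_levels


if __name__ == "__main__":
    graphics_frame = [(0, 0, 0), (255, 255, 255)] * 50     # two-color frame
    natural_like = [(v, v, v) for v in range(256)]          # 256 levels per channel
    print(choose_dwt_levels(graphics_frame), choose_dwt_levels(natural_like))
```

The returned value would then be passed to the encoder as the number of DWT decomposition levels for that frame; the decoder needs no change because zero levels is already valid JPEG2000 syntax.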
4. Experimental results
Fig. 2(a) shows one of the three test images (“Pencils”, “Icon”, and “Bond”) that we found on the web. In order to observe the impact of different color depths, we gradually reduced the color depth from 256 colors per channel (true color image) to 6 colors per channel (all-systems-safe color image), as shown in Fig. 2. We first applied JPEG2000 compression with 3 levels of DWT decomposition using the Kakadu software. The compressed bit-stream is formatted in a single layer for simplicity. We then applied the proposed method to the same images: the DWT step is bypassed by setting the number of DWT decomposition levels to zero, and we used a code block size of 64×64 in this simulation. As Table 1 shows, the lower the color depth, the smaller the entropy values.

When the entropy values are below 4 in all three channels, as we suggested, the proposed method outperforms standard JPEG2000 for all three test images. For the image “Bond” with 6 colors per channel, the proposed modified JPEG2000 method reduces the compressed size by about 58% (from 102,994 to 43,358 bytes). Fig. 3 shows the 16 frames of a short animation clip of Bart Simpson. This type of image clearly uses a very limited number of colors, and all the entropy values are below 2 in this example (as shown in Table 2). For every frame in Table 2, the proposed method produces a smaller compressed size than standard JPEG2000.
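The size reduction quoted above can be checked directly from the 6-colors-per-channel column of Table 1. The short sketch below does that arithmetic; the helper name `size_reduction_percent` is hypothetical, and the byte counts are copied from Table 1.

```python
def size_reduction_percent(baseline_bytes, proposed_bytes):
    """Percentage by which the proposed size is smaller than the baseline size."""
    return 100.0 * (baseline_bytes - proposed_bytes) / baseline_bytes


# (JPEG2000 size, proposed-method size) in bytes at 6 colors per channel, from Table 1.
SIX_COLOR_SIZES = {
    "Pencils": (570_480, 222_937),
    "Icon": (138_622, 76_819),
    "Bond": (102_994, 43_358),
}

if __name__ == "__main__":
    for name, (j2k, proposed) in SIX_COLOR_SIZES.items():
        print(f"{name}: {size_reduction_percent(j2k, proposed):.1f}% smaller")
```

For “Bond” the result rounds to 58%, matching the figure stated in the text; the same calculation applies to any other column of Table 1.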
5. Conclusions
The simulation results show that JPEG2000 performs poorly on images with low color depth, such as graphics images. By bypassing the DWT step in JPEG2000 for such images, we can improve the compression performance significantly. We used the “entropy” of the RGB channels to distinguish true color images from graphics images. The proposed method is simple and can be easily adopted in the Motion JPEG2000 framework for video sequence compression without modifying the compressed bit-stream syntax.
6. References
[1] JPEG2000 Part 1: Core Coding System, Final Committee Draft (ISO/IEC FCD15444-1), ISO/IEC JTC1/SC29/WG1 N11855, March 2000.
[2] JPEG2000 Part 3: Motion JPEG2000, Final Committee Draft (ISO/IEC FCD15444-3), ISO/IEC JTC1/SC29/WG1 N2117, March 2001.
[3] D. Taubman and M. Marcellin, JPEG2000: Image Compression Fundamentals, Standards and Practice, Boston: Kluwer Academic Publisher, 2002.
[4] T. Acharya and P. S. Tsai, JPEG2000 Standard for Image Compression: Concepts, Algorithms, and VLSI Architectures, John Wiley & Sons, Inc., NJ, 2004.
[5] GIF89a Specification. (http://www.w3.org/Graphics/GIF/spec-gif89a.txt)
[6] T. Welch, “A Technique for High-Performance Data Compression,” Computer, vol. 17, no. 6, pp. 8-19, 1984.
[7] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 2nd edition, Prentice-Hall, Inc., NJ, 2002.
[8] M. Antonini, M. Barlaud, P. Mathieu, and I. Daubechies, “Image coding using wavelet transform,” IEEE Trans. Image Processing, vol. 1, pp. 205-220, April 1992.
[9] D. Nister and C. Christopoulos, “Lossless region of interest with embedded wavelet image coding,” Signal Processing, vol. 78, no. 1, pp. 1-17, 1999.
[10] C. Christopoulos, J. Askelof, and M. Larsson, “Efficient region of interest coding techniques in the upcoming JPEG2000 still image coding standard,” in Proc. International Conference on Image Processing, ICIP 2000, vol. 2, pp. 41-44, Sept. 2000.
[11] D. Taubman, “High performance scalable image compression with EBCOT,” IEEE Trans. Image Processing, vol. 9, no. 7, pp. 1158-1170, July 2000.
[12] C. E. Shannon and W. Weaver, The Mathematical Theory of Communication, University of Illinois Press, Urbana, IL, 1949.
[13] “Pencils” image. (http://www.stpaulcareers.umn.edu/img/assets/16141/Graphic%20Design145x100.jpg)
[14] “Icon” image. (http://graphics.cs.brown.edu/games/G3D/icon.jpg)
[Figure 1 panels: images (a) and (e); Red, Green, and Blue histogram plots (b) – (d) and (f) – (h), with intensity value (0 to 250) on the horizontal axis and pixel count on the vertical axis.]
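Figure 1: The key observation. (a) Original “Pencils” image with 256 colors per channel ([13]; image was resized to 800 by 600); (b) – (d) histograms of (a)’s color channels; (e) image with reduced color depth (216 safe RGB colors); (f) – (h) histograms of (e)’s color channels.

Table 1: Compressed size comparisons and entropy values

| Image   | Measure                                  | 256 colors | 128 colors | 64 colors | 11 colors | 6 colors |
|---------|------------------------------------------|------------|------------|-----------|-----------|----------|
| Pencils | Entropy, Red                             | 7.81       | 6.82       | 5.82      | 3.20      | 2.26     |
| Pencils | Entropy, Green                           | 7.78       | 6.78       | 5.78      | 3.14      | 2.23     |
| Pencils | Entropy, Blue                            | 7.71       | 6.72       | 5.72      | 3.09      | 2.18     |
| Pencils | Compressed size (bytes), JPEG2000        | 466,211    | 514,751    | 597,366   | 680,834   | 570,480  |
| Pencils | Compressed size (bytes), Proposed Method | 784,570    | 670,805    | 566,460   | 400,860   | 222,937  |
| Icon    | Entropy, Red                             | 4.90       | 4.20       | 3.75      | 2.26      | 1.85     |
| Icon    | Entropy, Green                           | 5.07       | 4.17       | 3.71      | 2.50      | 2.16     |
| Icon    | Entropy, Blue                            | 5.07       | 4.15       | 3.71      | 2.52      | 2.10     |
| Icon    | Compressed size (bytes), JPEG2000        | 155,626    | 158,657    | 161,750   | 155,011   | 138,622  |
| Icon    | Compressed size (bytes), Proposed Method | 171,029    | 149,044    | 126,825   | 111,254   | 76,819   |
| Bond    | Entropy, Red                             | 6.09       | 5.15       | 4.23      | 2.46      | 1.89     |
| Bond    | Entropy, Green                           | 5.91       | 4.94       | 4.07      | 2.52      | 1.92     |
| Bond    | Entropy, Blue                            | 6.02       | 5.07       | 4.19      | 2.25      | 2.00     |
| Bond    | Compressed size (bytes), JPEG2000        | 99,569     | 113,800    | 123,330   | 132,800   | 102,994  |
| Bond    | Compressed size (bytes), Proposed Method | 112,821    | 96,628     | 96,628    | 70,568    | 43,358   |

Color depths are given in colors per channel.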
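Table 2: Simulation results for the Bart Simpson animation clip

| Frame # | Entropy, R | Entropy, G | Entropy, B | Compressed size, J2K | Compressed size, Proposed |
|---------|------------|------------|------------|----------------------|---------------------------|
| 1       | 1.01       | 1.09       | 0.80       | 9.68                 | 5.96                      |
| 2       | 1.07       | 1.11       | 0.83       | 10.2                 | 6.13                      |
| 3       | 1.10       | 1.11       | 0.81       | 10.1                 | 6.23                      |
| 4       | 1.19       | 1.15       | 0.85       | 11.4                 | 6.39                      |
| 5       | 1.19       | 1.11       | 0.81       | 10.6                 | 5.97                      |
| 6       | 1.21       | 1.14       | 0.85       | 10.8                 | 6.24                      |
| 7       | 1.29       | 1.16       | 0.84       | 11.6                 | 6.90                      |
| 8       | 1.36       | 1.20       | 0.92       | 12.7                 | 7.31                      |
| 9       | 1.21       | 1.01       | 0.82       | 11.5                 | 6.38                      |
| 10      | 1.24       | 0.97       | 0.85       | 12.1                 | 6.91                      |
| 11      | 1.41       | 1.12       | 0.89       | 13.0                 | 7.41                      |
| 12      | 1.53       | 1.19       | 0.94       | 14.5                 | 8.06                      |
| 13      | 1.57       | 1.22       | 0.97       | 15.0                 | 8.32                      |
| 14      | 1.62       | 1.26       | 1.00       | 15.6                 | 8.58                      |
| 15      | 1.58       | 1.24       | 0.98       | 15.1                 | 8.31                      |
| 16      | 1.59       | 1.25       | 0.98       | 15.2                 | 8.32                      |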
Figure 2: (a) Original “Icon” image with 256 colors per channel ([14]; image size 432 by 392); (b) – (e) images with reduced color depths: (b) 128, (c) 64, (d) 11, and (e) 6 colors per channel.
Figure 3: An animation clip of Bart Simpson; frame 1 – frame 16.