Check Image Compression: A Comparison of JPEG, Wavelet and Layered Coding Methods

Jincheng Huang and Yao Wang
Dept. of Electrical Engineering, Polytechnic University, Brooklyn, NY 11201

Edward K. Wong
Dept. of Computer and Information Science, Polytechnic University, Brooklyn, NY 11201

Abstract

An emerging trend in the banking industry is to digitize checks for storage and transmission. An immediate requirement for efficient storage and transmission is check image compression. General purpose compression algorithms such as JPEG and wavelet-based methods produce annoying ringing and blocking artifacts at high compression ratios. In this paper, a layered approach to check image compression is proposed, in which a check image is represented in several layers. The first layer describes the foreground map; the second layer specifies the gray levels of foreground pixels; the third layer is a lossy representation of the background image; and the fourth layer describes the error between the original and the image reconstructed from the first three layers. The layered coding approach produces images of better quality than traditional JPEG and wavelet coding methods, especially in the foreground, i.e., the text and graphics. In addition, this approach allows progressive retrieval or transmission of different image layers.

1 Introduction

The Financial Services Technology Consortium (FSTC) has just completed a two-year Interbank Check Imaging (ICI) project to find a better way of managing the more than 63 billion checks processed by banks and financial institutions each year. One of the goals of the project was to demonstrate that digital imaging allows truncation of paper checks at earlier points of the clearing process. One of the key issues studied is check image compression, which affects the bandwidth requirement in the interchange process and the storage requirement in archiving. With the 63 billion checks processed in the United States per year, and a storage period of seven years on all returned checks as required by federal laws, any improvement in compression will lead to significant savings.

A standard personal check measures approximately 36 square inches (front and back). Corporate checks can vary widely from about 20 to 60 square inches. A simple approach to scanning with 100 dpi (dots per inch) and 8-bit grayscale can lead to an image with 360 Kbytes of information per personal check. Commercial systems now in use employ standard compression techniques like JPEG [2] to reduce the information to about 42 Kbytes per check. Other general purpose image compression methods such as the wavelet method could also be used. At high compression ratios, the JPEG and the wavelet codecs produce annoying artifacts near the printed and written characters and symbols. Moreover, the JPEG method yields blocking and severe blurring effects at high compression ratios.

A check image can be viewed as a foreground overlaid on a background. The foreground includes written or printed numerals and text, line segments, and logos. It is usually homogeneous in gray level value and appears dark in color. On the other hand, the background usually contains smoothly-varying scenery, or light-intensity regular patterns or textures. In view of these characteristics, we have developed a layered coding method for check image compression. The preliminary version of this scheme was reported in [1]. In this approach, a check image is separated into four different layers: the binary map of the foreground region, the gray-scale values of foreground pixels, a lossy compressed version of the background image, and the residual image representing the error between the original image and the reconstructed image obtained from the first three layers. Since these layers have different characteristics, a different compression algorithm can be employed for each layer independently.

In this paper, the layered coding algorithm is described in Section 2. Experimental results are presented in Section 3. Section 4 concludes the paper.

2 The Layered Coding Algorithm

The crucial step in the layered coding is to segment an image into the foreground and the background. In this section, we first discuss the algorithm for separating foreground from background. We then describe the coding algorithm for each layer.

2.1 Segmentation of Background and Foreground

The basic segmentation operation used here is morphological closing [3]. For test images digitized with varying resolutions, the foreground features we want to extract vary in size. A small structuring element fails to extract thick lines and large features such as bank logos. A large structuring element is required to extract large features. However, the use of a large structuring element over the entire image will increase computation time and may also falsely extract some unwanted artifacts on checks with a rich background.

For detecting lines and features of variable sizes, variable size structuring elements should be used. Toward this goal, we have developed a two-pass algorithm. The image is divided into small blocks of size N × M. The first pass determines the appropriate structuring element size for each block. This is accomplished by finding the diameter of the valleys, if they exist, in each block. Since the transition from background to foreground is usually abrupt, the boundary of potential foreground regions can be detected by a grayscale closing operation with a small cone-shaped structuring element, followed by subtraction of the original image and a thresholding operation. Once the valley boundaries are found, a scanning algorithm scans the block line by line in the horizontal, vertical and the two diagonal directions, identifies foreground line segments by grouping pixels whose intensities are similar, and finally determines the length of the longest line segment. This is recorded as the maximum width of valleys in the block in the particular direction. The size of the structuring element to be used for this block is taken to be the smallest of the maximum widths scanned in the four directions.

In the second pass, a closing operation is applied to each block using a structuring element with the size determined in the first pass. Then the original image is subtracted from the resulting image. Thresholding is then applied to the residual image. Finally, an area-ratio algorithm [1] is applied to the binarized image to remove isolated pixels or unwanted short branches.

2.2 Overview of the Algorithm

As shown in Figure 1, the layered coding algorithm first performs background/foreground segmentation on the input image. Two outputs are generated for the foreground: the foreground map (layer I) is a binary image that masks all the foreground pixel locations; the foreground grayscale image (layer II) consists of the original grayscale values of all foreground pixels. The background image (layer III) is obtained by removing the foreground pixels from the input image. Finally, the error image (layer IV) is the difference between the original image and the reconstructed image from the first three layers.

Layer I, which is the binary foreground map, is coded by the JBIG method [4]. To optimize the coding gain, the sequential mode is used.

Layer II, which consists of the gray scale values of foreground pixels, is coded by block-based DPCM. In this approach, the entire image is segmented into 8x8 blocks, and if a block contains any foreground pixels, then the average of the foreground pixels is calculated and the average value of the previous foreground block is subtracted from this value. The difference values are then compressed by Huffman coding.

Layer III is the background image after removing the foreground pixels. We have explored the use of JPEG [2] and the vector-tree wavelet (VTW) coder by Said and Pearlman [5]. To improve the compression efficiency in layer III, the removed foreground pixels are replaced by the average of the neighboring background pixel values.

Layer IV is the difference between the original and the reconstructed image obtained by combining the first three layers. If a perfect reconstruction is required, this layer can be coded by a lossless compression method such as arithmetic or Huffman coding. The coding for this layer is not considered in our present study.

3 Experimental Results

We have collected a total of 13 different personal checks, and the front side of each is digitized with spatial resolutions of 100 dpi, 200 dpi, and 300 dpi, respectively. We evaluated the performance of two versions of the proposed layered coding method: one uses the JPEG coder (using the PVRG-JPEG code [6]) to code layer III (denoted as LCj) and the other one uses the VTW method (using the code from [7]; denoted as LCw). For comparison, we also applied the JPEG method (using the PVRG-JPEG code [6]) and the VTW coder (using the code from [7]) to the original images directly.

For the foreground segmentation step, we used a block size of 100 × 100 pixels. A cone-shaped structuring element with a radius of 4 pixels is used in the first pass to outline the potential foreground areas. A variable size, disc-shaped structuring element is used in the second pass.

Figures 2 and 3 show the results for one test image. It is easy to see the ringing effect near the numerals and characters in the JPEG and VTW compressed images. The blocking effect in the JPEG compressed image is also highly noticeable. The background of the VTW compressed image shows a blurring effect. The ringing artifact is more pronounced in the JPEG compressed lower resolution image. The layered coding method can preserve the sharpness of the text and avoid the ringing artifact. Even for the background, the layered coding method has better visual quality than when applying the JPEG or wavelet codec to the entire image, because the background image in the layered coding method is more homogeneous than the original image.

We found that the LCw method has better performance than the LCj method on smoothly-varying types of background. However, it does not perform as well for images consisting of thin-line types of patterns. The LCw method has better performance for images digitized at high resolutions because pixel values change more gradually in these images. As can be seen from Figures 3(a) and 3(b), layer I alone provides the necessary information for many banking applications, while layers I and II together give a more truthful representation of the foreground.

Table 1 shows the average compression ratios obtained for the layered approach at minimum visually acceptable image quality. Table 2 lists the average compression ratios for different algorithms at minimum visually acceptable image quality. For similar visual quality on the background, the compression obtained by the layered coding method is about 50% to 100% higher than that of JPEG or VTW at low resolution (100 to 200 dpi). As the resolution increases, the gain decreases, but the foreground information is still sharper with the layered coding method. As for the two layered coding methods, the LCw method achieves better results than the LCj method in most cases.
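The closing-based foreground extraction of Section 2.1 can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes a flat square structuring element of half-width `radius` in place of the cone-shaped and disc-shaped elements, applies one fixed size to the whole image rather than the per-block size estimated in the first pass, and omits the area-ratio cleanup. The function names and the threshold value are hypothetical.

```python
def _window_op(img, radius, op):
    """Apply op (max or min) over a (2*radius+1)^2 window clipped to the image."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        rows = range(max(0, y - radius), min(h, y + radius + 1))
        row = []
        for x in range(w):
            cols = range(max(0, x - radius), min(w, x + radius + 1))
            row.append(op(img[j][i] for j in rows for i in cols))
        out.append(row)
    return out


def grey_closing(img, radius):
    """Grayscale closing: dilation (max) followed by erosion (min)."""
    return _window_op(_window_op(img, radius, max), radius, min)


def foreground_mask(img, radius, threshold):
    """Mark pixels where (closing - original) exceeds threshold, i.e. dark valleys."""
    closed = grey_closing(img, radius)
    return [[1 if c - p > threshold else 0 for c, p in zip(crow, prow)]
            for crow, prow in zip(closed, img)]


# A 9x9 light background (200) with a dark vertical band (30) in columns 2..6.
image = [[30 if 2 <= x <= 6 else 200 for x in range(9)] for _ in range(9)]
small = foreground_mask(image, radius=1, threshold=50)  # 3x3 element
large = foreground_mask(image, radius=3, threshold=50)  # 7x7 element
```

On this toy input the 3x3 element leaves the five-pixel-wide band undetected, while the 7x7 element marks all of it. This mirrors the observation that a small structuring element fails to extract thick features, which is what motivates choosing the element size per block.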

4 Summary and Conclusions

We have developed and tested a layered coding method on personal bank checks digitized at different resolutions. The layered coding method has outperformed the JPEG and VTW methods in terms of visual image quality for check images compressed at about the same compression ratios. The layered coding method preserves the sharpness and legibility of the foreground information better and does not produce ringing artifacts as in the JPEG/wavelet coding method.

Another important feature of the layered coding method is that it facilitates progressive or independent retrieval or transmission of the stored image layers for different banking requirements and functions. For many banking applications, such as general browsing or printing of returned checks for customers, the binary foreground map is sufficient. For certain applications, layer II may also be necessary. For more complete and truthful representation (such as archives), the first three layers may be desired. Finally, for special checks (with large endorsement amount or with special security requirement), layer IV may be required.

Acknowledgments

This work was supported by the New York State Science and Technology Foundation as part of its Center for Advanced Technology program in Telecommunications, and the Financial Services Technology Consortium (FSTC).

References

[1] A. Susanto, Y. Wang, and E. K. Wong, "Layered Coding of Check Images Using Foreground and Background Segmentation," Proc. VCIP'96, pp. 1040-9, Orlando, FL, March 17-23, 1996.
[2] G. K. Wallace, "The JPEG Still Picture Compression Standard," Comm. ACM, Vol. 34, pp. 30-40, April 1991.
[3] J. Serra, Image Analysis and Mathematical Morphology, Academic Press, New York, 1982.
[4] ISO/IEC Draft International Standard 11544, Coded Representation of Picture and Audio Information - Progressive Bi-level Image Compression, 1992.
[5] A. Said and W. A. Pearlman, "A New Fast and Efficient Image Codec Based on Set Partitioning in Hierarchical Trees," IEEE Trans. Circuits and Systems for Video Tech., Vol. 6, pp. 243-250, June 1996.
[6] ftp:havefun.stanford.edu:pub/jpeg.
[7] ftp: ipl.rpi.edu:pub/EW Code

Table 1. Compression Ratios Obtained for the Layered Approach at Minimum Visually Acceptable Image Quality (Averaged Over 13 Test Images).

              Layer I            Layer I+II         Layer I+II+III
Resolution    Bytes    CR        Bytes    CR        Bytes    CR
300 dpi       9,241    156.8     12,990   111.5     32,159   45.0
200 dpi       5,587    115.3     7,875    81.8      23,018   28.0
100 dpi       2,468    65.3      3,155    51.0      8,368    19.3

Table 2. Average Compression Ratios for Different Algorithms at Minimum Visually Acceptable Image Quality (Averaged Over 13 Images).

Resolution    LCj      LCw      JPEG     VTW
100 dpi       15-20    15-20    10       10
200 dpi       30-40    40-50    20-30    20-30
300 dpi       40-50    45-60    30-40    40-50
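The paper does not spell out how the compression ratios (CR) in Tables 1 and 2 are computed. The sketch below assumes the conventional definition, uncompressed size divided by coded size, and reproduces the arithmetic from the Introduction: a 36 square inch check scanned at 100 dpi with 8 bits per pixel. The function names are hypothetical, and 1 Kbyte is taken as 1000 bytes.

```python
def raw_bytes(area_sq_in, dpi, bits_per_pixel=8):
    """Uncompressed size in bytes of a scan covering area_sq_in at the given dpi."""
    return area_sq_in * dpi * dpi * bits_per_pixel // 8


def compression_ratio(original_bytes, coded_bytes):
    """Conventional definition (assumed here): uncompressed size over coded size."""
    return original_bytes / coded_bytes


personal_check = raw_bytes(36, 100)                       # 36 sq in, 100 dpi, 8-bit
jpeg_ratio = compression_ratio(personal_check, 42_000)    # about 42 Kbytes after JPEG
print(personal_check, round(jpeg_ratio, 1))               # 360000 8.6
```

Under these assumptions the commercial JPEG setting quoted in the Introduction corresponds to a ratio of roughly 8.6, well below the ratios the tables report at minimum visually acceptable quality.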

[Figure 1: block diagram. Blocks shown: Check Image; Foreground Segmentation; Binary Foreground Map; Foreground Map Coding (Layer 1); Foreground Grayscale Image; Gray-Level Coding (Layer 2); Foreground Pixel Removing; Background Image; Background Smoothing; Background Image Coding (Layer 3); Foreground Reconstruction; Error Image Coding (Layer 4).]

Figure 1. Block Diagram of the Layered Coding Encoder.
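The data flow of the encoder in Figure 1 can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the foreground mask is taken as given, no entropy coding is performed, foreground holes are filled from a 3x3 neighbourhood (the paper says only "the average of the neighboring background pixel values"), and a coarse quantizer stands in for the lossy JPEG or VTW background coder. All names are hypothetical.

```python
def decompose(image, mask):
    """Return (layer I mask, layer II foreground values, layer III background)."""
    h, w = len(image), len(image[0])
    # Layer II: gray levels of the foreground pixels, in raster order.
    fg_values = [image[y][x] for y in range(h) for x in range(w) if mask[y][x]]
    # Layer III: background with each foreground hole filled by the mean of the
    # background pixels in its 3x3 neighbourhood (global mean if there are none).
    bg = [image[y][x] for y in range(h) for x in range(w) if not mask[y][x]]
    global_mean = round(sum(bg) / len(bg)) if bg else 0
    background = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                near = [image[j][i]
                        for j in range(max(0, y - 1), min(h, y + 2))
                        for i in range(max(0, x - 1), min(w, x + 2))
                        if not mask[j][i]]
                background[y][x] = round(sum(near) / len(near)) if near else global_mean
    return mask, fg_values, background


def reconstruct(mask, fg_values, background):
    """Overlay the foreground gray levels on the decoded background."""
    values = iter(fg_values)
    return [[next(values) if m else b for m, b in zip(mrow, brow)]
            for mrow, brow in zip(mask, background)]


def error_image(image, reconstructed):
    """Layer IV: pixelwise difference between original and reconstruction."""
    return [[p - r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(image, reconstructed)]


image = [[200, 202, 198], [201, 30, 199], [200, 203, 197]]
mask = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
layer1, layer2, layer3 = decompose(image, mask)
# Stand-in for a lossy background codec: quantize to multiples of 16.
lossy_bg = [[(b // 16) * 16 for b in row] for row in layer3]
approx = reconstruct(layer1, layer2, lossy_bg)
layer4 = error_image(image, approx)
restored = [[a + e for a, e in zip(arow, erow)] for arow, erow in zip(approx, layer4)]
```

In this toy case the foreground pixel is reproduced exactly from layers I and II, so the layer IV error is nonzero only on background pixels, and adding layer IV back restores the original image.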

Figure 2. (a) Original Image; (b) JPEG Compressed Image, CR = 50.0, PSNR = 27.81 dB; (c) VTW Compressed Image, CR = 50.0, PSNR = 29.16 dB.

Figure 3. Images Compressed by Layered Coding Method: (a) Layer I alone, CR = 112; (b) Layer I + II, CR = 80.7; (c) by LCj, CR = 51.8, PSNR = 26.62 dB; (d) by LCw, CR = 50.7, PSNR = 26.81 dB.
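The figure captions report PSNR values, but the paper does not state how they were computed. A minimal sketch, assuming the conventional definition for 8-bit grayscale images (peak value 255, mean squared error taken over all pixels), is given below; the function name is hypothetical.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    squared = [(a - b) ** 2
               for row_a, row_b in zip(original, decoded)
               for a, b in zip(row_a, row_b)]
    mse = sum(squared) / len(squared)
    # Identical images have zero error, so the ratio is unbounded.
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)
```

PSNR weights every pixel equally, so it does not capture the sharper foreground that the layered method is reported to give. This is consistent with the layered results in Figure 3 having slightly lower PSNR than the JPEG and VTW results in Figure 2 at similar compression ratios.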