IEEE SIGNAL PROCESSING LETTERS, VOL. 6, NO. 10, OCTOBER 1999


Enhancement of Document Images Using Multiresolution and Fuzzy Logic Techniques

Farook Sattar, Member, IEEE, and David B. H. Tay, Member, IEEE

Abstract— This letter presents a method for enhancing document images based on multiresolution decomposition and a fuzzy logic approach. The document image to be enhanced is obtained from a scanner and is a blurred binary image corrupted by additive noise. Such images arise in many situations: documents such as checks and credit card receipts become noisy, low-contrast images after scanning, which reduces their quality. Our task is to improve the readability of such images by reducing the noise and increasing the sharpness of the text. This is achieved using multiresolution decomposition, fuzzy logic, and a contrast enhancement operator. The improvement is shown using simulation examples and compared with other enhancement techniques.

Index Terms—Document image, enhancement, fuzzy logic, multiresolution.

I. INTRODUCTION

The aim of the paper is to develop an algorithm for the enhancement of document images which have been blurred and corrupted by additive noise. The algorithm presented here is a further development of the method proposed in [6], obtained by incorporating fuzzy logic. The processing system is shown in Fig. 1. It consists of a two-dimensional multiresolution (Laplacian) pyramid, which performs the multiresolution decomposition and has been widely used; for example, see [2]. The downsampling and upsampling are on the quincunx lattice [1]. The same linear filter is used in both the decimation and interpolation processes. In the reconstruction stage, two types of nonlinear processing are performed. The first is the contrast enhancement operator, applied to the coarsest level image. The second is the fuzzy edge detector. Both types of processing are elaborated in Section II. The output obtained after applying the contrast operator (see Fig. 1) is processed by the fuzzy edge detector. The purpose of the fuzzy edge detectors is to extract the edges of the text. Simplicity of computation and ease of implementation motivate us to use the fuzzy edge detectors. Furthermore, other interesting properties such as flexibility and reduction of ambiguity in the decision rule are motivating factors for the use of fuzzy logic techniques [4], [5].

Fig. 1. Block diagram of the system (three-level decomposition and reconstruction stages are shown).

The fuzzy edge detector provides an edge-image, which is in general nonbinary (continuous valued in the range [0, 1]). The edge detector can also be made nonfuzzy (binary valued) in the limiting case, i.e., the pixel value is one if the corresponding pixel belongs to an edge, and zero otherwise. The edge-image is multiplied with the highpass image, and this extracts the edge components and suppresses the noise components. The modified highpass image is then added to the lowpass image, and the sum is then interpolated. The resulting image is used as the input for the fuzzy edge detector in order to give the edge-image (see Fig. 1) at the next finer scale. The above process is repeated up to the finest resolution level.

Manuscript received January 15, 1999. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. G. Ramponi.
F. Sattar is with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798, Republic of Singapore.
D. B. H. Tay was with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798, Republic of Singapore. He is now with the Department of Electronic Engineering, LaTrobe University, Bundoora, Vic. 3083, Australia.
Publisher Item Identifier S 1070-9908(99)07978-X.
1070–9908/99$10.00 © 1999 IEEE

II. FUZZY EDGE DETECTOR AND CONTRAST ENHANCEMENT OPERATOR

A pixel window of length five is used in the fuzzy edge detector for calculating the pixel variations in five directions (0°, ±45°, ±60°) with respect to the x-axis. The variations at each pixel are defined as

[Eq. (1): expression not recoverable from the source]

where the direction offset takes the values (−2, −1, 0, 1, 2) for the (−60°, −45°, 0°, 45°, 60°) directions, a further offset is taken along the x-axis, abs[·] denotes absolute value, and the pixel is addressed by its row index and column index.

The fuzzy edge detector maps the variations into fuzzy variables. The fuzzy variables are represented by the associated fuzzy sets and membership values. A nonlinear membership function defines the fuzzy sets and is used to characterize large pixel variations in the 0°, ±45°, ±60° directions. Each direction is represented by a fuzzy set, FS (see Table I), and the associated membership value. The nonlinear membership function used is expressed in terms of the function sigm as

[Eq. (2): expression not recoverable from the source]

Two parameters of this function control the threshold and the degree of fuzziness, respectively. The function is continuous and monotonically increasing within the interval [0, 1]. Furthermore, its derivatives of all orders exist and are continuous. The threshold is chosen such that pixels with absolute values larger than the threshold are enhanced while pixels with values smaller than the threshold are suppressed. The exact value of the threshold is given by the solution of a nonlinear equation in which the threshold is the unknown. There is no closed-form solution for the threshold. Its value thus cannot be chosen directly but can be indirectly controlled by one of the parameters. At each level we use the standard deviation of pixel values to determine this parameter.

For the fuzzy edge detector, the fuzzy control rules are formulated on the general assumption that large pixel variations along a certain direction are the principal factor determining the orientation of an edge. The fuzzy control rule used is shown in Table I. The final decision about the edge direction of the pixel is made through the defuzzification process [3], which selects the direction that gives the maximum membership value:

dir = arg max (membership value, taken over the five directions)    (3)

TABLE I. FUZZY CONTROL RULE

A modification of the contrast enhancement technique proposed by Beghdadi and Le Négrate [8] is used to enhance the blurred edges of the interpolated lowpass image at the coarsest level, where the noise level is low and details are large (see Fig. 1). This technique is based on local edge detection using an input data-driven window operator of odd size. For each pixel, edge values are calculated within this window; see [8] for the defining expressions.
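The fuzzy edge detector described earlier in this section (directional variations, a sigmoid-based membership function, and arg-max defuzzification) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the exact forms of (1) and (2) are not available here, so the pixel difference used in `directional_variations`, the normalized-sigmoid form of `membership`, and the parameter names `shift` and `slope` are all hypothetical choices made for illustration.

```python
import numpy as np

# Direction offsets and the nominal angles they stand for (taken from the text).
OFFSETS = (-2, -1, 0, 1, 2)
ANGLES = (-60, -45, 0, 45, 60)


def sigm(x):
    """Logistic sigmoid (assumed meaning of the letter's 'sigm')."""
    return 1.0 / (1.0 + np.exp(-x))


def directional_variations(img):
    """Assumed stand-in for Eq. (1): |x(i+d, j+1) - x(i-d, j-1)| per offset d."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    x = np.pad(img, 2, mode="edge")  # pixel (i, j) sits at (i+2, j+2)
    out = []
    for d in OFFSETS:
        a = x[2 + d : 2 + d + h, 3 : 3 + w]  # x(i + d, j + 1)
        b = x[2 - d : 2 - d + h, 1 : 1 + w]  # x(i - d, j - 1)
        out.append(np.abs(a - b))
    return np.stack(out)  # shape (5, h, w)


def membership(v, shift=0.5, slope=10.0):
    """Assumed stand-in for Eq. (2): sigmoid rescaled so 0 -> 0 and 1 -> 1.

    `shift` plays the role of a threshold and `slope` the degree of fuzziness;
    both names and default values are hypothetical.
    """
    lo = sigm(-slope * shift)
    hi = sigm(slope * (1.0 - shift))
    return (sigm(slope * (v - shift)) - lo) / (hi - lo)


def fuzzy_edge(img, shift=0.5, slope=10.0):
    """Return (edge image in [0, 1], direction in degrees) for each pixel."""
    v = directional_variations(img)
    vmax = v.max()
    if vmax > 0:
        v = v / vmax  # scale the variations into [0, 1]
    mu = membership(v, shift, slope)
    best = mu.argmax(axis=0)  # defuzzification: direction of maximum membership
    edge = mu.max(axis=0)     # continuous-valued edge image
    return edge, np.asarray(ANGLES)[best]


if __name__ == "__main__":
    # Vertical step between "black" (1) and "white" (2) pixels.
    step = np.ones((8, 8))
    step[:, 4:] = 2.0
    edge, direction = fuzzy_edge(step)
    print(np.round(edge[0], 2))
```

A larger `slope` pushes the membership toward a hard threshold, which corresponds to the nonfuzzy (binary valued) limiting case mentioned in Section I.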

The contrast enhancement method is less sensitive to an increase in noise because all of the gray levels in the local window are averaged, weighted by the edge values. Using the edge values and the data-driven window operator, the local edge gray value is calculated for each pixel of the input image. The pixel corresponds to the center of the window. The local contrast between the pixel gray value and the local edge gray value is then defined for every pixel of the input image. We modify the expression proposed in [8] by using the index only in the denominator (instead of in both the numerator and the denominator as was proposed in [8]). The increase in contrast is based on an adjustment of this local contrast (see [8] for more details).

For the fuzzy edge detector, the numbers of additions and multiplications needed per pixel are 42 and 24, respectively. On the other hand, the VWME edge detector (with 5 × 5 window size) used in [6] requires 119 additions and 82 multiplications per pixel. The RoA detector (with 3 × 3 window size) in [6] requires 36 additions and 36 multiplications per pixel, and the LoG detector (with 7 × 7 window size) in [6] requires 49 additions and 49 multiplications per pixel. The window sizes given above are the minimum sizes that give satisfactory results [6]. The fuzzy edge detector therefore needs fewer operations than the edge detectors in [6]: counting additions and multiplications together, it uses 66 operations per pixel, against 201 for the VWME edge detector (about 3 times fewer), 98 for the LoG edge detector (about 1.5 times fewer), and 72 for the RoA edge detector (about 1.1 times fewer).

III. SIMULATION RESULTS

A simulation result is presented in Fig. 2 for a synthetic image representing a blurred homogeneous object on a noisy background (additive white Gaussian noise) in order to show the edge preservation and noise sensitivity of the proposed method. The 256 × 256-pixel blurred and noisy synthetic image is generated as follows. A binary image with black and white pixels represented by "1" and "2," respectively, is first generated. This binary image is then blurred by a two-dimensional (2-D) lowpass filter (circular shape), and white Gaussian noise (with mean and standard deviation equal to 1.5 and 0.5, respectively) is added. The filter used for blurring the images is of length 33 and has a cutoff frequency that is one-sixth the Nyquist frequency.

For all experiments in this paper, the same parameter values are used, with one setting for the noisy case and another for the noise-free case. A three-level pyramid scheme is used. The size of the contrast enhancement operator used is 9 × 9. A 2-D diamond-shaped, nonseparable lowpass filter is chosen as the linear filter in the pyramid scheme [6].

Fig. 2(a) shows the blurred and noisy image, whereas Fig. 2(b) shows the enhanced image, in which the noise is reduced considerably while the edges are preserved. The results obtained with the fuzzy edge detectors on the lowpass images of the reconstruction stage are also shown in Fig. 2(c)–(e), where the edge images, from the coarsest to the finest resolution level, illustrate the edge preservation and noise suppression achieved by the fuzzy edge detectors. In the following, we show results for real document images.

Fig. 2. Simulation results for a synthetic image. (a) Noisy blurred image. (b) Enhanced image. (c)–(e) Edge images at levels m+1, m, and m−1, respectively.

Fig. 3. Simulation results for blurred document image, with and without noise. (a) Original image. (b) Noise-free blurred image. (c) Enhanced image (noise-free). (d) Noisy, blurred image. (e) Enhanced image (noisy) by our method. (f) Image with second round processing. (g) Enhanced image (noisy) by Ramponi's method.
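The synthetic test image described at the start of this section can be approximated with the sketch below. It follows the stated recipe (a 256 × 256 binary image with values 1 and 2, a circular lowpass filter of length 33 with cutoff at one-sixth of the Nyquist frequency, then Gaussian noise with mean 1.5 and standard deviation 0.5), but the filter design method (windowed frequency sampling), the square object shape, and all function names are assumptions; the letter does not say how its filter or object was constructed.

```python
import numpy as np


def circular_lowpass(size=33, cutoff=1.0 / 6.0):
    """Circularly symmetric lowpass FIR filter (assumed design method).

    `cutoff` is a fraction of the Nyquist frequency. The ideal circular
    passband is sampled on the DFT grid, inverted, and Hann-windowed.
    """
    f = np.fft.fftfreq(size) * 2.0  # frequency axis with Nyquist = 1
    fx, fy = np.meshgrid(f, f)
    passband = (np.hypot(fx, fy) <= cutoff).astype(float)
    h = np.real(np.fft.fftshift(np.fft.ifft2(passband)))  # centered response
    win = np.hanning(size)
    h = h * np.outer(win, win)
    return h / h.sum()  # unit DC gain keeps flat regions at 1 or 2


def blur(img, h):
    """'Same'-size 2-D convolution via the FFT, with edge replication."""
    pad = h.shape[0] // 2
    x = np.pad(img, pad, mode="edge")
    shape = (x.shape[0] + h.shape[0] - 1, x.shape[1] + h.shape[1] - 1)
    full = np.fft.irfft2(np.fft.rfft2(x, shape) * np.fft.rfft2(h, shape), shape)
    rows, cols = img.shape
    return full[2 * pad : 2 * pad + rows, 2 * pad : 2 * pad + cols]


def synthetic_image(n=256, noise_mean=1.5, noise_std=0.5, seed=0):
    """Return (binary, blurred, noisy) versions of a hypothetical test object."""
    rng = np.random.default_rng(seed)
    binary = np.full((n, n), 2.0)                         # white background = 2
    binary[n // 4 : 3 * n // 4, n // 4 : 3 * n // 4] = 1.0  # black square = 1
    blurred = blur(binary, circular_lowpass())
    noisy = blurred + rng.normal(noise_mean, noise_std, size=blurred.shape)
    return binary, blurred, noisy


if __name__ == "__main__":
    binary, blurred, noisy = synthetic_image()
    print(blurred.shape, float(noisy.mean()))
```

The noise mean of 1.5 is applied exactly as stated in the text, so the noisy image is offset from the 1-to-2 range of the blurred image; pass `noise_mean=0.0` if a zero-mean perturbation is wanted for other experiments.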

A. For the Noise-Free Case

A document image enhancement example is shown in Fig. 3. The original binary image obtained from a flatbed scanner is shown in Fig. 3(a), where the black and white pixels are represented by the values one and two, respectively. The scanned binary image in Fig. 3(a) is then blurred by the 2-D lowpass filter described above. The blurred image is shown in Fig. 3(b). The corresponding enhanced result is presented in Fig. 3(c). We see that the contrast of the image as well as the sharpness of the text [especially the text that was less distorted during scanning; see Fig. 3(a)] has been improved. This can be seen from the increase in the edge peaks across the letters, e.g., the letter "T." The contrast of the blurred image can be further increased by using a lower value of the corresponding parameter, which can be chosen by considering the degree of blurriness.

B. For the Noisy Case

The simulation results for a blurred and noisy image are presented in Fig. 3(d)–(g). In Fig. 3(d), the image is obtained by adding white Gaussian noise (with mean and standard deviation equal to 1.5 and 0.5, respectively) to the blurred image. Here, the mean of the noise is chosen as 1.5 since the scanned black and white pixel values are represented by "1" and "2," respectively. For the noisy case, both noise reduction and an increase in sharpness are necessary. In order to reduce noise considerably while increasing the sharpness of the text, two changes to the noise-free case are made. First, the parameter value is taken to be large [compared with the value used in the noise-free case of Fig. 3(b)]. Second, a second round of processing is performed on the image using the same system but with a modification: in the second round of processing, the finest detail is excluded from the reconstruction. This results in a further reduction of the background noise. When the variance is high, the improvement due to noise reduction is more significant than the impairment due to the reduction in sharpness.

The result of the first round is depicted in Fig. 3(e). We see that the blurred and noisy image in Fig. 3(d), which is unclear and difficult to read, has been enhanced in Fig. 3(e) by reducing the noise and increasing the sharpness of the text. The second round of processing described above is then applied to the image in Fig. 3(e). Using the same parameter values as in Fig. 3(e), the corresponding result is presented in Fig. 3(f), which shows some reduction of high-variance noise.

The presented method has been compared with the document enhancement method proposed by Ramponi et al. [7], in which quadratic filters are used. The limitation of Ramponi's method is that it does not work well in high-noise situations, e.g., the case in our simulation example. Fig. 3(g) shows the result of the enhancement using Ramponi's method. The value of the quadratic parameter was selected by trial and error. We see that the method proposed by Ramponi is not able to reduce the noise as much as our method. The variance of the void (text-free) region was found to be larger with Ramponi's method, and the sharpness of the text is also slightly lower. Here we do not compare the results with more conventional methods such as median filtering, kth nearest neighbor averaging filtering, and Wilcoxon filtering, because it is found in [7] that Ramponi's method works much better than the conventional ones in terms of noise reduction and improved sharpness.
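The two-round procedure of this subsection can be illustrated with a simplified edge-gated Laplacian pyramid. The sketch below is not the system of Fig. 1: it assumes ordinary separable dyadic sampling instead of the quincunx lattice, a 5-tap binomial filter instead of the diamond-shaped filter, a plain gradient-magnitude gate (`edge_gate`) standing in for the fuzzy edge detector, and it omits the contrast enhancement operator. All function names are hypothetical. What it does keep from the text is the structure: the edge image multiplies the highpass image, the result is added to the interpolated lowpass image, and the second round leaves out the finest detail.

```python
import numpy as np

KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # binomial lowpass (assumed)


def smooth(img):
    """Separable lowpass filtering with edge replication."""
    h, w = img.shape
    x = np.pad(img, 2, mode="edge")
    x = sum(k * x[i : i + h, :] for i, k in enumerate(KERNEL))
    return sum(k * x[:, j : j + w] for j, k in enumerate(KERNEL))


def down(img):
    """Decimation: filter, then keep every second row and column."""
    return smooth(img)[::2, ::2]


def up(img, shape):
    """Interpolation: zero-insert to `shape`, then filter (gain 4)."""
    out = np.zeros(shape)
    out[::2, ::2] = img
    return 4.0 * smooth(out)


def decompose(img, levels=3):
    """Return (highpass images, finest first; coarsest lowpass image)."""
    highs, low = [], np.asarray(img, dtype=float)
    for _ in range(levels):
        coarse = down(low)
        highs.append(low - up(coarse, low.shape))
        low = coarse
    return highs, low


def edge_gate(img):
    """Stand-in for the fuzzy edge detector: gradient magnitude in [0, 1]."""
    gy, gx = np.gradient(img)
    g = np.hypot(gx, gy)
    peak = g.max()
    return g / peak if peak > 0 else g


def reconstruct(highs, low, gate=edge_gate, drop_finest=False):
    """Coarse-to-fine reconstruction with edge-gated highpass images."""
    for k in reversed(range(len(highs))):
        interp = up(low, highs[k].shape)
        if drop_finest and k == 0:
            low = interp  # second round: finest detail excluded
        else:
            low = interp + gate(interp) * highs[k]  # keep detail near edges only
    return low


def enhance(img, levels=3):
    """First round, then a second round without the finest detail."""
    first = reconstruct(*decompose(img, levels))
    second = reconstruct(*decompose(first, levels), drop_finest=True)
    return first, second
```

With a gate that is identically one, `reconstruct` returns the input of `decompose` unchanged, which is a convenient sanity check that any noise suppression comes from the gating alone.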

IV. CONCLUSION

We have applied the image enhancement scheme to document images that have been blurred and corrupted by additive Gaussian noise. The method is able to reduce the noise while increasing the sharpness of the text. However, the noise reduction seems to be more significant than the contour sharpening effect. In our simulation example, we have used a three-level multiresolution scheme because with this scheme noise reduction is found to be satisfactory without excessive loss of image details. The degree of enhancement may depend on the distortion of the text caused during scanning.

We found that the proposed method is more efficient than Ramponi's method [7], especially in the high-noise case. Further experiments involving various types of noisy and/or blurred receipt images (the results are not shown here) indicate that the noise is smoothed out and the sharpness is either retained or increased.

ACKNOWLEDGMENT

The authors are very thankful to the reviewers and the associate editor for their remarks that helped to improve this work.

REFERENCES

[1] J. C. Feauveau, "Analyse multirésolution pour les images avec un facteur de résolution √2," Traite. Signal, vol. 7, pp. 117–128, 1990.
[2] J. M. Jolion and A. Rosenfeld, A Pyramid Framework for Early Vision. Boston, MA: Kluwer, 1994.
[3] B. Kosko, Fuzzy Engineering. Englewood Cliffs, NJ: Prentice-Hall, 1997.
[4] T. Law, H. Itoh, and H. Seki, "Image filtering, edge detection, and edge tracing using fuzzy reasoning," IEEE Trans. Pattern Anal. Machine Intell., vol. 18, pp. 481–491, May 1996.
[5] L. A. Zadeh, K. S. Fu, K. Tanaka, and J. Shimara, Fuzzy Sets and Their Application to Cognitive and Decision Processes. New York: Academic, 1975.
[6] F. Sattar, L. Floreby, G. Salomonsson, and B. Lövström, "Image enhancement based on a nonlinear multiscale method," IEEE Trans. Image Processing, vol. 6, pp. 888–895, June 1997.
[7] G. Ramponi and P. Fontanot, "Enhancing document images with a quadratic filter," Signal Process., vol. 33, pp. 23–34, 1993.
[8] A. Beghdadi and A. Le Négrate, "Contrast enhancement technique based on local detection of edges," Comput. Vis., Graph., Image Process., vol. 46, pp. 262–274, 1989.
