
2016 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC)

Hysteresis Thresholding Based Edge Detectors for Inscriptional Image Enhancement

M. Sornam^a, Department of Computer Science, The University of Madras, Chepauk, Chennai, 600 005, [email protected]

Muthu Subash Kavitha^b, School of Electronics Engineering, Department of Computer Vision and Image Processing, Kyungpook National University, 80 Daehakro, Bukgu, Daegu, Korea, [email protected]

Abstract: The objective of this study is to propose a hybrid computational method for enhancing text in historical inscriptional images. The method differentiates the foreground from the background of the images using hysteresis thresholding combined with the Canny, Sobel and Laplacian of Gaussian edge detectors. The combined thresholding methods were evaluated using the peak signal to noise ratio (PSNR) and the mean absolute difference (MAD) as error measures. Based on the enhancement results for 75 images, hysteresis combined with global thresholding on the Canny edge detector yields better text enhancement than the individual threshold based edge detectors.

Keywords: Edge Detection, Thresholding, Hysteresis, Local Thresholding, Global Thresholding, Enhancement

M. Nivetha^c, Department of Computer Science, The University of Madras, Chepauk, Chennai, 600 005, [email protected]

I. INTRODUCTION

The knowledge obtained from inscriptional images can be verified in order to understand the world's dynastic history. Several approaches to text information extraction have been proposed for page segmentation [1], license plate identification [2] and video indexing [3]. However, enhancing text lines in inscriptional images is difficult, since the intensity of the foreground text is similar to that of the background. Only a few approaches have been suggested for the localization and extraction of text from images of inscriptions [4, 5]. Commercially available optical character recognition (OCR) methods often produce a poor recognition rate on images of monuments. OCR failed to recognize the text in English inscriptional images because it requires an acceptable color variation between the foreground and the background. The independent component analysis method [5] failed to separate correlated signal sources and hence gave a lower recognition rate for text in inscriptional images. In general, ancient inscriptions do not show any color variation between foreground and background regions.

Therefore, to evaluate various kinds of inscriptions in four different languages, this study proposes a computational technique that enhances the small variance between the foreground and background regions of inscriptional images. The study adopts hysteresis thresholding based edge detectors for least dependency between the foreground, central, and background layers of such images, to facilitate the retrieval of characters from the foreground. Furthermore, the hysteresis thresholding based edge detectors combined with global and local thresholding methods were evaluated to determine the rate of error and mean difference for text enhancement. We provide comparative enhancement results on inscriptional images; the results show that hysteresis combined with global thresholding on the Canny edge detector yields better text enhancement than the individual threshold based edge detectors.

II. PROPOSED METHOD

A. Hysteresis Thresholding

The double threshold operator thresholds the input image for two ranges of grey scale values, one being included in the other [6]. The threshold for the narrow range is then used as a seed for the reconstruction of the threshold for the wide range:

threshold [t1t 2t 3t 4] ( f )  RT[t1,t 4] ( f )[T[t 2,t 3] ( f )]. (1) Where R represents the reconstruction of the threshold and f represents an image. If the gradient value of the edge pixel is larger than the wide threshold, then it is denoted as strong edge pixels. Otherwise if the gradient value of the edge pixel is lower than the wide threshold and larger than the narrow threshold, then it is denoted as weak edge pixels. If the pixel value is smaller than the narrow threshold, then it will be suppressed. B. Global Thresholding Global Thresholding is applied when the intensity of objects and background pixels are distinguishable. Initially it estimate threshold value T automatically. Based on T it

991

segments the objects of the image. It produce pixels brighter than T ( G1 ) and pixels darker than T ( G2 ) pixels. Then compute average intensities m1 and m2 of G1 and G2 , respectively. The new threshold value can be calculated by Tnew  m1  m2  / 2 . The process is repeated until the difference between T

 Tnew  T , otherwise it process

again all steps from the beginning.
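The two thresholding schemes of Sections A and B can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the stopping tolerance `tol`, the use of the global mean as the initial estimate of T, the 8-connected neighbourhood, and the treatment of ties (`>=`) are all assumptions made for illustration. The hysteresis function uses the strong pixels as seeds and grows them through the weak pixels, which is one common way to realise the reconstruction in (1).

```python
from collections import deque

def iterative_global_threshold(pixels, tol=0.5, max_iter=100):
    """Iterate T <- (m1 + m2) / 2 until T changes by less than tol."""
    t = sum(pixels) / len(pixels)            # assumed initial estimate: global mean
    for _ in range(max_iter):
        g1 = [p for p in pixels if p > t]    # G1: pixels brighter than T
        g2 = [p for p in pixels if p <= t]   # G2: pixels darker than (or equal to) T
        if not g1 or not g2:                 # degenerate split: nothing left to refine
            return t
        t_new = (sum(g1) / len(g1) + sum(g2) / len(g2)) / 2
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
    return t

def hysteresis_threshold(grad, low, high):
    """Keep strong pixels (>= high) and weak pixels (>= low) connected to them."""
    rows, cols = len(grad), len(grad[0])
    out = [[0] * cols for _ in range(rows)]
    queue = deque()
    for r in range(rows):                    # seeds: strong edge pixels
        for c in range(cols):
            if grad[r][c] >= high:
                out[r][c] = 1
                queue.append((r, c))
    while queue:                             # grow seeds through weak pixels
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (0 <= rr < rows and 0 <= cc < cols
                        and not out[rr][cc] and grad[rr][cc] >= low):
                    out[rr][cc] = 1
                    queue.append((rr, cc))
    return out

if __name__ == "__main__":
    print(iterative_global_threshold([10, 12, 11, 200, 205, 210]))
    grad = [[0, 40, 90, 40, 0],
            [0, 0, 0, 0, 0],
            [40, 0, 0, 0, 0]]
    for row in hysteresis_threshold(grad, low=30, high=80):
        print(row)
```

In the small example, the weak pixels next to the strong pixel in the first row are kept, while the isolated weak pixel in the last row is suppressed because it is not connected to any strong pixel.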

a. Gaussian filter

In the first stage of the Canny edge detector (Section E), it is necessary to filter out noise and unwanted details and textures. The choice of the Gaussian kernel size affects the performance of the detector: the larger the kernel, the lower the detector's sensitivity to noise. Hence a 5×5 kernel with $\sigma = 0.7$ was chosen for better performance in this study. The smoothed image is given by

$$g(m, n) = G_\sigma(m, n) * f(m, n) \qquad (4)$$

where $G_\sigma$ is the Gaussian kernel defined in (5).

C. Local Thresholding

Local thresholding depends on the position within the image. The image is divided into overlapping sections, which are thresholded one by one, so that variations across the entire image can be taken into account [7]. A threshold is calculated for each pixel in the image. If the pixel value is below the threshold, it is set to the background value; otherwise it takes the foreground value. The threshold is denoted as T = mean.

D. Sobel Edge Detector

The Sobel operator performs a 2-D spatial gradient calculation on an image and highlights areas of high spatial frequency, which correspond to edges [8]. It is applied to find the absolute gradient magnitude at each pixel of an input grayscale image. The Sobel edge detector uses a pair of 3×3 convolution masks and is very similar to the Roberts cross operator. The kernels can be applied individually to the input image to produce separate measurements of the gradient component in each orientation ($G_x$ and $G_y$). These measurements are then combined to find the absolute magnitude of the gradient at each pixel and the orientation of that gradient. The gradient magnitude is given by

 G  Gx2  G 2 y  

(2)

The angle of orientation of the edge (relative to the pixel grid) giving rise to the spatial gradient is given by:

  arctanGy / Gx

(3)

The orientation  is taken to mean that the direction of maximum contrast from black to white runs from left to right on the image, and other angles are measured anti-clockwise from this. E. Canny Edge Detector Canny edge detector uses multistage algorithm to detect wide range of edges in images [9]. The computational stages are; smooth image with a Gaussian filter, compute the gradient magnitude using approximations of partial derivatives, thin edges by applying non-maxima suppression to the gradient magnitude and finally detect edges by double thresholding technique. Brief descriptions of every steps followed in this study is given in the following sections.

2 2   G 1 / 2 2 exp   m  n  2 2  

(5)
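The Gaussian smoothing of (4) and (5), with the 5×5 kernel and σ = 0.7 reported in this study, can be sketched as follows. The function names, the renormalisation of the sampled kernel so that its weights sum to one, and the replication of border pixels are assumptions made for this illustration.

```python
import math

def gaussian_kernel(size=5, sigma=0.7):
    """Sample equation (5) on a size x size grid and normalise to sum 1."""
    half = size // 2
    kernel = [[math.exp(-(m * m + n * n) / (2 * sigma * sigma))
               / (2 * math.pi * sigma * sigma)
               for n in range(-half, half + 1)]
              for m in range(-half, half + 1)]
    total = sum(map(sum, kernel))
    return [[v / total for v in row] for row in kernel]

def smooth(image, kernel):
    """Equation (4): g = G * f, with border pixels replicated (assumed)."""
    rows, cols = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i, krow in enumerate(kernel):
                for j, w in enumerate(krow):
                    rr = min(max(r + i - half, 0), rows - 1)
                    cc = min(max(c + j - half, 0), cols - 1)
                    acc += w * image[rr][cc]
            out[r][c] = acc
    return out

if __name__ == "__main__":
    k = gaussian_kernel()
    noisy = [[10, 10, 10, 10, 10],
             [10, 10, 90, 10, 10],
             [10, 10, 10, 10, 10]]
    print([round(v, 2) for v in smooth(noisy, k)[1]])
```

Because the kernel is symmetric, convolution and correlation coincide here, so the kernel does not need to be flipped.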

b. Computation of gradient magnitude of the image

The Canny algorithm applies four filters to detect horizontal, vertical and diagonal edges (0°, 45°, 90° and 135°). This yields estimates of the first derivative in the horizontal ($G_x$) and vertical ($G_y$) directions [7]. The gradient magnitude and orientation are then evaluated using equations (2) and (3).

c. Thin edges by non-maximum suppression

Non-maximum suppression is applied once the edge directions have been computed. Each edge pixel is compared with its two neighbouring pixels along the gradient direction, that is, perpendicular to the edge. If a neighbouring value is smaller than that of the edge pixel, the neighbouring pixel is suppressed. If a neighbouring value is larger than that of the edge pixel, the edge pixel is suppressed by setting its value to 0.

F. Laplacian of Gaussian

The Laplacian of Gaussian technique smooths the image with a Gaussian smoothing filter to reduce noise [10]. The operator takes a gray level image as input and delivers a gray level image as output. The two-dimensional Laplacian of Gaussian operator, centered on zero with Gaussian standard deviation $\sigma = 0.9$, is given by

1  x2  y   LoG( x, y )  1 e  4  2 2 

x2  y2 2 2

(6)
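To make (6) concrete, the sketch below samples the Laplacian of Gaussian on a small grid using the σ = 0.9 given above. The kernel size of 5×5 and the function name `log_kernel` are assumptions, since the paper does not state the size used for this operator.

```python
import math

def log_kernel(size=5, sigma=0.9):
    """Sample equation (6) on a size x size grid centred on zero."""
    half = size // 2
    s2 = sigma * sigma
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            q = (x * x + y * y) / (2 * s2)
            row.append(-1.0 / (math.pi * s2 * s2) * (1 - q) * math.exp(-q))
        kernel.append(row)
    return kernel

if __name__ == "__main__":
    for row in log_kernel():
        print(" ".join(f"{v:8.4f}" for v in row))
```

The sampled kernel is negative at the centre and changes sign where $x^2 + y^2 = 2\sigma^2$, which gives the operator its characteristic response on either side of an edge.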

G. Performance Evaluation

Peak signal to noise ratio (PSNR) and mean absolute difference (MAD) were used as error measures in evaluating the thresholding based edge detectors. The PSNR represents the peak error, while the MAD indicates the absolute differences between values. The PSNR of an enhanced image is calculated as

$$PSNR = 10 \log_{10}\left(\frac{MAX_I^2}{MSE}\right) \qquad (7)$$

where $MAX_I$ represents the maximum possible pixel value of the image and MSE represents the mean square error, used to calculate the cumulative squared error between the computed and the original images.

The MSE is given by

$$MSE = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \left[I(i, j) - K(i, j)\right]^2 \qquad (8)$$

where $I$ indicates the original image and $K$ represents its approximated image. The MAD is given by

$$MAD = \frac{1}{n} \sum_{i=1}^{n} \left|x_i - \bar{x}\right| \qquad (9)$$

III. RESULTS

This study selected 75 inscriptional images in four different languages. These comprise 40 Tamil, 15 Telugu and 10 Hindi images from Thanjavur Pragatheswarer temple, Tamil Nadu; Tirupathi temple, Telengana; and Gurudwara, Punjab, respectively. Ten English images obtained online were also included in the analysis. Inscriptions of this kind are generally carved into stone or other durable materials. The enhancement of these inscriptional images based on individual and combined thresholding with edge detectors was evaluated. The enhancement results using hysteresis with global and local thresholding based on the Canny, Sobel and Laplacian of Gaussian (LOG) detectors are shown in Fig. 1. Furthermore, the rate of error and mean difference for text enhancement of the English, Tamil, Telugu and Hindi inscriptions were determined. A higher PSNR value indicates less error in estimating the complexity of the images, whereas a lower MAD value indicates less error in determining the performance accuracy.

Table I shows the error rates for English inscriptional images based on thresholding and edge detectors. The PSNR and MAD values of hysteresis with local and global thresholding based on the Canny detector showed the lowest error rate compared to the other detectors. Table II shows the error rates for Tamil inscriptional images. The performance on Tamil is similar to that on English characters, and the PSNR and MAD values of hysteresis with local and global thresholding based on the Canny detector again showed the lowest error rate compared to the other detectors. Tables III and IV give the error rates for Telugu and Hindi inscriptional images, respectively. The PSNR and MAD values of hysteresis with local and global thresholding based on the Canny detector for Telugu and Hindi showed error rates similar to those for English and Tamil, and lower than those of the other detectors. The LOG edge detector delivered higher error rates for all languages in both measurements, with lower PSNR and higher MAD values. Hence it revealed a poor recognition rate compared with the other two edge detectors used in this study.
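The error measures behind the PSNR and MAD values discussed above, equations (7) to (9), can be sketched as follows. This is an illustration rather than the evaluation code used in the study: the function names and the default peak value `max_i=255` (8-bit images) are assumptions, and since the paper does not specify what the values $x_i$ in (9) are, `mad` simply takes any list of numbers.

```python
import math

def mse(original, approx):
    """Equation (8): mean squared error between two equally sized images."""
    m, n = len(original), len(original[0])
    return sum((original[i][j] - approx[i][j]) ** 2
               for i in range(m) for j in range(n)) / (m * n)

def psnr(original, approx, max_i=255):
    """Equation (7); max_i = 255 assumes 8-bit pixel values."""
    err = mse(original, approx)
    if err == 0:
        return math.inf                      # identical images
    return 10 * math.log10(max_i ** 2 / err)

def mad(values):
    """Equation (9): mean absolute difference from the mean of the values."""
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values)

if __name__ == "__main__":
    original = [[0, 0], [0, 0]]
    enhanced = [[0, 0], [0, 10]]
    print(mse(original, enhanced))           # 25.0
    print(round(psnr(original, enhanced), 2))
    print(mad([1, 2, 3, 4]))                 # 1.0
```

A larger MSE lowers the PSNR, which is why a higher PSNR and a lower MAD are both read as less error in the tables.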

Fig. 1. (a) English, (b) Tamil, (c) Telugu and (d) Hindi inscriptions: (a-d) original images, (e-h) Canny-hysteresis with global threshold, (i-l) Sobel-hysteresis with global threshold, (m-p) Laplacian of Gaussian-hysteresis with global threshold.

In general, the PSNR obtained using hysteresis with local and global thresholding on the Canny detector was higher than that of the individual thresholding methods. For Sobel and LOG, however, local thresholding gave a higher PSNR than the other individual and combined thresholding methods. On the Canny detector, the combined thresholding performed better in terms of MAD, with lower values than local and global thresholding alone, whereas on Sobel and LOG the combined thresholding performed poorly, with higher MAD values than local and global thresholding. The performance of the Sobel detector in separating text from its background was moderate in this study. Hence, based on both PSNR and MAD, hysteresis with global thresholding on the Canny detector outperforms the other edge detectors in terms of enhancement results. It gave an acceptable rate of differentiating characters from their background compared with the other detectors used in this study, as shown in Fig. 1(e-h).

TABLE I. English inscriptional images with PSNR and MAD values

| Methods | PSNR Canny | PSNR Sobel | PSNR LOG | MAD Canny | MAD Sobel | MAD LOG |
|---|---|---|---|---|---|---|
| Local | 2.93 | 54.10 | 2.23 | 167.89 | 0.24 | 191.8 |
| Global | 2.22 | 4.29 | 2.17 | 191.24 | 82.64 | 193.9 |
| Hysteresis | 54.36 | 52.97 | 2.23 | 0.20 | 0.29 | 191.8 |
| Hysteresis + local | 68.00 | 50.23 | 2.20 | 0.01 | 0.35 | 191.8 |
| Hysteresis + global | 70.65 | 7.33 | 2.20 | 0.01 | 47.58 | 191.8 |

TABLE II. Tamil inscriptional images with PSNR and MAD values

| Methods | PSNR Canny | PSNR Sobel | PSNR LOG | MAD Canny | MAD Sobel | MAD LOG |
|---|---|---|---|---|---|---|
| Local | 5.60 | 50.91 | 5.58 | 126.21 | 0.48 | 127.1 |
| Global | 5.57 | -0.44 | 5.56 | 126.78 | 203.4 | 127.5 |
| Hysteresis | 50.17 | 48.88 | 5.56 | 0.52 | 0.66 | 127.4 |
| Hysteresis + local | 67.95 | 50.23 | 5.50 | 0.01 | 0.35 | 127.4 |
| Hysteresis + global | 70.55 | 2.99 | 5.50 | 0.01 | 129.2 | 127.4 |

TABLE III. Telugu inscriptional images with PSNR and MAD values

| Methods | PSNR Canny | PSNR Sobel | PSNR LOG | MAD Canny | MAD Sobel | MAD LOG |
|---|---|---|---|---|---|---|
| Local | 5.90 | 49.44 | 6.56 | 116.77 | 0.61 | 107.9 |
| Global | 6.55 | -1.07 | 6.45 | 107.56 | 233.4 | 109.5 |
| Hysteresis | 52.44 | 47.88 | 6.55 | 0.22 | 0.77 | 108.1 |
| Hysteresis + local | 68.06 | 50.23 | 5.56 | 0.01 | 0.35 | 127.4 |
| Hysteresis + global | 70.65 | 1.98 | 5.56 | 0.01 | 163.2 | 127.4 |

TABLE IV. Hindi inscriptional images with PSNR and MAD values

| Methods | PSNR Canny | PSNR Sobel | PSNR LOG | MAD Canny | MAD Sobel | MAD LOG |
|---|---|---|---|---|---|---|
| Local | 3.79 | 50.22 | 6.04 | 124.0 | 0.53 | 120.80 |
| Global | 6.04 | 0.36 | 5.88 | 120.3 | 174.9 | 123.50 |
| Hysteresis | 50.65 | 49.06 | 6.03 | 0.63 | 0.63 | 120.95 |
| Hysteresis + local | 68.09 | 50.23 | 6.03 | 0.01 | 0.35 | 120.95 |
| Hysteresis + global | 70.44 | 3.34 | 6.03 | 0.01 | 119.5 | 120.95 |

IV. CONCLUSION

This study suggested a hybrid method to reduce complex and blurred noise in historical inscriptional images. Hysteresis combined with global thresholding on the Canny edge detector gave the minimum error in differentiating the background from the foreground. In addition, the comparative enhancement results on inscriptional images showed that this combination yields better text enhancement than the individual threshold based edge detectors. Further study using historical sculptures should be carried out to assess the suggested hybrid method and to improve the algorithm based on feature strength measures and other scale space methods.

REFERENCES

[1] A. K. Jain and Y. Zhong, “Page Segmentation using Texture Analysis,” Pattern Recog, Vol. 29 (5), pp. 743-770, 1996.

[2] D. S. Kim and S. I. Chien, “Automatic car license plate extraction using modified generalized symmetry transform and image warping,” Proc. Inter Symposium on Industrial Electronics, Vol. 3, pp. 2022-2027, 2001.

[3] S. Antani, D. Crandall, A. Narasimhamurthy, V. Y. Mariano, and R. Kasturi, “Evaluation of methods for detection and localization of text in video,” Proc. IAPR workshop on Document Analysis Systems, pp. 506-514, 2000.

[4] S. Das, S. Mandal, and A. K. Das, “Binarization of stone inscripted documents,” IEEE Inter Conf on Computer Graphics, Vision and Information Security (CGVIS), pp. 11-16, 2015.

[5] U. Garain, A. Jain, A. Maity, and B. Chanda, “Machine reading of camera-held low quality text images: an ICA-based image enhancement approach for improving OCR accuracy,” Proc. 19th Inter Conf on Pattern Recog (ICPR), pp. 1-4, 2008.

[6] X-S. Hua, X-R. Chen, W. Liu, and H-J. Zhang, “Automatic location of text in video frames,” Proc. ACM workshops on Multimedia: multimedia information retrieval, pp. 24-27, 2001.

[7] C. Wolf, J-M. Jolion, and F. Chassaing, “Text localization, enhancement and binarization in multimedia documents,” 16th Inter Conf on Pattern Recog, Vol. 2, pp. 1037-1040, 2002.

[8] O. R. Vincent and O. Folorunso, “A descriptive algorithm for sobel image edge detection,” Proc. Informing Science & IT Education Conf (InSITE), Vol. 40, pp. 97-107, 2009.

[9] Hojin Cho, Myungchul Sung, and Bongjin Jun, “Canny Text Detector: Fast and Robust Scene Text Localization Algorithm,” Proc. Computer Vision and Pattern Recog (CVPR), pp. 3566-3573, 2016.

[10] Qiaoyu Sun and Yue Lu, “Text detection from scene images using scale space model,” Proc Inter Forum on Digital TV and Wireless Multimedia Communication, Vol. 331, pp. 156-161, 2012.

978-1-5090-0611-3/16/$31.00©2016 IEEE
