Krishnaveni.M et al. / International Journal of Engineering Science and Technology (IJEST)
Improved Histogram based Thresholding Segmentation using PSO for Sign Language Recognition
Ms. KRISHNAVENI.M, Research Assistant, Department of Computer Science, Avinashilingam Deemed University for Women, Coimbatore, Tamil Nadu, India
Dr. V. RADHA, Associate Professor, Department of Computer Science, Avinashilingam Deemed University for Women, Coimbatore, Tamil Nadu, India

Abstract: Estimating human gestures from a set of still images requires an effective segmentation method. Image segmentation is the central task of a sign language recognition system. It is a significant step in image processing, and threshold segmentation is a simple and important method for grayscale image segmentation. This paper presents a multilevel threshold image segmentation method based on maximum entropy and Particle Swarm Optimization (PSO). The proposed algorithm takes advantage of the characteristics of particle swarm optimization and improves the parameters and the evolutionary process of basic PSO. The algorithm considers not only the spatial information but also the gray-level information, and it reduces the amount of computation. Several images are segmented in the experiments and the results are compared with other related segmentation methods, a few of which are discussed through subjective and objective assessment. The experimental results show that the PSO-based method converges quickly with high computational efficiency.

Keywords: Sign Language, Segmentation, Histogram, Entropy, Particle Swarm Optimization

1. Introduction
Research on sign language recognition indicates that considerable progress is still needed in this domain. Sign language can be considered as a collection of gestures, movements, postures, and facial expressions corresponding to letters and words in natural languages. Automatic sign language recognition is a challenging research area, since sign language is often the only means of communication for deaf people. The scientific study of speech, gestures, gaze and emotion has been a progressing area of image processing. A significant bottleneck is the determination of the boundaries between them.
Hence a research prototype is explored here that takes sign language images as its input and segments them with an optimized threshold. Image segmentation refers to the process of partitioning an image into multiple segments based on selected image features. The goal of segmentation is to simplify or change the representation of an image into something that is more meaningful and easier to analyze. Image segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in images. More precisely, it is the process of assigning a label to every pixel in an image such that pixels with the same label share certain visual characteristics. Therefore, the segmentation process needs an optimization algorithm that is capable of analyzing the distribution pattern of the pixels. Here the segmentation methods are evaluated with performance metrics. In line with the literature survey, the experimental results indicate that histogram-based PSO is the more efficient method for human pose segmentation. The paper is organized as follows: Section 2 deals with the need for gray level based segmentation in image processing. Section 3 deals with the comparison of segmentation methods and the subjective assessment of the approach. Section 4 explores the performance evaluation of these methods. The paper ends with some conclusions and observations on future work.
ISSN : 0975-5462
Vol. 3 No. 2 Feb 2011
1014
2. Gray level based Segmentation
Gray-level segmentation separates the pixels of an image into non-intersecting regions. On that basis, threshold segmentation is a widely used and effective method. The segmentation result relies on the threshold value, so choosing an optimal threshold is very important in grayscale segmentation. With an adequate threshold value, the object can be extracted from the grayscale image. The selection criterion for the threshold therefore plays an important role, and the PSO algorithm is adopted to find the optimal segmentation threshold, with its quick convergence as the main advantage. In the traditional 2D gray histogram, the horizontal axis denotes the gray-level value of a pixel; its range is from 0 to L-1. f(m,n) denotes the gray-level value of the pixel located at (m,n), where m ∈ {1, 2, ..., M} and n ∈ {1, 2, ..., N}. g(m,n) is the average gray value of the neighborhood of the pixel located at (m,n). The 2D gray histogram has four regions. Object pixels and background pixels lie near the diagonal, while noise points and edge pixels lie far from it: object pixels fall in region A, background pixels in region B, noise points in region C, and edge pixels in region D. According to the associated criterion, the optimal segmentation threshold is obtained as a pair of values (s, t), where s is the threshold on the pixel gray level and t is the threshold on the neighborhood average gray level.
Figure 1: Example of Histogram
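The two-dimensional gray histogram described in Section 2 can be illustrated with a short sketch. It is not the authors' implementation: the function names, the 3x3 averaging window, and the quadrant labels are assumptions made for illustration. The source states only that object and background pixels lie near the diagonal and that noise and edge pixels lie away from it, so the quadrants are named by position rather than as regions A to D.

```python
def neighborhood_mean(image, m, n, radius=1):
    """Average gray value g(m, n) over a square window, clipped at the borders.

    The window size (3x3 for radius=1) is an assumption; the paper does not state it.
    """
    rows, cols = len(image), len(image[0])
    total, count = 0, 0
    for i in range(max(0, m - radius), min(rows, m + radius + 1)):
        for j in range(max(0, n - radius), min(cols, n + radius + 1)):
            total += image[i][j]
            count += 1
    return total // count  # keep g(m, n) an integer gray level


def histogram_2d(image, levels=256):
    """Count the pixels for every pair (f(m, n), g(m, n))."""
    hist = [[0] * levels for _ in range(levels)]
    for m in range(len(image)):
        for n in range(len(image[0])):
            hist[image[m][n]][neighborhood_mean(image, m, n)] += 1
    return hist


def region_counts(hist, s, t):
    """Split the 2D histogram at the threshold pair (s, t) into four quadrants.

    'low_low' and 'high_high' are the diagonal quadrants (object/background);
    'low_high' and 'high_low' are the off-diagonal ones (noise/edges).
    """
    counts = {"low_low": 0, "low_high": 0, "high_low": 0, "high_high": 0}
    for f, row in enumerate(hist):
        for g, value in enumerate(row):
            key = ("low" if f <= s else "high") + "_" + ("low" if g <= t else "high")
            counts[key] += value
    return counts
```

For a tiny image such as `[[0, 0], [0, 8]]` every clipped window covers the whole image, so all four pixels share the same neighborhood average and only their own gray level separates them.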
Suppose that the gray-level histogram corresponds to an image, f(x,y), composed of light objects on a dark background, in such a way that object and background pixels have gray levels grouped into two dominant modes. One obvious way to extract the objects from the background is to select a threshold T that separates these modes. Then any point (x,y) for which f(x,y) > T is called an object point; otherwise, the point is called a background point.
3. Analysis of different gray level based segmentation for human pose
Thresholding is one of the most popular segmentation approaches because of its simplicity. However, the automatic selection of a robust, optimum threshold has remained a challenge in image segmentation. Besides being a segmentation tool on its own, thresholding is frequently used as one of the steps in many advanced segmentation methods. In gesture recognition applications, thresholding is not applied to the original images but in a space generated by the segmentation method. Thus, the adopted gray level based segmentation methods are applied to a few sign language images to determine effective thresholds that give robust results for the further construction of the system.
Figure 2 : Processing Diagram
Figure 3 : Analysis of different segmentation methods based on histogram level
3.1 OTSU's method for segmentation
Otsu's method computes a global threshold (level) that can be used to convert an intensity image to a binary image. The level is a normalized intensity value that lies in the range [0, 1]. The method chooses the threshold that minimizes the intraclass variance of the black and white pixels. The effectiveness metric is a value in the range [0, 1] that indicates the effectiveness of the thresholding of the input image. The lower bound is attainable only by images with a single gray level, and the upper bound is attainable only by two-valued images. Eqn (1) gives the within-class variance:
\[ \sigma^2_{within}(T) = \omega_B(T)\,\sigma_B^2(T) + \omega_o(T)\,\sigma_o^2(T) \tag{1} \]
where
\[ \omega_B(T) = \sum_{i=0}^{T-1} p(i), \qquad \omega_o(T) = \sum_{i=T}^{L-1} p(i) \]
[0, L-1] = the range of intensity levels
\sigma_B^2(T) = the variance of the pixels in the background (below threshold)
\sigma_o^2(T) = the variance of the pixels in the foreground (above threshold)
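As a minimal sketch of Eqn (1), the following exhaustive search evaluates the within-class variance for every candidate threshold and keeps the smallest. The function name and the choice to return an unnormalized integer threshold (rather than a level in [0, 1]) are assumptions for illustration; p(i) is taken to be the normalized histogram.

```python
def otsu_threshold(hist):
    """Return (T, within-class variance) minimizing Eqn (1) over all thresholds T.

    hist[i] is the number of pixels with gray level i; levels below T form the
    background class and levels from T upward form the foreground class.
    """
    total = sum(hist)
    p = [h / total for h in hist]  # normalized histogram p(i)
    levels = len(hist)
    best_t, best_var = None, float("inf")
    for t in range(1, levels):
        n_b = sum(hist[:t])
        n_o = total - n_b
        if n_b == 0 or n_o == 0:
            continue  # one class is empty, the split is meaningless
        w_b, w_o = n_b / total, n_o / total
        mu_b = sum(i * p[i] for i in range(t)) / w_b
        mu_o = sum(i * p[i] for i in range(t, levels)) / w_o
        var_b = sum((i - mu_b) ** 2 * p[i] for i in range(t)) / w_b
        var_o = sum((i - mu_o) ** 2 * p[i] for i in range(t, levels)) / w_o
        within = w_b * var_b + w_o * var_o
        if within < best_var:
            best_t, best_var = t, within
    return best_t, best_var
```

On a clearly bimodal histogram any threshold in the empty gap between the two modes yields the same minimum, so the search returns the first such threshold it meets.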
3.2 K-means algorithm for Segmentation
Here K-means is used as a two-phase iterative algorithm to minimize the sum of point-to-centroid distances, summed over all k clusters. In the first phase, each iteration reassigns the points to their nearest cluster centroid. In the second phase, points are reassigned individually. The procedure is as follows: arbitrarily choose k data points to act as cluster centers; then, until the cluster centers are unchanged, allocate each data point to the cluster whose center is nearest and replace each cluster center with the mean of the elements in its cluster. Clustering in attribute space can lead to unconnected regions in image space (but this may be useful for handling occlusions). The K-means image representation groups all feature vectors from all images into K clusters and provides a cluster id for every region of the image that represents the salient properties of the region. K-means is a fast iterative method that leads to a local minimum. It looks for unusual reduction in variance. Each iteration of the algorithm has two steps.
Assignment step: Assign each observation to the cluster with the closest mean
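\[ S_i^{(t)} = \left\{ x_j : \left\| x_j - m_i^{(t)} \right\| \le \left\| x_j - m_{i^*}^{(t)} \right\| \ \text{for all } i^* = 1, \ldots, k \right\} \tag{2} \]
Update step: Calculate the new means to be the centroids of the observations in each cluster
\[ m_i^{(t+1)} = \frac{1}{\left| S_i^{(t)} \right|} \sum_{x_j \in S_i^{(t)}} x_j \tag{3} \]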
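The assignment and update steps of Eqns (2) and (3) can be sketched on one-dimensional gray values. This is an illustrative simplification, not the paper's implementation: it uses the absolute difference as the distance rather than the Mahalanobis distance the authors apply, the function name is hypothetical, and the initial centers are passed in by the caller.

```python
def kmeans_1d(values, centers, max_iter=100):
    """Cluster scalar gray values by alternating assignment and update steps."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step, Eqn (2): each value joins the cluster with the closest mean.
        clusters = [[] for _ in centers]
        for x in values:
            nearest = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            clusters[nearest].append(x)
        # Update step, Eqn (3): each mean becomes the centroid of its cluster.
        # An empty cluster keeps its previous center.
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # the centers are unchanged, so the iteration has converged
        centers = new_centers
    return centers, clusters
```

With two well-separated groups of gray values, for example around 11 and around 201, two iterations are enough for the centers to settle on the group means.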
The distance metric should be determined to quantify the relative distances of objects. Euclidean and Mahalanobis distances are two major types of distance metrics. Computation of the distance metrics is based on the spatial gray-level histograms of the images. The Mahalanobis distance is applied here, formulated as (4), where X_A is the cluster center of any layer A, s is a data point, d is the Mahalanobis distance, and K_A^{-1} is the inverse of the covariance matrix.
\[ d = (s - X_A)^T K_A^{-1} (s - X_A) \tag{4} \]
3.3. Delaunay algorithm for segmentation
In this method a triangular mesh is used to divide an image into several disjoint regions whose characteristics, such as intensity and texture, are similar. This concept of vector-based segmentation has a number of advantages: (1) Direct vector representation of image regions eliminates the difficult process of raster data vectorization. (2) Manual corrections of the segmentation are easy (adding or removing vertices and manually assigning tissue types to triangles). (3) The presented two-dimensional case can be extended to 3D space. (4) The image structure is represented effectively (a reduced number of triangles instead of individual pixels). The Delaunay triangulation (DT) of a set of points generates regularly shaped triangles and is preferred over alternative triangulations for image segmentation. The DT can be constructed by several methods, the most common being the incremental method. The adaptive segmentation scheme is based on the Delaunay triangulation, and the mesh of the DT is adapted to the underlying structure of the image data. During construction of the Delaunay triangulation, the image is divided into a number of non-overlapping triangles ti. These triangles are not segments of the image by themselves, but they belong to image regions Rk = {t1, t2, . . . , tn}. This relationship can be expressed by a region membership function. Hard assignment means that this function assigns exactly one region to a given triangle. In practice, a membership function of the form m(ti,Rk) = p(ti|Rk), which makes a soft assignment of triangles, outperforms hard assignment and leads to better results. The soft membership function is usually a likelihood function that assigns each triangle to every image region with some certainty. The value increases with the similarity between the triangle and the region.
3.4 PSO for segmentation
The optimal threshold is computed here with Particle Swarm Optimization (PSO), which is inspired by the behavior of bird flocks. Each particle is treated as an individual that flies at a certain speed through the search space. The particles have neither mass nor volume, and every particle follows simple rules of conduct. According to the flying experience of the individual and of the group, particles dynamically adjust their flying speed. The PSO algorithm is analyzed here to find the optimal threshold for segmentation, which is then applied to sign language datasets. There are six important control parameters in the PSO algorithm: the population size, the cognitive learning rate c1, the social learning rate c2, the maximum particle flying speed V_max, the inertia weight factor, and the constriction factor K. The population size is the number of particles in the iterative process. The specific model is given by formula (5):
\[ v_{td}(t+1) = v_{td}(t) + c_1 r_1 \left( P_{td}(t) - x_{td}(t) \right) + c_2 r_2 \left( P_{gd}(t) - x_{td}(t) \right) \tag{5} \]
\[ x_{td}(t+1) = x_{td}(t) + v_{td}(t+1) \]
In this paper, because the sum of c1 and c2 should be 4, both c1 and c2 are set to 2. The values of the parameters r are 0.5. The iterative formula for the speed then becomes formula (6):
\[ v_{td}(t+1) = v_{td}(t) + P_{td}(t) + P_{gd}(t) - 2 x_{td}(t) \tag{6} \]
Because the gray value of a pixel is an integer, finding the SDAIVE maximum can be treated as an integer programming problem. Pixel particles move along the gray value (V-direction) and the gray D-value (μ-direction) at the same time, so the speed and location of the particles must be iterated in both directions simultaneously.
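The update rule of formulas (5) and (6) can be sketched for a single integer threshold. Several parts are assumptions for illustration and are not taken from the paper: the fitness function is Kapur's maximum-entropy criterion (the paper names maximum entropy but does not give the formula), the search is one-dimensional rather than the two-direction search described above, and the function names, swarm size, iteration count, velocity limit and seed are hypothetical.

```python
import math
import random


def kapur_entropy(hist, t):
    """Assumed fitness: sum of the entropies of the two classes split at t."""
    score = 0.0
    for part in (hist[:t], hist[t:]):
        weight = sum(part)
        if weight == 0:
            return float("-inf")  # degenerate split: one class is empty
        for count in part:
            if count:
                q = count / weight
                score -= q * math.log(q)
    return score


def pso_threshold(hist, particles=10, iterations=30, c1=2.0, c2=2.0,
                  r1=0.5, r2=0.5, v_max=16, seed=0):
    """Search for the integer threshold that maximizes kapur_entropy."""
    rng = random.Random(seed)
    levels = len(hist)
    x = [rng.randint(1, levels - 1) for _ in range(particles)]  # positions
    v = [0] * particles                                         # velocities
    p_best = list(x)                                            # personal bests
    g_best = max(x, key=lambda t: kapur_entropy(hist, t))       # global best
    for _ in range(iterations):
        for k in range(particles):
            # Formula (5); with c1 = c2 = 2 and r1 = r2 = 0.5 it reduces to (6).
            step = v[k] + c1 * r1 * (p_best[k] - x[k]) + c2 * r2 * (g_best - x[k])
            # Gray levels are integers, so velocity is rounded and clamped to v_max.
            v[k] = max(-v_max, min(v_max, int(round(step))))
            x[k] = max(1, min(levels - 1, x[k] + v[k]))
            if kapur_entropy(hist, x[k]) > kapur_entropy(hist, p_best[k]):
                p_best[k] = x[k]
            if kapur_entropy(hist, x[k]) > kapur_entropy(hist, g_best):
                g_best = x[k]
    return g_best
```

Fixing r1 and r2 at 0.5 follows the parameter values stated in the text; drawing them at random in each update, as in basic PSO, is a common alternative.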
Figure 4: Subjective assessment of the adopted segmentation methods
Figure 4 above gives the visual assessment of the segmentation methods based on the threshold values.
4. Performance evaluation
There is no well-known standard method for evaluating segmentation results. For this reason, evaluation is carried out with an empirical difference method using the Global Consistency Error, the Variation of Information, and entropy. These parameters verify the segmentation process and find the local minima in the segmentation evaluation measures. The Global Consistency Error (GCE) forces all local refinements to be in the same direction and is defined as:
\[ GCE(S, S') = \frac{1}{N} \min \left\{ \sum_i LRE(S, S', x_i), \ \sum_i LRE(S', S, x_i) \right\} \tag{7} \]
[Plot: Global Consistency Error. Series: Kmeans, PSO, OTSU, Delaunay. Horizontal axis: Number of images (Image 1 to Image 7). Vertical axis: Range of values (0.84 to 1.02).]
Figure 5 : GCE based assessment for segmentation methods
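A minimal sketch of Eqn (7) follows. The paper does not define the local refinement error LRE; the sketch assumes the usual definition, in which LRE(S, S', x_i) is the fraction of the region containing x_i in S that lies outside the region containing x_i in S'. Segmentations are given as flat lists of region labels, one per pixel, and the function names are hypothetical.

```python
def local_refinement_error(seg_a, seg_b, i):
    """Assumed LRE: |R(seg_a, i) minus R(seg_b, i)| / |R(seg_a, i)|."""
    region_a = {k for k, label in enumerate(seg_a) if label == seg_a[i]}
    region_b = {k for k, label in enumerate(seg_b) if label == seg_b[i]}
    return len(region_a - region_b) / len(region_a)


def global_consistency_error(seg_a, seg_b):
    """Eqn (7): refinement is allowed in one direction only, for all pixels."""
    n = len(seg_a)
    a_to_b = sum(local_refinement_error(seg_a, seg_b, i) for i in range(n))
    b_to_a = sum(local_refinement_error(seg_b, seg_a, i) for i in range(n))
    return min(a_to_b, b_to_a) / n
```

The quadratic loop is kept for readability; under this definition the error is zero whenever one segmentation is a refinement of the other.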
The Global Consistency Error is lower for the PSO-based method than for the other adopted methods.
Variation of Information: The VI between two clusterings X and Y can be represented by
\[ VI(X, Y) := H(X) + H(Y) - 2 I(X; Y) \tag{8} \]
[Plot: Variation of Information. Series: Kmeans, PSO, OTSU, Delaunay. Horizontal axis: Number of images (Image 1 to Image 7). Vertical axis: Range of values (0 to 16).]
Figure 6: VI based assessment for segmentation methods
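Eqn (8) can be sketched directly from label lists, where H is the entropy of a clustering and I is the mutual information between two clusterings. The function names and the use of the natural logarithm are assumptions for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """H(X) of a clustering given as one label per pixel."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(x, y):
    """I(X; Y) computed from the joint label counts of two clusterings."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )


def variation_of_information(x, y):
    """Eqn (8): VI(X, Y) = H(X) + H(Y) - 2 I(X; Y)."""
    return entropy(x) + entropy(y) - 2 * mutual_information(x, y)
```

Identical clusterings give a VI of zero, and two independent two-way splits of four pixels give 2 ln 2.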
The variation of information of the PSO-based method is very small compared with the other segmentation methods, as depicted in Figure 6.
The entropy of the image is calculated as
\[ HarFEnt = -\sum_{g=1}^{G} \sum_{g'=1}^{G} p(g, g') \log\{ p(g, g') \} \tag{9} \]
[Plot: Entropy. Series: Kmeans, OTSU, PSO, Delaunay. Horizontal axis: Number of Images (Image 1 to Image 7). Vertical axis: Range of values (0 to 3).]
Figure 7: Entropy based assessment for segmentation methods
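The entropy of Eqn (9) can be sketched as follows. The paper does not define p(g, g'); the sketch assumes it is the normalized co-occurrence frequency of the gray levels of horizontally adjacent pixel pairs, in the spirit of Haralick texture features. The function names and the natural logarithm are likewise assumptions.

```python
import math
from collections import Counter


def cooccurrence_probabilities(image):
    """Assumed p(g, g'): relative frequency of (left, right) gray-level pairs."""
    pairs = Counter()
    for row in image:
        for left, right in zip(row, row[1:]):
            pairs[(left, right)] += 1
    total = sum(pairs.values())
    return {pair: count / total for pair, count in pairs.items()}


def haralick_entropy(image):
    """Eqn (9): negative sum of p(g, g') * log p(g, g') over occurring pairs."""
    probabilities = cooccurrence_probabilities(image)
    return -sum(p * math.log(p) for p in probabilities.values())
```

A constant image has a single co-occurring pair and therefore zero entropy, while an image whose pairs are evenly spread over two combinations has entropy ln 2.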
Figure 7 shows that the entropy value is low for the proposed method, which means that less information is lost.
Conclusion
Image segmentation techniques are widely used in similarity searches. The uniqueness of the proposed image segmentation algorithm lies in a new quantization method that generates a gray level based segmentation in which the optimal threshold is estimated automatically by particle swarm optimization. In this paper an approach is proposed that selects the threshold value according to the similarity between gray levels. This similarity is obtained through a biologically inspired computing technique, PSO. The method overcomes the local minima that affect most conventional segmentation methods. Furthermore, the proposed method is both fast and efficient, and its segmentation results are close to human perception. In future work, objects can be extracted from the segmented image using edge detection techniques.