International Journal of Emerging Technology and Advanced Engineering Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 3, Issue 5, May 2013)

Gurumukhi Sign Language Translation Using FCM

Parul Chauhan1, Chirag Sharma2, Harleen Kaur Khehra3
1,3M.Tech (CSE), 2Asst. Professor, LPU

Abstract— This paper implements one application of gesture recognition, namely sign language translation. Sign language is an important way to communicate with deaf people, and it helps hearing-impaired people to understand manual languages such as Hindi and Gurumukhi. In this paper, we build a static, vision-based hand gesture translation system. For this purpose we use fuzzy logic to extract the skin region, a fuzzy technique to extract features from the extracted skin region, and fuzzy C-means (FCM) clustering to find the matching word. We translate sign language (SL) into Gurumukhi letters, performing classification by forming fuzzy clusters. The system is tested on 41 letters of Gurumukhi.

Keywords— Fuzzy C-means (FCM), Sign Language (SL), Gesture Recognition, Sign Language Translation (SLT), Gurumukhi.

I. INTRODUCTION
The community of hearing-impaired people is very large in every corner of the world. This community cannot understand normal spoken language, so a major problem arises in communicating with it. Sign language is the solution to this communication problem. Signed communication is characterized by the use of hand gestures and is mainly represented by the official languages developed by the hearing-impaired community. A sign language translator makes sign language easier to understand, and for its users the translator should be easy and natural to use. Several studies have been carried out in this area with the help of cyber gloves or position tracking and with image or video input. Signed communication has a prominent number of applications, such as interaction with autistic or mute people, automatic scene analysis, human-computer interaction, and the analysis and interpretation of body language. In this paper, we develop a static hand gesture translation system. In particular, we employ a vision-based feature extraction method and fuzzy C-means clustering to find the matched signs. The system enables communication between the hearing-impaired or mute community and the hearing community. There are three main stages in our SLT:
Stage 1: Hand gesture image acquisition
Stage 2: Hand gesture image processing
Stage 3: Translation
Vision-based: The input to the translator is a set of images, which are captured by the system webcam and stored in the database.
Translation: The translator recognizes particular sets of signs and translates them into Gurumukhi.

II. RELATED WORK
As mentioned in the introduction, a wide variety of work has been done on translating sign language into other manual languages, but no previous work has been done on translating Sign Language (SL) into the Gurumukhi manual language. Previous related work on other sign languages mostly uses the fuzzy C-means technique to find the matched word or letter, as we have included in our experiment, and various feature extraction techniques are followed by different authors. Feature extraction is a very important task in sign language translation systems, human-computer interaction, etc. U-SURF is the speeded-up robust feature (SURF) without orientation assignment, which is used to match an image frame against a signature library [3]. Haar-like features are also used to represent the shape of the hand; these features are then input to a fuzzy C-means classification algorithm. This approach is used by Juan Wachs, Helman Stern, Yael Edan, Michael Gillam, Craig Feied, Mark Smith, and Jon Handler in their work "Hand Gesture Interface for Medical Visualization Applications" [13]. A hybrid architecture for gesture recognition is proposed by Renata C. B. Madeo, Sarajane M. Peres, and Clodoaldo A. M. Lima, in which heuristic classifiers are combined using a fuzzy syntactical strategy and gestures are recognized based on primitives [5]. Earlier sign language recognition systems have used Hidden Markov Models (HMMs) and conditional random fields (CRFs) [9]. Some have used the Principal Component Analysis (PCA) method for classification [2], and some have used color segmentation and neural networks for data processing and translation [1]. Spatio-temporal feature extraction techniques have been used for isolated gesture recognition in Arabic Sign Language translation [8].

III. PROPOSED SYSTEM OVERVIEW
Our proposed system translates the sign language used for communication with hearing-impaired people into Gurumukhi, the script of the Punjabi language. Our proposed system has the following:

Figure 1: Critical steps included in SLT (General Design)

A. Image Acquisition
Images are two-dimensional representations of the electromagnetic information of a scene. This two-dimensional information is acquired by an imaging device that can store it as a two-dimensional array, such as a camera. In our proposed system, images are acquired by the system camera, which captures colour images of the hand gesture in the 3-D scene. These colour images are made up of varying amounts of light at three different wavelengths, corresponding to the colours red, green, and blue [4]. Figure 2 shows the process of image acquisition, where the 3-D scene of the hand gesture is projected onto the 2-D image plane and the digitised image is output.

Fig. 2: Digital image acquisition [4]
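The paper does not name the capture tooling; as an illustration only, a minimal sketch for grabbing one hand-gesture frame from the system webcam, assuming OpenCV is available (the function name and camera index are hypothetical), could look like this:

```python
import cv2

def acquire_gesture_image(camera_index=0):
    """Capture a single colour frame of the hand gesture from the system webcam."""
    cap = cv2.VideoCapture(camera_index)
    ok, frame_bgr = cap.read()                 # OpenCV delivers frames in BGR order
    cap.release()
    if not ok:
        raise RuntimeError("Could not read a frame from the camera")
    return cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)   # return the frame as RGB
```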



Figure 3: Gurumukhi Sign Language Alphabets

B. Image Processing
1) Segmentation
Skin color segmentation is a very important task in sign language translation and human-computer interaction (HCI). It is performed because a large amount of surplus data is captured in our image that is not required for gesture recognition. Segmentation can be done with the help of color tones: an image consists of RGB color bands, and skin color takes values of these bands that differ from the background colors in our images. The RGB color space, however, is light sensitive; the intensities of the red, green, and blue components are affected by illumination. We therefore transform the RGB color space into the YCbCr color space, where Y is the luminance component (the intensity of light, independent of color, which can be used to handle illumination variation) and Cb, Cr are the chrominance components (the intensities of the blue and red components relative to the green component). The transformation can be done using the following operations:

Y = 0.257R + 0.504G + 0.098B + 16
Cb = -0.148R - 0.291G + 0.438B + 128        (1)
Cr = 0.439R - 0.368G - 0.071B + 128

or

Y = 0.299R + 0.587G + 0.114B
Cb = B - Y                                   (2)
Cr = R - Y
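As an illustration of equation (1), a minimal NumPy sketch follows, assuming an 8-bit RGB image array; the function name is ours, not the authors' code:

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an H x W x 3 uint8 RGB image to YCbCr using equation (1)."""
    rgb = rgb.astype(np.float32)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.257 * r + 0.504 * g + 0.098 * b + 16
    cb = -0.148 * r - 0.291 * g + 0.438 * b + 128
    cr =  0.439 * r - 0.368 * g - 0.071 * b + 128
    return np.stack([y, cb, cr], axis=-1)
```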

For skin color segmentation we have used a fuzzy approach in our system. The fuzzy approach has additional positive features, such as the ability to handle uncertainties and partial truth values. Since the effect of light on skin color segmentation should not be disregarded, a method is required that behaves properly under different lighting conditions []. To resolve this problem and several other issues we use the fuzzy technique in our system; it also enhances the accuracy and reliability of the skin color segmentation. In our proposed method we use membership criteria instead of checking a threshold value for each component of the image: how close a pixel is to skin color and how close it is to non-skin color is determined by the membership degree of that pixel. For this we use the LOW and HIGH linguistic variables for each component (Y, Cb, Cr) of the image. The ranges defined for these variables are given in Table 1.

Table 1. Ranges of Terms

Component    LOW        HIGH
Y            16-119     119-235
Cb           16-94      94-240
Cr           16-109     109-240

According to Table 1, the following inference rules are designed:
1. If (Y is LOW) and (Cb is LOW) and (Cr is LOW) then (output is Non-Skin)
2. If (Y is LOW) and (Cb is LOW) and (Cr is HIGH) then (output is Non-Skin)
3. If (Y is LOW) and (Cb is HIGH) and (Cr is HIGH) then (output is Non-Skin)
4. If (Y is LOW) and (Cb is HIGH) and (Cr is LOW) then (output is Non-Skin)


5. If (Y is HIGH) and (Cb is LOW) and (Cr is LOW) then (output is Non-Skin)
6. If (Y is HIGH) and (Cb is LOW) and (Cr is HIGH) then (output is Non-Skin)
7. If (Y is HIGH) and (Cb is HIGH) and (Cr is LOW) then (output is Non-Skin)
8. If (Y is HIGH) and (Cb is HIGH) and (Cr is HIGH) then (output is Skin)
Here Y, Cb, and Cr are the colour space components of the image, and LOW and HIGH are the linguistic variables defined above. The following figures show the membership functions for the input and output variables of the Mamdani fuzzy system.

Fig. 4: Membership function for "Y"
Fig. 5: Membership function for "CB"
Fig. 6: Membership function for "CR"
Fig. 7: Membership function for output
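For illustration, a heavily simplified Python sketch of this rule base follows. Only rule 8 has a Skin consequent, so a crisp skin decision can be read from the minimum (fuzzy AND) of the three HIGH memberships; the triangular membership shapes and the 0.5 cut-off are our assumptions, since the paper specifies the Mamdani membership functions only through Figs. 4-7.

```python
import numpy as np

# Table 1 ranges as (start of LOW, LOW/HIGH crossover, end of HIGH).
RANGES = {"Y": (16, 119, 235), "Cb": (16, 94, 240), "Cr": (16, 109, 240)}

def low_high(value, lo, mid, hi):
    """Assumed triangular LOW/HIGH membership degrees for one colour component."""
    low = np.clip((mid - value) / (mid - lo), 0.0, 1.0)
    high = np.clip((value - mid) / (hi - mid), 0.0, 1.0)
    return low, high

def skin_degree(y, cb, cr):
    """Rule 8 (the only rule with a Skin output): min of the HIGH memberships."""
    _, y_high = low_high(y, *RANGES["Y"])
    _, cb_high = low_high(cb, *RANGES["Cb"])
    _, cr_high = low_high(cr, *RANGES["Cr"])
    return min(y_high, cb_high, cr_high)

# Example: label a pixel as skin when its skin degree exceeds an assumed 0.5 cut-off.
is_skin = skin_degree(200, 170, 180) > 0.5
```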

2) Pattern Recognition
The first step, the construction of a formal description of objects, depends on the experience and intuition of the designer. A set of elementary properties is chosen that describes some characteristics of the objects and allows their patterns to be recognized. Recognition theory deals with the problem of designing a classifier for a specific set of elementary object descriptions [18]. A pattern is also referred to as a pattern vector or feature vector; the numerical descriptions used for object description are called features, and the set of all possible patterns forms the pattern space or feature space. If the features are chosen correctly, the similarity of objects within each class results in the proximity of their patterns in the pattern space, and the classes form clusters in the feature space. The second step in pattern recognition is classification, which takes a pattern (the output of the first step) as input and matches it against the database images. We have used the fuzzy C-means clustering approach for classification. It allows data to belong to one or more clusters and is based on the following objective function:


J_m = Σ_{i=1}^{N} Σ_{j=1}^{C} u_ij^m ||x_i - c_j||^2,    1 ≤ m < ∞        (3)


where:
m is any real number greater than 1,
u_ij is the degree of membership of x_i in cluster j,
x_i is the i-th d-dimensional measured data point,
c_j is the d-dimensional centre of cluster j, and
||*|| is any norm expressing the similarity between the measured data and the centre.
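A compact NumPy sketch of the standard FCM iteration that minimises this objective is given below; the initialisation, stopping rule, and variable names are illustrative and not taken from the authors' implementation.

```python
import numpy as np

def fuzzy_c_means(x, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Minimise the objective of equation (3).
    x: (N, d) array of feature vectors, c: number of clusters, m: fuzzifier (> 1)."""
    rng = np.random.default_rng(seed)
    u = rng.random((x.shape[0], c))
    u /= u.sum(axis=1, keepdims=True)                       # memberships sum to 1 per sample
    p = 2.0 / (m - 1.0)
    for _ in range(n_iter):
        um = u ** m
        centers = (um.T @ x) / um.sum(axis=0)[:, None]      # cluster centres c_j
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        u_new = dist ** (-p) / np.sum(dist ** (-p), axis=1, keepdims=True)
        if np.abs(u_new - u).max() < tol:                   # converged
            u = u_new
            break
        u = u_new
    return centers, u
```

Here u[i, j] expresses how strongly feature vector i belongs to cluster j; a test gesture can then be assigned to the letter whose cluster it joins with the highest membership.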



IV. EXPERIMENTAL RESULT
The techniques above are applied to both the database images and the input image, yielding the features of each. In this step we compare (match) the features of the two images, and the corresponding Gurumukhi letter, which already resides in the database, is shown as the result.
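The paper does not spell out the matching rule. One plausible illustrative reading, assuming each cluster centre produced by FCM is labelled with a Gurumukhi letter, is a nearest-centre lookup:

```python
import numpy as np

def match_letter(input_features, centers, letters):
    """Return the Gurumukhi letter of the closest cluster centre (illustrative only)."""
    dist = np.linalg.norm(centers - input_features, axis=1)   # distance to every centre
    return letters[int(np.argmin(dist))]
```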




Figure 8: Image pre-processing results


Figure 9: Final results


V. CONCLUSION

The sign language translator proposed in this paper is able to translate sign language into Gurumukhi (the script of the Punjabi language).

REFERENCES



[1] Akemliawati Rini, Ooi Po-Leen, Kuang Chow Ye (2007). Real-Time Malaysian Sign Language Translation Using Colour Segmentation and Neural Network. IMTC.
[2] Ashrafulamin M., Yan Hong (2007). Sign Language Finger Alphabet Recognition From Gabor-PCA Representation of Hand Gestures. Sixth International Conference on Machine Learning and Cybernetics.


[3] Chanda Phonkrit, Auephanwiriyakul Sansanee, and Theera-Umpon Nipon (2012). Thai Sign Language Translation System Using Speed-Up Robust Feature and C-Means Clustering. IEEE WCCI 2012.
[4] Gonzalez Rafael, Woods Richard (2008). Digital Image Processing, Third Edition. Pearson Prentice Hall, New Jersey, USA.
[5] Madeo C. B. Renata, Peres M. Sarajane, Lima A. M. Clodoaldo, Boscarioli Clodis (2012). Hybrid Architecture for Gesture Recognition: Integrating Fuzzy-Connectionist and Heuristic Classifiers using Fuzzy Syntactical Strategy. IEEE WCCI 2012.
[6] Mohmmad Saber Iraji, Azam Tosinia (2012). Skin Color Segmentation in YCbCr Color Space with Adaptive Fuzzy Neural Network (ANFIS). I.J. Image, Graphics and Signal Processing, 2012, 4, 35-41.
[7] Shanableh Tamer and Assaleh Khaled (2007). Two Tier Feature Extractions for Recognition of Isolated Arabic Sign Language Using Fisher's Linear Discriminants. IEEE ICASSP.
[8] Shanableh Tamer and Assaleh Khaled (2007). Spatio-Temporal Feature Extraction Techniques for Isolated Gesture Recognition in Arabic Sign Language. IEEE, Vol. 37, No. 3.
[9] Sonka, Hlavac, and Boyle (2008). Digital Image Processing and Computer Vision. Cengage Learning.
[10] Jiang Yenlai, Hayashi Isao, and Wang Shouyu (2012). Embodied Knowledge Extraction for Human Motion Using Singular Value Decomposition. IEEE WCCI 2012.
[11] Liang Rung-Huei, Ouhyoung Ming (1998). A Real-time Continuous Gesture Recognition System for Sign Language. IEEE International Conference on Automatic Face and Gesture Recognition, pp. 558-567, Japan, 1998.
[12] Vu A. H., Yamazaki Y., Dong F., and Hirota K. (2011). Emotion Recognition based on Human Gesture and Speech Information using RT Middleware. IEEE.
[13] Wachs Juan, Stern Helman, Edan Yael, Gillam Michael, Feied Craig, Smith Mark, Handler Jon. A Real-Time Hand Gesture Interface for Medical Visualization Applications. Project partially supported by the Paul Ivanier Center for Robotics Research & Production Management, Ben Gurion University.
[14] Yang Hee-Deok, Sclaroff Stan, Lee Seong-Whan (2009). Sign Language Spotting with a Threshold Model Based on Conditional Random Fields. IEEE, Vol. 31, No. 7.