International Journal of Electronics and Computer Science Engineering
Available Online at www.ijecse.org
ISSN 2277-1956
Development of Hand Finger Recognition System with Relative Positioning

Asaduz Zaman 1, Md Mainul Islam Mamun 2, Md Abdul Awal 3, Md Alamgir Hossain 4, Md Zahidul Islam 5
1,4,5 Dept. of ICE, Islamic University, Kushtia, Bangladesh
2 Dept. of APEE, Rajshahi University, Rajshahi, Bangladesh
3 Dept. of CSE, College of Engineering & Technology, IUBAT, Dhaka, Bangladesh
Abstract- In this paper, we propose a logical and procedural way to recognize hand fingers with relative positioning. The relative position of hand fingers makes it possible to identify and recognize hundreds of unique gestures, greatly increasing the gesture count of current state-of-the-art approaches. The proposed method uses color-based segmentation to identify skin areas in image frames and connected-component detection to locate the hand region. The palm center of the detected hand region and its angular relationships are then used to identify the relative positions of the fingers. Experimental results show that the process recognizes relative finger positions with satisfactory accuracy.
Keywords – Gesture Recognition, Relative Positioning, Finger Gesture, Color-based Segmentation
I. INTRODUCTION

In the last decade, researchers have leaned towards providing a framework for the next level of Human Machine Interface (HMI). Gesture recognition using single, inexpensive imaging devices has dominated this field, driven by the quest for an ultimately user-friendly system that can be used conveniently and robustly. In fact, hand gesture is arguably the most natural modality for HMI or HCI: humans communicated with each other using gestures long before they learned how to talk. Hand gestures form a very strong field within gesture recognition, and fingers have proved to be its most important component.

Our proposed framework can be used to identify binary patterns in hand gestures, which can be used to recognize finger binary [1]. Finger binary is efficient and well known to both humans and computers. We can easily map fingers to a binary pattern, e.g. 1 for a finger that is shown and 0 for one that is not; a sketch of this mapping is given at the end of this section. Using finger binary, it is possible to represent 0 to 31 with one hand. With both hands, it is possible to represent 0 to 1023, roughly a 6-fold increase for one hand and a 102-fold increase for both hands over the current approach of simply counting fingers without their relative positions.

The rest of the paper is organized as follows. Related works are discussed in section II. The proposed methodology is explained in section III. Experimental results are presented in section IV. Concluding remarks are given in section V.
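As a concrete illustration of the finger-binary mapping described above, the following minimal sketch converts five finger states to a decimal value. The choice of the thumb as the least significant bit is an assumption for illustration; the paper does not fix a bit order.

```cpp
#include <array>
#include <cstdio>

// Map five finger states (true = shown) to a finger-binary value 0..31.
// Bit order is an assumption here: thumb as the least significant bit.
int fingerBinary(const std::array<bool, 5>& fingers) {
    int value = 0;
    for (int i = 0; i < 5; ++i)
        if (fingers[i]) value |= (1 << i);  // thumb = bit 0 ... pinky = bit 4
    return value;
}

int main() {
    // 00001: only the thumb shown -> 1; 11111: all fingers shown -> 31.
    std::printf("%d\n", fingerBinary({true, false, false, false, false}));
    std::printf("%d\n", fingerBinary({true, true, true, true, true}));
    // Two hands combine to 10 bits: value(left) * 32 + value(right) -> 0..1023.
    return 0;
}
```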
II. RELATED WORKS

Different approaches have been taken by the researcher community to recognize hand gestures. These works can be broadly classified into two main categories: data-glove (and other wearable sensor) based, and purely vision based. Data-glove based methods use sensor devices to digitize hand and finger motions into multi-parametric data. The extra sensors make it easy to capture hand configuration and movement; however, the devices are quite expensive and cumbersome for the user [2], [3]. In contrast, vision-based approaches provide a more natural and user-friendly way. Vision-based approaches can again be classified into two major categories: depth camera based solutions and single monocular camera based solutions. Depth camera based solutions use recent advancements in 3D imaging technology [4], [5], [6], such as Microsoft's Kinect sensor and Nintendo's Wii Remote, but these devices are comparatively expensive.

For hand gesture recognition, numerous approaches exist. C.C. Hsieh et al. used motion history images [7], H.-I. Suk et al. used a dynamic Bayesian network [8], and Z. Ren et al. used the Finger-Earth Mover's Distance (FEMD) [9]. C.C. Wang et al. proposed hand posture recognition using AdaBoost with SIFT [10]. Most of these systems, intentionally or not, merely count the fingers in a hand pose. The main focus of the proposed method is to increase the number of gestures that can be represented and recognized. A fully capable version of this method will be able to recognize 32 numeric counting gestures from one hand, where the traditional approach is limited to 5 to 10. Although finding the relative positions of the fingers is computationally more expensive than counting them, it opens the door to recognizing thousands of hand gestures using just two hands and an imaging device.

III. PROPOSED METHODOLOGY

The proposed method uses a color-based segmentation approach to detect skin areas and connected-component detection to find the hand region. Relative finger-tip positions are acquired in the feature extraction stage. A block diagram of the finger binary recognition process is shown in Figure 1.
Figure 1. Block diagram of the proposed framework
A. Image Acquisition

Each frame of the input video stream, whether real-time video or locally stored video, is processed for finger relative position recognition. Each frame is resized to 320x240 if it is larger.

B. Segmentation

The input frame is segmented for further processing in this stage. Segmentation is crucial because it isolates the task-relevant data from the image background before passing it to subsequent recognition stages. Several features can be used for image segmentation, such as skin color, shape, motion, and an anatomical model of the hand. We have used a skin-color based segmentation approach. For skin-color segmentation, illumination-invariant color spaces are preferred; normalized RGB, HSV, YCrCb, and YUV are some examples. We have used the YCrCb color space for the segmentation process. After skin-color segmentation, the frame is passed through Gaussian blur, erosion, and dilation to remove noise. Figure 2(a) shows the result of our skin-color segmentation process.
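A minimal sketch of this segmentation stage with OpenCV follows. The Cr/Cb bounds are illustrative values commonly used for skin detection; the paper does not state the exact thresholds it used.

```cpp
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>

// Sketch of the segmentation stage described above, assuming a BGR input
// frame. The Cr/Cb bounds are illustrative, not taken from the paper.
cv::Mat segmentSkin(const cv::Mat& frameBgr) {
    cv::Mat ycrcb, mask;
    cv::cvtColor(frameBgr, ycrcb, cv::COLOR_BGR2YCrCb);
    // Keep pixels whose Cr and Cb fall in a typical skin range; Y is left
    // unconstrained for illumination invariance.
    cv::inRange(ycrcb, cv::Scalar(0, 133, 77), cv::Scalar(255, 173, 127), mask);
    // Noise removal: Gaussian blur, then erosion and dilation, as above.
    cv::GaussianBlur(mask, mask, cv::Size(5, 5), 0);
    cv::erode(mask, mask, cv::Mat(), cv::Point(-1, -1), 2);
    cv::dilate(mask, mask, cv::Mat(), cv::Point(-1, -1), 2);
    return mask;  // binary skin mask
}
```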
Figure 2. (a) Skin-color segmented input image; (b) contour detection with the hand region and face area marked. Notice that once face areas are excluded, the biggest contour is the hand region.
C. Hand Region Detection

The binary image frame produced by the segmentation stage is the input of this stage, which tries to detect the hand region in the input image. As stated in our environment setting, we assume that the biggest skin part of the input frame is the hand region. The biggest part of the skin image is found by extracting the contours of the binary skin image and taking the biggest contour as the hand region. To be on the safe side, the input color image is passed through a Haar-classifier based face detector with a scale factor of 1.5 and a minimum size of one fifth of the original image, for faster processing. If any face is found, hand region detection proceeds with the face area excluded. Figure 2(b) shows the result of this process, with the hand region and face area marked with rectangles.
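A sketch of this stage is given below, assuming an already loaded Haar cascade. The bounding-box overlap test used to exclude faces is an illustrative choice; the paper does not specify how the face area is masked out.

```cpp
#include <opencv2/imgproc.hpp>
#include <opencv2/objdetect.hpp>
#include <vector>

// Sketch of hand-region detection: take the largest skin contour whose
// bounding box does not overlap a detected face.
std::vector<cv::Point> findHandRegion(const cv::Mat& skinMask,
                                      const cv::Mat& colorFrame,
                                      cv::CascadeClassifier& faceCascade) {
    std::vector<cv::Rect> faces;
    // Scale factor 1.5 and a minimum size of one fifth of the frame,
    // as stated in the text, to keep detection fast.
    faceCascade.detectMultiScale(colorFrame, faces, 1.5, 3, 0,
                                 cv::Size(colorFrame.cols / 5, colorFrame.rows / 5));

    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(skinMask.clone(), contours, cv::RETR_EXTERNAL,
                     cv::CHAIN_APPROX_SIMPLE);

    std::vector<cv::Point> hand;
    double bestArea = 0;
    for (const auto& c : contours) {
        cv::Rect box = cv::boundingRect(c);
        bool onFace = false;
        for (const auto& f : faces)
            if ((box & f).area() > 0) { onFace = true; break; }  // overlaps a face
        double area = cv::contourArea(c);
        if (!onFace && area > bestArea) { bestArea = area; hand = c; }
    }
    return hand;  // empty if no hand-like contour was found
}
```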
D. Feature Extraction

Feature extraction is the most important stage for finger relative position recognition. First we find the contour of the hand region detected in the previous stage, then compute the convex hull of that contour. The convex hull yields a set of convexity defects, each of which carries four pieces of information: (a) start point, (b) end point, (c) depth point, and (d) depth. Convexity defects of a hand figure are shown in Figure 3. Each start point and end point is a possible finger-tip position. To prevent the system from detecting false finger-tips, the system uses equation (1):

f(x) = \begin{cases} 1, & x.\mathrm{depth} > \alpha \cdot d_m \\ 0, & \text{otherwise} \end{cases}   (1)

where x is a convexity defect, d_m is the maximum depth over all convexity defects, and α is a threshold value. An output of 1 or 0 simply indicates whether the convexity defect is a potential finger-tip bearer or not.
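A sketch of this defect filtering with OpenCV follows; α = 0.3 is an assumed illustrative value, not taken from the paper.

```cpp
#include <opencv2/imgproc.hpp>
#include <algorithm>
#include <vector>

// Sketch of equation (1): keep only convexity defects whose depth exceeds
// a fraction alpha of the maximum defect depth over the hand contour.
std::vector<cv::Vec4i> filterDefects(const std::vector<cv::Point>& contour,
                                     float alpha = 0.3f) {
    std::vector<int> hullIdx;
    cv::convexHull(contour, hullIdx, false, false);  // index form, for defects
    std::vector<cv::Vec4i> defects;
    cv::convexityDefects(contour, hullIdx, defects);

    // Vec4i = (start index, end index, farthest-point index, depth * 256).
    float dMax = 0;
    for (const auto& d : defects) dMax = std::max(dMax, d[3] / 256.0f);

    std::vector<cv::Vec4i> kept;
    for (const auto& d : defects)
        if (d[3] / 256.0f > alpha * dMax) kept.push_back(d);  // eq. (1)
    return kept;
}
```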
Figure 3. Convexity defects: the dark contour line is a convex hull around the hand; the gridded regions (A-H) are convexity defects in the hand contour relative to the convex hull. Image courtesy: Learning OpenCV by Gary Bradski and Adrian Kaehler.
After that, the start and end points of all potential convexity defects are taken as possible finger-tip positions. Actual finger-tip positions are detected using equation (2):

f(x) = \begin{cases} 1, & FT = \text{NULL} \\ 0, & \exists y \in FT : ED(x, y) < \alpha \\ 1, & \text{otherwise} \end{cases}   (2)

where x is a potential finger-tip position, y is a member of the finger-tip array FT (initially set to NULL), ED(x, y) is the Euclidean distance between x and y, and α is a threshold value giving the minimum distance between two consecutive finger-tips. The function returns 1 if FT is null, and x is stored in FT. If ED(x, y) is less than α for any member y, the point is discarded; otherwise x is stored in FT.

This stage extracts all finger-tip positions along with a center point of the hand region, which is the center of the bounding rectangle of the detected hand region. Finally, this stage stores some information for later use. Initially the system asks the user to show binary number 11111, i.e. the pose with all finger-tips shown. When the user shows binary number 11111, the system learns the features that make further communication smooth. In this stage the system stores a structure with the information below (a possible layout is sketched after this list):
a. Angular relations from each finger to all other fingers, e.g. t2ir, t2mr, i2mr, meaning thumb-to-index, thumb-to-middle, and index-to-middle angular relations.
b. c2tA [palm center to thumb angle with reference to the x-axis]
c. c2tD [palm center to thumb distance]
d. hand [bounding rectangle of the hand region]
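The sketch below shows equation (2) as a simple deduplication routine, together with a possible layout for the stored structure. Field names and the α = 20 px threshold are illustrative assumptions.

```cpp
#include <opencv2/core.hpp>
#include <map>
#include <string>
#include <vector>

// Possible layout for the stored structure above; names are illustrative.
struct StoredFeatures {
    std::map<std::string, double> angularRelation;  // e.g. "t2ir", "t2mr", "i2mr"
    double c2tA;    // palm center to thumb angle w.r.t. the x-axis
    double c2tD;    // palm center to thumb distance
    cv::Rect hand;  // bounding rectangle of the hand region
};

// Equation (2) as a deduplication routine: accept a candidate finger-tip x
// only if it is at least alpha pixels from every tip accepted so far.
void addFingerTip(std::vector<cv::Point>& FT, const cv::Point& x,
                  double alpha = 20.0) {
    for (const auto& y : FT)
        if (cv::norm(x - y) < alpha) return;  // ED(x, y) < alpha: discard x
    FT.push_back(x);                          // FT empty or x far enough: store
}
```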
E. Decision Making

The last stage of the system is decision making. This stage provides the recognized finger-binary number as the system output. Recognition of binary 00000 and binary 00001 is handled separately, as these poses provide quite distinguishable features. All other recognition is done by predicting whether each specific finger-tip is shown or not.

1) Recognizing 00000

Recognizing 00000 is quite an easy task: when a user shows 00000, the hand region has its smallest area. If the current hand region's height and width are both less than a threshold level, we detect the case as 00000. The system uses equation (3) for this case.
Figure 4. Recognizing 00000. Here α and β are both taken as 0.8.
f(x) = \begin{cases} 1, & x.\mathrm{height} < \alpha,\ x.\mathrm{width} < \beta \\ 0, & \text{otherwise} \end{cases}   (3)
where x is the current frame's hand region. This case is shown in Figure 4.

2) Recognizing 00001

Recognizing 00001 is almost the same as recognizing 00000. The only difference between 00000 and 00001 is whether the thumb is shown or not: the thumb extends the width of the 00000 hand region above a threshold fraction of the actual hand width. The system uses equation (4) for this case:

f(x) = \begin{cases} 1, & x.\mathrm{height} < \alpha,\ x.\mathrm{width} > \beta \\ 0, & \text{otherwise} \end{cases}   (4)

where x is the current frame's hand region. This case is shown in Figure 5.
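The check below sketches equations (3) and (4) in code. Reading α and β as fractions of the calibrated hand size stored at the 11111 pose is our assumption; the paper gives only the values α = β = 0.8 in Figures 4 and 5.

```cpp
#include <opencv2/core.hpp>

// Sketch of equations (3) and (4): classify closed-fist poses from the
// bounding box of the current hand region relative to the stored one.
enum class Fist { None, B00000, B00001 };

Fist classifyFist(const cv::Rect& cur, const cv::Rect& stored,
                  double alpha = 0.8, double beta = 0.8) {
    bool shortHand = cur.height < alpha * stored.height;
    if (shortHand && cur.width < beta * stored.width)
        return Fist::B00000;  // eq. (3): closed fist, no fingers shown
    if (shortHand && cur.width > beta * stored.width)
        return Fist::B00001;  // eq. (4): fist plus extended thumb
    return Fist::None;        // handled by the general finger-tip path
}
```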
3) Predicting Thumb Position

Using the information stored in the feature extraction stage, the thumb position is predicted in each frame. Predicting the thumb position is very important because the system uses it as the reference position for finding relative finger-tip positions. The system uses equations (5) and (6) for this prediction.
Figure 5. Recognizing 00001. Here α and β are both taken as 0.8.
T.x = c2tD \cdot \cos(c2tA) + center.x   (5)
T.y = c2tD \cdot \sin(c2tA) + center.y   (6)

where T is the predicted thumb point, c2tD and c2tA are the center-to-thumb distance and center-to-thumb angle with reference to the x-axis from the saved features, and center is the current frame's center position. The red dots in Figures 6(a), 6(b), 6(d), 6(e), 6(g), 6(h), 6(i), and 6(j) are predicted thumb positions.
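A minimal sketch of equations (5) and (6) follows; treating c2tA as stored in degrees (consistent with the degree note after equations (7)-(11)) is an assumption.

```cpp
#include <opencv2/core.hpp>
#include <cmath>

// Sketch of equations (5) and (6): project the stored center-to-thumb
// vector from the current palm center to predict the thumb position.
cv::Point predictThumb(const cv::Point& center, double c2tD, double c2tA) {
    double rad = c2tA * CV_PI / 180.0;  // assumed degrees -> radians
    return { static_cast<int>(c2tD * std::cos(rad) + center.x),
             static_cast<int>(c2tD * std::sin(rad) + center.y) };
}
```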
4) Predicting Whether Finger-tips Are Shown

To predict whether a finger-tip is shown, we first measure the angle formed between the predicted thumb position and each finger-tip found in the feature extraction stage, with the current center position as the vertex. Then we label each finger-tip as shown or not. Let the measured angle be denoted angle. The system decides which finger is shown using equations (7) to (11):

thumb = \begin{cases} 1, & angle < 45 \\ 0, & \text{otherwise} \end{cases}   (7)

index = \begin{cases} 1, & 45 < angle < 80 \\ 0, & \text{otherwise} \end{cases}   (8)

middle = \begin{cases} 1, & 95 < angle < 100 \\ 0, & \text{otherwise} \end{cases}   (9)

ring = \begin{cases} 1, & 105 < angle < 115 \\ 0, & \text{otherwise} \end{cases}   (10)

pinky = \begin{cases} 1, & angle > 115 \\ 0, & \text{otherwise} \end{cases}   (11)
Note that all angles are measured in degrees.
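A minimal sketch of this labeling step, following equations (7)-(11) directly:

```cpp
#include <string>

// Sketch of equations (7)-(11): label a finger-tip by the angle (degrees)
// formed at the palm center between the predicted thumb and the tip.
// The gaps 80-95 and 100-105 are deliberately left ambiguous and are
// resolved by equations (12) and (13) described below.
std::string labelFinger(double angle) {
    if (angle < 45)                 return "thumb";   // eq. (7)
    if (angle > 45 && angle < 80)   return "index";   // eq. (8)
    if (angle > 95 && angle < 100)  return "middle";  // eq. (9)
    if (angle > 105 && angle < 115) return "ring";    // eq. (10)
    if (angle > 115)                return "pinky";   // eq. (11)
    return "ambiguous";  // 80-95 or 100-105: more than one finger possible
}
```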
From equation (7) to equation (11), we can see that some angle ranges are omitted. The omitted ranges are 80-95 and 100-105; each is a possible position for more than one finger. The range 80-95 is a possible position for the index and middle fingers, and the range 100-105 for the middle and ring fingers. To determine which finger is actually shown, we use our stored information and update it in every frame using equation (12):
R = \frac{R + a_1/a_2}{2}   (12)
where R on the right-hand side is the previously stored relation of the angle between two fingers and a_1/a_2 is the angular relation measured in the current frame, so the stored relation is averaged with the current one. We then compute the sum of distances of the relations using equation (13):
w = \sum_{i} (sr_i - cr)   (13)

where sr_i is the i-th stored relationship and cr is the current relationship found by equation (12). The finger-tip with the minimum value of w is taken as the predicted finger-tip.
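The sketch below gives one possible reading of equations (12) and (13). The exact form of both is reconstructed from the surrounding text, and using the absolute difference as the per-relationship distance is our assumption.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch of equation (12): running average of a stored angular relation R
// with the relation a1/a2 measured in the current frame.
double updateRelation(double R, double a1, double a2) {
    return (R + a1 / a2) / 2.0;
}

// Sketch of equation (13): pick the stored relationship closest to the
// current one; the corresponding finger-tip is the predicted finger-tip.
std::size_t bestMatch(const std::vector<double>& sr, double cr) {
    std::size_t best = 0;
    double wMin = std::abs(sr[0] - cr);
    for (std::size_t i = 1; i < sr.size(); ++i) {
        double w = std::abs(sr[i] - cr);  // distance to sr_i
        if (w < wMin) { wMin = w; best = i; }
    }
    return best;  // index of the predicted finger-tip
}
```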
IV. EXPERIMENT AND RESULT

Some frames are affected by the dynamic nature of the real-time approach; for this reason, some gestures could not be recognized. Some gestures are also very hard to perform because of the articulation of the human hand. At this point of development, the system is not scale invariant. The most limiting issue is skin-color segmentation: the process would do better if a better skin-color segmentation approach were applied. But as our focus is not skin-color segmentation, we have used the traditional approach with the best possible localization. We used the OpenCV library for our system, which runs on a Windows machine with the gcc compiler; the system can also run on a Linux machine with gcc. Figure 6 shows some outcomes of the proposed system, and Table 1 summarizes the recognition results.

Table 1: Experimental results
Sample    Total Frames    Gesture Frames    Correct Output    Recognition Rate
S#1       682             187               154               82.35%
S#2       998             333               237               74.55%
S#3       783             223               175               78.47%
Average   824             247               188               78.45%
V. CONCLUSION

Although the recognition accuracy in our experiments is nearly 80%, it is noteworthy that the total number of distinguishable gestures is increased greatly by this process. It is also notable that the process uses very simple mathematical calculations to recognize gestures, which is computationally inexpensive. The system's accuracy could be improved by using a more sophisticated skin-color segmentation approach; lighting conditions also currently affect the system's performance. If both hands are used for gesture recognition, it is possible to recognize all 1024 finger-binary gestures.
Figure 6. Finger binary recognition results. (a), (b), (c), (d), (e), (f), (g) represent correct recognition of some gestures; (h) shows a false detection of the pinky finger that is later corrected; (i), (j) show incorrect recognition.
REFERENCES
[1] Finger binary, Wikipedia: http://www.en.wikipedia.org/wiki/FingerBinary
[2] Lamberti, Luigi, and Francesco Camastra. "Real-time hand gesture recognition using a color glove." Image Analysis and Processing - ICIAP 2011 (2011): 365-373.
[3] Mulder. "Hand gestures for HCI." Technical Report 96-1, Simon Fraser University, 1996.
[4] Li, Yi. "Hand gesture recognition using Kinect." Software Engineering and Service Science (ICSESS), 2012 IEEE 3rd International Conference on. IEEE, 2012.
[5] Ren, Zhou, Jingjing Meng, and Junsong Yuan. "Depth camera based hand gesture recognition and its applications in human-computer interaction." Information, Communications and Signal Processing (ICICS), 2011 8th International Conference on. IEEE, 2011.
[6] Ren, Zhou, et al. "Robust hand gesture recognition with Kinect sensor." Proceedings of the 19th ACM International Conference on Multimedia. ACM, 2011.
[7] Hsieh, Chen-Chiung, Dung-Hua Liou, and David Lee. "A real time hand gesture recognition system using motion history image." Signal Processing Systems (ICSPS), 2010 2nd International Conference on, Vol. 2. IEEE, 2010.
[8] Suk, Heung-Il, Bong-Kee Sin, and Seong-Whan Lee. "Hand gesture recognition based on dynamic Bayesian network framework." Pattern Recognition 43.9 (2010): 3059-3072.
[9] Ren, Zhou, Junsong Yuan, and Zhengyou Zhang. "Robust hand gesture recognition based on finger-earth mover's distance with a commodity depth camera." Proceedings of the 19th ACM International Conference on Multimedia. ACM, 2011.
[10] Wang, C.C., and K.C. Wang. "Hand posture recognition using AdaBoost with SIFT for human robot interaction." Springer Berlin, ISSN 0170-8643, Volume 370/2008.