Recent Advances in Knowledge Engineering and Systems Science
Face Detection Algorithm Based on Skin Detection and Invariant Moments

Romana CAPOR HROSIK, Faculty of Applied/Business Computing, University of Dubrovnik, Bran. Dubrovnika 29, Dubrovnik, CROATIA
Milan TUBA, Faculty of Computer Science, Megatrend University Belgrade, Bulevar umetnosti 29, New Belgrade, SERBIA
Mirjana VUKOVIC, Department of Mathematics, University of Sarajevo, Zmaja od Bosne 33-35, BOSNIA and HERZEGOVINA
[email protected], [email protected], [email protected]

Abstract: The past decade has witnessed a remarkable growth of interest in face detection, with applications ranging from human-computer interfaces and face recognition to image database management and various other human-computer interaction domains. In this paper we present a fast and easy-to-implement face detection algorithm based on skin detection and Hu moments. The algorithm is invariant to the position of the face, and experimental results demonstrate successful detection.

Key-words: Image processing, Face detection, Skin color detection, Hu moments

This research is supported by the Ministry of Science, Republic of Serbia, Project No. 44006.

1 Introduction

Face detection has become applicable in various fields and is used for many different purposes. Thanks to the development of computers, many applications of face detection are becoming an integral part of our everyday life. For example, human-computer interfaces based on facial expressions and body gestures are being applied as replacements for traditional interfaces such as the mouse and the keyboard, and on mobile phones recognition of the owner's face can replace the PIN. Unlike the human eye, which easily detects another human face, computer vision faces many difficulties and challenges: variations in lighting conditions can make face images significantly different, while beards, mustaches and glasses change or hide some basic features of the face. The main task is to find an effective discrimination function between face and non-face patterns.

The generally accepted classification of face detection methods, proposed by Yang, Kriegman and Ahuja [1], divides them into four categories: knowledge-based, feature-invariant, template matching based and appearance-based.

Knowledge-based methods are rule-based approaches which attempt to model intuitive knowledge of facial features and capture the relationships between them. The main task of these methods is to translate human knowledge into well-formed rules. Simple or too general rules lead to many false positive results, while utilizing more rules or more specific rules, such as relative distances and positions between facial features, leads to many false negative results. The main challenge is to translate human knowledge into an optimal set of rules. A way to avoid situations where too many false (negative or positive) results are returned is the introduction of a hierarchy of rules [2].

Feature-invariant algorithms detect the human face by utilizing invariant features of the face such as skin color, shape, texture, etc. The main advantage of skin-color-based face detection is its processing speed, because there is no need to process facial parts; however, there are some issues, especially with the color representation, the lighting conditions of the environment where the picture is taken, and the differences in skin color from person to person. Another problem is that the color values of two pictures of the same person, taken at exactly the same time but with different cameras, are not the same. To avoid many false positive results, several features can be combined.

Template matching algorithms use templates of faces from a database to determine the presence of a face in an image through the correlation between the input image and the stored templates. Stored templates can have allowable deformations, which rely on the alignment. If the amount of allowable deformation is too high, many false positive results are returned.

Appearance-based algorithms are widely used in face detection. Similar to template matching algorithms, they use templates, but the models of faces and facial patterns are obtained by learning from training data. Detection is then based on comparing patterns of the input image with the obtained model.
Methods based on this principle achieve very good results.

The above-mentioned categories of algorithms are not strictly disjoint: a number of algorithms can be classified into two or more categories according to their characteristics. In this paper we describe our face detection algorithm, which first roughly searches for areas of skin color and then applies Hu moments to refine the detection. We present a system based on invariant features (skin color, Hu moments) that yields a decision function discriminating the face from other exposed skin regions of the body.

The rest of the paper is organized as follows. Section 2 reviews related work. Section 3 describes the color space used for skin detection. Section 4 describes the skin detection algorithm. Section 5 briefly presents Hu moments and their application in our work. Section 6 discusses the proposed algorithm in detail. Section 7 shows our experimental results. Section 8 concludes the paper.
2 Related work

Image processing is a very active research area with many different applications [3], [4]. Detecting faces is a crucial step in a wide variety of applications (human-computer interfaces, face recognition, etc.). Face detection algorithms usually share common steps. In the first phase, some data dimension reduction is done in order to achieve an admissible response time. Also, some pre-processing may be done to adapt the input image to the algorithm's prerequisites. The next phase usually involves extracting facial features or measurements, which are then weighted, evaluated or compared to decide whether there is a face and where it is. Finally, some algorithms have a learning routine and include new data in their models. Hence, face detection is a two-class problem where we have to decide whether or not there is a face in a picture.

Different face detection techniques have been used, achieving various results [5], [6]. An algorithm for skin color modeling in the tint-saturation-luma (TSL) color space is described by Terrillon et al. [7], who showed that the TSL color space gives the best results. Another method, called YCgCr, was used by de Dios and Garcia [8]; it uses the Cg color component instead of the Cb component of YCbCr, which leads to better results than YCbCr. A different approach to analyzing skin color was taken by Chai and Ngan [9]. They developed an algorithm that uses the spatial characteristics of human skin color. Their experimental work has shown that the ranges of Cb and Cr that represent skin well are:

$$77 \leq Cb \leq 127 \qquad (1)$$

$$133 \leq Cr \leq 173 \qquad (2)$$

The proposed face-segmentation methodology was tested on many input images. The experimental results have shown that their algorithm can accurately segment out the facial regions from a diverse range of images, including subjects with different skin colors and various background complexities.

There are several challenges associated with the detection of faces captured in uncontrolled environments, such as surveillance video systems. These challenges can be attributed to factors such as pose variation, feature occlusion, facial expression, imaging conditions, etc. Although various face detection algorithms have been proposed, the results are not yet satisfactory enough for everyday use. Therefore, face detection is still in its infancy and is evolving all the time.

3 Color space for skin detection

RGB is used worldwide as a baseline. Because of its sensitivity to intensity variations, a number of linear and nonlinear color spaces have been proposed. However, it is generally agreed that there is no single color space that is convenient for all color images [10]. Our skin detection algorithm uses the YCbCr color space, where Y represents the luminance channel and Cb and Cr are the blue-difference and red-difference chrominance components, respectively (Fig. 1). It belongs to the family of television transmission color spaces, whose main representatives are YUV and YIQ. YCbCr was defined in response to the increasing demand for digital algorithms for handling video information. The formulae for the direct conversion from RGB to YCbCr are the following:

$$Y = 0.299R + 0.587G + 0.114B \qquad (3)$$

$$Cb = 128 - 0.168736R - 0.331264G + 0.5B \qquad (4)$$

$$Cr = 128 + 0.5R - 0.418688G - 0.081312B \qquad (5)$$
When transferring pixel values from RGB to YCbCr using these formulae, we obtained some pixel values outside of their valid range. It was therefore necessary to bring these values back into the required interval: values below zero were set to zero and values above 255 were set to 255.

Fig. 1. Color image and its Y, Cb and Cr components

4 Skin detection in YCbCr

People differ in the appearance of their skin color. Several studies have shown that the major difference lies in intensity rather than in the color itself. In contrast to RGB, the YCbCr color space is luma-independent, which makes it one of the most popular color spaces for skin detection. According to Hsu et al. (2002), the skin color cluster is more compact in YCbCr than in other color spaces. There are many skin detection algorithms in the YCbCr color space. Kukharev and Novosielski [11] built a skin color model (people of different races were investigated; only people with black skin color were not considered). They defined a threshold for each component Y, Cb and Cr by:

$$Y > 80 \qquad (6)$$

$$85 \leq Cb \leq 135 \qquad (7)$$

$$135 \leq Cr \leq 180 \qquad (8)$$

If the components of a pixel satisfy these thresholds, the pixel is classified as skin color; otherwise it is classified as non-skin color. The first thing we do on an input image (Fig. 2) is color segmentation. It is done using a threshold for each of the components, given by the following inequalities:

$$69 \leq Y \leq 215 \qquad (9)$$

$$94 \leq Cb \leq 126 \qquad (10)$$

$$139 \leq Cr \leq 17 \qquad (11)$$

Fig. 2. Original image

Fig. 3. Binary image showing skin areas

We had to adjust the intervals for Y, Cb and Cr to get better results, because our picture contains a lot of light and the color of the swimwear is close to skin color. As the output we get a binary image (Fig. 3) where every pixel is marked either as skin or as non-skin.
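As a minimal sketch of this segmentation step (not code from the paper), the conversion (3)-(5), the clipping to [0, 255] and the per-channel thresholding can be expressed as below, assuming NumPy; the function names `rgb_to_ycbcr` and `skin_mask` are illustrative, and the default chrominance ranges are the Chai and Ngan bounds (1)-(2), with the adjusted ranges (9)-(11) passable as parameters instead.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an RGB image (H x W x 3, values 0-255) to YCbCr using Eqs. (3)-(5)."""
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Clip values that fall outside the valid range, as described above.
    return np.clip(np.stack([y, cb, cr], axis=-1), 0, 255)

def skin_mask(rgb, y_range=(0, 255), cb_range=(77, 127), cr_range=(133, 173)):
    """Return a boolean mask marking pixels whose Y, Cb and Cr fall inside the given ranges."""
    ycbcr = rgb_to_ycbcr(rgb)
    y, cb, cr = ycbcr[..., 0], ycbcr[..., 1], ycbcr[..., 2]
    return ((y  >= y_range[0])  & (y  <= y_range[1]) &
            (cb >= cb_range[0]) & (cb <= cb_range[1]) &
            (cr >= cr_range[0]) & (cr <= cr_range[1]))
```

The tighter intervals of (9)-(11) can simply be passed in place of the defaults when the lighting conditions require it.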
5 Moment invariants. Hu moments
Moment invariants originated in the 19th century within the framework of the theory of algebraic invariants developed by the famous German mathematician David Hilbert. They were first introduced to the pattern recognition community in 1962 by Hu [12], who employed results from the theory of algebraic invariants and derived his seven famous invariants to the rotation of two-dimensional objects. In pattern recognition, moments and functions of moments have been extensively used as invariant global features of images. An essential requirement of pattern analysis is the recognition of objects and characters regardless of their size, position and orientation.
As shape is a fundamental property of an object, the key component of its description is an effective shape descriptor. There are two types of shape descriptors: contour-based and region-based shape descriptors (Kim & Sung 2000). Regular moment invariants are among the most popular and widely used contour-based shape descriptors, a set derived by Hu (1962) [13]. Later, these geometrical moment invariants were extended to larger sets by Wong & Siu (1999) and to other forms (Dudani et al. 1977; Liao & Pawlak 1998) [14].

Two-dimensional moments of order $(p+q)$ of a digital image $f(x,y)$ of size $M \times N$ are defined as:

$$m_{p,q} = \sum_{x=0}^{M-1}\sum_{y=0}^{N-1} x^p y^q f(x,y) \qquad (12)$$

where $p, q = 0, 1, 2, \ldots$ The corresponding central moment of order $(p+q)$ is defined as:

$$\mu_{p,q} = \sum_{x=0}^{M-1}\sum_{y=0}^{N-1} f(x,y)\,(x - x_{avg})^p (y - y_{avg})^q \qquad (13)$$

where $x_{avg} = \frac{m_{10}}{m_{00}}$ and $y_{avg} = \frac{m_{01}}{m_{00}}$.

The normalized central moments are defined as:

$$\eta_{p,q} = \frac{\mu_{p,q}}{m_{00}^{\frac{p+q}{2}+1}} \qquad (14)$$

Invariants to the similarity transformation group were the first invariants that appeared in the literature as a solution to the problem of selecting essential properties for object classification regardless of position, primarily because of their simplicity in application. Invariance to translation and scaling is trivial: central and normalized moments themselves provide it. So the only non-trivial problem that remains is finding rotational invariants. The problem was solved by M. K. Hu, who defined the following seven rotational invariants, computed from central moments up to order three:

$$h_1 = \eta_{20} + \eta_{02} \qquad (15)$$

$$h_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2 \qquad (16)$$

$$h_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2 \qquad (17)$$

$$h_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2 \qquad (18)$$

$$h_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] \qquad (19)$$

$$h_6 = (\eta_{20} - \eta_{02})\left[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03}) \qquad (20)$$

$$h_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] - (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] \qquad (21)$$

Despite numerous disadvantages, Hu invariants are not demanding to implement and are therefore often used as a basic set of characteristic features in various pattern recognition problems: medical applications, recognition of letters and mail, objects for robot orientation, identification of persons, analysis of satellite images, and so on.
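As a minimal illustration (not taken from the paper), the invariants (15)-(21) can be computed directly from equations (12)-(14); the sketch below assumes a 2-D NumPy array such as a binary skin patch, and the function name `hu_moments` is our own.

```python
import numpy as np

def hu_moments(img):
    """Compute the seven Hu invariants (15)-(21) of a 2-D image per Eqs. (12)-(14)."""
    img = img.astype(np.float64)
    M, N = img.shape
    x = np.arange(M).reshape(-1, 1)   # row coordinate, plays the role of x in Eq. (12)
    y = np.arange(N).reshape(1, -1)   # column coordinate, plays the role of y in Eq. (12)

    def m(p, q):                      # raw moments, Eq. (12)
        return np.sum((x ** p) * (y ** q) * img)

    x_avg, y_avg = m(1, 0) / m(0, 0), m(0, 1) / m(0, 0)

    def mu(p, q):                     # central moments, Eq. (13)
        return np.sum(((x - x_avg) ** p) * ((y - y_avg) ** q) * img)

    def eta(p, q):                    # normalized central moments, Eq. (14)
        return mu(p, q) / (m(0, 0) ** ((p + q) / 2.0 + 1.0))

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)

    h1 = n20 + n02
    h2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    h3 = (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2
    h4 = (n30 + n12) ** 2 + (n21 + n03) ** 2
    h5 = ((n30 - 3 * n12) * (n30 + n12) * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
          + (3 * n21 - n03) * (n21 + n03) * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2))
    h6 = ((n20 - n02) * ((n30 + n12) ** 2 - (n21 + n03) ** 2)
          + 4 * n11 * (n30 + n12) * (n21 + n03))
    h7 = ((3 * n21 - n03) * (n30 + n12) * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
          - (n30 - 3 * n12) * (n21 + n03) * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2))
    return np.array([h1, h2, h3, h4, h5, h6, h7])
```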
Fig. 4. One of the binary images to which we applied Hu moments

6 Our face detection algorithm

In this paper we describe an algorithm for detecting human faces based on color information and on shape analysis using invariant moments (Hu moments). The algorithm is divided into two steps: skin/non-skin color classification and Hu moments. Skin detection can be a substantial part of a face detection algorithm and, in some specific cases, the only one. First, we apply the skin detection algorithm to an image. As a result we get an image in which each pixel is marked as skin or as non-skin. After converting the original image to a binary image, some noise can be present in both the skin and the non-skin pixels.
Therefore, opening and closing operations are applied before the next step is performed. Opening involves the morphological operation of erosion followed by dilation; it eliminates noise formed by spurious skin pixels. To eliminate noise in the non-skin pixels, a closing operation is executed, which involves dilation followed by erosion. On the resulting image we use the method of invariant moments – Hu moments – to characterize the shape of each cluster.
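A minimal sketch of this noise-removal step, assuming SciPy, is given below; the structuring element size and the helper names `clean_mask` and `skin_clusters` are our assumptions, since the paper does not specify the kernel used.

```python
import numpy as np
from scipy import ndimage

def clean_mask(mask, size=3):
    """Opening removes isolated skin-pixel noise; closing fills small non-skin holes.

    The size of the structuring element is an assumption, not given in the paper.
    """
    structure = np.ones((size, size), dtype=bool)
    opened = ndimage.binary_opening(mask, structure=structure)   # erosion, then dilation
    return ndimage.binary_closing(opened, structure=structure)   # dilation, then erosion

def skin_clusters(mask):
    """Label connected skin regions; each region can then be characterized by its Hu moments."""
    labels, count = ndimage.label(mask)
    return [(labels == i) for i in range(1, count + 1)]
```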
Fig. 5. Image after opening and closing operations

7 Experiments

In this section we present the results achieved with our software, which implements the proposed face detection method. Apart from the advantage of simple implementation, the method proved to be rather robust. Tables 1 and 2 show the average, variance, minimum and maximum of the Hu moments computed on skin patches of heads and of non-heads, respectively.

        Average    Variance   Min        Max
h1_h    0.425693   0.115772   0.183706   0.603343
h2_h    2.03069    0.338621   1.52916    2.51303
h3_h    2.091      0.861106   1.1875     3.58511
h4_h    2.10779    0.648816   1.32427    3.41938
h5_h    4.40416    1.61599    2.63488    7.62239
h6_h    3.23751    0.685856   2.43664    4.70205
h7_h    4.81616    1.0745     2.94257    6.93042
Table 1. Average, variance, minimum and maximum of Hu moments on skin patches of heads

        Average    Variance   Min        Max
h1_nh   0.453945   0.155478   0.129576   0.663056
h2_nh   1.55761    0.52609    0.83436    2.30533
h3_nh   2.73239    0.786849   1.69192    4.35934
h4_nh   3.11046    0.61762    1.75911    3.95771
h5_nh   6.18951    1.2828     3.66578    8.14026
h6_nh   4.17115    0.971526   2.89       5.6736
h7_nh   6.35061    1.32605    3.88451    8.60621
Table 2. Average, variance, minimum and maximum of Hu moments on skin patches of non-heads

Fig. 6. Image of head (Image2)

We applied our algorithm to skin patches representing heads and non-heads and obtained the values of the seven Hu moments. For an arbitrary image (for example Fig. 6) we calculate the following functions:

$$D_h = \sum_{i=1}^{7} \left( h_i - h_{i\_h} \right)^2$$

$$D_{nh} = \sum_{i=1}^{7} \left( h_i - h_{i\_nh} \right)^2$$

where $h_i$ are the Hu moments of the current skin patch and $h_{i\_h}$ and $h_{i\_nh}$ are the corresponding reference values for the head and non-head classes. The distance from the head class and the distance from the non-head class are computed, and the current image is classified as head or non-head according to which value is smaller. The skin patch in Fig. 6 is recognized as a head.
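A minimal sketch of this decision rule is shown below; it assumes the reference vectors are the average values from Tables 1 and 2 (the paper does not state which statistic is used as the reference), and it takes the seven Hu moments of a patch, e.g. from the `hu_moments` sketch in Section 5.

```python
import numpy as np

# Reference Hu vectors for the head and non-head classes; assumed here to be the
# averages from Tables 1 and 2 (an assumption, not stated explicitly in the paper).
H_HEAD  = np.array([0.425693, 2.03069, 2.091,   2.10779, 4.40416, 3.23751, 4.81616])
H_NHEAD = np.array([0.453945, 1.55761, 2.73239, 3.11046, 6.18951, 4.17115, 6.35061])

def classify_patch(h):
    """Classify a patch from its seven Hu moments by comparing D_h and D_nh."""
    d_h  = np.sum((h - H_HEAD) ** 2)    # D_h: squared distance to the head reference
    d_nh = np.sum((h - H_NHEAD) ** 2)   # D_nh: squared distance to the non-head reference
    return ("head" if d_h < d_nh else "non-head"), d_h, d_nh
```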
           D_h           D_nh
Image1     290.9452797   337.2191839
Image2     555.4505523   688.8752191
Image3     255.2718345   309.5878647
Image4     367.8099451   426.4449202
Image5     428.595587    547.3629957
Image6     352.9514576   427.7867198
Image7     289.0350701   357.6851708
Image8     196.0656967   244.1779592
Image9     228.4183306   315.6584931
Image10    266.1296106   360.241427
Image11    107.3221944   160.8517576
Image12    207.829559    291.6314199
Image13    337.7085075   444.1328016
Image14    212.1487689   275.9583857
Table 3. Calculation of the functions D_h and D_nh
We apply the same procedure to all other images. Our algorithm has a solid success rate on the test images: the recognition rate is around 78%. Based on the calculated results and by comparing the values of h1 on the tested images, we can conclude that the values of the first Hu moment are very similar across patches, so for our purpose the first Hu moment is not relevant and we can disregard it. In seeking better results we obtain the following table. From observing and comparing the values in the tables it is clear that the 4th and 5th Hu moments play the key role in recognizing skin patches as heads. Based on the values calculated from the training set, we compared the Hu moments of the test images and, using the simple total distance over all seven moments, obtained an acceptable recognition success percentage.

           D_h          D_nh
Image1     289.739034   335.950083
Image2     554.316604   687.680303
Image3     254.166235   308.422054
Image4     366.694570   425.269073
Image5     427.826860   546.543929
Image6     351.910980   426.687808
Image7     287.833990   356.421368
Image8     195.076830   243.132105
Image9     228.090028   315.297017
Image10    265.796011   359.874393
Image11    106.969115   160.464305
Image12    207.479493   291.247124
Image13    337.268486   443.654501
Image14    211.896380   275.676812
Table 4. Calculation of the functions D_h and D_nh without h1
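Dropping h1, as in Table 4, simply restricts the sum of squared differences to moments 2-7. A minimal sketch of this evaluation loop is given below, assuming hypothetical `hu_vectors` (one seven-element vector per test patch) and ground-truth `labels` lists together with the reference vectors from the earlier sketch; none of these names come from the paper.

```python
import numpy as np

def total_distance(h, ref, skip_first=True):
    """Sum of squared differences to a reference Hu vector, optionally ignoring h1."""
    start = 1 if skip_first else 0
    return np.sum((h[start:] - ref[start:]) ** 2)

def recognition_rate(hu_vectors, labels, h_head_ref, h_nonhead_ref, skip_first=True):
    """Fraction of patches whose head/non-head decision matches its ground-truth label."""
    correct = 0
    for h, label in zip(hu_vectors, labels):
        d_h = total_distance(h, h_head_ref, skip_first)
        d_nh = total_distance(h, h_nonhead_ref, skip_first)
        predicted = "head" if d_h < d_nh else "non-head"
        correct += (predicted == label)
    return correct / len(labels)
```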
8 Conclusion
In this paper we have presented a new method for detecting human faces in color images. We used invariant moments after an initial detection of skin regions based on skin color in the YCbCr color space. Experimental results show that the proposed method can detect human faces in color images regardless of size, orientation and viewpoint, so the accuracy of the algorithm is quite good. Further development will include a Support Vector Machine (SVM) for a more versatile approach and an improved detection rate.

References

[1] M.-H. Yang, D. Kriegman, N. Ahuja, Detecting Faces in Images: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, 2002, pp. 34-58.
[2] G. Yang, T. S. Huang, Human Face Detection in Complex Background. Pattern Recognition, vol. 27, no. 1, January 1994, pp. 53-63.
[3] I. Brajevic, M. Tuba, Multilevel Image Thresholding Selection Using the Modified Seeker Optimization Algorithm. Proceedings of the 1st International Conference on Computing, Information Systems and Communications (CISCO '12), Singapore, May 2012, pp. 258-263.
[4] R. Jovanovic, M. Tuba, D. Simian, A New Visualization Algorithm for the Mandelbrot Set. Recent Advances in Mathematics and Computers in Biology and Chemistry, Prague, March 2009, pp. 162-166.
[5] P. Kakumanu, S. Makrogiannis, N. Bourbakis, A Survey of Skin-Color Modeling and Detection Methods. Pattern Recognition, vol. 40, no. 3, March 2007, pp. 1106-1122.
[6] V. Vezhnevets, V. Sazonov, A. Andreeva, A Survey on Pixel-Based Skin Color Detection Techniques. GraphiCon, Moscow, Russia, September 2003, pp. 85-92.
[7] J. C. Terrillon, M. N. Shirazi, H. Fukamachi, S. Akamatsu, Comparative Performance of Different Skin Chrominance Models and Chrominance Spaces for the Automatic Detection of Human Faces in Color Images. Proceedings of the Fourth IEEE International Conference on Automatic Face and Gesture Recognition, March 2000, pp. 54-61.
[8] J. J. de Dios, N. Garcia, Face Detection Based on a New Color Space YCgCr. ICIP03, 2003.
[9] D. Chai, K. N. Ngan, Face Segmentation Using Skin-Color Map in Videophone Applications. IEEE Transactions on Circuits and Systems for Video Technology, vol. 9, no. 4, June 1999, pp. 551-564.
[10] R. C. Gonzalez, R. E. Woods, Digital Image Processing (2nd Edition). Prentice Hall, January 2002.
[11] G. Kukharev, A. Novosielski, Visitor Identification - Elaborating Real Time Face Recognition System. Proceedings of the 12th Winter School on Computer Graphics (WSCG), Plzen, Czech Republic, February 2004, pp. 157-164.
[12] M. K. Hu, Visual Pattern Recognition by Moment Invariants. IRE Transactions on Information Theory, vol. 8, 1962, pp. 179-187.
[13] J. Wood, Invariant Pattern Recognition: A Review. Pattern Recognition, vol. 29, no. 1, January 1996, pp. 1-17.
[14] M. Mercimek, K. Gulez, T. V. Mumcu, Real Object Recognition Using Moment Invariants. Sadhana, vol. 30, part 6, December 2005, pp. 765-775.