Face Detection Using Consecutive Bootstrapping ICA

Taekyun Kim, Changkyu Choi and Sang-Ryong Kim
Human and Computer Interaction Lab, Samsung Advanced Institute of Technology
{taekyun, flyers, srkim}@sait.samsung.co.kr

Abstract

In this paper, a learning algorithm for frontal face detection based on independent component analysis (ICA) is proposed. The working assumption is that patterns similar to the representative face pattern are faces. The representative face pattern is acquired by averaging various face images with different lighting conditions, shapes, and facial expressions. Input vectors for ICA learning consist of the representative face pattern and other non-face patterns. Non-face patterns are collected with a view to bootstrapping. The bootstrapping is carried out consecutively until there is no more false detection in the non-face images. The multiple steps of bootstrapping and the use of a small number of non-face patterns in each step could make the learning process converge. The consecutive bootstrapping steps yield the independent face filters (IFFs). One or more IFFs are obtained at each bootstrapping step. At each step, the probability density functions (pdfs) for the coefficients of the ICA feature vectors are acquired by projecting face images to each ICA feature vector. The feature vectors whose responses to face images are peculiar qualify as IFFs. Face detection requires lighting correction as a preprocessing step. A non-linear illumination model using a sigmoid function is also proposed. The proposed bootstrapping ICA is novel in that face features are acquired by considering higher order statistics of not only faces but also non-face patterns. Experimental results show that the proposed method outperforms previous face detection methods.

I. Introduction

Among existing methods for face detection from images with complex backgrounds, neural network based approaches and principal component analysis (PCA) based approaches have been studied [1,3,4]. Methods using neural networks need a huge amount of training patterns. Thus, it is very time consuming to collect learning patterns, and in turn the complexity of the learning algorithm becomes much higher.

PCA can reduce the dimension of the input data space by discarding the components having relatively small eigenvalues. This technique is often used for face recognition. The face detection based on PCA [3] used feature vectors (eigen-faces) whose higher order statistics were still dependent. We believe that the higher order moments of image pixels, representing relationships among three or more pixels, can play an important role in distinguishing face patterns from non-face patterns. This is the reason why we use ICA here.

The proposed method of consecutive bootstrapping ICA makes independent face filters (IFFs). The bootstrapping ICA uses a relatively small amount of learning patterns and utilizes higher order moments. Furthermore, the feature vectors are obtained from both face and non-face patterns, so that the resulting feature vectors are more discriminative than the ones from face patterns only.

II. Independent Component Analysis

ICA is an unsupervised learning algorithm based on the principle of optimal information transfer. In Fig. 1, let x be a single input and y be the output through a nonlinear function g(·):

$$u = wx + w_0, \qquad y = g(u) = \frac{1}{1 + e^{-u}} \tag{1}$$

To maximize the information transfer, an optimal weight w should be obtained in such a way that the pdf of the input x fits the slope of the nonlinear function g. The optimal w makes the pdf of the output y uniform, so that the maximum entropy at the output neuron is achieved.
Fig. 1. The principle of optimal information transfer
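The claim that a well-matched weight makes the output of the sigmoid uniform can be checked numerically. The sketch below is only an illustration under stated assumptions, not the authors' procedure: the input x is assumed to follow a logistic distribution with a hypothetical spread `scale`, the bias w_0 is set to zero, and the output entropy is approximated by a simple histogram estimate. Under these assumptions the matched weight is w = 1/scale, and the estimated entropy should be largest there.

```python
import numpy as np

def sigmoid(u):
    """Logistic nonlinearity g(u) = 1 / (1 + exp(-u)) from equation (1)."""
    return 1.0 / (1.0 + np.exp(-u))

def output_entropy(y, bins=20):
    """Histogram estimate (in nats) of the entropy of outputs y in [0, 1]."""
    counts, _ = np.histogram(y, bins=bins, range=(0.0, 1.0))
    p = counts / counts.sum()
    p = p[p > 0]  # empty bins contribute nothing to the sum
    return float(-(p * np.log(p)).sum())

rng = np.random.default_rng(0)
scale = 2.0  # assumed spread of the input; not a value from the paper
x = rng.logistic(loc=0.0, scale=scale, size=50_000)

# Compare a too-small weight, the matched weight 1/scale, and a too-large weight.
entropies = {w: output_entropy(sigmoid(w * x)) for w in (0.1, 1.0 / scale, 3.0)}
for w, h in entropies.items():
    print(f"w = {w:.2f}: estimated output entropy = {h:.3f} nats")
```

With the matched weight the estimate approaches log(20), the entropy of a uniform distribution over 20 bins. Weights that are too small squeeze the output around 0.5, and weights that are too large push it toward 0 and 1, both of which lower the entropy.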
The weight w is updated to maximize the output entropy, as given in (2).

$$\frac{\partial w}{\partial t} = \frac{\partial H(y)}{\partial w} \qquad (H(y):\ \text{entropy of } y) \tag{2}$$

In the case of multiple inputs and outputs, the joint entropy at the output layer is maximized, so that the outputs are statistically independent. The update rule of the weight matrix W is described by

$$\Delta W \propto \left[\, I - K \tanh(u)\,u^T - u u^T \,\right] W \tag{3}$$

The extended infomax learning rule [5] was adopted here.

Before the learning algorithm is applied, the input data is preprocessed by sphering. The mean value of each row vector of the input matrix X is subtracted from that row. Then the input matrix is subjected to

$$W_z = 2 \times \left( X X^T \right)^{-\frac{1}{2}} \tag{4}$$

The final weight matrix W_I of the ICA learning is calculated by

$$W_I = W \, W_z \tag{5}$$

III. Independent Face Filters

1. Overview

Face detection is the task of distinguishing a face space from other non-face spaces. In this paper ICA is used as a method of extracting the face space. As shown in Fig. 2, the input row vectors composed of the representative face pattern and multiple non-face patterns are learned so that the outputs become mutually independent. In composing the IFFs, several consecutive learning steps are involved. At each step the outputs which describe common face features most are selected as IFFs. This bootstrapping scheme selects IFFs automatically and makes them more discriminative.

Fig 2. ICA learning for face detection

2. Correction of Lighting

To correct the lighting change of an input image, a new model for illumination is proposed.

$$\bar{I}(x, y) = \frac{b}{1 + e^{x/a}} + c \cdot \frac{e}{1 + e^{y/d}} + f \tag{6}$$

The sigmoid function is used to cope with a sudden spatial change of lighting that often happens due to the geometrical characteristics of a face. The optimal parameters (a, b, c, d, e, f) of the intensity model $\bar{I}(x, y)$ are obtained by the least squares algorithm.

$$E_r = \sum_{x, y} \left( I(x, y) - \bar{I}(x, y) \right)^2 \tag{7}$$

The estimated intensity is subtracted from each input image. The resultant input images are subjected to histogram equalization.

3. Independent Face Filter

Let the total number of the IFFs and the number of the IFFs at the kth learning step be N and n_k, respectively.

$$\sum_k n_k = N \tag{8}$$

The IFFs are collected from the results of several consecutive learning steps. With an input matrix X the infomax learning rule yields an output matrix U.

$$X = [x_1^T, x_2^T, \ldots, x_M^T]^T, \qquad x_1 = \bar{x} \tag{9}$$

$$U = WX = [u_1^T, u_2^T, \ldots, u_M^T]^T \tag{10}$$

where $\bar{x}$ and M are the representative face pattern and the number of input images, respectively. Each row of the output is an independent feature vector. At each learning step, probability density functions (pdfs) for the feature vectors are obtained by projecting face images to each feature vector. Therefore, the histogram of the coefficients of a feature vector for face images can be used as the pdf for that feature vector. The pdfs are obtained by (11) and (12).

$$R(i, j) = Face_i \cdot u_j^T \tag{11}$$

$$p(Face_i, u_j) = \frac{1}{L} \cdot H_{u_j}\!\left( \frac{B \left( R(i, j) - \min_i R(i, j) \right)}{\max_i R(i, j) - \min_i R(i, j)} \right), \qquad i \in \{1, \ldots, L\},\ j \in \{1, \ldots, M\} \tag{12}$$

where $Face_i$, L, B and $H_{u_j}$ are a row vector converted from a face image, the number of face images used for learning, the number of bins, and the histogram for a feature vector $u_j$, respectively. The feature vectors that respond largely and peculiarly to face images are selected as independent face filters. The decision rule for whether a feature vector qualifies as an IFF is as follows.

$$\frac{\mathrm{mean}(H(u_j))}{\mathrm{var}(H(u_j))} \ge \max_j \frac{\mathrm{mean}(H(u_j))}{\mathrm{var}(H(u_j))} - ref, \qquad j \in \{1, \ldots, M\} \tag{13}$$

Using the IFFs selected so far, all the image patches ($Ip_i$, $i \in \{1, \ldots, w \times h\}$) from non-face images are tested. The probability that a current image patch is a face image is measured by

$$P(Ip_i) = \prod_{j=1}^{N} p(Ip_i, IFF_j) \tag{14}$$

Non-face image patches with probabilities larger than a reference value are classified as false-alarm images. Among the false-alarm images, M − 1 images are chosen. These images and the representative image form the input learning matrix of the next learning step. All the above procedures are repeated until no more false alarms happen in the non-face images.

As seen in Fig. 3, only one feature vector qualified as an IFF in the early steps, due to its strong positive response to face images.

Fig 3. Examples of histograms for ICA bases at multiple steps. Histograms with a bold line were selected for IFFs.

Figure 4 shows the qualified IFFs in the early steps. Two or more feature vectors qualified as IFFs simultaneously as the learning was repeated (step 4 in Fig. 4). This is because the input learning patterns at the current step were chosen among the non-face patterns classified as faces at the previous step.

Fig 4. Examples of ICA bases at multiple steps. Bases with a rectangle were selected as IFFs.

The IFFs obtained in later steps are less able to distinguish face patterns from non-face patterns. Coefficients of these IFFs get smaller in magnitude as the learning is repeated.

IV. Experimental Results

One hundred 20 × 20 face patterns were collected for learning (30 from the Olivetti face DB [7], 30 from the CMU face DB [1], and 40 from the SAIT face DB). They were clipped manually and pre-processed by the proposed lighting correction algorithm described in Section III.2. Faces with different lighting conditions, shapes and facial expressions were included in the face DB, as shown in Fig. 5.

Fig 5. Normalized face samples and representative pattern

Initial non-face patterns were selected arbitrarily among 20 × 20 image patches from non-face images (10 scenery images and 10 face hidden images in CMU DB [1]), as shown in Fig 6. All the image patterns were converted into row vectors. The input matrix then consists of one representative face pattern and 10 non-face patterns. The representative face pattern is an average image of the whole face DB. Input non-face patterns at each learning step were chosen among the non-face patterns that produced false alarms. The number of input non-face patterns was determined heuristically in consideration of the convergence property of ICA and the complexity and entire learning time of the proposed algorithm.

Fig 6. Non-face images for learning

The bootstrapping was continued until there was no more false detection in the selected 20 non-face images. There were six consecutive bootstrapping steps, and the number of IFFs was ten. However, convergence of the proposed algorithm was not proved in general. To detect faces in an image, a 20 × 20 image patch was acquired from all locations of the image. After the lighting correction of the image patch, it was passed through the learned IFFs. Image pyramids were used to detect faces of various sizes.

Generally, face responses occurred at multiple pixels that belonged to the center region of a face. A candidate region was determined as a face region using a sequential labeling algorithm in a proper grid. The size of the detected face was determined as the scale of the best response window.

1. Face Verification

For a test we used 120 manually clipped face images (40 from the Olivetti face DB [7], 40 from the FERET face DB [6] and 40 from the SAIT face DB) and 120 non-face images (from CMU DB [1]) that were not used in the learning. All the images are 20 × 20 in size. We compared the proposed method with a PCA based method, which is the same as the proposed method except that the feature space is produced by PCA.

Table 1. Result of face verification

                  Face images       Non-face images
Method            True    False     True    False
PCA based         107     13        96      24
Proposed IFF      114     6         117     3

2. Face Detection

To verify the proposed method for detecting face regions in an arbitrary image, 51 images were used (31 from CMU front face DB [1], 20 from SAIT face DB). The detection result is summarized in Table 2. Our algorithm shows a slightly better detection rate and a similar false alarm rate compared to the neural network based algorithm [1]. However, it is difficult to compare the two methods exactly because the test images are slightly different.

The number of learning patterns used in the proposed method is smaller than that used in [1]. 60 non-face patterns from 20 non-face images and 100 face patterns were used. The complexity of the detection algorithm is similar to that of CMU [1]. While the networks in [1] were composed of about 3,500 connections, ours used 3,000 connections.

Table 2. Result of face detection

Number of faces                       135
Number of windows                     1612875
Number of misses / Detection rate     11 / 91.85%
False detects / False alarm rate      24 / 1/67203

Fig 7. Examples of Detection Results

V. Conclusion and Further Works

A new learning algorithm for frontal face detection was proposed using the consecutive bootstrapping ICA. The independent face filters were obtained through the bootstrapped learning scheme. They were applied to the tasks of face verification and detection. This method achieved a reliable detection result with a small amount of training patterns. Methods to further reduce the computation time and to detect rotated faces remain as further works. A study on a general pattern detection algorithm, not just for faces, is ongoing.

References

1. H.A. Rowley and T. Kanade, “Neural Network-Based Face Detection”, Transactions on Pattern Analysis and Machine Intelligence, 1998.
2. M.S. Bartlett and T.J. Sejnowski, “Independent component representations for face recognition”, Conference on Human Vision and Electronic Imaging III, 1998.
3. B. Mener and F. Muller, “Face Detection in Color Images Using Principal Components Analysis”, Image Processing and its Applications, IEE 1999.
4. Raphael Feraud, “PCA, Neural Networks and Estimation for Face Detection”, In Face Recognition: From Theory to Applications, Springer-Verlag, 1998.
5. T.W. Lee and T.J. Sejnowski, “Independent Component Analysis using an Extended Infomax Algorithm for Mixed Sub-Gaussian and Super-Gaussian Sources”, Neural Computation, 1999.
6. P.J. Phillips and P. Rauss, The feret evaluation, Face Recognition: From Theory to Application, pp. 244-261, Springer, 1998.
7. F.S. Samaria, Face Recognition Using Hidden Markov Models, Ph.D. thesis, Univ. of Cambridge, 1994.
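
As a supplementary arithmetic check on the counts reported in Table 1 and Table 2, the short sketch below converts them into rates. It assumes that "True" in Table 1 means a correct decision (a face accepted, or a non-face rejected); the variable and function names are illustrative and do not come from the paper.

```python
from fractions import Fraction

# Counts copied from Table 1, read as (faces correct, faces wrong,
# non-faces correct, non-faces wrong). This reading of "True"/"False"
# is an assumption.
TABLE1 = {
    "PCA based": (107, 13, 96, 24),
    "Proposed IFF": (114, 6, 117, 3),
}

def verification_rates(face_true, face_false, nonface_true, nonface_false):
    """Return (face accuracy, non-face accuracy, overall accuracy)."""
    faces = face_true + face_false
    nonfaces = nonface_true + nonface_false
    return (
        face_true / faces,
        nonface_true / nonfaces,
        (face_true + nonface_true) / (faces + nonfaces),
    )

rates = {name: verification_rates(*counts) for name, counts in TABLE1.items()}

# Counts copied from Table 2.
num_faces, num_windows = 135, 1612875
num_misses, num_false_detects = 11, 24

detection_rate = (num_faces - num_misses) / num_faces
false_alarm_rate = Fraction(num_false_detects, num_windows)

for name, (face_acc, nonface_acc, overall) in rates.items():
    print(f"{name}: faces {face_acc:.1%}, non-faces {nonface_acc:.1%}, overall {overall:.1%}")
print(f"detection rate: {detection_rate:.2%}")
print(f"false alarm rate: about 1/{float(1 / false_alarm_rate):.0f}")
```

Under this reading, the proposed IFF method classifies 95.0% of the faces and 97.5% of the non-faces correctly, against 89.2% and 80.0% for the PCA based method. The Table 2 counts reproduce the stated 91.85% detection rate and a false alarm rate of roughly 1/67203.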