View-Based Clustering of Object Appearances Based on Independent Subspace Analysis Stan Z. Li, XiaoGuang Lv, HongJiang Zhang Microsoft Research China, Beijing Sigma Center, Beijing 100080, China Contact:
[email protected], http://research.microsoft.com/szli

Abstract

In 3D object detection and recognition, an object of interest is subject to changes in view as well as in illumination and shape. For image classification purposes, it is desirable to derive a representation in which intrinsic characteristics of the object are captured in a low-dimensional space while effects due to artifacts are reduced. In this paper, we propose a method for view-based unsupervised learning of object appearances. First, view-subspaces are learned from a view-unlabeled data set of multi-view appearances, using independent subspace analysis (ISA). A learned view-subspace provides a representation of appearances at that view, regardless of illumination effects. A measure, called view-subspace activity, is calculated thereby to provide a metric for view-based classification. View-based clustering is then performed using the maximum view-subspace activity (MVSA) criterion. This work is, to the best of our knowledge, the first research devoted to view-based clustering of images.

1 Introduction

The appearance-based approach [24, 44, 7, 38, 34] avoids difficulties in 3D modeling by using images of example appearances of the object. It has become a popular approach in image analysis applications such as image retrieval, and object detection and recognition. The appearance of an object in a 2D image depends on its shape, reflectance properties, pose as seen from the viewing point, and the external illumination conditions. To facilitate tasks such as object detection and recognition, it is desirable to derive a representation which takes into account changes in viewpoint and illumination, capturing intrinsic characteristics of the object in a low-dimensional space while reducing effects due to artifacts such as illumination. Much research has been done on dealing with view and illumination changes [10, 20, 14, 1, 4, 15, 41, 2, 5, 17, 12, 45, 21, 42].

It has been found that distributions of appearances under perceivable variations in viewpoint and illumination are highly nonlinear, nonconvex, complex and perhaps twisted, and can hardly be well described by linear subspaces such as those based on principal component analysis (PCA) [8, 34, 18, 19, 9]. Indeed, a single linear model can hardly provide a solution to the problem. There are two broad types of approaches to learning representations of multi-view appearances: supervised and unsupervised. When a view-labeled data set of appearances is available, the modeling can be done in a supervised way. In the view-based approach [38], the range of view is partitioned into a number of intervals. A view-subspace defines the manifold of possible appearances of the object in each interval, subject to illumination. Such view-subspaces can be constructed in supervised ways by using view-labeled examples. This is also adopted in [16, 40] for multi-view face detection. Supervised learning can also be used to build parametric subspaces: with training data labeled and sorted according to the view (and perhaps also illumination values), one may be able to construct a manifold describing the distribution across views [34, 2]. Gong and colleagues use kernel support vector machines for multi-pose face detection and pose estimation [35, 30]. In a recent work [29], nonlinear view-subspaces are learned in a supervised way by using a correlated array of support vector regression estimators to derive a view-specific yet illumination-insensitive signature representation.

In this paper, we are interested in learning to model multi-view appearances of an object, such as the face, in an unsupervised way. Assume that a training set of view-unlabeled appearances is available, subject to changes in view and illumination, and shape deformation; see the face examples in Fig. 1.

We have two objectives. First, to derive a multi-view subspace model for the object, consisting of view-subspaces each representing object appearances in a distinct range of view. This piecewise, view-based modeling is useful because a single model comprising all possible views can be difficult to obtain. The challenge here is to learn view-specific representation from the view-unlabeled data. Second, we want to perform view-based clustering of the view-unlabeled images in the data set by classifying each image into one of the distinct view groups according to some measure calculated from the learned view-subspaces. Such a clustering avoids the necessity of manually labeling the data for view-based modeling, the size of which can be very large, and provides a method for view-based image classification.
Figure 1. Multi-view face samples.

We present independent subspace analysis (ISA) [22] based methods for the unsupervised learning of view-specific basis components from view-unlabeled examples, and thereby for performing view-based image classification and clustering. ISA, a variant of the independent component analysis (ICA) [13] learning algorithm, is used for the learning task because it is able to take into account the higher-order statistics required to characterize the view of an object. It is shown that applying ICA algorithms to the face training data yields emergent view-specific basis components of faces; applying independent subspace analysis (ISA) moreover results in view-based grouping of the basis components. The span of the basis components in each view-group defines a view-subspace of faces. In contrast, principal component analysis (PCA) is unable to reveal view-related information. A view-subspace learned by ISA corresponds to a visual complex cell tuned to produce maximum output for stimuli of that view. The output or activity of a complex cell, defined as the norm of the projection of an image onto the corresponding subspace, provides a good metric for view-based classification. By classifying each image in the data set into one of the view groups using the maximum view-subspace activity (MVSA) principle, we can partition a set of multi-view images into groups according to the view. This work is, to the best of our knowledge, the first research devoted to unsupervised clustering of images in terms of the view of object appearance. A comparison between PCA, ICA, ISA and topographic ICA (TICA [23], a further extension of ISA), and other related issues such as the effect of initialization and the effect of illumination correction, are presented in a companion paper [28]. The rest of the paper is organized as follows: Section 2
introduces the concepts of ICA and ISA, and presents our methods for unsupervised learning of view-subspaces. Section 3 presents the use of learned view-subspace representation for unsupervised view-clustering. Section 4 presents experimental results.
2 Learning View-Subspaces Using ISA

While PCA has been a popular subspace analysis tool in image and vision, it does not capture view-related information, because it is derived from second-order moments corresponding to low-frequency properties. In contrast, ICA [13] and ISA [22] capture information contained in higher-order relationships, such as those among three or more image pixels, as required by the view-based image analysis in this paper. ICA has been applied to image analysis [36, 6, 32, 27] and face recognition [3, 33, 31]. In ICA-based image analysis, a gray-level image x = {x(u,v) | ∀u,v} is represented as a linear combination of m basis functions b = {b_1(u,v), ..., b_m(u,v) | ∀u,v}:

    x(u,v) = \sum_{i=1}^{m} b_i(u,v) s_i    (1)

where the coefficients s = (s_1, ..., s_m) are different for each image given the b's. We restrict the b_i(u,v) to form an invertible linear system, so that the equation above can be inverted using the dot-product

    s_i = <w_i, x> = \sum_{u,v} w_i(u,v) x(u,v)    (2)

where w = b^{-1} is the inverse filter.

Independent subspace analysis (ISA) combines the technique of multi-dimensional ICA [11] and the principle of invariant feature subspaces [25, 26], such that invariant feature subspaces of a data set can be extracted automatically. According to the invariant feature subspace theory [25, 26], the norms of the projections onto the subspaces represent higher-order, invariant features. ISA defines dependency based on these norms and divides the components into a given number of groups, indexed such that the components within a group are dependent on each other, while those in different groups are independent. This results in distinct invariant feature subspaces.

In ISA, the independence assumption about the s_i made in classic ICA is relaxed. The collection of the s_i is divided into L groups indexed by S_\ell for \ell = 1, ..., L; each group consists of an n-tuple of the s_i, so that m = nL. The s_i within a group are dependent on each other, but those in different groups are independent. The components {b_i | i ∈ S_\ell} in group \ell span a subspace, which will in this work be the subspace for the \ell-th view. The probability density for each group of components s_i is assumed to depend only on the norm of the components. An invariant feature subspace can be embedded in multi-dimensional ICA by assuming that the probability distribution of the n-tuple of s_i in the subspace is spherically symmetric, i.e. dependent only on the norm of the s_i. Although the exact nature of the invariance is not specified in the subspace model, it emerges from the input data as the maximization is performed in ISA. Given an ISA model represented by w = (w_1, ..., w_m), the logarithm of the likelihood of the observations can be formulated as
    \log p(x | w) = \sum_{k=1}^{T} \sum_{\ell=1}^{L} \log p_\ell \Big( \sum_{i \in S_\ell} s_{i,k}^2 \Big) + T \log |\det w|    (3)

where the p_\ell(\sum_{i \in S_\ell} s_{i,k}^2) are the density functions of the norms (the forms of p_\ell(·) are assumed to be known), and s_{i,k} = <w_i, x_k>. This model specifies the prior information on their independence. Learning an ISA model can be achieved simply by maximizing the likelihood function with respect to w, and can be implemented using a gradient ascent algorithm [22].

3 View-Based Feature Extraction and Clustering Using ISA

Subspace \ell corresponds to a visual complex cell adapted to produce the maximum output for view \ell. Consider the norm of the projection of an image onto a subspace:

    F_\ell(x) = \sum_{i \in S_\ell} s_i^2 = \sum_{i \in S_\ell} <w_i, x>^2    (4)

It can be identified as the response, or activity, of the complex cell for view \ell to the stimulus x. This is the basis on which our unsupervised clustering is performed after the learning of independent view-subspaces.

Our view-based clustering consists of three steps: (1) learning view-based subspaces using ISA, (2) extracting view-based features in the subspaces, and (3) clustering based on the features. In step (1), ISA is applied to the training examples to learn a given number L of independent view-subspaces, each spanned by its n basis components. An image is projected onto each subspace according to Eq. (2). This produces L view-based feature vectors in the L subspaces. The subspace activity defined in Eq. (4) is then computed as the higher-order, invariant feature for the clustering, giving F_1, F_2, ..., F_L. Finally, view-based clustering is performed as follows: a sample is classified using the maximum view-subspace activity (MVSA) criterion, i.e. it belongs to the \ell_0-th view if \ell_0 = \arg\max_\ell F_\ell. Fig. 2 illustrates view-based clustering of appearances into 4 ISA view-subspaces based on the MVSA principle. Most images, which are subject to varying illumination conditions, are correctly labeled with the cluster number, with a few errors in the side view.

Figure 2. Clustering with varying illuminations, based on the 4 learned view-subspaces and the MVSA principle. The numbers are the cluster labels assigned to the corresponding images by the ISA+MVSA clustering method.
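As a concrete illustration of the projection of Eq. (2), the subspace activity of Eq. (4), and the MVSA rule, the following NumPy sketch assumes the ISA filters have already been learned and are stacked row-wise in a matrix W; all names and the random toy data are illustrative assumptions, not the authors' implementation (subspaces are zero-indexed here, as is natural in code).

```python
import numpy as np

def subspace_activities(x, W, L, n):
    """View-subspace activities of Eq. (4).

    x : flattened appearance image, shape (d,)
    W : learned ISA filter matrix, shape (L*n, d); row i is w_i of Eq. (2)
    L : number of view-subspaces; n : components per subspace
    """
    s = W @ x                    # projections s_i = <w_i, x>, Eq. (2)
    s = s.reshape(L, n)          # group the components by subspace S_l
    return (s ** 2).sum(axis=1)  # F_l(x) = sum_{i in S_l} s_i^2, Eq. (4)

def mvsa_classify(x, W, L, n):
    """Assign x to the view-subspace of maximum activity (MVSA)."""
    return int(np.argmax(subspace_activities(x, W, L, n)))

# Toy usage with random filters and a random image (illustration only):
rng = np.random.default_rng(0)
L_sub, n_comp, d = 4, 30, 400      # 4 views, 30 components, 20x20 images
W = rng.standard_normal((L_sub * n_comp, d))
x = rng.standard_normal(d)
label = mvsa_classify(x, W, L_sub, n_comp)
assert 0 <= label < L_sub
```

In a real run, W would come from the ISA learning of Section 2, and x would be a preprocessed 20×20 face window.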
4 Experimental Results

The purpose here is to demonstrate the ISA approach for unsupervised learning of view-subspaces, and the use of the learned view-subspaces for view-based clustering.
4.1 Data Preparation

Let x ∈ R^N be a windowed grey-level image, or appearance, of the object of interest, possibly preprocessed. The appearance x is subject not only to the view, but also to the illumination, determined by several parameters. All right-rotated faces (those with view angles between 91° and 180°) are mirrored to left-rotated ones, which causes no loss of generality of the method, so that all views are in the range [0°, 90°], with 0° representing the side view and 90° the frontal view. More than 6,000 such face samples are collected by cropping from various sources (mostly from video). A total of about 100,000 multi-view face images are generated from the 6,000 samples in the following way: each original (including mirrored) sample is left- and right-rotated by 5 degrees, giving 2 additional rotated versions of the sample. Each of these 3 versions is then shifted left, right, up and down by 2 pixels, producing 4 additional shifted versions. In this way, each sample is duplicated into 15 varied versions. Each windowed subimage is normalized to a fixed size of 20×20 pixels, and preprocessed by illumination correction, mean value normalization, and histogram equalization, as is done in most existing systems, e.g. [43, 39, 37]. Illumination correction is done by fitting a plane to the image surface and then subtracting it from the image. This reduces illumination variation to some extent; it is, however, not crucial for the learning of view-specific subspaces [28].

4.2 ISA Learning of View-Subspaces

A set of 10,000 un-labeled face images (x_1, x_2, ..., x_T), with 1,000 images for each of the 10 views, is used for ISA learning of view-subspaces. Each image contains the appearance of a face viewed at a certain unknown pose. The ISA software downloaded from www.cis.hut.fi/projects/ica is used for the ISA learning. Fig. 3 shows the resulting ISA map, with L = 4 and n = 30. From the results, we can see that the 4 subspaces that have emerged correspond roughly to the 90° (frontal view), 70° (half side), 40° (half side), and 10° (side view) views. They cover the range from 0° to 90° and are distinct from each other. On the other hand, illumination correction does not play an important role in deriving view-specific basis components. Initialized at random, the view-based clusters always emerge as the iterations go on, but the initialization affects the ordering of the subspaces (the view ordering in Fig. 3 is arranged manually). ISA imposes no ordering between the subspaces because the subspaces are independent of each other. A view-based ordering can be achieved by using topographic ICA (TICA) [23], a further extension of ICA, or, even better, by a proper inductive initialization of the ISA or TICA learning [28].

Figure 3. 4 view-subspaces learned using ISA with (upper) and without (lower) illumination correction of the training data. The first two rows in each case are the 30 components for the (near) frontal view subspace, and the last two rows those for the side view subspace.

4.3 Clustering Based on ISA Representation

The following examples and results of ISA-based clustering make use of the 4 subspaces learned above. A sample is projected onto the 30 component directions of each subspace, giving four 30-dimensional vectors. The squared Euclidean norm of each vector is calculated according to Eq. (4). Such a norm is a measure of the activity of the sample in that view-subspace. The sample is classified into one of the 4 clusters using the MVSA criterion.

Fig. 4 illustrates view-subspace activities. There are 4 input images (those in the left column). They are projected onto the 4 ISA view-subspaces of Fig. 3, respectively. The images reconstructed from the projections onto the 4 ISA subspaces are shown in the 2nd to the right-most columns of the image panel. The clearest reconstruction is from the subspace of the best-matched view. The calculated subspace activities of the 4 input images in the 4 ISA subspaces are given in the matrix (values multiplied by 10^{-3}; element (i, j) is for image i and subspace j), with the maximum subspace activity over all subspaces underlined. We can see that in these cases, the subspace activity reaches its maximum when the view of the input matches the tuned view of the subspace.

The statistics obtained with large data sets are presented as follows for the evaluation of the view-clustering. The first set is the training set explained above; the second is a test set of 10,000 samples. Each test sample is assigned its nearest view value in {0°, 10°, ..., 90°} (subject to manual labeling errors). The classification results are shown through classification matrices (c-matrices) displayed as Hinton diagrams in Fig. 5. An entry (i, \ell) in a c-matrix represents the number of samples whose view labels (manually assigned, subject to human errors) are \ell (in column) but which are classified into the i-th subspace (in row).
The left-most column corresponds to the frontal view among the 10 ground-truth labels, and the right-most to the side view; the top row corresponds to the frontal view among the 4 subspaces, and the bottom row to the side view. The entries are divided by the maximum value of the corresponding column so that the maximum value becomes 1. The sizes of the blocks in the Hinton diagrams are proportional to the normalized values.

Figure 4. 4 input images (left column); images reconstructed from the projection coordinates (from the 2nd column to the right-most column); and the view-subspace activities (the matrix):

    1.3718  1.0856  0.9392  0.6520
    1.1504  1.3379  0.9777  0.8405
    0.9141  1.1117  1.1793  0.8994
    0.9821  1.0224  1.1075  1.2296

Results obtained using the conventional k-means clustering algorithm are also included for comparison with the MVSA-based clustering. Two types of features are used: (1) the concatenation of the 4 ISA feature vectors, as opposed to (2) the raw data of the 20×20 image. Thus, the "ISA+k-means" method differs from the "ISA+MVSA" method both in the way the ISA features are used and in the way the classification is done. The results shown in the figure are obtained using (top-downwards) (1) raw+k-means, (2) ISA+k-means, (3) ISA+MVSA, and (4) ISA+MVSA without illumination-correction preprocessing of the training and test sets.

We can see that the "ISA+MVSA" method produces much better results than the other two methods: samples with similar views tend to be grouped together. In its c-matrix, there are graceful, gradual slopes off the sides of the ridges along the "diagonal" elements. In contrast, although the "ISA+k-means" method is based on the same ISA projection features, it produces far less favorable results, about the same as the "raw image+k-means" method. In addition, the latter two methods do not have the graceful property of the "ISA+MVSA" method: there are no ridges along the diagonal lines.

Figure 5. C-matrices in Hinton diagrams for the clustering of the training set (left) and test set (right).

Comparing the c-matrices in the first and second rows of the figure, we see that illumination correction has a favorable effect, but is not a crucial pre-processing step for the view-based clustering.
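The c-matrix evaluation described above can be sketched as follows; `classification_matrix` is a hypothetical helper, and the toy predictions and view labels are illustrative, not the paper's data. Column normalization follows the description in the text (each column divided by its maximum, so the largest block in a Hinton-diagram column has size 1).

```python
import numpy as np

def classification_matrix(pred_subspace, view_labels, n_subspaces, n_views):
    """Build the c-matrix: entry (i, l) counts samples with manual view
    label l that were classified into subspace i, then normalize each
    column by its maximum (as done for the Hinton diagrams)."""
    C = np.zeros((n_subspaces, n_views))
    for i, l in zip(pred_subspace, view_labels):
        C[i, l] += 1
    col_max = C.max(axis=0)
    col_max[col_max == 0] = 1.0  # leave empty columns as all zeros
    return C / col_max

# Toy usage: 4 subspaces, 10 view labels (frontal ... side)
preds = [0, 0, 1, 2, 3, 3]   # subspace assigned by MVSA (hypothetical)
views = [0, 1, 3, 5, 8, 9]   # manual view labels (hypothetical)
C = classification_matrix(preds, views, 4, 10)
assert C.shape == (4, 10) and C.max() == 1.0
```

A clustering that respects the view then shows a ridge of large entries along the diagonal of C, which is exactly the qualitative pattern discussed for the "ISA+MVSA" c-matrices.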
5 Conclusion

The contributions of this paper are the following. First, we have presented an approach, based on the ICA and ISA algorithms, for unsupervised learning of view-subspaces from a view-unlabeled set of multi-view appearances. View-specific basis components emerge as a result of the ICA-type learning; in contrast, results learned using PCA provide little information for modeling the views. By using ISA, basis components of a similar view are grouped together, resulting in groupings of distinct views. The view-specific components enable us to construct view-subspaces as a representation for multi-view object appearances. The second contribution is the view-based clustering of images using the learned view-subspaces, which is, to the best of our knowledge, the first work devoted to view-based clustering. Based on the subspace theory, a measure of view-subspace activity is calculated from the learned results to provide a good metric for view-based, illumination-insensitive classification, and view-based clustering is performed based on the maximum view-subspace activity principle.

Up to now, the exact mechanism underlying the view-subspace learning and view-based clustering remains an open question in this paper. We believe that the manifold of faces under changes in view and illumination is shaped such that changes in view cause larger variations in Euclidean or dot-product distances between sample points, and that view-specific components emerge as a consequence of likelihood maximization in ICA/ISA. We are currently investigating this issue.
References

[1] Y. Adini, Y. Moses, and S. Ullman. Face recognition: The problem of compensating for changes in illumination direction. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):721–732, July 1997.
[2] S. Baker, S. Nayar, and H. Murase. "Parametric feature detection". International Journal of Computer Vision, 27(1):27–50, March 1998.
[3] M. S. Bartlett, H. M. Lades, and T. J. Sejnowski. "Independent component representations for face recognition". In Proceedings of the SPIE, Conference on Human Vision and Electronic Imaging III, volume 3299, pages 528–539, 1998.
[4] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman. "Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection". IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):711–720, July 1997.
[5] P. N. Belhumeur and D. J. Kriegman. "What is the set of images of an object under all possible illumination conditions". IJCV, 28(3):245–260, July 1998.
[6] A. J. Bell and T. J. Sejnowski. "The 'independent components' of natural scenes are edge filters". Vision Research, 37:3327–3338, 1997.
[7] D. Beymer, A. Shashua, and T. Poggio. "Example based image analysis and synthesis". A. I. Memo 1431, MIT, 1993.
[8] M. Bichsel and A. P. Pentland. "Human face recognition and the face image set's topology". CVGIP: Image Understanding, 59:254–261, 1994.
[9] H. Borotschnig, L. Paletta, M. Prantl, and A. Pinz. "Active object recognition in parametric eigenspace". In Proc. 9th British Machine Vision Conference, pages 63–72, Southampton, UK, 1998.
[10] R. Brunelli. Estimation of pose and illuminant direction for face processing. A. I. Memo 1499, MIT, 1994.
[11] J.-F. Cardoso. "Multidimensional independent component analysis". In Proceedings of the International Conference on Acoustics, Speech and Signal Processing, Seattle, 1998.
[12] H. F. Chen, P. N. Belhumeur, and D. W. Jacobs. "In search of illumination invariants".
In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages I:254–261, 2000.
[13] P. Comon. "Independent component analysis - a new concept?". Signal Processing, 36:287–314, 1994.
[14] R. Epstein, P. Hallinan, and A. Yuille. "5 ± 2 eigenimages suffice: An empirical investigation of low-dimensional lighting models". In IEEE Workshop on Physics-Based Vision, pages 108–116, 1995.
[15] K. Etemad and R. Chellapa. "Face recognition using discriminant eigenvectors". 1996.
[16] J. Feraud, O. Bernier, and M. Collobert. "A fast and accurate face detector for indexation of face images". In Proc. Fourth IEEE Int. Conf. on Automatic Face and Gesture Recognition, Grenoble, 2000.
[17] A. S. Georghiades, D. J. Kriegman, and P. N. Belhumeur. Illumination cones for recognition under variable lighting: Faces. In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 52–59, 1998.
[18] S. Gong, S. McKenna, and J. Collins. "An investigation into face pose distribution". In Proc. IEEE International Conference on Face and Gesture Recognition, Vermont, 1996.
[19] D. Graham and N. Allinson. "Face recognition from unfamiliar views: Subspace methods and pose dependency". In Proc. 3rd International Conference on Automatic Face and Gesture Recognition, pages 348–353, Nara, Japan, April 1998.
[20] P. W. Hallinan. "A low-dimensional representation of human faces for arbitrary lighting conditions". In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 995–999, 1994.
[21] J. Hornegger, H. Niemann, and R. Risack. "Appearance-based object recognition using optimal feature transforms". Pattern Recognition, 33(2):209–224, February 2000.
[22] A. Hyvärinen and P. Hoyer. "Emergence of phase and shift invariant features by decomposition of natural images into independent feature subspaces". Neural Computation, 12(7):1705–1720, 2000.
[23] A. Hyvärinen and P. Hoyer. "Emergence of topography and complex cell properties from natural images using extensions of ICA". In Advances in Neural Information Processing Systems, volume 12, pages 827–833, 2000.
[24] M. Kirby and L. Sirovich. "Application of the Karhunen-Loeve procedure for the characterization of human faces". IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(1):103–108, January 1990.
[25] T. Kohonen. Emergence of invariant-feature detectors in the adaptive-subspace self-organizing maps. Biological Cybernetics, 75:281–291, 1996.
[26] T. Kohonen. Self-Organizing Maps. Information Sciences. Springer, Heidelberg, second edition, 1997.
[27] T. Lee, M. Lewicki, and T. Sejnowski. ICA mixture models for unsupervised classification of non-gaussian classes and automatic context switching in blind separation. PAMI, 22(10), October 2000.
[28] S. Z. Li, X. G. Lv, and H. J. Zhang. "Independent view-subspace analysis of multi-view face patterns". In IEEE ICCV Workshop on Recognition, Analysis and Tracking of Faces and Gestures in Real-time Systems, Vancouver, Canada, July 13 2001.
[29] S. Z. Li, J. Yan, and H. J. Zhang. "Learning illumination-invariant signature of 3-d object from 2-d multi-view appearances". In Proceedings of IEEE International Conference on Computer Vision, Vancouver, Canada, July 9-12 2001.
[30] Y. M. Li, S. G. Gong, and H. Liddell. "Support vector regression and classification based multi-view face detection and recognition". In IEEE Int. Conf. on Face & Gesture Recognition, pages 300–305, France, 2000.
[31] C. Liu and H. Wechsler. "Comparative assessment of independent component analysis (ICA) for face recognition". In Proc. Second Int'l Conf.
on Audio- and Video-based Biometric Person Authentication, Washington D. C., March 22-24, 1999.
[32] R. Manduchi and J. Portilla. "Independent component analysis of textures". In Proceedings of IEEE International Conference on Computer Vision, Corfu, Greece, 1999.
[33] B. Moghaddam. "Principal manifolds and bayesian subspaces for visual recognition". In Proceedings of IEEE International Conference on Computer Vision, Corfu, Greece, 1999.
[34] H. Murase and S. K. Nayar. "Visual learning and recognition of 3-D objects from appearance". International Journal of Computer Vision, 14:5–24, 1995.
[35] J. Ng and S. Gong. "Performing multi-view face detection and pose estimation using a composite support vector machine across the view sphere". In Proc. IEEE International Workshop on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems, pages 14–21, Corfu, Greece, September 1999.
[36] B. A. Olshausen and D. J. Field. "Natural image statistics and efficient coding". Network, 7:333–339, 1996.
[37] E. Osuna, R. Freund, and F. Girosi. "Training support vector machines: An application to face detection". In CVPR, pages 130–136, 1997.
[38] A. P. Pentland, B. Moghaddam, and T. Starner. "View-based and modular eigenspaces for face recognition". In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 84–91, 1994.
[39] H. A. Rowley, S. Baluja, and T. Kanade. "Neural network-based face detection". IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(1):23–28, 1998.
[40] H. Schneiderman and T. Kanade. "A statistical method for 3D object detection applied to faces and cars". In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2000.
[41] A. Shashua. "On photometric issues in 3D visual recognition from a single 2D image". International Journal of Computer Vision, 21:99–122, 1997.
[42] A. Shashua and T. R. Raviv.
"The quotient image: Class based re-rendering and recognition with varying illuminations". IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(2):129–139, 2001.
[43] K.-K. Sung and T. Poggio. "Example-based learning for view-based human face detection". IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(1):39–51, 1998.
[44] M. A. Turk and A. P. Pentland. "Face recognition using eigenfaces". In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 586–591, Hawaii, June 1991.
[45] A. Yilmaz and M. Gokmen. "Eigenhill vs. eigenface and eigenedge". In Proceedings of International Conference on Pattern Recognition, pages 827–830, Barcelona, Spain, 2000.