A New Biologically Inspired Active Appearance Model for Face Age Estimation by Using Local Ordinal Ranking
Lijun Hong
Di Wen
Chi Fang
Xiaoqing Ding
State Key Laboratory of Intelligent Technology and Systems Tsinghua National Laboratory for Information Science and Technology Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
{honglijun, wendi, fangchi, dxq}@ocrserv.ee.tsinghua.edu.cn
ABSTRACT
In this paper, a new facial feature called the Biologically Inspired Active Appearance Model (BIAAM) is proposed for face age estimation, together with a novel age function learning algorithm called Local Ordinal Ranking (LOR). In BIAAM, appearance variations are encoded by extracting Bio Inspired Features (BIF) from normalized shape-free images with a mean shape mask. The proposed LOR divides the training set into several groups according to age labels and trains an Ordinal Hyperplanes Ranker for each group, which determines the final predicted age. A multiple linear regression function is used to decide which group a query sample belongs to. Experimental evaluation on the FG-NET aging database yields a mean absolute error of 4.18 years, which is lower than that of the other state-of-the-art algorithms we compare against.
Categories and Subject Descriptors I.4.7 [Image Processing and Computer Vision]: Application; I.5.2 [Pattern Recognition]: Design Methodology—pattern analysis
General Terms Algorithms, Experimentation
Keywords Face age estimation, Active Appearance Model, Biologically inspired feature, Machine learning
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ICIMCS’13 August 17-19, Huangshan, Anhui, China Copyright 20XX ACM X-XXXXX-XX-X/XX/XX ...$15.00.
1. INTRODUCTION
Recently, age estimation from face images has attracted much attention because of its many practical applications. For example, in multimedia applications, advertisers can target advertisements at potential customers by age group, which can be estimated from face images on social networks. However, estimating age from face images automatically is still a challenging task for a machine, owing to many factors.
The most popular approach proposed so far is statistical learning based on facial features, i.e. learning an aging pattern from facial features extracted from a large number of face images. The pioneering work on face age estimation can be traced back to Lanitis et al. [9], who compared different classifiers for face age estimation using AAM [4]. Fu et al. [5] proposed a framework of manifold learning and aging feature regression for estimating face age. They compared four manifold learning methods and applied multiple linear regression for aging function learning. Because of the complexity of face aging, Guo et al. presented LARR [8] for learning the age function. More recently, they proposed the powerful Bio Inspired Feature (BIF) [6] and achieved very high accuracy. To make use of the ordinal information in age labels, Chang et al. proposed the Ordinal Hyperplanes Ranker (OHRank) [3] with cost sensitivities for age estimation.
The work presented here focuses on facial feature extraction and aging function learning. Firstly, the appearance model in AAM, as used by [9], only makes use of grey-level information and contains many non-aging variations. On the other hand, BIF [6] mainly focuses on holistic appearance and ignores the shape feature, which is nevertheless important for face age estimation. To overcome these two disadvantages, we propose a new Biologically Inspired AAM (BIAAM) feature for face age estimation. Secondly, to relieve the problem of unbalanced sample distributions when designing decision functions in OHRank [3], we also present a novel age function learning algorithm called Local Ordinal Ranking (LOR), based on the idea of divide and conquer.
The remainder of this paper is organized as follows. We introduce our framework of face age estimation in section 2. The newly proposed BIAAM and LOR are explained in section 3 and section 4 respectively. Section 5 gives evaluations on the FG-NET aging database. Finally, conclusions and some future work are presented in section 6.
2. FRAMEWORK OF AGE ESTIMATION
Our framework of automatic age estimation from face images is displayed in Figure 1 and is similar to that proposed in [5]. In both the training and testing stages, face regions are first detected in the input images, and facial features (BIAAM in this paper) are then extracted from the face regions. The purpose of manifold learning is to reduce the dimensionality of the facial feature while preserving most of the age-related information. In our work, Linear Discriminant Analysis (LDA) [1] is adopted for this procedure. For age function learning we use LOR; when validating the effectiveness of the BIAAM feature on its own, however, LOR is replaced by the simplest linear regression function.
Figure 1: Framework of face age estimation.
Table 1: The process of extracting the BIAAM feature
1. Use the original AAM to get the shape parameters $b_s$ and the normalized shape-free image $g$.
2. Extract features from the image $g$ of step 1 by using Masked-BIF, obtaining a feature vector $f_b$.
3. Scale the vector $f_b$ using Max-Min scaling and concatenate it with the shape parameters $b_s$, giving the final feature.
3. FACE FEATURE EXTRACTION
3.1 Original AAM Feature
The Active Appearance Model (AAM) [4] combines the shape variation and appearance variation of face images in a shape-normalized frame. The key landmark points labeled on each face are aligned into a common coordinate frame and represented by a vector $x$. After applying Principal Component Analysis (PCA) [1], the shape of a sample can be formulated as:

$$x = \bar{x} + P_s b_s \qquad (1)$$

where $\bar{x}$ denotes the mean shape, $P_s$ contains the orthogonal modes of variation and $b_s$ is the vector of shape parameters. Each image is then warped to the mean shape using a triangulation algorithm, and the grey-level texture $g_{im}$ is raster-scanned over the shape-normalized image. To suppress the effect of illumination, the texture vector $g_{im}$ is normalized into $g$ with zero mean and unit variance. By using PCA, the appearance of a sample can be approximated by:

$$g = \bar{g} + P_g b_g \qquad (2)$$

where $\bar{g}$ denotes the mean grey-level vector, $P_g$ contains the orthogonal modes of variation and $b_g$ is the vector of grey-level parameters. The vectors $b_s$ and $b_g$ therefore summarize a sample's shape and appearance respectively. Because shape and grey-level variations are correlated, a further PCA model $b = Qc$ is applied to obtain the final model parameters $c$, which can be used for age estimation as in [9]. Here the vector $b$ is formulated as:

$$b = \begin{pmatrix} W_s b_s \\ b_g \end{pmatrix} = \begin{pmatrix} W_s P_s^T (x - \bar{x}) \\ P_g^T (g - \bar{g}) \end{pmatrix} \qquad (3)$$

where the entries of the diagonal matrix $W_s$ are weights for the shape parameters, which compensate for the difference in units between the shape and grey-level models.
3.2 Biologically Inspired AAM
The texture information of the original AAM is obtained only from pixel intensity, so the appearance parameters contain many non-aging variations, which can degrade the performance of face age estimation. To overcome this disadvantage of the original AAM, we propose BIAAM, which extracts descriptive facial features from the shape- and grey-level-normalized image of AAM by using the Bio Inspired Features (BIF) presented in [6]. Because the aligned shape-free image is not square, we modify the original BIF by using the mean shape mask shown in Figure 2(a), and call the result Masked-BIF. In Masked-BIF, when pooling over S1 units (the responses of a pyramid of Gabor filters applied to the input image) to obtain C1 units, we only consider those S1 units satisfying the criterion:

$$N_T = \sum_{(i,j) \in R_S} \mathrm{mask}(i,j) \ge \alpha \times (N_S \times N_S) \qquad (4)$$
where $R_S$ is the area in the mean shape covered by the pooling grid and $\alpha \in (0, 1]$ is set to 0.5 in our work. That is to say, not all of the C1 unit features appear in the final facial feature vector. Here $N_S \times N_S$ is the neighborhood size over which the C1 units pool. The element $\mathrm{mask}(i,j)$ of the mean shape mask matrix equals 1 only when the corresponding point $(i,j)$ is located inside the mean shape, i.e. the bright region of Figure 2(a), and equals 0 otherwise. With the mean shape mask, the nonlinear operation "STD" of BIF is also modified as:

$$\mathrm{std} = \sqrt{\frac{1}{N_T} \sum_{\mathrm{mask}(i,j)=1} \left(F_{i,j} - \bar{F}\right)^2} \qquad (5)$$

where $\bar{F}$ is the mean value of all available $F_{i,j}$. By concatenating the Masked-BIF features and the shape parameters of AAM, we obtain the final facial features for age estimation. The whole BIAAM feature extraction algorithm is listed in Table 1.
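As an illustration only, the masked pooling criterion of Eq. (4), the masked "STD" of Eq. (5) and the final scaling-and-concatenation step of Table 1 might be sketched as follows. All function names are hypothetical and are not taken from the authors' implementation. The sketch also assumes that "Max-Min" scaling maps each feature vector to [0, 1] on its own; the paper does not say whether scaling is per vector or per dimension over the training set.

```python
import math


def pool_cell_count(mask, top, left, ns):
    """N_T in Eq. (4): number of in-mask positions inside one ns x ns pooling cell."""
    return sum(mask[i][j]
               for i in range(top, top + ns)
               for j in range(left, left + ns))


def cell_is_valid(mask, top, left, ns, alpha=0.5):
    """Eq. (4): keep the cell only if N_T >= alpha * (ns * ns)."""
    return pool_cell_count(mask, top, left, ns) >= alpha * ns * ns


def masked_std(feat, mask, top, left, ns):
    """Eq. (5): standard deviation over the in-mask positions of one cell.

    Intended for cells that already passed cell_is_valid, so N_T >= 1.
    """
    vals = [feat[i][j]
            for i in range(top, top + ns)
            for j in range(left, left + ns)
            if mask[i][j] == 1]
    n_t = len(vals)
    mean = sum(vals) / n_t
    return math.sqrt(sum((v - mean) ** 2 for v in vals) / n_t)


def min_max_scale(vec):
    """Assumed form of "Max-Min" scaling: map one vector linearly onto [0, 1]."""
    lo, hi = min(vec), max(vec)
    if hi == lo:
        return [0.0 for _ in vec]
    return [(v - lo) / (hi - lo) for v in vec]


def biaam_feature(f_b, b_s):
    """Table 1, step 3: scaled Masked-BIF vector followed by AAM shape parameters."""
    return min_max_scale(f_b) + list(b_s)


# Toy 4x4 mask (1 = inside the mean shape) and a toy response map.
mask = [[0, 0, 1, 1],
        [0, 1, 1, 1],
        [0, 1, 1, 1],
        [0, 0, 1, 1]]
feat = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]

print(cell_is_valid(mask, 0, 0, 2))   # only 1 of 4 positions in mask -> False
print(cell_is_valid(mask, 0, 2, 2))   # all 4 positions in mask -> True
print(masked_std(feat, mask, 0, 2, 2))
print(biaam_feature([2.0, 4.0, 6.0], [0.3, -0.1]))
```

With $\alpha = 0.5$ and a 2 x 2 cell, a cell needs at least two in-mask positions to contribute, so cells that lie mostly outside the face outline are dropped from the feature vector.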
4. AGE ESTIMATION METHOD
To verify the effectiveness of the proposed BIAAM feature, we first use only the simplest age function learning method, i.e. the linear regression function, which maps the facial feature vector $X$ of a target face to its age label $l$ using a linear function:

$$l = a^T X + b \qquad (6)$$
where $a$ and $b$ are the regression coefficient vector and the bias respectively. Note that here $X$ is a low-dimensional vector obtained by applying PCA and LDA to the original facial feature.
To improve face age estimation accuracy as much as possible, we present a new age function learning algorithm, Local Ordinal Ranking. Given a training set $S = \{(X_i, y_i)\}_{i=1}^{N}$ of $N$ samples with age labels $y_i \in L = \{l_1, l_2, \ldots, l_c\}$ satisfying $l_1 < l_2 < \ldots < l_c$, we first divide the training set into $c$ groups, each group $G_i$, $i \in \{1, 2, \ldots, c\}$, containing the samples whose age label $l \in [l_i - d, l_i + d]$, where $d$ is an empirically chosen value. Then we train an Ordinal Hyperplanes Ranker (OHRank) [3] for each group $G_i$. Owing to space limits, we refer the reader to [3] for a comprehensive introduction to OHRank, but here we modify the original ranking rule $r$ as:

$$r(X) = l_s, \quad s = 1 + \sum_{k=1}^{K} [[f_k(X) > 0]] \qquad (7)$$

to adapt OHRank to the discontinuous case of the age label set $L$. In the ranking rule, $f_k(X)$ is a binary decision function determining whether the age label of $X$ is greater than $l_k$ or not; linear SVMs [13] are adopted in our work. For a query $Z$, we first use a multiple linear regression function [5]

$$l_q = b_0 + b_1^T Z + b_2^T Z^2 \qquad (8)$$

to find the group $G_q$ that $Z$ belongs to. The corresponding OHRank $r_q(Z)$ is then applied to obtain the final prediction. The two-class sample distribution used to design the binary decision function $f_k(X)$ in OHRank is very unbalanced: for example, samples older than 2 years usually far outnumber those younger than 2 years. Our proposed LOR uses a shorter age range $[l_i - d, l_i + d]$, which can relieve this problem to some degree. The whole LOR algorithm is summarized in Table 2.
Table 2: The proposed LOR algorithm
1. Divide the training set $S$ into $c$ groups $G_i$, $i \in \{1, 2, \ldots, c\}$, according to the corresponding age labels.
2. Train an OHRank $r_i(X)$, $i = 1, 2, \ldots, c$, for each group $G_i$.
3. Use the multiple linear regression function to determine the group $G_q$ that a query sample $Z$ belongs to.
4. Apply the corresponding OHRank $r_q$ to obtain the final prediction.
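The four steps of LOR can be sketched roughly as below. This is a structural illustration under several assumptions, not the authors' implementation. The binary learner is passed in as a hypothetical `train_binary` callable, and the toy threshold classifier used here is only a stand-in for the linear SVMs of the paper. The query group is assumed to be the one whose center label is nearest to the regression estimate $l_q$. $K$ in Eq. (7) is assumed to be the number of distinct labels in the group minus one. The regression of Eq. (8) is replaced by an arbitrary callable `regress`.

```python
def build_groups(samples, labels, d):
    """Step 1: one group per distinct age label l_i, holding samples with |y - l_i| <= d."""
    centers = sorted(set(labels))
    return {c: [(x, y) for x, y in zip(samples, labels) if abs(y - c) <= d]
            for c in centers}


def train_ranker(group, train_binary):
    """Step 2: one decision function f_k per threshold l_k among the group's labels."""
    ages = sorted({y for _, y in group})
    xs = [x for x, _ in group]
    fns = []
    for l_k in ages[:-1]:                      # no threshold above the largest label
        targets = [1 if y > l_k else -1 for _, y in group]
        fns.append(train_binary(xs, targets))
    return ages, fns


def rank_predict(ages, fns, x):
    """Eq. (7): s = 1 + sum_k [[f_k(x) > 0]]; return the s-th label of the group."""
    s = 1 + sum(1 for f in fns if f(x) > 0)
    return ages[s - 1]


def lor_predict(rankers, regress, z):
    """Steps 3 and 4: choose the group nearest to the regression estimate, then rank."""
    l_q = regress(z)
    center = min(rankers, key=lambda c: abs(c - l_q))
    ages, fns = rankers[center]
    return rank_predict(ages, fns, z)


def toy_threshold_classifier(xs, targets):
    """Stand-in for a linear SVM on scalar features (NOT the paper's classifier):
    thresholds at the midpoint between the two class means."""
    pos = [x for x, t in zip(xs, targets) if t > 0]
    neg = [x for x, t in zip(xs, targets) if t < 0]
    mid = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0
    return lambda x: x - mid


# Toy data: a scalar feature that happens to equal the age.
samples = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
labels = [0, 1, 2, 3, 4, 5]

groups = build_groups(samples, labels, d=1)
rankers = {c: train_ranker(g, toy_threshold_classifier) for c, g in groups.items()}
regress = lambda z: z                          # identity stand-in for Eq. (8)

print(lor_predict(rankers, regress, 2.2))
print(lor_predict(rankers, regress, 4.9))
```

Because each ranker only sees labels within $d$ of its center, every binary problem compares a few neighboring ages rather than "everything above $l_k$" against "everything below", which is the class-balance argument made in the text.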
5. EXPERIMENT
5.1 Data and Methodology
In this section, we adopt the widely used FG-NET aging database [12] to evaluate the performance of our proposed face age estimation method. The FG-NET aging database contains 1002 color face images of 82 subjects with age labels ranging from 0 to 69 years. Each face is labeled with 68 landmarks, which are used for AAM. We use the popular Leave-One-Person-Out (LOPO) testing strategy, as in most previous work [8, 6, 3].
We use two measures to evaluate performance, namely the Mean Absolute Error (MAE) and the Cumulative Score (CS) curve [7]. The MAE is defined as $\mathrm{MAE} = \sum_{k=1}^{N} |l_k - \hat{l}_k| / N$, the average absolute error between the true ages $l_k$ and the estimated ages $\hat{l}_k$. The CS is defined as $\mathrm{CS}(j) = N_{e \le j} / N \times 100\%$, where $N_{e \le j}$ is the number of testing samples whose absolute age estimation error is no higher than $j$ years. In both MAE and CS, $N$ is the total number of testing samples.
In the implementation of BIAAM, we use only the 68 labeled key points and normalize all samples to a texture size of 120×120. The mean shape mask and some shape-normalized face images of the FG-NET aging database are displayed in Figure 2. All the parameters of the Gabor filters in Masked-BIF are the same as those used in BIF [6]. The LIBSVM tool [2] is used for SVR and for the linear SVM binary classifiers in OHRank and LOR. For experimental repeatability, the source code of our implementation can be provided on request.
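For concreteness, the two evaluation measures and the LOPO split can be written in a few lines. This is a minimal sketch with hypothetical function names and made-up toy ages, not the evaluation code used for the reported results.

```python
def mean_absolute_error(true_ages, pred_ages):
    """MAE: average of |l_k - l_hat_k| over the N test samples."""
    n = len(true_ages)
    return sum(abs(t - p) for t, p in zip(true_ages, pred_ages)) / n


def cumulative_score(true_ages, pred_ages, j):
    """CS(j): percentage of test samples whose absolute error is at most j years."""
    n = len(true_ages)
    hits = sum(1 for t, p in zip(true_ages, pred_ages) if abs(t - p) <= j)
    return 100.0 * hits / n


def lopo_splits(subject_ids):
    """Leave-One-Person-Out: yield (train_indices, test_indices) per held-out subject."""
    for held_out in sorted(set(subject_ids)):
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield train, test


# Toy example with made-up ages (absolute errors: 2, 1, 7, 2).
true_ages = [5, 20, 33, 60]
pred_ages = [7, 19, 40, 58]

print(mean_absolute_error(true_ages, pred_ages))   # 3.0
print(cumulative_score(true_ages, pred_ages, 2))   # 75.0
print(list(lopo_splits(["a", "b", "a"])))
```

Under LOPO every image of the held-out subject is excluded from training, so the ranker never sees the test person at another age.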
Figure 2: The mean shape mask and some shape-normalized face images of the FG-NET Aging Database. (a) Mean shape mask. (b) Shape-normalized samples.
Table 3: The MAE (in years) of using different facial features and different age function learning algorithms. LRF means Linear Regression Function and MLR means Multiple Linear Regression function.
              AAM     Masked-BIF   BIAAM
LRF           5.54    5.34         5.19
LDA+MLR       5.39    5.20         5.14
LDA+SVR       4.57    5.13         4.72
LDA+OHRank    4.31    4.54         4.28
LDA+LOR       4.25    4.40         4.18
5.2 Results
To verify the effectiveness of the proposed BIAAM feature, we compare it with other facial features, namely the original AAM and Masked-BIF, in each case applying the simplest linear regression function to learn the face aging trend and skipping the manifold learning step of Figure 1. The results are shown in the first row of Table 3, from which we conclude that the age estimation accuracy of the novel BIAAM is better than that of both the original AAM and Masked-BIF, a variant of BIF.
To improve the accuracy of face age estimation, we use the proposed LOR for age function learning, preceded by the manifold learning algorithm Linear Discriminant Analysis (LDA) [1]. For comparison, we also use other age function learning methods, including Multiple Linear Regression (MLR) [5], Support Vector Regression (SVR) [6] and Ordinal Hyperplanes Ranker (OHRank) [3], for which the linear SVM kernel is adopted in our experiments. The MAE results are listed in the second to fifth rows of Table 3, which show that our newly proposed LOR achieves the lowest MAE, 4.18 years, among the learning methods. The difference in age estimation accuracy between our Masked-BIF and the original BIF in [6] when using SVR has two causes: different face region normalization methods (warping to the mean shape for our Masked-BIF versus simple resizing to 60 × 60 for the original BIF) and different face image preprocessing methods.
On the other hand, our new facial feature BIAAM outperforms both the original AAM and Masked-BIF with every age function learning method except SVR. With the nonlinear age function learning methods (SVR, OHRank and LOR), the original AAM feature achieves better performance than Masked-BIF. In our opinion, the main reason may be the important role that the shape variations in AAM play in face age estimation. Nevertheless, our new BIAAM still achieves the best estimation accuracy when using OHRank and LOR.
When comparing OHRank with our LOR across the different facial features, i.e. the last two rows of Table 3, we find that the proposed LOR gives a lower MAE than OHRank in every case, which supports our analysis that a smaller age range can relieve the problem of unbalanced sample distributions for OHRank. A detailed explanation can be found in section 4. In short, our age function learning algorithm is effective, and we expect that it may perform even better if combined with other manifold learning methods or age estimation frameworks.
The CS curves for different facial features and age function learning methods are depicted in Figure 3. Our proposed method with BIAAM and LOR obtains the highest cumulative score at most error levels. Finally, the mean absolute error of our proposed method is compared with that of other recently presented algorithms in Table 4. All of these methods used the same LOPO testing strategy, which means the same training and testing data. As shown, our method outperforms all the listed state-of-the-art algorithms.
[Figure 3 plot: Cumulative Score (%) versus Error Level (years), 0 to 14. Curves: AAM(LRF), Masked-BIF(LRF), BIAAM(LRF), BIAAM(LDA+SVR), BIAAM(LDA+OHRank), BIAAM(LDA+LOR).]
Figure 3: Cumulative Scores of different facial features and age function learning methods. For clarity, only some are selected for display.
Table 4: Comparing MAE (in years) of our proposed method with other existing algorithms.
Algorithm     MAE     Algorithm     MAE
RUN2 [14]     5.33    PLO [10]      4.82
LARR [8]      5.07    BIF [6]       4.77
RPK [15]      4.95    OHRank [3]    4.48
MTWGP [16]    4.83    Proposed      4.18
6. CONCLUSIONS
In this paper, we presented the BIAAM feature for face age estimation using LOR. BIAAM applies the original AAM to capture shape variations and extracts BIF features from the normalized shape-free images with a mean shape mask. The proposed LOR first uses a multiple linear regression function to determine the age group that a query sample belongs to and then applies the OHRank of that group to obtain the final predicted age. Evaluation on the FG-NET aging database with the LOPO testing strategy showed that our method achieved the lowest MAE, 4.18 years, among the state-of-the-art algorithms compared. In the future, we will test our method on other aging databases, such as MORPH [11]. The locality of the face model and the sample distributions in feature space will also be investigated in detail in our next work.
7. ACKNOWLEDGMENTS
This work was supported by the National Natural Science Foundation of China under Grant No. 60972094 and the National Basic Research Program of China (973 program) under Grant No. 2013CB329403.
8. REFERENCES
[1] P. Belhumeur, J. Hespanha, and D. Kriegman. Eigenfaces vs. fisherfaces: Recognition using class specific linear projection. IEEE TPAMI, 19(7):711–720, 1997.
[2] C.-C. Chang and C.-J. Lin. LIBSVM: A library for support vector machines. ACM TIST, 2(3):27:1–27:27, 2011.
[3] K.-Y. Chang, C. S. Chen, and Y. P. Hung. Ordinal hyperplanes ranker with cost sensitivities for age estimation. In CVPR’11, pages 585–592. IEEE, 2011.
[4] T. Cootes, G. Edwards, and C. Taylor. Active appearance models. In ECCV’98, volume 2, pages 484–498. Springer, 1998.
[5] Y. Fu, Y. Xu, and T. Huang. Estimating human ages by manifold analysis of face pictures and regression on aging features. In ICME’07, pages 1383–1386. IEEE, 2007.
[6] G. Guo, G. Mu, Y. Fu, and T. Huang. Human age estimation using bio inspired features. In CVPR’09, pages 112–119. IEEE, 2009.
[7] X. Geng, Z. Zhou, Y. Zhang, G. Li, and H. Dai. Learning from facial aging patterns for automatic age estimation. In ACM Multimedia, pages 307–316, 2006.
[8] G. Guo, Y. Fu, C. Dyer, and T. Huang. Image-based human age estimation by manifold learning and locally adjusted robust regression. IEEE Trans. Image Processing, 17(7):1178–1188, 2008.
[9] A. Lanitis, C. Draganova, and C. Christodoulou. Comparing different classifiers for automatic age estimation. IEEE Trans. Sys. Man Cyber. Part B, 34(1):621–628, Feb. 2004.
[10] C. Li, Q. Liu, J. Liu, and H. Lu. Learning ordinal discriminative features for age estimation. In CVPR’12, pages 2570–2577. IEEE, 2012.
[11] K. Ricanek and T. Tesafaye. MORPH: A longitudinal image database of normal adult age-progression. In AFGR’06, pages 341–345. IEEE, 2006.
[12] The FG-NET Consortium. The FG-NET Aging Database [Online]. Available: http://www.fgnet.rsunit.com, 2012.
[13] V. N. Vapnik. Statistical Learning Theory. Wiley, New York, 1998.
[14] S. Yan, H. Wang, T. Huang, and X. Tang. Ranking with uncertain labels. In ICME’07, pages 96–99. IEEE, 2007.
[15] S. Yan, X. Zhou, M. Liu, M. Hasegawa-Johnson, and T. Huang. Regression from patch-kernel. In CVPR’08, pages 1–8. IEEE, 2008.
[16] Y. Zhang and D. Yeung. Multi-task warped gaussian process for personalized age estimation. In CVPR’10, pages 2622–2629. IEEE, 2010.