
IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 16, NO. 2, MARCH 2005

Face Membership Authentication Using SVM Classification Tree Generated by Membership-Based LLE Data Partition

Shaoning Pang, Member, IEEE, Daijin Kim, Senior Member, IEEE, and Sung Yang Bang, Senior Member, IEEE

Abstract—This paper presents a new membership authentication method by face classification using a support vector machine (SVM) classification tree, in which the size of the membership group and the members in the membership group can be changed dynamically. Unlike our previous SVM ensemble-based method, which performed only one face classification in the whole feature space, the proposed method employs a divide and conquer strategy: it first performs a recursive data partition by membership-based locally linear embedding (LLE) data clustering and then performs SVM classification in each partitioned feature subset. Our experimental results show that the proposed SVM tree not only keeps the good properties of the SVM ensemble method, such as good authentication accuracy and robustness to the change of members, but also achieves a considerable improvement in stability under changes of membership group size.

Index Terms—Divide and conquer, locally linear embedding (LLE), membership authentication, membership-based LLE data partition, support vector machine (SVM), SVM classification tree.

I. INTRODUCTION

Membership authentication is a typical problem in digital security schemes [1], [11], [18], [20]. The problem can be generally depicted as follows. Consider a certain human group $G$ with $N$ members, which is the universal set. If there exists an arbitrary subgroup $M$ such that $M \subset G$ and $0 < |M| < N$, then it is a membership group, and the remaining people $\bar{M} = G - M$ make a nonmembership group. Thus, the objective of membership authentication is to distinguish the membership class $M$ from the nonmembership class $\bar{M}$ in the human group. Since anonymity is an essential feature of digital security schemes [38], membership authentication must allow the size of the membership group and the members in the membership group to be changed dynamically. Therefore, unlike all previous work on face recognition for security [8], [11], [14], [15], [19]–[22], [32], [38], [40], [41], which needs to identify the identity of a given face image, dynamic membership authentication requires authenticating an individual's membership without revealing the individual's identity and without restricting the group size and/or the members of the group.

Manuscript received January 13, 2003; revised August 18, 2003. This work was supported by the Ministry of Education of Korea under the BK21 Program. S. Pang is with the Knowledge Engineering and Discovery Research Institute, Auckland University of Technology, Auckland 1020, New Zealand (e-mail: [email protected]). D. Kim and S. Y. Bang are with the Department of Computer Science and Engineering, Pohang University of Science and Technology, Nam-Gu, Pohang, 790-784, Korea (e-mail: [email protected]; [email protected]). Digital Object Identifier 10.1109/TNN.2004.841776

For example, for security or access control, a permission such as the right to enter an important building is often assigned to many individuals. To get the permission, it is required that members of the group be distinguished from nonmembers, while the members need not be distinguished from one another due to privacy concerns [18].

For dynamic face membership authentication, F. Xie et al. [20] introduced a verification system in which they authenticated face memberships by combining a face recognition method using template matching and a face verification using a single support vector machine (SVM) classifier. Since the system is not entirely independent of face identification, it is not a real dynamic membership authentication under our definition. In [1], we introduced an SVM ensemble method for membership authentication in dynamic face groups. This method was a novel membership authentication in the following respects. First, it performed the membership authentication in terms of binary classification without revealing the individual's identity, i.e., it was concerned only with whether a person was included in the membership group or not. A powerful SVM ensemble [1], [2] combining several binary SVM classifiers was introduced to support this property. Second, it performed dynamic membership authentication without restricting the group size and/or the members of the group, i.e., the membership authentication environment can change dynamically. An effective face representation using an eigenfeature fusing technique was introduced to support this requirement.

However, we found that the SVM ensemble method could only remain stable for membership groups of fewer than 45 persons (16.6% of the total group). As the membership group size increases, its membership authentication performance becomes poorer and very unstable. Furthermore, when the size of the membership in the group becomes similar to the size of the nonmembership, it is almost impossible to obtain a satisfactory authentication performance. This is due to a complicated mixed data distribution among the membership and nonmembership face images; it is very difficult to discriminate such complicated mixed data with only one classifier, even if its classification performance is powerful.

To solve this complex classification problem, we introduce a binary classification tree algorithm. In this algorithm, we partition the whole data into several subgroups by an iterative membership-based clustering and authenticate the membership in each subspace by SVM classification. Consequently, we obtain a considerable improvement in robustness to size changes in the membership group, while keeping a good membership authentication performance and robustness to member changes in the membership group.

This paper is organized as follows. Section II introduces the idea of divide and conquer for face membership authentication. Section III presents the SVM classification tree algorithm with its application to membership authentication. Sections IV and V describe the two important parts of the algorithm, membership-based locally linear embedding (LLE) data partition and SVM, respectively. Experiments and discussion are given in Section VI. Finally, we present conclusions and directions for further research in Section VII.

II. FACE MEMBERSHIP AUTHENTICATION

If we know the membership of the group in advance, the authentication of face membership can be conducted by face recognition, which identifies each input face as one of the individual members of the group. However, this approach is obviously overdone, since it treats an intrinsically binary classification problem [1], [20] as a multiclass one. Furthermore, such methods are not appropriate for supporting dynamic membership authentication in practice, because they are highly dependent on the membership of the group, i.e., whenever some individual members are added to or removed from the group, the performance of an existing system will fluctuate considerably with the variation of membership.

Fig. 1 illustrates the data distributions for the previous two membership authentication methods. Fig. 1(a) shows membership authentication based on face recognition, where each distinct symbol corresponds to a different person, several data points with the same symbol represent different face images of the same member, and the remaining black dots represent the faces of nonmembers. Fig. 1(b) shows membership authentication based on binary membership classification, where all people included in the member group have the same symbol. Here, the circles represent the membership faces, and the black dots represent the nonmembership faces.

To overcome the difficulty of classification in the case of membership authentication based on binary membership classification, we proposed a membership authentication modeled as (1) in our previous work [1]. In [1], we projected the input face image into the membership and the nonmembership space simultaneously, and authenticated the membership by an ensemble of membership SVM classifiers and nonmembership SVM classifiers

$$y(x)=\begin{cases}\text{member}, & \text{if } \Phi_{\mathrm{fusion}}(x)\sim \mathrm{Model}_{M}\\ \text{nonmember}, & \text{if } \Phi_{\mathrm{fusion}}(x)\sim \mathrm{Model}_{\bar{M}}\end{cases} \tag{1}$$

where $\sim$ implies "near to the model," $\Phi_{\mathrm{fusion}}$ denotes a fusion model for reducing within-class feature variation and enlarging between-class feature scattering, and $\mathrm{Model}_{M}$ and $\mathrm{Model}_{\bar{M}}$ denote the classifier models of the membership and the nonmembership group, respectively. This model was proved to be robust to changes of membership in the group. But when the percentage of membership in the group becomes higher, the distributions of features from the membership and nonmembership classes become nearly identical and overlapped in the data space. A Bayesian classifier, or even a mixture-of-experts model like the SVM ensemble we proposed in our previous work [1], would not be able to distinguish between the two classes.

Fig. 1. Comparison of data distributions between two different membership authentication methods. (a) Face identification method. (b) Binary classification method.

To overcome this classification difficulty, we developed a new membership authentication method based on a divide and conquer strategy [29], [32]. The two steps of this strategy are: 1) task decomposition, in which a high-complexity task is decomposed into several subtasks and each subtask is handled by a simple, fast, and efficient module; and 2) multimodule decision-making, in which the subsolutions are integrated via a multimodule decision-making strategy. In the literature, many well-known models, such as modular classifiers [26], [32], [33], ensemble classifiers [1], [34], [35], and neural networks [27]–[29], are based on this strategy. However, most of these models follow the mixture-of-experts approach, in which decisions are made by a decision-making framework such as voting schemes, gating, or weighted output approaches [31]. Another category of divide-and-conquer-based models is the classification tree [3]–[6], [23]–[25], [30]. Compared with the previous multimodel methods, the advantage of a classification tree is that it can naturally partition the feature space into disjoint regions for a single class and thereby reduce the classification difficulty.

The disadvantage is that the required recursive partitions in tree construction easily result in a huge tree size, with consequent error accumulation and poor computational speed [25]. To avoid this disadvantage, we construct the tree by a data partitioning that takes the class label into account. In our SVM tree, the structure is constructed by a membership-based data partition and learning, and each node is one single SVM. During classification, the midnode SVMs make a series of soft SVM classifications and deliver the new sample to a certain terminal-node SVM of the tree. This single SVM then makes the final classification decision. Therefore, instead of conducting one complex classification in the whole data space as in previous methods [1], [20], we authenticate a new face using one simple classification in a subspace (corresponding to one terminal node in the binary tree).

Fig. 2. Illustration of face classification in divided spaces.

Fig. 2 illustrates this recursive divide and conquer idea, where all people included in the member group have the same symbol. As in Fig. 1, the circles and the black dots represent the membership faces and the nonmembership faces, respectively. As we can see in Fig. 2, the whole data space of Fig. 1(b), which originally poses one very complex binary classification, is divided into three subspaces (A), (B), and (C), and subspace (B) is further divided into subspaces (D), (E), and (F). Consequently, an input datum can be authenticated by one of five simple binary classifications, which are represented as five simple decision boundaries in subspaces (A), (C), (D), (E), and (F) of Fig. 2, respectively. Mathematically, the proposed membership authentication method can be formulated as

$$y(x)=\begin{cases}\text{member}, & \text{if } x\sim \mathrm{Model}_{M_i}\\ \text{nonmember}, & \text{if } x\sim \mathrm{Model}_{\bar{M}_i}\end{cases} \tag{2}$$

where $\sim$ implies "near to the model," and $\mathrm{Model}_{M_i}$ and $\mathrm{Model}_{\bar{M}_i}$ denote the classifier models of the membership and the nonmembership group of subspace $i$. The total group is divided into membership subgroups $M_i$ and nonmembership subgroups $\bar{M}_i$, which are determined after model generation. Hence, each input face is first judged to be a member of a certain subgroup, and then its class is determined by the label of this subgroup. We implement the proposed membership authentication by the SVM classification tree with a membership-based LLE data partition technique, both of which are explained in the following sections.

III. SVM TREE ALGORITHM

A classification tree induces a tree structure from a training set of labeled feature vectors in order to classify unlabeled data. It usually consists of internal and external nodes connected by branches. Each internal node performs a split function that partitions the training data into two disjoint subsets, and each terminal node contains a label that indicates the predicted class of a given feature vector. In the machine learning literature, typical examples of classification trees are ID3 [3], C4.5 [4], [5], and CART [6]. ID3 partitions the internal nodes according to their attribute values, C4.5 grows the tree using an entropy function as the error measurement, and CART strictly induces a binary tree by using a resampling technique for error estimation. Compared with these trees, the SVM classification tree is constructed by a divide and conquer approach using a binary clustering and SVM classification technique.

Similar to CART, we construct the SVM classification tree by performing a succession of splits that partition one data set into two disjoint subsets. Starting from the root node, which contains all training data, a membership-based LLE clustering partitions the data set into two disjoint subsets. The disjoint subsets are represented by two child nodes originating from the root node, and the same splitting procedure is performed on both child nodes. This splitting procedure terminates when every node consists of only membership or only nonmembership data.

Basically, we perform two procedures at each node in this tree generation. First, the membership-based clustering performs a "soft" separation of membership, because it splits the data into two disjoint subsets. Next, an SVM classifier is trained on the result of this "soft" classification and can be thought of as performing a "hard" decision of membership. Consequently, as the data splitting continues, we grow a clustering hierarchy and construct an SVM classification tree whose terminal nodes are associated with a label of membership or nonmembership. Algorithm 1 shows the procedure of constructing the SVM classification tree, including the binary membership-based clustering.

Algorithm 1: Grow SVM classification tree.
Function SVMTree_Build(a training set X) {
  i)   /* Split X only if it contains both member and nonmember data */
       if (X contains the same membership) {
           Mark the end of a branch in SVM tree T;
           return; }
  ii)  /* Divide X into two disjoint subsets X1 and X2 */
       (X1, X2) = Membership_based_LLE_Clustering(X);
  iii) /* Append a new node t to T */
       Append_node(T, t);
  iv)  /* Train an SVM classifier on the new node to distinguish X1 from X2 */
       Train_SVM_Classifier(t, X1, X2);
  v)   /* Grow SVM tree on subset X1 */
       SVMTree_Build(X1);
  vi)  /* Grow SVM tree on subset X2 */
       SVMTree_Build(X2);
}
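For concreteness, the construction can be sketched in Python roughly as follows. This is an illustrative sketch, not the authors' implementation: the Node class, the partition callable (standing in for the membership-based LLE clustering of Section IV), and the guard against one-sided splits are our assumptions, and scikit-learn's SVC with the second-order polynomial kernel of Section V stands in for the SVM training.

import numpy as np
from sklearn.svm import SVC

class Node:
    def __init__(self, svm=None, left=None, right=None, label=None):
        self.svm, self.left, self.right, self.label = svm, left, right, label

def svm_tree_build(X, y, partition):
    # X: (n, d) feature matrix; y: labels in {+1 member, -1 nonmember};
    # partition: callable returning a boolean mask for the first subset.
    if len(np.unique(y)) == 1:                 # step i: pure node becomes a leaf
        return Node(label=int(y[0]))
    mask = partition(X, y)                     # step ii: "soft" binary split
    if mask.all() or not mask.any():           # guard (our addition): degenerate split
        return Node(label=int(np.sign(y.sum()) or 1))
    svm = SVC(kernel="poly", degree=2)         # step iv: "hard" split function
    svm.fit(X, np.where(mask, 1, -1))          # learn to route samples to a side
    return Node(svm=svm,                       # steps iii, v, vi: append and recurse
                left=svm_tree_build(X[mask], y[mask], partition),
                right=svm_tree_build(X[~mask], y[~mask], partition))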

Therefore, for an input sample, we can predict its class with the resulting SVM decision tree. First, we decide which cluster a new datum belongs to using the SVM test of the root node of the SVM tree. Depending on the result of the decision made by an internal node's SVM test, the tree branches to one of the node's children. This procedure is repeated until a terminal node is reached, and a class label is then assigned to the given input sample. Algorithm 2 illustrates binary classification using the SVM classification tree.

Algorithm 2: Binary classification by SVM classification tree.
Function SVMTree_Test(SVM tree T, input face x) {
  i)   /* Set current node to the root of T */
       current = Root(T);
  ii)  /* Starting from the current node, test x in T until a terminal node is reached */
       while (current is an internal node) {
  iii)     /* Test input face x using the SVM on the current node */
           y = SVM_Test(current, x);
  iv)      /* Based on the previous SVM test, search the next node in T */
           next = Search(T, current, y);
  v)       /* Set the next node to the current node */
           current = next;
       }
  vi)  /* Return the label of the terminal node obtained from the previous steps */
       return Label(current);
}
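A matching Python sketch of this traversal, reusing the hypothetical Node class from the construction sketch above (again an illustration, not the authors' code; x is assumed to be a NumPy feature vector):

def svm_tree_test(root, x):
    # Route one input face x down the tree until a terminal node is reached.
    node = root
    while node.label is None:                          # steps i, ii: descend from the root
        side = node.svm.predict(x.reshape(1, -1))[0]   # step iii: SVM test at this node
        node = node.left if side == 1 else node.right  # steps iv, v: move to a child
    return node.label                                  # step vi: label of the terminal node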

Fig. 3 shows an example of an SVM classification tree derived from the previous construction procedure. Each ellipse node of the tree represents an SVM test. As mentioned, the SVM test starts from the root node 1; one test outcome branches to the left side of the test node, and the other branches to the right side. If the first outcome is observed, the test at the corresponding child node is performed next. When a terminal node is reached, the input data is assigned to the class of that node (e.g., class a), and so forth.

Fig. 3. Example of an SVM classification tree.

IV. MEMBERSHIP-BASED LLE DATA PARTITION

In the previous tree-growing algorithm, the data splitting procedure divides the data iteratively, and an SVM classifier follows each step of the data partition. Using LLE dimensionality reduction theory, we compress the member faces and the nonmember faces into sets of LLE eigenfaces that together characterize the variation among all the member face images and all the nonmember face images, respectively, and we partition the training set by a membership-based clustering with the membership eigenfaces and nonmembership eigenfaces as the two cluster centers.

A. LLE Eigenface

Compared with the linear dimensionality reduction method of principal component analysis (PCA) [1], LLE is a method of nonlinear dimensionality reduction introduced by S. T. Roweis and L. K. Saul [7]. The method recovers global nonlinear structure from locally linear fits. It attempts to preserve the local neighborhood of each object as well as possible, while preserving the global distances through the rest of the objects. These properties might not be a benefit for data classification, but they definitely imply a better clustering of the data [8], [9]. Here, we represent the whole set of face images in the membership group (or nonmembership group) in terms of a set of membership (or nonmembership) eigenfaces, which we obtain from the LLE eigenface technique as follows.

Given a face feature dataset $X$ consisting of $N$ real-valued vectors $x_i$, each vector $x_i$ is a high-dimensional vector with dimensionality $D$, and $y_i$ is the low-dimensional vector embedded from $x_i$ with embedding dimensionality $d$, where $d \ll D$. The computation of LLE eigenfaces involves an optimal embedding procedure that reduces the high-dimensional vectors $x_i$ to low-dimensional vectors $y_i$ by minimizing the following cost function:

$$\Phi(Y)=\sum_{i}\Big|y_{i}-\sum_{j}W_{ij}y_{j}\Big|^{2}. \tag{3}$$

This procedure consists of three steps as follows.

Step 1. Compute the neighbors of each data point $x_i$ by computing pairwise distances and finding neighbors [10]. In the simplest formulation of LLE, one identifies the $K$ nearest neighbors per data point, as measured by Euclidean distance.

Step 2. Compute the weights $W_{ij}$ that best reconstruct each data point $x_i$ from its neighbors, minimizing the following cost function by constrained linear fits:

$$\varepsilon(W)=\sum_{i}\Big|x_{i}-\sum_{j}W_{ij}x_{j}\Big|^{2}. \tag{4}$$

Step 3. Compute the matrix $M$ in terms of the previous weights computation

$$M=(I-W)^{\top}(I-W) \tag{5}$$

where $I$ is the $N \times N$ identity matrix.

Fig. 4. Comparison of membership LLE eigenfaces and PCA eigenfaces when $K = 10$ and $|M| = 20$. (A) Original face images. (B) LLE eigenfaces. (C) PCA eigenfaces.

Note that the bottom eigenvectors of the matrix $M$ are those corresponding to its smallest eigenvalues. Thus, for fixed weights $W_{ij}$, the embedding vectors $y_i$ are found by minimizing the cost function (3). That is, the optimal embedding can be found by computing the bottom $d$ eigenvectors of the matrix $M$. The ratio $d/D$ identifies the compression ratio of the embedded data space relative to the original data space. A bigger ratio means that more local points of data variation in the original space are kept in the embedded eigenspace, and a smaller ratio means that more global information is preserved by each LLE eigenvector of the embedded space. To preserve the local face variations, $d/D$ is constantly set to 1/3 in our experiments.

Therefore, for face image data, the LLE eigenfaces are a subset of the eigenvectors of the matrix $M$, under the assumption that a facial image from the training set can be reconstructed from its neighbors with the lowest reconstruction error (4). Fig. 4(B) shows 10 LLE eigenfaces derived from the 20 face images of Fig. 4(A), which are included in the membership group for authentication. Fig. 4(C) shows 10 PCA eigenfaces derived from the same 20 face images of Fig. 4(A). As we can see, each PCA eigenface contains global components from all of the 20 faces and thus looks very different from the others, while the differences between the LLE eigenfaces are not very distinct, because each LLE eigenface is obtained from a local embedding computation and contains only local information from the 20 face images.
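The three steps above can be exercised with a short NumPy sketch. This follows the standard Roweis–Saul formulation under our reading of (3)–(5), with a small regularizer added to the local Gram matrix for numerical stability; function and variable names are ours, not the authors'.

import numpy as np

def lle_embed(X, n_neighbors=10, d=2):
    # X: (N, D) array, one sample per row; returns the (N, d) embedding
    # and the matrix M of (5).
    N = X.shape[0]
    # Step 1: K nearest neighbors by Euclidean distance (excluding self).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    nbrs = np.argsort(dist, axis=1)[:, 1:n_neighbors + 1]
    # Step 2: reconstruction weights minimizing (4), each row summing to one.
    W = np.zeros((N, N))
    for i in range(N):
        Z = X[nbrs[i]] - X[i]                          # neighbors centered on x_i
        C = Z @ Z.T                                    # local Gram matrix
        C += np.eye(n_neighbors) * 1e-3 * np.trace(C)  # regularize for stability
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, nbrs[i]] = w / w.sum()                    # enforce sum-to-one constraint
    # Step 3: M = (I - W)^T (I - W) as in (5).
    IW = np.eye(N) - W
    M = IW.T @ IW
    # The optimal embedding is given by the bottom d eigenvectors of M,
    # discarding the constant eigenvector with eigenvalue near zero.
    _, vecs = np.linalg.eigh(M)
    return vecs[:, 1:d + 1], M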

B. Data Partition

In membership authentication, the group members can be divided into a membership group $M$ and a nonmembership group $\bar{M}$. Applying the previous LLE eigenface technique to $M$ and $\bar{M}$, respectively, we obtain two representative eigenface sets: the membership eigenfaces $E_{M}$ and the nonmembership eigenfaces $E_{\bar{M}}$. They characterize the "membership-face" and the "nonmembership-face," respectively. Fig. 4 is an example of membership LLE eigenfaces with membership size equal to 20.

For partitioning, we identify the two partitioned groups by a binary matrix $B$, where the element $B_{ik}$ is 1 if the $i$th data point belongs to group $k$, and 0 otherwise. Once the two cluster centers $E_{M}$ and $E_{\bar{M}}$ are fixed, the clustering based on membership can be performed as follows:

$$B_{i1}=\begin{cases}1, & \text{if } d_{M}(x_{i})\leq d_{\bar{M}}(x_{i})\\ 0, & \text{otherwise}\end{cases} \tag{6}$$

where $d_{M}(x_{i})$ is the minimum distance of $x_{i}$ projected onto the membership eigenfaces $E_{M}$, and $d_{\bar{M}}(x_{i})$ is the minimum distance of $x_{i}$ projected onto the nonmembership eigenfaces $E_{\bar{M}}$.

Fig. 5. Typical example of the binary LLE data partition.

Fig. 5 illustrates an example of the binary LLE data partition, where two dotted lines represent two principal membership LLE eigenvectors (or eigenfaces) and two solid lines represent two principal nonmembership LLE eigenvectors (or eigenfaces). Each group member in the space is projected onto those eigenvectors and is assigned to the class whose projected distance sum is the smallest. After completing the assignment of all group members, we obtain two disjoint subgroups that correspond to the membership group and the nonmembership group, respectively. The two subgroups are then used as two labeled classes to train the SVM classifier.

We modified the LLE technique to take into consideration the labeling (membership and nonmembership) of the data. Thus, as the tree grows, members and nonmembers are forced to gather into a number of separated subclusters by the recursive membership-based LLE clustering. Upon reaching the terminal nodes, all subclusters on the leaf nodes consist of either members only or nonmembers only.
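As a sketch, the rule (6) can be implemented as a subspace-residual comparison. Treating the "projected distance" as the reconstruction residual against each eigenface basis is our interpretation of the description above; the names below are illustrative.

import numpy as np

def subspace_residual(x, E):
    # Residual of x after projection onto the subspace spanned by the
    # orthonormal rows of E (shape (d, D)).
    return np.linalg.norm(x - E.T @ (E @ x))

def membership_partition(X, E_member, E_nonmember):
    # Boolean mask, True where a sample is assigned to the membership side,
    # following (6).
    d_m = np.array([subspace_residual(x, E_member) for x in X])
    d_n = np.array([subspace_residual(x, E_nonmember) for x in X])
    return d_m <= d_n

A callable of this form, once the two eigenface bases of the node at hand are bound (e.g., with a lambda), can serve as the partition routine assumed in the tree-construction sketch of Section III.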

V. SVM

The SVM is a promising classification and regression technique proposed by Vapnik and his group at AT&T Bell Laboratories [11]–[13]. In this work, we use SVMs cooperatively as the classifiers in a classification tree to perform the membership authentication. In theory, SVM classification can be traced back to the classical structural risk minimization (SRM) approach, which determines the classification decision function by minimizing the empirical risk

$$R_{\mathrm{emp}}(f)=\frac{1}{2l}\sum_{i=1}^{l}\left|y_{i}-f(x_{i})\right| \tag{7}$$

where $l$ and $f(x)$ represent the size of the example set and the classification decision function, respectively. In SVM, the primary concern is determining an optimal separating hyperplane that gives a low generalization error. A typical SVM classification decision function for the linearly separable problem is

$$f(x)=\operatorname{sign}(w\cdot x+b) \tag{8}$$

whose separating hyperplane is determined by giving the largest margin of separation between different classes. This hyperplane bisects the shortest line between the convex hulls of the two classes and thus is required to satisfy the following constrained minimization:

$$\min_{w,b}\ \frac{1}{2}w^{\top}w \quad\text{subject to}\quad y_{i}(w\cdot x_{i}+b)\geq 1. \tag{9}$$

For the linearly nonseparable case, the previous minimization problem needs to be modified to allow misclassified data points. This modification results in a soft margin classifier that allows but penalizes errors by introducing a new set of slack variables $\xi_{i}$ as the measurement of violation of the constraints

$$\min_{w,b,\xi}\ \frac{1}{2}w^{\top}w+C\sum_{i=1}^{l}\xi_{i} \quad\text{subject to}\quad y_{i}(w\cdot x_{i}+b)\geq 1-\xi_{i},\quad \xi_{i}\geq 0 \tag{10}$$

where $C$ is used to weight the penalizing variables $\xi_{i}$. Minimizing the first term in (10) corresponds to minimizing the VC-dimension of the learning machine, and minimizing the second term in (10) controls the empirical risk. Therefore, in order to solve problem (10), we must construct a set of functions and implement the classical risk minimization on the set of functions. Here, a Lagrangian method is used to solve the previous problem, and (10) can be rewritten as

$$\max_{\alpha}\ W(\alpha)=\sum_{i=1}^{l}\alpha_{i}-\frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_{i}\alpha_{j}y_{i}y_{j}K(x_{i},x_{j}) \tag{11}$$

where $0\leq\alpha_{i}\leq C$ and $\sum_{i=1}^{l}\alpha_{i}y_{i}=0$ for binary classification, and the decision function (8) can be rewritten as

$$f(x)=\operatorname{sign}\left(\sum_{i=1}^{l}\alpha_{i}y_{i}K(x_{i},x)+b\right) \tag{12}$$

where $K(\cdot,\cdot)$ is a kernel function. The choice of $K$ is closely related to the performance of single SVMs. Based on the experience of [2] and to guarantee a fair comparison, we use the same second-order polynomial kernel in all SVMs of our experiments; other types of kernel can also be selected for different classification cases.
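As a usage illustration (our example, with toy data and parameter values such as C and coef0 that the paper does not specify), an off-the-shelf solver like scikit-learn's SVC optimizes the dual (11) and evaluates (12) internally:

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
y = np.where(X[:, 0] + X[:, 1] ** 2 > 0.5, 1, -1)      # toy labels

# Second-order polynomial kernel as used throughout the experiments;
# C weights the slack variables of (10).
svm = SVC(kernel="poly", degree=2, coef0=1.0, C=1.0).fit(X, y)
pred = svm.predict(X)                                  # sign of the decision function (12)
print((pred == np.where(svm.decision_function(X) > 0, 1, -1)).mean())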

VI. EXPERIMENTAL RESULTS AND DISCUSSIONS

A. Face Database

We perform various experiments to evaluate the proposed membership authentication technique using the MPEG-7 face dataset, which consists of 1,355 face images of 271 persons (five different face images taken per person), where each image has a size of 56 × 46. The images have been selected from the AR (Purdue), AT&T, Yale, UMIST, and University of Berne databases, along with some face images obtained from MPEG-7 news videos [14]. For our experiments, we use four images of a person to construct the SVM classification tree and the one remaining image of the person to evaluate the authentication performance of the constructed SVM classification tree.

To remove the effect of illumination, each image is first normalized to have a constant mean and variance. Because the changes of hairstyle are rather slight in our experimental data, we do not need to remove the hair information. However, for a robust face membership authentication system, face images are usually cropped to keep only the main facial regions, since the hairstyle can change in appearance easily, which would inevitably affect the result of face authentication. For face feature extraction, we employed an effective facial encoding for good classification, which consists of Gabor wavelet feature extraction [15], [16], principal component analysis (PCA) feature fusion, and linear discriminant analysis (LDA) feature scattering [1].

We randomly select a certain number of persons among the 271 persons and assign them as the membership group, which can be divided into a training and a test set. The remaining persons are assigned as the nonmembership group. According to the definition of membership authentication in Section I, we limit the size of the membership group to be no greater than the size of the nonmembership group, and the percentage of members in the membership group is kept within 40% of the total number of persons in order to ensure that the membership group is meaningful.
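A rough sketch of such an encoding pipeline is given below, assuming scikit-image's Gabor filter and scikit-learn's PCA and LDA. The filter frequencies, orientation count, and component numbers are placeholders, not the settings of [1], [15], [16]:

import numpy as np
from skimage.filters import gabor
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def gabor_features(img, frequencies=(0.1, 0.2), n_orient=4):
    # Magnitude responses of a small Gabor wavelet filter bank.
    feats = []
    for f in frequencies:
        for k in range(n_orient):
            real, imag = gabor(img, frequency=f, theta=k * np.pi / n_orient)
            feats.append(np.hypot(real, imag).ravel())
    return np.concatenate(feats)

def encode_faces(images, labels, n_pca=50):
    # Gabor extraction, then PCA fusion, then LDA scattering on the labels
    # (identities or member/nonmember, depending on the training stage).
    G = np.stack([gabor_features(im) for im in images])
    Z = PCA(n_components=n_pca).fit_transform(G)
    return LinearDiscriminantAnalysis().fit_transform(Z, labels)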

B. Constructing SVM Classification Tree

We implemented Algorithm 1 to construct the SVM classification tree for membership authentication using the training data set as follows.

Fig. 6. Illustration of SVM binary tree for classification.

First, all the training data at the root node are divided into two disjoint subgroups, where each subgroup corresponds to one of two child nodes. Then, the training set in each child node is divided into another two disjoint subgroups once again. This procedure is repeated until each node consists of training data from only the membership group or only the nonmembership group. At every division step, we train an SVM classifier, and all the SVM classifiers together make up the SVM classification tree.

Fig. 6 illustrates an SVM classification tree for the authentication of 45 members out of 271 persons, which was implemented in our experiment. Each internal node of the tree identifies an SVM classifier, represented as an ellipse with a number as its identity; when a parent node is labeled $n$, its two child nodes are identified as $2n$ and $2n+1$, respectively. We also represent each terminal node as a circle or a filled circle, denoting membership or nonmembership, respectively. As seen in Fig. 6, the SVM binary tree shows a balanced distribution of membership nodes and nonmembership nodes, although the number of membership nodes is smaller than the number of nonmembership nodes. This occurs because we use LLE eigenfaces as the clustering centers in the tree generation, which preserves a good local neighborhood of each object in the global space.

C. Authentication Performance

After constructing the SVM classification tree, we evaluated the authentication performance of the proposed membership authentication method as follows. As described in Algorithm 2, we applied a new input face image to the root node of the SVM classification tree and transferred the input face image to the left

or right child node depending on the classification result of the SVM classifier. We continued this procedure along a tree branch until we reached a terminal node. Finally, we assigned the input face image to membership or nonmembership according to the label of the reached terminal node.

Table I shows the authentication results of the proposed membership authentication method. As in [1], we consider two different types of authentication error, the "false-negative error" and the "false-positive error," where the former means that members are identified as nonmembers and the latter means that nonmembers are identified as members. We tested the proposed membership authentication method with group sizes from 10 to 60 at intervals of 10. As can be seen in Table I, the proposed membership authentication method provides a very good performance on nonmembership authentication, in that the average false-positive error is about 0.0025. Also, the correct rate is competitive with the SVM ensemble method and superior to the membership authentication based on the face recognition method.

Table II shows a comparison of authentication performance among three different membership authentication methods: the SVM classification tree, the SVM ensemble [1], and the single SVM [20], when the group size is considerably large (50 or 60 members). The structural and functional description of the SVM ensemble was given in our previous work on dynamic membership authentication [1], and the single SVM was introduced by F. Xie et al. [20] for face verification.
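The two error measures can be computed as below (our helper, matching the definitions above; the label convention is an assumption):

import numpy as np

def authentication_errors(y_true, y_pred):
    # Labels: +1 for member, -1 for nonmember.
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    false_negative = np.mean(y_pred[y_true == 1] == -1)   # members rejected
    false_positive = np.mean(y_pred[y_true == -1] == 1)   # nonmembers accepted
    return false_negative, false_positive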

Here, we adopted the five-fold cross validation technique to avoid the data tweak problem. The first column of Table II denotes five different experiments, where Ex1 means that the face image labeled "1" was taken as the test input and the remaining four face images were taken as the training samples, and so on. From the table, we know the following: 1) the authentication performance of the proposed SVM tree method is the best among the three membership authentication methods, as it shows the smallest average value of authentication errors; 2) the authentication performance of the proposed SVM tree method is the most stable among the three methods, as it shows the smallest variance of authentication errors; and 3) the authentication performance of the SVM ensemble-based method deteriorates abruptly as the group size increases from 50 to 60.

In addition, we also compared the authentication performance of the SVM tree with a face recognition-based method on the same task. Here, the embedded hidden Markov model (HMM) with the second-order block-specific eigenvectors was chosen as the face recognition technique [14], since this method was known to show the best recognition performance on our face database. As a result, the recognition-based method obtained a 5.4% authentication error and a 530.0e-005 variance of authentication errors, which indicates that its authentication accuracy is competitive with the SVM tree. However, the authentication stability of the recognition-based method is not comparable with the SVM tree, owing to its much higher variance of authentication errors.

Fig. 7. Ten different membership groups with five members in each group.

TABLE I
CLASSIFICATION RESULTS OF THE PROPOSED MEMBERSHIP AUTHENTICATION

D. Stability Test

First, we performed the stability test of the proposed authentication method under the condition of different members in a membership group of fixed size. We randomly selected 10 different membership groups for each of the group sizes of 10, 20, 30, 40, 50, and 60 members. Fig. 7 illustrates part of these membership groups: 10 different membership groups with five members each. We applied the proposed authentication method to each group independently. Fig. 8 shows that the authentication performance of the proposed method is very stable, as the correct authentication rate stays within a range of 96.5% to 99.5% even when the members in the membership group are changed.

Second, we performed the stability test of the proposed authentication method under the condition of different membership group sizes. Since the membership and the nonmembership group are complementary to each other, the proposed authentication system should be entirely symmetric with respect to members and nonmembers. To show this property, we used membership group sizes ranging from 5 to 271 persons (the total group size) with a step size of five persons. Additionally, for the purpose of a comparative analysis, we also applied two existing models for face membership authentication to the same task: the SVM ensemble [1] and the single SVM [20].

Fig. 9 shows the comparison of the number of misauthentications (false-positives + false-negatives), where three solid curves record the performance of the three models under different membership group sizes. For a better illustration of stability, two additional lines are used as performance variation approximations: the dot-dash line is for the single SVM system, and the dash line is for the SVM tree system. As can be seen in Fig. 9, as the membership group size grows from a small size of 5 toward 135 (50% of the total group size), the misauthentication count of the single SVM shows an almost jump-like increase; thus, its performance variation over the whole range of group sizes can be approximated as the rectangle illustrated by the dot-dash line. The performance of the SVM ensemble is stable while the group size is below 45. However, when the group size grows beyond 50 and approaches half of the total group size, sequential mutations appear in its curve, which implies that the performance of the system is out of control. Compared with these two models, the SVM tree shows much better stability, because of the small increase in misauthentications as the membership group size grows beyond 45 and even approaches 50% of the total group size, and its curve fluctuates very slightly and smoothly.

Furthermore, from the performance variation approximation illustrated by the dash line in Fig. 9, we can also see that it is a triangle with two small and similar bottom angles, which suggests that the SVM tree behaves as a nearly linear system as the group size grows up to 50% of the total group size; it is obviously more stable than a system with a jump or a system with mutations. Hence, we can conclude that the proposed SVM classification tree has much better stability under changes of membership group size than the SVM ensemble method and the single SVM.

TABLE II
COMPARISON OF AUTHENTICATION ERRORS AMONG THREE DIFFERENT MEMBERSHIP AUTHENTICATION METHODS

Fig. 8. Stability test under the condition of different members.

Fig. 9. Stability test under the condition of different membership group sizes.

E. Computational Complexity

We noticed that the construction of the SVM tree increases the computational cost of training. However, since the SVM tree is constructed by a divide and conquer approach, only the SVM at the root node and the SVMs at several nodes near the root carry such a training burden; the other node SVMs train very quickly owing to the reduced computation in a divided subspace. Therefore, the training of the SVM tree is slower than that of the single SVM and the SVM ensemble. However, when we compared the speed of the SVM tree with some evolving neural networks, such as ECMC and ECF [45], on the face membership authentication task, we found that SVM tree training was five times faster than the training of these models.

Our experiments were implemented in Matlab on a Pentium 4 1.8-GHz personal computer. The average authentication time was 0.35 s for all types of membership authentication tasks. This speed is sufficient for the task of security monitoring. Our experiments have demonstrated that although the SVM classification tree requires a higher computational effort than the single SVM and the SVM ensemble, it enhances the authentication accuracy and stability substantially.

VII. CONCLUSION AND DIRECTIONS FOR FURTHER RESEARCH

This paper addresses a complex classification subject in the face membership authentication problem: when the size of the membership data is close to the size of the nonmembership data, the classification between membership and nonmembership becomes extremely difficult due to the nearly identical distributions of features from the two classes. The classification accuracy obtained using one classifier was not sufficient despite its powerful classification ability [1]. To deal with this classification complexity, we proposed an SVM classification tree algorithm. The algorithm is designed in terms of a divide and conquer approach as follows. First, we partition one set of data into two disjoint subsets using the membership-based LLE data partition, where an individual member is assigned to the subset whose eigenfaces yield the smaller projected distance. Then, we train the SVM classifier on the two partitioned subgroups, where each subgroup is used as one class for training. These two procedures are repeated until each subgroup contains only membership or only nonmembership data. The trained SVM classification tree is finally used for identifying the membership of an unknown person. Our experimental results show that the proposed method achieves a good membership authentication performance, as well as strong robustness to the variations of group membership, compared with the SVM ensemble method.

Specifically, the proposed method shows a better improvement in authentication performance as the group size becomes larger.

One problem of the proposed method is that the correct classification rate for the membership was not as good as expected when the size of the membership group was bigger than 70 (25.8% of the whole group). This comes from the following facts. 1) There is a large amount of mixture between membership and nonmembership in the feature space. 2) The face database used consists of images with many different poses, so there is a high possibility that differently posed images of the same person are divided into different subgroups when we partition the group, which generally makes the tree deeper. This leads to a high degree of misclassification, because we need to pass through many nodes to decide the membership of an input face image. 3) Five face images per person are not sufficient for membership training. Since the construction of the SVM tree is based on a divide and conquer strategy, the training data is divided steadily as the tree grows. In particular, it is almost impossible to train the SVMs near the leaf nodes, because the number of training data there is highly insufficient. Therefore, more training data would definitely benefit the construction of the SVM tree. Nevertheless, the simulation results have demonstrated that the proposed SVM classification tree-based method shows an outstanding authentication performance that can withstand a high degree of classification difficulty in the case of a sizeable membership group.

In addition, we note that the structure of the tree is one key control that affects the classification performance, as well as the computational cost, of the SVM classification tree. Given a collection of training examples from each distinct group, there are many classification trees that can exclusively classify these given examples. One tree is superior to the others if it is smaller, since fewer decisions are required [25]. In other words, an optimal classification tree embodies a partitioned subspace that has better class separability than the subspaces produced by any other data partition method. As we know, if a partitioned subspace is linearly separable for the SVM, then the classification is simple for the SVM in theory [13]; whereas if it is nonlinear and nonseparable for the SVM, which is the most difficult classification case, the classification in such a subspace becomes very difficult due to the complex hyperplane computation. Therefore, an optimal partition method whose partitions are the most beneficial to SVM classification could be counted as a breakthrough in the research on the next generation of SVM classification tree algorithms.

ACKNOWLEDGMENT

The authors would like to thank the reviewers for their useful comments and suggestions.

REFERENCES

[1] S. N. Pang, D. Kim, and S. Y. Bang, "Membership authentication in the dynamic group by face classification using SVM ensemble," Pattern Recognit. Lett., vol. 24, pp. 215–225, 2003.
[2] S. N. Pang, D. Kim, and S. Y. Bang, "Fraud detection using support vector machine ensemble," in Proc. 8th Int. Conf. Neural Information Processing, Shanghai, China, Sep. 2001, pp. 1344–1349.
[3] J. R. Quinlan, "Induction of decision trees," Mach. Learn., vol. 1, pp. 81–106, 1986.
[4] J. R. Quinlan, "Improved use of continuous attributes in C4.5," Artif. Intell. Res., vol. 4, pp. 77–90, 1996.
[5] J. R. Quinlan, C4.5: Programs for Machine Learning. San Mateo, CA: Morgan Kaufmann, 1993.
[6] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees. Belmont, CA: Wadsworth, 1984.
[7] S. T. Roweis and L. K. Saul, "Nonlinear dimensionality reduction by locally linear embedding," Science, vol. 290, 2000.
[8] A. B. A. Graf and F. A. Wichmann, "Gender classification of human faces," in Biologically Motivated Computer Vision 2002, H. H. Bülthoff, S.-W. Lee, T. A. Poggio, and C. Wallraven, Eds. Berlin, Germany: Springer-Verlag, 2002, vol. 2525, LNCS, pp. 491–501.
[9] M. Vlachos, C. Domeniconi, D. Gunopulos, G. Kollios, and N. Koudas, "Non-linear dimensionality reduction techniques for classification and visualization," in Proc. 8th ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining, Edmonton, AB, Canada, Jul. 2002, pp. 645–651.
[10] L. K. Saul and S. T. Roweis, "An introduction to locally linear embedding," AT&T, Florham Park, NJ, 2000.
[11] J. Zhu and Y. L. Yu, "Face recognition with eigenfaces," in Proc. IEEE Conf. Industrial Technology, 1994, pp. 434–438.
[12] V. Vapnik, Estimation of Dependences Based on Empirical Data. New York: Springer-Verlag, 1982.
[13] C. Cortes and V. Vapnik, "Support vector network," Mach. Learn., vol. 20, pp. 273–297, 1995.
[14] M. S. Kim, D. Kim, S. Y. Bang, S. Y. Lee, and Y. S. Choi, "Face recognition descriptor using the embedded HMM with the second-order block-specific eigenvectors," ISO/IEC JTC1/SC21/WG11/M7997, Jeju, South Korea, 2002.
[15] M. J. Lyons, J. Budynek, A. Plante, and S. Akamatsu, "Classifying facial attributes using a 2-D Gabor wavelet representation and discriminant analysis," in Proc. 4th IEEE Int. Conf. Automatic Face and Gesture Recognition, 2000, pp. 202–207.
[16] P. A. Chou, "Optimal partitioning for classification and regression trees," IEEE Trans. Pattern Anal. Mach. Intell., vol. 13, no. 4, pp. 340–354, Apr. 1991.
[17] D. E. Brown and C. L. Pittard, "Classification trees with optimal multivariate splits," in Proc. Int. Conf. Systems, Man and Cybernetics, vol. 3, Oct. 1993, pp. 475–477.
[18] S. Schechter, T. Parnell, and A. Hartemink, "Anonymous authentication of membership in dynamic groups," in Financial Cryptography. Berlin, Germany: Springer-Verlag, 1999, vol. 1648, Lecture Notes in Computer Science, pp. 184–195.
[19] K. Jonsson, J. Kittler, Y. P. Li, and J. Matas, "Support vector machines for face authentication," Image Vis. Comput., vol. 20, no. 5–6, pp. 369–375, Apr. 2002.
[20] F. Xie, G. Xu, and E. Hundt, "A face verification algorithm integrating geometrical and template features," in Proc. Advances in Multimedia Information Processing, vol. 2195, LNCS, 2001, pp. 253–260.
[21] K. Kim, J. Kim, and K. Jung, "Face recognition using support vector machines with local correlation kernels," Int. J. Pattern Recognit. Artif. Intell., vol. 16, no. 1, pp. 97–111, 2002.
[22] S. Gutta, J. Huang, P. Jonathon, and H. Wechsler, "Mixture of experts for classification of gender, ethnic origin, and pose of human faces," IEEE Trans. Neural Netw., vol. 11, no. 4, pp. 948–960, Jul. 2000.
[23] P. L. Tu and J. Y. Chung, "A new decision-tree classification algorithm for machine learning," in Proc. 4th Int. Conf. Tools with Artificial Intelligence, Nov. 1992, pp. 370–377.
[24] D. Li and J. J.-X. Wu, "Hierarchical partition of the articulatory state space for overlapping-feature based speech recognition," in Proc. 4th Int. Conf. Spoken Language, vol. 4, Oct. 1996, pp. 2266–2269.
[25] Y. S. Chen and T. H. Chu, "A neural network classification tree," in Proc. IEEE Int. Conf. Neural Networks, vol. 1, Nov.–Dec. 1995, pp. 409–413.
[26] X. D. Jian, A. Krzyzak, and C. Y. Suen, "A multi-net local learning framework for pattern recognition," in Proc. 6th Int. Conf. Document Analysis and Recognition, Sep. 2001, pp. 328–332.
[27] S. G. Romaniuk and L. O. Hall, "Dynamic neural networks with the use of divide and conquer," in Proc. Int. Joint Conf. Neural Networks, vol. 1, Jun. 1992, pp. 658–663.
[28] P. Liang, "Design artificial neural networks based on the principle of divide-and-conquer," in Proc. IEEE Int. Symp. Circuits and Systems, vol. 3, Jun. 1991, pp. 1319–1322.
[29] P. Liang, "Problem decomposition and subgoaling in artificial neural networks," in Proc. IEEE Int. Conf. Systems, Man and Cybernetics, Nov. 1990, pp. 178–181.
[30] J. L. Castro, M. Delgado, and C. J. Mantas, "SEPARATE: A machine learning method based on semi-global partitions," IEEE Trans. Neural Netw., vol. 11, no. 3, pp. 710–720, May 2000.
[31] M. Sarkar, "Modular pattern classifiers: A brief survey," in Proc. IEEE Int. Conf. Systems, Man, and Cybernetics, vol. 4, Oct. 2000, pp. 2878–2883.
[32] L. L. Bao and I. Masami, "Task decomposition and module combination based on class relations: A modular neural network for pattern classification," IEEE Trans. Neural Netw., vol. 10, pp. 1244–1256, 1999.
[33] L. Guerra, M. Potkonjak, and J. Rabaey, "Divide-and-conquer techniques for global throughput optimization," in Proc. Workshop on VLSI Signal Processing, Oct.–Nov. 1996, pp. 137–146.
[34] S. Eschrich and L. O. Hall, "Soft partitions lead to better learned ensembles," in Proc. Annu. Meeting North American Fuzzy Information Processing Soc., Jun. 2002, pp. 406–411.
[35] A. Rahman and M. Fairhurst, "Decision combination of multiple classifiers for pattern classification: Hybridization of majority voting and divide and conquer techniques," in Proc. 5th IEEE Workshop Applications of Computer Vision, Dec. 2000, pp. 58–63.
[36] H. C. Kim, D. Kim, and S. Y. Bang, "Face retrieval using first- and second-order PCA mixture model," in Proc. Int. Conf. Image Processing, vol. 2, Sep. 2002, pp. 605–608.
[37] D. Chaum, A. Fiat, and M. Naor, "Untraceable electronic cash," in Proc. Advances in Cryptology (CRYPTO), 1988, pp. 319–327.
[38] M. H. Yang, N. Ahuja, and D. Kriegman, "Face recognition using kernel eigenfaces," in Proc. Int. Conf. Image Processing, vol. 1, Sep. 2000, pp. 37–40.
[39] K. Sohara and M. Kotani, "Application of kernel principal components analysis to pattern recognition," in Proc. 41st SICE Annu. Conf., vol. 2, Aug. 2002, pp. 750–752.
[40] V. Bruce, "Identification of human faces," in Proc. 7th Int. Conf. Image Processing and Its Applications, vol. 2, Jul. 1999, pp. 615–619.
[41] M. A. Fadzil and C. C. Lim, "Face recognition system based on neural networks and fuzzy logic," in Proc. Int. Conf. Neural Networks, vol. 3, Jun. 1997, pp. 1638–1643.
[42] S. R. Safavian and D. Landgrebe, "A survey of decision tree classifier methodology," IEEE Trans. Syst., Man, Cybern., vol. 21, no. 3, pp. 660–674, May–Jun. 1991.
[43] J. Schuermann and W. Doster, "A decision tree theoretic approach to hierarchical classifier design," Pattern Recognit., vol. 17, no. 3, pp. 359–369, 1984.
[44] Q. R. Wang and C. Y. Suen, "Analysis and design of a decision tree based on entropy reduction and its application to large character set recognition," IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-6, no. 4, pp. 406–417, Mar. 1984.
[45] N. Kasabov and Q. Song, "GA-optimization of evolving connectionist systems for classification with a case study from bioinformatics," in Proc. 8th Int. Conf. Neural Information Processing, Singapore, Nov. 2002, pp. 602–605.

Shaoning Pang (M’04) received the B.S. degree in physics, the M.S. degree in electronic engineering, and the Ph.D. degree in computer science. From 2001 to 2003, he worked as a Research Associate at the Pohang University of Science and Technology (POSTECH), Pohang, Korea. He is currently a Senior Research Fellow of the Knowledge Engineering and Discovery Research Institute, Auckland University of Technology, New Zealand. His research interests include support vector machines, brain-like computing, incremental learning, and bioinformatics. Dr. Pang is a member of the Institute of Electrical, Information and Communication Engineers (IEICE), Japan, and the Association for Computing Machinery.

Daijin Kim (M’90–SM’04) received the B.S. degree in electronic engineering from Yonsei University, Seoul, Korea, in 1981, the M.S. degree in electrical engineering from the Korea Advanced Institute of Science and Technology (KAIST), Taejon, in 1984, and the Ph.D. degree in electrical and computer engineering from Syracuse University, Syracuse, NY, in 1991. He is currently an Associate Professor in the Department of Computer Science and Engineering, Pohang University of Science and Technology (POSTECH), Pohang, Korea. From 1992 to 1999, he was with the Department of Computer Engineering at DongA University, Pusan, Korea. His research interests include intelligent systems, biometrics, and automatic identification.

Sung Yang Bang (M’74–SM’88) received the B.S. degree in electrical engineering from Kyoto University, Kyoto, Japan in 1966, the M.S. degree in electrical engineering from Seoul National University, Seoul, Korea in 1969, and the Ph.D. degree in computer science from the University of Texas, Austin, in 1974. He worked previously for Wayne State University, NCR, and Bell Laboratories. In 1986, he joined Pohang University of Science and Technology (POSTECH), Pohang, Korea, as a Professor. He served as the first Head of the Department of Computer Science and Engineering. He also served as the Director of the Brain Research Center, POSTECH. He participated in organizing APNNA (Asia Pacific Neural Network Assembly) and served as the organizing chair of the first ICONIP held in Seoul, Korea, in 1994. Currently, he is the Chairman of Korea Chapter of IEEE Computational Intelligence Society. His research interests are pattern recognition and neurocomputing.
