Face Recognition Using Ensemble Feature Space in ...

Face Recognition Using Ensemble Feature Space in Combination with Support Vector Machine

BY SHAHID AKBAR Thesis submitted to Abdul Wali Khan University Mardan in the partial fulfillment of the requirements for the degree of

MS (COMPUTER SCIENCE)

DEPARTMENT OF COMPUTER SCIENCE ABDUL WALI KHAN UNIVERSITY MARDAN SESSION (2012 – 14)


Face Recognition Using Ensemble Feature Space in Combination with Support Vector Machine

By Shahid Akbar

A Thesis Submitted in partial fulfillment of the requirements for the Degree of Master in Computer Science

To Department of Computer Science, Abdul Wali Khan University, Mardan, Pakistan 2015


DECLARATION

The thesis of Mr. Shahid Akbar, “Face Recognition Using Ensemble Feature Space in Combination with Support Vector Machine”, is accepted in its present form by the Department of Computer Science, Abdul Wali Khan University Mardan, as satisfying all the requirements for the award of the degree of MS (COMPUTER SCIENCE).

Supervisor

____________________ Dr. Maqsood Hayat Assistant Professor Department of Computer Science

External Examiner

____________________ Prof. Dr. Bashir Ahmad Department of Computer Science Gomal University

Chairman

____________________ Dr. Mukhtaj Khan Assistant Professor Department of Computer Science

Dean Faculty of Physical & Numerical Sciences AWKUM

____________________ Prof. Dr. Mushtaq Ahmad Department of Physics


Acknowledgements

Firstly, I would like to thank God Almighty for the blessings bestowed upon me during this research work. He provided me the strength, determination and knowledge to accomplish my MS research work. It is a special blessing of God Almighty to have a coterie of talented teachers, praying parents and loving friends, who all helped to make my endeavors a success.

I would wholeheartedly thank my supervisor Dr. Maqsood Hayat, whose sincere guidance and efforts were always a source of inspiration and strength for me. In addition, it was his close assistance that helped me to overcome the technical hurdles I faced during my research. Further, his encouraging and kind attitude during down times was a real motivation to achieve the set milestones. Thus, I feel most fortunate to have worked with Dr. Maqsood Hayat.

I would also like to express my deepest gratitude to my loving parents and siblings, whose prayers and support made this work a success.

I am very thankful to the Chairman, Department of Computer Science, and all the faculty members for their kind support, guidance and precious comments from time to time during this research.

Shahid Akbar


List of Publications

JOURNAL PUBLICATIONS

2015

Shahid Akbar, Ashfaq Ahmad, Maqsood Hayat, Faheem Ali, Face Recognition Using Hybrid Feature Space in Conjunction with Support Vector Machine, J. Appl. Environ. Biol. Sci., 5(7), pp: 2836, 2015

2014

Shahid Akbar, Ashfaq Ahmad, Maqsood Hayat, Identification of Fingerprint Using Discrete Wavelet Transform in Conjunction with Support Vector Machine, IJCSI International Journal of Computer Science Issues, Vol. 11, Issue 5, No 1, 2014.

2014

Shahid Akbar, Ashfaq Ahmad, Maqsood Hayat, Iris Detection by Discrete Sine Transform Based Feature Vector Using Random Forest, J. Appl. Environ. Biol. Sci., 4(8S), pp: 19-23, 2014

CONFERENCE PUBLICATION

2015

Shahid Akbar, Zaheer Ullah, Ashfaq Ahmad, Maqsood Hayat, Face Recognition Using Principal Component Analysis Based Feature Space by Incorporating with probabilistic neural network, Int’l Conf. of computational and Social Science, UTM, Malaysia, 2015. (Accepted)


Abstract: Face recognition is one of the challenging problems in the area of pattern recognition and machine learning. Automated detectors based on face recognition are applicable in different public areas for security purposes, access control, public security, desktop login and many more. Owing to the vagueness and intricacy of face patterns, more effort is needed to enhance the quality of face recognition. For this purpose, we propose a robust and reliable computational model for face recognition. In this work, transform approaches such as the discrete sine transform (DST) and discrete wavelet transform (DWT) are utilized to extract decomposed features from the face images. Furthermore, numerical descriptors are obtained from the face images using local phase quantization (LPQ) and local binary pattern (LBP). To obtain unique and reliable features, minimum redundancy maximum relevance (mRMR) is used to eliminate irrelevant, noisy and redundant features. Various classification algorithms such as K-nearest neighbor (KNN), support vector machine (SVM) and probabilistic neural network (PNN) are utilized. The Stanford University Medical Students (SUMS) facial dataset and a 10-fold cross-validation test are used to evaluate the performance of the classification algorithms. The proposed model achieved quite promising performance, with 92.30% accuracy. The proposed model achieved the highest success rate in comparison with existing methods in the literature that use the SUMS facial dataset. This achievement is ascribed to the discrimination power of the hybrid space and SVM. It is anticipated that the proposed computational model might be helpful for academia and researchers in face detection and recognition.


Table of Contents

Abstract
Table of Contents
List of Figures
List of Tables
List of Abbreviations
1. Introduction
1.2. Research Objectives
1.3. Structure of the Thesis
2. Literature Survey
2.1. Transform based Methods
2.2. Statistical Based Approaches
2.3. Detection and Localization Techniques
2.4. Neural Network schemes
3. Material and Methods
3.1. Dataset
3.2. Feature Extraction Techniques
3.2.1. Discrete Wavelet Transform
3.2.2. Discrete Sine Transform
3.2.3. Local Phase Quantization
3.2.4. Local Binary Pattern
3.3. Feature Selection Schemes
3.3.1. Minimum Redundancy Maximum Relevance
3.4. Classification Algorithms
3.4.1. Support Vector Machine
3.4.2. Probabilistic Neural Network
3.4.3. K-Nearest Neighbor
3.5. Proposed System
3.5.1. Accuracy
3.5.2. Sensitivity and Specificity
3.5.3. Mathew’s Correlation Coefficient
4. Results and Discussion
4.1. Prediction performance using DWT Feature Vector
4.2. Prediction performance using DST Feature Vector
4.3. Prediction performance using LBP Feature Vector
4.4. Prediction performance using LPQ Feature Vector
4.5. Prediction performance using Hybrid Features Vector
5. Conclusions
References


List of Figures

Figure 1.1 Face Recognition Model
Figure 3.1 Sample faces of SUMS Dataset
Figure 3.2 Image Filtering Using DWT
Figure 3.3 Image Decomposition Using DWT
Figure 3.4 Local Phase Quantization Algorithm
Figure 3.5 The basic LBP operator
Figure 3.6 LBP based face description
Figure 3.7 Classification of SVM
Figure 3.8 General structure of PNN
Figure 3.9 Classification of KNN
Figure 3.10 Proposed Model for Face Recognition
Figure 4.1 Performance analysis of DWT
Figure 4.2 Performance analysis of DST
Figure 4.3 Performance analysis of LBP feature Vector
Figure 4.4 Performance analysis of LPQ feature Vector
Figure 4.5 Performance analysis of Hybrid feature space


List of Tables

Table 4.1 Success rates of Classifiers on DWT feature vector
Table 4.2 Success rates of Classifiers on reduced DWT feature vector
Table 4.3 Success rates of Classifiers on DST feature vector
Table 4.4 Success rates of Classifiers on reduced DST feature vector
Table 4.5 Success rates of Classifiers on LBP feature vector
Table 4.6 Success rates of Classifiers on reduced LBP feature vector
Table 4.7 Success rates of Classifiers on LPQ feature space
Table 4.8 Success rates of Classifiers on reduced LPQ feature space
Table 4.9 Success rates of Classifiers on Hybrid feature vector
Table 4.10 Success rates of Classifiers on reduced feature vector


List of Abbreviations

ANN  Artificial neural network
BPNN  Back propagation Neural Network
BPSO  Binary Particle Swarm Optimization
CGM  Constrained Generative Model
DCT  Discrete Cosine transform
DFT  Discrete Fourier transform
DST  Discrete sine transform
DWT  Discrete Wavelet Transform
FFT  Fast Fourier Transform
FDA  Fisher discriminant analysis
GFC  Gabor–fisher classifier
HPF  High pass filter
ICA  Independent Component Analysis
KNN  K-nearest neighbor
KLT  Karhunen-Loeve transform
LBP  Local binary pattern
LDP  Local Derivative pattern
LFA  Local Feature analysis
LPQ  Local Phase Quantization
LPF  Low Pass Filter
mRMR  Minimum Redundancy Maximum Relevance
PCA  Principal component analysis
RBF  Radial Basis Function
PNN  Probabilistic Neural Network
STFT  Short-term Fourier transform
SVD  Singular Value Decomposition
SVM  Support Vector Machine


1. Introduction


In this chapter, we briefly discuss the necessary background of biometric technology and describe the face recognition system as a pattern recognizer. Biometrics is the science of measuring and analyzing the biological features of the human body. The word "biometrics" is derived from the Greek words bios (life) and metrikos (measure) (Li, 2009). Biometric systems extract behavioral and physiological characteristics from humans that are used to identify an individual (Jain et al., 2006). The biological characteristics of humans are universal, permanent and distinctive (Jain et al., 2004). Recognition systems based on physical characteristics cover various areas such as fingerprint, iris, face and voice recognition. Given the robustness and efficiency of biometric recognition systems, the need for automated systems for security purposes is increasing rapidly (Wayman et al., 2005). Biometric systems are used for authentication and security in airports, multiplexes and surveillance systems (Sonkamble et al., 2010). Traditional methods such as password-based authentication, PIN numbers and swipe cards have several deficiencies, and they are being replaced by the unique and reliable features of biometric technology (Bhattacharyya et al., 2009). Automated biometric systems are flexible, fast and user-friendly. Given the importance of biometric systems for security and authentication, an efficient and reliable system for face recognition is needed. Face recognition has been a major research focus for the past few decades. A variety of statistical, transformation-based, local and geometrical approaches have been applied to recognize faces.
Face recognition is an intelligent system that collects convenient and persistent features from the face image to accurately identify an individual (Linge and Pawar, 2014). The reliable outcomes of face recognition systems used in public security and law enforcement have made it one of the most reliable biometric technologies compared with others. It has been found that physiological approaches based on the face deliver high performance with low intrusiveness (Lin, 2000). Face recognition models have been proposed using both supervised and unsupervised schemes. In the case of supervised learning algorithms, satisfactory results are obtained only when the face images are in a controlled state, i.e. with a static background, no orientation changes and static lighting (Intrator et al., 1996, Mathur et al., 2008). Recognition rates are affected under uncontrolled conditions, such as changes in lighting, illumination and facial expression (Toth, 2005). These challenges have been widely observed and studied by researchers in pursuit of a perfect recognition system. Owing to the vagueness and intricacy of face patterns, it is time-consuming and expensive for conventional approaches to classify humans using face images. Therefore, a computational model is needed to enhance the quality of face recognition. To develop a robust and reliable computational model for face recognition, a domain-related benchmark dataset is required. After acquiring images from the dataset, the next step is to isolate the face region from the face image, for which different localization and geometrical methods are required (Gross and Brajovic, 2003). The next step is to extract numerical descriptors from the face region in a way that enhances generalization capability. Researchers have applied different feature extraction schemes, such as template matching, Eigenfaces (Brunelli and Poggio, 1993), transformation (Belhumeur et al., 1997) and knowledge-based techniques (Brunelli and Poggio, 1993, Belhumeur et al., 1997, Wadkar and Wankhade, 2012, Zhang and Lenders, 2000). After feature extraction, preprocessing is used to remove unnecessary and noisy features from the obtained feature space. The preprocessing technique brings the obtained feature vector into the same range [0, 1]. After preprocessing, the next step is feature selection (Serre et al., 2000). In some cases the extracted feature vector contains too many features, which degrades the performance of the proposed model; feature selection is therefore applied to reduce the dimensionality of the obtained feature space.
The feature selection algorithm selects highly discriminative and unique features and minimizes redundant features (Guyon and Elisseeff, 2003). The last phase in a face recognition system is classification. The classification phase has two modes of operation: training mode and testing mode. The training process constructs a model on the extracted feature vectors (Langford, 2005). In testing mode, features from the faces to be recognized are matched against the previously generated model. The recognition rate of the proposed model is measured using different classification learners such as K-Nearest Neighbor, Support Vector Machine and Probabilistic Neural Network (Zhang et al., 2006). Furthermore, different cross-validation tests can be employed to estimate the performance of the classification algorithms more reliably (Bengio and Grandvalet, 2004). A general model for face recognition is shown in Figure 1.1.
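The preprocessing, training/testing and cross-validation steps described above can be illustrated with a minimal sketch. It is not the implementation used in this thesis: the function names, the toy feature vectors and the choice of a 1-nearest-neighbor classifier are assumptions made only for illustration. For brevity the scaling is applied to the whole toy set before splitting; a stricter protocol would fit the scaling on each training fold only.

```python
import random


def min_max_scale(rows):
    """Rescale every feature column to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def nearest_neighbor_predict(train_x, train_y, query):
    """Return the label of the training vector closest to query (1-NN)."""
    best_label, best_dist = None, float("inf")
    for x, y in zip(train_x, train_y):
        dist = sum((a - b) ** 2 for a, b in zip(x, query))
        if dist < best_dist:
            best_label, best_dist = y, dist
    return best_label


def k_fold_accuracy(features, labels, k=10, seed=0):
    """Estimate accuracy by k-fold cross-validation with a 1-NN classifier."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        # Training mode: the "model" of 1-NN is simply the stored training set.
        train_x = [features[i] for i in indices if i not in held_out]
        train_y = [labels[i] for i in indices if i not in held_out]
        # Testing mode: classify each held-out sample against that model.
        for i in fold:
            if nearest_neighbor_predict(train_x, train_y, features[i]) == labels[i]:
                correct += 1
    return correct / len(features)


# Hypothetical two-class toy data with features on very different scales.
toy_features = ([[1.0 + 0.1 * i, 100.0 + i] for i in range(10)]
                + [[5.0 + 0.1 * i, 500.0 + i] for i in range(10)])
toy_labels = [0] * 10 + [1] * 10
print(k_fold_accuracy(min_max_scale(toy_features), toy_labels, k=10))
```

Scaling matters here because distance-based learners such as KNN would otherwise be dominated by the feature with the largest numeric range.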

Figure 1.1 Face Recognition Model

1.2. Research Objectives

In the last few decades, face recognition has become a focus area for researchers seeking to identify individuals. Researchers have made tremendous progress through pattern recognition approaches, but the problem still demands more attention and exploration. For this purpose, we have introduced and developed a comparative computational model for face recognition. The proposed model is trained in an effective way that can enhance its generalization capability.

The main objectives of our proposed work are stated as follows:
• To implement a face recognition system based on local pattern and transformation algorithms.
• To achieve higher recognition accuracy than existing approaches.
• To train the model on face images so that it can effectively recognize novel images.
• To enable the proposed model to classify gender.
• To compare the performance of the individual feature extraction techniques and the hybrid technique.

1.3. Structure of the Thesis

The present research work is organized as follows. In Chapter 2, we present a thorough review of the work related to face recognition. The literature review shows that researchers have proposed and developed a variety of approaches to build a reliable automatic system. In Chapter 3, we explore all the techniques, algorithms and materials used in the proposed model. The methods discussed in this part include dataset acquisition, feature extraction, feature selection and classification algorithms. In the feature extraction step, we explain the transformation and local feature based approaches that are applied to obtain a feature vector. The feature selection technique is used to minimize irrelevant and redundant features. Finally, the different classification algorithms utilized to measure the performance of the proposed model are explained. In Chapter 4, the results and discussion are presented, and the recognition rates achieved by each method are fully explained. Chapter 5 concludes the proposed face recognition model.


2. Literature review


In this chapter, we present a brief description of the face recognition models that are used for facial detection and recognition and are relevant to our proposed method. These models are grouped into the following main methods (Samra et al., 2003).

2.1. Transform based Methods

Transform based techniques are extensively used for face recognition. Among these methods, the Fourier transform (Samra et al., 2003), discrete wavelet transform (Ramesha and Raja, 2011), Hadamard transform (Sadykhov et al., 2004), Karhunen-Loeve transform (Suarez, 1991), discrete sine transform (Yahia, 2008), singular value decomposition (Zeng, 2007) and discrete cosine transform (Chadha et al., 2011) are most commonly employed by researchers to develop reliable face recognition systems. Hafed et al. proposed a holistic approach for facial recognition in which the discrete cosine transform (DCT) is applied to extract features from normalized face images (Hafed and Levine, 2001). A comparative analysis of the Karhunen-Loeve transform (KLT) and DCT is also performed. Similarly, Nayak et al. combined the DCT and discrete wavelet transform (DWT) algorithms using the ORL face database (Nayak and Sharma). Support Vector Machine (SVM) and artificial neural network (ANN) classifiers are adopted to evaluate the proposed model. The classification results show that SVM achieved considerably higher performance than ANN. Variation in illumination and pose at certain angles is a challenging area of face recognition, and these issues can degrade the performance of a face recognition model. For this purpose, Yaji et al. proposed a novel hybrid model that fuses the feature spaces of DWT and Intensity Mapped Un-sharp Masking (Yaji et al., 2012). A Binary Particle Swarm Optimization (BPSO) based feature selection algorithm is utilized to reduce the dimensions of the extracted feature vector (Zahran and Kanaan, 2009). Similarly, Zhao et al. minimized the effect of illumination by employing singular value decomposition (SVD) and Fast Fourier Transform (FFT) algorithms (Zhao et al., 2012). FFT converts the image into the frequency domain, and texture information is obtained from the phase spectrum and amplitude spectrum computed in that domain. SVD then yields three matrices containing both phase and amplitude spectrum descriptors (Drineas et al., 2004). The proposed techniques were found to achieve a better success rate in the presence of poor and complex image backgrounds. In contrast, Biswas et al. extracted useful features by applying the Coiflet packet and Radon transforms (Biswas and Biswas, 2012). The KNN algorithm with Euclidean and Mahalanobis distances is applied for classification (Zhang and Pan, 2011). The proposed model achieves remarkable accuracy in the presence of illumination and pose variations. Furthermore, Savitha et al. detected the face region in images using the Kanade-Lucas-Tomasi (KLT) and S-PCA algorithms (Savitha and Kumar, 2014). S-PCA locates the face by calculating the energy ratio of odd and even symmetrical principal components using a modified PCA (Ilin and Raiko, 2010). After detection, the KLT approach is applied to track the movement of the detected objects across video frames (Al-Najdawi et al., 2012). The ensemble feature space enhanced the accuracy of the proposed approach. Recently, Varun et al. presented a novel block-wise Hough transform feature extraction scheme to deal with the illumination variation problem (Varun et al., 2015). BPSO is used to select the optimal feature space (Zahran and Kanaan, 2009). The proposed model achieved an outstanding success rate on various face databases.
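To make the idea of transform based feature extraction concrete, the following is a minimal sketch of a two-dimensional DCT whose low-frequency coefficients are kept as a feature vector. It is a generic orthonormal DCT-II written for illustration, not the implementation of any of the cited works; the function names and the choice of keeping a square top-left block are assumptions.

```python
import math


def dct_1d(signal):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(signal)
    out = []
    for k in range(n):
        total = sum(
            signal[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)
        )
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


def dct_2d(image):
    """Separable 2-D DCT-II: transform every row, then every column."""
    rows = [dct_1d(row) for row in image]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    # cols is indexed [column][row]; transpose back to [row][column].
    return [list(r) for r in zip(*cols)]


def low_frequency_features(image, size):
    """Keep the top-left size x size block of DCT coefficients as features."""
    coeffs = dct_2d(image)
    return [coeffs[u][v] for u in range(size) for v in range(size)]


# Hypothetical 4 x 4 grey-level patch standing in for a normalized face image.
patch = [
    [52, 55, 61, 66],
    [63, 59, 55, 90],
    [62, 59, 68, 113],
    [63, 58, 71, 122],
]
features = low_frequency_features(patch, 2)
print(len(features))
```

Most of the energy of a smooth image concentrates in the low-frequency corner of the coefficient matrix, which is why a small block of coefficients can serve as a compact descriptor.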

2.2. Statistical Based Approaches

Different statistical approaches for automatic face recognition and detection have been proposed over the last few decades (Al-Najdawi et al., 2012). Recognition systems developed using statistical attributes are considered more accurate and precise because of their unique features (Lanitis et al., 2002). A large number of computational models have been proposed using different statistical properties such as correlation, covariance, standard deviation and Shannon entropy (Matthews et al., 2002). Principal component analysis (PCA) is one of the most commonly used statistical approaches for face recognition (Turk and Pentland, 1991). Hoang Le et al. used two-dimensional PCA to extract features from the FERET and AT&T databases (Turk and Pentland, 1991). SVM and KNN are utilized to measure the performance of the obtained feature vector. The proposed model was found to perform better using SVM. Yilmaz et al. presented a novel preprocessing approach called “Eigen Hills” (Yilmaz and Gökmen, 2001). Features are also obtained from face edges and the face region using the Eigen face and Eigen Edge techniques (Manjhi et al., 2013). The proposed technique performs very well on images with a static background, but the recognition rate drops under expression variation because face edges are sensitive to such variation (Braje et al., 1998). In contrast, Liu et al. proposed the Gabor-Fisher classifier (GFC) for face recognition (Liu and Wechsler, 2002). A Gabor based feature space is derived from face images using the Gabor wavelet representation. The extracted feature vector is reduced through the Enhanced Fisher linear discriminant model (EFM) (Liu and Wechsler, 1998). The proposed model achieved superior results using the FERET frontal database. Face recognition becomes difficult under uncontrolled conditions such as illumination change and pose variation (Toth, 2005). To deal with such challenging issues, Chen et al. presented a DCT based normalization approach (Chen et al., 2006). Illumination variation is observed to lie mainly in the low-frequency band, so DCT coefficients are employed in the logarithm domain to minimize the variation. Traditional PCA-based extraction techniques suffer from high computational load and low discrimination power. Therefore, Yang et al. developed a real-time face recognition system by merging PCA and linear discriminant analysis (LDA) (Yang et al., 2000). PCA is used to reduce the dimensionality, after which LDA is employed to enhance the discrimination power. Lu et al. proposed a new variant of linear discriminant analysis for face recognition, “D-LDA” (Lu et al., 2003). D-LDA efficiently eliminates the null space of the between-class scatter matrix (Xu and Lee, 2012). It is found that the linear model is more robust to noise than the non-linear model.
Similarly, Singh et al. carried out a comparative analysis of different algorithms on the ATT and Indian face databases (Singh). It is observed that LDA achieved better overall accuracy on both databases. Thakur et al. presented Fisher linear discriminant analysis (FLDA) and SVM to recognize faces. The proposed model obtained the highest success rate on the ORL database (Thakur et al., 2009). In contrast, Yang et al. applied kernel PCA and kernel Fisher linear discriminant analysis to extract features (Yang, 2002). The error rates of the kernel Fisher face and kernel Eigen face methods were found to be comparatively lower than those of traditional approaches. Liu et al. proposed an independent Gabor approach for face recognition (Liu and Wechsler, 2003). The dimension of the Gabor feature vector is reduced using PCA, and independent component analysis (ICA) is then applied to obtain independent Gabor features (Draper et al., 2003). A novel classification learner, namely the Probabilistic Reasoning Model (PRM), is utilized to evaluate the recognition rate (Dora et al., 2013). Similarly, Shinohara et al. introduced a hybrid approach to obtain higher-order local autocorrelation (HLAC) features and Fisher weight maps (Shinohara and Otsu, 2004). HLAC features are extracted at each pixel of an image, and the feature vector is formed using the Fisher weight map (Kurita et al., 1997). Fisher discriminant analysis (FDA) is applied to recognize faces (Liu et al., 2002). Manglik et al. split the transformed face image into an upper half and a lower half (Manglik et al., 2004). The upper half tracks the information of the eyes and eyebrows, and the lower half locates the position of the nose, mouth, etc. A neural network is used to evaluate the extracted information (Ou and Murphey, 2007). Furthermore, Cevikalp et al. proposed a model for the small sample size problem using a discriminative common feature vector (Cevikalp et al., 2005). The discriminative approach achieved outstanding results in recognizing novel images. Kadam et al. presented a novel bias-variance based face recognition approach (Kadam). The confidence interval is determined by calculating the smooth interval values for each image (Yan et al., 2008). Thiyagarajan et al. combined the statistical features obtained using ICA models, and PCA is applied to reduce the dimension of the feature spaces (Thiyagarajan et al., 2010). The features are further normalized through unit length and zero mean and unit variance normalization. A static Bayesian network is presented by Heusch et al. for authentication purposes (Heusch and Marcel, 2010). The comparative analysis reveals that the proposed face model achieved better discrimination power than existing generative models.
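Several of the approaches surveyed above rely on PCA to reduce dimensionality. As an illustration only, the sketch below approximates the first principal component of a small data set by power iteration on the sample covariance matrix and projects the samples onto it. The function names, the starting vector and the fixed iteration count are assumptions; practical systems normally use an eigen-decomposition or SVD routine from a numerical library.

```python
import math


def center(rows):
    """Subtract the per-feature mean from every sample."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n = len(centered)
    d = len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_principal_component(rows, iterations=200):
    """Approximate the leading eigenvector of the covariance matrix
    by power iteration."""
    cov = covariance(center(rows))
    d = len(cov)
    # Asymmetric start (1, 2, ..., d); power iteration fails only if the
    # start is exactly orthogonal to the leading eigenvector.
    vec = [float(i + 1) for i in range(d)]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            break
        vec = [v / norm for v in nxt]
    return vec


def project(rows, component):
    """Project the centered samples onto one component (1-D features)."""
    return [sum(v * c for v, c in zip(row, component)) for row in center(rows)]


# Hypothetical 2-D samples lying roughly along the diagonal.
samples = [[2.0, 1.9], [0.5, 0.7], [3.1, 3.0], [1.2, 1.0], [2.4, 2.6]]
component = first_principal_component(samples)
print(project(samples, component))
```

Keeping only the projections onto the leading components is the dimensionality reduction step that methods such as PCA followed by LDA build on.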

2.3. Detection and Localization Techniques

Face detection and localization is a key problem in the area of pattern recognition and image processing (Lu et al., 2012). Face detection through localization extracts reliable and unique features from different regions of the face, such as the positions of the eyes, eyebrows, nose and mouth (Patel and Yagnik). Over the past several years, scientists have proposed a variety of models to develop automated systems for face detection using local feature based techniques (Singh et al., 2012). Lee et al. used the local feature analysis (LFA) approach for face recognition (Lee et al., 2005). Local structure information is extracted from images and combined into a composite form. Kisku et al. proposed a local template matching approach using scale invariant feature transform (SIFT) based features (Kisku et al., 2009). The SIFT algorithm is applied to extract individual salient facial features from different regions such as the eyes and mouth (Aly, 2006). The obtained features are merged using Dempster-Shafer decision theory (Beynon et al., 2000). Similarly, Jemaa et al. presented local Gabor schemes for face detection and recognition (Jemaa and Khanfir, 2009). Valuable features are extracted from different face regions such as the eyes, mouth and nose through a geometrical algorithm. A neural network approach is utilized to evaluate the performance of the proposed model (Atkinson and Tatnall, 1997). Local binary pattern (LBP) has received the attention of researchers for facial detection because of its persistent and reliable features (Ojala et al., 2002). LBP is a non-parametric technique that assembles efficient information about the local structure of a face image and recognizes an individual based on the collected features (Huang et al., 2011b, Shan et al., 2009). Zhang et al. extracted local features from face images for recognition (Zhang et al., 2005a). First, the multiple face classes are converted into a binary problem by categorizing them into intra-personal and extra-personal classes. LBP histogram based local descriptors are obtained from the images, and the similarity of faces is matched using the AdaBoost algorithm (Song et al., 2009, Ahonen et al., 2006). The proposed method achieved a success rate of 97.9%. Similarly, Ersi et al. used local feature analysis (LFA) to extract high-deviation features (Ersi and Zelek, 2006). Gabor based histogram features are gathered to match and identify face images.
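As a concrete illustration of the LBP histogram descriptors mentioned above, the following minimal sketch computes the basic 3 x 3 LBP code of each interior pixel and accumulates a 256-bin histogram. It is a generic textbook form of the operator, not the implementation of the cited works; the function names, the clockwise bit ordering and the use of "greater than or equal" for thresholding are assumptions, and other conventions exist.

```python
def lbp_code(image, r, c):
    """8-bit LBP code of pixel (r, c): each of the 8 neighbours contributes
    a 1 bit when its value is >= the centre value."""
    center = image[r][c]
    # Clockwise neighbour offsets starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


# Hypothetical 3 x 3 grey-level patch with a single interior pixel.
patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
print(lbp_code(patch, 1, 1))
```

In face description, such histograms are typically computed over several image regions and concatenated, so that the final vector keeps some spatial layout of the local patterns.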
The robustness of the proposed model is tested against variations in facial expression and illumination. In contrast, Chen et al. used a hybrid feature space based on Gabor and Haar transforms (Chen et al., 2005). The effectiveness of the proposed hybrid features is evaluated using different face databases. Face detection using statistical methods is a challenging area for researchers because of generalization problems. To reduce these generalization issues, Zhang et al. developed a non-statistical technique, namely the Local Gabor Binary Pattern Histogram Sequence (Zhang et al., 2005b). Local histograms are obtained using LBP in conjunction with Gabor magnitude, and the extracted histograms are finally concatenated to form a single feature vector. K-nearest neighbor is employed to compute the similarity rate based on the obtained descriptors. Owing to the unique spatial locality and orientation of Gabor filters, Huang et al. extracted Gabor features using various filters (Huang et al., 2005). A preprocessing technique is used to decrease the variation in background lighting and orientation. The performance of the proposed model is measured using a polynomial neural network (Misra et al., 2006). Heisele et al. located facial components in face images (Heisele et al., 2003). Zhang et al. used the local derivative pattern (LDP) to obtain local descriptors (Zhang et al., 2010). LDP extracts high-order local features using distinctive spatial relationships in the local region of the image (Raju et al., 2010, Zhang et al., 2010). The proposed LDP achieved better performance than LBP on various face databases. Lei et al. presented a face recognition model using multi-scale local phase quantization (MLPQ) (Lei and Li, 2012). After extraction with the fast MLPQ technique, highly discriminative features are selected. Similarly, Zhang et al. proposed a model for facial expression (Zhang et al., 1998). Multi-scale and multi-orientation information is extracted from the face image using Gabor wavelet coefficients (An et al., 2011). Geometric features are also obtained. The classification measures are evaluated using a two-layer perceptron (Silva et al., 2008). Experimental results show that Gabor wavelet features achieved a better success rate than the geometric features. Kanan et al. applied the Pseudo Zernike Moment Invariant approach to extract high-order features from the face region (Kanan et al., 2006). Genetic algorithm based feature selection is employed to produce an optimized feature vector (Li et al., 2010). The performance of the proposed approach is evaluated using SVM.
Shastri et al. used Non-Negative Sparse Coding (NNSC), a part-based method, for face recognition (Shastri and Levine, 2007). It is found that the proposed NNSC achieved an outstanding recognition rate in comparison with other part-based techniques. Likewise, Jones et al. proposed novel rectangle-based features for face detection (Jones and Viola, 2003). The features are obtained at different locations, scales, and orientations. The computational model is trained using the AdaBoost algorithm on FERET faces. Combining the spatial and frequency domains provides more useful information than applying them individually (Buades et al., 2005). In this regard, Yuan et al. presented a hybrid feature vector of LBP and local phase quantization (LPQ) for face recognition (Yuan et al., 2012). Spatial-domain and frequency-domain information are extracted using LBP and LPQ, respectively.

.2.4. Neural Network schemes

Artificial Neural Networks (ANN) is a learning procedure that trains and predicts objects based on past experiences. ANN models are programmed with rules in conventional artificial intelligence (Peterson et al., 1994). ANN is a straightforward technique (Kashyap and Yadav) composed of “neurons” that transform input into output through weight adjustment on training data. Neural network approaches are used for image matching, feature extraction (Zeng and Martinez, 2003), noise reduction (Patil and Narwade), and classification (Cheung and Cannons, 2003). Feraud et al. applied the Constrained Generative Model (CGM), a neural-network-based model, for face detection (Feraund et al., 2001). CGM is a learning technique that calculates probabilities over the data, and the model is trained by adjusting different threshold values (Raphaël et al., 1997). It is found that, due to the low computational cost and robust features, the recognition rate of the proposed model is improved. Similarly, Lin et al. proposed the probabilistic decision-based neural network (PDBNN) technique for face recognition (Lin et al., 1997). PDBNN detects the face region in an image and then locates the positions of the eyes, nose and eyebrows to obtain a feature vector. The proposed approach is evaluated using the ORL, FERET and OSR face databases. Haddadnia et al. applied Zernike moments, pseudo Zernike moments (PZM) and Legendre moments to extract features (Haddadnia et al., 2001). A radial basis function network is utilized as the classification learner. It is observed that high-order PZM contain more valuable facial expression information than low-order PZM. Garcia et al. presented a neural architecture that detects face regions rotated by different degrees (Garcia and Delakis, 2004). The model is trained on face and non-face patterns.
The robustness of the model is also tested against different facial expressions and orientations. In contrast, Huang et al. used a pose-invariant scheme to recognize faces (Huang et al., 2000). Features are extracted using view-specific eigenface analysis. An ensemble neural network algorithm is applied to measure the success rate of the proposed method. Likewise, Latha et al. implemented a robust neural network algorithm to detect frontal views of faces (Latha et al., 2009). PCA is employed to reduce the dimensionality of the extracted feature vector. A back-propagation neural network is used to recognize faces. Er et al. proposed a model using DCT, FLD and RBF neural networks (Joo Er et al., 2005). DCT is used to eliminate low frequency coefficients (Hafed and Levine, 2001). FLD is applied to the remaining coefficients to extract more invariant and discriminative features (Thakur et al., 2009). Finally, an RBF neural network is trained on the samples. Furthermore, Zhao et al. applied a novel multi-feature scheme using a neural networks committee (NNC) for face recognition (Zhao et al., 2004). NNC is a combination of several independent neural networks that operate on blocks of the image (Köker et al., 2014). The recognition rate of the multi-feature scheme is compared with that of a single feature domain, and the proposed multi-feature method is found to perform better. Rowley et al. used a bootstrap algorithm to minimize the training feature set (Rowley et al., 1998). Farfade et al. used the Convolutional Neural Network (CNN) technique for face detection (Farfade et al., 2015). The face region is described using a novel method called the Deep Dense Face Detector. They showed that the proposed technique can detect faces from different angles and can also deal with occlusion. Nandini et al. extracted local features from the face by calculating the distance between the eyes and mouth (Nandini et al., 2013). A back-propagation neural network and an RBF kernel are utilized to measure the performance of the feature vector. Similarly, Li et al. proposed a face detector using a CNN cascade (Li et al., 2015). The CNN is used to obtain local information by rapidly eliminating non-facial regions. The proposed predictor achieved robust performance in the presence of large visual variations.
Quraishi et al. combined spatial-domain and frequency-domain techniques to extract the region of interest and to calculate statistical features (Quraishi et al., 2013). A feed-forward back-propagation neural network is applied to classify faces based on the selected feature vector (Srinivas et al., 2012). Furthermore, Jin et al. presented a comparative analysis of single-task learning (STL) and multi-task learning (MTL) employing a back-propagation neural network (Jin and Sun, 2008). MTL is an inductive transfer approach that enhances the generalization capability using domain features (Devries et al., 2014). Boughrara et al. used the Multi-Layer Perceptron (MLP) algorithm to train the model (Boughrara et al., 2014). The performance of the proposed model is evaluated using three different facial expression databases, namely GEMEP FERA 2011, FER-2013 and Cohn-Kanade (Boughrara et al., 2014).


3. Material and Methods

In this chapter, we explain the proposed methodology: the dataset, feature extraction, feature selection, and the classification algorithms used to evaluate the performance of the proposed model. In the feature extraction part, we explain both transformation-based and local-feature-based algorithms. Next, the feature selection scheme is discussed. Feature selection is the process of selecting the optimum feature space that is free from irrelevant and redundant features (Hemalatha Gayatri and Govindan). Feature selection reduces the dimensionality of the extracted feature vector to minimize the computational load. In the last section, the classification algorithms are described. The classification learners are trained and assessed with cross-validation tests in order to measure the performance of the proposed model.

3.1. Dataset

In order to develop a computational model, the first step is to acquire a suitable benchmark dataset. In this regard, the Stanford University medical students (SUMS) face dataset is used to analyze and evaluate the performance of the proposed model [135]. The SUMS dataset contains 400 face images that are used for training and testing the model. The dataset is equally divided into male and female classes, each containing 200 images. All the images are stored in JPEG format. Sample face images from the SUMS dataset are shown in Figure 3.1.

Figure 3.1. Sample faces of SUMS Dataset

3.2. Feature Extraction Techniques

Feature extraction is a mandatory step in pattern recognition, used to obtain information through different geometrical or statistical attributes that help to recognize an individual. The performance of a prediction system depends heavily on the reliability of the features obtained in the feature extraction phase. In order to obtain prominent and reliable features from the face image, transformation-based and local-feature-based algorithms are used. This section discusses the feature extraction algorithms that are used to form a feature vector.

3.2.1. Discrete Wavelet Transform (DWT)

The discrete wavelet transform is one of the most popular image processing and machine learning tools, used for feature extraction, detection, compression and image denoising. Like the discrete Fourier transform, DWT is a transformation based on orthogonal functions. The wavelet transform has the advantages of robustness, flexibility and low computational time over other transformation methods (Pokhriyal and Lehri, 2010, Hayat et al., 2012). Its main characteristic is multi-scale analysis with resolution in both the frequency and time domains. DWT decomposes the data (images, sequences) into different frequency components while preserving the high-frequency components. DWT operates by convolving a target function with wavelet kernels to obtain coefficients at different scales and orientations (Bodade and Talbar, 2009, Chitaliya and Trivedi, 2010). Scaled and shifted versions of the wavelet function $\psi(t)$ are multiplied with the signal $f(t)$ and then summed (Hayat). The coefficients $T(x, y)$ of the signal $f(t)$ can be defined as:

$$T(x, y) = \frac{1}{\sqrt{x}} \int_{0}^{1} f(t)\, \psi\!\left(\frac{t - y}{x}\right) dt \tag{3.1}$$

where $x$ is the scale parameter, $y$ is the shift parameter, $\psi\!\left(\frac{t - y}{x}\right)$ is the wavelet function, and $t$ is the variable running over the length of the signal. In the wavelet transform, low-pass and high-pass filters are applied, with sub-sampling of the image in both the horizontal and vertical directions (Farag and Atta, 2012).


Figure 3.2 Filtering Using DWT

H and L represent the high-pass and low-pass filters, respectively, and ↓2 denotes sub-sampling by two. The filter outputs are given by equations (3.2) and (3.3):

$$a_l(n) = \sum_{k} h_{k-2n}\, a_{l-1}(k) \tag{3.2}$$

$$d_l(n) = \sum_{k} h_{k-2n}\, d_{l-1}(k) \tag{3.3}$$

where the approximation coefficients $a_l(n)$ carry the low-frequency content and are used for the next step of the transform, and the coefficients $d_l(n)$ are the wavelet coefficients that determine the output of the transform. The wavelet coefficients contain the high-frequency, low-scale components.

In this work, a three-level wavelet decomposition is used to extract significant information from face images. Statistical attributes, namely standard deviation, Euclidean distance, maximum, minimum, variance, skewness, log energy, and kurtosis, are then computed from each component of the signal to obtain highly valuable features.
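The decomposition and statistics step can be illustrated with a minimal sketch. It assumes a Haar wavelet (the text does not name the wavelet family), an image with even dimensions at every level, and only a subset of the listed statistics; the function names `haar_step`, `band_statistics` and `wavelet_features` are hypothetical and are not taken from the thesis implementation. Sub-band naming conventions also vary between authors.

```python
import math

def haar_step(image):
    """One level of a 2-D Haar DWT. Returns the (LL, HL, LH, HH) sub-bands."""
    s = 1 / math.sqrt(2)
    # Filter along each row and keep every second output (sub-sampling by two).
    low = [[(row[i] + row[i + 1]) * s for i in range(0, len(row), 2)] for row in image]
    high = [[(row[i] - row[i + 1]) * s for i in range(0, len(row), 2)] for row in image]

    def along_columns(block, sign):
        # sign=+1 gives the low-pass output, sign=-1 the high-pass output.
        return [[(block[r][c] + sign * block[r + 1][c]) * s for c in range(len(block[0]))]
                for r in range(0, len(block), 2)]

    return (along_columns(low, 1), along_columns(high, 1),
            along_columns(low, -1), along_columns(high, -1))

def band_statistics(band):
    """Maximum, minimum, variance and standard deviation of one sub-band."""
    values = [v for row in band for v in row]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return [max(values), min(values), variance, math.sqrt(variance)]

def wavelet_features(image, levels=3):
    """Concatenate statistics of all detail bands and the final approximation."""
    features = []
    approx = image
    for _ in range(levels):
        approx, hl, lh, hh = haar_step(approx)   # only LL is decomposed further
        for band in (hl, lh, hh):
            features.extend(band_statistics(band))
    features.extend(band_statistics(approx))
    return features
```

With three levels this yields nine detail bands plus one approximation band, so the sketch returns 40 values per image; the thesis uses a richer set of statistics per band.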


Figure 3.3 Image Decomposition Using DWT

Figure 3.3 illustrates the three-level wavelet decomposition of an image. At each level, DWT decomposes an image into low-frequency and high-frequency parts. The first level decomposes an image into four parts: LL1, HL1, LH1 and HH1. LL1, which holds the low-frequency coefficients, is further decomposed into the next four sub-parts, and so on.

3.2.2. Discrete Sine Transform (DST)

The discrete sine transform (DST) is a sinusoidal, unitary and separable transform developed by Jain (Kekre and Mishra, 2010). It is closely related to the discrete Fourier transform. The fundamental properties of DST, such as scaling, shifting and convolution, have been applied to extract valuable features from face images in pattern recognition and machine learning. DST is also used for data compression and image decompression (Farag and Atta, 2012). DST expresses a signal as a sum of sine functions and is used in spectral methods for the numerical solution of partial differential equations. DST is a computationally efficient and fast image transformation algorithm that produces a real and orthogonal matrix (Ying and Yang). The DST matrix is formed by arranging the pixels row-wise (Rao and Hwang, 1996).

The four types of DST are given in equations (3.4), (3.5), (3.6), and (3.7), respectively:

$$\text{DST-I:}\quad X_k = \sum_{n=0}^{N-1} x_n \sin\!\left[\frac{\pi}{N+1}(n+1)(k+1)\right], \quad k = 0, 1, \ldots, N-1 \tag{3.4}$$

$$\text{DST-II:}\quad X_k = \sum_{n=0}^{N-1} x_n \sin\!\left[\frac{\pi}{N}\left(n+\frac{1}{2}\right)(k+1)\right] \tag{3.5}$$

$$\text{DST-III:}\quad X_k = \frac{(-1)^k}{2}\, x_{N-1} + \sum_{n=0}^{N-2} x_n \sin\!\left[\frac{\pi}{N}(n+1)\left(k+\frac{1}{2}\right)\right] \tag{3.6}$$

$$\text{DST-IV:}\quad X_k = \sum_{n=0}^{N-1} x_n \sin\!\left[\frac{\pi}{N}\left(n+\frac{1}{2}\right)\left(k+\frac{1}{2}\right)\right] \tag{3.7}$$

where $N$ is the number of elements in the input signal, $x_n$ is the $n$-th input sample, $k$ is the index that selects the basis sine vector, and $X_k$ is the real-valued discrete sine transform coefficient (Jain, 1979). In pattern recognition, DST properties are used to obtain a reliable feature vector from face images. A few of them are listed below:

• DST produces only real and orthogonal matrices.

• The inverse of the DST matrix is the transpose of the original matrix.

• DST is fundamentally related to the discrete Fourier transform (DFT) (Ying and Yang).
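The orthogonality property listed above can be checked numerically. The following minimal sketch builds the DST-I matrix of Eq. (3.4) with an added scaling factor of $\sqrt{2/(N+1)}$; this normalization is an assumption (the equation in the text is unnormalized) chosen so that the matrix is orthonormal. The names `dst1_matrix` and `apply_matrix` are hypothetical helpers.

```python
import math

def dst1_matrix(n):
    """DST-I matrix S[k][m] = sqrt(2/(n+1)) * sin(pi*(m+1)*(k+1)/(n+1))."""
    scale = math.sqrt(2.0 / (n + 1))   # assumed normalization, not in Eq. (3.4)
    return [[scale * math.sin(math.pi * (m + 1) * (k + 1) / (n + 1))
             for m in range(n)]
            for k in range(n)]

def apply_matrix(matrix, signal):
    """Multiply a square matrix by a signal vector."""
    return [sum(row[m] * signal[m] for m in range(len(signal))) for row in matrix]

# The normalized DST-I matrix is real, symmetric and orthogonal, so applying
# it twice recovers the original signal (up to rounding error).
S = dst1_matrix(4)
signal = [1.0, 2.0, 3.0, 4.0]
recovered = apply_matrix(S, apply_matrix(S, signal))
```

For a two-dimensional image, a separable transform of this kind would be applied along the rows and then along the columns.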

The DST algorithm works only on finite discrete sequences. Before using DST, it must be specified whether the function is odd or even at both ends (min-n, max-n) of the domain. DST is applied only to the odd extension of the original function.

3.2.3. Local Binary Pattern (LBP)

Local Binary Pattern is one of the standard local-feature-based methods. LBP was first used in 1996 for texture analysis of gray-scale images (Chen et al., 2010). It has been widely used for facial analysis, image processing, texture analysis, etc. In comparison with other local pattern methods, LBP is considered more efficient due to its computational efficiency and robust features (Pietikäinen et al., 2011). LBP is invariant both to spatial rotations of objects and to monotonic changes in illumination and expression (Petrou and García Sevilla, 2006). A number of extensions to the basic LBP algorithm have been developed in recent years, such as Transition Local Binary Patterns, Direction-coded Local Binary Patterns, Modified Local Binary Patterns, RGB-LBP, Multi-block LBP, uniform binary patterns, non-uniform patterns, rotation-invariant patterns, etc. (Petrou and García Sevilla, 2006). LBP obtains local binary features from the face image by applying the LBP operator. The basic LBP algorithm operates on 3×3 pixel neighborhoods of the face image (Huang et al., 2011a). The central pixel of the neighborhood is used as the threshold value, which is compared with its eight neighbors. A neighbor whose value is higher than or equal to that of the central pixel is replaced by “1”, and a lower value is replaced by “0” (Maturana et al., 2009). Finally, an eight-bit binary code is obtained by concatenating the results for all the neighboring pixels. The complete structure of LBP is shown in Figure 3.4.

Figure 3.4. The basic LBP operator

After calculating the binary codes of all the pixels of an image, the extracted binary features are represented in the form of histograms, as shown in Figure 3.5 (Shan et al., 2009). Furthermore, all histograms are combined to form a single feature vector (Shan and Gritti, 2008).


The histogram of a labeled image $f_l(x, y)$ can be represented as:

$$H_{i,j} = \sum_{x,y} I\big(f_l(x, y) = i\big) \tag{3.10}$$
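The basic LBP operator and its histogram can be sketched as follows. The bit ordering (clockwise, starting from the top-left neighbor) is an assumption, since the text does not fix an ordering, and border pixels are simply skipped; `lbp_code` and `lbp_histogram` are hypothetical names used only for illustration.

```python
def lbp_code(image, r, c):
    """Eight-bit LBP code of the pixel at (r, c) from its 3x3 neighborhood."""
    center = image[r][c]
    # Eight neighbors, clockwise from the top-left pixel (assumed ordering).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbor greater than or equal to the center contributes a "1".
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist

patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
code = lbp_code(patch, 1, 1)   # bits 0, 4, 5, 6 and 7 are set, giving 241
```

In a face description pipeline, one such histogram would typically be computed per image region and the regional histograms concatenated into the final feature vector.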

In order to reduce the invariant and noisy features in the extracted feature vector, a preprocessing technique is applied. In our case, we used the mean and standard deviation to normalize the extracted features. After normalization, all the features in the feature vector are transformed into the range [0, 1].

Figure 3.5 LBP based face description

3.2.4. Local Phase Quantization (LPQ)

Local phase quantization was originally proposed by Ojansivu and Heikkilä for texture description (Aly et al., 2013). LPQ overcomes a limitation of some local pattern methods: as the neighborhood size increases, it becomes unclear whether uniform codes should be used (Tan and Triggs, 2010). It has been observed that as the length of the binary codes increases, the fraction of uniform binary codes becomes vanishingly small (Hussain et al., 2012). LPQ is a generalized form of local patterns that uses large local neighborhoods and domain-adaptive vector quantization to solve the above-mentioned problem (Hussain et al., 2012). LPQ divides the labeled image into non-overlapping rectangular blocks of equal size (Ahonen et al., 2008). Local phase information is calculated for each pixel of a block using the short-term Fourier transform, as shown in Figure 3.6. Finally, the histogram-based features are concatenated to form a feature vector (Dhall et al., 2011).

The LPQ algorithm is as follows. LPQ calculates the phase values in an $M \times M$ neighborhood $N_x$ around each pixel position $x$ of the image $f(x)$ (Zhu, 2014). The local spectra are computed using the short-term Fourier transform (STFT) defined in equation (3.8):

$$F(u, x) = \sum_{y \in N_x} f(x - y)\, e^{-j 2\pi u^{T} y} \tag{3.8}$$

The transform in Eq. (3.8) can be evaluated efficiently for all pixel positions $x \in \{x_1, x_2, \ldots, x_N\}$ using one-dimensional convolutions over the rows and columns (Zhu, 2014). The transform coefficients $F(u, x)$ for all pixels are calculated at four frequency points (Aly et al., 2013):

$$u_1 = [a, 0]^T, \quad u_2 = [0, a]^T, \quad u_3 = [a, a]^T, \quad u_4 = [a, -a]^T$$

where $a$ is a sufficiently small scalar to satisfy $H(u_i) > 0$; here $a = 1/W$, where $W$ is the size of the local filter. For each pixel position this results in a vector:

$$F(x) = [F(u_1, x), F(u_2, x), F(u_3, x), F(u_4, x)] \tag{3.9}$$

The phase information of an image can be obtained using a simple scalar quantization:

$$q_j(x) = \begin{cases} 1, & R_j(x) \ge 0 \\ 0, & \text{otherwise} \end{cases} \tag{3.10}$$

where $R_j(x)$ is the $j$-th component of the vector $R(x)$:

$$R(x) = [\operatorname{Re}\{F(x)\}, \operatorname{Im}\{F(x)\}] \tag{3.10}$$

Then the label image $I_{LPQ}(x)$ is represented as:

$$I_{LPQ}(x) = \sum_{j=1}^{8} q_j(x)\, 2^{j-1} \tag{3.11}$$

As a result, we get an LPQ feature vector containing the phase information of each pixel of the image.
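A direct, unoptimized sketch of Eqs. (3.8) to (3.11) for a single pixel is given below. It assumes a 3×3 window with $a = 1/3$, evaluates the sum of Eq. (3.8) explicitly rather than through separable convolutions, and omits the coefficient decorrelation step that published LPQ descriptions usually include before quantization. The function name `lpq_code` is hypothetical.

```python
import cmath

def lpq_code(image, r, c, window=3):
    """Eight-bit LPQ label of the pixel at (r, c), following Eqs. (3.8)-(3.11)."""
    half = window // 2
    a = 1.0 / window                                   # a = 1/W
    freqs = [(a, 0.0), (0.0, a), (a, a), (a, -a)]      # u1 .. u4
    coeffs = []
    for u_r, u_c in freqs:
        total = 0j
        for dr in range(-half, half + 1):
            for dc in range(-half, half + 1):
                # One term of Eq. (3.8): f(x - y) * exp(-j 2 pi u^T y)
                phase = -2j * cmath.pi * (u_r * dr + u_c * dc)
                total += image[r - dr][c - dc] * cmath.exp(phase)
        coeffs.append(total)
    # R(x): the four real parts followed by the four imaginary parts.
    parts = [z.real for z in coeffs] + [z.imag for z in coeffs]
    # Quantize the sign of each component and pack the bits into one label.
    return sum(1 << j for j, value in enumerate(parts) if value >= 0)

# A single bright pixel to the right of the center pixel.
impulse = [[0, 0, 0],
           [0, 0, 1],
           [0, 0, 0]]
label = lpq_code(impulse, 1, 1)
```

As with LBP, the labels of all pixels in a region would then be collected into a 256-bin histogram.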

Figure 3.6 Local Phase Quantization Algorithm

3.3. Feature Selection Scheme

Feature selection is the process of analyzing the extracted feature space. In order to enhance the generalization capability of a prediction system, feature selection algorithms are employed to obtain a compact feature subspace. Feature selection reduces the dimensionality of the feature vector by eliminating low-variance features. Instead of using the whole feature vector, a reduced and optimal feature subspace is selected to measure the performance of the proposed model. In this work, we used the mRMR feature selection technique to remove irrelevant, redundant and noisy features.

3.3.1 Minimum Redundancy Maximum Relevance (mRMR)

mRMR is a mutual-information-based dimensionality reduction technique proposed by Peng et al. (Peng et al., 2005). mRMR finds a feature subspace whose members are mutually dissimilar, forming a compact feature vector with few redundant features and maximum relevance to the target class (Lajevardi and Hussain, 2009). mRMR first calculates the mutual information (MI) between each candidate variable and the target variable to measure relevance. MI is also used to quantify statistical properties such as correlation; redundancy is measured by the average mutual information between pairs of features (Kursun et al., 2010). The mutual information of two features $X$ and $Y$ is defined as:

$$MI(X, Y) = \sum_{i,j \in N} P(X_i, Y_j) \log \frac{P(X_i, Y_j)}{P(X_i)\,P(Y_j)} \tag{3.12}$$

where $P(X_i, Y_j)$ is the joint probability distribution and $P(X_i)$, $P(Y_j)$ are the marginal probabilities. The minimum redundancy condition of the extracted features can be calculated using the following equation:

$$\min(mR) = \frac{1}{|S|^2} \sum_{X, Y \in S} MI(X, Y) \tag{3.13}$$

where $S$ represents the feature set and $|S|$ is the total number of features in $S$. Similarly, the maximum relevance condition maximizes the relevance between the features of $S$ and the target classification variable $Z$:

$$\max(MR) = \frac{1}{|S|} \sum_{X \in S} MI(X, Z) \tag{3.14}$$

After calculating the minimum redundancy and maximum relevance, both conditions are combined to obtain a single feature subspace as follows:

$$\max(\nabla MI) = \frac{MR}{mR} \tag{3.15}$$

It has been reported that mRMR achieves better performance on categorical data. When a continuous feature space is used, the features must first be converted into a discrete feature space to reduce noisy features (Chin et al., 2013).
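The selection procedure can be sketched as a greedy search that uses the quotient criterion of Eq. (3.15). This is a minimal illustration for already discretized features; the incremental (one feature at a time) strategy, the small constant that guards against division by zero, and the names `mutual_information` and `mrmr_select` are assumptions rather than details given in the text.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information of two discrete sequences, as in Eq. (3.12)."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in joint.items())

def mrmr_select(features, target, k):
    """Greedily pick k feature names maximizing relevance / redundancy.

    features: dict mapping a feature name to its list of discrete values.
    """
    relevance = {name: mutual_information(col, target)
                 for name, col in features.items()}
    selected = [max(relevance, key=relevance.get)]     # most relevant feature first
    while len(selected) < k:
        best, best_score = None, -math.inf
        for name, col in features.items():
            if name in selected:
                continue
            # Average MI with the features chosen so far (redundancy).
            redundancy = sum(mutual_information(col, features[s])
                             for s in selected) / len(selected)
            score = relevance[name] / (redundancy + 1e-12)
            if score > best_score:
                best, best_score = name, score
        selected.append(best)
    return selected
```

For example, if a feature and an exact copy of it are both offered together with a second, independent feature that is equally relevant, the copy is penalized for its redundancy and the independent feature is chosen instead.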

3.4 Classification Algorithms

In pattern recognition and machine learning, a variety of classification algorithms are used to categorize data into specific classes, and they are evaluated using different cross-validation tests. Both supervised and unsupervised learning algorithms are used in the literature. Supervised algorithms are applied only when the class labels of the given data are known. On the other hand, unsupervised learning algorithms, such as clustering techniques, are used when the category/class of the data is unknown. In order to evaluate the discrimination power of our proposed model, we utilized three classification algorithms of different nature, which are explained below.

3.4.1 Support Vector Machine (SVM)

The support vector machine is a classification algorithm introduced by V. Vapnik that analyzes and classifies objects based on statistical learning theory (Cortes and Vapnik, 1995). SVM is a fast hypothesis learner that is used for linear as well as non-linear classification problems. SVM follows the structural risk minimization principle, which makes it computationally efficient compared to other classification algorithms (Vapnik, 2013). The risk minimization property reduces the problems that occur while training the model (Joachims, 1998). SVM can handle both binary and multi-class problems. For a binary problem, SVM draws an optimal hyperplane between the data of the two classes (Duda et al., 2012). The hyperplane is drawn so as to have the maximum margin to the support vectors and a minimal error rate on new instances (Theodoridis et al., 2010). In this work, we utilized SVM to classify gender based on the extracted feature space.

Figure 3.7. Classification of SVM


The mechanism of the support vector machine algorithm is explained below. Let $T$ be a set of $n$ training points:

$$T = \{(x_i, y_i) \mid x_i \in \mathbf{R}^p,\ y_i \in \{-1, 1\}\}_{i=1}^{n} \tag{3.16}$$

where $y_i$ is either 1 or -1 and indicates the class to which the point $x_i$ belongs; “1” and “-1” represent the two different classes (Chapelle et al., 1999). A hyperplane can be written as the set of points $X$ satisfying:

$$W \cdot X - b = 0 \tag{3.17}$$

where $W$ is the normal vector to the hyperplane and the parameter $b / \lVert W \rVert$ determines the offset of the hyperplane from the origin along the normal vector $W$. If the training data are linearly separable, we can select two hyperplanes that clearly separate the data into two parts, described by the following equations:

$$W \cdot X - b = 1 \tag{3.18}$$

$$W \cdot X - b = -1 \tag{3.19}$$

The distance between these two hyperplanes is $2 / \lVert W \rVert$, so to maximize it we need to minimize $\lVert W \rVert$. To prevent data points from falling into the margin area, we add the following constraints:

$$W \cdot X_i - b \ge 1 \quad \text{for } X_i \text{ of the 1st class} \tag{3.20}$$

$$W \cdot X_i - b \le -1 \quad \text{for } X_i \text{ of the 2nd class} \tag{3.21}$$

These can also be written as:

$$y_i (W \cdot X_i - b) \ge 1 \quad \text{for all } 1 \le i \le n \tag{3.22}$$
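The margin constraint of Eq. (3.22) and the margin width can be checked on a toy example. In the sketch below the weight vector and offset are picked by hand for four two-dimensional points; they are not the result of SVM training, which would solve a constrained optimization problem. All names and values are illustrative assumptions.

```python
import math

def decision_value(w, b, x):
    """Value of W . X - b for one point."""
    return sum(wi * xi for wi, xi in zip(w, x)) - b

def satisfies_margin(w, b, points, labels):
    """True if every point fulfils y_i (W . X_i - b) >= 1, Eq. (3.22)."""
    return all(y * decision_value(w, b, x) >= 1 for x, y in zip(points, labels))

def margin_width(w):
    """Distance 2 / ||W|| between the hyperplanes of Eqs. (3.18) and (3.19)."""
    return 2 / math.sqrt(sum(wi * wi for wi in w))

def predict(w, b, x):
    """Assign a new point to class 1 or -1 by the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1

points = [(2, 2), (3, 3), (0, 0), (-1, 0)]
labels = [1, 1, -1, -1]
w, b = (0.5, 0.5), 1.0          # hand-picked separating hyperplane (assumption)

feasible = satisfies_margin(w, b, points, labels)
width = margin_width(w)         # 2 / sqrt(0.5), i.e. 2 * sqrt(2)
```

Among all pairs (W, b) that satisfy the constraints, the SVM selects the one with the smallest norm of W, which is the one with the widest margin.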

3.4.2 Probabilistic Neural Network (PNN)

The probabilistic neural network (PNN) is a powerful pattern recognition algorithm based on the conventional statistical Bayesian decision rule (Khan et al., 2015). PNN was first introduced by Specht in 1990 and works by estimating the probability density function of each class (Specht, 1990). Its simplicity, transparency and fast training make PNN a well-suited neural network model (Sitamahalakshmi et al., 2011). PNN is a multilayered feed-forward neural network that consists of four layers, namely the input, pattern, summation, and output layers, as shown in Figure 3.8 (Rao et al., 2009). The pattern layer computes the distance from the input vector to each training vector. The summation layer sums the contributions for each class, and finally the output layer assigns the test example to the predefined class with the maximum probability (Khan et al., 2008b). Unlike other neural network techniques, PNN does not need feedback from the individual neurons.

Figure 3.8. General structure of PNN

The input layer has $N$ nodes, each associated with an independent variable. The input layer does not perform any computation and simply distributes the input to the neurons in the pattern layer, which has one node for each training example. On receiving the pattern $P_i$ from the input layer, the neuron $Z_{ij}$ of the pattern layer computes its output as

$$Z_{ij} = \exp\!\left(-\frac{\lVert P_j - P_i \rVert^2}{\delta^2}\right) \tag{3.23}$$

where $Z_{ij}$ represents the output of pattern node $j$ and $\delta$ is a smoothing factor that controls the width of the activation function. $\lVert P_j - P_i \rVert$ is the distance between the input vector $P_i$ and the vector $P_j$ of pattern node $j$. If this distance increases, the similarity between the two data vectors decreases, and vice versa. The outputs of the pattern layer are provided to the summation layer, which contains $V$ competitive nodes, each corresponding to one class. Each summation node $v$ is connected to the pattern nodes that are associated with training objects of class $v$. For an input vector $P_i$, the summation node $v$ receives the outputs of the associated pattern nodes and produces the output:

$$f_v(P_i) = \frac{1}{N_v} \sum_{\forall P_j \in Q_v} \exp\!\left[\frac{P_j P_i^{T} - 1}{\delta^2}\right] \tag{3.24}$$

where $Q_v$ denotes the label of the class corresponding to summation node $v$, and $N_v$ is the number of training objects belonging to this class.

The outputs of the summation layer are converted to posterior class probabilities:

$$P(Q_i = v \mid P_i) = \frac{f_v(P_i)}{\sum_{v'=1}^{V} f_{v'}(P_i)} \tag{3.25}$$
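The three computing layers can be sketched in a few lines. For simplicity, the sketch uses the distance-based activation of Eq. (3.23) in the summation layer instead of the dot-product form of Eq. (3.24), and the smoothing factor of 1.0 is an arbitrary assumption. The names `pnn_posteriors` and `pnn_classify` are hypothetical.

```python
import math

def pnn_posteriors(train, labels, query, delta=1.0):
    """Posterior class probabilities of a query vector, as in Eq. (3.25)."""
    sums, counts = {}, {}
    for p_j, label in zip(train, labels):
        # Pattern layer: Gaussian activation of one training example, Eq. (3.23).
        dist2 = sum((a - b) ** 2 for a, b in zip(p_j, query))
        z = math.exp(-dist2 / delta ** 2)
        sums[label] = sums.get(label, 0.0) + z
        counts[label] = counts.get(label, 0) + 1
    # Summation layer: average activation of the pattern nodes of each class.
    f = {v: sums[v] / counts[v] for v in sums}
    total = sum(f.values())
    return {v: f[v] / total for v in f}

def pnn_classify(train, labels, query, delta=1.0):
    """Output layer: the class with the highest posterior probability."""
    posteriors = pnn_posteriors(train, labels, query, delta)
    return max(posteriors, key=posteriors.get)

train = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = ["female", "female", "male", "male"]
predicted = pnn_classify(train, labels, (0.2, 0.3))
```

There is no iterative training: every training vector is stored as a pattern node, and only the smoothing factor has to be chosen.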

According to Eq. (3.25), the input $P_i$ is assigned to the class with the highest posterior probability.

3.4.3 K-Nearest Neighbor (KNN)

K-Nearest Neighbor is one of the most commonly used techniques in pattern recognition for regression and classification. KNN is a popular classification algorithm because of its simplicity, adaptability, and fast training. Despite its simplicity, it can deliver performance competitive with many other learning algorithms. KNN classifies objects based on the proximity of their nearest training samples in the feature space (Khan et al., 2008b). Each of the nearest samples is considered an item of evidence regarding the class of the pattern to be classified (Garcia et al., 2008). KNN is a non-parametric lazy learning algorithm that assumes no prior knowledge about the distribution of data in the feature space (Keller et al., 1985). It has no explicit training phase and keeps all the training data for the testing phase. Unlabeled instances are classified through their nearest neighbors in the feature space; therefore, it is also called a lazy learner or instance-based learner (Khan et al., 2008a). KNN learns based on distance: it calculates the Euclidean distance between the query and the training instances (Ashari et al., 2013). The value of K specifies the number of training instances closest to the query point that are considered. Finally, the most frequently occurring class among them is assigned to the query point. The recognition rate of KNN is affected by the number of neighbors and by the presence of noisy data in the feature vector.

The KNN algorithm is illustrated as follows. Suppose the training dataset contains $N$ face images $F = \{F_1, F_2, F_3, \ldots, F_N\}$ labeled with $Y = \{y_1, y_2, y_3, \ldots, y_N\}$. Consider a query point $X$ that is to be classified from its nearest neighbors. The first step is to measure the Euclidean distance between the query point $X$ and each training sample $F_i$. Mathematically, the Euclidean distance can be expressed as follows:

$$D(X, F_i) = \lVert X - F_i \rVert = \sqrt{\sum_{m} (X_m - F_{i,m})^2}, \quad i = 1, 2, 3, \ldots, N \tag{3.26}$$

Consider the scenario of classifying a given sample among a number of known samples, as shown in Figure 3.9. The labels 1, 2, 3, … denote classes whose instances are placed at different locations in the feature space, and “q” denotes the query point from which the Euclidean distance to the instances of each class is calculated.

Figure 3.9. Classification of KNN

In Figure 3.9, the circle contains 5 instances, which are the nearest neighbors drawn from both classes, so K = 5. The Euclidean distance is measured between the query point and each of the five instances individually. Finally, the decision is made by majority voting.
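The voting procedure can be written compactly. The sketch below is a minimal illustration with K = 5 on made-up two-dimensional points; the data, the class names and the function name `knn_classify` are assumptions for illustration only.

```python
import math
from collections import Counter

def knn_classify(train, labels, query, k=5):
    """Label of the query by majority vote among its k nearest neighbors."""
    # Euclidean distance from the query to every training sample.
    distances = sorted((math.dist(x, query), y) for x, y in zip(train, labels))
    # Count the class labels of the k closest samples.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

train = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["female", "female", "female", "male", "male", "male"]
predicted = knn_classify(train, labels, (2, 2), k=5)   # 3 votes against 2
```

Because all training samples are kept and compared at query time, the cost of one prediction grows with the size of the training set.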

3.5 Proposed System

Given the importance of face recognition in machine learning and pattern recognition, we propose an efficient, reliable and accurate face recognition model for gender classification. In this model, useful features are extracted from face images using transformation-based and local-feature-based techniques. DST and DWT are applied to transform the face image, and reliable features are extracted using statistical attributes such as standard deviation, Euclidean distance, maximum, minimum, variance, skewness, log energy, and kurtosis. In addition, local features are extracted using the LBP and LPQ algorithms. The extracted feature vectors are high-dimensional and contain a large number of irrelevant, noisy and redundant features that affect the performance of the model. Therefore, we used minimum redundancy maximum relevance (mRMR), a feature selection technique, to reduce the dimension of the feature vector by eliminating irrelevant and redundant features. In order to measure the performance of the proposed model, K-Nearest Neighbor, Support Vector Machine and Probabilistic Neural Network are utilized as classification learners. The discrimination power of the classification algorithms is assessed using the 10-fold cross-validation test. In the 10-fold cross-validation test, the data are divided into ten folds; one fold is used for testing and the remaining folds are used for training. The whole process is repeated 10 times, so that each fold is used once for testing, and finally the results are combined. Furthermore, the highly predictive feature vectors of DST and DWT are combined into a single feature vector. The recognition rate of the proposed model is then measured before as well as after applying the feature selection algorithm. The block diagram of the proposed system is depicted in Figure 3.10. Finally, performance measures such as accuracy, sensitivity, specificity and Matthews correlation coefficient (MCC) are applied to measure the performance of the classification algorithms.
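The 10-fold protocol can be sketched as an index generator. The sketch assigns samples to folds in a simple interleaved manner; whether the original experiments shuffled or stratified the samples is not stated, so that choice and the name `k_fold_indices` are assumptions.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for each of the k folds."""
    # Sample i goes to fold i mod k, so fold sizes differ by at most one.
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# With 400 images, every round trains on 360 images and tests on 40.
splits = list(k_fold_indices(400, k=10))
```

Each sample appears in exactly one test fold, so combining the per-fold predictions gives one prediction for every image in the dataset.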


3.5.1 Accuracy: The success rate of a classification learner is calculated by measuring its error rate or accuracy. Accuracy is one of the simplest and most commonly used performance measures for classification problems in machine learning and pattern recognition. It evaluates the recognition ability of the model by counting the correctly predicted samples, i.e., the true positives (TP) of each class, over all samples; it is thus the proportion of correct predictions. With TP_i denoting the number of correctly predicted samples of class i, k the number of classes and N the total number of samples, it can be calculated as:

\[ \text{Accuracy} = \frac{\sum_{i=1}^{k} TP_i}{N} \tag{3.29} \]

3.5.2 Sensitivity and Specificity: Sensitivity and specificity are performance measures used to assess the effectiveness of the model. Sensitivity is the proportion of actual positive instances that are correctly predicted, i.e., the ratio between the true positives (TP) and the total number of positive instances (TP + FN). Similarly, specificity is the proportion of actual negative instances that are correctly predicted, i.e., the ratio between the true negatives (TN) and the total number of negative instances (TN + FP).

\[ \text{Sensitivity} = \frac{TP}{TP + FN} \times 100 \tag{3.30} \]

\[ \text{Specificity} = \frac{TN}{TN + FP} \times 100 \tag{3.31} \]

3.5.3 Matthews Correlation Coefficient: MCC is a powerful and efficient measure for classification problems in pattern recognition. Its value ranges from -1 to 1, where 1 indicates a perfect prediction, 0 indicates a prediction no better than random, and -1 indicates a completely inverse prediction.

\[ MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{3.32} \]

where TP, FN, TN and FP denote the numbers of true positives, false negatives, true negatives and false positives, respectively.

Figure 3.10. Proposed Model for Face Recognition


4. Results and Discussion


In this chapter, we discuss the performance results achieved using the proposed model. The proposed model consists of two transform-based techniques, DST and DWT, which are compared with two local feature approaches, namely LBP and LPQ. The feature vectors extracted using the transformation and local techniques are evaluated using three classification algorithms of different nature: KNN, SVM and PNN. To remove irrelevant and redundant features, a feature selection technique called mRMR is applied to obtain a reliable feature vector. After applying mRMR, 400 highly discriminative features are selected. The success rate of the proposed model is measured with the three classifiers both before and after feature selection. It is found that the proposed model achieves the highest recognition rate reported so far in the literature on the SUMS face dataset. In the following sub-sections, the performance of the proposed techniques with each classification algorithm is discussed.

4.1) Prediction performance using DWT Feature Vector

The success rates of the DWT feature vector using the proposed classifiers are listed in Table 4.1. The feature vector consists of 1000 features extracted using different statistical attributes. Using KNN, an accuracy of 82.7% with sensitivity of 76%, specificity of 89.5% and MCC of 0.66 is achieved with the 1st nearest neighbor of the query point. Similarly, with the 2nd neighbor, an accuracy of 80% with sensitivity of 62%, specificity of 98% and MCC of 0.64 is reported. KNN obtains its highest recognition accuracy of 84.5% with the 3rd nearest neighbor, with sensitivity, specificity and MCC of 76%, 93% and 0.70, respectively. PNN and SVM are also used to evaluate the performance of the proposed model. With SVM, DWT achieves a success rate of 88.8% with sensitivity of 89.5%, specificity of 88% and MCC of 0.77. Finally, with PNN, an accuracy of 56.6% is reported with sensitivity, specificity and MCC of 66.5%, 46.5% and 0.13, respectively. The performance of the DWT feature vector using PNN is thus considerably lower than that using SVM and KNN.
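To make the idea of "statistical attributes of transform coefficients" concrete, the sketch below applies one level of a wavelet transform to a short signal and summarizes the coefficients. It is only an illustration under stated assumptions: the Haar wavelet, the one-dimensional input, the population (divide-by-n) variance, the log-energy definition and the function names are all choices made here, since the text does not specify them.

```python
import math

def haar_dwt_1d(signal):
    """One level of the Haar wavelet transform on an even-length sequence."""
    scale = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approximation = [(a + b) / scale for a, b in pairs]
    detail = [(a - b) / scale for a, b in pairs]
    return approximation, detail

def statistical_attributes(coefficients):
    """Summary statistics of a list of coefficients (population moments)."""
    n = len(coefficients)
    mean = sum(coefficients) / n
    variance = sum((c - mean) ** 2 for c in coefficients) / n
    std = math.sqrt(variance)
    # Standardized third and fourth moments; set to 0 for constant input.
    skewness = sum((c - mean) ** 3 for c in coefficients) / (n * std ** 3) if std else 0.0
    kurtosis = sum((c - mean) ** 4 for c in coefficients) / (n * std ** 4) if std else 0.0
    # Log energy taken as the sum of log(c^2) over non-zero coefficients (assumed definition).
    log_energy = sum(math.log(c * c) for c in coefficients if c != 0)
    return {
        "max": max(coefficients),
        "min": min(coefficients),
        "variance": variance,
        "std": std,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "log_energy": log_energy,
    }

approx, detail = haar_dwt_1d([1.0, 1.0, 3.0, 3.0])
stats = statistical_attributes([1.0, 2.0, 3.0, 4.0])
```

For an image, the same statistics would be computed over blocks or sub-bands of the two-dimensional transform, and the resulting values concatenated into the feature vector.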

Table 4.1. Success rates of classifiers on DWT feature vector Performance results using KNN Performance

1

2

3

4

Accuracy

82.7

80

84.5

79.9

Sensitivity

76

62

76

Specificity

89.5

98

93

MCC

0.66

0.64 0.70

5

6

7

8

9

10

84

81.7

82.7

81.7

83.2

79

63

76.5

67

73

68

74

94.5

96

91.5

96.5

92.5

95.5

92.5

63.5

0.63

0.69

0.66

0.67

0.68

0.68

0.61

Performance results using PNN
Performance | Spread value = 0.01 | Spread value = 0.001 | Spread value = 0.004
Accuracy | 55.75 | 56.5 | 56.8
Sensitivity | 65.5 | 66.5 | 66.9
Specificity | 46 | 46.5 | 47
MCC | 0.12 | 0.13 | 0.13

Performance results using SVM (polynomial)
Performance | -g=0.25 | -g=0.30 | -g=0.1
Accuracy | 87.6 | 86.5 | 81.25
Sensitivity | 87 | 86 | 72
Specificity | 88 | 87 | 90.5
MCC | 0.75 | 0.73 | 0.64


To remove irrelevant and noisy features, we use the minimum redundancy maximum relevance (mRMR) feature selection algorithm. After applying mRMR, the extracted feature vector is reduced to a feature set of 400 features. The success rates of the reduced feature set using the proposed classifiers are given in Table 4.2. An accuracy of 81.7% is reported with the first nearest neighbor of KNN, with sensitivity of 79%, specificity of 84.5% and MCC of 0.64. Similarly, with the 2nd neighbor, an accuracy of 83% with sensitivity of 72.5%, specificity of 93.5% and MCC of 0.67 is achieved. The performance of the reduced DWT feature set is measured over ten nearest neighbors. Among all the neighbors, the highest recognition rate of 89% is obtained with the 7th nearest neighbor, with sensitivity, specificity and MCC of 94.5%, 96% and 0.76, respectively. With the SVM classifier, an improved accuracy of 88.8% is reported with sensitivity of 89.5%, specificity of 88% and MCC of 0.77. Finally, PNN obtains an accuracy of 60.1% at a spread value of 0.004, with sensitivity, specificity and MCC of 62.5%, 56% and 0.17, respectively. After examining all the results, it is found that, using the reduced feature space, SVM enhances the accuracy more than the other classification learners used.
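The selection principle behind mRMR can be sketched as a greedy search that, at each step, adds the feature with the highest relevance to the class label minus its mean redundancy with the features already chosen. The sketch below is a simplified illustration and not the implementation used in this work: it assumes discrete-valued features (continuous features would first need to be discretized), uses the difference form of the criterion, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter

def mutual_information(x, y):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log((c * n) / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )

def mrmr_select(features, labels, n_select):
    """Greedy mRMR; `features` is a list of columns, one list per feature."""
    relevance = [mutual_information(col, labels) for col in features]
    # Start with the single most relevant feature.
    selected = [max(range(len(features)), key=lambda j: relevance[j])]
    while len(selected) < n_select:
        def score(j):
            redundancy = sum(
                mutual_information(features[j], features[s]) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy
        candidates = [j for j in range(len(features)) if j not in selected]
        selected.append(max(candidates, key=score))
    return selected

# Toy data (hypothetical): feature 1 duplicates feature 0, feature 2 is unrelated.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = [
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
]
selected = mrmr_select(features, labels, 2)
```

In this toy example the duplicate feature is as relevant as the first one but fully redundant with it, so the redundancy penalty causes it to be skipped in the second step.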

[Figure: bar chart "DWT vs. DWT (mRMR)" showing accuracy, sensitivity and specificity (vertical axis 0 to 100) of KNN, SVM and PNN on the DWT and DWT (mRMR) feature vectors]

Figure 4.1. Performance analysis of DWT


Table 4.2. Success Rates of Classifiers on reduced DWT feature vector Performance results using KNN Performance

1

2

Accuracy

81.7

83

Sensitivity

79

3

4

5

6

7

8

87.7

88.2

87.2

89

72.5 84

80

76.5

79.5

Specificity

84.5 93.5 90

95.5

92.5

95

MCC

0.64

0.76

0.77

0.75

87

0.67 0.74

9

10

87.7

87.7

86.7

94.5

79.5

81.5

79

96

96

93.5

94

0.76

0.76

0.75

0.74

Performance results using PNN
Performance | Spread value = 0.0001 | Spread value = 0.001 | Spread value = 0.004
Accuracy | 56.5 | 58.5 | 60.1
Sensitivity | 66.5 | 60 | 62.5
Specificity | 46.5 | 56.5 | 56
MCC | 0.13 | 0.16 | 0.17

Performance results using SVM (polynomial)
Performance | -g=0.25 | -g=0.30 | -g=0.1
Accuracy | 86.5 | 88 | 88.8
Sensitivity | 87 | 89 | 89.5
Specificity | 86 | 87 | 88
MCC | 0.73 | 0.76 | 0.77


4.2) Prediction performance using DST Feature Vector

The success rates of the DST-based feature vectors are given in Table 4.3 and Table 4.4. In this work, we extract 1000 features from the SUMS face dataset using the DST algorithm together with different statistical attributes. According to the proposed model, the feature vector is evaluated using KNN, SVM and PNN. With KNN, DST achieves an accuracy of 84.7% on the first nearest neighbor with sensitivity of 83.5%, specificity of 86% and MCC of 0.69. Comparing all the nearest neighbors, the highest recognition rate of 89.5% is reported on the 10th nearest neighbor, with sensitivity of 93.5%, specificity of 85.5% and MCC of 0.79. With PNN, the DST feature vector obtains a success rate of 88.7%, with sensitivity, specificity and MCC of 93.5%, 94.5% and 0.78, respectively. Finally, SVM achieves the highest accuracy of 90% with sensitivity, specificity and MCC of 91.5%, 88.5% and 0.80, respectively, which is higher than that of KNN and PNN.

To reduce the DST feature space, the mRMR feature selection strategy is applied, which reduces the DST feature vector to 200 features. The recognition results of the reduced feature vector are given in Table 4.4. KNN is again evaluated over the 10 nearest neighbors of the query point and shows improved performance. Using the 1st neighbor, KNN obtains an accuracy of 84% with sensitivity of 83.5%, specificity of 84.5% and MCC of 0.68. Over all the nearest neighbors, KNN achieves its highest success rate of 90.7% using the 6th neighbor, with sensitivity, specificity and MCC of 95%, 86.5% and 0.82, respectively. PNN is evaluated using different spread values; with a spread value of 3.0 it reports an accuracy of 90.1% with sensitivity of 93.7%, specificity of 86% and MCC of 0.82. Furthermore, SVM is used to measure the performance of the reduced feature vector. SVM achieves an accuracy of 90% with sensitivity, specificity and MCC of 92%, 88% and 0.80, respectively. In this work, SVM is used with the RBF kernel function.
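The role of the spread value in PNN can be illustrated with the Gaussian Parzen-window form that is commonly used for this classifier: each class score is the average of Gaussian kernels centred on that class's training points, and the spread is the kernel width. The sketch below is an assumption-laden illustration, not the implementation used in this work; the function name `pnn_predict` and the toy points are hypothetical.

```python
import math

def pnn_predict(train_points, train_labels, query, spread):
    """Return the class whose average Gaussian kernel response is largest."""
    scores = {}
    counts = {}
    for point, label in zip(train_points, train_labels):
        squared_distance = sum((p - q) ** 2 for p, q in zip(point, query))
        kernel = math.exp(-squared_distance / (2.0 * spread ** 2))
        scores[label] = scores.get(label, 0.0) + kernel
        counts[label] = counts.get(label, 0) + 1
    # Average per class so that class size does not bias the decision.
    return max(scores, key=lambda label: scores[label] / counts[label])

# Toy 2-D data (hypothetical values, not taken from the thesis).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
labels = ["female", "female", "female", "male", "male", "male"]
prediction = pnn_predict(points, labels, query=(1.5, 1.5), spread=1.0)
```

In this sketch, a spread that is very small relative to the distances between points makes every kernel value close to zero, so the class scores become nearly indistinguishable; a very large spread makes all points contribute almost equally. This is why the spread usually has to be tuned to the scale of the features.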


Table 4.3. Success rates of classifiers on DST feature vector

Performance results using KNN
Performance | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
Accuracy | 84.7 | 85.7 | 87.8 | 88 | 88.5 | 89.2 | 88 | 89.2 | 88.5 | 89.5
Sensitivity | 83.5 | 95 | 88.5 | 94.5 | 90.5 | 95 | 91 | 95 | 91.5 | 93.5
Specificity | 86 | 77.5 | 87 | 81.5 | 86.5 | 83.5 | 85 | 84.7 | 85.5 | 85.5
MCC | 0.69 | 0.72 | 0.75 | 0.76 | 0.77 | 0.79 | 0.76 | 0.79 | 0.77 | 0.79

Performance results using PNN
Performance | Spread value = 4.5 | Spread value = 5.5 | Spread value = 6
Accuracy | 86.8 | 87.5 | 88.7
Sensitivity | 88 | 91 | 93.5
Specificity | 85.5 | 84 | 94.5
MCC | 0.73 | 0.75 | 0.78

Performance results using SVM (polynomial)
Performance | -g=0.01 | -g=0.0004 | -g=0.0008
Accuracy | 89.2 | 89.7 | 90
Sensitivity | 89.5 | 91.5 | 91.5
Specificity | 89 | 88 | 88.5
MCC | 0.78 | 0.79 | 0.80


Table 4.4. Success rates of Classifiers on reduced DST feature vector Performance results using KNN Performance

1

2

Accuracy

84

Sensitivity

3

4

5

86.7 89.2

88.7

89.5

83.5

94

94.5

Specificity

84.5

79.5

88

83

MCC

0.68

0.74

0.78 0.78

90.5

6

7

8

9

10

90.7

90

89

90

91

95

92

90.2

93

88

86.5

86.5

88

87

0.79

0.82

0.82

0.78

0.80

89 93 85 0.78

Performance results using PNN
Performance | Spread value = 2.7 | Spread value = 3 | Spread value = 3.2
Accuracy | 89.5 | 90.1 | 89.8
Sensitivity | 93 | 93.7 | 93.5
Specificity | 86 | 86 | 86
MCC | 0.79 | 0.82 | 0.80

Performance results using SVM (polynomial)
Performance | -g=0.25 | -g=0.0025 | -g=0.55
Accuracy | 88.5 | 89 | 90
Sensitivity | 87 | 88.5 | 92
Specificity | 90 | 89.5 | 88
MCC | 0.77 | 0.78 | 0.80


[Figure: bar chart "DST vs. DST (mRMR)" showing accuracy, sensitivity and specificity (vertical axis 0 to 100) of KNN, PNN and SVM on the DST and DST (mRMR) feature vectors]

Figure 4.2. Performance analysis of DST

4.3) Prediction performance using LBP Feature Vector

The prediction rates of the LBP feature vector are listed in Table 4.5. According to the proposed model, the LBP algorithm is used to obtain local features from the SUMS face dataset. In this model, binary descriptors are extracted using LBP by dividing an image into a 3×3 matrix. Binary local features are computed for each cell using the LBP operator. After the histogram features of each cell have been calculated, all the histograms are combined to form a single feature vector. The LBP feature vector contains 256 histogram features. A preprocessing technique using standard deviation and Euclidean distance is used to bring the features into the range [0, 1]. The LBP feature space is then evaluated using the proposed classification algorithms. With KNN, LBP achieves recognition rates of 83.5%, 81% and 85.5% on the 1st, 2nd and 3rd nearest neighbors, respectively. Over all the nearest neighbors, KNN reports its highest success rate of 85.8% on the 5th neighbor, with sensitivity, specificity and MCC of 93.5%, 78% and 0.72, respectively. PNN is also applied to evaluate the performance of the LBP feature vector. Using a spread value of 6.0, PNN achieves an accuracy of 85.3% with sensitivity of 93%, specificity of 77.5% and MCC of 0.71. Finally, the LBP feature vector is evaluated using the SVM learner, which obtains a success rate of 89.5% with sensitivity of 92.5%, specificity of 86.5% and MCC of 0.79. Comparing the performance of the proposed classification algorithms, it is found that the SVM learner achieves better results than KNN and PNN.
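The basic LBP operator that yields a 256-bin histogram can be sketched as follows. This is a minimal illustration under assumptions: it uses the standard 3×3 neighbourhood (eight neighbours compared with the centre pixel), a clockwise bit ordering starting at the top-left neighbour, and a single histogram over the whole image; the function name `lbp_histogram` is hypothetical and the exact variant used in this work is not stated.

```python
def lbp_histogram(image):
    """256-bin histogram of basic 8-neighbour LBP codes for a 2-D grey-level image."""
    # Neighbour offsets (row, column), clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    histogram = [0] * 256
    # Border pixels are skipped because they lack a full 3x3 neighbourhood.
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbour at least as bright as the centre sets its bit.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            histogram[code] += 1
    return histogram

# Toy 3x3 images (hypothetical grey levels).
flat = lbp_histogram([[7, 7, 7], [7, 7, 7], [7, 7, 7]])
peak = lbp_histogram([[1, 1, 1], [1, 5, 1], [1, 1, 1]])
```

Each interior pixel contributes one 8-bit code, so the histogram always has 256 bins regardless of image size, which matches the 256 histogram features mentioned above.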

Table 4.5. Success rates of Classifiers on LBP feature vector Performance results using KNN Performance

1

2

Accuracy

83.5

81

Sensitivity

89

Specificity MCC

3

4

5

85.5

85.8

84.5

96.5 90.5

95.5

93.5

78

65.5

79.5

73.5

0.67

0.65

0.70

0.70

85

6

7

8

9

10

85.5

83.5

85.2

83

95

92

95.5

93.5

94

78

74

79

71.5

77

72

0.72

0.70

0.72

0.69

0.71

0.68

Performance results using PNN
Performance | Spread value = 1.3 | Spread value = 1.6 | Spread value = 6
Accuracy | 84 | 84.8 | 85.3
Sensitivity | 89.5 | 90.5 | 93
Specificity | 78.5 | 79 | 77.5
MCC | 0.68 | 0.70 | 0.71

Performance results using SVM (polynomial)
Performance | -g=3.5 | -g=1.6 | -g=1
Accuracy | 89 | 89.2 | 89.5
Sensitivity | 92 | 93.5 | 92.5
Specificity | 86 | 85 | 86.5
MCC | 0.78 | 0.79 | 0.79


The dimension of the LBP feature vector is reduced using mRMR, which selects only 200 discriminative descriptors that are highly relevant and free from redundancy. The performance rates of the proposed classifiers are listed in Table 4.6. KNN yields an accuracy of 86% using the 9th nearest neighbor, with sensitivity, specificity and MCC of 94%, 79.5% and 0.74, respectively. With PNN, LBP obtains a success rate of 85.8% with sensitivity of 93%, specificity of 78.5% and MCC of 0.72. SVM achieves the highest accuracy of 89.5%, which is the best prediction accuracy obtained using the LBP feature vector, with sensitivity of 92%, specificity of 87% and MCC of 0.79.

[Figure: bar chart "LBP vs. LBP (mRMR)" showing accuracy, sensitivity and specificity (vertical axis 0 to 100) of KNN, PNN and SVM without and with mRMR feature selection]

Figure 4.3. Performance analysis of LBP feature vector


Table 4.6. Success rates of Classifiers on reduced LBP feature vector Performance results using KNN Performance

1

2

Accuracy

83.8

85

Sensitivity

88.5

Specificity MCC

3

4

5

84.5

85.7

85.7

91.5 91.5

96

92.5

79

78.5

78

73

0.68

0.71

0.70

0.70

84.9

6

7

8

9

10

85.9 84.3

86

83.5

96

92

95

94

94.5

79

74.5

79.5

73.5

79.5

72.5

0.72

0.72

0.72

0.74

0.68

0.70

Performance results using PNN
Performance | Spread value = 2.7 | Spread value = 2.8 | Spread value = 3.2
Accuracy | 84.5 | 85.8 | 85
Sensitivity | 93 | 93 | 93
Specificity | 76 | 78.5 | 77
MCC | 0.70 | 0.72 | 0.71

Performance results using SVM (polynomial)
Performance | -g=0.09 | -g=5.7 | -g=1.4
Accuracy | 89 | 89.5 | 89.3
Sensitivity | 91 | 92 | 91.5
Specificity | 87 | 87 | 87
MCC | 0.89 | 0.79 | 0.78

4.4) Prediction performance using LPQ Feature Vector

LPQ is a local feature extraction technique that computes phase information from the face images. It calculates the local phase information at each pixel of an image using the short-time Fourier transform (STFT). The LPQ algorithm extracts 256 features from each face image to form a feature vector. Classification algorithms are then applied to classify gender based on the extracted feature vector. KNN using the LPQ feature vector achieves its highest accuracy of 87.5% using the 3rd and 9th nearest neighbors. The other classification measures reported on the 9th nearest neighbor are sensitivity of 98.5%, specificity of 76.5% and MCC of 0.77. The prediction results of the LPQ feature vector are satisfactory for all nearest neighbors. With PNN, the highest success rate of 86.7% is reported, with sensitivity, specificity and MCC of 98.5%, 75% and 0.76, respectively. Using SVM, LPQ achieves a recognition rate of only 57% with sensitivity of 67%, specificity of 47% and MCC of 0.14. All the prediction results of the LPQ-based features are listed in Table 4.7.

Table 4.7. Success rates of Classifiers on LPQ Feature space Performance results using KNN Performance

1

Accuracy

85

Sensitivity

2

3

4

5

6

83.7 87.5

84.8

87

86.2

87

84.5

98

98

97.5

97.5

99

97

Specificity

75.5

69.5

73

72

MCC

0.71

0.70

0.74

0.72

76.5 0.76

7

73.5

77

0.75

0.76

8 86.5

9

10

87.5

85.5

98.5

98.5

98.5

74.5

76.5

72.5

0.77

0.73

0.75

Performance results using PNN
Performance | Spread value = 0.001 | Spread value = 0.003 | Spread value = 0.0045
Accuracy | 85.3 | 86.5 | 86.7
Sensitivity | 94.5 | 96 | 98.5
Specificity | 76 | 77 | 75
MCC | 0.72 | 0.74 | 0.76

Performance results using SVM (polynomial)
Performance | -g=0.3 | -g=1 | -g=1.6
Accuracy | 57 | 56.5 | 56
Sensitivity | 67 | 66 | 65
Specificity | 47 | 42 | 41
MCC | 0.14 | 0.12 | 0.13


mRMR-based feature selection is used to reduce the dimension of the LPQ feature vector to 200 features. The reduced feature space is then evaluated using KNN, SVM and PNN. The classification results after applying the mRMR algorithm are given in Table 4.8. KNN obtains an accuracy of 86% on the 1st nearest neighbor with sensitivity, specificity and MCC of 94%, 78% and 0.73, respectively. Over all the nearest neighbors, KNN obtains its highest accuracy of 88.2% using the 8th neighbor, with sensitivity of 95.5%, specificity of 80.5% and MCC of 0.77. Similarly, using PNN, LPQ obtains a success rate of 87% with sensitivity of 93%, specificity of 77% and MCC of 0.71. Furthermore, the LPQ feature vector is also evaluated using SVM, where it achieves a recognition rate of 58.8%, the lowest among the classifiers used.

[Figure: bar chart "LPQ vs. LPQ (mRMR)" showing accuracy, sensitivity and specificity (vertical axis 0 to 100) of KNN, PNN and SVM without and with mRMR feature selection]

Figure 4.4. Performance analysis of LPQ feature vector


Table 4.8. Success rates of Classifiers on reduced LPQ feature space Performance results using KNN Performance

1

2

3

Accuracy

86

82.5 86.3

Sensitivity

94

98

Specificity

78

MCC

0.73

4

5

6

7

8

9

10

84

87.5

85.8

87.2

88.2

86.3

95.5

97.5

97

97.5

97.5

95.5

96.5

97.5

67

77

70.5

78.5

74

77

80.5

76

72.5

0.68

0.74

0.71

0.74

0.76

0.77

0.74

0.72

0.76

85

Performance results using PNN
Performance | Spread value = 0.001 | Spread value = 0.002 | Spread value = 0.0024
Accuracy | 86 | 86.5 | 87
Sensitivity | 93 | 93 | 93
Specificity | 76 | 78.5 | 77
MCC | 0.70 | 0.72 | 0.71

Performance results using SVM (polynomial)
Performance | -g=0.045 | -g=0.000015 | -g=0.007
Accuracy | 56.5 | 58 | 58.8
Sensitivity | 66.5 | 65 | 48
Specificity | 46.5 | 52 | 68
MCC | 0.13 | 0.17 | 0.16

4.5) Prediction performance using Hybrid Features Vector

The success rates achieved using transformation-based features are better than those achieved using local features. We therefore combine the feature vectors of the DST and DWT algorithms to further enhance the recognition accuracy of the proposed model. The performance results of the hybrid feature vector are listed in Tables 4.9 and 4.10. With the hybrid feature vector, KNN achieves an accuracy of 89.2% using the 9th nearest neighbor, with sensitivity, specificity and MCC of 94.5%, 84% and 0.79, respectively. SVM is also used to measure the performance of the hybrid feature vector and obtains a success rate of 90% with sensitivity of 93%, specificity of 87.5% and MCC of 0.81. Using PNN as the classification learner, the hybrid feature vector achieves a prediction rate of 56.75%, which is considerably lower than that of SVM and KNN.

Table 4.9. Success rates of classifiers on hybrid feature vector

Performance results using KNN
Performance | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
Accuracy | 86.7 | 83.5 | 87.7 | 86.2 | 87.7 | 87 | 87.5 | 88 | 89.2 | 87.7
Sensitivity | 89 | 95 | 94 | 97 | 94.5 | 96.5 | 96.5 | 94 | 94.5 | 96.5
Specificity | 84.5 | 72 | 81.5 | 75.5 | 77.5 | 77.5 | 78.5 | 82 | 84 | 79
MCC | 0.73 | 0.71 | 0.76 | 0.74 | 0.75 | 0.75 | 0.76 | 0.76 | 0.79 | 0.76

Performance results using PNN
Performance | Spread value = 1 | Spread value = 1.5 | Spread value = 2
Accuracy | 56.75 | 56.3 | 54
Sensitivity | 66.5 | 55 | 54.5
Specificity | 47 | 56.5 | 53.5
MCC | 0.13 | 0.14 | 0.08

Performance results using SVM (polynomial)
Performance | -g=1.2 | -g=0.5 | -g=0.001
Accuracy | 90 | 90 | 90.2
Sensitivity | 93 | 92 | 93
Specificity | 87 | 88 | 87.5
MCC | 0.80 | 0.80 | 0.81


According to the proposed model, we apply mRMR-based feature selection to the hybrid feature vector. The performance results of the hybrid approach after feature selection are listed in Table 4.10. In the hybrid approach, the reduced feature vectors of DST and DWT are combined: 200 highly relevant features are selected from the DST feature set and 300 features from the DWT feature vector. The resulting hybrid feature vector is evaluated using the three classifiers. With KNN, an improved accuracy of 91% with sensitivity of 89%, specificity of 95% and MCC of 0.82 is obtained on the 5th nearest neighbor. PNN reports an accuracy of 57%. Finally, SVM achieves a recognition rate of 91.30% with sensitivity of 93.5%, specificity of 89% and MCC of 0.82, which is the highest performance reported so far in the literature on the SUMS face dataset.
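The construction of the reduced hybrid vector amounts to keeping the top-ranked features of each transform and concatenating them. The sketch below illustrates this for a single image; the function name `build_hybrid_vector`, the placeholder feature values and the placeholder rankings are hypothetical, and the rankings are assumed to come from a feature selection step such as mRMR.

```python
def build_hybrid_vector(dst_features, dwt_features, dst_ranked, dwt_ranked,
                        n_dst=200, n_dwt=300):
    """Concatenate the top-ranked DST and DWT features of one image.

    `dst_ranked` and `dwt_ranked` are feature indices ordered from most to
    least useful, as produced by a feature selection step.
    """
    selected_dst = [dst_features[i] for i in dst_ranked[:n_dst]]
    selected_dwt = [dwt_features[i] for i in dwt_ranked[:n_dwt]]
    return selected_dst + selected_dwt

# Placeholder inputs (hypothetical): 1000 features per transform.
dst_features = [float(i) for i in range(1000)]
dwt_features = [float(-i) for i in range(1000)]
dst_ranked = list(range(999, -1, -1))   # placeholder ranking
dwt_ranked = list(range(1000))          # placeholder ranking
hybrid = build_hybrid_vector(dst_features, dwt_features, dst_ranked, dwt_ranked)
```

The same index lists must be applied to every image so that each position of the hybrid vector refers to the same feature across the dataset.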

[Figure: bar chart "DWT+DST vs. DWT+DST (mRMR)" showing accuracy, sensitivity and specificity (vertical axis 0 to 100) of KNN, PNN and SVM on the DWT+DST and DWT+DST (mRMR) feature vectors]

Figure 4.5. Performance analysis of hybrid feature space


Table 4.10. Success rates of classifiers on reduced feature vector

Performance results using KNN
Performance | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
Accuracy | 90.2 | 87.7 | 89 | 90.5 | 91 | 90 | 90.2 | 90 | 90 | 90.2
Sensitivity | 89.5 | 97 | 85.5 | 84 | 89 | 83 | 85 | 83.5 | 85 | 84
Specificity | 91 | 78.5 | 92.5 | 97 | 95 | 97 | 95.5 | 96.5 | 95 | 96.5
MCC | 0.80 | 0.77 | 0.78 | 0.82 | 0.82 | 0.82 | 0.81 | 0.80 | 0.80 | 0.81

Performance results using PNN
Performance | Spread value = 0.0001 | Spread value = 1.5 | Spread value = 2
Accuracy | 55.2 | 56.5 | 57
Sensitivity | 56 | 57 | 58.5
Specificity | 45.5 | 56 | 55.5
MCC | 0.11 | 0.13 | 0.14

Performance results using SVM (polynomial)
Performance | -g=0.05 | -g=0.044 | -g=0.055
Accuracy | 91.30 | 90.2 | 91
Sensitivity | 93.5 | 89 | 92.5
Specificity | 89 | 91.5 | 89.5
MCC | 0.82 | 0.80 | 0.82


5. Conclusions


In this work, we proposed an automated face recognition model to identify gender. Various transform-based approaches, localization techniques and neural-network-based algorithms have been used by researchers to develop efficient and robust face recognition systems. To develop an accurate and reliable face recognition model, we proposed the use of transformation and local feature algorithms. DWT and DST are used as transformation algorithms, while LBP and LPQ are used to extract local features from face images. To select highly discriminative descriptors and to remove irrelevant and redundant features from the extracted feature vector, mRMR feature selection is applied. The performance of the proposed model is evaluated using three classification learners of different nature: KNN, SVM and PNN. Furthermore, the success rates of the classification algorithms are assessed using a 10-fold cross-validation test, in which one fold is used for testing and the remaining nine folds are used for training. Several performance measures, namely accuracy, sensitivity, specificity and MCC, are used to analyze the performance of the classification algorithms. The SUMS face dataset is adopted in this work. We also used a hybrid feature vector formed by combining the feature spaces of DST and DWT. After analyzing the performance results presented in Chapter 4, it is found that the success rate of the transform approaches is comparatively better than that of the local feature approaches. The DWT feature vector using KNN achieved a success rate of 89% after feature selection (mRMR). Likewise, the DST and LPQ feature vectors obtained their highest accuracies of 90.70% and 88.20%, respectively, using the KNN classifier, while the LBP feature vector reported its highest accuracy of 89.50%. Furthermore, the hybrid feature vector formed by combining the DWT and DST feature vectors achieved an outstanding accuracy of 91.30% using the SVM classifier. It is anticipated that the proposed approach might be helpful for researchers in future.

References: AHONEN, T., HADID, A. & PIETIKAINEN, M. 2006. Face description with local binary patterns: Application to face recognition. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 28, 2037-2041. AHONEN, T., RAHTU, E., OJANSIVU, V. & HEIKKILÄ, J. Recognition of blurred faces using local phase quantization. Pattern Recognition, 2008. ICPR 2008. 19th International Conference on, 2008. IEEE, 1-4. AL-NAJDAWI, N., TEDMORI, S., EDIRISINGHE, E. A. & BEZ, H. E. 2012. An automated real-time people tracking system based on KLT features detection. Int. Arab J. Inf. Technol., 9, 100-107. ALY, M. 2006. Face recognition using SIFT features. CNS/Bi/EE report, 186. ALY, S., DEGUCHI, D. & MURASE, H. Blur-invariant traffic sign recognition using compact local phase quantization. Intelligent Transportation Systems-(ITSC), 2013 16th International IEEE Conference on, 2013. IEEE, 821-827. AN, X., LI, J., SHANG, E. & HE, H. Multi-scale and Multi-orientation Local Feature Extraction for Lane Detection Using High-Level Information.

Image and

Graphics (ICIG), 2011 Sixth International Conference on, 2011. IEEE, 576-581. ASHARI, A., PARYUDI, I. & TJOA, A. M. 2013. Performance Comparison between Naïve Bayes, Decision Tree and k-Nearest Neighbor in Searching Alternative Design in an Energy Simulation Tool. vol, 4, 33-39. ATKINSON, P. M. & TATNALL, A. 1997. Introduction neural networks in remote sensing. International Journal of remote sensing, 18, 699-709. BELHUMEUR, P. N., HESPANHA, J. P. & KRIEGMAN, D. J. 1997. Eigenfaces vs. fisherfaces: Recognition using class specific linear projection. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 19, 711-720. BENGIO, Y. & GRANDVALET, Y. 2004. No unbiased estimator of the variance of kfold cross-validation. The Journal of Machine Learning Research, 5, 1089-1105. BEYNON, M., CURRY, B. & MORGAN, P. 2000. The Dempster–Shafer theory of evidence: an alternative approach to multicriteria decision modelling. Omega, 28, 37-50. 61

BHATTACHARYYA, D., RANJAN, R., ALISHEROV, F. & CHOI, M. 2009. Biometric authentication: A review. International Journal of u-and e-Service, Science and Technology, 2, 13-28. BISWAS, S. & BISWAS, A. 2012. Face Recognition Algorithms based on Transformed Shape Features. arXiv preprint arXiv:1207.2537. BODADE, R. M. & TALBAR, S. N. Shift invariant iris feature extraction using rotated complex wavelet and complex wavelet for iris recognition system. Advances in Pattern Recognition, 2009. ICAPR'09. Seventh International Conference on, 2009. IEEE, 449-452. BOUGHRARA, H., CHTOUROU, M., AMAR, C. B. & CHEN, L. 2014. Facial expression recognition based on a mlp neural network using constructive training algorithm. Multimedia Tools and Applications, 1-23. BRAJE, W. L., KERSTEN, D., TARR, M. J. & TROJE, N. F. 1998. Illumination effects in face recognition. Psychobiology, 26, 371-380. BRUNELLI, R. & POGGIO, T. 1993. Face recognition: Features versus templates. IEEE Transactions on Pattern Analysis & Machine Intelligence, 1042-1052. BUADES, A., COLL, B. & MOREL, J.-M. 2005. A review of image denoising algorithms, with a new one. Multiscale Modeling & Simulation, 4, 490-530. CEVIKALP, H., NEAMTU, M., WILKES, M. & BARKANA, A. 2005. Discriminative common vectors for face recognition. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 27, 4-13. CHADHA, A. R., VAIDYA, P. P. & ROJA, M. M. Face recognition using discrete cosine transform for global and local features.

Recent Advancements in

Electrical, Electronics and Control Engineering (ICONRAEeCE), 2011 International Conference on, 2011. IEEE, 502-505. CHAPELLE, O., HAFFNER, P. & VAPNIK, V. N. 1999. Support vector machines for histogram-based image classification. Neural Networks, IEEE Transactions on, 10, 1055-1064. CHEN, C.-H., PAU, L.-F. & WANG, P. S.-P. 2010. Handbook of pattern recognition and computer vision, World Scientific.

62

CHEN, J., SHAN, S., YANG, P., YAN, S., CHEN, X. & GAO, W. 2005. Novel face detection method based on gabor features. Advances in Biometric Person Authentication. Springer. CHEN, W., ER, M. J. & WU, S. 2006. Illumination compensation and normalization for robust face recognition using discrete cosine transform in logarithm domain. Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on, 36, 458-466. CHEUNG, V. & CANNONS, K. 2003. An Introduction to Probabilistic Neural Networks.

URL:

http://www.

psi.

toronto.

edu/∼

vincent/research/presentations/PNN. pdf (Last accessed in Jul. 2011). CHIN, Y. J., LIM, K. M., CHONG, S. C. & LEE, C. P. 2013. Minimal Redundancy Maximal Relevance Criterion-based Multi-biometric Feature Selection. SmartCR, 3, 103-111. CHITALIYA, N. G. & TRIVEDI, A. Feature Extraction using Wavelet-PCA and Neural network for application of Object Classification & Face Recognition. Computer Engineering and Applications (ICCEA), 2010 Second International Conference on, 2010. IEEE, 510-514. CORTES, C. & VAPNIK, V. 1995. Support-vector networks. Machine learning, 20, 273297. DEVRIES, T., BISWARANJAN, K. & TAYLOR, G. W. Multi-task Learning of Facial Landmarks and Expression. Computer and Robot Vision (CRV), 2014 Canadian Conference on, 2014. IEEE, 98-103. DHALL, A., ASTHANA, A., GOECKE, R. & GEDEON, T. Emotion recognition using PHOG and LPQ features.

Automatic Face & Gesture Recognition and

Workshops (FG 2011), 2011 IEEE International Conference on, 2011. IEEE, 878883. DORA, L., AGRAWAL, S. & PANDA, R. 2013. BFO-RLDA: A New Classification Scheme for Face Images Using Probabilistic Reasoning Model. Swarm, Evolutionary, and Memetic Computing. Springer.

63

DRAPER, B. A., BAEK, K., BARTLETT, M. S. & BEVERIDGE, J. R. 2003. Recognizing faces with PCA and ICA. Computer vision and image understanding, 91, 115-137. DRINEAS, P., FRIEZE, A., KANNAN, R., VEMPALA, S. & VINAY, V. 2004. Clustering large graphs via the singular value decomposition. Machine learning, 56, 9-33. DUDA, R. O., HART, P. E. & STORK, D. G. 2012. Pattern classification, John Wiley & Sons. ERSI, E. F. & ZELEK, J. S. Local feature matching for face recognition. Computer and Robot Vision, 2006. The 3rd Canadian Conference on, 2006. IEEE, 4-4. FARAG, A. & ATTA, R. 2012. Illumination Invariant Face Recognition Using the Statistical Features of BDIP and Wavelet Transform. International Journal of Machine Learning and Computing, 2, 1. FARFADE, S. S., SABERIAN, M. & LI, L.-J. 2015. Multi-view Face Detection Using Deep Convolutional Neural Networks. arXiv preprint arXiv:1502.02766. FERAUND, R., BERNIER, O. J., VIALLET, J.-E. & COLLOBERT, M. 2001. A fast and accurate face detector based on neural networks. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 23, 42-53. GARCIA, C. & DELAKIS, M. 2004. Convolutional face finder: A neural architecture for fast and robust face detection. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 26, 1408-1423. GARCIA, V., DEBREUVE, E. & BARLAUD, M. Fast k nearest neighbor search using GPU. Computer Vision and Pattern Recognition Workshops, 2008. CVPRW'08. IEEE Computer Society Conference on, 2008. IEEE, 1-6. GROSS, R. & BRAJOVIC, V. An image preprocessing algorithm for illumination invariant face recognition.

Audio-and Video-Based Biometric Person

Authentication, 2003. Springer, 10-18. GUYON, I. & ELISSEEFF, A. 2003. An introduction to variable and feature selection. The Journal of Machine Learning Research, 3, 1157-1182.

HADDADNIA, J., FAEZ, K. & MOALLEM, P. Neural network based face recognition with moment invariants. Image Processing, 2001. Proceedings. 2001 International Conference on, 2001. IEEE, 1018-1021. HAFED, Z. M. & LEVINE, M. D. 2001. Face recognition using the discrete cosine transform. International Journal of Computer Vision, 43, 167-188. HAYAT, M. Prediction of Membrane Proteins Using Machine Learning Approaches. HAYAT, M., KHAN, A. & YEASIN, M. 2012. Prediction of membrane proteins using split amino acid and ensemble classification. Amino Acids, 42, 2447-2460. HEISELE, B., HO, P., WU, J. & POGGIO, T. 2003. Face recognition: component-based versus global approaches. Computer vision and image understanding, 91, 6-21. HEMALATHA GAYATRI, L. & GOVINDAN, V. FEATURE SELECTION USING MODIFIED PARTICLE SWARM OPTIMISATION FOR FACE RECOGNITION. HEUSCH, G. & MARCEL, S. 2010. A novel statistical generative model dedicated to face recognition. Image and Vision Computing, 28, 101-110. HUANG, D., SHAN, C., ARDABILIAN, M., WANG, Y. & CHEN, L. 2011a. Local binary patterns and its application to facial image analysis: a survey. Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on, 41, 765-781. HUANG, D., SHAN, C., ARDEBILIAN, M. & CHEN, L. 2011b. Facial image analysis based on local binary patterns: a survey. IEEE Transactions on Image Processing. HUANG, F. J., ZHOU, Z., ZHANG, H.-J. & CHEN, T. Pose invariant face recognition. Automatic Face and Gesture Recognition, 2000. Proceedings. Fourth IEEE International Conference on, 2000. IEEE, 245-250. HUANG, L.-L., SHIMIZU, A. & KOBATAKE, H. 2005. Robust face detection using Gabor filter features. Pattern Recognition Letters, 26, 1641-1649. HUSSAIN, S. U., NAPOLÉON, T. & JURIE, F. Face recognition using local quantized patterns. British Machine Vision Conference, 2012. 11 pages. ILIN, A. & RAIKO, T. 2010. Practical approaches to principal component analysis in the presence of missing values. The Journal of Machine Learning Research, 11, 1957-2000.

INTRATOR, N., REISFELD, D. & YESHURUN, Y. 1996. Face recognition using a hybrid supervised/unsupervised neural network. Pattern Recognition Letters, 17, 67-76. JAIN, A., BOLLE, R. & PANKANTI, S. 2006. Biometrics: personal identification in networked society, Springer Science & Business Media. JAIN, A. K. 1979. A sinusoidal family of unitary transforms. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 356-365. JAIN, A. K., ROSS, A. & PRABHAKAR, S. 2004. An introduction to biometric recognition. Circuits and Systems for Video Technology, IEEE Transactions on, 14, 4-20. JEMAA, Y. B. & KHANFIR, S. 2009. Automatic local Gabor features extraction for face recognition. arXiv preprint arXiv:0907.4984. JIN, F. & SUN, S. 2008. A multitask learning approach to face recognition based on neural networks. Intelligent Data Engineering and Automated Learning–IDEAL 2008. Springer. JOACHIMS, T. 1998. Text categorization with support vector machines: Learning with many relevant features, Springer. JONES, M. J. & VIOLA, P. 2003. Face recognition using boosted local features. JOO ER, M., CHEN, W. & WU, S. 2005. High-speed face recognition based on discrete cosine transform and RBF neural networks. Neural Networks, IEEE Transactions on, 16, 679-691. KADAM, M. K. Face recognition based on Bias-Variance method. KANAN, H. R., FAEZ, K. & EZOJI, M. 2006. Face recognition: an optimized localization approach and selected PZMI feature vector using SVM classifier. Intelligent Computing. Springer. KASHYAP, K. & YADAV, M. FINGERPRINT MATCHING USING NEURAL NETWORK TRAINING. KEKRE, H. & MISHRA, D. 2010. Discrete Sine Transform Sectorization for Feature Vector Generation in CBIR. Universal Journal of Computer Science and Engineering Technology, 1, 2219-2158.

KELLER, J. M., GRAY, M. R. & GIVENS, J. A. 1985. A fuzzy k-nearest neighbor algorithm. Systems, Man and Cybernetics, IEEE Transactions on, 580-585. KHAN, A., KHAN, M. & CHOI, T.-S. 2008a. Proximity based GPCRs prediction in transform domain. Biochemical and biophysical research communications, 371, 411-415. KHAN, A., TAHIR, S. F., MAJID, A. & CHOI, T.-S. 2008b. Machine learning based adaptive watermark decoding in view of anticipated attack. Pattern Recognition, 41, 2594-2610. KHAN, Z. U., HAYAT, M. & KHAN, M. A. 2015. Discrimination of acidic and alkaline enzyme using Chou’s pseudo amino acid composition in conjunction with probabilistic neural network model. Journal of theoretical biology, 365, 197-203. KISKU, D. R., TISTARELLI, M., SING, J. K. & GUPTA, P. Face recognition by fusion of local and global matching scores using DS theory: An evaluation with uniclassifier and multi-classifier paradigm. Computer Vision and Pattern Recognition Workshops, 2009. CVPR Workshops 2009. IEEE Computer Society Conference on, 2009. IEEE, 60-65. KÖKER, R., ÇAKAR, T. & SARI, Y. 2014. A neural-network committee machine approach to the inverse kinematics problem solution of robotic manipulators. Engineering with Computers, 30, 641-649. KURITA, T., KOBAYASHI, Y. & MISHIMA, T. Higher order local autocorrelation features of PARCOR images for gesture recognition. Image Processing, 1997. Proceedings., International Conference on, 1997. IEEE, 722-725. KURSUN, O., SAKAR, C. O., FAVOROV, O., AYDIN, N. & GURGEN, F. 2010. Using covariates for improving the minimum redundancy maximum relevance feature selection method. Turkish Journal of Electrical Engineering & Computer Sciences, 18, 975-989. LAJEVARDI, S. M. & HUSSAIN, Z. M. Facial expression recognition using log-Gabor filters and local binary pattern operators. Proceedings of the International Conference on Communication, Computer and Power, 2009. 349-353. LANGFORD, J. Tutorial on practical prediction theory for classification. Journal of machine learning research, 2005. 273-306.

LANITIS, A., TAYLOR, C. J. & COOTES, T. F. 2002. Toward automatic simulation of aging effects on face images. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 24, 442-455. LATHA, P., GANESAN, L. & ANNADURAI, S. 2009. Face recognition using neural networks. Signal Processing: An International Journal (SPIJ), 3, 153-160. LEE, Y., LEE, K. & PAN, S. Local and global feature extraction for face recognition. Audio-and Video-Based Biometric Person Authentication, 2005. Springer, 219-228. LEI, Z. & LI, S. Z. 2012. Fast multi-scale local phase quantization histogram for face recognition. Pattern Recognition Letters, 33, 1761-1767. LI, H., LIN, Z., SHEN, X., BRANDT, J. & HUA, G. A Convolutional Neural Network Cascade for Face Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015. 5325-5334. LI, M., DU, W. & YUAN, L. Feature selection of face recognition based on improved chaos genetic algorithm. Electronic Commerce and Security (ISECS), 2010 Third International Symposium on, 2010. IEEE, 74-78. LI, S. Z. 2009. Encyclopedia of Biometrics: I-Z, Springer Science & Business Media. LIN, S.-H. 2000. An introduction to face recognition technology. Informing Science, 3, 18. LIN, S.-H., KUNG, S.-Y. & LIN, L.-J. 1997. Face recognition/detection by probabilistic decision-based neural network. Neural Networks, IEEE Transactions on, 8, 114-132. LINGE, G. V. & PAWAR, M. M. 2014. Face Recognition using Neural Network & Principal Component Analysis. LIU, C. & WECHSLER, H. Enhanced fisher linear discriminant models for face recognition. Pattern Recognition, 1998. Proceedings. Fourteenth International Conference on, 1998. IEEE, 1368-1372. LIU, C. & WECHSLER, H. 2002. Gabor feature based classification using the enhanced fisher linear discriminant model for face recognition. Image processing, IEEE Transactions on, 11, 467-476.

LIU, C. & WECHSLER, H. 2003. Independent component analysis of Gabor features for face recognition. Neural Networks, IEEE Transactions on, 14, 919-928. LIU, Q., HUANG, R., LU, H. & MA, S. Face recognition using kernel-based fisher discriminant analysis. Automatic Face and Gesture Recognition, 2002. Proceedings. Fifth IEEE International Conference on, 2002. IEEE, 197-201. LU, J., PLATANIOTIS, K. N. & VENETSANOPOULOS, A. N. 2003. Face recognition using LDA-based algorithms. Neural Networks, IEEE Transactions on, 14, 195-200. LU, Y., ZHOU, J. & YU, S. 2012. A survey of face detection, extraction and recognition. Computing and informatics, 22, 163-195. MANGLIK, P. K., MISRA, U. & MARINGANTI, H. B. Facial expression recognition. Systems, Man and Cybernetics, 2004 IEEE International Conference on, 2004. IEEE, 2220-2224. MANJHI, R., ABBAS, S. J. & PRIYAM, A. 2013. Face Recognition using Eigenface. International Journal of Emerging Technology and Advanced Engineering, 3, 625-627. MATHUR, S. N., AHLAWAT, A. K. & VISHWAKARMA, V. P. Illumination Invariant Face Recognition using Supervised and Unsupervised Learning Algorithms. Proc. of World Academy of Science, Engineering and Technology, 2008. Citeseer. MATTHEWS, I., COOTES, T. F., BANGHAM, J. A., COX, S. & HARVEY, R. 2002. Extraction of visual features for lipreading. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 24, 198-213. MATURANA, D., MERY, D. & SOTO, A. Face recognition with local binary patterns, spatial pyramid histograms and naive Bayes nearest neighbor classification. Chilean Computer Science Society (SCCC), 2009 International Conference of the, 2009. IEEE, 125-132. MISRA, B., SATAPATHY, S., BISWAL, B., DASH, P. & PANDA, G. Pattern classification using polynomial neural network. Cybernetics and Intelligent Systems, 2006 IEEE Conference on, 2006. IEEE, 1-6.

NANDINI, M., BHARGAVI, P. & SEKHAR, G. R. 2013. Face Recognition Using Neural Networks. International Journal of Scientific and Research Publications, 3, 1. NAYAK, D. & SHARMA, M. S. Face Recognition using DCT-DWT Interleaved Coefficient Vectors with NN and SVM Classifier. OJALA, T., PIETIKÄINEN, M. & MÄENPÄÄ, T. 2002. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 24, 971-987. OU, G. & MURPHEY, Y. L. 2007. Multi-class pattern classification using neural networks. Pattern Recognition, 40, 4-18. PATEL, R. & YAGNIK, S. B. A Literature Survey on Face Recognition Techniques. Volume. PATIL, M. T. P. & NARWADE, P. Development of Back Propagation Neural Network Model for Extracting the Feature from a Image Using Curvelet Transform. PENG, H., LONG, F. & DING, C. 2005. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 27, 1226-1238. PETERSON, C., RÖGNVALDSSON, T. & LÖNNBLAD, L. 1994. JETNET 3.0—A versatile artificial neural network package. Computer Physics Communications, 81, 185-220. PETROU, M. & GARCÍA SEVILLA, P. 2006. Image processing: dealing with texture. PIETIKÄINEN, M., HADID, A., ZHAO, G. & AHONEN, T. 2011. Background, Springer. POKHRIYAL, A. & LEHRI, S. 2010. A NEW METHOD OF FINGERPRINT AUTHENTICATION USING 2D WAVELETS. Journal of Theoretical & Applied Information Technology, 13. QURAISHI, M. I., DAS, G., DAS, A., DEY, P. & TASNEEM, A. A novel approach for face detection using artificial neural network. Intelligent Systems and Signal Processing (ISSP), 2013 International Conference on, 2013. IEEE, 179-184.

RAJU, U., KUMAR, A. S., MAHESH, B. & REDDY, B. E. 2010. Texture classification with high order local pattern descriptor: local derivative pattern. Global Journal of Computer Science and Technology, 10. RAMESHA, K. & RAJA, K. 2011. Face recognition system using discrete wavelet transform and fast PCA. Information Technology and Mobile Communication. Springer. RAO, K. R. & HWANG, J. J. 1996. Techniques and standards for image, video, and audio coding, Prentice Hall New Jersey. RAO, P. N., DEVI, T. U., KALADHAR, D., SRIDHAR, G. & RAO, A. A. 2009. A probabilistic neural network approach for protein superfamily classification. Journal of Theoretical and Applied Information Technology, 6, 101-105. RAPHAËL, F., OLIVIER, B. & DANIEL, C. 1997. A constrained generative model applied to face detection. Neural Processing Letters, 5, 11-19. ROWLEY, H., BALUJA, S. & KANADE, T. 1998. Neural network-based face detection. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 20, 23-38. SADYKHOV, R. K., SAMOKHVAL, V. & PODENOK, L. P. Face recognition algorithm on the basis of truncated Walsh-Hadamard transform and synthetic discriminant functions. Automatic Face and Gesture Recognition, 2004. Proceedings. Sixth IEEE International Conference on, 2004. IEEE, 219-222. SAMRA, A. S., EL TAWEEL GAD ALLAH, S. G. & IBRAHIM, R. M. Face recognition using wavelet transform, fast fourier transform and discrete cosine transform. Circuits and Systems, 2003 IEEE 46th Midwest Symposium on, 2003. IEEE, 272-275. SAVITHA, J. & KUMAR, A. S. 2014. Face Tracking and Detection using S-PCA & KLT Method. International Journal, 2. SERRE, T., HEISELE, B., MUKHERJEE, S. & POGGIO, T. 2000. Feature selection for face detection. SHAN, C., GONG, S. & MCOWAN, P. W. 2009. Facial expression recognition based on local binary patterns: A comprehensive study. Image and Vision Computing, 27, 803-816.

SHAN, C. & GRITTI, T. Learning Discriminative LBP-Histogram Bins for Facial Expression Recognition. BMVC, 2008. 1-10. SHASTRI, B. J. & LEVINE, M. D. 2007. Face recognition using localized features based on non-negative sparse coding. Machine Vision and Applications, 18, 107-122. SHINOHARA, Y. & OTSU, N. Facial expression recognition using fisher weight maps. Automatic Face and Gesture Recognition, 2004. Proceedings. Sixth IEEE International Conference on, 2004. IEEE, 499-504. SILVA, L. M., DE SÁ, J. M. & ALEXANDRE, L. A. 2008. Data classification with multilayer perceptrons using a generalized error function. Neural Networks, 21, 1302-1310. SINGH, A. Comparison of HGPP, PCA, LDA, ICA and SVM. SINGH, C., WALIA, E. & MITTAL, N. 2012. Robust two-stage face recognition approach using global and local features. The Visual Computer, 28, 1085-1098. SITAMAHALAKSHMI, T., BABU, A. V., JAGADEESH, M. & MOULI, K. C. 2011. Performance comparison of radial basis function networks and probabilistic neural networks for Telugu character recognition. Global Journal of Computer Science and Technology, 11. SONG, J., LU, X. & WU, X. An improved AdaBoost algorithm for unbalanced classification data. Fuzzy Systems and Knowledge Discovery, 2009. FSKD'09. Sixth International Conference on, 2009. IEEE, 109-113. SONKAMBLE, S., THOOL, D. R. & SONKAMBLE, B. 2010. SURVEY OF BIOMETRIC RECOGNITION SYSTEMS AND THEIR APPLICATIONS. Journal of Theoretical & Applied Information Technology, 11. SPECHT, D. F. 1990. Probabilistic neural networks. Neural networks, 3, 109-118. SRINIVAS, Y., RAJ, A. S., OLIVER, D. H., MUTHURAJ, D. & CHANDRASEKAR, N. 2012. A robust behavior of Feed Forward Back propagation algorithm of Artificial Neural Networks in the application of vertical electrical sounding data inversion. Geoscience Frontiers, 3, 729-736. SUAREZ, P. F. 1991. Face recognition with the karhunen-loeve transform. DTIC Document.

TAN, X. & TRIGGS, B. 2010. Enhanced local texture feature sets for face recognition under difficult lighting conditions. Image Processing, IEEE Transactions on, 19, 1635-1650. THAKUR, S., SING, J. K., BASU, D. K. & NASIPURI, M. 2009. Face recognition using Fisher linear discriminant analysis and support vector machine. Contemporary Computing. Springer. THEODORIDIS, S., PIKRAKIS, A., KOUTROUMBAS, K. & CAVOURAS, D. 2010. Introduction to Pattern Recognition: A Matlab Approach, Academic Press. THIYAGARAJAN, R., ARULSELVI, S. & SAINARAYANAN, G. 2010. Gabor feature based classification using statistical models for face recognition. Procedia Computer Science, 2, 83-93. TOTH, B. 2005. Biometric liveness detection. Information Security Bulletin, 10, 291-297. TURK, M. & PENTLAND, A. 1991. Eigenfaces for recognition. Journal of cognitive neuroscience, 3, 71-86. VAPNIK, V. 2013. The nature of statistical learning theory, Springer Science & Business Media. VARUN, R., KINI, Y. V., MANIKANTAN, K. & RAMACHANDRAN, S. 2015. Face Recognition Using Hough Transform Based Feature Extraction. Procedia Computer Science, 46, 1491-1500. WADKAR, P. D. & WANKHADE, M. 2012. Face Recognition Using Discrete Wavelet Transforms. International Journal of Advanced Engineering Technology, 3, 1-3. WAYMAN, J., JAIN, A., MALTONI, D. & MAIO, D. 2005. An introduction to biometric authentication systems, Springer. XU, W. & LEE, E.-J. 2012. Human Face Recognition Based on Improved D-LDA and Integrated BPNNs Algorithms. International Journal of Security and Its Applications (IJSIA), 6, 121-126. YAHIA, M. H. I. O. M. 2008. Discrete sine transform and alternative local linear regression for face recognition.

YAJI, G. S., SARKAR, S., MANIKANTAN, K. & RAMACHANDRAN, S. 2012. DWT feature extraction based face recognition using intensity mapped unsharp masking and laplacian of gaussian filtering with scalar multiplier. Procedia Technology, 6, 475-484. YAN, Y., OSADCIW, L. A. & CHEN, P. 2008. Confidence interval of feature number selection for face recognition. Journal of Electronic Imaging, 17, 011002-0110028. YANG, J., YU, H. & KUNZ, W. An efficient LDA algorithm for face recognition. Proceedings of the International Conference on Automation, Robotics, and Computer Vision (ICARCV 2000), 2000. 34-47. YANG, M.-H. Kernel eigenfaces vs. kernel fisherfaces: Face recognition using kernel methods. fgr, 2002. IEEE, 0215. YILMAZ, A. & GÖKMEN, M. 2001. Eigenhill vs. eigenface and eigenedge. Pattern Recognition, 34, 181-184. YING, W. X. & YANG, L. Comparison of 3-D Discrete Cosine and Discrete Sine Transforms for the Novelty Estimation in Volumetric Data. YUAN, B., CAO, H. & CHU, J. Combining local binary pattern and local phase quantization for face recognition. Biometrics and Security Technologies (ISBAST), 2012 International Symposium on, 2012. IEEE, 51-53. ZAHRAN, B. M. & KANAAN, G. 2009. Text Feature Selection using Particle Swarm Optimization Algorithm 1. ZENG, G. 2007. Facial recognition with singular value decomposition. Advances and Innovations in Systems, Computing Sciences and Software Engineering. Springer. ZENG, X. & MARTINEZ, T. A noise filtering method using neural networks. Soft Computing Techniques in Instrumentation, Measurement and Related Applications, 2003. SCIMA 2003. IEEE International Workshop on, 2003. IEEE, 26-31. ZHANG, B., GAO, Y., ZHAO, S. & LIU, J. 2010. Local derivative pattern versus local binary pattern: face recognition with high-order local pattern descriptor. Image Processing, IEEE Transactions on, 19, 533-544.

ZHANG, G., HUANG, X., LI, S. Z., WANG, Y. & WU, X. 2005a. Boosting local binary pattern (LBP)-based face recognition. Advances in biometric person authentication. Springer. ZHANG, H., BERG, A. C., MAIRE, M. & MALIK, J. SVM-KNN: Discriminative nearest neighbor classification for visual category recognition. Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on, 2006. IEEE, 2126-2136. ZHANG, L. & LENDERS, P. Knowledge-based eye detection for human face recognition. Knowledge-Based Intelligent Engineering Systems and Allied Technologies, 2000. Proceedings. Fourth International Conference on, 2000. IEEE, 117-120. ZHANG, S. & PAN, X. A novel text classification based on Mahalanobis distance. Computer Research and Development (ICCRD), 2011 3rd International Conference on, 2011. IEEE, 156-158. ZHANG, W., SHAN, S., GAO, W., CHEN, X. & ZHANG, H. Local gabor binary pattern histogram sequence (lgbphs): A novel non-statistical model for face representation and recognition. Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on, 2005b. IEEE, 786-791. ZHANG, Z., LYONS, M., SCHUSTER, M. & AKAMATSU, S. Comparison between geometry-based and Gabor-wavelets-based facial expression recognition using multi-layer perceptron. Automatic Face and Gesture Recognition, 1998. Proceedings. Third IEEE International Conference on, 1998. IEEE, 454-459. ZHAO, L., HU, W. & CUI, L. 2012. Face Recognition Feature Comparison Based SVD and FFT. Journal of Signal and Information Processing, 3, 259. ZHAO, Z.-Q., HUANG, D.-S. & SUN, B.-Y. 2004. Human face recognition based on multi-features using neural networks committee. Pattern Recognition Letters, 25, 1351-1358. ZHU, X. 2014. Face Representation with Local Gabor Phase Quantization. Journal of Networks, 9, 1617-1623.
