
2016 International Conference on Optoelectronics and Image Processing

Thermal Face Recognition Using Convolutional Neural Network

Zhan Wu 1,2, Min Peng 1,2, Tong Chen 1,2

1 School of Electronic and Information Engineering, Southwest University
2 Chongqing Key Laboratory of Nonlinear Circuits and Intelligent Information Processing, Southwest University
Chongqing, 400715, China
e-mail: [email protected]

Abstract-With the rapid development of face recognition (FR), thermal face recognition has received increasing attention. However, traditional methods for thermal face recognition mainly rely on hand-crafted feature design, which requires considerable human effort to select and extract features and usually yields a relatively low recognition rate. In this paper, we present a convolutional neural network (CNN) architecture for thermal face recognition. CNN is a type of neural network that can automatically learn effective features from the raw data. Experimental results on the RGB-D-T face database show that the proposed CNN architecture achieves a higher recognition rate than traditional recognition methods such as LBP, HOG and moment invariants.

Keywords-thermal face recognition; convolutional neural network; RGB-D-T face database; deep learning

978-1-5090-0880-3/16/$31.00 ©2016 IEEE

I. INTRODUCTION

Face recognition has been a hot research topic and has received wide attention for many years [1]. A number of studies have investigated RGB face recognition, which has been applied to business [2], transportation [3], security [4], etc. However, face recognition based on RGB images is sensitive to changes in illumination [5], [6], [7]. Therefore, other kinds of images have been used for recognition instead, such as thermal, near infrared (NIR) [8], and hyperspectral images (HSI) [10]. Near infrared images are relatively robust to illumination change [8]. However, NIR imaging equipment can be damaged easily after long-term use [9], which increases maintenance cost. Face recognition based on HSI is efficient [10]. However, hyperspectral equipment is expensive and the data processing is time-consuming [11]. Thermal imaging equipment absorbs the IR energy emitted by the objects in the scene and converts it into an electrical signal [12]. With thermal imaging, face recognition can be performed in insufficiently lit or completely dark environments [13]. Moreover, image processing is faster than with hyperspectral equipment.

Traditional algorithms for thermal face recognition often take the following steps [15]: the thermal face data are first preprocessed and the features are then selected or extracted; based on the useful features, the classifier is trained; the recognition is achieved by using the determined classifier. Goutam Majumder et al. [16] used the Gabor wavelet transform and fast independent component analysis (Fast ICA) to extract thermal face features, which can effectively improve the speed of recognition by dimension reduction. However, the recognition rate was not obviously improved. Naser Zaeri et al. [17] used moment invariants to extract features and employed the nearest neighbor classifier for the recognition, which achieved a higher recognition rate. But the moment invariant method is not capable enough to describe the details. Javier Ruiz-del-Solar et al. [18] analyzed the advantages and limitations of the traditional algorithms for thermal face recognition. The results showed that LBP improved the recognition rate to some extent and had good gray translation invariance and better robustness.

Nevertheless, traditional thermal face recognition methods do have some shortcomings [19], such as a complicated process and a low recognition rate. An alternative for face recognition is deep learning. It has shown more advantages and has been applied to many different fields such as computer vision [20] and pattern recognition [21]. Deep learning achieves complex function approximation through a nonlinear network structure and shows a powerful learning ability [22]. Compared with the traditional algorithms, deep learning combines feature selection or extraction and classifier determination into one step and can learn features, which reduces the workload of manual feature design [23].

In this paper, we present a convolutional neural network (CNN) architecture for thermal face recognition. CNN is a kind of deep learning algorithm. Thermal images are first obtained from the RGB-D-T face database [14]. Then a CNN method is proposed, which can automatically learn effective features from the thermal face data. Experimental results show that the proposed CNN architecture has a higher recognition rate than other state-of-the-art methods such as LBP, HOG and moment invariants.

II. PROPOSED ARCHITECTURE

In a multilayer neural network structure, each node of each layer performs the forward calculation and serves as the input of the nodes of the next layer. The value of any node at a layer depends on the values of all the nodes

connecting to it in the prior layer [24]. In this paper, we propose a CNN model designed for thermal face recognition. Unlike an ordinary neural network, the proposed CNN model optimizes the network structure through local receptive fields, weight sharing, and sub-sampling. The CNN has nine layers with appended rectified linear units: an input layer, three convolution layers, three pooling layers, a norm layer and a softmax classifier layer. The CNN structure is shown in Fig. 1, where S denotes the step length (stride) and P denotes the padding.

Data are convolved in the convolution layers. In the first convolution layer, the kernel shifts by two pixels after each convolution (S2 in Fig. 1). The first convolution layer has a filter of size 5x5. The other two convolution layers have filters of size 3x3. The convolved data are pooled to reduce the number of feature parameters. Three max pooling layers are used, each with a 2x2 receptive field. The convolution layers and max pooling layers are used alternately. Feature maps are obtained by local convolution and their dimension is reduced through pooling. Rectified linear unit (ReLU) layers, placed after the three convolution layers, replace the sigmoid as the activation function to improve the learning efficiency of our structure and to give the network sparse characteristics. Finally, a softmax classification layer is used to classify the resulting image features.

Figure 1. The CNN architecture. [Layer stack from input to output, as labelled in the figure: Data; Convolution 1, 5×5 S2 P2; ReLU 1; Pooling 1, 2×2 S2 P2; Norm 1; Convolution 2, 3×3 S1 P1; ReLU 2; Pooling 2, 2×2 S2 P2; Convolution 3, 3×3 S1 P1; ReLU 3; Pooling 3, 2×2 S2 P2; Classifier.]

III. EXPERIMENT

A. Database and Preprocessing

The RGB-D-T face database [14] was used to test our CNN model. The images in the database are collected synchronously from an RGB camera, a Kinect camera, and a thermal imaging camera. The database contains images of 51 college students, with resolutions of 640x480, 640x480, and 384x288 for the RGB, depth, and thermal images, respectively.

Three factors influencing face recognition are considered in the database: head rotation, expression variation, and illumination variation. When the images of a subject rotating his/her head were taken, his/her facial expression was required to be neutral and the illumination was required to be constant. The expression variation experiment requires the position of the head and the illumination to be constant. Images of neutral, happy, sad, angry, and surprised expressions were taken. The illumination variation experiment requires the expression to be neutral and the head position to be constant. In this paper, the thermal images of all subjects under the three conditions were employed. For one subject and one factor experiment, there are 100 images. Therefore, for 51 subjects there are 15300 (51x3x100) images.

In order to achieve recognition, the faces need to be located in the upper body images by using binarization [25]. The pixels corresponding to the body region were set to zero, and then the central point of the upper body (m1, n1) and the highest point (m2, n2) were located, where m1, m2 and n1, n2 are the abscissas and ordinates of the pixels. A square area is formed by taking (n2-n1) as the side length of the square and (m1, n1) as the base midpoint. Finally, the images are normalized to 112x112 pixels.

A Matlab index function provided by the database for randomly selecting the training set and the testing set was used in this paper, so that samples are drawn at random, which improves the reliability of the experiment. For each subject and each condition, the training set consists of 10 images out of the 50 training images produced by the index function, and the testing set consists of the 50 testing images produced by the index function.

B. Parameter Setting

The proposed CNN has nine layers. Convolution layer 1 employs 64 different convolution kernels and its output image is 56x56 in size. Max pooling layer 1 performs down-sampling and reduces the output to 28x28. The fourth layer performs normalization, which benefits data in the form of images. Then the data are convolved and pooled twice more, after which there are 128 feature maps of size 7x7. Finally, the data are grouped into 51 classes. The structure and the output size of each layer are summarized in Table I.

TABLE I. OUR PROPOSED CNN STRUCTURE AND THE OUTPUT SIZE

Layer                  Output
Images data            112x112
Convolution layer 1    64x56x56
Pooling layer 1        64x28x28
Norm layer 1           64x28x28
Convolution layer 2    128x28x28
Pooling layer 2        128x14x14
Convolution layer 3    128x14x14
Pooling layer 3        128x7x7
Classifier layer       51x1x1
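The output sizes in Table I can be checked with the usual window arithmetic, out = floor((n + 2P - K) / S) + 1. The following is a minimal sketch of that check, not the authors' implementation. The helper names `out_size` and `trace_shapes` are hypothetical, and a single-channel 112x112 input is assumed. Pooling padding is taken as 0, an assumption made only because it reproduces the sizes reported in Table I, and the classifier is modelled simply as a 51-way output.

```python
def out_size(n, kernel, stride, pad):
    """Spatial output size of a convolution or pooling window (floor rounding)."""
    return (n + 2 * pad - kernel) // stride + 1


# (name, kind, output channels, kernel, stride, padding)
# Convolution settings follow Fig. 1; pooling padding 0 is an assumption.
LAYERS = [
    ("Convolution layer 1", "conv", 64, 5, 2, 2),
    ("Pooling layer 1", "pool", None, 2, 2, 0),
    ("Norm layer 1", "norm", None, None, None, None),
    ("Convolution layer 2", "conv", 128, 3, 1, 1),
    ("Pooling layer 2", "pool", None, 2, 2, 0),
    ("Convolution layer 3", "conv", 128, 3, 1, 1),
    ("Pooling layer 3", "pool", None, 2, 2, 0),
]


def trace_shapes(size=112, channels=1, num_classes=51):
    """Return [(layer name, (channels, height, width))] for the layer stack."""
    shapes = []
    for name, kind, out_ch, k, s, p in LAYERS:
        if kind in ("conv", "pool"):
            size = out_size(size, k, s, p)  # norm layers keep the size
        if kind == "conv":
            channels = out_ch  # only convolutions change the channel count
        shapes.append((name, (channels, size, size)))
    shapes.append(("Classifier layer", (num_classes, 1, 1)))
    return shapes


if __name__ == "__main__":
    for name, shape in trace_shapes():
        print(f"{name:20s} {'x'.join(map(str, shape))}")
```

Under these assumptions, every traced shape agrees with the corresponding row of Table I, from 64x56x56 after convolution layer 1 down to 128x7x7 after pooling layer 3.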

In this paper, we employed the RGB-D-T face database to test the proposed method. To show the superiority of the CNN, we compared it with three competitive algorithms for thermal face recognition: LBP-KNN [26], moment invariant-KNN [27], and HOG-KNN [28]. The parameters of the three compared algorithms were set according to the references [26-28]. The LBP features were computed using 8 sampling points on a circle of radius 1. For the HOG algorithm, 8x8 cell units and 2x2 blocks were used. The moment invariant method requires no parameter setting.

C. Experimental Results and Discussion

Table II shows the recognition rate under the three factors: head rotation, expression, and illumination. Head rotation affects the recognition rate most. The recognition rate under this condition is the lowest, falling to 59.37% for moment invariant. Illumination variation has the least effect on the recognition rate: the average rate of the four methods is over 98% (the lowest is 94.51%, the highest is 100%). This result supports the view that thermal face recognition is robust to illumination variation.

The proposed CNN method outperforms the other three methods in all conditions. Even in the head rotation experiment, it achieves a recognition rate of 98%, an improvement of 7.73 percentage points over the second highest rate (90.27%, HOG). Under the other two conditions, i.e. expression and illumination variation, the CNN method again produces the best results (99.4% for expression variation and 100% for illumination variation).

A line chart of the data in Table II is shown in Fig. 2. The chart illustrates that the CNN has the best recognition performance and can largely improve the recognition rate under extreme conditions such as head rotation.

TABLE II. FACE RECOGNITION RATE (%) CONSIDERING THREE FACTORS

Method              Head Rotation   Expression   Illumination
LBP                 79.33           96.27        98.35
Moment Invariant    59.37           91.76        94.51
HOG                 90.27           98.78        99.18
Our Method          98.00           99.40        100.00

Figure 2. Line chart of face recognition rate. [Horizontal axis: Head Rotation, Expression, Illumination; vertical axis ticks from 50 to 100; legend entries include LBP, Moment Invariant, and HOG.]

IV. CONCLUSION

In this paper, we present a CNN method for thermal face recognition. Three conditions that affect the recognition rate, i.e. head rotation, expression variation, and illumination variation, were considered. Compared with the traditional methods for thermal face recognition, the proposed method produces the best recognition results. This suggests that CNN is a promising method for thermal face recognition under extreme conditions, such as side face views and rapidly changing illumination.

ACKNOWLEDGMENT

We would like to acknowledge the support from the National Natural Science Foundation of China (Grant No. 61301297 and No. 61472330) and the Fundamental Research Funds for the Central Universities (No. XDJK2013C124).

REFERENCES

[1] Drira, Hassen, "3D face recognition under expressions, occlusions, and pose variations." Pattern Analysis and Machine Intelligence, IEEE Transactions on 35.9, 2013, pp. 2270-2283.

[2] Jain, Anil, Ruud Bolle, and Sharath Pankanti, eds. Biometrics: personal identification in networked society. vol. 479. Springer Science & Business Media, 2006.

[3] Klin, Ami, "A normed study of face recognition in autism and related disorders." Journal of autism and developmental disorders 29.6, 1999, pp. 499-508.

[4] Karmakar, Dhiman, and C. A. Murthy. "Face Recognition using Face-Autocropping and Facial Feature Points Extraction." Proceedings of the 2nd International Conference on Perception and Machine Intelligence. ACM, 2015.

[5] Wang, S., He, M., Gao, Z., "Emotion recognition from thermal infrared images using deep Boltzmann machine." Frontiers of Computer Science 8.4, 2014, pp. 609-618.

[6] Chan, Chi Ho, et al. "Illumination invariant face recognition: a survey." Face Recognition in Adverse Conditions, 2014, pp. 147-166.

[7] Zhu, Jun-Yong, Wei-Shi Zheng, and Jian-Huang Lai. "Logarithm gradient histogram: A general illumination invariant descriptor for face recognition." Automatic Face and Gesture Recognition (FG), 2013 10th IEEE International Conference and Workshops on. IEEE, 2013.

[8] Ruiz-del-Solar, Javier, "Thermal Face Recognition in Unconstrained Environments Using Histograms of LBP Features." Local Binary Patterns: New Variants and Applications. Springer Berlin Heidelberg, 2014, pp. 219-243.

[9] Liu, Y., Ai, K., Liu, J., "Dopamine-Melanin Colloidal Nanospheres: An Efficient Near-Infrared Photothermal Therapeutic Agent for In Vivo Cancer Therapy." Advanced Materials 25.9, 2013, pp. 1353-1359.

[10] Pan, Zhihong, "Face recognition in hyperspectral images." Pattern Analysis and Machine Intelligence, IEEE Transactions on 25.12, 2003, pp. 1552-1560.

[11] Gonzalez, Carlos, "Use of FPGA or GPU-based architectures for remotely sensed hyperspectral image processing." INTEGRATION, the VLSI journal 46.2, 2013, pp. 89-103.

[12] Lloyd, J. Michael. Thermal imaging systems. Springer Science & Business Media, 2013.

[13] Wang, Shangfei, "Emotion recognition from thermal infrared images using deep Boltzmann machine." Frontiers of Computer Science 8.4, 2014, pp. 609-618.

[14] Simon, Marc Oliu, "Improved RGB-D-T based Face Recognition." IET Biometrics, 2016.

[15] Glorot, Xavier, Antoine Bordes, and Yoshua Bengio, "Deep sparse rectifier neural networks." International Conference on Artificial Intelligence and Statistics, 2011.

[16] Majumder, Goutam, and Mrinal Kanti Bhowmik, "Gabor-Fast ICA Feature Extraction for Thermal Face Recognition Using Linear Kernel Support Vector Machine." Computational Intelligence and Networks (CINE), 2015 International Conference on. IEEE, 2015.

[17] Zaeri, Naser, Faris Baker, and Rabie Dib. "Thermal Face Recognition Using Moments Invariants." 2015.

[18] Xie, Zhihua, and Guodong Liu, "Infrared face recognition based on LBP co-occurrence matrix and partial least squares." International Journal of Wireless and Mobile Computing 8.1, 2015, pp. 90-94.

[19] Gui, Jie, "Discriminant sparse neighborhood preserving embedding for face recognition." Pattern Recognition 45.8, 2012, pp. 2884-2893.

[20] Bengio, Yoshua. "Learning deep architectures for AI." Foundations and trends® in Machine Learning 2.1, 2009, pp. 1-127.

[21] LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." Nature 521.7553, 2015, pp. 436-444.

[22] Arel, Itamar, Derek Rose, and Robert Coop, "DeSTIN: A Scalable Deep Learning Architecture with Application to High-Dimensional Robust Pattern Recognition." AAAI Fall Symposium: Biologically Inspired Cognitive Architectures, 2009.

[23] Chen, Guang, "Combining unsupervised learning and discrimination for 3D action recognition." Signal Processing 110, 2015, pp. 67-81.

[24] Klepac, Goran, ed. Developing Churn Models Using Data Mining Techniques and Social Network Analysis, IGI Global, 2014.

[25] Pang, Ying-Han, Andrew Teoh Beng Jin, and David Ngo Chek Ling. "Binarized revocable biometrics in face recognition." Computational Intelligence and Security. Springer Berlin Heidelberg, 2005, pp. 788-795.

[26] Ruiz-del-Solar, Javier, "Thermal Face Recognition in Unconstrained Environments Using Histograms of LBP Features." Local Binary Patterns: New Variants and Applications. Springer Berlin Heidelberg, 2014, pp. 219-243.

[27] Zaeri, Naser, Faris Baker, and Rabie Dib. "Thermal Face Recognition Using Moments Invariants." 2015.

[28] Hermosilla, Gabriel, "Study of Local Matching-Based Facial Recognition Methods Using Thermal Infrared Imagery." International Journal of Pattern Recognition and Artificial Intelligence 29.08, 2015, 1556012.