CHALMERS

Face verification using biometric hashing
Yann COMBARNOUS
Supervisor: David C.L. Ngo
Examiner: Irene Y.H. Gu
Department of Signals and Systems
Chalmers University of Technology
Gothenburg, Sweden

April 2003

EX025/2003

Face verification using biometric hashing
Yann Combarnous
4th April 2003

A thesis submitted in partial fulfillment of the requirements for the degree of MSc in Digital Communications Systems and Technology from the Department of Electrical Engineering at Chalmers University of Technology, Göteborg (Sweden)

Supervisor: Dr. David C.L. Ngo, Faculty of Information System Technology, Multimedia University, Cyberjaya, Malaysia
Examiner: Dr. Irene Y.H. Gu, Department of Signals and Systems, Chalmers University of Technology, Göteborg, Sweden


Abstract

Face-based biometrics have proved robust and efficient for verification systems (comparison against a reference database). While this opens the door to a range of applications principally aimed at securing areas, it has not yet found its way into high-security systems, such as banking transactions. In this thesis, we outline an attempt to take a first step in the latter direction, by generating a unique bit-string per person with an almost error-tolerant transformation of the eigenprojections of a face, from an image processing viewpoint. The discretization is carried out by first projecting the eigencoefficients of the face on a pseudo-random space, then keeping the resulting coordinates that meet a discriminating criterion, and finally deciding each bit from the sign of each selected coordinate. The pseudo-random sequence can be provided by a physical token. In the end, experiments show that a system tolerant of image capture offsets could be built upon our results, opening the way to possible applications of interest in the cryptographic field.

Keywords: biometric verification, biometric hashing


Acknowledgments

The author wishes to thank Associate Professor David Ngo for his supervision and support during the thesis work. Special thanks also go to the other persons associated with the project, namely Dr. Alwyn Goh for his advice and help throughout the thesis, and Eugene Ho for his support, as well as Associate Professor Irene Gu for accepting to be my examiner. The author is also grateful to Ian Simpson, from his home school ENST-B in France, for making this exchange possible. Finally, he would like to thank the "Conseil Régional de Bretagne", as well as Malaysian Multimedia University, for their financial support for this master thesis. The author would like to dedicate this work to the people who made this stay in Malaysia so enjoyable.


Contents

1 Introduction
2 Literature review
3 Background
   3.1 Eigenfaces
      3.1.1 Principle
      3.1.2 Feature detection using PCA
      3.1.3 Face recognition using PCA
   3.2 Normalization
      3.2.1 Dynamic Symmetry
      3.2.2 Light normalization
   3.3 Wavelet analysis
      3.3.1 Principle
      3.3.2 Face recognition using wavelet transform
   3.4 Bio-hashing on PCA
      3.4.1 Previous work
      3.4.2 Biometric hashing overview
      3.4.3 Discretization process
4 System presentation
   4.1 Assumptions and choice of the test database
   4.2 Face detection with Gaussian pyramids
   4.3 Features location
   4.4 Features extraction
   4.5 Face recognition and verification
   4.6 Wavelet transform selection
   4.7 Bio-hashing tests
      4.7.1 Presentation of the test platform for issues resolution
      4.7.2 Practical adaptations
5 Results
   5.1 Face and eye detection in an image
   5.2 Face recognition using PCA on eigenfeatures
      5.2.1 Identification
      5.2.2 Verification with biometric hashing on PCA
   5.3 Face recognition using Wavelet Transform and PCA
      5.3.1 Identification
      5.3.2 Verification with biometric hashing on wavelet+PCA
6 Conclusion
7 Future work
8 Annex 1: GUI quick guide
9 Annex 2: Program tree

1 Introduction

Face recognition has become an intensive field of research since the early nineties, together with other biometric identification methods. While fingerprints and iris scans can provide high accuracy rates, they still require complex and specialized scanners. On the contrary, face recognition can be performed with as simple a device as a webcam, guaranteeing both a non-intrusive experience for the scanned person and a wide range of everyday applications. Even though it does not provide as high an accuracy rate as the aforementioned methods, it can be implemented so easily in cheap hardware that it would represent an additional mechanism, at low cost, for many security systems. The growing use of video surveillance in towns, banks, shops, private companies, etc. guarantees on its own a bright future for face recognition. As those video systems get cheaper and easier to install (in Japan, the latest generation of video surveillance systems does not need wires, as the cameras are directly connected to 3G mobile networks), their wide adoption makes the quantity of data hardly processable by humans alone.

Our goal in this thesis will be to show what significant security gain can be obtained in some applications by adding simple face recognition systems. Assuming good image capture and lighting conditions, this still applies to a wide range of applications, including but not limited to ATMs, company security areas and computer login. Specifically, we will try to see how well face recognition fits the two kinds of authentication: verification and identification. Both concepts have been well defined [13] as:

"The key difference between these two modes of authentication is whether the match/non-match decision is based on a one-to-one comparison or on an association based on a one-to-many search in a database.

- Verification - The system verifies the claimed identity of the user by comparing his/her biometric sample with one specific reference template, which is either physically presented by the user or pointed to in the database. Verification can be knowledge-based (e.g. PIN or password) or token-based (e.g. smart card). The user says, "I am X!" and the system replies "yes, you are X!" or "no, you are not X!"

- Identification - The system identifies the end user from his/her biometric sample by associating it with his/her particular reference template based on a database search among the reference templates of the entire enrolled population. The user asks, "who am I?" and the system replies "you are X!" or "you are not an authorized user.""

2 Literature review

One of the precursors of face recognition as a science can be traced back to Francis Galton, as early as 1888 [14]. By considering important facial features or key-points, such as eye corners, mouth corners, nose tip and chin edge, and by measuring the relative distances between these key-points, he could construct vectors to describe each face. An unknown face could then be identified by comparing its feature vector to a database of known faces. The success of this method depends highly on the accuracy of the measurements. This approach belongs to the more global constituent-based approach, in opposition to the face-based approach; these constitute the two main directions automatic face recognition systems explore nowadays. The face-based approach can be described as an attempt to treat the face as a whole, in the sense that it is handled as a two-dimensional pattern of intensity differences. The match is then done by identifying the underlying statistical regularities between the tested faces. Face-based approaches have proved more successful over time, whereas constituent-based ones turn out to be extremely sensitive to feature detection, and fail to provide good performance even under restricted conditions. We will describe some face-based methods, namely neural networks, correlation, and eigenfaces.

Neural networks are a popular approach in the literature. For instance, the leading neural-network-based face recognition system has been developed by Lawrence et al. [15], and relies on Self-Organizing Maps (SOMs) and Convolutional Neural Networks. This system has achieved quite reasonable performance, reaching a 94% recognition rate on a database of 400 images. One of the biggest drawbacks of neural networks is their high computational complexity. For instance, with a small image of size 128x128 pixels, a neural net of 16,384 neurons was required for processing. Furthermore, the training necessary to ensure reliable performance also requires a very large training set (on the order of 16,384 images). This is likely to raise problems in real-world applications, where only very few images of an individual would be in our possession. In addition, the neural-network approach does not make use, until now, of the inner properties of the face in an explicit manner [1], which, all in all, makes this approach ill-suited to the problem of face recognition.

Another popular technique for template matching in the image processing field is correlation. Kosugi [4] has applied this technique to detect facial features and to face recognition. The matching in that case is a two-step process: first, the grey-scale image is transformed into a vector by concatenating the pixel columns; then it is compared to a database of known faces by computing the correlation between this image's representation vector and those of the other people in the database. The member of the database with the largest correlation is declared the closest match. However, this method suffers from high sensitivity to scaling and noise, and its performance in badly-lit conditions is poor [1].

Finally, the eigenfaces method attempts to capture the variation between facial images in an orthogonal basis set of vectors, referred to as eigenfaces. In other words, the eigenfaces are the image vectors which map the most significant variations between faces. It has been shown that any face image can be represented by a linear combination of eigenfaces, i.e. by a weight vector. The recognition is done by comparing the weight vector of an image to the weight vectors in a database, and finding the closest match. Several advantages make eigenfaces a good starting point for our goal:

• eigenfaces have been shown to produce 96% correct classification under different lighting conditions [2]
• eigenfaces are tolerant to small variations in scale, rotation and expression [2, 4]
• eigenfaces have already been used in real-time systems, hence have an acceptable computational complexity [6]

In this perspective, this technique appeared as the logical choice for this project.


3 Background

3.1 Eigenfaces

The images we will consider are black-and-white images of size N × M. They will be described as a matrix I(x, y), where I(x, y) is the intensity of the point at coordinates (x, y) in a given image. Now let us imagine that a face can be represented as a single point in an N × M dimensional space, simply by changing the N × M intensity matrix into a vector of length N × M. For instance, a 128x128 image would correspond to a point in a 16,384-dimensional space [2]. However, it has been proved by Turk and Pentland [1] that facial images occupy a much smaller subspace, which is quite intuitive as they constitute a subclass of the class "images" with common characteristics. This is where the eigenface method comes into play.

3.1.1 Principle

As stated above, the assumption the eigenface method works on is that facial images occupy only a limited sub-region of the image space. Thus it is possible, through Sirovich-Kirby Principal Component Analysis or PCA [3], to find an optimal coordinate system for facial images. PCA is simply a mathematical procedure to transform a number of correlated variables into a smaller number of uncorrelated variables. Its objective is to reduce the dimensionality of the data set, as well as to identify meaningful underlying variables. The idea is to capture the maximum of the variation in the face space with as few variables as possible, which is realized by finding the axes (also called eigenvectors) on which the projections of the faces have the highest variance, and iterating on the remaining orthogonal space. The direct application of this to face representation is that a face can be described with fewer variables, thus reducing the computational complexity of face recognition. Moreover, each variable is ranked depending on its relative importance in explaining the observations. So in practice, this optimal basis will be orthogonal and will maximize the variance in the set of facial images. The mathematical computation of those vectors is detailed below.

Given a matrix C, the eigenvectors $u_i$ and eigenvalues $\lambda_i$ of C satisfy

$$C u_i = \lambda_i u_i, \quad \forall i \qquad (1)$$

The eigenvectors are orthogonal and normalized, hence

$$u_i^T u_j = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases} \qquad (2)$$

Let $I_k$ be the matrix of intensity values of the greyscale facial image of index k, in a set of M facial images. Besides, let $\Gamma_k$ represent the column vector of face k, obtained through lexical ordering of $I_k$. Finally, let $\phi_k$ be defined as the mean-normalized vector for face k; in other words, $\phi_k = \Gamma_k - \psi$ with

$$\psi = \frac{1}{M} \sum_{i=1}^{M} \Gamma_i$$

Now, let C be the covariance matrix of the mean-normalized faces

$$C = \frac{1}{M} \sum_{k=1}^{M} \phi_k \phi_k^T$$

with M the number of facial images in our representative set. These facial images make it possible for us to characterize the sub-space formed by faces within the image space. Hence, we will refer to this sub-space as the face-space from now on. From (1), we have

$$C u_i = \lambda_i u_i \quad \Rightarrow \quad u_i^T C u_i = u_i^T \lambda_i u_i = \lambda_i u_i^T u_i$$

Now, since $u_i^T u_i = 1$ by (2),

$$\begin{aligned}
\lambda_i &= u_i^T C u_i \\
&= \frac{1}{M}\, u_i^T \sum_{k=1}^{M} \phi_k \phi_k^T u_i \\
&= \frac{1}{M} \sum_{k=1}^{M} u_i^T \phi_k \phi_k^T u_i \\
&= \frac{1}{M} \sum_{k=1}^{M} (u_i^T \phi_k)^T (u_i^T \phi_k) \\
&= \frac{1}{M} \sum_{k=1}^{M} (u_i^T \phi_k)^2 \\
&= \frac{1}{M} \sum_{k=1}^{M} \left( u_i^T \Gamma_k - \mathrm{mean}_k(u_i^T \Gamma_k) \right)^2 \\
&= \mathrm{var}_k(u_i^T \Gamma_k)
\end{aligned}$$

In other words, the eigenvalue $\lambda_i$ stands for the variance of the representative facial image set along the axis described by eigenvector $u_i$. As a result, selecting the eigenvectors with the largest eigenvalues (Figure 1) as our basis is equivalent to selecting the dimensions which express the greatest variance in face images, which is what we expected. It has been shown that using this coordinate system, a face can be accurately recognized with as few as 6 coordinates. Which is to say that, where it previously took 16,384 bytes to represent a face in the image space, it only takes virtually 6 bytes to represent it in the face space. Generally speaking, we expect to be able to code faces on $M' \ll N^2$ bytes.
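As an illustration of this derivation, the eigenface basis can be computed with the classic "snapshot" trick: when M is much smaller than the number of pixels, one diagonalizes the small M × M matrix Φ^T Φ instead of the huge covariance matrix C. The sketch below is our own illustration in Python/NumPy, not the code used for the thesis experiments, and all names in it are ours; it also selects M' by the energy threshold discussed just below.

    import numpy as np

    def eigenfaces(images, energy=0.99):
        """Eigenface basis from equally-sized greyscale images (2-D arrays),
        keeping enough eigenvectors to capture `energy` of the variance."""
        # Each image becomes one column vector Gamma_k.
        gammas = np.column_stack([im.ravel().astype(float) for im in images])
        psi = gammas.mean(axis=1, keepdims=True)     # mean face
        phis = gammas - psi                          # mean-normalized faces
        m = phis.shape[1]
        # Snapshot method: the eigenvectors of the small M x M matrix
        # Phi^T Phi / M, lifted back to image space by Phi, are the
        # eigenvectors of C = Phi Phi^T / M.
        vals, vecs = np.linalg.eigh(phis.T @ phis / m)
        vals, vecs = vals[::-1], vecs[:, ::-1]       # decreasing variance
        u = phis @ vecs
        u /= np.linalg.norm(u, axis=0)               # normalized eigenfaces u_i
        # Keep the M' leading axes capturing the requested energy.
        m_prime = int(np.searchsorted(np.cumsum(vals) / vals.sum(), energy)) + 1
        return u[:, :m_prime], vals[:m_prime], psi

A new face Γ is then encoded by its M' projections u.T @ (Γ - ψ).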

Figure 1: Example of 10 eigenfaces

This reduction in dimensionality makes the process of face recognition much simpler, since we only have to focus on the relevant attributes of the face. $M'$ is computed by determining the fraction of the energy we want the eigenspace to capture from the image space. Typically, one fixes a threshold (like 99%), and then finds $M'$ such that:

$$M' = \min \left\{ K \;\middle|\; \frac{\sum_{i=1}^{K} \lambda_i}{\sum_{i=1}^{M} \lambda_i} > \text{threshold} \right\}$$

3.1.2 Feature detection using PCA

Eigenfeatures (resp. eigenfaces) are also a popular way of detecting features (resp. faces) in a picture. This step is of the utmost importance: the more accurately we detect the face and its features, the better our normalization will be (Chapter 3.2) and, hopefully, the better the performance of the system. The problem of face detection can be seen as determining how face-like each region of a picture is. We should recall that the eigenvectors of the face-space are those that optimally represent facial images within the image space. On the one hand, facial images will thus be reconstructed using this basis with relatively little error. On the other hand, non-facial images are bound to look completely different once reconstructed using the face-space eigenvectors [1]. By computing the sum-of-squares error between a region of a picture and its reconstruction using the face-space eigenvectors, we can get a measure of the reconstruction error. This reconstruction error is called the Distance From Face Space (DFFS) [1]. By sweeping a window across the scene, we are bound to find the most probable location of the face. It will simply be the point where the reconstruction error is minimal:

$$(\hat{x}, \hat{y}) = \arg\min_{(x,y)} \varepsilon(x,y) = \arg\min_{(x,y)} \left\| \phi^{(x,y)} - \phi_f^{(x,y)} \right\|$$

Let us define I as the Y × X matrix corresponding to the scene. In that case, $\Gamma^{(x,y)}$ is the column vector corresponding to the lexicographically-ordered window of size (M + 1) × (N + 1), centered at position (x, y) in the scene:

$$\Gamma^{(x,y)} = I\left( x - \frac{M}{2} : x + \frac{M}{2},\; y - \frac{M}{2} : y + \frac{M}{2} \right), \text{ lexicographically ordered}$$

$\phi^{(x,y)}$ is defined as the mean-normalized version of $\Gamma^{(x,y)}$:

$$\phi^{(x,y)} = \Gamma^{(x,y)} - \psi$$

and $\phi_f^{(x,y)}$ is the reconstruction of $\phi^{(x,y)}$ using the eigenvector basis. It is obtained by

$$\phi_f^{(x,y)} = \sum_{k=1}^{M'} \omega_k^{(x,y)} u_k$$

with $\omega_k^{(x,y)} = u_k^T \phi^{(x,y)}$. Sweeping a window across an entire image can turn out to be quite computationally expensive. Nevertheless, it can be simplified, as it has been shown [1] that it can be reduced to a number of FFTs and IFFTs of order $O(M')$.
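A direct (non-FFT) version of this sliding-window search is easy to sketch. The following illustration is ours and assumes `u` (eigenface matrix) and `psi` (mean face) come from a routine such as the one sketched in Chapter 3.1.1; a real implementation would use the FFT simplification cited above.

    import numpy as np

    def dffs_search(scene, u, psi, win_shape):
        """Return the window position minimizing the Distance From
        Face Space (reconstruction error), i.e. the most face-like spot."""
        h, w = win_shape
        best_eps, best_pos = np.inf, None
        for y in range(scene.shape[0] - h + 1):
            for x in range(scene.shape[1] - w + 1):
                phi = scene[y:y + h, x:x + w].ravel() - psi.ravel()
                omega = u.T @ phi          # projection coefficients
                phi_f = u @ omega          # reconstruction from face space
                eps = np.linalg.norm(phi - phi_f)  # DFFS
                if eps < best_eps:
                    best_eps, best_pos = eps, (x, y)
        return best_pos, best_eps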

3.1.3 Face recognition using PCA

Figure 2: How face decomposition works

The face recognition process can be divided into 2 main steps (provided the faces have been normalized beforehand):

• the training stage: we compute the optimal face-space using n pictures of each person we want to be able to recognize (Chapter 3.1.1).

• the recognition stage: one or several pictures per person are projected on this subspace and stored in a database. Then, when we want to identify somebody, we project the face on the face-space (Figure 2), and find the closest match in the database.

Let us explain the latter step in more detail. As stated earlier, for some image $\Gamma_i$, we can find its projection onto axis k by

$$\omega_k = u_k^T (\Gamma_i - \psi)$$

If we call $\Omega_i$ the vector that contains the projections of $\Gamma_i$ onto each of the M' first eigenvectors (sorted from higher eigenvalue to lower),

$$\Omega_i = \left[ \omega_1, \omega_2, \ldots, \omega_{M'} \right]$$

then, given a set of N photographs, we could easily determine the identity of an unknown face $\Omega_i$ by finding which face representation $\Omega_n$ in our database is the closest in the face-space. For instance, a simple way of achieving this would be to minimize the Euclidean distance:

$$n = \arg\min_k \| \Omega_i - \Omega_k \|$$

However, it turns out that in our case, the Euclidean distance is ill-suited as a decision criterion. To see why, we have to model the probability that $\Omega_i$ and $\Omega_k$ are the same person. To do so, we assume that the distance between the projections on any face-space axis of two pictures of the same person can be modeled as an error with Gaussian distribution, of mean zero and variance the eigenvalue of that axis. By using a high-dimensional Gaussian distribution based on the same model, we can give an estimate of the probability for two faces to match each other, depending on the error on each axis. Hence, the relationship is as follows:

$$P\left( (\Gamma_i \mid \Omega_k) \mid (\Gamma_i \mid \Omega) \right) = e^{-\sum_{i=1}^{M} \frac{\Delta\omega_i^2}{2\lambda_i}} = e^{-\sum_{i=1}^{M'} \frac{\Delta\omega_i^2}{2\lambda_i}} \cdot e^{-\sum_{i=M'+1}^{M} \frac{\Delta\omega_i^2}{2\lambda_i}}$$

Here, Ω stands for the class of all faces. Now, we can make an approximation by stating that the M' first eigenvectors contain almost all of the energy (variance):

$$P\left( (\Gamma_i \mid \Omega_k) \mid (\Gamma_i \mid \Omega) \right) \approx e^{-\sum_{i=1}^{M'} \frac{\Delta\omega_i^2}{2\lambda_i}}$$

So, in theory, the lower the value of $D = \sum_{i=1}^{M'} \frac{\Delta\omega_i^2}{\lambda_i}$, the better the chances that our two images come from the same person. This distance is called the Mahalanobis distance. It takes into account the fact that each of the eigen-dimensions exhibits a different variance, by normalizing each of those dimensions to unit variance. This distance measure corresponds to a high-dimensional Gaussian distribution with a different variance in each of the eigen-dimensions.

To see what our probability would have been with the Euclidean distance, we just have to recall that the variance is then assumed to be uniform, which leads to the formula:

$$P\left( (\Gamma_i \mid \Omega_k) \mid (\Gamma_i \mid \Omega) \right) \approx e^{-\sum_{i=1}^{M'} \frac{\Delta\omega_i^2}{2\sigma}} \approx e^{-\frac{\varepsilon^2}{2\sigma}}$$

Moreover, it has been shown that the optimal value for σ is simply the average of the eigenvalues of the first M' principal components. It is then likely that the Euclidean distance is a bad decision criterion, as it does not fit the probability model.
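The two criteria differ only in how the eigen-dimensions are weighted; a minimal sketch of both (our own illustration, with names of our choosing):

    import numpy as np

    def euclidean_dist2(omega_a, omega_b):
        # Uniform variance assumed across all M' dimensions.
        return float(np.sum((omega_a - omega_b) ** 2))

    def mahalanobis_dist(omega_a, omega_b, eigenvalues):
        # Each dimension normalized by its own variance lambda_i,
        # matching D = sum_i (delta omega_i)^2 / lambda_i above.
        return float(np.sum((omega_a - omega_b) ** 2 / eigenvalues))

Chapter 4.7.2 explains why, despite this model, the Euclidean distance was eventually preferred for the verification experiments.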

3.2 Normalization

One of the keys to an efficient face recognition system is the ability to rely on a good pre-processing of the face image, before the projection on the face-space and the final comparison take place. Different algorithms have to be applied to achieve this goal. They aim at solving issues such as lighting conditions, and rotated or unknown scale of a face in a scene, to name a few. Besides, no real standard has been established so far for how to normalize a face. We will try to see how a 3-point normalization, inspired by dynamic symmetry, can benefit our system.

3.2.1 Dynamic Symmetry

Dynamic Symmetry dates back to the 1920s, when Professor Jay Hambidge gave this name to the method of planning works of art used by Egyptian and Greek craftsmen from pre-historic times until the decadence of Greek sculpture. This re-discovery was supported by exhaustive measurements of ancient work, including Greek vases and temples. It is a process that begins with various root rectangles as a primary shape that is to be sub-divided into its composite parts. It is considered that, by their subdivisions, these rectangles permit every feature of the picture to be a harmonious part of the whole panel. If done with intelligence, there is a certain balance to the design, which is of great value.


Symmetry, for Greek and Roman, and also Gothic architects, meant the linking of all the elements of the planned whole through a certain proportion or a set of related proportions. Dynamic Symmetry means that although the linear elements (segments of straight lines) used in the composition are irrational (non-commensurable), the surfaces built on them may be commensurable, linked through a rational proportion. So in order to obtain dynamic symmetry, we must use dynamic rectangles(1) such as √2, √3, √5, φ(2) and φ². These rectangles are such that the ratios between their longer and their shorter sides are equal to these numbers (Figure 3).

(1) Dynamic rectangles are rectangles whose proportions are irrational numbers. (The rectangles showing rational numbers in their proportions, such as 3/2, 5/4, etc., are called static rectangles.)
(2) φ symbolizes the number (√5 + 1)/2 = 1.618..., also called the golden section.

Figure 3: Head structure (A. Darville)

So what is the link with face normalization? After comparing the measurements of a certain number of human heads belonging to Greek statues, to Renaissance paintings and to living people, M. Ghyka [17] found that the ideal average faces fall, as regards proportions, into two main types, both framed frontally by a Golden Rectangle, divided into two equal parts by the horizontal line joining the centers of the eyes. In the first type (Figure 4a) the forehead is higher (in proportion to the whole head) than in the second type (Figure 4b), where the line of the eyebrows divides the whole height of the face according to the Golden Section. Figure 3 corresponds to measurements taken by the Belgian sculptor Alphonse Darville and set by him in an average diagram. We will call these diagrams dynamic grids. It is interesting to see that each human face is proportioned in much the same way, and can be described by one dynamic grid. This means that if we can construct the dynamic grid corresponding to one face, we can deduce from it the place of facial features such as the nose, eyes, or mouth, in a very simple way.

Figure 4: (a) Head structure (A. Darville); (b) Miss Helen Wills, harmonic analysis

The model of face we adopted for the experiments is based on the first type (Figure 4a), and it makes use of the positions of the left and right eyes as dynamic points of subdivision to construct a dynamic face. Once we have the positions of the left eye and right eye, we can deduce the distance between the two eyes, and scale the face horizontally to place the eyes on their corresponding nodes of the dynamic grid. It is essential to recall that those grids describe the average behavior of face proportions. To make our feature extraction even more accurate, we also need to scale the face vertically, which can be achieved for instance by locating the mouth, and scaling the image so that the mouth lands on its corresponding node. From there, it is very easy to extract the features from the face, as shown in Chapter 4.4.

3.2.2 Light normalization

In a real-world situation, faces are likely to be subject to changes in lighting conditions. Those changes occur frequently, due to variations in illumination intensity and in illumination direction. It is common practice to classify them into two categories: global intensity changes, and localized gradients. The former are easy to compensate for, as we only have to mean-normalize the image and modify the intensity variance of the image so that it is equal to one (Figure 5). Histogram equalization has also proved a popular method, but it is computationally costlier.


The latter require complex techniques to counteract their effect. Shadows, directional lighting and specular lighting are some examples of sources of localized gradients, and only non-linear operations can compensate for them. Non-linear block processing and frequency-selective filtering are two techniques that can be applied to solve this problem.

Figure 5: Light normalization effect. (a) Normal photo; (b) photo after light normalization.

Eigenfaces have been shown to perform worse with localized gradients on the image [18]. In a real-world environment, it is reasonable to assert that many face recognition applications can be assumed to operate in good light conditions, i.e. only with global intensity changes. This is especially true for identification and verification systems, where the user is supposed to be facing the camera and standing at short range, making it easy for the system to compensate for any lighting defect prior to capturing the image. This assertion would nonetheless be wrong in the case of surveillance systems. Hence, we will focus only on global intensity normalization.
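For global intensity changes, the compensation described above reduces to a two-line operation; a minimal sketch, assuming plain mean/variance normalization rather than histogram equalization:

    import numpy as np

    def normalize_intensity(image):
        """Shift a greyscale image to zero mean and scale it to unit
        variance, compensating global illumination changes (Figure 5)."""
        pixels = image.astype(float)
        return (pixels - pixels.mean()) / pixels.std()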

3.3 Wavelet analysis

Wavelet decomposition [22, 23] provides local information in both the space and frequency domains. The wavelet transform (WT) is one of the methods that have been investigated so far to compensate for some limitations the PCA-based method suffers from. The first drawback is that while PCA is well known to give a very good representation of a face, it has poor discriminatory power between faces of different persons. Moses et al. [19] have shown that PCA maximizes the spread of both inter-class and intra-class images (a class being the set of pictures of one person) used to create the eigenspace, without distinction. While some additions to PCA have been proposed by Swets and Weng [20] to make up for that property, such as linear discriminant analysis (LDA), this is considered only a minor drawback for the kind of identification system we are to build. Moreover, we try to lessen its impact in Chapter 4.1.

The second problem is much more sensitive, as it involves the high computational load of finding the eigenvectors. Suppose we have an image of resolution 128 × 128. The computational complexity of finding the eigenvectors of this image is estimated to be O(d³), where d is the total number of pixels. From matrix theory we know that, in the case where the number of training images N is smaller than d, the complexity reduces to O(N³). But what if we regularly add users, and N becomes large? Recomputing the eigenspace every time will then carry a cost increasing in cubic order, leading to undesirably high computation costs. PCA on the WT has been shown to address those two issues [21]. In particular, it has been possible to select a 16 × 16 sub-band and still get excellent recognition rates and discrimination power. It also ensures that the computational complexity will not grow when we extend our base of users to more than 16 × 16 = 256 people in that case.

Let us now have a look at the principle of the WT.

3.3.1 Principle

The low-frequency content of a signal is in many cases the most important part of it. To some extent, it gives the signal its identity. On the other hand, the high-frequency content often carries the nuances. Take for instance the human voice. If the high-frequency components are removed, the voice is likely to sound different, yet one can still tell what is being said. But if enough of the low-frequency components are removed, one ends up with an unintelligible signal. In wavelet analysis, this results in a distinction between approximations and details, approximations being the high-scale, low-frequency components of the image, and details the low-scale, high-frequency components. A basic view of the filtering process is shown in Figure 6: two complementary filters are applied to the signal S, resulting in two output signals.

Figure 6: The filtering process. The signal S goes through a lowpass filter, producing the approximation coefficients cA, and through a highpass filter, producing the detail coefficients cD; with down-sampling, each output keeps about half as many samples as S.

Unfortunately, if we actually perform this filtering on a real digital signal without down-sampling, we get twice as much data as we started with. For instance, if the original signal S consists of 500 samples of data, then the output signals will each have 500 samples, i.e. a total of 1000.

These signals are interesting, but we get 1000 values instead of the 500 we had. Fortunately, there exists a more subtle way to perform the decomposition using wavelets. If we look carefully at the computation, we may keep only one sample out of two in each of the two output sequences and still retain the complete information. This is the principle of down-sampling, which produces the two sequences cA and cD. The process in Figure 6 includes this down-sampling, and produces Discrete Wavelet Transform (DWT) coefficients. To gain a better appreciation of this process, we can perform it on a real-world signal, as shown in Figure 7.

Figure 7: Real-world signal case

It is important to realize that the decomposition process can be iterated, with successive approximations being decomposed in turn, so that one signal is broken down into many lower-resolution components. This is called the wavelet decomposition tree.

3.3.2 Face recognition using wavelet transform

To implement the wavelet transform on a 2D image, we first apply a one-dimensional transform to the rows of the original image, and then to the columns of the resulting transformed image [9]. An example of the resulting image (one-level wavelet transform) is shown in Figure 8a. The interpretation of the different zones (or sub-bands) is as follows:

• zone LL (sub-band 1 in that case) is a low-frequency decomposition of the original image, or coarser approximation;
• zone HL (sub-band 2) highlights the changes along the vertical direction, and its counterpart zone LH (sub-band 3) those along the horizontal direction;
• zone HH (sub-band 4) contains the high-frequency components.
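As an illustration, the sub-band decomposition of Figure 8 can be reproduced with an off-the-shelf DWT library; the sketch below is ours and assumes the PyWavelets package (the thesis experiments used their own test platform):

    import pywt  # PyWavelets

    def one_level_subbands(image, wavelet="db2"):
        """One-level 2-D DWT: the LL approximation plus the three
        detail sub-bands, as in Figure 8a."""
        ll, (detail_h, detail_v, detail_d) = pywt.dwt2(image, wavelet)
        return ll, detail_h, detail_v, detail_d

    def multilevel_subbands(image, wavelet="db2", levels=3):
        """Iterate the decomposition on the LL band (Figure 8b)."""
        coeffs = pywt.wavedec2(image, wavelet, level=levels)
        return coeffs[0], coeffs[1:]  # coarsest LL, then details per level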


Figure 8: 2D wavelet decomposition examples. (a) 1-level wavelet decomposition; (b) 3-level wavelet decomposition.

We can conduct further decomposition by applying the wavelet transform recursively on the LL band. Figure 8b shows a 3-level wavelet transform, for instance. In our case, we will be using the 'Whole face' eigenfeature described in the first part, which is a 139x87-pixel image. So the scale of the various sub-bands (after rescaling the original image to 136x88, so that it is divisible by 8) will be 17x11 for sub-bands 1, 2, 3 and 4, 34x22 for sub-bands 5, 6 and 7, and 68x44 for sub-bands 8, 9 and 10. Figure 9 gives an example of the application of a 3-level WT to an eigenfeature.

Figure 9: 3-level wavelet decomposition on an image

3.4 Bio-hashing on PCA

All of our work is actually focused on bridging the gap between the worlds of biometrics and cryptography. The cryptographic part of the project will be developed in a paper based on this thesis. Those two formalisms turn out to be highly complementary in the security field, hence the interest of combining them. The radical differences between those two worlds clearly pose a challenging problem. With biometrics, we are confronted with continuous data coming from analogue sources, and reference and test images, while alike, will generate different representations. Hence its reliance on similarity for the classification of sources, and the appearance of a False Acceptance Rate (FAR) and a False Rejection Rate (FRR). By opposition, cryptography is based on the principle of uniqueness and equality: a unique token is assigned to each individual.

3.4.1 Previous work

The first publication concerning this subject was made by Soutar et al. [32, 33]. Their research focused on cryptographic key-recovery, by integral correlation of captured fingerprint data and pre-registered bioscrypts. Those bioscrypts are the result of mixing user-specific and random data, in order to prevent the recovery of the fingerprint data. The data-capture uncertainty issue is addressed by multiply-redundant majority-result table lookups. However, several issues arise from this formulation. First, the keys are externally specified and then recovered, instead of being internally computed. Another main issue is the table dimensionality: while its adjustment makes it possible to ensure that the level of tolerance to data-capture offsets for the same user is high enough, it does not handle the discrimination of different users in a satisfactory way. Finally, an increase of the key-length does not mean an increase in security, but an increase in the lookup table dimension. Hence, the key concepts of what should ideally be achieved can be summed up in 3 words: tolerance, discrimination, and security (of the representation).

An iris-based cryptographic signature verification system without stored references was developed by Davida et al. [34, 35]. It relies on open token-based storage of user-specific Hamming codes. Those codes correct the offsets of the biometric data capture, and the corrected data can be used for verification. While error correction via Hamming codes seems more rigorous than Soutar et al.'s method, the iris-derived key is completely deterministic.

Monrose et al. [36] computed keys from user-specific voice data. The technique consists in the generation of bits via classification of user-specific features, depending on whether they are above (0) or below (1) a fixed population threshold. A feature can also generate an indeterminate state Ø. The concatenation of these bits is then combined

with randomized lookup tables, formulated via Shamir secret-sharing [37]. The error-correction mechanisms are composed of Shamir polynomial thresholding and Hamming codes. The advantage of this methodology is that the key length and the system security both increase when we choose to use more features, contrary to what happens in Soutar et al.'s scheme.

Direct mixing of random and biometric data seems to be the most advantageous solution, insofar as there is no deterministic way to obtain the cryptographic key of a user without having both the token with the random data and the user's face data. Moreover, changing the cryptographic key of a user is as simple as changing the support containing the random data. This would protect us, for instance, against biometric fabrication.

3.4.2 Biometric hashing overview

The proposed bio-hashing methodology can be decomposed into:

1. A dimensionality reduction of the face data, as presented in the preceding chapters. This involves the use of eigenanalysis, or the wavelet transform associated with eigenanalysis.

2. A discretization of the data via a product of tokenized random data and user data. A further step in offset tolerance is made here.

3. A cryptographic interpolation applied for error correction, via Shamir secret-sharing over the token and biometric data.

As one can notice, each of the stages is inspired by a different part of the previous works mentioned before:

• tokenized mixing of random data, like Soutar et al.'s formulation
• scalability of the discretization, like Monrose et al.'s formulation
• rigorous error correction, like Davida et al.'s and Monrose et al.'s formulations

Part (1) of the methodology is described in detail in the previous chapters. The goal is to extract the main features of the face images, which are the most stable across images of the same user, and vary a lot between different users. Part (3) will not be described in this paper, as it is more related to the cryptographic part, and our main interest remains the signal processing side of this project. So, let us move to the discretization stage.


3.4.3 Discretization process

The particular purpose of this project is to provide a unique and secure bit-string, or key, per individual, that could later be used in cryptographic systems. Let us have a look at the methodology used. In particular, we should examine the successive steps taken to obtain a discrete and invariant bit-string output. If we call S the space containing the face images, and S' the one containing their eigenprojections on the M' first eigenvectors, we have the following progression:

1. Bitmap representation: $\Gamma \in S \subset \mathbb{R}^{N}$, with N the picture dimension

2. Eigenprojections: $\Omega \in S' \subset \mathbb{R}^{M'}$, with $M' \ll N$, as described in Chapter 3.1.1

3. Discretization: $X \in S'' \subset \{0, 1\}^m$, with m the length of the bit-string

There is also a logical progression in dimension and uncertainty reduction. It is worth noting that there is no a-priori restriction on the value of m, and that, by virtue of using pseudo-random vectors, we can interpolate $X \in \{0, 1\}^m$. The transition between (1) and (2) is important insofar as good feature location and extraction can substantially reduce the offset between two pictures of the same class. Achieving (3) requires an offset-tolerant transformation, and the choice of a decision criterion C to assign a single bit to each projection:

1. Compute $x = \Omega \cdot r$, with r a normalized pseudo-random vector in $\mathbb{R}^{M'}$

2. Let $b(x) = \begin{cases} 0 & \text{if } x > 0 \text{ and } C \\ 1 & \text{if } x < 0 \text{ and } C \\ \text{Ø} & \text{otherwise} \end{cases}$

Figure 10: The bit attribution decision depending on the vector position

3 BACKGROUND

3.4 Bio-hashing on PCA

(1) and (2) can be iterated as many times as we need. Determining C requires that we look back at the properties of PCA. The eigenprojections of a given face are supposed to have at least some components that feature a relative stability over some axis, according to the similarity principle eigenfaces are based on. Therefore, in order to ensure our system is error-tolerant, we have two worst-case scenario to avoid. The first one is when Ω and r are almost perpendicular to each other, making thus the decision on the sign sensitive to small variations of Ω (Figure 10). The second is when Ω is subject to large variations on a particular r, in other words, r is along an unstable axis . Because of the PCA properties, we are sure to find some r vectors that do not fall into those scenarios, however we should have a C that makes the validation or rejection easy. A first and simple approach that we could imagine is to set a threshold angle (Figure 10) beyond which we would be in an indeterminate state. The process of generating a token of pseudo-random vectors taking place only once for an individual, it can be considered secure in the sense that there is no way to recover the face data by getting hold on the token (one-way transformation). As a result, we should get a unique bit-string per person, which is highly desirable in a secure environment and outperforms the classic verification scheme, considered a weak-security system for it needs to access an external database of user data.

26

4 SYSTEM PRESENTATION

4 System presentation In this part, we will focus on describing the algorithms used to have a fully functional system. From the theory explained in last chapter, we will show how we can build an entire system by adding data structure formatting, as well as some software implementation, and putting the bricks altogether. Eigenspace projection

?

Comparison in database

This is Mr Smith Acquisition

Face detection

Features detection

Scale,light, rotation

Wavelet transform

PCA projection

Figure 11: Face recognition system for identification A typical face recognition system for identification can be divided in 5 main components (Figure 11): • Face detection component: Locates the face in a bitmap image from a digital camera/web-cam. A rough segmentation of the face is then sent to the next component. • Features detection: Locates the eyes and the mouth in the segmented parts of the face. The coordinates are passed to the next component. • Rotation, scale, and light normalization: Normalize the face so that it can be used for comparison with the face database. • Feature(s) extraction: Extract the feature(s) we want to use for recognition • Projection and comparison: Features are projected on an eigenfeature space (or first we use the WT, and then we project), and finally compared with the face/feature database, the best match is chosen. In the case of a verification system (Figure 12), the output of the system should always be a unique bit-string for each person, thanks to the combination of a physical token. All we have to do is to add an additional step to an identification system where the output (the eigenprojections) would be mixed with the token.

27

4 SYSTEM PRESENTATION

4.1 Assumptions and choice of the test database

smartcard Token 1-way

1-way

key

Captured face

signature

Face recognition system Camera

Figure 12: Face recognition system for verification

4.1 Assumptions and choice of the test database We made several assumptions, as an a-priori knowledge of some properties from the data source (pictures of people) can lead to a greater optimization of the whole system. To proceed, we defined the kind of applications it could be applied to first. These include, but are not restricted to: • computer login • secured transactions (like banking on ATMs) • security checks in sensitive areas Consequently, we concluded that the pictures could be considered having: • good, symmetric lighting conditions • a clear image • small to medium facial expression changes • frontal view, approximately vertical No assumption was done on the distance from the camera. The range is indeed parameterizable in the Gaussian pyramids described in Chapter 4.2. Based on that, we considered several databases for our tests: Yale, Essex and Feret. The subset Faces 94 in the Essex database provided us with a great number of pictures per people, and met the previous requirements (see table 1). The subset we randomly extracted was containing 100 people, with 20 pictures for each person, ie 2000 pictures in total. 10 pictures per person were dedicated to create an average face for each person. This is in the hope to overcome the issue mentioned in Chapter 3.3. The tests are then carried out on the 10 remaining pictures 28

4 SYSTEM PRESENTATION

4.2 Face detection with Gaussian pyramids

Table 1: Examples of pictures from Essex database for each person. To enhance the relevance of our experiments, we computed every figure using all the possible combinations of reference pictures (as opposed to test pictures). For instance, the number of samples  to generate statistics for the intra-class k · (10 − k) · 100 /2 for k reference pictures per person, and figures would then be C10  k · (10 − k) · 99 · 100 /2 for the inter-class figures. To give an order of magnitude, C10 this represents, in the case of one reference picture, 4500 samples for the former case, and 222750 for the latter.

4.2 Face detection with Gaussian pyramids A significant issue in the whole system is to detect a face in a photo, not knowing at what distance from the subject the photo has been shot. One requirement of face detection using eigenfaces is obviously that the face on the photo be in the face space, which is a normalized space (by scaling faces 2 ways to a fixed eye distance). Even though eigenfaces identification has been shown to perform well at ±12% scale, and ±10 degrees rotation variations, higher changes are bound to degrade in great proportion the efficiency of the detection. The construction of multi-scale pyramids [4] makes it possible to extend the face detection algorithm to a multi-scale system. Multi-scale pyramids (Figure 13) can be described as three-dimensional structures, which consist in scaled versions of the eigenfaces. An effective way to proceed to a multi-scale face search within our image would be to search this face using the eigenfaces from each level of the pyramid. Although this is equivalent in effect to scaling our input image, it offers great computational savings, in so far as eigen and average faces multi-scale pyramids can be pre-calculated. The system we built makes use of Gaussian Pyramids, which are obtained by smoothing with a Gaussian filter and then sub-sampling. The reason for smoothing is that it avoids the effect of aliasing in the sub-sampling process. The Gaussian smoothing filter has the following definition:

Gaussian f ilter

= GT G

with G = [0.0625 0.2500 0.3750 0.2500 0.0625] The eigenfaces are then filtered by 2D convolution with the above matrix. Finally, the eigenfaces are sub-sampled by a factor of 1.125 for each scale. Our original eigenfea29

4 SYSTEM PRESENTATION

4.3 Features location

Figure 13: Multi-scale pyramids ture for face detection is a cropped face feature of size: 61x73 pixels, the other four layers of the pyramid being of size: 54x68, 49x58, 43x51, and 38x46 pixels, in the decreasing order. Practically speaking, the detection is done by extracting moving windows of the above dimensions across the image (Figure 14), and projecting the image part onto the corresponding eigenspace. Another way of increasing the computational efficiency with Gaussian Pyramids can be exploited [24, 25], as the more responsive the system, the faster transactions could be done. If we consider that face detection does not need as high a resolution as face recognition, based on the fact that the human eye can, for instance, spot somebody on a picture even though it might not be able to identify the person, it should be possible to reduce the original image size. This is to say that a sub-sampled version of the original image should be good enough for the purpose of rough face detection. As for the detection of the features (eyes and mouth) needed for normalization, we can use a full-scale image, as the search can be limited to restricted parts of the picture, due to the first rough detection. Within our system, we decided to subsample our input image by a factor of 2, without affecting the overall precision in normalization, as seen in the results part.

4.3 Features location Features location is a key part of the system, in the sense that it can affect the performance of it in great proportions. As a matter of fact, this assumption results from the eigenfaces being sensitive to changes in rotation and scale, as mentioned earlier. 30

4 SYSTEM PRESENTATION

4.3 Features location

Figure 14: the different windows used to detect the face Many different techniques can be found in literature, including morphological processing [26], grey-level reliefs [28], and eigenfeatures [27]. It is important therefore that the facial key-points be accurately located, namely the eyes and the mouth corners. The technique we opted for was eigenfeature detection. There were two main reasons justifying this choice, the first one being the robustness, and the second the code re-use possibility. Due to time constraints, we were not able to test other promising but more complicated methods, especially deformable templates. The different steps of features location can be divided as follows: • The face detection module locates roughly the face location in the input image. 10% of extra margin around the face is added to compensate for the inaccuracy of the process in itself, reducing thus the risk of errors (Figure 15).

Figure 15: Face detection and eye zone extraction

31

4 SYSTEM PRESENTATION

4.3 Features location

• The upper part of the extracted window is then divided into two parts: in the upper-left one, we look for the left eye with eigenfeature detection, and then we proceed with the upper-right one, and the right eye. Once again, we extract a window around each eigeneye feature with a 10% extra margin (Figure 15), and pass them to the next module. • Now that the search zones are really narrowed down, we look for the iris in each eigeneye feature. Considering the iris as the eye center may appear a questionable statement, but one has to recall the pre-requisites of our applications, ie that people are supposed to be looking into a camera. Hence we can surmise the iris will be centered. Focusing on the iris should also improve the resolution and accuracy of the system. By histogram-normalizing the image, and thresholding on the grey levels, we can isolate the iris from the rest of the image (Figure 16).

Figure 16: Eye detection and iris location in the eye zone • Once both eye centers are located, the mouth is then the next facial landmark we want to get. In the light of Chapter 3.2.1 on dynamic symmetry, we can consider that due to certain proportion in the face, the search for the mouth can be conducted in a zone around the estimated position from the template, knowing the position on the eyes. Similarly to the eye detection process, an histogram equalization is applied together with a grey-level thresholding to highlight the mouth corners (Figure 17).

Figure 17: Mouth detection and face rotation&rescaling

32

4 SYSTEM PRESENTATION

4.4 Features extraction

4.4 Features extraction The three coordinates we determined in last subsection are then passed to a scale and rotation module. The distance between the eyes is set at 40 pixels by horizontal scaling of the face, and it is also stretched vertically so that the center of the mouth is adjusted to the corresponding dynamic symmetry grid point. A light normalization is applied to attenuate effects of the environment (see Chapter 3.2). The last step of image pre-processing for our system is to extract the different features we want to analyze. We selected 7 of them that we will use in our tests: face, cropped face, nose, mouth, right and left eyes, eyes (Figure 18).

Figure 18: Features extraction process Once the face image has been normalized, the extraction is quite straightforward: we apply the dynamic grid on the face, and keep the zones of interest.

4.5 Face recognition and verification Once the face and features have been detected, extracted, normalized, it is ready to be processed by the face recognition (resp. verification) application. This module will be responsible for determining the closest (resp. exact) match. The implementation of this application was based on the theoretical background seen in Chapter 3. An eigenspace is created for each feature with M pictures of each individual of the database (N individuals in total), by averaging those M pictures, and using the resulting one. To generate statistics on feature recognition performance, the extracted features of the M 0 unused picture of each individual are projected on their corresponding eigenspace. Then, one of these M 0 pictures is used as a reference one for each individual, and the distance of the features of these N pictures to the ones of the (M 0 −1)∗N remaining pictures is computed, resulting in a Nx((M 0 −1)∗N) matrix. We look for the minimum in each column, and, depending on the rank of the column, we can assess if the result is correct or not. 33

4 SYSTEM PRESENTATION

4.6 Wavelet transform selection

The only difference between verification and identification is that in the former case, we add an additional stage by projecting on a random space and discretizing, and make use of the Hamming distance instead of the Euclidean distance.

4.6 Wavelet transform selection Wavelet filtering requires us to select a type of wavelet among the many existing. Consequently, in order to select a suitable wavelet, we computed the recognition rates for different popular wavelet filters [15]. From this experiment (Table 2), it turned out that Daubechies (2) was the one to adopt for our future tests, in so far as it shows a significant efficiency advantage over the other wavelets. Wavelet Daubechies (2) Daubechies (4) Daubechies (6) Daubechies (8) Symlets (4) Battle-Lemarie (4) Bi-orthogonal Wspline (4,4)

1 0.41 0.35 0.32 0.33 0.36 0.41 2.06

2 1.51 2.76 3.67 4.04 2.97 2.02 27.15

3 7.96 12.02 13.26 14.39 10.80 10.78 33.81

Sub-bands 4 5 7.31 1.83 11.46 7.33 14.35 10.28 15.08 12.38 12 7.19 10.62 2.29 53.31 34.86

6 10.58 21.93 26.78 31.53 21.69 9.69 54.29

7 14.44 39.94 48.51 55.04 39.83 23.04 91.12

Table 2: Error rate percentage (%) per sub-band for different types of wavelets for M 0 = 50 In addition, it is worth noting the general trends that the different sub-bands exhibit, by interpreting the table figures. According to Nastar et al. [11, 29] work, which explored the relationship between variations in facial appearance and the faces deformation spectrum, only high-frequency spectrum is affected by changes in facial expressions and small occlusions, whereas illumination changes affect mainly low-frequencies. Moreover, a change in human face will affect all the frequency components.

Figure 19: An example of eigenface for each sub-band Looking at our results, the 3 sub-bands that give the best results are in order 1,2 and 5 (Figure 19). Sub-band 1 is a kind of rough average of the face, and, from the study

34

4 SYSTEM PRESENTATION

4.7 Bio-hashing tests

mentioned above, should be used only if we can ensure the light conditions are going to be constant in our system. As for sub-bands 2 and 5, they are sub-bands that extract the horizontal changes in the face, as opposed to sub-bands 3 and 6. One interpretation would be that facial changes are more sensitive in the vertical direction. In the scope of our tests, we will only focus on sub-band 1.

4.7 Bio-hashing tests 4.7.1 Presentation of the test platform for issues resolution Due to the different stages involved in this project – image preprocessing, eigenanalysis, bio-hashing –, we felt very fast the need to develop a tool that would make it possible to automate the tests and give visual and figures feedback (Figure 20). Not only are the individual components supposed to perform well, but they should be tweaked and understood to optimize the bio-hashing output accuracy. One of the many features was the great reconfigurability, as well as the ability to deal with each stage separately or altogether.

Figure 20: Test platform main window

We will give a short overview of its features. The main window is divided into four parts, namely:


• Face and features database part: dedicated to the selection of the face database and the configuration of the number of images used for testing, for eigenspace generation, and as references. In addition, the user can choose the extraction type (2-point or 3-point normalization, manual or automatic).

• Face recognition with eigenfeatures: generation of the eigenspace and projection onto it. It is also possible to generate statistics and graphs by combining features, at different M'.

• Face recognition with WT and eigenfeatures: same as above, except for an additional stage where the wavelet filter to use can be chosen.

• Visual hashing (bio-hashing): the user can choose between applying bio-hashing on PCA or on wavelet+PCA, on which feature/sub-band to apply it, and the number of bits at the output.

4.7.2 Practical adaptations

Thanks to the tool presented above, we were able to overcome some issues that arose during the test process.

• The simple condition used for bit determination in Chapter 3.4.3 turned out to be insufficient. Although the idea made sense, observation of real-world data revealed that the formulation had to be slightly modified. Indeed, it seemed quite difficult for us to find a validation threshold: in our formulation, we stated that we had 10 images of each person for building the eigenspace.

Figure 21: Different cases of wrong decision. (a) rejection of a valid axis; (b) validation of an unstable axis


How should we then set the aforementioned angle threshold? A situation we observed frequently was that the possible co-ordinates of the vectors corresponding to one person would vary a lot within a half-space; yet the axis corresponding to this co-ordinate and to this particular user could be considered stable, as it never changed sign (Figure 21a). Another situation that illustrated how ill-suited our criterion was is the case of very small vectors, close to the center of the space and inside the criterion angle, whose small variations from one image to another make them change sign often (Figure 21b). This is a major flaw of our criterion. It can, however, be solved in a rigorous way. Assuming a Gaussian behavior of x, the co-ordinate of the image vectors on x1 in (1), with mean m and standard deviation σ, the error function tells us that the higher the ratio |m|/σ, the less likely a zero-crossing is. Hence the possibility of ranking the values of this ratio for each axis, based on the 10 reference pictures of one person, and then choosing the n highest values (n being the number of bits desired for the key) among the m random axes.

• Despite the theoretical advantage of the Mahalanobis distance over the Euclidean one, we realized through the observation of the graphs in our experiments that, even though the Mahalanobis distance distinguished clearly between images from two different classes, it failed to minimize the distance between images from the same class (Figure 22). Because our final goal is verification, we thus had to favor the Euclidean distance.

Figure 22: Mahalanobis vs. Euclidean distance for discretization without the enhancement criterion. (a) Mahalanobis: matching faces mean 2.80 (std 2.38), different faces mean 20.00 (std 3.14), FRR 2.58% at threshold 9; (b) Euclidean: matching faces mean 2.21 (std 2.51), different faces mean 20.01 (std 3.21), FRR 3.98% at threshold 8

• In order to optimize the result of bio-hashing for pictures of the same user, it occurred to us that we could exploit the property of eigenfaces that the importance (from the energy point of view) of the eigenvectors decreases with their rank. In other words, the first eigenvectors are also bound to be the most stable co-ordinates. Hence the idea of emphasizing this trend by multiplying the co-ordinates of a face image in the eigenspace by the corresponding eigenvalues. As suggested by Figure 23, we obtain an improvement in both the average and the variance of the number of bit errors. A sketch combining this weighting with the stable-axis selection above is given below.
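The following MATLAB sketch puts these two adaptations together: eigenvalue weighting of the projections, then ranking of the token's random axes by the |m|/σ criterion. It is our own reconstruction under the notation above, not the thesis code; all variable names (W, lambda, R, wNew, n) are hypothetical.

    % Sketch of the enhanced discretization (hypothetical variables, not thesis code).
    % W      : K x 10 eigenprojections of one person's 10 reference images
    % lambda : K x 1 eigenvalues; R : m x K orthonormal token-derived directions
    Wl    = W .* repmat(lambda, 1, size(W, 2));    % eigenvalue weighting
    P     = R * Wl;                                % co-ordinates on the m random axes
    ratio = abs(mean(P, 2)) ./ std(P, 0, 2);       % |m|/sigma: zero-crossing stability
    [~, order] = sort(ratio, 'descend');
    sel   = order(1:n);                            % keep the n most stable axes
    bits  = (R(sel, :) * (wNew .* lambda)) > 0;    % bio-hash of a new image wNew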

Figure 23: Evolution of the bio-hashing offset tolerance: histograms of Hamming distances for n = 40, (a) without eigenvalue weighting (matching faces mean 0.49, std 1.03; different faces mean 20.02, std 3.11) and (b) with eigenvalue weighting (matching faces mean 0.18, std 0.67; different faces mean 19.93, std 3.22)


5 Results

5.1 Face and eye detection in an image

To quantify the quality of the detection and extraction process, we decided to evaluate the distance between the estimated and the real (manually entered) positions of the facial features used for normalization, namely the eyes and the mouth. We randomly extracted 36 people from the Face94 database and took 8 pictures of each: 3 were dedicated to generating the eigenspace for the detection system (2-point normalization, i.e. uniform scaling), and the 5 remaining were used for the tests (180 test pictures in total). In the first part of the experiment, the face images all had more or less the same scale, so the Gaussian pyramids were not useful here. We obtained an average distance of 2.25 pixels for the eye locations and 2.04 pixels for the mouth location. In the second part of the experiment, we wanted to check how well the Gaussian pyramids reacted to changes in scale. We randomly rescaled each image to a size between 80% and 100% of the original. We found an average distance of 2.42 pixels for the eyes and 2.01 pixels for the mouth. This is quite encouraging, and future tests should be made at larger scales, with bigger variations in head rotation and lighting.
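The error measure here is simply the mean Euclidean pixel distance between estimated and manually entered positions; as a minimal sketch (est and gt are hypothetical 180 x 2 arrays of (x, y) positions):

    % est, gt: 180 x 2 arrays of estimated and ground-truth (x, y) eye positions
    err     = sqrt(sum((est - gt).^2, 2));  % per-picture Euclidean pixel distance
    meanErr = mean(err);                    % e.g. 2.25 pixels for the eyes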

5.2 Face recognition using PCA on eigenfeatures

5.2.1 Identification

First of all, we compute the error statistics for each eigenfeature: for each picture, we find the best match in the database. While the "whole face" feature is expected to behave best, it is nevertheless a good idea to look at what error rates the other eigenfeatures can achieve. The main advantage of using eigenfeatures instead of eigenfaces is that they can overcome some facial expression changes or occlusions of parts of the face; for instance, if someone starts growing a beard, the "eyes" eigenfeature will not be affected. The main drawback, obviously, is that at the same time the information about the contour of the face and other important facial landmarks disappears, which is likely to yield less accuracy in the recognition process. We first computed the error rate for different features, based on the best match (Figure 24a).


Figure 24: Error rate for PCA as a function of the number of eigenvectors. (a) single features case (eyes, cropped face, left eye, mouth, nose, right eye, face); (b) combined features case (face and eyes, face and cropped face, face and nose, eyes and nose)

Obviously, the bigger the feature, the lower the error rate, which correlates with the quantity of information each provides. It is worth noticing that the mouth feature behaves differently, most probably because it is the facial landmark with the highest variation between two pictures of the same person. Figure 24b shows the results when two features are combined. While the dimensionality is higher, the error rate for most combinations is very low. There is some clear redundancy when combining the face and cropped face features, for instance. Yet one can consider this an interesting weighting of the most important features of the face, giving both a good error rate and, in theory, good resistance to feature occlusion. As it is difficult to find an extensive face image database with different feature occlusions, we will not test this assumption and leave it for future work.

    Eigenfeature   Error rate   Size
    Face           0.27%        87x139 pixels
    Cropped face   3.60%        61x73 pixels
    Eyes           9.41%        61x35 pixels
    Nose           21.3%        23x33 pixels

Table 3: Error rate per eigenfeature for M' = 50

From now on, we will focus only on the face and cropped face as single features for the rest of our experiments, as they turn out to be the most relevant ones (Figure 7). Vendors often use two different measures to rate biometric accuracy: the false-acceptance rate (FAR) and the false-rejection rate (FRR). Both focus on the system's ability to restrict entry to authorized users. However, these measures can vary significantly, depending on how you adjust the sensitivity of the mechanism that matches the


biometric. For example, one can require a tighter match between the measurements of face geometry and the user's template (increase the sensitivity). This will probably decrease the false-acceptance rate, but at the same time can increase the false-rejection rate. In our case, the sensitivity is a distance threshold. By observing the histogram of distances, we can estimate how error-prone our system is bound to be: in an error-free system the two curves would not cross at all, which would mean perfect differentiation. As this does not happen in real life, we need a criterion to characterize performance. Because FAR and FRR are interdependent, it is more meaningful to plot them against each other, as shown in Figure 25. Each point on the plot represents a hypothetical system's performance at a given sensitivity setting. With such a plot, one can compare these rates to determine the Crossover Error Rate (CER): the lower the CER, the more accurate the system.
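Concretely, FAR and FRR can be swept over the distance threshold and the CER read off where the two curves cross. A sketch (our illustration; dSame and dDiff are hypothetical vectors of same-user and different-user distances):

    % Sweep the decision threshold and locate the crossover (CER).
    thresholds = linspace(min(dSame), max(dDiff), 200);
    FAR = zeros(size(thresholds));  FRR = zeros(size(thresholds));
    for k = 1:numel(thresholds)
        FAR(k) = mean(dDiff <= thresholds(k)) * 100;  % impostors accepted
        FRR(k) = mean(dSame >  thresholds(k)) * 100;  % genuine users rejected
    end
    [~, k0] = min(abs(FAR - FRR));      % threshold where the two rates cross
    CER = (FAR(k0) + FRR(k0)) / 2;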

Figure 25: CER for M' = 20, 40, 60, 80: false acceptance rate vs. false rejection rate at eigenspace dimensions 20, 40, 60 and 80, for (a) the whole face feature and (b) the face feature (recognition on 1000 samples)

The face feature proves quite weak and unrealistic for identification. Yet it might be good enough in combination with a token, which we will find out in the next part.

Summary:

    M'    Error rate    FR (at FA=0%)    CER
    20    0.76 | 5.90   4.47 | 30.3      0.57 | 2.57
    40    0.30 | 3.92   2.48 | 22.4      0.41 | 2.20
    60    0.24 | 3.31   2.27 | 20.1      0.33 | 2.06
    80    0.23 | 2.90   1.61 | 18.4      0.29 | 2.04

Table 4: Summary of results for face and cropped face (face | cropped face)

5.2.2 Verification with biometric hashing on PCA

In the following results, we try to highlight the gain obtained through the combination of bio-hashing and the optimized mean-on-variance criterion (cf. Chapter 3.4). Comparisons are made with both simple PCA and PCA+bio-hashing (i.e., without resorting to the mean-on-variance criterion). Let us first have a look at the distance histograms for the different cases. The naming convention in the graphs is as follows:

• pca: curves for the distances between two images using Principal Component Analysis, Euclidean distance.

• pca+d: curves for the distances between two images using Principal Component Analysis and bio-hashing with token, Hamming distance.

• pca+de: curves for the distances between two images using Principal Component Analysis and optimized bio-hashing with token, Hamming distance.

For comparison purposes, the curves were merged on a single graph by normalizing the PCA curve, so that its different-users peak is centered on the same peak of the bio-hashing curves (Figure 26).


Figure 26: Histograms of distances for pca, pca+d and pca+de (same user vs. different users), for n = 20, 40, 60 and 80

It turns out that bio-hashing transforms the face co-ordinates in face space into pseudo-random bit strings. We infer this from the different-users curves when bio-hashing is applied (pca+d or pca+de): those curves are Gaussian-shaped, with a mean of N/2 and a variance of N/4, where N is the size of the bio-hash (i.e., the bit string's length), which corresponds to the theoretical curve of distances between random bit strings of length N. This property therefore makes it possible to care only about reducing the intra-class distances (in other words, minimizing the mean and variance of the intra-class distances for a given N), since the mean and variance of the inter-class distances are invariant. In addition, it is worth noting that the graphs also suggest a significant
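This is the standard behavior of independent fair bits: if each of the N bits of two unrelated hashes agrees or disagrees with probability 1/2 independently, the Hamming distance d follows a binomial law (a textbook derivation, added here for completeness):

\[
d \sim \mathrm{Bin}\!\left(N, \tfrac{1}{2}\right), \qquad
\mathbb{E}[d] = \frac{N}{2}, \qquad
\mathrm{Var}(d) = N \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{N}{4}.
\]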


improvement variance-wise for the inter-class curves in the case of bio-hashing over simple PCA, which, in turn, helps to build a better FRR and CER. If we observe the evolution of the intra-class curves for the different cases, we can outline the following phenomena:

• pca+d performs worse than pca. As mentioned before, this is explained by the fact that the first N random vectors chosen for projection are unlikely to match the face-space axes of greatest stability for a given face; hence the degradation.

• On the other side, overcoming this drawback of bio-hashing by criterion-based selection of the random vectors looks like a winning bet, and dramatically improves the coherence of the intra-class representations.

All in all, pca+de takes the advantages of both pca and pca+d and outperforms them by minimizing the intra-class distance and maximizing the inter-class one. As a result, differentiation is excellent, which is illustrated by the gap between the same-user and different-users curves (Figure 26). Besides, it is interesting to see the evolution of the curve shape depending on N. In this particular case, we obtained the N best random vectors by selecting them among 80 random orthogonal vectors describing the same space as the first 80 eigenvectors of the face eigenfeature. As N is raised past some point, performance clearly starts to lag behind simple PCA and gets closer and closer to pca+d. Hence the possibility of finding a balance between complexity and performance, so that a constant coefficient c can be determined (the computer would choose the N random vectors among c×N candidates). This is left to later work.


Figure 27: CER for distances: FA% vs. FR% for pca, pca+d and pca+de, for n = 20, 40, 60 and 80

The CER graphs (Figure 27) seem to confirm our previous observations. On the one hand, the CER of pca and pca+d improves with N; on the other hand, for pca+de, the lower N, the better the CER. For instance, for N = 40 bits, the CER is almost 0 for pca+de, which is very satisfying, as a system with more than one trillion possible keys is bound to be considered secure in most corporate environments. Finally, when looking at the actual figures in Tables 5 and 6, the shortcomings of a high N for pca+de are obvious. N = 40 seems like a good compromise, the average distance between two pictures of one person being 0.15 bits and the CER being very low. To overcome the problem of different bit strings per person, one can assume, based on the very low number of bits differing between two image


representations, that a person can be described by a few bit strings. Furthermore, those bit strings can be assumed to be always the same, because of the nature of PCA itself. Hence, a correcting code can be introduced on the token for those additional bit-string representations; the validity and safety of this approach will be described in detail in a work parallel to this thesis.

    M'    Same user differences   FR % (at FA=0%)   CER %
    20    1.15 | 0.03             29.7 | 3.37       2.07 | 0.02
    40    2.18 | 0.15             4.04 | 0.01       0.77 | 0.01
    60    3.53 | 0.86             1.57 | 0.22       0.49 | 0.10
    80    4.48 | 4.51             1.31 | 0.93       0.37 | 0.34

Table 5: Summary of results for pca+d and pca+de (pca+d | pca+de)

Note that the "same user differences" in Tables 5 and 6 refer to the average number of bits differing between two bio-representations of images of the same individual. For PCA, as the distance is not measured in bits but in the Euclidean way, we normalized the results as in Figure 26 so that some performance comparison is achievable.

    M'    Same user differences   FR % (at FA=0%)   CER %
    20    0.51 | 0.03             4.47 | 3.37       0.57 | 0.02
    40    1.17 | 0.15             2.48 | 0.01       0.41 | 0.01
    60    1.92 | 0.86             2.27 | 0.22       0.33 | 0.10
    80    2.88 | 4.51             1.61 | 0.93       0.29 | 0.34

Table 6: Summary of results for pca and pca+de (pca | pca+de)

5.3 Face recognition using Wavelet Transform and PCA

5.3.1 Identification

One of the other major advantages of wavelets+PCA over simple PCA is the possibility of combining sub-bands without any information overlap, as each sub-band covers a distinct frequency range. This results, for example, in a better recognition rate for sub-band 1+5 (0.18%, Figure 28b) than for sub-band 1 alone (0.35%). But it comes at a cost, as mentioned earlier: a rise in computational complexity. Because reducing the latter is our reason for using wavelets, it would be interesting to find a golden mean between computational efficiency and recognition rate. This is a path to investigate, but it will be left to future developments.


Figure 28: Error rate for wavelet+PCA as a function of the number of eigenvectors. (a) single sub-band case (sub-bands 1 to 7); (b) combined sub-bands case (sub-band 1 alone and combined with sub-bands 2, 3, 4 and 5)

If we consider Table 7, no single sub-band reaches the same level of accuracy as the face feature for PCA. Nonetheless, it is interesting to see how well the entire system performs, given that it is now based solely on a 16x16 pixel image. Sub-bands 1 and 2 perform best and are among the smallest sub-band blocks; hence, we will limit ourselves to comparing those two on a CER and FRR basis.

    Sub-band       Error rate   Size
    Sub-band 1     0.41%        16x16 pixels
    Sub-band 2     1.54%        16x16 pixels
    Sub-band 5     9.41%        32x32 pixels
    Sub-band 1+5   0.18%        32x32 + 16x16 pixels

Table 7: Error rate per sub-band for M' = 50

From Figure 29 and Table 8, we can conclude that sub-band 1 has the best recognition rate and CER, and it will therefore be used for bio-hashing in our tests. We would like to stress once again that sub-band 1 can only be selected provided very uniform lighting across the image is ensured; we would have to consider other sub-bands if this requirement is not met. Besides, it would be interesting to split sub-band 5 into 4 sub-bands to check whether one of them gives better results.

Summary:


Figure 29: CER for M' = 15, 30, 45, 60: FAR vs. FRR for (a) sub-band 1 and (b) sub-band 2, at eigenspace dimensions 15, 30, 45 and 60 (recognition on 1000 samples)

    M'    Error rate    FR (at FA=0%)   CER
    15    1.18 | 5.34   5.87 | 26.3     0.82 | 2.50
    30    0.57 | 2.29   2.80 | 13.2     0.57 | 1.67
    45    0.44 | 1.54   2.53 | 9.87     0.51 | 1.44
    60    0.36 | 1.41   2.31 | 9.16     0.49 | 1.41

Table 8: Summary of results for sub-band 1 and sub-band 2 (sub-band 1 | sub-band 2)

5.3.2 Verification with biometric hashing on wavelet+PCA

The maximum size of the eigenspace for wavelet+PCA on sub-band 1 being 64 (the space dimension containing 99% of the energy), it is not possible to directly compare the figures of the last section with the ones we produce here. The selection of the most stable axes will be done on the first 60 co-ordinates (instead of 80 before). In addition, we stick to the same proportional evolution for M' as before, by calculating results for M' = 15, 30, 45, 60. Figure 30 shows how alike the results for PCA and wavelet+PCA are. The same trends are visible, in particular an optimality of the selection criterion for M' = M'max/2 = 30 in the present case (40 in the first case for PCA). This first observation is very encouraging. When it comes to comparing the CER (Figure 31) for those different cases, we once again find a great similarity with the behavior observed in the previous case. Finally, we should look at the figures in Tables 9 and 10 to confirm our impression and quantify which method gives the best results. By assuming we can compare

Figure 30: Comparison of histograms of distances for wavelet+PCA (same user vs. different users for pca, pca+d and pca+de), for n = 15, 30, 45 and 60

Figure 31: Comparison of CER for wavelet+PCA (FA% vs. FR% for pca, pca+d and pca+de), for n = 15, 30, 45 and 60

the efficiency of PCA and wavelet+PCA by normalizing the results over M', we find that the bio-hashing results (the "same user differences" field in the tables) are confined within a range of ±7% for the two techniques. This is a clear hint that the use of sub-band 1, while giving similar results, could overcome some drawbacks of PCA on the face feature, especially the computational cost of the eigenspace calculation.

    M'    Same user differences   FR % (at FA=0%)   CER %
    15    0.35 | 0.03             5.87 | 2.27       0.82 | 0.21
    30    0.78 | 0.11             2.80 | 0.04       0.82 | 0.01
    45    1.32 | 0.67             2.53 | 0.73       0.57 | 0.31
    60    1.97 | 3.17             2.31 | 1.24       0.57 | 0.65

Table 9: Summary of results for pca and pca+de (wavelet) (pca | pca+de)

    M'    Same user differences   FR % (at FA=0%)   CER %
    15    0.82 | 0.03             48.3 | 2.27       2.96 | 0.21
    30    1.48 | 0.11             6.47 | 0.04       0.94 | 0.01
    45    2.41 | 0.67             2.87 | 0.73       0.60 | 0.31
    60    3.21 | 3.17             0.82 | 1.24       0.48 | 0.65

Table 10: Summary of results for pca+d and pca+de (wavelet) (pca+d | pca+de)


6 Conclusion

In this project, we presented a system that can potentially generate error-tolerant bit strings, or hashes, from face data. A cryptographic hash, also called a message digest or digital signature, is in essence a short summary of a long message: hash functions take a message of arbitrary size as input and produce a small bit string, the hash or hash value. Hash functions are widely used as a practical means to verify, with high probability, the integrity of (bit-wise) large objects. In their paper on robust hash functions for digital watermarking, J. Fridrich and M. Goljan write the following [30]:

"What would be useful to have is a mechanism that would return approximately the same bit-string for all similar looking images, yet at the same time, two completely different images would produce two uncorrelated hash strings. (...) One can say that we want approximately the same hash bit-strings for two images whenever the human eye can say that these two images "are the same". Obviously this is a challenging problem that can never be solved to our complete satisfaction. This is because the fuzzy concept of two images being visually the same is inherently ill defined and difficult, if not impossible, to grasp analytically."

It is interesting to note that this is precisely one of the properties of the eigenfaces technique, which can explain high-dimensional observations with as few variables as possible. This led us to the idea of using eigenfaces to construct a hash function for encrypted facial identification. The system designed relies on the robust and tested PCA technique, to which an outside random token was added for "randomizing" the output sequence. The system makes it possible to define a performance threshold that can be decided on several criteria, such as the size of the user base, the reliability required, the size of the hash, as well as the computational cost one is ready to accept. As an example of the results, our tests achieved an average bit error between 2 hash strings from the same person of 0.15, with a CER of 0.01%, for a hash string of length 40. While this result sounds encouraging, one should keep in mind that the computational complexity of the PCA-based method for finding the eigenvectors can be problematic, as they must be recomputed each time a new user is added. As the face images have dimensions of 136x88 pixels, the estimated complexity could be as high as O((136x88)^3) = O(11968^3) if the number N of users in the system is greater than 11968, and O(N^3) otherwise. As a possible solution, we showed that the Wavelet Transform could be used to extract a sub-band from the image, on which PCA would be applied in turn. The advantage is that the image PCA is applied to is then of size 17x11, and the maximum complexity would then be O((17x11)^3) = O(187^3). Finally, it was demonstrated that this reduction in image size did not come at the expense of system performance, as we achieved an average bit error between 2 hash strings from the same person of 0.11, with a CER of 0.01%, for a hash string of length 30.
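As a back-of-the-envelope check of these figures (our arithmetic, not from the thesis): the eigenproblem is effectively solved on the smaller of the pixel dimension d and the number of users N, so the cost is

\[
\mathcal{O}\bigl(\min(N, d)^3\bigr), \qquad
d = 136 \times 88 = 11968 \;\Rightarrow\; \mathcal{O}(11968^3), \qquad
d' = 17 \times 11 = 187 \;\Rightarrow\; \mathcal{O}(187^3).
\]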


7 Future work

Face image preprocessing being at the heart of the system, it is important to improve our results as much as possible. One thing to improve, for instance, is eye detection, for which the deformable templates technique looks promising [31]. In addition, we are conscious that we used eigenfeatures for facial feature detection mainly because of the code reusability advantage, so other ways should be explored. Besides, many advanced techniques have been proposed since the development of eigenfaces. All of them can be considered extensions of eigenfaces, since they build on the groundwork it laid:

• The use and/or combination of eigenfeatures [7] is likely to be of interest, as some people, for religious or other reasons, might have occlusions on part of their face. It would also provide increased robustness to changes in expression. Moreover, these eigenfeatures can be applied to the wavelet+PCA method, insofar as one only needs to extract the part of the face of interest and apply a 3-level wavelet decomposition tree to it.

• Improved density modeling and the Bayesian similarity metric are optimizations of the eigenface technique. They provide improved recognition accuracy at the expense of computational complexity. While improved density modeling offers, for instance, an increase in face recognition accuracy of 8%, it results in twice the computational load. As for the Bayesian similarity technique, it can increase accuracy by up to 7%.

• Fisherfaces also offer an interesting perspective [18]. They are an example of a class-specific method, in the sense that the method tries to shape the scatter of the eigenvectors so that classification becomes more reliable; in other words, it tries to minimize the intra-class distance while maximizing the inter-class one. It offers promising improvements over simple eigenfaces.

Testing a wider range of wavelet filters, as well as research toward enhancing the wavelet transform part of the project, such as sub-band combination effects, are likely to be profitable, as we have not been able to dedicate as much time to them as we wished.


8 Annex 1: GUI quick guide

This is a short presentation of the functionalities of the interface we used to produce the graphs for identification errors, histograms, and CER.

Figure 32: Main GUI window

A- FACE & FEATURES DATABASE

Settings button:


Interface:

-The “current test database” is our image test database.
-The “Number of references per sample” is the number of reference images in the system database when we want to identify somebody.
-The “Current eigenspace database” is the image database used to create the eigenspace (generally the same as the test database, but then the images used to create the eigenspace and the ones for the test are kept separate).
-The “Number of references per sample” is the number of images per person used to create an average image for the eigenspace generation.

Options:

-Extract facial features from a database folder:

1. Manual extraction 2 points: select the 2 eyes; the face is normalized and the proportions are kept (used for creating the eigenspace for face detection).
2. Manual extraction 3 points: select the 2 eyes and the mouth; vertical and horizontal normalization.
3. Automatic extraction 3 points: automatic detection and extraction.

-Extract facial features from a face image: same as above, except that instead of successively presenting the images contained in a folder, we consider just one image.

-Facial features extraction demo: a visual demonstration of the different stages of feature detection and extraction, for testing.


-Resize images: resizes a folder of images and copies the resized images into another directory.

B- FACE RECOGNITION WITH EIGENFEATURES

Options:

-Generate eigenspace:
1. Generate eigenspace demo: generates the eigenspace for one manually selected picture directory.
2. Generate all eigenspaces: generates the eigenspace for all the directories in /data/eigenspace.

-Project training database on eigenspace:
1. Manual projection: projects the faces in a user-selected directory onto a user-selected eigenspace.
2. Automatic projection: projects all the directories in /data/eigenspace_ref and /data/test_pictures onto the corresponding eigenspaces in /data/eigenspace.


-Generate statistics:
1. Generate single features stats: generates the stats for all the single features (eyes, cropped face, nose, ...).
2. Generate features stats: generates the stats for combinations of 2 features.
3. Combine all features: computes the stats for all possible combinations (28).
4. Combine only: pick the 2 features you want to combine.
5. Add additional combination(s): enter manually what to compute. An example: "1,2 3,4,5". A space separates 2 different computations, whereas a comma means we want to combine the features. The correspondence is: Eyes-1, Face-2, Left eye-3, Mouth-4, Nose-5, Right eye-6, Whole face-7. So in this example, we want to compute the stats for eyes combined with face, and for left eye combined with mouth and nose.
6. Generate correlation histograms: asks to display the histograms of distances for same/different users.
7. Nb of eigenvectors used: enter the different sizes of the eigenspace to use, comma separated. Be cautious not to enter a value higher than the eigenspace dimension of one of the tested features, be it combined or not.

C- FACE RECOGNITION WITH WT AND EIGENFEATURES


-Use wavelet transform on image:
1. Choice: the type of wavelet filter to use.
2. Order: the order of the filter (normally, we use Daub 4).
3. On feature: the feature to apply the wavelet transform on.
4. Wavelet transform for 1 directory: manually choose a directory to apply the wavelet transform to.
5. Wavelet transform for all directories: applies the wavelet transform to the directories in /data/eigenspace, /data/eigenspace_ref and /data/test_pictures and saves the results in the corresponding directories (the same name + "_wv").

For the rest of the functions, refer to section B.

D- VISUAL HASHING

-Visual hashing:
1. Wavelet/pca: choose what to apply bio-hashing on.
2. Eigenfeature: choose the eigenfeature/sub-band on which to apply it.
3. Combine manually: see "Add additional combination(s)" under Generate statistics.
4. Eigenspace size: see "Nb of eigenvectors used" under Generate statistics.

9 Annex 2: Program tree

Naming conventions and variables

-Global variables are in capital letters; standard variables are lower case.


-Global variables and paths are defined in the "init.m" files, with a few exceptions.
-The name of a function matches the following pattern: directory1directory2...directoryN_functionname. I.e., in practice, a function "search" located in the directory "root_directory/lib/files" will be called "libfile_search". This should make it easier to find functions in the tree.

Tree structure

Only the main functions are mentioned here. In particular, the GUI functions are not listed, as they are mainly GUIs for the libraries we created.

root directory: the install directory
  data/: repository containing all our face databases and result files (workspace)
    db/: contains the face databases, with subdirectories named after the eigenfeatures they contain
    eigenfeatures/: contains the face database used for face/feature detection (2-point normalization, entered manually)
    eigenspace/: contains the current eigenspace faces/eigenfeatures in use (average of the faces/features in eigenspace_ref/ for one person)
    eigenspace_ref/: contains the faces/eigenfeatures used to create the eigenspace
    eigenspace_ref_wv/: same as eigenspace_ref/ but for wavelets (this time sub-bands instead of features)
    eigenspace_wv/: same as eigenspace/ but for wavelets
    test_pictures/: the set of pictures used as reference/test pictures; stats are based on these
    test_pictures_wv/: same as test_pictures/ but for wavelets
  lib/: library functions
    detection/: contains the functions for face/feature detection
      libdetection_eigenfeature: detects a given eigenfeature in a given face
      libdetection_findaggregates: finds the groups of black points in a binary picture
      libdetection_findiriscentre: finds the center of the iris in an eye close-up
      libdetection_main: launches the detection of the cropped face feature in a picture, then the eyes, and finally locates the irises
    eigenspace/: eigenspace generation and projection
      libeigenspace_genalleigenspaces: generates the eigenspaces for all the eigenfeatures
      libeigenspace_geneigenspace: generates the eigenspace for a given eigenfeature
      libeigenspace_loadpop: loads a population of eigenfeatures (into a matrix, each column corresponding to one person's feature)
      libeigenspace_makebasis: creates an eigenspace from a set of pictures
      libeigenspace_projectall: projects all the eigenfeature databases onto their corresponding eigenspaces
    extraction/:
      libextraction_main: feature extraction with dynamic symmetry (location of the mouth if asked)
    image/: image processing functions
      libimage_drawcross: draws a cross on a given image at given coordinates
      libimage_lightingnorm: lighting normalization in an image
      libimage_resize: resizes a whole folder of images and saves them in another directory
      libimage_show: shows an image
    matlabPyrTools/: Gaussian pyramid tools (external program); see the readme in the folder
    orthonormalization/: orthonormalization functions for a set of vectors
      liborthonormalization_classical: classical Gram-Schmidt orthonormalization function
      liborthonormalization_modified: modified Gram-Schmidt orthonormalization function
    projection/: misnamed directory, actually functions for bio-hashing
      libprojection_bitextraction: projects a given eigenfeature on a given random space, and binarizes the coordinates
      libprojection_genbeta: generates a set of orthonormalized random vectors
    recognition/: functions for projection coefficients and distance matrix computation
      librecognition_calcweights: computes the projection coefficients of a feature on its corresponding eigenspace
      librecognition_vectorsdist: from a set of projection coefficients for some feature, computes the distance matrix for pca
      librecognition_vectorsdist_vh: same, but computes the distance matrix after bio-hashing (enhanced case)
      librecognition_vectorsdist_vho: same, but computes the distance matrix after bio-hashing (normal case)
    statistics/:
      libstatistics_countnbdirectories: counts the number of directories in a given folder
      libstatistics_countpersons: counts the number of distinct people in a given face database folder
      libstatistics_countsuccess4: from a distance matrix, computes the error rate, distances for same/different users, and FR at FA=0
      libstatistics_displaygraphics2: statistics display script
      libstatistics_generate4: main entry function for statistics computation
      libstatistics_generate_combine: sorts out which features to combine together and for what eigenspace size, according to the user's choices
      libstatistics_generate_dist: computes the distance matrix
      libstatistics_listdirectories: lists the directory names in a given folder
    wavelet/: wavelet utilities
      Uvi_Wave.300/: wavelet toolbox (external program)
      libwavelet_splitimg: extracts and saves the first 7 sub-bands of the 3-level decomposition image
      libwavelet_transform: computes the 3-level decomposition image

List of Figures

 1  Example of 10 given eigenfaces .................................. 12
 2  How face decomposition works .................................... 14
 3  Head structure (A. Darville) .................................... 17
 4  Miss Helen Wills, harmonic analysis ............................. 18
 5  Light normalization effect ...................................... 19
 6  Filtering process ............................................... 20
 7  Real world signal case .......................................... 21
 8  2D wavelet decomposition examples ............................... 22
 9  3-level wavelet decomposition on an image ....................... 23
10  The bit attribution decision depending on the vector position .. 25
11  Face recognition system for identification ..................... 27
12  Face recognition system for verification ....................... 28
13  Multi-scale pyramids ............................................ 30
14  The different windows used to detect the face .................. 31
15  Face detection and eye zone extraction ......................... 31
16  Eye detection and iris location in the eye zone ................ 32
17  Mouth detection and face rotation & rescaling .................. 32
18  Features extraction process .................................... 33
19  An example of eigenface for each sub-band ...................... 34
20  Test platform main window ...................................... 35
21  Different cases of wrong decision .............................. 36
22  Mahalanobis vs. Euclidean distance for discretization without enhancement criterion ... 37
23  Evolution of the bio-hashing offset tolerance .................. 38
24  Error rate for PCA ............................................. 40
25  CER for M' = 20, 40, 60, 80 .................................... 41
26  Histogram for distances ........................................ 43
27  CER for distances .............................................. 45
28  Error rate for wavelet+PCA ..................................... 47
29  CER for M' = 15, 30, 45, 60 .................................... 48
30  Comparison of histograms of distances for wavelet+PCA .......... 49
31  Comparison of CER for wavelet+PCA .............................. 50
32  Main GUI window ................................................ 54

List of Tables

 1  Examples of pictures from Essex database ....................... 29
 2  Error rate percentage (%) per sub-band for different types of wavelets for M' = 50 ... 34
 3  Error rate per eigenfeature for M' = 50 ........................ 40
 4  Summary of results for face and cropped face ................... 42
 5  Summary of results for pca+d and pca+de ........................ 46
 6  Summary of results for pca and pca+de .......................... 46
 7  Error rate per sub-band for M' = 50 ............................ 47
 8  Summary of results for sub-band 1 and sub-band 2 ............... 48
 9  Summary of results for pca and pca+de (wavelet) ................ 51
10  Summary of results for pca+d and pca+de (wavelet) .............. 51


References

[1] M. Turk and A. Pentland, "Eigenfaces for recognition", J. Cog. Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.

[2] M. Turk and A. Pentland, "Face recognition using Eigenfaces", in Proc. IEEE CVPR, Maui, HI, June 1991, pp. 586-591.

[3] L. Sirovich and M. Kirby, "Low-dimensional procedure for characterization of human faces", Journal of the Optical Society of America A, vol. 4, pp. 519-524, March 1987.

[4] M. Kosugi, "Human-Face Search and Location in a Scene by Multi-pyramid Architecture for Personal Identification", Systems and Computers in Japan, vol. 26, no. 6, 1995, pp. 27-38.

[5] A. Samal and P. A. Iyengar, "Automatic recognition and analysis of human faces and facial expressions: A Survey", Pattern Recognition, vol. 25, pp. 65-77, 1992.

[6] A. Pentland, B. Moghaddam and T. Starner, "View-based and Modular Eigenspaces for Face Recognition", Proc. IEEE Conf. Computer Vision and Pattern Recognition, Seattle, June 1994, pp. 84-91.

[7] B. Moghaddam and A. Pentland, "Probabilistic Visual Learning for Object Detection", Tech. Report No. 326, Perceptual Computing Section, Media Laboratory, Massachusetts Institute of Technology, Massachusetts, 1998.

[8] B. Moghaddam, W. Wahid and A. Pentland, "Beyond Eigenfaces: probabilistic matching for face recognition", The 3rd IEEE International Conference on Automatic Face and Gesture Recognition, IEEE Computer Soc. Press, Los Alamitos, California, 1998, pp. 30-35.

[9] A. Grossman and J. Morlet, "Decomposition of Hardy functions into square integrable wavelets of constant shape", SIAM J. of Mathematical Analysis, vol. 15, pp. 723-736, 1984.

[10] A. L. Yuille, P. W. Hallinan and D. S. Cohen, "Feature Extraction from Faces using Deformable Templates", Int. J. of Computer Vision, vol. 8, no. 2, pp. 99-111, 1992.

[11] C. Nastar, "The image shape spectrum for image retrieval", Technical report No. 3206, INRIA, June 1997.

[12] M. V. Wickerhauser, "Adapted Wavelet Analysis from Theory to Software", AK Peters, Ltd., Wellesley, Massachusetts, 1994.

[13] Biometric security from Guardware Systems, http://www.guardware.com/products/2faq.html


[14] F. Galton, "Personal Identification and Description", Nature, June 21, 1888, pp. 173-177.

[15] S. Lawrence, C. Giles, A. Tsoi and A. Back, "Face Recognition: A Convolutional Network Approach", IEEE Transactions on Neural Networks, vol. 8, no. 1, pp. 98-113.

[16] M. Kirby and L. Sirovich, "Application of the Karhunen-Loeve procedure for the characterization of human faces", IEEE Pattern Analysis and Machine Intelligence, vol. 12, no. 1, pp. 103-108, 1990.

[17] M. Ghyka, "A Practical Handbook of Geometrical Composition and Design", 1952.

[18] P. N. Belhumeur, J. P. Hespanha and D. J. Kriegman, "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, July 1997.

[19] Y. Moses, Y. Adini and S. Ullman, "Face Recognition: The Problem of Compensating for Changes in Illumination Direction", European Conf. Computer Vision, 1994, pp. 286-296.

[20] D. L. Swets and J. J. Weng, "Using discriminant eigenfeatures for image retrieval", IEEE Trans. PAMI, vol. 18, no. 8, pp. 831-836, 1996.

[21] G. C. Feng, P. C. Yuen and D. Q. Dai, "Human Face Recognition Using PCA on Wavelet Subband", 2000.

[22] I. Daubechies, "The Wavelet Transform, time-frequency localization and signal analysis", IEEE Trans. Information Theory, vol. 36, no. 5, pp. 961-1005, 1990.

[23] I. Daubechies, "Ten Lectures on Wavelets", CBMS-NSF Series in Applied Mathematics, vol. 61, SIAM Press, Philadelphia, 1992.

[24] M. Bichel and A. Pentland, "Human Face Recognition and the Face Image Set's Topology", CVGIP: Image Understanding, vol. 59, no. 2, March 1994, pp. 254-261.

[25] G. Yang and T. Huang, "Human Face Detection in a Complex Background", Pattern Recognition, vol. 27, no. 1, pp. 53-64, 1994.

[26] S. Jeng, H. Liao, Y. Liu and M. Chern, "An Efficient Approach for Facial Feature Detection Using Geometrical Face Model", Proceedings of the 13th International Conference on Pattern Recognition, IEEE Computer Society Press, Los Alamitos, California, pp. 426-430, 1996.

[27] B. Moghaddam and A. Pentland, "An Automatic System for Model-Based Coding of Faces", Tech. Report No. 317, Perceptual Computing Section, Media Laboratory, Massachusetts Institute of Technology, Massachusetts, 1995.


[28] K. Sobottka and I. Pitas, "A Fully Automatic Approach to Facial Feature Detection and Tracking", Audio- and Video-based Biometric Person Authentication, Springer-Verlag, Berlin, Germany, pp. 77-84.

[29] C. Nastar, B. Moghaddam and A. Pentland, "Flexible images: matching and recognition using learned deformations", Computer Vision and Image Understanding, vol. 65, no. 2, pp. 179-191, 1997.

[30] J. Fridrich and M. Goljan, "Robust Hash Functions for Digital Watermarking", in ITCC 2000, Las Vegas, USA, 2000.

[31] A. L. Yuille, "Deformable templates for face recognition", Journal of Cognitive Neuroscience, 3(1):59-70, 1991.

[32] C. Soutar and G. J. Tomko, "Secure Private Key Generation Using a Fingerprint", Cardtech/Securetech Conf. 1, pp. 245-252, 1996.

[33] C. Soutar, D. Roberge, A. Stoianov, R. Gilroy and B. V. K. Vijaya Kumar, "Biometric Encryption Using Image Processing", SPIE 3314, pp. 178-188, 1998.

[34] G. I. Davida, Y. Frankel and B. J. Matt, "On Enabling Secure Applications Through Off-Line Biometric Identification", IEEE Symp. on Security & Privacy, pp. 148-157, 1998.

[35] G. I. Davida, Y. Frankel, B. J. Matt and R. Peralta, "On the Relation of Error Correction and Cryptography to an Off-line Biometric-based Identification Scheme", Workshop on Coding & Cryptography, Paris, France, 1999.

[36] F. Monrose, M. K. Reiter, Q. Li and S. Wetzel, "Cryptographic Key Generation from Voice", IEEE Symp. on Security & Privacy, pp. 202-213, 2001.

[37] A. Shamir, "How to Share a Secret", Comms. of the ACM, 22 (11), pp. 612-613, 1979.

