Fractal Techniques for Face Recognition

by

Hossein Ebrahimpour-Komleh
M.Sc. Computer Engineering (With Honours)
B.Sc. Computer Engineering (First Class Honors)

PhD Thesis Submitted in Fulfilment of the Requirements for the Degree of Doctor of Philosophy at the Queensland University of Technology

Research Program in Speech, Audio, Image & Video Technologies

August 2004

Keywords: fractals, subfractals, fractal image-set coding, image coding, face recognition, image processing, computer vision

To my wife Soheila and my little daughter Niloufar

Abstract

Fractals are popular because of their ability to create complex images using only a few simple codes. This is possible by capturing image redundancy and presenting the image in compressed form using the self-similarity feature. For many years fractals were used for image compression. In the last few years they have also been used for face recognition. In this research we present new fractal methods for recognition, especially human face recognition. This research introduces three new methods for using fractals for face recognition: the use of fractal codes directly as features, fractal image-set coding, and subfractals.

In the first part, the mathematical principle behind the application of fractal image codes for recognition is investigated. An image Xf can be represented as Xf = A × Xf + B, where A and B are the fractal parameters of the image Xf. Different fractal codes can be produced for any arbitrary image. With the definition of a fractal transformation, T(X) = A(X − Xf) + Xf, we can express the relationship between the images produced in the fractal decoding process, starting from an arbitrary image X0, as Xn = T^n(X0) = A^n(X0 − Xf) + Xf. We show that some choices of A or B lead to faster convergence to the final image.

Fractal image-set coding is based on the fact that the fractal code of an arbitrary gray-scale image can be divided into two parts: geometrical parameters and luminance parameters. Because the fractal codes for an image are not unique, we can change the set of fractal parameters without significant change in the quality of the reconstructed image. Fractal image-set coding keeps the geometrical parameters the same for all images in the database. Differences between images are captured in the non-geometrical, or luminance, parameters, which are faster to compute. For recognition purposes, the fractal code of a query image is applied to all the images in the training set for one iteration. The distance between an image and the result after one iteration is used to define a similarity measure between that image and the query image.

The fractal code of an image is a set of contractive mappings, each of which maps a domain block to its corresponding range block. The distribution of the domain blocks selected for the range blocks of an image depends on the content of the image and on the fractal encoding algorithm used for coding. A small variation in one part of the input image may change the contents of the range and domain blocks in the fractal encoding process, resulting in a change in the transformation parameters in the same part, or even in other parts, of the image. A subfractal is a set of fractal codes related to the range blocks of one part of the image. These codes are calculated to be independent of the codes of the other parts of the same image. In this case, the domain blocks nominated for each range block must be located in the same part of the image from which the range blocks come.

The proposed fractal techniques were applied to face recognition using the MIT and XM2VTS face databases. Accuracies of 95% were obtained with up to 156 images.

Contents

Abstract

List of Figures

List of Tables

Acronyms & Units

Certification of Thesis

Acknowledgments

Chapter 1 Introduction
  1.1 Chaos and Fractals
  1.2 Face recognition
  1.3 Thesis Outline
  1.4 Publications Resulting from Research

Chapter 2 Fractal Encoding and Decoding
  2.1 Introduction
  2.2 Features of Fractals
  2.3 Mathematical Foundations
    2.3.1 Metric Space
    2.3.2 Contractive Transformations
    2.3.3 Fixed Point Theorem
    2.3.4 Affine Transformation
  2.4 Iterated Function Systems (IFS)
  2.5 Principles of Fractal Coding
    2.5.1 Partitioning
    2.5.2 Transformations
  2.6 Summary

Chapter 3 Face Recognition
  3.1 Introduction
  3.2 Facial Feature Detection
  3.3 Geometric Feature Based Methods
    3.3.1 Face Recognition Using Principal Component Analysis (Eigenfaces)
    3.3.2 Recognition Using Independent Component Analysis (ICA)
  3.4 Linear Discriminant-Based Method
    3.4.1 Other Methods
  3.5 Summary

Chapter 4 Fractal Codes Directly as Features
  4.1 Introduction
  4.2 Previous Related Work
    4.2.1 Shape Recognition Using Fractal Geometry
    4.2.2 Face Recognition Using Fractal Dimensions
    4.2.3 Face Recognition Using Fractal Neighbor Distances
  4.3 Fractal Codes as Features
    4.3.1 Fractal Extraction
    4.3.2 Normalizing Fractal Codes
    4.3.3 Accuracy Tests
  4.4 Summary

Chapter 5 Fractal Image-set Coding
  5.1 Introduction
  5.2 Mathematical Bases
  5.3 Fractal Image-set Coding
  5.4 Similarity Measurements
    5.4.1 Minkowski-Form Distance
    5.4.2 Cosine Distance
    5.4.3 Fractal Similarity Measures
  5.5 Using Fractal Image-set Coding for Face Recognition
  5.6 Experimental Results
  5.7 Summary

Chapter 6 Subfractals
  6.1 Introduction
  6.2 Basic Concepts
  6.3 Subfractal Coding
  6.4 Mathematical Basis
  6.5 How to Use Subfractals for Face Recognition
  6.6 Summary

Chapter 7 Future Work and Conclusions
  7.1 Future Work
    7.1.1 Improving the Robustness
    7.1.2 Face Location and Detection
    7.1.3 Face Recognition Using Subfractals of Eyes and Mouth Area

Appendix A Quick Glance Eye-Gaze Tracking System

Appendix B Experimental Details
  B.1 Fractal Codes as Features
  B.2 Fractal Image-Set Coding

Bibliography

List of Figures

2.1 One of the best examples for understanding the features of fractals is the fern.
2.2 Van Koch's snowflake with fractal dimension of 1.26.
2.3 Sierpinski triangle, the attractor of an IFS containing 3 contractive transformations.
3.1 Examples of pose (XM2VTS face image database [65]), lighting (AR face image database [61]) and facial expression variations (CMU-Pitt facial expression database [48]) in face images.
4.1 Domain (bottom) and range (top left) blocks for an image (top right).
4.2 The eight possible orientations of a block. The orientations consist of four 90° rotations, a reflection and four more 90° rotations.
4.3 An illustration of domain and range blocks.
4.4 Fractal features of an image (A = domain index number, B = rotation (orientation) index, C = brightness shift and D = contrast factor) displayed as gray values over the quadtree partition of the same image.
4.5 Typical images from the MIT face database (ftp:\\whitechapel.media.mit.edu\pub\eigenfaces\pub\images). Two different frontal views of each person are included.
4.6 Recognition accuracy using rotation, domain index, brightness and contrast features, independently, and total accuracy achieved using all features, plotted against the number of images in the database as this is progressively increased.
4.7 A query image and the first 8 closest matches found by the method (the first four images are the best hit for each feature and the last four images are the second best hit of that feature).
4.8 A rotated query image and the first 8 closest matches. Note that the best match image retrieved using the orientation feature is the correct person.
4.9 Another rotated query image and the first 8 closest matches using only the orientation feature.
4.10 A query image inverted in grayscale and the first 8 closest matches. Note that the rotation feature gets both first and second matches right. The brightness feature does not find the right match because the inversion negates the brightness feature. The other two features find the right match.
5.1 An illustration of Get-Block and Put-Block operators.
5.2 Illustrations of the function T(x) = A × x + B for the one-dimensional space ℝ. a) s > 1, b) s = 1, c, d, e) s < 1.
5.3 An example of preprocessing with an image in the data-set. (A) The original image, (B) grayscale image with orientation normalized, (C) nominated image with face area marked, (D) normalized histogram-equalized face image.
5.4 (A) Average image of the data-set, (B) an arbitrary image from the data-set, (C) range blocks for image A, (D) the same range blocks applied to image B.
5.5 The initial image and the first, third and fifth iterates of the decoding transformations corresponding to image 005 4 1.
5.6 The PSNR versus the number of decoding steps for 4 different 128 × 128 gray-scale, normalized, encoded images of the XM2VTS database. The dash-dot line, solid line, dashed line and dotted line correspond to images 002 1 1, 000 1 1, 003 1 1 and 005 4 1 of the XM2VTS database, respectively.
5.7 Euclidean distance takes both angle and vector lengths into account to calculate the distance, while cosine distance only takes angle into account.
5.8 Convergence trajectories for three different initial images when the same fractal code is applied iteratively. Note that the initial image (x03) closest to the fixed point shows the least distance between successive iterations (d3 < d2 < d1). The fractal parameters are A = 0.9 × ρ45 and B = (I − 0.9 × ρ45) × xf.
5.9 Convergence trajectories for the same three initial images when the fractal code parameters are A = 0.9 × ρ15 and B = (I − 0.9 × ρ15) × xf.
5.10 Convergence trajectories for the same three initial images when the fractal code parameters are A = 0.6 × ρ45 and B = (I − 0.6 × ρ45) × xf.
5.11 Convergence trajectories for the same three initial images when the fractal code parameters are A = 0.6 × ρ15 and B = (I − 0.6 × ρ15) × xf.
5.12 An example showing a query image on top followed by the six closest images in the database. The best match is on the top-left, followed by others left to right in row-first order. Note that the first three matches are images of faces of the correct person and some change in expression is tolerated by the method.
5.13 The error (top left) and the similarity (top right) between the query image and the images in the training data-set. Errors are all very small. Normalized error (bottom left) and normalized similarity (bottom right) for the same images. Note that the normalized similarity measure clearly shows the best matching face number as 9. Values of this measure for other faces are below 0.7 in this case.
5.14 Another example showing a correctly identified case. Note here that there is a more marked change in facial expression and pose.
5.15 Yet another correctly identified case. Note that the first three matches are images of faces of the correct person.
5.16 Yet another correctly identified test case. Note here that the query image is of a light-skinned individual and so are all the 6 closest matched images.
5.17 A test case that failed. The second closest match is of the correct individual but the facial hair change is too severe for the method to cope. It seems as if the closest matched face exhibits a different person but with very similar expression and features such as eyes and mouth.
5.18 Query image (top) and training images (bottom) for individual number 019.
5.19 The only other test case that failed. The fourth and sixth closest matches are of the correct individual.
5.20 Query image (top) and training images (bottom) for individual number 005.
5.21 A plot showing accuracy versus the number of persons in the database. Three images are used for each person in the training set and one image per person in the test set.
6.1 A distribution of the difference in the x position of the domains xd and ranges xr for an encoding of the 512 × 512 Lena image, together with the theoretical distribution (dashed line) of the difference of two randomly selected points. Adapted from [35].
6.2 A distribution of the difference in the y position of the domains yd and ranges yr for an encoding of the 512 × 512 Lena image, together with the theoretical distribution (dashed line) of the difference of two randomly selected points. Adapted from [35]. Note that the distribution is skewed and also has significantly large values close to 0.
6.3 Range blocks (top left) in four major subfractal areas (eyes, nose and lips) and corresponding domain blocks (bottom rows) for an arbitrary face image. Top right, a plot of pixel values vs. pixel numbers for the last matched domain and range block.
6.4 A view of the eye-gaze tracking system.
6.5 A pair of face images shown to volunteers to verify the identity.
6.6 An illustration showing the results of the eye-gaze tracking system for 10 viewers. Circles (the centers) show the gaze points and the radius of each circle shows the duration of gaze on that point.
6.7 Another pair of face images shown to volunteers to verify the identity.
6.8 The results of the eye-gaze tracking system show that the eyes, nose and lips are the most important areas for viewers to verify identity.
6.9 Another pair of face images. The face images are inverted in grayscale (negative images).
6.10 The results of the eye-gaze tracking system for negative images.
6.11 Yet another pair of face images. Note that the left face image is inverted in grayscale and the right face image is a semi-drawing.
6.12 The results of the eye-gaze tracking system for negative images.
7.1 Block diagram of the fractal face recognition system with PCA-based feature reduction.
7.2 Matrix showing differences between faces shown on the two axes. Darker points indicate larger difference. Entries below the diagonal are pixel-value differences. Entries above the diagonal are fractal-feature differences.
B.1 The results of Fractal image-set coding for a subset of the MIT face database.
B.2 The results of Fractal image-set coding for the evaluation subset of the XM2VTS database. Arrows show the position of the threshold for FRR = 0, FRR = FAR and FAR = 0.
B.3 The results of Fractal image-set coding for the test subset of the XM2VTS database. Arrows show the position of the threshold for FRR = 0, FRR = FAR and FAR = 0 in the evaluation data set.

List of Tables

4.1 An example of fractal codes.
B.1 Error rates obtained using Fractal image-set coding.
B.2 Error rates reported by T. Tan using fractal neighbor distances.

Acronyms & Units

bpp      bits per pixel
dB       decibels
FA       False Acceptance
FR       False Rejection
ICA      Independent Component Analysis
IFS      Iterated Function Systems
KLT      Karhunen-Loeve Transform
LDA      Linear Discriminant Analysis
LDT      Linear Discriminant Transform
LED      Light Emitting Diode
PCA      Principal Component Analysis
PIFS     Partitioned Iterated Function Systems
PSNR     Peak Signal-to-Noise Ratio
XM2VTS   Extended MultiModal Verification for Teleservices and Security

Certification of Thesis

The work contained in this thesis has not been previously submitted for a degree or diploma at any other higher educational institution. To the best of my knowledge and belief, the thesis contains no material previously published or written by another person except where due reference is made.

Signed:

Date:

Acknowledgments

It is not possible to thank everybody who has been involved with me during the course of my PhD; however, there are some people who must be thanked. Firstly, I would like to thank my family and parents, whose encouragement, support and prayers have helped me achieve beyond my greatest expectations. I thank them for their understanding, love and patience, especially through the more difficult and stressful moments. Without their help and support throughout the years it would not have been possible for me to come this far. I would like to thank my principal supervisor, Dr. Vinod Chandran, for his guidance and encouragement throughout my course of study, and for his conscientious reviewing of my conference and journal papers as well as my thesis draft. I would also like to thank my associate supervisor, Prof. Sridha Sridharan, for the research environment he has created, as well as the additional financial support he has provided me through scholarship top-ups and travel support for the many conferences I have attended. In addition, I am appreciative of the financial support of the Iranian Ministry of Science, Research and Technology through the PhD scholarship I was awarded. A special acknowledgement goes to Prof. Javad Farhoudi, former Iranian scientific counsellor in Canberra, for his role, support and help during my study in Australia. I also thank Dr. Kohian for his consideration and help.

Former and current staff and students of the Image and Video Research Laboratory must also be acknowledged; I was fortunate to be able to interact and work with them. Anthony Nguyen and Jason Pelecanos have been of particular help to me during my PhD. Simon Lucey, John Dines, Michael Mason, David Cole and Eddie Wong all deserve special mention for their help at various times.

Hossein Ebrahimpour-Komleh
Queensland University of Technology
August 2004

Chapter 1 Introduction

This thesis fundamentally addresses four related topics: (i) the study of the possibility of using fractal codes of grayscale images as features for face recognition; (ii) the study of the mathematical basis for using fractals for recognition, especially face recognition; (iii) the possibility of designing a fractal coding system more suitable for recognition; and (iv) theoretical investigations into the definition and use of subfractals, which are defined to be independent fractal codes of different parts of an image. The emphasis throughout is on the use of fractal codes for recognition. Face recognition has been chosen as the application for testing this concept; the intention is not to aim for superior face recognition performance by fractal techniques alone. A short introduction to chaos and fractals, as well as to face recognition, is given below.

1.1 Chaos and Fractals

A fractal is by definition "a set for which the Hausdorff-Besicovitch dimension strictly exceeds the topological dimension" [57]. Benoit Mandelbrot, who coined the term fractal and its definition, developed in his classic book "The Fractal Geometry of Nature" [58] a new geometry of nature that describes many of the irregular and fragmented patterns around us using fractals. This ability is based on special features of fractals and their differences from other known models, such as geometrical models. For example, fractals do not have a characteristic length. A shape usually has a definite scale that characterizes it: geometric shapes have their own characteristic length, such as the radius or circumference of a circle and the edge or diagonal of a square. The length, size or volume of a fractal, on the other hand, cannot be measured with a single unit, as its surface is not smooth and the closer we look, the more complicated the shape appears. Mandelbrot used this ability of fractals to describe the geometry of natural shapes such as clouds, mountains and coastlines, which cannot be modelled by simple geometrical objects like spheres, cones or circles. After the developments in the fields of dynamical systems, chaos and chaotic dynamical systems, and given their close relationship with fractals, it is no wonder that fractals can describe shapes like the fern very well. It is now understood that chaotic dynamics is inherent in nonlinear deterministic systems with seemingly random behavior. As a result, chaos and fractals have fascinated scientists from all fields, not only because of their importance in applications but also because of the beauty of the geometric patterns produced. Chapter 2 explains fractals and fractal image coding methods further.

1.2 Face recognition

Biometrics is an active area of research with a wide range of applications in surveillance, security systems, human-computer interfaces, and more. The term refers to the emerging field of technology devoted to the automatic identification of individuals using physiological or behavioral traits. Techniques such as retinal or iris scanning, hand geometry, speech recognition, fingerprint scanning, signature verification and face recognition are examples of biometric methods of identification which work by measuring unique human characteristics as a way to confirm identity. Face recognition has the advantage of requiring very little cooperation or modification of normal behavior on the part of the subjects in order to collect useful data. But unlike some other biometrics, such as fingerprints or irises, faces do not stay the same over time. Facial recognition systems have to deal with changes in hairstyle, facial hair, spectacles, make-up and aging. Face recognition also differs from other pattern recognition problems such as character recognition. This difference arises from the fact that in classical pattern recognition there are relatively few classes and many samples per class; with many samples per class, algorithms can classify samples not previously seen by interpolating among the training samples. In typical face recognition, by contrast, not only is there large intra-class variation, but there are also many classes and only a few samples per class for training. A facial recognition system therefore often relies, implicitly, on extrapolation from the training samples. Some variations, such as location in the image frame, size and pose, can be removed by preprocessing to align and normalize the face; in many face recognition systems the eyes are detected and their locations used for this purpose. Feature extraction is an important stage in a typical facial recognition system. Many feature- or template-based methods have been proposed for this task, but new developments are still under way. A review of some classical face recognition methods is given in Chapter 3.

1.3 Thesis Outline

The goal of this thesis is to advance novel fractal recognition systems and their application to human face recognition. This idea is built on the theory of fractal image encoding and decoding, which is discussed in Chapter 2. As face recognition is the main application used to test the concepts presented in this thesis, some classical face recognition methods are reviewed in Chapter 3. Our fractal techniques for recognition, namely fractal codes directly as features, fractal image-set coding and subfractals, are introduced and discussed in Chapters 4, 5 and 6, respectively. The final chapter summarizes and concludes the thesis and points out promising directions for future research.

1.4 Publications Resulting from Research

The following fully-refereed publications have been produced as a result of work in this thesis:

Book Chapter

1. H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, "An Application of Fractal Image-set Coding in Facial Recognition," Biometric Authentication, vol. 3072 of Lecture Notes in Computer Science, pp. 178-186, Springer Verlag, July 2004.

Conference Publications

2. H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, "Fractal Image-set Encoding for Face Recognition," Proceedings of the International Conference on Computational Intelligence for Modelling, Control and Automation, pp. 664-672, July 2004.

3. H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, "Facial Image Retrieval Using Fractal Image-Set Coding," 2nd Workshop on Information Technology and Its Disciplines, Feb. 2004.

4. H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, "Mathematical Basis for Use of Fractal Codes as Features," Proceedings of Image and Vision Computing, IVCNZ02, vol. 1, pp. 203-208, Nov. 2002.

5. H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, "Robustness to Expression Variations in Fractal-based Face Recognition," Proceedings of the Sixth International Symposium on Signal Processing and its Applications, vol. 1, pp. 359-362, 2001.

6. H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, "Face Recognition Using Fractal Codes," Proceedings of the IEEE International Conference on Image Processing, vol. 3, pp. 58-61, 2001.

7. H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, "Face Recognition Using Fractal Codes," Proceedings of the Third Australasian Workshop on Signal Processing Applications (WoSPA) 2000, Brisbane, Australia, 2000.

Journal Publications

8. H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, "Subfractals: A New Concept for Fractal Image Coding and Recognition," Complexity International, Monash University, ISSN 1320-0682 (submitted).

Chapter 2 Fractal Encoding and Decoding

2.1 Introduction

Fractals, as interesting mathematical sets, were known and studied by mathematicians such as Cantor, Poincaré and Hilbert [14] in the late 19th and early 20th centuries. But it was Mandelbrot [56] who is widely recognized as having defined the science of fractal mathematics. Iterated function theory, defined by John Hutchinson [43], was the second step in the development of fractal compression systems. This theory was later used by Michael Barnsley [3] to describe the collage theorem, which characterizes what a system of iterated functions must be like in order to produce a fractal image. Arnaud Jacquin, one of Barnsley's graduate students, implemented an algorithm that can automatically convert an image into a partitioned iterated function system [45]. This algorithm is the basis for most current fractal coding algorithms. The goal of these algorithms is to create a series of mathematical processes which produce an accurate reproduction of an image. This reproduction using fractal codes is much more compact than the original image, and many algorithms [35], [46], [70], [99] have been proposed to use these codes for image compression. The remainder of this chapter is organized as follows. Section 2.2 explains some common features of fractals. Mathematical foundations, iterated function systems (IFS) and principles of fractal coding are presented in sections 2.3, 2.4 and 2.5, respectively.

2.2 Features of Fractals

Fractal shapes are characterized by statistical self-similarity, regular structure that appears over a range of scales, and non-integer (fractional) dimension. In spite of the intuitive appeal of the concept and its potential for wide application, the complexity and difficulty of visualization hindered the study of fractals until recent advances in computer processing. Fractal dimension can be measured using various methods, including the box-counting method, i.e., estimating the complexity from the number of boxes needed to approximate the figure at different scales [90]. Fractal figures generally share the following features:

No characteristic length: A shape usually has a definite scale that characterizes it. Geometric shapes, for instance, have their own characteristic length, such as the radius or circumference of a circle and the edge or diagonal of a square. Fractal figures have no such length. Their length, size or volume cannot be measured with a single unit, as their surfaces are not smooth and the closer we look, the more complicated a nest of surface shapes appears. Consequently, we cannot draw a tangent line to a fractal figure; i.e., it is non-differentiable.

Self-similarity: Fractal figures cannot be measured with a single characteristic length because of the repeated patterns we continuously discover at different scale levels. In other words, because fractal figures possess self-similarity, their shape does not change even when observed at different scales. One of the best examples for understanding this feature is the fern: as shown in Figure 2.1, a small part of the figure, when enlarged, reproduces the original figure.

Figure 2.1: One of the best examples for understanding the features of fractals is the fern.

Non-integer dimension (fractal dimension): We normally consider a point to have a topological dimension of 0. In this sense, a boundary has a topological dimension of 1, a surface has dimension 2 and a solid has dimension 3. However, a complex curve may wander over a surface. In the case of van Koch's snowflake, shown in Figure 2.2, the curve becomes 4/3 times longer every time it grows; such a curve has a fractal dimension that is a real number between 1 and 2. A complex curve that approaches surface filling will have a fractal dimension approaching 2. Therefore, the more complex the geographic boundary, the higher the fractal dimension (in the case of van Koch's curve, we take log 4 / log 3 ≈ 1.26 as its fractal dimension). The actual values of these fractal dimensions differ slightly, depending on the method used to define them. Several methods are currently feasible in practice: changing the coarse-graining level (box-counting methods), using the fractal measure relations, using the correlation function, using the distribution function, or using the power spectrum.

2.3 Mathematical Foundations

This section provides basic notation and definitions related to fractal image coding.

2.3.1 Metric Space

A space M (which can, e.g., be the space of compact subsets of R³) is a metric space if for any two of its elements x and y there exists a real number d(x, y), called the distance, that satisfies the following properties:

(1) d(x, y) ≥ 0 (non-negativity)
(2) d(x, y) = 0 if and only if x = y (identity)
(3) d(x, y) = d(y, x) (symmetry)
(4) d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality)

Cauchy sequence

A sequence {x_n}, n = 0, 1, 2, ..., of points x_n ∈ M is said to be a Cauchy sequence if, for every ε > 0, there exists K ∈ N such that d(x_n, x_m) ≤ ε for all n, m > K.

Complete metric space

A metric space (M, d) is complete if every Cauchy sequence of points {x_n} in M has a limit x ∈ M.

2.3.2 Contractive Transformations

A transformation w : M → M is said to be contractive with contractivity s ∈ [0, 1) if for any two points x, y ∈ M, d(w(x), w(y)) ≤ s · d(x, y). Loosely speaking, this says that applying a contractive map always brings points closer together (by some factor less than 1). Contractive transformations have the nice property that when they are repeatedly applied, they converge to a point which remains fixed upon further iteration.

2.3.3 Fixed Point Theorem

If (M, d) is a complete metric space and w : M → M is a contractive transformation with contractivity factor s, then:

1. There exists a unique fixed point xf ∈ M which is invariant under w: w(xf) = xf.

2. For any point x ∈ M, w^n(x) → xf as n → ∞, where w^n(x) = w(w(...w(x)...)) denotes n applications of w.

3. (Collage theorem) For any point x ∈ M,

    d(x, xf) ≤ (1 / (1 − s)) · d(x, w(x))

The fixed point theorem shows how fractal coding of images can be done. We consider images as points in a metric space and find a contractive transformation on that space whose fixed point is the image we wish to encode (in practice it may be an image very close to it). The fixed point theorem guarantees that the distance between the transformed point (under the contractive transformation) and the fixed point is less than the distance between the initial point and the fixed point. If we apply the contractive transformation iteratively to an initial point, the results come closer and closer to the fixed point.
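As a simple numerical illustration of the theorem (my own sketch, not from the thesis), the contractive affine map w(x) = 0.5x + 3 on the real line has contractivity s = 0.5 and fixed point xf = 6; iterating it from any starting point converges to xf, with the distance halving at each step:

```python
def w(x):
    # A contractive affine map with s = 0.5 and fixed point xf = 6.
    return 0.5 * x + 3.0

x = 100.0  # arbitrary starting point
for n in range(10):
    x = w(x)
    print(n, x)  # the distance to 6 shrinks by a factor of 0.5 each step
# After 10 iterations x is within 0.1 of the fixed point 6.
```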

2.3.4 Affine Transformation

For a gray-scale image I, if z denotes the pixel intensity at position (x, y), then the affine transformation W can be expressed in matrix form as follows:

      | x |   | a  b  0 | | x |   | e |
    W | y | = | c  d  0 | | y | + | f |
      | z |   | 0  0  s | | z |   | o |

where a, b, c, d, e and f are geometrical parameters, s is the contrast and o is the brightness offset (the luminance parameters). This transformation can also be written in the linear form W(X) = AX + B, where A is an n × n matrix (in our case n = 3) and B is an offset vector of size n × 1. Using an affine transformation we can scale, rotate and translate an image as well as scale its contrast and shift its pixel intensities.
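The block structure above separates the geometrical part (a, b, c, d, e, f) from the luminance part (s, o). A minimal numpy sketch of W(X) = AX + B acting on a point (x, y, z) (illustrative names of my own, not thesis code):

```python
import numpy as np

def make_affine(a, b, c, d, e, f, s, o):
    """Build W(X) = A X + B acting on points (x, y, z)."""
    A = np.array([[a, b, 0.0],
                  [c, d, 0.0],
                  [0.0, 0.0, s]])  # spatial part plus contrast s
    B = np.array([e, f, o])        # translation plus brightness offset o
    return lambda X: A @ X + B

# Shrink by half, shift by (10, 20), halve the contrast, brighten by 16.
W = make_affine(0.5, 0, 0, 0.5, 10, 20, 0.5, 16)
print(W(np.array([64.0, 32.0, 128.0])))  # -> [42. 36. 80.]
```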

2.4 Iterated Function Systems (IFS)

An iterated function system {W : wi, i = 1, 2, ..., N} consists of a collection of contractive affine transformations wi : M → M with respective contractivity factors si, together with a complete metric space (M, d). This collection of transformations defines a contractive transformation W with contractivity factor s = max{si, i = 1, 2, ..., N}:

    W(X) = w1(X) ∪ w2(X) ∪ ... ∪ wN(X)

The contractive transformation W on the complete metric space (M, d) has a unique fixed point Xf, which is also called the attractor of the IFS:

    W(Xf) = w1(Xf) ∪ w2(Xf) ∪ ... ∪ wN(Xf) = Xf

Figure 2.3 shows an example of the attractor of an IFS with 3 simple contractive transformations w1, w2 and w3, with coefficients:

    wi     a     b    c     d     e      f
    w1    0.5    0    0    0.5    0      0
    w2    0.5    0    0    0.5   0.5     0
    w3    0.5    0    0    0.5   0.25   0.5

where each wi has the form:

       | x |   | a  b  0 | | x |   | e |
    wi | y | = | c  d  0 | | y | + | f |
       | z |   | 0  0  0 | | z |   | 1 |
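A quick way to visualize the attractor of this IFS is the "chaos game": repeatedly apply a randomly chosen wi to a point, and the orbit fills in the Sierpinski triangle. The sketch below (my own illustration, not thesis code) uses exactly the (a, b, c, d, e, f) coefficients from the table above, ignoring the intensity coordinate:

```python
import random

# (a, b, c, d, e, f) rows from the table above.
MAPS = [(0.5, 0, 0, 0.5, 0.0, 0.0),
        (0.5, 0, 0, 0.5, 0.5, 0.0),
        (0.5, 0, 0, 0.5, 0.25, 0.5)]

def chaos_game(n_points=100_000):
    """Generate points on the IFS attractor by random iteration."""
    x, y = 0.0, 0.0
    points = []
    for _ in range(n_points):
        a, b, c, d, e, f = random.choice(MAPS)
        x, y = a * x + b * y + e, c * x + d * y + f
        points.append((x, y))
    # Discard the transient before the orbit settles onto the attractor.
    return points[100:]
```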

Jacquin's method, as well as many other fractal image coding methods, is based on partitioned iterated function systems (PIFS), a generalization of IFS. In a PIFS, each transformation wi applies only to a restricted set of domains. This allows more general images, which are not fully self-similar, to be encoded.

2.5 Principles of Fractal Coding

Various schemes of fractal image compression have been proposed, differing in the partitioning method, the class of transformations, or the type of search used to locate suitable domain blocks. The first fully automated algorithm for fractal image compression was proposed by Jacquin [45] in 1989.

Figure 2.2: Van Koch's snowflake with fractal dimension of 1.26.

Figure 2.3: Sierpinski triangle, the attractor of an IFS containing 3 contractive transformations.

Until Jacquin's encoder became available, attempts had been made to design fractal encoders that created transformations with the structure of iterated function systems. Jacquin's method was based on partitioned iterated function systems (PIFS), a more general type of transformation which exploits the fact that a part of an image can be approximated by a transformed and down-sampled version of another part of the same image; this property is called piecewise self-similarity. A PIFS consists of a complete metric space X, a collection of sub-domains Di ⊂ X, i = 1, ..., n, and a collection of contractive mappings wi : Di → X, i = 1, ..., n. The encoder works, in principle, as follows:

Range Blocks: An image to be encoded is partitioned into non-overlapping range blocks Ri.

Domain Blocks: The image is also partitioned into larger blocks Dj, called domain blocks, which may overlap.

Transformation: The task of the fractal encoder is to find, for every range block Ri, a domain block DRi of the same image such that a transformed version of this block, w(DRi), is a good approximation of the range block. The contractive transformation w is a combination of a geometrical transformation and a luminance transformation. The transformed version of the domain block can be rotated, mirrored, contrast-scaled or translated, so the transformation can be written as an affine transformation.
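To make the encoder loop concrete, here is a minimal sketch (my own simplified illustration, not Jacquin's exact algorithm: fixed-size blocks, no rotations or reflections, plain exhaustive search). For each range block it selects the domain block and luminance parameters that minimize the squared error:

```python
import numpy as np

def encode(img, r=8):
    """Toy PIFS encoder: fixed r x r ranges, 2r x 2r domains, no isometries.
    Assumes the image dimensions are multiples of r."""
    H, W = img.shape
    img = img.astype(float)
    # Collect candidate domain blocks, down-sampled to range size.
    domains = []
    for dy in range(0, H - 2 * r + 1, r):
        for dx in range(0, W - 2 * r + 1, r):
            D = img[dy:dy + 2 * r, dx:dx + 2 * r]
            D = D.reshape(r, 2, r, 2).mean(axis=(1, 3))  # 2x2 averaging
            domains.append(((dy, dx), D))
    code = []
    for ry in range(0, H, r):
        for rx in range(0, W, r):
            R = img[ry:ry + r, rx:rx + r]
            best = None
            for pos, D in domains:
                # Least-squares luminance fit R ~ s*D + o.
                s = np.polyfit(D.ravel(), R.ravel(), 1)[0]
                s = float(np.clip(s, -0.9, 0.9))  # keep the map contractive
                o = R.mean() - s * D.mean()       # optimal offset for this s
                err = np.sum((s * D + o - R) ** 2)
                if best is None or err < best[0]:
                    best = (err, pos, s, o)
            code.append(((ry, rx), best[1], best[2], best[3]))
    return code
```

Real encoders add the eight block isometries and a smarter search; the exhaustive loop above is quadratic in the number of blocks and only meant to show the structure of the algorithm.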

Fractal image coding schemes thus differ mainly in the partitioning method, the class of transformations, and the type of search used to locate suitable domain blocks.

2.5.1 Partitioning

The first decision to be made when designing a fractal coding scheme is the choice of the type of image partition used for the domain and range blocks. The simplest possible range partition consists of fixed-size square blocks. Quadtree partitioning employs the well-known image processing technique based on recursive splitting of selected image quadrants, enabling the resulting partition to be represented by a tree structure in which each non-terminal node has four descendants. A horizontal-vertical (HV) partition, like the quadtree, produces a tree-structured partition of the image; instead of recursively splitting quadrants, however, each image block is split into two by a horizontal or vertical line. Finally, a number of different constructions of triangular partitions have been investigated. In a triangular partitioning scheme, a rectangular image is divided diagonally into two triangles, each of which is recursively subdivided into four triangles by segmenting it along lines that join three partitioning points on the three sides of the triangle.
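A minimal recursive quadtree split (an illustrative sketch with my own names and a variance-based split criterion; actual coders usually split a range block when no domain block covers it well enough):

```python
import numpy as np

def quadtree(img, x, y, size, min_size=4, var_thresh=100.0):
    """Return a list of (x, y, size) leaf blocks of a quadtree partition.
    Assumes size is a power of two and the block lies inside img."""
    block = img[y:y + size, x:x + size]
    if size <= min_size or block.var() <= var_thresh:
        return [(x, y, size)]  # homogeneous enough: keep as one leaf
    h = size // 2
    return (quadtree(img, x, y, h, min_size, var_thresh) +
            quadtree(img, x + h, y, h, min_size, var_thresh) +
            quadtree(img, x, y + h, h, min_size, var_thresh) +
            quadtree(img, x + h, y + h, h, min_size, var_thresh))

# Example: leaves = quadtree(image, 0, 0, 256) for a 256 x 256 image.
```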

2.5.2 Transformations

A critical element of a fractal coding scheme is the type of transformation selected, since it determines the convergence properties on decoding, and its quantized parameters comprise the majority of the information in the compressed representation. The fixed point theorem states that contractive transformations, through their fixed points, can be used to represent points in the space. However, the theorem does not give a method for finding such transformations. If we find a suitable contractive transformation W for an image Xf, we know that the fixed point of W is Xf, so

    d(xf, W(xf)) = d(xf, xf) = 0

It may be very difficult to find an exact transformation W for an arbitrary image x. Instead, most fractal image encoders only aim to find a transformation W* with attractor x*f such that d(x, x*f) is as small as possible. If the distance d(x, W(x)) ≤ δ, then the distance from x to its approximation x*f, which is the attractor of W, is bounded by:

    d(x, x*f) ≤ δ / (1 − s)

Hence both δ and s (the contractivity factor of W) should be as small as possible. Affine transformations are good candidates for this. Each transformation has two parts: geometrical and luminance. The geometrical part scales, rotates and translates a domain block to fit the range block. To keep the transformation contractive, a domain block is always larger than its range block, so the spatial scale factor is always less than 1. The luminance part consists of a few simple functions, such as a luminance shift and contrast scaling (again with a contrast factor less than 1).
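Decoding then iterates the stored transformations from an arbitrary starting image; because every map is contractive, the iterates converge toward the attractor regardless of the start. A companion sketch to the toy encoder above (same simplifying assumptions, my own names):

```python
import numpy as np

def decode(code, shape, r=8, n_iter=10):
    """Iterate a toy PIFS code (as produced by encode above) to its attractor."""
    img = np.zeros(shape)  # any starting image converges to the same attractor
    for _ in range(n_iter):
        out = np.empty(shape)
        for (ry, rx), (dy, dx), s, o in code:
            D = img[dy:dy + 2 * r, dx:dx + 2 * r]
            D = D.reshape(r, 2, r, 2).mean(axis=(1, 3))  # spatial contraction
            out[ry:ry + r, rx:rx + r] = s * D + o        # luminance map
        img = out
    return img
```

After each pass the error to the attractor shrinks by at least the contractivity factor, which is why a handful of iterations (5 to 10 in practice) usually suffices.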

2.6 Summary

In this chapter the focus has been on a brief introduction to fractals and their features, such as self-similarity and non-integer dimension, as well as the basic concepts of fractal image coding. The mathematical foundations, including complete metric spaces, contractive transformations and the fixed point theorem, have been introduced. Later in this thesis, the use of fractal codes for face recognition is proposed and discussed.

Chapter 3 Face Recognition

3.1 Introduction

The face is a unique feature of human beings, yet all faces are similar in their features and structure. During the past several years, face recognition has developed into a major research area in pattern recognition and computer vision. As one of the most challenging applications in these fields, face recognition has received significant attention. Unlike other biometric systems, facial recognition can be used for general surveillance, usually in combination with public video cameras. This chapter overviews some of the classic 2D still-image face recognition algorithms.

3.2 Facial Feature Detection

Most practical face recognition systems need a face detection stage to locate the face within a source image. Face recognition systems also normalize the size and orientation of the face to achieve more robustness. The normalization methods use the locations of significant facial features such as the eyes, nose or mouth. For example, once the eyes are detected, one can map them to pre-determined locations in an image of pre-defined size using an affine transformation. The importance of robust facial feature detection for both detection and recognition has resulted in the development of a variety of facial feature detection algorithms [2], [20], [59], [66], [89], [106]. Brunelli and Poggio [15], [17] proposed a facial feature detection method which uses a set of templates to detect the position of the eyes in an image, by looking for the maximum absolute values of the normalized correlation coefficient of these templates at each point in the test image. The problems associated with scale variation can be addressed by using a set of templates at different scales or by using hierarchical correlation, as proposed by Burt [18].
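As an illustration of the template-based detection idea (a generic sketch of normalized cross-correlation, not Brunelli and Poggio's implementation; the names are mine), the correlation coefficient is evaluated at every position and the peak taken as the detected eye location:

```python
import numpy as np

def normalized_correlation(image, template):
    """Slide template over image; return the (y, x) of peak |correlation|."""
    th, tw = template.shape
    t = template - template.mean()
    best, best_pos = -1.0, (0, 0)
    for y in range(image.shape[0] - th + 1):
        for x in range(image.shape[1] - tw + 1):
            w = image[y:y + th, x:x + tw]
            w = w - w.mean()
            denom = np.sqrt((w ** 2).sum() * (t ** 2).sum())
            if denom > 0:
                r = (w * t).sum() / denom  # correlation coefficient in [-1, 1]
                if abs(r) > best:          # maximum absolute value, as in the text
                    best, best_pos = abs(r), (y, x)
    return best_pos, best
```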

3.3 Geometric Feature Based Methods

The geometric feature based approaches [39], [42], [47], [49] are the earliest approaches to face recognition and detection. They focus on detecting individual features such as the eyes, ears, head outline and mouth, and on measuring properties such as eyebrow thickness and vertical position, or nose position and width, which are collected in a feature vector used to represent a face. To recognize a face, feature vectors are first computed for the test image and for the images in the database. A similarity measure between these vectors, most often a minimum distance criterion, is then used to determine the identity of the face. Brunelli and Poggio [16] compute a set of geometrical features such as nose width and length, mouth position and chin shape. They report a 90% recognition rate on a database of 47 people. However, they show that a simple template-matching scheme provides 100% recognition for the same database.

3.3.1 Face Recognition Using Principal Component Analysis (Eigenfaces)

Principal component analysis (PCA) [37] is a simple statistical dimensionality-reduction technique that has perhaps become the most popular and widely used method for the representation and recognition of human faces. PCA, via the Karhunen-Loeve transform, can extract the most statistically significant information from a set of images as a set of eigenvectors (usually called eigenfaces [96] when applied to faces), which can be used both to recognize and to reconstruct face images. This method, proposed by Turk and Pentland [75], [96], [97], was motivated by the earlier work of Sirovitch [88] and Kirby [50] on efficiently representing face images. Once the face images are normalized for eye position, they can be treated as a 1-D array of pixel values. The eigenvectors of the covariance matrix C of the ensemble of training faces are called eigenfaces. The space spanned by the eigenvectors vk, k = 1, ..., K, corresponding to the K largest eigenvalues of the covariance matrix, is called the face space. Eigenvectors can be regarded as a set of generalized features which characterize the image variations in the database. Each image has an exact representation as a linear combination of these eigenvectors and an arbitrarily close approximation using the K most significant eigenvectors. The number of eigenvectors chosen determines the dimensionality of the face space. A new face image is transformed into its eigenface components by projection onto the face space. The projections form the feature vector, which describes the contribution of each eigenface in representing the input image. A test image is recognized by computing the Euclidean distance in the feature space and selecting the closest match. The effect of lighting conditions on the KLT-based method has been detailed in [31]. The eigenface method has also been used for face detection [67], [68] by measuring the distance from each local pattern in a test image to the face space defined by the eigenfaces. In [1], Akamatsu et al. applied the eigenface method to the magnitude of the Fourier spectrum of the images after normalization with respect to illumination and scale. Due to the shift-invariance property of the magnitude of the Fourier spectrum, and to the illumination and scale normalization, the method, called the Karhunen-Loeve Transform of Fourier Spectrum in the Affine Transformed Target Images (KL-FSAT), performed better than the classical eigenfaces method under variations in head orientation and shifting. In summary, PCA is a very efficient signal encoder, designed specifically to characterize and encode variations rather than ignore them. Thus it may find the optimal low-dimensional representation, but this may be more useful for reconstruction than for recognition. In addition, the eigenface method is not invariant to image transformations such as scaling, shift or rotation in its original form, and requires complete relearning of the training data to add new individuals to the database.
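A compact sketch of the eigenface computation described above (standard PCA on vectorized faces; the variable names are mine, not from the cited papers). It uses the common trick of diagonalizing the small N × N Gram matrix instead of the full pixel covariance matrix:

```python
import numpy as np

def eigenfaces(faces, K):
    """faces: N x P matrix, one vectorized (eye-aligned) face per row."""
    mean = faces.mean(axis=0)
    X = faces - mean                    # centered data
    # Eigenvectors of X X^T (N x N) yield those of the covariance X^T X.
    vals, vecs = np.linalg.eigh(X @ X.T)
    order = np.argsort(vals)[::-1][:K]  # K largest eigenvalues
    U = X.T @ vecs[:, order]            # map back to pixel space
    U /= np.linalg.norm(U, axis=0)      # unit-norm eigenfaces, P x K
    return mean, U

def project(face, mean, U):
    """Feature vector of K eigenface coefficients for one face."""
    return U.T @ (face - mean)

# Recognition: nearest training projection by Euclidean distance.
```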

3.3.2 Recognition Using Independent Component Analysis (ICA)

Independent Component Analysis (ICA) is a statistical method for transforming an observed multidimensional random vector into components that are mutually as independent as possible. This technique can be used for extracting statistically independent variables from a mixture of them [22]. In a classical example, two people in the same room speak simultaneously and two microphones placed at different locations record the mixed conversations. ICA can be used to estimate the contribution coefficients of the two signals, which allows us to separate the two original signals from each other, assuming that the two speech signals are statistically independent. The tutorial [44] written by Hyvarinen and Oja contains more details about the algorithms involved. Bartlett and Sejnowski have used ICA for face recognition [5], [6], [7]. Two approaches for recognizing faces across changes in pose were explored using ICA. In the first architecture, a set of statistically independent basis images for the faces was produced. This set can be viewed as a set of independent facial features. Unlike the PCA basis vectors, these ICA basis images were spatially local. The representation consisted of the coefficients of the linear combination of basis images that comprised each face image. The second architecture produced independent coefficients, providing a factorial face code in which the probability of any combination of features can be obtained from the product of their individual probabilities. Classification was performed using a nearest-neighbour rule, with similarity measured as the cosine of the angle between representation vectors. Both ICA representations showed better recognition scores than PCA when recognizing faces across sessions with changes in expression and changes in pose.
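The nearest-neighbour classification with cosine similarity mentioned above reduces to a few lines (a generic sketch, not the cited authors' code):

```python
import numpy as np

def cosine_nearest(query, gallery, labels):
    """Return the label whose gallery vector has the largest cosine with query."""
    q = query / np.linalg.norm(query)
    G = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return labels[int(np.argmax(G @ q))]
```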

3.4 Linear Discriminant-Based Method

In [8], [9], [33] the authors proposed a method for face recognition using Fisher's Linear Discriminant Transform (LDT) [34], [37]. The Fisherface method uses class membership information to develop a set of feature vectors in which variations between different faces are emphasized, while variations among instances of the same face, due to illumination conditions, facial expression and orientation, are de-emphasized. In other words, the LDT finds the projection that best separates the classes. Each test image is projected onto the optimal LDT space, and the resulting set of coefficients is used to compute the Euclidean distance from the images in the training set. The Fisherface method has also been applied to face detection from color images [86]. In [1], Akamatsu, Sasaki and Suenuga applied LDA to the magnitude of the Fourier spectrum of the intensity image. The results reported by the authors showed that LDA in the Fourier domain is significantly more robust to variations in lighting than LDA applied directly to the intensity images. In [54], the authors proposed another LDA-based method.

3.4.1 Other Methods

Other popular face recognition approaches that will only be mentioned here include Dynamic Link Matching [100], [101], Matching Pursuit-based methods [55], [76], [77], [78], Hidden Markov Model based methods [71], [85] and Face Recognition by Elastic Bunch Graph Matching [102], [103], [104]. Fractal-based approaches [24], [23], [25], [26], [29], [27], [28], [30], [52], [93] are a new application of fractals and will be presented in the next chapter. Some other publications that describe the latest achievements as well as currently unsolved issues of face recognition are: [107], [87], [10], [40], [95], [19], [84].

3.5 Summary

Face recognition systems work very well under constrained conditions, such as frontal mug-shot images and consistent lighting. In the real world, face recognition, like other biometrics, suffers from several usability problems. The face is a changeable social organ (see Figure 3.1) displaying a variety of possible presentations. Facial expressions change the shape of facial components such as the eyes, mouth and eyebrows. Artificial changes include cuts and bandages from injuries, wearing glasses, and fashion-related items like make-up and jewelry. Other changes occur with time, such as growth and removal of facial hair and the wrinkling of skin caused by aging. It has been shown that using facial images taken at least one year apart can cause error rates of 43% [80] to 50% [74]. A facial image is a 2D view (projection) of a 3D surface. Viewing angle, pose and illumination (changes in sunlight intensity) affect this projection; for example, when the face tilts left-right or up-down, the 2D view changes. These gray-level changes cause features to change. Face recognition algorithms appear to be sensitive to departures from ideal conditions, and the FRVT evaluation report [12] shows high error rates even under ideal conditions.

Figure 3.1: Examples of pose (XM2VTS face image database [65]), lighting (AR face image database [61]) and facial expression variations (CMU-Pitt facial expression database [48]) in face images.

Different performance evaluation tests, such as FERET [82], [81], [83], FRVT [12], [79] and XM2VTS [63], [65], show significant improvement in face recognition technology. However, there are still areas which require further research and development.

Chapter 4 Fractal Codes Directly as Features

4.1 Introduction

Using fractals for object or shape recognition is a relatively new application of fractal image encoding. The goal of the fractal image encoding algorithms is to be able to create a series of mathematical processes which would produce an accurate reproduction of an image. For many years, fractal encoding was a technique for image compression. Fractal codes are much more compact than the original image and many algorithms [4], [13], [35], [36], [38], [45], [69], [70] [98], [99], [105] have been proposed to use these codes for image compression. In this chapter another application of fractal codes is proposed. Fractal codes have this ability to reproduce an image (or at least a good approximation of it) by a set of contractive transformations. These transformations can be shown in simple affine form and can be recorded by several simple parameters. This compact presentation of images shows its usefulness in image compression, but, is it possible to use this code for recognition too? In this chapter, a brief explanation of some previous related work is given in section 4.2. In section 4.3, use of fractal codes as feature


for recognition, especially face recognition, is described, and different aspects of this method, such as fractal extraction, normalizing the fractal codes, accuracy testing and improving robustness, are discussed. Other original fractal techniques for face recognition are explained in chapters 5 and 6.

4.2 Previous Related Work

4.2.1 Shape Recognition Using Fractal Geometry

Neil [72], [73] proposed one of the first methods for using fractal techniques in shape recognition. His method is based on the comparison of a transformation and an object. To compare two different objects (shapes), his method first finds an associated transformation for the object being identified. A comparison between the transformation and the other object is then achieved by applying the transformation to that object. The object will remain unchanged if and only if the transformation is an associated transformation for that object. This method is based on the binary representation of shapes (black-and-white images) and reported some invariance to rotation by giving standard orientations to shapes.

4.2.2 Face Recognition Using Fractal Dimensions

Kouzani [51] proposed a face recognition method based on the fractal dimension. In his method, each pixel of an image is replaced by the fractal dimension of the region around that pixel. To handle the shortcomings of fractal dimension calculation, he used the average of the fractal dimensions computed over regions of different sizes around the pixel. To compare two images, he presented these fractal dimension maps to a normalized cross-correlation stage in which the best match is chosen.


In another work [52], Kouzani used two feed-forward neural networks. The first implements the search process for matching range and domain blocks in the face image. The second compares the fractal code of a query image with the fractal code of a known face in the database. Kouzani claimed that the second neural network calculates the degree of similarity between the two fractal face models, but did not explain how.

4.2.3 Face Recognition Using Fractal Neighbor Distances

Yan and Tan [93], [92], [94] extended Neil's method to gray-scale images. In their method, a database of fractal codes for the set of training face images is generated first. Then, for any unknown query image Iq and any known training image I with fractal code WI, the fractal neighbor distance Υ(WI, Iq) = d(WI(Iq), Iq) is calculated and compared with the others (d is a Euclidean distance function). The code Wmin with minimum Υ(Wmin, Iq) is taken as the best match. This method was also used for face verification. The system comprised two components: face detection and face verification subsystems. The location of the head was detected based on the result of a search in the reduced region, using the fractal neighbor distance between a generic face template and a portion of the image. The verification subsystem also used the fractal neighbor distance to compute and find the minimal distance between the localized head image and the images stored in the XM2VTS database. The results were reported in [63].
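To make this matching rule concrete, the following minimal Python sketch classifies a query image by applying each stored training code once and keeping the smallest fractal neighbor distance. This is not Yan and Tan's published implementation; the per-identity apply_code callables, each of which performs one decoding iteration of a stored code, are hypothetical stand-ins for a real fractal coder.

import numpy as np

def fractal_neighbor_distance(apply_code, query):
    # Upsilon(W_I, I_q) = d(W_I(I_q), I_q): apply the stored code of a
    # training image once to the query and measure the Euclidean distance.
    return np.linalg.norm(apply_code(query) - query)

def classify(codes, query):
    # codes: {identity: apply_code}; return the identity whose fractal
    # code moves the query image the least.
    return min(codes, key=lambda name: fractal_neighbor_distance(codes[name], query))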

4.3 Fractal Codes as Features

Since fractal encoding algorithms can be applied to any gray-scale image, we can say that any gray-scale image can be approximated by the attractor of a fractal code. An image xf is the attractor of a fractal code W(x) if xf is the fixed point (see


sec. 2.3.3) of the fractal code, W(xf) = xf. Since fractal representations are transformations that map one part of an image to another, some parts of the code could be robust to many types of degradation that affect both parts (domain and range blocks) similarly. This section describes the first system proposed in this thesis, which is based on the use of the fractal code of an image as a feature for recognition.

4.3.1 Fractal Extraction

In fractal image coding, the code for an image x is an efficient binary representation of a set of contractive affine transformations W whose unique fixed point xf is a good approximation to x. The fractal coding algorithm used in this system can be described as follows:

1- Partition the image to be encoded into non-overlapping range blocks Ri using quadtree partitioning.

2- Cover the image with a sequence of possibly overlapping domain blocks Dj.

3- For each range block, find the domain block and the corresponding transformation that best match the range block.

4- Save the geometrical positions of the range block and the matched domain block, as well as the matching transformation parameters, as the fractal code of the image.

A sketch of this search loop is given below.
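The following Python sketch illustrates steps 2-4 under stated assumptions: square gray-scale numpy images, domain sizes that are integer multiples of the range size, and the (x, y, size) block descriptors, helper names and exhaustive search are our own illustrative conventions rather than a prescribed interface.

import numpy as np

def shrink(block, size):
    # Average f x f neighbourhoods so the domain matches the range size.
    f = block.shape[0] // size
    return block.reshape(size, f, size, f).mean(axis=(1, 3))

def orientations(block):
    # The eight isometries: four 90-degree rotations, then a reflection
    # followed by four more rotations (cf. figure 4.2).
    for k in range(4):
        yield k, np.rot90(block, k)
    for k in range(4):
        yield 4 + k, np.rot90(np.fliplr(block), k)

def luminance_fit(d, r):
    # Least-squares contrast s and brightness o (equations 4.8-4.13).
    d, r = d.ravel(), r.ravel()
    beta = np.sum((d - d.mean()) ** 2)
    s = np.sum((d - d.mean()) * (r - r.mean())) / beta if beta else 0.0
    return s, r.mean() - s * d.mean()

def encode(image, ranges, domains):
    # For each range block, search every larger domain block and every
    # orientation, and keep the transformation with the smallest squared
    # error (step 3); each saved code is
    # (range position, domain index, orientation, brightness o, contrast s).
    codes = []
    for (rx, ry, rs) in ranges:
        r = image[ry:ry + rs, rx:rx + rs]
        best = None
        for j, (dx, dy, ds) in enumerate(domains):
            if ds <= rs or ds % rs:
                continue  # contractivity needs a strictly larger domain
            d = shrink(image[dy:dy + ds, dx:dx + ds], rs)
            for k, dt in orientations(d):
                s, o = luminance_fit(dt, r)
                err = np.sum((s * dt + o - r) ** 2)
                if best is None or err < best[0]:
                    best = (err, ((rx, ry, rs), j, k, o, s))
        codes.append(best[1])
    return codes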

Quadtree partitioning

The quadtree partitioning method employs a well-known image processing technique based on the recursive splitting of selected image quadrants, enabling the resulting partition to be represented by a tree structure in which each non-terminal node has four descendants. The usual top-down construction starts by selecting


an initial level in the tree, corresponding to some maximum range block size. In order to produce contractive transformations, range blocks that are not smaller than the largest domain blocks are subdivided into smaller range blocks. Each range block larger than a preset limit is recursively partitioned if no match with one of the domain blocks in the domain pool is found that is better than some preselected threshold. In figure 4.1 (top left side) a sample quadtree partition is shown. Note that a region containing detail is split into smaller blocks in the process of finding a sufficiently good match. A recursive sketch of this construction follows.
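The sketch below produces the (x, y, size) range descriptors consumed by the encoding sketch above, assuming a square image whose side is a power of two. A real coder splits a block when no domain matches it well enough; here the block variance is a stand-in for that match-error test, and all parameter values are illustrative only.

import numpy as np

def quadtree_ranges(image, x=0, y=0, size=None, max_size=16, min_size=4, tol=250.0):
    # Split a block while it is larger than the maximum range size, or
    # while it is detailed (high variance) and still above the minimum
    # size; otherwise accept it as a range block.
    if size is None:
        size = image.shape[0]
    block = image[y:y + size, x:x + size]
    if size > max_size or (size > min_size and block.var() > tol):
        half = size // 2
        out = []
        for dy in (0, half):
            for dx in (0, half):
                out += quadtree_ranges(image, x + dx, y + dy, half,
                                       max_size, min_size, tol)
        return out
    return [(x, y, size)]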

Figure 4.1: Domain (bottom) and range (top left) blocks for an image (top right).


Domain blocks

The task of a fractal encoder is to find, for every range block, a domain block D of the same image such that a transformation of this block, W(D), is a good approximation of the range block. In order to have contractive transformations, the domain block should be bigger than the range block. The number of different domain block sizes and how much overlap is allowed are two important parameters of the system. Figure 4.1 (bottom) shows domain blocks of two different sizes, 8 × 8 and 16 × 16. Note that in this example, domain blocks of the same size do not overlap, but each larger domain block overlaps with four domain blocks of the smaller size.

Mapping domains to ranges

The main computational step in fractal image coding is the mapping of domains to range blocks. For each range block, the algorithm compares transformed versions of the domain blocks to the range block. The transformations are typically affine. The transformation W is a combination of a geometrical transformation and a luminance transformation. For a gray-scale image I, if z denotes the pixel intensity at position (x, y), then W can be expressed in matrix form as follows:

$$W \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} a & b & 0 \\ c & d & 0 \\ 0 & 0 & s \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} + \begin{pmatrix} e \\ f \\ o \end{pmatrix} \qquad (4.1)$$

Coefficients a, b, c, d, e and f control the geometrical aspects of the transformation (skewing, stretching, rotation, scaling and translation), while the coefficients s and o determine the contrast and brightness of the transformation and together make up the luminance parameters. The geometrical parameters of the transformation are limited to rigid translation, a contractive size-matching, and one of eight orientations. The orientations consist of four 90° rotations, and a reflection followed


by four 90° rotations, as shown in figure 4.2.

Figure 4.2: The eight possible orientations of a block. The orientations consist of four 90° rotations, a reflection, and four more 90° rotations.

Figure 4.3: An illustration of domain and range blocks.

Domain-range comparison is a three-step process. First, one of the eight basic orientations is applied to the selected domain block Dj. Next, the rotated domain is shrunk to match the size of the range block Rk. The range must be smaller than the domain in order for the overall mapping to be a contraction. Finally, optimal contrast and brightness parameters are computed using least-squares fitting. Representing the image as a set of transformed blocks does not form an exact copy of the original image, but a close approximation of it. Minimizing the error between W(Dj) and Rk will minimize the error between the original image and the approximation. Let ri and di, i = 1, ..., n, denote the pixel values of the two equal-size blocks Rk and shrink(Dj). The error Err is defined as:

$$Err = \sum_{i=1}^{n} (s \cdot d_i + o - r_i)^2 \qquad (4.2)$$


The minimum of Err occurs when the partial derivatives with respect to s and o are zero:

$$Err = n \cdot o^2 + \sum_{i=1}^{n} \left( s^2 d_i^2 + 2 s d_i o - 2 s d_i r_i - 2 o r_i + r_i^2 \right) \qquad (4.3)$$

$$\frac{\partial Err}{\partial s} = \sum_{i=1}^{n} \left( 2 s d_i^2 + 2 d_i o - 2 d_i r_i \right) = 0 \qquad (4.4)$$

$$\frac{\partial Err}{\partial o} = 2 n o + \sum_{i=1}^{n} \left( 2 s d_i - 2 r_i \right) = 0 \qquad (4.5)$$

which occurs when:

$$s = \frac{n \sum_{i=1}^{n} d_i r_i - \sum_{i=1}^{n} d_i \sum_{i=1}^{n} r_i}{n \sum_{i=1}^{n} d_i^2 - \left( \sum_{i=1}^{n} d_i \right)^2} \qquad (4.6)$$

$$o = \frac{1}{n} \left[ \sum_{i=1}^{n} r_i - s \sum_{i=1}^{n} d_i \right] \qquad (4.7)$$

These two equations can be simplified as:

$$s = \frac{\alpha}{\beta} \qquad (4.8)$$

$$o = \bar{r} - \left( \frac{\alpha}{\beta} \right) \bar{d} \qquad (4.9)$$

where:

$$\bar{d} = \frac{1}{n} \sum_{i=1}^{n} d_i \qquad (4.10)$$

$$\bar{r} = \frac{1}{n} \sum_{i=1}^{n} r_i \qquad (4.11)$$

$$\alpha = \sum_{i=1}^{n} \left( d_i - \bar{d} \right) \left( r_i - \bar{r} \right) \qquad (4.12)$$

$$\beta = \sum_{i=1}^{n} \left( d_i - \bar{d} \right)^2 \qquad (4.13)$$

Proof:

$$s = \frac{\alpha}{\beta} = \frac{n\alpha}{n\beta} = \frac{n \sum_{i=1}^{n} (d_i - \bar{d})(r_i - \bar{r})}{n \sum_{i=1}^{n} (d_i - \bar{d})^2} = \frac{n \sum_{i=1}^{n} d_i r_i - n \bar{d} \sum_{i=1}^{n} r_i - n \bar{r} \sum_{i=1}^{n} d_i + n^2 \bar{d}\bar{r}}{n \sum_{i=1}^{n} d_i^2 - 2 n \bar{d} \sum_{i=1}^{n} d_i + n^2 \bar{d}^2}$$

$$= \frac{n \sum_{i=1}^{n} d_i r_i - \sum_{i=1}^{n} d_i \sum_{i=1}^{n} r_i - \sum_{i=1}^{n} r_i \sum_{i=1}^{n} d_i + \sum_{i=1}^{n} d_i \sum_{i=1}^{n} r_i}{n \sum_{i=1}^{n} d_i^2 - 2 \left( \sum_{i=1}^{n} d_i \right)^2 + \left( \sum_{i=1}^{n} d_i \right)^2} = \frac{n \sum_{i=1}^{n} d_i r_i - \sum_{i=1}^{n} d_i \sum_{i=1}^{n} r_i}{n \sum_{i=1}^{n} d_i^2 - \left( \sum_{i=1}^{n} d_i \right)^2}$$

and

$$o = \bar{r} - \left( \frac{\alpha}{\beta} \right) \bar{d} = \frac{1}{n} \sum_{i=1}^{n} r_i - s \cdot \frac{1}{n} \sum_{i=1}^{n} d_i = \frac{1}{n} \left[ \sum_{i=1}^{n} r_i - s \sum_{i=1}^{n} d_i \right]$$

Sample fractal codes of an image are shown in Table 4.1.

Table 4.1: An example of fractal codes.

Quadtree parameters  Domain index  Orientation  Brightness  Contrast
111100               1             6            111         0.111
111200               1             7            301         -0.130
111300               1             6            194         0.003
111400               1             5            67          0.165
112100               1             2            324         -0.157
112200               1             2            274         -0.094
112300               1             5            -522        0.900
112400               1             7            216         -0.025
113100               1             5            -47         0.305
113210               7             1            128         0.022
...                  ...           ...          ...         ...

The first column contains the 6-digit quadtree parameters, which give the geometrical positions of the range blocks. The next column is the domain index number, which uniquely locates the position of the domain block using some preset parameters such as the size of the domain blocks, the number of different domain sizes and the overlapping factor. The third column contains the orientation index, which is a number between 0 and 7, and the last two columns are the brightness and contrast factors o and s respectively. In this system, the last four columns (domain index number, rotation (orientation) index, brightness and contrast factor) are used as fractal features for recognition.

4.3.2 Normalizing Fractal Codes

Each fractal feature used in this system is a vector, and each image has four feature vectors of the same size. The size of each vector, however, varies from one image to another; it depends on the number of range blocks, which in turn depends on the partitioning threshold, the size of the image, the image complexity and the minimum size of range and domain blocks. In order to normalize the size of each vector, we use the quadtree partitioning geometry and apply each feature value at its geometrical position (as can be seen in figure 4.4). Because quadtree partitioning can be applied to an image of any arbitrary size, we can resize all feature vectors to the size of the query image. This makes our method robust to size and scale changes. For classification, we used the Peak Signal-to-Noise Ratio (PSNR) between the feature vectors of the query image and the feature vectors of all images in the database as a measure of distance, with a minimum distance classifier.


Figure 4.4: Fractal features of an image (A=Domain index number, B=Rotation (orientation) index, C=Brightness shift and D=Contrast factor) displayed as gray values over the quad-tree partition of the same image.
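A sketch of this normalization in Python follows, assuming each code carries its (x, y, size) quadtree position: every block is painted with its scalar feature value onto a fixed 64 × 64 map, and PSNR between two such maps serves as the matching score. The scaling convention and the helper names are our own assumptions.

import numpy as np

def feature_map(positions, values, src_size, out_size=64):
    # Paint one scalar feature per range block onto a fixed-size map, so
    # images of any size yield comparable out_size x out_size vectors.
    fmap = np.zeros((out_size, out_size))
    scale = out_size / src_size
    for (x, y, s), v in zip(positions, values):
        x0, y0 = int(x * scale), int(y * scale)
        s0 = max(1, round(s * scale))
        fmap[y0:y0 + s0, x0:x0 + s0] = v
    return fmap

def psnr(a, b, peak=255.0):
    # Peak Signal-to-Noise Ratio used as the similarity score.
    mse = np.mean((a - b) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)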

4.3.3 Accuracy Tests

To initially test our system, we used a subset of the MIT face database. This version of the MIT face database consists of two face images from each of 90 subjects, for a total of 180 images, with some variation in illumination, scale and head orientation. Figure 4.5 shows some examples from this face database.

Figure 4.5: Typical images from the MIT face database (ftp:\\whitechapel.media.mit.edu\pub\eigenfaces\pub\images). Two different frontal views of each person are included.

We used each feature separately for classification, first, to obtain some idea of its information content or ability to discriminate between faces. Classification accuracy was plotted as a function of the number of images as the number of images tested grew from 1 to 180, the size of the database. It was found that the orientation parameter yielded an accuracy of about 72% and the domain index an accuracy of about 64% separately on this data. The other two features yielded lower accuracy (as can be seen in figure 4.6). The use of all four features resulted in a total accuracy of close to 88.5% for a small database size. The accuracy tended to level off around 85%. This suggests that in a fractal representation of the face, the information about which parts are self-similar to which other parts, and the orientation differences between these parts, is more useful for recognition than the transformations between


'averaged' pixel gray-level descriptions such as brightness and contrast from domain to range. Lighting variations are also more likely to affect brightness and contrast more significantly than the other two features.

Figure 4.6: Recognition accuracy using rotation, domain index, brightness and contrast features independently, and total accuracy achieved using all features, plotted against the number of images in the database as this is progressively increased.

Figure 4.7 shows the results of using this method to retrieve the 8 images closest to a given image. The first four images are the best hit for each feature and the last four images are the second best hit of that feature. In figures 4.8 and 4.9, the robustness of this method to rotation is demonstrated. In the first test, a 180° rotated version was used as the query image. The method, using only the orientation feature vector, was able to pick the correct identity as the closest match in this test. In the second test, the method, again using only the orientation feature vector, was able to pick the correct identity as the closest match and the second view of the same person as the next closest match. It is interesting to note that the third and fourth best matches are similar to the query image from a human visual point of view, subjectively.


Figure 4.7: A query image and the first 8 closest matches found by the method (the first four images are the best hit for each feature and the last four images are the second best hit of that feature).

The other matches are of the wrong gender but share some similarities in overall appearance and shape. In figure 4.10, the query image is inverted in grayscale. All the fractal features except the brightness feature find the right match, and the rotation feature gets both the first and second matches right. This happens because the change affects range and domain blocks similarly, and the positions of the range and domain blocks are not changed. Thus, if the domain block Dj was the best match for the range block Rk in the original image, it is still the best match even after inverting the pixel values. This shows that the first two fractal features (domain index number and orientation index) are not affected by this change. The effect of this change on the other two fractal features (brightness and contrast factor) can be shown by an example.


Figure 4.8: A rotated query image and the first 8 closest matches. Note that the best match image retrieved using the orientation feature is the correct person.

Example

Suppose n = 3 and R = [1, 2, 3], D = [4, 5, 6] are the range and shrunk domain blocks. The brightness and contrast factors o and s are calculated as explained in equations (4.8) and (4.9):

$$s = \frac{\alpha}{\beta} = \frac{\sum_{i=1}^{n} (d_i - \bar{d})(r_i - \bar{r})}{\sum_{i=1}^{n} (d_i - \bar{d})^2} = \frac{(-1)(-1) + (0)(0) + (1)(1)}{(-1)^2 + (0)^2 + (1)^2} = 1$$

$$o = \bar{r} - s \cdot \bar{d} = 2 - (1)(5) = -3$$

If the image is inverted in grayscale, the range and domain blocks become R = [255, 254, 253] and D = [252, 251, 250], and o and s will be:

$$s = \frac{(1)(1) + (0)(0) + (-1)(-1)}{(1)^2 + (0)^2 + (-1)^2} = 1$$

$$o = \bar{r} - s \cdot \bar{d} = 254 - (1)(251) = 3$$

This example clearly shows that inverting the pixel values in a grayscale image only changes the brightness feature and does not change the other features. A small numeric check of this behaviour is sketched below.
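This snippet recomputes (s, o) with the formulas of equations (4.8) and (4.9) for the original, inverted and brightness-shifted blocks of the example above; the inversion here uses 256 - v so that the numbers match those in the text.

import numpy as np

def fit(d, r):
    # Contrast s and brightness o from equations (4.8)-(4.9).
    s = np.sum((d - d.mean()) * (r - r.mean())) / np.sum((d - d.mean()) ** 2)
    return s, r.mean() - s * d.mean()

r, d = np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0, 6.0])
print(fit(d, r))                  # (1.0, -3.0): the original blocks
print(fit(256.0 - d, 256.0 - r))  # (1.0, 3.0): inversion changes only o
print(fit(d + 10.0, r + 10.0))    # s unchanged; o becomes o + (1 - s) * 10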


Figure 4.9: Another rotated query image and the first 8 closest matches, using only the orientation feature.

It can easily be shown that any shift in brightness which affects all the pixel values equally only changes the brightness feature. Suppose the pixel value v(x, y) of the pixel at position (x, y) is changed to v(x, y) + δ; then the new contrast and brightness features are as follows:

$$s' = \frac{\alpha'}{\beta'} = \frac{\sum_{i=1}^{n} (d_i' - \bar{d}')(r_i' - \bar{r}')}{\sum_{i=1}^{n} (d_i' - \bar{d}')^2} = \frac{\sum_{i=1}^{n} ((d_i + \delta) - \bar{d} - \delta)((r_i + \delta) - \bar{r} - \delta)}{\sum_{i=1}^{n} ((d_i + \delta) - \bar{d} - \delta)^2} = \frac{\sum_{i=1}^{n} (d_i - \bar{d})(r_i - \bar{r})}{\sum_{i=1}^{n} (d_i - \bar{d})^2} = s$$

$$o' = \bar{r}' - s' \cdot \bar{d}' = (\bar{r} + \delta) - s(\bar{d} + \delta) = o + (1 - s)\delta$$

Thus, all fractal features except the brightness feature are robust to a shift in brightness.


Figure 4.10: A query image inverted in grayscale and the first 8 closest matches. Note that the rotation feature gets both the first and second matches right. The brightness feature does not find the right match because the change negates the brightness feature. The other two features find the right match.

4.4 Summary

This chapter described a new method for face recognition using fractal codes directly as features. It was shown that the fractal parameters of an image form a self-similarity-based representation of that image and can be used as features for object recognition. The fractal code of an image contains several different parts; some variations in images affect some of the parameters, while others remain unchanged. This introduces some degree of robustness into the system. It was discussed that the fractal codes of different images differ in the size of their fractal features, and a method for normalizing the features to generate reduced feature vectors of the same size was presented. The details of the experiments are given in Appendix B.

Chapter 5 Fractal Image-set Coding

5.1 Introduction

In this chapter, another method of using fractal codes for facial recognition is presented. It is shown that the fractal code of an image is not unique and that certain parameters can be held constant to capture image information in the other parameters. Fractal codes are calculated keeping geometrical fractal parameters constant for all images. These parameters are calculated from a set of images. The proposed method is faster than traditional fractal coding methods which require time to search and find the best domain for any range block. It also lends itself to preprocessing steps that provide robustness to changes in parts of a face and produces codes that are more directly comparable. Results on the XM2VTS database are used to demonstrate the performance and capabilities of the method.


5.2 Mathematical Bases

A compact representation of the fractal encoding and decoding process can be provided by using the following operators [21]. Let ℑm denote the space of m × m digital grayscale images; that is, each element of ℑm is an m × m matrix of grayscale values. The get-block operator Γ^k_{n,m} : ℑN → ℑk, where k ≤ N, is the operator that extracts the k × k block with lower left corner at (n, m) from the original N × N image, as shown in Figure 5.1.

Figure 5.1: An illustration of Get-Block and Put-Block operators The put-block operator (Γkn,m )∗ : =k → =N inserts a k × k image block into a N × N zero image, at the location with lower left corner at n, m. A N × N image xf ∈ =N can shown as xf =

M X i=1

(xf )i =

M X

(Γrnii ,mi )∗ (Ri )

(5.1)

i=1

that {R1 , . . . , RM } are a collection of range cell images that partition xf . Each


Ri has dimension ri × ri with lower left corner located at (ni, mi) in xf. If the range cells Ri are the result of fractal image encoding of the image xf, then for each range cell Ri there is a domain cell Di and an affine transformation Wi such that

$$R_i = W_i(D_i) = G_i(D_i) + H_i \qquad (5.2)$$

Denote the dimension of Di by di, and the lower left coordinates of Di by (ki, li). Gi : ℑdi → ℑri is the operator that shrinks (assuming di > ri), translates (ki, li) → (ni, mi) and applies a contrast factor si, while Hi is a constant ri × ri matrix that represents the brightness offset. We can write Di = Γ^{di}_{ki,li}(xf). Thus, equation 5.1 can be rewritten as the following approximation:

$$x_f = \sum_{i=1}^{M} (\Gamma^{r_i}_{n_i,m_i})^* \{ G_i(\Gamma^{d_i}_{k_i,l_i}(x_f)) + H_i \}$$

$$x_f = \underbrace{\sum_{i=1}^{M} (\Gamma^{r_i}_{n_i,m_i})^* \{ G_i(\Gamma^{d_i}_{k_i,l_i}(x_f)) \}}_{A(x_f)} + \underbrace{\sum_{i=1}^{M} (\Gamma^{r_i}_{n_i,m_i})^* (H_i)}_{B} \qquad (5.3)$$

Then, if we write the get-block operator Γ^{di}_{ki,li}, the put-block operator (Γ^{ri}_{ni,mi})* and the transformations Gi in their matrix form, we can simplify equation 5.3 as follows:

$$x_f = A \times x_f + B \qquad (5.4)$$

In this equation, A and B are the fractal parameters of the image xf.
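A numpy sketch of the two block operators follows, under the assumed convention that the block corner (n, m) is measured from the lower left of the image (array row 0 being the top of the image, as is usual for numpy arrays); coordinates are assumed to be valid.

import numpy as np

def get_block(image, n, m, k):
    # Gamma^k_{n,m}: extract the k x k block whose lower left corner is
    # at (n, m).
    N = image.shape[0]
    return image[N - m - k:N - m, n:n + k]

def put_block(block, n, m, N):
    # (Gamma^k_{n,m})*: insert a k x k block into an N x N zero image.
    k = block.shape[0]
    out = np.zeros((N, N))
    out[N - m - k:N - m, n:n + k] = block
    return out

# Summing put_block over a partition of range blocks reassembles the
# image, as in equation (5.1).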

5.3 Fractal Image-set Coding

In this section we will use the compact representation (5.4) to show some interesting properties of fractal image encoding and to introduce a method for extracting fractal codes for a set of face images with the same geometrical parameters, which we will call fractal image-set coding. The fundamental principle of fractal image encoding is to represent an image by a set of affine transformations. Images are represented in this framework by viewing them as vectors. This encoding is


not simple, because there is no known algorithm for constructing the transforms with the smallest possible distance between the image to be encoded and the corresponding fixed point of the transformations. Banach's fixed point theorem guarantees that, within a complete metric space, the unique fixed point of a contractive transformation may be recovered by iterated application of the transformation to an arbitrary initial element of that space. Banach's fixed point theorem gives us an idea of how the decoding process works: let T : ℑn → ℑn be a contractive transformation and (ℑn, d) a metric space with metric d; then the sequence {Xk} constructed by Xk+1 = T(Xk) converges, for any arbitrary initial image X0 ∈ ℑn, to the unique fixed point Xf ∈ ℑn of the transformation T. The contraction condition in this theorem is defined as follows: a transformation T : ℑn → ℑn is called contractive if there exists a constant 0 < s < 1 such that

$$\forall x, y \in \Im_n, \quad d(T(x), T(y)) \leq s \cdot d(x, y) \qquad (5.5)$$

This condition is a sufficient condition for the existence of a unique fixed point of a fractal transformation, because if there existed two fixed points xf and x'f for a contractive transformation T, we would have

$$T(x_f) = x_f, \quad T(x_f') = x_f' \quad \text{and} \quad d(T(x_f), T(x_f')) = d(x_f, x_f') \Rightarrow s = 1,$$

so the transformation T could not be contractive, because s = 1. Let us write the fractal transformation in the compact form (5.4):

$$T(x) = A \times x + B$$

A minimal sketch of the decoding iteration implied by this theorem is given below.
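This sketch treats the image as a flattened vector so that the code is an (A, B) pair as in equation (5.4); the fixed iteration count is an illustrative assumption.

import numpy as np

def decode(A, B, x0, iterations=8):
    # Iterate x_{k+1} = A x_k + B from an arbitrary starting vector; for
    # a contractive code the sequence converges to the fixed point x_f.
    x = np.asarray(x0, dtype=float)
    for _ in range(iterations):
        x = A @ x + B
    return x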


So the fractal image coding of an image xf can be defined as finding A and B that satisfy this condition, while A and B define a contractive transformation:

$$x_f = A \times x_f + B$$

This condition shows that the fractal code for an image xf is not unique, because we can have infinitely many pairs A, B that satisfy the condition and have the same fixed point xf, and in many of them A and B define a contractive transformation T(x) with |s| < 1. Figure 5.2 shows an illustration of the function T(x) = A × x + B for a one-dimensional space ℑ.


Figure 5.2: Illustrations of the function T(x) = A × x + B for a one-dimensional space ℑ: a) s > 1, b) s = 1, c, d, e) s < 1.

Different fractal coding algorithms use different A and B for an image, which makes the fractal face recognition process more complex. The aim of fractal image-set coding is to find fractal parameters for several images with the same geometrical part for all of them. In this case, the information in the luminance part of the fractal codes of these images is more comparable. This method is also more efficient and faster than existing methods, because there is no need to search for the best matching domain block for each range block, which is the most computationally expensive part of the traditional fractal coding process. In this system, a sample image is nominated for use in finding the geometrical parameters. This image can be an arbitrary image from the database, an image outside the database, or the average image of all or part of the database.


The fractal image-set coding algorithm can be described as follows:

Step 0 (preprocessing) - For any face image data-set F, use eye locations and histogram equalization to form a normalized face image data-set Fnormal. Any face image in this data-set is a 128 × 128, histogram-equalized, 256-level grayscale image, with the positions of the left and right eyes at (32,32) and (96,32) respectively, as shown in Figure 5.3.


Figure 5.3: An example of preprocessing with an image in the data-set. (A) The original image, (B) grayscale image with orientation normalized, (C) nominated image with face area marked, (D) normalized histogram-equalized face image.

Step 1 - Calculate the fractal codes for the sample image x (which can be the average image of the data-set) using traditional fractal image coding algorithms [35]. These fractal codes contain the luminance information, the geometrical position information for the range blocks {R1, R2, ..., Rn}, the domain block from {D1, D2, ..., Dm} corresponding to each range block, and the geometrical transformations, such as rotation and resizing, that match the domain block to the range block.

Step 2 - For any image xi of the data-set, use the same geometrical parameters (range and domain block positions and geometrical transformations) that were used for coding the sample image x, as shown in Figure 5.4. Let (xRi, yRi), lRi be the geometrical position and the block size of range block Ri, and


(xDj, yDj), lDj the geometrical position and the size of the domain block Dj which is the best matched domain block for Ri.


Figure 5.4: (A) Average image of the data-set, (B) an arbitrary image from the data-set, (C) range blocks for image A, (D) the same range blocks applied to image B.

Step 3 - For any range block Ri of the image being coded, use the domain block at the same position (xDj, yDj) and of the same size lDj, and calculate the luminance parameters to minimize the error e:

$$e = \sum_{i=1}^{n} (s \cdot d_i + o - r_i)^2$$

Let di and ri denote the pixel values of the domain block D and the range block R. The minimum of e occurs when

$$s = \frac{\alpha}{\beta}, \qquad o = \bar{r} - \left( \frac{\alpha}{\beta} \right) \bar{d}$$

where

$$\alpha = \sum_{i=1}^{n} \left( d_i - \bar{d} \right) \left( r_i - \bar{r} \right), \qquad \beta = \sum_{i=1}^{n} \left( d_i - \bar{d} \right)^2, \qquad \bar{d} = \frac{1}{n} \sum_{i=1}^{n} d_i, \qquad \bar{r} = \frac{1}{n} \sum_{i=1}^{n} r_i,$$

as proven in section 4.3.1.

Step 4 - Save the geometrical parameters as well as the luminance parameters as the fractal codes of the image. A sketch of steps 2 and 3 is given below.
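In this Python sketch, the geometry entries (range position, matched domain position, orientation index) come from coding the sample image, so no domain search is needed; the orientation encoding (reflect for indices 4-7, then rotate) and the assumption that domain sizes are integer multiples of range sizes are our own conventions.

import numpy as np

def imageset_luminance(image, geometry):
    # Reuse the sample image's geometry: only the luminance pair (s, o)
    # is fitted per range block (step 3).
    lum = []
    for (rx, ry, rs), (dx, dy, ds), orient in geometry:
        r = image[ry:ry + rs, rx:rx + rs].ravel()
        d = image[dy:dy + ds, dx:dx + ds]
        f = ds // rs
        d = d.reshape(rs, f, rs, f).mean(axis=(1, 3))  # shrink to range size
        if orient >= 4:
            d = np.fliplr(d)                           # assumed encoding
        d = np.rot90(d, orient % 4).ravel()
        beta = np.sum((d - d.mean()) ** 2)
        s = np.sum((d - d.mean()) * (r - r.mean())) / beta if beta else 0.0
        lum.append((s, r.mean() - s * d.mean()))
    return lum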


Figure 5.5: The initial image and the first, third and fifth iterates of the decoding transformations corresponding to image 005 4 1.

In figure 5.5, an example of the decoding result for one of the encoded images of the XM2VTS face database is shown. The PSNR versus the iteration number is drawn in figure 5.6 for this image and three other images of the same database. It is clearly shown that the fixed point of each fractal image code can be reached after only 5 or 6 iterations.

5.4 Similarity Measurements

A similarity measurement τ(x, y) is a method of calculating the similarity between two images. It is normally defined based on a metric distance d(x, y) (a higher


distance between two patterns indicating lower similarity between them). A similarity measurement is generally a number between 0 and 1, where 0 indicates the lowest similarity and 1 the highest similarity between two patterns. In this section, different similarity measurements are described.

5.4.1 Minkowski-Form Distance

The Minkowski-form distance is defined based on the Lp norm as:

$$d_p(x, y) = \sqrt[p]{\sum_{i=0}^{N-1} (x_i - y_i)^p}$$

where x = x0, x1, ..., xN-1 and y = y0, y1, ..., yN-1 are the query and target feature vectors respectively. When p = 1, d1(x, y) is the city block distance or Manhattan distance (L1):

$$d_1(x, y) = \sum_{i=0}^{N-1} |x_i - y_i|$$

When p = 2, d2(x, y) is the Euclidean distance (L2):

$$d_2(x, y) = \sqrt{\sum_{i=0}^{N-1} (x_i - y_i)^2}$$

A minimal sketch of this family of distances, including the limiting L∞ case defined next, follows.
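The sketch below covers the Minkowski family: p = 1 and p = 2 give the Manhattan and Euclidean cases above, and the second function is the limiting L∞ case defined next; the function names are our own.

import numpy as np

def minkowski(x, y, p):
    # d_p(x, y) = (sum_i |x_i - y_i|^p)^(1/p)
    return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

def chebyshev(x, y):
    # Limiting case p -> infinity: max_i |x_i - y_i|
    return np.max(np.abs(x - y))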

When p → ∞, we get the L∞ distance:

$$d_\infty(x, y) = \max_{0 \leq i < N} \{ |x_i - y_i| \}$$

Gi : ℑdi → ℑri is the operator that shrinks (assuming di > ri), translates (ki, li) → (ni, mi) and applies a contrast factor si, while Hi is a constant ri × ri matrix that represents the brightness offset. Because Gi is a combination of a geometrical transformation and a brightness scaling, we can show that the matrix A is a product of a contrast matrix Ψ and another matrix Λ, which we call the distribution matrix:

$$A = \Psi \times \Lambda$$

The values in the contrast matrix Ψ are the contrast factors si (0 ≤ si < 1). The distribution matrix Λ shows the relationship between each pixel of a range and the corresponding pixels of the domain, so in each column of the matrix we have non-zero values only in the rows corresponding to the domain pixels which affect that range pixel. As the fractal code of an image is not unique, there are many different possible values for Ψ and Λ. We can study these general cases:


Case 1 - Each range pixel is related to only one domain pixel; each column of Λ has only one non-zero value λi:

$$A = \begin{pmatrix} s_1 & 0 & \cdots & 0 \\ 0 & s_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & s_n \end{pmatrix} \times \begin{pmatrix} 0 & \lambda_1 & \cdots & 0 \\ \lambda_2 & 0 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}$$

This case can only happen when the size of the range blocks is equal to the size of the domain blocks, and will not be true for most fractal image encoding methods.

Case 2 - Each range pixel is related to all the pixels of the image:

$$A = \begin{pmatrix} s_1 & s_1 & \cdots & s_1 \\ s_2 & s_2 & \cdots & s_2 \\ \vdots & & \ddots & \vdots \\ s_n & s_n & \cdots & s_n \end{pmatrix} \times \begin{pmatrix} \lambda_{11} & \lambda_{12} & \cdots & \lambda_{1n} \\ \lambda_{21} & \lambda_{22} & \cdots & \lambda_{2n} \\ \vdots & & \ddots & \vdots \\ \lambda_{n1} & \lambda_{n2} & \cdots & \lambda_{nn} \end{pmatrix}$$

This case can only happen when the range blocks are derived from the entire image and not only from a portion of the image.

Case 3 - Each range pixel is related to some of the domain pixels of the image. In this case, each column of the distribution matrix has some zero and some non-zero values. The subfractal concept is one special subclass of this case. For subfractals, we choose domain and range blocks from the same portion of the image, so the matrices A and Λ are sparse, but we can re-arrange them into the form of diagonal matrices of subfractals. We will illustrate this idea with an example. Suppose image X is the 3 × 3 grayscale image below, with 3 different subfractal areas a, b and c:

$$X = \begin{pmatrix} a_1 & b_1 & b_2 \\ a_2 & a_3 & a_4 \\ c_1 & c_2 & a_5 \end{pmatrix}$$

So xf can be written as xf = A × xf + B with

$$x_f = \begin{pmatrix} a_1 & b_1 & b_2 & a_2 & a_3 & a_4 & c_1 & c_2 & a_5 \end{pmatrix}^T, \qquad A = \Psi \times \Lambda$$

$$\Psi = \begin{pmatrix}
s_{a11} & 0 & 0 & s_{a12} & s_{a13} & s_{a14} & 0 & 0 & s_{a15} \\
0 & s_{b11} & s_{b12} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & s_{b21} & s_{b22} & 0 & 0 & 0 & 0 & 0 & 0 \\
s_{a21} & 0 & 0 & s_{a22} & s_{a23} & s_{a24} & 0 & 0 & s_{a25} \\
s_{a31} & 0 & 0 & s_{a32} & s_{a33} & s_{a34} & 0 & 0 & s_{a35} \\
s_{a41} & 0 & 0 & s_{a42} & s_{a43} & s_{a44} & 0 & 0 & s_{a45} \\
0 & 0 & 0 & 0 & 0 & 0 & s_{c11} & s_{c12} & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & s_{c21} & s_{c22} & 0 \\
s_{a51} & 0 & 0 & s_{a52} & s_{a53} & s_{a54} & 0 & 0 & s_{a55}
\end{pmatrix}$$

$$\Lambda = \begin{pmatrix}
\lambda_{a11} & 0 & 0 & \lambda_{a12} & \lambda_{a13} & \lambda_{a14} & 0 & 0 & \lambda_{a15} \\
0 & \lambda_{b11} & \lambda_{b12} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \lambda_{b21} & \lambda_{b22} & 0 & 0 & 0 & 0 & 0 & 0 \\
\lambda_{a21} & 0 & 0 & \lambda_{a22} & \lambda_{a23} & \lambda_{a24} & 0 & 0 & \lambda_{a25} \\
\lambda_{a31} & 0 & 0 & \lambda_{a32} & \lambda_{a33} & \lambda_{a34} & 0 & 0 & \lambda_{a35} \\
\lambda_{a41} & 0 & 0 & \lambda_{a42} & \lambda_{a43} & \lambda_{a44} & 0 & 0 & \lambda_{a45} \\
0 & 0 & 0 & 0 & 0 & 0 & \lambda_{c11} & \lambda_{c12} & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \lambda_{c21} & \lambda_{c22} & 0 \\
\lambda_{a51} & 0 & 0 & \lambda_{a52} & \lambda_{a53} & \lambda_{a54} & 0 & 0 & \lambda_{a55}
\end{pmatrix}$$

Now we define a swapping transformation Υ^{i,j}_{row}(X) as a transformation which swaps row(i) and row(j) of a matrix or vector X with each other. In the same way, we define Υ^{i,j}_{col}(X) for swapping col(i) and col(j). Using linear algebra, it can easily be shown that:

$$\Upsilon^{i,j}_{row}(x_f) = \Upsilon^{i,j}_{row}(A \times x_f + B) = \Upsilon^{i,j}_{row}(\Upsilon^{i,j}_{col}(A)) \times \Upsilon^{i,j}_{row}(x_f) + \Upsilon^{i,j}_{row}(B)$$

and

$$\Upsilon^{i,j}_{row}(\Upsilon^{i,j}_{col}(A)) = \Upsilon^{i,j}_{row}(\Upsilon^{i,j}_{col}(\Psi)) \times \Upsilon^{i,j}_{col}(\Upsilon^{i,j}_{row}(\Lambda))$$
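These identities can be checked numerically; the sketch below defines the two operators and verifies the first identity on a random 9 × 9 system (the dimensions and the random test data are illustrative only).

import numpy as np

def swap_rows(X, i, j):
    # Upsilon_row^{i,j}: exchange rows i and j of a matrix or vector.
    X = X.copy()
    X[[i, j]] = X[[j, i]]
    return X

def swap_cols(X, i, j):
    # Upsilon_col^{i,j}: exchange columns i and j of a matrix.
    X = X.copy()
    X[:, [i, j]] = X[:, [j, i]]
    return X

rng = np.random.default_rng(0)
A, B, x = rng.random((9, 9)), rng.random(9), rng.random(9)
lhs = swap_rows(A @ x + B, 2, 5)
rhs = swap_rows(swap_cols(A, 2, 5), 2, 5) @ swap_rows(x, 2, 5) + swap_rows(B, 2, 5)
assert np.allclose(lhs, rhs)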

So the form of Ψ and Λ after the series of transformations $\hat{x}_f = \Upsilon^{3,2}_{row}(\Upsilon^{1,2}_{row}(\Upsilon^{7,8}_{row}(\Upsilon^{9,8}_{row}(x_f))))$ will be:

$$\hat{\Psi} = \begin{pmatrix}
s_{b11} & s_{b12} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
s_{b21} & s_{b22} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & s_{a11} & s_{a12} & s_{a13} & s_{a14} & s_{a15} & 0 & 0 \\
0 & 0 & s_{a21} & s_{a22} & s_{a23} & s_{a24} & s_{a25} & 0 & 0 \\
0 & 0 & s_{a31} & s_{a32} & s_{a33} & s_{a34} & s_{a35} & 0 & 0 \\
0 & 0 & s_{a41} & s_{a42} & s_{a43} & s_{a44} & s_{a45} & 0 & 0 \\
0 & 0 & s_{a51} & s_{a52} & s_{a53} & s_{a54} & s_{a55} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & s_{c11} & s_{c12} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & s_{c21} & s_{c22}
\end{pmatrix}$$

$$\hat{\Lambda} = \begin{pmatrix}
\lambda_{b11} & \lambda_{b12} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\lambda_{b21} & \lambda_{b22} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \lambda_{a11} & \lambda_{a12} & \lambda_{a13} & \lambda_{a14} & \lambda_{a15} & 0 & 0 \\
0 & 0 & \lambda_{a21} & \lambda_{a22} & \lambda_{a23} & \lambda_{a24} & \lambda_{a25} & 0 & 0 \\
0 & 0 & \lambda_{a31} & \lambda_{a32} & \lambda_{a33} & \lambda_{a34} & \lambda_{a35} & 0 & 0 \\
0 & 0 & \lambda_{a41} & \lambda_{a42} & \lambda_{a43} & \lambda_{a44} & \lambda_{a45} & 0 & 0 \\
0 & 0 & \lambda_{a51} & \lambda_{a52} & \lambda_{a53} & \lambda_{a54} & \lambda_{a55} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \lambda_{c11} & \lambda_{c12} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \lambda_{c21} & \lambda_{c22}
\end{pmatrix}$$

                      

ˆ and Λ ˆ can be divided to independent matrixes Ψa , Ψb , Ψc and Matrixes Ψ Λa , Λb , Λc . It is because we used subfractals and in each subfractal, pixels are only related  X  a  xˆf =  Xb  Xc

to other pixels of its   Ψ 0 0   a    =  0 Ψb 0   0 0 Ψc

and finally ˆa Xa = Ψ a × Λ a × X a + B

own area. Thus       Λa 0 0 Xa             × ×   0 Λ b 0   Xb  +        0 0 Λc Xc

ˆa B



  ˆ Bb   ˆ Bc

6.5 How to Use Subfractals for Face Recognition

77

ˆb Xb = Ψ b × Λ b × X b + B ˆc Xc = Ψ c × Λ c × X c + B These formulas clearly show that the fractal code of an image can be divided to several independent subfractal codes. Each pixel in a subfractal area is only related to other pixels of the same area.

6.5 How to Use Subfractals for Face Recognition

Hancock's [41] psychophysical observations show that human face recognition is most likely based on low-level image properties rather than on an abstract representation of the face. Certain image transformations, such as intensity negation, strange viewpoint changes, and changes in lighting direction, can severely disrupt human face recognition. Fractal codes show some degree of robustness to some of these changes, such as intensity negation. However, in traditional fractal image coding systems, the fractal code of one part of an image is not independent of changes in other parts of the same image. Subfractals, unlike traditional fractal codes, do not have this problem, because the subfractal codes of an image are defined to be independent. This makes subfractals more suitable for applications such as image and face recognition. To determine which parts of the face should be subfractals, we devised a test. In this test, 10 pairs of face images were shown to 10 volunteers (5 males and 5 females), who were asked to verify whether the two images in each pair belong to the same person. At the same time, the gaze data of these volunteers was collected using an Eye-Gaze Tracking System (Figure 6.4; see Appendix A for more details). Using this system, we can determine where on the computer monitor, and for how long, the user is looking. This information is used to show which parts of the face were compared to verify the person.


Figure 6.4: A view of the eye-gaze tracking system


Figures 6.5 to 6.12 show 4 pairs of face images and the results of the eye-gaze tracking system for 10 viewers. The results are shown as circles on the face images; the center of each circle shows the gaze point, and the radius of each circle shows the duration of gaze at that point. The results in figures 6.6 and 6.8 show that the eyes, nose and lips are the most important areas for viewers when verifying identity. In figure 6.9, the pair of face images is inverted in grayscale (negative images). About half of the viewers could not verify these images correctly, as shown in figure 6.10; however, the most important areas for viewers were again the nose, eyes and lips. Figure 6.12 shows how viewers compare a face image and a semi-drawing image. These results, together with the results from the other 6 image pairs, show that the most important areas in the face verification task for humans are the eyes, nose and lips. Negative images are more difficult for humans to verify than normal images, while, as shown in the example in section 4.3.3 (page 40), a fractal recognition system can deal with this difficulty very well. Based on these results, the suitable subfractal areas for a face must contain the left and right eyes, the nose and the lips. To generate a complete fractal code of an image, the other parts of the face are also coded.

6.6 Summary

This chapter introduced the concept and underlying mathematics of a new fractal code for an image, called subfractal coding. With this method, the fractal code of an image can be divided into several subfractals. Each subfractal is defined to be independent of the others; thus, changes in one part of the image do not affect the subfractal codes of other parts of the same image.


Figure 6.5: A pair of face images shown to volunteers to verify the identity.

Figure 6.6: An illustration showing the results of the eye-gaze tracking system for 10 viewers. Circles (the centers) show the gaze points and the radius of each circle shows the duration of gaze on that point.


Figure 6.7: Another pair of face images shown to volunteers to verify the identity.

Figure 6.8: The results of the eye-gaze tracking system show eyes, nose and lips area are the most important area for viewers to verify the identity.


Figure 6.9: Another pair of face images. The face images are inverted in grayscale (negative image).

Figure 6.10: The results of the eye-gaze tracking system for negative images.


Figure 6.11: Yet another pair of face images. Note that the left face image is inverted in grayscale and the right face image is semi-drawing.

Figure 6.12: The results of the eye-gaze tracking system for the negative and semi-drawing images.

Chapter 7 Future Work and Conclusions

This thesis set out to address four research questions:

1- Is it possible to use the fractal codes of grayscale images as features for recognition? It has been shown throughout this thesis that fractal codes have a great capability to be used for recognition tasks such as face recognition. As described in Chapter 4, the fractal parameters of an image form a self-similarity-based representation of that image and can be used as features for face recognition. The fractal codes of different images differ in the size of their fractal features; the system presented for the use of fractal codes as features therefore contains a method for normalizing the features to generate reduced feature vectors of the same size. As the fractal code of an image contains several different parts, some variations in images, such as a shift in brightness, affect some of the parameters while others remain unchanged. This results in some degree of robustness in the system.

2- What is the mathematical basis for using fractals for recognition? The extraction of the fractal code from an image involves the partitioning of the image into a set of range blocks. There is also a corresponding set of

domain blocks to choose from. For each range block, a suitable domain block is found using some prescribed criterion. The mapping between the domain and range blocks, which is a contractive similarity transformation, forms the fractal code for this range block. The fractal code for the image is the collection of the fractal codes of all range blocks. The fractal code of an image is not unique. An image xf can be expressed as the attractor of a contractive transformation T of the form T(xf) = A × xf + B = xf.

3- Is it possible to design a more suitable fractal coding system for recognition? The fractal code of an image is a set of transformations, each of which has two parts: a geometrical part and a luminance part. Fractal image-set coding keeps the geometrical parameters the same for all images in the database. Differences between images are captured in the non-geometrical or luminance parameters, which are faster to compute. For recognition purposes, the fractal code of a query image is applied to all the images in the training set for one iteration. The distance between an image and the result after one iteration is used to define a similarity measure between this image and the query image. Experiments show that this system can achieve a 95% accuracy rate on a subset of the XM2VTS database, with only 2 of 39 cases failing.

4- Are the different parts of a fractal code independent? And if not, how can we define and extract independent fractal codes of different parts of an image? Experience with face images shows that changes in one part of an image may affect the fractal codes of that part and also of other parts of the image. Chapter 6 defines the subfractal, which is a new type of fractal code for an image. Each subfractal is defined to be independent of the others. An algorithm is presented for the extraction of subfractal codes.

7.1 Future Work

7.1.1 Improving the Robustness

Faces can vary in terms of size, location in an image, and orientation about the z-axis. Such variation can be removed by normalising the face, after which descriptions for recognition can be obtained. The eyes are commonly detected for normalisation, and some effective eye detectors have been produced, but eye detection cannot always be successfully applied to faces; glasses or other obstacles can hide the eyes. Some methods use a whole-face approach to face normalisation, which is more robust. Facial expression is another kind of variation that cannot be removed by normalisation. Expression is commonly decomposed into six main emotions: happiness, sadness, surprise, disgust, anger and fear. Several algorithms have been proposed for facial expression detection [11], [32], [53], [62], [64]. Some of these techniques are related to the extraction of the motion of the nose, mouth, eyebrows and eyes with tracking algorithms, optical flow, motion energy, network criteria, 3D geometric modelling with a range finder, and color image analysis techniques. Most of these techniques are used to recognize facial expressions, but only little effort has gone into the recognition of faces with varying facial expressions. A combination of the fractal face recognition system and a PCA-based feature reduction system (as shown in Figure 7.1) can be used to show how this method is robust to some facial variations, such as human facial expression. In this application, the domain index numbers for the range blocks are used as a feature vector. To normalise the size of each vector, the quadtree partitioning geometrical parameters, which are part of the fractal codes of each image, are used. Because quadtree partitioning can be applied to an image of arbitrary size, the feature vector can be resized to the size of the query image. In a typical image of size 128x128, the


quadtree decomposition produces about 400 or more range blocks. The feature vectors are of this length and, after normalisation, will be uniformly of size 64x64, because the smallest range size used is 4x4. This is a large vector and must be reduced to suit classifiers. The optimal linear method (in the least mean squared error sense) for reducing redundancy in a data set is the Karhunen-Loeve (KL) transform, or eigenvector expansion via Principal Components Analysis (PCA). The basic idea behind the KL transform is to transform possibly correlated variables in a data set into uncorrelated variables. The transformed variables are ordered so that the first describes most of the variation in the original data set. The second tries to describe the remaining part of the variation, under the constraint that it should be uncorrelated with the first variable. This continues until all the variation is described by the new transformed variables, which are called principal components. Mathematically, PCA can be described as follows. Suppose X is a vector, and let P be the transformation matrix required such that Y has a diagonal covariance matrix:

$$Y = P \times X$$

It has been shown that the rows of P are the eigenvectors of the covariance matrix $E[(X - \bar{X})(X - \bar{X})^T]$. The eigenvectors are arranged in descending order of the corresponding eigenvalues. The elements of Y are called the principal components of X. The expectation operation is performed as an average over all feature vectors from the training set. In this approach, PCA is performed on fractal feature vectors, not pixel values directly. The aim of PCA is to reduce the dimension of the working space. The maximum number of principal components is the number of variables in the original space. However, in order to reduce the dimension, some principal components should be omitted. In order to minimize the error, the eigenvalues are sorted in decreasing order and the last eigenvalues (and their eigenvectors) may be dropped. We use this method to reduce the dimension of the fractal features to the number of individuals


in the training database, for example from 16384 to 100. Independent component analysis (ICA) could also be used for feature reduction, but in this thesis we have restricted our attention to PCA. The eigenface approach uses normalised face images as vectors of pixel values, which are transformed using PCA into feature vectors. The difference in our approach is the use of fractal code vectors instead of pixel values as input to the PCA. The results in figure 7.2 seem to indicate that our method should provide better robustness to expression variations. We also do not need to normalise the face images for small changes in size, position and rotation. We used the reduced fractal features as a vector; all vectors are of equal size. For classification, we used the mean squared error between the feature vectors of the query image and the feature vectors of all images in the database as a measure of distance, with a minimum distance classifier. A sketch of this reduction step is given below.
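This sketch uses the SVD of the centred training data to obtain the KL basis over fractal feature vectors; the function names and the component count are illustrative assumptions.

import numpy as np

def pca_fit(features, n_components=100):
    # Rows of `features` are training feature vectors (e.g. 64*64 long).
    # The SVD of the centred data yields the covariance eigenvectors,
    # already sorted by decreasing eigenvalue.
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]   # rows are the principal directions

def pca_project(mean, P, vector):
    # Y = P (X - mean): the reduced feature vector (principal components).
    return P @ (vector - mean)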

7.1.2 Face Location and Detection

When a face image is captured using a video camera, the face may be located anywhere in the video frame (or still image). Because most face recognition methods rely on some normalisation of size and position, it is important to locate the face and find either its contour or the location of some reference points, such as the eyes or the mouth. A segmentation technique, which distinguishes face pixels or blocks from background ones, can be used for this task. However, it is a difficult task and no good segmentation algorithms are known, especially when the background is not uniform in grayscale or texture. The possibility of using the subfractal idea to segment an image and locate the face can be studied.


Figure 7.1: Block diagram of the fractal face recognition system with PCA based feature reduction.

Figure 7.2: Matrix showing differences between the faces shown on the two axes. Darker points indicate larger differences. Entries below the diagonal are pixel-value differences; entries above the diagonal are fractal-feature differences.

7.1.3 Face Recognition Using Subfractals of Eyes and Mouth Area

Face recognition accuracy can be improved if global features are augmented by features depending only on specific parts such as the eyes or mouth. This can only be done if these parts can be segmented out from the rest of the face, which requires properties of these parts that are distinct from those of the rest of the face. It is our contention that there is self-similarity within these parts, and that range blocks from the eyes will be transformed versions of domain blocks from within the eye, provided the search for the best-suited domain is constrained to weight domains inversely with their distance from the range. Under such a constraint, the eye region might turn out to be a subfractal within the face. We intend to test and further develop these ideas. Other future directions include using subfractals for video coding and neural-network-based subfractals.

Appendix A Quick Glance Eye-Gaze Tracking System

An eye-tracker system is designed to determine the gaze point and the duration of gaze of the user on the computer monitor. This appendix introduces EyeTech Digital Systems' product Quick Glance, an eye-tracking system that was used for the tests described in section 6.5. The Quick Glance system consists of two infrared LED light sources, a camera, a power supply and cabling, a PCI bus board and software. The camera and light sources are mounted on the computer's monitor. The video capture card (PCI bus board) is installed into an available computer slot and connected to the camera with a cable. The software is designed to help users set up the system, calibrate it and use it for their purpose. The system examines the pupil center and corneal reflections from the user's eye, which is illuminated by two low-power infrared LEDs mounted on the computer's monitor, to measure the user's gaze point. The reflected light is focused onto a camera, also mounted on the computer's monitor. The image of the eye upon which the camera is focused is captured at a fast, user-determined

rate by the image capturing hardware provided with the system. By analyzing the position of the light reflections and the center of the pupil contained in the image, the software determines the gaze point. Gaze point duration is also derived. With that information, a gaze tracking program can illustrate the user's gaze path by moving the location of the cursor according to the gaze point and its duration.

Appendix B Experimental Details

This dissertation contains several experiments. The details of the experiments, as well as the results and comparisons between the results, are described in this appendix.

B.1 Fractal Codes as Features

Method: direct use of fractal codes as features (Chapter 4).

Coding method: conventional fractal coding.

Domain blocks: overlapping square blocks of two different sizes (8×8 and 16×16).

Range blocks: non-overlapping square blocks, generated by quadtree partitioning (Figure 4.1).

Geometrical aspects of transformation: contractive size matching and one of eight orientations (Figure 4.2).

Number of features: 4 vectors (domain index number, orientation, brightness shift and contrast factor).


Normalization: each of the fractal features is normalized to a specific size (64×64) using the quadtree partitioning geometry (Figure 4.4).

Database: a subset of the MIT face database containing 2 face images for each of 90 subjects, with some variation in illumination, scale and head orientation (Figure 4.5).

Classification: the Peak Signal-to-Noise Ratio (PSNR) between the feature vectors of the query image and the feature vectors of all images in the database is used as a measure of distance. A minimum distance classifier is then employed to determine the recognition accuracy.

Results: the classification accuracy for each feature is calculated separately. The orientation parameter, with 72%, and the domain index, with 64%, show higher accuracy than the two other features. The accuracy can be increased to 88.5% by using the best of the four features (Figure 4.6).

B.2 Fractal Image-Set Coding

Method: fractal image-set coding (Chapter 5).

Coding method: calculating the geometrical fractal features only once, from a mean image or even a single chosen image.

Domain blocks: overlapping square blocks of two different sizes (8×8 and 16×16).

Range blocks: non-overlapping square blocks, generated by quadtree partitioning.

Geometrical aspects of transformation: contractive size matching and one of eight orientations.


Number of features: 1 vector (luminance parameters).

Normalization: every image in the data-set is normalized using histogram equalization and eye locations to produce 128x128 face images with the left and right eyes at (32,32) and (96,32) respectively (Figure 5.3).

Databases and results: this method has been tested on two databases, a subset of the MIT face database and a subset of the XM2VTS face database. The subset of the MIT face database contains 90 persons and 2 shots per person; one of the shots was used as test data while the other was used as training data. The ROC plot in Figure B.1 shows the results of this experiment.


Figure B.1: The results of fractal image-set coding for the subset of the MIT face database.

The recognition accuracy rate of this system is 83.33%, which is higher than the result for any of the 4 fractal features tested in the first experiment. The subset of the XM2VTS database contains 39 people and 4 images per person (the first


shot of each of 4 sessions). The image data-set is divided into 3 sets: a training set, an evaluation set and a test set. Three subjects (subject numbers 000, 002 and 007) are used as imposters in the evaluation set, 8 subjects (subject numbers 001, 008, 010, 011, 023, 028, 031, 039) are used as imposters in the test set, and the other subjects are used as clients. The first image of each client subject is used as the test image, while the other 3 images are used for training. Figure B.2 shows the results of this experiment for the evaluation data in ROC plot format. Based on this plot, the threshold is set to obtain certain false acceptance (FAR) and false rejection (FRR) values. The same threshold is then used on the test set.


Figure B.2: The results of fractal image-set coding for the evaluation subset of the XM2VTS database. Arrows show the position of the threshold for FRR=0, FRR=FAR and FAR=0.

To compare these results with the results of other researchers who also used the XM2VTS database, the test set is evaluated at three different thresholds T:

$$T_{FAR=0} = \arg\min_T (FRR \mid FAR = 0)$$
$$T_{FRR=FAR} = (T \mid FRR = FAR)$$
$$T_{FRR=0} = \arg\min_T (FAR \mid FRR = 0)$$
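A small sketch of this threshold selection follows, assuming FAR and FRR have been sampled on a common grid of candidate thresholds ts (all 1-D numpy arrays) and that at least one threshold achieves each zero-error condition.

import numpy as np

def pick_thresholds(ts, far, frr):
    # Minimum FRR subject to FAR = 0, the FRR = FAR (equal error) point,
    # and minimum FAR subject to FRR = 0, as used in Table B.1.
    t_far0 = ts[np.argmin(np.where(far == 0, frr, np.inf))]
    t_eer = ts[np.argmin(np.abs(far - frr))]
    t_frr0 = ts[np.argmin(np.where(frr == 0, far, np.inf))]
    return t_far0, t_eer, t_frr0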

Table B.1: Error rates obtained using fractal image-set coding.

Error  FRR=0   FAR=FRR  FAR=0
FAR    53.33%  16.4%    0.3%
FRR    0.0%    9.3%     51.85%

Table B.2: Error rates reported by T. Tan using fractal neighbor distances.

Error  FRR=0  FAR=FRR  FAR=0
FAR    94.0%  13.6%    0.0%
FRR    0.0%   12.3%    81.3%

Figure B.3 shows these results in the form of an ROC plot. The error rates are summarized in Table B.1. Using this information, we can compare our results with other results in the literature. For example, the results of face recognition using fractal neighbor distances [91] are shown in Table B.2, which indicates that our results have lower errors in most cases and slightly higher errors in the others.


Figure B.3: The results of fractal image-set coding for the test subset of the XM2VTS database. Arrows show the position of the threshold for FRR=0, FRR=FAR and FAR=0 in the evaluation data set.

Bibliography

[1] S. Akamatsu, T. Sasaki, H. Fukumachi, and Y. Suenaga, "A robust face identification scheme - KL expansion of an invariant feature space," Proceedings of SPIE, vol. 1607: Intelligent Robots and Computer Vision X: Algorithms and Techniques, pp. 71–84, 1991.

[2] A. Alattar and S. Rajala, "Facial features localization in frontal view head and shoulders images," IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 6, pp. 3557–3560, 1999.

[3] M. Barnsley, Fractals Everywhere. Academic Press, San Diego, 1988.

[4] M. Barnsley and L. Hurd, Fractal Image Compression. AK Peters, Wellesley, 1993.

[5] M. S. Bartlett and T. J. Sejnowski, "Viewpoint invariant face recognition using independent component analysis and attractor networks," in Advances in Neural Information Processing Systems (M. Mozer, M. Jordan, and T. Petsche, eds.), pp. 817–823, Cambridge, MA: MIT Press, 1997.

[6] M. S. Bartlett and T. J. Sejnowski, "Independent components of face images: A representation for face recognition," in Proceedings of the 4th Annual Joint Symposium on Neural Computation, Pasadena, CA, May 1997.

[7] M. S. Bartlett, H. M. Lades, and T. J. Sejnowski, "Independent component representations for face recognition," in Proceedings of the SPIE Conference on Human Vision and Electronic Imaging III, vol. 3299, pp. 528–539, 1998.


[8] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, "Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection," Proceedings of the European Conference on Computer Vision, ECCV'96, pp. 45–58, 1996.

[9] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, "Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711–720, 1997.

[10] T. D. Bie, N. Cristianini, and R. Rosipal, "Eigenproblems in pattern recognition," Handbook of Computational Geometry for Pattern Recognition, Computer Vision, Neurocomputing and Robotics (E. Bayro-Corrochano, ed.), Springer-Verlag, April 2004.

[11] M. J. Black and Y. Yacoob, "Tracking and recognizing rigid and non-rigid facial motion using local parametric models of image motion," Proceedings of the IEEE International Conference on Computer Vision, ICCV95, Boston, pp. 374–381, 1995.

[12] D. Blackburn, M. Bone, and P. J. Phillips, "Facial recognition vendor test 2000," Evaluation report, National Institute of Standards and Technology, 2000.

[13] R. D. Boss and E. W. Jacobs, "Archetype classification in an iterated transformation image compression algorithm," in Fractal Image Compression: Theory and Application (Y. Fisher, ed.), pp. 79–90, Springer-Verlag, New York, 1994.

[14] Boyer and Merzbach, A History of Mathematics. New York: John Wiley, 2nd ed., 1989.

[15] R. Brunelli and D. Falavigna, "Person identification using multiple cues," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, pp. 955–966, 1995.


[16] R. Brunelli and T. Poggio, "Face recognition through geometrical features," Proceedings of the European Conference on Computer Vision, ECCV92, Santa Margherita Ligure, pp. 792–800, 1992.

[17] R. Brunelli and T. Poggio, "Face recognition: features versus templates," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, 1993.

[18] P. Burt, "Smart sensing within a pyramid vision machine," Proceedings of the IEEE, vol. 76, pp. 1006–1015, 1988.

[19] L. Chen, H. Liao, J. Lin, and C. Han, "Why recognition in a statistics-based face recognition system should be based on the pure face portion: a probabilistic decision-based proof," Pattern Recognition, vol. 34, no. 5, pp. 1393–1403, 2001.

[20] G. Chow and X. Li, "Towards a system for automatic facial feature detection," Pattern Recognition, vol. 26, no. 12, pp. 1739–1755, 1993.

[21] G. M. Davis, "A wavelet-based analysis of fractal image compression," IEEE Transactions on Image Processing, pp. 100–112, 1997.

[22] O. Deniz, M. Castrillon, and M. Hernandez, "Face recognition using independent component analysis and support vector machines," 3rd International Conference on Audio- and Video-based Biometric Person Authentication, Halmstad, Sweden, June 6-8, vol. 2091, pp. 59–64, 2001.

[23] H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, "Face recognition using fractal codes," Proceedings of WoSPA 2000, Brisbane, Australia, 2000.

[24] H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, "Face recognition using fractal codes," Proceedings of the International Conference on Image Processing, vol. 3, pp. 58–61, 2001.

[25] H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, "Robustness to expression variations in fractal-based face recognition," Sixth International Symposium on Signal Processing and its Applications, vol. 1, pp. 359–362, 2001.

[26] H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, "Mathematical basis for use of fractal codes as features," Image and Vision Computing '02 New Zealand, 2002.

[27] H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, An Application of Fractal Image-set Coding in Facial Recognition, vol. 3072 of Lecture Notes in Computer Science, Biometric Authentication, pp. 178–186, Springer-Verlag, July 2004.

[28] H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, "Facial image retrieval using fractal image-set coding," Feb. 2004.

[29] H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, "Fractal image-set encoding for face recognition," in Proceedings of the International Conference on Computational Intelligence for Modelling, Control and Automation, Gold Coast, Australia, pp. 664–672, July 2004.

[30] H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, "Subfractals: A new concept for fractal image coding and recognition," submitted to the Journal of Complexity International, 2004.

[31] R. Epstein, P. Hallinan, and A. Yuille, "5+/- eigenimages suffices: An empirical investigation of low-dimensional lighting models," Proceedings of the Workshop on Physics-based Modeling in Computer Vision, pp. 108–116, 1995.

[32] I. A. Essa and A. P. Pentland, "Facial expression recognition using a dynamic model and motion energy," Proceedings of the IEEE International Conference on Computer Vision, ICCV95, Boston, pp. 360–367, 1995.

[33] K. Etemad and R. Chellappa, "Face recognition using discriminant eigenvectors," Proceedings of the International Conference on Acoustics, Speech and Signal Processing, pp. 2148–2151, 1996.


[34] R. A. Fisher, "The use of multiple measurements in taxonomic problems," Annals of Eugenics, vol. 7, pp. 179–188, 1936.

[35] Y. Fisher, ed., Fractal Image Compression: Theory and Application. Springer-Verlag, New York, NY, USA, 1995.

[36] Y. Fisher, ed., Fractal Image Encoding and Analysis. NATO ASI Series, Springer-Verlag, Berlin Heidelberg, 1998.

[37] K. Fukunaga, Introduction to Statistical Pattern Recognition. Academic Press, 2nd ed., 1990.

[38] M. Gharavi-Alkhansari and T. S. Huang, "A generalized method for image coding using fractal-based techniques," Journal of Visual Communication and Image Representation, vol. 8, no. 2, pp. 208–225, 1997.

[39] A. J. Goldstein, L. Harmon, and A. Lesk, "Identification of human faces," Proceedings of the IEEE, pp. 748–760, 1971.

[40] R. Gross, J. Shi, and J. Cohn, "The current state of the art in face recognition," Technical Report, Robotics Institute, Carnegie Mellon University, Pittsburgh, USA, 2004.

[41] P. Hancock, V. Bruce, and M. Burton, "A comparison of two computer-based face identification systems with human perceptions of faces," Vision Research, vol. 38, 1998.

[42] L. D. Harmon, M. K. Khan, R. Lasch, and P. F. Ramig, "Machine identification of human faces," Pattern Recognition, pp. 97–110, 1981.

[43] J. Hutchinson, "Fractals and self similarity," Indiana University Mathematics Journal, vol. 30, no. 5, pp. 713–747, 1981.

[44] A. Hyvarinen and E. Oja, "Independent component analysis: Algorithms and applications," Neural Networks, vol. 13, no. 4-5, pp. 411–430, 2000.


[45] A. E. Jacquin, A Fractal Theory of Iterated Markov Operators with Applications to Digital Image Coding. PhD thesis, Georgia Institute of Technology, 1989.

[46] A. E. Jacquin, "Fractal image coding: A review," Proceedings of the IEEE, vol. 81, no. 10, pp. 1451–1465, 1993.

[47] T. Kanade, Picture Processing by Computer Complex and Recognition of Human Faces. PhD thesis, Kyoto University, 1973.

[48] T. Kanade, J. Cohn, and Y. Tian, "Comprehensive database for facial expression analysis," Proceedings of the 4th IEEE International Conference on Automatic Face and Gesture Recognition (FG'00), pp. 46–53, March 2000.

[49] M. D. Kelly, "Visual identification of people by computer," Technical Report AI-130, Stanford AI Project, Stanford, CA, 1970.

[50] M. Kirby and L. Sirovich, "Application of the Karhunen-Loeve procedure for the characterization of human faces," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, pp. 103–108, 1990.

[51] A. Kouzani, F. He, and K. Sammut, "Face image matching using fractal dimension," IEEE International Conference on Image Processing, pp. 642–646, 1999.

[52] A. Z. Kouzani, F. He, and K. Sammut, "Fractal face representation and recognition," IEEE International Conference on Systems, Man and Cybernetics, vol. 2, pp. 1609–1613, 1997.

[53] A. Lanitis, C. J. Taylor, and T. Cootes, "A unified approach to coding and interpreting face images," Proceedings of the IEEE International Conference on Computer Vision, ICCV95, Boston, pp. 368–373, 1995.

[54] J. Lu, K. Plataniotis, and A. Venetsanopoulos, "Face recognition using LDA-based algorithms," IEEE Transactions on Neural Networks, vol. 14, no. 1, pp. 195–200, 2003.


[55] S. G. Mallat and Z. Zhang, "Matching pursuits with time-frequency dictionaries," IEEE Transactions on Signal Processing, vol. 41, pp. 3397–3415, 1993.

[56] B. Mandelbrot, Les Objets Fractals: Forme, Hasard et Dimension. Paris: Flammarion, 1975.

[57] B. Mandelbrot, Fractals: Form, Chance and Dimension. W. H. Freeman and Company, 1977.

[58] B. Mandelbrot, The Fractal Geometry of Nature. W. H. Freeman and Company, 1982.

[59] B. Manjunath, R. Chellappa, and C. von der Malsburg, "A feature based approach to face recognition," Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 373–378, 1992.

[60] J. Manjunath, N. Orlans, and A. Piszcz, "Effects of eye position on eigenface-based face recognition scoring," Technical Report, The MITRE Corporation, 7515 Colshire Drive, McLean, VA 22102, USA, 2003.

[61] A. M. Martinez and R. Benavente, "The AR face database," CVC Technical Report 24, 1998.

[62] K. Mase, "Recognition of facial expression from optical flow," IEICE Transactions, vol. E74, no. 10, pp. 3474–3483, 1991.

[63] J. Matas, M. Hamouz, K. Jonsson, J. Kittler, Y. Li, C. Kotropoulos, A. Tefas, I. Pitas, T. Tan, H. Yan, F. Smeraldi, J. Bigun, N. Capdevielle, W. Gerstner, S. Ben-Yacoub, and Y. Abdeljaoued, "Comparison of face verification results on the XM2VTS database," in Proceedings of the 15th ICPR (A. Sanfeliu, J. J. Villanueva, M. Vanrell, R. Alquezar, J. Crowley, and Y. Shirai, eds.), vol. 4, Los Alamitos, USA, pp. 858–863, IEEE Computer Society Press, 2000.


[64] K. Matsuno, C. Lee, S. Kimura, and S. Tsuji, "Automatic recognition of human facial expressions," Proceedings of the IEEE International Conference on Computer Vision, ICCV95, Boston, pp. 352–359, 1995.

[65] K. Messer, J. Matas, J. Kittler, J. Luettin, and G. Maitre, "XM2VTSDB: The extended M2VTS database," March 1999.

[66] M. Michaelis, R. Herpers, L. Witta, and G. Sommer, "Hierarchical filtering scheme for the detection of facial keypoints," International Conference on Acoustics, Speech, and Signal Processing, vol. 4, pp. 2541–2544, 1997.

[67] B. Moghaddam and A. Pentland, "Probabilistic visual learning for object representation," The 5th International Conference on Computer Vision, Cambridge, MA, pp. 786–793, 1995.

[68] B. Moghaddam and A. Pentland, "Probabilistic visual learning for object representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, pp. 676–710, 1997.

[69] D. M. Monro and F. Dudbridge, "Fractal block coding of images," Electronics Letters, vol. 28, no. 11, pp. 1053–1055, 1992.

[70] D. M. Monro and F. Dudbridge, "Rendering algorithms for deterministic fractals," IEEE Computer Graphics and Applications, vol. 15, no. 1, pp. 32–41, 1995.

[71] A. Nefian, A Hidden Markov Model-Based Approach for Face Detection and Recognition. PhD thesis, Georgia Institute of Technology, Atlanta, GA, 1999.

[72] G. Neil and K. M. Curtis, "Scale and rotationally invariant object recognition using fractal transformations," Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP96, vol. 6, pp. 3458–3461, 1996.

[73] G. Neil and K. M. Curtis, "Shape recognition using fractal geometry," Pattern Recognition, vol. 30, no. 12, pp. 1957–1969, 1997.


[74] A. Pentland and T. Choudhury, "Face recognition for smart environments," IEEE Computer, vol. 33, no. 2, pp. 50–55, 2000.

[75] A. Pentland, R. Picard, and S. Sclaroff, "Photobook: content-based manipulation of image databases," International Journal of Computer Vision, vol. 18, pp. 233–254, 1996.

[76] P. Phillips, "Matching pursuit filters design," 12th International Conference on Pattern Recognition, pp. 57–61, 1994.

[77] P. Phillips, "Matching pursuit filters design for face identification," in SPIE, vol. 2277, pp. 2–9, 1994.

[78] P. Phillips, "Matching pursuit filters applied to face identification," IEEE Transactions on Image Processing, vol. 7, no. 8, pp. 1150–1164, 1998.

[79] P. J. Phillips, P. Grother, R. Micheals, D. M. Blackburn, E. Tabassi, and J. M. Bone, "Face recognition vendor test 2002: Overview and summary," National Institute of Standards and Technology, 2003.

[80] P. J. Phillips, A. Martin, C. L. Wilson, and M. Przybocki, "An introduction to evaluating biometric systems," IEEE Computer, vol. 33, no. 2, pp. 56–63, 2000.

[81] P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss, "The FERET evaluation methodology for face-recognition algorithms," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 10, pp. 1090–1104, 2000.

[82] P. J. Phillips, H. Wechsler, J. Huang, and P. J. Rauss, "The FERET database and evaluation procedure for face-recognition algorithms," Image and Vision Computing, vol. 16, pp. 295–306, 1998.

[83] S. Rizvi, P. Phillips, and H. Moon, "The FERET verification testing protocol for face recognition algorithms," Technical Report NISTIR 6281, National Institute of Standards and Technology, 1998.


[84] J. Ruiz-del-Solar and P. Navarrete, "Eigenspace-based face recognition: a comparative study of different approaches," IEEE Transactions on Systems, Man and Cybernetics, Part C, vol. 35, pp. 315–325, 2005.

[85] F. Samaria and S. Young, "HMM based architecture for face identification," Image and Vision Computing, vol. 12, no. 8, pp. 537–583, 1994.

[86] A. W. Senior, "Face and feature finding for a face recognition system," Second International Conference on Audio- and Video-based Biometric Person Authentication, pp. 154–159, 1999.

[87] G. Shakhnarovich and B. Moghaddam, "Face recognition in subspaces," Handbook of Face Recognition (S. Z. Li and A. K. Jain, eds.), Springer-Verlag, pp. 154–159, 2004.

[88] L. Sirovich and M. Kirby, "Low-dimensional procedure for the characterization of human faces," Journal of the Optical Society of America, vol. 4, pp. 519–524, 1987.

[89] L. Stringa, "Eyes detection for face recognition," Applied Artificial Intelligence, vol. 7, pp. 365–382, 1993.

[90] H. Takayasu, Fractals in the Physical Sciences. Manchester University Press, 1990.

[91] T. Tan, Human Face Recognition Based on Fractal Image Coding. PhD thesis, The University of Sydney, 2003.

[92] T. Tan and H. Yan, "Analysis of the contractivity factor in fractal based face recognition," IEEE International Conference on Image Processing, vol. 3, pp. 637–641, 1999.

[93] T. Tan and H. Yan, "Face recognition by fractal transformations," Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP99, pp. 3537–3540, 1999.


[94] T. Tan and H. Yan, "Object recognition using fractal neighbor distance: Eventual convergence and recognition rates," Proceedings of the 15th International Conference on Pattern Recognition, pp. 781–784, 2000.

[95] L. Torres, "Is there any hope for face recognition?," Proceedings of the 5th International Workshop on Image Analysis for Multimedia Interactive Services, WIAMIS 2004, pp. 21–23, 2004.

[96] M. Turk and A. Pentland, "Eigenfaces for recognition," Journal of Cognitive Neuroscience, vol. 3, pp. 71–86, 1991.

[97] M. Turk and A. Pentland, "Face recognition using eigenfaces," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 586–591, 1991.

[98] L. Vences and I. Rudomin, "Genetic algorithms for fractal image and image sequence compression," in Proceedings Computacion Visual, pp. 35–44, Universidad Nacional Autonoma de Mexico, 1997.

[99] S. Welstead, Fractal and Wavelet Image Compression Techniques. SPIE Press, 1999.

[100] L. Wiskott, J. Fellous, N. Krüger, and C. von der Malsburg, "Face recognition by elastic bunch graph matching," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, 1997.

[101] L. Wiskott and C. von der Malsburg, "Recognizing faces by dynamic link matching," Proceedings of the International Conference on Artificial Neural Networks, ICANN'95, pp. 347–352, 1995.

[102] L. Wiskott and C. von der Malsburg, "Labeled bunch graphs for image analysis," United States Patent 6,222,939, Apr. 2001.

[103] L. Wiskott and C. von der Malsburg, "Labeled bunch graphs for image analysis," United States Patent 6,356,659, Mar. 2002.


[104] L. Wiskott and C. von der Malsburg, "Labeled bunch graphs for image analysis," United States Patent 6,563,950, May 2003.

[105] B. Wohlberg and G. de Jager, "A review of the fractal image coding literature," IEEE Transactions on Image Processing, vol. 8, no. 12, pp. 1716–1729, 1999.

[106] A. L. Yuille, P. Hallinan, and D. Cohen, "Feature extraction from faces using deformable templates," International Journal of Computer Vision, vol. 8, no. 2, pp. 99–111, 1992.

[107] W. Zhao and R. Chellappa, "Face recognition: A literature survey," ACM Computing Surveys, pp. 399–458, 2003.