Silesian University of Technology
Faculty of Automatic Control, Electronics and Computer Science

Institute of Computer Science

Human identification using eye movements

Doctoral thesis
Paweł Kasprowski

Supervisor: prof. dr hab. inż. Józef Ober

Gliwice, 2004

"Eyes are windows of our soul"
William Shakespeare

TABLE OF CONTENTS

TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS
1 INTRODUCTION
  1.1 Biometric identification
  1.2 Eye movements in biometric identification
  1.3 Thesis of the dissertation
2 BIOMETRIC IDENTIFICATION ISSUES
  2.1 Classification of biometric identification methods
  2.2 Evaluating efficiency of biometric identification
  2.3 Overview of biometric identification methods
    2.3.1 Fingerprint verification
    2.3.2 Face recognition
    2.3.3 Iris recognition
    2.3.4 Behavioral techniques
    2.3.5 Multimodal systems
  2.4 Summary
3 EYE MOVEMENTS
  3.1 Physiology of eye movements
  3.2 Previous research concerning eye movements
  3.3 Eye movement tracking methodologies
  3.4 The OBER2 system
4 EYE MOVEMENT REGISTERING EXPERIMENT
  4.1 Possible testing strategies
  4.2 Possible stimulations
  4.3 Jumping point stimulation
  4.4 The learning effect
  4.5 Methodology used during the experiment
  4.6 Storing results of the experiment
5 ENTRY PROCESSING OF COLLECTED DATA
  5.1 Sample calibration
  5.2 Sample normalization
    5.2.1 Finding fixations
    5.2.2 Pairing fixations with required fixation locations
    5.2.3 Recalculating all values into a new range
  5.3 Calculating different eye movement properties
    5.3.1 Average velocity direction calculation
    5.3.2 Eye distance
    5.3.3 Distance to stimulation
    5.3.4 Discrete Fourier Transform
    5.3.5 Wavelet Transform
  5.4 Conclusions
6 MINIMIZATION OF ATTRIBUTE VECTORS
  6.1 Relevancy estimation
  6.2 Linear conversions
    6.2.1 Principal Component Analysis
    6.2.2 Other techniques
  6.3 Conclusions
7 CLASSIFICATION METHODS
  7.1 K Nearest Neighbors
  7.2 Template – threshold
  7.3 Naïve Bayes
  7.4 C45 Decision Trees
  7.5 Support Vector Machines
  7.6 Ensemble classifiers
    7.6.1 Bagging
    7.6.2 Boosting
    7.6.3 Using different classifiers and data representations
  7.7 Cross-validation
8 EXPERIMENT
  8.1 Data preparation
    8.1.1 Data gathering
    8.1.2 Entry processing – datasets preparation
  8.2 Performing classification tests
    8.2.1 Dividing a dataset into train-set and test-set
    8.2.2 Minimizing dataset
    8.2.3 Classification
  8.3 Verification of the results
    8.3.1 Analyzing errors for the datasets
    8.3.2 Analyzing errors of the classification algorithms
    8.3.3 Voting classifiers
  8.4 Conclusions – performance considerations
9 RESULTS
  9.1 Multiple trials estimation
  9.2 Problem of overfitting
  9.3 Conclusions
10 LITERATURE
APPENDIX. SOFTWARE TOOLS
  EyeLogin – data acquiring
  EyeLoader – creating a dataset of samples
  EyeAnalyser – maintaining datasets
  EyeDataset
  EyeVisualizer
  EyeConverter
  EyeClassifier
  EyeResults
  EyeStat – analyses of the results
  External packages

LIST OF FIGURES

Fig. 2.1 Example of the ROC curve
Fig. 2.2 The image of the fingerprint and the same fingerprint with detected minutiae
Fig. 2.3 Examples of eigenfaces
Fig. 2.4 Image of the human iris
Fig. 2.5 Image with detected iris and corresponding IrisCode
Fig. 3.1 Image of the retina
Fig. 3.2 Six oculomotor system muscles
Fig. 3.3 Subject ready for electro-oculography eye movement measuring experiment
Fig. 3.4 An example of contact lens coil
Fig. 3.5 Head mounted video-based eyetracker
Fig. 3.6 Video based eye tracker with camera combined with the TFT display
Fig. 3.7 OBER2 system operation principle
Fig. 3.8 Goggles to be worn during the experiment
Fig. 3.9 Graph illustration of the OBER2 measuring process
Fig. 3.10 The laboratory version of the OBER2 system
Fig. 4.1 Schema of the system registering only eye movements
Fig. 4.2 Schema of the system registering both eye movements and the observed image
Fig. 4.3 Schema of the system registering eye movements as the answer to the stimulation
Fig. 4.4 Hierarchy of possible stimulations
Fig. 4.5 Scanpaths of eyes looking at the static image
Fig. 4.6 Points matrix for stimulation
Fig. 4.7 Typical eye movement reaction for point position change in one axis
Fig. 4.8 Visual description of stimulation (a-l)
Fig. 4.9 Results of a single test
Fig. 4.10 Result of one test stored in a text file (EyeTestFile format)
Fig. 5.1 An example of a sample consisting of six independent parts (signals)
Fig. 5.2 Information needed for proper calibration of eye movement signal
Fig. 5.3 Example of badly acquired sample
Fig. 5.4 Two graphs presenting left eye horizontal reaction
Fig. 5.5 Signal (A) from Fig. 5.4 with detected fixations
Fig. 5.6 Signal (A) and its averaged levels
Fig. 5.7 Signal (B) with detected fixations
Fig. 5.8 Signal (B) and its averaged conversion with fixation assigned to the wrong level
Fig. 5.9 The same signal (B) as on Fig. 5.8 but the upper level has been rejected
Fig. 5.10 Signals (A) and (B) presented on Fig. 5.4 after normalization
Fig. 5.11 A sample contains a source information for producing different vectors of attributes
Fig. 5.12 Example of velocity vector calculated for left eye (using LX and LY signals)
Fig. 5.13 Average velocities of left eye in 16 different directions
Fig. 5.14 Radar graphs of average velocities of left eye in 16 different directions
Fig. 5.15 Absolute distance between eyes' gaze-points in the following moments of time
Fig. 5.16 Difference of the LX eye signal from the required fixation location
Fig. 5.17 Comparison of the normalized signals (A) and (B) presented above on Fig. 5.10
Fig. 5.18 Fourier spectra of signals (A) and (B) presented on Fig. 5.10
Fig. 5.19 Discrete wavelet transform of LX signal (using Daub4 mother wavelet)
Fig. 6.1 Data conversions schema
Fig. 7.1 Architecture of classification process
Fig. 7.2 Example of genuine-impostor diagram
Fig. 7.3 Example of FAR and FRR in the function of the threshold distance value
Fig. 7.4 The idea of voting classifiers
Fig. 7.5 The idea of validation using train-set and test-set
Fig. 8.1 Three phases of the experiment
Fig. 8.2 Process of the dataset creation
Fig. 8.3 Partial PCA calculation
Fig. 8.4 Errors for six different dataset types
Fig. 8.5 Errors for different classification algorithms
Fig. 8.6 The voting algorithm is using results of all classifiers
Fig. 9.1 Errors for different persons
Fig. 9.2 Errors for different persons in two trial test
Fig. 0.1 Schema of data preparing procedure
Fig. 0.2 Structure of file in EyeTestFile format
Fig. 0.3 Structure of the file in EyeDatasetFile format
Fig. 0.4 The visual description of EyeAnalyser application functionality
Fig. 0.5 Structure of the file in EyeResultsFile format

LIST OF TABLES

Table 8.1 Symbols of prepared datasets and descriptions with references
Table 8.2 Symbols of applied conversions
Table 8.3 Symbols of used classification algorithms
Table 8.4 Average error rates for six different types of dataset
Table 8.5 Average error rates for eight different classification algorithms
Table 9.1 Error rates in authorization tests
Table 9.2 Simulated error rates in authorization tests in two independent trials
Table 9.3 Calculation of paired results
Table 9.4 Error rates in authorization test combined from two trials

LIST OF ABBREVIATIONS

CWT – Continuous Wavelet Transform (5.3.5)
DFT – Discrete Fourier Transform (5.3.4)
DWT – Discrete Wavelet Transform (5.3.5)
EER – Equal Error Rate (2.2)
EOG – Electro-oculography (3.3)
FAR – False Acceptance Rate (2.2)
FRR – False Rejection Rate (2.2)
HMM – Hidden Markov Model (5.2.1)
HTER – Half Total Error Rate (8.3)
ICA – Independent Component Analysis (6.2.2)
IROG – Infrared-oculography (3.3)
KNN – k Nearest Neighbors (7.1)
LDA – Linear Discriminant Analysis (6.2.2)
PCA – Principal Components Analysis (6.2.1)
RFL – Required Fixation Location (5.2)
ROC – Receiver Operating Characteristic (2.2)
SMO – Sequential Minimal Optimization (7.5)
STFT – Short Term Fourier Transform (5.3.5)
SVM – Support Vector Machines (7.5)
VOG – Video-oculography (3.3)

1 Introduction

Security issues seem to be one of the most important problems of contemporary computer science. One of the most important branches of security is the identification of users. Identification may be required for access control to buildings, rooms, devices or information. In the case of computer systems, this means access to software and data. The basic aim of identification is to make it impossible for unauthorized persons to access the specified resources. There are generally three solutions for performing secure identification:

• Token methods (something you have).
• Memory methods (something you know).
• Biometric methods (something you are).

The token method has two significant drawbacks. Firstly, the token may be lost or stolen. A person who finds or steals a token gains access to all the resources that the proper owner of the token was able to access, and there is no possibility to find out whether they are the person they claim to be. Secondly, the token may be copied. The ease of making a copy of course differs between kinds of tokens, but it is always technically possible. Memory-based methods identify people by checking their knowledge. The most popular memory methods are, of course, different kinds of passwords. The main drawback of this kind of method is the unreliability of human memory. People may do their best to remember a password, but they cannot guarantee that the information will not be forgotten. Similarly to the token method, when a malicious user knows a password, it is impossible to check whether they are the person they claim to be. The problems with token and memory-based methods are the main cause of the increasing interest in methods of identification based on biometric information about a person.

1.1 Biometric identification

The terms "Biometrics" and "Biometry" have been used since the early 20th century to refer to the field of development of statistical and mathematical methods applicable to data analysis problems in the biological sciences [88]. Biometric techniques are frequently used in medicine, agriculture or biology. Recently the emerging field of technology devoted to identification of individuals by means of using biological traits 10


(i.e. biometric methods) has resulted in a common narrowing of the term 'biometrics' to refer only to that kind of research. Therefore, to avoid misunderstanding, the term 'biometric identification' will be used in most cases in this dissertation. Whenever the term is shortened to 'biometrics', it is used only in its narrower meaning. Biometric identification uses the fact that measurements of biological properties often give different results for different people. As some measurements are very similar across the whole or most of the population – for example body temperature or pulse frequency – biometric identification methods seek measurements which are characteristic of a single human being only and therefore unique. The main advantage of biometric identification is that it is generally more difficult to forge than classic methods. Another interesting property of biometric identification is that, contrary to classic methods, it enables so-called 'negative identification'. This means that people can prove not only that they are who they claim to be, but also that they are not who they claim not to be. Classic identification methods are based on the assumption that people want to be identified. However, in many applications the problem is not to grant access to a specified user but to deny it. In such a setting proper identification is against the user's interest, and, for instance, password identification becomes ineffective. Therefore a great amount of interest in biometrics appears in services aiming to find criminals or terrorists.

1.2 Eye movements in biometric identification

Using eyes to perform biometric human identification has a long tradition, including well-established iris pattern recognition algorithms [17] and retina scanning. However, the only papers concerning identification based on eye movement characteristics known to the author of this dissertation are those written by him and his supervisor [49][48][51][47][50]. This is somewhat surprising, because the method has several important advantages. Firstly, it combines physiological (muscles) and behavioral (brain) aspects. The most popular biometric methods, like fingerprint verification or iris recognition, are based mostly on physiological properties of the human body. Therefore, all that is needed for proper identification is the 'body' of the person to be identified. This makes it possible to identify an unconscious or – with some methods – even a dead person. Moreover, physiological properties may be forged. Preparing models of a finger or even a retina (using special holograms) is technically possible. As eye movement based



identification uses information which is produced mostly by the brain (so far impossible to imitate), forging this kind of information seems to be much more difficult. Although it has not been studied in this dissertation, it seems possible to perform covert identification, i.e. identification of a person unaware of the process (for instance using hidden cameras). Last but not least, there are many easy-to-use eye-tracking devices nowadays, so performing identification by means of this technique is not very expensive. For instance, the very fast and accurate OBER2 [71] eye-tracking system was used in the present work. It measures eye movements with very high precision using infrared reflection, and its production costs are comparable to fingerprint scanners.

1.3 Thesis of the dissertation

1. Eye movements may be used for human identification.
2. Biometric identification using eye movements is a valuable addition to other existing biometric identification methods.
3. Measuring eye movements while the subject follows a jumping-point stimulation yields information that may be used to perform identification.
4. The Principal Component Analysis technique is very useful for feature extraction from the eye movement signal.


2 Biometric identification issues

As was written in the introduction, memory and token methods are implemented to judge whether a specified user should have access to a specified resource. Therefore an exact identification of a person is not necessary and indeed is not always performed. It is possible that a group of people shares the same token or knows the same password. Contrary to this, biometric identification methods start with the proper identification of a person, and only after that are the proper rights assigned. Thus the main difference between classic methods and biometrics is that biometric properties cannot be 'borrowed', so people cannot – in a way as simple as handing over a token or telling a password – propagate their rights to others. This obviously increases the security of the system but may sometimes cause problems. The first stage in each biometric process is collecting a set of 'samples' from every user who should be identified by the system. A sample is a set of biometric data measured for a person in a single measurement. The biometric data may consist of different kinds of psycho-physiological measurements. The next stage in most methods is creating a 'template' for each user based on the previously collected samples. A template is a kind of average of all samples collected for a user. The process of creating a template is called the 'enrolment' of the user. Having a template for each known user, it is possible to identify new unclassified samples. There are two basic techniques:

• Identification.
• Authorization.

During the identification process, the system collects a sample and then tries to match it with one of the stored templates. Commonly it computes, for each template, the probability that the sample was collected from that user and chooses the template with the highest probability. However, this method works only when we are sure that templates of all possible users are in the system's database. It must be remembered that this method finds an identification for every sample. So persons whose templates are not in the database would always be classified as one of the previously enrolled persons and would get rights which they should not have. The solution to this problem is to introduce an error threshold for each template. If the sample being identified is not close enough to any template, the identification is rejected. Assigning a proper threshold is not a simple problem. It may be fixed for all templates


or computed independently for each one on the basis of, for instance, the variance of the enrolled samples. Another kind of test is the authorization test. In such a test, users are first explicitly asked for their names or logins, and then the system measures a sample of their biometric attributes. After that, the system evaluates the similarity of the sample to the template of the specified person and accepts or rejects the authorization. Authorization is much more reliable than identification. Furthermore, it is easier to implement and generally faster to perform.
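The enrolment-matching flow described above can be illustrated with a short sketch. This is a minimal illustration only, not the method used later in this dissertation: the Euclidean distance, the two-dimensional feature vectors and the per-user thresholds are assumptions made for the example.

```python
import numpy as np

def enroll(samples):
    """Create a template as the mean of a user's enrolment samples."""
    return np.mean(samples, axis=0)

def identify(sample, templates, thresholds):
    """Return the login of the closest template, or None when the
    sample is not close enough to any template (rejection)."""
    best = min(templates, key=lambda u: np.linalg.norm(sample - templates[u]))
    if np.linalg.norm(sample - templates[best]) > thresholds[best]:
        return None
    return best

def authorize(sample, login, templates, thresholds):
    """Authorization test: accept or reject a claimed identity."""
    return np.linalg.norm(sample - templates[login]) <= thresholds[login]

# Hypothetical two-attribute samples for one enrolled user:
templates = {"alice": enroll(np.array([[1.0, 2.0], [1.2, 1.8]]))}
thresholds = {"alice": 0.5}
print(authorize(np.array([1.1, 1.9]), "alice", templates, thresholds))  # True
```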

2.1 Classification of biometric identification methods

A biometric feature is a specific attribute of a person that can be measured with some precision. There are a lot of different biometric features that can be measured [95]. The methods measure different parts of the body using different measurement devices, so it is difficult to compare them directly. But there are some properties which may be evaluated for each measurement method. The most important are:

• Distinctiveness.
• Repeatability.
• Accessibility.
• Acceptability.

Distinctiveness shows how much the specific feature differs between people. For instance, the iris pattern or fingerprint are supposed to be very distinctive. On the other hand, the shape of the palm or the hair color are not very distinctive. Distinctiveness is very often considered the most important property of biometric methods, but there are also other properties that determine whether a method is usable. One of them is the repeatability of the method. Generally speaking, it shows how much the same feature may vary between different measurements of the same person. For instance, a fingerprint may easily be damaged by chemicals or simple injuries. The shape of the face may easily be changed with a moustache or glasses. On the other hand, it is rather difficult to change the shape of the palm or the iris image. A property that is very important when considering practical usage of a biometric method is its accessibility. Questions that may be asked here are: 1) How fast is the process of collecting data from one user (a measurement)? 2) How quickly can the measurement be repeated? 3) How complicated is the measurement process?


4) Does the identified person have to be trained beforehand, and how difficult is the training? 5) What is the availability of the devices performing the measurement (including their prices)? 6) How much space is needed to store the template of one person? 7) How fast are the methods for evaluating a new measurement? The answers to all these questions determine whether it is possible to use the method in a real environment. Last but not least, the acceptability of the method should be mentioned. It may be said that acceptability is accessibility from the users' point of view. One of the main problems is the intrusiveness of the method. Wayman [95] mentions a system based on the resonance patterns of the human head, measured through microphones placed in the users' ear canals. Such a system is certainly very inconvenient for users, and its acceptability is rather low. On the other hand, face recognition systems using cameras are not invasive and may be considered acceptable for users.

2.2 Evaluating efficiency of biometric identification

Measurement of biological quantities is always to some degree imprecise and therefore produces different values for the same measured quantity [88]. These errors are an inherent part of every biometric method, and the main problem of this kind of identification is to develop algorithms that deal adequately with such imprecise data. Although many companies advertise their biometric identification products as reliable and error-free, independent comparisons like the Fingerprint Verification Competition [64][62][63] or the Face Recognition Vendor Test [29][25] show that even the well-established fingerprint technology is not fully reliable. There are two kinds of tests when considering an authorization (two-class) system:

• Genuine test – a sample is given with correct identification information (login); in other words, 'the identified person is telling the truth'. In this case, the rate of improper rejections may be measured. This measure is often called the False Rejection Rate (FRR) or False Non-Match Rate.

• Impostor test – a sample is given with an incorrect login; in other words, 'the identified person is lying'. Now the rate of improper acceptances may be measured. This measure is called the False Acceptance Rate (FAR) or False Match Rate.



The two measures depend on each other: decreasing the False Rejection Rate increases the False Acceptance Rate and vice versa. Therefore, to properly state the efficiency of a biometric method, its results are often presented on a graph called the ROC curve [95]. The acronym ROC stands for 'Receiver Operating Characteristic', a term used in signal detection to characterize the tradeoff between hit rate and false alarm rate over a noisy channel [21]. The ROC curve, with FRR on the X-axis and FAR on the Y-axis, presents how these two rates depend on each other. Fig. 2.1 presents an example of the ROC curve for a biometric system.

Fig. 2.1 Example of the ROC curve. Each point stands for one classification verification for which FRR and FAR have been obtained.

The possibility of evaluating the ROC curve depends on the pattern matching method used. When it is possible, one can calculate the point where the FRR and FAR values are equal. The error value at this point is called the Equal Error Rate (EER) and is often referred to in the literature.
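For illustration, FAR, FRR and the EER can be computed from two lists of matching distances: one collected in genuine tests and one in impostor tests. The sketch below uses synthetic, normally distributed distances; real score distributions will of course differ.

```python
import numpy as np

def far_frr(genuine, impostor, threshold):
    """Lower distance = better match; accept when distance <= threshold."""
    frr = np.mean(np.asarray(genuine) > threshold)    # improper rejections
    far = np.mean(np.asarray(impostor) <= threshold)  # improper acceptances
    return far, frr

def equal_error_rate(genuine, impostor):
    """Scan candidate thresholds for the point where FAR and FRR meet."""
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    best = min(thresholds,
               key=lambda t: abs(np.subtract(*far_frr(genuine, impostor, t))))
    far, frr = far_frr(genuine, impostor, best)
    return best, (far + frr) / 2.0

genuine = np.random.normal(1.0, 0.3, 500)   # distances for true claims
impostor = np.random.normal(2.0, 0.5, 500)  # distances for false claims
t, eer = equal_error_rate(genuine, impostor)
print(f"threshold={t:.3f}  EER={eer:.3%}")
```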

2.3 Overview of biometric identification methods

Nearly every part of the human body has been used for identification. There are well-established methods for measuring fingerprints, iris, eye retina, face, palm, teeth, ears and even smell. There are also methods that measure human behavior patterns, such as the way of walking (gait), the shape of a signature or a mouse signature. Some of them are briefly described in this chapter.

2.3.1 Fingerprint verification

Fingerprint verification is one of the oldest biometric techniques used for human identification. The first uses of fingerprints instead of signatures were reported in the 19th century [42]. The milestone was the adoption of the Galton/Henry system of identification by Scotland Yard in 1900. Since then, fingerprints have become one of the most important features used in forensic prosecutions.



There are a lot of easy-to-use and cheap fingerprint scanners. They are based on different technologies, including optical, capacitive, ultrasound, pressure, thermal and electric field sensors. There is no longer any need to use dactylograms (inked fingerprints). The identification is based on the detection of ridges and the so-called minutiae – places where ridges end or bifurcate. The technology is widely accepted as very reliable. There is a common belief (though never proven!) that fingerprints are unique in the whole human population. That is why fingerprint evidence is even admissible in a court of law.

Fig. 2.2 The image of the fingerprint and the same fingerprint with detected minutiae.

However, there are several reasons why fingerprint verification has not completely dominated biometric identification. Firstly, fingerprints are very sensitive to physical damage and therefore not very robust. Secondly, many people have chronically dry skin and cannot present clear prints. Fingerprints are supposed to be very distinctive, but – what may be surprising – in competitions using real-world test samples the error rates may sometimes exceed 2% [42][63]. Moreover, fingerprints have a very bad 'reputation', as they are commonly associated with criminal investigations. Nevertheless, fingerprints are at present the most popular biometric identification method.

2.3.2 Face recognition

Face recognition is one of the most promising techniques nowadays. The possibility of covert identification of people unaware of the process makes it suitable for, for instance, terrorist searches in crowded places. The first face recognition technologies were the so-called


geometric-based methods. They were based on the recognition of specific elements of the human face, like the nose or eyes, and measurement of their relative positions and shapes. The methods were insensitive to variations in illumination and viewpoint. In 1990 Turk et al. [90] proposed a technique for extracting the most expressive features from the face image. The technique, based on Principal Component Analysis (see also section 6.2), creates a set of eigenfaces – images containing the most meaningful parts of the source image. The technique has been widely accepted. More recently, sophisticated techniques like Fisher Linear Discriminant Analysis [6] or Independent Component Analysis [54] have also come into wide use. A new direction of research is so-called 3D Face Recognition [60]. However, face recognition is still at a very early stage, with very limited usage in the real world. Attempts to use it for, for instance, terrorist recognition have so far failed [1].

Fig. 2.3 Examples of eigenfaces [52] (reproduced with permission)
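The eigenface idea can be sketched in a few lines of code. This is a minimal, self-contained illustration with random placeholder data; real systems first align and normalize the face images, and the image size and the number of retained components are arbitrary choices here.

```python
import numpy as np

rng = np.random.default_rng(0)
faces = rng.random((20, 32 * 32))       # 20 "face images", 32x32 pixels each

mean_face = faces.mean(axis=0)
centered = faces - mean_face
# Rows of vt are the principal components: the "eigenfaces".
_, _, vt = np.linalg.svd(centered, full_matrices=False)
eigenfaces = vt[:10]                    # keep the 10 most expressive

# A face is then represented compactly by its projection coefficients:
weights = centered @ eigenfaces.T       # 20 faces x 10 coefficients
print(weights.shape)
```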

The fact that (contrary to fingerprints) people are able to perform face recognition by themselves, without any special equipment, encourages researchers to seek better methodologies imitating humans. But a universal, ready-to-use face recognition technology is still the 'wave of the future'.

2.3.3 Iris recognition

Iris recognition is dominated by John Daugman and his algorithm based on the Gabor transformation [17]. The algorithm was patented and is now the property of Iridian Technologies Inc. Although the iris is very small (about 11 mm), it has enormous pattern variability among different persons [18]. Importantly, the iris is well protected from the environment and stable over a lifetime.



Fig. 2.4 Image of the human iris.

In the algorithm proposed by Daugman, the result of a 2D Gabor wavelet applied to the iris image is converted into a 2,048-bit vector, the so-called IrisCode. Two IrisCodes may be compared very quickly using a simple Hamming distance and the XOR operator. According to reports presented by Daugman [18], the methodology is virtually error-free. Iris recognition is therefore considered the most reliable biometric identification technique.

Fig. 2.5 Image with detected iris and corresponding IrisCode [18]. (reproduced with permission)
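The comparison step can be sketched in a few lines. Only the XOR-and-count idea is shown; a real IrisCode comparison also uses mask bits to exclude occluded regions, which are omitted here for brevity.

```python
import random

def hamming_distance(code_a: int, code_b: int, n_bits: int = 2048) -> float:
    """Fraction of bits that differ between two n_bits-long codes."""
    return bin(code_a ^ code_b).count("1") / n_bits

a = random.getrandbits(2048)
b = a ^ random.getrandbits(64)          # same code with up to 64 low bits flipped
print(hamming_distance(a, b))           # ~0.016: near-identical codes
print(hamming_distance(a, random.getrandbits(2048)))  # ~0.5: unrelated codes
```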

However, the main disadvantage of iris recognition is the difficulty of acquiring a proper image. The iris is very small and partially occluded by eyelids (often drooping). The image is often obscured by eyelashes or lenses. Moreover, eyes tend to move very fast.



That is the main reason why iris recognition has not dominated the biometric market, despite having several important applications – like registering all travelers arriving in the United Arab Emirates or the registration of refugees in Afghanistan.

2.3.4 Behavioral techniques

The methods described above measure physiological properties of the human body. The problem with this kind of method is that only a part of the human body is needed for proper identification. Potential forgers may try to prepare models of parts of the human body, like artificial fingers or contact lenses. Moreover, the person being identified may be unconscious or – for some methods – even dead. These problems have resulted in increased attention being paid to methods that measure not only physiological properties but also behavioral patterns [47]. Behavioral biometrics are based on measurements of data derived from an action. One of the defining characteristics of a behavioral biometric is the incorporation of time as a metric – the measured behavior has a beginning, a middle and an end. Behavioral methods are obviously more difficult to forge, because it is difficult to imitate somebody's behavior. On the other hand, analysis of the information obtained in such a dynamic measurement is more difficult than in the case of measurement of – presumably invariant – physiological properties. Therefore, the error rates achieved by behavioral methods are typically higher than those of physiological ones. The most popular behavioral biometric techniques include:

• Speech recognition.

Speech recognition is a special area of interest of telecommunication companies [14]. Its main advantage is the low cost of deployment: a microphone is the only equipment needed. However, the voice may easily be imitated, disguised or electronically transformed. Moreover, the voice of a person may change over time (for instance, altered by a cold).

• Keystroke dynamics.

The method is based on measuring the dynamics of the sequence of keystrokes when the user types something on the keyboard. The idea behind keystroke dynamics has been around since World War II [13]. It was well documented during the war that telegraph operators on many U.S. ships could recognize the sending operator. Raw measurements already available from a standard keyboard can be processed to determine dwell time (the time a key is kept pressed) and flight time (the time it takes a person to move from one key to another) [47], as the sketch below illustrates. Other properties may be measured when using specially designed keypads [72].


As the applications mostly use standard keyboards (i.e. common standard input devices), they are vulnerable to forgery.
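A minimal sketch of how dwell and flight times could be derived from timestamped key events; the event format is an assumption made for the example.

```python
def dwell_and_flight(events):
    """events: chronological list of (timestamp_ms, key, 'down'|'up') tuples.
    Returns dwell times (key held) and flight times (between presses)."""
    down_at, last_down = {}, None
    dwells, flights = [], []
    for t, key, kind in events:
        if kind == "down":
            if last_down is not None:
                flights.append(t - last_down)      # time between presses
            last_down = t
            down_at[key] = t
        elif kind == "up" and key in down_at:
            dwells.append(t - down_at.pop(key))    # how long the key was held
    return dwells, flights

events = [(0, "p", "down"), (95, "p", "up"),
          (140, "a", "down"), (230, "a", "up")]
print(dwell_and_flight(events))  # ([95, 90], [140])
```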

• Dynamic signature verification.

Static signature identification uses only the image of the signature; the dynamic variant also uses information about pen velocities while the signature is made on a special tablet [47]. To improve performance, signature verification may use a specially designed pen registering information about the pen's position and pressure [34], or even special gloves registering the position of each finger [32]. There is also a patent-pending methodology of identification based on a signature made with a mouse [24]. There are a lot of other behavioral biometric identification methods, for instance gait analysis [36], but most of them are in an early experimental phase. It must be mentioned that every behavioral biometric identification method also has a physiological factor. For instance, keystroke dynamics depends on the length of the fingers, and speech recognition depends on the physiology of the human vocal tract.

2.3.5 Multimodal systems

There is no optimal biometric identification method. Therefore, the technique of combining several methods has become an area of interest for researchers [41][12][80]. Systems that utilize more than one physiological or behavioral characteristic for identification are called multimodal biometric systems. The benefits of multimodal systems are:

• Reducing false rejection and false acceptance rates.
• Providing a secondary means of identification if sufficient data cannot be acquired.
• Combating attempts to spoof biometric systems through non-live data sources such as fake fingers.

The first benefit comes from the fact that combining the results of weak classifiers may improve the overall performance, as the sketch below illustrates. There are some obvious combinations of different biometric methods, like finger and palm or face and voice. As there are still problems with the performance of single biometric methods and their protection against forging, multimodal systems seem set to become more important in the future.
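A minimal sketch of score-level fusion: similarity scores from two matchers (here hypothetically face and voice) are combined with a weighted sum before thresholding. The weights and threshold are illustrative assumptions; real multimodal systems may fuse at the feature, score or decision level.

```python
def fuse(face_score: float, voice_score: float,
         w_face: float = 0.6, w_voice: float = 0.4) -> float:
    """Both scores assumed normalized to [0, 1], higher = better match."""
    return w_face * face_score + w_voice * voice_score

def accept(face_score: float, voice_score: float, threshold: float = 0.5) -> bool:
    return fuse(face_score, voice_score) >= threshold

# A weak face match can be rescued by a strong voice match and vice versa:
print(accept(0.45, 0.80))  # True: fused score 0.59
```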

2.4 Summary

This chapter briefly described state-of-the-art methods in the field of biometric identification. As can be seen, the problem is still far from being solved. There are


plenty of methods, but each of them has some drawbacks. On the other hand, there is great interest in reliable human identification, so there is still a need for further research. Developing new biometric technologies – like eye movement based identification – may prove very useful, especially when combined with other methods in multimodal systems.


3 Eye movements

Eyes are among the most important human organs. There is a common saying that eyes are 'windows to our soul'. In fact, eyes are the main 'interface' between the environment and the human brain. Therefore, it is not a surprise that the system that deals with human vision is physiologically and neurologically complicated.

3.1 Physiology of eye movements

When an individual looks at an object, the image of the object is projected onto the retina, which is composed of light-sensitive cells that convert light into signals, which in turn can be transmitted to the brain via the optic nerve. The density (or distribution) of these light-sensitive cells on the retina is uneven, with denser clustering at the centre of the retina than at the periphery. Such clustering causes the acuity of vision to vary, with the most detailed vision available when the object of interest falls on the centre of the retina. This area is called the yellow spot or fovea and covers about two degrees of visual angle. Outside this region visual acuity rapidly decreases. Eye movements are made to reorient the eye so that the object of interest falls upon the fovea and the highest level of detail can be extracted [16].

Fig. 3.1 Image of the retina. The dark region on the right is the fovea.

That is why it is possible to define a 'gaze point' – the exact point a person is looking at in a given moment of time. When the eyes look at something for a period of time, this state of the eye is called a fixation. During that time the brain analyzes the image that is projected on the fovea. A standard fixation lasts about 200-300 ms, but of course this depends on the complexity of the observed image. After the fixation,



the eyes move rapidly to another gaze point – another fixation. This rapid movement is termed a saccade. Saccades differ in amplitude, yet are always very fast. To enable the brain to acquire the image in real time, the system which controls eye movements (termed the oculomotor system) has to be very fast and accurate. It is built of six extraocular muscles (see Fig. 3.2) which act as three agonist/antagonist pairs responsible for horizontal, vertical and oblique rotations of the eye [38]. The eyes are controlled directly by the brain through three cranial nerves originating from the midbrain and pons. Therefore, their movements are among the fastest human reactions to a changing environment.

Fig. 3.2 Six oculomotor system muscles.
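To make the fixation/saccade distinction concrete, the sketch below applies a simple velocity threshold to a recorded gaze signal. The 100 deg/s threshold and the signal format are assumptions for illustration only; the fixation detection actually used in this work is described in section 5.2.1.

```python
import numpy as np

def classify_samples(x, y, rate_hz, threshold_deg_s=100.0):
    """x, y: gaze angles in degrees, one value per sample.
    Returns a boolean array: True where the sample belongs to a saccade."""
    vx = np.gradient(np.asarray(x, dtype=float)) * rate_hz  # deg/s
    vy = np.gradient(np.asarray(y, dtype=float)) * rate_hz
    speed = np.hypot(vx, vy)
    return speed > threshold_deg_s

x = [0, 0, 0.1, 4, 9, 10, 10.1, 10]              # a 10-degree horizontal saccade
saccade = classify_samples(x, [0] * len(x), rate_hz=250)
print(saccade)  # True only for the rapid transition samples
```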

3.2 Previous research concerning eye movements

Eye movements are essential to visual perception [66], so it is not a surprise that there is a great deal of research on vision. Most of it is concerned with the neurobiological and psychological aspects of vision. One of the first scientists who emphasized the importance of eye movements in vision and perception was Descartes (1596-1650). The first known studies were made by the French ophthalmologist Émile Javal in 1879 [44]. He discovered that the eyes move in a series of jumps (saccades) and pauses (fixations). His research was based only on direct observation of the eyes, so it could not be fully reliable. The first eye tracker was developed by Edmund Burke Huey in 1897. The way in which people read text was the first area of interest. It turned out – contrary to the common view of those times – that people read more than one letter simultaneously. They read whole words or even whole phrases. The nature of the reading ability was examined, and the results were published in a comprehensive form in 1908 [37]. Another area of interest was how the brain processes images. It turned out that the placement and order of fixations were strictly dependent on the kind of picture being viewed and on previous individual experience with that kind of picture. The brain was



believed to be attracted by the most important elements of the picture and, after examining them, to focus on less important details. The acquired knowledge about the way the brain processes information was used mostly in psychological research [19][23]. Another evolving field where eye trackers are used is usability engineering – the study of the way users interact with products, aimed at improving those products' design. Among the most popular subjects nowadays is the study of the usability of WWW pages [16][46][83]. Although there has been no research on using eye movements to perform human identification, some authors have noticed significant differences between people. Josephson and Holmes [46] tested the scanpath theory introduced by Noton and Stark [70] on three different WWW pages. They not only confirmed that individuals learn scanpaths (series of fixations) and repeat them when exposed to the same stimulation again, but also noticed that each examined person learned a different scanpath. There are also some studies comparing the eye movements of different categories of people, for instance males and females [83] or musicians and non-musicians [55].

3.3 Eye movement tracking methodologies

As stated above, the first eye tracker was developed in 1897. Since then, a lot of different methods for measuring eye movements have been developed. There are four broad categories of eye movement measurement methodologies [20][59]:

• Electro-oculography (EOG).
• Scleral contact lens/search coil.
• Video-oculography (VOG).
• Infrared corneal reflection oculography (IROG).

Electro-oculography (EOG) is the cheapest method and has been widely used in the past. It relies on recordings of the electric potential differences of the skin surrounding the ocular cavity. Surface recording electrodes are typically placed on the skin close to the eyes in the horizontal and vertical planes so as to record relative shifts in the potential. When the eyes look to the left, the positive charge of the cornea moves closer to the left surface electrode, and a shift in the DC output is recorded. The relationship between the EOG output and the horizontal angle of gaze is approximately linear for ±30° of arc and is usually accurate to within ±1.5-2.0° [69].


Fig. 3.3 Subject ready for electro-oculography eye movement measuring experiment [68]. (reproduced with permission)

The main disadvantage of this method is that it requires electrodes to be placed around the eye, which is not very convenient for the subjects being examined. Moreover, the method is not very precise in comparison to others. The contact lens coil is the most precise eye tracking method. It involves attaching a mechanical or optical reference object mounted on a contact lens, which is then worn directly on the eye [20]. Different devices may be attached to the lens, but the principal methods employ a wire coil, whose movement through an electromagnetic field is then measured. Although the contact lens coil is the most precise eye movement measurement method (accurate to about 5-10 seconds of arc over a limited range of about 5 degrees), it is also the most intrusive one. Insertion of the lens requires care and practice, and wearing the lens may cause discomfort. Its high intrusiveness makes it practically useless in human identification experiments.

Fig. 3.4 An example of contact lens coil [86]. (reproduced with permission)

Video-oculography (VOG) is generally based on analysis of the image of the eye changing in time. Because it uses CCD cameras, it is convenient for the observed subjects, as no physical contact with the device is necessary. The recording device may be attached to a special head-mounted helmet (Fig. 3.5), which is not very convenient, or may be table-mounted (at a distance from the eye), for instance attached to a computer display (Fig. 3.6).


Fig. 3.5 Head mounted video-based eyetracker [87]. (reproduced with permission)

These techniques involve the measurement of distinguishable features of the eyes under rotation/translation, e.g. the apparent shape of the pupil, the position of the limbus (the iris-sclera boundary) and corneal reflections of a closely situated directed light source (often infrared) [20].

Fig. 3.6 Video based eye tracker with camera combined with the TFT display [89]. (reproduced with permission)

The last methodology takes advantage of the reflection properties of human eyes. A beam of light is directed at the eye and the reflection is measured. As a directed, closely situated visible light source could be inconvenient for the examined person, infrared light sources are often used. That is why the methodology is called infrared oculography (IROG). Contrary to VOG, the method does not use complicated image capturing devices, but only simple light receivers.



The method is very precise and not very intrusive, so it seems to be the best choice for the experiments presented in this work. One of the best examples of products based on this methodology is the OBER2 system described below.

3.4 The OBER2 system

The OBER2 system is the product of many years of experience and experiments. It was developed by an international group of researchers including Dr Per Udden from Permobil Meditech, Sweden, Professor Jan Ober from the Institute of Biocybernetics and Biomedical Engineering, Polish Academy of Sciences, and Professor Józef Ober from the Silesian University of Technology, Gliwice, Poland. The OBER2 system is an example of an infrared oculography (IROG) based system. It works using pairs of infrared transmitters and receivers. The transmitters emit infrared light towards the eye. The light is reflected from the iris or sclera regions and is collected by the receivers. Eye movements are measured using a differential comparison of the transmitted and reflected signals in time [59]. It turns out that the difference in the amount of light received during two consecutive measurements is, with good accuracy, proportional to the angular position of the eye [43].

Fig. 3.7 OBER2 system operation principle.

Fig. 3.7 presents the basic idea of the OBER2 system. In fact, there are eight pairs of transmitters and receivers measuring the movements of each eye. They are attached to specially designed 'goggles' presented in Fig. 3.8.



Fig. 3.8 Goggles to be worn during the experiment.

Because of medical restrictions, the amount of infrared light to which the eye is exposed should be as small as possible. Therefore the OBER2 system emits infrared light in very short 80 µs pulses. The amount of light measured by the receivers during the pulse is (after subtraction of the predicted amount of ambient light) the output of the system. The whole process may be described in the following steps (see Fig. 3.9):

• Measure the amount of ambient light received at t0 (IR0).
• Measure the amount of ambient light received at t1 (IR1) and start emitting the impulse.
• Measure the amount of light received at t2, when the impulse reaches its maximum value (IR2).
• Calculate (by extrapolation from the amounts measured at t0 and t1) the predicted amount of ambient light at t2 (IR3).
• The output of the system is IR2 − IR3.

Fig. 3.9 Graph illustration of the OBER2 measuring process (ambient light levels IR0 and IR1, the extrapolated level IR3, and the light impulse peak IR2, at times t0, t1 and t2).
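Assuming, as the equally spaced instants in Fig. 3.9 suggest, that the ambient light level is extrapolated linearly from the two pre-pulse measurements, the output computation can be sketched as follows (the exact extrapolation used by OBER2 may differ):

```python
def ober2_output(ir0: float, ir1: float, ir2: float,
                 t0: float, t1: float, t2: float) -> float:
    # Predicted ambient light at t2, extrapolated from (t0, IR0) and (t1, IR1):
    ir3 = ir1 + (ir1 - ir0) * (t2 - t1) / (t1 - t0)
    # Reflected portion of the emitted impulse:
    return ir2 - ir3

# A steadily rising ambient level (e.g. mains lamp flicker) is cancelled out:
print(ober2_output(ir0=100, ir1=104, ir2=148, t0=0, t1=40, t2=80))  # 40.0
```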



The value is then compared to the value obtained during the calibration (see section 5.1) and sent to a 12-bit AD converter. The result of a single measurement is therefore a 12-bit number. The same procedure is used for both eyes and both directions (horizontal and vertical), so the system gives four such numbers as the result of one measurement. The current version of the system can measure eye positions with a frequency of up to 2 kHz (4 kHz when measuring in only one direction). The estimated precision of measurement is about 30 seconds of arc. The system used during the experiments was connected to a PC computer through an RS232 interface. All results measured by the system were sent directly to the computer and stored on a hard drive (see section 4.6).

Fig. 3.10 The laboratory version of the OBER2 system.

The advanced technology used in the OBER2 system makes it one of the fastest and most accurate eye tracking devices.


4 Eye movement registering experiment

To prove that eye movements may be used for human identification, an experiment had to be performed. The experiment was divided into two stages:
1) Gathering samples of human eye movements from different persons.
2) Processing the samples obtained in the previous step to extract individual features.
This chapter analyses the first stage of the experiment – gathering eye movement samples. That process consists of a series of tests on different subjects (persons). Each test is a registration of the eye movements of the subject for a specified period of time with the OBER2 system. The result of a single test is a sample, which will be used in the second stage of the experiment.

4.1 Possible testing strategies

There are several possibilities of how to perform a single test. This section discusses some of them and describes the chosen solution.

• Registering eye movement only, without information about the observed image.

In that kind of test, the eye movements of a person are registered for a specified period of time. The testing system consists of the OBER2 device for eye movement registration and a PC computer connected to the OBER2 through an RS232 interface for data storage (Fig. 4.1). The solution is very simple to conduct, even without any cooperation from the person being identified. Eye movements may be measured during the normal activity of that person, without any information about the observed image. The main drawback of that method is the obvious fact that eye movements are strongly correlated with the image being looked at. The movements would be quite different for a person looking at a quickly changing environment (for instance a sport event or an action movie) than for a person looking at a plain white wall.


Fig. 4.1 Schema of the system registering only eye movements without any information about the observed image.

Of course one may say that human identification should be independent of the visual stimulation. Indeed, it should theoretically be possible to extract identification patterns from any eye movements without knowledge of the character of the stimulation. However, that kind of extraction seems to be very difficult and requires more comprehensive study and experiments.

• Registering both eye movements and the observed image.

In that solution the testing system is expanded with an image-capturing device, which registers the 'reason' for the eyes' movements (Fig. 4.2). In that kind of test we treat the tested subject as a dynamic system for which we register the input and the answer to that input.

Fig. 4.2 Schema of the system registering both eye movements and the observed image.


Such an improvement gives a lot more data to be analyzed [31]; yet it also has several serious drawbacks. First of all, the testing system is more complicated. We need an additional camera, which records the image the examined person is looking at. Furthermore, we need to implement special algorithms to synchronize the visual data with the eye movement signal. A lot more capacity is also needed for data storage. We must additionally be aware that a camera 'sees' the world differently than a human eye, thus the registered image cannot be considered completely equivalent to the eyes' input. Moreover, to be usable in the real world for the purpose of biometric identification, a single test cannot be too long. With no influence on the stimulation (the image being observed) one cannot be sure whether enough interesting information about the person being identified can be registered during the short time of the test.

• Generating an image and registering eye movements as the answer to it.

In that solution the testing system consists of the OBER2 eye tracker and a PC computer used both for data storage and for controlling the monitor, which produces a visual signal (Fig. 4.3). The OBER2 system registers the answer of the subject's eyes to that signal. However, we should be aware of the fact that the monitor screen is only a part of the image that the eyes see, so not the whole input is measured. Furthermore, the input may consist of non-visual signals. Sudden loud sounds may, for instance, cause rapid eye movements [92].

Fig. 4.3 Schema of the system generating stimulation on a computer display and registering eye movements as the answer to that stimulation.

As that methodology gives influence over the 'input' of the examined subject, it seems to be the most interesting from the researcher's point of view. Therefore all tests described in this work were performed using a stimulation displayed on the monitor, with the system architecture presented in Fig. 4.3.

4.2 Possible stimulations

Persons being tested look at the computer monitor for a specified period of time and their eye movements are measured. The computer monitor displays a scene called here a stimulation. One may consider different types of stimulations (see Fig. 4.4). The type of stimulation determines which aspect of eye movements is measured.

Fig. 4.4 Hierarchy of possible stimulations: static or dynamic; dynamic stimulations either passive or forcing eye movement; the latter interactive or not interactive.

The simplest one could be just a static image. Such an image does not change in time and is the same during the whole test. As has already been stated, the eyes move constantly, even when looking at a static image, to register every important element of the image with a fixation of the fovea region. According to Stark and Norton [70] the brain creates a 'scanpath' of eye movements for each image seen.

Fig. 4.5 Scanpaths of eyes looking at the static image.


As can be seen in the figure above, scanpaths generated by the eyes when observing an image may be a very interesting field of study. The theory that the scanpath characteristic is unique for a person [46] seems promising. A special kind of static stimulation is a text stimulation. In such an experiment a person just reads the text appearing on the screen. There are a lot of studies concerning eye movement tracking while reading a text and they give very interesting results [11][22][23]. After years of usage the human brain is very well prepared to control eye movements while reading, and each human being has slightly different customs and habits based on different 'reading experience'. Therefore, it may be assumed that by analyzing the way a person reads a specially prepared text a lot of interesting information may be extracted. However, there is a problem with the (described later) learning effect when the same stimulation is observed a number of times.

A more sophisticated solution could be a dynamic stimulation, changing in time. Different aspects of the stimulation may be considered: color, intensity, speed, etc. This kind of stimulation may be passive – just showing the stimulation without any expectations of the examined person's reactions; for instance a movie or an animation. On the other hand, the stimulation may force eye movements. In that kind of stimulation the system expects the eyes to react to the stimulation. Such a stimulation may be interactive or not interactive. In an interactive stimulation the computer display shows an image and the testing system waits for the user's reaction. It may be, for instance, a visual task stimulation like finding a matching picture [35] or finding missing elements in a known picture [33]. The stimulation changes according to the registered eye movements. When a subject, for instance, finds a matching picture with their eyes, the stimulation automatically changes to the next task. The person can also signal task completion using any other input device like a keyboard or mouse. In a non-interactive stimulation the testing system presents a stimulation and the task of the tested person is to look at a specified point. The examined subject has no influence on the stimulation, as the stimulation is not sensitive to how the subject moves their eyes.

4.3 Jumping point stimulation

One of the simplest forms of dynamic, eye-movement-forcing, non-interactive stimulations is the 'jumping point' stimulation. In that kind of stimulation the screen is blank with only one point 'jumping' across it. The task of the examined person is to follow the point with their eyes.


It is easier to analyze the results of such a stimulation. This time we are not interested in examining where the persons are looking but in examining how they look at the point. We may suppose that all results will be more or less similar to one another and our task is to extract the differences among people. The main drawback of the method, however, is that it completely ignores the will of the person. The person cannot decide where to look at a given moment and therefore we lose all information from the brain's 'decision centre'. We may say that this kind of stimulation examines the oculomotor system rather than the brain. However, the jumping point stimulation has several significant advantages:

• It is self-calibrating, as the required fixation location is known during the whole experiment. That allows us to omit the pre-calibration stage, which might be necessary for other kinds of stimulations [35][2].

• Its duration is the same for each experiment. When a person is completing a visual task like text reading or picture matching we can never be sure when they will finish. The jumping point stimulation always lasts the same amount of time.

• It is very easy to display even without a monitor. In fact, only sources of light are needed (for instance simple diodes).

• No additional hardware is needed (like a mouse or keyboard for some visual tasks).

4.4 The learning effect

What is very important when using eye movements in biometrics is that the same test will be performed on the same person a number of times. Firstly, several tests must be performed to enroll the user's characteristic. Then, the test is performed each time an individual wants to identify themselves. It may be supposed that if the same stimulation was used each time, the person would get familiar with it and, for instance, would seek only previously unnoticed details. After a lot of repetitions the scene is known and the person may even stop moving their eyes when looking at it. The effect may be called a learning effect, as the brain learns the stimulation.

The learning effect is present in every experiment measuring human behavior. For example, when a text is read for the first time the eye movements are interesting, because the eyes stop at more difficult words and sometimes go back to read some fragments again. This process is very often unconscious. However, when the same, already known, text is read once again the eye movements are smooth and not interesting at all. The brain has this text in memory and these movements are not really reading it again.


When performing a text reading test we may also notice, for instance, that a person has a lot of problems with reading the word 'oculomotor' and that when reading this word the eyes go back to read it again. However, presenting the same word during each experiment causes the person's brain to get familiar with the word, and the effect disappears. There are two possible ways to handle the learning effect:

• Overcome the learning effect by changing the stimulation for every experiment.

It is possible to overcome the learning effect by changing the stimulation in every test. When a different stimulation is used every time, the learning effect is obviously minimized. However, these stimulations should be as similar as possible to enable extraction of the same eye movement parameters for future analyses. On the other hand, they should be different enough that the learning effect is eliminated. The task is therefore not an easy one. Using different texts in the text reading stimulation one can overcome the learning effect; however, extracting the same information about the examined person is difficult because the difficulty of different texts varies. Similarly, when considering the jumping point stimulation, the moments of point changes and the placements of the following points may be random. But the main drawback is that we cannot directly compare two experiments, as the eye movements during them are different. That problem may of course be minimized by using a proper feature extraction method described in section 5.

• Use the learning effect as the direct identification measure.

Instead of avoiding the learning effect, it may be directly used in the identification process. A person who looks at the same stimulation many times gets used to it and the results of the subsequent experiments converge – the subsequent samples are more similar. Having that in mind we can suppose that, after a specific number of experiments, the next samples will be very similar and therefore easier to identify. It is exactly the same effect as with the written signature. Let's imagine the following experiment: persons are asked to write a word on paper. The written word looks typical of their handwriting style and it is possible for a specialist to identify them. Yet, when they are asked to write the same word over and over again, they get used to it and the brain produces a kind of automatic schema for performing that task. At this point the handwritten word looks very similar every time, and that is what we call a signature. Contrary to handwriting in general, the signature may be recognized even by an unqualified person – for instance a shop assistant.

We would like to use the same effect with eye movements. Firstly, we show a person the same stimulation several (as many as possible) times. After that process, the person's brain produces an automatic schema and the results of the following experiments start to converge. That, of course, makes the process of recognition (identification) easier – remember the handwriting specialist versus the shop assistant. Of course, we must be sure that even after convergence the eye movements are still informative – the eyes must be active even when the stimulation is well known. Therefore that kind of 'stimulation learning' may be used only with stimulations forcing eye movements.

Returning to the jumping point stimulation – if the stimulation is invariable, every test is similar and may be directly compared. After many repetitions a person gets familiar with the points' appearance order and may even predict the next point position. When every examined person has their own individual stimulation, the test is analogous to a handwritten signature. Each person learns their own 'eye movement signature' and it is difficult for any other person to repeat the same sequence of eye movements. Of course – as the process is dynamic and mostly unconscious – forging another person's 'eye movement signature' is much more difficult than forging a written one. But, using the same analogy to handwriting, when we ask several persons to write the same word several times, they will do it differently, each with their own handwriting style. So it may be supposed that even when the same stimulation is used for each person, it is possible to distinguish the differences. The latter method (using the same stimulation for every test) gives a better opportunity for further cross testing of samples, as every test may be used both as a genuine and an impostor one (see section 7).

4.5 Methodology used during the experiment

The stimulation eventually chosen was a 'jumping point' stimulation with the same point order for every experiment. There are nine different point placements defined on the screen: one in the middle and eight on the edges, creating a 3 x 3 matrix. The point flashes in one placement at a given moment. The stimulation begins and ends with the point in the middle of the screen. During the stimulation, the point's placement changes at specified intervals.


Fig. 4.6 Points matrix for stimulation.

The main problem in developing the stimulation is to make it both short and informative. Those properties are on two opposite poles, so a 'golden mean' must be found. It was assumed that gathering one sample should not last longer than 10 seconds; longer stimulations would be impractical for real-world usage. To be informative, the experiment should consist of as many point position changes as possible. However, moving the point too quickly makes it impossible for the eyes to follow it. Experiments and the literature [38] confirm that the reaction time to a change of stimulation is about 100-200 ms. After that time the eyes start a saccade, which moves the fovea to the new gaze point. The saccade is very fast and lasts no longer than 10-20 ms. After a saccade, the brain analyses the new position of the eyes and, if necessary, tries to correct it. So very often, about 50 ms after the first saccade, the next saccade happens. It can be called a 'calibration' saccade.

Fig. 4.7. Typical eye movement reaction for point position change in one axis.


Therefore, to register the whole reaction to a point change it was necessary to make the interval between point location changes longer than 300 ms. The stimulation, which was developed and used during all tests, consists of eleven point position changes giving twelve consecutive point positions. The first point appears in the middle of the screen and the person should look at it with eyes positioned directly ahead. After 1600 ms the point in the middle disappears and for 20 ms the screen is blank. In that time the eyes are in an unstable state, waiting for another point of interest. That moment is uncomfortable for the eyes because there is no point to look at. Then the point appears in the upper right corner. A point flashing on the blank screen attracts the eyes' attention even without the person's will. The 'jumps' of the point continue until the last point position, in the middle of the screen, is reached.

Fig. 4.8 Visual description of the stimulation (frames a-l; frame a is shown for 1600 ms, frames b-k for 550 ms each, and frame l for 1100 ms).

The chosen stimulation is described in the picture above. Each frame except for the first and the last one is seen by the subject for about 550 ms. The first and the last points are placed in the middle of the screen.

The next problem which had to be solved was choosing the parameters of eye movement registration. The OBER2 eye tracking system measures eye movements with frequencies of up to 2 kHz. Such high frequencies may reveal information about eye micro-movements. However, higher frequencies give more data, and such an amount of data is sometimes difficult to manage considering the number of experiments. In the eye movement research described in section 3.2 the measuring frequencies were almost always below 100 Hz. Therefore a frequency of 250 Hz was arbitrarily chosen.


Such a frequency makes it impossible to reveal eye micro-movements like tremors [59], but it is enough to acquire information about fixations and saccades.

4.6 Storing results of the experiment

Each test was a recording of 2048 positions of the eyes looking at the stimulation. That number was chosen for two reasons:

• It is easier to handle signals with a number of elements equal to a power of two.

• As the OBER2 system measured eye positions with a frequency of 250 Hz, the whole experiment lasted 8192 ms, which fulfils the 'less than 10 seconds' condition.

Each of the 2048 measurements (taken at intervals of 4 ms) consisted of six integer values giving the position of the stimulation point on the screen (SX, SY), the position of the point the left eye was looking at (LX, LY) and the position of the point the right eye was looking at (RX, RY) at each moment of time during the presentation of the stimulation. Positions were results from the 12-bit AD converter, so they were in the range 0-4095. The stimulation point position was also rescaled to the same bounds. In each experiment a sample of 2048 x 6 = 12288 values is collected.

Fig. 4.9 Results of a single test.

The result of a single test may be shown on two graphs (Fig. 4.9). The result of a single test (later called a sample) was stored in a single text file in the format defined as the EyeTextFile format. An example of such a file is presented below.


login 2003-12-17 11:00:39
2048 2048 2071 2026 2110 2229
2048 2048 2102 2030 2104 2236
2048 2048 2102 2032 2103 2233
2048 2048 2101 2031 2102 2231
2048 2048 2100 2030 2101 2226
2048 2048 2102 2032 2101 2224
2048 2048 2100 2028 2096 2219
2048 2048 2098 2030 2096 2216
2048 2048 2101 2029 2095 2216
2048 2048 2100 2024 2095 2214
2048 2048 2099 2026 2095 2215
2048 2048 2097 2028 2092 2213
...

Fig. 4.10 Result of one test stored in a text file (EyeTextFile format).

As can be seen, the sample file consists of identification information, the date and time of acquisition, and a set of 2048 measurements. Each measurement consists of six values: SX, SY, LX, RX, LY and RY. Each sample file was stored on a hard disk with a filename created as login.index, where login is the identification of the person being tested and index is the number of the sample taken from that person. For instance, a file ABC.12 is the 12th sample taken from a person identified as ABC. The samples obtained during the tests and stored in separate files were the subject of further analyses, described in the next sections.
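A reader implementing the format can load such a file as sketched below; the function name and the dictionary layout are illustrative choices, assuming NumPy is available.

import numpy as np

def load_eye_text_file(path):
    """Load one sample stored in the EyeTextFile format (sketch).

    First line: login plus date and time of acquisition; each of the
    following 2048 lines holds six integers SX SY LX RX LY RY (0-4095).
    """
    with open(path) as f:
        header = f.readline().split()      # e.g. ['login', '2003-12-17', '11:00:39']
        data = np.loadtxt(f, dtype=int)    # shape (2048, 6)
    sx, sy, lx, rx, ly, ry = data.T
    return header, {'SX': sx, 'SY': sy, 'LX': lx, 'RX': rx, 'LY': ly, 'RY': ry}

# sample ABC.12 would be read as: header, signals = load_eye_text_file('ABC.12')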


5 Entry processing of collected data

Each test described in section 4 gave a sample consisting of information about the eye and stimulation point positions at specified moments of time. The sample consists of 12288 values grouped in six 2048-element vectors of integers from the range 0-4095. The vectors are:

• SX – horizontal position of the stimulation point: 0 for the left position, 2048 for the middle position and 4095 for the right position.

• SY – vertical position of the stimulation point: 0 for the top position, 2048 for the middle position and 4095 for the bottom position.

• LX – horizontal position of the left eye: 0 for the relative left position, 2048 for the middle position and 4095 for the relative right position of the eye.

• LY – vertical position of the left eye: 0 for the relative upper position, 2048 for the middle position and 4095 for the relative bottom position of the eye.

• RX – horizontal position of the right eye: 0 for the relative left position, 2048 for the middle position and 4095 for the relative right position of the eye.

• RY – vertical position of the right eye: 0 for the relative upper position, 2048 for the middle position and 4095 for the relative bottom position of the eye.

For the convenience of further operations all vectors were combined into one sample vector, as can be seen in Fig. 5.1. Each vector will be called 'a signal' later in this section and will be treated separately.


Fig. 5.1 An example of a sample consisting of six independent parts (signals). The X-axis represents time (separately for each signal) and the Y-axis represents the value obtained from the OBER2 system's AD converter (for signals LX, RX, LY and RY). For signals SX and SY the Y-axis takes only three possible values (0, 2048 and 4095), representing stimulation point positions.


The data has to be transformed to create a vector of attributes, which will be used in the identification process described in the following sections. Each feature should give some information about the person who was the subject of the experiment. That information may be understandable – for instance "his dominant eye is the left one" or "his eyes are flickering with a frequency of 10 Hz" – but the meaning of a feature may also be hidden, giving only a value. The main problem is how to extract a set of features whose values for different samples of the same person (inner-class samples) are as similar as possible and whose values for different persons' samples are as different as possible.

As has been mentioned earlier, identification based on eye movement analysis is a brand new technique. The main disadvantage of that fact is that one cannot take already published algorithms and just try to improve them with one's own methods. Therefore, the only possibility is to use methods which have been successfully applied to similar problems. The methods described in this section include:

• Methods used elsewhere to analyze eye movement data.

• General methods used in signal processing and classification.

The signal transformations described in this section use information from the sample (the result of a single test) to produce a vector of attributes. All calculations are independent of other samples.

vector_of_attributes = f(sample)    Eq. 5.1

The first problem to be solved before computing a vector of features is selecting the samples suitable for further analyses and normalizing them to enable direct comparison of their different attributes.

5.1 Sample calibration

The first step in laboratory tests involving eye trackers (or any other measuring device) is always a calibration of the device. A typical calibration consists of a series of tests and an analysis of the obtained results to improve the device measurements [35]. After achieving satisfactory parameters the real experiment begins.

The OBER2 device calibrates at the moment when the test is started. It is assumed that at that moment the examined person is looking straight ahead, and the initial value of the system's output is set to 2048 (the middle value of the 12-bit AD converter). When the eye moves down the value describing the vertical position increases; when the eye moves up the value decreases. Similarly, when the eye moves right the value describing the horizontal position increases, and when the eye moves left the value decreases. The amount of this increase or decrease may be controlled with the gain parameter of the OBER2 system.

The system is very sensitive to the adjustment of the goggles. The results directly depend on the distance between the goggles and the eyes of the tested person. When the distance is small, the changes of reflection caused by eye movements are high; conversely, when the distance is big, the changes are smaller. The results also depend on the properties of the cornea – its shape and color. To calculate the eye's position in degrees relative to the center position, a calibration procedure needs three values:

• The distance between the eyes and the stimulation (monitor screen).

• The value obtained from the OBER2 system when the eye is looking at the center of the screen (usually 2048).

• The value obtained from the OBER2 system when the eye is looking at a specified point on the screen.

Fig. 5.2 Information needed for proper calibration of eye movement signal.

When these three values are known, it is possible to convert the raw OBER2 output into degrees of arc. Obviously, on the basis of those numbers, it is also possible to adjust the gain parameter of the system. The gain parameter controls the amount of light emitted by the transmitters. When the gain is high the system is more sensitive to eye movements. That parameter should be set up in such a way that the AD converter can measure the expected extreme positions of the eye without saturation. These positions should also give values near the boundaries of the converter range to lower the measurement error.

However, one of the main ideas of the proposed human identification technique is to make it as convenient as possible for potential users, to ensure sufficient acceptability of the method. Therefore, a single test has to be easy to perform. In order to provide over 1000 tests on almost 50 people, one should also make sure that the test is fast. A method in which the device has to be calibrated for 5 minutes to acquire one 8-second sample was not taken into account, because it is too inconvenient for users. In fact many tests were performed by the users themselves without any assistance.

Fortunately, to find features useful for human identification we seek only characteristic properties of eye movements and correlations between the stimulation and the eye movements. Therefore, we do not need exact information about the relative angle of the eye position. Instead, we can use the raw values obtained from the OBER2 AD converter without information about the scale of the eye movements. Hence, almost all calibrations were omitted during the tests. The only problem that had to be solved immediately during the test was the problem of AD converter saturation. If the goggles are too close to the eye, the changes in the amount of light reflected by the cornea exceed the capacity of the 12-bit converter and the values go below 0 or above 4095. As can be seen in Fig. 5.3, a lot of information which should be provided by the test is lost. The result is therefore impossible to analyze.

Fig. 5.3 Example of a badly acquired sample: left eye, horizontal position (LX). The gain was too high and the signal caused AD converter saturation.

To avoid that situation, the user was informed during the test (with a beep signal) that the measured values exceeded the allowed range. Then the gain was lowered and the test was repeated.

5.2 Sample normalization

As all calibrations were omitted, the values obtained from the OBER2 system during the tests may sometimes be disturbed. That is why it was necessary to preprocess the gathered samples with a special normalization procedure. That procedure should adjust the samples to the same range and reject the outliers for which such a correction is impossible.


For instance, although the tested persons were instructed to avoid blinks during the test, an eye blink is often impossible to suppress. The blink obviously influences the gathered data. As there are methods of blink recognition in eye movement data [43][31], three possibilities could be chosen:

- Reject all samples with blinks.

- Filter out blinks using the methods described in [43].

- Leave blinks unchanged.

Rejecting all 'blinked' samples is impractical, as there are people who cannot abstain from blinking for eight seconds. On the other hand, while filtering blinks out of the signal, some information useful for identification may be lost (especially information about eye oscillations may be disturbed). Leaving blinks unchanged seems to be the safest solution. However, sometimes the number of blinks in the signal is so high that it is impossible to analyze the signal itself. Samples of that kind (with a number of blinks disturbing the analyses) were rejected with the procedure described below.

Moreover, the tested person may lose attention during the test because of external circumstances (somebody else entering the room or a sudden noise). The identification system should also be able to recognize such badly acquired samples. The pre-processing should solve the problem of badly acquired samples. But even properly acquired samples have – because of the lack of system calibration – different amplitudes (Fig. 5.4). To produce a vector of attributes that can be compared with other attribute vectors – obtained during different tests – all samples had to be normalized.

(A)

(B)

Fig. 5.4 Two graphs presenting the left eye horizontal reaction (LX). The presented samples were obtained from two different persons, (A) and (B), for the same stimulation. The degree of inclination differs significantly.


The normalization procedure may be described by the following steps:
1) Find all fixations in the signal.
2) Pair the fixations with required fixation locations (places where the stimulation point flashes at that time). The required fixation locations are referred to as RFLs.
3) Calculate the average value for each RFL, taking into account the values of the fixations measured during that RFL.
4) Recalculate all values to a different range.
All calibrations were performed for each signal (like LX or RY) separately, using only information from the corresponding stimulation signal (SX or SY). All procedures presented in this section are visualized using the LX signals presented in Fig. 5.4, referred to as signals (A) and (B).

5.2.1 Finding fixations

There are a lot of methods for finding fixations and saccades. Salvucci et al. [82], in an effort to create a taxonomy of those methods, named five types:

Velocity-Threshold Identification
The method separates fixation and saccade points based on their point-to-point velocities. The velocity profiles of saccadic eye movements show essentially two distributions of velocities: low velocities for fixations and high velocities for saccades. The only problem of the methodology is the correct assignment of the threshold.

HMM Identification
Hidden Markov model fixation identification uses probabilistic analysis to determine the most likely identifications for a given protocol. Hidden Markov models (HMMs) are probabilistic finite state machines. This methodology uses a two-state HMM in which the states represent the velocity distributions for saccade and fixation points. This probabilistic representation helps to perform more robust identification than a velocity-threshold method.

Dispersion-Threshold Identification
In contrast to velocity-based identification and HMM identification, dispersion-threshold identification utilizes the fact that fixation points, because of their low velocity, tend to cluster closely together. The method identifies fixations as groups of consecutive points within a particular dispersion, or maximum separation. The algorithm takes two thresholds: maximum dispersion and minimum duration.


MST Identification
MST identification is based on minimum spanning trees (MSTs) — that is, a tree connecting a set of points in such a way that the total length of the tree's line segments is minimized. MSTs can provide a highly flexible and controllable representation for dispersion-based fixation identification.

Area-based Algorithms
The four previous identification methods can identify fixations at any location in the visual field. In contrast, area-based fixation identification identifies only fixations that occur within specified target areas. Because it may be applied only after proper calibration, it is useless in our application.

The presented methodologies have different accuracy and robustness. Ease of implementation is also an important property. Because dispersion-threshold algorithms seem to be among the most accurate and robust and are easy to implement [82], one of them was used in this work. Namely, it was the algorithm proposed by Augustyniak et al. [2]. It tries to find intervals of sufficient duration for which the points' approximation is flat enough, with an error smaller than a specified threshold. It takes three parameters: the minimal length of a fixation, the maximal slope of the interval approximation and the threshold of the maximal approximation error. As the authors did not give their suggestions for these parameters, all three values had to be assigned empirically.
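For illustration, a minimal sketch of the classic dispersion-threshold (I-DT) variant is given below. The thesis itself uses the interval-approximation algorithm of Augustyniak et al. with its three empirical parameters, so this code only shows the general idea of the dispersion-based family; the function name and parameters are illustrative.

def idt_fixations(x, y, max_dispersion, min_samples):
    """Classic I-DT fixation detection sketch (not the exact algorithm used here).

    x, y -- sequences of horizontal and vertical eye positions
    Returns a list of (start, end) index pairs, one per detected fixation.
    """
    def dispersion(i, j):
        wx, wy = x[i:j], y[i:j]
        return (max(wx) - min(wx)) + (max(wy) - min(wy))

    fixations, i, n = [], 0, len(x)
    while i <= n - min_samples:
        j = i + min_samples
        if dispersion(i, j) <= max_dispersion:
            # Grow the window while the points still cluster closely enough.
            while j < n and dispersion(i, j + 1) <= max_dispersion:
                j += 1
            fixations.append((i, j))      # fixation spans samples i..j-1
            i = j
        else:
            i += 1
    return fixations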

Fig. 5.5 Signal (A) from Fig. 5.4 with detected fixations (marked with the thick lines).


5.2.2 Pairing fixations with required fixation locations

After calculating the fixations, the next step was to decide whether a fixation was connected with any RFL (Required Fixation Location) and, if so, to assign it to that RFL. The main problem in the fixation assignment was how to avoid incorrect classifications of fixations. Several sophisticated heuristic algorithms were tested, but every methodology resulted in many misclassifications. So one of the simplest algorithms was developed:

• Accept every fixation that lasts during only one RFL and assign it to that RFL.

• Accept every fixation that lasts during two or more different RFLs if the duration of the fixation in one of those RFLs is longer than 75% of the overall fixation length and longer than 150 ms.

• Do not take into account fixations not fulfilling the previous conditions.

The main drawback of this algorithm was that it did not take into account fixations which sometimes could have been used (from the point of view of a human looking at the signal), and therefore less data was available for further analyses. The advantage was that the algorithm made mistakes very rarely (one such case will be described below). The algorithm assigns fixations to three different levels, visualized as three different values on the graph. The next procedure was averaging all values in the fixations assigned to each RFL, giving the average levels L0, L1 and L2.
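The assignment rules translate directly into code. The sketch below is a hypothetical rendering of the three rules, assuming a 250 Hz sampling rate and a helper rfl_at that returns the RFL active at a given sample index.

def assign_fixations(fixations, rfl_at, sample_rate=250):
    """Assign detected fixations to RFLs using the rules above (sketch).

    fixations -- list of (start, end) sample-index pairs
    rfl_at    -- function: sample index -> identifier of the active RFL
    """
    min_len = int(0.150 * sample_rate)          # 150 ms expressed in samples
    assigned = []
    for start, end in fixations:
        rfls = [rfl_at(i) for i in range(start, end)]
        counts = {r: rfls.count(r) for r in set(rfls)}
        best_rfl = max(counts, key=counts.get)
        if len(counts) == 1:                    # rule 1: only one RFL involved
            assigned.append((start, end, best_rfl))
        elif counts[best_rfl] > 0.75 * len(rfls) and counts[best_rfl] > min_len:
            assigned.append((start, end, best_rfl))   # rule 2
        # rule 3: otherwise the fixation is ignored
    return assigned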

Fig. 5.6 Signal (A) and its averaged levels.


5.2.3 Recalculating all values into a new range

The previous step gave approximated values of the averaged RFL levels. For each signal three values were calculated: L0 – the level for the left (or upper, for vertical signals) position, L1 – the level for the neutral position, L2 – the level for the right (or bottom, for vertical signals) position. These values may be used for calibration of the input signal. If the signal was approximated correctly, the distances between the levels would be equal, i.e.:

L0 – L1 = L1 – L2    Eq. 5.2

However, it is possible that a level may sometimes be approximated incorrectly. It may happen when, for instance, one of the fixations is assigned to the wrong RFL, as may be seen in Fig. 5.7 for signal (B). The person being tested predicted the stimulation change and moved their eyes, fixating on the middle position of the screen before the stimulation point changed its position (point t0). When the stimulation point actually moved to the middle, the person's eyes adjusted their position, causing a new fixation. Unfortunately, the first fixation happened during the left position of the stimulation point and was assigned to the left level. Such an incorrect assignment causes an incorrect calculation of the corresponding level, as can be seen in Fig. 5.8.

Fig. 5.7 Signal (B) with detected fixations. The arrow points to a fixation incorrectly assigned to the upper level; t0 marks the moment of the predicted stimulation change. The SX signal (out of scale) is added as the reference.


Fig. 5.8 Signal (B) and its averaged conversion with a fixation assigned to the wrong level. This wrong assignment causes the upper level to be approximated too low.

Because of that possibility, the algorithm always rejected the least reliable level and used the two others to recalculate it. It was assumed that the most reliable level was the one calculated from the most points. In the example above, the upper level L0 was calculated from only two short fixations, so it was automatically rejected. The new value of that level was calculated according to Eq. 5.2 as:

L0 = 2 * L1 – L2    Eq. 5.3

The result of the recalculation of the upper level L0 is presented in Fig. 5.9.

Fig. 5.9 The same signal (B) as in Fig. 5.8, but the upper level has been rejected and calculated again using the values of the two other levels. Now the upper level value seems to be more reliable.


After calculating the three level values, the next step was the recalculation of the original signal into the new boundaries. The formula is straightforward:

$x_i^{new} = L_0^{new} + \frac{dL^{new}}{dL^{old}}\left(x_i^{old} - L_0^{old}\right)$    Eq. 5.4

where:

$dL^{new} = L_2^{new} - L_0^{new}$
$dL^{old} = L_2^{old} - L_0^{old}$

The new values for L0 and L2 were set to –1000 and +1000 respectively. After that recalculation the signal value may be interpreted as:

- zero for eyes looking straight ahead,

- above zero when the eyes are looking right (for a horizontal signal) or down (for a vertical signal),

- below zero when the eyes are looking left (for a horizontal signal) or up (for a vertical signal).

When the eyes are looking directly at a stimulation point on the boundaries of the screen, the corresponding values should be close to +1000 or –1000 respectively. As can be seen in Fig. 5.10, it is now possible to directly compare two different signals.

(A)

(B)

Fig. 5.10 Signals (A) and (B) presented in Fig. 5.4 after normalization.
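Eqs. 5.2-5.4 together give a short normalization routine. The sketch below is illustrative only and assumes the levels have already been estimated from the fixations, with the least reliable one recomputed as in Eq. 5.3.

def normalize_signal(signal, l0, l2, new_l0=-1000.0, new_l2=1000.0):
    """Rescale a raw signal into the (-1000, +1000) range (Eq. 5.4 sketch).

    l0, l2 -- averaged extreme levels estimated from fixations; a rejected
    level is first recomputed from the two others, e.g. l0 = 2*l1 - l2 (Eq. 5.3).
    """
    scale = (new_l2 - new_l0) / (l2 - l0)       # dL_new / dL_old
    return [new_l0 + scale * (x - l0) for x in signal]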

Having the fixations and their assignments to RFLs, several interesting attributes may be computed for each signal, like for example the average in-fixation variance (deviation) or the average between-fixation variance (deviation) for each RFL. It is also possible to compare these values between the eyes, for instance to point out the dominant eye [3].


The normalization method presented above has one serious drawback: when multiplying the signal we are multiplying all device errors as well. But that method was the only way to combine two opposite desired properties: convenience for users and valuable information. For some samples the normalization procedure described above failed, because it was impossible to calculate the level values. It might have been caused by a complete lack of reaction of the examined person's eye or by too high deviations during fixations (for instance caused by too many blinks). Such samples were deleted from the dataset. The problem appeared in about 6% of all gathered samples.

5.3 Calculating different eye movement properties

Having a sample of the eye position signal, it is possible to calculate different aspects of the eyes' movements. Some of them were described in the previous step; others may be computed after calibration. Each of the methods described in this section produces a so-called vector of attributes using information from the specified sample.


Fig. 5.11 A sample contains the source information for producing different vectors of attributes.

There are of course a lot of possible attributes that may be calculated. The ones used in this work are the following:

- average velocity direction calculation,

- distance to stimulation,

- eye difference,

- Discrete Fourier Transform,

- Discrete Wavelet Transform.


5.3.1 Average velocity direction calculation

With values for the vertical and horizontal eye locations it is possible to calculate point-to-point velocities of the eye movements. The velocities are calculated for each eye independently with the differential formula:

$vel_i = \sqrt{(x_{i+1} - x_i)^2 + (y_{i+1} - y_i)^2}$    Eq. 5.5

where $x_i$ and $y_i$ are the horizontal and vertical positions of the eye (LXi and LYi for the left eye, RXi and RYi for the right eye).

Fig. 5.12 Example of velocity vector calculated for left eye (using LX and LY signals)

Using the same signals it is also possible to calculate the eye movement direction at a specified moment of time:

$\alpha_i = \arccos\left(\frac{x_{i+1} - x_i}{\sqrt{(x_{i+1} - x_i)^2 + (y_{i+1} - y_i)^2}}\right) \cdot \mathrm{sgn}(y_{i+1} - y_i)$    Eq. 5.6

The value of this function is in the range (–π, π⟩. Having pairs of velocity and angle values, it was possible to calculate the average eye velocity in each direction:

$signal(\beta) = \frac{\sum_{i:\,\alpha_i = \beta} vel_i}{\left|\{\,i : \alpha_i = \beta\,\}\right|}$    Eq. 5.7

To obtain discrete values, all calculations were done for intervals of π/8, which gives 16 different values for the whole range (0 stands for movement up, π/2 for right and so on). Fig. 5.13 presents graphs of the average velocities in different directions for two samples acquired from different persons; the X-axis is the direction in radians and the Y-axis is the average velocity. Fig. 5.14 presents the same information on radar graphs.
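A compact NumPy sketch of Eqs. 5.5-5.7 follows. It uses arctan2, which is equivalent to the arccos-with-sign formula of Eq. 5.6, so the bin numbering differs from the convention above only by a fixed rotation; the function name is illustrative.

import numpy as np

def average_velocity_by_direction(x, y, n_bins=16):
    """Average point-to-point velocity in n_bins directions (Eqs. 5.5-5.7 sketch)."""
    dx, dy = np.diff(x), np.diff(y)
    vel = np.sqrt(dx ** 2 + dy ** 2)            # Eq. 5.5
    angle = np.arctan2(dy, dx)                  # direction, range (-pi, pi]
    bins = ((angle + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
    return np.array([vel[bins == b].mean() if np.any(bins == b) else 0.0
                     for b in range(n_bins)])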


(A)

(B)

Fig. 5.13 Average velocities of the left eye in 16 different directions for two samples taken from two different persons. The X-axis is the direction in radians and the Y-axis is the average velocity.

(A)

(B)

Fig. 5.14 Radar graphs of average velocities of left eye in 16 different directions for two different samples taken from two different persons.

5.3.2 Eye distance

When the signals from both eyes are normalized, it is possible to calculate the distance between the gaze points of both eyes. This information may reveal how similar the movements of the two eyes are. The distance is calculated with the classic formula:

$ED_i = \sqrt{(RX_i - LX_i)^2 + (RY_i - LY_i)^2}$    Eq. 5.8

Fig. 5.15 presents a graph of the distances between the eyes' positions registered during the stimulation. As can be seen, the eyes are moving differently.


Fig. 5.15 Absolute distance between the eyes' gaze points at successive moments of time.

5.3.3 Distance to stimulation

After calibrating the signals into the (–1000, +1000) range, the stimulation signal may also be recalculated to the same range, with 0 for the middle point and +/–1000 for the boundary points. After that we can compute the distance from the eye position to the stimulation point as:

LXstd = LX – SX
RXstd = RX – SX
LYstd = LY – SY
RYstd = RY – SY    Eq. 5.9

Such a signal reveals latencies and reaction times. As can be seen in Fig. 5.16, when the stimulation point location changes, the absolute value of the signal increases until the eye reacts to the stimulation.

Fig. 5.16 Difference of the LX eye signal from the required fixation location. The SX signal below (out of scale) is added as the reference.
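Both Eq. 5.8 and Eq. 5.9 are one-liners once the signals are normalized; a minimal sketch, assuming NumPy arrays of equal length and illustrative function names:

import numpy as np

def eye_distance(lx, ly, rx, ry):
    """Distance between left- and right-eye gaze points (Eq. 5.8)."""
    return np.sqrt((np.asarray(rx) - lx) ** 2 + (np.asarray(ry) - ly) ** 2)

def distance_to_stimulation(eye, stim):
    """Signed difference between an eye signal and the stimulation (Eq. 5.9)."""
    return np.asarray(eye) - np.asarray(stim)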


5.3.4 Discrete Fourier Transform

All tests were registered with a frequency of 250 Hz. It turned out that this might reveal some information about eye micro-movements. Differences in the eye oscillations of different persons were visible even without any special analyzing tools. As all tests were performed in the same place using the same equipment, the oscillations had to be caused by the eye muscles and not by external interference.

signal (A)

signal (B)

Fig. 5.17 Comparison of the normalized signals (A) and (B) presented above in Fig. 5.10. The magnification reveals differences in the frequency characteristics.


As can be seen in Fig. 5.17, the frequencies of the signals may vary significantly. Different frequency patterns are clearly visible. Therefore, it seemed useful to extract information about the signals' frequencies. The standard way to do so is the Discrete Fourier Transform (DFT):

$F_n = \sum_{k=0}^{N-1} f_k \, e^{-2\pi i n k / N}$    Eq. 5.10

Fig. 5.18 presents the spectra of both signals presented in Fig. 5.10 and zoomed in Fig. 5.17. As can be seen, they are evidently different, with the left signal showing more information in the higher frequency range.

(A)

(B)

Fig. 5.18 Fourier spectra of signals (A) and (B) presented in Fig. 5.10.
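Spectra like those in Fig. 5.18 can be reproduced with a standard FFT routine; a minimal sketch, assuming NumPy and the 250 Hz sampling rate used here:

import numpy as np

def amplitude_spectrum(signal, sample_rate=250):
    """One-sided amplitude spectrum of an eye-movement signal (Eq. 5.10)."""
    spectrum = np.fft.rfft(signal)                            # DFT, Eq. 5.10
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return freqs, np.abs(spectrum)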

5.3.5 Wavelet Transform

The Fourier Transform reveals information about the frequencies existing in the whole signal. When the signal is not stationary, i.e. its frequency changes in time, the information about those changes is lost. To extract information about the placement of specific frequencies in time, the signal has to be converted into a two-dimensional signal whose two axes show frequency and time. One of the methods is the Short Term Fourier Transform (STFT). In the STFT, the signal is divided into segments small enough that each segment of the signal can be assumed to be stationary. For this purpose, a window function "w" is chosen. The width of this window must be equal to the segment of the signal where it is stationary.

The main problem with the STFT is how to choose the window width. According to the Heisenberg uncertainty principle we cannot find one best width. When the window is narrow we have very precise information about time, but we lose information about lower frequencies. When the window is wider we can extract information about lower frequencies, but we lose time resolution. The Continuous Wavelet Transform (CWT) was developed as an alternative to the Short Term Fourier Transform to overcome the resolution problem [93]. The wavelet analysis is done in a similar way to the STFT analysis, in the sense that the signal is multiplied by a function (called the wavelet), similar to the window function in the STFT, and the transform is computed separately for different segments of the time-domain signal. However, the main difference between the STFT and the CWT is that the width of the window changes as the transform is computed for every single spectral component [75]. The general formula for the Wavelet Transform is shown in Eq. 5.11.

$CWT_x^{\varphi}(\tau, s) = \Psi(\tau, s) = \frac{1}{\sqrt{|s|}} \int x(t)\, \varphi^{*}\!\left(\frac{t - \tau}{s}\right) dt$    Eq. 5.11

where:
x(t) – input signal
s – scale
τ – translation
φ(t) – mother wavelet function

As can be seen from Eq. 5.11, the continuous wavelet is a function of two variables: scale and translation. For each scale-translation pair the input signal is multiplied by the mother wavelet function, translated and scaled. The scale parameter ensures that for higher frequencies the time area over which the wavelet function is computed is short (giving good time resolution), while for lower frequencies the time area is wider. The Discrete Wavelet Transform (DWT) employs the fact that, having N discrete samples of the signal, we can compute frequencies up to N/2. So, after using a subsampling algorithm (not described here), N samples of a discrete signal may be described with N wavelet coefficients [93][75].


Fig. 5.19 Discrete wavelet transform of LX signal (using Daub4 mother wavelet).
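A decomposition like the one in Fig. 5.19 can be sketched with the PyWavelets package (an assumption; the thesis does not name its implementation):

import pywt  # PyWavelets, assumed available

def dwt_attributes(signal, wavelet='db4'):
    """Full multilevel DWT of a signal with the Daubechies-4 mother wavelet.

    Returns the concatenated approximation and detail coefficients,
    which may serve directly as a vector of attributes.
    """
    coeffs = pywt.wavedec(signal, wavelet)   # [cA_n, cD_n, ..., cD_1]
    return [c for level in coeffs for c in level]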

5.4 Conclusions

The functions presented in this section use a raw sample obtained directly from the OBER2 system to calculate different vectors of attributes based on different properties of the sample. There are of course a lot of other properties that could be extracted from the sample; only the most "promising" ones were chosen in this dissertation. Each transformation produces one vector of attributes used in the further identification process.


6 Minimization of attribute vectors

In section 5 each sample was pre-processed and converted into several vectors of attributes. Most of these attributes are completely useless for classification. Such irrelevant attributes not only make the classification a more complex and time-consuming task, but can also disturb it and increase classification errors. The aim of the algorithms presented in this section is the extraction of relevant attributes from the attribute vectors produced in section 5. As relevancy can be estimated only with a set of vectors from different samples, all methods described here in fact convert a set of attribute vectors (later called a dataset) taken from different samples into another set of vectors containing newly calculated – hopefully more relevant – attributes.


Fig. 6.1 Data conversions schema.

As can be seen in Fig. 6.1, the dataset is formed from vectors of the same type taken from different samples. The result of the dataset minimization is a new dataset with the same number of vectors but with different vector attributes. The number of attributes in the new vectors is always lower. Assuming that we have a set of attribute vectors, we can check whether the attributes are relevant to the classification. Each vector X consists of n attributes X1…Xn and is accompanied by a label (or class) Y=y. One of the most obvious definitions of relevancy may be: attribute Xi is relevant if and only if there exists a value of that attribute Xi=xi for which p(Xi=xi)>0 and there exists a label Y=y such that:

p(Y = y | Xi = xi) ≠ p(Y = y)    Eq. 6.1

Under this definition Xi is relevant if knowing its value can change the estimates for Y or, in other words, if Y is conditionally dependent on Xi [45]. However, many other definitions are possible [7][45].


The definition above only answers the question whether the attribute is relevant at all. But we are interested in estimating the degree of relevancy, to choose the best attributes. There are a lot of possible algorithms. One of the simplest, called MinChange, is proposed in the next section.

6.1 Relevancy estimation

One way to estimate the relevancy of an attribute may be to count the number of class changes for samples sorted by this attribute's value. For instance, for three example attributes the classification could be as follows:

Value: 1  17  72  219  295  299  312  417  428  472  962  995
CLASS: 0   0   1    1    1    0    1    1    0    0    0    0
(4 changes)

Value: 1  17  72  219  295  299  312  417  428  472  962  995
CLASS: 0   0   1    1    1    1    1    1    0    0    0    0
(2 changes)

Value: 1  17  72  219  295  299  312  417  428  472  962  995
CLASS: 0   0   0    0    0    0    1    1    1    1    1    1
(1 change)

Taking into account the number of changes, the best attribute is the last one, because the class assignment of a sample may be stated with only one condition: if (attribute_value > 299) then class = 1.

However, the main drawback of this method is that it completely ignores dependencies between attributes. If several strongly relevant attributes are highly correlated, the algorithm unnecessarily selects them all and (if the total number of attributes to select is fixed) this may cause the rejection of other weaker – relevant but not correlated – attributes.
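The MinChange estimate itself takes only a few lines of code; a minimal sketch (the function name follows the text, tie handling between equal attribute values is ignored):

def min_change(values, classes):
    """MinChange relevancy: class changes after sorting samples by the attribute.

    Fewer changes means the attribute separates the classes better.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [classes[i] for i in order]
    return sum(1 for a, b in zip(labels, labels[1:]) if a != b)

# For the three example attributes above the function returns 4, 2 and 1.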

6.2 Linear conversions

The problem of correlations between attributes may be solved by using algorithms calculating a linear conversion of the dataset of vectors. That linear conversion may be defined as:

$Z_{mk} = A_{mn} X_{nk}$    Eq. 6.2

In the equation above, Xnk represents an input dataset consisting of k probes with n attributes each. The matrix Amn converts the dataset into another one, Zmk, consisting of k probes with m attributes each. The value of m is often (but not always) less than n. What is important, after calculating the Amn matrix for the Xnk dataset, it is possible to use it for recalculating every new n-attribute sample into a new m-dimensional one.

6.2.1 Principal Component Analysis The goal of Principal Component Analysis (called often also Karhunnen-Loeve transform or Hotelling transform) is to explain as much variance as possible with the smallest number of variables [10]. The assumption is made that attributes have a normal distribution, so all information about correlations between attributes is contained in the covariance matrix. The method creates a new dataset that should maintain as much of the original data structure as possible. The classic algorithm calculates eigenvectors and eigenvalues of the covariance matrix. These eigenvectors correspond to the directions of the principal components of the original data, their statistical significance is given by their corresponding eigenvalues. The algorithm may be presented as [10]: 1) Collect xi of an k dimensional dataset X, i=1...n 2) Mean correct all the points: calculate the mean x and substract it from each data point xi − x 3) Calculate the covariance matrix C.

C_{i,j} = (x_i - \bar{x})(x_j - \bar{x})
Eq. 6.3

4) Determine the eigenvalues and eigenvectors of the matrix. C is a real symmetric matrix, so a positive real number λ and a nonzero vector α can be found such that:

C\alpha = \lambda\alpha
Eq. 6.4

where λ is an eigenvalue and α is an eigenvector of C. To find a nonzero α, the characteristic equation \lvert C - \lambda I \rvert = 0 must be solved. If C is an n × n matrix of full rank, n eigenvalues λ1, …, λn can be found. Using (C - \lambda I)\alpha = 0, all the corresponding eigenvectors can be found.

5) Sort the eigenvalues and the corresponding eigenvectors in descending order.
6) Use the first m (m ≤ n) eigenvectors with the highest eigenvalues to create the conversion matrix.
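Steps 1–6 may be sketched in Java as follows. The sketch uses the Apache Commons Math library for the covariance matrix and the eigendecomposition; the library choice is an assumption made for illustration only and is not part of the original toolset.

import java.util.Arrays;
import java.util.Comparator;
import org.apache.commons.math3.linear.Array2DRowRealMatrix;
import org.apache.commons.math3.linear.EigenDecomposition;
import org.apache.commons.math3.linear.RealMatrix;
import org.apache.commons.math3.stat.correlation.Covariance;

public class Pca {

    // Builds the m x n conversion matrix A of Eq. 6.2 from the m eigenvectors
    // of the covariance matrix with the highest eigenvalues (steps 2-6).
    static RealMatrix conversionMatrix(double[][] data, int m) {
        RealMatrix cov = new Covariance(data).getCovarianceMatrix(); // steps 2-3
        EigenDecomposition eig = new EigenDecomposition(cov);        // step 4
        double[] values = eig.getRealEigenvalues();
        Integer[] order = new Integer[values.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, Comparator.comparingDouble(i -> -values[i])); // step 5
        RealMatrix a = new Array2DRowRealMatrix(m, cov.getColumnDimension());
        for (int i = 0; i < m; i++)
            a.setRowVector(i, eig.getEigenvector(order[i]));         // step 6
        return a;
    }

    public static void main(String[] args) {
        double[][] data = {{2.5, 2.4}, {0.5, 0.7}, {2.2, 2.9}, {1.9, 2.2}, {3.1, 3.0}};
        System.out.println(conversionMatrix(data, 1)); // one principal direction
    }
}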

\mathrm{DIV}(d, x_i, A) = \frac{N_{x \le A}}{N} H(d_{x \le A}) + \frac{N_{x > A}}{N} H(d_{x > A})
Eq. 7.8

where N_{x≤A} and N_{x>A} are the numbers of samples in the two subsets. The gain of the division may be calculated as:

\mathrm{GAIN}(d, x_i, A) = H(d) - \mathrm{DIV}(d, x_i, A)
Eq. 7.9

After calculating the GAIN values for all possible splits of all attributes, the one with the highest GAIN value is chosen. The dataset is then divided by this value into two subsets (nodes of the tree), and the same procedure is repeated for all new nodes; the splitting continues until a decision tree is built. The original C45 algorithm stops the divisions when all bottom nodes (leaves) contain samples from only one class (so the GAIN of every further division is equal to zero). Such a tree does not generalize the data, so the so-called pruning of the tree is performed. The pruning algorithm checks whether it is worth replacing a subtree of nodes with a single node, based on some criterion. The criterion used in the C45 algorithm first calculates a pessimistic error estimate for a node. Given a particular confidence c, we can find confidence limits z such that [96]:

P\left[ \frac{f - q}{\sqrt{q(1-q)/N}} > z \right] = c
Eq. 7.10

where f is the error probability for the node estimated from the distribution of samples in the node, q is the unknown true error probability and N is the number of samples in the node. Having that, after solving a quadratic equation, we can calculate confidence limits for the true error rate at the specified confidence. In particular, we are interested in a formula for the pessimistic estimate of the error rate e:

e = \frac{f + \frac{z^2}{2N} + z \sqrt{\frac{f}{N} - \frac{f^2}{N} + \frac{z^2}{4N^2}}}{1 + \frac{z^2}{N}}
Eq. 7.11

where z is the number of standard deviations corresponding to the confidence c. For the value c = 25% used in the C45 algorithm, z = 0.69.


If the pessimistic error estimate for a node is lower than the weighted sum of the pessimistic error estimates of its sub-nodes, the tree is pruned: the sub-nodes are removed and the samples from those sub-nodes are included in the higher node. When the tree is ready, it is possible to classify new samples. The algorithm searches the tree for the bottom node (leaf) whose conditions the new sample fulfils. The classification of the sample is the distribution of class assignments in that node.
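Eq. 7.11 translates directly into code. The following Java sketch is a plain transcription of the formula; the node counts in the example are hypothetical.

public class PessimisticError {

    // f - error rate observed in the node, n - number of samples in the node,
    // z - number of standard deviations for the chosen confidence (0.69 for c = 25%).
    static double estimate(double f, double n, double z) {
        double z2 = z * z;
        double numerator = f + z2 / (2 * n)
                + z * Math.sqrt(f / n - f * f / n + z2 / (4 * n * n));
        return numerator / (1 + z2 / n);   // Eq. 7.11
    }

    public static void main(String[] args) {
        // A hypothetical node with 14 samples, 5 of them misclassified:
        System.out.println(estimate(5.0 / 14.0, 14, 0.69)); // about 0.45
    }
}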

7.5 Support Vector Machines

The Support Vector Machines (SVM) method was originally presented by Vapnik [91]. The basis of the method is finding a hyperplane that separates both classes in the attribute space with the largest margin. That hyperplane becomes a decision function and may be used in classification according to the formula [9]:

f(X) = \operatorname{sgn}\left( \sum_i \alpha_i y_i \langle X_i, X \rangle + b \right)
Eq. 7.12

where:
- i – index of a sample in the train-set,
- X – the vector of attributes (sample) being classified,
- Xi – the vectors of attributes (samples) from the train-set,
- yi – the classifications of samples from the train-set (−1 or 1),
- αi, b – parameters calculated by the algorithm.

The αi parameters may be treated as weights of each sample from the training set. These parameters are in fact Lagrange multipliers [5] calculated by maximizing the expression:

\sum_i \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \langle X_i, X_j \rangle
Eq. 7.13

subject to:

\sum_i \alpha_i y_i = 0, \quad \alpha_i \ge 0
Eq. 7.14


A remarkable property of this representation is that the data appears in the training problem only in the form of dot products. Thus, if we map the data to some other Euclidean space using a function Φ and find a function K such that:

K(X_i, X_j) = \langle \Phi(X_i), \Phi(X_j) \rangle
Eq. 7.15

we can replace the dot products in Eq. 7.12 and Eq. 7.13 with this function. This allows generalizing the problem to a non-linear decision function. The function K is called a kernel function. Such a function must fulfil Mercer's conditions [9]. It is important to mention that, since only the kernel function K is needed in the training algorithm, the function Φ never has to be known explicitly. Several kernel functions are used in the literature, including the polynomial, radial basis and sigmoid kernels [30]. The Gaussian radial basis function was chosen for the research presented here:

K(X_i, X_j) = \exp\left( - \frac{\lVert X_i - X_j \rVert^2}{2\sigma^2} \right)
Eq. 7.16

Training a support vector machine requires the solution of a very large quadratic programming optimization problem. However, only a subset of the points (samples) is associated with a non-zero αi. These points are called support vectors and are the points that lie closest to the separating hyperplane. The sparseness of the vector of αi has several computational consequences exploited in the algorithms developed for support vector learning [74]. One of the fastest algorithms was introduced by Platt in 1998 [73]. It is called Sequential Minimal Optimization (SMO) and is based on decomposing the optimization problem into the smallest possible sub-problems. For the standard SVM problem (Eq. 7.13) the smallest possible optimization problem involves two Lagrange multipliers, because the Lagrange multipliers must obey a linear equality constraint (Eq. 7.14). At every step SMO chooses two Lagrange multipliers to jointly optimize, finds the optimal values, and updates the SVM to reflect them [73]. Sometimes perfect classification of all samples in the train-set results in data overfitting (because of outliers), or it is impossible altogether. To avoid that, the constraints may be relaxed to allow misclassifications. It may be proved [9] that this can be done by introducing an upper bound C on all αi parameters:


0 \le \alpha_i \le C
Eq. 7.17

When the value of C is lowered, there are more misclassifications in the train-set, but the classification model becomes more general. The SVM method is widely used and is considered very accurate even on very "difficult" data. Its usefulness has been confirmed once again in this dissertation. Although it starts from a different point of view, the methodology gives results very similar to those of neural network methods.
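The decision function of Eq. 7.12 with the kernel of Eq. 7.16 may be sketched in Java as follows. The support vectors and the values of αi, b and σ are assumed to come from an already trained model (for instance one produced by an SMO implementation); the numbers in the example are hypothetical.

public class SvmDecision {

    // Gaussian radial basis kernel of Eq. 7.16.
    static double rbf(double[] xi, double[] xj, double sigma) {
        double dist2 = 0;
        for (int k = 0; k < xi.length; k++) {
            double d = xi[k] - xj[k];
            dist2 += d * d;
        }
        return Math.exp(-dist2 / (2 * sigma * sigma));
    }

    // Decision function of Eq. 7.12; only the support vectors (samples with
    // non-zero alpha) have to be stored in the model.
    static int classify(double[][] sv, int[] y, double[] alpha,
                        double b, double sigma, double[] x) {
        double sum = b;
        for (int i = 0; i < sv.length; i++)
            sum += alpha[i] * y[i] * rbf(sv[i], x, sigma);
        return sum >= 0 ? 1 : -1;
    }

    public static void main(String[] args) {
        double[][] sv = {{0.0, 0.0}, {1.0, 1.0}};
        int[] y = {-1, 1};
        double[] alpha = {1.0, 1.0};
        System.out.println(classify(sv, y, alpha, 0.0, 0.5, new double[]{0.9, 0.9})); // 1
    }
}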

7.6 Ensemble classifiers

The idea is intuitive and may be compared with the following 'real life' example: several 'experts' independently answer the same question. When the experts' answers differ, we must decide ourselves which answer is right, using the information about each expert's opinion (or vote). If we do not have any information about the experts' skills, we choose the answer with the majority of votes. If we trust some experts more than others, we can weight each expert's vote. When considering the same example in a classification task, the experts are replaced by different classifiers and the idea of voting (or weighted voting) remains the same. A classifier, in the sense used here, is a function that judges the classification of a sample given as its parameter. Classifiers may differ in three ways:

• Different learning samples in the train-set.
• Different attribute values for the same samples (obtained by conversions).
• Different classification algorithms used.

7.6.1 Bagging

An example of a method that creates different classifiers by changing the samples in the train-set used for classification model creation is bagging (bootstrap aggregating) [8]. Having a set of N samples, it creates different train-sets by random sampling of that set with replacement. The resulting train-sets consist of N samples, but because of the sampling with replacement some samples from the original set are missing and some are represented more than once. The classifiers are built using the same classification algorithm but different train-sets, and then vote for the class to be predicted. The train-sets used by the classifiers are certainly not independent, because they are all based on one dataset. However, it turns out that bagging produces a combined model that often performs significantly better than the single model built from the original training data, and is never substantially worse [96].
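The bootstrap sampling step may be sketched in Java as follows; the training of the individual classifiers is omitted, as it depends on the base algorithm.

import java.util.Arrays;
import java.util.Random;

public class Bagging {

    // Draws t train-sets of n sample indices each by sampling with replacement,
    // so some original samples repeat and some are missing in each set.
    static int[][] bootstrap(int n, int t, Random rnd) {
        int[][] sets = new int[t][n];
        for (int i = 0; i < t; i++)
            for (int j = 0; j < n; j++)
                sets[i][j] = rnd.nextInt(n);
        return sets;
    }

    public static void main(String[] args) {
        int[][] sets = bootstrap(10, 3, new Random(42));
        System.out.println(Arrays.toString(sets[0]));
    }
}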

7.6.2 Boosting

Boosting [28] is based on the idea of building subsequent classifiers that concentrate on 'hard' samples. To indicate the 'hardness' of a sample, it uses weights associated with every sample. The most widely used algorithm is AdaBoost [27]. The implementation uses the same classification algorithm several times. Unlike in bagging, the classifier uses the same set of samples each time, but after each classification the weights of misclassified samples are increased. Because of that, the next classifier treats previously misclassified samples as more important and concentrates on them. After performing N iterations, the model consists of N different classifiers. In the testing phase, a decision is made using a weighted sum of all classifiers' decisions, where the weight of a classifier corresponds to the number of errors it made on the train-set. The AdaBoost algorithm has proved very efficient when combined with decision trees [77] or neural networks [61]. However, it may be used even with very 'weak' classifiers, because the only constraint on the classification method is that it must classify the train-set with an error rate lower than 50%. Such a constraint is relatively easy to fulfil for a two-class train-set.
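The weight update of one boosting round may be sketched in Java as follows. The formulas follow the standard AdaBoost scheme [27], not necessarily the exact constants of any particular implementation; shrinking the weights of correctly classified samples and renormalizing is equivalent to increasing the weights of the misclassified ones.

public class AdaBoostStep {

    // weights - current sample weights (summing to 1);
    // correct - whether sample i was classified correctly in this round.
    // Updates the weights in place and returns the classifier's voting weight.
    static double updateWeights(double[] weights, boolean[] correct) {
        double error = 0;
        for (int i = 0; i < weights.length; i++)
            if (!correct[i]) error += weights[i];
        double beta = error / (1 - error);      // requires error < 0.5
        double sum = 0;
        for (int i = 0; i < weights.length; i++) {
            if (correct[i]) weights[i] *= beta; // 'easy' samples lose weight
            sum += weights[i];
        }
        for (int i = 0; i < weights.length; i++) weights[i] /= sum;
        return Math.log(1 / beta);              // weight of this classifier's vote
    }

    public static void main(String[] args) {
        double[] w = {0.25, 0.25, 0.25, 0.25};
        boolean[] correct = {true, true, true, false};
        updateWeights(w, correct);
        System.out.println(java.util.Arrays.toString(w)); // misclassified sample: 0.5
    }
}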

7.6.3 Using different classifiers and data representations

Bagging and boosting in their original forms use the same data and the same classification algorithm in every iteration. The main property of these methods is the change in the way the training data are treated. That feature makes both methods universal and usable in every classification problem. However, when we have more information about the data, it is possible to use it to create more than one vector of attributes for each sample. For instance, using the conversions described in sections 5 and 6, it is possible to create for one sample several vectors containing different information about the sample. Each of these vectors may be used separately in the classification process. Moreover, each vector may be classified with several different classification methods.


Fig. 7.4 The idea of voting classifiers.

Using several dataset conversions and several classification algorithms for each conversion, we can create a number of different classifiers. These classifiers may then be used independently on an unknown sample. There are only two possible classifications: positive for samples belonging to the specified person and negative for samples not belonging to that person. Therefore, the classification result of each classifier is +1 if the sample is classified as positive and −1 if the sample is classified as negative. All results from all classifications are then summed up and divided by the number of classifications. The final result is a value from the range 〈−1, 1〉. If the value is positive and higher than a predefined upper-threshold parameter, the sample is classified as positive. If the value is negative and lower than a predefined lower-threshold parameter, the sample is classified as negative. All samples with values within the (lower-threshold, upper-threshold) range are marked as unclassified. The algorithm described above has two advantages. Firstly, it combines different information obtained from one sample. Secondly, it can judge whether the sample can be classified at all and reject it if it cannot. The idea of rejecting the classification of some samples and marking them as unclassified improves the overall system performance, but care must be taken when setting the upper and lower threshold parameters. As a rule, the number of rejected samples should not be higher than 10% of all samples acquired.
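The decision rule may be sketched in Java as follows; the threshold values in the example are the ones used later in the experiment (see section 9).

public class VotingDecision {

    // votes - the +1/-1 decisions of all classifiers for one sample.
    // Returns +1 (positive), -1 (negative) or 0 (rejected as unclassified).
    static int decide(int[] votes, double lower, double upper) {
        double sum = 0;
        for (int v : votes) sum += v;
        double result = sum / votes.length;  // value in <-1, +1>
        if (result > upper) return 1;
        if (result < lower) return -1;
        return 0;
    }

    public static void main(String[] args) {
        int[] votes = {1, 1, -1, 1, -1, 1, 1, -1, 1, 1};
        System.out.println(decide(votes, -0.1, 0.1)); // mean vote 0.4 -> prints 1
    }
}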

7.7 Cross-validation

In the real world, the set of all available samples is used as a train-set to create a classification model; the model is then used to classify new samples. In the research phase, the set of all available samples must be used for both training and testing. The most obvious choice is to randomly divide the set into two separate subsets, one for training and one for testing. The problem is that the way the dataset is divided may influence the testing results. To give a reliable estimation of errors, the splitting has to be repeated more than once. The process of subsequent divisions of the whole dataset into a train-set and a test-set is called cross-validation. The most popular algorithm is N-fold stratified cross-validation. In this algorithm the available data (samples) are divided into N blocks containing an equal number of samples and a roughly equal (stratified) class distribution. For each block in turn, a model is developed using the data in the remaining blocks (the train-set) and then evaluated on the samples in the held-out block (the test-set). Each sample is thus tested exactly once, on a model that was developed without reference to it. The average performance on the tests may be used to predict the true accuracy of a model developed from all the data [76]. The most popular variant is 10-fold cross-validation, in which 90% of the data is used for training and 10% for testing each time.

Fig. 7.5 The idea of validation using a train-set and a test-set.

In the case discussed here, only authorization tests were performed, so the dataset always consisted of samples belonging to the specified person (referred to as positive samples) and samples not belonging to that person (negative samples). For each person the number of positive samples was less than 3% of the whole dataset. Such an uneven class distribution can disturb the learning process when the whole dataset or its stratified part is taken for training. Fortunately, the number of samples was big enough that there was no need to use most of them in the train-set. Therefore, the cross-validation procedure eventually used randomly created, for each person, a train-set using only 20 positive and 80 negative samples. Such a train-set was then used for classification model creation. The rest of the samples were used as the test-set for performance evaluation. To obtain a reliable performance estimation, the procedure of train-set creation and test-set classification had to be repeated a considerable number of times (see section 8.2.1).


8 Experiment

The methods presented in the previous sections have been implemented (see Appendix for code details) and used in the experiment in which the main thesis of the dissertation, that eye movements may be used for identification of people, has been verified. The experiment may be divided into three phases:

• Data preparation.
• Performing classification tests.
• Verification of the classification results.

Fig. 8.1 Three phases of the experiment.

8.1 Data preparation

The first thing that had to be done was the preparation of the data. The process may be divided into two steps:

• Gathering samples of eye movements (see section 4).
• Preprocessing the samples and creating several different vectors of attributes for each sample (see section 5).

8.1.1 Data gathering

The data gathering process was described in detail in section 4. It was done with the EyeLogin application (see Appendix). The same test was performed on 47 persons, both males and females, aged from 19 to 38. The data gathering experiment lasted from March 2003 to June 2004. 23 persons took the test at least 30 times, to acquire a reliable number of samples. Biometric tests should be performed on as many persons as possible to ensure correct error estimations [65]. However, enrolling users is a time-consuming task. With limited access to the OBER2 device, and thanks to the good will of friends and students, 1151 tests were performed overall. Every test produced a single file in EyeTestFile format (see section 4.6). The contents of such a file are below referred to as a sample. A sample contains information about the person's identification, the date and time of the test, and 2048 measurements.

Fig. 8.2 Process of the dataset creation.

8.1.2 Entry processing – dataset preparation

In the next step, the raw samples stored in separate files were gathered into one file in EyeDatasetFile format using the EyeLoader tool (see Appendix). Then all samples in the dataset were normalized (see section 5.2). The next step was the creation of several datasets of vectors of attributes with the algorithms described in section 5. Each dataset was named with a symbol. The list of symbols and created datasets is presented in Table 8.1.

Table 8.1. Symbols of prepared datasets and descriptions with references.

Symbol  | Description
N       | Normalized signal (see section 5.2)
AVGVEL  | Average velocity direction calculations (see section 5.3.1)
ED      | Between-eye distance calculations (see section 5.3.2)
F       | Fourier Transform of the signal (see section 5.3.4)
STD     | Distance to stimulation point (see section 5.3.3)
WD      | Discrete Wavelet Transform of the signal (see section 5.3.5)

8.2 Performing classification tests

All tests performed were authorization tests (see section 2). In such a test the classification model is used to check the hypothesis that a sample belongs to the specified person. Therefore, the tests had to be done separately for each person. The procedure described below was used on all datasets for every person P:

• Change the labels of all samples belonging to person P to 'yes' (below referred to as positive samples).
• Change the labels of all samples not belonging to person P to 'no' (below referred to as negative samples).

The next step was dividing the dataset into a train-set and a test-set, and classifying the samples from the test-set using information from the train-set. To get reliable results, a sufficient number of tests is very important. Therefore, the procedure of random creation of the train-set was repeated 100 times for each person. The algorithm developed may be described in the following manner:

For every person, repeat 100 times for each dataset:
    Divide the dataset into a train-set and a test-set.
    Minimize the vectors of attributes in the train-set using the algorithms described in section 6.
    Create a classification model from the train-set using the algorithms described in section 7.
    Repeat the conversions calculated for the train-set on the test-set.
    Use the classification model to classify all test-set samples.
    Store the results.

8.2.1 Dividing a dataset into a train-set and a test-set

All classification algorithms, described in detail in section 7, work in the following two steps:

- Learn the classification from a training set and create a classification model.
- Use the model to classify samples with unknown classification.

So the first step is the division of the dataset into a training part and a test part. The procedure randomly selects 20 positive samples (those belonging to the person) and 80 negative samples. These 100 selected samples are used to create the train-set, and the rest of the samples are used to create the test-set.
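The split may be sketched in Java as follows; samples are represented here simply by integer identifiers, which is an illustration choice.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class TrainTestSplit {

    // Randomly selects 20 positive and 80 negative sample ids for the train-set;
    // everything else goes into the test-set.
    static void split(List<Integer> positives, List<Integer> negatives,
                      List<Integer> train, List<Integer> test) {
        List<Integer> pos = new ArrayList<>(positives);
        List<Integer> neg = new ArrayList<>(negatives);
        Collections.shuffle(pos);
        Collections.shuffle(neg);
        train.addAll(pos.subList(0, 20));
        train.addAll(neg.subList(0, 80));
        test.addAll(pos.subList(20, pos.size()));
        test.addAll(neg.subList(80, neg.size()));
    }

    public static void main(String[] args) {
        List<Integer> pos = new ArrayList<>(), neg = new ArrayList<>();
        for (int i = 0; i < 30; i++) pos.add(i);        // hypothetical sample counts
        for (int i = 30; i < 1151; i++) neg.add(i);
        List<Integer> train = new ArrayList<>(), test = new ArrayList<>();
        split(pos, neg, train, test);
        System.out.println(train.size() + " train / " + test.size() + " test");
    }
}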

8.2.2 Minimizing the dataset

Having a train-set, it is possible to perform the dataset transformations described in section 6. All transformations made on the train-set then had to be repeated on the test-set. The MinChange algorithm was used in the first experiments. However, because Principal Component Analysis outperforms MinChange, the results presented here were obtained using only PCA. In the case of PCA the main problem was performance. Calculating PCA with 2048 attributes took about 40 minutes on a PC with a Pentium 4 2.8 GHz processor; with 1024 attributes it took about 3 minutes, and with 512 attributes only about 5 seconds. In our experiments PCA had to be calculated at least 500 times per person. That is why it was decided to calculate a partial PCA on subsets of attributes: the standard vector of 2048 attributes was divided into 4 parts of 512 elements each, and PCA was calculated separately for each part.

Fig. 8.3 Partial PCA calculation.

After the calculation of the eigenvectors, the M eigenvectors associated with the M highest eigenvalues are selected to create the conversion matrix. The number of those vectors (M) may be fixed, or it may be calculated from the explanation percentage (see section 6.2.1). The latter was used in the experiment. Six different explanation percentages were chosen, and the procedure created six conversion matrices. The explanation parameters used were: 0.9, 0.92, 0.95, 0.98, 0.99 and 0.999 (an explanation value of 0.9 means that the algorithm takes the most significant eigenvectors until 90% of the signal is explained). The AVGVEL dataset was instead averaged in partitions, giving three datasets containing 16, 64 and 128 attributes.
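The choice of M from the explanation percentage may be sketched in Java as follows; the eigenvalues in the example are hypothetical.

public class ExplainLevel {

    // Accumulates the eigenvalues (sorted descending) until the requested
    // fraction of the total variance is explained; returns the number M.
    static int selectComponents(double[] eigDescending, double explain) {
        double total = 0;
        for (double v : eigDescending) total += v;
        double cumulative = 0;
        for (int m = 0; m < eigDescending.length; m++) {
            cumulative += eigDescending[m];
            if (cumulative / total >= explain) return m + 1;
        }
        return eigDescending.length;
    }

    public static void main(String[] args) {
        double[] eig = {5.0, 2.5, 1.5, 0.6, 0.3, 0.1};
        System.out.println(selectComponents(eig, 0.9)); // prints 3
    }
}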


Table 8.2. Symbols of applied conversions.

Symbol  | Applied conversion
N       | PCA0.9, PCA0.92, PCA0.95, PCA0.98, PCA0.99
AVGVEL  | 16, 64, 128
ED      | PCA0.9, PCA0.92, PCA0.95, PCA0.98, PCA0.99
F       | PCA0.9, PCA0.92, PCA0.95, PCA0.98, PCA0.99
STD     | PCA0.9, PCA0.92, PCA0.95, PCA0.98, PCA0.99
WD      | PCA0.9, PCA0.92, PCA0.95, PCA0.98, PCA0.99

The algorithm of dataset minimization may be described as follows:

Calculate PCA for the train-set.
For each explanation level (0.9, 0.92, 0.95, 0.98, 0.99, 0.999):
    Calculate the conversion matrix.
    Convert the train-set using the calculated conversion matrix.
    Convert the test-set using the same conversion matrix.
    Use the train-set and test-set in the classification process (see section 8.2.3).

It is very important here that the test-set is converted using the conversion matrix calculated from the train-set data only.

8.2.3 Classification

Each train-set and test-set pair was then used in classification. There were eight different classification algorithms used, as described in Table 8.3.

Table 8.3. Symbols of the classification algorithms used.

Symbol | Description
KNN1   | k Nearest Neighbor with k=1 (see 7.1)
KNN3   | k Nearest Neighbor with k=3 (see 7.1)
KNN7   | k Nearest Neighbor with k=7 (see 7.1)
NB     | Naïve Bayes (see 7.3)
C45    | C45 decision tree (see 7.4)
C45.1  | C45 decision tree with confidence = 50% (see 7.4)
SVM    | Support Vector Machines (see 7.5)
THRES  | Template-threshold method (see 7.2)


For each algorithm the classification procedure was performed as follows:

Use the train-set to create a classification model.
Classify all samples from the test-set as positive or negative.
Store the results.

The results of a single classification test were stored in an EyeTest object. The EyeTest object contains:

- A description of the dataset used (like F_PCA0.92 – see Table 8.2).
- A description of the classification algorithm (as in Table 8.3).
- The sequence of classification results of the consecutive samples (−1 for a negative classification and +1 for a positive classification).

The EyeTest object is added to an EyeResults object, which also contains the identification of the samples included in the train-set and of the samples included in the test-set. After performing all classifications on all conversions of all prepared train-set/test-set pairs, the EyeResults object consisted of 264 classification results (EyeTest objects), below referred to as classifiers. Each EyeResults object was stored in a text file in EyeResultsFile format.

8.3 Verification of the results

The calculations described in the previous section gave over 100 EyeResults files for each person being identified. The data was then analysed using simple statistical methods. To estimate the performance, three factors were taken into consideration, as previously stated in section 2.2:

• False Acceptance Rate (FAR) – the ratio of the number of negative samples classified as positive to the number of all negative samples.
• False Rejection Rate (FRR) – the ratio of the number of positive samples classified as negative to the number of all positive samples.
• Half Total Error Rate (HTER) – calculated as the average of FAR and FRR.

It is worth noticing that the FAR value is computed using only the negative test samples and the FRR value using only the positive test samples. Therefore, they may be calculated independently, and the ratio between positive and negative samples in the test-set is not significant.


8.3.1 Analyzing errors for the datasets

The first task was analysing the error rates obtained for the different dataset types. There were six types of dataset, and for each of them the average error over all EyeResults objects was calculated. The results are presented in Table 8.4.

Table 8.4. Average error rates for the six different types of dataset.

Type    | Average FAR | Average FRR | HTER
AVGVEL  | 20.80       | 35.22       | 28.01
N       | 18.81       | 34.82       | 26.82
F       | 19.11       | 39.57       | 29.34
WD      | 24.52       | 50.13       | 37.33
STD     | 18.98       | 37.27       | 28.13
ED      | 20.07       | 62.79       | 41.43

It turned out that classifications using the datasets based on the between-eye distance (ED) and on the wavelet transform (WD) give significantly higher errors than classifications using the remaining four datasets. The ED and WD datasets were therefore excluded from further classifications.

Fig. 8.4 Errors for the six different dataset types.

8.3.2 Analyzing errors of the classification algorithms

Similarly, the average error for each classification algorithm was calculated. As can be seen in Table 8.5, because of the complexity of eye movement data, the simple algorithms using Euclidean distances between samples, such as kNN and THRES, gave significantly higher errors than the more sophisticated algorithms.

Table 8.5. Average error rates for the eight different classification algorithms.

Algorithm | Average FAR | Average FRR | HTER
kNN1      | 15.78       | 48.81       | 32.30
kNN3      | 12.30       | 55.20       | 33.75
kNN7      | 9.43        | 63.43       | 36.43
NB        | 40.88       | 21.27       | 31.08
C45       | 12.43       | 45.79       | 29.11
C45.1     | 10.37       | 48.36       | 29.36
SVM       | 10.80       | 43.60       | 27.20
THRES     | 50.54       | 30.39       | 40.47

The Support Vector Machines (SVM) algorithm was the methodology giving the lowest error rates. THRES and all kNN algorithms were excluded from further testing.

Fig. 8.5 Errors for the different classification algorithms.

8.3.3 Voting classifiers

The exclusion of two datasets (ED and WD) and four classification algorithms (THRES, kNN1, kNN3 and kNN7) reduced the number of results (EyeTest objects, or classifiers) taken into consideration from 264 to 72. For those 72 classifiers a voting procedure was used. Each of the 72 classifiers gave, for each test sample, the value −1 if the sample was not classified as the specified person or +1 if the sample was classified as the specified person (see Fig. 8.6).

The sum of all votes is then divided by 72, giving a value in the 〈−1, +1〉 range. The value +1 means that all classifiers classified the sample as positive (belonging to the specified person). The value −1 means that all classifiers classified the sample as negative (not belonging to the specified person). As the classifiers are built using different information from the original train-set and are based on different classification algorithms, their decision functions are different and to some degree independent. Therefore, combining them all in the system's decision function may produce results that are more stable and more precise than results based on only one set of attributes and one classification algorithm.

Fig. 8.6 The voting algorithm uses the results of all classifiers.

8.4 Conclusions – performance considerations

The experiment described above lasted for almost a year. The most challenging task was – surprisingly – gathering the data. With no extra funds for research the only encouragement for people taking part in the experiment was their positive attitude to the researcher.


The classification phase was performed for each user independently. The whole process of creating a random train-set, preparing the conversions, classifying all test samples and storing the results in a file in EyeResultsFile format took on average 50 minutes on a PC with a Pentium 4 2.8 GHz processor and 512 MB of memory. As at least 100 such runs had to be done for each user, the preparation of the data for a single person took more than three days. Fortunately, the holiday period gave the opportunity to use more than one computer, which significantly shortened the preparation time. It must be noted that it was the preparation of the datasets and classification models that took most of the time in the process of EyeResults object creation. When the classifiers were ready, the classification of the test samples took less than one second, and that is the only part that would be repeated in a "real world" application. Every EyeResults object saved in a file stored the classifiers' decisions for every sample. This made it possible to analyse all the data and perform the voting after all the results had been collected.


9 Results

The experiment described in section 8 produced a lot of data, which was then analysed in the EyeStat application (see Appendix). The error calculations were averaged, giving the values presented in Table 9.1. As in the previous tables, FAR stands for False Acceptance Rate, FRR stands for False Rejection Rate, and the Half Total Error Rate (HTER) is the average of both. Table 9.1 has three rows. The middle row shows the average error values for all examined data. The 'worst' row shows the highest error rates obtained for each error type independently, and similarly the 'best' row shows the lowest error rates. The results presented in the table were obtained with the lower and upper rejection thresholds −0.1 and +0.1 (see section 7.6.3). The number of rejected samples varied for different persons, but was never higher than 10%.

Table 9.1. Error rates in authorization tests.

        | FAR  | FRR   | HTER
worst   | 3.19 | 26.94 | 14.08
average | 2.31 | 14.53 | 8.42
best    | 0.56 | 3.93  | 3.33

As can be seen, the methodology used is far from perfect, and the errors obtained are too high to use the system as a reliable source of identification information. However, the results are quite competitive with other behavioral biometric systems and face recognition systems [56][25][34]. Moreover, in the author's opinion the results prove that there is some information about the person's identity in eye movements, which was the main thesis of the dissertation.

Fig. 9.1 Errors for different persons.

9.1 Multiple trials estimation

As can be seen in the table, the False Rejection Rates are much higher than the corresponding False Acceptance Rates. That feature may be used to improve the FRR by performing several subsequent trials. A trial in our example is one test, and we assume that the success of a trial means the acceptance of the user as the right one. So the probability of success in a login trial of an honest person may be estimated as p_p = 1 − FRR. Similarly, the probability of success in a login trial of a dishonest person is p_n = FAR. We can treat an experiment of N subsequent login trials as a sequence of Bernoulli trials. The acceptance criterion in such an experiment is that at least one trial succeeded (n > 0). The probability of acceptance when an honest user is trying to log in is therefore:

P_S^N = 1 - p(0) = 1 - (1 - p_p)^N = 1 - \mathrm{FRR}^N
Eq. 9.1

so the new FRR becomes a function of the number of trials N:

\mathrm{FRR}_N = \mathrm{FRR}^N
Eq. 9.2

And similarly when dishonest user tries to log in the probability of successful login is: PSN = 1 – p(0) = 1-(1 – pn)N = 1-(1-FAR)N

Eq. 9.3

This time the error this is our error rate so FRR in the function of number of trials may be calculated as: FARN = 1-(1-FAR)N

Eq. 9.4
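Both formulas translate directly into code. The following Java sketch applies them to the average one-trial rates of Table 9.1; note that Table 9.2 below averages per-person results, so its values differ slightly.

public class MultipleTrials {

    static double frrAfter(double frr, int n) { return Math.pow(frr, n); }         // Eq. 9.2

    static double farAfter(double far, int n) { return 1 - Math.pow(1 - far, n); } // Eq. 9.4

    public static void main(String[] args) {
        double far = 0.0231, frr = 0.1453;  // average one-trial rates (Table 9.1)
        System.out.printf("FRR(2) = %.4f, FAR(2) = %.4f%n",
                frrAfter(frr, 2), farAfter(far, 2));
        // FRR drops to about 2.1% while FAR grows to about 4.6%.
    }
}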

The idea is tempting, and Table 9.2 shows the simulated error rates for two trials.

Table 9.2. Simulated error rates in authorization tests for two independent trials.

        | FAR(2) | FRR(2) | HTER(2)
worst   | 6.17   | 7.28   | 5.80
average | 4.56   | 2.60   | 3.58
best    | 1.12   | 0.15   | 1.02

As might be expected, increasing the number of trials decreases the false rejection rate (FRR) but increases the false acceptance rate (FAR). However, these results rest on one assumption that does not hold: the independence of subsequent trials. In fact, two subsequent eye movement registration trials taken from the same person are dependent, because several factors influence both trials, for instance the person's tiredness or external circumstances like light or sounds. So it may be assumed that the error rates of two subsequent trials lie somewhere between the one-trial results and the simulated two independent trials. Fortunately, this could be checked experimentally. During the data gathering phase of the experiment, some tests were made in series of two or three tests of the same person in a row. Because an EyeResults file stores the acquisition date of every test sample being classified, it was possible to find "pairs" of test samples. The assumption was made that two samples form a pair when they belong to the same person and the time between acquiring both samples is less than two minutes. The results were recalculated for each pair in the OR-like manner described in Table 9.3.

Table 9.3. Calculation of paired results.

First sample result | Second sample result | Combined result
-1 | -1 | -1
-1 | +1 | +1
+1 | -1 | +1
+1 | +1 | +1

As can be seen, one positive result is enough to accept the user. The results obtained after calculating the paired classifiers are presented in Table 9.4. Comparing it with Table 9.1, it can be seen that the percentage of false acceptances (FAR) increased, which is obvious because a forger now has two trials. On the other hand, the probability of falsely rejecting the proper user is lower.

Table 9.4. Error rates in authorization tests combined from two trials.

        | FAR(2) | FRR(2) | HTER(2)
worst   | 6.83   | 15.38  | 11.25
average | 4.84   | 9.40   | 7.12
best    | 1.82   | 3.44   | 3.88

The results are worse than for the simulated independent trials, which confirms that the analysed attribute vectors contain some noisy, environmentally dependent information.

Fig. 9.2 Errors for different persons in the two-trial test.

9.2 Problem of overfitting

Improving the results and testing them on the same data very often leads to so-called data overfitting. It happens when the classification algorithm classifies the given dataset very accurately but fails to properly recognize entirely new samples. It may be said that the algorithm works with very poor generalization. The experiment presented in this dissertation is not free from the overfitting problem, because the classifiers used in the voting procedure were chosen on the basis of the average errors obtained during testing with the same data. However, as the train-set consisted of less than 7% of the whole dataset each time, the similarity of the subsequent randomly created train-sets used in the following calculations was minimized. The procedure obtained on the basis of the average performance over all results was then used independently on each train-set and test-set pair. Therefore, the results may be considered reliable.

9.3 Conclusions

The idea of personal identification using eye movement characteristics presented in this dissertation seems to be a valuable addition to the other well-known biometric techniques. What makes it interesting is the ease of combining it with, for instance, face or iris recognition. As all of those techniques need digital cameras to collect data, a system may be developed that uses the same recording devices to gather information about the human face shape, the eye iris pattern and the eye movement characteristics. Of course, a lot of work remains to be done to improve the methodology, but the first experiments show the great potential of eye movement identification. That potential was also acknowledged during the 6th World Conference BIOMETRICS'2003 in London, where the poster 'Eye movement tracking for human identification' was awarded the title of 'Best Poster on Technological Advancement'. This confirms that eye movement based biometric identification is an important issue in today's biometric identification research. It is the author's belief that this dissertation is just the first step in the process of creating a new standard of human identification based on eye movements.


10 Literature

[1] A damning airport report. Biometrics Technology Today, ISSN 0969-4765, Volume 11, Issue 10, Elsevier Science (October 2004)
[2] Augustyniak P., Bubliński Z., Gorgoń M., Grabska-Chrząstowska J., Mikrut Z., Pawlik P.: Biocybernetyczne aspekty procesu obserwacji sceny – wstępna analiza trajektorii ruchu oczu. Materials of the seminar "Przetwarzanie i analiza sygnałów wizji i sterowania", Słok, pp. 44-49 (2002)
[3] Augustyniak P., Mikrut Z.: Dominant Eye Recognition Based on Calibration of the OBER2 Eyetracker. IFMBE Vol. 3 (2002)
[4] Balakrishnama S., Ganapathiraju A.: Linear Discriminant Analysis – A Brief Tutorial. Institute for Signal and Information Processing, Department of Electrical and Computer Engineering, Mississippi State University
[5] Bartsekas D., Nedic A., Ozdaglar A. E.: Convexity, Duality and Lagrange Multipliers. Lecture Notes, Massachusetts Institute of Technology (2001)
[6] Belhumeur N., Hespanha J., Kriegman D.: Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection. Proceedings of the European Conference on Computer Vision (1996)
[7] Blum A. L., Langley P.: Selection of Relevant Features and Examples in Machine Learning. Artificial Intelligence, vol. 97, no. 1-2, pp. 245-271 (1997)
[8] Breiman L.: Bagging predictors. Machine Learning 26, No. 2 (1996)
[9] Burges C.: A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery 2, 121-167 (1998)
[10] Calvo R. A., Partridge M., Jabri M. A.: A Comparative Study of Principal Component Analysis Techniques. In Proc. Ninth Australian Conf. on Neural Networks, Brisbane, QLD (1998)
[11] Campbell C. S., Maglio P. P.: A robust algorithm for reading detection. Proceedings of the ACM Workshop on Perceptual User Interfaces (2002)
[12] Chatzis V., Bors A. G., Pitas I.: Multimodal Decision-level Fusion for Person Authentication. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, Vol. 29, No. 6 (1999)
[13] Checco J. C.: Keystroke Dynamics & Corporate Security. Wall Street Technology Association TICKER Magazine, Sep./Oct. (2003)
[14] Cholet G.: Automatic Speaker Recognition: Technologies, Evaluations and Possible Future. Presentation during the 1st BioSec and Biometric Technologies Workshop, Barcelona (2004)
[15] Cortes C., Vapnik V.: Support Vector Networks. Machine Learning, 20 (1995)
[16] Cowen L., Ball L. J., Delin J.: An eye-movement analysis of web-page usability. Chapter in X. Faulkner, J. Finlay, & F. Détienne (Eds.): People and Computers XVI – Memorable Yet Invisible: Proceedings of HCI 2002. Springer-Verlag Ltd, London (2002)
[17] Daugman J.: High Confidence Visual Recognition of Persons by a Test of Statistical Independence. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 11 (1993)
[18] Daugman J.: How Iris Recognition Works. IEEE Transactions on Circuits and Systems for Video Technology, Vol. 14, No. 1 (2004)
[19] Duchowski A.: A Breadth-First Survey of Eye Tracking Applications. Behavior Research Methods, Instruments & Computers (BRMIC), 34(4) (2002)
[20] Duchowski A.: Eye Tracking Methodology. Theory and Practice. Springer-Verlag Ltd, London (2003)
[21] Egan J. P.: Signal Detection Theory and ROC Analysis. Academic Press (1975)
[22] Elde A.: Information Processing in Dyslexic Readers: Deductive Mental Model. Journal of Psychology and the Behavioral Sciences, vol. 10, University at Madison, New Jersey (1996)
[23] Engbert R., Longtin A., Kliegl R.: Complexity of Eye Movements in Reading. International Journal of Bifurcation and Chaos in Applied Sciences and Engineering, Vol. 14, No. 2 (2004)
[24] Everitt R., McOwan P.: Human mouse trap. Biometrics Technology Today, ISSN 0969-4765, Volume 11, Issue 10, Elsevier Science (October 2003)
[25] Face Recognition Vendor Test. http://www.frvt.org
[26] Fingerprint Verification Competition. http://bias.csr.unibo.it/fvc2004/
[27] Freund Y., Schapire R. E.: A decision theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci., vol. 55, pp. 119-139 (1997)
[28] Freund Y., Schapire R. E.: A Short Introduction to Boosting. Journal of the Japanese Society for Artificial Intelligence, 14(5):771-780 (September 1999)
[29] Grother P. J., Micheals R. J., Phillips P. J.: Face Recognition Vendor Test 2002 Performance Metrics. Proceedings of the 4th International Conference on Audio Visual Based Person Authentication (2003)
[30] Gunn S. R.: Support Vector Machines for Classification and Regression. Technical Report, University of Southampton (1998)
[31] Hajda J.: Ocena poziomu zmęczenia kierowców na podstawie sygnału ruchu oka. Praca doktorska, Instytut Automatyki Politechniki Śląskiej, Gliwice (2004)
[32] Hangai S., Higuchi T.: Writer Identification Using Finger-Bend in Writing Signature. Proceedings of the Biometric Authentication Workshop, European Conference on Computer Vision in Prague 2004, LNCS 3087, Springer-Verlag, Berlin (2004)
[33] Henderson J. M., Hollingworth A.: Eye Movements and Visual Memory: Detecting Changes to Saccade Targets in Scenes. Michigan State University, Visual Cognition Lab, Technical Report No. 2001, 3 (2001)
[34] Hook C., Kempf J., Scharfenberg G.: A Novel Digitizing Pen for the Analysis of Pen Pressure and Inclination in Handwriting Biometrics. Proceedings of the Biometric Authentication Workshop, European Conference on Computer Vision in Prague 2004, LNCS 3087, Springer-Verlag, Berlin (2004)
[35] Hornof A. J., Halverson T.: Cleaning up systematic error in eye tracking data by using required fixation locations. Behavior Research Methods, Instruments, and Computers, 34(4) (2002)
[36] Huang P. S.: Automatic gait recognition via statistical approaches for extended template features. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics (2001)
[37] Huey E. B.: The Psychology and Pedagogy of Reading. With a Review of the History of Reading and Writing and of Methods, Texts, and Hygiene in Reading. New York: Macmillan (1908)
[38] Hung G. K.: Models of Oculomotor Control. World Scientific Publishing Co. (2001)
[39] Hyvarinen A.: Survey on Independent Component Analysis. Neural Computing Surveys 2 (1999)
[40] Jacobson L., Ygge J., Flodmark O.: Nystagmus in periventricular leucomalacia. British Journal of Ophthalmology (1998)
[41] Jain A. K., Hong L., Kulkarni Y.: A Multimodal Biometric System using Fingerprint, Face and Speech. Proceedings of the Second International Conference on AVBPA, Washington D.C., USA (1999)
[42] Jain A. K.: Fingerprint Matching. Presentation during the 1st BioSec and Biometric Technologies Workshop, Barcelona (2004)
[43] Jamnicki M.: Eye movement signal processing. Doctoral Thesis, Institute of Automatic Control, Silesian University of Technology, Gliwice (1999)
[44] Javal É.: Physiologie de la lecture et de l'écriture. Paris: Félix Alcan (1905)
[45] John G. H., Kohavi R., Pfleger K.: Irrelevant Features and the Subset Selection Problem. Machine Learning: Proceedings of the Eleventh International Conference, Morgan Kaufmann Publishers, San Francisco (1994)
[46] Josephson S., Holmes M. E.: Visual Attention to Repeated Internet Images: Testing the Scanpath Theory on the World Wide Web. Proceedings of the Eye Tracking Research & Application Symposium 2002, New Orleans, Louisiana (2002)
[47] Kapczyński A., Kasprowski P., Kuźniacki P., Ober J.: Behawioralne metody identyfikacji tożsamości. Materiały konferencji Współczesne Problemy Sieci Komputerowych, Wydawnictwa Naukowo-Techniczne (2004)
[48] Kasprowski P., Ober J.: Eye Movement in Biometrics. Proceedings of the Biometric Authentication Workshop, European Conference on Computer Vision in Prague 2004, LNCS 3087, Springer-Verlag, Berlin (2004)
[49] Kasprowski P., Ober J.: Eye movement tracking for human identification. 6th World Conference BIOMETRICS'2003, London (2003)
[50] Kasprowski P., Ober J.: With the flick of an eye. Biometrics Technology Today, ISSN 0969-4765, Volume 12, Issue 3, Elsevier Science (March 2004)
[51] Kasprowski P., Ober J.: Zastosowanie systemu pomiaru ruchu oka w biometrii. Konferencja Naukowa BIOMETRIA'2003 Technologia, Prawo, Społeczeństwo. Instytut Maszyn Matematycznych, Warszawa (2003)
[52] Kawulok M.: Mask and Eigenvector Weights for Eigenfaces Method Improvement. International Conference on Computer Vision and Graphics, Warszawa (2004)
[53] Kerber R.: ChiMerge: Discretization of numeric attributes. Proceedings of the Tenth National Conference on Artificial Intelligence. MIT Press (1992)
[54] Kim J., Choi J., Yi J.: Face Recognition Based on Locally Salient ICA Information. Proceedings of the ECCV 2004 International Workshop BioAW 2004, LNCS 3087, Springer-Verlag, Berlin (2004)
[55] Kopiez R., Galley N.: The Musicians' Glance: A Pilot Study Comparing Eye Movement Parameters in Musicians and Non-Musicians. Proceedings of the 7th International Conference on Music Perception and Cognition, Sydney (2002)
[56] Kuźniacki P.: Uwierzytelnianie użytkowników w Internecie oparte na analizie sposobu pisania na klawiaturze. Konferencja Internet w Społeczeństwie Informacyjnym, WNT, Warszawa (2004)
[57] Lewis D.: Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval. Proceedings of ECML-98, pp. 4-15 (1998)
[58] Loève M. M.: Probability Theory. Princeton, NJ: Van Nostrand (1955)
[59] Loska J.: Wybrane problemy sprzętu i oprogramowania systemu pomiaru ruchu oka OBER2. Praca doktorska, Instytut Automatyki Politechniki Śląskiej, Gliwice (2003)
[60] Lu X., Colbry D., Jain A. K.: Three-Dimensional Model Based Face Recognition. Proceedings of the International Conference on Pattern Recognition (ICPR), Cambridge, UK (2004)
[61] Maghooli K., Moin M. S.: A New Approach on Multimodal Biometrics Based on Combining Neural Networks Using AdaBoost. Proceedings of the Biometric Authentication Workshop, European Conference on Computer Vision in Prague 2004, LNCS 3087, Springer-Verlag, Berlin (2004)
[62] Maio D., Maltoni D., Cappelli R., Wayman J. L., Jain A. K.: FVC2004: Third Fingerprint Verification Competition. Proceedings of the International Conference on Biometric Authentication (ICBA), Hong Kong (July 2004)
[63] Maio D., Maltoni D., Cappelli R., Wayman J. L., Jain A. K.: FVC2004: Third Fingerprint Verification Competition. Proceedings of the International Conference on Pattern Recognition, Quebec City (August 2002)
[64] Maltoni D., Maio D., Jain A. K., Prabhakar S.: Handbook of Fingerprint Recognition. Springer, New York (2003)
[65] Mansfield A. J., Wayman J. L.: Best Practices in Testing and Reporting Performance of Biometric Devices, Version 2.1. National Physical Laboratory Report, Middlesex (2002)
[66] Mast F. W., Kosslyn S. M.: Eye movements during visual mental imagery. TRENDS in Cognitive Sciences, Vol. 6, No. 7 (2002)
[67] McCallum A., Nigam K.: A Comparison of Event Models for Naive Bayes Text Classification. In AAAI-98 Workshop on Learning for Text Categorization (1998)
[68] METROVISION, Perenchies, France. http://www.metrovision.fr
[69] Morgan S. W., Patterson J., Simpson D. G.: Utilizing EOG for the measurement of saccadic eye movements. IEEE: Biomedical Research in the 3rd Millenium, Melbourne, ISBN 0-646-36946-6, pp. 33-36 (1999)
[70] Noton D., Stark L. W.: Scanpaths in eye movements during pattern perception. Science, 171, 308-311 (1971)
[71] Ober J., Hajda J., Loska J., Jamnicki M.: Application of Eye Movement Measuring System OBER2 to Medicine and Technology. Proceedings of SPIE, Infrared Technology and Applications, Orlando, USA, 3061(1) (1997)
[72] Ord T., Furnell S. M.: User authentication for keypad-based devices using keystroke analysis. Proceedings of the Second International Network Conference (INC 2000), Plymouth, UK (2000)
[73] Platt J. C.: Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines. Technical Report MSR-TR-98-14, Microsoft Research (1998)
[74] Platt J. C.: Using Analytic QP and Sparseness to Speed Training of Support Vector Machines. Advances in Neural Information Processing Systems, MIT Press (1999)
[75] Polikar R.: The Engineer's Ultimate Guide to Wavelet Analysis. http://users.rowan.edu/~polikar/WAVELETS/WTtutorial.html
[76] Quinlan J. R.: A Case Study in Machine Learning. Proceedings of the 16th Australian Computer Science Conference (ACSC'93), Brisbane, Australia, pp. 83-92 (1993)
[77] Quinlan J. R.: Bagging, boosting, and C4.5. Proceedings of the Thirteenth National Conference on Artificial Intelligence, pp. 725-730 (1996)
[78] Quinlan J. R.: C4.5: Programs for Machine Learning. San Mateo: Morgan Kaufmann (1993)
[79] Rabiner L. R., Schafer R. W.: Digital Processing of Speech Signals. Prentice Hall, Englewood Cliffs, NJ (1978)
[80] Ross A., Jain A. K.: Information Fusion in Biometrics. Pattern Recognition Letters, Special Issue on Multimodal Biometrics, Vol. 24, No. 13 (2003)
[81] Ryżko J.: Rozwój biometrii w latach 2002-2003 – techniki, zastosowania, rynek. Konferencja Naukowa BIOMETRIA'2003 Technologia, Prawo, Społeczeństwo, Warszawa (2003)
[82] Salvucci D. D., Goldberg J. H.: Identifying Fixations and Saccades in Eye-Tracking Protocols. Proceedings of the Symposium on Eye Tracking Research & Applications, Palm Beach Gardens, Florida, United States (2000)
[83] Schiessl M., Duda S., Thölke A., Fischer R.: Eye tracking and its application in usability and media research. MMI-Interaktiv, No. 6 (2003)
[84] Schwardt L., Preez J.: Manipulating Feature Space. Lecture Materials, University of Stellenbosch, South Africa (2003)
[85] Shannon C. E.: A Mathematical Theory of Communication. The Bell System Technical Journal, Vol. 27 (1948)
[86] Skalar Medical, Delft, Netherlands. http://www.skalar.nl
[87] SR Research Ltd., Mississauga, Ontario, Canada. http://www.eyelinkinfo.com/
[88] Tadeusiewicz R.: Biometria. Wydawnictwa Akademii Górniczo-Hutniczej, Kraków (1993)
[89] Tobii Technology AB, Stockholm, Sweden. http://www.tobii.se
[90] Turk M., Pentland A.: Face Recognition Using Eigenfaces. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (1991)
[91] Vapnik V.: Statistical Learning Theory. John Wiley and Sons, Inc., New York (1998)
[92] Vatikiotis-Bateson E., Eigsti I. M., Yano S., Munhall K.: Eye movement of perceivers during audiovisual speech perception. Perception and Psychophysics, 60(6) (1998)
[93] Walker J. S.: A Primer on Wavelets and their Scientific Applications. CRC Press LLC (1999)
[94] Wayman J. L.: A Definition of "Biometrics". National Biometric Test Center Collected Works 1997-2000, San Jose University Press (2000)
[95] Wayman J. L.: Fundamentals of Biometric Authentication Technologies. National Biometric Test Center Collected Works 1997-2000, San Jose University Press (2000)
[96] Witten I. H., Frank E.: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann (1999)
[97] Zhang D. D.: Automated Biometrics: Technologies and Systems. Kluwer Academic Publishers (2000)

Appendix. Software tools

All experiments were done using applications written mostly in the Java language. Java was chosen because it has several significant advantages:

- Applications written in Java may be freely distributed and used.
- There are a lot of ready-to-use tools and libraries helpful for application building.
- An application written in Java is platform independent and may be run on different devices.
- The author had experience with Java programming.

Moreover, software written in a general-purpose programming language like Java is easier to maintain and to use in real applications than, for instance, code written in MatLab. The system consists of four independent applications:

- EyeLogin – the application for data acquisition described in section 4.5. This is the only application written in the Pascal language.
- EyeLoader – a simple application converting a set of files produced by EyeLogin into a single Dataset file.
- EyeAnalyser – the main application for maintaining datasets: converting, showing graphs, classifying and storing results in a file in EyeResultsFile format.
- EyeStat – the application for the analysis of results.

Fig. 0.1 Schema of the data preparation procedure.

EyeLogin – data acquisition

The EyeLogin application is used during the tests described in section 4. It shows the stimulation on the screen and records the eye's reaction to that stimulation. The result of a single test is stored in a file in EyeTestFile format, described in section 4.6.