Classifying Accelerometer Data via Hidden Markov Models to Authenticate People by the Way they Walk Claudia Nickel
Christoph Busch
Hochschule Darmstadt (CASED) Haardtring 100 64295 Darmstadt Germany
Hochschule Darmstadt (CASED) Haardtring 100 64295 Darmstadt Germany
Abstract – Promising results have been obtained when using Hidden Markov Models for accelerometer-based biometric gait recognition. So far, the testing data used contained only walking straight on a flat floor, which is not a realistic scenario. This paper presents the results obtained with a more realistic data set that includes walking around corners, upstairs, downstairs, etc. We analyze to what extent the biometric performance degrades when this more demanding data set is used. To show practical results, the cross-day performance is analyzed and compared with the same-day results. Error rates are given depending on the amount of training data and after a voting scheme is applied. We obtain an Equal Error Rate (EER) of 6.15%, which is less than a third of the EER obtained when applying a cycle extraction method to the same data set.
Index Terms – biometric gait recognition, accelerometers, Hidden Markov Models
I. INTRODUCTION
As owners of mobile devices tend to deactivate their security settings, data on these devices is often insufficiently protected [1]. One reason for this is that most mobile devices only offer authentication via PIN or password, which requires explicit interaction and thus is time-consuming and not very user friendly. To solve this problem, an alternative, unobtrusive authentication method based on gait is proposed in this paper. There are two main advantages of this approach. Firstly, gait can be captured via acceleration sensors, which are already integrated into most smart phones. Hence, there are no additional hardware costs for deploying this method. Secondly, gait recognition does not require explicit user interaction during verification, as the phone does it literally “on-the-go”. These two factors make accelerometer-based biometric gait recognition a very user friendly method which does not require extra interaction time.
Research on accelerometer-based biometric gait recognition was started by Ailisto et al., which resulted in the first publication on this topic in 2005 [2]. The technique was further developed by Gafurov [3], who mainly focused on cycle extraction techniques. In these initial publications, dedicated prototype accelerometers were used, which were attached to the hip, leg, arm or ankle of the subjects. With the fast development of mobile phones from simple cell phones to highly capable smart phones, researchers started to directly use the sensors integrated in these devices to collect testing data [4] [5] [6] [7].
Research can be divided into two main groups. Either so-called gait cycles with roughly equal duration are extracted from the sensor data, or the data is divided into fixed-length time segments from which features are extracted. Gait cycles correspond to two steps and can be compared using different distance measures like Dynamic Time Warping (DTW) [8] or Cyclic Rotation Metric (CRM) [9]. For the comparison of feature vectors, the prominent approach is to use machine learning algorithms that are well established in other pattern recognition domains such as speaker recognition. These promising approaches include neural networks [6], HMMs [10] and SVMs [11]. In [12] it was shown that smart phones are already capable of performing the whole enrolment and authentication process when using a cycle extraction method.
The contribution of this paper is a comprehensive evaluation of the biometric performance of Hidden Markov Models (HMMs) when a realistic gait data set is used. Nearly all previous publications report error rates obtained when subjects are walking on a flat floor without any disturbances like doors or corners. In addition, the data sets are often collected on one day. Hence, only same-day results are stated, which do not reflect reality, as the user of a mobile device will not enroll himself each day on his gait recognition system. Obtaining realistic results, as in this paper, is important for assessing the state of development of gait recognition systems.
The remaining part of the paper is structured as follows. The data collection is described in section II. In section III we explain the data processing and feature extraction before giving a short introduction to HMMs in section IV. Sections V and VI describe the evaluation and results, followed by a discussion in section VII. Conclusions and proposals for future work are given in section VIII.
II. DATA COLLECTION
For this research work we used the same data collection as in [12]. Data of 48 subjects were collected.
The mean age of the 18 female participants was 28.5 years (minimum 20 and maximum 53). The male participants were between 22 and 59 years old (mean age 30.5 years). The participants were told to walk at their normal pace during the whole test. Each participant took part twice on two different days and in most cases was wearing the same shoes during both sessions. The phone was inside a pouch which was attached to the right side of the hip of the subject. Each session consisted of enrolment and walking on a defined route. During enrolment the subjects had to walk straight on a flat floor for 10 seconds.
After enrolment the subjects had to walk on a predefined route three times. This route involved two floors of the building, which has a rectangular shape with a patio in the center, which allowed the definition of a route without dead end (see Fig. 1). During walking on that route the subjects had to stop at nine predefined authentication points approximately 25 seconds apart from each other. In order not to influence them, the subjects were walking unattended. The route was chosen in a way that corresponds to a realistic scenario. The left photo in Fig. 2 shows the route section from authentication point 5 to 6. One can see the two different kinds of floor the subjects walked on (linoleum and tiles), the doorsills which occurred several times during the route, and chairs which prevented the subjects from walking on a straight line. The right photo in Fig. 2 shows the area around the starting point. Between authentication point 3 and 4 the subjects had to open the glass door and walk upstairs. In the last section (between authentication point 8 and 9) this door had to be opened and closed by the subjects, forcing them to stop. The route consists of 9 sections (01 to 09); the enrolment corresponds to section 00. For each of the 48 subjects we obtained 28 data sets in each session, 2688 in total. This data set fulfills three goals: It corresponds to a realistic scenario, it contains enough data for training the HMMs and it consists of data from two different days, which allows the computation of cross-day results that are more realistic than same-day results.
Fig. 1 Route on which the subjects had to walk three times on two different days.
Fig. 2 Photos illustrating the walking route.
III. DATA PREPROCESSING
The preprocessing consists of five steps. Firstly, the walks are extracted from the data, i.e. the non-walking parts recorded during one section are discarded. Afterwards the data is interpolated to a fixed sampling rate, centered around zero and divided into segments. Features are extracted from each segment.
A. Walk Extraction
Fig. 3 shows the data collected of one subject in section 08. The part where the subject is walking downstairs is clearly visible in the right half of the figure. The file also contains the data where the subject was still standing and when the phone is taken out of the pouch for terminating the data capture program. When using this data directly, some segments containing no walking data would be created. Therefore, we automatically extract the walks. This is done by firstly extracting the data between the first and the last data value above the mean of the walk to delete the ending part containing very low accelerations because the phone is turned to a horizontal position to switch off the data collection program. Afterwards a convolution filter is applied to the absolute value of the differences of consecutive acceleration values to emphasize the difference of the high acceleration parts (walks) to non-walking sections. Based on this signal the walks are extracted.
Fig. 3 Sample of measured acceleration in x-direction in section 08. The downstairs part is clearly visible on the right.
B. Interpolation
Due to the Android API we get accelerometer values from the sensor only when the event onSensorChanged is triggered. The mean obtained sampling rate was 127.23 samples per second. Previous evaluations [11] showed that varying the interpolation rate has no big influence on the classification results of HMMs, and an interpolation rate of 50 samples per second was identified to deliver good results. In the following evaluation the data is linearly interpolated to a fixed sampling rate of 25, 50 and 100 samples per second to confirm the previous results.
C. Centering around Zero
As the phone’s accelerometers are not well calibrated and in order to remove the influence of gravity from the x-direction, the signals are centered around zero by subtracting the means $\mu_x$, $\mu_y$, $\mu_z$: $\hat{x}(t) = x(t) - \mu_x$, $\hat{y}(t) = y(t) - \mu_y$, $\hat{z}(t) = z(t) - \mu_z$.
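The interpolation and centering steps can be illustrated with a minimal Python sketch. It is not the implementation used for the evaluation: the function names, the list-based signal representation and the handling of the last partial segment are assumptions made for illustration. The overlapping segmentation that the paper describes under Segmentation is included so that the chain can be run end to end.

```python
from bisect import bisect_right


def interpolate_linear(timestamps, values, rate_hz):
    """Resample irregularly timed samples onto a fixed-rate grid (linear interpolation).

    Assumes at least two samples and strictly increasing timestamps in seconds.
    """
    step = 1.0 / rate_hz
    n_out = int((timestamps[-1] - timestamps[0]) * rate_hz + 1e-9) + 1
    out = []
    for k in range(n_out):
        t = timestamps[0] + k * step
        # Last original sample at or before t, clamped so that i + 1 is valid.
        i = min(max(bisect_right(timestamps, t) - 1, 0), len(timestamps) - 2)
        t0, t1 = timestamps[i], timestamps[i + 1]
        w = (t - t0) / (t1 - t0)
        out.append(values[i] + w * (values[i + 1] - values[i]))
    return out


def center(values):
    """Subtract the mean so that the signal is centered around zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def segment(values, rate_hz, length_ms, overlap=0.5):
    """Cut a signal into fixed-length segments with the given overlap.

    Assumption: a trailing part shorter than one segment is dropped.
    """
    size = int(round(rate_hz * length_ms / 1000.0))
    hop = max(1, int(round(size * (1.0 - overlap))))
    return [values[s:s + size] for s in range(0, len(values) - size + 1, hop)]


if __name__ == "__main__":
    # Irregular timestamps, as delivered by an event-driven sensor callback.
    ts = [0.0, 0.3, 0.5, 1.0]
    acc = [1.0, 1.6, 2.0, 3.0]
    resampled = interpolate_linear(ts, acc, rate_hz=4)
    print(resampled)
    print(center(resampled))
    print(len(segment([0.0] * 1130, rate_hz=50, length_ms=2000)))
```

Under these assumptions a walk of 22.6 s interpolated to 50 samples per second (1130 samples) yields 21 segments of 2000 ms, which agrees with the TABLE 1 entry for section 01. Other table entries are medians over many walks and need not match this simple count.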
D. Segmentation
The interpolated data are divided into segments of 2000, 3000 and 4000 milliseconds, with an overlap of 50%. TABLE 1 shows the median number of resulting segments per section as well as the mean duration of each walk.
E. Feature Extraction
In the last step, features are extracted from each segment. They are extracted separately for the x-, y- and z-direction as well as for the magnitude vector $\hat{m}(t) = \sqrt{\hat{x}(t)^2 + \hat{y}(t)^2 + \hat{z}(t)^2}$. As statistical features such as standard deviation, minimum and maximum have not resulted in sufficiently low error rates [11], this work focuses on Mel-Frequency Cepstral Coefficients (MFCCs) [13] and Bark-Frequency Cepstral Coefficients (BFCCs) [14]. The general workflow for MFCC and BFCC generation is depicted in Fig. 4. The only difference lies in the applied scaling, where either the Mel Scale or the Bark Scale is used. In case of BFCCs two different configurations were utilized. This results in the following three single features: MFCC (window size = 1.44s, window hop time = 0.048s, maximal frequency = 10Hz), BFCC1 (window size = 1.12s, window hop time = 0.032s, maximal frequency = 7.5Hz) and BFCC2 (window size = 1.92s, window hop time = 0.048s, maximal frequency = 8.75Hz). In each case the common number of 13 coefficients was generated. As the features are extracted for each of the three acceleration axes and for the magnitude vector, the resulting feature vectors are of length 52 (4 x 13). The coefficients are calculated using the implementation by Ellis [15]. In addition, all possible combinations of these single features were used: BFCC1BFCC2, BFCC1MFCC, BFCC2MFCC and BFCC1BFCC2MFCC, which are of length 104 or 156, respectively.
Fig. 4 Process to create MFCC and BFCC coefficients.
TABLE 1 MEAN DURATION OF A WALK FOR EACH SECTION AND THE MEDIAN NUMBER OF SEGMENTS AFTER SEGMENTATION TO 2000, 3000 AND 4000MS
Section      01    02    03    04    05    06    07    08    09
walk [s]     22.6  26.7  26.3  28.0  25.5  24.3  22.1  25.9  21.6
2000ms       21    25    24    27    27    23    20    25    15
3000ms       13    16    16    17    15    15    13    16    12
4000ms       10    12    11    13    11    11    9     12    10
IV. HIDDEN MARKOV MODELS
Hidden Markov Models (HMMs) were introduced by Baum and Petrie in the 1960s [16] and have since been used for various biometric applications like speaker [17] or writer recognition [18]. The predominant classification approach in speaker recognition [19] is the world model approach, which was proposed by Carey et al. [20]. In each test iteration, two models are considered. One is the genuine model, which was trained using data of the genuine user, and the other is the world model. In our approach the world model has been trained using data from 20 different subjects. For a given probe feature vector we obtain for each model the probability that this model represents the probe data. Classification is based on the difference between these probabilities and a decision threshold. HMMs can be constructed using different numbers of states and mixtures per state. Using three states and one mixture per state has been identified as the best setting.
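The world model decision can be illustrated with a short Python sketch. It is a simplified illustration and not the implementation used in the evaluation: it assumes diagonal-covariance Gaussian emissions (one mixture per state), compares the models by the difference of their log-likelihoods, and takes already trained parameters as input. Training is not shown, and all function names and parameter values are hypothetical.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """Logarithm that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0.0 else NEG_INF


def logsumexp(xs):
    """Numerically stable log(sum(exp(x))) for a list of log values."""
    top = max(xs)
    if top == NEG_INF:
        return NEG_INF
    return top + math.log(sum(math.exp(v - top) for v in xs))


def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def hmm_log_likelihood(obs, start, trans, means, variances):
    """Forward algorithm in the log domain: returns log P(obs | model)."""
    n = len(start)
    alpha = [safe_log(start[s]) + log_gauss_diag(obs[0], means[s], variances[s])
             for s in range(n)]
    for x in obs[1:]:
        alpha = [logsumexp([alpha[r] + safe_log(trans[r][s]) for r in range(n)])
                 + log_gauss_diag(x, means[s], variances[s])
                 for s in range(n)]
    return logsumexp(alpha)


def verify(obs, genuine, world, threshold):
    """Accept the claim if the genuine model explains the probe better than
    the world model by more than the decision threshold."""
    score = hmm_log_likelihood(obs, *genuine) - hmm_log_likelihood(obs, *world)
    return score, score > threshold


if __name__ == "__main__":
    # Hypothetical three-state models with one-dimensional features.
    trans = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]]
    genuine = ([1.0, 0.0, 0.0], trans, [[0.0], [1.0], [2.0]], [[1.0]] * 3)
    world = ([1.0, 0.0, 0.0], trans, [[5.0], [6.0], [7.0]], [[1.0]] * 3)
    print(verify([[0.1], [0.9], [2.1]], genuine, world, threshold=0.0))
```

In the paper's setting each observation would be a 52-, 104- or 156-dimensional cepstral feature vector rather than the one-dimensional toy values used here.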
V. EVALUATION AND RESULTS
Different tests were conducted, which are described in the following subsections.
A. Identification of Optimal Amount of Training Data
Fig. 5 shows the Detection Error Trade-off (DET-)curves obtained when using different amounts of training data using the example of feature BFCC2MFCC (a similar trend was observed for the other feature sets). The test set consisted of the first round of the second day. In the beginning only the flat floor data, recorded in the enrolment phase of the data collection, was used for training (00). These were only 10 seconds of training data per subject, which is clearly not enough and results in an EER of 31.6%. The training set was increased by adding one section after the other until the enrolment data and all data collected during the first walking round (section 00 to 09) were used for training. Already when adding only the first section the EER decreases significantly to 18.55%. The next significant decrease to an EER of 15.77% can be seen when sections 00 to 04 are used for training. Adding more data has no big influence anymore: Using section 00 to 09 of the first walk for training gives an EER of 15.58%, adding all sections of the second walk keeps this at the same level with an EER of 15.74%. Using all available training data (sections 00-09 of all three walks) gives an EER of 15.46%. Therefore, the training set consisting of section 00 to 04 is considered to be the best compromise of good classification performance and a short enrolment time.
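Since all results in this section are reported as EERs, a minimal Python sketch of how an EER can be read off a set of comparison scores may be helpful. It assumes that higher scores indicate a better match and that scores at or above the threshold are accepted. The function names are hypothetical and the simple threshold sweep is only one of several common ways to approximate the EER.

```python
def error_rates(genuine, impostor, threshold):
    """Return (FNMR, FMR) when scores >= threshold are accepted."""
    fnmr = sum(s < threshold for s in genuine) / len(genuine)
    fmr = sum(s >= threshold for s in impostor) / len(impostor)
    return fnmr, fmr


def equal_error_rate(genuine, impostor):
    """Sweep every observed score as threshold and return (EER, threshold)
    at the point where FMR and FNMR are closest to each other."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        fnmr, fmr = error_rates(genuine, impostor, t)
        gap = abs(fnmr - fmr)
        if best is None or gap < best[0]:
            best = (gap, (fnmr + fmr) / 2.0, t)
    return best[1], best[2]


if __name__ == "__main__":
    # Toy scores, not taken from the evaluation.
    genuine_scores = [0.9, 0.8, 0.7, 0.4]
    impostor_scores = [0.6, 0.3, 0.2, 0.1]
    print(equal_error_rate(genuine_scores, impostor_scores))
```

A DET-curve as in Fig. 5 corresponds to plotting the (FMR, FNMR) pairs returned by `error_rates` over all thresholds.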
Fig. 5 DET-curves for different amounts of training data.
B. Identification of Best Performing Feature Set
TABLE 2 gives the EERs for all tested feature sets and segment lengths. All results range between an EER of 15.77% and 18.94%, hence there are no significant differences between the tested feature sets. The best performing one is BFCC2MFCC, closely followed by MFCC. Using a segment size of 2000ms results in slightly higher EERs than the similarly performing segment sizes of 3000ms and 4000ms. These tendencies are the same through all used amounts of training data. Therefore, from now on the focus will lie on feature set BFCC2MFCC and using sections 00 to 04 for training.
TABLE 2 EER [%] FOR DIFFERENT FEATURE SETS AND SEGMENT SIZES (frequency = 50 Hz)
Feature Set       2000ms  3000ms  4000ms
BFCC1             18.94   18.46   18.13
BFCC2             18.13   17.18   16.59
MFCC              18.11   16.54   15.98
BFCC1BFCC2        18.18   17.59   17.38
BFCC1MFCC         17.89   17.21   17.09
BFCC2MFCC         17.7    16.51   15.77
BFCC1BFCC2MFCC    17.68   16.95   16.81
C. Separate Results for Each Test Section
The system was trained using data from sections 00 to 04 collected on the first day. Now the performance is analyzed when using the sections separately for testing. Fig. 6 shows the obtained EERs as bar plots. One can see that the results are similar for all sections; only sections 04 and 08 give worse results. This corresponds to the results in [12] and is due to the fact that these are the sections where the subjects partly had to walk on stairs. As no activity recognition has been done during preprocessing to identify the walking parts, the data recorded while the subjects are walking upstairs or downstairs is still included. Therefore, some probe segments do not contain walking data and hence worsen the results.
Fig. 6 EERs separately for each section using feature set BFCC2MFCC with segment sizes 2000, 3000 and 4000.
D. Same-day Results
To analyze the influence of time on gait, the same-day results were computed. The training set consists as before of data from sections 00-04 collected during the first round on the first day. Now the test data consists of sections 01-09 from the second round of the first day. In this case an EER of 7.88% for feature set BFCC2MFCC is obtained, which is approximately half of the cross-day result reported so far. This indicates the high variability of gait over time and shows the importance of conducting research on a database that was created on two different days.
E. Varying the Interpolation Rate
So far, all stated results were obtained when using an interpolation rate of 50 samples per second during preprocessing, as this rate performed best in previous evaluations. It is necessary to validate this result for the current database. TABLE 3 shows the results for interpolation rates 25 and 100. One can see that interpolation to rate 100 yields nearly as good results as rate 50. When using 25 samples per second the results get worse. This confirms the correct choice of interpolation rate 50.
TABLE 3 RESULTS FOR INTERPOLATION RATE 25, 50 AND 100
BFCC2MFCC    2000ms  3000ms  4000ms
25           18.29   17.29   16.47
50           17.7    16.51   15.77
100          17.42   16.53   15.8
F. Subject-wise Results
A further interesting aspect is the stability of the results over all subjects. The error rates are more reliable if they do not vary much for different subjects. When analyzing the cycle extraction results in [12] a high variability was observed. Fig. 7 gives the results for HMMs using feature set BFCC2MFCC. For each subject the FNMR is plotted which is obtained at an overall FMR of approximately 10%. One can see that the FNMR is around 8% for most of the subjects, with only ten outliers. Hence, a much higher stability than for the cycle extraction method is obtained. The reason for some of the outliers might be that a few subjects (IDs 8, 13, 15, 17, 27) wore different shoes on the two sessions. But also the position of the phone due to different trousers could be a reason.
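The subject-wise analysis can be sketched in Python as follows. The sketch assumes that one global threshold is chosen from the pooled impostor scores so that the overall FMR does not exceed the target, and that each subject's FNMR is then computed at this threshold. The function names and the toy scores are hypothetical; how the operating point was actually selected for Fig. 7 is not specified beyond the overall FMR of approximately 10%.

```python
def threshold_for_fmr(impostor_scores, target_fmr):
    """Smallest observed score threshold whose FMR does not exceed the target.

    Scores at or above the threshold count as accepted.
    """
    ranked = sorted(impostor_scores)
    for t in ranked:
        fmr = sum(s >= t for s in ranked) / len(ranked)
        if fmr <= target_fmr:
            return t
    # Fallback: a threshold stricter than every impostor score.
    return ranked[-1] + 1.0


def per_subject_fnmr(genuine_by_subject, threshold):
    """FNMR of each subject at one shared decision threshold."""
    return {subject: sum(s < threshold for s in scores) / len(scores)
            for subject, scores in genuine_by_subject.items()}


if __name__ == "__main__":
    # Toy scores, not taken from the evaluation.
    impostor = list(range(10))
    genuine = {"subject_a": [10, 9, 3], "subject_b": [12, 11]}
    t = threshold_for_fmr(impostor, target_fmr=0.1)
    print(t, per_subject_fnmr(genuine, t))
```

Plotting the returned dictionary as one bar per subject gives a view comparable to Fig. 7, in which outliers show up as subjects with a markedly higher FNMR.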
Fig. 7 FNMR for each of the 48 subjects at a FMR of 10%.
VI. VOTING SCHEME
Until now, each classification result for each segment has been used directly to calculate the FMR and FNMR. An alternative approach is a voting system, which merges multiple classification results into a single one [21]. The authentication decision is then based on these multiple (#V) classifications, not only on one. Inspired by a petition quorum, the scheme is called quorum voting. This quorum requires that at least #GV of the #V classifications vote for acceptance of the user’s verification claim, otherwise the probe signal is rejected. Fig. 8 shows the results obtained for feature BFCC2MFCC and different amounts of training data. Sections 01 to 09 from the second day have been used as test data. The upper line corresponds to the results stated in section V-A, which have been calculated without voting and using segment size 4000. Using #V=60 and #GV=1, these error rates can be decreased from 15.77% to 7.45% (when sections 00 to 04 are used for training). This decrease of the error level is depicted by the drop from the blue line (with circular symbols) down to the green line (with square symbols). Even better results are obtained when a segment size of 2000ms is used. Using the same settings as above an EER of 7.33% can be reported. The same-day results (see section V-D) can even be decreased to 0.71% EER. After voting, the best results are obtained for the feature MFCC. When using sections 00-04 for training we get an EER of 6.15%. The lowest EER of 5.81% is obtained when further increasing the amount of training data (section 00 to 08).
VII. DISCUSSION
The evaluations described in the previous sections show the outstanding performance which can be achieved when using HMMs for accelerometer-based biometric gait recognition. As stated in the introduction, two different approaches for accelerometer-based biometric gait recognition exist. One approach is based on segmentation and comparison using machine learning algorithms, like the method reported in this paper. The second approach uses cycle extraction techniques. A benchmark of these two techniques is possible, as in [12] a cycle extraction method was applied to the same data set as used in this paper. Only the enrolment data (10 seconds) was used to calculate the reference cycle. When all data from the second day is used as probe data, the best obtained EER is 21.7% when using the cycle extraction method. Fig. 5 shows that at least 33 seconds of enrolment data (section 00 and 01) are necessary for a sufficient training of HMMs, even better are 114 seconds (section 00 to 04). But as soon as well trained HMMs are available, the performance (EER = 6.15%) is much better than when applying the cycle extraction method. In [11] HMMs were evaluated on a database containing only walking straight on flat floor. Surprisingly, the results in that case have not been better than the results stated in this paper. Without voting an EER of 15.39% has been obtained, but quorum voting had only minor influence in this case and decreased the EER to 13.98%. Hence, there is no degradation of recognition performance when a more realistic data set is used.
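The quorum voting rule of section VI, which produces the voted EERs compared in this discussion, can be written down in a few lines of Python. The sketch assumes that the per-segment decisions are grouped into consecutive, non-overlapping blocks of #V decisions. The grouping strategy and the function names are assumptions, as the paper only states the quorum condition itself.

```python
def quorum_vote(decisions, quorum):
    """Accept if at least `quorum` (#GV) of the given decisions are accepts."""
    return sum(1 for d in decisions if d) >= quorum


def vote_over_groups(decisions, group_size, quorum):
    """Split segment decisions into consecutive groups of `group_size` (#V)
    and apply the quorum rule to each group. A trailing incomplete group
    is ignored (assumption)."""
    groups = [decisions[i:i + group_size]
              for i in range(0, len(decisions) - group_size + 1, group_size)]
    return [quorum_vote(group, quorum) for group in groups]


if __name__ == "__main__":
    # 120 toy segment decisions: a single accept in the first block of 60.
    segment_decisions = [False] * 59 + [True] + [False] * 60
    print(vote_over_groups(segment_decisions, group_size=60, quorum=1))
```

With #V=60 and #GV=1 a single accepted segment is enough to accept the whole block, so the per-segment threshold can be set much more strictly than without voting.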
VIII. CONCLUSION AND FUTURE WORK
Accelerometer-based biometric gait recognition is still a new field of research and most evaluations use test data recorded under laboratory conditions, containing just walking straight on flat floor. As HMMs showed good results for this kind of data set, the next step was an evaluation on a more realistic one, as conducted in this paper. It could be shown that HMMs fulfill this task. Training the HMMs with about two minutes of walking data (containing walking around corners and upstairs), an EER of 7.45% could be obtained with mixed test data of all route sections. A separate evaluation for each section showed that the results for the sections containing no stairs are even better. It has to be confirmed that the algorithm runtimes are acceptable for a practical application. Therefore, future work will include the implementation of a biometric gait recognition module for cell phones using HMMs for classification.
Fig. 8 Influence of the voting scheme for different amounts of training data (cross-day).
IX. ACKNOWLEDGEMENTS
This work was supported by CASED (www.cased.de). The authors would like to thank the numerous participants in the data collection.
X. REFERENCES
[1] F. Breitinger and C. Nickel, “User Survey on Phone Security,” in BIOSIG 2010 - Proceedings of the Special Interest Group on Biometrics and Electronic Signatures, 2010.
[2] H. J. Ailisto, M. Lindholm, J. Mäntyjärvi, E. Vildjiounaite, and S.-M. Mäkelä, “Identifying people from gait pattern with accelerometers,” Biometric Technology for Human Identification II, vol. 5779, no. 1, pp. 7–14, 2005, VTT Electronics, Finland.
[3] D. Gafurov, “Performance and security analysis of gait-based user authentication,” Ph.D. dissertation, 2008.
[4] S. Sprager, “A cumulant-based method for gait identification using accelerometer data with principal component analysis and support vector machine,” in Sensors, Signals, Visualization, Imaging, Simulation and Materials, 2009, pp. 94–99.
[5] J. Frank, S. Mannor, and D. Precup, “Activity and gait recognition with time-delay embeddings,” in AAAI Conference on Artificial Intelligence, 2010.
[6] J. Kwapisz, G. Weiss, and S. Moore, “Cell phone-based biometric identification,” in Biometrics: Theory Applications and Systems (BTAS), 2010 Fourth IEEE International Conference on, 2010, pp. 1–7.
[7] M. O. Derawi, C. Nickel, P. Bours, and C. Busch, “Unobtrusive User-Authentication on Mobile Phones using Biometric Gait Recognition,” in Sixth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2010.
[8] M. Müller, Information Retrieval for Music and Motion. Secaucus, NJ, USA: Springer-Verlag New York, Inc., 2007, ch. 4 - Dynamic Time Warping.
[9] M. O. Derawi, P. Bours, and K. Holien, “Improved Cycle Detection for Accelerometer Based Gait Authentication,” in Sixth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2010.
[10] C. Nickel, C. Busch, S. Rangarajan, and M. Möbius, “Using Hidden Markov Models for Accelerometer-Based Biometric Gait Recognition,” in 2011 7th International Colloquium on Signal Processing & Its Applications (CSPA 2011), 2011.
[11] “Submitted for review.”
[12] C. Nickel, M. O. Derawi, P. Bours, and C. Busch, “Scenario Test of Accelerometer-Based Biometric Gait Recognition,” in IWSCN 2011 - 3rd International Workshop on Security and Communication Networks, 2011.
[13] L. Rabiner and B.-H. Juang, Fundamentals of speech recognition. Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 1993.
[14] E. Zwicker, “Subdivision of the audible frequency range into critical bands (frequenzgruppen),” The Journal of the Acoustical Society of America, vol. 33, no. 2, pp. 248–248, 1961.
[15] D. Ellis, Reproducing the feature outputs of common programs using Matlab and melfcc.m, Department of Electrical Engineering, Columbia University, last accessed 31st May 2011. [Online]. Available: http://www.ee.columbia.edu/~dpwe/resources/matlab/rastamat/mfccs.html
[16] L. E. Baum and T. Petrie, “Statistical inference for probabilistic functions of finite state markov chains,” The Annals of Mathematical Statistics, vol. 37, no. 6, pp. 1554–1563, 1966.
[17] D.-P. Munteanu and S.-A. Toma, “Automatic speaker verification experiments using HMM,” in 8th International Conference on Communications (COMM), 2010.
[18] A. Schlapbach and H. Bunke, “Using hmm based recognizers for writer identification and verification,” in IWFHR ’04: Proceedings of the Ninth International Workshop on Frontiers in Handwriting Recognition. Washington, DC, USA: IEEE Computer Society, 2004, pp. 167–172.
[19] F. Bimbot, J.-F. Bonastre, C. Fredouille, G. Gravier, I. Magrin-Chagnolleau, S. Meignier, T. Merlin, J. Ortega-García, D. Petrovska-Delacrétaz, and D. A. Reynolds, “A tutorial on text-independent speaker verification,” EURASIP J. Appl. Signal Process., vol. 2004, pp. 430–451, January 2004.
[20] M. J. Carey, E. S. Parris, and J. S. Bridle, “A speaker verification system using alpha-nets,” in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, ser. ICASSP ’91. Washington, DC, USA: IEEE Computer Society, 1991, pp. 397–400.
[21] C. Nickel, H. Brandt, and C. Busch, “Classification of acceleration data for biometric gait recognition on mobile devices,” in BIOSIG 2011 - Proceedings of the Special Interest Group on Biometrics and Electronic Signatures, 2011.
XI. VITA
Claudia Nickel studied Mathematics with Computer Science at Technische Universität Darmstadt, Germany. She graduated in 2007 and afterwards worked at Fraunhofer Institute for Computer Graphics Research in Darmstadt in the area of Perceptual Hashing. In January 2009 she started her Ph.D. at the University of Applied Sciences Darmstadt, which is a cluster partner of the Center for Advanced Security Research Darmstadt (CASED). Her fields of research are biometrics and protection of biometric data. Her focus lies on accelerometer-based gait recognition.
Christoph Busch is a member of the Faculty of Computer Science and Media Technology at the Gjøvik University College (GUC), Norway. He holds a joint appointment with the media faculty at University of Applied Sciences Darmstadt, Germany. He received his PhD in the field of computer graphics in 1997. In the same year he joined the Fraunhofer Institute for Computer Graphics Research in Darmstadt as head of the department Security Technology. Prof. Dr. C. Busch has since been responsible for the acquisition and management of numerous applied research and development projects. Christoph Busch has published numerous technical papers and has been a speaker at international conferences. He has served on various program committees (NIST IBPC, BSI-Congress, GI-Congress, DACH, WEDELMUSIC, EUROGRAPHICS) and as a reviewer for several conferences, journals and magazines (ACM SIGGRAPH, IEEE CG&A, IEEE Transactions on Signal Processing, Elsevier Computers & Security, etc.).