User authentication using keystroke dynamics for cellular ... - COMLAB

27 downloads 5348 Views 583KB Size Report
Jan 15, 2009 - digital assistant (PDA), with enhanced functionalities such ... online purchase and selling, file transfer and the like, are ..... a user's signature.
www.ietdl.org Published in IET Signal Processing Received on 28th August 2008 Revised on 15th January 2009 doi: 10.1049/iet-spr.2008.0171

Special Issue on Biometric Recognition

ISSN 1751-9675

User authentication using keystroke dynamics for cellular phones P. Campisi E. Maiorana M. Lo Bosco A. Neri Dip. Elettronica Applicata, Universita´ degli Studi ‘Roma Tre’, Via della Vasca Navale 84, Roma I-00146, Italy E-mail: [email protected]

Abstract: A new approach for keystroke-based authentication when using a cellular phone keypad as input device is presented. In the proposed method, users are authenticated using keystroke dynamics acquired when typing fixed alphabetic strings on a mobile phone keypad. The employed statistical classifier is able to perform user verification with an average equal error rate of about 13%. The obtained experimental results suggest that, when using mobile devices, a strong secure authentication scheme cannot rely on the sole keystroke dynamics, which however can be a module of a more complex system including, as basic security, a password-based protocol eventually hardened by keystroke analysis.

1

Introduction

In the last decade we have witnessed a dramatic widespread use of mobile devices, such as cellular phones and personal digital assistant (PDA), with enhanced functionalities such as storing capabilities, multimedia files management, e-mail and multimedia messaging. This has been made possible thanks to both the fast development of wireless networks all over the world and the parallel decreasing of connection prices and device costs. Moreover, an ever-increasing number of applications, such as bank account managing, online purchase and selling, file transfer and the like, are made available through advanced mobile devices. Therefore with reference to this scenario, the need to grant secure access to stored data as well as to sensitive applications becomes an issue of paramount importance. In contrast with traditional authentication approaches, based on what a person knows (password, PIN etc.) or what a person has (ID card, keys, tokens etc.), biometric authentication [1] has been recently proposed as an authentication strategy, which relies on who a person is or what a person does. It is based on strictly personal traits, much more difficult to be forgotten, lost, stolen, copied or forged than traditional data. Loosely speaking, biometric systems are essentially pattern-recognition applications, performing authentication using biometric attributes derived from physiological (like fingerprint, face, iris, retina, hand IET Signal Process., 2009, Vol. 3, Iss. 4, pp. 333 – 341 doi: 10.1049/iet-spr.2008.0171

geometry, thermograms, DNA, ear shape, body odour, vein patterns, electrocardiogram, brain waves etc.) or behavioural characteristics (like voice, signature, handwriting, gait, keystroke, lip motion etc.) that persons possess. The most used authentication approaches using mobile handsets rely on the use of PINs. However, PINs are as inconvenient as passwords since they can be easily forgotten or stolen. Moreover, many users do not use the security provided by the PIN code at all, as discussed in [2]. In addition to PINs, a further level of security can be added to mobile systems by removing the SIM card from the handset and by storing it in a secure place when not used. In this scenario, the SIM is the token that the person has to possess in order to use the system. However, this approach is very inconvenient and therefore it is very unlikely to be used. Within this framework, we propose an approach for user authentication using keystroke dynamics with application to mobile phones, which because of the specific input keypad present some peculiarities with respect to keystroke analysis using traditional QWERTY keyboards. Specifically, static text-based authentication is performed by using a statistical classifier, which allows obtaining significant performance with respect to already proposed methods employing complex neural networks. 333

& The Institution of Engineering and Technology 2009

Authorized licensed use limited to: BIBLIOTECA D'AREA SCIENTIFICO TECNOLOGICA ROMA 3. Downloaded on July 13, 2009 at 06:36 from IEEE Xplore. Restrictions apply.

www.ietdl.org The paper is organised as follows. In Section 2 keystroke dynamics is described and its feasibility for user authentication using mobile phones is discussed. In Section 3, the proposed approach is detailed. Experimental results and conclusions are presented in Sections 4 and 5, respectively.

2

Keystroke dynamics

Keystroke dynamics refers to user authentication, based not on what a user types but on how a user types at a terminal equipped with a keyboard, which can be a QWERTY keyboard used with personal computers, or a generic interface equipped with keys that can be pressed. Therefore this technique has potential applications with ATMs, cellular phones and the like. The first documented observation that each user has a unique way of typing dates back to the end of the 19th century, when telegraphists were able to identify other operators listening to the rhythm of their Morse code sequences. However, the first application of keystroke analysis to user identification is much more recent [3]. Keystroke authentication can be classified as either static or continuous. The first refers to keystroke analysis performed only at specific times, for example during the login process. When the latter is applied, the analysis of the typing rhythm is performed continuously during the whole session, thus providing a tool to detect user substitution after the login. Keystroke loggers, that is either software or hardware devices able to capture and store the keystroke activity of a user, have been already used in forensic applications [4]. However, the monitoring functionality can be used not only to detect if a user gets unauthorised access to a computer after an authorised user has logged in, but also to monitor the user level of attention and productivity in general. Of course, this raises many privacy issues that must be addressed to employ the system in real-world applications [5]. Keystroke dynamics has been extensively analysed for traditional QWERTY keyboards, and its feasibility as an authentication tool has been deeply investigated [6–9]. A survey of the proposed approaches up to 2004 and their comparative analysis is given in [10]. In [11] an authentication approach based on static keystroke dynamics and on the key ASCII codes is proposed. A system composed of parallel decision trees is used to authenticate users based on keystroke patterns in [12]. In [13] impostors patterns are used to retrain the classifier by means of learning vector quantisation. Different features can be extracted from the typing pattern, such as interkey time, keystroke time, keystroke latency, that is the time between one key being released and the next key being depressed, overall timing speed and frequency of errors in typing. More refined approaches, which however require special equipment, can also rely on data coming from the applied pressure on the keys [14] and on motion information coming from the hand movement when typing [15]. In [16] the problem of the quality of typed patterns has been discussed. Specifically, two major quality factors, 334 & The Institution of Engineering and Technology 2009

namely uniqueness and consistency, have been introduced. Moreover in [16], the authors have proposed to improve the quality of typing patterns by asking the users to follow some artificial rhythms. In [17] an experimental study on different typing strategies making use of artificial rhythms and cues, which help to keep the rhythm and to improve data quality, is presented. Even though the applicability of keystroke authentication to PC keyboards has been widely documented, a comparable effort has not been performed when keypads such as the ones used in mobile handsets or ATMs are used. Some preliminary studies have been performed in [18] where a traditional keyboard is substituted by four pairs of IR transmitters and receivers. A biometric verification system tailored to ATM user authentication was proposed in [19], where keystroke motion is analysed when typing a four-digit PIN code. With reference to mobile handset keypads, the different layout, the use of smaller keys, the keys shape and the key response to pressure make the keystroke analysis peculiar. A preliminary study has been carried out in [20], where the application of keystroke analysis to mobile phones is discussed. In more detail, two experiments have been performed: the same four-digit PIN code, for all the 16 users enrolled in the system, has been acquired 30 times for each user, and the same eleven-digit telephone number, for all the 16 users enrolled in the system, has been acquired 50 times for each user. A feed-forward multi-layered perceptron network has been employed, obtaining an equal error rate (ERR) of 15% for both static PIN code and static telephone number based user authentication. The work in [20] has been extended in [21] where different neural networks, like the feed-forward multi-layered perceptron network, the radial basis function network and the generalised regression neural network, have been used. ERR values ranging from 13.3 to 15.5% and from 12.8 to 13.9% are obtained for both fixed four-digit PIN codes and fixed eleven-digit telephone numbers, respectively, by using the aforementioned neural networks. By using individually optimised neural network configurations, ERR values equal to 8.5 and 4.9% were obtained for the four and eleven-digit numerical inputs respectively. In [21], the possibility to authenticate users on mobile phones by means of the way in which they type text messages is also investigated. Specifically, a static-based authentication, but independent of the actual words typed by the user, is taken into account. Five networks are employed, each with a minimum of two and a maximum of six input characters. The best results are obtained using a training algorithm, which tunes the network parameters to each user’s inputs, using input vectors of five characters, thus achieving an EER of 17.9%. However, it is worth pointing out that the approaches in [20, 21] are quite expensive in terms of computational complexity and execution time. In [22] text entry on mobile devices using only five keys is exploited and a comparative analysis with standard nine-key input keypads is made. IET Signal Process., 2009, Vol. 3, Iss. 4, pp. 333– 341 doi: 10.1049/iet-spr.2008.0171

Authorized licensed use limited to: BIBLIOTECA D'AREA SCIENTIFICO TECNOLOGICA ROMA 3. Downloaded on July 13, 2009 at 06:36 from IEEE Xplore. Restrictions apply.

www.ietdl.org 3

Proposed approach

In this section, the proposed keystroke-based recognition system, tailored to cellular phones, is illustrated. The approach relies on the analysis of keystroke dynamics referred to static text input using mobile phone keypads. The user’s templates are computed during the enrolment phase, by evaluating some statistics from the available acquisitions, as described in Section 3. It is assumed that the employed device can record timestamps of the instants in which each key is pressed and released. Thus, if a user presses K keys when typing the requested input string, a vector of 2K events has to be recorded by the device, where two consecutive events correspond to a press/release couple. The authentication stage is performed by comparing the acquired keystroke dynamics with a stored template, as illustrated in Section 3.2.

3.1 Enrolment stage In the enrolment phase, for each user, a fixed number of biometric samples are acquired. Within this context, a biometric sample represents a single letter entered using the mobile keypad. We have chosen to use the key down– down time and the key up – up time as the base unit of measure. Each of these measures can be split into two orthogonal measures, for a total of four time measures, as delineated in the following. Specifically, given a couple of consecutive keys, it is possible to consider the following events (see Fig. 1): † time interval between the stroke of key i and the release of the same key, namely f PR (i); † time interval between the stroke of key i and the stroke of the successive key, namely f PP (i); † time interval between the release of key i and the stroke of the successive key, namely f RP (i); † time interval between the release of key i and the release of the successive key f RR (i).

Therefore by considering a word that requires K pressure/ release events in order to be typed, we define the following feature vectors f PR ¼ [ f PR (1), f PR (2), . . . , f PR (K )] f PP ¼ [ f PP (1), f PP (2), . . . , f PP (K  1)] f RP ¼ [ f RP (1), f RP (2), . . . , f RP (K  1)]

(1)

f RR ¼ [ f RR (1), f RR (2), . . . , f RR (K  1)] In Fig. 1 an example of the time features that can be extracted from the key press and release events, for an input word composed of three pressure/release events, is given. In the following, we indicate with f Du,e the generic feature vector for the eth acquisition, with e ¼ 1, 2, . . . , E, when the user u is considered, D [ {PP, RP, RR} being the time event taken into account. Using the feature vectors extracted from each of the considered E acquisitions, the intra-class mean vectors mPR u , RP RR mPP , m and m are estimated together with the intra-class u u u PP RP RR standard deviation vectors sPR u , su , su and su , as follows 8 E > 1X > > > mDu (k) ¼ fu,eD (k), k ¼ 1, . . . , K , D ¼ PR > > E > e¼1 < sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi E h i2 1 X D D > > , f (k)  m (k) D > u,e u > su (k) ¼ E  1 > > e¼1 > : k ¼ 1, . . . , K ; D [ {PP, RP, RR}

(2)

The intra-class mean and standard deviation vectors are then stored in the database and employed as a keystroke dynamics template during authentication. In order to take into account user intra-class variability, we can evaluate in the enrolment stage, for each user u, the distances of the enrolment acquisitions from the extracted template, and then average them

nDu

¼

8 E K D D > 1 X X jfu,e (k)  mu (k)j > > , > X jfu,e (k)  mDu (k)j > 1X > > , :E sD (k)

D [ {PP, RP, RR}

e¼1 k¼1

u

(3) The values nDu in (3), with D [ {PR, PP, RP, RR}, can be employed to perform the authentication stage in a useradaptive way, as will be described in Section 3.2

3.2 Authentication stage Figure 1 Timing-based keystroke features IET Signal Process., 2009, Vol. 3, Iss. 4, pp. 333 – 341 doi: 10.1049/iet-spr.2008.0171

In the authentication phase, a generic subject provides the requested input string and claims the identity of user u. 335

& The Institution of Engineering and Technology 2009

Authorized licensed use limited to: BIBLIOTECA D'AREA SCIENTIFICO TECNOLOGICA ROMA 3. Downloaded on July 13, 2009 at 06:36 from IEEE Xplore. Restrictions apply.

www.ietdl.org From the acquired keystroke sequence, the four feature vectors f~ PR , f~ PP , f~ RP and f~ RR are evaluated and compared with the stored template of user u, in order to produce the dissimilarity distances

sD ¼

8 D K > X > j f~ u (k)  mDu (k)j > > , > < sDu (k)

D ¼ PR

k¼1

> K 1 ~ D X > j f u (k)  mDu (k)j > > , > : sD (k) k¼1

(4) D [ {PP, RP, RR}

u

which are used to generate a global distance s. A decision regarding whether the acquired input string has been typed by the real user is made by comparing the result of the matching with a threshold. If the distance s is lower than the selected threshold, the user is recognised as the owner of the claimed identity, otherwise he is not authenticated. Different fusion strategies can be used in order to obtain a global distance s. The simplest consists in the sum rule, that is s ¼ sPR þ sPP þ sRP þ sRR . However, more refined fusion strategies can be designed using the distances in (4). It is worth pointing out that the distances in (4) are computed by means of vectors whose length is dependent on the number K of press/release events necessary to generate the employed word. This variability directly affects the dissimilarity distances of (4). Therefore the use of some normalisation techniques can be beneficial. In the following paragraphs, we describe different normalisation strategies that can be employed in the proposed keystroke-based authentication system. The verification performances achievable using the proposed approaches will be compared in Section 4.

computed as rD ¼

sD , nDu

D [ {PR, PP, RP, RR}

(6)

where the values nDu , defined in (3), represent the statistics characterising the variability of the feature vectors in the enrolment set. The global distance s is computed as s ¼ r PR þ r PP þ r RP þ r RR . A similar approach has been employed in [23] for online signature verification using dynamic time warping. It is worth pointing out that the use of the approach described in (6) leads to a user-adaptive normalisation strategy, which requires the storage of each user normalisation parameter nDu , together with the template mDu , sDu , where D [ {PR, PP, RP, RR}.

3.2.3 Score normalisation by using training data: The normalisation procedures described in Sections 3.2.1 and 3.2.2 do not require any training phase. However, when a set of training keystroke acquisitions is available, it is also possible to compute the normalised dissimilarity distances r D , with D [ {PR, PP, RP, RR}, given by (5) or (6), and estimate suitable statistical models that fit the score distributions. Using these models, the parameters of an additional score normalisation procedure can be determined. Among the many normalisation techniques that have been proposed in the literature [24], in the experiments described in Section 4 we consider the following score normalisation techniques: † min – max score normalisation, † z-score score normalisation, † median score normalisation,

3.2.1 User-independent score normalisation: The scores, computed according to (4) can be normalised with respect to the length of the employed feature vectors. Specifically, the normalised scores can be obtained as 8 D s > > < , D ¼ PR rD ¼ K D > > : s , D [ {PP, RP, RR} K 1

(5)

and the global distance s is computed as s ¼ r PR þ r PP þ r RP þ r RR . As can be seen, the normalisation in (5) is performed for each user by means of the key press/release event number, thus resulting in the simplest possible form of normalisation.

3.2.2 User-dependent score normalisation: The dissimilarity distances introduced in (4) can be normalised with respect to the intra-class variability of each user. Following this approach, the normalised distances can be 336 & The Institution of Engineering and Technology 2009

† tanh-estimator score normalisation, † double sigmoid score normalisation. In Section 4, the recognition performances using the described normalisation approaches are presented.

4

Experimental results

An extensive set of experiments has been performed in order to evaluate the recognition performances achievable using the proposed keystroke dynamics-based authentication system. The employed database, which has been collected by the authors according to the modalities described hereafter, consists of alphabetical passwords whereas numerical passwords are not considered. The two cases are quite different; in fact, a character generation on a mobile handset involves the pressure of a key several times, whereas this is not the case for a number generation. Moreover, the analysis of IET Signal Process., 2009, Vol. 3, Iss. 4, pp. 333– 341 doi: 10.1049/iet-spr.2008.0171

Authorized licensed use limited to: BIBLIOTECA D'AREA SCIENTIFICO TECNOLOGICA ROMA 3. Downloaded on July 13, 2009 at 06:36 from IEEE Xplore. Restrictions apply.

www.ietdl.org keypad have been selected for the considered experiments. Using the ITU E.161 standard keypad, different letters are associated with the same key, which have to be pressed more than once to produce the desired character. The employed passwords have also been selected to approximately equally distribute the number of times a key has to be pressed to produce a single letter between one and four. No predictive text technology, like the popular patented text on nine keys (T9) method, has been employed when typing the password. Using the considered passwords, the number K of key press/release events registered during each acquisition, and employed to describe the user’s keystroke dynamics, is comprised between 20 and 26.

Figure 2 Employed mobile phone and its division in regions

text input on a mobile device has received much less attention than numerical input. The database has been acquired using a Nokia 6680 mobile phone, shown in Fig. 2. The employed phone’s dimensions are 108.4  55.2  20.5 mm, while the keypad area is limited to 33 20 mm. Each key then has an approximate dimension of 11 5 mm, although the key’s shape is not perfectly rectangular. The Nokia 6680 mobile phone features a standard keypad with 12 buttons. The alphanumeric characters are arranged according to the ITU E.161 [25]. Keystroke dynamics is acquired by installing on the mobile phone a J2ME program (see the Appendix for details on the software implementation), which logs timestamps of key presses and releases. The phone clock, characterised by a sampling rate of 1 ms, has been used to determine the timestamps. The employed password database has been collected by the authors and consists of six passwords, each ten characters long. Thirty users with an age range between 25 and 40 years have donated each password 20 times. The employed database comprises 6  30  20 ¼ 3600 keystroke acquisitions. In the password acquisition protocol, if a user is mistaken in typing the sequence composing the password, the attempt is discarded. In fact, we assume that the keystroke-based authentication is used as a password hardening mechanism. The authentication protocol therefore requires that the password is first properly typed, and that the keystroke patterns are then analysed. Words whose letters are approximately equally distributed over the three different rows and columns of the employed IET Signal Process., 2009, Vol. 3, Iss. 4, pp. 333 – 341 doi: 10.1049/iet-spr.2008.0171

The available database has been split into an enrolment set, composed of the first ten acquisitions of each user, and into an authentication set, containing the remaining acquisitions of each user. The data set employed in the enrolment and the one used in the authentication stage are therefore disjoint. The enrolment phase has been realised taking, for each user and for each password, the first E ¼ {5, 6, 7, 8, 9, 10} acquisitions from the enrolment set. Two different kinds of authentication have been considered: † A legitimate user authentication, obtained when the users try to be authenticated in their own account. For each user and for each considered password, the ten users’ acquisitions from the authentication set have been employed for this task; † An impostor user authentication, obtained when the users try to be authenticated in other users’ accounts. The performances of the proposed keystroke dynamicsbased authentication system have been measured in terms of the false rejection rate (FRR), which is the probability that the system fails to verify the legitimate user claimed identity, and the false acceptance rate (FAR), which is the probability that the system fails to reject an impostor user. Specifically, the results obtained when using the user-independent score normalisation described in Section 3.2.1 are reported in Fig. 3a, whereas in Fig. 3b the performances of a keystrokebased authentication system employing the user-dependent score normalisation illustrated in Section 3.2.2 are given. In both cases, the verification performances are illustrated by means of different receiver operating characteristics (ROC) curves, obtained by varying the number E of acquisitions considered during enrolment, and by considering all the 30 users available in the collected database. It is worth pointing out that the reported performances are related to tests performed using all the six passwords in the database. As evident from the obtained results, the performances of a keystroke-based authentication system dramatically improve when considering more than five acquisitions in the enrolment stage. This is in agreement with what is stated in [8], where it is recommended not to use less than six training samples to obtain good verification 337

& The Institution of Engineering and Technology 2009

Authorized licensed use limited to: BIBLIOTECA D'AREA SCIENTIFICO TECNOLOGICA ROMA 3. Downloaded on July 13, 2009 at 06:36 from IEEE Xplore. Restrictions apply.

www.ietdl.org

Figure 3 ROC curves for different values of enrollment acquisitions (E¼ f5, 6, 7, 8, 9, 10g) a System employing a user independent score normalisation (Section 3.2.1) b System employing a user-dependent score normalisation (Section 3.2.2)

Figure 4 EERs with respect to the number E of enrollment acquisitions a System employing a user-independent score normalisation (Section 3.2.1) b System employing a user-dependent score normalisation (Section 3.2.2).

performances. Moreover, the obtained results show that the user-dependent score normalisation described in Section 3.2.2 performs better than the user-independent one, proposed in Section 3.2.1. This behaviour can also be observed in Figs. 4a and 4b, which, respectively, show the equal error rates (EERs) obtained when employing a user-independent and a 338 & The Institution of Engineering and Technology 2009

user-dependent score normalisation, for different numbers E of keystroke acquisitions considered during enrolment. As evident from the obtained experimental results, when using the user-independent score normalisation, the ERR dramatically improves when the number of the acquisition E during enrolment stage increases from E ¼ 5, with an EER of 27.02%, to E ¼ 6, with an EER of 21.46%. The IET Signal Process., 2009, Vol. 3, Iss. 4, pp. 333– 341 doi: 10.1049/iet-spr.2008.0171

Authorized licensed use limited to: BIBLIOTECA D'AREA SCIENTIFICO TECNOLOGICA ROMA 3. Downloaded on July 13, 2009 at 06:36 from IEEE Xplore. Restrictions apply.

www.ietdl.org EER keeps improving when increasing the number of enrolment acquisition, reaching the value of 16.21% when E ¼ 10 acquisitions are considered. However, if the userdependent score normalisation described in Section 3.2.2 is employed, the EER achievable when taking E ¼ 10 keystroke acquisition during enrolment decreases to 14.46%. The possibility of improving the achieved recognition performances using additional score normalisation, by exploiting the statistics of the scores computed over a set of training data, has also been investigated. Specifically, in our experiments, we have employed a training set given by the acquisitions taken from the first five users of the available database, and evaluated the parameters of the normalisation techniques in Section 3.2.3 on this dataset. Then, the verification performances achievable when employing the score normalisation techniques described in Section 3.2.3 have been evaluated over a test set including the acquisitions of the remaining 25 users of the collected database. The achieved results are shown in Fig. 5a for systems based on a

user-independent score normalisation, and in Fig. 5b for systems based on a user-dependent score normalisation. For the sake of clarity, the results given in Fig. 5 have been detailed in Table 1 for E ¼ 10. The obtained EERs show that only a slight improvement in the verification performances can be achieved by exploiting the statistics of training data. Specifically, using the user-independent score normalisation described in Section 3.2.1, an EER equal to 16.25% (over the test set with 25 users) is obtained, whereas an EER equal to 15.62% is achieved when employing double sigmoid score normalisation. When dealing with systems employing the user-dependent score normalisation described in Section 3.2.2, the EER of 13.63% can be improved to 13.15% by means of z-score normalisation procedure. However, the achieved performance improvements obtained by using the fusion techniques described in Section 3.2.3 are not significant. Therefore the normalisation procedures proposed by the authors in Sections 3.2.1 and 3.2.2, although simple, are quite effective and can be properly used in the proposed keystroke-based authentication system.

Figure 5 EERs achievable with different score normalisation techniques exploiting the statistics of training data with respect to the number E of enrollment acquisitions a System employing a user-independent score normalisation (Section 3.2.1) b System employing a user-dependent score normalisation (Section 3.2.2)

Table 1 EERs (in %) achievable with different score normalisation techniques exploiting the statistics of training data, considering E ¼ 10 acquisitions during enrolment Proposed system

No additional score normalisation

Additional score normalisation min– max z-score median tanh-estimator double sigmoid

user-independent normalisation

16.25

15.73

15.75

15.81

15.72

15.62

user-dependent normalisation

13.63

13.33

13.15

13.28

13.30

13.37

IET Signal Process., 2009, Vol. 3, Iss. 4, pp. 333 – 341 doi: 10.1049/iet-spr.2008.0171

339

& The Institution of Engineering and Technology 2009

Authorized licensed use limited to: BIBLIOTECA D'AREA SCIENTIFICO TECNOLOGICA ROMA 3. Downloaded on July 13, 2009 at 06:36 from IEEE Xplore. Restrictions apply.

www.ietdl.org

Figure 6 EERs with respect to the increasing number of characters employed as password. The results are shown considering different values E ¼f5, 6, 7, 8, 9, 10g of enrollment acquisitions a System employing a user-independent score normalisation (Section 3.2.1) b System employing a user-dependent score normalisation (Section 3.2.2)

Finally, we performed tests to evaluate the importance of the length of the considered password. Specifically, we estimated the recognition performances obtained when the employed passwords are truncated, thus considering words with less than ten characters. The obtained results are illustrated in Fig. 6a for a system using the user-independent score normalisation described in Section 3.2.1, where the EERs achievable employing different numbers E of enrolment acquisition are displayed with respect to the increasing number of employed characters. The whole available database with 30 users has been employed to estimate the presented results. Moreover, the same analysis has been conducted for a system exploiting the user-dependent score normalisation described in Section 3.2.2. As can be expected, the achieved EER improves when the number of considered characters increases.

Moreover, the achieved performances are close to those obtained using other behavioural biometrics, like for example a user’s signature. Moreover, the social acceptance of keystroke-based authentication is quite high, which, together with interesting achievable performance, makes keystroke dynamics a good candidate to be used for secure authentication using cellular phones.

6

References

[1] JAIN A.K., FLYNN P., (Springer, 2008)

ROSS A. (EDS.):

‘Handbook of biometrics’

[2] CLARKE N.L., FURNELL S.M., RODWELL P., REYNOLDS P.L.: ‘Acceptance of subscriber authentication for mobile telephony devices’, Comput. Secu., 2002, 21, (3), pp. 220–228 [3] SPILLANE R. : ‘Keyboard apparatus for personal identification’, IBM Tech. Discl. Bull., 1975, 17, (3346)

5

Conclusions

In this paper we have proposed a keystroke dynamics-based authentication method with application to mobile phones, which can be used as a password hardening mechanism. In fact, a strong secure authentication system cannot rely on the sole keystroke dynamics, which however can be a module of a more complex system, which can provide, as basic security, a password-based protocol eventually hardened by keystroke analysis. Within this scenario, the user who wants to be authenticated needs both to type the correct password and to use the legitimate user’s correct typing pattern. Specifically, we have investigated the feasibility of static text-based authentication using as input device cellular phone keypads. 340 & The Institution of Engineering and Technology 2009

[4] ADAMS C.W.: ‘Legal requirements for the use of keystroke loggers’. Proc. First Int. Workshop on Systematic Approaches to Digital Forensic Engineering (SADFE05), 2005 [5] WOODWARD J.D.: ‘Biometrics: privacys foe or privacys friend?’, Proc. IEEE, 1997, 85, (9), pp. 1480 – 1492 [6] JOYCE R. , GUPTA G. : ‘Identity authorization based on keystroke latencies’, Commun. ACM, 1990, 33, (2), pp. 168– 176 [7] ROBINSON J.A., LIANG V.M., CHAMBERS J.A.M., MACKENZIE C.L.: ‘Computer user verification using login string keystroke IET Signal Process., 2009, Vol. 3, Iss. 4, pp. 333– 341 doi: 10.1049/iet-spr.2008.0171

Authorized licensed use limited to: BIBLIOTECA D'AREA SCIENTIFICO TECNOLOGICA ROMA 3. Downloaded on July 13, 2009 at 06:36 from IEEE Xplore. Restrictions apply.

www.ietdl.org dynamics’, IEEE Trans. Syst. Man Cybern. A, 1998, 28, pp. 236– 241 [8] MONROSE F. , RUBIN A.D.: ‘Keystroke dynamics as a biometric for authentication’, Future Gener. Comput. Syst., 2000, 16, (4), pp. 351– 359 [9] BERGADANO F., GUNETTI D., PICARDI C.: ‘User authentication through keystroke dynamics’, ACM Trans. Inf. Syst. Secur., 2002, 5, (3), pp. 911 – 915 [10] PEACOCK A., KE X., WILKERSON M.: ‘Typing patterns: a key to user identification’, IEEE Secur. Priv., 2004, 2, (5), pp. 40– 47 [11]

ARAU´JO L.C.F., SUCUPIRA L.H.R. JR., LIZARRAGA M.G., LING L.L., YABU-

UTI J.B.T.: ‘User authentication through typing biometrics features’, IEEE Trans. Signal Process., 2005, 53, (2), pp. 851– 855

[12] SHENG Y., PHOHA V.V., ROVNYAK S.M.: ‘A parallel decision treebased method for user authentication based on keystroke patterns’, IEEE Trans. Syst., Man Cybern. B, 2005, 35, (4), pp. 826– 833 [13] LEE H.-J., CHO S.: ‘Retraining a keystroke dynamics-based authenticator with impostor patterns’, Comput. Secur., 2007, 26, (4), pp. 300– 310 [14] NONAKA H., KURIHARA M.: ‘Sensing pressure for authentication system using keystroke dynamics’, Proc. World Acad. Sci. Eng. Technol., 2005, 1, pp. 16– 19 [15] AHMAD F., MUSILEK P.: ‘A keystroke and pointer control input interface for wearable computers’. Proc. Fourth Annual IEEE Int. Conf. Pervasive Computing and Communications (PERCOM’06), 2006, pp. 2– 11 [16] CHO S., HWANG S.: ‘Artificial rhythms and cues for keystroke dynamics based authentication’, Lecture notes in computer science, 2005, 3832, pp. 626– 632 [17] KANG P., PARK S., HWANG S.-S, LEE H.-J, CHO S.: ‘Improvement of keystroke data quality through artificial rhythms and cues’, Comput. & Secur., 2008, 27, (1– 2), pp. 3 – 11 [18] MA¨NTYJA¨RVI J., KOIVUMA¨KI J., VUORI P.: ‘Keystroke recognition for virtual keyboard’. Proc. IEEE Int. Conf. Multimedia and Expo. 2002. (ICME’02), 2002, vol. 2, pp. 429– 432 [19] OGIHARA A., MATSUMURA H. , SHIOZAKI A.: ‘Biometric verification using keystroke motion and key press timing for ATM user authentication’. Int. Symp. Intelligent Signal Processing and Communications 2006 (ISPACS’ 06), pp. 223– 226 [20] CLARKE N.L., FURNELL S.M., LINES B.M., REYNOLDS P.L.: ‘Keystroke dynamics on a mobile handset: a feasibility study’, Inf. Manage. Comput. Secur., 2003, 11, (4), pp. 161– 166 IET Signal Process., 2009, Vol. 3, Iss. 4, pp. 333 – 341 doi: 10.1049/iet-spr.2008.0171

[21] CLARKE N.L., FURNELL S.M.: ‘Authentication mobile phone users using keystroke analysis’, Int. J. Inf. Secur., 2007, 6, (6), pp. 1 – 14 [22] DUNLOP M.D., MASTERS M.M. : ‘Investigating five key predictive text entry with combined distance and keystroke modelling’, Pers. Ubiquitous Comput., 2008, 12, (8), pp. 589– 598 [23] KHOLMATOV A., YANIKOGLU B.: ‘Identity authentication using improved online signature verification method’, Pattern Recognit. Lett., 2005, 26, (15), pp. 2400 – 2408 [24] ROSS A.A. , NANDAKUMAR K., JAIN A.K.: ‘Handbook of multibiometrics’ (Springer, USA, 2006) [25] ITU E.161: ‘Arrangement of digits, letters and symbols on telephones and other devices that can be used for gaining access to a telephone network’, http://www.itu. int/rec/T-REC-E.161-200102-I/en

7

Appendix: software

The software system is composed of two parts: a Midlet, which is an executable application installed on the mobile phone, used for data acquisition and a backend application for data retrieval, analysis and statistics. The Midlet has been written in Java language under Microsoft Windows XP Professional edition with eclipse platform using the plugin EclipseME for the integration of the Java micro edition libraries. The Midlet has been developed with the Eclipse platform using the plugin EclipseME development tools for the integration of the Java micro edition libraries in the editor environment. To compile and build the source code of the Midlet, the Sun Java Wireless Toolkit has been used. The output of this tool is a Jar file used in the deployment of the Midlet on the mobile phone. In order to access the file system of the mobile phone, the J2ME JSR-75 API has been used. The debugging and testing were done on Sony Ericsson W660i. The Nokia 6680 has been used to collect the users’ data without any need to adapt the source code. Specifically, all the details concerning the employed software follows: † Software editor and compiler tools: Eclipse Platform 3.3.2. Build id: M20080221-1800. Employed Eclipse Plugin: EclipseME J2ME Development Tools Version: 1.7.9. † Sun Java Development Kit: JDK 1.6.0-05. † Java 2 Micro Edition (J2ME) development tool: Sun Java Wireless Toolkit 2.5.2 for CLDC. Configuration: Profile MIDP 2.0 CLDC 1.1. PDA Profile for J2ME (JSR 75). † Mobile phone used for testing and debugging: Sony Ericsson W660i. Java version JP-7.5. 341

& The Institution of Engineering and Technology 2009

Authorized licensed use limited to: BIBLIOTECA D'AREA SCIENTIFICO TECNOLOGICA ROMA 3. Downloaded on July 13, 2009 at 06:36 from IEEE Xplore. Restrictions apply.

Suggest Documents