NNGT Int.J. on Information Security, Vol. 2, Feb 2015

Password differences based on language and testing of memory recall

Jacob Abbott and Violeta Moreno Garcia
Indiana University, Bloomington, Indiana, USA

Summary

Many researchers have discussed password creation and strength, but this paper looks at differences in password creation based on users' native languages, prompting with guidelines, feedback, and memory recall of passwords. Data were obtained from users through mock account creation and member login schemes and analyzed for usage patterns across various factors. Interesting usage differences were found between users with different native languages, and giving prompts and rules can increase both the memory recall and the length of passwords.


Key words: Password creation, Information security, Usability, User study, Authentication.


1. Introduction

least one lowercase letter, one uppercase letter, one digit, and one special symbol, but pictures were not shown to them. E (English) R (Rules) NP (No Prompt).
4. English speakers given both guidelines: the rule was required and participants were shown pictures. E (English) R (Rules) P (Prompt).
5. Spanish speakers not given guidelines, meaning they were given neither rules nor pictures. S (Spanish) NR (No Rules) NP (No Prompt).
6. Spanish speakers given only one guideline: no rules, but pictures were shown to them. S (Spanish) NR (No Rules) P (Prompt).
7. Spanish speakers given only one guideline: a rule requiring users to have at least one lowercase letter, one uppercase letter, one digit, and one special symbol, but pictures were not shown to them. S (Spanish) R (Rules) NP (No Prompt).
8. Spanish speakers given both guidelines: the rule was required and pictures were shown to the users. S (Spanish) R (Rules) P (Prompt).

Password strength, creation, usability, user memory recall, and decision making have been studied by many scholars. Much research has looked for trends in password creation and use among English-speaking users [3] [10], but little research has compared the passwords of users with different native languages. The experiment was split into two phases. In the first, named Password Creation (sign-up phase), we created a web page posing as a forum that asked participants to create an account with a user name and password. In the second, named Password Memory Recall (log-in phase), participants were asked one week later to log in to the web page with the accounts they had previously created.


This system was designed in English and Spanish so that it could be directed at the target audiences of native English speakers and native Spanish speakers. Participants were divided into four groups per language, giving eight categories in total.

In this work, the strength of the created passwords is assessed by evaluating (i) the length of the password and (ii) its character space. The password recall rate was also estimated by counting the number of attempts each user made to log in to the account one week later. Finally, the results were compared between native English speakers and native Spanish speakers, and between groups with and without guidelines. To the best of the researchers' knowledge, this is the first work that analyzes, evaluates, and compares the strength of passwords between native English speakers and native Spanish speakers. More specifically, the contributions of this paper can be summarized as follows:
1. Investigated the distinct differences that exist between English and Spanish password creation.


In the Password Memory Recall phase, the eight groups were asked to log in to their previously created accounts, and their results were tracked. It should be pointed out that participants knew the accounts were not being used to handle critical information, so it was assumed that the passwords gathered differ from those participants would use for a secure website, such as a bank website. In this case the dataset consisted of the user-generated information that is usually required when anyone participates in an online forum.

1. English speakers not given guidelines, meaning they were given neither rules nor pictures; this group is referred to later by the acronyms E (English) NR (No Rules) NP (No Prompt).
2. English speakers given only one guideline: no rules, but pictures were shown to them. E (English) NR (No Rules) P (Prompt).
3. English speakers given only one guideline. It was implemented with a rule requiring users to have at

© N&N Global Technology 2015 DOI : 02.IJIS.2015.1.5


2. Showed that prompting users with feedback would lead to stronger password creation.
3. Confirmed that giving guidelines before initial password creation would lead to initially stronger passwords.
4. Tested how giving graphics to users could help to create a mental association and increase password recognition and recall.

It is important to examine the motives behind password-management behavior to improve overall password security. The memorability of passwords has been surveyed in many other studies, e.g. [2] [6] [9] [11], which quantify the effect of user choice on the security of the passwords chosen and emphasize their memorability, while [7] developed a secure and usable password system that addresses the problem of memorability and can be used in a wide array of applications. The various studies agreed that giving graphics to users helped to create a mental association and increase password recognition and recall.

The remainder of this paper is organized as follows. Section 2 reviews related work and basic concepts. Section 3 introduces the methodology. Section 4 presents the results. Section 5 discusses the drawbacks of this work. Section 6 outlines future work, and Section 7 concludes the paper.

2. Related Work

Many studies have used unencrypted password datasets to investigate the strength of passwords [3]. They focused on the strength of passwords [3] [4] [5] [8] chosen by users in the absence of password strength enforcement. They also pointed out that it is debatable whether systems enforcing password complexity actually increase security, since they may instead lead users to circumvent the enforcement techniques by adopting insecure behavior. Meanwhile, Devillers [4] [5] examined a large database of user-chosen passwords to determine the current state of affairs, extracted a model from the database, and provided his own password checker, which ranks passwords in various ways. In this work, the character type analysis, the length distribution analysis, and the letter frequency analysis were reproduced in order to compare his results with our own dataset some years later. Some similarities between the native English speakers and native Spanish speakers were found, but also some distinct differences between the groups.

3. Methodology

The first phase of the project was to have users create accounts on an account creation page built and maintained by the researchers. Subjects were sent messages asking for their participation and were randomly sent one of four links according to their native language. The four subgroups per language were: NRNP, where no rule or picture prompts were given; RNP, where a rule required users to have one lowercase letter, one uppercase letter, one digit, and one special character, and no picture prompts were given; NRP, where no rule was used and pictures were shown to users; and RP, where the rule was required and pictures were shown. Users were asked for a user name, email address, and password in order to get credentials.

The account creation page was built with standard HTML as a basic web page, with JavaScript added to process the data. Using JavaScript, the researchers were able to track the creation of passwords by storing a vector of each password as it was formed and tested. This allowed the researchers to track whether users changed their initially created passwords based on feedback about whether they met the minimum required length or the rule required for the specified groups. The page also allowed the researchers to see which users actually clicked to test their passwords before submitting.
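The feedback check and the per-user vector of tested passwords can be sketched as follows. This is a minimal illustration in Python rather than the JavaScript used in the study. The names `check_rule` and `meets_rule`, the default minimum length of 6, and the treatment of every non-alphanumeric character as "special" are assumptions, since the paper does not give these details.

```python
def check_rule(password: str, min_length: int = 6) -> dict:
    """Report which requirements a candidate password satisfies.

    min_length=6 is a hypothetical default; the paper does not state
    the minimum length that was enforced.
    """
    return {
        "length": len(password) >= min_length,
        "lowercase": any(c.islower() for c in password),
        "uppercase": any(c.isupper() for c in password),
        "digit": any(c.isdigit() for c in password),
        # Assumption: anything that is not a letter or digit counts as
        # special (this includes whitespace).
        "special": any(not c.isalnum() for c in password),
    }


def meets_rule(password: str, min_length: int = 6) -> bool:
    """True only if every requirement is satisfied."""
    return all(check_rule(password, min_length).values())


# A vector of the passwords one hypothetical user tested, in order,
# paired with whether each one passed the check.
attempts = ["password", "Password1", "Password1!"]
history = [(candidate, meets_rule(candidate)) for candidate in attempts]
```

Keeping the whole `history` list, rather than only the final password, is what makes it possible to see whether a user revised a password after receiving feedback.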

On the other hand, the topic of security versus usability has long been debated in the computer security world [1]. One can choose either a short and simple password, which is easy to remember but also easy to crack, or a very long and complicated password, which is hard to remember but also hard to crack. This explicit trade-off creates an economic model in which many users take a cost-benefit approach when creating their passwords. Some works, such as that by Hinds and Ekwueme [7], have tried to demonstrate that security and usability can be achieved simultaneously. Their work lays the foundation for developing a class of similar password systems, differing only in the degree of security required. [10] [12]


The second phase of the project required users to attempt to log in using the credentials they created in phase one. Each user received an email containing a link to the login page before their email information was removed to protect their privacy. The users were allowed three attempts to log in successfully. Users who were unable to log in after all three attempts were labeled as having forgotten their password; they were still sent to the full disclosure page and thanked for their participation.


The login page was built with standard HTML and had two text boxes asking for a username and password. Using JavaScript, the researchers were able to track the passwords used in each login attempt by a user. This also allowed the researchers to see how close users who forgot their password came to their originally created password.
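The paper does not state how the closeness between a failed attempt and the original password was measured. One common choice is the Levenshtein edit distance, and the sketch below assumes that metric. The function names and the example strings are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from "" to b[:j]
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def closest_attempt(original: str, attempts: list) -> tuple:
    """Return the failed attempt nearest to the original password,
    together with its distance."""
    best = min(attempts, key=lambda attempt: levenshtein(original, attempt))
    return best, levenshtein(original, best)
```

For example, with an original password of `Password1!`, the attempt `password1` is at distance 2: one substitution for the capital letter and one deletion for the missing exclamation mark.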

4. Results

The results of the character type analysis for native English speakers are shown in Figure 1 and for native Spanish speakers in Figure 2. The largest category for native English speakers, which makes up nearly half the dataset (48%), is passwords consisting of a word combined with lowercase and uppercase characters, at least one digit, and one special character, in contrast to only 29% for Spanish speakers. However, 25% of Spanish speakers made up a password from a combination of uppercase and lowercase characters, at least one digit, and a special character, against 7% of English speakers. The difference from the previous category is that the latter does not include words within the password, which arguably creates a stronger password.
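The composition part of such a character type analysis can be sketched as below. The category labels and function names are hypothetical. Detecting whether the letters form a dictionary word, which the categories above also depend on, is deliberately left out because it needs a word list for each language.

```python
from collections import Counter

CLASS_ORDER = ["lower", "upper", "digit", "special"]


def char_class(c: str) -> str:
    """Map a single character to one of four classes."""
    if c.islower():
        return "lower"
    if c.isupper():
        return "upper"
    if c.isdigit():
        return "digit"
    # Everything else, including whitespace and uncased characters.
    return "special"


def composition(password: str) -> str:
    """Label a password by the set of character classes it contains."""
    present = {char_class(c) for c in password}
    return "+".join(name for name in CLASS_ORDER if name in present)


def composition_shares(passwords: list) -> dict:
    """Fraction of passwords falling into each composition label."""
    if not passwords:
        return {}
    counts = Counter(composition(p) for p in passwords)
    return {label: n / len(passwords) for label, n in counts.items()}
```

Running `composition_shares` separately on the English and the Spanish passwords would give per-language shares of the kind plotted in Figures 1 and 2, minus the word/no-word split.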

Fig. 2 Spanish character type analysis.

Length distribution analysis gave us insight into the common lengths of user-chosen passwords among the native English and Spanish speaking participants. The results of the analysis, shown in Figure 3, are very similar to those of Devillers [4]. They likewise do not show a normal distribution but rather a truncated one. It was agreed that this is a result of minimum password length requirements. Passwords between 13 and 22 characters long covered about 17% of native English speakers and 20% of native Spanish speakers. Overall, 72.7% of users had passwords between 6 and 12 characters in length.

Fig. 1 English character type analysis.

For both native English speakers and native Spanish speakers, the second largest category is passwords that consist of lowercase characters making up a word and at least one digit. This is a concern, considering that most passwords are only 8 to 12 characters long for both languages and the letters found in the passwords conform to the letter frequencies of their own language.

Fig. 3 Passwords length distribution.

Figure 4 shows the spread of lengths for English and Spanish speakers with histograms and probability density functions. English and Spanish speakers had passwords of very similar lengths and strengths on average. Interestingly, the longest English password had a lower strength than quite a few shorter passwords created by other users because it used only lowercase letters.
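The paper assesses strength from password length and character space but does not give a formula. A common rough estimate, assumed here, is L · log2(N) bits, where L is the length and N is the size of the character pool the password draws on. The pool sizes below cover ASCII only, so accented letters such as ñ are not counted; that simplification and the function names are assumptions of this sketch.

```python
import math
import string


def character_space(password: str) -> int:
    """Size of the ASCII character pool the password draws on."""
    size = 0
    if any(c in string.ascii_lowercase for c in password):
        size += len(string.ascii_lowercase)  # 26
    if any(c in string.ascii_uppercase for c in password):
        size += len(string.ascii_uppercase)  # 26
    if any(c in string.digits for c in password):
        size += len(string.digits)           # 10
    if any(c in string.punctuation for c in password):
        size += len(string.punctuation)      # 32
    return size


def estimated_bits(password: str) -> float:
    """Rough strength estimate: length times log2 of the pool size."""
    pool = character_space(password)
    return len(password) * math.log2(pool) if pool else 0.0
```

Under this estimate a 20-character password of lowercase letters only (about 94 bits) scores below a 16-character password that uses all four classes (about 105 bits), which is one way a long password can end up weaker than shorter ones. The estimate measures brute-force search space only and says nothing about dictionary words.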

This would suggest, similar to Devillers [4], that a significant part of our dataset consists of words or names used in passwords, but in contrast to Devillers [4], we did not find any passwords that consisted solely of digits.


In Figure 6, a lowercase letter is the most used first character among English speakers, at 36%, against 29% for Spanish speakers. Interestingly, 0% of Spanish speakers used a special character at the beginning of their passwords.

Fig. 4 Password length distribution density figures.

Figure 5 breaks down password length by group, according to whether the groups were given rules and pictures or not. On average, groups that were given a rule created stronger and longer passwords than groups that were not given a rule. One curiosity in the data was how showing pictures to users affected the length of created passwords. In the groups given pictures without a rule, users tended to create shorter passwords than those in the groups that were given neither pictures nor a rule. Conversely, users given pictures and required to obey the imposed rule set created the longest passwords on average of any user group.

Fig. 6 Passwords frequency distribution of the first character.

Figure 7 shows the frequency of the last character type used in the passwords. Among English speakers, 45% used a special character at the end of their password, 31% a number, 22% a lowercase character, and 2% an uppercase letter. The special character most frequently used was the exclamation mark (!), followed by the asterisk (*), as shown in Figure 8. By contrast, 38% of the Spanish speakers used a number at the end of their passwords, though none of them used 0 or 1 (Figure 9), and 33% used special characters, which coincides with the frequency of the exclamation mark used by English speakers (Figure 8).
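A first-character or last-character breakdown of this kind can be computed as in the sketch below. The function names and the sample passwords are hypothetical and are not taken from the study's dataset.

```python
from collections import Counter


def char_type(c: str) -> str:
    """Classify one character as lower, upper, digit, or special."""
    if c.islower():
        return "lower"
    if c.isupper():
        return "upper"
    if c.isdigit():
        return "digit"
    return "special"


def position_type_frequency(passwords: list, position: int) -> dict:
    """Share of each character type at a given position.

    Use position=0 for the first character and position=-1 for the
    last one. Empty passwords are skipped.
    """
    non_empty = [p for p in passwords if p]
    if not non_empty:
        return {}
    counts = Counter(char_type(p[position]) for p in non_empty)
    return {kind: n / len(non_empty) for kind, n in counts.items()}


# Made-up sample for illustration only.
sample = ["abc!", "Abc1", "xyz!"]
last_char_shares = position_type_frequency(sample, -1)
first_char_shares = position_type_frequency(sample, 0)
```

Counting the individual characters at the same position, for example with `Counter(p[-1] for p in passwords if p)`, would give per-symbol and per-digit breakdowns like those in Figures 8 and 9.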

Fig. 5 Password length by guidelines.

As hypothesized, groups required to create a password based on the rule set created stronger passwords than those not requiring the rule set, with a p value
