The Design of a Web Page Prediction Tool

8 downloads 3853 Views 233KB Size Report
present the design, implementation and evaluation of a web page prediction tool. The tool offers personalized interaction by predicting user's behaviour from ...
The Design of a Web Page Prediction Tool Andrea Bai and Andrina Grani* Siemens d.d., Siemens IT Solutions and Services, Put brodarice 6, 21000 Split, Croatia E-mail: [email protected] *Faculty of Science, University of Split, Nikole Tesle 12, 21000 Split, Croatia E-mail: [email protected]

Abstract. A tool for web page prediction is presented in this paper. The developed tool offers personalized interaction by predicting user’s behaviour from her/his previous browsing history. According to an outcome of a conducted user testing, enhancements to the tool’s design are proposed. Its simplicity and usefulness makes it a valuable browsing aid for widespread application.

Keywords. web browsing, web page prediction, personalized interaction, tool

1. Introduction Intelligent user interfaces (IUIs) emerge as a means for making systems individualized or personalized, thus enhancing their flexibility and attractiveness. They are intended to facilitate a more natural interaction between users and computers, by not attempting to imitate humanhuman communication, but augmenting the human-computer interaction process instead [3]. The intelligence in the interface can e.g. make the system adapt to the needs of different users, take initiative and make suggestions, learn new concepts and techniques or provide explanation of its actions, cf. [12]. Measuring the degree of adaptation to differing user requirements and needs is obviously emphasized in such a context. Consequently, the frequently cited indication of intelligence is the ability to adapt [15], which implies the ability to adapt the interaction outcome to the level of understanding and the interests of individual users. Even more, the ability to predict the user's next action allows the system to envisage user's needs and to adapt to and improve upon her/his work. The initial idea for the development of the tool that would simplify and speed up browsing originated from the hypothesis that a user’s browsing history is repetitive and could be easily predicted. With such hypothesis we have developed a prototype characterized with ability to speed up user’s browsing. In this paper we

present the design, implementation and evaluation of a web page prediction tool. The tool offers personalized interaction by predicting user's behaviour from her/his previous web browsing history and then uses these predictions to simplify the user's future interactions. The paper is organized as follows. In Section 2 we give a short outline of research in the area of user actions prediction and offer an insight in our initial work related to web page prediction. The design of the developed tool for web page prediction is presented in Section 3, while Section 4 brings its usability assessment. In Section 5 we provide discussion and directions for further work.

2. Background to the research 2.1. Related work Initial work in command line prediction includes Greenberg's (1988) work on dataset’s [10], Darragh, Witten and James's (1990) Reactive Keyboard [6] and Greenberg, Darragh, Maulsby and Witten's (1991) Predictive Interfaces [11]. More recently, Davison and Hirsh (1997) have modified the UNIX shell in order to memorize a command history and subsequently calculated the probability of a certain command input [7; 8]. Jacobs and Blockeel (2003) have tried to improve Davison and Hirsh's command prediction by employing a time component [13]. In order to predict the next UNIX command along with its parameters, Korvemaker and Greiner (2000) have applied diverse statistical methods [14]. Additional work in the prediction of the sequence of user actions encompass prediction of the input of the next word or entire sentence in text editing [4; 16], prediction of motor actions based on eye movement [18] or prediction of the complete movement of the human body during the performance of some simple actions [2]. Several authors have based their research on the prediction of the next requested web site.

263 nd

Proceedings of the ITI 2010 32 Int. Conf. on Information Technology Interfaces, June 21-24, 2010, Cavtat, Croatia

Zukerman, Albrecht and Nicholson (1999) have addressed the ways for minimizing waiting for a requested web page by predicting which page the user will ask for next [19]. Based on their work is the research of Friaz-Martinez and Karamcheti (2002), who have enhanced their predicting model with an important parameter – the time between two requests [9]. In order to create more accurate "guess", prediction methods that authors have used vary through different statistical models, such as the Sequential behaviour model [9], the Markov prediction model [13; 18; 19] or the Prefix model [7; 8; 13; 14]. Apparently, predicting the sequences of the user's next actions is a topic that has been explored from many angles, from diverse sciences and utilizing a variety of methods in order to find the best solution. However, whether the prediction of the next UNIX command or the next web page is considered, the user and the server with the prediction model usually are remote. Hence to predict the next user action, the specific data about the user (i.e. Internet Protocol (IP) address, location and other user-specific information) should be transferred via Internet. Such transfer could be time consuming, a security risk and unreliable. Accordingly, in our contribution to this challenging topic we took a slightly different approach – a fairly simple application installed on a user’s computer is developed. It merges the user’s profiled behaviour with an application logic that is located on the user’s personal computer.

2.2. Starting point – web page prediction prototype The tool for web page prediction presented in this paper is based on the authors’ previous work [1]. We have introduced a potentially useful Internet browsing aid which offers personalized interaction by predicting the user's behaviour from her/his previous web browsing history. These predictions, which are based on simple statistical computations, are afterwards used to simplify the user's future interactions with the browser. The developed application groups URLs of requested web pages according to the time of the user’s request. Such grouping enables better prediction of a user’s behaviour during her/his web browsing (surfing). Although the prototype’s simplicity and effectiveness made it potentially useful for widespread application, a number of necessary improvements were identified:

x some aspects of the prototype’s GUI should be redesigned; multiple URLs should be opened in tabs instead of windows, the user name should not be offered as a choice from a combo box, the address bar should occupy the whole length of the window; x database installation and configuration should be upgraded; fairly complicated because the end user has to configure a MySQL Server and write commands in the console. Additionally, several prototype characteristics which needed improvement were also detected: x the prototype customization is the most important issue; the user should be able to adjust the application according to her/his preferences, for example to personally choose the number of time frames during the day along with the duration of each one of them; x the predefined borderline frequency of 55% may not be suitable for all users and should be subject to adjustments according to their preferences; x the possibility for the user to block the automatic opening of some URLs (unblocking should also be possible) or to disregard a few URLs from the calculation related to their request When considering possible options we have concluded that the development of a new browser is not realistic and feasible because of the high competition in the marketplace (see e.g. http://en.wikipedia.org/wiki/Browser_statistics). Furthermore, the used browser is a free plug-in for Java and is characterized with a high response time. Both problems could be resolved by modifying our application in order to make it a new add-on for the Firefox web browser since Mozilla Firefox is an open source application which allows modifications of its code and addition of functionalities developed by third parties. We concluded that with rather simple and feasible changes this preliminary work in the area of web page prediction might became an effective and useful browsing aid for common use.

3. Design and basic workflow Since Firefox is the second most popular web browser in the world market (see e.g. http://en.wikipedia.org/wiki/Usage_share_of_we b_browsers in order to check recent usage share of web browsers), the modified and upgraded application could be available to a large user

264

population. To provide intelligent services and personalized interaction, analysis of web server history is used to model the behaviour of web users. Throughout browsing, users visit many web pages located on different servers and the prediction of their next requested web page has been pointed out as an important issue. Some studies have already addressed the prediction of users’ sequence of actions based on their behaviour on one server, see for example [9]. A user’s behaviour, when analysed and grouped, can be used for speeding up web browsing, for simplifying the usage of a web browser itself and for enabling personalized, intelligent interaction as well. This is the main reason why we have focused our work on the design of an extension for the Firefox browser. As in the previously developed prototype, a grouping of the user’s behaviour is accomplished according to a set of time periods. Initially, six time frames that last for four hours are predefined, but the user can modify them according to her/his wishes within the limit of maximum ten time frames. In that way, the 24 hours of a day are divided into six identical fractions and in each time frame the frequency of a certain URL request is individually calculated. Additionally, the default frequency of a requested URL is set to 50% (changeable to user’s preferences), after which the URL is automatically requested by the plugin at the next startup of the Firefox browser. See Figure 1 for an illustration of the pseudocode which shows how the plugin "knows" if an URL passed the predefined frequency.

offered URLs should be requested. A dialog with the list of predicted URLs will be shown for the first time if the following preconditions are fulfilled: x predefined "Days of learning" have passed (which can be modified in the tool’s General settings panel, see Figure 2); x URL was requested at least ten times in total (user can’t modify this limitation); the limitation is added to make sure that the user is not annoyed with overly eager prediction, especially at the beginning of the usage of the plugin, when such a scenario is most likely to happen; x frequency of the request of the URL in the current time frame is greater than the borderline frequency (set in the General settings panel); x URL is not blocked from the prediction (every URL can be blocked from the prediction in the URL settings, see Figure 3).

Figure 2. Screenshot of the General settings tab in the Webpage prediction tool settings

x = number of times user requested url in current time frame y = number of times user not request url in current time frame border = x / (x + y) * 100 if (border > 50) return true else return false Figure 1. Pseudocode for the calculation of URL "popularity"

The user can define if URLs with request frequencies higher than the predefined percentage should be requested automatically or a list of those URLs will be populated. Moreover, she/he can choose which of the

265

Figure 3. Screenshot of the URL settings in the Webpage prediction tool settings

After the initial startup, the user can browse with no disturbances, just like always, and during that time the plugin gathers the data (i.e. URLs) and updates requests so that the frequency of the request can be calculated on the startup. Old requests which represent those URLs whose last request was more than three months ago, are deleted from the database. Furthermore, in the plugin settings users can select any URL that is saved in the database and determine that the selected URL should be disregarded or blocked from prediction (see Figure 3). The difference between this two options is as following: (i) if the URL is blocked, it will not be in the list for opening during startup but its requests will be updated in the database; on the other hand (ii) if the URL is disregarded, it will not be updated in the database, but if its frequency is still higher than the predefined borderline frequency, that URL will be in the list of the URLs that are candidates for automatic opening (or will open automatically, depending on the user’s settings). One additional option that the user can set is the number of days for learning. Initially, it is set to ten days, but the user can change it to whatever number of days she/he thinks is necessary for the plugin to be filled with relevant data.

4. Usability assessment In general, web browsers are used by a very large user group with highly diverse abilities, needs and interests, aspects which have to be reflected in the user interface design. In that context, an effective assessment of the web page prediction tool interface is essential because it identifies design problems to be corrected and provides guidance for the next iteration of the development process. The conducted user testing embodied two empirical methods used to collect both quantitative data and qualitative "remarks", thus obtaining useful data and producing quality usability information in a cost-effective way.

4.1. Data collection The System Usability Scale (SUS), a simple, standard, ten-item attitude questionnaire with a five-point Likert scale [5], was used for the subjective evaluation. As additional subjective feedback, answers to the semi-structured interview were collected. Nine participants with basic computer literacy skills (they had the knowledge and ability to use computers in an

efficient manner) were involved. The fact that the Internet remains a young medium in Croatia and that over 70% of those with access are under 40 years old was acknowledged (see e.g. http://www.digitaltrainingacademy.com/croatia). All participants were familiar with the Mozilla Firefox browser and their daily hours online were taken into account too. Relevant participants’ information is offered in Table 1. After the plugin was added to their Firefox browser, every participant had to spend at least eight days in usual, everyday browsing routine. For the first five days, the user did not notice any difference compared to her/his usual browser behaviour because the plugin’s "Days for learning" was set to 5 days, and as a consequence the "Popular URLs" dialog was not started. After those five days have passed, users noticed a difference – opening of the dialog at browser’s startup. When users got familiar with the plugin functionality, we conducted the evaluation. The assessment was carried out individually with each participant. The allocated session's average time for each participant was 45 minutes. An evaluation procedure consisted of two steps: a usability satisfaction questionnaire and a semi-structured interview. We used the SUS questionnaire, as it is argued that it yields the most reliable results across sample sizes [17]. Its questions address different aspects of the user's reaction to the tool as a whole, providing an indication of the level of statement agreement on a five-point Likert scale. The feedback was augmented with the users' answers in a semistructured interview, where participants expressed satisfaction or dissatisfaction with the tool’s design and offered suggestions for improvement. They were asked to rate and comment on the tool’s visual attractiveness too.

4.2. Data analysis The results acquired through usability test methods are addressed in what follows. The scores for users' subjective satisfaction acquired by SUS can range from 0 (very little satisfaction) to 100 (very high satisfaction). The calculated average of the participants’ satisfaction score is 83,61 implying their rather high satisfaction with tool’s ease of use. According to average mark for visual attractiveness (3,88 on the 5-point scale), some tool’s design improvements should be made, cf. with interview outcome. Data obtained for subjective satisfaction along with users’ rating of tool’s pleasant appearance is shown in Figure 4.

266

Table 1. Distribution of gender, age, educational level and daily hours online Participant

Gender

Age

Education level

Hours online (daily)

1 2 3 4 5 6 7 8 9

Female Female Male Male Male Male Female Female Male

14 21 19 51 36 29 27 34 31

Elementary school High school (non technical orientation) High school (non technical orientation) Bachelor (non technical orientation) Master (technical orientation) Bachelor (non technical orientation) Master (technical orientation) Master (technical orientation) Master (non technical orientation)

2,5 2 2 2 1,5 1 2,5 1 4

In addition, the statements from the conducted interviews proved to be very helpful because they enabled collection of users’ remarks and suggestions for the tool’s graphical interface improvement: x the way that URLs are sorted in the "Popular URLs" dialog should be modified (they should be sorted by the popularity and sequenciality of the URL’s request); the dialog should offer a way to select and deselect all URLs (to speed up URL selection process); the dialog should be available for opening at user’s wish, not only at the startup; x only the hostname should be remembered, and not the whole URL; x an option which differentiates the days of the week and predicts according to them together with daily time frames should be added (e.g. some users have very different surfing behaviour during weekdays and during weekends); x the design of the tool is in a way too "technical" and should be improved.

1

4

85

2 3

90

3

90

User

4 5

4

6 7

57.5 3

8

3

9

97.5 5

4

82.5

5

85

92.5 72.5

4

83,61 3,88

average

Result SUS results (0-100)

User's rating (1-5)

Figure 4. Results of users’ assessment

The conducted evaluation proved efficient at a low-cost and provided valuable feedback which will be used in the redesign of the tool’s user interface.

5. Discussion and concluding remarks This paper presents, according to the users’ comments, an interesting and useful Internet browsing aid. The developed tool for web page prediction is fairly distinctive. When compared to other Firefox suggestion tools (e.g. google toolbar, history or bookmarks), it can be asserted that the tool offers a combination of their functionality – favourite URLs (bookmarks) are suggested depending on the request frequency (google toolbar) and the request time (history). On the other hand, our tool has a dynamic list of favourite URLs (bookmarking is static), request frequency is depending on the individual user (google toolbar calculates global frequency of multiple users) and request time correlates with the hour of the day (history makes a list of the requested URLs of the day, week and/or month). While in the similar research field various statistical models have been used, for the prediction achievement we have used a rather simple statistical computation. In addition, according to the available related literature and our knowledge there is no application that groups URLs of requested web pages according to the time of the user’s request. Such grouping enables better prediction of the user’s behaviour during her/his web browsing. Taking into consideration the users’ feedback on the one hand, and the results of the market research for similar products on the other, we concluded that the FastDial extension for Firefox (https://addons.mozilla.org/en-US/firefox/addon/

267

5721) could be modified in a way that it provides the functionality of our developed web page prediction tool (allowed by the SourceCode license for FastDial https://addons.mozilla.org/ en-US/firefox/versions/license/74198). FastDial is an extension for Firefox that creates thumbnails of the user’s bookmarks in any newly opened tab. Thus instead of opening a dialog at the startup, the prediction tool could use FastDial’s functionality of opening a new tab filled with thumbnails of the most popular URLs. In addition, particular functionality of the plugin should be improved to address users’ input and comments (e.g. sorting order, hostname prediction instead of whole URL). We assumed that such feasible enhancements of the design of the web page prediction tool may result in an effective and useful browsing aid for widespread use.

6. Acknowledgements This work has been carried out within project 177-0361994-1998 Usability and Adaptivity of Interfaces for Intelligent Authoring Shells funded by the Ministry of Science and Technology of the Republic of Croatia. The authors would like to express their gratitude to Henning Schaefer for his help during the development of the plugin.

7. References [1] Bai A, Grani A. Intelligent Interaction: A Case Study of Web Page Prediction. Proc. of 31th International Conference on Information Technology Interfaces. 2009, p 287-292. [2] Bao L, Intille SS. (2004). Activity Recognition from User-Annotated Acceleration Data. Ferscha A, Mattern F, editors. PERVASIVE 2004, LNCS 3001. p. 1-17. [3] Benyon D, Murray D. Interacting with Computers. Special issue on intelligent interface technology 2000; vol. 12. [4] Bickel S, Haider P, Scheffer T. Learning to Complete Sentences. 16th European Conference on Machine Learning (ECML'05) 2005; p 497-504. [5] Brooke, J. SUS: a "quick and dirty" usability scale. In Jordan, P.W., Thomas, B., Weerdmeester, B.A, McClelland, A.L. (Eds.): Usability Evaluation in Industry. Taylor and Francis, London. 1996. [6] Darragh JJ, Witten IH, James ML. The reactive keyboard: A predictive typing aid. IEEE Computer 1990; 23(11):41-49. [7] Davison BD, Hirsh H. Experiments in UNIX Command Prediction. In Proceedings of the

14th National Conference on Artificial Intelligence and 9th Innovative Applications of Artificial Intelligence Conference (AAAI97/IAAI-97), Menlo Park. 1997. p. 827-827. [8] Davison BD, Hirsh H. Probabilistic Online Action Prediction. Intelligent Environments: Papers from the AAAI 1998 Spring Symposium. Technical Report SS-98-02. 1998. p. 148-154. [9] Frias-Martinez E, Karamcheti V. A Prediction Model for User Access Sequences. In WEBKDD Workshop: Web Mining for Usage Patterns and User Profiles. 2002. [10] Greenberg S. Using UNIX: Collected traces of 168 users. Research report 88/333/45, University of Clagary, Calgary Alberts. 1988. [11] Greenberg S, Darragh JJ, Maulsby D, Witten IH Predictive interfaces: What will they think of next? In Edwards A, editor. Extra-ordinary human–computer interaction: interfaces for users with disabilities. Cambridge University Press 1991. Ch. 6. p. 103-140. [12] Hook K. Steps to Take Before Intelligent User Interfaces Become Real. Interacting with Computers 2000; 12: p. 409-426. [13] Jacobs N, Blockeel H. Sequence Prediction with Mixed Order Markov Chain. Proceedings of BNAIC'02 - Belgian-Dutch Conference on Artificial Intelligence 2003. p. 147-154. [14] Korvemaker B, Greiner R. Predicting UNIX Command Lines: Adjusting to User Patterns. Proceedings of the 17th National Conference on Artificial Intelligence and 12th Conference on Innovative Applications of Artificial Intelligence 2000. p. 230-235. [15] McTear, M.F. Intelligent interface technology: from theory to reality? Interacting with Computers, Vol. 12. 2000. p. 323–336. [16] Schlimmer JC, Hermens LA. Software Agents: Completing Patterns and Constructing User Interfaces. Journal of Artificial Intelligence Research 1993; 1: 61-89. [17] Tullis, T.S., Stetson, J.N. A Comparison of Questionnaires for Assessing Website Usability. Proceedings of UPA Conference. Minneapolis. 2004. http://home.comcast.net/~tomtullis/ publications/UPA2004TullisStetson.pdf [18] Yu C, Ballard DH. Learning to Recognize Human Action Sequences. In Proceedings of the 2nd International Conference on Development and Learning, Boston, U.S. 2002. p. 28-34. [19] Zukerman I, Albrecht DW, Nicholson AE. Predicting User's Request on the WWW. Proceedings of the 7th International Conference on User Modeling, UM99. 1999. p. 275-284.

268