AN APPROACH TO PREDICTIVE EVALUATION FOR USERS WITH SPECIAL NEEDS
A. V. Artífice, J. B. Lopes, J. A. Jorge, M. J. Fonseca
Instituto Superior Técnico, Av. Rovisco Pais, 1000 Lisboa, Portugal
[email protected], [email protected], [email protected], [email protected]
ABSTRACT This paper addresses the prediction of user moves in a case study where user interaction with a software game is predicted on the basis of λ-parameters acquired from users with special needs playing the game. A model was built to reproduce the actions performed by such users, in order to overcome the difficulty of setting up tests given users' reduced availability and mobility and their varying, sometimes erratic behaviour. Data on user actions was collected from video recordings of users playing a game that requires them to roam a virtual house and perform some tasks. The tests were carried out with users with disabilities such as severe physical injuries, cerebral palsy and Down syndrome. A three-level simulator was developed with part of the data collected during the experiments, the remaining experimental data serving as the reference for validation. Data analysis produced the λ-parameters required by the Poisson and exponential distributions used in the statistical approach adopted. The tests performed showed an excellent statistical fit between the simulated data and the reference data.
KEY WORDS Accessibility, Usability, User modelling, Assistive technology, Interaction, Simulation
1. Introduction
Testing computer applications with actual users is vital, yet the amount of work, time and care involved in proper testing is such that methods and techniques have been devised to allow testing without actual users. This is the case of cognitive walkthroughs, model-based evaluation and testing with experts assuming the role of actual users. But the workload and time required for testing increase further when applications are targeted at users with disabilities, especially severe disabilities. Such users present a wide variability of disabilities that translates into very different assistive technology solutions. In general, users with disabilities need alternative pointing devices to point and click when interacting with applications.
Such devices range from common joysticks to sweep-and-switch combinations or myographic devices, to name a few. The most appropriate device for each situation must be chosen at an early stage [1]. The wide range of alternative devices in use is the reason why many applications for users with disabilities are targeted at users who all use the same type of pointing device, or different devices that operate similarly. This reduces the amount of testing but has the serious drawback of reducing the range of persons able to use a given application. Furthermore, usability tests with users with disabilities are difficult to replicate, since in many cases users' moods are changeable and unpredictable. Users may not even feel like performing any tests at the time when they are available for testing. Another difference is the number of available test subjects. Tests and experiments require enough subjects to ensure that results are significant, and that number may not be available because the same subjects cannot be used for both pre-testing and testing. The case is further compounded when two user groups are required, one for reference and the other for actual testing. User profiling may help overcome these difficulties by modelling, with enough accuracy, the main actions users perform when interacting with an application. This is particularly true for games, where users move around rooms or spaces and perform actions such as picking up or touching objects. Dix [2] suggests that evaluation must not only be a phase in the design process but should take place throughout the whole design life cycle, with results being fed back into the process. Continuous testing with users while designing is not possible, but the use of analytic and informal techniques is. Therefore, there is a growing need for predictive evaluation models for users with special needs, obtained by modelling human interaction with applications and simulating interaction times and delays. The problem domain includes models of mental processes based on goal hierarchies, such as GOMS [3] (Goals, Operators, Methods, Selection rules) and CCT (Cognitive Complexity Theory) [5], and models based on
empirical knowledge of the human psycho-motor system that predict user performance in task execution, such as KLM (Keystroke-Level Model) [4]. This paper presents a predictive model for persons with disabilities, based on a statistical framework that models the behaviour of such users when they interact with games by means of pointing devices, together with an interaction simulator implementing the statistical model. To build the model, data was obtained from video recordings made during usability tests of a software game [8] developed for persons with cerebral palsy, accident injuries and Down syndrome. This study will help to determine a predictive evaluation model for people with disabilities. The paper is organized as follows: the next section presents the experiment in detail, with special attention to the user group and the software game. Section 3 is dedicated to the users. Related work, including other predictive models and their possible use with persons with disabilities, is discussed in Section 4. In Section 5 we present the model developed, with special focus on data acquisition and analysis, statistical treatment, and model development and evaluation. Finally, Section 6 presents the conclusions and suggestions for future work.
2. The Experiment
The research was based on data obtained through video recordings of usability tests with disabled persons who played a software game in a day care centre. This section discusses that experiment, the sample characteristics and the software used by the participants.

2.1 The Sample
The software game had previously been developed with the aim of being accessible to all users in a day care centre, including those with greater physical difficulties. Consequently, the sample of ten participants was a heterogeneous group in terms of disabilities and age. A day care centre therapist provided the characteristics of the users in the test group [9]. User disabilities included cerebral palsy, Down syndrome and severe physical injuries, some with a high level of spasticity, psychotic disorders and mental impairments. About 60% of the participants had managed to finish elementary school; the remaining had not because they were unable to read and write. User ages ranged from 20 to 47 years.

2.2 The "Virtual Home" Game
The software game used in the usability tests, the Virtual Home, is part of a program aimed at providing games and other applications to persons with mild to severe disabilities. These games run under a common framework that isolates game logic from display, user input and help. The framework [9] has a client-server architecture under which games and other applications run. One of these is a communication application based on alternative graphic languages (pictograms). Initially developed for persons with cerebral palsy, accident injuries and Down syndrome, the framework supports individual user configurations based on each user's needs and capabilities. These are kept in a user profile that includes items such as colour preferences, input devices used and timing allowances. Parameters are set by experimenting with users. In the Virtual House game players move around a house (Figure 1) and carry out actions such as taking a bath, dressing or eating. These actions can only be carried out at specific locations in the rooms. Users must open doors to pass from one room to another.

Figure 1. The "Virtual Home" room layout

User movements are simple game movements, i.e., moving forward or backward and turning left or right (but not at right angles) (Figures 2 and 3).
Figure 2. "Virtual Home" game test: teeth-cleaning task using a joystick device
Figure 3. "Virtual Home" game test: carousel-riding task using a mouse device

The game was tested in a series of usability tests carried out at the subjects' day care centre. The whole experiment was recorded on video and took several days to complete because users were only available for short periods at specific times during the day. Initial tests allowed the choice of the appropriate input device for each user, as well as the device parameters and pointer speed. At that time some other behaviour was noticed (users tended to turn only in one given direction and consistently repeated moving actions), but its analysis was left for later because the aim of the experiment was to test the game and fine-tune user virtual displacement in the game to the several input devices available.

3. Users
Disabled users present a wide variability of disabilities, which implies the use of different devices when interacting with computers. Cook et al. [1] note that there should be an initial phase to detect user abilities and the adequate devices for each particular situation. We can define two types of users according to their ability to manipulate pointing devices: those who can use conventional Human Device Interaction (HDI) devices, such as mice, joysticks and gamepads, and those who need alternative pointing devices. The latter group includes most persons with special needs, who usually have little or no capability to manipulate common HDI devices. They usually exhibit spastic movements with little or no precision, and all interaction must be done through assistive technology devices. Most such users show high motor dependency and need help from therapists and others. The wide range of alternative devices used is the reason why many applications for users with disabilities are targeted at users who use the same type of pointing device or similarly operating pointing devices. This reduces the amount of testing but presents the serious drawback of reducing the range of persons that may be able to use a given application. On the user availability side, disabled users are far less available for taking part in usability testing than other users, which extends testing periods. Moreover, it is difficult to replicate usability tests involving disabled users because their mood and motivation are very changeable.

4. Related Work
Testing must take place to ensure that interactive systems have the desired behaviour and that software creators understand users' needs [2]. To speed up development, software creators resort to predictive usability tests with models such as GOMS, CCT and KLM. The GOMS model (Goals, Operators, Methods and Selection rules) [3] presents a hierarchical goal structure, is usually used to measure user performance at execution time with a specific interface, and can be used to filter design options. In this model, goals are the user's goals, what the user wants to achieve; they represent the user's mental target when the user evaluates the situation and decides what shall be done next. Operators are the lowest level of analysis, the basic actions that users can perform with the system; they can change the system or only the user's mental state. Methods are ways to achieve goals. Selection depends on the user, the system and the details of the goal. The CCT model (Cognitive Complexity Theory) [5] enriches the GOMS model, with emphasis on predictive capacity. The model has two parallel descriptions: the user's goals and the computer system (called the device). The goal description is based on a GOMS-like goal architecture, described using production rules. The advantage of the CCT model is its capability to measure interface complexity: the more production rules there are, the more difficult it is to learn an interface. This model can represent more complex tasks than the GOMS model. The KLM model (Keystroke-Level Model) [4] predicts user performance in low-level physical tasks. Tasks are divided into two phases: the acquisition phase, where the user constructs a mental representation of the task, and the execution phase, where system facilities are used to carry out the task. This model presents a hierarchical goal structure and can be used to measure user performance and execution time. The KLM model predicts only the last stage of activity; it assumes that the user has made a decision during the acquisition phase, i.e., there is no accounting for high-level mental activity. The KLM model does not extend well to complex tasks and can be seen as a very low-level GOMS model. The GOMS and CCT models also assume that the user is an expert and are therefore not recommended for developing
a model for persons with disabilities. The KLM model also assumes the user is an expert; therefore, it is not recommended for implementing a model for users with special needs. A further literature review showed no predictive usability model offering a solution that supports persons with disabilities [9].
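For contrast, the sketch below shows the kind of prediction these models produce for an expert, able-bodied user: a KLM estimate obtained by summing standard operator times. The operator values are the commonly cited estimates from Card et al. [4], used here purely for illustration, and it is precisely this expert-user assumption that fails for the population addressed in this paper.

```python
# Illustrative KLM prediction for "open a door" with a mouse: mentally prepare,
# point at the door, click. Operator times are the commonly cited estimates
# from Card et al. [4]; real values for users with disabilities differ widely.
KLM_TIMES = {
    "M": 1.35,  # mental preparation (s)
    "P": 1.10,  # point with a pointing device (s)
    "K": 0.20,  # keystroke / button press (s)
}

def klm_time(operators):
    """Sum the operator times for a sequence such as ['M', 'P', 'K']."""
    return sum(KLM_TIMES[op] for op in operators)

print(klm_time(["M", "P", "K"]))  # about 2.65 s for an expert user
```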
5. The λ-Model
To make it possible to develop interfaces for users with special needs without the users being present during the development process, we propose to simulate the behaviour of such users. To that end we took the Virtual House 3D game as our testing tool and collected data on the actions performed by the users and the times at which those actions happened. The data required for analysis was extracted from video recordings of the usability test experiments; it reflects the device used, the player actions, and the initial and final times of each task and action. The study started by analyzing the entire system [9], identifying the important entities and their characteristics, the model perspective to adopt and the events that have an impact on the system. In the next stage we analyzed the data from the interaction experiment by constructing histograms and temporal diagrams for each experimental case and user. This was followed by the statistical analysis that identified the random variables required to model user interaction. After completing the description of the real system and the statistical analysis, the conceptual model was developed, along with a simulator with an appropriate architecture. In the next subsections we describe the various steps taken to create our model.
5.1 Model Description
The nature of the system is symbolic, because it can be described by logical and mathematical relations. The model is also dynamic, because both the real system and the simulated model change with time, and thus time must be a model variable. The model is stochastic, because at least one model variable is of a random nature and can be described by a probability function. The relevant entities of the system are user, event, game, device and message. The permanent entities are user, device and game, while event and message are temporary entities. User and game are active entities; event, device and message are passive entities. In the simulator, these entities were designed as classes, following an object-oriented approach, and assigned data and appropriate behaviour. Activities have start and end times. Each system state completely describes all the entities, attributes and activities at a given time. From the different possible model types in discrete simulation, we chose the event perspective for the λ-Model. This perspective has advantages such as efficiency in terms of memory occupation and execution time, but its flexibility is its most advantageous feature. We consider that the system state changes when the game changes level. To model the system appropriately, it is necessary to identify the events that change the system state. In our model we considered game start, the occurrence of an event from a device and game end as state-changing events. The game start event and an event generated by a device are similar because the same actions are available when they occur: move up, move down, move left, move right and selection. The implementation of the event perspective leads to the implementation of a calendar containing predicted future events; the calendar holds the occurrence time and the type of each event.

5.2 Data Analysis
Data was gathered for the following variables: device (e.g., keyboard, joystick, mouse or switch), operator indicating the player movement action (move up, down, left or right) and selection (open door or perform task). The initial and end times determine the duration of each device functionality selection. The task that the user must perform is another variable ("dress", "eat hamburger", "clean teeth", "listen to music", "ride the carousel", "wash hands", "eat fruit", "wash dishes", "watch TV", "wash your hair"). A sketch of the corresponding data record is given after Figure 4. Each task was analyzed through two representative graphics: a temporal diagram and a histogram [9]. Here we present the case of user u1 using the keyboard device in the Virtual Home "eat hamburger" task.

Figure 4. Temporal diagram for user u1 in the "eat hamburger" task using the keyboard device
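The following is a minimal sketch of the record such an analysis works from; the field names are hypothetical, but the fields correspond to the variables listed above (device, action, task and start/end times).

```python
# Hypothetical record for one logged action, extracted from the video
# recordings: device, action/operator, task and start/end times. Field names
# are illustrative; the duration follows from end_s - start_s.
from dataclasses import dataclass

@dataclass
class LoggedAction:
    user: str        # e.g. "u1"
    device: str      # "keyboard", "joystick", "mouse" or "switch"
    action: str      # "up", "down", "left", "right" or "selection"
    task: str        # e.g. "eat hamburger"
    start_s: float   # time the key/function was engaged (s)
    end_s: float     # time it was released (s)

    @property
    def duration_s(self):
        return self.end_s - self.start_s

a = LoggedAction("u1", "keyboard", "up", "eat hamburger", 12.0, 13.4)
print(a.duration_s)
```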
Using data from the histogram (Figure 5) and applying the laws that define the Poisson distribution, the resulting parameters are shown in Table 1:
Table 1. λ-parameters for user u1 ("eat hamburger" task)
Task: Eat hamburger — λdown = 0, λup = 6, λleft = 2, λright = 2, λselection = 5

Figure 5. Histogram for user u1 in the "eat hamburger" task using the keyboard device

The key most pressed to complete the "eat hamburger" task is the one to move forward, followed by the selection key. The keys to move left and right were pressed only twice each, and the key for backward motion was not pressed at all, as can be seen in Figure 5. After data acquisition and analysis, the study proceeded to obtain a mathematical representation of user behaviour from the data collected. Predicting the occurrence of an event depends on establishing the probabilistic rules of that event. The distribution was chosen on the basis of the characteristics of the process to model, which resulted in the choice of the Poisson process [6] [9]. Historically, the term process has been used to suggest observation of a system over time; the Poisson process counts the frequency of events in time. By definition [6]: given an interval of real numbers, assume counts occur at random throughout the interval. If the interval can be partitioned into subintervals of small enough length such that (1) the probability of more than one count in a subinterval is zero, (2) the probability of one count in a subinterval is the same for all subintervals and proportional to the length of the subinterval, and (3) the count in each subinterval is independent of other subintervals, then the random experiment is called a Poisson process. If the mean number of counts in the interval is λ > 0, then the random variable X that equals the number of counts in the interval has a Poisson distribution with parameter λ, with probability mass function f(x) = (e^{-λ} λ^x) / x!, for x = 0, 1, 2, …

The statistical parameters obtained for user u1 were used to simulate the interaction of another user in the same disability category, with the same device and goal task. The simulated data was compared with the data produced by another real user (user u2) whose conditions are similar to those of user u1 (Figure 9). The absolute difference between the data simulated from user u1's parameters and the experiment with the real user u2 is presented in Figure 10 and shows a good fit between the two. These experiments were also carried out for other users and other devices. In another example the user was able to manipulate a mouse device and the goal task was again "eat hamburger". The λ-parameters for user u3 in this task were computed following the same procedure as for users u1 and u2. The temporal diagram for user u3 (Figure 6) was created and the corresponding histogram produced (Figure 7).
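The sketch below illustrates the estimation and simulation procedure just described, under simplifying assumptions: the λ-parameter of each action is taken directly as the observed count for the task, and a synthetic run is produced by sampling Poisson counts. It is an illustrative reconstruction, not the authors' simulator code.

```python
# A minimal sketch (not the authors' code) of how per-action λ-parameters can
# be estimated from logged key presses and then used to simulate a synthetic
# user, following the Poisson model described above.
from collections import Counter
import math
import random

ACTIONS = ["down", "up", "left", "right", "selection"]

def estimate_lambdas(logged_actions):
    """Count how often each action occurred during one task run.
    With one observed run per task, the count itself is the λ estimate."""
    counts = Counter(logged_actions)
    return {a: counts.get(a, 0) for a in ACTIONS}

def sample_poisson(lam, rng=random):
    """Knuth's algorithm for sampling a Poisson-distributed count."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def simulate_task(lambdas, rng=random):
    """Generate per-action counts for one simulated run of the task."""
    return {a: sample_poisson(lam, rng) for a, lam in lambdas.items()}

# Example: counts matching Table 1 (u1, "eat hamburger", keyboard).
observed = ["up"] * 6 + ["left", "left", "right", "right"] + ["selection"] * 5
u1_lambdas = estimate_lambdas(observed)   # down=0, up=6, left=2, right=2, selection=5
print(simulate_task(u1_lambdas))
```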
Figure 6. Temporal diagram for user u3 in the "eat hamburger" task using the mouse
Figure 7. Histogram for user u3 in the "eat hamburger" task using the mouse
The statistical parameters were computed (Table 2) and used to simulate the interaction of another user (u4) in the same category, with the same device and goal task. The results are presented in Figures 11 and 12 and discussed in the Evaluation subsection.
Table 2. λ-parameters for user u3 ("eat hamburger" task)
Task: Eat hamburger — λdown = 5, λup = 18, λleft = 5, λright = 7, λselection = 4
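Using the illustrative simulate_task helper sketched earlier, a synthetic run for this configuration would be drawn from these values; again, this is for illustration rather than the authors' code.

```python
# Simulate one synthetic "eat hamburger" run with user u3's λ-parameters
# (Table 2), reusing the illustrative simulate_task helper defined above.
u3_lambdas = {"down": 5, "up": 18, "left": 5, "right": 7, "selection": 4}
print(simulate_task(u3_lambdas))
```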
5.3 The Simulator and the Conceptual Model
The simulation software was developed in three hierarchical levels [7]: the executive level, or control program, the operative level and the detailed routines level (Figure 8). The executive level controls the model, e.g., calendar management, calendar advance mechanisms and operation sequencing. The operative level describes all the important functionalities, their dependencies and activities; the instantiation of entities and the definition and description of all activity types that occur in the model are examples of its operations. The detailed routines level initializes the variables and generates the random sequences.

Figure 8. Packages diagram
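A minimal sketch of how the three levels and the future-event calendar could fit together is given below. The class names, the per-second rates and the use of exponential inter-event delays are assumptions made for illustration, not details taken from the implementation.

```python
# Illustrative three-level discrete-event structure: the executive level owns
# and advances the calendar of (time, event type) pairs, the operative level
# reacts to events, and the detailed-routines level seeds the random streams.
import heapq
import random

class DetailedRoutines:
    """Lowest level: initialises variables and generates the random sequences."""
    def __init__(self, seed=42):
        self.rng = random.Random(seed)

    def next_delay(self, rate):
        # Exponential inter-event time for a Poisson process with the given rate.
        return self.rng.expovariate(rate) if rate > 0 else float("inf")

class OperativeLevel:
    """Middle level: holds the entities and describes what each event does."""
    def __init__(self, rates, routines):
        self.rates, self.routines = rates, routines

    def handle(self, time, action, calendar):
        print(f"{time:6.1f}s  {action}")
        # After handling an event, schedule the next one of the same type.
        delay = self.routines.next_delay(self.rates[action])
        if delay != float("inf"):
            heapq.heappush(calendar, (time + delay, action))

class ExecutiveLevel:
    """Top level: owns the calendar (time, event type) and sequences the run."""
    def run(self, operative, horizon):
        calendar = []
        for action, rate in operative.rates.items():   # game-start event
            delay = operative.routines.next_delay(rate)
            if delay != float("inf"):
                heapq.heappush(calendar, (delay, action))
        while calendar:                                # advance the calendar
            time, action = heapq.heappop(calendar)
            if time > horizon:                         # end-of-game event
                break
            operative.handle(time, action, calendar)

# Illustrative per-second rates derived from the u1 counts over a ~90 s task.
rates = {"up": 6 / 90, "left": 2 / 90, "right": 2 / 90, "selection": 5 / 90}
ExecutiveLevel().run(OperativeLevel(rates, DetailedRoutines()), horizon=90.0)
```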
5.4 Evaluation
The λ-model developed to predict the behaviour of users with disabilities showed an excellent statistical fit with the behaviour of reference users. To assess this, we compared the behaviour predicted by the simulator with the behaviour of a real user under the same conditions. The quality of the results is illustrated by a further example, in which we compared the behaviour simulated with the λ-parameters computed for user u1 (Table 1) with the behaviour of a user in the same disability category, user u2 (Figure 9).

Figure 9. Same-category user histograms: u2 experiment vs. u1 simulation using the keyboard device

The histogram in Figure 10 shows the absolute value of the difference between the results produced by the simulator with user u1's λ-parameters and the real behaviour of user u2.

Figure 10. Absolute value of the difference between the real values (user u2) and the values simulated for user u1, for the "eat hamburger" task and the keyboard device
The differences shown in Figure 10 for the left, down and selection keys are explained by the repetitive behaviour that the user exhibited. When users try to perform a task and a single action does not produce the desired result, they feel partially frustrated and react by repeating that action for a certain period of time. This tendency usually disappears as users gain experience, but some users lose it only very slowly.
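The comparison behind Figures 10 and 12 can be summarised by the small sketch below, which computes per-action absolute differences and the overall Euclidean distance mentioned in the conclusions; the frequency values shown are made up for illustration.

```python
# Per-action absolute differences between simulated and observed frequencies,
# plus the overall Euclidean distance between the two frequency vectors.
import math

ACTIONS = ["down", "up", "left", "right", "selection"]

def compare(simulated, observed):
    diffs = {a: abs(simulated[a] - observed[a]) for a in ACTIONS}
    euclidean = math.sqrt(sum(d * d for d in diffs.values()))
    return diffs, euclidean

# Hypothetical frequency vectors for a u1 simulation vs. the u2 experiment.
sim_u1 = {"down": 0, "up": 7, "left": 2, "right": 2, "selection": 4}
exp_u2 = {"down": 1, "up": 6, "left": 4, "right": 2, "selection": 7}
diffs, dist = compare(sim_u1, exp_u2)
print(diffs, round(dist, 2))
```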
Figures 11 and 12 compare the data simulated for user u3 with that of real user u4, and also illustrate this situation.

Figure 11. Same-category user histograms: u4 experiment vs. u3 simulation using the mouse device

Figure 12. Absolute value of the difference between the real values (user u4) and the values simulated for user u3, for the "eat hamburger" task and the mouse device

The experiment included a reasonable number of participants. However, the sample was heterogeneous, since it was restricted to the users of the "Virtual Home" game in the day care centre, one of the project goals being to make the game accessible to all participants, including those with greater difficulties. Participants presented a wide variety of disabilities, ages and education levels. These factors influenced the research because it was not possible to obtain a homogeneous group of acceptable size after grouping users by disability and also by device and task. Despite the difficulties with the available sample, the simulator produced excellent results, as shown above.
6. Conclusion
In this study we presented a behaviour-predictive model for users with special needs in 3D games [9]. The goal of this work was to contribute to better interface design and development for users with disabilities. The effort in this research area is justified by the time and cost involved in testing, compounded by the problems of testing with users with disabilities, such as low motivation and changeable mood. Another advantage of using predictive models is that they allow product evaluation without the presence of users and are less expensive than tests involving users. From the practical case analysed in this study, we concluded that it is possible to determine the dependent variables, the λ-parameters, for each type of interaction, and that the data produced by simulation shows an excellent match with the reference data. By producing simulations with the λ-parameters, we replicated behaviour identical in frequency of actions (movements in the virtual world), which compares well with the behaviour of a reference user of the same disability category performing the same task with the same device, as shown by the analysis of the Euclidean difference. The results show an excellent statistical fit, in terms of frequency of events, between the data acquired during the experiment and the simulated data. Because we used a small group of users when we isolated the variables (task, device and disability level), we propose, as future work, further studies with user groups with more homogeneous disability characteristics. In future development of the model we intend to focus on the operative module of the λ-Model and to use techniques such as the ConcurTaskTrees (CTT) model [Paterno97], which also allows studying user errors.
Acknowledgements Many thanks go to the therapists and patients involved in this project.
References
[1] A. Cook, S. Hussey, Assistive Technologies (Englewood Cliffs, Mosby, 2002).
[2] A. Dix, J. Finlay, G. Abowd, R. Beale, Human-Computer Interaction (Harlow, England, Pearson Prentice Hall, 2004).
[3] S. K. Card, T. P. Moran, A. Newell, The Psychology of Human-Computer Interaction (London, LEA, 1983).
[4] S. K. Card, T. P. Moran, A. Newell, The keystroke-level model for user performance time with interactive systems, Communications of the ACM, 23, 1980, 396-410.
[5] D. E. Kieras, P. G. Polson, An approach to the formal analysis of user complexity, International Journal of Man-Machine Studies, 22, 1985, 365-394.
[6] D. C. Montgomery, G. C. Runger, Applied Statistics and Probability for Engineers (New York, John Wiley & Sons, Inc., 1999).
[7] G. Fishman, Concepts and Methods in Discrete Event Digital Simulation (New York, John Wiley, 1973).
[8] N. Abreu, Modelação de Utilizadores Portadores de Deficiência, Master Thesis (Lisbon, IST/UTL, 2006).
[9] A. Artífice, Integração de Dispositivos Múltiplos para Acessibilidade Aumentada, Master Thesis (Lisbon, IST/UTL, 2008).