Using TouchPad Pressure to Detect Negative Affect

Helena M. Mentis
Cornell University
[email protected]

Abstract

Humans naturally use behavioral cues in their interactions with other humans. The Media Equation proposes that these same cues are also directed toward media, including computers. Detecting these cues by computer during run-time could plausibly improve usability design and analysis. A preliminary experiment testing one such cue, Synaptics TouchPad pressure, shows that behavioral cues can serve as a critical incident indicator by detecting negative affect.
1. Introduction

Despite the efforts of the last decade to develop methods and theories of Human-Computer Interaction for increased usability, practitioners have failed to develop feasible, realistic systems that adapt to the user's affect during run-time. To supplement good usability design and testing, a truly intelligent computer interface should be able to adapt to all situations and users, sensing the negative or positive state of the user and reacting in a beneficial manner. Even with current usability practices, computer interfaces continue to induce negative affective states in human users, such as frustration, anxiety, and anger. A system's inability to rectify these negative states may impede productivity, creativity, and cognitive capacity. Much usability research has been aimed at preventing negative affect from arising, but little has been done to detect affect during run-time and then reduce those negative states. In addition, usability testing focuses on quantitative cognitive indicators such as time to completion and binary (yes/no) success of actions. Affect has been addressed only through self-reported exit questionnaires and interviews. This causes a loss of information when a user becomes frustrated by an interface but afterwards does not remember, or never realized, that they were frustrated.
Geri K. Gay
Cornell University
[email protected]

2. Affect and human behavior

The ubiquity of facial displays underscores the idea that in interpersonal communication one must be aware of the emotional state of the other. Without awareness of nonverbal cues of internal state, a breakdown in communication can occur between individuals. Two types of information are conveyed through the outward expression of emotions. The first is the inner emotional state of the individual (e.g., he is surprised by something that has been said) [2]. The second is information regarding the surrounding environment (e.g., there is danger nearby). Both of these categories have implications for natural selection, not only for the individual communicating them but also for those in the environment around them. The goals of conveying emotion outwardly are, first, communicating to others what one is feeling and, second, influencing their behavior [5]. The second has important evolutionary implications. For instance, an infant experiencing discomfort shows its inner emotional state outwardly, which in turn elicits a comforting response from the child's caretaker. This response to the outward expression of emotion ensures the care and nurturance of the child, ultimately assisting in its survival. Situations that influence affect have a much greater effect on human performance than merely altering mood. Environmental situations can alter affect, which in turn can effect cognitive and behavioral changes. Andersen and Guerrero [1] have linked negative affect with events in which interruptions hinder goal attainment; positive affect has been associated with efficient problem solving and improved memory, learning, and creative thought [3], [4].
2.1. Communicating with computers

Although it is not difficult to accept that humans use nonverbal cues to display their emotions to other humans, it seems counter-intuitive that humans also use nonverbal cues while interacting with media. However, a significant number of studies have shown that, despite their conscious awareness that media are not sentient, subjects innately follow the same social rules of nonverbal emotional expression. In The Media Equation [10], Reeves and Nass summarize many of their 35 studies showing that humans interact with media in a "fundamentally social and natural" way. For example, they cite the results of their article in Human Factors in Computing, a set of five experiments providing evidence that "individuals' interactions with computers are fundamentally social" (p. 72) [7]. This phenomenon can cause problems in the usability of computers. These perceptions can mask the differences between the user and the computer and encourage the user's assumption that the system has much more capability and flexibility than it really does [6]. This can then increase the amount of blame that the user places on the computer when there is a critical incident, thus increasing frustration.
3. Affect in usability testing

The usability community continues to test users' cognitive capabilities through objective measures, but researchers turn to subjective measures, such as questionnaires and exit interviews, to gauge affect. This does not give a complete picture of the effect of the interface on the user. There needs to be a way to obtain quantifiable affective information without relying on the users themselves to remember back to certain situations. Researchers are creating affect-sensing devices that sample physiological signals such as pulse and galvanic skin response (GSR) [11]. However, these methods are either too intrusive or rely on unproven techniques for affect recognition. Others have developed pressure mice that measure only the squeezing pressure of the user's grip on the mouse [9]. Unfortunately, this method has technical problems and misses important information when the user actually clicks the mouse to achieve a goal. Humans examine vocal cues, posture, touching behavior, hand gestures, and, most of all, facial expressions to infer the affect of others with whom they are communicating. Although these signals provide viable ways for computers to recognize their human user's state, they may be difficult to interpret computationally, and thus augmenting them with other data may yield a more robust and reliable sensing method.
Finding a reliable way for computers to sense affect poses the greatest challenge. As Picard, Vyzas, and Healey state [8]:

"The most natural setup for gathering genuine emotions is opportunistic: The subject's emotion occurs as a consequence of personally significant circumstances (event-elicited); it occurs while they are in some natural location for them (real-world); the subject feels the emotion internally (feeling); subject behavior, including expression, is not influenced by knowledge of being in an experiment or being recorded (hidden-recording, other-purpose)." (p. 1178)

The following study samples behavioral signals as indicators of user affect. Such measures of the user's state during application run-time are less intrusive and go undetected by the user; they are also event-elicited and gathered in a real-world environment, thus capturing genuine, internal emotions. Microsoft Word was chosen as the platform for analyzing critical incidents because we wanted to show that this is a viable way to analyze usability problems in a real-world application. The behavioral measure chosen for this experiment was TouchPad pressure, because users would not be aware of the monitoring. In this study, we sought to ascertain whether a behavioral measure such as TouchPad pressure could be used for affect research and usability testing in the future.
4. Method

4.1. Participants

This was a preliminary study intended to lay the groundwork for a sounder detection algorithm. Thus, only three subjects (n = 3; female = 2, male = 1) with varying computer skills were tested. Each of the subjects was familiar with Microsoft Word and the Synaptics TouchPad.
4.2. Apparatus

Each subject was asked to complete a set of tasks using Microsoft Word (Table 1). Tasks 3-7 were chosen for their moderate difficulty and the likelihood of causing the user to make errors. While the participant was completing the tasks, a logging mechanism recorded his or her finger pressure on the Synaptics TouchPad, saving one reading approximately every hundredth of a second. The Synaptics TouchPad is a flat input device that measures the capacitance between its embedded wires and the user's finger; its sensitivity is approximately 1000 points per inch. In addition, all screen actions were recorded to VHS tape for later analysis. Both types of data were time-stamped for later comparison.
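As a rough sketch, the logging mechanism could look like the following loop. The `read_pressure` function here is a hypothetical stand-in for the actual Synaptics driver query, which is not described in this paper:

```python
import time

def read_pressure():
    """Hypothetical stand-in for the Synaptics driver query;
    returns the current finger pressure (0 = no contact)."""
    return 0  # placeholder value for this sketch

def log_pressure(duration_s, interval_s=0.01, clock=time.monotonic):
    """Sample TouchPad pressure roughly every hundredth of a second,
    time-stamping each reading so it can later be aligned with the
    screen recording."""
    samples = []
    start = clock()
    while clock() - start < duration_s:
        samples.append((clock() - start, read_pressure()))
        time.sleep(interval_s)
    return samples
```

Each entry is a (timestamp, pressure) pair, which is the form the later window analysis assumes.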
4.3. Procedure

After signing the consent forms, the participants were seated in front of a laptop computer with a Synaptics TouchPad pointing device. They were given a sheet with seven tasks to complete (Table 1). After they completed the experiment, they were asked by the experimenter what they thought of the tasks and whether any of the tasks had frustrated them. These frustrating incidents were noted for later analysis. The researchers then located the sections of the data where the participants indicated they were frustrated and identified a critical incident in each frustration-inducing section. A critical incident was defined as a point where the outcome of a participant's actions was not what the participant had intended. This definition was chosen for its ease of identification by the researchers.

Table 1. Tasks for analysis
1. Open the Word file "The Gettysburg Address" in the folder Speech under My Documents.
2. Change the title to all upper case.
3. Format the text into three even columns.
4. Find a picture of Lincoln on the web and save it in the same folder as the speech.
5. Put the picture of Lincoln at the top of the page, centered.
6. Make a table at the bottom of the page, centered, with the following information. Days and times the speech is reenacted: Monday 1:00-2:00, 4:00-5:00; Tuesday 9:00-10:30, 2:30-6:00; Wednesday 8:00-10:00, 12:00-2:00; Thursday 12:00-2:30; Friday 1:00-4:30, 9:00-10:00.
7. Make the times blue.
Each user's finger pressure on the TouchPad before and after each critical incident was compared with a t-test. Each analysis window was 0.50 seconds of data. This was acceptable primarily because, if this technique is to be used for on-the-fly recognition in the future, one second is fast enough for the system to react while still providing enough data points for analysis. Six critical incidents per participant yielded data that could be readily analyzed. To ensure that the significant pressure differences found during critical incidents were not coincidental, six successful outcomes per participant were also analyzed with a t-test.
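The 0.50-second before-and-after windows can be extracted from the time-stamped log along these lines (a sketch, assuming the log stores (timestamp, pressure) pairs):

```python
def window(samples, t_event, width_s=0.5):
    """Split time-stamped (t, pressure) samples into the 0.5 s
    immediately before and after a critical incident at t_event."""
    before = [p for t, p in samples if t_event - width_s <= t < t_event]
    after = [p for t, p in samples if t_event <= t < t_event + width_s]
    # Trim to an equal number of points so the before and after
    # groups cover equivalent time windows.
    n = min(len(before), len(after))
    return before[-n:], after[:n]
```

With a steady ~100 Hz sampling rate, each half-window holds roughly 50 pressure readings.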
5. Results

After the experiment, the changes in each participant's finger pressure on the TouchPad were analyzed. First, the pressure readings for each critical incident were aggregated, yielding one averaged value representing the period before and one value representing the period after each incident. An equal number of data points -- that is, an equivalent time window -- was selected to ensure a clean comparison between the before and after groups of pressure readings. Next, a paired-samples t-test was performed on the six before-and-after pairs for each subject. From this analysis, we found a significant difference in the mean pressure levels before and after the critical incidents for all three participants (Table 2).

Table 2. Critical incident t-test results
Participant 01 Aggregated Total: t(5) = -6.081, p < .002
Participant 02 Aggregated Total: t(5) = -5.324, p < .002
Participant 03 Aggregated Total: t(5) = -5.483, p < .002
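The shape of this paired-samples comparison can be reproduced in a few lines of Python; the pressure means below are illustrative stand-ins, not the study's data:

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Paired-samples t statistic over matched (before, after) means,
    with df = len(before) - 1. Differences are taken before - after,
    so increased pressure after an incident yields a negative t."""
    diffs = [b - a for b, a in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# One mean-pressure pair per critical incident (illustrative values).
before_means = [62.0, 58.5, 60.2, 57.8, 61.1, 59.4]
after_means = [81.3, 79.0, 84.6, 76.2, 80.8, 83.1]
t_stat = paired_t(before_means, after_means)
```

With six pairs, any |t| above 2.571 is significant at the .05 level (two-tailed, df = 5).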
Figure 1 is a graph of the TouchPad data from Participant 03, Critical Incident 01, which occurred while the participant was creating three even columns. Between time 0:00 and 2:30, the user was moving the mouse and choosing a menu item. Between 2:30 and 4:00, the user was waiting for the system to respond and then assessing the outcome. At 4:00, the user began to move the mouse in order to undo the undesirable result.
[Figure 1 here: line graph of TouchPad pressure (y-axis, 0-140) against time in seconds (x-axis, 0:00-5:30).]
Figure 1. CI pressure change example – Participant 03, Critical Incident 01

The non-critical incidents were then analyzed. As expected, smaller, non-significant changes occurred when the participants' actions produced the results they intended (Table 3).

Table 3. Successful outcome t-test results
Participant 01 Aggregated Total: t(5) = -2.609, p > .068
Participant 02 Aggregated Total: t(5) = -1.868, p > .085
Participant 03 Aggregated Total: t(5) = -3.037, p > .059
Figure 2 is a graph of the TouchPad data from Participant 02, Successful Outcome 03 in response to successfully creating three equal columns.
[Figure 2 here: line graph of TouchPad pressure (y-axis, 0-150) against time in seconds (x-axis, 0:00-4:00).]

Figure 2. Non-CI pressure example – Participant 02, Successful Outcome 03

6. Discussion

It is possible to use behavioral data to analyze the effect of a critical incident. In addition, it is possible that these signals can be used by a system to detect when a critical incident occurs during run-time.

In considering the implementation of such a system, it is important to note that each participant had a different mean TouchPad pressure, as well as varying degrees of difference between the means before and after critical incidents. This is a strong argument for comparing behavioral measures against a baseline set for each individual user, rather than using a general baseline measure for all users.

One limitation of the experiment was that many of the tasks were completed without any problems. Thus, we had a small number of sections for analysis after critical incidents. Future experiments will require a more controlled environment that produces more critical incidents for analysis.

The most important benefit of this approach is that behavioral measures such as these are unobtrusive and go undetected by the user. In addition, there is no waiting for the sympathetic nervous system to show signs of affect. The main drawback of this type of behavioral data is that, if the user is not already touching the TouchPad after a critical incident, there is no data until they touch it again. This may not always be acceptable. Thus, it may be worthwhile to combine TouchPad analysis with other behavioral measures such as keyboard pressure or sitting posture.

Creating devices that detect levels of affect for responding machines is the primary goal of affective computing. However, considering how affect detection can be used in the usability testing process could give designers and researchers a more complete picture of the effect of interfaces on users. The primary design principle for interfaces should be that media conform to social and natural rules, so that no instruction is necessary and use is more enjoyable. Affect detection can be a means to that goal while providing new tools to help testers measure the user's level of enjoyment and uncover undetected usability issues.
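A run-time detector built on the per-user baseline argument above could be sketched as a simple z-score check against each user's own pressure history; the threshold here is an illustrative assumption, not a value derived from this study:

```python
from statistics import mean, stdev

def exceeds_baseline(history, recent_window, z_threshold=2.0):
    """Flag a possible critical incident when the mean pressure of a
    recent window sits more than z_threshold standard deviations
    above this user's own baseline, rather than a global norm."""
    base_mean = mean(history)
    base_sd = stdev(history)
    if base_sd == 0:
        return False  # no variability recorded yet; cannot judge
    z = (mean(recent_window) - base_mean) / base_sd
    return z > z_threshold
```

The baseline history would be accumulated per user during normal interaction, so each participant's idiosyncratic mean pressure is factored out before any incident is flagged.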
7. References

[1] Andersen, P. A., & Guerrero, L. K. (1997). Principles of communication and emotion in social interaction. In P. A. Andersen & L. K. Guerrero (Eds.), Handbook of communication and emotion (pp. 49-96). New York: Academic Press.

[2] Cosmides, L., & Tooby, J. (2000). Evolutionary psychology and the emotions. In J. M. Haviland-Jones & M. Lewis (Eds.), Handbook of emotions (pp. 91-115). New York: The Guilford Press.

[3] Isen, A. M. (1999). Positive affect. In T. Dalgleish & M. Power (Eds.), Handbook of cognition and emotion (pp. 521-539). New York: John Wiley & Sons.

[4] Isen, A. M., Daubman, K. A., & Nowicki, G. P. (1987). Positive affect facilitates creative problem solving. Journal of Personality and Social Psychology, 52(6), 1122-1131.

[5] Levenson, R. W. (1994). Human emotion: A functionalist view. In P. Ekman & R. J. Davidson (Eds.), The Nature of Emotion (pp. 123-126). Oxford: Oxford University Press.

[6] Marakas, G. M., Johnson, R. D., & Palmer, J. W. (2000). A theoretical model of differential social attributions toward computing technology: When the metaphor becomes the model. International Journal of Human-Computer Studies, 52, 719-750.

[7] Nass, C., Steuer, J. S., & Tauber, E. (1994). Computers are social actors. Proceedings of the CHI Conference, 72-77. Boston, MA.

[8] Picard, R. W., Vyzas, E., & Healey, J. (2001). Toward machine emotional intelligence: Analysis of affective physiological state. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(10), 1175-1191.

[9] Qi, Y., Reynolds, C., & Picard, R. (2001). The Bayes Point Machine for computer-user frustration detection via pressure mouse. PUI 2001.

[10] Reeves, B., & Nass, C. (1996). The Media Equation. Cambridge: Cambridge University Press.

[11] Vyzas, E., & Picard, R. W. (1999, May 1). Offline and online recognition of emotion expression from physiological data.