Is Empathy the Key? Effective Communication via Instant Messaging

Marc Fabri, David Moore
School of Computing, Innovation North, Leeds Metropolitan University, United Kingdom

Abstract

Instant Messaging (IM) has become an effective and convenient way for many to chat in real time, over a distance, with friends, family, student peers and colleagues. Messaging tools such as Microsoft’s MSN® Messenger or Yahoo!® Messenger typically support text chat, show a picture of each interlocutor, and allow the exchange of emoticons, small emotional icons representing one’s feelings. In this paper, we investigate how different ways of visualising such emotions affect the experience of an IM user. We created a two-person messaging application that represents each interlocutor as a 3D character, or avatar, capable of expressing emotion through facial expressions. The results strongly suggest that emotional expressiveness in avatars increases involvement in the interaction between the participants. This has a positive effect on their subjective experience. Further, we identify empathy as a key component for creating a more enjoyable experience and greater harmony between spatially separated people communicating via IM technology. Directions for future research are outlined.

1 Introduction

Instant Messaging (IM) has become an effective and convenient way for many to chat in real time, over a distance, with friends, family, student peers and colleagues. Messaging tools such as Microsoft’s MSN® Messenger or Yahoo!® Messenger typically support text chat, show a picture of each interlocutor, and allow the exchange of emoticons, small emotional icons representing one’s feelings (figure 1). Instant Messaging tools connect people who are spatially separated, who do not want to, or who cannot, come together physically but want to discuss topics. In this paper, we investigate how different ways of visualising such emotions affect the experience of an IM user. For our experiment we created a two-person messaging application that represents each interlocutor as a 3D character, or avatar, capable of expressing emotion through facial expressions. We call this tool a Virtual Messenger (figure 2). The Virtual Messenger enables two people to meet and discuss a topic, assisted by various interactions. The tool allows us to investigate how a user’s experience differs when the avatars representing the interlocutors are emotionally expressive, as opposed to non-expressive.

Fabri, M., Moore, D.J. (2005) Is Empathy the Key? Effective Communication via Instant Messaging, in Proceedings of 11th EATA International Conference on Networking Entities, October 2005, St. Pölten, Austria

Figure 1: Microsoft’s MSN ® Messenger

Figure 2: The Virtual Messenger interface

1.1 Character animation

From the real world we know that emotions are important. Wherever one interacts with another person, that other person's emotional expressions are monitored and interpreted (Argyle 1988). Further, psychology suggests that emotions are also an important factor in decision-making, problem solving, cognition and intelligence in general (Damasio 1994, Lisetti and Schiano 2000). Giving computer-animated characters non-verbal, emotionally expressive abilities has long been considered beneficial as it potentially leverages the observer’s lifetime of experience with social interaction in the real world (Bates 1994, Reeves and Nass 1996, Picard 1997). The latest generation of computer-generated movies such as ‘Shrek 2’ by DreamWorks (released 2004) fully exploits our innate familiarity with human physiology and physiognomy. The characters display well-animated human features without looking perfectly human.

Recent research in the area of animated characters and embodied agents (Bartneck 2002, Prendinger and Ishizuka 2004, Cowell and Stanney 2005) has informed the design of the Virtual Messenger characters. Our virtual 3D head is animated using a subset of the Facial Action Coding System with only 12 instead of the normal 58 Action Units (cf. Ekman and Friesen 1978, Ekman 1999). Morph targets allow independent movement of eyes, eyebrows, nose, cheeks, mouth, chin and the entire head. The resulting character is capable of displaying the six “universal” facial expressions of emotion (happiness, surprise, anger, fear, sadness and disgust; Ekman and Friesen 1978), as well as a neutral face. All expressions were objectively verified and were designed to be highly distinctive and recognisable (see Fabri et al 2004a for details of design and evaluation).
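To illustrate how morph targets can combine independent facial movements, the following is a minimal sketch of standard linear morph-target blending, in which each target contributes a weighted offset from the neutral mesh. It is not the Virtual Messenger implementation: the function name, the toy two-vertex "mesh" and the target names `brow_raise` and `jaw_drop` are all hypothetical.

```python
from typing import Dict, List, Tuple

Vertex = Tuple[float, float, float]


def blend_morph_targets(
    base: List[Vertex],
    targets: Dict[str, List[Vertex]],
    weights: Dict[str, float],
) -> List[Vertex]:
    """Return base + sum_i w_i * (target_i - base), computed per vertex."""
    blended = []
    for idx, (bx, by, bz) in enumerate(base):
        dx = dy = dz = 0.0
        for name, weight in weights.items():
            tx, ty, tz = targets[name][idx]
            # Each target adds its weighted offset from the neutral pose.
            dx += weight * (tx - bx)
            dy += weight * (ty - by)
            dz += weight * (tz - bz)
        blended.append((bx + dx, by + dy, bz + dz))
    return blended


# Hypothetical toy data: a neutral "mesh" of two vertices and two targets.
base = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
targets = {
    "brow_raise": [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)],
    "jaw_drop": [(0.0, 0.0, 0.0), (1.0, -2.0, 0.0)],
}
blended = blend_morph_targets(base, targets, {"brow_raise": 0.5, "jaw_drop": 0.25})
```

Because each target only contributes an offset, several facial regions can be driven at once, and a weight of zero leaves the neutral face unchanged.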

2 Experimental study

The Virtual Messenger was evaluated in a between-groups experiment (N=32) conducted in pairs. There were two versions of the tool. The first allows users to click on emoticons which then appear in the chat log of both participants. Both participants are represented by avatars, but there is no change in the appearance of the avatar other than idle animations such as regular blinking or random, subtle eyebrow movements. The second features the same emoticons and avatar representations. When a user clicks on an emoticon, it appears in the chat log and, in addition, causes their avatar in the partner’s messenger window to display that emotion.

We expected the absence or presence of emotional expressiveness in the animated characters to affect the user’s experience. To study this empirically, we defined the degree of "richness" experienced during interaction with the Virtual Messenger. Of course, the terms ‘experience’ and what we call ‘richness’ here are abstract ideas, and some observable characteristics need to be defined. That task is not trivial. By asking ‘How colourful was your day?’, Slater (1999) illustrates that the quality of an experience is often difficult, if not impossible, to capture and quantify. This is the same problem as having to rate exactly how much one enjoyed an evening of food, drink and dance – and then compare it to someone else’s experience of the same evening.

2.1 Scenario and Task

The scenario chosen for the debate is a classical survival exercise: two people have crash-landed on the moon after having won a space flight in a competition. They now have to make their way to the mothership. Their task is to decide what to take with them and what to leave behind – first individually, then together, in order to agree on a joint ranking list. This scenario was chosen for two reasons. First, it is sufficiently complex to warrant a stimulating discussion while remaining focussed. Survival exercises are often used to provide a structured experience in group decision-making where the advantages and disadvantages of different options have to be weighed up and a joint solution negotiated (cf. O'Reilly 1996, Bartneck 2002). Second, because of the potential for both dispute and harmony, the scenario may naturally elicit emotional responses during the debate. It can be considered somewhere between real life and fantasy, and participants are likely to be able to relate to it – if only as an imaginary adventure. The Virtual Messenger tool then allows participants to display their intentions and emotions, potentially to emphasise a statement, appease after a dispute, or help to pursue a set goal.

2.2 Measurements

In an attempt to make ‘richness’ quantifiable, we postulate a richer experience as manifesting itself through:

1. More involvement in the task
2. Greater enjoyment of the experience
3. A higher sense of presence during the experience
4. A higher sense of copresence

At the time the experiment was carried out (autumn 2004), no comparable study combining these four factors existed. However, various researchers have looked at these and related characteristics in isolation. Their interpretations informed the definition and the choice of evaluation tools. Koda (1996), for example, investigated how simulated emotions in agent opponents can affect a player's involvement in a two-dimensional game task. Lessiter et al (2001) identified engagement as one of four factors influencing the sense of presence in virtual environments. Bartneck (2002) studied the enjoyability of interacting with a robotic character, while Nichols (1999) measured the enjoyment users felt when navigating a 3-D world. Presence is probably the concept that has received the most attention to date (cf. Witmer and Singer 1998, Slater 1999, Gerhard 2003). However, an on-going debate (cf. Lessiter et al 2001, Slater 2004) shows that it is not universally agreed what elicits presence, or how it can be measured effectively. In the following sections we explore each characteristic of richness in detail, and how it can be measured most effectively.

2.2.1 Involvement

Involvement in the task is taken to be an objective measure of the number of user-initiated actions taking place. These are automatically recorded in the interaction log and consist of communicative acts as well as manipulations of objects in the environment. The following events are recorded:

Involvement Log
Each of the following events is automatically logged:

• Chat message sent
• Emoticon selected
• Item picked up
• Item moved to new position

In addition, the log also contains information about the experimental condition, participant’s name and avatar, the time elapsed for each event, final ranking list and various statistics and scores that feed directly into the empirical analysis.
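A log of this kind could be represented roughly as follows. This is a minimal sketch under stated assumptions, not the logging code used in the study: the four event types come from the list above, but the class names, field names and sample entries are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import List


class EventType(Enum):
    CHAT_MESSAGE_SENT = "chat message sent"
    EMOTICON_SELECTED = "emoticon selected"
    ITEM_PICKED_UP = "item picked up"
    ITEM_MOVED = "item moved to new position"


@dataclass
class LogEvent:
    participant: str        # fictitious nickname chosen by the participant
    condition: str          # experimental condition label (hypothetical field)
    elapsed_seconds: float  # time elapsed since the session started
    event_type: EventType
    detail: str = ""        # e.g. message text or item name (hypothetical field)


def involvement_counts(events: List[LogEvent], participant: str) -> Counter:
    """Count user-initiated actions per event type for one participant."""
    return Counter(e.event_type for e in events if e.participant == participant)


# Hypothetical sample log for one pair.
log = [
    LogEvent("Luna", "expressive", 12.0, EventType.CHAT_MESSAGE_SENT, "Hello!"),
    LogEvent("Luna", "expressive", 15.5, EventType.EMOTICON_SELECTED, "happiness"),
    LogEvent("Orion", "expressive", 20.1, EventType.CHAT_MESSAGE_SENT, "Hi there"),
    LogEvent("Luna", "expressive", 31.4, EventType.ITEM_PICKED_UP, "item A"),
    LogEvent("Luna", "expressive", 33.0, EventType.ITEM_MOVED, "item A"),
]
luna_counts = involvement_counts(log, "Luna")
```

Counting events per type keeps communicative acts (messages, emoticons) separable from object manipulations, which is how the involvement measure is described above.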

Another element of involvement may be revealed by the experimenter’s observation of participants during the task. Such observations will be analysed separately as qualitative data.

2.2.2 Enjoyment

Recently, researchers have begun to look into enjoyment aspects of the VR interaction experience (cf. Nichols 1999, Bartneck 2002, Stedmon et al 2003, Blythe et al 2003). The measuring instruments used are typically adapted from related disciplines, such as product design and interface design. Designers of consumer products have long been aware of the potential that quantifying enjoyment and pleasurability of use can yield for a product's success – or as Norman (2004) puts it: “attractive things work better”. Jordan (2000) provides a pre-validated 13-item pleasurability questionnaire, asking users to state their agreement with statements like "This product gives me satisfaction", or "I feel that I should look after this product". In a more recent study on computer interface design, Chawda et al (2005) investigated the relationship between usability and aesthetic quality. Building on Norman’s (2004) argument that the emotional side of a design, rather than its practical elements, can be the overriding factor in its success, Chawda et al developed an aesthetics measure specifically for their search engine interface. Interestingly, they found that “attractive things are perceived to work better”, but found no evidence that usability was actually improved – confirming Hassenzahl’s (2001) notion of perceived hedonic quality and its effect on the appealingness of a product. In other words, if something looks and feels good and you enjoy using it, it does not matter too much that there are other products on the market which, objectively, function better. The underlying psychological basis of these instruments makes them potentially useful in other, related domains, especially where the focus is on subjective attitudes to a system or product.

We used Nichols' mood adjective checklist (Nichols 1999), a questionnaire designed to capture emotional responses to an interactive experience directly after it has finished. The 12-item checklist is shown below:

Mood adjective checklist (Nichols 1999)
For each of the following adjectives, please indicate how much you felt that feeling during your experience with the Virtual Messenger:

1. panicked
2. exhilarated
3. bored
4. safe
5. motivated
6. scared
7. happy
8. trapped
9. excited
10. sad
11. lost
12. in control

(possible answers: never, rarely, sometimes, often, always)

Such checklists are typically used in psychological studies as an overall self-report measure, e.g. to evaluate experienced contentment versus distress. Nichols' list was created specifically for measuring the enjoyment aspect of interacting in a virtual reality system and we therefore consider it a particularly suitable tool to measure the subjective enjoyment component of richness in this study.

2.2.3 Presence

Presence can be defined as the sense of ‘being there’ in the virtual environment, despite knowing rationally that it is not the real world (Biocca et al 2001). Researchers investigating virtual reality systems have long considered presence an important element of the experience – perhaps the one aspect that may be universal, independent of application and of other aspects such as ‘task performance’ (Slater 2004). There are several instruments available to measure presence. Witmer and Singer's Presence Questionnaire (PQ) takes a predominantly technological approach, with 32 items querying how the VE interface affects the participant’s perception and ability to interact with the VE (Witmer and Singer 1998). Their analysis of data from 152 participants revealed three clusters: Involved/Control, Natural and Interface Quality. The Slater-Usoh-Steed questionnaire (SUS) takes a more psychological approach, focusing on internal states and introspection as a means of detecting presence. SUS queries the participant's feelings on three aspects: the sense of being in the virtual environment, the extent to which the environment becomes the dominant reality, and the extent to which the environment is remembered as a ‘place’ (cf. Usoh et al 2000). The weakness of SUS may lie in this very reliance on introspection, and in the difficulty of identifying and measuring events that are ultimately internal and not easily quantified objectively.
We are aware of the limitations inherent in measuring a very personal, experience-based phenomenon like the sense of presence in VE. An alternative approach is given by Meehan et al (2001) who use electrodermal activity (or galvanic skin response) and heart rate in an attempt to find a more objective measure of presence, or surrogates for it. However, such biometric methods can be very obtrusive, especially in non-immersive desktop VR where the user is fully aware of their own body and what sensors are attached to it. Further, their effectiveness is limited to situations where the physiological response is obvious, e.g. anxiety created by the situation presented in the virtual environment. An alternative and more inclusive method is offered by Lessiter et al (2001). Their 43-item ITC-Sense of Presence Inventory (ITC-SOPI) is a psychometrically sound questionnaire. It is available to the research community and its validity has been confirmed in several studies. In a factor analysis based on a large sample (n=604), Lessiter et al identified four distinct factors of presence:

• Spatial presence: Being in a physical space other than the actual place one is in.
• Engagement: A measure of the user’s interest in the content.
• Naturalness: Believability and realism of the content.
• Negative Effects: This factor measures physical effects of the experience, e.g. headache, eyestrain or tiredness. It is typically associated with more immersive experiences (e.g. when wearing a head-mounted display, or in IMAX cinemas).

Like PQ and SUS, ITC-SOPI is a post-experiment questionnaire measuring subjective aspects of the participant's virtual reality experience. Unlike other presence measures, it is not limited to a single type of virtual reality medium but is applicable across various media including immersive VR, desktop VR, first-person 3D games and large projection screen cinema. This cross-media validity, combined with the fact that the four factors can be investigated in isolation, makes ITC-SOPI a very appropriate tool for the current investigation.

2.2.4 Copresence

Where presence research aims to understand what leads to people’s sense of ‘being there’, copresence researchers investigate the sense of being together with another person in a computer-generated environment (cf. Slater 1999, Schroeder et al 2001, Gerhard 2003). As a minimum requirement, a degree of mutual awareness, social interaction or collaboration must be possible in the virtual environment to elicit a sense of copresence, also referred to as social presence (Biocca et al 2001). This can arise through visual contact (seeing the other’s avatar), aural awareness (hearing the other person), or noticing the consequences of another person’s actions on the environment or on objects in it. We define copresence here as a virtual face-to-face experience where both interlocutors share the same virtual space, which corresponds with Schroeder’s (2002) strict definition. Further, we follow other researchers (Schroeder et al 2001, Garau et al 2005) by measuring the phenomenon in the format of three post-experiment questions. These questions originate from the ITC-SOPI questionnaire:

Co-presence questions (from ITC-SOPI)
Please indicate how much you agree or disagree with each of the following statements: During my experience with the Virtual Messenger…

1. I had the sensation that the other character was aware of me
2. I felt as though I was in the same space as the other character
3. I had the sensation that the other character was responding to me

(possible answers: strongly disagree, disagree, neither, agree, strongly agree)
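One way to turn the three copresence answers into a single number is sketched below. The paper does not state how the answers were coded, so both the 1 to 5 coding and the averaging are assumptions made purely for illustration; the function and variable names are hypothetical.

```python
from statistics import mean
from typing import Sequence

# Assumed numeric coding of the five answer options (not stated in the paper).
LIKERT = {
    "strongly disagree": 1,
    "disagree": 2,
    "neither": 3,
    "agree": 4,
    "strongly agree": 5,
}


def copresence_score(answers: Sequence[str]) -> float:
    """Average the coded answers to the three copresence questions."""
    if len(answers) != 3:
        raise ValueError("expected exactly three answers")
    return mean(LIKERT[a.strip().lower()] for a in answers)


example_score = copresence_score(["agree", "strongly agree", "neither"])
```

With this coding, a score of 3 corresponds to a neutral overall response, and higher values indicate a stronger reported sense of copresence.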

2.2.5 Other measurements

In addition to the factors contributing to ‘richness’ of the experience, we measured two other factors: the participants’ task performance and the usability of the Virtual Messenger as scored by the participants. We were not interested in the performance of participants as such but in comparing scores across conditions and within the pair, e.g. whether the joint ranking task led to an improvement or otherwise. Performance was measured by comparing a participant’s ranking list with an available ‘ideal’ ranking list.

The system's usability is a factor that needs to be controlled, as poor usability can affect the validity of other factors. For example, a lack of understanding of how to use the interface may prevent participants from taking part in the conversation. Like the performance measure, usability did not form part of the richness score. Usability was measured using the established 10-item System Usability Scale (SUS – not to be confused with the Slater-Usoh-Steed presence questionnaire, which has the same acronym) developed by Digital Equipment Corporation (Brooke 1996). This is a tried and tested questionnaire addressing evaluation at the overall system level. SUS has shown high reliability and it is sufficiently generic to be valid with a variety of systems and devices.

System usability scale (Brooke 1996)
What do you think about the Virtual Messenger tool as a product?

1. I can imagine using the Virtual Messenger tool frequently
2. I found the Virtual Messenger tool unnecessarily complex
3. I thought the tool was easy to use
4. I think that I would need the support of a technical person to be able to use this tool
5. I found the various functions were well integrated
6. I thought there was too much inconsistency
7. I would imagine that most people would learn to use the Virtual Messenger very quickly
8. I found the Virtual Messenger very cumbersome to use
9. I felt very confident using the tool
10. I needed to learn a lot of things before I could get going with this tool

(possible answers: strongly disagree, disagree, neither, agree, strongly agree)
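For reference, the standard scoring procedure for the System Usability Scale can be sketched as follows: odd-numbered (positively worded) items contribute the response minus 1, even-numbered (negatively worded) items contribute 5 minus the response, and the sum is multiplied by 2.5 to give a score between 0 and 100. The coding of the answer options as 1 (strongly disagree) to 5 (strongly agree) follows the usual SUS convention; the function name is hypothetical, and the paper does not show its own scoring code.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Compute a 0-100 SUS score from ten responses coded 1..5."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly ten responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("responses must be integers from 1 to 5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += response - 1   # positively worded item
        else:
            total += 5 - response   # negatively worded item
    return total * 2.5


neutral = sus_score([3] * 10)
best = sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1])
```

The alternation between positive and negative wording matches the ten statements listed above, which is why the item number alone determines how a response is scored.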

Procedure

Participants took part in pairs. Each participant was welcomed by an instructor. None of the participants knew who they were paired up with. On arrival they were shown into separate rooms and did not meet directly before, during or after the experiment. Each participant chose a fictitious nickname for the duration of the experiment to maintain this anonymity.

Participants read the scenario description and started with their first task, the individual ranking of items in order of importance. This was done on paper. They could also write down additional comments on each item for their own benefit, if they wished. An instructor was available for questions, but did not actively observe the participant at this stage.

For the joint ranking task, the Virtual Messenger tool was started up. Participants selected their virtual embodiment from a choice of three male and three female avatars, as shown in Figure 3. Once started, the two participants ‘met virtually’ and were aware of each other. The condition (expressive or non-expressive avatars) was allocated randomly but was the same within the pair. The task was then to discuss their individual ranking lists and come to a consensus. There was no time limit. If participants could not agree on a joint ranking list, that was a valid outcome and a reason to end the discussion.

Figure 3: Choice of avatar

Immediately after the discussion in the Virtual Messenger ended, the participants completed a post-experiment questionnaire. It covered the enjoyment, presence, copresence and usability measures. Participants were asked to record their immediate response to each item, rather than thinking about items for a long time. Following completion of the questionnaire, the participants were free to leave. If they wished, they could take part in an individual debriefing session where the individual and joint ranking lists were scored and compared. The reasoning behind the "ideal" ranking list was explained. Most participants chose to stay for a debriefing.
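Scoring a ranking list against the "ideal" list could be done as in the sketch below. The paper does not specify the scoring rule, so the sum of absolute rank differences used here is an assumption (it is a common choice for survival exercises), and the item names are hypothetical placeholders rather than the items used in the study.

```python
from typing import List


def ranking_error(ranking: List[str], ideal: List[str]) -> int:
    """Sum of absolute rank differences to the ideal list (0 = identical)."""
    if sorted(ranking) != sorted(ideal):
        raise ValueError("both lists must rank the same items")
    ideal_position = {item: pos for pos, item in enumerate(ideal)}
    return sum(abs(pos - ideal_position[item]) for pos, item in enumerate(ranking))


# Hypothetical four-item example.
ideal = ["oxygen", "water", "map", "food"]
individual = ["food", "oxygen", "water", "map"]
joint = ["oxygen", "water", "food", "map"]

individual_error = ranking_error(individual, ideal)
joint_error = ranking_error(joint, ideal)
improvement = individual_error - joint_error  # positive if the joint list is closer
```

A positive `improvement` would indicate that the joint discussion moved the participant's ranking closer to the ideal list, which is the kind of within-pair comparison described in section 2.2.5.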

Pilot study

We performed a pilot study to investigate the extent to which the scenario and item descriptions were appropriate and easily understood, whether the timing was appropriate, and to identify any remaining usability issues that were not picked up during design of the Virtual Messenger. Participants could comment on any other issue they felt worth noting. Six people took part in three pilot sessions. Two sessions featured expressive avatars, one session featured non-expressive avatars. Participants went through the full experimental procedure. They were then interviewed individually and could comment on any aspect of the experiment.

As a result of observation, questionnaire responses and the interviews, we made minor changes to the scenario description and task instructions. We also changed the interface: three participants commented that the speech bubble above the avatar’s head disappeared too quickly, and that a history window might help to set utterances into context. Instant messaging tools such as Microsoft’s MSN® Messenger or Yahoo!® Messenger were given as examples of applications with an effective history feature. Further, participants felt that it would be useful to get visual feedback on emotions expressed via one’s own avatar. First-person computer games were given as examples, where such a ‘mirror’ view is a common feature. Both the history window and a mirror view of oneself were included in the final version of the Virtual Messenger tool. All participants commented that the questionnaire was relatively long, although they felt the questions were relevant. The final version of the Virtual Messenger interface as used in the actual experiment is shown in Figure 4:

Figure 4: Final Virtual Messenger interface (labelled areas: interlocutor, mirror of one’s virtual ‘self’, chat history)

3 Results and analysis

32 volunteers took part in the study: 8 groups of 2 using expressive avatars (condition EX) and another 8 groups of 2 using non-expressive avatars (condition NE). Participants were aged 21-63 and equally split between female and male. The average age was 28.2 years with a standard deviation of 13.4. Participants were generally computer literate, well educated and skilled in the use of keyboard and mouse. Few participants had experience of using games, virtual reality, or other applications that involve 3D characters. Several had used Instant Messaging tools before. None of the participants knew the moon survival scenario beforehand, although some knew of similar scenarios. The condition (NE or EX) was allocated randomly but was the same within each pair. Sessions lasted between 8 and 35 minutes excluding pre- and post-questionnaires, with an average length of 21.2 minutes (standard deviation 8.0). All but one pair agreed on a joint ranking list.

3.1 Involvement

The frequency and nature of all activity between the participants were logged electronically. The average time people took to complete the collaborative task was 20 minutes for (NE) and 22 minutes for (EX). Participants using the expressive avatars were significantly more involved in the task: they sent 50% more messages, their messages were 50% longer, they moved 100% more items, and there was 5 times more use of emoticons.

3.2 Enjoyment

The mood adjective checklist (Nichols 1999) for evaluating Virtual Reality interfaces indicated high enjoyment scores for both conditions overall, with (NE) being significantly more enjoyable. (F(1,32)=0.81, p
