Taking A Fresh Look at Auditory Displays: Gaze Driven Auditory Browsing

Rameshsharma Ramloll
Co-operative Systems Engineering Group, Computing Department, Lancaster, LA1 4YR, UK
E-mail: [email protected]

John Mariani
Co-operative Systems Engineering Group, Computing Department, Lancaster, LA1 4YR, UK
E-mail: [email protected]

ABSTRACT

In this technical note, we illustrate a possible application of gaze tracking in the auditory domain. We show how coupling the position of gaze with that of a sound sensor provides a number of new interaction opportunities at information-rich interfaces involving auditory landscapes. We describe our strategy for exploring this browsing technique and introduce the application area where it will be exploited.

KEYWORDS: Auditory displays, gaze tracking

INTRODUCTION

The need to browse a dense auditory display was felt during the design of a multi-user desktop system, Moksha, which is described elsewhere [2]. In this system, each object in a given desktop environment (e.g. an icon), whether visible or hidden, is able to receive events generated as a result of peer user activities. To provide in-context event notification, these objects represent such events locally in both the visual and the auditory medium. Auditory representations are thus collocated with the interface objects. This gives rise to a dense auditory display which ideally needs to be browsed and comprehended with minimal work-practice disturbance and cognitive overhead at the multi-user desktop.

Coupling a sound sensor with the mouse pointer, as a virtual user embodiment, to browse such displays has been a popular approach [1]. However, this approach has a few limitations. Users tend to view the mouse pointer as a metaphoric hand, and doubling its role as a channel for auditory perception forces the user to operate either in a 'stethoscopic' mode, by analogy with the familiar medical stethoscope, or in direct manipulation mode. They cannot operate in both simultaneously, which inhibits the dialectical and parallel relationship that exists between perception and action in the real world. Having a second mouse pointer dedicated to 'stethoscopic' tasks does not solve this problem of forced serialised interactions. We claim that this is because the natural tendency is for both hands to be used mainly in situations where they co-operate to complete a direct manipulation task, rather than having one hand deal with auditory observation and the other with direct manipulation. These two tasks, perceived as being distinct, fail to capture the attention of the user ubiquitously and simultaneously.

Another significant issue is that the position of the user embodiment, i.e. the mouse pointer, is often forgotten. The user is then likely to be confused about the actual location of short-lived sources.

Thus, when designing Moksha, the following approach is adopted in order to achieve as much parallelism in interaction as possible. All direct interface manipulations are carried out through the keyboard and the mouse. Visual observation tasks are handled by normal visual processes, and browsing of localised sounds is achieved by coupling the position of gaze with that of the sound sensor. At no point during interaction does the user lose control of her access to auditory or visual information because of the need to manipulate the interface directly. This coupling also naturally keeps the user constantly aware of the position of the sensor, thereby allowing location inferences to be made immediately.

PROTOTYPICAL SYSTEM

The gaze tracker [3] being used is a prototype from Vision Control Systems Inc. This device performs a software analysis that averages the pupil's motion in order to determine the position of gaze. It is therefore not simply an eye position sensor, and it tends to produce minimal jitter when positioning the sound sensor. At this stage, the gaze position is determined relative to the head position, making a fixed head orientation necessary. In the future, the device will be operated in conjunction with a head tracker to relax the restrictions on head movement.
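The tracker's own averaging is internal to the device, so the details are not reproducible here; purely as an illustration of the idea, the sketch below smooths raw pupil samples with a moving average before the estimate is used to position the sound sensor. The class and method names are hypothetical, not the device's API.

```python
from collections import deque

class GazeSmoother:
    """Minimal sketch, assuming raw (x, y) pupil samples arrive one at a
    time: average the most recent samples to obtain a stable gaze estimate,
    reducing jitter when the estimate drives the sound sensor."""

    def __init__(self, window_size=8):
        self.samples = deque(maxlen=window_size)  # recent (x, y) pupil positions

    def add_sample(self, pupil_xy):
        self.samples.append(pupil_xy)

    def gaze_estimate(self):
        # Simple arithmetic mean of the buffered samples.
        if not self.samples:
            return None
        xs, ys = zip(*self.samples)
        return (sum(xs) / len(xs), sum(ys) / len(ys))

# Hypothetical usage: feed raw samples as they arrive, then position
# the sound sensor at the smoothed estimate.
smoother = GazeSmoother(window_size=8)
for raw in [(102, 240), (104, 238), (101, 241)]:
    smoother.add_sample(raw)
print(smoother.gaze_estimate())
```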


Figure 1 User wearing the prototypical gaze tracker


SPATIALISING SOUND SOURCES

Since our implementation is based on a 3D sound API that does not produce convincing externalised sound sources, we rely on panning, distance attenuation and Doppler effects to convey the location of sound sources. We are also experimenting with four speakers, two for left-right panning and the remaining two for up-down panning, to tackle elevation localisation problems.
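As a rough sketch of the panning and distance attenuation just mentioned (the Doppler effect is omitted), the function below derives stereo gains from the offset between the sensor and a source. The linear attenuation, the range value and the constant-power panning law are illustrative assumptions, not the parameters used in our implementation.

```python
import math

def pan_and_gain(sensor_pos, source_pos, max_range=300.0):
    """Sketch: derive left/right gains for a source from its offset
    relative to the sound sensor. Positions are (x, y) in screen units;
    max_range is an illustrative audibility limit."""
    dx = source_pos[0] - sensor_pos[0]
    dy = source_pos[1] - sensor_pos[1]
    distance = math.hypot(dx, dy)

    # Linear distance attenuation: full volume at the sensor, silent at max_range.
    gain = max(0.0, 1.0 - distance / max_range)

    # Constant-power panning from the horizontal offset (-1 = hard left, +1 = hard right).
    pan = max(-1.0, min(1.0, dx / max_range))
    angle = (pan + 1.0) * math.pi / 4.0
    left, right = math.cos(angle) * gain, math.sin(angle) * gain
    return left, right

# Example: a source slightly to the right of the sensor.
print(pan_and_gain((100, 100), (180, 120)))
```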

BROWSING AUDITORY LANDSCAPES

Figure 2 illustrates a typical experimental set-up used to explore our browsing technique. We have developed ALET (Auditory Localisation Evaluation Tool), which allows the creation of auditory landscapes and the recording of a user's navigational behaviour when solving simple tasks such as browsing for a given source. This provides us with the data necessary to evaluate our technique and our selection of sounds for a particular application domain.
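To make concrete the kind of navigational data such a recording might contain, the sketch below logs time-stamped sensor positions together with the nearest audible source. The class, field names and hearing range are hypothetical, not ALET's actual implementation.

```python
import math
import time

class NavigationLog:
    """Hypothetical sketch of ALET-style recording: time-stamped sensor
    positions plus the nearest source within hearing range."""

    def __init__(self, sources, hearing_range=150.0):
        self.sources = sources          # {source_id: (x, y)}
        self.hearing_range = hearing_range
        self.entries = []               # (timestamp, sensor_xy, nearest_audible_source)

    def record(self, sensor_xy):
        nearest_id, nearest_dist = None, float("inf")
        for source_id, pos in self.sources.items():
            dist = math.dist(sensor_xy, pos)
            if dist < nearest_dist:
                nearest_id, nearest_dist = source_id, dist
        audible = nearest_id if nearest_dist <= self.hearing_range else None
        self.entries.append((time.time(), sensor_xy, audible))

# Example: two sources, a few sensor positions along a browsing path.
log = NavigationLog({"folder": (50, 60), "printer": (300, 220)})
for sensor in [(40, 70), (180, 150), (290, 210)]:
    log.record(sensor)
print(log.entries)
```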

Figure 4♣ An ALET recording of a user's navigational behaviour (figure labels include the sensor position due to gaze, the sensor position due to a combination of gaze and direct manipulation (scrolling), sound sources, and the area exposed during direct manipulation).

Figure 5 shows a prototype of Moksha to illustrate how dense, dynamic auditory landscapes are likely to arise. Elsewhere we describe how interface elements are mapped to sound sources to create a 2½D layered auditory environment, with parametrisations used to represent occlusion and containment [2].
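The actual parametrisations are given in [2]; purely as an illustration of the idea of representing occlusion and containment through sound parameters, the sketch below scales a source's gain by hypothetical occlusion and containment factors. The specific factors are assumptions, not the values used in Moksha.

```python
def effective_gain(base_gain, occluded=False, containment_depth=0,
                   occlusion_factor=0.5, containment_factor=0.7):
    """Illustrative sketch only: reduce a source's gain when its interface
    element is occluded, and again for each level of containment (e.g. an
    icon inside a closed folder). Factors are hypothetical."""
    gain = base_gain
    if occluded:
        gain *= occlusion_factor
    gain *= containment_factor ** containment_depth
    return gain

# Example: an occluded icon nested two folders deep.
print(effective_gain(1.0, occluded=True, containment_depth=2))  # 0.245
```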

Figure 2 Evaluating gaze driven auditory browsing (figure labels include the gaze tracker, the pupil image as captured by the VCS camera unit, the left and right speakers, and the auditory landscape).

EXPLORING GAZE DRIVEN AUDITORY BROWSING

Figure 3 shows an ALET window containing sound sources distributed on a 2D plane. The user is able to scroll this plane using the mouse or keyboard while simultaneously perusing the auditory scene. Figure 4 shows a typical recording of a user's navigational behaviour.
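One way to read this combination of scrolling and gaze is that the sensor's position in the auditory landscape composes the window's scroll offset with the gaze position within the window, as suggested by the Figure 4 labels. The sketch below shows that composition; the coordinate conventions and names are assumptions for illustration.

```python
def sensor_world_position(gaze_window_xy, scroll_offset_xy):
    """Sketch: the sound sensor's position in the landscape is the gaze
    position within the window plus the amount the window has been
    scrolled by direct manipulation. Coordinate conventions are assumed."""
    gx, gy = gaze_window_xy
    sx, sy = scroll_offset_xy
    return (gx + sx, gy + sy)

# Example: gazing near the window centre after scrolling 400 units to the right.
print(sensor_world_position((320, 240), (400, 0)))  # (720, 240)
```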

Figure 3♣ A window containing spatialised sound sources (figure labels include a sound source, the sound sensor, and a source range).

Figure 5 The Moksha Multi-User Desktop (figure labels include example potential sites for informative sound sources, a shared drawing tool, and windows containing desktop elements).

♣ Modified to emphasise relevant detail.

RELATED WORK

In Moksha, localised sounds are exploited both at the level of the whole multi-user desktop and within individual shared applications. Each icon, e.g. a folder, is a potential location for sound source(s). At the level of an individual shared application, operations performed on the canvas, for instance, can be associated with localised sounds.

IMPLICATIONS

While our interest in multimedia browsing stems from our work in the design of CSCW systems, the browsing technique introduced here can be used in other application domains such as data visualisation and the browsing of sound and music databases.

ACKNOWLEDGMENTS

The authors thank Rik Bettly of Vision Control Systems Inc. for making the Vision Control System available.

REFERENCES

1. Gaver, W. and Smith, R. Auditory Icons in Large Scale Collaborative Environments. Human-Computer Interaction - INTERACT '90. Elsevier Science Publishers B.V. (North-Holland), IFIP, 1990.
2. Ramloll, R. and Mariani, J.A. Moksha: Exploring Ubiquity in Event-Filtration Control at the Multi-user Desktop. Submitted to CSCW '98, Seattle, Nov 1998.
3. Vision Control Systems Inc., at web link: http://www.visioncs.com/main.html