This is an electronic version of an article published in Human-Computer Interaction. Nacenta, M. A., Gutwin, C., Aliakseyeu, D. & Subramanian, S. (2009). There and Back Again: Cross-Display Object Movement in Multi-Display Environments. Human-Computer Interaction, 24(1), 170-229. doi:10.1080/07370020902819882. Human-Computer Interaction is available online at:
http://www.informaworld.com/smpp/content~content=a910602221~db=all~jumptype=rss
There and Back Again: Cross-Display Object Movement in Multi-Display Environments

Miguel A. Nacenta and Carl Gutwin, University of Saskatchewan
Dzmitry Aliakseyeu, Philips Research
Sriram Subramanian, University of Bristol

RUNNING HEAD: Cross-display object movement in MDEs
Corresponding Author's Contact Information:
Miguel A. Nacenta
Department of Computer Science, University of Saskatchewan
110 Science Place, Saskatoon, SK, S7N 5C9, Canada
[email protected]

Brief Authors' Biographies:

Miguel A. Nacenta is a researcher with interests in low-level interaction techniques, multi-display environments, and collaborative work; he is a Ph.D. candidate in the Department of Computer Science at the University of Saskatchewan, Canada.

Carl Gutwin is a researcher with interests in Computer-Supported Cooperative Work, interaction modeling, and information visualization; he is a professor in the Department of Computer Science at the University of Saskatchewan, Canada.

Dzmitry Aliakseyeu is a researcher with an interest in interactive tabletop systems; he is a senior scientist in the Media Interaction group of Philips Research Labs, the Netherlands.
Sriram Subramanian is a researcher with an interest in designing and modeling interaction in post-desktop interfaces; he is a lecturer in the Department of Computer Science at the University of Bristol, United Kingdom.
ABSTRACT

Multi-display environments (MDEs) are now becoming common, and are also becoming more complex, with more displays and more types of display in the environment. One crucial requirement specific to MDEs is that users must be able to move objects from one display to another; this cross-display movement is a frequent and fundamental part of interaction in any application that spans two or more display surfaces. Although many cross-display movement techniques exist, the differences between MDEs – the number, location, and mixed orientation of displays, and the characteristics of the task they are being designed for – require that interaction techniques be chosen carefully to match the constraints of the particular environment. As a way to facilitate interaction design in MDEs, we present a taxonomy that classifies cross-display object movement techniques according to three dimensions: the referential domain that determines how displays are selected, the relationship of the input space to the display configuration, and the control paradigm for executing the movement. These dimensions are based on a descriptive model of the task of cross-display object movement. The taxonomy also provides an analysis of current research that designers and researchers can use to understand the differences between categories of interaction techniques.
CONTENTS

1. INTRODUCTION
2. MULTI-DISPLAY ENVIRONMENTS
3. CROSS-DISPLAY OBJECT MOVEMENT
   3.1. A model of cross-display object movement
   3.2. Scope
   3.3. Prerequisites for cross-display object movement
      3.3.1. Hardware and software infrastructure
      3.3.2. Input devices
   3.4. Assessment criteria for cross-display movement techniques
      3.4.1 Performance
      3.4.2 Power
      3.4.3 Feedthrough
4. TAXONOMY
   4.1 Overview
   4.2 Referential domain
      4.2.1 Types of referential domains
      4.2.2 Design considerations for referential domains
      4.2.3 Summary
   4.3 Display configuration
      4.3.1 Input model types
      4.3.2 Design considerations for display configuration
      4.3.3 Summary
   4.4 Control Paradigm
      4.4.1 Control types
      4.4.2 Design considerations for control
      4.4.3 Summary
   4.5 Summary of techniques
5. DISCUSSION
   5.1 Design concepts from the taxonomy
   5.2 Using the taxonomy
   5.3 Critical reflection on the taxonomy: limitations and issues
6. RELATED WORK
   6.1 Research on multi-display environments
   6.2 Surveys and classifications
7. RESEARCH AGENDA
8. CONCLUSION
APPENDIX A: INDEX OF INTERACTION TECHNIQUES
REFERENCES
1. INTRODUCTION

Multi-display environments (MDEs) are becoming more and more common, and are moving beyond simple multi-monitor setups to more complex environments that link tabletops, wall displays, projectors, PC monitors, and mobile devices into a single workspace (see Figure 1). These large-scale MDEs have the potential to dramatically change the way that we work with digital information: for example, they provide a variety of work surfaces to fit different kinds of tasks, they provide a very large display surface, they enable the use of peripheral attention space, and they naturally support co-located collaboration.
Figure 1. Different kinds of MDEs. a) multi-monitor computer, b) large composite display, c) advanced office system, d) meeting room, e) linked mobile composite environment
Although MDEs are clearly one of the directions in which computing environments are moving, we still know little about how to design and choose interaction techniques that enable fundamental tasks in these environments. Cross-display object movement – the action of moving a cursor or digital object from a specific location in one display to a
specific location in another display – is one of the core functionalities that allow fluid interaction in MDEs, and is the focus of this paper.

There are a number of existing techniques that could be used for cross-display movement. Some techniques have been specifically designed for multiple displays (e.g., Pick-and-Drop (Rekimoto, 1997)), and others can be adapted from large-display research (e.g., cursor extension techniques such as the Pantograph (Hascoët, 2004), world-in-miniature techniques such as the Radar view (Smith, 1992), or proxy-based techniques such as Drag-and-Pop (Baudisch et al., 2003)). However, we still know little about the underlying principles that cause these techniques to work well or poorly in different task situations and in different kinds of MDEs. As a result, it is difficult to compare techniques, difficult to predict whether a particular technique will succeed, and difficult for designers to select appropriate techniques for different application types and different display configurations.

In this paper we present a taxonomy that classifies cross-display movement techniques according to three dimensions: referential domain, display configuration, and control paradigm. The referential domain concerns the way in which the user and the system refer to a particular display. Users might refer to a display in a spatial or non-spatial way depending on the task at hand, and interaction techniques will require spatial or non-spatial actions to communicate the destination display to the system. In spatially-organized MDEs, the display configuration is the way that displays are arranged in logical space (e.g., as a set of planar surfaces, or as a perspective-based arrangement), which determines the way that a user can actually move an object from one display to another. Last, the control paradigm governs the way that the actual movement takes place (e.g., in an open-loop or closed-loop fashion).

The taxonomy provides the first comprehensive survey of existing cross-display interaction techniques. The taxonomy is based on a descriptive model of cross-display object movement that provides a new view of the cross-display design space. This organization also allows us to summarize existing empirical evidence from HCI and other fields that helps predict how different techniques will perform in different situations. Our work is intended to help designers as they choose the most appropriate interactive mechanisms for their systems, to stimulate the development of new and more efficient interaction techniques for MDEs, to facilitate further research by providing a common vocabulary and a common set of criteria for assessing and comparing techniques, and to highlight areas where more research is needed.
2. MULTI-DISPLAY ENVIRONMENTS

A multi-display environment (MDE) is an interactive computer system with two or more displays that are in the same general space (e.g., the same room) and that are related to one another in some way such that they form an overall logical workspace. This
definition includes systems where multiple displays are connected to a single computer and systems where networked computers link their displays through a groupware system. MDEs can combine any number of different kinds of displays (e.g., monitors, flat screens, tablets, tabletop displays, projected surfaces, PDAs) into many possible physical arrangements.

Common configurations of multi-display systems currently in use include multi-monitor systems (Figure 1.A) and large high-resolution composite displays (Figure 1.B). However, more advanced systems are already being designed, such as enhanced desktops for individual use that integrate displays of varying form factors – tabletops, wall displays, and tablets (Figure 1.C) – collaborative meeting rooms (Figure 1.D), or even workspaces composed ad hoc from linked mobile devices such as PDAs and mobile phones (Figure 1.E).
3. CROSS-DISPLAY OBJECT MOVEMENT

MDEs are fundamentally different from existing single-display systems, and these differences require that we pay special attention to interaction design. Previous research has focused on several different aspects of MDE interfaces, such as how to better display and visualize applications and data, or how people organize and use the different display surfaces (for a brief summary of this research, see Section 6); in this paper we focus on interaction, and more specifically, on the problem of cross-display object movement.

Cross-display object movement is the action of moving a digital object from one display to another. A digital object is anything that exists in the logical workspace – such as an object in an application, the application window itself, or a mouse cursor. Cross-display object movement is important because it is specific to MDEs and because it is one of the fundamental actions needed for the operation of MDEs; there can be little multi-display interaction if the cursor and the digital objects have to remain in their original displays. Other basic multi-display actions exist, such as the ad-hoc configuration of shared spaces or the replication of objects or display spaces in other displays. We believe that cross-display object movement is independent of other possible multi-display operations, and therefore can be studied in isolation.

The main contribution of this paper is a taxonomy of interaction techniques for cross-display object movement. The taxonomy uses three levels or dimensions; each dimension separates techniques into several categories. Categories are compared to each other according to three main assessment criteria: performance, power, and feedthrough (see Section 3.4 below). Before we describe the taxonomy, we present the cognitive model from which the dimensions are derived.
3.1 A model of cross-display object movement
Our model (Figure 2) distinguishes four processes within a cross-display object movement operation: first, the demands of the task or the constraints of the environment are transformed into the intention of moving an object, which includes determination of the correct destination display; second, an adequate response (i.e., an action plan for moving the object to the destination) has to be formulated; third, the movement must be executed; and fourth, in some cases the user must monitor and adjust the movement through a feedback loop.
[Figure 2: the Task/Environment feeds Intention formation ("I need to move the object"; "Where to move it?"), which produces the User's Intention; Response selection and plan formulation ("How to move it there?") produces an Action Plan; Execution ("Actually move it") produces the Action on the Environment/System; a Feedback loop runs from the environment back to the user.]

Figure 2. A model of cross-display object movement
The basic structure of the model can be illustrated with a simple example: a user is editing a document in a three-monitor desktop system (task/environment); at some point during the editing process, she realizes that she has to copy some text that is located in a different monitor, which requires moving the cursor to that display (user intention: to move the cursor to monitor B); the user then selects which of the possible actions has to be performed to achieve the goal, in this case, to move the mouse in a particular direction (action plan: move the mouse to the right); finally, the user performs the movement (action), which she adjusts along the way by looking at the position of the cursor (feedback loop). Even though this model implies certain causal constraints (e.g., action is not executed before the intention is formed), we are not suggesting a strict sequential ordering. Current research proposes that parts of these processes might happen in parallel (e.g., Hommel et al., 2001). The specifics of how these processes are organized or subdivided, and what elements they have in common, are still open research questions for experimental and
cognitive psychologists, and it is enough for our purposes to assume that the processes shown in Figure 2 occur in some form.

We must note that taxonomies do not need to be based on models in order to be useful (see other examples of related taxonomies in the Related Work section). However, using this model as a base for the different dimensions of the taxonomy allowed us to provide a categorization that is fairly independent of the circumstances of use of the designs (see discussion in 3.2 below) and facilitates the mapping of categories to their useful characteristics.
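To make the sequence of stages concrete, the following minimal sketch simulates response selection, execution, and the feedback loop for the three-monitor example above (a cursor crossing side-by-side displays in a single virtual space). All names and values are our own illustrative assumptions, not part of the original model:

    # Python sketch: the stages of Figure 2 for a cursor moving to monitor B.
    def cross_display_move(cursor_x, target, displays):
        # Response selection: derive an action plan from the display layout.
        left, right = displays[target]            # x-extent of the target display
        direction = 1 if cursor_x < left else -1  # plan: "move right" or "move left"
        # Execution with a feedback loop: act, observe, adjust until done.
        while not (left <= cursor_x <= right):    # observe cursor position (feedback)
            cursor_x += direction * 10            # act: move the mouse a little
        return cursor_x

    # Three 1024-px-wide monitors side by side in one virtual space.
    displays = {"A": (0, 1023), "B": (1024, 2047), "C": (2048, 3071)}
    print(cross_display_move(cursor_x=500, target="B", displays=displays))  # lands in B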
3.2 Scope

Any model simplifies reality in some way; our decision to focus on cognitive and psychophysical processes helps us describe the way that the user interprets information and executes actions, but it also oversimplifies other elements of the operation that lie at the edge of the model, such as the environment that is acted upon or the higher-level task that users are trying to accomplish. Our model is a low-level model because it does not deal with issues such as how cross-display actions are aggregated to complete a general task (such as editing a document), how a destination display is chosen for a particular purpose, or how the execution of a cross-display action affects group dynamics in collaborative systems.

Leaving these issues out of focus in the model was, however, intentional; the space of possible tasks for which MDEs can be used, the contexts in which these take place, and the effects on collaborative behavior (e.g., coordination, conflict, territoriality, group performance, privacy) are too large and complex to be analyzed here. Our analysis starts at the point where the user already knows which display she wants to move an object to, and ends at the description of how the action is executed and how it appears in the environment.

We believe, however, that our analysis can be informative for researchers who decide to investigate the effects of interaction techniques at higher levels of abstraction, because the processes and phenomena that we deal with are mostly independent of higher-level phenomena, and because we provide information about the characteristics of techniques that will affect the higher levels. For example, we describe whether a technique reveals information about the action to others, which can be useful in a system that requires workspace awareness, or disruptive in a system that requires strict privacy control.
3.3 Prerequisites for cross-display object movement

Cross-display interaction techniques depend on a number of underlying technologies; in this subsection we step back to set our research in the wider context of system design. The following subsections briefly survey the hardware and software infrastructures and the input devices required to implement cross-display interaction techniques.
3.3.1 Hardware and software infrastructure

The simplest kind of MDE in use today is a standard computer that is equipped with a dual-head video card and two monitors. This setup is supported by most current operating systems, and is no more difficult to program than single-display machines; the display space, although divided into two different display surfaces, is still considered a single virtual space by the operating system, which allows seamless transitions of objects from one display to the other. This approach can be extended by adding extra video cards to a single machine. Building complex MDEs in this manner has limits, however, because not all kinds of displays can be connected directly to the machine through a cable (e.g., mobile displays), and because a single machine might not be able to deal with all the input, computation, and output required by the users of a complex MDE.

Some researchers have instead proposed the use of meta-operating systems (also called middleware infrastructures) that combine a number of different machines connected through a network into an integrated interface (e.g., i-Ros (Ponnekanti et al., 2003), Gaia (Roman et al., 2002), Beach (Tandler, 2000)). Meta-operating systems also take care of important issues in the exchange of objects, such as where a transferred object really resides (i.e., is the object instantly copied into the destination device's memory, is it kept in a central repository, or is it merely an iconic representation that is transferred?), and access permissions for objects in other displays. These issues are important in multi-user systems because they can affect the privacy and safety of the data exchanged between displays and devices.

Meta-operating systems are powerful and can greatly simplify the implementation of cross-display interaction techniques. However, there exist easier ways to connect two or more devices (and their displays): for example, if two devices are connected through a network, it is relatively easy to support data transfer using e-mail, instant messaging, or other basic transfer services. For these kinds of systems, however, the problem is the configuration and specification of the connection through which two devices are to exchange data, because it sometimes requires selecting the devices from a list of all possible other devices in the network. This problem is critical for the success of MDEs and has been addressed by research such as SyncTap (Rekimoto et al., 2003), Smart-Its Friends (Holmquist et al., 2001), and Bump (Hinckley, 2003), which reduce configuration and specification processes to natural gestures.

Although the underlying hardware and software implementations of an MDE can affect the viability and performance of cross-display movement techniques, we assume for our purposes that the necessary underlying software and hardware are present and that they meet the requirements of the interaction technique at hand (e.g., interaction speed, efficiency of data transfer). For most interaction techniques we assume as well that a working connection between the involved devices has already been established.
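The "single virtual space" abstraction can be illustrated with a short sketch: the operating system exposes one coordinate space, each display claims a rectangle of it, and cross-display movement is simply movement across rectangle boundaries. The layout and names below are illustrative assumptions:

    # Python sketch: mapping a point in the virtual space to a display.
    # display id -> (origin_x, origin_y, width, height) in virtual coordinates
    DISPLAYS = {
        "monitor-1": (0, 0, 1920, 1080),
        "monitor-2": (1920, 0, 1920, 1080),  # placed directly to the right
    }

    def locate(x, y):
        """Return (display, local_x, local_y) for a virtual-space point."""
        for name, (ox, oy, w, h) in DISPLAYS.items():
            if ox <= x < ox + w and oy <= y < oy + h:
                return name, x - ox, y - oy
        return None  # the point falls in a gap between displays

    print(locate(2000, 100))  # -> ('monitor-2', 80, 100)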
3.3.2 Input devices

The taxonomy that we propose in this paper covers many interaction techniques from several different categories. These techniques rely on a broad set of input technologies, which can affect the performance, usability, and cost of the designed interfaces. Most of these technologies are widely used and do not present problems for the implementation of the techniques presented in this work: for example, indirect devices (e.g., mouse or trackball), touch screens, pen-based devices, or buttons and keyboards (see Hinckley (2006) for a current survey).

Other interaction techniques require input that goes beyond traditional input devices: for example, perspective techniques often require tracking of people and devices in the shared space (e.g., Nacenta et al., 2007-3). The problem of tracking objects and people has not yet been completely solved; however, research in this area is very active. Recent projects make use of a large array of different technologies to achieve different levels of accuracy, range, and cost in 2D and 3D tracking; for example, Hazas et al. (2005) and Harter et al. (1999) use ultrasound, Randall et al. (2007) use light measurement, Vildjiounaite et al. (2002) use a combination of accelerometers, magnetic sensors, and map knowledge, Spratt (2003) uses signal information from wireless links, and Ji et al. (2006) use WiFi signals (see Manesis and Avouris (2005) for an overview of existing techniques).

A discussion of the advantages and disadvantages of each of these technologies and systems is outside the scope of this paper. Instead, we assume that the technology underlying the techniques discussed below is already reasonably well implemented or will be in the near future. Occasionally we will highlight specific input device issues if there is a clear trade-off in accuracy, cost, or performance that affects the comparison of two groups of techniques.
3.4 Assessment criteria for cross-display movement techniques

The taxonomy presented below evaluates the different categories of techniques according to three main criteria: performance, power, and feedthrough. These criteria represent three important user needs – how quickly and how accurately the cross-display movement can be executed, what kinds of actions can be performed with the technique, and how the technique presents feedback in the environment. In addition, these criteria are also meant to help practitioners and researchers establish relationships between a technique and its effects in the general context of use of the MDE. For example, a technique with high power might allow users to grab objects from anyone's display, which might negatively affect privacy and collaboration. Typically, no technique is globally better than the others; instead, trade-offs appear, and it is the job of the interaction designer to decide which requirements are most important for the task and the MDE.
3.4.1 Performance

Most user studies of multi-display and single-display interaction techniques base their conclusions on classical measures such as speed and accuracy. These measures are generally gathered as study participants repeat basic actions that are components of higher-level domain tasks – movement actions such as targeting, steering, or docking; visual actions such as aligning or drawing; search tasks such as inspection or visual search; and navigation actions such as scrolling, panning, and zooming.

For cross-display object movement techniques, we are primarily interested in movement actions (particularly targeting and steering). However, existing performance measures generally operate at the execution stage shown in Figure 2, and do not consider the performance of other stages such as forming a plan for moving the object. Since these aspects of the interaction are critical in MDEs, performance measures for this area will have to take these other stages into account. In general, designers should aim for the best possible performance unless it comes at the cost of power, necessary feedthrough, excessive implementation costs, user discomfort, etc.

3.4.2 Power

Our definition of cross-display object movement encompasses a number of different kinds of interaction techniques that are enacted in very different ways. The definition allows for great variability in a technique's power – that is, the types of actions and manipulations that the technique makes possible for the user.

We are particularly interested in remote power, since this is more relevant to cross-display movement (we assume here that the user is working at a 'current' display and that other displays in the MDE are 'remote' even though they are in the same environment). We assume also that all techniques provide adequate local power (that is, the ability to manipulate objects on the current display), although some techniques may also have different local power characteristics. We define four main remote powers that object-movement techniques can provide:

Remote putting allows users to move objects to other displays. This is the minimal requirement for cross-display movement, and a few techniques provide only this power (e.g., emailing an object to another display).

Remote placing allows users to put objects at a specific location on a different display (and so provides more power than simple remote putting). For example, flicking and throwing techniques (Hascoët, 2003; Hascoët and Collomb, 2004) allow placing without providing any further control over the remote objects once they are placed. In addition, there can be different levels of granularity in placing (e.g., placing only into a quadrant of the remote display, or placing at a specific pixel).
Remote manipulation allows users to move an object on a distant display (and possibly also manipulate it in other task-specific ways). Techniques can provide this power by bringing proxies of remote objects to the local display (e.g., Drag-and-Pop (Baudisch et al., 2003)), by duplicating the remote display on the local one (e.g., Radar Views (Nacenta et al., 2005)), or by relocating the cursor to the remote display (e.g., Multi-Monitor Mouse (Benko and Feiner, 2005) and Perspective Cursor (Nacenta et al., 2006)).

Remote getting allows users to retrieve objects from other displays; this requires that the technique provide some means of accessing remote objects (e.g., by name, or through remote manipulation).

The traditional baseline comparison for a technique's power is the real world (that is, the capabilities that a person has in a purely physical space). Some techniques attempt to be very similar to the real world, and let users operate on the virtual space just as they would on the physical world (e.g., Tangible Bits (Ullmer et al., 1998), Pick-and-Drop (Rekimoto, 1997)). However, techniques can also take advantage of the computational space to provide 'super powers' that people do not have in the real world – such as the ability to reach and manipulate objects that are out of arm's reach (e.g., Multi-Monitor Mouse, Perspective Cursor). Note that providing the maximum power to users might not be desirable in all circumstances; for example, over-empowered users might abuse their capabilities and therefore disrupt natural privacy and territorial behaviors of groups.

3.4.3 Feedthrough

Many MDEs – particularly the meeting rooms and ad-hoc workspaces mentioned above (Section 2) – often involve not only multiple displays but also multiple users. In these situations, it is also important to consider whether interaction techniques provide information about the cross-display object movement to other users. For example, some techniques, such as Multi-Monitor Mouse (Benko and Feiner, 2005), move objects with the press of a button, and without any embodiment of the actor in the display space; this provides very little information about the action to other users in the system. In contrast, techniques such as Pick-and-Drop (Rekimoto, 1997) make the action evident to others because the user has to physically move to the destination display.

This information is called feedthrough¹, and it can affect group behavior in several different ways. Techniques that provide rich information to others will promote group awareness, the up-to-the-moment understanding that collaborators have about who is
working, what they are doing, and where they are doing it (Gutwin and Greenberg, 1998, 2002); awareness is vital for smooth and natural execution of shared tasks, and becomes more and more important as the collaboration becomes more closely coupled. However, too much information about a user's activity can also have negative effects, such as clutter and visual or physical interference (for example, when too many users are trying to manipulate objects in the same area), and can affect privacy behavior in some contexts (some users might decide not to perform certain actions because others will instantly see those actions).

¹ Our use of the term feedthrough is more general than the standard meaning attributed in CSCW (e.g., Dix, 1994); by feedthrough we mean the information available to others from the process of manipulating an object. This definition includes visible changes in position or other characteristics of the object (as in the standard use of the term), but also the information that is revealed through the operations of the actor and her embodiment.
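The assessment criteria can be read as a per-technique profile that designers compare against their requirements. The following sketch records the four remote powers (Section 3.4.2) and a rough feedthrough level (Section 3.4.3) for three techniques mentioned in the text; the structure and the exact flag assignments are our own illustrative assumptions, not measured properties:

    # Python sketch: declaring remote powers and feedthrough per technique.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class TechniqueProfile:
        remote_putting: bool       # can move objects to other displays
        remote_placing: bool       # ...to a specific location on them
        remote_manipulation: bool  # can move objects already on remote displays
        remote_getting: bool       # can retrieve objects from other displays
        feedthrough: str           # rough level: "low", "medium", or "high"

    PROFILES = {
        # Pick-and-Drop: the user physically walks to the destination display,
        # so the action is highly visible to others.
        "Pick-and-Drop": TechniqueProfile(True, True, True, True, "high"),
        # Flick/throwing: places objects but gives no control once placed.
        "Flick": TechniqueProfile(True, True, False, False, "medium"),
        # Emailing an object: remote putting only, with little visible action.
        "E-mail": TechniqueProfile(True, False, False, False, "low"),
    }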
4. TAXONOMY

In this section we describe the taxonomy of cross-display object movement techniques. We first provide an overview of the three levels in the taxonomy, and then detail each level in turn: referential domain, display configuration, and control paradigm.
4.1 Overview

The taxonomy organizes cross-display object movement techniques using three conceptual levels (see also Figure 3):

The referential domain level analyzes issues related to the translation of an intention into a specific plan for moving an object. This dimension deals mainly with response selection in our cognitive model (Figure 2). According to how techniques refer to displays, they can be classified as spatial or non-spatial.

The display configuration level analyzes how the physical arrangement of the MDE combines with the input model of the interaction technique. This level also corresponds with the response selection process of the cognitive model, but relates to execution and feedback processes as well. According to their input model, techniques can be classified as planar, perspective, or literal.

The control paradigm level analyzes the different mechanisms by which interaction techniques are controlled by the user. Control relates primarily to the execution and feedback processes of the cognitive model. According to their control paradigm, techniques can be open-loop, closed-loop, or intermittent.
[Figure 3: a tree rooted at "Cross-Display Movement Techniques" with three branches – Referential Domain (Spatial, Non-Spatial), Display Configuration (Planar, Perspective, Literal), and Control Paradigm (Open-Loop, Intermittent Open/Closed, Closed-Loop).]

Figure 3. Graphical representation of the three main levels of the framework
These levels are based on the model described in Section 3 and are similarly focused on the internal processes of the user that performs the cross-display action. Other categorizations are possible, but the advantage of our scheme is that it does not depend on the particular application of the technique, the context of use, or the task at hand; instead it reflects intrinsic characteristics of the technique with respect to the person that uses it. In other words, the same technique could prove a perfect choice or a usability disaster depending on the application domain or the task, but its intrinsic characteristics (its referential domain, its display configuration and its control paradigm) will stay the same. The next sections describe the levels of the taxonomy; in each, we first define the idea underlying the level, then describe how the level separates interaction techniques into groups. We then consider existing evidence that differentiates the groups of techniques, and summarize known evidence and open questions for that level.
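These three dimensions can also be written down as a simple classification scheme. In the sketch below, the enumeration values come directly from Figure 3; the example classifications follow the text where it is explicit (e.g., Multibrowsing uses a non-spatial, list-based reference; Put-that-there is spatial) and are otherwise marked as our assumptions:

    # Python sketch: the taxonomy's three dimensions as enumerations.
    from enum import Enum

    class Reference(Enum):
        SPATIAL = "spatial"
        NON_SPATIAL = "non-spatial"

    class Configuration(Enum):
        PLANAR = "planar"
        PERSPECTIVE = "perspective"
        LITERAL = "literal"

    class Control(Enum):
        OPEN_LOOP = "open-loop"
        INTERMITTENT = "intermittent open/closed"
        CLOSED_LOOP = "closed-loop"

    # technique -> (reference, configuration, control); None = not discussed here
    TAXONOMY = {
        "Multibrowsing": (Reference.NON_SPATIAL, None, None),
        "Put-that-there": (Reference.SPATIAL, None, None),
        # Assumption for illustration: a literal, closed-loop classification
        # for Pick-and-Drop (the pen physically carries the object).
        "Pick-and-Drop": (Reference.SPATIAL, Configuration.LITERAL, Control.CLOSED_LOOP),
    }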
4.2 Referential domain
Cross-display object movement can be seen as the transformation of an intention (the user wants an object to be in a particular display) into a change of state for the system (the object appears in its new position). In order to achieve this change of state, the user needs to communicate to the system where to move the object through an interaction technique. The intention of the user and the display specification required by the interaction technique can be expressed and represented in a variety of ways. For example, the user might want to move the object to a display that is called
'John's display' or to the display that is to her right – two different ways of referring to the same display. Similarly, an interaction technique might require typing the name of the destination display, or may allow the user to indicate the destination with a pointing gesture. The different methods or 'languages' in which objects can be referred to are what we call referential domains. A specific referential domain is formed by the set of references of a specific type to all the displays of the environment; for example, pointing gestures to displays form a common referential domain used by interaction techniques; another example is a naming scheme for displays.

This section analyzes how the relationship between the referential domain imposed by the task and the referential domain required by the interaction technique can affect performance, power, and feedthrough for cross-display interaction techniques. At this level, we classify referential domains into two groups, spatial and non-spatial; interaction techniques are therefore also classified into the same two groups depending on the referential domain that they use. The spatial vs. non-spatial division is backed up by research in visuospatial cognition that suggests that the brain uses distinct mental representations for spatial and linguistic descriptions of objects and scenes (Tversky, 2004); we will not, however, enter into the discussion of the actual structure of these representations, or of the way that one type of representation relates to or is transformed into another.

4.2.1 Types of referential domains

According to the way that they refer to a display, we divide cross-display object movement interaction techniques into two groups: spatial and non-spatial.

4.2.1.1 Spatial

We consider that an interaction technique is spatial if it references displays spatially, i.e., if the required input to place an object in a display relates in a spatial way to the position of the display in the physical arrangement of the MDE. For example, the classic Put-that-there technique (Bolt, 1980) uses the user's pointing gestures at a large display to indicate the destination of an object.

There exists a multitude of techniques that rely on a spatial referential domain. The large corpus of techniques based on the direct manipulation paradigm (Hutchins et al., 1985) is spatial. Other examples of spatial techniques include mouse-cursor movement techniques (Rekimoto and Saitoh, 1999; Ha et al., 2006-1), world-in-miniature techniques (Biehl and Bailey, 2004; Chiu et al., 2003; Kortuem et al., 2005; Swaminathan and Sato, 1997), and literal techniques (Ullmer et al., 1998; Holmquist et al., 2001), among others. We do not enumerate all spatial interaction techniques here because they are considered in more detail at the next levels of the framework.
4.2.1.2 Non-spatial

Cross-display interaction techniques are non-spatial when the destination display is referenced in any way that is not spatial. For example, we can refer to destinations through names, through the navigation of hierarchies, through lists, through association with colors or shapes, and through many other possible schemes.

Two simple examples of non-spatial movement techniques are instant messaging (IM) file transfers and shared network folders. IM can be considered a cross-display movement technique when we use it to transfer files to a different device through the network. These methods are often used in co-located meetings when there is no easier way to share files. The referential domain of an IM-based technique is the 'buddy list', with names and icons that represent possible destinations. Note that even though we might drag and drop a file onto somebody's buddy icon (which is a spatial movement), the technique is non-spatial because the target's location in the list has no relation to the actual position of the receiving machine in the real world. Shared network folders allow certain folders in a file system to be referenced from another computer, and therefore from another display. Files can be dropped onto the icon of the shared folder in order to execute the transfer from one display to another.

We refer to techniques like network folders and IM buddies as Wormhole techniques – objects disappear from one display and appear in another (see the sketch after Figure 4). Although most wormhole techniques are non-spatial, icons can also be organized in a spatial fashion (e.g., shortcuts can be arranged according to the physical position of the devices they link to, as in Figure 4). If this is the case, we consider them a different technique, called Spatially Arranged Wormholes (see Section 4.3.1).

The two non-spatial techniques described above are generic data-transfer mechanisms that were not originally designed for co-located interaction. Other techniques that do take co-present multiple displays into account, however, also use a list metaphor for display selection. For example, Multibrowsing (Johanson et al., 2001) allows web content to be moved from one screen to another by selecting the destination display from a contextual menu; Bluetooth connections (Bluetooth SIG, 2007) allow users to select devices from a list; and WinCuts (Tan et al., 2004) and Mighty Mouse (Booth et al., 2002) represent other displays as a list of names or icons.
Figure 4. An example of spatially arranged folders
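The wormhole behavior described above – an object disappears from one place and appears in another – can be sketched in a few lines. Here a watched local folder forwards anything dropped into it to a destination folder, which could be a network mount backed by another machine's display; the paths and the simple polling approach are illustrative assumptions:

    # Python sketch: a minimal folder-based "wormhole".
    import shutil
    import time
    from pathlib import Path

    def wormhole(src: Path, dst: Path, poll_seconds: float = 1.0):
        """Move anything that appears in src to dst."""
        dst.mkdir(parents=True, exist_ok=True)
        while True:
            for item in src.iterdir():
                # The object 'disappears' locally and 'appears' at dst.
                shutil.move(str(item), str(dst / item.name))
            time.sleep(poll_seconds)

    # e.g., wormhole(Path.home() / "to-tabletop", Path("/mnt/tabletop/inbox"))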
Two non-spatial techniques that do not use names or icons for reference are the keyboard-switch Multi-Monitor Mouse and the mouse-button-switch Multi-Monitor Mouse presented by Benko and Feiner (2005, 2007). In these variants of the Multi-Monitor Mouse, the cursor switches from one screen to the next in a circular list of available displays, much as the alt-tab key combination switches between applications in Windows™ systems.

Unfortunately, it is difficult to further categorize techniques based on non-spatial referential domains because there are countless possibilities for assigning symbols or descriptions to displays or locations inside displays. The subsections below contain some discussion of non-spatial referential domains, but there remains a large unexplored space of possibilities for the design of techniques that use them.

4.2.2 Design considerations for referential domains

Cross-display interaction techniques can refer to displays in many different ways. In this section we discuss the trade-offs involved in this choice. We also introduce experimental and theoretical results that help explain differences between spatial and non-spatial techniques.

4.2.2.1 Intended destinations and required input: matching or mismatching domains

The two referential domains involved in a cross-display object movement – that of the intention and that of the interaction technique – can either match each other in type, or not match. For example, a user might want to move an object to one of the multiple vertical displays of a room that is close to her ("I want to move it to that display"), whereas the
only interaction technique available to execute the movement forces her to choose the name of the display from a list (as in Multibrowsing (Johanson et al., 2001)). In this case there is a clear mismatch between the spatial referential domain in which the intention is expressed and the non-spatial way in which the interaction technique requires the user to express the destination display. However, if a spatial interaction technique such as Put-that-there were present in the system, both referential domains would match.

Mismatch between referential domains can also happen in the opposite direction. Consider a meeting-room scenario where the chairperson is distributing documents to people by virtually sliding them to each person's laptop. If the chairperson uses people's names as the reference for where to send the documents, a spatial technique like Flick (Moyle and Cockburn, 2002; Reetz et al., 2006) could cause problems if people are not sitting beside their own laptops.

Figure 5 introduces two extra examples of matching referential domains, this time in the context of a single-user MDE. In Figure 5.A the representation of the destination display in the user's mind is graphical; the user wants to transfer the file to a location that she knows in space. In this case, a spatial technique like the one represented in Figure 5.C matches the domain of the intention. Figure 5.B shows a user who knows the destination display by its name, regardless of where it is located. A matching technique would be a menu like the one in Figure 5.D.
Figure 5. Examples of dimensional overlap. A) Intention formulated in terms of position, B) intention formulated in terms of name, C) spatial interaction technique, D) non-spatial interaction technique.
In order to better predict the effects of the match or mismatch between the user's and the technique's referential domains, we turn to research on Dimensional overlap (Kornblum et al., 1990; Kornblum, 1992). Dimensional overlap (DO) is a characteristic of Stimulus-Response (SR) sets, and refers to the degree of similarity between the stimulus and the response. In a cross-display movement task, the stimulus is the intended
destination display, and the response is the required action with a particular interaction technique. The stimulus set consists of all the possible destinations of an object, and the response set consists of all the possible actions that an interaction technique affords. The DO model states that "given a stimulus and response set, the fastest reaction time obtainable with optimal mapping is faster if the sets have dimensional overlap than if they do not" (Kornblum et al., 1990). The stimulus and response sets have dimensional overlap when they have properties or attributes in common.

Since there is a direct correspondence between the referential domains of intention and interaction technique and the stimulus and response sets in which dimensional overlap is formulated, we can apply this model directly to the scenarios described above. For example, in the meeting-room scenario, the stimulus set (the intention of passing a document to someone by name) and the response set (sliding the document in the direction of a laptop) have little dimensional overlap: names do not have much in common with directional gestures. If the chairperson were distributing the papers based on which person raised their hand, a sliding technique would have higher dimensional overlap, since the raised hand and the directional sliding gesture are both spatial (i.e., both have position, direction, and distance in physical space).

The DO model hypothesizes that when the stimulus and response sets are similar, performance and accuracy are improved because the translation between stimulus and response occurs through an automatic cognitive process. If the SR sets are not similar, people must carry out a time-consuming search or a rule application, in addition to an inhibition of the results of the automatic process. Besides the performance advantage of overlapping SR sets, the DO model predicts that the slope of the reaction-time vs. number-of-alternatives function is reduced when the SR sets overlap (Kornblum et al., 1990). This means that highly overlapping SR sets are more scalable than non-overlapping SR sets; in the context of MDEs it implies that techniques that match the referential domain of the user's intention will work with larger numbers of displays (see the illustration at the end of this section).

Independent of the cognitive processes behind dimensional overlap, there is considerable evidence from the psychology literature that supports its predictions (there is little discussion of DO in HCI, with a few exceptions (Po et al., 2005; Proctor and Vu, 2003)). This evidence also matches empirical results from the HCI literature. For example, Biehl and Bailey (2006) compared two spatial application relocation methods to what they call a 'textual interface'. The task was a naturalistic cooperative task that required the movement of applications between large shared displays and personal tablets in order to compose a collage or a comic strip. They found that relocation times were several times shorter with the spatial interfaces than with the interface that used name references for displays. From the description of the task we deduce that the relocation
intentions were mostly represented in terms of spatial references (e.g., moving the applications from 'there' to 'here').

In a laboratory study, Swindells et al. (2002) evaluated gesturePen, a pointing-based spatial technique that requires a spatial gesture to select a device. They compared the spatial technique to selection from a list of devices (a non-spatial technique), and found that selecting a destination took significantly longer when using the list than when pointing. However, their task description does not state whether the goal destination was indicated to the participant through a spatial or a name reference.
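The scalability prediction mentioned above can be illustrated with the Hick-Hyman law, a standard model of choice reaction time; reading the DO claim through this law is our own illustration, not the original authors' formulation:

\[ RT(N) = a + b \log_2 N \]

where N is the number of alternatives (here, destination displays), a is a base reaction time, and b is the slope. The DO model predicts a smaller b when the referential domains of intention and technique overlap, so reaction times for matching techniques grow more slowly as displays are added to the MDE.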
The effects found in the study are likely to be even more pronounced in realistic tasks, because our study allowed users to focus only on getting the mappings right, without distraction. In a real task, where attention is often divided, users would not be able to learn as efficiently from previous trials.
Figure 6. The setting of the experiment described in (Nacenta et al., 2007-1)
Figure 7. Screen snapshots from the interference experiment. The left pane matches the real world (of Figure 6) for John, and the right pane does not.
There is also lower-level evidence that the location of a stimulus can strongly affect performance, even if the location is not relevant. This is known as the Simon effect, after Simon and Rudell (1967), who discovered a peculiar phenomenon while researching hemispheric dominance for speech. Their subjects had to push one of two telegraph keys with their right or left hands when they heard a command that said “right” or “left” through their earphones. The researchers found that responses differed in speed and accuracy depending on which ear the command was issued to: “right” commands issued to the right ear and “left” commands issued to the left ear were answered faster than their counterparts. Note that the ear in which the command was given is totally irrelevant for the task, yet the directional cue affected how information was processed.
The importance of this phenomenon in real HCI tasks has not yet been evaluated, but it adds to the evidence that spatial references in a spatial context can have performance advantages over non-spatial references. The Simon effect is also important because it is persistent: even after thousands of trials, participants still showed better reaction times with stimuli that came from the matching direction.

The evidence from Simon effect studies and the experiment above is in line with observations that even abstract mental processes are often mapped to visuospatial representations (Tversky, 2005). In other words, it is difficult to escape the fact that we reside in a space, and we often use space to organize our thinking. In fact, in a study in which users were allowed to create their own miniature representations to interact with an environment, all representations corresponded with the spatial arrangement of the different devices, showing a preference for spatial over abstract mappings (Aliakseyeu et al., 2008).

All the evidence compiled in this section advocates the use of spatial tasks and spatial interaction techniques in co-located spaces such as MDEs. However, the evidence is not definitive; there might be situations in which non-spatial referential domains are more useful and efficient (e.g., the meeting-room example from the previous section), and therefore the alternative of using non-spatial techniques should be carefully considered. Further research is needed that explores the general conditions in which non-spatial techniques are superior to spatial ones.

An alternative to having to choose between techniques is to provide several interaction techniques – or a hybrid technique – that cover the possible reference requirements of different tasks at different times. For example, the technique shown in Figure 4 uses a dual referential domain: the destination display is indicated by the relative position of the folder with respect to the destination display, or by the labels of the folders. Ideally, users would choose the right technique for the appropriate task; however, we know very little about whether this is the case. The combination of several alternatives might also confuse users, increase their reaction times (because they would have to choose between techniques), and make the interface more complex. Again, more research is required to answer these questions.

4.2.2.3 Feedthrough and non-spatial reference domains

Most spatial interaction techniques reveal information to others about the action in two different ways: the movements of the actor's embodiment provide clues as to the actions being carried out, and the changes to the object's appearance and location show that it is being moved. The spatial organization makes it possible for embodiments and objects to move gradually and predictably through space, showing others that the movement action is occurring, and allowing them to anticipate the object's eventual destination.
Non-spatial reference domains, in contrast, do not necessarily provide information about the action or, at least, do not provide information that is easy for others to interpret or predict. For example, selecting a destination display from a list, or typing in a command, makes it very difficult for others to know what a user is doing until the object changes location; and even then, it is difficult to know who performed the action if there are more than two people working together. In contrast, if a user has to physically move and touch the destination of the object, most other users will know what is going to happen and who is making it happen even before the interaction is completed.

However, non-spatial interaction techniques can provide explicit feedthrough information through other mechanisms. An instantaneous action can be made more noticeable by lengthening the action and giving it an artificial visual representation that is easier for others to see (Gutwin and Greenberg, 1998). For example, feedthrough of a movement action in a non-spatial MDE could be improved by highlighting the selection action itself (e.g., by temporarily enlarging the object and the list of displays), and by showing gradual fade-out and fade-in effects for the object being moved. These artificial mechanisms can be effective, but require additional learning for users of the system, and additional work for the application developer. In contrast, feedthrough provided by spatial techniques is a fundamental part of the interaction techniques themselves, and is acquired and interpreted by others in a natural way.

4.2.2.4 Power of non-spatial techniques

The choice of referential domain for the interaction technique also has consequences for the power of the technique. Interaction techniques that use a spatial referential domain usually allow users to put, place, manipulate, and get remote objects with a high degree of accuracy. For example, Perspective Cursor (Nacenta et al., 2006), Vacuum Filtering (Bezerianos and Balakrishnan, 2005-1), and Push-and-Throw (Collomb et al., 2005) all allow users to perform all four remote operations with pixel-level accuracy. In contrast, interaction techniques that use non-spatial referential frames often have much lower power. For example, IM file transfer only supports putting an object on a certain remote destination, without allowing getting or placing; similarly, network folders allow users to put and get objects from a remote display, but do not allow the user to place an object at a specific location on the desktop of the remote display.

4.2.3 Summary

This section summarizes the results and discussions regarding referential domains into four topics. Each paragraph describes the evidence on the topic (if any), results that can guide design (if any), and future research needed in that area.

Referential domains of the intention and the technique: match and mismatch. The use of interaction techniques that match the referential domain of the task is supported by research from experimental psychology and some evidence from MDE research.
Well-matched techniques provide improvements in performance, accuracy, and scalability. More research is needed to determine the magnitude of the improvements in different situations, to validate results from other fields in HCI, and to develop novel non-spatial interaction techniques that work best with non-spatial tasks.

Spatial vs. non-spatial referential domains. Research from an unpublished study and from experimental psychology (the Simon effect) suggests that non-spatial techniques, even when they match a non-spatial task, can be negatively affected by the surrounding spatial environment. In contrast, spatial techniques seem to be intrinsically more powerful and provide more understandable feedthrough. Further research is required to quantify the performance differences between the two types of techniques, and to evaluate whether hybrid techniques (techniques that use several referential domains) or the combination of different types of techniques in the same system is beneficial for performance.

Models that organize quantitative differences. Existing evidence on the effect of spatial and non-spatial techniques provides few useful measures of the magnitude of performance differences between techniques or the benefits of matching referential domains. Further research should ideally organize quantitative information about the techniques into performance models that would help designers decide on the correct technique before implementation.

Characterization of the space of HCI tasks in terms of their referential domains. MDEs can be applied to virtually any situation that requires single-user or co-located cooperative computer support. However, we still know little about the kinds of tasks that can be performed in MDEs and how often these rely on spatial or non-spatial referential domains. A taxonomy of tasks that provides guidance on the kinds of referential domains that are used by common tasks would support design decisions and would help direct research to the most relevant areas.

Table 1. Summary of interaction techniques according to their referential domain

Spatial
  Advantages: Fits spatial tasks. Powerful. Natural and rich feedthrough.
  Disadvantages: Feedthrough cannot be eliminated.
  Example techniques: Pick-and-Drop, Put-that-there, Radar View, Flick, Perspective Cursor.

Non-spatial
  Advantages: Fits non-spatial tasks.
  Disadvantages: Interference from spatial circumstances can affect performance. Limited power.
  Example techniques: Multibrowsing, Bluetooth file transfer, Mighty Mouse (explicit switching), Multi-Monitor Mouse (button or keyboard switch), e-mail, Instant Messaging, network folders.
4.3 Display configuration
In spatial terms, an MDE is defined by its display configuration. The display configuration depends on both the physical
arrangement of the MDE (the positions and physical properties of its displays), and the input model of its interaction techniques (how spatial input commands are transformed into object movement within and between displays). The relationship between the physical arrangement and the input model is called the mapping. A given MDE with a particular physical arrangement may work with many different input models, depending on how its interaction techniques are defined, and therefore there is a choice among multiple possible mappings.

Consider a hypothetical interaction technique for the MDE portrayed in Figure 8. The technique's implementation establishes that the cursor moves to display 2 if it is brought to the right edge of display 1, and to display 3 if it reaches the left edge of display 1 (and vice-versa). The input model thus corresponds to the scheme shown at the bottom of Figure 8. There are many possible input models, because they can be programmed into the logic of the interaction technique in many different ways; however, not all of these models are appropriate for a given physical arrangement. Indeed, many of the characteristics of a technique depend to a great extent on the mapping between the physical arrangement of the MDE and the input model implemented by the technique.
Figure 8. A multi-display setting (top) and a possible input model (bottom)
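To make the idea of a programmable input model concrete, the following minimal sketch (ours, not from any cited system; the data structure and function names are illustrative assumptions) encodes the Figure 8 model as a transition table that wires display edges together, independently of where the displays physically stand:

# Edge transitions of the hypothetical Figure 8 model: display 1's right
# edge leads to display 2 and its left edge to display 3, and vice-versa.
transitions = {
    (1, "right"): 2, (2, "left"): 1,
    (1, "left"): 3, (3, "right"): 1,
}

def move_cursor(display, x, width, dx):
    """Advance the cursor horizontally; warp across displays at wired edges.
    Assumes, for simplicity, that all displays share the same pixel width."""
    x += dx
    if x >= width and (display, "right") in transitions:
        return transitions[(display, "right")], x - width
    if x < 0 and (display, "left") in transitions:
        return transitions[(display, "left")], x + width
    return display, max(0, min(x, width - 1))

Because the table is arbitrary, nothing prevents it from contradicting the physical arrangement, which is exactly the mismatch the mapping discussion below is concerned with.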
In this section we start by classifying existing techniques into three groups according to their input models: planar, perspective, and literal. We then analyze the different possibilities of display configuration using research into Stimulus-Response Compatibility (SRC) and dimensional overlap.

4.3.1 Input model types

4.3.1.1 Planar
An input model is planar when the cursor travels across displays as if they were all arranged in a single two-dimensional plane. The most common example of a planar input configuration is the multi-monitor model of current operating systems (see Figure 9), or extensions of it such as Mouse Ether (Baudisch et al., 2004). Other versions of planar configurations include those underlying Flick (Reetz et al., 2006), Throw (Hascoët, 2003), Drag-and-Pop and Push-and-Pop (Baudisch et al., 2003), SwordFish (Ha et al., 2006-1), and Hyperdrag (Rekimoto and Saitoh, 1999). Planar models are often used for cross-display object movement in multi-display rooms such as interactive workspaces (Johanson et al., 2002; Tani et al., 1994). Most world-in-miniature techniques also use flat representations of the display arrangement even if the physical arrangement of the displays is not planar (e.g., ARIS's relocation maps (Biehl and Bailey, 2004), and the 'radar' implementations presented in Nacenta et al. (2005) and Wigdor et al. (2006)). More sophisticated versions use images from fixed cameras, or synthetic images of the environment generated from a general point of view, as planar representations of the physical arrangement (Chiu et al., 2003; Massó et al., 2006).
Figure 9. Two of the input model configuration utilities of current operating systems
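As a rough illustration of the planar idea (our own sketch, not the implementation of any of the systems above; all names and coordinates are assumed for the example), each display can be modeled as a rectangle in one shared plane, with the cursor's display resolved by a simple containment test:

from dataclasses import dataclass

@dataclass
class Display:
    name: str
    x: int   # left edge of the display in the shared plane
    y: int   # top edge
    w: int   # width in plane units
    h: int   # height

    def contains(self, px, py):
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h

# A hypothetical two-monitor plane; the right monitor is mounted lower.
plane = [Display("left monitor", 0, 0, 1920, 1080),
         Display("right monitor", 1920, 120, 1280, 1024)]

def locate(px, py):
    """Return the display under the planar point, or None if the point
    falls in displayless space (a gap or bezel)."""
    for d in plane:
        if d.contains(px, py):
            return d
    return None

Stock operating systems never let the cursor rest in the None region (they warp across it), whereas Mouse Ether, discussed later, keeps motor space there.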
There is an interesting group of techniques that rely on planar models that are subject to the users' interpretation of the space. For example, the arranged wormholes mentioned in section 4.2.1.2 can be modified by the user to represent space, but this representation is also subject to other considerations that are not specifically spatial: for example, a user might decide to eliminate from the input model the devices or displays that are not relevant for the task at hand. Techniques that use this kind of conceptual planar model are actually midway between the spatial planar techniques described above and the non-spatial techniques of section 4.2.1.2; they are partially spatial, but introduce other types of referential space.

4.3.1.2 Perspective techniques
Perspective techniques provide a mapping between the input model and the output that is based on the position or point of view of the user, instead of on some arbitrary external representation of the space (as in the planar techniques described above). The goal is to provide an input model that corresponds more closely with the user's perception of the environment. To achieve this, the system must be able to gather information about how the environment is perceived by the user. For example, in Perspective Cursor (Nacenta et al., 2006), information about the user's head position is used to provide seamless transitions between displays (see Figure 10). Head orientation is also used to determine which display to activate for input in the multi-monitor with head tracking technique (Benko and Feiner, 2005), the multi-monitor technique presented in (Ashdown et al., 2005), and Look-to-Talk (Oh et al., 2002). It is also conceivable to use eye-tracking technology instead of head tracking, as done by MAGIC (Zhai et al., 1999) and Attentive User Interfaces (Shell et al., 2003; Dickie et al., 2006). Augmented Reality techniques have also experimented with the idea of using tracked head-mounted displays to facilitate or enhance interaction with multiple displays, as in the InVision (Slay et al., 2003, 2004) and EMMIE (Butz et al., 1999) systems. These techniques require head-mounted displays that are expensive, heavy, and awkward to wear in everyday situations.
Figure 10. A perspective cursor moves from a monitor into a tabletop display (seen from the point of view of the user)
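The geometric core of such techniques can be sketched as a ray cast. The code below is our own simplified construction, not Perspective Cursor's actual implementation; the display representation and function names are assumptions. The cursor lands where a ray from the tracked eye (or, for the pointing techniques discussed next, from the pointing device) first intersects a display plane:

import numpy as np

def perspective_hit(eye, direction, displays):
    """eye: 3D position (np.array); direction: unit pointing vector;
    displays: list of (origin, normal, on_display), where origin and normal
    are np.arrays defining the display's plane and on_display tests whether
    a 3D point falls within the display's bounds."""
    best = None
    for origin, normal, on_display in displays:
        denom = np.dot(direction, normal)
        if abs(denom) < 1e-9:
            continue                      # ray parallel to the display plane
        t = np.dot(origin - eye, normal) / denom
        if t <= 0:
            continue                      # display is behind the viewer
        hit = eye + t * direction
        if on_display(hit) and (best is None or t < best[0]):
            best = (t, hit)               # keep the nearest display hit
    return None if best is None else best[1]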
Finally, laser or finger pointing (Myers et al., 2001; Oh and Stuerzlinger, 2002; Voida et al., 2005; Bolt, 1980; Parker et al., 2005) is also an example of a perspective technique, although in this case the perspective is calculated from the pointing device rather than from the user's head. Laser pointer input systems are usually cheap to implement, but can have serious accuracy problems and can become tiring after extended use (Nacenta et al., 2006; Myers et al., 2002).

4.3.1.3 Literal techniques
Literal techniques are cross-display interaction techniques that rely on physical context (or contact) to provide connections between displays; unlike planar or perspective techniques, literal techniques do not require the system to model the input in any way, because the physical configuration coincides with the model. For example, in Smart-Its Friends (Holmquist et al., 2001), elements that are shaken together establish one-to-one connections. Similarly, with Stitching (Hinckley et al., 2004), a pen gesture that starts in one display and ends in another determines how the two displays form a single interactive surface. In Pick-and-Drop (Rekimoto, 1997), the contact position of the pen on the display is used to 'catch' the object directly underneath, which can then be released anywhere else. Variations of this technique use specialized input devices to detect the proximity of other devices (e.g., BlueTable (Wilson and Sarin, 2007), Tangible Bits (Ullmer et al., 1998), u-Textures (Kohtake et al., 2005), ConnecTables (Tandler et al., 2001)), but they all have in common that the input model is made to coincide literally with the physical arrangement (i.e., interaction is always 'absolute').

4.3.2 Design considerations for display configuration

The configuration of spatial techniques forms the core of an interaction technique, and is fundamental to understanding the differences between planar, perspective, and literal techniques. The following subsections discuss performance, power, and feedthrough differences based on the existing evidence.

4.3.2.1 Mapping between input and output

It is a well-known HCI design principle that the input of a system should be designed to match what is to be controlled; Norman calls this "getting the mapping right" or designing "natural mappings" (Norman, 2002), Britton et al. (1978) call it "kinesthetic correspondence", and Jacob and colleagues apply a similar philosophy to the design of general interaction techniques (Jacob et al., 1994). In cross-display object movement techniques, this principle translates into mapping the input so that it matches the arrangement of displays in the physical space. If the MDE is very simple and regular in its arrangement (e.g., a large composite wall display), planar techniques are easy to match to the space. However, when the MDE's physical configuration is more irregular, the match becomes more difficult. Consider, for example, a planar approach for the MDE of Figure 11. If a simple planar mapping is used (Figure 11b), the input becomes inconsistent with the output for trajectories that cross the gap between displays.
Figure 11. Two-display MDE. a) Physical arrangement, b) input model with two mouse trajectories, c) the continuous input mouse trajectories become confusing in the physical environment.
The static nature of planar mappings also causes problems for collaborative scenarios or situations in which users need to move around. Even if displays are arranged in parallel planes, the correct mapping will depend on the point of view of the user (see Figure 12); we call this the perspective problem.
Figure 12. The perspective problem: the "correct" planar model depends on the position of the user
The perspective problem is solved by perspective techniques because they do not rely on an external input model; instead, they adapt to the user's point of view. Therefore, perspective techniques provide more natural mappings when the MDE is complex or has an irregular physical arrangement. Literal techniques are not subject to any problems of mismatch because there is no possible divergence between the input model and the physical arrangement: they are made to coincide.

Evidence to support the superiority of better mappings for cross-display movement can be found in the predictions of the DO model (Kornblum et al., 1990; Kornblum, 1992).
Dimensional overlap was used in the previous level of the taxonomy to explain why a mismatch between referential domains (e.g., a spatial task and a non-spatial technique) results in poorer performance. It can also be used here to explain performance differences between different kinds of mappings. For spatial-spatial mappings, we define dimensional overlap as the geometrical similarity between the input model and the user's perception of the system. We hypothesize that the performance differences found in techniques that use different mappings are explained in part by different levels of correspondence between the input model and the physical arrangement of the MDE.

For example, imagine an MDE like the one in Figure 11a with a planar input model like the one in Figure 11b. As explained above, if the cursor is controlled with a mouse, there will be misalignments between the movements of the input and the resulting movements of the cursor (Figure 11c). The difference in dimensions between the physical arrangement (Euclidean 3D space) and the input model (a planar 2D space) is an example of poor dimensional overlap that has implications for performance. This problem is common to most planar mappings when they are applied to MDEs with irregular physical arrangements. The better matches provided by perspective and literal techniques will, according to the model, also provide benefits in performance. Perspective techniques are designed to increase dimensional overlap; literal techniques provide perfect overlap because the control model and the physical arrangement are superimposed and have the same geometry.

There is also empirical evidence that links greater overlap to higher performance. For example, in Perspective Cursor's initial evaluation (Nacenta et al., 2006), we compared two perspective techniques (Perspective Cursor and Laser Pointing) and a planar technique (the regular multi-display stitching of current operating systems) in a complex environment similar to the one depicted in Figure 1c. We found that the perspective techniques performed better and were preferred to the planar technique. Perspective Cursor was, in general, superior to Laser Pointing. This difference could be attributed to Perspective Cursor's better dimensional overlap (it uses eye position, avoiding the parallax of using pointer position, and therefore provides a better match between perceptual and physical spaces) and to the intrinsic stability problems of laser pointers (Myers et al., 2001, 2002; Oh and Stuerzlinger, 2004).

We also found support for the dimensional overlap hypothesis in a previous study that compared six cross-display object transfer techniques with different degrees of dimensional overlap (Nacenta et al., 2005). The technique that was based on a literal model (Pick-and-Drop) achieved very good performance compared to techniques based on a planar model (Pantograph and Slingshot), and much better performance than a technique based on a pressure mapping. The best technique, however, was a planar world-in-miniature technique (the Radar View). This might seem to contradict what is predicted by dimensional overlap; however, further analysis of the task showed that the user did not need to look at the real environment with this technique (the transfer destination was indicated in the radar), effectively converting it into a literal technique that required smaller gestures than the 'real' literal technique.
This difference in the size of the gestures also helps explain why Radar View was faster than Pick-and-Drop. In general, we believe that techniques like the Radar View can perform better than any other kind of technique only if the task does not require changing the focus of attention from the real space to its planar representation (for a detailed discussion of the radar, see Nacenta et al. (2007-2)).

The evidence presented above suggests that literal techniques perform best, as long as they do not require reaching into areas outside the user's local space. Perspective techniques can maintain reasonable performance without the need for intermediate representations. Finally, planar techniques probably represent the least efficient option among spatial techniques. In terms of cost, however, the order is inverted, because planar techniques can be fully implemented with cheap, readily available input devices (e.g., mouse, trackball), whereas perspective and literal techniques often require expensive tracking technologies or input devices that are compatible across all possible displays.

4.3.2.2 Compatibility of interaction techniques

Dimensional overlap between the input model and the physical arrangement is not the only characteristic of the mapping that can affect a technique's performance; within the dimensional constraints of a technique and the MDE, techniques with compatible mappings perform differently than techniques with incompatible mappings. Compatible mappings are those that correspond with the stereotype (i.e., the relationships expected from our real-world experience). The hypothetical interaction technique shown in Figure 8 represents an example of an incompatible mapping (i.e., a mapping that does not correspond to the stereotype). Notice that the input model is just as planar as the physical arrangement of the displays, and therefore there is high dimensional overlap; however, the connections between the displays in the input model are inverted, resulting in an incompatible mapping.

The concept of compatibility comes from Stimulus-Response Compatibility (SRC), a consistent and repeatable phenomenon widely studied in experimental psychology (see, for example, Simon (1990); Fitts and Deininger (1954); Fitts and Seeger (1953); Chua et al. (2003); Proctor and Reeve (1990); Proctor and Vu (2003)). Dozens of empirical studies show performance and accuracy advantages of "direct" or "stereotypical" mappings when reacting spatially to spatial stimuli.

Designers must, therefore, be careful to design spatial interactions that are compatible with the stimuli (the task). Although this recommendation might seem somewhat obvious, it has important consequences, especially in the design of mobile ad-hoc MDEs such as the one pictured in Figure 1e. For example, providing compatible spatial gestures might require some kind of location tracking or configuration in the mobile devices.
This would improve performance and reduce errors, but would probably increase the cost of the system significantly.

4.3.2.3 Power and literal input models

Literal input models force users to interact directly with the digital objects in their displays. This absolute interaction style has many advantages, but also an obvious drawback: easy access to the objects is limited to the physical reach of the user. In other words, literal input models have restricted power. This restriction might not be important for MDEs that are very small and within reach of the users (e.g., tabletops with personal devices), but it severely constrains the use of systems that are large or tall, or where certain parts of the space are not accessible. Some studies have shown that users would rather act remotely on distant objects, even at the cost of some performance, than exert themselves physically to manipulate objects (Nacenta et al., 2007-2; Voida et al., 2005).

4.3.2.4 Feedthrough

Literal techniques provide obvious information to other users because they force the actor to physically touch the origin and destination of the object movement. In contrast, planar and perspective techniques can act remotely, making this information less apparent and more difficult to interpret. Feedthrough from perspective techniques can be particularly difficult to interpret because perspective is intrinsically personal: users need to put themselves in the place of other users (physically or figuratively) to see the world from their perspective, which may be difficult to do. Planar and perspective techniques can be extended with digital embodiments and other visualizations in order to provide sufficient feedthrough. For a study that explores these questions – although in the context of tabletop displays – see (Nacenta et al., 2007-2).

4.3.3 Summary

Performance of planar, perspective and literal mappings. Some research in multi-display interaction fits the predictions of the DO model, suggesting that literal techniques are better than planar techniques and that, in complex MDEs, perspective techniques are also superior to planar techniques. Although the principles of dimensional overlap can be used to predict the performance of a technique and to design new mappings, more research is needed to validate the full DO model for a range of MDE configuration types.

Compatible and incompatible input models. Research on Stimulus-Response Compatibility shows that compatible mappings reduce the number of errors and increase performance. The application of SRC to the design of cross-display object movement seems straightforward, but has not been validated through empirical research.
Further research is also needed to quantify the differences in performance and accuracy between compatible and incompatible mappings. A quantitative model would help designers decide whether the extra costs associated with compatibility are worth the interaction advantages.

Reduced power of literal techniques. Although literal techniques are likely the best in performance, they are intrinsically limited by the reach of human limbs, and performance drops quickly when objects have to be moved beyond areas near the user's body. There is also some evidence that users prefer to act locally unless they have to exert themselves away from their positions; designers must be aware of this and provide techniques with more power when required.

Feedthrough in planar, perspective and literal mappings. Literal techniques intrinsically provide richer feedthrough than planar or perspective techniques, and feedthrough from perspective techniques is particularly difficult to understand. Designers must match the feedthrough and privacy needs of their systems to the particular kinds of techniques. Further research is needed on the quality of feedthrough from different mappings, and on visualizations and virtual embodiments for planar and perspective techniques that provide the right information to others without the power limitations of literal techniques.

Table 2. Summary of display configuration categories

Planar
  Advantages: Easy and cheap to implement. High power.
  Disadvantages: Poor representation of complex MDEs. Suffers from the perspective problem. Poor feedthrough (unless artificially provided).
  Example techniques: Mouse Ether, Drag-and-Pop, Push-and-Pop, Flick, Throw, Pantograph, Shuffle, ARIS's application relocation, PointRight, HyperDrag, Sketch Radar, arranged wormholes.

Perspective
  Advantages: High performance due to high dimensional overlap. High power.
  Disadvantages: Pointing perspective techniques suffer from user fatigue and low accuracy. Feedthrough is difficult to interpret. Expensive to implement.
  Example techniques: Perspective Cursor, Multi-Monitor Mouse with head tracking, Laser Pointers.

Literal
  Advantages: Excellent performance within hand's reach. Rich and natural feedthrough.
  Disadvantages: Limited remote power.
  Example techniques: Pick-and-Drop, Tangible Bits, passage, shaking (Smart-Its Friends), BlueTable, Synchronized Gestures, u-Textures, Stitching.
4.4 Control Paradigm
In previous sections we have analyzed the effects and trade-offs related to the intention and response selection processes of our cognitive model (Figure 2). In this section we analyze the issues related to the execution of actions and the feedback loops that control them.
This section is based on control theory as applied to human-computer interaction systems. In particular, we are interested in the existence or absence of a closed control loop between the user's planning and execution processes and the perception of changes in the environment.

4.4.1 Control Types

There are three control possibilities for cross-display interaction techniques: open loop, closed loop, and intermittent open/closed control.

4.4.1.1 Closed loop

A technique is closed-loop when there is some mechanism that allows the user to adjust the execution of the action before it is finished. This adjustment depends on feedback: for example, a pointing task in which the user can see the cursor as it moves towards the target is closed-loop, because the image of the cursor provides continuous feedback about position. Closed-loop pointing techniques have been widely studied in HCI; in particular, targeting tasks are systematically found to follow Fitts' law (MacKenzie, 1991). We are interested in the feedback loop as it applies to cross-display object movement; however, it is often difficult to separate single-display pointing mechanisms from cross-display object movement, because techniques that allow remote placing (see section 3.1.2) also have a component of single-display-space pointing.

Techniques that have literal display configurations are always closed-loop because of the way we move objects in the real world: when we move an object we usually hold it all the way until it is at its destination. During the movement, we get feedback by looking at the object (or our hand) and by feeling its position through the senses of touch and proprioception. Some techniques with planar and perspective input models are also closed-loop. For example, world-in-miniature techniques allow the transfer of objects from one screen to another in a continuous mode that resembles single-display operation. The regular multi-display cursor present in current operating systems is also closed-loop because, although the cursor jumps in space from one display to another, there is feedback of its position at all times. Laser pointer techniques that are implemented using an actual laser are also closed-loop, since the laser spot is always visible even if it is not projected on an active surface.
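For reference, the following small helper (ours, not from the paper) computes the movement time predicted by Fitts' law in the Shannon formulation popularized by MacKenzie (1991); the constants a and b are illustrative placeholders that would normally be fit empirically per device and technique:

import math

def fitts_mt(distance, width, a=0.1, b=0.15):
    """Predicted movement time in seconds for a target of the given width
    at the given distance: MT = a + b * log2(D/W + 1)."""
    return a + b * math.log2(distance / width + 1)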
4.4.1.2 Open loop

A technique is open-loop when it lacks a feedback channel, or when the user cannot correct her actions before the object is in its final position (i.e., when the control loop is broken). In the physical world this is equivalent to throwing an object: once it has left our hand it is impossible to change its trajectory. Examples of open-loop techniques include Flick (Wu and Balakrishnan, 2003) (although not SuperFlick (Reetz et al., 2006), which closes the loop for the last part of the interaction) and the button, key, and head-tracked versions of the Multi-Monitor Mouse (Benko and Feiner, 2005).

4.4.1.3 Intermittent open/closed

One of the important characteristics that differentiate MDEs from single-display systems is discontinuity in display space. In most MDEs there are gaps, bezels, or non-visible parts of displays that make it impossible to give continuous feedback. This blank space between displays is accounted for in the input models of certain techniques. For example, Mouse Ether (Baudisch et al., 2004) forces the mouse to travel across the blank space between two planar monitors. Perspective Cursor (Nacenta et al., 2006) also travels through blank space, but in an angular rather than a linear fashion. These techniques belong to the intermittent group because the existence of feedback depends on where the object or cursor is: when the cursor is in displayable space, the process is closed-loop; when it is in blank space, the process is open-loop.

4.4.2 Design considerations for control

Control is the lowest-level part of the taxonomy, and choices at this level have repercussions for performance and for the user's perception of the interaction.

4.4.2.1 Control in performance

Open-loop control is usually faster than closed-loop control because it uses a single gesture that does not require further confirmation or adjustment. However, it does not allow correction and so can be very inaccurate, which makes it unacceptable in some situations. For example, we found that Flick is fast, but suffers from serious inaccuracy problems. When Flick is enhanced with a closed-loop control stage (i.e., what we called SuperFlick (Reetz et al., 2006)), accuracy increases dramatically at the cost of extra time, making the technique similar in performance to other closed-loop approaches.

The existence of an accuracy-speed trade-off between open-loop and closed-loop control does not, however, preclude the development of techniques that improve the overall accuracy of open-loop techniques or the overall speed of closed-loop techniques. Recent research has been successful in creating new techniques that outperform traditional pointing methods in single-display and large-display interfaces, for example by making closed-loop tasks slightly more open-loop and vice-versa (e.g., Delphian Desktop (Asano et al., 2005)).
We believe that many of these optimizations could be applied to MDE cross-display techniques.

4.4.2.2 Continuity of feedback

Multi-display environments are by definition fractured into several display surfaces that are separated from each other by displayless space. Cross-display movement techniques implemented in current operating systems simply ignore displayless space, causing a sudden warp of the object from one display to the other. This warp could reduce performance for two reasons: the visual feedback becomes discontinuous, forcing the user to visually reacquire the object or cursor; and the input becomes inconsistent with the output, which makes the motor planning of the task more difficult.

These problems led Baudisch et al. (2004) to design Mouse Ether, a technique that accounts for the space between monitors in the input space. In exchange for making the input consistent with the physical space, Mouse Ether becomes an intermittent technique, because there is no feedback of the cursor's position in between displays. The technique's initial evaluation compared it with the standard multi-display cursor of Windows™ in an MDE with two monitors of different resolutions separated by a short gap. The results showed a performance advantage for Mouse Ether; however, this study does not reveal whether the advantage holds for larger gaps, or whether the improvement in performance is due to the added motor space or to the resolution adaptation between the two monitors.

We designed a follow-up study to find out how best to deal with displayless space (Nacenta et al., in press). The study compared the performance of three cross-display cursor movement techniques: the standard cursor movement (which we call Stitching2), Mouse Ether, and a version of Mouse Ether that included Halo, a form of off-screen feedback (Baudisch and Rosenholtz, 2003). The Halo condition was included because it could compensate for Mouse Ether's possible drawbacks (Halo provides cursor location feedback when the cursor is in displayless space; see Figure 13). The setup of the experiment was similar to that of Mouse Ether's original study, but there was no resolution difference between the two screens.
2. Not to be confused with the pen-based technique devised by Hinckley et al. (2004).
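Mouse Ether's mechanics can be sketched as follows (our simplification, not Baudisch et al.'s code; the widths are assumed values): mouse motion drives the cursor through a one-dimensional motor space that includes the gap, but feedback exists only while the cursor is inside a display, which is what makes the technique intermittent.

LEFT_W, GAP_W, RIGHT_W = 1920, 300, 1920   # motor-space widths (assumed)

def ether_step(x, dx):
    """Advance the cursor in motor space and report where feedback exists."""
    x = max(0, min(x + dx, LEFT_W + GAP_W + RIGHT_W - 1))
    if x < LEFT_W:
        return x, ("left display", x)                    # cursor visible
    if x < LEFT_W + GAP_W:
        return x, None                                   # cursor in the ether
    return x, ("right display", x - LEFT_W - GAP_W)      # cursor visible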
Figure 13. Two examples of halos: a) the object is far to the left of the screen; b) the object is close to the right of the screen.
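The essence of Halo can be sketched in a few lines (our approximation of the published idea; the parameter names are ours): the halo is part of a circle centred on the off-screen object, with a radius just large enough to intrude a fixed distance into the visible screen, so the arc's curvature conveys how far away the object is.

def halo_arc(obj_x, obj_y, edge_x, intrusion=40):
    """Return (centre, radius) for a halo around an object that lies beyond
    a vertical screen edge; the circle intrudes `intrusion` px past the edge."""
    distance = abs(obj_x - edge_x)         # how far off-screen the object is
    return (obj_x, obj_y), distance + intrusion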
The results from the study indicate that warping between displays has a performance cost that increases with the distance between displays (with Stitching, targeting tasks took longer to complete with larger physical gaps); however, Mouse Ether was still slower than Stitching for all gap distances except the smallest (with the two screens right next to each other), at which the two techniques were equivalent. Mouse Ether with Halo was better than plain Mouse Ether, but still clearly inferior to Stitching. These results seem to contradict those of the original Mouse Ether study; however, it is possible that the resolution differences used in the original study caused the earlier Mouse Ether advantage.

The evidence from our study suggests that warping is preferable to providing matching motor space, even when off-screen feedback is provided; in other words, discontinuity of the feedback loop is more harmful to performance than discontinuity of the visual representation or a mismatch between the input and feedback spaces. These results cannot, however, be generalized yet to more complex environments. The evaluation of Perspective Cursor (Nacenta et al., 2006) showed an advantage of Perspective Cursor – which treats displayless space just as Mouse Ether does – when compared to Stitching. This suggests that the ether could be more useful for non-planar display arrangements; however, it is possible that this advantage was due to the perspective mapping of the input and not to the extra input space. Further research is needed to clarify this issue.

Designers can avoid displayless space altogether by using an absolute representation of the environment (e.g., a world-in-miniature technique). Worlds-in-miniature are represented completely in displayable space, and are therefore closed-loop techniques. Using these representations, however, introduces a mapping between the miniature and the world that can be costly depending on the fidelity of the miniature, and can also reduce the amount of information available to other people (Nacenta et al., 2007-2).

4.4.2.3 Feedthrough and control
Closed-loop techniques rely on continuous feedback to establish the control loop, and this information is usually also available as feedthrough for others. In contrast, open-loop techniques do not generally provide continuous feedback, unless the technique implements special animations to make the process visible to other users (as in several implementations of Flick, e.g., Reetz et al. (2006)). Feedthrough of intermittent techniques is affected by the gaps between displays just as feedback is, but with the added problem that other users have less information about what is happening in the motor space, which makes it more difficult to predict where the cursor will appear. This lack of feedback and feedthrough can be mitigated through the use of off-screen feedback techniques, but designers must be aware that off-screen feedback can easily crowd displays or distract other users.

4.4.3 Summary

Speed vs. accuracy trade-off. Some evidence from the study of interaction techniques suggests that there is a trade-off between the time to complete an object movement and the accuracy with which the object can be placed. Designers should consider open-loop techniques if errors in placing are unimportant or cheap to correct, or if there are few possible destinations for the object (e.g., when the technique only needs remote putting power). The domain of open-loop techniques has received little attention; further research on open-loop techniques and on the adaptation of existing single-display techniques might generate useful alternatives to closed-loop interaction.

Approaches to displayless space. A study on the different ways of dealing with displayless space supports the use of techniques that warp the cursor over techniques that provide extra motor space, at least in MDEs with simple co-planar physical arrangements. Off-screen feedback was shown to improve extra-motor-space techniques, but not enough to outperform warping. Further research is needed to generalize these results to more complex MDEs and to investigate more advanced off-screen feedback techniques that might improve performance.

World-in-miniature techniques and control. Techniques that use small representations of the display environment circumvent the problem of feedback discontinuity, and allow all actions to be closed-loop. However, these techniques force users to map between 'real' objects and miniature representations; more research is needed to compare the costs of this mapping with the costs of warping or extra motor space.

Feedthrough of open-loop, closed-loop and intermittent techniques. Most closed-loop techniques reveal more information to others because they innately involve feedback, which also provides feedthrough. Feedthrough can be improved in open-loop and intermittent techniques through the use of special visualizations.

Table 3. Summary of control paradigm categories

Closed-loop
  Advantages: High accuracy. High power. Natural feedthrough.
  Disadvantages: Added identification process (in world-in-miniature techniques).
  Example techniques: ARIS's application relocation, Push-and-Pop, Smart-Its Friends, Tractor Beam.

Open-loop
  Advantages: Very high speed.
  Disadvantages: Low accuracy. Reduced power.
  Example techniques: Multi-Monitor Mouse, Flick, E-mail, Gesture Pen.

Intermittent
  Advantages: High power.
  Disadvantages: Discontinuity of feedback reduces performance. Object might get lost in displayless space.
  Example techniques: Mouse Ether, Perspective Cursor.
4.5 Summary of techniques

Figure 14 shows a summary of most of the techniques discussed in the sections above, placed according to where they fit in the divisions of the taxonomy (note that in some cases, not all techniques are listed). Further information and references for the techniques can be found in Appendix A below.
Cross-Display Movement Techniques

Non-Spatial: E-mail as cross-device IT, Instant Messages as cross-device IT, Mighty Mouse Explicit Switching, M3 with keyboard or mouse switch, Multibrowsing, Network folders, Synchronized clipboard, SharedNotes.

Spatial – Planar:
  Open-Loop: Flick, Flying click (flick), M3 with mouse location switch, ModSlideShow / Discrete Modular Model, Rooms/Telepointers, Spatial file transfer, Throw, WipeIt, Peephole.
  Intermittent: Mouse Ether.
  Closed-Loop: Drag-and-throw / Slingshot, Hop, HybridPointing, HyperDrag, Interaction points (Dynamo), Lightweight personal bindings, Mighty Mouse Implicit Switching, OS/stitching, PointRight, Push-and-throw / Pantograph, SpaceGlider, Superflick, Swordfish, Vacuum, World In Miniature*, ARIS*, Arranged wormholes*, BubbleRadar*, Drag-and-pick*, Drag-and-Pop*, Frisbee*, ModSlideShow / Panoramic View*, Push-and-pop*, Radar View*, SketchRadar*.

Spatial – Perspective:
  Open-Loop: GesturePen, Head Tracking and Mouse Input for a GUI on Multiple Monitors, InfoBinder, Look-to-Talk, M3 with head tracking, Put-that-there.
  Intermittent: Perspective Cursor.
  Closed-Loop: EMMIE, Head Tracking and Mouse Input for a GUI on Multiple Monitors, Laser pointers, Semantic snarfing, TractorBeam.

Spatial – Literal:
  Closed-Loop: BlueTable, Bump, HyperPalette, Passage/Bridge, Pick-and-drop, Proximal Interactions, Sensetable, (Shaking) Smart-Its Friends, Stitching, Synchronized gestures, SyncTap, Tangible Bits / Media blocks, u-Textures, TractorBeam.
Figure 14. Summary of techniques classified according to our framework3

3. For the purpose of this classification, world-in-miniature techniques (those marked with "*") are considered closed-loop because they afford absolute control of the objects in the miniature. However, these techniques only provide feedback for the full-size objects in the environment when the object is in display space. For other users, or depending on the requirements of the task, these techniques should be considered intermittent.

5. DISCUSSION

In this section we consider several issues that arise from the cross-display movement taxonomy. We first summarize the most important MDE design concepts that we identified in the previous sections; then we discuss the ways that the taxonomy can be used by designers and researchers. We then reflect on possible limitations of this work,
and ways that these limitations can be addressed in further work. Finally, we present a research agenda that is derived from our experience of developing and working through the taxonomy.
5.1 Design concepts from the taxonomy
Few prior research projects have dealt with design concepts specific to the constraints and realities of MDEs – that is, with issues like the heterogeneity of displays or the empty space between displays (although there are some, such as Mouse Ether (Baudisch et al., 2004)). Our taxonomy identifies several such concepts, both through the ideas it uses to organize the techniques and through the issues that these organizational principles raise.

Dealing with action planning processes and intention in interaction techniques. The inclusion of the referential domain level in the taxonomy stresses the importance of the mapping between the reference domain of the interaction technique and that of the user's intention. Interaction techniques are often considered only in terms of execution issues (e.g., targeting performance), but the constraints of cross-display movement require that intention and action planning be considered as well. The idea that users must first reference the destination display adds necessary scope to the way that we think about interaction techniques in MDEs.

Separation into spatial and non-spatial techniques. The separation at the top level of the taxonomy is a fundamental division for MDE techniques, one that has not previously been explored in detail. This distinction raises a number of issues that are important for design – such as the relative advantage of spatial referencing, and the difficulty of ignoring the real world when using non-spatial movement techniques, which is particularly important in more complex MDEs because they are so strongly situated in physical space.

The importance of display arrangement in MDEs. The second and third levels of the taxonomy suggest that there will be major differences in the ways that users interact with an MDE, depending on its display configuration, the mapping between input and output, and the presence of a feedback loop. In particular, the problem of how to deal with the gaps between displays should now be recognized as a major issue in MDE design (see the sketch at the end of this section).

Dimensional overlap and Stimulus-Response Compatibility. These concepts from cognitive and experimental psychology provide explanatory power for understanding interaction techniques that map a logical organization of input space to a physical display space. Compatibility and dimensional overlap have not previously been used to explain the performance of interaction techniques.
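To make the display-gap issue concrete, the following minimal sketch shows a planar input configuration for two side-by-side monitors separated by a physical bezel gap. The monitor geometry, units, and names (Display, LEFT, RIGHT, to_pixels) are illustrative assumptions of ours, loosely in the spirit of Mouse Ether; this is not an implementation of any specific published technique.

    # Illustrative sketch of a planar input configuration with a display gap.
    # Geometry and gap-handling policy are assumptions for illustration only.
    from dataclasses import dataclass

    @dataclass
    class Display:
        x_mm: float      # left edge in motor-space millimetres
        width_mm: float  # physical width
        width_px: int    # horizontal resolution

    # Two monitors separated by a 40 mm displayless gap.
    LEFT = Display(x_mm=0.0, width_mm=400.0, width_px=1600)
    RIGHT = Display(x_mm=440.0, width_mm=400.0, width_px=1600)

    def to_pixels(x_mm):
        """Map a motor-space position to (display name, pixel column).

        Positions inside the gap return None: with a Mouse Ether-style
        mapping the cursor traverses displayless space invisibly, whereas
        a conventional stitched mapping would remove the gap from motor
        space altogether."""
        for name, d in (("left", LEFT), ("right", RIGHT)):
            if d.x_mm <= x_mm < d.x_mm + d.width_mm:
                px = int((x_mm - d.x_mm) / d.width_mm * d.width_px)
                return (name, px)
        return None  # cursor is currently in the displayless gap

    print(to_pixels(399.0))  # ('left', 1596)
    print(to_pixels(420.0))  # None: crossing the 40 mm gap
    print(to_pixels(441.0))  # ('right', 4)

Whether the gap exists in motor space at all is exactly the design decision that separates gap-aware mappings such as Mouse Ether from conventional stitched desktops.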
5.2 Using the taxonomy
The taxonomy can be used both by designers who are interested in constructing MDEs and applications for MDEs, and by researchers who work in the areas of interaction techniques and multi-display environments.

To begin with, designers will be able to use the taxonomy as a catalog of existing cross-display techniques; Appendix A contains an alphabetical index of existing techniques to facilitate access to the original sources of each technique and their classification. In addition, our taxonomy classifies techniques into categories at three levels. Each level corresponds to a major design decision, and provides a summary of current research for that particular topic. These research results can be used by designers to make informed decisions when selecting techniques or when deciding on the appropriate display configuration of new MDEs.

Classifying techniques into groups based on a cognitive model can help designers and researchers achieve a deeper understanding of the nature of the cross-display object movement task and its underlying principles. Thinking at the level of underlying principles and fundamental ideas is necessary as the research area matures and moves from innovation, empiricism, and replication to modeling and theory. Our classification and model can be used either as starting points for other considerations of underlying ideas, or as a target for further debate (and, hopefully, for greater understanding of how interaction techniques work in MDEs).

Finally, as a summary of state-of-the-art techniques and cutting-edge research, this taxonomy should help researchers to identify areas of interest and, perhaps, to discover new interaction techniques in unexplored gaps of the design space (see Section 7 below).
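As an illustration of the catalog use described above, the sketch below encodes the three levels of the taxonomy as enumerations and filters a few entries taken from Appendix A. The data structure and the matching() function are our own illustrative assumptions, not part of the taxonomy itself; only the category codes and the example rows come from the paper.

    # Illustrative sketch: the taxonomy's three levels as a filterable catalog.
    # Category codes follow Appendix A; the API itself is an assumption.
    from enum import Enum

    class Reference(Enum):
        SPATIAL = "SP"
        NON_SPATIAL = "NS"

    class Configuration(Enum):
        PLANAR = "PL"
        PERSPECTIVE = "PR"
        LITERAL = "L"
        NOT_APPLICABLE = "NA"

    class Control(Enum):
        OPEN_LOOP = "O"
        CLOSED_LOOP = "C"
        CLOSED_LOOP_WIM = "C*"
        INTERMITTENT = "IN"

    # A few rows from Appendix A.
    CATALOG = {
        "Pick-and-drop": (Reference.SPATIAL, Configuration.LITERAL, Control.CLOSED_LOOP),
        "Perspective Cursor": (Reference.SPATIAL, Configuration.PERSPECTIVE, Control.INTERMITTENT),
        "Mouse Ether": (Reference.SPATIAL, Configuration.PLANAR, Control.INTERMITTENT),
        "Multibrowsing": (Reference.NON_SPATIAL, Configuration.NOT_APPLICABLE, Control.OPEN_LOOP),
        "Radar View": (Reference.SPATIAL, Configuration.PLANAR, Control.CLOSED_LOOP_WIM),
    }

    def matching(reference=None, configuration=None, control=None):
        """Return technique names whose classification matches all given levels."""
        return [name for name, (ref, conf, ctl) in CATALOG.items()
                if (reference is None or ref is reference)
                and (configuration is None or conf is configuration)
                and (control is None or ctl is control)]

    # e.g., all spatial, planar techniques in this toy catalog:
    print(matching(reference=Reference.SPATIAL, configuration=Configuration.PLANAR))
    # -> ['Mouse Ether', 'Radar View']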
5.3 Critical reflection on the taxonomy: limitations and issues
The choices made in the design of the taxonomy lead to useful insights, but also to certain limitations. Most importantly, any taxonomy forces objects into one category or another, and this restricts the ways that some interaction techniques can be characterized. For example, the PointRight technique (Johanson et al., 2002) does not exclusively belong in the 'planar' display configuration, since it uses several planes; and it does not quite belong in the 'perspective' category, since it makes no use of contextual data from the user. Although we cannot conclusively categorize this technique, the taxonomy at least provides us with the conceptual vocabulary necessary to make this assessment, and with the concepts needed to understand why the technique does not fit. In addition, this failure of fit can help to improve the taxonomy – for example, by suggesting that there is in fact a continuum between planar and perspective techniques. Similarly, some techniques, such as SketchRadar (Aliakseyeu and Martens, 2006), move part of the way back toward the non-spatial referential domain, and it is difficult to characterize them as purely spatial or non-spatial; this suggests that there could be additional categories of referential domain that are hybrids of the other classes.
Another issue with the taxonomy is that the levels and concepts that we chose to highlight are not the only way of organizing interaction techniques. As discussed earlier, there could be other classifications – e.g., based on criteria such as performance, usability, or cost; or based on requirements of the task or application domain. These other ways of conceptualizing cross-display movement techniques are entirely legitimate, but do not necessarily reduce the importance of the concepts presented in this paper. We chose to use the ideas of reference, configuration, and control because these were able to focus our attention on fundamental interactions between the demands of cross-display object movement and the way that space and displays are conceived of and organized in an MDE. The issues raised by this perspective are useful for thinking about the design of interaction techniques; however, we believe that other perspectives and other classifications can co-exist with and complement what is presented here.
6. RELATED WORK
This section organizes research related to our topic into two parts: research on MDEs (but not necessarily on interaction techniques), and existing surveys and classifications of interaction techniques.
6.1 Research on multi-display environments
This paper has focused only on interaction in the design of MDEs. Other aspects of MDE design have also received considerable attention in the literature. For example, output – that is, the way that applications and data should be designed for and shown on multiple displays – has been considered by researchers who have looked at the presentation of application windows and other objects (Nacenta et al., 2007-3; Mackinlay and Heer, 2004). Interface plasticity and multi-display application design are also active areas (Wigdor et al., 2006; Robertson et al., 1996; Rogers and Rodden, 2003; Forlines et al., 2006-1), as is the study of input redirection across displays (e.g., Booth et al., 2002; Biehl and Bailey, 2004; Baudisch et al., 2004). Finally, there is also significant research being carried out on the way that people organize and use display spaces (Grudin, 2001; Tan and Czerwinski, 2003; Rogers and Lindley, 2004; Hutchings and Stasko, 2004; Hutchings et al., 2004; Inkpen et al., 2005; MacIntyre et al., 2001).
6.2 Surveys and classifications
There are very few previous attempts to classify the design space of cross-display object movement techniques; we therefore discuss both work directly related to MDEs and literature that can be generalized and applied to multi-display environments.

One of the most pertinent works is Balakrishnan's review of pointing techniques (Balakrishnan, 2004), which surveys interaction techniques that aim to achieve better pointing performance on a single (large) display. The classification is based on the optimized initial impulse model of goal-directed movement, and techniques are classified
based on the three primary approaches used for improving performance: increasing the size of the target, reducing the distance between target and cursor, and combinations of the two. The survey helps create a framework for classifying various pointing techniques and highlights several limitations of current approaches as well as open research challenges. The author notes that while many techniques look promising and appear to be generally faster than conventional pointing, none of them has been demonstrated to work consistently across all situations typical of GUIs. Moreover, most of the techniques have difficulties in situations where the environment is cluttered with a large number of potential targets; this was especially prominent for techniques that rely on increasing target size. The survey also indicates that the success of a technique depends on the choice of input device (pen vs. mouse). The author argues that measuring end-user acceptability is as important as quantitative, performance-based measurements. The review has subsequently led to many new interaction techniques that capitalize on the limitations exposed by the survey.

Bezerianos and Balakrishnan (2004) present six techniques for working with distant parts of a large display in an effort to support a variety of situations typical of GUIs and large displays. They propose techniques for bringing parts of the screen toward the user, portal widgets for interacting with portions of the virtual canvas, and storage of items in unused portions of the screen. In Bezerianos and Balakrishnan (2005-2) they extend their previous work and introduce a canvas-portals framework to integrate various interaction techniques developed for large-scale high-resolution displays. Canvas portals provide an alternative view of display canvas areas where interacting with the portal's interior is equivalent to interacting with the depicted display area. By manipulating the various canvas-portal parameters, a variety of novel interactions can be supported. The paper also demonstrates the use of portals with a number of existing techniques.

More recently, researchers have taken a more holistic view of interaction with large displays. Czerwinski et al. (2006) present an overview of large display research and discuss various usability issues such as accessing windows and icons at a distance, window management, and task management. Their work also introduces a number of new interaction techniques. Ni et al. (2006) present a survey of research on large high-resolution displays from a systems perspective. Their survey looks at hardware configurations, rendering and data pipelines, and applications and visual effects, but it also has sections on human performance, user interfaces, and interaction techniques. The authors also identify ten research challenges for large high-resolution displays.

With the growing popularity of large interactive surfaces, researchers have also investigated interaction and collaboration over horizontal surfaces. For example, Scott et al. (2003) present a set of system guidelines for co-located collaborative work on tabletop displays. Their guidelines suggest that technology must support several communication and coordination activities, such as natural interpersonal interaction, transitions between personal and group work, and the use of physical objects. More recently, researchers have
explored these issues in greater detail, adding to a heightened understanding within the tabletop community (e.g., Kruger et al. (2004) and Scott et al. (2005)). Shen et al. (2006) present a variety of user interfaces, interaction techniques, and usage scenarios for direct-touch tabletop systems. From a number of studies the authors identify three general issues that were observed across different application prototypes and evaluation sessions: orientation side effects, input precision, and non-speech audio feedback for group interaction.

There are only a few papers that propose a taxonomy or classification of MDEs. Kraemer and King (1988) surveyed the area of meeting support systems. This work is, however, of limited use now because the focus of most early systems was to "improve the meeting process," normally by imposing formal constraints on the participants. In one of our previous studies (Nacenta et al., 2005) we presented a design framework that classifies multi-display reaching techniques (putting objects in remote locations) based on their characteristics and requirements. The framework uses nine attributes to characterize the techniques: topology of the underlying interaction space, reaching range, nature of the destination display, feedback provided to the user, input devices used, display and input area requirements, implicit privacy rules, sidedness, and symmetry. In Nacenta et al. (2007-2) we also conceptualized the design space of interaction techniques, this time for tabletop collaborative settings. That classification was oriented toward collaborative measures and used three dimensions: location of the input space, location of the feedback space, and embodiments.
7. RESEARCH AGENDA
In this paper we presented a taxonomy that attempts to divide up the world of cross-display movement techniques based on a set of what we believe are fundamental ideas and distinctions. The taxonomy helps identify several challenges that need to be addressed through future research. Here we discuss these challenges in three categories: innovative solutions, empirical design rules, and predictive models.

Innovative Solutions. The taxonomy reveals several cross-display scenarios for which we lack experience; most current MDEs rely on generic solutions for the different remote powers, and do not pay much attention to the context of the interaction. Similarly, there are few instances where cross-display interaction techniques have been designed for use in collaborative settings. For example, one of the weaknesses of perspective techniques is their poor feedthrough (Section 4.3.2.4); further research is required to find innovative mechanisms through which perspective techniques can be made more suitable for collaborative environments. Point designs and novel solutions for specific instances can help us gain experience in dealing with the various factors that influence cross-display movement techniques, and this in turn can lead to empirical design rules.
Empirical Design Rules. The taxonomy can help identify gaps in our knowledge of how different factors influence cross-display performance. The taxonomy suggests that there are divisions, but says little about the sizes of the effects of those divisions. Empirical studies need to be carried out to identify differences within and between the various referential domains, input configurations, and control mechanisms. More comprehensive empirical comparisons between groups, and better knowledge of the requirements of different tasks, would help create a corpus of design rules to support better design decisions.

Predictive Models. In building our taxonomy we used research results from the psychology literature. However, it is essential to build new models that can predict users' performance in various scenarios. The use of dimensional overlap models and stimulus-response compatibility results can help predict the performance of techniques in single- and multi-display environments; however, we need to build predictive models that account for how the particular details of implementation and the optimizations of techniques interact with the predictions of SRC and DO theory. Results from stimulus-response experiments can be applied to the design of MDEs, but these also need to be carefully studied and replicated in the context of human-computer interaction. We need quantitative data that measures the difference in speed and accuracy of relevant stimulus-response sets, particularly for non-spatial mappings.

Most of the current predictive models for input control are derived from Fitts's law and the steering law, which are primarily concerned with movement time for finger and wrist movements in a Euclidean space. These models need to be extended to include both open-loop and closed-loop interaction, the traversal of displayless space, and whole-arm movement. Finally, we believe that the integration of our taxonomy with the models proposed above and with existing models and laws, such as the Hick-Hyman law of choice selection time (Hick, 1952; Hyman, 1953) or the GOMS model (Card et al., 1983), could lead to the development of a comprehensive framework for the design of MDEs.
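For reference, the commonly used formulations of the models mentioned above are shown below; exact forms vary across the literature (the Hick-Hyman law, in particular, is sometimes written without the +1 term), and a and b are empirically fitted constants:

    % Fitts's law: time to reach a target of width W at distance D
    MT = a + b \log_2\left(\frac{D}{W} + 1\right)

    % Steering law: time to steer along a path C with local tolerance W(s)
    MT = a + b \int_C \frac{ds}{W(s)}

    % Hick-Hyman law: choice-reaction time over n equally likely alternatives
    RT = a + b \log_2(n + 1)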
8. CONCLUSION
In this paper we present a taxonomy of cross-display object movement techniques. Cross-display movement is a fundamental action in multi-display environments, and the taxonomy identifies three fundamental levels in that action: the referential domain that determines how users select a display, the display configuration that determines the relationship between the input space and the displays, and the control paradigm that determines input characteristics. Interaction techniques are classified into categories at each of the levels. In addition, we gather available evidence at each level that can help compare the categories and decide which interaction technique is appropriate for which MDE. By examining the existing literature in relation to the taxonomy we were able to identify several design ideas that have not previously appeared in MDE research; based on this we present several challenges that need to be addressed through further research.
We believe the taxonomy will be valuable to the MDE community, serving as a tool for informing design and as a guide that stimulates future research by identifying gaps in our understanding of interaction techniques in MDEs.
APPENDIX A: INDEX OF INTERACTION TECHNIQUES
The following table contains an alphabetical list of interaction techniques. In some cases the techniques are not named, and the name of the system appears instead. The last column references the source where the technique is defined or, by default, a source where it is compared or studied.

Abbreviations:
Referential domain (2nd column): SP – spatial; NS – non-spatial.
Display configuration (3rd column): PL – planar; PR – perspective; L – literal; NA – not applicable.
Control paradigm (4th column): O – open-loop; C – closed-loop; C* – closed-loop (world-in-miniature; see note below); IN – intermittent.

Note: for the purpose of this classification, world-in-miniature techniques are considered closed-loop because they afford absolute control of the objects in the miniature. However, these techniques only provide feedback for the full-size objects in the environment when the object is in display space. For other users, or depending on the requirements of the task, these techniques should be considered intermittent (IN).
NAME | REF. DOMAIN | DISPLAY CONFIG. | CONTROL | REFERENCE
ARIS's application relocation | SP | PL | C* | (Biehl & Bailey, 2005)
Arranged wormholes | SP | PL | C* | (Wu & Balakrishnan, 2003)
BlueTable | SP | L | C | (Wilson & Sarin, 2007)
BubbleRadar | SP | PL | C* | (Aliakseyeu et al., 2006)
Bump | SP | L | C | (Hinckley, 2003)
Drag-and-Pick | SP | PL | C* | (Baudisch et al., 2003)
Drag-and-Pop | SP | PL | C* | (Baudisch et al., 2003)
Drag-and-Throw / Slingshot | SP | PL | C | (Hascoët, 2003)
E-mail as cross device IT | NS | NA | O | -
EMMIE | SP | PR | C | (Butz et al., 1999)
Flick | SP | PL | O | (Moyle & Cockburn, 2002)
Flying click (flick) | SP | PL | O | (Dulberg et al., 2003)
Frisbee | SP | PL | C* | (Khan et al., 2003)
GesturePen | SP | PR | O | (Swindells et al., 2002)
Head Tracking and Mouse Input for a GUI on Multiple Monitors | SP | PR | O | (Ashdown et al., 2005)
Hop | SP | PL | C | (Irani et al., 2006)
HybridPointing | SP | PL | C | (Forlines et al., 2006-2)
HyperDrag | SP | PL | C | (Rekimoto & Saitoh, 1999)
HyperPalette | SP | L | C | (Ayatsuka et al., 2000)
InfoBinder | SP | PR | O | (Siio, 1995)
InfoStick/InfoPoint | SP | PR | C | (Kohtake et al., 1999; Kohtake et al., 2001)
Instant Messages as cross device IT | NS | NA | O | -
Interaction points (Dynamo) | SP | PL | C | (Izadi et al., 2003)
Laser pointers | SP | PR | C | (Olsen & Nielsen, 2001)
Lightweight personal bindings | SP | PL | C | (Ha et al., 2006-2)
Look-to-Talk | SP | PR | O | (Oh et al., 2002)
Mighty Mouse Explicit Switching | NS | NA | O | (Booth et al., 2002)
Mighty Mouse Implicit Switching | SP | PL | C | (Booth et al., 2002)
M3 with head tracking | SP | PR | O | (Benko & Feiner, 2005)
M3 with keyboard switch | NS | NA | O | (Benko & Feiner, 2005)
M3 with mouse location switch | SP | PL | O | (Benko & Feiner, 2005)
M3 with mouse button switch | NS | NA | O | (Benko & Feiner, 2005)
ModSlideShow / Discrete Modular Model | SP | PL | O | (Chiu et al., 2003)
ModSlideShow / Panoramic View | SP | PL | C* | (Chiu et al., 2003)
Mouse Ether | SP | PL | IN | (Baudisch et al., 2004)
Multibrowsing | NS | NA | O | (Johanson et al., 2001)
Network folders | NS | NA | O | -
OS/stitching | SP | PL | C | http://www.microsoft.com/windowsxp/using/setup/learnmore/northrup_multimon.mspx
Passage/Bridge | SP | L | C | (Streitz et al., 1998)
Perspective Cursor | SP | PR | IN | (Nacenta et al., 2006)
Pick-and-drop | SP | L | C | (Rekimoto, 1997)
PointRight | SP | PL | C | (Johanson et al., 2002)
Proximal Interactions | SP | L | C | (Rekimoto et al., 2003-1)
Push-and-pop | SP | PL | C* | (Collomb et al., 2005)
Push-and-throw / Pantograph | SP | PL | C | (Hascoët, 2003)
Put-that-there | SP | PR | O | (Bolt, 1980)
Radar View | SP | PL | C* | (Nacenta et al., 2005)
Rooms/Telepointers | SP | PL | O | (Stefik et al., 1986)
Semantic snarfing | SP | PR | C | (Myers et al., 2001)
Sensetable | SP | L | C | (Patten et al., 2001)
(Shaking) Smart-Its Friends | SP | L | C | (Holmquist et al., 2001)
SharedNotes | NS | NA | O | (Greenberg et al., 1999)
SketchRadar | SP | PL | C* | (Aliakseyeu & Martens, 2006)
SpaceGlider | SP | PL | C | (Leigh et al., 2002)
Spatial file transfer | SP | PL | O | (Hazas et al., 2005)
Stitching | SP | L | C | (Hinckley et al., 2004)
Superflick | SP | PL | C | (Reetz et al., 2006)
Swordfish | SP | PL | C | (Ha et al., 2006-1)
Synchronized clipboard | NS | NA | O | (Miller & Myers, 1999)
Synchronized gestures | SP | L | C | (Nacenta et al., 2005)
SyncTap | SP | L | C | (Rekimoto et al., 2003-2)
Tangible Bits / Media blocks | SP | L | C | (Ullmer et al., 1998)
Throw | SP | PL | O | (Geißler, 1998)
TractorBeam | SP | PR/L | C | (Parker et al., 2005)
u-Textures | SP | L | C | (Kohtake et al., 2005)
Vacuum | SP | PL | C | (Bezerianos & Balakrishnan, 2005-1)
WipeIt Peephole | SP | PL | O | (Butz & Krüger, 2006)
World in Miniature | SP | PL | C* | (Stoakley et al., 1995)
REFERENCES
Aliakseyeu, D. & Martens, J.-B. (2006) Sketch Radar: A Novel Technique for Multi-Device Interaction. Proceedings of HCI'2006, Vol. 2, 45-49. British HCI Group.

Aliakseyeu, D., Lucero, A., & Martens, J.-B. (2008) Where is a cat: users' quest for an optimized representation of a multi-device space. Submitted to AVI'08.

Aliakseyeu, D., Nacenta, M. A., Subramanian, S., & Gutwin, C. (2006) Bubble radar: efficient pen-based interaction. Proceedings of the Working Conference on Advanced Visual Interfaces (AVI '06), 19-26. New York: ACM.

Asano, T., Sharlin, E., Kitamura, Y., Takashima, K., & Kishino, F. (2005) Predictive interaction using the Delphian Desktop. Proceedings of UIST 2005 – the ACM Symposium on User Interface Software and Technology, 133-141. New York: ACM.

Ashdown, M., Oka, K., & Sato, Y. (2005) Combining head tracking and mouse input for a GUI on multiple monitors. Extended Abstracts of the CHI'05 Conference on Human Factors in Computing Systems, 1188-1191. New York: ACM.

Ayatsuka, Y., Matsushita, N., & Rekimoto, J. (2000) HyperPalette: a hybrid computing environment for small computing devices. Extended Abstracts of the CHI'00 Conference on Human Factors in Computing Systems, 133-134. New York: ACM.

Balakrishnan, R. (2004) "Beating" Fitts' law: Virtual enhancements for pointing facilitation. International Journal of Human-Computer Studies, 61(6), 857-874.

Baudisch, P. & Rosenholtz, R. (2003) Halo: A Technique for Visualizing Off-Screen Locations. Proceedings of the CHI'03 Conference on Human Factors in Computing Systems, 481-488. New York: ACM.

Baudisch, P., Cutrell, E., Hinckley, K., & Gruen, R. (2004) Mouse Ether: Accelerating the Acquisition of Targets Across Multi-Monitor Displays. Extended Abstracts of the CHI'04 Conference on Human Factors in Computing Systems, 1379-1382. New York: ACM.

Baudisch, P., Cutrell, E., Robbins, D., Czerwinski, M., Tandler, P., Bederson, B., & Zierlinger, A. (2003) Drag-and-pop and Drag-and-pick: Techniques for accessing remote screen content on touch- and pen-operated systems. Proceedings of INTERACT 2003 – the Ninth IFIP TC13 International Conference on Human-Computer Interaction, 57-64. Amsterdam: IOS Press.
Benko, H. & Feiner, S. (2005) Multi-Monitor Mouse. Extended Abstracts of the CHI'05 Conference on Human Factors in Computing Systems, 1208-1211. New York: ACM.

Benko, H. & Feiner, S. (2007) Pointer Warping in Heterogeneous Multi-Monitor Environments. Proceedings of Graphics Interface 2007 (in press).

Bezerianos, A. & Balakrishnan, R. (2004) Interaction and visualization techniques for very large scale high resolution displays (Technical Report DGP-TR-2004-002). DGP Lab, University of Toronto.

Bezerianos, A. & Balakrishnan, R. (2005-1) The Vacuum: Facilitating the manipulation of distant objects. Proceedings of the CHI'05 Conference on Human Factors in Computing Systems, 361-370. New York: ACM.

Bezerianos, A. & Balakrishnan, R. (2005-2) View and Space Management on Large Displays. IEEE Computer Graphics and Applications, 25(4), 34-43.

Biehl, J. T. & Bailey, B. P. (2004) ARIS: an interface for application relocation in an interactive space. Proceedings of Graphics Interface 2004, 107-116. Waterloo: Canadian Human-Computer Communications Society.

Biehl, J. T. & Bailey, B. P. (2006) Improving Interfaces for Managing Applications in Multiple-Device Environments. Proceedings of the AVI '06 Conference on Advanced Visual Interfaces, 35-42. New York: ACM.

Bluetooth SIG. The Bluetooth specifications. https://www.bluetooth.org/spec/ (last accessed June 14, 2007).

Bolt, R. A. (1980) "Put-that-there": Voice and gesture at the graphics interface. Proceedings of the SIGGRAPH '80 Conference on Computer Graphics and Interactive Techniques, 262-270. New York: ACM.

Booth, K. S., Fisher, B. D., Lin, C. J., & Argue, R. (2002) The "mighty mouse" multi-screen collaboration tool. Proceedings of UIST 2002 – the ACM Symposium on User Interface Software and Technology, 209-212. New York: ACM.

Britton, E. G., Lipscomb, J. S., & Pique, M. E. (1978) Making nested rotations convenient for the user. SIGGRAPH Computer Graphics, 12(3), 222-227.

Butz, A. & Krüger, A. (2006) Applying the Peephole Metaphor in a Mixed-Reality Room. IEEE Computer Graphics and Applications, 26(1), 56-63.
Butz, A., Höllerer, T., Feiner, S., MacIntyre, B., & Beshers, C. (1999) Enveloping Users and Computers in a Collaborative 3D Augmented Reality. Proceedings of the 2nd IEEE and ACM International Workshop on Augmented Reality (IWAR '99), 35-44. Washington: IEEE Computer Society.

Card, S. K., Moran, T. P., & Newell, A. (1983) The Psychology of Human-Computer Interaction. Hillsdale, NJ: Lawrence Erlbaum Associates.

Chiu, P., Liu, Q., Boreczky, J., Foote, J., Fuse, T., Kimber, D., Lertsihichai, S., & Liao, C. (2003) Manipulating and annotating slides in a multi-display environment. Proceedings of INTERACT 2003 – the Ninth IFIP TC13 International Conference on Human-Computer Interaction, 583-590. Amsterdam: IOS Press.

Chou, P., Gruteser, M., Lai, J., Levas, A., McFaddin, S., Pinhanez, C., & Viveros, M. (2001) BlueSpace: Creating a personalized and context-aware workspace. Technical Report RC 22281, IBM Research.

Chua, R., Weeks, D. J., & Goodman, D. (2003) Perceptual-motor interaction: some implications for human-computer interaction. In J. A. Jacko & A. Sears (Eds.), The Human-Computer Interaction Handbook: Fundamentals, Evolving Technologies and Emerging Applications (pp. 23-34). Mahwah, NJ: Lawrence Erlbaum Associates.

Collomb, M., Hascoët, M., Baudisch, P., & Lee, B. (2005) Improving drag-and-drop on wall-size displays. Proceedings of the GI'05 Conference on Graphics Interface, 25-32. New York: ACM.

Czerwinski, M., Robertson, G., Meyers, B., Smith, G., Robbins, D., & Tan, D. (2006) Large display research overview. Extended Abstracts of the CHI'06 Conference on Human Factors in Computing Systems, 69-74. New York: ACM.

Dickie, C., Hart, J., Vertegaal, R., & Eiser, A. (2006) LookPoint: an evaluation of eye input for hands-free switching of input devices between multiple computers. Proceedings of OZCHI '06, the CHISIG Annual Conference on Human-Computer Interaction, 119-126.

Dix, A. J. (1994) Computer-supported cooperative work – a framework. In D. Rosenburg & C. Hutchison (Eds.), Design Issues in CSCW (pp. 23-37). Springer-Verlag.

Dulberg, M. S., St. Amant, R., & Zettlemoyer, L. S. (2003) An Imprecise Mouse Gesture for the Fast Activation of Controls. Proceedings of INTERACT 2003 – the Ninth IFIP TC13 International Conference on Human-Computer Interaction, 57-64. Amsterdam: IOS Press.
Fitts, P. M. & Deininger, R. L. (1954) S-R compatibility: Correspondence among paired elements within stimulus and response codes. Journal of Experimental Psychology, 48, 483-492.

Fitts, P. M. & Seeger, C. M. (1953) S-R compatibility: Spatial characteristics of stimulus and response codes. Journal of Experimental Psychology, 46, 199-210.

Forlines, C., Esenther, A., Shen, C., Wigdor, D., & Ryall, K. (2006-1) Multi-user, multi-display interaction with a single-user, single-display geospatial application. Proceedings of UIST 2006 – the ACM Symposium on User Interface Software and Technology, 273-276. New York: ACM.

Forlines, C., Vogel, D., & Balakrishnan, R. (2006-2) HybridPointing: fluid switching between absolute and relative pointing with a direct input device. Proceedings of UIST 2006 – the ACM Symposium on User Interface Software and Technology, 211-220. New York: ACM.

Geißler, J. (1998) Shuffle, throw or take it! Working Efficiently with an Interactive Wall. Proceedings of the CHI'98 Conference on Human Factors in Computing Systems, 265-266. New York: ACM.

Greenberg, S., Boyle, M., & LaBerge, J. (1999) PDAs and Shared Public Displays: Making Personal Information Public, and Public Information Personal. Personal Technologies, 3(1), 54-64.

Grudin, J. (2001) Partitioning digital worlds: focal and peripheral awareness in multiple monitor use. Proceedings of the CHI'01 Conference on Human Factors in Computing Systems, 458-465. New York: ACM.

Gutwin, C. & Greenberg, S. (1998) Design for individuals, design for groups: tradeoffs between power and workspace awareness. Proceedings of the ACM Conference on Computer Supported Cooperative Work (CSCW '98), 207-216. New York: ACM.

Gutwin, C. & Greenberg, S. (2002) A Descriptive Framework of Workspace Awareness for Real-Time Groupware. Computer Supported Cooperative Work, 11(3-4), 411-446. The Netherlands: Springer.

Ha, V., Inkpen, K., Wallace, J., & Ziola, R. (2006-1) Swordfish: user tailored workspaces in multi-display environments. Extended Abstracts of the CHI'06 Conference on Human Factors in Computing Systems, 1487-1492. New York: ACM.

Ha, V., Wallace, J., Ziola, R., & Inkpen, K. (2006-2) My MDE: configuring virtual workspaces in multi-display environments. Extended Abstracts of the CHI'06 Conference on Human Factors in Computing Systems, 1481-1486. New York: ACM.
Harter, A., Hopper, A., Steggles, P., Ward, A., & Webster, P. (1999) The anatomy of a context-aware application. Proceedings of the 5th Annual ACM/IEEE International Conference on Mobile Computing and Networking (MobiCom '99), 59-68. New York: ACM.

Hascoët, M. (2003) Throwing models for large displays. Proceedings of the 17th Annual Human-Computer Interaction Conference (HCI'2003), Vol. 2, 73-77. British HCI Group.

Hascoët, M. & Collomb, M. (2004) Speed and accuracy in throwing models for large displays. Proceedings of the 18th Annual Human-Computer Interaction Conference (HCI'2004), Vol. 2, 21-24. British HCI Group.

Hazas, M., Kray, C., Gellersen, H., Agbota, H., Kortuem, G., & Krohn, A. (2005) A relative positioning system for co-located mobile devices. Proceedings of MobiSys '05 – the 3rd International Conference on Mobile Systems, Applications, and Services, 177-190. New York: ACM.

Hick, W. (1952) On the rate of gain of information. Quarterly Journal of Experimental Psychology, 4, 11-26.

Hinckley, K. (2003) Bumping Objects Together as a Semantically Rich Way of Forming Connections between Ubiquitous Devices. UbiComp 2003 video program.

Hinckley, K. (2006) Input Technologies and Techniques. In A. Sears & J. A. Jacko (Eds.), Handbook of Human-Computer Interaction. Lawrence Erlbaum & Associates. Significant revision of 2001 chapter. To appear.

Hinckley, K., Ramos, G., Guimbretiere, F., Baudisch, P., & Smith, M. (2004) Stitching: pen gestures that span multiple displays. Proceedings of the AVI '04 Conference on Advanced Visual Interfaces, 23-31. New York: ACM.

Holmquist, L. E., Mattern, F., Schiele, B., Alahuhta, P., Beigl, M., & Gellersen, H.-W. (2001) Smart-Its Friends: A Technique for Users to Easily Establish Connections between Smart Artefacts. Proceedings of UbiComp 2001, 116-121. London: Springer-Verlag.

Hommel, B., Müsseler, J., Aschersleben, G., & Prinz, W. (2001) The Theory of Event Coding: A Framework for Perception and Action Planning. Behavioral and Brain Sciences, 24, 849-937.
Hutchings, D. R. & Stasko, J. (2004) Revisiting display space management: understanding current practice to inform next-generation design. Proceedings of the 2004 Conference on Graphics Interface, 127-134. New York: ACM.

Hutchings, D. R., Smith, G., Meyers, B., Czerwinski, M., & Robertson, G. (2004) Display space usage and window management operation comparisons between single monitor and multiple monitor users. Proceedings of the Working Conference on Advanced Visual Interfaces (AVI '04), 32-39. New York: ACM.

Hutchins, E. L., Hollan, J. D., & Norman, D. A. (1985) Direct Manipulation Interfaces. Human-Computer Interaction, 1, 311-338. Lawrence Erlbaum Associates.

Hyman, R. (1953) Stimulus information as a determinant of reaction time. Journal of Experimental Psychology, 45, 188-196.

Inkpen, K. M., Hawkey, K., Kellar, M., Mandryk, R. L., Parker, J. K., Reilly, D., Scott, S. D., & Whalen, T. (2005) Exploring Display Factors that Influence Co-Located Collaboration: Angle, Size, Number, and User Arrangement. Proceedings of HCI International 2005.

Irani, P., Gutwin, C., & Yang, X. D. (2006) Improving selection of off-screen targets with hopping. Proceedings of the CHI'06 Conference on Human Factors in Computing Systems, 299-308. New York: ACM.

Izadi, S., Brignull, H., Rodden, T., Rogers, Y., & Underwood, M. (2003) Dynamo: a public interactive surface supporting the cooperative sharing and exchange of media. Proceedings of UIST 2003 – the ACM Symposium on User Interface Software and Technology, 159-168. New York: ACM.

Jacob, R. J., Sibert, L. E., McFarlane, D. C., & Mullen, M. P. (1994) Integrality and separability of input devices. ACM Transactions on Computer-Human Interaction, 1(1), 3-26.

Ji, Y., Biaz, S., Pandey, S., & Agrawal, P. (2006) ARIADNE: a dynamic indoor signal map construction and localization system. Proceedings of the 4th International Conference on Mobile Systems, Applications and Services (MobiSys '06), 151-164. New York: ACM.

Johanson, B., Hutchins, G., Winograd, T., & Stone, M. (2002) PointRight: experience with flexible input redirection in interactive workspaces. Proceedings of UIST 2002 – the ACM Symposium on User Interface Software and Technology, 227-234. New York: ACM.

Johanson, B., Ponnekanti, S., Sengupta, C., & Fox, A. (2001) Multibrowsing: Moving Web Content across Multiple Displays. Proceedings of UbiComp 2001, 346-353. London: Springer-Verlag.

Khan, A., Fitzmaurice, G., Almeida, D., Burtnyk, N., & Kurtenbach, G. (2003) A Remote Control Interface for Large Displays. Proceedings of UIST 2003 – the ACM Symposium on User Interface Software and Technology, 127-136. New York: ACM.
Kohtake, N., Ohsawa, R., Yonezawa, T., Matsukura, Y., Iwai, M., Takashio, K., & Tokuda, H. (2005) u-Texture: Self-organizable Universal Panels for Creating Smart Surroundings. Proceedings of the 7th International Conference on Ubiquitous Computing (UbiComp 2005), 19-36. Heidelberg: Springer.

Kohtake, N., Rekimoto, J., & Anzai, Y. (1999) InfoStick: An Interaction Device for Inter-Appliance Computing. Proceedings of the 1st International Symposium on Handheld and Ubiquitous Computing, 246-258. London: Springer-Verlag.

Kohtake, N., Rekimoto, J., & Anzai, Y. (2001) InfoPoint: A Device that Provides a Uniform User Interface to Allow Appliances to Work Together over a Network. Personal and Ubiquitous Computing, 5(4), 264-274.

Kornblum, S. (1992) Dimensional overlap and dimensional relevance in stimulus-response and stimulus-stimulus compatibility. Advances in Psychology, 87, 743-777.

Kornblum, S., Hasbroucq, T., & Osman, A. (1990) Dimensional Overlap: Cognitive Basis for Stimulus-Response Compatibility – A Model and Taxonomy. Psychological Review, 97(2), 253-270.

Kortuem, G., Kray, C., & Gellersen, H. (2005) Sensing and visualizing spatial relations of mobile devices. Proceedings of UIST 2005 – the ACM Symposium on User Interface Software and Technology, 93-102. New York: ACM.

Kraemer, K. L. & King, J. L. (1988) Computer-based systems for cooperative work and group decision making. ACM Computing Surveys, 20(2), 115-146.

Kruger, R., Carpendale, S., Scott, S. D., & Greenberg, S. (2004) Roles of Orientation in Tabletop Collaboration: Comprehension, Coordination and Communication. Computer Supported Cooperative Work, 13(5-6), 501-537.

Leigh, J., Johnson, A., Park, K., Nayak, A., Singh, R., & Chowdry, V. (2002) Amplified Collaboration Environments. Proceedings of the VizGrid Symposium.

MacIntyre, B., Mynatt, E. D., Voida, S., Hansen, K. M., Tullio, J., & Corso, G. M. (2001) Support for multitasking and background awareness using interactive peripheral displays. Proceedings of UIST 2001 – the ACM Symposium on User Interface Software and Technology, 41-50. New York: ACM.
MacKenzie, I. S. (1991) Fitts' law as a performance model in human-computer interaction. Doctoral dissertation, University of Toronto, Toronto, Ontario, Canada.

Mackinlay, J. D. & Heer, J. (2004) Wideband displays: mitigating multiple monitor seams. Extended Abstracts of the CHI'04 Conference on Human Factors in Computing Systems, 1521-1524. New York: ACM.

Manesis, T. & Avouris, N. (2005) Survey of position location techniques in mobile systems. Proceedings of the 7th International Conference on Human-Computer Interaction with Mobile Devices & Services (MobileHCI '05), 291-294. New York: ACM.

Massó, J. P., Vanderdonckt, J., & López, P. G. (2006) Direct manipulation of user interfaces for migration. Proceedings of the 11th International Conference on Intelligent User Interfaces (IUI '06), 140-147. New York: ACM.

Miller, R. C. & Myers, B. A. (1999) Synchronizing clipboards of multiple computers. Proceedings of UIST 1999 – the ACM Symposium on User Interface Software and Technology, 65-66. New York: ACM.

Moyle, M. & Cockburn, A. (2002) Analysing Mouse and Pen Flick Gestures. Proceedings of the SIGCHI-NZ Symposium on Computer-Human Interaction, 19-24.

Myers, B. A., Bhatnagar, R., Nichols, J., Peck, C. H., Kong, D., Miller, R., & Long, A. C. (2002) Interacting at a distance: measuring the performance of laser pointers and other devices. Proceedings of the CHI'02 Conference on Human Factors in Computing Systems, 33-40. New York: ACM.

Myers, B. A., Peck, C. H., Nichols, J., Kong, D., & Miller, R. (2001) Interacting At a Distance Using Semantic Snarfing. Proceedings of UbiComp 2001, 305-314. New York: ACM.

Nacenta, M. A., Aliakseyeu, D., Subramanian, S., & Gutwin, C. (2005) A comparison of techniques for Multi-Display Reaching. Proceedings of the CHI'05 Conference on Human Factors in Computing Systems, 371-380. New York: ACM.

Nacenta, M. A., Sallam, S., Champoux, B., Subramanian, S., & Gutwin, C. (2006) Perspective cursor: perspective-based interaction for multi-display environments. Proceedings of the CHI'06 Conference on Human Factors in Computing Systems, 289-298. New York: ACM.

Nacenta, M. A., Aliakseyeu, D., Stach, T., Gutwin, C., & Subramanian, S. (2007-1) Two Experiments on Co-located Mobile Groupware. Technical Report 1, Department of Computer Science, University of Saskatchewan. URL: http://hci.usask.ca/publications/2007/TR-1.pdf
Nacenta, M. A., Pinelle, D., Stuckel, D., & Gutwin, C. (2007-2) The Effects of Interaction Technique on Coordination in Tabletop Groupware. Proceedings of the Conference on Human-Computer Interaction and Computer Graphics 2007.

Nacenta, M. A., Sakurai, S., Yamaguchi, T., Miki, Y., Itoh, Y., Kitamura, Y., Subramanian, S., & Gutwin, C. (2007-3) E-conic: a Perspective-Aware Interface for Multi-Display Environments. Proceedings of UIST 2007 – the ACM Symposium on User Interface Software and Technology, 279-288. New York: ACM.

Nacenta, M. A., Mandryk, R., & Gutwin, C. (2008) Targeting across Displayless Space. Proceedings of the CHI'08 Conference on Human Factors in Computing Systems. New York: ACM.

Ni, T., Schmidt, G. S., Staadt, O. G., Livingston, M. A., Ball, R., & May, R. (2006) A Survey of Large High-Resolution Display Technologies, Techniques, and Applications. Proceedings of the IEEE Virtual Reality Conference (VR 2006), 223-236. Washington: IEEE Computer Society.

Norman, D. (2002) The Design of Everyday Things. Basic Books.

Oh, A., Fox, H., Van Kleek, M., Adler, A., Gajos, K., Morency, L., & Darrell, T. (2002) Evaluating look-to-talk: a gaze-aware interface in a collaborative environment. Extended Abstracts of the CHI'02 Conference on Human Factors in Computing Systems, 650-651. New York: ACM.

Oh, J. Y. & Stuerzlinger, W. (2002) Laser pointers as collaborative pointing devices. Proceedings of Graphics Interface (GI '02), 141-149.

Olsen, D. & Nielsen, T. (2001) Laser pointer interaction. Proceedings of the CHI'01 Conference on Human Factors in Computing Systems, 17-22. New York: ACM.

Parker, K., Mandryk, R., Nunes, M., & Inkpen, K. (2005) TractorBeam Selection Aids: Improving Target Acquisition for Pointing Input on Tabletop Displays. Proceedings of INTERACT 2005 – the Tenth IFIP TC13 International Conference on Human-Computer Interaction, 80-93. Heidelberg: Springer.

Patten, J., Ishii, H., Hines, J., & Pangaro, G. (2001) Sensetable: a wireless object tracking platform for tangible user interfaces. Proceedings of the CHI'01 Conference on Human Factors in Computing Systems, 253-260. New York: ACM.

Po, B. A., Fisher, D., & Booth, S. (2005) Comparing cursor orientations for mouse, pointer, and pen interaction. Proceedings of the CHI'05 Conference on Human Factors in Computing Systems, 291-300. New York: ACM.

Ponnekanti, S. R., Johanson, B., Kiciman, E., & Fox, A. (2003) Portability, Extensibility and Robustness in iROS. Proceedings of the First IEEE International Conference on Pervasive Computing and Communications (PerCom), 11. Washington: IEEE Computer Society.
Proctor, R. W. & Vu, K. L. (2003) Human information processing: an overview for human-computer interaction. In J. A. Jacko & A. Sears (Eds.), The Human-Computer Interaction Handbook: Fundamentals, Evolving Technologies and Emerging Applications (pp. 35-51). Mahwah, NJ: Lawrence Erlbaum Associates.

Proctor, R. W. & Reeve, T. G. (1990) Stimulus-response compatibility: an integrated perspective. New York: North-Holland.

Randall, J., Amft, O., Bohn, J., & Burri, M. (2007) LuxTrace – indoor positioning using building illumination. Personal and Ubiquitous Computing, 11(6), 417-428. Springer.

Reetz, A., Gutwin, C., Stach, T., Nacenta, M., & Subramanian, S. (2006) Superflick: a natural and efficient technique for long-distance object placement on digital tables. Proceedings of the GI'06 Conference on Graphics Interface, 163-170. New York: ACM.

Rekimoto, J. (1997) Pick-and-Drop: A Direct Manipulation Technique for Multiple Computer Environments. Proceedings of UIST 1997 – the ACM Symposium on User Interface Software and Technology, 31-39. New York: ACM.

Rekimoto, J. & Saitoh, M. (1999) Augmented Surfaces: A Spatially Continuous Work Space for Hybrid Computing Environments. Proceedings of the CHI'99 Conference on Human Factors in Computing Systems, 378-385. New York: ACM.

Rekimoto, J., Ayatsuka, Y., Kohno, M., & Oba, H. (2003-1) Proximal Interactions: A Direct Manipulation Technique for Wireless Networking. Proceedings of INTERACT 2003 – the Ninth IFIP TC13 International Conference on Human-Computer Interaction, 511-518. Amsterdam: IOS Press.

Rekimoto, J., Ayatsuka, Y., & Kohno, M. (2003-2) SyncTap: An interaction technique for mobile networking. Proceedings of the Mobile Human-Computer Interaction Conference 2003, 104-115. Berlin: Springer.

Robertson, S., Wharton, C., Ashworth, C., & Franzke, M. (1996) Dual Device User Interface Design: PDAs and Interactive Television. Proceedings of the CHI'96 Conference on Human Factors in Computing Systems, 79-86. New York: ACM.

Rogers, Y. & Rodden, T. (2003) Configuring spaces and surfaces to support collaborative interactions. In K. O'Hara, M. Perry, E. Churchill, & D. Russell (Eds.), Public and Situated Displays. Kluwer Publishers.
Rogers, Y. & Lindley, S. (2004) Collaborating around vertical and horizontal large interactive displays: which way is best? Interacting with Computers, 16(6), 1133-1152.

Román, M., Hess, C., Cerqueira, R., Ranganathan, A., Campbell, R. H., & Nahrstedt, K. (2002) A Middleware Infrastructure for Active Spaces. IEEE Pervasive Computing, 1(4), 74-83.

Scott, S., Grant, K., & Mandryk, R. (2003) System Guidelines for Co-located Collaborative Work on a Tabletop Display. Proceedings of the European Conference on Computer-Supported Cooperative Work (ECSCW 2003), 1-20.

Shell, J. S., Selker, T., & Vertegaal, R. (2003) Interacting with groups of computers. Communications of the ACM, 46(3), 40-46.

Shen, C., Ryall, K., Forlines, C., Esenther, A., Everitt, K., Hancock, M., Morris, M. R., Vernier, F., Wigdor, D., & Wu, M. (2006) Interfaces, Interaction Techniques and User Experience on Direct-Touch Horizontal Surfaces. IEEE Computer Graphics and Applications, Sept/Oct 2006, 36-46.

Siio, I. (1995) InfoBinder: a pointing device for a virtual desktop system. Proceedings of the Sixth International Conference on Human-Computer Interaction, Vol. III, 261-264. Elsevier Science.

Simon, J. R. (1990) The effects of an irrelevant directional cue on human information processing. In R. W. Proctor & T. G. Reeve (Eds.), Stimulus-response compatibility (pp. 31-86). New York: North-Holland.

Simon, J. R. & Rudell, A. P. (1967) Auditory S-R compatibility: The effect of an irrelevant cue on information processing. Journal of Applied Psychology, 51, 300-304.

Slay, H., Thomas, B., & Vernik, R. (2003) An interaction model for universal interaction and control in multi display environments. Proceedings of the 1st International Symposium on Information and Communication Technologies, 220-225. New York: ACM.

Slay, H., Thomas, B., Vernik, R., & Piekarski, W. (2004) A Rapidly Adaptive Collaborative Ubiquitous Computing Environment to Allow Passive Detection of Marked Objects. Proceedings of APCHI '04, Asia-Pacific CHI, 420-430. Heidelberg: Springer.

Smith, R. B. (1992) What you see is what I think you see. SIGCUE Outlook, 21(3), 18-23.