
Attentive Camera Navigation in Virtual Environments

Stephen Hughes and Michael Lewis
University of Pittsburgh, Pittsburgh, PA 15260

Abstract

Gaining an accurate mental representation of real environments and realistic Virtual Environments is a gradual process. Significant aspects of an environment may be obvious to a trained expert but not to a novice trainee. If users do not know where to look, they may concentrate on irrelevant objects, which detracts from learning the locations of truly prominent landmarks. This paper explores attentive camera navigation, a technique that guides the user to focus on certain objects through automatic gaze redirection. Results of a user study suggest that this technique can help filter out unnecessary objects and allow users to quickly understand the configuration of a selected subset of landmarks.

Introduction

One of the major promises of Virtual Environments (VE) is the learning and transfer of spatial knowledge. Whether the motivation is danger and cost (e.g., learning about hostile environments) or the constraints of the physical world (e.g., visualization of atomic structures), the goal remains the same: users of virtual environments hope to obtain two types of information, the presence of particular objects and the organizational relationships between those objects. These requirements fit the spatial knowledge framework established by Siegel and White [6]. They defined landmark knowledge as the ability to extract and remember features from the scene. Understanding the spatial relationships among the landmarks is known as survey knowledge, which is often likened to having a mental representation of a map. Having obtained survey knowledge of an environment, an individual is capable of developing novel routes between landmarks. To construct effective virtual environments, we must understand how people acquire these types of spatial knowledge and apply these properties to our designs. Moreover, given the artificial nature of VE, we can produce tool sets based on these concepts that facilitate the wayfinding experience [1]. Although survey knowledge can be obtained solely by studying maps, a more thorough understanding can be achieved through motion in the environment.

Thorndyke and Hayes-Roth [7] conducted a study comparing map learning (static) to navigational learning (moving). They discovered that while people learned survey knowledge more quickly from a map, they were prone to errors when their perspective did not match the orientation of the map. Navigators, on the other hand, learned survey knowledge at a slower rate, but were not subject to such errors. Several attempts have been made to develop a hybrid technique that merges the speed of map learning with the orientation-free aspects of navigation. Satalich exposed people to a VE with a heads-up display containing a "You-are-here" map. Unfortunately, she found no improvement in survey knowledge over those who navigated the environment without this augmentation [5]. Similarly, Goerger et al. found that short exposures to complex VEs with the use of a map hybrid were, at best, not helpful compared to map use alone [2]. The authors suggested that the VE offered superfluous details that overwhelmed the users, and these distractions inhibited spatial learning. The findings of Goerger et al. suggest the need for a new tool that can focus the user's attention on relevant landmarks while allowing extraneous features to be dismissed. To obtain such a system, we turn to a technique proposed by Hanson, Wernert, and Hughes. The attentive camera technique was originally conceived as a method for using a 2D input device to explore a 3D environment effectively [4]. The underlying principle is that the direction of the user's gaze and position in the environment can be computed as a function of the user's position in a 2D constraint space. For the purposes of this study, the constraint surface is set to the floor of the VE. Thus, by walking around the floor of the environment, it is possible to manipulate the user's gaze to attend to landmarks that are consistent with the navigation task.
Previous work with this implementation has demonstrated that using the attentive camera significantly increases a user's landmark knowledge [3]. This paper describes some interface improvements, as well as a discussion of the technique's effectiveness at enhancing survey knowledge.
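The constraint-space principle can be illustrated with a minimal sketch: the camera's gaze target is computed purely as a function of the user's 2D position on the constraint surface (here, the floor). The landmark coordinates, attraction radius, and function names below are illustrative assumptions, not details taken from [4].

```python
import numpy as np

# Hypothetical attention field: landmark positions on the floor plane (x, z)
# and a radius within which a landmark attracts the camera's gaze.
# Both values are invented for illustration.
LANDMARKS = np.array([[4.0, 2.0], [1.0, 7.0], [8.0, 8.0]])
ATTENTION_RADIUS = 5.0

def gaze_target(position):
    """Return the floor point the camera should look toward from `position`,
    or None if no landmark is close enough to capture attention."""
    dists = np.linalg.norm(LANDMARKS - position, axis=1)
    nearest = int(np.argmin(dists))
    if dists[nearest] <= ATTENTION_RADIUS:
        return LANDMARKS[nearest]   # fixate the nearby landmark
    return None                     # gaze is free to follow the motion vector

print(gaze_target(np.array([3.0, 3.0])))      # near the first landmark
print(gaze_target(np.array([100.0, 100.0])))  # nothing in range
```

A full constraint space would interpolate smoothly between such attention points rather than snapping to the nearest one; this sketch only shows the position-to-gaze mapping.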

Description of Technique

Implementing the attentive camera technique raises some subtle yet interesting interface challenges. The traditional mechanism for locomotion in a VE is to walk forward in the direction one is looking. If this approach were taken with the attentive camera technique, it would be impossible for the user to walk in a straight line; the user would only be able to walk to the nearest object and then stop. The natural response to this problem is to switch from egocentric motion to world-centric motion: using absolute directions, a user may specify motion along a fixed heading (North, South, East, or West, for example). Unfortunately, this approach is not natural for users and, consequently, is not well accepted.

To address this problem, consider a person walking down the street. As he moves forward, his eyes wander to interesting features along the path. This traveller generally does not concentrate on the direction of travel until he arrives at a point where he must turn; as far as controlling motion is concerned, he only determines a heading and then follows it until he reaches a destination. This scenario is analogous to our interaction method for attentive camera navigation. The user controls two degrees of freedom: heading and forward motion. When the user initiates a movement forward (in the direction he or she is looking), a motion vector is established. All movement occurs relative to that vector until the user stops moving. While the user is moving, the attentive camera technique is free to adjust the user's gaze independently of the motion vector. This allows the user to be shown a sequence of significant objects that lie between any two points in the environment. Furthermore, the controls are equivalent to the golf-cart navigation that is common in many existing VE applications.

A pilot study suggested two additional enhancements to this general approach. First, the user needs to be able to identify the motion vector. If the computer immediately starts tracking to the constraint points, the motion vector has not been made evident to the user. Therefore, whenever the user initiates movement, we do not activate the attentive camera until after the first ten steps; this allows a visual flow to be established that gives the user a sense of the motion vector. Second, users in the pilot study complained about disorientation when they were walking forward while looking over their shoulders, and some participants did not like the prospect of colliding with a boundary while their attention was diverted to another landmark. Additional constraints could certainly be added to force users to attend to boundaries; however, that would add unnecessary complexity to the constraint space. Instead, we restricted the attentive camera algorithm such that attention is only active for objects within 60 degrees to either side of the motion vector. If the attended object is out of bounds, the gaze vector is interpolated back into alignment with the motion vector. The net effect is that users never have to look over their shoulders, and potential collisions are always in view during forward motion. Figure 1 demonstrates this approach. Motion starts at point A. From A to B, the gaze remains aligned with the motion vector to allow the user to establish the direction of motion. From B to C, the gaze is redirected from the motion vector to fixate on an object of interest. From C to D, the intended gaze vector would differ from the motion vector by more than 60°, so the gaze is shifted back to the motion vector.
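The per-step gaze rule can be sketched as follows, under assumed parameters: attention activates after the first ten steps and only for objects within 60 degrees of the motion vector, and the gaze eases toward its target rather than snapping. The function names and the easing rate are illustrative, not the paper's implementation.

```python
import math

MAX_OFFSET = math.radians(60.0)  # attention window to either side of the motion vector
WARMUP_STEPS = 10                # steps before the attentive camera activates
EASE = 0.2                       # fraction of the remaining angle closed per step (assumption)

def next_gaze(step, gaze, motion_heading, target_heading):
    """Return the new gaze angle (radians) for one step of forward motion."""
    if step < WARMUP_STEPS:
        return motion_heading              # establish the motion vector first
    offset = target_heading - motion_heading
    if abs(offset) <= MAX_OFFSET:
        desired = target_heading           # fixate the attended object
    else:
        desired = motion_heading           # object out of bounds: look ahead again
    return gaze + EASE * (desired - gaze)  # smooth interpolation toward the goal

# During warmup the gaze stays locked to the motion vector:
print(next_gaze(0, 0.5, 0.0, 1.0))
# After warmup, the gaze drifts toward an in-bounds object:
print(next_gaze(20, 0.0, 0.0, math.radians(30)))
```

Calling this once per simulation step reproduces the A-B-C-D behavior of Figure 1: aligned gaze during warmup, fixation while the object stays within 60°, and a smooth return to the motion vector once it passes out of bounds.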

Figure 1. Attentive Camera Navigation

Methods

A user study was conducted with 20 paid participants to assess the effectiveness of attentive camera navigation. Following verbal instructions and a training period, participants were given 15 minutes to explore a virtual environment. Each environment contained eighteen objects: three shapes (sphere, cube, and cone) in each of six colors (red, yellow, orange, green, blue, and violet). Upon completion, participants were given a map framework (boundary lines with the floor pattern) and asked to position the objects they had encountered on the map. A completed map is pictured in Figure 2. Each participant experienced two trials.

Figure 2: A completed map

The ability to gain survey knowledge depends somewhat on the topology of the environment. For this experiment, a simple space was constructed that adhered to the following principles:

• A survey of the entire room cannot be obtained from a single vantage point; successfully completing the task required the user to move about the environment.
• A checkerboard floor pattern was in place to assist the user in estimating distance traveled. This feature also guaranteed that a visual flow would always be present whenever the user was mobile.
• Different colored walls allowed the user to maintain orientation in a symmetric room. The wall colors were also presented in the map framework and initialized the orientation for the evaluation task.

A sample environment is pictured in Figure 3.

Figure 3: A sample environment

Ten participants navigated with the attentive camera technique, while the other ten acted as a control group, using non-attentive, golf-cart navigation. For each trial, attentive camera users had their gaze redirected to a specific subset of objects in the room: in the first trial, attention was focused on the six spheres; in the second trial, users attended to the six cubes.

Results

The maps drawn by participants were analyzed for object misplacement. For each object, the Euclidean distance between its actual location and its reported location was calculated. There were no significant differences in the overall error measure between participants using the attentive camera technique and those using the golf-cart technique; wide variance across subjects suggests that individual spatial ability was a dominant factor in overall scores. This means that the attentive camera method of navigation does not interfere with a person's ability to obtain survey knowledge of a VE. In other words, attentive camera navigation appears to be as easy to use as the more traditional golf-cart method.

The next analysis compares how accurately individuals placed different types of objects. Specifically, we are interested in how well they placed the objects that were the focus of gaze fixation (Attended Objects) versus those objects that were ignored by the attentive camera (Unattended Objects). Figure 4 indicates that users of the attentive camera technique made significantly lower placement error on Attended Objects than on Unattended Objects (Trial 1: t = -1.921; Trial 2: t = -2.096; df = 9). The control group demonstrated no difference in placement error between Attended and Unattended Objects (Trial 1: t = 0.370; Trial 2: t = 1.439; df = 9). This indicates that redirecting the user's gaze to objects improves the ability to gain survey knowledge of those objects. Conversely, objects that are unimportant or superfluous to the task are ignored by the user, introducing more error in survey knowledge of them.


The participants used a standard mouse as the input device. Movement was registered by displacement from the position at which the mouse button was pressed, and the magnitude of the displacement was translated into a velocity in the VE. Moving the mouse forward or backward produced motion in the environment, while moving it right or left caused the user to pivot clockwise or counterclockwise in place. Users could not move and pivot simultaneously.
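This control mapping can be sketched as follows. The gains and the tie-breaking rule between translation and pivoting are illustrative assumptions, since the paper does not specify them.

```python
# Hypothetical gains mapping mouse displacement (pixels) to motion in the VE.
SPEED_GAIN = 0.01   # world units per second per pixel of vertical displacement
TURN_GAIN = 0.002   # radians per second per pixel of horizontal displacement

def mouse_to_motion(dx, dy):
    """Map displacement from the button-press point to (forward_speed, pivot_rate).

    Exactly one of the two outputs is nonzero, mirroring the restriction
    that users cannot move and pivot simultaneously. Which axis wins on a
    diagonal drag (here: the larger one) is an assumption.
    """
    if abs(dy) >= abs(dx):
        return (-dy * SPEED_GAIN, 0.0)  # pushing the mouse away moves forward
    return (0.0, dx * TURN_GAIN)        # sideways displacement pivots in place

print(mouse_to_motion(0, -50))  # pushed forward 50 px: translate only
print(mouse_to_motion(80, 10))  # mostly sideways: pivot only
```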


Figure 4: Mean error measures by object type (Attended vs. Unattended)

Despite the large individual differences in spatial ability, this analysis yields another significant result: across both trials, the overall mean error for attended objects was lower for the attentive camera technique than for the control group (t = 2.842; df = 19).
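The analysis pipeline described above, Euclidean placement error per object followed by a paired comparison across participants, can be sketched as follows. All coordinates and error scores are invented for illustration; the study's actual data are not reproduced here.

```python
import math
import statistics

def placement_error(actual, reported):
    """Mean Euclidean distance between actual and reported object positions."""
    return statistics.mean(math.dist(a, r) for a, r in zip(actual, reported))

def paired_t(xs, ys):
    """Paired t statistic and degrees of freedom (n - 1) for matched samples."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n)), n - 1

# One participant's map: error over three objects (invented coordinates).
actual   = [(1.0, 1.0), (4.0, 5.0), (7.0, 2.0)]
reported = [(1.0, 2.0), (5.0, 5.0), (7.0, 2.0)]
print(placement_error(actual, reported))  # (1 + 1 + 0) / 3

# Ten participants' mean errors per condition (invented scores).
attended   = [4.2, 5.1, 3.8, 6.0, 4.9, 5.5, 4.0, 5.2, 4.7, 5.0]
unattended = [6.1, 7.0, 5.2, 8.3, 6.4, 7.7, 5.9, 6.8, 6.2, 7.1]
t, df = paired_t(attended, unattended)
print(df)  # 9, matching ten participants per group
```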

It is also worth noting that a ceiling effect was observed in the success of the attentive camera. Two of the ten participants using the attentive camera technique placed the unattended objects as accurately as the attended objects, with a total average error of 4.7 (just over one block per object). Had the environment been more complex, we expect that the attended score would have stayed low while the unattended score increased.

Discussion

Thorndyke and Hayes-Roth argued that motion is critical for obtaining orientation-free survey knowledge. One explanation they offered is that the navigator is exposed to multiple perspectives. The attentive camera takes this exposure a step further: instead of merely passing in and out of the visual field, the object becomes the focal point of the scene. As the user walks past, this has the effect of the environment rotating around the object, maximizing the number of different perspectives. Since this is done automatically, the user does not need to take any overt action to take advantage of this optimized information. Furthermore, our implementation of the motion algorithm not only overcomes some of the intrinsic difficulties of attentive camera navigation; it also promotes exposure to the attended objects. Recall that when motion is initiated, a motion vector is established along the current line of sight. Whenever the user pauses to examine what the computer has redirected him to see, his motion will resume in the direction of that object; the user would have to actively pivot away from the attended object in order to avoid it. Finally, the motion methodology also affords the ability to establish connections. When navigating in a VE, the heading is often determined by selecting a visible destination object. As the explorer traverses the path, his attention is drawn to another object. This redirection forges a connection between the intended destination and the attended object: "As I walk toward X, I'm going to look at Y". Connections like this help build configurational knowledge.

Until now, VE designers have had to make a trade-off. On one hand, they know that self-guided navigation is the best way to learn the configuration of their environments. Permitting it, however, leaves the potential for significant elements to be overlooked until extensive training has been completed. Implementing attentive camera navigation allows designers to prioritize a set of objects that should be learned first. Users still control their exploration of the environment, but they are encouraged to examine specific elements through automatic gaze redirection.

Previous literature suggests that the development of complete survey knowledge is strictly dependent on repeated exposure to the landmarks in the environment. The problem is that, given the complexity of a realistic VE, an explorer may focus on irrelevant or redundant information first, prolonging the development of survey knowledge. The attentive camera technique is effective at focusing the user's attention on significant elements in the environment, maximizing exposure to them and thus knowledge of their configuration.

Acknowledgments

This research was supported by Office of Naval Research grant N-00014-96-1-1222.

References

1. Darken, R. P., & Sibert, J. L. (1993). "A Toolset for Navigation in Virtual Environments". Proceedings of UIST '93, 157-165.
2. Goerger, S., Darken, R., Boyd, M., Gagnon, T., Liles, S., Sullivan, J., & Lawson, J. (1998). "Spatial Knowledge Acquisition from Maps and Virtual Environments in Complex Architectural Spaces". Proceedings of the 16th Applied Behavioral Sciences Symposium, 22-23 April, U.S. Air Force Academy, Colorado Springs, CO, 6-10.
3. Hanson, A. J., Wernert, E., & Hughes, S. (1999). "Constrained Navigation Interfaces". In Hans Hagen, editor, Scientific Visualization. IEEE Computer Society Press.
4. Hanson, A. J., & Wernert, E. (1997). "Constrained 3D Navigation with 2D Controllers". In Proceedings of Visualization '97. IEEE Computer Society Press.
5. Satalich, G. (1995). "Navigation and Wayfinding in VR: Finding the Proper Tools and Cues to Enhance Navigational Awareness". Unpublished Master's Thesis, University of Washington.
6. Siegel, A. W., & White, S. H. (1975). "The development of spatial representations of large scale environments". In H. Reese (Ed.), Advances in Child Development and Behavior, 10. Academic Press.
7. Thorndyke, P., & Hayes-Roth, B. (1982). "Differences in spatial knowledge acquired from maps and navigation". Cognitive Psychology, 14.