Avatar: A Virtual Reality Based Tool for Collaborative Production of Theater Shows

Christian Dompierre and Denis Laurendeau
Computer Vision and System Lab., Laval University, Quebec City, QC, Canada, G1K 7P4
[email protected], [email protected]

Abstract

One of the most important limitations of current tools for performing arts production and design is that collaboration between designers is hard to achieve. In practice, designers must be co-located to collaborate on the design of a show, which is not always possible. While teleconferencing tools could partially solve this problem, they offer no direct interactivity and no synchronization between designers, and they suffer from inherent limitations such as perspective distortion and a single viewpoint constrained by the camera. Specialized software for performing arts design (e.g. "Life Forms") generally does not provide real-time collaboration and is not well suited to collaborative work; such systems are also often expensive and complex to operate. A solution combining concepts from virtual reality, network technology, and computer vision has therefore been developed specifically for collaborative work by performing arts designers. This paper presents the resulting virtual reality application for supporting the distributed collaborative production of theater shows. Among other constraints, this application must ensure that the virtual scene shared by multiple designers always remains in sync (by means of computer vision) with its real counterpart, and that this synchronization is achieved in real-time. In addition, system cost must be kept as low as possible, platform independence must be achieved whenever possible and, since the application is to be used by people who are not computer experts, it has to be user-friendly.

Introduction

In March 2004, the LANTISS laboratory (Laboratoire des nouvelles technologies de l'image, du son et de la scène) was created at Laval University in Quebec. The mission of this laboratory is to foster the use of information technology in performing arts design.

In this context, a research project called "The Virtual Mockup" has been developed. The main objective of this project is to provide performing arts directors with tools for the distributed interactive design of live performances. To reach this goal, an application called "Avatar", combining virtual reality, network technology, and computer vision, was built. This application provides a versatile yet simple-to-use environment for collaborative performance design.

Directors usually design performances on a miniature model of the stage, called a "mockup" ("castelet" in French). Our application does not aim at replacing this popular traditional tool, which directors appreciate for various reasons; rather, Avatar augments it with a virtual component, called "the virtual mockup", that can be shared by several directors during collaborative performance design. Designers thus still use the "real" mockup, but now use it as an input/output peripheral for the virtual mockup. With Avatar, directors design a performance as they usually do on a real mockup, but can now also count on a virtual replica of the scene that is synchronized with it and that can be shared by directors at different geographical locations. Using Avatar, directors (local and remote) can visualize in 3D the scene being designed on the real mockup. In addition, remote directors can modify the virtual mockup to collaborate in the design of the show; these changes are then transposed to the real mockup to keep all representations of the scene in sync.

An important constraint in this project is that the synchronization between the virtual mockup and the real mockup be achieved in real-time in order to be effective. Other constraints are that system cost be kept low (e.g. by using open source support tools) and that platform independence be as wide as possible, since different designers may use different operating systems. This paper describes the components of Avatar that allow the interactive collaborative design of performances using both a real mockup and a shared virtual mockup of a scene. The tools that were available for implementing Avatar are described first, followed by the solution that was adopted.

The paper then describes the collaborative aspects of Avatar that allow distributed collaborative work in performing arts design. Although this approach is not unique to Avatar, the paper describes the constraints to which Avatar is subject and the solutions proposed to face the challenges specific to performing arts design. The validity of the solution is supported by experimental results. Finally, improvements that could increase the performance of Avatar are presented.

Avatar as a Virtual Reality Application

Before being a performing arts support tool, Avatar is first a virtual reality application. Consequently, it is very important to build Avatar with tools adapted to VR in order to avoid reinventing the wheel. With respect to graphics rendering of the virtual mockup, several rendering engines are currently available. The Visualization Toolkit (VTK) [1] has been chosen among existing graphics rendering tools first because it supports stereoscopic visualization and, second, because it allows the development of platform-independent applications. VTK is also a readily available open source package. Since Avatar is to be used by non-computer specialists, its graphical user interface (GUI) has to be simple and easy to use. Qt [2] has thus been chosen because it eases the development of GUIs and is platform independent; Qt is also available as an open source library on most operating systems. The VTK_Qt API developed by Carsten Kübler [3] has been used to combine VTK and Qt in a single application. This also maintains the open source flavor of Avatar, which is entirely developed in C++ and currently runs on Windows, Linux, and Mac OS X. In order to let directors visualize the virtual mockup in 3D through stereoscopy, Avatar supports two stereo visualization modes: active stereo with shutter goggles (Crystal Eyes) and cheaper anaglyphs when cost is an issue. Avatar is an open source application hosted on SourceForge [4] and can be downloaded at the following Internet address: http://sourceforge.net/projects/avatar. Now that the development environment on which Avatar is based has been described, the next section explains how a real mockup can be used as an input device for collaborative performing arts design using a virtual mockup.
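As a rough sketch of how these choices fit together, the following minimal example (plain VTK calls only; the Qt embedding through the VTK_Qt API is omitted, and the goggle-selection flag is an assumption) shows how the two stereo modes can be enabled on a VTK render window:

    #include <vtkRenderer.h>
    #include <vtkRenderWindow.h>
    #include <vtkRenderWindowInteractor.h>

    int main(int, char*[])
    {
      vtkRenderer* renderer = vtkRenderer::New();
      renderer->SetBackground(0.1, 0.1, 0.2);      // dark, stage-like backdrop

      vtkRenderWindow* window = vtkRenderWindow::New();
      window->AddRenderer(renderer);
      window->StereoCapableWindowOn();             // request a stereo-capable context

      bool haveShutterGlasses = false;             // assumption: chosen at startup
      if (haveShutterGlasses)
        window->SetStereoTypeToCrystalEyes();      // active stereo (shutter goggles)
      else
        window->SetStereoTypeToAnaglyph();         // low-cost red/cyan anaglyphs
      window->StereoRenderOn();

      vtkRenderWindowInteractor* interactor = vtkRenderWindowInteractor::New();
      interactor->SetRenderWindow(window);
      interactor->Initialize();
      interactor->Start();                         // hand control to the event loop

      interactor->Delete();
      window->Delete();
      renderer->Delete();
      return 0;
    }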

Using a Real Mockup as an Input Device for Interactive Performing Arts Design

A major objective of Avatar is to allow directors to use a traditional mockup as a virtual reality input device for the virtual mockup shared by the remote directors participating in the design. Consequently, Avatar must allow figurines of actors in the real mockup to be used as VR "peripheral devices" for editing a scene. This implies that figurines in the real mockup must be associated with their virtual counterparts in the virtual mockup. This association between real and virtual figurines is achieved using computer vision: in Avatar, the real figurines manipulated by a director are tracked by a cheap webcam and their pose is computed in real-time. The poses are then applied to the corresponding virtual figurines in order to maintain the coherence between the real and virtual mockups.

Several approaches have been proposed in the literature for pose estimation in the context of augmented reality [6] [19]. Some approaches [7] [8] are based on "global" pose estimation techniques that use object features for estimating pose and imply a training phase for the object recognition / pose estimation system. Other techniques adopt a local approach to pose estimation and again use feature points that are matched to a priori models of the objects to be tracked; the challenge consists in finding reliable feature points that can be used for robust pose estimation. Shi and Tomasi [9] describe a criterion for choosing reliable feature points. Local techniques are generally more robust than global techniques since they tolerate partial occlusion of the objects for which pose must be estimated, especially when 3D pose must be estimated from 2D images [10] [11]. The major problem with natural object features is that image noise makes matching features against the stored model more difficult [12] [11]. Several approaches have been proposed to solve the feature-matching problem. For instance, State et al. [13] combine feature-based pose estimation with magnetic pose tracking. Park, You and Neumann [14] [15] use artificial targets for pose tracking initialization and update. The use of artificial markers improves the robustness of pose estimation significantly; some techniques even rely on artificial markers alone, since such markers are easier to segment. Typical markers are color-coded targets [16] [17] or other types of easily identifiable targets [5] [18] [20]. In Avatar, ARToolkit [5] has been used for estimating the pose of the real figurines from the 2D images provided by a webcam observing the figurines manipulated in the real mockup by the director.

ARToolkit consists of a library of C functions for the development of augmented reality applications. It achieves real-time pose tracking of objects using artificial targets; since it relies on very simple computer vision principles, ARToolkit allows objects to be tracked in real-time, a prerequisite in Avatar. In Avatar, artificial markers (i.e. targets) are glued to the real figurines manipulated by the director. By observing the targets with a webcam, the pose of each figurine is computed in real-time and applied to the corresponding virtual figurine, as illustrated in Figure 1. The figurines in the real mockup are thus used as input devices that drive the pose of their virtual counterparts in the virtual mockup. The virtual figurines are not required to look exactly like the real ones: figurines in the real mockup can be simple blocks, while the virtual figurines may be complex actor models rendered in stereo in the GUI of remote directors.
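The per-frame tracking step can be pictured with the sketch below, written against ARToolkit's C API (arDetectMarker / arGetTransMat); the threshold value and the updateVirtualFigurine helper are illustrative assumptions, not Avatar's actual code:

    #include <AR/ar.h>
    #include <AR/video.h>

    // Hypothetical hook that applies the estimated pose to the virtual figurine.
    void updateVirtualFigurine(int markerId, double pose[3][4]);

    void trackFigurine(int pattId, double pattWidth)
    {
      ARUint8* frame = arVideoGetImage();          // grab the current webcam frame
      if (frame == NULL) return;

      ARMarkerInfo* markers;
      int           markerNum;
      const int     threshold = 100;               // binarization threshold (assumed)
      if (arDetectMarker(frame, threshold, &markers, &markerNum) < 0) return;

      // Keep the most confident detection of the target glued to this figurine.
      int best = -1;
      for (int i = 0; i < markerNum; ++i)
        if (markers[i].id == pattId && (best < 0 || markers[i].cf > markers[best].cf))
          best = i;
      if (best < 0) return;                        // figurine not visible this frame

      double center[2] = { 0.0, 0.0 };
      double pose[3][4];                           // 3x4 camera-to-marker transform
      arGetTransMat(&markers[best], center, pattWidth, pose);

      updateVirtualFigurine(pattId, pose);         // drive the virtual counterpart
    }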

Figure 1: A scene in the real mockup can be recreated in a virtual mockup in Avatar by tracking the pose of the actors with a low-cost camera.

Support of Collaborative Work in Avatar

As described above, the figurines of the real mockup are used, through the input interface provided by ARToolkit, to change the state of the virtual mockup. Since Avatar allows several users to share the same virtual mockup, a client-server architecture has been adopted to allow the virtual mockup to be shared.
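The sharing mechanism can be pictured with the following sketch; the message layout and the helper routines are illustrative assumptions, since the paper does not specify Avatar's actual wire format:

    #include <cstdint>
    #include <vector>

    // A pose update relayed by the server to keep all virtual mockups in sync.
    struct PoseUpdate {
      std::uint32_t figurineId;    // which virtual figurine moved
      double        matrix[3][4];  // its new 3x4 rigid transform
    };

    void applyToServerMockup(const PoseUpdate& msg);       // hypothetical
    void sendToClient(int socket, const PoseUpdate& msg);  // hypothetical

    // Server-side relay: apply the update locally, then broadcast it to every
    // client except the one it came from.
    void onClientUpdate(const PoseUpdate& msg, int fromSocket,
                        const std::vector<int>& clientSockets)
    {
      applyToServerMockup(msg);
      for (int sock : clientSockets)
        if (sock != fromSocket)
          sendToClient(sock, msg);
    }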

In Avatar, an instance of the virtual mockup is maintained in perfect synchronization with the real mockup on the server node. Each client also holds its own copy of this instance, and any change brought to the virtual mockup on the server is automatically sent to the client copies, which are updated to reflect it. Conversely, changes brought by remote directors to the client copies (the displacement of virtual figurines, for instance) are sent to the server and broadcast to the other clients.

While this solution may at first seem sufficient, it raises serious problems when real figurines are used: the manipulation of a figurine by one director does not impose the displacement of its counterpart in the other real mockups, so the virtual mockup would have to match multiple real configurations at the same time. Allowing several real mockups to control the state of the shared virtual mockup in a collaborative design is thus problematic. In Avatar, the solution is that only the director using the server node has access to a real mockup of the scene; registering more than one real mockup and keeping these multiple instances in sync with each other would be very complex. Imposing a single real mockup is not really a limitation since: i) the shared virtual mockup is in sync with the real mockup and thus gives remote directors a replica of the real mockup in their own virtual world; ii) real mockups occupy a lot of space and must be stored in a storeroom when not in use, so maintaining more than one is expensive and inefficient.

However, even with a single real mockup, Avatar still faces the problem of keeping the real mockup in sync with the virtual mockup when remote directors modify the state of the latter. Since the virtual figurines always correspond to real figurines whose poses are tracked by the camera observing the real mockup, a virtual figurine whose pose has been changed by a client is instantly brought back to the pose of its real counterpart. A client moving a virtual figurine in its instance of the virtual mockup therefore only induces a very short-lived change: the state of the scene is immediately reset to the state of the real mockup, and this reset is immediately broadcast to the copies of the virtual mockup stored on the client nodes.

The solution proposed to solve this problem is to attach two identical virtual figurines (in the server's copy of the virtual mockup) to each real figurine in the real mockup. One of these two virtual figurines is called the "pseudo-virtual" figurine and the second is called the "purely virtual" figurine. The pose of the "pseudo-virtual" figurine always follows the pose of the real figurine in the real mockup, while the pose of the "purely virtual" figurine is shared with the copies of the same figurine on the client nodes (which only hold one copy of each figurine). When the pose of a figurine is changed on a client node, the new pose is sent to the server, which updates its copy of the "purely virtual" figurine and forwards the update to the other client nodes. The director on the server node then moves the real figurine in the real mockup until the pose of the "pseudo-virtual" figurine, computed by ARToolkit, matches that of the "purely virtual" figurine shared with the clients. Once both poses match within an acceptable error, the pose of the "purely virtual" figurine on the server is snapped to the pose of the "pseudo-virtual" figurine and the final pose is broadcast to the client nodes, which are then in perfect sync with the server. Figure 2 illustrates this solution, which allows collaborative work in conjunction with the use of a real mockup.
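The snap test at the heart of this scheme can be sketched as follows; the simplified pose representation, the tolerances, and the broadcastPose helper are illustrative assumptions (Avatar tracks full 3D poses through ARToolkit):

    #include <cmath>

    struct Pose { double x, y, z, yaw; };  // simplified for the sketch

    void broadcastPose(const Pose& p);     // hypothetical: push pose to clients

    // True when the tracked pose is within an acceptable error of the target.
    bool closeEnough(const Pose& tracked, const Pose& target)
    {
      const double posTol = 5.0;           // position tolerance (assumed units: mm)
      const double angTol = 0.1;           // orientation tolerance (radians)
      double dx = tracked.x - target.x, dy = tracked.y - target.y,
             dz = tracked.z - target.z;
      return std::sqrt(dx*dx + dy*dy + dz*dz) < posTol
          && std::fabs(tracked.yaw - target.yaw) < angTol;
    }

    // Once the director has moved the real figurine close enough, the "purely
    // virtual" pose snaps to the tracked "pseudo-virtual" pose and is broadcast.
    void synchronize(Pose& purelyVirtual, const Pose& pseudoVirtual)
    {
      if (closeEnough(pseudoVirtual, purelyVirtual)) {
        purelyVirtual = pseudoVirtual;
        broadcastPose(purelyVirtual);
      }
    }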

Figure 2: Avatar allows multiple users to share a virtual world even when real figurines are used as input peripherals. In this example, a client first moves a virtual figurine in its instance of the virtual mockup. This change is automatically sent to the server (and then broadcast to the other clients) and is applied to the "purely virtual" actor in the server's instance of the virtual mockup. After the director moves the real figurine (and thus the "pseudo-virtual" actor) until it reaches the requested pose, the "purely virtual" actor is snapped to the "pseudo-virtual" one and its pose is corrected. This correction is then broadcast to all clients to ensure that every instance of the virtual mockup is in sync with the real mockup.

Results

Figure 3: Some elements of the GUI of Avatar

Designed for non-computer specialists, Avatar presents a user-friendly GUI that is easy to understand and use. Figure 3 shows some of the options available in the GUI of Avatar. Building a scene in Avatar is easy: all one needs to do is add actors to the virtual world, and actors can then be positioned anywhere in it. When an actor is added to the scene (by any user), it is added to every instance of the shared virtual mockup. The user on the server node also needs to tell Avatar which marker the new actor is associated with in the real mockup if the latter is to be used to design the scene. Users have a lot of freedom when working with Avatar: they can change the background color, display a reference grid, apply any texture they wish, move, rotate, and scale actors at will, move the camera, decide whether or not to use one of the two built-in stereo modes, etc. As a result, designers can build a complete scene very easily with Avatar. Figure 4 shows an example of a scene (from the movie The Matrix Reloaded (2003)) designed with Avatar.
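At the VTK level, adding and transforming an actor boils down to a few calls, as in the sketch below (Avatar loads richer figurine models; the stock cone source and the helper name are illustrative):

    #include <vtkActor.h>
    #include <vtkConeSource.h>
    #include <vtkPolyDataMapper.h>
    #include <vtkRenderer.h>

    vtkActor* addFigurine(vtkRenderer* renderer)
    {
      vtkConeSource* source = vtkConeSource::New();  // stand-in geometry
      vtkPolyDataMapper* mapper = vtkPolyDataMapper::New();
      mapper->SetInputConnection(source->GetOutputPort());

      vtkActor* actor = vtkActor::New();
      actor->SetMapper(mapper);
      actor->SetPosition(1.0, 0.0, 2.0);  // place anywhere in the virtual world
      actor->RotateY(45.0);               // rotate...
      actor->SetScale(1.5);               // ...and scale at will

      renderer->AddActor(actor);
      source->Delete();                   // the pipeline keeps its own references
      mapper->Delete();
      return actor;
    }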

Figure 4: Example of a scene designed with Avatar.

One of the possibilities offered to directors while designing a performance with Avatar is to save custom viewpoints and return to them, or to one of the standard viewpoints predefined in Avatar, as shown in Figure 5.
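Viewpoint bookmarking of this kind can be sketched with vtkCamera; the Viewpoint structure and helper names below are illustrative assumptions:

    #include <vtkCamera.h>
    #include <vtkRenderer.h>

    struct Viewpoint { double position[3], focalPoint[3], viewUp[3]; };

    // Record the active camera's current pose.
    Viewpoint saveViewpoint(vtkRenderer* renderer)
    {
      Viewpoint v;
      vtkCamera* cam = renderer->GetActiveCamera();
      cam->GetPosition(v.position);
      cam->GetFocalPoint(v.focalPoint);
      cam->GetViewUp(v.viewUp);
      return v;
    }

    // Restore a previously saved (custom or predefined) viewpoint.
    void restoreViewpoint(vtkRenderer* renderer, Viewpoint v)
    {
      vtkCamera* cam = renderer->GetActiveCamera();
      cam->SetPosition(v.position);
      cam->SetFocalPoint(v.focalPoint);
      cam->SetViewUp(v.viewUp);
      renderer->ResetCameraClippingRange();  // keep the scene inside the clip planes
    }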

One improvement that could be brought to Avatar would be to model the light sources of the real mockup in order to add them to the virtual copies. At present, Avatar does not implement collision detection between virtual figurines in the virtual mockup; to avoid collisions, certain configurations of the real figurines are simply not reproduced when the real mockup is synchronized with the virtual mockup. Finally, it would be interesting to animate the virtual figurines in the virtual mockup in order to make the environment more realistic.

Figure 5: The camera position can be reset to viewpoints predefined in Avatar as well as to custom viewpoints defined by users.

Conclusion

A tool for supporting distributed interactive performing arts design has been described. The tool, called Avatar, mixes computer vision and virtual reality to let directors modify the scene of interest at will on both a real and a virtual mockup of the actual scene that will be built for a given show. Figure 6 illustrates the possibilities that Avatar offers to users in performing arts design. Even though Avatar has been designed for supporting the design of performing arts productions, it is not limited to this field and can be used in any application for which it is important to maintain the synchronization between the geometry of a real environment and its virtual counterparts. In addition, Avatar has been designed as a low-cost, platform-independent software package that is based on open source tools and can be used by non-computer experts. Its interface is very simple to understand and use, and it provides interesting support for collaborative work in performing arts. Future work aims at including more realistic figurines in the virtual mockup and at offering users the opportunity to include an environment map in the virtual mockup. Currently, the visual feedback provided by Avatar is a function of neither the position nor the movement of the user, who has to use the mouse to navigate in the virtual world. It is thought that ARToolkit could be exploited for this purpose, which would be an interesting feature to add to Avatar. To achieve this, ARToolkit would have to be adapted to low-light conditions such as those found in VR rooms.

Figure 6: Avatar can be used as a new interactive tool for performing arts design, allowing collaborative work between multiple designers.

References

[1] The Visualization Toolkit (VTK), http://public.kitware.com/VTK/
[2] Qt, Trolltech, http://www.trolltech.com/products/qt/index.html
[3] C. Kübler, VTK_Qt, http://wwwipr.ira.uka.de/~kuebler/vtkqt/
[4] SourceForge, http://sourceforge.net/
[5] ARToolkit, http://www.hitl.washington.edu/artoolkit/
[6] R. Azuma et al., "Recent Advances in Augmented Reality", IEEE Computer Graphics and Applications, Vol. 21, No. 6, 2001, pp. 34-47.
[7] S. K. Nayar, S. A. Nene, and H. Murase, "Real-Time 100 Object Recognition System", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 18, No. 12, 1996, pp. 1186-1198.
[8] P. Viola and M. Jones, "Rapid Object Detection Using a Boosted Cascade of Simple Features", Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2001, pp. 511-518.
[9] J. Shi and C. Tomasi, "Good Features to Track", Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR'94), Seattle, WA, June 1994, pp. 593-600.
[10] V. Lepetit, P. Lagger, and P. Fua, "Randomized Trees for Real-Time Keypoint Recognition", Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR'05), San Diego, CA, June 2005, pp. 775-781.
[11] T. Okuma, T. Kurata, and K. Sakaue, "A Natural Feature-Based 3D Object Tracking Method for Wearable Augmented Reality", Proc. 8th IEEE International Workshop on Advanced Motion Control (AMC'04), Kawasaki, Japan, 2004, pp. 451-456.
[12] G. Simon and M.-O. Berger, "Reconstructing While Registering: A Novel Approach for Markerless Augmented Reality", Proc. IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR), 2002, pp. 285-294.
[13] A. State et al., "Superior Augmented Reality Registration by Integrating Landmark Tracking and Magnetic Tracking", Computer Graphics, 1996, pp. 429-438.
[14] J. Park, S. You, and U. Neumann, "Natural Feature Tracking for Extendible Robust Augmented Realities", Proc. First IEEE International Workshop on Augmented Reality (IWAR'98), USA, 1998.
[15] J. Park, S. You, and U. Neumann, "Extending Augmented Reality with Natural Feature Tracking", SPIE Telemanipulator and Telepresence Technologies, Vol. 3524, Nov. 1998.
[16] U. Neumann and Y. Cho, "A Self-Tracking Augmented Reality System", Proc. VRST '96, 1996, pp. 109-115.
[17] U. Neumann et al., "Augmented Reality Tracking in Natural Environments", Mixed Reality: Merging Real and Virtual Worlds, Ohmsha & Springer-Verlag, 1999, pp. 101-130.
[18] J. Rekimoto, "Matrix: A Realtime Object Identification and Registration Method for Augmented Reality", Proc. APCHI'98, 1998.
[19] R. T. Azuma, "A Survey of Augmented Reality", Presence, Vol. 6, No. 4, 1997, pp. 355-385.
[20] H. Kato and M. Billinghurst, "Marker Tracking and HMD Calibration for a Video-Based Augmented Reality Conferencing System", Proc. 2nd IEEE and ACM International Workshop on Augmented Reality (IWAR'99), 1999, pp. 85-94.