Dynamic LoD for QoS Management in the Next Generation VRML

Masatoshi Arikawa†, Akira Amano†, Kaori Maeda†, Reiji Aibara††, Shinji Shimojo†††, Yasuaki Nakamura†, Kaduo Hiraki†, Kouji Nishimura††, Mutsuhiro Terauchi†
(† Hiroshima City University, †† Hiroshima University, ††† Osaka University)
http://www.h2o.hiroshima-cu.ac.jp/

Abstract
High speed computer networks will provide us with new broad band multimedia applications. This paper discusses new functions for the next generation VRML (Virtual Reality Modeling Language) over high speed computer networks. The LoD (Level of Detail) of 3D objects is the most important function for rendering scenes dynamically while managing the QoS (Quality of Service). New requirements for the next generation VRML are discussed. We present Differential VRML (DVRML) in order to update scene graphs dynamically, and describe principles of the LoD function based on the DVRML.
1. Introduction
Due to technological advances in computer graphics and the infrastructure of the Internet, the current style of Virtual Reality Modeling Language (VRML) [1,2] has gained overwhelming popularity as a 3D extension to the World Wide Web (WWW). The specification of the current VRML was settled as a compromise that took into account the limitations of current graphics and network technology. In this paper, we discuss a generalized model of the next generation VRML (NGVRML) designed for high speed computer networks and super computing graphics workstations, both of which will be affordable for ordinary users in the near future. NGVRML enables users to walk smoothly through the virtual world, to meet other visitors, and to experience richer virtual worlds that fuse 3D graphics scenes with real time video from the real world. In the model of NGVRML, the level of detail (LoD) is a key concept for controlling the traffic of network connections so as to guarantee the quality of service (QoS) in virtual reality applications.
2. The Current Style of VRML
In this paper, VRML means the Virtual Reality Modeling Language version 1.0.
2.1 Objects in a Scene
VRML is a 3D scene description language designed to be independent of computer platforms on the Internet. A clickable object is attached to a WWW anchor which points at some multimedia data file. In addition, a WWW anchor can point at a VRML file. A set of links between VRML files therefore forms hyper spaces in which we can travel from one space to another.

Another characteristic of VRML as an Internet language is the Inline function. The Inline function makes it possible for one 3D scene to be composed of multiple 3D scenes or objects which are described in other VRML files that can be located at remote sites. The Inline function has two advantages. The first is that we can see and walk through a 3D scene before all VRML data has been transferred; Inline 3D objects that are still being fetched are simply missing from the scene, and each one appears as soon as it is received. The second advantage is that one scene can be managed by multiple people in a distributed computing environment. Each of the Inline 3D objects comprising the scene can be managed by a separate individual.
2.2 Scene Graph and LoD
A 3D scene is represented by a scene graph in VRML [3]. Each node of a scene graph holds a piece of information such as surface, material, shape description, geometric transformation, light and camera. A scene graph is usually constructed as a tree structure. VRML is designed to describe the scene graph.

In general, the real world space is too large and complex to be described as one VRML file. If we make a VRML file representing a part of our real world, the file may become too large for the scene to be realized with current technology. There are always limitations of computer graphics, memory, CPU and networks. We must divide the large scale space into spaces small enough to be handled by current computer technology. Rooms and Doors are a good metaphor for dividing a large space into multiple small spaces with links between them so that we can smoothly walk through them.

VRML has the function of the level of detail (LoD). The LoD is represented as one node in a scene graph. The LoD node usually has multiple child nodes. For instance, one child node is prepared for fine representation, another for rough representation. A LoD node has threshold values for the distance from the user’s viewpoint to its corresponding 3D object in order to choose an appropriate child node. If we are far from the 3D object, the simple child node is selected from among the children of the LoD node; if we are near the 3D object, the complex child node is selected. Thus, one reason for the LoD is the limitations of computer graphics performance. Figure 1 illustrates examples of LoD in a scene. In the scene, from the user’s viewpoint, the objects [A], [B], [C] and [D] are rendered as level (1), (2), (2) and (3), respectively.

LoD is also useful to offset the limitations of slow networks. In order to reduce the wait for a new scene transfer, low detail LoD representations of the components in the scene are fetched first. After the simple LoD objects have been fetched and users have been given a rough sketch, higher detail LoD representations are fetched and replace the low detail ones. Some VRML browsers realize this transfer mechanism using the combination of LoD and Inline functions.
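The distance-based choice of a child node described in this section can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the class name LodNode, the field names ranges and children, and the convention that N ascending thresholds choose among N + 1 children ordered from fine to rough are hypothetical and are not taken from any VRML browser.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List


@dataclass
class LodNode:
    """Hypothetical LoD node: N ascending thresholds choose among N + 1 children."""
    ranges: List[float]    # distance thresholds, ascending
    children: List[str]    # ordered from finest to roughest representation

    def select(self, distance: float) -> str:
        # Number of thresholds that the viewing distance has reached or passed.
        index = bisect_right(self.ranges, distance)
        # Clamp so that a short child list still yields its roughest entry.
        return self.children[min(index, len(self.children) - 1)]


building = LodNode(ranges=[10.0, 50.0], children=["fine", "medium", "rough"])
for d in (3.0, 25.0, 200.0):
    print(d, building.select(d))
```

With these assumed thresholds, a viewer 3 units away gets the fine child, a viewer 25 units away the medium child, and a viewer 200 units away the rough child.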
Figure 1. Examples of LoD contours.

Figure 2. Order of prefetching Inline objects for a Scene Graph in VRML. (Round rectangles are individual VRML files.)

2.3 Network Connection
VRML was designed for current slow computer networks. To move from one space to another with a VRML browser, the user must click a 3D object with a WWW anchor in a scene. He must then wait until the entire VRML file is fetched. For instance, if the user wants to go through a door and enter the next room, he must click the door and is then left waiting until the entire corresponding VRML file of the next room is fetched. Some interactive two-dimensional graphics allow users to scroll contents in a window without waiting for new contents to be fetched in full. The smooth scrolling is realized by the technique of prefetching contents in advance. It is necessary to apply the prefetching technique to the walkthrough of VRML scenes in order to move smoothly from one space to another. If successive scenes linked to the current scene have already been prefetched, the user can immediately move to another scene by clicking a WWW anchor object in the current scene. The problem caused by prefetching successive scenes is extra network traffic due to unused prefetched scenes. In the current Internet society, increasing unnecessary traffic is not recommended. In the near future, however, the prefetching technique will be important in order for users to move smoothly from one space to another on the Internet.

The Inline function is useful for slow networks. An important task of the Inline function is scheduling the order of data transfer. The current Inline function in VRML browsers is executed on nodes using the rule of the breadth first search of a scene graph (Figure 2). Under the breadth first search rule, not only the objects within our view but also some objects out of our view are fetched at the same time. The objects within our view should be more important than objects out of view. The order of each object’s transfer should be decided with respect to the distance from the user’s viewpoint to each 3D object (Figure 3). The VRML systems that are composed of client/server applications have no such control mechanism based on the distance from the user’s viewpoint to a 3D object.
Figure 3. Order of prefetching Inline objects with respect to user’s viewpoint.
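The difference between the two fetch orders (Figures 2 and 3) can be sketched in Python as follows. This is a minimal sketch under assumptions: InlineNode, its position field and both function names are hypothetical, each node stands for one VRML file whose Inline children become known only once the file itself has been fetched, and straight-line distance to the viewpoint is used as the only priority.

```python
import heapq
import math
from collections import deque
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class InlineNode:
    """Hypothetical Inline object: one VRML file placed at a position."""
    name: str
    position: Point
    children: List["InlineNode"] = field(default_factory=list)


def breadth_first_order(root: InlineNode) -> List[str]:
    """Fetch order that follows the breadth first search of the scene graph."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node.name)
        queue.extend(node.children)
    return order


def distance_first_order(root: InlineNode, viewpoint: Point) -> List[str]:
    """Fetch the known-but-unfetched file nearest to the viewpoint first."""
    order, counter = [], 0
    frontier = [(math.dist(root.position, viewpoint), counter, root)]
    while frontier:
        _, _, node = heapq.heappop(frontier)
        order.append(node.name)
        for child in node.children:   # revealed only after the parent arrives
            counter += 1              # tie-breaker so nodes are never compared
            heapq.heappush(
                frontier, (math.dist(child.position, viewpoint), counter, child))
    return order


room = InlineNode("room", (0, 0, 0), [
    InlineNode("desk", (10, 0, 0), [InlineNode("lamp", (10, 1, 0))]),
    InlineNode("chair", (2, 0, 0), [InlineNode("cushion", (2, 0.5, 0))]),
    InlineNode("shelf", (30, 0, 0)),
])
print(breadth_first_order(room))
print(distance_first_order(room, viewpoint=(1, 0, 0)))
```

In this example the breadth first rule fetches the distant shelf before the cushion on the nearby chair, while the distance-based rule fetches the chair and its cushion immediately after the room itself.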
3. A Generalized Model of NGVRML
3.1 Dynamic Scene Graph and Differential VRML (DVRML)
In VRML, we cannot change the current scene graph dynamically. In other words, we cannot update a scene graph at all in VRML, for example to move a chair or to put a note on a wall. In order to realize the multi-user space function and live video texture incorporation, the scene graphs should be updatable dynamically. We present a description language, Differential VRML (DVRML), used to dynamically change a scene graph (Figure 4). DVRML can add nodes to the scene graph, delete nodes from the scene graph, and change a value in a node of the scene graph. Furthermore, DVRML allows a spatial database to manage smaller component 3D objects than those in VRML.
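The three kinds of update named above (adding nodes, deleting nodes, changing a value in a node) can be illustrated with a small Python sketch. The concrete syntax of DVRML is not given here, so the dictionary-shaped operations, the SceneGraph class and every field name below are hypothetical and serve only to show how differential updates might be applied to a tree-structured scene graph.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Node:
    node_id: str
    fields: Dict[str, Any] = field(default_factory=dict)
    children: List[str] = field(default_factory=list)
    parent: Optional[str] = None


class SceneGraph:
    """Hypothetical scene graph that accepts DVRML-like differential updates."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {"root": Node("root")}

    def apply(self, op: Dict[str, Any]) -> None:
        kind = op["op"]
        if kind == "add":        # addition of a node under an existing parent
            node = Node(op["id"], dict(op.get("fields", {})), parent=op["parent"])
            self.nodes[node.node_id] = node
            self.nodes[op["parent"]].children.append(node.node_id)
        elif kind == "delete":   # deletion of a node together with its subtree
            self._delete(op["id"])
        elif kind == "change":   # change of one value in an existing node
            self.nodes[op["id"]].fields[op["field"]] = op["value"]
        else:
            raise ValueError(f"unknown operation: {kind}")

    def _delete(self, node_id: str) -> None:
        node = self.nodes.pop(node_id)
        parent = self.nodes.get(node.parent) if node.parent else None
        if parent is not None:
            parent.children.remove(node_id)
        for child_id in list(node.children):
            self._delete(child_id)


scene = SceneGraph()
scene.apply({"op": "add", "id": "chair", "parent": "root",
             "fields": {"translation": (0.0, 0.0, 0.0)}})
# Moving the chair is a single value change, not a new scene file.
scene.apply({"op": "change", "id": "chair",
             "field": "translation", "value": (2.0, 0.0, 1.0)})
scene.apply({"op": "add", "id": "note", "parent": "root",
             "fields": {"text": "meeting at 3"}})
scene.apply({"op": "delete", "id": "chair"})
```

After these four updates the scene contains the root and the note only; each step transferred just the difference rather than the whole scene.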
Figure 4. Dynamic scene graph and DVRML. (DVRML operations shown: addition of nodes, deletion of nodes, change of values of nodes.)

In a continuous hyper space, we must prefetch the spaces which are linked from the current space by chains of WWW anchors. The order of fetching component objects in a space is not optimal in VRML. If all component objects can be stored in the form of DVRML, the order of fetching component objects can be decided with respect to the user’s viewpoint, view direction, and speed (Figure 5). VRML can often cause a burst of network traffic when a user clicks a graphic object attached to a WWW anchor and the corresponding scene is suddenly transferred. DVRML is effective for keeping the amount of required data transfer constant.

Figure 5. Prefetching area in DVRML. (Labels: user, viewing area, prefetching area.)

In order to make use of DVRML, the clients and servers must understand the DVRML protocols; we call them DVRML clients and DVRML servers. An advantage of DVRML is that it is easier to add new functions to DVRML clients and servers in prototype systems. For instance, to attach facial videos to the face parts of avatars, we can send DVRML data that changes the textures of the facial parts, as video textures, to DVRML clients in real time. DVRML can be used as a fundamental protocol. More specific protocols, such as those for multi-user spaces, live video and LoD, can be constructed on top of the DVRML protocol.

In VRML, a scene must correspond to a file. In DVRML, there is no need for a correspondence between a scene and a file. Scenes based on DVRML are dynamically constructed by using only rules of LoD. If an object is near to a user, the object with the highest quality detail should be put in the scene. If the object is far from the user, either the object should not be put in the scene or its rough representation should be put in the scene. If a large object is far from the user but does not appear small, the object with an appropriate representation should be put in the scene. If a user is walking in a certain direction, the objects that he is approaching should be inserted into the scene graph and the objects that he is moving away from should be removed from the scene graph. The complexity of a scene can be controlled dynamically depending on the performance of computers, the number of objects in the scene, and the movement speed of users.
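The LoD rules of this section can be sketched in Python as follows. This is one possible reading, not a definitive rule set: the ratio of object size to viewing distance is assumed as the measure of how large an object appears, the two thresholds are arbitrary, and the names choose_representation, build_scene and diff_scene are hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float, float]
# Hypothetical object table: name -> (position, size of the object).
Objects = Dict[str, Tuple[Point, float]]


def choose_representation(size: float, distance: float,
                          fine_above: float = 0.5,
                          rough_above: float = 0.05) -> Optional[str]:
    """Pick a representation from how large the object appears to the user."""
    apparent = size / max(distance, 1e-6)   # assumed stand-in for apparent size
    if apparent >= fine_above:
        return "fine"
    if apparent >= rough_above:
        return "rough"
    return None                             # too small to be put in the scene


def build_scene(objects: Objects, viewpoint: Point) -> Dict[str, str]:
    scene = {}
    for name, (position, size) in objects.items():
        level = choose_representation(size, math.dist(position, viewpoint))
        if level is not None:
            scene[name] = level
    return scene


def diff_scene(before: Dict[str, str],
               after: Dict[str, str]) -> List[Tuple[str, str, Optional[str]]]:
    """Express the change between two scenes as add/delete/change updates."""
    ops: List[Tuple[str, str, Optional[str]]] = []
    ops += [("add", n, after[n]) for n in sorted(after.keys() - before.keys())]
    ops += [("delete", n, None) for n in sorted(before.keys() - after.keys())]
    ops += [("change", n, after[n]) for n in sorted(after.keys() & before.keys())
            if before[n] != after[n]]
    return ops


objects: Objects = {
    "chair": ((1.0, 0.0, 0.0), 1.0),
    "bench": ((95.0, 0.0, 0.0), 1.0),
    "tower": ((100.0, 0.0, 0.0), 60.0),
}
start = build_scene(objects, (0.0, 0.0, 0.0))
later = build_scene(objects, (90.0, 0.0, 0.0))
print(start, later, diff_scene(start, later))
```

In this example the distant tower is shown in fine detail from the start because it is large, and walking 90 units toward it inserts the bench and removes the chair that has been left behind.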
3.2 Scheduling of DVRML Transfer and Consistency of Scenes
There are two types of DVRML systems used to manage the scheduling of the order of data transfer: client-based and server-based DVRML systems.

In the case of client-based DVRML systems, scheduling management for generating LoD is performed by DVRML clients. First, a DVRML client posts a spatial query to all servers. The spatial query obtains rough information about the locations of component objects with respect to the location of the user of the client system. All servers respond to the spatial query generated by the DVRML client. The DVRML client receives from the servers the meta information about the component objects within the current viewing area, so that it can make a map of the scene and decide the order of data transfer according to the location of the component objects, LoD and network speeds. Then the DVRML client sends queries about component objects to the particular servers managing them.

In the case of server-based DVRML systems, each server must manage a particular area, which is called a cell. One server, called a central DVRML server, knows all DVRML component objects in the cell, and also knows which clients or users are walking through the cell. The central DVRML server sends requests to the spatial database servers storing DVRML component objects so that the objects are transferred to clients according to the location of the clients, the speed of the users’ movement and network speeds. The server-based DVRML system is a classical centralized system. It makes clients in scenes easier to manage, but it is not practical when the number of clients becomes large.
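The client-based case can be sketched in Python as follows. The sketch rests on several assumptions that are not part of the description above: servers are modeled as in-process objects, the spatial query is a simple radius test around the viewpoint, and transfers are ordered by distance first and by estimated transfer time second. All class, field and function names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class ObjectMeta:
    """Meta information returned by a spatial query (hypothetical fields)."""
    name: str
    server: str
    position: Point
    size_bytes: int


class SpatialServer:
    def __init__(self, name: str, bytes_per_second: float,
                 objects: List[ObjectMeta]) -> None:
        self.name = name
        self.bytes_per_second = bytes_per_second   # assumed link speed
        self.objects = objects

    def spatial_query(self, center: Point, radius: float) -> List[ObjectMeta]:
        # Return only meta information for objects inside the queried area.
        return [m for m in self.objects
                if math.dist(m.position, center) <= radius]


def plan_transfers(servers: Dict[str, SpatialServer], viewpoint: Point,
                   radius: float) -> List[Tuple[str, str]]:
    """Post the spatial query to all servers, then order the follow-up queries."""
    metas = [m for s in servers.values()
             for m in s.spatial_query(viewpoint, radius)]

    def priority(meta: ObjectMeta) -> Tuple[float, float]:
        distance = math.dist(meta.position, viewpoint)
        seconds = meta.size_bytes / servers[meta.server].bytes_per_second
        return (distance, seconds)   # nearer first, then quicker to transfer

    return [(m.server, m.name) for m in sorted(metas, key=priority)]


servers = {
    "a": SpatialServer("a", 1_000_000, [
        ObjectMeta("desk", "a", (3.0, 0.0, 0.0), 2_000_000),
        ObjectMeta("tree", "a", (50.0, 0.0, 0.0), 500_000),
    ]),
    "b": SpatialServer("b", 100_000, [
        ObjectMeta("chair", "b", (3.0, 0.0, 0.0), 100_000),
    ]),
}
plan = plan_transfers(servers, viewpoint=(0.0, 0.0, 0.0), radius=10.0)
print(plan)
```

The returned list gives the server to ask and the object to request, in order. A real client would also weigh the LoD of each object, which this sketch leaves out for brevity.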
The following sequence shows the basic mechanism for realizing a multi-user space.

[Check-in phase] When a user enters a new space, the DVRML server sends the current scene graph from the consistent scene database to his DVRML client. The DVRML client sends DVRML data of his avatar to the DVRML server in the scene. The server forwards the DVRML data to all other clients, instructing them to add a new avatar to their scene graphs.

[Stay phase] All DVRML clients send DVRML data describing their updates to the scene to the DVRML server. The DVRML server applies the update information to its current scene graph to keep its scene graph consistent. The DVRML server then forwards the same DVRML data to all DVRML clients in the scene so that they apply the data to their own scene graphs.

[Check-out phase] When a user leaves the space, his client sends DVRML data to the DVRML server to remove his avatar from all scene graphs corresponding to the scene. The server forwards the DVRML data to all clients.
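The three phases can be illustrated with an in-process Python sketch. It is a simplification under assumptions: scene graphs are flat dictionaries from node names to fields, messages are plain tuples instead of DVRML data on a network, and a joining client is assumed to add its own avatar locally when it sends it. All names are hypothetical.

```python
import copy
from typing import Dict, Optional, Tuple

Op = Tuple[str, str, Optional[dict]]   # (kind, node id, fields)
Scene = Dict[str, dict]


def apply_op(scene: Scene, op: Op) -> None:
    kind, node_id, fields = op
    if kind == "add":
        scene[node_id] = dict(fields or {})
    elif kind == "change":
        scene[node_id].update(fields or {})
    elif kind == "delete":
        scene.pop(node_id, None)
    else:
        raise ValueError(f"unknown operation: {kind}")


class Client:
    def __init__(self, user: str) -> None:
        self.user = user
        self.scene: Scene = {}


class Server:
    def __init__(self, scene: Scene) -> None:
        self.scene = scene                       # the consistent scene database
        self.clients: Dict[str, Client] = {}

    def _forward(self, op: Op, skip: Optional[str] = None) -> None:
        for user, client in self.clients.items():
            if user != skip:
                apply_op(client.scene, op)

    def check_in(self, client: Client, avatar: dict) -> None:
        client.scene = copy.deepcopy(self.scene)   # send the current scene graph
        self.clients[client.user] = client
        op: Op = ("add", f"avatar:{client.user}", avatar)
        apply_op(client.scene, op)   # assumption: sender adds its own avatar
        apply_op(self.scene, op)
        self._forward(op, skip=client.user)        # all other clients add it

    def update(self, op: Op) -> None:
        apply_op(self.scene, op)     # keep the server's scene graph consistent
        self._forward(op)            # then forward to every client in the scene

    def check_out(self, client: Client) -> None:
        op: Op = ("delete", f"avatar:{client.user}", None)
        apply_op(self.scene, op)
        self._forward(op)
        del self.clients[client.user]


server = Server({"chair": {"x": 0}})
alice, bob = Client("alice"), Client("bob")
server.check_in(alice, {"color": "red"})
server.check_in(bob, {"color": "blue"})
server.update(("change", "chair", {"x": 2}))    # stay phase: a chair is moved
server.check_out(bob)
```

Because every update passes through the server before it reaches any client, all scene graphs see the same sequence of changes, which is what keeps them consistent in this sketch.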
3.3 Interface for the Fusion of the Real World and 3D CG Worlds
This section introduces a more generic representation of 3D component objects in order to incorporate live video into a 3D computer graphics (CG) space. In Figure 6 (c), a user can see the real world through a window in a 3D CG space. The window is a channel between the 3D CG space and the real world. The video texture on the window in the 3D CG space, in Figure 6 (a), is taken from the camera in the real world, in Figure 6 (b). The viewpoint of the camera is made the same as the user’s viewpoint. The video texture can also be determined by the angle and distance. In terms of LoD functions, all video textures are controlled to maintain the appropriate QoS (Quality of Service) of the virtual scenes. The following are the basic factors of the QoS in interactive virtual space applications:
1. Quick response
2. Smooth movement through a scene
3. High quality multimedia information for components that are close to users
4. Concluding Remarks
The H2O project is a research project organized by staff at three universities: Hiroshima City University, Hiroshima University and Osaka University. This project uses 155 Mbps ATM (Asynchronous Transfer Mode) networks provided by NTT (Nippon Telegraph and Telephone Corporation) as part of the “Joint Utilization Tests of Multimedia Communications” executed from April 1995 to March 1997. The main purpose of the project is to experiment with new multimedia applications and a cooperative research environment based on high speed computer networks. One of the objectives of the project is an experiment to introduce virtual reality environments to the Internet. A further theme is to observe users’ activities in the environment using virtual reality from the viewpoint of social computing psychology.

We are developing DVRML client/server applications to control the LoD of 3D campus data, and to realize multi-user spaces. We are also implementing an application to put a user’s facial video taken from a camera onto the face of an avatar in a multi-user space. A high speed ATM network is necessary to control the LoD of the facial video data. We are also testing LoD scheduling for DVRML data transfer with some rules for dynamic LoD in virtual spaces.
Figure 6. Fusion of the real world and a 3D CG space. (Panels: (a) 3D CG Space, (b) The Real World, (c) A Virtual Space.)

Acknowledgments
We would like to thank Professor Yoshinori Isomichi and Professor Kitsutaro Amano, Hiroshima City University, for the opportunity to participate in the H2O project and for their useful comments. We also appreciate technical assistance provided by staff at Hiroshima City University. NTT (Nippon Telegraph and Telephone Corporation) provides us with high speed ATM networks for “Joint Utilization Tests of Multimedia Communications.” We received support from the grant “Specified Research” at Hiroshima City University.
References
[1] VRML Forum. http://vrml.wired.com/
[2] Mark Pesce, VRML - Browsing & Building Cyberspace, New Riders, 1995.
[3] Josie Wernecke, Open Inventor Architecture Group, The Inventor Mentor, Release 2, Addison Wesley, 1994.
[4] Issues and Challenges in ATM Networks, Communications of the ACM, Vol. 38, No. 2, Feb. 1995.