Technical Report no. 2003-04
3Dwm: A Platform for Research and Development of Three-Dimensional User Interfaces
Niklas Elmqvist
Department of Computing Science, Chalmers University of Technology and Göteborg University, SE-412 96 Göteborg, Sweden
Göteborg, 2003
Technical Report in Computing Science at Chalmers University of Technology and Göteborg University. Technical Report no. 2003-04, ISSN: 1650-3023. Department of Computing Science, Chalmers University of Technology and Göteborg University, SE-412 96 Göteborg, Sweden. Göteborg, Sweden, 2003
Abstract
Although hugely successful, the near-ubiquitous WIMP paradigm dates back to the 1970s and has many shortcomings that could be addressed by harnessing modern computing technology in the user interface. Most importantly, the ready supply of consumer-level hardware-accelerated 3D graphics has opened the way for 3D user interfaces, i.e. interfaces that make direct use of the third dimension. However, there are many challenges to designing applications with such interfaces. In this paper, we present 3Dwm, an object-oriented software platform for the research and development of 3D user interfaces, and discuss how it addresses these challenges. 3Dwm provides a unified 3D abstraction layer that facilitates interoperability between 3D components, combats the inherent complexity of 3D interface programming, provides a reusable toolkit for application building, supports network-aware 3D visualization applications, and may potentially serve as a standardized software platform for many kinds of 3D graphics development.
Keywords: 3D user interfaces, 3D interaction, 3D widgets, Virtual Reality
1 Introduction
The so-called WIMP (windows, icons, menus, pointer) user interface paradigm, developed at Xerox PARC in the early 1970s [27, 28], has been in widespread use for well over two decades, and its success is simply astounding. Most surprising, however, is the fact that the paradigm has survived virtually unchanged since its inception, especially when seen in light of the many new developments in other aspects of computing since then. Despite the fact that WIMP interfaces have low sensory bandwidth, scale poorly with application complexity, and make little use of the capabilities of modern computers, they seem to be just about “good enough” for the field of human-computer interaction to get comfortably stuck in a rut. However, there is no overlooking the fact that progress in computer hardware has been equally astounding, and that computing accordingly is very different today compared to thirty years ago. The mainframes of yesterday have become the desktop computers of today, memory and disk space are abundant, and the rise of consumer-level 3D hardware has taken 3D computer graphics from the exclusive domain of high-end graphical workstations to off-the-shelf computers in our very homes. What’s more, there now exists an impressive array of novel input devices, such as data gloves, 3D mice, and haptics, that are poised to play the same role as the mouse did in the previous user interface revolution. Looking beyond the consumer market, esoteric hardware platforms such as CAVEs [6], head-mounted displays (HMDs) [29], and wearable computers all provide exciting new possibilities that are just waiting to be exploited. Clearly, conditions have changed considerably since the 1970s, and the question is what these changes mean for the current and future state of human-computer interaction.
3Dwm (Three-Dimensional Workspace Manager, see Figure 1) is a user interface system devised to explore the use of interactive 3D graphics to take advantage of the new capabilities offered by modern computer hardware. In this paper, we first discuss the areas where a 3D user interface can be useful and then present the 3Dwm system as an integrated environment for this purpose. We then discuss how 3Dwm tackles the challenges inherent in 3D user interfaces (3DUIs), and describe the main system components. Finally, we give an example of an immersive CSG modelling application developed using the platform.
2 Related Work
In recent years, the HCI and computer graphics research community has invested a fair amount of time in so-called three-dimensional user interfaces, i.e. interaction techniques and objects that make use of 3D (as opposed to the traditional 2D) computer graphics to present an interface to the user. One notable effort here is Brown University’s UGA system [14, 34], a delegation-based object-oriented graphical architecture closely integrating modeling and animation, and the many 3D widgets¹ created for the system [5, 13]. 3Dwm, the system presented in this paper, has many similarities to UGA, including an object-oriented framework approach, but has many additional features such as network transparency, language agnosticism, and cross-platform capability.
Figure 1: The Three-Dimensional Workspace Manager.
The Xerox PARC Information Visualizer [2], a system based on a 3D extension of the Rooms [12] desktop metaphor, focuses on making information more accessible to the user using a number of application-specific information browsers and an iterative interaction manager. Here, the line between human-computer interaction and information visualization² is quite blurred. In fact, the Cognitive Co-Processor [21] that is an integral part of the architecture of the Information Visualizer is specifically targeted at real-time 3D interaction and addresses many of the issues of the WIMP paradigm discussed in this paper. Examples of information browsers for the system include cone trees [23], a 3D tree structure making use of interactive animation to visualize huge information hierarchies, the Perspective Wall [17], which employs perspective distortion to selectively display linear data structures, and the Data Sculpture [2], used for exploring unstructured scientific data. Again, the Information Visualizer, like the UGA system, lacks the distributed nature of 3Dwm that allows applications to be truly location-transparent, and is furthermore not language- and platform-agnostic.
¹ A widget is a user interface component consisting of both visual appearance and behavior, such as a button, menu, or slider.
² Information visualization is a quite recent research area introduced in 1993 by Robertson et al. [22] as a blend of computer graphics, data visualization, and human perception.
The underlying 3D windowing system described by Feiner and Beshers for their n-Vision system [9] shows many similarities to the concepts behind 3Dwm, probably due to sharing a common heritage in existing 2D windowing systems (the X Window System [25], specifically). However, where n-Vision uses a hierarchy of nested boxes for representing 3D volumes, 3Dwm has a more refined general scene graph model that allows for greater flexibility in application programming.
The MR Toolkit introduced by Shaw et al. [26] is a low-level software library for creating Virtual Reality applications based on a decoupled system model for discrete and continuous simulation. MR applications may be split into several concurrent processes and can thus be said to be distributed, but lack the true network transparency of 3Dwm. Furthermore, while the MR Toolkit may be extended with a number of optional packages, 3Dwm offers a higher-level and more integrated framework for application building.
There are other software platforms exploring various 3D user interface paradigms, the most recent being the Task Gallery [20] 3D window manager developed by Microsoft Research. The Task Gallery is designed to visualize the 3D workspace in the shape of a virtual art gallery with standard Microsoft Windows applications presented as paintings on the walls. However, the system is limited to running normal 2D applications and uses the 3D environment for task management only, lacking the 3D user interface capability of 3Dwm.
Also of interest is DIVE (Distributed Virtual Environment) [3], a scalable peer-to-peer network architecture for collaborative virtual environments. DIVE differs from 3Dwm in that it puts more focus on collaborative work and 3D simulation, and less on 3D user interfaces and 3D widgets in general.
In addition to these 3DUI systems, there also exist a large number of general 3D graphics frameworks, too many to list here, that attempt to unify and simplify 3D application building. In this regard, 3Dwm is directly influenced by VRML97 [33] and SGI Performer [24], while its system architecture and distributed scene graph are inspired by Fresco [16], and many of the design guidelines of Fresco were used during development.
3 Motivation
The system presented in this paper is a platform for the development of graphical applications with 3D user interfaces. It was designed not to be a replacement for the standard WIMP paradigm, but rather to address the areas where traditional 2D interfaces perform badly. In this section, we will try to motivate the need for such an application development platform. While there certainly are a number of practical benefits to employing 3D user interfaces, the use of 3D in the interface goes beyond purely objective reasons in that users generally seem to prefer a 3D environment over a 2D equivalent [4, 20]. This may merely be a case of appreciating the novelty of 3D graphics, but the fact is that humans have evolved to live in a 3D world with 3D objects, intuitively suggesting that we are more comfortable using a 3D environment
which resembles the world we inhabit.
Traditional 2D GUIs are well-suited to office tasks such as word processing and spreadsheets, but are much less useful in application areas involving 3D, such as modelling, visualization, and simulation. Here, the interface often becomes a hindrance rather than the invisible mediator it is supposed to be. With 3D user interfaces, the application can be much more closely integrated with the spatial data, making interaction easier. For example, this approach would allow architects to interact directly with the 3D blueprints of a house, moving and creating walls and windows at will, and a fully interactive 3D visualization of molecules and atoms could help biologists study and understand the complex process of protein folding. Needless to say, there are many other application areas where this kind of natural 3D interaction would be of great help.
On a more pragmatic note, traditional 2D interfaces do not make full use of the capabilities of modern computer software and hardware, such as sound, speech and gesture recognition, haptic interfaces, force feedback, and 3D graphics, and one can surmise that much more should be possible using this new technology. This is a deficiency in itself.
As stated above, 3D user interfaces are easily applicable to design, modelling, visualization, training, and simulation. However, we believe there are also great potential benefits to using 3D user interfaces in more conventional productivity applications such as word processors, spreadsheets, and web browsers, applications that today are considered inherently 2D. It has already been shown that a suitable 3D environment helps in task management of normal 2D GUI applications by evoking human spatial memory and cognition [20]. Furthermore, Robertson et al. [22] show that 3D information visualization techniques can be used with great effect to depict abstract data that has no natural 3D mapping, and moreover that employing 3D allows us to make the user’s immediate workspace larger and denser without imposing additional cognitive overhead. In the future, we will probably see many more non-obvious examples of successfully using 3D in applications that have traditionally been regarded as 2D.
4 The 3Dwm System
As we have seen, 3D user interfaces carry a lot of unrealized potential for creating user applications with immersive interfaces. However, there are several challenges to achieving this potential, many of which are very similar to the original motivations for the X Window System [25], the difference being that the frontier now lies in interfaces using three instead of two dimensions. The solution back then was to define a standard application platform for building programs with a graphical user interface, albeit in 2D. We have chosen a similar approach, aiming to create a 3D user interface platform to facilitate work on immersive applications for developers and researchers alike. In the following sections, we first present 3Dwm, our attempt at providing such a platform. We then go on to describe the important features of the system and the challenges they address. Finally, after a brief look at the 3Dwm
object model, we explore the system architecture in depth and discuss its major components.
Figure 2: 3Dwm display server with three connected clients. [Diagram labels: user, host computer, display server, input and output hardware, network, and Clients A, B, and C, each linked via CORBA.]
4.1 Fundamentals
3Dwm (Three-Dimensional Workspace Manager) is a single-user client/server 3D graphics system based on a distributed general-purpose 3D scene graph. The system is composed of a central display server, which runs on the user’s computer and is in command of the graphics and input hardware of the host, and a number of clients, which may connect to the server from anywhere on the network and employ its services to present an application to the user (see Figure 2). This architecture is similar to that of the X Window System [25], but centralizes most of the user interface functionality in the server instead of in the clients, and uses CORBA³ for distributing the system. The programming interface is accordingly defined using language-independent IDL⁴ specifications.
4.2 System Features
Fully realizing the potential of 3D user interfaces by building a full-fledged application system such as 3Dwm involves many challenges, and to date there has been no successful general 3D replacement for the 2D desktop metaphor [31]. In this section, we will explore these challenges and their solutions in 3Dwm. It should be noted that most of these techniques are not new and have been used before; it is their unified aggregation in 3Dwm that makes the system unique.
4.2.1 Retargetability
Problem: The lack of unified standardization in the field of 3D graphics forces 3D user interface applications to target a large number of diverse hardware platforms, including not only common desktop machines, but also special-purpose embedded 3D hardware (such as cellular phones, game consoles, and future generations of PDAs), Virtual Reality devices, and wearable computers. To compound this problem, the lack of reliable standards makes it difficult to take advantage of the special capabilities of a hardware platform in a generic way.
Solution: To remedy this problem, 3Dwm acts as a unified abstraction layer for 3D user interfaces by defining a hardware-independent programming interface for application building (specified as CORBA IDL interfaces). In addition, all platform-specific code is isolated in a dynamic module loaded at server startup. Thus, 3Dwm insulates applications from the underlying hardware, and, in fact, vice versa, making it possible to rewrite just this platform-specific module, once and for all, to make the display server run on a new hardware platform. All existing 3Dwm applications will then work without modification.
³ Common Object Request Broker Architecture, a middleware standard for distributing objects in a heterogeneous network environment.
⁴ Interface Definition Language, a language-independent specification language used when defining CORBA interfaces.
4.2.2 Accessible Programming Interface
Problem: User interface software in general is inherently hard to write [18], and 3D user interfaces complicate design and implementation even further [5]. Not only is the mathematical background of 3D graphics complex and unfamiliar to most software developers, there are also many human factors that come into play, such as positioning and occlusion of the interface elements, lighting and shading in the 3D environment, and the user’s manual dexterity and agility. As if this were not enough, the steady progress of 3D hardware and software adds to this complexity, making it difficult even for seasoned developers to stay abreast of new technology.
Solution: The 3Dwm system suppresses this inherent complexity by providing a single programming interface that hides the intricacies of 3D graphics from developers. This is achieved by giving developers access to high-level 3D concepts such as scene graphs, 3D meshes, animation, and transformations. At the same time, the underlying implementation of this programming interface can transparently be extended to take advantage of new technology without forcing applications to be modified. The usability of the interface has been influenced by successful 3D graphics frameworks and user interface toolkits, and has been evaluated by programmers on the Internet (more specifically, external people involved with the project).
4.2.3 Reusable 3D User Interface Toolkit
Problem: Developers looking to build applications with fully three-dimensional interfaces quickly realize that there are very few toolkits to assist them, forcing them to create their own user interface components from scratch. Consequently, different 3D user interface applications will have a different look and feel, making it hard for users to transfer their knowledge from one application to another. This situation is also unfair to application developers, who should be able to concentrate on writing their own programs without having to build the very tools they are using as well.
Solution: This lack of reusable toolkits is remedied by the existence of a native toolkit for the 3Dwm system. The toolkit contains a basic set of 3D widgets that can be used when building the user interface of an application. Since the toolkit is centralized in the display server, it is relatively easy to change the “look-and-feel” of 3Dwm without affecting applications using the toolkit. At the moment, only rudimentary widgets such as buttons, text fields, and sliders exist; in the future, we will concentrate on implementing existing 3D widgets, such as cursors, handles and virtual spheres [5], as well as researching new ones.
4.2.4 Distribution and Location-Transparency
Problem: The focus of computing has in recent years turned from individual workstations to networks of interconnected computers. This forces software to become network-aware in order to take advantage of shared resources scattered across the network, such as storage, printers, and spare CPU time. In many cases, CPU-intensive applications may have to be asymmetrically distributed among several computers on the network to balance the load.
Solution: As mentioned above, 3Dwm uses a distributed client/server architecture to fit the modern network-centric computing environment. In fact, the use of CORBA as communication middleware makes the system fully location-transparent, allowing shared entities to be accessed by display server and clients alike as if they were part of the local host’s address space (when, in reality, the object invocations are marshalled into network calls). However, this property comes at a cost; CORBA invocations incur considerable overhead compared to normal function calls, and should be minimized. We plan on exploring various optimizations using shared memory when clients and display server are co-located on the same host.
4.2.5 Cross-Platform Capability
Problem: The network is an inherently heterogeneous environment, with each of the nodes running any of a large number of operating systems on different kinds of hardware, making interoperability between the individual nodes difficult. In order to thrive in this network-centric environment, applications must be able to ignore platform boundaries.
Solution: The fact that all communication is carried out through interfaces specified in implementation-neutral IDL not only makes 3Dwm truly cross-platform, but programming language-agnostic as well. This enables programmers to write 3Dwm applications that will seamlessly work with any 3Dwm display server running on any platform that has a CORBA implementation, using any language that has an IDL mapping. Again, this is a two-way street, also making it possible to reimplement the entire 3Dwm display server in another programming language or using different software tools, without having to modify existing applications.
Figure 3: Multiple X11 desktops visualized in 3Dwm using VNC.
4.2.6 2D Application Support
Problem: As with all new application platforms, it is important that users are able to somehow retain access to their existing “legacy” applications when using 3Dwm, at least in a transitional phase. It will be some time before applications for the new platform appear that cater to every need of its users, and in the meantime, users must be able to stay productive. In many cases, this backwards compatibility is vital to achieving the critical mass of users needed for a platform to become a viable development target for software developers. Also, in the specific case of a 3D platform like 3Dwm, it is often especially cumbersome to shut down the 3D system to access the conventional 2D interface, such as when using a CAVE or HMD. Therefore, users must be able to seamlessly use their old 2D applications unmodified from within the 3D environment.
Solution: 3Dwm solves the need for 2D application support by providing a backwards compatibility layer for communication with existing 2D windowing systems. This is done through a VNC⁵ [19] client implementation that is able to connect to a graphical desktop residing on a VNC server located either on the network or the local computer. This desktop is then used as a texture inside 3Dwm for drawing on any type of 3D object⁶, and input commands captured by the object are fed back to the VNC server (see Figure 3), a technique first employed in [7]. Not only is this ability to connect to a remote graphical desktop faithful to the distributed nature of the rest of 3Dwm, the use of VNC also means that Windows, MacOS, and X11 desktops alike can potentially be visualized side-by-side in the system.
The Task Gallery (see the section on related work) uses a technique called application redirection [32] to host unmodified Windows applications in a 3D virtual environment. This is very similar to 3Dwm’s VNC approach, with the added benefit that in the Task Gallery, individual windows get separate offscreen buffers instead of one for the whole desktop. On the other hand, 3Dwm permits network-transparent display of not only Windows applications, but MacOS and X11 applications as well, something the Task Gallery lacks.
⁵ Virtual Network Computing, a cross-platform protocol designed for exporting a graphical desktop over the network.
4.2.7 Session Management
Problem: One of the drawbacks of existing 3D user interface applications, especially those designed for Virtual Reality, is that they tend to be designed to run in “exclusive” mode with no awareness of other concurrently running applications. In fact, these applications generally take command of the whole 3D environment, making it impossible for other applications to gain simultaneous access to the environment. However, users familiar with 2D windowing systems expect to be able to perform multiple tasks side-by-side in the 3D workspace, such as editing a document using a word processor while referring to information found on a webpage viewed in a web browser. Moreover, the applications themselves must also be able to communicate with each other in order to support user operations like drag-and-drop and cut-and-paste.
Solution: Application multitasking is too valuable to neglect in a user environment, and 3Dwm accordingly lets multiple applications run concurrently in the system. In this way, the 3Dwm display server can be said to be a resource manager for the 3D environment, performing session management on behalf of the user. Here, clients may request 3D volumes inside a workspace for their own use. Some applications may even require exclusive access to the full 3D environment, and in the future we plan to support such “immersive” applications by temporarily suspending all other applications. This is very much akin to fullscreen applications in conventional windowing systems. Also, 3Dwm enables running applications to be aware of each other, facilitating inter-client communication for data exchange and synchronization. We want to look further into techniques for view and task management in a multi-session 3D environment like 3Dwm.
⁶ A flat plane usually works best, but we intend to study other shapes, such as slightly curved planes, as well.
4.3 Design
Framework building is a complex task, and usually involves many design tradeoffs and compromises. In 3Dwm, the focus has been on building a component-based architecture with a lot of parametrization to facilitate reuse and extensibility. Still, a number of difficult issues arose during the development of the system.
Most prominent of those was the recurring problem that plagues all abstractions: how to come up with a suitable abstraction model that captures all functionality needed, yet allows for extensions? In the specific case of building a 3D graphics abstraction layer like 3Dwm, we want to be able to make use of the specific capabilities of various 3D cards, yet still support cards that lack them. It is often possible (albeit not elegant) to emulate or simply omit special features on hardware lacking them, but in 3Dwm’s case, it turned out that the most effective solution was to raise the abstraction level of other parts of the system correspondingly; in order to make use of compiled vertex arrays in the display server on cards supporting them, we would not give the client programmers immediate control over individual rendering of triangles, but instead retain whole 3D meshes as objects on the server side and build the vertex array transparently.
Another major issue was the design of the native 3D user interface toolkit that is part of the display server. User interface toolkit design is a significant research area in its own right, and involves many non-trivial decisions. Therefore, instead of trying to best these efforts, the 3Dwm toolkit is defined as a set of abstract interfaces that may have many different implementations and extensions (a default implementation exists, of course). By simply changing toolkit implementation (even at run-time), it is possible to completely alter the look and feel of the user environment.
Unfortunately, it is likely that some toolkit implementations or extensions may even need to modify the actual interfaces or introduce completely new ones. How to allow for this contingency has not yet been resolved, but we are planning to implement a capability querying mechanism to allow 3Dwm clients to dynamically take advantage of special extensions in the display server. It is possible that dynamic introspection of CORBA interfaces might also be of use here.
While the use of CORBA as a communication middleware does bring a number of useful benefits (such as language-independence, cross-platform capability, and, primarily, location-transparency), the overhead incurred by its use proved to be a major design issue. Performance is vital in 3D graphics applications and cannot be sacrificed for an elegant architecture. However, by consciously designing the 3Dwm API so that most of the real-time interaction is retained in the display server and communication between clients and server is minimized, a good tradeoff was achieved. All 3D rendering is typically performed locally inside the display server (only rarely would a 3Dwm application maintain part of its scene graph in its own address space), thus incurring little to no extra overhead compared to traditional rendering.
3Dwm is a closed (retained-mode) system that provides no access to the underlying 3D API. The main reason for this is to maintain the high abstraction level of the system, allowing for language- and platform-independent applications accessing a network-distributed scene graph. In addition, exposing the specifics of a low-level 3D API such as OpenGL would certainly work against the performance concerns addressed in the previous paragraph. However, there are tentative plans to implement a direct-access mode using shared memory that is accessible to applications running on the local host. Chromium [15], a distributed OpenGL driver, might be of interest in our future work in this direction.
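The retained-mode tradeoff discussed in this section can be illustrated with a small sketch. The Python below is not 3Dwm code: `ImmediateModeServer`, `RetainedModeServer`, and `compare` are hypothetical names, and a simple counter stands in for marshalled CORBA invocations. It is only meant to show, under these assumptions, why keeping a whole mesh on the server side needs far fewer remote calls than submitting individual triangles every frame.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Vertex = Tuple[float, float, float]
Triangle = Tuple[Vertex, Vertex, Vertex]


@dataclass
class ImmediateModeServer:
    """Hypothetical server where the client submits every triangle, every frame."""
    remote_calls: int = 0

    def draw_triangle(self, triangle: Triangle) -> None:
        # Each call stands in for one marshalled remote invocation.
        self.remote_calls += 1


@dataclass
class RetainedModeServer:
    """Hypothetical server that keeps whole meshes and renders them locally."""
    remote_calls: int = 0
    triangles_rendered: int = 0
    meshes: Dict[int, List[Triangle]] = field(default_factory=dict)

    def create_mesh(self, triangles: List[Triangle]) -> int:
        # One remote invocation uploads the entire mesh.
        self.remote_calls += 1
        mesh_id = len(self.meshes)
        self.meshes[mesh_id] = list(triangles)
        return mesh_id

    def render_frame(self) -> None:
        # Runs inside the server; no client round trips are involved.
        for triangles in self.meshes.values():
            self.triangles_rendered += len(triangles)


def make_triangles(count: int) -> List[Triangle]:
    """Build a strip of simple placeholder triangles."""
    return [
        ((float(i), 0.0, 0.0), (float(i) + 1.0, 0.0, 0.0), (float(i), 1.0, 0.0))
        for i in range(count)
    ]


def compare(num_triangles: int, num_frames: int) -> Tuple[int, int]:
    """Return (immediate-mode remote calls, retained-mode remote calls)."""
    triangles = make_triangles(num_triangles)

    immediate = ImmediateModeServer()
    for _ in range(num_frames):
        for triangle in triangles:
            immediate.draw_triangle(triangle)

    retained = RetainedModeServer()
    retained.create_mesh(triangles)
    for _ in range(num_frames):
        retained.render_frame()

    return immediate.remote_calls, retained.remote_calls


immediate_calls, retained_calls = compare(num_triangles=100, num_frames=60)
```

In this toy model the immediate-mode client issues one remote call per triangle per frame, whereas the retained-mode client issues a single call to create the mesh. Real invocation costs depend on the ORB and the network, which the sketch does not attempt to model.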
4.4 Object Model
3Dwm provides an object-oriented framework for 3D user interface programming not unlike conventional 2D GUI frameworks such as Java Swing, OpenStep, and the Win32 API. However, in 3Dwm, the object framework is distributed using CORBA, and clients use the IDL interfaces of the 3Dwm API to transparently access servant objects that reside in the display server. Thus, whereas conventional GUI toolkits are located on the client side, 3Dwm uses a server-side toolkit (which is dynamically exchangeable) to allow for a more uniform look-and-feel in the user interface. Servant objects are in turn allocated by clients using factory kits, which are “static” objects with the ability to create other server-side objects, and each object instance is reference counted to allow for automatic garbage collection when the associated client terminates. This separation of interface and implementation gives latitude in replacing and extending parts or the whole of the underlying implementation while shielding clients from low-level details.
The 3Dwm object model has some features in common with the PostScript-based rendering language of NeWS [11] (self-contained and potentially distributed graphics rendered locally on the display server), as well as major similarities to Fresco [16] (IDL-defined node hierarchy distributed using CORBA and developed using a similar design philosophy).
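As a rough illustration of the factory kit and reference counting ideas above, the following Python sketch models a display server that hands out factory kits and frees servant objects when their last reference disappears. None of these names (`DisplayServer`, `FactoryKit`, `Servant`, `add_ref`) come from the 3Dwm API; they are hypothetical, and the assumption that two clients may share a reference to the same servant is made purely for the sake of the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Servant:
    """A server-side object; clients only ever hold its identifier."""
    object_id: int
    kind: str
    ref_count: int = 0


class DisplayServer:
    """Hypothetical server that owns all servants and tracks client references."""

    def __init__(self) -> None:
        self.objects: Dict[int, Servant] = {}
        self.client_refs: Dict[str, List[int]] = {}
        self._next_id = 0

    def connect(self, client_name: str) -> "FactoryKit":
        self.client_refs[client_name] = []
        return FactoryKit(self, client_name)

    def allocate(self, kind: str) -> Servant:
        servant = Servant(self._next_id, kind)
        self.objects[servant.object_id] = servant
        self._next_id += 1
        return servant

    def add_ref(self, client_name: str, object_id: int) -> None:
        self.objects[object_id].ref_count += 1
        self.client_refs[client_name].append(object_id)

    def disconnect(self, client_name: str) -> None:
        # Drop every reference held by the client; free servants that reach zero.
        for object_id in self.client_refs.pop(client_name, []):
            servant = self.objects[object_id]
            servant.ref_count -= 1
            if servant.ref_count == 0:
                del self.objects[object_id]


class FactoryKit:
    """Hypothetical factory kit: creates servants on behalf of one client."""

    def __init__(self, server: DisplayServer, client_name: str) -> None:
        self.server = server
        self.client_name = client_name

    def create(self, kind: str) -> int:
        servant = self.server.allocate(kind)
        self.server.add_ref(self.client_name, servant.object_id)
        return servant.object_id


server = DisplayServer()
kit_a = server.connect("client-a")
kit_b = server.connect("client-b")

mesh_id = kit_a.create("mesh")
texture_id = kit_a.create("texture")
server.add_ref("client-b", texture_id)  # assumed: client B shares the texture

server.disconnect("client-a")
remaining = sorted(servant.kind for servant in server.objects.values())
```

After client A terminates, the mesh it alone referenced is collected, while the texture survives because client B still holds a reference. A real implementation would sit behind CORBA stubs; this sketch keeps everything in one process to show only the bookkeeping.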
4.5 System Architecture
The 3Dwm display server uses a layered system architecture consisting of a few main components and a number of supporting subsystems on several levels of abstraction (see Figure 4). Central to the 3Dwm architecture is a distributed 3D scene graph that is shared by all clients connected to the display server. The scene graph is managed by a special scene manager that is responsible for rendering the 3D scene from the viewpoint of the user by using an abstract renderer that walks the scene graph once every rendering update. During this time, the renderer implementation will translate abstract rendering commands into concrete API calls (in this case using OpenGL). In addition to this rendering traversal, the scene manager may also initiate other scene graph traversals, for instance using ray and point intersection, for the purpose of propagating messages within the nodes of the scene graph. These messages are the primary means for inter-object communication within the server and signify events such as view changes, modified bounding boxes, and node collisions.
Figure 4: 3Dwm layered system architecture. [Diagram components: connection server, factory kits, client handles, object stubs, scene graph, scene manager, renderer (OpenGL implementation), event manager, event mapper, input device drivers, platform abstraction layer, Object Request Broker, OS/libraries/services, network/local IPC, and graphics/input hardware.]
One source of such messages is the event manager that handles all input devices connected to the host machine running the display server. The event manager can handle an arbitrary number of devices using special 3Dwm-specific input device drivers that wrap the operating system drivers in a standardized interface, allowing the event manager to configure each device independently. The event manager will then listen for input events issued by the devices, processing each using a special event mapper. This event mapper is really an interpreter for a simple propositional logic language that queries input data in order to execute specific actions and issue messages, allowing users to dynamically redefine input handling, such as keyboard and mouse mappings.
Common to both the scene and event managers is the platform abstraction layer that provides the platform services required for 3D rendering and input device querying. This system layer encapsulates the underlying operating system of the host machine, defines the display properties of the graphics hardware, and provides the necessary “glue” for performing OpenGL operations. It also contains generic OS services such as threads, mutual exclusion, synchronization, sockets, and logging.
In addition to the scene and event managers, the third main component of the display server is the connection server. As mentioned earlier, 3Dwm is a distributed system and uses CORBA as communications middleware; hence, the connection server is in reality a CORBA object exported using the CORBA Naming Service⁷. It is used by clients to connect to the 3Dwm server and receive a client handle that can be used to create and manage servant objects.
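The event mapper described earlier in this section is characterized only as an interpreter for a simple propositional logic language, and the text does not give its syntax. The Python sketch below is therefore an assumption-laden illustration, not the 3Dwm rule language: formulas are nested tuples built from the hypothetical operators `"not"`, `"and"`, and `"or"` over named boolean inputs, and `EventMapper`, `bind`, and `process` are invented names.

```python
from typing import Dict, List, Tuple, Union

# A formula is a variable name, or a tuple such as ("not", f), ("and", f, g), ("or", f, g).
Formula = Union[str, Tuple]
InputState = Dict[str, bool]


def evaluate(formula: Formula, state: InputState) -> bool:
    """Evaluate a propositional formula against the current input state."""
    if isinstance(formula, str):
        # Inputs that are not reported are treated as inactive.
        return state.get(formula, False)
    op, *args = formula
    if op == "not":
        return not evaluate(args[0], state)
    if op == "and":
        return all(evaluate(arg, state) for arg in args)
    if op == "or":
        return any(evaluate(arg, state) for arg in args)
    raise ValueError(f"unknown operator: {op!r}")


class EventMapper:
    """Hypothetical mapper: each rule pairs a condition with a message to issue."""

    def __init__(self) -> None:
        self.rules: List[Tuple[Formula, str]] = []

    def bind(self, condition: Formula, message: str) -> None:
        self.rules.append((condition, message))

    def clear(self) -> None:
        # Rules can be replaced at run time to redefine input handling.
        self.rules.clear()

    def process(self, state: InputState) -> List[str]:
        """Return the messages whose conditions hold for this input state."""
        return [message for condition, message in self.rules if evaluate(condition, state)]


mapper = EventMapper()
mapper.bind(("and", "mouse_left", ("not", "key_shift")), "select")
mapper.bind(("and", "mouse_left", "key_shift"), "extend-selection")
mapper.bind(("or", "key_escape", "key_q"), "cancel")

plain_click = mapper.process({"mouse_left": True})
shift_click = mapper.process({"mouse_left": True, "key_shift": True})
```

Because the rules are plain data, rebinding a key or mouse button amounts to calling `clear` and `bind` again, which loosely mirrors the dynamic redefinition of input handling mentioned in the text.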
The connection server performs keep-alive pinging to detect any malfunctioning clients or loss of network connectivity; if the connection is lost, all resources created in the server by the affected client are automatically garbage-collected.
Footnote 7: A standardized CORBA service used to make CORBA servant objects visible to clients.
3Dwm clients allocate resources in the server using factory kits held by each client handle. As described above, factory kits are used to create other server-side objects, similar to a distributed memory allocation operator, and in 3Dwm they are generally employed to create scene graph nodes for various purposes. Several kits already exist for creating geometry, textures, materials, CSG, and layout nodes, and in the future it will be possible to add more factory kits (or implementations of factory kits) to the server at run-time using a plug-in mechanism.
We will now explore some of the special properties of the display server architecture in more depth.
4.5.1
Distributed Scene Graph
The 3Dwm scene graph is a CORBA-distributed directed acyclic graph shared among all clients, with each client assigned a subtree for its exclusive use and with individual graph nodes accessible as if they were part of the client’s own address space. In 3Dwm, the scene graph may contain not only rendering and geometrical nodes, but also nodes for message routing, spatial sound, and behavior. 3Dwm capitalizes on the latter: the server supports a special type of behavioral scene graph node called a controller that is designed to parse messages and perform actions in response, often emitting several messages in return. This is generally the way interactive user interfaces are built in 3Dwm, i.e. with a combination of visual and behavioral scene graph nodes. Accordingly, the scene graph forms the backbone of the user interface services of the display server, and scene graph management is thus the most important part of writing a 3Dwm client.
In accordance with this philosophy, user interface widgets in 3Dwm are merely scene graph subtrees consisting of event controllers, state, and geometry with a pre-defined behavior. Developers can easily create new widgets at run-time by combining existing scene graph nodes or by creating new ones in their local address space.
4.5.2
Platform Abstraction Layer
To cope with the staggering software and hardware diversity in 3D graphics described above, 3Dwm uses a platform abstraction layer that shields the display server from the underlying system platform. This layer is an abstract interface with a number of services related to rendering and input management that may be implemented by whatever hardware and software resources are available on the target machine, be it a desktop computer or an immersive VR device.
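The role of such an abstract interface can be sketched in C++ as follows. This is a minimal illustration, not the actual 3Dwm interface: the names Platform, HeadlessPlatform, InputSample, and runFrames, as well as the particular set of methods, are assumptions made for the example. The point is that the server loop is written only against the abstract class, so a desktop or immersive back-end can be substituted without changing it.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <utility>

// Hypothetical input record; a real driver would carry richer data.
struct InputSample {
    std::string device;
    int code;
};

// Abstract platform services that the display server is written against.
class Platform {
public:
    virtual ~Platform() = default;
    virtual int displayWidth() const = 0;
    virtual int displayHeight() const = 0;
    // Fetches the next pending input sample; returns false when none is pending.
    virtual bool pollInput(InputSample& out) = 0;
    // Presents the finished frame on the display.
    virtual void swapBuffers() = 0;
};

// Stand-in back-end that needs no graphics hardware, useful for testing.
class HeadlessPlatform : public Platform {
public:
    HeadlessPlatform(int width, int height) : width_(width), height_(height) {}

    int displayWidth() const override { return width_; }
    int displayHeight() const override { return height_; }

    bool pollInput(InputSample& out) override {
        if (pending_.empty()) return false;
        out = pending_.front();
        pending_.pop();
        return true;
    }

    void swapBuffers() override { ++framesPresented_; }

    // Test helpers: queue synthetic input and inspect the frame counter.
    void inject(InputSample sample) { pending_.push(std::move(sample)); }
    int framesPresented() const { return framesPresented_; }

private:
    int width_;
    int height_;
    int framesPresented_ = 0;
    std::queue<InputSample> pending_;
};

// Server-side loop that sees only the abstract interface.
// Returns the number of input samples consumed.
inline int runFrames(Platform& platform, int frames) {
    assert(frames >= 0);
    int consumed = 0;
    for (int i = 0; i < frames; ++i) {
        InputSample sample;
        while (platform.pollInput(sample)) ++consumed;  // drain input first
        platform.swapBuffers();                         // then present the frame
    }
    return consumed;
}
```

A back-end for the X Window System or for an immersive device would, under the same assumptions, derive from Platform in the same way as HeadlessPlatform does here.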
4.5.3
Renderer Abstraction
The 3D rendering services offered by the underlying system platform are an important part of the platform abstraction, and have been captured in the 3Dwm renderer abstraction. Specified in terms of an IDL interface, the renderer contains a number of methods for setting the rendering state and drawing simple 3D primitives such as lines, points, and triangles. It is designed to walk the 3Dwm scene graph once per server update in what is called a rendering traversal.
4.5.4
Message Propagation
The dynamic requirements of an interactive user interface are handled in 3Dwm using messages propagated through the scene graph. A general publish/subscribe scheme ensures that objects can listen to other objects for the type of messages they are interested in. Scene graph nodes are typically both message emitters and listeners, allowing the creation of propagation hierarchies for handling bounding volume updates, ray and point intersections, and input events throughout the scene graph.
3Dwm, like Fresco [16], makes use of the MVC [10] (model-view-controller) paradigm to enforce a clear separation between the state and the modifiers in the system. Special controller objects form the active elements of the framework, parsing messages and manipulating data, possibly issuing new messages or actions in return. The models that the controllers act upon are shared state objects ranging from integer and floating point values to text strings. The 3D renderer acts as the view, creating a visual representation of the scene graph for the benefit of the user once every rendering update.
For example, a slidebar controller can be attached to a scene graph subtree representing a geometrical figure, and whenever an input event such as a mouse click intersects this geometry, an intersection message is generated and propagated up through the subtree hierarchy. The message will be captured by the slidebar controller, and the corresponding floating point value representing the position of the slidebar button will be modified. This value, in turn, is observed by the button geometry, which will change its position accordingly, and by any other interested parties in the scene graph. In addition, the client programmer may register a callback object that is also invoked when the button is triggered. More complex relationships are also possible, where a controller may pass on a message it does not know how to handle to its parent controller.
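The slidebar scenario above can be sketched in a few lines of C++. The sketch is a simplified, single-process illustration and not 3Dwm code: Message, FloatModel, Controller, SlidebarController, LoggingController, and ButtonGeometry are hypothetical names, and the message payload is reduced to a single normalized value.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical message: a type tag plus one numeric payload.
struct Message {
    std::string type;
    double value;
};

// Shared state object (the model): notifies its observers on every change.
class FloatModel {
public:
    void observe(std::function<void(double)> fn) { observers_.push_back(std::move(fn)); }
    void set(double v) {
        value_ = v;
        for (auto& fn : observers_) fn(v);
    }
    double get() const { return value_; }

private:
    double value_ = 0.0;
    std::vector<std::function<void(double)>> observers_;
};

// Controller: handles the messages it understands and passes the rest
// on to its parent controller, if it has one.
class Controller {
public:
    explicit Controller(Controller* parent = nullptr) : parent_(parent) {}
    virtual ~Controller() = default;

    // Returns true if some controller in the chain handled the message.
    bool receive(const Message& m) {
        if (handle(m)) return true;
        return parent_ != nullptr && parent_->receive(m);
    }

protected:
    virtual bool handle(const Message& m) = 0;

private:
    Controller* parent_;
};

// Captures intersection messages and updates the slidebar position model.
class SlidebarController : public Controller {
public:
    SlidebarController(FloatModel& model, Controller* parent = nullptr)
        : Controller(parent), model_(model) {}

protected:
    bool handle(const Message& m) override {
        if (m.type != "intersection") return false;
        assert(m.value >= 0.0 && m.value <= 1.0);  // normalized hit position
        model_.set(m.value);
        return true;
    }

private:
    FloatModel& model_;
};

// Fallback parent that records the types of messages passed up to it.
class LoggingController : public Controller {
public:
    std::vector<std::string> seen;

protected:
    bool handle(const Message& m) override {
        seen.push_back(m.type);
        return true;
    }
};

// View-side stand-in: the button geometry tracks the model value.
struct ButtonGeometry {
    double x = 0.0;
};
```

Here the button geometry subscribes to the model with observe, so it moves whenever the controller changes the value, and a client callback could be registered through the same observe call under these assumptions.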
4.6
Implementation
The 3Dwm display server is implemented in ANSI C++ and runs on Linux and IRIX (Solaris and Windows support is planned). Platform implementations exist for the X Window System and the Linux framebuffer console, and a VR Juggler [1] version is planned for the future to allow the system to run on immersive Virtual Reality equipment. OmniORB (footnote 8) is used for the distribution, and OpenGL 1.2 for rendering (preliminary planning of a Direct3D renderer implementation is underway).
Footnote 8: A freely available CORBA ORB, see http://www.omniorb.org.
Figure 5: Immersive CSG modeller in 3Dwm.
4.7
Current Status
The core architecture of the 3Dwm display server has been implemented in full, and all of the features described in Section 4.2 have been realized. Work now focuses on various extensions to the core system, such as real-time animation, constructive solid geometry, and special-purpose rendering nodes (terrain, progressive meshes, procedural textures, etc.), as well as additional hardware drivers and support for other platforms. In addition, the client side requires considerable work, including an evaluation of the client programming interface, tools for rapid application development, and various proof-of-concept clients. However, the platform as a whole is still in its early stages, and a lot of work remains to be done to turn 3Dwm into a viable user environment. The system is distributed as Open Source and can be freely downloaded from http://www.3dwm.org.
5
Example: Immersive CSG Modeller
Let us briefly study an example of a real 3Dwm application, an immersive 3D modeller using constructive solid geometry (CSG), to shed some light on how to use this platform in practice. Figure 5 shows a screenshot of the modeller’s workspace, with the 3D object the user is working on in the center of the picture and a toolbar below. Here, the user may use direct manipulation to instantiate various 3D primitives (spheres, cubes, cones, etc.), and then transform and combine them using boolean set operations (union, intersection, subtraction, etc.) into CSG trees. The toolbar, which is a group node consisting of a number of button controllers with their associated graphical representation, is decorated with a dragger node to allow the user to freely move it around the environment.
To perform the actual CSG operations, a server-side extension was implemented that allows for the creation and evaluation of dynamic CSG trees using binary space partitioning (BSP) trees [30, 8]. Hierarchical caching is performed to minimize the amount of recomputation needed when parts of a CSG tree are modified. The CSG modelling client uses this extension to resolve the specific CSG factory kit and manipulate a single CSG tree in accordance with the user’s input. Figure 6 shows a part of the C++ source of the CSG modeller creating and manipulating a CSG tree using the 3Dwm API.
An important part of an immersive modelling application is the input devices used for the interaction, the ideal setup probably being a combination of stylus/wand and data glove. Alas, 3Dwm still lacks the necessary drivers for these kinds of input devices, so in this example modeller, the mouse and keyboard are used for the interaction.
The source code of the CSG modeller is a little over 1000 lines, and took slightly more than a day to design and implement (not including the implementation of the CSG extension in the display server). During the implementation, many existing 3Dwm components such as buttons, draggers, controllers, layout objects, and 3D triangle meshes (with textures) were employed, reducing development time considerably.
Python bindings for 3Dwm are under development, and we expect that the rapid application development capabilities of the system will improve even more using this language.
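The hierarchical caching mentioned in this section can be illustrated with a small C++ sketch. It is not the 3Dwm server extension: the BSP-based evaluation is replaced by set operations on integer cell identifiers as a stand-in for solid geometry, and the names Cells, Op, and CsgNode are hypothetical. What the sketch shows is the caching idea only: changing a leaf invalidates just the nodes on the path to the root, so untouched subtrees keep their cached result.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <memory>
#include <set>
#include <utility>

using Cells = std::set<int>;  // stand-in for real solid geometry

enum class Op { Union, Intersection, Subtraction };

class CsgNode {
public:
    // Leaf node holding a primitive; its cache is always valid.
    explicit CsgNode(Cells primitive) : cache_(std::move(primitive)), dirty_(false) {}

    // Inner node combining two subtrees; starts out dirty.
    CsgNode(Op op, std::shared_ptr<CsgNode> left, std::shared_ptr<CsgNode> right)
        : op_(op), left_(std::move(left)), right_(std::move(right)) {
        assert(left_ && right_);
        left_->parent_ = this;
        right_->parent_ = this;
    }

    // Nodes hold raw parent pointers, so they must not be copied.
    CsgNode(const CsgNode&) = delete;
    CsgNode& operator=(const CsgNode&) = delete;

    // Replaces a leaf's primitive and invalidates only its ancestors.
    void setPrimitive(Cells primitive) {
        assert(!left_ && !right_);  // only valid on leaf nodes
        cache_ = std::move(primitive);
        for (CsgNode* n = parent_; n != nullptr; n = n->parent_) n->dirty_ = true;
    }

    // Returns the evaluated solid, recomputing only dirty nodes.
    const Cells& evaluate() {
        if (dirty_) {
            const Cells& a = left_->evaluate();
            const Cells& b = right_->evaluate();
            Cells out;
            auto ins = std::inserter(out, out.end());
            switch (op_) {
                case Op::Union:
                    std::set_union(a.begin(), a.end(), b.begin(), b.end(), ins);
                    break;
                case Op::Intersection:
                    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(), ins);
                    break;
                case Op::Subtraction:
                    std::set_difference(a.begin(), a.end(), b.begin(), b.end(), ins);
                    break;
            }
            cache_ = std::move(out);
            dirty_ = false;
            ++recomputations_;
        }
        return cache_;
    }

    // Number of times this node's result was actually recomputed.
    int recomputations() const { return recomputations_; }

private:
    Op op_ = Op::Union;
    std::shared_ptr<CsgNode> left_;
    std::shared_ptr<CsgNode> right_;
    CsgNode* parent_ = nullptr;
    Cells cache_;
    bool dirty_ = true;
    int recomputations_ = 0;
};
```

In a tree such as union(subtraction(cube, cone), ball), changing only the ball leaves the cached subtraction untouched, so a second evaluation recomputes the root alone.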
6
Discussion
It is important to realize that the use of 3D graphics could easily exacerbate the existing problems of traditional 2D user interfaces and introduce additional ones. These include issues such as the difficulty of 3D navigation, nearby objects occluding distant ones, insufficient resolution in current graphics hardware, and the difficulties of performing 3D interaction. It is therefore unrealistic to believe that 3D user interfaces are superior to traditional 2D ones in all application domains. However, as has been discussed in this paper, 3D user interfaces do have much to offer for areas involving manipulation and visualization of three-dimensional data.
Building a general 3D application framework like 3Dwm may seem pointless when there already exist so many 3D frameworks, but the 3Dwm system exhibits three important properties no other similar system supports: it is distributed (allowing for asymmetric load sharing), it contains all the basic user interface primitives needed (i.e. not only basic 3D elements), and it is both cross-platform capable and language-independent (vital for real-world application development). Furthermore, unlike many comparable platforms, 3Dwm is freely available for both commercial and educational use.
It may be argued that 3D framework building in itself is a futile exercise, since doing so prevents us from using hardware-specific features and extensions. However, the benefits of introducing this abstraction for 3D user interface development clearly outweigh the benefits of not doing so given the goals of the project, which favor flexibility and usability over high performance. Besides, by uniformly raising the abstraction level of the entire 3Dwm API, it was in many cases possible to transparently make use of these extensions. Also, as described in Section 4.3, future versions of 3Dwm might allow for direct access to the underlying 3D API, making the use of hardware-specific features and extensions possible.
7
Conclusions
The 3Dwm system described in this paper is a software platform targeted at both research and development of three-dimensional user interfaces. It was devised especially for application areas involving 3D such as modelling, visualization, and simulation. However, there are also considerable challenges to defining such a 3D user interface platform. Most of them are addressed in 3Dwm: the system acts as a unified 3D abstraction to combat the lack of standardization in 3D graphics; it provides a retargetable back-end to cope with the many diverse hardware platforms it needs to run on; its programming interface was designed to shield programmers from the intricacies of 3D graphics; it contains a 3D user interface toolkit to save application developers from rolling their own; it is network-transparent in order to thrive in today’s network-centric computing environments; it has cross-platform capability to allow run-time interoperability with clients running on other platforms; it provides bindings to existing 2D environments within the 3D environment to let users access their old applications; and, finally, it provides full application multitasking and communication in order to turn 3D user interfaces from a mere curiosity into a viable work environment. In fact, to facilitate all this, 3Dwm is freely available as Open Source on the Internet.
The work on 3Dwm is far from done, however, and there is considerable effort remaining for realizing the vision of turning the system into a potential standard platform for 3DUI development. In the future, 3Dwm will serve as a research platform for our inquiries into view models and session management for multi-application 3D environments such as this. Interesting subjects here include the integration of 2D and 3D graphics within the environment, camera control for desktop versus immersive VR use of the system, and the need for “immersive applications” that require exclusive access to the whole environment.
We would also like to investigate how to make use of 3D information visualization within the system, as well as define a standard set of 3D widgets in the 3Dwm toolkit.
Finally, one of the most exciting—but also one of the most daunting—research directions for 3Dwm is that of computer-supported collaborative work (CSCW), i.e. turning the 3Dwm environment into a shared, distributed workspace for several concurrent users.
Acknowledgements
Thanks to Robert Karlsson and Steve Houston for their invaluable work on 3Dwm. Thanks to Philippas Tsigas and Devdatt Dubhashi for their support, feedback, and help. Thanks to Hans Andersson, Erland Flygt, and Anders Janocha for their help on creating the 3Dwm video accompanying this paper.
References
[1] Allen Bierbaum, Christopher Just, Patrick Hartling, Kevin Meinert, Albert Baker, and Carolina Cruz-Neira. VR Juggler: A virtual platform for virtual reality application development. In Proceedings of IEEE Conference on Virtual Reality, pages 89–96, 13–17 March 2001.
[2] Stuart K. Card, George G. Robertson, and Jock D. Mackinlay. The information visualizer, an information workspace. In Proceedings of ACM CHI’91 Conference on Human Factors in Computing Systems, Information Visualization, pages 181–188, 1991.
[3] Christer Carlsson and Olof Hagsand. DIVE - a multi-user virtual reality system. In Proceedings of IEEE Virtual Reality Annual International Symposium, pages 394–400, September 18–22 1993.
[4] Andy Cockburn and Bruce McKenzie. 3D or not 3D?: Evaluating the effect of the third dimension in a document management system. In Proceedings of ACM CHI 2001 Conference on Human Factors in Computing Systems, Social Interfaces, pages 434–441, 2001.
[5] D. Brookshire Conner, Scott S. Snibbe, Kenneth P. Herndon, Daniel C. Robbins, Robert C. Zeleznik, and Andries van Dam. Three-dimensional widgets. In Proceedings of 1992 Symposium on Interactive 3D Graphics, Special Issue of Computer Graphics, Vol. 26, pages 183–188, 1992.
[6] Carolina Cruz-Neira, Daniel J. Sandin, Thomas A. DeFanti, Robert V. Kenyon, and John C. Hart. The CAVE: audio visual experience automatic virtual environment. Communications of the ACM, 35(6):64–72, June 1992.
[7] Phillip Dykstra. X11 in virtual environments. In Proceedings of the Symposium on Research Frontiers in Virtual Reality, pages 118–119, San Jose, CA, USA, October 1993. IEEE Computer Society Press.
[8] Niklas Elmqvist. 3Dwm: Three-dimensional user interfaces using fast constructive solid geometry. Master’s thesis, Chalmers University of Technology, Göteborg, 2001.
[9] Steven Feiner and Clifford Beshers. Worlds within worlds: metaphors for exploring n-dimensional virtual worlds. In ACM, editor, Proceedings of ACM Symposium on User Interface Software and Technology, pages 76–83, New York, NY 10036, USA, October 1990. ACM Press.
[10] Adele Goldberg. Smalltalk-80 - the Interactive Programming Environment. Addison-Wesley, Reading (MA), 1984.
[11] James Gosling, David S. H. Rosenthal, and Michelle J. Arden. The NeWS Book. Springer-Verlag, Berlin, Germany / Heidelberg, Germany / London, UK / etc., 1989.
[12] D. Austin Henderson, Jr. and Stuart K. Card. Rooms: the use of multiple virtual workspaces to reduce space contention in a window-based graphical user interface. ACM Transactions on Graphics, 5(3):211–243, 1986.
[13] Kenneth P. Herndon and Tom Meyer. 3D widgets for exploratory scientific visualization. In Proceedings of ACM Symposium on User Interface Software and Technology, Groupware and 3D Tools, pages 69–70, 1994. TechNote.
[14] Philip M. Hubbard, Matthias M. Wloka, and Robert C. Zeleznik. UGA: A unified graphics architecture. Technical Report CS-91-30, Brown University - Department of Computer Science, June 1991.
[15] Greg Humphreys, Mike Houston, Ren Ng, Randall Frank, Sean Ahern, Peter Kirchner, and Jim Klosowski. Chromium: A stream-processing framework for interactive rendering on clusters. In John Hughes, editor, SIGGRAPH 2002 Conference Proceedings, Annual Conference Series, pages 693–702. ACM Press/ACM SIGGRAPH, 2002.
[16] Mark Linton and Chuck Price. Building distributed user interfaces with Fresco. The X Resource, 5(1):77–87, January 1993.
[17] Jock D. Mackinlay, George G. Robertson, and Stuart K. Card. The Perspective Wall: Detail and context smoothly integrated. In Proceedings of ACM CHI’91 Conference on Human Factors in Computing Systems, Information Visualization, pages 173–179, 1991.
[18] Brad A. Myers. User-interface tools: Introduction and survey. IEEE Software, 6(1):15–23, January 1989.
[19] Tristan Richardson, Quentin Stafford-Fraser, Kenneth R. Wood, and Andy Hopper. Virtual network computing. IEEE Internet Computing, 2(1):33–38, January–February 1998.
[20] George Robertson, Maarten van Dantzich, Daniel Robbins, Mary Czerwinski, Ken Hinckley, Kirsten Risden, David Thiel, and Vadim Gorokhovsky. The Task Gallery: A 3D window manager. In Proceedings of ACM CHI 2000 Conference on Human Factors in Computing Systems, volume 1 of 3D Environments, pages 494–501, 2000.
[21] George G. Robertson, Stuart K. Card, and Jock D. Mackinlay. The cognitive coprocessor architecture for interactive user interfaces. In Proceedings of ACM Symposium on User Interface Software and Technology, pages 10–18, 1989.
[22] George G. Robertson, Stuart K. Card, and Jock D. Mackinlay. Information visualization using 3D interactive animation. Communications of the ACM, 36(4):56–71, April 1993.
[23] George G. Robertson, Jock D. Mackinlay, and Stuart K. Card. Cone trees: Animated 3D visualizations of hierarchical information. In Proceedings of ACM CHI’91 Conference on Human Factors in Computing Systems, pages 189–194. ACM Press, 28 April–2 May 1991.
[24] John Rohlf and James Helman. IRIS performer: A high performance multiprocessing toolkit for real-time 3D graphics. In Proceedings of SIGGRAPH ’94 Conference on Computer Graphics, pages 381–395. ACM Press, July 1994.
[25] Robert W. Scheifler and Jim Gettys. The X window system. ACM Transactions on Graphics, 5(2):79–109, 1986.
[26] Chris Shaw, Jiandong Liang, Mark Green, and Yunqi Sun. The decoupled simulation model for virtual reality systems. In Proceedings of ACM CHI’92 Conference on Human Factors in Computing Systems, pages 321–328, 1992.
[27] David C. Smith, Charles H. Irby, Ralph Kimball, Eric F. Harslem, and Howard L. Morgan. The Star user interface: An overview. In AFIPS Conference Proceedings. 1982 National Computer Conference, volume 51, pages 515–528. Xerox Corp., 1982.
[28] David C. Smith, Charles H. Irby, Ralph Kimball, Bill Verplank, and Eric F. Harslem. Designing the Star user interface. Byte Magazine, 7(4):242–282, April 1982.
[29] Ivan E. Sutherland. A head-mounted three dimensional display. In Proceedings of Fall Joint Computer Conference, volume 33, pages 757–764. Thompson Books, 9–11 December 1968.
[30] William C. Thibault and Bruce F. Naylor. Set operations on polyhedra using binary space partitioning trees. ACM Computer Graphics SIGGRAPH ’87, 21(4):153–162, July 1987.
[31] Andries van Dam. Beyond WIMP. IEEE Computer Graphics and Applications, 20(1):50–51, January/February 2000.
[32] Martin van Dantzich, George Robertson, and Vadim Gorokhovsky. Application redirection: Hosting windows applications in 3D. In Proceedings of Workshop on New Paradigms in Information Visualization and Manipulation (NPIVM-99), pages 87–91, N.Y., November 6 1999. ACM Press.
[33] The VRML Consortium. The Virtual Reality Modeling Language (ISO/IEC 14772-1:1997), 1997.
[34] Robert C. Zeleznik, D. Brookshire Conner, Matthias M. Wloka, Daniel G. Aliaga, Nathan T. Huang, Philip M. Hubbard, Brian Knep, Henry Kaufman, John F. Hughes, and Andries van Dam. An object-oriented framework for the integration of interactive animation techniques. Computer Graphics, 25(4):105–112, July 1991.
// Resolve the necessary kits
SolidKit_var solid_kit = client.resolve (Nobelxx::name());
PrimitiveKit_var prim_kit = client.resolve (Nobelxx::name());
// Create the CSG container
SolidContainer_var container = solid_kit->createContainer();
// Create primitives
Primitive_var cube = prim_kit->createCuboid();
Primitive_var cone = prim_kit->createCone(32);
// Create primitive containers
Solid::Primitive_var cube_ctr = solid_kit->createPrimitive(cube);
Solid::Primitive_var cone_ctr = solid_kit->createPrimitive(cone);
// Create a subtraction
Solid::Binary_var tree = solid_kit->createSubtraction(cube_ctr, cone_ctr);
// Assign CSG tree to CSG container
container->setTree(tree);
// Evaluate and insert container into scene graph
container->evaluate();
client.root()->insert(tree);
Figure 6: C++ source code snippet of the immersive CSG modeller.