Article title:
State Synchronization in Heterogeneous Groupware
Author:
Ivan Marsic
Affiliation:
Department of Electrical and Computer Engineering Rutgers — The State University of New Jersey Piscataway, NJ 08854-8058
Paper Format: Regular Paper
Corresponding author:
Ivan Marsic, Department of Electrical and Computer Engineering, Rutgers, The State University of New Jersey, 94 Brett Road, Piscataway, NJ 08854-8058
Email:
[email protected]
Abstract
Traditional groupware systems for synchronous collaboration require identical applications running on nearly identical hardware platforms. However, the recent proliferation of computing devices and of the contexts in which they are used demands diversity in collaborative applications as well. The DISCIPLE and coZmo frameworks presented here support collaboration among heterogeneous applications. Heterogeneity is achieved by using the eXtensible Markup Language (XML) as the communication medium. The conferees share the same data, or a subset of that data, represented in XML, but they may see it displayed in different ways as needed or desired. The coZmo visualization framework transforms information to match the client's local capabilities and resources while preserving the semantic content needed for effective sharing. The paper analyzes application state synchronization in heterogeneous groupware applications and the constraints on it.
State Synchronization in Heterogeneous Groupware Ivan Marsic Department of Electrical and Computer Engineering Rutgers University Piscataway, NJ 08854-8058 USA +1 732 445 6399
[email protected]
Keywords
Groupware, heterogeneous computing, state synchronization, concurrency control, XML.
1 INTRODUCTION
A key component of synchronous collaboration among knowledge workers is real-time sharing of software applications. Application sharing can be achieved with a centralized architecture, a replicated architecture, or a hybrid of both [11]. In the centralized case, the conferees use a single copy of the application through multiple views. In the replicated case, each conferee runs a copy of the same application. In addition, most systems strive to synchronize the way the application looks across the different instances, the so-called WYSIWIS (What You See Is What I See) paradigm [17]. As a result, users' screens look (nearly) identical across all the instances.
A key reason for the uniformity is the difficulty and associated cost of developing diverse groupware applications that are interoperable. However, with the recent anytime-anywhere proliferation of computing technology, the need for diversity can no longer be ignored. Kraut et al. [9], for example, show that field workers make repairs more quickly and accurately when they have a remote expert helping them. It is likely that the expert will be in a central office with a workstation, whereas the field worker will have a wearable computer. Thus, heterogeneity occurs naturally in practical tasks.
As a result of heterogeneity, there is a need to enable collaboration among users with different views of the application and even with somewhat different applications. We take a data-centric approach, in which conferees share the same data or a subset of that data. Our DISCIPLE system [20] provides a container for importing component Java applications and making them collaborative. The coZmo framework introduced here defines a set of such applications that use XML¹ for data representation and visualization on heterogeneous devices.
¹ http://www.w3.org/XML/. See also http://www.xml.com/
XML is suitable for representing data in a common form and visualizing it differently on different platforms. For various reasons we may also want to allow the application logic (models) of collaborating peers to differ somewhat. One reason may be bandwidth conservation: we may not want or need to transfer entire XML documents to a collaborator connected over a low-bandwidth wireless link. Or a collaborator may have a small display and not need the high-resolution map used by other peers. Yet another reason may be security: certain data should not be accessible to all collaborators. Our goal is to allow not only collaboration of applications with different operations, but also different interpretations of the same data by the same operations, depending on the machine capabilities. For instance, if a machine is not powerful enough to display video clips, it should limit itself to audio or text.
A major problem in the replicated groupware architecture is maintaining a consistent state across different copies of the application, where consistency means logical equivalence among replicated copies of the same data object. Synchronization of replicated clients in a homogeneous system is a well-researched problem, see e.g., [1,3,,19], and is not addressed here. However, there are some issues with homogeneous replicated models that are specific to the coZmo framework, as elaborated below. The main goal of this paper is to analyze the issues that arise in a collaborative session with heterogeneous application models.
The paper is organized as follows. We first review the state of knowledge and related work in progress. Section 2 overviews the architecture of the DISCIPLE system and presents the XML-based coZmo framework for heterogeneous collaboration. Section 3 presents a solution for synchronization of replicated groupware applications that separate the application logic from the view. Section 4 investigates the issues with heterogeneous clients, primarily consistency and stability. Finally, we conclude the paper.
1.1 Background and Related Work
The coZmo framework presented here is based on the Model-View-Controller (MVC) design pattern, a well-known and frequently used pattern for developing interactive applications with flexible human-computer interfaces [2]. MVC divides an interactive application into three² components. The model contains the core functionality and data, views display information to the user, and controllers handle user input.
The need to allow conference participants to collaborate on dissimilar terminals was recognized early on by D. Engelbart, the pioneer of computer-supported collaborative work [4]. Although the WYSIWIS idealization recognizes that efficient reference to common objects depends on a common view of the work at hand, strict WYSIWIS was found to be too limiting, and relaxed versions were proposed to accommodate personalized screen layouts [17].
Several researchers proposed MVC separation in synchronous groupware, with either centralized or distributed models and polymorphic views rendered on distributed hosts. Rendezvous [15], GroupKit [16] and several groupware toolkits thereafter use model-view separation so that developers can create models and drive different views. However, no actual implementation is attempted, and in some cases (e.g., [15]) the situation is greatly simplified by using a centralized groupware architecture.
A recent approach to collaboration in heterogeneous computing environments is the CMU Pebbles project [13]. Pebbles focuses on single-display groupware, with the team in a single meeting room. Multiple hand-held computers (PDAs) provide simultaneous input (mouse, keyboard) to a single workstation. PDAs are not treated as equal partners in the collaboration, and thus there is no need for heterogeneous data representation. In addition, the synchronization problem is fairly simple.
The Visage system from Maya Design [8] is a powerful (single-user) visualization system for creating custom visualizations and for direct manipulation of large and diverse datasets. Some of its unique features include dynamic data navigation through drill-down and roll-up (aggregation) techniques. While Visage addresses diverse data visualization, it is not explicitly intended for collaboration in heterogeneous computing environments. Visage is based on a proprietary language, whereas coZmo is based on XML, a widely accepted standard. Recent work on Visage Link [5] adds multiuser collaboration capabilities, but does not consider heterogeneous models or data.
² Each controller is usually tightly coupled with a corresponding view, so there is a recent tendency to talk only about the Model-View pattern.
2 SYSTEM ARCHITECTURE
A distributed collaboration system comprises a set of participant sites connected by a communication network. In the replicated architecture of groupware [11], each site hosts the same or a similar application program, and all copies of the application communicate with each other. The applications are kept in synchrony, and activities occurring on any one of them are reflected on the other copies. An application implements an internal state structure by defining a set of operations Op1, Op2, …, Opn. The operations are application-specific and are transmitted as events to the other sites in an operation-independent manner. When a site receives an event, it identifies and performs the corresponding operation(s).
Figure 1 shows the architecture of the DISCIPLE system. The set of conferees is represented hierarchically as an Organization, and they meet in Places. The central part of DISCIPLE is conceptualized as the collaboration bus [20]. It spans network fabrics and provides a virtual interconnect for geographically distributed clients. The bus achieves synchronous collaboration through real-time event delivery, event ordering and concurrency control. The collaboration bus comprises a set of communication channels to which the peers can subscribe and publish information.
2.1 Sharing Java Beans
DISCIPLE is an application framework, i.e., a semi-complete application that can be customized to produce custom applications. The completion and customization are performed by end users (conference participants), who at runtime select and import task-specific Java components: Beans and Applets. The DISCIPLE workspace is a shared container where Java Beans [18] can be loaded very much like Java Applets are downloaded to a Web browser, with the addition of group sharing. Collaborators import Beans into the workspace by drag-and-drop manipulation. The imported Bean becomes part of a multi-user application, and all conferees can interact with it.
The application framework approach has an advantage over the commonly used toolkit approaches: with a toolkit, the application designer decides on the application functionality, whereas in our approach the end user decides. We consider the latter better because it is closer to the reality of usage and the real needs of the task at hand.
Figure 1: DISCIPLE architecture. Organizations and Places are abstractions implemented as multicast groups. They are represented in the user interface as Communication Center and Workspaces, respectively.
Disregarding the issue of choice between centralized vs. replicated groupware architecture (see e.g., [12]), our choice of the replicated architecture is dictated by the main tenet of DISCIPLE of allowing users to customize their workspaces by downloading software components. Since this is done on an individual basis, the architecture is necessarily replicated.
2.2 XML and Model-View Implementation in coZmo
DISCIPLE allows importing and sharing arbitrary Java Beans. A class of beans of particular interest for heterogeneous environments is what we define as coZmo beans. coZmo is a data-centric framework that uses XML as the communication medium. We chose XML because (i) it is transportable between heterogeneous information systems in a neutral and system-amenable manner, and (ii) many XML parsers, tools and libraries are readily available. XML supports the notion of separation between view and data. The same data can be rendered in different ways to an on-screen representation.
XSL (eXtensible Stylesheet Language) is designed to help browsers and other applications display XML. Stated simply, a style sheet contains instructions that tell a processor (such as a Web browser, a print composition engine, or in our case a coZmo bean) how to translate the logical structure of a source document into a presentational structure. An XSL processor starts with a style sheet and a source tree (Figure 2). The tree created in this process is called the result tree, and it is in a format readily understood by the displaying application. The view at a particular user's machine is generated by processing the common XML file together with the local XSL file: applying the rules that the XSL file defines for the various objects to the XML data forms the result tree, which can be unique for every user. The views may differ because of different user needs, different expertise or concerns, or different display capabilities.
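To make the rendering step concrete, the following Java sketch applies two different local style sheets to one shared XML document. It is an illustration only, not the coZmo implementation: it assumes the standard JAXP API (javax.xml.transform) bundled with the JDK, and the element and attribute names (scene, unit, x, y, z), the class name and the helper methods are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ViewRenderingSketch {

    // Shared source document (the model). Element and attribute names are hypothetical.
    static final String SHARED_XML =
        "<scene><unit name=\"truck\" x=\"10\" y=\"20\" z=\"5\"/></scene>";

    // Local style sheet of a client with a 2D display: the z attribute is ignored.
    static final String XSL_2D = styleSheet(
        "<xsl:value-of select=\"@name\"/> at ("
        + "<xsl:value-of select=\"@x\"/>,<xsl:value-of select=\"@y\"/>)");

    // Local style sheet of a client with a 3D display: all three coordinates are shown.
    static final String XSL_3D = styleSheet(
        "<xsl:value-of select=\"@name\"/> at ("
        + "<xsl:value-of select=\"@x\"/>,<xsl:value-of select=\"@y\"/>,"
        + "<xsl:value-of select=\"@z\"/>)");

    // Wraps a template body for <unit> elements in a complete XSLT 1.0 style sheet
    // that produces plain text, which stands in for the result tree.
    static String styleSheet(String unitTemplateBody) {
        return "<xsl:stylesheet version=\"1.0\" "
            + "xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"unit\">" + unitTemplateBody + "</xsl:template>"
            + "</xsl:stylesheet>";
    }

    // Applies a style sheet to a source document and returns the rendered text.
    static String render(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Same model, two different views.
        System.out.println("2D client: " + render(SHARED_XML, XSL_2D));
        System.out.println("3D client: " + render(SHARED_XML, XSL_3D));
    }
}
```

Both clients read the same source document; only the style sheet differs, so the 2D client never needs to interpret the z attribute.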
Figure 2: XML document rendering.
The coZmo architecture reflects XML's model-view separation. coZmo uses XML to represent the data the conferees are collaborating on. There are two parts to the representation. One defines a DTD (Document Type Definition) and the other expresses the data in XML in terms of the DTD. Once the data is written in XML, an associated XSL document has to be written to define the way the data is displayed. Users can thus render the document on the screen according to their own choice, which they describe in XSL. In Figure 2, the XML tree is platform-independent and thus corresponds to the model, whereas the result tree corresponds to the view.
2.3 coZmo Beans
coZmo is implemented as a set of specialized Beans, each capable of visualizing a different DTD-defined language. This is in contrast to one large Bean that can visualize any XML document, and is particularly suitable for thin clients. At present, a coZmo bean is imported into a DISCIPLE workspace, and the workspace layouts are the same across the conferees, although the beans inside may be different. In general, any of the DISCIPLE user interface components (Figure 1) can be implemented using the coZmo framework. For example, different workspaces can provide different functionality, such as gadgets for group awareness (telepointers, radar views, etc.). Our current efforts are directed at extending coZmo so that the workspace GUI layout can also be defined using XML/XSL.
3 SYNCHRONIZATION OF DISTRIBUTED MODELS AND VIEWS
We define the application state as a structure $S = \langle id, O \rangle$, where id is a unique identifier associated with the site (e.g., its IP address) and O is an ordered set of instances of operations Op1, Op2, …, Opn derived from the user input, where each operation can be executed an arbitrary number of times. Since the operations generally do not commute, i.e., the order of composition matters (Opi followed by Opj is generally not equivalent to Opj followed by Opi), the applications may end up with incoherent states.
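The following Java sketch illustrates the state structure and the effect of non-commuting operations. It is a minimal illustration under stated assumptions rather than the DISCIPLE data structures: the class names, the text-document model and the two sample operations are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

public class ApplicationStateSketch {

    /** One operation; its effect on a text document is an illustrative assumption. */
    static final class Operation {
        final String name;
        final UnaryOperator<String> effect;

        Operation(String name, UnaryOperator<String> effect) {
            this.name = name;
            this.effect = effect;
        }
    }

    /** Application state: a site identifier plus the ordered operation instances O. */
    static final class SiteState {
        final String id;
        final List<Operation> operations = new ArrayList<>();

        SiteState(String id) {
            this.id = id;
        }

        /** Appends an operation instance to the ordered set O. */
        void execute(Operation op) {
            operations.add(op);
        }

        /** Replays O in order on an initially empty document. */
        String document() {
            String doc = "";
            for (Operation op : operations) {
                doc = op.effect.apply(doc);
            }
            return doc;
        }
    }

    // Two hypothetical operations that do not commute.
    static final Operation TYPE_HELLO = new Operation("typeHello", doc -> doc + "hello");
    static final Operation CLEAR = new Operation("clear", doc -> "");

    /** Builds a site state by executing the given operations in the given order. */
    static SiteState replay(String id, Operation... ops) {
        SiteState site = new SiteState(id);
        for (Operation op : ops) {
            site.execute(op);
        }
        return site;
    }

    public static void main(String[] args) {
        // Both sites see the same two operations, but in a different order.
        SiteState alpha = replay("alpha", TYPE_HELLO, CLEAR);
        SiteState beta = replay("beta", CLEAR, TYPE_HELLO);
        System.out.println(alpha.id + ": \"" + alpha.document() + "\"");
        System.out.println(beta.id + ": \"" + beta.document() + "\"");
    }
}
```

Site alpha ends with an empty document and site beta with the text "hello", although both executed the same two operations. This is the kind of incoherence that ordering must prevent.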
The incoherence issue arises only in a collaborative system with memory, meaning that the current state depends on the previous state(s). In a memoryless system, such as video conferencing, the changes (audio/video streams) are multicast continuously and a new frame simply overwrites the previous one; any past inconsistency is irrelevant for the future system states. In a shared document editor, by contrast, the errors accumulate and result in incoherence.
In the replicated architecture we maintain the coherence of the applications by requiring that all the applications have the same consistent state at virtually all moments in time. The equality of two application sites is defined by a Boolean function $F_{\alpha,\beta}(S_i, S_j)$ for a pair of sites with identifiers α and β and with application states $S_i$ and $S_j$. If the two applications are instances of the same application, the function F simply checks that all the operations appear in the same order on both sites. We say that two sites α and β are in the same state, or consistent, at time t if the following two conditions hold: (a) $F_{\alpha,\beta}(S_i, S_j) = 1$, where $S_i$ is the application state at site α at time t and $S_j$ is the application state at site β at the same time, and (b) no messages are in transit between the sites. We say that the configuration Γ of sites participating in a collaborative session is correct if any two distinct sites α, β in Γ have the same state.
A key issue in distributed collaboration frameworks is maintaining the correctness of the configuration in the presence of concurrent access [3]. We assume that messages are not lost in the network, but we cannot assume any order among them. Since the operations generally do not commute, we have to guarantee a total ordering of events in order to maintain the correctness of the configuration [10].
3.1 Event Replication
User interaction in coZmo is directly reflected in the local application view, which corresponds to the XML result tree. As in the regular MVC pattern, the changes are propagated to the local application model. In real-time collaboration, the question arises as to where the changes are captured and distributed to the collaborating peers (Figure 3). With U, V and M denoting the user, view and model at a site, the two types of change flow can be represented as:
Uα → Vα → …view event… → Vβ → Uβ (with the local branches Vα → Mα and Vβ → Mβ)
or
Uα → Vα → Mα → …model event… → Mβ → Vβ → Uβ
There are advantages and disadvantages to both solutions. Intercepting user events at the view level, before they change the state of the model, simplifies total event ordering and concurrency control. The events are applied at the views as if each conferee had generated the event at their own site. We let the view generate the operation to be applied on its associated model. However, this requires commonality in the views: a view must be able to interpret events from all the diverse remote views. It may be very difficult to guarantee that all models end up in the same state. It also compromises our goal of maximizing commonality between models and minimizing commonality between views. In addition, the local application response is slower.
[Figure 3 diagram labels: User, View, Model, View Events, Model Events, Bean α, Bean β, Collaboration Bus]
Figure 3: Change replication in the collaborative system.
Figure 4: Example of event replication in heterogeneous collaboration where the recipients may be automatically correcting the silent attributes.
On the other hand, if the changes (operations) are captured at the model level, we can create a small XML document corresponding to the modified sub-tree and use a standard communication protocol, e.g., HTTP, which is a more general solution. As long as we maintain consistency among the replicated models, the views are automatically re-rendered as the changes propagate from the models. If all models are the same, any event ordering or concurrency control algorithm can guarantee that the applications are in the same state. Also, the local application response is nearly at the speed of single-user applications. However, this solution poses problems with event ordering and concurrency control. Once an operation is committed on a model, we need an inverse operation (via undo/redo) and complex concurrency control algorithms to ensure consistency across the peers. coZmo implements this second solution.
coZmo uses weak sharing, which means that each client distributes the operations that apply only to the elements and attributes that its view knows about. Let us consider an example of an XML element with three spatial coordinates. User α in Figure 3 has a 2D view and user β has a 3D view. If user α relocates the corresponding view object, attributes (;326