Conferencing and collaborative computing

Multimedia Systems (1996) 4: 210–225

Multimedia Systems © Springer-Verlag 1996

Conferencing and collaborative computing Eve M. Schooler⋆ University of Southern California/Information Sciences Institute, Marina del Rey, CA 90292, USA

Abstract. As integrated services have become available to the desktop, users have embraced new modes of interaction, such as multimedia conferencing and collaborative computing. In this paper, we provide a survey of past and present research that has influenced this application area, and describe research directions for the future. Key words: Multimedia conferencing – Collaborative computing – Groupware

1 Introduction

With the confluence of computers, televisions, and telephones, conferencing and collaborative computing have emerged as new styles of communication. The interactions that were once supported by several technologies over several disparate networks are beginning to be integrated within one framework: the computer. This field is distinguished from traditional computer endeavors in two ways: it emphasizes using the computer directly to facilitate human-to-human interaction, as well as multiuser communication. After a brief characterization of multimedia conferencing systems, a historical overview of the field is given, as well as an assessment of its evolutionary progress. We follow this with an in-depth discussion of the architectural requirements for collaborative computing. Finally, this paper takes both a look back and a look forward to describe research directions for this important area of multimedia applications.

1.1 Groupware and telecollaboration defined

Broadly defined, the field of collaborative computing, otherwise known as computer-supported cooperative work (CSCW), encompasses the use of computers to support coordination and cooperation of two or more people who attempt to perform a task or solve a problem together (Borenstein 1992). Not surprisingly, systems that have been honed to support group work are referred to as groupware. The essence of groupware is the creation of shared workspaces among collaborators. Conferencing is simply one form of collaborative computing. It is a term most often used to describe synchronous telecollaboration, in which shared computer-based applications (e.g., shared editors, whiteboards) are supported in real time. Often conferencing systems combine shared workspaces with live media, such as audio and/or video. In those instances, the notion of a shared computer-based workspace is enhanced by an element of shared presence in multiple media.

⋆ Present address: California Institute of Technology, Computer Science Department, 256–80, Pasadena, CA 91125, USA e-mail: [email protected]

1.2 Synergy among disciplines

Collaborative computing sits at the crossroads of many disciplines: multimedia, distributed systems, networking, and human factors, to name just a few. Conferencing is reliant on multimedia solutions, because at some level, conferencing is the management of multiple users in multiple media. As such, it requires low latency for user interactivity and high bandwidth for potentially data-intensive media. However, these problems are compounded by the fact that media is now being distributed among multiple end users. Thus, there is even more of an urgency to adopt techniques to optimize data delivery. Collaborative environments are inherently distributed. Whether data sharing is at the level of the graphical user interface (GUI) or the network, conferencing relies on distributed messaging for the dissemination of data and/or control information. Furthermore, the shared nature of groupware leads to issues in data consistency as well as fault tolerance. Networking plays an increasingly important role for widespread conferencing. With the migration of collaborative software out of local-area testbeds and into wide-area venues, networking substrates, like multicasting (Deering 1988), are needed to provide efficient multiway communication (1-to-N, N-to-1, and N-to-N). In addition, protocol abstractions, such as distributed session control, are needed to shield users from the complexities of multiparty, multimedia coordination.


1.3.2 Locality

Fig. 1. Collaboration matrix

It is important to stress that conferencing is as much about human factors as it is about the underlying technology. As a result, we, as researchers in CSCW, must embrace the interdisciplinary nature of the field (Ellis et al. 1991; Malone and Crowston 1994). It is crucial to devise collaborative systems with an understanding of how sociological and psychological factors impact group work, especially since mismatched expectations in group-oriented systems have resulted in serious groupware failures (Grudin 1990).

1.3 Collaboration dimensions

There are several differentiating features among conferencing and collaborative systems. The design space is most frequently categorized according to the attributes that appear in Fig. 1: synchrony, locality, and scale. One may think of these variables as forming a multidimensional space with each system falling somewhere in that space (admittedly, some points within the space are more interesting than others). See Ellis et al. (1991), Grudin (1994), Nunamaker (1991), Rangan and Vin (1991), Schooler et al. (1991), Szyperski and Ventre (1993), and Watabe et al. (1991) for related categorizations of collaboration attributes.

1.3.1 Synchrony

Perhaps the most basic division is between synchronous and asynchronous conferencing. Both forms of conferencing cater to multiple users, but synchronous conferencing is intended for simultaneous users interacting in real time, whereas asynchronous conferencing systems, such as structured messaging systems (Lai and Malone 1988), multimedia electronic mail (Borenstein 1993; Scheurmann 1996), and multiparty calendar services (Beard et al. 1990), provide non-real-time communication. In this paper, we concentrate on synchronous multimedia collaborative systems and therefore use the terms conferencing and collaborative computing interchangeably.

Another fundamental distinction is local face-to-face computer-augmented meetings (Mantei 1988; Nunamaker et al. 1991; Stefik et al. 1987b), versus remote meetings for which a real-time voice and/or video channel is required (Casner and Deering 1992; Chen et al. 1992; Craighill et al. 1993; Crowley et al. 1990; Vin et al. 1991). These live media can be carried in digital (Casner and Deering 1992; Elliott 1993) or analog (Ahuja et al. 1988; Arango et al. 1992; Root 1988) form. Another contrasting feature is that some remote conferencing systems are designed for interoffice collaboration (Arango et al. 1992; Root 1988) while others are for conferences between special meeting rooms (Casner et al. 1990; Elliott 1993). In addition, some remote systems have been optimized to operate with the low delays seen across a local area network (LAN) (Arango et al. 1992; Swinehart 1991), whereas others have been designed to tolerate the longer delays of a more geographically dispersed wide area network (WAN) (Casner et al. 1990; Elliott 1993; Handley and Wilbur 1992; Macedonia and Brutzman 1994). 1.3.3 Scale A third axis of the state space is the extent to which a system can scale up to support growing numbers of collaborators (Schooler 1993a; Szyperski and Ventre 1993), or groups of collaborators (Nunamaker et al. 1991; Rangan and Vin 1991). A criticism frequently lodged at specialized groupware systems is that they are often completely disjoint from the software ordinarily used by individuals when working alone. Thus, there is an impetus to create solutions that offer familiar single-user tools in group settings. There is also a critical need to bridge seamlessly the gap between applications designed to support single-user mode, point-to-point mode, and multi-point mode. 
Although we are beginning to see systems that have better support for multiuser modes, they frequently have implementation-related upper bounds on the numbers of collaborators they are able to support (Schooler 1993a).

1.4 Venue agility

As conferencing systems become more sophisticated, they may support venue agility (Gust 1989). That is, they may allow users to operate in multiple points of the multidimensional space. Such a system may support a move between synchronous modes (e.g., collaborative editing) and asynchronous modes (e.g., electronic mailings of these edits), or allow the transition from working stand-alone, to working with one other person, to working with a group of people. The complication of providing venue agility is that selecting a point in the design space has often been intimately tied to an underlying architectural model (see the discussion in Sect. 1.4, Architectural Considerations). However, there are several ongoing efforts to understand what is needed to provide more fluidity between the collaboration attributes and the underlying architecture choices (Crowley et al. 1990; Handley et al. 1995; Roseman and Greenberg 1994; Schooler


1993a; Schulzrinne 1995; Shenker et al. 1994). One promising approach is to supply adaptive mechanisms that are semitransparent to end users (Ellis et al. 1991).

2 Survey: seminal work to state of the art

Having laid the groundwork for the taxonomy behind collaborative systems, we are ready to discuss the evolution of this field, and to detail the technological and sociological trends behind it. If we look historically at the field of multimedia collaborative systems, we see that it is an outgrowth of several disparate efforts: shared computer-based workspaces, real-time audio, and live video spaces. We begin with an introduction to each of these components, describing sample research that has contributed to the maturation of each venue. We highlight the variations in media capabilities and requirements, and present results gleaned from studies of systems in use.

2.1 Shared computer-based workspaces

Collaborative computing has its roots in the development of computer-based shared workspaces. The idea of group collaboration via computer was inspired first by the seminal ideas of Bush (1945), who introduced the notion of Memex, a group hypertext system, and also has been broadly influenced by Engelbart (1968), whose NLS/AUGMENT was one of the first systems implemented that used computers for asynchronous, as well as synchronous group interaction. Both of these efforts represent extremely forward-thinking research, since the computer was barely capable of such tasks when these ideas first appeared. Since then, there has been an explosion of interest and development of groupware systems ranging from multi-user text editors (Crowley et al. 1990; Ellis et al. 1991), to annotation systems (Cavalier et al. 1991; Neuwirth et al. 1990), sketching programs (Ishii 1990; Ishii et al. 1993; Jacobson and McCanne 1993b; Minneman and Bly 1991; Stefik et al. 1987a), and group support systems (Nunamaker et al. 1991).
Originally designed for computer-equipped meeting rooms that allowed small groups to focus on problem solving in face-to-face meetings (Gibbs 1989; Mantei 1988; Stefik et al. 1987b), shared workspaces subsequently have been introduced into larger electronic classrooms (Nunamaker et al. 1991), and into inter-meeting-room and interoffice conferencing. When groupware applications are used by geographically distributed individuals, a voice channel substitutes for face-to-face speech. Typically, audio is supplied either by conventional telephone conference calls, or packet-based networks (Elliott 1993; Schooler 1993b; Vin et al. 1991). Part of the motivation behind computer-enhanced meeting rooms was a set of studies suggesting that considerable amounts of time are spent in meetings, that the presence of computers in meetings is still quite minimal even though they are used with increasing frequency outside of meetings, and that software that runs on computers is typically geared toward individuals rather than groups. Finally, tasks regularly found in meetings are ideally suited to computers (display, manipulation, storage, and redisplay of data).

However, by integrating computers into group endeavors, a different set of user challenges arises. For instance, how will the system:
– support concurrency control
– provide the necessary visual cues that online data is part of a shared interaction
– preserve some separation between public and private workspaces
– optimize user interactivity yet maintain data consistency
– establish some continuity between the applications used when working alone and those used in meetings

These issues are addressed in the following sections, through a discussion of floor control, GUI enhancements, and workspace architecture trade-offs.

2.1.1 Floor control

Within shared workspaces, floor policies are employed to control access to the shared workspace. Each system must decide the level of simultaneity to support (i.e., numbers of active users at once) and a granularity at which to enforce access control (e.g., at the level of character entry or paragraph entry). In the simplest form of floor control, applications use “gavel passing”; only one participant has the floor at any given time and the floor is handed off when requested. To obtain the floor, one may be required to take an explicit action, like the selection of a special function key, whereas less restrictive systems allow any keyboard or mouse activity to signal a floor change. More recently, some systems are providing a range of policies to fit the different types of meetings that arise (Altenhofen et al. 1993; Craighill et al. 1993; Crowley et al. 1990; Roseman and Greenberg 1992), allowing participants to play a range of roles (e.g., meeting chairperson). Note that floor control policy is differentiated from floor control mechanism (Crowley et al. 1990). The distinction is that floor policies describe how participants request the floor and how it is assigned and released, whereas floor mechanisms are low-level means used to implement floor policies (Reinhard et al. 1994).
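To make the policy/mechanism distinction concrete, the following is a minimal sketch of explicit gavel passing. It is an illustration only, not taken from any of the cited systems: the class name `GavelFloor`, its methods, and the first-come-first-served queue of pending requests are all assumptions. The request and release methods stand in for the policy, while `may_edit` stands in for the low-level mechanism that gates each input event.

```python
from collections import deque


class GavelFloor:
    """Hypothetical gavel-passing floor: one holder at a time, FIFO requests."""

    def __init__(self):
        self.holder = None        # participant currently holding the floor
        self.waiting = deque()    # pending requests, oldest first

    def request(self, who):
        """Ask for the floor; it is granted at once only if nobody holds it."""
        if self.holder is None:
            self.holder = who
        elif who != self.holder and who not in self.waiting:
            self.waiting.append(who)
        return self.holder == who

    def release(self, who):
        """Pass the gavel to the oldest pending requester, if any."""
        if who != self.holder:
            return self.holder    # only the current holder may release
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder

    def may_edit(self, who):
        """Mechanism-level check applied to each input event."""
        return who == self.holder
```

A less restrictive policy of the kind mentioned above could reuse the same mechanism by calling `request` implicitly on any keyboard or mouse activity.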
A variety of mechanisms exist to maintain data consistency among group members, including centralized locks, token passing schemes, and dependency detection. For a more detailed discussion on groupware data consistency mechanisms see Ellis et al. (1991).

2.1.2 GUI enhancements

A continued theme has been how to display shared, simultaneously accessible workspaces visually. One solution is to enforce a strict what-you-see-is-what-I-see (WYSIWIS) policy for the display (Stefik et al. 1987b). However, a simple mapping of an application’s GUI from a single-user mode to a group-oriented mode is not always effective, e.g., multiple cursors on the screen (one for each participant) may lead to confusion. Essentially, multiuser interfaces require some sense of shared context, but they also should preserve some degree of private control over the fate of the workspace. This has


led to relaxed WYSIWIS, or what-you-see-is-not-what-I-see (WYSINWIS), which allows a mixture of public and private windows, and personalized window layouts (Stefik et al. 1987a). However, personalized views of public windows may also cause confusion if one participant points to data that does not appear in another participant’s view, causing sudden context switches. Furthermore, users may want to know about changes being made by other users (e.g., to avoid contention over the same data, to be apprised of workspace modifications), regardless of the degree of private control over the workspace. However, notification of other users’ activities should not be exceedingly distracting. For example, techniques to display data sharing include graying out portions of the screen to provide a busy signal for data being modified by another group member, and using color to “age” text modifications (transitioning them through a series of colors, such as yellow, orange, red, brown, black) to indicate regions of recent and not-so-recent activity (Ellis et al. 1991; Stefik et al. 1987a). In synchronous group meeting systems, designers also can take advantage of the availability of a verbal channel to mediate data contention among a small number of users. While race conditions can be mitigated by the use of visual GUI cues, the participants can also rely on verbal negotiation with other group members before altering shared data. As a result, strict distributed database-concurrency methods often have been avoided, and changes to shared data can be installed by merely broadcasting modifications without any synchronization. These techniques have been effective in supporting a floor mechanism that is lightweight and a floor policy that gives all users simultaneous access to the shared workspace.
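The color-aging cue described above can be sketched as a simple lookup from the time since an edit to a color. The color sequence is the one given in the text; the function name `age_color` and the 60-second step between colors are assumptions chosen only for illustration, since the cited systems' actual timings are not stated here.

```python
# Color sequence from the text, newest edit first.
AGE_COLORS = ["yellow", "orange", "red", "brown", "black"]


def age_color(seconds_since_edit, step_seconds=60.0):
    """Return the display color for text edited `seconds_since_edit` ago.

    `step_seconds` (an assumed value) is how long text stays in each color
    before moving to the next; the last color is kept indefinitely.
    """
    if seconds_since_edit < 0:
        raise ValueError("age cannot be negative")
    index = int(seconds_since_edit // step_seconds)
    return AGE_COLORS[min(index, len(AGE_COLORS) - 1)]
```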

2.1.3 Workspace architectures There is ongoing debate about the optimal underlying architecture for computer-based shared workspaces. The architectural choices are classified as centralized (Garfinkel et al. 1989; Lauwers and Lantz 1990; Patterson et al. 1990), replicated (Crowley et al. 1990; Dabous and Kiss 1993; Jacobson and McCanne 1993b) or hybrid (Bentley et al. 1994). As illustrated in Fig. 2, the centralized model is based on the execution of the application at one site. Input is forwarded from whichever site has the floor to the site where the application executes and all output is broadcast to the other sites. By comparison, a fully replicated architecture runs a copy of the application at each site in the conference. Input from the site with the floor is broadcast to the other participating sites and output is generated locally at each site. One of the reasons why centralized approaches are advocated in preference to replicated ones is that it is often straightforward to take existing single-user applications and make them group-oriented without modification (Garfinkel et al. 1989). This conversion is referred to as making applications collaboration-transparent since the application is unaware of the new mode of operation. Thus users can continue to use familiar applications. A program that uses simple character input/output is especially easy to import into centralized schemes in this fashion.
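As a rough illustration of the two models just described, the sketch below counts the network messages triggered by a single input event from the site holding the floor. It assumes unicast delivery (one message per receiving site); with multicast the fan-out could be a single send. The function name `messages_per_input` and its parameters are hypothetical.

```python
def messages_per_input(n_sites, model, floor_at_app_site=False):
    """Count messages caused by one input event, assuming unicast delivery.

    centralized: input is forwarded to the one site running the application,
                 and the resulting output is sent to every other site.
    replicated:  input is sent to every other site, and each site generates
                 its output locally.
    """
    if n_sites < 1:
        raise ValueError("a conference needs at least one site")
    others = n_sites - 1
    if model == "centralized":
        # No forwarding is needed when the floor holder runs the application.
        input_msgs = 0 if floor_at_app_site else min(1, others)
        output_msgs = others
    elif model == "replicated":
        input_msgs = others
        output_msgs = 0
    else:
        raise ValueError("model must be 'centralized' or 'replicated'")
    return {"input": input_msgs, "output": output_msgs}
```

The counts alone do not capture message size: an output message (e.g., a window update) is often much larger than the input event that caused it.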

Fig. 2. Architectures for computer-based shared workspaces

Because a potential facility for arbitrary window management is not available across a wide range of platforms, sometimes graphically oriented programs cannot be incorporated in a straightforward fashion. In addition, if only one copy of an application runs, but the output is duplicated to all sites, it is impossible to support display policies other than WYSIWIS, which results in an inability to tailor the public workspace to individual needs or preferences. For these reasons, although Lauwers and Lantz (1990) lobby for centralized computer conferencing architectures, they conclude that modern window systems make this task very difficult to achieve. However, emerging groupware toolkits make it possible to adapt single-user applications to collaborative settings with only a few changes aimed to combat these shortcomings (Bentley et al. 1994; Jeffay et al. 1992; Knister and Prakash 1990; Patel and Kalter 1993; Patterson et al. 1990; Roseman and Greenberg 1992). The resulting application is thus collaboration-aware. With the migration to wide-area environs, centralized workspace architectures are often supplanted by distributed ones. Although it might be easier to take a centralized approach, in a geographically distributed environment the choice may result in unacceptable communication delays. Centralized architectures may provide poor interactive response to the conferee with the floor who accesses an application running at a different site. In addition, they may impose a heavier level of network traffic than replicated architectures because output, rather than input, must be distributed to all sites. In LANs, these disadvantages are masked, due to low delays, but they are exacerbated by the large distances involved in transcontinental WANs. Because of this, the replicated strategy seems more suited to the WAN setting. 
However, with replicated architectures, applications must avoid operations that are dependent on the timing of input (e.g., holding down a mouse key to scroll a window).


To avoid nondeterminism, applications are often specifically designed, or converted, to be collaboration-aware. Additionally, tools that use a replicated architecture (Crowley et al. 1990; Jacobson and McCanne 1993b) require each site to have its own copy of all files, be they data or executables. A valid concern is how and when to orchestrate file distribution: prestage (DeSchon and Braden 1988; SunSoft 1992), at startup (Crowley et al. 1990; Jacobson and McCanne 1993b), and/or on demand. Once in session, there is the added burden of maintaining synchronization among copies of the shared workspace and of facilitating late joiners who may need the latest session context. Hybrid approaches attempt to mix the best of these schemes; for example, Bentley et al. (1994) maintain data consistency through a centralized data store, but support individualized views by creating replicated graphical front ends. Whether a centralized, replicated, or hybrid approach is chosen, there are other difficulties that arise in large, diverse communication environments like the Internet. For example, the Internet may at times provide highly variable delays or routing failures that create brief service outages. Resynchronization after such a failure is considerably easier for centralized architectures, because there is only one copy of the data (Lauwers and Lantz 1990). 2.2 Audio conferencing Audio has been incorporated into conferencing systems in two forms. First, audio data, like other media, may be embedded in the computer-based workspace. Second, audio may be used to supplement the conferencing context when conferees are not meeting face to face. 2.2.1 Audio as data Although audio is not yet considered a standard data type for computers, there have been many research projects aimed at understanding how to make it so (Borenstein 1993; Buchanan and Zellweger 1993; Resnick 1993; Swinehart et al. 1983). 
The challenge for audio in groupware workspaces is to store efficiently and to play out the audio stream smoothly. Other issues must also be dealt with. For example, in face-to-face conferencing, unless close synchrony can be achieved, the audio component of a multimedia document should only be played out through one set of speakers in the room. This contrasts with the treatment of the textual or graphical data that are part of the shared workspace and that are displayed at each user’s workstation. This simple example implies that the groupware application would need to know the context of usage to behave properly. In those cases where conference participants are dispersed, the choice of data architecture (centralized or replicated) will influence the behavior of the system. When the architecture is centralized, audio originating at the central archive site must be sent over the network to each of the other participants, and smooth playout requires each site to buffer for potential WAN delays. In the case of replicated groupware, the audio is likely to be stored locally at each

site, avoiding network problems. In either case, there may be a desire to synchronize the playout of the audio streams at all of the remote sites. There are also storage concerns, since digital audio consumes considerably more storage than ASCII text. An indirect pointer technique called ropes has been used to reuse replicated audio segments efficiently (Vin et al. 1991). See Steinmetz (1994a,b) for further discussion of compression schemes.

2.2.2 Audio as a communication channel

Because bidirectional audio has often conjured up images of telephones, some of the early work on audio integration reflects this influence. Resnick (1993), Schmandt and Casner (1989), Watabe et al. (1991), Schmandt (1993), Hoshi et al. (1992), and Clark (1992) give examples of computers coupled with telephony. This has been achieved through use of integrated services digital network (ISDN) signalling, as well as through computer-controlled telephones. Using the computer as an alternate device to control advanced telephone functionality makes perfect sense when we consider the interface to the telephone. Anyone who has ever been lost in a voicemail-selection maze knows that the telephone’s keypad is far from optimal! Fortunately there are efforts to improve the interface to the telephone, in part through more standard integration with computers and fax machines (Resnick 1993; Schmandt 1993). There also have been attempts to carry not only audio control, but also the speech data themselves over computer networks, bypassing the telephone altogether (Casner and Deering 1992; Chen et al. 1992; Elliott 1993; Swinehart 1991). Whereas early experiments targeted transmission of packet audio over local Ethernets, but reverted to telephony for long-distance calls (Vin et al. 1991), subsequent work has aimed at transmitting audio data over the wide area, first in testbeds (Casner et al. 1990) and more recently over the general Internet (Casner and Deering 1992; Jacobson and McCanne 1992; Schulzrinne 1992a). A fundamental challenge for conversational audio in the packet realm is not only jitter-free playback, but also the need to meet delay thresholds for interactivity.
Reasonable delays for interactive audio hover in the 40–100 ms range; delay is generally undetectable when under 20 ms, can cause trouble when significant echo is present between 40 and 80 ms, and begins to affect normal conversation when greater than 100 ms (Swinehart 1991). An added concern for wide-area packet audio conferencing, where packet loss is more likely due to routing mishaps and queueing delays, is that the ear can tolerate only a certain percentage of packet loss, typically in the 5%–10% range. For remote conferencing, the audio component is unquestionably the most important media stream, not only because audio carries important intonation cues with regard to people’s reactions, but also because it is typically the stream that carries the critical content for group discussion. In implicit floor control schemes, it has been observed that the conference’s audio channel is often used to negotiate who should next take the floor and that, without verbal agreement, a flurry of retries sometimes results. With explicit floor control schemes, the audio channel is often used to verify that the mechanics of electronically-mediated floor control are working. A related observation is that full-duplex audio is important for recreating face-to-face group protocols. Telephone conference calls typically do not support this feature. For N-way multiparty conferences on computer-based systems, it is possible to achieve the full-duplex analogue by mixing audio in software at the end systems (Casner et al. 1990; Jacobson and McCanne 1992; Schulzrinne 1992a). This technique allows multiple users to talk at once if desired, although when more than a few people speak at once it renders mixed audio unintelligible. Ergonomic considerations, such as audio equipment quality and unobtrusiveness, also influence the acceptance and regular usage of audio conferencing. For instance, now that workstations and PCs have built-in audio capabilities, it is easy to sit at one’s desk and conduct conversations via the computer instead of the telephone. Nonetheless, the problem of acoustic feedback must be addressed by either using headphones or employing echo cancellation technology.
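Software mixing at an end system can be sketched as summing time-aligned samples from each incoming stream. The sketch below is one simple way to do this, not the method of the cited tools: it assumes signed 16-bit linear PCM samples, frames of equal length that are already aligned in time, and hard clipping on overflow. The name `mix_frames` is hypothetical.

```python
PCM16_MIN = -32768
PCM16_MAX = 32767


def mix_frames(frames):
    """Mix equal-length frames of signed 16-bit samples by summing.

    `frames` holds one list of integer samples per remote talker; a receiver
    would normally leave its own microphone stream out of the mix.
    Sums outside the 16-bit range are clipped rather than wrapped.
    """
    if not frames:
        return []
    length = len(frames[0])
    if any(len(frame) != length for frame in frames):
        raise ValueError("all frames must have the same length")
    mixed = []
    for samples in zip(*frames):
        total = sum(samples)
        mixed.append(max(PCM16_MIN, min(PCM16_MAX, total)))
    return mixed
```

Clipping is the crudest way to handle overflow; it is used here only to keep the example short.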

2.3 Video conferencing Like audio, video has begun to appear in shared workspaces. It has many of the same storage and distribution concerns as audio. Yet, the increased demands made by video on the display technology, bus architecture, CPU, and network fabric, dictate when and where to include video in distributed environments. Video is also being used as an additional communication channel in telecollaborations. Experimentation with live video has moved beyond specialized meeting rooms and meeting places to individualized desktop solutions. Although some systems are targeted to supply analog video and others offer video in digital form, in both cases there is a trend toward integration with computer-based workspaces.

2.3.1 Video as data

As part of shared workspaces, video has been used predominantly to capture shared drawings. Tang (1990) supplies a group drawing surface via live video, whereby individual architects could simultaneously share drawings as if colocated. Ishii et al. (1993) extend the idea of a video-based whiteboard by merging drawn pictures with camera images of the participants (the effect is similar to having a clear or see-through drawing surface), capturing gestures and reactions to the video space. Similarly, the objective in Ishii (1990) is to incorporate non-computer-based materials, such as calligraphy drawn on paper, to support seamless transitions from computer collaborations to noncomputer collaborations. Milazzo (1991) shows video as a data type alongside other media in structured documents, and in distributed teleconferences, participants who are outfitted with identical video databases may view identical video clips via a hierarchy of VCR or laserdisc servers.

2.3.2 Video walls Important trial implementations of video teleconferencing include the video wall experiments conducted by Xerox between Palo Alto and Portland, and the VideoWindow project at Bellcore. Each experiment linked two research facilities, with connections that operated continuously 24 h a day. To encourage unplanned interactions across the two sites, the video walls were placed in common areas. Preliminary data from the Xerox experiment indicate that 70% of all communications were of a casual, drop-in nature, with users reporting that this most probably would not have occurred in the absence of the video link (Goodman and Abel 1986). Roughly two-thirds of all VideoWindow interactions were primarily technical in nature, the remainder being social (Root 1988). It was observed that “despite mediocre quality of both audio and video, users reported that the system was moderately useful for sharing culture and maintaining relationships across the two sites”. However, the 56-kbits/s digital video channel was considered insufficient for crucial aspects of joint work, such as detailed collaboration or delicate negotiation (Kraut et al. 1988). The motivation behind these projects was to provide impromptu access to colleagues and to provide a sense of proximity, thus fostering scientific discovery (Kraut et al. 1988). Although VideoWindow made spontaneous conversation possible, the rate of impromptu interaction was less than half that of face-to-face communication (Cool et al. 1992). This was attributed to the somewhat false sense of symmetry of video windows (e.g., if you can be seen, you can be heard; if I can see you, you can see me), and to the problems of placing the window where it would be the most accessible to the most people. Competition for use of the window was also noted as a factor in its mixed reception, as was the extra effort one had to undertake to meet somewhere other than one’s office. 
2.3.3 Video windows

Nonetheless, these experiments served as precursors to other forms of shared virtual spaces and have led to several natural extensions. First, there has been the impetus to supply inter-meeting-room conferencing, which has less of the continuous 24-h connection flavor and more of a set-up-as-needed approach (Casner et al. 1990; Elliott 1993; Snell 1994). A second extension for video walls has been the move toward interoffice rendezvous, where video is displayed on monitors separate from the workspace computers. Initial implementations cater to computer-controlled solutions, with real-time media carried over switched analog networks (Ahuja and Ensor 1988; Arango et al. 1992; Root 1988). The initial audience for these systems were colleagues distributed throughout a building, with their computers accessible across a LAN. Because these systems predominantly catered to small groups of local individuals, remote users were reachable via codec bridges. Although some studies (Egido 1988) indicate a preference for interoffice conferencing, meeting-room-style conferencing does have its place; it has been noted that meeting rooms better accommodate groups of more than three conferees and that they typically provide higher audio quality.


The distinction between these two types of collaborations may also be a function of economy, since high-end equipment can be expensive, and it is easier to equip a small number of conference rooms than all participants. A third trend has been the expectation that video, like its audio counterpart, can be carried in digital form and transmitted across computer networks as a standard data type. As a result, packet video conferencing systems have been developed, not only for LAN-based conferencing (Vin et al. 1991), but also for wide-area collaborations (Casner et al. 1990; Elliott 1993; Fredericks 1994; Macedonia and Brutzman 1994; McCanne and Jacobson 1994; Turletti 1993) with video-in-a-window becoming a standard commodity. A problem that is frequently cited with digital video is the tremendous amount of data that is generated by a single stream (Steinmetz 1994a,b). A conservative estimate for a rudimentary system would be that each site generates 64 kbits/s of audio, 128 kbits/s of video, and shared workspace data (of a non-real-time nature) that is a negligible amount of data relative to the other media. The resultant per site data rate is approximately 200 kbits/s. Multiply that by the number of participants in a conference, and it becomes clear that one N -way conference alone is likely to consume a large amount of bandwidth. Fortunately, this news is tempered by the realization that in face-to-face meetings not all participants are always viewing all other participants, so it is not necessary in distributed electronic sessions to provide such a feature. Nonetheless, the entire capacity of a T3 backbone link (45 Mbits/s) could be consumed by 225 sites sending data simultaneously.
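The estimate above can be checked with a few lines of arithmetic. The sketch below only restates the figures given in the text (64 kbits/s audio, 128 kbits/s video, a per-site rate rounded to 200 kbits/s, and a 45-Mbits/s T3 link); the function and constant names are our own and purely illustrative.

```python
# Figures taken from the text; names are illustrative, not from any real system.
AUDIO_KBPS = 64
VIDEO_KBPS = 128
PER_SITE_KBPS = 200      # the text rounds 64 + 128 = 192 up to roughly 200
T3_KBPS = 45_000         # 45 Mbits/s expressed in kbits/s


def sites_to_fill(link_kbps: int, per_site_kbps: int) -> int:
    """Number of simultaneously sending sites that saturate a link."""
    return link_kbps // per_site_kbps


print(AUDIO_KBPS + VIDEO_KBPS)                 # exact media rate per site
print(sites_to_fill(T3_KBPS, PER_SITE_KBPS))   # sites needed to fill a T3
```

With the rounded per-site figure, the division reproduces the 225 sites quoted above; with the exact 192 kbits/s the count would be slightly higher.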

3 Application domains Although computer-based shared workspaces, real-time audio and real-time video have matured as separate system entities, the usage of these media is converging. This has reached the point that collaborative computing is moving away from the notion of conferencing for conferencing’s sake and into richer application domains. Medicine, science, education, and art are just a few of the areas in which conferencing has been adopted, and each new discipline places new demands on the collaboration infrastructure.

3.1 Telemedicine Telemedicine has made collaborative radiology, neurology and surgery a reality (Anupam and Bajaj 1993; Krieger et al. 1991; Mulvihill et al. 1993; Sauer and Mansur 1994). System prototypes mix real-time teleconsultation with online medical records. Shared workspaces must therefore combine radiographic material with 3D magnetic resonance images, stored audio records, and written text. The critical nature of the information requires high-quality display technology above and beyond standard CRTs and reliable delivery is required for both images and real-time visualizations. In addition, the privacy of medical information places a new and somewhat conflicting demand on the otherwise open infrastructure needed for group collaboration.

3.2 Telescience Telescience, such as marine biology and global atmospheric studies, is being conducted by dispersed collectives of scientists in a distributed fashion (Banerjea et al. 1994; Macedonia and Brutzman 1994). Scientists who were once isolated in remote settings are now given the opportunity to remain “connected”, albeit electronically. With steady improvements to virtual collaborative spaces, the importance of colocation is decreasing and “roving” experts, with affiliations across several institutions, are becoming more common. As a result, electronic communities or consortia are giving new meaning to the notion of the Collaboratory, an early vision of electronic telecollaboration to facilitate scientific discovery (Lederberg and Uncapher 1988).

3.3 Teleinstruction Teleinstruction via networks (e.g., Silicon Valley’s BAGnet, NSF’s Supercomputing Consortium, the European Mice Project, the electronically-based National University in the USA) is rapidly changing the way we view education. Because educational teleconferencing runs the gamut from formal seminar-style presentations to interactive student discussions, there is a high demand for flexible floor policies to reflect the spread of social protocols that exist in the classroom. For an institution offering teleservices, there are additional concerns, such as tracking remote student registration and payment, not to mention student authentication at exam time!

3.4 Cyber art Artists are also capitalizing on the opportunity of telecollaboration and teleperformance. Distributed performance (Escobar et al. 1994; Fields-Meyer 1994), networked film making (Ramirez 1993) and synchronous CD mastering (Anderson et al. 1994) are fast becoming realities. Organizations such as the International Interactive Communication Society (IICS), regularly host teleperformances between the Electronic Café, located in Los Angeles, and other Electronic Cafés around the world, allowing musicians, dancers, and artists to perform together synchronously over the network (Fields-Meyer 1994). A frequently-cited caveat is the insurmountable time zone differences; it is challenging to schedule collaborations among communities that span the globe.

3.5 Virtual collaborative spaces Although many systems have been designed to simulate the existence of real meeting places, and thus to evoke a sense of familiarity, designers are turning their efforts more and more to convening people in virtual conferencing environments (MacIntyre and Feiner 1994). Donath (1994) and Morgan et al. (1994) experiment with visual spaces that allow configurable meeting spaces and that capture gesturing and interaction patterns through real-time data interpolation. For example, in Donath (1994) individual bitmap images of conference participants are arranged around a local representation of the virtual meeting space, and as a conferee talks, the other conferees appear to be looking in his or her direction. Experimentation with the audio equivalent for this scheme is also underway (personal communication – B Smith, 1994); audio is processed to sound as though it is coming from different spatial locations depending on who is speaking and where they are seated in the local depiction of the virtual space.

3.6 Distributed simulation A related effort is the creation of the distributed simulation Internet (DSInet) that provides online wargaming exercises for the American government (DSI Newsletter 1994). What normally entails physically transplanting hundreds of troops and equipment (fighter jets, tanks, aircraft carriers) is now simulated on the computer via remote collaboration software. Exercises conducted in November 1993 relied on the successful cooperation of several hundred individuals/entities across the DSInet testbed, and the aim is to scale eventually to 100 000. The simulation may be thought of as a shared workspace that is accessible to many individuals and that they can modify simultaneously. The notification strategy in this realm is somewhat different from a distributed shared workspace in that modifications to the workspace are only of importance to individuals who are within “range” to see and hear them. Similar issues will eventually need to be addressed by commercial multiuser games, such as Doom from Id Software.
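The range-based notification strategy described for distributed simulation can be sketched as a simple filter over the entities in the shared workspace. This is a minimal illustration under our own assumptions (a flat 2D world, a per-entity `sensing_range`, Euclidean distance); it is not taken from DSInet, and all names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Entity:
    name: str
    x: float
    y: float
    sensing_range: float  # hypothetical: how far this entity can "see and hear"


def recipients(event_x: float, event_y: float, entities: List[Entity]) -> List[str]:
    """Names of entities close enough to an event to need notification."""
    return [
        e.name
        for e in entities
        if math.hypot(e.x - event_x, e.y - event_y) <= e.sensing_range
    ]


world = [
    Entity("tank", 0.0, 0.0, 10.0),
    Entity("jet", 100.0, 0.0, 50.0),
    Entity("ship", 3.0, 4.0, 5.0),
]
print(recipients(0.0, 0.0, world))  # only entities within range are notified
```

In contrast to a conventional shared workspace, where every update goes to every participant, only the entities returned by the filter would receive the modification.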

4 Architectural considerations

As shown by the diversity in application domains, collaborative computing has made enormous strides in recent years. Yet, the infrastructure to support collaboration is far from complete. Widespread conferencing relies on interoperable solutions. There are at least two thrusts behind the search for interoperability. First, there is the promise that common building blocks will simplify the development process for groupware through reusable components. Second, as collaborations span more dispersed and larger communities, the likelihood of heterogeneity increases; shared abstractions and standard interfaces are needed to accommodate heterogeneity better. At what level do we devise mechanisms to supply interoperation? In this section, we present an overview of several outstanding issues that are integral to constructing an architectural foundation for interoperable collaborative computing.

4.1 Communication underpinnings

Multiple media associated with a particular conference session may have varied network transmission requirements. As such, a variety of underlying communication services are needed to carry conference data to other members. And, for interoperability, these protocols require standardization. Several protocol suites have been developed that cater to the varied needs of remote conferencing (Ferrari et al. 1992; Schooler 1993b; Wolf and Herrtwich 1994). Several standard-setting organizations such as the Internet Engineering Task Force (IETF) and the Consultative Committee on International Telephony and Telegraphy (CCITT)/International Telecommunications Union (ITU) as well as a number of consortia (Snell 1994) also are concentrating on this area. What differentiates these communities are their assumptions about operating environments. The IETF has traditionally concentrated on teleconferencing solutions for the computer packet-switched realm, whereas the ITU has evolved from a more circuit-switched perspective. Nonetheless, the shared focus has been on extensions to support real-time and group modes of communication.

4.1.1 Interactivity Because collaboration is a human-to-human endeavor, there is a concern about minimizing communication delays. Delays seem to come in two flavors: delays for collaborative tools to propagate updates to all sites, and end-to-end delays of real-time media. Minimizing interaction delay is at the root of the centralized versus replicated debate for computer conferencing architectures. Delays may be noticed during information updates, as well as during floor changes. End-to-end delay of real-time audio has the potential to affect normal conversation, as seen in early satellite implementations. Consequently, the development of real-time transport protocols for time-sensitive data is equally critical (Schulzrinne et al. 1995).

4.1.2 Scaling and efficient distribution

As teleconferences scale up in numbers of users, multicast distribution (Deering 1988) becomes essential for bandwidth reduction, considering that there is an N × N bandwidth explosion for media such as video that normally transmit continuously. Management of these group addresses also becomes more difficult. One complication is that there is a fixed number of multicast addresses. Because most telecollaborations will be transient, address assignment and reassignment will be highly dynamic. A global scheme is required to avoid unwanted address collisions and to promote reasonable address space sharing (Braudes and Zabele 1993; Deering et al. 1994; Pejhan et al. 1994; Schulzrinne 1992b), by partitioning the address space either randomly or among a hierarchy of multicast address servers. To offload dynamic addressing mechanisms, we can make use of fixed multicast addresses for static conferences, such as regularly held conferences or task force meetings, and use unicast addressing in point-to-point calls.

4.1.3 Quality of service (QOS)

Even though teleconferencing is presently possible on lightly loaded networks, conferencing in the large requires network resource-management mechanisms to avoid congestion (Clark et al. 1992; Topolcic 1987; Wolf and Herrtwich 1994; Zhang et al. 1993). Those mechanisms will have to scale to track many connections or flows at once, and perhaps use some form of data aggregation. Although resource management pertains to the usage of the network bandwidth, there are several layers of abstraction needed to convey QOS information from users to the network. Specifically, conference-operating parameters from the user interface will need to be collected and delivered to these lower-level mechanisms for translation into flow specifications (Partridge 1992).
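As a rough illustration of how conference-level parameters might be turned into a bandwidth figure that a resource-management mechanism could act on, the sketch below compares a full unicast mesh with multicast distribution for an N-way conference. It is a deliberately simplified count that ignores network topology and shared links; the function names and the `max_streams` parameter (a receiver choosing to take only some of the streams) are our own assumptions, not part of any cited protocol.

```python
from typing import Optional


def unicast_stream_copies(n: int) -> int:
    """Stream copies injected when each of n sites sends separately to every other site."""
    return n * (n - 1)


def multicast_streams(n: int) -> int:
    """Streams injected when each of n sites sends once to a group address."""
    return n


def receiver_load_kbps(n: int, per_stream_kbps: int,
                       max_streams: Optional[int] = None) -> int:
    """Incoming rate at one receiver, optionally capped at max_streams streams."""
    incoming = n - 1
    if max_streams is not None:
        incoming = min(incoming, max_streams)
    return incoming * per_stream_kbps


n = 10
print(unicast_stream_copies(n), multicast_streams(n))
print(receiver_load_kbps(n, 128), receiver_load_kbps(n, 128, max_streams=4))
```

Multicast removes the quadratic growth at the senders, but the receiver-side load still grows with the number of active senders, which is one reason resource management remains necessary.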

4.2 Models for widespread collaboration Increasingly, there have been efforts to develop abstractions to model synchronous conferencing systems (Chang and Whaley 1992; Chen et al. 1992; Garcia-Luna-Aceves 1988; Rangan and Vin 1991; Roseman and Greenberg 1992; Schooler 1993a; Vin et al. 1991). These models serve a variety of purposes. At the most basic level, they introduce a common taxonomy (Rangan and Vin 1991; Szyperski and Ventre 1993), but they also aim at compartmentalizing system functionality (Bentley et al. 1994; Schooler 1993b; Vin et al. 1991), at identifying information flow (Craighill et al. 1993), and at specifying component interfaces (Arango et al. 1992; Interactive Multimedia Association 1993; Leung et al. 1990). The difficulty in modeling several disparate media types (shared workspace, audio and video) under one simplifying scheme is that each medium has different requirements and different usage patterns. Nonetheless, there appears to be a growing consensus in terms of architectural modularity. In particular, many groupware control issues are similar across all media, whether or not the media are computer based. This is evidenced by the enormous spread of research from ISDN-based conferencing solutions (Clark 1992; Hoshi et al. 1992) to packet-based equivalents (Altenhofen et al. 1993; Casner et al. 1990; Macedonia and Brutzman 1994). Thus, the idea of a separable session control component has appeared both in audio/video based conferencing systems (Ahuja and Ensor 1992; Arango et al. 1992; Handley et al. 1995; Schooler 1993a; Schulzrinne 1995; Vin et al. 1991) and in groupware-driven developments (Crowley et al. 1990; Roseman and Greenberg 1992). As shown in Fig. 3, the idea of a session manager is at the core of the architecture, which separates media control from media transport. By creating a reusable session manager, which is separate from the user interface, conference-oriented tools avoid duplication of effort. Session control encompasses the management of participation, authentication, and presentation of coordinated user interfaces. Yet, the session manager is also separate from underlying media agents, which are responsible for decisions specific to each type of shared media. This modularity promotes the development of replaceable agents to cater to diverse hardware capabilities and user preferences. Finally, the session manager provides a conduit for control. Locally, it facilitates intercommunication among media agents such as exchanges that pertain to inter-related QOS, floor control, or synchronization. Remotely, it acts to facilitate intersite communication among peer session managers.

Fig. 3. Session control architecture

4.3 Collaboration policies

Even if all media are unified under one control scheme, not all sessions have the exact same control needs. For instance, the control flow for a design group meeting might be highly unstructured with no particular chairperson, whereas a seminar-type conference might require a professor at the helm who decides the order in which participants speak. A first step to accommodate session diversity has been to try to identify the range of collaboration styles that exist (Szyperski and Ventre 1993). In part, this is accomplished by cataloguing what we observe socially and by recognizing the various roles played by participants in group interactions. However a closer examination of the problem has revealed that sessions are characterized not only as a collection of participants and the media being used among them, but also by a collection of policies that govern their interactions (Arango et al. 1992; MMusic 1993; Roseman and Greenberg 1992). These policies impact everything from who may join a session, when and how a session may be modified, to who may learn of information pertaining to the session. To experiment further with policy trade-offs, Roseman and Greenberg (1994) and Jacobson et al. (1993) have proposed policy modules that are dynamically bound to a session at run time, when policy choices are selected. Trial implementations have focused on floor control policies. The IETF working group on multiparty multimedia session control (MMusic) has approached session policies through protocol design. The focus is on the specification of a distributed session control protocol, which is built on a common message substrate for multiparty agreement (Shenker et al. 1994) and a flexible language to describe policy and policy combinations (Handley and Wilbur 1992; Handley et al. 1995; Roseman and Greenberg 1992).
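The idea of policy modules bound to a session at run time can be illustrated with a small floor-control sketch. It is a hypothetical design of our own, not the mechanism of any of the cited systems: a policy is modeled as a plain function, and the session simply holds a reference to whichever policy is currently in force.

```python
from typing import Callable, Optional

# A floor policy decides whether `requester` may take the floor.
# Arguments: requester, the participant who approved the request (if any), the chair (if any).
FloorPolicy = Callable[[str, Optional[str], Optional[str]], bool]


def open_floor(requester: str, approved_by: Optional[str], chair: Optional[str]) -> bool:
    """Unstructured meeting: anyone may take the floor."""
    return True


def chaired_floor(requester: str, approved_by: Optional[str], chair: Optional[str]) -> bool:
    """Seminar style: only the chair, or someone the chair approves, gets the floor."""
    return chair is not None and (requester == chair or approved_by == chair)


class Session:
    def __init__(self, policy: FloorPolicy, chair: Optional[str] = None) -> None:
        self.policy = policy      # may be replaced while the session is running
        self.chair = chair
        self.holder: Optional[str] = None

    def request_floor(self, requester: str, approved_by: Optional[str] = None) -> bool:
        if self.policy(requester, approved_by, self.chair):
            self.holder = requester
            return True
        return False


session = Session(open_floor)
session.request_floor("alice")                # granted under the open policy
session.policy, session.chair = chaired_floor, "prof"
print(session.request_floor("bob"))           # refused: no approval from the chair
print(session.request_floor("bob", approved_by="prof"))
```

Because the policy is an ordinary value held by the session, a design meeting and a seminar can share the same session machinery and differ only in the module that is bound.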


4.4 Lightweight sessions

Not surprisingly, there is ongoing discussion about the tradeoffs of distributed versus centralized control models, which very much resemble the issues in shared workspace architectures. However, in shared workspaces, the debate surrounds data distribution. Here, the arguments are centered around the flow of control information. With the movement of collaborative systems out into the general Internet, the conveniences of a centralized session manager (Arango et al. 1992; Craighill et al. 1993; Handley and Wilbur 1992; Vin et al. 1991) are outweighed by the improved response time and resiliency of distributed session control (Chang and Whaley 1992; MMusic 1994; Schooler 1993a). Yet, even within distributed control architectures, there is debate over the management of session state. At some level, conferences are a collection of shared state (who is participating in the session, what is the session name, when did the session start, what are the policies associated with the session). The degree to which state is private (only known locally) or shared (among multiple entities), and the degree to which it must be identical across participating sites dictates the scheme chosen to disseminate state information. The fundamental trade-off is between using reliable messaging to disseminate shared session information for immediate synchrony and using unreliable messaging with periodic refreshes for eventual consistency. Within a WAN context, the former is harder to guarantee. The latter approach, which has come to be known as lightweight sessions, may also be advantageous in sessions with a high degree of dynamics (e.g., many membership changes) (Jacobson et al. 1993). However, for static sessions where it would be redundant to send state refreshes or sessions requiring a tighter control loop, lightweight sessions are less appropriate. The success of lightweight sessions can be seen in the Multicast Backbone (MBone), a multicast-capable segment of the Internet. A principle behind several of the most popular tools (Fredericks 1994; Jacobson and McCanne 1992; McCanne and Jacobson 1994; Schulzrinne 1992a; Turletti 1993) is that multicasting is used to disseminate local information (the local user’s address and alias) to other participants tuned in to the same multicast address. Each site distributes its own participation status to other conferees, but there is no global notion of the group membership (who are the recipients of this information), and thus no guarantees that all users will have the same view of the state space (the participant list). This approach of loose control, in which there is little to no shared state and no strict dissemination requirements, has worked quite well for large sessions with little need for coordinated control. In addition, loose-control sessions are easy to implement because there is no one locale responsible for coordinating session state – each site is responsible for multicasting its own status. An added benefit is the inherent fault tolerance. If the network partitions midconference, but eventually is repaired, it is easier to re-establish state, since there are no strict consistency requirements. Finally, the scaling properties of loose-control sessions are quite good, though at some point the refresh periodicity needs to be adaptive to the size and scope of the session; otherwise, the session may be in danger of flooding itself with session reports.

4.5 Distributed messaging

There are scenarios in which conferees do want assurances that their views of the session are virtually identical to each other, and really do want to exert more control over dissemination and reception of all conference-related activity. For example, it may be important to know the actual membership of a conference, both to decide whether to join in the first place and to know if it is appropriate to discuss certain matters based on who is part of the discussion. This requires stricter multiway distribution mechanisms to maintain global synchrony of shared state. This is exactly the problem that shared workspace applications face. Although there have been mechanisms similar to the remote procedure call to accommodate distributed control among objects, one of their premises is that the relationship between the objects is of a client-server nature (Srinivasan 1994). This is not necessarily desirable in multimedia collaborative environments, in which individual system components are frequently peers and have equal access to and control over the session. The very impressive collection of distribution protocols provided by the ISIS toolkit has the same drawback, although ISIS has gone to great lengths to analyze and provide a range of group-oriented messaging for atomic and causal services (Birman 1993). As a result, there are several efforts to support more suitable (scalable, decentralized) multiway protocols at both the transport and the application levels. A key aspect is the ability to provide reliable group-oriented communication through the use of negative acknowledgments (Dabous and Kiss 1993; Floyd et al. 1995; Freier and Marzullo 1990; Whetten and Kaplan 1994). Yet, depending on how session policies have been established, it is very possible that not all conference-related functions require the same degree of reliable delivery. On the one hand, periodic update messages may not require reliable delivery at all. On the other hand, for a floor change request, it may be critical to know that the update was received by all participants, since, if not, multiple video channels may result and may overextend the capacity of the network. Handley et al. (1995) have even noted that different conferees may care differently about the delivery of the same message; a low-bandwidth link cares more about the news of a floor change than a high bandwidth link that can sustain multiple streams until the floor gavel is passed correctly.

4.6 Heterogeneity Standard protocols and modular plug-and-play architectures are not complete without self-describing media agents (MMusic 1994; Nicolaou 1990; Schooler 1993a). The implication is that a descriptive language is needed to characterize groupware capabilities and requirements. In turn, these descriptions must be exported (e.g., catalogued in a configuration resource directory) to allow selection among them. Furthermore, multiway interactions may require pairwise translations among conference participants, such as those required to bridge different media encoding schemes. Combination nodes (Lukacs 1994; Pasquale et al. 1992; Schooler 1993a) have been proposed as a general solution to work in conjunction with participant sources and sinks. They would act to combine media streams as they head toward the receivers. These include software or hardware modules that embed functions for: mixing, as with audio streams; compositing, or assembling the interesting pieces of several video flows into a single flow; selection, by a sender (chairperson) or receiver (individually tailored); translation, between encodings; reduction, when scalable coding is used; and combinations of these operations along the path from sender to receiver. These functions may reside at users’ end systems, or in the network. In the latter case, they may be operated as a community resource (Lukacs 1994). Not only are combination nodes applicable to the translation problem for heterogeneity, they may also be used to avoid wasting network bandwidth by deferring reduction decisions until data arrive at the receiver. For example, to circumvent bandwidth limitations that would otherwise prohibit or restrict conference participation, a mixer would be located upstream from a slow link, then used to combine several streams into one.
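The mixing function of a combination node can be sketched for audio as a sample-by-sample sum of the incoming streams. The sketch assumes 16-bit linear PCM samples that are already time-aligned and share a sample rate; these are our own simplifying assumptions, and a real mixer would also have to handle differing encodings, jitter, and level control.

```python
from typing import List, Sequence

INT16_MIN, INT16_MAX = -32768, 32767


def mix(streams: Sequence[Sequence[int]]) -> List[int]:
    """Combine several PCM streams into one by summing and clipping each sample."""
    if not streams:
        return []
    length = max(len(s) for s in streams)
    mixed = []
    for i in range(length):
        # Streams that have run out of samples contribute silence.
        total = sum(s[i] for s in streams if i < len(s))
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed


print(mix([[1000, 2000], [3000, -500, 7]]))
```

Placed upstream of a slow link, such a node forwards one stream at a single stream's rate regardless of how many participants are speaking, which is the bandwidth saving described above.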

4.7 Synchronization As multiple media are brought together under one control framework, a range of synchronization schemes becomes possible. Although schemes exist that achieve synchronization by bundling all media together in a single flow, synchronization also has been achieved with separate flows, via timestamping and adaptive synchronization mechanisms (Escobar et al. 1994; Little and Kao 1992). Although these approaches may not be particular to conferencing, they are further stressed by the multiway nature of collaboration. A motivating factor behind same-stream transport for the different media is interstream synchronization. Typical examples of interstream synchronization that might occur at each user site include lip synch (synchrony between audio and video streams) and the correlation between audio and workspace activity (e.g., tele-pointer). These are trivial to achieve with a shared transport scheme. However, in certain circumstances, interstream synchronization may be more detrimental than it is helpful. If it takes a few hundred milliseconds to encode and compress video data before they are ready for transport across the network (an estimate based on commercially available video codecs), then audio data would also be delayed by as much time, resulting in undesirably large delays for interactivity. Thus, decoupling the media may work better in this case. In addition, if the video resolution is too low to be able to detect lip synchrony, it is certainly not worth delaying audio delivery. It may also be important to take advantage of inherent differences of the media. From a bandwidth and QOS perspective, a conferencing system may opt to route media across different segments of the network, or across different networks altogether. From a user preference perspective, each user may opt to receive different media streams, which may necessitate unbundling media to provide the needed flexibility.
From a heterogeneity perspective, not all of the media generated might be deliverable to all sites. If the media travel over different networks, each may experience different delays along the route between source and destination. Therefore, resynchronization at the endpoints may be necessary. Because collaboration may occur among multiple users at multiple sites, synchronization in the form of intersite coordination may also be required. For instance, a conferencing system or distributed simulation may require that all workspace events are delivered at the same time to all participating sites, so that some degree of equity is achieved in terms of information sharing. Because the network is a dynamic entity, both intersite and intermedia delays may fluctuate throughout the lifetime of a session. Thus, Escobar et al. (1994) have created an algorithm that adapts to the network dynamically, while continuing to achieve synchronization among media and sites. This technique was used to achieve synchronization for a teleperformance of a distributed Haydn trio across the Internet.
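Resynchronization at an endpoint can be illustrated by equalizing delays: each stream is buffered just long enough that all streams are played out with the same total delay. The sketch below is our own minimal example and is not the adaptive algorithm of Escobar et al. (1994); it assumes that the per-stream delays are already known (for instance, from timestamps and synchronized clocks), and the names are hypothetical.

```python
from typing import Dict


def playout_offsets(delays_ms: Dict[str, int]) -> Dict[str, int]:
    """Extra buffering per stream so that every stream matches the slowest one."""
    target = max(delays_ms.values())
    return {name: target - delay for name, delay in delays_ms.items()}


def playout_time(capture_ts_ms: int, target_delay_ms: int) -> int:
    """Time at which a media unit captured at capture_ts_ms should be presented."""
    return capture_ts_ms + target_delay_ms


measured = {"audio": 40, "video": 300}   # illustrative one-way delays in ms
print(playout_offsets(measured))
print(playout_time(1_000, max(measured.values())))
```

With these illustrative figures the audio must be held back by 260 ms to stay in step with the video, which makes concrete why strict interstream synchronization can harm interactivity and why an adaptive scheme has to re-estimate the delays as the network changes.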

4.8 Floor control revisited As collaborative sessions grow in size, there is more of a need for floor control – at least in those situations where high user participation is expected. This is true in the nonelectronic realm as well, and usually results in hand raising or resorting to chaired sessions with rules of order. A unified architecture has the ability to allow flexible floor control policies across the various media associated with a collaborative session. Certain scenarios call for an integrated floor-control approach, whereas others demand separate treatment of the individual media. Floor control in shared workspaces is as often used to maintain data consistency as it is to institute social protocol. With real-time audio, there is no notion of data consistency; instead floor control is typically used in more formal settings to promote turn taking (e.g., the distributed classroom). However, for more life-like audio interactions, minimal floor control may be optimal; all participants should be able to speak at any time, with audio from simultaneous speakers being mixed together as they would be if colocated. However, audio is a self-moderating channel because realistically only a small number of people can speak at once. Also, in a large conference, most users probably will be silent at any given time. This is not necessarily the case with video. As a result, floor control for real-time video is frequently used to control bandwidth usage. Consider systems with no floor control. All sites send video to all other sites. In N -way conferencing, this means the receiver is faced with a bandwidth N times that of the sender. While multicasting reduces bandwidth usage by senders, mechanisms are also needed for reductions at receivers. A receiver may only want to process M of the N streams that are sent to it.
Thus, at the end systems, each user decides individually which of the video streams is most important to display, either as video-in-a-window (Fredericks 1994), or captured via external codecs that support customized views (Casner et al. 1990; Elliott 1993; Lukacs 1994). With floor control, a session-wide floor holder typically is selected by a chairperson. For those sites that do not hold the floor, control messages propagate back to senders and avoid wasting bandwidth by sending video that subsequently is not displayed at the receivers. This approach accommodates low-bandwidth links that do not have the resources to handle multiple streams at once. However, more lenient floor policies have also been instituted, such as allowing the last N floor holders to continue to send video (Schooler 1993b). Even if there is enough bandwidth to support N -way conferencing, there may also be problems decoding and/or presenting all the information at once. Upper bounds in the numbers of streams that may be viewed simultaneously may be hardware related, in the case of external codecs; they may be CPU related, in the number of video streams that can be decoded and displayed to the frame buffer at the same time; or may simply be bounded by the amount of screen real estate that a local user wants to invest in viewing remote users. More interestingly, there is a need to couple the floor controls of different media with each other. For instance, policies of video-to-follow-audio or video-to-follow-groupware tie the selection of the video floor holder to activities that are more pertinent to the group activity.

4.9 Group rendezvous How does one initiate a rendezvous with users or groups of users electronically? Both synchronous and asynchronous rendezvous techniques have been proposed and are presently in use across the network.

4.9.1 Directories and explicit invitations Within the MBone, the session directory tool, sd, is used as a TV guide to announce open conferences that users may join (Jacobson and McCanne 1993a). sd resides at a known address and port on each user’s workstation, listening at that address for announcements, and posting its own sessions there as well (Handley and Jacobson 1995). These session announcements are distributed via periodic multicasting. The tool relies on the time-to-live (ttl) feature of multicast to control the scope of the announcements; the ttl is an upper bound for the number of hops the message is allowed to traverse before it is no longer forwarded. Although this is sufficient for now, in the future, as more sessions are announced in this fashion, other schemes with more granularity for scoping may be needed. In contrast, the session orchestration tool, mmcc, aims at supporting rendezvous via explicit invitation (Schooler 1993b). These collaborations may be point-to-point or multipoint in nature. mmcc is also intended to reside at a well-known address and port on users’ machines. Although this seems straightforward, there are problems with bootstrapping, since “calling” others requires knowing where users reside. Schemes to register users include creating a registry for end user addresses and aliases (Arango et al. 1992), based on the X.500 or DNS infrastructure (Mockapetris 1989; Weider and Reynolds 1992), and creating a multicast solution much like the sd tool, which would dynamically track users as they “tune in” to a well known registry channel (MMusic 1994).
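A directory that learns of sessions through periodic announcements can be sketched as a cache whose entries expire when they are no longer refreshed. The sketch below is our own illustration and makes no claim about how sd is actually implemented: the class name, the timeout rule, and the use of explicit time arguments (rather than a real clock or multicast socket) are all assumptions chosen to keep the example self-contained.

```python
from typing import Dict, List


class AnnouncementCache:
    """Remembers announced sessions until they stop being re-announced."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s            # hypothetical expiry interval
        self._last_heard: Dict[str, float] = {}

    def heard(self, session: str, now: float) -> None:
        """Record an announcement for `session` received at time `now`."""
        self._last_heard[session] = now

    def active(self, now: float) -> List[str]:
        """Drop sessions not refreshed within the timeout and list the rest."""
        self._last_heard = {
            s: t for s, t in self._last_heard.items() if now - t <= self.timeout_s
        }
        return sorted(self._last_heard)


cache = AnnouncementCache(timeout_s=30)
cache.heard("seminar", now=0)
cache.heard("staff-meeting", now=20)
print(cache.active(now=25))   # both announcements are still fresh
print(cache.active(now=40))   # "seminar" has not been refreshed in time
```

No explicit teardown message is needed in such a scheme: a session that stops announcing itself, or whose announcements no longer reach a listener, simply ages out of that listener's directory.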

Another complication is that, as a security measure, sites typically have gone to great lengths to hide information about host names. While e-mail addresses are readily available, they usually place users behind a domain name, e.g., [email protected], which gives no notion of where users reside physically. In other words, information is typically not included regarding the particular host machine(s) with which a user is affiliated. While it may be beneficial to promote a naming scheme that hides the details of user locale, it may add a level of complexity not only to the control model for user rendezvous, but also to the routing and delivery of audio and/or video to the actual end systems.

4.9.2 E-mail and WWW

Prototype e-mail rendezvous mechanisms also exist (Borenstein 1992; V. Kumar, personal communication, 1994). Even though e-mail is asynchronous in nature, Borenstein suggests using e-mail as a platform from which synchronous applications can be launched among groups of individuals. Sufficient information is contained in the message body for group session establishment. The beauty of this scheme is that it builds on the already existing e-mail infrastructure both to disseminate information and to address end users. E-mail also combats the user buy-in problem (Borenstein 1992), which is basically the technical equivalent of the chicken-and-egg problem: to create a community of users means enticing them to use the software regularly, yet to entice users to use the software means creating a community of users with whom others can rendezvous. E-mail already has a large user following. In a similar vein, the WWW infrastructure is also beginning to be used to support synchronous rendezvous (MMusic 1994).
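The e-mail rendezvous idea described above can be sketched with Python's standard `email` package. This is not Borenstein's mechanism: the `key: value` body layout, the field names, and the addresses are hypothetical, chosen only to show a message body that carries enough information to establish a group session.

```python
from email.message import EmailMessage
from typing import Dict, List


def build_invitation(sender: str, invitees: List[str], session: Dict[str, str]) -> EmailMessage:
    """Compose an invitation whose body lists the session parameters."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = ", ".join(invitees)
    message["Subject"] = "Conference invitation: " + session["name"]
    # One "key: value" line per parameter (an assumed, illustrative layout).
    message.set_content("\n".join(f"{key}: {value}" for key, value in session.items()))
    return message


def parse_invitation(message: EmailMessage) -> Dict[str, str]:
    """Recover the session parameters from an invitation body."""
    fields: Dict[str, str] = {}
    for line in message.get_content().splitlines():
        key, separator, value = line.partition(": ")
        if separator:
            fields[key] = value
    return fields
```

A receiving mail agent could hand the parsed fields to a conferencing tool, which is the sense in which an asynchronous medium launches a synchronous session.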
Researchers at the Naval Research Laboratory are using the WWW to capture and dynamically update public session announcements for ongoing or upcoming open teleconferences, while work under way at Stanford Research Institute is designed to provide synchronous rendezvous with authors over documents that appear as hypertext pages on the Web.

5 Conclusions

Teleconferencing is hardly a novel concept. The idea of video conferencing debuted in the 1920s (Bell Systems Research Labs 1971), AT&T introduced its PicturePhone at the World’s Fair in the 1960s, and marketing forecasts of the 1970s, 1980s and 1990s have continued to promise a teleconferencing revolution (Snell 1994). It would seem that videoconferencing perpetually has been touted as “a revolutionary concept on the brink of success” (Egido 1988). Yet, conferencing and collaborative computing have often fallen short of expectations as effective means of communication. Grudin (1990) attributes this to the technologically driven nature of the pursuit and paraphrases a colleague who sees this shortcoming as “technology searching for a need”. Egido’s (1988) articulate discussion of its failures points to factors lying beyond the scope of technology, such as psychological and sociological ones, and argues that the casting of electronic communication in the image of face-to-face meetings has stood in the way of developing multimedia conferencing technology to its fullest potential. Others have lobbied for systems more attuned to group processes, taking the stance that system builders must consider the tools and technology already in place, as well as individual preferences.

Clearly, the mission of conferencing and collaborative computing is not only to bring individuals together in space and time, but also to make groups more effective at their work. This requires an awareness of the interplay between technology and group productivity. Successful conferencing projects have uncovered seemingly unimportant human factors that flew in the face of technology: small modifications of either psychological or sociological import that created a good match between the capabilities of the systems and the tolerance, expectations, and needs of their user communities (Elliott 1993; Goodman and Abel 1986; Macedonia and Brutzman 1994; Mantei 1988; Root 1988).

Recurring themes have also emerged. Simple considerations, such as accommodating a variety of conferencing scenarios, guarding against cognitive overload, and catering to a sense of familiarity, have repeatedly been cited as guidelines used by system builders. The necessity and quality of real-time media also figure into a system’s effectiveness, as do the simplicity of groupware interfaces and the impact of communication delays. These realizations reaffirm that collaborative computing is not only a multidisciplinary field, but an interdisciplinary one that must create synergies among its varied components (Ellis et al. 1991; Malone and Crowston 1994).

The idea that collaborative technology is an activity in search of a need no longer holds. Increasingly, the world is digitally connected.
As a result, conferencing becomes critical among individuals who spend most of their time in group endeavors, who use computers to do their work, and for whom potential collaboration has been impaired by a lack of geographic proximity. It is especially well suited for the kinds of scientific collaborations envisioned for the collaboratory, a virtual scientific laboratory without walls. The growing acceptance of multimedia conferencing also reflects a change in the way conferencing is promoted; it is a supplement to, not a replica of, face-to-face collaboration.

A testament to the staying power of the concept of conferencing is the emergence of commercial products that have actually made inroads in the market, such as Silicon Graphics’ InPerson, ShowMe from Sun Microsystems, and Intel’s ProShare, as well as publicly available tools that are in widespread use on the Internet and MBone (Fredericks 1994; Jacobson and McCanne 1992, 1993a,b; McCanne and Jacobson 1994; Schooler 1993b; Schulzrinne 1992a; Turletti and Huitima 1992). In particular, these developments are a result of the emergence of standards for interoperability. Windowing systems, such as the X Window System (Scheifler et al. 1988), configurable graphical user interfaces, like the Tk/Tcl toolkit (Ousterhout 1990, 1991), and widely available network application programming interfaces (APIs), such as UNIX sockets (Sun Microsystems 1988), all have given rise to more generalized software platforms on top of which interoperable systems can be built. Additionally, the increased speed of processors, disks, and networks has contributed to the more rapid adoption of distributed solutions and the move to include various real-time media as standard data types.

Nonetheless, there are still many aspects of computer architectures that make it difficult to achieve widespread availability of these services, from the need for communication standards to the need for seamless techniques to accommodate multiuser GUIs. There is the continued challenge to support not only small groups or moderate sized organizations (Grudin 1994), which have traditionally been the focus of most groupware systems, but also much larger scale telecollaborations (Schooler 1993a). The expectation is that integrated solutions, combining audio, video, and shared workspaces, will eventually make it as easy to rendezvous electronically as it is physically. More importantly, telecollaborations must support not one, but many real-world interaction protocols. Finally, an integral part of the coming-of-age process will be the continued attention to issues beyond the scope of technology itself.

Acknowledgement. This research was sponsored in part by the National Science Foundation (NSF) Center for Research in Parallel Computation, the Advanced Research Projects Agency (ARPA) under Fort Huachuca contract number DABT63-91-C-0001, the Air Force Office of Scientific Research (AFOSR) grant AFOSR-91-0070, and a grant from the American Association of University Women (AAUW) Educational Foundation. The views and conclusions in this document are those of the author and should not be interpreted as representing the official policies, either expressed or implied, of NSF, ARPA, AFOSR, the AAUW, or the American government. This paper originally appeared in the Proceedings of the Dagstuhl International Workshop on Fundamentals and Perspectives on Multimedia Systems, pp. 175–208, Dagstuhl, Germany, July 1994.

References

1. Ahuja SR, Ensor JR (1992) Coordination and control of multimedia conferencing. IEEE Commun 20: 38–43
2. Ahuja SR, Ensor JR, Horn DN (1988) The rapport multimedia conferencing systems. Proceedings of the ACM Conference on Office Information Systems, Palo Alto, Calif., ACM Press: 1–8
3. Altenhofen M, Dittrich J, Hammerschmidt R, Kappner T, Kruschel C, Kuckes A, Steinig T (1993) The BERKOM multimedia collaboration service. Proceedings of the 1st ACM Conference on Multimedia, Los Angeles, Calif., ACM Press: 457–463
4. Anderson D, Doris R, Moorer J (1994) A distributed computer system for professional audio. Proceedings of the 2nd ACM Conference on Multimedia, San Francisco, Calif., ACM Press: 373–379
5. Anupam V, Bajaj CL (1993) Collaborative multimedia scientific design in SHASTRA. Proceedings of the 1st ACM Conference on Multimedia, Los Angeles, Calif., ACM Press: 447–456
6. Arango M, Bates P, Fish R, Gopal G, Griffeth N, Herman G, Hickey T, Leland W, Lowery C, Mak V, Patterson J, Ruston L, Segal M, Vecchi M, Weinrib A, Wuu S (1992) Touring machine: a software platform for distributed multimedia applications. Proceedings of the IFIP International Conference on Upper Layer Protocols, Architectures and Applications, Vancouver, Canada. Elsevier Science Publishers, Amsterdam: 3–15
7. Banerjea A, Knightly E, Templin F, Zhang H (1994) Experiments with the Tenet real-time protocol suite on the Sequoia 2000 wide area network. Proceedings of the 2nd ACM Conference on Multimedia, San Francisco, Calif., ACM Press: 183–191
8. Beard D, Palaniappan M, Humm A, Banks D, Nair A, Shan Y (1990) A visual calendar for scheduling group meetings. Proceedings of the ACM Conference on CSCW, Los Angeles, Calif., ACM Press: 279–290
9. Bell Systems Research Labs (1971) The Picturephone System. Bell Syst Tech J
10. Bentley R, Rodden T, Sawyer P, Sommerville I (1994) Architectural support for cooperative multiuser interfaces. IEEE Computer 27: 37–46
11. Birman KP (1993) The process group approach to reliable distributed computing. Commun ACM 36: 36–53
12. Borenstein NS (1992) Computational mail as network infrastructure for computer-supported cooperative work. Proceedings of the ACM Conference on CSCW, Toronto, Canada, ACM Press: 67–73
13. Borenstein NS (1993) MIME: a portable and robust multimedia format for Internet mail. Multimedia Syst 1: 29–36
14. Braudes R, Zabele S (1993) Requirements for multicast protocols. RFC 1458
15. Buchanan MC, Zellweger PT (1993) Automatic temporal layout mechanisms. Proceedings of the 1st ACM Conference on Multimedia, Los Angeles, Calif., ACM Press: 341–350
16. Bush V (1945) As we may think. Atlantic Monthly 176: 101–108
17. Casner S, Deering S (1992) First IETF Internet Audiocast. ACM Comput Commun Review 22: 92–97
18. Casner S, Seo K, Edmond W (1990) N-Way conferencing with packet video. Proceedings of the 3rd International Workshop on Packet Video, Morristown, NJ
19. Cavalier T, Chandhok R, Morris J, Kaufer D, Neuwirth C (1991) A visual design for collaborative work: columns for commenting and annotation. Proceedings of the 24th Annual Hawaii International Conference on Systems Science 3: 729–738
20. Chang Y, Whaley J (1992) Remote conferencing architecture. Proceedings of the 24th IETF, Teleconferencing Architecture BOF. Boston, Mass., Corp. for National Res. Initiatives: 48–56
21. Chen M, Barzilai T, Vin HM (1992) Software architecture of DiCE: a distributed collaboration environment. Proceedings of the 4th IEEE ComSoc International Workshop on Multimedia Communications, Monterey, Calif., pp 172–185
22. Clark WJ (1992) The European MIAS system for ISDN multimedia conferencing. Proceedings of the 4th IEEE ComSoc International Workshop on Multimedia Communications, Monterey, Calif., pp 14–27
23. Clark D, Shenker S, Zhang L (1992) Supporting real-time applications in an integrated services packet network: architecture and mechanism. Proceedings of the ACM Conference SIGCOMM, Baltimore, MD, ACM Press: 14–26
24. Cool C, Fish RS, Kraut RE, Lowery CM (1992) Iterative design of video communication systems. Proceedings of the ACM Conference on CSCW, Toronto, Canada, ACM Press: 25–32
25. Craighill E, Lang R, Skinner K, Fong M (1993) CECED: a system for informal multimedia collaboration. Proceedings of the 1st ACM Conference on Multimedia, Los Angeles, Calif., ACM Press: 437–445
26. Crowley T, Milazzo P, Baker E, Forsdick H, Tomlinson R (1990) MMConf: an infrastructure for building shared multimedia applications. Proceedings of the ACM Conference on CSCW, Los Angeles, Calif., ACM Press: 329–342
27. Dabbous W, Kiss B (1993) A reliable multicast protocol for a white board application. Research Report 2100, INRIA Centre de Sophia Antipolis, France
28. Deering S (1988) Host extensions for IP multicasting. RFC 1054, Stanford University, Stanford, Calif
29. Deering S, Estrin D, Farinacci D, Jacobson V, Liu C-G, Wei L (1994) An architecture for wide-area multicast routing. Proceedings of the ACM Conference SIGCOMM, London, England. ACM Press: 126–135
30. DeSchon A, Braden R (1988) Background file transfer program BFTP. RFC 1068, University of Southern California/Information Sciences Institute, Marina del Rey, Calif.
31. Donath JS (1994) Casual collaboration. Proceedings of the IEEE International Conference on Multimedia Computing and Systems, Boston, Mass, pp 490–496
32. DSI Newsletter, BBN Systems and Technologies, Cambridge, Mass. 1: 2
33. Egido C (1988) Video conferencing as a technology to support group work: a review of its failures. Proceedings of the ACM Conference on CSCW, Portland, Ore., ACM Press: 13–24
34. Ellis CA, Gibbs SJ, Rein GL (1991) Groupware: some issues and experiences. Commun ACM 34: 39–58
35. Elliott C (1993) High-quality multimedia conferencing through a long-haul packet network. Proceedings of the 1st ACM Conference on Multimedia, Los Angeles, Calif., ACM Press: 91–98
36. Engelbart DC (1968) A research center for augmenting human intellect. Proceedings of the FJCC 33: 395–410
37. Escobar J, Deutsch D, Partridge C (1994) Flow synchronization protocol. IEEE/ACM Trans Networking 2: 111–121
38. Ferrari D, Banerjea A, Zhang H (1992) Network support for multimedia: a discussion of the tenet approach. Technical Report TR-92-072, Computer Science Division, University of California, Berkeley, Calif
39. Fields-Meyer T (1994) Artists get a lift onto “Info Superhighway”. Los Angeles Times, p W6, May 1
40. Floyd S, Jacobson V, McCanne S, Zhang L, Liu C-G (1995) A reliable multicast framework for light-weight sessions and application level framing. Proceedings of the ACM Conference SIGCOMM. Cambridge, Mass., ACM Press: 342–356
41. Fredericks R (1994) Experiences with real-time software video compression. Xerox PARC, Palo Alto, California
42. Freier AO, Marzullo K (1990) MTP: an atomic multicast transport protocol. Technical Report No. 90-1141, Computer Science Department, Cornell University
43. Garcia-Luna-Aceves JJ, Craighill EJ, Lang R (1988) An open-systems model for computer-supported collaboration. Proceedings of the 2nd IEEE Conference on Workstations
44. Garfinkel D, Gust P, Lemon M, Lowder S (1989) The sharedX multi-user interface user’s guide, version 2.0. Hewlett-Packard Research Report STL-TM-89-07, Palo Alto, Calif
45. Goodman GO, Abel MJ (1986) Collaboration Research in SCL. Proceedings of the ACM Conference on CSCW, Austin, Tex, ACM Press: 246–251
46. Gibbs SJ (1989) LIZA: an extensible groupware toolkit. Proceedings of the ACM Conference on SIGCHI, Austin, Texas, Addison Wesley: 29–35
47. Grudin J (1990) Why CSCW applications fail: problems in the design and evaluation of organizational interfaces. Proceedings of the ACM Conference on CSCW, Portland, Ore, ACM Press: 85–93
48. Grudin J (1994) Computer-supported cooperative work: history and focus. IEEE Computer 27: 19–26
49. Gust P (1989) Multi-user interfaces for extended group collaboration. Groupware Technology Workshop, IFIP 8.4 WG, Palo Alto, Calif.
50. Handley M, Jacobson V (1995) SDP: session description protocol. IETF MMusic Working Group, Internet Draft
51. Handley MJ, Wilbur S (1992) Multimedia conferencing: from prototype to national pilot. Proceedings of INET ’92, Reston, Va., Internet Society: 483–490
52. Handley MJ, Wakeman I, Crowcroft J (1995) CCCP: conference control channel protocol: a scalable base for building conference control applications. Proceedings of the ACM Conference SIGCOMM. Cambridge, Mass., ACM Press: 275–287
53. Hoshi T, Takahashi Y, Mori K (1992) An integrated multimedia desktop communication and collaboration platform for B-ISDN. Proceedings of the 4th IEEE ComSoc International Workshop on Multimedia Communications, Monterey, Calif., pp 28–38
54. Huang H, Huang J, Wu J (1993) Real-time software-based video coder for multimedia communication systems. Multimedia Syst 1: 110–119
55. Interactive Multimedia Association (1993) Submission and response to the multimedia system services RFT. Network FoG
56. Ishii H (1990) Teamworkstation: towards a seamless shared workspace. Proceedings of the ACM Conference on CSCW, Los Angeles, Calif., ACM Press: 13–26
57. Ishii H, Kobayashi M, Grudin J (1993) Integration of interpersonal space and shared workspace: clearboard design and experiments. ACM Trans Inf Syst 11: 349–375
58. Jacobson V, McCanne S (1992) vat, Video audio tool, UNIX Manual Page
59. Jacobson V, McCanne S (1993a) sd, Session directory tool, UNIX Manual Page
60. Jacobson V, McCanne S (1993b) wb, Whiteboard, UNIX Manual Page
61. Jacobson V, McCanne S, Floyd S (1993c) A conferencing architecture for light-weight sessions. Mice Seminar Series, University College, London
62. Jeffay K, Lin JK, Menges J, Smith FD, Smith JB (1992) Architecture of the artifact-based collaboration system matrix. Proceedings of the ACM Conference on CSCW, Toronto, Canada, ACM Press, pp 195–202
63. Knister MJ, Prakash A (1990) DistEdit: a distributed toolkit for supporting multiple group editors. Proceedings of the ACM Conference on CSCW, Los Angeles, Calif., ACM Press: 343–355
64. Kraut R, Egido C, Galegher J (1988) Patterns of contact and communication in scientific research collaboration. Proceedings of the ACM Conference on CSCW, Portland, Ore, ACM Press: 1–12
65. Krieger D, Burk G, Sclabassi RJ (1991) NeuroNet: a distributed real-time system for monitoring neurophysiologic function in the medical environment. IEEE Computer 24: 45–56
66. Lai KY, Malone TW (1988) Object lens: a spreadsheet for cooperative work. Proceedings of the ACM Conference on CSCW, Portland, Ore, ACM Press: 115–124
67. Lauwers JC, Lantz KA (1990) Collaboration awareness in support of collaboration transparency: requirements for the next generation of shared window systems. Proceedings of the ACM Conference on Computer Human Interaction CHI, Seattle, Washington, Addison Wesley: 303–311
68. Lederberg J, Uncapher K (1988) Towards a national collaboratory. Report of an invitational workshop at Rockefeller University, New York
69. Leung WH, Baumgartner TJ, Hwang YH, Morgan MJ, Tu SC (1990) A software architecture for workstations supporting multimedia conferencing in packet switching networks. IEEE J Selected Areas Commun 8: 380–390
70. Little TDC, Kao F (1992) An intermedia skew control system for multimedia data presentation. Proceedings of the 3rd International Workshop on Networking and Operating System OS Support for Digital Audio and Video, San Diego, Calif. Springer Verlag: 121–132
71. Lukacs ME (1994) The personal presence system – hardware architecture. Proceedings of the 2nd ACM Conference on Multimedia. San Francisco, Calif., ACM Press: 69–76
72. Macedonia MR, Brutzman DP (1994) MBone provides audio and video across the Internet. IEEE Computer 27: 30–36
73. MacIntyre B, Feiner S (1994) Future multimedia user interfaces. Proceedings of the Dagstuhl International Workshop on Fundamentals and Perspectives on Multimedia Systems, Dagstuhl, Germany, pp 209–249
74. Malone TW, Crowston K (1994) The interdisciplinary study of coordination. ACM Comput Surv 26: 87–120
75. Mantei M (1988) Capturing the capture lab concepts: a case study in the design of a computer supported meeting environment. Proceedings of the ACM Conference on CSCW, Portland, Ore, ACM Press: 257–270
76. McCanne S, Jacobson V (1995) vic: A flexible framework for packet video. Proceedings of the 3rd ACM Conference on Multimedia. San Francisco, Calif., ACM Press: 511–522
77. Milazzo P (1991) Shared video under UNIX. Proceedings of the Summer USENIX Conference, Nashville, Tennessee, Usenix Association: 369–383
78. Minneman SL, Bly SA (1991) Managing a trois: a study of a multiuser drawing tool in distributed design work. Proceedings of the ACM Conference on Computer Human Interaction CHI, New Orleans, Louisiana, ACM Press: 217–224
79. Minutes of MMUSIC the multiparty multimedia session control working group (1993) Proceedings of the 28th IETF, Houston, Tex., Corporation for National Research Initiatives, pp 523–542
80. Minutes of MMUSIC the multiparty multimedia session control working group (1994) Proceedings of the 29th IETF, Seattle, Wash., Corporation for National Research Initiatives, pp 545–572
81. Mockapetris PV (1989) DNS Encoding of Network Names and Other Types. RFC 1035
82. Morgan B, Mankin A, Landwebber LL (1994) Observations of Internet video conferencing: towards a virtual reality based conferencing environment. Internal memo, Univ. Wisconsin
83. Mulvihill C, McDermott G, Patel A (1993) Cooperative decision support for medical diagnosis. Comput Commun, Elsevier, 16: 581–593
84. Neuwirth CM, Kaufer DS, Chandhok R, Morris JH (1990) Issues in the design of computer support for co-authoring and commenting. Proceedings of the ACM Conference on CSCW, Los Angeles, Calif., ACM Press: 183–195
85. Nicolaou C (1990) An architecture for real-time multimedia communications systems. IEEE J Selected Areas Commun 8: 391–400
86. Nunamaker JF, Dennis AR, Valacich JS, Vogel DR, George JF (1991) Electronic meeting systems to support group work. Commun ACM 34: 40–61
87. Ousterhout JK (1990) Tcl: An embeddable command language. Proceedings of the Winter USENIX Conference, Washington, D.C., Usenix Association, pp 133–146
88. Ousterhout JK (1991) An X11 toolkit based on the Tcl language. Proceedings of the Winter USENIX Conference, Dallas, Texas, Usenix Association, pp 105–115
89. Partridge C (1992) A proposed flow specification. RFC 1363
90. Pasquale JC, Polyzos GC, Anderson EW, Kompella VP (1992) The multimedia multicast channel. Proceedings of the 3rd International Workshop on Networking and Operating System OS Support for Digital Audio and Video. San Diego, Los Angeles, Calif., Springer Verlag, pp 185–195
91. Patel D, Kalter SD (1993) A UNIX toolkit for distributed synchronous collaborative applications. J Comput Syst 6: 105–134
92. Patel K, Smith BC, Rowe LA (1993) Performance of a software MPEG video decoder. Proceedings of the 1st ACM Conference on Multimedia, Los Angeles, Calif., ACM Press: 83–90
93. Patterson JF, Hill RD, Rohall SL (1990) Rendezvous: an architecture for synchronous multi-user applications. Proceedings of the ACM Conference on CSCW, Los Angeles, Calif., ACM Press: 317–328
94. Pejhan S, Eleftheriadis A, Anastassiou D (1994) Distributed multicast address management in the global Internet. Columbia University, New York
95. Ramirez A (1993) A major record album: only a phone call away. New York Times, p C17, October 7
96. Rangan PV, Vin HM (1991) Multimedia conferencing as a universal paradigm for collaboration. Proceedings of the Eurographics Workshop on Multimedia Systems, Applications, and Interactions, Stockholm, Sweden
97. Reinhard W, Schweitzer J, Volksen G (1994) CSCW tools: concepts and architectures. Commun ACM 27: 28–36
98. Resnick P (1993) Phone-based CSCW: tools and trials. ACM Trans Inf Syst 11: 401–424
99. Root RW (1988) Design of a multimedia vehicle for social browsing. Proceedings of the ACM Conference on CSCW, Portland, Ore., ACM Press: 25–38
100. Roseman M, Greenberg S (1992) Groupkit: a groupware toolkit for building real-time conferencing applications. Proceedings of the ACM Conference on CSCW, Toronto, Canada, ACM Press: 43–50
101. Roseman M, Greenberg S (1994) Registration for real-time groupware. Research Report 94/553/02, Department of CS, University of Calgary, Canada
102. Sauer F, Mansur K (1994) Multimedia technology in the radiology department. Proceedings of the 2nd ACM Conference on Multimedia, San Francisco, Calif., ACM Press: 263–269
103. Scheifler R, Gettys J, Newman R (1988) The X Window system: C library and protocol reference. DEC Press, Newton, Mass.
104. Scheurmann G (1996) Multimedia mail. Multimedia Syst 4
105. Schmandt C (1993) Phoneshell: the telephone as computer terminal. Proceedings of the 1st ACM Conference on Multimedia, Los Angeles, Calif., ACM Press: 373–382
106. Schmandt C, Casner S (1989) Phonetool: integrating telephones and workstations. Proceedings of the IEEE Globecom, Dallas, Texas, IEEE Communications Society: 21.3.1–21.3.5
107. Schooler EM (1993a) The impact of scale on a multimedia connection architecture. Multimedia Syst 1: 2–9
108. Schooler EM (1993b) Case study: multimedia conference control in a packet-switched teleconferencing system. J Internetworking Res Experience 4: 99–120
109. Schooler EM, Casner SL, Postel P (1991) Multimedia conferencing: has it come of age? Proceedings of the 24th Annual Hawaii International Conference on System Science 3: 707–716
110. Schulzrinne H (1992a) Voice communication across the Internet: a network voice terminal. Department of Electrical and Computer Engineering, Technical Report 92-50, Department of Computer Science, University of Massachusetts, Amherst, Mass
111. Schulzrinne H (1992b) Issues in designing a transport protocol for audio and video conferences and other multiparticipant real-time applications. IETF AVT Working Group, Working Draft
112. Schulzrinne H (1995) Dynamic configuration of conferencing applications using pattern-matching multicast. Proceedings of the 5th International Workshop on Networking and Operating System OS Support for Digital Audio and Video, Durham, N.H., Springer Verlag, pp 231–242
113. Schulzrinne H, Casner S, Fredericks R, Jacobson V (1995) RTP: a transport protocol for real-time applications. IETF AVT Working Group, Internet Draft
114. Shenker S, Weinrib A, Schooler E (1994) Managing shared ephemeral teleconferencing state: policy and mechanism. Proceedings of the International COST237 Workshop on Multimedia Transport and Teleservices, Vienna, Austria
115. Snell M (1994) Picture this: videoconferencing. IEEE Computer 27: 8–10
116. Srinivasan R (1994) RPC: a remote procedure call protocol specification version 2. Internet Draft
117. Stefik M, Bobrow DG, Foster G, Lanning S, Tatar D (1987a) WYSIWIS revised: early experiences with multi-user interfaces. ACM Trans Office Inf Syst 5: 147–167
118. Stefik M, Foster G, Bobrow DG, Kahn K, Lanning S, Suchman L (1987b) Beyond the chalkboard: computer support for collaboration and problem solving in meetings. Commun ACM 30: 32–47
119. Steinmetz R (1994a) Data compression in multimedia computing: principles and techniques. Multimedia Syst 1: 166–172
120. Steinmetz R (1994b) Data compression in multimedia computing: standards and systems. Multimedia Syst 1: 187–204
121. Sun Microsystems (1988) A socket-based interprocess communications tutorial. Network Programming Document, Sun Microsystems, Mountain View, Calif
122. SunSoft (1992) Forum teleconferencing software
123. Swinehart DC, Stewart LC, Ornstein SM (1983) Adding voice to an office computer network. Proceedings of the IEEE GlobeCom Conference, San Diego, Calif., IEEE Comm. Society: 392–402
124. Swinehart D (1991) The connection architecture for the etherphone system. Technical Report CSL 91-8, Xerox PARC, Palo Alto, Calif
125. Szyperski C, Ventre G (1993) A characterization of multiparty interactive multimedia applications. Technical Report TR-93-006, International Computer Science Institute, Berkeley, Calif
126. Tang JC, Minneman SL (1990) Videodraw: A video interface for collaborative drawing. ACM SIGCHI Conference on Human Factors in Computing Systems, Seattle, Washington, pp 313–320
127. Topolcic C (1987) Experimental Internet stream protocol. RFC 1190, IETF CIP Working Group
128. Turletti T (1993) H261 Software codec for videoconferencing over the Internet. Research Report 1834, Institut National de Recherche en Informatique et en Automatique, Sophia-Antipolis, France
129. Vin HM, Zellweger PT, Swinehart DC, Rangan PV (1991) Multimedia conferencing in the etherphone environment. IEEE Computer 24: 109–119
130. Watabe K, Sakata S, Maeno K, Fukuoka H, Ohmori T (1991) Distributed desktop conferencing system with multiuser multimedia interface. IEEE J Selected Areas Commun 9: 531–539
131. Weider C, Reynolds J (1992) Technical overview of directory services using the X500 Protocol. RFC 1309
132. Whetten B, Montgomery T, Kaplan S (1994) A high-performance totally ordered multicast protocol. In: Theory and Practice in Distributed Systems, LNCS 928, Springer Verlag
133. Wolf LC, Herrtwich RG (1994) The system architecture of the Heidelberg transport system. ACM Op Syst Rev 28: 51–64
134. Zhang L, Deering S, Estrin D, Shenker S, Zappala D (1993) RSVP: a new resource ReSerVation protocol. IEEE Network 7: 5

Eve Schooler received her B.S. in Computer Science from Yale University in 1983, her M.S. in Computer Science from UCLA in 1988, and is pursuing her doctorate at the California Institute of Technology. From 1983 to 1985, she was a member of the Operating Systems Group at Apollo Computer, and, between 1988 and 1994, she was a researcher at USC’s Information Sciences Institute in the High Performance Computing and Communications division, where her work focused on distributed systems, networking and multimedia. Presently, she is a co-chair of the IETF working group on Multiparty Multimedia Session Control (MMusic). An avid musician, she is also interested in the combination of technology, education and the arts.