Supporting Remote Collaboration Through Structured Activity Logging

0 downloads 0 Views 366KB Size Report
Abstract. This paper describes an integrated architecture for online collabora- ... ticipants' audio exchanges, automatic metadata generation and logging of users editing interaction and also information derived from the use of group aware-.
Supporting Remote Collaboration Through Structured Activity Logging Matt-Mouley Bouamrane1, Saturnino Luz1, Masood Masoodian2, and David King2 1

Department of Computer Science,Trinity College, Dublin, Ireland {Matt.Bouamrane, Saturnino.Luz}@cs.tcd.ie 2 Department of Computer Science, The University of Waikato, Hamilton, New Zealand {M.Masoodian, dnk2}@cs.waikato.ac.nz

Abstract. This paper describes an integrated architecture for online collaborative multimedia (audio and text) meetings which supports the recording of participants' audio exchanges, automatic metadata generation and logging of users editing interaction and also information derived from the use of group awareness widgets (gesturing) for post-meeting processing and access. We propose a formal model for timestamping generation and manipulation of textual artefacts. Post-meeting processing of the interaction information highlight the usefulness of such histories in terms of tracking information that would be normally lost in usual collaborative editing settings. The potential applications of such automatic interaction history generation range from group interaction quantitative analysis, cooperation modelling, and multimedia meeting mining.

1 Introduction In ordinary meetings, much of the information needed to understand how the group arrived at a particular outcome is contained in verbal interactions which took place during the meeting itself. In online synchronous collaborative scenarios, explicit communication will usually be conveyed via audio but also through group awareness widgets, such as pointing and gesturing. These can therefore be considered as an integral part of the interaction history [11], [12]. In [10] a model of interaction history is proposed which integrates audio and text activity events and supports information visualisation and retrieval from histories. In this model, the relevant abstract data unit with respect to timestamping and history maintenance is the paragraph. Paragraphs are natural text segments that often hold self-contained information items which can be independently linked to speech segments. Unlike individual characters and words, paragraphs have histories that persist when the segment is moved or altered. Paragraph histories also differ from sentence and line histories in that paragraphs tend to be less dependent on their surrounding contexts. One could, for instance, compare our model based on tracking paragraph history to the one adopted in [6] which attaches interaction information to lines of text and displays quantitative interaction feedback (“wear marks”) on special visualisation scrollbars. While this kind of interaction history can be quite effective in highlighting positional “hot spots” of activity in documents where the context surrounding a line is more relevant than the individual line itself, such as computer programs, its H. Zhuge and G.C. Fox (Eds.): GCC 2005, LNCS 3795, pp. 1096 – 1107, 2005. © Springer-Verlag Berlin Heidelberg 2005

Supporting Remote Collaboration Through Structured Activity Logging

1097

usefulness would be limited with respect to indexing and retrieving data external to the document, in our case: speech. In order for a post-meeting information retrieval model based on temporal and contextual relationships between text and audio media to be effective, detailed logging of textual events is needed. The system must be able to record and timestamp not only editing operations (insertions, deletions, etc.) but also all actions that meaningfully relate to the time-based medium, including pointing, highlighting, and feedback about each user’s focus of attention at various points in time. The last three types of actions are usually implemented in real-time collaborative editors through awareness widgets. In what follows we present a shared editing environment designed to support capture of user interaction metadata at the paragraph level (RECOLED), discuss the complexity of maintaining such metadata as the document evolves and its implications with respect to the implementation of basic editing operations, and present the solutions we have implemented. Post-meeting processing demonstrates the usefulness of such histories in terms of group interaction quantitative analysis, cooperation modelling, and multimedia meeting mining.

2 Design and Architecture of RECOLED In synchronous collaborative work over a shared workspace, audio has been shown to be the most efficient communication medium [7]. As such, RECOLED [1] was designed explicitly to be used in conjunction with an audio channel. It is assumed that the co-authors will use the audio channel provided by the conferencing tool for their explicit verbal communication, while the implicit communication and awareness will be supported (and recorded) through the shared editor. RECOLED employs a semi-centralised architecture which consists of a single server and several clients, as shown in Fig. 1. Text and workspace awareness are

Master Document TCP/IP Editing and gesture timestamping

Modifications, locks, gestures, etc.

RTP Recorder

RTP over UDP (Multicast)

Speech timestamping

Audio speech packets

RECOLED Server

Shared Text Editor Client Local copy of shared document Audio Conferencing Client RECOLED Client

Fig. 1. Audio and text editing history recording architecture of RECOLED

1098

M.-M. Bouamrane et al.

through TCP-IP connections while audio is multicast over UDP sockets. The audio tools used by the clients transmit audio in RTP (Real-time Protocol) packets and the server records these packets directly, keeping the original RTP timestamps and adding arrival timestamps to the recorded data. These timestamps are essential for profiling speech activities and linking them to text segments. In addition, all editing and gesturing operations performed by co-authors are automatically recorded by the central timestamping mechanism of the server. This mechanism is subordinated to the locking mechanism and is described in further detail in the following sections. 2.1 Document Consistency and Concurrency Control Due to the synchronous and distributed nature of the collaborative writing scenario targeted by RECOLED, concurrency and consistency mechanisms are critical to ensure effective use of the editor. Concurrency control is a well researched area in groupware and a number of approaches have been proposed, including: turn-taking, locking, serialization and operational transformations [3], [4], [13]. We opted for a paragraph-based locking scheme for its relative simplicity of implementation and because it suited best our system's architecture and central paragraph-based timestamping need. In order to maintain consistency between the copies of the document, we use a document implementation based on paragraphs, so that modifications are made on these rather than on the document as a whole. This segmentation of the document into paragraphs permits concurrent modification of distinct paragraphs, while a locking scheme ensures that a single paragraph can only be modified by one client at a time. This locking scheme is optimistic so that local responsiveness can be maintained. In order to reduce the likelihood of conflicting editing operations, RECOLED uses a number of group awareness mechanisms. A paragraph currently locked by another participant will have its margin highlighted in that participant’s colour (Fig. 2). A usability study carried out on RECOLED showed that overall, this locking scheme did not hinder the co-authoring task.

Menu Bar

Toolbar

Indicator Area

Shared Scrollbars

Avatars

Lock Margin

Context Menu

Document Area

Fig. 2. RECOLED client application interface

Supporting Remote Collaboration Through Structured Activity Logging

1099

The lock manager is the cornerstone of our collaborative writing module. On the client side, the lock manager is responsible for checking if a participant’s edit can proceed. If the participant is already holding the lock, the edit is carried out and sent to the server, which it then broadcasts the update to the other clients. However, if the participant does not yet hold the lock for the particular paragraph, then the lock needs to be requested from the server. On the server side, the lock manager grants or denies a lock depending on the current paragraph’s status. Since it is the lock manager’s responsibility to permit editing operations, it is also at the centre of the timestamping process. If an update is allowed to go ahead, the lock manager identifies the exact type of operation being performed and triggers the appropriate timestamping operations, generating action and paragraph timestamps, merging or mapping timestamps lists, according to the model described in details in the following sections. Although updates in RECOLED are character-based to allow high responsiveness, generating a separate timestamp for each character typed or deleted would be rather expensive. Furthermore, one of our main design goals is to identify and highlight semantic clusters in the document writing process. As such, a character-based timestamping mechanism would be pointless. Therefore, for timestamp generation purposes, a participant’s consecutive edits are buffered, and the timestamping is subordinated to the locking mechanism. This means that a timestamp is generated only when the paragraph lock is released (currently this is set to 3 seconds of user’s inaction). 2.2 RECOLED Shared Text Editor Designers of existing groupware applications such as synchronous shared editors have often focused on incorporating interface components which provide means of group awareness, particularly workspace awareness (e.g. use of tele-pointers, shared scrollbars, etc.) and social awareness (e.g. video links, participant lists or images, etc.). These types of group awareness mechanisms are clearly important in supporting the collaborative work process, and indeed the usability and effectiveness of groupware depends largely on the extent to which it supports group awareness [2], [5]. In most synchronous shared editor software, however, the group awareness information is only used to support the collaborative work activity during the synchronous session itself, after which such information is not recorded and largely ignored. Although some systems record information such as the names of the participants, session duration, and other particulars, the information relating to specific user actions such as text insertion, deletion, focus of attention, and pointing is lost. In order to address this, RECOLED not only provides the users with important awareness information during the collaborative writing and editing sessions, but also extracts and records the information derived from the awareness components of the interface such as pointing, gesturing and word highlighting for post-meeting access to document history. Successful collaborative writing essentially involves communication, argumentation and negotiation. Audio communication seems to be a natural choice for coordination in real-time collaborative work. RECOLED's collaborative functionality has therefore been implemented in a way that takes into account the fact that speech will in general form a significant part of the interaction history.

1100

M.-M. Bouamrane et al.

3 Metadata Generation One of our design assumptions is that when writing text, co-authors often naturally use text segmentation to structure their document into semantic units. Our aim is to attempt to keep track of these semantic units from their creation, and follow their evolution during the co-writing process. In RECOLED, it is assumed that the appropriate granularity of the text unit for this purpose is the paragraph; defined as a segment of text followed by a blank line. Metadata generated is stored with the document content in a XML format for convenient post-meeting processing. 3.1 Editing Primitives RECOLED is primarily designed to accurately capture editing operations. The original set of editing operation primitives defined was designed to be as general as possible, and included: • Insertion: insertion of a single character • Deletion: deletion of a single character • Gesturing: use of a tele-pointer or other deictic widgets used on one or more text segments. However, trials of earlier prototypes quickly showed that these were clearly insufficient. In particular, insertions or deletions of new lines are highly significant operations in that they modify the structure of the document. To address these particular cases, two new primitives were added to the previous set: • Paragraph_Insertion: insertion of a new line • Paragraph_Deletion: deletion of a new line Finally, someone using a single editor would naturally expect to be able to cut and paste text. These operations, supported by RECOLED, are also significant because once again, their use can possibly modify the structure of the document. As such, two further primitives were also defined: • Paste: insertion of strictly more than on character in a single editing operation • Cut: deletion of strictly more than on character in a single editing operation 3.2 Timestamping The timestamping mechanism used in RECOLED generates two distinct sets of timestamps. The first set consists of paragraph timestamps, as illustrated in Fig. 3, whose purpose is to describe the nature of editing operations performed on every single paragraph. These contain information on the agent who performed the operation, the type of action (i.e. one of the editing primitives previously described) the start and end time of the action, as well as a reference to a unique action timestamp. In the XML representation of the document, text segments consist of the paragraph content and the paragraph timestamps which were generated on that particular paragraph. If the document is structurally modified, RECOLED ensures that these paragraph timestamps are handled accordingly. We later describe the design choices that were made to handle these timestamps.

Supporting Remote Collaboration Through Structured Activity Logging

1101

Staying in Western N. Hotel. Fig. 3. Paragraph timestamps of RECOLED

The second set of timestamps consists of action timestamps, as illustrated in Fig. 4. Action timestamps accurately describe editing or gesturing operations performed on the document. The information they contain are: a unique identifier, the type of operation performed, the start and end time of the operation, the list of paragraphs which were affected by this operation, the offset (within the first paragraph affected by the operation) at which the operation started, and in the case of gesturing the positions of the gesturing. In addition, they contain the exact text content of the operation performed (e.g. what text was added or deleted), and in the case of gesturing the text which was pointed at. ... Staying in Western N. Hotel double double ... Fig. 4. Action timestamps of RECOLED

4 Document Structure and Timestamping Model The final version of a document text follows a linear path from start to end. Such document can be represented by the familiar tree structure of Fig. 5. This, however, fails to describe the often laborious process by which the text was produced. For instance, words might have been written or deleted in arbitrary locations, or paragraphs might have been merged, split and moved in many different ways. Although representing the structure of a text at a given state is a straightforward operation, attempting to make a structured representation of a document as the sums of its editing operations can quickly become extremely complex and intractable. In RECOLED, the actions timestamps describe completely the evolution of the document, and applying these actions at their time of appearance would in effect replay the process by which the document was created. The paragraph timestamps, on the other hand, attempt to describe as accurately as possible the evolution of the text segments. As such, some paragraph timestamps might no longer be reflected in the

1102

M.-M. Bouamrane et al.

actual text content, and instead might describe an operation which was performed when the paragraph was in an earlier state. 4.1 Moving Paragraphs Fig. 5 illustrates the evolution of the structure of a plain style text document, when its first paragraph was removed and its content was appended to the end of the document. The paragraphs “Par” represent positions within the document structure and make no reference to textual content. The actual text content is represented by the “text\n” boxes.

Fig. 5. a. Original state of document

b. Text1\n has been cut and appended

In this scenario, the list of timestamps which described the operations performed on the content of text1 prior to its move will follow the paragraph to its new position (illustrated in Fig. 6). In order to achieve this, the paragraph timestamps lists are mapped to paragraphs contents. Action timestamps will record both the paste and cut operations, and in addition, in the paragraph pasted, a newly generated “paste” timestamp will be appended to the previous list of paragraph timestamps.

Fig. 6. Effect of paragraph content manipulation on paragraph timestamps

4.2 Merging Paragraphs Steps similar to those taken when a paragraph is moved are also taken when paragraphs are merged. In Fig. 6 the deletion of the new line at the end of text3\n will merge paragraphs 2 and 3 (illustrated in Fig. 7). In this case, an action timestamp will record the operation as a Paragraph_Merge. However, as paragraph 3 disappears, its list of timestamps is appended to that of paragraph 2. In this context, the timestamps

Supporting Remote Collaboration Through Structured Activity Logging

1103

which recorded operations on the content of text1\n are not necessarily relevant to the content of text3\n. This however models the fact that paragraph 2 is a merge of two separate text segments, and therefore, its timestamps describe operations which were performed on earlier states of these two separate text segments and eventually led to the current state of paragraph 2.

Fig. 7. Effect of paragraphs merging on paragraph timestamps

4.3 Examples of Timestamps Manipulation The timestamping model which has just been outlined is useful for describing “clean cut” operations, where paragraphs are manipulated as atomic units. However, our experience from using RECOLED in co-authoring situations shows that text operations are often rather complex, and therefore require a more advanced timestamping model. In what follows we propose a taxonomy of cut operations and their effects on paragraph timestamps (Table 1). In the examples, the cut corresponds to the text in bold and the portion of the paragraph described by the cut operation taxonomy is underlined. The other paragraph is there to illustrate the relative position within the document.

5 Post-Meeting Processing 5.1 Analysing Shared Editor Usage and Group Interaction Analysing and building models of group interaction is a large and well documented research area of Computer Supported Collaborative Work (CSCW) and Computer Supported Collaborative Learning (CSCL) [8]. Analysing how people use computer as a communication medium is the key to building successful and viable groupware systems. Many of the approach have focused on qualitative evaluations of group behaviours, and user interfaces and artefacts usage, while seeking a systematic quantitative and automatic approach to collecting interaction data has often been overlooked. While this was not the original motivation behind the design of RECOLED, the automatic interaction history capture provides valuable information about the collaborative writing task, users' interactions and the shared editor's usage. An example of this is gesture information. The significance of group awareness information can be

1104

M.-M. Bouamrane et al.

evidenced by the frequency with which users utilize awareness mechanisms in the course of a typical collaborative writing session. Table 2 shows data collected during our early user trials illustrating the fact that users tend to employ the pointing awareness mechanism frequently and extensively. This confirms the importance of gesturing as a group awareness tool, as demonstrated in previous groupware awareness studies [5]. Table 1. Taxonomy of Cut Operations and their effect on paragraph timestamps

Title Example Effect on Document Structure Action taken on paragraph timestamps Title Example Effect on Document Structure Action taken on paragraph timestamps Title Example Effect on Document Structure Action taken on paragraph timestamps Title Example Effect on Document Structure Action taken on paragraph timestamps

Beginning Cut 15 staff, how many students? 40 students but some have their... Paragraph is merged with previous paragraph Timestamps are merged with previous paragraph timestamps Middle Cut 15 staff, how many students? 40 students but some have their... No change to paragraph in document structure No action taken on paragraph timestamps Whole Cut 15 staff, how many students? 40 students but some have their... Paragraph disappears paragraph timestamps are mapped to paragraph content End Cut 15 staff, how many students? 40 students but some have their... No change to paragraph in document structure No action taken on paragraph timestamps

Table 2. Editing and awareness actions in four separate meetings

Total editing operations 268 64 128 84

Total pointing operations 54 22 36 22

Pointing mean duration (in s) 8.5 6.4 5.7 6.2

Supporting Remote Collaboration Through Structured Activity Logging

1105

People adopt a variety of approaches to collaborative writing. RECOLED's paragraph based logging mechanisms enables us to develop a clear-picture of the coauthoring task in terms of collaboration behaviour and participants priorities during the collaborative meeting. During the usability study carried out on RECOLED, one set of participants (24 students in 12 separate dyadic groups) were asked to arrange the organization details of a fictional week-end away for final year students and staff. The tasks they were requested to perform included booking a hotel, a restaurant, organizing travel arrangements, proposing day time and night time activities, assessing costs, etc. Upon starting the session, a basic plan was loaded on to the editor with a number of outstanding issues to be resolved. An extremely interesting point was that participants almost never addressed these points in the order in which they appeared on the initial document. In fact, we observed a wide variety of approaches in tackling the various tasks. Participants overall did not try to physically reorder these points within the document. The various groups did however address them in totally different orders from each other, according to their own views of priorities (cost, travel arrangements, etc.) It other words, it would be impossible to infer how each group decided to approach the overall tasks by solely looking at the final document. Therefore, the users’ perceptions of priorities are typically something which would be lost in the document’s final outcome in traditional shared-writing scenarios. One of our assumptions is that, by capturing time and nature of participants’ actions, postmeeting processing of timestamps generated by RECOLED will allow to emphasize different groups’ sense of priorities when working on a similar task. In this case, editing history is useful to identify “clusters” of activity within each meeting. 5.2 Interaction History for Meeting Mining and Information Retrieval We have shown elsewhere [9], [11] that document histories can support effective retrieval of information from multimedia (audio, text and gestures) recordings of collaborative activities. One of our assumptions is that when writing text, co-authors naturally structure their documents into semantic units. The data collection framework described in this paper was designed so as to enable us to keep track of these semantic units from their creation, and follow their evolution during the co-writing process. The most interesting aspect of the paragraph-based editing history recording lies in using the timing of editing operations performed on each paragraph in order to provide temporal entry points to an otherwise sequential audio file. A similar approach was used in FILOCHAT [14] where hand-written notes on a portable device is used as temporal indexing to collocated meeting audio recordings. Editing timestamps captured on each paragraph during the co-writing tasks provide temporal entry points to the meeting audio recording by exploiting concurrency of actions and speech segments. These audio segments might in turn point to another set of editing actions performed on different paragraphs. Therefore, interactions on the space-based artefacts (segments of text) is used in order to uncover not only nonsequential temporal links with the continuous medium (audio), but also links to other non-contiguous space-based artefacts. In this scenario, the timing properties of the various data units are not used as the underlying structure of our data representation but rather as a means of linking these various data units. A paragraph’s retrieval unit is based on the notion of temporal intervals and uses concurrency of editing and ges-

1106

M.-M. Bouamrane et al.

turing actions performed on each paragraph of the text document and participants speech exchanges. The algorithm used to generate a paragraph's retrieval unit is as follows: • retrieve all editing and gesturing actions performed on a paragraph • use the timing information of these actions to find all concurrent speech exchanges • use the duration of these speech exchanges to find all concurrent editing actions not previously uncovered • iterate through the 2 previous steps until no new actions or audio intervals can be found The retrieval algorithm stops when no more concurrent editing operations or audio segments can be found. A paragraph's retrieval unit then consists of the set of audio speech exchanges and text interactions retrieved by this algorithm.

6 Conclusion This paper presented a real-time multimedia meeting architecture that integrates traditional editing and awareness functionality with mechanisms designed to implicitly capture document history. We presented a formal model for the automatic document history generation, based on a textual artefact (a paragraph and its interaction history) by defining editing and gesturing action primitives and presenting our proposed solutions for these artefacts manipulation (splitting and merging) relating to a number of common editing operations such as cutting and pasting. We highlighted a number of possible applications of this history tracking in terms of group interaction quantitative analysis and meeting mining. The model for history capture and retrieval presented here can easily be extended to any mix of space-based artefacts and continuous medium by defining a set of legal operations on the space-based artefact and defining rules for their manipulation in terms of history tracking. Future work will involve the integration of speech recognition for keyword spotting and text mining for contextual clustering of the multimedia meetings and the formal definition of the above model to any combination of space-based artefacts and continuous medium.

References 1. Bouamrane, M., King, D., Luz, S., Masoodian, M.: A Framework for Collaborative Writing with Recording and Post-Meeting Retrieval Capabilities. Special issue on the 6th International Workshop on Collaborative Editing Systems, IEEE Distributed Systems Online (2004). 2. Dourish, P., Bellotti, V.: Awareness and Coordination in Shared Workspaces. In: Proceedings of the Conference on Computer-Supported Cooperative Work, Toronto, ACM Press (1992) 107–114. 3. Ellis, C.A., Gibbs, S.J.: Concurrency Control in Groupware Systems. In: Proceedings of the ACM SIGMOD International Conference on the Management of Data, Seattle, Washington, USA (1989) 399–407.

Supporting Remote Collaboration Through Structured Activity Logging

1107

4. Greenberg, S., Marwood, D.: Real Time Groupware as a Distributed System: Concurrency Control and its Effect on the Interface. In: Proceedings of the ACM CSCW’94 Conference on Computer Supported Cooperative Work (Chapel Hill, N. Carol., Oct. 22–26). ACM Press (1994) 207–217. 5. Gutwin, C., Greenberg, S.: The Effects of Workspace Awareness Support on the Usability of Real-Time Distributed Groupware. ACM Transactions on Computer-Human Interaction 6(3), (1999) 243–281. 6. Hill, W. Hollan, J., Wroblewski, D., McCandless, T.: Edit Wear and Read Wear. In: Proceedings of the SIGCHI conference on Human factors in computing systems, CHI 1992, ACM press (1992) 3–9. 7. Jensen, C., Farnham, S., Drucker, S., Kollock, P.: The Effect of Communication Modality on Cooperation in Online Environments. In: Proceedings of CHI'00 (The Hague, Netherlands). ACM Press (2000) 470–477. 8. Komis, V., Avouris, N., Fidas, C.: Computer-Supported Collaborative Concept Mapping: Study of Synchronous Peer Interaction. Education and Information Technologies, vol. 7, No.2, Kluwer Academic Publishers, Hingham, MA, USA (2002) 169–188. 9. Luz, S., Masoodian, M.: A Mobile System for Non-Linear Access to Time-Based Data. In: Proceedings of Advanced Visual Interfaces AVI'04, ACM Press (2004) 454–457. 10. Luz, S., Masoodian, M.: A Model for Meeting Content Storage and Retrieval. In: Proceedings of the 11th International Conference on Multi-Media Modeling, ed. Yi-Ping Phoebe Chen. IEEE Computer Society (2005) 392–398. 11. Masoodian, M., Luz, S.: COMAP: A Content Mapper for Audio-Mediated Collaborative Writing. In: Usability Evaluation and Interface Design, Proceedings of HCI International, eds. M. J. Smith, G. Savendy, D. Harris, and R. J. Koubek. Lawrence Erlbaum (2001) 208–212. 12. Moran, T.P., Palen, L., Harrison, S., Chiu, P., Kimber, D., Minneman, S., van Melle, W., Zellweger, P.: “I’ll Get that off the Audio”: A Case Study of Salvaging Multimedia Meeting Records. In: Proceedings of ACM CHI 97, Atlanta, Georgia, USA (1997) 202–209. 13. Sun, C., Jia, X. Zhang, Y. Yang, Y., Chen. D.: Achieving Convergence, Causality Preservation, and Intention Preservation in Real-Time Cooperative Editing Systems. ACM Transactions on Computer-Human Interaction., 5(1) (1998) 63–108. 14. Whittaker S., Hyland P., Wiley M.: Filochat: Handwritten Notes Provide Access to Recorded Conversations. In: Proceedings of the ACM Conference on Human Factors in Computing Systems, ACM Press (1994) 24–28.

Suggest Documents