Co-Presence and Co-Working in Distributed Collaborative Virtual Environments
Gernot Goebbels
Vali Lalioti
GMD, Department of Virtual Environments, Sankt Augustin, Germany
University of Pretoria, Computer Science Department, South Africa
[email protected]
[email protected]
ABSTRACT
Collaborative virtual environments (CVEs) allow a number of users to share a common virtual space, where they may interact with one another and with the environment. Immersive telepresence involves the use of video and audio communication in projection-based CVEs to support collaboration over distance as if face-to-face. Such approaches include the Office of the Future [29], the TELEPORT system [15], other tele-presence systems [9] and spatially oriented video conferencing [5,6,7]. However, very little research has been carried out on what affects face-to-face communication when it is mediated via projection-based systems. In this paper, we explore how the use of video/audio, input device representations and other disturbance factors typical of projection-based virtual environments affect co-presence, co-working and co-knowledge in distributed CVEs. We present results from co-presence and co-working evaluation sessions with about 60 users of various profiles. An extensive statistical, group and variation group analysis of the results is carried out. The findings and the resulting design guidelines are presented with respect to the above factors.
Keywords Evaluation, Collaborative Virtual Environments, Telepresence, Co-presence, Co-work.
1. INTRODUCTION
The problems of co-presence and co-working in shared workspaces were first explored in the research field of Computer Supported Collaborative Work (CSCW). Virtual Environments not only provide a better man-machine interface but also appear able to overcome some of the problems of sharing and collaborating over distance. Projection-based systems in particular allow more natural collaboration metaphors to be used to provide face-to-face communication and collaboration over distance. Systems and research approaches such as the Office of the Future [29], the TELEPORT system [15], other tele-presence systems [9] and spatially oriented video conferencing [6] are exploring these possibilities. However, very little research has been carried out on the factors that affect face-to-face communication when it is mediated via projection-based systems. In this paper, we explore how the use of video/audio communication, input device representations and disturbance factors typical of projection-based systems affect co-working and co-knowledge in CVEs.
The overall work involves a collaborative and distributed medical application that runs on a combination of projection-based systems, ranging from CAVEs to Responsive Workbenches and cylindrical displays. Over 60 users of various profiles were involved in two separate evaluation sessions, completing two cross-checking questionnaires for each session in conjunction with an observer evaluation. The statistical analysis of these results is done on three levels, namely a first level, a group and a variation group analysis. The first session involves a teacher/learner medical scenario, while the second is a co-working session where participants have equal rights. Furthermore, three variation scenarios have been included to provide additional cross-checking of the analysis results. This work provided us with a variety of design guidelines and evaluation criteria for creating better CVEs. Partial results of this analysis can also be found in [25].
In this paper, we focus on the factors of video/audio communication and virtual input device representation and how they affect co-presence and co-working over distance, specifically for projection-based systems. We also touch on disturbance factors typical of projection-based CVEs, such as stereo glasses and the wires of interaction and tracking devices. The next section summarises the related research approaches, while Section 3 gives an overview of the evaluation sessions and our evaluation methods. The results, statistical analysis and design guidelines related to these factors are presented in Section 4. Section 5 concludes this paper.
2. RELATED WORK
CVEs are multi-party Virtual Environments, which allow a number of users to share a common virtual space, where they may interact with each other and with the environment itself. The problems of multiple users sharing the same workspace are already known from the field of Computer Supported Collaborative Work (CSCW) [1,3]. Some of the major problems are the distribution of objects and information, the delegation of rights and the representation of group structures. Interest in spatial approaches to CSCW has grown over recent years. Specific examples of the spatial approach include media spaces [4], spatially oriented video conferencing [5,6,7], collaborative virtual environments [2,8] and tele-presence systems [9]. In contrast to CSCW systems, direct collaborative real-time interaction leads to completely new interaction possibilities, especially concurrent interaction of at least two users with one or more objects. Unfortunately, there is a lack of application and design support for CVEs. In addition, most of the CVEs under investigation are web-based collaborative Virtual Environments. Good overviews of examples of these desktop CVEs can be found in [2,3,8,10,11,12,14]. Only a few approaches use back-projection-based VEs and CVEs. An overview of these approaches can be found in [15,16,17,18].
The work presented in this paper explores some of the issues related to human-to-human interaction mediated via projection-based VEs. In particular, issues around the use of video and audio for immersive telepresence and input device representations in CVEs are explored in a number of different evaluation sessions. In addition, the effects of disturbance factors typical of projection-based VEs, such as cabling and the use of shutter glasses, are discussed. The results and their analysis provide important insights into the way these issues affect co-presence and co-working in CVEs and allow design guidelines to be developed.
3. THEORY OF EVALUATION
We developed a distributed Virtual Environment for medical applications that allows two or more sites to share the same medical data set and manipulate it collaboratively in real time. The CVE distributes a virtual human data set that includes detailed body skin, the underlying skeleton and heart models. Users can cut the skin; pick, manipulate and query information about the bones of the skeleton; observe animations of the heart's function; and modify the transparency of the heart's tissue.
Figure 1. Medical VE application
Figure 1 presents the functionality of the VE in single user mode. The model is that of a human skeleton covered by skin. The skin can either be made transparent using a three-dimensional slider or cut using a special skin cutter tool. These operations are selectable from the dynamic ring menu, which appears when the user requests the functionality of the data set. In addition, bones can be positioned and their names can be queried [24].
The CVE is implemented in AVANGO [28]. Therefore, we are able to use a variety of input and output devices without any modifications. In this paper, we use two Collaborative Responsive Workbenches [26,27], with a stylus and a three-button tool as interaction devices at each site. The two sites are connected through 100 Mbps Fast Ethernet. An Onyx2 Silicon Graphics computer at each site renders and drives the two output devices. At each site, one O2 Silicon Graphics workstation captures and sends real-time audio and video of the local participant, while another one receives the real-time audio and video of the remote participant. Lower cost hardware could be used both for rendering and for the audio/video communication; the availability of this particular equipment and its high performance made it an easy choice for us.
Usability sessions carried out by VR experts not involved in the initial design resulted in an improved system. This system has also gone through an initial evaluation in a single user evaluation session, after which it was improved in terms of menus, tool representations and overall functionality as a single user virtual environment. Thus, any usability issues related to the design and functionality of the VE in single user mode were eliminated prior to the evaluation of co-presence and co-working in the distributed scenarios. In this paper we focus on the two evaluation sessions that are most closely linked to co-presence and co-working. Three additional ones are included in the variation group analysis for cross-checking of the analysis results.
3.1 Evaluation Sessions
Co-presence Session
The purpose of the co-presence session is to evaluate the CVE in terms of its support for the learning experience using immersive telepresence only. The scenario for this session is that of a distance learning paradigm. A teacher and a learner communicate via a video/audio connection, each working on a Collaborative Responsive Workbench (two-sided responsive workbench). The virtual data set is that of a human skeleton, and it is distributed to the two sites so that changes made by the learner are immediately visible at the teacher's site.
In this session, the learner is also the evaluator of the CVE. The task is for the learner to position three bones precisely in their correct location on a human skeleton. These bones lie in front of the evaluator and look very similar to each other, so that it is not obvious where they have to be added to the skeleton. The teacher remotely explains the task, the data set, the input devices and the tools to the evaluator. If the evaluator does not know how to achieve the goal, the teacher gives advice about which tool should be used, how to query information about the bones, how to change the viewpoint etc. In this session the remote teacher does not use any input devices or tools; only gestures and verbal instructions are used. When the task is completed, the co-presence session ends and a questionnaire addressing co-presence assessments is handed out to the learner. After the questionnaire is filled out, the learner has five minutes of recovery time before starting the co-work evaluation session.
Co-work Session
The purpose of the co-work session is to evaluate the CVE in terms of its support for collaborative work. In this session there is again a local learner and a remote teacher (see Figure 2). However, both partners have equal rights concerning decisions, manipulations and access to tools. It is interesting to note that both users can also complete the task autonomously. Therefore, this session tests whether CVEs encourage team work even though such work is not essential for completing the task.
Figure 2. Collaborative Medical Responsive Workbenches (RWB)
The learner is again the evaluator of the CVE. The task is slightly different from that of the co-presence session. Together the users have to position six bones, belonging to three different pairs, to complete the human skeleton. Each bone in a pair belongs to the left or right side of the skeleton (e.g. the femur bones of the right and the left leg). A set of three of these bones lies in front of each user. As the users stand opposite each other on different sides of the skeleton, they have to find out which bones belong to their side, since the bones are mixed. This can be done by querying the name of the bone. If a user finds out that a bone belongs to the partner's side, this bone can be exchanged by passing it over. After a bone has been positioned in the skeleton, the user can make use of a snap-back tool, which allows the selected bone to snap back to the correct position. This facility enables users to evaluate and verify their own or their partner's work.
In order to complicate the task, the human skeleton is covered with skin. To position the bones, the relevant part of the skeleton needs to be made visible by cutting away the skin in this region. To do so, one user selects the cutting tool from the ring menu and applies it to the relevant part of the human body. However, it is not possible to cut the skin in this region permanently, and therefore the user applying the cutting tool has to hold the tool while the other user positions the bone.
When the six bones are positioned correctly in the skeleton, the session is completed and the co-work questionnaire is handed out to both users.
User Profiles
Analysis of an introduction questionnaire allowed us to produce a user profile. The age of the 60 evaluators ranges from a minimum of 17 years to a maximum of 58 years. The majority are between 22 and 27 years old. Most of these evaluators are university students, whereas the other users' professions range from personal assistants, journalists and workers to technicians and computer science or non-computer science university professors and researchers. None of the evaluators is a Virtual Environment expert. However, their knowledge of computer hardware and software differs substantially. The group of 22-27 year olds uses the computer mostly for web surfing and computer games, whereas the older evaluators use it mostly for editing with text processing software. Therefore, the first group is more experienced with hardware devices such as game joysticks and steering wheels, including force feedback. This observation is independent of the subject's profession or field of studies. By contrast, the older evaluators use a computer almost twice as long per week as the group of 22-27 year olds. No other significant differences between the evaluators that might have an impact on the analysis of the evaluation results were found.
3.2 Evaluation Methods
There exist three different evaluation methods which can be applied to Collaborative Virtual Environments: the expert heuristic, the formative and the summative evaluation [21,22,23]. The expert heuristic evaluation is an analytical method, while the formative and summative evaluations are empirical and observational methods [25]. In the expert heuristic evaluation the evaluator is a field expert who determines problems with usability in the design phase of the CVE. The output of the formative evaluation is a combination of qualitative and quantitative results. The quantitative data measure the amount of time, the number of trials, the number of mistakes etc. while performing a specific task. The qualitative data can be obtained by observing so-called critical incidents [21]. A critical incident is a problem that occurs while a user is interacting within the CVE. These incidents can be confusion, cancellation, errors, repetition etc. The objective of the summative evaluation method is to compare different CVEs designed with the information obtained from the same User Task Analysis. Hence the output of the summative evaluation method enables the statistical comparison of different realizations of interaction techniques, operations, representation components etc. and the choice of the most appropriate one in terms of usability of the CVE.
These different evaluation sessions are implemented because there are too many evaluation items assessing different aspects of the computer mediated human-to-human interaction in projection-based CVEs.
Questionnaires are developed to let the evaluators assess these different aspects. The items of the questionnaires are enumerated. All answers lie in the interval 0 - 6. This is done in accordance with other evaluation methods [13,14,19,20]. In order to support the evaluator in assessing the different aspects of interaction, descriptive text is placed beside the answer possibilities. For example, 0 corresponds to never/bad/no, 3 to sometimes/acceptable/maybe and 6 to often/good/yes. The text-based support enables evaluators to place assessments on a numeric scale more accurately and ensures that the numeric results, which are necessary for the statistical analysis, reflect the evaluator's intentions.
In the early CVE design phase, alternating cycles of expert heuristic and formative evaluation are performed in order to eliminate obvious usability problems from the very beginning. In this paper we focus on the summative evaluation, since the CVE used has undergone the two previous steps and has been improved accordingly.
To analyze the numerical data obtained by the summative method, expectancy values are computed. In order to handle the uncertainty of the numeric results, the standard deviation is also computed.
4. RESULTS AND ANALYSIS
In this paper we present the evaluation results and the extracted guidelines rather than the detailed statistical data. We also focus on results concerning immersive telepresence, the representation of input devices and disturbance factors related to projection-based systems. The results are presented according to the three different levels of analysis, namely the first level, the group and the variation group analysis.
In the first level analysis, average values and their expectancy values are computed and compared for each session separately. In the group analysis, these statistical values are compared between the different sessions. This is possible because the questionnaires are designed so that questions belonging to different sessions evaluate similar factors. In the variation group analysis we again compare different sessions with each other. In contrast to the group analysis, we excluded the video representation of the remote partner from one group and the remote tool representations from another, and complicated the collaboration task for the third group. Then we compared these groups with a reference group from the previous group analysis. The variation approach of the analysis was expected to focus on and cross-check the influence of these particular factors in supporting team work.
4.1 First Level Analysis
Immersive Telepresence
We found that in an educational scenario immersive telepresence supports the work flow. In this situation network dropouts do not have a negative impact on the perception of co-presence as long as the average frame rate does not go below 12 fps.
In a collaboration scenario using immersive telepresence, the position of the remote partner representation should be chosen in such a way that both partners seem to have the same virtual size in the CVE, independent of their physical size in the real world. This is particularly important when partners are given equal rights, as was the case in our co-work sessions.
Input Device Representations
When using an RWB, the perception of co-presence can be increased with a video texture representation of the remote partner together with a real background, since due to depth perception the user has the impression that the remote partner stands closer to the table.
During the co-work session, appropriate representations of the remote user's tools and input devices support collaboration more than body and hand gestures.
Disturbance Factors
The cabling of input devices, trackers and stereo glasses is perceived as annoying. Careful handling of loose wires is advisable.
4.2 Group Analysis
Additional results are found during the group analysis. In particular:
Immersive Telepresence
Our experiments show that the remote partner representation is crucial in situations where problems need to be resolved. The use of the video connection enhances the collaboration at a psychological level, but its quality can be traded off against other representation components.
When integrating immersive telepresence into a CVE, audio and video streams do not necessarily need to be synchronized unless the delay is bigger than 10 frames. Even the resolution plays a marginal role, since participants spent most of the time looking at and working on the virtual data.
Input Device Representations
Appropriate tool and input device representations of the remote partner are adequate means for supporting the perception of co-presence, which is the basic requirement for collaboration. With the help of these representations, the influence of video is reduced to supporting collaboration only psychologically.
Disturbance Factors
High system responsiveness is perceived as having a very positive impact on collaboration. Even downsizing the application in order to decrease the CPU load is advisable. Good system responsiveness is guaranteed if all inputs and outputs are processed and rendered in less than 50 ms. Although working with input devices is assessed as having a negative influence, this perception seems to be very subjective, as shown by the high variance in the answers to the relevant questions. Nevertheless, it is essential to facilitate the use of VE input devices as well as shutter glasses and cabling.
When using descriptive text in a Virtual Environment, developers should ensure that its alignment takes the user's physical size into account. Readability should be provided from any point within the CVE interaction space. This is especially relevant when using a CAVE-like display system or a cylindrical projection. In this case the descriptive text can be attached to the user's gaze, body or input devices.
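The role of the variance in judging how subjective an assessment is can be illustrated with a small sketch. The Python snippet below computes the mean (the expectancy value of Section 3.2) and the sample standard deviation of the answers to one questionnaire item on the 0 - 6 scale, and lists the items whose spread exceeds a cut-off. The item names, the answer lists and the cut-off of 1.5 are hypothetical and chosen for illustration only; they are not data or parameters from the evaluation.

```python
from statistics import mean, stdev

SCALE_MIN, SCALE_MAX = 0, 6  # answer interval used by the questionnaires


def summarize(answers):
    """Return (mean, sample standard deviation) of one questionnaire item."""
    if len(answers) < 2:
        raise ValueError("need at least two answers")
    if any(not SCALE_MIN <= a <= SCALE_MAX for a in answers):
        raise ValueError("answers must lie in the interval 0..6")
    return mean(answers), stdev(answers)


def high_spread_items(items, cutoff=1.5):
    """Names of items whose standard deviation exceeds the assumed cut-off."""
    return [name for name, answers in items.items()
            if summarize(answers)[1] > cutoff]


# Hypothetical answers, for illustration only (not study data).
example_items = {
    "input_devices_disturbing": [0, 6, 1, 5, 0, 6],
    "system_responsiveness_helps": [5, 6, 5, 6, 5, 6],
}

if __name__ == "__main__":
    for name, answers in example_items.items():
        m, s = summarize(answers)
        print(f"{name}: mean={m:.2f}, sd={s:.2f}")
    print("high spread:", high_spread_items(example_items))
```

An item with a mid-scale mean but a large standard deviation, like the first hypothetical item, indicates that evaluators disagree strongly, which is the pattern described above for the input device questions.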
4.3 Variation Group Analysis
The variation group analysis confirms that the absence of representation forms has a negative impact on usability. The statistical results show that a missing remote partner representation handicaps the CVE team more than missing remote tool and input device representations. Intensifying a collaborative work session without restricting the representations also has an impact on usability. In conjunction with the evaluation results it is now possible to formulate a CVE rating scheme. This scheme consists of a chain starting with the audio link to the remote partner, which proved to be the most important component of a CVE. Without audio it is impossible to work adequately. The next component is the video representation of the remote partner. Although this representation form is important, it is not essential for completing the collaborative task. Users are able to compensate for this missing feature with other adequate tools or forms of communication (i.e. remote tool representation and audio). The third item is the remote tool and input device representation. These representations support completing the collaborative task, but they are also not essential.
The conclusions of the statistical and group analysis show that compensation always comes at the expense of usability or of the perception of co-presence and co-knowledge. Users who do not lack any representation feature perceive the collaboration in a CVE as most satisfying. If even one feature is missing, users have to compensate for it with other adequate tools and mechanisms. As a consequence, they are unable to concentrate on the task. The compensating tools and mechanisms strain most of the user's senses to the point of overload. Therefore users perceive the use of equipment, virtual tools and menus as disturbing and confusing. Users who feel supported are more willing to accept components that are weak in terms of usability.
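The rating scheme can be read as an ordered chain of components in which only the first one is indispensable. The following Python sketch encodes that ordering under this reading. The names `Component`, `ESSENTIAL` and `assess` are hypothetical and introduced purely for illustration; they are not part of the CVE implementation.

```python
from enum import IntEnum


class Component(IntEnum):
    """Components of the rating scheme, most important first."""
    AUDIO = 1
    VIDEO = 2
    REMOTE_TOOLS = 3  # remote tool and input device representations


# Only the audio link is described as indispensable.
ESSENTIAL = {Component.AUDIO}


def assess(available):
    """Classify a configuration according to the rating scheme.

    Returns a dict with:
      workable   - False if an essential component (audio) is missing
      compensate - missing non-essential components, most important first
    """
    available = set(available)
    missing = [c for c in Component if c not in available]
    return {
        "workable": ESSENTIAL <= available,
        "compensate": [c for c in missing if c not in ESSENTIAL],
    }


if __name__ == "__main__":
    # Audio and remote tools present, video missing: the task remains
    # workable, but users have to compensate for the missing video.
    print(assess({Component.AUDIO, Component.REMOTE_TOOLS}))
```

Every entry in `compensate` stands for a feature that, according to the analysis above, users have to make up for at the expense of usability or perceived co-presence.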
CVE Advanced Design Guidelines
Finally, it is possible to formulate some further guidelines from the results obtained by the variation group analysis:
- CVE design and realization should consider the CVE rating scheme.
- An audio link to the remote partner(s)/team needs to be more reliable than a video link.
- Synchronization of audio and video streams is not necessary as long as the delay is not bigger than 10 frames.
- Appropriate remote tool and input device representations are supportive, but of minor importance relative to the video link.
- If appropriate remote tool and input device representations are difficult to realize, ensure that equivalent, compensating tools and mechanisms are offered. Action feedback is an appropriate solution for overcoming this representation drawback.
- Expert heuristic, formative and summative evaluations of the stand-alone Virtual Environment might not be able to identify weaknesses in the usability design of a collaborative Virtual Environment.
- The alignment of virtual tools and menus, as well as the usability of input and output device combinations and other equipment, should be designed and implemented with respect to CVE evaluation results.
- Work tools and mechanisms should be designed to relieve the user's senses. High cognitive load, uncomfortable or non-intuitive usability and user fatigue also have a negative impact on the perception of co-presence and co-knowledge, and thus on collaboration.
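The numeric thresholds reported in Section 4 lend themselves to a simple run-time check. The Python sketch below tests one measurement against the three values stated in the text: an average frame rate of at least 12 fps, an audio/video delay of at most 10 frames, and processing and rendering of inputs and outputs in less than 50 ms. The `LinkSample` structure, its field names and the wording of the messages are hypothetical; how such measurements would be obtained in a running system is not covered here.

```python
from dataclasses import dataclass

MIN_FRAME_RATE_FPS = 12.0   # co-presence is not harmed at or above this average
MAX_AV_DELAY_FRAMES = 10    # beyond this, audio and video need synchronization
MAX_LATENCY_MS = 50.0       # inputs/outputs must be handled in less than this


@dataclass
class LinkSample:
    """One hypothetical measurement of a running CVE session."""
    avg_frame_rate_fps: float
    av_delay_frames: int
    latency_ms: float


def violations(sample):
    """Return the list of thresholds that the sample violates."""
    problems = []
    if sample.avg_frame_rate_fps < MIN_FRAME_RATE_FPS:
        problems.append("frame rate below 12 fps")
    if abs(sample.av_delay_frames) > MAX_AV_DELAY_FRAMES:
        problems.append("audio/video delay above 10 frames")
    if sample.latency_ms >= MAX_LATENCY_MS:
        problems.append("processing time not below 50 ms")
    return problems


if __name__ == "__main__":
    print(violations(LinkSample(15.0, 4, 30.0)))   # within all thresholds
    print(violations(LinkSample(10.0, 12, 50.0)))  # violates all three
```

Keeping the thresholds as named constants makes it easy to adjust them if further evaluation sessions suggest different values.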
5. CONCLUSIONS
In this paper we presented results and design guidelines related to the use of video/audio and virtual representations of input devices in collaborative distributed virtual environments. We were able to test, in a variety of different sessions and with users from varied backgrounds, how these and other factors typical of projection-based systems affect co-presence and co-working in CVEs. In the future we would like to further investigate the influence of display systems and input device combinations on collaborative awareness and usability. In addition, it is necessary to find more evaluation parameters in order to screen a wider range of disturbance factors that might affect collaborative interaction in CVEs. The more disturbance factors are encountered, the more subtle the evaluation results become. Finally, we will increase the number of evaluators assessing the CVE application. Although the variation group analysis is able to reduce the problem of high uncertainty values in the evaluators' answer behaviour, a higher number of experimental subjects should evaluate the CVE.
6. REFERENCES
[1] C.A. Ellis, S.J. Gibbs, and G.L. Rein. Groupware - some issues and experiences. Communications of the ACM, 34(1):38–58, 1991.
[2] S. Benford, J. Bowers, L.E. Fahlén, J. Mariani, and T. Rodden. Supporting co-operative work in virtual environments. The Computer Journal, 37(8), 1994.
[3] D. Benford, C. C. Brown, G. T. Reynard, and C. M. Greenhalgh. Shared spaces: Transportation, artificiality and spatiality. Proc. ACM Conference on Computer Supported Cooperative Work (CSCW’96), ACM Press, 1996.
[4] S.A. Bly, S.R. Harrison, and S. Irwin. Media spaces: Video, audio, and computing. Communications of the ACM, 36(1):28–47, 1993.
[5] Y. Ichikawa, K. Okada, G. Jeong, S. Tanaka, and Y. Matsushita. Majic videoconferencing system: Experiments, evaluation and improvement. In Proceedings of ECSCW95, 1995.
[6] H. Ishii and M. Kobayashi. Integration of inter-personal space and shared workspace: Clearboard design and experiments. In Proceedings of CSCW’92, pages 33–42, 1992.
[7] S. Sellen and B. Buxton. Using spatial cues to improve videoconferencing. In Proceedings of CHI92, pages 651–652, 1992.
[8] H. Takemura and F. Kishino. Cooperative work environment using virtual workspace. In Proceedings of CSCW92, 1992.
[9] H. Kuzuoka, G. Ishimoda, and T. Nishimura. Can the gesturecam be a surrogate? In Proceedings of ECSCW95, 1995.
[10] W. Broll. Interacting in distributed collaborative virtual environments. IEEE VRAIS, pages 148–155, 1995.
[11] E. Frecon and A. A. Nou. Building distributed virtual environments to support collaborative work. VRST, pages 105–113, 1998.
[12] D. Margery, B. Arnaldi, and N. Plouzeau. A General Framework for Cooperative Manipulation in Virtual Environments. In Proc. of the Eurographics Workshop in Vienna, Austria, May 31 - June 1, 1999.
[13] M. Slater and A. Steed. A virtual presence counter. Presence: Teleoperators and Virtual Environments 9.5, 2000.
[14] A. Steed, M. Slater, A. Sadagic, A. Bullock, and J. Tromp. Leadership and collaboration in shared virtual environments. IEEE Virtual Reality, Houston, pages 112–115, March 1999.
[15] C. Breiteneder, S. Gibbs, and C. Arapis. Teleport - an augmented reality teleconferencing environment. Proc. 3rd Eurographics Workshop on Virtual Environments Coexistence and Collaboration, February 1996.
[16] C. M. Curry. Supporting collaborative awareness in tele-immersion. Virginia Polytechnic Institute and State University, 1999.
[17] A. Fuhrmann, G. Hesina, F. Faure, and M. Gervautz. Occlusion in Collaborative Augmented Environments. In Proc. of the Eurographics Workshop in Vienna, Austria, May 31 - June 1, 1999.
[18] M. Roussos, A.E. Johnson, J. Leigh, Ch. A. Vasilakis, C.R. Barnes, and T.G. Moher. Nice: Combining constructionism, narrative and collaboration in a virtual learning environment. Computer Graphics, 31(3):62–63, 1997.
[19] M. Usoh, E. Catena, S. Arman, and M. Slater. Presence questionnaires in reality. Presence: Teleoperators and Virtual Environments, in press, 2000.
[20] B.G. Witmer and M.J. Singer. Measuring immersion in virtual environments. (Tech. Report 1014). Alexandria, VA: U.S. Army Research Institute for the Behavioral and Social Sciences, 1994.
[21] D. Hix and H. R. Hartson. User Interface Development: Ensuring Usability through Product and Process. New York: John Wiley and Sons, 1993.
[22] D. Hix, E. Swan II, J. L. Gabbard, M. McGee, J. Durbin, and T. King. User-centered design and evaluation of a real-time battlefield visualization virtual environment. IEEE, pages 96–103, 1999.
[23] J. Nielsen. Usability Engineering. Academic Press, 1993.
[24] G. Goebbels, V. Lalioti, and M. Göbel. On collaboration in distributed virtual environments. In The Journal of Three Dimensional Images, Japan, 14(4):42–47, 2000.
[25] G. Goebbels, V. Lalioti, and T. Mack. Guided design and evaluation of distributed, collaborative 3D interaction in projection based virtual environments. In Proceedings of HCII 2001 - 9th International Conference on Human-Computer Interaction, New Orleans, Aug. 2001.
[26] W. Krüger, C. Bohn, B. Fröhlich, H. Schüth, W. Strauss, and G. Wesche. The responsive workbench: A virtual work environment. IEEE Computer, pages 12–15, May 1994.
[27] W. Krüger and B. Fröhlich. The responsive workbench. IEEE Computer Graphics and Applications, May 1994.
[28] H. Tramberend. Avocado: A Distributed Virtual Reality Framework. In Proc. of the IEEE Virtual Reality, 1999.
[29] R. Raskar, G. Welch, M. Cutts, A. Lake, L. Stesin, and H. Fuchs. The Office of the Future: A Unified Approach to Image-based Modeling and Spatially Immersive Displays. Proceedings of SIGGRAPH'98, pages 179–188, ACM, 1999.