Session S1F

PERFORMANCE IN INTERNATIONAL COMPUTER SCIENCE COLLABORATION BETWEEN DISTRIBUTED STUDENT TEAMS

Martha Hause 1, Marian Petre 2, Mark Woodroffe 3

Abstract - Technology has developed to the point that it allows effective remote collaboration. International collaboration gives students the opportunity to use different technologies for collaborating across time and distance, as well as experience of problem-solving with different cultures in a team-based environment. This paper investigates the interactions of high and low performing distributed student teams involved in a software development project that formed part of Computer Science courses at two universities. A set of categories was developed for this study to examine the communication produced, and the paper tracks the progression and changes in each team's communication via these coded categories: in particular, the use of the communication media available, the amount of communication per team and decision-making patterns throughout the software development process. Results indicate that not only is communication crucial to a team's success, but the process and timing of specific actions can also have an impact on a team's performance.

Index Terms - Distance Learning, Group Performance, International Collaboration, Software Process.
INTRODUCTION

Approximately two-thirds of software projects are late because project teams encounter challenges that threaten the success of a project [19]. The definition of a team's success can vary widely. In industry, success is the completion of a project on time and within budget. In education, success is measured against the guidelines set by the teachers, which is then reflected in the final grade. Many studies have looked at the different factors that affect high and low performance in different types of projects [2, 12, 18, 19]. Mills [12] looked at a group's interaction and behaviour. Communication and interaction among group members are essential to successfully achieve a group's goal or purpose. Many studies have developed and used categories to analyse a group's communication in order to identify team interaction and behaviour [1, 6, 13]. In the present study, discourse analysis was used to examine the communication and interactions produced by high and low performing teams, to find cues that would identify factors in software development performance. This paper examines distributed student teams involved in a software development project that is part of Computer Science courses at two universities. It looks into what makes for good team building of software and what characterises high and low performance teams. The study tracks the progression and changes of categories coded onto each team's communication throughout the project's timeline, especially during key decision periods in the software development cycle.
RUNESTONE

The Runestone Project, sponsored by the Swedish Council for Renewal of Undergraduate Education, is the case study employed in this investigation. The project is an international collaboration between Uppsala University (UU) in Sweden and Grand Valley State University (GVSU) in the USA. It began with a pilot study in 1998 and has continued to date [3, 5, 8, 9, 10, 11]. The project's primary aim is to introduce 'international experience into undergraduate Computer Science education in a way that has value for all participants' [5]. Students use appropriate technology to collaborate closely with their foreign counterparts to complete a software development task.
RUNESTONE 2000

The project focus for this study is the Runestone 2000 instance. It involved a total of 93 students: 47 from UU and 46 from GVSU. There were 16 teams in total, 13 teams of six students (three from each university) and three teams of five students. The Swedish students were in their third year of university study and the American students were in their third or fourth year. The task set for the course, called the Brio Project, was to design and implement a distributed, real-time system to navigate a steel ball through a pre-determined path by tilting the surface of the game board in two dimensions with stepper motors [3, 5, 8, 9, 10, 11]. The Runestone Project has a nine-week duration and meets the Computer Science requirements of each university course. It encompasses the whole of the Brio project, which is incorporated as part of each university course.

Data Collected

Data, in a variety of forms, was collected throughout the project. The data produced during Runestone 2000 is rich in
1 Martha Hause, Faculty of Mathematics and Computing, The Open University, Cambridge, England, [email protected]
2 Marian Petre, Faculty of Mathematics and Computing, The Open University, Milton Keynes, England, [email protected]
3 Mark Woodroffe, Faculty of Mathematics and Computing, The Open University, Cambridge, England, [email protected]
0-7803-7961-6/03/$17.00 © 2003 IEEE November 5-8, 2003, Boulder, CO 33rd ASEE/IEEE Frontiers in Education Conference S1F-13
both quantity and quality and included background questionnaires, project logs, journals, student email and IRC archives, web pages, peer evaluations and instructor interviews. It encompassed all types of interaction between team members except for informal face-to-face meetings. The students were advised that the information would not be shared with the course instructors until after the course was completed.

Technology Used

The teams were given a choice of different media to use for communication. This included whiteboards, chat rooms, video conferencing, web pages, email and Internet Relay Chat (IRC). Although there was some use of video conferencing, it was very minimal and was used in conjunction with IRC. The preferred forms of communication were email, IRC and web pages. Students were required to have weekly meetings. They were also encouraged to have regular contact with the teacher and with other local and remote team members. Students used IRC for regular meetings and email for other types of communication with instructors or other team members. Web pages were part of the first deliverable, which entailed an introduction exercise; they were also later used to make project documents available to the rest of the team.

Team Formation and Procedures

All teams worked on the same project. Two or three students from each university formed local sub-groups with the advice of the local instructors, who had previous knowledge of some of the students. Sub-groups were arbitrarily matched with foreign sub-groups in order to form international teams. The Brio project had a small set of deliverables, and presentations from each group were required at the end of each milestone. All members were to take on the role of developer. Each member was required to take a turn leading a presentation, so that all members had to present at least once. Each student received an individual mark based on both team and individual performance.
The group leader was chosen from within the team and also had responsibility for project co-ordination.

Performance

This study focuses on the email and IRC communication of the 'high-performing' and 'low-performing' teams. As a consequence, it was important to identify the high and low performing teams. Performance was identified by ranking all 16 teams by their average team grade. The quantity of communication for all 16 teams was very large; because of time constraints in the project, it was decided to analyse eight of the teams. These were chosen by taking the top four teams (25%), which were classified as 'high-performers', and the bottom four teams (25%) as 'low-performers' [8, 9, 10].
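The selection procedure just described (rank all 16 teams by average team grade, take the top and bottom 25%) can be sketched as follows. The team labels and grades here are hypothetical placeholders, not the study's data.

```python
def classify_performance(team_grades, fraction=0.25):
    """Rank teams by average grade and return the top and bottom
    `fraction` of teams as (high_performers, low_performers)."""
    ranked = sorted(team_grades, key=team_grades.get, reverse=True)
    n = max(1, int(len(ranked) * fraction))
    return ranked[:n], ranked[-n:]

# Hypothetical average grades for 16 teams (T1..T16).
grades = {f"T{i}": 50 + i * 2 for i in range(1, 17)}
high, low = classify_performance(grades)
# 25% of 16 teams gives 4 teams in each group.
```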
METHOD

The use of electronic communication (email and IRC) by the distributed teams allowed this project to log, quantify and analyse interactions by category and over time, giving an insight into the students' actions and the timing of those actions. Previous studies have created and used categories to analyse group communication in order to identify team interaction and behaviour in problem solving and in organising team activities [1, 6, 13]. Using discourse analysis, which focuses on the ways people construct individual versions of events through their conversation [4], together with Poole's Group Development Model [14, 15] and the waterfall software development model [17], a set of categories was previously developed. Independent coders tested the categories for validation and reliability. The interpretation and validation of the categories was aided by assigning a definition and example to each category and sub-category during their development. The frequency of categories was analysed against time, and a comparison was made of each team's frequency of categories throughout the project timeline.

Categories

It was recognised that the data could be organised into specific categories by using data-driven analysis. This analysis helped in the identification of both the software and group development processes and in the classification of interaction types. While examining the students' emails and IRC, twelve top-level categories were identified. A finer granularity of categories was recognised as necessary, and sub-categories were developed. For example, a particular phrase was given the general category of planning work (C1); however, aspects of the phrase showed that it could be identifying tasks, requesting an update of work or a number of other actions. An example of this is 'how much of the game server and applet do you guys have done?' The top-level categories are shown in Table I below [8, 9, 10].

TABLE I
TOP-LEVEL CATEGORIES
C1 – Planning Work       C7 – Humour
C2 – Planning Admin      C8 – Graphical Expressions
C3 – Decisions           C9 – Ideas
C4 – Roles               C10 – Identification
C5 – Conflict            C11 – Task/Work Specific
C6 – Social/Get to know  C12 – Goals
Coding Process

The coding process began with the identification of phrases in the individual team emails and IRC communication. These phrases were then classified under one or more categories and sub-categories. Although categories and sub-categories were assigned to individual phrases, it was recognised that
the phrases were context dependent. To conduct analysis more easily, the phrases, once coded, were logged electronically. Logging included the following information: the type of communication (email or IRC) and a unique identifier, the date of the communication, the person who stated the phrase, the category and sub-category, and a sample of the phrase.
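A minimal sketch of the log record just described; the field names and the example values are illustrative, not the study's actual schema.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class CodedPhrase:
    """One logged, coded phrase from a team's email or IRC archive."""
    medium: str       # 'email' or 'IRC'
    message_id: str   # unique identifier of the communication
    sent: date        # date of the communication
    speaker: str      # person who stated the phrase
    category: str     # top-level category, e.g. 'C1'
    subcategory: str  # finer-grained sub-category
    sample: str       # sample text of the phrase

entry = CodedPhrase(
    "IRC", "irc-042", date(2000, 2, 14), "student-3",
    "C1", "requesting update of work",
    "how much of the game server and applet do you guys have done?")
```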
OBSERVATIONS

Due to space limitations, the analyses reported here are only part of the entire scope of the study. The full study encompasses analyses of all categories and sub-categories along the project's timeline, further analyses of the decision-making and software development processes, and the individual roles of team members and their effect on performance.

Media

Teams generated different amounts of communication in the form of emails and IRCs. Both the volume and content of each team's communication were analysed. The amount of communication was calculated by totalling the number of coded phrases for all emails and all IRCs for each team. The ratio of email vs. IRC for each team was calculated as the percentage of each medium over the total of both media for that team. The results can be seen in Table II [8].

TABLE II
EMAIL/IRC COMPARISON
High Performing Team   Email   IRC
A                       12%    88%
B                       12%    88%
C                       39%    61%
D                       14%    86%

Low Performing Team    Email   IRC
E                       36%    64%
F                       10%    90%
G                       24%    76%
H                       24%    76%

The high performing teams are identified as teams A-D and the low performing teams as teams E-H. The differences in ratios between the high and low performing teams suggest that the low performing teams had a higher percentage of email vs. IRC, with the opposite true for the high performing teams. The differences, however, did not seem notable, so a test of significance was conducted using the Chi-Square test. A test of significance is done in order to determine whether the differences show a genuine effect or are a result of chance. The results are outlined below.

• In the distribution of total communication for each team, results showed a notable difference in the total number of communications between all the teams.
• In the distribution of totals within the summative high and low performance groups, results again showed a significant difference. A possible reason for this difference is that the high performing teams made fewer communications than the low performing teams.
• When comparing email with IRC for individual teams, the results again showed a notable difference in the teams' use of email as opposed to IRC.
• The comparison of email vs. IRC for the aggregate high and low performing teams did not show a notable difference between the two types of communication within the two types of groups; the differences are considered to have happened by chance.

Although the analysis did not show a significant or notable difference in the use of media between the high and low performing groups, it did show significant differences in the amount of communication and the use of media between the teams. The high performing teams have less communication than the low performing teams, supporting DeSanctis et al.'s [7] suggestion that "higher performing global learning teams do not necessarily communicate more, or more often, with one another compared to lower performing teams. More important to success is communicating deeply, with focus, and developing routines of communication and task completion".

Category Frequencies

A comparison of the total frequency of categories between all teams was done in order to see if there were any notable differences in the occurrence of any particular category. There was very little difference in the frequency pattern of categories between all eight teams for the 12 top-level categories, as Figure 1 shows.

FIGURE 1
COMPARISON OF OVERALL CATEGORY FREQUENCY
(Top-level category percentages for all teams; y-axis: percentage, 0-35%; x-axis: top-level categories 1-12.)
Figure 1 shows the communication percentage patterns for all teams. As the patterns are so similar, it is difficult to differentiate between the teams; the legend has therefore been omitted. Further investigation into the relationship between the categories and the project timeline showed some differences in the percentage of communication in individual categories over specific time periods. Because the timing of communication varied per team, the nine-week timeline was divided into three specific time
periods corresponding to the deadlines for the three major deliverables: time period 1 = weeks 1-3, time period 2 = weeks 4-6 and time period 3 = weeks 7-9. Figure 2 below shows, for each time period, category 1 (C1, Planning Work) as a percentage of the total communication of all the high performing teams and of all the low performing teams.
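The bucketing just described (weeks 1-3, 4-6 and 7-9) amounts to a simple mapping from week to period, from which a category's share of communication per period can be computed. The coded phrases below are hypothetical, for illustration only.

```python
from collections import Counter

def time_period(week):
    """Map a project week (1-9) to its time period (1-3)."""
    return (week - 1) // 3 + 1

def category_share_by_period(phrases, category):
    """Fraction of all coded phrases carrying `category`, per period.
    `phrases` is a list of (week, category) pairs."""
    totals, hits = Counter(), Counter()
    for week, cat in phrases:
        p = time_period(week)
        totals[p] += 1
        if cat == category:
            hits[p] += 1
    return {p: hits[p] / totals[p] for p in totals}

# Hypothetical coded phrases: (week, top-level category).
sample = [(1, "C1"), (2, "C11"), (4, "C1"), (5, "C1"), (8, "C3"), (9, "C1")]
shares = category_share_by_period(sample, "C1")
# e.g. shares[1] == 0.5: one of the two phrases in weeks 1-3 is coded C1.
```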
FIGURE 2
TOP-LEVEL CATEGORY 1 OVER TIME
(C1 - Planning Work as a proportion of total communication for the three time periods; y-axis: 0.00-0.20.)

The high performing teams are identified by the black bars and the low performing teams by the grey bars. Although the differences between the high and low performing teams are minimal, the figure does show the differences in the use of category 1 during each period. This suggests a process pattern for C1 (Planning Work) in which each team has a different percentage of planning in each period.

Software Development

The project's structure of set deliverables followed the waterfall lifecycle, which is also the lifecycle most familiar to the students. The study therefore looked at each team's software development, via their communication, in terms of the waterfall lifecycle phases, which correspond with the project's set deliverables. The start and length of each phase were identified for each team, as well as whether the phase was sequential or segmented, the latter showing iteration in the waterfall lifecycle. Table III shows the results for each team. The first column gives the team identification; teams A-D are the high performing teams and teams E-H the low performing teams. The following columns give, for each phase, S - the start week, # - the number of weeks the phase lasted, and Q - whether or not the phase was sequential. In the Q column, Y signifies the phase was sequential or continuous and N signifies it was segmented. For instance, team A's SE phase starts (S) during week 1, lasts (#) for 7 weeks and is segmented or fragmented throughout the 7 weeks. Team H starts (S) their SE phase during week 2 and it also lasts (#) for 7 weeks, but their phase is sequential or continuous rather than segmented.

TABLE III
LIFECYCLE PHASE MOVEMENT FOR ALL TEAMS

SE - System Engineering
• Start week - earlier for the high performing teams.
• Length - on average, half a week longer for the low performing teams.
• Seq/Seg - more segmented or iterative use by the high performing teams.

Analysis
• Start week - between weeks 1-3, with no distinctive pattern between high and low performing teams.
• Length - on average, half a week longer for the high performing teams.
• Seq/Seg - an even split of segmented or iterative use between the high and low performing teams.

Design
• Start week - between weeks 1-3 for most teams; however, the majority of low performing teams began during week 3.
• Length - on average, one and a half weeks longer for the high performing teams.
• Seq/Seg - all but one low performing team have sequential use.

Code
• Start week - all but one group began coding during week 3.
• Length - on average, half a week longer for the high performing teams.
• Seq/Seg - all but one low performing team have sequential use.

Test
• Start week - between weeks 2-5, with a mode of week 4 for both the high and low performing teams.
• Length - on average, ¾ of a week longer for the high performing teams.
• Seq/Seg - most of the teams, both high and low performing, used this phase sequentially.

Maintenance
• Start week - between weeks 4-7, with no distinctive pattern between high and low performing teams.
• Length - on average, half a week longer for the low performing teams.
• Seq/Seg - most of the high performing groups have segmented use and most of the low performing groups have sequential use.
All teams worked on the same project with the same deadlines and deliverables; however, each had its own process of when, how and for how long it would work on a phase. Differences in the lifecycle patterns between the high and low performing groups were concentrated in the start and type of use of the SE phase, the start, type and length of the design phase, and the type of use (sequential or segmented) of the maintenance phase.

Decision-Making Patterns

Through the process of the data-driven analyses, decisions were identified as implicit or explicit, goal or activity oriented, and challenged or agreed. Implicit decisions are made as a result of a discussion on a particular subject where no definite decision had been stated. Explicit decisions were made when there was an open and clear statement that a decision was made and what the decision encompassed. Goal-oriented decisions were those directly related to a project goal. An activity decision is one that relates to the actions or steps to be taken in achieving the goal or deliverable, but is not directly related to a goal or deliverable. Challenged decisions were identified when a decision was initiated or made and was followed by a discussion contradicting or disagreeing with the decision. An agreed decision was identified when there was 'silence' after the decision was made, implying agreement, or when agreement was voiced without any disagreements. Table IV below shows the decision breakdown for the totals of the high performing teams and of the low performing teams.

TABLE IV
DECISION BREAKDOWN FOR HIGH AND LOW TOTALS

            High Performing   Low Performing
Tot Dec          262               383
Implicit         179               239
Explicit          83               144
Goal              98               164
Activity         164               219
Challenge         44                33
Agreed           218               350
Using the Chi-Square test to test for significance, the following results were found.

• In the distribution of total decisions made, for all the teams, the teams were very different in the way they made decisions.
• In the distribution of total decisions for the summative high and summative low groups, the difference again was found to be significant.
• No significant differences were found between the number of implicit vs. explicit decisions, either for all teams or for the summative high and low performing groups.
• A significant difference was found between the number of goal vs. activity-oriented decisions for all teams, but no significant difference for the summative high and low performing groups.
• Looking at the number of challenged vs. agreed decisions for the summative high and low teams, the results show a notable difference. The comparison of the challenged and agreed decision types across all the teams also shows a significant difference.
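The challenged vs. agreed comparison for the summative high and low groups can be checked directly from the counts in Table IV (44 challenged vs. 218 agreed for the high performers; 33 vs. 350 for the low performers). This is a sketch of the standard 2x2 Chi-Square computation with Yates' continuity correction; the paper does not state which tool performed its tests.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]],
    with Yates' continuity correction (1 degree of freedom)."""
    n = a + b + c + d
    num = n * (abs(a * d - b * c) - n / 2) ** 2
    den = (a + b) * (c + d) * (a + c) * (b + d)
    return num / den

# Rows: high / low performing group; columns: challenged / agreed.
stat = chi_square_2x2(44, 218, 33, 350)
# With df = 1 the 5% critical value is 3.841; a statistic above that
# is consistent with the notable difference reported in the text.
```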
As with the software development process, the teams differ in the types of decisions made. The main differences in decision types between the high and low performing groups lie in the total decisions made and in the challenged vs. agreed decisions. The high performing teams made significantly fewer total decisions than the low performing teams. This is consistent with the total communication breakdown between the high and low performing groups. In both the decision-making and the total communication, the significant differences between the high and low performing teams show an association between lower amounts of communication and decision-making and high performance.

Decision Points during Software Development

Further analyses looked into where goal and activity oriented decisions were made in terms of the software development process. During the highest goal and activity oriented decision points, the high performing teams worked on an average of 4 phases while the low performing teams worked on an average of 5 phases. The phases common to the high performing teams during the highest goal decision points were design and code, while design alone was the most common phase during the highest activity oriented decision points. For the low performing teams, the common phases during the highest goal decision points were SE, design and code; design and SE were the common phases during the highest activity oriented decision points. The difference between the high and low performing groups is the work on the SE phase during the goal and activity oriented decision points. According to Pressman [16], System Engineering is the 'establishment of requirements for all system elements and allocating a subset of these requirements to software. Includes defining their process and the needs of the customer as well as planning and management'. This suggests that the low performing groups spent more
time than the high performing groups on re-establishing the requirements, planning and management.

CONCLUSION

The conclusions drawn here are based on the analyses presented in this paper. However, as stated earlier, further analyses have been carried out in this project, which help to corroborate the conclusions made here. The importance of communication in teamwork is evident. Results from the data-driven analysis show that all teams use communication for work, planning, socialising and so on, as is shown in the categories developed. The physical distance between some of the team members gives them a choice of media to use. Analysis shows that for both the high and low performing teams, there is a significant difference in the choice of media used: IRC has a significantly higher usage than email in both groups. One possible reason for this is that IRC communication is synchronous, so the students were able to give and receive an immediate response. The differences in the teams' use of communication, e.g. planning work (C1), became evident when compared against the project's timeline. The 'how' and the 'when' of the communication differ across all the teams; process and timing therefore became important in looking at a team's performance. One of the differences between the high and low performing groups is the amount of communication produced: the low performing groups have more communication than the high performing groups. Analysis of their work process suggests that it is not the quantity of the communication but the quality that is important in determining performance. Investigation of the teams' software development process revealed that the high performing groups began and used some of the development phases differently from the low performing groups. There were also differences in the types of decisions made by the high and low performing groups, especially in terms of decisions during the software development lifecycle.

The project was the same for all the teams; however, all teams had individuals with different backgrounds and experiences, which made their work processes unique. The differences between the high and low performing teams were due to the quality of the communication and the process and timing of specific actions. This study suggests that the importance of process and timing in software development should be emphasised not only to students but also to practitioners in order to achieve successful results in software development.

ACKNOWLEDGMENT

This paper is written with gratitude to all the students, staff and researchers involved in the Runestone project in Sweden, the United States and the United Kingdom.

REFERENCES

[1] Bales, R.F., Strodtbeck, F.L., 'Phases in Group Problem-Solving', Journal of Abnormal and Social Psychology, Vol. 46, 1951, pp. 485-495.
[2] Belbin, R.M., Management Teams: Why They Succeed or Fail, Oxford: Butterworth-Heinemann, 1996.
[3] Berglund, A., Booth, S., 'Are you guys really concerned about the grades? On the experience of grading systems as contextual to learning in an internationally distributed computer science course', presented at ISCRAT 2002, Amsterdam, Netherlands, June 2002.
[4] Coolican, H., Research Methods and Statistics in Psychology, Second Edition, London: Hodder & Stoughton, 1999.
[5] Daniels, M., Petre, M., Almstrum, V., Asplund, L., Bjorkman, C., et al., 'RUNESTONE, an International Student Collaboration Project', Proceedings of the IEEE Frontiers in Education Conference, AZ, 1998.
[6] Danziger, K., Interpersonal Communication, Exeter: Pergamon Press Inc, 1976.
[7] DeSanctis, G., Wright, M., Jiang, L., 'Building a Global Learning Community', Communications of the ACM, Vol. 44, No. 12, Dec 2001.
[8] Hause, M.L., Woodroffe, M.R., 'Software Development Performance in International Student Team Collaboration', 6th World Multiconference on Systemics, Cybernetics and Informatics (SCI 2002), Orlando, USA, July 2002, Vol. II, pp. 85-190.
[9] Hause, M.L., Last, M.Z., Almstrum, V.L., Woodroffe, M.R., 'Interaction Factors in Software Development Performance in Distributed Student Groups in Computer Science', 6th Annual Conference on Innovation and Technology in Computer Science Education (ITiCSE), Canterbury, UK, June 2001, pp. 69-72.
[10] Hause, M.L., Woodroffe, M.R., 'Team Performance Factors in Distributed Collaborative Software Development', 13th Psychology of Programming Interest Group Workshop, Bournemouth, UK, April 2001, pp. 71-82.
[11] Last, M.Z., Hause, M.L., Daniels, M., Woodroffe, M.R., 'Learning from Students: Continuous Improvement in International Collaboration', 7th Conference on Innovation and Technology in Computer Science Education (ITiCSE), Aarhus, Denmark, June 2002, pp. 136-140.
[12] Mills, T.M., The Sociology of Small Groups, New Jersey: Prentice-Hall, 1967.
[13] Olson, G.M., Olson, J.S., Carter, M.R., Storrosten, M., 'Small Group Design Meetings: An Analysis of Collaboration', Human-Computer Interaction, Vol. 7, 1992, pp. 347-374.
[14] Poole, M.S., 'Decision Development in Small Groups I: A comparison of two models', Communication Monographs, Vol. 48, 1981, pp. 1-24.
[15] Poole, M.S., 'Decision Development in Small Groups III: A multiple sequence model of group decision-making', Communication Monographs, Vol. 50, 1983, pp. 321-344.
[16] Pressman, R.S., Software Engineering: A Practitioner's Approach (European ed.), McGraw-Hill, 1992.
[17] Sommerville, I., Software Engineering (4th ed.), Addison-Wesley, 1993.
[18] Taplin, M., Yum, J.C.K., Jegede, O., Fan, R.Y.K., Chan, M.S., 'Help-seeking Strategies Used by High-Achieving and Low-Achieving Distance Education Students', 13th Annual Conference of the Asian Association of Open Universities, Beijing, 1999.
[19] Teasley, S., Covi, L., Krishnan, M.S., Olson, J.S., 'How does radical collocation help a team succeed?', CHI 2000 Proceedings, 2000.