Software Process in the Classroom: A Comparative Study

Sally E. Goldin and Kurt T. Rudahl
Department of Computer Engineering
King Mongkut’s University of Technology Thonburi
Bangkok, Thailand
Fax: +66-2-872-5050
E-mail:
[email protected]

Abstract—Software process paradigms are a core unifying concept in software engineering, but they are very difficult to teach. Recent studies that have attempted to bring software process into the classroom have focused mainly on agile methodologies. Few if any studies have compared multiple paradigms. The current research compared the use of the Rational Unified Process (RUP) and eXtreme Programming (XP) paradigms by teams of students developing projects to satisfy the same user requirements. We found that all teams understood their assigned paradigms, but the RUP teams were more successful in applying the methodology. On the other hand, the RUP teams were significantly more likely to claim that they would have preferred to use XP rather than their assigned process.
I. INTRODUCTION

Software process paradigms are a core unifying concept in software engineering (SE). Process frameworks such as the Rational Unified Process (RUP) [5], Model Driven Architecture (MDA) [7], eXtreme Programming (XP) [1] and Scrum [9] integrate specific SE techniques and topics and ground them within a context of the software development life cycle. Considered in isolation, individual SE practices such as object-oriented design or regression testing are difficult to understand and to justify. A software process paradigm implicitly or explicitly establishes a set of values, roles, work flows and deliverables that guide the use of, and elucidate the relationships between, SE practices. A process paradigm also determines which specific SE practices should be emphasized and which ones given less attention.

Although the topic of software process paradigms is generally recognized as important, it is quite difficult to teach [4]. A software process remains an abstraction unless it can be applied in an actual team-based development project. However, few software engineering courses can incorporate projects with large enough teams and adequate scope for students to actually practice a particular process. Typically, software engineering courses present the concepts of software process in a lecture or two near the end of a term, after students have become familiar with the various individual practices addressed by a process framework. Students may recognize the terminology associated with RUP or XP, but they do not understand how these processes work in the real world or even why having a process framework is desirable.
Some recent studies have reported efforts to apply formal software processes in student projects. Most have focused on Extreme Programming (XP) or other agile methodologies [2][3][6][10][11][12], often with the intent to evaluate the effectiveness of these paradigms. One university has applied the heavyweight Rational Unified Process in a two-course sequence [8]. Kessler and Dykman [4] report some success creating a course that integrates aspects of the traditional waterfall approach with agile methods.

There is considerable debate about which process paradigms make the most sense in an educational context. However, to our knowledge, no research to date has compared two process paradigms in the same environment. This is our objective in the present study. In a single course, we required some students to use a heavyweight process (RUP) while others used a lightweight process (XP), to develop the same types of applications. Our work aims to answer the following questions:

1) Can students understand and apply an explicit software process paradigm while developing a non-trivial application as a team project?
2) What difficulties do students encounter applying these paradigms in a classroom context?
3) Do the different process paradigms tend to produce different results, in terms of level of progress or completeness?
4) Do students tend to prefer one paradigm over the other?

The first two questions address the general effectiveness of software process practice in an academic setting. The last two consider possible differences between the paradigms.

II. METHOD

A. Procedures

For the fall term 2008, we were asked to develop a new graduate-level software engineering course that would be required for all master's and doctoral students in the Department of Computer Engineering at KMUTT. The appropriate focus and organization for this new course were not immediately clear. The department already offered a software engineering course as a core requirement for all its undergraduates, but we did not want to duplicate the content of the existing course,
since a significant fraction of our master's students receive their undergraduate degrees from our department. Given the authors' backgrounds working in commercial software development environments as well as the industry-oriented nature of our students, we decided that the new course should focus more on practical application of SE techniques than on theory. Furthermore, we saw this as an opportunity to introduce software process paradigms in an integrated and meaningful way.

Twenty-one students registered for the course: nine graduate students and twelve seniors. During the first class meeting, they completed a questionnaire in which they rated their programming ability, their familiarity with software process and their past experience with real-world software development.

The course met one evening a week for fourteen weeks. (All graduate courses in our department meet in the evening to accommodate the many graduate students who work full time.) Each three-hour class meeting included a brief lecture (one hour or less) followed by two hours of group project work. The class met in one of our computer labs and students were encouraged to continue working after the official end of class and/or to meet outside of the class.

During the first two weeks, the instructor (the first author) reviewed software engineering concepts, discussed software process paradigms in general, and provided a detailed introduction to the Rational Unified Process and Extreme Programming. Reading assignments supported the lectures.

Project work began during the third week. The instructor divided the class into four teams (three with five members and one with six members), attempting to balance self-rated experience and graduate/undergraduate proportions across teams. Each team was randomly assigned one of two applications (a sketchpad application or a calendar/appointment book application) and one of two process paradigms (RUP or XP).

The instructor served as both the user/customer (describing the functionality desired in the application) and the process coach. She tried to make it clear to each team which role she was playing at any point in time. Teams were free to decide how to implement their applications (programming language, architecture, tools) and how to apply the process paradigms to fit their team. Each week the instructor circulated among the teams, answering questions. In the role of the user, the instructor also viewed and commented on iterative releases.

Teams were instructed to place all project artifacts (documents, code, tests, etc.) in a CVS repository. The instructor examined the contents of the repository each week in order to assign the teams' progress scores for the week.

Halfway through the course, the instructor, in the role of the user, introduced a new, high-priority functional requirement for each application. The intent was to demonstrate the disruptive nature of requirements change, as well as to see whether the agile process groups were more effective (as would be predicted) in adapting to this change.

At the last class meeting, each team presented an oral report on its project. This report included a live demonstration of the application as implemented to that point. The report
outline also asked the teams to evaluate which aspects of their respective paradigms they had applied successfully and which aspects had given them difficulty. As part of the final exam (which focused mainly on the lecture material), students completed a ten-question survey gathering their opinions on the project.

B. Data Gathered

In order to address our research questions, we considered the following data:

1) Number of document and code artifacts produced by each team;
2) Instructor ratings, on a ten-point scale, of application completeness and correctness, based on the live demos;
3) Ratings of the report slides, on a ten-point scale, by the first author and a colleague who teaches the undergraduate SE course, on two dimensions: how well the team understood their assigned process and how effectively and completely they applied their assigned process;
4) Final project grades for each team;
5) Responses from the final questionnaire.

III. RESULTS AND DISCUSSION

A. Hypotheses

We had several hypotheses at the start of this study. First, with regard to the general use of process paradigms, we expected that teams would have at least some success in applying their assigned process frameworks. This should be reflected in two measures. We expected that the RUP groups would produce more documents than the XP groups (since RUP specifies many deliverables other than code, including requirements specifications, object models and so on). We also reasoned that the functionality of the XP-based applications might be more complete by the end of the term. The constraints of the classroom setting limited the time available for development. Since RUP stipulates more up-front design work, we thought that RUP groups might have less time available for implementation.

Our second hypothesis related to differences between the paradigms. We expected that students would prefer to use the XP process. This was based on previous observations that students tend to want to jump into coding, without spending much time on design. Furthermore, many of our students have difficulty with the English language. We reasoned that they would be more comfortable with a process that did not require them to create many documents in English. If students do prefer to work with an agile, code-centered process like XP, this might be reflected in the overall quality of the resulting applications and in the ratings of how well teams applied their respective processes. It should also show up in the questionnaire ratings.

B. Success in Applying Process Paradigms

Tables I and II show the final number of document and code artifacts produced by each team, respectively.
The document counts support our first hypothesis. The RUP groups produced almost three times as many documents, on average, compared to the XP groups. (This difference is only borderline significant, p = 0.08, due to the very low degrees of freedom.) The type of application does not appear to affect the number of documents.

The code artifact count, on the other hand, suggests that the RUP groups produced more code than the XP groups, which is not what we would expect. These data, however, are distorted by the large number of artifacts created by the RUP Calendar group. This group used a tool that generated code, including HTML templates and PHP scripts, for each screen and subscreen. Overall there is some suggestion that the Calendar groups created more code artifacts than the Sketchpad groups. Both Calendar groups decided to use a web architecture and PHP, while the Sketchpad groups built standalone desktop applications (in C# and in Java). The web applications tended to have more separate files than the desktop applications. One might suggest that counting lines of code produced by each group would be more precise than counting files. Given the different languages and technologies utilized, however, it does not seem to be possible to define a general, unbiased metric.

Table III shows the instructor's ratings of application completeness for the four teams, based on their final demonstrations. The maximum possible rating was 10. Completeness was based on how many of the features originally specified by the user were implemented in the final release of the application. These measures do not show much variability, with the exception of the RUP Sketchpad team, whose final application was significantly less mature. This group produced a functional prototype before the design was very advanced, but discarded it after the instructor pointed out that it did not reflect their object model. The individual playing the role of implementor in this team said that he really did not understand how to build an application based on a class diagram (even though C#, which this team chose for implementation, is an object-oriented language).

The correctness ratings were 9 for all teams, with the exception of the XP Sketchpad team, which received a rating
of 8 out of 10.

C. Differences between Process Paradigms

Table IV shows project quality, as measured by the average project grade for each team, together with two faculty members' averaged ratings, on a ten-point scale, of process understanding and of effectiveness of process application. The resulting project quality, as measured by average grade, tended to favor XP over RUP. On the other hand, the difference between the averages is less than 1 point, and RUP yielded both the highest and the lowest score in the class, so it is difficult to draw any conclusions. Table IV also shows that although the groups seemed to have about the same level of understanding of their processes, the RUP groups had more success in applying the process, based on what they said in their final reports (borderline significant, p = 0.09 with 2 degrees of freedom). The problems reported by the XP groups are discussed in the next section.

In Table V, we summarize the average questionnaire ratings for the two processes. These ratings are based on a five-point scale, from 1 = strongly agree to 5 = strongly disagree. The p values are based on two-tailed heteroscedastic t tests between the groups. Two questions show significant differences between the process groups, at the 0.01 level. RUP students had a higher level of agreement with the statement "I think that I would have been more comfortable working with the other development process." This supports our hypothesis that students would prefer XP over RUP. RUP students also had a higher level of agreement with the statement "The project part of this course was difficult and challenging." This is a bit surprising given the fact that the XP groups seemed to have had more trouble actually applying their process framework. RUP students did agree more strongly with the statement "I understand the basic concepts of the development process (either XP or RUP) that my team used." However, the difference did not reach significance.
TABLE I
NUMBER OF DOCUMENTS PRODUCED BY EACH TEAM

Application/Process    RUP    XP    Mean
Sketchpad               14     3     8.5
Calendar                 8     5     6.5
Mean                    11     4

TABLE II
NUMBER OF CODE ARTIFACTS PRODUCED BY EACH TEAM

Application/Process    RUP    XP    Mean
Sketchpad               14    14    14
Calendar                78    22    50
Mean                    46    18

TABLE III
COMPLETENESS RATINGS FOR EACH TEAM

Application/Process    RUP    XP    Mean
Sketchpad                4     9     6.5
Calendar                 9     8     8.5
Mean                   6.5   8.5

TABLE IV
PROJECT GRADES, PROCESS UNDERSTANDING RATINGS, AND PROCESS APPLICATION RATINGS

Team             Grade (max 40)   Understanding of process   Applying process
RUP Sketchpad         31.73                6.5                     7.0
RUP Calendar          38.18                8                       8.25
Mean                  34.96                7.25                    7.625
XP Sketchpad          34.71                7.75                    6.0
XP Calendar           36.82                7.0                     6.5
Mean                  35.76                7.375                   6.25
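The artifact figures in Tables I and II are counts of files in each team's CVS repository. Purely as an illustration (this is not the script the instructor used, and the extension lists are our own assumptions for the example), the following Java sketch shows one way such counts could be derived from a checked-out working copy:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.Map;
    import java.util.Objects;
    import java.util.Set;
    import java.util.stream.Collectors;
    import java.util.stream.Stream;

    public class ArtifactCounter {
        // Extensions assumed here to mark code vs. document artifacts (illustrative only).
        private static final Set<String> CODE = Set.of("java", "cs", "php", "js", "html", "css");
        private static final Set<String> DOCS = Set.of("doc", "pdf", "txt", "xls", "ppt", "odt");

        public static void main(String[] args) throws IOException {
            Path root = Paths.get(args.length > 0 ? args[0] : ".");   // a team's working copy
            Map<String, Long> counts;
            try (Stream<Path> files = Files.walk(root)) {
                counts = files
                    .filter(Files::isRegularFile)
                    .filter(p -> {
                        for (Path part : p) {                 // skip CVS administrative dirs
                            if (part.toString().equals("CVS")) return false;
                        }
                        return true;
                    })
                    .map(ArtifactCounter::classify)
                    .filter(Objects::nonNull)
                    .collect(Collectors.groupingBy(c -> c, Collectors.counting()));
            }
            System.out.println("code artifacts:     " + counts.getOrDefault("code", 0L));
            System.out.println("document artifacts: " + counts.getOrDefault("document", 0L));
        }

        // Classifies a file by extension; returns null for files that are not counted.
        private static String classify(Path p) {
            String name = p.getFileName().toString().toLowerCase();
            int dot = name.lastIndexOf('.');
            if (dot < 0) return null;
            String ext = name.substring(dot + 1);
            if (CODE.contains(ext)) return "code";
            if (DOCS.contains(ext)) return "document";
            return null;
        }
    }

A line-of-code tally could be added in the same traversal, but, as noted above, it would inherit the same language and tooling biases as the file counts.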
D. Observations

During the course, the instructor observed and made notes on the activities of the different teams. Her observations support the conclusions that students can understand and apply software process paradigms in a classroom setting and that the XP groups had more difficulty applying their process than the RUP groups. Observation also highlighted some of the limitations of this initial study.

As prescribed by the RUP paradigm, both RUP teams began by assigning roles to different members of the team. Each team appointed a project manager, one or more implementors, one or more designers, and a test coordinator. The RUP paradigm actually specifies more than thirty roles. Both teams decided to eliminate roles (e.g. business modeler, configuration manager) that seemed to be unnecessary for their projects. Once roles had been established, the RUP groups began working on use case models and then object models. Discussions with the groups, especially the more successful RUP Calendar group, indicated that they were aware of the different RUP phases (Inception, Elaboration, Construction, Transition) and had definite notions about their project's current phase. The Calendar team produced two code iterations. The Sketchpad team, in contrast, got stuck in the Elaboration phase, trying to create their object model, and produced only one, fairly incomplete, iteration.

The XP Sketchpad team created a set of user stories and then began immediately to write code, as prescribed by the paradigm. The XP Calendar team created user stories, but then spent more time than appropriate on group design discussions. The instructor, in the role of coach, had to remind them that XP is code-centric and that they should work out their ideas in code rather than as abstract models.

The XP teams had difficulty applying the core XP practice of test-driven development. Both the Sketchpad and the Calendar projects were highly interactive applications. The xUnit testing frameworks introduced in the lecture and traditionally associated with XP are not very appropriate for GUI-intensive programs. The Sketchpad team tried using JUnit but found that the resulting tests were tedious to create and not very informative. The Calendar team defined manual unit tests in a spreadsheet format, without any attempt at automation. The XP teams also made only limited use of pair programming.
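To illustrate how this difficulty can be reduced, the following is a minimal sketch invented for this discussion, not taken from either Sketchpad team's code: a hypothetical ShapeModel class keeps the drawing state out of the GUI layer, so that ordinary JUnit tests stay short and informative.

    // Hedged sketch: ShapeModel is a hypothetical, GUI-free model of sketchpad state,
    // not code from any team in the study. Factoring state out of the GUI layer is one
    // way an xUnit framework such as JUnit remains practical for an interactive program.
    import static org.junit.Assert.assertEquals;

    import java.util.ArrayList;
    import java.util.List;
    import org.junit.Test;

    /** Minimal, GUI-free sketchpad model (hypothetical). */
    class ShapeModel {
        private final List<int[]> rects = new ArrayList<>();   // each entry: {x, y, w, h}

        /** Adds a rectangle and returns its index as an id. */
        int addRectangle(int x, int y, int w, int h) {
            rects.add(new int[] { x, y, w, h });
            return rects.size() - 1;
        }

        int shapeCount() {
            return rects.size();
        }

        /** Returns the id of the topmost shape containing the point, or -1. */
        int shapeAt(int px, int py) {
            for (int i = rects.size() - 1; i >= 0; i--) {
                int[] r = rects.get(i);
                if (px >= r[0] && px <= r[0] + r[2] && py >= r[1] && py <= r[1] + r[3]) {
                    return i;
                }
            }
            return -1;
        }

        /** Removes the most recently added shape, if any. */
        void undo() {
            if (!rects.isEmpty()) {
                rects.remove(rects.size() - 1);
            }
        }
    }

    public class ShapeModelTest {
        @Test
        public void addedShapeIsSelectableAtAPointInsideIt() {
            ShapeModel model = new ShapeModel();
            int id = model.addRectangle(10, 10, 50, 30);    // x, y, width, height
            assertEquals(1, model.shapeCount());
            assertEquals(id, model.shapeAt(20, 20));
        }

        @Test
        public void undoRemovesTheMostRecentShape() {
            ShapeModel model = new ShapeModel();
            model.addRectangle(0, 0, 10, 10);
            model.undo();
            assertEquals(0, model.shapeCount());
        }
    }

Tests of this kind exercise the behavior the user stories describe without driving the GUI itself, which is where both teams reported that automated testing became tedious.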
TABLE V
AVERAGE QUESTIONNAIRE RATINGS BY PROCESS AND P VALUES FOR COMPARISON

Statement              RUP     XP      p Value
Everyone contributed   2.11    2.1     0.975
Project successful     2.11    2.2     0.820
Understand process     1.67    2.10    0.0945
Process appropriate    2.67    2.1     0.160
Enough time            2.89    2.6     0.489
Prefer other           2.22    3.6     0.008
Work together          1.89    1.8     0.808
Interesting            1.25    1.7     0.062
Challenging            2.0     2.8     0.002
Enough info            2.22    1.7     0.073
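For readers who wish to reproduce the style of analysis behind the p values in Table V, the following is a minimal sketch of a two-tailed heteroscedastic (Welch) t test. The two rating arrays are placeholders, since the individual questionnaire responses are not reproduced here, and the sketch assumes Apache Commons Math 3 is on the classpath for the t-distribution CDF.

    // Hedged sketch, not the analysis script used in the study. The rating arrays are
    // illustrative placeholders on the 1 (strongly agree) to 5 (strongly disagree) scale.
    import org.apache.commons.math3.distribution.TDistribution;

    public class WelchTTestSketch {
        public static void main(String[] args) {
            double[] rup = { 2, 2, 3, 2, 2, 2, 3, 2, 2 };       // placeholder RUP ratings
            double[] xp  = { 4, 3, 4, 4, 3, 4, 3, 4, 4, 3 };    // placeholder XP ratings

            double m1 = mean(rup), m2 = mean(xp);
            double v1 = variance(rup, m1), v2 = variance(xp, m2);
            double n1 = rup.length, n2 = xp.length;

            // Welch t statistic: (m1 - m2) / sqrt(v1/n1 + v2/n2)
            double se2 = v1 / n1 + v2 / n2;
            double t = (m1 - m2) / Math.sqrt(se2);

            // Welch-Satterthwaite degrees of freedom
            double df = se2 * se2
                    / (v1 * v1 / (n1 * n1 * (n1 - 1)) + v2 * v2 / (n2 * n2 * (n2 - 1)));

            // Two-tailed p value from the t distribution
            double p = 2.0 * new TDistribution(df).cumulativeProbability(-Math.abs(t));

            System.out.printf("t = %.3f, df = %.1f, p = %.3f%n", t, df, p);
        }

        private static double mean(double[] x) {
            double s = 0;
            for (double v : x) s += v;
            return s / x.length;
        }

        private static double variance(double[] x, double mean) {   // sample variance (n-1)
            double s = 0;
            for (double v : x) s += (v - mean) * (v - mean);
            return s / (x.length - 1);
        }
    }

Because the two groups were small and need not have equal variance in their ratings, the unequal-variance form of the test is the appropriate choice here.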
There were significant disparities in the programming knowledge within the teams. Typically, the strongest programmer did most of the implementation. The other team members apparently did not feel qualified to contribute as much as their process paradigm dictated that they should.

Only one team (XP Sketchpad) actually implemented the new, high-priority user request introduced in the middle of the term. Both Calendar teams decided that the requested feature was of lower priority than other features, despite the user's insistence to the contrary. In this sense, neither team followed their assigned process. RUP and XP are quite different, but they share an emphasis on guiding development to create value for the customer.

Although we feel that this initial attempt to introduce and compare SE process paradigms was successful, it suffered from many limitations. Probably the most serious was the restricted time available for project work. Only about thirty hours of in-class time was devoted to the projects. We encouraged students to meet outside of class, but for graduate students employed full time this was very difficult. XP, especially, relies on close team interaction on a daily basis in order to facilitate communication, establish trust, and build the competence that allows all team members to share responsibility for the code. This sustained interaction was missing in our environment.

A second limitation was the disparity in skill and experience among different team members. All the undergraduate students had taken the SE course the previous year. However, their recall and understanding of concepts such as use cases and object models varied widely. Meanwhile, only three of the graduate students worked for software development organizations, and most had no prior academic exposure to SE. General programming skill was also a problem. Only three students out of 21 rated themselves as "highly skilled" in at least one programming language.

The questionnaire responses suggested that despite these limitations, most students found the project experience interesting (mean 1.48) and challenging (mean 2.4). Moreover, students agreed (mean 1.88) that they understood their assigned process paradigm and that their projects were successful (mean 2.16).

IV. CONCLUSION

This study introduced SE process paradigms into a classroom setting in the context of a full-term team project. We compared student teams working with the same user requirements, using either the Rational Unified Process or Extreme Programming. We were interested, first, in whether the teams could successfully apply their respective paradigms, and, second, in what kinds of differences we would see between the paradigms. Evidence suggests that all teams understood the process paradigms and made serious attempts to apply them. The RUP groups were more successful than the XP groups in applying their SE process paradigm. XP groups had difficulty following the core practices of test-driven development and pair programming. At the same time, the RUP groups agreed
more strongly that they would have preferred to use XP than the XP groups agreed that they would have preferred RUP.

Restricted development time and insufficient student programming skill limit the degree to which these results can be generalized to other classroom settings or to real-world software development teams. We are currently planning next year's course. We intend to provide more time for team interaction as well as to ensure that all participants have the skills they need to contribute to their teams. We may change the nature of the assigned applications to make them more amenable to automated unit testing. We will also introduce more objective measures of project quality. Hopefully, this will allow us to make more confident statements about the relative attractiveness and effectiveness of traditional versus agile process paradigms in a classroom environment.

REFERENCES

[1] K. Beck and C. Andres, eXtreme Programming Explained: Embrace Change (2nd Ed.), Boston, MA: Addison-Wesley Professional, 2004.
[2] Y. Dubinsky and O. Hazzan, "Improvement of software quality: introducing eXtreme Programming into a project-based course", Proceedings of the Fourteenth International Conference of the Israel Society for Quality, Jerusalem, Israel, Volume I, 2002, pp. 8-12.
[3] O. Hazzan and Y. Dubinsky, "Teaching a software development methodology: the case of eXtreme Programming", Proceedings of the 16th International Conference on Software Engineering Education and Training, Madrid, Spain, 2003, pp. 176-184.
[4] R. Kessler and N. Dykman, "Integrating traditional and agile processes in the classroom", Proceedings of ACM SIGCSE'07, March 7-10, 2007, Association for Computing Machinery, pp. 312-316.
[5] P. Kruchten, The Rational Unified Process: An Introduction. Boston, MA: Addison-Wesley Professional, 2003.
[6] N.F. LeJeune, "Teaching software engineering practices with eXtreme Programming", Proceedings of the CCSC Rocky Mountain Conference, Consortium for Computing Sciences in Colleges, 2005.
[7] S.J. Mellor and M.J. Balcer, Executable UML: A Foundation for Model Driven Architecture. Boston, MA: Addison-Wesley Professional, 2002.
[8] R.F. Roggio, "A model for the software engineering capstone sequence using the Rational Unified Process", Proceedings of ACM SE'06, March 10-12, 2006, Association for Computing Machinery, pp. 306-311.
[9] K. Schwaber and M. Beedle, Agile Software Development with Scrum. Englewood Cliffs, NJ: Prentice Hall, 2001.
[10] L.B. Sherrell and J.J. Robertson, "Pair programming and agile software development: experiences in a college setting", Proceedings of the CCSC Southeastern Conference, Consortium for Computing Sciences in Colleges, 2006.
[11] K. Stapel, D. Lübke, and E. Knauss, "Best practices in eXtreme Programming course design", Proceedings of ICSE'08, May 10-18, 2008, Association for Computing Machinery, pp. 769-775.
[12] L. Williams, R.R. Kessler, W. Cunningham, and R. Jeffries, "Strengthening the case for pair-programming", IEEE Software, Vol. 17, Issue 4, 2000, pp. 19-25.