TEST-DRIVEN DEVELOPMENT GOES TO SCHOOL*

Christopher G. Jones, PhD
Computer Information Technology & Education Department
Utah Valley State College
Orem, UT 84058-5999
(801) 863-8308
[email protected]

ABSTRACT

In industry experiments using test-driven development (TDD), some researchers report significantly increased code quality over traditional test-last approaches. Not surprisingly, computing and information technology educators have begun to call for the introduction of TDD into the curriculum. This paper explores the pedagogical experience to date in using a test-first approach in the classroom. Selected studies include four experience reports, one conceptual paper, and three experiments comparing TDD against control groups. Issues in operationalizing TDD across the curriculum are examined, including programming language assertion mechanisms, the feasibility of employing test frameworks, and the automated verification of student test plans. Recommendations derived from the literature are presented.

INTRODUCTION

Proponents of Test-Driven Development (TDD) assert that commercial software defect rates can be reduced by 18% to 50% when tests are written at the beginning rather than the end of the development cycle [7, 9]. TDD is one of the 12 key practices of Extreme Programming, a popular agile software development methodology [2]. In TDD, synonymously known as TFD (Test-First Design), TFP (Test-First Programming), and TDD (Test-Driven Design), software developers "test first, then code" [2, p. 9]. By focusing up front on the verification and validation of software requirements, the design organically evolves as new code is written to satisfy the failing tests [12].

* Copyright © 2004 by the Consortium for Computing Sciences in Colleges. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the CCSC copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Consortium for Computing Sciences in Colleges. To copy otherwise, or to republish, requires a fee and/or specific permission.

With the promise of significantly increased code quality, computing and information technology educators have begun to call for the introduction of test-driven development into the curriculum [3, 12, 14]. This paper reviews the pedagogical experience to date in using TDD in the classroom, beginning with the introductory programming course. Issues in operationalizing TDD across the curriculum are explored, including programming language assertion mechanisms, the feasibility of employing test frameworks, and the automated verification of student test plans.

METHODS AND PROCEDURES

To date, little research has been done to synthesize the reported findings on the application of TDD in the college classroom. To address this need, a systematic review of the academic literature was undertaken. The studies considered for inclusion in the analysis were identified from four sources: (a) on-line searches of the World Wide Web using key words and key word combinations of such terms and acronyms as test-driven development (TDD), test-first design (TFD), test-first programming (TFP), test-first coding (TFC), test-driven design (TDD), and teaching; (b) on-line searches for the same terms and acronyms in key computing electronic archives, including the ACM Digital Library, IEEE Computer Society Digital Library, Association for Information Systems eLibrary, Academic Search Elite, and EBSCOHost; (c) bibliographic references cited in the studies located through on-line searches and periodical reviews; and (d) a review of websites dedicated to computer science and information systems education.

A total of 23 academic articles related to computing education and test-driven development were located through these search procedures. Of these, 15 studies discussed TDD only peripherally as part of a larger effort to introduce Extreme Programming into the classroom. These were excluded from the candidate pool, leaving eight studies (Table 1) that satisfied the following criteria: (a) the primary thrust of the study had to explore the merits of TDD as a pedagogical approach to software development; (b) the study had to be targeted to an educational setting, with no restrictions on grade level or delivery institution; and (c) due to cost and time considerations, the study had to be published and available from university libraries or readily available over the Internet.

Table 1. Studies Reviewed

1. Barriocanal, E., Urbán, M., Cuevas, I., & Pérez, P., "An Experience in Integrating Automated Unit Testing Practices in an Introductory Programming Course" [1]
2. Elkner, J., "Using Test Driven Development in a Computer Science Classroom: A First Experience" [5]
3. Steinberg, D., "The Effect of Unit Tests on Entry Points, Coupling and Cohesion in an Introductory Java Programming Course" [15]
4. Mugridge, R., "Challenges in Teaching Test Driven Development" [10]
5. Olan, M., "Unit Testing: Test Early, Test Often" [12]
6. Edwards, S., "Using Test-Driven Development in the Classroom: Providing Students with Automatic, Concrete Feedback on Performance" [4]
7. Kaufmann, R., & Janzen, D., "Implications of Test-Driven Development: A Pilot Study" [8]
8. Müller, M. M., & Hagner, O., "Experiment About Test-First Programming" [11]

The eight studies selected for review ranged from experience reports to experimental designs. As Table 2 indicates, 50% of the studies identified were experiential in nature, primarily providing anecdotal narrative about attempts to integrate test-driven development into introductory programming courses or software engineering/software project courses. Study narratives did not provide any evidence of random selection of students, use of control groups with a corresponding random assignment of students to control and experimental groups, or use of any instrumentation. While study validity (both internal and external) is in question, these experience reports do yield some interesting observations and classroom wisdom. Only three of the eight studies identified showed any methodological rigor. All three involved controlled experiments comparing TDD with traditional program testing. Table 3 summarizes the details of these studies.

Table 2. Study Characteristics (n = 8)

Study Type
  Experience report/Case history: f = 4 (50.0%)
  Experimental design: f = 3 (37.5%)
  Conceptual study: f = 1 (12.5%)

Course Level
  High school: f = 1 (12.5%), comprising Introduction to programming (f = 1)
  College, lower division: f = 3 (37.5%), comprising Introduction to programming (f = 3)
  College, upper division: f = 3 (37.5%), comprising Comparative languages (f = 1), Software engineering (f = 1), Software studio (f = 1)
  College, graduate studies: f = 1 (12.5%)

Testing Framework
  XUnit: f = 6 (75.0%), comprising JUnit (f = 5), tpUnit (f = 1)
  assertEquals() function: f = 1 (12.5%)
  Web-CAT custom framework: f = 1 (12.5%)

Programming Language Environment
  Java: f = 6 (75.0%)
  Python: f = 1 (12.5%)
  Turbo Pascal: f = 1 (12.5%)

A test-driven development approach to programming and software construction requires additional platform infrastructure over and above the traditional editor/compiler combination or the usual integrated development environment. Six of the studies referenced XUnit [16] as the automated test harness, with five of the six citing JUnit as the specific framework. In one study, functional tests were performed by students using nothing more than an assertion macro. In another, students were required to develop against a proprietary testing framework that provided custom feedback. The dominant programming language used in the TDD studies was Java (75%). This is not surprising, as JUnit was the leading test harness. Other languages used included Python and Turbo Pascal.

EXPERIENCE REPORTS ON TDD IN THE CLASSROOM

Of the four experience reports, three [1, 5, 15] investigate integrating TDD into the introductory programming course; the fourth [10] examines the use of TDD in a traditional software engineering course. Study syntheses follow, grouped by course topic.

Study 1: University of Alcalá, Madrid, Spain

Barriocanal et al. report on a series of classroom experiments in 2002 designed to answer two questions: (a) Does unit testing improve code quality in an introductory programming course (CS1)? and (b) Do CS1 students "enjoy" the unit-testing approach? Subjects included 100 students enrolled in an introductory procedural programming course (3 hours lecture/2 hours lab) at the University of Alcalá, Madrid, Spain. Turbo Pascal was used as the teaching language. Students were provided a testing framework


(tpUnit), a half-day session on unit testing and test frameworks, and web-based documentation and examples. Use of unit testing was optional for course assignments. Only 10% of the students (around 10 subjects) chose to develop test cases for their course assignments. Seventy percent of the subjects using tpUnit reported that the framework was easy to use (4 on a 5-point Likert scale). All subjects using tpUnit reported (4 on a 5-point scale) that they felt unit testing improved code quality. The authors concluded that, with 90% of the students choosing not to unit test, students as a whole do not enjoy unit testing. However, those who did unit test offered some evidence that unit testing may improve code quality, as "students that used the framework scored high in their assignment's evaluation" [1, p. 127]. Barriocanal et al. summarized their experience with TDD in the introductory programming course as follows: "The results of the experiment points out that a straightforward approach for the integration of unit testing in first-semester courses do[es] not result in the expected outcomes in term of students' engagement in the practice" [1, p. 125].

Study 2: Yorktown High School, Arlington, Virginia, USA

Elkner [5] describes the introduction of test-driven development in a secondary education introductory procedural programming course using Python as the pedagogical language. With the help of an experienced mentor from industry, Elkner exposed his students to Extreme Programming [2] practices including pair programming and unit testing. Students were given an assignment to use TDD to develop a Pig Latin translator. (Pig Latin is a pseudo-language in which words are formed by rearranging consonants and adding an "ay" suffix. For example, the English sentence "Pig Latin is fun" would be translated into "Igpay atinlay is unfay.") Elkner demonstrated the use of test-driven development to his class by writing tests and code to determine the pigPrefix. Students were then asked to pair program using TDD and complete the Pig Latin translator by writing functions to create the pigStem and the pigWord. To avoid the overhead of an XUnit-style testing framework, Elkner provided students with an assertion macro [syntax: assertEquals(actual, expected, test message)].

Elkner reported that student enthusiasm for TDD and the translator assignment was high. "All the students were actively engaged in programming for the entire ninety minutes of class. Several pairs kept working after the bell rang, working into their break" [5]. The assignment completion rate was higher than for previous class assignments that didn't require TDD. Fewer students complained of not knowing where to start in solving the problem. An unexpected side effect of TDD was that, by focusing on tests first, the instructor could communicate more effectively and succinctly what was expected.

Study 3: John Carroll University, University Heights, Ohio, USA

In the Steinberg study [15], the author recounted his experience with an Extreme Programming (XP) study group composed of computer science faculty and upper-level undergraduate students. The group met periodically as preparation for delivering an introductory programming course that integrated XP practices. As part of its effort, the study group examined the issues surrounding unit testing in an introductory course. JUnit was used as the testing framework and Java as the programming language.
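To make the test-first cycle concrete, the sketch below shows the kind of first JUnit test a student might write before any implementation exists. The PigLatin class and its translate() method are hypothetical names chosen only to echo Elkner's exercise; none of the reviewed studies publish their actual test code. The test is written first, fails until the student supplies a minimal implementation, and then passes, completing one red-green step. Note also that the test class has no main() method of its own; the JUnit runner drives execution, which is the property Steinberg's group exploits below.

```java
import junit.framework.TestCase;

// A minimal, hypothetical first test in a TDD cycle. It is written before the
// PigLatin class exists, fails ("red") until the student writes just enough
// code to satisfy it ("green"), and then drives the next test or refactoring.
public class PigLatinTest extends TestCase {

    public void testTranslatesWordStartingWithConsonant() {
        PigLatin translator = new PigLatin();
        // "pig" -> move the leading consonant to the end and add "ay"
        assertEquals("igpay", translator.translate("pig"));
    }
}

// Just enough implementation to make the test above pass; later tests would
// force the handling of vowels, capitalization, whole sentences, and so on.
class PigLatin {
    public String translate(String word) {
        return word.substring(1) + word.charAt(0) + "ay";
    }
}
```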

In Java, as with many languages having a C pedigree, program execution begins with the entry point main(). A variety of approaches have been developed for helping students understand the role of main(), from providing students with nothing more than a pre-built template to giving students a complete explanation of the full method signature (public static void main(String[] args)). Using JUnit to drive program execution, on the other hand, hides main() within the testing framework. Students can focus on writing and running tests rather than wrestling with advanced concepts such as access modifiers, void return values, class methods, string objects, and arrays before they even begin to understand program construction.

In his study, Steinberg suggested that using TDD and JUnit in an introductory course enables deferral of the discussion of the non-object-oriented concept of program bootstrapping via main(). Instead, students focused on creating and accessing objects, creating and calling methods, and using variables. Students came to view an application as a "collection of services" [15]. By writing tests first, students learned to write "code based on the specifications of the unit tests" [15]. Student code was "more cohesive", the coupling looser. "Objects had more clearly defined responsibilities" [15]. Even refactoring [6] was a possibility as students' projects evolved from the "simplest thing that could possibly work" to refined designs flexible enough to accommodate last-minute extensions to the specification.

After experimenting with the application of various XP practices in an introductory programming course, Steinberg [15] concluded that unit testing was the "most important" of the 12 practices. Benefits included (a) the deferral of coverage of main() as an entry point, and (b) a change in classroom dynamics that saw students collaborating more freely in addressing program issues.

Study 4: University of Auckland, Auckland, New Zealand

Mugridge [10] is unabashedly optimistic about the future of test-driven development. Early in his paper he proclaimed: "TDD is likely to have as much impact on programming practice in the 2000s as structured programming had in the 1970s and object-orientation had in the 1980s and 1990s" [10, p. 1]. At the University of Auckland, he has been experimenting with TDD in an undergraduate software engineering course for over two years. In year one of the course, Mugridge introduced test-first programming using JUnit as the automated test platform. All classroom demonstrations and assignments were reworked to support a test-first perspective. By year two, unit testing and JUnit coverage had been migrated down to a prerequisite course. Students now entered software engineering with a general TDD skill set. This allowed Mugridge to move beyond simple unit testing to method and class refactoring, an essential practice undergirding the XP approach to evolutionary design. Ultimately, Mugridge plans to integrate TDD across the curriculum. "To be most effective," says Mugridge, "TDD needs to be included as a fundamental element of programming from the very beginning" [10, p. 4].

Over the two-year TDD curricular experiment, Mugridge [10] identified several challenges to teaching test-driven development. First, students who have struggled to learn the test-last approach to software development and are now somewhat comfortable

with it, must unlearn last-minute testing in order to explore test-first. Using tests to specify the design rather than as a metric for measuring design adherence requires a shift in mindset. Second, using TDD effectively as a design approach requires a set of skills not often covered adequately in previous coursework. The result is that curriculum design must be augmented to include coverage of: (a) tests and test-case writing, (b) use of automated testing frameworks, (c) message tracing to troubleshoot test failure, (d) writing the simplest possible implementation code to meet test specifications, (e) method and class refactoring, (f) common refactoring patterns [6], and (g) packaging unit tests into appropriate acceptance tests. For introductory programming courses, Mugridge suggests a reduced set of TDD skills and a lightweight testing framework, something as simple as the assert statement available in modern programming languages.

Study 5: Richard Stockton College, Pomona, New Jersey, USA

In "Unit Testing: Test Early, Test Often", Michael Olan investigates the possibility of "gently introducing testing early" [12, p. 319]. In this conceptual paper, Olan recommends that IS educators "plant the seeds of unit testing" as early as the introductory programming course (CS1). Specific suggestions include: (a) assigning students to write a test driver for an instructor-supplied class, (b) using pedagogically oriented integrated development environments such as BlueJ [13] to interactively test classes and methods without having to write a test driver, and (c) using a debugger to trace execution and inspect object state. Of course, interactive testing is limited to a single test at a time and requires human judgment to analyze output. For those ready to automate the unit testing process, Olan recommends an XUnit-style [16] framework such as JUnit, either standalone or integrated into an IDE such as the open source Eclipse. As students advance with unit testing, additional tools may be required for dealing with the challenges of user interface testing, server-side code testing, or database access testing. Olan offers a list of possible XUnit extensions to simplify testing in these complex environments. From an instructor perspective, Olan addresses the issue of automated grading of student projects and test cases, recommending the use of automated submissions, build tools, and IDEs with integrated testing frameworks.

PUTTING TDD TO THE TEST USING EXPERIMENTAL DESIGN

Of the eight studies selected for review, three involved classroom experiments in which students were divided into two groups, experimental and control. Table 3 summarizes the research design for each study along with the dependent variables examined. Statistically significant differences are noted in the results column.


Table 3. Experimental Design Research

Study 6: Multiple factor (one-at-a-time), posttest-only, inter-subject design (one experimental and one control group); 59 subjects.
  Software quality/reliability: TDD significantly higher
  Programmer confidence: TDD significantly higher
  Preference for TDD: Significantly higher after using TDD

Study 7: Multiple factor (one-at-a-time), posttest-only, inter-subject design (one experimental and one control group); 8 subjects.
  Software quality/reliability: TDD significantly higher
  Programmer productivity: TDD significantly higher
  Programmer confidence: TDD significantly higher

Study 8: Single factor, posttest-only, inter-subject design (one experimental and one control group); 19 subjects.
  Software quality/reliability: No significant difference
  Programmer productivity: No significant difference
  Program understanding: TDD significantly higher

Study 6: Virginia Polytechnic Institute and State University, Blacksburg, Virginia, USA

At Virginia Tech, computer science students submit programming assignments to Curator, an automated grading system that provides valuable feedback on program compilation and performance against instructor-provided test data [4]. Unfortunately, according to Edwards, Curator encourages students to focus on "output correctness" rather than design or functional testing. As part of an effort to introduce TDD into an upper division computer science comparative languages course, Edwards was involved in reengineering Curator. The result, the Web-based Center for Automated Testing (Web-CAT), grades both student code and student tests.

In Spring 2003, fifty-nine subjects completed course assignments for the comparative languages course using Web-CAT. A simple test framework was developed to execute test cases written in any one of the languages covered in the course (e.g., Pascal, Scheme, Prolog). Before using Web-CAT to submit test cases for grading, students were given 30 minutes of instruction in TDD and in how to use the system to submit, execute, and interpret test results. Demonstrations of TDD in action were provided throughout the class sessions [4].

Assignments were graded for software quality by examining test validity, test completeness, and code correctness. Test validity measured the accuracy of the student's expected output against the problem assignment. Test completeness relied on a basis path coverage analysis. Code correctness measured the percentage of student tests that the submitted code actually passed. Using these three software quality measures, Edwards compared the results with those of the 59 non-TDD students who completed the comparative languages course in Spring 2001. The quasi-experimental design yielded some interesting results, all of which were statistically significant at an α of 0.05. On all three quality measures, TDD students outperformed the non-TDD students. The mean test validity was 94.0% for the TDD group versus 76.8% for the non-TDD subjects. Test completeness was 93.6% for TDD; 90.0% for non-TDD. Code correctness, measured as the number of defects per thousand lines of code, was only 38 for the TDD group compared to 70 for those not using test-driven development [4].
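The quality measures above reduce to simple ratios over counts that a grading system can extract from a submission. The sketch below illustrates the arithmetic with hypothetical counts; the variable names and figures are invented for the example and are not drawn from Edwards' actual Web-CAT scoring code.

```java
// Hypothetical illustration of the quality measures described above.
// All counts are invented; Web-CAT's real scoring logic is not reproduced
// in the paper under review.
public class QualityMeasures {
    public static void main(String[] args) {
        int testsWritten = 25;                 // tests the student submitted
        int testsWithValidExpectedOutput = 23; // expected output matches the assignment
        int testsPassedBySubmittedCode = 22;   // tests the student's own code passes
        int defectsFound = 3;
        int nonCommentLinesOfCode = 450;

        double testValidity = 100.0 * testsWithValidExpectedOutput / testsWritten;   // 92.0%
        double codeCorrectness = 100.0 * testsPassedBySubmittedCode / testsWritten;  // 88.0%
        double defectsPerKloc = defectsFound / (nonCommentLinesOfCode / 1000.0);     // ~6.7

        System.out.printf("test validity = %.1f%%, code correctness = %.1f%%, defects/KLOC = %.1f%n",
                testValidity, codeCorrectness, defectsPerKloc);
    }
}
```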

At the conclusion of the Spring 2003 course, students were asked to complete an anonymous survey designed to measure programmer confidence levels and preference for the TDD approach. According to the self-reported data, 65.3% of students either agreed or strongly agreed that TDD increased their confidence in the correctness of their code. In addition, 67.3% either agreed or strongly agreed that TDD increased their confidence when making changes to their code base. Finally, 73.5% of students either agreed or strongly agreed that they would prefer to use TDD and Web-CAT for program testing and submission even if it were not required. Edwards was so pleased with the results of the classroom experiment that he plans to migrate TDD to CS1. "We plan," says Edwards, "to apply this technique in our introductory programming sequence, where students will program in Java and use JUnit" [4].

Study 7: Bethel College, North Newton, Kansas, USA

In the Bethel College study, Kaufmann and Janzen [8] conducted a controlled experiment comparing test-first programming against test-last programming. Subjects were sophomore to senior level students in an elective upper division Software Studio course. Students self-selected into two separate groups of four students each, one forming the test-first group; the other, the test-last group. Student projects were also self-selected, with the TDD group developing an adventure game and the non-TDD group writing an action game using stick figures. All subjects were computer science majors with at least two programming courses in C++.

To compare the two groups, three dependent variables were analyzed: design quality, programmer productivity, and programmer confidence. Design quality was assessed using four metric tools (JavaNCSS, JDepend, JMetric, and CCCC) to measure structural and object-oriented design complexity. Programmer productivity was measured using a count of non-commented lines of code and program size. Programmer confidence was assessed through an end-of-course survey [8].

Three times throughout the course, snapshots were taken of student code and analyzed using the metric tools. While both groups developed projects within "reasonably safe bounds" of complexity, cohesion, and coupling, the test-last group created classes with "more than twice the information flow measure (square of fan-in and fan-out)" of comparable classes in the TDD group [8, p. 299]. Kaufmann and Janzen concluded that the test-last group had a tendency toward overburdened classes. The test-first group produced 50% more code than the test-last group. As a check on the reasonableness of using lines of non-commented code as a metric for programmer productivity, a complexity analysis of the source code for both groups was undertaken. The analysis revealed that the two applications were equally complex, confirming that the test-first group was, indeed, more productive [8].

The final metric evaluated was programmer confidence. On a scale of 1 to 5, with 5 representing the highest confidence level, students in the test-first group reported more confidence (M = 4.75) in the functionality of their project than the test-last group (M = 2.5). The test-first group also felt that TDD was effective in helping with debugging and design (M = 4.25) [8].
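For readers unfamiliar with the information flow measure quoted above, the sketch below shows the usual Henry-Kafura style calculation, in which a module's fan-in and fan-out are multiplied and the product squared. The exact formulation used by the metric tools in the study may differ, and the example numbers are purely illustrative.

```java
// Illustrative only (not taken from the study): a Henry-Kafura style
// information flow value grows quadratically with coupling, which is why a
// class with "more than twice the information flow measure" of its peers
// suggests an overburdened design.
public class InformationFlow {
    static long informationFlow(int fanIn, int fanOut) {
        long product = (long) fanIn * fanOut;
        return product * product; // (fan-in * fan-out)^2
    }

    public static void main(String[] args) {
        System.out.println(informationFlow(3, 4));  // 144
        System.out.println(informationFlow(6, 8));  // 2304: doubling both counts
                                                    // multiplies the value by 16
    }
}
```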


While the Kaufmann and Janzen [8] study tends to confirm claims by XP proponents as to the advantages of TDD, caution is warranted. The sample size was small (n = 8), the groups were self-selecting, the test-first group had greater programming experience, and the projects were not identical. Still, the results suggest there may be some truth to the promised TDD benefits of increased productivity and code quality.

Study 8: University of Karlsruhe, Karlsruhe, Germany

Müller and Hagner's study presents the results of a controlled experiment comparing test-driven design with traditional development in an academic setting [11]. Nineteen University of Karlsruhe graduate students in computer science participated in the experiment. Students were divided into two groups (test-first, n = 10; control group, n = 9) of equal programming experience. Prior to the experiment, all students completed a one-semester course on XP methodology. Students were given the assignment to implement the main class of a graphic library, given only the method prototypes. Testing and development were to be completed on an individual basis. Subjects were informed that a comprehensive acceptance test would be run against the final project deliverable. For all development, Java was used as the programming language; JUnit as the testing framework.

Work was divided into two phases: implementation and acceptance test. Students worked on the project during a semester break between July 2001 and August 2001. An automatic source code monitoring application non-intrusively recorded logins, compilations, and program output. Data were collected on three measures: programmer productivity, software quality, and program understanding. "Logged time spent" was used to assess programmer productivity. Software quality metrics included (a) the proportion of code assertions passed out of all possible executable assertions, and (b) branch coverage of the program logic (see the brief sketch following the findings below). As a surrogate for program understanding, Müller and Hagner [11] tabulated the number of reused methods and the incidence of failed method calls. Comparison of the experimental group with the control group yielded the following findings:

• There was no statistically significant difference in programmer productivity between the TDD and traditional groups as measured by time spent. Müller and Hagner concluded that developers converting to a TDD approach will not necessarily program faster than test-last practitioners.

• There was no statistically significant difference in software quality between TDD and traditional development, whether measured as branch coverage or as the percentage of assertions passed. The practice of test-first programming did not result in increased code reliability.

• There was a statistically significant difference in code reuse between the TDD and control groups. The test-first developers tended to reuse existing methods more quickly and with more accuracy.
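As a brief illustration of the branch coverage measure used above (the example is hypothetical and unrelated to the graphic library assignment in the study), consider a method with a single if/else: a test exercising only the true branch covers 50% of the branches, and a second test for the false branch is needed to reach 100%.

```java
import junit.framework.TestCase;

// Hypothetical illustration of branch coverage as a quality measure.
// classify() contains two branches; running only testNonNegative() executes
// one of them (50% branch coverage), while adding testNegative() brings the
// test suite to 100% branch coverage.
public class BranchCoverageExample extends TestCase {

    static String classify(int n) {
        if (n >= 0) {
            return "non-negative";  // branch 1
        } else {
            return "negative";      // branch 2
        }
    }

    public void testNonNegative() {
        assertEquals("non-negative", classify(7));
    }

    public void testNegative() {
        assertEquals("negative", classify(-3));
    }
}
```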

While probably the most rigorous of the three experimental design research efforts reviewed, the University of Karlsruhe [11] study still suffers from the usual threats to external validity. Group sizes were small (10 and 9, respectively), with subjects who might

not be representative of professional programmers. Study subjects had only one semester of exposure to XP practices and may not have been entirely comfortable with the test-first approach. Lastly, the graphic library problem set may not be representative of the typical development activities encountered on the job.

CONCLUSION

The test-first approach to software development has gained considerable attention in the academic community as a possible "silver bullet" for improving the code quality of information technology and computing science students. Early TDD literature tended to portray test-first as one of 12 integral practices comprising the whole that is Extreme Programming [2]. More recently, test-driven development has emerged as a stand-alone topic worthy of exploration and research.

The literature dedicated to TDD is growing. Experience reports provide anecdotal narrative on the benefits and challenges of integrating TDD into the curriculum, whether in upper division software development courses or in the introductory programming course. Early adopters tend to be in Computer Science departments, use Java as the teaching language, and adopt JUnit as the testing framework. Classroom outcomes vary. Some instructors are exuberant, claiming they have finally found a way to help students "get it right." Others are more guarded. TDD is not cost-free. It requires knowledge of testing frameworks and skill in their use, an understanding of refactoring, and an unlearning of old habits from test-last development.

Rigorous evaluation of the purported benefits of TDD yields mixed results. Two of the three experimental design studies reported significantly higher quality when software is developed using TDD. A third study, perhaps the most methodologically sound, found no reportable difference. With regard to programmer productivity, one study reported a significant increase when counting non-commented lines of code, while another study found no difference when comparing time spent. For those students who did use TDD, two studies found confidence significantly increased. Finally, one study found program understanding was significantly higher for students using TDD than for those who did not.

At this point in TDD practice, it is too early to conclude that the literature is clearly on the side of test-driven development. What is clear is that the test-first paradigm shows promise as a means for helping students achieve a verifiable design specification. Further research is warranted. To paraphrase Extreme Programming guru Kent Beck, "Anything that can't be verified doesn't exist" [2, p. 45].

REFERENCES

[1] Barriocanal, E. G., Urbán, M. S., Cuevas, I. A., Pérez, P. D., An experience in integrating automated unit testing practices in an introductory programming course, ACM SIGCSE Bulletin, 34, (4), 125-128, December 2002.

[2] Beck, K., Extreme Programming Explained: Embrace Change, Boston, MA: Addison-Wesley, 2000.

[3] Edwards, S. H., Rethinking computer science education from a test-first perspective, Proceedings of the 18th Annual ACM SIGPLAN Conference on Object-oriented

Programming, Systems, Languages, and Applications: Educators' Symposium, 148-155, 2003.

[4] Edwards, S. H., Using test-driven development in the classroom: Providing students with automatic, concrete feedback on performance, Proceedings of the International Conference on Education and Information Systems: Technologies and Applications (EISTA '03), August 2003.

[5] Elkner, J., Using test driven development in a computer science classroom: A first experience, February 2003, www.elkner.net/jeff/testFirst/index.html, retrieved April 1, 2004.

[6] Fowler, M., Refactoring: Improving the Design of Existing Code, Boston, MA: Addison-Wesley, 2000.

[7] George, B., Williams, L., An initial investigation of test driven development in industry, ACM SAC, Melbourne, FL, 2003.

[8] Kaufmann, R., Janzen, D., Implications of test-driven development: A pilot study, Companion of the 18th Annual ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications (OOPSLA '03), 298-299, October 2003.

[9] Maximilien, E. M., Williams, L., Assessing test-driven development at IBM, Proceedings of the 25th International Conference on Software Engineering, 564-569, 2003.

[10] Mugridge, R., Challenges in teaching test driven development, Proceedings of XP 2003, 410-413, May 2003.

[11] Müller, M. M., Hagner, O., Experiment about test-first programming, IEE Proceedings - Software, 149, (5), 131-136, 2002.

[12] Olan, M., Unit testing: Test early, test often, The Journal of Computing in Small Colleges, 19, (2), 185-192, December 2003.

[13] Patterson, A., Kölling, M., Rosenberg, J., Introducing unit testing with BlueJ, Proceedings of the 8th Conference on Information Technology in Computer Science Education (ITiCSE 2003), Thessaloniki, Greece, 11-15, 2003.

[14] Smith, S., Stoecklin, S., What we can learn from Extreme Programming, The Journal of Computing in Small Colleges, 17, (2), 144-151, December 2001.

[15] Steinberg, D. H., The effect of unit tests on entry points, coupling and cohesion in an introductory Java programming course, Proceedings of 2001 XP Universe, July 2001.

[16] XUnit website [Computer software], www.xprogramming.com/software.htm, retrieved April 15, 2004.
