The Design of Online Tests for Computer Science I and Their Effectiveness

Amruth N. Kumar
Ramapo College of New Jersey, Mahwah, NJ 07430
[email protected]

Abstract – Solving problems on a computer is the focus of Computer Science I. Therefore, it may be more effective to hold tests online rather than in written form in this course. Yet, online tests are not widely used, because of a perceived need for elaborate technological infrastructure and because of concerns about the feasibility of holding a fair, fixed-duration online test that is not intimidating to students. In this paper, we present our design of online tests for Computer Science I, which does not require elaborate technological infrastructure. We also evaluate the effectiveness of holding tests online in Computer Science I. We found that students like online tests, that students tend to do better on online tests than on written tests, and that there is a correlation between the number of projects a student completes in the course and the student's grade on online tests.

Introduction

In the past, we have held written tests in Computer Science I: a typical test is two hours long, and students are required to answer 6 questions, each with 2-3 subsections. The test is open-book, but with no access to a compiler or computer. The questions assess the ability of the student to synthesize and analyze programming constructs, and to write small programs.

Online testing in Computer Science I has been only scantily reported in the literature: we found one reference to it [1] in the last five years of SIGCSE conference proceedings (1994-98) and none in the last two years of FIE proceedings (1997-98). We decided to try online testing on a trial basis at our institution. We wanted to check whether there was a correlation between online test scores and a student's problem-solving skills.

Testing students online is challenging: we must strike the right balance between the complexity of a question and the feasibility of answering it in a limited amount of time. We must ensure that the incidentals of editing, compiling and running a program do not get in the way of actual problem solving. We must minimize the stress on students from taking such tests, and the stress on the instructor from setting and grading such tests.

In Fall 98, this instructor had the unique opportunity to teach three sections of Computer Science I, which meant a test audience of over 100 students. Traditionally, we administer two written tests and a written final in Computer Science I. In Fall 98, we decided to administer a written first test, but an online second test and an online final, and to compare the results. In the rest of the paper, we will:

1. analyze written versus online tests;
2. describe how we held online tests on a Unix platform without elaborate additional technological infrastructure; and
3. present our preliminary analysis of the effects of testing online versus holding written tests.

Written Versus Online Tests

At the outset, online testing, where a student is asked to write and run a program on a computer during the test, seems more appropriate for measuring the problem-solving skills that students are expected to acquire in Computer Science I. The primary difference between written and online tests is that students have access to a compiler during online tests. It is not clear, though, that this by itself would make an online test more effective than a written test. Therefore, in this section, we will analyze the pros and cons of making a compiler available during the test.

A Computer Science I test would necessarily include questions that require students to write programs during the test. How do we treat syntax errors in such programs?

• A syntax error can give rise to a semantic error. E.g., in C, leaving out the braces of the else-clause of an if-else statement can dramatically alter the semantics of the code (see the sketch below). Therefore, we cannot always distinguish a syntax error from a semantic error. By extension, we cannot simply ignore syntax errors in a program.
• On the other hand, penalizing students taking a written test for errors that are of syntactic origin is harsh: even experienced programmers make such mistakes when simply writing out programs with pen and paper. Besides, this form of testing is not true to the practice of the computing profession: the test expects more than what is normally expected of experienced practitioners, even assuming the test is open-book.
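As a concrete illustration of the first point, consider the following minimal C sketch. The specific program is our own and is not taken from any of our tests; it only shows how a dropped pair of braces becomes a change of meaning rather than a compile-time error.

    #include <stdio.h>

    int main(void) {
        int balance = -5;   /* a hypothetical account balance */

        /* Intended logic: if the balance is negative, warn the user AND reset it. */
        if (balance >= 0)
            printf("Balance OK\n");
        else {
            printf("Warning: negative balance\n");
            balance = 0;
        }

        /* If the braces around the else-clause are omitted, the program still
         * compiles: only the printf() belongs to the else, and balance = 0 runs
         * unconditionally. The "syntax" slip silently changes the semantics. */
        printf("Final balance: %d\n", balance);
        return 0;
    }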


Therefore, online tests are preferable to written tests in Computer Science I. Unlike in a written test, students have access to a compiler in an online test, so they can locate and eliminate syntax errors in their programs during the test. There is no need for the instructor to overlook syntax errors when grading the tests. In other words, the instructor can interpret syntax errors as an unambiguous indication that a student is not well prepared.

It is true that students may have to spend more time on syntax in an online test than in a written test. But the process of debugging a program further amplifies the difference between well-prepared and ill-prepared students: students who have not had sufficient practice completing programs will be unfamiliar with compiler errors and the process of debugging, and will spend more time than well-prepared students on debugging their programs. Therefore, online tests penalize lack of practice. In order to ensure that an online test is more than a test of syntax, though, we make the following provisions:

• The test will contain fewer questions than a written test, so that students will have more time to spend per question. The questions will be more in-depth than the questions in a written test, so that students will have ample opportunity to demonstrate their understanding of semantics even after spending time on debugging syntax.
• Students may ask the instructor for hints during the test. This provision helps students overcome mental blocks during the test.

Developing an Online Test Apart from designing a test that is reasonable in its expectations, online testing comes with other concerns such as ensuring a testing experience for students that is not significantly more stressful than written testing; preventing cheating from online sources during the test; and setting up the test so as not to encumber excessive amount of grading time and effort for the instructor. In this section, we will present details of our design of online tests that will address these concerns. The Setup: Students were assigned class accounts on a Unix machine at the beginning of the semester. The class accounts were set up so that the instructor could log into the accounts without a password. Since students have separate user accounts, privacy was not an issue. The Test Format: Our online test consisted of two questions: Question 1: Given a program, students were asked to debug it. The program contained both syntax and semantic errors. Grades were weighed in favor of semantic errors than syntax errors. A description of the problem being solved was included in comments at the top of the file. The file with the program listing was posted on the Web. The instructions for downloading the program from the Web were included in the test handout so that the students did not have to re-type the program. A listing of the program was also included in the test handout since it is more convenient to debug a program on paper than online. Question 2: Given a problem, students were asked to write a program to solve it, compile the program and debug it. The problem was broken down into several discrete steps. The points distribution for these steps was clearly listed in the test handout. For both the questions, a couple of sets of sample inputs/outputs were listed, so that students could use them to test their programs. Minimizing Student Stress:
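The following minimal C sketch is our own illustration of the flavor of a Question 1 program; it is not the program used on any actual test. The listing below is the corrected version, with comments indicating where errors of each kind might be seeded in the handout copy.

    /* Problem description (as it would appear in comments at the top of the
     * handout file): read an integer n and print the sum 1 + 2 + ... + n. */
    #include <stdio.h>

    int main(void) {
        int n, i, sum;

        printf("Enter n: ");
        scanf("%d", &n);

        sum = 0;                 /* a seeded semantic error might initialize sum to 1 */
        for (i = 1; i <= n; i++) /* ...or use i < n, silently dropping the last term  */
            sum = sum + i;       /* a seeded syntax error might omit this semicolon   */

        printf("Sum = %d\n", sum);
        return 0;
    }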

Minimizing Student Stress:

• Students were assured that partial credit would be given for incomplete programs: they could earn points for sub-parts of a program even if they could not get the whole program to work correctly.
• During the test, students were allowed to ask the instructor for hint(s), on the condition that they would forfeit the points for the sub-problem(s) to which the hint(s) pertained. This provision was meant to prevent students from failing an entire test if they got stuck on a small sub-problem. About 20% of the students took advantage of this offer.

Answering the Test:

• Students were asked to create a directory with a given name, and to enter answers only within that directory. Further, students were asked to name their program files one and two for the two questions, respectively. An occasional student (1 in 30) failed to follow these directions and saved the test files in some other directory. However, since the names of the files were pre-determined, the errant files could be easily located in the account.
• Students were handed bluebooks for rough work. When a student asked for a hint, we wrote it down on the inside front cover of the bluebook. This served two purposes:
  o Since students were fairly closely seated in the laboratory, writing ensured that the hint would not be overheard by, and thereby benefit, the next student.
  o Since students were asked to submit their bluebooks at the end of the test, the written hint(s) served as a log of the points to be deducted from each student's final score on the test.
  However, students were advised that they would be graded only on what was in the test files in their account, and not on anything they wrote in the bluebook.
• Students were asked to demonstrate the running of their programs before leaving the laboratory, by testing their programs on the sample inputs/outputs listed in the handout. This gave the student an opportunity to confirm, in the presence of the instructor, that the program had indeed been completed.
• During the demonstration, we noted the following in the bluebook against each question: whether the program compiled successfully, whether the program ran, and whether the program produced correct results on the given sample input/output sets.
• Students were prohibited from visiting the Web (except to download the first program) and from accessing email during the test.
• Students were asked not to modify the test directory after the test until the grades were announced. They were warned that if either program file was altered (even accidentally), they would be denied credit for the question.


Since Computer Science I students are not usually Unix-savvy, we were not worried about students modifying their files and successfully concealing evidence of having done so. We had only one instance in which we ended up denying credit to a student for a question.

Grading the Tests:

• After the test, we logged into each student's account and reviewed the programs online. Our objective was to read through the programs and award points for correctness, robustness of input (e.g., does the program handle incorrect inputs from the user?), generality of the solution (e.g., does the program handle both the cases for character input?), and the style of coding. Since we had already verified whether each program compiled and ran, we did not have to compile and run the programs again. (A small sketch of the kind of input handling we rewarded appears after this list.)
• We noted the points awarded for each question and sub-question in the bluebook. We returned the marked bluebook to the student, who could tell from the detailed breakdown of points awarded to each question where (s)he had gone wrong. We also posted sample solutions to both questions on the Web.
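As an illustration of the robustness and generality criteria above (we read "both the cases for character input" as accepting upper- as well as lower-case answers), here is a minimal C sketch of the kind of defensive input handling that earned full credit. The prompt and the function are our own, not taken from any test.

    #include <stdio.h>
    #include <ctype.h>

    /* Ask a yes/no question and keep asking until a valid answer is given.
     * Accepting both 'y' and 'Y' (generality), and rejecting anything else
     * (robustness), are the kinds of behavior rewarded during grading. */
    int ask_yes_no(const char *prompt) {
        char answer;
        while (1) {
            printf("%s (y/n): ", prompt);
            if (scanf(" %c", &answer) != 1)
                return 0;                      /* end of input: treat as "no" */
            answer = (char) tolower((unsigned char) answer);
            if (answer == 'y') return 1;
            if (answer == 'n') return 0;
            printf("Please answer y or n.\n"); /* incorrect input from the user */
        }
    }

    int main(void) {
        if (ask_yes_no("Run the calculation again"))
            printf("Running again...\n");
        else
            printf("Goodbye.\n");
        return 0;
    }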

Recommended Improvements:

• We noticed that many students were spending an inordinate amount of time on the first question (debugging a given program) and, as a result, were doing poorly on the second question. We addressed this by announcing the number of syntax/semantic errors they should expect to find in the first program.
• Logging into student accounts to review programs is tedious. A better solution would be to give Computer Science faculty group permissions to read/write/execute in the class accounts. With this setup, faculty can access student accounts just by changing directories.
• Although the majority of the students correctly followed our instructions to create a new directory for each test and worked within that directory, it may be more convenient to use a script to automatically create the directory for them, so that they do not have to deal with unfamiliar Unix commands at the beginning of each test.
• We were not concerned about students making unauthorized modifications to their test files after the test, since Computer Science I students are generally ignorant of Unix. If this were a serious concern, we could ask students to tar their test directory and either email the tar file to us as an attachment or ftp it to an anonymous site. Alternatively, we could use a script to automatically copy each student's test directory into the instructor's account at the end of the test.

In Preparation for the Test:

• Students were reminded to make sure their class account was in good standing for the test (i.e., the password had not expired and the disk quota had not been exceeded).
• Students were reminded to adequately familiarize themselves with the edit-compile-run commands on the Unix system. The editor was menu-driven, which was a great help.
• The test was open-book: students were encouraged to bring any textbook and notes they pleased to the test.

Evaluating Online Tests

We will now evaluate the merits of online testing, both qualitatively and quantitatively. Our evaluation technique consisted of comparing data from the three sections where we held online tests with historical data from three earlier sections where we had held written tests. To put this comparison in perspective:

• the online tests consisted of 2 questions, whereas the written tests consisted of 6 questions that the student chose from among 8 (no choice was given in the online tests);
• the questions in the online tests were significantly more complicated than any written test question, in this instructor's opinion;
• the background of the students was similar across all six sections: a mix of Computer Science majors and non-majors taking the course for general education credit;
• the written and online testing groups both include day and evening sections, and once-a-week as well as twice-a-week sections.

Therefore, the results we present here illustrate the effect of online tests that were harder than, but of the same duration as, written tests, on comparable groups of students.

Effect on Retention: Attrition is a major problem in Computer Science I. At our institution, attrition can be partly blamed on the fact that the course is also taken by non-majors for general education credit. We were concerned that online testing, with its real-time problem-solving format, would significantly increase attrition. But this did not prove to be the case.

Test Takers       Test 1   Test 2 / Test 1   Final / Test 2
Fall 94             16         93.75%            86.67%
Fall 96             28         92.86%            76.92%
Spring 97           30         73.33%            81.82%
Fall 98/Sec 1       28         78.57%            90.91%
Fall 98/Sec 2       19         78.95%           100.00%
Fall 98/Sec 4       28         89.29%            84.00%

The above table lists the number of students who took the first test; the number of students who took the second test as a percentage of the number who took the first test; and the number of students who took the final test as a percentage of the number who took the second test. Note that all the tests were written tests in Fall 94, Fall 96 and Spring 97. In Fall 98, the first test was written, but the other two tests were online. As the table indicates, the dropout rate from the first test to the second test was higher when the second test was held online than when it was written. But the dropout rate from the second test to the final test was lower when both tests were online than when both were written. Overall, the attrition in the course from the first test to the final test remained the same for online testing as it was for written testing (20%-30%).
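To read the table with one worked row (our own arithmetic, rounded to whole students): in Fall 98/Sec 1, 28 students took the written first test; 78.57% of them, about 22 students, took the online second test; and 90.91% of those, about 20 students, took the online final. That is an overall retention of roughly 20/28, or about 71%, i.e., attrition of about 29%, within the 20%-30% range cited above.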


Correlation Between Written and Online Tests:

[Area graph: Fall 98/Sec 1 cumulative grades (Test 1, Test 2, Final) for each student]

Consider the area graph of test scores from Fall 1998 (Section 1), when the first test was written and the second and final tests were online. The three graphs, from bottom to top, represent the scores of students on the first, second and final tests respectively, with the scores on the first test sorted in descending order. It is clear from the graph that there is a good correlation between scores on the written first test and the online tests: students who did well on the written test also did well on the online tests. This correlation becomes even more evident when the graph is compared with the area graph of a section where all three tests were written, as in Fall 96 below (please see [3] for a more detailed analysis):

[Area graph: Fall 96 cumulative grades (Test 1, Test 2, Final) for each student]

Improvement in Performance:

Class Average     Test 2 / Test 1   Final / Test 1
Fall 94               90.20%            70.59%
Fall 96               65.91%            68.18%
Spring 97             51.11%            68.89%
Fall 98/Sec 1        104.44%            88.89%
Fall 98/Sec 2        107.14%            76.19%
Fall 98/Sec 4        105.00%            82.50%

The above table lists the class average on the second test and on the final as a percentage of the class average on the first test. Note that during the semesters when all the tests were written tests, the class average dropped from the first test to the second test, as well as from the first test to the final exam. In Fall 98, when the second test and the final were online, the class average improved from the first test to the second test. This is remarkable considering the novelty of the online test format at this institution, the more advanced nature of the material covered between the first and second tests, and the increased difficulty of the test itself. The class average did drop from the first test to the final, but not as steeply as in the semesters when the final was also a written test. Therefore, either online testing tends to retain better students, or students do better online when dealing with advanced topics in Computer Science I (both the second test and the final deal with more advanced material than the first test), or both. Again, this is a positive result.

Correlation Between Projects and Online Tests: Over the course of the semester, we assign 6-7 programming projects in Computer Science I. We found that there was a correlation between the number of projects a student completed and the student's grades on online tests: students who correctly completed the most projects also scored well on the online tests, and those who submitted the fewest projects scored poorly on them (e.g., see the scatterplot below for Section 4, Fall 98). We may infer that completing the projects helps prepare students for online tests, which seems intuitive. In other words, online testing rewards the problem-solving skills that students develop through programming projects, which is in keeping with the objectives of Computer Science I.

[Scatterplot: projects completed versus score on the final, Fall 98/Sec 4]




Monitoring Programming Practices: One of the advantages of testing online is that the instructor can observe first-hand whether students are indeed following recommended programming practices, such as:

• writing the algorithm in the form of pseudocode, as comments in the file, before writing the code itself;
• writing code shell-first, i.e., typing in the syntactic shell of each construct before filling in the specifics of the program, in order to avoid syntax errors from mismatched delimiters (parentheses, braces, etc.) (see the sketch below);
• using proper indentation while writing code, rather than returning to indent the program after it has been successfully compiled.

This opportunity to observe students at work is especially helpful if the curriculum does not include closed laboratories in Computer Science I, and it enables the instructor to provide individualized feedback to students on the programming process.
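A minimal C sketch of the first two practices, written for this discussion rather than taken from any student's work: the algorithm appears first as pseudocode comments, and the syntactic shell of each construct is typed as a matched pair before the details are filled in.

    #include <stdio.h>

    /* Pseudocode written as comments before any code:
     *   read the number of scores
     *   for each score: read it and add it to a running total
     *   print the average (guarding against zero scores)
     */
    int main(void) {
        /* Shell-first: the for/if braces below are typed as empty,
         * matched pairs first, then filled in. */
        int count, i;
        double score, total = 0.0;

        printf("How many scores? ");
        scanf("%d", &count);

        for (i = 0; i < count; i++) {
            printf("Score %d: ", i + 1);
            scanf("%lf", &score);
            total = total + score;
        }

        if (count > 0) {
            printf("Average = %.2f\n", total / count);
        } else {
            printf("No scores entered.\n");
        }
        return 0;
    }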

Student Feedback: In an anonymous survey conducted at the end of the semester, more than 75% of the students agreed that "online tests were better at testing their learning in Computer Science I than written tests." Following are actual comments about online testing written by students on the end-of-semester evaluation forms:

• "Testing online is great. Allows you to think + test your program much easier."
• "The projects were good, along with online testing, although the length of the test and projects were too long."
• "Online testing is effective, however debugging programs seems time-consuming."
• "Online tests are good – better than conventional methods. However, the tests MUST be designed so as they can be completed within a class session. It is not possible for everyone to stay late." (This comment refers to the fact that we allowed up to one additional hour for students to complete the test.)

Instructor's Perspective: As the instructor, we felt that online testing offered several advantages:

• We could ask students to write longer and more complex programs in an online test than in a written test, and hold the students to higher standards of programming, since students have access to a compiler to verify the syntax and correctness of their programs.
• Grading the tests was easier, since there was little need for giving "benefit of the doubt" credit to incorrect answers.
• During grading, we could focus on the semantics and style of a program instead of its syntax. This emphasis on semantics over syntax is pedagogically well placed.

Some negatives of testing online include:

• The design and delivery of the test must be carefully thought out. Online tests present many opportunities for disaster to strike: e.g., the computer or network may go down (this happened to us).
• If online tests are to be easily graded, issues such as directory names and file names must be thought out beforehand. In order to preserve the integrity of the tests, students must be informed in advance of what they can and cannot do during the test.
• Students must be adequately assured that the test will not be graded as all-or-nothing, and that they will have the opportunity to get help from the instructor if they get stuck. Otherwise, students may find online tests to be traumatic.

Therefore, online tests take more time to plan and administer than written tests. However, we feel the pedagogical benefits of online testing are worth this additional investment of time and labor. We summarize our initial findings as follows:

• online tests do not increase attrition in a course, in spite of their real-time problem-solving format;
• the class average does not suffer with online tests, even when the online tests are harder than the written tests;
• there is a correlation between the number of projects a student has completed and the student's score on an online test.

We plan to continue to hold tests online in Computer Science I in the future.

References

[1] Mason, D.V., and Woit, D.M., "Integrating Technology into Computer Science Examinations," Proceedings of the Twenty-Ninth SIGCSE Technical Symposium on Computer Science Education (SIGCSE '98), Atlanta, GA, Feb. 1998, ACM Press, pp. 140-144.

[2] Woit, D.M., and Mason, D.V., "Lessons from Online Programming Examinations," Proceedings of the Third Annual Conference on Integrating Technology into Computer Science Education (ITiCSE '98), Dublin City University, Ireland, Aug. 1998, ACM Press, pp. 257-259.

[3] Kumar, A.N., "On Changing from Written to On-line Tests in Computer Science I: An Assessment," to appear in Proceedings of the Fourth Annual Conference on Integrating Technology into Computer Science Education (ITiCSE '99), Krakow, Poland, June 28-30, 1999, ACM Press.

