
Session T4C

Identifying Top Java Errors for Novice Programmers

James Jackson1, Michael Cobb2, and Curtis Carver3

Abstract - All freshmen at the United States Military Academy take an introductory programming course. We use a custom-built integrated development environment to help teach Java. During previous work, we implemented an integrated semantic and syntax error pre-processing system to help novice programmers decipher the otherwise cryptic compiler error messages so that they can focus more on design issues than on implementation issues. The syntactic errors that we checked were gathered by an informal survey of the current and former faculty members teaching the course. We noticed over the course of the year that there were discrepancies between the errors that the instructors had identified and the errors that the students were encountering. In response, we developed a real-time, automated error collection system that logged, in a central database, 100% of the Java errors that all users, students and faculty alike, encountered while using the integrated development environment over the course of a semester. This paper discusses the implementation and results of our system as well as the implications for novice programmers.

Index Terms – Syntax Errors, Programming, Information Technology

BACKGROUND

As the first dedicated engineering school in the United States, the Military Academy at West Point has had a long tradition of educating engineers. In keeping with this tradition, each freshman is required to take an introductory programming course with the intent of exposing them to a structured problem solving process. We built an integrated development environment (IDE) for Java to help alleviate some of the common problems associated with implementing source code. Previously we constructed an integrated semantic and syntax error pre-processing system [1] to help novice programmers decipher the otherwise cryptic compiler error messages.
This system, known as Gauntlet, would, for example, check for semantic errors such as a semicolon immediately following the boolean expression of an if-statement or a while-statement. This syntactically correct, but semantically incorrect, problem is almost always a result of a novice programmer’s incorrect implementation of a design. Gauntlet translates syntactic errors into a more easily digestible form for the student. The semantic errors that we checked were gathered by an informal survey of the current and former faculty members teaching the course.

In a typical semester, cadets complete three major projects (100-300 Lines of Code (LOC)), nightly programming problem sets (5-20 LOC), and many in-class programming exercises (10-30 LOC).

PREVIOUS WORK

Several previous works have attempted to identify the most common errors committed by novice or first-time programmers. Our search of the literature showed that researchers used a variety of methods to identify common errors. Experience [1], faculty surveys [1]-[2], manual counting and categorization of errors [3]-[4], or student-provided compilations [5] were the most common.

Our work is novel for three reasons. First, the population for the course is an academically eclectic array of students. Half the students will major in engineering and the sciences; the other half will major in the humanities. As information technology permeates our society, the number of humanities majors requiring programming and IT courses will continue to rise. The typical introductory programming course at other institutions is geared to future computer science majors or other science majors. Second, for most of these students this is the first time that they have been exposed to thinking about programming languages. We typically find that less than 10% of our students have previous programming experience. Third, our system makes use of real-time error collection via a web-enabled IDE that sends results to a web application. The top ten errors are not an interpretation or subjective assessment; these are the hard top ten errors for this population of students.

PROBLEM

Over the last year it became apparent that the Gauntlet system was not deciphering the most common errors that the students were encountering. There was a discrepancy between the errors identified by the faculty and those errors encountered by the students. To get a better grasp on the errors that the students were encountering, we developed a real-time, automated error collection system that logged the Java errors encountered to a central database.

Table I shows the top nine errors that the faculty identified.

1 James Jackson, Assistant Professor, United States Military Academy, [email protected]
2 Michael Cobb, Assistant Professor, United States Military Academy, [email protected]
3 Curtis Carver, Associate Professor, United States Military Academy, [email protected]

0-7803-8552-7/04/$20.00 © 2004 IEEE October 19 – 22, 2005, Indianapolis, IN 35th ASEE/IEEE Frontiers in Education Conference T4C-24


TABLE I
FACULTY IDENTIFIED ERRORS

    Rank  Faculty Identified Error
    1     Mismatched curly braces ({ or } expected).
    2     Mismatched quotations.
    3     Misplaced semicolon.
    4     Improper file name.
    5     Attempting to use variable before initializing it (cannot resolve symbol or <identifier> expected).
    6     Mismatched parentheses (( or ) expected).
    7     Missing semicolon (; expected).
    8     Misspelling printLine method.
    9     Package does not exist.

IMPLEMENTATION

In our course we have a homogenous automation environment because each of our students is required to purchase the same computer system upon entry to the Military Academy. In addition, we have ubiquitous network access. Students are able to access our local intranet via encrypted wireless access points in every academic building and via wired means in their dorm rooms. Since we developed the integrated development environment that is used in the course, we have access to the source code to make low-level changes. The IDE used to write Java is also written entirely in Java. When the student opens the IDE and compiles a program, the IDE stores every Java error encountered. A typical Java error is shown in Table II.

TABLE II
TYPICAL JAVA ERROR

    C:\Documents and Settings\user\Desktop\Test.java:6: ';' expected
    System.out.println(count);
    ^

The IDE parses the error by retrieving all of the text between the third colon and the end of the error message. The parsed error appears in Table III.

TABLE III
TYPICAL JAVA ERROR

    ';' expected
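A minimal sketch of this parsing rule, together with the per-message counting described in this section, might look like the following. The class and method names are hypothetical and are not taken from the IDE's source. The sketch also assumes a Windows-style path, where the drive letter supplies the first colon, and it skips any line with fewer than three colons, which is a choice made here and not a documented behavior of the system.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; names are illustrative and not taken from the IDE's source.
class CompilerErrorTally {
    // Parsed message -> number of times it has been seen so far.
    private final Map<String, Integer> counts = new LinkedHashMap<>();

    // Returns the trimmed text after the third colon of an error line,
    // or null when the line contains fewer than three colons (an assumption of this sketch).
    static String parseMessage(String errorLine) {
        int index = -1;
        for (int seen = 0; seen < 3; seen++) {
            index = errorLine.indexOf(':', index + 1);
            if (index < 0) {
                return null;
            }
        }
        return errorLine.substring(index + 1).trim();
    }

    // Parses one error line and increments the count for its message.
    void record(String errorLine) {
        String message = parseMessage(errorLine);
        if (message != null) {
            counts.merge(message, 1, Integer::sum);
        }
    }

    // Number of times the given parsed message has been recorded.
    int countOf(String message) {
        return counts.getOrDefault(message, 0);
    }

    public static void main(String[] args) {
        String line = "C:\\Documents and Settings\\user\\Desktop\\Test.java:6: ';' expected";
        CompilerErrorTally tally = new CompilerErrorTally();
        tally.record(line);
        tally.record(line);
        System.out.println(parseMessage(line));            // prints ';' expected
        System.out.println(tally.countOf("';' expected")); // prints 2
    }
}
```

Counting colons from the start of the line ties the rule to paths that contain exactly one colon; a path without a drive letter would need a different split point.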

The IDE stores the parsed message in a list that is updated every time a program is compiled. Each time the error is encountered, a count of that particular error is updated. When the student exits the IDE, this information is sent to a web application. On the back end we use the Tomcat servlet container and a MySQL database. The web application parses the request parameters of the HTTP request and stores each error in the database. If the error is not found in the database, a record is created for it. If the error is in the database, a field that stores how many times this error has occurred is incremented.

There are some factors that need to be taken into consideration when understanding the system. The system only collects errors when the IDE is closed and the student’s computer is connected to the local intranet. In the rare cases when the student is away from the Military Academy (and actually doing homework) and without network access, the system will not collect the student’s generated errors. The system collects errors from every active copy of the IDE. This means that faculty who use the IDE will generate errors too. There were 583 students and 11 faculty members using the IDE during the course.

RESULTS

The system collected a total of 559,419 errors over the course of one semester. The top ten errors represent 51.8% (290,134) of the total number collected. The top twenty represent 62.5% (349,553) of the total number collected. The following table lists each of the top ten errors and how many times it occurred.

TABLE IV
TOP TEN JAVA ERRORS

    Rank  Error                         Number of Occurrences  Faculty Identified
    1     cannot resolve symbol         81655                  Yes
    2     ; expected                    47362                  Yes
    3     illegal start of expression   32107                  No
    4     class or interface expected   25650                  No
    5     <identifier> expected         25223                  Yes
    6     ) expected                    21412                  Yes
    7     incompatible types            15854                  No
    8     int                           14185                  No
    9     not a statement               13878                  No
    10    } expected                    12808                  Yes
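As a check on the totals, the ten counts in Table IV can be summed and compared with the 559,419 errors collected overall. The sketch below does only that arithmetic. The class name is illustrative, and the counts are copied from Table IV. The sum is 290,134, and the share works out to roughly 51.86%, which agrees with the reported 51.8% if that figure was truncated and not rounded.

```java
// Illustrative arithmetic check; the counts are copied from Table IV.
class TopTenShare {
    static final int[] TOP_TEN = {
        81655, 47362, 32107, 25650, 25223, 21412, 15854, 14185, 13878, 12808
    };
    static final int TOTAL_ERRORS = 559419;

    // Adds up all counts in the array.
    static int sum(int[] counts) {
        int total = 0;
        for (int count : counts) {
            total += count;
        }
        return total;
    }

    // Share of the whole, expressed as a percentage.
    static double sharePercent(int part, int whole) {
        return 100.0 * part / whole;
    }

    public static void main(String[] args) {
        int topTen = sum(TOP_TEN);
        System.out.println(topTen); // prints 290134
        System.out.println(sharePercent(topTen, TOTAL_ERRORS));
    }
}
```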

FIGURE 1
GRAPH OF ERRORS AND NUMBER OF OCCURRENCES (Number of Occurrences plotted against Rank).

ANALYSIS

The following analysis is based on our experience teaching novice programmers over a combined 20 semesters. It is meant to be neither exhaustive nor definitive, but rather to present our opinions on why each of the errors occurred.


The “cannot resolve symbol” error typically occurs when students use a variable without first declaring it. We believe that this is a direct result of high school algebra. The idea of declaring a variable is foreign to most novice programmers at first.

The “; expected” error was anticipated because the semicolon is fairly rare in the English language and serves as the terminating character of a statement. Students tend to forget the terminating character (e.g., int count = 5) and also add semicolons where they are not needed (e.g., after a boolean expression, if (isLightOn == true);).

The “illegal start of expression” error is typically caused by a malformed boolean expression in an if or a while. We see this most often when students use a space between the two equals signs in a boolean expression used in an if or while statement (e.g., if (count = = 5)). This error also occurs when students have more than one method in a class and fail to properly close an earlier method. It also occurs when a student tries to declare a static variable inside of a static method. Students really struggle with fixing this error because there are so many different causes and they are unfamiliar with the term expression as used in Java.

The “class or interface expected” error occurs when students forget to enclose their methods in a class, typically forgetting “class Test {”. We believe this occurs because students type code snippets from the class handbook directly into the IDE.

The “<identifier> expected” error typically occurs when students declare static class-level variables but forget to include the type of the variable (e.g., static count2 = 10).

The “) expected” error typically occurs when students try to form compound boolean expressions. The “) expected” error (21412 occurrences) appeared 5.7 times more often than the “( expected” error (3753 occurrences). Students tend to focus on the boolean expressions and forget to match parentheses.
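Several of the causes described above can be placed side by side with their corrected forms. In the sketch below, which is illustrative only, the erroneous forms quoted in the text appear in comments so that the class compiles. The stray semicolon is the exception: it is left in live code because it compiles and silently changes behavior. The class and method names are hypothetical; count, count2, and isLightOn follow the examples in the text.

```java
// Illustrative class; the erroneous forms appear only in comments.
class CommonMistakes {
    // Wrong: static count2 = 10;   (the type of the variable is omitted)
    static int count2 = 10;

    // The stray semicolon ends the if-statement, so the block below always runs.
    static int withStraySemicolon(boolean isLightOn) {
        int switches = 0;
        if (isLightOn == true);
        {
            switches = switches + 1;
        }
        return switches;
    }

    // Without the semicolon, the block runs only when the condition holds.
    static int withoutStraySemicolon(boolean isLightOn) {
        int switches = 0;
        if (isLightOn == true) {
            switches = switches + 1;
        }
        return switches;
    }

    public static void main(String[] args) {
        // Wrong: count = 5;         (count is used without being declared)
        // Wrong: int count = 5      (the terminating semicolon is missing)
        int count = 5;

        // Wrong: if (count = = 5)   (space between the two equals signs)
        if (count == 5) {
            System.out.println("count is five");
        }

        System.out.println(withStraySemicolon(false));    // prints 1
        System.out.println(withoutStraySemicolon(false)); // prints 0
    }
}
```

The two helper methods differ by a single character, which is why this mistake is hard for a novice to spot from the program's output alone.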
The “incompatible types” error occurs when students attempt to equate two variables of different types. We see this often with doubles and integers, and also with integers and booleans. In addition, this error occurs when students use the assignment operator in lieu of the equality operator in an if-statement (e.g., if (count = 5)).

The “int” error is an artifact of the collection system. A Java error such as the one in Table V will generate an “error” of both “double” and “int.” We identified the logic flaw in our system about halfway through the semester. At that time, we decided to make a note of the flaw instead of fixing and redeploying the system to the 500+ users.

TABLE V
TYPICAL JAVA ERROR

    C:\Documents and Settings\user\Desktop\Test.java:6: possible loss of precision
    Found : double
    Required: int
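The type mismatches described above have short corrected forms. The following sketch is illustrative only. It uses hypothetical method names and keeps the erroneous forms in comments so that the class compiles. The explicit cast shown is one way to resolve a double-to-int assignment; whether truncation is the right fix depends on what the student intended.

```java
// Illustrative class; the erroneous forms appear only in comments.
class TypeMismatchFixes {
    // Wrong: if (count = 5)   (assignment produces an int where a boolean is required)
    static boolean isFive(int count) {
        return count == 5;
    }

    // Wrong: int whole = value;   (a double cannot be assigned to an int without a cast)
    static int truncate(double value) {
        int whole = (int) value; // the cast discards the fractional part
        return whole;
    }

    public static void main(String[] args) {
        System.out.println(isFive(5));     // prints true
        System.out.println(truncate(3.9)); // prints 3
    }
}
```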

The “not a statement” error occurs when students use notation such as count +5 in lieu of count = count + 5. This error also occurs often when students put a boolean expression after an else statement trying to use natural language translation (e.g. if (count>10{…} else (count
