Intelligent Educational Systems for Teaching Programming

by Lin Qiu
Introduction

Do you remember the moment when, just beginning to learn a programming language, you got stuck writing a program? You had a hard time figuring out what the syntax of an expression should be, which function to use, or what structure your program should follow. You wished someone were looking over your shoulder to give you the answer; it would have saved you hours of wandering on your own. But unfortunately, such help was not available.

Do you remember, as a teacher or TA for a programming class, the times when piles of student assignments arrived waiting to be graded? You read through them, making personalized comments on the problems and how to correct them. But this only happened with the first few assignments. After a while, you realized it was costing too much time and effort. Addressing the same mistake again and again became tedious, so you gradually reduced the amount of feedback and scanned through the rest routinely.

Scenarios like these are common in learning and teaching programming. In fact, individualized feedback is rarely available in school because reviewing student work and personalizing critiques are labor-intensive and time-consuming. Yet because programming is a skill that can only be learned through practice, just-in-time, individualized feedback during practice is highly valuable. Several software systems have been developed to address this challenge. In the following, we describe three intelligent systems, each of which represents a different approach.
Intelligent Tutoring Systems

The LISP Tutor [1] is one of the earliest systems developed to teach programming and an exemplar of the intelligent tutoring approach. Students using the tutor are asked to solve programming exercises, usually chosen according to the system's understanding of the student's competence level, built up through its interactions with the student. While the student writes programs in the system, the tutor closely monitors every move. It uses about 500 production rules to generate the next correct moves that the student should make. A production rule takes the form of an if-then rule: it specifies a programming goal and the action that satisfies the goal. For example, one rule, stated in English, reads: "if a goal is to add a set of numbers then code the operator +, and set a goal to code the addends" [4].
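To make the rule format concrete, here is a minimal sketch, in Java, of how such a goal/action production rule might be represented. The class and field names are illustrative assumptions, not taken from the LISP Tutor itself.

import java.util.List;

// Illustrative sketch of an if-then production rule: a goal, the coding
// action that satisfies it, and the subgoals it spawns. The names and
// structure are hypothetical, not the LISP Tutor's own representation.
public class ProductionRule {
    final String goal;           // e.g., "add a set of numbers"
    final String action;         // e.g., "code the operator +"
    final List<String> subgoals; // e.g., ["code the addends"]

    ProductionRule(String goal, String action, List<String> subgoals) {
        this.goal = goal;
        this.action = action;
        this.subgoals = subgoals;
    }

    boolean applies(String currentGoal) {
        return goal.equals(currentGoal);
    }

    public static void main(String[] args) {
        ProductionRule addRule = new ProductionRule(
            "add a set of numbers",
            "code the operator +",
            List.of("code the addends"));
        if (addRule.applies("add a set of numbers")) {
            System.out.println("Next correct move: " + addRule.action);
            System.out.println("New goals: " + addRule.subgoals);
        }
    }
}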
The process of finding the steps in a solution path that the student could possibly be taking is called model tracing [1]. If the student makes a move that differs from the predicted steps, the tutor provides feedback, which may be an indication of a problem in the code, a suggestion of the right function, or pseudocode for the next step. The tutor also uses a set of buggy rules to predict mistakes the student might make; if the student's input matches a predicted error, feedback is likewise triggered. In this way the student gets step-by-step instruction on how to write a program, always stays on the right track by following the tutor's advice, and is spared the time and effort of debugging simple syntax errors or pondering a wrong path.
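A rough sketch of this model-tracing loop might look like the following. The tiny rule tables and feedback messages below are invented stand-ins for the tutor's roughly 500 production rules and its buggy rules, not their actual contents.

import java.util.Map;
import java.util.Set;

// Simplified sketch of model tracing: each student move is compared
// against moves predicted by correct production rules and by buggy
// rules. Both tables here are illustrative, not the tutor's own.
public class ModelTracer {
    // Next moves that some correct production rule predicts.
    static final Set<String> expectedMoves = Set.of("(+", "(defun");
    // Moves predicted by buggy rules, mapped to canned feedback.
    static final Map<String, String> buggyMoves = Map.of(
        "(plus", "Use the operator + to add numbers.",
        "(add",  "There is no ADD function here; use +.");

    static String trace(String studentMove) {
        if (expectedMoves.contains(studentMove)) {
            return "on track";   // a predicted correct step: stay silent
        }
        String feedback = buggyMoves.get(studentMove);
        if (feedback != null) {
            return feedback;     // matched an anticipated mistake
        }
        // Neither a correct step nor a known bug: hint at the next step.
        return "Hint: the next step is one of " + expectedMoves;
    }

    public static void main(String[] args) {
        System.out.println(trace("(+"));     // on track
        System.out.println(trace("(plus"));  // buggy-rule feedback
    }
}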
User studies have shown that, compared with students working without the tutor's help, students using the tutor completed programming tasks faster and also performed better on post-programming tests.

One interesting finding is that different feedback mechanisms make little difference to student learning outcomes [5]. Besides the immediate-feedback approach described above, two other feedback mechanisms were implemented in the LISP Tutor. One highlights a mistake immediately when it is detected but does not require the user to fix it; the other lets the user decide when to see the feedback, showing it only on request. Results showed that while students working with immediate feedback completed the tutoring problems fastest, there was little difference in programming-skill acquisition among students using the three tutoring approaches. As long as students work through the same number of exercises and reach correct solutions in the end, tutoring style has little impact on final skill acquisition.

Over the years, various tutoring systems have been built to teach programming (e.g., PROUST [10], MENO-II [19], and ELM-PE [20]). They analyze the user's solutions to exercises and give feedback on misconceptions or missing skills identified by the analysis. Other approaches have also appeared within the intelligent tutoring paradigm. Kumar's model-based tutor asks students to predict the output of C++ programs and to identify semantic and run-time errors [12]; it explains program execution line by line to help students understand code behavior. Adaptive navigation based on student modeling is used in ELM-ART II [21], a web-based system that provides individually annotated hyperlinks and curriculum sequencing.

These tutoring systems assign exercises that are predefined in the system; users cannot get feedback from the tutor on other programs. Focusing on differences from exemplary solutions can, moreover, lead users to imitate the stored solutions, inhibiting innovative problem solving. Such tutoring is therefore suitable only for users at the beginner level.
Standalone Expert Critiquing Systems

Expert critiquing systems have proved effective at providing useful feedback on users' work in many domains. Such feedback includes alerts to problematic situations, information relevant to the task at hand, directions for improvement, and prompts for reflection [18]. One critiquing system, the LISP-CRITIC [6], was developed to teach LISP programming. Unlike the LISP Tutor, which continuously analyzes user actions, the LISP-CRITIC lets the user activate the examination. Once started, it matches the user's code against a large set of critiquing rules. These rules look for mistakes in the code and suggest corresponding improvements, e.g., better programming style, safer list operations, or more advanced functions (see the examples in Figure 1). Information such as which rules have been triggered and which functions the user is using forms a user model, and the user model determines the set of rules used to check the code. A visualization tool and a browser of LISP concepts help the user understand the critiques. Users appreciate the critiques most when they make mistakes and can correct them by following the critiques.
User code                        Critique
(car (cdr x))                    Use (cadr x)
(setq f1 (cons x f1))            Use (push x f1)
(append (explode word) chars)    Use (nconc (explode word) chars)

Figure 1: Sample critiques in the LISP-CRITIC.

The LISP-CRITIC can benefit both novice and intermediate programmers. Novice programmers can learn new functions and concepts; intermediate programmers can learn how to produce better code, or use the tool as a proofreader that detects opportunities for improvement. Hendrikx, Olivie and Loyaerts [7] built a system to detect novice Java programmers' misconceptions. It uses XSLT for pattern matching, and it lets users run a local client program that transfers files to the server, making it possible to detect misconceptions involving code spread across different classes. There are also commercial products (e.g., CodeAdvisor [8], LINT [9], CodeWizard [11], PatternLint [17]) and open-source tools (e.g., Checkstyle [3], PMD [14]) that provide code review for programmers. They usually focus on bug detection, memory management, coding standards, and design flaws, and are not intended for educational purposes.

While expert critiquing systems can provide individualized feedback fully automatically, authoring remains a big problem. To support purely computer-based critiquing, experts have to anticipate all the mistakes that novices commonly make and enter them into the system during development. Depending entirely on experts to anticipate every possible mistake at design time may waste effort on cases that rarely occur; more importantly, commonly occurring critical cases may fail to be collected.
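Critiquing rules like those in Figure 1 are essentially pattern-and-rewrite pairs. As a concrete illustration, here is a minimal sketch of how the first rule in Figure 1 could be expressed with a regular expression; this representation is my own invention, not the LISP-CRITIC's internal rule format.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative rewrite rule in the spirit of Figure 1: match a code
// fragment and suggest a more idiomatic equivalent. The regex-based
// representation is an assumption made for this sketch.
public class CritiqueRule {
    final Pattern pattern;
    final String replacement;  // template; $1 is the captured operand

    CritiqueRule(String regex, String replacement) {
        this.pattern = Pattern.compile(regex);
        this.replacement = replacement;
    }

    void critique(String code) {
        Matcher m = pattern.matcher(code);
        while (m.find()) {
            // Show the matched fragment and the suggested rewrite.
            System.out.println(m.group() + "  ->  Use "
                + m.group().replaceAll(pattern.pattern(), replacement));
        }
    }

    public static void main(String[] args) {
        // (car (cdr x)) -> Use (cadr x), the first rule in Figure 1.
        CritiqueRule cadrRule = new CritiqueRule(
            "\\(car \\(cdr (\\w+)\\)\\)", "(cadr $1)");
        cadrRule.critique("(defun second-elt (x) (car (cdr x)))");
    }
}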
Computer Supported Critiquing Systems

Intelligent tutoring systems and expert critiquing systems can provide accurate and comprehensive feedback to the user. These systems, however, require significant development effort before they can be put into use. Developers have to work with domain experts and educators to make sure the system covers all possible mistakes a student might make, with appropriate feedback for each. This requires significant up-front design, implementation, and piloting; it has been estimated that one hour of instruction in an intelligent tutoring system typically requires 100 hours of development [13].

The Java Critiquer [15], a critiquing system that we built to help teachers critique student Java code, uses an incremental authoring approach to amortize this high development cost. It provides an open interface through which teachers gradually enter and update critiquing knowledge during real use of the system, eliminating the need to anticipate and implement all possible critiquing situations up-front. When using the Java Critiquer, the teacher pastes student code into a textbox and lets the critiquer carry out automatic critiquing. Automatic critiquing is done by pattern matching: critiques associated with matched patterns are inserted directly below the problematic code. The teacher then verifies these critiques, modifying or removing inappropriate ones as needed. After reviewing the critiques generated by the system, the teacher critiques the code manually, inserting a critique either by typing a new one or by applying one that already exists in the system. The Java Critiquer provides search and editing tools to help the teacher maintain existing critiques and patterns.
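To illustrate the automatic pass, the sketch below inserts matched critiques directly below the offending line of student code, producing a draft for the teacher to verify. The two sample rules and the comment-style output format are assumptions for illustration, not the Java Critiquer's actual rules or presentation.

import java.util.Map;
import java.util.regex.Pattern;

// Sketch of automatic critiquing: for each line of student code, append
// any critique whose pattern matches directly below that line. The rule
// table here is illustrative only.
public class AutoCritiquer {
    static final Map<Pattern, String> rules = Map.of(
        Pattern.compile("==\\s*true"),
            "Comparing a boolean to true is redundant.",
        Pattern.compile("catch\\s*\\(\\s*Exception"),
            "Catching Exception may hide unrelated errors.");

    static String critique(String studentCode) {
        StringBuilder out = new StringBuilder();
        for (String line : studentCode.split("\n")) {
            out.append(line).append('\n');
            for (var rule : rules.entrySet()) {
                if (rule.getKey().matcher(line).find()) {
                    out.append("  // CRITIQUE: ")
                       .append(rule.getValue()).append('\n');
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(critique("if (done == true) {\n    stop();\n}"));
    }
}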
A new critique can be incorporated into automatic critiquing by creating a pattern for it. It is common to discover that a pattern either fails to match where it should or matches many places where it should not, so a pattern editor is provided for editing and testing patterns. There are two types of patterns in the system: regular expression patterns, which match Java source code directly, and XML patterns, which match the JavaML [2] representation generated from the source code by an internal Java parser. The teacher can specify test cases for a pattern; the system automatically matches the pattern against the test cases and highlights each one in red or green to indicate whether the result complies with expectation. The pattern editor saves the teacher from switching to another application for testing. It also serves as documentation: other teachers can review all the test cases for a pattern and thereby understand what the pattern can and cannot catch. When a critique or pattern becomes reliable enough, it can be made accessible through a web interface to other teachers, or to students for self-assessment. The web interface for students offers only automatic critiquing with highly reliable critiques, without access to authoring.

The Java Critiquer embodies a practical development approach because each authoring effort the teacher makes is motivated by its immediate benefit. The teacher stores a critique in the system to save the effort of typing it again, which leads to a database of reusable critiques. The teacher creates a pattern to automate a critique when finding and applying it repeatedly becomes tedious and time-consuming, and refines the pattern when false critiques it produces require additional effort to remedy. Development becomes an evolutionary process in which critiquing situations and corresponding critiques are identified, implemented in the system, assessed through practical use, and refined based on experience. Instead of being built to be intelligent at design time, the system gradually grows into an intelligent system through practice. There is no need to anticipate and implement all possible critiquing situations up-front; issues not anticipated during system design can be explored during real use, and the critiquing knowledge in the system stays up to date. Meanwhile, having a human teacher review automatically generated critiques ensures the quality of the system's critiquing during the early development stage. This is important because computers easily lose credibility when users notice inappropriate critiques; when users have low trust in the computer, they pay little attention even to critiques that are appropriate [16]. We avoid this problem by letting manual critiquing complement automatic critiquing.
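The red/green test-case check might work roughly as follows. This harness, and the sample pattern flagging assignment inside an if-condition, are a sketch under my own assumptions, not the Java Critiquer's actual pattern editor.

import java.util.regex.Pattern;

// Sketch of the pattern editor's test-case check: each test case pairs
// a code snippet with whether the pattern is expected to match, and the
// editor marks it GREEN (as expected) or RED (unexpected). Names here
// are illustrative.
public class PatternTester {
    record TestCase(String code, boolean shouldMatch) {}

    static void runTests(Pattern pattern, TestCase... cases) {
        for (TestCase tc : cases) {
            boolean matched = pattern.matcher(tc.code).find();
            // Green when the result matches the expectation, red otherwise.
            String color = (matched == tc.shouldMatch) ? "GREEN" : "RED";
            System.out.printf("%-5s %s%n", color, tc.code);
        }
    }

    public static void main(String[] args) {
        // Hypothetical pattern flagging assignment inside an if-condition.
        Pattern p = Pattern.compile("if\\s*\\(\\s*\\w+\\s*=[^=]");
        runTests(p,
            new TestCase("if (x = 5) { ... }", true),    // should be caught
            new TestCase("if (x == 5) { ... }", false)); // should be fine
    }
}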
Conclusions

Programming is a highly practical skill. While simulation and visualization tools can help students understand programming concepts and code behavior, detailed, individualized feedback on student code remains critical. In this article, we described three types of systems: intelligent tutoring systems, standalone expert critiquing systems, and computer supported critiquing systems. All of them can provide helpful feedback on student code, but each has its own pros and cons. Tutoring systems can provide step-by-step support for completing a program, but they usually need extensive knowledge of the domain content, student modeling, and pedagogical strategy [22], and their users can work only on a predefined set of exercises. Expert critiquing systems do not have this limitation: they can critique any code, which makes them beneficial to both beginner and intermediate programmers. Like intelligent tutoring systems, however, expert critiquing systems require significant development effort before they can be put into use. Computer supported critiquing systems avoid this difficulty by allowing incremental authoring during real use; such systems can be put into use early in development. They require, however, a human in the feedback loop to ensure the quality of the feedback and to handle situations the system cannot. While most of the systems mentioned in this article have proved effective for teaching programming, they are primarily research systems. Few have been deployed and used at large scale.
Nearly all of them were built from scratch. Reusable infrastructures with authoring tools are needed to reduce the development effort for such systems. Eventually, integration into commonly used programming development environments could make the educational experience these systems provide available to more real-life programmers.
References

1. Anderson, J. R., Conrad, F. G., & Corbett, A. T. (1989). Skill acquisition and the LISP Tutor. Cognitive Science, 13, 467-505.
2. Badros, G. (2000). JavaML: A markup language for Java source code. In Proceedings of the Ninth International World Wide Web Conference, May 2000.
3. Checkstyle. URL: http://checkstyle.sourceforge.net
4. Corbett, A. T., & Anderson, J. R. (1989). Feedback timing and student control in the Lisp Intelligent Tutoring System. In D. Bierman, J. Breuker, & J. Sandberg (Eds.), Artificial Intelligence and Education: Proceedings of the 4th International Conference on AI and Education. Springfield, VA: IOS.
5. Corbett, A. T., & Anderson, J. R. (2001). Locus of feedback control in computer-based tutoring: Impact on learning rate, achievement and attitudes. In Proceedings of ACM CHI 2001 Conference on Human Factors in Computing Systems, 245-252.
6. Fischer, G. (1987). A critic for LISP. In Proceedings of the 10th International Joint Conference on Artificial Intelligence, Milan, Italy.
7. Hendrikx, K., Olivie, H., & Loyaerts, L. (2002). A system to help detect and remediate novice programmer's misconceptions. In Proceedings of World Conference on Educational Multimedia, Hypermedia & Telecommunications.
8. Hewlett-Packard Company. (1998). SoftBench SDK: CodeAdvisor and Static Programmer's Guide. URL: http://docs.hp.com/hpux/onlinedocs/B6454-90005/B6454-90005.html
9. Johnson, S. C. (1978). Lint, a C program checker. Unix Programmer's Manual. Murray Hill, NJ: AT&T Bell Laboratories.
10. Johnson, W. L. (1986). Intention-Based Diagnosis of Novice Programming Errors. London: Pitman.
11. Kolawa, A., & Hicken, A. (1998). Programming Effectively in C++. ParaSoft Corporation. URL: http://www.parasoft.com/products/wizard/cplus/papers/tech.htm
12. Kumar, A. N. (2002). Model-based reasoning for domain modeling in a web-based intelligent tutoring system to help students learn to debug C++ programs. In Proceedings of Intelligent Tutoring Systems, LNCS 2363, Biarritz, France, June 5-8, 2002, 792-801.
13. Murray, T., & Woolf, B. (1992). Results of encoding knowledge with tutor construction tools. In Proceedings of the Tenth National Conference on Artificial Intelligence, San Jose, CA, July, 17-23.
14. PMD. URL: http://pmd.sourceforge.net
15. Qiu, L., & Riesbeck, C. K. (2003). Making critiquing practical: Incremental development of educational critiquing systems. To appear in Proceedings of the 2004 International Conference on Intelligent User Interfaces, January 13-16, 2004, Island of Madeira, Portugal.
16. Reeves, B., & Nass, C. (1996). The Media Equation. Cambridge: Cambridge University Press.
17. Sefika, M., Sane, A., & Campbell, R. H. (1996). Monitoring compliance of a software system with its high-level design models. In Proceedings of the 18th International Conference on Software Engineering, Berlin, Germany.
18. Silverman, B. (1992). Survey of expert critiquing systems: Practical and theoretical frontiers. Communications of the ACM, 35(4).
19. Soloway, E., Rubin, E., Woolf, B., Johnson, W. L., & Bonar, J. (1983). MENO-II: An AI-based programming tutor. Journal of Computer-Based Instruction, 10, 20-34.
20. Weber, G., & Möllenberg, A. (1995). ELM programming environment: A tutoring system for LISP beginners. In K. F. Wender, F. Schmalhofer, & H.-D. Böcker (Eds.), Cognition and Computer Programming. Norwood, NJ: Ablex Publishing Corporation, 373-408.
21. Weber, G., & Specht, M. (1997). User modeling and adaptive navigation support in WWW-based tutoring systems. In Proceedings of the 6th International Conference on User Modeling, Sardinia, Italy.
22. Wenger, E. (1987). Artificial Intelligence and Tutoring Systems: Computational and Cognitive Approaches to the Communication of Knowledge. Los Altos, CA: Morgan Kaufmann Publishers, Inc.
Biography

Lin Qiu ([email protected]) is a Ph.D. candidate in computer science at Northwestern University. His research interests include human-computer interaction and artificial intelligence in intelligent educational systems.