Automated Generation of Self-Explanation Questions in Worked Examples in a Model-Based Tutor

Amruth N. Kumar
Ramapo College of New Jersey, Mahwah, USA
[email protected]

Abstract. A framework is proposed for automated generation of self-explanation questions in worked examples. In the framework, in addition to the questions, the correct answer, distracters and feedback are also automatically generated. The framework is based on model-based generation of worked examples, and is domain-dependent rather than problem-specific.

Keywords: Self-explanation, Worked example, Model-based tutor.

We have been developing and evaluating software tutors for program comprehension, called problets (problets.org), for over a decade. These tutors cover all the programming language topics (16 in all) typically included in an introductory programming course, and can be used for C++, Java or C#. The tutors engage students in problem-solving activity: they present a complete but short program to the student and ask the student to debug the program, predict its output, or identify the state of its variables. If the answer the student submits is incorrect, the tutors present the same problem as a worked example, complete with a step-by-step explanation of the program that justifies the correct answer. Prior evaluations have shown that students learn from this step-by-step explanation [1].

In order to improve the effectiveness of the worked examples and ensure that all the students closely trace them, we incorporated self-explanation prompts into the worked examples in anticipative reasoning style [4]. In prior studies of the use of self-explanation prompts in anticipative reasoning style, the self-explanation questions were hand-coded for each problem, as were the correct answers to the questions and the possible feedback presented to the learner. In a tutor that includes a handful of problems, this is a viable strategy. However, in our software tutors on programming, hand-coding self-explanation questions, answers and feedback for the 2868 problems contained in the 16 tutors was impractical. So, we explored the automatic generation of self-explanation questions in worked examples.

We use model-based reasoning to build our software tutors [2], the advantage being that the domain model automatically generates the correct answer to each problem, as well as the steps in its worked example.

Suppose the domain model of the if-else statement generates the following generic explanation as a step in the worked example:

In the condition of the if-else statement, <variable> is compared to be <comparison> than <value>. The condition evaluates to <result>, so the <clause> is executed on line <line-number>.

If a program contains the following if-else statement:

if( count > 0 ) { … } else { … }

the generic explanation generated by the if-else statement model is customized with program-specific details as follows:

In the condition of the if-else statement, count is compared to be greater than 0. The condition evaluates to true, so the if-clause is executed on line 32.

The customization segments in this line of explanation include the name of the variable (count), the comparison operator (greater), the comparison value (0), the value to which the condition evaluates (true), the clause that is consequently executed (the if-clause), and the line number of that clause (32). All these segments of customization are candidates for self-explanation, e.g., the student can be asked to identify the variable compared in the condition of the if-else statement, the value to which it is compared, the result of the comparison, and/or the line number of the code segment executed after the evaluation of the condition (one possible representation of such a template is sketched below). These self-explanation questions can be categorized into two groups:

- Syntactic questions that can be answered by reading the code, e.g., the variable or value compared in the condition of the if-else statement;
- Semantic questions that require the student to understand and mentally execute the code before answering, e.g., the result of evaluating the condition of the if-else statement, and whether the if-clause or the else-clause is executed as a result.

All the segments where each line of explanation must be customized are clearly identified in the domain model. A tutor can be configured to automatically generate a self-explanation question at every customization segment of every line of explanation. However, this would result in too many self-explanation questions: the single line of explanation in the above example contains six candidate self-explanation questions, and the overall explanation of even a simple program could contain tens of lines of such explanation. Automatically generating a self-explanation question at every customization segment would therefore result in too many self-explanation questions per problem, possibly overwhelming the student.
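To make the template mechanism concrete, the following sketch, written in Java for illustration, shows one way a line of explanation with customization segments could be represented and resolved against program-specific values. It is a minimal sketch under assumed names (ExplanationTemplate, Segment, resolve, and the placeholder labels), not the actual implementation of the problets.

import java.util.*;

// Hypothetical sketch: a line of explanation is a sequence of fixed text and
// customization segments; segments marked "askable" become candidate
// self-explanation questions when the template is resolved for a problem.
public class ExplanationTemplate {

    // One customization segment, e.g., <variable> or <line-number>.
    static class Segment {
        final String name;      // e.g., "variable", "line-number"
        final boolean askable;  // marked with an ask tag by the pedagogic expert
        Segment(String name, boolean askable) { this.name = name; this.askable = askable; }
    }

    private final List<Object> parts = new ArrayList<>(); // String literals and Segments

    ExplanationTemplate text(String s) { parts.add(s); return this; }
    ExplanationTemplate slot(String name, boolean askable) { parts.add(new Segment(name, askable)); return this; }

    // Resolve the template with program-specific values; collect the askable
    // segments, together with their correct answers, as question candidates.
    String resolve(Map<String, String> values, List<String> candidates) {
        StringBuilder line = new StringBuilder();
        for (Object part : parts) {
            if (part instanceof Segment) {
                Segment segment = (Segment) part;
                String value = values.get(segment.name);
                line.append(value);
                if (segment.askable) {
                    candidates.add(segment.name + " = " + value);
                }
            } else {
                line.append(part);
            }
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // Template for one step of the if-else worked example (placeholder names assumed).
        ExplanationTemplate step = new ExplanationTemplate()
            .text("In the condition of the if-else statement, ")
            .slot("variable", false).text(" is compared to be ")
            .slot("comparison", false).text(" than ")
            .slot("value", false).text(". The condition evaluates to ")
            .slot("result", false).text(", so the ")
            .slot("clause", true).text(" is executed on line ")
            .slot("line-number", true).text(".");

        // Program-specific details for: if( count > 0 ) ...
        Map<String, String> values = Map.of(
            "variable", "count", "comparison", "greater", "value", "0",
            "result", "true", "clause", "if-clause", "line-number", "32");

        List<String> candidates = new ArrayList<>();
        System.out.println(step.resolve(values, candidates));
        System.out.println("Candidate self-explanation questions: " + candidates);
    }
}

In the tutors themselves the values would come from the problem model rather than a hand-built map; the sketch only illustrates how resolving a template can simultaneously yield the customized explanation and the question candidates with their correct answers.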

One approach for filtering the questions is for a pedagogic expert to identify the specific segments in each line of explanation that would make the best candidates for self-explanation questions. Some heuristics used by the pedagogic expert are:

- Semantic self-explanation questions are better than syntactic self-explanation questions because they induce the student to mentally execute the program using his/her mental model of the program.
- Self-explanation questions with more answering options are preferable to those with limited answering options: the more the options, the less the likelihood that the student can arrive at the correct answer by guessing alone. For example, there are only two choices for the value to which the condition of the if-else statement can evaluate, viz., true and false; on the other hand, there are an infinite number of choices for the result of evaluating an arithmetic expression.
- Repetitive events are less preferable to one-off events, e.g., the line that is executed first after the condition of a loop evaluates to true is a potentially repetitive question, whereas the line that is executed after the condition evaluates to false is a one-off question. Asking the student to answer the former question for every iteration of a loop is not productive: the student is unlikely to benefit from answering it for the iterations of the loop after the first one.

In this work, a pedagogic expert with two decades of experience teaching introductory programming picked the segments of explanation that were candidates for self-explanation. These candidates were picked once for each domain component, not once for each problem. The domain model was modified to annotate these customization segments with ask tags; e.g., the domain model of the if-else statement was modified to generate an annotated explanation along the lines of:

In the condition of the if-else statement, <variable> is compared to be <comparison> than <value>. The condition evaluates to <result>, so the <ask>clause</ask> is executed on line <ask>line-number</ask>.

When presenting the worked example for a problem, each ask tag is presented as a self-explanation question, with a drop-down box of options from which the learner is asked to select the correct answer. Since the customized value of the segment enclosed by the ask tag is resolved by the problem model, the correct answer to each self-explanation question is known to the tutor. Based on the correct answer, the problem model automatically generates distracters for each self-explanation question as follows:

- For program objects such as variable names, the problem model uses the names of other variables in the program as distracters;
- For literal values such as literal constants and line numbers (e.g., 32), the problem model generates distracters that randomly straddle the correct answer (e.g., 31, 33, 34, and 35).

Finally, the tutor randomly orders the distracters and the correct answer before presenting them as options in the drop-down box of the self-explanation question.

The problem model also automatically generates the feedback provided to a learner when the learner selects an incorrect option for a self-explanation question. A generic version of the feedback is:

- For numerical answers (such as line numbers and literal constants in the program), the student is told whether the correct answer is higher or lower.
- For symbolic answers (such as variable names), the student is told that the answer is incorrect.

This feedback is adequate to prompt the student to try again. It is generic, i.e., not specific to the self-explanation question being presented. Since the problem model is also the executable expert module, the problem model can also automatically generate more context-sensitive feedback based on the correct answer, e.g.:

- For numerical answers (such as the values of variables), the student is asked to review specified line(s) of code and try again, e.g., “Consult line 23 for the value last assigned to the variable and try again”.
- For symbolic answers (such as variable names), the student is told why the selected answer is incorrect, e.g., the selected variable is not in scope, or the program does not contain any function with the selected name.

Both the generic and context-sensitive versions of feedback can be generated automatically by the problem model and do not have to be hand-coded by the author of the problem. Illustrative sketches of distracter and feedback generation follow.
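By way of illustration, the sketch below shows one plausible realization of the distracter rules and of the random ordering of drop-down options. The class and method names (DistracterGenerator, symbolicDistracters, numericDistracters, options) and the choice of offsets are assumptions made for the sketch, not details of the problets.

import java.util.*;

// Hypothetical sketch of distracter generation for a self-explanation question:
// symbolic answers draw distracters from other names in the program, and
// numeric answers get distracters that randomly straddle the correct value.
public class DistracterGenerator {

    private static final Random random = new Random();

    // Distracters for a symbolic answer (e.g., a variable name): other names in the program.
    static List<String> symbolicDistracters(String correct, List<String> namesInProgram, int count) {
        List<String> pool = new ArrayList<>(namesInProgram);
        pool.remove(correct);
        Collections.shuffle(pool, random);
        return pool.subList(0, Math.min(count, pool.size()));
    }

    // Distracters for a numeric answer (e.g., a line number such as 32):
    // values that randomly straddle the correct answer (e.g., 31, 33, 34, 35).
    static List<Integer> numericDistracters(int correct, int count) {
        Set<Integer> distracters = new LinkedHashSet<>();
        while (distracters.size() < count) {
            int offset = 1 + random.nextInt(3);  // distance 1..3 from the correct answer (assumed)
            int candidate = correct + (random.nextBoolean() ? offset : -offset);
            if (candidate > 0) distracters.add(candidate);
        }
        return new ArrayList<>(distracters);
    }

    // The options shown in the drop-down box: distracters plus the correct answer, randomly ordered.
    static <T> List<T> options(T correct, List<T> distracters) {
        List<T> options = new ArrayList<>(distracters);
        options.add(correct);
        Collections.shuffle(options, random);
        return options;
    }

    public static void main(String[] args) {
        // Which line is executed after the condition in the example program evaluates to true?
        System.out.println(options(32, numericDistracters(32, 4)));
        // Which variable is compared in the condition? (variable names are assumed)
        System.out.println(options("count", symbolicDistracters("count", List.of("count", "sum", "i", "max"), 3)));
    }
}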

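The feedback rules can be sketched in the same spirit. The method names and the exact message strings below are assumptions, apart from the higher/lower hint and the “Consult line ...” style of context-sensitive hint described above.

// Hypothetical sketch of feedback generation for an incorrect self-explanation answer.
// The generic version compares only the selected and correct answers; the
// context-sensitive version also uses information from the problem model,
// such as the line on which a variable was last assigned.
public class FeedbackGenerator {

    // Generic feedback for numerical answers: say whether the correct answer is higher or lower.
    static String genericNumeric(int selected, int correct) {
        if (selected == correct) return "Correct.";
        return "Incorrect. The correct answer is " + (correct > selected ? "higher." : "lower.");
    }

    // Generic feedback for symbolic answers: only say that the answer is incorrect.
    static String genericSymbolic(String selected, String correct) {
        return selected.equals(correct) ? "Correct." : "Incorrect. Please try again.";
    }

    // Context-sensitive feedback for the value of a variable: point the student to the
    // line where the variable was last assigned (that line number would come from the
    // problem model; here it is passed in as a parameter).
    static String contextSensitiveValue(String variable, int lastAssignedLine) {
        return "Consult line " + lastAssignedLine + " for the value last assigned to "
                + variable + " and try again.";
    }

    public static void main(String[] args) {
        System.out.println(genericNumeric(30, 32));              // correct answer is higher
        System.out.println(genericSymbolic("sum", "count"));     // only says it is incorrect
        System.out.println(contextSensitiveValue("count", 23));  // points to an assumed line 23
    }
}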


The complete explanation of a typical program may involve 30-200 lines. Even with the pedagogic expert picking the customization segments that are the most suitable candidates for self-explanation, a typical program might yield tens of self-explanation questions. In order to keep the number of self-explanation questions reasonable, an additional configuration parameter built into the tutor specifies the maximum number of self-explanation questions allowed per problem, e.g., 3. After generating the worked example of a program, the tutor picks the first three candidates and turns them into self-explanation questions embedded in the worked example. If fewer than three candidates are available for a program, all the candidates are turned into self-explanation questions.

All the steps in the generation of self-explanation questions, namely the identification of candidates for self-explanation, the creation of distracters for each question, the selection of the self-explanation questions presented to the student, and the feedback presented when the student selects an incorrect option, are resolved by the domain model (of which the problem model is a customized copy) for all the problems. They do not have to be hand-coded individually for each problem. Therefore, the entire process of generating self-explanation questions embedded in worked examples is domain-specific, not problem-specific, and is in keeping with the benefits of using model-based reasoning to build software tutors.

As a proof of concept, a model-based tutor on selection (if-else) statements and another on switch statements were extended to automatically generate self-explanation questions using the framework described above, and evaluated [3].

Acknowledgments. Partial support for this work was provided by the National Science Foundation under grants DUE-0817187 and DUE-1432190.


References

1. Kumar, A.N.: Explanation of step-by-step execution as feedback for problems on program analysis, and its generation in model-based problem-solving tutors. Technology, Instruction, Cognition and Learning (TICL) Journal 4(1) (2006)
2. Kumar, A.N.: Model-Based Reasoning for Domain Modeling in a Web-Based Intelligent Tutoring System to Help Students Learn to Debug C++ Programs. In: Proc. ITS 2002, LNCS 2363, Biarritz, France, June 2002, pp. 792-801
3. Kumar, A.N.: An Evaluation of Self-Explanation in a Programming Tutor. In: Proc. ITS 2014, LNCS 8474, Honolulu, HI, June 2014, pp. 248-253
4. Renkl, A.: Learning from worked-out examples: A study on individual differences. Cognitive Science 21, 1-29 (1997)
