A Generic Neural Network-Based Tutorial Supervisor For Computer Aided Instruction
B.P. Bergeron
A.N. Morse
R.A. Greenes
Decision Systems Group, Harvard Medical School, Brigham & Women's Hospital, Boston, MA
Abstract
When working with review materials in a self-test mode, student involvement is maximized when the problems presented variably fall within, and occasionally slightly beyond, a student's current level of ability or training. Because of the many difficulties associated with developing generic, fully adaptive systems with rule-based expert system technology, we have focused on using neural network technology as a practical, domain-independent means of optimizing the presentation of multimedia educational programs. The pattern classification capabilities of a neural network-based tutorial supervisor, developed as a series of external commands, have been used to successfully mediate the presentation of image-intensive courseware in cardiac pathophysiology. Research issues include identifying how to extend this approach to dynamically generated courseware content, e.g., graphic simulations, and determining the educational effectiveness of various control algorithms used to assign students to problem sets of different levels of difficulty.

Introduction

Since medical students cannot typically appreciate what it is that they do not know, any competent tutor, be it human or computer, must be able to recognize and respond to student needs as they arise. In this regard, the impressions of experienced human tutors, formulated through close student-tutor interactions, are by far the best means of defining the boundaries and validity of student understanding. However, computer-based instructional systems, while currently not theoretically as effective as human tutors in qualifying student understanding, can, in practice, provide a more accurate assessment of a student's competence within a problem domain. This unfortunate reality is due in part to the high student-to-faculty ratios common in contemporary medical education settings, which effectively limit opportunities for intimate student-teacher interaction. In comparison, an appropriately designed computer-based system can exploit the direct one-to-one communications channel it has with the user to collect relevant information about either individual students or representative groups of students.

ICAI

In order to provide adaptive courseware, researchers in the area of Intelligent Computer Aided Instruction (ICAI) have developed programs that construct insightful models of a student's strengths, weaknesses, and preferred style of learning. To achieve this end, a number of Artificial Intelligence (AI) techniques originally developed for natural language understanding, knowledge representation, and methods of inference have been used [1].

Much of the initial work in ICAI focused on techniques of knowledge representation [2, 3, 4]. In later work developing generative tutoring systems [5], techniques of knowledge representation were augmented with methods of formulating models of the student in terms of issues or skills that should be learned for a given task [6, 7]. Some ICAI-based programs use AI techniques to represent the tutorial strategies themselves, providing some degree of flexibility and modularity of representation and control [8, 9]. For the vast majority of ICAI applications, rule-based expert systems have been used to guide the courseware presentation.
The resource requirements associated with providing courseware systems with comprehensive student modeling capabilities, including the large time investment required to develop and verify domain-specific databases for each specialty area, and the difficulty in creating student models that can be used in more than one domain, are prohibitive. We have therefore focused on developing generic adaptive systems that can be used with actual courseware systems, albeit with less sophisticated student modeling capabilities than those demonstrated by more complex laboratory systems. In particular, we have investigated the applicability of neural network technology as a means of providing practical adaptability for our multimedia educational programs [10].
0195-4210/90/0000/0435$01.00 © 1990 SCAMC, Inc.
Our interest in neural networks stems from their inherent ability to automatically learn and classify student behavior within a courseware environment. Compared with rule-based approaches, neural network technology promises to minimize our reliance on domain experts for help in designing courseware presentations. For example, when working with digitized images of radiographs and pathology specimens, it is often difficult, without expert assistance, to quantify the findings presented. Even in domains where there are generally accepted guidelines for content presentation, there are frequently exceptions or special cases that require expert opinion. As described below, a simple neural network system can be used to determine these guidelines with little or no need for expert intervention.

The Tutorial Supervisor
Our generic, neural network-based system for classifying students and problem sets and for controlling the overall presentation of courseware content is called the Tutorial Supervisor. The underlying pedagogical design of this system, the neural network technology used, and the way the Tutorial Supervisor can be integrated into an existing courseware application are described below.

Pedagogy

For maximum educational effectiveness when working with review materials in a self-test mode, we feel that students should be presented with materials of difficulty at or near their level of ability. For example, confronting a fourth-year medical student with questions about images that are intended to challenge a third-year pathology resident will likely frustrate the student and provide little incentive for continuing the learning exercise. Similarly, providing the resident with problems more appropriate for a first-year student will be viewed as a waste of time. Problems that fall within, and occasionally slightly beyond, a student's current level of ability or training can engage the student by providing both positive reinforcement and an element of challenge.
Traditionally, the level of difficulty of both paper- and computer-based problem sets has been determined by relating test scores to the students' correct and incorrect responses to particular problems. For example, in a properly designed test, there are generally questions that are only answered correctly by "A" students. Similarly, "B" students generally answer correctly all of the problems that have been mastered by "C" students, but only a few, if any, of the problems mastered by "A" students. However, because of the labor and time involved in manually assigning a level of difficulty to problems, this classification is normally performed as a batch process, and on an annual or semester basis at best. Paradoxically, classifications of the difficulty of problems are therefore commonly based on the responses of students from previous classes, rather than those of students currently making use of the material. Changes in curriculum, teaching practices, and student populations from semester to semester or year to year are therefore not represented in the classification scheme.

Courseware systems, when equipped with automatic pattern classification systems, such as neural networks, can address both the labor and time-lag issues associated with the classification of problem difficulty. Computer-based systems are, by their very nature, easily instrumented [11]. Audit trails of student activity within a courseware system, i.e., records of which problems were attempted, whether responses were correct, time spent on each problem, the total score, and so forth, can be easily compiled and encoded in a form suitable for processing by a neural network. After each student-courseware interaction, a neural network can categorize a student's responses and combine this response pattern with its current model for a given classification of difficulty, i.e., adapt to the student population. Since the classifications of the categories of problem difficulty are refined after each student interaction session, the resulting model for each category should closely represent the current population of student users.

Neural Network Design
Much of the current research in the area of neural networks is centered around multilevel architectures, often focusing on what the contents of the hidden layers signify [12]. However, training these multilayered neural networks, which often make use of the back-error propagation paradigm [13], can be extremely computationally intensive [14] and require expensive parallel hardware configurations to achieve anything approaching real-time responses [15]. In examining the variety of network architectures applicable to our practical courseware requirements, we have therefore opted to work with a simple but robust neural network technology first described over 30 years ago - the Madaline Perceptron [16]. The Madaline Perceptron, which may have any number of functional elements or neurodes, takes a set of attributes and classifies them into one of a number of predefined, discrete categories. In operation, each neurode receives a number of inputs that are weighted. From the weighted total input, each neurode computes a single output value. That is, the output of each neurode
is computed as the result of a transfer function of the weighted input, i.e., the dot product of the input and weight vectors. As in most neural network designs, a component of the learning process in Madaline involves a comparison of the computed output with the target output. In the Madaline design, the difference between the target and actual neurode output values is used to adjust the input neurode weights, so as to minimize the mean squared error, averaged over all neurode inputs. The Least Mean Squared (LMS) or Delta learning rule, developed by Widrow and Hoff [17, 18], defines how the input neurode weights are changed in response to a given neurode input and output:

W_new - W_old = K x (E x I) / |I|^2

where W is the input weight vector, K is the learning constant, I is the input vector, |I|^2 is the squared magnitude of the input vector, and E is the error value (computed by subtracting the actual output from the target neurode output).
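To make the update rule concrete, the following is a minimal sketch in Python (the Tutorial Supervisor itself is implemented in C as external commands; the function and variable names here, such as lms_update, are illustrative assumptions, not the actual implementation):

```python
# Hypothetical sketch of one Widrow-Hoff (LMS/Delta) weight update.
# Names are illustrative only, not from the Tutorial Supervisor's C code.

def lms_update(weights, inputs, target, k=0.2):
    """One LMS step: W_new = W_old + K * (E * I) / |I|^2."""
    actual = sum(w * x for w, x in zip(weights, inputs))  # neurode output: dot(W, I)
    error = target - actual                               # E = target - actual output
    norm_sq = sum(x * x for x in inputs)                  # |I|^2
    if norm_sq == 0:
        return list(weights)                              # zero input carries no information
    return [w + k * error * x / norm_sq for w, x in zip(weights, inputs)]

# Repeated presentations of a pattern drive the neurode output toward the target:
w = [0.0, 0.0, 0.0]
for _ in range(50):
    w = lms_update(w, [1.0, -1.0, 1.0], target=1.0)
```

With this rule, repeating the same pattern shrinks the residual error by a factor of (1 - K) per presentation, so K = 0.2 (the value we found best for categorizing student responses) converges steadily without the weight vector jumping around.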
In order to avoid network instability, the learning constant (K) must be positive, but less than 1. The ideal value for K depends on the quality of the input data. For example, if there is little noise in the input data, then a fairly large K (e.g., 0.8) may be used, resulting in very rapid convergence. However, with noisy data a K larger than 0.1 or 0.2 will result in a network that requires many learning cycles to settle down, since the weight vectors will jump around excessively (we have found that a learning constant of 0.2 is best for categorizing student responses). After the error value has been computed, the difference between the new and old weight vectors (the Delta vector) is added to the current weight vector. In the classification mode, the Madaline Perceptron compares, by means of a simple dot-product operation, the current input pattern with the input weights, to identify the classification most appropriate for the pattern. For more information on the design and operation of the Madaline Perceptron, including source code listings, see the article by Caudill [19].

For classification purposes, 10 levels of difficulty are defined in our system (0%-10%, 11%-20%, ..., 91%-100%). For example, the response pattern from a student with a total score of 35% is assigned to the 31%-40% category. Although fewer classification levels would result in more rapid program execution and impose less significant memory demands on the disk storage system, the execution speed and memory requirements of a 10-category system are insignificant. Furthermore, using too few categories can result in meaningless data. For example, if three categories are defined - 0%-33%, 34%-66%, and 67%-100% - and the vast majority of students fall within the 34%-66% classification, then the classification data are of no practical value.

Courseware Integration

Owing to the popularity of HyperCard™, SuperCard™, and other high-level authoring environments available for the Apple Macintosh, we have developed our Tutorial Supervisor as a library of portable external code modules or XCMDs. In this form, the neural network and associated functions that comprise the Tutorial Supervisor are accessible by a variety of commercial databases and authoring systems, as well as by our own courseware, developed in HyperCard, SuperCard, and traditional programming languages.

To examine the many practical issues involved in integrating our Tutorial Supervisor with a functional courseware system, we have instrumented a SuperCard™-based, image-intensive self-test environment. In this courseware system, originally developed by our laboratory for first-year medical students studying cardiac pathophysiology, students are presented with high-resolution images of gross and microscopic pathology specimens, and are asked to identify or discuss, in a multiple-choice format, highlighted regions on each image. After a random sequence of image-based problem sets has been presented (the number of images drawn from our image library is defined by the course instructor), the student is given a cumulative score. This score, along with a record of the images presented to the student and the appropriateness of each response, is saved to disk for future reference and used to train the underlying neural network. The disk file is of value not only in preserving student scores, but in making it possible for us to utilize student response data from multiple machines to enrich the training set for Tutorial Supervisors installed on other machines. For example, if the courseware is simultaneously made available to students on four computers, then the data files from the computers can be combined periodically and input to an untrained Madaline Perceptron system. The resulting file of weightings for each difficulty classification can then be installed on each of the four machines. In this way, an adequate training set sample size can be easily maintained on any machine, no matter how often it is used. Training sets distilled from previous classes of students can also be used as a starting point for new groups of students.

To allow for random presentation of images, a lookup table is maintained for each neurode input, i.e., the first neurode input is always associated with a photomicrograph of aortic valve calcification, regardless of where in the presentation sequence the problem is actually displayed. In some instances, there may be several representative images of a particular finding or concept (e.g., valve calcification), and it may be desirable to present students with only one of these images during a given training session. To accommodate this ability, a Concept Table containing the identity of one or more images per concept is maintained (like the lookup table described above, the Concept Table takes the form of simple, easily edited text fields in the SuperCard environment). If the Concept Table lists only one image for a concept, and the course instructor configures the courseware system such that this concept should always be represented, then the image associated with that concept will always be shown to students. Similarly, when there are multiple images associated with a given concept, the problem image presented to a student is a function of the category of difficulty to which each image is assigned. If there are multiple images assigned to the student's current performance rating, then an image is selected at random from this subset of images. Similarly, if there are no images available that have been categorized at the current difficulty level, then an image assigned to a greater difficulty category is presented.

Operation

Owing to the simplicity and modest computational demands of our neural network system, the Tutorial Supervisor is totally transparent to the student. After responding to an initial question on level of training and background, each student is presented with an image problem appropriate to his or her response to the question, i.e., first-semester students with no prior experience with cardiac pathophysiology are initially presented with problems from the 41%-50% difficulty category. With subsequent image presentations and responses, the difficulty level of the images is modulated to reflect the student's performance. In our current implementation, students are challenged by moving them up to the next difficulty level after two consecutive correct responses. Similarly, frustration is avoided by moving students down to a less difficult level after two consecutive incorrect responses.

Results
The computational overhead associated with calculation of the appropriate difficulty level of materials presented to students and with training of the neural network after each problem session is insignificant. For example, the training time for a 50-node Madaline Perceptron, written in C and running on a Macintosh II, is less than 1 second. After completing one run of the cardiac pathology testing system, students are free to immediately work with another series of images or move on to another courseware application.
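The two simplest supervisory computations described above, assigning a cumulative score to one of the ten difficulty categories and moving a student between levels after two consecutive correct or incorrect responses, can be sketched as follows. This Python fragment uses hypothetical names and is illustrative only, not the actual XCMD implementation:

```python
# Illustrative sketch (hypothetical names, not the actual XCMD code) of the
# Tutorial Supervisor's score bucketing and level-movement logic.

def difficulty_category(score_percent):
    """Map a cumulative score (0-100%) to one of ten bands, numbered 0..9:
    0%-10%, 11%-20%, ..., 91%-100%.  E.g., 35% falls in band 3 (31%-40%)."""
    return max(0, (int(score_percent) - 1) // 10)

def next_level(level, last_two, lowest=0, highest=9):
    """Move up one level after two consecutive correct responses, down one
    level after two consecutive incorrect responses; otherwise stay put."""
    if list(last_two) == [True, True]:
        return min(level + 1, highest)
    if list(last_two) == [False, False]:
        return max(level - 1, lowest)
    return level
```

In practice the two-response window would be reset after a level change so that a single streak does not trigger repeated moves; that bookkeeping is omitted here.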
Interestingly, we have found that the neural network approach provides a number of unexpected dividends. For example, this system automatically identifies problems that are correctly answered by lower scoring students and incorrectly handled by top scoring students. In addition, the use of a Concept Table simplifies the authoring process, since the selection of a problem pertaining to a given concept is handled by the system.

Discussion
With the increasing availability of quality video disks and CD-ROMs, each of which can contain thousands of images applicable to medical education, a portable, self-generating, domain-independent approach to creating adaptive systems seems indispensable. Compiling thousands of expert-generated rules for each video disk is clearly not feasible, and simply presenting images to students in a rigid, sequential fashion is not acceptable. We believe that simple systems based on neural network technology, such as the system described above, represent a good compromise between fully adaptive educational systems with unreasonable hardware and labor requirements on one hand, and nonadaptive, electronic page-flippers on the other. As alluded to above, adaptive systems, regardless of the control mechanism, do require a certain amount of additional overhead, compared with hard-wired systems. For example, unless there is a rich problem set from which to choose, there is little point in developing an adaptive system. Fortunately, graphic simulations are potentially capable of providing virtually unlimited variability in content. Similarly, video disks containing thousands of images and other technologies are minimizing the problem of acquiring sufficiently rich data sets. As these technologies evolve, the economic impediments associated with their use are
diminishing. After full-scale tests of our instrumented cardiac pathophysiology application this Spring, we will investigate the applicability of this approach to dynamically altering the graphic complexity of patient simulations and to a variety of control algorithms for moving students between classifications.

Acknowledgments
This publication was supported in part by grants LM04715-02, LM 07037, and LM 04572 from the National Library of Medicine, and by the Health Sciences & Technology Division of the Massachusetts Institute of Technology and Harvard Medical School.
References

1. Clancey WJ, Shortliffe EH, Buchanan BG. Intelligent computer-aided instruction for medical diagnosis. In: Clancey WJ, Shortliffe EH, eds. Readings in medical artificial intelligence: the first decade. Reading, Mass: Addison-Wesley, 1984: 256-74.
2. Brown JS, Burton RR. Multiple representations of knowledge for tutorial reasoning. In: Bobrow DG, Collins AM, eds. Representation and Understanding. New York: Academic Press, 1975: 311-49.
3. Carbonell JR. Mixed-initiative man-computer instructional dialogues. Bolt Beranek and Newman, Inc., Cambridge, Mass, 1970.
4. Suppes P, Morningstar M. Computer-assisted instruction at Stanford, 1966-68: data, models, and evaluation of the arithmetic programs. New York: Academic Press, 1972.
5. Wexler JD. Information networks in generative computer-assisted instruction. IEEE Trans Man-Machine Sys 1970; 4: 181-90.
6. Barr A, Atkinson RC. Adaptive instructional strategies. In: IPN Symposium 7: Formalized Theories of Thinking and Learning and Their Implications for Science Instruction. 1975: 72-8.
7. Burton RR, Brown JS. A tutoring and student modelling paradigm for gaming environments. In: Proceedings of the Symposium on Computer Science and Education. 1976: 236-46.
8. Brown JS. Reactive learning environments for computer-aided electronics instruction. Bolt Beranek and Newman, Inc., Cambridge, Mass, 1976.
9. Goldstein I. The computer as coach: an athletic paradigm for intellectual education. AI Laboratory, MIT, 1977.
10. Bergeron B. Using a spreadsheet metaphor to visualize neural network behavior. Collegiate Microcomputer 1989; 8(2): 81-92.
11. Bergeron B. Program instrumentation: a technique for evaluating educational software. Collegiate Microcomputer 1990; 8(1): 34-46.
12. Touretzky DS, Pomerleau DA. What's hidden in the hidden layers? Byte 1989; 14(8): 227-33.
13. Rumelhart DE, McClelland JL. Parallel distributed processing. Cambridge: MIT Press, 1986.
14. Hinton GE. Learning in parallel networks. Byte 1985; Apr: 265-73.
15. Morse KG. In an upscale world. Byte 1989; Aug: 222-3.
16. Carpenter GA. Neural network models for pattern recognition and associative memory. Neural Networks 1989; 2(4): 243-57.
17. Widrow B, Hoff ME. Adaptive switching circuits. In: 1960 IRE WESCON Convention Record, Part 4. 1960: 96-104.
18. Widrow B, Stearns SD. Adaptive signal processing. Englewood Cliffs, NJ: Prentice-Hall, 1985.
19. Caudill M. Neural network primer: part II. AI Expert 1988; 3(2): 55-61.