The Design and Implementation of an Interactive Learning Tool for Statistical Reasoning with Uncertainty Deborah A. Vastola and Ellen L. Walker Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY 12180 email:
[email protected],
[email protected] fax: 518-276-4033
1 Introduction

Statistical reasoning with uncertainty is a topic that is generally covered in an introductory college-level course in Artificial Intelligence and is particularly relevant to expert systems in AI. Its purpose is to arrive at a degree of belief in one or more hypotheses, based on incomplete or uncertain data (evidence). Unfortunately, the concepts of statistical reasoning with uncertainty can seem as ambiguous to students as the data on which they are intended to work. We describe a framework, in the form of an interactive tool, to help students learn about the power and limitations of reasoning with uncertainty.

There are numerous reasoning with uncertainty models [7]. We chose to focus on the Dempster-Shafer model [9]. But since we propose a generic framework for learning reasoning with uncertainty, our discussion is not limited to Dempster-Shafer theory. The tool structure and functionality were designed to accommodate additional models.

The goal of this paper is to serve the interests of two types of readers:

1. Those who intend to teach reasoning with uncertainty. We describe in detail the educational objectives that have been established for the tool.

2. Those who intend to design educational computer applications. We describe the requirements and design process that must be undertaken in order to create a computer tool that effectively communicates the educational objectives.

For a more detailed discourse on our development process, including a discussion of code re-use, choosing a graphics package, conducting usability reviews, and showing the results of those reviews, refer to [11].
2 Educational Objectives

The first step in developing a tool was to address the issue of why many students have difficulty understanding reasoning with uncertainty. The answer to this question then helped us formulate
the educational objectives for the tool, and the objectives, in turn, dictated how the tool should look and feel (i.e., the framework).

While it is true that reasoning with uncertainty can be complex, our own experiences in learning the material* indicate that the difficulty in learning the topic does not merely lie in its complexity. The cause is really two-fold. One can be characterized as a problem of foundation; the other as a problem of presentation:

1. Reasoning with uncertainty requires some knowledge and ease with probability. Students should have some initial experience in elemental probability theory in order to truly understand uncertainty models.† Probability is the root of reasoning with uncertainty; most models are either extensions of or calculated deviations from probability. Basic ideas about ways evidence is combined, disjointness, intersection, and the meanings of terms like "most probable" and "less plausible" are necessary foundations.

2. Textbooks we surveyed generally lack a conceptual presentation of reasoning with uncertainty. These introductory-level materials either give only cursory overviews of various models or they concentrate on the nitty-gritty details, i.e., the algorithm and mechanics for generating numbers. There is little focus on the concepts that differentiate one model from another, other than by virtue of the algorithms they use. This tends to leave the impression that reasoning with uncertainty is a jumbled bag of tricks. On the other hand, advanced material does tend to focus on concepts, including differentiation, but in a much more esoteric manner than would be appropriate for an introductory AI course [9][6].

It is unreasonable to expect students to be masters of probability (or to digest advanced material in the time the course allocates to this topic), so the tool must establish a basic foundation.
Also, since we recognize that students won't likely remember the exact algorithms for each model a year after taking the course, the tool must concentrate on presenting those concepts that can and should be remembered. After studying probability [1][4], analyzing advanced material on Dempster-Shafer [9], and dissecting the algorithms of several models [7], we developed the themes itemized below. We were convinced that a tool designed for these themes would provide the students a meaningful and lasting learning experience in reasoning with uncertainty:

1. Reasoning with uncertainty is a natural process given the dynamic state of the world we live in. Our beliefs evolve as we learn more about the world. Some of our beliefs grow as we get more evidence; some of our beliefs are retracted.

2. We use terms like belief, likelihood, probably, and plausible because the evidence itself carries measures of uncertainty (not only because we may not have all the evidence at a point in time).

3. The information that we gather and the beliefs that are derived are used to make decisions.

4. Those decisions are not always clear cut; reasoning models can be a powerful aid to human judgment/expertise, but not a mechanism to supplant the human factor (not yet, anyway).

*The first author was new to the topic area.
†The term probability refers to the mathematical rules of probability theory and both the objective and subjective views of probability.
5. There is a tradeoff between the cost of gathering evidence (e.g., making tests) and the cost of making the wrong decision based upon the amount of evidence we choose to gather.

6. Each specific model has its own key concepts and a context within which to compare the model with probability and Bayes reasoning [6].

7. Each model has its own algorithm for deriving beliefs.

We will refer back to this list of themes to demonstrate how each item is satisfied by our tool. Since the initial implementation is based upon Dempster-Shafer theory, we will also refer back to the key concepts (i.e., #6 above) of Dempster-Shafer, which are itemized below:

A. Theta is the universal set of hypotheses. We believe that somewhere in that universal set lies the answer.

B. While our belief in Theta is 100%, before any evidence is observed our belief in any one hypothesis is 0. (We can live with total ignorance instead of requiring a declaration of "a priori" belief.) However, any one of the hypotheses is completely plausible (100%) as the answer before we have any evidence.

C. We deal with intervals of Belief and Plausibility (vs. point values in probability).

D. The idea is to narrow the size of the interval [0, 100%].

E. This is done by looking at the commonality of evidence linked to hypotheses across sets, narrowing in on the hypothesis that may be the answer. Notice we not only examine the singleton subsets of Theta, but, unlike Bayes reasoning, we also examine the non-singleton subsets.

F. The degree to which we do not believe that the alternative hypothesis (or hypotheses) is the answer is the degree to which this hypothesis is plausibly the answer. Thus the growth or reduction in belief of a set's complement directly impacts the set's plausibility (and vice versa).

G. m values (resulting from some operation performed on the commonality of evidence linked to hypotheses) are the means to the end: Belief and Plausibility.
The algorithm will show exactly how Belief and Plausibility are derived from m values. The key concept here is that m values must sum to 1 (as in probability) but this constraint is not placed on the sum of Beliefs (or Plausibility). To best achieve our educational objectives we decided that the tool's primary use will be by the individual student rather than by the teacher in a classroom setting (although the tool can certainly be used to introduce the concepts to the class as a whole, as described in Section 4). The decision to make this a learning (versus teaching) tool has some implications. Naturally, less attention was placed on ensuring that actions on the screen could be viewed from a distance. The more important implication, however, is that a learning tool views the student as the active and only participant in the learning process. Therefore the tool must be a platform from which the student can actively investigate and explore, providing the student with the ability to control interactions and providing varied interactions (paths, options) to control. A learning tool should present many examples, plenty of help, tutorials, and feedback. And it should be fun to use. An interesting, user-friendly interface is very important to a learning tool.
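The derivation that concept G points to, Belief and Plausibility from m values, can be sketched in a few lines of code. This is only an illustrative sketch, not the tool's implementation; the suspect names come from the case described later in the paper, and the m values here are invented.

```python
# Sketch of Belief and Plausibility derived from m values (concepts C-G).
# The m values below are illustrative, not taken from statrad.
THETA = frozenset({"walker", "james", "abbott", "costello"})

m = {
    frozenset({"walker", "costello"}): 0.6,  # mass contributed by one piece of evidence
    THETA: 0.4,                              # remaining mass stays on Theta (ignorance)
}
assert abs(sum(m.values()) - 1.0) < 1e-9     # m values must sum to 1 (concept G)

def belief(A):
    """Bel(A): total mass committed to subsets of A (our minimum conviction)."""
    return sum(v for B, v in m.items() if B <= A)

def plausibility(A):
    """Pl(A) = 1 - Bel(complement of A): mass not committed against A (concept F)."""
    return sum(v for B, v in m.items() if B & A)

# A singleton's interval before evidence narrows things down is [0, 100%]:
walker = frozenset({"walker"})
print(belief(walker), plausibility(walker))        # 0.0 1.0
print(belief(frozenset({"walker", "costello"})))   # 0.6
```

Note that the Beliefs of the subsets need not sum to 1; only the m values carry that constraint, which is why the bar-chart intervals and the pie chart convey different information.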
Great emphasis was placed on graphically representing ideas ("a picture is worth..."), each idea being reinforced by text. Section 3 discusses the graphical representations that were chosen and the reasons behind those choices. Color and shade were significantly utilized to convey ideas in the tool. Color is part of most of our daily lives and carries information content on its own. Why not exploit that content to reinforce the ideas of reasoning with uncertainty? Besides, color adds to the visual appeal of the tool.‡

Designing an educational computer application is not all that different from designing a course. It is necessary to: know who you are teaching; know what you are teaching; and know how you will teach it.
3 Features of the Framework: The statrad Tool

In this section we describe the features and components of the tool, Stat Reasoning-Ace Detective (statrad).§ They are:

- Two modes of operation: (1) Investigate mode, where students completely control which evidences are present, and (2) Game mode, which provides a more supervised arena for learning.
- Graphical representation of the rules and the ability to interact with that representation.
- Animation of the model's belief results.
- Additional pictorial representation of intermediate values.
- Tutorial of the mechanics of the model (i.e., the algorithm).
- A way to filter and control the number of results displayed.
- Providing material for off-line study.
- An expert component, where students can design their own rules (i.e., evidence and hypotheses).
- Comprehensive, case-specific help.

As we navigate the reader through the tool, we will show how each framework feature achieves our educational objectives as well as how concepts are gradually introduced to students. Invoking statrad brings up the main menu shown in Figure 1.
3.1 Investigate Mode
Investigate mode, a menu option in Figure 1, allows students to control which rules are fired (i.e., which evidence is actually observed). Students can see the impact that each observation has on the belief results of the reasoning model. This mode was designed to allow students to explore the possibilities of the model. In this section, we show how the features of this mode can be used to identify the likely suspect in a detective case.

‡While not optimal, the tool can be used on a monitor that does not support color.
§The tool is copyrighted: Copyright 1994 Deborah A. Vastola. All Rights Reserved.
There was a murder last night in the Hanover building downtown. The victim was found in the office of the Ace Company. (The Hanover building houses offices for more than one company.) A security camera shows that no one entered or left the building after 11:00pm. So anyone who was in the building after 11:00pm must be a suspect. That list of suspects consists of Walker, James, Abbott, and Costello. Walker and James are employees of the Ace Company. Abbott and Costello do not work for the Ace Company. Right now, the police have nothing to go by. In other words, they haven't observed or collected evidence that points to one suspect over the others. Any one of the suspects could have done it.

3.1.1 Graphical Representation of the Rules
As we can see in Figure 2, a network graph conveys a snapshot of all the information that we need to apply the model, setting the stage for our educational objectives, particularly themes 1 and 2 and key concept A. The network displays our list of suspects (walker, costello, james, abbott). This is our universal set of hypotheses. The network also displays the set of possible evidence that, if observed, will impact the evolution of our beliefs about who might have committed the crime.

The network also displays the measures of uncertainty associated with each piece of evidence. These values represent the strength of the evidence. For the Dempster-Shafer model these are the m values, or basic probability assignments, and each value represents the weight that the evidence contributes to the provability of the hypothesis it is linked to. In a Bayesian Network model these values would be conditional probabilities [5]. In a MYCIN-like expert system model they would be certainty factors [7].

In addition to being a quick and succinct way to convey the domain knowledge, the network graph also provides students with a simple interface with which they can interact. Students can press an evidence node to "observe" that evidence (i.e., to control which evidence is actually present). Students can also change the weight values associated with the evidence before beginning an investigation by simply changing the value box. In this way they can explore how changes in the uncertainty that each piece of evidence carries might affect the results.

The network graph is minimal. There are no labels or multi-directional arrows or special symbols. Showing all the sophisticated ways in which knowledge may be represented is an important topic but is not an objective of the tool. Keeping the graph as simple as possible satisfies a basic pedagogical rule: don't muddy the waters by introducing superfluous information.
The network graph appears alone on the screen when the Investigate option is invoked. We wish students to first focus on the domain. The other concepts will be phased in as the students continue their interactions.

The use of color is particularly important in the network graph. Evidence nodes are grouped by attribute, so that mutually exclusive nodes are readily visible. Attribute grouping is conveyed through assignment of like shades of grey. In Figure 2 we have two attribute classes, one based on handedness (left vs. right) and the other based upon whether the person is an employee of the Ace Company (not an inside job vs. an inside job). Each hypothesis node is a different color. We will see in the next section how the color of each hypothesis is used to help correlate results.
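The domain knowledge the network graph displays (evidence nodes grouped by attribute, each linked to a subset of hypotheses with a weight) could be held in a structure like the following. The field names and weights are our own illustrative choices, not statrad's actual case-file format.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical representation of one rule in a case; illustrative only.
@dataclass(frozen=True)
class Rule:
    evidence: str          # evidence node label, e.g. "lefty"
    attribute: str         # attribute class, used for like-shaded grouping
    hypotheses: frozenset  # hypothesis nodes this evidence links to
    weight: float          # the m value (basic probability assignment)

ACE_CO_MURDER = [
    Rule("lefty",        "handedness", frozenset({"walker", "costello"}), 0.6),
    Rule("righty",       "handedness", frozenset({"james", "abbott"}),    0.6),
    Rule("insideJob",    "employment", frozenset({"walker", "james"}),    0.8),
    Rule("notInsideJob", "employment", frozenset({"abbott", "costello"}), 0.8),
]

# Mutually exclusive evidence falls out of the attribute grouping:
by_attribute = defaultdict(list)
for rule in ACE_CO_MURDER:
    by_attribute[rule.attribute].append(rule.evidence)
print(dict(by_attribute))
# {'handedness': ['lefty', 'righty'], 'employment': ['insideJob', 'notInsideJob']}
```

Grouping by attribute is also what makes the grey-shade assignment straightforward: one shade per attribute class.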
3.1.2 Animation of Results: Evolving Beliefs
To give the student a sense of the dynamic nature of the world, we want to show the progression of evolving beliefs in the hypotheses. No matter which model is used, the purpose of uncertain reasoning analysis is to move from a position of uncertainty toward a position of certainty (or at least to become more certain of how uncertain the beliefs are). "Dynamic", "progression", "evolution", "movement": naturally the tool uses animation to show the results of the model.

As each piece of evidence is "observed" (by clicking on an evidence node in the network graph), animated bar chart activity occurs on the screen. The animation shows the impact that evidence has on our beliefs and certainty about the set of hypotheses. This animation is intended to give students an intuitive feel for the reasoning model, not the details or the mechanics of the model. Our goal was to present a common-sense pictorial view of what is happening. So it was a conscious decision to exclude any reference to numbers or specific values. Let's continue with our murder mystery:
The Police Coroner's report just arrived. The report states that the angle of the victim's wound leads her to believe that the killer is left-handed. The detectives now have their first piece of evidence!
Figure 3 shows the results when students observe, by clicking on the evidence, that whoever "done it" is left-handed. In the network graph, the links of the lefty evidence change color, each to match the color of the hypothesis to which it points. Since walker and costello are left-handed, the links change to blue and orange, respectively. Highlighting the links of the observed evidence helps students keep track of which evidence was selected. The decision to use color matching, however, was specifically made for the Dempster-Shafer model, in order to reinforce the concept that this model focuses on the common points of linkage to arrive at its belief results.

The belief results are the intervals of Belief and Plausibility. Observing evidence triggers the display of the Dempster-Shafer belief results, Belief and Plausibility. The Belief and Plausibility of each subset of Theta are represented as a bar (upper and lower bounds on a scale of 0 to 100%). The initial bar display of all subsets of Theta (except Theta, whose Belief and Plausibility are fixed at 100%) shows Belief at 0% and Plausibility at 100% (concept B). Remember we said that when no evidence has been collected yet (the initial state), any one of the suspects could be the killer.

The changes in the intervals as a result of observing left-handed are animated. Figure 3 shows the state of belief after the animation completes. The top window in Figure 3 displays Belief and Plausibility for the singleton subsets. The base color used for each singleton bar matches the color used for that hypothesis in the network graph. This helps students to more quickly correlate results. The window on the right shows Belief and Plausibility for the non-singletons. The color red is reserved for non-singletons. In our example, james and abbott become less plausible suspects since neither of them is left-handed. The bar charts and coloring help to convey the Dempster-Shafer key concepts A-F (G is covered by another component).
Belief and Plausibility measurements are the lower and upper bounds on the probability of a hypothesis. Thus, it makes sense to view the current probability of each hypothesis as a sub-range of the complete 0-100% range bounded by the Belief and Plausibility numbers. The location of the range shows the probability of the hypothesis, and the size of the range shows how certain (or uncertain) we are. The lower bound is our minimum conviction on the probability of a hypothesis. As such, Belief can be viewed as a stronger assertion than Plausibility. Thus we use the solid (i.e., stronger) color to represent belief versus the lines (which appear as a paler version of the solid color) for Plausibility.
Continuing with our murder mystery:
The police just received an anonymous tip that the crime was an "inside job". A second piece of evidence!

Figure 4 shows the results when students observe the next evidence, insideJob. abbott and costello become much less plausible suspects since neither of them is an employee of the Ace Co. And we now have some belief that walker may have committed the murder. The walker node is a common point of linkage for the two pieces of evidence observed, lefty and insideJob. Two links are now dark blue to match the hypothesis node to which they connect. Color reinforces the commonality.
At this point, since there is no more evidence, the police have to make a decision on whether to bring in walker for more questioning or actually arrest walker now. Either way, the detectives are closer to solving the case!

3.1.3 Showing Intermediate Values: The More Option
The tool design anticipates that with any reasoning model, it might be desirable to show more information, such as intermediate values from which the end results are derived. Such intermediate values should be viewed side by side with the end results to which they contribute. The More button serves this function by popping up another window.

For the Dempster-Shafer model, when students press More, they will see a pie chart representation of the mass distribution (m values) (see Figures 5 and 6). The m values are the means to obtaining the end results, i.e., Belief and Plausibility. A pie chart seemed to be the natural display vehicle since it conveys the concept that m values must sum to 1 (concept G). The color of each pie piece corresponds to the color used for the respective bar chart. We include a "before" and "after" snapshot of the m values since it is instructive for students to see how the pieces of the pie change when evidence is added, and to see the effect of new evidence on the m value of Theta in particular.

Figure 5 is the result of students pressing More after they observed the lefty evidence. Figure 6 is the result of students pressing More after insideJob was observed. We include these figures to show that the tool provides the means to capture the complete trail of values. The ability to step through a model can be of great benefit to someone just learning about a reasoning model.
3.1.4 Tutorial of the Mechanics of the Model: The #'s Option
We have initially focused the students' attention on acquiring a common-sense view of reasoning with uncertainty and the key concepts of the specific model. This does not minimize, however, the value of educational objective 7. The formalism of the algorithm is important in drawing out the details and gaining a deeper understanding of the model. In this component, the precise algorithm of the model (e.g., Dempster-Shafer evidence combination) is explained for the specific case environment.

By pressing the #'s button, a window pops up with a tutorial of the calculation steps and resulting numbers for the most recent piece of evidence that was observed. The tutorial can be printed (see Appendix A). By examining the tutorials generated when each piece of evidence is observed, students can see the complete trail of calculations for our Who-Done-It? example in the AceCoMurder case, from the state of things before any evidence
was observed, to the calculations for the observance of lefty, and finally to the calculations for the observance of insideJob.
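The calculation trail the #'s tutorial walks through is Dempster's rule of combination. A compact sketch is below; the evidence weights are invented for the AceCoMurder case and are not the tool's actual numbers.

```python
# Sketch of Dempster's rule of combination; weights are illustrative only.
def combine(m1, m2):
    """Combine two mass functions defined over subsets (frozensets) of Theta."""
    combined = {}
    conflict = 0.0  # total mass that lands on the empty (null) set
    for A, v1 in m1.items():
        for B, v2 in m2.items():
            C = A & B
            if C:
                combined[C] = combined.get(C, 0.0) + v1 * v2
            else:
                conflict += v1 * v2
    # Normalize so the surviving m values again sum to 1.
    return {C: v / (1.0 - conflict) for C, v in combined.items()}

THETA = frozenset({"walker", "james", "abbott", "costello"})
m_lefty  = {frozenset({"walker", "costello"}): 0.6, THETA: 0.4}
m_inside = {frozenset({"walker", "james"}):   0.8, THETA: 0.2}

m = combine(m_lefty, m_inside)
# walker, the common point of linkage, now carries mass of its own:
print(round(m[frozenset({"walker"})], 2))  # 0.48
```

When some products land on the empty set (conflicting evidence), the `conflict` term is non-zero and the normalization step redistributes that mass, which is exactly the normalization behavior a suitably designed case can demonstrate through the pie charts.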
3.1.5 Zooming In on the Network Graph Links
For larger, more complex cases, such as the What's-The-Infectious-Disease? case in Figure 7, it may be difficult to follow the overlapping link paths. LinkZoom provides the capability to examine selected links more closely. If zoom mode is activated, when students press an evidence node, the node's links will be highlighted. If students press a hypothesis node, all links of all evidence connected to that hypothesis will be highlighted. The connections in the network graph can thus be interactively explored and verified.
3.1.6 Filtering Results
The tool design anticipates that some models may generate a large collection of data (e.g., Bayes combination of evidences, Dempster-Shafer inclusion of all subsets of Theta) and therefore some mechanism for controlling the number of results displayed must be provided. In Dempster-Shafer, Belief and Plausibility values are associated with each subset of the complete set of hypotheses. When dealing with only a handful of hypotheses, it is valuable to display all possible bar charts to help students learn to distinguish between significant and redundant evidence. This is a critical distinction for minimizing cost. However, as the number of hypotheses increases, it becomes more impractical to read all 2^N possible bars. (For a case as large as What's-The-Infectious-Disease?, this would display 93 pages of bar charts!) For this reason, the tool includes a filter option which allows students to selectively restrict the displayed results.

As Figure 8 shows, selectivity for the Dempster-Shafer filter is based upon three threshold categories: m value, Plausibility, and Belief. Only when a subset satisfies all three thresholds will a bar chart be created for that subset. Setting all three thresholds to 0 deactivates filtering.

The choice of filter setting for a given case depends upon the complexity (e.g., number of nodes and linkages) and nature (e.g., actual values of uncertainty associated with each evidence node) of the case. The appropriate setting also depends upon which ideas one wishes the case to emphasize. Filtering should be deactivated, for example, to show what is meant by "all subsets of Theta". It is probably not necessary to show all possible bar charts, however, to convey the idea that Dempster-Shafer examines joint sets as well as disjoint sets. And if the purpose of the case is to show how Dempster-Shafer normalizes when the null set has a non-zero value, the bar charts become irrelevant (the pie charts being more useful), so maximum filtering could be set.
Having a sense of how the complexity and nature of the case relate to possible outcomes of the threshold categories (e.g., m, Belief, Plausibility), and how the threshold categories relate to each other, requires experience with the model. Thus we leave it to the "expert" (see Section 3.4) to determine the appropriate permanent filter settings when the case is created (see the Filter option in Figure 9). The filter setting can be changed temporarily during an investigation, however. Experimenting with various filter settings can reinforce the idea of the order relationship of the threshold categories (i.e., m ≤ Belief ≤ Plausibility for any given subset).
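The three-threshold test itself is simple to state in code. The following sketch uses an invented toy mass function; statrad's internals may differ.

```python
# Sketch of the Filter test: a subset's bar chart is created only when the
# subset passes all three thresholds. Thresholds of 0 deactivate filtering.
def passes_filter(subset, m, m_min, bel_min, pl_min):
    m_val = m.get(subset, 0.0)
    bel = sum(v for B, v in m.items() if B <= subset)   # Belief
    pl  = sum(v for B, v in m.items() if B & subset)    # Plausibility
    return m_val >= m_min and bel >= bel_min and pl >= pl_min

THETA = frozenset({"a", "b", "c"})
m = {frozenset({"a"}): 0.5, frozenset({"a", "b"}): 0.3, THETA: 0.2}

# With all thresholds at 0, every subset with mass is shown; raising the
# m threshold to 0.25 drops Theta (m = 0.2) from the display.
shown = [s for s in m if passes_filter(s, m, 0.25, 0.0, 0.0)]
print(len(shown))  # 2
```

Because m(A) ≤ Bel(A) ≤ Pl(A) for any subset A, a tight m threshold is the most aggressive of the three, which is one way to see the order relationship the filter settings can teach.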
3.2 Game Mode
While students can freely experiment with specific cases of a reasoning model (via the Investigate mode), there is an advantage to introducing reasoning with uncertainty in a more supervised arena, such as a game. A game naturally provides an additional sense of fun to the tool, as well as the potential for some friendly competition. The game mode also sets the stage for themes 3, 4, and 5. Because the parameters (observed evidence) and objective (to have the student make a decision based on that evidence) are fixed, we can provide relevant and immediate feedback on the student's interpretation of the model's belief results and their analysis skills in using those results to make decisions.

The game is similar to "twenty questions" and can be played with any case. The larger cases are probably better suited to the game mode since their size makes the game more challenging. At the start of each game round, the tool randomly selects one piece of evidence from each attribute class as evidence that is present. The goal is to select the most reasonable conclusion (from the set of potential hypotheses) AND to do so in the fewest moves (i.e., number of questions asked). The more questions asked, the lower the total possible score. There is cost associated with gathering evidence! We use the What's-The-Infectious-Disease? case in Figure 7 to illustrate game mode:
A patient walks into your office. As Doctor, your goal is to correctly diagnose the patient's disease, based upon the patient's symptoms. Your goal is also to detect those symptoms in the fewest number of lab and office tests.

The process of determining symptoms (i.e., pieces of evidence that are observed) is done by pressing an evidence node in the network graph to inquire whether that symptom is present. The tool will pop up a window stating whether or not the symptom is present. If the students guess correctly (i.e., the patient does have that symptom), the tool animates the belief results (i.e., Belief and Plausibility) that have evolved from the presence of that symptom.

Students can use those results to determine which question (lab test) to ask next. For example, one reasonable strategy would be to use the bar chart results of the non-singleton sets to identify the set of diseases that has a belief, and then to use the network graph to identify symptoms that distinguish one or more diseases within that set. The symptom that makes that distinction would be a good piece of evidence to inquire about next. As the students continue to understand the relationship between the network graph and the evolving ranges, they will realize that some evidence nodes add more information than others.

Once students have exhausted the allotted number of questions (lab tests in our example), they are prompted for a diagnosis. Of course, students could choose to make a diagnosis before exhausting all questions, thereby potentially increasing their score. However, by not waiting until all the evidence is in, they risk making an unreasonable decision (5 points are awarded for every unused question if and only if the diagnosis was reasonable). Either way, students press a hypothesis node to select the disease that they believe is the most likely one. (The bar charts are then updated, if necessary, to reflect the entire set of symptoms the patient has.)
The game mode contains all the features and components that were described in the Investigate mode described earlier. An additional component, unique to the game, is one that evaluates the students' answers or decisions.
3.2.1 Evaluating the Student's Decision
Figure 10 shows what happens when students make a diagnosis of rockyMSF { The window on the left pops up. Having students go through the analysis necessary to answer the question, So How Does Your Answer Measure Up?, forces them to focus on the concept that model results are not always clearcut; students learn they must also employ a logical thought process when weighing the results. Decision making involves human judgment, not just an application of numbers. The fact that there aren't always clear cut answers also presented the authors with a design challenge. We could not simply designate the hypothesis with highest Belief to be the correct answer. We recognized that for many game rounds, there will be a range of acceptable answers, although perhaps within that range some answers may be more reasonable than others. Our challenge was to design a technique that could identify and categorize that range, devise a mechanism to fairly judge student answers against that range, and then to communicate this to students. We had to design a logical \thought" process for the tool itself. For the Dempster-Shafer model, the following decision analysis was used, formulated in part using maximum likelihood detection and in part using just plain common sense. Based on the randomly selected evidence, the tool internally evaluates each hypothesis on its own merits as well as its standing among the other hypotheses. Depending upon what a hypothesis' Belief value is (both the magnitude alone as well as the magnitude compared to the Plausibility of the other hypotheses), the hypothesis is assigned a number between 0 and 4. The number range correlates to the bullet selections in Figure 10 (i.e., 0=Who could tell, 1=It's more like a tossup,..., 4=Clear Winner). Notice that with this approach, more than one hypothesis can receive the same assignment. The tool keeps track of the highest number assigned. 
A hypothesis with the highest number assigned is internally chosen by the tool as the best answer. If there is more than one hypothesis with that highest number assignment, then the hypothesis with the highest Belief within that group is chosen as the best answer. However, in the latter case, the tool will never communicate to the player that this is the ONLY best answer possible. Once students characterize their answers, the tool pops up the score received (see Figure 11). The scoring (and the text that accompanies the score) is based in part on the delta between the number assignment of the answer and the number assignment of the tool's choice. A 0 delta means the students chose the best possible answer given the circumstances (20 points). In general a 1 delta means the students chose a reasonable hypothesis but there might have been a better choice. Again this could be a judgment call so some points (10) are awarded. A delta of 2 or more means the answer was way o the mark (0 points). Students receive points both for selecting a reasonable answer and for correctly characterizing the answer they selected. The student in Figure 11 received 20 points for making the most reasonable diagnosis (i.e., best answer). If she had chosen ringworm instead, for example, only 10 points would have been awarded and the text accompanying the 10 points would have been less emphatic (e.g., \that's a reasonable choice but rockyMSF would probably be a better one"). The student in Figure 11 also received 5 points for correctly characterizing her diagnosis. If, for example, rockyMSF was described by the student as a clear winner, 0 points would have been awarded for this part of the scoring. There may be a higher belief in rockyMSF than any other disease in our universal set, but the Belief and Plausibility values of rockyMSF do not justify a level of con dence that says that rockyMSF is a \pretty sure thing". (If the student believes {
otherwise, perhaps it's time to go back to medical school.) The total points students receive for each game round are accumulated in the Score box in the network graph window.
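The delta-based scoring described above can be sketched in a few lines of code. This is a hypothetical reconstruction for illustration only; the function and variable names are ours, not the actual statrad source.

```python
# Sketch of the scoring scheme described in the text: each hypothesis
# receives a confidence category 0-4 ("Who could tell" ... "Clear Winner"),
# and points depend on the delta from the best category.

def score_diagnosis(categories, chosen, characterization):
    """categories: dict mapping hypothesis -> category number (0-4).
    chosen: the hypothesis the student selected.
    characterization: the category (0-4) the student assigned to it.
    Returns (diagnosis_points, characterization_points)."""
    best = max(categories.values())
    delta = best - categories[chosen]
    if delta == 0:
        diagnosis_points = 20   # best possible answer given the circumstances
    elif delta == 1:
        diagnosis_points = 10   # reasonable, but a better choice existed
    else:
        diagnosis_points = 0    # way off the mark
    # 5 bonus points for correctly characterizing the chosen answer
    characterization_points = 5 if characterization == categories[chosen] else 0
    return diagnosis_points, characterization_points

# Hypothetical round: rockyMSF rated 3, ringworm 2, measles 0.
cats = {"rockyMSF": 3, "ringworm": 2, "measles": 0}
print(score_diagnosis(cats, "rockyMSF", 3))   # best answer, correct characterization
print(score_diagnosis(cats, "ringworm", 2))   # reasonable answer, one delta off
```

Note that, as in the tool, more than one hypothesis can share the top category, in which case any of them would score the full 20 points here.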
3.3 Providing Material for Off-line Study
The tool includes the capability of printing most of the material it presents. In fact, Figures 2-7, 9, 10, 13-14, and the Tutorial were generated using the statrad interface. Such output can be used by students for further study when they are not sitting at a computer; this is particularly important since access to a computer may not be available at all times. Students can also use printouts to prepare homework submissions and as study sheets. Additionally, this capability is useful to the instructor, who might use the printouts for homework or exam solution sets.
3.4 The Expert: Create New or Modify/View Case
The Expert component allows the student to textually specify the rules of the environment on which the statistical reasoning model will act. An environment includes a set of attributes, evidence, and potential hypotheses. The rules establish certainty values, i.e., measurable links, between evidence and conclusions, which complete the domain knowledge of the expert. In the Dempster-Shafer model, these values are the mass functions, m (also called basic probability assignments). In other models, these values would be certainty factors or fuzzy membership distributions [2]. Students can create new rules from scratch or modify existing ones. Each set of rules in an environment is stored in a unique file called a case file. Hereafter we just use the term case to refer to these rules.

Figure 9 shows the rules of the very simple (and fictitious!) AceCoMurder.case. The form on the left describes, in plain English, the specific characteristics of potential suspects that are linked to conclusions about the identity of the murderer of an Ace Company employee. The form may be printed using the PrntForm option. The Draw option displays this information in network graph form.

The expert component makes the tool flexible and extensible. It can be adapted to the domain of the students' choice, and the students can explore various sets of connections between evidence and conclusions. This makes the tool more fun to use and provides a chance for students to be creative. Additionally, the availability of this component makes a key point: a statistical reasoning system is only as good as the expertise it is based upon. Dempster-Shafer evidence combination, for example, does not help decide what the complete set of evidence and conclusions is, nor does it aid in selecting the correct m values. To create or modify the rules, the students simply begin typing on the form.
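For readers who think in code, one plausible way to represent such rules is sketched below. The data layout and names are our own illustration, not the statrad case-file format; the example rule is the "lefty" rule from AceCoMurder.case, whose mass function appears in the Appendix.

```python
# Illustrative representation of a case's rules: each rule ties a piece of
# evidence to a set of hypotheses (suspects) with a mass value m; the
# remaining mass goes to Theta, the complete set of hypotheses.

THETA = frozenset({"ab", "ja", "co", "wa"})   # all suspects in the case

rules = {
    # evidence -> (implicated hypothesis set, mass assigned to it)
    "lefty": (frozenset({"co", "wa"}), 0.4),
}

def mass_function(evidence):
    """Return the basic probability assignment (m function) stimulated by
    one piece of evidence: the focal set gets m, Theta gets 1 - m."""
    focal, m = rules[evidence]
    return {focal: m, THETA: 1.0 - m}

mf = mass_function("lefty")
for focal, m in mf.items():
    print(sorted(focal), m)
```

This mirrors the key property of a basic probability assignment: the masses over all stimulated sets sum to 1.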
3.4.1 Using an Environment File
In addition to typing on the form, students can also use an existing environment file to help specify the rules. Figure 10, right side, shows an example of a medical environment file that was used to create the rules of the What's-The-Infectious-Disease? case [10]. The environment file provides lists of potential attributes, evidence, and hypotheses from which to choose. Any item from these lists can be dragged over to the form (using a mouse). This file saves some typing. More importantly, however, while the file is not meant to substitute for expertise in an area, it does provide students a starting point for organizing the information regarding a particular case.
The tool can highlight which evidence generally belongs to which attribute class, and vice versa. Figure 12 shows that if students click on "skin" in the Attributes list, the tool will show that evidence in the "skin" class begins with "rashes". Clicking on "tick" in the Evidence list would cause "bite" in the Attributes list to be highlighted, indicating that the evidence "tick" generally falls in the "bite" attribute class. The tool does not enforce this classification association, since a piece of evidence could logically fall into more than one attribute classification. Besides, students can choose to use their own naming convention for attributes. After all, the student is the expert in this component.

Since developing a usable set of domain knowledge requires an understanding of how it will be used, students should be able to use the tool without having to develop their own domain knowledge. It would not make sense to require that one start out as an expert in the very reasoning model one is trying to learn. Therefore, the tool package includes an initial set of case and environment files, which is described in [11].
3.4.2 The Parser
We mentioned earlier that color plays an important role in the tool. The parser exploits the fact that most people have a great deal of experience in associating ideas with color, so that eventually the mere presence of the color itself quickly and easily conveys those ideas. When students select the Draw or Save option, the parser checks that the rules conform to format and syntax specifications. When there is an error in syntax, instead of displaying a wordy message, the field in error is highlighted in color. Yellow is a warning that data is missing; the text can be saved to a case file, and at a later time the file can be opened and more data added. Red indicates that the student must stop and fix the error before going any further. This convention proved to be an efficient and subtle way of conveying problems. Descriptions of field format and syntax requirements can be obtained by pressing the heading button that corresponds to the field.
3.5 Comprehensive, Case-Speci c Help
A comprehensive help component should be part of any learning tool, but it is especially important in a tool that deals with a complex topic like reasoning with uncertainty. Good help keeps the frustration level of learning difficult material (or a new tool) to a minimum. Help should be immediate, relevant, and readily available. And help should be layered, providing more information as students gain a deeper understanding of the subject matter and more experience with using the tool. This means the design must anticipate the needs of students at each point. We characterize the three layers of help as overview, conceptual, and feedback.

Overview Help prepares students to interact with the model. It describes such things as "how to get started," the intent of a function, and more detailed background information. Since the philosophy of the tool framework is that learning is through active exploration, we want students to begin their interactions right away. Thus overview help must be "just in time," providing the right amount of information, no more and no less, when it is needed and not before. Consider, for example, the main menu in Figure 1. When students first invoke the tool, the information they need is a general sense of what the purpose of the tool is and what types of interactions they can do. About This Tool is an obvious starting point. When the About This Tool menu option is selected, students are advised that if they click on the text of any menu option they will get a brief description of that option. For example, clicking on the word "Investigate" will generate a short description of what it means to Investigate and what
Learning Tool for Reasoning with Uncertainty
13
is needed before beginning an investigation (e.g., opening a case file, and instructions on how to open that file). Selecting Investigate, by clicking on the button next to the word, takes students to a screen where they can receive more detailed background information required to begin the investigation (via the Info option). The information is broken down into small, digestible pieces (starting point, brief description of intent, detailed background overview), and each piece has its own distinct interface. In essence, students have a virtual index of overview help at their disposal.

Conceptual Help reinforces the concepts of the model that are conveyed graphically (e.g., animation of the bar charts and pie charts), and it provides the technical details that cannot be conveyed through pictures alone, such as definitions of terms. Conceptual help guides students' attention to particular areas of the pictures to ensure they are focusing on the key concepts, or it suggests where to go next, augmenting the phase-in of concepts. It prescribes an approach to learning this complex topic. Since conceptual help must be directly relevant to what the students are seeing pictorially, it makes sense that this type of help be available by simply clicking on the picture in question. For example, the left window in Figure 13 popped up when a student clicked on the "james" singleton bar chart in the AceCoMurder.case. In addition to providing definitions of Belief and Plausibility, the help provides tips on how to analyze the results. Another example is the right window in Figure 14, which popped up when a student clicked on the "After" pie chart in the ComInfectiousDisease.case.

Feedback helps students understand where they are. Feedback is better when it is as specific as possible, describing the interaction students just made as well as the case upon which they were working. In this way, feedback provides a clearer context from which to make the next interaction.
Besides, messages that lack any variation tend to be ignored after repeated viewings. Text for all forms of help is contained in message files. Most messages, particularly those used for feedback, have both a common part and a variable part, the latter to be filled in at the point when the text is displayed. We believe we have struck a good balance between generic information and feedback tailored to the specific example.

All help, whether it be overview, conceptual, or feedback, should be concise and focused. It should not cover more than one or two concepts when displayed. Finally, help should be inviting. The reader will note that nowhere in the tool does the word "help" appear. One of the findings of our usability reviews is that students avoided options entitled "help" [11]. However, help options that were labeled otherwise (e.g., "About") were used frequently. It would be interesting to conduct further research on the cause of this phenomenon. In any event, our observations dictated that we remove "help" from the tool's vocabulary.
4 Using statrad in a Classroom Setting

While statrad is a learning tool, not a teaching tool, it can be used effectively in the classroom. We used the tool in class as a preview for the formalism and mathematics of the theory. We took the class through the Ace Co. Murder case very briefly (similar to our storytelling for Figures 2 and 3 in Section 3.1). This was enough to get the class excited and intrigued. As soon as we started getting questions like "how does it do that?", we put the tool aside, went to the blackboard, and introduced the theory. The "it" they were referring to was not the tool but Dempster-Shafer, as it should be. In fact, we made an effort to avoid making the tool itself the center of attraction, downplaying all the "neat features" of the tool, which the students would discover during their own explorations. Even though the formalism is complex material, we felt confident in covering the theory at
a fairly rapid rate, knowing that these ideas would be reinforced and absorbed at each student's own pace when he or she used the tool to do a homework assignment. We do recommend that the instructor give an assignment, or at least ideas to think about, to accompany the student's session with the tool. (A sample homework and solution set are included in the tool package [11].) It is the coupling of the student's exploration and concentrated think time that will optimize the tool's effectiveness. By the way, the students seemed to enjoy our using the names of people they knew as the suspects. If readers wish to also use AceCoMurder.case, the case can easily be modified, using statrad, to have more meaningful suspects.

We did one thing differently the second time we used statrad in a classroom setting. The first time, we had erroneously assumed that the relationship among terms like "plausible", "belief", "conviction", etc. was well understood. However, not all students interpreted a statement such as "It is my belief that I will get a B in Intro to AI" as a stronger conviction than a statement such as "but it is plausible that I will get an A". Consequently, there was some confusion when we introduced the significance of the lower and upper bound values of the bar charts. This confusion was easily avoided the second time by reviewing everyday occurrences of these terms before previewing the tool and Dempster-Shafer. (Exploiting our vocal capabilities, i.e., stressing the term belief with more volume and intonation, didn't hurt either.)
5 Student Reactions

Student reactions to using statrad were quite positive. We frequently received comments such as "Had lots of fun ... really learned a lot about Dempster-Shafer" and "neat!". But it was the non-verbal reaction that pleased us the most. For example, students were eager to start using the tool, and they stayed in the lab for many hours, exercising each feature. The tool kept their attention and focus on this complex theoretical topic longer than would otherwise have happened without the tool. Several students also adopted the tool's bar-chart notation to describe reasoning-with-uncertainty results on in-class exams.

The tool also exceeded our original expectations in achieving the goal of making a complex subject easier to understand. Not only did the undergraduates in the Artificial Intelligence course demonstrate noticeable improvement in their understanding of reasoning under uncertainty over the previous year (when the topic was taught without statrad), but we also had success in using the tool with younger students. We introduced the statrad tool to high school students in a two-week summer program that encouraged them to consider future study in the field of computer science [8]. The purpose of this introduction was not to convey concepts of Dempster-Shafer theory but to show an example of an exciting and useful computer application. To our amazement, several of these students arrived at the fundamental concepts of Dempster-Shafer just by using the tool. For example, one high school student proudly declared, as she was playing the "Name that Disease" game, that she would use the "red" bar charts to figure out which questions to ask next so she wouldn't waste questions and would thereby get more points. She had already picked up on the idea of "common points of linkage" and on using the non-singletons to ask questions that would distinguish hypotheses (naturally, she did not use such formal terminology).
Many high school students spent hours designing their own cases and games for each other to play. Of the eleven laboratory activities in the summer program, statrad received the second-highest rating, garnering comments such as "loved it" and "It was really interesting!". When parents came to visit, the games that the students designed were the first thing the students eagerly shared with them.
6 Conclusion

One of the most important principles of teaching is to have a point of view and to focus on it. That is what a good teacher (or a good learning tool) brings to the student. There is, of course, more than one valid perspective on any subject, and certainly more than one way to go about conveying that perspective. We have presented our point of view on the subject of learning reasoning with uncertainty and Dempster-Shafer theory, translated that perspective into concrete educational objectives, and described the statrad tool, a framework for meeting those objectives, along with how the tool's interface, content, and presentation achieve them. The tool is based upon the principles of simplicity, the need to phase in concepts, and viewing the student as an active explorer. The tool uses interactive graphics and animation to convey a common-sense view of reasoning with uncertainty. It uses color, exploiting the fact that color itself can carry information. And it provides comprehensive, "just in time" help that includes a virtual index into overview information, a prescription for how to approach learning the topic, and case-specific feedback.

We encourage all to try out the tool. There is no substitute for hands-on interaction. The statrad package can be obtained via the URL: http://ftp.cs.rpi.edu/pub/statrad
or via a direct anonymous ftp to ftp.cs.rpi.edu (cd into the /pub/statrad directory)
The README file in the statrad directory explains how to download the package and lists the platforms needed to run statrad.
Acknowledgements
The authors would like to thank all the Rensselaer undergraduate and graduate students who participated in usability testing of statrad, as well as the students in the course "Introduction to Artificial Intelligence" during the Fall semester of 1994, for the time and effort they spent using and evaluating the tool, and for all their helpful comments. Finally, we are grateful to the anonymous reviewer for helpful comments that significantly improved this document.
References

[1] Bowen, E. and Starr, M., Basic Statistics for Business and Economics. New York, NY: McGraw-Hill Co. (1982).
[2] Cox, E., "Fuzzy Fundamentals", IEEE Spectrum, October 1992, pp. 58-61.
[3] Frenzel, L., Crash Course in Artificial Intelligence and Expert Systems. Indianapolis, IN: Howard W. Sams & Co. (1987).
[4] Huntsberger, D. and Billingsley, P., Elements of Statistical Inference. Boston, MA: Allyn and Bacon, Inc. (1973).
[5] Lauritzen, S. and Spiegelhalter, D., "Local Computations with Probabilities on Graphical Structures and their Application to Expert Systems", Journal of the Royal Statistical Society, Series B, 50, No. 2, 1988, pp. 157-224.
[6] Pearl, J., Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, CA: Morgan Kaufmann Publishers, Inc. (1988).
[7] Rich, E. and Knight, K., Artificial Intelligence. New York, NY: McGraw-Hill (1991).
[8] Rodger, S. and Walker, E., "Activities to Attract High School Girls to Computer Science", Proceedings of the Twenty-seventh SIGCSE Technical Symposium on Computer Science Education, 1996.
[9] Shafer, G., A Mathematical Theory of Evidence. Princeton, NJ: Princeton University Press (1976).
[10] Tapley, D., ed., Columbia University College of Physicians and Surgeons Complete Home Medical Guide. Mount Vernon, NY: Consumers Union (1989).
[11] Vastola, D. and Walker, E., Interactive Learning Tool for Statistical Reasoning with Uncertainty, Technical Report 95-4, Department of Computer Science, Rensselaer Polytechnic Institute, Troy, New York (1995). Based upon a project report submitted to RPI in partial fulfillment of the requirements for the Master of Science degree.
Appendix A: Tutorial Generated By Pressing #'s Button
Dempster-Shafer Calculations
Using Case File: /dept/cs/ai/suit/data/AceCoMurder.case Evidence currently observed: lefty Evidence previously observed: none
STEP 1: Calculate new m values. We do this by:

1. Picking up the m function that was stimulated by the current observation of lefty. Remember, each rule has its own m function. The m function we pick up is

   m_lefty({co, wa}) = 0.4
   m_lefty({ab, ja, co, wa}) = 1.0 - 0.4 = 0.6

2. Multiplying the m function we just picked up by the m values of previously stimulated sets. Since this is the first evidence we've observed, Theta (the complete set of hypotheses) is the only previously stimulated set.

3. Collecting the newly stimulated sets, which are the set intersections resulting from the multiplication. The m values of equivalent intersections are added.

Here are the results of Step 1:

PREVIOUSLY STIMULATED:
  {ab, ja, co, wa}    m = 1

STIMULATED BY CURRENT OBSERVATION OF EVIDENCE:
                  {co, wa}         Theta = {ab, ja, co, wa}
                  m = 0.4          m = 0.6
  ----------------------------------------------------------
  intersections:  {co, wa}         {ab, ja, co, wa}
                  new m = 0.4      new m = 0.6
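The Step 1 combination can be sketched as a short function. This is an illustrative sketch, not the tool's implementation; note also that the full Dempster rule renormalizes when empty intersections receive mass, which does not occur in this example.

```python
# Combine a previously stimulated m function with a newly stimulated one:
# multiply masses pairwise and sum the products over equal intersections.

def combine(prev_m, new_m):
    combined = {}
    for a, ma in prev_m.items():
        for b, mb in new_m.items():
            inter = a & b                       # set intersection
            combined[inter] = combined.get(inter, 0.0) + ma * mb
    return combined

theta = frozenset({"ab", "ja", "co", "wa"})
prev_m = {theta: 1.0}                           # nothing observed yet
m_lefty = {frozenset({"co", "wa"}): 0.4, theta: 0.6}

for focal, m in combine(prev_m, m_lefty).items():
    print(sorted(focal), "new m =", m)
```

Running this reproduces the Step 1 result: new m = 0.4 for {co, wa} and new m = 0.6 for {ab, ja, co, wa}.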
STEP 2: Calculate Belief and Plausibility. We do this by:

1. Calculating Belief: Bel(S) = sum of m(B) over all subsets B of S.
2. Calculating Plausibility: Pl(S) = 1 - Bel(complement of S).

Here are the results of Step 2:

set                  m(set)   bel(set)   pl(set)
{ab, ja, co}         0        0          1
{ab, ja, wa}         0        0          1
{ab, ja}             0        0          0.6
{ab, co, wa}         0        0.4        1
{ab, co}             0        0          1
{ab, wa}             0        0          1
{ab}                 0        0          0.6
{ja, co, wa}         0        0.4        1
{ja, co}             0        0          1
{ja, wa}             0        0          1
{ja}                 0        0          0.6
{co, wa}             0.4      0.4        1
{co}                 0        0          1
{wa}                 0        0          1
{}                   0        0          0
{ab, ja, co, wa}     0.6      1          1
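The Step 2 table can be reproduced with a few lines of code. Again, this is an illustrative sketch, not the tool's implementation.

```python
# Compute Belief and Plausibility from the Step 1 masses:
#   Bel(S) = sum of m over focal sets that are subsets of S
#   Pl(S)  = 1 - Bel(complement of S)

theta = frozenset({"ab", "ja", "co", "wa"})
m = {frozenset({"co", "wa"}): 0.4, theta: 0.6}

def bel(s):
    return sum(v for focal, v in m.items() if focal <= s)

def pl(s):
    return 1.0 - bel(theta - s)

# A few rows of the table above:
print(round(bel(frozenset({"co", "wa"})), 4))   # 0.4
print(round(pl(frozenset({"ja"})), 4))          # 0.6
print(round(pl(frozenset()), 4))                # 0.0
```

Summing m only over the stimulated focal sets (rather than over all 2^4 subsets) works because every other subset has m = 0.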