Psychological Review 1983, Vol. 90, No. 1, 38-71
Copyright 1983 by the American Psychological Association, Inc. 0033-295X/83/9001-0038$00.75
Cognitive Processes in Propositional Reasoning

Lance J. Rips
University of Chicago

Propositional reasoning is the ability to draw conclusions on the basis of sentence connectives such as and, if, or, and not. A psychological theory of propositional reasoning explains the mental operations that underlie this ability. The ANDS (A Natural Deduction System) model, described in the following pages, is one such theory that makes explicit assumptions about memory and control in deduction. ANDS uses natural deduction rules that manipulate propositions in a hierarchically structured working memory and that apply in either a forward or a backward direction (from the premises of an argument to its conclusion or from the conclusion to the premises). The rules also allow suppositions to be introduced during the deduction process. A computer simulation incorporating these ideas yields proofs that are similar to those of untrained subjects, as assessed by their decisions and explanations concerning the validity of arguments. The model also provides an account of memory for proofs in text and can be extended to a theory of causal connectives.
The importance of deductive reasoning to cognitive theory lies in its centrality among other modes of thought. Explanations of people's statements and actions presuppose some degree of logical consistency (Davidson, 1970; Dennett, 1981). If we explain why Steve stayed away from the dinner party by saying that Steve believes all parties are boring, we tacitly assign to him the ability to deduce, from his general belief and his recognition that this is a party, the conclusion that this party will be boring. True, we sometimes attribute to others deductive errors, particularly when the route to the conclusion is long and complicated, and we will see many instances in what follows. Nevertheless, these mistakes are only discernible against a core of logically accurate thought. As Davidson (1970) puts it, "Crediting people with a large degree of consistency cannot be counted mere charity: it is unavoidable if we are to
be in a position to accuse them meaningfully of error and some degree of irrationality" (p. 96). This logical core is evident in subjects' unanimous intuitions about the validity of certain problems. Virtually all subjects are willing to accept Argument 1, which has the familiar modus ponens form, and indeed, it is very difficult to imagine what one could say to convince someone who affirmed the premises of the argument (i.e., the first two sentences of Argument 1) but denied the conclusion (the final sentence).

1. If there is an M on the blackboard, there is an R.
   There is an M.
   There is an R.
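The modus ponens schema that licenses Argument 1 can be sketched computationally as a pattern over propositions. The tuple encoding and function name below are illustrative assumptions for this sketch, not ANDS's actual representation.

```python
# Hypothetical sketch: propositions are atoms ("M") or tuples such as
# ("IF", antecedent, consequent). Modus ponens scans for a conditional
# whose antecedent is also asserted and returns its consequent.

def modus_ponens(assertions):
    """From IF p, q and p, infer q."""
    conclusions = []
    for a in assertions:
        if isinstance(a, tuple) and a[0] == "IF":
            _, p, q = a
            if p in assertions:
                conclusions.append(q)
    return conclusions

# Argument 1: If there is an M, there is an R; there is an M; therefore R.
premises = [("IF", "M", "R"), "M"]
print(modus_ponens(premises))  # ['R']
```

On this encoding the single rule suffices: the schema matches the two premises directly, which is why the argument has a one-step mental proof.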
One can often bring new evidence to bear in persuading someone to change his or her belief in particular propositions, but it is not clear what sort of evidence would be relevant in overcoming resistance to the validity of this inference. Further, any proof or explanation of Argument 1 is likely to be no more convincing than the argument itself (Carroll, 1895; Haack, 1976; Quine, 1936). If Argument 1 is not conclusive, what is? Thus, a major goal for a psychological theory of deduction is to account for the pervasive appeal of such arguments. Of course, not all deductive inferences are as obvious as that of Argument 1, and even
I would like to thank Jonathan Adler, Norman Brown, Carol Cleland, Allan Collins, Fred Conrad, Donald Fiske, Gary Kahn, Lola Lopes, Bob McCauley, Jim McCawley, Gregg Oden, Jay Russo, Steve Schacht, Paul Thagard, and especially Sandy Marcus for their help with this article. The research was supported by U.S. Public Health Service Grant K02 MH00236 and National Science Foundation Grant BNS 80-14131. Requests for reprints should be sent to Lance Rips, Behavioral Sciences Department, University of Chicago, 5848 South University Avenue, Chicago, Illinois 60637.
COGNITIVE PROCESSES IN REASONING
those that seem straightforward at first glance may turn out to be beyond the reach of some subjects. For example, only 61% of 10th- and 11th-grade students accept Argument 2 as valid, according to some data of Osherson (1975, p. 146).

2. If there is not both an M and a P on the blackboard, then there is an R.
   If there is no M, then there is an R.
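Argument 2's validity can be verified mechanically by enumerating truth assignments, the truth-table method discussed later in this article. The encoding below is a sketch, not a procedure the model itself uses.

```python
from itertools import product

def implies(a, b):
    """Material conditional: IF a THEN b."""
    return (not a) or b

def argument2_valid():
    # Premise: if not (M and P), then R.  Conclusion: if not M, then R.
    # The argument is valid iff no assignment makes the premise true
    # while the conclusion is false.
    for m, p, r in product([True, False], repeat=3):
        premise = implies(not (m and p), r)
        conclusion = implies(not m, r)
        if premise and not conclusion:
            return False  # counterexample row found
    return True

print(argument2_valid())  # True
```

Every row in which the premise holds also makes the conclusion hold, so the inference is valid even though many subjects fail to see it.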
This inference presents no problem for logical analysis, since there are well-known algorithms or proof procedures for deciding its validity (see Chang & Lee, 1973, for a survey). But although these formal methods may be of use in suggesting psychological hypotheses, a psychological theory must explain the difficulties encountered by the unsuccessful subjects as well as the correct procedures of the successful ones. This provides a second goal for the research that follows.

Fulfilling these goals is vital to cognitive psychology because an account of inference is part of the explanation of many cognitive components. To take just a few examples: (a) Most current theories of comprehension assume some sort of reasoning mechanism in order to explain how people anticipate upcoming information in text and relate new information to what has come before (see, e.g., Clark, 1977; Warren, Nicholas, & Trabasso, 1979). (b) One's understanding of others and one's commitment to social attitudes can be seen as the product of certain (not necessarily optimal) inference strategies (e.g., Hastie, in press; Nisbett & Ross, 1980). (c) Piagetians hypothesize that small changes in base-level inference procedures are responsible for massive changes in cognitive development. (d) According to standard theories of perception, knowledge of external objects is based on inference from visual cues, particularly when the objects are three-dimensional, occluded, or ambiguous ones (e.g., Fodor & Pylyshyn, 1981; Ullman, 1980). This dependence of cognitive explanations on reasoning is no accident. Because these explanations are couched in terms of mentally represented beliefs and goals, and because reasoning is our basic means for changing or reconciling beliefs, reasoning is naturally invoked in accounts of comprehension,
intellectual development, and the like. Obviously, the kind of reasoning that figures in these explanations is not always deductive inference of the type illustrated by Arguments 1 and 2. But where beliefs and goals are propositional in form, as in many current psychological models (e.g., J. R. Anderson, 1976; Miller & Johnson-Laird, 1976), it is plausible that one important brand of inference deals with propositional connectives such as if, and, or, and not. Propositional inference has been a focus of study in logic for more than a century, but only recently has it received from psychologists anything like a general treatment (Braine, 1978; Johnson-Laird, 1975; Osherson, 1975). The effort in this article is to extend these earlier models to a more powerful theory that can be given firmer empirical support.

The approach to this problem outlined below takes the form of a computer model of propositional reasoning. The model is nicknamed ANDS—for A Natural Deduction System—and, as its name implies, is descended from the formal natural-deduction procedures pioneered by Gentzen (1935/1969) and Jaskowski (1934). It is also related to artificial intelligence (AI) theorem provers such as PLANNER (Hewitt, Note 1; see also Bledsoe, 1977). A working version of ANDS has been implemented in the LISP programming language. Within its natural-deduction framework, ANDS provides a simple way to handle temporary assumptions or "suppositions" that facilitate human reasoning (Rips & Marcus, 1977). In addition, it provides a specific account of working memory during deduction and of processes that manage its memory structures. ANDS also possesses the ability to reason both forward from the premises and backward from the conclusion. These features are set out below in the first section of this article. An overview of ANDS is followed by a closer look at its main components: its memory structures and inference routines.
The second section of the article is devoted to an examination of ANDS's similarity to human inference, as evidenced by protocol data, recall performance for proofs in text, and judgments of the validity of sample arguments. The final section considers possible ways to expand ANDS's basic inference abilities.
LANCE J. RIPS
ANDS as a Theory of Propositional Reasoning

Overview

ANDS's central assumption—one that it shares with the other psychological models cited above—is that deductive reasoning consists in the application of mental inference rules to the premises and conclusion of an argument. The sequence of applied rules forms a mental proof or derivation of the conclusion from the premises, where these implicit proofs are analogous to the explicit proofs of elementary logic. In the simplest case, a mental proof has a single step, formed by applying just one of these internal rules. For example, suppose that among the stock of rules is one that specifies that propositions of the form IF p, q and p jointly imply the proposition q, where p and q are arbitrary propositions. This is the modus ponens rule, mentioned above, and we can see that this schema matches the premises and conclusion of Argument 1. Because of this match between the argument and the inference rule, the conclusion of Argument 1 is cognitively derivable or provable from the premises. We can also say that Argument 1 is cognitively acceptable or valid (although in doing so we depart somewhat from the way valid is traditionally used in logic). In more complex examples of deductive reasoning, more than a single rule will be needed to derive the conclusion; but in all such cases, the rules are the ultimate authority in determining which arguments are valid.

ANDS's assumption about inference rules is not unassailable since there are other well-known methods for showing that arguments are valid. These include truth tables (Wittgenstein, 1921/1961) and various kinds of diagrams (e.g., Gardner, 1958). These methods are, of course, rule governed in the sense of being algorithms for evaluating arguments, but they do not involve rules like the modus ponens example that manipulate propositions directly. There is no evidence to date, however, that these alternative methods can successfully predict subject responses for more than a very narrow range of reasoning problems. Moreover, when these methods have been compared to proposition-manipulating systems, the latter have provided a better account of the data (Osherson, 1974, 1975). It is certainly possible that a nonpropositional procedure will eventually be found that is sufficiently accurate and general; but for the present, inference rules appear the most viable choice (however, see Johnson-Laird, 1982, for a different view of this issue).

In short, ANDS is a theory of deductive reasoning that consists of propositional inference rules embodied in a set of computational routines. The routines apply the rules to produce a proof when an argument is evaluated. Both the routines and the proof are claimed to correspond to those used intuitively by subjects who have not received formal training in logic. This section of the article is devoted to an explanation of these components, using as an illustration a sample argument that ANDS evaluates with a fairly simple, but nontrivial, proof. The first part of the section considers informally what a proof of this argument might look like and compares this informal proof to ANDS's proof in a preliminary way. The comparison points out some of ANDS's special features, and these features are then described in detail, concentrating on the shape of the proofs in ANDS's working memory and on the inference routines that construct them. The next part returns to the sample argument and shows how its proof is built up step by step. (A lengthier proof, which illustrates some further aspects of ANDS, is included in the Appendix.) The last part of this section compares ANDS to other psychological deduction systems.

An initial proof of Argument 2. The sample argument I will focus on is Argument 2, so we should take a moment to think about its meaning. Recall that the premise was If there is not both an M and a P on the blackboard, then there is an R, and the conclusion was If there is no M, then there is an R. Our task is to decide whether this conclusion follows from the premise. Looking at the premise, we see that a main stumbling block is the phrase there is not both an M and a P. One thing to notice, however, is that situations that are consistent with this phrase are ones in which there is an M but no P, a P but no M, or neither a P nor an M. In other words, any arrangement of letters that lacks an M or lacks a P is one in which "there is not both
an M and a P" We could therefore rephrase the premise as If there is no M or no P, then there is an R. But remember that the first half of the conclusion is If there is no M. Supposing that there really was no M on the blackboard, then surely the rephrased premise tells us that there will be an R. If there is no M or no P, then there is an R and There is no M together imply There is an R. Hence, given the premise, it follows that if there is no M, there will be an R. But this is exactly the conclusion, and Argument 2 must therefore be valid. The following proof summarizes these steps: 3. a. If there is not both an M and a P (Premise) on the blackboard, then there is an R. b. If there is no M or no P on the (Consequence blackboard, then there is an R. of a) c. Suppose there is no M on the blackboard.
(Supposition)
d. Then there will be an R.
(Consequence of b and c)
e. Therefore, if there is no M, there (Consequence will be an R. of c and d)
The initial sentence in this proof is the premise of Argument 2, and the final sentence is its conclusion. The intermediate steps in Lines b and d, as well as the conclusion itself, are logical consequences of earlier steps and are derived from these earlier propositions by means of inference rules. These rules must be specified in a formal deductive system but are left unstated in this example (see the following section for a discussion of ANDS's deductive rules). Line c contains a temporary assumption or supposition, which is used to simplify deduction of the conclusion. No axioms appear in the above proof, and this is characteristic of "natural deduction," the type of proof developed by Gentzen and others cited above. These systems were intended to "come as close as possible to actual reasoning" (Gentzen, 1935/1969, p. 68) on the theory that in "actual reasoning" the conclusion is deduced directly from the premises of the argument rather than from primitive propositions or axioms. Natural deductive proofs are typically simpler than axiomatic proofs, and for this reason such systems have been adopted in many elementary logic texts
(e.g., Copi, 1954; Fitch, 1952; McCawley, 1980; Suppes, 1957; and Thomason, 1970b). Most previous psychological models of propositional reasoning, though differing in other respects, have used natural deduction methods (see Braine, 1978; Johnson-Laird, 1975; Osherson, 1975).

ANDS's proof of Argument 2. ANDS's proof is very similar to Proof 3. Figure 1 displays the contents of ANDS's working memory at the conclusion of the proof, and we can see that each of the propositions in Proof 3 finds a place in this configuration. The proof structure contains two parts, which are called the assertion tree and the subgoal tree. ANDS begins the problem by considering the premise (Assertion 1 in Figure 1), and this proposition (along with some inferences based on it) is placed in the top node of the assertion tree. ANDS also starts by taking into account the conclusion, the proposition that it must ultimately prove. This proposition appears at the top of the subgoal tree as Subgoal 2, and it is connected by a pointer to other propositions (subgoals) whose truth guarantees the truth of the conclusion. As in Proof 3, ANDS sometimes needs to "suppose" temporarily that a particular proposition is true and use this supposition to draw further inferences. When this happens, ANDS places the supposition in a subordinate node of the assertion tree. For example, Line c of Proof 3 contained the supposition that there was no M on the blackboard, and this same proposition Not M appears as Assertion 5 in the subordinate node. (Admittedly, the two structures in Figure 1 do not look much like trees, since they contain only a single branch apiece; in general, though, the assertion and subgoal trees will be bushier, as can be seen in the Appendix and Figure A1.) At the beginning of the problem, both the assertion tree and the subgoal tree are empty.
The procedure begins when ANDS adds the premise of the argument to the main branch of the assertion tree and the conclusion to the top of the subgoal tree. The remaining propositions are placed in the memory trees during the course of the proof by inference routines that continually inspect the trees' contents and respond to appropriate propositional patterns within them. In Figure 1, the numbering of the individual propositions
[Figure 1 graphic: an Assertion Tree (1. If not (M and P) then R; 3. If not M or not P then R; 8. If not M then R; subordinate node: 5. Not M; 7. R) linked by dotted lines to a Subgoal Tree (2. If not M then R; 4. R; 6. Not M).]
Figure 1. ANDS's memory structure at the conclusion of the proof of the argument, If not both M and P then R; therefore, if not M then R. (See text for explanation.)
indicates the order in which the routines have entered them in memory. First, one of the routines spots the fact that the premise, If not (M and P) then R, can be paraphrased to read If not M or not P then R, and it places this new proposition in the assertion tree at 3. Second, ANDS notices that the conclusion is a conditional, If not M then R. As in the earlier proof, ANDS supposes that the first part of the conditional, Not M, is true, putting it in a new subordinate assertion node. At the same time, it sets up a subgoal (Subgoal 4 in the figure) to try to prove that R is true; if it can show that R is true when Not M is true, it will have derived the conclusion itself. Finally, ANDS sees that R can be deduced from Assertion 3, provided that it can show that either Not M or Not P is true. The subgoal of proving Not M is entered at 6, but this proposition has already been established (it is just Assertion 5), and this match between the subgoal and the assertion clinches the proof. Given the premise, R is true if Not M is true, and, therefore, If not M then R is provable. After placing R and If not M then R in the assertion tree, ANDS declares the argument valid. If ANDS had run out of applicable rules before finding the critical match, it would have pronounced the argument invalid. This example highlights ANDS's main features: ANDS's proofs always consist of an assertion tree containing the premises and other propositions derived from them and a subgoal tree containing the conclusion and other propositions that warrant it. Inference rules fill in these working-memory trees in response to propositions previously placed inside them. The proof succeeds (and the argument is cognitively valid) if the rules can find a match between suitable subgoals and
assertions. The proof fails (and the argument is cognitively invalid) if the procedure runs out of applicable rules before finding a match. A deeper understanding of the proof in Figure 1 requires a more explicit description of both the rules and the tree structures, but the example will stand us in good stead until these elements have been examined.

Working-Memory Components

Reasoning in ANDS means constructing a proof in working memory. Inspection of Figure 1 turned up the basic parts of these proofs—the assertion and subgoal trees—but it remains to clarify their internal structure. In particular, we need to know the significance of the links and nodes of the figure and the way in which these links and nodes are established. The first of these questions is taken up at this point; the second will be delayed until we have had a chance to discuss the inference routines.

The assertion tree. ANDS's assertion tree encodes a natural-deduction proof, as we have seen in comparing Proof 3 to the similar proof in Figure 1. The basic difference between these two versions is that the assertion tree represents explicitly the relation of the supposition (i.e., Not M) to the rest of the proof: Suppositions are always placed in a new node of the assertion tree. Unlike most of the other propositions in the assertion tree, suppositions are not necessarily true in those situations where the premises are true, and it is for this reason that suppositions are segregated in nodes of their own. For instance, in Figure 1 Not M need not be true when the premise is true, since the premise only mentions what will happen if there is no M (or no P). The advantage of suppositions is that they provide a way of exploring their consequences without our first having to know their truth value. (An analogy is the way one sometimes imagines contingencies in planning an important action so that one can anticipate potential problems or opportunities.)
The result is a more streamlined proof than would be possible if all the propositions in the proof had to be guaranteed by the premises. The logical consequences of a supposition,
like the supposition itself, are not guaranteed to be true, and they are therefore placed in the same node as the supposition on which they are based. For example, in the assertion tree of Figure 1, the truth of R depends on the supposition Not M, and for this reason, R too appears in the subordinate node. By contrast, the remaining propositions in the assertion tree do not depend on the truth of Not M but on the premise alone, and ANDS therefore places them in the upper node of the tree with the premise itself. ANDS's inference routines are in charge of noticing when a supposition is used in a deduction and of placing the deduced proposition in the proper node of the tree. In determining the consequences of a supposition, ANDS's rules are free to use any of the superordinate propositions in the tree. For example, Assertion 3 of Figure 1 was used to deduce Assertion 7 in the subordinate node. Thus, whereas propositions in subordinate (supposition) nodes are not necessarily true in the superordinate nodes, superordinate propositions are always true in the subordinate contexts. This one-way logical relation is represented by the pointers in the assertion tree. The ability to represent suppositions is an important characteristic of ANDS's proofs and is responsible for the form of the assertion tree. Branching in the tree will occur if ANDS considers one or more trial suppositions before hitting on a fruitful one (see the Appendix proof for examples of unsuccessful suppositions). This hierarchical system of nodes in the assertion tree replaces the bracketing or numbering devices that are more common in textbook presentations of natural deduction.¹

The subgoal tree. If we think of the assertion tree as recording the logical steps leading from the premises to the conclusion of an argument, the subgoal tree indicates the reverse pathway from the conclusion to the premises.
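The one-way relation among assertion nodes can be sketched as follows. The class and field names are assumptions of this example, not ANDS's LISP implementation: a supposition opens a subordinate node, and every proposition in a superordinate node remains available in the subordinate context, but not the reverse.

```python
# Hypothetical assertion-node structure with upward parent pointers.

class AssertionNode:
    def __init__(self, parent=None):
        self.parent = parent      # pointer to the superordinate node
        self.props = []           # propositions holding in this context

def visible(node):
    """All propositions true in a node's context: its own plus superordinates'."""
    props = []
    while node is not None:
        props.extend(node.props)
        node = node.parent
    return props

top = AssertionNode()
top.props = ["If not (M and P) then R", "If not M or not P then R"]
sub = AssertionNode(parent=top)   # node opened for the supposition Not M
sub.props = ["Not M", "R"]

# Assertion 3 is usable inside the supposition node, but Not M is not
# available in the superordinate context:
print("If not M or not P then R" in visible(sub))  # True
print("Not M" in visible(top))                     # False
```

The `visible` walk mirrors the figure's pointers: rules working in a subordinate node may cite superordinate assertions, while the top node never inherits a supposition's consequences.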
Typically, ANDS proves its theorems from "outside in," working alternately backward from the conclusion (in the subgoal tree) and forward from the premises (in the assertion tree). Unlike the assertion tree, however, the subgoal tree has no obvious counterpart in formal logic proofs. Perhaps the closest equivalent to ANDS's use of
subgoals occurs when an author breaks a complex theorem into lemmas or announces at the beginning of a problem that proof of a given formula is sufficient to establish the conclusion. Informal argumentation provides a better analogy to the subgoal tree. Argumentative discourse often starts with a statement on the part of a speaker that is challenged by another participant. The speaker can meet this challenge by producing evidence for the statement, where this evidence can be questioned in turn (Toulmin, 1958). Thus, the original statement plays the part of the conclusion in an argument, the main goal that the speaker wants to establish. The evidence is a subgoal, which, if agreed upon, would support the truth of the main goal. Disputes about the evidence can lead to further sub-subgoals on the part of the speaker, and so on, until common ground is reached.

A basic motivation for the subgoal tree is processing efficiency: Subgoals keep the proof procedure aimed in the direction of the argument's conclusion rather than allowing it to produce implications at random from the premises (see Newell & Simon, 1972). The subgoal tree tells the system that if it wants to deduce conclusion S0, it should first deduce the propositional subgoal S1; if it wants to deduce S1, it should try to deduce proposition S2, and so on, until an achievable subgoal Sk has been located. This subgoal chaining is achieved by making many of ANDS's inference rules sensitive to the current subgoal at any given state of the problem. If at some stage one of these rules notices that the current subgoal is Si and that some further proposition Sj will serve to establish Si, then this inference procedure will place Sj in the subgoal tree as the now-current subgoal. This process is monitored by ANDS to ensure that the subgoal does not duplicate a previously attempted one.
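The subgoal chaining just described, together with the duplicate-subgoal check and backing up when no rule applies, can be sketched as a small backward-chaining search. The rule format (premise needed, conclusion licensed) is an assumption of this example, not ANDS's own rule notation.

```python
# Hypothetical depth-first subgoal search: to prove a goal, try each rule
# that reduces it to a new subgoal; back up when no rule applies; never
# revisit a previously attempted subgoal.

def prove(goal, facts, rules, tried=None):
    tried = set() if tried is None else tried
    if goal in facts:
        return True          # an achievable subgoal Sk has been located
    if goal in tried:
        return False         # duplicate-subgoal check
    tried.add(goal)
    for premise_needed, conclusion in rules:
        if conclusion == goal and prove(premise_needed, facts, rules, tried):
            return True
    return False             # no rule applies: back up to the superordinate goal

# S0 reduces first to a dead end, then to S1, and S1 to S2; S2 is achievable.
rules = [("DEAD", "S0"), ("S1", "S0"), ("S2", "S1")]
print(prove("S0", facts={"S2"}, rules=rules))  # True
```

The `tried` set plays the role of ANDS's monitor: without it, a rule pair that reduces S0 to S1 and S1 back to S0 would cycle forever.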
¹ The term assertion tree may be something of a misnomer, since it contains suppositions as well as flat-out entailments of the premises. Proof tree might have been more appropriate, except that ANDS's proofs also involve the subgoal tree. Assertion tree seems harmless if we bear in mind that not all of the propositions in the tree have equally factual status.

Whenever the current subgoal fails (none of the rules applies to it), ANDS backs up to the nearest
superordinate goal and tries again. In other words, ANDS generates its subgoal tree in a depth-first manner. The form of the subgoal tree is dictated by two considerations. First, because the final goal of the problem is to deduce the conclusion, the subgoals should be propositions whose truth will warrant that of the conclusion itself. For any conclusion, ANDS may consider several subgoals of this sort (as illustrated in the Appendix), creating a tree structure. In general, deduction of any subgoal in the tree will be sufficient to guarantee the truth of any proposition along the path leading up from it. The pointers in the subgoal tree represent this "deducible from" relation.² Second, superimposed on this structure are nodes that correspond to those of the assertion tree. Certain subgoals require ANDS to suppose that a particular proposition is true, that is, to set up a new supposition node in the assertion tree. This supposition is then used as an aid in deducing the subgoal. In the proof of Figure 1, for example, ANDS assumes that Not M is true in order to deduce R. So when R is first added as a subgoal, ANDS creates a new assertion node with Not M as its initial proposition. A supposition-creating subgoal of this type is partitioned from the rest of the subgoals and is connected to the relevant assertion node (with dotted lines in the figure) to indicate that the subgoal is to be deduced in the context of the supposition. In the above example, R is to be deduced under the supposition that Not M is true. By contrast, the main goal is to be deduced in a context in which only the premise (and its entailments) are assumed true. Hence, it appears in the top node of the subgoal tree, which is connected to the assertion node that contains the premise. Both of these structural features—the subgoal hierarchy and the nodes—are controlled by the inference rules.
These rules inspect the current subgoal to determine whether there might be a new proposition whose truth would establish the subgoal. Similarly, the rules are in charge of coordinating supposition nodes in the assertion tree with their counterparts in the subgoal tree. To see how all this is accomplished, we need to take a look at the rules themselves.
Processing Components

ANDS's inference routines control its proofs by placing new assertions and subgoals in memory. Most of these routines correspond to familiar rules in propositional logic. For example, ANDS has two routines (R1 and R1') corresponding to modus ponens. These relations are spelled out in Table 1, and the procedures themselves are summarized in Table 2. Each of these routines consists of a set of conditions that must be checked against the memory trees to determine whether the routine applies. If all the conditions are met, then a series of actions are carried out that modify the trees in specific ways. In the current version of ANDS, the routines are tested for applicability in a fixed order, and the first routine whose conditions are fulfilled is then carried out. In general, the simpler routines are tested first, where simplicity is determined by the amount of mental bookkeeping that is required (in a sense that will become clear below). Once a routine from this list has been applied, the process starts again at the top of the list in search of further routines.

The Table 2 routines are not intended as an exhaustive set. This is true in the sense that there are arguments in classical sentential logic that ANDS cannot prove. But more important, there are undoubtedly deductive procedures that subjects use but that are not currently part of ANDS. Nevertheless, the routines in Table 2 were suggested by subject protocols and can claim psychological plausibility for this reason. Although not exhaustive, this set of routines can handle a fairly wide range of propositional arguments and permits us to test some of ANDS's assumptions. We assume that each of the Table 2 routines is available to a subject only on a probabilistic basis. For example, on a particular trial of an experiment, a certain routine Ri will come to mind with probability pi (a parameter of the model). This assumption is
² As a minor qualification, ANDS must sometimes achieve both members of a pair of subgoals in order to fulfill the superordinate goal. See the Appendix for an example of these joint subgoals (Subgoals 16 and 17 in Figure A1).
important in deriving predictions from ANDS, as will be seen later. However, for purposes of describing ANDS's proof procedure, it is convenient to ignore the assumption temporarily. Thus, in the examples below, we take all routines to be available with Probability 1 so that ANDS behaves as a kind of ideal logician within the compass of its remaining structural and processing constraints.

There are some notational practices that should be kept in mind in reading Table 2. Italicized lower case letters (e.g., p, q, and r) are propositional variables that can be matched against any (possibly complex) proposition in memory. However, words in italicized caps like IF and OR are logical constants and must exactly match those words within a memory proposition. So, for example, the pattern IF p, q in Rule R1 will successfully match the propositions "If Jane is in Tulsa, Morry is in Chicago" or "If Jane is in Tulsa and Sam is in Austin, Morry is in Chicago," but not "Jane is in Tulsa and Sam is in Austin" or "Sam is in Austin, and if Jane is in Tulsa, Morry is in Chicago." When letters occur within the propositions themselves, as in Arguments 1 and 2, they are capitalized to distinguish them from variables. If a pattern matches a given assertion or subgoal, then the variables in the pattern are bound to the corresponding propositions. For instance, if IF p, q is matched to the first of the sample sentences above, then p is bound to "Jane is in Tulsa" and q to "Morry is in Chicago" for the remaining steps in the rule. Two different variables can be bound to the same proposition, as would happen, for example, if IF p, q were matched to "If Jane is in Tulsa, Jane is in Tulsa." The condition-action format of ANDS's rules resembles that of production systems (e.g., J. R. Anderson, 1976; Newell & Simon, 1972), but no attempt has been made in ANDS to follow the programming conventions of production-system languages.

Table 1
A Comparison Between Standard Logical Rules and ANDS's Inference Routines

Rule name                  | Inference-rule schema                        | ANDS routine
Modus Ponens               | IF p, q; p ∴ q                               | R1, R1'
DeMorgan's Law             | NOT (p AND q) ∴ NOT p OR NOT q               | R2'
Disjunctive Syllogism      | p OR q; NOT p ∴ q                            | R3, R3'
Disjunctive Modus Ponens   | IF p OR q, r; p ∴ r                          | R4
And Elimination            | p AND q ∴ p                                  | R5, R5'
And Introduction           | p; q ∴ p AND q                               | R6
Or Introduction            | p ∴ p OR q                                   | R7
Law of Excluded Middle     | ∴ p OR NOT p                                 | R8
If Introduction            | [p ... q] ∴ IF p, q                          | R9
Not Introduction           | [p ... q AND NOT q] ∴ NOT p                  | R10
Or Elimination             | p OR q; [p ... r]; [q ... r] ∴ r             | R11

Note. Brackets represent subproofs (boxes in the original table) within which the bottom line (or lines) is deduced from the top line (see, e.g., Fitch, 1952). These correspond to subordinate assertion nodes in ANDS.

Forward versus backward rules. The in-
46
LANCE J. RIPS
Table 2 Deduction Rules in ANDS Rl (Modus Ponens, backward version) Conditions: 1. Current subgoal = q 2. Assertion tree contains IF p, q Actions: 1. Set up subgoal to deduce p 2. If Subgoal 1 is achieved, add q to assertion tree Rl' (Modus Ponens, forward version) Conditions: 1. Assertion tree contains proposition x = IF p, Q 2. x has not been used by Rl or Rl' 3. Assertion tree contains proposition p Actions: 1. Add q to assertion tree R2' (DeMorgan) Conditions: 1. Assertion tree contains a proposition x with subformula NOT (p AND q) 2. x has not been used by R2' Actions: 1. Set y = x 2. Substitute NOT p OR NOT q for NOT (p AND q) in y 3. Add y to assertion tree R3 (Disjunctive Syllogism, backward version) Conditions: 1. Current subgoal = q 2. Assertion tree contains /; OR q (alternatively, q OR p) Actions: 1. Set up subgoal to deduce NOT p 2. If Subgoal 1 is achieved, add q to assertion tree R3' (Disjunctive Syllogism, forward version) Conditions: 1. Assertion tree contains x = p OR q 2. x not used previously by R3 or R3' 3. Assertion tree contains NOT p (alternatively, NOT q) Actions: 1. If assertion tree contains NOT p, add q to assertion tree 2. If assertion tree contains NOT q, add p to assertion tree R4 (Disjunctive Modus Ponens) Conditions: 1. Current subgoal = q 2. Assertion tree contains IF p OR r, (I Actions: 1. Set up subgoal to deduce p 2. If Subgoal 1 is achieved, add q to assertion tree and return 3. Set up subgoal to deduce r 4. If Subgoal 3 is achieved, add q to assertion tree R5 (And Elimination, backward version) Conditions: 1. Current subgoal = p 2. Assertion tree contains a proposition with subformula x p AND q (alternatively, q AND p) Actions: 1. Set up subgoal to deduce x 2. If Subgoal 1 is achieved, add /; to assertion tree R5' (And Elimination, forward version) Conditions: 1. Assertion tree contains x = p ANDq 2. x not used previously by R5 or R5' Actions: 1. Add p to assertion tree 2. Add q to assertion tree
R6 (And Introduction) Conditions: 1. Current subgoal = p AND q Actions: 1. Set up subgoal to deduce p 2. If Subgoal 1 is achieved, set up subgoal to deduce q 3. If Subgoal 2 is achieved, add p AND q to assertion tree R7 (Or Introduction) Conditions: 1. Current subgoal = p OR q Actions: 1. Set up subgoal to deduce p 2. If Subgoal 1 is achieved, add p OR q to assertion tree and return 3. Set up subgoal to deduce q 4. If Subgoal 3 is achieved, add /; OR g to assertion tree R8 (Restricted Law of Excluded Middle) Conditions: 1. Current subgoal = p OR q 2. Premises contain a proposition with subformula NOT p (alternatively, NOT q) Actions: 1. Add to assertion tree p OR NOT p (alternatively, q OR NOT q) R9 (//Introduction) Conditions: 1. Current subgoal = //•' /;, q Actions: 1. Add new subordinate node to assertion tree containing assumption p 2. Set up corresponding subgoal node to deduce q 3. If Subgoal 2 is achieved, add IF p, q to superordinate node of assertion tree RIO (Not Introduction) Conditions: 1. Current subgoal = NOT p 2. Premise contains subformula q and conclusion contains subformula NOT q (alternatively, premise contains NOT q and conclusion q) Actions: 1. Add new subordinate node to assertion tree containing assumption p 2. Set up corresponding subgoal node to deduce q 3. If Subgoal 2 is achieved, set up subgoal node to deduce NOT q 4. If Subgoal 3 is achieved, add NOT p to superordinate node R11 (Or Elimination) Conditions: 1. Current subgoal = r 2. Assertion tree contains p OR q Actions: 1. Add new subordinate node to assertion tree with assumption p 2. Set up corresponding subgoal node to deduce r 3. If Subgoal 2 is achieved, add new subordinate node to assertion tree with assumption q 4. Set up corresponding subgoal node to deduce r 5. If Subgoal 4 is achieved, add r to superordinate node
Note. All rules are stated in the form of condition-action pairs: The action portion of the rule will be executed only if all of the listed conditions have been fulfilled. In certain cases, ANDS tags a proposition to signal that a rule has applied to it, and these tags are stored in the proposition's property list. The tags are checked in Condition 2 of Rules Rl', R2', R3', and R5'. Further conditions on the position of the component propositions in the trees are omitted in this listing.
COGNITIVE PROCESSES IN REASONING
ference routines of Table 2 can be divided into two main types: forward rules that work from the premises of the argument to the conclusion and backward rules that work from the conclusion to the premises. Forward rules will be denoted by a prime following the rule number; thus, in Table 2, Rl', R2', R3', and R5' are all forward rules. The unprimed rules—Rl and R3-R11—are backward rules. This forward/backward distinction depends on the sensitivity of the rules to problem subgoals. The condition part of backward rules contains a reference to the current subgoal, and their actions typically involve placing one or more new subgoals in the subgoal tree (though new assertions may also be added). Forward rules, however, operate independently of any subgoals. Their conditions refer only to propositions in the assertion tree, and their actions result only in new assertions, never in new subgoals. Hewitt's (Note 1) PLANNER system includes both forward rules ("antecedent theorems") and backward rules ("consequent theorems"), as do CONNIVER (McDermott & Sussman, Note 2), SNePS (Martins, McKay, & Shapiro, Note 3), and the deductive system described by Klahr (1978). J. R. Anderson, Greeno, Kline, and Neves (1981) have also incorporated forward and backward inference in their simulation of high-school geometry students' theorem proving. Let us first look at some examples of forward and backward rules, and then examine the case for their psychological validity. As an illustration of the difference between forward and backward rules, we can consider Rules Rl and Rl' from Table 2. Both of these rules implement the modus ponens inference, a pattern that was exemplified by Argument 1. However, Rl and Rl' carry out modus ponens in different ways and so provide an optimal contrast for our forward/ backward distinction. 
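Two of the mechanisms just described, the matching of rule patterns against memory propositions and the fixed-order control regime, can be sketched in miniature. The kernel below is an illustrative reconstruction in my own notation (the paper does not give the simulation's code): propositions are nested tuples, lowercase pattern variables bind to whole propositions, connectives such as IF must match exactly, and routines are scanned in a fixed order with the first applicable one firing before the scan restarts.

```python
# Illustrative kernel of ANDS-style matching and control (a sketch in
# invented notation, not the original simulation's code).

def match(pattern, prop, bindings=None):
    """Match a pattern such as ("IF", "p", "q") against a proposition,
    returning variable bindings, or None on failure."""
    bindings = dict(bindings or {})
    if isinstance(pattern, str) and pattern.islower():       # variable: p, q, r
        if pattern in bindings:                              # a repeated variable
            return bindings if bindings[pattern] == prop else None
        bindings[pattern] = prop
        return bindings
    if isinstance(pattern, tuple) and isinstance(prop, tuple) \
            and len(pattern) == len(prop):
        for sub_pat, sub_prop in zip(pattern, prop):
            bindings = match(sub_pat, sub_prop, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == prop else None             # constants: IF, atoms

def run(routines, memory, goal_reached, max_steps=100):
    """Fire the first routine whose condition holds, then rescan from the top."""
    for _ in range(max_steps):
        if goal_reached(memory):
            return True
        for condition, action in routines:   # fixed order: simplest first
            bindings = condition(memory)
            if bindings is not None:
                action(memory, bindings)
                break                        # restart scan at top of the list
        else:
            return False                     # no routine applies: give up
    return goal_reached(memory)

# One forward routine in the spirit of R1': if IF p, q and p are both
# asserted and q is not, the condition succeeds with the bindings.
def mp_condition(memory):
    for prop in memory:
        b = match(("IF", "p", "q"), prop)
        if b and b["p"] in memory and b["q"] not in memory:
            return b
    return None

memory = {("IF", "Jane-in-Tulsa", "Morry-in-Chicago"), "Jane-in-Tulsa"}
valid = run([(mp_condition, lambda m, b: m.add(b["q"]))],
            memory, lambda m: "Morry-in-Chicago" in m)
print(valid)   # -> True
```

Note that, as in the text, the pattern IF p, q matches a conditional with any (possibly complex) antecedent, but fails against a conjunction, and a repeated variable may only rebind to the same proposition.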
Rule R1' is the forward version of modus ponens, and a glance at Tables 1 and 2 shows that it closely resembles the way this inference is usually expressed in logic texts. The conditions of the rule check to see whether propositions of the form IF p, q and p exist in the assertion tree, and if they do, proposition q is then asserted. The only unfamiliar aspect of this routine is that it explicitly tests whether modus ponens has already applied to the conditional proposition (see Condition 2 of R1'). This is done to keep the rule from applying repeatedly to the same assertions in an endless loop. Once modus ponens has applied to IF p, q, applying it again produces no new information, only redundant copies of q. Therefore, imposing this condition prevents looping without restricting the power of the rule.

Rule R1' allows us to conclude that q is true whenever both p and IF p, q have already been asserted. However, suppose that we would like to deduce q in a situation where IF p, q is known to be true, but p is not. A natural strategy might be to try to deduce p from other assertions because if p can be derived, modus ponens will permit us to conclude that q. This strategy is the one implemented by Rule R1, the backward version of modus ponens. The conditions on this rule are met if the procedure notices that the current subgoal is the consequent of some previously asserted conditional. (In a conditional of the form IF p, q, q is said to be its "consequent" and p its "antecedent.") The action part of the rule then sets up the antecedent of the same conditional as the new subgoal. In other words, R1 adds p as a subgoal for q in the presence of an assertion IF p, q. If this new subgoal can be derived, then q itself has been deduced and can be added to the assertion tree. Thus, the point of the rule is to motivate further reasoning that may ultimately lead to fulfillment of the original goal.

It is worth noticing that backward rules like R1 are not passive implementations of logical principles; they incorporate heuristics that specify the sorts of proof strategies that are likely to pay off. The easiest way to appreciate this point is to consider an alternative way of formulating backward modus ponens that eliminates one of these heuristics. As just mentioned, Rule R1 is triggered by an assertion IF p, q and a subgoal q, and its action is to produce a new subgoal p.
This means that the new subgoal can never be a longer proposition than the assertion that triggers it, and this heuristic limits the amount of search that the rule can initiate. We can contrast this with a different type of rule (R1*) that would be triggered by an assertion p and a subgoal q and whose action would be to introduce as a subgoal IF p, q. From the point of view of logic, R1 and R1* are equally reasonable, since all we have done is to reverse the roles of the two premises of modus ponens, p and IF p, q, in the procedure. However, from a strategic standpoint, R1* is almost always a bad idea. For example, imagine a proof in which the current subgoal is q and the only assertion is p. These propositions meet the conditions on R1*, and so this rule will set up a subgoal to deduce IF p, q. But no such deduction is possible in the situation at hand because there is no valid way to get from p alone to IF p, q. We ought to give up at this point, but to make things worse, Rule R1* is still applicable, and with assertion p and subgoal IF p, q, it produces the new subgoal IF p, IF p, q. Again, there is no way to deduce this subgoal, and again R1* applies to yield yet another subgoal IF p, IF p, IF p, q, and so on. In the case of R1, the number of new subgoals is limited by the fact that the subgoal (p) is contained in the assertion (IF p, q). R1* drops this heuristic and gets us (literally) into no end of trouble.

Why do we need both forward and backward rules? In particular, why is it necessary to have rules like R1 and R1' that are forward and backward versions of the same deductive scheme? The answer to these questions is that the two rule types play different roles in theorem proving, both equally important. Intuitively, forward rules are useful in getting started on a problem by carrying out obvious or routine deductions that clarify the meaning of the premises. At a later stage, they may also be helpful in simplifying intermediate results. Forward strategies have been observed in subjects' performance on other mathematical tasks. For example, Simon and his colleagues have documented forward use of equations by skilled subjects solving simple word problems in kinematics and thermodynamics (Bhaskar & Simon, 1977; Simon & Simon, 1978).
Similarly, in proving geometry theorems, high-school students may consider congruent parts of a figure on the assumption that these parts will eventually be helpful, without having a specific subgoal in mind (Greeno, 1976). Moreover, forward routines are needed in situations where no particular goal is specified. For instance, Johnson-Laird and Steedman (1978) have given subjects sets of premises and have asked them to generate their own conclusions from these sets. Subjects are clearly able to comply
with these instructions, but it would be impossible for them to do so if all deductive steps had to be tied to well-formulated goals. On the other hand, forward rules alone are not sufficient to account for ordinary reasoning, and to see why not, consider the problem of proving the following argument:

4. If there is both an M and a P, then there is an R.
   There is an M.
   There is a P.
   There is an R.
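The difficulty these premises pose can be sketched in miniature (an illustrative reconstruction in invented notation, not the paper's code): closing the premises under forward modus ponens (R1') and forward And Elimination (R5') never yields R, because nothing ever asserts the conjunction M AND P, whereas a backward prover that can post subgoals (R1 plus the conjunction-forming R6) succeeds.

```python
# Forward closure versus backward subgoaling on Argument 4 (a sketch;
# propositions are nested tuples, atoms are strings).

def forward_closure(premises):
    """Repeatedly apply forward modus ponens (R1') and forward And
    Elimination (R5') until no new assertion is added."""
    facts = set(premises)
    changed = True
    while changed:
        changed = False
        for f in list(facts):
            new = []
            if isinstance(f, tuple) and f[0] == "IF" and f[1] in facts:
                new.append(f[2])                  # R1': assert the consequent
            if isinstance(f, tuple) and f[0] == "AND":
                new.extend([f[1], f[2]])          # R5': assert both conjuncts
            for n in new:
                if n not in facts:
                    facts.add(n)
                    changed = True
    return facts

def prove(goal, facts):
    """Backward search: R1 turns a consequent-subgoal into an
    antecedent-subgoal; R6 splits a conjunctive subgoal in two."""
    if goal in facts:
        return True
    for f in facts:                               # R1 (backward modus ponens)
        if isinstance(f, tuple) and f[0] == "IF" and f[2] == goal:
            if prove(f[1], facts):
                return True
    if isinstance(goal, tuple) and goal[0] == "AND":   # R6 (And Introduction)
        return prove(goal[1], facts) and prove(goal[2], facts)
    return False

premises = [("IF", ("AND", "M", "P"), "R"), "M", "P"]
print("R" in forward_closure(premises))   # -> False: M AND P is never asserted
print(prove("R", set(premises)))          # -> True: backward subgoaling combines M and P
```

The contrast mirrors the text: the forward rules have no way to combine the two atomic premises, while the backward conclusion R advertises the need to prove the antecedent, which R6 then satisfies.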
An attempt to derive the conclusion on the basis of R1' (forward modus ponens) will fail in this situation because R1' requires an assertion to match the antecedent of the first-premise conditional (i.e., There is both an M and a P). Although we have the makings for such an assertion in the second and third premises, forward rules like R1' lack the capacity to set up a subgoal to combine these propositions. However, the backward version, R1, encounters no such problems. Because the conclusion matches the consequent of the first premise, it will advertise the need to prove the antecedent. ANDS's conjunction-formation rule, R6, can then apply to produce this proposition.

The problem in deriving the conclusion of Argument 4 is that R6 is itself a backward rule and is therefore unable to operate unless triggered by a subgoal. This suggests that one way around the problem would be to transform R6 into a forward rule that would form conjunctions whenever possible. But such a move has a fatal difficulty. For not only would a forward R6 produce the desired sentence but it would also produce an infinite number of undesired ones such as There is an M and there is a P, and there is a P; There is an M and there is a P, and there is a P, and there is a P; and so on. Applying rules like R6 in a forward direction produces an avalanche of irrelevant conclusions, a process similar to the one Newell and Simon (1972) named the "British Museum" algorithm. Perhaps we could limit this runaway productivity as we did with R1' by prohibiting R6 from applying more than once to any proposition. But recall that in the case of R1', we were eliminating only multiple tokens of the same sentence. Here we would be eliminating different sentence types, and some of these could prove useful in other problems. For this reason, it seems that the best way to curb inferences like R6 (and others that operate on their own output) is to treat them as backward rules, using them only as needed in support of subgoals.3

Supposition-creating rules. All of ANDS's backward rules operate by positing new subgoals on the basis of old ones. However, some of the backward rules have the additional task of advancing suppositions within the proof, producing new nodes on both the assertion and subgoal trees. A good example of such a rule is R9, which performs the inference called "If Introduction" in logic textbooks (see Table 1). "If Introduction" is used to prove conditional conclusions, and it does so by assuming that the antecedent of the conditional is true and by seeing whether the consequent can be deduced on that assumption. If this consequent is deduced, then the conditional itself must be true. This pattern of reasoning is familiar from mathematics, and we have already run into an example in Proof 3. Faced with the task of showing that If there is no M on the blackboard, then there is an R, we made the assumption that there is no M, and then went on to demonstrate that this assumption, coupled with earlier assertions, implies that there is an R. The implication justifies the conditional conclusion.

Rule R9 implements this deduction in the steps shown in Table 2. Its only condition is that the present subgoal have the IF p, q form. If this is the case, it builds a new subgoal node directly subordinate to the one containing the conditional and houses the consequent (i.e., q) in the new node. At the same time, it constructs an assertion node for the antecedent (p). The consequent then becomes the current subgoal, and its deduction can be attempted on the basis of the antecedent as well as any superordinate propositions.
If this subgoal can be fulfilled, the entire conditional can be added to the assertion tree just above the node containing the antecedent. Thus, in Figure 1, If not M, R is placed in the superordinate node of the assertion tree.4 There is an intimate connection between supposition-creating rules and conditional propositions. Sentences of the IF p, q form
(e.g., "If the problem is complex, Morry won't be able to solve it") can often be paraphrased as Suppose that p; then q ("Suppose that the problem is complex; then Morry won't be able to solve it"). More generally, Mackie (1973) and Rips and Marcus (1977) have given accounts of the uses of if in ordinary English in terms of ifs ability to evoke suppositions. However, it is clear that suppositions can also take part in inferences that do not directly involve conditionals, and ANDS's Rules RIO and R l l perform deductions of this type. Rule RIO handles rcductio ad absurdum arguments in which a given proposition is supposed true in order to derive a contradiction from it. Because contradictions cannot be derived from true propositions, their presence indicates that the supposed sentence is false. The negation of that sentence can therefore be added to the superordinate node in the assertion tree (the Appendix deals with a proof of this type). In 3 Another way to limit the productivity of the forward version of R6 is to restrict it to produce only conjunctions that actually appear in the premises or conclusion. But although this restriction is successful in avoiding the problems with Argument 4, it is too limiting in general. For example, consider a deduction system (e.g., Braine, 1978, or Osherson, 1975) that contains the rule, "pAND {q OR r) implies (p AND q) OR (pAND r)." Within such a system, one would like to be able to prove that arguments of the following kind are valid:
There is a T. There is an X or a Y. There is a r and an X, or a T and a Y. To prove this, we need to conjoin the premises to form There is a T, and an X or a Y and then apply the above rule. However, this conjunction is blocked by the above restriction on "And Introduction" because the conjunction appears in neither the premises nor the conclusion. The point is that we need And Introduction to feed other rules, and the proposed restriction sometimes makes this impossible. Notice, however, that casting And Introduction as a backward rule does have exactly the desired property, and this supports the formulation of R6 that appears in Table 2. 4 A kind of bonus from the ANDS approach is that some "relevance" restrictions of the sort advocated by A. R. Anderson and Belnap (1975) fall out quite naturally. For example, because propositions are only supposed true if they are needed in support of a subgoal, they will usually take part in the deduction of that subgoal. (One can sometimes trick the model into ignoring such restrictions, and it is an interesting question whether one can likewise get subjects to abandon them.)
50
LANCE J. RIPS
R11, each of the disjuncts of a proposition p OR q are supposed in turn. If some common proposition r can be deduced, both on the supposition that p and on the supposition that q, then r can be placed in the same assertion node as p OR q. We can also expect supposition-creating rules to take part in reasoning schemes that go beyond the kinds of arguments considered so far, since the ability to suppose or to imagine something is hardly limited to dealing with connectives like if, not, and or and might even be considered a distinctive feature of human judgment. Reasoning about counterfactual situations obviously depends on this ability (Revlis & Hayes, 1972). But, in addition, causal reasoning may also involve supposing that a given event occurs in order to contemplate the effects of this cause (Kahneman & Tversky, 1982; Tversky & Kahneman, 1978). Probabilistic reasoning may similarly entail imagining that certain propositions are true and estimating the likelihood of other propositions on that basis (Ramsey, 1926/1980). I return to this point later in discussing how ANDS's suppositions can be modified to account for subjects'judgments about causal arguments. Of the rules in Table 2, supposition-creating rules produce the most complex structural changes, and because of this complexity, these rules are only attempted after other means of establishing a subgoal have failed. Similarly, the remaining backward rules (Rl and R3-R8) are more complex than forward rules because backward rules must set up subgoals, monitor the status of the subgoals, and coordinate changes in the subgoal and assertion trees. Therefore, at each stage of a problem, forward rules are checked first, normal backward rules second, and suppositioncreating rules last. An Example With our knowledge of ANDS's rules, we are in a position to understand exactly how they are used in constructing a proof. To illustrate this procedure, consider Argument 2, which is repeated below: 2. 
If there is not both an M and a P on the blackboard, then there is an R.
If there is no M, then there is an R.
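The proof of this argument that Figure 2 walks through can also be compressed into a short hand-compiled trace (an illustrative sketch in my own notation, not the simulation's code); its three decisive rule applications are R2', R9, and R4.

```python
# Hand-compiled trace of ANDS's proof of Argument 2 (illustrative sketch).
# Propositions are nested tuples.

premise = ("IF", ("NOT", ("AND", "M", "P")), "R")
conclusion = ("IF", ("NOT", "M"), "R")

assertions = {premise}

# R2' (DeMorgan, forward): rewrite NOT (M AND P) inside the premise as
# NOT M OR NOT P and assert the result.
rewritten = ("IF", ("OR", ("NOT", "M"), ("NOT", "P")), "R")
assertions.add(rewritten)

# R9 (If Introduction, backward): to prove IF (NOT M), R, suppose the
# antecedent in a subordinate node and set up the consequent as subgoal.
supposition = conclusion[1]          # NOT M
subordinate = {supposition}
subgoal = conclusion[2]              # R

# R4 (Disjunctive Modus Ponens, backward): the subgoal R matches the
# consequent of IF (NOT M OR NOT P), R; each disjunct is tried as a
# subgoal in turn. The first disjunct, NOT M, has just been supposed,
# so R is deduced within the subordinate context.
disjuncts = [rewritten[1][1], rewritten[1][2]]
if any(d in subordinate or d in assertions for d in disjuncts):
    subordinate.add(subgoal)

# R9 completes: the consequent was deduced under the supposition, so the
# conditional itself goes into the superordinate assertion tree.
if subgoal in subordinate:
    assertions.add(conclusion)

print(conclusion in assertions)      # -> True: the argument is declared valid
```

Note how, as in the text, removing the R9 step would leave the conditional conclusion with no route into the assertion tree at all.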
Figure 2 shows the stages in the proof of this argument by picturing the state of the assertion and subgoal trees before and after each of the rules in the proof has applied. The last panel of the figure is identical to that of Figure 1 and represents the completed proof. The interest in Figure 2 is in the way these proof trees are built up, proposition by proposition. A complete psychological theory of deductive reasoning would have to include some provision for handling natural-language sentence structure, either by adapting the deductive rules so that they apply to natural language directly or by parsing natural language into the more formal system in which the rules are currently specified. But although this problem is an interesting one, it has not been addressed in the present version of the theory. Instead, ANDS accepts as input a list of premises and a conclusion in formal notation. For readability, ANDS's propositions are represented here by sentences composed of connectives, proposition letters, and parentheses. Thus, ANDS receives on input the propositions If not (M and P) then R and If not M then R in place of the sentences in Argument 2. At the beginning of the proof in Figure 2 (a), memory consists of just this simplified premise in the assertion tree and the corresponding conclusion in the subgoal tree.

To get things started, ANDS scans its set of rules (in the order R1', R2', R3', R5', R3-R8, R1, R9-R11) to see whether the conditions on any of these rules are satisfied. As it turns out, Rule R2' applies in this setting since the single proposition in the assertion tree contains within it the formula not (M and P), which matches the pattern specified by Condition 1. Condition 2 checks to make sure that this proposition has not been used in a previous application of R2', and because it obviously has not, both of the conditions are fulfilled (see Table 2).
The action part of this rule copies the original proposition, substituting not M or not P for not (M and P), and adds this new proposition, If not M or not P then R, to the assertion tree, as shown in Figure 2 (b). Note that Rule R2' makes no mention of subgoals and so provides us with another example of a forward rule.

Figure 2. Proof of the argument, If not both M and P then R; therefore, if not M then R. (Panels a-e illustrate ANDS's memory trees before and after each of the inference rules has applied.)

None of the other forward rules applies at this point, but it is possible to do some work in the backward direction. The main goal of the problem is conditional in form and therefore satisfies Rule R9, a supposition-creating rule that we are already familiar with. In this context, R9 supposes that the antecedent (Not M) of the conclusion is true by placing it in a subordinate node of the assertion tree. At the same time, it proposes a corresponding subgoal, R, which it takes from the consequent of the conclusion, and places it in a new subgoal node. The result of these operations is the memory structure shown in Figure 2 (c). At this stage of the proof, then, ANDS is looking for some way to establish R in the context of the supposition node in the assertion tree. If it can do so, the conditional conclusion is guaranteed.

The turning point of the proof occurs in Figure 2 (d). ANDS's search of the rules comes up with R4, since the new subgoal R and the assertion If not M or not P then R match this rule's two conditions. The action half of the rule knows, in effect, that Not M together with the above assertion implies R, so it sets up a subgoal to deduce Not M. However, this subgoal is immediately fulfilled since Not M has just been supposed. Because this subgoal is successful, R itself must be true within the subordinate context, and Action 2 of R4 adds this proposition to the assertion tree. Note that if subgoal Not M had failed, ANDS would have continued by trying to prove Not P. The fact that ANDS hit upon Not M before Not P is a consequence of the ordering of the steps in R4. But although ordering the subgoals in the opposite way would have slowed down this proof, it would not have prevented ANDS from finding it eventually.

Nearly all of the pieces are now in place. Recall that R was introduced as a subgoal by Rule R9, which was intent on proving the main conclusion of the argument. Because that subgoal has now been achieved, the conclusion itself has been deduced (see Action 3 of Rule R9). So as a finishing touch, this conclusion too can be added to the assertion tree, as shown in the final memory configuration in Figure 2 (e). And because this represents the main goal of the problem, ANDS can declare the original argument valid. The proof is, in fact, a fairly straightforward one due to the speed with which ANDS found the correct solution path. For an example of a more complex proof involving a number of false starts, see the Appendix. But even in a simple proof like that of Figure 2, difficulties may arise if one of the crucial rules is unavailable or hard to apply. For instance, removing R9 from the set of rules not only blocks the above proof but makes it impossible for ANDS to reach the correct (valid) answer at all. This source of difficulty will concern us later in accounting for subjects' performance on similar problems.

A Comparison to Other Psychological Models

Our tour of ANDS's structural and processing characteristics suggests some ways to sort out differences among current psychological models of deduction. On a first pass, the most obvious difference is the kind and number of the rules that the models adopt: 22 rules in Osherson (1975), 18 in Braine
(1978), 12 in Johnson-Laird (1975), and 14 in the present model. These rule systems are not merely notational variants, for some arguments are provable in one system but not in another. (For example, ANDS can prove valid the argument [P or Q] and not P; therefore, Q, which cannot be proved by the Osherson model. On the other hand, Osherson's model, but not ANDS, can handle the argument If P and Q, R; therefore, if not R, not P or not Q.) Nonetheless, there are reasons to think that this difference in rules is not the most revealing way to compare the models. First, the difference in number of rules is somewhat misleading since it depends on arbitrary decisions about how rules are to be counted. For example, the "And Elimination" rule can be expressed either as two separate rules ("p AND q entails p" and "p AND q entails q") or as a single rule (as in ANDS's Rule R5). Second, despite some nonoverlap, there is considerable agreement in choice of rules. All models, for example, have some rule corresponding to "And Elimination"; all but one have a rule for "Or Introduction" (e.g., R7), and so on. Third, with the exception of Braine, no one argues that his particular choice of rules is definitive in covering all and only the cognitively basic inference patterns. Rather, the claims are that the designated rules constitute a subset of the basic ones and that they allow us to predict certain facts about those more complex inferences that are derivable from them. Fourth, the effects of rule choice are apparent only if we know the control mechanisms, since even the same set of rules can produce nonequivalent inferences in different control environments. Whereas no processing assumptions are offered by Braine,5 both Johnson-Laird and Osherson are quite explicit about these matters, and this makes for a fruitful contrast.
The main differences seem to lie in (a) whether the models can look ahead to the conclusion of the argument to be evaluated and (b) whether they can make use of subgoals. In ANDS, the backward rules handle both the conclusion and its subgoals. However, the conclusion can be treated independently of the subgoals, as the JohnsonLaird and Osherson models illustrate: The former allows subgoals without conclusion-
sensitivity and the latter conclusion-sensitivity without subgoals. Let's first consider Johnson-Laird's proposal. In this system, the primary deductive step is an attempt to apply three specially designated rules to the given set of premises. These rules are modus ponens, the disjunctive syllogism, and a final rule of the form NOT (p AND q); p; therefore, NOT q. Other "auxiliary" rules are used to aid these primary inferences in a manner similar to ANDS's backward rules. For example, given two premises of the form IF p, q and p AND r, the model will attempt to carry out a modus ponens inference, determine that the second premise is not of the requisite form, propose an auxiliary goal to deduce p, and accomplish this deduction through an auxiliary rule (i.e., p AND r implies p). Other facilities may also be called into play to produce negative or conditional conclusions. What is of interest is that none of this depends on having a conclusion available, since the primary rules are triggered solely by the premises. Thus, the primary rules, like ANDS's forward rules, are initiated by premises, but like backward rules, they can propose their own subgoals. The result is that although the model is able to produce fairly powerful conclusions, it is unable to evaluate arguments (premise-conclusion pairs) except in extremely simple cases.6 (In more recent work, Johnson-Laird, 1982, Note 5, has abandoned the use of inference rules in favor of principles of meaning combination associated with the connectives. However, the points raised above apply equally to this new system.)

5 The main innovation in Braine's (1978) theory is the idea that conditional sentences in English have the same meaning (and should be represented in the same way) as the inference relation between the premises and conclusion of a valid argument. In particular, Braine advocates treating both conditionals and inference relations as material conditionals (true just in case the antecedent is false or the consequent is true). In a reply to Braine, Grandy (1979) questioned whether this identification doesn't distort the meaning of if . . . then; but a possibly more serious problem is the account of the inference relation. Although children may initially confuse truth-functional and non-truth-functional relations, the available evidence implies that by college age most subjects acquire a notion of inferential validity that is logically stronger than that of a material conditional (Osherson & Markman, 1975; Moshman, Note 4). Although ANDS provisionally interprets if . . . then as a material conditional, it maintains the classical logic distinction between the truth of a conditional and the validity of an argument. (A strengthened form of the conditional for causal propositions is considered in the Extensions section later on.)

COGNITIVE PROCESSES IN REASONING

Osherson's model has the opposite properties. In this system, there are no subgoals apart from the conclusion and no way to return to a previous step in the problem once the current strategy has failed. At the first step of a deduction, a rule is applied to the premise to produce a new proposition, say, S1. On the second step, a rule is applied to S1 to produce S2, and so on, until either the conclusion has been reached or no more rules apply. Once a rule has operated on a given proposition, no memory of that proposition is retained, so it is impossible to return to it later in the deductive process. However, to guide the derivation to the argument's conclusion, the inference rules apply only if the conclusion has a specified form. For instance, the rule that derives p from p AND q operates only if some subformula of p appears in the conclusion with no subformula of q conjoined to it. One might suspect that it would be easy to fool such a model by constructing the conclusion of the argument in such a way as to lead it off the correct inferential path, and indeed, this is the case. For example, although the model can correctly evaluate the argument If M, N; therefore, either (if M, N or O) or (both P and Q), it fails with the similar argument If M, N; therefore, either (if M, N or O) or (both M and Q). In the first problem, the model correctly transforms If M, N into If M, N or O, and this sentence in turn is rewritten as the conclusion. The second problem ought to have exactly the same proof, but in this case the first step is preempted by another rule that is triggered by the presence of M and Q in the conclusion. Instead of deducing If M, N or O from If M, N, it comes up with If M and Q, N and then halts after two more futile steps. Of course, any program that uses heuristics to guide the proof can be tripped up, and this is no less true of ANDS than of the Osherson model. The present point is that for the latter model,
this type of error is always fatal because there is no way to return to the place in the proof where the wrong rule was applied. In some sense, then, we can think of ANDS as combining the subgoal capability of Johnson-Laird's theory with the look-ahead feature of Osherson's. One can perhaps defend the last two models as explanations of special types of deduction—inference production in the case of Johnson-Laird and simple inference evaluations in the case of Osherson (see the distinction between "logical intuition" and "proof finding" in Chapter 2 of Osherson, 1975). However, it seems a reasonable working hypothesis to regard these activities as two modes of a more general-purpose deductive program. Indeed, it would be odd to suppose that humans have evolved three different logical systems, one for production of inferences, another for evaluation of simple arguments, and a third for more complicated ones. Granted that a general-purpose deduction system is a live possibility, the above considerations suggest that it will have to incorporate both some form of subgoals and some look-ahead feature. To sum up, ANDS's inference mechanism is more general than that of its progenitors, and it should therefore be better able to serve the varied demands placed on the inference system by other cognitive components (e.g., in comprehension or choice). Of course, generality has been purchased at the price of increased complexity, and we should check

6 The emphasis on production over evaluation of arguments might be justified on the grounds that production can be used to test validity (i.e., it can double as an evaluation procedure). Given an argument with premises P1, P2, . . . , Pk, and a conclusion C, one can add the negation of the conclusion to the premises and check whether a contradiction can be produced from this augmented set of propositions. However, such a mechanism is not very realistic psychologically (though it does form the basis for resolution theorem proving; see Chang & Lee, 1973). Furthermore, one could equally well support the primacy of evaluation over production on the basis of a (similarly unrealistic) procedure: In order to produce valid conclusions, one can systematically generate propositions using the syntactic rules of the language, test to see if they follow from the premises, and output any that do. In sum, although production and evaluation are related logically, it is unlikely that one is psychologically subsumed by the other. (For further discussion of production, see the section on Extensions.)
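The refutation test described in Footnote 6 can be sketched computationally. The version below is a brute-force semantic analogue (it enumerates truth assignments rather than producing a derivation, so it is not the paper's rule-based mechanism); the nested-tuple formula encoding and function names are illustrative choices, not part of ANDS.

```python
# Brute-force analogue of the refutation procedure: add NOT C to the
# premises P1..Pk and check whether the augmented set is contradictory
# (i.e., has no satisfying truth assignment).
from itertools import product

def atoms(f):
    """Collect atomic proposition names from a nested-tuple formula."""
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(sub) for sub in f[1:]))

def holds(f, v):
    """Evaluate formula f under truth assignment v (dict: atom -> bool)."""
    if isinstance(f, str):
        return v[f]
    op = f[0]
    if op == 'not':
        return not holds(f[1], v)
    if op == 'and':
        return holds(f[1], v) and holds(f[2], v)
    if op == 'or':
        return holds(f[1], v) or holds(f[2], v)
    if op == 'if':  # material conditional
        return (not holds(f[1], v)) or holds(f[2], v)
    raise ValueError(op)

def valid(premises, conclusion):
    """Valid iff premises plus NOT conclusion admit no satisfying assignment."""
    fs = list(premises) + [('not', conclusion)]
    names = sorted(set().union(*(atoms(f) for f in fs)))
    for values in product([True, False], repeat=len(names)):
        v = dict(zip(names, values))
        if all(holds(f, v) for f in fs):
            return False  # countermodel found: argument is not valid
    return True  # the augmented set is contradictory: argument is valid

# Argument 2 from the text: If not (M and P) then R; therefore, if not M then R.
premise = ('if', ('not', ('and', 'M', 'P')), 'R')
conclusion = ('if', ('not', 'M'), 'R')
```

On Argument 2 this check confirms validity, since every assignment making the premise true while M is absent forces R.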
LANCE J. RIPS
whether this complexity is required in explaining actual instances of reasoning.

ANDS's Empirical Consequences

Protocol Examples

Most of the psychological research on propositional reasoning has focused on one sentential connective at a time, for example, determining the way people answer questions about negative or conditional statements (see Wason & Johnson-Laird, 1972, for a review of this research and Osherson, 1975, for an important exception to the one-connective-at-a-time strategy). However, ANDS's aim is to explain reasoning with arbitrary combinations of connectives, and so to evaluate the theory's empirical adequacy, we need a more general set of results. In the present section, I examine new data that bear on this adequacy issue, beginning with a sample protocol from a subject who was trying to decide whether Argument 2 is valid. Because this argument is the same one used to illustrate ANDS's proof procedure in Figure 2, the example can help assess ANDS's advantages and disadvantages as a theory of reasoning. This protocol was collected before the theory was formulated and shaped some of the decisions that went into it. For this reason, the similarities discussed below cannot be taken as detailed confirmation of the theory. However, they may at least indicate the extent to which ANDS's general structure is compatible with human styles of reasoning. In the experiment from which the protocol was taken, two groups of subjects evaluated single-premise arguments while thinking aloud. Four subjects were included in each group, and the groups were assigned to separate sets of 12 arguments. Each set contained six valid and six invalid problems (the valid arguments were those of Osherson, 1975, Table 11.6). Individual arguments were presented to subjects one at a time in random order on index cards, which remained in view while the problem was being solved. The subjects' task was to decide whether "the conclusion had to be true if the premise was true."
In addition, they were told to answer the question first in whatever way seemed
natural to them. Then, in order to encourage the subjects to mention all of the steps in their thinking, they were asked to explain their answers once again as if they were talking to a child who did not have a clear understanding of the premise and conclusion. The sample protocol in Table 3 is a complete transcript from a subject who correctly evaluated Argument 2. The subject was at the time a graduate student in English literature who (like the other subjects in this experiment) had no previous formal training in logic. In the subject's initial solution, he first reads the premise and conclusion of the argument (Lines a and b in Table 3) and then begins working over the premise by paraphrasing it in various ways (Lines c-e). The most helpful of these paraphrases occurs in Line d: "If you come upon any blackboard without an M or without a P . . . there will be an R." From this proposition, the answer seems to be self-evident because after repeating the conclusion in Line f, the subject declares this conclusion to be true. Although the last step is not elaborated in the initial solution, the subject's explanation to the imaginary child provides some insight into his reasoning. He first tries to get the child to understand that if either an M or a P is missing or if both of them are missing from the blackboard, then there has to be an R, essentially a restatement of Line d. He then has the child imagine a situation in which the antecedent of the conclusion is true: "Look at this blackboard here. There is no M on it." Because the M is missing, then "there has to be an R there." There is a clear parallel between this informal proof and ANDS's assertion tree for the same argument in Figure 1. In particular, Lines a, i, and the second part of Line e in the transcript correspond to the premise (Assertion 1 in the figure), and the statement of the conclusion in Lines b and f to Assertion 8.
Lines d, k, and the second half of Line m are all closely related to Assertion 3 (i.e., If not M or not P then R), and finally, the two sentences in Lines o and q are equivalent to the two propositions of the subordinate node. It seems fair to say that the main thread of the subject's explanation is basically the one
Table 3
Protocol From Subject 7 on the Problem, "If it is not true that there is both an M and a P, then there is an R; therefore, if there is no M, then there is an R"

Initial solution
a. The sentence above the line reads, "If it is not true that there is both an M and a P, then there is an R."
b. The sentence beneath the line reads, "If there is no M, then there is an R."
c. If it is not true that there is both an M and a P—if you come upon a blackboard and there is an M and a P, there will always be an R.
d. If you come upon any blackboard without an M or without a P, without both of them together, there will be an R.
e. So with an M and a P, no R, and if it is not true that they're both there, then there is an R.
f. Now the sentence below says if there is no M, then there is an R.
g. That's true.
h. Now I'm construing the top sentence a little differently, but I think that shows how I'm reasoning, incorrectly or otherwise.
i. If it is not true that there is both, then there is an R.

Explanation to a "child"
j. OK. Anytime you see both the letters M and P on a blackboard, then you can be sure that there is no R.
k. But if one of the letters is missing or if both of the letters is missing, then you'll see an R on the blackboard.
l. Now look at this blackboard here. There is no M on it,
m. so if either an M or a P is missing or both of them are missing, then there has to be an R.
n. Now what about this blackboard?
o. There is an M missing.
p. What does that say?
q. That says there has to be an R there.
abstracted in Proof 3 and that ANDS reproduced in the earlier example. However, the comparison between the subject's proof and ANDS's proof of the same theorem also points up a number of dissimilarities. The first of these is that some of the subject's statements never appear in ANDS's assertion or subgoal trees. This is true, for example, of the sentence in Line c of Table 3 ("if. . . there is an M and a P, there will always be an R"), of the statement in the first half of Line e, and of the very similar one in Line j ("Anytime you see both the letters M and P on a blackboard, then you can be sure that there is no R"). What makes these statements initially puzzling is that they are of the form IF p, q and IF p, NOT q, and it is somewhat unlikely that the subject believed both of these sentences to be true at the same time. But because the first of them is never repeated in the proof, it can plausibly be considered a slip of the tongue or a temporary misunderstanding of the premise, which is later abandoned in favor of the second type of sentence. The factors responsible for this error are unclear, and ANDS makes no attempt to duplicate it.
Lines e and j by themselves are easier to understand. These sentences are the inverse of the original premise; that is, from If not (M and P) then R the subject has inferred If M and P then not R. This inference is not valid on most formal interpretations of IF and cannot be deduced from the rules in Table 2. Nevertheless, the inverse does follow if the premise is understood as a biconditional—Not (M and P) if and only if R—as might be appropriate in certain contexts (Fillenbaum, 1977; Geis & Zwicky, 1971). For sentences of the type used here, a substantial minority of subjects accepts the inverse as valid on the basis of a premise conditional (Pollard & Evans, 1980). This behavior could be simulated by providing ANDS with the corresponding deduction rule (see the section on Extensions below), though for the present ANDS sticks to the classical interpretation of IF. Note that such a rule would have to work in a forward direction to account for the protocol, because the inverse does not seem to be motivated by any obvious subgoal. Indeed, the inverse plays no direct role in the proof at all, if this analysis of the subject's reasoning is correct.
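The contrast between the two readings of the premise can be checked mechanically. The sketch below is an illustrative encoding (not part of ANDS): it enumerates truth assignments to M, P, and R and confirms that the subject's inverse inference fails under the material conditional but goes through when the premise is read as a biconditional.

```python
# Does "If M and P, then not R" follow from "If not (M and P), then R"?
# Under the material conditional it does not; under a biconditional
# reading of the premise (Not (M and P) if and only if R) it does.
from itertools import product

def inverse_follows(biconditional):
    for M, P, R in product([True, False], repeat=3):
        antecedent = not (M and P)
        if biconditional:
            premise = (antecedent == R)          # Not (M and P) iff R
        else:
            premise = (not antecedent) or R      # material conditional
        concl = (not (M and P)) or (not R)       # If M and P, then not R
        if premise and not concl:
            return False                         # countermodel found
    return True
```

The countermodel in the material-conditional case is the assignment in which M, P, and R are all true.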
A second discrepancy between the two proofs is that the subject omits mention of one of the problem's subgoals, There is an R. (He does, of course, mention the main goal or conclusion in Lines b and f.) This is likely due to the subgoal being a simple and obvious one, and it seems to be implicit in the subject's explanation. In order to account for the directedness of the subject's proof, we need to make some assumptions about his subgoals, and those that appear in the subgoal tree of Figure 1 seem to be reasonable candidates. In the previous section, I noted that earlier psychological models have problems in accounting for certain aspects of the deductive process—look-ahead features in the case of Johnson-Laird's model and subgoals in the case of Osherson's. Some further excerpts from the protocols demonstrate how subjects use these features. Argument 5, for example, provides us with an instance where a look-ahead mechanism is particularly helpful, since this argument is easy to handle by supposing as true that part of the conclusion inside the negative (i.e., There is both a Y and no R) and showing that this piece of the conclusion yields a contradiction when taken with the premise.

5. If there is a Y or an M on the blackboard, then there is an R. It is not true that there is both a Y and no R.
ANDS's strategy on this problem is to work backward from the conclusion via Rule R10, first supposing that there is a Y and no R and then hunting for a contradiction based on this supposition. Subjects' strategies are remarkably similar. Thus, one subject's protocol begins as follows:

6. Ok, I think I have to take this one by analyzing the second part of the second sentence first. It says that there's both a Y and no R, and above it says if there is a Y, then there is an R. So then the second part of the second sentence is not true, so the second sentence as a whole is true.
Another subject explains the validity of the same argument in this way: 7. The bottom sentence says that—the second half of the bottom sentence says that there's a Y there and there is no R there, and this cannot be because the top sentence says that if there is a Y there then there must be an R on the blackboard. So when you put
the whole bottom sentence together, it says that it is not true that there is both a Y and no R. So this must be right.
The Johnson-Laird model cannot account for the subjects' reasoning in Excerpts 6 and 7 since the deductive procedure in that model has no access to the argument's conclusion and, hence, cannot analyze it in the manner of these subjects. (As presently formulated, the Osherson model also has trouble with this example. Although the model can at least inspect the conclusion, it has no way to extract part of the conclusion and use it to make further inferences. In other words, the Osherson system does not use suppositions, whereas the solutions in Excerpts 6 and 7 depend on supposing Y and not R.) There are also many illustrations in the protocols of subjects giving up a particular line of reasoning and switching to another. One case in point occurred with Argument 8: 8. If there is not a Q or not an N, then there is a B. There is a Q or a B.
Part of the transcript from one subject working on this problem shows a typical logical double take:

9. That's wrong, this is not necessarily true. Because the top sentence is a conditional, . . . because there is an or in there, that doesn't mean that there is a Q or a B. There might be. What it does mean is that there's a Q or an N or else there's a B, but not that there's either a Q or a B . . . I messed this up again . . . If Q is on the blackboard, then the bottom sentence is true, because either Q or B is on the blackboard, namely, Q is on the blackboard. If Q is missing from the blackboard, then there must be a B on the blackboard, because the top sentence says that if there is no Q or no N, then there is a B. It says that if Q is missing from the blackboard, then there is a B, and the bottom sentence is true.
This reversal is difficult to explain unless some provision is made in a deductive model for abandoning unsuccessful strategies and initiating new ones. This is prohibited in the system developed by Osherson; but by contrast, in a system with subgoals it is easy to abandon a failed subgoal, back up to a superordinate goal, and begin a new line of reasoning. As a more global test of ANDS's proofs, two judges, who were familiar with logic, rated the similarity between these proofs and the subjects' protocols for the same arguments. As a basis for comparison, these judges were also asked to rate the similarity between the transcribed protocols and the proofs generated by the Osherson system (1975, p. 262), the only alternative psychological model that yields clear-cut predictions for these arguments. Two steps were taken to facilitate the comparison: First, the stimuli were limited to those arguments that were valid in both the ANDS and Osherson models and to those protocols in which the subject had also correctly reached a valid decision. This left a total of 32 protocols, representing 10 different arguments and 8 subjects. Second, both types of proofs were reexpressed in a common format, that of Fitch (1952), with which the judges were acquainted. The judges were instructed to estimate the extent to which the protocols reflected the same method shown in the proof, and their ratings were given on a scale of 0 to 5, where 0 indicated that the protocol method was definitely not the same as the proof and 5 meant that the method was definitely the same as the proof. Several of the protocols used in the experiment were not very revealing because subjects were inexplicit about their solution strategies despite our instructions. These vague responses place a limit on the maximum similarity of the proofs to the transcripts. Nevertheless, the mean similarity rating for ANDS's proofs was 3.64 and that of Osherson's proofs 3.21. Taking the two judges' ratings as dependent measures, the difference between the proof systems is reliable in a multivariate analysis of variance in which the individual protocols served as the unit of analysis: Wilks's Λ = .620, corresponding to F(2, 30) = 9.21, p < .01 (see Rao, 1973, pp. 555-556; the result is also significant by other standard multivariate criteria). For the reasons given above, these data should be interpreted cautiously.
However, they help bolster the claim that ANDS's assumptions are fairly realistic with respect to reasoning by untrained subjects. Memory for Proofs The data most often cited in support of a theory of reasoning are judgments about the validity of inferences. The protocols just examined are a special kind of judgment of this
sort, and validity judgments are taken up again in the following sections. However, because ANDS makes specific assumptions about working memory for proofs, another way to evaluate the model is to see which propositions subjects recall after reading or listening to a demonstration like that in Proof 3. The rationale for such a test derives from previous experiments on memory for text (e.g., Kintsch, Kozminsky, Streby, McKoon, & Keenan, 1975; Meyer, 1975). A consistent finding in these studies—the so-called levels effect—is that recall probability for a given proposition is correlated with the height of that proposition in a hierarchical reconstruction of the text. High-level facts that are closely related to the central theme of the passage are more accurately recalled than low-level details. Because ANDS provides us with a ready-made hierarchical analysis of proofs in the form of its tree structures, an obvious prediction is that subjects should recall propositions from the top of these trees better than more embedded propositions. An experiment by Marcus (1982) on memory for proofs allows us to test this prediction. In this study, subjects listened to a series of passages, each of which was an instantiation of a simple proof, and then attempted to recall the passages in response to their titles. The crucial variable was whether the passage contained lines that would appear in a subordinate node of the assertion tree. Each passage appeared in two versions, one with subordinate sentences of this kind and the other without. (Passages were balanced so that a given subject saw only one version of each.) The sentences in Proof 10 are an example of a passage with subordinate propositions:

10. a. Suppose the runner stretches before running.
b. If the runner stretches before running, she will decrease the chance of muscle strain.
c. Under that condition, she would decrease the chance of muscle strain.
d. If she decreases the chance of muscle strain, she can continue to train in cold weather.
e. In that case, she could continue to train in cold weather.
f. Therefore, if the runner stretches before running, she can continue to train in cold weather.
The assertion tree for this proof is shown on the left-hand side of Figure 3. The conditionals in Sentences b and d are premises
Figure 3. Assertion trees for two proofs from Marcus (1982). (Proof A is an embedded proof; Proof B is nonembedded.)
of the argument and are placed in the top node. However, Sentence a is introduced as an explicit supposition whose truth value is uncertain. Its role is exactly the same as the supposition Not M in the proof of Figure 1—it is temporarily assumed in order to facilitate further conclusions—and as in the case of Not M, it belongs to the subordinate node of the tree. Propositions c and e are derived by modus ponens from this supposition and the conditional premises. Finally, Sentence f is deduced from Sentences a and e on the basis of Rule R9. Other subordinate proofs in Marcus's study used the supposition-creating rules R10 or R11 in place of R9. We can compare the above example to its nonembedded counterpart in Proof 11:

11. a. The runner stretches before running.
b. If the runner stretches before running, she will decrease the chance of muscle strain.
c. Therefore, she will decrease the chance of muscle strain.
d. If she decreases the chance of muscle strain, she can continue to train in cold weather.
e. Thus, she can continue to train in cold weather.
f. If she did not decrease the chance of muscle strain, she would ruin her chance of running in the Boston Marathon.
The basic difference between Proofs 10 and 11 is that the latter employs no supposition-creating rules and hence is represented by an
assertion tree with a single node, as shown at the right of Figure 3. The conditional premises in Sentences 11b and 11d are identical to the ones in Sentences 10b and 10d. But this time Sentence 11a also serves as a premise, as does the filler item in Sentence 11f. (This last sentence was added to equate the length of the two passages.) Propositions 11c and 11e, like the corresponding sentences in Proof 10, are obtained by modus ponens from the earlier assertions. The important comparison in this experiment is between the recall rates for Propositions 10a, 10c, and 10e on the one hand ("embedded lines") and for Propositions 11a, 11c, and 11e on the other ("unembedded controls"). Apart from the initial adverbial phrases,7 embedded lines and unembedded controls have the same content; however, whereas embedded lines appear in the subordinate node in their proof, the controls belong to the superordinate node. Thus, if the representations in Figure 3 are correct and if recall is better for superordinate material, subjects should be more accurate with the controls than with the embedded items. In fact, subjects were correct on 44.6% of trials with sentences like 11a, 11c, and 11e but only 25.4% correct with 10a, 10c, and 10e, in accord with the above prediction. This result cannot be blamed on global differences between the passages. Such differences would produce higher recall rates for sentences like 11b than for equivalents like 10b, both of which are in the superordinate nodes of their respective assertion trees. However, the obtained effect is in the opposite direction, accuracy on 11b-type sentences being 71.2% versus 83.1% for sentences like 10b. (There is an obvious recall advantage for Sentences 10b, 11b, and their analogs over Sentences 10c, 11c, and the like. This could be due to a number of structural or content factors, which were not controlled in this experiment and which have no bearing on the basic contrast.) In short, these results confirm ANDS's memory organization. Subordinate propositions in the assertion tree exist only to provide the grounds for a higher level conclusion, and because the truth of these subordinate propositions is uncertain, they should be less useful in the context in which the problem is stated. Given limited memory capacity, forgetting from the bottom of the proof trees would seem a more adaptive process than alternative possibilities.

7 In a subsequent experiment, Marcus (1982) has shown that the adverbial phrases alone do not account for the obtained differences between embedded lines and controls.

The Acceptability of Arguments

As an additional test of ANDS, we can try fitting the model directly to subjects' decisions about the validity of arguments. In doing so we make crucial use of the assumption, mentioned above, that the deduction rules in Table 2 are unavailable on some proportion of trials (where unavailability may be due to failure to retrieve a rule, to recognize the rule as applicable, or to apply it properly). If we know which rules are needed in proving a given argument and how likely these are to be available, we can arrive at predictions about how often subjects will evaluate the argument correctly.
For example, let us suppose that Argument 12 is presented to a group of subjects who are asked to determine its validity.
12. If Judy is in Albany or Barbara is in Detroit, then Janice is in Los Angeles. If Judy is in Albany or Janice is in Los Angeles, then Janice is in Los Angeles.
We assume that the subjects will attempt to construct a mental derivation in order to tell if the conclusion follows from the premise, doing implicitly what the subjects of our protocol experiment were doing explicitly. If the subjects are successful in deducing the conclusion, they will judge it to be valid. Otherwise, they will either guess at the answer with some probability, pg, or simply declare the argument invalid (with probability 1 - pg). If ANDS is a correct model of the derivation process, the subjects' ability to deduce the conclusion depends on whether the necessary inference rules from Table 2 are available. That is, the overall probability of a correct valid response to Argument 12 should be equal to the joint probability of having all essential rules available to construct the proof plus some increment due to correct guessing. Other factors—for example, misperceiving part of the argument—could keep subjects from finding a correct proof, but in order to simplify the predictions, let's assume that these sources of error are negligible compared to rule availability. When all of its deduction rules are available, ANDS proves Argument 12 using R4, R9, and R11. If these rules are accessible with probabilities p4, p9, and p11, and if we assume these probabilities are independent, then the likelihood of a correct response can be expressed as

P(valid) = p4p9p11 + .5pg(1 - p4p9p11),   (1)

where the first term is the probability of producing the proof and the second term reflects a correct guess after failing to find a proof. As it stands, however, Equation 1 is not quite right. It would be perfectly correct if the R4-R9-R11 proof were the only possible one, but it turns out that ANDS can still find a proof of Argument 12 even if R4 is missing: A combination of Rules R1 and R7 fills the place left by R4. This means that additional terms must be added to Equation 1 to reflect this alternative derivation. The correct prediction equation is:
P(valid) = p4p9p11 + (1 - p4)p1p7p9p11 + .5pg[1 - p4p9p11 - (1 - p4)p1p7p9p11].   (2)
Here, the first term is again the probability of finding the original proof, the second term is the probability of finding the alternative proof, and the final term is the probability of finding neither proof but making a correct guess. Because any further deletion of rules results in no proof at all, Equation 2 is our prediction about the proportion of correct responses to Argument 12. It should be clear that prediction equations like Equation 2 can be developed for any argument by systematically deleting combinations of rules from ANDS's repertoire and seeing if the stripped-down model comes up with a proof. Predictions were derived in this way for 32 valid arguments that are listed schematically in Table 4. The arguments were constructed by selecting triples of rules from Table 2 and then attempting to generate a proof from each triple. With the exception of R8, all of the rules in Table 2 appear in ANDS's proofs of two or more of the arguments. In addition to these critical problems, an equal number of invalid arguments were produced by rearranging the original premises and conclusions. These problems were included to check whether subjects were responding to the form of the individual sentences rather than to the logical relations among them. Finally, 40 filler arguments were added to this set, most of which were simple valid problems. Thirty-six subjects participated in this experiment, none of whom had any formal coursework in logic. For half of them, the arguments were presented in terms of propositions about the location of people in cities, as in Argument 12. For the remaining subjects, the same problems were phrased with propositions describing the actions of parts of a machine. For instance, these subjects would have seen Argument 13 in place of Argument 12.

13. If the light goes on or the piston expands, then the wheel turns. If the light goes on or the wheel turns, then the wheel turns.
Within each of these two groups, atomic propositions (e.g., "Judy is in Albany" or
"The light goes on") were randomly assigned to the arguments, with no proposition repeated in more than one problem. The order of the arguments was randomized separately for each subject and presented in a single printed list. Beneath each problem were the phrases "necessarily true" and "not necessarily true." Subjects were asked to circle the first response if the conclusion had to be true whenever the premise was true and to circle the second response otherwise. A response was required for each problem, even if it meant guessing. Subjects were also cautioned that the answer to a problem did not depend on the answer to any other problem on their list. Subjects were tested in groups of varying size and were allowed to proceed at their own pace, normally taking about 1 hour to evaluate the 104 arguments.

Table 4 presents the results for the valid problems from this experiment, and a preliminary look shows that subjects correctly judged them as valid on 50.6% of trials, with scores for the individual arguments varying from 16.7% on Problem 3 of the table to 91.7% on Problem 27. Although all of these arguments are valid in classical logic, they clearly span a wide range of difficulty for our subjects. The percentage of valid responses to the invalid arguments (22.9%) was significantly less than for the valid ones, F(1, 34) = 84.26, p < .01. To some extent, then, subjects were successful in discriminating the two problem types, even though their overall hit rate was low. Because the valid and invalid items were matched for the complexity of their premises and conclusions, this result suggests that complexity alone cannot account for subjects' decisions. This is backed by relatively low correlations between the percentage of valid responses in Table 4 and factors such as the number of premises in the argument (r = -.23), the number of types of atomic propositions (r = -.04), and the number of proposition tokens (r = .10).
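The rule-deletion procedure behind predictions like Equation 2 can be sketched in code. The sketch below is an illustration, not ANDS itself: it assumes each candidate proof is summarized by the set of rules it requires, that rules are available independently with the probabilities given, and that a subject who finds no proof guesses correctly with probability p_guess. The names p_valid, proofs, avail, and p_guess are hypothetical, introduced only for this sketch.

```python
from itertools import product

def p_valid(proofs, avail, p_guess=0.5):
    """Predicted probability of a correct "valid" response.

    proofs: list of sets, each the rules one proof of the argument needs.
    avail:  dict mapping each rule name to its availability probability.
    A subject succeeds iff every rule of at least one proof is available;
    otherwise the subject guesses correctly with probability p_guess.
    """
    rules = sorted({r for proof in proofs for r in proof})
    total = 0.0
    # Enumerate every pattern of available/unavailable rules.
    for pattern in product([True, False], repeat=len(rules)):
        have = {r for r, ok in zip(rules, pattern) if ok}
        prob = 1.0
        for r, ok in zip(rules, pattern):
            prob *= avail[r] if ok else (1.0 - avail[r])
        if any(proof <= have for proof in proofs):
            total += prob            # some proof goes through
        else:
            total += prob * p_guess  # no proof found; correct guess
    return total
```

For an argument like Argument 12, one would pass the rule sets of the original and the alternative proof; deleting a rule from the repertoire corresponds to setting its availability to zero, which reproduces the "stripped-down model" check described above.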
An analysis of variance of the valid items showed no reliable difference due to the content of the problems (people in places or machine movements) and no interaction of content with scores on the individual arguments: for the content main effect, F(1, 34) = .20, p > .10; for the interaction, F(31, 1054) = 1.12, p > .10. For this reason, the results for the two
COGNITIVE PROCESSES IN REASONING

Table 4
Observed and Predicted Percentages of Valid Responses to Stimulus Arguments

Argument                     Observed    Predicted
1.  (p v q) & ~p               33.3        33.3
    ∴ q v r
2.  s                          66.7        70.2a
    p v q
14. (p v r) → ~s               50.0        38.1a
    ∴ p → ~(s & t)
15. ~(p & q)                   77.8        75.8
    (~p v ~q) → r
    ∴ ~(p & q) & r
~p → (