Psychonomic Bulletin & Review 1996, 3 (1), 95-99
Induction and category coherence

MARY E. LASSALINE and GREGORY L. MURPHY
University of Illinois at Urbana-Champaign, Urbana, Illinois

In studies of category formation, subjects rarely construct family resemblance categories. Instead, they divide objects into categories using a single dimension. This is a puzzling result given the widely accepted view that natural categories are organized in terms of a family resemblance principle. The observation that natural categories support inductive inferences is used here to test the hypothesis that family resemblance categories would be constructed if stimuli were first used to generate inductive inferences. In two experiments, subjects answered either induction questions, which made interproperty relationships more salient, or frequency questions, which required information only about individual properties, before they performed a sorting task. Subjects were likely to produce family resemblance sorts if they had first answered induction questions but not if they had answered frequency questions.

A common assumption in categorization research is that natural categories are organized in terms of a family resemblance principle. According to the family resemblance hypothesis proposed by Rosch and Mervis (1975), categories are constructed so that members of each category are similar to one another and different from members of other categories. The most typical category members are those that are most similar to the other members and most different from nonmembers. In this view of categories, unlike the "classical view" (see Smith & Medin, 1981), there are no defining characteristics that are possessed by all category members but not by nonmembers. Rather, it is overall similarity that determines category membership. Even theorists who promote an exemplar or hybrid model of concepts assume that multiple features are involved in determining category membership and typicality (e.g., Hintzman, 1986; Medin & Schaffer, 1978).
It is this clustering of many attributes that makes categories particularly useful for identification, inference, problem-solving, and other cognitive tasks (see Rosch, Mervis, Gray, Johnson, & Boyes-Braem, 1976). This argument about category structure has been made very strongly in regard to natural-kind categories, such as gold, water, oak, or robin (see, e.g., Keil, 1989; Markman, 1989). Natural kinds are assumed to represent biological or physical categories that are tied together by some kind of microstructure (e.g., genetic code or atomic structure) that explains their superficial features. As a result, natural-kind categories are inductively very rich. For example, if you see an animal of a certain size that has wings, you can infer with a fairly high probability that it has a number of other features that may not be presently perceptible, such as having a beak, laying eggs, having two legs, and so on. If you can identify the animal as a bird, then this and other inferences become even more certain. Considerable research has shown that both adults and children use membership in natural kinds to support such inferences (see Markman, 1989, for a review).

This research was supported by National Institute of Mental Health Grant MH41704. We thank Arthur Markman, Brian Ross, and an anonymous reviewer for comments on a draft of this manuscript. Correspondence should be addressed to M. Lassaline, Beckman Institute, University of Illinois, 405 N. Mathews, Urbana, IL 61801 (e-mail:
[email protected]).

In order for natural kinds to allow this kind of inference, it is necessary that people learn multiple features for each category. That is, in order to support the inferences about birds, we must learn that birds generally have wings, lay eggs, live in nests, fly, have beaks, and so on.

In summary, there is fairly wide agreement that the categories that people normally learn are represented by multiple features that are generally nondefining, rather than by one or two criterial characteristics.

It is very surprising, then, that when subjects are asked to construct categories out of novel stimuli (e.g., Ahn & Medin, 1992; Medin, Wattenmaker, & Hampson, 1987; Regehr & Brooks, 1995; Spalding & Murphy, in press), they rarely construct family resemblance categories in a sorting task. In a series of experiments that used a variety of stimulus materials, category structures, procedures, and instructions, Medin et al. (1987) found that subjects overwhelmingly sorted on the basis of a single dimension. For example, in their Experiment 1, Medin et al. constructed novel animals that varied on the dimensions of head shape, number of legs, body markings, and tail length. If subjects had formed family resemblance categories, then these categories would have been highly predictive of all four dimensions. Instead, when given the stimuli and asked to form the most natural categories, all subjects divided them along a single dimension, such as striped versus spotted body marking.
As a general rule, such divisions are much less informative: One dimension can be predicted perfectly, but the other three dimensions cannot. To understand the difference, imagine that people did not form the category bird but instead had unidimensional categories, such as winged thing, or thing that flies. Because the category winged thing contains birds, bats, insects, and some artifacts (planes, etc.), it would be much less informative than bird. Knowing that something is a member of the category winged thing would allow a prediction about only one dimension. Thus, single-criterion concepts have been widely acknowledged as being uninformative (Markman, 1989, p. 88f). The question, then, is why subjects choose to create them in category formation experiments.

Copyright 1996 Psychonomic Society, Inc.

To answer this question, it is informative to look at exceptions to the general rule of unidimensional category formation. Two consistent exceptions in free-sorting tasks occur (1) in the presence of conceptual knowledge that makes interproperty relationships salient (e.g., animals with wings are more likely to fly than are animals with arms), and (2) when causal relations between properties are explicitly given. Ahn (1990) found that people are more likely to create family resemblance categories when they are given the category prototypes or a theory integrating features of concepts. For example, if subjects are given the stimuli to be sorted and are explicitly told what the prototypes of the two categories are, many (but not all) subjects create family resemblance categories. If the features within a category can be related by prior knowledge, subjects may then identify the relations among the features and use multiple features in constructing categories (Spalding & Murphy, in press). Without this background knowledge, subjects tended to sort examples based on values on a single dimension.

The question remains, however, as to why subjects presented with novel stimuli tend not to naturally form family resemblance categories, given that such categories are more useful and allow more inferences than unidimensional categories.
One possibility is that the typical category formation task does not involve any use of the categories that would require inferences or helpful categorizations. Typically, subjects simply sort the stimuli into whatever categories they like, and then the task is over. They do not use the categories in any way. Presumably, family resemblance categories are initially more difficult to construct, because they require attending to and integrating information about values on multiple dimensions. Unidimensional categories, by definition, do not. Put simply, family resemblance sorting is harder than unidimensional sorting. Since the rich inductive structure of family resemblance categories is not actually used in the experiments, it may not be surprising that subjects take the easier route.

The experiments reported here test the hypothesis that when subjects must actually perform inductive inferences on the stimuli, they will be more likely to form family resemblance categories. This prediction is the obverse of the claim usually made about family resemblance stimuli. That is, family resemblance stimuli are said to be useful because they support inferences. Our prediction, then, is that when subjects are asked to draw inferences about the stimuli, they will be more likely to form family resemblance categories. Asking subjects to first draw inductive inferences about objects could thereby make interproperty relationships more salient, which could result in their using multiple features in constructing categories.

EXPERIMENT 1
In Experiment 1, subjects were given descriptions of vehicles, animals, or buildings and were asked to sort them into categories. Prior to the category construction task, half of the subjects were first asked inductive questions about the stimuli. For example, for a set of animal stimuli that vary on the dimensions of tail length (long or short) and tooth shape (flat or sharp) among others, subjects were asked questions such as "Given that this animal has a short tail, what kind of teeth would you expect it to have?" It was expected that the induction task would make interproperty relations more salient and lead to family resemblance sorting. Other subjects answered frequency questions, which required information only about single dimensions (e.g., "How many animals have a short tail?"). The subjects were asked questions about the same features. A third group of subjects performed only the sorting task, to provide a baseline against which to compare sorting data from the other two groups. If the induction task makes interproperty relationships more salient, and given that such relationships are the basis for family resemblance categories, subjects performing the induction task should be more likely to construct family resemblance categories than should subjects performing the frequency task or the baseline group.

Method
Subjects. Sixty-nine University of Illinois students participated in this experiment for credit in an introductory psychology course. Twenty-four were assigned to the induction condition, 24 were assigned to the frequency condition, and 21 were assigned to the baseline condition. Within each of these conditions, one third of the subjects were assigned to each of the three stimulus sets (vehicles, animals, and buildings).
Materials. Three stimulus sets were used in this experiment: vehicles, animals, and buildings. One pair of family resemblance categories was formed for each stimulus set.

Table 1
Stimulus Structure Used in Experiments 1 and 2

Columns: Exemplar, then Features A, B, C, D, E, F, G, H. Exemplars 1-10 belong to Category 1 and Exemplars 11-20 to Category 2.
[Table body garbled in the source. Legible entries: within each category, Feature G reads 1 0 1 0 1 0 0 1 1 0 across the 10 exemplars; Feature F reads 1 1 0 0 1 1 and Feature H reads 1 0 1 0 on the exemplars that display them. The entries for Features A-E cannot be assigned to rows.]

Note. Features A-E are relevant for family resemblance categorization; Features F-H are irrelevant. The two possible values of each attribute are represented by 0 and 1. The absence of an attribute is represented by a dash.

Each set included 20 exemplars, 10 per category, which varied on eight attributes. There were two possible values for each attribute. For example, the eight attribute-value pairs for vehicles were bench or bucket seats, nonradial or radial tires, convertible or nonconvertible, license plate in front or back, uses diesel or gas, white or green, two or four doors, and power or manual windows. An abstract representation of the 20 exemplars is given in Table 1. The eight attributes are represented by the letters A-H. The two possible values of each attribute are represented by 0 and 1, and absence of the attribute is indicated by a dash. Each exemplar possessed five of the eight attributes. Attributes A-E were classified as "relevant" because they were perfectly predictive of family resemblance category membership, and Attributes F, G, and H were "irrelevant" because they provided no information regarding which category an exemplar belonged to. Three of the five attributes possessed by each exemplar were relevant, and two were irrelevant. With the stimulus structure indicated in Table 1, possession of any one of the five relevant attributes was sufficient but not necessary to predict category membership. For example, the first exemplar in Category 1, represented in Table 1 as "1 1 1 - - 1 1 -," possesses Value 1 on Attributes A, B, C, F, and G, and no information is given about its value on Attributes D, E, or H. Therefore, using the attribute values provided for the vehicle
stimulus set earlier, the actual description of that exemplar shown to the subjects would read, "bench seats, nonradial tires, convertible, white, two door." All 20 exemplars for each stimulus set were constructed in this manner and printed onto index cards. The set of 20 cards was shuffled together to form a stimulus set for each subject.
Procedure. Induction subjects answered questions about properties designed to increase the salience of interproperty relations. Frequency subjects answered questions about individual properties. Both groups of subjects were given a stack of 20 cards and asked to look through the stack once. Next, induction subjects were asked to provide answers to a set of five questions, and frequency subjects were asked to provide answers to a set of eight questions. The subjects were instructed to use the stack of 20 cards to answer the questions, and they were given the correct answer after each question. The set of questions used for each group of subjects is provided in Table 2. The attributes and values used in Table 2 refer to the abstract representation given in Table 1. The actual questions given to the subjects were generated by inserting the specific values for a concept pair. For example, the first induction question, given in Table 2 as "If an X has A1, what kind of C does it have?" actually read "If a vehicle has bench seats, what kind of top does it have?" and the correct answer would be "nonconvertible" (see Note 1). Both sets of questions mention exactly the same eight attributes. Following the question-answering task, all three groups of subjects (induction, frequency, and baseline) completed a sorting task in which they were asked to divide the cards into two groups that seemed best or most natural to them. No constraints were placed on the size of the two groups.
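
The mapping from an abstract pattern in Table 1 to the wording on a card can be illustrated with a minimal Python sketch. It assumes that Value 1 is the first value named in each vehicle attribute pair and Value 0 the second; this agrees with the worked example in the Materials section but is not stated outright. The names VEHICLE_VALUES and describe are hypothetical.

```python
# Vehicle attribute pairs in the order listed under Materials (Attributes A-H).
# Assumption: each tuple is (wording for Value 1, wording for Value 0).
VEHICLE_VALUES = [
    ("bench seats", "bucket seats"),                         # A
    ("nonradial tires", "radial tires"),                     # B
    ("convertible", "nonconvertible"),                       # C
    ("license plate in front", "license plate in back"),     # D
    ("uses diesel", "uses gas"),                             # E
    ("white", "green"),                                      # F
    ("two door", "four door"),                               # G
    ("power windows", "manual windows"),                     # H
]


def describe(pattern):
    """Turn an eight-symbol pattern such as '111--11-' into card wording."""
    parts = []
    for symbol, (value_one, value_zero) in zip(pattern, VEHICLE_VALUES):
        if symbol == "1":
            parts.append(value_one)
        elif symbol == "0":
            parts.append(value_zero)
        # A dash means the card gives no information about this attribute.
    return ", ".join(parts)


print(describe("111--11-"))
```

Under the stated assumption, the pattern for the first exemplar of Category 1 yields the description quoted in the text.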
Results
The grouping created by each subject was classified as family resemblance, unidimensional, or other. There was only one way to group the 20 exemplars into a family resemblance sort (the structure indicated in Table 1). The only way to create a unidimensional sort was to group exemplars based on values on Attribute G, because Attribute G is the only dimension about which information is provided for all 20 exemplars. Also, for each subject, the number of deviations from a family resemblance sort was calculated by determining how many exemplars out of 20 would have to be reclassified into the other category to form the family resemblance sort. The maximum deviation score was 10.

The classification of subjects' categories and the average deviation from a family resemblance sort for each condition are presented on the left side of Table 3. Because there were no differences between the animal, vehicle, and building stimuli, results are collapsed across stimulus set. Consistent with past research, few subjects in the baseline condition created family resemblance categories. The percentage (19%) is comparable to that found in past research with the same category structure (Spalding & Murphy, in press).

The following analyses will compare the two conditions in which the subjects answered questions about the stimuli prior to category construction. Induction subjects were three times more likely to produce family resemblance sorts than were frequency subjects (Fisher's exact test, p < .01). Average deviation from the family resemblance sort was 2.83 lower for induction subjects [F(1,46) = 4.76, MSe = 20, p < .05]. Answering the frequency questions led to virtually as many family resemblance sorts as did answering no questions (17% vs. 19%). When the category structure was not recovered, the subjects were most likely to sort unidimensionally. This demonstrates the strong bias toward unidimensional sorting in the task, given that only one of the eight dimensions permitted sorting of all the stimuli.
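
The deviation score and the comparison of sort frequencies described above can be sketched in a few lines of Python. The sketch rests on assumptions that are not in the text: the cell counts (13 of 24 induction subjects and 4 of 24 frequency subjects producing family resemblance sorts) are reconstructed from the rounded percentages in Table 3, the one-tailed form of Fisher's exact test is an assumption about the analysis, and the function names are hypothetical.

```python
from math import comb


def deviation_score(sort, target):
    """Cards that must change piles to turn `sort` into `target`.

    Pile labels are arbitrary, so the count is taken under whichever
    labeling of the two piles requires fewer moves. With 20 cards the
    score therefore cannot exceed 10.
    """
    mismatches = sum(s != t for s, t in zip(sort, target))
    return min(mismatches, len(target) - mismatches)


def fisher_one_tailed(a, b, c, d):
    """P(top-left cell >= a) for the table [[a, b], [c, d]], margins fixed."""
    row1, col1, n = a + b, a + c, a + b + c + d
    tail = sum(
        comb(col1, k) * comb(n - col1, row1 - k)
        for k in range(a, min(row1, col1) + 1)
    )
    return tail / comb(n, row1)


# Family resemblance sort: Exemplars 1-10 in one pile, 11-20 in the other.
family_resemblance = [1] * 10 + [2] * 10

# Hypothetical subject who misplaces three cards.
subject_sort = [1] * 7 + [2] * 3 + [2] * 10
print(deviation_score(subject_sort, family_resemblance))

# Rows: induction, frequency. Columns: family resemblance sort, any other sort.
# Counts are reconstructed from rounded percentages (an assumption).
print(fisher_one_tailed(13, 11, 4, 20))
```

Under these assumptions the one-tailed probability falls below .01, in line with the reported threshold; a two-tailed test on the same counts would give a larger value.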

Table 2
Abstract Structure of Induction and Frequency Questions Used in Experiments 1 and 2

Question                                          Correct Answer
Induction
  If an X has A1, what kind of C does it have?    C1
  If an X has C1, what kind of B does it have?    B1
  If an X has E0, what kind of A does it have?    A0
  If an X has D0, what kind of E does it have?    E0
  If an X has F1, what kind of G does it have?    G1 or G0
Frequency
  How many Xs have A1?                            6
  How many Xs have C1?                            5
  How many Xs have B1?                            7
  How many Xs have E0?                            6
  How many Xs have A0?                            6
  How many Xs have D0?                            6
  How many Xs have F1?                            8
  How many Xs have G1?                            10

Note. X refers to the category name vehicle, building, or animal. The letters A-G refer to the stimulus dimensions, as used in Table 1.

Table 3
Sorting as a Function of Question Condition for Experiments 1 and 2

                                              Experiment 1                      Experiment 2
Type of Sort                                  Induction  Frequency  None        Induction  Frequency
Family resemblance sort (%)                   54         17         19          54         15
Unidimensional sort (%)                       21         50         52          38         69
Other sort (%)                                25         33         29           8         15
Mean deviation from family resemblance sort   4.17       7.00       7.52        4.46       8.15

EXPERIMENT 2
In this experiment, the verbal descriptions used in Experiment 1 were replaced with pictures of novel bugs to
test the generality of results across stimulus type. Because the frequency condition is a better control condition for the induction condition, and since no difference was found between the baseline and frequency conditions in Experiment 1, the baseline condition was dropped. Otherwise, Experiment 2 was essentially identical to Experiment 1.
Method
Subjects. Twenty-six University of Illinois students participated in this experiment for credit in an introductory psychology course. Half were assigned to the induction condition and half to the frequency condition.
Materials. The eight attribute-value pairs used to construct the bug stimuli were six or eight legs, straight or curved tail, large or small wings, striped or angled pattern on body, eyes that do or do not protrude, long or short head, vertical or horizontal stripes on the thorax, and antennae that point in or point out. The same abstract stimulus structure (see Table 1) and the same induction and frequency questions used in Experiment 1 (see Table 2) were used in Experiment 2.
Procedure. The subjects answered the induction and frequency questions and performed the sorting task in the same manner described for Experiment 1, with a few changes necessitated by the use of pictorial stimuli. In Experiment 2, before previewing the stimulus cards, the subjects were shown a diagram illustrating all of the attributes that composed the bug stimuli. This was done to ensure that the subjects understood the terms used in the induction and frequency questions. Attribute values in the diagram were different from the two attribute values used in the 20 stimuli. For example, bugs in the stimulus set had either six or eight legs; the bug in the attribute diagram had four. Also, the induction and frequency questions were presented on cards containing illustrations of the attribute values relevant to each question. For example, one induction question was "If a bug has six legs, what kind of wings does it have?" This question was printed on an index card along with a picture of the set of six legs used in the bug stimuli and pictures of the two kinds of wings, small and large, used in the bug stimuli. The subject was then able to respond by pointing to one of the pictures (e.g., small or large wings).
This procedure eliminated the necessity of labeling the attribute values. The frequency questions were presented in an analogous manner, with a picture of the attribute value in question. The subjects responded to frequency questions, as before, with a number.
Results
Sorts were classified and deviations from the family resemblance categories were calculated in the same manner as for Experiment 1. As shown in Table 3, induction subjects were much more likely to construct family resemblance categories (Fisher's exact test, p < .05). Average deviation from family resemblance categorization was smaller for induction subjects [F(1,24) = 4.52, MSe = 20, p < .05]. These results are quite similar to those of Experiment 1. Again, subjects who did not recover the family resemblance categories generally created a unidimensional sort.

DISCUSSION
Although single-criterion concepts are less informative than are family resemblance categories, in category-construction tasks subjects typically group examples into categories based on a single dimension. The experiments presented here test the hypothesis that the manner in which examples are used affects the kind of categories that will be formed from those examples. Specifically, we predicted that if interproperty relationships were made salient, subjects would be more likely to form family resemblance categories. This prediction was supported in both experiments. When the subjects were asked to perform inductive inferences on the stimuli, the majority produced family resemblance categories; subjects who did nothing with the stimuli or who answered frequency questions about single attributes very seldom produced family resemblance categories. This result was obtained for both verbal and pictorial stimuli.

Note that the frequency and induction questions mentioned the same eight attributes. Furthermore, because subjects in the frequency condition answered more questions, they went through the stimulus cards more often and spent more time examining them. It is striking, then, that they seldom recovered the category structure, in contrast to the group that answered inductive questions.

The degree of family resemblance sorting obtained in our experiments is in sharp contrast to the rarity with which family resemblance sorting has been obtained in other studies of category construction. Medin et al. (1987) found that even with instructions to use all properties of the stimuli in constructing categories, with the addition of exemplar-specific information designed to encourage individual exemplar identification, and with instructions to think about how individual properties are related to some more abstract property, subjects almost never created family resemblance categories.

The induction judgments used here are not the only way in which family resemblance structure may be identified. If relations among features are made salient by explicit mention (e.g., Ahn & Medin, 1992) or are derived from conceptual knowledge (Spalding & Murphy, in press), family resemblance sorting also occurs. In general, any task that requires interaction with exemplars in a way that draws subjects' attention to feature relations is likely to result in the use of multiple features in creating categories. Spalding and Murphy (in press) found that just studying the exemplars was not sufficient to produce family resemblance categories, and our experiments show that focusing on individual features is little help. More generally, the kind of interaction one has with examples may determine the kind of categories one forms (Ross, in press), and which form of interaction helps most may depend on the category structure (see Spalding & Murphy, in press; Wattenmaker, 1995; Wattenmaker, Dewey, Murphy, & Medin, 1986; and below).

One possible concern about the present experiments is that the category structure is not the same as that used in some past work on category construction by Ahn, Medin, and colleagues (e.g., Ahn, 1992; Medin et al., 1987; though the structure is the same as that used by Spalding & Murphy, in press). In the category structure used by Ahn, Medin, and colleagues, each category was represented by a prototype
(which included all the typical features) plus examples that had one atypical feature and the rest typical features. Using the terminology in which 1s correspond to features typical of Category 1 and 0s correspond to features typical of Category 2, we can describe their stimuli as having the following structure (using only four stimulus dimensions): 1111, 0111, 1011, 1101, 1110 for Category 1, and 0000, 1000, 0100, 0010, 0001 for Category 2. We call this the characteristic feature design. There were two reasons that we did not use this structure in the present experiments. First, in the characteristic feature design, the family resemblance categories and the unidimensional categories are quite similar; they differ only in one exemplar per category. For example, if subjects sorted on the first dimension, they would create categories that were identical to the family resemblance categories except for two exemplars (0111 and 1000 would be put into the "wrong" categories). In the present structure, the unidimensional category was orthogonal to the family resemblance categories, allowing a clearer separation in the experiment. (Related to this, the characteristic feature design allows four ways to sort unidimensionally, one for each dimension, but only one way to sort into family resemblance categories. The present design allowed only one perfect way of creating either category type.) Second, with the characteristic feature structure, there is less feature correlation than in the present categories; that is, the Medin et al. (1987) categories have less inductive regularity than do the categories used here. This is because of the atypical features that appear in each category. Dimension 1 is not very strongly correlated with Dimension 2, because each category has an exemplar with the "wrong" value on Dimension 1 and an exemplar with the "wrong" value on Dimension 2.
In the structure shown in Table 1, on the other hand, relevant features are positively correlated with each other (see Note 1), but not with the irrelevant features. Subjects who were asked induction questions were able to notice the correlational structure and form categories based on the relevant features. If the inductive basis for the categories had been weaker, it is likely that asking the induction questions would not have caused the subjects to create more family resemblance categories.

In conclusion, our results have shown that subjects are more likely to form family resemblance categories when their interaction with the exemplars encourages them to notice the relations between properties. These relations are the basis for inductive inferences, which are thought to be a critical function for category use (Markman, 1989; Murphy & Ross, 1994). However, such inductions are not only a consequence of family resemblance structure; our results show that the attempt to draw inductions may be one factor that causes people to create such conceptual structures. This result helps to explain the long-standing puzzle of why the usual category formation experiment produces categories that are so different from natural categories.
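
The two properties of the characteristic feature design noted in the Discussion can be checked with a short Python sketch. It uses the four-dimension stimuli listed there; the choice of the phi coefficient as the measure of feature correlation and the function names are assumptions made for illustration, not part of the original analysis.

```python
from math import sqrt

# Characteristic feature design, four dimensions, as listed in the Discussion.
CATEGORY_1 = ["1111", "0111", "1011", "1101", "1110"]
CATEGORY_2 = ["0000", "1000", "0100", "0010", "0001"]


def misplaced_by_dimension(dim):
    """Exemplars put in the 'wrong' pile when sorting on one dimension only."""
    wrong_in_1 = sum(item[dim] != "1" for item in CATEGORY_1)
    wrong_in_2 = sum(item[dim] != "0" for item in CATEGORY_2)
    return wrong_in_1 + wrong_in_2


def phi(dim_x, dim_y):
    """Phi coefficient between two binary dimensions across all ten exemplars."""
    items = CATEGORY_1 + CATEGORY_2
    n11 = sum(i[dim_x] == "1" and i[dim_y] == "1" for i in items)
    n10 = sum(i[dim_x] == "1" and i[dim_y] == "0" for i in items)
    n01 = sum(i[dim_x] == "0" and i[dim_y] == "1" for i in items)
    n00 = sum(i[dim_x] == "0" and i[dim_y] == "0" for i in items)
    denominator = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denominator


for dim in range(4):
    print("Dimension", dim + 1, "misplaces", misplaced_by_dimension(dim))
print("phi(Dimension 1, Dimension 2) =", phi(0, 1))
```

Each of the four unidimensional sorts misplaces exactly two exemplars, and the phi coefficient between Dimension 1 and Dimension 2 is 0.2, which illustrates the weak feature correlation described for this design.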
REFERENCES
AHN, W. (1990). Effect of background knowledge on creation of family resemblance categories. In Proceedings of the 12th Conference of the Cognitive Science Society (pp. 149-156). Hillsdale, NJ: Erlbaum.
AHN, W., & MEDIN, D. L. (1992). A two-stage model of category construction. Cognitive Science, 16, 81-121.
HINTZMAN, D. L. (1986). "Schema abstraction" in a multiple-trace memory model. Psychological Review, 93, 411-428.
KEIL, F. C. (1989). Concepts, kinds, and cognitive development. Cambridge, MA: MIT Press.
MARKMAN, E. M. (1989). Categorization and naming in children: Problems of induction. Cambridge, MA: MIT Press.
MEDIN, D. L., & SCHAFFER, M. M. (1978). Context theory of classification learning. Psychological Review, 85, 207-238.
MEDIN, D. L., WATTENMAKER, W. D., & HAMPSON, S. E. (1987). Family resemblance, conceptual cohesiveness, and category construction. Cognitive Psychology, 19, 242-279.
MURPHY, G. L., & ROSS, B. H. (1994). Predictions from uncertain categorizations. Cognitive Psychology, 27, 148-193.
REGEHR, G., & BROOKS, L. R. (1995). Category organization in free recall: The organizing effect of an array of stimuli. Journal of Experimental Psychology: Learning, Memory, & Cognition, 21, 347-363.
ROSCH, E., & MERVIS, C. B. (1975). Family resemblance: Studies in the internal structure of categories. Cognitive Psychology, 7, 573-605.
ROSCH, E., MERVIS, C. B., GRAY, W. D., JOHNSON, D. M., & BOYES-BRAEM, P. (1976). Basic objects in natural categories. Cognitive Psychology, 8, 382-439.
ROSS, B. H. (in press). Category representations and the effects of interacting with instances. Journal of Experimental Psychology: Learning, Memory, & Cognition.
SMITH, E. E., & MEDIN, D. L. (1981). Categories and concepts. Cambridge, MA: Harvard University Press.
SPALDING, T. L., & MURPHY, G. L. (in press). Effects of background knowledge on category construction. Journal of Experimental Psychology: Learning, Memory, & Cognition.
WATTENMAKER, W. D. (1995). Knowledge structures and linear separability: Integrating information in object and social categories. Cognitive Psychology, 28, 274-328.
WATTENMAKER, W. D., DEWEY, G. I., MURPHY, T. D., & MEDIN, D. L. (1986). Linear separability and concept learning: Context, relational properties, and concept naturalness. Cognitive Psychology, 18, 158-194.
NOTE
1. Although the relevant features occurred only in the "correct" category, they were not perfectly correlated. For example, subjects asked about the relationship between Dimension C and the A1 feature would find considerable diversity among the examples. As Table 1 shows, Value C1 would be the correct answer. However, within Category 1, there are only three items that include both A1 and C1; three items include A1 but not C1; two items include C1 but not A1; and two include neither feature. Thus, it is not the case that asking the induction question alerted the subjects to a perfectly predictable relationship. Nonetheless, the subjects might have noticed that A1 occurred sometimes with C1 but not with C0.

(Manuscript received February 14, 1995; revision accepted for publication June 16, 1995.)