Within-category feature correlations and Bayesian adjustment strategies

Psychonomic Bulletin & Review 2006, 13(2), 245-250

L. ELIZABETH CRAWFORD
University of Richmond, Richmond, Virginia
and
JANELLEN HUTTENLOCHER and LARRY V. HEDGES
University of Chicago, Chicago, Illinois

To the extent that categories inform judgments about items, the accuracy with which categories capture the statistical structure of experience should affect judgment accuracy. The authors argue that representations of feature correlations can serve as Bayesian priors, increasing the accuracy of stimulus estimates by decreasing variability. Participants viewed a series of objects that varied on two dimensions that were either uncorrelated or correlated. They estimated each item by manipulating a response object to make it match the presented stimulus. Subsequent classification and feature-inference tasks indicated that the correlation was detected. The pattern of variability in recollections of stimuli suggested that the feature correlation informed estimates as predicted by a Bayesian model of category effects on memory.

Through incidental experience, people come to represent some of the statistical structure of their environments (Crawford & Cacioppo, 2002; Knowlton & Squire, 1993; Lewicki, Hill, & Czyzewska, 1997; Saffran, Aslin, & Newport, 1996). In particular, they detect correlations between features and form categories that capture information about those correlations (Anderson & Fincham, 1996; Thomas, 1998). Representing such relationships supports adaptive behavior by allowing people to make accurate predictions and adjust their behavior. It is not known whether representations of correlation inform other cognitive processes, such as estimating stimuli from memory. We posit that people may use a rational Bayesian strategy to adjust estimates of a particular stimulus with prior knowledge about feature correlations. Bayes’s theorem provides a method for combining a prior distribution with a sampling distribution to produce a posterior distribution that provides a maximally accurate estimate. The category adjustment model (Huttenlocher, Hedges, & Vevea, 2000) posits that people use such a Bayesian process when they estimate stimuli from memory. The model holds that when a stimulus is encountered, it is coded at two levels of detail: a categorical level and a particular level. Category-level knowledge operates as a prior distribution and the particular-level knowledge as a sampling distribution, and these sources of information

This work was supported by NIMH Grant 1 F31 MH12072-01A1 to the first author. We thank Sean Duffy and Jessica Choplin for helpful comments on an earlier draft. Correspondence concerning this article should be addressed to L. E. Crawford, Department of Psychology, University of Richmond, Richmond, VA 23173 (e-mail: [email protected]).

are weighted and combined when participants estimate the features of the stimulus from memory. This is a rational strategy, because although it introduces bias in estimates, it also reduces variability, producing a lower mean squared error than would be obtained had the category information not been used.

If people represent feature correlations within categories and use those categories to adjust estimates, then the correlations should affect estimates. In the next section, the category adjustment model is extended to make specific predictions about the pattern of variability in estimates that would result if information about feature correlations were used in estimation. The present study tests these predictions.

Using Correlations to Adjust Estimates

For a category that encompasses a given range of stimulus values, the less dispersed the instances within the category, the more information the category provides about the value of any particular stimulus. Thus low-dispersion categories should be given more weight in estimation than should categories of the same range but with high dispersion (Huttenlocher et al., 2000). According to the category adjustment model, when a category is given more weight, estimates of a particular category member will be less variable. That is, the dispersion of instances in a category is monotonically related to the variability of the estimates of particular category members.

Huttenlocher et al. (2000) examined the effects of within-category dispersion on estimation in a serial reproduction task. Participants viewed a set of items that varied in one dimension, such as size or shade. After each stimulus was shown, the participants estimated its value


Copyright 2006 Psychonomic Society, Inc.


on that dimension by adjusting a response object to make it match the to-be-remembered stimulus. Within each set, the variability of items was manipulated. Estimates of comparable individual items were more variable when those items were embedded in a high-variability distribution than when they were embedded in a low-variability distribution. These findings indicate that people represent the variability of unidimensional distributions and use that information to inform their estimates of individual stimuli. Here we extend this Bayesian model to two-dimensional categories. The key mathematical observation that makes this generalization possible is that, if the dimensions of the sampling distribution and the prior distributions are independent, then the two-dimensional (bivariate) Bayesian estimation model is simply the same as the Bayesian model for both dimensions considered separately. For example, consider two categories formed to represent sets of objects that vary in size and shade, as in Figure 1. In Category A, items vary widely in the size dimension, but vary relatively little in shade. Category B has the opposite structure. If people use this distributional information to adjust estimates, then when an object belongs to Category A, estimates of its size should be more variable and estimates of its shade less variable than when the same object belongs to Category B. The same logic applies when the dimensions are correlated, with one addition. To see how the correlation will affect estimates, note that if stimuli follow a bivariate normal distribution with correlated dimensions (as in Figure 2), a rotation of the coordinate system along the major and minor axes of the probability density always results in a bivariate normal distribution with uncorrelated dimensions (see, e.g., Johnson & Kotz, 1972, p. 40). Thus, the case of two correlated features can be simplified by imposing two new dimensions along which the set is uncorrelated. 
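The rotation argument can be illustrated with a short numerical sketch. It assumes a bivariate normal category (the prior) and independent, equal-variance normal memory noise on the two dimensions; the function names, the unit category variances, and the noise variance of 0.5 are hypothetical choices for illustration, not values taken from the study. Under these assumptions the estimate of a stimulus is the category mean plus a weighted version of the noisy memory trace, so the spread of repeated estimates of one fixed stimulus can be written down directly.

```python
import numpy as np

def category(rho, sd=1.0):
    """Covariance matrix of a category with equal SDs and correlation rho."""
    return np.array([[sd**2, rho * sd**2],
                     [rho * sd**2, sd**2]])

def estimate_covariance(prior_cov, memory_var):
    """Covariance of Bayesian estimates of ONE fixed stimulus.

    Assumption: estimate = mean + W (trace - mean), where
    trace = stimulus + noise and W = prior (prior + noise)^-1.
    """
    noise_cov = memory_var * np.eye(2)
    weight = prior_cov @ np.linalg.inv(prior_cov + noise_cov)
    return weight @ noise_cov @ weight.T

def correlation(cov):
    """Correlation coefficient implied by a 2 x 2 covariance matrix."""
    return cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])

pos = estimate_covariance(category(0.67), memory_var=0.5)
neg = estimate_covariance(category(-0.67), memory_var=0.5)
unc = estimate_covariance(category(0.0), memory_var=0.5)

# Rotating onto the principal axes removes the correlation in the category.
eigvals, eigvecs = np.linalg.eigh(category(0.67))
rotated = eigvecs.T @ category(0.67) @ eigvecs

# Spread of the estimates along the two diagonals of the stimulus space.
pos_diag = np.array([1.0, 1.0]) / np.sqrt(2.0)
neg_diag = np.array([1.0, -1.0]) / np.sqrt(2.0)
var_along_pos = pos_diag @ pos @ pos_diag
var_along_neg = neg_diag @ pos @ neg_diag

print(correlation(pos), correlation(neg), correlation(unc))
print(var_along_pos, var_along_neg)
```

In this sketch the estimates of a single stimulus inherit the sign of the category correlation, and for the positively correlated category they spread more along the positive diagonal than along the negative one.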
In terms of the categories shown in Figure 2, these new dimensions are the major and minor axes of the elliptical probability density, which are also the positive and negative diagonals of the stimulus space. If feature correlations are used as


Bayesian priors in estimation, the model predicts that estimates of a particular stimulus will show greater variability along the positive dimension and less variability along the negative dimension when that stimulus is embedded in a positive distribution, and the opposite pattern when it is embedded in a negative distribution. In other words, the estimates of a particular stimulus value will reflect the correlation present in the distribution as a whole (see Note 1).

As noted above, previous work has shown that people incorporate information about a distribution’s central tendency and variability when estimating unidimensional stimuli, but it is not known whether they capitalize on correlational information. In keeping with that earlier work, the present participants viewed and reproduced a series of stimuli. In this case, the stimuli varied in both size and shade, and participants estimated each one by adjusting the size and shade of a response item to make it match the to-be-remembered stimulus. The association between dimensions was manipulated between subjects. The correlation was never mentioned; it was to be learned incidentally through experience with individual stimuli.

Markman and Ross (2003) argued that categorization research should address the variety of category uses and should not assume by default that representations formed under one set of learning conditions will serve all category functions. By focusing exclusively on stimulus estimation, previous studies on the category adjustment model have not considered whether the categories that influence estimation serve other functions. The present study included the estimation task along with two additional tasks: a classification task and a feature inference task, both of which have been shown to be sensitive to within-category correlations (Anderson & Fincham, 1996; Medin, Altom, Edelson, & Freko, 1982; Thomas, 1998).
Including these tasks was important, because if we were to find that estimates were unaffected by the correlation, these additional tasks would allow us to interpret the null result. Specifically, if performance on all of the tasks was unaffected by the correlation, this would suggest that participants failed to detect the correlation. However, if performance on the

Figure 1 (panels A and B, each plotting Shade against Size). Category A has high variability in size, low variability in shade, and no correlation between the two dimensions. Category B has low variability in size, high variability in shade, and also no correlation between the two dimensions.

Figure 2 (panels A and B, each plotting Shade against Size). In Category A, shade and size are positively correlated. In Category B, they are negatively correlated.

estimation task was not affected by the correlation but performance on the other tasks was affected, this would be direct evidence against our prediction that participants would use the correlation to adjust stimulus estimates.

METHOD

Participants
Sixty members of the University of Chicago community were paid $9.

Materials
The stimuli were images of gray fish varying in fatness and shade. There were 49 unique fish, with seven fatnesses ranging from 70 to 124 pixels in 9-pixel increments and seven shades ranging from 24 (dark) to 78 (light) photo units in increments of 9 photo units (see Note 2). The same 49 fish were used to form three frequency distributions (included in Appendix A, Table A1). The distribution in which fatter fish tend to be lighter in shade is arbitrarily labeled “positive” and the distribution with the opposite correlation is arbitrarily labeled “negative.” For each, the Pearson correlation coefficient between size and shade was .67. In addition, there was a uniform distribution in which shade and size were unassociated.

Procedure
Participants were randomly assigned to one of the three frequency distributions (negative, positive, or uncorrelated). On each trial, a test fish, drawn randomly without replacement from the distribution, was shown for 1 sec in the center of the screen and removed. After a 1-sec delay, a response fish appeared slightly to the right and below the center of the screen. The participants used a Kensington TurboMouse trackball to adjust this fish (rolling it forward made the fish fatter, back made it skinnier, to the left made it lighter, and to the right made it darker). Once the response fish matched the test fish, participants pressed a button to register that response.

The second part of the experiment collected category and typicality judgments about a set of 29 fish (see Appendix B): 13 that had been shown in part one, and 16 that were fatter, skinnier, lighter, and/or darker than the fish that were presented in part one. On each trial, a fish appeared and participants pressed a key to indicate whether this fish was a member of the category of fish that they had seen. If they responded “yes,” they were asked to rate whether it was highly typical, somewhat typical, or atypical of the category by pressing 1, 2, or 3, respectively. All participants judged the same set of fish (each in a different random order), with each fish presented twice, for a total of 58 trials.

Finally, the participants completed an inference task in which they were asked to infer the value of one dimension given the value on the other. On each trial, a fish of the median shade appeared in one of the seven possible sizes. The participants rolled the trackball left to right to adjust the shade until it matched what they thought was the typical shade for a fish of that size (size was not adjustable). After inferring the shade three times for each of the seven possible fatnesses, the participants were then given fish of the median size and each possible shade (three times each) and told to infer the size for a fish of each shade by rolling the trackball forward and back.

RESULTS

Typicality and Inferences

We first examined the typicality rating and inference data to determine whether participants detected the presented correlation. For the typicality judgments, stimuli were divided into a positive-quadrant group (including fish that were both fatter and lighter than average or both skinnier and darker than average) and a negative-quadrant group (including fish that were both skinnier and lighter than average or both fatter and darker than average). Each response was converted to a typicality rating on a four-point scale (highly typical member = 4, somewhat typical member = 3, atypical but still a member = 2, and not a member = 1). For each participant, an average typicality rating was calculated for items in the positive-quadrant group and items in the negative-quadrant group. These ratings were submitted to a mixed model ANOVA with quadrant group (positive vs. negative) as the within-subjects factor and distribution as the between-subjects factor. There were no main effects of quadrant group or distribution (Fs < 1), and there was a significant interaction between quadrant group and distribution [F(2,57) = 15.17, p < .001; ηp² = .35]. Holm’s sequentially rejective Bonferroni procedure was used to test three planned nonorthogonal comparisons using two-tailed tests at α = .05. As would be expected if participants used the presented correlation to judge typicality, those who viewed the positive distribution judged items in the positive quadrants to be more typical than items in the negative quadrants [positive quadrant M = 2.97 (SE = .08), negative quadrant M = 2.64 (SE = .09); tH(57) = 4.41]. Similarly, those who viewed the negative distribution judged items in the negative quadrants to be more typical than items in the positive quadrants [negative quadrant M = 2.92 (SE = .03), positive quadrant M = 2.67 (SE = .04); tH(57) = 3.35]. Within the uncorrelated condition, there was no significant difference between judged typicality of positive quadrant items and negative quadrant items [positive quadrant M = 2.69 (SE = .09), negative quadrant M = 2.70 (SE = .06)]. The results indicate that those who experienced a correlation used it to classify items differentially.

The inference data were analyzed to determine whether participants used the presented correlation to infer a stimulus’ shade given its fatness. For each of the given fatness values, the average response shade was calculated. A regression of the average response shades on given fatness values in the negative condition produced a regression slope that was significantly negative [β = −.29; t(6) = −15.4, p < .01], whereas the regression slope from the positive condition was significantly positive [β = .62; t(6) = 6.7, p < .01]. As expected, the regression slope from the uncorrelated condition was in between the other two, but unexpectedly, it was also significantly positive [β = .49; t(6) = 5.0, p < .01]. The regression slope from the uncorrelated condition differed significantly from that from the negative condition [slope difference = .78; t(13) = 10.0, p < .01], but did not differ significantly from that from the positive condition [slope difference = .12; t(13) = .91, n.s.].

Comparable analyses were done for inferences about fatness on the basis of given shade values. A regression of mean fatness settings on given shade values showed that the slope from the negative condition was significantly negative [β = −.40; t(6) = 4.44, p < .01] and the slope from the positive condition was significantly positive [β = 1.0; t(6) = 16.42, p < .01]. Again, the regression slope in the uncorrelated condition was also significantly positive [β = .48; t(6) = 4.9, p < .01], and in this case, it differed significantly from the slope in the positive condition [slope difference = .52; t(13) = 4.47, p < .01] and from the slope in the negative condition [slope difference = .88; t(13) = 5.1, p < .01]. These results indicate that participants used the presented distribution to inform their inferences about missing features.

Estimation

The results from the inference and typicality tasks show that participants were sensitive to the presented correlations. The primary concern in the present study is whether the presented correlations also influenced estimates of individual stimuli. In analyzing the estimation data, it is necessary to consider that the stimuli being estimated were presented different numbers of times in the different conditions. For example, the lightest, fattest fish was shown once in the negative distribution condition, 4 times in the uncorrelated distribution condition, and 11 times in the positive distribution condition. Here we present two approaches to the problem of unequal cases. The first includes all estimates of a single, common stimulus value (the central value) and uses participants as the units

of analysis. The second randomly selects one estimate of each stimulus value for each participant and uses stimulus value as the unit of analysis.

If people use the presented correlation to adjust estimates, the pattern of variability in estimates of a given stimulus should reflect the distribution in which that stimulus was embedded. Using only responses to the central stimulus value (size = 97, shade = 51; see Appendix A), for each participant, we calculated the Pearson correlation coefficient of their size and shade estimates (based on 11 estimates per participant in the negative and positive conditions and 4 per participant in the uncorrelated condition) and submitted these coefficients to a one-way ANOVA. There was a main effect of distribution [F(2,57) = 4.22, p < .05; ηp² = .13]. Holm’s sequentially rejective Bonferroni tests revealed that the average correlation coefficient for participants in the positive condition (M = .20, SE = .09) was significantly different from that for those in the negative condition (M = −.21, SE = .07) [tH(55) = 2.93]. The average correlation coefficient from the uncorrelated condition fell in between (M = .03, SE = .13) and was not significantly different from the other two. This analysis indicates that estimates of the central stimulus value were significantly affected by the distribution in which that value was embedded. These results are consistent with the prediction that the presented correlation is used to adjust estimates of particular stimuli, thus influencing the variability in those estimates.

Given that the model’s predictions apply to all stimulus values, and not only the central value, we conducted a second analysis that included all of the 49 stimulus values. To equalize the frequency of the stimulus values, trials were randomly removed in order to leave each participant with a uniform distribution of 49 trials: one estimate per stimulus value. The removed trials were thus treated as filler trials that had established the correlation. The remaining estimates were combined across participants within a condition, providing 20 estimates (1 from each participant) per stimulus value within each condition. The Pearson correlation coefficient was calculated for each stimulus value in each condition. These coefficients were then submitted to a repeated measures ANOVA (matched by stimulus value), which yielded a significant main effect of distribution [F(2,96) = 9.85, p < .001; ηp² = .17]. Holm’s sequentially rejective Bonferroni test revealed that the average correlation coefficient from the negative condition (M = −.078, SE = .031) was significantly lower than the average correlation coefficient from the positive condition (M = .130, SE = .037) [tH(96) = 6.66]. The average correlation coefficient for stimuli for the uncorrelated condition (M = .020, SE = .032) fell in between and differed significantly from that for the positive condition [tH(96) = 3.54] and for the negative condition [tH(96) = 3.12]. These results indicate that across the range of stimuli, the distribution of estimates reflected the correlation presented in the stimulus distribution. Consistent with the predictions of the category effects model, the presented correlation was reflected in stimulus estimates.

These results reveal an overall bias to treat the sets as positively correlated. This bias should not be attributed to

the well-documented positivity bias in correlation perception, because here the correlations are labeled arbitrarily. This bias may stem from a perceptual bias to perceive brighter items as larger (Robinson, 1954). Alternatively, it may reflect a prior expectation that large items will be brighter (e.g., Fiedler, 2000). Whatever the source of this asymmetry, it operates in conjunction with participants’ sensitivity to distributional experience, because responses on all three measures were influenced by the presented correlation.

DISCUSSION

In the present experiment, those who gained experience with a feature correlation were influenced by that correlation when performing various cognitive tasks, including inferring missing features and classifying novel items. Of primary interest here are the results from the estimation task, which show that the learned distribution influenced the variability of stimulus estimates made from memory. Estimates of individual stimulus values reflected the shape of the distribution in which those values were embedded.

In earlier work, we developed a model of stimulus estimation according to which people adjust inexact representations of individual stimuli with information about the distribution in which they are embedded. The model posits that this adjustment is akin to Bayesian statistical inference given information about category distributions, and can increase the accuracy of responses by decreasing their variability. The present study has not tested directly the model’s claim that such adjustment makes estimates more accurate than when no adjustment is used. (To do so would require a comparison case in which participants estimate category members without learning anything about the category.) Instead, this study has examined whether estimates reveal the pattern of variability that would be expected if people were using a Bayesian strategy to improve accuracy.
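The accuracy claim of the model can be illustrated in one dimension with a small simulation. This is only a sketch under stated assumptions: stimuli are drawn from a normal category, memory traces add independent normal noise, and the category mean, category variance, and memory variance below are hypothetical values, not quantities estimated from participants. It is not the behavioral comparison described above, which would require participants who learned nothing about the category.

```python
import numpy as np

def adjusted_estimates(traces, category_mean, category_var, memory_var):
    """Weighted combination of a noisy memory trace and the category mean.

    The weight on the trace grows as the category becomes more dispersed
    relative to the memory noise (normal-normal Bayesian updating).
    """
    weight = category_var / (category_var + memory_var)
    return weight * traces + (1.0 - weight) * category_mean

rng = np.random.default_rng(0)

# Hypothetical values chosen only for illustration.
category_mean, category_var, memory_var = 50.0, 100.0, 64.0

stimuli = rng.normal(category_mean, np.sqrt(category_var), size=200_000)
traces = stimuli + rng.normal(0.0, np.sqrt(memory_var), size=stimuli.size)
estimates = adjusted_estimates(traces, category_mean, category_var, memory_var)

mse_raw = np.mean((traces - stimuli) ** 2)          # no category adjustment
mse_adjusted = np.mean((estimates - stimuli) ** 2)  # with category adjustment

# Closed-form mean squared error of the adjusted estimate under these assumptions.
expected_adjusted = category_var * memory_var / (category_var + memory_var)

print(mse_raw, mse_adjusted, expected_adjusted)
```

Each adjusted estimate is biased toward the category mean, yet across the whole category the mean squared error falls below that of the unadjusted traces, which is the sense in which the strategy is described as rational.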
Extending previous studies of the role of category boundaries, prototypes, and variability (Crawford, Huttenlocher, & Engebretson, 2000; Huttenlocher, Hedges, Corrigan, & Crawford, 2004; Huttenlocher et al., 2000), the present study is the first to have applied the category effects model to two-dimensional, inductive categories. It provides initial evidence that inexactly remembered stimuli are adjusted with information about feature correlations. By employing multiple measures, the present study relates estimation effects to other findings in the concept-learning literature. In our earlier work on Bayesian adjustment, category structure was inferred from patterns of bias and variability in estimates, and as a result it was not known whether the categories inferred from these studies served the other functions that categories are expected to serve. The present study suggests that the representation used to adjust stimulus estimates is also used to draw inferences about missing features and to judge the category membership and typicality of novel items. Earlier studies have examined within-category feature correlations in order to test competing models of


category or function learning (Anderson & Fincham, 1996; Ashby & Maddox, 1990; DeLosh, Busemeyer, & McDaniel, 1997; Thomas, 1998). In contrast, we make skeletal assumptions about the category that people would be expected to learn and we examine how this category influences estimates of particular stimuli. Although the results are consistent with the predictions of the category adjustment model, they cannot reveal the exact nature of the prior information that is being used to adjust stimuli. It may include information about all observed instances, or some recent subset of cases, or, at the extreme, only the most recent single case.

The work reported here extends studies of statistical learning. Numerous studies have shown that people are sensitive to subtle contingencies in their experience and that knowledge of these contingencies can occur outside of conscious awareness (Knowlton & Squire, 1993; Lewicki et al., 1997). This sensitivity is thought to be beneficial because it enables people to accurately predict and adapt to their environments. The significance of statistical learning for the accuracy of memory has received little attention. Given that memories are constructed from multiple sources of information, and that representations of statistical structure are ingredients in that construction, accurate statistical learning has important implications for the accuracy of memory.

REFERENCES

Anderson, J. R., & Fincham, J. M. (1996). Categorization and sensitivity to correlation. Journal of Experimental Psychology: Learning, Memory, & Cognition, 22, 259-277.
Ashby, F. G., & Maddox, W. T. (1990). Integrating information from separable psychological dimensions. Journal of Experimental Psychology: Human Perception & Performance, 16, 598-612.
Crawford, L. E., & Cacioppo, J. T. (2002). Learning where to look for danger: Integrating affective and spatial information. Psychological Science, 13, 449-453.
Crawford, L. E., Huttenlocher, J., & Engebretson, P. H. (2000). Category effects on estimates of stimuli: Perception or reconstruction? Psychological Science, 11, 280-284.
DeLosh, E. L., Busemeyer, J. R., & McDaniel, M. A. (1997). Extrapolation: The sine qua non for abstraction in function learning. Journal of Experimental Psychology: Learning, Memory, & Cognition, 23, 968-986.
Fiedler, K. (2000). Illusory correlations: A simple associative algorithm provides a convergent account of seemingly divergent paradigms. Review of General Psychology, 4, 25-58.
Huttenlocher, J., Hedges, L. V., Corrigan, B., & Crawford, L. E. (2004). Spatial categories and the estimation of location. Cognition, 93, 75-97.
Huttenlocher, J., Hedges, L. V., & Vevea, J. L. (2000). Why do categories affect stimulus judgment? Journal of Experimental Psychology: General, 129, 220-241.
Johnson, N. L., & Kotz, S. (1972). Continuous multivariate distributions. New York: Wiley.
Knowlton, B. J., & Squire, L. R. (1993). The learning of categories: Parallel brain systems for item memory and category knowledge. Science, 262, 1747-1749.
Lewicki, P., Hill, T., & Czyzewska, M. (1997). Hidden covariation detection: A fundamental and ubiquitous phenomenon. Journal of Experimental Psychology: Learning, Memory, & Cognition, 23, 221-228.
Markman, A. B., & Ross, B. H. (2003). Category use and category learning. Psychological Bulletin, 129, 592-613.
Medin, D. L., Altom, M. W., Edelson, S. M., & Freko, D. (1982). Correlated symptoms and simulated medical classification. Journal of Experimental Psychology: Learning, Memory, & Cognition, 8, 37-50.
Robinson, E. J. (1954). The influence of photometric brightness on judgments of size. American Journal of Psychology, 67, 464-474.
Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926-1928.
Thomas, R. D. (1998). Learning correlations in categorization tasks using large, ill-defined categories. Journal of Experimental Psychology: Learning, Memory, & Cognition, 24, 119-143.

NOTES 1. These predictions pertain to the variability of responses to particular stimulus values, not to the variability of the entire set of stimulus reproductions, which obviously will mirror the variability of the given distribution. 2. These photo units are a linear transformation of photometer readings taken directly from the computer screen and were set in pilot work so that the standard deviation of reproductions of shade and fatness would be comparable.

APPENDIX A
Frequency Distributions of Stimuli in the Estimation Task

There were three frequency distributions: positive, negative, and uncorrelated. Tables for the positive and negative correlation distributions are given below; in the uncorrelated condition, each stimulus was shown four times.

Table A1
Positive Correlation Frequency Distribution

Shade                       Size (in Pixels)
(in Photo Units)    70   79   88   97  106  115  124
78                   1    1    1    1    3    6   11
69                   1    1    1    3    6   11    6
60                   1    1    3    6   11    6    3
51                   1    3    6   11    6    3    1
42                   3    6   11    6    3    1    1
33                   6   11    6    3    1    1    1
24                  11    6    3    1    1    1    1

Note: For shade, 78 is light; 24 is dark.

Table A2
Negative Condition Frequency Distribution

Shade                       Size (in Pixels)
(in Photo Units)    70   79   88   97  106  115  124
78                  11    6    3    1    1    1    1
69                   6   11    6    3    1    1    1
60                   3    6   11    6    3    1    1
51                   1    3    6   11    6    3    1
42                   1    1    3    6   11    6    3
33                   1    1    1    3    6   11    6
24                   1    1    1    1    3    6   11

Note: For shade, 78 is light; 24 is dark.
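The correlation carried by each frequency distribution can be checked with a short sketch that treats the tabulated frequencies as weights on the 49 size and shade pairs. The array layout, the variable names, and the helper `weighted_pearson` are illustrative assumptions; the negative table is obtained by reversing the shade order of the positive table, which matches the two tables above. The coefficient it prints depends on the frequencies as transcribed here, so it should land near, though not necessarily exactly on, the reported magnitude.

```python
import numpy as np

SIZES = np.arange(70, 125, 9)    # 70, 79, ..., 124 pixels
SHADES = np.arange(24, 79, 9)    # 24 (dark), 33, ..., 78 (light) photo units

# Rows run from shade 24 up to shade 78; columns run from size 70 to 124.
positive = np.array([
    [11,  6,  3,  1,  1,  1,  1],   # shade 24
    [ 6, 11,  6,  3,  1,  1,  1],   # shade 33
    [ 3,  6, 11,  6,  3,  1,  1],   # shade 42
    [ 1,  3,  6, 11,  6,  3,  1],   # shade 51
    [ 1,  1,  3,  6, 11,  6,  3],   # shade 60
    [ 1,  1,  1,  3,  6, 11,  6],   # shade 69
    [ 1,  1,  1,  1,  3,  6, 11],   # shade 78
])
negative = positive[::-1]            # same frequencies with shade order reversed
uniform = np.full((7, 7), 4)         # uncorrelated condition: four showings each

def weighted_pearson(freq):
    """Pearson correlation of size and shade, weighting each cell by its frequency."""
    shade_grid, size_grid = np.meshgrid(SHADES, SIZES, indexing="ij")
    weights = freq / freq.sum()
    mean_size = (weights * size_grid).sum()
    mean_shade = (weights * shade_grid).sum()
    cov = (weights * (size_grid - mean_size) * (shade_grid - mean_shade)).sum()
    var_size = (weights * (size_grid - mean_size) ** 2).sum()
    var_shade = (weights * (shade_grid - mean_shade) ** 2).sum()
    return cov / np.sqrt(var_size * var_shade)

r_positive = weighted_pearson(positive)
r_negative = weighted_pearson(negative)
r_uniform = weighted_pearson(uniform)
print(r_positive, r_negative, r_uniform)
```

The same arrays also give the number of estimation trials per condition (the sum of each table) and the number of presentations of any single stimulus, such as the lightest, fattest fish in the last row and last column.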

APPENDIX B Distribution of Stimuli Shown in the Typicality Rating Task Size (in Pixels) Shade (in Photo Units) 52 61 70 79 88 97 106 115 96 1 1 87 1 1 78 1 1 69 1 1 1 60 1 51 1 1 42 1 33 1 1 1 24 1 1 15 1 1 6 1 1 Note—For shade, 96 is light; 6 is dark.

124

(Manuscript received March 1, 2005; revision accepted for publication July 18, 2005.)

133

142 1

1 1 1 1 1 1
