Journal of Experimental Psychology: Learning, Memory, and Cognition 1999, Vol. 25, No. 1, 23-40
Copyright 1999 by the American Psychological Association, Inc. 0278-7393/99/$3.00
Age, Testing at Preferred or Nonpreferred Times (Testing Optimality), and False Memory
M. J. Intons-Peterson
Indiana University Bloomington
Paola Rocchi
University of Padua
Tara West, Kimberly McLellan, and Amy Hackney
Indiana University Bloomington

Two experiments investigated whether age and testing at preferred (optimal) times of day or nonpreferred (nonoptimal) times affected the ability to select relevant from irrelevant but thematically related alternatives in a verbal false memory paradigm. A 3rd experiment pursued the same issues with a visual false memory paradigm. In all 3 experiments, younger adults (n = 195) correctly recalled studied items more often than older adults (n = 121), whereas the 2 age groups correctly recognized about the same numbers of previously studied items. In all 3 experiments, nonoptimally tested older adults had more difficulty excluding nonstudied but thematically related items than the other groups; thus, they showed the greatest evidence of false memory, although all groups did so to a significant extent. The results suggest that optimality and its circadian determinants need to be considered with some tasks for the elderly. Various models and mechanisms are discussed.
Evidence is mounting that people perform better when tested at their preferred times of day than at nonpreferred times, what we call an optimality-of-testing effect. This advantage of optimal testing (testing at preferred or peak times) over nonoptimal testing (testing at nonpreferred or nonpeak times) may be particularly pronounced for older adults who typically prefer the morning (Intons-Peterson, Rocchi, West, McLellan, & Hackney, 1998; May & Hasher, 1998; May, Hasher, & Stoltzfus, 1993). These circadian effects hold implications for the neuropsychology of aging as well as for the practical management of the cognitive activities of older adults. Circadian effects are not always found, however (May & Hasher, 1998), and a number of researchers have reported substantial individual differences
M. J. Intons-Peterson, Tara West, Kimberly McLellan, and Amy Hackney, Department of Psychology, Indiana University Bloomington; Paola Rocchi, Department of Psychology, University of Padua, Padua, Italy. Tara West is now at the Department of Psychology, State University of New York at Stony Brook; Kimberly McLellan is now in Cincinnati, Ohio; Amy Hackney is now at Washington University. We thank Lloyd R. Peterson for programming the computers and the following people for their assistance in conducting the research: Wendy Bergida, Rachel Fawcett, Christy McGowan, Deanna Mercurio, Brandon Rieber, Darren Schmidt, Courtney Scott, and Jennifer Snow. Meadowood, the Older American Center, and the Indiana University Annuitants Association helped us contact participants and sometimes provided us with research space. Fergus Craik, David Payne, and an anonymous reviewer provided excellent, detailed reviews. We are grateful to all of these individuals. Correspondence concerning this article should be addressed to M. J. Intons-Peterson, Department of Psychology, Indiana University, Bloomington, Indiana 47405. Electronic mail may be sent to
[email protected].
(e.g., Anderson, Petros, Beckwith, Mitchell, & Fritz, 1991; Bodenhausen, 1990; Horne, Brass, & Pettitt, 1980; Petros, Beckwith, & Anderson, 1990), so it is important to identify tasks that reflect circadian influence. In general, circadian effects seem most obvious when one response must be selected from among other potential responses, as in tasks such as sentence recognition, a stop-signal paradigm, Stroop color-naming, instruction-directed complicated line drawings (May & Hasher, 1998), and negative priming (Intons-Peterson et al., 1998). Consider some tasks that yield time-of-day effects. When tested in the morning, their preferred time of day, older adults recognized sentences slightly but nonsignificantly more accurately than did young adults, who typically prefer times late in the day, whereas when tested late in the day, young adults correctly recognized significantly more sentences than did older adults (May et al., 1993). In this sentence-recognition task, the lures were similar to the correct sentence, so the participants presumably had to suppress tendencies to respond to the lures in favor of the correct sentences if they were to respond correctly. Similarly, when substitute endings were supplied after younger and older adults had responded to high or medium cloze sentences, both age groups were more likely to suppress the no-longer-relevant endings when tested at optimal times than when tested nonoptimally (May & Hasher, 1998). In another task requiring inhibition of momentarily inappropriate responses, a stop-signal task, the two age groups had approximately equivalent stopping probabilities when tested in the morning, but the younger group improved over the day, whereas the older group declined (May & Hasher, 1998). The same investigators observed that a Stroop effect increased over the day for older adults but not for younger
INTONS-PETERSON ET AL.
ones. Another task that pitted currently relevant items against currently irrelevant items, negative priming, also showed age and optimality-of-testing effects. When tested optimally, both older and younger adults showed negative priming, and both age groups also showed less or no negative priming when tested nonoptimally (Intons-Peterson et al., 1998). Thus, age and optimality effects emerged in these memory tasks that involve competition among alternative responses. The current research focuses on such tasks, namely recall and recognition in a false memory paradigm. These age-sensitive tasks share other characteristics. They activate the frontal lobes (Kramer, Humphrey, Larish, Logan, & Strayer, 1994; May & Hasher, 1998; McGaugh, 1995), cerebral areas that are vulnerable to the effects of aging (Gur, Gur, Obrist, Skolnick, & Reivitch, 1987; Schacter, Koutstaal, & Norman, 1997; Schacter, Norman, & Koutstaal, 1998; Schacter, Savage, Alpert, Rauch, & Albert, 1996; Shaw et al., 1984; Warren, Butler, Katholi, & Halsey, 1985; but see Perfect & Dasgupta, 1997). For example, in work using positron emission tomography (PET), hippocampal activation during the encoding of episodic information was lower in older adults than in younger adults (Grady et al., 1995). Furthermore, when retrieval was made effortful by decreasing the frequency of both exposure and semantic processing during encoding, Schacter et al. (1996) found that older adults showed more posterior frontal lobe activation than did younger adults, whereas younger adults showed more bilateral blood flow increases in the anterior prefrontal cortex than did older adults. Both age groups yielded increased hippocampal blood flows when they successfully recollected a studied word. It is reasonable to assume that the attentional efficacy afforded by optimal testing will facilitate neurological processes. 
Metabolic-physiological functioning clearly reflects responses to the age-related changes in circadian rhythms associated with optimal testing. For example, peak temperature moves earlier in the day as people age (Anderson et al., 1991; sWikova & Bronis, 1989; Tune, 1969; Webb, 1982). Metabolic fluctuations influence states of attentiveness, alertness, or readiness that may be manifested psychologically as time-of-day preferences. Working backward, time-of-day preferences may signal differential states of attentiveness in people of different ages so that those tested optimally would perform more effectively than those tested nonoptimally, when age is held constant, although younger adults would generally be expected to outperform older adults on the basis of a general slowing with age (e.g., Salthouse, 1985, 1991, 1992). Note that a general model of slowing with age does not, itself, account for the sparing of mental activities, such as the extraction of gist and strategic planning (Shimamura, Berry, Mangels, Rusting, & Jurica, 1995), nor has such a claim been made, to our knowledge. Extant psychological models have not been designed to explain the differential sensitivity of situations to optimality and cannot be expected to do so. Nevertheless, some models incorporate promising mechanisms. These approaches focus on age effects in competitive tasks. The mechanisms include (a) inhibitory processes, which are assumed to become less effective with age (e.g., Allport, Tipper, & Chmiel, 1985;
Kane, Hasher, Stoltzfus, Zacks, & Connelly, 1994; Navon, 1989a, 1989b; Neumann, 1987; Stadler & Hogan, 1996); (b) constructive processes of memory that produce distortions through extensive dependence on gist information and confusion of the origins of the items, two mechanisms posited to increase with age (Schacter et al., 1998; also see Brainerd & Reyna's, 1990, fuzzy trace model; Brainerd, Reyna, Howe, & Kingma, 1990; Reyna & Brainerd, 1995, and Estes's, 1997, dual-trace model); and (c) differential reliance on self-initiated and environmental memory search cues (Craik, 1994; Craik & Jennings, 1992). These mechanisms are assumed to affect performance in competitive tasks, such as the following. Suppose younger and older adults hear lists of words and are then asked to recall only presented words. Presumably, the list words tend to elicit potentially intrusive, thematically related associates, even though these words are not actually presented to the participants and are not to be recalled. An inhibitory deficiency model posits that older adults will recollect more nonpresented words than will younger adults because of their less efficient inhibition of nonrelevant items. The constructive memory framework hypothesizes that older people, more than younger adults, retrieve gist or associates thematically related to the list items or confuse the origins of actual perceptions versus imagined or other self-generated items. Consequently, they recollect more nonpresented associates and probably at least equal numbers of presented words (see Schacter et al., 1998, p. 295). The differential memory search cue approach proposes that because dependence on the environmental cues afforded by recognition tasks increases with age and use of recall-dominant, self-initiated cues decreases with age, older adults will recall fewer items than will younger adults, but older adults will recognize about the same number of items as will younger adults.
As already noted, these models do not address optimality of testing. To date, this is true even when neuropsychological evidence is incorporated, as has been done by May and Hasher (1998) and by Schacter (for a summary, see Schacter et al., 1998). We can speculate that an inhibitory deficiency model might propose that nonoptimal processing activates lower levels of cerebral or metabolic processes, which, in turn, handicaps older individuals more than younger ones. The constructive memory framework might hypothesize that optimal testing aids encoding, retrieval strategies, or both. If so, we expect optimal testing to increase the distinctiveness of memory representations, thereby reducing the likelihood of false recollections. Finally, with Craik's (1994; Craik & Jennings, 1992) view of differential use of memory search cues, the argument might be that free recall draws on self-initiated cues, which are more vulnerable to individual circadian effects than is environmentally cue dependent recognition. If true, age differentials in optimality should be more pronounced in recall than in recognition. The false memory paradigm is ideal for comparing these ideas, one purpose of the current research. In this paradigm, participants learn a list of words (e.g., bed, rest, awake, tired, ... dream), all of which are associatively related to a particular critical theme word, such as SLEEP, but the theme
AGE, TESTING OPTIMALITY, AND FALSE MEMORY
word itself is never presented in the study list. The participants then recall the words on the list. After study and recall of similarly constructed lists, composed of words associatively related to other, nonoverlapping theme words, the participants complete a recognition test. This test contains words from each study list, the previously nonstudied theme words, and unrelated, nonstudied words. The participants identify words from the studied lists. The false memory paradigm is particularly useful for assessing the roles of age and optimality in a competitive task because the construction of lists from associates of a theme word should increase the likelihood of activating the theme (critical lure) plus an array of other associates of the critical theme word to serve as lures for the study-list items. This procedure was followed in Experiments 1 and 2 by presenting the study lists aurally. This paradigm was adapted from the work of Roediger and McDermott (1995), who modified a similar approach introduced by Deese in 1959 (see also Read, 1996). These researchers found that young adults occasionally recalled or recognized the nonstudied critical words, thereby showing evidence of what has been called "false memory." Recently, Norman and Schacter (1997) found that older adults showed greater evidence of false memories than did younger adults. They did not study the optimality of testing; hence, we extend the paradigm to include optimal and nonoptimal testing of younger and older adults. Interestingly, participants claimed to remember presentation of the nonstudied, nonpresented theme words rather than to have a vague notion that these words were on the list (Norman & Schacter, 1997; Payne, Elie, Blackwell, & Neuschatz, 1996; Roediger & McDermott, 1995). 
Two likely explanations for the high recollection of nonstudied, nonheard critical theme words applied specifically to the false memory paradigm are (a) that the learners have difficulty identifying the source or context of their memories (e.g., Bayen & Murnane, 1996; Cohen & Faulkner, 1989; Denney & Larsen, 1994; Ferguson, Hashtroudi, & Johnson, 1992), a problem that seems to be particularly true of older adults (Dywan & Jacoby, 1990; McIntyre & Craik, 1987; Multhaup, 1995; Norman & Schacter, 1997; Schacter, Osowiecki, Kaszniak, Kihlstrom, & Valdiserri, 1994), or (b) that older adults use less stringent response strategies than do younger ones (Harkins, Chapman, & Eisdorfer, 1979; Rankin & Kausler, 1979). The first, or "misattribution" explanation, predicts higher retention of nonstudied critical lures by older than by younger participants; it is one of the contributors to memory distortions, according to the constructive memory framework. The second, or "guessing" explanation, predicts higher output of both critical words and intrusions by older than by younger adults. We also included tests for "remember" versus "know" to assess the effects of age and optimality on the participants' perceptions of the origins of their memories. The false memory paradigm described above provides an estimate of what we term verbal false memory. This paradigm was used to assess the effects of age and optimality when the participants preferred morning or late afternoon-evening (Experiment 1) and when they preferred intermediate times of the day (Experiment 2).
In Experiment 3, the lists were constructed to represent pictorial versions of concepts, or visual false memory, to assess the effects of another input modality. In the visual false memory tasks, pictures associated with a nonstudied critical picture were studied, and then all were tested for both recognition and free recall. Others have found greater false recognition by older than by younger adults for colored pictures (Koutstaal & Schacter, 1997) and photographs (Schacter, Koutstaal, Johnson, Gross, & Angell, 1997). In summary, the purposes of the research were to examine the effects of age and optimality on verbal and visual false memory and to consider possible mechanisms that might underlie any effects.
Experiment 1
Younger and older adults were tested for free recall and recognition of lists of words and the nonstudied critical words from which the lists were derived. The times of testing either did or did not coincide with morning or evening time-of-day preferences ascertained from scores on Horne and Ostberg's (1976) Morningness-Eveningness Questionnaire (MEQ). The items on this questionnaire include time-of-day preferences for various physical and mental activities and, therefore, provide direct estimates of time-of-day activity preferences. This paper-and-pencil test has full-scale internal consistency coefficients of .82 (Smith, Reilly, & Midkiff, 1989) and .83 (Anderson et al., 1991), and an 8-week test-retest reliability coefficient of .77 (Anderson et al., 1991). The scale correlates significantly with peak time of oral temperature, bed and arising times, self-report measures of sleep habits, times of best mental and physical performance and alertness, and class performance (Horne & Ostberg, 1976, 1977). In general, morning preferrers show earlier times of day than individuals who prefer later times for each of these measures.
Method
Participants and design. Seventy-seven younger students (51 women, 26 men) in introductory psychology participated as one way to satisfy some requirements of their classes, and 42 older adults (30 women, 12 men) were recruited from a pool of individuals compiled as part of a previous research project (Intons-Peterson et al., 1998). The mean ages and age ranges of the two groups in years were 20 (18-25) for the younger group and 71.5 (60-90) for the older group. All participants were in good health. The older adults were paid for their participation. The younger group had completed fewer years of education (M = 13.7) than the older group (M = 15.6). The original design had four between-subjects variables: age group (younger, older); preferred time of day (a.m., p.m.); time tested (a.m., p.m.); and order of presenting the study lists (A, B, C). Preferred time of day had been obtained in earlier testing with the MEQ. The questionnaires were scored with Horne and Ostberg's (1976) guidelines, which assign a scaled score to each response. The scaled scores are summed and assigned to a morningness-eveningness rating on the basis of the total score. The ratings associated with each range of summed scores are 16-30 (definitely evening), 31-41 (moderately evening), 42-58 (neutral), 59-69
(moderately morning), and 70-86 (definitely morning). The highest possible total is 86. We contacted older and younger adults who scored in the moderately morning and definitely morning ranges and who scored in the definitely evening and moderately evening ranges. The plan was to assign the younger and older adults in each preference group at random to morning (before 10:30 a.m.) or afternoon-evening (3 p.m. or later) testing. This plan had to be modified because we could locate only 5 older adults who preferred the evening. Hence, we combined the preferred times of day and the times tested into an optimality variable consisting of optimal testing (testing those who preferred the morning in the morning and those who preferred the afternoon in the afternoon) and nonoptimal testing (testing those who preferred the morning in the afternoon and afternoon preferrers in the morning). This modification resulted in the testing of 47 younger adults optimally, 30 younger adults nonoptimally, 20 older adults optimally, and 22 older adults nonoptimally. The order variable existed solely for counterbalancing purposes. It did not yield significant main effects or interactions with the other variables. In addition to the two between-subjects variables of age group and optimality of testing, the design contained two within-subject variables: study condition (study + arithmetic, study + recall, nonstudied or control) and item type (studied lists, critical lures associatively related to the studied words, nonstudied lists, and critical lures associated with the nonstudied lists). The study conditions were used to replicate Roediger and McDermott's (1995) design. During study, participants heard two sets of four lists. After each list in the study + arithmetic set, the participants multiplied three-digit numbers by three-digit numbers on a sheet provided to them.
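The scoring bands and the optimality variable described above can be summarized in a short Python sketch. It is illustrative only and is not the authors' scoring procedure: the function names (`meq_rating`, `preference`, `optimality`) and the session labels are hypothetical, the summed MEQ score is assumed to have been computed already from the scaled item responses, and raising an error for neutral scorers is an assumption that reflects the fact that only morning and evening types were contacted.

```python
# Rating bands for the summed MEQ score, as listed in the text.
MEQ_BANDS = [
    (16, 30, "definitely evening"),
    (31, 41, "moderately evening"),
    (42, 58, "neutral"),
    (59, 69, "moderately morning"),
    (70, 86, "definitely morning"),
]


def meq_rating(total):
    """Map a summed MEQ score (16-86) to its morningness-eveningness rating."""
    for low, high, label in MEQ_BANDS:
        if low <= total <= high:
            return label
    raise ValueError(f"MEQ total out of range: {total}")


def preference(total):
    """Collapse the rating to 'morning', 'evening', or None for neutral."""
    rating = meq_rating(total)
    if rating.endswith("morning"):
        return "morning"
    if rating.endswith("evening"):
        return "evening"
    return None


def optimality(total, session):
    """Combine preference and session ('morning' or 'evening') into one variable.

    'morning' stands for sessions before 10:30 a.m. and 'evening' for
    afternoon-evening sessions at 3 p.m. or later (labels are hypothetical).
    """
    pref = preference(total)
    if pref is None:
        # Assumption: neutral scorers were not recruited for Experiment 1.
        raise ValueError("neutral scorers have no preferred session")
    return "optimal" if pref == session else "nonoptimal"


if __name__ == "__main__":
    print(meq_rating(72), optimality(72, "morning"))
    print(meq_rating(25), optimality(25, "morning"))
```

Collapsing preference and session into a single two-level variable mirrors the modification described in the text, which was needed because only 5 evening-preferring older adults could be located.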
After each list of the study + recall set, they recalled the words of the list immediately after hearing the last word of the list. The type of activity following each studied list was cued by the appropriate page in a response booklet, which contained a page with arithmetic problems after each study + arithmetic list and a blank page after each study + recall list. Items from the remaining (nonstudied) four lists were lures for the recognition test and served as controls for possible differences in familiarity and frequency of occurrence because the lists used in the three conditions (study + arithmetic, study + recall, and nonpresented) appeared equally often in each set in each condition. Thus, using the list numbers identified in Appendix A, with A indicating the study + arithmetic condition, R indicating the study + recall condition, and N indicating the nonpresented lists, we presented the following sequences, from left to right, to participants assigned to Presentation Orders A, B, and C: Order A = 1A, 5R, 2R, 6A, 3R, 7A, 4A, 8R, with N = 9, 10, 11, 12; Order B = 1R, 11A, 9A, 6R, 10A, 7R, 4R, 12A, with N = 2, 3, 5, 8; Order C = 5A, 11R, 9R, 2A, 10R, 3A, 8A, 12R, with N = 1, 4, 6, 7. Materials. Twelve lists were prepared. Each list was composed of 15 fairly common associates given to 12 critical theme words. These lists were constructed by modifying the lists developed by Roediger and McDermott (1995) and Deese (1959), using Russell and Jenkins's (1954) word-association norms to select high-frequency associates of the critical theme words (see Appendix A). In general, the words appeared frequently in the most extensive and contemporary corpus of words in textbooks and other commonly used books in American schools and colleges (Zeno, Ivens, Millard, & Duvvuri, 1995). The mean frequency of word type per million tokens, weighted by the dispersion of types across different sources (U), was 85.56 per million, and the median was 31.5 per million.
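The counterbalancing claim for the three presentation orders can be checked mechanically. The following Python sketch encodes Orders A, B, and C as listed above and verifies that each order uses every list exactly once and that each list serves once in each condition across orders. The data structure and the function names (`condition_by_list`, `is_counterbalanced`) are hypothetical conveniences, not part of the original procedure.

```python
from collections import Counter

# Presentation orders as listed in the text: list number plus condition code
# (A = study + arithmetic, R = study + recall); N lists were not presented.
ORDERS = {
    "A": {"sequence": ["1A", "5R", "2R", "6A", "3R", "7A", "4A", "8R"],
          "nonpresented": [9, 10, 11, 12]},
    "B": {"sequence": ["1R", "11A", "9A", "6R", "10A", "7R", "4R", "12A"],
          "nonpresented": [2, 3, 5, 8]},
    "C": {"sequence": ["5A", "11R", "9R", "2A", "10R", "3A", "8A", "12R"],
          "nonpresented": [1, 4, 6, 7]},
}


def condition_by_list(order):
    """Return {list number: condition code} for one presentation order."""
    conditions = {}
    for token in order["sequence"]:
        conditions[int(token[:-1])] = token[-1]
    for list_number in order["nonpresented"]:
        conditions[list_number] = "N"
    return conditions


def is_counterbalanced(orders):
    """True if every order covers Lists 1-12 with four lists per condition
    and every list appears once in each condition across the orders."""
    per_list = {n: Counter() for n in range(1, 13)}
    for order in orders.values():
        assignment = condition_by_list(order)
        if sorted(assignment) != list(range(1, 13)):
            return False
        if Counter(assignment.values()) != Counter({"A": 4, "R": 4, "N": 4}):
            return False
        for list_number, condition in assignment.items():
            per_list[list_number][condition] += 1
    once_each = Counter({"A": 1, "R": 1, "N": 1})
    return all(counts == once_each for counts in per_list.values())


if __name__ == "__main__":
    print(is_counterbalanced(ORDERS))
```

For example, List 1 is followed by arithmetic in Order A, by recall in Order B, and is nonpresented in Order C.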
The range was from .0163 per million (recliner) to 995 per million (good). The mean and median frequencies of each list appear in Appendix A. The 12 lists were assigned to one of three sets so that each set
was used equally often in the experimental conditions described above. The participants studied 8 of the 12 lists and then were transferred to a 48-word recognition test. The recognition response sheet contained, in a randomized order, from each of the 12 lists, the words presented in Serial Positions 1, 8, and 10, and the critical lures. The items from Serial Positions 8 and 10 were weakly related to the critical lures, whereas the items in the first serial position were strongly related to the critical lure. Among the 48 recognition items, 24 words had been heard (8 lists X 3 words per list), and 24 new items had not been presented in the experimental session (one critical lure from each of the 8 studied lists, one critical lure from each of the 4 nonstudied lists, and the 12 words in Serial Positions 1, 8, and 10 from each of the 4 nonstudied lists). The recognition words were printed on a page with two blanks after each word. The eight to-be-studied lists for each order were tape-recorded in order of frequency of association to the critical lure (Appendix A) by a male speaker at a rate of one word per 1.5 s. The tape contained instructions to turn to the next page in the examination booklet. Four words with no apparent overlap with the lists were used as practice words. Procedure. Tested in groups ranging from 1 to 4 people, the participants were told that they would hear lists of words, that they would do multiplication problems after some lists, and that they would be asked to write down the words they heard after other lists. The tape recorder then was started for the practice words. Recall followed the first two practice words and arithmetic, the third and the fourth words. No final free-recall or recognition test was given during practice. After handling any questions, the investigator advanced the tape to present the first list. After the voice instructed the participants to turn to the next page, 2.6 min were allowed for recall or arithmetic. 
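The composition of the 48-item recognition test described above can be reproduced with a small Python sketch. It is a bookkeeping illustration only: the function name `build_recognition_test` and the dictionary layout are hypothetical, and the studied and nonstudied list numbers are taken from Order A purely as an example.

```python
def build_recognition_test(studied, nonstudied, positions=(1, 8, 10)):
    """List the recognition items contributed by each of the 12 lists.

    Each list contributes the words from three serial positions plus its
    critical lure. Only serial-position words from studied lists are 'old';
    every critical lure and every nonstudied-list word is 'new'.
    """
    items = []
    for list_number in studied:
        for position in positions:
            items.append({"list": list_number,
                          "item": f"position {position}",
                          "status": "old"})
        items.append({"list": list_number,
                      "item": "critical lure",
                      "status": "new"})
    for list_number in nonstudied:
        for position in positions:
            items.append({"list": list_number,
                          "item": f"position {position}",
                          "status": "new"})
        items.append({"list": list_number,
                      "item": "critical lure",
                      "status": "new"})
    return items


# Example based on Order A: Lists 1-8 studied, Lists 9-12 nonstudied.
test_items = build_recognition_test(range(1, 9), range(9, 13))
n_old = sum(item["status"] == "old" for item in test_items)
n_new = sum(item["status"] == "new" for item in test_items)

if __name__ == "__main__":
    print(len(test_items), n_old, n_new)
```

The counts match the text: 8 studied lists with 3 words each give 24 old items, and 8 + 4 critical lures plus 12 nonstudied-list words give 24 new items.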
After the eighth list had been recalled or the arithmetic had been computed, the experimenter chatted with the participants for 2-3 min and then passed out the recognition sheet. Participants were told, at this point, that they should circle " O " for "old" if they had heard the word during study or "N" for "new" if they had not heard the word during the study trials. The next column was used to identify the source of participants' judgments that they had heard the words during study. Working with only words called "old," they were to write an "R" for "remember" or a "K" for "know" in the next column. Participants were told that "remember" meant that you are consciously aware of some aspect or aspects of what happened or what you experienced at the time you heard the word. For example, you might remember something about the way the word sounded, or something that happened in the room when you heard the word, or what you were thinking at the time you heard the word. In other words, the "remembered" word should bring back to mind a particular thought, image, or something personal from the time you heard it, or something about its sound or position, such as what came before or after it. Participants were also told that "know" responses should be made when you recognize that the word was in the study list but you cannot recollect anything about its actual occurrence or what happened or what you experienced at the time you heard it. Please print "K" for "know" when you are certain you heard the word but you do not recall anything about actually hearing it. Here are examples of "R" and "K" judgments. If you heard an ordinary bird singing you probably wouldn't recall a specific time you heard a bird before, although you know you have heard them. You would print "K" for "know." If you heard a parrot screech "happy days," you probably would remember your surprise, so you would print "R" for "remember."
These instructions, adapted from reports by Rajaram (1993) and Roediger and McDermott (1995), were also printed at the top of the recognition response sheet. Finally, the participants were asked about the purpose of the experiment. (No one gave a specific explanation.) They were debriefed, thanked, and dismissed.
Results
In general, planned comparisons were computed, although overall analyses of variance were calculated first to obtain a general picture of the results. The analyses of variance were based on two between-subjects variables (age group: younger, older; optimality of testing: optimal, nonoptimal) and on the within-subject variables noted below for each analysis. The between-subjects mean square error term for the total sample had 115 degrees of freedom; follow-up planned analyses had degrees of freedom of 75 and 40 for the younger and older mean square error terms, respectively. An alpha level of .05 was used, unless indicated otherwise. The results are separated into recall and recognition. Recall. We examined the general characteristics of recall and then assessed evidence of false memory for the various groups. The mean correct recall was scored for each of the 15 serial positions for each individual and then subjected to an analysis of variance (ANOVA), using serial position as the within-subject variable. These means yielded a standard serial position curve, F(14, 1610) = 24.25, MSE = 0.059, with primacy higher than recency. Both groups showed similar serial position curves, with the younger group's overall mean recall (8.40 or 56% of the possible list recall) being reliably higher than the older group's recall (5.85 or 39%), F(1, 115) = 72.86, MSE = 0.163. The main effect for optimality, F(1, 115) = 1.11, p > .05, and its interactions with age group (F < 1) and Age Group X Serial Position, F(1, 1610) = 1.07, MSE = 0.059, p > .05, were nonsignificant, suggesting that the circadian effects indexed by optimality did not influence the respondents' abilities to recall the presented items. Nonpresented critical lures and other intrusions also occurred on the recall trials. Overall, 4.32 of the 8 possible critical words (54%) were erroneously recalled.
Although an ANOVA of the critical-lure recall indicated that the two age groups did not differ (Myounger = 4.52 or 56%; Molder = 4.16 or 52%, F < 1), the optimally tested participants recalled fewer critical lures (M = 3.84 or 48%) than did nonoptimally tested participants (M = 4.80 or 60%), F(1, 115) = 4.22, MSE = 0.085. Moreover, age group interacted with optimality, F(1, 115) = 8.86, MSE = 0.085. Younger adults recalled about as many critical lures when tested optimally (4.72 or 59%) as when tested nonoptimally (4.32 or 54%; F < 1), but nonoptimally tested older adults recalled more critical lures (5.28 or 66%) than did their optimally tested age peers (3.04 or 38%), F(1, 40) = 9.65, MSE = 0.088. In fact, the older optimal group recalled the fewest critical lures. Intrusions other than the critical lures constituted 8% of the total number of words produced during the recall tests. The older group gave more intrusions (11%) than the
younger group (4%), F(1, 115) = 29.37, MSE = 0.005, suggesting that older people were prone to generate intrusions, consistent with the deficient inhibition and constructive memory views. No other main effects or interactions were significant. Most important are the estimates of false memory in recall. We assessed evidence of false memory by using Roediger and McDermott's (1995) procedure of comparing the proportion of correctly recalled list items averaged over the intermediate Serial Positions 4-11 (M = .40) with the proportion of erroneously recalled critical lures (.54). This difference was significant, F(1, 115) = 21.50, MSE = 0.048, indicating that the critical lures were erroneously recalled proportionately more than words located centrally in the list were correctly recalled. The younger group recalled significantly more of both types of words than did the older group, F(1, 115) = 10.20, MSE = 0.053. The triple interaction of Age X Optimality X Word Type (list, critical) was significant, F(1, 115) = 4.13, MSE = 0.048. The optimally and nonoptimally tested younger and older groups all recalled a higher proportion of critical lures than of actually heard list words, but the difference (false memory) was greatest for the nonoptimally tested older group [(MLure = .66) - (MList = .35) = .31]. These means differed reliably from comparable values for the optimally tested older group (.38 - .31 = .07), F(1, 40) = 7.65, MSE = 0.040, whereas the values for the optimally tested younger group (.59 - .50 = .09) did not differ from those for the nonoptimally tested younger group (.54 - .45 = .09; F < 1). In other words, the participants systematically recalled relatively more critical lures than centrally located list words, thus showing a false memory effect, but this effect was more pronounced for the older group tested nonoptimally than for the other groups.
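The false memory index used in the recall analysis is simply the proportion of critical lures recalled minus the proportion of mid-list words recalled. A minimal Python sketch of that arithmetic follows, using the group means reported above. The dictionary layout and the function name `false_memory_effect` are hypothetical, and the sketch reproduces only the descriptive differences, not the inferential tests.

```python
# Recall proportions reported in the text for each Age x Optimality group:
# "lure" = critical lures erroneously recalled, "list" = mid-list words recalled.
RECALL_PROPORTIONS = {
    ("younger", "optimal"): {"lure": 0.59, "list": 0.50},
    ("younger", "nonoptimal"): {"lure": 0.54, "list": 0.45},
    ("older", "optimal"): {"lure": 0.38, "list": 0.31},
    ("older", "nonoptimal"): {"lure": 0.66, "list": 0.35},
}


def false_memory_effect(lure, list_words):
    """Lure proportion minus list proportion, rounded to two decimals
    to avoid floating-point noise in the subtraction."""
    return round(lure - list_words, 2)


effects = {
    group: false_memory_effect(p["lure"], p["list"])
    for group, p in RECALL_PROPORTIONS.items()
}
largest = max(effects, key=effects.get)

if __name__ == "__main__":
    for group, effect in effects.items():
        print(group, effect)
    print("largest effect:", largest)
```

All four differences are positive, and the largest belongs to the nonoptimally tested older group, in line with the pattern described above.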
Thus, false memory is another paradigm in which optimality of testing is important for assessing older adults' memorial capabilities.
Recognition. Again, we considered first the general results and then examined evidence for false memories. The recognition test contained the list items presented in Serial Positions 1, 8, and 10 and the critical lure for each of the four study lists followed by arithmetic (study + arithmetic), the four study lists followed immediately by recall (study + recall), and the four nonstudied control lists, for a total of 48 items (4 items × 12 lists). Participants first indicated whether each item was "old" (was heard during the study trials) or "new" (was not heard during the study trials). They then returned to the items marked "old" to indicate whether they "remembered" each word or simply "knew" that the item was on the study lists. Although recognition of studied list items and of critical lures from the study + recall lists reliably exceeded recognition of the same types of items from the study + arithmetic lists, as would be expected from the additional rehearsal practice afforded by recall, subsequent analyses focused on performance in the study + arithmetic lists to use measures of recognition less inflated by rehearsal than the study + recall condition.1 The recognition results for studied and nonstudied words of Table 1 were collapsed over age group and optimality; the effects of these latter variables are summarized in Table 2. As shown in the top panel of Table 1, the mean proportion for correct recognition of possible studied items for the study + arithmetic list was .66. Of these words, .42 were claimed to have been "remembered," and .26 were said to have been "known." Relatively few words from nonstudied lists were erroneously called "correct" (.12). The lower panel of Table 1 presents the results of incorrect recognition of the critical lures in the three conditions. Particularly noteworthy is the fact that these entries are similar to or exceed those given in the top panel for correct recognition. For example, in the study + arithmetic condition, the proportion of false recognition of critical lures (.74) exceeded correct recognition of heard list words (.66), F(1, 115) = 14.84, MSE = 0.023.
Now consider the effects of age and optimality on recognition measures (Table 2). Consonant with predictions of the differential memory search cue and the constructive memory models, the older group correctly recognized about the same proportions of heard words (.66) as did the younger groups (.67) for the study + arithmetic condition (F < 1). Optimality did not have a significant main effect on correct recognition (hits). In contrast, the robust false-alarm rates for the critical lures were qualified by two interactions in the study + arithmetic condition. Age group interacted with the type of recognition (hits, critical lures), F(1, 115) = 3.60, MSE = 0.023, p = .06, and age group also interacted with optimality, F(1, 115) = 3.61, MSE = 0.086, p = .06 (Table 2).
Planned comparisons showed that for the study + arithmetic condition, the Age Group × Type of Recognition interaction occurred because the difference between critical false alarms (.78) and hits (.66) was greater for older adults, F(1, 40) = 15.44, MSE = 0.019, than for younger adults (false alarms = .71; hits = .67), F(1, 75) = 2.43, MSE = 0.025, p > .05. In other words, the false memory effect shown by all groups was exaggerated among the older participants. The Age Group × Optimality interaction arose because the younger group had about the same hit and critical false-alarm rates, regardless of testing optimality, whereas
Table 1
Proportions of Items Called "Old" (O), "Remembered" (R), and "Known" (K) for List Words and for Critical Lures in Experiment 1

Condition                O      R      K
List words
  Study + recall        .74    .52    .22
  Study + arithmetic    .67    .41    .26
  Nonstudied            .12    .05    .07
Critical lures
  Study + recall        .80    .53    .27
  Study + arithmetic    .74    .43    .31
  Nonstudied            .18    .06    .12

Note. The entries are collapsed over the other variables.
only the nonoptimally tested older adults had reliably higher false-alarm rates than hit rates. The optimally tested older adults had almost identical false-alarm and hit rates. For study + arithmetic, the critical false-alarm and hit rates were .73 and .67, respectively, for the younger optimally tested group and .69 and .67 for the nonoptimally tested younger group, F(1, 75) = 2.43, MSE = 0.025, p > .05. The critical false-alarm and hit rates for the older optimally tested group were .69 and .70, and rates for the older nonoptimally tested group were .86 and .61, F(1, 40) = 15.44, MSE = 0.019. Our older groups showed greater false memory than did the younger group, and the difference increased when the older group was tested nonoptimally. The reduction in the false alarms of optimally tested older adults in a competitive task accords with previous findings (Intons-Peterson et al., 1998; May & Hasher, 1998; May et al., 1993) that performance in these tasks by older adults is aided by testing at preferred times. The role of the associative relation of the critical lures to the studied list items was further underscored by the significantly higher frequency of false alarms for the critical lures than for other nonstudied words, either those composing nonstudied lists or the lures associated with the nonstudied lists. As suggested by Table 1, for the study + arithmetic condition, critical-word false alarms were reliably higher than false alarms for nonstudied list items, F(1, 115) = 571.41, MSE = 0.037, and were higher than false alarms for lures associated with nonstudied list items, F(1, 115) = 5.20, MSE = 0.052. Clearly, participants in both age groups were more likely to falsely label as "old" the critical words associatively related to the studied lists than items from the nonstudied lists.
Given the counterbalancing of the design, these results argue against the possibility that critical words associated with the studied lists simply had a higher probability of recognition, in general, than either the nonstudied words or their associatively related critical words. These comparisons also showed optimality differences. For study + arithmetic, the proportions of critical lures associated with heard lists and of critical lures associated with nonheard lists for the younger group tested optimally were .73 and .18, respectively, and for the younger group tested nonoptimally, proportions were .69 and .11, F(1, 75) = 1.54, MSE = 0.062, p > .05. For the older optimally tested group, the proportions were .69 and .16, and for the older group tested nonoptimally, they were .86 and .30, F(1, 40) = 5.72, MSE = 0.087. In brief, older groups tested nonoptimally were more likely to falsely recognize all words from nonstudied lists than were the other groups. We turn next to estimates of monitoring the sources of the recognitions, the claims of remembering presentation of list items previously called "old." As in other studies (e.g., Norman & Schacter, 1997; Payne et al., 1996; Roediger & McDermott, 1995), our participants were as likely to claim remembering presentation of the nonpresented critical lures (.43) as they were to claim remembering the actually
1 Additional information on the study + recall conditions in Experiments 1 and 2 may be obtained from M. J. Intons-Peterson.
Table 2
Proportions of Items Called "Old," "Remembered," and "Known" by Age Group and Optimality for List Words and Critical Lures in Experiment 1

                                  Younger                Older
Response and condition        M     Opt    Nopt      M     Opt    Nopt
"Old"
  List words
    Study + arithmetic       .67    .67    .67      .66    .70    .61
    Nonstudied               .10    .10    .10      .13    .10    .16
  Critical lures
    Study + arithmetic       .71    .73    .69      .78    .69    .86
    Nonstudied               .18    .18    .11      .23    .16    .30
"Remember"
  List words
    Study + arithmetic       .37    .39    .35      .45    .48    .42
    Nonstudied               .02    .03    .02      .06    .05    .07
  Critical lures
    Study + arithmetic       .32    .34    .31      .54    .51    .57
    Nonstudied               .03    .03    .03      .11    .08    .14
"Know"
  List words
    Study + arithmetic       .30    .29    .32      .21    .22    .20
    Nonstudied               .08    .09    .07      .06    .04    .09
  Critical lures
    Study + arithmetic       .39    .39    .39      .24    .18    .30
    Nonstudied               .12    .15    .08      .12    .09    .16

Note. Opt = optimal testing; Nopt = nonoptimal testing.
presented list words (.41, see Table 1; F < 1). These results were qualified by an interaction with age group in the study + arithmetic condition, F(1, 115) = 7.61, MSE = 0.033 (Table 2). The younger group claimed to remember slightly fewer critical lures than heard words (the respective means for study + arithmetic were .32 and .37), whereas the older groups showed the opposite effect (the respective means for study + arithmetic were .54 and .45). The difference between the age groups' means for the critical words was significant in the study + arithmetic condition, F = 13.29, MSE = 0.096. Optimality did not qualify these results. In general, both age groups were likely to say they remembered presentations of the nonpresented critical words, in addition to the previously heard words; the older group was somewhat more inclined to do so than was the younger group. Most items called "old" were said to have been "remembered," but some were considered to have been "known" (Table 2). The younger groups were slightly more likely than the older groups to claim that they "knew" the recognized items. The differences were significant for the study + arithmetic condition: The proportions of "known" critical words were .39 (younger) and .24 (older), F(1, 115) = 8.32, MSE = 0.076, and the proportions of "known" studied list words were .30 (younger) and .21 (older), F(1, 115) = 7.63, MSE = 0.034. Optimality did not qualify the outcomes.
Discussion
In general, younger participants correctly recalled more studied words than the older group, but the two age groups correctly recognized approximately the same proportion of studied words, results that are more consistent with a differential memory search cue model (Craik, 1994; Craik & Jennings, 1992) than with an inhibitory deficiency model (e.g., Allport et al., 1985; Kane et al., 1994; Navon, 1989a, 1989b; Neumann, 1987; Stadler & Hogan, 1996) or a constructive memory model (e.g., Schacter et al., 1998). According to a differential memory search cue model, older adults should recall fewer words than younger adults, but recognize about the same number, because the recognition tests provide more environmental retrieval cues than free recall. The inhibitory deficiency model predicts better recall and recognition of list items by younger than by older adults,
whereas the constructive memory framework seems to expect few differences in retention of the list items (a prediction also at odds with the results of Norman & Schacter, 1997, who found that younger adults retrieved more list items than older adults).2 However, the inhibitory deficiency and constructive memory models correctly predicted another result that is not an obvious prediction of a differential memory search cue model, namely that the older group would erroneously recognize more nonstudied critical lures than would the younger group. Norman and Schacter (1997) found trends in the same direction. Optimality of testing also was important: When tested nonoptimally, older adults recalled and recognized more nonstudied lures than when tested optimally. These results are consistent with data from May and Hasher (1998), May et al. (1993), and Intons-Peterson et al. (1998). This outcome accords with the notion that for older adults, optimal testing facilitates the exclusion of extraneous associates generated as the list stimuli are presented. It should be noted that in negative priming (Intons-Peterson et al., 1998) and in similar research (May & Hasher, 1998), young adults also benefited from being tested optimally, whereas in the current research, optimal testing of the young group did not reduce production of critical word intrusions more than nonoptimal testing. A reasonable explanation for the differences is that optimal testing is more critical with tasks that stress speed and attention than the slower pace of aural presentation of Experiment 1. The "remember" responses indicated that more older than younger adults claimed that nonpresented words had been presented, but the results did not show optimality effects. 
These outcomes suggest that older people may have difficulty distinguishing heard from generated items, but it is not clear whether these difficulties originated during encoding, arose in retrieval, or both.3 Indeed, Israel and Schacter (1997) found that adding distinctive (pictorial) features to list words reduced false recollection.
2 The constructive memory framework could predict superior retention by younger adults rather than older adults by assuming that the memory traces of younger adults are more coherent and distinctive than those of older adults.
3 Thanks to F. I. M. Craik for pointing out these implications.
The preference and testing times used in Experiment 1 were toward the extremes of a standard working day. In Experiment 2, we tested individuals at their preferred times of the middle of the day, predicting that the results would resemble those of the optimally tested groups in Experiment 1. Such a result would strengthen arguments about the importance of optimality.
Experiment 2
Method
Participants. The participants were 37 new younger adults (23 women; 14 men) from introductory psychology classes and 20 older adults (15 women; 5 men) who were recruited from a pool of individuals compiled during previous research (Intons-Peterson et al., 1998). The mean ages (and age ranges) in years for the two groups were 19.5 (17-26) for the younger group and 74 (60-86) for the older group. All participants were in good health. The older participants were paid for their services. The older group had completed more years of education (M = 15.63) than the younger group (M = 13.21). All participants scored in the neutral range (42-58) on the MEQ (Horne & Östberg, 1976) and were tested between 10:30 a.m. and 2:30 p.m.
Materials, design, and procedure. The materials and procedure were identical to those of Experiment 1. The design differed only in that preferred and testing times were not manipulated as variables. Hence, the two between-subjects variables were age group (younger, older) and order (A, B, C). Order was not an influential factor.
Results
Recall. Once again, the two groups showed standard serial position curves, with the younger group having higher recall (M = 9.00 of 15 possible items or 60%) than the older group (M = 5.25 of 15 or 35%), F(1, 55) = 46.36, MSE = 0.249. Critical lures (M = 4.72 of 8 or 59%) and other intrusions (9% of the total words recalled) also occurred. The two age groups did not differ reliably in mean critical lures recalled (younger = 4.40 or 55%; older = 5.12 or 64%), t(55) = 1.28, but older adults made more intrusions (16% of all their responses) than did younger ones (3%), t(55) = 5.54. False memories in recall were assessed by comparing the erroneous recall of critical lures with the mean correct recall of list words from Serial Positions 4-11. Again, the mean proportion recall from Serial Positions 4-11 was significantly lower (.41) than the proportion of critical lures incorrectly recalled (.59), F(1, 55) = 23.15, MSE = 0.036. The difference interacted with age group, F(1, 55) = 16.30, MSE = 0.036, because the older group showed a significantly greater differential between recall of the lures (.64) and recall of the intermediate list words (.31) than did the younger group (lures = .52; list words = .55). As with Experiment 1, both age groups manifested false memories in recall, although in Experiment 2, the older group was somewhat more likely to do so than the younger group.
Recognition. Previous recall enhanced correct recognition on the study + recall lists (see Table 3); hence, we again focused on the less rehearsal-contaminated condition of study + arithmetic. Mirroring Experiment 1's results, the
Table 3
Proportions of Items Called "Old" (O), "Remembered" (R), and "Known" (K) for List Words and Critical Lures in Experiment 2

                           Younger group           Older group
Condition                O      R      K        O      R      K
List words
  Study + recall        .84    .63    .21      .78    .52    .26
  Study + arithmetic    .69    .43    .26      .73    .45    .28
  Nonstudied            .13    .04    .09      .19    .11    .08
Critical lures
  Study + recall        .80    .53    .27      .89    .61    .28
  Study + arithmetic    .78    .39    .39      .82    .48    .34
  Nonstudied            .10    .03    .07      .31    .22    .09
younger age group recognized nonsignificantly fewer items (.69) than the older age group (.73; F < 1). In the study + arithmetic condition, participants indicated that they "remembered" 44% of the list items and that they "knew" that 28% of the items had been on the studied lists. Only 16% of the items from the nonstudied lists were erroneously recognized, and these items were almost equally likely to elicit "remember" (.08) as "know" (.09) judgments, F(1, 55) = 2.27, MSE = 0.013, p > .05. Erroneous recognition of the critical lures (lower panel in Table 3) again resembled the entries in the upper panel, indexing false memory. In brief, participants, regardless of age, were more likely to falsely recognize nonstudied critical lures than they were to correctly recognize the heard items. In the study + arithmetic condition, the overall proportion of false recognitions of critical words (.80) reliably exceeded that of correct recognitions of list words (.71), F(1, 55) = 7.35, MSE = 0.027. Critical lures also were falsely recognized significantly more often (.81) than items from nonstudied lists (.16), F(1, 55) = 377.17, MSE = 0.032. Age groups and their interactions with the memory variables were not significant. Thus, we found substantial evidence of false memories among both older and younger adults, who preferred and were tested near the middle of the day. In general, in the study + arithmetic condition, both younger and older groups were about equally likely to claim that they "remembered" presentation of the items (.43 and .45, respectively). These differences were not significant. Finally, as suggested by the entries in Table 3, the mean proportions of list and critical lures claimed to be "known" were about the same. No age group differences emerged in these analyses.
Discussion
When younger and older participants were tested at their preferred (intermediate) time of day, we again found that younger adults recalled more correct words than older adults, but the older group did not show the reduction of lure recollection found in Experiment 1. The recall of critical words exceeded the recall of intrusions (showing false memory) for both groups. In recognition, younger and older groups had about the same rate of correct recognition of previously studied words, with both groups manifesting significant evidence of false recognitions. Finally, both younger and older groups were as likely or more likely to "remember" presentation of falsely recognized critical words as they were to "remember" hearing their correctly recognized list words. Thus, in general, the results of the two experiments were quite similar. These results carry implications for the role of optimality in models of aging and of memory, but before addressing those implications, we thought it prudent to consider another modality. In the preceding experiments, we tested what we called "verbal false memory" by carefully selecting words associated with specific conceptual themes. It is possible that the evidence of false memory we obtained, and particularly that of the older group, might be unduly influenced by the verbal nature of our stimulus materials and by auditory presentation. Experiment 3 was devised to extend the paradigm to visual and imaginal domains. Research has already shown that older adults are more likely than younger adults to falsely recognize colored pictures (Koutstaal & Schacter, 1997) or photographs (Schacter, Koutstaal, Johnson, et al., 1997) that were not in a to-be-remembered collection. Here we used line drawings to test the effects of optimality of testing in addition to those of age differences.
Experiment 3
The method of investigating the visual domain may determine the results. For example, older adults are as adept as younger adults at correctly recognizing previously seen pictures and at correctly rejecting entirely new picture distractors (e.g., Park, Puglisi, & Smith, 1986; Till, Bartlett, & Doyle, 1982), but they perform less well when the distractors modify the details of previously seen pictures (e.g., Bartlett, Leslie, Tubbs, & Fulton, 1989; Bartlett, Till, Gernsbacher, & Gorman, 1983; Park & Puglisi, 1985; Pezdek, 1987) and when the previously unseen distractors were pictures taken in the same setting as the seen pictures (Schacter, Koutstaal, Johnson, et al., 1997). It seems obvious, therefore, that it would be easy to demonstrate false alarms by using lures such as pictures of faces morphed from the originals or unseen pictures from the same settings, but we wanted to try to activate conceptual themes similar to those presumably elicited by the stimulus materials of Experiments 1 and 2. To do this, we chose pictures from Snodgrass and Vanderwart's (1980) norms that are known to be associated with particular concepts. There is a long history of differences between the modalities of hearing and seeing. On the one hand, visual concepts may be more global, holistic, and instantly comprehensible than verbal ones (i.e., "a picture is worth a thousand words"). If so, visual false memory may differ from verbal false memory. On the other hand, because we used easily named pictures, thus providing the opportunity to name the pictures and convert the task from a visual to a verbal one, the visual and verbal false memory designs may produce similar results. Indeed, Koutstaal and Schacter
(1997) found that older adults often relied on the conceptual (gist) or perceptual similarity among study and lures as they made recognition judgments. If our participants labeled the pictures, we would expect similar results. We included imagery instructions for two reasons. The first was to determine whether imagining the studied items, in addition to seeing them, might add elaborative detail to the encoding of these items and make them easier to remember. Such an effect might also enhance source memory through the additional elaborative detail, as Bayen and Murnane (1996) observed. Conversely, the presence of multiple cues might handicap older adults, who have difficulty with distinguishing words actually said from those they imagined saying (Hashtroudi, Johnson, & Chrosniak, 1989) or between actions they watched someone else do and those they imagined doing themselves (Cohen & Faulkner, 1989). They also have problems with the simultaneous use of multiple cues (Ferguson et al., 1992). Would age and optimality modify these results? The second reason was to investigate possible age-related and optimality differences in the effects of instruction to use imagery in the false-alarm paradigm. Hyman and Pentland (1996) reported that recall of false events increased significantly when learners imagined a suggested event. Previous work, using various other tasks, suggests that older adults profit less from imagery instruction than younger adults but that these instructions are more beneficial to both age groups than no instructions or control (rote rehearsal) instructions (e.g., Craik & Dirkx, 1992; Dirkx & Craik, 1992; Dror & Kosslyn, 1994), although Dror and Kosslyn (1994) found that their older adults could generate and scan visual images as well as younger adults. Finally, using PET scanning, Kosslyn, Alpert, Thompson, Chabris, and Rauch (1993) found that visual imagery and perception activated similar brain regions. 
This outcome suggests that the results of imagery should mimic those for perceptual (picture) performance. In summary, in Experiment 3, we investigated the effects of age, optimality of testing, and imagery instruction on visual (pictorial) false memory.
Method
Participants and design. Of the 90 undergraduate students who participated, 66 were female and 24 were male. Comparable figures for the 67 older adults were 47 and 20. The mean ages and age ranges of the two groups in years were 21 (18-26) and 73 (60-87). The data from 8 additional older adults, who did not understand the instructions or refused to use the computer, are not included. All participants were healthy individuals with normal vision, corrected or uncorrected. They were recruited from a large sample of individuals who had completed the MEQ (Horne & Östberg, 1976) and had been assigned to morning, neutral, or evening preference categories on the basis of their responses, using the same method of scoring as that used in Experiment 1. A few participants had served in either Experiment 1 or Experiment 2. The original plan was to assign participants in each of the three preference groups at random to three test times of the day: the morning (before 10:30 a.m.), an intermediate time (11:00 a.m.-2:30 p.m.), and the afternoon (3:00 p.m. or later). But, once again, filling all nine cells for each age group proved difficult because the number of older adults who preferred the afternoon-evening was limited. Hence, we collapsed time-of-day preferences and time of testing into an optimality-of-testing variable of participants tested at their preferred times and participants tested at nonpreferred times. Then, in accord with the Age Group (2) × Optimality (2) × Imagery Instruction (2) design, participants in each of the Age Group × Optimality cells were assigned at random to imagery or no-imagery instructions. The numbers of younger and older adults tested in the imagery instruction/optimal groups were 18 and 14; the corresponding numbers for the imagery instruction/nonoptimal, no-imagery instruction/optimal, and no-imagery instruction/nonoptimal groups were 27 and 22, 15 and 11, and 30 and 20, respectively.
Materials. Six lists of 11 pictures each were compiled from Snodgrass and Vanderwart's (1980) norms. Each list represented a conceptual theme: animals, body parts, fruit, clothing, kitchen items, and vegetables. The pictures were rated as good exemplars of the concepts in the Snodgrass-Vanderwart norms and were rank-ordered in terms of frequency of association in pilot work in our laboratory (see Appendix B). This same pilot work was used to identify the most effective lure picture for each list. The lure pictures were not shown during study. Thus, the study lists contained 10 pictures, ordered from most associated with the critical lure to least associated with the critical lure. Four pictures from a different category (furniture) were used as practice items. The participants studied four of the six lists, with the list assignment counterbalanced. The 24-picture recognition test contained, for each of the four studied lists, the study pictures from Serial Positions 1, 5, and 7 and the critical lure picture, plus the pictures from Serial Positions 1, 5, and 7 and the critical lures from the two nonpresented lists. The last test, free recall, invited participants to write down names of the items pictured during study.
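As a quick consistency check on the design and materials just described, the following Python sketch tallies the reported cell sizes and the length of the recognition test. It is only an illustration: the variable names and the dictionary layout are assumptions made here, and the numbers are taken directly from the Participants and design and Materials paragraphs.

```python
# Cell sizes reported for the Imagery Instruction x Optimality cells:
# (instruction, optimality) -> (younger n, older n)
cells = {
    ("imagery", "optimal"): (18, 14),
    ("imagery", "nonoptimal"): (27, 22),
    ("no imagery", "optimal"): (15, 11),
    ("no imagery", "nonoptimal"): (30, 20),
}

# Sum the cells within each age group.
younger_total = sum(younger for younger, _ in cells.values())
older_total = sum(older for _, older in cells.values())

# Recognition test: three study pictures plus one critical lure per list,
# drawn from the four studied lists and the two nonpresented lists.
TESTED_POSITIONS = (1, 5, 7)
ITEMS_PER_LIST = len(TESTED_POSITIONS) + 1  # + 1 for the critical lure
STUDIED_LISTS = 4
NONPRESENTED_LISTS = 2
test_length = ITEMS_PER_LIST * (STUDIED_LISTS + NONPRESENTED_LISTS)

print(younger_total, older_total, test_length)  # prints: 90 67 24
```

The totals agree with the 90 younger and 67 older participants and with the 24-picture recognition test stated in the text.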
Procedure. Each participant, tested individually, was seated in front of a computer and asked to wear a headset. All instructions were delivered simultaneously over the headset and in print on the monitor. The groups assigned to imagery instructions were asked to generate an imaginary scene that contained the pictures of the previous list. Groups assigned to no-imagery instructions received no such instructions. Instead, they were told to "think about the pictures in the list they had just seen." The main experiment began after the learners saw four practice pictures. No tests were administered during practice, but the participants were told that their memories would be tested in the main part of the experiment. During study, pictures were presented at a 5-s rate. Each list was followed by a 3-min unfilled interval to give the imagery instruction groups the time to generate a composite image of the pictures of each list. After all four lists had been seen, the pictorial recognition test was presented by displaying each picture until the participant responded by indicating whether she or he had seen the picture among the study lists or not. Then, all pictures judged as "old" were shown again, individually, until the participant decided whether she or he had a clear recollection of actually having seen the picture ("Remember") or had simply seen the picture at some time ("Know"). The "O," "N," "R," and "K" responses were made by pressing marked keys on the keyboard. The recognition test was followed by a free-recall test, in which the participants wrote the names of the pictures they had seen on the study lists. The recognition test preceded the recall test because we wanted to avoid encouraging the participants to label the pictures during the recognition test.
Results
In Experiment 3, recognition preceded recall, so the recognition performance did not have the advantage of prior recall. Hence, recognition results are presented here before recall.
Recognition. As Table 4 shows, studied pictures (.93) were more likely to be recognized than nonpresented critical picture lures from the same lists (.37), F(1, 149) = 425.11, MSE = 0.052, whereas the picture lures were more likely to be called "old" than were pictures from the nonstudied lists (.04), F(1, 149) = 216.34, MSE = 0.037. Thus, false recognition occurs with visual false memory, but to a lesser extent than with verbal false memory, as we have tested these paradigms. The participants were quite likely to say they "remembered" previously studied pictures (.70), but considerably less likely to "remember" seeing critical lures from the presented lists (.15), F(1, 149) = 189.80, MSE = 0.077. They rarely indicated that they "remembered" seeing pictures from nonstudied lists (.01). The difference between the proportions for remembering critical lures from the studied lists and lures from unstudied lists also was significant, F(1, 149) = 62.21, MSE = 0.094. As Table 5 shows, both younger (.93) and older (.93) groups correctly recognized most of the studied pictures (F < 1). Moreover, studied pictures were recognized marginally more often by the optimally tested groups (.95) than by the nonoptimally tested groups (.91), F(1, 149) = 3.83, MSE = 0.013, p = .052, and recognition was higher without imagery instructions (.95) than with the instructions (.91), F(1, 149) = 6.31, MSE = 0.013. In general, the participants were more likely to assign "R" to the list pictures (.70) than "K" (.22), F(1, 149) = 156.73, MSE = 0.121 (Table 4). Pictures from the nonstudied lists were rarely called "old" (.04), indicating excellent differentiation between studied and nonstudied pictures. This differentiation was significantly better for the optimally tested group (.03) than for the nonoptimally tested group (.07), F(1, 149) = 5.99, MSE = 0.020 (see middle panel of Table 5).
The few pictures from nonstudied lists called "old" were more likely to be classified as "known" (.03) than as "remembered" (.01), F(1, 149) = 12.72, MSE = 0.091. In this analysis, imagery instructions led to more incorrect labeling of the nonstudied pictures as "old" (.40) than did no instructions (.34), F(1, 149) = 6.79, MSE = 0.078 (see lowest panel of Table 5). No other main effects or interactions were significant. The lower two rows of the top panel of Table 5 show that nonstudied lures associated with the studied lists were mistakenly called "old" marginally more often by the older
Table 4
Proportions of Pictures Called "Old" (O), "Remembered" (R), and "Known" (K) for List Items and for Critical Lures in Experiment 3

Condition           O      R      K
List items
  Studied          .93    .70    .23
  Nonstudied       .04    .01    .03
Critical lures
  Studied          .37    .15    .22
  Nonstudied       .05    .01    .04

Note. The entries are collapsed over the other variables.
Table 5
Recognition for Studied and Nonstudied Pictures, for Critical Picture Lures From Studied Lists, and for Picture Lures From Nonstudied Lists as Functions of Age Group, Optimality of Testing, and Imagery Instructions in Experiment 3

                                  Younger group           Older group
Item type                       O      R      K        O      R      K
  Studied                      .93    .67    .26      .93    .74    .19
  Nonstudied                   .04    .01    .03      .04    .01    .03
  Critical lure from studied   .33    .10    .23      .41    .22    .19
  Lure from nonstudied         .04    .01    .03      .05    .01    .04

                                 Optimally tested      Nonoptimally tested
Item type                       O      R      K        O      R      K
  Studied                      .95    .74    .21      .91    .67    .24
  Nonstudied                   .02    .00    .02      .05    .01    .04
  Critical lure from studied   .35    .12    .23      .39    .18    .21
  Lure from nonstudied         .03    .00    .03      .07    .02    .05

                              No imagery instructions  Imagery instructions
Item type                       O      R      K        O      R      K
  Studied                      .95    .70    .25      .91    .67    .24
  Nonstudied                   .02    .00    .02      .05    .03    .02
  Critical lure from studied   .34    .13    .21      .40    .16    .24
  Lure from nonstudied         .02    .00    .02      .08    .02    .06

Note. O = "old"; R = "remembered"; K = "known."
group (.41) than by the younger group (.33), F(1, 149) = 3.24, MSE = 0.081, p = .074. In addition, the critical lures related to studied lists were erroneously called "old" (.37) significantly more often than lures associated with the nonstudied lists (.05), F(1, 149) = 167.67, MSE = 0.044 (Table 4), indicating false memories for picture concepts. Table 4 also shows that most of the critical lures falsely labeled as "old" were said to be "known" (.22) more often than "remembered" (.15), F(1, 149) = 5.27, MSE = 0.288, but this difference interacted with age group, F(1, 149) = 8.32, MSE = 0.288, as appears in Table 5. The younger group was more likely to claim that they "knew" (.23) the nonstudied concept picture than that they "remembered" (.09) they had seen it, whereas the older group showed a nonsignificant trend in the opposite direction ("K" = .20; "R" = .22). This analysis also showed a main effect for imagery instructions, F(1, 149) = 4.16, MSE = 0.090, because groups with imagery instructions were more likely to claim they "remembered" or "knew" the concept items (M = .20) than were groups without imagery instructions (.17). In general, the data yielded evidence for false memories with visually based conceptual structures, albeit at lower levels than those found with verbally based structures. Moreover, optimally tested groups produced more hits than nonoptimally tested groups but did not differ on false alarms; imagery instructions yielded lower hits and more false alarms than did no imagery instructions, and older
groups showed more false alarms for picture lures than did younger groups. The two age groups did not differ in hit rates. Thus, the results for hits support the prediction of the differential memory search cue and conceptual memory models that recognition will be similar for both young and old adults, but the false alarms for critical picture lures indicate that it is not sufficient to consider only hit rates.
Recall. The serial position curves for pictures named correctly showed primacy but not recency effects, in marked contrast to the curves for verbal memory. Recall differed across the serial positions, F(1, 1341) = 55.47, MSE = 0.062. The younger group recalled marginally more picture names correctly (M = 6.1) than did the older group (5.5), F(1, 149) = 3.34, MSE = 0.385, p = .07; the optimally tested recalled more (6.3) than the nonoptimally tested (5.3), F(1, 149) = 9.06, MSE = 0.385; and the noninstructed recalled more (6.2) than the imaginally instructed (5.3), F(1, 149) = 8.09, MSE = 0.385. No interactions were significant.
The names of critical lures were erroneously recalled more often by the older group (3.8) than by the younger group (1.7), F(1, 149) = 23.85; the nonoptimally tested group made more false alarms for critical lures from studied lists (3.4) than the optimally tested group (2.1), F(1, 149) = 7.72; and age group interacted with optimality, F(1, 149) = 7.77. All MSEs for these effects were 0.069. The interaction occurred because, for studied lists, the optimally tested and nonoptimally tested younger groups made identical numbers of false alarms for critical lures (1.7), whereas the optimally tested older group made significantly fewer false alarms (2.6) than the nonoptimally tested older group (5.0).
Because the serial position curves did not have the usual trough near the middle, we used two methods to assess the significance of the differences between correct recall and recall of critical lure false alarms.
The first approach was similar to that used in Experiments 1 and 2: The mean for lure false alarms was compared with the mean for correct recall in intermediate Serial Positions 5-7. In the second approach, the mean for lure false alarms was compared with the mean for correct recall in the final four Serial Positions 7-10. In general, the participants erroneously named fewer critical lures associated with the studied lists (.30) than they correctly recalled studied items from Serial Positions 5-7 (.63), F(1, 149) = 134.39, MSE = 0.069, or than they correctly recalled studied items from Serial Positions 7-10 (.41), F(1, 149) = 17.43, MSE = 0.072. Imagery instructions had a main effect in both of these analyses: for lures versus Serial Positions 5-7, F = 9.44, MSE = 0.042; for lures versus Serial Positions 7-10, F = 8.88, MSE = 0.052. In both cases, the uninstructed groups recalled more items of both kinds than the imagery-instructed groups. The analyses also yielded the same two double interactions, of age group with the two kinds of recall and of optimality with the two kinds of recall, and a triple interaction of age group, optimality, and the two kinds of recall. The smallest F(1, 149) in this set was 5.47, MSE = 0.069, for the triple interaction comparing recall of critical lures and recall of the correct items from Serial Positions 5-7. The relevant means appear in Table 6,
Table 6
Mean Recall Scores for Pictures Studied in Serial Positions 5-7 and 7-10, for Nonstudied Pictures Associated With Studied Lists, and Difference Scores

                                                             Difference score
Condition            Critical lure (CL)   M5-7    M7-10    CL - M5-7   CL - M7-10
Younger
  Optimal                  .17            .70     .47        -.53        -.30
  Nonoptimal               .17            .63     .40        -.46        -.23
  M                        .17            .66     .44        -.49        -.27
Older
  Optimal                  .26            .66     .47        -.40        -.21
  Nonoptimal               .50            .55     .29        -.05         .21
  M                        .43            .60     .38        -.17         .05
Imagery
  No instructions          .30            .69     .47        -.39        -.17
  Instructions             .25            .58     .35        -.33        -.23
Grand M                    .30            .63     .41        -.33        -.11
which tallies differences between the two types of recall in addition to the actual scores. As can be seen, in general, recall of correct items was higher for optimally tested participants than for nonoptimally tested participants (Serial Positions 5-7: F(1, 149) = 7.20, MSE = 0.042; Serial Positions 7-10: F(1, 149) = 8.87, MSE = 0.055) and declined when imagery instructions were given (Serial Positions 5-7: F(1, 149) = 9.84, MSE = 0.042; Serial Positions 7-10: F(1, 149) = 9.21, MSE = 0.055). Recall of the critical picture lures increased with age, F(1, 149) = 23.85, MSE = 0.064, and was higher for the nonoptimally tested participants (.34) than for the optimally tested participants (.21), F(1, 149) = 7.72, MSE = 0.069.
Intrusions other than the critical lures constituted 3% of all recalled items. This rate was significantly lower than the recall for critical lures, F(1, 149) = 119.75, MSE = 0.034. The same interactions with age group and optimality emerged as in the comparisons with correct recall for items from different serial positions; for the Age Group × Optimality × Intrusions Versus Lures interaction, F(1, 149) = 7.00, MSE = 0.034. The critical lure minus intrusion difference scores were .16 and .13 for the optimally and nonoptimally tested younger groups and .24 and .44 for the optimally and nonoptimally tested older adults. Once again, the nonoptimally tested older adults were more likely to falsely recall both the critical lures (.50) and intrusions (.06) than were the younger groups or the older group tested optimally.
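The difference scores in Table 6 are simple subtractions of mean correct recall from mean critical lure recall. As a minimal sketch of that arithmetic, the following uses the Age Group × Optimality cell means transcribed from Table 6; the dictionary layout and the names `table6` and `difference_scores` are illustrative assumptions, not part of the original analysis.

```python
# Cell means (proportions) transcribed from Table 6.
# The data structure and all names here are hypothetical, chosen for illustration.
table6 = {
    ("younger", "optimal"): {"lure": 0.17, "m5_7": 0.70, "m7_10": 0.47},
    ("younger", "nonoptimal"): {"lure": 0.17, "m5_7": 0.63, "m7_10": 0.40},
    ("older", "optimal"): {"lure": 0.26, "m5_7": 0.66, "m7_10": 0.47},
    ("older", "nonoptimal"): {"lure": 0.50, "m5_7": 0.55, "m7_10": 0.29},
}


def difference_scores(cell):
    """Return (CL - M5-7, CL - M7-10) for one cell, rounded to two decimals."""
    return (
        round(cell["lure"] - cell["m5_7"], 2),
        round(cell["lure"] - cell["m7_10"], 2),
    )


scores = {group: difference_scores(cell) for group, cell in table6.items()}

for (age, timing), (d_mid, d_late) in scores.items():
    print(f"{age:8s} {timing:11s} CL-M5-7 = {d_mid:+.2f}   CL-M7-10 = {d_late:+.2f}")
```

A negative score means that studied items were recalled more often than critical lures. In this sketch, the only positive value is the CL - M7-10 score for the nonoptimally tested older group, the cell in which lure recall exceeded correct recall of the final serial positions.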
Discussion
As with verbal false memory in Experiments 1 and 2, in Experiment 3 both age groups showed significant evidence of false memories in recall and recognition of pictures taken from various conceptual categories. These similarities occurred in spite of major differences in other respects. For example, recall of the verbal materials of Experiments 1 and 2 showed standard primacy and recency effects, whereas recall of the picture materials of Experiment 3 yielded only primacy effects, probably because recall followed recognition, which dampened the short-term memory assumed to produce recency effects. Recognition of list words (≈50%) was lower than recognition of list pictures (>90%), and nonpresented but thematically related words were much harder to suppress than nonpresented but thematically related pictures.
As with the previous experiments, the younger groups recalled more items correctly than did the older groups, but the two age groups did not differ in correct recognition. Unlike Experiment 1, in Experiment 3 the optimally tested groups outperformed the nonoptimally tested groups in terms of both correct recalls and recognition hits. More important is that the older optimally tested adults were less likely to show critical false alarms (and, hence, less evidence of false memory) than were the older nonoptimally tested individuals. Optimality did not have much effect on the younger groups.
Finally, imagery instructions did not facilitate performance. Correct recall and correct recognition were higher without imagery instructions than with them. Imagery instructions might have added an extra burden to an already complicated experimental paradigm, thus straining the limits of the participants' cognitive capacities, or the absence of such instructions (and the advice to "think about the pictures") may have provided more rehearsal time than was available with imagery instructions. These results contradict the notion that, at least as manipulated herein, imaginal performance duplicates perceptual responding. Apropos the tenets of the conceptual memory framework model, the tendency for imagery instructions to increase "remember" claims to lures and to increase errors to list items suggests that in this paradigm, imagery instructions increased generation of general thematic associations rather than increasing the distinctiveness of the list items.
General Discussion
In brief, the major results were (a) that adults recalled and recognized nonstudied words and pictures that were thematically related to studied items more often than they recalled or recognized other types of intrusions, thereby demonstrating false memories for both kinds of materials and both retention measures; (b) that older adults recalled fewer previously studied items than did younger adults but correctly recognized previously studied items with about the same frequency as younger adults; (c) that in general, the older nonoptimally tested adults were less able to discriminate studied from nonstudied but thematically related items than were the older optimally tested groups, but optimal testing did not have much effect on the younger groups; (d) that older adults were more likely than younger adults to claim that they "remembered" hearing unstudied words or seeing unstudied pictures; (e) that correct recall and recognition were higher without imagery instructions than with imagery instructions; and (f) that the type of material was important. Pictures were correctly recognized much better than words, whereas themes associated with word lists were more likely to elicit false recognition of thematically related items than were themes derived from pictures.
Clearly, optimality, as well as age, affects the false memory effect. The older, nonoptimally tested group produced significantly greater false memory effects than did the older optimally tested group or either the optimally or nonoptimally tested younger groups. This occurred even though all groups showed a significant false memory effect. These results emphasize the desirability of using optimal testing, which presumably taps attentiveness or alertness, to facilitate human cognitive performance on competitive tasks. These tasks are generally thought to be mediated by frontal lobe functioning, an area considered vulnerable to aging and presumably to levels of circadian alertness associated with optimal testing. It seems reasonable to hypothesize, therefore, that optimal testing delivers greater attentiveness than nonoptimal testing and that this factor will interact with age-related changes in cognitive ability, including neuropsychological and metabolic-physiological ones.
Optimality does not appear to be synonymous with attention, however, because simply asking participants to attend to their retrievals (Norman & Schacter, 1997) did not materially affect false alarms. In contrast, making original study cues more distinctive by adding pictures portraying the cues did reduce false alarms (Israel & Schacter, 1997).
Another purpose of the research was to consider contributory mechanisms. Because the performance declines associated with increasing age appear to be mitigated to some extent by optimal testing, we sought mechanisms that address both age and optimality. Note that a comprehensive model would accommodate competitive situations, such as false memory, when optimality has an effect and would accommodate less competitive ones, such as recall or recognition of correct responses, when it does not. In addition, the model would use the same mechanisms to explain retention of both correct items and related lures.
Various mechanisms may contribute to the Age × Optimality interaction, including the type of cue used to search memory. Indeed, the only model that correctly predicted age differences in correct recall but not recognition was Craik's (1994) differential memory search cue view (also see Craik & Jennings, 1992). This view holds that two kinds of retrieval cues aid recall, environmental and self-initiated,
and that with age, the ability to generate one's own cues declines. Hence, older people tend to rely mainly on environmental cues to initiate recall, whereas younger adults use both types of cues. Because recognition explicitly provides environmental cues but recall does not, age has less effect on correct recognition than on correct recall. These mechanisms do not address either nonstudied items or optimality, however, so their predictiveness in these other areas requires additional assumptions. If an extended model assumes that self-initiated cues are more vulnerable to circadian effects than are the environmental cues on which recognition depends, we would expect nonoptimal testing to exaggerate age differences in recall more than those in recognition. This implication received some support in Experiment 1, where the false memory effect (lures - list) for nonoptimally tested older adults was greater for recall (.66 - .35 = .31) than for recognition (.86 - .61 = .25). The corresponding effects shown by the younger nonoptimally tested group were .54 - .45 = .09 for recall and .69 - .67 = .02 for recognition. These comparisons are rough, at best, because the recall and recognition measures have different baselines. They do, however, suggest that the self-initiated (and perhaps environmental) cues associated with recall may be more vulnerable to circadian effects than the environmental cues associated with recognition.
The mechanisms posited by models that consider nonstudied, as well as studied, items show the same pattern of being supported in some predictions and not in others. For example, neither inhibitory deficiency models (e.g., Allport et al., 1985; Kane et al., 1994; Navon, 1989a, 1989b; Neumann, 1987; Stadler & Hogan, 1996) nor the constructive memory framework (Schacter et al., 1998) predicted age differences in correct recall but not in correct recognition, but they incorporated mechanisms to cover recollection of nonpresented lures.
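The lures-minus-list comparison reported above for the nonoptimally tested groups in Experiment 1 can be reproduced with a few subtractions. The following is only a sketch of that arithmetic: the proportions are the ones quoted in the text, and the names `exp1_nonoptimal` and `false_memory_effect` are hypothetical labels introduced for illustration.

```python
# (lure, list) proportions quoted in the text for the nonoptimally tested
# groups in Experiment 1. Variable and function names are illustrative only.
exp1_nonoptimal = {
    "older": {"recall": (0.66, 0.35), "recognition": (0.86, 0.61)},
    "younger": {"recall": (0.54, 0.45), "recognition": (0.69, 0.67)},
}


def false_memory_effect(lure, list_item):
    """False memory effect as used in the text: lure rate minus list-item rate."""
    return round(lure - list_item, 2)


effects = {
    age: {measure: false_memory_effect(*pair) for measure, pair in measures.items()}
    for age, measures in exp1_nonoptimal.items()
}

for age, by_measure in effects.items():
    print(age, by_measure)
```

As the text cautions, recall and recognition have different baselines, so these differences are best read as a rough description rather than as a direct test of the extended model.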
With an inhibitory deficiency model, this mechanism might involve age-related loss of inhibitory efficiency in competitive situations that require suppression of nonrelevant items. This mechanism tidily handles the finding of greater recall and recognition of lures by older than younger adults in Experiment 3 but does not accommodate the absence of a main effect for age in lure recall or recognition (Experiments 1 and 2) without some provision for optimality to explain the interaction of optimality with age in these experiments. The constructive memory framework identifies overreliance on gist and source-monitoring problems as mechanisms that induce memory distortions. Certainly, the high correct recognition of names of pictures in Experiment 3 indicated that both age groups relied heavily on some form of linguistic gist, as did the marked tendencies in these paradigms to falsely recollect thematically related items but not unrelated items (also see Brainerd & Reyna's, 1990, fuzzy trace model; Brainerd et al., 1990; Reyna & Brainerd, 1995; and Estes's, 1997, dual-trace model). The role of source monitoring is less clear. If elderly adults have greater difficulty monitoring the sources of presented and self-generated items than do younger adults, the lack of differentiation should yield high recollection of both presented and
lure items. The data are consistent for false alarms but not for recognition hits, an unsurprising outcome given the nonincorporation of optimality in these models.
The results of Experiments 1 and 2 differed somewhat from those of Norman and Schacter (1997), who also used a verbal false memory paradigm. In counterpart conditions (without queries about properties of "remembered" items), they found lower recall of lures by younger than by older adults, whereas, in Experiments 1 and 2, we observed no age differences in lure recall. In addition, they reported higher recognition of list items by younger than by older adults but relatively few age differences in false recognition of critical lures. We found no age differences in correct recognition but higher recognition of critical lures by the older nonoptimally tested group than by the other groups. The reason for the differences may be that Norman and Schacter used twice as many study lists, which may have afforded greater differentiation of list items and lures to their participants.
Moreover, the results of "remember" versus "know" judgments did not mirror the age and optimality effects that would be expected from age-related source-monitoring differences. "Remember" judgments are assumed to assess conscious recollection of the specific details of initial presentation of the words, whereas "know" judgments presumably monitor general familiarity with the items (Gardiner, 1988; Gardiner & Java, 1993; Mather, Henkel, & Johnson, 1997; Norman & Schacter, 1997; Rajaram, 1993; Roediger & McDermott, 1995). If optimality is related to attentiveness, we might expect to find that optimal testing yields greater differentiation between list items and lures than does nonoptimal testing. We found no such effect, although older adults tended to "remember" more lures than did younger adults.
In fact, these results may challenge the assumption that "remember" judgments index alertness because Norman and Schacter (1997) found no significant reduction in false memory effects even when they required their participants to write or answer focused questions about the list items and lures they claimed to "remember."⁴ A likely explanation is that metamemory does not activate competitive responses to a substantial extent. If so, metamemorial processes would join processes initiated by simple noncompetitive language tasks, such as recollections of actually experienced stimuli (as found in all three experiments), simple vocabulary tasks, category generation, and naming colors or lines (May & Hasher, 1998), that are not affected by optimality of testing.
The data argue, compellingly in our opinion, for greater consideration of the role of optimal testing and the continued pursuit of underlying mechanisms. It probably is the case that to understand the role of optimality on competitive tasks, both circadian-metabolic and neuropsychological evidence need to be incorporated in a satisfactory model. We look forward to neuropsychological research that takes into account optimal preference times, circadian status, and cerebral activation as older and younger participants perform in competitive and noncompetitive tasks.
Finally, the general results of optimality suggest two points. First, it would be desirable to have such information in Method sections of articles, particularly those dealing with older adults. Second, it seems reasonable to encourage elderly adults to conduct demanding cognitive business at their preferred times of the day.
⁴ Both Norman and Schacter (1997) and Mather et al. (1997) found that both age groups reported remembering more information about associations than about sensations or contextual detail information. Sensory and contextual detail information was reported more often for list words than for lures.
References
Allport, D. A., Tipper, S. P., & Chmiel, N. (1985). Perceptual integration and post-categorical filtering. In M. I. Posner & O. S. M. Marin (Eds.), Attention and performance XI (pp. 107-132). Hillsdale, NJ: Erlbaum.
Anderson, M., Petros, T. V., Beckwith, B. E., Mitchell, W. W., & Fritz, S. (1991). Individual differences in the effect of time of day on long-term memory access. American Journal of Psychology, 104, 241-255.
Bartlett, J. C., Leslie, J. E., Tubbs, A., & Fulton, A. (1989). Aging and memory for faces. Psychology and Aging, 4, 276-283.
Bartlett, J. C., Till, R. E., Gernsbacher, M., & Gorman, W. (1983). Age-related differences in memory for lateral orientation. Journal of Gerontology, 38, 439-446.
Bayen, U. J., & Murnane, K. (1996). Aging and the use of perceptual and temporal information in source memory tasks. Psychology and Aging, 11, 293-303.
Bodenhausen, G. V. (1990). Stereotypes as judgmental heuristics: Evidence of circadian variations in discrimination. Psychological Science, 1, 319-322.
Brainerd, C. J., & Reyna, V. F. (1990). Gist is the gist: The fuzzy-trace theory and new intuitionism. Developmental Review, 10, 3-47.
Brainerd, C. J., Reyna, V. F., Howe, M. L., & Kingma, J. (1990). The development of forgetting and reminiscence. Monographs of the Society for Research in Child Development, 53, 3-4 (Whole No. 222).
Cohen, G., & Faulkner, D. (1989). Age differences in source forgetting: Effects on reality monitoring and on eyewitness testimony. Psychology and Aging, 4, 10-17.
Craik, F. I. M. (1994). Memory changes in normal aging. Current Directions in Psychological Science, 3, 155-158.
Craik, F. I. M., & Dirkx, E. (1992). Age-related differences in three tests of visual imagery. Psychology and Aging, 7, 661-665.
Craik, F. I. M., & Jennings, J. M. (1992). Human memory. In F. I. M. Craik & T. A. Salthouse (Eds.), The handbook of aging and cognition (pp. 51-110). Hillsdale, NJ: Erlbaum.
Deese, J. (1959). On the prediction of occurrence of particular verbal intrusions in immediate recall. Journal of Experimental Psychology, 58, 17-22.
Denney, N. W., & Larsen, J. E. (1994). Aging and episodic memory: Are elderly adults less likely to make connections between target and contextual information? Journal of Gerontology: Psychological Sciences, 49, P270-P275.
Dirkx, E., & Craik, F. I. M. (1992). Age-related differences in memory as a function of imagery processing. Psychology and Aging, 7, 352-358.
Dror, I. E., & Kosslyn, S. M. (1994). Mental imagery and aging. Psychology and Aging, 9, 90-102.
Dywan, J., & Jacoby, L. (1990). Effects of aging on source monitoring: Differences in susceptibility to false fame. Psychology and Aging, 5, 379-387.
Estes, W. K. (1997). Processes of memory loss, recovery, and distortion. Psychological Review, 104, 148-169.
Ferguson, S. A., Hashtroudi, S., & Johnson, M. K. (1992). Age differences in using source-relevant cues. Psychology and Aging, 7, 443-452.
Gardiner, J. M. (1988). Functional aspects of recollective experience. Memory & Cognition, 16, 309-313.
Gardiner, J. M., & Java, R. I. (1993). Recognizing and remembering. In A. E. Collins, S. E. Gathercole, M. A. Conway, & P. E. Morris (Eds.), Theories of memory (pp. 163-188). Hillsdale, NJ: Erlbaum.
Grady, C. L., McIntosh, A. R., Horwitz, B., Maisog, J. M., Ungerleider, L. G., Mentis, M. J., Pietrini, P., Schapiro, M. B., & Haxby, J. V. (1995). Age-related reductions in human recognition memory due to impaired encoding. Science, 269, 218-221.
Gur, R. C., Gur, R. E., Obrist, W. D., Skolnick, B. E., & Reivitch, M. (1987). Age and regional cerebral blood flow at rest and during cognitive activity. Archives of General Psychiatry, 44, 617-621.
Harkins, S. W., Chapman, C. R., & Eisdorfer, C. (1979). Memory loss and response bias in senescence. Journal of Gerontology, 34, 66-72.
Hashtroudi, S., Johnson, M. K., & Chrosniak, L. D. (1989). Aging and source monitoring. Psychology and Aging, 4, 106-112.
Horne, J., Brass, C., & Pettitt, S. (1980). Circadian performance differences between morning and evening types. Ergonomics, 23, 29-36.
Horne, J., & Ostberg, O. (1976). A self-assessment questionnaire to determine morningness-eveningness in human circadian rhythms. International Journal of Chronobiology, 4, 97-110.
Horne, J., & Ostberg, O. (1977). Individual differences in human circadian rhythms. Biological Psychology, 5, 179-190.
Hyman, I. E., & Pentland, J. (1996). The role of mental imagery in the creation of false childhood memories. Journal of Memory and Language, 35, 101-117.
Intons-Peterson, M. J., Rocchi, P., West, T., McLellan, K., & Hackney, A. (1998). Aging, optimal testing times, and negative priming. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 362-376.
Israel, L., & Schacter, D. L. (1997). Pictorial encoding reduces false recognition of semantic associates. Psychonomic Bulletin & Review, 4, 577-581.
Kane, M. J., Hasher, L., Stoltzfus, E. R., Zacks, R. T., & Connelly, S. L. (1994). Inhibitory attentional mechanisms and aging. Psychology and Aging, 9, 103-112.
Kosslyn, S. M., Alpert, N. M., Thompson, W. L., Chabris, C. F., & Rauch, S. L. (1993). Visual mental imagery activated topographically organized visual cortex. Journal of Cognitive Neuroscience, 5, 263-287.
Koutstaal, W., & Schacter, D. L. (1997). Gist-based false recognition of pictures in older and younger adults. Journal of Memory and Language, 37, 555-583.
Kramer, A. F., Humphrey, D. G., Larish, J. F., Logan, G. D., & Strayer, D. L. (1994). Aging and inhibition: Beyond a unitary view of inhibitory processing. Psychology and Aging, 9, 491-527.
Mather, M., Henkel, L. A., & Johnson, M. K. (1997). Evaluating characteristics of false memories: Remember/know judgments and memory characteristics questionnaire compared. Memory & Cognition, 25, 826-837.
May, C. P., & Hasher, L. (1998). Synchrony effects in inhibitory control over thought and action. Journal of Experimental Psychology: Human Perception and Performance, 24, 363-379.
May, C. P., Hasher, L., & Stoltzfus, E. R. (1993). Optimal time of day and the magnitude of age differences in memory. Psychological Science, 4, 326-330.
McGaugh, J. L. (1995). Emotional activation, neuromodulatory systems, and memory. In D. L. Schacter (Ed.), Memory distortion (pp. 255-273). Cambridge, MA: Harvard University Press.
McIntyre, J. S., & Craik, F. I. M. (1987). Age differences in memory for item and source information. Canadian Journal of Psychology, 41, 175-192.
Multhaup, K. S. (1995). Aging, source, and decision criteria: When false fame errors do and do not occur. Psychology and Aging, 10, 492-497.
Navon, D. (1989a). The importance of being visible: On the role of attention in a mind viewed as an anarchic intelligence system. I. Basic tenets. European Journal of Cognitive Psychology, 1, 191-213.
Navon, D. (1989b). The importance of being visible: On the role of attention in a mind viewed as an anarchic intelligence system. II. Application to the field of attention. European Journal of Cognitive Psychology, 1, 215-238.
Neumann, O. (1987). Beyond capacity: A functional view of attention. In H. Heuer & A. F. Sanders (Eds.), Perspectives on perception and action (pp. 361-394). Hillsdale, NJ: Erlbaum.
Norman, K. A., & Schacter, D. L. (1997). False recognition in younger and older adults: Exploring the characteristics of illusory memories. Memory & Cognition, 25, 838-848.
Park, D. C., & Puglisi, J. T. (1985). Older adults' memory for the color of pictures and words. Journal of Gerontology, 40, 198-204.
Park, D. C., Puglisi, J. T., & Smith, A. D. (1986). Memory for pictures: Does an age-related decline exist? Psychology and Aging, 1, 11-17.
Payne, D. G., Elie, C. J., Blackwell, J. M., & Neuschatz, J. S. (1996). Memory illusions: Recalling, recognizing, and recollecting events that never occurred. Journal of Memory and Language, 35, 261-285.
Perfect, T. J., & Dasgupta, Z. R. R. (1997). What underlies the deficit in reported recollective experience in old age? Memory & Cognition, 25, 849-858.
Petros, T. V., Beckwith, B. E., & Anderson, M. (1990). Individual differences in the effects of time of day and passage difficulty on prose memory in adults. British Journal of Psychology, 81, 63-72.
Pezdek, K. (1987). Memory for pictures: A lifespan study of memory for visual detail. Child Development, 58, 807-815.
Rajaram, S. (1993). Remembering and knowing: Two means of access to the personal past. Memory & Cognition, 21, 89-102.
Rankin, J. L., & Kausler, D. H. (1979). Adult age differences in false recognitions. Journal of Gerontology, 34, 58-65.
Read, J. D. (1996). From a passing thought to a false memory in 2 minutes: Confusing real and illusory events. Psychonomic Bulletin & Review, 3, 105-111.
Reyna, V. F., & Brainerd, C. J. (1995). Fuzzy trace theory: An interim synthesis. Learning and Individual Differences, 7, 1-75.
Roediger, H. L., III, & McDermott, K. B. (1995). Creating false memories: Remembering words not presented in lists. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 803-814.
Russell, W. A., & Jenkins, J. J. (1954). The complete Minnesota norms for responses to 100 words from the Kent-Rosanoff Word Association Test (Tech. Rep. No. 11). University of Minnesota, Department of Psychology.
Salthouse, T. A. (1985). Speed of behavior and its implications for cognition. In J. E. Birren & K. W. Schaie (Eds.), Handbook of the psychology of aging (3rd ed., pp. 400-426). New York: Van Nostrand Reinhold.
Salthouse, T. A. (1991). Theoretical perspectives on cognitive aging. Hillsdale, NJ: Erlbaum.
Salthouse, T. A. (1992). Mechanisms of age-cognition relations in adulthood. Hillsdale, NJ: Erlbaum.
Schacter, D. L., Koutstaal, W., Johnson, M. K., Gross, M. S., & Angell, K. E. (1997). False recollection induced by photographs: A comparison of older and younger adults. Psychology and Aging, 12, 203-215.
Schacter, D. L., Koutstaal, W., & Norman, K. A. (1997). False memories and aging. Trends in Cognitive Sciences, 1, 229-236.
Schacter, D. L., Norman, K. A., & Koutstaal, W. (1998). The cognitive neuroscience of constructive memory. Annual Review of Psychology, 49, 289-318. Palo Alto, CA: Annual Reviews.
Schacter, D. L., Osowiecki, D., Kaszniak, A. W., Kihlstrom, J. F., & Valdiserri, M. (1994). Source memory: Extending the boundaries of age-related deficits. Psychology and Aging, 9, 81-89.
Schacter, D. L., Savage, C. R., Alpert, N. M., Rauch, S. L., & Albert, M. S. (1996). The role of hippocampus and frontal cortex in age-related memory changes: A PET study. Neuroreport, 7, 1165-1169.
Shaw, T., Mortel, K., Meyer, J., Rogers, R., Hardenberg, J., & Cutaia, M. (1984). Cerebral blood flow changes in benign aging and cerebrovascular disease. Neurology, 34, 855-862.
Shimamura, A. P., Berry, J. M., Mangels, J. A., Rusting, C. L., & Jurica, P. J. (1995). Memory and cognitive abilities in university professors: Evidence for successful aging. Psychological Science, 6, 271-277.
Smith, C. S., Reilly, C., & Midkiff, K. (1989). Differential memory changes with age: Exact retrieval versus plausible inference. Journal of Experimental Psychology: Learning, Memory, and Cognition, 74, 728-738.
Snodgrass, J. G., & Vanderwart, M. (1980). A standardized set of 260 pictures: Norms for name agreement, image agreement, familiarity, and visual complexity. Journal of Experimental Psychology: Human Learning and Memory, 6, 174-215.
SovSikova, E., & BroiiiS, M. (1989, September). Short-term memory and attention changes related to biorhythm and microclimatic conditions. Paper presented at the meeting of the 25th Conference on Higher Nervous Functions, Olomouc, Slovakia.
Stadler, M. A., & Hogan, M. E. (1996). Varieties of positive and negative priming. Psychonomic Bulletin & Review, 3, 87-90.
Till, R. E., Bartlett, J. C., & Doyle, A. (1982). Age differences in picture memory with resemblance and discrimination tasks. Experimental Aging Research, 8, 179-184.
Tune, G. S. (1969). The influence of age and temperament on the adult human sleep-wakefulness pattern. British Journal of Psychology, 60, 431-441.
Warren, L. R., Butler, R. W., Katholi, C. R., & Halsey, J. H. (1985). Age differences in cerebral blood flow during rest and during mental activation measurements with and without monetary incentive. Journal of Gerontology, 40, 53-59.
Webb, W. B. (1982). Sleep in older persons: Sleep structures of 50- to 60-year-old men and women. Journal of Gerontology, 37, 581-586.
Zeno, S. M., Ivens, S. H., Millard, R. T., & Duvvuri, R. (1995). The educator's word frequency guide. Brewster, NY: Touchstone Applied Science Associates.
Appendix B
Names of the Pictures Studied in Experiment 3

List number
1         2            3            4          5              6
EAR       PEACH        CARROT       PANTS      POT            COW
arm       apple        lettuce      shirt      knife          dog
eye       orange       corn         sock       spoon          cat
finger    banana       potato       shoe       fork           horse
foot      pear         asparagus    blouse     stove          lion
hair      grapes       celery       skirt      bowl           tiger
hand      cherry       onion        coat       cup            elephant
heart     lemon        pepper       dress      frying pan     pig
leg       pineapple    artichoke    hat        refrigerator   bear
lips      strawberry   mushroom     sweater    glass          mouse
nose      watermelon   pumpkin      tie        toaster        deer
Note. The name of the critical picture lure is capitalized. From "A Standardized Set of 260 Pictures: Norms for Name Agreement, Image Agreement, Familiarity, and Visual Complexity," by J. G. Snodgrass and M. Vanderwart, 1980, Journal of Experimental Psychology: Human Learning and Memory, 6, pp. 178-180, 205-210. Copyright 1980 by the American Psychological Association. Adapted with permission of the authors.
Received June 30, 1997 Revision received May 8, 1998 Accepted May 15, 1998