Imageability Effects in Word Naming: An Individual Differences Analysis
EAMON STRAIN, Anglia Polytechnic University
CHRIS M. HERDMAN, Carleton University
Abstract The present research was designed to extend research by Strain, Patterson, and Seidenberg (1995) who found that imageability facilitates naming of low-frequency irregular words. Experiment 1 shows that the impact of imageability on word naming varies with phonological coding skill. In Experiment 2, the effect of imageability on naming low-frequency irregular words was shown to occur across an extended set of items. Together, the present findings support the notion that semantics may play a role in phonological coding when the mappings between orthography and phonology are weak.
Theorists have proposed a variety of models describing how readers translate printed words into sound. These models differ in many important respects including whether there are single (e.g., Plaut, McClelland, Seidenberg, & Patterson, 1996; Seidenberg & McClelland, 1989) or multiple (Coltheart, Curtis, Atkins, & Haller, 1993; Norris, 1994) routes from orthography to phonology, and whether semantics can be accessed directly from orthography or must be accessed via phonology (e.g., Van Orden, 1987). Despite these fundamental processing and architectural differences, most theorists assume that semantic information can influence phonological coding. It is surprising, therefore, that there has been very little research directly examining the role of semantics in phonological coding. Moreover, with the exception of evidence reported by Strain, Patterson, and Seidenberg (1995), research investigating the role of semantics on phonological coding has provided inconclusive results, with effects being either small (deGroot, 1989) or not statistically significant (Brown & Watson, 1987). The present study was designed to extend the findings of Strain et al. (1995). There are two ways in which semantics might influence the generation of a phonological representation. Firstly, from an early age individuals will have learned how to generate phonology on the basis of meaning, and, as they become experienced at reading, they may learn to access meaning directly from orthography. Therefore, it is possible
that the generation of a phonological representation may be achieved by the joint action of the orthography-to-phonology (O-P) mapping and the orthography-to-semantics-to-phonology (O-S-P) mapping. Secondly, interactions between phonology and semantics may be used to clean up ambiguous phonological representations produced on the basis of the O-P mapping.
In accord with the approach adopted in Strain et al. (1995), the present research is framed using a general connectionist model taken from Seidenberg and McClelland (1989; see also Plaut et al., 1996). As discussed below, this approach generates clear predictions concerning the role of semantics in phonological recoding. It should be noted, however, that although few other theorists have explicitly discussed a role for semantics in phonological recoding (but see Coltheart, 1987), the notion that semantics can impact on naming could be accommodated into most extant models of word recognition. Our intent, therefore, is not to differentiate specific models, or classes of models, but to provide further evidence concerning the role of semantics in phonological recoding.
Figure 1 illustrates a general connectionist framework for phonological recoding taken from Seidenberg and McClelland (1989; see also Plaut et al., 1996). Only the O-P component of this framework has been implemented in the networks developed so far. It has been demonstrated that with only this part of the overall model, the network is capable of reading essentially all monosyllabic words, as well as, in the most recent implementation, performing about as well as people on nonwords. However, it was not the intention in these models to imply that this direct route was the only way to get from orthography to phonology. For example, the division of labour between the O-P and the O-S-P pathways in generating phonology on the basis of spelling is an important aspect of the overall framework. Indeed, Plaut et al.
(1996) have used this notion to account for the various patterns seen in the reading behaviour of surface dyslexic patients. This framework leads to quite clear predictions about the conditions under which meaning will have the largest
Canadian Journal of Experimental Psychology, 1999, 53:4, 347-359
Figure 1. General framework for phonological recoding.
impact on naming, regardless of whether one assumes it acts directly or indirectly. In cases where the O-P mapping is efficient, there will not be enough time for semantics to play much of a role. This would be the case for all high-frequency words, both regular and exception, because the mappings for these words are well learned. In contrast, for low-frequency words, the O-P mapping is slower and more error-prone, and so there will be an opportunity for semantics to influence which phonological representation is settled on. This should be most pronounced for low-frequency exception words, which are not only infrequently encountered, but also have inconsistent mappings. Therefore, the connectionist framework developed by Plaut et al. (1996) would predict that although semantics can potentially affect the naming of all words, its effect will be most obvious on the naming of low-frequency words, and in particular low-frequency irregular words.
The influence of semantics on phonological recoding was examined by Strain et al. (1995) who required participants to name low-frequency words which varied in terms of regularity and imageability value. Imageability was chosen as a semantic variable because it has been found to have a significant effect on the naming ability of patients with severe phonological deficits and for whom naming is assumed to be mediated mainly or solely via semantics (e.g., Coltheart, Patterson, & Marshall, 1980; Funnell, 1987; Plaut & Shallice, 1993). Strain et al. found an interaction between imageability and regularity in which imageability facilitated naming, especially for low-frequency exception words. These findings support the notion that phonological recoding is influenced by semantics when the O-P mapping is inefficient, such as during processing of low-frequency exception words.
The present research was designed to extend the findings of Strain et al. (1995). In Experiment 1, the low-frequency items from Strain et al.
were used to examine the role of imageability with readers who varied in phonological reading ability. In Experiment 2, a regression approach was used to assess the impact of imageability across an extended set of items.
Experiment 1
Experiment 1 was designed to investigate the role that semantics plays in phonological coding with participants who vary in phonological reading ability. In a connectionist framework, O-P mappings are assumed to be less efficient in low-skill as compared to highly skilled readers. Insofar as low-skill readers are inefficient at this direct mapping process, they should show a greater tendency to be affected by the semantic attributes of words. The notion that low-skill readers will show a greater influence of semantics on phonological recoding is not only intrinsic to the connectionist framework shown in Figure 1, but also entirely consistent with suggestions made by Stanovich (1980). In Stanovich's interactive compensatory model, processes at any one level can compensate for deficiencies in processing at any other level. Because less-skilled readers are inefficient at mapping orthography onto phonology, they are more likely to make compensatory use of semantic information in deriving a word's phonology.
In accord with our framework and with the Stanovich (1980) model, we hypothesized that the impact of imageability on naming performance will vary with phonological reading ability. For participants who score high in phonological reading ability, the effect of imageability should be seen primarily on naming of low-frequency exception words where the O-P mappings are not strong. For participants who score low in phonological reading ability, the mapping of orthography onto phonology is presumably inefficient for all types of low-frequency words. Accordingly, for these participants we predicted that imageability would affect naming of both exception and regular words.
METHOD
Participants. The participants were 90 students (51 female and 39 male) attending Carleton University, who were given course credit for volunteering for the study. Previous research at Carleton indicated that it would be possible to obtain participants with a wide range of reading skills from this sample. Participants ranged from 19 to 48 years of age.
Assessment of phonological skill. The Word Attack and Sound Blending subscales of the Woodcock-Johnson reading test were used to classify participants as high versus low phonological skill. In Word Attack, participants are required to name aloud a set of printed nonwords (e.g., nat, haunted). In Sound Blending, participants listen to a tape on which words are presented broken down into their constituent sounds by introducing one-second delays. Each word is followed by a cue, and the participants' task is to combine the sounds to produce the word. The items are presented in increasing difficulty from easy items such as app • ell up to more difficult items such as t-e-l-e-v-i-zh-e-n. The participants' percentile scores on the two subscales were averaged, and then ranked from lowest to highest. The 31 participants with the highest combined percentile scores were categorized as high phonological skill, and the 30
lowest scoring participants were coded as low phonological skill. There were 31 participants in the high-skill group because of a tie at the lowest accepted percentile. The remaining 29 participants (i.e., those with intermediate scores) were categorized as medium phonological skill.
Stimuli. The stimulus set consisted of 16 quartets of high-imageability and low-imageability regular and exception items, closely matched on initial phoneme, number of letters, log Kucera & Francis (1967) frequency, and positional bigram frequency (Solso & Juel, 1980). This is the same set of items that was used in Experiments 2 and 3 of Strain et al. (1995); further details about the selection and composition of the lists may be found in that paper. Participants named these 64 words preceded by three starter items. The four different word types were spread evenly throughout each half of the list. Although participants named all 64 items, because of differences between the samples, certain of the items in the original set had to be removed from the analyses. For example, although the word "wrath" is an exception in British English it is not in Canadian English. In all, six of the original words were problematic for such reasons and were removed, along with their matched items, leaving ten items in each of the experimental conditions. This affected the matching slightly, but not seriously. The matching statistics for the four word types are shown in Table 1, and the words themselves can be found in Appendix A.
Apparatus. The stimuli were presented in 12-point font in the centre of a monochrome computer screen, placed approximately 50 cm from the participant. Naming latencies were detected using a headset with built-in microphone connected to a voice key. Trials were self-paced with participants pushing the centre key on a three-button response panel to move on to the following trial.
Procedure. Participants were tested one at a time in a quiet room. They were given instructions explaining that their task was to name the words aloud as quickly and accurately as they could. The instructions were followed by a block of 20 practice trials, and then the experimental trials. Within each trial, a fixation point (an asterisk) was presented at the centre of the screen until the participant initiated the trial by pushing the start button. When the start button was pushed, the fixation point disappeared immediately. Following a 250 ms delay, the word to be named appeared. As soon as a naming response was detected, the word was removed from the screen and the cycle was repeated. The words were presented in different random orders for each participant. The experimenter recorded mispronunciations and voice-key errors by hand during the experiment.

TABLE 1
Matching Statistics for the Four Word Types Used in Experiment 1

                                  Regular                                   Exception
                                  Concrete             Abstract             Concrete             Abstract
Frequency* (SD)                   4.7 (2.75)           5.8 (2.99)           7.5 (4.1)            6.0 (5.35)
Positional Bigram Frequency (SD)  5,009.5 (3,090.84)   4,414.0 (3,436.1)    4,606.3 (2,107.63)   4,875.2 (2,366.22)
Number of Letters (SD)            5.8 (1.03)           5.9 (1.2)            5.4 (1.35)           6.2 (1.55)

*Kucera & Francis (1967)
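The skill-group assignment described under Assessment of phonological skill can be sketched in code. This is a minimal illustration, not the authors' procedure: the function name, the input layout, the invented scores, and the rule of admitting every participant tied at the high-skill cut-off are all assumptions made for the example, chosen so that a single tie yields 31 high-skill participants as reported.

```python
from statistics import mean


def assign_skill_groups(percentiles, n_extreme=30):
    """Label each participant 'high', 'medium', or 'low' phonological skill.

    percentiles maps a participant id to a (word_attack, sound_blending)
    pair of percentile scores. This layout is an assumption of the sketch.
    """
    # Average the two subscale percentiles for each participant.
    combined = {pid: mean(pair) for pid, pair in percentiles.items()}
    # Rank participants from lowest to highest combined score.
    ranked = sorted(combined, key=combined.get)

    low = set(ranked[:n_extreme])
    # Lowest score that still qualifies for the high group; anyone tied
    # with it is admitted as well (assumed tie rule).
    cutoff = combined[ranked[-n_extreme]]

    groups = {}
    for pid in ranked:
        if pid in low:
            groups[pid] = "low"
        elif combined[pid] >= cutoff:
            groups[pid] = "high"
        else:
            groups[pid] = "medium"
    return groups


# Hypothetical scores for 90 participants, with one tie at the cut-off.
scores = {pid: (pid, pid) for pid in range(90)}
scores[59] = (60, 60)
groups = assign_skill_groups(scores)
counts = {g: list(groups.values()).count(g) for g in ("high", "medium", "low")}
print(counts)
```

With these invented scores the tie at the cut-off pushes one extra participant into the high-skill group, which leaves 29 participants in the medium-skill group.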
RESULTS
In the subsequent analyses, both subjects (Fs) and items (Fi) means were used as units of analysis with alpha set at .05 unless otherwise indicated. Two variables were included in these analyses: imageability (high vs. low), and regularity (regular vs. exception). Both variables were treated as repeated measures in the analyses by subjects and as between-items in the analyses by items. A total of 6.0% of the data (216 responses) were excluded from the RT analysis. Of these, 0.8% (28 values) were removed because of voice key errors, including any responses slower than 3,000 ms, or faster than 300 ms. A further 5.2% (188 responses) of the RT data were removed because of mistakes in naming the target word. Finally, the average response times for each type of word in the design were calculated for each participant, and any responses which were more than two SDs greater or less than the individual's mean were replaced in the analysis by the cut-off value. This accounted for 2.8% of the data (101 values).
Latency analysis: Full group. The ANOVA carried out on the RT data from the full group of 90 participants showed that naming was faster to high- (861 ms) than to low-imageability words (923 ms), Fs(1, 89) = 93.5, MSE = 3,726.6; Fi(1, 36) = 8.0, MSE = 5,566.6, and to regular (880 ms) than to exception words (904 ms), significant only by subjects, Fs(1, 89) = 11.6, MSE = 4,449.1. As shown in Figure 2a, there was a significant interaction between these variables in the by-subjects analysis only, Fs(1, 89) = 9.1, MSE = 3,020.0. Planned comparisons show that there was a larger effect of imageability for exception words, Fs(1, 89) = 94.6, MSE = 3,020.0; Fi(1, 36) = 7.8, MSE = 5,566.6, than for regular words (significant only by subjects, Fs(1, 89) = 29.9, MSE = 3,020.0).
High-skill group.
The 31 high-skill participants were also faster at naming high- (826 ms) than low-imageability words (884 ms), Fs(1, 30) = 33.3, MSE = 3,131.6; Fi(1, 36) = 6.1, MSE = 5,434.6, and they named regular words (832 ms) faster than exception words (878 ms), Fs(1, 30) = 15.1, MSE = 4,319.5; Fi(1, 36) = 4.5, MSE = 5,434.6. The interaction between these variables was also significant in the by-subjects analysis only, Fs(1, 30) = 6.9, MSE = 2,516.4. As shown in Figure 2b, planned comparisons indicate that the cause of this interaction was a larger effect of imageability on exception words, Fs(1, 30) = 41.0, MSE = 2,516.4; Fi(1, 36) = 6.6, MSE = 5,434.6, than on regular words (significant only by subjects, Fs(1, 30) = 7.3, MSE = 2,516.4).

Figure 2. Experiment 1. Mean naming latencies. (Panels a-d; horizontal axis: Imageability.)

Medium-skill group. The 29 medium-skill participants were also faster at naming high- (873 ms) than low-imageability words (920 ms), Fs(1, 28) = 16.8, MSE = 3,791.1; Fi(1, 36) = 6.2, MSE = 4,766, and they named regular words (886 ms) faster than exception words (908 ms) (significant only by subjects, Fs(1, 28) = 5.5, MSE = 2,536). The interaction between these variables approached significance in the by-subjects analysis, Fs(1, 28) = 3.1, MSE = 1,655.9, p = .09. Planned comparisons indicate that there was a larger effect of imageability on exception words, Fs(1, 28) = 31.7, MSE = 1,655.9; Fi(1, 36) = 6.9, MSE = 4,766, than for regular words (significant only by subjects, Fs(1, 28) = 9.8, MSE = 1,655.9) (see Figure 2c).
Low-skill group. The low-skill group only showed a main effect of imageability, with high-imageability words (886 ms) being named faster than low-imageability words (967 ms), Fs(1, 29) = 51, MSE = 3,900.3; Fi(1, 36) = %.5, MSE = 10,023.4. There was no overall effect of regularity and no interaction between regularity and imageability (see Figure 2d). The absence of a regularity effect in this group appears to contradict previous research indicating that, if anything, poorer readers tend to produce larger regularity effects (e.g., Seidenberg, 1985). As noted below, however, the low-skill group showed a very strong regularity effect in the error data. Accordingly, we attribute the absence of a regularity effect in the latency data to a speed-accuracy trade-off.
Summary. The latency data from both the full group of participants and the high-skill subset replicate the key finding of Strain et al.
(1995): an interaction between regularity and imageability, with low-frequency exception words being more strongly affected by imageability than low-frequency regular words. This interaction is consistent
with the notion that semantics will influence phonological coding when the O-P connections are weak, as is the case with low-frequency exception words. As predicted, the low-skill participants produced a different pattern of results in that the strong effect of imageability was undifferentiated across regularity. This supports the hypothesis that for our low-skill participants, the O-P mappings for both exception and regular words are inefficient. The medium-skill group produced intermediate results, in that although the interaction between regularity and imageability was not significant, there is evidence in the planned comparisons that imageability had a greater effect on naming of exception words.
ERROR ANALYSIS
Errors were categorized into four types. The most common type were LARC errors (short for Legitimate Alternative Reading of Components, a term coined by Patterson, Suzuki, Wydell, & Sasanuma, 1995); these are responses where the participant's pronunciation of a component of the word (typically the vowel) is a legitimate pronunciation of that segment in other words in the English vocabulary but is inappropriate to this particular target word. The majority of LARC errors were exact regularizations (e.g., dread -> "dreed"), but two other types of responses included in this category were alternative pronunciations appropriate to a different exception word with the same spelling pattern (e.g., comb pronounced to rhyme with "tomb"), and pronunciations that split a monosyllabic word into two syllables (e.g., meant pronounced "me-ant"). There were 99 responses (52.5% of the total errors) of this type. Visual-phonological word (VPW) errors were those in which the participant produced a different word that was visually and/or phonologically similar to the target (e.g., dread -> "dead"). There were a total of 58 (31%) of these errors. Visual-phonological nonword (VPNW) errors were those in which the participant produced a nonword that was visually and/or phonologically similar to the target (e.g., dread -> "drend"), and participants produced 30 such responses (16%). The fourth category, Other, contained any other incorrect response, for example, if the participant stuttered or failed to make a response. This category made up less than 1% of the total errors.
All errors. Three two-way ANOVAs, one for each subject group, were carried out on the error data, with total number of errors as the dependent variable, and regularity and imageability as the independent variables. In all three analyses, there was a significant interaction between imageability and regularity (Full group: Fs(1, 89) = 38.7, MSE = 0.45; Fi(1, 36) = 4.6, MSE = 75.4; High-skill group: Fs(1, 30) = 11.7, MSE = 0.40; Fi(1, 36) = 3.1, MSE = 4.7, p = 0.09; Medium-skill group: Fs(1, 28) = 45.5, MSE = 0.42; Fi(1, 36) = 5.3, MSE = 10.4; Low-skill group: Fs(1, 29) = 41.6, MSE = 0.44; Fi(1, 36) = 4.4, MSE = 12.5). As Figures 3a-3d show, the source of this interaction in all cases was a strong imageability effect for exception words in the absence of any such effect for regular words.

Figure 3. Experiment 1. Mean percent errors. (Panels a-d; horizontal axis: Imageability.)

Error-type analysis. A second set of ANOVAs was carried out on the exception word error data, this time including a new factor, "error type." This was a within-factor both by subjects and by items, and contained two levels, LARC errors vs. All Other errors. Only the responses to exception words were analyzed since LARC errors are largely confined to this type of item. Consequently, regularity was not included as a variable but the imageability factor was retained. The results from the error-type analyses are included here for completeness but should be treated with caution, both because of the small number of items in each word condition and the relatively small number of errors produced overall. The error-type ANOVAs showed that participants from all
three groups made more errors in response to low-imageability exceptions than to high-imageability exceptions (Full group: Fs(1, 89) = 89.6, MSE = 0.37; Fi(1, 18) = 4.3, MSE = 70.7; High-skill group: Fs(1, 30) = 13.0, MSE = 0.27; Fi(1, 18) = 2.5, MSE = 4.4, p = 0.1; Medium-skill group: Fs(1, 28) = 41.4, MSE = 0.37; Fi(1, 18) = 4.5, MSE = 9.8; Low-skill group: Fs(1, 29) = 47.1, MSE = 0.39; Fi(1, 18) = 4.8, MSE = 11.5). The by-subjects analyses also indicated that the participants produced significantly more LARC errors than other error types, in all but the medium-skill group (Full group: Fs(1, 89) = 8.4, MSE = 0.38; High-skill group: Fs(1, 30) = 7.6, MSE = 0.18; Low-skill group: Fs(1, 29) = 5.5, MSE = 0.34), although this was not significant in any of the by-items analyses. The relevant percentage error rates are presented in Table 2.
COMPARISON OF SKILL-GROUPS
The previous set of analyses confirms our prediction that as phonological skill decreases, reliance on semantic information (as indexed by imageability effects) increases in reading aloud, to the detriment of reliance on more direct phonological recoding (as indexed by regularity effects). However, although the preceding analyses clarify this progression in terms of reliance on imageability in naming, they do not allow a more direct comparison between the skill groups. This is required in order to test our prediction that the magnitude of the imageability effect will increase along with decreasing phonological skill. Specifically, we would predict that imageability effects will increase in magnitude as reading skill declines (a) between high-skill and medium-skill readers, and (b) between medium-skill and low-skill readers. Three variables were included in the following latency and error analyses: imageability (high vs. low), regularity (regular vs. exception), and reading skill (either high vs. medium, or medium vs. low). The regularity and imageability variables were treated as repeated measures in the analyses by subjects and as between-items in the analyses by items; the skill variable was treated as between-subjects in the analyses by subjects and repeated measures in the analyses by items.

TABLE 2
Percentage of LARC and Other Errors Produced by the Different Skill Groups

Error Type   Imageability   Full Group   High Skill   Medium Skill   Low Skill
LARC         High           2.4          2.3          3.1            2
             Low            8.4          5.8          8.6            11
Other        High           0.44         0.32         0.69           0.7
             Low            6.7          3.5          9.3            7.3

High-skill vs. medium-skill group. For the analyses of latencies, there are main effects of imageability, Fs(1, 58) = 47.8, MSE = 3,450; Fi(1, 36) = 6.7, MSE = 9,233.5, regularity, Fs(1, 58) = 20, MSE = 3,458.5; non-significant by items, and a significant interaction between these variables, Fs(1, 58) = 9.7, MSE = 2,101; non-significant by items, reflecting the results described in the earlier separate skill analyses. There was also an interaction between reading skill and regularity, significant only by items, Fi(1, 36) = 4.9, MSE = 967.1, reflecting the greater effect of regularity in the latency data for the high-skill group in comparison to the medium-skill group. There were no other significant effects in the latency data produced by the medium- and high-skill groups, and in particular, there were no significant interactions between imageability and skill.
There is strong evidence that a speed-accuracy trade-off is affecting the outcome of the latency analysis. Rather than the medium-skill readers simply being slower than the high-skill readers at naming low-imageability exception words, they instead make more mistakes on this particular type of item. A comparison of the error patterns illustrated in Figures 3b and 3c shows that the medium-skill group produces almost twice as many errors on low-imageability exception words as does the high-skill group. Since only RTs for correct responses were included in the latency analysis, this eliminated those items which would cause the longest RTs in the low-imageability condition, thereby greatly reducing the apparent size of the imageability effect for the medium-skill group.
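The reasoning behind this speed-accuracy trade-off can be illustrated with a toy calculation. The latencies and accuracy flags below are invented purely for illustration and are not data from the experiment, and `mean_correct_rt` is a hypothetical helper. The sketch assumes that the slowest items are the ones that end up being named incorrectly, and shows that averaging only the correct responses then lowers the condition mean.

```python
from statistics import mean


def mean_correct_rt(trials):
    """Mean latency (ms) over the trials that were named correctly.

    trials is a list of (latency_ms, correct) pairs.
    """
    return mean(rt for rt, correct in trials if correct)


# Invented latencies for one condition; the two slowest items are errors.
trials = [(900, True), (950, True), (1000, True), (1150, False), (1200, False)]

all_trials_mean = mean(rt for rt, _ in trials)
correct_only_mean = mean_correct_rt(trials)
print(all_trials_mean, correct_only_mean)
```

In this invented case the correct-only mean is 90 ms lower than the mean over all trials, so a group that makes more errors in the hard condition would show a smaller latency difference between conditions.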
In order to demonstrate the increase in the magnitude of the imageability effect as skill declines between high- and medium-skill readers, it is therefore necessary to look at the error data produced by these two groups. The error analyses showed main effects of skill, imageability (borderline by items), regularity, and significant two-way interactions between each of these variables also. However, of most interest is the significant three-way interaction between skill, regularity, and imageability, Fs(1, 58) = 47.8, MSE = 3,450; Fi(1, 36) = 6.7, MSE = 9,233.5. As illustrated in Figure 4, there was (a) no imageability effect for regular words, (b) an imageability effect for exception words, and (c) an imageability effect for exception words that is greater for medium-skill than for high-skill readers. In sum, the error analysis confirms that the magnitude of the imageability effect increases as phonological skill decreases.

Figure 4. Experiment 1. Mean percent errors for high-skill vs. medium-skill group. "Med" refers to medium-skill group. "High" refers to high-skill group. (Horizontal axis: Imageability, High vs. Low.)

Medium-skill vs. low-skill group. Analysis of the latency data showed that the low- and medium-skill participants were faster in naming high-imageability words (880 ms) than low-imageability words (944 ms), Fs(1, 57) = 63.1, MSE = 3,846.7; Fi(1, 36) = BA, MSE = 12,821.7. Of particular interest, the interaction between reading skill and imageability was significant, albeit borderline in the items analysis, Fs(1, 57) = 4.6, MSE = 3,846.7; Fi(1, 36) = 3.7, MSE = 1,967.7, p = .06. This interaction reflects the greater effect of imageability in the latency data for the low-skill group in comparison to the medium-skill group. This confirms our prediction that imageability effects increase in magnitude with decreasing phonological skill. The only other significant effect in the latency data produced by the medium- and low-skill groups was a main effect for skill in the by-items analysis only, with low-skill readers producing longer latencies (933 ms) than medium-skill readers (898 ms;
Fi(1, 36) = 8.4, MSE = 12,821.7). The error data showed main effects for imageability, Fs(1, 57) = 59, MSE = .505; Fi(1, 36) = 4.1, MSE = 21.6, regularity, Fs(1, 57) = 86.8, MSE = .53; Fi(1, 36) = 6.2, MSE = 21.6, and a significant interaction, Fs(1, 57) = 87, MSE = .431; Fi(1, 36) = 5.1, MSE = 21.6, between these variables, reflecting the results described in the earlier separate skill analyses. None of the effects involving skill were significant.
DISCUSSION
The results from Experiment 1 replicate the findings of Strain et al. (1995) by showing an interaction between regularity and imageability in word naming. Looking at the full data set (all 90 subjects), imageability has a significantly greater effect on the naming of low-frequency exception words compared with low-frequency regular words, both in naming latencies and in error rate, suggesting a greater influence of semantics in reading these words. Of primary interest was the finding that phonological skill modulated the impact of semantic information on naming words. Participants scoring relatively high on tests of phonological skill replicated the Strain et al. finding of an interaction between regularity and imageability, with imageability having a greater effect on the naming of exception words than on regular words. Medium-skill readers showed more reliance on imageability, in that this variable had a greater impact in the naming of regular words. For this group the impact of imageability on naming was not as strong for regular as for exception words. Finally, low-skill readers showed equally strong effects of imageability on the naming of regular and exception words. This pattern of results supports the notion that reliance on semantic information increases as phonological skill declines.
The combined skill analysis confirmed that the magnitude of the imageability effect increased with decreasing phonological skill; this is seen in the error analysis when comparing high- to medium-skill readers, and in the latency analysis when comparing medium- to low-skill readers.
Experiment 2
The items used in Experiment 1 were a subset of those used in Strain et al. (1995), which leaves open the possibility that the observed effects may be dependent upon the particular items used in both studies. Also, in the Strain et al. set of items, two factors restricted the selection of words. The most severe limitation was that because a factorial design was used, items had to be coded as high versus low imageability. This meant that items with mid-range imageability ratings could not be used. The second limiting factor was the need to match quartets of items on a wide range of factors, namely initial phoneme, frequency, length, and positional bigram frequency. This has led to the stimuli used in the previous experiment being somewhat atypical, in that unlike the stimuli used in most word recognition studies they tend to be quite long (an average of around 6 letters compared to 4 or 5 usually), and many of them are multisyllabic. This might mean that these stimuli are rather more difficult than the stimuli used in other studies. It is important to show that imageability plays a role in the naming of easier stimuli also. This issue is addressed in Experiment 2 by confining the stimuli to one syllable of an average length of 4.6 letters. Moreover, since imageability is actually a continuous variable, information was lost previously by dichotomizing it into high and low bands. The present experiment was designed to transcend the limitations inherent in the Strain et al. item set by developing a new, larger set of items, and by analyzing the data using a more powerful regression technique in which imageability was treated as a continuous variable.
METHOD
Participants. The data were collected from 24 members of the APU volunteer panel who were paid for their participation. The participants were aged between 19 and 45 years, and about two-thirds were female.
Materials. The experimental stimuli were selected from a pool of 160 monosyllabic words for which imageability ratings had been collected from 40 participants in a paper and pencil test. In the imageability rating task, the 160 words were divided into two sets of 80 items, and each of the participants rated one of these two sets, each of which was presented in one of two random orders. Therefore 20 different individuals rated each item. The participants were given a set of written instructions, based on those used by Toglia and Battig (1978), in which they were told to rate the items from one to seven, one representing the low end of the imageability scale and seven representing the high end of the scale. In the case of ambiguous words (e.g., well), participants were instructed to rate the most familiar meaning rather than attempting to find some compromise value.
It was also stressed that any sensory experience counted in determining imageability, not just visual sensations. The obtained imageability ratings were then averaged and used in generating the experimental set of items. In this process, 10 items from the original list of 160 had to be removed in order to maintain the overall matching between the regular and exception items once the imageability ratings had been collected. In generating the stimulus list, items were first categorized as exception or regular. A word was classified as an exception if its pronunciation was inconsistent with GPC rules (Venezky, 1970). A further criterion concerning consistency was also applied: we excluded from the exception set any word belonging to an orthographic body neighbourhood in which the pronunciation of a large majority of the members conflicts with GPC rules. Specifically, the summed frequency of the words pronounced similarly to any exception word could not be more than 30 occurrences per million greater than the summed frequency of the items pronounced dissimilarly. Exception words with no body neighbours (i.e., "unique" words) were only included if they had a Coltheart's N of at least two, to ensure that they were not very "orthographically strange." As in the first experiment, a word was classified as regular if both (a) its pronunciation was consistent with GPC rules, and (b) it belonged to a consistent orthographic neighbourhood. Each regular word was matched with an exception word in terms of initial phoneme, and regular and exception words were matched as groups on length, imageability, CELEX written frequency, Coltheart's N, and positional bigram frequency (PBF). In all, there were 75 pairs of regular and exception items in the stimulus set. The characteristics of the items are shown in Table 3, and the items themselves can be found in Appendix B. Participants in the experiment named this complete set of 150 words in a series of four blocks (separated by brief rest periods). Regular and exception words were spread evenly through the blocks, and each block was matched for mean CELEX frequency, PBF, N, and imageability. Each of the experimental blocks began with either two or three starter items (low-frequency regular and exception items), so that there were 40 items to be named per block. In addition to the experimental items, 15 low-frequency regular and exception words were selected and used as practice items.

TABLE 3
Characteristics of Items Used in Experiment 2 (standard deviations in parentheses)
Item type | Frequency (CELEX) | Imageability | PBF Score | Coltheart's N | No. of Letters
Regular | 145 (127) | 462 (142) | 4217 (2,940) | 7.8 (6.1) | 4.6 (.77)
Exception | 133 (120.6) | 469 (152) | 4223 (3,603) | 7.9 (6.7) | 4.6 (.87)

Apparatus. The stimuli were presented in the centre of a 14" monochrome screen, using PsyScope (Cohen, MacWhinney, Flatt, & Provost, 1993) running on an Apple Macintosh IIci computer. The words were presented in black, lowercase print, New York 24 point font on a white screen placed approximately 60 cm from the participant. Naming responses were recorded using a headset microphone connected to the voice-key port of a CMU button box (see Cohen et al., 1993 for details). The button box was interfaced to the computer, allowing it to time response latencies in milliseconds. The microphone was also connected to a tape recorder, which recorded the participants' responses for later analysis.
Procedure. Participants were tested one at a time in a quiet room. They were given written instructions (on the computer screen) explaining that their task was to name the words aloud as quickly and accurately as they could. The instructions were followed by a block of 15 practice trials, and then the four experimental blocks. The intertrial interval in both practice and experimental blocks was 500 ms. Within each trial, participants first saw a fixation point, which remained on the screen for 750 ms. Immediately at the offset of the fixation point, the word to be named appeared. As soon as the participant named the word, it disappeared, and the cycle was repeated after the intertrial interval. The words within each block were presented in a different random order for each participant, and the order of presentation of the four experimental blocks was randomly counterbalanced between participants. At the end of each block, the participants could rest for as long as they wished before starting the next block by pressing any key. The experimenter recorded mispronunciations and voice-key errors by hand during the experiment, and these were checked against the tape recordings afterwards. At the end of each session, participants were shown the items on which they had made errors during the experiment, mixed in with an equal number of items that they had named correctly. They were asked to read these words carefully, taking their time, and ensuring they got them right. They were also told to indicate any words with which they were unfamiliar. This was to ensure that the participants knew the correct pronunciations of all the items in the experiment.
RESULTS
In the following analyses, RTs and errors were pooled over participants for each word. A total of 6.1% of the data (219 responses) were excluded from the RT analysis.
Of these, 1.5% (55 values) were removed because of voice-key errors, including any responses slower than 1,500 ms or faster than 300 ms (see Footnote 1). A further 3.9% (140 responses) of the RT data were removed because of mistakes in naming the target word. Finally, less than 1% (24 responses) of the data were removed because participants mispronounced the target during the experiment and, when questioned later, indicated that they did not know the correct pronunciation of the word.
Footnote 1. The upper cut-off value is lower in this experiment than that applied in Experiment 1 because of the different characteristics of the data in the two experiments. The participants' response latencies were much shorter in Experiment 2, and variability in the data was much lower, supporting our view that this sample of participants is more highly skilled than any of the groups in the previous experiment.
Following Treiman, Mullennix, Bijeljac-Babic, and Richmond-Welty (1995), the initial phoneme of each target word was coded in terms of 10 binary variables. There was one variable for voicing, with voiced initial phonemes coded as 1, and voiceless initial phonemes coded as 0. Manner of articulation was coded using three dummy variables (nasal versus other, fricative versus other, and liquid/semi-vowel versus other). Place of articulation was coded using a further six dummy variables (bilabial versus other, labiodental versus other, palatal versus other, alveolar versus other, velar versus other, and glottal versus other). The measure of word frequency used was CELEX written frequency, which was the number of times that the printed word occurred in a sample of approximately 16.6 million words of text. The advantages of this measure of frequency were the size of the corpus, and the fact that it is based on British sources, making it more appropriate for our sample. Positional bigram frequency (PBF) was calculated using the MRC Psycholinguistics Database (Coltheart, 1981), and was based on all 150,837 items in the database. The imageability ratings used were those collected as described in the Method section. Word length was simply the number of letters in the target word. Regularity was coded as a dummy variable, with regular words coded as 1 and exception words coded as 0. The regression model included all 15 of these variables plus a term for the interaction between regularity and imageability. The distributions of each of the continuous independent variables were plotted, and transformations were applied as required to reduce skew and kurtosis and to improve normality. CELEX written frequency was log transformed, and both imageability and positional bigram frequency were square root transformed. Following the recommendations of Aitken and West (1991), all continuous variables were centred before being entered into the regression model.
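The predictor coding just described can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' analysis script: the phoneme feature table covers only a few phonemes, the item values are made up (only the words themselves come from Appendix B), the natural logarithm is an arbitrary choice because the base of the log transform is not reported, and the names code_initial_phoneme, centre, and build_predictors are hypothetical.

```python
import math

MANNERS = ("nasal", "fricative", "liquid_semivowel")
PLACES = ("bilabial", "labiodental", "palatal", "alveolar", "velar", "glottal")

# Hypothetical feature table covering a few initial phonemes only:
# phoneme -> (voiced, manner, place). "stop" is the reference manner
# category, so it scores 0 on all three manner dummies.
PHONEME_FEATURES = {
    "p": (0, "stop", "bilabial"),
    "b": (1, "stop", "bilabial"),
    "m": (1, "nasal", "bilabial"),
    "f": (0, "fricative", "labiodental"),
    "s": (0, "fricative", "alveolar"),
    "l": (1, "liquid_semivowel", "alveolar"),
    "k": (0, "stop", "velar"),
    "h": (0, "fricative", "glottal"),
}


def code_initial_phoneme(phoneme):
    """Return the 10 binary variables: 1 voicing, 3 manner, 6 place."""
    voiced, manner, place = PHONEME_FEATURES[phoneme]
    return ([voiced]
            + [int(manner == m) for m in MANNERS]
            + [int(place == p) for p in PLACES])


def centre(values):
    """Subtract the mean so that the variable has a mean of zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def build_predictors(items):
    """Return one 16-column predictor row per item."""
    # Transform first, then centre (the log base is an assumption).
    log_freq = centre([math.log(it["freq"]) for it in items])
    length = centre([float(it["letters"]) for it in items])
    sqrt_pbf = centre([math.sqrt(it["pbf"]) for it in items])
    sqrt_imag = centre([math.sqrt(it["imageability"]) for it in items])
    rows = []
    for i, it in enumerate(items):
        regular = it["regular"]  # 1 = regular, 0 = exception
        rows.append(code_initial_phoneme(it["phoneme"])
                    + [log_freq[i], length[i], sqrt_pbf[i], sqrt_imag[i],
                       regular, sqrt_imag[i] * regular])
    return rows


# Made-up values for illustration only.
ITEMS = [
    {"word": "pint", "phoneme": "p", "freq": 120, "pbf": 4100,
     "imageability": 560, "letters": 4, "regular": 0},
    {"word": "plug", "phoneme": "p", "freq": 90, "pbf": 3900,
     "imageability": 580, "letters": 4, "regular": 1},
    {"word": "sew", "phoneme": "s", "freq": 150, "pbf": 2500,
     "imageability": 450, "letters": 3, "regular": 0},
    {"word": "sock", "phoneme": "s", "freq": 110, "pbf": 4800,
     "imageability": 600, "letters": 4, "regular": 1},
]

for item, row in zip(ITEMS, build_predictors(ITEMS)):
    print(item["word"], [round(x, 2) for x in row])
```

Because regularity is coded 0 for exception words, the product column is zero for those items. In a model of this form, the imageability coefficient therefore describes the imageability slope for exception words, and the interaction coefficient describes how much that slope changes for regular words.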
Centring of variables aids in the interpretation of interactions by reducing collinearity effects.
Regression analyses: Naming latency. The following regression analyses used unique sums of squares. All of the variables entered the regression at once; each independent variable is evaluated in terms of what it adds to the prediction that is not also predicted by the other variables. The outcome of the regression analysis carried out on naming latencies is shown in Table 4.

TABLE 4
Outcome of Regression Analysis Carried Out on the Latency Data of Experiment 2
Variable | β | t | p | % of unique variance (sr²)
Initial Phoneme | | | | 14.51*
Voice | -30.81 | -5.94 | < .001 |
Nasal | 14.95 | 1.26 | .210 |
Fricative | 16.63 | 2.01 | < .050 |
Liquid | -12.90 | -1.03 | .303 |
Bilabial | -35.24 | -3.59 | < .001 |
Labiodental | -29.39 | -2.96 | < .005 |
Palatal | 13.75 | 1.37 | .171 |
Alveolar | -13.47 | -1.91 | .058 |
Velar | -21.54 | -2.66 | < .010 |
Glottal | -18.41 | -0.97 | .332 |
Word Frequency | -15.48 | -3.78 | < .001 | 2.7
Word Length | 12.91 | 5.05 | < .001 | 4.8
Positional Bigram Frequency | 0.05 | 0.57 | .567 |
Imageability | -3.48 | -4.74 | < .001 | 4.2
Regularity | -15.59 | -4.24 | < .001 | 3.4
Imageability x Regularity | 3.10 | 2.95 | < .005 | 1.6
Note: Total percentage of variance accounted for = 75% (unique variability = 31%; shared variability = 44%). *Total for initial phoneme. For each of the independent variables, the value of the standardized regression coefficient, beta, is given along with the corresponding t and p values. The percentage of unique variance accounted for by those variables that produced significant effects is also shown.

The percentage of variance accounted for by the regression model was 75% (72% adjusted: F(16, 133) = 24.9, MSE = 505.87, p < .0005). The largest proportion of unique variance was accounted for by initial phoneme; voiced initial phonemes triggered the voice key faster than unvoiced; fricatives activated the voice key later than other phonemes; and bilabials, labiodentals, velars, and alveolars led to faster responses than phonemes with alternative places of articulation. Word length accounted for the next largest portion of unique variance, with shorter words being named faster than longer items. Word frequency was also related to RT, with participants responding faster to high-frequency words than to low-frequency words. With respect to the variables of most interest in the present study, imageability accounted for the largest portion of unique variance, with high-imageability words producing the shortest latencies. Regular words were named faster than exception words. Importantly, imageability and regularity interacted. As recommended by Aitken and West (1991), the interaction was plotted using the regression equation, and is shown in Figure 5. This graph shows the mean RT of the regular and exception words at high and low imageability when the other predictor variables are at their mean values. Analysis of the simple slopes reveals that imageability significantly predicts RT for exception words, t(133) = -4.7, p < 0.001, but not for regular words, t(133) = -0.5. Using the test for differences between the regression lines described in Aitken and West (1991; pp. 132-133), the 27-ms difference between regular and exception word RTs at low imageability is significant, t(133) = 5.0, p < 0.001, whereas the 4-ms difference between the regular and exception items at high imageability is not, t(133) = -0.8. The RT analysis of this large set of items therefore supports our prediction that imageability has a greater effect on the naming of low-frequency exception words than low-frequency regular words, replicating the findings of Strain et al. (1995) and those of the full data set and the high phonological skill subgroup in Experiment 1 of the present research.

Figure 5. Experiment 2. Mean latencies of the regular and exception words at high- and low-imageability values. (Horizontal axis: imageability, from high through mid to low.)

Errors. At 3.9%, the error rate in this experiment was low, and consequently any regression analysis that split the errors into smaller groups by error type would be highly unreliable. Therefore, only an analysis of errors undifferentiated by type was conducted. The independent variables used as predictors in the regression model were as described for the RT analysis, and the dependent variable was the number of errors pooled over participants for each word. The outcome of the regression model is shown in Table 5.

TABLE 5
Outcome of Regression Analysis Carried Out on the Error Data of Experiment 2
Variable | β | t | p | % of unique variance (sr²)
Initial Phoneme | | | | 5.4*
Voice | -.243 | .810 | .419 |
Nasal | -.924 | -1.35 | .180 |
Fricative | -.761 | -1.60 | .114 |
Liquid | .396 | .548 | .585 |
Bilabial | .275 | .486 | .628 |
Labiodental | -.344 | -.599 | .550 |
Palatal | 1.193 | 2.07 | < .050 |
Alveolar | -.321 | -.786 | .433 |
Velar | -.313 | -.669 | .504 |
Glottal | -.382 | -.350 | .727 |
Word Frequency | -.316 | -1.34 | .184 |
Word Length | .400 | 2.71 | < .010 | 3.4
Positional Bigram Frequency | .003 | .689 | .492 |
Imageability | -.108 | -2.54 | < .050 | 3.0
Regularity | -1.462 | -6.88 | < .001 | 22.6
Imageability x Regularity | .120 | 1.99 | < .050 | 1.9
Note: Total percentage of variance accounted for = 37% (unique variability = 36.3%; shared variability = 0.7%). *Total for initial phoneme.

The percentage of variance accounted for by the regression model was 37% (29.7% adjusted: F(16, 133) = 4.9, MSE = 1.69, p < .0005). The various initial phoneme variables accounted for 5.4% of the variance. Word length accounted for 3.4% of unique variance, with shorter words being named more accurately than longer items. Imageability accounted for 3% of unique variance, with high-imageability words producing the fewest errors. Regular words were named more accurately than exception words, and regularity accounted for the single largest portion of explained variability, 22.6%. Importantly, as in the latency analysis, imageability and regularity interacted, accounting for 1.9% of the variance. This interaction was plotted using the regression equation, and is shown in Figure 6. This graph shows the mean error rate of the regular and exception words at high and low imageability when the other predictor variables are at their mean values. Analysis of the simple slopes reveals that imageability significantly predicts error rate for exception words, t(133) = -2.54, p < 0.05, but not for regular words, t(133) = .28. As shown on the graph, there were significant differences between regular and exception words at all levels of the imageability variable. The error analysis of this large set of items therefore confirms the findings of the RT analysis. This further supports our prediction that imageability has a greater effect on the naming of low-frequency exception words than low-frequency regular words, replicating the findings of Strain et al. (1995) and those of the full data set and the high phonological skill subgroup in Experiment 1 of the present research.

Figure 6. Experiment 2. Mean error rates for regular and exception words at high and low imageability values. (Horizontal axis: imageability, from high through mid to low.)

General Discussion
The present research shows that the extent to which semantic information influences word naming depends on the efficiency of the mappings between orthography and phonology. As noted by Strain et al. (1995), efficiency can be defined in terms of consistency, with regular words corresponding to consistent O-P mappings and exception words corresponding to inconsistent O-P mappings. In support of this view, Strain et al. found an interaction between imageability and regularity, where imageability facilitated naming especially for low-frequency exception words. Efficiency can also be defined in terms of phonological reading ability, with high-skill individuals having efficient O-P mappings and low-skill individuals having inefficient O-P mappings. On this view, Experiment 1 replicated and extended the findings of Strain et al. by showing that the effect of imageability on word naming is modulated by phonological reading skill. For participants with high phonological skill, the mapping from orthography to phonology produces clear phonological representations quickly, confining the effect of semantics primarily to words with uncommon and unusual mappings. Readers with low phonological skill, on the other hand, exploit consistencies within the orthography to phonology mapping less well, leading to slower and less clear phonological activation over a wider range of word types. This allows semantics to play a greater role in the reading of these items. The latency data produced by the high-skill group in Experiment 1 differed from those reported in Strain et al. (1995) in one respect: Whereas in Strain et al. the impact of imageability was restricted to low-frequency exception words, in the present research, imageability also influenced naming of low-frequency regular words, although this effect was smaller than for low-frequency exception words. This difference across studies can be explained in terms of the relative skill levels of the two samples. Although the "high" skill participants scored higher on the phonological skill tasks than the low-skill participants in the present sample, they appear to be less skilled than the participants in the Strain et al. study were. This judgement is based on the mean RTs produced by both sets of participants: the mean response time in the Strain et al. study (Experiment 2) is 548 ms, compared with a mean of 855 ms for the high-skill participants in the present study. The difference in RT patterns can therefore be taken as further evidence for the hypothesis that the more phonologically skilled participants are, the less strongly a semantic factor such as imageability will affect their naming latencies. In Experiment 2, a regression approach was used to determine whether the imageability by regularity interaction would generalize to a new, larger set of items. As in Strain et al. (1995) and also Experiment 1 of the present research, imageability facilitated naming responses and an interaction between imageability and regularity was obtained. These findings provide important additional evidence that semantics plays a role in naming words when the connections between orthography and phonology are weak.
COMPLEMENTARY EVIDENCE: POLYSEMY EFFECTS IN NAMING
Hino and Lupker (1996) and Lichacz, Herdman, LeFevre, and Baird (1999) have recently shown that polysemous words (i.e., words with more than one meaning) are named faster than nonpolysemous words (i.e., words with only one meaning). Moreover, polysemy and frequency interact: As with imageability, the facilitative effects of polysemy are isolated to naming of low-frequency words. Insofar as both imageability and polysemy are semantic indices, it is reasonable to assume that these variables impact on phonological coding in a similar manner. To this end, our general approach provides a useful framework: As with imageability, polysemy facilitates naming of low-frequency words because, for these words, the O-P connections are inefficient and thereby allow time for semantic information to influence processing. CONCLUSIONS
To summarize, although most theorists in the word recognition literature assume that semantics can influence processing of single words, this assumption has seldom been directly examined. The importance of the present research is in showing that semantic information does indeed influence word naming when the mappings between orthography and phonology are inefficient.

This research was supported by National Institutes of Mental Health Grant MH 47566 and by the Natural Sciences and Engineering Research Council of Canada through a grant to C. Herdman. We would like to acknowledge the many and valuable contributions of Karalyn Patterson and Jo-Anne LeFevre in the development of this study. We thank Penny Pexman and Ken Paap for helpful reviews. Ken Paap also served as the Action Editor for this paper. Address correspondence to Eamon Strain, Department of Psychology, Anglia Polytechnic University, Cambridge, UK (E-mail: [email protected]).

References
Aitken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage.
Brown, G. D., & Watson, F. L. (1987). First in, first out: Word learning age and spoken word frequency as predictors of word familiarity and word naming latency. Memory & Cognition, 15, 208-216.
Cohen, J. D., MacWhinney, B., Flatt, M., & Provost, J. (1993). PsyScope: An interactive graphic system for designing and controlling experiments in the psychology laboratory using Macintosh computers. Behavior Research Methods, Instruments and Computers, 25, 257-271.
Coltheart, M. (1981). The MRC psycholinguistic database. Quarterly Journal of Experimental Psychology, 33A(4), 497-505.
Coltheart, M. (1987). The cognitive neuropsychology of language. London: Erlbaum.
Coltheart, M., Curtis, B., Atkins, P., & Haller, M. (1993). Models of reading aloud: Dual-route and parallel-distributed-processing approaches. Psychological Review, 100, 589-608.
Coltheart, M., Patterson, K. E., & Marshall, J. C. (Eds.). (1980). Deep dyslexia. London: Routledge & Kegan Paul.
de Groot, A. M. B. (1989). Representational aspects of word imageability and word frequency as assessed through word association. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 824-845.
Funnell, E. (1987). Morphological errors in acquired dyslexia: A case of mistaken identity. Quarterly Journal of Experimental Psychology, 39A, 497-539.
Gilhooly, K. J., & Logie, R. H. (1980). Age of acquisition, imagery, concreteness, familiarity, and ambiguity measures for 1944 words. Behavior Research Methods and Instrumentation, 12, 395-427.
Hino, Y., & Lupker, S. J. (1996). Effects of polysemy in lexical decision and naming: An alternative to lexical access accounts. Journal of Experimental Psychology: Human Perception and Performance, 22, 1331-1356.
Kučera, H., & Francis, W. N. (1967). Computational analysis of present-day American English. Providence, RI: Brown University Press.
Lichacz, F. M., Herdman, C. M., LeFevre, J., & Baird, B. (1999). Polysemy effects in naming. Canadian Journal of Experimental Psychology, 53, 189-193.
Norris, D. (1994). A quantitative multiple-levels model of reading aloud. Journal of Experimental Psychology: Human Perception and Performance, 20, 1212-1232.
Paivio, A., Yuille, J. C., & Madigan, S. A. (1968). Concreteness, imagery and meaningfulness values for 925 words. Journal of Experimental Psychology Monograph Supplement, 76(3, part 2).
Seidenberg, M. S., & McClelland, J. L. (1989). A distributed, developmental model of word recognition and naming. Psychological Review, 96, 523-568.
Solso, R. L., & Juel, C. L. (1980). Positional frequency and versatility of bigrams for two through nine letter English words. Behavior Research Methods and Instrumentation, 12, 297-343.
Stanovich, K. E. (1980). Toward an interactive model of individual differences in the development of reading fluency. Reading Research Quarterly, 16, 32-71.
Strain, E., Patterson, K., & Seidenberg, M. S. (1995). Semantic effects in single-word naming. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1140-1154.
Toglia, M. P., & Battig, W. F. (1978). Handbook of semantic word norms. Hillsdale, NJ: Erlbaum.
Van Orden, G. C. (1987). A ROWS is a ROSE: Spelling, sound, and reading. Memory & Cognition, 15, 181-198.
Venezky, R. L. (1970). The structure of English orthography. The Hague: Mouton.
Appendix A
Word List for Experiment 1
Regular Words, Concrete: banner, corpse, groin, mattress, snail, sandal, spike, trout, trumpet, wreck
Regular Words, Abstract: blessing, clause, gait, madness, scorn, stanza, fraud, truce, traitor, whence
Exception Words, Concrete: boulder, comb, ghost, meadow, swamp, shovel, sword, pear, treasure, worm
Exception Words, Abstract: broader, caste, guise, mischief, sleight, soften, suave, trough, toughness, warn
Appendix B
Word List for Experiment 2
Exception: ache, aunt, axe, blown, bowl, brooch, crow, caste, choir, chord, comb, cough, couth, cache, dough, draught, dreamt, deaf, debt, fete, flood, flown, gauge, glow, ghoul, gross, hearth, hearse, leapt, mauve, mould, mourn, pear, pint, pour, quart, realm, soot, sew, seize, scarce, sown, shone, shoe, chute, sieve, sloth, swear, swab, squad, swan, suave, suede, suite, sponge, swamp, sleight, swap, steak, swat, sword, tomb, ton, trough, tow, wad, warp, wand, warn, wan, wasp, wolf, womb, wool, worm
Regular: arch, arc, ail, blotch, blunt, broom, cage, carve, clink, cloak, clung, croft, crude, crumb, dense, dredge, drench, duck, dumb, filth, flame, fraud, gait, gift, grace, graft, heave, hoarse, lamb, merge, mince, moan, peck, plug, plum, quest, rude, sage, saint, sauce, scribe, seam, shawl, shed, shrewd, skate, slang, snail, sock, sole, sore, spire, spout, stack, starch, steam, strict, stub, stump, surf, swerve, tack, tale, trend, tune, wage, waif, wail, waist, wax, web, weed, weep, welt, wisp
Summary
The present research was designed to follow up on the work of Strain, Patterson, and Seidenberg (1995), who found that imageability facilitates the naming of low-frequency irregular words. Experiment 1 shows that the impact of imageability on word naming varies with phonological coding ability. In Experiment 2, the effect of imageability on the naming of low-frequency irregular words was shown to hold across an extended range of items. Together, the present findings support the notion that semantics can play a role in phonological coding when the mappings between orthography and phonology are weak.
Revue canadienne de psychologie expérimentale, 1999, 53:4, 359