Journal of Experimental Psychology: General 2014, Vol. 143, No. 5, 1837–1860

© 2014 American Psychological Association 0096-3445/14/$12.00 http://dx.doi.org/10.1037/a0037190

Selective Influence of Working Memory Load on Exceptionally Slow Reaction Times

Nitzan Shahar
Ben-Gurion University of the Negev

Andrei R. Teodorescu and Marius Usher
Tel-Aviv University

Maayan Pereg and Nachshon Meiran
Ben-Gurion University of the Negev

This document is copyrighted by the American Psychological Association or one of its allied publishers. This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

The rate of exceptionally slow reaction times (RTs), described by the long tail of the RT distribution, was found to be amplified in a variety of special populations with cognitive deficits (e.g., early-stage Alzheimer’s disease, attention-deficit/hyperactivity disorder, low intelligence, elderly). Previous individual differences studies found high correlations between working memory (WM) and parameters that characterize the magnitude of the long-RT tail. However, the causal direction remains unknown. In 3 choice-reaction task experiments, we examined this relationship by directly manipulating WM availability. In Experiment 1, the stimulus–response rules were either arbitrary (WM demanding) or nonarbitrary. In Experiment 2, the arbitrary rules were either novel (demanding) or practiced. In Experiment 3, WM was loaded with either declarative (stimulus–stimulus) or procedural (stimulus–response) arbitrary rules. Using an ex-Gaussian model fitting, we found across all experiments that WM demands uniquely influenced the τ parameter, mostly responsible for the long-RT distribution tail. Evidence accumulation modeling of the choice process indicated that WM load had little influence on the decision process itself and primarily affected the duration of an exponentially distributed nondecision component, assumed to reflect the process of rule retrieval. Theoretical interpretations and implications are discussed.

Keywords: working memory, ex-Gaussian distribution, choice-reaction task, intraindividual variability, evidence accumulation modeling

The measurement of reaction time (RT) is widely used to study cognitive processes and their deficits in special populations. While studies analyzing RTs often rely only on the mean as the main dependent variable, it has been shown that other aspects of the RT distribution are more informative in characterizing the underlying process (e.g., Balota & Yap, 2011; Ratcliff, 1979). Typically, RT distributions are not symmetrical, but rather characterized by a long tail of slow responses. Importantly, the long-RT tail is strongly, and selectively, accentuated in special populations such as early-stage Alzheimer’s disease (e.g., Jackson, Balota, Duchek, & Head, 2012; Spieler, Balota, & Faust, 1996; Tse, Balota, Yap, Duchek, & McCabe, 2010), cognitive aging (Spieler et al., 1996), attention-deficit/hyperactivity disorder (e.g., Buzy, Medoff, & Schweitzer, 2009; Karalunas & Huang-Pollock, 2013), schizophrenia (Karantinos et al., 2014), chronic alcohol consumption (Wright, Vandewater, & Taffe, 2013), cocaine dependency (S. Liu et al., 2012), low fluid intelligence (e.g., Coyle, 2003; Schmiedek, Oberauer, Süß, Wilhelm, & Wittmann, 2007), and brain damage (Stuss, Murphy, Binns, & Alexander, 2003). Additionally, the long-RT tail was found to reflect different cognitive manipulations such as self versus other face perception (Sui & Humphreys, 2013), priming in lexical decision (Balota & Spieler, 1999; Yap, Balota, & Tan, 2013), and task conflicts, as opposed to response conflicts (Moutsopoulou & Waszak, 2012; Steinhauser & Hübner, 2009).

In recent years, evidence supporting a specific link between working memory (WM) abilities and the tail of the RT distribution has emerged. WM is a system devoted to the processing and maintenance of mental representations in a goal-directed manner (Oberauer, 2009). Interestingly, all of the special populations associated with long-RT tails have also been found to have poor WM abilities. Additionally, speeded RT tasks that are typically used to obtain empirical RT distributions require participants to hold in mind and retrieve stimulus–response (S–R) rules. When these rules are novel, they are assumed to be held in WM (Oberauer, 2009). Thus, it seems plausible that WM abilities take a significant role in the long-RT tail phenomenon.

This article was published Online First July 7, 2014. Nitzan Shahar, Department of Psychology and Zlotowski Center for Neuroscience, Ben-Gurion University of the Negev; Andrei R. Teodorescu and Marius Usher, Department of Psychology, Tel-Aviv University; Maayan Pereg and Nachshon Meiran, Department of Psychology and Zlotowski Center for Neuroscience, Ben-Gurion University of the Negev. This research was supported by a research grant from the Israel Science Foundation to Nachshon Meiran, by a Leverhulme Trust Visiting Professorship to the University of Oxford for Marius Usher, and by a Fulbright Scholar Program for Andrei R. Teodorescu. We wish to thank Gal Eblagon, Danielle Dotan, Shir Bekhor, and Eyal Eilat for their help in running the experiments. Correspondence concerning this article should be addressed to Nitzan Shahar, Department of Psychology and Zlotowski Center for Neuroscience, Ben-Gurion University of the Negev, Beer-Sheva, Israel. E-mail: [email protected]


SHAHAR, TEODORESCU, USHER, PEREG, AND MEIRAN

Several studies have examined the relation between WM abilities and the long-RT tail. In those studies, ex-Gaussian distribution fitting has typically been used to account for different aspects of the empirical RT distribution, among them the long-RT tail. The ex-Gaussian distribution, the distribution of the sum of a Gaussian and an exponential random variable, was found to successfully describe empirical RT distributions (e.g., Balota & Yap, 2011; Heathcote, Popiel, & Mewhort, 1991; Ratcliff, 1979). Ex-Gaussian distributions have three parameters: μ and σ (the mean and standard deviation of the Gaussian component) and τ (the exponential decay responsible for the tail of the distribution; see Figure 1). When τ equals 0, the distribution becomes Gaussian. Thus, τ quantifies the degree to which the RT distribution is affected by a minority of trials with extremely long durations. In the following, we will call τ “the tail-parameter.”

Schmiedek et al. (2007) extracted ex-Gaussian parameters from a battery of two-choice reaction tasks. Using a structural equations modeling approach, the authors demonstrated an exceptionally high negative regression weight describing the correlation of a τ factor with a WM factor (indexed by complex span, short-term memory, and updating tasks; β = −.90), which was not true for the factors of μ (β = −.02) or σ (β = .25). Similar support for the relationship between the tail-parameter and WM was reported in a number of other correlational studies. Unsworth, Redick, Lakey, and Young (2010) found that the tail-parameter in a perceptual detection task was negatively correlated with performance in WM capacity tasks (i.e., operation span and reading span; Engle, 2002) and with performance in response inhibition tasks (i.e., the flanker noise task, Eriksen & Eriksen, 1974, and the antisaccade task, e.g., Munoz & Everling, 2004). McVay and Kane (2012) found similar results. They demonstrated that the tail-parameter extracted from an RT distribution of a go/no-go task correlated negatively with WM capacity abilities, evaluated using three automated span tasks (i.e., reading span, operation span, and symmetry span). Cowan and Saults (2013) also investigated the correlation between WM capacity and RT in a proactive interference task. Participants were presented with a list of words and were later asked to quickly identify whether a presented probe had appeared in the list. They found that for shorter lists (i.e., three to four words), higher WM span scores predicted lower tail-parameter values. Interestingly, this relationship disappeared when participants were presented with longer list sets (i.e., six to eight words), suggesting that the correlation between the tail-parameter and WM abilities might interact with set size.

These findings demonstrate that participants with poorer WM abilities tend to show RT distributions with a longer RT tail (i.e., higher tail-parameter, τ). However, all of the studies testing the above relationship between RT tail and WM share the same limitation of having done so using correlational designs. As such, causality has neither been tested nor been established. To help determine whether WM load has a causal role in the form of RT tails, one should preferably manipulate one variable (e.g., WM load) and examine the influence of this manipulation on the other variable (e.g., RT tail). Additionally, the cognitive process underlying the relation between WM load and the magnitude of the long-RT tail has been a matter of discussion, with some attributing changes in the tail-parameter to task interference (e.g., lapses of attention, McVay & Kane, 2012) and others to the quality of information available to the decision process (e.g., Schmiedek et al., 2007), regardless of interference.

Thus, we have set up two aims for this study. First, we aimed to test the causal relation between WM and the long-RT tail by directly manipulating the availability of WM resources. In three experiments, WM load was manipulated by increasing the number of arbitrary S–R rules that had to be held in WM. We predicted that a shortage in WM resources would lead to a longer RT tail, reflected by higher tail-parameter values (i.e., τ).

Having established a directional, causal relation, our second aim was to identify the cognitive process responsible for the effect of WM demands on the long-RT tail. We focus on three plausible hypotheses: (a) lower WM resources affect the quality of information available for the perceptual decision (e.g., Schmiedek et al., 2007); (b) reduced WM resources increase fluctuations in attention and/or motivation (e.g., Engle & Kane, 2003; McVay & Kane, 2012); (c) shortage of WM resources prolongs the time needed for the retrieval of S–R rules from WM (e.g., Schmitz & Voss, 2012). Accordingly, the first section of this article is composed of an experimental investigation including three experiments that explore the effect of WM on the long-RT tail. In the second section, we carry out a computational investigation of the decision process using an evidence accumulation modeling approach to distinguish the three possible loci for the involvement of WM in the latency of response production.

Examining the Effects of Working Memory Load on the RT Distribution

Figure 1. Illustration of two ex-Gaussian distributions (μ = 300 ms, σ = 150 ms, and τ = 250 ms; solid line) and the same distribution only with a higher tau value (τ = 500 ms; dashed line).
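The two distributions in Figure 1 can be approximated by simulation. The following is a minimal sketch, not the authors' code: it assumes only the definition of the ex-Gaussian as a Gaussian plus an independent exponential component, and the function names, seed, sample size, and the 1,000-ms cutoff used to describe the tail are illustrative choices that do not come from the article.

```python
import random
import statistics


def sample_ex_gaussian(mu, sigma, tau, n, rng):
    """Draw n RTs (ms), each a Gaussian(mu, sigma) draw plus an
    independent exponential draw with mean tau."""
    # random.expovariate takes a rate, so the rate is 1 / tau.
    return [rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau) for _ in range(n)]


def share_above(rts, cutoff_ms):
    """Proportion of RTs slower than cutoff_ms."""
    return sum(rt > cutoff_ms for rt in rts) / len(rts)


rng = random.Random(2014)  # arbitrary seed, for reproducibility only
n = 50_000                 # illustrative sample size

# Parameter values taken from the Figure 1 caption.
low_tau = sample_ex_gaussian(mu=300, sigma=150, tau=250, n=n, rng=rng)
high_tau = sample_ex_gaussian(mu=300, sigma=150, tau=500, n=n, rng=rng)

mean_low = statistics.fmean(low_tau)
mean_high = statistics.fmean(high_tau)
slow_low = share_above(low_tau, 1000)    # hypothetical cutoff for "very slow"
slow_high = share_above(high_tau, 1000)

print(f"tau = 250: mean RT = {mean_low:.0f} ms, share above 1000 ms = {slow_low:.3f}")
print(f"tau = 500: mean RT = {mean_high:.0f} ms, share above 1000 ms = {slow_high:.3f}")
```

With μ and σ held constant, raising τ should leave the fast edge of the distribution largely unchanged while increasing both the mean and the share of very slow responses.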

A very common form of speeded RT tasks is choice-reaction tasks in which participants are asked to follow simple S–R rules, in order to categorize different stimuli into classes (e.g., odd–even, word–nonword, square–circle) using key presses as responses. Previous research has shown choice-reaction tasks to be characterized by Hick’s law: Mean RT becomes longer as the number of alternatives (i.e., set size) increases (Hick, 1952; Hyman, 1953).


EFFECTS OF WORKING MEMORY LOAD ON SLOW REACTIONS

Additionally, the effect of set size on the mean RT is known to be attenuated after prolonged practice (e.g., Hale, 1968) or when familiar and congruent S–R mapping is used (Longstreth, El-Zahhar, & Alcorn, 1985). Schneider and Anderson (2011) presented a memory-based model, claiming that the effect of set size on mean RT is partially a memory-based effect. They showed that both extensive practice and the use of familiar and congruent S–R mapping reduce memory demands, thus attenuating the effect of set size on the mean RT. While changes in the long-RT tail can directly affect the mean RT (mathematically, for ex-Gaussian distributions, RTmean = μ + τ), no investigation has explored the specific effect of WM on the distribution tail. Additionally, no study has directly examined this effect in a within-subject design, manipulating WM resource availability.

Thus, we examined the effects of WM load on the long-RT tail in three experiments. In Experiment 1, we examined the effects of WM load by means of set size and the arbitrariness of the S–R mapping. In Experiment 2, we manipulated WM load by means of training that allowed consolidation of the S–R rules into long-term memory. In Experiment 3, we examined whether the effect of WM load on the long-RT tail remains when WM is loaded with declarative as well as procedural rules. As described later, in all experiments we predicted that WM load would increase with the number of arbitrary rules. We further predicted that the increase in WM load would selectively influence the long-RT tail (as indexed by the tail-parameter, τ).
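The relation between the mean RT and the ex-Gaussian parameters follows from the definition of the distribution as a sum of two independent components. The short derivation below is added for illustration; the symbols X, G, and E are labels introduced here and do not appear in the article.

```latex
\begin{align*}
X &= G + E, \qquad G \sim \mathcal{N}(\mu, \sigma^{2}), \quad
     E \sim \text{Exponential with mean } \tau, \quad G \text{ independent of } E \\
\mathbb{E}[X] &= \mathbb{E}[G] + \mathbb{E}[E] = \mu + \tau \\
\operatorname{Var}[X] &= \operatorname{Var}[G] + \operatorname{Var}[E] = \sigma^{2} + \tau^{2} \\
\mathbb{E}\!\left[(X - \mathbb{E}[X])^{3}\right] &= 0 + 2\tau^{3} = 2\tau^{3}
\end{align*}
```

For the Figure 1 values, μ = 300 ms and τ = 250 ms give a mean of 550 ms, and raising τ to 500 ms gives a mean of 800 ms with μ unchanged. Because the Gaussian component is symmetric, all of the skew comes from τ, which is why a change in mean RT alone cannot show whether the bulk of the distribution or its tail has moved.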

Experiment 1

In the current experiment, we manipulated set size (i.e., 2, 4, or 6) in choice-reaction tasks with either a nonarbitrary S–R mapping that is based on knowledge stored in long-term memory or an arbitrary mapping that is novel to the participants and requires the maintenance of S–R rules in WM (Wilhelm & Oberauer, 2006). Thus, in this design WM load is manipulated by altering the number of arbitrary S–R rules that the participants are required to hold in mind. Specifically, when mapping is based on nonarbitrary rules, increasing set size should not affect WM load (i.e., the number of arbitrary rules remains zero). Yet, when mapping is arbitrary, increasing set size also increases the number of arbitrary rules and thus is assumed to raise WM load. In other words, if WM load holds a causal role in the form of the RT tail, then an interaction is predicted such that the tail-parameter (i.e., τ) would increase with set sizes only in the arbitrary condition (see Figure 2).
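The logic of this design can be written down as a small table of assumed load values. The sketch below is illustrative only: it equates WM load with the count of arbitrary rules, which is a simplification of the article's assumption that load increases with that count, and the function and variable names are hypothetical.

```python
SET_SIZES = (2, 4, 6)
MAPPINGS = ("nonarbitrary", "arbitrary")


def assumed_wm_load(mapping, set_size):
    """Number of arbitrary S-R rules to be held in WM under the design's
    assumption: zero for nonarbitrary mappings, set_size for arbitrary ones."""
    if mapping not in MAPPINGS:
        raise ValueError(f"unknown mapping: {mapping!r}")
    return set_size if mapping == "arbitrary" else 0


# Assumed load in each of the six cells of the Mapping x Set Size design.
predicted_load = {
    (mapping, size): assumed_wm_load(mapping, size)
    for mapping in MAPPINGS
    for size in SET_SIZES
}

# The predicted interaction: load rises with set size only for arbitrary rules.
load_increase = {
    mapping: predicted_load[(mapping, 6)] - predicted_load[(mapping, 2)]
    for mapping in MAPPINGS
}

for mapping in MAPPINGS:
    row = ", ".join(f"set size {s}: {predicted_load[(mapping, s)]}" for s in SET_SIZES)
    print(f"{mapping:>12} -> {row}")
```

If the tail-parameter tracks WM load, it should follow the same pattern as this table, growing with set size under arbitrary mapping and staying flat under nonarbitrary mapping.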

Figure 2. Illustration of the predicted change in working memory load across mapping (arbitrary vs. nonarbitrary) and set size (2, 4, or 6) conditions in Experiment 1 and Experiment 2 (only the six-choice condition). Working memory load was assumed to increase with the number of arbitrary rules. Increasing set size when mapping is nonarbitrary was assumed not to affect working memory load (since the number of arbitrary rules remains zero). Yet, increasing set size under the arbitrary condition (i.e., 2, 4, or 6) adds arbitrary rules and is thus assumed to increase working memory load. This interaction effect was predicted to appear only in the tail-parameter (i.e., τ), thus reflecting the influence of working memory load on the rate of exceptionally slow reaction times.

Method

Participants. Twenty-four healthy undergraduate students (mean age = 23.08 years, SD = 1.13; 20 females, four males) took part in the experiment in return for 25 NIS per hour (~$6). Participants were prescreened for self-reported head injury, psychiatric disorders, drug/alcohol use, color blindness, and diagnosed learning disabilities.

Stimuli and apparatus. Three choice-reaction tasks were used with shapes, letters, or digits (see Figure 3). In the shapes task, the stimuli used in the nonarbitrary condition were asymmetrical in a way that indicated spatial direction, while those used in the arbitrary mapping condition were similar but symmetrical, thus making the connection between stimuli and responses arbitrary. The letters task included six Hebrew letters. Nonarbitrary mapping had letters from the beginning of the alphabet (aleph [א], bet [ב], gimel [ג], dalet [ד], hei [ה], and vav [ו]) mapped to each key in a right-to-left order (compatible with the Hebrew language writing direction). Arbitrary mapping had randomly chosen letters. The digits task included six Arabic digits following the same mapping principle as the letters task. Nonarbitrary mapping used the digits 1–6 with keys arranged from left to right. The arbitrary mapping used randomly chosen digits. We were concerned with the possibility that what appeared to us as random assignment might nonetheless be somehow systematically related to the key arrangement. To avoid any such systematic influence, we used eight arbitrary mappings for each of the digits and letters tasks, counterbalanced across participants. In addition, four graduate students were run in a pilot study and were asked to indicate whether they noticed any systematic ordering in the sets. None of them could find any such systematic order.

In both the letters and the digits tasks, participants responded using six horizontally adjacent keyboard keys on the lower row of the keyboard (i.e., X, C, V, B, N, M, in a QWERTY keyboard). In this arrangement, we capitalized on the natural left-to-right arrangement of digits along the number line (e.g., Dehaene, 2002) and the Hebrew right-to-left arrangement. In the shapes task, participants responded using the numerical keypad in which the relative positioning of the keys was the same as the relative positioning to which the shapes (in the nonarbitrary task) pointed (i.e., 1, 3, 4, 6, 7, and 9). All keys were covered with white stickers to encourage their coding in terms of spatial location rather than as letters or digits. This experiment as well as all subsequent experiments was programmed with E-Prime 2.0 (Psychology Software Tools, Pittsburgh, PA). All stimuli were presented in the center of a black 19-in. (48.26-cm) computer screen. Each shape stimulus was 28 × 28 mm in size. Text was presented in a white 24-point Courier New font.

Procedure. Participants were seated in front of the computer screen in a small room in the lab. After a short instruction screen, participants performed for each task (digits, letters, and shapes) two sequences of two-, four-, and six-choice tasks, one with arbitrary mapping and one with nonarbitrary mapping. Each condition consisted of 120 trials divided into two blocks, allowing a short recess in each condition. Mapping order (arbitrary, nonarbitrary) and task order (digits, letters, or shapes) were counterbalanced between participants with a Latin square. Each trial included a fixation (250 ms), target (until response or until 6 s), and a blank screen (250 ms). A 400-ms beep signaled errors.

Results and Discussion

For RT analyses, error and posterror trials, the first 10 trials in each condition, and the first trial after each recess were discarded. RTs below 200 ms or above 4 standard deviations from the participant’s mean in the respective condition were considered outliers and thus omitted (Schmiedek et al., 2007). This resulted in an acceptable number of trials for ex-Gaussian distribution fitting (mean number of trials was 95.73 in each condition). The ex-Gaussian distribution fitting was performed with the DISTRIB toolbox in MATLAB (Lacouture & Cousineau, 2008). Data from one participant in the letters task were corrupted due to a computer error and were thus replaced with the mean value in each condition. In order to ensure that the ex-Gaussian distribution fits the data, we simulated 10,000 data points from each set of the ex-Gaussian parameters in each condition. We then calculated four quantile means (e.g., .20, .40, .60, .80) for both the simulated and the empirical data. A quantile–quantile plot was generated to allow a visual comparison between the simulated and empirical data (e.g., Steinhauser & Hübner, 2009). A good fit between the theoretical ex-Gaussian distributions and the empirical data was demonstrated (see Appendix A).

A repeated measures analysis of variance (ANOVA) was performed separately for each parameter (μ, σ, τ) as dependent variables with mapping (arbitrary or nonarbitrary), set size (2, 4, or 6), and task (letters, digits, or shapes) as independent variables (see Figure 4). Main effects for mapping were found for μ, F(1, 23) = 8.25, p < .01, ηp² = .26, and τ, F(1, 23) = 58.68, p < .001, ηp² = .72, but not for σ, F(1, 23) = 2.75, ns, ηp² = .11. Main effects for set size were significant for all three parameters: μ, F(2, 46) = 186.51, p < .001, ηp² = .89; σ, F(2, 46) = 23.2, p < .001, ηp² = .50; and τ, F(2, 46) = 165.67, p < .001, ηp² = .89.

Most importantly, to account for WM load effects as predicted (see Figure 2), we examined the paired interaction of Set Size × Mapping for each parameter. Following our predictions, results indicated a significant simple interaction between set size and mapping for τ, F(2, 46) = 20.63, p < .001, ηp² = .47, but not for μ, F(2, 46) = 0.75, ns, ηp² = .03, or σ, F(2, 46) = 2.1, ns, ηp² = .08.

To contrast the effects in τ with μ and σ, we standardized the data¹ and performed a repeated measures ANOVA on the calculated Z scores with parameter (μ, σ, τ), mapping (arbitrary or nonarbitrary), set size (2, 4, or 6), and task (letters, digits, or shapes) as independent variables. In order for our hypothesis to be supported, we needed to show both that τ differs significantly from μ and that τ differs from σ on the Mapping × Set Size interaction effect representing WM load manipulation. Accordingly, a one-tailed planned comparison contrasting μ and τ on the effects of set size (2 vs. 6) and mapping (arbitrary vs. nonarbitrary) proved to be significant, t(23) = 4.15, p < .001, ηp² = .43. Furthermore, a one-tailed planned comparison contrasting τ and σ across the same conditions was also significant, t(23) = 2.96, p < .01, ηp² = .28. Thus, the effect of WM load, represented by the Mapping × Set Size interaction, was stronger for τ compared with either μ or σ. This finding is in line with the much higher effect sizes for the interaction between mapping and set size found for τ (ηp² = .47), compared with μ (ηp² = .03) and σ (ηp² = .08) in the analyses performed separately on each parameter.

Error rates were subjected to an ANOVA with the same design. A significant main effect for mapping was found, F(1, 23) = 12.47, p < .01, ηp² = .35, with lower error rates for the nonarbitrary versus arbitrary condition. A significant main effect was found for set size, F(2, 46) = 18.79, p < .001, ηp² = .45, with lower error rates for conditions with fewer choices. The two-way interaction Mapping × Set Size, assumed to represent WM load, was not significant, F(2, 46) = 1.52, ns, ηp² = .06. As the arbitrary condition tended to have slower RTs with higher error rates, speed–accuracy trade-off is not a likely explanation for the present results (but see Modeling the Decision Process section for response criteria estimations).

The present results support our predictions, demonstrating the influence of WM load (as described by the Mapping × Set Size interaction) on the long-RT tail (see Figures 4 and 5). Namely, the addition of arbitrary relative to nonarbitrary rules had the strongest influence on the tail-parameter (i.e., τ).

¹ Although the three parameters of the ex-Gaussian distribution are given in units of milliseconds, it is unclear if they can be compared statistically because they might have drastically different population distributions. To show that the interaction we report does not reflect such a scaling problem, we transformed the data to Z scores. The transformation was performed by first calculating the difference for each value from its parameter total mean (calculated across all conditions). The difference was further divided by the parameter standard deviation calculated for each parameter separately in the two-choice, nonarbitrary mapping conditions (we did not estimate standard deviations across conditions to avoid adding variability that is influenced by the experimental manipulations).

Figure 3. Example of a nonarbitrary (left panel) and an arbitrary (right panel) stimulus–response mapping used in the shapes, digits, and letters six-alternatives tasks in Experiment 1 (English letters in all figures are presented for illustrative purposes only; letter stimuli in the experiment were Hebrew letters, in equivalent order). The shapes task made use of the number pad keys (i.e., 1, 3, 4, 6, 7, 9, in a QWERTY keyboard), and both the digits and letters tasks were performed using six horizontally ordered keys (i.e., X, C, V, B, N, M, in a QWERTY keyboard).

Figure 4. Results in Experiment 1 showing the interaction effect of set size (i.e., 2, 4, or 6) and mapping (i.e., arbitrary vs. nonarbitrary) on each of the ex-Gaussian parameters (i.e., μ, σ, and τ). Results demonstrate that the tail-parameter (i.e., τ) was most influenced by this interaction, demonstrating an effect of working memory load on the rate of exceptionally slow reaction times. Error bars reflect confidence intervals (Jarmasz & Hollands, 2009).

Figure 5. Reaction time (RT) frequency distributions in Experiment 1 (collapsed across all participants) for the nonarbitrary and arbitrary conditions. Results demonstrate a larger effect for set size (i.e., 2, 4, or 6) on the distribution tail in conditions where working memory was demanding (i.e., arbitrary mapping).

Experiment 2

In Experiment 2, we sought to replicate and extend the previous results using an additional load manipulation. We gave participants one task drawn from Experiment 1 (i.e., six-choice digit classification task) and manipulated WM load either by means of mapping type (arbitrary vs. nonarbitrary) or through training. The mapping manipulation included six S–R rules that were either nonarbitrary (i.e., digits 1 to 6 ordered left to right) or arbitrary (i.e., with randomly chosen digits), exactly as manipulated in Experiment 1. For the training manipulation, we asked participants to perform the same six arbitrary S–R rules over two sessions performed on consecutive days. We assumed that through training, especially when allowing a night rest in between sessions (Karni, Tanne, Rubenstein, Askenasy, & Sagi, 1994; Stickgold, 1998), the novel and unfamiliar arbitrary S–R mapping would produce long-term memory traces and reduce WM load. Thus, if indeed WM load holds a causal influence on the shape of the RT distribution tail, we predict that (a) mapping manipulation (arbitrary vs. nonarbitrary) in Session 1 would only (or mostly) be expressed by a reduction in the tail-parameter, replicating Experiment 1, and (b) the training manipulation (Session 1 vs. Session 2, only for arbitrary mapping) would mitigate the arbitrariness of the S–R mapping between sessions, causing a decline in the tail-parameter from Session 1 to Session 2.

Note that since set size was held constant in this experiment, we could not observe the full interaction between set size and mapping as in Experiment 1. Rather, here we explored the effect of arbitrariness (manipulated by means of mapping or training) only in a six-choice condition. Thus, in this experiment the WM load effect is assumed to be described by a main effect and not an interaction as before (see Figure 2, six-choice condition). Note that despite this change in statistical comparisons, our definition of WM load remained as it was in Experiment 1, namely, the number of arbitrary rules that must be held in WM.

Method

Participants. Eight healthy undergraduate students (mean age = 24.5 years, SD = 1.28; six females, two males) similar to the participants in Experiment 1 took part in Experiment 2.

Stimuli and apparatus. In this experiment, a six-choice reaction task with digits as stimuli was used (arbitrary and nonarbitrary). All stimuli and apparatus were the same as those used in Experiment 1, including the counterbalancing of eight arbitrary mappings across participants.

Procedure. Participants performed two sessions on two consecutive days. The first session started with one block of nonarbitrary mapping followed by five blocks of arbitrary mapping (thus, the first two blocks in Session 1 allowed the examination of the nonarbitrary vs. arbitrary effect). The second session included five additional arbitrary mapping blocks. Each block consisted of 200 trials, with a short recess in between. Otherwise the procedure was the same as in the previous experiment.


Results and Discussion

The data were treated as in Experiment 1. Data trimming resulted in an acceptable number of trials for ex-Gaussian distribution fitting (mean number of trials was 172 for each condition in the mapping manipulation and 884 for each condition in the training manipulation). As before, a quantile–quantile plot was used to assess the ex-Gaussian fit, revealing a very good fit (see Appendix A).

Mapping manipulation. To examine effects due to the mapping manipulation, we conducted analyses on the parameter estimates describing performance in the first two blocks of Session 1 (i.e., nonarbitrary and arbitrary blocks). Specifically, a repeated measures ANOVA was performed separately on each parameter (μ, σ, τ) with mapping (arbitrary or nonarbitrary) as an independent variable (see Figure 6, top panel). As predicted, we found an effect of WM load only on the tail-parameter as evidenced by significantly lower τ values in the nonarbitrary versus arbitrary condition, F(1, 7) = 13.28, p < .01, ηp² = .65, with no effect of mapping on either μ, F(1, 7) = 0.6, ns, ηp² = .08, or σ, F(1, 7) = 1.12, ns, ηp² = .13. To compare the effect of mapping between the ex-Gaussian parameters, we performed a repeated measures ANOVA on the standardized Z scores with parameter (μ, σ, τ) and mapping (arbitrary vs. nonarbitrary) as independent variables. A one-tailed planned comparison performed on the simple interaction between the parameter contrast (τ vs. μ) and mapping (arbitrary vs. nonarbitrary) revealed a significant result, t(7) = 2.56, p < .05, ηp² = .48. A similar planned comparison comparing τ to σ revealed a marginally significant result, t(7) = 1.57, p = .08, ηp² = .26. Thus, the effect of WM load, represented by the mapping main effect, was stronger for τ compared with either μ or σ. This finding is in line with the much higher effect size found for the mapping manipulation in τ (ηp² = .65), compared with μ (ηp² = .08) and σ (ηp² = .13) in the analyses performed separately on each parameter. Finally, a repeated measures ANOVA with the same design performed on error rates revealed no significant effect, F(1, 7) = 0.22, ns, ηp² = .03.

Training manipulation. To examine effects due to the training manipulation, we conducted an analysis on the parameter estimates describing performance in the arbitrary mapping across the two sessions (i.e., Session 1 vs. Session 2). A repeated measures ANOVA was performed separately for each parameter (μ, σ, τ) with session (1 or 2) as an independent variable (see Figure 6, bottom panel). Again, as predicted, we found an effect of WM load only on the tail-parameter, apparent in a significant reduction from Session 1 to Session 2 in the τ parameter, F(1, 7) = 13.49, p < .01, ηp² = .66, with no significant effect of session on either μ, F(1, 7) = 0.56, ns, ηp² = .07, or σ, F(1, 7) = 0.01, ns, ηp² < .01.
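The partial eta squared values that accompany the F ratios above can be recovered from the F value and its degrees of freedom, using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to four of the reported tests as a consistency check. The function name, the selection of tests, and the 0.01 rounding tolerance are choices made here for illustration, not part of the article's analysis.

```python
def partial_eta_squared_from_f(f_value, df_effect, df_error):
    """Partial eta squared implied by an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, reported partial eta squared),
# copied from the Experiment 2 results for the separate-parameter ANOVAs.
reported_tests = [
    ("mapping, tau", 13.28, 1, 7, 0.65),
    ("mapping, mu", 0.60, 1, 7, 0.08),
    ("session, tau", 13.49, 1, 7, 0.66),
    ("session, mu", 0.56, 1, 7, 0.07),
]

ROUNDING_TOLERANCE = 0.01  # reported values are rounded to two decimals

checks = []
for label, f_value, df_effect, df_error, reported in reported_tests:
    implied = partial_eta_squared_from_f(f_value, df_effect, df_error)
    agrees = abs(implied - reported) < ROUNDING_TOLERANCE
    checks.append(agrees)
    print(f"{label:>13}: implied = {implied:.3f}, reported = {reported:.2f}, "
          f"agrees within rounding: {agrees}")
```

For a planned comparison reported as a t statistic, the same identity applies with F = t² and df_effect = 1.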
To explore the difference between the parameters, we performed a repeated measures ANOVA on the standardized Z scores with parameter (μ, σ, τ) and session (1 or 2) as independent variables. A one-tailed planned comparison was performed on the simple interaction of the parameter contrast (τ vs. μ) and session (1 vs. 2). It revealed a significant result, t(7) = 3.07, p < .01, ηp² = .57. A similar one-tailed planned comparison comparing τ to σ also revealed a significant result, t(7) = 3.14, p < .01, ηp² = .58. Thus, the effect of WM load, represented by the session main effect, was stronger for τ compared with either μ or σ. This finding is again in line with the higher effect size found for τ (ηp² = .66), compared with μ (ηp² = .07) and σ (ηp² < .01) in the analyses performed separately on each parameter. Finally, a repeated measures ANOVA with the same design performed on error rates was found to be nonsignificant, F(1, 7) = 0.06, ns, ηp² < .01.

In the current experiment, we successfully replicated the WM load effect from Experiment 1, demonstrating that manipulating the number of arbitrary rules in choice-reaction tasks (i.e., either zero for the nonarbitrary condition or six for the arbitrary conditions) had an effect only on the tail-parameter. In addition, we demonstrated that manipulating WM load by training participants in the same arbitrary condition (i.e., reducing the arbitrariness of the S–R mapping) showed a similar effect on the tail-parameter.
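The effect sizes reported with each test can be checked against the test statistics themselves. The following is a minimal sketch, assuming the standard conversion from an F ratio to partial eta squared, ηp² = (F × df_effect) / (F × df_effect + df_error), and F = t² for a single-degree-of-freedom contrast. The function names are illustrative and do not come from the article.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def partial_eta_squared_from_t(t_value: float, df_error: int) -> float:
    """A single-df contrast satisfies F(1, df_error) = t ** 2."""
    return partial_eta_squared(t_value ** 2, 1, df_error)


# Statistics reported for Experiment 2.
mapping_tau = partial_eta_squared(13.28, 1, 7)           # mapping effect on tau
training_tau = partial_eta_squared(13.49, 1, 7)          # session effect on tau
tau_vs_mu_mapping = partial_eta_squared_from_t(2.56, 7)  # (tau vs. mu) x mapping contrast

print(round(mapping_tau, 2), round(training_tau, 2), round(tau_vs_mu_mapping, 2))
# prints: 0.65 0.66 0.48
```

Because the published F and t values are themselves rounded, a recomputed value can differ from the printed effect size in the second decimal place.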

Figure 6. Results for Experiment 2, demonstrating that reducing working memory load by means of both mapping (arbitrary vs. nonarbitrary; top panel) and training (Session 1 vs. Session 2; bottom panel) affected only the tail-parameter (i.e., τ). Error bars reflect confidence intervals (Jarmasz & Hollands, 2009).

Experiment 3

Oberauer (2009) makes a distinction between declarative WM, a system devoted to the maintenance of facts and knowledge, and procedural WM, a system responsible for maintaining representations needed for the execution of a current task. Currently, there is increasing evidence to support this distinction (Souza, Oberauer, Gade, & Druey, 2012; Oberauer, Souza, Druey, & Gade, 2013). While Experiments 1 and 2 explored the effects of WM load using a procedural load manipulation (i.e., adding S–R rules), the main aim of the current experiment was to examine whether declarative load (i.e., adding stimulus–stimulus [S–S] rules) would also affect the RT distribution in a similar way. For this aim, we asked each participant to perform a single procedural task that included two S–R rules (i.e., two-choice reaction task) under three load conditions: (a) no load, performed as a single task with no additional rules; (b) procedural load, in which the same two-choice task was performed with an additional procedural task with two S–R rules; and (c) declarative load, in which the same two-choice task was performed with an additional declarative task with two S–S rules. Thus, all three load conditions included a single two-choice RT (procedural) task that was performed either alone or with an additional procedural or declarative load task. Analyses were later calculated only on the performance in the procedural task that was constant across load conditions.

All of the tasks in this experiment were performed with WM demanding (arbitrary) or not demanding (nonarbitrary) rules. Specifically, WM load was predicted to increase with the number of arbitrary rules that had to be maintained in WM. When mapping was nonarbitrary, WM load was predicted to be held to a minimum for all three load conditions: no load, procedural load, and declarative load (i.e., zero arbitrary rules). Yet, when mapping was arbitrary, WM load was predicted to increase in the no-load condition (i.e., two arbitrary rules) and further increase in both the procedural and declarative load conditions (i.e., two plus two arbitrary rules; see Figure 7). As in Experiments 1 and 2, we predicted that WM load would influence only the tail-parameter.
Additionally, if indeed general WM resources (and not only procedural WM load) affect the rate of exceptionally slow RTs, both declarative and procedural WM load should cause a similar increase in the tail-parameter.
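The predicted WM load for each design cell follows directly from the rule counts given above. A small sketch that tabulates them, where the function and variable names are illustrative assumptions and only the counts (zero, two, and two plus two arbitrary rules) come from the text:

```python
from itertools import product

LOADS = ("no load", "procedural load", "declarative load")
MAPPINGS = ("nonarbitrary", "arbitrary")


def arbitrary_rule_count(load: str, mapping: str) -> int:
    """Number of arbitrary rules assumed to be held in WM for one design cell."""
    if mapping == "nonarbitrary":
        return 0  # ordered mappings are assumed to place no load on WM
    constant_task_rules = 2  # the two S-R rules present in every load condition
    added_rules = 0 if load == "no load" else 2  # two extra S-R or S-S rules
    return constant_task_rules + added_rules


predicted_load = {
    (load, mapping): arbitrary_rule_count(load, mapping)
    for load, mapping in product(LOADS, MAPPINGS)
}

for (load, mapping), n_rules in predicted_load.items():
    print(f"{load:<17}{mapping:<14}{n_rules}")
```

The table makes the logic of the Load × Mapping interaction explicit: load changes the rule count only within the arbitrary mapping, so a WM load effect should appear as an interaction rather than as a main effect of load.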


Figure 7. Working memory load manipulations and predictions for Experiment 3. As in Experiments 1 and 2, working memory load was assumed to increase with the number of arbitrary rules. Thus, manipulating load (no load, procedural load, or declarative load) when mapping is nonarbitrary was not assumed to affect working memory load (i.e., number of arbitrary rules remains zero). Yet, when mapping was arbitrary, working memory load was assumed to increase in the no-load condition (two arbitrary rules) and further increase in the procedural and declarative load conditions (two plus two arbitrary rules), relative to the nonarbitrary conditions. This effect was predicted to appear only in the tail-parameter (i.e., τ), reflecting the influence of either declarative or procedural working memory load on the rate of exceptionally slow reaction times.

Method

Design. To ensure the generality of the results, we conducted two similar experiments (i.e., Experiments 3A and 3B) with different target stimuli (i.e., digits and letters, respectively). Both Experiment 3A and Experiment 3B had a 3 × 2 factorial design with load (no load, procedural load, or declarative load) and mapping (arbitrary vs. nonarbitrary) as within-subjects independent variables.

Participants. Twelve healthy undergraduate students took part in each experiment (Experiment 3A: mean age = 22.67 years, SD = 1.03; nine females, three males; Experiment 3B: mean age = 23.08 years, SD = 1.5; 11 females, one male), similar to the participants who took part in Experiment 1.

Load conditions. Both experiments included the same three load conditions: no load (with one procedural task), procedural load (with two procedural tasks), and declarative load (with one procedural and one declarative task). Four adjacent keys aligned horizontally were presented to the participants in all load conditions. Two keys (either the inner second and third keys or the outer first and fourth keys, counterbalanced between participants) were used for the constant procedural task in all load conditions. The two additional keys (outer and inner, respectively, depending on the counterbalancing condition) were used for additional procedural and declarative load tasks, as described below (see Figure 8). For clarity, we now describe each load separately according to the instruction given to the participants in Experiment 3A (the conditions in Experiment 3B were the same, only substituting letters for digits and vice versa; thus Experiment 3B is not further described in this section).

No load. Participants performed a single-task block with one procedural task containing two S–R rules. Thus, at the beginning of each block participants were presented with two target digits associated with response keys (see Figure 8). In each trial, one of the two target digits appeared in the center of the screen, and the participants were required to press the key that was associated with that specific digit. Additionally, four squares representing the four response keys used throughout the experiment were presented below the target stimulus. In all trials, two asterisks appeared inside two of the four squares (i.e., two inner or outer according to counterbalance conditions) pointing to the keys to be used in this task (see Figure 9, upper panel). This was used in order to equate the screen presentation to the procedural and declarative load conditions, as described below.

Procedural load. Participants performed a mixed block with two procedural tasks, each including two S–R rules. Thus, at the beginning of the block, participants were presented with four target digits associated with response keys (see Figure 8). Later in each trial, one of the four target digits appeared at the center of the screen, and the participants were required to press the key that was associated with that target digit. Four squares representing the four response keys used throughout the experiment were presented below the target stimulus. To point to the keys to be used in this task, two asterisks appeared inside two of the four squares (i.e., two inner or outer according to counterbalance conditions; see Figure 9, middle panel). Thus, the small but important difference between this double two-choice task and a four-choice reaction task was that in each trial, participants had to select one of two possible key pairs to press. In other words, participants were encouraged to use two sets of two S–R rules, instead of one set of four rules. To ensure that the participants indeed divided the four-digit set into two two-choice reaction tasks, we predicted (and found) a switch cost between these two subsets.

Declarative load. Participants performed a mixed block with one procedural task that included two S–R rules and one declarative task that included two S–S rules. Thus, at the beginning of the block, participants were presented with four target digits, two associated with response keys (i.e., procedural task) and two associated with letters (i.e., declarative task; see Figure 8). Later in each trial, one of the four target digits appeared at the center of the screen, and the participants were required to press the key or report the letter that was associated with that target digit. Thus, in this condition either letters or responses had to be retrieved, depending on the target digit presented in each trial (see Figure 9, bottom panel). Four squares representing the four response keys used throughout the experiment were presented below the target stimulus. In trials in which a digit that was mapped to a response appeared, a key-press response was to be retrieved. To point to the keys to be used in this task, two asterisks appeared inside two of the four squares (i.e., two inner or outer according to counterbalance conditions). In trials in which a letter was to be retrieved, the two target letters (rather than asterisks) appeared in two of the four squares presented below the target stimuli, allowing the participant to report a letter using a key press. Letter position within the two squares was randomly chosen on each trial. The random letter–key association implied that we tested participants' memory for the digit–letter association rather than an association with a response. Thus, in all trials, whether a letter or a key-press response was to be retrieved, participants had to select one of two possible keys to press (see Figure 9, bottom panel).

In sum, both experiments included three load conditions, each including a combination of either procedural or declarative tasks. The no-load condition included a single-task block with one procedural task (two S–R rules). The procedural load included a mixed block with two procedural tasks (two S–R rules plus two S–R rules), and declarative load included a mixed block with one procedural and one declarative task (two S–R rules plus two S–S rules). Thus, all load conditions (i.e., no load, procedural load, declarative load) included one procedural task with two S–R rules, but performed either alone or mixed with a task that included two additional S–R/S–S rules (see Figure 8).

Figure 8. Example of a nonarbitrary and an arbitrary mapping used in Experiment 3A for each of the three load conditions (no load, procedural load, and declarative load). (English letters are presented for illustrative purposes only; letter stimuli in the experiment were Hebrew letters, in equivalent order.) Two stimulus–response rules remained similar across all three tasks (in this example, mapped to the two inner second and third keys and marked with a dashed line) and performed under no additional load (upper panels), with two additional procedural stimulus–response rules (i.e., digit to key associations; middle panels) or two additional declarative stimulus–stimulus rules (i.e., digit to letter associations; bottom panels). Data analysis was later performed only on the two stimulus–response rules that were similar across load conditions (in this example, the two inner keys marked with a dashed line).

Figure 9. Screen illustrations for tasks using a nonarbitrary mapping in Experiment 3A. Two procedural rules were kept constant and performed with no additional rules (i.e., no load; top panel), with additional procedural stimulus–response rules (i.e., procedural load; middle panel), or with additional declarative stimulus–stimulus rules (i.e., declarative load; bottom panel).

Stimuli and apparatus. Target stimuli in Experiment 3A and 3B were digits and letters, respectively. The mapping manipulation (arbitrary vs. nonarbitrary) was exactly as in Experiment 1. That is, nonarbitrary mapping used digits or letters by order, and arbitrary mapping had randomly chosen digits or letters. In all conditions, four keys were used (i.e., the keys C, V, B, N, on a QWERTY keyboard) and were covered with stickers (to facilitate positional rather than letter-based coding). The stimulus display consisted of the centrally presented target digit or letter, and 3 cm below it, a line of four white squares (each 2 × 2 cm in size, with a 2.5-cm space between each square) indicating the four possible response keys. For procedural S–R rules, target stimuli were associated with a key. For declarative S–S rules, we used digit–letter associations (see Figure 8). In Experiment 3A, declarative S–S associations included target digits that were associated with the two Hebrew letters corresponding to the first and fourth or the second and third letters in Hebrew (i.e., aleph [א] and dalet [ד] or bet [ב] and gimel [ג]). In Experiment 3B, target letters were associated with the two digits (i.e., 1 and 4 or 2 and 3). Note that we capitalized on the fact that in Hebrew each letter also denotes a number, making it possible to create a nonarbitrary mapping for these stimuli.

Procedure.
Each condition in each experiment consisted of 400 trials divided into four blocks, to allow a short recess. The ordering of the conditions was counterbalanced across participants by using a Latin square. The trial sequence was identical to that in Experiment 1 with one target stimulus randomly selected and displayed on each trial.
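Counterbalancing condition order with a Latin square can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the text does not say which Latin square was used, so the sketch assumes the six Load × Mapping cells are the units being ordered, builds a simple cyclic square, and assigns the twelve participants to rows in rotation. All names are hypothetical.

```python
from itertools import product


def cyclic_latin_square(items):
    """Return an n x n square in which row r is `items` rotated left by r places."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


LOADS = ("no load", "procedural load", "declarative load")
MAPPINGS = ("nonarbitrary", "arbitrary")
conditions = [f"{load} / {mapping}" for load, mapping in product(LOADS, MAPPINGS)]

square = cyclic_latin_square(conditions)

# Hypothetical assignment: participant p (0-based) receives row p mod 6,
# so each of the six orders is used by two of the twelve participants.
n_participants = 12
orders = [square[p % len(square)] for p in range(n_participants)]

for participant, order in enumerate(orders[:2], start=1):
    print(participant, order)
```

In a cyclic square every condition appears once in every serial position, which balances position effects. It does not balance which condition precedes which, so a study concerned with carryover would need a different construction.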

Results

To ensure that in both the procedural and the declarative load conditions the participants indeed used two sets of mappings, we conducted a preliminary analysis testing for switch costs between the two tasks. We found significant switch costs in both load conditions (i.e., procedural and declarative), supporting our predictions (see Appendix B). For the main analysis of WM load effects, we analyzed responses only to the two procedural S–R rules that were similar across conditions (see Figure 8). Additionally, switch trials were discarded in order to control for the influence of the switching effect (see Appendix B). Otherwise, the same data treatment approach was used as before. This resulted in an acceptable number of trials for ex-Gaussian distribution fitting per condition (the mean number of trials was 327.17 for the no-load, 81.5 for the procedural load, and 84.4 for the declarative load conditions, calculated across both Experiments 3A and 3B). Visual examination of the ex-Gaussian fit was performed with quantile–quantile plots, revealing a very good fit for both Experiments 3A and 3B (see Appendix A).
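To make the three ex-Gaussian parameters concrete, the sketch below recovers μ, σ, and τ from simulated reaction times by moment matching. This is a deliberately simple stand-in and is not presented as the fitting procedure used in the article, which is not specified in this section. It relies on the ex-Gaussian being the sum of a normal variable (mean μ, standard deviation σ) and an independent exponential variable (mean τ), so that the mean is μ + τ, the variance is σ² + τ², and the third central moment is 2τ³. The simulated values (400, 60, 100 ms) are invented for illustration.

```python
import numpy as np


def exgauss_moments(rts):
    """Method-of-moments estimates (mu, sigma, tau) for ex-Gaussian data."""
    rts = np.asarray(rts, dtype=float)
    mean = rts.mean()
    centered = rts - mean
    variance = np.mean(centered ** 2)
    third_moment = np.mean(centered ** 3)
    if third_moment <= 0:
        raise ValueError("sample is not right-skewed; tau cannot be estimated")
    tau = (third_moment / 2.0) ** (1.0 / 3.0)   # third central moment = 2 * tau**3
    if tau ** 2 >= variance:
        raise ValueError("tau estimate exceeds the total variance")
    sigma = np.sqrt(variance - tau ** 2)        # variance = sigma**2 + tau**2
    mu = mean - tau                             # mean = mu + tau
    return mu, sigma, tau


# Simulated data with made-up parameters (in ms); the seed is arbitrary.
rng = np.random.default_rng(0)
n_trials = 1_000_000
simulated_rts = rng.normal(400.0, 60.0, n_trials) + rng.exponential(100.0, n_trials)

mu_hat, sigma_hat, tau_hat = exgauss_moments(simulated_rts)
print(f"mu = {mu_hat:.1f}, sigma = {sigma_hat:.1f}, tau = {tau_hat:.1f}")
```

Moment estimates depend heavily on the third moment and become unstable with the few hundred trials per condition reported here, which is one reason likelihood-based fitting is generally preferred for real data. The very large simulated sample is used only so that the recovered values land close to the generating ones.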


In this section, we begin by reporting the main effects of load and mapping and their interaction. Next, we separately examine the effects of procedural versus declarative WM load manipulations using two nonorthogonal planned comparisons (procedural load vs. no load; declarative load vs. no load). Finally, to exhaust the variance, we examine the effects of WM load using two orthogonal planned comparisons (procedural and declarative load vs. no load; procedural vs. declarative load). Note that because WM load was assumed to change with the number of arbitrary rules, the manipulation of WM load is always examined across the interaction of load conditions (no load, procedural load, or declarative load) and mapping (arbitrary vs. nonarbitrary). Additionally, throughout the Results section, we present the effects observed in both Experiments 3A and 3B.

Main effects and interactions. A repeated measures ANOVA was performed separately for each parameter (μ, σ, τ) with load (no load, procedural load, declarative load) and mapping (arbitrary vs. nonarbitrary) as independent variables (see Figure 10). In both experiments (analyzed separately), the main effect for load was found to be significant for μ, Experiment 3A, F(2, 22) = 31.63, p < .001, ηp² = .74; Experiment 3B, F(2, 22) = 28.94, p < .001, ηp² = .72; and τ, Experiment 3A, F(2, 22) = 19.99, p < .001, ηp² = .64; Experiment 3B, F(2, 22) = 17.41, p < .001, ηp² = .61; but not for σ, Experiment 3A, F(2, 22) = 0.89, ns, ηp² = .07; Experiment 3B, F(2, 22) = 1.41, ns, ηp² = .11. The main effect for mapping was found to be significant only for τ, Experiment 3A, F(1, 11) = 22.15, p < .001, ηp² = .67; Experiment 3B, F(1, 11) = 18.52, p < .001, ηp² = .63; but not for μ, Experiment 3A, F(1, 11) = 0.07, ns, ηp² < .01; Experiment 3B, F(1, 11) = 0.42, ns, ηp² = .04; or σ, Experiment 3A, F(1, 11) = 0.05, ns, ηp² < .01; Experiment 3B, F(1, 11) = 0.09, ns, ηp² < .01. In addition, the interaction effect of Load × Mapping was found to be significant for τ, Experiment 3A, F(2, 22) = 8.33, p < .01, ηp² = .43; Experiment 3B, F(2, 22) = 7.29, p < .01, ηp² = .40; but not for μ, Experiment 3A, F(2, 22) = 0.07, ns, ηp² < .01; Experiment 3B, F(2, 22) = 0.27, ns, ηp² = .02; or σ, Experiment 3A, F(2, 22) = 0.29, ns, ηp² = .03; Experiment 3B, F(2, 22) = 0.09, ns, ηp² < .01.

A repeated measures ANOVA was also performed on the error rates with load (no load, procedural load, declarative load) and mapping (arbitrary vs. nonarbitrary) as independent variables. The main effect for load reached significance only in one experiment, Experiment 3A, F(2, 22) = 2.04, ns, ηp² = .16; Experiment 3B, F(2, 22) = 6.42, p < .01, ηp² = .37. The main effect of mapping, Experiment 3A, F(1, 11) = 1.77, ns, ηp² = .14; Experiment 3B, F(1, 11) = 2.61, ns, ηp² = .19, and the Load × Mapping interaction, Experiment 3A, F(2, 22) = 1.3, ns, ηp² = .10; Experiment 3B, F(2, 22) = 1.02, ns, ηp² = .08, did not reach significance in either experiment.

Figure 10. Ex-Gaussian results for Experiments 3A (top panel) and 3B (bottom panel). Results demonstrate that working memory demands, as reflected by the mapping (arbitrary vs. nonarbitrary) × load (no load, procedural [pro.] load, declarative [dec.] load) interaction, affected only the tail-parameter (i.e., τ). This was true whether the loading rules were procedural (i.e., stimulus–response) or declarative (i.e., stimulus–stimulus). Error bars reflect confidence intervals (Jarmasz & Hollands, 2009).

Procedural working memory manipulation. To account for the procedural WM load effect, we performed a one-tailed planned comparison between a load contrast (no load vs. procedural load) and mapping (arbitrary vs. nonarbitrary). In both experiments, we found a significant effect for procedural WM load only for τ, Experiment 3A, t(11) = 3.24, p < .01, ηp² = .49; Experiment 3B, t(11) = 4.12, p < .001, ηp² = .61; but not for μ, Experiment 3A, t(11) = 0.38, ns, ηp² = .01; Experiment 3B, t(11) = 0.33, ns, ηp² < .01; or σ, Experiment 3A, t(11) = 0.33, ns, ηp² = .01; Experiment 3B, t(11) = 0.01, ns, ηp² < .01. To test whether the effect of procedural WM load differed significantly across the ex-Gaussian parameters, we performed a repeated measures ANOVA on the standardized Z scores of the parameters with parameter (μ, σ, τ), load (no load, procedural load, declarative load), and mapping (arbitrary or nonarbitrary) as independent variables. A one-tailed planned comparison was performed on the simple interaction between parameter (τ vs. μ), load (no load vs. procedural), and mapping (arbitrary vs. nonarbitrary). It revealed a significant result for both experiments, Experiment 3A, t(11) = 2.71, p < .05, ηp² = .40; Experiment 3B, t(11) = 3.11, p < .05, ηp² = .47. The same comparison contrasting τ with σ revealed a similar significant result, Experiment 3A, t(11) = 2.41, p < .05, ηp² = .34; Experiment 3B, t(11) = 2.28, p < .05, ηp² = .32. These findings are in line with the higher effect sizes found in the τ parameter (Experiment 3A, ηp² = .49; Experiment 3B, ηp² = .61), compared with μ (Experiment 3A, ηp² = .01; Experiment 3B, ηp² < .01) and σ (Experiment 3A, ηp² = .01; Experiment 3B, ηp² < .01) in the analyses performed separately on each parameter. Thus, these results demonstrate that the effect of procedural WM load was more pronounced in the tail-parameter compared with the two other parameters. Finally, a one-tailed planned comparison examining the interaction effect of Load (no load vs. procedural load) × Mapping (arbitrary vs. nonarbitrary) on error rates did not reach significance in either experiment, Experiment 3A, t(11) = 1.33, ns, ηp² = .14; Experiment 3B, t(11) = 1.13, ns, ηp² = .10.

Declarative working memory manipulation. To account for the declarative WM load effect, we examined a planned comparison between a load contrast (no load vs. declarative load) and mapping (arbitrary vs. nonarbitrary). We found in both experiments a significant effect only for τ, Experiment 3A, t(11) = 2.60, p < .05, ηp² = .38; Experiment 3B, t(11) = 3.49, p < .01, ηp² = .53; but not for μ, Experiment 3A, t(11) = 0.29, ns, ηp² < .01; Experiment 3B, t(11) = 0.42, ns, ηp² = .02; or σ, Experiment 3A, t(11) = 0.60, ns, ηp² = .03; Experiment 3B, t(11) = 0.46, ns, ηp² = .02. To test whether the effect of declarative WM load differed significantly across the ex-Gaussian parameters, we performed a repeated measures ANOVA on the standardized Z scores of the parameters with parameter (μ, σ, τ), load (no load, procedural load, declarative load), and mapping (arbitrary or nonarbitrary) as independent variables. A one-tailed planned comparison was performed on the simple interaction between parameter (τ vs. μ), load (no load vs. declarative), and mapping (arbitrary vs. nonarbitrary), revealing a significant result only in one experiment, Experiment 3A, t(11) = 0.98, ns, ηp² = .08; Experiment 3B, t(11) = 4.68, p < .001, ηp² = .66. The same comparison contrasting τ with σ also revealed a significant result only in one experiment, Experiment 3A, t(11) = 0.56, ns, ηp² = .03; Experiment 3B, t(11) = 4.14, p < .001, ηp² = .61. Overall, these findings are in line with the higher effect sizes found in the τ parameter (Experiment 3A, ηp² = .38; Experiment 3B, ηp² = .53), compared with μ (Experiment 3A, ηp² < .01; Experiment 3B, ηp² = .02) and σ (Experiment 3A, ηp² = .03; Experiment 3B, ηp² = .02) in the analyses performed separately on each parameter, demonstrating that the effect of declarative WM load was more pronounced in the tail-parameter compared with the two other parameters. Finally, a one-tailed planned comparison examining the interaction effect of Load (no load vs. declarative load) × Mapping (arbitrary vs. nonarbitrary) on error rates was found to be marginally significant for one experiment and nonsignificant for the other, Experiment 3A, t(11) = 1.73, p = .06, ηp² = .21; Experiment 3B, t(11) = 1.41, ns, ηp² = .15.

Examining WM load manipulations using orthogonal comparisons. The planned comparisons reported above, examining procedural and declarative WM load effects, are nonorthogonal and as such do not exhaust the variance in this design. Thus, we continued to examine our assumptions by performing two additional orthogonal planned comparisons on the performance estimates (a) examining the effect of both procedural and declarative load conditions compared with the no-load condition and (b) examining the difference between procedural and declarative loads. Thus, we first explored the interaction between a load contrast (no load vs. other) and mapping (arbitrary vs. nonarbitrary). This comparison was found significant only for the tail-parameter, Experiment 3A, t(11) = 3.37, p < .01, ηp² = .51; Experiment 3B, t(11) = 5.0, p < .001, ηp² = .69; but not for μ, Experiment 3A, t(11) = 0.42, ns, ηp² = .02; Experiment 3B, t(11) = 0.08, ns, ηp² = .01; or σ, Experiment 3A, t(11) = 0.18, ns, ηp² = .01; Experiment 3B, t(11) = 0.22, ns, ηp² = .01, supporting the conclusion that both procedural and declarative WM load affected only the long-RT tail. We then proceeded with a second planned comparison, examining the interaction between a load contrast (procedural vs. declarative) and mapping (arbitrary vs. nonarbitrary). This interaction contrast was found significant only in one experiment and only for τ, Experiment 3A, t(11) = 2.6, p < .05, ηp² = .38; Experiment 3B, t(11) = 1.02, ns, ηp² = .08; with a nonsignificant result found both for μ, Experiment 3A, t(11) = 0.04, ns, ηp² < .01; Experiment 3B, t(11) = 0.69, ns, ηp² = .04; and σ, Experiment 3A, t(11) = 0.57, ns, ηp² = .02; Experiment 3B, t(11) = 0.36, ns, ηp² = .01, in the two experiments. As can be seen in Figure 10, these results reflect a larger increase in τ when procedural WM load was used as compared with declarative WM load, but only in Experiment 3A. The fact that this interaction was found significant only for Experiment 3A raises the possibility that the difference between procedural and declarative WM loads might be task specific, an issue that would have to be resolved in a dedicated study. A one-tailed comparison of load (no load vs. other) × mapping (arbitrary vs. nonarbitrary) performed on error rates was found to be only marginally significant in one experiment and did not reach significance in the other, Experiment 3A, t(11) = 1.7, p = .06, ηp² = .20; Experiment 3B, t(11) = 1.41, ns, ηp² = .15. In addition, a one-tailed comparison of load (procedural vs. declarative) × mapping (arbitrary vs. nonarbitrary) performed on error rates was also found nonsignificant, Experiment 3A, t(11) = 0.88, ns, ηp² = .06; Experiment 3B, t(11) = 0.22, ns, ηp² < .01. In conclusion, overall results obtained with orthogonal comparisons supported the claim that loading WM with either procedural or declarative information significantly influenced the tail-parameter but not the other two parameters.
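The difference between the nonorthogonal and orthogonal comparison sets can be illustrated with contrast weights over the three load levels. The weights below are an assumption: they are the conventional coding for the comparisons named in the text, and the article does not list the weights it used. In a balanced design, two contrasts are orthogonal when the dot product of their weight vectors is zero.

```python
import numpy as np

# Weight order: (no load, procedural load, declarative load). Assumed coding.
nonorthogonal_pair = {
    "procedural vs. no load": np.array([-1, 1, 0]),
    "declarative vs. no load": np.array([-1, 0, 1]),
}
orthogonal_pair = {
    "load (either type) vs. no load": np.array([-2, 1, 1]),
    "procedural vs. declarative": np.array([0, 1, -1]),
}


def are_orthogonal(weights_a, weights_b) -> bool:
    """Two contrasts are orthogonal (equal cell sizes) when their dot product is 0."""
    return int(np.dot(weights_a, weights_b)) == 0


first_pair = list(nonorthogonal_pair.values())
second_pair = list(orthogonal_pair.values())

print(are_orthogonal(*first_pair))   # prints: False (dot product is 1)
print(are_orthogonal(*second_pair))  # prints: True  (dot product is 0)
```

The first pair shares the no-load condition, so the two tests overlap in the variance they explain. The second pair splits the two degrees of freedom of the load factor into independent pieces, which is what exhausting the variance refers to.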

Discussion In Experiments 3A and 3B, we demonstrated that the effects of declarative and procedural WM load manipulations were more pronounced in the tail-parameter compared with the two other ex-Gaussian parameters. In other words, the need to hold both S–R and S–S associations in mind increased the magnitude of the long-RT tail. This finding is in line with the results of Experiments 1 and 2, showing that WM load affects the long-RT tail. One could make the case that in the declarative WM load condition, there was a task switch involved between the S–R and S–S rules, whereas this was not the case when additional S–R rules were added in the procedural WM load condition. In other words, it might be that while participants switched between tasks in the declarative load conditions, they only performed one procedural task (i.e., four-choice reaction task) instead of two two-choice reaction tasks in the procedural load condition. Our results suggest that this was not the case, since comparable switch costs were found in both load conditions (see Appendix B), showing that participants made a choice between two alternatives and switched between two tasks in both procedural and declarative load conditions. We of course removed these task switch trials before estimating the ex-Gaussian parameters. Another argument might be that while our analytic approach removed task switch effects, it did not remove the influence of task mixing effects. Task mixing effects refer to the poorer performance seen even in nonswitch trials when the experimental block


involves task switching (e.g., Koch, Prinz, & Allport, 2005; Los, 1996; Rubin & Meiran, 2005). Moreover, Steinhauser and Hübner (2009) have shown that task mixing influences the tail-parameter, τ, more than it influences the other two ex-Gaussian parameters. A related argument is that only when there was a load did participants have to monitor the stimulus display to know which task to execute. When there were only two choices and no load, such monitoring was not required. Yet, this task mixing effect should have been observed equally for the arbitrary and nonarbitrary conditions. Since WM load was always examined by contrasting the effect of load (no load, procedural load, declarative load) on the different levels of mapping (arbitrary vs. nonarbitrary), these arguments are rendered implausible.

While the current experiments demonstrate that the long-RT tail is affected by both declarative and procedural WM load, they also make an additional important contribution to the experimental section as a whole by demonstrating that the long-RT tail is affected even when WM is loaded with information that is irrelevant to the trial currently being performed. That is, in Experiments 1 and 2, WM was loaded with task-relevant rules (i.e., increasing the number of arbitrary rules in a single choice-reaction task). In Experiments 3A and 3B, participants performed one task, either alone or mixed with a set of two additional rules relevant to a different, second task. Thus, the supplementary contribution of Experiments 3A and 3B is the demonstration that the addition of arbitrary rules increased the tail-parameter, even when these rules were irrelevant to the trial currently being performed.

Summary: Experiments 1–3

In three experiments we examined the influence of WM demand manipulations on the rate of exceptionally slow RTs, described by the tail-parameter. In all experiments, WM demands were manipulated by varying the number of arbitrary rules that needed to be maintained in WM. In Experiment 1, we manipulated WM demands by the number of choices and by the arbitrariness of the S–R mapping. As predicted, we found that the tail-parameter was most strongly influenced by the WM load manipulation. In Experiment 2, we replicated the effect of mapping on the tail-parameter and additionally demonstrated a similar effect when WM load was manipulated by means of practice. In Experiment 3, we found that both declarative and procedural WM load influenced the tail-parameter. Additionally, we found that this effect is apparent even when WM was loaded with information irrelevant to the trial currently being performed. Thus, the results are consistent and demonstrate that WM demanding conditions lead to an increased magnitude of the long-RT tail, as indexed by the tail-parameter (i.e., τ). To provide a summary of the results, we present in Table 1 effect sizes for WM manipulations in each of the three ex-Gaussian parameters across the entire experimental section. We believe that these results demonstrate a causal relation between WM resources and the magnitude of the tail-parameter in choice-reaction tasks. In the following section, we examine the underlying mechanistic factors that correspond to this change within an accumulation to criterion decision framework.

Table 1
Effect Sizes of Working Memory (WM) Manipulations Across Ex-Gaussian Parameters

Experiment   WM load manipulation   μ       σ       τ
1            Mapping                .03     .08     .47
2            Mapping                .08     .13     .65
2            Training               .07     <.01    .66
3A           Procedural load        .01     .01     .49
3A           Declarative load       <.01    .03     .38
3B           Procedural load        <.01    <.01    .61
3B           Declarative load       .02     .02     .53

Note. Effect sizes are presented in partial eta-squared (ηp²) estimates.

Modeling the Decision Process

While the ex-Gaussian model describes the different aspects of the RT distribution, fitting this model does not allow drawing process-related conclusions (e.g., Matzke & Wagenmakers, 2009). For that reason, in the following section, we implement a process-based modeling approach to gain further insight into the inner workings of the underlying mechanism through which the availability of WM resources influenced the RT distribution tail. In this investigation, we focus on the comparison of three distinct hypotheses about how higher WM load might influence the RT tail: (a) by reducing the quality of perceptual information available for the perceptual decision; (b) by aggravating fluctuations in attention and/or motivation, causing interference in the perceptual decision process; or (c) by prolonging the process of S–R rule retrieval from WM, a process that needs to be completed before response execution and is extrinsic to the perceptual decision process.
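As an illustration only, the descriptive ex-Gaussian model referred to above can be sketched in Python as the sum of a Gaussian component (μ, σ) and an exponential component with mean τ. The function names and all parameter values below are hypothetical and are not estimates from the experiments; the sketch merely shows that raising τ stretches the upper quantiles far more than the median.

```python
import random
import statistics

def sample_ex_gaussian(mu, sigma, tau, n, seed=0):
    """Draw n RTs as Gaussian(mu, sigma) plus an exponential with mean tau."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau) for _ in range(n)]

def quantile(values, q):
    """Empirical q-quantile (0 < q < 1) taken from the sorted sample."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]

# Hypothetical values in milliseconds, chosen only for illustration.
low_load = sample_ex_gaussian(mu=400, sigma=40, tau=50, n=20000, seed=1)
high_load = sample_ex_gaussian(mu=400, sigma=40, tau=150, n=20000, seed=1)

for label, rts in (("small tau", low_load), ("large tau", high_load)):
    print(label,
          "median:", round(statistics.median(rts)),
          "0.95 quantile:", round(quantile(rts, 0.95)))
```

Only τ differs between the two simulated conditions, so any difference between them comes from the exponential component.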

The Accumulation to Criterion Framework

The process underlying perceptual decision tasks has been the object of extensive research (Brown & Heathcote, 2005; Ditterich, 2010; Forstmann, Brown, Dutilh, Neumann, & Wagenmakers, 2010; Gao, Tortell, & McClelland, 2011; Niwa & Ditterich, 2008; Philiastides, Ratcliff, & Sajda, 2006; Ratcliff, 2002; Ratcliff & Rouder, 2000; Siegel, Engel, & Donner, 2011; Smith & Ratcliff, 2009; Teodorescu & Usher, 2013; Trueblood, Brown, Heathcote, & Busemeyer, 2013; Usher & McClelland, 2001; Vickers, 1970; Vickers & Lee, 1998). Within this framework, the principal challenge for any theory of perceptual choice is to conjointly account for both choice and RT distributional data. When the data also span several experimental manipulations (as is the case here), strong constraints are placed on decision models. In response to this challenge, a number of models have been developed, which share the assumption that perceptual evidence is accumulated to a decision criterion, which in turn determines the response time. This type of model, also known as evidence accumulation, or sequential sampling model, is aptly equipped to account for a wide array of behavioral phenomena, simultaneously capturing both choice and RT distributions. One of the most widely applied sequential sampling models is the drift diffusion model (Ratcliff, 1978; Ratcliff & McKoon, 2008; Ratcliff & Rouder, 1998). The drift diffusion model has been successfully used to model a plethora of phenomena, ranging from perceptual decisions to memory and aging, and has been shown to provide important insights into the underlying cognitive mechanisms (Ratcliff & Rouder, 2000; Ratcliff, Thapar, & McKoon, 2003; Smith & Ratcliff, 2009; Starns, Ratcliff, & McKoon, 2012). The basic assumption of the classic diffusion model is that the difference in

supporting evidence between the two choice alternatives, also known as drift rate, is stochastically accumulated toward one of two decision thresholds: an upper, positive bound representing the response associated with the target and a lower, negative bound representing the nontarget alternative. When the choice is easy, drift rate is high and responses are fast and accurate. On the other hand, when the choice is difficult, drift rate is low, responses are slow, and the error rate is high. While the average drift rate is assumed to be constant within a trial and is determined by the specific set of stimuli, several additional sources of variability are necessary in order to generate error responses and to fully capture the variability in RT distributions. To simulate the noisiness intrinsic to neural processing within a trial, the average drift is repeatedly perturbed at each time step by a Gaussian random variable with a standard deviation (s) and mean 0. In addition, two sources of between-trial variability are introduced for each trial. Starting point variability (SPV) simulates the variability in starting conditions of the organism at stimulus onset and is implemented as a uniform random variable with a range (−SPV/2, SPV/2) that determines the initial bias of the decision variable. Drift variability simulates the variability in the general state of the organism from trial to trial (representing, for example, attentional shifts or fluctuations in concentration) and is implemented as a Gaussian random variable with mean 0 and a standard deviation set by the drift variability parameter. Drift variability is added to the mean drift rate before the onset of each trial and is applied for the whole duration of the trial.
Starting point variability and drift variability are instrumental in the diffusion model’s ability to deal with certain empirical findings such as fast and slow error RT distributions relative to correct RT distributions, both of which were observed in this study (for review, see Ratcliff & McKoon, 2008). Another parameter of particular interest in this study is nondecision time that encapsulates all time-consuming nondecision processes. Usually, it is assumed that these processes include mostly sensory processing and motor response generation. However, our current results (see below) challenge this assumption. We will return to this issue later. Last, a step-size parameter is used to transform discrete simulation time steps into continuous time measured in milliseconds. The diffusion model in this form is easily implemented for two-alternative forced-choice decisions and could, in fact, be considered optimal for such decisions as long as difficulty does not change within a block (Bogacz, Brown, Moehlis, Holmes, & Cohen, 2006). One drawback of the classic diffusion model, however, is that it does not naturally extend to choice-reaction tasks with more than two alternatives. This difficulty emerges from the principal assumption of the diffusion model that evidence differences rather than absolute values are used to drive the diffusion process. Consequently, in the classic diffusion model there is only one accumulator and two decision thresholds, each identifying one of the two available alternatives. Evidence differences are simple to calculate for two alternatives. However, when the number of alternatives exceeds two, differences can no longer be straightforwardly used. 
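The classic two-boundary diffusion process described above can be sketched as a simple discrete-time simulation. This is a minimal sketch under stated assumptions, not the fitting code used in this study: the function names, the `max_steps` safeguard, and all parameter values are hypothetical, and the bounds are placed symmetrically at plus and minus `threshold`.

```python
import random
import statistics

def simulate_diffusion_trial(rng, drift, threshold, noise_sd, spv, drift_sd,
                             ndt_ms, step_ms, max_steps=100_000):
    """Simulate one two-choice trial and return (is_correct, rt_ms)."""
    # Between-trial variability: one draw per trial for drift and starting point.
    trial_drift = drift + rng.gauss(0.0, drift_sd)
    evidence = rng.uniform(-spv / 2.0, spv / 2.0)
    steps = 0
    # Within-trial variability: fresh Gaussian noise at every time step.
    while abs(evidence) < threshold and steps < max_steps:
        evidence += trial_drift + rng.gauss(0.0, noise_sd)
        steps += 1
    # Upper bound = target response; lower bound (or a timeout) counts as an error.
    is_correct = evidence >= threshold
    # The step-size parameter converts simulation steps into milliseconds.
    return is_correct, ndt_ms + steps * step_ms

def summarize(drift, n_trials=2000, seed=0):
    """Return (accuracy, mean RT in ms) for a given mean drift rate."""
    rng = random.Random(seed)
    trials = [simulate_diffusion_trial(rng, drift, threshold=20.0, noise_sd=3.0,
                                       spv=10.0, drift_sd=0.1,
                                       ndt_ms=300.0, step_ms=5.0)
              for _ in range(n_trials)]
    accuracy = sum(ok for ok, _ in trials) / n_trials
    mean_rt = statistics.mean(rt for _, rt in trials)
    return accuracy, mean_rt

# Hypothetical drift rates for an easy and a difficult choice.
easy_accuracy, easy_rt = summarize(drift=1.0)
hard_accuracy, hard_rt = summarize(drift=0.3)
print(f"easy: accuracy={easy_accuracy:.3f}, mean RT={easy_rt:.0f} ms")
print(f"hard: accuracy={hard_accuracy:.3f}, mean RT={hard_rt:.0f} ms")
```

Because this sketch tracks a single evidence difference between two bounds, it applies only to two-alternative tasks.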
To circumvent this limitation, several extensions to the classic diffusion model have been proposed, such as the feed-forward inhibition diffusion (Niwa & Ditterich, 2008; Teodorescu & Usher, 2013) and the max-minus-next diffusion models (Krajbich & Rangel, 2011; McMillen & Holmes, 2006; Teodorescu & Usher, 2013). Both these models implement N accumulator units representing the N response alternatives and a single, common decision

threshold. In the feed-forward inhibition model, each accumulator receives input equal to the evidence supporting the alternative associated with it minus the average of the evidence supporting each of the other alternatives. These accumulator units then race each other, and the decision is determined by the first accumulator to reach the common threshold. In the max-minus-next model, N independent units accumulate evidence separately for each of the alternatives and the decision is terminated when the difference between the largest and the second largest accumulator crosses the threshold. Mathematically, the feed-forward inhibition and max-minus-next models reduce to the classic diffusion for N = 2 alternatives. However, from a process point of view, the models are not equivalent, and for some experimental manipulations the feed-forward inhibition model can result in different predictions from the classic diffusion and max-minus-next model and consequently has difficulty in accounting for certain empirical results (Teodorescu & Usher, 2013; Tsetsos, Usher, & McClelland, 2011). In addition, the max-minus-next model implements an asymptotically optimal version of the Bayesian Sequential Probability Ratio Test (Bogacz et al., 2006; Bogacz & Gurney, 2007; Y. Liu & Blostein, 1992; Wald & Wolfowitz, 1948). For these reasons and since our task involves multiple alternatives, we chose to implement the max-minus-next model as the base model through which to test the different hypotheses compared in this work. The main differences between the classic diffusion and the max-minus-next model, in terms of computational implementation, are that while all accumulators share variability distribution parameters such as drift variability, SPV, and s, the actual values at each time step (or trial) are randomly drawn separately for each accumulator.
In addition, drift rates are defined separately for the target and commonly for all the nontargets. (Here we arbitrarily set all nontarget drift rates to a value of 2, and the quality of the evidence was controlled by the advantage of the target drift rate over the nontarget drift rate.)
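The max-minus-next decision rule described above can be sketched as follows. This is an illustrative simulation under stated assumptions rather than the implementation used for the fits reported here (see Appendix C for the actual computational methods): the function names and the `max_steps` safeguard are hypothetical, and the parameter values only loosely echo the magnitudes that appear in Table 2.

```python
import random

NONTARGET_DRIFT = 2.0  # value the text reports for all nontarget accumulators

def simulate_mmn_trial(rng, n_alternatives, target_drift, threshold, noise_sd,
                       spv, drift_sd, ndt_ms, step_ms, max_steps=100_000):
    """Simulate one max-minus-next trial (n_alternatives >= 2); return (is_correct, rt_ms)."""
    # Accumulator 0 codes the target; the rest share the nontarget mean drift.
    mean_drifts = [target_drift] + [NONTARGET_DRIFT] * (n_alternatives - 1)
    # Shared distribution parameters, but separate random draws per accumulator.
    drifts = [d + rng.gauss(0.0, drift_sd) for d in mean_drifts]
    totals = [rng.uniform(-spv / 2.0, spv / 2.0) for _ in range(n_alternatives)]
    for step in range(1, max_steps + 1):
        totals = [x + d + rng.gauss(0.0, noise_sd) for x, d in zip(totals, drifts)]
        ranked = sorted(totals, reverse=True)
        # Stop when the leader exceeds the runner-up by the common threshold.
        if ranked[0] - ranked[1] >= threshold:
            winner = totals.index(ranked[0])
            return winner == 0, ndt_ms + step * step_ms
    return False, ndt_ms + max_steps * step_ms  # no decision within max_steps

def run_condition(n_alternatives, threshold, n_trials=1000, seed=0):
    """Return (accuracy, mean RT in ms) for one set size."""
    rng = random.Random(seed)
    trials = [simulate_mmn_trial(rng, n_alternatives, target_drift=3.4,
                                 threshold=threshold, noise_sd=2.9, spv=20.0,
                                 drift_sd=0.12, ndt_ms=300.0, step_ms=5.8)
              for _ in range(n_trials)]
    accuracy = sum(ok for ok, _ in trials) / n_trials
    mean_rt = sum(rt for _, rt in trials) / n_trials
    return accuracy, mean_rt

# Hypothetical thresholds that grow with set size (2, 4, and 6 choices).
results = {n: run_condition(n, threshold)
           for n, threshold in ((2, 37.0), (4, 46.0), (6, 51.0))}
for n, (accuracy, mean_rt) in results.items():
    print(f"{n} choices: accuracy={accuracy:.3f}, mean RT={mean_rt:.0f} ms")
```

With two alternatives the stopping rule depends only on the difference between the two accumulators, which is the sense in which the model reduces to the classic diffusion.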

Contrasting Assumptions About the Locus of RT Tail Effects

In Experiment 1, we demonstrated that increasing WM demands influenced the slow tail of the RT distribution. Two accounts for the effect of WM on RT have been previously suggested (see McVay & Kane, 2012, for discussion). First, it has been hypothesized that with more arbitrary S–R rules to keep in mind, the additional cognitive load can take a toll on perceptual processing, leading to reduced quality of evidence (Schmiedek et al., 2007). This could be modeled as a reduction in the drift rate within each trial. This assumption is also supported by previous findings, demonstrating that slower evidence accumulation rates are an established source for longer tailed RT distributions (Matzke & Wagenmakers, 2009; Ratcliff, 1978; Ratcliff & McKoon, 2008; Usher & McClelland, 2001; Vickers, 1979). Alternatively, WM load could also be hypothesized to reduce attentional or motivational resources that would lead to less consistency in stimulus processing between different trials. This can be modeled as an increase in the between-trial variability of the accumulation process (i.e., higher drift variability; McVay & Kane, 2012). Both these accounts are directly related to the course of evidence accumulation, making it also likely that WM load affects both processes to varying degrees. The assumption that both processes (i.e.,

drift rate and drift variability) are affected by WM load can also be regarded as representing a more general hypothesis that WM load affects the process of evidence accumulation. However, WM load need not necessarily affect the process of evidence accumulation. An alternative and plausible account could argue that WM load selectively affects the process of search and retrieval of S–R rules from WM, a process extrinsic to the accumulation of evidence. This describes a dual-process model whereby the evidence accumulation process is responsible only for stimulus evaluation (e.g., identifying the specific digit, letter, or shape) and, therefore, should remain unaffected by the availability of WM resources. In turn, the process responsible for response execution is charged with selecting the appropriate response matching the perceived stimulus. Thus, response execution requires the retrieval of S–R rules. When S–R rules are held in WM (as in the case of novel, arbitrary rules), the duration of the process of rule retrieval should be selectively influenced by the availability of WM resources. This account thus represents the hypothesis that WM load only affects components that are extrinsic to the evidence accumulation process. Thus, this model assumes that when the S–R information is novel (i.e., arbitrary), it must be held and retrieved from WM. Yet, for nonarbitrary rules that are assumed to have a stable representation in long-term memory, identification of the stimulus also carries with it the information required for execution of the appropriate response, and no additional processing is needed in order to retrieve response identity. For example, in a task requiring right–left key presses as responses to right–left stimulus positions, once the stimulus position is identified, no additional processing is required to identify the response. 
However, when S–R rules are held in WM, the response corresponding to the identified stimulus must be retrieved based on the outcome of the evidence accumulation process (e.g., given detection of a word in a lexical decision task, one needs to recall which key goes with “word”). Assuming that rules are stochastically retrieved one at a time and that this process is repeated until the appropriate rule is encountered, the rate of rule retrieval from WM should be constant, and dependent only on the number of items currently held in WM. In technical terms, what we describe here is a process with a constant (over time) hazard function (Townsend & Ashby, 1983) that is assumed to vary with WM load. Namely, the probability that the rule will be retrieved in the next moment, given that it has not yet been retrieved, remains constant for a given WM load. In other words, the retrieval process is not assumed to fatigue or improve its rate during the course of a given trial, but the probability to retrieve the right mapping at any given time is negatively related to the number of alternatives maintained in WM. When the hazard function is constant, the probability density function of retrieval times is exponentially distributed, as shown by Townsend and Ashby (1983, pp. 36 – 41). Thus, if the speed of memory retrieval is exponentially distributed with a rate that is a function of the number of rules currently held in WM, the model reduces to a max-minus-next model in which the nondecision processes component is broken down into two individual components. The first is the standard nondecision time parameter (involving motor response and perceptual processing), and the second is an additional exponential nondecision time process with a mean

time-constant parameter (1/λ) that is predicted to increase with the number of arbitrary rules and is assumed to reflect the process of information retrieval from WM. That is, what we describe here as an exponential nondecision time is assumed to reflect the time taken to retrieve the S–R rule after the stimulus has been identified (as described by the diffusion process). Evidence supporting the possibility that rule retrieval accounts for WM demands in choice-reaction tasks comes, for example, from Schmitz and Voss (2012), who employed diffusion modeling and found that nondecision time was influenced by task switching, especially when the time to prepare toward the task switch was short. Their finding is especially relevant here, since Mayr and Kliegl (2000) argued that task switching requires the retrieval of the rules of the upcoming task into WM. In the following section, we use a model comparison framework, based on the max-minus-next model, to test the above four hypotheses regarding the locus of this effect: (a) WM load interferes with perceptual processing and thus reduces the quality of the perceptual evidence going into the evidence accumulation process (i.e., drift rate decreases with the number of arbitrary S–R rules); (b) WM load causes a reduction in attentional or motivational resources and thus increases between-trial fluctuations in the quality of the evidence retrieved from the stimulus (i.e., drift variability increases with the number of arbitrary S–R rules); (c) some interaction of (a) and (b); (d) WM load only hinders retrieval of S–R rules from WM, thus prolonging the duration of the search for the right rule (i.e., the mean nondecision time-constant parameter [1/λ] increases with arbitrary S–R rules). There are many other model parameters that could be allowed to vary between conditions, and it is beyond the scope of this work to test all possible model combinations. Without proper theoretical justification, this endeavor is exploratory at best.
Furthermore, increasing the number of parameters carries with it the cost of increased dimensionality that makes fitting such models more prone to local minima. Thus, the aim of the current modeling examination was to explore which of the three components described above (i.e., drift rate, drift variability, and exponential nondecision time) best accounts for the behavioral effects of WM load. Accordingly, the best fitting model indicates which component is most affected by the WM load manipulation. In addition, the current study could also shed light on the more general question of whether WM load interferes with the actual evidence accumulation process or, rather, only affects processes that are extrinsic to evidence accumulation.
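The rule-retrieval account described above (rules sampled one at a time, with the search repeated until the appropriate rule is encountered) can be illustrated with a discrete-time sketch. This is an assumption-laden toy version: the function names, the fixed duration of a retrieval attempt (`ATTEMPT_MS`), and sampling with replacement at equal probability are all hypothetical choices made for illustration. Under these assumptions the number of attempts is geometric, the discrete analogue of an exponential distribution, so the chance of finishing on the next attempt does not depend on the time already spent.

```python
import random
import statistics

def retrieval_time_ms(rng, n_rules, attempt_ms):
    """Sample rules with replacement until the needed one (index 0) is drawn."""
    attempts = 1
    while rng.randrange(n_rules) != 0:
        attempts += 1
    return attempts * attempt_ms

def survival(times, t):
    """Proportion of retrieval times longer than t."""
    return sum(x > t for x in times) / len(times)

ATTEMPT_MS = 10.0  # hypothetical duration of a single retrieval attempt
rng = random.Random(0)
mean_by_load = {}
hazards_by_load = {}
for n_rules in (2, 4, 6):
    times = [retrieval_time_ms(rng, n_rules, ATTEMPT_MS) for _ in range(20000)]
    mean_by_load[n_rules] = statistics.mean(times)
    # Probability of finishing on the next attempt, early versus late in the search.
    hazard_early = 1 - survival(times, 10.0) / survival(times, 0.0)
    hazard_late = 1 - survival(times, 40.0) / survival(times, 30.0)
    hazards_by_load[n_rules] = (hazard_early, hazard_late)
    print(f"{n_rules} rules: mean={mean_by_load[n_rules]:.1f} ms, "
          f"hazard early={hazard_early:.2f}, late={hazard_late:.2f}")
```

In this toy version the mean retrieval time grows in proportion to the number of rules; the fitted time constants need not follow such a simple relation.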

Models

As outlined above, four models were compared, representing different hypotheses about the mechanism underlying the results of Experiment 1. The differences between the hypotheses were captured by allowing different parameters to vary with WM load, characterized by the number of arbitrary rules. That is, WM load was assumed to remain at a constant minimum across set size conditions in the nonarbitrary mapping (i.e., zero arbitrary rules) and increase with set size in the arbitrary mapping (see Figure 2). Additionally, speed–accuracy trade-offs have been found to underlie set size effects captured by Hick’s law (Usher, Olami, & McClelland, 2002). Accordingly, for all models, decision threshold was allowed to increase with set size, regardless of mapping.²

Model 1. In Model 1, WM load was assumed to affect the quality of the perceptual evidence going into the evidence accumulation process such that drift rate was set to decrease with set size only in the arbitrary mapping condition and remain constant across all nonarbitrary conditions (i.e., three drift rate parameters for the two, four, and six choices in the arbitrary mapping condition and one drift rate parameter for all set sizes in the nonarbitrary mapping).

Model 2. In Model 2, WM load was assumed to affect fluctuations in attention and/or motivation, and thus drift variability was set to increase with set size only in the arbitrary mapping condition and remain constant across nonarbitrary conditions (i.e., three drift variability parameters for the two, four, and six choices in the arbitrary mapping condition and one for all set size conditions in the nonarbitrary mapping).

Model 3. In Model 3, WM load manipulations were assumed to affect both drift rate and drift variability, and thus both processes were set to change with set size only in the arbitrary mapping condition and to remain constant across all nonarbitrary conditions.

Model 4. In Model 4, WM load was assumed to affect the process of rule retrieval, and thus the mean time-constant parameter (1/λ) of the exponential nondecision time was set to increase with the number of alternatives only in the arbitrary mapping condition and was set to 0 in all nonarbitrary conditions.

In addition, four parameters of the model (SPV, noise, step size, and basic nondecision time) were held constant across all experimental conditions for all models. In the following, we present the model comparison results (see Appendix C for details on the computational methods).

Table 2
Fits and Parameter Estimations for Models Describing Performance in Experiment 1

Model 1 (BICᵃ = 48,510.45; χ² errorᵇ = 0.29; no. of parameters = 12)
                        Arbitrary mapping                      Nonarbitrary mapping
Parameter               2 choices   4 choices   6 choices      (all set sizes)
Drift rate              3.44        3.17        3.10           3.66
Threshold               37.32       46.17       51.42
Drift variability       0.12
Additional parameter estimates: Noise: 2.87; SPV: 19.88; Step: 5.77; NDT: 299.46

Model 2 (BICᵃ = 49,316.72; χ² errorᵇ = 0.62; no. of parameters = 12)
                        Arbitrary mapping                      Nonarbitrary mapping
Parameter               2 choices   4 choices   6 choices      (all set sizes)
Drift rate              3.70
Threshold               34.30       47.11       54.40
Drift variability       0.24        0.40        0.50           0.09
Additional parameter estimates: Noise: 2.82; SPV: 24.16; Step: 6.45; NDT: 3.08

Model 3 (BICᵃ = 48,563.17; χ² errorᵇ = 0.28; no. of parameters = 15)
                        Arbitrary mapping                      Nonarbitrary mapping
Parameter               2 choices   4 choices   6 choices      (all set sizes)
Drift rate              3.45        3.18        3.11           3.66
Threshold               37.35       46.13       51.20
Drift variability       0.12        0.13        0.21           0.11
Additional parameter estimates: Noise: 2.84; SPV: 19.95; Step: 5.77; NDT: 305.48

Model 4 (BICᵃ = 48,290.61; χ² errorᵇ = 0.29; no. of parameters = 12)
                        Arbitrary mapping                      Nonarbitrary mapping
Parameter               2 choices   4 choices   6 choices      (all set sizes)
Drift rate              3.16
Threshold               13.10       15.85       18.07
Drift variability       0.25
NDT exponent            19.22       83.28       170.23         0
Additional parameter estimates: Noise: 1.39; SPV: 13.04; Step: 10.00; NDT: 313.08

Note. SPV = starting point variability; NDT = nondecision time. Values listed once in a row apply to all set sizes.
ᵃ A difference in Bayesian information criterion (BIC) score that is larger than 10 is considered substantial (Raftery, 1995).
ᵇ Chi-square values were calculated from averaged quantile proportions across subjects rather than frequencies (Ratcliff & Smith, 2004). Thus, chi-square values are used as a relative, rather than absolute, measure of performance that does not reflect statistical significance.

Modeling Results

Table 2 presents four sets of best fitting parameters, the associated goodness-of-fit score, and the Bayesian information criterion (BIC) score for each model.³ Among the four models, the best BIC fit is achieved by the model assuming that the parameter describing the exponential nondecision time component increases with WM load (i.e., Model 4). Importantly, all models that did not assume an exponential nondecision component (i.e., Models 1–3) were outperformed by Model 4 when model complexity was taken into account. Furthermore, to allow the decision intrinsic model class the maximal opportunity to fit the data, we included additional parametric freedom in drift rates, drift variances, and threshold setting by fitting an additional model that allowed drift rate, drift variability, and decision threshold to vary across Mapping (arbitrary vs. nonarbitrary) × Set Size (two, four, or six) conditions, resulting in a whopping 22-parameter model. Despite the small improvement in chi-square score (χ² = 0.26), this model was still found to be outperformed by Model 4 when model complexity was taken into account (BIC = 48322.92).

² To test the contribution of decision thresholds to set size effects in our results, one model (Model 4) was tested both with and without allowing thresholds to vary with set size. Performance deteriorated considerably when thresholds were held constant (BIC = 49703.39, χ² = 0.89). We thus implemented this freedom in all following models.
³ Model fitting was executed on all of the trials. To ensure that the results are not due to this specific approach, we have refitted Model 4 after also excluding the first 10 trials in each condition. The result indicated a nearly identical chi-square score (changing only two digits after the decimal point) to that reported in Table 2.

To simultaneously present both RT distributions and accuracies in the same graphical representation, Figure 11 presents quantile probability functions (Ratcliff & McKoon, 2008; Teodorescu & Usher, 2013) for Model 4 (see Appendix C for quantile probability functions of Models 1–3).

Figure 11. Quantile probability functions for Model 4. The quantile probability functions plot both the empirical (black Xs) and simulated (colored markers) reaction time (RT) quantiles on the y-axis, while the probability for each type of response (correct = P; error = 1 − P) is plotted on the x-axis. Consequently, if for a specific condition the probability for a correct response was .92, then the .1, .3, .5, .7, and .9 correct RT quantiles for that condition will appear in ascending order above the .92 tick on the x-axis and the .1, .3, .5, .7, and .9 error RT quantiles for that condition will appear in an ascending order above the .08 tick on the x-axis. Thus, each figure contains the correct and error RT distributions for both empirical and simulated data in the arbitrary (high load [HL]) versus nonarbitrary (low load [LL]) mapping for each set size (i.e., 2, 4, or 6). See the online article for the color version of this figure.

The failure of the models based on the proposition that WM load affects only decision parameters could be attributed to the fact that the data show a considerable increase in mean RT with only a minor increase in error rates. Perceptual decision parameters that affect RT in general and the tail of the RT distribution in particular, like drift rate and drift variability, are strongly and negatively related to the error rate and cannot be decoupled from it (Ratcliff & McKoon, 2008). The fit of the model indicates that drift rate and drift variability lead to large increases in the tail of the RT distribution without an accompanying large increase in error rate. This pattern of inverted U shapes for equiquantile lines is typical of diffusion models when multiple drift rates are fitted (Ratcliff & McKoon, 2008; Ratcliff & Smith, 2004). While the model assuming an additional, exponential nondecision component provides a better fit to the data, this fit is not perfect. One obvious caveat is that the nondecision component has no mechanism for error generation. Thus, the extrinsic model tested here fails to capture the small, albeit significant, increase in error rates that accompany increases in WM load. It therefore appears that the main obstacle for the model assuming only an exponential nondecision component is to account for the increase in the effect of WM load on error rate. On the other hand, it is theoretically quite plausible that a nondecision process responsible for the retrieval of the correct response mapping will also produce a small amount of retrieval errors. It is also highly likely that this

effect will increase with set size. It therefore seems that a more detailed model of the mechanism underlying the response mapping retrieval process and its relation to error rates will be necessary before performing a more detailed model comparison between the two alternatives. Last, all models seem to have some difficulty in capturing the fast leading edge of the error RT distribution in the two-choice condition. Both the empirical effect of fast errors in the two-alternatives condition that slow down with increasing set size and the failure of the models in accounting for it are consistent with previous studies of multialternative perceptual choice (see Leite & Ratcliff, 2010, for an extensive computational study of multialternative choice-reaction tasks). Notwithstanding, the models do capture the qualitative trend by predicting a faster leading edge of error RTs compared to correct RTs in the two-choice condition, approximately equal correct and error RT distributions (with a slightly faster leading edge and slightly slower tail for the error distribution) in the four-choice condition, and a slow tail for error RTs in the six-choice condition.
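The BIC comparison reported in this section can be checked with a few lines of Python. The sketch below simply takes the BIC scores stated in the text and in Table 2 and applies the criterion cited there (a difference larger than 10 is considered substantial; Raftery, 1995). The dictionary labels and variable names are hypothetical and serve only as shorthand for the models.

```python
RAFTERY_SUBSTANTIAL = 10.0  # BIC difference treated as substantial (Raftery, 1995)

# BIC scores as reported in the text and in Table 2.
reported_bic = {
    "Model 1 (drift rate)": 48510.45,
    "Model 2 (drift variability)": 49316.72,
    "Model 3 (drift rate and drift variability)": 48563.17,
    "Model 4 (exponential nondecision time)": 48290.61,
    "22-parameter model": 48322.92,
}

# Lower BIC is better; express every model as a gap from the best one.
best_model = min(reported_bic, key=reported_bic.get)
bic_gaps = {name: bic - reported_bic[best_model]
            for name, bic in reported_bic.items()}

for name, gap in sorted(bic_gaps.items(), key=lambda item: item[1]):
    if name == best_model:
        verdict = "best"
    elif gap > RAFTERY_SUBSTANTIAL:
        verdict = "substantially worse"
    else:
        verdict = "not substantially worse"
    print(f"{name}: BIC gap = {gap:.2f} ({verdict})")
```

Even the most flexible competitor, the 22-parameter model, trails Model 4 by roughly 32 BIC points, which exceeds the cited criterion.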

General Discussion

In this work, we tested the hypothesis that a shortage in WM resources leads to an increased tail of the RT distributions. The first section of our investigation included three experiments exploring this claim. All experiments demonstrated that WM demanding conditions, manipulated by the arbitrariness of the S–R

rules, led to a higher rate of exceptionally slow RTs, reflected in the long-RT tail. While our results are in accordance with previous studies, reviewed in the introduction (Cowan & Saults, 2013; McVay & Kane, 2012; Schmiedek et al., 2007; Unsworth et al., 2010), there are also several differences that should be noted. First, previous investigations were performed using correlational methods. In the current study, we found similar evidence using an experimental approach, demonstrating that under high WM load conditions, participants exhibited an increased long-RT tail. This finding suggests causality, with WM resource shortage being a cause of changes in the RT tail. Also, while past investigations have mostly used a relatively low WM load (e.g., two choices), the current investigation demonstrates that the effect of WM on the RT distribution tail is most apparent when the number of alternatives is relatively high. Interestingly, Cowan and Saults’s (2013) findings indicate that the correlation between WM abilities and the tail-parameter disappeared when the RT task was WM demanding (i.e., lists with more items to be remembered). This discrepancy between the present results and those from correlational studies should be explored in the future.

Why the Tail?

In order to gain a process-based understanding of the influence of WM demands on performance, in the second section of our investigation we modeled both choice and RT distributional data from Experiment 1 using an evidence accumulation approach. We examined two types of models. The first are models in which WM load was assumed to affect only components of the perceptual decision process (i.e., drift rate and/or drift variability). The second was a model in which WM load was assumed to influence an additional extrinsic, exponential nondecision parameter, designed to represent the retrieval of the S–R rule from WM. We found that the first group of models had a poorer fit compared with the latter. In other words, we found that the addition of an extrinsic exponential nondecision time parameter, describing WM demands manipulation, improved the fit substantially (see Table 2). In an attempt to interpret this finding, we will first turn to relevant theories that account for the correlation between WM and the magnitude of the long-RT tail using an evidence accumulation approach. Engle and Kane (2003) suggested that the rate of exceptionally slow RTs is attributable to task interferences (i.e., attentional lapses). By this rationale, the difficulty to maintain the task information in memory often contributes to the rate of exceptionally slow RTs by causing some trials to have extremely slow accumulation rates. As such, the drift variability parameter (assumed to describe, among other things, the effects of attentional lapses on the evidence accumulation process) should be able to account for the tail-parameter. McVay and Kane (2012) found evidence supporting this notion by investigating the relation between drift diffusion parameters and the tail-parameter in a go/no-go task.
The authors found that both drift variability, the parameter that accounts for interference, and subjective mind wandering reports during task performance mediated the relationship between WM abilities and the tail-parameter. They concluded that participants with higher

WM abilities can better resist task extrinsic interference, leading to a lower degree of long-RT tail. Alternatively, Schmiedek et al. (2007) demonstrated that the relationship between the tail-parameter and WM performance can be best explained by drift rates. This led the authors to conclude that a more parsimonious explanation than resistance to interference can account for the association between WM abilities and the tail-parameter. Mainly, it was argued that general efficiency in information processing can account for both individual differences in the tail-parameter and their performance in more complex, WM-demanding tasks. It was also suggested that general information processing in choice-reaction tasks might represent the cognitive ability to maintain novel S–R bindings in WM (Oberauer, Süß, Wilhelm, & Sander, 2007), with stronger S–R bindings leading to more efficient information processing. In the present study, we found that neither the rate of evidence accumulation (i.e., drift rate) nor fluctuations in that process (i.e., drift variability), nor any combination thereof, could best account for changes in the tail-parameter caused by WM demands manipulation. Alternatively, the inclusion of an exponentially distributed extrinsic nondecision time parameter improved the fit substantially. To be specific, a model with an exponential nondecision time process describing the effect of WM load, along with a decision threshold describing the effect of set size irrespective of mapping conditions (Model 4, Table 2), was enough to outperform all of the models that did not include such an extrinsic process (Models 1–3, Table 2). It should be mentioned that previous investigations exploring the relationship between the tail-parameter and WM using evidence accumulation modeling (McVay & Kane, 2012; Schmiedek et al., 2007) did not include an exponentially distributed nondecision time component in their models.
Thus, those studies did not test the hypothesis posited in this work. Of course, other differences, most notably the individual differences design versus a manipulation-based design, could account for the discrepancy in conclusions relative to the present investigation.

Earlier, we suggested that the nondecisional exponential component represents the process of WM rule retrieval, a process that is distinct from the categorical decision on the perceptual stimulus. The fact that, in our analysis, this extrinsic process accounts for the addition of arbitrary S–R rules supports our interpretation. Yet the exact psychological mechanism underlying this residual time is not fully understood and remains to be explored. It is important to note that the term nondecision or extrinsic could be misleading, as it refers to a process that is extrinsic to the decision process only as described in the drift diffusion model. It by no means suggests that this component does not take part in the psychological process of selecting a response. For example, Schmitz and Voss (2012) interpreted the nondecision time in their model as the time involved in setting up the response selection apparatus during task switching. In addition, it cannot be inferred at which stage (i.e., beginning, middle, end, or a mixture) of the response selection process the nondecisional component took part.

Our suggestion that the process of WM rule retrieval could explain the effect of WM demands on the RT distribution is somewhat in line with a memory retrieval model suggested by Schneider and Anderson (2011). Specifically, the authors suggested a memory-based explanation for the effect of the number of alternatives on mean RT (Hick, 1952; Hyman, 1953), arguing that the set size effect actually represents a fan effect in which the context serves as a retrieval cue for the S–R rule, with higher set size mapped to lower accumulation drifts. Our approach shares the assumption that rule retrieval plays a role in choice RT, with two reservations. First, our experimental section suggests that this process influences only the tail-parameter. Second, our modeling investigation supports the notion that the retrieval process cannot solely affect drift rates and is in fact more likely to affect the duration of nondecision processes.

In sum, WM load was found to have a direct causal (and almost exclusive) effect on the rate of exceptionally slow RTs, described by the long-RT distribution tail. Our findings suggest that the effect of WM demands on the long-RT tail is the result of an additional nondecisional process, which we attributed to the process of retrieving the S–R rule from WM. Yet further research is needed in order to better specify the mechanics of this process.
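The idea that an exponentially distributed nondecision stage stretches the slow tail while leaving the decision process untouched can be illustrated with a small simulation. The following is only a sketch under stated assumptions: it uses a simple two-boundary diffusion process rather than the multialternative models fitted in this article, and every function name and parameter value (drift, threshold, base nondecision time, exponential means) is hypothetical and chosen for illustration, not taken from the reported fits.

```python
import random
import statistics


def simulate_rt(rng, drift, threshold, base_ndt, exp_ndt_mean, dt=0.002):
    """Simulate one RT (seconds): diffusion decision time plus nondecision time."""
    evidence = threshold / 2.0            # unbiased starting point (assumption)
    decision_time = 0.0
    step_sd = dt ** 0.5                   # within-trial noise, unit diffusion coefficient
    while 0.0 < evidence < threshold:     # accumulate until either boundary is reached
        evidence += drift * dt + rng.gauss(0.0, step_sd)
        decision_time += dt
    # Exponentially distributed extra stage (hypothetical stand-in for rule retrieval).
    retrieval_time = rng.expovariate(1.0 / exp_ndt_mean)
    return decision_time + base_ndt + retrieval_time


def simulate_condition(exp_ndt_mean, n_trials=2000, seed=1):
    """Simulate a block of trials; only the exponential mean differs between calls."""
    rng = random.Random(seed)
    return [
        simulate_rt(rng, drift=2.0, threshold=1.0, base_ndt=0.3,
                    exp_ndt_mean=exp_ndt_mean)
        for _ in range(n_trials)
    ]


def tail_index(rts):
    """Mean minus median: a crude index of how heavy the right tail is."""
    return statistics.mean(rts) - statistics.median(rts)


# Illustrative values only: a short versus a long exponential stage.
low_load = simulate_condition(exp_ndt_mean=0.05)
high_load = simulate_condition(exp_ndt_mean=0.30)

for label, rts in (("low load", low_load), ("high load", high_load)):
    print(f"{label}: median = {statistics.median(rts):.3f} s, "
          f"tail index = {tail_index(rts):.3f} s")
```

Because drift and threshold are held constant, any difference between the two simulated conditions in this sketch comes from the exponential stage alone. The mean-minus-median index is a deliberately simple substitute for a full ex-Gaussian fit.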

References

Balota, D. A., & Spieler, D. H. (1999). Word frequency, repetition, and lexicality effects in word recognition tasks: Beyond measures of central tendency. Journal of Experimental Psychology: General, 128, 32–55. doi:10.1037/0096-3445.128.1.32
Balota, D. A., & Yap, M. J. (2011). Moving beyond the mean in studies of mental chronometry: The power of response time distributional analyses. Current Directions in Psychological Science, 20, 160–166. doi:10.1177/0963721411408885
Bogacz, R., Brown, E., Moehlis, J., Holmes, P., & Cohen, J. D. (2006). The physics of optimal decision making: A formal analysis of models of performance in two-alternative forced-choice tasks. Psychological Review, 113, 700–765. doi:10.1037/0033-295X.113.4.700
Bogacz, R., & Gurney, K. (2007). The basal ganglia and cortex implement optimal decision making between alternative actions. Neural Computation, 19, 442–477. doi:10.1162/neco.2007.19.2.442
Brown, S., & Heathcote, A. (2005). Practice increases the efficiency of evidence accumulation in perceptual choice. Journal of Experimental Psychology: Human Perception and Performance, 31, 289–298. doi:10.1037/0096-1523.31.2.289
Buzy, W. M., Medoff, D. R., & Schweitzer, J. B. (2009). Intra-individual variability among children with ADHD on a working memory task: An ex-Gaussian approach. Child Neuropsychology, 15, 441–459. doi:10.1080/09297040802646991
Cowan, N., & Saults, J. S. (2013). When does a good working memory counteract proactive interference? Surprising evidence from a probe recognition task. Journal of Experimental Psychology: General, 142, 12–17. doi:10.1037/a0027804
Coyle, T. R. (2003). A review of the worst performance rule: Evidence, theory, and alternative hypotheses. Intelligence, 31, 567–587. doi:10.1016/S0160-2896(03)00054-0
Dehaene, S. (2002). Verbal and nonverbal representations of numbers in the human brain. In A. M. Galaburda, S. M. Kosslyn, & C. Yves (Eds.), The languages of the brain (pp. 179–190). Cambridge, MA: Harvard University Press.
Ditterich, J. (2010). A comparison between mechanisms of multialternative perceptual decision making: Ability to explain human behavior, predictions for neurophysiology, and relationship with decision theory. Frontiers in Neuroscience, 4, Article 184. doi:10.3389/fnins.2010.00184
Engle, R. W. (2002). Working memory capacity as executive attention. Current Directions in Psychological Science, 11, 19–23. doi:10.1111/1467-8721.00160
Engle, R. W., & Kane, M. J. (2003). Executive attention, working memory capacity, and a two-factor theory of cognitive control. Psychology of Learning and Motivation, 44, 145–199. doi:10.1016/S0079-7421(03)44005-X
Eriksen, B. A., & Eriksen, C. W. (1974). Effects of noise letters upon the identification of a target letter in a nonsearch task. Perception & Psychophysics, 16, 143–149. doi:10.3758/BF03203267
Forstmann, B. U., Brown, S., Dutilh, G., Neumann, J., & Wagenmakers, E.-J. (2010). The neural substrate of prior information in perceptual decision making: A model-based analysis. Frontiers in Human Neuroscience, 4, Article 40. doi:10.3389/fnhum.2010.00040
Gao, J., Tortell, R., & McClelland, J. L. (2011). Dynamic integration of reward and stimulus information in perceptual decision-making. PLoS ONE, 6, e16749. doi:10.1371/journal.pone.0016749
Hale, D. (1968). The relation of correct and error responses in a serial choice reaction task. Psychonomic Science, 13, 299–300. doi:10.3758/BF03342595
Heathcote, A., Popiel, S. J., & Mewhort, D. J. (1991). Analysis of response time distributions: An example using the Stroop task. Psychological Bulletin, 109, 340–347. doi:10.1037/0033-2909.109.2.340
Hick, W. E. (1952). On the rate of gain of information. Quarterly Journal of Experimental Psychology, 4, 11–26. doi:10.1080/17470215208416600
Hyman, R. (1953). Stimulus information as a determinant of reaction time. Journal of Experimental Psychology, 45, 188–196. doi:10.1037/h0056940
Jackson, J. D., Balota, D. A., Duchek, J. M., & Head, D. (2012). White matter integrity and reaction time intraindividual variability in healthy aging and early-stage Alzheimer disease. Neuropsychologia, 50, 357–366. doi:10.1016/j.neuropsychologia.2011.11.024
Jarmasz, J., & Hollands, J. G. (2009). Confidence intervals in repeated-measures designs: The number of observations principle. Canadian Journal of Experimental Psychology, 63, 124–138. doi:10.1037/a0014164
Karalunas, S. L., & Huang-Pollock, C. L. (2013). Integrating impairments in reaction time and executive function using a diffusion model framework. Journal of Abnormal Child Psychology, 41, 837–850. doi:10.1007/s10802-013-9715-2
Karantinos, T., Tsoukas, E., Mantas, A., Kattoulas, E., Stefanis, N. C., Evdokimidis, I., & Smyrnis, N. (2014). Increased intra-subject reaction time variability in the volitional control of movement in schizophrenia. Psychiatry Research, 215, 26–32. doi:10.1016/j.psychres.2013.10.031
Karni, A., Tanne, D., Rubenstein, B. S., Askenasy, J. J., & Sagi, D. (1994, July 29). Dependence on REM sleep of overnight improvement of a perceptual skill. Science, 265, 679–682. doi:10.1126/science.8036518
Koch, I., Prinz, W., & Allport, A. (2005). Involuntary retrieval in alphabet–arithmetic tasks: Task-mixing and task-switching costs. Psychological Research, 69, 252–261. doi:10.1007/s00426-004-0180-y
Krajbich, I., & Rangel, A. (2011). Multialternative drift-diffusion model predicts the relationship between visual fixations and choice in value-based decisions. Proceedings of the National Academy of Sciences of the United States of America, 108, 13852–13857. doi:10.1073/pnas.1101328108
Lacouture, Y., & Cousineau, D. (2008). How to use MATLAB to fit the ex-Gaussian and other probability functions to a distribution of response times. Tutorials in Quantitative Methods for Psychology, 4, 35–45.
Leite, F. P., & Ratcliff, R. (2010). Modeling reaction time and accuracy of multiple-alternative decisions. Attention, Perception, & Psychophysics, 72, 246–273. doi:10.3758/APP.72.1.246
Liu, S., Lane, S. D., Schmitz, J. M., Green, C. E., Cunningham, K. A., & Moeller, F. G. (2012). Increased intra-individual reaction time variability in cocaine-dependent subjects: Role of cocaine-related cues. Addictive Behaviors, 37, 193–197. doi:10.1016/j.addbeh.2011.10.003
Liu, Y., & Blostein, S. D. (1992). Optimality of the Sequential Probability Ratio Test for nonstationary observations. IEEE Transactions on Information Theory, 38, 177–182. doi:10.1109/18.108268


Longstreth, L. E., El-Zahhar, N., & Alcorn, M. B. (1985). Exceptions to Hick’s law: Explorations with a response duration measure. Journal of Experimental Psychology: General, 114, 417–434. doi:10.1037/0096-3445.114.4.417
Los, S. A. (1996). On the origin of mixing costs: Exploring information processing in pure and mixed blocks of trials. Acta Psychologica, 94, 145–188. doi:10.1016/0001-6918(95)00050-X
Matzke, D., & Wagenmakers, E.-J. (2009). Psychological interpretation of the ex-Gaussian and shifted Wald parameters: A diffusion model analysis. Psychonomic Bulletin & Review, 16, 798–817. doi:10.3758/PBR.16.5.798
Mayr, U., & Kliegl, R. (2000). Task-set switching and long-term memory retrieval. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 1124–1140. doi:10.1037/0278-7393.26.5.1124
McMillen, T., & Holmes, P. (2006). The dynamics of choice among multiple alternatives. Journal of Mathematical Psychology, 50, 30–57. doi:10.1016/j.jmp.2005.10.003
McVay, J. C., & Kane, M. J. (2012). Drifting from slow to “D’oh!”: Working memory capacity and mind wandering predict extreme reaction times and executive control errors. Journal of Experimental Psychology: Learning, Memory, and Cognition, 38, 525–549. doi:10.1037/a0025896
Moutsopoulou, K., & Waszak, F. (2012). Across-task priming revisited: Response and task conflicts disentangled using ex-Gaussian distribution analysis. Journal of Experimental Psychology: Human Perception and Performance, 38, 367–374. doi:10.1037/a0025858
Munoz, D. P., & Everling, S. (2004). Look away: The anti-saccade task and the voluntary control of eye movement. Nature Reviews Neuroscience, 5, 218–228. doi:10.1038/nrn1345
Niwa, M., & Ditterich, J. (2008). Perceptual decisions between multiple directions of visual motion. Journal of Neuroscience, 28, 4435–4445. doi:10.1523/JNEUROSCI.5564-07.2008
Oberauer, K. (2009). Design for a working memory. Psychology of Learning and Motivation, 51, 45–100. doi:10.1016/S0079-7421(09)51002-X
Oberauer, K., Souza, A. S., Druey, M. D., & Gade, M. (2013). Analogous mechanisms of selection and updating in declarative and procedural working memory: Experiments and a computational model. Cognitive Psychology, 66, 157–211. doi:10.1016/j.cogpsych.2012.11.001
Oberauer, K., Süß, H.-M., Wilhelm, O., & Sander, N. (2007). Individual differences in working memory capacity and reasoning ability. In A. R. A. Conway, C. Jarrold, M. J. Kane, A. Miyake, & J. N. Towse (Eds.), Variation in working memory (pp. 49–75). New York, NY: Oxford University Press. doi:10.1093/acprof:oso/9780195168648.003.0003
Philiastides, M. G., Ratcliff, R., & Sajda, P. (2006). Neural representation of task difficulty and decision making during perceptual categorization: A timing diagram. Journal of Neuroscience, 26, 8965–8975. doi:10.1523/JNEUROSCI.1655-06.2006
Raftery, A. E. (1995). Bayesian model selection in social research. Sociological Methodology, 25, 111–163. doi:10.2307/271063
Ratcliff, R. (1978). A theory of memory retrieval. Psychological Review, 85, 59–108. doi:10.1037/0033-295X.85.2.59
Ratcliff, R. (1979). Group reaction time distributions and an analysis of distribution statistics. Psychological Bulletin, 86, 446–461. doi:10.1037/0033-2909.86.3.446
Ratcliff, R. (2002). A diffusion model account of response time and accuracy in a brightness discrimination task: Fitting real data and failing to fit fake but plausible data. Psychonomic Bulletin & Review, 9, 278–291. doi:10.3758/BF03196283
Ratcliff, R., & McKoon, G. (2008). The diffusion decision model: Theory and data for two-choice decision tasks. Neural Computation, 20, 873–922. doi:10.1162/neco.2008.12-06-420
Ratcliff, R., & Rouder, J. N. (1998). Modeling response times for two-choice decisions. Psychological Science, 9, 347–356. doi:10.1111/1467-9280.00067


Ratcliff, R., & Rouder, J. N. (2000). A diffusion model account of masking in two-choice letter identification. Journal of Experimental Psychology: Human Perception and Performance, 26, 127–140. doi:10.1037/0096-1523.26.1.127
Ratcliff, R., & Smith, P. L. (2004). A comparison of sequential sampling models for two-choice reaction time. Psychological Review, 111, 333–367. doi:10.1037/0033-295X.111.2.333
Ratcliff, R., Thapar, A., & McKoon, G. (2003). A diffusion model analysis of the effects of aging on brightness discrimination. Perception & Psychophysics, 65, 523–535. doi:10.3758/BF03194580
Rubin, O., & Meiran, N. (2005). On the origins of the task mixing cost in the cuing task-switching paradigm. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 1477–1491. doi:10.1037/0278-7393.31.6.1477
Schmiedek, F., Oberauer, K., Wilhelm, O., Süß, H. M., & Wittmann, W. W. (2007). Individual differences in components of reaction time distributions and their relations to working memory and intelligence. Journal of Experimental Psychology: General, 136, 414–429. doi:10.1037/0096-3445.136.3.414
Schmitz, F., & Voss, A. (2012). Decomposing task-switching costs with the diffusion model. Journal of Experimental Psychology: Human Perception and Performance, 38, 222–250. doi:10.1037/a0026003
Schneider, D. W., & Anderson, J. R. (2011). A memory-based model of Hick’s law. Cognitive Psychology, 62, 193–222. doi:10.1016/j.cogpsych.2010.11.001
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461–464. doi:10.1214/aos/1176344136
Siegel, M., Engel, A. K., & Donner, T. H. (2011). Cortical network dynamics of perceptual decision-making in the human brain. Frontiers in Human Neuroscience, 5, Article 21. doi:10.3389/fnhum.2011.00021
Smith, P. L., & Ratcliff, R. (2009). An integrated theory of attention and decision making in visual signal detection. Psychological Review, 116, 283–317. doi:10.1037/a0015156
Souza, A. S., Oberauer, K., Gade, M., & Druey, M. D. (2012). Processing of representations in declarative and procedural working memory. Quarterly Journal of Experimental Psychology, 65, 1006–1033. doi:10.1080/17470218.2011.640403
Spieler, D. H., Balota, D. A., & Faust, M. E. (1996). Stroop performance in healthy younger and older adults and in individuals with dementia of the Alzheimer’s type. Journal of Experimental Psychology: Human Perception and Performance, 22, 461–479. doi:10.1037/0096-1523.22.2.461
Starns, J. J., Ratcliff, R., & McKoon, G. (2012). Evaluating the unequal-variance and dual-process explanations of zROC slopes with response time data and the diffusion model. Cognitive Psychology, 64, 1–34. doi:10.1016/j.cogpsych.2011.10.002
Steinhauser, M., & Hübner, R. (2009). Distinguishing response conflict and task conflict in the Stroop task: Evidence from ex-Gaussian distribution analysis. Journal of Experimental Psychology: Human Perception and Performance, 35, 1398–1412. doi:10.1037/a0016467
Stickgold, R. (1998). Sleep: Off-line memory reprocessing. Trends in Cognitive Sciences, 2, 484–492. doi:10.1016/S1364-6613(98)01258-3
Stuss, D. T., Murphy, K. J., Binns, M. A., & Alexander, M. P. (2003). Staying on the job: The frontal lobes control individual performance variability. Brain, 126, 2363–2380. doi:10.1093/brain/awg237
Sui, J., & Humphreys, G. W. (2013). The boundaries of self face perception: Response time distributions, perceptual categories, and decision weighting. Visual Cognition, 21, 415–445. doi:10.1080/13506285.2013.800621
Teodorescu, A. R., & Usher, M. (2013). Disentangling decision models: From independence to competition. Psychological Review, 120, 1–38. doi:10.1037/a0030776


Townsend, J. T., & Ashby, F. G. (1983). The stochastic modeling of elementary psychological processes. Cambridge, England: Cambridge University Press.
Trueblood, J. S., Brown, S. D., Heathcote, A., & Busemeyer, J. R. (2013). Not just for consumers: Context effects are fundamental to decision making. Psychological Science, 24, 901–908. doi:10.1177/0956797612464241
Tse, C. S., Balota, D. A., Yap, M. J., Duchek, J. M., & McCabe, D. P. (2010). Effects of healthy aging and early stage dementia of the Alzheimer’s type on components of response time distributions in three attention tasks. Neuropsychology, 24, 300–315. doi:10.1037/a0018274
Tsetsos, K., Usher, M., & McClelland, J. L. (2011). Testing multialternative decision models with non-stationary evidence. Frontiers in Neuroscience, 5, Article 63. doi:10.3389/fnins.2011.00063
Unsworth, N., Redick, T. S., Lakey, C. E., & Young, D. L. (2010). Lapses in sustained attention and their relation to executive control and fluid abilities: An individual differences investigation. Intelligence, 38, 111–122. doi:10.1016/j.intell.2009.08.002
Usher, M., & McClelland, J. L. (2001). The time course of perceptual choice: The leaky, competing accumulator model. Psychological Review, 108, 550–592. doi:10.1037/0033-295X.108.3.550
Usher, M., Olami, Z., & McClelland, J. L. (2002). Hick’s law in a stochastic race model with speed–accuracy tradeoff. Journal of Mathematical Psychology, 46, 704–715. doi:10.1006/jmps.2002.1420
Vickers, D. (1970). Evidence for an accumulator model of psychophysical discrimination. Ergonomics, 13, 37–58. doi:10.1080/00140137008931117
Vickers, D. (1979). Decision processes in visual perception. New York, NY: Academic Press.
Vickers, D., & Lee, M. D. (1998). Dynamic models of simple judgments: I. Properties of a self-regulating accumulator module. Nonlinear Dynamics, Psychology, and Life Sciences, 2, 169–194. doi:10.1023/A:1022371901259
Wald, A., & Wolfowitz, J. (1948). Optimum character of the Sequential Probability Ratio Test. Annals of Mathematical Statistics, 19, 326–339. doi:10.1214/aoms/1177730197
Wilhelm, O., & Oberauer, K. (2006). Why are reasoning ability and working memory capacity related to mental speed? An investigation of stimulus–response compatibility in choice reaction time tasks. European Journal of Cognitive Psychology, 18, 18–50. doi:10.1080/09541440500215921
Wright, M. J., Jr., Vandewater, S. A., & Taffe, M. A. (2013). The influence of acute and chronic alcohol consumption on response time distribution in adolescent rhesus macaques. Neuropharmacology, 70, 12–18. doi:10.1016/j.neuropharm.2013.01.003
Yap, M. J., Balota, D. A., & Tan, S. E. (2013). Additive and interactive effects in semantic priming: Isolating lexical and decision processes in the lexical decision task. Journal of Experimental Psychology: Learning, Memory, and Cognition, 39, 140–158. doi:10.1037/a0028520


Appendix A

Ex-Gaussian Distribution Fit


To estimate how well the ex-Gaussian distributions fit the data, we first simulated a set of 10,000 data points using the previously extracted ex-Gaussian parameters. We then calculated the mean reaction time (RT) for four bins (.2, .4, .6, .8), once for the empirical and once for the simulated data sets. A quantile–quantile plot (see Figure A1) was created.

Each data point describes the mean RT for the relevant bin in the empirical data (x-axis) and in the simulated data (y-axis). A poor fit between the data and the ex-Gaussian would appear as a departure from linearity (described by the diagonal line). As can be seen, an excellent fit was observed in all of the experiments.

Figure A1. Quantile–quantile plot comparing simulated and empirical data for each experiment.
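The check described in this appendix can be sketched in a few lines. This is an illustrative reconstruction, not the original analysis code: the function names are hypothetical, the parameter values are invented, the "observed" data are themselves simulated as a stand-in, and the binning rule (sorted RTs averaged within successive bins whose upper edges are the .2, .4, .6, and .8 quantiles) is one assumed reading of the four bins mentioned above.

```python
import random
import statistics


def simulate_ex_gaussian(mu, sigma, tau, n, seed=0):
    """Draw n ex-Gaussian RTs: a Gaussian(mu, sigma) plus an exponential with mean tau."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau) for _ in range(n)]


def bin_means(rts, upper_edges=(0.2, 0.4, 0.6, 0.8)):
    """Mean RT within successive quantile bins (assumed binning rule, see lead-in)."""
    ordered = sorted(rts)
    n = len(ordered)
    means = []
    lower = 0
    for edge in upper_edges:
        upper = int(round(edge * n))
        means.append(statistics.mean(ordered[lower:upper]))
        lower = upper
    return means


# Hypothetical fitted parameters (seconds); not values reported in the article.
MU, SIGMA, TAU = 0.40, 0.05, 0.15

# Stand-in for one participant's observed RTs in one condition.
observed_rts = simulate_ex_gaussian(MU, SIGMA, TAU, n=400, seed=1)
# 10,000 simulated points from the fitted parameters, as described in the text.
simulated_rts = simulate_ex_gaussian(MU, SIGMA, TAU, n=10_000, seed=2)

observed_means = bin_means(observed_rts)
simulated_means = bin_means(simulated_rts)

# Each (x, y) pair is one point of the quantile-quantile plot.
for x, y in zip(observed_means, simulated_means):
    print(f"observed = {x:.3f} s, simulated = {y:.3f} s")
```

Points that fall close to the diagonal (x roughly equal to y) indicate that the fitted ex-Gaussian reproduces the observed bin means.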


Appendix B


Switch Cost Effects

To test for switch costs, we compared trials in which the previous trial had indicated the use of a different key set (i.e., outer after inner or vice versa; switch trials) with trials in which the key set remained the same (i.e., outer after outer, inner after inner; repeat trials). A repeated measures analysis of variance was performed on the mean reaction time (RT) for each experiment with load (procedural load, declarative load), mapping (arbitrary vs. nonarbitrary), and trial type (repeat vs. switch) as within-subject variables. As expected, we found a significant switch cost in both experiments, as indicated by the significant main effect of trial type (repeat vs. switch) in RTs, Experiment 3A, F(1, 11) = 61.48, p < .001, ηp² = .85; Experiment 3B, F(1, 11) = 169.63, p < .001, ηp² = .94; and error rates, Experiment 3A, F(1, 11) = 10.74, p < .01, ηp² = .49; Experiment 3B, F(1, 11) = 7.84, p < .05, ηp² = .41. This finding demonstrates that participants indeed switched between two sets, each comprising two alternatives, as argued. To examine differences in this effect across load conditions, we further analyzed the interaction between load (procedural load, declarative load) and trial type (repeat, switch). The effect was found to be nonsignificant for the two experiments in RTs, Experiment 3A, F(1, 11) = 3.49, ns, ηp² = .24; Experiment 3B, F(1, 11) = 0.83, ns, ηp² = .07; and significant only for one experiment in error rates, Experiment 3A, F(1, 11) = 10.71, p < .01, ηp² = .49; Experiment 3B, F(1, 11) = 1.55, ns, ηp² = .12. These results support the claim that in the two load conditions (procedural, declarative), two sets of rules were used.
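For readers who wish to relate the reported effect sizes to the F ratios, partial eta squared can be recovered from an F value and its degrees of freedom with the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies it to the first statistic reported above; the function name is our own, and small discrepancies from reported values can arise because the F ratios are themselves rounded.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of trial type on RT in Experiment 3A: F(1, 11) = 61.48.
effect_size = partial_eta_squared(61.48, df_effect=1, df_error=11)
print(f"partial eta squared = {effect_size:.2f}")
```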

Appendix C

Computational Methods for the Modeling Section

First, all parameter values were constrained to be larger than 0. Second, parameter values that were allowed to vary between experimental conditions were constrained such that a monotonic relationship between parameter values and condition was maintained. For example, if a parameter increased between the two-alternative conditions and the four-alternative conditions, then it was expected to increase (though not necessarily by the same amount) between the four-alternative conditions and the six-alternative conditions. This was important in order to prevent the models from using parameter value combinations that are not interpretable from a cognitive point of view. Such effects could be observed when a model is trying to compensate for its inability to fit certain aspects of the data by using parameter combinations that are not plausible theoretically.

To evaluate goodness of fit, we optimized models using a genetic algorithm procedure (“ga” function with population size = 20 and 200 generations; Global Optimization toolbox, MATLAB). The ga function was embedded in an envelope code that began with a random population and then repeatedly used the final population of the previous optimization as the starting population for the next optimization. This process was repeated until the optimization no longer improved the goodness-of-fit score. Finally, the above procedure was repeated 10 times for each model.

To simultaneously fit the models both to correct and error RT distributions and to choice accuracy for each of the six experimental conditions from Experiment 1, defined by mapping (arbitrary or nonarbitrary) and set size (2, 4, or 6), we calculated the chi-square statistic for the averaged reaction time (RT) quantile proportion of the data versus that of the model (Ratcliff & McKoon, 2008; Ratcliff & Smith, 2004; Teodorescu & Usher, 2013). The objective function was thus calculated by taking the 0.1, 0.3, 0.5, 0.7, and 0.9 RT quantiles and accuracy data for each experimental condition and for each type of response (correct/error), resulting in 12 quantile sets of five quantiles each. For each such set, simulated RT data were then grouped into six bins confined by the empirical RT quantiles. This procedure resulted in 12 bins per condition (six correct and six error) for a total of 12 × 6 = 72 bins. Then, for each model run (12,000 iterations for each evaluation during optimization and 120,000 iterations to generate the final error for a given set of parameters returned by the ga function), the proportion of simulation runs that fell within the bounds of each bin was calculated. These proportions were subsequently multiplied by the probability of a correct response, P(correct), or of an error response, 1 − P(correct), to create the correct and error RT quantile proportions, respectively. Naturally, for each condition these proportions summed to 1. Quantile RT proportions are therefore equivalent to the probability for a response to end up in a particular bin (e.g., the probability of producing an error response in the four-alternative arbitrary S–R mapping condition with an RT slower than the .7 but faster than the .9 empirical quantile).

Finally, the chi-square statistic was calculated as the sum, over all bins in all conditions, of the squared differences between the proportions predicted by the model (M_i) and the empirical (target) proportions (E_i), divided by the target proportion (E_i). Since the RT quantiles that divide the bins are calculated from the empirical data, the target proportions (E_i) are always .1, .2, .2, .2, .2, and .1 multiplied by the probability for that type of response (correct/error). For example, if the accuracy in a particular condition was 90%, then the target proportions would be .09, .18, .18, .18, .18, and .09 for the correct responses and .01, .02, .02, .02, .02, and .01 for the error responses:

$$\chi^2 = \sum_i \frac{(E_i - M_i)^2}{E_i}.$$

Figure C1. Quantile probability functions for Model 1 (upper panel), Model 2 (middle panel), and Model 3 (lower panel). RT = reaction time. See the online article for the color version of this figure.
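The quantile-based objective function for a single condition can be sketched as follows. This is a minimal illustration rather than the MATLAB code used for the reported fits: all function names are hypothetical, the quantile rule is a simple order statistic without interpolation, the model's bin proportions are weighted by the simulated response probability (our reading of the description above), the data are invented, and the sketch assumes that every condition contains at least a handful of error trials.

```python
import bisect
import random

QUANTILE_PROBS = (0.1, 0.3, 0.5, 0.7, 0.9)
BIN_MASS = (0.1, 0.2, 0.2, 0.2, 0.2, 0.1)  # probability mass between successive quantiles


def empirical_quantiles(rts, probs=QUANTILE_PROBS):
    """Order-statistic quantiles of the empirical RTs (no interpolation)."""
    ordered = sorted(rts)
    last = len(ordered) - 1
    return [ordered[min(int(p * len(ordered)), last)] for p in probs]


def target_proportions(response_probability):
    """Target proportions E_i for one response type."""
    return [mass * response_probability for mass in BIN_MASS]


def model_proportions(sim_rts, edges, response_probability):
    """Model proportions M_i: share of simulated RTs per bin, weighted by P(response)."""
    counts = [0] * (len(edges) + 1)
    for rt in sim_rts:
        counts[bisect.bisect_right(edges, rt)] += 1
    n = len(sim_rts)
    return [response_probability * c / n if n else 0.0 for c in counts]


def condition_chi_square(emp_correct, emp_error, sim_correct, sim_error):
    """Chi-square over the 12 bins (six correct, six error) of one condition."""
    p_emp = len(emp_correct) / (len(emp_correct) + len(emp_error))
    p_sim = len(sim_correct) / (len(sim_correct) + len(sim_error))
    total = 0.0
    for emp, sim, p_e, p_m in ((emp_correct, sim_correct, p_emp, p_sim),
                               (emp_error, sim_error, 1.0 - p_emp, 1.0 - p_sim)):
        edges = empirical_quantiles(emp)
        expected = target_proportions(p_e)
        predicted = model_proportions(sim, edges, p_m)
        total += sum((e - m) ** 2 / e for e, m in zip(expected, predicted))
    return total


# Invented data for one condition: 90% accuracy, shifted-exponential RTs (seconds).
rng = random.Random(0)
emp_correct = [0.40 + rng.expovariate(10.0) for _ in range(900)]
emp_error = [0.50 + rng.expovariate(10.0) for _ in range(100)]
sim_correct = [0.42 + rng.expovariate(10.0) for _ in range(10_800)]
sim_error = [0.50 + rng.expovariate(10.0) for _ in range(1_200)]

print("chi-square for this condition:",
      condition_chi_square(emp_correct, emp_error, sim_correct, sim_error))
```

Summing this quantity over the six conditions gives the 72-bin statistic described above; an optimizer would then search for the model parameters that minimize it.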

In addition, we provide the Bayesian information criterion (BIC; Schwarz, 1978) for each model. The BIC score is valuable, since it takes into account model flexibility by penalizing for extra free parameters. BIC was calculated as

$$\mathrm{BIC} = -2N \sum_i E_i \ln(M_i) + P \ln(N),$$

where N is the number of trials per participant and P is the number of free parameters for each model. A difference in BIC score that is larger than 10 can be considered substantial (Raftery, 1995). Note that since the optimization process minimized the chi-square and not the log-likelihood objective function, this score cannot be considered a “true” BIC score and is used here as a proxy to the BIC. Nevertheless, the chi-square and the log-likelihood statistics are related and should converge to the same parameter values. For comparison, we fit one of the models with both objective functions. Indeed, both optimization processes produced very similar parameters and errors. Figure C1 provides a quantile probability function for Models 1–3.
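The BIC proxy defined above is straightforward to compute once the target and model proportions are available. The sketch below is illustrative only: the function name, the small floor that guards against taking the logarithm of zero, the trial count, the parameter counts, and the example proportions are all assumptions introduced here, not details of the reported analysis.

```python
import math


def bic_proxy(targets, predicted, n_trials, n_params, floor=1e-10):
    """BIC proxy: -2N * sum_i E_i ln(M_i) + P ln(N).

    `floor` (an assumption of this sketch) avoids log(0) for empty model bins.
    """
    weighted_log = sum(e * math.log(max(m, floor))
                       for e, m in zip(targets, predicted))
    return -2.0 * n_trials * weighted_log + n_params * math.log(n_trials)


# Target proportions for a single condition with 90% accuracy (from the text).
targets = [0.09, 0.18, 0.18, 0.18, 0.18, 0.09,
           0.01, 0.02, 0.02, 0.02, 0.02, 0.01]
# Two hypothetical models: one matching the targets, one spreading mass evenly.
close_model = list(targets)
flat_model = [1.0 / 12.0] * 12

N_TRIALS = 400  # hypothetical number of trials per participant
print("close model:", bic_proxy(targets, close_model, N_TRIALS, n_params=6))
print("flat model: ", bic_proxy(targets, flat_model, N_TRIALS, n_params=5))
```

With the fit held constant, each additional free parameter raises the score by ln(N), which is the penalty for model flexibility described above.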

Received January 31, 2013
Revision received May 13, 2014
Accepted May 16, 2014

