Multiple Systems of Perceptual Category Learning: Theory ... - CiteSeerX

4 downloads 440 Views 542KB Size Report
variety of category structures, but it learns in a slow incremental fashion and is highly dependent on reliable ..... supports this assumption (e.g., Baars, 1988). For ...
(in press). In H. Cohen & C. Lefebvre (Eds.), Categorization in cognitive science. New York: Elsevier.

Multiple Systems of Perceptual Category Learning: Theory and Cognitive Tests F. Gregory Ashby & Vivian V. Valentin University of California, Santa Barbara The COVIS (COmpetition between Verbal and Implicit Systems) theory of category learning (Ashby, Alsonso-Reese, Turken, & Waldron, 1998) postulates two systems that compete throughout learning – a frontal-based explicit system that uses logical reasoning and depends on working memory and executive attention, and a basal ganglia-mediated implicit system that uses procedural learning. The implicit system can learn a wide variety of category structures, but it learns in a slow incremental fashion and is highly dependent on reliable and immediate feedback. In contrast, the explicit rule-based system can learn a fairly small set of category structures quickly – specifically, those structures that can be learned via an explicit reasoning process. This theory is described in detail and a variety of cognitive-behavioral experiments are reviewed that test some parameterfree a priori predictions made by COVIS.

All animals must learn to categorize objects and events in their environment to survive. Is the mushroom edible or poisonous? Is the object behind the bush a deer or a wolf? Correct classification allows animals to select the appropriate approach or avoidance response to nutrients and poisons, and to prey and predators. Given the importance of categorization, it is not surprising that many different cognitive theories of human category learning have been proposed and tested. Although some of these theories have unquestionably been more successful than others, most categorization researchers would probably acknowledge disappointment in how difficult it has been to differentiate the predictions of the more successful models, despite the very different cognitive processes they hypothesize. It appears that behavioral category-learning data do not offer enough constraints to identify the correct model. The cognitive neuroscience revolution has offered some exciting new tools for resolving these conflicts. In particular, the past decade has seen an explosion of new results that collectively are beginning to paint a detailed picture of the neural mechanisms and pathways that mediate category learning. These results come from a wide variety of sources, including animal lesion, singlecell recording, neuroimaging, and neuropsychological studies. Lagging somewhat behind this avalanche of new data has been the development of new theories that can account for the traditional cognitive results as well as for these newer neuroscience results. Even so, some such

Preparation of this chapter was supported by Public Health Service Grant MH3760. We thank Todd Maddox for his helpful comments. Correspondence concerning this chapter should be sent to F. Gregory Ashby, Department of Psychology, University of California, Santa Barbara, CA 93106. email: [email protected], web page: http://www.psych.ucsb.edu/%7Eashby/lab.htm.

theories have been proposed. The most comprehensive and best tested of these is the COVIS (COmpetition between Verbal and Implicit Systems) theory of category learning (Ashby, Alfonso-Reese, Turken, & Waldron, 1998; Ashby & Waldron, 1999). This chapter describes COVIS and many of the cognitive experiments that have been run to test some strong and surprising a priori predictions of COVIS. The next chapter entitled “Multiple Systems of Perceptual Category Learning: Neuropsychological Tests”, describes tests of COVIS using neuropsychological patient data and neuroimaging results. Briefly, COVIS postulates two systems that compete throughout learning – a frontal-based explicit system that uses logical reasoning and depends on working memory and executive attention, and a basal gangliamediated implicit system that uses procedural learning. The procedural learning-based system is phylogenetically older. It can learn a wide variety of category structures, but it learns in a slow incremental fashion and is highly dependent on reliable and immediate feedback. In contrast, the explicit rule-based system can learn a fairly small set of category structures quickly – specifically, those structures that can be learned via an explicit reasoning process. Tasks that require subjects to learn such structures are called rule-based categorylearning tasks. On the other hand, there are many category structures that the explicit system cannot learn. An important example occurs in information-integration tasks, in which learning requires subjects to integrate perceptual information across two or more noncommensurable stimulus dimensions. Before describing COVIS in detail, we take a short detour to describe rulebased and information-integration category-learning tasks.

2 TWO CATEGORY-LEARNING TASKS As mentioned above, rule-based tasks are those in which the categories can be learned via some explicit reasoning process. Frequently, the rule that maximizes accuracy (i.e., the optimal strategy) is easy to describe verbally. In the most common applications, only one stimulus dimension is relevant, and the subject’s task is to discover this relevant dimension and then to map the different dimensional values to the relevant categories. However, there is no requirement that rule-based tasks be one-dimensional. For example, a conjunction rule (e.g., respond A if the stimulus is small on dimension x and small on dimension y) is a rule-based task because a conjunction is easy to describe verbally. Information-integration category-learning tasks are those in which accuracy is maximized only if information from two or more non-commensurable stimulus dimensions is integrated at some pre-decisional

Figure 1. (A) Stimuli that might be used in a rule-based categorylearning task. The vertical line is the optimal category boundary. (B) Stimuli that might be used in an information-integration category-learning task. The diagonal line is the optimal category boundary.

stage (Ashby & Gott, 1988). Perceptual integration could take many forms – from computing a weighted linear combination of the dimensional values to treating the stimulus as a Gestalt. Typically, the optimal strategy in information-integration tasks is difficult or impossible to describe verbally. Real-world examples of information-integration tasks are common. For example, deciding whether an x-ray shows a tumor requires years of training and expert radiologists are only partially successful at describing their categorization strategies. Some stimuli from rule-based and informationintegration tasks that are typical of the experiments described later in this chapter are shown in Figure 1. In both tasks, the two contrasting categories consist of circular sine-wave gratings (i.e., disks in which luminance varies sinusoidally). The disks are all of equal diameter, but they differ in bar width and orientation. In most experiments, each category contains hundreds of these disks. On each trial of a typical experiment, a disk is randomly selected and presented to the subject, whose task is to assign it to category A or B. In most cases, feedback about response accuracy is given at the end of each trial, and this process is repeated for hundreds or thousands of trials. Figure 1 also shows the decision bounds that maximize categorization accuracy. In the rule-based task (Figure 1a), the optimal bound, denoted by the vertical line, requires subjects to attend to bar width and ignore the grating’s orientation. This bound has a simple verbal description: “Respond A if the bars are thin and B if they are thick.” In the information-integration task (Figure 1b), equal attention must be allocated to both stimulus dimensions. In this task, there is no simple verbal description of the optimal decision bound. For a more thorough discussion of rulebased and information-integration tasks, see Ashby and Maddox (2005). COVIS As mentioned earlier, COVIS postulates two systems that compete throughout learning – an explicit system that uses logical reasoning and an implicit system that uses a form of procedural learning. The explicit system generates and tests hypotheses about category membership, a process that requires working memory and executive attention, and it controls performance in rule-based tasks, primarily because the optimal rule in these tasks is easy to reason about logically. The procedural learning that forms the basis of the implicit system has previously been associated with motor learning (e.g., Willingham, 1998; Willingham, Nissen, & Bullemer, 1989). COVIS assumes that the implicit system learns in an incremental fashion, and unlike the explicit system, is not constrained to learning any particular class of categorization rules. In informationintegration tasks, the optimal rule is difficult to describe

3 verbally, so it is unlikely to be discovered by the hypothesis generation process of the explicit system. As a result, COVIS assumes that the procedural-learning system dominates (asymptotic) performance in information-integration tasks. The next two major sections describe the COVIS explicit and implicit systems, respectively. Computational versions of the two systems are described in the Appendix. THE COVIS EXPLICIT SYSTEM The COVIS explicit system assumes subjects generate and test hypotheses about category membership. For example, the initial hypothesis may be to “respond A if the grating is tilted up, and B if it is tilted down.” This candidate rule is then held in working memory while it is being tested. With the categories

shown in Figure 1a, performance based on this rule will be sub-optimal, so the feedback will eventually tell the subject that this hypothesis is incorrect. At this point, an alternative hypothesis must be selected, and executive attention must be switched from the old hypothesis to the new. This selection, switching, and testing process continues until performance is satisfactory, or until the subject gives up and decides that no satisfactory rule exists. Figure 2 shows the neural mechanisms that mediate performance in the COVIS explicit system during a trial of the category-learning task illustrated in Figure 1. The key structures in the model are the anterior cingulate, the prefrontal cortex, and the head of the caudate nucleus. There are two separate subnetworks in this model – one that maintains candidate rules in working memory during the testing process and that mediates the switch

Figure 2. The COVIS explicit system (black arrows = excitatory projections, gray arrows = inhibitory projections, light gray arrow = dopminergic projections, VTA = ventral tegmental area, SN = substantia nigra pars compacta, CD = caudate nucleus, GP = globus pallidus, MDN = medial dorsal nucleus (of the thalamus), PFC = prefrontal cortex, ACC = anterior cingulate cortex, HC = hippocampus).

4 from one rule to another, and one that generates or selects new candidate hypotheses. The working memory maintenance and attentional switching network includes all structures in Figure 2, except the anterior cingulate. The idea is that the longterm representation of each possible salient rule is encoded in some neural network in sensory association cortex. These cortical units send excitatory signals to working memory units in lateral prefrontal cortex (PFC), which send recurrent excitatory signals back to the same cortical units, thereby forming a reverberating loop. At the same time, the PFC is part of a second excitatory reverberating loop through the medial dorsal nucleus of the thalamus (Alexander, Delong, & Strick, 1986). These double reverberating loops maintain activation in the PFC working memory units during the hypothesistesting procedure. One potential problem with this scheme is that an inhibitory input to the thalamus from the globus pallidus could disrupt processing in the reverberating loops. The spontaneous activity in the GABAergic pallidal cells is high, so without some intervention, this tonic inhibition will prevent the thalamus from closing the corticalthalamic loop, and the information will be lost from working memory. The solution to this problem is for the PFC to excite the head of the caudate nucleus. Cells in the caudate have a low spontaneous firing rate, and instead are characterized by bursts of activity when stimulated by cortex (Bennett & Wilson, 2000). The direct excitatory projection from the PFC causes the caudate cells to fire bursts when the working memory cells in lateral PFC are active. These bursts of caudate activity inhibit the pallidal cells (since the caudate cells are GABAergic), thereby preventing them from disrupting the reverberating cortical-thalamic working memory loop. When feedback convinces a subject that he or she is using an incorrect rule, then a new rule must be selected and executive attention must be switched from the old rule to this new one. In COVIS, these selection and switching operations are mediated by separate neural processes. The process of generating new candidate hypotheses is clearly complex, and a complete model of rule selection is beyond the scope of this chapter, in part because it likely involves poorly understood phenomena such as creativity and insight. We largely beg the question of exactly how rule selection occurs – that is, we simply assume that the process is mediated by some cortical network that includes the PFC and the anterior cingulate. In the computational version of the model, the anterior cingulate selects among alternative rules by enhancing the activity of the specific PFC working memory unit that represents a particular rule (Ashby, Valentin, & Turken, 2002).

Once a new rule is selected, its representation is maintained in its own set of reverberating loops. In Figure 2, one such loop maintains the representation of a rule focusing on the orientation of the bars and one loop maintains a rule focusing on the width. The next stage is to switch attention to the loop encoding this new rule. The next section describes this switching operation in some detail. Switching Attention in the Explicit System Switching, by definition, involves a change in the focus of executive attention. Thus, any model of switching must make assumptions about attention. There is growing consensus that the human attention system is sub-served by separate subsystems, or at least by a broad anatomical network in which different subtasks are mediated by different brain areas (e.g., LaBerge, 1997; Olshausen, Anderson, & Van Essen, 1993; Posner & Petersen, 1990). Although a number of different theories have been proposed, there is widespread agreement that perceptual (i.e., visual) attention is mediated by a posterior system that includes visual cortex, much of the PPC, the pulvinar, and the superior colliculus (e.g., Posner & Petersen, 1990; Olshausen, Anderson, & Van Essen, 1993). In contrast, executive (i.e., conscious) attention is thought to be mediated by an anterior system that includes anterior cingulate, PFC, and perhaps the basal ganglia and pulvinar (e.g., LaBerge, 1997; Posner & Petersen, 1990). Although anatomically separate, the question of whether the systems are functionally separate has been only recently addressed. Although the actions of the two systems are often correlated, empirical evidence that perceptual and executive attention can function independently is growing (e.g., Johnston, McCann, & Remington, 1995; Maddox, Ashby, & Waldron, 2002; Pashler, 1991). Johnston et al. (1995) provided some of the strongest evidence to date for the functional independence of perceptual and executive attention by showing that perceptual and executive attention systems operate at different temporal stages of processing. Specifically, they showed that a critical reference stage of processing (letter identification in their task) operates after the stage of perceptual attention, but prior to the stage of executive attention. Although it is common to assume that visual attention can be allocated simultaneously to more than one location (e.g., Palmer, Verghese, & Pavel, 2000), the possible functional independence of perceptual and executive attention means that executive attention, which we assume to operate in PFC, might follow different rules. In fact, we assume that only one rule (or item) can be in the current focus of attention. A variety of evidence supports this assumption (e.g., Baars, 1988). For

5 example, a long series of studies shows that, with sequentially presented items, rapid privileged access is available only for the single most recent item (Dosher, 1981; McElree, 2001). Even so, it is important to acknowledge that this assumption is controversial. In particular, there is a current debate as to whether executive attention can be allocated to a single item (e.g., Baars, 1988; McElree & Dosher, 2000) or simultaneously to multiple items (e.g., Cowan, 2000). For our purposes, this issue is not critical. We assume that only one rule can be in the current focus of attention, but an alternative version of the COVIS explicit system could easily be developed that allows executive attention to focus on several rules simultaneously. In either case, like a variety of models in the literature (e.g., Cowan, 1988), we assume a hierarchical organization to working memory. The most privileged set is the single rule in the current focus of attention. Next is a larger set of rules that can quickly become the focus of attention. These are candidate rules each with their own reverberating working memory loops through PFC. A much larger set of possible rules is represented in long-term memory and if properly activated could be moved into the intermediate set – that is, they could activate a PFC working memory unit and initiate their own reverberating circuit. There are two general types of attentional switches – automatic and volitional. Automatic attentional switches are typically initiated by some unexpected salient event that occurs in the environment. In contrast, volitional switches are initiated internally via some intentional process that requires effort. In category-learning tasks, attentional switching is volitional, so this section focuses on volitional switches. This is an important provision because we believe that automatic and volitional switches are mediated differently in the mammalian brain. For example, the neuromodulator dopamine likely plays a much more critical role in automatic switches than in volitional switches. Evidence for this comes from a long line of studies showing that dopamine cells (e.g., in the VTA) fire to any unexpected salient stimulus, including unexpected rewards (Mirenowicz & Schultz, 1994), stressful or noxious stimuli (e.g., Imperato, Puglisi-Allegra, Casolini, & Angelucci, 1991; Sorg & Kalivas, 1993), and even to random light flashes and tones that are not associated with either reward or punishment (Horvitz, Stewart, & Jacobs, 1997). Any of these environmental events is likely to elicit an automatic attentional switch. In COVIS, a volitional switch of attention from an old rule to a new rule is mediated by a reduction in the PFC excitatory input to the head of the caudate nucleus. Such de-activation would cause activation in the globus pallidus to increase back to its high baseline levels,

which in turn would inhibit the thalamus, thereby breaking the cortical-thalamic loop. For example, the two classic symptoms of a PFC lesion are impaired working memory and perseveration on the Wisconsin Card Sorting Test (e.g., Kimberg, D’Esposito, & Farah, 1997) – that is, a failure to switch volitionally when the feedback signals that a switch is necessary. Frontal patients have an intact dopamine system and intact basal ganglia. However, since they have lesions of the PFC, they presumably have decreased cortical control of the basal ganglia. So the excitatory projection from PFC to the head of the caudate represents a promising candidate via which a volitional switch might be mediated. In particular, with conscious effort a person could initiate a switch by significantly decreasing this excitation, which would decrease the caudate output, increase the pallidal inhibition of the thalamus, and thereby disrupt the loop that is maintaining the working memory of the rule in the current focus of attention. On the other hand, with conscious effort a person could also allocate more attention to the current rule by increasing the PFC activation of the caudate, which in turn would increase the caudate output, decrease the pallidal output, and thereby strengthen the reverberation in the corticalthalamic loop. According to this hypothesis, the signal for a volitional switch originates in the PFC, but the switching itself is mediated within the basal ganglia (i.e., an increase in basal ganglia output breaks the corticalthalamic loop). Evidence supporting the hypothesis that the basal ganglia play an important role in attentional switching comes from several sources. First, injections of a glutamate agonist directly into the striatum increases the frequency with which cats switch from one motor activity to another in a task where food rewards are delivered for such switching behaviors (Jaspers, De Vries, & Cools, 1990). Second, lesioning the dopamine fibers that project from the VTA into the PFC improves the performance of monkeys in an analogue of the Wisconsin Card Sorting Test (Roberts et al., 1994). If switching occurs in the PFC, then such lesions should impair switching performance. However, the PFC tonically inhibits the VTA (i.e., via a negative feedback loop). Lesioning the dopamine fibers projecting into the PFC releases the VTA from this inhibition. As a consequence, such lesions increase dopamine release into the basal ganglia (Roberts et al., 1994). If the basal ganglia are responsible for switching, and if switching is enhanced by dopamine, then lesioning dopamine fibers into PFC should improve switching, which is exactly the result observed by Roberts et al. (1994). Third, Van Golf Racht-Delatour and El Massioui (1999) demonstrated that rats with lesions to the dorsal striatum had no deficits in learning which arm of a radial-arm maze was

6 initially baited, but they did have deficits, relative to rats with sham lesions, when the position of the baited arm was successively switched according to a simple rule. Finally, there are well known switching deficits in individuals with caudate dysfunction. For example, numerous studies have shown that Parkinson's disease patients, who have abnormally low levels of dopamine in the striatum, have a greater tendency to perseverate on the Wisconsin Card Sorting Test (Brown & Marsden, 1988). Long-Term Storage of Explicit Category Knowledge Consider a case where the explicit system discovers the optimal categorization rule through this hypothesized process of hypothesis generation and testing. Of course, working memory cannot be used to maintain the representation of this optimal rule over any significant period of time. COVIS assumes that the consolidation and long-term representation of category knowledge obtained by the explicit system is mediated by medial temporal lobe structures, and in particular, by the hippocampus. Thus, as in many other models, the projection from the PFC to the hippocampus is assumed to mediate the transition from working memory to longterm declarative memory representations (e.g., Eichenbaum, 1997). The medial temporal lobe declarative memory systems might play an even more prominent role in complex rule-based tasks. When there are many alternative explicit rules to consider, the process of sorting through all these possible rules may exceed the capacity of working memory. In this case, episodic and/or semantic memory may be required to help the subject keep track of which rules have already been tested and rejected. Note that this hypothesis predicts that medial temporal lobe amnesiacs should learn normally in simple rule-based tasks, but they should be impaired in complex rule-based tasks. Several studies have reported evidence supporting this former prediction (Janowsky, Shimamura, Kritchevsky, & Squire, 1989; Leng & Parkin, 1988), but to our knowledge the latter prediction has not been tested. THE COVIS PROCEDURAL-LEARNING SYSTEM Figure 3 shows the COVIS procedural learning-based system (Ashby et al., 1998; Ashby & Waldron, 1999). The key structure is the caudate nucleus, a major input region within the basal ganglia. In primates, all of extrastriate visual cortex projects directly to the tail of the caudate nucleus, with about 10,000 visual cortical cells converging on each caudate cell (Wilson, 1995). The model assumes that, through a procedural learning process, each caudate unit associates an abstract motor

Figure 3. The COVIS procedural-memory-based categorylearning system (Excitatory projections end in solid circles, inhibitory projections end in open circles, and dopaminergic projection is dashed. PFC = prefrontal cortex, Cau = caudate nucleus, GP = globus pallidus, and Th = thalamus).

program with a large group of visual cortical cells (i.e., all that project to it). The medium spiny cells in the tail of the caudate send projections to a variety of prefrontal and premotor cortical areas. There are two synapses on this pathway. The first synapse on the principal path is in the globus pallidus, which is a major output structure within the basal ganglia. The second synapse is in the thalamus, primarily in the ventral anterior nucleus, pars magnocellualaris (VAmc). The primary cortical projection from VAmc is to Broadman Area 8 and the so-called supplementary eye fields (Shook, Schlag-Rey, & Schlag, 1991). Area 8 lies within the anterior bank of the arcuate sulcus, which falls between the PFC and the premotor cortex. Although some authors consider Area 8 to be transitional in function between prefrontal and premotor cortex (e.g., von Bonin & Bailey, 1947), more recent classifications include it as a premotor region (e.g., Pandya & Yeterian, 1985; Passingham, 1993). The supplementary eye fields however, are part of the supplementary motor area (SMA) (Broadman Area 6). Area 8 and the supplementary eye fields both mediate the selection of eye movements and orienting responses (Passingham, 1993). In contrast, the SMA is primarily responsible for selecting motor programs to move limbs. So, for example, animals with SMA lesions are impaired in learning to move a lever one way to one stimulus (call it A) and another way to a different stimulus (call it B),

7 but animals with Area 8 lesions learn normally in this task. In contrast, animals with SMA lesions are unimpaired if the task is to learn to direct attention to a stimulus in one location if A is presented and another stimulus in a different location if B is presented, whereas animals with Area 8 lesions are impaired (Halsband & Passingham, 1985; Petrides, 1985). These results suggest that the learning in the COVIS implicit system may be of orienting responses, rather than of abstract category labels. Few studies have directly tested this prediction, but consistent eye movements have been observed in a variety of cognitive tasks that had no objective requirement for subjects to move their eyes (e.g., Spivey & Geng, 2001). COVIS assumes that the procedural learning in the caudate is facilitated by a dopamine mediated reward signal from the substantia nigra (pars compacta). There is a large literature linking dopamine and reward, and many researchers have argued that a primary function of dopamine is to serve as the reward signal in rewardmediated learning (e.g., Miller, Sanghera, & German, 1981; Montague, Dayan, & Sejnowski, 1996; Wickens, 1993). Fairly specific neurobiological models of this learning process have been developed (e.g., Calabresi, Pisani, Centonze, & Bernardi, 1996; Wickens, 1993). Figure 4 shows a close-up view of a synapse between the axon of a pyramidal cell originating in visual cortex and the dendrite of a medium spiny cell in the caudate nucleus. Note that glutamate projections from visual cortex and dopamine projections from the substantia nigra both synapse on the dendritic spines of caudate medium spiny cells (e.g., DiFiglia, Pasik, & Pasik, 1978; Freund, Powell, & Smith, 1984; Smiley et al., 1994). There are a number of different glutamate receptors, but for our purposes the two most important are the NMDA and AMPA receptors. The AMPA receptor becomes active when small amounts of glutamate are released presynaptically. However, the NMDA receptor has a high threshold for activation because when the postsynaptic cell is hyperpolarized, the NMDA receptor is blocked by a magnesium plug. This plug dissociates from the receptor after the cell is partially depolarized. Thus, a strong presynaptic glutamate signal is required to activate postsynaptic NMDA receptors. The best available evidence indicates that three factors are required to strengthen synapses of the type shown in Figure 4 (i.e., for long-term potentiation or LTP to occur): 1) strong presynaptic activation, 2) strong postsynaptic (i.e., NMDA receptor) activation, and 3) dopamine release (e.g., Calabresi, et al., 1996; Nairn, Hemmings, Walaas, & Greengard, 1998; Wickens, 1993). The first factor should occur for those visual cortical cells that are maximally tuned to the presented stimulus, and the third factor will occur if the subject is

Figure 4. A closer view of a cortical-striatal synapse. The axon of a visual cortical cell releases glutamate (Glu) onto the dendritic spine of a medium spiny cell of the caudate nucleus. Dopamine cells of the substantia nigra also project onto medium spiny cells cells and upon presentation of reward, release dopamine (DA) into the same synapse.

rewarded for a correct response. Thus, according to COVIS, synapses in the implicit system between visual cortical cells and medium spiny cells in the caudate are strengthened if the visual cortical cell is maximally tuned to the presented stimulus (factors 1 and 2) and the subject is rewarded for responding correctly (factor 3). On the other hand, if either of factors 2 or 3 is missing, then it is thought that the strength of the synapse will weaken (i.e., long-term depression or LTD will occur; e.g., Calabresi, et al., 1996). Note that there are several ways this could happen. One is if the subject responds incorrectly (so factor 3 is missing), and another is if the visual cortical cell responds only weakly to the presented stimulus. Thus, this model of LTD predicts that any synapse responsible for the subject emitting an incorrect response will be weakened, as will any synapse that is driven by a cell in visual cortex that does not encode the perceptual representation of the stimulus. The combination of this with the three-factor model of LTP produces a powerful learning algorithm. The three-factor model of LTP is appealing, but a serious timing problem must be solved before it could operate effectively. The problem is that shortly after the stimulus is presented, the (visual) cortical-striatal (i.e., tail of caudate) synapse will be activated, but the dopamine release must necessarily occur some time later (e.g., several seconds), because it follows the reward, which follows the response, which follows the stimulus presentation. Evolution has produced a beautiful and

8 ingenious solution to this problem. NMDA receptor activation causes an influx of free Ca2+ into the spines of the caudate medium spiny cells. This Ca2+ triggers a number of chemical reactions, some of which have the effect of depolarizing the cell and eventually causing it to fire. After the cell fires, a natural hyperpolarization process is triggered that resets its membrane potential. The result is that by the time the dopamine is released, the depolarization produced by the presynaptic glutamate release has been erased from the major compartments of the medium spiny cell. However, because the spines are somewhat separated from the bulk of the intracellular medium, the mechanisms that reset the membrane potential operate more slowly than in the main cellular compartments. In fact, it turns out that free Ca2+ persists in the spines for several seconds after entering the cell (Gamble & Koch, 1987; MacDermott et al., 1986). Thus, so long as the reward is delivered within a few seconds of the response, a trace will still exist in the critical spines that were responsible for eliciting the behavior that earned the reward, and so the correct synapses will be strengthened (i.e., via LTP). Note that an obvious, and exceptionally strong prediction of this model is that if the feedback is delayed more than a few seconds then learning should be severely disrupted in information-integration tasks. It is important to note that the model shown in Figure 3 is strictly a model of visual category learning. However, it is feasible that a similar system exists in the other modalities, since they almost all also project directly to the basal ganglia, and then indirectly to frontal cortical areas (via the thalamus and either the substantia nigra pars reticulata or the globus pallidus; e.g., Chudler, Sugiyama, & Dong, 1995). The main difference is in where within the basal ganglia they initially project. For example, auditory cortex projects directly to the body of the caudate (i.e., rather than to the tail; Arnalud, Jeantet, Arsaut, & Demotes-Mainard, 1996). COMPTETITION BETWEEN THE COVIS EXPLICIT AND IMPLICIT SYSTEMS If human category learning is mediated by multiple systems, then an important question is to ask how the various systems interact and how their separate contributions are coordinated during the process of response selection. There is almost no data that directly addresses this question, and as a result this topic is likely to be the focus of intense research efforts during the next few years. In one of the few studies examining this issue, Poldrack et al. (2001) reported an antagonistic relationship between neural activation in the basal ganglia and medial temporal lobes (i.e., using fMRI) – that is, basal ganglia activation tended to increase with

category learning, whereas medial temporal lobe activation decreased. While this result is suggestive, it is obviously difficult to draw strong conclusions from a single study. The original version of COVIS (Ashby et al., 1998) assumed that the explicit and implicit systems learned independently, but that they competed for control of the observable response on each trial. The competition was implemented in the following way. The output of each COVIS system is a discriminant value. Roughly speaking, these discriminant values measure the distance from the percept to the decision bounds used by each system, with the provision that the values are positive in the response region assigned to category A (say) and negative in the region assigned to category B. Response confidence generally increases with distance-to-bound, so the absolute value of the discriminant is a measure of confidence. In COVIS, parameters θE and θI measure the relative strengths of the explicit and implicit systems, respectively (where θE + θI = 1). COVIS assumes that people initially rely almost exclusively on the explicit system (at least in western cultures), so at the beginning of learning θE is much larger than .5 (our applications typically set the initial value of θE to 0.99). However, in information-integration tasks, the explicit system frequently fails and the implicit system is often rewarded, and as a consequence θE gradually decreases with practice and θI increases. On each trial, the observable response is determined by whichever system has the greatest weighted confidence, which is defined as the system weight (θE or θI) multiplied by the absolute value of the system discriminant function. Simulations of COVIS in information-integration tasks show that after extensive practice θI exceeds θE, but even after thousands of trials, θE rarely falls much below 0.4. Thus COVIS predicts that even after massive amounts of practice, subjects will still use explicit rules on some trials in information-integration tasks. For example, in the Figure 1b information-integration task disks with the narrowest bars are all in category A and disks with the widest bars are all in category B. COVIS predicts that for such extreme stimuli subjects may apply an explicit rule, whereas for disks with bars of intermediate width they would learn that no rule is effective and instead rely on procedural knowledge. DISSOCIATIONS BETWEEN RULE-BASED AND INFORMATION-INTEGRATION CATEGORY LEARNING A number of recent experiments tested parameterfree a priori predictions made by COVIS. Collectively, the results of these studies also provide strong evidence that learning in these tasks is mediated by separate systems. A number of these results test COVIS

9 predictions about the nature and timing of trial-by-trial feedback about response accuracy. In particular, COVIS predicts that, because the explicit system has access to working memory and executive attention, it should be relatively unaffected by changes in the timing and form of the feedback signal. In contrast, the COVIS implicit system is highly sensitive to feedback parameters. First, because COVIS predicts that a dopamine mediated reward signal is necessary for learning (e.g., LTP) to occur in the caudate nucleus, the absence of such a reward signal should greatly interfere with this form of implicit category learning. In support of this prediction, in the absence of any trial-by-trial feedback about response accuracy, people can learn some rule-based categories, but there is no evidence that they can learn information-integration categories (Ashby, Queller, & Berretty, 1999). Second, COVIS predicts that learning in information-integration tasks should be impaired (relative, say, to learning in rule-based tasks) when the category label is shown before stimulus presentation (rather than after the response). To test this prediction, Ashby, Maddox, and Bohil (2002) trained subjects on rule-based and information-integration categories using an observational training paradigm in which subjects are informed before stimulus presentation of which category the ensuing stimulus is from. Following stimulus presentation, subjects then pressed the appropriate response key. Traditional feedback training was as effective as observational training with rule-based categories, but with information-integration categories, feedback training was significantly more effective than observational training. A third feedback-related prediction of COVIS is that the timing of the feedback signal relative to the response should be critical for information integration learning but not for rule-based learning. As described above, if the reward (e.g., feedback signal) is delayed by more than a few seconds, then the ensuing dopamine release will strengthen inappropriate synapses in the tail of the caudate and learning in information-integration tasks should therefore be adversely affected. In contrast, the explicit system can hold critical information in working memory for many seconds, and therefore COVIS predicts that rule-based learning should not be affected significantly by feedback delays of a few seconds. These predictions were tested by Maddox, Ashby, and Bohil (2003), who showed that information-integration category learning is impaired if the feedback signal is delayed by as little as 2.5 s after the response. In contrast, delays as long as 10 s had no effect on rulebased category learning. Another set of studies tested the fundamental assumption of COVIS that information-integration

categorization uses procedural learning, whereas rulebased category learning does not. Ashby, Ell, and Waldron (2003) had subjects learn either rule-based or information-integration categories using traditional feedback training. Next, some subjects continued as before, some switched their hands on the response keys, and for some the location of the response keys was switched (so the Category A key was assigned to Category B and vice versa). For those subjects learning rule-based categories, there was no difference among any of these transfer instructions, thereby suggesting that abstract category labels are learned in rule-based categorization. In contrast, for those subjects learning information-integration categories, switching hands on the response keys caused no interference, but switching the locations of the response keys caused a significant decrease in accuracy. Thus, it appears that response locations are learned in information-integration categorization, but not specific motor programs. The importance of response locations in informationintegration category learning, but not in rule-based category learning was confirmed in a recent study by Maddox, Bohil, and Ing (in press). These informationintegration results essentially replicate results found with traditional procedural-learning tasks (Willingham, Wells, Farrell, & Stemwedel, 2000). A third set of studies tested the COVIS prediction that working memory and executive attention are critical for rule-based category learning, but not for informationintegration category learning. First, Waldron and Ashby (2001) had subjects learn rule-based and informationintegration categories under typical single-task conditions and when simultaneously performing a secondary task that requires working memory and executive attention. The dual task had a massive detrimental effect on the ability of subjects to learn the simple one-dimensional rule-based categories (trials-tocriterion increased by 350%), but no significant effect on the ability of subjects to learn the complex informationintegration categories. This result alone is highly problematic for unified accounts of rule-based and information-integration categorization. Arguably the most successful existing single-process model of category learning is Kruschke’s (1992) exemplar-based ALCOVE model. Ashby and Ell (2002) showed that the only versions of ALCOVE that can fit the Waldron and Ashby data make the strong prediction that after reaching criterion accuracy on the one-dimensional rulebased structures, subjects would have no idea that only one dimension was relevant in the dual-task conditions. Ashby and Ell (2002) reported empirical evidence that strongly disconfirmed this prediction. Thus, the best available single-system model fails to account even for

10 the one dissociation reported by Waldron and Ashby (2001). Second, Maddox, Ashby, Ing, and Pickering (2004) tested the COVIS prediction that feedback processing requires attention and effort in rule-based category learning, but not in information-integration category learning. In this study, subjects alternated a trial of categorization with a trial of Sternberg (1966) memoryscanning. Two conditions were identical except one had a short delay after the categorization trial and before memory scanning, whereas the other had a short delay after memory scanning and before categorization. Information-integration category learning was identical in these two conditions, whereas rule-based category learning was significantly impaired when subjects had only a short time to process the categorization feedback. It is important to realize that these dissociations are not driven simply by differences in the difficulty of rulebased versus information-integration tasks. First, in several cases the experimental manipulation interfered more with the learning of the simple rule-based categories than with the more difficult informationintegration strategies (Maddox et al., 2004; Waldron & Ashby, 2001). Second, most of the studies explicitly controlled for difficulty differences either by decreasing the separation between the one-dimensional rule-based categories, or by using a more complex two-dimensional conjunction rule in the rule-based conditions. Both manipulations increase the difficulty of rule-based categorization, yet in no case did such increases in rulebased task difficulty affect the qualitative dissociations described above. CONCLUSIONS The preceding section demonstrates that COVIS provides a powerful description of the differences between the learning that dominates in rule-based and information-integration category-learning tasks. Even more impressive, however, is the fact that without COVIS, many of the experiments described above would never be run. For example, it is difficult to imagine how any purely cognitive theory could ever predict that delaying feedback by a few seconds should interfere with one type of category learning but not with another, or that switching the location of the response keys should cause a similar selective interference. Of course, one of the greatest benefits of a good theory is that it suggests new experiments to run, which themselves add to our knowledge in new and unexpected ways. In this sense, at least, COVIS has already been quite successful. This chapter has focused on cognitive behavioral tests and predictions of COVIS. But the theory also makes a variety of other types of predictions – mostly because of its neurobiological underpinnings. In

particular, it makes predictions about which neuropsychological patient groups should be impaired in category learning, and which groups should have normal category-learning abilities. Further, when it predicts a deficit, it makes fairly specific predictions about the precise nature of these hypothesized deficits. For example, COVIS predicts that patients with lesions confined to the PFC should be impaired in rule-based category-learning tasks, but they should be relatively normal in information-integration tasks. The next chapter reviews the existing neuropsychological tests of COVIS, with a focus on patients with striatal or hippocampal dysfunction. Despite its many initial successes, there are some recent signs that COVIS may be incomplete as a theory of human category learning. One important body of evidence pointing in this direction comes from a third type of category-learning task called the “(A, not A) prototype distortion task” (Ashby & Maddox, 2005), in which each exemplar of one category is created by randomly distorting a single prototype (e.g., Posner & Keele, 1968). The subject’s task is to respond “Yes” or “No” depending on whether or not the presented stimulus is a member of this category (not A stimuli are just random patterns). Several studies have shown that a variety of neuropsychological patient groups that are known to have deficits in rule-based and/or informationintegration tasks show apparently normal (A, not A) prototype distortion learning, a result that is obviously problematic for the original version of COVIS. This list includes patients with Parkinson’s disease (Reber & Squire, 1999), schizophrenia (Keri, Kelemen, Benedek, & Janka, 2001), or Alzheimer’s disease (Sinha, 1999; although see Keri et al., 1999). COVIS hypothesizes that rule-based category learning uses working memory and other declarative memory systems (e.g., episodic and semantic memory), and that information-integration learning uses procedural memory. One possibility is that performance in (A, not A) prototype distortion tasks is mediated by some different memory system. An obvious possibility is the perceptual representation memory system (Ashby & Casale, 2002; Reber & Squire, 1999). In support of this hypothesis, all of the existing neuroimaging studies of (A, not A) prototype distortion tasks have reported learning-related changes in occipital cortex (Aizenstein et al., 2000; Reber, Stark, & Squire, 1998a, 1998b). Learning is, by definition, the process of laying down some sort of memory trace, and there is certainly no reason to suspect that any of the separate memory systems that have been hypothesized are incapable of storing memories about categories. For this reason, a complete theory of human category learning is likely to assign some role to each of the different

11 memory systems that have been identified by memory researchers. APPENDIX Network Implementation of the Explicit System This section describes the computational version of the COVIS explicit system. The model described in this section is a hybrid neural network that consists of both symbolic and connectionist components. The model’s hybrid character arises from its combination of explicit rule selection and switching and its incremental saliencelearning component. First, denote the set of all available explicit rules by U = {R1, R2,…,Rm}. In most applications, the set U will include all possible one-dimensional rules, and perhaps a variety of plausible conjunction rules. Next, suppose rule Ri is used on trial n. If the response on trial n is correct, then rule Ri is used again on trial n+1. If the response on trial n is incorrect, then the active rule on trial n+1 is selected via the following three steps. Step 1: Choose a rule at random from U. Call this rule Rj. This is the rule selected by the frontal selection system (i.e., anterior cingulate/prefrontal cortex) for a possible attentional switch. Step 2: Define a weight, Y, for each rule as follows: Yi(n) = Zi(n) + (, for the active rule Ri, Yj(n) = Zj(n) + X, for the rule Rj chosen in step 1, and Yk(n) = Zk(n), for all rules Rk … Ri or Rj. The constant Zl(n) is a measure of the current salience of rule Rl. The parameter ( is a positive constant that reflects the tendency of the subject to perseverate. The larger the value of (, the greater the tendency to perseverate on the current rule, even in the presence of negative feedback. Thus, ( reflects the ability of the head of the caudate to switch attention to a new rule. If ( is small, then switching will be easy, whereas switching is difficult if ( is large. Finally, X is a positive-valued random variable that represents the attempt of the frontal selection system to select rule Rj. In most applications, X is a Poisson distributed random variable with mean 8. Larger values of 8 increase the probability that attention will be switched to the selected rule Rj. The initial salience of rule Rl [i.e., Zl(0)] is set to a value that reflects the subject’s initial tendency to experiment with this rule. For example, one reasonable strategy would be to set the initial salience of all onedimensional rules to one constant value and the initial salience of any conjunction rules to some lower constant value. After setting these initial saliences, they are then updated each time rule Rl is used. Specifically, if rule Rl is used on trial n-1 and a correct response occurs, then

Zl(n) = Zl(n-1) + )C, where )C is some positive constant. If rule Rl is used on trial n-1 and an error occurs, then Zl(n) = Zl(n-1) - )E, where )E is also a positive constant. Step 3: Choose the rule for trial n+1 with the greatest weight Y – that is, choose rule Rl on trial n+1 if Yl(n) = max[Y1(n), Y2(n),…,Ym(n)]. This algorithm has a number of attractive properties. First, the more salient the rule, the higher the probability that it will be selected, even after an incorrect trial. This feature enables the algorithm to be resistant to catastrophic interference. Second, after the first trial, feedback is used to adjust the selection probabilities up or down, depending on the success of the rule type. Third, the model has separate selection and switching parameters, reflecting the assumption of COVIS that these are separate operations. The random variable X models the selection operation. The greater the mean of X (i.e., 8), the greater the probability that the selected rule (Rj) will become active. COVIS assumes 8 increases with dopamine levels in frontal cortex. In contrast, the parameter ( models switching, because when ( is large, it is unlikely that the system will switch to the selected rule Rj. COVIS assumes that switching will be more difficult when there are abnormally low levels of dopamine in the striatum. Accordingly, ( is assumed to vary inversely with dopamine levels in the caudate. Once a particular rule is selected, say rule Rj, then the explicit system selects a response by computing a discriminant value hj(n) and using the following decision rule: Respond A on trial n if hj(n) > 0; respond B if hj(n) < 0. A full description of how discriminant values are computed is beyond the scope of this chapter (for details, see Ashby, 1992), but the general idea, which is approximately correct, is that hj(n) is the distance from the stimulus to the decision bound, with the provision that these distances are positive in the response region associated with category A and negative in the region associated with category B. Each one-dimensional rule has an associated decision criterion (e.g., the x-intercept of the vertical bound in Figure 1a), and each conjunction rule has two. The values of these constants are learned via a conventional gradient-descent process with learning rate δ. Thus, the model has 5 parameters, ( (perseveration parameter), 8 (selection parameter), )C (salience increment when correct), )E (salience increment when incorrect), δ (gradient-descent learning rate).

12 Network Implementation of the Implicit System From a computational perspective, the procedural learning in the caudate could occur in one of two ways. One possibility is that the caudate learns a decision bound. In this case, learning could be modeled as a process via which parameters that describe the bound are updated as the subject gains experience with the categories. A second possibility is that the caudate learns to assign responses to regions of perceptual space. Under this scenario, the decision bound has no psychological meaning, and instead is simply the partition between regions associated with contrasting responses. Ashby and Waldron (1999) tested between these two possibilities and found strong evidence against the former. In particular, their data disconfirmed the hypothesis that people learn decision bounds in information-integration tasks. Instead, the data are consistent with the hypothesis that the caudate learns to assign responses to regions of perceptual space. Ashby and Waldron (1999) proposed a computational model called the striatal pattern classifier (SPC) that implements this idea. The SPC, which is illustrated in Figure 5, is a model of the COVIS procedural-learning system. Figure 5a shows the category structure of an information-integration task. In this case, however, the optimal decision bound is quadratic. A large literature shows that healthy young adults often eventually learn to respond in a nearly optimal fashion in experiments of this type (e.g., Ashby & Gott, 1988; Ashby & Maddox, 1990, 1992; Maddox & Ashby, 1993). A simplified version of the SPC is illustrated in Figure 5b for the information-integration categorylearning experiment shown in Figure 5a. The twodimensional bar width-orientation space shown in 5b depicts the perceptual representation of the disks used in the Figure 5a experiment. Thus, small regions in this space are associated with a distinct cell in some extrastriate visual area (i.e., the cell maximally stimulated when a particular disk is shown). The four large dots represent four different striatal units. Each striatal unit is associated with one of the two category responses, which creates four distinct regions in perceptual space. In Figure 5b, two of those regions are associated with category A and two are associated with category B. The model assumes that each percept is mapped to the nearest striatal unit, and therefore that the subject emits the response associated with the striatal unit that is nearest the percept. Perceptual noise causes the percept elicited by each stimulus to vary randomly from trial to trial. All existing applications of the SPC have assumed that this perceptual noise has a multivariate normal distribution with mean 0 on each dimension and variance-covariance matrix equal to σ2I,

Figure 5. (A) Category structure of an information-integration category-learning task in which the optimal decision bound is quadratic. Each stimulus is a sine-wave grating that varies across trials in bar width and orientation. Each black plus denotes the bar width and orientation of an exemplar of Category A and each gray dot denotes the bar width and orientation of an exemplar of Category B. (B) A simplified version of the SPC. Each solid dot represents a different unit in the caudate nucleus.

where I is the identity matrix. The predicted probability that a subject will assign a particular stimulus to category A, say, is therefore equal to the volume, or integral under the distribution representing the

13 perceptual effects of that stimulus throughout the category A response region. For a detailed description of an efficient algorithm for computing these integrals, see Ennis and Ashby (2003). On trials when a stimulus with coordinates x is presented, a discriminant value can be defined for the SPC by computing log P(RB|x) - log P(RA|x). A fundamental feature of the SPC is the convergence it assumes between high-resolution perceptual space (i.e., in visual cortex) and low-resolution decision space (i.e., in the tail of the caudate). One implication of this convergence assumption is that the SPC predicts there is a limit on the complexity of the decision bound that can be learned in information-integration tasks. Although many studies have shown that people can learn linear or quadratic information-integration bounds, only a few studies have tested the ability of people to learn bounds more complex than the quadratic bound shown in Figure 5a. In support of the SPC prediction, each of these have reported a failure of people to learn these complex bounds, despite many hundreds of trials of practice (Ashby, Waldron, Lee, & Berkman, 2001; McKinley & Nosofsky, 1995). REFERENCES Aizenstein, H.J., MacDonald, A.W., Stenger, V.A., Nebes, R.D., Larson, J.K., Ursu, S. & Carter, C.S. (2000). Complementary category learning systems identified using event-related functional MRI. Journal of Cognitive Neuroscience, 12, 977-987. Alexander, G. E., DeLong, M. R., & Strick, P. L. (1986). Parallel organization of functionally segregated circuits linking basal ganglia and cortex. AnnualReview of Neuroscience, 9, 357-381. Arnalud, E., Jeantet, Y., Arsaut, J., & Demotes-Mainard, J. (1996). Involvement of the caudal striatum in auditory processing: c-fos response to cortical application of picrotoxin and to auditory stimulation. Brain Research: Molecular Brain Research, 41, 27-35. Ashby, F.G., & Casale, M.B. (2002). The cognitive neuroscience of implicit category learning. In L. Jimenez (Ed.), Attention and implicit learning. Amsterdam & Philadelphia: John Benjamins Publishing Company. Ashby, F. G., & Ell, S. W. (2002). Single versus multiple systems of category learning: Reply to Nosofsky and Kruschke (2001). Psychonomic Bulletin & Review, 9, 175180. Ashby, F. G., Ell, S. W., & Waldron, E. M. (2003). Procedural learning in perceptual categorization. Memory and Cognition, 31, 1114-1125. Ashby, F. G. & Ennis, D. M. (2003, August 25). Fitting Decision Bound Models to Identification or Categorization Data. Numerical Routines & Software. [Online]. Available: http://www.psych.ucsb.edu/~ashby/cholesky.pdf Ashby, F. G. & Gott, R. E. (1988). Decision rules in the perception and categorization of multidimensional stimuli.

Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 33-53. Ashby, F. G., & Maddox, W. T. (1990). Integrating information from separable psychological dimensions. Journal of Experimental Psychology: Human Perception and Performance, 16, 598-612. Ashby, F. G., & Maddox, W. T. (1992). Complex decision rules in categorization: Contrasting novice and experienced performance. Journal of Experimental Psychology: Human Perception and Performance, 18, 50-71. Ashby, F. G., & Maddox, W. T. (2005). Human category learning. Annual Review of Psychology, in press. Ashby, F. G., Maddox, W. T., & Bohil, C.J. (2002). Observational versus feedback training in rule-based and information-integration category learning. Memory & Cognition, 30, 666-677. Ashby, F.G., Noble, S., Filoteo, J.V., Ell, S.W., & Waldron, E.M. (2003). Category learning deficits in Parkinson’s Disease. Neuropsychology, 17, 115-124. Ashby, F. G., Queller, S., & Berretty, P. T. (1999). On the dominance of unidimensional rules in unsupervised categorization. Perception & Psychophysics, 61, 1178-1199. Ashby, F.G., Valentin, V.V. & Turken, A.U. (2002). The effects of positive affect and arousal on working memory and executive attention: Neurobiology and computational models. In S. Moore & M. Oaksford (Eds.), Emotional Cognition: From Brain to Behavior (pp. 245-287). Amsterdam: John Benjamins. Ashby, F. G. & Waldron, E. M. (1999). The nature of implicit categorization. Psychonomic Bulletin & Review, 6, 363-378. Baars, B.J. (1988). A cognitive theory of consciousness. New York: Cambridge Press. Bennett, B.D. & Wilson, C.J. (2000). Synaptoloty and physiology of neostriatal neurons. In R. Miller & J.R. Wickens (Eds.) Brain dynamics and the striatal complex (pp. 111-140). Amsterdam: Harwood. Brown, R. G. & Marsden, C. D. (1988). Internal versus external cures and the control of attention in Parkinson's disease. Brain, 111, 323-345. Calabresi, P., Pisani, A., Centonze, D., & Bernardi, G. (1996). Role of Ca2+ in striatal LTD and LTP. Seminars in the Neurosciences, 8, 321-328. Chudler, E.H., Sugiyama, K. & Dong, W.K. (1995). Multisensory convergence and integration in the neostriatum and globus pallidus of the rat. Brain Research, 674, 33-45. Cooper, J.R., Bloom, F.E. & Roth, R.H. (1991). The Biochemical Basis of Neuropharmacology (Sixth Edition). New York: Oxford. Cowan, N. (1988). Evolving conceptions of memory storage, selective attention, and their mutual constraints within the human information-processing system. Psychological Bulletin. 104, 163-191. Cowan, N. (2000). The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences, 24, 87-185. DiFiglia, M., Pasik, T. & Pasik, P. (1978). A Golgi study of afferent fibers in the neostriatum of monkeys. Brain Research,152, 341-7.

14 Dosher, B.A. (1981). The effects of delay and interference: A speed-accuracy study. Cognitive Psychology, 13, 551-582. Eichenbaum, H. (1997). Declarative memory: Insights from cognitive neurobiology. Annual Review of Psychology, 48, 547-572. Freund, T.F., Powell, J.F. & Smith, A.D. (1984). Tyrosine hydroxylase-immunoreactive boutons in synaptic contact with identified striatonigral neurons, with particular reference to dendritic spines. Neuroscience, 13, 1189-215. Gamble, E. & Koch, C. (1987). The dynamics of free calcium in dendritic spines in response to repetitive synaptic input. Science, 236, 1311-1315. Halsband, U. & Passingham, R.E. (1985). Premotor cortex and the conditions for movement in monkeys. Behavioral Brain Research, 18, 269-276. Horvitz, J.C., Stewart, T. & Jacobs, B.L. (1997). Burst activity of ventral tegmental dopamine neurons is elicited by sensory stimuli in the awake cat. Brain Research, 759, 251258. Imperato, A., Puglisi-Allegra, S., Casolini, P. & Angelucci, L. (1991). Changes in brain dopamine and acetylcholine release during and following stress are independent of the pituitary-adrenocortical axis. Brain Research, 538, 111-117. Janowsky, J. S., Shimamura, A. P., Kritchevsky, M., & Squire, L. R. (1989). Cognitive impairment following frontal lobe damage and its relevance to human amnesia. Behavioral Neuroscience, 103, 548-560. Jaspers, R.M., De Vries, T.J. & Cools, A.R. (1990). Enhancement in switching motor patterns following local application of the glutamate agonist AMPA into the cat caudate nucleus. Behavioral Brain Research, 37, 237-246. Johnston, J.C., McCann, R.S. & Remington, R.W. (1995). Chronometric evidence for two types of attention. Psychological Science, 6, 365-369. Kéri, S., Kalman, J., Rapcsak, S. Z., Antal, A., Benedek, G. & Janka, Z. (1999). Classification learning in Alzheimer’s disease. Brain, 122, 1063-1068. Kéri, S., Kelemen, O., Benedek, G., & Janka, Z. (2001). Intact prototype learning in schizophrenia. Schizophrenia Research, 52, 261-264. Kimberg, D.Y., D’Esposito, M. & Farah, M.J. (1997). Effects of bromocriptine on human subjects depend on working memory capacity. Neuroreport, 8, 3581-3585. Kruschke, J.K. (1992). ALCOVE: An exemplar-based connectionist model of category learning. Psychological Review, 99, 22-44. LaBerge, D. (1997). Attention, awareness, and the triangular circuit. Consciousness & Cognition, 6, 149-181. Leng, N. R., & Parkin, A. J. (1988). Double dissociation of frontal dysfunction in organic amnesia. British Journal of Clinical Psychology, 27, 359-362. MacDermott, A.B., Mayer, M.L., Westbrook, G.L., Smith, S.J. & Barker, J.L. (1986). NMDA-receptor activation increases cytoplasmic calcium concentration in cultured spinal cord neurons. Nature, 321, 519-22. Erratum in: Nature, 321, 888. Maddox, W.T. & Ashby, F.G. (1993). Comparing decision bound and exemplar models of categorization. Perception & Psychophysics, 53, 49-70. Maddox, W.T., Ashby, F.G., & Bohil, C.J. (2003). Delayed feedback effects on rule-based and information-integration

category learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 650-662. Maddox, W. T., Ashby F. G., Ing, A. D., & Pickering, A. D. (2004). Disrupting feedback processing interferes with rulebased but not information-integration category learning. Memory & Cognition, 32, 582-591. Maddox, W.T., Ashby, F.G. & Waldron, E.M. (2002). Multiple attention systems in perceptual categorization. Memory & Cognition, 30, 325-339. Maddox, W. T., Bohil, C. J., & Ing, A. D. (in press). Evidence for a procedural learning-based system in category learning. Psychonomic Bulletin & Review. McElree, B. (2001). Working memory and focal attention. Journal of Experimental Psychology: Learning, Memory, & Cognition, 27, 817-835. McElree, B. & Dosher, B.A. (2000). The focus of attention across space and across time. Behavioral and Brain Sciences, 24, 129-130. McKinley, S. C. & Nosofsky, R. M. (1995). Investigations of exemplar and decision bound models in large, ill-defined category structures. Journal of Experimental Psychology: Human Perception & Performance, 21, 128-148. Miller, J. D, Sanghera, M. K., & German, D. C. (1981). Mesencephalic dopaminergic unit activity in the behaviorally conditioned rat. Life Sciences, 29, 1255-63. Montague, P. R., Dayan, P., & Sejnowski, T. J. (1996). A framework for mesencephalic dopamine systems based on predictive Hebbian learning. Journal of Neuroscience, 16, 1936-1947. Mirenowicz, J. & Schultz, W. (1994). Importance of unpredictability for reward responses in primate dopamine neurons. Journal of Neurophysiology, 72, 1024-1027. Nairn, A.C., Hemmings, H.C. Jr., Walaas, S.I. & Greengard, P. (1988). DARPP-32 and phosphatase inhibitor-1, two structurally related inhibitors of protein phosphatase-1, are both present in striatonigral neurons. Journal of Neurochemistry, 50, 257-62. Olshausen, B.A., Anderson, C.H. & Van Essen, D.C. (1993). A neurobiological model of visual attention and invariant pattern recognition based on dynamic routing of information. Journal of Neuroscience, 13, 4700-4719. Palmer, J., Verghese, P. & Pavel, M. (2000). The psychophysics of visual search. Vision Research, 40, 12271268. Pashler, H. (1991). Shifting visual attention and selecting motor responses: Distinct attentional mechanisms. Journal of Experimental Psychology: Human Perception and Performance, 17, 1023-1040. Pandya, D.N. & Yeterian, E.H. (1985). Architecture and connections of cortical association areas (pp. 3-61). In A. Peters & E.G. Jones (Eds.) Association and Auditory Cortices. New York, NY: Plenum. Passingham R. E. (1993). The frontal lobes and voluntary action. Oxford; New York : Oxford University Press. Petrides, M. (1985). Deficits in nonspatial conditional associative learning after periarcuate lesions in the monkey. Behavioral Brain Research, 16, 95-101. Poldrack, R. A., Clark, J., Pare-Blagoev, E.J., Shohamy, D., Moyano, J.C., Myers, C. & Gluck, M.A. (2001). Interactive memory systems in the human brain. Nature, 414, 546-550.

15 Posner, M.I. & Keele, S.W. (1968). On the genesis of abstract ideas. Journal of Experimental Psychology, 77, 353-363. Posner, M. I. & Petersen, S. E. (1990). Attention systems in the human brain. Annual Review of Neuroscience, 13, 2542. Reber, P. J., & Squire, L. R. (1999). Intact learning of artificial grammars and intact category learning by patients with Parkinson's disease. Behavioral Neuroscience, 113, 235242. Reber, P.J., Stark, C. E. L. & Squire, L.R. (1998a). Cortical areas supporting category learning identified using functional magnetic resonance imaging. Proceedings of the National Academy of Sciences, USA, 95, 747-750. Reber, P.J., Stark, C.E.L. & Squire, L.R. (1998b). Contrasting cortical activity associated with category memory and recognition memory. Learning and Memory, 5, 420-428. Roberts, A.C., De Salvia, M.A., Wilkinson, L.S., Collins, P., Muir, J.L., Everitt, B.J. & Robbins, T.W. (1994). 6hydroxydopamine lesions of the prefrontal cortex in monkeys enhance performance on an analog of the Wisconsin card sort test: Possible interactions with subcortical dopamine. The Journal of Neuroscience, 14, 2531-2544. Shook, B. L., Schlag-Rey, M., & Schlag, J. (1991). Primate supplementary eye field. II Comparative aspects of connections with the thalamus, corpus striatum, and related forebrain nuclei. Journal of Comparative Neurology, 307, 562-583. Sinha, R. R. (1999). Neuropsychological substrates of category learning. Dissertation Abstracts International: Section B: The Sciences & Engineering, 60(5-B), 2381. (UMI No. AEH9932480). Smiley, J.F., Levey, A.I., Ciliax, B.J. & Goldman-Rakic PS. (1994). D1 dopamine receptor immunoreactivity in human and monkey cerebral cortex: predominant and extrasynaptic localization in dendritic spines. Proc Natl Acad Sci USA, 91, 5720-4. Sorg, B.A. & Kalivas, P.W. (1993). Effects of cocaine and footshock stress on extracellular dopamine levels in the medial prefrontal cortex. Neuroscience, 53, 695-703. Spivey, M. J., & Geng, J. J. (2001) Oculomotor mechanisms activated by imagery and memory: eye movements to absent objects. Psychological Research, 65, 235-241. Sternberg, S. (1966). High-speed scanning in human memory. Science, 153, 652-654. Van Golf Racht-Delatour, B. & El Massioui, N. (1999). Rulebased learning impairment in rats with lesions to the dorsal striatum. Neurobiology of Learning & Memory, 72, 47-61. Waldron, E. M. & Ashby, F. G. (2001). The effects of concurrent task interference on category learning. Psychonomic Bulletin & Review, 8, 168-176. Wickens, J. (1993). A theory of the striatum. New York: Pergamon Press. Willingham, D.B. (1998). A neuropsychological theory of motor skill learning. Psychological Review, 105, 558-584. Willingham, D.B., Nissen, M.J., & Bullemer, P. (1989). On the development of procedural knowledge. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 1047-1060.

Willingham, D.B., Wells, L.A., Farrell, J.M. & Stemwedel, M.E. (2000). Implicit motor sequence learning is represented in response locations. Memory & Cognition, 28, 366-375. Wilson, C. J. (1995). The contribution of cortical neurons to the firing pattern of striatal spiny neurons. In J. C. Houk, J. L. Davis, & D. G. Beiser (Eds.) Models of information processing in the basal ganglia (pp. 29-50). Cambridge, MA: MIT Press.