Cogn Affect Behav Neurosci (2011) 11:573–599 DOI 10.3758/s13415-011-0054-x

A computational model of fMRI activity in the intraparietal sulcus that supports visual working memory

Dražen Domijan

Published online: 25 August 2011
© Psychonomic Society, Inc. 2011

Abstract A computational model was developed to explain a pattern of results of fMRI activation in the intraparietal sulcus (IPS) supporting visual working memory for multiobject scenes. The model is based on the hypothesis that dendrites of excitatory neurons are major computational elements in the cortical circuit. Dendrites enable formation of a competitive queue that exhibits a gradient of activity values for nodes encoding different objects, and this pattern is stored in working memory. In the model, brain imaging data are interpreted as a consequence of blood flow arising from dendritic processing. Computer simulations showed that the model successfully simulates data showing the involvement of inferior IPS in object individuation and spatial grouping through representation of objects’ locations in space, along with the involvement of superior IPS in object identification through representation of a set of objects’ features. The model exhibits a capacity limit due to the limited dynamic range for nodes and the operation of lateral inhibition among them. The capacity limit is fixed in the inferior IPS regardless of the objects’ complexity, due to the normalization of lateral inhibition, and variable in the superior IPS, due to the different encoding demands for simple and complex shapes. Systematic variation in the strength of self-excitation enables an understanding of the individual differences in working memory capacity. The model offers several testable predictions regarding the neural basis of visual working memory.

D. Domijan (*) Department of Psychology, Faculty of Humanities and Social Sciences, University of Rijeka, Slavka Krautzeka bb, HR-51000 Rijeka, Croatia e-mail: [email protected]

Keywords Computational model . Neural network . Dendrites . Functional neuroimaging . Parietal cortex . Working memory

Visual working memory (VWM) enables the temporary storage of relevant visual information that can be accessed and utilized after the stimulus disappears. Behaviorally, it is often probed using a change detection paradigm in which the task for participants is to decide whether two successively presented images are the same or different. The success rate in this task can be transformed into an estimate of the number of visual objects that can be stored in VWM (Pashler, 1988; Phillips, 1974). Behavioral investigations have suggested that the capacity of VWM is limited to approximately four items (Luck & Vogel, 1997). However, it is not clear whether the capacity is fixed or depends on stimulus complexity. Alvarez and Cavanagh (2004) showed that VWM capacity is larger for simple objects, such as colored squares, than for more complex objects, such as Chinese characters. Also, interindividual variability in the estimates of VWM capacity is large, ranging from 1.5 to 5 objects (Todd & Marois, 2005; Vogel & Machizawa, 2004; Vogel, McCollough, & Machizawa, 2005). Moreover, the capacity of VWM may increase through behavioral training (Jaeggi, Buschkuehl, Jonides, & Perrig, 2008; Olesen, Westerberg, & Klingberg, 2004).

Another important issue is the representational format in which items are stored. Luck and Vogel (1997) suggested that integrated objects are representational units of VWM, on the basis of their finding that the capacity limit was not influenced by the number of features that needed to be stored. However, it is also possible that features are stored separately and bound together by attention during retrieval (Wheeler & Treisman, 2002).

VWM is tightly linked to other visual
processes, such as attention and perceptual grouping (Awh & Jonides, 2001). Items that receive attentional priority through either endogenous or exogenous orienting get priority in access to the memory store (Schmidt, Vogel, Woodman, & Luck, 2002). Also, the attentional selection of part of the perceptual group facilitates memory storage of the whole group (Woodman, Vecera, & Luck, 2003). Recent functional neuroimaging studies have provided new insights into the neural structures and processes involved in VWM (Xu & Chun, 2009). Xu and Chun (2006) discovered that the inferior and superior parts of the intraparietal sulcus (IPS) contributed differently to VWM. In particular, the inferior IPS showed an increased BOLD response as a function of the number of stored objects, up to a capacity limit of about four objects. The same estimate for the capacity limit of VWM was found by Todd and Marois (2004). Neural activation in the inferior IPS was independent of the complexity of an object’s shape. On the other hand, the superior IPS showed a variable amount of activation dependent on an object’s complexity. Activity saturated at a lower number of objects for complex shapes as compared to simple shapes, consistent with the behavioral findings of Alvarez and Cavanagh (2004). On the basis of the reviewed findings, Xu and Chun (2009) proposed a “neural object file” theory that distinguished two stages of visual processing: object individuation and object identification. Each stage is mapped onto a different part of the IPS: The inferior IPS contributes to object individuation, and the superior IPS contributes to object identification. In the theory, inferior IPS is a crude spatial map providing information about object locations, but other features are not encoded. The inferior IPS exhibits fixed memory capacity and treats all objects as independent entities. 
On the other hand, superior IPS exhibits flexible memory capacity and encodes detailed featural information about an object, including its color, shape, and so forth. This area treats multiple objects with the same shape as multiple instances of the same shape.

The aim of the present study is to provide a detailed neurocomputational account of the reviewed functional neuroimaging findings by applying artificial neural networks to VWM. The model is based on previous computational work on the memory storage of temporal order, also known as the competitive queuing (CQ) neural network (Bullock, 2004; Bullock & Rhodes, 2003). The CQ network is a nonlinear recurrent network with self-excitation and lateral inhibition. Self-excitation sustains neural activity after the input is no longer present. Lateral inhibition enables a distinction to be made among the nodes that encode different objects. Each node receives a different amount of inhibition, which results in different activity magnitudes for nodes receiving input at different points in time. Depending on the network
parameters, either a primacy or recency gradient could be observed, or both (Bradski, Carpenter, & Grossberg, 1992). The primacy (or recency) gradient refers to the network output in which stronger activity is assigned to the nodes receiving input earlier (or later) in time. CQ networks have been successfully applied in modeling various aspects of serial order in behavior, such as working memory storage and motor control (Grossberg, 1978; Grossberg & Pearson, 2008; Houghton, 1990; Page & Norris, 1998). However, they have not been employed in simulating and understanding the neural basis of VWM. In the present article, the basic CQ network is extended by distinguishing two populations of inhibitory interneurons with distinct functions (Kätzel, Zemelman, Buetfering, Wölfel, & Miesenböck, 2011; Markram et al., 2004). Furthermore, every excitatory node is divided into three distinct computational subunits, consistent with recent findings about dendritic computation (Polsky, Mel, & Schiller, 2004; Spruston, 2008; Spruston & Kath, 2004; Wei et al., 2001). The interpretation of the fMRI studies is based on the assumption that the input zone of the neuron is a major determinant of the fMRI signal (Logothetis, 2002; Logothetis & Wandell, 2004). More precisely, it is based on the hypothesis that the activity in dendritic trees is a driving force for hemodynamic response (Lauritzen, 2005). This is consistent with the finding that the geometry of the dendritic tree influences the strength of the BOLD signal (Logothetis & Wandell, 2004) and with the claim that dendrites behave as independent computational units that integrate their own synaptic inputs (Häusser & Mel, 2003; London & Häusser, 2005). These properties are essential for the interpretation of the pattern of brain activations exhibited by the IPS during the maintenance of visual objects in working memory.
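The role of self-excitation in sustaining activity can be illustrated with a minimal sketch of a single shunting node, integrated with and without a self-recurrent term. The equation form, the linear feedback signal, and the parameter names (`decay`, `ceiling`, `self_gain`) are illustrative assumptions and do not reproduce the formal specification given in the Appendix.

```python
def simulate_node(self_gain, input_off_time=2.0, total_time=20.0, dt=0.01,
                  decay=1.0, ceiling=1.0):
    """Euler-integrate dx/dt = -decay*x + (ceiling - x) * (I(t) + self_gain*x).

    I(t) equals 1 until input_off_time and 0 afterwards. All parameter
    values are hypothetical and chosen only for illustration.
    """
    x = 0.0
    steps = int(round(total_time / dt))
    for step in range(steps):
        t = step * dt
        external = 1.0 if t < input_off_time else 0.0
        excitation = external + self_gain * x  # feedforward plus self-excitation
        x += dt * (-decay * x + (ceiling - x) * excitation)
    return x


# Activity long after the input has been withdrawn.
with_self_excitation = simulate_node(self_gain=4.0)
without_self_excitation = simulate_node(self_gain=0.0)
print(f"{with_self_excitation:.3f} {without_self_excitation:.3f}")
```

Under these assumptions, the node with self-excitation settles at `ceiling - decay / self_gain` once the input is removed, whereas the node without it decays back to zero. A sigmoid feedback signal would be closer to a physiological node; the linear form is used here only because its steady state can be checked by hand.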

Model description

The specification of the model is split into three parts. First, computation within a single excitatory node is described. Excitatory nodes are modeled as a hierarchy of three independent synaptic integration compartments (Häusser & Mel, 2003; Spruston & Kath, 2004). Each compartment has a different computational role and contributes differently to the generation of a BOLD signal and to the firing rate measurement. Second, within-layer interactions are described to explain how two pairs of excitatory and inhibitory nodes enable serial ordering of inputs arriving at different times using the amplitude of the node’s activity (firing rate). Excitatory–inhibitory interactions on dendrites establish asymmetry in lateral inhibition, according to which the node activated earlier in time can inhibit the node activated later, but the reverse is not
allowed. In the third part, specific computations for each cortical layer are described in order to explain how the frontal eye field (FEF) and inferior and superior IPS contribute differently to the encoding and storing of visual information in working memory. The formal specification of the model is given in the Appendix.

Computation in a single node

The cortical microcircuit for competitive queuing in VWM is based on an excitatory node model that takes into account the existence of multiple independent synaptic integration zones within a single pyramidal cell (Spruston, 2008). The model depicted in Fig. 1 distinguishes between three integration subunits: (1) distal dendritic branches, (2) the dendritic trunk, and (3) the soma. Each subunit performs thresholding and sigmoidal or linear transformation of synaptic inputs arriving within the subunit and then transmits its output to the next subunit in the hierarchy (Poirazi, Brannon, & Mel, 2003; Spruston & Kath, 2004). Experimental evidence confirms that an individual dendrite operates as an independent decision-making subunit, because it can generate dendritic spikes when stimulated by suprathreshold input (Losonczy & Magee, 2006; Polsky et al., 2004; Wei et al., 2001). Furthermore, dendritic activity is enhanced when the animal is awake, suggesting its importance for behavior (Murayama & Larkum, 2009). Another important feature of dendritic computation is that dendrites can act as coincidence detectors (London & Häusser, 2005). Coincidence detection is implemented by back-propagation of action potentials from a neuron’s soma to distal dendritic sites (Larkum, Zhu, & Sakmann, 1999, 2001). Without back-propagation of action potentials, the output from distal dendrites would be attenuated along the dendritic trunk and would not be able to reach the soma of

Fig. 1 The model of an excitatory node, with three independent synaptic integration subunits: (1) distal dendrites, (2) proximal dendritic trunk, and (3) soma. Each subunit has its own output function, denoted as f1, f2, and f3, respectively. In addition, the soma is subject to a shunting nonlinearity, which ensures that the node keeps its firing rate within the bounds [–C, B]. Distal dendrites and the dendritic trunk contribute mainly to BOLD signal generation, while the soma contributes mainly to the node’s firing rate
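The three-subunit node of Fig. 1 can be sketched as a small function that returns a firing rate and a BOLD proxy. The specific output functions, the multiplicative gating rule, the steady-state shunting expression, and all parameter values below are assumptions made for illustration; they are not taken from the Appendix.

```python
def threshold_sigmoid(u, threshold=0.5, half_saturation=0.25):
    """Assumed f1: zero below threshold, saturating above it."""
    if u <= threshold:
        return 0.0
    drive = (u - threshold) ** 2
    return drive / (half_saturation + drive)


def threshold_linear(u, threshold=0.0):
    """Assumed f2 and f3: rectified linear output."""
    return max(0.0, u - threshold)


def excitatory_node(distal_input, trunk_input, soma_input, soma_inhibition=0.0,
                    decay=1.0, upper=1.0, lower=0.2):
    """Return (firing_rate, bold_proxy) for one excitatory node."""
    distal = threshold_sigmoid(distal_input)        # Subunit 1
    trunk = threshold_linear(trunk_input + distal)  # Subunit 2
    # Dendritic output is modulatory: it scales the direct somatic drive and
    # therefore has no effect when soma_input is zero (assumed gating rule).
    excitation = soma_input * (1.0 + trunk)
    # Steady state of a shunting equation, bounded in [-lower, upper]
    # (corresponding to -C and B in the caption).
    potential = (upper * excitation - lower * soma_inhibition) / (
        decay + excitation + soma_inhibition)
    firing_rate = threshold_linear(potential)       # Subunit 3
    bold_proxy = distal + trunk                     # dendritic activity only
    return firing_rate, bold_proxy


dendrite_only = excitatory_node(2.0, 0.0, 0.0)
soma_only = excitatory_node(0.0, 0.0, 1.0)
both = excitatory_node(2.0, 0.0, 1.0)
print(dendrite_only, soma_only, both)
```

In this sketch, dendritic input alone raises the BOLD proxy without producing any firing, somatic input alone produces firing without a BOLD proxy, and the combination yields a higher firing rate than somatic input alone. Somatic inhibition lowers the firing rate while leaving the BOLD proxy unchanged.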

the neuron. Therefore, synaptic inputs arriving at the distal dendrites can have only modulatory influence on the soma of the excitatory node. In other words, the output from Subunit 1 can reach the soma only if the soma has already been activated by feedforward input arriving directly at it. Such multiplicative interactions between Subunits 1 and 3 might be a physiological basis for the modulatory effects of horizontal projections (Hupé, James, Girard, Payne, & Bullier, 2001; Kapadia, Ito, Gilbert, & Westheimer, 1995) and feedback projections (Martinez-Conde et al., 1999), as shown in the primary visual cortex. The main hypothesis advanced in the present work is that different subunits of the excitatory node make different contributions to the generation of the BOLD signal and to the firing rate measurement. In particular, the proposal is that distal dendritic branches (Subunit 1) and the dendritic trunk (Subunit 2) are the main sources for the BOLD signal, due to the increase in the concentration of dendritic calcium when a dendritic spike is generated (Losonczy & Magee, 2006; Schiller, Schiller, Stuart, & Sakmann, 1997; Schiller, Major, Koester, & Schiller, 2000). According to Lauritzen (2005), influx of calcium into a dendrite triggers the production and release of vasodilatation substances such as nitric oxide and prostaglandins. These substances increase the blood flow around the dendrite, which is captured by the BOLD signal. Increased blood flow might supply energy for excitatory neurotransmission, known to be energy demanding (Attwell & Iadecola, 2002; Bonvento, Sibson, & Pellerin, 2002). On the other hand, the soma, or a spike initiation zone in the axon (Subunit 3), is a major determinant of the node’s firing rate. Therefore, synaptic input arriving at different subunits will have different impacts on the BOLD signal and the firing rate. 
For instance, synaptic input at dendrites will be visible in the BOLD signal, but it might leave the firing rate unchanged. In a similar vein, synaptic input to the soma will influence a node’s firing rate but will not affect the BOLD signal strength. Consequently, the firing rate and the BOLD signal provide complementary information about neural events that cannot be reduced to a single physiological process (Bartels, Logothetis, & Moutoussis, 2008). In the model, inhibitory nodes are treated as single compartments without extensive dendritic branching, as is seen in excitatory nodes. Consequently, inhibitory activity will not produce dendritic calcium and in turn will not be directly seen in the BOLD signal, consistent with the finding of Waldvogel et al. (2000). Inhibitory activity will be seen indirectly in the BOLD signal through its effect on the synaptic integration of the excitatory nodes. Inhibition will have different impacts, depending on the exact location at which it targets an excitatory node. Computational analysis and experimental measurements have confirmed that the location of an inhibitory synapse relative to the

excitatory cell has a great impact on the way that excitatory and inhibitory inputs are integrated (Hao, Wang, Dan, Poo, & Zhang, 2009; Liu, 2004). Furthermore, it is known that different classes of inhibitory cells selectively innervate different subunits of the pyramidal cells (Markram et al., 2004; Somogyi, Tamas, Lujan, & Buhl, 1998). For instance, the axons of Martinotti cells avoid the soma and innervate the distal dendrites of excitatory cells (Kapfer, Glickfeld, Atallah, & Scanziani, 2007; Silberberg & Markram, 2007). In the model, inhibition of the distal dendritic branch will be contained in this subunit and will reduce the impact of the excitatory synapses that are collocated on the same dendrite. Therefore, dendritic inhibition will reduce the BOLD signal due to the reduction of dendritic activity, but it may not affect a node’s firing rate. On the other hand, inhibition at the soma will influence the firing rate of the excitatory node but have a lesser impact on the BOLD signal.

It should be noted that the previous analysis does not imply that the firing rate and the BOLD signal are always uncorrelated. Instead, it suggests that their relationship will be dictated by anatomical details. For instance, we may assume that an excitatory node possesses self-recurrent collateral activation that terminates either at the soma or at the dendritic trunk. If self-excitation terminates at the soma, it will not be visible on the BOLD signal, which will be decoupled from the firing rate (under the assumption that the soma and the dendritic trunk receive distinct feedforward synaptic inputs). However, if self-excitation terminates at the dendritic trunk, the BOLD signal and firing rate will be perfectly correlated, because the dendritic trunk will constantly receive a copy of the electrical signal from the soma.

Computation in a cortical microcircuit

The proposed cortical microcircuit for competitive queuing in VWM is depicted in Fig. 2.
Fig. 2 (a) Cortical microcircuit used to store information in visual working memory. Excitatory nodes are depicted by open circles, and inhibitory nodes by gray filled circles. Lines with T endings are dendrites. The output excitatory node, xij, receives feedforward input, Iij, and self-recurrent excitation onto its dendritic trunk. Also, it receives a feedback signal, FBij, arising from other network layers. The output excitatory node sends excitation to other columns in the network, ymn, and also contacts the soma of the input excitatory node. The input excitatory node, yij, receives recurrent input from the output excitatory nodes from other network locations, xmn. It activates lateral inhibition, zij, which contacts the soma of the output excitatory node and the dendritic trunk of the input excitatory node. The output excitatory node contacts an interneuron mediating dendritic inhibition, rij, which contacts the dendrites of the input excitatory node and the soma of the output excitatory node. (b) Relation of the model components to the lamina of the real cortical circuit. Feedforward input arrives from Layer 4 (L4) to the output excitatory nodes in Layers 2 and 3 (L2/3). The output and input excitatory nodes, together with nodes for lateral inhibition, are located in L2/3. The output excitatory node projects to the Layer 5 (L5) node, xij', and forms a recurrent connection with it. The excitatory node in L5 contacts Martinotti cells that project back to the dendrites of the input excitatory nodes in L2/3, consistent with recent anatomical data (Kapfer et al., 2007; Kätzel et al., 2011). Self-inhibition for L5 excitatory nodes might arise from a distinct population of inhibitory interneurons in L5, denoted as zij'

It consists of two excitatory and two inhibitory nodes for each cortical column (i.e., every spatial location). Excitatory nodes are designated as input and output nodes, and inhibitory nodes are designated as lateral inhibition and dendritic inhibition nodes because of their function within the column.

Excitatory nodes

Processing in the microcircuit starts with feedforward input that arrives at the output excitatory node, xij, at a spatial location {i, j}. The same node also receives feedback projections from other cortical layers on its dendrites. The output excitatory node is thus named because it is a major source of excitatory output within the cortical layer and to other cortical layers. The axon of the output excitatory node extends horizontally and contacts dendrites of the input excitatory nodes in other columns, ymn. Furthermore, self-recurrent collateral activation of the output excitatory node enables it to exhibit sustained activity when the feedforward input is withdrawn. Self-recurrent collateral activation is an anatomical substrate
for self-excitation that enables the network to store and maintain a spatial pattern in working memory. An input excitatory node, yij, receives excitation from output excitatory nodes in other columns, xmn, and from the same column, xij. Excitation between columns is transmitted through dendrites. The input excitatory node contacts the node for lateral inhibition, zij, which further contacts the output excitatory node, xij, within the same column. In this way, the input excitatory node mediates lateral inhibitory interactions between different spatial locations. It should be noted that the excitation arriving from other columns is modulatory (i.e., it can influence an input excitatory node only if its soma is active and sends back-propagating action potentials). On the other hand, excitation from the output excitatory node in the same column arrives on a dendrite that is close to the soma and has a driving effect on the input excitatory node. Consequently, lateral inhibition is mediated only among columns that receive direct feedforward input.

An important feature of the proposed connectivity scheme is that it avoids strong excitatory loops. Except for the self-excitation of the output excitatory nodes, there are no other reciprocal excitatory connections within the microcircuit; output excitatory nodes in different columns do not excite each other, and the same is true for the input excitatory nodes. In this way, the model respects the “no-strong-loop” hypothesis, which states that the real cortical circuits should avoid reciprocal excitatory connections in order to prevent dynamic instabilities (Crick & Koch, 1998). The hypothesis was originally conceived for excitatory communication between cortical layers, but it could be generalized to within-layer interactions. Extensive computer simulations with the neural circuit proposed here suggest that it achieves stable encoding of temporal order under a wide range of parameter settings.
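The absence of reciprocal excitatory connections can be checked on a small graph of the excitatory-to-excitatory connections described above. The node labels, the function names, and the choice to list only the model's output (x) and input (y) nodes are conventions assumed for this sketch.

```python
def build_excitatory_edges(n_columns):
    """Directed excitatory-to-excitatory connections of the microcircuit.

    Nodes are (kind, column) pairs, with kind 'x' for output excitatory
    nodes and 'y' for input excitatory nodes.
    """
    edges = set()
    for i in range(n_columns):
        edges.add((("x", i), ("x", i)))      # self-excitation of the output node
        for j in range(n_columns):
            edges.add((("x", i), ("y", j)))  # x excites y in its own and other columns
    return edges


def reciprocal_pairs(edges):
    """Pairs of distinct nodes that excite each other in both directions."""
    return {frozenset((a, b)) for a, b in edges if a != b and (b, a) in edges}


edges = build_excitatory_edges(4)
print(len(reciprocal_pairs(edges)))
```

With the connectivity as listed, no reciprocal pair is found. Adding a hypothetical projection from an input node back to an output node, for example `(("y", 0), ("x", 0))`, would create one.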
In order to facilitate comparison with the real cortical circuit, Fig. 2b depicts a possible location of the model’s nodes within the lamina of a canonical cortical circuit (Douglas & Martin, 2004). Feedforward input that originates from Layer 4 of visual cortex (L4) terminates on the output excitatory nodes in Layers 2 and 3 (L2/3). Computation within L4 is not explicitly modeled here. Axons of the output excitatory nodes extend horizontally within L2/3 to reach dendrites of the input excitatory nodes. Therefore, both output and input excitatory nodes should be considered as part of L2/3. As noted by Binzegger, Douglas, and Martin (2004, 2009), a prominent feature of L2/3 is its extensive recurrence; that is, many excitatory inputs to the pyramidal cells originate from within L2/3. However, Binzegger et al. were not able to discern whether or not the excitatory connections within L2/3 are reciprocal. The output excitatory node also projects to the L5 node at the same spatial location. I have not explicitly modeled

L5, but it is clear that the firing rate of its node should be proportional to the activity level of the corresponding output excitatory node, xij. Therefore, the L5 node is labeled xij'. Nodes in L5 project their output to other cortical layers. In the present model, they also serve another function, namely to convey excitation to the special group of inhibitory interneurons, rij, which will be discussed in the next section. The proposed scheme respects the facts that there are more pyramidal cells in L2/3 than in L5 and that the density of excitatory connections between L2/3 and L5 is relatively low, presumably connecting only nodes in L2/3 with nodes in L5 within the same column (Binzegger et al., 2004, 2009).

Inhibitory nodes

The proposed model distinguishes two types of inhibitory interneurons with distinct functional roles. The interneuron mediating lateral inhibition, zij, receives excitation from the input excitatory node, yij, and inhibits the output excitatory node in the same column, xij. Also, it projects to the dendritic trunk of the input excitatory node. The role of this projection is to keep the firing rate of the input excitatory node proportional to the number of activated dendrites (Murayama et al., 2009). The interneuron mediating dendritic inhibition, rij, receives excitation from the output excitatory node xij, and it delivers inhibition to the dendrites of the input excitatory node yij. In this way, the output excitatory node protects itself from lateral inhibition arising from output excitatory nodes in other columns, xmn. Such a connectivity pattern establishes an inhibitory feedback loop that modulates the transmission along the dendrites of the input excitatory node. Recently, a distinct group of inhibitory cells, known as Martinotti cells, was shown to behave in the manner described above (Kapfer et al., 2007; Silberberg & Markram, 2007).
The interneuron mediating dendritic inhibition, rij, also contacts the soma of the output excitatory node in order to deliver self-inhibition to it. Self-inhibition keeps the output excitatory node from reaching the physiological bound on its firing rate. Such an arrangement is not anatomically correct, because Martinotti cells contact only the dendrites of pyramidal cells. However, in the real cortical circuit (Fig. 2b), self-inhibition of the output excitatory nodes could be mediated by a distinct group of interneurons, denoted as z'. The functional role of the proposed inhibitory feedback loop is to impose temporal order on lateral inhibitory interactions and to implement a competitive queue. The loop enables an output excitatory node activated earlier in time to inhibit output excitatory nodes activated later, but it prevents lateral inhibition in the opposite direction. This asymmetry in lateral inhibition arises from the fact that dendritic inhibition activated earlier in a sequence will
protect its output excitatory node from future inputs arriving from other columns. On the other hand, dendritic inhibition activated later in time will not be sufficient to override the lateral inhibition arising from the output excitatory nodes already activated by previous inputs. In this way, a competitive queue will emerge, with higher firing rates for output excitatory nodes activated earlier in a temporal sequence (i.e., a primacy gradient).

With respect to the real cortical circuit (Fig. 2b), it is interesting to note that the division between lateral and dendritic inhibition closely follows the anatomical distinction between intralaminar and translaminar inhibition suggested by Kätzel et al. (2011). They analyzed patterns of inhibitory inputs to the excitatory cells in several cortical areas. The analysis revealed two distinct sets of pyramidal cells. The first set receives inhibition that originates from within the same lamina. In the model, this would correspond to lateral inhibition to the output excitatory nodes. The second set receives predominantly the inhibition originating from another lamina. Specifically, the dendrites of pyramidal cells in L2/3 receive inhibition from Martinotti cells in L5 in accordance with the proposed role for dendritic inhibition on the input excitatory nodes. Furthermore, it was shown that the dendritic inhibition of the pyramidal cell is a prominent feature found in many cortical areas, suggesting that it is a generic component of the cortical circuit (Berger, Perin, Silberberg, & Markram, 2009).

The division of labor among distinct populations of inhibitory interneurons was previously used in a spatial working memory model proposed by Wang, Tegner, Constantinidis, and Goldman-Rakic (2004). They distinguished three types of inhibitory interneurons based on the targets of their projections: soma-targeting, dendritic-targeting, and interneuron-targeting cells.
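The primacy gradient produced by asymmetric lateral inhibition, as described earlier in this section, can be illustrated with a reduced simulation. Here the asymmetry is imposed by hand (each node is inhibited only by nodes activated before it) rather than derived from the dendritic inhibition loop, and the equation form and parameter values are assumptions for illustration only.

```python
def store_sequence(n_items, self_gain=4.0, inhibition_gain=1.0, decay=1.0,
                   ceiling=1.0, dt=0.01, item_duration=2.0, delay=20.0):
    """Present n_items inputs one after another and return stored activities."""
    x = [0.0] * n_items
    total_steps = int(round((n_items * item_duration + delay) / dt))
    for step in range(total_steps):
        t = step * dt
        current = int(t // item_duration)  # index of the item being shown
        new_x = []
        for k in range(n_items):
            external = 1.0 if k == current else 0.0
            # Asymmetry imposed by hand: node k is inhibited only by nodes
            # that received their input earlier (indices j < k).
            inhibition = inhibition_gain * sum(x[:k])
            excitation = external + self_gain * x[k]
            dx = (-decay * x[k] + (ceiling - x[k]) * excitation
                  - x[k] * inhibition)
            new_x.append(x[k] + dt * dx)
        x = new_x
    return x


stored = store_sequence(4)
print([round(v, 4) for v in stored])
```

Under these assumptions, each stored activity is `1 - inhibition_gain / self_gain` times the previous one, so earlier items retain higher activity. The sketch does not include a firing threshold or output bound that would produce a capacity limit.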
The soma-targeting cell mediates lateral inhibition to pyramidal cells in nearby columns. However, the dendritic-targeting cell receives inhibitory input from the interneuron-targeting cell rather than excitation from the pyramidal cells, as in the present model. Therefore, the dendritic-targeting cells in the model of Wang et al. could not perform the same function of inhibition of the nearby pyramidal cells as the Martinotti cell does.

Computation in the neural architecture of VWM

The neurocomputational architecture for storage, maintenance, and retrieval of items in VWM is shown in Fig. 3. It consists of four networks, denoted the surface labeling network, the attention map, and inferior and superior IPS. Input is first preprocessed by the surface labeling network, which assigns different activity amplitudes (firing rates) to distinct objects present in the visual scene. The surface

Fig. 3 Interactions between different cortical layers that enable the loading of object locations into the inferior IPS and feature information into the superior IPS. First, input is preprocessed in two separate pathways. One pathway maintains the spatial representation of the input and achieves object-based shifts of attention. It projects to the inferior IPS through two intermediate stages involving the surface labeling network and the attention map. The surface labeling network segregates different objects by assigning different firing rates to them using spatial gradients. It forms a competitive queue similar to that observed in FEF cells (Buschman & Miller, 2009). The attention map selects the object with the maximal firing rate in the FEF and loads it into the inferior IPS. Also, it gates extraction of features in V4. Shifting of spatial attention from one object to another is achieved through a nonspecific reset signal, α, applied to the surface labeling network and gated by the feedback from the attention map. Gating by the attention map enables activity suppression only for the nodes currently most active in the FEF. Another pathway extracts features into a translation-invariant representation that occurs in V4. In the present model, the relevant feature dimensions are color and shape. This pathway projects from V4 to the superior IPS and allows features to be loaded into superior IPS
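The spatial pathway summarized in this caption can be sketched as a loop that labels surfaces, selects the most active one, and resets it. The precomputed object labels, the use of averaging as a stand-in for activity spreading, and the example gradient are assumptions of this sketch and not part of the model specification.

```python
def label_surfaces(object_ids, gradient):
    """Give every cell of an object the same activity: its mean gradient value.

    Averaging stands in (by assumption) for activity spreading within an
    object. object_ids uses 0 for background, positive integers for objects.
    """
    totals, counts = {}, {}
    for r, row in enumerate(object_ids):
        for c, obj in enumerate(row):
            if obj:
                totals[obj] = totals.get(obj, 0.0) + gradient[r][c]
                counts[obj] = counts.get(obj, 0) + 1
    return [[totals[obj] / counts[obj] if obj else 0.0 for obj in row]
            for row in object_ids]


def serial_selection(activity):
    """Select the most active surface, record it, reset it, and repeat."""
    activity = [row[:] for row in activity]
    loaded = []  # one binary attention map per selected object, in order
    while True:
        peak = max(max(row) for row in activity)
        if peak <= 0.0:
            break
        attention = [[1 if v == peak else 0 for v in row] for row in activity]
        loaded.append(attention)
        # Reset gated by the attention map: only attended cells are suppressed.
        activity = [[0.0 if a else v for v, a in zip(row, att)]
                    for row, att in zip(activity, attention)]
    return loaded


object_ids = [[1, 1, 0, 2],
              [0, 0, 0, 2],
              [3, 0, 0, 0]]
# Hypothetical gradient: strongest at the top-left corner of the map.
gradient = [[1.0 - 0.2 * r - 0.05 * c for c in range(4)] for r in range(3)]
attention_sequence = serial_selection(label_surfaces(object_ids, gradient))
print(len(attention_sequence))
```

Each entry of `attention_sequence` marks the locations of one object, in the order in which they would be loaded into the inferior IPS. Two objects that happened to receive exactly the same label would be selected together, a case the sketch does not try to resolve.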

labeling network forms a competitive queue that guides the deployment of spatial attention to objects. The attention network represents the current focus of attention by selecting the object with the maximal firing rate in the surface labeling network. The attention network loads the selected object representation into the inferior IPS and the superior IPS. It is assumed that the surface labeling network, the attention map, and the inferior IPS are 2-D retinotopic maps that share the same size. Feedforward

connections from the surface labeling network to the attention map and from the attention map to the inferior IPS are one-to-one (i.e., connecting only the nodes of the same spatial position). Furthermore, the attention map gates featural information (color, shape) from retinotopic maps into a translation-invariant feature representation stored in V4 and further in superior IPS. It is assumed that the surface labeling network, the inferior IPS, and the superior IPS share the same columnar and laminar design of the local microcircuit. However, due to their functional differences, each network has its own additional properties, which are described below.

Surface labeling network (frontal eye fields)

The input to the attention map and the inferior IPS arises from the surface labeling network, which is hypothesized to solve the problem of spatial binding by assigning the same activity value to all spatial locations that belong to the same object (or to a set of connected points in a spatial representation). On the other hand, spatial locations belonging to different objects will be segregated by the different firing rates of their corresponding nodes. Differentiation of firing rates is achieved by weighting the input to the network with the spatial gradient that assigns a different activity amplitude to each location in the 2-D map. The initial spatial gradient is smoothed by activity spreading. The proposed mechanism for activity spreading is based on previous computational work on the construction of a surface representation based on boundary detection and the filling in of surface qualities from the boundary to the interior of the surface. The interaction between anisotropic diffusion and its containment within the borders defined by orientation-selective responses in early visual cortex explains many properties of color and brightness perception (Cohen & Grossberg, 1984; Grossberg & Todorović, 1988).
Domijan and Šetić (2008) generalized this computational scheme to attentional spreading, where, instead of brightness, abstract attentional labels also fill in the surface representation and form the neural substrate for object-based attentional selection (Roelfsema, 2006). Neurophysiological evidence suggests that the FEF might implement the described attention shifting over the visual field representation. Visually responsive neurons in monkey FEF exhibit differential responses to the target and distractor stimuli during pop-out search (reviewed in Thompson & Bichot, 2005). A recent study showed that FEF neurons implement strategic control over serial shifts of attention (Buschman & Miller, 2009). Human functional neuroimaging studies also implicate FEF in target selection and attentional control (Corbetta et al., 1998; Donner et al., 2000). Furthermore, imaging studies have revealed sustained activation in FEF during the memory maintenance

period in VWM tasks (Pessoa, Gutierrez, Bandettini, & Ungerleider, 2002; Srimal & Curtis, 2008). It should be noted that the surface labeling network is not intended to be a full model of the FEF. This network is primarily concerned with visually responsive neurons and their role in feature binding and object-based attention. An anatomically and biophysically realistic model of FEF assigns a different functional role to each lamina of the canonical cortical circuit (Heinzle, Hepp, & Martin, 2007). That model was able to reproduce a number of electrophysiological findings about the visual-to-oculomotor transformation involved in the control of eye movements. However, it does not offer an explanation of how FEF cells can achieve object-level attentional selection (that is, selection of all spatial locations belonging to the same object) or of how attentional enhancement can spread along the whole object (Roelfsema, 2006).

Attention map

When the surface labeling network converges to its steady state, it assigns a different activity amplitude (firing rate) to each surface based on its location in the visual field. Because the activity of the output excitatory nodes is bounded above, it is easy to read out the surface with a maximal activity value into a spatial map that represents the current focus of attention. In the model, this map is denoted as an attention map, and it is hypothesized to exist in the posterior parietal cortex. The inferior IPS encodes only the spatial position of the object, so it only needs to load activity from the attention map. On the other hand, the superior IPS encodes the features of objects, irrespective of their position. The attention map helps in feature encoding because it highlights features of the currently attended object in the ventral visual pathway. In other words, an attention map could multiplicatively gate only features that are positioned at attended locations.
This is consistent with recent findings in the computational modeling of object recognition, which have shown that attention reduces interference and increases performance in object recognition in multiobject visual images (Fazl, Grossberg, & Mingolla, 2009; Walther & Koch, 2007). It should be noted that the attention map might also serve to read out information from the inferior and the superior IPS during retrieval. These networks also contain a queue with one set of nodes with maximal activity that could be loaded into the attention map to guide the binding of remembered features (Shafritz, Gore, & Marois, 2002). In order to load different objects into the attention network, it is important to remove the nodes with maximal activity from the surface labeling network. This is achieved using a nonspecific reset signal, α, which is multiplicatively gated by the signal from the attention network before it reaches the nodes in the surface labeling network. In this

way, only the FEF nodes encoding the currently selected object will be inhibited, while all other network locations will stay intact. After the reset, the surface with the second largest activity amplitude will be selected into the attention network, because self-excitation in the FEF network will enable these nodes to attain a maximal firing rate and to move one step farther in the queue. The same process continues until all surfaces in the visual field are attended and loaded into the inferior and superior IPS. Interaction between the attention map and the surface labeling network implements the inhibition of return that prevents attention from returning to previously visited spatial locations (Klein, 2000; Klein & MacInnes, 1999). After reset, the activity of the nodes encoding the previously selected object in the FEF is reduced below the threshold, so they do not participate in the competitive queue any more. Such a mechanism forces the FEF network to always search for a new surface. In this way, the visual field can be sequentially scanned during encoding in VWM in the same way as during visual search (Dodd, Castel, & Pratt, 2003). Further support for the claim that items are encoded serially in VWM comes from a behavioral study showing that more time is required to consolidate stimuli containing more objects (Vogel, Woodman, & Luck, 2006). Interestingly, it is estimated to take about 50 ms to consolidate one item into VWM, consistent with the rate of attentional deployment of 50 ms per item in visual search (Wolfe, 1998). Neurophysiological (Colby & Goldberg, 1999), neuropsychological, and functional neuroimaging (Behrmann, Geng, & Shomstein, 2004; Corbetta & Shulman, 2002) studies have suggested that the parietal cortex is involved in controlling visual spatial attention. 
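The select-and-reset cycle described above can be sketched as a simple loop. This is a discrete stand-in under stated assumptions, not the model's differential equations: the function name `serial_scan` is hypothetical, the reset is modeled as setting the selected nodes to zero, and the self-excitation step that lets the remaining surfaces climb toward the maximal firing rate is left out because only the rank order matters here.

```python
import numpy as np

def serial_scan(fef_labels, threshold=0.0):
    """Return attention masks in the order in which surfaces are visited.

    fef_labels: 2-D array in which each surface carries one distinct
    activity value and the background is 0.
    """
    fef = np.array(fef_labels, dtype=float)
    order = []
    while fef.max() > threshold:
        # Attention map: read out the surface with the maximal activity.
        attention = fef == fef.max()
        order.append(attention.copy())
        # Reset gated by the attention map: only the selected nodes are
        # silenced, so they drop out of the competitive queue.
        fef[attention] = 0.0
    return order

fef = np.array([[3, 3, 0, 0],
                [0, 0, 2, 0],
                [1, 0, 2, 0]], dtype=float)
order = serial_scan(fef)
```

Each mask in `order` marks the locations that would be loaded into the inferior IPS on that step; because reset nodes never re-enter the loop, the sketch also reproduces inhibition of return in its simplest form.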
More specifically, fMRI evidence is found in the human parietal cortex for shifting the focus of attention over retinotopic representations of visual space, consistent with the proposed operation of the attention map (Sereno, Pitzalis, & Martinez, 2001; Silver, Ress, & Heeger, 2005). The functional arrangement in which the surface labeling network influences neural activity in the IPS through the attention map is consistent with imaging data that suggest the causal role of FEF activity in attentional modulations in the IPS (Bressler, Tang, Sylvester, Shulman, & Corbetta, 2008). Furthermore, response latencies in single-unit recordings indicate that the FEF influences the IPS during top-down attentional control (Buschman & Miller, 2007). Empirical support for the existence of a global reset signal comes from the study of Buschman and Miller (2009), who found that attentional shifts in the monkey FEF cells were correlated with low-frequency oscillations in the local field potential. Furthermore, a human imaging study showed that the anatomical locus of the reset signal could be in the parietal cortex, because a distinct neural

population is transiently activated when an attentional shift is performed (Yantis et al., 2002).

Inferior IPS

In the model, the inferior IPS is a spatial map with the same dimensions as the surface labeling network and the attention map. The output excitatory nodes in the inferior IPS receive input from the attention map, which represents the locations of a currently attended object. Since the objects are already labeled, the inferior IPS does not need activity spreading, as in the surface labeling network. After the loading of all objects is achieved, the output excitatory nodes in the IPS also form a competitive queue, with the first-attended object receiving the highest activity label, the second-attended object receiving the second-highest activity label, and so on. An important point for the inferior IPS is that all locations of the object, irrespective of its size, are loaded into the inferior IPS in parallel. This poses a problem for the network, because increasing the number of nodes with the same activity label reduces the network capacity to encode multiple objects. With a larger object, the total amount of lateral inhibition increases, which results in a lower capacity to encode objects. In order to circumvent a severe capacity limit for large objects, the normalization of lateral inhibition is introduced, which enables the inferior IPS to represent a fixed number of objects irrespective of their sizes. Normalization is achieved using divisive inhibition that divides the total amount of activity arriving on the input excitatory node by the size of the activity in the attention map (Carandini & Heeger, 1994; Chance & Abbott, 2000). Consequently, the strength of lateral inhibition will be scaled with the size of the object in the attention map. In other words, the impact of the dendrites on the input excitatory node will be weaker as larger objects are encoded in the inferior IPS.
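The effect of the divisive normalization described above can be shown with a small numerical sketch. It assumes, for illustration only, that every node of an object carries the same activity label and that the inhibition an object contributes is the plain sum over its nodes; the function name `lateral_inhibition` and the map sizes are hypothetical.

```python
import numpy as np

def lateral_inhibition(object_mask, activity, normalize=True):
    """Total inhibition that one object's nodes send to the rest of the map."""
    mask = np.asarray(object_mask, dtype=bool)
    size = int(mask.sum())
    if size == 0:
        return 0.0
    total = size * float(activity)  # every node carries the same label
    if normalize:
        # Divisive inhibition: scale by the object's size in the attention map.
        total /= size
    return total

small = np.zeros((8, 8), dtype=bool)
small[0:2, 0:2] = True   # object covering 4 nodes
large = np.zeros((8, 8), dtype=bool)
large[0:4, 0:4] = True   # object covering 16 nodes

raw = (lateral_inhibition(small, 0.5, normalize=False),
       lateral_inhibition(large, 0.5, normalize=False))
normalized = (lateral_inhibition(small, 0.5),
              lateral_inhibition(large, 0.5))
```

Without normalization the larger object contributes four times more inhibition than the smaller one; with normalization both contribute the same amount, which is the property that lets the number of storable objects stay fixed regardless of their size.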
Indirect evidence for activity normalization in the inferior IPS is found in an fMRI study of memorization of simple and complex objects (Xu & Chun, 2006). The study showed that the inferior IPS exhibits a fixed capacity limit that is independent of an object’s complexity (i.e., object size). The BOLD fMRI signal reached a plateau at the same number of objects (i.e., four objects), despite the fact that complex objects occupied more space. Consistent with this finding, Zhang and Luck (2008) found that a high-resolution representation exists in working memory that cannot be flexibly altered in order to store more visual items. Both studies seem to point to the existence of a retinotopic map that can memorize the spatial layout of objects; that is, it stores all locations occupied by an object.

Superior IPS

Superior IPS stores objects’ features independent of their locations in space, in a manner similar to the operation of visual area V4. In the model, superior IPS is

divided into different subnetworks of the same cortical microcircuits, which are responsible for coding different visual features such as colors, shapes, orientations, sizes, motions, and so on. An important constraint is that lateral inhibition should be restricted to the same feature dimension. In other words, there should be no inhibition between colors and shapes, but only between colors or between shapes. By assigning different activity labels to different feature values, the proposed model is able to solve the feature binding problem (Treisman, 1996) for spatial and featural information encoded into working memory. Competitive queuing enables a unique label to be assigned in all networks encoding different features of the same object. Namely, when all of the networks encoding different features converge to the same activity queue, it is easy to distinguish which feature belongs to which object. For instance, when the activity in the network encoding colors is 20 for red and 10 for green, while activity in the shape network for the node encoding “circle” is 20 and for the node encoding “square” is 10, the model is able to deduce that we have seen a red circle and a green square. It is not necessary that the activity label in all networks should be exactly the same. As long as they are kept distinct, the network will not have a problem in distinguishing between different objects at retrieval. Retrieval is conceptualized here in the same manner as in previous CQ networks (Houghton, 1990). The nodes with maximal activity in different networks are selected together, forming a bounded representation of all of the object’s features. When attention is moved to another object, a currently attended object representation is inhibited, and the next object in the queue is retrieved. There are also other mechanisms of retrieval that do not require resetting the node’s activity, as shown by Domijan (2003). 
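The red circle and green square example above can be written out as a short sketch. It pairs features across networks by their rank in each activity queue rather than by raw value, reflecting the remark that labels need only stay distinct, not identical. The function name `bind_features` and the dictionary layout are assumptions made for illustration, and the sketch reads out the whole queue at once instead of inhibiting each retrieved object in turn.

```python
def bind_features(networks):
    """Pair features across networks by rank in each activity queue.

    networks: dict mapping a feature dimension to a dict of
    {feature value: activity label}.
    """
    ranked = {
        dim: sorted(acts, key=acts.get, reverse=True)
        for dim, acts in networks.items()
    }
    n_objects = min(len(values) for values in ranked.values())
    return [
        {dim: ranked[dim][i] for dim in ranked}
        for i in range(n_objects)
    ]

objects = bind_features({
    "color": {"red": 20, "green": 10},
    "shape": {"circle": 20, "square": 10},
})
# objects[0] describes the red circle, objects[1] the green square.
```

The same pairing is recovered when the labels differ between networks (for example, 18 and 9 in the shape network), as long as the order within each queue is preserved.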
In the present model, the superior IPS network encodes just colors and shapes and receives this information from V4. In both areas, the nodes for colors and shapes are organized in columns that do not possess reference to the spatial position from which they originate. In other words, V4 and superior IPS contain a translation-invariant representation of objects’ features. Each feature value (red, green) within a feature dimension (color) is arbitrarily assigned 20 nodes that are activated together when the feature is present at the attended location. Therefore, object features are first extracted from the retinotopic input into a translation-invariant representation in V4 and then transmitted to the superior IPS when attention is drawn to that object. This attention highlights the feature representation of the attended object and suppresses feature representations of unattended objects (Desimone & Duncan, 1995; Reynolds & Chelazzi, 2004). It is hypothesized that shape and color representations similar to those found in V4 also exist in the superior IPS,

which enables their temporary storage. The evidence for shape selectivity in the parietal cortex was obtained by Sereno and Maunsell (1998), who showed that more than 50% of cells in a monkey’s posterior parietal cortex are tuned to simple, 2-D shapes. Moreover, many of these cells are also active when the monkey is required to maintain shape information in working memory. Such selectivity is not an accidental by-product of the interaction between shape features and receptive field profiles. Rather, it constitutes a genuine contribution to the shape representation that is independent of eye movements, reaching, or object manipulation. Sereno and Maunsell concluded that shape selectivity in the parietal cortex is equivalent to that observed in the ventral visual pathway. When color becomes a relevant feature for the task, many cells in the monkey’s lateral intraparietal area were found to exhibit color selectivity (Toth & Assad, 2002). The monkeys were trained to perform eye movements based on color or location as a cue. Interestingly, color selectivity disappeared when color was no longer used as a cue, suggesting the dynamic encoding of color information in parietal areas. The anatomical substrate for color selectivity in the lateral intraparietal area is provided by input projections originating in extrastriate cortex, including V4 (Lewis & Van Essen, 2000). Furthermore, functional neuroimaging studies have shown activation of the human dorsal IPS in a color discrimination task (Beauchamp, Haxby, Jennings, & DeYoe, 1999; Claeys et al., 2004).

Results

The model’s ability to sustain a representation of multiple visual patterns using recurrent excitation and inhibition was tested in the series of computer simulations described below. In figures illustrating the spatial layout of the networks, the activity of nodes is depicted with shades of gray, where white denotes the lowest and black the highest activity value in the network. The output excitatory nodes determine the firing rate representation because their long axons would be visible in single- and multiunit recordings, consistent with known sampling bias (Towe & Harding, 1970). On the other hand, the input excitatory nodes are visible in fMRI measurements because of the extensive processing in their distal dendrites. In figures illustrating the simulated fMRI response, the rBOLD shown is calculated from Eq. 13 in the Appendix.

Basic simulation

Figure 4 illustrates the basic property of the surface labeling network. The input image (A) is augmented with a spatial gradient for which each location in the network receives

Fig. 4 Simulation of neural activity in the surface labeling network, which illustrates its ability to label different surfaces with different firing rates. This is a consequence of the spatial gradient applied on the input and of neural activity spreading among nodes encoding the same surface. (a) Input. (b) Spatial gradient. (c) Firing rates of output excitatory nodes in the surface labeling network. (d) Dendritic processing of the input excitatory nodes

activity of a different amplitude (B). Domijan and Šetić (2008) argued that such spatial biasing arises from visual field anisotropies that might explain why there is a tendency to assign figural status to surfaces in the lower visual field or with a wide bottom part and narrow top. Recurrent processing within the network results in a spatial representation in which each surface is labeled with a distinct activity amplitude or firing rate (C) due to lateral inhibition. All locations belonging to the same surface are labeled with the same activity amplitude due to the local activity spreading or filling in (Cohen & Grossberg, 1984; Grossberg & Todorović, 1988). Lateral spreading is constrained to the nodes stimulated by the input and does not extend into the background. Figure 4d shows the output of dendritic processing, as computed in Eq. 12 in the Appendix. Here, an important observation is that the amplitude of dendritic processing is strong at locations where the firing rate is low, and weak for high firing rates. Therefore, the nodes encoding surface with maximal firing rates will have the lowest dendritic activity, because there is no lateral inhibition on these nodes. On the other hand, nodes encoding the surface with the lowest firing rate will exhibit the strongest dendritic activity, because all other active nodes are able to inhibit it.

An exception is those nodes that have not received any input. They do not show any dendritic activity, because their input excitatory nodes are not active. Such behavior indicates that in a multiobject input, most of the energy consumption is devoted to the separation of representations of different surfaces. Figure 5 illustrates how the attention map loads object representation in the inferior and superior IPS. The attention map serially scans the nodes in the surface labeling network (FEF). The surface representation of an attended object enters the inferior IPS network, while feature representations enter the superior IPS. Shifts between surfaces are governed by the nonspecific reset signal that removes the representation of the attended object from the FEF network. The surface that was attended first would attain a maximal activity level (top row) in both inferior and superior IPS. The next attended surface will attain a lower activity level, because dendritic inhibition induces an asymmetry in lateral inhibition according to which the output excitatory nodes with a higher firing rate can inhibit nodes with a lower firing rate, but the reverse is not possible. The same process continues for the third surface (bottom row) and further until memory capacity is full. When too many nodes are already activated, lateral inhibition for newly recruited nodes is too

Fig. 5 Simulation of loading a surface representation in the inferior IPS and a featural representation in the superior IPS. The surface labeling network (FEF) provides input to the attention map (ATTENTION), which selects one surface whose spatial representation is loaded into the inferior IPS (iIPS), while its featural representation is loaded into the superior IPS (sIPS). Objects are attended serially from the bottom right part of the visual map because the spatial gradient for the surface labeling network is oriented in this way. In the superior IPS, separate nodes encode the color and the shape of the object. Due to the fact that all of the shapes are identical, only one set of shape nodes is activated during the whole trial. On the other hand, different color nodes are activated for different objects because I have assumed that each object is colored differently. T1, loading of the first object. T2, loading of the second object. T3, loading of the third object

strong, and they are not able to sustain activity after the input disappears. Therefore, the neuron’s limited dynamic range, along with lateral inhibition, is a source of capacity limitation within the model. Figure 6 shows the detailed temporal dynamics of attentional switching and loading of object representations into the inferior and superior IPS. Each graph depicts neural activity as a function of time for one node whose spatial location corresponds to the first (full line), second (dashed line), or third (dotted line) object in the queue, respectively. The simulation starts after the surface labeling network (FEF) converges to its steady state and forms the initial competitive queue in which each object is labeled by a distinct firing rate. The attention map selects the first object from the FEF queue and loads its representation into the output excitatory nodes of the inferior and superior IPS (Fig. 6, left column). The reset signal, which is rhythmically turned on and off, drives attention switching by inhibiting the most active output excitatory nodes in the FEF.

After the removal of the selected (attended) object, self-excitation in the FEF enables the remaining objects to move one position up in the queue. Separately, in the right column of Fig. 6, I show the dynamics of the input excitatory nodes in the FEF and inferior and superior IPS. Input excitatory nodes in the FEF reduce their activity as a consequence of the removal of the selected object after the reset. On the other hand, in the inferior and superior IPS, adding new object representations increases the total activity of the input excitatory nodes, which is reflected in the BOLD signal increase. Inhibitory nodes are not shown here because temporal evolution of their activity closely follows that of the excitatory nodes. In particular, a node for dendritic inhibition has the same temporal dynamics as the output excitatory node, while a node for lateral inhibition has the same temporal dynamics as the input excitatory node at the same spatial location. The lateral interactions mentioned above induce the primacy gradient for firing rates of the output excitatory

Fig. 6 Temporal dynamics of attentional switching, inhibition of return, and loading of object representation into the inferior and superior IPS. Neural activity (in arbitrary units) is depicted as a function of simulated time steps. Attentional jumps between objects are controlled by the attention map and the reset signal, depicted in the top row. Reactions to the attentional jumps in the surface labeling network (FEF), inferior IPS (iIPS), and superior IPS (sIPS) are shown in the 2nd, 3rd, and 4th rows, respectively. Each plot is scaled to the dynamic range of the corresponding nodes. One node is selected in each network from the sets of nodes that encode the first, second, and third objects in the queue (solid, dashed, and dotted lines, respectively). x, output excitatory nodes; y, input excitatory nodes

nodes (i.e., better encoding for objects attended earlier in a sequence). However, the proposed cortical microcircuit can also produce a recency gradient, by increasing the magnitude of the feedforward input. In the simulation reported here, the strength of the feedforward input to the inferior and the superior IPS is kept weak relative to the maximal firing rate for excitatory nodes. When the input magnitude is increased, the asymmetry in lateral inhibition becomes reversed, thus giving the advantage to the recently attended objects relative to the older ones. Increased self-inhibition also contributes to the recency gradient, because it reduces the impact of previously activated nodes on the new one. Serial attentional shifts are controlled by the attention map, which serves as a gate that serializes access to different objects during encoding. However, this gate could also be used during retrieval, when the stored object’s location and features should be integrated (bounded). The attention map could receive input from the inferior IPS and from the superior IPS in order to resolve which feature belongs to which object. This is consistent with the findings of Shafritz et al. (2002), who isolated a part of the IPS that showed increased activation for storing conjunctions of features, as compared to the storing of a single feature.

However, activity in this brain area did not show a correlation with load in the task. Increased response to the conjunction of features could be explained by the observation that input to the attention map could arrive from three sources: from two feature maps in the superior IPS and from the spatial map in the inferior IPS. On the other hand, in a singleton feature condition, only two sources of input will be active (one feature map and spatial location). If we assume that each input arrives via a separate dendrite, it is clear that a conjunction condition will produce a stronger BOLD response. On the other hand, activity will not depend on the number of objects that should be stored (and retrieved), because objects’ representations will be visited serially.

The capacity of VWM

Behavioral studies have suggested that a limited number of visual objects can be simultaneously held in VWM (Phillips, 1974). However, the precise nature of this limitation is not clear. According to Luck and Vogel (1997) and Vogel, Woodman, and Luck (2001), capacity is fixed to about four objects and does not depend on the number or

type of features that need to be remembered. On the other hand, Alvarez and Cavanagh (2004) argued that the capacity limit is a joint product of the limit in VWM and the limit imposed by the complexity of objects. A functional neuroimaging study conducted by Xu and Chun (2006) reveals a potential source of discrepancy in behavioral studies. These researchers used objects with simple or complex shapes, where complex shapes were produced by connecting two simple shapes into a single figure. Participants performed a change detection task with a variable number of objects while brain activity in the inferior and superior IPS was examined. The inferior IPS network showed an increase in activation as a function of the number of objects up to four objects, and then reached a plateau. This implies a constant storage capacity of approximately four objects, independent of the complexity of shapes. The superior IPS showed a variable storage capacity, with lower capacity for more complex shapes. A computer simulation illustrating the difference between the inferior and superior IPS is shown in Fig. 7. The input to the network was a set of letter shapes that can be seen in Fig. 9 (bottom row). Complex shapes were produced by connecting two letter shapes into one object. The number of objects presented to the inferior and superior IPS was systematically varied from two to six. Due to input normalization, the inferior IPS achieved a fixed capacity limit. Its BOLD activity rose until the number of objects reached four. Changes in the BOLD signal strength did not differentiate simple from complex objects. On the other hand, the superior IPS does not need normalization, because it is not a spatial map; it stores sets of feature values associated with objects independent from their location of origin. Lacking normalization, the superior IPS produces stronger lateral inhibition when more nodes are needed to encode features.
For instance, when simple shapes are presented, they activate only 20 nodes per object in the shape network. On the other hand, if complex shapes are presented that are composed of two simple shapes, they will activate 40 nodes (20 per simple shape) per object. Therefore, the superior IPS network will show a smaller capacity for objects with complex shapes and a larger capacity for objects with simpler shapes, consistent with the behavioral results of Alvarez and Cavanagh (2004). This analysis is valid under the assumption that only simple shapes are prelearned in the network. In other words, properties of the simple shapes are encoded into long-term memory, which is accessed when we draw attention to an appropriate object. On the other hand, complex shapes do not have dedicated nodes but are encoded as a set of simple shapes from which they are constructed. The activation of multiple nodes for simple shapes when a complex shape is presented explains the reduced capacity limit for their memorization.
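The node counts above lend themselves to a small worked example. The figure of 20 nodes per simple shape comes from the text; the `node_budget` of 80 is a hypothetical stand-in for the limit that lateral inhibition and the bounded dynamic range impose in the model, chosen here only so that the arithmetic can be followed. The model itself derives capacity from its network dynamics, not from a fixed budget.

```python
NODES_PER_SIMPLE_SHAPE = 20  # stated in the text

def shape_nodes_per_object(n_simple_parts):
    """Shape nodes recruited by one object built from n simple shapes."""
    return NODES_PER_SIMPLE_SHAPE * n_simple_parts

def capacity(n_simple_parts, node_budget):
    """Objects that fit before the hypothetical node budget is used up."""
    return node_budget // shape_nodes_per_object(n_simple_parts)

simple_nodes = shape_nodes_per_object(1)    # 20 nodes per simple object
complex_nodes = shape_nodes_per_object(2)   # 40 nodes per complex object
simple_capacity = capacity(1, node_budget=80)
complex_capacity = capacity(2, node_budget=80)
```

Under this assumed budget, doubling the number of nodes per object halves the number of objects that can be held, which is the qualitative relationship the paragraph describes for simple versus complex shapes.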

Fig. 7 Simulation of the Xu and Chun (2006) study, showing dependency of BOLD signal strength on the number of objects and their complexity. In the inferior IPS, the BOLD signal increases up to four objects and then saturates; here, object complexity does not have an influence on BOLD strength. On the other hand, in the superior IPS, BOLD signal strength increases up to four objects for simple shapes. However, it saturates at two objects for complex shapes

It should be mentioned that the observed pattern of fMRI activity is obtained for simultaneous object presentation. When objects are presented sequentially but each at a different (nonoverlapping) spatial position, the resulting fMRI activity is the same as for simultaneous presentation. However, when objects are presented sequentially at the same spatial location, interference will occur in the surface labeling network, which will not be able to distinguish different surfaces by different firing rates because different surfaces will recruit the same network nodes. Consequently, loading of object representations into the inferior and the superior IPS will be impaired. As a result, the fMRI signal

will be weak and independent of the number of objects, as observed by Xu and Chun (2006).

Individual differences in VWM capacity

Behavioral and brain imaging studies reveal great individual differences in their estimates of the capacity of VWM, ranging from only 1.5 to as many as 5 simultaneously stored objects (Todd & Marois, 2005; Vogel & Machizawa, 2004; Vogel et al., 2001). In the present model, the capacity limit is determined by several network parameters. The dynamic range of the output excitatory nodes is restricted in order to prevent unbounded activity growth due to self-excitation. As the dynamic range is increased by setting the upper bound (B in Eq. 3 of the Appendix) to a higher value, the capacity to store items in memory is also increased. Therefore, an increase in the node’s dynamic range also increases the representational capacity of the network to label different objects with distinct firing rates. Furthermore, within the same dynamic range, it is possible to increase the capacity limit by reducing the strength of lateral inhibition. Stronger lateral inhibition will push more nodes with lower activity levels below the threshold. On the other hand, weaker lateral inhibition will enable more nodes with distinct firing rates to sustain their activity in working memory. A third factor contributing to the capacity limit is the strength of self-excitation, because it opposes the action of lateral inhibition. In the simulation presented in Fig. 8, I show how memory capacity could change as a function of the strength of self-excitation. In previous simulations, I

Fig. 8 The role of self-excitation in generating individual differences in visual working memory (VWM) capacity. The BOLD signal strength is shown as a function of the number of objects to be stored in the superior IPS. As the strength of self-excitation, wE, increases, BOLD signal strength increases, and VWM capacity increases from two to six simultaneously stored objects

treated the synaptic weights for self-excitation, wE, as a fixed parameter, but in the present simulation it varied in the superior IPS from wEsup = 0.5 to wEsup = 2. Low levels of self-excitation (wEsup = 0.5) lead to poorer performance and lower capacity estimates. In this case, the activity of all nodes in the network is lowered, and only two objects can be maintained in VWM. On the other hand, strong self-excitation produces an increase in the capacity. For instance, when wEsup = 2, even six objects could be simultaneously stored. As can be seen from Fig. 8, increased memory capacity is accompanied by an increased BOLD signal, which is consistent with a pattern of increase in working memory capacity as a function of age in childhood (Klingberg, Forssberg, & Westerberg, 2002) and of the amount of behavioral training in adults (Olesen et al., 2004). Vogel and Machizawa (2004) and Vogel et al. (2005) pointed to attentional filtering as a source of individual differences in VWM capacity. They found that participants with a higher VWM capacity are better at excluding irrelevant information (distractors) from working memory. In a similar vein, Olesen, Macoveanu, Tegnér, and Klingberg (2007) showed that children are more susceptible to distraction than are adults, which might account for their poorer performance on working memory tasks. A filtering account is consistent with the present model, because the attention map can select multiple objects simultaneously if they are not properly labeled with different firing rates in the surface labeling network. This might happen if the spatial gradient failed to interact with the input to the labeling network. A simultaneous load of multiple irrelevant objects could overload the inferior and superior IPS and prevent the encoding of relevant objects that are attended to later in the encoding phase.
Strengthening of self-excitation might also contribute to the explanation of how behavioral training results in better working memory performance (Jaeggi et al., 2008; Olesen et al., 2004). It is possible that repeated exposure to tasks that require maintenance of visual objects in memory could produce a long-lasting increase in the strength of self-excitation, which would result in an increase in working memory capacity and a consequent increase in BOLD signal strength. The same mechanism could be responsible for changes in capacity and brain activity during childhood (Edin, Macoveanu, Olesen, Tegnér, & Klingberg, 2007). However, it should be noted that feedback projections from the prefrontal cortex might also contribute to increases in VWM capacity. In the present model, it is proposed that feedback projections contact the dendrites of the output excitatory nodes (Fig. 2). Such contacts might be mediated through NMDA synapses, which could increase the strength of feedback projections as a result of a repeated coactivation of the parietal and the prefrontal cortex.

Stronger synaptic contacts will presumably produce biochemical changes that will also increase dendritic weights. Stronger dendritic weights will result in an overall increase of the firing rate of the output excitatory nodes. Consequently, the working memory capacity of the whole network will increase in the same way as for self-excitation. A similar idea was put forward by Edin et al. (2009), who identified dorsolateral prefrontal cortex as a source of top-down modulatory signals to the parietal cortex.

Features and objects

There are two ways in which objects could be stored in VWM. One possibility is that integrated objects are stored, including all of their features. A hypothesis about integrated encoding arises from behavioral research that has indicated that the capacity limit of VWM remains the same, irrespective of the number of features that should be remembered (Luck & Vogel, 1997; Vogel et al., 2001). Another possibility is that features are independently stored and accessed as needed (Wheeler & Treisman, 2002).

Fig. 9 Simulation of Xu’s (2009) study, showing different processing of a single object (top row), four identical objects (middle row), and four different objects (bottom row) in the inferior and superior IPS. In the inferior IPS, four different objects evoke a response similar to that for four identical objects. In the superior IPS, representation of one shape and one color is activated when four identical objects are presented. When four different objects are presented, the set of nodes for four different shapes and for one color are activated, because all objects are in a single color. iIPS, inferior IPS; sIPS, superior IPS; FR, firing rate; DP, dendritic processing

Xu (2009) revealed an important difference between inferior and superior IPS when storing multiple objects in VWM. She compared IPS activation in three conditions, which featured presentation of (1) a single object, (2) multiple objects with identical shapes, and (3) multiple objects with different shapes. The lowest activation was observed in the single-object condition in both inferior and superior IPS. However, multiple instances of the same shape produced the same level of activation as input with different shapes in the inferior IPS, indicating the importance of location coding and not identity coding in this area. On the other hand, the superior IPS showed lower activation when four identical shapes were presented, as compared with four different shapes. This implies identity coding without reference to locations. A computer simulation illustrating the model’s response to stimulus patterns similar to those used in Xu (2009) is shown in Fig. 9. In order to better appreciate the difference between the firing rates of the output excitatory nodes (FR) and dendritic processing in the input and output excitatory nodes (DP), they are depicted in parallel. A single shape produces low DPs in both IPS areas due to the lack of


lateral inhibition among nodes encoding the same object. When four identical shapes are presented, a pronounced DP signal is observed in the inferior IPS, because each object receives a unique activity label. In order to keep the representations of different shapes distinct, the inferior IPS must utilize a lot of energy for lateral inhibition, mediated by the input excitatory nodes. In the superior IPS, neural activation is the same as when the single shape is presented. In other words, superior IPS treats all presented objects as instances of the same shape, whose representation is activated only once. There is no need for segregation, and the total energy consumption is low. When four different shapes are presented, the inferior IPS assigns different activity labels, as in the previous case, resulting in the same DP strength. On the other hand, in the superior IPS, nodes encoding different shapes are activated, forming an activation queue with different firing rates for different shapes. In both cases, only one set of color nodes is activated, due to the fact that all objects are black. The percentages of change in BOLD signal strength in all conditions for inferior and superior IPS are summarized in Fig. 10. It is interesting to note that the BOLD signals in the inferior IPS are not exactly the same for four different and four identical objects. A subtle difference arises from the fact that the objects are not of the same size. The squares used in the identical-object condition occupied fewer pixels than the letters used in the different-object condition. The consequence is a slightly smaller DP in the former condition. A functional neuroimaging study by Xu (2007) further examined the issue of the encoding of integrated objects or of sets of features. 
In Experiment 3, she compared situations in which the number of objects was kept constant while the number of features was varied, so that participants encoded either two features (color and shape) or just one (shape) while the color remained the same for both shapes.

Fig. 10 Simulated BOLD response summarizing the response of the inferior and superior IPS to the presentation of a single object, four identical objects, and four different objects. In the inferior IPS, there is little difference between four different and four identical objects. The superior IPS response, however, is much lower to the four identical objects than to the four different objects


On the other hand, in Experiment 4, she varied the number of objects while the number of features was kept constant. This was achieved by connecting two shapes to form a single object or presenting the two shapes with spatial separation. While neural activity in the inferior IPS tracked the number of objects regardless of the number of features, the superior IPS showed an increased response for the greater number of features, regardless of the number of objects (Xu, 2007). The computer simulation presented in Fig. 11 reproduces this pattern of results. When the number of features is varied for the same number of objects, inferior IPS shows no difference in activation. On the other hand, superior IPS shows increased activity when the shape and color are remembered, as compared to when only the shape is relevant. When two different shapes are presented with two different colors and spatial separation, the inferior IPS treats the two shapes as two distinct objects that are separated by different firing rates. This is a consequence of lateral inhibition, which increases the BOLD signal. In

Fig. 11 Simulation of Xu’s (2007) study, showing different BOLD responses as the number of objects and the number of features are systematically varied. (a) When the number of features is changed while the number of objects is held constant, activity in the inferior IPS does not change, while activity in the superior IPS tracks the number of features. (b) When the number of objects is different but the number of features is the same, activity in the inferior IPS tracks the number of objects, while activity in the superior IPS remains unchanged


the superior IPS, two feature values on two feature dimensions are encoded as belonging to different objects, which also implies a difference in firing rates, and consequently a strong BOLD signal. However, when the two shapes are joined to form a single object, the activity in inferior IPS is reduced, because activity spreads along the whole surface, as described for connected and disconnected shapes. In other words, there is no lateral inhibition, and the BOLD signal is produced solely from self-excitation. On the other hand, in the superior IPS, two feature values (for two colors and two shapes) are still kept segregated, because feature values are treated as independent sources of information that should be distinguished, despite the fact that they share the same object.

Discussion

Previous computational models of the cortical activation related to working memory have concentrated on prefrontal cortex and on explaining the difference in the activity patterns obtained in monkey neurophysiological and human functional neuroimaging studies (Deco, Rolls, & Horwitz, 2004). The models of Tagamets and Horwitz (1998) and Arbib and colleagues (Arbib, Billard, Iacoboni, & Oztop, 2000; Arbib, Bischoff, Fagg, & Grafton, 1995) were based on the assumption that the PET signal predominantly reflects synaptic events and not the firing rate of prefrontal cells. This property helps explain why a PET signal might be strong in a situation that produces only weak firing rates as measured by single-unit recordings. However, more recent imaging studies have indicated that the parietal cortex also plays an important role in working memory (Curtis & D’Esposito, 2003; Xu & Chun, 2009). The aim of the present model is to interpret fMRI activation arising in the parietal cortex related to VWM. It shares with these earlier models the assumption that the locus of the BOLD signal is in the input zone of the neurons (Logothetis, 2002; Logothetis & Wandell, 2004). Here, I extend this assumption by focusing on dendrites as a major site for cortical computation (Häusser & Mel, 2003; London & Häusser, 2005) and for producing the fMRI BOLD signal (Lauritzen, 2005). Furthermore, Tagamets and Horwitz (1998) and Horwitz, Tagamets, and McIntosh (1999) suggested that input to the target area arises from other brain regions, implicating the role of brain imaging in discovering functional connectivity among cortical areas. In the present model, the origin of the input is predominantly intra-areal, suggesting that brain activation detected by the BOLD signal reveals recurrent computation in a specific cortical area. The proposed model suggests that multiple objects are distinguished in working memory using different firing rates (primacy gradient) to encode each of them. In this


way, the distributed representation of the same object in the spatial and feature maps is bound together by the same firing rate. Indirect evidence for the firing rate code comes from Averbeck, Chafee, Crowe, and Georgopoulos (2002), who showed that the prefrontal cortex uses the primacy gradient to represent the order of movements in a sequence during the copying of geometrical shapes. Furthermore, Buschman and Miller (2009) found evidence of a similar code in the FEF cells that is used to control serial covert shifts of attention. However, direct empirical evidence points to the phase synchronization of the oscillatory neural activity as a code for storing items in working memory (Lee, Simpson, Logothetis & Rainer, 2005; Siegel, Warden, & Miller, 2009). The present model does not have the capability for phase synchronization, but the recent discovery of rate-specific synchrony offers an interesting possibility to reconcile competitive queuing with phase synchronization. Markowitz, Collman, Brody, Hopfield, and Tank (2008) investigated how noisy oscillatory input interacts with the overall activation level of a neuron and found that the amount of synchrony between two neurons strongly depends on their mean firing rates. Similar firing rates resulted in synchronization, while different firing rates resulted in desynchronization of action potential timing in the two neurons. The rate specificity suggests that synchronous activation could be used to detect neurons with similar firing rates. This theory offers support for a general computational strategy that could be used for pattern recognition and synaptic plasticity (Brody & Hopfield, 2003; Hopfield & Brody, 2001). The CQ model proposed here fits within this strategy, because it offers an explanation of how similar firing rates arise through network interactions. 
Recently, a neural model of spatial planning was proposed (Ivey, Bullock, & Grossberg, 2011) that shares similar computational principles with the model described here. The Ivey et al. model explains the formation of a cognitive map during navigation through the environment. In this model, the spatial gradient guides the formation of a trajectory that highlights the shortest path from the current position to a goal position that avoids obstacles on the way. The planned trajectory is stored in working memory using a primacy gradient. Finally, inhibition of return is used to retrieve the trajectory from working memory during the execution of the plan. Taken together with the model proposed here, it illustrates how different areas of the cortex might utilize similar computational mechanisms in order to achieve different behavioral goals. It should be noted that the model proposed here rests on several assumptions not supported by the anatomical and physiological data and not shared by other biophysically detailed computational models of visuospatial working memory (Durstewitz, Seamans, & Sejnowski, 2000; Wang


et al., 2004). In particular, the present model uses a homogeneous spread of lateral inhibition, where the distance between cells is not taken into account. On the other hand, a prominent feature of real cortical networks is a distance-dependent fall in the strength of synaptic interactions (Kang, Shelley, & Sompolinsky, 2003). The model could accommodate this fact by observing that the strength of dendritic inhibition could also show a distance-dependent strength reduction. If lateral inhibition and dendritic inhibition were matched in strength, each dendrite would still compute the difference in activity amplitude, despite the difference in the synaptic weights among dendrites. Furthermore, anatomical data suggest that excitatory cells outnumber inhibitory interneurons by a factor of 4 (Braitenberg & Schuz, 1991), while in the model the numbers of excitatory and inhibitory nodes are equal. Nevertheless, it should be noted that the model’s dendritic inhibition need not establish an input–output relation to a single excitatory cell, but rather to a population of excitatory cells in the neighborhood. In this way, spatial resolution might be reduced but the functional properties of the model would still be preserved. Finally, the activity of the model’s nodes is based on instantaneous firing rates and does not take into account the temporal dynamics of spike generation, propagation, and postsynaptic integration, and further research is needed to examine whether the model’s properties will generalize to biophysically realistic network dynamics (Durstewitz et al., 2000; Wang et al., 2004). The model makes several testable predictions. At the neurophysiological level, it suggests that the storage of multiple objects in VWM should be accompanied by increased dendritic activity and increased activity of a specific subtype of inhibitory interneurons that primarily contact the dendrites of pyramidal cells.
Also, the model predicts that the firing rates of neurons encoding different objects in parietal cortex should follow the same pattern observed in prefrontal cortex during maintenance and retrieval of serial movements (Averbeck et al., 2002). At the behavioral level, the model predicts that the capacity of VWM could be improved through extensive experience with particular types of tasks. People who spend a lot of time practicing spatial tasks will develop stronger self-excitation in the inferior IPS. This should be reflected in an elevated BOLD response in the inferior IPS, because more objects could be represented together, which would utilize more lateral inhibition. On the other hand, people who are experts in categorizing specific types of visual objects (e.g., bird watchers) will show reduced BOLD activity in superior IPS, because they form more elaborated representations of shapes. In other words, their superior IPS will require less featural information to be active in VWM to encode complex objects.

Author note I thank anonymous reviewers and Mia Šetić for helpful comments that improved the manuscript. Also, I thank Sanja Pehnec and Sarah Czerny for help in the manuscript preparation. This work was supported by Croatian Science Foundation Installation Grant 02.05/06 and Croatian Ministry of Science, Education and Sport Grant 009-0362214-0818.

Appendix

Computation in a single node

Each excitatory node is modeled as a hierarchy of three independent computational subunits: (1) distal dendrites, (2) dendritic trunk, and (3) the soma (see Fig. 1). Each subunit has its own output function. The output from the distal dendritic branches, $f_1(\cdot)$, is defined as a choice (binary) function of the form

$$f_1(a) = \begin{cases} 0 & \text{if } a \le 0 \\ 1 & \text{if } a > 0. \end{cases} \tag{1}$$

Strong nonlinearity in the output of Subunit 1 is justified by experimental data showing the all-or-none behavior of distal dendrites (Polsky et al., 2004; Wei et al., 2001). The functions $f_2(\cdot)$ and $f_3(\cdot)$, defined as

$$f_2(a) = f_3(a) = \max(a, 0), \tag{2}$$

describe the linear, above-threshold output of the dendritic trunk (Subunit 2) and the soma (Subunit 3), respectively. Linear summation of the output of distal dendrites has been justified by theoretical (Poirazi et al., 2003) and experimental (Polsky et al., 2004) work. Each subunit may receive different combinations of excitatory and inhibitory inputs, but the important point is that the subunit must integrate synaptic inputs arriving on its own surfaces with the output of the subunit that immediately precedes it in the hierarchy. In other words, the dendritic trunk integrates its own synaptic input with the output arriving from the distal dendritic branches, and the soma integrates its own synaptic inputs with the output arriving from the dendritic trunk. It is assumed that Subunits 2 and 3 react instantaneously to their respective inputs, so their temporal dynamics are not explicitly represented. The time evolution of the activity (firing rate) of the soma of the excitatory nodes, $u^L_{ij}$, is modeled using a nonlinear shunting mechanism (Grossberg, 1988), given by an equation of the form

$$\tau_u \frac{du^L_{ij}}{dt} = -A^L u^L_{ij} + \left(B^L - u^L_{ij}\right) E^{LU}_{ij} - \left(C^L + u^L_{ij}\right) H^{LU}_{ij}. \tag{3}$$

In Eq. 3, $\tau_u$ is an integration time constant, and $u$ stands for the output excitatory node, $x$, or the input excitatory node, $y$. The indices $\{i, j\}$ denote the spatial location of the node in the 2-D network with dimensions $M \times N$; $L$ is a label for a network layer, with three possible values (surf, inf, sup) corresponding to the surface labeling network, the inferior IPS, and the superior IPS, respectively; $A^L$ is a decay parameter controlling the speed of return to the baseline firing rate; and $B^L$ and $C^L$ denote the upper and lower bounds for the firing rate. The terms $E^{LU}_{ij}$ and $H^{LU}_{ij}$ describe the total excitatory and inhibitory inputs to the soma. The superscript $U$ in the term for somatic input indicates that the output and the input excitatory nodes receive the distinct excitatory and inhibitory synaptic inputs described below. The outputs from the soma of the excitatory nodes, $X^L_{ij}$ and $Y^L_{ij}$, are given by

$$X^L_{ij} = f_3\left(x^L_{ij} - T_{X3}\right) \quad \text{and} \quad Y^L_{ij} = f_3\left(y^L_{ij} - T_{Y3}\right), \tag{4}$$

where $T_{X3}$ and $T_{Y3}$ are somatic thresholds for the output and the input excitatory nodes, respectively. Inhibitory interneurons are treated as single electrical compartments. Therefore, their output is given by the somatic output function, $f_3$. The time evolution of the firing rate of inhibitory nodes, $v^L_{ij}$, is modeled using an additive equation of the form

$$\tau_v \frac{dv^L_{ij}}{dt} = -v^L_{ij} + E^{LV}_{ij}. \tag{5}$$

In Eq. 5, $\tau_v$ is an integration time constant, and $v$ stands for the node mediating lateral inhibition, $z$, or the node mediating dendritic inhibition, $r$. Excitatory input $E^{LV}_{ij}$ to the inhibitory nodes is not subject to the shunting nonlinearity. In the model, inhibitory nodes do not receive any inhibitory input on their soma. The somatic outputs from the inhibitory nodes, $Z^L_{ij}$ and $R^L_{ij}$, are given by

$$Z^L_{ij} = f_3\left(z^L_{ij}\right) \quad \text{and} \quad R^L_{ij} = f_3\left(r^L_{ij}\right). \tag{6}$$

Computation in a cortical microcircuit

All cortical layers in the model are composed of elementary processing units (cortical columns). Each column contains two excitatory nodes and two inhibitory nodes, as shown in Fig. 2a. Each node serves a different function within the cortical circuit and receives the specific combination of excitatory and inhibitory synaptic inputs described below.

Excitatory nodes

Two types of excitatory nodes are distinguished within the cortical column. They are labeled as the output and the input excitatory nodes with respect to their roles in the column. The output excitatory node transmits its output to other columns within the same cortical layer and to other cortical layers. Consequently, the output excitatory node is a major determinant of the firing rate arising from the column. On the other hand, the input excitatory node receives excitation from other columns and projects its output only locally, within the same column.

The output excitatory node

The output excitatory node, $x^L_{ij}$, mediates lateral interactions within the circuit because its long horizontal projections contact dendrites of the input excitatory nodes in all other columns. Excitatory input $E^{LX}_{ij}$ to the output excitatory node $x^L_{ij}$ arrives on the node’s dendritic trunk, whose activity level is given by

$$E^{LX}_{ij} = f_2\left[ I^L_{ij}(t) + w^L_E X^L_{ij} + f_0\left(x^L_{ij}\right) w^L_X f_1\left(FB^{L+1}_{ij} - T_{X1}\right) - T_{X2} \right]. \tag{7}$$

Synaptic input consists of a feedforward input, $I^L_{ij}(t)$, which is transiently turned on for some time interval from $t = t_1$ to $t = t_2$, and self-excitation, $X^L_{ij}$, arriving from the soma. The strength of the self-excitation is modulated by the synaptic weight $w^L_E$. Each cortical layer (i.e., the surface labeling network and the inferior and superior IPS) receives a different form of feedforward input, described in detail in Section 3 of this Appendix. Also, the dendritic trunk of the output excitatory node receives output from its distal dendritic branches, $f_1(\cdot)$, where feedback, $FB^{L+1}_{ij}$, from another (unspecified) cortical layer, $L + 1$, arrives. Output from distal dendrites is multiplicatively gated by the output of the soma: $f_0(a) = 1$ if $a > 0$, and $f_0(a) = 0$ if $a \le 0$. This type of somatic output is back-propagated to the distal dendrites through the dendritic trunk, and it should not be confused with the somatic output to the axon, which is given by the output function $f_3(\cdot)$. Such interaction assures that feedback can influence only those output excitatory nodes that are already stimulated by the feedforward input. In the present model, feedback is not specified, so $FB^{L+1}_{ij} = 0$ in all simulations. It is included in the model description for completeness. The impact of distal dendrites on Subunit 2 (the trunk) is scaled by the dendritic weight, $w^L_X$. Each subunit of the excitatory node has its own threshold, so $T_{X1}$ and $T_{X2}$ stand for the thresholds of the distal dendrites and the dendritic trunk, respectively. The inhibitory input, $H^{LX}_{ij}$, to the output excitatory node $x^L_{ij}$ arrives at the soma, and it is given by

$$H^{LX}_{ij} = w^L_Z Z^L_{ij} + w^L_R R^L_{ij}. \tag{8}$$
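Eqs. 7 and 8 can be sketched for a single column as follows. The function names and the numerical values of the weights and thresholds are hypothetical, chosen only to show how the gate f0 restricts the influence of feedback to output excitatory nodes that are already active. In the simulations reported here the feedback term is zero, so the nonzero feedback below is purely illustrative.

```python
def f0(a):
    """Somatic gate back-propagated to the distal dendrites."""
    return 1.0 if a > 0 else 0.0


def f1(a):
    """Eq. 1: binary output of the distal dendrites."""
    return 1.0 if a > 0 else 0.0


def f2(a):
    """Eq. 2: linear, above-threshold output of the dendritic trunk."""
    return max(a, 0.0)


def trunk_excitation_output_node(I, X, x, FB,
                                 w_E=1.0, w_X=0.5, T_X1=0.0, T_X2=0.0):
    """Eq. 7 for one output excitatory node.

    I  : feedforward input
    X  : somatic output of the node (self-excitation)
    x  : somatic activity of the node (opens or closes the gate f0)
    FB : feedback from another layer (zero in the reported simulations)
    The weights and thresholds are hypothetical illustration values.
    """
    gated_feedback = f0(x) * w_X * f1(FB - T_X1)
    return f2(I + w_E * X + gated_feedback - T_X2)


def somatic_inhibition_output_node(Z, R, w_Z=1.0, w_R=1.0):
    """Eq. 8: lateral (Z) plus dendritic (R) inhibition on the soma."""
    return w_Z * Z + w_R * R


if __name__ == "__main__":
    # A silent node ignores feedback; an active node is boosted by it.
    print(trunk_excitation_output_node(I=0.0, X=0.0, x=0.0, FB=1.0))
    print(trunk_excitation_output_node(I=0.0, X=0.4, x=0.4, FB=1.0))
```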

Inhibition in Eq. 8 arises from the outputs of two classes of interneurons: $Z^L_{ij}$, mediating lateral inhibition, and $R^L_{ij}$, mediating dendritic inhibition, which also delivers self-inhibition to the output excitatory node. Inhibition is weighted by the synaptic strengths, $w^L_Z$ for lateral inhibition and $w^L_R$ for dendritic inhibition. Inhibitory weights do not have the spatial indices $\{i, j\}$ because they represent only synaptic input that arrives from interneurons in the same column where the output excitatory node is located.

The input excitatory node

The input excitatory node receives excitation from the horizontal projections from the output excitatory nodes. The excitation, $E^{LY}_{ij}$, to the input excitatory node $y^L_{ij}$ arrives from its dendritic trunk, whose output is described with

$$E^{LY}_{ij} = f_2\left\{ \begin{aligned} & w^L_Y f_1\left[X^L_{ij} - T_{Y1}\right] + \dots \\ & f_0\left(y^L_{ij}\right) w^L_Y \sum_{m=1}^{M} \sum_{n=1}^{N} f_1\left[X^L_{mn} - R^L_{ij} - T_{Y1}\right] - \dots \\ & Z^L_{ij} - T_{Y2} \end{aligned} \right\}. \tag{9}$$
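To make the structure of Eq. 9 concrete, the following sketch evaluates the trunk excitation of one input excitatory node from a flattened list of somatic outputs of the output excitatory nodes. It assumes, for illustration only, that the interneuron for dendritic inhibition has settled to the output of its own column's output excitatory node (so that R equals X at that location) and that all thresholds are zero. The function name, the flat indexing, and the parameter values are hypothetical. Under these assumptions each distal dendrite is active only when another node's output exceeds the node's own output, so the gated sum counts the more active competitors.

```python
def f0(a):
    """Somatic gate back-propagated to the distal dendrites."""
    return 1.0 if a > 0 else 0.0


def f1(a):
    """Eq. 1: binary output of a distal dendrite."""
    return 1.0 if a > 0 else 0.0


def f2(a):
    """Eq. 2: linear, above-threshold output of the dendritic trunk."""
    return max(a, 0.0)


def trunk_excitation_input_node(k, X, y, Z, w_Y=1.0, T_Y1=0.0, T_Y2=0.0):
    """Eq. 9 for the input excitatory node at flattened location k.

    X : list with the somatic outputs of all output excitatory nodes
    y : somatic activity of the input excitatory node at k (gate f0)
    Z : output of the lateral-inhibition interneuron in the same column
    """
    # Assumption: dendritic inhibition has converged to the local output.
    R = X[k]
    # First line of Eq. 9: proximal dendrite driven by the same column.
    proximal = w_Y * f1(X[k] - T_Y1)
    # Second line: one distal dendrite per other location, excluding k.
    distal = sum(f1(X[m] - R - T_Y1) for m in range(len(X)) if m != k)
    # Third line: lateral inhibition and the trunk threshold.
    return f2(proximal + f0(y) * w_Y * distal - Z - T_Y2)


if __name__ == "__main__":
    X = [0.9, 0.6, 0.3, 0.0]  # hypothetical graded outputs
    for k in range(len(X)):
        y = 1.0 if X[k] > 0 else 0.0  # hypothetical somatic activity
        print(k, trunk_excitation_input_node(k, X, y=y, Z=0.0))
```

In this toy example the most active column has no active distal dendrites, whereas weaker columns accumulate one active dendrite for every stronger competitor.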

In Eq. 9, the output from the output excitatory node at the same spatial location, $X^L_{ij}$, arrives at the dendrite near the soma [first line inside function $f_2(\cdot)$ in Eq. 9]. Because of the close distance to the soma, this dendrite can directly stimulate the soma and can deliver feedforward input to the input excitatory node. On the other hand, the output from the distal dendrites multiplicatively interacts with the activity of the soma of the input excitatory node $y^L_{ij}$ [second line inside function $f_2(\cdot)$ in Eq. 9]. Each distal dendrite receives output from one output excitatory node, $X^L_{mn}$, from some other network location $\{m, n\}$, and inhibitory output from the interneuron mediating dendritic inhibition, $R^L_{ij}$. Excitatory and inhibitory synaptic inputs to the distal dendritic branches are assumed to be of unit strength, so their synaptic weights are not explicitly represented. This assumption is necessary in order to achieve a homogeneous object representation, but it departs from the standard model of lateral inhibition in cortical circuits, which assumes distance-dependent interactions (Kang et al., 2003). The impact of dendritic branches on the trunk is scaled by the dendritic weight, $w^L_Y$. It is assumed that all dendritic weights are homogeneous in strength, so they do not have spatial indices $\{m, n\}$. The sum is taken over all network locations $\{m, n\}$ except the node’s own location $\{i, j\}$. The terms $T_{Y1}$ and $T_{Y2}$ are thresholds for the distal dendrites and dendritic trunk, respectively. In Eq. 9, the output from Subunit 1 (distal dendrites), $f_1(\cdot)$, is multiplicatively gated by the activity of the soma of the input excitatory node, $f_0(\cdot)$, where $f_0(a) = 1$ if $a > 0$ and $f_0(a) = 0$ if $a \le 0$. In this way, distal dendrites implement coincidence detection similar to that described for the output excitatory node in Eq. 7. The effect of the multiplicative gating is that the recurrent excitation can have only a modulatory effect on the target excitatory node. Detailed justification for this multiplication in dendritic computation and for the special output function $f_0(\cdot)$ is provided in the Computation in a Single Node section of the model description. The dendritic trunk of the input excitatory node also receives inhibition from the node mediating lateral inhibition, $Z^L_{ij}$ [third line inside function $f_2(\cdot)$ in Eq. 9]. This type of inhibition increases the dynamic range of the excitatory node, because it forces the node’s firing rate to stay proportional to the input arriving from Subunit 1, consistent with the recent study of Murayama et al. (2009). In other words, it counteracts the effects of shunting nonlinearity and enables a faithful transmission of information about how many distal dendrites are activated (up to the saturation point, $B^L$). The input excitatory node does not receive any inhibition on its soma, so $H^{LY}_{ij} = 0$ in Eq. 3.

Inhibitory nodes

Inhibitory interneurons only receive excitation on their somas. The excitatory input, $E^{LZ}_{ij}$, to the inhibitory interneuron for lateral inhibition, $z^L_{ij}$, is given by

$$E^{LZ}_{ij} = Y^L_{ij}. \tag{10}$$

This node simply receives the activation from the input excitatory node $Y^L_{ij}$, and it inhibits the output excitatory node. The excitatory input, $E^{LR}_{ij}$, to the inhibitory node for dendritic inhibition, $r^L_{ij}$, is defined as

$$E^{LR}_{ij} = X^L_{ij}. \tag{11}$$

It receives excitatory output from the output excitatory node $X^L_{ij}$, and it inhibits the dendrites of the input excitatory node in the same column.

Computing the BOLD signal

The simulated BOLD signal for cortical layer $L$ is computed by the total activity of all output excitatory nodes, $X^L_{ij}$, and all input excitatory nodes, $Y^L_{ij}$, as described by

$$\mathrm{BOLD}^L = S^L \left( \sum_{ij} X^L_{ij} + 2 \sum_{ij} Y^L_{ij} \right) + b^L. \tag{12}$$

In Eq. 12, $S^L$ represents the scaling factor for the cortical layer $L$ that couples electrical activity with hemodynamic response, and $b^L$ is a baseline signal that does not depend on dendritic processing. In physiological terms, $S^L$ could be interpreted as hemodynamic response efficiency (Logothetis & Wandell, 2004). The simulated BOLD signal is estimated directly from the firing rates of the output and input excitatory nodes, because their firing rates correlate with the output arriving from their dendrites. After convergence to the steady state, the firing rate of the output excitatory node, $X^L_{ij}$, will be proportional to the excitation arising from its dendritic trunk, $E^{LX}_{ij}$. The reason for this is that feedback projections are not explicitly specified in the simulations. Furthermore, the feedforward input is only transiently active during encoding, whereas it is set to zero afterward. Consequently, Eq. 7 implies that the dendritic activity of the output excitatory node, $E^{LX}_{ij}$, equals the node’s self-excitation, $X^L_{ij}$. The firing rate of the input excitatory node, $Y^L_{ij}$, is proportional to the number of active dendrites in its Subunit 1 transmitted through Subunit 2, whose output is given by $E^{LY}_{ij}$. The input excitatory node does not receive any excitatory or inhibitory synaptic input on its soma, so all of its input is given by its dendrites. Dendritic activity on the input excitatory node is weighted by 2 in order to indicate that it encompasses equal amounts of activity in its Subunits 1 and 2, while the output excitatory node has only Subunit 2 active (i.e., there is no direct input to its Subunit 1). An assumption is made that each cortical layer quickly reaches the equilibrium state so that the BOLD signal strength predominantly reflects activity during memory maintenance, not activity during encoding or retrieval, consistent with the observations of Xu and Chun (2006). Therefore, I have not included a summation over time in the computation of imaging signal strength, as was done in Arbib et al. (1995) and Tagamets and Horwitz (1998).
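The steady-state computation in Eq. 12, together with the relative signal change defined in Eq. 13, can be sketched in a few lines. The function names, the flattened-list representation of a layer, and the numerical values of the scaling factor and the baseline are assumptions made for illustration; they are not the parameter values used in the reported simulations.

```python
def simulated_bold(X, Y, S=1.0, b=1.0):
    """Eq. 12: BOLD signal of one layer at steady state.

    X : somatic outputs of the output excitatory nodes (flattened)
    Y : somatic outputs of the input excitatory nodes (flattened)
    S : scaling factor coupling activity to the hemodynamic response
    b : baseline signal independent of dendritic processing
    """
    return S * (sum(X) + 2.0 * sum(Y)) + b


def percent_signal_change(bold_experimental, bold_control):
    """Eq. 13: relative BOLD change of a condition versus control."""
    return (bold_experimental - bold_control) / bold_control * 100.0


if __name__ == "__main__":
    # Hypothetical layer with two columns.
    X = [0.5, 0.25]
    Y = [1.0, 0.0]
    bold = simulated_bold(X, Y, S=2.0, b=4.0)
    # The control (fixation) signal is taken to equal the baseline b.
    print(bold, percent_signal_change(bold, 4.0))
```

Taking the control signal to be the baseline, as the text assumes for fixation, makes the relative change proportional to the summed dendritic activity of the layer.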
Also, I have not included more complex interactions between cerebral blood flow (CBF), cerebral blood volume, and the cerebral metabolic rate of oxygen in the calculation of BOLD response (Buxton, Uludag, Dubowitz, & Liu, 2004). I simply assumed that there is a linear relationship between BOLD and CBF. This is a reasonable assumption, because I am only interested in simulating the plateau value of the BOLD signal, not in its temporal dynamics. The simulated BOLD signal strength described in Eq. 12 is an absolute measure of dendritic processing and not of the relative signal change as compared to fixation, as is often used in functional neuroimaging studies. With the assumption that in the control condition (fixation), the BOLD signal equals the baseline level $b^L$, it is easy to compute the percentage of signal change during experimental conditions using the following equation (Arbib et al., 2000; Arbib et al., 1995):

$$r\mathrm{BOLD}^L(e/c) = \frac{\mathrm{BOLD}^L(e) - \mathrm{BOLD}^L(c)}{\mathrm{BOLD}^L(c)} \times 100\%, \tag{13}$$

where rBOLD designates relative change in the BOLD signal strength, and e (vs. c) stands for the experimental (vs. control) condition.

Computation in the neural architecture of VWM

In the neural architecture for VWM presented in Fig. 3, each cortical layer performs distinct computations and serves a different function within it. The surface labeling network specifies how attention will be serially deployed during encoding of objects into the inferior and superior IPS. The attention map reads out the spatial position of the most active nodes in the surface labeling network and represents the current focus of attention. When attention is drawn to an object, its spatial position is loaded into the fixed-capacity network of the inferior IPS, and its featural properties are loaded into the variable-capacity network of the superior IPS.

Surface labeling network (FEF)

In order to specify an object-based competitive queue for attentional selection, the surface labeling network implements three computational mechanisms, which will be described in detail: spatial gradient, activity spreading, and inhibition of return.

Spatial gradient

The input to the surface labeling network, $I_{ij}^{surf}$, is weighted by the spatial gradient, $J_{ij}$, as described by the equation

$$ I_{ij}^{surf}(t) = I_{ij}(t) \cdot J_{ij}, \tag{14} $$

where $I_{ij}$ assumes the value 0.1 if the location {i, j} is occupied by the object, or 0 if it is not. The input is temporarily turned on from t = 0 to t = 40. For t > 40, the input is turned off; that is, $I_{ij}(t) = 0$ for all spatial locations i and j. The spatial gradient $J_{ij}$ is given by

$$ J_{ij} = i + (j - 1)N, \tag{15} $$

for all i and j. Therefore, the input to the nodes in the network is such that every node has a distinct input amplitude that corresponds to its position in a network. The spatial gradient is illustrated in Fig. 4b. The presented gradient will assign the highest-activity amplitude to the object in the lower right part of the visual field. The next in the activity queue will be the object in the lower left part, and so on. This spatial gradient produces a plan for serial deployment of attention that will shift attention from the lower right to the lower left corner, then to the upper right corner, and finally to the upper left corner.
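The ordering implied by Eqs. 14 and 15 can be checked with a small sketch. It is illustrative only and rests on several assumptions that are not stated in the text: a square N × N grid, i taken as the horizontal index (increasing to the right) and j as the vertical index (increasing downward), which is the reading under which the gradient reproduces the attention order described above, and four hypothetical single-node objects placed in the corners. The function and variable names are invented for this example.

```python
def spatial_gradient(i, j, n):
    """Eq. 15: J_ij = i + (j - 1) * N, with 1-based indices."""
    return i + (j - 1) * n


def surface_input(i, j, n, occupied=True, amplitude=0.1):
    """Eq. 14: I_ij (0.1 on an object, 0 elsewhere) weighted by J_ij."""
    base = amplitude if occupied else 0.0
    return base * spatial_gradient(i, j, n)


def attention_order(objects, n):
    """Return object names sorted by decreasing gradient-weighted input."""
    return sorted(
        objects,
        key=lambda name: surface_input(*objects[name], n),
        reverse=True,
    )


N = 20  # layer dimension used in most simulations (M = N = 20)

# Hypothetical single-node objects, given as (i, j) = (horizontal, vertical).
# Assumption: i grows to the right and j grows downward.
objects = {
    "upper_left": (1, 1),
    "upper_right": (N, 1),
    "lower_left": (1, N),
    "lower_right": (N, N),
}

order = attention_order(objects, N)
print(order)
```

Under these assumptions the vertical index dominates the gradient, so objects in the lower part of the field are queued before objects in the upper part, and within a row the rightmost object comes first.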


A similar strategy has been observed in electrophysiological recordings in FEF cells (Buschman & Miller, 2009). Theoretical considerations (Groh, 2001) and direct anatomical evidence (Moschovakis et al., 1998) have suggested that spatial gradients of the form used here might be involved in sensory–motor transformations. Moschovakis et al. discovered a graded strength of anatomical projections from distinct superior colliculus cells to burst generators in the cat oculomotor system. Based on this work, Domijan and Šetić (2008) proposed a computational model that used a spatial gradient to explain a tendency in the perceptual organization to assign figural status to surfaces in the lower part of the visual field. Furthermore, Ivey et al. (2011) showed how the spatial gradient helps to explain the formation of plans for navigation through an environment.

Activity spreading

The surface labeling network allows neural activity to spread among the nearest neighbor locations in order to achieve the smooth representation of surfaces. By smooth representation, I mean a spatial representation in which all of the nodes encoding the same object receive the same firing rate (or activity label). The computation of dendritic inhibition in the surface labeling network is augmented with respect to Eq. 9 in order to achieve activity spreading by disinhibiting the nearest neighbors and protecting them from lateral inhibition. Instead of Eq. 11, the excitatory input to the interneuron for dendritic inhibition in the surface labeling network, $r_{ij}^{surf}$, is given by

$$ E_{ij}^{surfR} = \mathrm{Max}\left[ X_{ij}^{surf},\; X_{i+1,j}^{surf},\; X_{i-1,j}^{surf},\; X_{i,j+1}^{surf},\; X_{i,j-1}^{surf} \right]. \tag{16} $$

Besides the interneuron at the same location {i, j}, every output excitatory node $X_{ij}^{surf}$ also projects to its four nearest neighbor dendritic interneurons, consistent with recent anatomical evidence (Kapfer et al., 2007). Therefore, every dendritic interneuron receives input from five output excitatory nodes and computes the maximum of their activity values. Neurophysiological findings have suggested that neurons in the visual system are capable of computing the maximum of their input (Gawne & Martin, 2002; Sato, 1989). With the maximum operator running on the dendritic interneurons, neighboring output excitatory nodes are able to disinhibit the output excitatory node at location {i, j}. In this way, the dendritic interneuron enables the output excitatory node to attain the same activity value as its neighbors. The described disinhibition provides the neural basis for activity spreading along the surface of the object, which highlights all spatial locations that belong to the object.
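The maximum operation of Eq. 16 can be sketched as a five-point neighborhood maximum over a grid of output-node activities. This is a minimal illustration rather than the simulation code: the function name is invented, neighbors that fall outside the grid are simply skipped (the text does not specify how the border is handled), and the way the interneuron output subsequently acts on the dendrites (Eqs. 9 and 11) is not reproduced.

```python
def dendritic_interneuron_input(x):
    """Eq. 16: maximum of X over each node and its four nearest neighbors."""
    rows, cols = len(x), len(x[0])
    offsets = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
    e_r = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Assumption: neighbors outside the grid are ignored.
            e_r[i][j] = max(
                x[i + di][j + dj]
                for di, dj in offsets
                if 0 <= i + di < rows and 0 <= j + dj < cols
            )
    return e_r


# Hypothetical activities: two adjacent nodes of one surface with unequal rates.
x = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 3.0, 5.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
e_r = dendritic_interneuron_input(x)
for row in e_r:
    print(row)
```

In this toy grid the interneuron at the weaker node (activity 3) receives the value 5 from its more active neighbor, which is the signal the text describes as allowing a node to reach the same activity value as its neighbors.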

In the primary visual cortex, there is evidence that attention might spread along the shape of the object by enhancing the activity amplitude of all spatial locations belonging to the object (Roelfsema, Lamme, & Spekreijse, 1998). Such activity spreading is correlated with behavioral performance, since it “lights up” the wrong object when the monkey commits an error in a curve tracing task (Roelfsema & Spekreijse, 2001). In a similar vein, Müller and Kleinschmidt (2003) found that when one part of an object was attended, retinotopic locations in the early visual cortex corresponding with other parts of the same object were also enhanced. Given that the FEF might play a causal role in modulating neural activity in the primary visual cortex (Bressler et al., 2008; Ruff et al., 2006), it is conceivable that the attention-related activity spreading observed by Roelfsema et al. (1998) might originate from the FEF. The activity spreading in the surface labeling network is possible only if the locations are connected; activity will not spread onto the empty space around the object. On the other hand, disconnected surfaces will inhibit each other, and their activity magnitudes will be distinct due to the different strengths of the input gradient.

Inhibition of return (attentional shifts)

The synaptic input to the interneuron mediating lateral inhibition in the surface labeling network is augmented with respect to Eq. 8 in order to enable attentional shifts from one object to another. Instead of Eq. 10, the total excitatory input to the lateral inhibition interneuron, $E_{ij}^{surfZ}$, is given by

$$ E_{ij}^{surfZ} = Y_{ij}^{surf} + P_{ij}(t)\,\alpha(t), \tag{17} $$

where $Y_{ij}^{surf}$ stands for the somatic output from the input excitatory node, $P_{ij}$ is the output from the attention map, and α(t) represents a nonspecific reset signal that shuts down the nodes that are currently most active, that is, the nodes currently selected by the attention map. The reset signal enables attention to shift to another object and implements inhibition of return that prevents attention from returning to already visited objects (Dodd et al., 2003; Klein, 2000; Klein & MacInnes, 1999). The strength of the reset signal is chosen so as to inhibit the output excitatory nodes of the attended object below the somatic threshold. Consequently, the spatial representation of the attended object is removed from the competitive queue in the surface labeling network. The reset signal is periodically turned on, α(t) = α, for a short time period, $(\Delta t + t_a) < t < (\Delta t + t_b)$, and then withdrawn, α(t) = 0, for the time interval $(\Delta t + t_b) \le t \le (\Delta t + t_c)$, where $\Delta t = (k - 1) \times (t_c - t_a)$, and k = 1, 2, . . . counts the number of attentional shifts. An example of the temporal profile of the reset signal is shown in Fig. 6. It should be noted that attentional shifting starts after the surface labeling network has completed the activity spreading. A detailed description of how the attentional focus is represented and how it shifts in the spatial map is provided in the next section.
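The timing of the reset signal and its contribution to Eq. 17 can be written out as a short sketch. It uses the values reported in the simulation parameters (α = 70, t_a = 10, t_b = 20, t_c = 30) and assumes that the signal is zero before the first cycle begins, which the text does not state explicitly. The function names are invented for this example, and the time axis here starts when attentional shifting begins.

```python
def reset_signal(t, alpha=70.0, ta=10.0, tb=20.0, tc=30.0):
    """Periodic reset signal alpha(t).

    On for (dt + ta) < t < (dt + tb), off for (dt + tb) <= t <= (dt + tc),
    where dt = (k - 1) * (tc - ta) and k counts attentional shifts.
    """
    if t <= ta:
        return 0.0  # assumption: no reset before the first cycle starts
    phase = (t - ta) % (tc - ta)  # time elapsed within the current cycle
    return alpha if 0.0 < phase < (tb - ta) else 0.0


def lateral_interneuron_input(y, p, t):
    """Eq. 17 for a single node: E^Z = Y + P * alpha(t)."""
    return y + p * reset_signal(t)


# Sample the profile at a few time points.
for t in (5.0, 15.0, 20.0, 30.0, 35.0, 45.0):
    print(t, reset_signal(t))
```

Because the reset term is multiplied by the attention map output, only nodes of the currently attended object (P = 1) receive the extra inhibition drive; unattended nodes (P = 0) are left with their somatic input alone.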

Attention map

The attention map receives input from the surface labeling network. The read-out of the surface representation in the attention map is achieved by imposing a high input threshold in the feedforward pathway from the surface labeling network to the attention map. The threshold is set to a value just below the maximal activity in the surface labeling network so that the attention map will select only one object. This object is currently in the focus of attention, and its properties are loaded into the inferior and superior IPS. The time evolution of the activity (firing rate) of the node $p_{ij}$ in the attention map is given by

$$ \tau_p \frac{dp_{ij}}{dt} = -p_{ij} + f_1\left[ X_{ij}^{surf} - \alpha(t) - T_P \right], \tag{18} $$

where $\tau_p$ is an integration time constant chosen such that the temporal evolution of the nodes is slower in the attention map than in the surface labeling network. The term $-p_{ij}$ denotes passive decay to the resting state in the absence of an input. The output of the surface labeling network, $X_{ij}^{surf}$, contacts the distal dendrite at the node in the attention map, and the dendrite responds if the output from the surface labeling network is above the threshold $T_P$. When the dendrites are active, they will contribute to the increase in the BOLD signal corresponding to the attended locations, consistent with the findings of Sereno et al. (2001), Silver et al. (2005), and Yantis et al. (2002). The attention map can be activated only when there is no reset signal, α(t), that provides inhibition on the dendrite and prevents its activation. The role of the described interaction, along with slower temporal dynamics, is to prevent switching between surfaces during the time interval when the reset signal is on. The transition to a new surface takes place only after the reset signal ceases. In this way, the attention map is protected from failures to deliver its signal to the surface labeling network and from the selection of two surfaces simultaneously.
The output from the attention map, $P_{ij}$, is given by

$$ P_{ij} = h\left( p_{ij} - 0.5 \right), \tag{19} $$

where the function h, with h(a) = 1 if a > 0 and h(a) = 0 if a ≤ 0, denotes a binary response function specific to the attention map.
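The interplay of the threshold, the reset signal, and the binary output in Eqs. 18 and 19 can be illustrated for a single attention-map node. The sketch below is hedged in several ways. The dendritic function f_1 is defined earlier in the article and is replaced here by a simple step function purely for illustration. The integrator is a fixed-step third-order Bogacki–Shampine update without the adaptive step-size control that the full method normally includes. The surface-network activity of 80 and the held-constant reset value are hypothetical inputs; T_P = 75 and τ_p = 4 are the values reported in the simulation parameters.

```python
import math


def step(a):
    """Binary response h(a) of Eq. 19: 1 if a > 0, else 0."""
    return 1.0 if a > 0 else 0.0


def dp_dt(p, x_surf, alpha_t, t_p=75.0, tau_p=4.0):
    """Right-hand side of Eq. 18 divided by tau_p.

    Assumption: f_1 is approximated by a step function.
    """
    dendrite = step(x_surf - alpha_t - t_p)
    return (-p + dendrite) / tau_p


def bogacki_shampine_step(f, p, h):
    """One fixed-size third-order Bogacki-Shampine update for dp/dt = f(p)."""
    k1 = f(p)
    k2 = f(p + 0.5 * h * k1)
    k3 = f(p + 0.75 * h * k2)
    return p + h * (2.0 * k1 + 3.0 * k2 + 4.0 * k3) / 9.0


def simulate_attention_node(x_surf, alpha_t, t_end=20.0, h=0.1):
    """Integrate one node from p = 0 with constant inputs; return (p, P)."""
    p = 0.0
    for _ in range(int(round(t_end / h))):
        p = bogacki_shampine_step(lambda v: dp_dt(v, x_surf, alpha_t), p, h)
    return p, step(p - 0.5)


p_on, out_on = simulate_attention_node(x_surf=80.0, alpha_t=0.0)
p_reset, out_reset = simulate_attention_node(x_surf=80.0, alpha_t=70.0)
p_low, out_low = simulate_attention_node(x_surf=70.0, alpha_t=0.0)
print(out_on, out_reset, out_low)
```

With these inputs the node is selected only when the surface activity exceeds the threshold and no reset signal is present; either a subthreshold surface or an active reset keeps the binary output at zero.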

Inferior IPS

The input to the inferior IPS, $I_{ij}^{inf}(t)$, arrives from the attention map, $P_{ij}$, so $I_{ij}^{inf}(t) = P_{ij}(t)$. In different time intervals, different objects will be in the focus of attention, and they will be loaded into inferior IPS up to the capacity limit. In order to ensure a fixed capacity limit, independent of the size of the objects, a divisive inhibition is introduced that modulates the synaptic input to the input excitatory node. Instead of fixed dendritic weights, as in Eq. 9, in the inferior IPS dendritic weights are modulated by the total activity in the attention map, as in

$$ \omega^{infY} = \frac{G^{inf}}{1 + q}. \tag{20} $$

In Eq. 20, $G^{inf}$ describes the fixed strength of the distal dendritic impact on the trunk, and q represents a modulatory divisive inhibition that reduces the amount of activity in the trunk in proportion to the size of the object (Carandini & Heeger, 1994; Chance & Abbott, 2000). Divisive inhibition arrives from a special inhibitory node that computes the total activity in the attention map, as in

$$ q = \sum_{i=1}^{M} \sum_{j=1}^{N} P_{ij}. \tag{21} $$
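Eqs. 20 and 21 can be sketched in a few lines to show how the divisive term compresses differences in object size. The example uses the reported value G^inf = 20 and hypothetical square objects on a 20 × 20 attention map. The function names are invented, and the product of the weight and q is printed only as a rough indicator of the normalization; it is not a quantity defined in the article.

```python
def divisive_inhibition(attention_map):
    """Eq. 21: q is the summed binary output of the attention map."""
    return sum(sum(row) for row in attention_map)


def dendritic_weight(q, g_inf=20.0):
    """Eq. 20: dendritic weight G_inf / (1 + q)."""
    return g_inf / (1.0 + q)


def object_mask(size, n=20):
    """Hypothetical attention-map output with one size x size attended object."""
    return [
        [1.0 if i < size and j < size else 0.0 for j in range(n)]
        for i in range(n)
    ]


for size in (1, 2, 5):
    q = divisive_inhibition(object_mask(size))
    w = dendritic_weight(q)
    # Columns: object area, q, weight, weight * q (illustrative only).
    print(size * size, q, round(w, 3), round(w * q, 3))
```

Since the weight times q equals G^inf · q / (1 + q), the summed weighted input can never exceed G^inf, however large the attended object is.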

The sum is computed when the first object is selected in the attention map, and this activation is stored in working memory, which enables the same size estimate to be available during the whole trial. Therefore, q will be proportional to the size of the first attended object. It is assumed that the node for divisive inhibition reacts quickly to its input, and its time evolution is not represented explicitly.

Superior IPS

In the computer simulations, the input to the superior IPS is set by hand. For instance, if the currently attended object is red, the input to V4 and the superior IPS is simply set to 1 for all nodes encoding the color red, and to 0 otherwise. On the other hand, if a green object is presented in parallel, but it is not in the focus of attention, it will not activate the corresponding nodes in V4 and the superior IPS. Formally, the input to the output excitatory nodes in the superior IPS is given by $I_{ij}^{sup}(t) = P_{mn}(t)$ if the currently attended object possesses the feature value j on the feature dimension i, and $I_{ij}^{sup}(t) = 0$ otherwise. The term $P_{mn}(t)$ denotes the activity of a randomly selected node from the attention map that covers the spatial location {m, n} of the attended object. Index i in the superior IPS denotes feature dimensions, with i = 1, . . . , 10 representing the color nodes and i = 11, . . . , 20 representing the shape nodes. Index j represents particular feature values within the dimensions. Therefore, the nodes in each row within the color dimension are sensitive to different colors, such as red, green, and so on. The nodes in the shape dimension are sensitive to different shapes, such as the letters used in the simulations presented in this article. In order to account for the higher average fMRI response of the superior as compared to the inferior IPS in the studies of Xu and Chun (2009), I have assumed that each feature value recruits 20 nodes (two rows of nodes) within each dimension.

Simulation method and parameters

The model’s differential equations are solved numerically using the Bogacki–Shampine method. In each simulation, the surface labeling network is run for a time interval from t = 0 to t = 120, in simulated time steps. After its convergence, the full architecture is run in order to load object representations into the inferior and superior IPS until the memory capacity is full. Attentional shifting is provided by the nonspecific reset signal, whose strength is set to α = 70 and whose temporal profile is given by $t_a = 10$, $t_b = 20$, and $t_c = 30$. For the computer simulations reported in this article, the parameters controlling the dynamics of the network are reported in Table 1. As can be seen from the table, almost all parameters are identical in all three cortical layers. The exceptions are the parameters controlling the strength of lateral inhibition, which are set to lower values in the FEF. In this way, the FEF is able to encode more objects than the inferior IPS. With respect to the comparison between the inferior and the superior IPS, it should be noted that all of their parameters are set to exactly equal values so that any observed differences in BOLD response between these layers should be attributed to their respective computational mechanisms and not to the parameter fitting.

Table 1 Parameters for cortical layers used in all simulations

                     Cortical Layer
Parameter            FEF     Inferior IPS    Superior IPS
A                    1       1               1
B                    80      80              80
C                    1       1               1
τx                   2       2               2
τy, τz, τr           1       1               1
TX1, TX2, TX3        0       0               0
TY1                  1       1               1
TY2, TY3             0       0               0
wE                   1       1               1*
wZ                   1       0.75            0.75
wR                   0       1               1
ωX                   1       1               1
ωY                   0.4     1               1

* In the simulation presented in Fig. 8, the strength of self-excitation varied between 0.5 and 2.

The threshold for the surface selection in the attention map is set to $T_P = 75$, and the time constant for the attention map is set to $\tau_p = 4$. The parameter for the divisive inhibition, which is specific to inferior IPS, is set to $G^{inf} = 20$. The parameters for computing simulated BOLD signal strength were set to $S^{inf} = S^{sup} = 1/8{,}000$ and $b^{inf} = b^{sup} = 100$. The dimensions of each cortical layer were M = N = 20, except in the simulation depicted in Fig. 9, where complex objects required more space, so the dimensions were set to M = N = 30 in all cortical layers.
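The BOLD parameters reported here can be combined with the percent-signal-change formula of Eq. 13 in a brief sketch. Eq. 12 is not reproduced in this section, so the snippet assumes, for illustration only, that the simulated BOLD signal is the baseline plus the scale factor times a summed dendritic activity; the activity value used is hypothetical and the function name is invented.

```python
def relative_bold_change(bold_experimental, bold_control):
    """Eq. 13: percent signal change relative to the control condition."""
    return (bold_experimental - bold_control) / bold_control * 100.0


BASELINE = 100.0      # b_inf = b_sup = 100
SCALE = 1.0 / 8000.0  # S_inf = S_sup = 1/8,000

# Hypothetical summed dendritic activity during memory maintenance.
summed_activity = 4000.0

# Assumption: BOLD = baseline + scale * summed activity (stand-in for Eq. 12).
bold_experimental = BASELINE + SCALE * summed_activity
bold_control = BASELINE  # fixation: signal equals the baseline level

change = relative_bold_change(bold_experimental, bold_control)
print(change)
```

With a baseline of 100, the percent change is numerically equal to the scaled activity term, which is presumably why the scale factor is chosen to bring summed activity into the range of typical fMRI signal changes.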

References Alvarez, G. A., & Cavanagh, P. (2004). The capacity of visual shortterm memory is set both by visual information load and by number of objects. Psychological Science, 15, 106–111. doi:10.1111/j.0963-7214.2004.01502006.x Arbib, M. A., Billard, A., Iacoboni, M., & Oztop, E. (2000). Synthetic brain imaging: Grasping, mirror neurons and imitation. Neural Networks, 13, 975–997. doi:10.1016/S0893-6080(00)00070-8 Arbib, M. A., Bischoff, A., Fag, A. H., & Grafton, S. T. (1995). Synthetic PET: Analyzing large-scale properties of neural networks. Human Brain Mapping, 2, 225–233. Attwell, D., & Iadecola, C. (2002). The neural basis of functional brain imaging signals. Trends in Neurosciences, 25, 621–625. doi:10.1016/S0166-2236(02)02264-6 Averbeck, B. B., Chafee, M. V., Crowe, D. A., & Georgopoulos, A. P. (2002). Parallel processing of serial movements in prefrontal cortex. Proceedings of the National Academy of Sciences, 99, 13172–13177. Awh, E., & Jonides, J. (2001). Overlapping mechanisms of attention and spatial working memory. Trends in Cognitive Sciences, 5, 119–126. doi:10.1016/S1364-6613(00)01593-X Bartels, A., Logothetis, N. K., & Moutoussis, K. (2008). fMRI and its interpretations: An illustration on directional selectivity in area V5/MT. Trends in Neurosciences, 31, 444–453. doi:10.1016/j. tins.2008.06.004 Beauchamp, M. S., Haxby, J. V., Jennings, J. E., & DeYoe, E. A. (1999). An fMRI version of the Farnsworth–Munsell 100-hue test reveals multiple color-selective areas in human ventral occipitotemporal cortex. Cerebral Cortex, 9, 257–263. Behrmann, M., Geng, J. J., & Shomstein, S. (2004). Parietal cortex and attention. Current Opinion in Neurobiology, 14, 212–217. doi:10.1016/j.conb.2004.03.012 Berger, T. K., Perin, R., Silberberg, G., & Markram, H. (2009). Frequency-dependent disynaptic inhibition in the pyramidal network—A ubiquitous pathway in the rodent neocortex. Journal of Physiology (London), 587, 5411–5425. Binzegger, T., Douglas, R. 
J., & Martin, K. A. C. (2004). A quantitative map of the circuit of cat primary visual cortex. Journal of Neuroscience, 24, 8441–8453. doi:10.1523/JNEUROSCI.140004.2004

Cogn Affect Behav Neurosci (2011) 11:573–599 Binzegger, T., Douglas, R. J., & Martin, K. A. C. (2009). Topology and dynamics of the canonical circuit of cat V1. Neural Networks, 22, 1071–1078. Bonvento, G., Sibson, N., & Pellerin, L. (2002). Does glutamate image your thoughts? Trends in Neurosciences, 25, 359–364. doi:10.1016/S0166-2236(02)02168-9 Bradski, G., Carpenter, G., & Grossberg, S. (1992). Working memory networks for learning temporal order with application to 3-D visual object recognition. Neural Computation, 4, 270–286. Braitenberg, V., & Schuz, A. (1991). Anatomy of the cortex. Berlin: Springer. Bressler, S. L., Tang, W., Sylvester, C. M., Shulman, G. L., & Corbetta, M. (2008). Top-down control of human visual cortex by frontal and parietal cortex in anticipatory visual spatial attention. Journal of Neuroscience, 28, 10056–10061. doi:10.1523/JNEUROSCI.1776-08.2008 Brody, C. D., & Hopfield, J. J. (2003). Simple networks for spiketiming-based computation, with application to olfactory processing. Neuron, 37, 843–852. Bullock, D. (2004). Adaptive neural models of queuing and timing in fluent action. Trends in Cognitive Sciences, 8, 426–433. doi:10.1016/j.tics.2004.07.003 Bullock, D., & Rhodes, B. (2003). Competitive queuing for serial planning and performance. In M. Arbib (Ed.), The handbook of brain theory and neural networks (pp. 241–248). Cambridge: MIT Press. Buschman, T. J., & Miller, E. K. (2007). Top-down versus bottom-up control of attention in the prefrontal and posterior parietal cortices. Science, 315, 1860–1862. doi:10.1126/science.1138071 Buschman, T. J., & Miller, E. K. (2009). Serial, covert, shifts of attention during visual search are reflected by the frontal eye fields and correlated with population oscillations. Neuron, 63, 386–396. Buxton, R. B., Uludag, K., Dubowitz, D. J., & Liu, T. T. (2004). Modeling the hemodynamic response to brain activation. NeuroImage, 23, 220–233. Carandini, M., & Heeger, D. J. (1994). 
Summation and division by neurons in visual cortex. Science, 264, 1333–1336. Chance, F. S., & Abbott, L. F. (2000). Divisive inhibition in recurrent networks. Network, 11, 119–129. Claeys, K., Dupont, P., Cornette, L., Sunaert, S., Van Hecke, P., De Schutter, E., et al. (2004). Color discrimination involves ventral and dorsal stream visual areas. Cerebral Cortex, 14, 803–822. Cohen, M. A., & Grossberg, S. (1984). Neural dynamics of brightness perception: Features, boundaries, diffusion, and resonance. Perception & Psychophysics, 36, 428–456. doi:10.3758/ BF03207497 Colby, C. L., & Goldberg, M. E. (1999). Space and attention in parietal cortex. Annual Review of Neuroscience, 22, 319–349. Corbetta, M., Akbudak, E., Conturo, T. E., Snyder, A. Z., Ollinger, J. M., Drury, H. A., et al. (1998). A common network of functional areas for attention and eye movements. Neuron, 21, 761–773. Corbetta, M., & Shulman, G. L. (2002). Control of goal-directed and stimulus-driven attention in the brain. Nature Reviews Neuroscience, 3, 201–215. Crick, F., & Koch, C. (1998). Constraints on cortical and thalamic projections: The no-strong-loops hypothesis. Nature, 391, 245– 250. doi:10.1038/34584 Curtis, C. E., & D’Esposito, M. (2003). Persistent activity in the prefrontal cortex during working memory. Trends in Cognitive Sciences, 7, 415–423. Deco, G., Rolls, E. T., & Horwitz, B. (2004). “What” and “where” in visual working memory: A computational neurodynamical perspective for integrating fMRI and single-neuron data. Journal of Cognitive Neuroscience, 16, 683–701. doi:10.1162/ 089892904323057380

597 Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18, 193–222. doi:10.1146/annurev.ne.18.030195.001205 Dodd, M. D., Castel, A. D., & Pratt, J. (2003). Inhibition of return with rapid serial shifts of attention: Implications for memory and visual search. Perception & Psychophysics, 65, 1126–1135. Domijan, D. (2003). A mathematical model of persistent neural activity in human prefrontal cortex for visual feature binding. Neuroscience Letters, 350, 89–92. Domijan, D., & Šetić, M. (2008). A feedback model of figure–ground assignment. Journal of Vision, 8(7), 10:1–27. doi:10.1167/8.7.10 Donner, T., Kettermann, A., Diesch, E., Ostendorf, F., Villringer, A., & Brandt, S. A. (2000). Involvement of the human frontal eye field and multiple parietal areas in covert visual selection during conjunction search. European Journal of Neuroscience, 12, 3407–3414. doi:10.1046/j.1460-9568.2000.00223.x Douglas, R. J., & Martin, K. A. C. (2004). Neuronal circuits of the neocortex. Annual Review of Neuroscience, 27, 419–451. doi:10.1146/annurev.neuro.27.070203.144152 Durstewitz, D., Seamans, J. K., & Sejnowski, T. J. (2000). Neurocomputational models of working memory. Nature Neuroscience, 3, 1184–1191. Edin, F., Klingberg, T., Johansson, P., McNab, F., Tegnér, J., & Compte, A. (2009). Mechanism for top-down control of working memory capacity. Proceedings of the National Academy of Sciences, 106, 6802–6807. Edin, F., Macoveanu, J., Olesen, P., Tegnér, J., & Klingberg, T. (2007). Stronger synaptic connectivity as a mechanism behind development of working memory-related brain activity during childhood. Journal of Cognitive Neuroscience, 19, 750–760. doi:10.1162/ jocn.2007.19.5.750 Fazl, A., Grossberg, S., & Mingolla, E. (2009). View-invariant object category learning, recognition, and search: How spatial and object attention are coordinated using surface-based attentional shrouds. Cognitive Psychology, 58, 1–48. Gawne, T. 
J., & Martin, J. M. (2002). Responses of primate visual cortical V4 neurons to simultaneously presented stimuli. Journal of Neurophysiology, 88, 1128–1135. Groh, J. M. (2001). Converting neural signals from place codes to rate codes. Biological Cybernetics, 85, 159–165. Grossberg, S. (1978). A theory of human memory: Self-organization and performance of sensory–motor codes, maps, and plans. In R. Roden & F. Snell (Eds.), Progress in theoretical biology (Vol. 5, pp. 498–639). New York: Academic. Grossberg, S. (1988). Nonlinear neural networks: Principles, mechanism, and architectures. Neural Networks, 1, 17–61. Grossberg, S., & Pearson, L. R. (2008). Laminar cortical dynamics of cognitive and motor working memory, sequence learning, and performance: Toward a unified theory of how the cerebral cortex works. Psychological Review, 115, 677–732. doi:10.1037/a0012618 Grossberg, S., & Todorović, D. (1988). Neural dynamics of 1-D and 2D brightness perception: A unified model of classical and recent phenomena. Perception & Psychophysics, 43, 241–277. Hao, J., Wang, X.-D., Dan, Y., Poo, M.-M., & Zhang, X.-H. (2009). An arithmetic rule for spatial summation of excitatory and inhibitory inputs in pyramidal neurons. Proceedings of the National Academy of Sciences, 106, 21906–21911. doi:10.1073/ pnas.0912022106 Häusser, M., & Mel, B. W. (2003). Dendrites: Bug or feature? Current Opinion in Neurobiology, 13, 372–383. Heinzle, J., Hepp, K., & Martin, K. A. C. (2007). A microcircuit model of the frontal eye fields. Journal of Neuroscience, 27, 9341–9353. doi:10.1523/JNEUROSCI.0974-07.2007 Hopfield, J. J., & Brody, C. D. (2001). What is a moment? “Cortical” sensory integration over a brief interval. Proceedings of the National Academy of Sciences, 97, 13919–13924.

598 Horwitz, B., Tagamets, M.-A., & McIntosh, A. R. (1999). Neural modeling, functional brain imaging, and cognition. Trends in Cognitive Sciences, 3, 91–98. doi:10.1016/S1364-6613(99) 01282-6 Houghton, G. (1990). The problem of serial order: A neural network model of sequence learning and recall. In R. Dale, C. S. Mellish, & M. Zock (Eds.), Current research in natural language generation (pp. 287–319). London: Academic. Hupé, J. M., James, A. C., Girard, P., Payne, B. R., & Bullier, J. (2001). Feedback connections act on the early part of the responses in monkey visual cortex. Journal of Neurophysiology, 85, 134–145. Ivey, R., Bullock, D., & Grossberg, S. (2011). A neuromorphic model of spatial lookahead planning. Neural Networks, 24, 257–266. doi:10.1016/j.neunet.2010.11.002 Jaeggi, S. M., Buschkuehl, M., Jonides, J., & Perrig, W. J. (2008). Improving fluid intelligence with training on working memory. Proceedings of the National Academy of Sciences, 105, 6829– 6833. Kang, K., Shelley, M., & Sompolinsky, H. (2003). Mexican hats and pinwheels in visual cortex. Proceedings of the National Academy of Sciences, 100, 2848–2853. Kapadia, M. K., Ito, M., Gilbert, C. D., & Westheimer, G. (1995). Improvement in visual sensitivity by changes in local context: Parallel studies in human observers and in V1 or alert monkeys. Neuron, 15, 843–856. Kapfer, C., Glickfield, L. L., Atallah, B. V., & Scanziani, M. (2007). Supralinear increase of recurrent inhibition during sparse activity in the somatosensory cortex. Nature Neuroscience, 10, 743–753. Kätzel, D., Zemelman, B. V., Buetfering, C., Wölfel, M., & Miesenböck, G. (2011). The columnar and laminar organization of inhibitory connections to neocortical excitatory cells. Nature Neuroscience, 14, 100–107. Klein, R. M. (2000). Inhibition of return. Trends in Cognitive Sciences, 4, 138–147. doi:10.1016/S1364-6613(00)01452-2 Klein, R. M., & MacInnes, W. J. (1999). Inhibition of return is a foraging facilitator in visual search. 
Psychological Science, 10, 346–352. doi:10.1111/1467-9280.00166 Klingberg, T., Forssberg, H., & Westerberg, H. (2002). Increased brain activity in frontal and parietal cortex underlies the development of visuospatial working memory capacity during childhood. Journal of Cognitive Neuroscience, 14, 1–10. Larkum, M. E., Zhu, J. J., & Sakmann, B. (1999). A new cellular mechanism for coupling inputs arriving at different cortical layers. Nature, 398, 338–341. Larkum, M. E., Zhu, J. J., & Sakmann, B. (2001). Dendritic mechanisms underlying the coupling of the dendritic with the axonal AP initiation zone of adult layer 5 pyramidal neurons. Journal of Physiology (London), 533, 447–466. Lauritzen, M. (2005). Reading vascular changes in brain imaging: Is dendritic calcium the key? Nature Reviews Neuroscience, 6, 77– 85. Lee, H., Simpson, G. V., Logothetis, N. K., & Rainer, G. (2005). Phase locking of single neuron activity to theta oscillations during working memory in monkey extrastriate visual cortex. Neuron, 45, 147–156. Lewis, J. W., & Van Essen, D. C. (2000). Corticocortical connections of visual, sensorimotor, and multimodal processing areas in the parietal lobe of the macaque monkey. The Journal of Comparative Neurology, 428, 112–137. Liu, G. (2004). Local structural balance and functional interaction of excitatory and inhibitory synapses in hippocampal dendrites. Nature Neuroscience, 7, 373–379. Logothetis, N. K. (2002). The neural basis of the blood-oxygen-leveldependent functional magnetic resonance imaging signal. Philosophical Transactions of the Royal Society B, 357, 1003–1037.

Cogn Affect Behav Neurosci (2011) 11:573–599 Logothetis, N. K., & Wandell, B. A. (2004). Interpreting the BOLD signal. Annual Review of Physiology, 66, 735–769. London, M., & Häusser, M. (2005). Dendritic computation. Annual Review of Neuroscience, 28, 503–532. doi:10.1146/annurev. neuro.28.061604.135703 Losonczy, A., & Magee, J. C. (2006). Integrative properties of radial oblique dendrites in hippocampal CA1 pyramidal neurons. Neuron, 50, 291–307. Luck, S. J., & Vogel, E. K. (1997). The capacity of visual working memory for features and conjunctions. Nature, 390, 279–281. doi:10.1038/36846 Markowitz, D. A., Collman, F., Brody, C. D., Hopfield, J. J., & Tank, D. W. (2008). Rate-specific synchrony: Using noisy oscillations to detect equally active neurons. Proceedings of the National Academy of Sciences, 105, 8422–8427. doi:10.1073/ pnas.0803183105 Markram, H., Toledo-Rodriguez, M., Wang, Y., Gupta, A., Silberberg, G., & Wu, C. (2004). Interneurons of the neocortical inhibitory system. Nature Reviews Neuroscience, 5, 793–807. Martinez-Conde, S., Cudeiro, J., Grieve, K. L., Rodriguez, R., Rivadulla, C., & Acuna, C. (1999). Effects of feedback projections from area 18 layers 2/3 to area 17 layers 2/3 in the cat visual cortex. Journal of Neurophysiology, 82, 2667–2675. Moschovakis, A. K., Kitama, T., Dalezios, Y., Petit, J., Brandi, A. M., & Grantyn, A. A. (1998). An anatomical substrate for the spatiotemporal transformation. Journal of Neuroscience, 18, 10219–10229. Müller, N. G., & Kleinschmidt, A. (2003). Dynamic interaction of object- and space-based attention in retinotopic visual areas. Journal of Neuroscience, 23, 9812–9816. Murayama, M., & Larkum, M. E. (2009). Enhanced dendritic activity in awake rats. Proceedings of the National Academy of Sciences, 106, 20482–20486. doi:10.1073/pnas.0910379106 Murayama, M., Pérez-Garci, E., Nevian, T., Bock, T., Senn, W., & Larkum, M. E. (2009). Dendritic encoding of sensory stimuli controlled by deep cortical interneurons. 
Nature, 457, 1137–1141. doi:10.1038/nature07663 Olesen, P. J., Macoveanu, J., Tegnér, J., & Klingberg, T. (2007). Brain activity related to working memory and distraction in children and adults. Cerebral Cortex, 17, 1047–1054. Olesen, P. J., Westerberg, H., & Klingberg, T. (2004). Increased prefrontal and parietal activity after training of working memory. Nature Neuroscience, 7, 75–79. doi:10.1038/nn1165 Page, M. P. A., & Norris, D. (1998). The primacy model: A new model of immediate serial recall. Psychological Review, 105, 761–781. Pashler, H. (1988). Familiarity and visual change detection. Perception & Psychophysics, 44, 369–378. doi:10.3758/BF03210419 Pessoa, L., Gutierrez, E., Bandettini, P., & Ungerleider, L. (2002). Neural correlates of visual working memory: fMRI amplitude predicts task performance. Neuron, 35, 975–987. Phillips, W. A. (1974). On the distinction between sensory storage and short-term visual memory. Perception & Psychophysics, 16, 283– 290. Poirazi, P., Brannon, T., & Mel, B. W. (2003). Pyramidal neuron as 2layer neural network. Neuron, 37, 989–999. Polsky, A., Mel, B. W., & Schiller, J. (2004). Computational subunits in thin dendrites of pyramidal cells. Nature Neuroscience, 7, 621–627. Reynolds, J. H., & Chelazzi, L. (2004). Attentional modulation of visual processing. Annual Review of Neuroscience, 27, 611–647. doi:10.1146/annurev.neuro.26.041002.131039 Roelfsema, P. R. (2006). Cortical algorithms for perceptual grouping. Annual Review of Neuroscience, 29, 203–227. doi:10.1146/ annurev.neuro.29.051605.112939 Roelfsema, P. R., Lamme, V. A. F., & Spekreijse, H. (1998). Objectbased attention in the primary visual cortex of the macaque monkey. Nature, 395, 376–381. doi:10.1038/26475

Roelfsema, P. R., & Spekreijse, H. (2001). The representation of erroneously perceived stimuli in the primary visual cortex. Neuron, 31, 853–863.
Ruff, C. C., Blankenburg, F., Bjoertomt, O., Bestmann, S., Freeman, E., Haynes, J., et al. (2006). Concurrent TMS-fMRI and psychophysics reveal frontal influences on human retinotopic visual cortex. Current Biology, 16, 1479–1488.
Sato, T. (1989). Interactions of visual stimuli in the receptive fields of inferior temporal neurons in awake macaques. Experimental Brain Research, 77, 23–30.
Schiller, J., Major, G., Koester, H. J., & Schiller, Y. (2000). NMDA spikes in basal dendrites of cortical pyramidal neurons. Nature, 404, 285–289. doi:10.1038/35005094
Schiller, J., Schiller, Y., Stuart, G., & Sakmann, B. (1997). Calcium action potentials restricted to distal apical dendrites of rat neocortical pyramidal neurons. The Journal of Physiology, 505, 605–616.
Schmidt, B. K., Vogel, E. K., Woodman, G. F., & Luck, S. J. (2002). Voluntary and automatic attentional control of visual working memory. Perception & Psychophysics, 64, 754–763. doi:10.3758/BF03194742
Sereno, A. B., & Maunsell, J. H. R. (1998). Shape selectivity in primate lateral intraparietal cortex. Nature, 395, 500–503. doi:10.1038/26752
Sereno, M. I., Pitzalis, S., & Martinez, A. (2001). Mapping of contralateral space in retinotopic coordinates by a parietal cortical area in humans. Science, 294, 1350–1354.
Shafritz, K. M., Gore, J. C., & Marois, R. (2002). The role of the parietal cortex in visual feature binding. Proceedings of the National Academy of Sciences, 99, 10917–10922. doi:10.1073/pnas.152694799
Siegel, M., Warden, M. R., & Miller, E. K. (2009). Phase-dependent neuronal coding of objects in short-term memory. Proceedings of the National Academy of Sciences, 106, 21341–21346.
Silberberg, G., & Markram, H. (2007). Disynaptic inhibition between neocortical pyramidal cells mediated by Martinotti cells. Neuron, 53, 735–746.
Silver, M. A., Ress, D., & Heeger, D. J. (2005). Topographic maps of visual spatial attention in human parietal cortex. Journal of Neurophysiology, 94, 1358–1371. doi:10.1152/jn.01316.2004
Somogyi, P., Tamas, G., Lujan, R., & Buhl, E. H. (1998). Salient features of synaptic organization in the cerebral cortex. Brain Research Reviews, 26, 113–135.
Spruston, N. (2008). Pyramidal neurons: Dendritic structure and synaptic integration. Nature Reviews Neuroscience, 9, 206–221.
Spruston, N., & Kath, W. L. (2004). Dendritic arithmetic. Nature Neuroscience, 7, 567–569.
Srimal, R., & Curtis, C. E. (2008). Persistent neural activity during the maintenance of spatial position in working memory. NeuroImage, 39, 455–468.
Tagamets, M.-A., & Horwitz, B. (1998). Integrating electrophysiological and anatomical experimental data to create a large-scale model that simulates a delayed match-to-sample human brain imaging study. Cerebral Cortex, 8, 310–320. doi:10.1093/cercor/8.4.310
Thompson, K. G., & Bichot, N. P. (2005). A visual saliency map in the primate frontal eye field. Progress in Brain Research, 147, 251–262.
Todd, J. J., & Marois, R. (2004). Capacity limit of visual short-term memory in human posterior parietal cortex. Nature, 428, 751–754. doi:10.1038/nature02466
Todd, J. J., & Marois, R. (2005). Posterior parietal cortex activity predicts individual differences in visual short-term memory capacity. Cognitive, Affective, & Behavioral Neuroscience, 5, 144–155. doi:10.3758/CABN.5.2.144
Toth, L. J., & Assad, J. A. (2002). Dynamic coding of behaviourally relevant stimuli in parietal cortex. Nature, 415, 165–168.
Towe, A. L., & Harding, G. W. (1970). Extracellular microelectrode sampling bias. Experimental Neurology, 29, 366–381.
Treisman, A. (1996). The binding problem. Current Opinion in Neurobiology, 6, 171–178.
Vogel, E. K., & Machizawa, M. G. (2004). Neural activity predicts individual differences in visual working memory capacity. Nature, 428, 748–751.
Vogel, E. K., McCollough, A. W., & Machizawa, M. G. (2005). Neural measures reveal individual differences in controlling access to working memory. Nature, 438, 500–503. doi:10.1038/nature04171
Vogel, E. K., Woodman, G. F., & Luck, S. J. (2001). Storage of features, conjunctions, and objects in visual working memory. Journal of Experimental Psychology. Human Perception and Performance, 27, 92–114. doi:10.1037/0096-1523.27.1.92
Vogel, E. K., Woodman, G. F., & Luck, S. J. (2006). The time course of consolidation in visual working memory. Journal of Experimental Psychology. Human Perception and Performance, 32, 1436–1451. doi:10.1037/0096-1523.32.6.1436
Waldvogel, D., van Gelderen, P., Muellbacher, W., Ziemann, U., Immisch, I., & Hallett, M. (2000). The relative metabolic demand of inhibition and excitation. Nature, 406, 995–998.
Walther, D. B., & Koch, C. (2007). Attention in hierarchical models of object recognition. Progress in Brain Research, 165, 57–78.
Wang, X.-J., Tegner, J., Constantinidis, C., & Goldman-Rakic, P. S. (2004). Division of labor among distinct subtypes of inhibitory neurons in a cortical microcircuit of working memory. Proceedings of the National Academy of Sciences, 101, 1368–1373.
Wei, D. S., Mei, Y. A., Bagal, A., Kao, J. P., Thompson, S. M., & Tang, C. M. (2001). Compartmentalized and binary behavior of terminal dendrites in hippocampal pyramidal neurons. Science, 293, 2272–2275.
Wheeler, M. E., & Treisman, A. M. (2002). Binding in short-term visual memory. Journal of Experimental Psychology. General, 131, 48–64. doi:10.1037/0096-3445.131.1.48
Wolfe, J. M. (1998). What can 1 million trials tell us about visual search? Psychological Science, 9, 33–39. doi:10.1111/1467-9280.00006
Woodman, G. F., Vecera, S. P., & Luck, S. J. (2003). Perceptual organization influences visual working memory. Psychonomic Bulletin & Review, 10, 80–87. doi:10.3758/BF03196470
Xu, Y. (2007). The role of the superior intraparietal sulcus in supporting visual short-term memory for multifeature objects. Journal of Neuroscience, 27, 11676–11686.
Xu, Y. (2009). Distinctive neural mechanisms supporting visual object individuation and identification. Journal of Cognitive Neuroscience, 21, 511–518. doi:10.1162/jocn.2008.21024
Xu, Y., & Chun, M. M. (2006). Dissociable neural mechanisms supporting visual short-term memory for objects. Nature, 440, 91–95. doi:10.1038/nature04262
Xu, Y., & Chun, M. M. (2009). Selecting and perceiving multiple visual objects. Trends in Cognitive Sciences, 13, 167–174. doi:10.1016/j.tics.2009.01.008
Yantis, S., Schwarzbach, J., Serences, J. T., Carlson, R. L., Steinmetz, M. A., Pekar, J. J., et al. (2002). Transient neural activity in human parietal cortex during spatial attention shifts. Nature Neuroscience, 5, 995–1002. doi:10.1038/nn921
Zhang, W., & Luck, S. J. (2008). Discrete fixed-resolution representations in visual working memory. Nature, 453, 233–235. doi:10.1038/nature06860