Set Size Effects in the Macaque Striate Cortex
Rogier Landman1,2, Henk Spekreijse1, and Victor A. F. Lamme2,3
Abstract
Attentive processing is often described as a competition for resources among stimuli by mutual suppression. This is supported by findings that activity in extrastriate cortex is suppressed when several stimuli are presented simultaneously, compared to a single stimulus. In this study, we randomly varied the number of simultaneously presented figures (set size) in an attention-demanding change detection task, while we recorded multiunit activity in striate cortex (V1) in monkeys. After figure–background segregation, activity was suppressed as set size increased. This effect was stronger and started earlier among cells stimulated by the background than those stimulated by the figures themselves. As a consequence, contextual modulation, a correlate of figure–background segregation, increased with set size, approximately 100 msec after its initial generation. The results indicate that suppression of responses under increasing attentional demands differentially affects figure and background responses in area V1.
1 Graduate School of Neurosciences, Amsterdam, The Netherlands, 2 The Netherlands Ophthalmic Research Institute, 3 University of Amsterdam
© 2003 Massachusetts Institute of Technology
INTRODUCTION
As we look around in the world, many objects are simultaneously in our visual field. However, we do not fully process everything at once. After a “preattentive” stage through which all visual stimuli go, a limited number of items is selected by attention for more thorough processing (Wolfe, 1998, 2000; Treisman, 1988; Neisser, 1967). During the last decades, we have started to unravel the brain mechanisms underlying the attentional selection process. Neural correlates of attention usually involve enhanced responding to the attended stimulus (e.g., McAdams & Maunsell, 1999; Luck, Chelazzi, Hillyard, & Desimone, 1997; Motter, 1993). Another essential part of the cortical mechanism of attention is thought to be suppression of responses when stimuli simultaneously compete for attentional resources, mainly in the extrastriate cortex (Kastner & Ungerleider, 2000; Desimone & Duncan, 1995). Both types of mechanisms are represented in the biased competition model of attention (Desimone & Duncan, 1995), where objects compete for neural representation by mutual suppression, and attending to a stimulus biases the competition in favor of the attended stimulus. Suppression is found when the activity evoked by a single stimulus is compared with the activity evoked by that same stimulus with other stimuli presented simultaneously, that is, when “set size” increases. Ample evidence for both suppression and enhancement effects in attention has been found in the extrastriate cortices of
monkeys and humans, including inferotemporal cortex (Rolls & Tovee, 1995; Miller, Gochin, & Gross, 1993; Sato, 1988), medial superior temporal and middle temporal cortices (Kastner, De Weerd, Desimone, & Ungerleider, 1998; Recanzone, Wurtz, & Schwarz, 1997), V4, and V2 (Reynolds, Chelazzi, & Desimone, 1999; McAdams & Maunsell, 1999; Kastner et al., 1998; Luck et al., 1997). Suppression appears to be scaled to the receptive field (RF) size (Kastner et al., 2001), and is especially strong when all the stimuli are within the RF of a given neuron (Reynolds et al., 1999).
In striate cortex (V1), as far as suppression with increasing set size has been found, it is associated with preattentive image segmentation, rather than with attention (Li, 2000; Knierim & Van Essen, 1992; Allman, Miezin, & McGuinness, 1985). V1 cells, which are highly responsive to a single line element, give a suppressed response when that line is embedded in a field of lines with the same orientation. This is probably due to inhibitory interactions within V1. When the line is surrounded by lines of a different orientation, suppression is weaker (Knierim & Van Essen, 1992; Allman et al., 1985). This type of “contextual modulation” is also found under anesthesia, supporting the idea that the underlying mechanism is preattentive (Nothdurft, Gallant, & Van Essen, 1999), although attention may influence this type of activity (Ito & Gilbert, 1999).
Contextual modulation is also found at a larger spatial scale, but may be due to feedback from extrastriate cortical areas (Lamme, Super, & Spekreijse, 1998). When a patch of homogeneously oriented line elements is surrounded by a similar texture with a different orientation, the resulting percept is one of a “figure” on a “background.” The V1 response in the background is lower than inside the figure (Rossi, Desimone, & Ungerleider, 2001; Lee, Mumford, Romero, & Lamme, 1998; Zipser, Lamme, & Schiller, 1996; Lamme, 1995; Allman et al., 1985). Although this type of texture discontinuity is thought to be processed preattentively (Nothdurft, 1991, 1993; Bergen & Julesz, 1983), the interior enhancement of figures is absent under anesthesia (Lamme, Zipser, & Spekreijse, 1998). In humans, metabolic activity in the background around a grating is suppressed, yielding a similar difference between figure and ground, but this is only found when the grating is attended (Smith, Singh, & Greenlee, 2000; Vanduffel, Tootell, & Orban, 2000). When more than one figure has to be attended, competitive interactions may affect the response to the figures.
To investigate whether contextual modulation of the type reported by Lamme and others is subject to attentional mechanisms, we recorded multiunit activity in area V1 in monkeys doing an attentionally demanding change detection task, with a varied number of rectangles in textured displays (see example in Figure 1A). The number of rectangles was 1, 2, 4, or 8, randomly chosen before each trial. A trial consisted of the successive presentation of a fixation screen, stimulus 1, an interval during which the screen was gray, and stimulus 2. Stimulus 2 differed from stimulus 1 in that one randomly chosen rectangle had undergone a 90° change in global orientation. The monkeys were trained to fixate on a dot in the center throughout the trial, until stimulus 2 appeared and the fixation point disappeared. At that moment, the monkeys were rewarded for making a saccade to the rectangle that changed orientation. A schematic overview of the stimulus sequence is shown in Figure 1B. In paradigms such as this, attention is thought to be an important factor determining whether changes are seen (Rensink, 2000, 2002). Similar tasks have been shown to become more difficult as set size increases, where humans are able to monitor four to five objects for change (Rensink, 2000; Luck & Vogel, 1997).
In our present study, we found that as set size increases, general activity in V1 decreases, and contextual modulation increases. The results further indicate that suppression related to increasing set size in area V1 differentially affects figure and background responses after initial figure–background segregation has taken place. The behavioral results have been published in preliminary form (Landman, Spekreijse, & Lamme, 2001).
Figure 1. Materials, methods, and behavioral results. (A) Example of a stimulus display. (B) Schematic representation of a trial with an illustration of a typical neural response in V1 and our window of analysis. (C) Bar graph of change detection accuracy with a varied number of figures for the monkeys Uri (white bars) and Toni (black bars), represented as the fraction of saccades that were correctly directed at the figure that changed (the “target”). Horizontal gray lines indicate chance level. Error bars indicate standard error of mean. (D) Illustration of the stimulus configuration when an RF (circle) is stimulated by a “figure” or “background” when set size is 4.
Journal of Cognitive Neuroscience 15:6, pp. 873–882
RESULTS
Behavior: Change Detection Accuracy Decreases with Increasing Set Size
Figure 1C shows the percentage of correct trials for each set size. The accuracy of change detection, as measured by the fraction of trials in which the animals correctly made a saccade to the object that changed, decreased as a function of set size (significant in both monkeys, p < .001, ANOVA). In other words, a change to a figure that was easy to detect in isolation became hard to detect when more objects were present, showing that monkeys suffer from change blindness just like humans. Performance was above chance, indicating that the animals were not merely guessing. Average fractions correct, corrected for chance, were −0.13, 0.19, 0.26, and 0.125 for set sizes 1, 2, 4, and 8, respectively.
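The text does not spell out how the fractions correct were corrected for chance. A minimal sketch, assuming that chance level equals 1 / set size (a saccade to a randomly chosen figure) and that the correction is a plain subtraction, is given below. Both the formula and the name `chance_corrected` are assumptions for illustration, not the authors' stated method.

```python
def chance_corrected(fraction_correct, set_size):
    """Subtract an ASSUMED chance level of 1 / set_size from the raw fraction correct.

    The source does not give the correction formula; subtraction is a guess.
    """
    chance_level = 1.0 / set_size
    return fraction_correct - chance_level


# Hypothetical raw accuracies, used only to illustrate the arithmetic.
print(chance_corrected(0.5, 4))   # 0.5 correct with 4 figures, chance 0.25
print(chance_corrected(1.0, 1))   # perfect performance with 1 figure, chance 1.0
```

Under this assumption the corrected value for set size 1 can never exceed zero, because chance level is then 1.0.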
Neurophysiology
Figure 1B shows the typical neural response during a trial, obtained by averaging many trials. At the onset of stimulus 1, there was a transient response, followed by a decrease. This fell to a lower level than during the fixation screen, probably because the fixation screen (line segments of random orientation) is a more powerful stimulus for V1 cells than the homogeneously oriented line segments covering the RFs in the stimulus screen. After the presentation of stimulus 1, the gray interval and stimulus 2 followed, producing two more transients, a small one and a large one. During the presentation of stimulus 2, the animal made an eye movement, which produces a surge of activity in V1. Then, the fixation screen came back on and activity returned to the initial level. The window of analysis for the present study comprised the 300 msec before and the 500 msec after stimulus 1 onset, that is, before change occurrence. This is the period during which the figures appear and the monkeys have to start spreading their attention across the figures to encode them to such an extent that they will be able to see the change when stimulus 2 appears. Note that the set size (1, 2, 4, or 8 figures) for a given trial was randomly chosen.
General V1 Activity Decreases with Increasing Set Size
We recorded multiunit activity while RFs were stimulated by a “figure” or by “background” (see illustration in Figure 1D). Normalized average activity within figures and in the background is depicted in Figure 2A, for the population of 29 multiunits from both monkeys (21 from monkey Toni and 8 from monkey Uri). These responses contain all trials in which the monkey correctly fixated until the onset of stimulus 2. The graphs show that both in the figure (leftmost graph) and in the background (middle graph), activity decreases as set size increases. Binwise ANOVAs (see Methods) indicate that a set size effect in the “figure” responses becomes significant at 354 msec after onset of stimulus 1 and stays significant until the end of the window of analysis. In the “background,” the set size effect becomes significant at 281 msec after onset of stimulus 1. The effect in the “figure” responses is clearly smaller and arises later than the effect in the “background” responses. In the scatterplots (Figure 2A), we plotted the average value of activity during the period from the moment a difference in set size became significant until stimulus 1 offset, for the set size with the highest versus the set size with the lowest activity. Each dot represents an individual recording site. These plots indicate that differences between set sizes are found in the majority of recording sites.
Background Activity, but not Figure Activity, Reaches Floor Level between Set Sizes 4 and 8
A decrease in response strength as a function of set size must have a floor level. For the “figure” responses, the decrease with set size continues at least until a set size of 8. The “background” responses decrease at least until a set size of 4, as can be seen from the graph. We were not able to get background responses for set size 8 from our main group of recording sites because there was not enough room between the figures in that condition to reliably fit an RF. However, a group of 14 recording sites was always in the background because their RFs were so close to the fixation point that when a figure was on it, it would almost overlap the fixation point. These sites were not included in the main group of 29. Their signals can give a clue as to whether the background responses can be expected to further decrease when set size exceeds 4. The rightmost graph in Figure 2A shows the mean signal for each set size in this group of recording sites. Their response is also lower as set size increases, but the signals of set sizes 4 and 8 overlap, indicating that the set size effect in the background reaches its floor level between 4 and 8 figures. The set size effect is significant for bins between 268 and 427 msec after stimulus 1 onset.
Figure–Ground Activity (Contextual Modulation) Increases with Set Size
We have previously identified the difference in firing to line segments that belong to a figure and line segments that belong to the background as a neural correlate of figure–ground segregation (Zipser et al., 1996; Lamme, 1995). Now that we have the “figure” and “background” responses for each set size, we can examine how contextual modulation is affected by set size. In Figure 2B, the figure and background signals are displayed within a single graph for each set size.
In support of previous studies (Zipser et al., 1996; Lamme, 1995), contextual modulation is present in each condition: The figure signal is higher than the background signal for all set sizes. This was true for the majority of recording sites, as can be seen in the scatterplots under the graph of each set size. These plots show mean activity after stimulus onset for figure (y-axis) and background (x-axis) for individual recording sites. The graphs show that contextual modulation increases with set size. The rightmost graph in Figure 2B summarizes these findings. Here, the figure minus background response is displayed for each set size. After stimulus 1 onset, modulation arises in each condition at about the same time: For set size 1, modulation becomes significantly higher than zero at 100 msec after stimulus 1 onset, for set size 4 at 80 msec. Later during the response, there is a significant positive effect of set size on contextual modulation. This is significant for bins between 207 and 402 msec after stimulus 1 onset. ANOVA on the average level of modulation between 200 and 500 msec with set sizes 1, 2, and 4 is significant, F(3,29) = 4.54, p = .013.
Figure 2. Neurophysiological results. (A) This figure shows a row of graphs and a row of scatterplots. The leftmost graph shows the mean “figure” response in the main group of recording sites (n = 29) for each set size. The middle graph shows the “background” responses for each set size. The rightmost graph shows the responses of a separate group of recording sites that were always in the background (“background electrodes,” n = 14). The thin dotted lines over the neural signals indicate the standard error of mean. On the x-axis, time = 0 msec marks the onset of stimulus 1. The thick, black, horizontal bars just above the x-axis indicate the periods during which set size effects are significant. In the scatterplots, we plotted the mean values of activity between the moment the set size effect becomes significant and the offset of stimulus 1, for the set sizes in which the highest and lowest values were measured. Thus, in the leftmost plot (“figure” activity), set sizes 1 and 8 are plotted against each other, in the middle plot (“background” activity) set sizes 1 and 4, and on the right (“background electrodes”) also set sizes 1 and 4. Each point in the graph represents one recording site. If there is no difference between the set sizes, the dots will fall directly on the diagonal, whereas if one is bigger than the other, the dots will fall on that condition’s side of the diagonal. (B) Contextual modulation. From left to right, the first three graphs show the “figure” and “background” signal for set sizes 1, 2, and 4. In all set sizes, the “figure” signal is higher than the “background” signal (contextual modulation). Time = 0 msec marks the onset of stimulus 1. Under the graph for each set size, scatterplots show the mean value of activity after stimulus onset for “figure” (y-axis) and “background” (x-axis) plotted against each other. Each point represents a recording site. The rightmost graph entitled “modulation” shows the signals when the “background” signal is subtracted from the “figure” signal for each set size. The dotted vertical line indicates the moment at which modulation becomes significant (averaged across all set sizes). The thick, black, horizontal bars just above the x-axis indicate the periods during which set size effects are significant. The thin dotted lines over the neural signals indicate the standard error of mean.
Control for Differences in Fixation
One could argue that these effects are due to differences in fixation between the conditions. Eye movements during fixation, including drifting eye movements and microsaccades, can influence the activity of V1 neurons (Snodderly, Kagan, & Gur, 2001; Leopold & Logothetis, 1998). If there is a great deal of eye movement during fixation, the variation in eye positions will be larger than if there is little eye movement. To control for differences in fixation, the standard deviation of the vector described by the two eye movement channels (x, y) was calculated for each trial. Thus, each trial got one standard deviation value. Next, we stratified figure and
background trials for each set size. By this we mean that the range of values was split into 20 bins, and trials were randomly selected until all conditions had an equal number of trials in each bin. A large amount of data had to be discarded this way, because for each bin, the condition with the smallest number of trials dictates how many the other conditions will have. However, this ensured that the distributions of figure and background trials with respect to the standard deviation in eye position were identical. Figure 3A shows the average figure and background signals after stratification. The decrease of activity with increasing set size is still present in both figure and background in the main group of sites. The difference between set sizes 1 and 8 in the “figure” response is actually larger after stratification. The difference between set sizes 2 and 4 has disappeared. The difference between set sizes 1 and 8 in the figure response is significant from 334 msec after stimulus onset, and in the background from 261 msec. In the “background electrodes,” activity decreases as set size is increased from 1 to 4, but activity in set size 8 is higher than in set size 4. Thus, the background decrease with set size reaches its floor level between 4 and 8 in the stratified response as well as in the original response. Figure 3B shows contextual modulation for set sizes 1, 2, and 4 in the stratified data. ANOVA on the average level of modulation between 200 and 500 msec with set sizes 1, 2, and 4 is significant, F(3,29) = 5.10, p = .008, although the difference between set sizes 1 and 2 is diminished.
Figure 3. Neurophysiological results after stratification based on variation in eye position vector. (A) The leftmost graph shows the mean “figure” response for each set size (n = 29). The middle graph shows the “background” responses in the same group of sites. The rightmost graph shows the responses of “background electrodes,” a group of recording sites that were always in the background (n = 14). x- and y-axes are as in Figure 2A. Time = 0 msec marks the onset of stimulus 1. The thick, black, horizontal bars just above the x-axis indicate the periods during which set size effects are significant. The thin dotted lines over the neural signals indicate the standard error of mean. (B) Contextual modulation. From left to right, the first three graphs show the “figure” and “background” signals for set sizes 1, 2, and 4. Axes are as in Figure 2B. Time = 0 msec marks the onset of stimulus 1. The thick, black, horizontal bars just above the x-axis indicate the periods during which set size effects are significant. The thin dotted lines over the neural signals indicate the standard error of mean.
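The stratification used in the fixation control can be sketched as follows. This is an illustrative reading, not the authors' code. The function names, the fixed random seed, and the interpretation of "standard deviation of the vector" as the root of the summed x and y variances are all assumptions.

```python
import math
import random
import statistics


def eye_position_sd(xs, ys):
    """One variability value per trial.

    ASSUMPTION: the spread of the (x, y) vector is taken as the square root
    of the summed population variances of the two eye movement channels.
    """
    return math.sqrt(statistics.pstdev(xs) ** 2 + statistics.pstdev(ys) ** 2)


def stratify(values_by_condition, n_bins=20, seed=0):
    """Select trials so every condition has the same trial count in every bin.

    values_by_condition maps a condition name to a list of per-trial SD values.
    Returns a dict mapping each condition to the sorted indices of kept trials.
    """
    rng = random.Random(seed)
    all_values = [v for vals in values_by_condition.values() for v in vals]
    lo, hi = min(all_values), max(all_values)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # all values identical: everything falls in bin 0

    def bin_of(value):
        # Clamp so the maximum value lands in the last bin.
        return min(int((value - lo) / width), n_bins - 1)

    # Group trial indices by bin, separately for each condition.
    binned = {c: [[] for _ in range(n_bins)] for c in values_by_condition}
    for cond, vals in values_by_condition.items():
        for index, value in enumerate(vals):
            binned[cond][bin_of(value)].append(index)

    # In each bin, the condition with the fewest trials dictates the count.
    selected = {c: [] for c in values_by_condition}
    for b in range(n_bins):
        n_keep = min(len(binned[c][b]) for c in binned)
        for cond in binned:
            selected[cond].extend(rng.sample(binned[cond][b], n_keep))
    return {c: sorted(idx) for c, idx in selected.items()}


# Hypothetical per-trial SD values for two conditions.
demo = {"figure": [0.1, 0.2, 0.9, 0.95, 0.5], "background": [0.15, 0.92, 0.55, 0.56]}
kept = stratify(demo, n_bins=2)
print({cond: len(idx) for cond, idx in kept.items()})
```

Because the smallest condition sets the count in every bin, the number of discarded trials grows quickly when the distributions differ, which matches the data loss noted in the text.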
DISCUSSION
We have described the effects of varying the number of figures (set size) in a change detection task on behavioral performance and activity in area V1. The behavioral results show that the task becomes more difficult as set size increases. Multiunit activity in area V1 decreases with set size late during the response (significant after 260 msec). The decrease is stronger for the background than for figures; therefore, contextual modulation increases with set size. This occurs after the initial generation of contextual modulation, which occurs at 80–100 msec. The background response reaches floor level between set sizes 4 and 8.
The decrease in change detection performance with increasing set size in our monkeys corroborates findings in humans (Rensink, 2000; Smilek, Eastwood, & Merikle, 2000; Luck & Vogel, 1997). This decrease is likely due to increasing demands on attention and working memory as set size increases. Attention to the relevant stimulus in the prechange display is thought to promote its encoding into memory, which is necessary for a change to be detected when the postchange display appears (Rensink, 2002). Both attentional and working memory resources are limited. The more stimuli there are, the stronger the competition for resources (Desimone & Duncan, 1995); therefore, errors become more likely.
Did we successfully manipulate the attentional demand of the task by manipulating set size? We think we did. However, it should be noted that attention can be viewed in many different ways. It can be viewed as something under the control of the observer (top-down control), such as the effort put into the task, and biased processing in accord with behavioral goals, or it can be viewed as something that takes place even while top-down factors remain unchanged (bottom-up control). Bottom-up factors include stimulus properties, such as the relative appearance of the items in the display.
Often, the case is made for an interaction of both (Treue, 2001; Kastner & Ungerleider, 2000; Desimone & Duncan, 1995). In this study, top-down factors were not explicitly manipulated, that is, we did not include a “no-attention” condition. In some sense, set size 1 can be viewed as a no-attention condition: Given the absence of trials in which no change occurred, there was no requirement to attend to the first array with set size 1. We cannot know, however, whether the monkey put less effort in the set size 1 condition than in other conditions. Another way to obtain “no-attention” data is to compare trials in which the monkeys correctly made a saccade to the changing rectangle with trials in which the monkeys failed to do so (Landman et al., 2001). It was found that activity in the first array for the rectangle that is going to change is lower in incorrect trials than in correct trials. This difference increases with set size. No effects were present in the background (Landman et al., 2001). Accepting that bottom-up information is the same in correct and incorrect trials for the same set size, this implies that the set size effect for the figure response is stronger when factors on the side of the monkey (top-down factors) are in a less optimal state. Thus, suppressive interactions may occur in a bottom-up fashion, while top-down factors (if in optimal state) may counteract suppression according to the animals’ goals (Kastner et al., 1998; Desimone & Duncan, 1995), in our case processing the rectangles and not the background. When viewed this way, the increase in contextual modulation with set size could reflect an increase in top-down effort.
Could it be that our effects are merely due to the presence of additional items, instead of due to attentional demand? For example, the area in the display occupied by figures and the area occupied by background varied across conditions. However, this is not a suitable explanation since it would predict opposite effects for figure and background. With increasing set size, the background area decreased and the figure area increased, while the decrease in activity with increasing set size occurred in the figure as well as in the background. It could be that cortical activity decreases because attention is distributed over a larger area of the display. However, that does not explain why the decrease is stronger in the background than inside the figures.
One could also argue for an explanation in terms of the perceptual load that comes about with increasing set size. It has been shown that cortical activity evoked by peripheral input decreases when perceptual load of a target at fixation is increased (Handy, Soltani, & Mangun, 2001; Smith et al., 2000). From those observations, it was concluded that increasing the load of targets reduces the amount of residual attentional capacity available for allocation to task-irrelevant locations. In our task, the increase in set size may have reduced the amount of capacity available to represent the background area. Since every rectangle was a potential target, the amount of capacity available for each rectangle may decrease in a similar way.
One additional mechanism by which the stronger suppression in the background could be explained has been proposed by Tsotsos (1990, 1993) in the form of an “inhibitory beam” around attended stimuli. According to Tsotsos, the processing of information is gated not by enhancement of relevant stimuli, but by active inhibition of everything that is irrelevant, thus increasing signal-to-noise ratio. In humans, metabolic activity in the background around a single grating in area V1 was found to be suppressed when the grating was attended. The response to the grating itself was not enhanced (Smith et al., 2000; Vanduffel et al., 2000). Our results suggest that this effect becomes stronger when attentional demand increases in the form of an increase in set size. Our finding that background suppression reaches a floor level between set sizes 4 and 8 indicates that this mechanism has a limit. Perhaps this marks the limit of attentional capacity. The estimated limit of attention to track independent objects is about 4 (Pylyshyn & Storm, 1988).
Suppression among items in area V1 in previous studies was either very small or absent (Kastner et al., 2001; Luck et al., 1997). One explanation for this may be that our figure–ground displays are very potent stimuli for V1 neurons, whereas other studies used stimuli that typically activate areas located upstream in the cortical hierarchy. In extrastriate cortex, suppression effects scale with RF size (Kastner & Ungerleider, 2000) and suppression effects are strongest when all competing stimuli are inside a given cell’s RF (Moran & Desimone, 1985). Here, the RFs were completely covered by a rectangle, while other rectangles were several degrees away. It could be that the current effects arise indirectly from competition taking place in higher areas where RFs encompass multiple stimuli at once. Alternatively, the competition could involve more distributed neuronal networks, including V1 as well as higher visual areas.
Although contextual modulation increased with set size from approximately 200 msec after stimulus onset, modulation was already present at approximately 100 msec after stimulus onset, independent of set size.
Since we attribute the set size effects to attentional mechanisms, this implies that contextual modulation arises preattentively. This is a reasonable idea because psychophysically, the detection of texture discontinuities also appears to be a preattentive process (Nothdurft, 1991, 1993; Bergen & Julesz, 1983). However, several findings suggest that contextual modulation related to figure–ground segregation is different from that related to other preattentive phenomena, such as “pop-out” of a single line element. First, animals must be awake for interior enhancement of figures to arise (Lamme, Zipser, et al., 1998), whereas “pop-out” and contour enhancement occur even under anesthesia (Nothdurft, Gallant, & Van Essen, 1999). Second, the latency of figure–ground modulation is longer than that of modulation evoked by pop-out (Nothdurft et al., 1993; Knierim & Van Essen, 1992) or texture contours (Lamme, Rodriguez-Rodriguez, & Spekreijse, 1999; Nothdurft et al., 1993). The latter seems to be mediated by lateral inhibition from within V1 (Gilbert & Wiesel, 1990), whereas interior enhancement of figures is thought to be due to feedback from higher areas (Hupe et al., 1998; Lamme, Super, et al., 1998), similar to the type of reentrant processing discussed by others (Di Lollo, Enns, & Rensink, 2000). This suggests that figure–ground modulation reflects a more advanced stage in processing than modulation evoked by some other preattentive phenomena, possibly an intermediate level of processing, on which attention can subsequently operate (Gilbert, Ito, Kapadia, & Westheimer, 2000; Lee et al., 1998; He & Nakayama, 1994; Marr, 1982).
METHODS
Animals and Stimuli
Experiments were performed with two macaque monkeys (Macaca mulatta). Monkeys were seated in a primate chair with the head restrained at a distance of 75 cm from a 21-in. computer monitor. Display resolution was 1024 × 768 pixels, refresh rate 72.4 Hz. The screen subtended 21 × 28° of visual angle. Each trial consisted of the successive presentation of a fixation screen, stimulus 1, an interval during which the screen was gray, and stimulus 2. The fixation screen consisted of black line segments with a random orientation on a white background (see example in Figure 1A). Stimulus 1 and 2 consisted of diagonal line segments in which 1, 2, 4, or 8 rectangles were defined by orthogonality of the line segments, creating the perception of “figures” on a “background” (Bergen & Julesz, 1983; Nothdurft, 1991, 1993). The rectangles were positioned horizontally or vertically. The size of the rectangles was 1 × 1.4° in monkey Uri and 2 × 3.2° in monkey Toni. The average eccentricity of the rectangles was 2.5° in monkey Uri and 5° in monkey Toni. Thus, the distance between the edge of a figure and the edge of the nearest figure in the display for set sizes 2, 4, and 8 would be about 3.6°, 2.5°, and 0.6°, respectively, in monkey Uri, and 6.8°, 4.7°, and 0.7° in monkey Toni. The exact locations were chosen so as to cover the RF either with a rectangle or with background. All screens except stimulus 2 had a 0.2° red dot in the middle. Apart from that, the difference between stimuli 1 and 2 was that one of the rectangles had a different orientation. That is, if it was vertical in stimulus 1, it was horizontal in stimulus 2, or the other way around. Line orientations, orientation of the rectangles, and target figure were randomly chosen prior to each trial. For a schematic illustration of the task, see Figure 1B. The animals were trained to fixate at the dot in the middle of the screen while the fixation screen was on.
When the eyes remained within 0.5° of the fixation dot for 300 msec, stimulus 1 appeared. Stimulus 1 remained on screen for 500 msec, provided the animal kept fixating, otherwise the trial was aborted. After a subsequent gray interval, stimulus 2 appeared. The duration of the gray interval was 200 msec. In monkey Toni, stimulus 1 was shown for a longer time (maximum 700 msec) in 11 out of 35 recording sessions and the interval was shorter (minimum 100 msec) in 8 out of 35 sessions. With the onset of stimulus 2, the fixation dot also disappeared, which was the sign for the monkeys to make an eye movement to the figure that had changed its orientation. A trial was considered correct when the animal made an eye movement to the changed figure within 800 msec. For this, they were rewarded with a sip of apple juice. Incorrect trials were those in which the animal kept his eyes on the fixation dot until the gray screen appeared but did not make any eye movement or made an eye movement elsewhere. Trials in which fixation was broken prior to the onset of stimulus 2 were discarded.
Surgical Procedures
After they learned the task, the monkeys underwent two operations under general anesthesia. Anesthesia was induced with ketamine (15 mg kg⁻¹ im) and was maintained after intubation by ventilation with a mixture of 70% N2O and 30% O2, supplemented with 0.8% isoflurane, fentanyl (0.005 mg kg⁻¹ iv), and midazolam (0.5 mg kg⁻¹ h⁻¹ iv). In the first operation, a head holder was implanted. In addition, a gold-plated copper ring was implanted under the conjunctiva of one eye for the measurement of eye position. In the second operation, 40–50 Teflon-coated platinum–iridium wires were implanted chronically in area V1. The tips of the wires were positioned 0.5–2 mm below the cortical surface. The animals recovered for at least 21 days before training was resumed and data collection was initiated. All procedures complied with the NIH Guide for Care and Use of Laboratory Animals (National Institutes of Health, Bethesda, MD) and were approved by the Institutional Animal Care and Use Committee of the Royal Netherlands Academy of Arts and Sciences.
Recording and Data Analysis
Eye movements were monitored using the double magnetic induction method (Bour, van Gisbergen, Bruijns, & Ottes, 1984) and digitized at 400 Hz.
Multiunit activity was recorded with surgically implanted microwire electrodes on 16 channels. Aggregate RFs of the neurons contributing to each channel were assessed with moving dark bars over a bright background while the animal was fixating. RF sizes and positions were determined off-line from the responses to these stimuli. Electrodes were selected for recording based on the signal-to-noise ratio of the responses. Prior to the experiment, the location and diameter of each RF were determined as the area in which the response was greater than one-third of the maximal response when a bar moved across. The eccentricity of the RFs found this way ranged from 1° to 5°. RF diameter ranged from 0.4° to 0.8°. The data we report are from recordings of 29 sites altogether (21 in monkey Toni and 8 in monkey Uri).
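The one-third criterion for RF extent can be illustrated with a minimal Python sketch. It is not the analysis code used in the study. The function name `rf_extent`, the one-dimensional bar sweep, and the assumption that the supra-threshold region is contiguous are all hypothetical choices made for illustration.

```python
def rf_extent(positions_deg, responses, fraction=1 / 3):
    """Estimate RF extent along one bar sweep (illustrative sketch only).

    positions_deg: bar positions in degrees of visual angle (assumed input).
    responses:     mean response at each bar position (assumed input).
    Returns (start, end, diameter) of the region where the response exceeds
    `fraction` of the maximal response, or None if no position qualifies.
    Assumes the supra-threshold region is contiguous.
    """
    if len(positions_deg) != len(responses) or not responses:
        raise ValueError("positions and responses must be non-empty and equal in length")
    threshold = fraction * max(responses)
    above = [p for p, r in zip(positions_deg, responses) if r > threshold]
    if not above:
        return None
    start, end = min(above), max(above)
    return start, end, end - start


if __name__ == "__main__":
    sweep_positions = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2]
    sweep_responses = [1, 2, 6, 9, 7, 2, 1]
    print(rf_extent(sweep_positions, sweep_responses))
```

In this sketch the RF center could be taken as the midpoint of the returned interval; a two-dimensional estimate would repeat the procedure for sweeps in several directions.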
Journal of Cognitive Neuroscience
Before recording from a unit while the animals did the change detection task, we rotated and sized the imaginary circle of possible figure locations in such a way that the RF was within a figure in certain trials and in the background in others (see illustration in Figure 1D). We compared figure and background responses as well as figure-minus-background responses (contextual modulation) in conditions with different numbers of figures on the screen. We analyzed the neural responses within an 800-msec window running from the onset of fixation until 500 msec after stimulus 1 onset, at which time, in most sessions, the gray interval began. The mean of the signal at 0–61 msec after the monkeys started fixating was subtracted from the activity level during the entire epoch. The displayed responses were smoothed with a box-car algorithm. Signals for each electrode were normalized as follows. We calculated an average signal for each condition (background responses in correct trials, background responses in incorrect trials, target responses in correct trials, target responses in incorrect trials, nontarget responses in correct trials, and nontarget responses in incorrect trials). For each electrode, the maximum value across these averages was calculated, and the average of each condition was divided by that maximum. This way, every electrode contributed equally to the population average, regardless of absolute differences in firing rate, while differences between conditions were maintained. Trials with 1, 2, 4, and 8 figures ("set size") were randomly mixed, and we planned to look for set size effects in the neural response. When a set size effect appeared to be present in the average signal, we determined the time after stimulus 1 onset at which it became significant using the following method. For each recording site, the part of the response during which stimulus 1 was present was divided into bins of 12.2 msec.
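The baseline correction and per-electrode normalization described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the original analysis code. The function names, the dictionary layout (condition name mapped to a trial-averaged trace), and the reading of "maximum value across these averages" as the largest sample over all condition traces are hypothetical.

```python
from statistics import mean


def subtract_baseline(signal, n_baseline):
    """Subtract the mean of the first n_baseline samples from the whole epoch."""
    baseline = mean(signal[:n_baseline])
    return [x - baseline for x in signal]


def normalize_conditions(condition_averages):
    """Scale one electrode's condition averages by their common maximum.

    condition_averages: dict mapping a condition name to its trial-averaged
    trace (assumed layout). Assumption: the maximum is taken over all samples
    of all condition traces for this electrode.
    """
    peak = max(max(trace) for trace in condition_averages.values())
    if peak <= 0:
        raise ValueError("maximum must be positive to normalize")
    return {name: [x / peak for x in trace]
            for name, trace in condition_averages.items()}


def population_average(electrodes):
    """Average the normalized traces of each condition across electrodes."""
    names = electrodes[0].keys()
    return {name: [mean(samples) for samples in zip(*(e[name] for e in electrodes))]
            for name in names}


if __name__ == "__main__":
    weak = {"figure": [0, 2, 4], "background": [0, 1, 2]}
    strong = {"figure": [0, 20, 40], "background": [0, 10, 20]}
    normalized = [normalize_conditions(weak), normalize_conditions(strong)]
    print(population_average(normalized))
```

Because every trace of an electrode is divided by the same number, the ratio between conditions is unchanged, while a high-amplitude site no longer dominates the population average.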
Next, ANOVAs were done for each successive bin, with the electrodes as a group. When the difference was significant (α = .05) for five bins in a row, the start time of the first bin was taken as the onset of the effect.

Acknowledgments

Rogier Landman is supported by a grant from the Netherlands Organization for Scientific Research (NWO). We thank Kor Brandsma, Peter Brassinga, Jacques de Feiter, and Hans Meester for technical support. Reprint requests should be sent to Rogier Landman, The Netherlands Ophthalmic Research Institute, Meibergdreef 47, 1105 BA Amsterdam, The Netherlands, or via e-mail:
[email protected].
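As an illustration of the onset criterion described under Recording and Data Analysis (significance in five consecutive 12.2-msec bins), the following Python sketch finds the onset from a sequence of per-bin p values. It is a hedged sketch, not the original code. The function name `effect_onset_ms` is hypothetical, the p values are assumed to come from a per-bin ANOVA computed elsewhere, and bin 0 is assumed to start at stimulus 1 onset.

```python
def effect_onset_ms(p_values, bin_ms=12.2, alpha=0.05, run_length=5):
    """Return the start time (msec) of the first run of significant bins.

    p_values: one p value per successive time bin (assumed to be computed
    elsewhere, for example by a per-bin ANOVA across electrodes).
    A bin counts as significant when p < alpha. The onset is the start of
    the first bin in the first run of `run_length` consecutive significant
    bins. Returns None when no such run exists.
    """
    run = 0
    for index, p in enumerate(p_values):
        run = run + 1 if p < alpha else 0
        if run == run_length:
            first_bin = index - run_length + 1
            return first_bin * bin_ms
    return None


if __name__ == "__main__":
    example_p = [0.50, 0.01, 0.20, 0.01, 0.01, 0.01, 0.01, 0.01, 0.30]
    print(effect_onset_ms(example_p))
```

Requiring a run of significant bins, rather than a single one, guards against isolated bins that cross the threshold by chance, as the lone significant bin at index 1 in the example does.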
REFERENCES
Allman, J., Miezin, F., & McGuinness, E. (1985). Stimulus specific responses from beyond the classical receptive field: Neurophysiological mechanisms for local–global comparisons in visual neurons. Annual Review of Neuroscience, 8, 407–430.
Bergen, J. R., & Julesz, B. (1983). Parallel versus serial processing in rapid pattern discrimination. Nature, 303, 696–698.
Bour, L. J., van Gisbergen, J. A., Bruijns, J., & Ottes, F. P. (1984). The double magnetic induction method for measuring eye movement - results in monkey and man. IEEE Transactions on Biomedical Engineering, 31, 419–427.
Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18, 193–222.
Di Lollo, V., Enns, J. T., & Rensink, R. A. (2000). Competition for consciousness among visual events: The psychophysics of reentrant visual processes. Journal of Experimental Psychology: General, 129, 481–507.
Gilbert, C., Ito, M., Kapadia, M., & Westheimer, G. (2000). Interactions between attention, context and learning in primary visual cortex. Vision Research, 40, 1217–1226.
Gilbert, C. D., & Wiesel, T. N. (1990). The influence of contextual stimuli on the orientation selectivity of cells in primary visual cortex of the cat. Vision Research, 30, 1689–1701.
Handy, T. C., Soltani, M., & Mangun, G. R. (2001). Perceptual load and visuocortical processing: Event related potentials reveal sensory-level selection. Psychological Science, 12, 213–218.
He, Z. J., & Nakayama, K. (1994). Perceiving textures: Beyond filtering. Vision Research, 34, 151–162.
Hupe, J. M., James, A. C., Payne, B. R., Lomber, S. G., Girard, P., & Bullier, J. (1998). Cortical feedback improves discrimination between figure and background by V1, V2 and V3 neurons. Nature, 394, 784–787.
Ito, M., & Gilbert, C. D. (1999). Attention modulates contextual influences in the primary visual cortex of alert monkeys. Neuron, 22, 593–604.
Kastner, S., De Weerd, P., Desimone, R., & Ungerleider, L. G. (1998). Mechanisms of directed attention in the human extrastriate cortex as revealed by functional MRI. Science, 282, 108–111.
Kastner, S., De Weerd, P., Pinsk, M. A., Elizondo, M. I., Desimone, R., & Ungerleider, L. G. (2001). Modulation of sensory suppression: Implications for receptive field sizes in the human visual cortex. Journal of Neurophysiology, 86, 1398–1411.
Kastner, S., & Ungerleider, L. G. (2000). Mechanisms of visual attention in the human cortex. Annual Review of Neuroscience, 23, 315–341.
Knierim, J. J., & Van Essen, D. C. (1992). Neuronal responses to static texture patterns in area V1 of the alert macaque monkey. Journal of Neurophysiology, 67, 961–980.
Lamme, V. A., Rodriguez-Rodriguez, V., & Spekreijse, H. (1999). Separate processing dynamics for texture elements, boundaries and surfaces in primary visual cortex of the macaque monkey. Cerebral Cortex, 9, 406–413.
Lamme, V. A., Super, H., & Spekreijse, H. (1998). Feedforward, horizontal, and feedback processing in the visual cortex. Current Opinion in Neurobiology, 8, 529–535.
Lamme, V. A. F. (1995). The neurophysiology of figure–ground segregation in primary visual cortex. Journal of Neuroscience, 15, 1605–1615.
Lamme, V. A. F., Zipser, K., & Spekreijse, H. (1998). Figure–ground activity in primary visual cortex is suppressed by anesthesia. Proceedings of the National Academy of Sciences, U.S.A., 95, 3263–3268.
Landman, R., Spekreijse, H., & Lamme, V. A. (2001). Figure/background segregation in primary visual cortex (V1) predicts change detection. Society for Neuroscience Abstracts, 27, 721.6.
Lee, T. S., Mumford, D., Romero, R., & Lamme, V. A. (1998). The role of the primary visual cortex in higher level vision. Vision Research, 38, 2429–2454.
Leopold, D. A., & Logothetis, N. K. (1998). Microsaccades differentially modulate neural activity in the striate and extrastriate visual cortex. Experimental Brain Research, 123, 341–345.
Li, Z. (2000). Pre-attentive segmentation in the primary visual cortex. Spatial Vision, 13, 25–50.
Luck, S. J., Chelazzi, L., Hillyard, S. A., & Desimone, R. (1997). Neural mechanisms of spatial selective attention in areas V1, V2, and V4 of macaque visual cortex. Journal of Neurophysiology, 77, 24–42.
Luck, S. J., & Vogel, E. K. (1997). The capacity of visual working memory for features and conjunctions. Nature, 390, 279–281.
Marr, D. (1982). Vision. San Francisco, CA: W.H. Freeman.
McAdams, C. J., & Maunsell, J. H. (1999). Effects of attention on orientation-tuning function of single neurons in macaque cortical area V4. Journal of Neuroscience, 19, 431–441.
Miller, E. K., Gochin, P. M., & Gross, C. G. (1993). Suppression of visual responses of neurons in inferior temporal cortex of the awake macaque by addition of a second stimulus. Brain Research, 616, 25–29.
Moran, J., & Desimone, R. (1985). Selective attention gates visual processing in the extrastriate cortex. Science, 229, 782–784.
Motter, B. C. (1993). Focal attention produces spatially selective processing in visual cortical areas V1, V2, and V4 in the presence of competing stimuli. Journal of Neurophysiology, 70, 909–919.
Neisser, U. (1967). Cognitive psychology. New York: Appleton-Century-Crofts.
Nothdurft, H. C. (1991). Texture segmentation and pop-out from orientation contrast. Vision Research, 31, 1073–1078.
Nothdurft, H. C. (1993). The role of features in preattentive vision: Comparison of orientation, motion and color cues. Vision Research, 33, 1937–1958.
Nothdurft, H. C., Gallant, J. L., & Van Essen, D. C. (1999). Response modulation by texture surround in primate area V1: Correlates of "popout" under anesthesia. Visual Neuroscience, 16, 15–34.
Pylyshyn, Z. W., & Storm, R. W. (1988). Tracking multiple independent targets: Evidence for a parallel tracking mechanism. Spatial Vision, 3, 179–197.
Recanzone, G. H., Wurtz, R. H., & Schwarz, U. (1997). Responses of MT and MST neurons to one and two moving objects in the receptive field. Journal of Neurophysiology, 78, 2904–2915.
Rensink, R. A. (2000). Visual search for change: A probe into the nature of attentional processing. Visual Cognition, 7, 345–376.
Rensink, R. A. (2002). Change detection. Annual Review of Psychology, 53, 245–277.
Reynolds, J. H., Chelazzi, L., & Desimone, R. (1999). Competitive mechanisms subserve attention in macaque areas V2 and V4. Journal of Neuroscience, 19, 1736–1753.
Rolls, E. T., & Tovee, M. J. (1995). The responses of single neurons in the temporal visual cortical areas of the macaque when more than one stimulus is present in the receptive field. Experimental Brain Research, 103, 409–420.
Rossi, A. F., Desimone, R., & Ungerleider, L. G. (2001). Contextual modulation in primary visual cortex of macaques. Journal of Neuroscience, 21, 1698–1709.
Sato, T. (1988). Effects of attention and stimulus interaction on visual responses of inferior temporal neurons in macaque. Journal of Neurophysiology, 60, 344–364.
Smilek, D., Eastwood, J. D., & Merikle, P. M. (2000). Does unattended information facilitate change detection? Journal of Experimental Psychology: Human Perception and Performance, 26, 480–487.
Smith, A. T., Singh, K. D., & Greenlee, M. W. (2000). Attentional suppression of activity in the human visual cortex. NeuroReport, 11, 271–277.
Snodderly, D. M., Kagan, I., & Gur, M. (2001). Selective activation of visual cortex neurons by fixational eye movements: Implications for neural coding. Visual Neuroscience, 18, 259–277.
Treisman, A. (1988). Features and objects: The fourteenth Bartlett memorial lecture. Quarterly Journal of Experimental Psychology A, 40, 201–237.
Treue, S. (2001). Neural correlates of attention in primate visual cortex. Trends in Neurosciences, 24, 295–300.
Tsotsos, J. K. (1990). Analyzing vision at the complexity level. Behavioral and Brain Sciences, 13, 423–445.
Tsotsos, J. K. (1993). An inhibitory beam for attentional selection. In L. Harris & M. Jenkin (Eds.), Spatial vision in humans and robots (pp. 313–331). Cambridge: Cambridge University Press.
Vanduffel, W., Tootell, R. B., & Orban, G. A. (2000). Attention-dependent suppression of metabolic activity in the early stages of the macaque visual system. Cerebral Cortex, 10, 109–126.
Wolfe, J. M. (1998). Visual search. In H. Pashler (Ed.), Attention. Philadelphia: Taylor & Francis.
Wolfe, J. M. (2000). Visual attention. In K. K. De Valois (Ed.), Seeing (pp. 335–386). San Diego, CA: Academic Press.
Zipser, K., Lamme, V. A. F., & Schiller, P. H. (1996). Contextual modulation in primary visual cortex. Journal of Neuroscience, 16, 7376–7389.
Volume 15, Number 6