
Oxford Handbook of Evolutionary Psychology

Edited by R.I.M. Dunbar, School of Biological Sciences, University of Liverpool, and Louise Barrett, Department of Psychology, University of Central Lancashire


CHAPTER 13

Neural pathways of social cognition

Tjeerd Jellema and David I. Perrett

13.1. A visual analysis of social behaviour without telepathy

This chapter reviews recent ideas about visual processing pathways and mechanisms in the brains of human and non-human primates that support social cognition. We will attempt to show how detection of visual cues provides a basis for guiding the observer's behaviour in ways that are based on the current and likely future behaviour of others. These visual mechanisms underpin social cognition, but do not rely on understanding others' minds. They provide what one could call a 'mechanistic' description of others' behaviour and of social events in terms of constituent components of actions, their causes and consequences, and expected future occurrence.

There are at least two features crucial for enabling such visual processing to produce descriptions of the 'mechanics' of social events.

1. It should be able to combine the description of an action with the description of contextual cues that relate to the action, because only then can the action be put in a causal setting. These contextual cues can be derived either from the agent that performed the action (such as the agent's direction of attention), or from the immediate environment where the action took place. Examples of the latter include objects the action is directed at, or the relative spatial locations occupied by the agents and observer involved.
2. Processing should be sensitive to the immediate perceptual history, because this can form the basis from which likely events or outcomes can be predicted. After all, actions within a social context evolve over time; witnessing actions and their consequences typically spans several seconds.

We will argue that the superior temporal sulcus (STS) forms the prime neural substrate for forming descriptions of the 'mechanics' of social events. This deviates from the usual view on the role of the STS, which centres on the forming of pictorial descriptions of someone's actions without any account of the physical causes. A role of the STS in forming higher-order visual descriptions of observed actions is well documented and not disputed (cf. Allison et al., 2000; Karnath, 2001). In this chapter we will qualify the nature of the higher-order descriptions by highlighting findings from single-cell recordings in the macaque monkey. These findings suggest that cell populations in the banks of the STS incorporate both features mentioned above, i.e. sensitivity to both actions and contextual cues (Jellema and Perrett, 2002, 2005; Jellema et al., 2002) and sensitivity to the perceptual history


(Zacks et al., 2001; Jellema and Perrett, 2003a), which might enable the prediction of future actions. We also note the interplay between the STS and mirror neurons in the parietal lobe, which have also been reported to code for the most likely future actions of others on the basis of the current visual input, both in macaques (Fogassi et al., 2005) and humans (Iacoboni et al., 2005).

A more sophisticated understanding of the actions of conspecifics, and of social events in general, involves the attribution of mental states to the agents involved. A purely literal description of others' actions and social events will not allow the observer to understand the behaviour of others when their goals are complex and do not bear a one-to-one relationship with the directly visible events. An example is deceit, when an agent desires object A, but deliberately pays attention to object B so as to draw the attention of a rival agent away from object A. Attribution of mental states such as intentions, beliefs, and feelings towards others, in order to explain their behaviour beyond mere physical causal relationships, is an important capacity provided potentially by phylogenetic and/or ontogenetic development (Tooby and Cosmides, 1990; Baron-Cohen, 1995, and this volume, Chapter 16; Wyman and Tomasello, this volume, Chapter 17).

In essence, such 'mentalistic' descriptions (in terms of intentions, beliefs, etc.) are what are often referred to as theory-of-mind capacities. They resemble mechanistic descriptions, except that the physical causes have been replaced by abstract 'mental' causes. Mentalistic descriptions of others' behaviour may well be built on top of mechanistic descriptions by linking the STS with descriptions of the mental states of others. The STS may allow prediction of behaviour, whereas systems built on top of it (ToM) may allow explanation of behaviour.
The idea of the STS as part of a core system, which communicates with an extended system in order to accomplish the more sophisticated aspects of social cognition, such as social learning, imitation, empathy and theory of mind, is incorporated in several neural models (Adolphs, 1999; Haxby et al., 2000, 2002; Gallese et al., 2004; Iacoboni, 2005). The extended system typically includes structures such as the amygdala, insula, orbitofrontal, cingulate and parietal cortices.

We will discuss relationships of the STS to this system and the merits of the models.

Thus, a basic principle according to which the visual system may operate to support social cognition is that it is tuned to detect contingencies between causes (be they physical or mental) and outcomes of others' behaviour, so as to allow the prediction of their most likely future behaviour. These predictions are in turn used to guide the observer's behaviour towards others.

We speculate that in neurodevelopmental disorders such as autism, malfunctioning of the extended system compromises the formation of mentalistic descriptions, while the mechanistic ones are still intact. As a result, the autistic mind tries to explain the world—including others' behaviour—in terms of physical, literal contingencies, and guides behaviour accordingly.

13.2. Breaking down bodies and reassembling them

Current ideas about where in the brain the different features of moving complex visual stimuli, such as the self-propelled actions of animate objects, are processed still rely heavily on the model of Ungerleider and Mishkin (1982). This model envisaged a separation of visual processing into two distinct cortical streams: a dorsal 'where' stream, extending from V1 into the inferior parietal cortex, primarily dealing with the spatial relationships of objects; and a ventral 'what' stream, extending from V1 into the inferior temporal cortex (IT), dealing with the shape and identity of objects (Desimone and Ungerleider, 1989).

A subsequent adaptation by Milner and Goodale (1995) questioned the strict 'what–where' dichotomy, and suggested that space and form are processed in both parietal and temporal areas, but for different purposes (Goodale et al., 1991). In their view, the ventral stream subserves visual 'perception', i.e. object and scene recognition, requiring allocentric spatial coding to represent the enduring characteristics of objects, while the dorsal stream subserves the visual control of 'action', requiring egocentric spatial coding for short-lived representations (vision for perception versus vision for action). The role of the ventral stream in the recognition of complex objects is


supported by findings showing a gradual increase in the complexity of stimuli analysed by cells from primary visual cortex to temporal cortex (Perrett and Oram, 1993). Across primate species, the size of the ventral pathway is related to social group size (Barton, this volume, Chapter 11), perhaps because, as group size increases, so does the necessity to recognize more social signals and behaviour.

Understanding the motion of animate objects is less straightforward than that of non-animate objects, whose motion can be explained by physical causality. If the animate objects are primates, they may well have a mental life, with the accompanying beliefs (both true and false), desires and fears, all of which make understanding the goals of their actions uncertain. We will describe ideas about how the visual system has 'solved' this problem.

An initial step in understanding others from sight comes from a fractionation of the body into key parts such as the eye, mouth, head, finger, hand and arm. These are represented in specific regions of the visual association cortex within the ventral stream, in a constellation of areas often referred to as the 'core system'. In humans these include the fusiform face area (FFA; Kanwisher et al., 1997) and the extrastriate body area (Downing et al., 2001), and in monkeys the inferotemporal cortex and the cortex surrounding the superior temporal sulcus (STS). The motion of body parts is analysed separately in the dorsal stream. In humans and monkeys, processing in the posterior STS sees a confluence of the two streams of information, and the behaviour of others is specified in terms of key postures and animations. In most neural models of social cognition, the STS features at the top of the hierarchy within a core system.
The core system is not itself concerned with social meaning, but communicates with an extended system, which serves to extract social meaning from the core system output in order to accomplish the more sophisticated aspects of social cognition such as social learning, imitation, action and emotion understanding, empathy, intentionality and theory of mind.

A wealth of information about the visual analysis performed in the core system, and in the STS in particular, has come from single-cell studies in the macaque monkey. Gross et al. (1972) made the first startling finding of temporal cortex cells that responded selectively to the sight of one specific body part, a hand. Subsequent work, much of which was done in the anterior part of the STS (STSa, corresponding to area STPa; Bruce et al., 1981), revealed populations of cells selectively responsive to specific parts of the body, such as a head, hand, eye, leg or arm, which often required that part to make a specific action (e.g. Bruce et al., 1981; Perrett et al., 1982; Hasselmo et al., 1989a; Oram and Perrett, 1996). Other populations of cells within the STSa respond selectively to particular whole-body actions, such as walking, crouching or jumping, and not to movements of isolated body parts (Perrett et al., 1989; Oram and Perrett, 1996; Jellema and Perrett, 2003a, 2005). Single-cell coding for whole-body actions presumably results from a complex pooling of the outputs of the cells coding for individual body parts and their movement (Perrett et al., 1989). Thus, apparently, after the initial breakdown of bodies, the parts are reassembled in the STS.

Other STSa cells are tuned to multiple views of the same animate object (Perrett et al., 1985, 1987; Logothetis et al., 1995) or of the same action (Jellema and Perrett, 2002a), or are tuned to conceptually related visual stimuli, such as multiple body signals of attention directed to one point (Perrett et al., 1985, 1992). Such response selectivity is most likely obtained through pooling of the outputs of cells coding for separate views of distinct stimuli. Characteristic of many STSa cells is that they integrate information about form and motion of animate objects (Oram and Perrett, 1996; Tanaka et al., 1999).

13.3. The face: a special case

The neural representation of the face has received a disproportionately large amount of scientific investigation because faces convey a wide variety of social information, most of it via dynamic changes in parts of the face, such as lip shape and gaze direction. These changeable, or variant, facial aspects are not confined to a particular individual. By contrast, the invariant facial aspects remain constant across facial movements, and carry information about identity, sex and age. In the model put forward by


Haxby and colleagues (Haxby et al., 2000, 2002; Hoffman and Haxby, 2000), the invariant face aspects are analysed in the fusiform gyrus and the variable face aspects in the STS, while the lateral inferior occipital gyri provide input to both fusiform gyrus and STS. It makes sense that the representations of identity and of the changeable aspects of a face are relatively independent, because a change in expression should not be misinterpreted as a change in identity (Bruce and Young, 1986). Indeed, the two types of information processing can be experimentally disentangled, e.g. by repetition-priming paradigms, which enhance face processing when identity is involved but not when expression is involved (Ellis et al., 1990). The separation of expression and identity processing is, however, controversial (Calder and Young, 2005).

Long before the introduction of brain-imaging techniques, two types of research had already produced indications that specialized module(s) for face processing existed: (i) neuropsychological studies of people with prosopagnosia (Bodamer, 1947; Hecaen and Angelergues, 1962), who do not recognize familiar faces but have no clear problem recognizing other categories of object, and (ii) single-unit studies in monkey temporal cortex, which revealed the existence of neurons specifically sensitive to faces (Desimone, 1991; Perrett et al., 1982, 1984, 1985). Single-unit studies also provided clues as to the different aspects of the face that different brain regions were tuned to: expressions and other changeable aspects of the face (e.g. gaze direction) seemed to be analysed in the STS, while the non-changeable aspects, such as identity, seemed to be analysed to a greater extent in the inferior temporal cortex (Perrett et al., 1984; Hasselmo et al., 1989a; Young and Yamane, 1992).
Single-cell studies indicate that identity coding proceeds in a similar fashion in humans, with semantic integration of names and familiar faces occurring within the medial temporal lobe limbic structures (Quiroga et al., 2005).

Recent imaging studies of the human brain have shown activation of three areas in response to the viewing of faces: the lateral fusiform gyrus, which has also been dubbed the fusiform face area (FFA; Kanwisher et al., 1997), the posterior STS, and the lateral inferior occipital gyri (e.g. Hoffman and Haxby, 2000). Although the FFA consistently shows greater activation to faces than to any other object category, it has been asserted that the FFA is engaged in the discrimination between individual objects belonging to the same category (be it faces, birds, cars or any other category of object; Gauthier et al., 1999). Recent evidence shows that this assertion is wrong. Increased depth of processing (subtle discrimination of object identity rather than general recognition of object class) does enhance FFA processing, but the enhancement occurs only for faces and for birds, which have faces; it does not occur for flowers, cars or guitars (Grill-Spector et al., 2004; see also Xu et al., 2005).

In humans, the posterior STS is activated by movements of face parts, such as the mouth and eyes (e.g. Decety and Grezes, 1999; Puce et al., 1998; Puce and Perrett, 2003). These findings suggest that the face-processing areas identified in the monkey STS and inferior temporal (IT) cortex most likely correspond to the human posterior STS and lateral fusiform gyrus (FFA), respectively. The Hoffman and Haxby (2000) scheme, with the STS processing variable face aspects and the FFA processing invariant face aspects, is likely to be a simplification, because single-cell, anatomical and functional imaging data all suggest an organization with several patches of face-processing cortex within the STS and IT of the monkey (Harries and Perrett, 1991; Tsao et al., 2006).

13.4. Actions as they pertain to the observer

Perhaps it is most important to comprehend how the actions of others impinge on ourselves. Prior experience coupled with an egocentric visual analysis can suffice for a limited social competence. Fortunately, most visual analyses begin with an egocentric frame of reference; such viewer-referenced coding is a defining feature of the way faces, body postures and bodily actions are coded in the STS.

Perceived bodily actions can, in principle, be described within two different coordinate systems: a viewer- or an object-centred system (e.g. Perrett et al., 1989, 1991; Hasselmo et al., 1989b; Jellema and Perrett, in press). For a viewer-centred system, the view and direction of


motion or articulation of the object are defined relative to the observer. For an object-centred description, the principal axis of the object, or another part of the object, is taken as a reference point to define the action. Object-centred descriptions therefore remain constant across different vantage points of the observer. The neural implementation of these coordinate systems has been the subject of intense research.

Coding in the STS is typically viewer-centred. For example, an STS cell may respond to an agent advancing (i.e. following its nose) to the right of the observer (from the observer's perspective the right profile is visible and motion is to the right), but not to the agent retreating to the right (left profile view and motion to the right), nor to the agent retreating to the left (right profile view and motion to the left). Such cells require a specific combination of body view and motion direction, defined from the observer's perspective. Although these cells do not generalize across changes in perspective view, they usually generalize very well across changes in illumination and size.

In contrast, object-centred cells have received relatively little study, probably because they are rare: only about 5% of STS cells that respond to an action or static posture do so in an object-centred manner; the other 95% use a viewer-centred frame (e.g. Oram and Perrett, 1996). An example of object-centred coding would be selectivity for all examples of walking forward (i.e. when the body moves in the same direction as the nose), but lack of response to all backward walking. In the macaque, object-centred cells are perhaps more prevalent towards the pole of the temporal lobe, whereas viewer-centred cells may be more equally distributed along the STS (Jellema and Perrett, in press).
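The distinction between the two frames can be made concrete with a toy sketch. This is purely illustrative: the string labels and the treatment of a cell as a Boolean predicate are our own simplifications for exposition, not a claim about the actual neural code.

```python
# Toy illustration of viewer-centred vs object-centred action coding.
# All names and encodings are illustrative, not a neural model.

def viewer_centred_cell(view, motion):
    """Fires only for one conjunction defined in the observer's frame:
    right profile visible AND motion to the observer's right
    (the agent 'advancing to the right')."""
    return view == "right_profile" and motion == "right"

def object_centred_forward_cell(facing, motion):
    """Fires for walking forward regardless of vantage point:
    the body moves in the direction the nose points."""
    return facing == motion

# Agent advancing to the observer's right: right profile, rightward motion.
assert viewer_centred_cell("right_profile", "right")
# Agent retreating to the right: left profile visible, rightward motion.
assert not viewer_centred_cell("left_profile", "right")

# Forward walking is detected from any viewpoint...
assert object_centred_forward_cell(facing="left", motion="left")
assert object_centred_forward_cell(facing="right", motion="right")
# ...but backward walking is not.
assert not object_centred_forward_cell(facing="left", motion="right")
```

The viewer-centred predicate depends on the observer's vantage point; the object-centred predicate compares two properties of the agent alone, so its output is constant across vantage points, which is exactly what makes object-centred descriptions view-invariant.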
Theories of ventral stream function, as well as psychological and computational models of object recognition, have postulated object-centred (viewpoint-independent) coding of objects as the most efficient way of storing object information (Marr and Nishihara, 1978). Such coding would enable the neural system to achieve object constancy, which facilitates object recognition. In this light, it has been somewhat puzzling why most electrophysiological studies in the temporal cortex report viewpoint-dependent coding (e.g. Perrett et al., 1985, 1991; Wachsmuth et al., 1994; Logothetis et al., 1995).

There are at least two functions for viewer-centred representations which may account for their prevalence: (i) understanding an action sequence from momentary postures that constitute key components of that action (e.g. how to put on a lifejacket can be understood from poses depicted on safety instructions; Byrne, 1995; Perrett, 1999) and (ii) inferring the direction of attention of others (are their actions directed at me or at someone or something else?) (Perrett et al., 1992). A cell responding to the left profile, but not to the right, may code for the abstract notion of ‘attention directed to the observer’s left’, instead of the geometric characteristics of the left side of the face. The visual information arising from gaze and body cues appears to contribute to cell sensitivity in a way that is consistent with the cells’ role in analysing the direction of attention. For example, cells tuned to the left profile view of the head are often additionally tuned to left eye gaze and the left profile view of the body (Wachsmuth et al., 1994). Despite the findings of cellular sensitivity to attention direction in macaques, the extent to which Old World monkeys are able to use information about the gaze direction of others is still a matter of some debate (e.g. Anderson et al., 1996; Emery et al., 1997; Lorincz et al., 1999). None the less monkeys are acutely sensitive to eye contact, particularly from dominant individuals in close proximity (Perrett and Mistlin, 1990).

13.5. Actions as they pertain to others

Actions are more than simple movements—they typically involve goal direction. This means that an action is functionally related to aspects of the environment, to its cause and/or consequence. Hence, a more abstract representation of actions can be gleaned by relating actions to environmental cues such as position (defined with respect to landmarks), goal objects, and other individuals. Such analysis appears to be conducted in sub-regions of the temporal cortex (Perrett et al., 1989, 1990).

Recent findings have made it increasingly clear that the sensitivity of cell populations in STSa exceeds that which would be required to


form merely a ‘pictorial’ description of complex actions. STS cells respond, in intricate ways, to the sight of actions in conjunction with other visual cues. The response characteristics suggest that the cells are involved in computing (or predicting) the most likely next action or event. The cues used can be derived from the agent performing the action, or from the environment where the action took place. As such, these cells show sensitivity to the context in which the action was performed. Examples detailed below include STS cells that combine sensitivity for actions with sensitivity for the gaze direction of the performing agent (Section 13.5.1), the object the action was directed at (Section 13.5.2), or the spatial location where the action took place (Section 13.5.3). Such joint coding for actions and related contextual cues can be informative about the goals of the agent’s actions.

13.5.1. Sensitivity to actions and gaze direction

A subset of STSa cells responds to the sight of an actor performing a reaching action only on the condition that the direction of head and eye gaze of the actor matches the direction of reaching (Jellema et al., 2000). In other words, the agent performing the action needs to attend to the target position of his/her action in order to excite the cells. Such a combined analysis of action and gaze may well support the detection of intentional or purposive actions. That is, when an agent reaches out and knocks over an object while looking at the object, a good guess is that it was the agent's intention to knock the object over, whereas if the agent's attention is directed elsewhere during the same reaching action, it is more likely that knocking the object over was unintentional and accidental.
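The reach-gaze contingency described for these cells can be sketched as a minimal predicate. The sketch assumes nothing beyond the contingency stated above; the direction labels and the graded 0/1 response are illustrative conveniences.

```python
# Sketch of the contingency reported for some STSa cells (Jellema et al., 2000):
# a response to a reach only when the agent's gaze matches the reach direction.
# Encoding directions as strings is an illustrative simplification.

def reach_cell_response(reach_direction, gaze_direction):
    """1.0 if the actor attends to the target of their reach, else 0.0."""
    return 1.0 if reach_direction == gaze_direction else 0.0

assert reach_cell_response("left", "left") == 1.0   # attended, 'intentional' reach
assert reach_cell_response("left", "right") == 0.0  # gaze elsewhere: no response
```

A downstream system reading out such a signal would thereby receive, for free, a crude flag distinguishing purposive from accidental contact with an object.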

13.5.2. Sensitivity to actions and their goals

Sensitivity to action goals is most clear for cells sensitive to hand–object interactions, such as reaching for, picking, tearing and manipulating objects (Perrett et al., 1989, 1990; Jellema et al., 2000). These cells are sensitive to the form of the

hand performing the action, and are unresponsive to the sight of tools manipulating objects in the same manner as hands. Furthermore, the cells code the spatio-temporal interaction between the agent performing the action and the object of the action. For example, cells tuned to hands manipulating an object cease to respond if: (i) the object is removed, (ii) the hand action is made in a direction away from the object, or (iii) the hands and object move appropriately but remain spatially separated (Perrett et al., 1989). This selectivity ensures that the cells are more responsive in situations where the agent's motion is causally related to the object's motion.

Cells with strikingly similar sensitivity to the sight of goal-directed hand actions are found in the premotor and parietal cortex (F5; see Keysers and Perrett, 2004 for a comparison). Cells in these latter areas are known as 'mirror neurons' because they respond both when the monkey prepares for and executes a hand action and when it sees the same action (i.e. the cells may respond during grasping and to the sight of grasping; see Rizzolatti and Fogassi, this volume, Chapter 14).
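The conjunction of requirements (i)-(iii), plus the hand-not-tool constraint, amounts to a causal-interaction filter, which can be sketched as follows. The parameter names are our own shorthand for the stimulus conditions listed above, not terms from the original studies.

```python
# Sketch of the selectivity described for STSa hand-object cells
# (Perrett et al., 1989): a response requires a hand (not a tool), a present
# object, motion directed at the object, and spatial contact between them.

def hand_object_cell(effector, object_present, directed_at_object, in_contact):
    if effector != "hand":
        return False  # tools manipulating objects are ineffective stimuli
    if not object_present:
        return False  # (i) object removed
    if not directed_at_object:
        return False  # (ii) hand action made away from the object
    if not in_contact:
        return False  # (iii) hand and object move appropriately but separated
    return True       # agent's motion causally related to object's motion

assert hand_object_cell("hand", True, True, True)        # full interaction
assert not hand_object_cell("tool", True, True, True)    # tool, not hand
assert not hand_object_cell("hand", False, True, True)   # object removed
assert not hand_object_cell("hand", True, True, False)   # spatially separated
```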

13.5.3. Sensitivity to actions and location

Milner and Goodale suggested that spatial position might be processed in both dorsal (parietal) and ventral (temporal) streams, but for different purposes (e.g. Goodale et al., 1991; Milner and Goodale, 1995). Consistent with ventral coding of position, cell populations in STSa are sensitive to the spatial location of animate objects after they move out of sight behind a screen (Baker et al., 2001). Additionally, the activity of cell populations sensitive to the action of walking is strongly modulated by the position of walking with respect to the observer: some cells respond only to walking at locations near to the subject, others only to walking at far-away locations (Jellema et al., 2004).

Marr and Nishihara (1978) noted that to recognize an object one needs to generalize across viewing conditions (ignoring whether the object is far, near, seen from the front or side). Under this scheme it is surprising that STS cells care so much about view of, and distance to, objects and agents. We argue that recognizing


the nature and purpose of actions requires specification of how those actions are related to objects and the observer. For this, coding of the location and orientation of actions within the environment is crucial. The combined sensitivity for actions and spatial location could enable the STS cell populations to represent meaningful aspects of social actions and interactions. The spatial positions people occupy with respect to each other and to objects often provide important clues to the meaning of the social event and to the goals or intentions of the people involved.

Imaging studies in humans support the suggested role of the STS in using environmental cues to represent the goal-directedness and other contingencies related to biological actions (Castelli et al., 2000; Zacks et al., 2001). Saxe et al. (2004) showed that the right posterior STS is especially involved when the relationship of the action with the environment is manipulated to implicate different intentions of the agent.

An obvious question is, of course, whether there is any behavioural evidence that non-human primates can discriminate between intentional and non-intentional actions. A study by Call and Tomasello (1998) showed that great apes preferentially follow intentional actions performed by the experimenter, rather than non-intentional actions. In reviewing behavioural evidence, Barrett and Henzi (2005) conclude that monkeys have a limited understanding of others' minds; monkeys do not behave in a Machiavellian manner using a theory of others' minds. None the less, the actions that monkeys do perform that are designed to promote short-term selfish goals (which can, paradoxically, result in cooperation) still require a sophisticated analysis of the behaviour of others and its context, one that we argue could be supplied by representations of the type we are describing.

13.6. Systems for predicting behaviour

Recent discoveries in temporal and parietal cortex show that the context of recently witnessed behaviour can have a profound effect on the visual analysis of currently seen behaviour. In effect, these brain systems allow prior behaviour

to be taken into account to predict future actions of others. Such contextual analysis means that the understanding of others moves to a new level of sophistication (see also Rizzolatti and Fogassi, this volume, Chapter 14).

13.6.1. Predicting actions from static body postures

Articulation seen in point-light or biological motion displays seems to be preferentially processed in human and monkey STS (Beauchamp et al., 2003; Puce and Perrett, 2003). Understanding an articulated action performed by another individual does not, however, require that we witness the entire action sequence. A single momentary view is often enough to identify the action and its goal. This capacity allows us to understand another's actions in a situation where the other is intermittently occluded from view. In instruction manuals, for example, the performance of dexterous manual tasks is often specified as a series of static pictures, each demonstrating particular sub-goals or stages in the action sequence. The momentary postures allow us to infer the dynamics of the whole action. The formation of associations between an action and its end-posture might well underlie the ability of the brain to infer impending or prior action from static 'snapshots' of the body.

We have found that populations of STSa cells responsive to the sight of specific articulated body actions also code for the consequent articulated static body postures when presented in isolation (Jellema and Perrett, 2003b). Such actions occur when one body part (e.g. a limb or head) moves with respect to the remainder of the body; conversely, non-articulated actions occur when equivalent body parts move as one. Articulated postures contain a torsion or rotation between parts, while non-articulated postures do not. For the population of cells described, it was notable that similar postures, which did not form the logical end-point of the effective articulated action, did not evoke responses. Starting postures, even of the effective articulated actions, also failed to evoke a response. Moreover, the cells also did not respond to unusual body actions that culminated in the


effective end-posture. Together the data suggest that the cells' responses were related to the implied prior action, rather than to the static posture alone. It seems that the neural representations in STSa for actual biological motion may extend to biological motion implied from static postures.

These implied-motion representations could play a role in producing the activity in the middle temporal/medial superior temporal [V5(MT)/MST] areas reported in fMRI studies when subjects viewed still photographs of people in action (e.g. a snapshot of a running athlete; Kourtzi and Kanwisher, 2000; Senior et al., 2000). It is well established that the V5(MT)/MST complex plays a primary role in the analysis of the direction and speed of moving objects (Maunsell and Van Essen, 1983). Its activation in response to implied motion suggests that it receives a top-down influence, since the object and context first need to be identified before the associated movement can be inferred. The cell populations in the STS sensitive to articulated postures and the associated preceding actions could well provide this top-down input (Barraclough et al., in press; cf. Lorteije et al., 2006).

Thus, the visual processing of static form may contribute to the comprehension of dynamic actions: sensitivity to associations between image form and motion could form the basis of the ability of the nervous system to retrieve likely motion from entirely static images.

13.6.2. Prediction based on perceptual history

Some STS cell populations seem to be tuned to the impending behaviours of others, based on their immediate perceptual history (Jellema and Perrett, 2003a). Seeing the events prior to a body assuming a static posture can allow one to predict the likelihood and nature of the body's next movement. Under natural viewing conditions, the responses of some STSa cells to the sight of static body postures are controlled by the actions performed by that body (agent) in the one or two seconds directly preceding the onset of the static posture. In other words, the perceptual history can enable or prevent a cell's response to the current retinal input (Jellema and Perrett, 2003a).

One example is that of cells responding to the front view of a static body when preceded by walking towards the observer. Other actions that culminated in the same static body view at the same location in the testing room, such as walking backward and stopping, or body rotation, failed to result in a response. Although the type of preceding motion was crucial for evoking a response during the static phase, the cells did not respond during this prior movement. These neural representations for sequences of events may play a role in predicting or anticipating the next move or posture. For example, the sight of a body that has just stopped walking forward may invoke the expectation that, should walking commence again, it is likely to resume a forward direction. The same view of a static body that has just stopped walking backward, by contrast, may be expected to move in a backward direction should walking resume.
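The gating of a cell's response by the immediately preceding action can likewise be caricatured as a unit with a short stimulus buffer. The class, stimulus labels and two-step history window are illustrative assumptions, not parameters taken from the recordings.

```python
# Toy "perceptual history" cell: it fires to a static front-view posture
# only when the preceding one or two time steps contained the enabling
# action (walking towards the observer). Labels are illustrative.

from collections import deque


class HistoryGatedCell:
    def __init__(self, trigger: str, enabling: str, window: int = 2) -> None:
        self.trigger = trigger       # static view the cell responds to
        self.enabling = enabling     # action that must directly precede it
        self.history = deque(maxlen=window)

    def step(self, stimulus: str) -> bool:
        # Fire only on the trigger posture, and only if the enabling
        # action sits in the recent history; the cell stays silent
        # during the enabling movement itself.
        fires = (stimulus == self.trigger) and (self.enabling in self.history)
        self.history.append(stimulus)
        return fires


cell = HistoryGatedCell(trigger="static_front_view", enabling="walk_toward")

# Walking towards the observer, then stopping: response only at the static phase.
responses = [cell.step(s) for s in
             ["walk_toward", "walk_toward", "static_front_view"]]
print(responses)  # [False, False, True]

# Same static view reached by body rotation: no response.
cell2 = HistoryGatedCell(trigger="static_front_view", enabling="walk_toward")
responses2 = [cell2.step(s) for s in ["rotate", "rotate", "static_front_view"]]
print(responses2)  # [False, False, False]
```

The same retinal input (the static front view) is thus enabled or vetoed by what was seen a moment earlier, mirroring the behaviour described above.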

13.6.2.1. Prediction during unseen actions

The actions of others are not always fully visible; for example, someone may become hidden from our sight as they move behind a tree, or their hands may not remain fully in view as they reach to retrieve an object. Within STS it is now apparent that specific cell populations are activated when the presence of a hidden agent can be inferred from the preceding visual events (i.e. the agent was witnessed passing out of sight behind a screen and has not yet been witnessed re-emerging into sight, therefore the agent is likely to remain behind the screen; see Baker et al., 2001). These STSa cells respond maximally to the sight of individuals 'hiding' behind an occluding screen. In the 3 s following disappearance from sight behind a screen, the population response is significantly larger than in the prior 3 s when the agent was visible and moving towards the screen. The cells responding to occlusion additionally showed spatial sensitivity, discriminating between locations where the agent was hidden (at the left, right or middle of the room; Baker et al., 2001).

Cell responses to the experimenter walking in sight were consistent with the out-of-sight responses. For example, if hiding behind a screen located at the right-hand side of the testing room evoked significantly larger responses than hiding behind a screen at the left-hand side, then walking towards the right-hand screen would also evoke a larger response than walking towards the left-hand screen, with walking in both cases from left to right. These responses are consistent with the idea that the cells coded not only for the presence of the experimenter behind the right-hand screen, but also for the intention/goal of the experimenter to go behind that screen. For this interpretation, we need only assume that walking towards the right screen reflects the intention to move behind that screen. Saxe et al. (2004) have argued similarly that a region of posterior STS is activated when human observers interpret events in actions as intentional as opposed to incidental.

Corresponding cell properties are seen in F5 (Umilta et al., 2001). If a monkey sees an object hidden behind a screen, the monkey can 'believe' the object continues to exist. F5 cells responding to the sight of the experimenter reaching and grasping objects in full view will also respond to the sight of the experimenter reaching to grasp an object behind the screen (so long as the monkey has seen the object hidden behind the screen). In this instance the coding of the agent's action and goal includes 'belief' (see Rizzolatti and Fogassi, this volume, Chapter 14).
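The bookkeeping needed to infer that an unseen agent remains behind a particular screen is minimal, as the following sketch illustrates. The class and the agent/screen labels are invented for illustration and merely stand in for the kind of state the cell populations appear to maintain.

```python
# Minimal bookkeeping for inferring an out-of-sight agent's location:
# an agent seen disappearing behind a screen, and not yet seen
# re-emerging, is represented as hidden at that screen. Illustrative only.

from typing import Optional


class OcclusionTracker:
    def __init__(self) -> None:
        self.hidden_at = {}   # agent -> screen location where it disappeared

    def saw_disappear(self, agent: str, screen: str) -> None:
        """Agent witnessed passing out of sight behind a screen."""
        self.hidden_at[agent] = screen

    def saw_reappear(self, agent: str) -> None:
        """Agent witnessed re-emerging: it is no longer inferred hidden."""
        self.hidden_at.pop(agent, None)

    def inferred_location(self, agent: str) -> Optional[str]:
        """Where the agent is believed to be while out of sight."""
        return self.hidden_at.get(agent)


tracker = OcclusionTracker()
tracker.saw_disappear("experimenter", "right_screen")
print(tracker.inferred_location("experimenter"))  # hidden, location-specific
tracker.saw_reappear("experimenter")
print(tracker.inferred_location("experimenter"))  # no longer inferred hidden
```

The representation is location-specific, as in the recordings: the same disappearance event behind the left-hand screen would be stored, and retrieved, as a distinct state.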

13.7. The will and capacity to learn about others

Humans preferentially look at faces and react to them in the first hours of life (Goren et al., 1975). This capacity appears to be more widespread amongst primates (K. Fujita, personal communication). Faces and facial expressions become (or perhaps are) intrinsically reinforcing for behaviour, encouraging or discouraging particular reactions in the observer, whether stereotyped or subtle. Brain systems (including the amygdala, ventral striatum, insular cortex, and orbitofrontal cortex) support: (a) the drawing of attention towards faces, (b) the extraction of meaning from faces, and (c) the reinforcement of social behaviour by face cues. For adults the sight of facial and bodily cues (infantile features, or sexually dimorphic adult features) can provide secondary reinforcement for social behaviour. A full treatment of the brain pathways underlying social cognition should incorporate details of how such social cues come to activate 'reward systems' in the brain (Kampe et al., 2001; O'Doherty et al., 2003) and hence can guide social interactions.

Most neural models of social cognition see the STS as part of a core system dealing with social perception per se (cf. Allison et al., 2000), and postulate that connections with the 'extended' system (including insula, orbitofrontal, cingulate, anterior temporal, somatosensory and parietal cortices and the amygdala) serve to extract meaning from social perceptions and control reactions (Haxby et al., 2000). For example, the ability to follow gaze, and, in cases where the other's attention is directed at a specific object, the ability to share attention with the observed agent, is likely to be underpinned by connections between the STS and the intraparietal sulcus (Harries and Perrett, 1991). The direction, or target, of another's attention is computed by neurons in the STS specifically responsive to eye-gaze direction, head orientation and bodily orientation (Perrett et al., 1985, 1992), while the parietal cortex is involved in (covertly) directing the observer's spatial attention (cf. Corbetta, 1998). It has been shown that the judgment of gaze direction indeed activates both the intraparietal sulcus and the posterior STS (Hoffman and Haxby, 2000).

Another example is the connection between the STS and the superior temporal gyrus (STG). The perception of mouth and lip movements typically activates the STS, whether or not the movements are related to speech (Puce et al., 1998). To extract meaning from the lip movements, connections from the STS with the auditory cortex in the STG are recruited (the STG typically responds to heard vocalizations). This was shown in experiments using silent lipreading tasks, which produced activity in both the STS and STG (Calvert et al., 1997).
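Computing the target of another's attention from gaze direction can be framed as a simple geometric problem: among candidate objects, select the one closest in angle to the gaze ray. The functions, coordinates and object names below are an illustrative toy, not a claim about the actual neural computation.

```python
# Geometric sketch of computing the target of another's attention:
# given an agent's head position and 2D gaze direction, select the
# object with the smallest angular deviation from the gaze ray.

import math


def angle_to(head, gaze_dir, obj):
    """Angle (radians) between the gaze direction and the head->object vector."""
    vx, vy = obj[0] - head[0], obj[1] - head[1]
    dot = vx * gaze_dir[0] + vy * gaze_dir[1]
    norm = math.hypot(vx, vy) * math.hypot(*gaze_dir)
    # Clamp to guard against floating-point overshoot outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def attended_object(head, gaze_dir, objects):
    """Name of the object best aligned with the agent's gaze."""
    return min(objects, key=lambda name: angle_to(head, gaze_dir, objects[name]))


objects = {"cup": (3.0, 0.2), "ball": (0.5, 4.0)}
print(attended_object((0.0, 0.0), (1.0, 0.0), objects))  # cup
print(attended_object((0.0, 0.0), (0.0, 1.0), objects))  # ball
```

Something of this sort, with gaze direction estimated from the eye, head and body cues coded in STS, would suffice to select the object of shared attention.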
The STS is reciprocally connected with the amygdala to support the extraction of meaning from stimuli and to highlight those social stimuli that have an emotional significance for the observer on the basis of past life experiences or genetically determined strategies (Adolphs, 1999; Barton, this volume, Chapter 11). Fearful facial expressions are well known to excite the amygdala (Morris et al., 1996), but amygdala function is also implicated in other emotions and aspects of social cognition. Amygdala activity during face processing is modulated by the gaze direction of those faces (probably computed in the STS), with direct gaze producing larger activation than averted gaze (Kawashima et al., 1999). This modulation could be related to the inherent ambiguity of direct gaze, which can signal interest and attraction as well as threat. Furthermore, it has been suggested that the amygdala is involved when an observer interprets the mental state of an agent on the basis of the eye region (Baron-Cohen et al., 1999), and may, in fact, be vital for developing a theory of mind (Baron-Cohen et al., 2000).

Other parts of the extended system for extracting meaning from faces include the orbitofrontal and somatosensory cortex. The orbitofrontal cortex is known from single-cell studies in the macaque to contain face-responsive neurons (Thorpe et al., 1983) and to evaluate the reward associated with stimuli (Rolls, 1996, 2000). This role, applied to social stimuli such as facial expressions, could guide behaviour in a socially acceptable manner. Dysfunction of this area results in inappropriate responses to social stimuli, presumably because the positive or negative social value of such stimuli can no longer be assessed (Damasio et al., 1994). Somatosensory cortex also aids the interpretation of faces (Adolphs, 1999). To understand the emotional expressions of others we may refer to how the expressions would feel on our own faces (a kind of mental rehearsal without outward sign, see below). Through activity in the somatosensory cortex we can perhaps sense the position and stretch in the muscle movements required to produce the expressions. This simulation would tell us how the outward manifestation of an emotional state would feel if it were occurring in ourselves (cf. Wicker et al., 2003).

The FFA, part of the core face-processing system identified by Haxby et al.
(2000), also connects to an extended system of brain areas in order to extract information related to the identity of the face. This system includes the anterior middle temporal gyrus, which becomes active when the identity or name belonging to familiar faces is determined (e.g. Nakamura et al., 2000; see also Quiroga et al., 2005). This area is not exclusively involved in extracting social knowledge about people; it is also activated by the perception of, for example, familiar outdoor scenes (Nakamura et al., 2000), suggesting a more general function in representing autobiographical information.

13.8. The mirror neuron system as a guiding principle

Gallese et al. (2004) suggest that our understanding of others' actions depends on the STS in close association with the mirror neuron system (Di Pellegrino et al., 1992). The human mirror neuron system comprises frontal components, the posterior inferior frontal gyrus and the adjacent ventral premotor cortex (area F5), and posterior parietal components in the rostral section of the inferior parietal lobule (area PF; Rizzolatti and Fogassi, this volume, Chapter 14). The posterior STS forms one of the main inputs to this mirror neuron system, through its connections with area PF (Seltzer and Pandya, 1994). Since the mirror system responds during execution and observation of the same action, this system is thought to underlie imitative behaviour, which is a driving force behind the development of our social cognitive abilities (Hurley and Chater, 2005). Overt imitation may arise through interactions of the core system with the amygdala to enable social mirroring (Meltzoff and Decety, 2003), while imitation as a form of social learning is supported by dorsolateral prefrontal cortex and other motor preparation areas (Iacoboni, 2005).

The mirror neuron mechanisms may allow us to understand the meaning of others' actions by internally simulating these actions, without the need for any conceptual reasoning (see Rizzolatti and Fogassi, this volume, Chapter 14). The internal simulation happens at a sub-threshold level, i.e. not strong enough to actually cause a motor pattern through activation of muscles, but nevertheless strong enough to produce a cortical representation and an intuitive grasp of what the other does.

'Mirroring' may happen not only for actions but also for emotions. fMRI data indicate that the neural substrates for the perception of disgust in others and those for the experience of disgust overlap in the anterior insula (Wicker et al., 2003). The activity in the insula following the perception of others' disgust is probably mediated by projections arising initially in face-responsive cell populations of the STS (Phillips et al., 1997), while insula activity during the experience of disgust is mediated by connections from olfactory and gustatory centres (Augustine, 1996) combined with information about the interoceptive state of the body (Craig, 2002). This connectivity allows the same insula area to be active when we witness the disgusted facial expression of someone else and when we experience disgust. Similar matching mechanisms for the perception and experience of other emotions probably exist. In this way the mirror mechanism allows us to fuse our third- and first-person experiences of actions or emotions. Thus we can begin to realize the experience of others.

Understanding the minds of others might depend on mirror mechanisms (Gallese and Goldman, 1998; Gallese et al., 2002). It has further been argued that a simulation of another's feelings is necessary to develop the capacity for empathy. This is one of the reasons why a mirror-system deficit features in models of autism (Williams et al., 2001; Dapretto et al., 2006). Lack of empathy is one of the most striking characteristics of the autistic syndrome (Baron-Cohen, 2002, 2005). Empathy has also been described as a social-mirroring process with a sensory–motor basis, supported by the core system (STS plus mirror neuron system) and the limbic system (Carr et al., 2003).
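The notion of a sub-threshold internal simulation can be caricatured numerically: the same action representation is driven by execution and by observation, but only self-generated drive crosses the motor threshold. All thresholds and drive values below are arbitrary illustrative numbers, not measured quantities.

```python
# Toy mirror unit: one action representation is driven by both executing
# and observing, but observation yields sub-threshold drive -- enough to
# register the action internally, not enough to move a muscle.

MOTOR_THRESHOLD = 1.0   # activation needed to trigger overt movement
EXECUTE_DRIVE = 1.2     # self-generated command: supra-threshold
OBSERVE_DRIVE = 0.6     # visual input (e.g. via STS/PF): sub-threshold


def mirror_unit(action: str, channel: str) -> dict:
    """State of a shared action representation for one input channel."""
    drive = EXECUTE_DRIVE if channel == "execute" else OBSERVE_DRIVE
    return {
        "action": action,
        "represented": drive > 0.0,               # cortical representation forms
        "moves_muscles": drive >= MOTOR_THRESHOLD,  # overt motor output
    }


print(mirror_unit("grasp", "execute"))  # represented, and moves muscles
print(mirror_unit("grasp", "observe"))  # represented, but no overt movement
```

The point of the caricature is only that a single shared representation, differently driven, can separate understanding an action from performing it.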

13.9. Taking things literally

According to the 'social brain hypothesis' (Brothers et al., 1990; Dunbar, 1998), the advantage gained by understanding others' behaviour and intentions constituted a major driving force behind primate brain evolution. We have argued that the STS forms the prime neural substrate for descriptions of the 'mechanics' of social events. This description goes beyond pictorial descriptions of others' actions. Instead, the STS embodies the physical causes and consequences of actions, allowing prediction within action sequences. This description provides a basis for guiding the observer's reactions and planning contingencies based on the likely future behaviour of others. A purely mechanistic description of others' actions and social events will not suffice to understand the reasons agents have for doing what they do; a more sophisticated understanding of others' actions could be based on the attribution of mental states to the agents involved. As noted, the STS communicates with an extended system that in humans may accomplish the more sophisticated mentalistic aspects of social cognition. The two stages of interpretation (mechanistic and mentalistic) are thus anatomically distinct and can therefore be affected differentially.

The study of developmental disorders has proved helpful in dissecting the neural basis of social cognition. Aspects of social cognition can be selectively impaired, while many other cognitive abilities are spared, in autistic individuals (Frith, 2001). By the same token, in Williams' syndrome some aspects of social cognition can be spared (or even enhanced) in the presence of many impaired non-social abilities (Bellugi et al., 2000). These findings support the notion that social and non-social stimuli are represented by distinct neural substrates.

The impairment in social cognition encountered in autism becomes most visible in difficulties establishing a theory of mind, i.e. the ability to read the behaviour of others (and possibly of oneself) in terms of mental states (such as desires, beliefs and intentions; Frith et al., 1991; Baron-Cohen, 1995), and to empathize with others (Baron-Cohen, 2002, 2005). Presumably in typical people the processing of social cues is fully automated. One can imagine that, if mentalistic interpretation fails, daily social interactions become extremely puzzling. One idea is that the autistic mind cannot keep up with the pace and complexity of social interactions and resorts to the mode of operation that does work properly: the literal–mechanistic mode. The basic operations of the STS may be intact in autism.
In fact, autistic people may rely too much on the STS for their interpretation of the (social) world, owing to a failure to recruit the extended systems. This leads to a focus on the physical attributes of a social stimulus at the expense of its social meaning or consequence. One of the characteristic features of the autistic mind is indeed the tendency to take things too literally (Grandin, 1995). One domain where this is particularly evident is language: people often do not speak literally; there may be a discrepancy between what the speaker intends to convey and what his or her words mean in terms of dictionary definitions. The ability to detect such discrepancies and go beyond the literal spoken word is crucial for conducting an intelligible conversation. People with autism, even those who have a good grasp of grammar and ample vocabulary, take words literally. For example, an autistic child may become upset when asked, "give me your hand" (Frith, 1995). Not surprisingly, such misunderstandings are not limited to the language domain. Social cues conveyed by bodily postures and actions can also be misunderstood by taking them literally.

In this review we have focused on the properties of a core system for describing the behaviour of others. We have documented how the system achieves a visual representation of the mechanics of social interaction in terms that go beyond a pictorial description and embody causes and consequences of actions. This system utilizes visual cues in a hierarchical manner to achieve a relatively sophisticated interpretation of the social world. In concert with an extended system it is possible to begin to understand how neural mechanisms provide a basis for social cognition and how system dysfunction might underlie problems in comprehension of the social world.

References

Adolphs, R. (1999) Social cognition and the human brain. Trends in Cognitive Sciences, 3: 469–479. Allison, T., Puce, A. and McCarthy, G. (2000) Social perception from visual cues: role of the STS region. Trends in Cognitive Sciences, 4: 267–278. Anderson, J. R., Montant, M. and Schmitt, D. (1996) Rhesus monkeys fail to use gaze direction as an experimenter-given cue in an object-choice task. Behavioral Processes, 37: 47–55. Augustine, J. R. (1996) Circuitry and functional aspects of the insular lobe in primates including humans. Brain Research Reviews, 22: 229–244. Baker, C. I., Keysers, C., Jellema, T., Wicker, B. and Perrett, D. I. (2001) Neuronal representation of disappearing and hidden objects in temporal cortex of the macaque. Experimental Brain Research, 140: 375–381. Barraclough, N. E., Xiao, D., Oram, M. W. and Perrett, D. I. (in press) The sensitivity of primate STS neurons to walking sequences and to the degree of articulation in static images. Progress in Brain Research.

Barrett, L. and Henzi, S. P. (2005) The social nature of primate cognition. Proceedings of the Royal Society of London, B, 272: 1865–1875. Baron-Cohen, S. (1995) Mindblindness: An Essay on Autism and Theory of Mind. MIT Press, Cambridge, MA. Baron-Cohen, S. (2002) The extreme male brain theory of autism. Trends in Cognitive Sciences, 6: 248–254. Baron-Cohen, S. (2005) Autism and the origins of social neuroscience. In A. Easton and N. J. Emery (eds) The Cognitive Neuroscience of Social Behaviour, pp. 239–255, Studies in Cognition Series. Psychology Press, New York. Baron-Cohen, S., Ring, H. A., Wheelwright, S. et al. (1999) Social intelligence in the normal and autistic brain: an fMRI study. European Journal of Neuroscience, 11: 1891–1898. Baron-Cohen, S., Ring, H. A., Bullmore, E. T., Wheelwright, S., Ashwin, C. and Williams, S. C. R. (2000) The amygdala theory of autism. Neuroscience and Biobehavioral Reviews, 24: 355–364. Beauchamp, M. S., Lee, K. E., Haxby, J. V. and Martin, A. (2003) Parallel visual motion processing streams for manipulable objects and human movements. Neuron, 34: 149–159. Bellugi, U., Lichtenberger, L., Jones, W., Lai, Z. and St George, M. (2000) The neurocognitive profile of Williams Syndrome: a complex pattern of strengths and weaknesses. Journal of Cognitive Neuroscience, 12: 7–29. Bodamer, J. (1947) Die Prosop-Agnosie. Archiv für Psychiatrie und Nervenkrankheiten, 179: 6–54. [Partial English translation by Ellis, H. D. and Florence, M. (1990) Cognitive Neuropsychology, 7: 81–105.] Brothers, L., Ring, B. and Kling, A. (1990) Responses of neurons in the macaque amygdala to complex social stimuli. Behavioral Brain Research, 41: 199–213. Bruce, C., Desimone, R. and Gross, C. G. (1981) Visual properties of neurons in a polysensory area in superior temporal sulcus of the macaque. Journal of Neurophysiology, 46: 369–384. Bruce, V. and Young, A. (1986) Understanding face recognition. British Journal of Psychology, 77: 305–327. Byrne, R. W.
(1995) The Thinking Ape: Evolutionary Origins of Intelligence. Oxford University Press, New York. Calder, A. J. and Young, A. W. (2005) Understanding the recognition of facial identity and facial expression. Nature Reviews Neuroscience, 6: 641–651. Call, J. and Tomasello, M. (1998) Distinguishing intentional from accidental actions in orangutans (Pongo pygmaeus), chimpanzees (Pan troglodytes) and human children (Homo sapiens) Journal of Comparative Psychology, 112: 192–206. Calvert, G., Bullmore, E. T., Brammer, M. J., Campbell, R., Williams, S. C. R. and McGuire, P. K. (1997) Activation of auditory cortex during silent lipreading. Science, 276: 593–596. Carr, L., Iacoboni, M., Dubeau, M. C., Mazziotta, J. C. and Lenzi, G. L. (2003) Neural mechanisms of empathy in humans: a relay from neural systems for imitation to limbic areas. Proceedings of the National Academy of Sciences of the USA, 100: 5497–5502.


Castelli, F., Happe, F., Frith, U. and Frith, C. (2000) Movement and mind: a functional imaging study of perception and interpretation of complex intentional movement patterns. Neuroimage, 12: 314–325. Corbetta, M. (1998) Frontoparietal cortical networks for directing attention and the eye to visual locations: identical, independent, or overlapping neural systems? Proceedings of the National Academy of Sciences of the USA, 95: 831–838. Craig, A. D. (2002) How do you feel? Interoception: the sense of the physiological condition of the body. Nature Reviews Neuroscience, 3: 655–666. Dapretto, M., Davies, M. S., Pfeifer, J. H. et al. (2006) Understanding emotions in others: mirror neuron dysfunction in children with autism spectrum disorders. Nature Neuroscience, 9: 28–30. Damasio, H., Grabowski, T., Frank, R., Galaburda, A. M. and Damasio, A. R. (1994) The return of Phineas Gage: clues about the brain from the skull of a famous patient. Science, 264: 1102–1104. Decety, J. and Grezes, J. (1999) Neural mechanisms subserving the perception of human actions. Trends in Cognitive Sciences, 3: 172–178. Desimone, R. (1991) Face-selective neurons in the temporal cortex of monkeys. Journal of Cognitive Neuroscience, 3: 1–8. Desimone, R. and Ungerleider, L. G. (1989) Neural mechanisms of visual processing in monkeys. In F. Boller and J. Grafman (eds) Handbook of Neuropsychology, vol. 2, pp. 267–299. Elsevier, Amsterdam. Di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V. and Rizzolatti, G. (1992) Understanding motor events: a neurophysiological study. Experimental Brain Research, 91: 176–180. Downing, P. E., Jiang, Y. H., Shuman, M. and Kanwisher, N. (2001) A cortical area selective for visual processing of the human body. Science, 293: 2470–2473. Dunbar, R. I. M. (1998) The social brain hypothesis. Evolutionary Anthropology, 6: 178–190. Ellis, A. W., Young, A. W. and Flude, B. M.
(1990) Repetition priming and face processing: priming occurs within the system that responds to the identity of a face. Quarterly Journal of Experimental Psychology, 42: 495–512. Emery, N. J., Lorincz, E. N., Perrett, D. I., Oram, M. W. and Baker, C. I. (1997) Gaze following and joint attention in rhesus monkeys (Macaca mulatta). Journal of Comparative Psychology, 111: 286–293. Frith, U. (1995) Autism. Explaining the Enigma. Blackwell, Oxford. Frith, U. (2001) Mind blindness and the brain in autism. Neuron, 32: 969–979. Frith, U., Morton, J. and Leslie, A. M. (1991) The cognitive basis of a biological disorder—autism. Trends in Neurosciences, 14: 433–438. Fogassi, L., Ferrari, P. F., Gesierich, B., Rozzi, S., Chersi, F. and Rizzolatti, G. (2005) Parietal lobe: from action organization to intention understanding. Science, 308: 662–667.

Gallese, V. and Goldman, A. (1998) Mirror neurons and the simulation theory of mind-reading. Trends in Cognitive Sciences, 2: 493–501. Gallese, V., Fadiga, L., Fogassi, L. and Rizzolatti, G. (2002) Action representation and the inferior parietal lobule. Attention and Performance, 19: 334–355. Gallese, V., Keysers, C. and Rizzolatti, G. (2004) A unifying view of the basis of social cognition. Trends in Cognitive Sciences, 8: 398–403. Gauthier, I., Tarr, M. J., Anderson, A. W., Skudlarski, P. and Gore, J. C. (1999) Activation of the middle fusiform "face area" increases with expertise in recognizing novel objects. Nature Neuroscience, 2: 568–573. Goodale, M. A., Milner, A. D., Jakobson, L. S. and Carey, D. P. (1991) A neurological dissociation between perceiving objects and grasping them. Nature, 349: 154–156. Goren, C. C., Sarty, M. and Wu, P. Y. K. (1975) Visual following and pattern discrimination of face-like stimuli by newborn infants. Pediatrics, 56: 544–549. Grandin, T. (1995) Thinking in Pictures and Other Reports from My Life with Autism. Vintage Books, New York. Grill-Spector, K., Knouf, N. and Kanwisher, N. (2004) The fusiform face area subserves face perception, not generic within-category identification. Nature Neuroscience, 7: 555–562. Gross, C. G., Rocha-Miranda, C. E. and Bender, D. B. (1972) Visual properties of neurons in inferotemporal cortex of the macaque. Journal of Neurophysiology, 35: 96–111. Harries, M. H. and Perrett, D. I. (1991) Visual processing of faces in temporal cortex: physiological evidence for a modular organization and possible anatomical correlates. Journal of Cognitive Neuroscience, 3: 9–24. Hasselmo, M. E., Rolls, E. T. and Baylis, G. C. (1989a) The role of expression and identity in the face-selective responses of neurons in the temporal visual cortex of the monkey. Behavioural Brain Research, 32: 203–218. Hasselmo, M. E., Rolls, E. T., Baylis, G. C. and Nalwa, V.
(1989b) Object-centred encoding by face-selective neurons in the cortex in the superior temporal sulcus of the monkey. Experimental Brain Research, 75: 417–429. Haxby, J. V., Hoffman, E. A. and Gobbini, M. I. (2000) The distributed human neural system for face perception. Trends in Cognitive Sciences, 4: 223–233. Haxby, J. V., Hoffman, E. A. and Gobbini, M. I. (2002) Human neural systems for face recognition and social communication. Biological Psychiatry, 51: 59–67. Hecaen, H. and Angelergues, R. (1962) Agnosia for faces (prosopagnosia). Archives of Neurology, 7: 24–32. Hoffman, E. A. and Haxby, J. V. (2000) Distinct representations of eye gaze and identity in the distributed human neural system for face perception. Nature Neuroscience, 3: 80–84. Hurley, S. and Chater, N. (2005) Perspectives on Imitation: From Neuroscience to Social Science, vol. 1. MIT Press, Cambridge, MA. Iacoboni, M. (2005) Neural mechanisms of imitation. Current Opinion in Neurobiology, 15: 632–637.


Iacoboni, M., Molnar-Szakacs, I., Gallese, V., Buccino, G., Mazziotta, J. C. and Rizzolatti, G. (2005) Grasping the attention of others with one's own mirror neuron system. PLoS Biology, 3: 529–535. Jellema, T. and Perrett, D. I. (2002) Coding of visible and hidden objects. Attention and Performance, 19: 356–380. Jellema, T. and Perrett, D. I. (2003a) Perceptual history influences neural responses to face and body postures. Journal of Cognitive Neuroscience, 15: 961–971. Jellema, T. and Perrett, D. I. (2003b) Cells in monkey STS responsive to articulated body motions and consequent static posture: a case of implied motion? Neuropsychologia, 41: 1728–1737. Jellema, T. and Perrett, D. I. (2005) Neural basis for the perception of goal-directed actions. In A. Easton and N. J. Emery (eds) The Cognitive Neuroscience of Social Behaviour, pp. 81–112, Studies in Cognition Series. Psychology Press, New York. Jellema, T. and Perrett, D. I. (in press) Neural representations of perceived bodily actions using a categorical frame of reference. Neuropsychologia. Jellema, T., Baker, C. I., Wicker, B. and Perrett, D. I. (2000) Neural representation for the perception of the intentionality of actions. Brain and Cognition, 44: 280–302. Jellema, T., Baker, C. I., Oram, M. W. and Perrett, D. I. (2002) Cell populations in the superior temporal sulcus of the macaque and imitation. In A. N. Meltzoff and W. Prinz (eds) The Imitative Mind: Development, Evolution, and Brain Bases, pp. 267–290. Cambridge University Press, Cambridge. Jellema, T., Maassen, G. and Perrett, D. I. (2004) Single cell integration of animate form, motion, and location in the superior temporal sulcus of the macaque monkey. Cerebral Cortex, 14: 781–790. Kampe, K. K., Frith, C. D., Dolan, R. J. and Frith, U. (2001) Reward value of attractiveness and gaze. Nature, 413: 589. Kanwisher, N., McDermott, J. and Chun, M. M. (1997) The fusiform face area: a module in human extrastriate cortex specialized for face perception.
Journal of Neuroscience, 17: 4302–4311. Karnath, H.-O. (2001) New insights into the functions of the superior temporal cortex. Nature Reviews Neuroscience, 2: 568–576. Kawashima, R., Sugiura, M., Kato, T. et al. (1999) The human amygdala plays an important role in gaze monitoring: a PET study. Brain, 122: 779–783. Keysers, C. and Perrett, D. I. (2004) Demystifying social cognition: a Hebbian perspective. Trends in Cognitive Sciences, 8: 501–507. Kourtzi, Z. and Kanwisher, N. (2000) Activation in human MT/MST by static images with implied motion. Journal of Cognitive Neuroscience, 12: 48–55. Logothetis, N. K., Pauls, J. and Poggio, T. (1995) Shape representation in the inferior temporal cortex of monkeys. Current Biology, 5: 552–563. Lorincz, E. N., Baker, C. I. and Perrett, D. I. (1999) Visual cues for attention following in rhesus monkeys. Current Psychology of Cognition, 18: 973–1001.

Lorteije, J., Kenemans, J. L., Jellema, T., Van der Lubbe, R. H. J., De Heer, F. and Van Wezel, R. J. A. (2006) Delayed response to implied motion in human motion processing areas. Journal of Cognitive Neuroscience, 18: 158–168. Marr, D. and Nishihara, H. K. (1978) Representation and recognition of the spatial organization of three-dimensional shapes. Proceedings of the Royal Society of London, B, 200: 269–294. Maunsell, J. H. R. and Van Essen, D. C. (1983) Functional properties of neurons in the middle visual temporal area of the macaque monkey. I. Selectivity for stimulus direction, speed, and orientation. Journal of Neurophysiology, 49: 1127–1147. Meltzoff, A. N. and Decety, J. (2003) What imitation tells us about social cognition: a rapprochement between developmental psychology and cognitive neuroscience. Philosophical Transactions of the Royal Society of London, B, 358: 491–500. Milner, A. D. and Goodale, M. A. (1995) The Visual Brain in Action. Oxford University Press, Oxford. Morris, J. S., Frith, C. D., Perrett, D. I. et al. (1996) A differential neural response in the human amygdala to fearful and happy facial expressions. Nature, 383: 812–815. Nakamura, K., Kawashima, R., Sato, N., Nakamura, A. and Sugiura, M. (2000) Functional delineation of the human occipito-temporal areas related to face and scene processing: a PET study. Brain, 123: 1903–1912. O'Doherty, J., Winston, J., Critchley, H., Perrett, D. I., Burt, D. M. and Dolan, R. J. (2003) Beauty in a smile: the role of medial orbitofrontal cortex in facial attractiveness. Neuropsychologia, 41: 147–155. Oram, M. W. and Perrett, D. I. (1996) Integration of form and motion in the anterior superior temporal polysensory area (STPa) of the macaque monkey. Journal of Neurophysiology, 76: 109–129. Phillips, M. L., Young, A. W., Senior, C. et al. (1997) A specific neural substrate for perceiving facial expressions of disgust. Nature, 389: 495–498. Perrett, D. I.
(1999) A cellular basis for reading minds from faces and actions. In M. Hauser and M. Konishi (eds), Behavioural and Neural Mechanisms of Communication. MIT Press, Cambridge, MA. Perrett, D. I. and Mistlin, A. J. (1990) Perception of facial attributes. In W. C. Stebbins and M. A. Berkley (eds) Comparative Perception, vol. II, Complex Signals, pp. 187–215. John Wiley, New York. Perrett, D. I. and Oram, M. W. (1993) Neurophysiology of shape processing. Image and Vision Computing, 11: 317–333. Perrett, D. I., Rolls, E. T. and Caan, W. (1982) Visual neurones responsive to faces in the monkey temporal cortex. Experimental Brain Research, 47: 329–342. Perrett, D. I., Smith, P. A. J., Potter, D. D. et al. (1984) Neurones responsive to faces in the temporal cortex: studies of functional organization, sensitivity to identity and relation to perception, Human Neurobiology, 3: 197–208.


Perrett, D. I., Smith, P. A. J., Potter, D. D. et al. (1985) Visual cells in the temporal cortex sensitive to face view and gaze direction. Proceedings of the Royal Society of London, B, 223: 293–317.
Perrett, D. I., Mistlin, A. J. and Chitty, A. J. (1987) Visual cells responsive to faces. Trends in Neurosciences, 10: 358–364.
Perrett, D. I., Harries, M. H., Bevan, R. et al. (1989) Frameworks of analysis for the neural representation of animate objects and actions. Journal of Experimental Biology, 146: 87–113.
Perrett, D. I., Mistlin, A. J., Harries, M. H. and Chitty, A. J. (1990) Understanding the visual appearance and consequences of hand actions. In M. A. Goodale (ed.) Vision and Action: The Control of Grasping, pp. 163–180. Ablex Publishing, Norwood, NJ.
Perrett, D. I., Oram, M. W., Harries, M. H. et al. (1991) Viewer-centred and object-centred coding of heads in the macaque temporal cortex. Experimental Brain Research, 86: 159–173.
Perrett, D. I., Hietanen, J. K., Oram, M. W. and Benson, P. J. (1992) Organization and functions of cells responsive to faces in the temporal cortex. Philosophical Transactions of the Royal Society of London, B, 335: 23–30.
Puce, A. and Perrett, D. I. (2003) Electrophysiology and brain imaging of biological motion. Philosophical Transactions of the Royal Society of London, B, 358: 435–445.
Puce, A., Allison, T., Bentin, S., Gore, J. C. and McCarthy, G. (1998) Temporal cortex activation in humans viewing eye and mouth movements. Journal of Neuroscience, 18: 2188–2199.
Quiroga, R. Q., Reddy, L., Kreiman, G., Koch, C. and Fried, I. (2005) Invariant visual representation by single neurons in the human brain. Nature, 435: 1102–1107.
Rolls, E. T. (1996) The orbitofrontal cortex. Philosophical Transactions of the Royal Society of London, B, 351: 1433–1444.
Rolls, E. T. (2000) The orbitofrontal cortex and reward. Cerebral Cortex, 10: 284–294.
Saxe, R., Xiao, D.-K., Kovacs, G., Perrett, D. I. and Kanwisher, N. (2004) A region of right posterior superior temporal sulcus responds to observed intentional actions. Neuropsychologia, 42: 1435–1446.
Seltzer, B. and Pandya, D. N. (1994) Parietal, temporal and occipital projections to cortex of the superior temporal sulcus in the rhesus monkey: a retrograde tracer study. Journal of Comparative Neurology, 343: 445–463.

Senior, C., Barnes, J., Giampietro, V. et al. (2000) The functional neuroanatomy of implicit-motion perception or ‘representational momentum’. Current Biology, 10: 16–22.
Tanaka, Y. Z., Koyama, T. and Mikami, A. (1999) Neurons in the temporal cortex changed their preferred direction of motion dependent on shape. Neuroreport, 10: 393–397.
Thorpe, S. J., Rolls, E. T. and Maddison, S. (1983) The orbitofrontal cortex: neuronal activity in the behaving monkey. Experimental Brain Research, 49: 93–115.
Tooby, J. and Cosmides, L. (1990) The past explains the present: emotional adaptations and the structure of ancestral environments. Ethology and Sociobiology, 11: 375–424.
Tsao, D. Y., Freiwald, W. A., Tootell, R. B. H. and Livingstone, M. S. (2006) A cortical region consisting entirely of face-selective cells. Science, 311: 670–674.
Umilta, M. A., Kohler, E., Gallese, V. et al. (2001) I know what you are doing: a neurophysiological study. Neuron, 31: 155–165.
Ungerleider, L. G. and Mishkin, M. (1982) Two cortical visual systems. In D. J. Ingle, M. A. Goodale and R. J. W. Mansfield (eds) Analysis of Visual Behavior, pp. 549–586. MIT Press, Cambridge, MA.
Wachsmuth, E., Oram, M. W. and Perrett, D. I. (1994) Recognition of objects and their component parts: responses of single units in the temporal cortex of the macaque. Cerebral Cortex, 4: 509–522.
Wicker, B., Keysers, C., Plailly, J., Royet, J. P., Gallese, V. and Rizzolatti, G. (2003) Both of us disgusted in my insula: the common neural basis of seeing and feeling disgust. Neuron, 40: 655–664.
Williams, J. H. G., Whiten, A., Suddendorf, T. and Perrett, D. I. (2001) Imitation, mirror neurons and autism. Neuroscience and Biobehavioral Reviews, 25: 287–295.
Xu, Y. D., Liu, J. and Kanwisher, N. (2005) The M170 is selective for faces, not for expertise. Neuropsychologia, 43: 588–597.
Young, M. P. and Yamane, S. (1992) Sparse population coding of faces in the inferotemporal cortex. Science, 256: 1327–1331.
Zacks, J. M., Braver, T. S., Sheridan, M. A. et al. (2001) Human brain activity time-locked to perceptual event boundaries. Nature Neuroscience, 4: 651–655.