Phenom Cogn Sci (2007) 6:509–525
DOI 10.1007/s11097-007-9070-2
REGULAR ARTICLE

Face to face with an enactive approach: A sensorimotor account of face detection and recognition

Aaron Kagan

Published online: 26 September 2007
© Springer Science + Business Media B.V. 2007

Abstract The enactive approach to perception describes experience as a temporally extended activity of skillful engagement with the environment. This paper pursues this view and focuses on prosopagnosia both for the light that the theory can throw on the phenomenon, and for the critical light the phenomenon can throw on the theory. I argue that the enactive theory is insufficient to characterize the unique nature of experience specific to prosopagnosic subjects. There is a distinct difference in the overall process of detection (with respect to eye movement sequence) of familiar and unfamiliar faces in prosopagnosia; in contrast, normal subjects use the same scanning strategy when exploring both kinds of faces despite an obvious difference in qualitative character. In light of this limitation I outline a supplemental view basing sensorimotor contingencies upon the establishment and reaffirmation of regularities within the organism as it engages with the environment.

Keywords Enaction · Prosopagnosia · Eye movement · Veridicality · Consonance

A. Kagan (*) Department of Philosophy, Fordham University, Bronx, NY, USA e-mail: [email protected]

Under the aegis of the burgeoning paradigm of perception and experience generally referred to as the ‘enactive approach’, the following seeks to provide a sensorimotor account of face perception. I start by summarizing the enactive approach to perception and sensorimotor contingency theory – particularly as championed by Noë (2004) and O’Regan and Noë (2001) – then provide a brief introduction to face perception and the face recognition deficit known as prosopagnosia (parts 1 and 2). With these in place, a sensorimotor account of face recognition is essayed in part 3. This account falls short, however, given that normal subjects act the same way in perception of both familiar and unfamiliar faces, despite a considerable difference in experiential character. I suggest a supplement to compensate for this inadequacy. The general idea is that multiple processes throughout the organism must take place in a synchronous fashion in order for experience(s) to coincide. This claim is discussed by juxtaposing empirical data on eye movement sequences and implicit recognition in face perception. It is then suggested that one’s low-level implicit responses may play a role in facilitating or impeding the exercise of regularities in sensorimotor contingencies – a condition necessary for veridicality. The results of this endeavor are twofold:

i. The work demonstrates how an enactive paradigm can re-interpret old and ‘abandoned’ empirical data, and suggests fecund avenues for further research (i.e. the suggested series of experiments in part 5 that would help inform the articulation of sensorimotor contingency theory).

ii. As a result of the attempted espousal of enactive perception and the employed empirical data, I begin to outline a condition for the possibility of sensorimotor contingencies.

Enactive perception

The enactive approach to perception supports the idea that what is experienced depends crucially upon the way (how) one proceeds, as it classifies perceptual experience as “a temporally extended activity of skillful engagement with the environment” (Noë 2005, p. 1). Perception is described as an act based upon “implicit practical (not propositional) knowledge of the ways movement gives rise to changes in stimulation” (Noë 2004, p. 8). Experience rests upon such know-how, or the possession of skills. Experiences themselves then are not things that happen to us, but rather are ‘ways’ of acting, or things that we do (hence the term ‘enactive’). Part of the theory’s appeal is its distinctiveness concerning major themes in the philosophy of perception. “The sensorimotor approach allows us to address the problem of the explanatory gap: that is, the problem of explaining perception, consciousness, and qualia in terms of physical and functional properties of perceptual systems” (O’Regan and Noë 2001, p. 1020). There is no ‘qualifier’, or additional internal content motivated by, or arising from, different kinds of subject-object interactions constituting experience. In addition, notions of ‘raw feels’ or sensational properties that have been said to fix what it is like to have certain kinds of experience (i.e. qualia), are rendered a nonissue.1 For even the most distinctive experiences, ‘what it is like’ is constituted by all the sensorimotor contingencies and one’s skillful mastery, ‘confident knowledge’2, and exercise of them. Thus in terms of qualia “there is no introspectibly available property determining the character of one’s experiential states, for there are no such states” (O’Regan and Noë 2001, p. 960). Instead, it is through ways of interacting that we experience and perceive, since experiences are ways of acting. Therefore, “to reflect on the character of one’s experience is to reflect on the character of one’s law-governed exploration of the environment” (O’Regan and Noë 2001, p. 961).

1 For more on the enactive approach and its relation to qualia theory see Noë (2004, section 3.10) and O’Regan and Noë (2001, section 6).

2 More on this later.


Development of sensorimotor knowledge structures experience as it is enacted. As a result we can construct sensorimotor accounts of different perceptual modalities. Each modality has a particular “structure of the rules governing the sensory changes produced by various motor actions” (O’Regan and Noë 2001, p. 940). The character of each modality is determined by patterns of sensorimotor integration. According to this notion, modes of exploration of the world are mediated by knowledge of sensorimotor contingencies. With this idea, “how things look, smell, sound or feel (etc.) depends, in complicated but systematic ways, on one’s movement. The sensory modalities differ in distinctive forms that this dependence takes place” (Noë 2004, p. 109). O’Regan and Noë (2001) have developed a sensorimotor account of vision that is said to provide “a natural and principled way of accounting for visual consciousness, and for the differences in the perceived quality of sensory experience in the different sensory modalities” (ibid., p. 939). Visual perception then becomes the activity of exploring the environment in ways mediated by knowledge of the relevant sensorimotor contingencies; in this case, mediated by patterns in how things look. The framework of visual perceptual experience requires a sensorimotor integration characteristic of the visual modality (Noë 2004, p. 95). Sensorimotor contingencies governing visual exploration are determined by the visual apparatus, the visual attributes (of the object), and visual awareness.3 For instance, under a sensorimotor account, part of how things look stems from sensorimotor contingencies induced by the visual apparatus. Such laws are abstracted because “the distribution of information sensed by the retina changes drastically, but in a lawful way, as the eye moves” (O’Regan and Noë 2001, p. 941). 
A classic example of ‘unlawful’ activity is to close one eye and push on the open eyeball with a finger; the entire world ‘out there’ seems to be moved. Such non-veridical4 experience stems from the act being grossly outside “the structure of laws abstracted from the sensorimotor contingencies” (O’Regan and Noë 2001, p. 942) – more specifically from part of the visual apparatus (i.e. the eye).5 Such anomaly disrupts the regular relationship “between the movement of the retinal image, on the one hand, and the movement of things in the world, on the other” (Noë 2004, p. 37), violating “visual-apparatus-dependent-rules” (O’Regan and Noë 2001, p. 970). This alters “the patterns of dependence between movement and stimulation” (Noë 2004, p. 8) in a manner so radically different from regular visual exploration that the character of the experience is ‘unreal’. Our encounter with objects as they are explored (be it visually, tactilely or otherwise) provides us with a sensory relation “mediated by familiar patterns of sensorimotor dependence” (Noë 2004, p. 77). It is through this sort of implicit understanding that perceptual experience acquires content. This gnosis of the way an object’s appearance changes as you move with respect to it – or, more strictly speaking, the way sensory stimulation varies as you move – is known as the “sensorimotor profile” of the object (Noë 2004, p. 78). Such profiles make up a kind of perceived quality or appreciated character of what is being explored. Objects, when explored visually, present themselves to us as provoking sensorimotor contingencies of certain typically visual kinds, corresponding to visual attributes such as color, shape, texture, size, hidden and visible parts (O’Regan and Noë 2001, p. 944). Seeing a plate as circular, despite its elliptical appearance when viewed at an angle, “consists in our perception of its profile and our understanding of the way the profile, or apparent shape, depends on movement” (Noë 2004, p. 78). More simply stated, it is through its appearance as elliptical that we see it as circular. This spatial relation gives an appreciation of its actual shape and the subsequent veridical nature of one’s experience. Based on the “laws abstracted from the sensorimotor contingencies” (O’Regan and Noë 2001, p. 942) sensorimotor profiles display a visual potential.6

3 For a detailed description of these sensorimotor contingencies see O’Regan and Noë (2001).

4 It is important to note that by ‘non-veridical’ I simply mean that the experience itself is inconsistent with what we regard as true (i.e. the world ‘out there’ is not in fact moving). I do not use this term to suggest that one does not truly experience it as such.

5 The eyeball itself, in one’s normal course of visual exploration, does not take on such a shape, hence this alteration is severe with respect to experiential character.

Faces

For some ‘kinds’ of objects, we are (or eventually become) classification experts. Knowing a John Constable from a Thomas Gainsborough landscape, for example, or a normal from an abnormal EKG, stems from the importance one places upon classes of objects, and the individual objects within the class. Faces, however, are a ‘class’ of objects we as humans are all experts at recognizing, and discerning individuals within this class seems an almost instantaneous activity. A face we’ve seen before is much more discernible in a crowd of other faces than, say, a particular stone we’ve looked at is recognizable in a pile of rocks.7 It seems that the “interaction of evolutionary and experiential forces has produced a neural mechanism that still outperforms the best computer vision algorithms” (Kanwisher and Moscovitch 2000, p. 1) as “a familiar face will be recognized from almost any angle in less than 250 milliseconds” (Churchland 1995, p. 27) despite the fact that “contrary to most objects, faces pose a special challenge to the visual system because they belong to a highly visually homogenous category and need to be recognized at the individual level for efficient social interactions” (Rossion et al. 2003, p. 13). Yet we ‘experts’ at discrimination can often ‘lose our knack’ or rather, become unable to recognize the differences that once seemed so striking, despite their prevailing importance or significance to us. Brain damage to specific areas can create such an immediate and drastic loss. Recently, several cortical regions specific to face detection and recognition have been isolated (Rossion et al. 2003; Grill-Spector and Knouf 2004). Earlier “evidence that this ability is localized in the brain came from certain stroke patients; strokes that hit the ventral temporal cortex can cause a syndrome called prosopagnosia” (Helmuth 2001, p. 196). Prosopagnosia has been classically defined as “the inability to recognize faces” (ibid.). Yet without a clear distinction between detection and recognition, this definition is misleading. “Face detection is preserved in prosopagnosia” (Grill-Spector and Knouf 2004, p. 560, my emphasis), as one who suffers from this disorder does recognize faces-as-faces in contrast to other objects. The features of the face, and in some cases, the sex and physiognomy, can even be correctly distinguished (Damasio and Tranel 1990; Young and Haan 1988; Farah 2004). The inadequacy, however, lies in recognizing to whom the face belongs. Being ‘face blind’, as those who suffer from prosopagnosia often call it, seems a more ample term. This description still implies the ability to recognize an object as a face. While a prosopagnosic’s sight and ability to detect faces remain fine, they are unable to recognize individuals (including themselves) based on the sight of their faces alone. Despite general difficulty in classifying ‘pure’ or ‘clean-cut’ cases of this disorder and agnosias in general8, in prosopagnosia damage occurs to a specific (occipito-temporal) cortical region categorized in neurology as an association area. Injury to these areas seems to thwart processes involved in associating past and present impressions. Under the aegis of the enactive approach, it becomes necessary to describe this disorder in terms of sensorimotor knowledge.

6 Whose veridical basis, I will argue, stems from further exploration coinciding with the relevant laws abstracted.

7 I owe this analogy to Cecelia Burman’s (2004) website.

An enactive account of face perception

Since we are addressing the perceived quality of how things look, this certainly implicates the visual sensory modality. According to sensorimotor contingency theory, categories are abstracted from the sensorimotor contingencies “that allow them to be classified into sensory modalities and, within these, into different categories like red, blue, [and] green” (O’Regan and Noë 2001, p. 1015). It seems reasonable to suggest that within the visual modality a category for human faces – the most common type of object we are universally faced with from birth – is abstracted with respect to how faces look. If we are to “reflect on the character of one’s law-governed exploration of the environment” (O’Regan and Noë 2001, p. 961), then perception of a face can in part be described as an act based upon “implicit practical knowledge of the ways [eye] movement gives rise to changes in stimulation” (Noë 2004, p. 8). Therefore, if experiences themselves are ways of acting, then regard for eye movement in both face detection and recognition is certainly justified. We are also dealing with a kind of ‘naturally embedded’ expertise. While we may not always be able to associate a name with a face, we nevertheless, from even an early age, are incredibly accurate at recognizing a present face as familiar – i.e. one that we have seen before. Despite similarity in task9 there nevertheless is a distinct difference in experiential character between recognizing a familiar face and perceiving the face of a stranger. Sensorimotor contingency theory is said to account for just these sorts of distinguished differences, and in doing so opportunely dispenses with qualia. O’Regan and Noë (2001) purport that the distinct feeling of “what it is like to drive a Porsche” for example, “is constituted by all these sensorimotor contingencies and by one’s skillful mastery of them, – one’s confident knowledge of how the car will respond to manipulations of its instruments”. Their example allows one to see how the difference in experiential character between that of the connoisseur and the neophyte is accommodated under a sensorimotor account:

...when you drive a Porsche for the first time, you may at first lack confident knowledge of how the car will respond to your actions. Insofar as you are an experienced driver of cars, you will exercise confident mastery of how to drive. In so far as you are new to Porsches, you may be tentative and exploratory. You try to learn how the car performs. The distinctive feel of driving a Porsche for the first time can thus be understood to differ from the experience of the connoisseur (O’Regan and Noë 2001, fn 42).

While my intention is not to conflate the activity of Porsche driving with that of face perception, if we can tentatively regard prosopagnosics as akin to ‘novices’ in the task of face discrimination and normal subjects as ‘experts’, the above example may serve as an indication of each perceiver’s ‘way of acting’. This suggests that novices may exhibit a hesitant or uncertain sort of exploratory routine while experts should have a more ‘confident’ manner of visual exploration. Finally, and most simply, if experiences are in fact ‘ways of acting’, then different experiences should (to at least some degree) be different ways of acting. The evident difference in the qualitative character of experience in face detection and in face recognition10 suggests that such experiences are different ways of acting or things that we do.

8 The difficulty largely stems from the fact that brain damage to such a specific area of tissue is relatively unrestricted (e.g. head trauma, tumors, etc.), particularly because systematic destruction of brain portions to determine function (the experimental ablation method) is not employed in humans – for obvious ethical reasons. See also: Farah (2004, p. 2).

9 In this case, simply looking at human faces.
Given the unique character of experience in prosopagnosia, this would suggest anomaly with respect to the way prosopagnosics visually palpate familiar faces (given the non-veridical character of such experience11) in contrast to unfamiliar faces. The profound inability to recognize a familiar face would suggest alteration of the rules governing sensory changes by motor actions in the visual modality. Since face detection is preserved – to wit, faces are still seen ‘as-faces’ – sensorimotor knowledge with respect to faces (as a category) should still be present and, like normal subjects, should yield a particular ‘way’ regarding eye movement sequence. Yet sensorimotor contingencies (or lack thereof) involved in face recognition may be what is altered in prosopagnosia, engendering aberration in the visual exploration of familiar faces. Thus it follows from the enactive approach that anomalous eye movements would be present in the pathology of prosopagnosia. In adults who have recently been afflicted with prosopagnosia (i.e. ‘veterans’ at face recognition who have ‘lost their knack’ and are now ‘novices’) this ‘style of looking’ may in fact be unusual.12

10 Or, in the case of prosopagnosia, a lack thereof.

11 E.g. a familiar face perceived as unfamiliar.

12 It is important to note that there is nothing wrong with voluntary eye movement or the visual field. That is, this is a ‘blindness’ to faces only.

In short, a sensorimotor account of face perception suggests the following:

1. An examination of eye movement in face detection and recognition should reveal different ways of acting given the difference in experiential character.

2. Face recognition ‘experts’ should exercise a confident mastery over familiar faces, while ‘novices’ (e.g. prosopagnosics) should display a tentative and exploratory manner of acting.

This may begin to address face perception as a whole in an enactive light, and to account for both the peculiar experiential character of prosopagnosia and its pathology, which leaves open the possibility of empirical verification.

Vision data

Recent vision research has established that “visual exploration consists of stereotypical sequences of saccadic eye movements which are known to depend upon both external factors, such as visual stimulus features, and internal cognition-related factors, such as attention and memory” (Leonards and Scott-Samuel 2005, p. 2677). In the act of visual exploration, objects are often regarded in a typical or even ‘predictable’ manner relative to each individual. These ‘paths’, followed intermittently but repeatedly by a subject’s eye while viewing different kinds of objects, have been commonly referred to in vision research as scanpaths. Scanpath theory has also been said to bear “a historical relation to the enactive approach and sensorimotor contingency theory” (O’Regan and Noë 2001, p. 945), and one also finds a close similarity between the notions of scanpaths and sensorimotor profiles. Based on the aforementioned account, this would suggest a link between scanpaths and the character of experience. With respect to faces “subjects will normally use a regular scanning strategy when viewing faces in a recognition task” (Walker-Smith and Gale 1977, p. 320). Given that exploring faces is routine, it would seem to follow, under the enactive approach, that such activity would not be present in prosopagnosics when recognizing faces, and thus begin to explain the differences in their peculiar perceived quality of sensory experience. A study examining the role of scanpaths in facial recognition and learning has found regular or ‘predictable’ scanpaths in both normal and prosopagnosic subjects when viewing unfamiliar faces (i.e. face detection). This ‘predictability’ is calculated using lambda indexes, which are a “quantitative measure of the sequential dependence of fixations in a scanpath” that calculate the “transitional probabilities of moving from one picture quadrant to the other” (Rizzo and Hurtig 1987, p. 42). Such high lambda indexes13 provide evidence of a particular ‘way’ faces are visually explored. What is interesting is that for prosopagnosics, “as with control subjects, the greatest number and duration of fixations was spent in those regions containing the essential facial features” (ibid., p. 43), and normal subjects appeared to use the same scanning strategy (i.e. acted the same ‘way’) in perceiving both unfamiliar and familiar faces, demonstrating no clear difference in visual exploration between face detection and face recognition. The fact that visual exploration is normally enacted the same way for both familiar and unfamiliar faces, despite obvious differences in the qualitative character of experience14, challenges the idea that different perceptual experiences are different ‘ways of acting’ and seems to render sensorimotor contingency insufficient in accounting for “the differences in the perceived quality of sensory experience” (O’Regan and Noë 2001, p. 939). These empirical results touch upon Clark’s (2002) theoretical suspicion of O’Regan and Noë’s overdependence on variation in motor function to account for distinctness in experience. Clark feels that a sensorimotor approach “runs the risk...of a certain kind of over-sensitivity to low-level motoric variation” (Clark 2002, p. 193, author’s emphasis) which he calls “sensorimotor chauvinism” (ibid., p. 181). Based on this concern, he states that “O’Regan and Noë must either accept that every difference makes a difference or they owe us an account of which ones matter and why” (ibid., p. 192). The authors respond to this claim and begin to elucidate a means of demarcating the attribution of experiences to robots and other physically different creatures, but this does little to aid the problem of the chauvinism we face here. Furthermore, Clark feels that “the question of what differences make a difference should...be an open empirical question” (Clark 2002, p. 194). And, as we have seen, from an eminently ‘chauvinistic’ perspective, the view that every difference (in experience) makes a difference (in behavior) runs contrary to the empirical results.

13 [H]igh lambda indexes the predictable control of scanpaths by external features in a visual stimulus, while low lambda indexes the scanpaths that are less directly predicted from the visual stimulus under consideration (Rizzo and Hurtig 1987, p. 43).
While the data speak against suggestion (1), and the primitive sensorimotor account of normal face perception seems insufficient, there is, however, a difference in the way of acting between experts and novices that seems to support suggestion (2) of the previous section. As they explore unfamiliar faces, prosopagnosics proceed in the same manner as normal subjects with regard to “fixation, pursuit, saccades, and scanning of salient features of scenes and faces” (Rizzo and Hurtig 1987, p. 41). Thus it seems that sensorimotor knowledge is, in fact, present with regard to faces (as a category), as these objects are consistently ‘felt out’ in a manner similar to normal subjects. The interesting finding is that when presented with familiar faces, prosopagnosics exhibit significantly less predictable eye movements. The average asymmetrical lambda for personally meaningful faces is 0.36 while for other faces it is 0.54 (ibid.). This less predictable and more ‘scattered’ means of exploration may be akin to the tentative and exploratory manner that ‘novices’ exhibit, while a strong and predictable way of acting by the ‘experts’ demonstrates a confident knowledge and mastery of the way eye movement gives rise to changes in stimulation. Given the success of the second claim, this sparse enactive account of prosopagnosia warrants further articulation.
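To make the lambda comparison above more concrete, here is a minimal sketch of how such an index could be computed from a sequence of fixations. It assumes that the ‘asymmetrical lambda’ is the Goodman and Kruskal lambda applied to a table of quadrant-to-quadrant transitions; the text quoted from Rizzo and Hurtig (1987) does not spell out the formula, so this is an illustration under that assumption and not a reconstruction of their procedure. The function names, the coding of picture quadrants as integers 0 to 3, and the two fixation sequences are hypothetical.

```python
def transition_counts(quadrants, n_states=4):
    """Count transitions between successive fixations.

    quadrants: list of ints in range(n_states), one per fixation
               (hypothetical coding of picture quadrants).
    Returns an n_states x n_states table where counts[a][b] is the
    number of times a fixation in quadrant a was followed by one in b.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(quadrants, quadrants[1:]):
        counts[current][following] += 1
    return counts


def asymmetric_lambda(counts):
    """Goodman-Kruskal lambda for predicting the next quadrant
    from the current one (assumed form of the lambda index).

    lambda = (sum_i max_j n_ij - max_j n_.j) / (N - max_j n_.j)
    1.0 means the current quadrant fully predicts the next one;
    0.0 means knowing the current quadrant does not help at all.
    """
    total = sum(sum(row) for row in counts)
    if total == 0:
        return 0.0
    column_totals = [sum(column) for column in zip(*counts)]
    best_blind_guess = max(column_totals)      # correct guesses ignoring the current quadrant
    if total == best_blind_guess:
        return 0.0                             # only one destination ever occurs
    best_informed = sum(max(row) for row in counts)  # correct guesses using the current quadrant
    return (best_informed - best_blind_guess) / (total - best_blind_guess)


# Hypothetical fixation sequences, for illustration only.
regular = [0, 1, 2, 3] * 5                                # a strictly repeated scanpath
scattered = [0, 1, 0, 2, 1, 3, 2, 0, 3, 1, 2, 3, 0, 0]   # no preferred successor

lambda_regular = asymmetric_lambda(transition_counts(regular))
lambda_scattered = asymmetric_lambda(transition_counts(scattered))
```

On these two made-up sequences the strictly repeated scanpath gives a lambda of 1.0 and the scattered one gives 0.0, which brackets the reported averages of 0.54 and 0.36: a lower value means the next fixation is less predictable from the current one.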

14 For example, a face seen as ‘Mom’ versus a face unidentified or seen as ‘unfamiliar’.

Integrated behavior

The subsequent lack of order, or ‘more wavering’ scanpaths found in prosopagnosics’ visual exploration of familiar faces, I suggest, can be interpreted as a kind of ‘visual groping’ under the idea of sensorimotor contingency. Interpreting erratic palpation as a kind of ‘groping’ stems from the notion that vision has a “touch like character” (Noë 2004, p. 17), similar to much of Merleau-Ponty’s phenomenology that speaks of vision as “palpation with the look”15. For the enactive approach, seeing is a visual manner of ‘feeling out’ objects, and in this sense vision is intrinsically active (Noë 2004, p. 96). It is through our skillful ways of looking at objects based on our sensorimotor knowledge that we enact our perceptual content (ibid., p. 73). In addition, sensorimotor contingency theory maintains that “seeing also necessarily involves particular forms of action and reaction on the part of the visual apparatus and the environment” (O’Regan and Noë 2001, p. 960) – more specifically that “visual awareness is a fact at the level of the integrated behavior of the whole organism” (ibid., p. 969). I suggest this ‘integrated behavior’ involves the concerted coordination, or synchronous activity, of multiple processes of the organism as the environment is explored, for experience to be of a veridical nature. Regular or routine palpation indicates validated contingency. That is to say, what is discovered through exploration relatively affirms or falls within “the structure of laws abstracted from the sensorimotor contingencies” (O’Regan and Noë 2001, p. 942). This requires the absence of any major degree of anomaly, or ‘surprise’, within the organism’s otherwise consonant proceedings. A relative consonance must continually occur within the interplay of an organism’s various systems for any sort of routine, or ‘schema’, to be acted out. This idea is summarized in Fig. 1.
With something like recognition (or lack thereof), consonance and dissonance occur (in part) from the integrated proceedings of variables I, S, and G. “Subjects will use a regular scanning strategy when viewing faces in a recognition task” (Walker-Smith and Gale 1977, p. 320) provided there are no major surprises or inconsistencies in what is confirmed or reaffirmed based on sensorimotor contingencies, as this seemingly rote process is quickly carried out. The kind of ‘groping’ mentioned earlier is the result of dissonance from conflict in one’s integrated behavior, or actions in perception. While a particular scanning strategy may not be absolutely essential for all levels of recognition, “cases without regular scanning sequence lead to longer latencies” (Walker-Smith and Gale 1977, p. 320). This would indicate a temporary delay in making sense of the sensations. It has been demonstrated that in cases in which the regular sequence of eye movements is not followed, identification is less accurate and/or takes longer (ibid.). I suggest that this is due to significant levels of dissonance among the three variables (I, S, and G) in Fig. 1, whose consonance would have produced a more ‘regular’ (perhaps even veridical) experience. O’Regan and Noë state: “the basis of the qualitative character of experience, in our view, is the perceiver’s knowledge of the interdependence between stimulus and bodily movement” (O’Regan and Noë 2001, p. 1014). I suggest that such knowledge is dependent upon a relative consonance within the perceiver which must be established as any ‘type’ of scanpath or confident way of acting is enacted. Routinized behaviors – be they scanpaths, the enactment of sensorimotor contingencies, or body schemas in general – are developed in and dependent upon regularity. In the case of visual perception, “it is one’s exercise of the mastery of just such regularities in sensorimotor contingencies in which seeing consists” (O’Regan and Noë 2001, p. 968).

15 “...la vision est palpation par le regard” (Merleau-Ponty 1964, p. 177).

[Fig. 1, diagram text as recovered from the figure]
I is explored as a particular ‘kind’ of S proceeds based on the relative confirmation/validation of sensorimotor contingencies as I is routinely ‘felt out’. G occurs in reflexive conjunction with S based on I and further drives or reinforces the enactment of the particular scanpath (e.g. S1).
o Relative consonance: routine palpation continues (i.e. normal).
o Significant dissonance between I, S, or G: scanpath is thwarted. If dissonance is severe enough this yields to ‘groping’. Deterioration of a particular scanpath into a kind of ‘groping’ to make sense possibly occurs in an attempt to reestablish consonance as S-I interplay fails to verify/confirm what G purports (i.e. S1 disbanded, and the ensuing groping is a rapid trial and error for harmony).
Routine palpation (or lack thereof) in the pathology of different perceptual modalities of amnesic associative face agnosias may verify (or falsify) this claim.

Fig. 1 “Recognition is not an instantaneous event but rather a continuous process of growing certainty” (Norton and Stark 1971). Part of what facilitates or deters this process are implicit basic autonomic responses (variable G)


What this means for re-cognition

The seeming immediacy of face recognition belies how much of an ongoing procedure this phenomenon may be. While it is tempting to regard this seemingly instant process in a more ‘static’ light, this is unhelpful, and the importance of a genetic account in enactive perception has also been emphasized (see Thompson 2005). The originators of scanpath theory have also suggested that “recognition is not an instantaneous event but rather a continuous process of growing certainty” (Norton and Stark 1971, p. 938). This highlights the main idea of the relationship behind variable I (confirmation/validation) and its influence on the ‘kind’ of scanning strategy (S) to be acted out. This dynamic serves as a kind of feedback loop based upon an even ‘lower-level’ fundamental property of rapid sequential eye movements, known as saccades:

Even in the simple situation in which a subject moves his eyes voluntarily between two fixed markers the eye movement frequently takes the form of a double saccade – an initial step followed by a second corrective step. The latter may be in any direction and is controlled by a peripheral feedback loop (Weber and Daroff 1971, 1972). It appears that the effector mechanism whereby an eye movement command signal is converted into an actual movement is not intrinsically very precise but achieves precision by the use of feedback mechanisms (Walker-Smith and Gale 1977, p. 318).

Part of what facilitates this process are implicit basic autonomic responses (variable G), which are a (‘low-level’) component of making sense of a stimulus. For Noë, making sense of a stimulus “is a matter of figuring out the patterns of sensorimotor dependence governing your relation to the object”, which is to be kept distinct from “having certain kinds of feelings” (Noë 2005, p. 22).16
The underlying idea is that perceptual sensations alone are not sufficient for the sense of the presence of the feature, but a ‘making sense’ of the sensations is also necessary.17 In this light, these ‘gut responses’ can be designated as a ‘low-level making sense’ or, more specifically, a form of somatic-marking.18 I interpret actions such as unpredictable eye movements as demonstrating incongruous proceedings, which can be regarded as an indication of ‘confusion’19 or non-veridicality regarding the visual world, and which facilitate the notion of groping. This idea would suggest significant dissonance in prosopagnosia when familiar faces are viewed and cannot be recognized, yet precisely which systems or variables are

16 The statement is made in regard to adaptation to reversing goggles.

17 A more concrete example of this is the blindness experienced as one puts on inverted lenses for the first time. For a full interpretation of this see Hurley and Noë (2003).

18 For the sake of brevity, I will have to gloss over the somatic-marker hypothesis and the role of ‘emotion’ in rationality, and refer the reader to Levin et al. (1991, pp. 218–229) and Damasio (1994, 1999).

19 By ‘confusion’ I simply mean that sense cannot be made.


conflicting remain to be determined. As there is no irregularity or dissonance with respect to the environment and the objects themselves – i.e. normal subjects under the same conditions have little difficulty in identification and the explored objects themselves are stagnant and ‘unsurprising’ – focus is then directed towards investigating where, in the afflicted subjects, the digression lies.

Implicit and explicit discord in prosopagnosia

There is little or no memory loss associated with prosopagnosia. Experiences are retained and recollected quite well and can be articulated rather eloquently (as reflected in most prosopagnosic message boards). Yet visual exploration of the familiar face alone is insufficient to elicit the association of past events with a present face, as the familiar individual is unidentified (i.e. perceived as unfamiliar). As described earlier, the success of recognition depends on a sufficiently comprehensive set of processes taking place “in a synchronous fashion” (Damasio and Tranel 1990). In patients with prosopagnosia, studies have surprisingly “found autonomic responses to the presentation of familiar faces that patients failed to recognize overtly but not to unfamiliar faces” (Sergent and Poncet 1990, p. 990). Familiar faces are identified but only on an unconscious, covert level – to wit, the ‘recognition’ is only implicit.20 Some form of learning also takes place, as new faces can also be implicitly identified after repeated encounters, despite the consistent failure in explicit recognition. “The evidence of covert recognition suggests that some knowledge was effectively activated but was either insufficient or not specific enough to enable overt recognition” (ibid., p. 995). A sensorimotor account of vision maintains that “all seeing involves some degree of awareness, and some degree of unawareness” (O’Regan and Noë 2001, p. 944).
Since the learning and recognition of familiar faces is only implicit in prosopagnosia, “the processing of familiar faces and their recognition are still taking place outside the patient’s realm of awareness” (Sergent and Poncet 1990, p. 990). Under the ideas outlined in this paper, the discord between implicit and explicit recognition is indicated by the uncertain reaching about or ‘groping’ of the visual apparatus. Similar to the ‘unlawful’ activity described in part 1, such actions are grossly outside “the structure of laws abstracted from the sensorimotor contingencies” (O’Regan and Noë 2001, p. 942) and are not part of “the patterns of dependence between movement and stimulation” (Noë 2004, p. 8), rendering such activity ‘unlawful’ and contributing to the non-veridical character of such experience. Groping suggests uncertainty, or rather, a lack of sensorimotor knowledge – particularly the ‘confident knowledge’ noted earlier. Thus the disruption between the normally concerted processes of implicit and explicit recognition begins to explain the peculiar non-veridical qualitative character of experience in prosopagnosia – that of a face in fact familiar to the subject, but not explicitly seen as such. Further evidence for discord and experiential non-veridicality can be seen in Capgras delusion, as it is a mirror of prosopagnosia with respect to implicit and

20 Typical autonomic responses can be measured by electrodermal discrimination in skin conductance response tests.


Fig. 2 The success of recognition depends on the synchronous occurrence of a sufficiently comprehensive set of processes. “In other words, within approximately the same time window, the perceiver must not only see aspects of the face, but also have an internally recalled experience of information that pertains uniquely to that face” (Damasio and Tranel 1990). Other facial agnosias may fit along this continuum as well, once the relationship between these two types of recognition in each disorder is established.

explicit discord. Familiar individuals are explicitly recognized but not implicitly. Patients with Capgras are devoid of the autonomic responses present in successful recognition21 while explicit recognition and face perception are maintained. The character of experience with this delusion is likewise ‘non-veridical’ but in a way converse to prosopagnosia. The delusion is characterized by the belief that people look familiar but are no longer who they were and have been replaced instead by doubles, robots, or aliens. The inversely dissonant nature of this disorder’s pathology affords its placement on the continuum pictured in Fig. 2. Examination of the manner in which familiar individuals are palpated by those suffering from Capgras delusion may reveal a similar groping of certain sensory modalities in its pathology.

Suggested themes for future research

What complicates matters further is the well-documented phenomenon of prosopagnosics’ paradoxically superior performance at matching faces when they are inverted, while in normal subjects this ‘face inversion effect’ creates a “loss of our normal proficiency at face perception when faces are inverted”. As objects, inverted faces have “the same complexity, inter-item similarity, [and] spatial frequency” (Farah and Wilson 1995, p. 2089) as upright faces, rendering them relatively identical and differing only in orientation. It seems that turning things upside-down impairs performance in normal subjects but improves it in prosopagnosics with respect to latency, response time, and accuracy.

21 Such implicit responses are often given the role of emotion, and likewise this delusion is often characterized as a ‘loss of’ or an ‘under-active’ “emotional recognition system” (Carter 1999, p. 123).


Based on the ideas postulated in this paper, if eye movements when viewing inverted faces are recorded, I hypothesize that prosopagnosics will have more predictable and ‘unwavering’ scanpaths (e.g. high lambda indexes) while in normal subjects, the ‘face inversion effect’ would suggest the presence of ‘groping’ or less predictable scanpaths (e.g. significantly lower lambda indexes). If such visual groping is present in normal subjects – that is, where no dissonance within the subjects is manifest – this would demonstrate that such ‘confusion’ is related to external factors since the ‘whole organism’ is confused.22 A recent study concerning only initial saccades in normal subjects23 has revealed that inverted faces “elicited far less systematic direction responses” (Leonards and Scott-Samuel 2005, p. 2681) as well as “significantly longer saccade latencies for inverted faces than [upright] faces” (ibid., p. 2682). A less systematic and more delayed manner of exploration may already be evidence of a kind of groping. The simple act of inverting familiar faces may also be worthwhile. Testing for the presence of subsequent implicit responses when faces are inverted may help to further (or falsify) the role of consonance and the character of experience. Turning overtly recognized faces upside down may inhibit a prosopagnosic’s implicit responses to these objects, and allow for more consonant proceedings. It would also be interesting to record the effect this has on normal subjects. A study by Kilgour et al. (2004) has indicated that prosopagnosia may in fact be cross-modal. Investigation into recognition based on touch (haptics) revealed a deficiency in face recognition, as well as the paradoxically improved performance with inverted faces and an ‘inverted face effect’ with normal subjects. The prosopagnosic subject’s “ability to haptically discriminate between two upright faces was at chance, as it was when he looked at them” (Kilgour et al. 2004, p. 710).
This demonstrates that prosopagnosia cannot be limited to the visual modality, and also serves as an impetus to repeat such a study to see if prosopagnosics literally do grope. If this is the case, this would lend strong support to my interpretation of unpredictable eye movements as a visual groping, as well as suggest that the notion of consonance may serve as an overall necessary property for sensorimotor contingencies in sensory modalities. In prosopagnosia a regular scanning strategy is used in the visual exploration of unfamiliar faces. Given that new faces can be learned and subsequently recognized on an implicit level, it would be of interest to see exactly when or how this initial strategy is thwarted or begins to taper off, as unfamiliar faces become familiar and dissonance is engendered. This would shed light on the role of implicit responses in the facilitation and insulation of routine palpation (i.e. when ‘G’ is a consonant variable) as well as its influence in the perceiver’s confusion as the wonted process deteriorates to a groping (when it is a dissonant variable). As alluded to in the section ‘Implicit and explicit discord in prosopagnosia’, exploring the scanning strategies (or lack thereof) as well as the levels of implicit and explicit response in the pathology of other facial agnosias such as Capgras and Fregoli’s delusion24, as well as other associative and apperceptive facial agnosias, would prove

22 Upside-down faces are not within the purview of typical visual contingencies.

23 In an effort to explore lateralization biases in object exploration.

24 The delusional belief that different people are in fact the same person.


useful. Such investigation could help shed light on the role these systems and variables play in learning and recognition. The levels of implicit and explicit response in each case could be mapped onto a continuum similar to the one in Fig. 2 and further our understanding of the character of such experiences. However, much work is needed to establish any sort of ‘degrees of dissonance’ between implicit and explicit systems in order to examine the relationship between these normally concerted processes and the character of experience.
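The scanpath-predictability hypothesis stated above can be made a little more concrete with a small computation. The text mentions ‘lambda indexes’ without defining them, so the following is only a sketch under an assumption: that the index is a Goodman–Kruskal-style lambda measuring how much knowing the currently fixated region improves a guess about the next one. The function name `transition_lambda` and the region labels are hypothetical and are not taken from any of the cited studies.

```python
from collections import Counter


def transition_lambda(scanpath):
    """Goodman-Kruskal lambda for predicting the next fixated region
    from the current one (ASSUMED reading of the 'lambda index').

    scanpath: sequence of hashable region labels, one per fixation.
    Returns a value in [0, 1]; higher means a more predictable scanpath.
    """
    transitions = list(zip(scanpath, scanpath[1:]))
    if not transitions:
        raise ValueError("a scanpath needs at least two fixations")

    total = len(transitions)
    next_counts = Counter(nxt for _, nxt in transitions)
    pair_counts = Counter(transitions)

    # Correct guesses when always predicting the overall most common next region.
    baseline_hits = max(next_counts.values())
    if baseline_hits == total:
        # Lambda is undefined here (zero denominator). Convention chosen for
        # this sketch only: treat a single repeated target as fully predictable.
        return 1.0

    # Correct guesses when predicting the most common successor of each region.
    best_per_region = {}
    for (current, _), count in pair_counts.items():
        best_per_region[current] = max(best_per_region.get(current, 0), count)
    informed_hits = sum(best_per_region.values())

    return (informed_hits - baseline_hits) / (total - baseline_hits)


# Hypothetical data: a routinized loop versus an erratic, 'groping' sequence.
regular = ["eyes", "nose", "mouth"] * 4
erratic = ["eyes", "nose", "eyes", "mouth", "nose", "mouth", "eyes",
           "eyes", "nose", "nose", "mouth", "mouth", "eyes"]

print(transition_lambda(regular))  # 1.0
print(transition_lambda(erratic))  # 0.25
```

On these made-up sequences the routinized loop scores 1.0 and the erratic sequence 0.25. Under the hypothesis, prosopagnosics viewing inverted faces would be expected to fall nearer the former and normal subjects nearer the latter, although any real analysis would depend on how the original studies defined regions and computed their index.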

Summary

In characterizing visual awareness, O’Regan and Noë state that the sensorimotor contingencies of the visual modality are set up “by the visual apparatus [e.g. you cannot see behind you]; and the aspect which corresponds to the encounter with the visual attributes, that is, those features which allow objects to be distinguished visually from one another. These two aspects go some way towards characterizing the qualitative nature of vision” (O’Regan and Noë 2001, p. 943). This certainly seems correct in regard to the basic tenets of vision, and could suggest that erratic or non-routine visual palpation of an object is indicative of a significant difference in the qualitative character of experience, especially one that is non-veridical.25 Yet we still lack an explanation as to why, in normal subjects, both unfamiliar and familiar faces are explored in the same way despite an obvious difference in experiential character. This also exposes a tension with respect to sensorimotor chauvinism, emphasizing the importance of knowing “what differences make a difference” (Clark 2002, p. 194) under sensorimotor contingency theory. Part of what it is like to see in this account is that “objects when explored visually, present themselves to us as provoking sensorimotor contingencies of certain typically visual kinds corresponding to visual attributes such as color, shape, texture, size, hidden and visible parts” (O’Regan and Noë 2001, p. 944). This helps explain much of what face perception is like in normal subjects, but does not seem to fully account for recognition (or lack thereof). The suggested dynamic of a coordinated ‘agreement’ between systems and its role in the genesis of recognition is added to supplement this shortcoming. It is anticipated that veridical recognition is fundamentally dependent upon the harmonious synchronization of the organism due to the relative regularity of the events in the sense that consonance is manifested.
Therefore I would add that characterization of a sensorimotor contingency framework requires that normal perceivers do not experience (implicitly or explicitly) anything out of the ordinary. Simply stated: the laws the presented object affords do not create any drastic discontinuity in the perceiver’s experience. “Visual awareness is a fact at the level of the integrated behavior of the whole organism” (O’Regan and Noë 2001, p. 969, my emphasis), which requires concord between systems. This idea may play a helpful role alongside an enactive account and render consonance a low-level property of visual awareness in the determination and enactment of sensorimotor contingencies. Through this notion and the suggested

25 E.g. a person who, while looking at a picture of their own face, does not recognize themselves.


experiments, we may continue to shed light on how fully embodied the integrated behavior of re-cognition may be. In addition, we have also seen how abandoned or ‘outmoded’ data can be productively reassessed through the lens of a new paradigm. Scanpath data was initially used to measure attention, but given the theoretical misstep in conflating eye movement with attention, the studies and subsequent data were disregarded.26 Yet we have seen that this data can be reinterpreted under the paradigm of the enactive approach in a productive manner that does not conflate eye movement and attention, and engenders further themes for experimentation. I feel that it is through this very sort of philosophical and empirical ‘back and forth’ that what “ought to be...our paradigm of what perceiving is” (Noë 2004, p. 1) can be articulated.

References

Burman, C. (2004). http://www.prosopagnosia.com/main/stones/index.asp.
Carter, R. (1999). Mapping the mind. Berkeley: University of California Press.
Churchland, P. M. (1995). The engine of reason, the seat of the soul. Cambridge: The MIT Press.
Clark, A. (2002). Is seeing all it seems? Action, reason and the grand illusion. Journal of Consciousness Studies, 9(5–6), 181–202.
Damasio, A. (1994). Descartes’ error: Emotion, reason, and the human brain. New York: Harper Collins.
Damasio, A. R. (1999). The feeling of what happens. San Diego: Harcourt, Inc.
Damasio, A. R., Tranel, D., et al. (1990). Face agnosia and the neural substrates of memory. Annual Review of Neuroscience, 13, 89–109.
Farah, M. J. (2004). Visual agnosia (2nd ed.). Cambridge: The MIT Press.
Farah, M. J., Wilson, K. D., et al. (1995). The inverted face effect in prosopagnosia: Evidence for mandatory face-specific perceptual mechanisms. Vision Research, 35(14), 2089–2093.
Grill-Spector, K., Knouf, N., et al. (2004). The fusiform face area subserves face perception, not generic within-category identification. Nature Neuroscience, 7(5), 555–562.
Helmuth, L. (2001). Where the brain tells a face from a place. Science, 292(5515), 196–198.
Hurley, S., & Noë, A. (2003). Neural plasticity and consciousness. Biology and Philosophy, 18, 131–168.
Kanwisher, N., & Moscovitch, M. (2000). The cognitive neuroscience of face processing: An introduction. Cognitive Neuropsychology, 17, 1–11.
Kilgour, A. R., de Gelder, B., et al. (2004). Haptic face recognition and prosopagnosia. Neuropsychologia, 42, 707–712.
Leonards, U., & Scott-Samuel, N. E. (2005). Idiosyncratic initiation of saccadic face exploration in humans. Vision Research, 45, 2677–2684.
Levin, H. S., Eisenberg, A. L., et al. (Eds.) (1991). Frontal lobe function and dysfunction. New York: Oxford University Press.
Merleau-Ponty, M. (1964). Le visible et l’invisible. Paris: Gallimard.
Noë, A. (2004). Action in perception. Cambridge: The MIT Press.
Noë, A. (2005). Real presence. Copenhagen: World and Mind.
Norton, D., & Stark, L. (1971). Scanpaths in saccadic eye movements while viewing and recognizing patterns. Vision Research, 11, 929–942.
O’Regan, J. K., & Noë, A. (2001). A sensorimotor account of vision and visual consciousness. Behavioral and Brain Sciences, 24(5), 939–1031.
Rizzo, M., Hurtig, R., et al. (1987). The role of scanpaths in facial recognition and learning. Annals of Neurology, 22(1), 41–45.
Rossion, B., Caldara, R., et al. (2003). A network of occipito-temporal face-sensitive areas besides the right middle fusiform gyrus is necessary for normal face processing. Brain, 126(11), 2381–2395.

26 I owe this point to Dr. Gregory Zelinsky (personal communication).


Sergent, J., & Poncet, M. (1990). From covert to overt recognition of faces in a prosopagnosic patient. Brain, 113, 989–1004.
Thompson, E. (2005). Sensorimotor subjectivity and the enactive approach to experience. Phenomenology and the Cognitive Sciences, 4(4), 407–427.
Walker-Smith, G. J., Gale, A. G., et al. (1977). Eye movement strategies involved in face perception. Perception, 6, 313–326.
Weber, R. B., & Daroff, R. B. (1971). The metrics of horizontal saccadic eye movement in normal humans. Vision Research, 11, 921–928.
Weber, R. B., & Daroff, R. B. (1972). Corrective movements following refixation saccades: Type and control system analysis. Vision Research, 12, 467–475.
Young, A. W., & De Haan, E. H. F. (1988). Boundaries of covert recognition in prosopagnosia. Cognitive Neuropsychology, 5(3), 317–336.