VISUAL COGNITION, 1996, 3 (4), 363–391

Preserved and Impaired Detection of Structure From Motion by a “Motion-blind” Patient

P. McLeod Oxford University, Oxford, UK

W. Dittrich University of Hertfordshire, UK

J. Driver University of Cambridge, Cambridge, UK

D. Perrett University of St. Andrews, St. Andrews, UK

J. Zihl Max-Planck-Institut für Psychiatrie, Munich, Germany

Following bilateral extrastriate damage to areas that include the suspected human homologue of V5/MT, the patient LM has a specific deficit in processing moving stimuli. She has difficulty detecting the movement or coding the velocity of single moving dots. Nevertheless, we find that she can report human actions in Johansson “biological motion” displays. This requires the accurate coding of the direction and velocity of many moving dots. The implication is that structure can be extracted from motion in regions of visual cortex other than those traditionally associated with motion processing. However, she cannot report the spatial disposition of the actors whose actions she has recognized, nor their movement in depth relative to her. A possible interpretation is that coding in these additional regions is primarily object-centred. Adding a small number of random stationary “noise” dots to the display prevents her from identifying the actions, suggesting that segregation by motion is implemented within the traditional movement areas.

Requests for reprints should be sent to Peter McLeod, Department of Experimental Psychology, South Parks Road, Oxford, OX1 3UD, England. E-mail: PDMCL@vax.oxford.ac.uk, or to Winand Dittrich, Department of Psychology (2H 258), University of Hertfordshire, Hatfield, Herts, AL10 9AB, England. E-mail: W.H. [email protected]

This research was supported by the Max-Planck-Institute for Psychiatric Research (Clinical Research), and the McDonnell-Pew Centre for Cognitive Neuroscience, Oxford. WHD was supported by a Feodor-Lynen-Fellowship from the A. v. Humboldt Foundation. We are grateful to Phil Benson whose programs were used in Experiments 2 and 6, and to Charlie Heywood for discussions about the design of the experiments.

© 1996 Psychology Press, an imprint of Erlbaum (UK) Taylor & Francis Ltd

Much evidence has been presented over the last ten years to suggest that the primate visual system comprises a number of distinct pathways. One broad distinction is between the parietal (or dorsal) stream and the temporal (or ventral) stream (e.g. Desimone & Ungerleider, 1989; Ungerleider & Mishkin, 1982). These are thought to analyse different kinds of visual information in parallel. For example, it has been suggested that areas in the parietal cortex are primarily concerned with the spatial disposition of objects (e.g. their size, motion, and depth), whereas the temporal pathway is concerned with recognizing familiar objects via properties such as their colour and shape (Goodale & Milner, 1992; Livingstone & Hubel, 1988; McCarthy, 1993; Ungerleider & Mishkin, 1982). However, emphasis has recently shifted from establishing the independence of these anatomically distinct pathways to emphasizing the many ways in which they interact. For example, the magnocellular layers of the lateral geniculate nucleus, which are known to carry motion information, project to both temporal and parietal streams, and there are extensive interconnections between the extrastriate areas in the two streams (Merigan & Maunsell, 1993; Zeki, 1993). A number of recent studies also show that areas traditionally thought to be involved in processing of form rather than movement respond to various aspects of moving stimuli (Cheng, Hasegawa, Saleem, & Tanaka, 1994; Ferrera, Nealey, & Maunsell, 1994; Peterhans & von der Heydt, 1993; Sary, Vogels, & Orban, 1993; Schiller, 1993). Neuropsychology provided one line of evidence for the original idea that distinct visual dimensions, such as motion or colour, are processed relatively independently. Different kinds of brain injury are associated with deficits specific to particular visual dimensions. 
For example, the patient LM (Zihl, von Cramon, & Mai, 1983; Zihl, von Cramon, Mai, & Schmid, 1991) has a visual deficit restricted to the coding of movement, which is so profound that she has been described as “motion-blind”. This deficit resulted from bilateral extrastriate lesions, thought to include the human homologue of area V5/MT in the dorsal stream, which has long been identified with motion processing by physiological studies in primates. This identification has recently been confirmed by neuroimaging studies in humans (Corbetta, Miezin, Dobmeyer, Shulman, & Petersen, 1991; Watson et al., 1993; Zeki et al., 1991).

The physiological evidence that some aspects of motion processing may be performed outside the dorsal stream now finds an echo in the neuropsychological data. Vaina, Lemay, Bienfang, Choi, and Nakayama (1990) examined a patient, AF, who has extensive damage to the occipito-parietal areas conventionally associated with motion processing. Psychophysical studies reveal that
he, like LM, has impairments in “low-level” motion processing. His detection of coherent motion in noise, speed discrimination, and the coding of two-dimensional form from relative speed were all shown to be impaired. However, despite these deficits in the coding of low-level aspects of motion, AF could identify the actions performed by people in the “biological motion” displays introduced by Johansson (1973). In these, people are filmed so that only individual points of light on various joints are visible. The only information about actions in these displays comes from the motion of these points of light. Yet AF was able to identify an actor performing push-ups, stair-climbing, and cycling within such moving-dot displays. To interpret these displays, the observer must simultaneously code the relative motion of a number of points of light moving in different directions, at different speeds, and integrate this velocity information over the whole display. Static frames from these films are usually insufficient to identify the action, certainly when the possible actions are unknown in advance, as in Vaina et al.’s study. This result raises the paradox that a patient can be very poor at coding velocity and integrating it across a random display of dots, and yet be able to do this accurately when the dot-display coheres into a familiar biological Gestalt. One possible account of this result would be that a centre that lies outside the conventionally recognized motion pathway can extract biological structure from motion (although, for some reason, it cannot perform conventional psychophysical motion tasks). Two proposals about the existence of a specific structure from motion centre have been made recently. Zeki (1993) has suggested that area V3 may be part of a “dynamic form” recognition system, involved among other things in the recognition of structure from motion. 
Although V3 has not been considered to have a central role in motion processing, it has inputs both from the motion-sensitive magnocellular layers of the lateral geniculate nucleus via V1 and V2, and also from V5/MT, the cortical area conventionally associated with motion processing. Positive evidence that a centre capable of extracting structure from motion does exist outside the dorsal route (although not in V3) has been presented by Oram and Perrett (1994). They showed that cells in the anterior portion of the superior temporal sulcus (area STPa) of macaque monkeys are sensitive to the biological motion represented in Johansson displays. In this paper, we extend Vaina et al.’s work with AF by examining the possibility of preserved detection of structure from motion despite impairment of low-level motion perception in the patient LM. We find that, like AF, she can report the actions in Johansson-type dot displays of biological motion. However, further experiments show that certain aspects of LM’s “biological motion” perception are abnormal. For instance, although LM can identify that a Johansson display represents a walking figure, she is unable to report in which direction the figure is facing, or, despite substantial expansion or contraction cues, whether it is approaching or retreating from her.

GENERAL METHOD

The Patient

LM has been extensively described by Zihl and colleagues (Zihl et al., 1983, 1991). MRI scans taken in 1989 showed bilateral lesions involving the upper (cranial) part of the occipital gyri and the adjacent portion of the middle temporal gyri, with the main focus of damage in the upper (cranial) banks of the anterior occipital sulcus. Cortico-cortical fibre pathways interconnecting occipital, temporal, and parietal visual areas are also affected bilaterally. Her stable perceptual deficits for moving stimuli have been studied in detail with conventional psychophysical motion displays (Baker, Hess, & Zihl, 1991; Hess, Baker, & Zihl, 1989; McLeod, Heywood, Driver, & Zihl, 1989; Zihl et al., 1983, 1991). To summarize her established deficits briefly: She is very slow to detect movement of even a single dot, particularly at low speeds; she cannot code differences between faster stimuli accurately, and above a certain speed she does not report motion at all; she cannot process displays containing stimuli moving in different directions simultaneously; she cannot attend to a sub-set of display elements defined by common movement; she reports no apparent motion in discontinuous displays. She describes practical difficulties with moving stimuli in everyday life and the phenomenal experience of stimuli appearing “frozen” even when she knows conceptually that they are moving (e.g. tea being poured into a cup).

Johansson “Biological Motion” Displays

In a Johansson display (Johansson, 1973), people are filmed as they perform everyday actions, such as walking, dancing, hopping, throwing, or hammering, with small lights attached to their ankle, knee, hip, wrist, elbow, and shoulder joints. At replay, the display contrast is adjusted so that the only visual information comes from these points of light. Their velocity, relative motion, and disappearance and reappearance are determined by the action being performed and the constraints imposed by the structure of the human body. Thus, the displays only contain information about the relative motion of the joints. However, to normal observers they give an immediate and unambiguous impression of a person carrying out a familiar action. Despite her well-established problems in the perception of motion, we found in a pilot study that LM could spontaneously identify a wide range of actions (walking, hand-shaking, embracing, standing up, cycling, or a couple dancing) when shown a video of the original Johansson film. The light dots in Johansson displays move in many different directions at different speeds. To identify the actions, the velocities of these dots must be coded accurately (because relative motion is essential for deriving the actions) and quickly (because they are
changing all the time). In conventional tests of motion perception, LM cannot code motion either accurately or quickly, and she cannot cope with displays containing stimuli moving in different directions. Furthermore, the light points in Johansson displays frequently disappear and reappear as the actor turns, or as one limb crosses another, temporarily occluding a joint from the camera. As LM has no sensation of apparent movement in discontinuous displays (Zihl et al., 1983), one would expect that this would also make the task impossible for her.

EXPERIMENT 1
Action Recognition, and the Discrimination of Normal and Inverted Johansson Figures

The aim of the first two experiments was to corroborate our pilot observations that LM can detect actions in Johansson light-point displays and to see whether she can distinguish between Johansson displays that give rise to percepts of biological motion in normal observers and those that do not.

Method

Displays

Johansson displays were made by attaching small pieces of highly reflective tape at the ankle, knee, hip, wrist, elbow, and shoulder joints of actors dressed in black. Video film was taken at low light-levels of one actor walking, running, jumping up and down with both legs, hopping, dribbling a ball, hammering a nail, lifting a box, boxing, crawling, painting a wall, jumping up with arms and legs moving out to form a star, limping, and sitting down; and of a pair of actors catching, dancing, shaking hands, and tugging a rope. At replay, the display contrast was set at a level where only the reflective tape on each joint was visible. The method and stimuli are described in detail by Dittrich (1993).

A tape was made of 52 five-second clips of actions. Twenty clips were of an actor (or a pair of actors) performing the different actions (some appeared twice). Twenty were the same displays inverted. Although the general spatial and temporal characteristics of the inverted displays were similar to those produced by real actions, the relative light point movements no longer corresponded to any information that would be obtained from the normal viewing of people performing actions. Inverted displays do not give an impression of coherent biological motion and can be distinguished from biological motion by normal observers (see Dittrich, 1993; Sumi, 1984). Twelve clips were of actions that did not involve a person. For these displays, patches of reflective tape were attached to a sheet, a ball, or the joints of a puppet, which were then moved. The ordering of the three classes of action (biological, inverted biological, and nonbiological) was random.

Task

LM made two judgements about each clip. First, she judged whether or not the display represented a person (or people) doing something. The correct classification of the inverted displays and those with the three non-biological objects was “not a person”. (The movement of the puppet had some human movement characteristics but did not appear to be a real person to normal observers.) Second, if she thought the clip showed people, she was asked to describe the action they were performing. She was given no information beforehand about what actions were being performed on the tape.

Results

LM was correct on 50 of the 52 person/non-person judgements. She identified 14 of the 20 actions spontaneously, and a further 4 after some discussion with the experimenter. A group of 3 age-matched normal subjects, given the same tape and task, scored an average of 50.3/52 (range 50–51) on the person/non-person judgement and 18.3/20 (range 17–19) on the identity of the actions. Thus LM’s ability to perform this task is very similar to that of normal subjects. It should be noted, however, that her judgement was considerably slower and less confident than that of normal subjects. She usually made the person/non-person judgement confidently after one showing of a display, but she often asked to have the display repeated before trying to identify the action, and sometimes she had some discussion with the experimenter before identifying the action. The normal controls could usually identify the action after one viewing.

Conclusion

Experiment 1 confirms the pilot observations made when LM was shown the original Johansson film. She can extract information about the relative velocity and direction of many different points from a complex moving display, provided that the movement information reflects the relationships that hold for movements of a human body performing an everyday task. Given her well-documented inability to code the velocity of single dots of light or to make any judgement about displays containing dots moving in different directions, this result, like the report by Vaina et al. of the performance of AF, is remarkable.

In the original report of her perceptual experience of moving objects (Zihl et al., 1983), LM was described as seeing the moving world in a series of stills. If this is the case, she might be able to use information about change between stills to draw inferences about movement without perceiving movement itself. For example, she reports that she uses the fact that a figure starts at the left of the screen and ends at the right to infer that it has moved to the right even though she does not perceive movement. But although she can draw some global inferences about movement, changes between static samples would not in general be adequate to identify actions in Johansson films. This would certainly be the case in this experiment, where LM had no prior knowledge about the possible actions in the stimulus set.

Baker et al. (1991) concluded, on the basis of experiments using random dot displays, that LM was unable to code motion unless it was “coherent”. By this they meant that she could not code information about movement in several different directions simultaneously. Experiment 1 shows that their notion of “coherence” must be extended beyond the case of all points moving in the same direction. If the coherence arises because the movement of dots in the display is related to the motion of a single human body, LM can code this despite the movement of the dots in different directions.

EXPERIMENT 2
Discrimination of Normal and Jumbled Computer-generated Johansson Walkers

In Experiment 1, LM showed that she could distinguish the movement patterns of normal and inverted figures. In Experiment 2, we explore her ability to distinguish the movement patterns created by normal and jumbled Johansson figures.

Method

Displays

LM viewed a series of video clips of a computer-generated Johansson figure, represented by 13 dots of light: one for each leg and arm joint and one for the head. Half of the clips showed a normal figure walking on a treadmill (i.e. walking without translating across the screen), facing either to the left, to the right, or towards the camera, taking one step per second. Half showed a jumbled version of this figure, in which the starting position of each point was moved a random distance (up to 30% of the head-to-ankle distance) away from its position in the normal figure, in a random direction. Starting from these jumbled positions, each point then moved with the same trajectory that it would have had in a normal walking figure. Thus, the patterns of relative motion were preserved temporally, but the relative positions of the light dots no longer corresponded to the positions of the joints in a real figure. The jumbled and normal moving figures can be differentiated by normal subjects (Perrett, Harries, Benson, Chitty, & Mistlin, 1990) and by cells in the monkey temporal lobe (Oram & Perrett, 1994). The jumbled figures were on average larger than the normal figures. So on one-third of trials both normal and jumbled figures were scaled down by 50%, and on a third of trials they were scaled up by 50%, to prevent LM using size as a cue to the correct response. The normal, unscaled figure subtended an angle of
about 8° × 4° at a viewing distance of 1 m. LM viewed 108 displays, each lasting 5 sec. These comprised 6 randomly intermixed repetitions of the 18 possible combinations of size (small, medium, or large) and orientation (facing left, facing right, or facing the camera), either jumbled or normal.

Task

LM was asked to say whether or not each display represented a normal figure walking.
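
The jumbling procedure described in the Method can be illustrated with a minimal Python sketch. It is not the program used to generate the original stimuli: the function names, the toy coordinates, and the uniform sampling of distance and direction are assumptions made for illustration. Each light point receives one fixed random displacement (at most 30% of the head-to-ankle distance), and that same displacement is applied on every frame, so the point keeps the trajectory it had in the normal figure.

```python
import math
import random

def jumble_offsets(n_points, figure_height, max_fraction=0.30, rng=None):
    """Draw one fixed (dx, dy) displacement per light point.

    Each displacement has a random length of up to max_fraction of the
    head-to-ankle distance and a random direction (uniform sampling is
    an assumption, not taken from the paper).
    """
    rng = rng or random.Random()
    offsets = []
    for _ in range(n_points):
        distance = rng.uniform(0.0, max_fraction * figure_height)
        angle = rng.uniform(0.0, 2.0 * math.pi)
        offsets.append((distance * math.cos(angle), distance * math.sin(angle)))
    return offsets

def jumble(frames, offsets):
    """Shift every point by its own fixed offset on every frame."""
    return [
        [(x + dx, y + dy) for (x, y), (dx, dy) in zip(frame, offsets)]
        for frame in frames
    ]

# Toy input: 13 points over 3 frames (hypothetical coordinates, not real walker data).
frames = [[(0.1 * p + 0.01 * t, 0.05 * p) for p in range(13)] for t in range(3)]
offsets = jumble_offsets(n_points=13, figure_height=1.0, rng=random.Random(0))
jumbled = jumble(frames, offsets)
```

Because the offset of a point is constant over time, its frame-to-frame motion is unchanged; only the spatial relations between points are disrupted, which is the property the Method relies on.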

Results

Three age-matched normal subjects performed the task (with a 1-sec display rather than the 5 sec given to LM). Overall they were 93% correct (range 85% to 100%). There was little difference between their performance with the three stimulus sizes, ranging from 96% correct at the intermediate size to 91% correct with the large figures. Performance at all three sizes was far above chance, χ²(1) ≥ 33.8, p < .001, in each case.

LM’s performance at each display size is shown in Table 1. The maximum score per cell is 18. Overall (summed across the three display sizes), LM was 69% correct. This is reliably better than chance, χ²(1) = 18.5, p < .001, so LM can do this task. But 69% correct gives a misleadingly poor impression of her ability. The points of light in the largest displays moved slightly jerkily from frame to frame (for normal observers), and LM classified all these as jumbled (see Table 1). The movement in the small displays was smoother, and her performance with these was good; for example, she was 100% correct for the 24 trials requiring a judgement about the small figure facing to the left or the right.¹

On being questioned as to how she discriminated between the normal and jumbled figures at the small size, she said (about the normal): “The actions belong to each other in a way which represents walking”, and (about the jumbled): “The dots do not belong to each other”. When asked in further detail about the normal figure, she said that the display gave her an immediate sensation of walking; however, there was no immediate sensation of a human figure. The presence of a walker was inferred from the sensation of walking. This seems like a phenomenal description of biological motion (in this case walking) as an abstract entity, coded independently of other information about form or motion.

¹ Her problem with the large figures was not a question of the size of the figures. The experiment was repeated with LM sitting 3 m from the screen. The large figures now subtended approximately the same visual angle as the small figures in the first experiment. But her performance was the same as before: she could distinguish normal and jumbled displays with the small figures but not with the large ones.
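
The geometry behind this control can be checked with a short Python sketch. It assumes that the 8° figure quoted for the unscaled display is its height, and that "scaled up by 50%" and "scaled down by 50%" mean linear factors of 1.5 and 0.5; neither reading is stated explicitly in the text, and the function names are illustrative.

```python
import math

def physical_size(angle_deg, distance_m):
    """Size (in metres) of an object subtending angle_deg at distance_m."""
    return 2.0 * distance_m * math.tan(math.radians(angle_deg) / 2.0)

def visual_angle(size_m, distance_m):
    """Visual angle (in degrees) subtended by size_m at distance_m."""
    return math.degrees(2.0 * math.atan(size_m / (2.0 * distance_m)))

# Unscaled figure: about 8 degrees at 1 m (assumed here to be its height).
unscaled_height = physical_size(8.0, 1.0)

# Assumed linear scale factors: small = 0.5x, large = 1.5x.
small_at_1m = visual_angle(0.5 * unscaled_height, 1.0)
large_at_3m = visual_angle(1.5 * unscaled_height, 3.0)
print(round(small_at_1m, 2), round(large_at_3m, 2))
```

Under these assumptions the large figure is three times the size of the small one, so tripling the viewing distance gives the same visual angle, consistent with the footnote.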

TABLE 1
LM’s Responses to Normal and Jumbled Stimuli in Experiment 2 as a Function of Stimulus Size

                                     Response
Stimulus Size       Stimulus     Normal   Jumbled
Small               Normal         14        4
                    Jumbled         4       14
Normal (unscaled)   Normal         10        8
                    Jumbled         1       17
Large               Normal          1       17
                    Jumbled         0       18
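
The overall figures quoted in the Results can be recovered from the counts in Table 1. The Python sketch below assumes that the reported statistic is an uncorrected Pearson chi-square on the 2 × 2 stimulus-by-response table pooled over the three sizes; the paper does not say which test was used, but this choice reproduces the reported value. The variable and function names are illustrative.

```python
# Counts from Table 1, per size: stimulus -> (normal responses, jumbled responses).
table1 = {
    "small":    {"normal": (14, 4),  "jumbled": (4, 14)},
    "unscaled": {"normal": (10, 8),  "jumbled": (1, 17)},
    "large":    {"normal": (1, 17),  "jumbled": (0, 18)},
}

def pooled_counts(table):
    """Sum the 2 x 2 stimulus-by-response counts over all sizes."""
    a = sum(rows["normal"][0] for rows in table.values())   # normal stimulus, "normal" response
    b = sum(rows["normal"][1] for rows in table.values())   # normal stimulus, "jumbled" response
    c = sum(rows["jumbled"][0] for rows in table.values())  # jumbled stimulus, "normal" response
    d = sum(rows["jumbled"][1] for rows in table.values())  # jumbled stimulus, "jumbled" response
    return a, b, c, d

def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square for a 2 x 2 table, without continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

a, b, c, d = pooled_counts(table1)
correct, total = a + d, a + b + c + d
chi2 = pearson_chi2_2x2(a, b, c, d)
print(f"{correct}/{total} correct ({100 * correct / total:.0f}%), chi2(1) = {chi2:.1f}")
# prints: 74/108 correct (69%), chi2(1) = 18.5
```

The pooled table also makes the response bias visible: LM answered "jumbled" on 78 of 108 trials, almost entirely because of the large displays.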

Conclusion

Experiment 2 corroborates the conclusion of Experiment 1. LM can identify whether or not the movements of points of light correspond to the movements of human joints as someone walks. Oram and Perrett (1994) report that some of the cells in temporal lobe area STPa can also make the discrimination we required of LM. Their firing rate is increased by the biologically correct walking stimuli used in these experiments, but not by the jumbled stimuli.

EXPERIMENT 3
Is LM’s Residual Ability Specific to Biological Motion?

The results of the first two experiments are consistent with the proposal that the visual system may contain a centre, independent of other “low-level” motion systems, for recognizing structure from motion. The structure-from-motion system is relatively preserved in LM, but the low-level motion system is impaired. All our displays so far have involved biological motion, so it is not clear whether the system that is preserved in LM recognizes specifically biological structure (i.e. concerned with familiar human or animal actions) or structure from motion in general. Areas with a particular sensitivity to signals of biological origin are known to exist in the primate visual system. For example, a sub-system devoted to the analysis of faces appears to have developed within the visual form recognition system. Indeed, biological motion appears to be analysed in those brain regions containing cells that selectively process faces (Perrett et al., 1989). The aim of Experiment 3 was to see whether LM’s residual ability to extract structure from motion is restricted to familiar biological movements.

Method

Designing an experiment to try to tackle this question is not as straightforward as it might seem. Many non-biological objects are associated with specific motion patterns: wheels roll, doors open, and so on. But non-biological objects are usually rigid. So their motions are not identified by the relative motion of their parts in the same way that biological actions such as walking, as opposed to running or jumping, are defined by the relative motion of the human joints. Thus the appropriate distinction between the stimuli in Experiments 3 and 4 may be between the motion of rigid and non-rigid bodies rather than between biological and non-biological objects.

Displays

We used four kinds of display, in two of which structural information about non-biological objects was given by motion:

1. Computer-generated Displays. These used the relative motions and positions of points of light to give the impression of a rotating, three-dimensional, rigid, transparent non-biological object:
   a. A Necker cube (i.e. the 12 edges of a transparent cube), defined by 11 equally spaced dots along each edge, rotating about a vertical axis. (The spacing expanded and contracted as an edge moved towards or away from the viewer.) The cube subtended roughly 12° × 12° at a viewing distance of 1 m and rotated once every 2 sec. The display gave a clear impression of a rotating 3-D object to normal observers.
   b. The same cube, with its edges misaligned so that they did not meet at the corners, rotating about a vertical axis once in 2 sec. This display gave a clear impression of a rotating but disjointed figure to normal observers.
   c. A rectangle of line segments and several hundred random dots whose relative velocities and spacing gave the clear impression to a normal viewer of a cylinder rotating about a vertical axis. It subtended an angle of roughly 12° × 12° and rotated once every 2 sec.
   d. A spiral of several hundred dots giving the impression of a long, thin spring or tube. Initially it lay in the plane of the screen with its long axis vertical. The long axis rotated in depth about a horizontal axis in the plane of the screen, one rotation taking 2.5 sec. It subtended an angle of about 6° in width and a maximum of about 16° in length. There was considerable expansion and contraction of the spacing between the dots at opposite ends of the spiral as it rotated, giving a strong impression to normal vision that the top end of the tube was moving out of the screen towards the observer as it rotated and the bottom end was moving away and behind the screen.

   e. Several hundred dots, whose bunching and spacing gave a clear percept of a rectangular corrugated sheet. Like (d), the motion of the sheet appeared to normal vision to be rotation in depth, about a diagonal axis in the plane of the screen, once every 2.5 sec. Its maximum extent subtended an angle of about 15°. Expansion and contraction of the spacing of the dots gave a clear impression to normal viewing that one side of the sheet was moving out of the screen towards the observer, while the other moved away from the observer behind the screen.
   Compared to the stimuli of the first two experiments, these are not strictly structure-from-motion displays, as it would have been possible to identify all the objects except (c) by examining a still frame.

2. Light Point Displays of Everyday Objects. These were everyday objects that are normally associated with motion, such as a door or a wheel. The stimuli were made in the same way as the Johansson figures in Experiment 1. Reflective tape was attached to points on the object, which was then filmed at a low light level as it moved. At replay, the contrast level was adjusted so that only points of light were visible. The objects were:
   a. A wheel with light dots at 8 equidistant points around its circumference. It subtended an angle of about 14°. It either rotated in the plane of the screen, making a complete revolution once in 4 sec, or it rolled across the screen, making 1.25 complete revolutions as it crossed from left to right.
   b. A door with 12 dots in all: 5 on the two uprights and 2 more across the top. The door was filmed opening and closing. The vertical edge of the closed door subtended an angle of about 10° (see Figure 1).
   c. A rectangular piece of paper with 12 equidistant dots around its edges. It was folded in various ways in different clips: one corner of the paper was folded towards the centre and back, opposite sides were folded towards the middle and back, and so on.
Two other sorts of display were included:

3. Johansson Figures With Externally Imposed Translations. These stimuli were people, filmed as in Experiment 1. One was a person sitting on a swing, swinging his arms and legs. The image subtended about 6° × 6°. The other was a person riding an exercise bicycle. Both faced the viewer. The man on the swing was filmed by a camera, which tracked past him from left to right or vice versa; thus the movements of the limbs would not have generated the translation of the body as it crossed the screen. The camera remained stationary or retreated from or approached the man pedalling the exercise bicycle. Again, the movement of the figure (contraction, expansion, or no change) was not related to the movement that would have been produced by the pedalling had the object been a moving bicycle rather than a static one. The maximum extent of the image was about 12° × 12°. Maximum contraction was to about half the size of the original.

4. Noise Displays. There were two sorts of computer-generated noise displays: One contained a random set of straight lines and circles, with a superimposed field of noise dots, which flicked on and off, and the second consisted of a random set of noise dots, which jumped at random intervals to new positions. They gave no particular impression to a normal viewer other than of general confusion.

FIG. 1. Static images from the light-point display of an opening door in Experiment 3. The images shown are taken at 2-sec intervals. In the real display the light-points move continuously from the start position to the end position.

Task

A video was made with 43 clips of these different stimuli, each lasting 5 sec. (The frequency of each stimulus type is shown in Table 2.) After each clip, LM was asked to make a forced choice between whether she saw an object, a person, or nonsense. If the display was an object or a person, she was asked to describe its motion or action.
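
To illustrate how relative velocity and spacing alone can specify a rotating cylinder, as in display 1c, the following Python sketch scatters dots over a cylinder surface and projects them onto the screen. It is a generic construction, not the program used for the original stimuli: orthographic projection, unit radius, the dot count, and all names are assumptions, and the rectangle of line segments is omitted. Only the rotation period of 2 sec is taken from the text.

```python
import math
import random

def make_cylinder_dots(n_dots, height, rng):
    """Scatter dots over the cylinder surface as (angle, y) pairs."""
    return [
        (rng.uniform(0.0, 2.0 * math.pi), rng.uniform(-height / 2.0, height / 2.0))
        for _ in range(n_dots)
    ]

def project(dots, t, radius=1.0, period=2.0):
    """Orthographic screen positions (x, y) at time t (sec).

    The cylinder rotates about its vertical axis once per period, so only
    the horizontal coordinate of each dot changes over time.
    """
    phase = 2.0 * math.pi * t / period
    return [(radius * math.sin(theta + phase), y) for theta, y in dots]

def horizontal_speed(theta, t, radius=1.0, period=2.0):
    """Screen velocity of one dot: fastest at the midline, zero at the edges."""
    omega = 2.0 * math.pi / period
    return radius * omega * math.cos(theta + omega * t)

dots = make_cylinder_dots(300, 2.0, random.Random(0))
frame_start = project(dots, 0.0)
frame_after_one_rotation = project(dots, 2.0)
```

In any single frame the dots are just a textured rectangle that is denser towards the left and right edges. The cylinder is specified by the sinusoidal velocity profile across the display, which is why (c) is the one computer-generated object that could not be identified from a still.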

TABLE 2
Relative Frequency of Display Types for the 43 Trials in Experiment 3

Stimulus                         Frequency of Presentation
1. a. Necker cube                1
   b. Jumbled cube               1
   c. Noise cylinder             1
   d. Spiral                     2
   e. Corrugated sheet           2
2. a. Rotating wheel             4
      Rolling wheel              2
   b. Door                       4
   c. Paper                      9
3. a. Man on swing               4
   b. Man on exercise cycle      4
4.    Noise                      9

Results

Computer-generated Displays of Rotating Three-dimensional Objects. LM appropriately identified the Necker cube (“a box”) and the disjointed cube (“something like a box”). She gave their direction of rotation correctly, initially with a hand movement and then verbally. She gave the direction of rotation of the noise cylinder correctly (with a hand movement) but did not identify the object clearly. She reported the identity of the spiral, but not its direction of motion. She reported neither the form nor the direction of motion of the corrugated sheet.

LM’s ability to identify the Necker cube suggests that she may be able to extract structure from motion even when the object depicted is not biological. Vaina et al. (1990) drew a similar conclusion about AF, who could distinguish a set of moving dots that gave the appearance of a rotating cylinder from a comparable set of randomly moving dots that did not simulate cylindrical rotation. However, LM’s ability to detect the Necker cube should, perhaps, be treated with caution. She could have reported the identity of the Necker cube and the spiral on the basis of a perceptual “still” from the film sequence, because their identity is revealed by the relative spacing of individual light spots in a single image. In contrast, single images give no clue to the actions of the Johansson figures. These can only be deduced from the relative motion of the light points during a sequence of images.


However, it would be difficult to extract the direction of rotation of the cube via stills. The fact that LM could identify the direction of rotation of the cube and of the cylinder suggests that when the relative motions in the display are related to a coherent higher-order Gestalt, she can extract movement information from the display, even when it involves a rigid, non-biological object. Her difficulty with the spiral and the corrugated sheet may have been due to her difficulty in interpreting contraction and expansion as movement towards and away from her (see Experiment 4 and Zihl et al., 1983, 1991).

Filmed Light-point Displays of the Movement of Everyday Objects With Familiar Motions. LM never identified these objects explicitly. She never said “door” or “sheet of paper”. She called the wheel a “circle”, which is an accurate description but might be based on static information. She identified the direction of rotation of the static rotating wheel (3 times out of 4). She identified the rolling wheel as a “moving circle” (2 out of 2), getting the direction of movement correct once, but she never identified it as rolling. When pressed, she said she thought it was rotating on its own axis. She never identified the motion of the door as movement out of the plane of the screen. It was described as either an expanding or contracting rectangle (3 out of 4), with these terms used appropriately, or as a large rectangle that turned into a small rectangle (1 out of 4). Similarly, the sheet of paper was usually described as something changing shape (6 out of 9). However, on one occasion she made a folding gesture with her hand, and once, following a suggestion by the experimenter of what the object might be, she said “if it were a sheet of paper, then the change of shape would be folding”.

“A contracting rectangle” or “an object changing shape” are perfectly correct descriptions of the displays of door and paper. This can be seen in Figure 1, where three stills from the “door opening” sequence are shown. However, to normal viewing, the relative expansion and contraction of adjacent points on different edges of the shape give a compelling impression of the movement in depth of a solid object. LM’s preference for the interpretation of the changing array as an object at constant depth changing shape rather than the movement in depth of a rigid object is considered in the General Discussion.

The movement information in the displays of the opening door and the folding paper appears simpler than that in the displays of the first two experiments, or in the other conditions of this experiment. That is, the patterns of relative motion are much less complex than those involved with human actors or rotating Necker cubes, cylinders, springs, and corrugated surfaces. But LM was apparently unable to use this simple motion information to suggest an underlying object. The problem may arise from its very simplicity. Although complexity is a problem for her in random-dot displays (Baker et al., 1991), it may be positively helpful in extracting structure from motion. The more relative
motions there are, the easier it may be for her to identify the underlying action or object, due to the greater number of cues and constraints on possible solutions provided by points on the object. Alternatively, LM’s difficulty may be primarily to do with interpreting expansion and contraction as movement towards or away from herself (as required for appropriate interpretation of the opening door or folding paper). She found this difficult in the first condition with computer-generated 3-D shapes (she could not identify the action of the objects that appeared to rotate in and out of the plane of the screen), and she finds this difficult again in Experiment 4.

Externally-imposed Motion of Johansson Figures. LM always identified the swinging or cycling figures correctly as people (8 out of 8). She could identify the direction of motion (left or right) of the figure on the swing correctly (4 out of 4), but could not explain how it was moving, apart from saying that it was definitely not walking. She thought the figure on the bicycle might be cycling but could not identify the direction of motion (approach or retreat) (1 out of 4 correct) despite the substantial expansion or contraction of the image. Her ability to identify left and right translation is consistent with her similar ability in displays containing single points of light (Zihl et al., 1991), but it may not be genuine motion perception. LM could infer from the fact that the image starts at the left of the screen and ends at the right that the display has moved from left to right, without detecting the motion itself. Indeed, she says that she uses this strategy.

Noise Displays. She always reported that the noise stimuli meant nothing (9 out of 9). She could tell that the display was dynamic (for example, reporting it as “jumping dots”) but had no sense of objects or directions.

Conclusion
We may tentatively conclude that LM can extract some information about both the structure and the motion (for example, direction of rotation), even when the motion is of a rigid, non-biological object. But she finds this easier if the visual stimulus provides fairly rich clues (she can interpret the rotating Necker cube, but not the opening door). Strikingly, LM appeared unable to interpret expansion or contraction of the image on the screen as motion of an object towards or away from herself. Her correct judgement that the figure on the swing, whose arms and legs were swinging, was definitely not walking confirms the conclusion of Experiments 1 and 2: She can code accurately the patterns of relative motions involved in familiar actions, such as walking.


EXPERIMENT 4
Action Versus Translation in Johansson Displays

The aim of this experiment was to see whether there was a dissociation between LM’s ability to recognize the objects and actions in light-point displays and her ability to recognize their orientation and/or direction of translation. In particular, we wished to explore LM’s apparent inability to interpret expansion or contraction of the image as motion towards or away from her.

Method
Displays
The displays were video clips of the motion of either biological or non-biological figures. The biological figures were a man pedalling an exercise bicycle or walking. He was filmed with 12 light dots, one on each ankle, knee, hip, shoulder, elbow, and wrist joint. The non-biological figures were a wheel and an arrow. The wheel had 8 equidistant dots around its circumference, 3 on the spokes, and 1 at the centre. The arrow had 12 light dots in the position that they would be if the biological figure stood facing the camera with his legs together and his arms at 45° from his body. This had the appearance of an arrowhead and shaft. (As in the previous experiment, the biological/non-biological distinction is unavoidably confounded with a possible non-rigid/rigid distinction.) There were two sorts of movement:

a. Movement Without Translation. The figure moved without changing its position relative to the observer. The man walked on the spot; the cyclist pedalled on an exercise bicycle which remained stationary as he pedalled; the arrow rotated in the plane of the screen around a point in the centre of the screen. With the walker and the cyclist, the orientation of the actor relative to the observer as they walked or pedalled was varied between trials. The figure might be facing the viewer directly or at any of the eight possible orientations as the figure rotates away from the viewer in 45° steps: towards, towards-and-half-left, left, away-and-half-left, etc.

b. Translation. The figure could change its position relative to the observer. For the arrow head, this involved movement across the screen from left or right without changing its depth from the viewer. The other three figures moved in depth. The walker walked towards or away from the viewer; the viewer approached or retreated from the cyclist as he pedalled on an exercise bicycle; the wheel approached or retreated from the viewer, or was suspended and swung backwards and forwards, towards and away from the viewer. The expansion/contraction of the image, which gave an unambiguous impression of approach/retreat to
normal viewing, was up to 300%/30% of the original size of the figure. In some of the displays the figure approached the viewer obliquely, giving both left/right motion and approach. A videotape was made with examples of various possible combinations of figure, movement, and, for the walker and cyclist, orientation relative to the camera. There were 39 such clips, assembled in random order, each lasting 5 sec. The relative frequencies of the different display types are shown in Table 3. Task LM was asked, first, to say whether each display showed a person or an object (although she never used the latter term in practice—see Results), to describe the action and orientation if the display showed a person, and then to describe anything she could about the motion of the figure.

TABLE 3
Relative Frequency of Display Types for the 39 Trials in Experiment 4

                                 Biological    Non-biological
Movement without translation     14            2
Translation
  Left/Right                     0             6
  Approach/Retreat               5             8
  Both                           2             2

Note: “Biological” displays were a cyclist or a walker; “non-biological” displays were a wheel or an arrow head. (Displays labelled “Both” involved oblique approach to the viewer. These had both a left/right and an approach/retreat component.)

Results
Actions. LM could tell that displays that showed walking or cycling contained a person, and that the others did not (38 correct out of 39). (She always described the non-biological displays as “dots”, never as an arrow or a wheel.) If she identified the display as containing a person, she correctly identified whether the figure was walking or cycling (20 correct out of 21). She identified the two rotating arrows as “rotating” and correctly reported the direction.

Lateral Translation. For the displays where the arrow-head moved across the screen without approach or retreat, her judgements about direction of
movement were always correct (6 out of 6). She usually reported the lateral component correctly in the displays where this was combined with approach/retreat (3 correct out of 4). Spatial Disposition. LM could not judge the orientation of the figure walking on the spot or cycling. (She reported the action correctly in each case; it was only the orientation of the actor that she could not identify.) The random nature of her responses on this judgement can be seen in Table 4. Movement in Depth. LM’s most striking inability was with displays that contained expansion or contraction. This was unambiguously interpreted as motion towards or away from the observer by normal viewers. Despite substantial expansion and contraction (up to 300% and 30% of the original, respectively) she only identified motion towards or away from her correctly on 2 out of 17 trials. On the others, she made no mention of motion (6 trials), reported the motion as being to the left or the right (which is true of some of the dots, but not of the display as a whole) (5 trials), reported (correctly) the dots getting larger but made no mention of motion of the figure (2 trials), reported (incorrectly) that the figure was rotating (1 trial), or said (correctly) that the dots were moving from the periphery to the centre (1 trial).

TABLE 4
Relative Frequency of LM’s Comments About the Orientation of Stimuli Where a Walker Walks on the Spot or a Stationary Cyclist Pedals an Exercise Bicycle

Response categories: Left, Right, No comment

Stimulus        Response frequencies
Left            Right 3, No comment 1
Right           2, 1
No component    2, 3

Note: For example, on the four trials where the figure faced left, she reported on three occasions that it faced right, and once did not comment on the orientation. “No component” means that the figure faced directly towards or away from the camera.

Conclusion
LM’s ability to identify cycling and walking confirms the conclusion of the earlier experiments: she can extract structure from the motion in Johansson displays. However, she appears to be unable to identify the spatial disposition of
the figure performing the action. Furthermore, she either does not detect expansion/contraction, or she does not interpret it as approach/retreat. This is consistent with her difficulties with motion in depth in the earlier experiments (e.g. with the cyclist in Experiment 3). Her ability to report left/right translation but not approach/retreat might seem paradoxical. To judge by her own account, she identifies lateral direction of motion by inference from the fact that the display starts nearer to one side of the screen and finishes nearer the other. One might have thought that if all points on the image ended nearer/further from the edges, she could similarly infer approach/retreat. This underlines the fact that it is not possible to explain LM’s perceptual abilities and disabilities in terms of what information she can extract from perceptual “stills”. LM’s difficulty with expansion and contraction is consistent with previous reports that she is particularly bad at detecting motion in depth (Zihl et al., 1983, 1991). It is also consistent with her own statements that she finds it difficult to judge whether she is on a collision course with approaching people in crowded places. She reports that in corridors she adopts the strategy of walking along the wall to avoid problems. In Experiment 3, she invariably reported the opening or closing door as a contracting or expanding rectangle, and the various paper-folding sequences were usually described as an object changing shape. This shows that LM could detect the expansion/contraction, but she chose to interpret it as expansion or contraction of the object, rather than as the result of a movement in depth of an object of constant size. A similar effect appears in this experiment. She prefers to interpret expansion/contraction as movement to the left or the right, which is true of parts of the display but not of the display as a whole.

EXPERIMENT 5
Dissociating the Figure’s Action from Its Viewer-centred Disposition

In Experiment 4 we found that although LM could report the action performed by a Johansson figure, she was unable to report the direction in which the actor was facing. In Experiment 5 we explore this dissociation in greater detail by testing whether she can match the direction in which a walking Johansson figure is translating with the direction in which the figure is facing.

Method
Displays
Video clips were made of a computer-generated Johansson walker. The actions were those of a moving walker rather than of one walking on the spot. The figure, represented by light dots at the position of each joint and one on the head, faced either to the left or to the right. During the 5-sec clip, the walking figure moved across the screen from left to right or from right to left. The direction in which the figure was facing and the direction in which it was moving were combined orthogonally so that the figure either moved in the same direction as it was facing (e.g. facing left and moving left) or in the opposite direction (e.g. facing left and moving right). When the orientation and direction of motion were the same, the percept to normal viewers was of a person walking past while you remained stationary. With the incongruent combination, the percept to normal vision was of moving past someone who is walking in the same direction as yourself, but more slowly. The distinction between the two percepts was clear and unambiguous to normal vision.

Task
LM was shown a tape with 32 clips, each of 5-sec duration, with 8 examples for each of the 4 possible combinations of orientation and translation (i.e. facing left/right × translating left/right). She was asked to say whether or not the figure was facing in the same direction as it was moving across the screen.

Results
Normal subjects are immediately at ceiling on this task (Perrett et al., 1990). In contrast, LM said that she could not do it, and her responses were guesses. Her performance is shown in Table 5. Results are collapsed across left and right to give 16 compatible stimuli (facing and moving in the same direction) and 16 incompatible stimuli (facing and moving in opposite directions). Overall, she was 59% correct. This was not reliably different from chance, χ²(1) = 1.2, p > .25.

TABLE 5
Relative Frequency With Which LM Classified the Stimuli in Experiment 5

                   Response
Stimulus           Compatible    Incompatible
Compatible         12            4
Incompatible       9             7

Note: “Compatible” means that it moved in the direction it was facing; “incompatible” means that it moved in one direction and faced the other.
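The statistic quoted for Experiment 5 can be checked against the counts in Table 5. The sketch below assumes that the reported value is a Pearson chi-square test of association on the 2 × 2 stimulus-by-response table, without continuity correction; the paper does not state which variant was used, and the function name is ours. Under that assumption the statistic comes out at roughly 1.2 with p a little above .25, in line with the reported values.

```python
import math

def pearson_chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper-tail probability has a closed form.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Counts from Table 5: rows are stimulus type, columns are LM's response.
table5 = [[12, 4],   # compatible stimuli: 12 called compatible, 4 incompatible
          [9, 7]]    # incompatible stimuli: 9 called compatible, 7 incompatible

chi2, p_value = pearson_chi_square_2x2(table5)
# Correct responses lie on the diagonal of the table.
n_trials = sum(sum(row) for row in table5)
percent_correct = 100.0 * (table5[0][0] + table5[1][1]) / n_trials
print(f"chi2(1) = {chi2:.2f}, p = {p_value:.2f}, correct = {percent_correct:.0f}%")
```

The percentage correct is simply the diagonal of the table (12 + 7 of 32 trials), which reproduces the 59% given in the text.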


Conclusion LM cannot tell whether a walking figure is facing in the same direction as it is translating. We know that LM can identify the direction of lateral translation of a variety of Johansson figures (Experiments 3 and 4), so we must conclude that she cannot identify the orientation of the figure in a Johansson display. This is consistent with our interpretation of her reports in Experiments 1 and 2. She says that she sees “walking” rather than “a person walking”. If, as we suggested, “walking” is an abstract entity coded independently of other information about the walker, knowledge of the action does not necessarily entail knowledge of the actor’s orientation from the observer’s viewpoint. Oram and Perrett (1994) identified single cells in STPa that are sensitive to biological motion stimuli representing a walking figure. They measured the responses of these cells to the stimuli used in this experiment. They found some cells that were sensitive to both the orientation and direction of motion of the stimulus, and some that were sensitive to the motion but not to the orientation. But they found no cells that were sensitive to the orientation of the figure, irrespective of its direction of motion. The relative scarcity of cells that code the orientation of the walking figure suggests that, in this visual area at least, the coding of view-point sensitive information would be more vulnerable to damage than coding of view-point-independent information. This is the dissociation we found with LM. Her ability to identify an action is preserved, but she has lost the ability to determine the spatial disposition of the actor. The former relies on view-point-independent information, and the latter relies on view-point dependent information. In the General Discussion we suggest that the information derived from motion to which LM has access may be coded in object-centred rather than viewer-centred coordinates. 
It might be argued that the test of viewer-centred information in this experiment (the discrimination of whether a walker is facing to the right or the left) requires the decoding of subtler cues from Johansson displays than our tests of object-centred information (action naming). Thus her inability to report orientation might reflect the greater difficulty of the task rather than her lack of viewer-centred information. However, our claim that she lacks access to viewer-centred information is also supported by her inability to report expansion or contraction of images as movement of an object towards or away from her in Experiments 3 and 4, and by her inability to report any information about the spatial disposition of the figures in Experiment 4.

EXPERIMENT 6
Disruption by Static Noise

Earlier reports of LM’s motion deficit have emphasized her difficulties with displays that lack full motion coherence. McLeod et al. (1989) showed that LM
was unable to segregate the moving stimuli in a display of intermingled moving and stationary stimuli, in order to attend selectively to one group or the other. Normal subjects find this very easy (McLeod, Driver, & Crisp, 1988). Baker et al. (1991) reached a similar conclusion by a different route. They showed that in a display of dots moving in random directions, LM required about 80% coherence before she could report the direction of motion. That is, she required 80% of the dots to be moving in the same direction before she got any percept of motion in one particular direction. Normal subjects only require about 5% coherence before detecting the common direction of motion. Baker et al. concluded that “the presence of even very small percentages of stationary ‘noise’ dots was sufficient to totally disrupt direction discrimination of moving ‘signal’ dots”. The results of the first four experiments show that LM can attend to many directions of motion simultaneously, provided they are all generated by the same higher-order Gestalt. Experiment 6 tests whether this ability is also disrupted by static noise.
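To make the notion of percentage coherence concrete, the following sketch builds the list of dot directions for a random-dot display in which a stated fraction of dots share one direction. It is an illustration only, not the stimulus code of Baker et al. (1991): the function name, the dot count of 100, and the choice of uniformly random directions for the remaining dots are all assumptions.

```python
import random

def make_dot_directions(n_dots, coherence, signal_direction_deg, rng):
    """Return one motion direction (in degrees) per dot.

    A fraction `coherence` of the dots share `signal_direction_deg`;
    the remaining dots get independent, uniformly random directions.
    """
    n_signal = round(n_dots * coherence)
    directions = [signal_direction_deg] * n_signal
    directions += [rng.uniform(0.0, 360.0) for _ in range(n_dots - n_signal)]
    rng.shuffle(directions)  # so the signal dots are not listed first
    return directions

rng = random.Random(0)  # fixed seed, purely for reproducibility
lm_display = make_dot_directions(100, 0.80, 90.0, rng)       # near LM's reported threshold
control_display = make_dot_directions(100, 0.05, 90.0, rng)  # near the normal threshold
```

On this reading, LM needs a display like `lm_display`, where four dots in five move together, before a common direction emerges, whereas normal observers manage with something like `control_display`.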

Method
Displays
This was a repeat of Experiment 2 (using just the normal-sized figures), except that stationary noise dots were added to the display. LM watched a computer-generated Johansson figure, which appeared to be walking on a treadmill, facing left, right, or towards the observer. The figure was either normal or jumbled, following the algorithm described in Experiment 2. The figure subtended an angle of 8° × 4°, but the display also contained 25 static points of light distributed at random across an area of 17° × 22°. The position of these varied randomly from trial to trial. In a typical display, there would be no more than 4 or 5 noise dots in the vicinity of the figure.

Task
LM watched 48 video clips, each lasting 5 sec, 24 of which were normal and 24 jumbled. Her task was to report whether or not the display showed a normal figure.
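The low density of noise described under Displays can be checked with rough arithmetic. The sketch below assumes that the 25 noise dots are placed independently and uniformly over the 17° × 22° field, that the 8° × 4° extent of the figure lies wholly inside that field, and that "the vicinity of the figure" can be approximated by that extent; none of these details is stated in the text. On these assumptions about two noise dots fall within the figure's extent on average, and more than five is rare.

```python
from math import comb

N_NOISE_DOTS = 25
FIGURE_AREA = 8 * 4     # figure extent in square degrees (8 x 4)
FIELD_AREA = 17 * 22    # noise field in square degrees (17 x 22)

# Chance that one uniformly placed noise dot lands inside the figure's extent.
p_inside = FIGURE_AREA / FIELD_AREA
expected_inside = N_NOISE_DOTS * p_inside

def prob_at_most(k, n, p):
    """Binomial probability of k or fewer successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

p_five_or_fewer = prob_at_most(5, N_NOISE_DOTS, p_inside)
print(f"expected dots inside figure extent: {expected_inside:.1f}")
print(f"P(5 or fewer inside): {p_five_or_fewer:.3f}")
```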

Results
The small number of stationary noise dots has little effect on the ability of normal subjects to attend to the moving dots and identify whether they represent a normal figure walking. They perform at greater than 90% accuracy with much briefer displays than those shown to LM (Perrett et al., 1990). In contrast, LM’s performance was at chance (48% correct for a 2-choice discrimination), χ²(1) = 0.09, p > .7. Table 6 shows that she was unable to discriminate between normal and jumbled stimuli. For comparison, in Experiment 2 LM scored 75% correct on the normal/jumbled discrimination with figures of the size used in this experiment (see Table 1).

TABLE 6
Relative Frequency With Which LM Classified the Normal and Jumbled Stimuli in Experiment 6, Where They Were Viewed Against a Background of Low-density Static Noise

                   Response
Stimulus           Normal    Jumbled
Normal             10        14
Jumbled            11        13

Conclusion The addition of static visual noise destroys LM’s ability to extract information in Johansson light-point displays. Seeing the relation between the moving dots would be impossible if the moving and stationary dots cannot be segregated. The present result suggests that LM’s previously reported inability to attend to just the moving stimuli in meaningless displays of moving and stationary stimuli extends to Johansson displays. The result of this experiment suggests that the ability to perform figural segregation on the basis of movement cues and the ability to interpret the movement in a part of the scene following segregation may be independent. In LM the former is damaged, while aspects of the latter seem relatively intact, provided there is object-based coherence to the scene.

GENERAL DISCUSSION

The most striking aspect of LM’s performance with Johansson displays, given her deficit in low-level motion perception, is the fact that she can interpret them at all. But the various ways in which her perception of these displays is abnormal may be revealing about the operation of the sub-system that extracts structure from motion. We may gain insight into its properties and its relation to other motion-processing areas by contrasting the information that LM can and cannot extract from light-point displays.

In summary, LM can:
1. identify the actions portrayed in Johansson displays (Experiments 1, 3, and 4);
2. distinguish a correct Johansson figure from an inverted or jumbled one (Experiments 1 and 2);
3. discriminate whether the movement in a Johansson display is generated by a human figure or a rigid, non-biological object (Experiments 3 and 4);
4. give the direction of rotation of an object, provided its apparent rotation is primarily in the plane of the screen (Experiments 3 and 4);
5. possibly identify some rigid, non-biological objects (although this ability might arise from identification of static “snapshots”) (Experiment 3).

Despite these preserved abilities, LM remains unable to:
1. distinguish a normal from a jumbled Johansson figure in the presence of static visual noise (Experiment 6);
2. report on the spatial disposition of an actor whose action she can identify; for example, she cannot tell whether a figure, identified as a cyclist from its actions, is facing to the left or the right as it cycles, or tell whether a figure, identified by its actions as a walker, is facing in the same direction that it is walking (Experiments 3, 4, and 5);
3. report the direction of rotation of a rigid, non-biological object when this involves considerable apparent movement out of the plane of the screen (e.g. the rotating “spring” in Experiment 3);
4. report motion in depth for expanding/contracting figures even when this involves a change of 300%/30% in a few seconds; for example, she cannot tell whether an expanding figure is approaching or retreating, and the contraction and expansion accompanying an opening door is reported as a change in shape, not as movement in depth (Experiments 3 and 4).

This pattern of abilities and disabilities suggests the following conclusions:

1. There Is a System for Recognizing Structure From Motion Information Outside the Conventional Motion Pathway. As LM’s bilateral lesion is primarily in the dorsal pathway, the natural assumption is that the system responsible for her residual abilities lies in the ventral pathway.
The possibility of a structure-from-motion centre in the ventral path is also supported by the report of neurons in infero-temporal cortex, which, although not directionally sensitive, show response to shape defined by motion (Sary et al., 1993). Zeki (1993) has proposed that there is a “dynamic form” system in the ventral stream, involving V3. An alternative location, although both sites may play a role, is the anterior STS region (STPa), where Oram and Perrett (1994) have found cells in macaque monkeys which respond selectively to Johansson-type displays showing the motion of familiar biological figures. They report that some of the cells in this area can discriminate between jumbled and normal displays and that some are insensitive to the orientation of an actor. Both features have parallels in LM’s performance (see Experiments 2 and 5, respectively).


2. Input From the Dorsal Route May Not Be Obligatory for Extracting Structure From Motion, at Least With Familiar Biological Figures. Zeki has suggested that the movement input to the ventral structure-from-motion system comes from the traditional movement areas: “when a structure is generated from coherent motion, the motion must be generated first in V5, and the results communicated to V3 . . . to stimulate the form selective cells there” (Zeki, 1993, p. 332). Given LM’s inability on conventional movement tasks and her preserved ability at recognizing actions in Johansson light-point displays, the natural assumption is that the system that extracts structure from motion has inputs that are independent of the conventional motion areas. As the ventral route has inputs from the magnocellular layers of the LGN via V1 and V2 (see, e.g., Merigan & Maunsell, 1993), motion processing might, in principle at least, proceed in the ventral stream without any input from V5/MT.

3. One Role for the Dorsal Motion System Is to Segregate Those Parts of the Display That Are Linked by a Common Motion Characteristic. An alternative characterization of the role that V5/MT plays in structure-from-motion perception, which can accommodate LM’s performance, stems from previous work on visual search (McLeod et al., 1988; McLeod, Driver, Dienes, & Crisp, 1991). Normal subjects can direct attention in a display of moving and stationary stimuli to any group of stimuli with a common movement characteristic (“filtering by movement”). As LM is unable to do this (McLeod et al., 1989), we argued that filtering by movement is one function of the dorsal stream. Experiment 6 showed that LM’s ability to extract structure from motion is completely destroyed by the addition of static noise to the display. It appears that filtering by movement is not preserved for her even in displays of biological motion, confirming that such filtering is one role of the dorsal route.
This suggests an alternative view of the dependence of the ventral “dynamic form” system on the dorsal motion processing in V5/MT. The former may rely on the latter to segregate those parts of the display that belong together, and whose relative motion could therefore provide an input to structural recognition. On this view, V5/MT modulates the motion input to the ventral stream rather than being its sole source. Thus the role for V5/MT in providing an input to structure-from-motion processing may not be obligatory; rather, it will be required only when there are a number of objects with different movement characteristics in the display, so that segmentation of those elements that belong together becomes necessary.²

² LM reports particular difficulty with real-life situations, such as crowds, where different objects move in different directions simultaneously.

4. Coding in the Dynamic Form System May Be Primarily Object-centred. Our fourth conclusion stems from the observations that LM cannot report either the spatial disposition of the cyclist or walker (despite identifying the action) or motion in depth. These results suggest that the coding of relative movement within her preserved dynamic form system may be primarily object-centred. That is, it specifies the relative motion of the dots in a way that identifies the action or object independent of view-point. We suggest that view-point-specific information from biological motion stimuli, which would include the actor’s spatial orientation and motion relative to the observer, would be provided by the “traditional” dorsal movement areas. These are disrupted in LM, leaving her reliant on the object-centred coding in the ventral dynamic form system. Thus she can report familiar objects and actions, but not their disposition relative to herself.³

Although this conclusion is somewhat speculative, given our limited data, it is consistent with a variety of other work. The proposal that she does not have access to viewer-centred information accounts directly for her inability to report the orientation of actors. It also suggests an explanation for her description of the expanding and contracting displays in Experiments 3 and 4. She reports the motion in the displays of inanimate objects (such as the opening door; see Figure 1) as a change in the shape of the object, not as approach or retreat of the object. That is, she chooses an object-centred interpretation of the change in the display rather than the alternative viewer-centred interpretation. It is possibly significant that she generally makes no report of change in the expanding or contracting displays identified as the actions of people. People do not change shape, so no object-centred interpretation is possible in these cases. The object-centred account is neatly illustrated by her inability to distinguish between a rotating and a rolling wheel.
In object-centred coordinates there is no difference between rotation and rolling; the difference is only apparent to an external viewer. If she had lost the viewer-centred representation, she would not be able to distinguish them.

It is not clear, however, why she can report direction of rotation. The changing relationship between dots that occurs during rotation provides object-centred information, so we would expect her to be able to report this. But direction of rotation is a viewer-centred interpretation of the display (for example, clockwise motion when viewed from the front of an object becomes anti-clockwise when viewed from behind). Our account might therefore seem to predict that she should not be able to report the direction of rotation. Perhaps, though, one can argue that information provided by rotation is a special case. The dorsal route is thought to be concerned with the spatial relations between viewer and object that are necessary for control of actions (Goodale & Milner, 1992): the object’s orientation and translation relative to the viewer, for example. Although direction of rotation is viewer-centred, the information that rotation provides may be as relevant to the object-identification function of the ventral pathway as it is to the dorsal function of controlling action. As an object rotates, its viewer-centred representation must change, even though its intrinsic object-centred shape does not. Rotations in depth can reveal previously hidden features of the object, while even rotations in the image plane alter the viewer-centred shape. Thus the ability to correctly extract rotational movement, so as to correct for any changes in the projected shape that are produced by this alone, may be as important for the ventral function of object recognition as for the viewer-centred functions of the dorsal stream. This might explain why LM has apparently preserved coding of rotation despite a general loss of viewer-centred information. While her correct reports of the direction of rotation are problematic for a strict dichotomy between object-centred and viewer-centred information, they may be consistent with dichotomous functions of the ventral and dorsal pathways.

3 We assume that LM’s successful discrimination of left and right translation was based, as she suggested herself, on an inference from change in position rather than on genuine perception of translational motion.
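The two geometric points in this argument (that rolling and rotating differ only for an external viewer, and that the reported direction of rotation depends on the side from which the object is viewed) can be illustrated with a small numerical sketch. It is an illustration only, not a model proposed here: the eight-dot wheel, the step size `theta`, all function names, and the treatment of “object-centred” coding as dot positions relative to the dots’ own centroid are assumptions made for the example.

```python
import math

def rim_points(angle, n_dots=8, radius=1.0, centre=(0.0, 0.0)):
    """Image (viewer-centred) positions of dots on a wheel rim turned by `angle`."""
    cx, cy = centre
    return [
        (cx + radius * math.cos(angle + 2 * math.pi * k / n_dots),
         cy + radius * math.sin(angle + 2 * math.pi * k / n_dots))
        for k in range(n_dots)
    ]

def object_centred(points):
    """Assumed stand-in for object-centred coding: positions relative to the centroid."""
    mx = sum(x for x, _ in points) / len(points)
    my = sum(y for _, y in points) / len(points)
    return [(x - mx, y - my) for x, y in points]

def close(a, b, tol=1e-9):
    """True if two dot configurations match dot-for-dot within `tol`."""
    return all(abs(ax - bx) < tol and abs(ay - by) < tol
               for (ax, ay), (bx, by) in zip(a, b))

def signed_turn(before, after):
    """Turn of dot 0 about the centroid: +1 anticlockwise, -1 clockwise (y-axis up)."""
    (x0, y0) = object_centred(before)[0]
    (x1, y1) = object_centred(after)[0]
    return math.copysign(1.0, x0 * y1 - y0 * x1)

def from_behind(points):
    """Same display seen from the opposite side: the horizontal image axis is mirrored."""
    return [(-x, y) for x, y in points]

theta = 0.3                                  # hypothetical turn between two frames (radians)
start = rim_points(0.0)                      # wheel before the turn
spinning = rim_points(-theta)                # wheel turning in place
rolling = rim_points(-theta, centre=(theta, 1.0))  # same turn plus translation along the ground

same_in_image = close(spinning, rolling)
same_in_object = close(object_centred(spinning), object_centred(rolling))
front_sign = signed_turn(start, spinning)
back_sign = signed_turn(from_behind(start), from_behind(spinning))

print(same_in_image, same_in_object, front_sign, back_sign)
```

Under these assumptions the spinning and rolling wheels differ in image coordinates but coincide once each is expressed relative to its own centroid, and the sign of the turn reverses when the display is mirrored to simulate viewing from behind. The configuration of the dots carries the rotation itself, whereas its direction is fixed only relative to a viewer.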

Object- and Viewer-based Coordinates for Ventral and Dorsal Systems

This account would agree with the most generally accepted division between processing in the ventral and dorsal streams. The primary role of the ventral system is object recognition; thus it codes stimuli in object-based coordinates. The primary role of the dorsal system is to locate stimuli relative to the viewer, for the purposes of controlling on-line actions, such as reaching towards an object (Goodale & Milner, 1992; McCarthy, 1993; Plaut & Farah, 1990). Thus it codes in viewer-centred coordinates. Such a division has been made in the past on the basis of a wide body of data, ranging from single-cell recordings in monkeys as a function of stimulus location (Stein, 1992) to the effects of brain injury on human performance (Goodale, Milner, Jakobson, & Carey, 1991; McCarthy, 1993). However, in all this past work, the distinction between object-centred and viewer-centred coding has been applied solely to static stimuli; the present observations with LM suggest that it may also be useful in understanding the coding of motion. This account seems consistent with LM’s otherwise strange phenomenal descriptions of perceiving “walking” in Johansson displays without perceiving either the walker or where the walker is heading. It also suggests why she cannot perform many of the apparently simple tasks used in psychophysical tests of motion detection: the information they require is essentially viewer-centred, namely the motion of a point of light or a flow field relative to the observer, rather than motion generated by the relative articulation of the components of an object.


REFERENCES
Baker, C., Hess, R., & Zihl, J. (1991). Residual motion perception in a “motion-blind” patient, assessed with limited-lifetime random dot stimuli. Journal of Neuroscience, 11, 454–461.
Cheng, K., Hasegawa, T., Saleem, K., & Tanaka, K. (1994). Comparison of neuronal selectivity for stimulus speed, length, and contrast in the prestriate visual cortical areas V4 and MT of the macaque monkey. Journal of Neurophysiology, 71, 2269–2280.
Corbetta, M., Miezin, F., Dobmeyer, S., Shulman, G., & Petersen, S. (1991). Selective and divided attention during visual discrimination of shape, color and speed: Functional anatomy by positron emission tomography. Journal of Neuroscience, 11, 2383–2402.
Desimone, R., & Ungerleider, L. (1989). Neural mechanisms of visual processing in monkeys. In F. Boller & J. Grafman (Eds.), Handbook of neuropsychology. New York: Elsevier.
Dittrich, W. (1993). Action categories and the perception of biological motion. Perception, 22, 15–22.
Ferrera, V., Nealey, T., & Maunsell, J. (1994). Responses in macaque visual area V4 following inactivation of the parvocellular and magnocellular LGN pathways. Journal of Neuroscience, 14, 2080–2088.
Goodale, M., & Milner, A. (1992). Separate visual pathways for perception and action. Trends in Neurosciences, 15, 20–25.
Goodale, M., Milner, A., Jakobson, L., & Carey, D. (1991). A neurological dissociation between perceiving objects and grasping them. Nature, 349, 154–156.
Hess, R., Baker, C., & Zihl, J. (1989). The “motion blind” patient: Low-level spatial and temporal filters. Journal of Neuroscience, 9, 1624–1640.
Johansson, G. (1973). Visual perception of biological motion and a model for its analysis. Perception and Psychophysics, 14, 201–211.
Livingstone, M., & Hubel, D. (1988). Segregation of form, color, movement and depth: Anatomy, physiology and perception. Science, 240, 740–749.
McCarthy, R. (1993). Assembling routines and addressing representations: An alternative conceptualization of “what” and “where” in the human brain. In N. Eilan, R. McCarthy, & B. Brewer (Eds.), Spatial representation: Problems in philosophy and psychology. Oxford: Blackwell.
McLeod, P., Driver, J., & Crisp, J. (1988). Visual search for a conjunction of movement and form is parallel. Nature, 332, 154–155.
McLeod, P., Driver, J., Dienes, Z., & Crisp, J. (1991). Filtering by movement in visual search. Journal of Experimental Psychology: Human Perception and Performance, 17, 55–64.
McLeod, P., Heywood, C., Driver, J., & Zihl, J. (1989). Selective deficit of visual search in moving displays after extrastriate damage. Nature, 339, 466–467.
Merigan, W., & Maunsell, J. (1993). How parallel are the primate visual pathways? Annual Review of Neuroscience, 16, 369–402.
Oram, M., & Perrett, D. (1994). Responses of anterior superior temporal polysensory (STPa) neurons to “biological motion” stimuli. Journal of Cognitive Neuroscience, 6, 99–116.


Perrett, D., Harries, M., Benson, P., Chitty, A., & Mistlin, A. (1990). Retrieval of structure from rigid and biological motion: An analysis of the visual response of neurons in the macaque temporal cortex. In A. Blake & T. Troscianko (Eds.), AI and the eye. Chichester: John Wiley.
Perrett, D., Harries, M., Bevan, R., Thomas, S., Benson, P., Mistlin, A., Chitty, A., Hietanen, J., & Ortega, J. (1989). Frameworks of analysis for the neural representation of animate objects and actions. Journal of Experimental Biology, 146, 87–114.
Peterhans, E., & von der Heydt, R. (1993). Functional organization of area V2 in the alert macaque. European Journal of Neuroscience, 5, 509–524.
Plaut, D., & Farah, M. (1990). Visual object representation: Interpreting neurophysiological data within a computational framework. Journal of Cognitive Neuroscience, 2, 320–343.
Sary, G., Vogels, R., & Orban, G. (1993). Cue invariant shape selectivity of macaque inferior temporal neurons. Science, 260, 995–997.
Schiller, P. (1993). The effects of V4 and middle temporal (MT) area lesions on visual performance in the rhesus monkey. Visual Neuroscience, 10, 717–764.
Stein, J. (1992). The representation of egocentric space in the posterior parietal cortex. Behavioural and Brain Sciences, 15, 691–700.
Sumi, S. (1984). Upside-down presentation of the Johansson moving light-spot pattern. Perception, 13, 283–286.
Ungerleider, L., & Mishkin, M. (1982). Two cortical visual streams. In D. Ingle, R. Mansfield, & M. Goodale (Eds.), The analysis of visual behaviour. Cambridge, MA: MIT Press.
Vaina, L., Lemay, M., Bienfang, D., Choi, A., & Nakayama, K. (1990). Intact “biological motion” and “structure from motion” perception in a patient with impaired motion mechanisms: A case study. Visual Neuroscience, 5, 353–369.
Watson, J., Myers, R., Frackowiak, R., Woods, R., Mazziotta, J., Shipp, S., & Zeki, S. (1993). Area V5 of the human brain: Evidence from a combined study using positron emission tomography and magnetic resonance imaging. Cerebral Cortex, 3, 79–94.
Zeki, S. (1993). A vision of the brain. Oxford: Blackwell.
Zeki, S., Watson, J., Lueck, C., Friston, K., Kennard, C., & Frackowiak, R. (1991). A direct demonstration of functional specialisation in human visual cortex. Journal of Neuroscience, 11, 641–649.
Zihl, J., von Cramon, D., & Mai, N. (1983). Selective disturbance of vision after bilateral brain damage. Brain, 106, 313–340.
Zihl, J., von Cramon, D., Mai, N., & Schmid, C. (1991). Disturbance of movement vision after bilateral posterior brain damage: Further evidence and follow up observations. Brain, 114, 2235–2252.

Revised manuscript received 17 August 1995
