SPATIAL COGNITION AND COMPUTATION, 6(4), 295–308 Copyright © 2006, Lawrence Erlbaum Associates, Inc.

Attention Unites Form and Function in Spatial Language

Laura A. Carlson, University of Notre Dame
Terry Regier, University of Chicago
William Lopez, University of Notre Dame
Bryce Corrigan, University of Chicago

Abstract. Recent research on spatial language has progressed along two largely separate fronts. First, there has been an effort to examine the role of attention within spatial language, focusing primarily on abstract geometric representations of objects. Second, there have been numerous demonstrations that spatial language is influenced by the intended function of objects as well as their geometry, but the mechanism by which this occurs has been left largely unspecified. We bring together these two lines of inquiry, and argue that attention may integrate geometric and functional information. Specifically, we argue that preferential attention to the functional parts of objects may explain effects of object function on the interpretation of spatial terms. We show this empirically and computationally, using an attentional model of spatial language.

Keywords: attention, spatial terms, function, vectors, landmark.

Correspondence concerning this article should be addressed to Laura Carlson, Department of Psychology, University of Notre Dame, Notre Dame, IN 46556, email: [email protected]; or to Terry Regier, Department of Psychology, University of Chicago, Green 414, 5848 South University Ave., Chicago, IL 60637, email: [email protected].


Introduction

Traditionally, comprehension of spatial terms has been described as a process by which a core meaning is identified, consisting of geometric features (Bennett, 1975; Clark, 1973; Fillmore, 1971; Herskovits, 1986; Lyons, 1968; Miller & Johnson-Laird, 1976). This definition is applied to objects schematized on the basis of their form (Herskovits, 1986; Landau & Jackendoff, 1993; Talmy, 1983). For example, comprehension of "The toothpaste is above the toothbrush" would involve schematizing the tube of toothpaste (the target) as a cylinder and the toothbrush (the reference object) as an elongated rectangle, and verifying that the cylinder was located in a vertical direction from the rectangle, centered within its extended bounds, the presumed best geometric location (Gapp, 1995; Hayward & Tarr, 1995; Logan & Sadler, 1996; Regier & Carlson, 2001; Schirra, 1993). This traditional geometric perspective treats spatial language as largely independent of other, non-geometric processes and representations.

Two separate lines of work have further elaborated this view, one focusing on mechanism and the other on underlying object representations. First, there has been an effort to link the apprehension of spatial terms with the mechanism of attention (for review, see Carlson & Logan, 2005), while largely assuming simplified geometric representations of objects. Second, there has been an effort to define spatial terms not only with respect to geometric features, but also with respect to general knowledge about the objects, such as function (for review, see Coventry & Garrod, 2004), while leaving largely unspecified the process by which such influences are combined. We bring together these two lines of research by suggesting that attention may assist in uniting geometric and functional information.

An Attentional Model That Assumes Geometric Representations

Attention is central to the apprehension of spatial relations (Logan, 1994, 1995; Moore, Elsinger, & Lleras, 2001; Regier & Carlson, 2001; Rosielle, Crabb, & Cooper, 2002). A possible application of this idea to spatial language is suggested by the attentional vector sum (AVS) model (Regier & Carlson, 2001). In this model, shown in Figure 1, an attentional spotlight (in light gray) is anchored at a focus point on the reference object, near the target. Attention is maximal at the focus point, and drops off with distance from it (panel a). Vectors are defined, rooted at each point within the reference object and pointing toward the target (panel b), with each vector weighted by the amount of attention at its root (panel c). These weighted vectors are summed (e.g., Georgopoulos, Schwartz, & Kettner, 1986; Wilson & Kim, 1994), with the vector sum (panel d) taken as an overall measure of the direction of the target relative to the reference object. The acceptability of a given spatial term is based on the degree of alignment of the vector sum with a given reference orientation (e.g., upright vertical for above; panel e). This model has accounted for acceptability ratings for projective spatial terms relative to a variety of abstract geometric reference objects (Regier & Carlson, 2001).

Figure 1. The attentional vector sum (AVS) model. Panel (a) shows an attentional beam focused on the rectangular reference object at the location closest to the circular target. Panel (b) shows vectors rooted across the reference object, pointing to the target. Panel (c) shows that the vectors are weighted by the amount of attention allocated to their roots, represented by the length of the vector. Panel (d) shows the overall direction of the target relative to the reference object, represented as a sum across the attentionally weighted vectors. Panel (e) shows the overall direction measured relative to a reference orientation (in this case, upright vertical). Panel (f) shows the reference object with a functional part (outlined in bold on the left side of the object), a set of vectors whose weights are determined jointly by attention and function, and the resulting overall direction summed across these vectors, relative to the reference orientation.
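To make the computation concrete, the following is a minimal sketch (in Python) of the geometric AVS steps described above and specified in the Appendix. It is an illustration under simplifying assumptions: the reference object is represented as a coarse grid of points, the height factor is omitted (it is effectively 1 for targets above the object), and the parameter values are placeholders rather than the fitted values reported by Regier and Carlson (2001).

import numpy as np

def avs_rating(ref_points, target, lam=1.0, slope=-0.01, intercept=1.0):
    """Sketch of the AVS-predicted 'above' rating for a target and reference object.
    ref_points: (N, 2) array of (x, y) points making up the reference object.
    target: (x, y) position of the target."""
    ref_points = np.asarray(ref_points, dtype=float)
    target = np.asarray(target, dtype=float)
    # Focus point: the point on top of the reference object closest to being
    # vertically aligned with the target (panel a).
    top = ref_points[ref_points[:, 1] == ref_points[:, 1].max()]
    focus = top[np.argmin(np.abs(top[:, 0] - target[0]))]
    # Attention drops off exponentially with distance from the focus point,
    # scaled by the focus-to-target distance sigma (panel a).
    sigma = np.linalg.norm(target - focus)
    d = np.linalg.norm(ref_points - focus, axis=1)
    attention = np.exp(-d / (lam * max(sigma, 1e-9)))
    # Vectors rooted at each point, pointing at the target, weighted by
    # attention and summed (panels b-d).
    vectors = target - ref_points
    vec_sum = (attention[:, None] * vectors).sum(axis=0)
    # Angular deviation from upright vertical (panel e), mapped to a rating by
    # the linear function g(alpha) = m * alpha + b.
    alpha = np.degrees(np.arctan2(vec_sum[0], vec_sum[1]))  # 0 deg = straight up
    return slope * abs(alpha) + intercept

# Example: a coarse 9 x 3 grid standing in for the rectangular reference object.
xs, ys = np.meshgrid(np.arange(9), np.arange(3))
rect = np.column_stack([xs.ravel(), ys.ravel()])
print(avs_rating(rect, target=(4, 8)))   # centered above: small deviation, high rating
print(avs_rating(rect, target=(12, 8)))  # off to the side: larger deviation, lower rating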

An Example of the Influence of Object Function on Spatial Language

Carlson-Radvansky, Covey, and Lattanzi (1999) asked participants to place an image of a target above or below an image of a reference object. When the two objects were functionally related (e.g., placing a toothpaste tube above a toothbrush), placements were biased toward the reference object's functional part (here, the bristles). This bias was present but weaker for object pairs that do not typically interact (e.g., a tube of oil paint and a toothbrush). This suggests that the object representations used in spatial language incorporate both geometric and functional information.

Attention as the Mechanism That Combines Geometric and Functional Information

The AVS model is fundamentally geometric in character; however, its use of attention may offer an explanation of this type of functional effect. Lin and Murphy (1997) demonstrated that attention may be preferentially allocated to functionally important parts of an object during perception. Thus, in a natural extension to AVS, one could increase the attention paid to the functional part of the reference object; in Figure 1(f), this is the part outlined by the bold rectangle on the left side of the object. Greater attention to the functional part will cause the vectors rooted in this part to be more strongly weighted. As a result, the vector sum will align with upright vertical, and above ratings will correspondingly peak, when the target is near this functional part. This is illustrated in Figure 1, panel (f).

To assess this idea, we conducted computer simulations of the extended AVS model with an image of a toothbrush as the reference object, designating the bristles as the functional part (see Figure 2). We examined three conditions: strong function-induced attention to the bristles (as with a toothpaste tube as the target, ϕ = 2), moderate function-induced attention (as with a tube of oil paint as the target, ϕ = 1), and no function-induced attention (ϕ = 0; see the Appendix for an explanation of ϕ and further details). Figure 2 plots the AVS-predicted above ratings for the row of points above the toothbrush in each condition. In the strong condition, ratings peak near the bristles: because strong attention is paid to the bristles, the vectors rooted there receive greater weight in the vector sum, and the overall vector sum is well aligned with upright vertical only for points that are horizontally quite near the bristles. In the moderate condition, the ratings peak between the bristles and the center of the object; because less attention is paid to the bristles, vectors rooted there play a smaller role in the overall vector sum. Finally, in the no-function condition, the ratings peak near the center of the toothbrush, because the bristles are treated like any other part of the object. This pattern qualitatively matches Carlson-Radvansky et al.'s (1999) findings, and suggests that attentional differences may underlie this type of functional effect.
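A rough sketch of the functional extension used in these simulations appears below, again in Python and again with placeholder geometry and parameters: the "toothbrush" here is a schematic bar of points whose leftmost points stand in for the bristles, not the stimulus image used in the actual simulations. The only change from the geometric sketch above is that attention at points within the functional part is multiplied by (1 + ϕ), as in the Appendix.

import numpy as np

def extended_avs_rating(ref_points, functional_mask, target, phi,
                        lam=1.0, slope=-0.01, intercept=1.0):
    """'Above' rating with function-boosted attention: A_i = a_i * (1 + phi) in the functional part."""
    ref_points = np.asarray(ref_points, dtype=float)
    target = np.asarray(target, dtype=float)
    top = ref_points[ref_points[:, 1] == ref_points[:, 1].max()]
    focus = top[np.argmin(np.abs(top[:, 0] - target[0]))]
    sigma = np.linalg.norm(target - focus)
    d = np.linalg.norm(ref_points - focus, axis=1)
    attention = np.exp(-d / (lam * max(sigma, 1e-9)))
    # Function-induced attention: boost the weights of points in the functional part.
    attention = np.where(functional_mask, attention * (1.0 + phi), attention)
    vectors = target - ref_points
    vec_sum = (attention[:, None] * vectors).sum(axis=0)
    alpha = np.degrees(np.arctan2(vec_sum[0], vec_sum[1]))
    return slope * abs(alpha) + intercept

# Schematic "toothbrush": a horizontal bar of points whose leftmost points are
# treated as the bristles (the functional part).
xs, ys = np.meshgrid(np.arange(20), np.arange(3))
brush = np.column_stack([xs.ravel(), ys.ravel()])
bristles = brush[:, 0] < 4
# Predicted "above" ratings along a row of points over the brush for the three
# levels of function-induced attention; the peak should sit near the center for
# phi = 0 and shift toward the bristle end (lower index) as phi grows.
row = [(x, 8) for x in range(0, 20, 2)]
for phi in (0.0, 1.0, 2.0):  # none, moderate, strong
    ratings = [extended_avs_rating(brush, bristles, t, phi) for t in row]
    print(phi, np.argmax(ratings))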

Figure 2. Simulation results. The toothbrush bristles, circled in black, were taken as the functional part of the toothbrush. AVS-predicted above ratings (y-axis) were obtained for each position in the row of dots shown over the toothbrush, under three conditions: no (ϕ = 0, dashed curve), moderate (ϕ = 1, dotted curve), and strong (ϕ = 2, solid curve) function-induced attention. Stronger function-induced attention produced larger shifts of the peak of "above" ratings toward the functional part of the toothbrush.

Functional Effects via Attention in Spatial Language

Our empirical work tests two critical underlying assumptions of this account. The first assumption is that understanding a linguistic description of a projective spatial relation between two objects requires attention to the part of the reference object nearest the target. This leads to the prediction that prior cueing of attention to a given part of the object should speed the processing of a spatial description for an object located near that part, because attention is already allocated to that part, relative to a case in which attention is cued to a different part of the object and must shift to the part nearest the target. This prediction is tested in Experiment 1. The second assumption is that object function will attract attention and cause a similar pattern of attentional facilitation. That is, the functional part of an object may act as a cue, drawing attention. If this is the case, then processing a spatial description relative to the functional part should be facilitated relative to a different part. Importantly, such facilitation should be observed only when the functional interaction of the objects is enabled by their positions and by the constraints of the spatial term (Carlson & Kenny, in press). In contrast, when the two objects are positioned in a manner that does not enable their functional interaction, there should be no such facilitation, because there should be no attentional highlighting of the functional part of the reference object. This prediction is tested in Experiment 2.

Experiment 1

In Experiment 1 we tested the idea that interpreting a spatial relation between a target and a reference object requires attending to a part of the reference object that is close to the target. This tests the underlying assumption in AVS about how the attentional beam is allocated across the reference object, as encapsulated in Figure 1, panel (a). If this assumption is correct, then when attention is allocated near a part of the object close to the target, processing the spatial relation should be facilitated, relative to when attention is allocated to a part that is not near the target. To assess this, Experiment 1 used a speeded sentence-picture verification paradigm in which attention was cued to the left, center, or right side of a rectangle by means of an exogenous cue, an established means for anchoring attention (e.g., Jonides, 1981; Posner, 1980). With this technique, we could be confident that attention was allocated to parts of the object near the cue. We then compared response times for processing spatial descriptions of targets located near cued versus non-cued parts. The critical prediction was faster response times for targets located near the cued parts than for targets located near the non-cued parts.

Method

Participants, Stimuli, Design, and Procedure. Thirty-two participants performed a speeded verification task (Clark & Chase, 1972) in which a sentence of the form The circle is above/below the rectangle appeared for 2000 ms, followed by presentation of the rectangle (width: 10.13 cm; height: 1.6 cm) containing a central fixation point ("+"). After 500 ms, a transient cue (the symbol "#", .53 cm) appeared for 150 ms within the center, left, or right side (3.52 cm from center) of the rectangle. The cue disappeared for 50 ms, followed by the presentation of a small circle (the target; diameter = .53 cm) in one of 22 locations (spaced 1.76 cm apart horizontally, with 2.82 cm from the center of the rectangle to the center of the circle vertically) around the reference object (see Figure 3, panel (a), for cue and target placements). This sentence-picture presentation follows other work investigating attention in spatial language (e.g., Logan, 1994, 1995). Yes or no judgments indicating whether the sentence was an acceptable description of the display were made by pressing the n or m key on a standard keyboard using two fingers of the same hand, with the assignment of responses to keys counterbalanced across participants. Response times were measured from the onset of the target, and the accuracy of each judgment was recorded. There were 132 trials in all, constructed from 22 placements × 2 spatial terms (above/below) × 3 cue locations (left, center, right).
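For concreteness, a minimal sketch of the Experiment 1 design and trial timeline follows, using only the values given in the text; the placement indices, dictionary keys, and function name are illustrative rather than taken from the original experiment script.

import itertools
import random

TIMELINE_MS = [
    ("sentence", 2000),               # "The circle is above/below the rectangle"
    ("rectangle_with_fixation", 500),
    ("cue", 150),                     # '#' at the left, center, or right of the rectangle
    ("blank", 50),
    ("target_until_response", None),  # RT measured from target onset
]

def build_trials(seed=0):
    """132 trials: 22 placements x 2 spatial terms x 3 cue locations."""
    placements = range(22)            # indices of the 22 possible target locations
    terms = ("above", "below")
    cues = ("left", "center", "right")
    trials = [{"placement": p, "term": t, "cue": c}
              for p, t, c in itertools.product(placements, terms, cues)]
    random.Random(seed).shuffle(trials)
    return trials

assert len(build_trials()) == 132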


Figure 3. Reference objects, targets, and placements in Experiments 1–2. Panel (a) shows a rectangle with three possible cue (#) locations and the target (circle) at its possible locations. Panel (b) shows a watering can and a plant at its possible locations.

Results and Discussion

The critical data were correct yes responses for locations above and below the cued positions (see Figure 3). Data for 4 participants were discarded due to accuracy below 52%. Overall accuracy for the remaining 28 participants was very high (M = 95%), precluding error analyses. Correct critical trials with response times less than 300 ms or greater than 2000 ms were trimmed (2%) and replaced with the participant's overall mean response time. Figure 4 presents mean correct response times for placements of the target above and below the critical cue locations as a function of cue. Overall, mean response time when the target was at a cued location (averaging above and below positions on the left when the cue was on the left, in the center when the cue was in the center, and on the right when the cue was on the right; M = 569 ms) was significantly faster than mean response time at the non-cued locations (averaging above and below positions at the center and right when the cue was on the left, at the left and right when the cue was in the center, and at the center and left when the cue was on the right; M = 606 ms), t(27) = 2.5, p = .02, one-tailed. This pattern held for 5 of the 6 sets of bars (the unexplained exception was below with a left cue). Thus, when an exogenous cue known to elicit attention was used to draw attention to a particular location, verification of a spatial term for targets near that location proceeded more quickly.

Figure 4. Mean correct "yes" response times for Experiment 1 as a function of cue location, critical placements of the target, and spatial term. Error bars are 95% confidence intervals.
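As an illustration of the critical comparison (not the actual data), the following sketch runs the paired, one-tailed t test on hypothetical per-participant mean response times for cued versus non-cued locations.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 28
cued_rt = rng.normal(569, 60, n)               # hypothetical per-participant means (ms), cued locations
noncued_rt = cued_rt + rng.normal(37, 50, n)   # non-cued locations, slower on average
t, p_two = stats.ttest_rel(noncued_rt, cued_rt)
p_one = p_two / 2 if t > 0 else 1 - p_two / 2  # one-tailed: non-cued slower than cued
print(f"t({n - 1}) = {t:.2f}, one-tailed p = {p_one:.3f}")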

Experiment 2

The key finding of Experiment 1 was facilitated processing of spatial descriptions involving targets located near a known locus of attention. In Experiment 2, we used this pattern of facilitation around a known locus of attention to diagnose the allocation of attention within objects with functional parts. The logic was that if processing a spatial description relative to a particular functional part of the reference object was facilitated relative to another part of the reference object, this would indicate an allocation of attention to the functional part. Such a pattern would support the underlying assumption of AVS encapsulated in Figure 1, panel (f). A watering can served as the reference object, with the spout as the functional part and a plant as the target that selectively interacts with this part. Objects and placements are shown in Figure 3, panel (b). Response times for positions of the target that enable its interaction with the reference object (below and on the right) should be facilitated relative to response times for below-left placements matched in distance from the center; this is analogous to the strong-function condition of AVS. For above placements, the plant does not interact with the spout, so the pattern of response times should reflect geometry rather than function, with center locations facilitated and no difference between sides; this is analogous to the no-function condition of AVS.

Participants, Stimuli, Design, and Procedure. Twenty-four participants followed the general procedure of Experiment 1, except that there was no exogenous attentional cue. The reference object was a watering can (5.59 cm × 5.8 cm); the target was a plant (2.5 cm × 2.86 cm) placed at one of 10 locations (spaced 4.35 cm apart horizontally, with 6.16 cm between the centers of the objects vertically). The sentence was The plant is above/below the watering can. There were 200 trials in all, constructed from 10 placements × 2 spatial terms (above/below) × 10 repetitions.

Results and Discussion

The critical data were correct yes responses. Data for 3 participants were discarded (accuracy < 66%). Overall accuracy for the remaining 21 participants was high (M = 92%), precluding error analyses. Trials with response times less than 300 ms or greater than 2500 ms were trimmed (3%) and replaced with the participant's overall mean response time. Figure 5 presents mean correct response times as a function of target position and spatial term. A 2 (above/below) × 5 (placement) repeated measures ANOVA revealed a main effect of term, F(1, 20) = 46.7, with above trials (M = 882 ms) faster than below trials (M = 1044 ms), replicating a well-established effect (e.g., Braine, 1978; Carlson & Logan, 2001; Clark & Chase, 1972; McMullen & Jolicoeur, 1990). There was also a main effect of placement, F(4, 80) = 4.5, and, most importantly, a marginally significant interaction between placement and term, F(4, 80) = 2.3, p = .067. For below trials, mean response time to the functional positions (right-near and right-far locations; M = 1020 ms) was significantly faster than mean response time to the non-functional positions (left-near and left-far locations; M = 1100 ms), t(20) = 3.3, p = .004, one-tailed. In contrast, for above trials, mean response time for the right-side positions (M = 911 ms) did not differ significantly from that for the left-side positions (M = 888 ms), t(20) = 1.2, p = .26. These data support the idea that attention was allocated to the functional part, thereby facilitating processing, only when the target was placed in a location that enabled its interaction with the reference object.

Figure 5. Mean correct "yes" response times for Experiment 2 as a function of placement of the target and spatial term. Error bars are 95% confidence intervals.
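A sketch of the 2 (term) × 5 (placement) repeated-measures ANOVA follows, run on hypothetical long-format data rather than the actual per-participant means; the column names and generated values are illustrative only.

import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(1)
rows = [{"subject": s, "term": t, "placement": p,
         "rt": rng.normal(880 if t == "above" else 1040, 80)}  # hypothetical RTs (ms)
        for s in range(21) for t in ("above", "below") for p in range(5)]
data = pd.DataFrame(rows)
# One observation per subject per term x placement cell, as AnovaRM requires.
print(AnovaRM(data, depvar="rt", subject="subject", within=["term", "placement"]).fit())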

General Discussion

A commonly stated supposition within research on spatial terms is that geometric and functional information are combined, with the manner in which they are combined dependent upon the types of objects and their depicted interaction, the spatial term in use, and the context (e.g., Carlson, 2000; Carlson & Kenny, in press; Carlson-Radvansky et al., 1999; Carlson-Radvansky & Tang, 2000; Coventry, Prat-Sala, & Richards, 2001; Coventry & Garrod, 2004). The central contribution of the current paper is to offer a specification of one means by which this may occur: via enhanced attention to the functional parts of objects, as simulated within an extended version of the AVS model and supported by empirical data. Specifically, using the pattern of facilitated processing of spatial descriptions relative to attended parts observed in Experiment 1, the data of Experiment 2 support the idea that attention was allocated to the functional part when the target and reference object were positioned in a manner consistent with their interaction (i.e., enabling one to pour water from the watering can, via the spout, onto the plant).

In addition to geometric and functional information, other sources of information are likely to influence the allocation of attention. For example, Coventry and Garrod (2004) argue that dynamic-kinematic routines underlying the interpretation of the interaction among the objects also play a critical role in the interpretation of spatial terms. We think that this source is also important, and that it could be used to specify the degree of function-induced attention, as reflected in the varying strengths assigned to the function parameter in Figure 2. Further, we view the implementation of the unification of geometry and function within AVS as an existence proof that supports the central role of attention. Other implementations of the same general idea are also possible. For example, increased attention to an object's functional part may cause uncertainty as to whether to relate the target to that part alone or to the entire reference object, and judgments could then reflect a combination of these two possibilities. Alternatively, it may be possible to anchor attention on the target (Coventry, Cangelosi, Joyce, & Richards, 2002). Nevertheless, a common component supported by the empirical and computational work in the current paper is that attention is a unifying force for geometry and function.

Acknowledgments

We thank Shannon van Deman, Ryan Kenny, Meghan Murray, and Katie Hench.

References

Bennett, D. C. (1975). Spatial and temporal uses of English prepositions: An essay in stratificational semantics. London: Longman.
Braine, L. G. (1978). A new slant on orientation perspective. American Psychologist, 33, 10–20.
Carlson, L. A. (2000). Object use and object location: The effect of function on spatial relations. In E. van der Zee & U. Nikanne (Eds.), Cognitive interfaces: Constraints on linking cognitive information (pp. 94–115). Oxford: Oxford University Press.
Carlson, L. A., & Kenny, R. (in press). Interpreting spatial terms involves simulating interactions. Psychonomic Bulletin & Review.
Carlson, L. A., & Logan, G. D. (2001). Using spatial terms to select an object. Memory & Cognition, 29, 883–892.
Carlson, L. A., & Logan, G. D. (2005). Attention and spatial language. In L. Itti, G. Rees, & J. Tsotsos (Eds.), The neurobiology of attention (pp. 330–336). San Diego, CA: Elsevier Academic Press.
Carlson-Radvansky, L. A., Covey, E. S., & Lattanzi, K. M. (1999). "What" effects on "where": Functional influences on spatial relations. Psychological Science, 10, 516–521.
Carlson-Radvansky, L. A., & Tang, Z. (2000). Functional influences on orienting a reference frame. Memory & Cognition, 28, 812–820.
Clark, H. H. (1973). Space, time, semantics and the child. In T. E. Moore (Ed.), Cognitive development and the acquisition of language. New York: Academic Press.
Clark, H. H., & Chase, W. (1972). On the process of comparing sentences against pictures. Cognitive Psychology, 3, 472–517.
Coventry, K. R., & Garrod, S. C. (2004). Saying, seeing and acting: The psychological semantics of spatial prepositions. Hove: Psychology Press.
Coventry, K., Prat-Sala, M., & Richards, L. (2001). The interplay between geometry and function in the comprehension of over, under, above and below. Journal of Memory and Language, 44, 376–398.
Fillmore, C. (1971). Santa Cruz lectures on deixis. Bloomington: Indiana University Linguistics Club.
Gapp, K.-P. (1995). Angle, distance, shape and their relationship to projective relations. In J. D. Moore & J. F. Lehman (Eds.), Proceedings of the 17th Annual Conference of the Cognitive Science Society (pp. 112–117). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Georgopoulos, A. P., Schwartz, A. B., & Kettner, R. E. (1986). Neuronal population coding of movement direction. Science, 233, 1416–1419.
Hayward, W. G., & Tarr, M. J. (1995). Spatial language and spatial representation. Cognition, 55, 39–84.
Herskovits, A. (1986). Language and spatial cognition: An interdisciplinary study of the prepositions of English. Cambridge: Cambridge University Press.
Jonides, J. (1981). Voluntary versus automatic control over the mind's eye. In J. Long & A. D. Baddeley (Eds.), Attention and performance (Vol. 9). Hillsdale, NJ: Erlbaum.
Landau, B., & Jackendoff, R. (1993). What and where in spatial language and spatial cognition. Behavioral and Brain Sciences, 16, 217–265.
Lin, E. L., & Murphy, G. L. (1997). Effects of background knowledge on object categorization and part detection. Journal of Experimental Psychology: Human Perception & Performance, 23, 1153–1169.
Logan, G. D. (1994). Spatial attention and the apprehension of spatial relations. Journal of Experimental Psychology: Human Perception and Performance, 20, 1015–1036.
Logan, G. D. (1995). Linguistic and conceptual control of visual spatial attention. Cognitive Psychology, 28, 103–174.
Logan, G. D., & Sadler, D. D. (1996). A computational analysis of the apprehension of spatial relations. In P. Bloom, M. A. Peterson, L. Nadel, & M. F. Garrett (Eds.), Language and space (pp. 493–529). Cambridge, MA: MIT Press.
Lyons, J. (1968). Introduction to theoretical linguistics. Cambridge: Cambridge University Press.
Miller, G. A., & Johnson-Laird, P. N. (1976). Language and perception. Cambridge, MA: Harvard University Press.
Moore, C. M., Elsinger, C. L., & Lleras, A. (2001). Visual attention and the apprehension of spatial relations: The case of depth. Perception & Psychophysics, 63, 595–606.
Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology, 32, 3–25.
Regier, T., & Carlson, L. A. (2001). Grounding spatial language in perception: An empirical and computational investigation. Journal of Experimental Psychology: General, 130, 273–298.
Rosielle, L. J., Crabb, B. T., & Cooper, E. E. (2002). Attentional coding of categorical relations in scene perception: Evidence from the flicker paradigm. Psychonomic Bulletin & Review, 9, 319–326.
Schirra, J. (1993). A contribution to reference semantics of spatial prepositions: The visualization problem and its solution in VITRA. In C. Zelinsky-Wibbelt (Ed.), The semantics of prepositions: From mental processing to natural language processing (pp. 471–515). Berlin: Mouton de Gruyter.
Talmy, L. (1983). How language structures space. In H. L. Pick & L. P. Acredolo (Eds.), Spatial orientation: Theory, research and application (pp. 225–282). New York: Plenum Press.
Wilson, H. R., & Kim, J. (1994). Perceived motion in the vector sum direction. Vision Research, 34, 1835–1842.

Appendix: Details of the AVS Model

The focus point is that point on the top of the reference object that is vertically aligned with the target, or nearest to being so aligned. Let $\sigma$ be the distance between the focus point and the target, and $d_i$ be the distance between the focus point and point $i$ in the reference object. The attention $a_i$ at point $i$ is

$$a_i = \exp\left(-\frac{d_i}{\lambda \sigma}\right),$$

where $\lambda$ is a free parameter. Let $\vec{\nu}_i$ be the vector rooted at point $i$ of the reference object and pointing at the target. The attentional vector sum is

$$\sum_i a_i \vec{\nu}_i,$$

and $\alpha$ is the angular deviation of this vector sum from a reference orientation (e.g., upright vertical for above). The measure of alignment is a linear function of this deviation:

$$g(\alpha) = m\alpha + b,$$

where $m$ and $b$ are free parameters. Finally, $g(\alpha)$ is multiplied by a height factor, which is effectively 1 for all positions higher than the highest point on the reference object (and thus for all target positions considered in this article) and which decreases for lower positions (see Regier & Carlson, 2001, for further detail). The resulting quantity is the AVS model's predicted spatial term (e.g., above, below) rating. All free parameters were set by fitting the AVS model to Logan and Sadler's (1996) above rating data.

Extension. The total amount of attention $A_i$ at point $i$ in the reference object is influenced both by the geometrically determined value $a_i$ and by object function:

$$A_i = \begin{cases} a_i (1 + \varphi) & \text{if point } i \text{ lies within the functional part of the reference object} \\ a_i & \text{otherwise,} \end{cases}$$

where $\varphi$ represents how strongly functional the functional part is. In the extended model, the vector sum is weighted by $A_i$ rather than $a_i$, giving more weight to vectors rooted in functionally important parts of the object:

$$\sum_i A_i \vec{\nu}_i.$$
