Graduate School ETD Form 9 (Revised 12/07)

PURDUE UNIVERSITY GRADUATE SCHOOL
Thesis/Dissertation Acceptance

This is to certify that the thesis/dissertation prepared

By Engin Arik

Entitled "Spatial Language: Insights from sign and spoken languages"

For the degree of Doctor of Philosophy

Is approved by the final examining committee:

Ronnie Wilbur, Chair
Diane Brentari
Myrdene Anderson
Elaine Francis
Dan I. Slobin

To the best of my knowledge and as understood by the student in the Research Integrity and Copyright Disclaimer (Graduate School Form 20), this thesis/dissertation adheres to the provisions of Purdue University’s “Policy on Integrity in Research” and the use of copyrighted material.

Approved by Major Professor(s): Ronnie Wilbur

Approved by: Ronnie Wilbur, Head of the Graduate Program

Date: 2/19/2009

Graduate School Form 20 (Revised 10/07)

PURDUE UNIVERSITY GRADUATE SCHOOL Research Integrity and Copyright Disclaimer

Title of Thesis/Dissertation: "Spatial Language: Insights from sign and spoken languages"

For the degree of Doctor of Philosophy

I certify that in the preparation of this thesis, I have observed the provisions of Purdue University Executive Memorandum No. C-22, September 6, 1991, Policy on Integrity in Research.* Further, I certify that this work is free of plagiarism and all materials appearing in this thesis/dissertation have been properly quoted and attributed. I certify that all copyrighted material incorporated into this thesis/dissertation is in compliance with the United States’ copyright law and that I have received written permission from the copyright owners for my use of their work, which is beyond the scope of the law. I agree to indemnify and save harmless Purdue University from any and all claims that may be asserted or that may arise from any copyright violation.

Signature of Candidate: Engin Arik

Date: 2/19/2009

*Located at http://www.purdue.edu/policies/pages/teach_res_outreach/c_22.html

SPATIAL LANGUAGE: INSIGHTS FROM SIGN AND SPOKEN LANGUAGES

A Dissertation Submitted to the Faculty of Purdue University by Engin Arik

In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

May 2009 Purdue University West Lafayette, Indiana


To Beril


ACKNOWLEDGMENTS

This study was supported in part by an NSF grant (BCS-0345314) awarded to Ronnie Wilbur, the Linguistics Program at Purdue University, the Lynn Fellowship, and the Bilsland Dissertation Fellowship given by the Purdue Graduate School. I thank the TID, HZJ, ASL, and ÖGS Deaf signers and the Turkish, English, and Croatian speakers for their participation in this study. Their participation and comments made this study possible. I will always be grateful to all of you. I gratefully acknowledge the help I received from the following friends. Aysel Başar helped me collect the TID data. She also transcribed it and shared her native insights with me. Marina Milković collected and commented on the HZJ data and shared her native insights with me. Robin Shay helped me collect the ASL data. Katharina Schalber ran the experiments with the ÖGS Deaf signers. She also coded some of the data. Katie Watson helped me collect the English data. She also partially transcribed and coded it and helped me understand the nature of the English data. Kate Huhn transcribed some of the English data. Iva Hrastinski collected the Croatian data and transcribed it. Robert Cloutier, Michael Covarrubias, Alyson Eggleston, Josh Iddings, Elizabeth Strong, and Katie Watson made comments and suggestions on the earlier drafts of the dissertation. Thank you for your cooperation.


I am also grateful to Beril Tezeller Arik for her constant emotional support and understanding and her inquiries and guidance from the beginning of this journey. I really owe everything to her. I thank my dissertation committee, Ronnie B. Wilbur, Diane Brentari, Elaine Francis, Myrdene Anderson, and Dan I. Slobin. This dissertation could not have been written without you. I hope you enjoy reading my dissertation as much as I enjoyed writing it.


TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
ABSTRACT
CHAPTER 1. INTRODUCTION
1.1. Motivation for the Current Study
1.2. Rationale and Foundations
1.3. Current Study
1.3.1. Research Questions
1.3.2. Assumptions
1.3.3. The Hypotheses
1.4. Outline of the Study
CHAPTER 2. THEORETICAL FRAMEWORK
2.1. The Model
2.2. The Hypothesis
2.2.1. Spatial Representation and Reference Frames
2.2.2. Temporal Representation and Reference Frames
2.2.2.1. Is time derived from space?
2.2.3. TR vs. SR
2.3. Conceptual Structure
2.3.1. The Jackendovian Conceptual Structures
2.4. Challenges to the Jackendovian Conceptual Structures
2.4.1. Dynamic Predicates as “States”
2.4.2. Nonuniversality of Place functions (IN, ON, AT, etc.)
2.4.3. Non-universality of Place functions
2.4.4. Crosslinguistic Differences on the Language of Motion Events
2.4.4.a. Manner and Path
2.4.4.b. Causative Motion Events
2.4.5. Sign Language Research on Spatial Language
2.4.6. Summary
2.5. The Crossmodal Spatial Language Hypothesis (CSLH)
2.5.1. CSLH for Spatial Static Situations
2.5.2. CSLH for Spatial Dynamic Situations
2.6. Summary
CHAPTER 3. METHODOLOGY
3.1. The Justification for the Methodology
3.2. Study Languages
3.3. Participants
3.4. Data Collection
3.5. Research Questions
3.6. Tasks
3.6.1. Tasks for Spatial Static Situations
3.6.1.1. Experiment 1 and Elicitation Task 1
3.6.1.2. Experiment 2
3.6.1.3. Elicitation Task 2
3.6.1.4. Analysis
3.6.2. Tasks for Spatial Dynamic Situations
3.6.2.1. Experiment 3
3.6.2.2. Experiment 4
3.6.2.3. Experiment 5
3.6.2.4. Elicitation Task 3
3.6.2.5. Analysis
3.6.3. Language-internal Data
3.7. Summary
CHAPTER 4. STATIC SITUATIONS
4.1. Introduction
4.2. Experiment 1: Describing Angular Relations
4.2.1. Methodology
4.2.2. Results: Language by Language
4.2.2.1. Sign Languages
4.2.2.1.1. TID
4.2.2.1.2. HZJ
4.2.2.1.3. ASL
4.2.2.1.4. ÖGS
4.2.2.1.5. Summary
4.2.2.2. Spoken Languages
4.2.2.2.1. Turkish
4.2.2.2.2. Croatian
4.2.2.2.3. English
4.2.2.2.4. Summary
4.2.3. Results: Comparisons
4.2.3.1. Crossmodal Differences
4.2.3.2. Crosslinguistic Differences
4.2.3.3. Summary
4.2.4. The Representational System
4.2.4.1. Summary
4.2.5. Discussion
4.3. Experiment 2: Describing Angular-Topological Relations
4.3.1. Methodology
4.3.2. Results: Language by Language
4.3.2.1. Sign Languages
4.3.2.1.1. TID
4.3.2.1.2. HZJ
4.3.2.1.3. ASL
4.3.2.1.4. ÖGS
4.3.2.1.5. Summary
4.3.2.2. Spoken Languages
4.3.2.2.1. Turkish
4.3.2.2.2. Croatian
4.3.2.2.3. English
4.3.2.2.4. Summary
4.3.3. Results: Comparisons
4.3.3.1. Crossmodal Differences
4.3.3.2. Crosslinguistic Differences
4.3.3.3. Summary
4.3.4. The Representational System
4.3.4.1. Summary
4.3.5. Discussion
4.4. General Discussion and Conclusion
4.4.1. Spatial features can be omitted in making topological relations
4.4.2. The two reference frames are available in making topological relations
4.4.3. Static and dynamic predicates are used in making topological relations
CHAPTER 5. DYNAMIC SITUATIONS
5.1. Introduction
5.2. Experiment 3: Describing Motion Events with To and Toward
5.2.1. Methodology
5.2.2. Results: Language by Language
5.2.2.1. Sign Languages
5.2.2.1.1. TID
5.2.2.1.2. HZJ
5.2.2.1.3. ASL
5.2.2.1.4. ÖGS
5.2.2.1.5. Summary
5.2.2.2. Spoken Languages
5.2.2.2.1. Turkish
5.2.2.2.2. English
5.2.2.2.3. Summary
5.2.3. Results: Comparisons
5.2.3.1. Crossmodal Differences
5.2.3.2. Crosslinguistic Differences
5.2.3.3. Summary
5.2.4. The Representational System
5.2.5. Summary and Discussion
5.3. Experiment 4: Describing Motion Events with Pass by vs. Away
5.3.1. Methodology
5.3.2. Results: Language by Language
5.3.2.1. Sign Languages
5.3.2.1.1. TID
5.3.2.1.2. HZJ
5.3.2.1.3. ASL
5.3.2.1.4. ÖGS
5.3.2.1.5. Summary
5.3.2.2. Spoken Languages
5.3.2.2.1. Turkish
5.3.2.2.2. English
5.3.2.2.3. Summary
5.3.3. Results: Comparisons
5.3.3.1. Crossmodal Differences
5.3.3.2. Crosslinguistic Differences
5.3.3.3. Summary
5.3.4. The Representational System
5.3.5. Summary and Discussion
5.4. Experiment 5: Describing Causal Motion Events
5.4.1. Methodology
5.4.2. Results: Language by Language
5.4.2.1. Sign Languages
5.4.2.1.1. TID
5.4.2.1.2. HZJ
5.4.2.1.3. ASL
5.4.2.1.4. ÖGS
5.4.2.1.5. Summary
5.4.3.2. Spoken Languages
5.4.3.2.1. Turkish
5.4.3.2.2. English
5.4.3.2.3. Summary
5.4.3.3. Results: Comparisons
5.4.3.3.1. Crossmodal Differences
5.4.3.3.2. Crosslinguistic Differences
5.4.3.3.3. Summary
5.4.3.4. The Representational System
5.4.3.5. Summary
5.5. General Discussion and Conclusion
CHAPTER 6. SPATIAL LANGUAGE AND TEMPORAL LANGUAGE
6.1. Previous Studies and CSLH
6.2. Spatial Representation and Spatial Language
6.2.1. Spatial Language for The Front-Back Situations
6.2.2. The Front-Back Spatial Language for the Situations
6.3. Temporal Language and Temporal Representation
6.3.1. Temporal Language and Temporal Representations in TID and Turkish
6.3.2. Summary
6.4. Conclusion
CHAPTER 7. CONCLUSION AND FUTURE DIRECTIONS
7.1. Implications
7.2. Future Directions
LIST OF REFERENCES
APPENDICES
Appendix A
Appendix B
Appendix C
Appendix D
VITA


LIST OF TABLES

Table 4.1. The means and standard deviations of the TID angular data
Table 4.2. The means and standard deviations of the HZJ angular data
Table 4.3. The means and standard deviations of the ASL angular data
Table 4.4. The means and standard deviations of the ÖGS angular data
Table 4.5. The means and standard deviations of the Turkish angular data
Table 4.6. The means and standard deviations of the Croatian angular data
Table 4.7. The means and standard deviations of the English angular data
Table 4.8. The means and standard deviations of the angular data with respect to modality
Table 4.9. The means and standard deviations of the angular data with respect to language
Table 4.10. The significant differences between the languages according to the spatial angular data
Table 4.11. The means and standard deviations of the TID angular-topological data
Table 4.12. The means and standard deviations of the HZJ angular-topological data
Table 4.13. The means and standard deviations of the ASL angular-topological data
Table 4.14. The means and standard deviations of the ÖGS angular-topological data
Table 4.15. The means and standard deviations of the Turkish angular-topological data
Table 4.16. The means and standard deviations of the Croatian angular-topological data
Table 4.17. The means and standard deviations of the English angular-topological data
Table 4.18. The means and standard deviations of the angular-topological data with respect to modality
Table 4.19. The means and standard deviations of the angular-topological data with respect to language
Table 4.20. The significant differences between the languages according to the spatial angular-topological data
Table 5.1. The means and standard deviations of the TID motional to vs. toward data
Table 5.2. The means and standard deviations of the HZJ motional to vs. toward data
Table 5.3. The means and standard deviations of the ASL motional to vs. toward data
Table 5.4. The means and standard deviations of the ÖGS motional to vs. toward data
Table 5.5. The means and standard deviations of the Turkish motional to vs. toward data
Table 5.6. The means and standard deviations of the English motional to vs. toward data
Table 5.7. The means and standard deviations of the motional to vs. toward data with respect to modality
Table 5.8. The means and standard deviations of the angular data with respect to language
Table 5.9. The significant differences between the languages according to the spatial to vs. toward data
Table 5.10. The means and standard deviations of the TID motional passby/away data
Table 5.11. The means and standard deviations of the HZJ motional passby/away data
Table 5.12. The means and standard deviations of the ASL motional passby/away data
Table 5.13. The means and standard deviations of the ÖGS motional passby/away data
Table 5.14. The means and standard deviations of the Turkish motional passby/away data
Table 5.15. The means and standard deviations of the English motional passby/away data
Table 5.16. The means and standard deviations of the crossmodal motional passby/away data
Table 5.17. The means and standard deviations of the crosslinguistic motional passby/away data
Table 5.18. The significant differences between the languages according to motional passby/away data
Table 5.19. The means and standard deviations of the TID causative motion data
Table 5.20. The means and standard deviations of the HZJ causative motion data
Table 5.21. The means and standard deviations of the ASL causative motion data
Table 5.22. The means and standard deviations of the ÖGS causative motion data
Table 5.23. The means and standard deviations of the Turkish causative motion data
Table 5.24. The means and standard deviations of the English causative motion data
Table 5.25. The means and standard deviations of the crossmodal causative motion data
Table 5.26. The means and standard deviations of the crosslinguistic causative motion data
Table 5.27. The significant differences between the languages according to the spatial angular data

Appendix Table
Table A.1. Illustrations of the classifier handshapes found in the sign language data
Table A.2. An illustration of various uses of the b handshape
Table B.1. The TID classifier handshapes used in Experiment 1
Table B.2. The HZJ classifier handshapes used in Experiment 1
Table B.3. The ASL classifier handshapes used in Experiment 1
Table B.4. The ÖGS classifier handshapes used in Experiment 1
Table B.5. A comparison of the classifier handshapes across the sign languages (Experiment 1)
Table B.6. The Turkish linguistic forms used in Experiment 1
Table B.7. The English linguistic forms used in Experiment 1
Table C.1. The TID classifier handshapes found in Experiment 2
Table C.2. The HZJ classifier handshapes found in Experiment 2
Table C.3. The ASL classifier handshapes found in Experiment 2
Table C.4. The ÖGS classifier handshapes found in Experiment 2
Table C.5. A comparison of the classifier handshapes across the sign languages (Experiment 2)
Table C.6. The Turkish linguistic forms used in Experiment 2
Table C.7. The English linguistic forms used in Experiment 2
Table D.1. The TID linguistic forms used in Elicitation Task 1
Table D.2. The HZJ linguistic forms used in Elicitation Task 1
Table D.3. The ASL linguistic forms used in Elicitation Task 1
Table D.4. The ÖGS linguistic forms used in Elicitation Task 1
Table D.5. A comparison of the classifier handshapes across the sign languages (Elicitation Task 1)
Table D.6. The Turkish linguistic forms used in Elicitation Task 1
Table D.7. The English linguistic forms used in Elicitation Task 1


LIST OF FIGURES

Figure 1.1. The schematic representation of the Crossmodal Spatial Language Hypothesis
Figure 2.1. A Fodorian system
Figure 2.2. The online modular system
Figure 2.3. The language of space model assumed in this study
Figure 2.4. The Crossmodal Spatial Language Hypothesis (CSLH)
Figure 2.5. Spatial Representation
Figure 2.6. SR, TR, and RF model
Figure 2.7. One-dimensional timeline
Figure 2.8. CS-SR & TR interactions
Figure 3.1. The arrangement in the recording room
Figure 3.2. Object locations and axes from a bird's-eye perspective
Figure 3.3. The testing items for Experiment 1
Figure 3.4. The testing items in Experiment 2
Figure 3.5. The testing items for the topological relations
Figure 3.6. A testing item from Experiment 3
Figure 3.7. The stimulus #26 as a testing item for Experiment 4
Figure 3.8. A testing item from Experiment 5
Figure 4.1. The language of space model assumed in this study
Figure 4.2. The testing items for Experiment 1
Figure 4.3. The TID scores for Experiment 1
Figure 4.4. The HZJ scores for Experiment 1
Figure 4.5. The ASL scores for Experiment 1
Figure 4.6. The ÖGS scores for Experiment 1
Figure 4.7. The Turkish scores for Experiment 1
Figure 4.8. The Croatian scores for Experiment 1
Figure 4.9. The English scores for Experiment 1
Figure 4.10. A comparison of scores for Experiment 1
Figure 4.11. The means in the static angular data
Figure 4.12. The stimuli #1 from Experiment 1
Figure 4.13. The stimuli #2 from Experiment 1
Figure 4.14. The stimuli #5 from Experiment 1
Figure 4.15. The testing items in Experiment 2
Figure 4.16. The TID scores for Experiment 2
Figure 4.17. The HZJ scores for Experiment 2
Figure 4.18. The ASL scores for Experiment 2
Figure 4.19. The ÖGS scores for Experiment 2
Figure 4.20. The Turkish scores for Experiment 2
Figure 4.21. The Croatian scores for Experiment 2
Figure 4.22. The English scores for Experiment 2
Figure 4.23. The scores for the axial information across languages
Figure 4.24. The means in the static angular topological data
Figure 4.25. The stimulus #4 for Experiment 2
Figure 4.26. The stimuli #5 from Experiment 1
Figure 4.27. The stimuli #11, a control item from Experiment 1
Figure 4.28. Still frames for (a) TID participant description of Figure 4.27 and (b) addressee retelling
Figure 4.29. Still frames for (a) HZJ participant description of Figure 4.27 and (b) addressee retelling
Figure 4.30. The stimulus #14 for Elicitation Task 2
Figure 4.31. The stimulus #11 for Elicitation Task 2
Figure 4.32. The stimulus #1 for Elicitation Task 2
Figure 4.33. The stimulus #19 for Elicitation Task 2
Figure 5.1. A testing item from Experiment 3
Figure 5.2. The TID scores for Experiment 3
Figure 5.3. The HZJ scores for Experiment 3
Figure 5.4. The ASL scores for Experiment 3
Figure 5.5. The ÖGS scores for Experiment 3
Figure 5.6. The Turkish scores for Experiment 3
Figure 5.7. The English scores for Experiment 3
Figure 5.8. The % scores for each measure across languages in Experiment 3
Figure 5.9. The means in the dynamic motional to vs. toward data
Figure 5.10. The stimulus #16 for Experiment 3
Figure 5.11. The stimulus #5 for Experiment 3
Figure 5.12. The stimulus #26 as a testing item for Experiment 4
Figure 5.13. The TID scores for Experiment 4
Figure 5.14. The HZJ scores for Experiment 4
Figure 5.15. The ASL scores for Experiment 4
Figure 5.16. The ÖGS scores for Experiment 4
Figure 5.17. The Turkish scores for Experiment 4
Figure 5.18. The English scores for Experiment 4
Figure 5.19. The language scores for each measure in Experiment 4
Figure 5.20. The language scores with respect to the testing items in Experiment 4
Figure 5.21. The stimulus #26 for Experiment 4
Figure 5.22. The stimulus #3
Figure 5.23. The stimulus #8
Figure 5.24. The stimulus #24
Figure 5.25. A testing item from Experiment 5
Figure 5.26. The TID scores for Experiment 5
Figure 5.27. The HZJ scores for Experiment 5
Figure 5.28. The ASL scores for Experiment 5
Figure 5.29. The ÖGS scores for Experiment 5
Figure 5.30. The Turkish scores for Experiment 5
Figure 5.31. The English scores for Experiment 5
Figure 5.32. The language scores with respect to the spatial features measured in Experiment 5
Figure 5.33. The language scores with respect to the testing items in Experiment 5
Figure 5.34. The stimulus #17 for Experiment 5
Figure 5.35. The stimulus #2 for Experiment 5
Figure 5.36. The stimulus #3 for Elicitation Task 3
Figure 5.37. Still frames for (a) TID participant description of Figure 5.36 and (b) addressee retelling
Figure 5.38. Still frames for (a) HZJ participant description of Figure 5.36 and (b) addressee retelling
Figure 5.39. Still frames for (a) ASL participant description of Figure 5.36 and (b) addressee retelling
Figure 5.40. Still frames for (a) ÖGS participant description of Figure 5.36 and (b) addressee retelling
Figure 6.1. One-dimensional timeline
Figure 6.2. SR, TR, and LR model
Figure 6.3. The TID scores for axes and locational information in Exp. 1-5
Figure 6.4. The Turkish scores for axes and locational information in Exp. 1-5
Figure 6.5. The stimuli #2 from Experiment 1
Figure 6.6. The stimuli #14 from Experiment 3
Figure 6.7. The stimuli #5 from Experiment 1
Figure 6.8. The stimulus #16 for Experiment 3


LIST OF ABBREVIATIONS

* (1) : Significance level (p < .05)
* (2) : Ungrammatical (e.g., p. )
* (3) : Not possible (e.g., p. 36)
#SIGN : # stands for fingerspelling
______________ : Hold for sign language transcription
1sg/pl : first person singular/plural
3-D : three-dimensional
ABL : Ablative
ACC : Accusative
ADJ : Adjective(al)
ADV : Adverb
ASL : American Sign Language
CAPITAL LETTERS (1) : Glosses for lexical items in sign language transcriptions
CAPITAL LETTERS (2) : Functions and fillers in CS
cl : Classifier for spoken language transcriptions
CL : “Classifier” for sign language transcriptions
COM : Comitative
COMP : Complementizer
CS : Conceptual Structures
CSLH : The Crossmodal Spatial Language Hypothesis
DAT : Dative
EVID : Evidential
FEM : Feminine
FUT : Future tense marker
GEN : Genitive
HZJ : Croatian Sign Language
IMPERF : Imperfective marker
INSTR : Instrumental
LH : Left hand
LOC : Locative
LR : Linguistic Representations
M : Mean
MASC : Masculine
NOM : Nominative
Obj : Object
ÖGS : Austrian Sign Language
P1, P2, etc. : Participant 1, 2, etc.
PAST : Past tense marker
PL : Plural
POSS : Possessive
RF : Reference Frames
RH : Right hand
SD : Standard Deviation
SE : Standard Error
SR : Spatial Representations
TID : Turkish Sign Language
TR : Temporal Representations


ABSTRACT

Arik, Engin. Ph.D., Purdue University, May, 2009. Spatial Language: Insights from Sign and Spoken Languages. Major Professor: Ronnie Wilbur.

This dissertation examined how sign and spoken languages represent space in their linguistic systems by proposing the Crossmodal Spatial Language Hypothesis (CSLH), which claims that the features of spatial input are not necessarily mapped onto spatial descriptions, regardless of modality and language. Moreover, CSLH holds that the way languages convey spatial relations is bound to the representational system: Spatial Representations (SR), Reference Frames (RF), Temporal Representations (TR), Conceptual Structure (CS), and Linguistic Representations (LR). To test the hypothesis, a systematic study of spatial language (sign, speech, and co-speech gesture) was conducted on data obtained from experiments and elicitation tasks in four sign languages (TID, HZJ, ASL, and ÖGS) and three spoken languages (Turkish, English, and Croatian). The findings uncovered a large amount of variation in the signed and spoken descriptions of static and dynamic situations. Additionally, despite some shared characteristics of the two domains, the analyses indicated that space and time are encoded separately, in SR and TR respectively. The results provided supporting evidence for CSLH. The findings suggested that language users construct a spatial relation between the objects at a given time, employ a reference frame, which may not be encoded in the message, and use the same conceptual structure comprised of BE-AT for static spatial situations and GO-BE-AT for dynamic spatial situations. Experimental results also showed that language users do not have to distinguish left/right from front/back, in/on from at, to from toward, cause from go, and cause to move from cause to move together in their descriptions. Interestingly, the descriptions involved go-type predicates (go, walk) for both static and dynamic situations. Further analyses revealed not only a modality effect (signers > speakers) but also a language effect. Careful consideration of the data revealed similarities and differences within and across modalities. Future studies can shed more light on these variations and patterns.


CHAPTER 1. INTRODUCTION

The way languages encode the location and motion of entities is one of the most exciting questions in the study of the human mind, and there has been growing interest in this topic in the recent literature. In this dissertation, I investigated the encoding of locational and motional relations among entities from a crossmodal and a crosslinguistic perspective. I also provided a framework to account for observations on the way sign languages (Turkish Sign Language (TID), Croatian Sign Language (HZJ), American Sign Language (ASL), and Austrian Sign Language (ÖGS)) and spoken languages (Turkish, Croatian, and English) encode space.

1.1. Motivation for the Current Study

There are at least three reasons why the study of location and motion is well motivated. First, space is one of the basic tenets of human language and cognition (e.g., Miller & Johnson-Laird, 1976). Levinson (2003) divides spatial language into two parts: static (locational) and dynamic (motion). Under static there are two subparts: angular and topological. Angular relations are made by using relationals such as left-right and front-back, whereas topological relations are made by using relationals such as in, on, and at. Dynamic spatial language involves expressions about motion events in which at least one entity is in motion, literally and/or figuratively. In recent years, studies on spatial language have revealed both crosslinguistic similarities and differences across the surface structure of spatial expressions. All languages studied so far have special morphosyntactic ways to express the spatial relations of objects. Yet the morphosyntactic structures differ from each other to a great extent: English uses adpositions (e.g., in, on, at, to, toward, from), Turkish uses case markers (e.g., dative -A, locative -DE, ablative -DAn), and Mayan languages use positionals (see Grinevald, 2006 for an overview). Although languages differ from each other in their morphological encodings of spatial relations, natural language users use similar strategies across languages, such as Figure-Ground assignment, perspective taking, and reference frames.

Second, space provides templates not only for locational expressions (e.g., ‘a book is on the table’) but also for existential (e.g., ‘there is a book on the table’) and possessive (e.g., ‘I have a book (on the table)’) expressions (e.g., Lyons, 1977; Freeze, 1992; Heine, 1997). While crosslinguistic studies have shown this to be the case, the explanations have differed. For example, in his crosslinguistic study, Freeze (1992) argues that a prepositional predicate headed by a locative preposition is the underlying structure for locatives, existentials, and possessives. Nonetheless, possessives are always definite whereas existentials are not. For him, possessives also have a [+human] argument whereas locatives and existentials do not. Heine (1997) also surveyed languages and proposed a set of ‘event schemas’ to underlie possession. These schemas, such as action, location, companion, genitive, goal, source, topic, and equation, all have original meanings dealing with location and existence in human conceptualization.

Third, space is, arguably, extended over time, which is likewise one of the basic tenets of human language and cognition (Miller & Johnson-Laird, 1976). Since entities in spatial relations are concrete and events in temporal relations are abstract, it is often stated that time is derived from space (see Clark, 1973; Lyons, 1977). It is also assumed that situations are located along a timeline which is supposed to be a straight line, hence one-dimensional, so that, according to Comrie (1985), tense is a “grammaticalized expression of location in time”. It is, in fact, the case that spatial lexemes, such as adpositions and/or adverbials, are also used in temporal expressions. In his crosslinguistic study, Haspelmath (1997) has shown that typologically diverse languages use (some) spatial lexemes in their temporal expressions. These observations led many scholars, such as Lakoff & Johnson (1980), to claim that time is derived from space.

Most of the studies above focused on a single language, such as English, or a single modality, such as the auditory-vocal one. In recent years, these claims have been challenged by studies on the visual-gestural modality. In American Sign Language (ASL), for example, it is argued (Emmorey & Herzig, 2003; Emmorey, 2002) that linguistic forms that convey spatial relations have gradient rather than categorical properties, as opposed to those in spoken language systems. It is also argued that spatial representations in sign languages are largely iconic and have more structural elements, more categories, and more elements per category when compared to those in spoken languages (Talmy, 2003, 2006). In line with these argumentations, Liddell (2000, 2003) claims that gestural and linguistic information is conflated in sign languages such as ASL.

Studies on sign languages have already provided invaluable cases to test localist theories and theories of event structure (i.e., compositional representations of linguistic encodings of events). Following Gruber (1976) and Jackendoff (1983), Shepard-Kegl (1985) argued that ASL grammar can be decomposed into locational and motional relations. Even though spoken languages might not mark the characteristics of events overtly, Wilbur and her colleagues (Wilbur, 2003, 2008; Grose et al., 2007) argued that event decompositions such as states, dynamic events, and their transitions are visible in sign languages. They have shown that the linguistic distinctions among states (static vs. dynamic events) and their transitions to another state/event are phonologically and morphologically marked in ASL and ÖGS.

Several approaches, with various theoretical orientations, have been developed in recent research to understand spatial language. Although they provided invaluable insights into the common properties of space conveyed in language, their focus was on a single language such as English or a single modality (i.e., Freeze, 1992; Heine, 1997). To date no one has attempted a crosslinguistic and crossmodal study to understand similarities and differences in spatial expressions. I believe that a crosslinguistic and crossmodal (both auditory-vocal and visual-gestural) study can shed more light on human spatial language, since the visual-gestural modality, unlike the auditory-vocal modality, represents space by using the body and the space in front of the language users. In this dissertation, then, I investigate the locational and motional properties of spatial language as well as its relationship with temporal language by focusing on signed (Turkish Sign Language, Croatian Sign Language, American Sign Language, and Austrian Sign Language) and spoken languages (Turkish, Croatian, and English).

1.2. Rationale and Foundations

My general strategy in writing this dissertation is as follows. I applied experimental methods to investigate how sign and spoken languages carry basic spatial relations. I also used elicitations and language-internal evidence to further detail the linguistic encodings of space and time.

I took a mixed approach as a framework for several reasons. I took an empirical approach because empirical studies have shown that spatial language may not be reduced to a single universal structure. For example, typological and experimental studies have shown that the way a natural language user describes the spatial relations of objects is affected by perception (i.e., perspective-taking strategies), his/her language (i.e., the way languages lexicalize/grammaticalize linguistic forms), and context information outside the specific arrangement of objects (i.e., their use and function) (Levelt, 1996). At the same time, languages differ from each other with respect to the reference frames (i.e., the coordinate systems) available in the linguistic system (Levinson & Wilkins, 2006; Levinson, 2003; Pederson et al., 1998). For example, Mopan speakers use the intrinsic reference frame and Hai//om speakers use the absolute reference frame, while English speakers use the intrinsic, relative, and absolute reference frames. There are also intralinguistic differences. For example, it has been shown that there are extra-geometric effects of functional relations, location or functional control relations, object association, animacy, and context in talking about spatial relations. Moreover, Coventry and Garrod (2005) argue that in and on in English can be used with respect to other objects in the context and to non-canonical positionings of the objects.

I also took a rationalist approach because rationalist studies provide some invaluable working hypotheses. It has been argued that spatial language is the most basic construct not only in linguistic structure but also in cognition. For example, according to the localist hypothesis (e.g., Gruber, 1976), all of the semantic and syntactic roles can be decomposed by using locational and motional features. Jackendoff (1983, 1990, 1996) argues further that English prepositions such as in, on, at, to, and toward are primitives in the conceptual structure, which interfaces with perceptual representations on the one hand and syntax and phonology on the other. For ‘metaphor theory’ (e.g., Lakoff & Johnson, 1980), space is a candidate for the most basic source of all metaphorical extensions. Nonetheless, these works are generally based on introspection and language-internal evidence from English.
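Because this localist decomposition recurs throughout the dissertation, a compact illustration may be useful here. The Python sketch below is purely expository and does not use Jackendoff's own notation; the class names and string tags are assumptions. It encodes Place functions (IN, ON, AT) over things, Path functions (TO, TOWARD, FROM) over places, and BE/GO event functions over figures.

```python
# An illustrative (non-authoritative) encoding of localist conceptual
# structure: places from IN/ON/AT, paths from TO/TOWARD/FROM, events from BE/GO.
from dataclasses import dataclass

@dataclass
class Thing:
    name: str

@dataclass
class Place:                 # e.g. AT(table), IN(box)
    function: str            # "IN" | "ON" | "AT"
    ground: Thing

@dataclass
class Path:                  # e.g. TO(AT(table))
    function: str            # "TO" | "TOWARD" | "FROM"
    place: Place

@dataclass
class BE:                    # static: 'a book is on the table'
    figure: Thing
    place: Place

@dataclass
class GO:                    # dynamic: 'the ball went to the table'
    figure: Thing
    path: Path

# 'a book is on the table'      ->  BE(book, ON(table))
static = BE(Thing("book"), Place("ON", Thing("table")))
# 'the ball went to the table'  ->  GO(ball, TO(AT(table)))
dynamic = GO(Thing("ball"), Path("TO", Place("AT", Thing("table"))))
```

In these terms, the static/dynamic contrast that CSLH later compresses into BE-AT vs. GO-BE-AT is simply the choice between a BE over a Place and a GO over a Path.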

In order to benefit from both approaches, I applied a mixed methodology. On the one hand, empirical studies often rely on distributional analysis in which the data are gathered by using elicitation techniques. These techniques allow the researcher to control the stimuli, narrow the research questions, and investigate the issues step by step. On the other hand, rationalist studies often rely on language-internal evidence. Grammaticality and acceptability judgments as techniques can provide a distributional analysis of a phenomenon. It is obvious that categorical information and insights are missed in empirical studies whereas variations are missed in rationalist studies. I believe that using the techniques from both approaches can provide a successful union to understand the phenomena further.

Another reason influencing my theoretical framework is the interpretation of the data. For Levinson and colleagues, the results from the experiments have clearly shown that languages differ from each other in the spatial domain in choosing one strategy over another. This goes beyond the lexicalization/grammaticalization process suggested by Talmy (1983, 2000) or Slobin (2004, 2006). However, Levinson and colleagues have also interpreted the results from “nonlinguistic” experiments and argued that if a language uses a specific spatial language strategy, then the mind is shaped accordingly. Li and Gleitman (2002), nevertheless, argue that these patterns may differ depending on “the availability and suitability of local landmark cues.” For them, there is also a contextual effect on spatial language and the choice of reference frames (cf. Levinson et al., 2002). It has also been argued that when context and object positionings are manipulated, the use of relationals may be altered (e.g., Coventry & Garrod, 2005). Hence, typological and descriptive studies may offer some explanations of the apparently diverse lexicalization and grammaticalization patterns but cannot present a successful and unified approach along with predictions.

This brings me to the interpretation of the data in rationalist studies. Although they present insights on spatial language, theoretical studies often neglect the variations in and across languages. For Gruber and Jackendoff, for instance, there is no need to discuss the contextual effects on spatial language or the non-prototypical use of relationals in a given language. In my dissertation, therefore, I take these issues into account and present a unified analysis that can provide explanations for usage- and task-specific variations as well.

The other reason is to assess the claims from a broader framework. Empirical studies may or may not go beyond the linguistic domain (of space). For example, for Levinson and his colleagues, the domain is both spatial language and spatial cognition, but there is little evidence on, and unconvincing and controversial explanations for, the effect of language on cognition. For Gruber, the domain is very broad: the language itself. I believe that in order to understand ‘spatial language’ there is a need to restrict the study area. Hence, in my dissertation, I investigated how language encodes spatial information and uses tools such as lexicalization and grammaticalization patterns, perspective taking, context effects, and reference frames, already presented in the literature.

1.3. Current Study

These approaches led me in a quite different direction compared to a large number of studies on spatial language. This study is fundamentally distinct from the previous ones in several respects. First, the analyses were made on data that came from both the findings from the elicitations and the language users’ judgments. Second, the data were gathered from four visual-gestural languages -- Turkish Sign Language, Croatian Sign Language, American Sign Language, and Austrian Sign Language -- which use space in their primary articulatory systems, and three auditory-vocal languages -- Turkish, Croatian, and English. In this way, I provide a typology of spatial language from both visual and auditory languages. Third, this study presents a unified analysis for spatial language by reevaluating the claims from empirical and rationalist works. Thus, the present dissertation is unique in its characteristics and is relevant not only to linguists and psycholinguists but also to cognitive scientists.

1.3.1. Research Questions

I have focused primarily on the following research questions.
(1) How do both sign and spoken languages represent spatial static situations?
(2) How do both sign and spoken languages represent spatial dynamic situations?
(3) Do languages differ from each other in their spatial domain?
(4) Does modality (i.e., visual vs. auditory) have an effect on the way language users represent spatial relations?
(5) How is spatial language, representing locational and motion events, related to other domains of language such as temporals?

1.3.2. Assumptions

To address the above research questions, I made several assumptions. First, even though there could be perceptual effects, I assumed that when exposed to a simple spatial situation in which two objects were located, either stationary or in motion, humans, regardless of language and modality, perceived the situation in the same manner. Second, even though languages have various morphosyntactic structures in their spatial domain, I assumed that when users of different languages were asked to describe the same spatial situation, their descriptions ought to be functionally equivalent to each other and, thus, comparable. Third, I also assumed that when a native language user described a spatial scene to another native user of the same language, who understood the description, and when a third native user of the same language found the description acceptable, the linguistic forms used in the description were part of the grammar of the language.

1.3.3. The Hypotheses

Hypothesis 1: Languages do not directly and obligatorily encode all of the spatial features of a spatial relation, due to the multiple representations in the mind that interface with the spatial input and its linguistic description.

Hypothesis 2: The properties of these representations are the same across languages and modalities.

These two hypotheses are put together in a model called the Crossmodal Spatial Language Hypothesis (CSLH), given in Figure 1.1. According to the CSLH, the representations having the same parameters across languages are the following:
(1) Spatial Representation (SR) encodes 3-D axial information (x, y, z) and reference frames (egocentric, allocentric).
(2) Temporal Representation (TR) encodes t for static situations and t1, t2, …, tn for dynamic situations, with respect to a timeline and reference frames (egocentric, allocentric).
(3) Conceptual Structure (CS) has a limited set of structures: BE-AT for spatial static situations and GO-BE-AT for motional dynamic situations.
(4) Linguistic Representation (LR) has linguistic structures such as syntax, modifications, and information structure.

[Figure: the CSLH schematic, in which SR (Axes: x, y, z; RF: egocentric, allocentric), TR (t, {t}; axis), and CS ({GO}-BE-AT) interface with LR (syntax, modification, information structure).]

Figure 1.1. The schematic representation of the Crossmodal Spatial Language Hypothesis.
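To make the four components concrete, they can be rendered as minimal data structures, as in the Python sketch below. This is an expository illustration only, not part of the dissertation's formalism: the class and field names are assumptions.

```python
# Illustrative only: CSLH's shared representational parameters as data
# structures. All class and field names are assumptions made for exposition.
from dataclasses import dataclass, field
from typing import List, Literal, Optional, Tuple

ReferenceFrame = Literal["egocentric", "allocentric"]

@dataclass
class SR:                              # Spatial Representation
    axes: Tuple[float, float, float]   # 3-D axial information (x, y, z)
    frame: ReferenceFrame

@dataclass
class TR:                              # Temporal Representation
    times: List[float]                 # [t] for static, [t1, ..., tn] for dynamic
    frame: ReferenceFrame

@dataclass
class CS:                              # Conceptual Structure
    function: Literal["BE-AT", "GO-BE-AT"]
    figure: str                        # the located/moving entity
    ground: str                        # the reference entity

@dataclass
class LR:                              # Linguistic Representation
    syntax: str
    modification: List[str] = field(default_factory=list)
    information_structure: Optional[str] = None

def describe(sr: SR, tr: TR, cs: CS) -> LR:
    """Produce a placeholder description from the interfacing representations."""
    verb = "is at" if cs.function == "BE-AT" else "goes to"
    return LR(syntax=f"{cs.figure} {verb} {cs.ground}")

static = describe(SR((0.3, 1.2, 0.0), "egocentric"), TR([0.0], "egocentric"),
                  CS("BE-AT", "book", "table"))
print(static.syntax)  # book is at table
```

The deliberately thin describe() reflects the hypothesis itself: under CSLH the mapping from SR/TR/CS to LR is not one-to-one, so a description may omit axial or frame information that is present upstream.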

1.4. Outline of the Study

Analyses of the data to be presented provide supporting evidence for the above hypotheses. Thus, this study will suggest that the users of the sign languages (TID, HZJ, ASL, and ÖGS) and the spoken languages (Turkish, Croatian, and English) do not obligatorily encode all of the spatial features of given basic spatial static and dynamic situations. The findings will suggest that angular relations are not made in the same way across sign and spoken languages. The distinctions among the relations may not be marked linguistically. Even though their descriptions differed from one another, the sign and spoken languages used the same representational system.

The outline of the dissertation is as follows. Chapter 2 reviews the previous accounts of spatial language from various theoretical perspectives. It also presents the hypothesis (CSLH) along with discussions of the motivations for each representational system. Chapter 3 details the methodology used in investigating spatial language, the research designs, and the elicitation techniques (experiments and grammaticality judgments). The data from the experiments on the descriptions of spatial static situations are analyzed in Chapter 4. The results from two experiments (1 & 2) are reported in this chapter: the analysis of angular relations, such as left-right and front-back, is followed by the analysis of angular-topological relations such as next to, beside, and near. Chapter 5 is devoted to the analyses of the descriptions of spatial dynamic situations. The results from three experiments (3, 4, & 5) are reported in this chapter: the analysis of dynamic relations such as to and toward, the analysis of more complex dynamic relations such as pass by and away, and the analysis of causative dynamic relations are presented, respectively. Chapter 6 provides an analysis of the relationship between the spatial and temporal domains of language within the current framework. Even though the analyses in the preceding chapters cover seven languages (TID, HZJ, ASL, ÖGS, Turkish, Croatian, and English), Chapter 6 investigates only the linguistic structures of TID and Turkish.


CHAPTER 2. THEORETICAL FRAMEWORK

2.1. The Model

In this dissertation, a multimodular approach was taken to understand spatial language. The approach combined insights from Fodor, Jackendoff, and Slobin. Following Fodor (1983, 2000), I assumed that the architecture of the human mind is modular in that “modular cognitive systems are domain specific, innately specified, hardwired, autonomous, and not assembled” (1983, p. 37). By modules, Fodor meant “any mechanism whose states covary with environmental ones can be thought of as registering information about the world; and, (…), the output of such systems can reasonably be thought as representations of the environmental states with which they covary” (1983, p. 39). Figure 2.1 sketches a simplified Fodorian system.

transducers (e.g., sensory info) → input systems (modules) → central processors (belief system, thought, etc.)

Figure 2.1. A Fodorian system.

According to this system, transducers feed sensory information into the input systems, which are called modules. This process is vertical and hierarchical in the sense that the flow of information is straightforward. Fodor also acknowledged a horizontal component of his system: central processors. Central processors get information from the input systems and work on that information on the basis of an organism’s values, beliefs, and thoughts. In a sense, central processors can override the input and are, thus, nonmodular. They also provide the output of the Fodorian system.

In this study, therefore, I assumed language to be a separate module, which has an autonomous structure. Thus, I expected that there is no one-to-one mapping among the sensory information (input), the process (representation), and the output (message). Nonetheless, I also took Slobin’s “thinking for speaking” hypothesis (1987, 2003) into account. This hypothesis states that a message is not a direct reflection of perceived reality. It also claims that some (but not all) conceptualization of an event is readily encodable in a particular language. Therefore, I assumed that input and message are by-products of a single online process. Figure 2.2 illustrates this:

Input → Mental Representations → Message

Figure 2.2. The online modular system.

I also assumed a parallel architecture model (Jackendoff, 1997, 2002). Jackendoff (1996, p. 12) proposed that the conceptualization of space consists of several mental representational systems. He argued that these mental representations are Spatial Representation (SR), Conceptual Structure (CS), and Linguistic Representation (LR). SR, CS, and LR interface with each other. Thus, in this study, I assumed that the language of space is a by-product of an online interface model among Input, SR, CS, LR, and Message.

Input → SR → CS → LR → Message

Figure 2.3. The language of space model assumed in this study.
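As a toy rendering of this chain (illustrative only; none of the function names below come from the dissertation), the model can be run as function composition. What it is meant to show is the absence of a one-to-one input-to-message mapping: each interface may discard detail, so distinct inputs can yield the same message.

```python
# Illustrative Input -> SR -> CS -> LR -> Message chain. Each stage may drop
# information, so the input-to-message mapping is many-to-one. All names here
# are assumptions made for this sketch.
def to_sr(scene: dict) -> dict:
    # SR: keep the objects, 3-D axial information, and a reference frame;
    # drop any other perceptual detail present in the input.
    return {"figure": scene["figure"], "ground": scene["ground"],
            "figure_xyz": scene["figure_xyz"], "frame": "egocentric"}

def to_cs(sr: dict) -> dict:
    # CS: collapse metric coordinates into a categorical BE-AT relation.
    return {"function": "BE-AT", "figure": sr["figure"], "ground": sr["ground"]}

def to_lr(cs: dict) -> str:
    # LR: realize the conceptual relation with some syntax.
    return f"the {cs['figure']} is at the {cs['ground']}"

def message(scene: dict) -> str:
    return to_lr(to_cs(to_sr(scene)))

# Two different inputs, one and the same message: the coordinates differ,
# but the description does not have to encode them.
print(message({"figure": "book", "ground": "table", "figure_xyz": (0.1, 0.4, 0.9)}))
print(message({"figure": "book", "ground": "table", "figure_xyz": (0.3, 0.2, 0.9)}))
```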

Assuming this model, I was concerned primarily with the following questions. First, is there a one-to-one obligatory mapping between input and message across languages and modalities? If so, there should not be multiple representations such as SR, CS, and LR. Second, if there is no one-to-one mapping between input and message, as the model suggests, what are the properties of the levels that intervene between input and message? Are there crosslinguistic and crossmodal differences in the structures of SR, CS, and LR?

2.2. The CSLH Hypothesis

The research I have done until now and my insights led me to propose a hypothesis that can account for both similarities and differences across languages from the visual-gestural and auditory-vocal modalities: the Crossmodal Spatial Language Hypothesis (CSLH). CSLH follows a multi-modular approach and acknowledges online processing. It assumes multiple representations between input from the environment (the spatial layout) and message (the spatial language). According to CSLH, the representations have the same parameters across languages. The representations are the following. Spatial Representation (SR) encodes 3-D axial information (x, y, z) and reference frames (RF) (egocentric, allocentric). Temporal Representation (TR) encodes information, t, with respect to a timeline and reference frames (egocentric, allocentric). Conceptual Structure (CS) has a limited set of structures: BE-AT for static situations and GO-BE-AT for dynamic situations. Linguistic Representation (LR) has linguistic structures such as syntax, modifications, and information structure. Figure 2.4 gives a simplified version of CSLH. The rest of this chapter is dedicated to the details and justification of these representations.

[Figure: SR (Axes: x, y, z; RF: egocentric, allocentric), TR (t; axis), and CS ({GO}-BE-AT) interfacing with LR (syntax, modification, information structure).]

Figure 2.4. The Crossmodal Spatial Language Hypothesis (CSLH).

2.2.1. Spatial Representation and Reference Frames

According to Jackendoff (1983, 1990, 1996, 1997), SR interfaces with other systems such as the visual, haptic, proprioceptive, and action systems, and with Conceptual Structure. Nonetheless, SR is not the same as physical space (e.g., Wagner, 2006); rather, it is comprised of a geometric format (i.e., a three-dimensional axial system) and reference frames.

RF

Egocentric, Allocentric

Figure 2.5. Spatial Representation.

In this dissertation, the SR constituents were justified on three grounds: neurological underpinnings, cognitive abilities, and linguistic structures. First, research on neural structures showed that the human brain encodes both geometric information and reference frames. In the brain, with respect to navigation, there are two different types of cells: place cells and head direction cells. Place cells encode information about locations in the environment whereas head direction cells encode the heading of the organism (e.g., O'Keefe, 2003; Poucet, 2003). There are also two distinct neural systems in the visual system: the ventral and dorsal pathways (Ungerleider & Mishkin, 1982). The ventral pathway, also called the 'what-system', extracts information for object identification whereas the dorsal pathway, also called the 'where-system', extracts information for the size, location, and orientation of objects. These two systems help identify and recognize the objects in the environment, yet these processes are not independent of viewpoint-dependent and viewpoint-independent representations (e.g., Pizlo, 2008, pp. 164-167; Sekuler & Bennett, 2001; see also Landau & Jackendoff, 1993). Brain research also showed that, in early vision, humans generate both allocentric and egocentric reference frames from the spatial input (Pylyshyn, 2003). Allocentric representations of space are made in the hippocampus, which is concerned more with large distances and long timescales, whereas egocentric representations of space are made in the parietal lobe, which is related to the space surrounding the body and short timescales (Hartley & Burgess, 2003). Second, studies on cognition also indicated that axial information and reference frames are encoded in the brain. According to Gallistel (2001), 'cognitive maps' are a critical component of both animal and human navigation. These maps are representations of the layout of the environment in which metric relations such as distances and angularity and sense relations such as left-right are computed geometrically. According to Gallistel, in order for navigation to be successful, an organism updates changes in position with respect to not only an allocentric but also an egocentric reference frame. Studies on human memory (e.g., McNamara, 2003) also showed evidence that an egocentric reference frame (which specifies location and orientation with regard to the body) and an allocentric reference frame (which specifies location and orientation with respect to environmental cues such as gravity and landmarks) must be in use in memorizing things and places. Third, studies on spatial language also showed that axial information and reference frames are among the shared properties across languages. According to Clark (1973, pp. 35-48), human "perceptual space" and "linguistic space" share properties such as the ground level with respect to gravity, the left-right plane, and the front-back plane of the body. These properties defining three-dimensional space also provide directionality. For example, the ground level with up-and-down has two values: up is positive whereas down is negative, since the human body is upward in its canonical position. The left-right plane does not have values since the human left- and right-hand sides are symmetrical. While the back of the front-back plane is negative, the front of the front-back plane is positive because humans normally move toward the region in front of them (see also Lyons, 1977, pp. 690-703). There is also evidence that speakers take a perspective / reference frame in describing the spatial organization of objects (Levelt, 1996). According to Levinson (2003), a reference frame is a coordinate system in which an entity (Figure) is in relation to another entity (Ground) on the basis of a search domain of the Ground (G). There are three coordinate systems, or frames of reference, defined and observed crosslinguistically: intrinsic, relative, and absolute (Levinson, 1996a; Levinson & Wilkins, 2006). When a spatial relation is made with only Figure and Ground, an intrinsic reference frame is in use, as in the phrase 'the car is in front of the house'. When a spatial relation refers to a viewpoint in addition to Figure and Ground, a relative reference frame is in use, for example, 'the car is to the left of the house'. When a spatial relation refers to a geographical landmark or direction, an absolute reference frame is in use, as in 'the car is to the north of the house'. Nonetheless, these reference frames are only recruited when an angular relation is made. Crosslinguistic studies on basic angular relations showed that every language gives more importance to, or uses more extensively, one of the reference frames over the others. For example, Mopan uses only the intrinsic; Hai//om uses only the absolute; Yucatec uses all three reference frames (Levinson, 1996b, p. 8; Pederson et al., 1998; see also Majid et al., 2004 for discussion). Interestingly enough, these patterns can also be reflected in gestures. Speakers of languages with the absolute reference frame, such as Tzeltal and Guugu Yimithirr, tended to gesture with respect to cardinal directions, whereas those with the relative reference frame, such as Dutch, tended to gesture with respect to body coordinates (Levinson, 2003). Levinson and his colleagues (e.g., Majid et al., 2004) argued that languages differ from each other in preferring one or more reference frames over another, which may have cognitive consequences, i.e., a Whorfian effect. However, Li and Gleitman (2002) argue that choosing a reference frame is a linguistic phenomenon, that is, related to lexical resources, and may not have a Whorfian effect on cognition (cf. Levinson et al., 2002). There are also disagreements on the possible reference frames. In spite of Levinson's three reference frames, which account only for angular relations such as left-right and front-back, Pederson (2001) proposed sixty-four probable reference frames. Jackendoff (1996, pp. 15-16) argued that there are at least eight reference frames: four of them are intrinsic and the other four are environmental. In summary, there is plenty of evidence from studies on neurological underpinnings, cognitive abilities, and linguistic structures suggesting that three-dimensional geometric information and at least two reference frames are represented in the brain and reflected in human behavior and language. For the purposes of this study, therefore, I assumed that SR consists of the three-dimensional axial system and two reference frames: one egocentric and the other allocentric. In line with Jackendoff's arguments, the axial system is always present and identifies object locations relative to each other. Additionally, I assumed that both reference frames are available for all spatial expressions, including static/dynamic spatial relations and locatives, as well as temporal relations. Moreover, the egocentric perspective can be divided into two parts: the narrator perspective, in which the speaker takes his/her own perspective and describes the spatial relations accordingly, and the addressee perspective, in which the speaker describes the spatial relations according to the addressee's viewpoint. The allocentric perspective can also be divided into two parts: an environment-based perspective, in which the speaker describes the spatial relations according to fixed bearings such as geocardinal directions, and a neutral perspective, in which the speaker describes the spatial relations according to intrinsic features of the objects.

2.2.2. Temporal Representation and Reference Frames

According to the model in Figure 2.6, SR encodes the 3-D axial information x, y, and z, while TR encodes one-dimensional axial information, t. They share the reference frames.

Figure 2.6. SR, TR, and RF model. [Diagram: SR (axes x, y, z) and TR (axis t) sharing RF: egocentric, allocentric.]

By analogy, therefore, I assume that, in talking about temporal relations, a natural language user establishes an asymmetrical relation. That is, by using the same spatial reference frame terminology, in the asymmetrical relation with regard to the axial system, i.e., one-dimensional, a situation is identified as Figure with respect to a reference point (Ground). In addition to Figure-Ground assignment, a natural-language user chooses a reference frame (egocentric or allocentric) in temporal language as well. In the egocentric reference frame, the reference point is defined according to the ego and now. Now is shared by both viewer and addressee, and the use of the front-back axis is the preference in the timeline cross-linguistically. In the allocentric reference frame, there is probably no environment-based perspective since the beginning and the end point of time is not absolute but relative to now. Additionally, when a neutral perspective is taken, there is no reference to the ego's now. In the literature, nonetheless, there are claims that time is derived from space. These claims are presented in section 2.2.2.1; the section that follows compares TR and SR.

2.2.2.1. Is Time Derived from Space?

Temporal linguistic information and spatial linguistic information seem to share some properties. For example, 'at' can be used spatially as in 'at the hospital' and temporally as in 'at 5pm'. These similarities led to the argument that temporal language is derived from spatial language (e.g., Clark, 1973; Lyons, 1977; Lakoff & Johnson, 1980; Boroditsky, 2003), as well as to Jackendoff's claim that the spatial relationals for place, such as IN, ON, and AT of Conceptual Structure, can also be used for temporals. First, I will address studies that argue that time is derived from space. Across languages, spatial lexemes such as adpositions and/or adverbials are also used in temporal expressions (e.g., Haspelmath, 1997). Yet, languages also encode temporal relations in their tense systems; the use of adverbials can correlate with tense markings. For example, *last week I will go to London is unacceptable since last week and non-past marking on the auxiliary do not "agree". Typologically, grammatical markings of temporality are quite complicated. For example, Dahl (1985) argues that there are about forty-five different tense systems, which often overlap with aspect and mood. It is also assumed that situations are located along a timeline which is supposed to be a straight line, hence one-dimensional, so that, according to Comrie (1985), tense is a "grammaticalized expression of location in time". Nonetheless, morphological markings of tense are not always obligatory. Some languages mark tense in their morphology while others, such as Mandarin, do not. Since entities in spatial relations are concrete and events in temporal relations are abstract, it is often stated that time is derived from space (see Clark, 1973; Lyons, 1977). The fact that (some) spatial lexemes are also used in temporal expressions across languages also supports this view. However, the lexical items referring to space and time do not overlap exclusively. The ways of conveying spatial and temporal relations can also differ. For example, temporal relations can be carried by special morphological markers such as tense and aspect markers, while those markers are not used in conveying spatial relations (for a review see Tenbrink, 2007, pp. 12-37). Nonetheless, these items are often neglected in discussions. Several hypotheses have been developed to understand the space-time relationship. For example, since time is understood one-dimensionally while space is three-dimensional, one could argue that the spatial lexical items used in referring to one-dimensional spatial relations are also used in temporal relations. Moreover, in the real world, space and time are interrelated; therefore, languages lexicalize spatial terms into temporal terms (e.g., Bybee et al., 1994). One could also argue that space is basic whereas time is a metaphorical extension of space (Lakoff & Johnson, 1980). The metaphorical extension was further developed recently by introducing two "perspectives" used in temporal relations (Clark, 1973; Traugott, 1978; Gentner, 2001; Radden, 2003). One is the moving-time perspective, in which time is understood as moving, as in 'approaching', 'coming', etc. The other is the moving-ego perspective, in which time is understood as static and the ego moves, as in 'going to'. There is also increasing empirical evidence for these "perspectives" along the front-back axis to be used in the spatial and temporal domains of language. For example, when participants are primed with one of the strategies in spatial tasks, this strategy has an effect on the participants' judgments in referring to temporal events (Casasanto & Boroditsky, 2008; Núñez et al., 2006; Núñez & Sweetser, 2006; Torralbo et al., 2006; Matlock et al., 2005; Gentner et al., 2002; Boroditsky & Ramscar, 2002; Gentner, 2001; McGlone & Harding, 1998; Gentner & Imai, 1992). Chapter 6 of this dissertation deals extensively with this issue. In summary, although some linguistic studies have argued that time is derived from space, their findings did not come from the spatial and/or temporal domains as wholes, but from some lexical items and some experimental manipulations. This can be related to structural similarity, as argued in Murphy (1996), but it does not indicate that temporality is derived from spatiality at the domain level. When spatial and temporal cognition are thought of as cognitive domains and parts of mental representations, the dissimilarities exceed the similarities.

2.2.2.2. TR vs. SR

There are many reasons to assume that Temporal Representation is distinct from Spatial Representation. In spatial language, we locate entities in space, but in temporal language, we locate events on a timeline. Yet, neither 'entities' and 'events' nor 'space' and 'timeline' are identical. Thus, I assume a one-dimensional timeline as sketched in Figure 2.7. With one reference point (now) marked, past is on one side and non-past is on the other side of now. What is crucial in representing temporal relations is that now, as a reference point on the timeline, conflates with the ego as perspective. For example, a situation which happened in the past is located on the left side of now, which is defined on the basis of the ego. That is, perhaps, why time is understood as a single dimension.

Figure 2.7. One-dimensional timeline. [Diagram: axis t with the ego at t0 (now); past on one side of now, future on the other.]
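A minimal sketch of this one-dimensional TR, assuming a numeric timeline with now as the deictic reference point; the function name and the numeric encoding are hypothetical.

```python
def locate_on_timeline(t_situation, t_now=0.0):
    """Locate a situation on the one-dimensional axis t. In the
    egocentric temporal reference frame, the reference point (now, t0)
    conflates with the ego, so every situation falls on one side of
    now or the other."""
    if t_situation < t_now:
        return 'past'
    if t_situation > t_now:
        return 'future'
    return 'now'

print(locate_on_timeline(-3.0))  # a situation before now -> 'past'
print(locate_on_timeline(2.5))   # a situation after now -> 'future'
```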

The linguistic encodings of space and time also differ from each other in their morphosyntactic structures. For example, typologically, locational information is mostly encoded in NPs, not in predicates, with the exception of the so-called positionals, e.g., standing and sitting (Newman, 2002). However, temporal information can be encoded both in NPs and in predicates. There is also evidence that brain-damaged subjects process the prepositions used in the spatial and temporal domains of English differently. For example, assessments of the use of English prepositions in the spatial and temporal domains indicate that this knowledge may be intact in one domain but not the other (Kemmerer, 2005). Having TR and SR distinct from each other also helps to understand motion events further. In order to express a situation as a motion event, according to Bohnemeyer and his colleagues (Bohnemeyer et al., 2007), a situation should carry a property, called the "Macro-Event Property" (MEP), which consists of 'a unique initial and/or terminal boundary, a duration, and a unique position on the timeline'. This study suggested that an event should carry temporal information in order to be encoded linguistically. Based on the findings from their crosslinguistic study on the segmentation of motion events in eighteen genetically and typologically unrelated languages, they argued that languages differ from one another in their encodings of an event with the MEP because of their diverse lexicalization patterns and syntactic constructions. Thus, if temporal information is a unique characteristic of an event and if its linguistic encodings differ across languages, then it must be encoded in a separate domain.

2.3. Conceptual Structure

CS is another structure, which interfaces with other systems such as smell, auditory information representation, syntax, emotion, and SR. Figure 2.8 outlines Spatial Representation (SR), Conceptual Structure (CS), and the interfaces.

Figure 2.8. CS-SR & TR interactions (adapted from Jackendoff, 1997, p. 44; TR is mine). [Diagram: CS interfaces with auditory information, syntax, emotion, and smell; SR & TR interface with the visual, action, haptic, and proprioception systems; CS and SR & TR interface with each other.]

This model also provides two competing hypotheses (Jackendoff, 1996, p. 21). On the one hand, there is the SR hypothesis, in which SR specifies spatial information such as axes and perspective in its geometric format. On the other hand, there is the CS hypothesis, in which CS specifies spatial information in its own format. We can interpret spoken language research (e.g., Pederson et al., 1998) in different ways by using these two hypotheses. But, for Jackendoff (1996, pp. 21-24), the SR hypothesis wins out since, first, "people freely switch frames of reference in visuomotor tasks… [such as] an egocentric (or observer) frame for reaching but an environmental frame for navigating" and, second, axes and reference frames do not have grammatical effects. In this dissertation, I pursue the SR hypothesis, which claims that axial information and perspective types, along with reference frames, are encoded in SR. CS and SR differ from each other in many respects (Jackendoff, 1996; Zee & Nikanne, 2000): for example, CS represents functional features of objects and object parts, whereas SR does not. CS can represent only a selection of path features, whereas SR can represent all path features. CS encodes propositional representations, whereas SR is schematic. CS is built up out of ontological features, but SR is not. Moreover, while SR encodes axial information and reference frames, CS encodes ontological categories: Complex (SITUATION (EVENT/STATE)¹, PLACE, PATH) and Simple (THING², PROPERTY, AMOUNT) (Jackendoff, 1990, p. 22; Nikanne, 2000, p. 80). Each ontological category has a restricted function. I detail the Jackendovian CS templates for the spatial domain in the next section. Then, in section 2.3.2, I discuss the Crossmodal Spatial Language Hypothesis (CSLH) based on the Jackendovian CS templates.

¹ Culicover & Jackendoff (2005) and Nikanne (2000) used Situation, which is a cover 'ontological category' for both the event/state distinction and modals.
² Culicover & Jackendoff replaced Thing with Object. In this study, I kept Thing instead of Object.

2.3.1. The Jackendovian Conceptual Structures

Jackendoff (1983, pp. 174-175; 1990, pp. 43-46) gave structures for States, which have functions such as BE, ORIENT, and EXT(ENT). He also presented structures for Events, which have functions such as GO, STAY, and CAUSE. Both States and Events have ontological category variables such as Thing, Place, and Path, which are indicated below as subscripts. The templates are the following:

(1) [State BE ([Thing ], [Place ])]

(2) [State ORIENT ([Thing ], [Path ([{Thing, Place} ])])]

(3) [State EXT ([Thing ], [Path ([{Thing, Place} ])])]

(4) [Event GO ([Thing ], [Path ])]

(5) [Event STAY ([Thing ], [Place ])]

(6) [Event CAUSE ([Thing/Event ], [Event ])]
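As an informal aid, the sketch below encodes two of these templates as Python constructors that assemble the labeled bracketings; the constructor names echo the CS functions, but the encoding itself is my own illustrative choice, not part of the theory.

```python
def cs(category, function, *args):
    """Assemble a labeled bracketing such as [State BE (...)]."""
    return f"[{category} {function} ({', '.join(args)})]"

def thing(filler):
    return f"[Thing {filler}]"

def place(function, arg):
    return f"[Place {function} ({arg})]"

def path(function, arg):
    return f"[Path {function} ({arg})]"

def BE(x, loc):              # template (1)
    return cs('State', 'BE', x, loc)

def GO(x, p):                # template (4)
    return cs('Event', 'GO', x, p)

# 'The book is on the table', discussed as example (7) below:
print(BE(thing('BOOK'), place('ON', thing('TABLE'))))
# -> [State BE ([Thing BOOK], [Place ON ([Thing TABLE])])]

# 'Bill entered the room', discussed as example (10) below:
print(GO(thing('BILL'), path('TO', path('IN', thing('ROOM')))))
# -> [Event GO ([Thing BILL], [Path TO ([Path IN ([Thing ROOM])])])]
```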

Let me illustrate these CSs by giving examples³. For example, 'The book is on the table' has a CS as in (7). According to this representation, the entire expression is a State which has a BE function. The State BE has two ontological category variables: Thing, which is filled with BOOK, and Place, which has an ontological category, ON. The Place function ON is associated with a Thing, which is filled with TABLE.

(7) [State BE ([Thing BOOK], [Place ON ([Thing TABLE])])]

‘The sign points toward New York’ has a CS as in the following:

(8) [State ORIENT ([Thing SIGN], [Path TOWARD ([Place NEW YORK])])]

According to the CS in (8), the entire expression is a State which has an ORIENT function. The State ORIENT has two ontological category variables: Thing, which is filled with SIGN, and Path, which has an ontological category: TOWARD. The Path function TOWARD has a Place argument filled with NEW YORK. 'The road goes from Indianapolis to Chicago' has a CS as in the following:

(9) [State EXT ([Thing ROAD], [Path FROM ([Place INDIANAPOLIS]), TOWARD ([Place CHICAGO])])]

³ Culicover and Jackendoff (2005) use a slightly different notational system in which modifiers of the arguments (fillers here) and features such as definiteness and plurality are encoded in CS. Nonetheless, I followed Jackendoff (1997) and earlier and Nikanne (2000) and earlier.

According to the structure in (9), the entire expression is a State which has an EXT function. The State EXT has two ontological category variables: Thing, which is filled with ROAD, and Path, which has two ontological and nonhierarchical categories: FROM and TOWARD. Both categories take Place arguments, which are filled with INDIANAPOLIS and CHICAGO, respectively. What follows are examples for Events. For example, 'Bill entered the room' has a CS as in (10). According to this representation, the entire expression is an Event, which has a GO function. The Event GO has two ontological category variables: Thing, which is filled with BILL, and Path. Path, in turn, has an ontological category, TO, which takes another Path, IN, as an argument. Finally, IN is filled with a Thing, ROOM.

(10) [Event GO ([Thing BILL], [Path TO ([Path IN ([Thing ROOM])])])]

‘Mary remained on the floor’ has a CS as in (11). Hence, the entire expression is an Event, which has a STAY function. The Event STAY takes a Thing, which is MARY, and a Place, ON, as arguments. The Place ON has one argument: a Thing, FLOOR.

(11) [Event STAY ([Thing MARY], [Place ON ([Thing FLOOR])])]

'Beth threw the ball out the window' is an example of the CAUSE type of Event. As given in (12) (Jackendoff, 1983, p. 175), this expression is an Event with a CAUSE function. The Event CAUSE takes two arguments: a Thing (BETH) and another Event (GO). The Event GO, in turn, takes a Thing, BALL, and a Path, OUT WINDOW, as its arguments.

(12) [Event CAUSE ([Thing BETH], [Event GO ([Thing BALL], [Path OUT WINDOW])])]
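To make the embedding in (12) explicit, the following standalone sketch stores the CS as a recursive structure and linearizes it back into the bracketed notation; the tuple encoding is purely illustrative.

```python
# A constituent is (category, function, [argument constituents]).
CS_12 = ('Event', 'CAUSE',
         [('Thing', 'BETH', []),
          ('Event', 'GO',
           [('Thing', 'BALL', []),
            ('Path', 'OUT WINDOW', [])])])

def linearize(node):
    """Render a CS constituent as a labeled bracketing."""
    category, function, args = node
    if not args:
        return f"[{category} {function}]"
    return f"[{category} {function} ({', '.join(linearize(a) for a in args)})]"

print(linearize(CS_12))
# -> [Event CAUSE ([Thing BETH], [Event GO ([Thing BALL], [Path OUT WINDOW])])]
```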

Jackendoff's treatment of temporals is the following. For him (1983, pp. 189-191), temporal expressions define a one-dimensional timeline on which events and states are located. He claimed that not only are temporal terms structurally the same as spatial terms, but predicates concerning temporal location also have the same structure as spatial predicates. Let me illustrate this with the examples below⁴. 'The meeting is at 6:00' in (13) and 'The meeting is at the hospital' in (14) have very similar, if not identical, CSs.

(13) [State BETemp ([Event MEETING], [Place ATTemp ([Time 6:00])])]

(14) [State BE ([Event MEETING], [Place AT ([Thing HOSPITAL])])]

Similarly, 'We moved the meeting from Tuesday to Thursday' in (15) and 'We moved the statue from the park to the zoo' have very similar CSs. Both of them belong to the ontological category Event, which has a GO function. Yet, Jackendoff made a notational distinction between them: GO and GOTemp, respectively.

⁴ Most of the examples are from Jackendoff (1983, pp. 190-191).

(15) [Event CAUSE ([Thing WE], [Event GOTemp ([Event MEETING], [Path FROMTemp ([Time TUESDAY]), TOTemp ([Time THURSDAY])])])]

‘Despite the weather, we kept the meeting at 6:00’ and ‘Despite the weather, we kept the statue on its pedestal’ belong to the event category with a STAY function. The CS structure is below in (16).

(16) [Event CAUSE ([Thing WE], [Event STAYTemp ([Event MEETING], [Place ATTemp ([Time 6:00])])])]

Temporals can also be represented with EXT. Here is an example, 'Ron's speech lasted from 2:00 to 4:00', which has the CS structure in (17).

(17) [State EXTTemp ([Event SPEECH], [Path FROMTemp ([Time 2:00]), TOTemp ([Time 4:00])])]

For tense, Jackendoff did not present any analysis based on spatial treatments. According to Culicover and Jackendoff (2005), nevertheless, tense in CS can be represented as a situation category. For example, ‘We moved the meeting from Tuesday to Thursday’ can have a CS as in (18).

(18) [Situation PAST [Event CAUSE ([Thing WE], [Event GOTemp ([Event MEETING], [Path FROMTemp ([Time TUESDAY]), TOTemp ([Time THURSDAY])])])]]
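Since (13)-(18) reuse the spatial functions with a Temp subscript, the standalone sketch below derives a Temp-marked CS string from a spatial one; this string rewriting is only a mnemonic for the notational parallel, not a claim about processing.

```python
import re

def temporalize(cs_string):
    """Mark the spatial functions BE, GO, STAY, EXT, AT, FROM, and TO
    with the Temp subscript, mirroring Jackendoff's GO vs. GOTemp."""
    return re.sub(r'\b(BE|GO|STAY|EXT|AT|FROM|TO)\b', r'\1Temp', cs_string)

spatial = "[State BE ([Event MEETING], [Place AT ([Thing HOSPITAL])])]"
print(temporalize(spatial))
# -> [State BETemp ([Event MEETING], [Place ATTemp ([Thing HOSPITAL])])]
```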

2.4. Challenges to the Jackendovian Conceptual Structures

In what follows, I discuss five challenges to the Jackendovian Conceptual Structures: (1) the use of dynamic predicates in referring to static situations; (2) the nonuniversality of place functions (in, on, at); (3) the nonuniversality of place functions (left/right, front/back); (4) crosslinguistic differences in the language of motion events; and (5) the findings from sign language research.

2.4.1. Dynamic Predicates as "States"

The Jackendovian CSs indicated that States account for static relations whereas Events account for dynamic relations. However, static relations can be expressed as dynamic situations, for at least two reasons, one perceptual and one linguistic. First, as opposed to 'real movement', humans perceive a set of consecutive static situations as 'apparent movement' or illusory movement because of neural structure or perceptual inferences (e.g., Mather, 2003). For example, even though movies consist of a set of individual frames, humans perceive them as continuous. Pragmatically, humans make inferences about situations on the basis of their world knowledge and prior experiences. For example, if two cars are in a line and facing the same direction with no motion, humans may infer that they are moving or that one is following the other. Even so, pragmatic inferences do not entail 'apparent movement'. Second, contra what Jackendoff suggested, static relations are conveyed not only by topological (in, on, at) and angular terms (left/right, front/back) but also by terms such as sitting, lying, standing, looking, and facing, which are referred to as dynamic predicates (Newman, 2002; Levinson & Wilkins, 2006; Ameka & Levinson, 2007). Even though these dynamic predicates are inherently 'static', according to the classical Jackendovian approach, these types of situations are assumed to be Events. For example, 'sitting on a cup' can be represented in CS as [Event STAY ([Thing ], [Place ON ([Thing CUP])])]. According to Newman (2002, pp. 1-3), the prototypical meanings of sit, stand, and lie in English include several domains. For example, in the "spatiotemporal" domain, sit refers to a compact position as in 'the cat is sitting under the tree', stand refers to a vertical position as in 'he is standing next to you', and lie refers to a horizontal position as in 'the child is lying on the ground'. Although all languages have these terms in their lexicons, their prototypical locational expressions differ. For example, whereas English uses a small set of prepositions with the copula 'be', languages such as Likpe, Zapotec, and Tzeltal have a large number of positional verbs, while Guugu Yimithirr and Rossel have a small number of positionals in locative expressions (Ameka & Levinson, 2007, p. 852). In summary, recent studies have challenged an ideal relational system such as the Jackendovian CSs, which is generally assumed to be a set of preposition-like elements in a given language. Nonetheless, I believe that a theory of spatial language should account for the use of pragmatic inferences and of dynamic predicates such as sit, stand, lie, and look in locative expressions.

2.4.2. Nonuniversality of Place Functions (IN, ON, AT, etc.)

Topological relations are the relations made by, for example, in, on, at, near, and far in English. For Jackendoff, their conceptual correspondences (IN, ON, etc.) are universal since, he argued, all languages convey such relations. Although, conceptually, those relations could be universal, the way languages express them is related not only to the geometric information about the object relations and to extensions of the original / prototypical sense over other senses, but also to the context in which an expression is used. Additionally, the way a language refers to a spatial relation is not the same in other languages. For example, the Turkish correspondence of the English expression on the bus could be 'at the bus'. The literal translation of on the bus might be 'on top of the bus' in Turkish. Classically, those relations are defined as follows (Miller & Johnson-Laird, 1976). For example, x is conceptually "in" y if x is part of z, which is included in y. X is conceptually "on" y if (i) x is included by the surface region of y and supported by y; otherwise, (ii) x is on y's path and by y. Nonetheless, it became clear that the linguistic reflections of in and on are more complicated than this classic view. Herskovits (1986), instead, proposed an alternative view focusing on both the ideal conceptual in and on and their linguistic uses (or extensions). She defined an ideal in as inclusion of a geometric construct in a one-, two-, or three-dimensional construct. She also presented the usage types of in (in English), which are extended to other senses: for example, spatial entity in a container, physical object in the air, person in clothing, and person/participant in institution. Her definition of an ideal on is: for a geometric construct x to be contiguous with a line or surface y; if y is the surface of an object Oy, and x is the space occupied by another object Ox, for Oy to support Ox. But, again, the uses of on are extended to a variety of situations in English, such as an object supported by another, an object attached to another, an object with a wall, and a spatial entity located on a geographical area. However, it has been shown that the uses of spatial terms for topological relations are even more complicated. Coventry & Garrod (2004) argued that not only geometry and knowledge of objects but also how objects interact with each other, together with a given context, accounts for the situation-specific meaning of spatial prepositions. For example (p. 56), if y contains x, then y also controls the location of x; x cannot be removed before y is removed. Similarly, if x depends on a support y, without the support x will fall to the ground. Coventry & Garrod gave a concrete example of this relation: for an umbrella to be over someone means both that the umbrella is in a higher position than that person and that the umbrella is in a position to protect that person from getting wet (p. 58). Moreover, Tyler & Evans (2003) argued that English spatial prepositions are polysemous and have many-to-one systematic mappings between meaning and form. They proposed proto-senses for each spatial relation and presented their extensions, which, they argued, are cognitively structured. For example, 'the proto-sense' for in constitutes a spatial relation: a trajector (Figure) is located within a landmark (Ground), which has salient structural elements such as an interior, a boundary, and an exterior. Nonetheless, this proto-sense is extended to a location sense, an arriving sense, a disappearance sense, a reflexivity sense, etc. (pp. 183-184). Thus, the definition of topological relations seems to be motivated by both geometric relations and language uses in particular contexts, although neither of them is sufficient, at least, in English. The question then arises as to whether there are crosslinguistic differences in defining topological relations. Vandeloise (1991) extensively studied the French spatial prepositions, arguing that definitions of the French spatial prepositions based on geometric relations of objects are not sufficient. He claimed that the geometry of an object is only a superficial consequence of the preposition but not the use of the preposition in a given context (p. 7). He continued by arguing that knowledge of the world around the body and the perception and conceptualization of the things around the body can provide a better explanation for the prepositions than simple and ideal geometric definitions (p. 235). Additionally, crosslinguistic studies have shown that the way one language uses relationals to correspond to a set of environments is not the same as in another language. Bowerman and Choi (2003) provided clear examples of the various uses of spatial relationals in English and Korean. For example, English speakers make a distinction between putting a figure into an enclosure, container, or volume of some kind and putting it into contact with an exterior surface of the ground object (in vs. on). However, Korean speakers make another distinction between these events: tight-fit containment vs. loose-fit containment (pp. 393-394). Levinson and Wilkins (2006, pp. 512-575) overviewed a variety of less studied languages. They showed the ways in which languages such as Warrwa, Japanese, Kilivila, Tzeltal, Dutch, and English, among others, use a variety of lexical items in referring to pictures depicting topological relations. For example, the English speakers used the "on" relation for the pictures depicting 'cup on table', 'stamp on letter', 'ring on finger', and 'apple on skewer' (p. 562) whereas the Dutch speakers used "op" for 'cup on table' and 'stamp on letter', "om" for 'ring on finger', and "aan" for 'apple on skewer' (p. 559). The Tzeltal speakers, however, used "pak'al" for 'cup on table' and 'stamp on letter' but used dynamic predicates for 'ring on finger' and 'apple on skewer'. In summary, all languages convey topological relations in some way. Yet, there are both intra- and cross-linguistic differences in the linguistic correspondences of those relations. Therefore, an ideal preposition-like system such as the Jackendovian CS templates may not be sufficient to explain the observations above.

2.4.3. Non-universality of Place Functions (LEFT, RIGHT, FRONT, BACK, ABOVE, BELOW, etc.)

Angular relations are the relations made by left/right, below/above, etc. in English. For Jackendoff, the conceptual correspondences (TO-THE-LEFT-OF/TO-THE-RIGHT-OF, BELOW/ABOVE) are universal since, as for the other place functions such as IN, ON, and AT, all languages carry such relations. Although, conceptually, those relations are universal, the way languages convey them is not. Idiosyncratic structures of a language and a given context can also make a difference in making spatial relations. Clark (1973) claimed that the human body provides a basis for conveying angular relations since the human body has a left side and a right side, a front and a back, and an above and a below in its upright position. However, experimental studies on English showed that (1) the uses of angular terms refer to regions, not points (Logan & Sadler, 1996; Carlson-Radvansky & Logan, 1997; O'Keefe, 2003); thus, there are situations, such as those in which objects are located diagonally with respect to the viewer's orientation, for which at least two angular relationals can compete; and (2) the use of one pair of angular terms (left-right) over another (front-back) is very much influenced by the reference frame employed (Carlson-Radvansky & Logan, 1997). Additionally, the uses of angular terms are all influenced by contextual / functional information. Coventry and his colleagues (Coventry et al., 2001) manipulated scenes depicting relations such as under, over, above, and below, and asked their participants to make acceptability judgments with respect to a given preposition in English. The results indicated that the participants made judgments according to the function of the objects in the scenes regardless of the objects' positionings with respect to the human figure. The studies on English indicated that angular relations can be made by using a set of prepositions such as left/right, front/back, and above/below. Nonetheless, the use of any one of them depends not only on geometric information but also on contextual/functional knowledge. How does this dependence play out in other languages? Vandeloise (1991) suggested a rather different framework in which both geometric relations and contextual/functional information affect the use of a given preposition in French. For example, in his generalization for devant/derrière 'front/back', general orientation is the only way to account for the use of these terms. Yet, the frontal direction, the direction of movement, the line of sight, the direction in which the sensory organs are oriented, and the directions of nutrition and defecation are all constituents of general orientation. In Finnish, the distribution of the relationals meaning 'in front of', 'behind', 'above', and 'below' provided interesting results (Nikanne, 2003). There are two kinds of relationals meaning 'in front of' and 'behind' in Finnish. Both of them indicate that the referred objects are in motion, and both refer not only to horizontal spatial relations but to any one-dimensional relation. Coventry & Frias-Lindqvist (2005) inquired further to see whether these uses differ from those of English when motion is present. They found that Finnish speakers did indeed distinguish 'behind' terms when the objects are in motion. Yet, according to Nikanne (2003), 'above' and 'below' refer only to vertical relations, as in canonical English. In their overview of a variety of languages, Levinson and Wilkins (2006, pp. 567-569) summarized that the languages they studied showed interesting patterns when it came to describing scenes with angular relations. The terms corresponding to 'left' and 'right' are not always used in angular relations. For example, even though 'left' and 'right' are used in referring to body parts, the Jaminjung, Warrwa, and Arrernte speakers did not use them spatially. Similarly, the Tzeltal speakers did not have a 'left'/'right' spatial distinction. They also seemed to dislike using 'front' and 'back'; instead, they used cardinal directions such as 'uphill', 'downhill', and 'across'. Interestingly, 'front', 'back', 'left', and 'right' were commonly used among most of the male speakers but only some of the female speakers of Yukatek Maya. In summary, it seems that the angular terms are related to the human body and, perhaps, the environment. Yet, the use of those terms is also contextually/functionally motivated in language. Crosslinguistic studies revealed striking differences. These findings suggest that, perhaps, at one level such as CS, some of the angular terms are available at the same time (underspecified).

2.4.4. Crosslinguistic Differences in the Language of Motion Events

According to Jackendoff (1990), a situation can be conceptualized as either a state or an event in CS. States are those which take BE, ORIENT, and EXT(ENT) as functions whereas events are those which take GO, STAY, and CAUSE as functions. The prototypical GO events refer to motion events; the prototypical STAY events refer to positionals such as sitting, standing, lying, and looking, in encoding static situations; and the prototypical CAUSE events refer to causation. In the following, I focus on modifications of motion events such as manner and path, as well as on causative motion events, as new challenges to the Jackendovian conceptual structures.

2.4.4.a. Manner and Path

The Jackendovian perspective assumed that all languages have functions that correspond to manner and path in their conceptual structures. Both manner and path are, thus, conceptual primitives. Nonetheless, recent studies challenged the universality of manner and path encodings of motion events in grammar and linguistic behavior. First, it seems that languages differ from one another in their lexicalization patterns of manner and path. For example, Talmy (2000, p. 189) claimed that all languages express a figure in motion and its path, a linear figure moving along the same path, and a stationary figure positioned in the same path. In doing so, all languages can encode the properties of an event such as figure (moving object), ground (reference object), manner (how the movement is done, e.g., jumping), path (source/starting point and goal/endpoint of the motion), and predication. How they encode these universal properties led Talmy to propose a typology: (1) languages that use verbs to carry both the path of the motion and predication (verb-framed languages) and (2) languages that use verbs to carry both the manner of motion and predication (satellite-framed languages). Spanish is an example of a verb-framed language and English is an example of a satellite-framed language. Slobin (2004, 2006) revised this typology and added one more type (equipollently-framed languages), in which "path and manner [are] expressed by equivalent grammatical forms" (2006, pp. 64-65), such as languages with serial verb constructions, e.g., Sino-Tibetan languages. Second, Slobin also argued that languages differ from one another in the way their speakers talk about the manner of a particular motion event. He gave an example from the narratives on "the owl exit scene" of the Frog story (2006, pp. 65-66). Interestingly, he found that the speakers of verb-framed languages such as Spanish, Hebrew, and Turkish did not talk about the manner of that scene, while the speakers of satellite-framed languages, such as Dutch and English, talked about the manner, on average, 23% of the time. The speakers of equipollently-framed languages such as Mandarin and Thai mentioned the manner of the scene in about 34% of their narrations. Third, recent research on crosslinguistic co-speech gestures also indicated that verbal encodings as well as the gestures that accompany speech differ across languages in talking about the path and manner of motion events (Kita & Ozyurek, 2008). For example, Kita & Ozyurek (2003) asked English, Japanese, and Turkish speakers to talk about a cartoon which included a 'swing' event. The English speakers used swing, which encodes the go+arch trajectory of the movement, in their descriptions; when they gestured, their gestures included the arch trajectory of the motion. The Japanese and the Turkish speakers, however, used verbs that correspond to go in English without encoding the trajectory of the movement. Kita and Ozyurek also found that the co-speech gestures used by the Japanese and Turkish speakers did not encode the arch trajectory of the movement, but a straight movement of the hands. Kita and Ozyurek also asked the speakers of all three languages to describe a 'roll down' event, which included the manner of 'rolling' and a path of 'descending'. While the English speakers encoded this event in a single clause by using roll down, the Japanese and the Turkish speakers used two separate clauses to encode the same event, distinguishing the manner and path of the event as in he descended as he rolled. Interestingly, in line with the speech encodings, the English speakers could use gestures to indicate manner and path at the same time, whereas the Japanese and the Turkish speakers used two different gestures for each. Last but not least, although, for Jackendoff, FROM and TO have the same conceptual status, psycholinguistic studies indicate that there is an asymmetry between referring to the beginning point and the end point of a motion event in the encodings of path. For example, Lakusta and Landau (2005) examined the productions of motion events in various populations of English speakers: normal children, patients with Williams Syndrome, and normal adults. They found that there is an asymmetry between the source and goal of the paths of motion events. All of the participants encoded the goal/endpoint of the paths but not the source/starting point of the motion events. The question then arises as to whether this is the case across languages. Regier and Zheng (2007) asked a similar question and ran a series of experiments in three different language groups: Lebanese Arabic, Mandarin, and English. Although these three languages are different from each other in their spatial language (Arabic is verb-framed, Mandarin is equipollently-framed, and English is satellite-framed), all of the participants encoded and made finer distinctions about the goal/endpoint of the paths of motion events. These findings showed that not only English speakers but also Arabic and Mandarin speakers tend to prefer encoding the endpoint (Jackendovian TO) over the source (Jackendovian FROM) of motion events. In sum, the studies on the manner and path of motion events revealed that languages differ from each other in the way they lexicalize/grammaticalize manner and path. These differences can also be found in co-speech gestures. Additionally, the perceptual studies showed that people pay more attention to the endpoints of a motion event regardless of the language they speak.

2.4.4.b. Causative Motion Events

For Jackendoff, causation is an Event type and CAUSE is an event function. The template is the following:

(19=6) [Event CAUSE ([Thing/Event ], [Event ])]

'Beth threw the ball out the window' is an example of the CAUSE type of Event. As given in (20) (Jackendoff, 1983, p. 175), this expression is an Event with a CAUSE function. The Event CAUSE takes two arguments: a Thing, BETH, and another Event, GO. The Event GO, in turn, takes a Thing, BALL, and a Path, OUT WINDOW, as arguments.

(20=12) [Event CAUSE ([Thing BETH], [Event GO ([Thing BALL], [Path OUT WINDOW])])]

Nonetheless, Talmy (2000) claimed that the underlying structure of causation is a set of force-dynamic principles. According to him (p. 466), a prototypical force-dynamic pattern involves two forces opposing each other, a stronger force overcoming a weaker one, a force acting on a straight line, and a constant force tendency in the agonist (the focal force entity), which has a tendency to exert force toward an action or rest and ends up in an action or rest. He grouped the types of causation into two: Extended Causation and Onset Causation. In extended causation, the focal element, the 'agonist', tends toward either rest or action and ends up in rest or action. The agonist force could be either greater or lesser than that of the other element, the 'antagonist'. In onset causation, the focal element again tends toward either rest or action, and its effect is either causing or letting. The antagonist ends up starting or stopping with respect to the agonist. The force dynamics underlying causation is empirically supported as well (Wolff, 2007). So, the Jackendovian template for causation seems to miss the behavior of the agonist and the relative behavior of the antagonist with respect to the agonist. For example, 'Beth threw the ball out the window' is an onset causation in which Beth caused the ball to start going away from her, and Beth then rested. Yet, Jackendoff's CS in (20) did not capture Beth's initiation of the event and her situation after the initiation, nor the ball's relative direction of movement with respect to Beth's final positioning.

2.4.5. Sign Language Research on Spatial Language

"Sign languages are articulated in three-dimensional space, and it is crucial to understand this aspect of the modality [i.e., the visual-gestural modality] in order to find both the similarities and differences between sign and spoken languages" (Sandler & Lillo-Martin, 2006, p. 489). Research has shown that the use of space is crucial in the "pronominal and verb agreement system" as well as in establishing spatial and discourse referents (ibid., pp. 479-489). In the following, I discuss how studies on spatial language in sign languages challenge the Jackendovian templates. First, on the one hand, spoken language research indicates that speakers use lexical and grammatical features in referring to spatial relations. Although the Jackendovian conceptual structures provide a system of ontological categories and their functions, they fail to take into account pertinent information provided by gestures and sign languages. Yet, on the other hand, sign languages represent space by using the signing space in front of the signers, including their bodies (Emmorey, 1996). Crucially, sign languages need not use lexical items that correspond to spoken language relationals such as left, right, front, back, in, on, at, to, toward, and so on. Second, the Jackendovian templates, by nature, assume that ontological categories and functions are categorical. For example, TO and TOWARD are two different path functions; moreover, there is no other function word in between TO and TOWARD. Nonetheless, there are findings that suggest that the spatial linguistic forms of sign languages such as American Sign Language (ASL) have gradient rather than categorical properties (Emmorey, 2002). It is argued that "locations in signing space [in ASL] used to express spatial relationships are mapped in an analogue manner to the physical locations of objects in space" (Emmorey & Herzig, 2003, p. 242). Third, the Jackendovian conceptual structures assume that path and place functions are linked to linguistic categories. However, according to Emmorey and Herzig (2003), it appears that "signers interpret locations in [ASL] signing space with respect to nonlinguistic spatial category boundaries, rather than with respect to linguistic categories" (p. 243). There is also evidence for such observations in other sign languages. For example, Perniss (2007) argues that "DGS [German Sign Language] signers do rely to a large extent on the iconic properties of classifier predicates to encode location, orientation, and number of referents, and on the properties of sign space to create "isomorphic" representations of real-space scenes in signing space" (p. 240). Liddell (2000, 2003) argues further that gestural and linguistic information in signed spatial expressions are fused together. Last, but not least, Talmy (2006) compared sign and spoken language spatial structures and pointed out sharp distinctions. According to him, on the one hand, as shown in the Jackendovian perspective, spoken languages have a universally available inventory of spatial relationals. On the other hand, "… signed language can mark finer distinctions with its inventory of more structural elements, more categories, and more elements per category. It represents many more of these distinctions in any particular expression. It also represents these distinctions independently in the expression, not bundled together into prepackaged schemas. And its spatial representations are largely iconic with visible spatial characteristics" (p. 208).

2.4.6. Summary

In sum, this section presented studies that challenge theoretical approaches, such as the Jackendovian Conceptual Structures, that account for spatial language by using a universal categorical system. The main focus was on the use of dynamic predicates in referring to static situations, the nonuniversality of place functions (in, on, at, left, right, etc.), motion event descriptions, and sign language research. Recent studies on spoken and sign languages have argued that there is considerable variation in spatial language across languages and modalities. In the following section, I propose a new hypothesis that can perhaps account for these challenges.

2.5. The Crossmodal Spatial Language Hypothesis (CSLH)

In light of the findings from previous studies, it became clear to me that the present crosslinguistic and crossmodal study needed further elaborations on the Jackendovian perspective. In this dissertation, therefore, I propose the Crossmodal Spatial Language Hypothesis (CSLH) to account for static/dynamic spatial relations and locatives, as well as temporal relations. In making this hypothesis, I adapted the four criteria⁵ of Occam's Razor as given in Culicover and Jackendoff (2005, p. 4):

(i) Minimize the distinct components
(ii) Minimize the class of possibilities
(iii) Minimize the distinct principles
(iv) Minimize the amount of structure

⁵ They referred to 'grammar' but, here, the criteria are overgeneralized to the theory of language.

Following these criteria, here I attempted to formalize every component of spatial language (i.e., SR, TR, RF, CS, and LR). In doing so, the possible parameters for each component are presented. For example, RF has only two parameters, and both are available in any language at any given time, regardless of modality. In order to minimize the structure, the parameters of a single component were not replicated in any other component. Hence, for example, reference frames must be encoded in RF but not necessarily in LR; therefore, LR does not have a reference frame component.

2.5.1. CSLH for Spatial Static Situations

For the spatial static situations (topological and angular relations), the SR will contain 3-dimensional entities. The RF will permit allocentric and egocentric reference frames. The TR will have a single region, represented as t. The CS will contain two sets of constructions, one for each entity in SR, providing the following information about those entities: BE, ORIENT, TOWARD, AT(LOC). The LR will impose four conditions, which must be fulfilled in order for there to be a well-formed linguistic description of the static situation in whatever language might be used. (21) gives the CSLH representation for two objects in a spatial arrangement.

(21)
SR:  α: x1, y1, z1; β: x2, y2, z2
RF:  allocentric / egocentric
TR:  t
CS:  for α: ORIENT ([…], TOWARD ([…] / [γ])), BE AT ([LOCi])
     for β: ORIENT ([…], TOWARD ([…] / [δ])), BE AT ([LOCi / j])
LR:  • static / dynamic predicates
     • must encode either AT or ORIENT or both
     • perspective is optional
     • can modify BE and fillers: Thing and LOC.
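For concreteness, the standalone sketch below encodes the schema in (21) together with a check for the LR condition that a description encode AT, ORIENT, or both; all names are hypothetical glosses of the schema rather than an implementation of CSLH.

```python
from dataclasses import dataclass, field

@dataclass
class StaticCSLH:
    """The representations posited between input and message for a
    static spatial arrangement of two entities (alpha, beta)."""
    sr: dict                                 # 3-D coordinates per referent
    rf: str                                  # 'allocentric' or 'egocentric'
    tr: tuple = ('t',)                       # a single region: static situation
    cs: dict = field(default_factory=dict)   # BE-AT and ORIENT per referent

def lr_well_formed(entity_cs):
    """LR condition: the message must encode AT or ORIENT or both."""
    return bool(entity_cs.get('AT')) or bool(entity_cs.get('ORIENT'))

layout = StaticCSLH(
    sr={'alpha': (0, 0, 0), 'beta': (1, 0, 0)},
    rf='egocentric',
    cs={'alpha': {'AT': 'LOC_i', 'ORIENT': ('TOWARD', 'beta')},
        'beta':  {'AT': 'LOC_j', 'ORIENT': ('TOWARD', 'delta')}})

print(lr_well_formed(layout.cs['alpha']))  # True: encodes both AT and ORIENT
print(lr_well_formed({}))                  # False: encodes neither
```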

According to (21), SR represents two objects, α and β. The 3-D specifications are x1, y1, z1 for the referent α, and x2, y2, z2 for the referent β. For the sake of simplicity, this information was put together as α: x1, y1, z1 and β: x2, y2, z2. These notations indicate that the two objects were recognized and identified as two distinct entities and that they occupy two distinct coordinates with two distinct object boundaries. SR rules out that the two referents are identical (22a) and that the two referents occupy the same 3-D space (22b). But there could possibly be multiple referents as in (22c), which falls outside the current study.

(22)
a. * SR: α: x1, y1, z1; α: x2, y2, z2
b. * SR: α: x1, y1, z1; β: x1, y1, z1
c.   SR: α: x1, y1, z1; β: x2, y2, z2; γ: x3, y3, z3

RF represents the two available frames: egocentric and allocentric. This indicates that any static situation can be represented in three ways. First, it is possible to align an allocentric reference frame by focusing on the intrinsic properties of the referents, α and β, as in 'the cat(α) is in front of the dog(β)' and 'the castle(α) is to the north of the lake(β)' in English. Second, it is also possible to align an egocentric reference frame by focusing on the relative positioning of the representation bearer in addition to the relation between α and β, as in 'the cat(α) is to the left of the dog(β)' in English (although ambiguous, the 'leftness' of the English speaker is projected onto the dog). Third, it is also possible to align both egocentric and allocentric reference frames by focusing on the intrinsic properties of the referents as well as the relative positioning of the representation bearer. In English, this could be shown in 'the cat(α) on the left is in front of the dog(β)'. TR represents the time at which the situation took place. If the situation is static, TR has t only. Thus, the given static spatial relationship is time-bound. This representation rules out the following situations: the same thing occupying two different coordinates at the same time (23a), the same thing occupying the same coordinate at two different times (23b), and two different things occupying the same coordinates in space at two different times (23c).

(23)
a. * α: x1, y1, z1 and α: x2, y2, z2 at t
b. * α: x1, y1, z1 at t1 and α: x1, y1, z1 at t2
c. * α: x1, y1, z1 and β: x2, y2, z2 at t1 and α: x1, y1, z1 and β: x2, y2, z2 at t2

CS has BE as a situation function, which is filled with a place function AT and a path function ORIENT. For the representations of static situations, there is no need to specify the function types; thus, they are omitted. AT is filled with LOC, which indicates the two different locations, i and j, but does not carry any vectoral information, because that is already given in SR. As Thing fillers, the names of the concepts, such as TRUCK, HORSE, and PLANE, are used. Moreover, Things have superscripts such as α, β, γ, and δ for the co-referents given in SR. Whereas BE is filled with the place function AT, ORIENT is filled with the path function TOWARD.

Notice that the CS template does not have to encode linguistic relations such as in, on, at, under, next to, beside, in front of, behind, left, right, front, and back. On the one hand, for topological relations such as in, on, and at, both α and β are at LOCi. In this way, the only information CS encodes is that the two referents α and β occupy the same region (but not the same coordinate). This representation can account for all of the spatial relations in which the two referents happen to be in the same region. These relations, in English for example, can be carried in the message by prepositions such as in, on, at, next to, beside, and near. In ASL, for example, these relations can be expressed spatially without using any lexical items. On the other hand, for angular relations such as left/right and front/back, α is at LOCi while β is at LOCj. Thus, for the angular relations, CS encodes that the two referents α and β occupy two different regions.

As for orientations, the referent α can face either β or another unknown referent, represented as γ (e.g., another referent or direction). Likewise, the referent β can face either α or δ (another unknown referent or direction). What these combinations capture is the following:

(24)
a. α face β and β face α → ‘facing each other’
b. α face β and β face δ → ‘facing the same direction’
c. α face γ and β face α → ‘facing the same direction’
d. α face γ and β face δ → ‘facing different directions’, including exactly opposite directions

As for orientations, CS rules out the possibility that a thing faces itself, as given in (25).

(25)
* α face α
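The facing combinations in (24) and the constraint in (25) can likewise be rendered as a small decision procedure. The names below (classify_facing, the gamma/delta placeholders) are hypothetical conveniences for illustration only.

ALPHA, BETA, GAMMA, DELTA = 'alpha', 'beta', 'gamma', 'delta'

def classify_facing(alpha_faces, beta_faces):
    """Map the facing targets of the two referents to the readings in
    (24). alpha may face beta or some other referent/direction (gamma);
    beta may face alpha or some other referent/direction (delta)."""
    if alpha_faces == ALPHA or beta_faces == BETA:
        raise ValueError("a thing cannot face itself; see (25)")
    if (alpha_faces, beta_faces) == (BETA, ALPHA):
        return 'facing each other'                 # (24a)
    if (alpha_faces, beta_faces) in [(BETA, DELTA), (GAMMA, ALPHA)]:
        return 'facing the same direction'         # (24b), (24c)
    return 'facing different directions'           # (24d), incl. opposite

assert classify_facing(BETA, ALPHA) == 'facing each other'
assert classify_facing(GAMMA, DELTA) == 'facing different directions'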

LR encodes syntax, modifications, and information structure, which are idiosyncratic to the given language. CSLH predicts that there is no one-to-one obligatory mapping between the input and the message, because of the multiple representations that intervene. For example, the hypothesis predicts that there need not be six different linguistic structures for six different spatial inputs. Nonetheless, LR should have a static (‘be’, ‘exist’, ‘have’) and/or a dynamic predicate (‘sit’, ‘look’, ‘stand’, ‘go’, ‘face’, etc.) to correspond to the static situation. With respect to the predicates, the relationship between CS and LR is given below in (26).

(26)
CS: BE → LR: be, exist, have, sit, look, stand, go, face, walk, …

Additionally, LR can modify the situation on the basis of the representation bearer’s pragmatic knowledge or inferences. For example, BE can be modified with adverbials (‘sitting quietly’) in LR but in none of the other representations. LR can also modify Things, again on the basis of the representation bearer’s pragmatic knowledge or inferences. For example, Thing can be modified with adjectivals (‘a blue truck’, ‘two trucks’), but none of the other representations can encode this information. Finally, LR should have correspondences to AT and/or ORIENT.

2.5.2. CSLH for Spatial Dynamic Situations

For spatial dynamic situations (motion events and causation), the SR will contain 3-dimensional entities. The RF will permit allocentric and egocentric reference frames. The TR will have regions, represented as t1-to-t2, t2-to-t3, etc. The CS will contain two sets of constructions, one for each entity in SR, providing the following information about those entities: GO, BE, ORIENT, TOWARD, AT(LOC). The LR will impose four conditions, which must be fulfilled in order for there to be a well-formed linguistic description of the dynamic situation in whatever language might be used. (27) gives the CSLH representation for two objects in a spatial arrangement with motion.

(27)
SR: α(x1, y1, z1)   β(x2, y2, z2)
RF: allocentric / egocentric
TR: t1-to-t2, t2-to-t3, …
CS: ORIENT([…]α, TOWARD([…]β / γ)) GO BE AT([LOCi])
    ORIENT([…]β, TOWARD([…]α / δ)) {GO} BE AT([LOCi / j])
LR:
• static / dynamic predicates
• must encode either AT or ORIENT or both
• perspective is optional
• can modify BE and fillers: Thing and LOC.

As with the static situations, SR for the dynamic situations represents the two objects α and β. The 3-D specifications are (x1, y1, z1) for referent α and (x2, y2, z2) for referent β. Again, this information is written as α(x1, y1, z1) and β(x2, y2, z2). These notations indicate that the two objects are recognized and identified as two distinct entities and that they occupy two distinct coordinates with two distinct object boundaries at given times: t1 and t2. As in the static case, the two referents cannot be identical (28a), and the two referents cannot occupy the same 3-D space (28b). But there can be multiple referents, as in (28c), which is outside the current study.

(28=22)
a. * SR: α(x1, y1, z1), α(x2, y2, z2)
b. * SR: α(x1, y1, z1), β(x1, y1, z1)
c. SR: α(x1, y1, z1), β(x2, y2, z2), γ(x3, y3, z3)

As with the static situations, the RF for the dynamic situations represents the two available frames: egocentric and allocentric. This indicates that any dynamic situation can be represented in the following three ways:

i) allocentric alignment, by focusing on the intrinsic properties of the referents (α and β), as in ‘the cat went toward the front of the dog’ in English;

ii) egocentric alignment, by focusing on the relative positioning of the representation bearer in addition to the relation between α and β, as in ‘the cat on the left went toward the dog’ in English (although ambiguous, the ‘leftness’ of the English speaker is projected onto the dog);

iii) alignment of both egocentric and allocentric reference frames, by focusing on the intrinsic properties of the referents as well as the relative positioning of the representation bearer, as in the English phrase ‘the cat on the left went toward the front of the dog’.

TR represents the times at which the situation takes place. Since the situation is dynamic, TR has at least one time interval, formalized as t1-to-t2. Thus, a dynamic situation must involve change over time. Repeating the conditions from the TR for the static situations, the TR here rules out situations such as the same thing occupying two different coordinates at the same time (29a), the same thing occupying the same coordinate at two different times (29b), or two things occupying the same coordinates in space at two different times (29c).

(29=23)
a. * α(x1, y1, z1) and α(x2, y2, z2) at t
b. * α(x1, y1, z1) at t1 and α(x1, y1, z1) at t2
c. * α(x1, y1, z1) and β(x2, y2, z2) at t1 and α(x2, y2, z2) and β(x1, y1, z1) at t2

While CS for the static situations has BE as a situation function, CS for the dynamic situations has GO as a situation function. GO, in turn, is filled with BE as another situation function. BE takes a place function AT and a path function ORIENT. AT is filled with LOC, which indicates the two different locations, i and j, but does not carry any vectoral information, which is already given in SR. Things are filled with the names of the concepts, such as COW and PIG. Moreover, Things have superscripts such as α, β, γ, and δ for the co-referents. Whereas BE is filled with the place function AT, ORIENT is filled with the path function TOWARD. Crucially, the CS template does not encode dynamic linguistic relations such as to, toward, pass by, away, and cause. As for orientations, on the one hand, the referent α can face either β or another unknown referent, represented as γ (e.g., another referent or direction). On the other hand, the referent β can face either α or δ (another unknown referent or direction). What CS rules out is a thing facing itself, moving to itself, or being both stationary and dynamic at a given time interval.

(30)
a. * GO α(x1, y1, z1) face α(x1, y1, z1), t1-to-t2
b. * GO α(x1, y1, z1) to α(x1, y1, z1), t1-to-t2
c. * GO α(x1, y1, z1) & BE α(x1, y1, z1), t1-to-t2

Notice that the CS template has a GO function for the referent α but an optional GO function for the referent β. This corresponds to a situation in which one object is in motion while the other is stationary. On the one hand, the referent with the GO function, α, can move into the same region as the stationary referent. For example, an object can be in motion inside another object when the LOCs are the same for the two referents. On the other hand, when the LOCs are two distinct regions, the referent with the GO function, α, moves either to/toward or away from the stationary referent β. Both α and β can have the GO function when they are in motion at a given time interval. When the LOCs are the same for the two referents, an object can be in motion inside the other moving object. When the LOCs are two distinct regions, the two objects move either to/toward or away from each other. What these combinations capture is the following:

(31)
a. GO α face β, BE β from t1-to-t2 → α goes to/toward β
b. GO α face γ, BE β from t1-to-t2 → α goes away from β
c. GO α face β, BE β from t1-to-t2 & GO α face γ, BE β from t2-to-t3 → α passes by β
d. GO α face β, BE β from t1-to-t2 & GO α face β, GO β face δ from t2-to-t3 → α causes β to move and α continues moving
e. GO α face β, BE β from t1-to-t2 & BE α face β, GO β face δ from t2-to-t3 → α causes β to move and α stops moving
f. GO α face β, GO β face α from t1-to-t2 & GO α face γ, GO β face δ from t2-to-t3 → α and β hit each other, then go in different directions, including ‘going back’
g. GO α face β, GO β face α from t1-to-t2 & GO α face γ, GO β face γ from t2-to-t3 → α and β hit each other, then go in the same direction
h. GO α face β, GO β face α from t1-to-t2 → α and β go toward each other
i. GO α face γ, GO β face δ from t1-to-t2 → α and β go in different directions, including opposite directions
j. GO α face γ, GO β face γ from t1-to-t2 → α and β go in the same direction, or in different directions but not opposite directions
k. GO α face β, GO β face α from t1-to-t2 & GO α face γ, GO β face δ from t2-to-t3 → α and β pass by each other

Notice that these combinations carry no specifications for LR-specific encodings. For example, CS does not distinguish ‘go’ from ‘cause’: while ‘go’ is obligatorily encoded, ‘cause’ is not. Nor does CS make any distinctions among path information such as ‘to’, ‘toward’, ‘away from’, and ‘pass by’. Additionally, the situation modification ‘manner’ is not encoded in CS, since manner is not obligatorily encoded in and across languages and is outside the scope of this dissertation. As with the static situations, LR encodes syntax, modifications, and information structure, which are all idiosyncratic to the language. Nonetheless, LR should have a dynamic predicate (‘go’, ‘move’, ‘walk’, ‘cause’, ‘hit’, etc.) to correspond to the dynamic situation. GO and BE can be modified with adverbials (‘moving slowly’, ‘hopping’, ‘sitting quietly’, ‘waiting’, etc.) in LR but nowhere else.

(32)
CS: GO → LR: go, walk, move, cause, hit, …
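To illustrate how the interval-based combinations in (31) yield distinct event readings without any LR-specific vocabulary, the sketch below classifies a motion event from the GO/BE predicates and facing values assigned to α and β at each interval. The encoding and the function name are my own illustrative choices, not part of CSLH.

def classify_event(interval1, interval2=None):
    """interval = ((alpha_pred, alpha_face), (beta_pred, beta_face)),
    where pred is 'GO' or 'BE' and face is 'beta'/'alpha' (the other
    referent) or 'gamma'/'delta' (some other referent or direction)."""
    (a_pred, a_face), (b_pred, b_face) = interval1
    if interval2 is None:
        if (a_pred, b_pred) == ('GO', 'BE'):
            return ('alpha goes to/toward beta' if a_face == 'beta'
                    else 'alpha goes away from beta')          # (31a), (31b)
        if a_face == 'beta' and b_face == 'alpha':
            return 'alpha and beta go toward each other'       # (31h)
        return 'alpha and beta go in other directions'         # (31i), (31j)
    (a_pred2, a_face2), (b_pred2, b_face2) = interval2
    if b_pred == 'BE' and b_pred2 == 'GO':
        # beta is set in motion: a causative reading
        return ('alpha causes beta to move and keeps moving'   # (31d)
                if a_pred2 == 'GO' else
                'alpha causes beta to move and stops moving')  # (31e)
    if (a_pred, b_pred) == ('GO', 'BE') and a_face2 == 'gamma':
        return 'alpha passes by beta'                          # (31c)
    return 'alpha and beta hit or pass by each other'          # (31f), (31g), (31k)

# (31a): alpha moves toward beta while beta stays put.
print(classify_event((('GO', 'beta'), ('BE', 'alpha'))))

Note that, as in (31), the sketch never needs primitives like ‘cause’ or ‘toward’; such readings fall out of the predicate-interval combinations.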

2.6. Summary

Clearly, the Jackendovian SR-CS-LR hypothesis falls short of completely representing spatial language, as shown by recent studies on spoken and sign languages. As an alternative, I propose the Crossmodal Spatial Language Hypothesis (CSLH), which takes into account variations in spatial language across languages and modalities. The following chapters test and provide supporting evidence for CSLH.


CHAPTER 3. METHODOLOGY

This chapter outlines the methodology used in this dissertation. In section 3.1, the justification for the methodology is given, followed by information about the studied languages (section 3.2). Brief information about the participants of the study is given in section 3.3. Afterwards, the data collection process is summarized (section 3.4). Section 3.5 presents the research questions of the current study. Finally, the details of the tests used are discussed in section 3.6.

3.1. The Justification for the Methodology

The same mixed methodology was used to gather data from seven different languages. I had two main reasons for using this mixed methodology. First, the model detailed in the previous chapter (especially section 2.1) assumed that input, representations, and message are three independent modules, which interface with each other at the time of the utterance. It is desirable to have control over, and to manipulate, at least one part of this model in order to see its effects on the others. I chose the input, i.e., the stimulus that triggered both representation and message, in part because it was easier to manipulate. Thus, the message (the response, here the descriptions) was restricted. The structure of the representations was predicted by the Crossmodal Spatial Language Hypothesis and inferred from the indirect interaction between input and description.

Second, since this study was, to my knowledge, the first attempt to investigate crosslinguistic and crossmodal ways of representing spatial relations, I believed that the use of an experimental methodology would give better results.

3.2. Study Languages

In this dissertation, aspects of spatial language were investigated by focusing on data gathered from signed languages (Turkish Sign Language (TID), Croatian Sign Language (HZJ), American Sign Language (ASL), and Austrian Sign Language (ÖGS)) and spoken languages (Turkish, Croatian, and English). Crucially, these languages belong to typologically different families, and none of them are dialects of one another. These languages were chosen for the following reasons. Since I am a native speaker of Turkish, I can use my intuitions and insights in addition to the data from the Turkish speakers. I chose TID since I studied this language for my undergraduate projects and master’s thesis (Arik, 2003). I believe that comparing the results from these two languages can provide new insights for the study of spatial language, and for sign linguistics in particular. I also included data from the three other sign languages (HZJ, ASL, and ÖGS) since my project was funded in part by an NSF grant under which the structure of these languages has been studied. I have greatly benefited from this grant and its team members, who contributed substantially to the knowledge of the structure of those languages. In addition, I included these sign languages to show that, contrary to common belief, there is no single uniform structure within the visual-gestural modality. I also included the spoken languages (Turkish, Croatian, and English) to show that the properties, though not necessarily the overt structures, used in the sign languages were similar to those in the spoken languages. These findings are predicted by both the Jackendovian approach and the Crossmodal Spatial Language Hypothesis (there must be representations between input and message).

I prepared all of the stimuli, designed the research, and helped collect the Turkish, TID, American English, and ASL data with the help of native users of each language. My collaborators collected the Croatian, HZJ, and ÖGS data. In each session, the addressee, who was not the experimenter, was a native user of the respective language.

3.3. Participants

The sign language data came from a total of fourteen Deaf TID signers (7 females and 7 males, age range = 18-50), eleven Deaf HZJ signers (8 females and 3 males, age range = 21-54), thirteen Deaf ASL signers (7 females and 6 males, age range = 25-42), and eight Deaf ÖGS signers (5 females and 3 males, age range = 24-40). The spoken language data came from thirty-three native Turkish speakers (22 females and 11 males, age range = 21-52), ten native Croatian speakers (5 females and 5 males, age range = 26-53), and ten American English speakers (8 females and 2 males, age range = 18-25). None of the subjects reported having participated in a similar experiment before. All of the speakers were native users of their languages. The TID, HZJ, and ÖGS signers were native signers (second, third, and fourth generation Deaf); similarly, the majority of the ASL signers were native signers. All of the signers attended schools for the deaf in their countries and were active members of their deaf communities.

The participants from the same country were also selected from the same city or geographical region of that country. The twelve Turkish participants who took part in the experimental sessions with pictures and movies, and the Deaf TID signers, were from the city of Izmir, the location of the first school for the deaf with mixed education (sign and speech until 1953). Reportedly, the ancestors of the TID participants received their education at that school. The use of ‘signs’ in schools for the deaf was forbidden in Turkey from 1953 until recently, when the Turkish parliament passed a new act for ‘disabled citizens’ that included improvements for the use of TID in many settings.6 Similarly, all of the participants from Croatia were from Zagreb. All of the American participants were from Indiana. All of the Deaf ASL signers received their education at the School for the Deaf in Indianapolis.

3.4. Data Collection

The data were collected under the supervision of Prof. Ronnie Wilbur with funding in part from an NSF grant (BCS-0345314) and a grant from the Linguistics Program at Purdue University. The fourth and final year of my study was supported by the Bilsland Dissertation Fellowship, given by the Purdue Graduate School to outstanding Ph.D. candidates. The dates of the data collections were as follows: the Turkish data in Summer 2007, the TID data in Summer 2006 and 2007, the ASL data in Fall 2007 and Spring 2008, the American English data in Fall 2008, and the Croatian, HZJ, and ÖGS data in 2006 and 2007. I was present for the Turkish, TID, English, and ASL data collection sessions. Neither I nor the other experimenters acted as an addressee in any of the data collection sessions. The addressee, instead, was a native user of the respective language.

6 More information on TID can be found in Ozyurek, Ilkbasaran, and Arik (2005).

The data were collected at the participants’ homes or at a convenient location such as a laboratory or a deaf club. The same procedure was applied in all of the experiments and with all of the participants. The directions were given in the participant’s native language. At the beginning of each session, the participants signed the consent forms and were paid for their participation. The recording room was arranged as follows. A digital camera, a laptop, a small table for the laptop, and two chairs for the participant and the addressee were prepared beforehand. The laptop on the small table was placed in front of the participant, who had a clear view of it. The participant, the addressee, and the camera were aligned on a straight line, but the laptop was placed slightly to the left or right side of the participants so as not to obscure their view (see Figure 3.1).

Figure 3.1. The arrangement in the recording room.

Each participant was requested to look at the pictures very carefully and to describe what they saw to the addressee. The participants were told that their descriptions should be ‘understood’ by the addressee, who did not see the stimuli.7 The experimenter made clear that the tasks were not about memory or intelligence but were designed to compare the descriptions from users of various languages. The participants were also told that there was no right or wrong description of the situations in the pictures and movies. While looking at the addressee, each participant described the event. There was no trial session, but the very first items were control items and, as such, were not analyzed. The entire data collection session, including warm-up tasks, lasted less than an hour. When asked, a stimulus was shown more than once. After the sessions, the participants were asked to give feedback about the tasks. When requested, further information about the project was given by the experimenter; for example, when asked, observations from the data collected were shared with the participant.

7 The addressees did not see the stimuli before or during the sessions.

3.5. Research Questions

This dissertation is an attempt to describe how sign and spoken languages represent space in their linguistic systems. It presents a new hypothesis, CSLH, in which SR, RF, TR, CS, and LR are formalized in order to understand the similarities and differences found in languages from two different modalities. The research questions I focused on are as follows:

(1) How do both sign and spoken languages represent spatial static situations?
(2) How do both sign and spoken languages represent spatial dynamic situations?
(3) Does modality, i.e., visual vs. auditory, have an effect on the way language users represent spatial relations?
(4) Do languages differ from each other in the spatial domain?
(5) How is spatial language, as used in representing locational and motion events, related to other domains of language such as temporals?

In answering these research questions, I tested CSLH, inspired by the works of Fodor, Jackendoff, and Slobin. The hypothesis predicts that none of the languages would directly and obligatorily represent all of the salient features of the spatial input in their structures. To put it differently, the information in the linguistic message would not directly correspond to the information in the spatial input. CSLH further predicts that the characteristics of the multiple representations, SR, RF, TR, CS, and LR, are responsible for these possible non-correspondences. CSLH suggests that the properties of SR, RF, TR, and CS are sufficient to explain the spatial language similarities across the sign and spoken modalities. In order to test this prediction, I analyzed the data from the experiments in detail. I also showed how the parameters are aligned online at the time of utterance. The model further predicts that even though the properties of LR might be similar across languages, those properties have a great influence on the message, not only in the spatial domain but also across linguistic domains such as temporals. In order to test this prediction, I also elicited cross-sectional data for a distributional analysis of temporals and relied on my own insights. The hypotheses can be summarized as follows.

Hypothesis 1: Languages do not directly and obligatorily encode all of the spatial features of a spatial relation, due to the multiple representations that intervene between the spatial input and its linguistic description.

Hypothesis 2: The properties of the representations are the same across languages and modalities.

These two hypotheses are my expectations on the basis of the Crossmodal Spatial Language Hypothesis (CSLH) given in chapter 2 (especially section 2.5). In order to test these hypotheses, several tasks were designed. These tasks are given in the following section.

3.6. Tasks

There are many variables that can affect the linguistic message about the spatial arrangements of objects. Talmy (2000, p. 241) proposed that the uses of spatial relationals are affected by several factors with respect to a particular spatial configuration: assigning a figure and a ground; the geometries of the figure and the ground; the (a)symmetrical relation between the figure and the ground; the orientation of the figure with respect to the ground; the presence or absence of contact between the figure and the ground; the presence or absence of further reference objects; further embeddings of one figure-ground configuration within another; the adoption of a perspective point from which to regard the configuration; and change in the location of the figure or the perspective point through time (motion). I took these factors into account when designing the tasks, in order to understand spatial language, to seek answers to the research questions, and to test the hypotheses given in the previous chapter.

In order to elicit data on descriptions of spatial situations, I created pictures and movies in which small toys (dolls, planes, trucks, and animals) were placed in various spatial arrangements. The background color was white-gray or blue. The background and the shadows of the objects were kept to give a 3-D impression. The participants were asked to describe the pictures and movies. All descriptions were video-recorded for analysis using a digital camera, and the video recordings were exported to an Apple computer. Quantitative and qualitative analyses were made of the descriptions of the pictures and movies. In each testing item, there were two objects. Figure 3.2 summarizes the possible object locations and the terms used in this study.

[Figure 3.2 is a schematic from a bird’s-eye perspective: Obja on the left, Objb on the right, Objc at the front (= distal), and Objd at the back (= proximal).]

Figure 3.2. Object locations and axes from a bird’s eye perspective.

In the following, the tasks8 are presented in two sections: the tasks for spatial static situations in section 3.6.1 and the tasks for spatial dynamic situations in section 3.6.2.

8 I was inspired by similar studies done on other spoken languages, for example, the Man-and-Tree task used in Pederson, et al., 1998. I was also inspired by the elicitation materials used in the crosslinguistic research projects that I was involved in as a student and as a research assistant, for example, Asli Ozyurek’s projects “Crosslinguistic spatial cognition and language” (e.g., Ozyurek, et al., 2005) and “The development of spontaneous gesture systems in deaf children in four cultures” (e.g., Ozyurek, et al., submitted). However, the experimental designs in my dissertation are unique in their characteristics.

3.6.1. Tasks for Spatial Static Situations

One of the goals of this dissertation was to analyze the data from descriptions of spatial static relations in a table-top space crosslinguistically and crossmodally. Static situations can be divided into two types: angular relations (left-right, front-back) and topological relations (in, on, at, next to, etc.) (Levinson, 2003). Accordingly, two experiments were designed: Experiment 1 targeted descriptions of situations with angular relations, and Experiment 2 targeted those with angular-topological relations. Two additional tasks were also developed. Elicitation Task 1 targeted the descriptions of exactly the situations used in Experiment 1; here the addressees were expected to retell the descriptions. Elicitation Task 2 aimed at descriptions of situations with topological relations.

3.6.1.1. Experiment 1 and Elicitation Task 1

CSLH predicts that the spatial input may not be echoed in the message. Specifically, the hypothesis for this section was that the spatial features of the spatial angular input, i.e., the relative positions of the objects and their relative orientations, were not directly and obligatorily carried in the message. In order to test this hypothesis, Experiment 1 was used to elicit descriptions of static angular relations. A total of sixty-two people (native users: 10 TID, 10 HZJ, 10 ASL, 4 ÖGS, 8 Turkish, 10 English, 10 Croatian) participated in this study, signed the consent forms, and were paid for their participation. The task lasted about 2 minutes on average per participant.

The factorial design for this task was 2x3. The first factor was positioning, with two levels (objects located on the lateral (left-right) vs. the sagittal (front-back) axis); the second factor was facing, with three levels (objects facing each other vs. facing the same direction vs. facing different directions (exactly opposite directions)). There were twelve stimuli, which were randomly ordered. Six of them were testing items; the other six were control items. Figure 3.3 presents all six testing items. Thus, for example, in Figure 3.3a the two trucks are on the lateral axis facing different directions, while in Figure 3.3b the horse and the cow are on the sagittal axis facing each other.

Figure 3.3. The testing items for Experiment 1.

The participants looked at the pictures and described them to the addressee (a native user of the same language). Each description was coded according to axial, locational, orientational, and situational information. Let me explain the coding with an example. Consider Figure 3.3b, in which the horse in the distal region and the cow in the proximal region are face-to-face. The spatial features are: Axis = sagittal; Location = distal (horse) and proximal (cow); Orientation = face-to-face; Situation = static (since the horse and the cow are stationary). Suppose that a description of this stimulus included correct information for all four features; then its score was 4. When a description had correct information for three of the four features, its score was 3, and so on. For example, ‘the horse and the cow are face to face’ received a 2, since it gave the orientation and situation information correctly but did not mention which axis the horse and the cow were on or where they were located.
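This coding scheme amounts to simple feature matching. The sketch below is an illustrative reconstruction of the scoring procedure, not the actual coding instrument used in the study; the dictionary keys and function name are hypothetical.

FEATURES = ('axis', 'location', 'orientation', 'situation')

def score_description(stimulus, description):
    """Return 0-4: one point per spatial feature of the stimulus that
    the description encodes correctly; unmentioned or wrong features
    earn no point."""
    return sum(1 for f in FEATURES if description.get(f) == stimulus[f])

stimulus = {'axis': 'sagittal',
            'location': ('distal horse', 'proximal cow'),
            'orientation': 'face-to-face',
            'situation': 'static'}
# 'The horse and the cow are face to face': orientation and situation
# are correct; axis and location are unmentioned, so the score is 2.
description = {'orientation': 'face-to-face', 'situation': 'static'}
print(score_description(stimulus, description))  # -> 2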

Using the materials from Experiment 1, for Elicitation Task 1 the addressees were asked to retell the descriptions. The aim here was to make sure that the viewers described the pictures naturally, in addition to exploring the addressees’ spatial language.

3.6.1.2. Experiment 2

As in Experiment 1, the hypothesis for Experiment 2 was that the spatial features of the spatial angular-topological input, i.e., the relative positions of the objects and their relative orientations, were not directly and obligatorily carried in the message. The main difference between Experiment 1 and Experiment 2 was the distance between the two objects in the stimuli: in Experiment 2, but not in Experiment 1, the objects were close to each other, almost touching. A total of fifty-six people (native users: 7 TID, 8 HZJ, 8 ASL, 7 ÖGS, 8 Turkish, 8 English, 10 Croatian) participated in this study, signed the consent forms, and were paid for their participation.9

The factorial design for Experiment 2 was 2x2. The first factor was positioning, with two levels: objects were located on the lateral (left-right) vs. the sagittal (front-back) axis. The second factor was facing, with two levels: objects faced the same direction vs. different directions (exactly opposite directions). There were a total of twenty-six items, which were again randomly ordered. The first two items were not analyzed. Another four items were testing items for Experiment 2, another six were for Elicitation Task 2, and the remaining sixteen pictures were control items. Figure 3.4 shows the testing items. As shown in Figure 3.4, there were two object location arrangements: on the left-right axis, as in (a) and (b), and on the front-back axis, as in (c) and (d). In addition, there were several arrangements of object orientations: for example, in (a) and (c) the objects faced the same direction; in (b) the objects faced each other; and in (d) the objects faced different directions.

9 Most of these people also participated in Experiment 1.

[Panels: (a) Lateral-Same; (b) Lateral-Diff.; (c) Sagittal-Same; (d) Sagittal-Diff.]

Figure 3.4. The testing items in Experiment 2.

The participants looked at the pictures and described them to the addressee (a native user of the participant’s native language). As in Experiment 1, there were four measures for this test: axial, locational, orientational, and situational. When the description corresponded to the information in the stimulus according to the four measures, it scored 4. Suppose that an English speaker said, “The female and the male figures are next to each other” for Figure 3.4a. This description does not give any information about the axes and orientations of the figurines, yet it specifies their locations (by saying next to each other) and the static situation correctly; thus, it scored 2. One could describe the same picture as, “The female and the male figures are going left.” This description received a score of 1, since it only identified the orientations of the dolls correctly. There was no axial information. The locations were unclear, because the dolls could be located laterally, with the female following the male, or sagittally but apart from one another. The situation was incorrect, since the testing item was a picture of a static situation, yet the description involved a dynamic predicate, are going.

3.6.1.3. Elicitation Task 2

In designing Elicitation Task 2, in addition to Levinson (2003), I also followed Jackendoff’s feature geometry (Jackendoff, 1990, pp. 112-122). He argued that there are two types of static situations: a) pure location, where contact information is unmarked (no contact), and b) pure contact, where contact information is marked (contact), including attachment. Thus, in Elicitation Task 2, I elicited descriptions of pictures in which the two objects were physically in contact with each other. The participants of Experiment 2 (native users: 7 TID, 8 HZJ, 8 ASL, 7 ÖGS, 8 Turkish, 8 English, 10 Croatian) also participated in this study, signed the consent forms, and were paid for their participation. The session lasted about 4 minutes on average. This task aimed to test whether contact information was obligatorily encoded in linguistic descriptions of topological relations. I created six situations: three of them for in relations, shown in Figure 3.5a, b, and c, and the other three for on relations, given in Figure 3.5d, e, and f. Note that in all testing items, the two objects were in contact with each other. The participants looked at the pictures and described them to the addressee (a native user of the participant’s native language).

Figure 3.5. The testing items for the topological relations.

3.6.1.4. Analysis

For Experiments 1 and 2, both quantitative and qualitative analyses were made of the data gathered from TID, HZJ, ASL, ÖGS, Turkish, English, and Croatian. For the quantitative analysis, descriptive statistics were given for each language in order to show that the data did not always include information about the salient spatial features. The General Linear Model with Repeated Measures (GLM) was applied to see whether the factors (positioning and facing) had an effect on the way the participants described the topological-angular relations. GLM was also applied to compare the languages, to see whether one’s language made a difference in the way one specified the spatial features measured in Experiment 2. The data were also analyzed qualitatively to show the patterns in the descriptions, which also demonstrated how the hypothesis accounted for the similarities and differences crosslinguistically and crossmodally.
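For readers who wish to reproduce this kind of analysis, the sketch below shows one way to run the repeated-measures test on the scores. It assumes a long-format table with hypothetical column names and is offered as an equivalent of, not a record of, the analysis actually performed.

import pandas as pd
from statsmodels.stats.anova import AnovaRM

# One row per participant x condition: the score (0-4) earned in that
# cell of, e.g., the 2x3 (positioning x facing) design of Experiment 1.
df = pd.read_csv('exp1_scores.csv')  # columns: participant, positioning, facing, score

model = AnovaRM(data=df, depvar='score', subject='participant',
                within=['positioning', 'facing'])
print(model.fit())  # F tests for positioning, facing, and their interaction

Comparing languages adds a between-subjects factor, which would call for a mixed-design model rather than a purely within-subjects analysis.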

For Elicitation Task 2, since I was interested in whether and how contact information was expressed, I conducted only a qualitative analysis and compared the languages. My expectation was that the encoding of contact information was not obligatory in the message across languages.

3.6.2. Tasks for Spatial Dynamic Situations

Another goal of this dissertation was to analyze the data from descriptions of spatial dynamic relations in a table-top space crosslinguistically and crossmodally. The rationale behind this task came from Talmy’s studies. According to Talmy (2000, p. 189), all languages can express a figure in motion and its path, a linear figure moving along the same path, and a stationary figure positioned on the same path. In doing so, all languages can encode the properties of an event such as figure (the moving object), ground (the reference object), manner (how the movement is done, e.g., jumping), path (the source/starting point and goal/endpoint of the motion), and predication. In order to understand how the studied languages express spatial language in motion events, I designed three tests using a procedure similar to that of the static tests, with the exception that the toys were in motion in the dynamic tasks. The illusion of motion was created with classical motion picture techniques. Each movie consisted of no fewer than five frames. In each frame, the position of the ‘moving’ object changed. The frames were then put together in iMovie to create a motion picture. The end result was a movie about 1-2 seconds long.10

10 Before I collected data, I showed these movies to speakers of various languages who did not participate in this study, to make sure that the movies looked like motion events. Moreover, the participants’ feedback indicated that the stimuli were successfully created.
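The stop-motion technique described above can be approximated programmatically; the sketch below is only an analogue of the iMovie workflow actually used, with hypothetical frame filenames, and assumes the imageio library is available.

import imageio.v2 as imageio

# Five stills, each with the 'moving' toy shifted a bit further
# along its path (hypothetical filenames).
frames = [imageio.imread(f'frame_{i}.png') for i in range(1, 6)]

# Played back at a few frames per second, the five frames yield a clip
# of roughly 1-2 seconds, like the stimuli described above.
imageio.mimsave('motion_event.gif', frames, fps=4)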

There were 8 TID, 8 HZJ, 10 ASL, 8 ÖGS, 10 Turkish, and 10 English native users, a total of fifty-four participants, in Experiments 3, 4, and 5. The data from 10 Croatian speakers could not be used due to technical difficulties and were therefore discarded. The entire session lasted about 9 minutes on average per participant. A total of thirty-five motion pictures consisting of several spatial arrangements of objects were shown to the participants, who were expected to describe the object relations to a native user of their language. The order of the movies was random; however, the first two movies were not testing items. Experiment 3 had eight testing items, Experiment 4 another eight, and Experiment 5 four. The remaining fifteen movies were control items. These motion pictures were prepared using small toys, i.e., dolls (a male and a female), fruits (an apple, an orange, and a peach), and animals (a pig and a cow). All descriptions were video-recorded for analysis. In the following sections, more information is given on each task.

3.6.2.1. Experiment 3

In order to test the current hypothesis, an experiment similar to Experiment 2 was designed. The design was 2x2x2. The first factor was positioning, with two levels: the objects were put on either the lateral or the sagittal axis. The second factor was facing, with two levels: the objects faced either the same direction or each other. The third factor was motion type, with two levels: one of the objects either moved all the way to the other object or started moving and stopped in the middle of the screen (i.e., moved toward the other object). An example is given in Figure 3.6.


Figure 3.6. A testing item from Experiment 3.

In Figure 3.6, two dolls (a male and a female) are located on the sagittal axis, facing each other. The male doll in the distal region moved to the female doll in the proximal region. There were four measures: axial, locational, orientational, and motion type. For each measure, the description of the motion event was compared to the stimulus. When participants correctly gave information on all four measures, they scored 4. Suppose that an English speaker said, “The boy walked to the girl,” for Figure 3.6. This description received a score of 1, since it only gave the motion type information correctly; the rest was ambiguous or unspecified. This test was developed for two reasons: qualitatively, to see whether languages encode a distinction between to and toward, and quantitatively, to compare those encodings in order to see the differences and similarities across languages.

3.6.2.2. Experiment 4

Experiment 4 was used to elicit descriptions of dynamic situations in which objects were passing by each other or going away from one another. This test was developed, qualitatively, to see whether languages encode a distinction between pass by and away, and, quantitatively, to compare those encodings to see the differences and similarities across languages.

The design was 2x2x2. As in Experiment 3, the first factor was positioning, with two levels (objects on the left-right axis vs. the front-back axis). The second factor was facing, with two levels (objects facing each other vs. facing opposite directions). The third factor was motion participant, with two levels (one object in motion vs. both objects in motion). In this task, figurines and animals were used. Figure 3.7 gives an example of a testing item: a female and a male doll both faced right; the female doll went toward the male doll, passed by him, and continued going to the right. Thus, this item instantiated the following conditions: left-right axis, facing the same direction, one object in motion.

Figure 3.7. The stimulus #26 as a testing item for Experiment 4.

There were four measures: axial, locational, orientational, and motion participant. When a description matched the information in the spatial layout according to a measure, it scored 1 for that measure (max = 4). Suppose that an English speaker described the movie in Figure 3.7 as, “The woman passed by the man.” This description does not give any clear information about the axis, locations, or relative orientations of the objects. However, it specifies that one of the objects was in motion while the other was static. Therefore, it scored 1 out of 4.
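The cells of these factorial designs can be enumerated mechanically. The sketch below lists the eight conditions of Experiment 4’s 2x2x2 design; the factor names simply mirror the prose above.

from itertools import product

positioning = ['lateral', 'sagittal']
facing = ['each other', 'opposite directions']
motion_participant = ['one object in motion', 'both objects in motion']

# The eight cells of the 2x2x2 design, one testing item per cell.
for condition in product(positioning, facing, motion_participant):
    print(condition)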

3.6.2.3. Experiment 5

The last test for the dynamic motion events was developed to investigate how signed and spoken languages described motional causative events and which linguistic structures they used to correspond to the motional causative events in Experiment 5. The design was 2x2. The first factor was positioning: the objects (an apple, an orange, and a peach) were put on either the lateral or the sagittal axis. The second factor was cause type, with two levels: 1) one of the objects moved to the other one, hit it, and stopped, and the other one started moving; or 2) one of the objects moved to the other one, hit it, and then they started going in the same direction. An example is given in Figure 3.8.

Figure 3.8. A testing item from Experiment 5.

In Figure 3.8, two fruits (an orange and a peach) are located on the sagittal axis. The orange in the distal region started moving toward the peach in the proximal region, hit it, and stopped. Then the peach started moving in the same direction. There were three measures: axial, locational, and cause type. For each measure, the description of the motion event was compared to the stimulus; when matched, it was coded 1, and otherwise 0. The maximum score a participant could get was 3. Suppose that, for the testing item in Figure 3.8, an English speaker said, “The orange hit the peach, then the peach moved away as a result.” This description would get 1 out of 3, since it correctly gives the cause type information but does not specify the objects’ axis and locations.

3.6.2.4. Elicitation Task 3

Elicitation Task 3 was similar to Elicitation Task 1. There were five motion events which the viewers were asked to describe to the addressees; in turn, the addressees retold the descriptions. The aim here was to make sure that the viewers described the movies naturally, in addition to exploring the addressees’ spatial language.

3.6.2.5. Analysis

For Experiments 3, 4, and 5, both quantitative and qualitative analyses were made of the data gathered from TID, HZJ, ASL, ÖGS, Turkish, and English. For the quantitative analysis, descriptive statistics were given for each language in order to show that the data did not always include information about the salient spatial features. The General Linear Model with Repeated Measures (GLM) was applied to see whether the relevant factors had an effect on the way the participants described the relations. GLM was also applied to compare the languages, to see whether one’s language made a difference in the way one specified the spatial features measured in each test. For the qualitative analysis, the data from the experiments, including those from Elicitation Task 3, were discussed in detail to show the patterns in the data and to demonstrate how the hypothesis accounted for the similarities and differences crosslinguistically and crossmodally.

3.6.3. Language-internal Data

There is an ongoing discussion of the relationship between spatial and temporal language, with time often treated as derived from space (e.g., Clark, 1973). This view assumes that space is basic and invariable. In cognitive linguistics (e.g., Lakoff & Johnson, 1980), for example, it is often assumed that temporal language is derived from spatial language; the conceptualization of time is, therefore, a metaphorical extension of space. There is also increasing empirical evidence for this claim (e.g., Casasanto & Boroditsky, 2008; Núñez et al., 2006; Núñez & Sweetser, 2006). Recent research has argued that the spatial front-back axis is the source of temporality. In contrast, the Crossmodal Spatial Language Hypothesis predicts that spatiality and temporality are encoded in SR and TR, respectively; according to this hypothesis, neither one is the source of the other. LR for spatial language is, indeed, not invariable. The methodologies used above allow me to show that (1) spatial language is considerably more complex than assumed and (2) TR contributes greatly to motion encodings. In order to test the predictions of the Crossmodal Spatial Language Hypothesis, the data from the static and dynamic situations were analyzed qualitatively. In addition, language-internal data from TID and Turkish were analyzed to show how a visual-gestural and an auditory-vocal language encode spatiality and temporality in their grammars. The focus was on the use of the front-back axis in spatial and temporal language and their correlations.

3.7. Summary

This chapter has outlined the methodology used in this dissertation. The methodology was justified in section 3.1. The studied languages were presented in section 3.2. Information on the participants and the data collection process was given in sections 3.3 and 3.4, respectively. Section 3.5 presented the research questions of the current study. Finally, the details of the tests used were discussed in section 3.6. The remaining chapters present the results for the data gathered from the TID, HZJ, ASL, ÖGS, Turkish, English, and Croatian participants using this methodology.


CHAPTER 4. STATIC SITUATIONS

4.1. Introduction

This chapter examines how signed and spoken languages represent spatial static relations (left/right, front/back, side-by-side, next to). In doing so, the chapter is devoted to testing the Crossmodal Spatial Language Hypothesis (CSLH) by focusing on crosslinguistic and crossmodal descriptions of static spatial relations. According to CSLH, spatial input is not obligatorily and directly mapped onto the message, as shown in Figure 4.1. Therefore, it is predicted that the descriptions of the spatial relations will not carry all of the salient spatial features. In this chapter, this hypothesis is tested against the data gathered from signed and spoken descriptions of the static situations. Both quantitative and qualitative analyses of the data provide supporting evidence for CSLH.

Input → SR/RF/TR → CS → LR → Message

Figure 4.1. The language of space model assumed in this study.

CSLH claims that a single spatial description does not carry all of the salient spatial features from the input, due to the multiple representations that interface between the input and the description: Spatial Representations (SR), Reference Frames (RF), Temporal Representations (TR), Conceptual Structure (CS), and Linguistic Representations (LR). SR specifies 3-D axial information and selects an available reference frame, while TR specifies axial information on the timeline and selects a reference frame. CS provides a BE-AT template, while LR modifies the rest of the representation in its syntax, modification, and information structure. Two experiments were designed to assess CSLH; the findings are reported in this chapter. Experiment 1 aimed at investigating static situations with left/right and front/back distinctions, whereas Experiment 2 explored similar static situations without these distinctions. The results from the two experiments supported the hypothesis in that the TID, HZJ, ASL, and ÖGS Deaf participants and the Turkish, English, and Croatian hearing participants did not encode the axial information in the spatial layout entirely and precisely in all of their descriptions. Careful investigation of the crosslinguistic and crossmodal data showed that, when encoded, the linguistic encodings and the spatial information did not always match, regardless of language and modality.

The results also indicated that languages differ from one another in their strategies for encoding the salient spatial features of static situations. The findings from Experiment 1 showed that while the TID and HZJ descriptions were affected by changes in the position and orientation (the direction the objects faced) of the objects, the ASL, ÖGS, Turkish, English, and Croatian descriptions were not. The findings from Experiment 2 indicated that while the ASL descriptions were significantly altered with respect to changes in the position and orientation of the objects, the TID, HZJ, ÖGS, and English descriptions were not. Additionally, position was significant in the Turkish descriptions, whereas orientation was significant in the Croatian descriptions.

Moreover, crossmodal comparisons of the results from both experiments showed that the signers gave significantly more spatial information about the scenes than the speakers. However, crosslinguistic comparisons suggested that the amount of spatial information given in the descriptions varied depending on the language. In detailing the static situations, the TID descriptions were no more precise than the English and Croatian descriptions. Supporting CSLH, closer examination of the data revealed that the same representational system was responsible for the variations observed across languages and modalities. Although perspective may not be marked linguistically, two reference frames (egocentric and allocentric) were available in the signed and spoken descriptions of the static situations. For example, using an allocentric reference frame resulted in a certain amount of ambiguity in the descriptions. It was also observed that the conceptual structures did not encode projective relations such as left, right, front, and back, nor did they provide metrical relations. Thus, the descriptions of the static situations may not include these relationals.

The outline of the chapter is as follows. Section 4.2 gives the methodology and results of Experiment 1. Section 4.3 provides the methodology and results of Experiment 2. Finally, section 4.4 discusses the findings and concludes the chapter.

4.2. Experiment 1: Describing Angular Relations

4.2.1. Methodology

Repeating from chapter 3, Experiment 1 was designed to investigate descriptions of static angular relations. The design was 2x3 (Positioning x Facing). Positioning, with two levels, was defined as the different positions of the objects on either the lateral or the sagittal axis with respect to the screen. Facing, with three levels, was defined as the three different orientations of the objects with respect to each other: in each location, the objects could be facing each other, facing the same direction, or facing opposite directions. There were a total of six testing items, as shown in Figure 4.2. In Figure 4.2a, for example, the two trucks were on the lateral axis and facing different directions, while in Figure 4.2b the horse and the cow were on the sagittal axis and facing each other.

Figure 4.2. The testing items for Experiment 1.

In this test, four spatial features were taken into account: axial, locational, orientational, and situational. Each had an equal value of 1, for a total of 4. When all of these features were correctly carried in a description, the description received a score of 4.

The hypothesis predicts that the manipulations of the spatial arrangements of the objects may not be reflected in the descriptions. In other words, the linguistic descriptions need not obligatorily encode all of the spatial features of a static spatial angular relation. Therefore, I expected to find any of these four spatial features missing from the participants’ linguistic descriptions of the testing items. A total of sixty-two people (native users: 10 TID, 10 HZJ, 10 ASL, 4 ÖGS, 8 Turkish, 10 English, 10 Croatian) participated in this study, signed the consent forms, and were paid for their participation. The results of the quantitative and qualitative analyses of the data from Experiment 1 supported CSLH. The next two sections give the quantitative analyses.

4.2.2. Results: Language by Language

4.2.2.1. Sign Languages

4.2.2.1.1. TID

The TID participants used several complex predicates (i.e., those consisting of b, one, two, a, 3-bend, and five handshapes11) that encoded the locations and orientations (as well as the directions) of the objects, manipulated according to their relative positions. Not all of the predicates were used across conditions. Nonetheless, the use of a complex predicate did not necessarily indicate that the linguistic encodings of the axial, locational, orientational, and situational information matched that information in the stimuli. Figure 4.3 below presents the results for the TID participants’ descriptions. As predicted, overall, the information in the stimuli was neither obligatorily nor precisely encoded in the corresponding TID descriptions. As can be seen in Figure 4.3, the TID participants described the orientations of the referents in an angular static arrangement precisely. Nonetheless, their scores decreased when the axial, locational, and situational information in the descriptions was compared with the spatial layout.

11 See Appendix A for the list of the handshapes observed in the data. See also Appendix B for the distributions of the linguistic forms used by each language in Experiment 1.

Figure 4.3. The TID scores for axes, location, orientation, and situation types in describing the pictures for Experiment 1.

Table 4.1 presents the means and standard deviations for the TID overall scores. A General Linear Model with repeated measures analysis was conducted to evaluate the effects of the position and orientation of the objects, as within-subjects effects, on the TID overall scores, which included axes, location, orientation, and situation. Significant main effects were found for positioning, F(1, 9) = 6.48, p < .05, and facing, F(2, 18) = 4.24, p < .05. These results indicated that (a) when the effect of facing was ignored, the TID participants gave significantly more spatial information about the objects on the lateral axis (M = 2.63, SE = 0.19) than about those on the sagittal axis (M = 2.07, SE = 0.19), and (b) when the effect of positioning was ignored, the TID scores were significantly different across the three facing conditions (same: M = 2.05, SE = 0.20 vs. each other: M = 2.60, SE = 0.21 vs. different: M = 2.40, SE = 0.19). There was also an interaction between positioning and facing, F(2, 18) = 4.46.

[Table: pairwise comparisons of the languages’ scores; > = greater than, < = less than; significance: * = p < .05, ** = p < .001, *** = p < .000. For example, the TID descriptions scored significantly lower than the HZJ descriptions (p < .05). An empty cell indicates that the difference between the two languages is not significant.]

A detailed statistical analysis indicated that modality and language made a difference in the scores. That is, the signers scored reliably higher than the speakers. However, being a signer did not guarantee that a signed description of a static angular relation (left-right, front-back) included all of the spatial features. For example, the TID scores were not significantly different from the English and Croatian scores.

4.2.4. The Representational System

Despite the crosslinguistic and crossmodal differences in spatial language, CSLH proposes that people make use of the same representational system. To address this prediction, qualitative analyses were conducted on the data gathered in Experiment 1. The findings support the hypothesis in that, regardless of language and modality, the participants constructed a spatial relation between the two objects, adopted an available reference frame, and used the same conceptual structure, which allowed them to make choices among the available relationals and predicates in their own language. In what follows, the data are analyzed qualitatively to show how the representational systems interfaced with each other at the time of the spatial descriptions. Consider Figure 4.12.

Figure 4.12. Stimulus #1 from Experiment 1. The two trucks are located on the lateral axis and oriented in opposite directions.

According to the Crossmodal Spatial Language Hypothesis (CSLH), the representational system for Figure 4.12 is the following:

(1)
SR   α: x₁, y₁, z₁   β: x₂, y₂, z₂
RF   allocentric / egocentric
TR   t
CS   ORIENT ([TRUCK], TOWARD ([ ])) BE AT ([LOCi])
     ORIENT ([TRUCK], TOWARD ([ ])) BE AT ([LOCj])
LR   • static / dynamic predicates
     • must encode either AT or ORIENT or both
     • perspective is optional
     • can modify BE and fillers: Thing and LOC

CSLH states that in SR the two entities, α and β, are recognized and identified as 3-D, indicating that these entities cannot be identical. In TR, these two entities and their 3-D shapes are situated in a single region, t, on the timeline. In RF, two reference frames are readily available, indicating that the relation between the two objects can be constructed by taking either one or both. CS links the static template and the representations above. Thus, CSLH predicts that a description of the situation in Figure 4.12 must involve a spatial relation between the two things. Confirming this prediction, all of the participants (as shown in (2) through (8) below) constructed a spatial relation between the two trucks in the picture, which were encoded as α and β in SR with their 3-D specifications and linked to TRUCKs in CS.
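The tiered template in (1) can also be read as a data structure. The sketch below is only one possible rendering, with invented names, and it leaves the TOWARD fillers schematic, as in the template; it shows how SR keeps the two 3-D entities distinct, RF makes both frames available, TR anchors the entities at a single time t, and CS pairs an ORIENT predicate with a BE-AT location for each entity.

from dataclasses import dataclass
from typing import Literal, Tuple

@dataclass
class Entity:
    # SR: a distinct 3-D region, so alpha and beta cannot be identical.
    label: str                          # CS filler, e.g. "TRUCK"
    region: Tuple[float, float, float]  # x, y, z

@dataclass
class CSClause:
    # CS: ORIENT([figure], TOWARD([...])) and BE AT([location]).
    figure: Entity
    toward: str    # left schematic, as in the template
    location: str  # "LOC_i", "LOC_j", ...

@dataclass
class Template:
    entities: Tuple[Entity, Entity]              # SR: alpha and beta
    frame: Literal["allocentric", "egocentric"]  # RF: either is available
    time: float                                  # TR: a single region t
    clauses: Tuple[CSClause, CSClause]           # CS
    predicate: Literal["static", "dynamic"]      # an LR choice

alpha = Entity("TRUCK", (-1.0, 0.0, 0.0))
beta = Entity("TRUCK", (1.0, 0.0, 0.0))
figure_4_12 = Template((alpha, beta), "egocentric", 0.0,
                       (CSClause(alpha, "[ ]", "LOC_i"),
                        CSClause(beta, "[ ]", "LOC_j")),
                       "static")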

(2) TID-P5¹³
RH: CAR   CLb-horizontal
LH:       CLb-horizontal
'Two cars (=trucks) are facing different directions sagittally'

(3) HZJ-P4
RH: TRUCK   CLb-horizontal
LH: TWO     CLb-horizontal
'Two trucks are facing different directions laterally'

¹³ P: Participant; RH: right hand; LH: left hand; CLx: classifier with x handshape; _____: hold.

(4) ASL-P3
RH: #TRUCK   CL3-vertical
LH:          CL3-vertical
'Two trucks are facing different directions laterally'

(5) ÖGS-P4
RH: TWO TRUCK   CLb-horizontal
LH:             CLb-horizontal
'Two trucks are facing different directions laterally'

(6) Turkish-P7
Farklı yön-e giden iki kamyon
different direction-DAT go-ADV two truck
'(There are) two trucks going in different directions'

(7) Croatian-P10
To su dva kamiona koji idu jedan…. svaki u svom smjeru.. ovoga jedan od drugoga ide.
those be-PRES:3PLU two truck-MASC-GEN-SG which-MASC-NOM-PLU go-PRES:3PLU one-MASC-NOM-SG each-MASC-NOM-SG in own-MASC-LOC-SG direction-MASC-LOC-SG one-MASC-NOM-SG from other-MASC-GEN-SG go-PRES:3SG
'Those are two trucks, each goes in its own direction.. each going away from the other'

(8) English-P4
Two toy semi-trucks they're mini, and they're facing opposite ways from each other

The hypothesis asserts that spatial relations can be constructed by taking an egocentric reference frame, in which the locational and orientational information of the referents is given with respect to the viewpoint of the describer. Otherwise, spatial relations are established by taking an allocentric reference frame, in which information is given with regard to the relative positionings of the referents, regardless of the viewpoint. The data supported the hypothesis in that the participants took the available reference frames. In the sign languages (TID, HZJ, ASL, and ÖGS), the availability of the two reference frames was observed in the use of the lateral or sagittal axis of the signing space. The data cited above showed that the participants referred to the two objects by using the lateral (as in (3), (4), and (5)) or sagittal (as in (2)) axis of the signing space. The use of the lateral axis of the signing space indicated the use of an egocentric reference frame since the trucks in the testing items were located laterally. (9) gives some examples in which the sign language participants employed an egocentric reference frame. Note that the perspective taken in the examples was not linguistically encoded. That is, the Deaf participants did not sign whether the description was from their viewpoint or their addressee's viewpoint. Nor did they take the perspective of one of the trucks in the testing item.

(9) TID-P7 & HZJ-P7 & ASL-P1 & ÖGS-P2 [video stills of the signed descriptions]
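The egocentric/allocentric contrast at work in these examples can also be stated computationally. The toy functions below are an illustration under simplified assumptions (a 2-D scene with the viewer at the origin, facing +y); they are not drawn from the dissertation's coding scheme.

from typing import Tuple

Vec = Tuple[float, float]

def egocentric_axis(a: Vec, b: Vec) -> str:
    # Viewer-dependent: do the two objects differ mainly along the
    # viewer's left-right (lateral) or near-far (sagittal) axis?
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    return "lateral" if dx >= dy else "sagittal"

def allocentric_orientation(heading_a: Vec, heading_b: Vec) -> str:
    # Viewer-independent: compare only the objects' own headings.
    dot = heading_a[0] * heading_b[0] + heading_a[1] * heading_b[1]
    return "same direction" if dot > 0 else "opposite directions"

# Two trucks side by side on the viewer's lateral axis, facing away
# from each other, as in Figure 4.12:
print(egocentric_axis((-1.0, 3.0), (1.0, 3.0)))          # lateral
print(allocentric_orientation((-1.0, 0.0), (1.0, 0.0)))  # opposite directions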

However, the use of the sagittal axis was also available, at least in TID. TID-P3 in (10) took an allocentric reference frame and located the objects on the sagittal axis of his signing space in describing Figure 4.12. Again, TID-P3 did not specify whether the objects' spatial arrangement was from his perspective or his addressee's perspective.¹⁴

¹⁴ There was no referential shift or use of "character perspective" either.

(10) TID-P3 [video still]

The two reference frames, encoded in RF, were also apparent in the spoken language descriptions. For example, in (11), Turkish-P7 used farklı yöne giden 'going in different directions' and, in (12), English-P4 used facing opposite ways, in which the trucks could be located either laterally or sagittally. Thus, they took an allocentric reference frame.

(11=6) Turkish-P7
Farklı yön-e giden iki kamyon
different direction-DAT go-ADV two truck
'(There are) two trucks going in different directions'

(12=8) English-P4
Two toy semi-trucks they're mini, and they're facing opposite ways from each other

Both reference frames were available at the same time. The evidence came from speech and the gestures accompanying the speech. In (13), Turkish-P9 used pointing to indicate that the trucks were on her lateral axis, while in (14) English-P9 used gestures to indicate that the trucks were on his lateral axis and facing his left and right. Therefore, they used an allocentric reference frame in their speech and an egocentric reference frame in their gestural productions.

(13)¹⁵ Turkish-P9
İki tane oyuncak kamyon bir-i gid-iyor bir-i gel-iyor
two cl toy truck one-POSS go-IMPERF one-POSS come-IMPERF
'(There are) two toy trucks, one of them is coming and one of them is going'

¹⁵ Still frames of the relevant gestures are given above the lexical item(s) that they accompany.

(14) English-P9

There’s two toy trucks pointing in opposite directions

The hypothesis predicted that in static relations, the CS would obligatorily encode the location and orientation of the objects in two distinct regions, i and j, with two different TOWARD fillers, here TRUCKs. As predicted, all of the participants, regardless of language and modality, gave relative, but not precise, positions and orientations of the objects in their descriptions. The hypothesis also predicted that an angular static relation could be encoded in LR with either static or dynamic predicates. This was, indeed, the case across languages. TID-P3 (above in (10)) and HZJ-P1 (below in (15)) described Figure 4.12 as '(lit.) the two trucks are going away from one another'. In the other descriptions, static predicates were used. Nonetheless, the CS was the same across the board.

(15) HZJ-P1 [video still]

As observed in the sign language descriptions, a static situation can be described as a dynamic situation in spoken languages. The examples above provided evidence for this prediction. For example, Turkish-P9 in (13) used two imperfective-marked dynamic predicates, geliyor '(it) is coming' and gidiyor '(it) is going', in her description. Additional evidence comes from the gestural productions of the speakers. For example, in (16), Turkish-P6 used an imperfective-marked dynamic predicate, gidiyorlar '(they) are going', in her speech as well as a dynamic lateral hand movement in her gestural production.

(16) Turkish-P6
İki tane kamyon ters yön-e gid-iyor-lar
two cl truck opposite direction-DAT go-IMPERF-PL
'Two trucks are going in opposite directions'

So far, the descriptions of the objects located laterally and facing opposite directions have provided supporting evidence for the hypothesis. In the following, further evidence is given from the descriptions of the objects located on the sagittal axis. Consider Figure 4.13.

Figure 4.13. Stimulus #2 from Experiment 1. The cow and the horse are located on the sagittal axis, facing each other.

According to the hypothesis, the template is the following:

(17)
SR   α: x₁, y₁, z₁   β: x₂, y₂, z₂
RF   allocentric / egocentric
TR   t
CS   ORIENT ([COW], TOWARD ([ ])) BE AT ([LOCi])
     ORIENT ([HORSE], TOWARD ([ ])) BE AT ([LOCj])
LR   • static / dynamic predicates
     • must encode either AT or ORIENT or both
     • perspective is optional
     • can modify BE and fillers: Thing and LOC

As predicted, all of the participants referred to the two objects, the horse and the cow in Figure 4.13 (represented as three-dimensional α: x₁, y₁, z₁ and β: x₂, y₂, z₂, respectively), in SR. An example from each language is given below in (18) to (24). In these examples, the participants referred to the referents and the objects' relative positionings with no linguistic marking of perspective.

(18) TID-P2
RH: COW    CL1-up_______________
LH: HORSE  CL1-up_______________
'The cow and the horse are meeting each other sagittally'

(19) HZJ-P2
RH: COW    CL2-bend
LH: HORSE  CL2-bend
'The cow and the horse are facing each other sagittally'

(20) ASL-P10
RH: HORSE  CL1-up____________________________
LH: COW    CL1-up
'The horse and the cow are facing each other sagittally'

(21) ÖGS-P3
RH: COW    CL2-bend____________________
LH: HORSE  CL2-bend
'The cow and the horse are facing each other sagittally'

(22) Turkish-P10
At ve inek karşılıklı dur-uyor
horse and cow opposite-NOM-ADV halt-IMPERF
'(lit.) Horse and cow keep facing one another'

(23) Croatian-P1
Dva konja idu jedan prema drugom.
two horses go-PRES:3PLU one-MASC-NOM-SG towards another
'Two horses approaching one another'

(24) English-P7
There's a plastic cow and a plastic horse facing each other vertically

The hypothesis predicted that, because of the availability of the two reference frames, Figure 4.13 could be described in at least two different ways. The data supported this prediction. In (25), the signers described the positions of the cow and the horse by employing an egocentric reference frame.

(25) TID-P1 & HZJ-P2 & ASL-P3 & ÖGS-P3 [video stills of the signed descriptions]

Yet, it is also possible to describe the same stimulus by taking an allocentric reference frame. Evidence of this came from TID-P8, who used his lateral axis, as shown in (26).

(26) TID-P8 [video still]

The use of the two reference frames encoded in RF was also detected in the spoken language descriptions. For example, in (27), Turkish-P10 used karşılıklı duruyor '(lit.) keep facing each other'. In her description, there was no reference to the precise location of the horse and the cow with respect to her viewpoint; thus, she took an allocentric reference frame. In contrast, English-P7, in (28), took an egocentric reference frame by using vertically [sic] in addition to facing each other.

(27=22) Turkish-P10
At ve inek karşılıklı dur-uyor
horse and cow opposite-NOM-ADV halt-IMPERF
'(lit.) Horse and cow keep facing one another'

(28=24) English-P7
There's a plastic cow and a plastic horse facing each other vertically

The availability of the two reference frames was also observed in the speech and gesture production of the speakers. Turkish-P1, in (29), did not refer to the locations of the objects with respect to her viewpoint, since she used only the predicate karşılıklı 'opposite/face-to-face', which was an indication of the use of an allocentric reference frame. But she also used a pointing gesture on the sagittal axis of her gesture space, which corresponded to the axial information for the relative positionings of the horse and the cow. Thus, she took an egocentric reference frame in her gestural production. Similarly, English-P5, in (30), was not precise about the locations of the horse and the cow relative to her viewpoint in her speech production, which indicated the use of an allocentric reference frame. Yet, in her gestures, she used the sagittal axis of her gesture space to indicate both the locations and orientations of the horse and the cow in the stimulus. Hence, she also took an egocentric reference frame in her gestural production.

(29) Turkish-P1
Bir at bir inek karşılıklı
one horse one cow opposite-NOM-ADV
'(There is) a horse and a cow facing one another'

(30) English-P5

The cow is back and its butt’s facing you again then…there’s a horse facing it directly so you just see the horse’s head.

Nonetheless, the use of gestures did not entail the use of an egocentric reference frame. The evidence of allocentric gesturing came from Turkish-P6's description of the same static situation in (31). In her speech, she used karşılıklı 'opposite/face-to-face', indicating the use of an allocentric reference frame. In her gesture, she used the lateral axis of her gesture space, which, again, indicated the use of an allocentric reference frame.

(31) Turkish-P6
At ile inek karşılıklı
horse with cow opposite-NOM-ADV
'(There is) a horse and a cow facing one another'

The hypothesis also predicted that a static situation can be represented by using dynamic predicates. The data in (32) supported this prediction as well. In this description, the TID signer not only used an allocentric reference frame but also employed a dynamic predicate, '(lit.) going to each other', by moving his hands laterally and meeting them in the middle.

(32=26) TID-P8 [video still]

Thus far, the data elicited from the descriptions of the objects located on the lateral/sagittal axes and facing opposite/same directions support the hypothesis, which predicts that the spatial information is not obligatorily and directly encoded in the linguistic descriptions due to the multiple representations. The descriptions of the objects facing the same direction also provide supporting evidence for the hypothesis. Consider Figure 4.14.

Figure 4.14. Stimulus #5 from Experiment 1. The pig and the goat are located on the lateral axis, facing the same direction.

According to CSLH, the template is the following:

(33)
SR   α: x₁, y₁, z₁   β: x₂, y₂, z₂
RF   allocentric / egocentric
TR   t
CS   ORIENT ([GOAT], TOWARD ([ ])) BE AT ([LOCi])
     ORIENT ([PIG], TOWARD ([ ])) BE AT ([LOCj])
LR   • static / dynamic predicates
     • must encode either AT or ORIENT or both
     • perspective is optional
     • can modify BE and fillers: Thing and LOC

The examples from each language, given in (34) to (40), referred to the two objects, the pig and the goat in Figure 4.14 (represented as three-dimensional α: x₁, y₁, z₁ and β: x₂, y₂, z₂, respectively), in SR.

(34) TID-P10
RH: PIG   TOGETHER  CL1-hor______________
LH: GOAT            CL1-hor______________
'The pig and the goat are facing forward and going forward together sagittally'

(35) HZJ-P4
RH: PIG   CL2-horizontal
LH: GOAT  CL2-horizontal__________________________
'The pig and the goat are facing left laterally'

(36) ASL-P1
RH: PIG  GOAT  CL2-bend
LH: CL2-bend_________________________________
'The pig and the goat are facing left laterally'

(37) ÖGS-P2
RH: PIG  CLb-vertical  GOAT  CLb-vertical
LH:      CLb-vertical
'The pig and the goat are facing right laterally'

(38) Turkish-P7
Domuz ve keçi galiba aynı yön-de
pig and goat probably same direction-LOC
'(There is) a pig and a goat, perhaps, in the same direction'

(39) Croatian-P1
Krava ide za svinjom prema lijevo.
cow-FEM-NOM-SG go-PRES:3SG for pig-FEM-INSTR-SG towards left-ADV
'A cow follows a pig to the left.'

(40) English-P8
There is a goat and a pig and both are facing to the left-hand side of the screen

The two reference frames, encoded in RF, were available in the sign language descriptions. In (41), HZJ-P4, ASL-P10, and ÖGS-P7 used the lateral axis of their signing space, which matched the spatial layout in the stimulus; therefore, they took an egocentric reference frame. In contrast, by taking an allocentric reference frame, TID-P13 in (42) used the sagittal axis of his signing space.

(41) HZJ-P4 & ASL-P10 & ÖGS-P7 [video stills of the signed descriptions]

(42) TID-P13 [video still]

The two reference frames, encoded in RF, were also present in the spoken language descriptions. For example, in (43), Turkish-P7 used aynı yönde 'in the same direction', which did not specify which direction with respect to her viewpoint; she thereby took an allocentric reference frame. Nonetheless, English-P8, in (44), provided directional information in her description: facing to the left-hand side. She did not, however, clarify the relative locations of the objects. Thus, she also took an allocentric reference frame.

(43=38) Turkish-P7
Domuz ve keçi galiba aynı yön-de
pig and goat probably same direction-LOC
'(There is) a pig and a goat, perhaps, in the same direction'

(44=40) English-P8
There is a goat and a pig and both are facing to the left-hand side of the screen

Similar to the analyses given above, both reference frames could be used in speech and gesture. For example, in (45), Turkish-P1 used an allocentric reference frame in her speech with arka arkaya '(lit.) in front and in back' while she employed an egocentric reference frame by using pointing gestures on the lateral axis of her gesture space. The same can be said for Turkish-P3's description of the same static situation, given in (46). In her description, she used önde…arkada 'in front and in back', indicating the use of an allocentric reference frame, while she moved her head to point to the locations of the pig and the goat in her lateral gesture space, indicating the use of an egocentric reference frame in her gestural production.

(45) Turkish-P1
Domuz ve keçi arka arka-ya
pig and goat back back-DAT
'A pig and a goat are back-to-back'

(46) Turkish-P3
Domuz-la keçi var domuz ön-de keçi arka-da… yürü-yor-lar
pig-COM goat exist pig front-LOC goat back-LOC walk-IMPERF-PL
'There is a pig and a goat. The pig in the front and the goat at the back are walking'

Gestural productions did not necessarily match the spatial layout of the static situations. For example, in (47), Turkish-P6 used the sagittal axis of her gesture space by moving her hands slightly forward and backward when she said önde 'in front' and arkada 'in back'. Nonetheless, the pig and the goat in the stimulus were located laterally. Therefore, she took an allocentric reference frame both in her speech and in her co-speech gestures.

(47) Turkish-P6
Domuz ön-de keçi arka-da
pig front-LOC goat back-LOC
'A pig is in the front; a goat is at the back'

The sign language data provided supporting evidence for the prediction that a static situation can be represented by using static and dynamic predicates. For example, in (48), the TID signer moved his hands toward his left, indicating that the pig and the goat were moving left. Similarly, HZJ-P9 in (49) described the picture by using the dynamic predicate 'go'.

(48) TID-P9 [video still]

(49) HZJ-P9 [video still]

Confirming the hypothesis, in the above example (46), Turkish-P3 used the dynamic predicate yürüyorlar 'are walking' in referring to the static situation. Similarly, in (50), English-P2 used the dynamic predicate walking off in her description.

(50) English-P2

There's the same pig a with…looks like he's walking off to the left of the screen with a goat walking off to the left of the screen right behind him

The use of a dynamic predicate does not necessarily entail the use of a dynamic gesture accompanying the speech production. For example, Turkish-P4, in (51), used a dynamic predicate, takip ediyor '(lit.) is following', in her speech and a dynamic hand movement from her right to her left. Yet, English-P9, in (52), used a similar dynamic predicate, is following, but static hand gestures to give the locations of the goat and the pig.

(51) Turkish-P4
(Keçi) domuz-u takip ed-iyor o-nun arka-sın-dan
goat pig-ACC follow do-IMPERF it-3GEN back-3POSS-ABL
'A goat is following a pig from behind'

(52) English-P9
There's two toy animals… so the goat is following the pig

4.2.4.1. Summary

The noncorrespondences between the input and the descriptions were explained by the representational system, in which Spatial Representation did not specify metric relations but separate regions for the objects, Temporal Representation restricted the objects and their regions to a given time, Reference Frames provided two possible ways to construct a spatial relation via allocentric/egocentric frames, Conceptual Structure constructed BE-AT & ORIENT relations, and, finally, Linguistic Representation selected a set of syntactic frames, modifications, and an information structure.

4.2.5. Discussion

Experiment 1 was designed to investigate how signed and spoken languages represent static spatial angular relations (left/right, front/back) and to test the Crossmodal Spatial Language Hypothesis (CSLH). The data supported CSLH in that the signed (TID, HZJ, ASL, and ÖGS) and spoken (Turkish, English, and Croatian) languages did not encode the salient spatial features in the spatial layout entirely and precisely in all of their descriptions. Furthermore, investigation of the data indicated that there was a modality difference: the signers were significantly better at giving spatial information than the speakers. Nonetheless, crosslinguistic comparisons revealed that not all sign language groups gave more exact descriptions than the spoken language groups. CSLH explains these findings by proposing multiple representations interfacing between the input and the descriptions. Maintaining CSLH, the data showed that the signers and speakers did not encode the salient spatial features due to the availability of the two reference frames and the conceptual structure templates. The results could have depended on the spatial layout design in Experiment 1, in which there was a relatively long distance between the objects. More experiments were needed to decide whether the hypothesis accounted for spatial language in situations in which the distance between the objects is shorter. Experiment 2 was designed to address this question.

4.3. Experiment 2: Describing Angular-Topological Relations

Section 4.2 reported that the signed (TID, HZJ, ASL, and ÖGS) and spoken (Turkish, Croatian, and English) languages do not obligatorily describe the prominent spatial features of the spatial angular input. Supporting CSLH, the results showed that the signers and speakers do not encode the salient spatial features due to the availability of the two reference frames and the conceptual structure templates. Experiment 2 tested the effects of the same variables, positioning and facing, as well as modality and language, on the descriptions of static spatial angular-topological relations made by using, for example, side-by-side, next to, and beside. The results provide additional evidence to support CSLH.

The outline of the remainder of the chapter is as follows. The methodology is given in section 4.3.1. Section 4.3.2 explores the results from the individual languages, whereas section 4.3.3 compares the languages and makes crosslinguistic and crossmodal generalizations. Section 4.3.4 goes into the details of the representational system to understand these generalizations. Finally, section 4.4 discusses the results from the two experiments and concludes the chapter.

4.3.1. Methodology

Repeating from chapter 2, Experiment 2 was designed to assess how language users described a spatial arrangement of two objects (animals and dolls) which had intrinsic features, such as front and back, and were put next to each other. The objects were almost touching. The design for Experiment 2 was 2x2. The first factor was positioning, with two levels (objects on the left-right axis vs. the front-back axis). The second factor was orientation, with two levels (objects facing the same direction vs. facing opposite directions). There were four testing items, as shown in Figure 4.15. There were two object location arrangements: on the left-right axis, as in (a) and (b), and on the front-back axis, as in (c) and (d). In addition, there were several arrangements of object orientations. For example, in (a) and (d) the objects faced toward the same direction; in (b) the objects faced each other; and in (c) the objects faced different directions.

(a) Lateral-Same   (b) Lateral-Diff.   (c) Sagittal-Same   (d) Sagittal-Diff.

Figure 4.15. The testing items in Experiment 2.
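The 2x2 design just described can be enumerated mechanically; the short sketch below is illustrative only, and the factor and level names are taken from the prose above.

from itertools import product

positioning = ("lateral", "sagittal")  # left-right vs. front-back axis
orientation = ("same", "different")    # facing the same vs. opposite directions

# The crossing of the two factors yields the four testing items in Figure 4.15.
conditions = list(product(positioning, orientation))
print(conditions)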

There were four measures: Axial, Locational, Orientational, and Situational. The maximum score a description could receive was 4 (1 for each measure). A total of fifty-eight people participated in this study. There were ten participants in the Croatian group; each of the other languages was represented by eight participants.

4.3.2. Results: Language by Language

4.3.2.1. Sign Languages

The TID, HZJ, ASL, and ÖGS participants used complex predicates, such as those with the b, two, and one handshapes, that encoded the locations and orientations of the dolls and animals that were put side-by-side. Not all of the predicates were used across conditions. Nonetheless, as for the angular relations (Section 4.2), the use of a complex predicate did not necessarily indicate that the linguistic encodings of the axial, locational, orientational, and situational information matched that information in the stimuli.

4.3.2.1.1. TID

The TID signers used complex predicates, consisting of the handshapes b, one, two, and three,¹⁶ that encoded the relative positions and facing of the objects in their descriptions. Yet, there was not a single complex predicate that obligatorily encoded a single condition, since two or more different predicates were used for a given condition. For example, while one-up and two-vertical were used for Figure 4.15a, b-vertical, one-horizontal, one-up, two-vertical, and two-bend were used for Figure 4.15b. Moreover, there seemed to be patterns for the use of some, but not all, predicates. For example, the predicate one-up was used for every condition. B-vertical and one-horizontal were used for the pictures with a horse and an elephant, while three-up was used for the objects located on the sagittal axis. Figure 4.16 below presents the results for the TID participants' descriptions. As predicted, overall, the information in the stimuli was not obligatorily and precisely encoded in the TID descriptions. According to Figure 4.16, the TID participants described the orientations of the referents in an angular static arrangement precisely. Nonetheless, their scores decreased when the axial, locational, and situational information in the descriptions was compared with the spatial layout. Thus, they could use the sagittal axis of their signing space in representing the objects located side-by-side on the lateral axis. Furthermore, the TID participants used both dynamic and static complex predicates even though there was no movement in the stimulus.

¹⁶ See Appendix A for the list of the handshapes observed in the data. See also Appendix C for the distributions of the linguistic forms used by each language in Experiment 2.


Figure 4.16. The TID scores for axes, location, orientation, and situation types in describing the pictures for Experiment 2.

Table 4.11 summarizes the means and standard deviations for the overall TID scores. The General Linear Model with repeated measures confirmed that the effects of positioning and orientation of the objects, as within-subjects effects, on the overall TID scores (including axes, location, orientation, and situation) were not significant. Thus, the number of spatial features in the TID signers' descriptions did not vary with respect to the object manipulations.

Table 4.11. The means and standard deviations of the TID static angular-topological data (max = 4).

Condition            Mean   SD
Lateral-Same         2.14   1.06
Lateral-Different    2.42   0.97
Sagittal-Same        2.57   0.53
Sagittal-Different   2.71   0.95

4.3.2.1.2. HZJ

The HZJ signers used complex predicates (with the handshapes b, one, two, and thumb) that encoded the relative positions and orientations of the objects in their descriptions. As with the TID predicates, there was not a single complex predicate that obligatorily encoded a single condition, since two or more different predicates were used for a given condition. For example, while one-up, two-horizontal, and two-vertical were used for Figure 4.15a, b-vertical, two-horizontal, two-bend, and thumb were used for Figure 4.15b. Moreover, there seemed to be patterns for the use of some, but not all, predicates. For example, the predicate two-horizontal was used for every condition. B-vertical and two-bend were used for the pictures with a horse and an elephant, while one-up was used for the male and female dolls. Compared to the TID participants, the HZJ participants gave more precise information about the spatial layout of the stimulus in their descriptions. Figure 4.17 presents the results.

Figure 4.17. The HZJ scores for axes, location, orientation, and situation types in describing the pictures for Experiment 2.

The means and standard deviations for the HZJ overall scores are given in Table 4.12. The General Linear Model with repeated measures was conducted to evaluate the effects of positioning and orientation of the objects, as within-subjects effects, on the overall HZJ scores (including axes, location, orientation, and situation). No significant difference was found. As in the TID descriptions, the HZJ signers did not alter the amount of spatial information in their descriptions with respect to the object manipulations.

Table 4.12. The means and standard deviations of the HZJ static angular-topological data (max = 4).

Condition            Mean   SD
Lateral-Same         3.25   0.88
Lateral-Different    3.87   0.35
Sagittal-Same        3.87   0.35
Sagittal-Different   3.62   0.51

4.3.2.1.3. ASL

The ASL signers used complex predicates (with the handshapes one, two, and three) that encoded the relative positions and orientations of the objects in describing the stimuli in Experiment 2. There was not a single complex predicate that obligatorily encoded a single condition, since two or more different predicates were used for a given condition. For example, while one-up, two-horizontal, two-vertical, and two-up were used for Figure 4.15a, one-point, one-up, two-horizontal, two-bend, and three-vertical were used for Figure 4.15b. Moreover, there seemed to be patterns for the use of some, but not all, predicates. For example, the predicates two-horizontal and two-vertical were used for every condition. One-point and two-bend were used for the pictures with a horse and an elephant. The ASL data indicated that the ASL signers did not precisely encode the axial, locational, orientational, and situational information in their descriptions of the angular-topological relations. According to Figure 4.18, their scores were higher on the axial information measure than on the others.

Figure 4.18. The ASL scores for axes, location, orientation, and situation types in describing the pictures for Experiment 2.

Table 4.13 summarizes the means and standard deviations for the ASL scores. The General Linear Model with repeated measures was conducted to assess the effects of positioning and facing of the objects on the overall ASL scores (including axes, location, orientation, and situation). As a result, a significant main effect was found for positioning, F(1, 7) = 11.66, p < .05.

[Table fragment: pairwise crosslinguistic comparisons of scores; the ASL and ÖGS rows show significantly greater scores (** = p < .001) than the comparison languages.]