A Computational Model of Natural Language Communication
Interpretation, Inference, and Production in Database Semantics
Second Edition (corrected, revised, and extended)
PREPRINT. Please cite as: Preprint, March 9, 2016
Roland Hausser
Friedrich-Alexander-Universität Erlangen-Nürnberg
March 9, 2016
email: [email protected]
© Roland Hausser, March 9, 2016
Preface to the Second Edition
The 2nd Edition resembles the 1st in that the cycle of natural language communication is modeled as a cognitive agent’s turn-taking, consisting of (i) the hear mode, which maps unanalyzed modality-dependent external surfaces into cognitive content, (ii) the think mode, which selectively activates content by navigating along the semantic relations of structure, and (iii) the speak mode, which maps activated content into unanalyzed modality-dependent external surfaces. Inferences matching activated content derive new content for reasoning and as blueprints for action, including language action.

Regarding the details of the software solution, however, the 2nd Edition is substantially revised. The hear mode relies on an automatically managed now front in the agent’s content-addressable database: proplets which have ceased to be potential candidates for concatenation with a next word are left behind in the sediment of memory, so that a current now front contains no more than four or five proplets. This makes it possible to use the algorithm of Left-Associative Grammar without rule packages, relying on self-organization instead. The absence of rule packages removes the main difference between the operations (formerly called rules) of the hear mode and inferences.

The speak mode is also simplified by using language-dependent lexicalization rules which are embedded into the think mode operations. The selective activation of the think mode is characterized by a new kind of graph, first developed in CLaTR. The coreference of anaphoric pronouns is now based on the use of addresses instead of co-indexing, and may be found in Chapters 10 and 11 of CLaTR.

These changes contribute to a standardized presentation of Database Semantics (DBS), which has been extended to the 2nd Edition of CLaTR and is complied with in HBTR.

June 2015, Nürnberg
Roland Hausser
Preface
In the preface to the First Edition of The Principia,1 Sir Isaac Newton distinguished two aspects of mechanics: theoretical and practical. The theoretical aspect, called rational by Newton, consists in accurate demonstration. The practical aspect includes all the manual arts. What would be the result of applying a corresponding distinction to the current state of linguistics? Instead of first praising the importance of our field – as Newton would – let us go straight to the questions at hand: What is theoretical linguistics and what is practical linguistics?

Practical linguistics is instantiated by such tasks as speech recognition, desktop publishing, word processing, machine translation, content extraction, classification, querying the Internet, automatic tutoring, dialogue systems, and all the other applications involving natural language. They have generated a huge demand for practical linguistic methods. Compared to the users’ needs and expectations, however, the results leave much to be desired.

Today, the most successful applications in practical linguistics are based on the methods of statistics and metadata mark-up. These are smart solutions,2 which try to get by without a general theory of how communicating with natural language works. Instead they aim to maximally exploit the special properties and natural limitations of each application or kind of application.

Now consider practical mechanics: it is instantiated by tasks ranging from accurately predicting the tides, to predicting the future positions of the planets, to aiming cannon balls, to landing on the moon. These applications have created an equally large demand for applied methods as in linguistics. In contrast to linguistics, however, the field of mechanics was able to satisfy any such demands far beyond the users’ imagination. This was possible because Newton’s general theory can be translated into the specific applications while maintaining compatibility with traditional craft skills.

1 Original Latin title: Philosophiæ Naturalis Principia Mathematica (1687); complete English title: The Principia: Mathematical Principles of Natural Philosophy.
2 Cf. FoCL, Section 2.3. The alternative is a solid solution.
Each translation is hard work and requires theoretical knowledge as well as practical experience, but the results are nothing short of spectacular.

The example of Newton’s mechanics leads naturally to the question: Can we do the same in linguistics? Can we conceive a new framework suitable to fulfill all the wide-ranging demands of the users by simply translating the linguistic theory into the limited and specialized contexts of various practical applications? This question may be taken as a worthy challenge to our basic research.

As a first step towards a complete, general linguistic framework, let us reconstruct the cognitive ‘mechanics’ of natural language communication between humans. The theory, called Database Semantics3 (DBS), is presented here as the declarative specification of a talking robot. It is in the nature of our project that its potential to improve practical applications correlates directly with its relative success in adequately modeling human cognition.4

The declarative specification of a talking robot must be designed as a functional model, which effectively realizes the mechanism of natural language communication. To ensure completeness, the model must take the language-based interaction between humans as its prototype. The model’s functionality and data coverage must be verified automatically by an efficient implementation as a running computer program. This combination of functionality, completeness, and verifiability constitutes the best scientific basis for the long-term success of upscaling the model.

The resulting system is able to serve in all the practical applications involving natural language communication. In most cases it is sufficient to simply reduce the functionality and the data coverage of the model to fit the demands of the application at hand. For example, when using the cognition of the talking robot for building an automatic dialogue system used over the phone, there is no need for artificial vision, manipulation, or locomotion.5 Other applications, notably machine translation, not only allow a reduction, but also require an extension of the theory.

3 As the name of a specific scientific theory, the term Database Semantics is written with initial capital letters. This use is distinct from referring to generic issues, for example semantic constraints on databases (Bertossi, Katona, Schewe, and Thalheim eds. 2003).
4 Like any basic science with practical ramifications, a computational reconstruction of natural language communication raises the threat of possible misuse. This must be curtailed by developing responsible guidelines for clearly defined laws to protect privacy and intellectual property while maintaining academic liberty, access to information, and freedom of discourse.
5 Similar reductions apply to such applications as automatic grammar checking, content extraction, indexing to improve recall and precision of Internet querying, or supporting automatic speech recognition.
For such extensions, however, Database Semantics provides a solid basis, given that it models monolingual communication, including monolingual language understanding. Furthermore, any application-independent (theoretical) improvements regarding the data coverage of the lexicon, of automatic word form recognition, of syntactic–semantic parsing, of absolute and episodic world knowledge, of inferencing, etc., may directly benefit existing practical applications simply by routinely replacing their components with improved versions. This is possible because the theory provides functionally motivated modules with clearly defined interfaces within a clearly defined functional flow.

The following pages aim at presenting Database Semantics as directly and simply as possible. Intended audiences are graduate students, researchers, and software engineers in linguistics and natural language processing. The text may also be of interest to scholars in philosophy of language, cognitive psychology, and artificial intelligence.

As background literature for readers who are new to computational linguistics in general and Database Semantics in particular, Foundations of Computational Linguistics (FoCL 1999, 3rd ed. 2014) is recommended. FoCL is a textbook that systematically describes the traditional components of grammar, compares a wide range of different linguistic approaches in their historical settings, and develops the SLIM theory of language, which is also used here.

A complementary effort in cognitive psychology is the ACT-R theory by Anderson (cf. Anderson and Lebiere 1998). Like Database Semantics, ACT-R is essentially symbol-based rather than statistical, and uses computational modeling as the method of verification. However, ACT-R focusses on memory, learning, and problem solving, while Database Semantics concentrates on modeling the speak and the hear mode in natural language communication.

February 2006, Erlangen–Nürnberg
Roland Hausser
Contents
Part I. Communication Mechanism of Cognition

1. Matters of Method
   1.1 Sign- or Agent-Based Analysis of Language?
   1.2 Verification Principle
   1.3 Equation Principle
   1.4 Objectivation Principle
   1.5 Equivalence Principles
   1.6 Surface Compositionality and Time-Linearity

2. Interfaces and Components
   2.1 Cognitive Agents without Language
   2.2 Sensory Modalities and their Media
   2.3 Alternative Ontologies for Referring with Language
   2.4 Theory of Language and Theory of Grammar
   2.5 Immediate Reference and Mediated Reference
   2.6 The SLIM Theory of Language

3. Data Structure, DB Schema, and Algorithm
   3.1 Proplets for Coding Propositional Content
   3.2 Matching between Language and Context Proplets
   3.3 Storing Proplets in a Content-Addressable Database
   3.4 Possible Continuations vs. Possible Substitutions
   3.5 Cycle of Natural Language Communication
   3.6 Operations of the DBS Algorithm

4. Content and the Theory of Signs
   4.1 Formal Concept Types and Concept Tokens
   4.2 Proplet Kinds and their Mechanisms of Reference
   4.3 Context Recognition
   4.4 Context Action
   4.5 Language Interpretation and Production
   4.6 Universal versus Language-Dependent Properties

5. Forms of Thinking
   5.1 Retrieving Answers to Questions
   5.2 Episodic versus Absolute Propositions
   5.3 Inference: Reconstructing Modus Ponens in DBS
   5.4 Non-Literal Uses as a Secondary Coding
   5.5 Secondary Coding as Perspective Taking
   5.6 Shades of Meaning

Part II. Major Constructions of Natural Language

6. Intrapropositional Functor-Argument
   6.1 Obligatory Subject/Predicate and Object\Predicate
   6.2 Optional Adnominal Modifier|Modified
   6.3 Optional Adverbial Modifier|Modified
   6.4 Semantic Interpretation of Determiners
   6.5 Complexity Levels of Surfaces and their Contents
   6.6 Transparent vs. Opaque

7. Extrapropositional Functor-Argument
   7.1 Clausal Subject
   7.2 Clausal Object
   7.3 Clausal Adnominal with Subject Gap
   7.4 Clausal Adnominal with Object Gap
   7.5 Clausal Adverbial
   7.6 Subclause Recursion

8. Intrapropositional Coordination
   8.1 Comparing Simple and Complex Coordination
   8.2 Simple Coordination of Subject and Object Nouns
   8.3 Simple Coordination of Verbs and of Adjectives
   8.4 Overview of Complex Coordination (Gapping)
   8.5 Subject Gapping and Predicate Gapping
   8.6 Object Gapping and Noun Gapping

9. Simple Coordination in Clausal Constructions
   9.1 Combining Intra- and Extrapropositional Coordination
   9.2 Extrapropositional Coordination
   9.3 Simple Coordination in Clausal Subject
   9.4 Simple Coordination in Clausal Object
   9.5 Simple Coordination in Clausal Adnominal
   9.6 Simple Coordination in Clausal Adverbial

10. Gapping in Clausal Constructions
   10.1 Subject Gapping in a Clausal Subject
   10.2 Predicate Gapping in a Clausal Object
   10.3 Clausal Adnominal (Subject Gap) with Subject Gapping
   10.4 Clausal Adnominal (Object Gap) with Object Gapping
   10.5 Complex Coordination in Clausal Adverbial
   10.6 Gapping in Contents with a Three-Place Verb

Part III. Declarative Specification of Formal Fragments

11. DBS.1: Hear Mode
   11.1 Automatic Word Form Recognition
   11.2 Integrating the Steps of the Hear Mode
   11.3 Managing the Now Front
   11.4 Intrapropositional Suspension
   11.5 The Three Steps of Applying a LAGo Hear Operation
   11.6 Parsing a Small Fragment of English

12. DBS.1: Speak Mode
   12.1 Definition of LAGo Think.1
   12.2 The Two Steps of Applying a LAGo Think Operation
   12.3 Navigating with LAGo Think.1
   12.4 The Lexicalization Rule lexverb
   12.5 The Lexicalization Rule lexnoun
   12.6 The Lexicalization Rule lexadj

13. DBS.2: Hear Mode
   13.1 Lexical Lookup for the DBS.2 Example
   13.2 Hear Mode Analysis of the First Sentence
   13.3 Hear Mode Analysis of the Second Sentence
   13.4 Hear Mode Analysis of the Third Sentence
   13.5 Defining LAGo Hear.2
   13.6 Hear Mode Extension to a Verb-Initial Construction

14. DBS.2: Speak Mode
   14.1 Content to be Realized in the Speak Mode
   14.2 Speak Mode Production of the First Sentence
   14.3 Speak Mode Production of the Second Sentence
   14.4 Speak Mode Production of the Third Sentence
   14.5 Definition of LAGo Think-Speak.2
   14.6 Speak Mode Extension to a Verb-Initial Construction

15. Ambiguity, Word Order, and Collocation
   15.1 Adnominal and Adverbial Uses of Phrasal Modifiers
   15.2 Graphical Analyses of the Hear and the Speak Mode
   15.3 Intrapropositional Modifier Chaining (Adnominal)
   15.4 Multiple Adverbials
   15.5 Inferences for Ritualized Phrases and Idioms
   15.6 Collocations and Proverbs

Concluding Remark

Appendix A. Attributes, Values, Lexical Entries, Examples
   A.1 Number of Attributes in a Proplet
   A.2 Standard Order of Attributes in a Proplet
   A.3 Summary of Proplet Attributes and Values
   A.4 Lexical Analysis of Nouns
   A.5 Lexical Analysis of Verbs
   A.6 Lexical Analysis of Adjectives
   A.7 Coordinating Conjunctions
   A.8 Viewing DBS Abstractly
   A.9 List of Examples

Bibliography
Name Index
Subject Index
Abbreviations Referring to Preceding and Subsequent Work

References are to the latest editions.

Preceding Work

SCG = Hausser, R. (1984) Surface Compositional Grammar, pp. 274, München: Wilhelm Fink Verlag
NEWCAT = Hausser, R. (1986) NEWCAT: Parsing Natural Language Using Left-Associative Grammar, Lecture Notes in Computer Science 231, pp. 540, Springer
CoL = Hausser, R. (1989) Computation of Language, An Essay on Syntax, Semantics, and Pragmatics in Natural Man-Machine Communication, Symbolic Computation: Artificial Intelligence, pp. 425, Springer
TCS = Hausser, R. (1992) “Complexity in Left-Associative Grammar,” Theoretical Computer Science, 106.2:283-308, Amsterdam: Elsevier
FoCL = Hausser, R. (1999) Foundations of Computational Linguistics, Human–Computer Communication in Natural Language, 3rd ed. 2014, pp. 518, Springer
AIJ = Hausser, R. (2001) “Database Semantics for natural language,” Artificial Intelligence, 130.1:27–74, Amsterdam: Elsevier
L&I’05 = Hausser, R. (2005) “Memory-Based pattern completion in Database Semantics,” Language and Information, 9.1:69–92, Seoul: Korean Society for Language and Information

Subsequent Work

L&I’10 = Hausser, R. (2010) “Language Production Based on Autonomous Control – A Content-Addressable Memory for a Model of Cognition,” Language and Information, Vol. 11:5–31, Seoul: Korean Society for Language and Information
CLaTR = Hausser, R. (2011) Computational Linguistics and Talking Robots, Processing Content in Database Semantics, pp. 286, Springer; 2nd ed. 2015, pp. 366, preprint at lagrammar.net
HBTR = Hausser, R. (to appear) How to Build a Talking Robot: Linguistics, Philosophy, and Autonomous Control, pp. 120
Introduction
I. BASIC ASSUMPTIONS
A computational model of natural language communication cannot be limited to the grammatical analysis of the language signs. Instead it must begin and end with the agent’s general recognition and action interfaces, treating language interpretation and production as special cases. Recognition and action are connected by cognition, which is embedded in the agent’s memory.

Conceptually, agents without language have only one level of cognition, called the context level. Agents with language have two levels of cognition: the context level and the language level. The connection between language and the world, known as reference, is established solely by cognitive procedures which relate the levels of language and context by means of automatic pattern matching.1

The software of DBS (Database Semantics) models the behavior of natural agents, including language communication, by (a) reading propositional content resulting from recognition into the agent’s database and (b) reading content out of the agent’s database resulting in action. Recognition and action are (c) related by an autonomous control which maintains the agent in a state of balance vis-à-vis the changes in its internal state and external ecological niche.

1 Autonomy from the metalanguage (FoCL Sect. 19.4).
II. COMPONENTS OF A COGNITIVE AGENT
At the most abstract level, a cognitive agent consists of three basic components: (i) the external interfaces, (ii) the database (memory), and (iii) the algorithm.2 They use a common format, called the data structure,3 for representing and processing content.4

The external interfaces are needed by the agent for recognition and action. In natural agents, recognition is based, for example, on eyes to see and ears to hear. Action is based, for example, on a mouth to talk, hands to manipulate, and legs to walk. Without them, the agent would not be able to tell us what it perceives and do what we tell it to do.

The agent’s database is needed for the storage and retrieval of content provided by the interfaces. Without it, the agent would not be able to determine whether or not it has seen an object before, it could not remember the words of language and their meaning, and it would be limited to reflexes connecting input and output directly.

The algorithm is needed to connect the interfaces and the database (i) for reading content provided by recognition into the database, and (ii) for reading content out of the database into action. Also, the algorithm must (iii) process the content in the database for determining goals, planning actions, and deriving generalizations (inferencing).

In the cognition of natural agents, the external interfaces, the data structure, and the algorithm interact very closely. Therefore, in a computational model of natural cognition they must be co-designed within a joint functional cycle. The three basic components may be simple initially, but they must be general and functionally integrated into a coherent framework from the outset, as in the sketch below.

2 These components correspond roughly to those of a von Neumann machine (vNm): the external interfaces represent the vNm input-output device, the database corresponds to the vNm memory, and the algorithm performs the functions of the vNm arithmetic-logic unit. For a schematic comparison of standard computers, robots, and virtual reality machines see FoCL 1.1.3.
3 The term “data structure” is closely related in meaning to the term “data type.” Even though there has been some argument that the format in question should be called an abstract data type rather than a data structure, the latter term is preferred here to avoid confusion with the classic type-token distinction (Sect. 4.1). It is for the same reason that we use the term “kind of sign” rather than “sign type” (Sect. 2.6), “kind of sentence” rather than “sentence type,” “kind of word” rather than “word type,” “kind of coordination” rather than “coordination type,” etc.
4 See HBTR for a more explicit treatment of central cognition (autonomous control) in a talking robot.
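The joint functional cycle of the three components may be pictured as the following minimal sketch in Java (the language of the JSLIM implementation). All class and method names are assumptions made for this illustration only; the sketch shows the cycle connecting interfaces, database, and algorithm, not the DBS specification itself.

// Hypothetical sketch of the three basic components of a cognitive agent.
import java.util.ArrayDeque;
import java.util.Deque;

interface ExternalInterfaces {                    // (i) recognition and action
    String recognize();                           // raw input from the environment
    void act(String output);                      // raw output into the environment
}

class AgentMemory {                               // (ii) the database
    private final Deque<String> contents = new ArrayDeque<>();
    void readIn(String content) { contents.addLast(content); }  // recognition -> memory
    String readOut()            { return contents.pollLast(); } // memory -> action (null if empty)
}

class CognitiveAgent {                            // (iii) the algorithm connecting (i) and (ii)
    private final ExternalInterfaces io;
    private final AgentMemory memory = new AgentMemory();

    CognitiveAgent(ExternalInterfaces io) { this.io = io; }

    void cycle() {
        memory.readIn(io.recognize());            // read recognized content into the database
        String blueprint = memory.readOut();      // process stored content into a blueprint
        io.act(blueprint);                        // realize the blueprint as action
    }
}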
III. TREATING NATURAL LANGUAGE
A model of natural language communication requires the traditional language-specific components of grammar, i.e., the lexicon and the rules of morphology, syntax, and semantics. During communication, these components must cooperate in (i) the hear mode, (ii) the think mode, and (iii) the speak mode.

In the hear mode, the external interfaces provide the input, consisting of modality-dependent unanalyzed language signs. Processed by automatic word form recognition (Sect. 11.1), they are parsed by a time-linear algorithm for syntactic–semantic analysis (Sect. 11.6) into a representation of content which is stored in the database.

In the think mode, the algorithm is used for autonomously navigating through the database, thus selectively activating content. Inferences match their antecedent (forward chaining, CLaTR 13.5.1) or their consequent (backward
chaining, CLaTR 13.5.2) with activated content to derive new content, resulting in reasoning and the derivation of blueprints for action.

In the speak mode, the activation of content and the application of inferences is used for the conceptualization of language production, i.e., choosing what to say. For the production of language from activated content, the agent must work out how to say it. This requires the selection of language-dependent word form surfaces, the handling of word order, and compliance with the rules of agreement. The forward and backward use of an inference is sketched below.
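The following is a hedged sketch only: DBS inferences match proplet patterns (CLaTR 13.5), whereas the sketch reduces antecedent, consequent, and content to plain strings, and all identifiers are assumptions of the example.

// Schematic sketch of forward and backward chaining with inferences of the
// form antecedent => consequent, over contents represented as plain strings.
import java.util.ArrayList;
import java.util.List;

class Inference {
    final String antecedent, consequent;
    Inference(String a, String c) { antecedent = a; consequent = c; }
}

class Reasoner {
    // Forward chaining: activated content matching an antecedent derives the
    // consequent as new content, e.g. a blueprint for action.
    static List<String> forward(List<String> activated, List<Inference> infs) {
        List<String> derived = new ArrayList<>();
        for (Inference inf : infs)
            if (activated.contains(inf.antecedent)) derived.add(inf.consequent);
        return derived;
    }

    // Backward chaining: a goal matching a consequent is reduced to the
    // antecedent, i.e. to content that would have to hold.
    static List<String> backward(String goal, List<Inference> infs) {
        List<String> subgoals = new ArrayList<>();
        for (Inference inf : infs)
            if (goal.equals(inf.consequent)) subgoals.add(inf.antecedent);
        return subgoals;
    }
}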
IV. DEGREES OF DETAIL
In the following chapters, some components of DBS are worked out in great detail, while others are only sketched in terms of their input, their function, and their output. This is unavoidable because of the magnitude of the task, its interdisciplinary nature, and the fact that some technologies are more readily available than others.

For example, realizing DBS as the prototype of an actual robot with external interfaces for recognition and action, i.e., artificial vision, automatic speech recognition, robotic manipulation, and robotic locomotion, is practically out of reach because such machines are not yet available in today’s computational linguistics. This is regrettable because the content in the artificial agent’s database is built from concepts which are “perceptually grounded” in the agent’s recognition and action procedures (Roy 2003).

While the external interfaces of the artificial agent are described here at a high level of abstraction, the algorithm and the data structure are worked out not only in principle, but are developed as “fragments” which model the hear, the think, and the speak mode using concrete examples. The fragments are defined as explicit rule systems for computational implementation, verification, and upscaling.
V. AVAILABLE SYSTEMS AND APPROACHES
Today many kinds of parsers are available. Some are based on statistical methods, such as the chunk parser (Abney 1991; Déjean 1998; Vergne and Giguet 1998), the Brill tagger and parser (Brill 1993, 1994), and the head-driven parser (Collins 1999; Charniak 2001). Others are based on the rules of a phrase structure grammar, such as the Earley algorithm (Earley 1970), the chart parser (Kay 1980; Pereira and Shieber 1987), the CYK parser (Cocke and Schwartz 1970; Younger 1967; Kasami 1965), and the Tomita parser (Tomita 1986).

Also, there are many theories of syntax. Some are based on categorial grammar (Leśniewski 1929; Ajdukiewicz 1935; Bar-Hillel 1964). Related to categorial grammar is the approach of valency theory (Tesnière 1959; Ágel 2000) and dependency grammar (Mel’čuk 1988; Hudson 1991; Hellwig 2003). Others are based on phrase structure grammar (Post 1936; Chomsky 1957), for example generalized phrase structure grammar (GPSG, Gazdar et al. 1985), lexical functional grammar (LFG, Bresnan 1982, 2001), head-driven phrase structure grammar (HPSG, Pollard and Sag 1987, 1994), and construction grammar (Östman and Fried 2004; Fillmore and Kay 1996).

Similarly, there are many approaches to semantic analysis. Some are based on model theory (Tarski 1935, 1944; Montague 1974), others on speech act theory (Austin 1962; Grice 1957, 1965; Searle 1969), or semantic networks (Quillian 1968; Sowa 1984, 2000). In addition, there is rhetorical structure theory (RST, Mann and Thompson 1993), text linguistics (Halliday and Hasan 1976; Beaugrande and Dressler 1981), as well as different approaches to the definition of concepts in cognitive psychology, such as the schema, the template, the prototype, and the geon approach (Sect. 4.1).

This list of partial systems may be continued by pointing to efforts at providing a more general theory of machine translation (Dorr 1993), finding a universal set of semantic primitives (Schank and Abelson 1977; Wierzbicka 1991), application-oriented systems of language production (Reiter and Dale 1997), as well as efforts to improve indexing and retrieval on the Internet by means of metadata mark-up based on XML, RDF, and OWL (Berners-Lee, Hendler, and Lassila 2001).
VI. CORPUS LINGUISTICS
These days, most of the above approaches have been eclipsed or absorbed by corpus linguistics. Corpus linguistics takes credit for using real5 language data, in contradistinction to “armchair linguistics,” which relies on examples invented by scholars to illustrate certain language intuitions as simply and clearly as possible.

To analyze large collections of real text, corpus linguistics uses the semi-automatic method of statistical tagging for annotating word forms, represented as letter sequences. First, a small part of the corpus at hand, called the core corpus, is annotated manually. Second, the transition probabilities from one word form to the next are computed, usually by means of hidden Markov models (HMMs), and transferred from the core corpus to the corpus as a whole, using a simplified tagset. Third, the result of step two is manually “post-edited.”

Statistical tagging is caught on the horns of a dilemma, namely (i) low accuracy and (ii) difficulty of correction. The dilemma has been concealed by misleading claims of excellent accuracy.
For example, Leech (1995) asserts an error rate of 1.7% for the CLAWS 4 tagger of the BNC. This corresponds to a recognition rate of 98.3%, which seems to be very good at first glance. It is important to note, however, that these numbers apply to the word form tokens and not to the types (as in a lexicon). More specifically, the most frequent nine word forms of the BNC, namely the, of, and, to, a, in, is, that, and was, amount to 0.001368% of the types, but cover 21.895% of the word form tokens (FoCL 15.5.2). At the other extreme are the word forms which occur only once (hapax legomena). Even though they comprise 52.807% of the types in the BNC, they cover only 0.388% of the tokens. In other words, not recognizing more than half of the word form types in the BNC would fit 4.38 times into the 1.7% error rate asserted by Leech (1.7 / 0.388 ≈ 4.38). This distressing lack of accuracy (FoCL 15.5.3) cannot be taken lightly because the correct and complete analysis of the word forms is the foundation for all further linguistic work,6 for example, syntactic analysis.

The other horn of the dilemma is that specific errors cannot be corrected by improving step two of the statistical tagger. At a certain point, the only way to improve the error rate7 is correcting by hand. Today, the actual results of the 1994 CLAWS 4 BNC tagging (BNC users’ reference guide, Burnard et al. 1995) are buried under 12 years of massive manual post-editing (BNC XML Edition, Burnard et al. 2007).

In summary, the claimed attraction of statistical tagging for NLP, namely working automatically for any amount of data and for any language, fails on two counts: (i) the manual tagging of the core corpus and (ii) the manual post-editing of the statistical results. Moreover, tag sets are designed to maximize8 the desired distinctions during step two of statistical tagging. These annotations with elementary tags do not provide the grammatical detail which would be available from traditional linguistic analysis and which is essential for a computational model of natural language communication. Most importantly, however, statistical tagging does not run in real time, as is required for building a talking robot. The reason is that the method is inherently limited to annotating language fixed in storage.

5 For a critical evaluation of this assumption see Fillmore (1992).
6 It is also at the root of a 20-year stagnation in speech recognition (CLaTR, Sect. 2.3).
7 The errors of the CLAWS 4 BNC tagging have been analyzed in detail in FoCL Sect. 15.5. As far as we know, this analysis has never been challenged. On the contrary, it has been indirectly confirmed by the curious complaint (1999) that “the sole purpose [of the BNC tagging critique in FoCL] seems to be to argue that statistical word-tagging algorithms are not 100 per cent successful.”
8 The resulting linguistic underspecification fosters the evolution of ever new tagging conventions. Not to worry, however: given enough time and money, different annotation styles may eventually be correlated for comparison thanks to work in the budding field of “interoperability” in corpus linguistics (Ide and Pustejovsky 2010).
The DBS alternative is automatic word form recognition9 (Sect. 11.1). It is fully automatic and runs in real time, as required for the transfer of fresh content in free, spontaneous human-robot dialog. Based on lexical entries (Sects. A.4–A.6) and morphological rules (FoCL Chaps. 13–15), it provides the linguistic detail required for modeling the cycle of natural language communication. It is also well-suited for recognizing the word forms in a corpus automatically and accurately.

In contradistinction to statistical tagging, automatic word form recognition makes it possible to pinpoint and correct specific errors. Found by testing the system on arbitrarily large test lists derived from corpus data, errors are corrected by adding missing entries to the lexicon and by upgrading the rules. Also, the domain(s) of low frequency words may be specified in the lexical entries.10 As part of the linguistic analysis software, any such improvements will benefit subsequent applications (or redoing earlier ones). In statistical tagging, in contrast, the corrections of the post-editing phase apply only to the one corpus at hand. When annotating another corpus, post-editing must be done from scratch all over again. This correction cycle is illustrated by the sketch below.

9 The automatic word form recognition system of DBS is called LA Morph. Based on the allomorph method (FoCL Chap. 14), it has been implemented in C (Lorenz and Schüller 1996; Beutel 1997) and Java (Handl 2012, Sect. 4.1). LA Morph has been applied to Albanian (Kabashi 2003), Arabic (ben-Zineb 2010), Bulgarian (Sabeva 2010), Chinese (Feng 2011, Graf 2011), English (Leidner 2000, Bauer 2011), French (Pepiuk 2010), German (Lorenz 1996, Girstl 2006, Gao 2007, Jägers 2010), Italian (Wetzel 1996, Weber 2007), Korean (Kim 2009), Polish (Niedobijczuk 2009), Portuguese (Rudat 2011), Rumanian (Pandea 2010), Russian (Stefanskaia 2005, Vorontsova 2010), Swedish (Lipp 2010), and Spanish (Mahlow 2000). Most of these software systems were built within six months and tested on small corpora of 100 000 running word forms. On average, the recognition rate was 70% for types and 90% for tokens.
10 Adding the domain(s) to the lexical analysis of a word makes it possible to classify texts in which it is contained (Grün 1998). For example, megazostrodon (paleontology), free solo (rock climbing), cold opening (TV series), safe mode (space travel), or Costa Rica dome (oceanography) provide a practically single-handed characterization of the domain. As noted by Zipf (1949), the lower the frequency of a word or a phrase relative to everyday language, the more relevant it is semantically.
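The contrast between per-corpus post-editing and rule-and-lexicon correction may be illustrated with a toy recognizer. The sketch below is a deliberately simplified assumption: a real system like LA Morph uses the allomorph method rather than the single suffix rule shown here, and all identifiers are invented for the example.

// Toy sketch of lexicon-plus-rules word form recognition. A recognition
// failure is repaired once, by adding a lexical entry or upgrading a rule,
// and the repair carries over to every subsequent text.
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class ToyWordFormRecognizer {
    private final Map<String, String> lexicon = new HashMap<>();

    void addEntry(String stem, String analysis) { lexicon.put(stem, analysis); }

    Optional<String> recognize(String surface) {
        if (lexicon.containsKey(surface))                 // direct lexical lookup
            return Optional.of(lexicon.get(surface));
        if (surface.endsWith("s")) {                      // one toy morphological rule
            String stem = surface.substring(0, surface.length() - 1);
            if (lexicon.containsKey(stem))
                return Optional.of(lexicon.get(stem) + "+plural");
        }
        return Optional.empty();                          // error: fix lexicon or rules
    }
}

For instance, after addEntry("book", "noun:book"), the surface books is recognized as noun:book+plural, while an unknown surface is reported as a correctable recognition failure rather than silently assigned a probable tag.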
VII. INTUITIVE FOUNDATIONS

The approaches listed in Section V raise the question: Which of them should be chosen to serve as components of a general, complete, coherent, computational model of natural language communication? On the one hand, there is little interest in reinventing a component that is already available. On the other hand, reusing partial theories by integrating them into a general system of natural language communication comes at a considerable cost: given that the available theories have originated in different traditions and for different purposes, much time and effort would have to be spent on making them compatible.
Apart from the time-consuming labor of integrating partial theories, there is the more general question of which of them could be suitable in principle to be part of a coherent, comprehensive, functional theory of how natural language communication works. This question has been investigated in FoCL for the majority of the systems listed above.11 It turned out that a formal theory of how natural language works, suitable for building a talking robot, did not exist. Therefore, Database Semantics had to be developed from the beginning. It provides the first and so far the only formal rule system in which natural language interpretation and production are reconstructed as turn-taking, i.e., the cognitive agent’s ability to switch between the hear, the think, and the speak mode.

DBS is founded on notions of traditional language analysis and philosophy of language, such as the time-linear structure of language emphasized by de Saussure and the notion of a proposition, as formulated by Aristotle. Frege’s Principle is used in the form of surface compositionality, and Frege’s method of assigning semantic interpretations to different kinds of lines (edges) is used for a graphical representation of semantic relations. The type-token distinction of Peirce provides the formal basis for computational pattern matching. Elementary meanings are defined in terms of the agent’s recognition and action procedures. The agent’s external world (ecological niche) is treated as given; the task is modeling the autonomous agent’s cognitive processing.

11 These analyses are conducted at a high level of abstraction. For example, rather than discussing how Situation Semantics (Barwise and Perry 1983) and Discourse Semantics (Kamp and Reyle 1993) might differ from each other, FoCL concentrates on the more basic question of whether or not a metalanguage-based truth-conditional approach could in principle be suitable for a computational model of natural language communication. Similarly, rather than comparing GPSG, LFG, HPSG, and GB, FoCL concentrates on the question of whether or not the algorithm of substitution-based Phrase Structure Grammar could in principle be suitable for modeling the speak and the hear mode.

VIII. FORMAL FOUNDATIONS

The reconstruction of the communication cycle in Database Semantics required the following innovations:

– The algorithm of LA grammar (TCS’92):
Left-Associative Grammar (LA grammar) is based on computing possible continuations. This is in contrast to the algorithms commonly used in today’s linguistics, namely Phrase Structure Grammar (PSG) and Categorial Grammar (CG), which are based on computing possible substitutions. Computing possible continuations models the time-linear structure of natural language and permits us to handle turn-taking as the interaction of three kinds of LA grammar, namely LAGo hear, LAGo think, and LAGo think-speak.
– The data structure of proplets (AIJ’01):
DBS uses the data structure of proplets, defined as non-recursive feature structures with ordered attributes and double-ended queues as values. This is in contradistinction to most other systems of formal grammar analysis, which use recursive feature structures with unordered attributes. Proplets make it possible (i) to code any degree of empirical detail without increasing computational complexity, (ii) to easily derive proplet patterns for efficient computational pattern matching, and (iii) to define interproplet relations solely by means of addresses implemented as pointers.

– The database schema of a word bank (AIJ’01):
The database schema is content-addressable in that it uses the alphabetical order induced by the proplets’ core values for storage and retrieval. This is in contrast to today’s most widely used databases, which are coordinate-addressable in that they rely on an index defined as an inverted file. A word bank differs from other content-addressable systems in that it uses not only the alphabetical order induced by the proplets’ core values for storage and retrieval, but also their time-linear arrival order. This is possible because the coding of interproplet relations by addresses makes proplets order-free.

The algorithm of LA grammar, the data structure of proplets, and the database schema of a word bank provide for an autonomous navigation through propositional content, utilizing the grammatical relations between proplets as a kind of railroad system and LA grammar as a locomotive which moves a unique focus point along the rails. This combination of a data structure and an algorithm serves as our basic model of thought. It is designed as a control structure which relates the agent’s recognition and action, based on the selective activation of stored knowledge and on inferencing, and driven by the principle of balance. A minimal sketch of the proplet data structure and the word bank schema follows below.
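The following minimal sketch renders the two definitions above in Java. The attribute inventory is reduced to three attributes and the class names are assumptions of the sketch; the full proplet definitions are given in Appendix A.

// Minimal sketch of a proplet (non-recursive feature structure with ordered
// attributes and double-ended queues as values) and of a word bank (token
// lines stored under alphabetically ordered core values, with the time-linear
// arrival order preserved within each token line).
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class Proplet {
    // LinkedHashMap keeps the attributes in their defined order.
    final Map<String, Deque<String>> attributes = new LinkedHashMap<>();

    Proplet(String core, String cat, String prn) {
        put("core", core);                      // core value, e.g. a concept name
        put("cat", cat);                        // category, e.g. snp
        put("prn", prn);                        // proposition number (an address)
    }
    private void put(String attr, String value) {
        Deque<String> q = new ArrayDeque<>();   // double-ended queue as value
        q.add(value);
        attributes.put(attr, q);
    }
    String core() { return attributes.get("core").peekFirst(); }
}

class WordBank {
    // TreeMap supplies the alphabetical order induced by the core values.
    private final TreeMap<String, Deque<Proplet>> tokenLines = new TreeMap<>();

    void store(Proplet p) {                     // append preserves arrival order
        tokenLines.computeIfAbsent(p.core(), k -> new ArrayDeque<>()).addLast(p);
    }
    Deque<Proplet> retrieve(String core) {      // content-addressable retrieval
        return tokenLines.getOrDefault(core, new ArrayDeque<>());
    }
}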
IX. SCOPE OF THE LINGUISTIC ANALYSIS
Using English as the main example, our linguistic analysis aims at a systematic development of the major constructions of natural language. These are (i) functor–argument and (ii) coordination, at the elementary, phrasal, and clausal levels of grammatical complexity. The major constructions are analyzed in a strictly time-linear derivation order, in the hear mode and in the speak mode.

It is shown that the greater functional completeness of Database Semantics as compared to the sign-based approaches is no obstacle to a straightforward, linguistically well-motivated, homogeneous analysis, which provides for a highly efficient computational implementation. The analyses include constructions which have eluded a generally accepted treatment within nativism, especially the gapping constructions (Chaps. 8 and 10), including “right-node-raising.”
X. STRUCTURE OF THE BOOK
Part I presents the SLIM Theory of Language (FoCL) in terms of the cognitive agent’s external interfaces, data structure, and algorithm. Many questions which are crucial for the overall system are addressed in general, but not pursued in further detail. Examples are the nature of concepts and their role in recognition and action, the reference mechanisms of the different sign kinds, and the formal structure of the context level.

Part II systematically analyzes the major constructions of natural language, presented as schematic derivations of English examples in the hear and the speak mode. The hear mode analyses show the strictly time-linear coding of functor–argument structure and coordination into sets of proplets. The speak mode analyses show the retrieval-based navigation through a word bank (conceptualization), as well as the language-dependent sequencing of word forms and the precipitation of function words.

Part III presents fragments of English. In DBS, a fragment of natural language communication is functionally complete, but has limited data coverage. The fragments show the interpretation and the production of small sample texts in complete detail, explicitly defining the lexicon and the LAGo hear, LAGo think, and LAGo speak grammars required.

The different scope and the different degrees of abstraction characterizing the three Parts may be summarized schematically as follows:
Part I: interfaces, components, ontology, data structure, database schema, algorithm – abstraction level I (high)
Part II: schematic derivations in the speak and the hear mode – abstraction level II (intermediate)
Part III: formal fragments – abstraction level III (low)
A high degree of abstraction corresponds to a low degree of linguistic and technical detail, and vice versa. The general framework outlined in Part I is built upon in Part II. The methods of analysis presented in Part II are built upon in Part III. The analyses and definitions of Parts II and III have served as the declarative specification of a Java implementation called JSLIM (Kycia 2004), which was subsequently reimplemented by Jörg Kapfer and Johannes Handl, also in Java.
Part I. Communication Mechanism of Cognition
1. Matters of Method
In science, the method of verification is the outermost line of defense against error. It may be crude as long as it can be made objective. Designed for a theory or a kind of theories, the verification method should constantly raise new questions of a kind (i) which can be answered more or less conclusively and (ii) which are relevant for the theory’s further development.

In natural science, verification consists in experiments which (i) are specified in quantitative terms and (ii) can be repeated by anyone anywhere. For this, the notions and structures of the theory must be suitable for the scientific setup of experiments. Given that the subject matter of linguistics consists of conventions in a language community, the quantitative verification method happens to be unsuitable for the analysis of natural language communication.

The method we propose instead consists in building a functional model. This requires (i) a declarative specification in combination with an efficiently running implementation (prototype of a talking robot), (ii) establishing objective channels of observation, and (iii) equating the adequacy of the robot’s behavior with the correctness of the theory – which means that the robot must have (iv) the same kinds of external interfaces as humans, and process language in a way which is (v) input/output-equivalent with human language processing.
1.1 Sign- or Agent-Based Analysis of Language?

A natural language manifests itself in the form of external signs (e.g. sound waves, dots on paper), the structures of which have evolved as conventions within a language community. Produced by cognitive agents in the speak mode and interpreted by agents in the hear mode, these signs are used for the transfer of content from the speaker to the hearer. Depending on whether the scientific analysis concentrates on the isolated signs or on the communicating agents, we may distinguish between sign-based and agent-based approaches.1

1 Clark (1996) distinguishes between the language-as-product and language-as-action traditions.
Sign-based approaches like generative grammar, truth-conditional semantics, corpus linguistics, and text linguistics analyze expressions of natural language as objects, carved in stone or fixed on paper, magnetic tape, or other electronic means. These approaches abstract away from the aspect of communication and are therefore neither intended nor suitable to model the speak and the hear mode. Instead, linguistic examples, unrelated to the communicating agents, are analyzed as hierarchical structures which are formally based on the principle of possible substitutions.

The agent-based approach of Database Semantics (DBS), in contrast, treats the concrete signs (unanalyzed surfaces) as the end point of the speaker’s language production and as the starting point of the hearer’s language interpretation. The cycle of natural language communication consists of the agents’ cognitive production and interpretation procedures and requires a time-linear2 analysis formally based on the principle of possible continuations. The goal of DBS is a theory of natural language communication which is complete with respect to function and data coverage, of low mathematical complexity, and suitable for an efficient implementation on the computer.

The central question of DBS is: How does communicating with natural language work? In the most simple form, this question is answered as follows. Natural language communication takes place between cognitive agents. They have real bodies “out there” in the world with external interfaces for nonlanguage recognition and action at the context level, and language recognition and action at the language level. Each agent contains a database (memory) in which contents are stored. These contents consist of the agent’s knowledge, experiences, current recognition, intentions, plans, etc. The agent-internal, cognitive interaction between the language and the context level is called reference and is based on pattern matching.

In communication, an agent in the speak mode codes content from its database into signs of language which are realized externally via the language output interface. These signs are recognized by another agent in the hear mode via the language input interface, and their content is decoded and stored in the second agent’s database. This procedure is successful if the content coded by the speaker is decoded and stored equivalently by the hearer. An agent’s switching between the speak and the hear mode is called turn-taking.3

Conceptually, the modeling of turn-taking is based on three variants of time-linear (Left-Associative) grammar, called LAGo hear, LAGo think, and LAGo speak (Sect. 3.6). In communication, they cooperate as follows:
2 “Time-linear” refers to a derivation order, while “linear time” is a complexity degree (FoCL 12.1.1).
3 For studies of turn-taking see Schegloff (2007) and Thórisson (2002).

1.1.1 THE BASIC MODEL OF TURN-TAKING
[Diagram: two boxes, the hear mode (LAGo hear) on the left and the speak mode (LAGo think feeding LAGo speak) on the right, connected by a small box containing the external sign s; a dashed arrow on the right indicates the switch between the modes.]
In the agent shown on the right (speak mode), LAGo think selectively activates content stored in the agent’s database. LAGo speak maps the activated content into surfaces of a natural language. They are realized as external signs (represented by the small box containing s). In the agent shown on the left (hear mode), the signs are interpreted by LAGo hear and stored as content. The basic model of turn-taking may be interpreted in two ways:

1.1.2 TWO VIEWS OF TURN-TAKING

1. Viewed from the outside: Two communicating agents are observed as they are taking turns. This is represented by 1.1.1 when the two boxes are taken to be two different agents, one in the hear and the other in the speak mode.

2. Viewed from the inside: One communicating agent is observed as it switches between being the speaker and the hearer. This is represented by 1.1.1 when the two boxes are taken to be the same agent switching between the speak and the hear mode (with the dashed right-hand arrow indicating the switch).

In DBS, turn-taking is treated as a well-defined, well-motivated computational problem, which is central to the linguistic analysis of natural language: all syntactic and semantic analysis must be integrated into turn-taking as the most basic mechanism of communication. Without it, there is only one-sided monologue as the contrary of communication. A minimal sketch of the two views follows below.
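The two views may be made concrete with a small Java sketch. It is an illustration under assumed names, reducing the three LAGo variants to string-based stubs: the same class can be read as two agents taking turns (outside view) or as one agent switching modes (inside view).

// Schematic sketch of 1.1.1: surfaces are reduced to strings and the three
// LAGo components to illustrative stubs.
import java.util.ArrayList;
import java.util.List;

class TurnTaker {
    private final List<String> database = new ArrayList<>();

    void hearMode(String sign) {                  // LAGo hear: sign -> content
        database.add("content:" + sign);
    }

    String speakMode() {                          // LAGo think + LAGo speak
        // LAGo think: selectively activate stored content (here: the latest)
        String activated = database.get(database.size() - 1);
        // LAGo speak: map the activated content into an external surface
        return activated.replace("content:", "");
    }

    public static void main(String[] args) {
        TurnTaker speaker = new TurnTaker();
        TurnTaker hearer  = new TurnTaker();
        speaker.hearMode("Julia sleeps.");        // content acquired earlier
        String sign = speaker.speakMode();        // speak mode produces the sign
        hearer.hearMode(sign);                    // hear mode decodes and stores it
    }
}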
1.2 Verification Principle DBS is developed as a functional model, presented as a declarative specification for an efficient computer program running on appropriate hardware. A declarative specification describes the necessary properties of a software,such 2 3
“Time-linear” refers to a derivation order, while “linear time” is a complexity degree (FoCL 12.1.1). For studies of turn-taking see Schegloff (2007), Thórisson (2002) .
6
1. Matters of Method
as the external interfaces, the data structure, the database schema, the algorithm, and the functional flow. Thereby, the accidental4 properties of an implementation, such as the choice of programming language or the stylistic idiosyncrasies of the programmer, are abstracted away from. In contradistinction to an algebraic definition5 in logic, a declarative specification is not based purely on set theory. Instead, it takes a procedural point of view, specifying the general architecture in terms of components with input and output conditions. A declarative specification must be general enough to provide a solid mathematical foundation, and detailed enough to permit easy programming in different environments. A declarative specification is needed because machine code is not easily read by humans. Even programs written in a higher level programming language such as Lisp are meaningful only to experts. What one would like to see in a declarative specification is the abstract functional solution to the task which the software was programmed to perform. The declarative specification for a certain application consists of two levels: (i) a general theoretical framework (e.g. a functional system of natural language communication) and (ii) a specialization of the general framework to a specific application (e.g. English, German, Korean, or any other natural language). The theoretical framework in combination with a specialized application may in turn be realized (iii) in various different implementations, written in Lisp, C, or Java, for example. 1.2.1 D ECLARATIVE
SPECIFICATION AND ITS IMPLEMENTATIONS
(i)
theoretical framework declarative specification
(ii)
(iii)
specialized application 1
specialized application 2
specialized application 3
implemen− tation 3.1
implemen− tation 2.1
implemen− tation 1.1
implemen− tation 2.2
implemen− tation 1.2 implemen− tation 1.3 etc.
etc.
implemen− tation 3.2
implemen− tation 2.3 etc.
etc.
different procedural implementations
1.3 Equation Principle
7
The different implementations of a declarative specification should be equivalent with respect to the necessary properties. The evolving declarative specification and the associated up-to-date implementations6 should allow to automatically demonstrate the functioning of the theory in its current stage, and to test it automatically with respect to an ever-widening range of various tasks. In this way, error, incompleteness, and other weaknesses of the current stage may be determined (explicit hypothesis formation, FoCL 7.2.3), which is a precondition for developing the next improved stage of the declarative specification and its accompanying software (upscaling). The cycle of theory development and automatic testing is the verification method of DBS (FoCL Introduction VIII-X; CLaTR Sect. 1.3). It differs from the quantitative methods of the natural sciences (repeatability of experiments) as well as the logical-axiomatic methods of mathematics (proof of consistency), though it is compatible with them. The verification method of DBS is important for the following reasons. First, the signs of natural language are based on conventions which are not susceptible to the quantitative methods of the natural sciences. Second, the analysis of natural language in linguistics and neighboring fields such as the philosophy of language is fragmented into many different schools and subschools, which raises the question of their comparative evaluation.
1.3 Equation Principle

DBS aims at modeling the language communication of artificial agents as naturally as possible, for two reasons. First, maximal user-friendliness should be provided in practical applications. User-friendliness in human-machine communication means that the human and the robot can understand each other (i) correctly and (ii) without the human having to adapt to the machine.[7]

[4] The term accidental is used here in the philosophical tradition of Aristotle, who distinguishes between the necessary and the accidental (or incidental – kata sumbebêkos).
[5] CoL and TCS present an algebraic definition of LA grammar, which benefited greatly from help by Dana Scott during the author's 1986–1989 stay at Carnegie Mellon University.
[6] One may also write the software first and then distill the abstract system from it, as in NEWCAT (Lisp source code) and CoL, TCS (declarative specification).
[7] For this, the robot must be designed to have procedural counterparts of human notions. For example, in order to understand the word red, the robot must be capable of physically selecting the red objects from a basket; in order to understand the notion of being happily surprised, the robot must be capable of experiencing this emotion itself; etc. Given that the technical preconditions for this kind of user-friendliness will not become available for some time, Liu (2001) proposes to integrate current robotic capabilities with practical tasks guided by humans. This is a positive example of a smart solution, like the use of restricted language in machine translation (FoCL 2.5.5, 3).
Second, long-term upscalability in theory development should be ensured. Upscaling in the construction of a talking robot means that one can proceed without difficulty from the current prototype to one of greater functional completeness and/or data coverage.[8] In the history of science, difficulties in upscaling have practically always indicated a fundamental problem with the theory in question.[9]
To ensure user-friendliness and upscalability in the long run, DBS must strive to approximate at the various levels of abstraction what has been called "psychological reality." For this purpose, we propose the following principle, which equates the correctness of the theoretical description with the behavioral adequacy of the electronic model (prototype of a talking robot).

1.3.1 THE EQUATION PRINCIPLE OF DATABASE SEMANTICS
1. The more realistic the theoretical reconstruction of cognition, the better the functioning of the robot.
2. The better the functioning of the robot, the more realistic the theoretical reconstruction of cognition.

The first part of the Equation Principle looks for support from, and convergence with, the neighboring sciences in order to improve the performance of the prototype. This means, for example, that we avoid conflicts with established facts or strong conjectures regarding phylogenetic and ontogenetic development as provided by ethology and developmental psychology, include the functional explanations of anatomy and physiology, and take seriously the results of mathematical complexity theory (no undecidable or exponential algorithms).
The second part of the Equation Principle provides a heuristic strategy in light of the fact that the "real" software structures of cognition (at their various levels of abstraction) are not accessible to direct observation.[10] Our strategy tries to achieve a realistic reconstruction indirectly, by aiming for functional completeness and completeness of data coverage in the incremental upscaling of an artificial cognitive agent.

[8] For example, functional completeness at the language level requires the ability of automatic word form recognition in principle. Extending the data coverage means that more and more word forms of the language can be recognized. At the context level, functional completeness requires the ability of locomotion in principle. Extending the data coverage means that more and more differentiated forms of moving around become available.
[9] Problems with upscaling in truth-conditional semantics arise in the attempts to handle the Epimenides Paradox (FoCL Sect. 19.5), propositional attitudes (ibid., Sect. 20.3), and vagueness (ibid., Sect. 20.5). Problems with upscaling in generative grammar arise in the attempts to handle the constituent structure paradox (ibid., Sect. 8.5) and gapping constructions (Chap. 10).
[10] A notable exception is the direct study of central cognition in neurology, especially fMRI or functional magnetic resonance imaging (Matthews et al. 2003; Jezzard et al. 2001). Currently, however, these data leave room for widely differing interpretations, and are used to support conflicting theories.
1.4 Objectivation Principle

For a functional reconstruction of cognition in general and natural language communication in particular, different kinds of data are available. The differences stem in part from the alternative constellations in which the data originate, and in part from the alternative channels used in the respective constellations. The constellations concern the interaction between (i) the user, (ii) the scientist, and (iii) the electronic model (robot). They are distinguished as follows:

1.4.1 CONSTELLATIONS PROVIDING DIFFERENT KINDS OF DATA
1. Interaction between (i) the user and (iii) the robot
2. Interaction between (i) the user and (ii) the scientist
3. Interaction between (ii) the scientist and (iii) the robot

Depending on the constellation, data can be transmitted via the following channels:

1.4.2 DATA CHANNELS OF COMMUNICATIVE INTERACTION
1. The auto-channel processes input automatically and produces output autonomously, at the context as well as the language level. In natural cognitive agents, i.e. the user and the scientist, the auto-channel is present from the beginning in full functionality. In artificial agents, in contrast, the auto-channel must be reconstructed – and it is the goal of DBS to reconstruct it as realistically as possible.
2. The extrapolation of introspection is a specialization of the auto-channel and results from the scientist's effort to improve human-machine communication by taking the view of the human user. This is possible because the scientist and the user are both natural agents.
3. The service channel is designed by the scientist for the observation and control of the artificial agent. It allows direct access to the robot's cognition because its cognitive architecture and functioning constitute a construct which in principle may be understood completely by the scientist who built it.
The three constellations and the role of the three data channels in the interaction between user, scientist, and robot may be shown graphically as follows:

1.4.3 INTERACTION BETWEEN USER, ROBOT, AND SCIENTIST

[Diagram: the user (i), the scientist (ii), and the robot (iii). The user and the robot are connected by the auto-channel; the scientist is connected to the user by the extrapolation of introspection and to the robot by the service channel.]
The scientist observes the external behavior of the user and the robot via the auto-channel, i.e. the scientist sees what they do and can also interview them about it. In addition, the scientist observes the cognitive states of (a) the user indirectly, via a scientifically founded extrapolation of introspection, and (b) the robot directly, via the service channel. For the scientist, the user and the robot are equally real agents "out there" in the world, and their cognitive states have the same ontological status.[11]
Of the three channels, the auto-channel is available to the user, the robot, and the scientist. It is the channel used most, but it is also the most prone to error: at the level of context there are the visual illusions, for example, and at the level of language there are the misunderstandings. In addition, one has to take into account the possibility that the partner of discourse might deviate from the truth, either consciously or unconsciously. As long as everyday access to the partner of discourse is restricted to the auto-channel, we can never be completely certain whether what we said was really understood as intended by us, whether we really understood what was intended by the other, or whether what was said was really true. In philosophy, this is the much-discussed problem known as solipsism (Wittgenstein 1921).
For a scientific analysis of natural language communication, however, there are the privileged accesses of (a) the extrapolation of introspection and (b) the service channel. In the extrapolation of introspection, the discourse between the scientist and the user is restricted to the domain of user-robot interaction. Therefore, misunderstandings between the scientist and the user are much less likely than in free communication, though they are still possible. The direct access to the robot via the service channel, furthermore, allows the scientist to determine objectively whether or not the cognition of the artificial agent is functioning properly. Thus, artificial cognitive agents are of special epistemological interest insofar as they are not subject to the problem of solipsism.
1.5 Equivalence Principles

The methodological principles of DBS presented so far, namely

1. the Verification Principle, i.e. the continuous upscaling of a declarative specification accompanied by computational testing in the form of implemented prototypes (Sect. 1.2),
2. the Equation Principle, i.e. equating theoretical correctness with the behavioral adequacy of the prototype during long-term upscaling (Sect. 1.3), and
3. the Objectivation Principle, i.e. the establishing of objective channels for observing language communication between natural and artificial agents (Sect. 1.4),

are constrained by the Equivalence Principles 1.5.1 and 1.5.2:

1.5.1 THE INTERFACE EQUIVALENCE PRINCIPLE

4. The artificial surrogate must be equipped with the same interfaces to the external world as the natural prototype.

At the highest level of abstraction, interface equivalence requires the external interfaces of recognition and action at the context and the language level (2.3.2, 2.4.1). At lower levels of abstraction, the interfaces in question split up into the different sensory modalities (Chen 2006, Sect. 6.13.1), such as vision, audio, and the sense of touch for recognition, and manipulation and articulation for action.[12]
Interface equivalence between the model and the natural prototype is crucial for a correct reconstruction of the functional flow from recognition to action. For example, if the human tells the robot to pick up the blue square, the robot must not only understand the request, but must (i) be able to recognize the object and (ii) perform the action in question. This requires vision and manipulation abilities in the robot. The Interface Equivalence Principle has fundamental consequences for the theory of semantics for natural language, especially its ontological foundations (2.3.1).

[11] This set-up constitutes an alternative to the hermeneutic approach presented in Habermas (1981).
[12] There are other sensory modalities, such as taste, smell, and locomotion, but it would be a challenge to use them for natural language communication.

The Interface Equivalence Principle (4) is presupposed by the Principle of Input/Output Equivalence (5):

1.5.2 THE INPUT/OUTPUT EQUIVALENCE PRINCIPLE

5. Input/Output Equivalence requires that the artificial agent (i) takes the same input and produces the same output as the human prototype, (ii) disassembles input and output in the same way into parts, and (iii) orders the parts in the same way during input and output.

The input and output data, like the external interfaces, are concretely given and are therefore susceptible to the quantitative methods of the natural sciences. Input/output equivalence between the computational model and the natural prototype is especially relevant for the automatic interpretation and production of the signs used in natural language communication.
For example, the English sentences The dog bit the man and The man bit the dog consist of the same word forms, but clearly differ in their literal meaning. Given that both expressions use the same words, their difference in meaning stems solely from a difference of order, in the speak and the hear mode. If the robot did not use the same surface order as the human, as required by the Principle of Input/Output Equivalence, communication would fail. Therefore, this principle has fundamental consequences for the theory of grammar for natural language.
The Equivalence Principles 1.5.1 and 1.5.2 constitute a minimal requirement for any scientific reconstruction of cognition in general and of the mechanism of natural language communication in particular. The reason is as follows: If we had direct access to the architecture and the functioning of cognition, comparable to the investigation of the physical structures and functions of the bodily organs in the natural sciences (anatomy, physiology, chemistry, physics), the resulting model would certainly have to satisfy the Principles of Interface Equivalence and Input/Output Equivalence. If, due to the absence of direct access, the nature of the cognitive system must instead be inferred indirectly, namely in an incremental process of upscaling the functional completeness and the data coverage of an artificial surrogate, this does not diminish the importance of the external interfaces and the input/output data. On the contrary, as concretely given, directly observable structures, they constitute the external fixed points for any scientifically well-founded reconstruction of the internal cognitive procedures.
1.6 Surface Compositionality and Time-Linearity

The general principles of Interface Equivalence and Input/Output Equivalence require a careful analysis and reconstruction (i) of the natural agent's recognition and action components and (ii) of the data being passed through these components. One important kind of data is the expressions of natural language, produced in the speak mode and interpreted in the hear mode.
Externally, these data are objects in a certain modality, represented by sounds, hand-written or printed letters, or the gestures of a sign language. They may be recorded on tape, paper, film, or disc, and measured and described with the methods of the natural sciences. Given that these objects are concretely given, they constitute the empirical basis which linguistic analysis should neither add to nor subtract from. This elementary methodological principle is known as Surface Compositionality (SCG):

1.6.1 SURFACE COMPOSITIONALITY
A linguistic analysis is surface compositional if it uses only the concrete word forms as the building blocks of grammatical composition, such that all syntactic and semantic properties of a complex expression derive systematically from the grammatical properties and the literal meaning1 of the lexical items.

Surface Compositionality is best illustrated by examples which violate it, such as the following grammatical analysis:

1.6.2 ANALYSIS VIOLATING SURFACE COMPOSITIONALITY

(v)
├─ (np)
│  ├─ (sn′ np)  every
│  └─ (sn)      girl
└─ (np′ v)
   ├─ (np′ np′ v)  drank
   └─ (np)
      ├─ (sn′ np)  ∅
      └─ (sn)      water
In order to treat the noun phrases every girl and water alike, this analysis postulates the zero element ∅. The presumed “linguistic generalization” is illegitimate, however, because the postulated determiner ∅ of water is not concretely given in the surface.
Nevertheless, the categories of 1.6.2 are well-motivated in terms of valency theory (Tesnière 1959) and will be reused in 1.6.5 for a correct analysis. They are defined as follows:

1.6.3 THE CATEGORIES OF 1.6.2

(sn′ np)     = determiner, takes a singular noun for the valency position sn′ and makes a noun phrase np.
(sn)         = singular noun, fills the valency position sn′ in the determiner.
(np′ np′ v)  = transitive verb, takes a noun phrase np and makes an intransitive verb (np′ v).
(np)         = noun phrase, fills a valency position np′ in the verb.
(np′ v)      = intransitive verb, takes a noun phrase np and makes a (v).
(v)          = verb with no open valency positions (sentence).
The rules generating 1.6.2 are based on the principle of possible substitutions of phrase structure grammar (PSG), and are defined as follows:

1.6.4 PSG RULES COMPUTING SUBSTITUTIONS FOR DERIVING 1.6.2

(v)          → (np) (np′ v)
(np)         → (sn′ np) (sn)
(np′ v)      → (np′ np′ v) (np)
(sn′ np)     → every, ∅
(sn)         → girl, water
(np′ np′ v)  → drank
Each rule substitutes the category on the left-hand side of the arrow by the categories on the right-hand side (top-down derivation). It is also conceivable to replace the categories on the right-hand side by the one on the left-hand side (bottom-up derivation). Without the zero determiner postulated in 1.6.2, at least one additional rule would have to be defined. However, according to the Principle of Surface Compositionality, it is methodologically unsound simply to postulate the existence of something that is absent, but considered necessary or desirable[13] for one's theory. Failure to maintain Surface Compositionality leads directly to high mathematical complexity and computational intractability.

[13] The inverse kind of violation of Surface Compositionality consists in treating words which are concretely given in the surface as if they weren't there, simply because they are considered unnecessary or undesirable for one's "linguistic generalization." For a more detailed discussion see SCG and FoCL Sects. 4.5, 17.2, 18.2, and 21.3.
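To see the substitution principle at work, consider the following minimal sketch (Python; the rule encoding and function names are our own illustrative assumptions, not part of any DBS implementation). It derives 1.6.2 top-down from the rules of 1.6.4 and makes explicit why the zero element is needed: with these rules, the only way to expand the object (np) above the bare noun water is via a determiner slot, which ∅ fills.

# Hedged sketch: leftmost top-down substitution with the PSG rules of 1.6.4.
RULES = {
    "(v)":         [["(np)", "(np' v)"]],
    "(np)":        [["(sn' np)", "(sn)"]],
    "(np' v)":     [["(np' np' v)", "(np)"]],
    "(sn' np)":    [["every"], ["∅"]],       # the postulated zero determiner
    "(sn)":        [["girl"], ["water"]],
    "(np' np' v)": [["drank"]],
}

def derive(symbols, choices):
    """Expand nonterminals depth-first; `choices` selects rule alternatives."""
    result = []
    for s in symbols:
        if s in RULES:
            result.extend(derive(RULES[s][choices.pop(0)], choices))
        else:
            result.append(s)
    return result

# The last two choices pick the zero determiner and the noun "water".
print(derive(["(v)"], [0, 0, 0, 0, 0, 0, 0, 1, 1]))
# -> ['every', 'girl', 'drank', '∅', 'water']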
Having determined the basic elements of linguistic analysis, i.e. the concretely given surfaces of the signs and their standard lexical analysis, let us turn to the proper grammatical relations between these basic items. The most elementary relation between the words in a sentence is their time-linear order. Time-linear means linear like time and in the direction of time (Sect. 3.4). The time-linear structure of natural language is so fundamental that a speaker cannot but utter a text sentence by sentence, and a sentence word form by word form. Thereby the time-linear principle suffuses the process of utterance to such a degree that the speaker may decide in the middle of a sentence how to continue, for example, because of the facial expression observed in the hearer. Correspondingly, the hearer need not wait until the utterance of a text or sentence has been finished before his or her interpretation can begin. Instead the hearer will interpret the beginning of the sentence without having to know how it will be continued.
Example 1.6.2 violates not only Surface Compositionality, but also Time-Linearity. The grammatical analysis is not time-linear because it fails to combine every girl with drank directly. Instead, based on the principle of possible substitutions, the complex expression drank water must be derived first. A time-linear analysis, in contrast, is based on the principle of possible continuations. As an example, consider the following time-linear derivation, which uses the same categories (1.6.3) as the non-time-linear derivation 1.6.2:
1.6.5 SATISFYING SURFACE COMPOSITIONALITY AND TIME-LINEARITY

(v)
├─ (np′ v)
│  ├─ (np)
│  │  ├─ (sn′ np)  every
│  │  └─ (sn)      girl
│  └─ (np′ np′ v)  drank
└─ (sn)            water
This bottom-up derivation always combines a sentence start with a next word into a new sentence start, using the following (simplified) rules of Left-Associative grammar:

1.6.6 LA RULES COMPUTING CONTINUATIONS FOR DERIVING 1.6.5

(VAR′ X) (VAR)  ⇒  (X)
(VAR) (VAR′ X)  ⇒  (X)

Each rule consists of three patterns. The patterns are built from the variables VAR, VAR′, and X.[14]

[14] In DBS, constants are written in lowercase Roman letters, while variables are written in uppercase Roman letters or in lowercase Greek letters (Appendix A.3.3, 1).
The first pattern of a rule, e.g. (VAR′ X), represents the sentence start ss; the second pattern, e.g. (VAR), the next word nw; and the third pattern, e.g. (X), the resulting sentence start ss′. The variables VAR and VAR′ are restricted to a single category segment, while X is a variable for a sequence of category segments consisting of zero or more elements. Rules computing possible continuations are based on matching their patterns with the input expressions, thereby binding their variables:

1.6.7 APPLICATION OF A RULE COMPUTING A POSSIBLE CONTINUATION

                 ss          nw       ⇒   ss′
rule patterns    (VAR′ X)    (VAR)        (X)
                    |           |          |      matching and binding
categories       (sn′ np)    (sn)         (np)
surfaces         every       girl         every girl
During matching, the variable VAR′ is "vertically" bound to sn′, the variable X to np, and the variable VAR to sn. In the result, the valency position sn′ of the determiner category (sn′ np) has been filled (or canceled), producing the ss′ category (np), and the input surfaces every and girl are concatenated into every girl.
To handle the combination of a verb with object nouns with or without a determiner, e.g. ...drank + a coke versus ...drank + water, in a surface compositional manner, the possible values of the variables VAR and VAR′ are restricted[15] and correlated as follows (FoCL Sect. 16):

1.6.8 VARIABLE DEFINITION OF THE RULES DERIVING 1.6.5
If VAR′ is sn′, then VAR is sn. (identity-based agreement)
If VAR′ is np′, then VAR is np, sn, or pn. (definition-based agreement)

The formalism of time-linear derivation sketched in 1.6.5-1.6.8 is of a preliminary kind. It was used in NEWCAT for the automatic time-linear analysis of 221 syntactic constructions of German and 114 of English, published complete with the Lisp source code. It was also used in CoL for 421 syntactic-semantic constructions of English with a sign-based, hierarchical semantic analysis (CLaTR Sect. 12.4). Introduced by categorial grammar, canceling by deletion is now replaced by #-marking (Sect. 11.6); in this way the information is preserved for the speak mode.

[15] The variable restrictions for handling agreement in English are summarized in Appendix A.3.3, 2.
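As an illustration of how the rules of 1.6.6, the agreement restrictions of 1.6.8, and the derivation 1.6.5 fit together, here is a minimal sketch in Python. The names LEXICON, AGREEMENT, combine, and parse are our own illustrative assumptions; the sketch models canceling by deletion rather than the #-marking of the current system, and it is not the NEWCAT code.

# Hedged toy reconstruction of a time-linear Left-Associative derivation.
LEXICON = {
    "every": ("sn'", "np"),        # determiner (sn' np)
    "girl":  ("sn",),              # singular noun (sn)
    "drank": ("np'", "np'", "v"),  # transitive verb (np' np' v)
    "water": ("sn",),              # noun used without a determiner
}

# Variable restrictions of 1.6.8: a valency position VAR' determines
# which single category segment VAR may be bound to.
AGREEMENT = {"sn'": {"sn"}, "np'": {"np", "sn", "pn"}}

def combine(ss, nw):
    """The two LA rules of 1.6.6 applied to (sentence start, next word).

    Rule 1: (VAR' X) (VAR) => (X), e.g. (sn' np) + (sn)        -> (np)
    Rule 2: (VAR) (VAR' X) => (X), e.g. (np)     + (np' np' v) -> (np' v)
    """
    if len(nw) == 1 and nw[0] in AGREEMENT.get(ss[0], ()):
        return ss[1:]              # rule 1: cancel the ss valency position
    if len(ss) == 1 and ss[0] in AGREEMENT.get(nw[0], ()):
        return nw[1:]              # rule 2: cancel the nw valency position
    return None                    # no possible continuation

def parse(words):
    """Always combine the current sentence start with the next word."""
    surface, cat = words[0], LEXICON[words[0]]
    for word in words[1:]:
        cat = combine(cat, LEXICON[word])
        if cat is None:
            raise ValueError("no continuation at " + repr(word))
        surface += " " + word
        print(surface, "->", "(" + " ".join(cat) + ")")
    return cat

parse(["every", "girl", "drank", "water"])
# every girl              -> (np)
# every girl drank        -> (np' v)
# every girl drank water  -> (v)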
2. Interfaces and Components
That cognitive agents have bodies[1] with interfaces for transporting cognitive content from the external world into the agent (recognition) and out of the agent into the external world (action) can hardly be denied. The properties of the interfaces may be established externally, by observing the interaction of other agents with their environment and with each other, and internally, by observing the functioning of one's own interfaces through introspection (1.4.3). In addition, there is the analysis of the external interface organs involved, provided by natural sciences such as physiology and anatomy, as well as the modeling of the functioning of these interfaces in robotics.
Yet starting the reconstruction of cognition with the agents' external interfaces determines the ontological foundations of Database Semantics (DBS) in a way which makes it incompatible with the sign-based theories. The reason is that the sign-based theories abstract the cognitive agent away, defining semantics as a direct relation between "language and the world." This may have advantages, yet without an agent there cannot be external interfaces, and a cognitive theory without connections to external interfaces is unsuitable for controlling the behavior of an autonomous robot, including its language behavior.
There remains the possibility of extending the sign-based theories without external interfaces into ones which have them. This is not a promising option, however, as shown by the analogy with software development. Experience has shown again and again that a piece of running software can practically never be extended with an interface which was forgotten in the initial design – except by ad hoc emergency measures cleverly adapting some accidental feature of the program. For a clean solution, the program must be rewritten from scratch.
2.1 Cognitive Agents without Language

Let us begin our model of natural language communication with the agent's context component. This is motivated as follows:

[1] The role of the body has recently been reemphasized by Emergentism (MacWhinney 1999).
2.1.1 SUPPORT FOR BUILDING THE CONTEXT COMPONENT FIRST
1. Constructs from the context level may be reused at the language level. This holds for (i) the concepts, as types and as tokens, (ii) the external interfaces for input and output, (iii) the data structure, (iv) the database schema, (v) the algorithm, (vi) the inferences, and (vii) the functional flow.
2. The context is universal in the sense of being independent of a particular language, yet all the different natural languages may be interpreted relative to the same kind of agent-internal context component.

We may view the context component of a talking robot in isolation, like a cognitive agent without language. For example, a squirrel does not have language[2] in the same way as humans do, but its external interfaces are very similar. It has two eyes, two ears, and a nose for recognition, and hands, hind legs, and a mouth for action. It can bury a nut, retrieve it when needed even after a long time, and eat it. If we could use the cognition of an artificial squirrel as the context component of an autonomous robot to be extended to language, we would adapt the ears to recognize language surfaces, add synthesizers for speaking, provide ample computing power, and design a theory to use it for natural language communication.
Schematically, a natural or artificial cognitive agent without language (or the isolated context component of an agent with language) may be shown as follows:

2.1.2 INTERFACES OF A COGNITIVE AGENT WITHOUT LANGUAGE

[Diagram: a cognitive agent whose central cognition is connected to external reality by peripheral cognition, via the external interfaces of (i) context recognition and (ii) context action.]

[2] This does not mean that it cannot communicate. Instead we must distinguish between communication in general and communication with language. Natural agents, starting from the protozoa, must have the ability to communicate at least to a degree that ensures reproduction. Forms of non-language communication between higher animals such as birds are described by M.D. Hauser (1996). In natural agents, the language component arises as a special form of the communication component at the end of a long co-evolution of communication and non-communication cognition (CLaTR Sect. 14.1). In artificial agents, in contrast, the language component may be built as a copy of the context component which is appropriately specialized for natural language.
The most general distinction regarding the external interfaces is between (i) recognition (4.3.1) and (ii) action (4.4.1). Recognition is differentiated further into such modalities as vision, audition, touch, taste, smell, and the sense of balance (equilibrioception), while action is differentiated further into such modalities as locomotion, manipulation, and articulation.
During recognition, peripheral cognition must translate modality-dependent input into the homogeneous coding of the agent's central cognition. During action, peripheral cognition must perform the inverse translation, from the homogeneously coded commands of central cognition into heterogeneous, modality-dependent output. A homogeneous coding of content in central cognition is essential for storage and retrieval, for inferencing on the content, and for turning the results of these inferences into blueprints for action. We know of two kinds of homogeneous coding: neurological in natural agents and electronic in computers.[3]
2.2 Sensory Modalities and their Media

The switching between a modality-independent type and a modality-dependent token is especially clear in the transfer of a language sign from the speaker to the hearer. It may be shown schematically as follows:

2.2.1 MODALITY-INDEPENDENT AND MODALITY-DEPENDENT CODING

[Diagram: in the speaker, central cognition holds the modality-independent coding of the sign, which peripheral cognition passes through a modality-dependent interface to produce the modality-dependent realization of the sign; in the hearer, the modality-dependent interface of peripheral cognition maps this realization back into a modality-independent coding in central cognition.]

[3] Let us assume, for example, that language in central cognition is coded homogeneously in ASCII (American Standard Code for Information Interchange). Externally the signs of language may be represented acoustically as spoken language, or visually as hand-written, printed, or signed language. In sign production (speak mode), the agent must select a modality in which to realize a word; thus the single ASCII coding of the word (type) is translated by peripheral cognition into one of practically infinitely many different modality-dependent realizations (tokens) based on visual, acoustic, or tactile representations. In sign recognition (hear mode), the many possible modality-dependent representations of the word as bitmap outlines, sound waves, or raised dots (Braille) are translated by peripheral cognition back into an equivalent single ASCII coding.
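The division of labor in 2.2.1 between a single modality-independent type and its many modality-dependent tokens may be sketched as follows (Python; the modality names and the placeholder renderings are our own illustrative assumptions):

# Hedged sketch: peripheral cognition translating between one central,
# modality-independent coding (an ASCII string) and modality-dependent tokens.
REALIZE = {  # speak mode: one type, several possible token renderings
    "speech":  lambda w: "<waveform of '" + w + "'>",
    "print":   lambda w: "<bitmap of '" + w + "'>",
    "braille": lambda w: "<raised dots of '" + w + "'>",
}

def speak(word_type, modality):
    """Output direction: modality-independent type -> modality token."""
    return REALIZE[modality](word_type)

def hear(token):
    """Input direction: token -> type. A real system would run speech or
    character recognition; here we merely strip the placeholder markup."""
    return token.split("'")[1]

assert hear(speak("table", "speech")) == "table"  # the type survives the round trip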
In the medium of spoken language, the modality of production, i.e. speech, differs from the modality of interpretation, i.e. hearing, and correspondingly for manipulation and vision in the medium of written language. Regardless of whether the surface of a word like table is realized in speech or in writing, in the speak or the hear mode, it must have the same agent-internal, cognitive representation as a modality-independent type. Furthermore, this type must be roughly the same for the members of a language community in order to ensure successful communication.[4]
In addition to monomodal recognition and action there is multimodal cognition. An example of multimodal recognition in language communication is interpreting the facial expressions (vision) and the voice (hearing). An example of multimodal recognition at the context level is simultaneously seeing, touching, smelling, and hearing another agent. An example of multimodal context action is holding and biting a fruit. Recognition and action may also combine, as when seeing (recognition) and reaching for (action) an object.
Even more diverse than the different modalities are the forms of existence in the external world, such as different materials, physical states, motions, natural laws, social conventions, etc., which can all be recognized, processed, and acted upon. Due to the translation of peripheral cognition, however, these diverse forms of existence are represented in the form of a homogeneous coding inside the cognitive agent – inasmuch as they have been perceived at all.
In any modality, we must distinguish between immediate recognition and action and mediated recognition and action. For example, we can see and hear an oncoming train immediately in reality, or we can see and hear a mediated image in a cinema, whereby the event has been stored and reactivated in the medium of film. In both cases the recognition happens to be multimodal. An example of mediated action is telling someone to do something.
While the notion of modality refers to input/output devices, the term medium refers to forms of existence, which in turn determine agent-external storage, for example print or video. The notions of medium and modality are related, however, because each medium, e.g. print, is designed for a certain modality, e.g. vision, or modalities, e.g. the medium of video for the modalities of vision and speech. The relation between modality and medium goes back to the earliest beginnings of mankind, as shown by paleolithic rock carvings and paintings.
2.3 Alternative Ontologies for Referring with Language

Relating expressions of language to the intended objects or events is called reference. There are two basic ways to analyze reference: (i) agent-externally, using a metalanguage description, and (ii) agent-internally, using a cognitive pattern matching between the language and the context level. This alternative corresponds to the distinction between a sign-based and an agent-based approach to natural language communication (Sect. 1.1).
For example, the sign-based approach of truth-conditional semantics treats reference as an agent-external relation between "language and the world." The world is defined as a set-theoretical model (Sect. 6.4), and the relation between the language and the model is defined in a metalanguage. All of this is formally constructed by the logicians, who look at their models of reference from the outside. The agent-based approach of Database Semantics, in contrast, reconstructs reference as a cognitive procedure inside the agent's head, whereby the external world is treated as given. The ontological difference between truth-conditional semantics (sign-oriented) and DBS (agent-based) may be illustrated as follows:

2.3.1 REFERENCE IN ALTERNATIVE ONTOLOGIES
[Diagram: on the left, truth-conditional semantics relates the formula sleep(Julia) to a set-theoretical model via metalanguage-defined denotation lines; on the right, Database Semantics places the sentence Julia sleeps, the cognitive agent, and the referent within external reality.]
The schema of truth-conditional semantics on the left shows the sentence at the language level, formalized as sleep(Julia). It indicates that sleep is a functor denoting a set, while Julia is an argument denoting an element of that set. The metalanguage-defined relation between the object language expressions and their set-theoretic denotations is indicated by the dotted lines. The approach is sign-based insofar as there is no agent and consequently there are no external interfaces. Thus there is neither a need nor a place for an agent-internal database, or for an algorithm reading content into and out of the agent's database. Also, there is no distinction between the speak and the hear mode.
The schema of DBS on the right shows an agent in the hear mode.[5] The agent, the language expression, and the referent in its appearance of sleeping are all part of the real world. There is no attempt to model the real world set-theoretically or in any other way (such as virtual reality, FoCL Sect. 24.5). Instead, the goal is to model the agent's information processing. This begins with the agent's interfaces for recognition at the language and the context level, and ends with the agent's interfaces for action – both externally, relative to the real world, and internally, in the form of such perceptions as hunger and such actions as free association and reasoning by means of inferences.
Technically, the step from artificial agents without language to agents with language may be realized as a doubling of the context component, whereby the newly gained level is re-utilized for the purposes of language. This holds specifically for the existing interfaces of (i) context recognition and (ii) context action, which may be reused by the new language component for (iii) sign recognition and (iv) sign action (synthesis), respectively.

[4] The switching between a modality-dependent and a modality-independent coding in recognition and action holds also at the context level, but without the intervening external language surface and the need for a partner in discourse.
[5] The different constellations of recognition and action at the language and the context level are categorized as the 10 SLIM states of cognition (FoCL Sect. 23.5).

2.3.2 EXTERNAL INTERFACES OF A COGNITIVE AGENT WITH LANGUAGE
[Diagram: a cognitive agent whose central cognition is connected to external reality by peripheral cognition, via the external interfaces of (i) context recognition, (ii) context action, (iii) sign recognition, and (iv) sign synthesis.]
The distinction between the external interfaces of language (upper level) and context (lower level) may be motivated by differences in their interpretation. For example, when we observe some vague pattern on the bark of a tree we may be uncertain whether it is accidentally provided by nature or produced on purpose as a sequence of letters, e.g. Caution, Tiger! When we suddenly realize that the pattern is a sign intended for communication, the raw visual input is still the same, but the significance of its interpretation becomes completely different: We switch from the context to the language level.[6]

[6] On the functioning of written language see FoCL Sect. 6.5.
2.4 Theory of Language and Theory of Grammar

As an agent-based approach, DBS establishes the relation of reference between the language expressions and the referents procedurally rather than via a metalanguage.[7] The procedural reconstruction of reference begins with transferring the distinction between the levels of the language and the world, as used in truth-conditional semantics, from the modeled external reality into the head of the cognitive agent. Thereby, the "world" changes into an agent-internal representation of episodic and absolute content, called the context.

2.4.1 STRUCTURING CENTRAL COGNITION IN AGENTS WITH LANGUAGE

[Diagram: within the cognitive agent's central cognition, a language component (reconstructed by a theory of grammar) and a context component are connected by pragmatics (reconstructed by a theory of language); recognition and action connect both components to external reality via peripheral cognition.]
Externally, the interfaces for language and nonlanguage recognition are the same, as are those for language and nonlanguage action. Internally, however, raw input data are separated into language and nonlanguage content. In action, a content is realized as raw output regardless of whether it originated at the language or at the context level.
Compared with 2.3.2, central cognition has a more differentiated structure. Sign recognition and action interact with an agent-internal language component, reconstructed by a theory of grammar. Context recognition and action interact with an agent-internal context component. The interaction between the two components is called the language pragmatics. It establishes the reference relation between the agent-internal levels of language and context by means of (i) pattern matching (symbols), (ii) pointers (indexicals), and (iii) baptism (names).
The context component has long been neglected in linguistics, even though it is functionally essential for modeling natural language communication. Without a context component, the cognitive agent, natural or artificial, cannot report to us what it perceives (context recognition), and cannot do what we tell it to do (context action).
The language component includes the traditional linguistic subcomponents of morphology, the lexicon, syntax, and semantics. For the purposes of Database Semantics, the language component must not merely analyze the signs as isolated objects, but must provide declarative specifications of the computational procedures which map cognitive content into surfaces (speak mode) and surfaces into cognitive content (hear mode). To be practical, these procedures must work for newly formed expressions, as in the transfer of fresh content in a spontaneous, free human-robot dialogue, and must run in real time.[8]
The agent-internal interaction between the language component, the context component, and the pragmatics is reconstructed by a theory of language, here the SLIM[9] theory (Sect. 2.6). An adequate theory of language must model immediate and mediated reference with language (Sect. 2.5), in the speak and the hear mode. It must provide a natural method of conceptualization, i.e. the speaker's choice of what to say and how to say it. And it must handle nonliteral language uses, such as spontaneous metaphor, in the speak and the hear mode (Sects. 5.4 and 5.5).

[7] In order to utilize metalanguage definitions computationally, they would have to be reconstructed as effective procedures. However, most existing metalanguage-based systems, for example modal logic, are based on such notions as "an infinite set of possible worlds," which are not suitable for a procedural reconstruction (FoCL Sect. 19.4).
[8] This is in contrast to today's efforts at annotating stored text. Examples are the XML annotation of the BNC and the graphical annotation of tree banks. Quality annotation takes time (more than a decade for the BNC) because "post-editing" requires human labor. More and more annotated text is offered as a cost-free "resource" waiting to be used.
[9] The distinction between a theory of grammar and a theory of language has been emphasized by Lieb (1976). Language theories preceding SLIM are structuralism, behaviorism, model theory, speech act theory, and nativism.
2.5 Immediate Reference and Mediated Reference

Pragmatics in general is defined as the theory of use. Examples of pragmatic tasks are the use of a screwdriver to fasten a screw, the use of one's legs to go from one place to another, the nightly scavenging of food from the refrigerator to fix a sandwich and satisfy one's hunger, or the request that someone else fix and serve the sandwich.
Depending on whether or not an act is performed by means of language signs, we speak of language or of context pragmatics. Language pragmatics should be analyzed as a phylogenetic and ontogenetic specialization of context pragmatics, just as language recognition and synthesis are phylogenetic and ontogenetic specializations of context recognition and action, respectively.
An important part of language pragmatics is reference. From an ontogenetic and phylogenetic point of view, its most basic form is immediate reference. It consists in referring with language to objects in the immediate task environment of the communicating agents, for example, to the meaty bone in front of them. Thereby, the agents' external interfaces process data at the language as well as the context level (as shown in 2.4.1).
In addition, there is also referring with language to referents which are not in the immediate task environment, but exist solely in the databases of the communicating agents, for example, the historical figure of Aristotle. In this mediated reference, the external interfaces process data at the language level, but not at the context level:[10]

[10] For further discussion of immediate and mediated reference see FoCL Sect. 4.3.

2.5.1 USE OF EXTERNAL INTERFACES IN MEDIATED REFERENCE
[Diagram: like 2.4.1, but with recognition and action active only at the language component; the context component processes no external data.]
Even though immediate reference is phylogenetically and ontogenetically primary, it is a special case of mediated reference from a theoretical point of view. This is because all cognitive procedures of mediated reference are also used by immediate reference, but not vice versa: immediate reference requires the additional processing of external input and output at the context level.
Artificial cognitive agents with recognition and action at the language level but not at the context level (and thus limited to mediated reference) are interesting for the following reasons. First, because the input and output of mediated reference is limited to the language level, the user can make do with the keyboard and the screen of today's standard computers without principled loss of function. Second, there are many tasks for which artificial agents with external language interfaces but without external context interfaces are sufficient. In research, this applies to the modeling of cognitive operations which are based on stored data alone. In practice, it applies to such applications as the interaction with databases and the Internet, involving the reading in of new content and the answering of queries by means of natural language.
In the long run, however, an absence of autonomous context data, caused by a lack of non-language recognition and action, has the following disadvantages:

2.5.2 DISADVANTAGES OF NOT HAVING AUTONOMOUS CONTEXT DATA
1. The conceptual core of language meanings remains undefined. Most basic concepts originate in agents without language, as recognition and action procedures at the context level, and are reused as the core of language meanings in agents with language.[11] Therefore, agents with language but without context data use meanings which are void of a conceptual core – though the relations between the concepts, represented by placeholders, may still be defined, both absolutely (for example, in the is-a and is-part-of hierarchies) and episodically.
2. The coherence or incoherence of content cannot be judged autonomously. The coherence of stored content originates in the coherence of the external world.[12] Therefore, only agents with autonomous context data are able to relate content "imported" by means of language to the data of their own experience. An agent without external context recognition and action, in contrast, has nothing but imported data – which is why the responsibility for their coherence lies solely with the users who store the data in the agent.

Due to the absence of suitable robots in today's computational linguistics, the software development of Database Semantics is currently limited to input via the keyboard and output via the screen. As soon as an artificial auto-channel for recognition and action (1.4.3) becomes available, however, the theoretical framework of Database Semantics will be ready for the new technology. The reason is – roughly speaking – that immediate reference is a special case of mediated reference.

[11] FoCL Sect. 22.1.
[12] Ibid., Sect. 24.1.
Thus, all the components developed for the version without autonomous context data will be suitable to serve in the extended system as well. This holds, for example, for the components of the lexicon, the automatic word-form recognition and production, the syntactic-semantic and semantic-syntactic parsers, and the inferences of the pragmatics, in the speak and the hear mode.
2.6 The SLIM Theory of Language

The theory of language which Database Semantics is based on is called SLIM. In contrast to other theories of language, such as structuralism, behaviorism, nativism, model theory, or speech act theory, SLIM has been designed from the outset to explain the understanding and purposeful production of signs in terms of completely explicit, software-mechanical procedures. The letters of the acronym SLIM stand for the following principles:

S = Surface Compositionality (methodological principle, 1.6.1):[13] Syntactic-semantic composition assembles only word forms with concrete surfaces, excluding the use of zero elements, identity mappings, or transformations.
L = time-Linearity (empirical principle, 1.6.5): Interpretation and production of utterances are based on a strictly time-linear derivation order.
I = Internal (ontological principle, 2.3.1): Interpretation and production of utterances are analyzed as cognitive procedures located inside the speaker-hearer.
M = Matching (functional principle, 4.1.2): Referring with language to past, current, or future objects and events is modeled in terms of pattern matching between language meanings and a context of interpretation.

SLIM models communication based on the Seven Principles of Pragmatics.

2.6.1 FIRST PRINCIPLE OF PRAGMATICS (PoP-1, FoCL 4.3.3)

The speaker's utterance meaning2 is the use of the sign's literal meaning1 relative to an agent-internal context of interpretation.

[13] Apparently, surface compositionality is now accepted in constraint-based unification grammar (Sag and Wasow 2011).
Meaning1 is the literal meaning of the sign; meaning2 is the speaker meaning of an utterance in which the sign's meaning1 is used. Even in the hear mode, meaning2 is called the "speaker meaning," because communication is successful only if the hearer uses the sign's meaning1 to refer to the same objects or events as the speaker.
The correct use of a sign depends on the context of interpretation. It is defined by the parameters of the sign's origin, called the STAR (FoCL 5.3.2):

S = space (location where the sign has been uttered)
T = time (time when the sign has been uttered)
A = author (agent who produced the sign)
R = recipient (agent intended to receive the sign)

For successful communication, the speaker must encode the STAR into the sign in a way which enables the hearer to determine it correctly (FoCL 5.3.1). The role of the STAR is described by the Second Principle of Pragmatics:

2.6.2 SECOND PRINCIPLE OF PRAGMATICS (PoP-2, FoCL 5.3.3)
A sign's STAR determines the entry context of production and interpretation in the databases of the speaker and the hearer.

Once the initial context of interpretation has been found, based on the STAR, other contexts may be accessed according to the Third Principle of Pragmatics:

2.6.3 THIRD PRINCIPLE OF PRAGMATICS (PoP-3, FoCL 5.4.5)
The matching of signs with their respective subcontexts is incremental: in production, the elementary signs follow the time-linear order of the underlying thought path, while in interpretation, the thought path follows the time-linear order of the incoming elementary signs.

In communication, PoP-3 presupposes PoP-2. The first step in the production or interpretation of a natural language sign is to determine the entry context as precisely as possible (PoP-2). Then the sequence of referents at the level of context and the complex sign at the level of language are matched word by word (PoP-3).[14]

[14] The time-linear derivation order in the speak and the hear mode is strongly supported by experimental results in cognitive psychology, especially regarding the interpretation of spoken language. For example, Sedivy, Tanenhaus, et al. (1999, p. 127) present "compelling evidence for a processing model in which linguistic expressions are undergoing continuous, moment-by-moment semantic interpretation, with immediate mapping onto the referential model." Similarly, Eberhard et al. (1995, p. 422) report that subjects incrementally "interpreted each word of the spoken input with respect to the visual discourse model and established reference as soon as distinguishing information was received." See also Spivey et al. (2002) in a similar vein.
Next, the meaning1 and the associated reference mechanisms of the three basic kinds of signs, namely symbol, indexical, and name, are defined. The literal meaning1 of a symbol[15] is a concept type (4.1.1).

2.6.4 FOURTH PRINCIPLE OF PRAGMATICS (PoP-4, FoCL 6.1.3)
The reference mechanism of a symbol is based on a meaning1 which is defined as a concept type.

Symbols refer from their place in a positioned sentence by matching their meaning1 with corresponding concept tokens at the context level (HBTR Sect. 3.2). The reference mechanism of symbols is called iconic because the functioning of symbols and icons is similar: both can be used to refer spontaneously to new objects of a known kind. The difference resides in the relation between the meaning1 and the surface, which is arbitrary in the case of symbols, but motivated in the case of icons (FoCL Sect. 6.2).
The second sign kind of natural language are the indexicals. Their literal meaning1 is defined as pointers to the STAR parameters of an utterance.

2.6.5 FIFTH PRINCIPLE OF PRAGMATICS (PoP-5, FoCL 6.1.4)
The reference mechanism of an indexical is based on a meaning1 which is defined as one of five characteristic pointers: loc, tim, pro1, and pro2 point at the S, T, A, and R parameter values, respectively, of the agent's STAR, while pro3 points at the value of the additional STAR attribute 3rd (HBTR Sect. 3.3; CLaTR Sect. 10.2).

The five kinds of pointers may be illustrated by the following examples:

2.6.6 THE FIVE POINTERS IN INDEXICAL WORDS OF ENGLISH

pro1: I, we, us
pro2: you
pro3: he, she, it, him, her, they, them, this
loc:  here, there
tim:  now, then

Indexicals sharing the same pointer may differ in terms of additional symbolic-grammatical distinctions. For example, the pro3 pointer of the word this is restricted to a single, non-animate referential object, while the word they is restricted to a plurality of referential objects which may be animate or inanimate. Similarly, you is restricted to referential objects of any number and gender which are potential partners of communication, while he is restricted to a single referential object of male gender which is currently not regarded as a partner of communication. In addition, there is the grammatical distinction between the nominative and the oblique case in English pronouns.

[15] We are following the terminology of Peirce (CP 2.228, 2.229, 5.473).
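As an illustration of PoP-5, the following sketch resolves a few indexical surfaces against a STAR (Python; the attribute names, the dataclass encoding, and the small pointer table are our own illustrative assumptions, not DBS notation):

# Hedged sketch: each indexical carries one of the five pointers, and
# interpretation follows that pointer into the STAR of the utterance.
from dataclasses import dataclass

@dataclass
class STAR:
    S: str           # space: location of the utterance
    T: str           # time of the utterance
    A: str           # author
    R: str           # recipient
    third: str = ""  # additional attribute 3rd, targeted by pro3

POINTER = {  # surface -> pointer kind (cf. 2.6.6; list abbreviated)
    "I": "pro1", "we": "pro1", "you": "pro2", "he": "pro3",
    "here": "loc", "now": "tim",
}

def interpret_indexical(surface, star):
    """loc, tim, pro1, pro2 point at S, T, A, R; pro3 points at 3rd."""
    target = {"loc": star.S, "tim": star.T, "pro1": star.A,
              "pro2": star.R, "pro3": star.third}
    return target[POINTER[surface]]

star = STAR(S="kitchen", T="9:00", A="Julia", R="John", third="Fido")
assert interpret_indexical("I", star) == "Julia"       # pro1 -> author
assert interpret_indexical("here", star) == "kitchen"  # loc  -> space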
The third sign kind of natural language, besides symbols (concepts) and indexicals (pointers), are the names. Name-based reference is established by (i) object permanence (Piaget 1954) and (ii) an act of baptism. As a form of coreference, object permanence uses the address of the initial referent (HBTR 3.4.1) for non-initial occurrences. For example, agent A notices an unknown dog and establishes the initial referent as a proplet (Sect. 3.1) with the features [noun: dog] and [prn: 127]. As the animal appears and disappears running through the bushes, object permanence is maintained by providing new prn values to the propositions representing non-initial appearances, but using the address (dog 127) as the core value.
An act of naming may be explicit, as in a ceremony of baptism, or implicit, as when the owner later calls (dog 127) by the name Fido (implicit baptism for agent A). The baptism connects the initial referent, which has an empty sur slot, and the lexical name proplet, which has an empty core slot, as follows:[16]
2.6.7 SIXTH PRINCIPLE OF PRAGMATICS (PoP-6, FoCL 6.1.5)
An act of baptism cross-copies (i) the sur value of the lexical name (11.5.2) into the empty sur slot of the initial referent (result: named referent) and (ii) the initial referent's core value, as an address, into the empty noun slot of the lexical name (result: supplemented name).

Supplemented names are written into the agent's episodic memory rather than the lexicon, because an address core value like (dog 127) is not a convention of natural language. The surface of a supplemented name proplet is used in the hear mode to access the initial referent, while the core value of the named referent is used in the speak mode to access the name surface (HBTR 3.4.4).
In linguistic examples, any baptism events and named referents are missing. For simplicity, we use the name surface as the lexical core value instead. Such a core value is needed for cross-copying during hear mode derivations (3.4.2).

[16] The intuitive explanation of name-based reference in terms of inserting a marker into the cognitive referent (FoCL 6.1.6) is complemented here by a more technical treatment, based on the data structure of proplets and a baptism which cross-copies between the initial referent and the lexical name proplet.
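The cross-copying of PoP-6 may be sketched as follows, using Python dicts as stand-ins for proplets; the attribute names sur, noun, and prn follow the examples above, while the helper function and the dict encoding are our own illustrative assumptions:

# Hedged sketch of the baptism operation of 2.6.7.
initial_referent = {"sur": "", "noun": "dog", "prn": 127}  # empty sur slot
lexical_name     = {"sur": "Fido", "noun": ""}             # empty core slot

def baptize(referent, name):
    """(i) Copy the name's sur value into the referent's empty sur slot;
    (ii) copy the referent's core value, as an address, into the name's
    empty noun slot."""
    referent["sur"] = name["sur"]                       # -> named referent
    name["noun"] = (referent["noun"], referent["prn"])  # -> supplemented name

baptize(initial_referent, lexical_name)
print(initial_referent)  # {'sur': 'Fido', 'noun': 'dog', 'prn': 127}
print(lexical_name)      # {'sur': 'Fido', 'noun': ('dog', 127)}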
Orthogonal to the distinction between the different reference mechanisms, i.e. matching, pointing, and baptism, there is the distinction between the basic parts of speech, i.e. noun, verb, and adjective – or, more generally, their ontological roles as object, relation, and property. We prefer the term ontological role because it may be used at the language and the context level, while the term part of speech suggests a limitation to language. Using the same constructs at the two levels is essential for a straightforward reconstruction of reference (3.2.4). Also, a given concept, e.g. red, may be assigned different ontological roles: (i) noun, as in a certain red, (ii) verb, as in redden the sky, and (iii) adj, as in the red shoes (CLaTR 6.6.4–6.6.7). The correlation between the reference mechanisms (establishing vertical relations) and the ontological roles (basis for establishing horizontal relations) is described by the Seventh Principle of Pragmatics:
2.6.8 SEVENTH PRINCIPLE OF PRAGMATICS (PoP-7, FoCL 6.1.7)
Concepts are used as verbs, adjectives, and nouns. Pointers are used as adjectives and nouns. Names are used only as nouns. Conversely, nouns use concepts, pointers, and names; adjectives use concepts and pointers; and verbs use only concepts[17] for their mechanism of reference.

In other words, the reference mechanism which is most general with respect to the ontological roles is the matching of concepts, while name-based baptism is the most restricted. The ontological role which is most general with respect to the different reference mechanisms is the noun (object, argument), while the verb (relation, functor) is the most restricted:

2.6.9 REFERENCE KINDS AND ONTOLOGICAL ROLES

                      noun    adj     verb
baptism (name)        Fido    –       –
pointing (indexical)  this    here    –
matching (concept)    dog     black   find
Because the ontological roles of nouns, verbs, and adjs, on the one hand, and the reference mechanisms based on matching, pointing, and baptism, on the other, are orthogonal to each other, the combinatorics within a cognitive content may be defined over the ontological roles of the elements alone, independent of whatever reference mechanism they may use. For example, the following two sentences use the same ontological roles for their combinatorics, but their mechanisms of reference are different:

[17] This applies to the semantic core and is not in conflict with the fact that verbs in English, German, and other European languages integrate a strongly indexical component, namely the tense system – which is absent in Chinese (Zheng 2011; Zischler 2011), for example.
32
2. Interfaces and Components
2.6.10 DIFFERENT REFERENCE KINDS IN THE SAME CONSTRUCTION

                               noun                   verb                adjective
    John slept in the kitchen  baptism (name)         matching (symbol)   matching (symbol)
    he slept there             pointing (indexical)   matching (symbol)   pointing (indexical)
The point is that John slept in the kitchen and He slept there have a similar grammatical structure. The two expressions differ, however, in that there is an elementary indexical, while kitchen, as the core of a prepositional phrase, is a symbol. Also, the noun John is a name, while the noun he is an indexical.
Concluding Remark

The Seven Principles of Pragmatics are not merely of academic interest, but have far-reaching implications for the practical implementation and the basic functioning of the system in communication. PoP-1 makes it possible to separate the literal meaning1 of the sign type from the meaning2 aspects of using tokens of the sign in an utterance – which is a precondition for a systematic syntactic-semantic analysis in accordance with surface compositionality. PoP-2 provides the values for the interpretation of indexicals and is a precondition for finding the correct context of use. PoP-3 establishes a uniform mechanism of interpretation which works in the speak and the hear mode. PoP-4, PoP-5, and PoP-6 explain for each sign kind how it relates to the intended referent in the context of interpretation. PoP-7 specifies which of the three mechanisms of reference must be implemented for which of the three ontological roles.

Based on the Seven Principles of Pragmatics, the SLIM theory of language models the natural communication mechanism at a level of abstraction at which the meaningful use of language (i) between a human and a human and (ii) between a human and a suitably constructed cognitive machine (DBS robot) functions in the same way – just as a sparrow (natural) and a jumbo jet (artificial) are based on the same aerodynamic principles for being airborne (FoCL Introduction VI). Database Semantics was developed as the technical realization of SLIM in the form of a declarative specification designed for and supported by running software.
3. Data Structure, DB Schema, and Algorithm
While the external interfaces of an agent as well as their input and output may be observed directly and measured objectively (1.4.3), the data structure, the database schema, and the algorithm must be inferred from the cognitive functions controlling the agent's behavior. In language communication, the cognitive function of the algorithm is (i) to map the input provided by the sign recognition component into a content which is suitable for storage in the agent's database (hear mode), (ii) to selectively activate and process content in the database (think mode), and (iii) to provide the action component with suitably formatted content to be read out of the database in the form of natural language surfaces (speak mode). In DBS, these functions are performed by the time-linear algorithm of LA grammar, used in the variants LAGo hear, LAGo think, and LAGo think-speak. The data structure supporting the algorithm consists of non-recursive (flat) feature structures with ordered attributes, called proplets (because they are the elementary building blocks of propositions – in analogy to droplet). Proplets are structurally equivalent to the records of classic databases. Stored in a content-addressable database called word bank, proplets are selectively activated by a LAGo think navigation along the semantic relations holding between them.
3.1 Proplets for Coding Propositional Content

In computer science, a feature structure is defined as a set of features, where a feature is defined as an attribute-value pair (avp). As feature structures, proplets are special because their attributes are in a predefined standard order (for better readability and improved computational efficiency, Sect. A.2). Furthermore, the values of a proplet attribute are restricted to a finite list (double-ended queue) which consists of one or more basic items; the recursive embedding of feature structures (3.4.3) as attribute values is not allowed.1

1 This is in contradistinction to the use of recursive feature structures within nativism, for example, in LFG (Bresnan 1982), GPSG (Gazdar et al. 1985), or HPSG (Pollard and Sag 1994).
Proplets are connected into content by the semantic relations of structure, i.e. functor-argument and coordination, intra- and extrapropositionally. They differ from the semantic relations of meaning, e.g. hypernymy, and of content, e.g. cause and effect. The semantic relations of structure are coded solely by means of (i) attributes and (ii) values which are defined as the address of another proplet. The use of addresses makes proplets order-free, allowing storage and retrieval according to the needs of the agent's memory (database). Consider the propositional content of an agent perceiving a barking dog (recognition) and running away (action) as an (order-free) set of proplets:

3.1.1 CONTEXT PROPLETS REPRESENTING dog bark. I run.

    sur:            sur:              sur:            sur:
    noun: dog       verb: bark        noun: pro1      verb: run
    fnc: bark       arg: dog          fnc: run        arg: pro1
    prn: 22         nc: (run 23)      prn: 23         nc:
                    prn: 22                           prn: 23
The attributes used are (i) the sur (surface) attribute, here empty (context content), (ii) the core attributes, here noun and verb, (iii) the continuation attributes of functor-argument, here fnc (functor) and arg (argument), (iv) the continuation attribute nc (next conjunct) of extrapropositional coordination, and (v) the bookkeeping attribute prn (proposition number). The core attribute specifies the ontological role of a proplet, at the language and the context level. For example, the first proplet has the core attribute noun and takes the concept dog as its core value. The continuation attributes characterize the kind of semantic relation to another proplet. For example, the first proplet has the continuation attribute fnc and takes bark as its continuation value (intrapropositional functor-argument). Also, the second proplet has the continuation attribute nc and takes (run 23) as its continuation value (extrapropositional coordination). The proplets of the two propositions each have their own prn value, 22 and 23. By using the same values in different slots, the semantic relations of structure (i.e. the combinatorial aspect of a complex content) are defined not directly on the values (e.g. a concept like dog), but in terms of the proplets containing them, as visualized by the diagonal lines2 in the following example:

2 To avoid entanglement of the diagonal lines indicating the semantic relations of structure between proplets, the order of the last two proplets is inverted as compared to 3.1.1. This is permitted because the proplets in a content are order-free.

3.1.2 CONNECTING PROPLETS INTO A COMPLEX COGNITIVE CONTENT

    context level:
    sur:            sur:              sur:            sur:
    noun: dog       verb: bark        verb: run       noun: pro1
    fnc: bark       arg: dog          arg: pro1       fnc: run
    prn: 22         nc: (run 23)      nc:             prn: 23
                    prn: 22           prn: 23

    (In the original figure, diagonal lines connect the matching values, e.g. dog with arg: dog and bark with fnc: bark.)
This is a cognitive content inasmuch as the core values are defined in terms of the agent’s recognition (4.3.1) and action (4.4.1) procedures, and the continuation values establish meaningful semantic relations of structure. The interproplet relations shown in 3.1.2 are duplex in that they connect two proplets A and B by copying (i) the address of A’s core value into the continuation slot of B, and (ii) the core value of B into the continuation slot of A (CLaTR 16.6.7). This is in contradistinction to extrapropositional coordination (9.2.3), which is simplex in that it is based only on relation (ii).
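As a concrete rendering of this data structure, the following minimal sketch represents proplets as flat attribute-value maps and establishes the duplex subject/predicate relation of 3.1.2 by cross-copying. The Python representation (dicts, with tuples for extrapropositional addresses) is our own illustration, not the notation of the DBS software.

    # Proplets as non-recursive feature structures with ordered attributes;
    # dicts preserve insertion order in Python 3.7+.
    dog  = {"sur": "", "noun": "dog",  "fnc": "", "prn": 22}
    bark = {"sur": "", "verb": "bark", "arg": "", "nc": ("run", 23), "prn": 22}

    def connect_subject_predicate(noun_p, verb_p):
        """Duplex connection: cross-copy (i) the verb's core value into the
        noun's fnc slot and (ii) the noun's core value into the verb's arg
        slot. The relation is coded inside the proplets, keeping them
        order-free."""
        noun_p["fnc"] = verb_p["verb"]
        verb_p["arg"] = noun_p["noun"]

    connect_subject_predicate(dog, bark)
    assert dog["fnc"] == "bark" and bark["arg"] == "dog"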
3.2 Matching between Language and Context Proplets

To ensure a proper functional interaction between the language and the context level, they must be structurally compatible. In DBS, this is achieved by using the same data structure for the two levels. The only difference between a language proplet and a corresponding context proplet is the value of the sur attribute: in a language proplet this value must be non-empty. The sur value of a language proplet is a specialized, language-dependent pattern for a word form surface, for example, German Hund. A sequence of concatenated propositions coded with language proplets is illustrated in the following variant of 3.1.1 using German surfaces:

3.2.1 LANGUAGE PROPLETS REPRESENTING dog barks. I run.

    sur: Hund       sur: bellt        sur: ich        sur: renne
    noun: dog       verb: bark        noun: pro1      verb: run
    fnc: bark       arg: dog          fnc: run        arg: pro1
    prn: 22         nc: (run 23)      prn: 23         nc:
                    prn: 22                           prn: 23
While a context proplet has only one main key, namely the core value, a language proplet has two, the core value and the surface value.

3.2.2 KEYS FOR LEXICAL LOOKUP IN THE SPEAK AND THE HEAR MODE

    sur: Hund    ← key for word form recognition in the hear mode
    noun: dog    ← key for word form production in the speak mode
    cat: sn
    sem: m sg
    fnc:
    mdr:
    nc:
    pc:
    prn:
In a lexical proplet, the complete list of attributes is shown (Sect. A.4). For simplicity, connected proplets may show only those attributes which have values, as in the preceding example of a content, 3.2.1.

In the hear mode, an unanalyzed language-dependent surface (here German Hund) is used to retrieve a language proplet with a corresponding sur value (recognition). The lexical proplet retrieved provides a core value, other values, and other attributes needed for the composition of content (interpretation). In the speak mode, a core value (here dog) is used to retrieve a corresponding language proplet. The language-dependent sur value is derived from the core value by a suitable lexicalization rule (Sects. 12.4–12.6) and used for realizing an unanalyzed modality-dependent external surface (production).

By replacing constants with restricted variables, proplets may be turned into patterns. Patterns serve not only in the reference relation between the language level and the context level3 (3.2.4, 4.1.2), but also in the application of operations to input proplets (11.6.1–11.6.5, 12.3.2, 12.3.3). Both kinds of pattern matching share the following conditions for success:

3.2.3 CONDITIONS ON SUCCESSFUL MATCHING

1. Attribute condition
   The attributes of the pattern proplet must be a sublist (equal or less) of the attributes of the input proplet.
2. Value condition
   Each value of the pattern proplet must be compatible with the corresponding value of the input proplet.4

In the speak mode (mapping content into language), proplets at the context level are matched at the language level by compatible proplets which contain language-specific lexicalization rules (Sects. 11.4–11.6) for producing surfaces which are correct regarding word form, word order, agreement, and function word precipitation. In the hear mode (mapping language into content), language proplets provided by automatic word form recognition are stored at the now front (3.3.1) of the database and concatenated into content.

Even though the reference relation (vertical matching) takes place between individual proplets, the semantic relations (horizontal concatenation) holding between them at the two levels are taken into account as well.5 Consider the matching between the context proplets 3.1.1 and the language proplets 3.2.1:

3 In the type-token relation, the variables occur in the type (4.4.1).
4 In order for language and context proplets to match, their prn values need not agree. Their purpose is to hold the proplets of a proposition "horizontally" together by means of a common value.
5 Apparently, Aristotle struggled to reconcile reference with concatenation (Modrak 2001).
3.2.4 IMPACT OF INTERPROPLET RELATIONS ON MATCHING

    language level (horizontal relations):
    sur: Hund       sur: bellt        sur: renne      sur: ich
    noun: dog       verb: bark        verb: run       noun: pro1
    fnc: bark       arg: dog          arg: pro1       fnc: run
    prn: 122        nc: (run 123)     nc:             prn: 123
                    prn: 122          prn: 123

        |  internal matching (vertical relations)

    context level (horizontal relations):
    sur:            sur:              sur:            sur:
    noun: dog       verb: bark        verb: run       noun: pro1
    fnc: bark       arg: dog          arg: pro1       fnc: run
    prn: 22         nc: (run 23)      nc:             prn: 23
                    prn: 22           prn: 23
The systematic extension of vertical matching between individual proplets (reference) to their horizontal semantic relations of structure may be illustrated by a matching failure: assume, for the sake of the argument, that the noun proplet dog at the language level has the continuation feature [fnc: bark], but the corresponding dog proplet at the context level has the continuation feature [fnc: sleep]. In this case, despite their having the same core value, the two proplets would be vertically incompatible – due to their horizontal relations to different verbs, coded as different values of their respective fnc attributes.
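The two conditions of 3.2.3 translate directly into a small procedure. The following sketch is our own Python illustration (attributes in dict key order, variables marked by a leading '?'); it is not the matching routine of the DBS software.

    def is_variable(value):
        return isinstance(value, str) and value.startswith("?")

    def matches(pattern, inp):
        # 1. Attribute condition: the attributes of the pattern proplet
        #    must be a sublist (equal or less) of those of the input proplet.
        if not all(attr in inp for attr in pattern):
            return False
        # 2. Value condition: each pattern value must be compatible with
        #    the corresponding input value; a variable matches any value.
        return all(is_variable(v) or v == inp[a] for a, v in pattern.items())

    # The matching failure discussed above: same core value dog, but
    # incompatible horizontal relations (fnc: bark vs. fnc: sleep).
    language_dog = {"noun": "dog", "fnc": "bark"}
    context_dog  = {"noun": "dog", "fnc": "sleep"}
    assert not matches(language_dog, context_dog)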
3.3 Storing Proplets in a Content-Addressable Database

By coding the semantic relations of structure within propositions and between propositions solely by means of address values implemented as pointers, the proplets of the language and of the context level are self-contained, order-free items. They may be represented and stored independently of the semantic relations of structure holding between them.6 This freedom is needed to accommodate the database schema of the artificial cognitive agent's memory. Called a word bank, it is content-addressable because it uses (i) the alphabetical order induced by the core values of the proplets (vertical dimension) and (ii) their arrival order (horizontal dimension) for storage and retrieval – in contradistinction to the widely used coordinate-addressable databases, which require an extra index (inverted file).7

6 This is in contradistinction to the constituent structures (trees) of nativist linguistics (FoCL Sect. 8.5; CLaTR Sect. 12.2) and the linear notation (formulas) in truth-conditional semantics (6.4.1).
7 In computer science, an inverted file is an index which lists for each item the storage locations in the database – like a library catalog. Providing "random access," inverted files are the basis of the widely used coordinate-addressable databases (e.g. RDBMS, CLaTR Sect. 4.1).
The two-dimensional structure of a word bank resembles a classic network database (Elmasri and Navathe 2010) with owner and member records – except that member proplets and owner values are used instead of records. A word bank is special compared to other content-addressable databases because it uses the new construct of horizontal token lines. A token line consists of (i) an owner value, preceded by (ii) an empty slot belonging to the now front, preceded by (iii) an open number of member proplets with the same core value as their owner proplet. Consider the word bank of a cognitive agent with language which contains the context and language proplets used in 3.2.4:

3.3.1 STORING PROPLETS IN A WORD BANK

    token line for bark:
      members:   ... [verb: bark | arg: dog | nc: (run 23) | prn: 22]
                     [sur: bellt | verb: bark | arg: dog | nc: (run 123) | prn: 122]
      now front: (empty)
      owner:     bark

    token line for dog:
      members:   ... [noun: dog | fnc: bark | prn: 22]
                     [sur: Hund | noun: dog | fnc: bark | prn: 122]
      now front: (empty)
      owner:     dog

    token line for run:
      members:   ... [verb: run | arg: pro1 | nc: | prn: 23]
      now front: [sur: renne | verb: run | arg: pro1 | nc: | prn: 123]
      owner:     run
After completing the hear mode derivation of a sentence (here 3.4.2), the now front is cleared to make room for the next sentence. The clearance consists in moving the owner values in the affected token lines one step to the right, leaving the proplets which have ceased to be candidates for further concatenation behind in the sediment of the member records. Equivalently, the member proplets may be pushed one step to the left, as in a pushdown automaton. In 3.3.1, the run proplet is left in place at the now front because it could be needed for an extrapropositional coordination with a next proposition. If such a continuation is provided, the verb proplet run is still at the now front such that an operation (11.6.5, 13.3.5, 13.4.5) may fill its nc slot with the address of the next verb proplet.

In a word bank, the vertical reference relation between the language and the context level shown in 3.2.4 is rotated by 90° and established horizontally within the token lines. For example, the language member proplet bark with the prn value 122 matches the preceding context member proplet with the prn value 22. The extrapropositional relation between the first and the second proposition is coded by the address value (run 23) in the nc (next conjunct) slot of the verb proplet bark at the context level, and correspondingly at the language level. This setup is also suitable for the coreference relation, e.g., between a pronoun and its antecedent (CLaTR Chap. 11). Thus, full use is made of the order-free nature of proplets for the purpose of storage and retrieval.
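As a concrete illustration of token lines and now front clearance, consider the following minimal sketch; the class and method names are our own and merely approximate the storage behavior described above.

    from collections import defaultdict

    class WordBank:
        """Token lines keyed by owner value (the core value)."""
        def __init__(self):
            self.members = defaultdict(list)    # sediment, in arrival order
            self.now_front = defaultdict(list)  # proplets still available

        def store(self, proplet, core):
            # New proplets are always written to the now front
            # of the token line of their core value.
            self.now_front[core].append(proplet)

        def clear_now_front(self, keep=()):
            # Move proplets which are no longer concatenation candidates
            # into the sediment; 'keep' names owner values whose proplet
            # stays (e.g. a verb awaiting extrapropositional coordination).
            for owner in list(self.now_front):
                if owner not in keep:
                    self.members[owner].extend(self.now_front.pop(owner))

    wb = WordBank()
    wb.store({"verb": "run", "arg": "pro1", "nc": "", "prn": 123}, "run")
    wb.store({"noun": "dog", "fnc": "bark", "prn": 122}, "dog")
    wb.clear_now_front(keep={"run"})  # run stays for a possible next conjunct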
3.4 Possible Continuations vs. Possible Substitutions

Next we return to the Principle of Input/Output Equivalence 1.5.2. It applies to the algorithm operating on the data structure of proplets. What it means for an artificial agent to (i) take the same input and produce the same output as the human prototype, (ii) disassemble input and output in the same way into parts, and (iii) order the parts in the same way during input and output is especially clear in the case of language:

3.4.1 APPLYING INPUT/OUTPUT EQUIVALENCE TO LANGUAGE

1. Input and output at the language level are the surfaces of natural language signs, such as word forms, phrases, sentences, or texts.
2. The elementary parts into which the signs of natural language disassemble during input and output are word form surfaces.
3. The order of the parts during input and output is time-linear.

The algorithm of Left-Associative grammar (LA grammar, LAG) models time-linearity by taking a sequence of word form surfaces, e.g., a b c d e ..., as input, and by combining what has been analyzed so far, called the sentence start, with the current next word into a new sentence start, i.e. a and b into (a b), (a b) and c into ((a b) c), ((a b) c) and d into (((a b) c) d), etc. In computer science, this kind of combination is called left-associative (Aho and Ullman 1977, p. 47) – which gives LAG its name.8 In the hear mode derivation of a natural language sign, the semantic relations of structure are established by cross-copying values between proplets. As an example, consider the following derivation of the sentence Julia knows John:

8 This name was suggested by Stuart Shieber and Ron Kaplan during the author's 1983–1986 stay at Stanford University.
3.4.2 CONTINUATION-BASED TIME-LINEAR HEAR MODE DERIVATION

    Julia            knows            John

    lexical lookup:
    noun: Julia      verb: know       noun: John
    fnc:             arg:             fnc:
    mdr:             mdr:             mdr:
    prn:             prn:             prn:

    syntactic-semantic parsing:

    1   noun: Julia      verb: know
        fnc:             arg:
        mdr:             mdr:
        prn: 22          prn:

    2   noun: Julia      verb: know       noun: John
        fnc: know        arg: Julia       fnc:
        mdr:             mdr:             mdr:
        prn: 22          prn: 22          prn:

    result of syntactic-semantic parsing:

        noun: Julia      verb: know        noun: John
        fnc: know        arg: Julia John   fnc: know
        mdr:             mdr:              mdr:
        prn: 22          prn: 22           prn: 22

    (In the original figure, arrows indicate the cross-copying of values.)
Lexical lookup replaces the incoming surfaces of the first two words, Julia and knows, with "isolated" proplets, most attributes of which have no value yet. The proplets are stored at the now front of the agent's word bank for syntactic-semantic parsing, which turns them into "connected" proplets by cross-copying core values into the continuation slots of the Julia and know proplets (indicated by the arrows). Next, incremental (time-linear) lexical lookup replaces the surface John with an isolated proplet. It is also stored at the now front and connected by cross-copying, this time between the know and John proplets. The result is a set of self-contained proplets, held together by a common proposition number, here [prn: 22]. There is no need to sort the resulting proplets into the agent's database because they have been connected at the now front in what will become their final storage location during the next clearance (3.3.1).

This analysis is input-equivalent with a natural hearer in that it takes the unanalyzed surface of a next word form as input to lexical lookup and connects the result to the current sentence start (i) by reconstructing the semantic relations of structure (here functor-argument) in terms of cross-copying values and (ii) by means of a categorial operation (e.g., valency canceling, not shown; see 11.6.1, 13.2.5, 13.2.6, 13.3.4, 13.4.4, 13.4.6, 13.4.8, 13.6.2). The time-linear derivation order of DBS is in contradistinction to the substitution-based algorithm of phrase structure grammar (PSG), which is not time-linear, as illustrated by the alternative derivation of our example in 3.4.3 below.
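Before turning to that comparison, the time-linear regime of 3.4.2 may be sketched procedurally. The following is a minimal Python illustration with a toy lexicon and only the two cross-copying steps shown above; the function and attribute names are ours, not those of the DBS software.

    import copy

    LEXICON = {
        "Julia": {"noun": "Julia", "fnc": "", "mdr": "", "prn": None},
        "knows": {"verb": "know",  "arg": [], "mdr": "", "prn": None},
        "John":  {"noun": "John",  "fnc": "", "mdr": "", "prn": None},
    }

    def parse(surfaces, prn=22):
        now_front = []                              # storage at the now front
        for surface in surfaces:                    # strictly time-linear input
            nxt = copy.deepcopy(LEXICON[surface])   # lexical lookup
            nxt["prn"] = prn
            for prev in now_front:                  # connect to sentence start
                if "verb" in nxt and "noun" in prev and prev["fnc"] == "":
                    prev["fnc"] = nxt["verb"]       # cross-copying (NOM+FV)
                    nxt["arg"].append(prev["noun"])
                elif "noun" in nxt and "verb" in prev:
                    nxt["fnc"] = prev["verb"]       # cross-copying (FV+NP)
                    prev["arg"].append(nxt["noun"])
            now_front.append(nxt)
        return now_front

    for proplet in parse(["Julia", "knows", "John"]):
        print(proplet)    # three connected proplets, all with prn 22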
3.4.3 SUBSTITUTION-BASED NON-TIME-LINEAR DERIVATION OF 3.4.2
    phrase structure derivation:

                S
              /   \
            NP     VP
            |     /  \
          Julia  V    NP
                 |     |
               knows  John

    lexical lookup:

    noun: Julia      verb: know       noun: John
    num: sg          tense: pres      num: sg
    gen: fem         subj:            gen: masc
                     obj:

    unification:

    verb: know
    tense: pres
    subj:
    obj:  noun: John
          num: sg
          gen: masc

    result:

    verb: know
    tense: pres
    subj: noun: Julia
          num: sg
          gen: fem
    obj:  noun: John
          num: sg
          gen: masc
As a sign-based approach, PSG does not distinguish between the speak and the hear mode. Instead, the goal is to derive (generate) all well-formed sentences of a natural language from one and the same start symbol S by means of substitutions. For example, in 3.4.3 the start symbol S is substituted with the nodes NP and VP. The VP node in turn is substituted with the nodes V and NP. Then the three lowest non-terminal nodes are replaced by the terminal nodes Julia, knows, and John. The result is a phrase structure tree. Next, the terminal nodes are replaced by feature structures via lexical lookup. Finally, the lexical feature structures are unified (indicated by dotted arrows in the original figure), resulting in one big recursive feature structure (result). The order of unification (Knight 1989) mirrors the derivation of the PS tree.9

9 The sample derivation is generalized insofar as it emphasizes what is common to the different variants of nativism, such as GPSG, HPSG, LFG, and (somewhat more obscurely) GB.
Because the substitution-based PSG derivation generates knows John first, instead of analyzing the sentence word by word from beginning to end, it violates the Principle of Input/Output Equivalence 1.5.2. As a sign-based approach, PSG also violates the Principle of Interface Equivalence 1.5.1 in that the declarative specification of its rules and derivations does not provide any connection to the recognition and action components of the cognitive agent.10 Because the feature structure resulting from unification directly mirrors the structure of the PS tree, the recursive embedding of subfeature structures is essential. For the same reason, PSG represents only isolated sentences.11

In LAG, in contrast, all operations have an interface to external or internal data. Hear mode operations match the lexical proplets provided by time-linear automatic word form recognition. The surface order of sentences in a text and of words in a sentence is coded inside the proplets, making them order-free. Think mode operations match an activated proplet in the agent's database to compute a continuation. This navigation along the semantic relations of structure between proplets re-introduces the macro word order of the language. The speak mode handles the micro word order, the choice of word forms, the agreement, and the function word precipitation with language-dependent lexicalization rules which "read" the proplets traversed by means of matching.

10 The principle of possible substitution and the absence of external interfaces force PSG-based parsers to use sizeable workarounds based on often huge intermediate structures, called charts, tables, or state-sets. Accordingly, in comparison with LA parsers, PSG parsers are inefficient (FoCL Part II).
11 Even according to its own nativist standards, PSG has not provided a complete grammatical analysis of a single natural language, despite practically unlimited funding for more than five decades.
3.5 Cycle of Natural Language Communication

From the perspective of grammar, the cycle of natural language communication is based on switching between the hear mode, the think mode, and the speak mode, thus modeling what is called turn taking (1.1.1):

3.5.1 TURN TAKING FROM THE PERSPECTIVE OF GRAMMAR

1. The hear mode provides input for the think mode.
2. The think mode provides input for the speak mode.
3. The speak mode provides input for the hear mode.

Given two communicating agents A and B, the think mode of agent A provides the input to its speak mode, which in turn activates the hear mode of agent B. After interpreting the incoming surface_x as content, agent B may switch into the speak mode and produce an outgoing content as surface_y:
3.5.2 TURN TAKING AT THE LEVEL OF LANGUAGE

    hear mode of agent B                     speak mode of agent B

    surface_x                                surface_y
        |  interpretation                        ^  production
        v                                        |
        language content ---- think mode ---- language content
A special case of agent B switching from the hear mode to the speak mode is repeating in the speak mode what has just been interpreted in the hear mode, i.e. surface_x = surface_y in 3.5.2. Called the linguistic laboratory set-up (HBTR 2.6.3), this is not a plausible constellation for informative dialog, but constitutes an ideal laboratory condition for empirical work. For the purpose of testing, the software is considered linguistically adequate only if the input to agent B's hear mode (unanalyzed surface) equals the output of agent B's speak mode; a procedural sketch of this test condition is given after 3.5.4 below. A speak mode which maps a cognitive content derived in the hear mode back into the original surface (Chaps. 12, 14) is far from trivial if the content has the form of order-free proplets connected by semantic relations of structure coded as addresses (3.1.2).

However, not only language interpretation but also non-language recognition may provide content for the think mode. Moreover, the think mode may provide content not only for language production but also for non-language action. Therefore, the grammar variant of turn taking 3.5.1 must be complemented with the following non-language sources of content:

3.5.3 THE RECOGNITION-ACTION CYCLE AT THE LEVEL OF CONTEXT

1. Non-language recognition provides content for the think mode.
2. Contents derived by the think mode provide blueprints for action.
3. Action provides content for non-language recognition (feedback).

The turn taking at the level of language 3.5.1 and the recognition-action cycle at the level of context 3.5.3 are connected by reference as follows:

3.5.4 TURN TAKING INCLUDING REFERENCE
    hear mode                                speak mode

    surface                                  surface
        |  interpretation                        ^  production
        v                                        |
        language content ------------------------+
               ^
               |  reference
               v
    recognition --> context content --inferencing--> action
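The laboratory set-up mentioned above amounts to a simple automatic test. The following sketch assumes hypothetical hear_mode and speak_mode functions; it illustrates the test condition, not the DBS implementation.

    def laboratory_test(surface_x, hear_mode, speak_mode):
        """Linguistic laboratory set-up: interpret a surface in the hear
        mode, map the resulting content back into a surface in the speak
        mode, and check that nothing was lost or distorted."""
        content = hear_mode(surface_x)      # surface -> order-free proplets
        surface_y = speak_mode(content)     # proplets -> surface
        return surface_x == surface_y       # adequacy condition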
The motor driving the cognitive cycle of recognition and action at the language and the context level is the principle of balance, which adds a substantial element of cybernetics (Wiener 1948) to DBS:

3.5.5 THE DBS BALANCE PRINCIPLE

Autonomous control drives language and non-language behavior to maintain the agent in a continuous state of balance vis-à-vis changing external and internal environments. The short-, mid-, and long-term success is defined in terms of survival in the agent's ecological niche.

A state of balance is an absolute like truth, and like truth it provides the fixed point necessary for a system of semantic interpretation. But while truth is (at least12) bipolar, balance is monopolar. This may be the reason why balance, unlike truth, is a dynamic principle which drives the cognitive agent's struggle to survive, and to survive comfortably, in a constantly changing world, short-, mid-, and long-term.

Calibrating the DBS system to achieve balance in its ecological niche requires an actual robot. After all, without the robot's external and internal interfaces we would have to recreate every nook and cranny of the changing environment by hand (as in model theory, 2.3.1, Sect. 6.4). This would violate a fundamental credo of nouvelle AI, namely that the world is its own best model (Brooks 1985 et seq.). In short, a DBS robot treats the world as given and concentrates on implementing the cognitive processing of a robot which is capable of (i) surviving autonomously in its ecological niche and (ii) free human-machine communication in the natural language of choice.

12 See FoCL Sect. 20.5 for a discussion of non-bivalent logic systems.
3.6 Operations of the DBS Algorithm

The algorithm of DBS is LA grammar (TCS'92). LA grammar originated as a time-linear counterpart to phrase structure grammar (PS grammar). As an example, consider the LA grammar for context-sensitive a^k b^k c^k (FoCL 10.2.3):13

3.6.1 LA GRAMMAR FOR a^k b^k c^k

    LX =def {[a (a)], [b (b)], [c (c)]}
    STS =def {[(a) {r1, r2}]}
    r1: (X) (a) ⇒ (aX) {r1, r2}
    r2: (aX) (b) ⇒ (Xb) {r2, r3}
    r3: (bX) (c) ⇒ (X) {r3}
    STF =def {[ε rp3]}

13 See FoCL 8.3.7 for the standard PS grammar definition of a^k b^k c^k.
The lexicon LX contains three words. Each word, e.g. [a (a)], is analyzed as an ordered pair, consisting of a surface, e.g. a, and a category, e.g. (a). The categories are defined as lists of category segments. Here, they contain only a single segment which happens to equal the respective surface. The initial state STS consists of a pattern for the first word and a rule package. Here, the pattern specifies that the first word must be of category (a), i.e. it must have the surface a. The rule package specifies that the rules to be tried on the first and the second word are r1 and r2. Thus, not all rules and not all words may be used at the beginning of an a^k b^k c^k input.

A rule consists of (i) a name, e.g. r1, (ii) an operation, e.g. (X) (a) ⇒ (aX), and (iii) a rule package, e.g. {r1, r2}. While a PS grammar derives all the well-formed expressions of a language from the same single start node S by computing possible substitutions, an LA grammar takes a string of the language as input, for example, aaabbbccc, and computes possible continuations. The operation of an LA grammar rule consists of (i) a pattern for a sentence start, (ii) a pattern for a next word, (iii) a double arrow, and (iv) an output pattern. The patterns (i) for the sentence start and (ii) for the next word attempt to match the category of the input, as shown below:

3.6.2 APPLYING r1 TO TWO INITIAL INPUT WORDS

    rule level       r1: (X)      (a)     ⇒  (aX)       {r1, r2}
                          |        |           |
                      matching  matching     output
                          |        |           |
    language level      [a (a)]  [a (a)]     [aa (aa)]

The matching attempt is successful: (i) the unrestricted variable X of the sentence start pattern matches the category (a), whereby (a) is bound to X, and (ii) the pattern (a) of the next word at the rule level matches its counterpart at the language level. The output pattern (aX) adds the category segment a to whatever X has been bound to, here resulting in the output category (aa).14

Next the rule package {r1, r2} of r1 is applied to the current sentence start category, i.e. (aa), and the next word. If the next word happens to be of category (a), r1 is successful and r2 fails. If the next word is of category (b), r2 is successful and r1 fails. Because the three rules of 3.6.1 each match a different next word, the LA grammar is unambiguous and parses in linear time. Linear time complexity has been shown also for a^(2^i) (FoCL 11.5.1) and polynomial time for WW (FoCL 11.5.6).

14 In any successful rule application, the surface of the next word is always added at the end of the sentence start surface, and is therefore handled implicitly by the parser.
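To see the continuation-based regime in action, 3.6.1 may be rendered in a few lines of Python, with categories as strings of segments and rule packages determining which rules are tried on the next word. This is our own illustration of the definition above, not the original LA parser.

    # Each rule returns the new sentence start category, or None on failure.
    def r1(ss, nw):  # (X) (a) => (aX)
        return "a" + ss if nw == "a" else None

    def r2(ss, nw):  # (aX) (b) => (Xb)
        return ss[1:] + "b" if nw == "b" and ss.startswith("a") else None

    def r3(ss, nw):  # (bX) (c) => (X)
        return ss[1:] if nw == "c" and ss.startswith("b") else None

    PACKAGES = {r1: [r1, r2], r2: [r2, r3], r3: [r3]}

    def parse(string):
        if not string.startswith("a"):       # STS: first word of category (a)
            return False
        cat, package = "a", PACKAGES[r1]
        for nw in string[1:]:                # strictly time-linear
            for rule in package:             # try the rules of the package
                out = rule(cat, nw)
                if out is not None:
                    cat, package = out, PACKAGES[rule]
                    break
            else:
                return False
        return cat == ""                     # STF: category exhausted

    assert parse("aaabbbccc") and not parse("aabbbcc")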
Applications of LA grammar in the format of 3.6.1 to natural language syntax worked well (NEWCAT'86, CoL'89, FoCL'99). With the development of a semantic interpretation, however, the lexical items became more and more differentiated. Consequently, the descriptive burden of a rule application migrated more and more from the rule packages to the operation patterns. The function of the rule packages to control the time-linear order of rule applications was further reduced with the distinction between the speak and the hear mode, which complements the grammatical analysis of the language signs with mappings between surfaces and cognitive contents, stored in the agent's content-addressable database (word bank). This made it possible to reformulate many tasks of the older LA grammars for natural language. Storage at the now front in the hear mode (Sects. 3.3, 11.3) and navigating through the word bank in the speak mode finally resulted in the current LAGo15 approach, which uses the operations of the LA grammar rules, but without the rule packages, the start states, and the final states. By controlling the application of operations solely by the availability16 of compatible input, LAGo systems are self-organizing (Kohonen 1988).

15 LAGo stands for LA grammar with operations only, i.e. without rule packages.
16 Applying LA grammars without rule packages to natural language also avoids the "Klauselparadox" (Handl 2012, p. 12).

A LAGo hear mode operation begins by matching its second input pattern with the proplet resulting from the lexical lookup of the current next word. Then the first input pattern looks for a matching proplet at the current now front (11.5.3–11.5.5). Accordingly, the hear operations are backward looking. The two input patterns of a LAGo hear operation are roughly specified by the name of the operation, e.g., NOM+FV for nominative plus finite verb. The details of an operation are coded by its input and output patterns. The operations of the hear mode are of the following kinds:

3.6.3 THE FOUR KINDS OF LAGo HEAR OPERATIONS

1. cross-copying
   establishes the semantic relations of functor-argument and coordination: NOM+FV 9.2.1, line 1; 11.6.1 (subject/predicate); FV+NP 13.2.1, line 5; 13.2.6 (object\predicate); DET+ADN 13.2.1, line 1; 13.2.2 (modifier|modified); ADN+ADN 13.2.1, line 2; 13.2.3 (conjunct−conjunct).
2. substitution
   simultaneously replaces all occurrences of a substitution variable with another value: DET+CN 13.2.1, line 7; 13.2.8.
3. absorption
   integrates the lexical values of a function word into a content word: S+IP 9.2.1, line 2; 11.6.2.
4. suspension
   adds the next word proplet to the now front without establishing a semantic relation: IP+START 9.2.1, line 3; 11.6.3.

Many more examples like the ones mentioned above may be found in the following chapters.

Next we turn to LAGo think. The think mode consists of the operations of (i) navigation and (ii) inferencing. A LAGo navigation operation begins by matching its first pattern with an activated proplet and looks for a successor proplet in the word bank matching its second pattern (12.2.1–12.2.2). Accordingly, the navigation operations are forward looking. The navigation operations are of the following kinds:

3.6.4 THE FOUR KINDS OF LAGo THINK OPERATIONS
1. subject/predicate17
   traversal between a subject and its predicate: 9.2.3 (arcs 1 and 2), V/N (12.3.1), and N/V (12.3.2).
2. object\predicate
   traversal between an object and its predicate: 14.2.1 (arcs 7 and 10), V\N (14.2.8), and N\V (14.2.11).
3. modifier|modified
   traversal between a modifier and its modified: 14.2.1 (arcs 2 and 5), N|A (14.2.3), and A|N (14.2.6).
4. conjunct−conjunct
   traversal between two adjacent conjuncts: 14.2.1 (arcs 3 and 4), A−A (14.2.4), and A←A (14.2.5).

While the derivation order of the hear mode is controlled by the time-linear input, the derivation order of the think mode navigation is controlled by the Continuity Condition:

17 The noun-verb distinction does not express the different uses of a noun as a subject or as an object. To avoid the terminologically incongruent "subject-verb" or "object-verb", we use the classical subject-predicate and object-predicate whenever appropriate.
3.6.5 CONTINUITY CONDITION FOR LAGo NAVIGATION

A LAGo think operation AxB may only be followed by an operation ByC, where A, B, C are patterns and x, y are operators.

The Continuity Condition ensures the time-linear nature of a think mode navigation:

3.6.6 SCHEMATIC TIME-LINEAR THINK MODE NAVIGATION

    AxB
       ByC
          CzD
             ...
As shown in 11.6.6 and 13.5.2, the Continuity Condition does not apply in the hear mode. The two kinds of DBS think mode operations, i.e. navigation and inferencing (Sect. 5.3), are the basis of the DBS speak mode: navigation operations may be turned into think-speak mode operations by inserting a language-dependent lexicalization rule into the sur slot of the second output pattern, e.g. [sur: lexnoun(β̂)] in 12.5.2.
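The Continuity Condition can be pictured as a chaining constraint on operation applications. The following sketch, with hypothetical operation names and patterns reduced to core attributes, checks that each navigation step starts from the proplet where the previous step ended:

    # A navigation operation as (start_pattern, operator, goal_pattern).
    V_N = ("verb", "/", "noun")   # predicate-to-subject traversal
    N_V = ("noun", "/", "verb")   # subject-to-predicate traversal

    def continuous(sequence):
        """3.6.5: an operation AxB may only be followed by ByC."""
        return all(prev[2] == nxt[0]
                   for prev, nxt in zip(sequence, sequence[1:]))

    assert continuous([V_N, N_V, V_N])   # AxB ByC ... is well-formed
    assert not continuous([V_N, V_N])    # goal 'noun' != start 'verb'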
Remark: Comparing LAGo and OOP

The idea behind LAGo is proplets as elementary items which click together into complex contents whenever possible, like atoms combining into molecules. This is perhaps reminiscent of the idea behind object-oriented programming. However, LAGo (Left-Associative grammar with operations only) and OOP (object-oriented programming) differ as follows. OOP places the "methods" (operations) inside the "classes" (proplets), while LAGo activates the operations externally by means of pattern matching between operation patterns and input proplets. Furthermore, OOP uses a class hierarchy to control the application order, while LAGo uses a strictly time-linear derivation order provided by the language input (hear mode) or by navigating along the semantic relations within a content (speak mode).
4. Content and the Theory of Signs
Having outlined the external interfaces, the data structure, the database schema, and the algorithm of a talking robot, let us turn to a rather sensitive topic: meaning – or more generally cognitive content (3.1.2). DBS defines a literal meaning1 (2.6.1) as a content conventionally attached to a language surface.1 Let us concentrate on contents which are defined as concepts, as in the sign kind called symbol (2.6.4) by Peirce – commenting on the sign kinds indexical (2.6.5) and name (2.6.7) in passing.

We approach the notion of an elementary concept on the basis of a straightforward task for robotics. In recognition, the artificial agent must be able to recognize new objects of a known kind, like a brand new pair of shoes. The system should be forgiving in marginal cases, like clogs, but give a clear "no" when it comes to wheelchairs or shovels – and accordingly for elementary actions. Performing elementary recognition and action requires the ability to distinguish between the necessary and the accidental, familiar from philosophy.

1 In DBS, this attachment is formulated by the listing of a sur and a core feature in a proplet.

4.1 Formal Concept Types and Concept Tokens

The considerable literature on concepts presents the schema (Piaget 1926; Stein and Trabasso 1982), the template (Neisser 1964), the feature (Hubel and Wiesel 1962; Gibson 1969), the prototype (Bransford and Franks 1971; Rosch 1975), and the geon (Biederman 1987; Kirkpatrick 2001) approaches. Let us join the debate at a slightly higher level of abstraction by beginning with a distinction which should be common to all of these approaches, namely the distinction between concept types and concept tokens.2

2 The type-token distinction was introduced by Peirce (CP, Vol. 4, p. 537).

We define concepts as non-recursive feature structures with ordered attributes, like proplets. The type of a concept specifies the necessary properties by values which are constants, while the accidental properties are represented by variables. The associated tokens share the attributes and the constant values
with the type, but instantiate the variables by constants which may vary from one token to the next. Consider the following example (FoCL 3.3.1, 3.3.3):

4.1.1 TYPE AND TOKEN OF THE CONCEPT square

    type                        token
    edge 1: α                   edge 1: 2 cm
    angle 1/2: 90°              angle 1/2: 90°
    edge 2: α                   edge 2: 2 cm
    angle 2/3: 90°              angle 2/3: 90°
    edge 3: α                   edge 3: 2 cm
    angle 3/4: 90°              angle 3/4: 90°
    edge 4: α                   edge 4: 2 cm
    angle 4/1: 90°              angle 4/1: 90°
Here, the necessary properties, shared by the type and its token, are the four attributes edge and the four attributes angle. Furthermore, all angle attributes in the type and in the tokens have the same value: the constant 90 degrees. The accidental property is the edge length, represented by the variable α in the type. In a token, all occurrences of α are instantiated by a constant, here 2 cm. Because of its variable, the type of the concept square is compatible with infinitely many corresponding tokens, each with its own edge length.

The definition of concept types and concept tokens as non-recursive feature structures with ordered attributes is well-suited for the computational implementation of a procedure which is central to DBS, namely pattern matching. For this, concept types and concept tokens are embedded as core values into the non-recursive3 feature structure of proplets:

4.1.2 CONCEPTS AS VALUES OF THE CORE ATTRIBUTES
                                        proplets

    concept type                        language level
    edge 1: α cm                        sur: square
    angle 1/2: 90°                      noun: square   (core value = the concept type)
    edge 2: α cm                        fnc:
    angle 2/3: 90°                      mdr:
    edge 3: α cm                        prn:
    angle 3/4: 90°
    edge 4: α cm                             |  match
    angle 4/1: 90°
         |  match                       context level
    concept token                       sur:
    edge 1: 2 cm                        noun: square   (core value = the concept token)
    angle 1/2: 90°                      fnc:
    edge 2: 2 cm                        mdr:
    angle 2/3: 90°                      prn:
    edge 3: 2 cm
    angle 3/4: 90°
    edge 4: 2 cm
    angle 4/1: 90°
The matching between the language and the context proplet is successful because (i) the language proplet and the context proplet have the same attributes (ii) in the same order, and (iii) the values of their respective core attributes noun are compatible – due to the fact that they happen to be the type and a token of the same concept (other values are omitted for simplicity). The great utility of the type-token relation in the matching between language and context proplets containing concepts as core values may be summarized as follows:

4.1.3 WHY THE TYPE-TOKEN RELATION IS IMPORTANT

A type-token relation based on non-recursive feature structures with ordered attributes is easily computed. Matching the concept type of a language core value with the concept token of a context core value enables the language proplet to refer in different utterance situations to different context proplets, including reference to tokens never encountered before, but of a known kind.

Let us assume, for example, that the robot is shown new geometrical objects which differ from previous ones in their size. If we say Pick the square!, then at least the middle part of the robot's reference mechanism (4.5.2), i.e. the matching between the language and the context level, will be able to do it.
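The ease of computing this type-token relation may be illustrated with a small sketch, using a dict-based stand-in for the feature structures of 4.1.1; the names and representation are ours:

    # Concept type for 'square': constants for the necessary properties,
    # the variable 'α' for the accidental edge length.
    SQUARE_TYPE = {"edge 1": "α", "angle 1/2": 90, "edge 2": "α",
                   "angle 2/3": 90, "edge 3": "α", "angle 3/4": 90,
                   "edge 4": "α", "angle 4/1": 90}

    def match_type_token(ctype, token):
        bindings = {}
        for attr, val in ctype.items():
            if val == "α":                            # variable: bind once,
                bindings.setdefault("α", token[attr]) # then require the same
                if token[attr] != bindings["α"]:      # instantiation throughout
                    return False
            elif token[attr] != val:                  # constant: must be identical
                return False
        return True

    # A token never encountered before, but of a known kind:
    token = {"edge 1": 7, "angle 1/2": 90, "edge 2": 7, "angle 2/3": 90,
             "edge 3": 7, "angle 3/4": 90, "edge 4": 7, "angle 4/1": 90}
    assert match_type_token(SQUARE_TYPE, token)   # any edge length fits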
4.2 Proplet Kinds and their Mechanisms of Reference

Regardless of their reference mechanism (2.6.4, 2.6.5, 2.6.7), core values are not combined directly into more complex cognitive structures. Instead they are embedded into the non-recursive feature structure of proplets. In DBS, there are three basic kinds of proplets, namely noun (N), verb (V), and adj (A). The grammatical terms noun, verb, and adj correspond to the distinction between (i) object, (ii) relation, and (iii) property in philosophy (ontological role, 2.6.8). The corresponding terms in symbolic logic are (i) argument, (ii) functor, and (iii) modifier (FoCL 3.4.1). Because verb proplets work only with the reference mechanism of symbols and adj proplets do not work with the reference mechanism of baptism (2.6.9), embedding the three reference mechanisms of core values into the three proplet structures (ontological roles) results in six (rather than nine) basic proplet structures.

3 Defining concept core values as feature structures and embedding them into proplets does not turn the latter into recursive feature structures because the embedding takes place in the lexicon before runtime and is therefore what is called static in computer science.
These six proplet structures may be illustrated as follows:

4.2.1 EXAMPLES OF THE SIX MAIN KINDS OF PROPLETS

    N, symbol        N, indexical     N, name
    sur: Dach        sur: ihn         sur: Waldi
    noun: roof       noun: pro3       noun: (dog 23)
    cat: snp         cat: obq         cat: snp
    sem: def sg      sem: sg m        sem: nm m
    fnc: see         fnc: see         fnc: see
    mdr: red         mdr:             mdr:
    nc:              nc:              nc:
    pc:              pc:              pc:
    prn: 1           prn: 2           prn: 3

    A, symbol        A, indexical     V, symbol
    sur: schwarz     sur: dort        sur: las
    adj: black       adj: loc2        verb: read
    cat: adn         cat: adv         cat: #ns3′ #a′ decl
    sem: pad         sem:             sem: past
    mdd: book        mdd: sleep       arg: John book
    mdr:             mdr:             mdr:
    nc:              nc:              nc: (sleep 7)
    pc:              pc:              pc:
    prn: 4           prn: 5           prn: 6
The examples are language proplets, due to their non-empty sur slots containing values of German. By deleting the sur values, we obtain the corresponding context proplets. The construction of complex cognitive content is defined via the comparatively language-independent core values, and not over the language-dependent surfaces (3.1.2). The attributes and values have the following functions:4

4 The principle for ordering the attributes in a proplet is described in Sect. A.2.

4.2.2 PROPLET ATTRIBUTES AND THEIR VALUES

1. Surface attribute: sur
   All proplet kinds have a surface attribute. If it has no value, the proplet is a context proplet. A language proplet gets its sur value from the lexicon (3.2.2, Sects. A.4–A.7) during automatic word form recognition (Sect. 11.1) and production (Sects. 12.4–12.6), or by generalized baptism.
2. Core attributes: noun, verb, adj
   Leaving function words aside, the value of a core attribute may be a concept, a pointer, or an address, corresponding to the sign kinds symbol, indexical, and name, respectively. Symbols and indexicals receive their core values from the lexicon, while a name receives it by generalized baptism.
3. Grammatical attributes: cat, sem
   The attributes cat (category) and sem (semantics) get their initial values from the lexicon; they may be modified during the hear mode derivation.
4. Continuation attributes of functor-argument: fnc, arg, mdd, mdr
   The values of continuation attributes are addresses supplied by cross-copying during time-linear concatenation (3.4.2). Intrapropositional values are coded by short address, e.g. [arg: dog], while extrapropositional values are coded by long address, e.g. [arg: (dog 23)]. In complete propositions, the values of fnc (functor), arg (argument), and mdd (modified) must be non-NIL (obligatory continuation values), while that of mdr (modifier) may be NIL (optional continuation value).
5. Continuation attributes of coordination: nc, pc
   The continuation attribute nc (next conjunct) is used intra- (8.2.1, lines 2 and 5) and extrapropositionally (9.2.1, lines 4 and 8), while pc (previous conjunct) is used only intrapropositionally. Like the modifier|modified relations, the conjunct−conjunct relations are optional.
6. Bookkeeping attribute: prn
   The basic value of the prn (proposition number) attribute is an integer. At the beginning of a new content, the value is provided by the parser, while in subsequent continuations the value is provided by the content-building rules (13.3.2, 13.3.4). As additional bookkeeping attributes, wrn (word number) and trc (transition counter) may be used (FoCL 24.4.3).

To widen the codomain of certain operation patterns, DBS uses replacement variables. In contrast to binding variables (for example, α and β in 11.6.1), replacement variables are not bound to a value, but replaced by it. One kind5 of replacement variable is called RV_n and ranges over values. A lexical proplet which has a replacement variable as its core value is called a proplet shell. It generalizes over the distinction between pattern proplets and content proplets:

5 The other kind of replacement variable is called RA_m and ranges over different attributes (5.1.3).

4.2.3 SUBSTITUTION OF A REPLACEMENT VARIABLE
    semantic core: apple

    proplet shell       pattern proplet      content proplet
    noun: RV_1          noun: α              noun: apple
    fnc:                fnc:                 fnc:
    mdr:                mdr:                 mdr:
    prn:                prn:                 prn:
The pattern proplet results from substituting the replacement variable RV_1 of the core attribute noun in the proplet shell with the binding variable α, while the content proplet results from substituting RV_1 with the concept apple. The resulting pattern proplet and lexical proplet are compatible for matching:
4.2.4 PROPLET SHELL, LEXICAL ITEM, AND OPERATION PATTERN

    proplet shell       pattern proplet      content proplet
    noun: RV_1          noun: α              noun: apple
    fnc:                fnc:                 fnc:
    mdr:                mdr:                 mdr:
    prn:                prn:                 prn:
                          └── compatible for matching ──┘
The proplet shell, the pattern proplet, and the content proplet differ only in their core value. In the proplet shell, the core value is the replacement variable RV_1. In the pattern proplet, the core value is the binding variable α, which may be bound vertically to a core value during an operation application. In the content proplet, the core value is the concept apple. A pattern proplet may be turned into a set of grammatically similar content proplets by defining a restriction on the binding variable, e.g., α ∈ {calf, elf, half, hoof, leaf, loaf, oaf, scarf, self, sheaf, shelf, thief, turf, wharf, wolf} in morphology. A content proplet may be turned into a pattern proplet by replacing a constant by a variable.

While the content of a proplet is represented by (a) its core value, e.g. eat, (b) its lexical cat values, e.g. n′ a′ v, and (c) its lexical sem values, e.g. past, its combinatorics is represented by (i) the core attribute, e.g. verb, (ii) the obligatory continuation attribute(s), e.g. arg, (iii) the optional continuation attribute(s), e.g. mdr, and (iv) the bookkeeping attribute(s), e.g. prn. This way of separating the content and the combinatorics of a proplet makes it possible to treat the combinatorics relatively independently of the content and vice versa. For handling the combinatorial aspect of proplets grammatically, it is sufficient to represent their core values with placeholders, i.e. a letter sequence like red without an explicit operational definition in terms of recognition.

A working model of communication (talking robot), however, requires core values which have declarative specifications with adequate procedural counterparts. For example, if we present different geometric objects to a talking robot and tell it to pick up a square, it must be able to (i) interpret the word square at the language level, (ii) select the shape in question at the context level, and (iii) relate the language meaning to the corresponding context referent. As a sign, the word square is of the kind symbol, and thus takes a concept as the value of its core attribute (4.1.2). Similarly, if [noun: apple] appears as the core feature of a proplet, then the value apple should not be merely a placeholder, but represent a concept
suitable for recognizing appropriate referents. For this, the concept must have been implemented as an effective recognition procedure of the cognitive agent (robot), capable of distinguishing apples from pears and peaches.
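Before turning to context recognition, the substitution of replacement variables (4.2.3) may be summarized in a brief sketch; the representation is again our own:

    RV_1 = "RV_1"   # a replacement variable is replaced, not bound

    def proplet_shell():
        return {"noun": RV_1, "fnc": "", "mdr": "", "prn": ""}

    def substitute(shell, value):
        # Simultaneously replace all occurrences of the replacement variable.
        return {a: (value if v == RV_1 else v) for a, v in shell.items()}

    pattern_proplet = substitute(proplet_shell(), "?alpha")  # binding variable
    content_proplet = substitute(proplet_shell(), "apple")   # concept

    # A restriction on the binding variable turns the pattern proplet into
    # a set of grammatically similar content proplets:
    restriction = {"calf", "elf", "half", "hoof", "leaf", "loaf", "oaf",
                   "scarf", "self", "sheaf", "shelf", "thief", "turf",
                   "wharf", "wolf"}
    similar = [substitute(proplet_shell(), noun) for noun in restriction]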
4.3 Context Recognition

The type-token relation of concepts is used not only for the reference mechanism of one of the sign kinds (symbol), but is also central for the external interfaces at the context level. Thereby, the context recognition and action procedures provide the same switching between a modality-independent and a modality-dependent coding as language interpretation and production (2.2.1). Using the type-token relation, the context recognition 2.1.2 may be refined:

4.3.1 CONCEPT TYPE AND TOKEN IN CONTEXT RECOGNITION

    cognitive agent without language:

    changes in the external environment
                |
                v
      recognition component <--- concept type <--- memory
                |
                v
          concept token ---------------------------> memory
The recognition component produces incoming parameter values, for example, in the form of a bitmap, which are matched by a concept type provided by the agent's memory (database). Thereby, the variables of the type are instantiated by constants. The resulting concept token may be stored in memory. The matching process 4.3.1 may be shown more concretely as follows:

4.3.2 CONCEPT TYPE AND CONCEPT TOKEN IN RECOGNIZING A SQUARE

    bitmap outline          concept type            concept token
    (a square, 2 cm)        edge 1: α cm            edge 1: 2 cm
                            angle 1/2: 90°          angle 1/2: 90°
                            edge 2: α cm            edge 2: 2 cm
                            angle 2/3: 90°          angle 2/3: 90°
                            edge 3: α cm            edge 3: 2 cm
                            angle 3/4: 90°          angle 3/4: 90°
                            edge 4: α cm            edge 4: 2 cm
                            angle 4/1: 90°          angle 4/1: 90°
The type matches the outline of all kinds of different squares, whereby its variables are instantiated in the resulting tokens. Our preliminary6 holistic method of representing the concept square may be applied to other polygons, using the same recognition procedure based on matching concept types with bitmap outlines. The approach may also be extended to other kinds of data, such as the recognition of colors, defining concept types as certain intervals in the electromagnetic spectrum and the tokens as particular constant values in such an interval (CoL, pp. 296 ff.). The method is even suitable to implement the recognition of relations like "A is contained in B" or "A moved from X to Y."

Today, there exist pattern recognition programs which are already quite good at recognizing geometric objects. They differ from our approach in that they are based almost completely on statistics. However, even if the terms type and token may not be found in their theoretical descriptions, the type-token distinction is nevertheless implicit in any pattern recognition process. Furthermore, the rule-based, incremental procedures of pattern recognition presented in L&I'05 are well-suited to be combined with statistical methods. As shown by the work of Steels (1999), suitable algorithms may evolve new types automatically from similar data by abstracting from what they take to be accidental. For DBS, the automatic evolution of types has to result in concepts which correspond to those of a language community. This may be achieved by presenting the artificial agent with properly selected data in combination with human guidance (guided patterns method, CLaTR Sect. 6.2).

6 For an incremental, memory-based procedure of pattern recognition using geons (Biederman 1987), see L&I'05.
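For instance, the interval-based treatment of colors mentioned above may be sketched as follows (wavelength boundaries approximate, representation ours):

    # Concept types for colors as intervals in the electromagnetic spectrum;
    # a token instantiates the type with a particular measured wavelength.
    COLOR_TYPES = {"red": (620, 750), "green": (495, 570), "blue": (450, 495)}

    def recognize_color(wavelength_nm):
        for color, (low, high) in COLOR_TYPES.items():
            if low <= wavelength_nm <= high:
                # Successful matching instantiates the type as a token.
                return {"type": color, "token": wavelength_nm}
        return None

    assert recognize_color(700)["type"] == "red"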
4.4 Context Action

Having used the type-token relation for recognition in 4.3.1, let us apply it next to the reconstruction of action. Action may be divided into (i) intention as the process of developing a blueprint for action, and (ii) execution as the mechanism of realizing the blueprint by changing the environment. The purpose of the agent's actions is maintaining a state of balance vis-à-vis a constantly changing external and internal environment (3.5.5), as presented by recognition. In this sense, changes in the agent's ecological niche and its capability to recognize them constitute the motor driving the agent's action. Like recognition, action is based on concept types and concept tokens, but in the inverse order: while recognition is based on the sequence periphery-type-token, action is based on the sequence token-type-periphery:
4.4.1 CONCEPT TYPES AND CONCEPT TOKENS IN ACTION

    cognitive agent without language:

    memory ---> concept token
                     |
                     v
    memory ---> concept type
                     |
                     v
            action component ---> changes in the external environment
The input to an execution is an intention. Intentions (in the sense of wanting to act in a certain way) must be specific with respect to time and space, the properties of the objects involved, and the desired result. Therefore, intentions are defined as concept tokens which specify a desired future state. Execution, as the concept type corresponding to the intention token, is defined as a general realization procedure, for example, drinking from a cup. A given intention may be provided with an execution procedure by the agent’s database. The execution procedure (type) is matched with the intention (token), whereby the variables of the execution type are instantiated by the constants of the intention (specialization). The specialized execution is realized by peripheral cognition as changes in the environment. Tasks for which a standard execution is not (yet) available, like the creative opening of a difficult lock in an unfamiliar door, are realized by trying different combinations of smaller standard executions.
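In code, this inverse use of the type-token relation might look as follows; the attribute names and the example are hypothetical:

    # Execution type: a general realization procedure with variables.
    DRINK_FROM_CUP = {"action": "drink", "object": "?cup", "time": "?t"}

    # Intention token: a desired future state, specific in all respects.
    intention = {"action": "drink", "object": "blue cup", "time": "now"}

    def specialize(execution_type, intention_token):
        """Match the execution type with the intention token, instantiating
        the type's variables by the intention's constants (specialization)."""
        specialized = {}
        for attr, val in execution_type.items():
            if val.startswith("?"):
                specialized[attr] = intention_token[attr]  # instantiate variable
            elif val != intention_token[attr]:
                return None                                # incompatible constants
            else:
                specialized[attr] = val
        return specialized   # ready to be realized by the periphery

    assert specialize(DRINK_FROM_CUP, intention)["object"] == "blue cup"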
4.5 Language Interpretation and Production

Let us return from the recognition and action procedures at the context level to the corresponding procedures at the language level. There, recognition is called interpretation and action is called production. From a technical point of view, we may treat the language component as a copy of the context component (CLaTR Sect. 14.1), such that sign interpretation and production function as a secondary use of the (non-language) recognition and action procedures which have been developed for the context level.

For example, just as the recognition of the object of a square at the context level is based on parameter values in a certain modality (here vision) which are matched by a concept type and instantiated as a concept token (4.3.2), the recognition of the surface of the word square is based on parameter values in a certain modality (e.g., hearing) which are matched by a surface type and instantiated as a surface token, and similarly for surface synthesis in language production (4.4.1).

As physical objects, e.g., as sound waves or dots on paper produced by the speaker and perceived by the hearer, the external surfaces carry neither a syntactic category nor a semantic representation nor any other grammatical property. Once recognized, however, their single purpose within cognition is to serve as keys: for the hearer, each recognized surface provides access to a lexical entry stored in memory, which contains a corresponding sur value. Consider the following example:

4.5.1 LEXICAL LOOKUP OF GERMAN Apfel IN THE HEAR MODE
[Diagram: agent in the hear mode. The external sign Apfel is mapped by sign recognition onto an internal surface, which triggers lexical lookup of the proplet [sur: Apfel, noun: apple, fnc:, mdr:, prn:].]
The lexical entry is defined as a proplet with sur as its first attribute, followed by other attributes such as the core attribute noun with the core value apple as the (literal) meaning1. The proplet format of a non-recursive feature structure with ordered attributes firmly connects the lexical properties, of which the sur feature is just one. In each lexical entry (A.4–A.7), the list of attributes and the sur, core, cat, and sem values are held together and ordered by means of a convention,7 which each member of the language community has to learn.

7 In 1913, de Saussure (1972) described the convention-based connection between surface and meaning in his first principle as l'arbitraire du signe (FoCL Sect. 6.2). Technically, DBS codes the conventions of a natural language by means of feature structure definitions.
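A minimal sketch of such a lookup, assuming a toy lexicon; the dictionary LEXICON and the function lexical_lookup are hypothetical illustrations, not the JSLIM lexicon format.

```python
# Sketch of hear mode lexical lookup (illustrative only): the recognized
# surface serves solely as a key; all grammatical properties come from
# the stored entry, held together by convention.

LEXICON = {
    "Apfel": {"sur": "Apfel", "noun": "apple", "fnc": "", "mdr": "", "prn": ""},
}

def lexical_lookup(surface_token):
    entry = dict(LEXICON[surface_token])  # fresh copy for this utterance
    entry["sur"] = surface_token          # the token replaces the type value
    return entry

print(lexical_lookup("Apfel"))
```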
The processing of surfaces in the speak and the hear mode constitutes the third application of pattern matching based on the type-token relation. The three8 locations in the cycle of natural language communication which use the type-token relation for pattern matching9 may be summarized schematically as follows:

8 A fourth, less general, use is the relation between an episodic and an absolute proposition (5.2.1, 5.2.2).
9 The DBS pattern matching based on the type-token relation uses variables indirectly, i.e. embedded into the type. A direct use of variables for pattern matching, in contrast, is shown by applications of the LAGo hear (Chaps. 11 and 13) and the LAGo think rules (Chaps. 12 and 14).
4.5.2 THREE TYPE-TOKEN RELATIONS IN THE COMMUNICATION CYCLE

[Diagram: a hearer processing the external surface Look, a square! At the language level, the external surface square is matched by a surface type and instantiated as a surface token (✽); lexical lookup yields the proplet [sur: square_token, noun: square_type]. At the context level, the external object is matched by a concept type and instantiated as a concept token (✽); proplet lookup yields [noun: square_token]. Internal matching relates the two levels vertically.]
The instances of a type-token relation are indicated by ✽. As shown in FoCL Sect. 23.5, there are altogether 10 different variants of 4.5.2. Called the SLIM states of cognition, they depend on different combinations of recognition and action at the language and the context level. In 4.5.2, the hear mode is shown in a constellation of immediate reference (Sect. 2.4; FoCL 23.5.8, SLIM 8).

At the language level, recognition of the external surface square is based on matching it with a surface type and instantiating it as a token. The type triggers the lexical lookup of an appropriate proplet. The surface token replaces10 the original type value of the sur attribute (indicated as [sur: square_token]). The lexical value of the core attribute noun is the concept type square (indicated as [noun: square_type]).

At the context level, recognition of the external object is based on a concept type, which is instantiated as a concept token. The type triggers lookup of a corresponding context proplet. The token replaces the original type11 value of the core attribute of the context proplet (indicated as [noun: square_token]). As a result, the feature [noun: square_type] at the language level can be matched with the feature [noun: square_token] at the context level.12

10 This is because the hearer may remember the pronunciation of a word or a phrase. Also, we would like to maintain the analogy with the context level. So far, JSLIM implementations have simplified the issue by deleting the surface value right after lexical lookup (excepting name proplets), thus turning language proplets into context proplets.
11 Because effective procedures for concepts are not yet available, concepts are represented by placeholders. In placeholders, the distinction between concept types and concept tokens is in name only.
12 Apart from the vertical internal matching of types with tokens, there is also the possible matching of types with types. This case arises in the interpretation of absolute, rather than episodic, propositions (Sect. 5.2). For example, in the proposition A square has four corners, the context proplet square would contain the concept type rather than a token as its core value. This is no problem for matching with a corresponding language proplet, however, because matching between identical types is structurally straightforward.
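The vertical matching of 4.5.2 may be sketched as follows; since concepts are currently represented by placeholders (cf. footnote 11), the type-token test reduces to comparing concept names. All names in the sketch are illustrative.

```python
# Sketch of internal matching between language and context level
# (illustrative only): the language proplet carries a concept type as its
# core value, the context proplet a concept token of the same concept.

def is_token_of(token, concept_type):
    """With placeholder concepts, a token matches its type if both carry
    the same concept name."""
    return token["concept"] == concept_type["concept"]

language_proplet = {"noun": {"concept": "square", "kind": "type"}}
context_proplet  = {"noun": {"concept": "square", "kind": "token", "edge": 2}}

print(is_token_of(context_proplet["noun"], language_proplet["noun"]))  # True
```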
4.6 Universal versus Language-Dependent Properties

The function of natural language is communication. If we add the principle that form, including the form of language, follows function, then the different natural languages share a common core – simply for reasons of how natural language communication works (CLaTR 4.1.2). The goal of DBS is to characterize this common core as an abstract declarative specification, such that all the differences between the various natural languages, like English, Chinese, Tagalog, or Quechua (CLaTR 3.5.2), have no effect on the basic functional framework. This raises the following questions:

• What are the differences between natural languages?
• How should these differences be integrated into Database Semantics?

From a functional point of view, DBS treats the differences between natural languages as relatively minor variations at certain low-level locations, such as the values of certain attributes. From the viewpoint of linguistics, however, the handling of the universal versus language-dependent aspects of natural language in DBS may be characterized in terms of the following hierarchy:

4.6.1 HIERARCHY OF NOTIONS AND DISTINCTIONS IN DBS
[Hierarchy diagram with levels 0–9; the levels above the dotted line are universal, those below language-dependent. 0: time-linear concatenation. 1: word form, with the vertical relation of reference and the horizontal relations of functor-argument and coordination. 2: content word vs. function word (det, conj, prep, ...). 3: symbol, indexical, name. 4: noun, verb, adjective. 5: active, passive, medium, ...; adnominal vs. adverbial. 6: indicative, subjunctive, ...; the degrees pad, cad, sad. 7: present, imperfect, future, .... Further language-dependent distinctions: case (nom, gen, dat, acc, ...), gender (masc, fem, neut), valency (8: N, N_A, N_D, N_D_A, ...), person (9: 1st, 2nd, 3rd), and number (sg, pl).]
The most basic property common to different natural languages is their time-linear structure (FoCL 5.4.1, 5.4.2). In DBS, this universal property is handled with the algorithm of LA grammar computing possible continuations (level 0). The most obvious difference between natural languages lies in their surfaces. DBS handles this difference by language proplets having different values of the sur attribute (3.2.1).

In all natural languages, expressions may be disassembled into a sequence of word forms as the basic items of syntax-semantics. In DBS, this property is handled by automatic word form recognition in the hear mode and automatic word form production in the speak mode. Furthermore, the word forms in expressions of all natural languages participate in two kinds of relations (3.2.4). One is the vertical relation of reference between the language and the context level, handled in DBS as pattern matching (3.2.4) in combination with inferencing (5.4.4). The other is the horizontal relation between proplets, based on the semantic relations of structure (3.1.2) and established by means of addresses (level 1).

The degree to which grammatical information (such as number, gender, tense, honorifics, prepositions, and conjunctions) is handled (i) in the morphology by means of allomorphy and affixation or (ii) in the syntax-semantics by means of function words varies from language to language. There remains, however, the general question of (i) how content words and function words should be distinguished and (ii) how their interaction should be treated. DBS defines the content words as those which have their own mechanism of reference (Autosemantika13), i.e. symbols, indexicals, and names, whereas the function words are those without their own mechanism of reference (Synsemantika), such as determiners, conjunctions, auxiliaries, and prepositions in English. Technically, function words are handled by absorption in the hear mode and by precipitation in the speak mode (level 2).

We know of no language in which the three mechanisms of reference, i.e. the iconic (2.6.4), the indexical (2.6.5), and the baptism reference (2.6.7), are not used. In DBS, iconic reference is based on pattern matching between concept types and concept tokens; indexical reference is based on pointing at parameter values provided by the expression's STAR (2.6.2, 15.6.2); and baptism reference is based on cross-copying between the referent and the lexical name (level 3). Moreover, the iconic reference of symbols occurs with nouns, verbs, and adjectives, the indexical reference occurs with nouns and adjectives, while the baptism reference of names occurs only with nouns (2.6.8–2.6.10, 4.2.1).
13 Marty (1908), pp. 205 f.
In DBS, the types and tokens of iconic reference are defined as the core values of concept words, based on the recognition and action procedures of an autonomous robot. Regarding the horizontal relations between content words, we know of no language in which functor-argument and coordination are not used as the semantic relations of structure14 (level 4). In English, as in all other European languages, functor-argument divides into the obligatory subject/predicate and object\predicate relations and the optional adnominal|noun and adverbial|verb relations; coordination is optional and divides into noun−noun, verb−verb, and adj−adj. Finally, functor-argument and coordination occur at the elementary, the phrasal, and the clausal level of grammatical complexity (6.6.1).15 In DBS, these relations are defined between pairs of individual proplets and are coded by the proplets' attributes and their address values.

The distinctions shown above the dotted line in 4.6.1 are universal in that they concern functional issues of communicating with natural language. The distinctions below the dotted line, in contrast, concern differences in the concrete coding of grammatical properties. For example, classical Greek has the genus verbi medium in addition to active and passive (Portner 2005); in German, the unmarked form of an adjective is the adverbial, whereas in English it is the adnominal; in classical Latin, third person nouns are morphologically distinguished with respect to gender and case, which have only rudimentary reflexes in the English system of personal pronouns; compared to Russian, the word order regularities of German are rather fixed, but freer than those of English; the aspect of progressive in English has no counterpart in classical Latin; and so on.
14 Orthogonal to the semantic relations of structure is the use of "particles" which pragmatically shade the utterance meaning of a sentence or phrase, especially in spoken dialog. Examples in English are exactly in That wasn't exactly cheap! and pretty in That was pretty expensive!. Examples in German (Weydt 1969, Engel 1991, Ickler 1994) are ja (yes), as in Das ist ja wunderbar! (That is yes wonderful!), aber (but), as in Das ist aber schön! (That is but beautiful!), auch (also), as in Sind Sie auch zufrieden? (Are you also content?), bloss (merely), as in Was hat sie sich bloss dabei gedacht?! (What has she herself merely thereby thought?!), and vielleicht (perhaps), as in Das war vielleicht teuer! (That was perhaps expensive!). Particles help to express the speaker's attitude, for example, as a polite softening of questions and requests, a grading of certainty, and a roundabout means of emphasis.
15 While symbolic logic recognizes coordination in the form of propositional calculus (clausal) and functor-argument as part of predicate calculus (elementary), it does not model the distinction between the three levels of grammatical complexity.
5. Forms of Thinking
Thought is the cognition which connects an agent's recognition (input) and action (output). As the processing of content, it also occurs without recognition or without action as the limiting cases. The motor driving cognition is the principle of balance, i.e. the agent's inherent drive to maintain a state of equilibrium vis-à-vis a constantly changing external and internal environment. The computational reconstruction of thought in DBS uses constructs developed for programming the cycle of natural language communication. They are the data structure of proplets, the content-addressable database schema of a word bank, the time-linear algorithm of LA grammar, the pattern matching based on concept types and concept tokens, and the automatic derivation of patterns from contents and of contents from patterns (CLaTR Sect. 6.5).
5.1 Retrieving Answers to Questions

In DBS, different forms of thinking are based on different ways of navigating through content in the agent's database. There are three kinds:

5.1.1 THREE DIFFERENT KINDS OF THOUGHT

1. Navigating across token lines: Selective activation follows the intra- and extrapropositional functor-argument and coordination relations in content stored in the agent's word bank (Chaps. 12, 14).
2. Navigating along token lines: Querying constructs query patterns (think-speak mode) which are moved along token lines to derive query answers (hear-think-speak mode).
3. Inferencing: Activated content matches the antecedent (forward chaining) or the consequent (backward chaining) of an inference to derive new contents as reasoning and as blueprints for action (Sect. 5.3).
In this section, we begin with (2), i.e. the mechanism of querying a word bank and deriving appropriate answers. Consider an agent thinking about girls. This means activating the corresponding token line in the agent's word bank, such as the following example:

5.1.2 EXAMPLE OF A TOKEN LINE

owner value: girl
member proplets (oldest at the left, now front at the right):
[noun: girl, fnc: walk, mdr: young, prn: 10]
[noun: girl, fnc: sleep, mdr: blonde, prn: 12]
[noun: girl, fnc: eat, mdr: small, prn: 15]
[noun: girl, fnc: read, mdr: smart, prn: 19]
As indicated by the fnc and mdr values of the connected proplets (member proplets, 3.3.1), the agent happened to observe or hear about a young girl walking, a blonde girl sleeping, a small girl eating, and a smart girl reading.
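As a rough illustration of this storage schema, the following sketch models a word bank as a mapping from owner values to token lines; the function store and the proplet format are hypothetical simplifications, not the JSLIM database.

```python
# Sketch of a word bank (illustrative only): a content-addressable store
# keyed by owner values, with each new member proplet appended at the
# now front of its token line.

from collections import defaultdict

def store(word_bank, proplet):
    owner = proplet["noun"]          # owner value = core value (simplified)
    word_bank[owner].append(proplet) # append at the now front

word_bank = defaultdict(list)
for fnc, mdr, prn in [("walk", "young", 10), ("sleep", "blonde", 12),
                      ("eat", "small", 15), ("read", "smart", 19)]:
    store(word_bank, {"noun": "girl", "fnc": fnc, "mdr": mdr, "prn": prn})

print(word_bank["girl"][-1])   # most recent proplet of the line, prn 19
```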
Traversing the token line of any given core value (concept) is powered by the following LAGo think grammar, called LA think.line:

5.1.3 DEFINITION OF LA THINK.LINE
mvforwd: [RA.1: α, prn: K] ⇒ [RA.1: α, prn: K+1]
mvbckwd: [RA.1: α, prn: K] ⇒ [RA.1: α, prn: K−1]
where RA.1 ε {noun, verb, adj}
The core attributes are represented by the replacement variable RA.1 (A.3.3, 2), which is restricted (or rather generalized) to the attributes noun, verb, and adj. This makes it possible to formulate a single grammar, e.g. 5.1.3, for navigating along different token lines with different core attributes.1 The prn variables K, K+1, and K-1 follow the convention that K+1 stands for the proplet immediately following the current proplet K in the token line, and K-1 for the proplet immediately preceding it. The operation mvforwd traverses a token line from left to right (forward), following the temporal order, while the operation mvbckwd moves from right to left in the anti-temporal order (backward). Consider the application of mvforwd to the proplet with the prn value 12 in 5.1.2:

5.1.4 EXAMPLE OF AN LA THINK.LINE OPERATION APPLICATION
pattern level: mvforwd: [RA.1: α, prn: K] ⇒ [RA.1: α, prn: K+1]
content level: input ⇑ [noun: girl, fnc: sleep, mdr: blonde, prn: 12], output ⇓ [noun: girl, fnc: eat, mdr: small, prn: 15]
During matching between the input pattern and the input proplet, the replacement variable RA.1 is substituted by the attribute noun, and the binding variables α and K are bound to the values girl and 12, respectively. This makes it possible to navigate to the output proplet (prn value 15).
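Procedurally, the two operations may be approximated as follows; this is a simplified illustration of 5.1.3 with the token line of 5.1.2 hard-coded and the restriction of RA.1 left implicit.

```python
# Sketch of LA think.line (illustrative only): navigating along a token
# line in temporal (mvforwd) or anti-temporal (mvbckwd) order.

line = [{"noun": "girl", "fnc": f, "mdr": m, "prn": k}
        for f, m, k in [("walk", "young", 10), ("sleep", "blonde", 12),
                        ("eat", "small", 15), ("read", "smart", 19)]]

def mvforwd(token_line, prn):
    """From the proplet with prn value K, move to the one with K+1."""
    i = [p["prn"] for p in token_line].index(prn)
    return token_line[i + 1] if i + 1 < len(token_line) else None

def mvbckwd(token_line, prn):
    """From the proplet with prn value K, move to the one with K-1."""
    i = [p["prn"] for p in token_line].index(prn)
    return token_line[i - 1] if i > 0 else None

print(mvforwd(line, 12))   # proplet with prn 15, as in 5.1.4
print(mvbckwd(line, 12))   # proplet with prn 10
```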
Such movement along a token line may be extended to answering questions. In natural language, there are two basic kinds of query expression, called (i) WH interrogatives and (ii) Yes/No interrogatives, illustrated below with examples relating to the token line example 5.1.2:

5.1.5 BASIC KINDS OF QUERY EXPRESSION IN NATURAL LANGUAGE
WH interrogative: Which girl walked?
Yes/No interrogative: Did the young girl walk?
In DBS, these interrogatives translate into the following query patterns:

5.1.6 SEARCH PROPLETS ILLUSTRATING THE TWO BASIC KINDS

WH interrogative: [noun: girl, fnc: walk, mdr: σ, prn: K]
Yes/No interrogative: [noun: girl, fnc: walk, mdr: young, prn: K]
In the search proplet of the WH interrogative, the mdr attribute has the variable σ and the prn attribute the variable K as value; in the search proplet of the Yes/No interrogative, only the prn attribute has a variable as value. Technically, the answer to an interrogative consists in binding the variables of the search proplet to suitable values. Consider the application of the WH search pattern of 5.1.6 to the token line 5.1.2:

5.1.7 WH SEARCH PATTERN CHECKING A TOKEN LINE

search pattern [noun: girl, fnc: walk, mdr: σ, prn: K], applied at the now front of the token line of girl:
[noun: girl, fnc: walk, mdr: young, prn: 10], [noun: girl, fnc: sleep, mdr: blonde, prn: 12], [noun: girl, fnc: eat, mdr: small, prn: 15], [noun: girl, fnc: read, mdr: smart, prn: 19] (now front)
The indicated attempt at matching fails because the fnc value of the query pattern (i.e. walk) and of the proplet token (i.e. read) are incompatible. The same holds after moving the pattern one proplet to the left. Only when the leftmost proplet is reached is the matching successful. Now the variable σ is bound to young and the variable K to 10. Accordingly, the answer provided to the question Which girl walked? is The young girl (walked). This procedure is formalized by the following LAGo think grammar:

1 The replacement variable RA_m is also used in the lexical entries of paratactic conjunctions like and (A.7), allowing the coordination of nouns, of verbs, or of adjs using one and the same conjunction.

5.1.8 DEFINITION OF LAGO THINK.Q1 (WH INTERROGATIVE)
wh1: [noun: RV.1, fnc: ¬RV.2, mdr: σ, prn: K] ⇒ [noun: RV.1, fnc: ¬RV.2, mdr: σ, prn: K−1]
wh2: [noun: RV.1, fnc: ¬RV.2, mdr: σ, prn: K] ⇒ [noun: RV.1, fnc: RV.2, mdr: σ, prn: K−1]
Applied to the WH search pattern in 5.1.6, the replacement variables (4.2.4) RV.1 and RV.2 are substituted by the constants girl and walk, respectively. For the feature [fnc: ¬RV.2] to match, it must not be compatible with the input. In the definition of LAGo think.Q1, this is specified for the continuation attributes of the first operation. The corresponding feature [fnc: RV.2] in the output pattern of the second operation, in contrast, does have to be compatible, matching the input.

If the rightmost proplet of a token line does not match the search pattern (5.1.7), the operations wh1 and wh2 are both activated. If the next proplet to the left does not match either, wh1 is successful; otherwise wh2 is successful. As long as there are proplets in the token line for wh1 to apply to successfully (i.e. matching fails), the movement to the left continues; when no token remains, the algorithm terminates with the English expression I don't know. If there is a successful application of wh2, the algorithm terminates with an English noun phrase based on the matching proplet, here the young girl.

A Yes/No interrogative2 is handled in a similar manner. For example, Did the young girl walk?, based on applying the Yes/No search pattern of 5.1.6 to the token line 5.1.2, results in the answer yes, due to a successful matching between the search pattern and the proplet with the prn value 10. The procedure is formalized in a simplified variant as LAGo think.Q2:
2 The term interrogative refers to a kind of sentential mood, while the term question refers to a kind of utterance. The correlation between the utterance kinds, i.e. statement, question, and request, and the sentential moods, i.e. declarative, interrogative, and imperative, is not one-to-one. For example, a question may use an expression of the declarative mood, as in I ask you where you were last night., and the expression used in a request may be an interrogative, as in Could you open the window? A rudimentary treatment of Yes/No interrogatives is given in Sect. 13.6 for the hear mode and in Sect. 14.6 for the speak mode. For an explicit syntactic-semantic derivation of an interrogative with a long-distance dependency (object sentence recursion) see Sect. 7.6.
5.1.9 DEFINITION OF LAGO THINK.Q2 (YES/NO INTERROGATIVE)

yn1: [noun: RV.1, fnc: ¬RV.2, mdr: ¬RV.3, prn: K] ⇒ [noun: RV.1, fnc: ¬RV.2, mdr: ¬RV.3, prn: K−1]
yn2: [noun: RV.1, fnc: ¬RV.2, mdr: ¬RV.3, prn: K] ⇒ [noun: RV.1, fnc: RV.2, mdr: RV.3, prn: K−1]
In LAGo think.Q2, RV.1, RV.2, and RV.3 are replacement variables, while K and K-1 are binding variables. If the movement to the left of the token line ends in an application of the operation yn1 , the answer is no. If the operation yn2 is successful, the algorithm terminates and the answer yes is produced. For example, in the attempt to answer the question Did the blonde girl walk? relative to the token line 5.1.2, the algorithm terminates with a successful application of yn1 and the answer is no. The question Did the young girl walk?, in contrast, results in a successful application of yn2 and the answer is yes. For further discussion see CLaTR Chap. 10.
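The shared mechanism of LAGo think.Q1 and think.Q2, i.e. moving a search proplet leftward along a token line until its constants are matched, may be sketched as follows; the encoding of variables as None and the function names are illustrative simplifications.

```python
# Sketch of query answering over a token line (illustrative only):
# scan from the now front to the left; the first successful match binds
# the search proplet's variables, otherwise the answer is negative.

def match(pattern, proplet):
    """A pattern value of None is a variable (matches anything); a
    constant must be equal to the proplet's value."""
    return all(v is None or proplet[k] == v for k, v in pattern.items())

def answer(token_line, pattern):
    for proplet in reversed(token_line):   # anti-temporal order
        if match(pattern, proplet):
            return proplet
    return None                            # "I don't know" / "no"

line = [{"noun": "girl", "fnc": f, "mdr": m, "prn": k}
        for f, m, k in [("walk", "young", 10), ("sleep", "blonde", 12),
                        ("eat", "small", 15), ("read", "smart", 19)]]

wh = {"noun": "girl", "fnc": "walk", "mdr": None, "prn": None}    # Which girl walked?
yn = {"noun": "girl", "fnc": "walk", "mdr": "young", "prn": None} # Did the young girl walk?
print(answer(line, wh))                     # binds mdr to young, prn to 10
print("yes" if answer(line, yn) else "no")  # yes
```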
5.2 Episodic versus Absolute Propositions

The third kind of thinking (5.1.1), besides (i) navigating along interproplet relations across token lines and (ii) moving along token lines, is (iii) inferencing. In DBS, some inferences are based on absolute propositions, as opposed to episodic propositions (AIJ'01).

Absolute propositions, such as scientific or mathematical content, but also generalized personal beliefs like Mary (always) takes a nap after lunch, represent content which holds independently of any STAR (2.6.2). In English, absolute propositions are coded as generic sentences (Gerstner 1988), which express their atemporal character with the present tense and the indicative mood of the finite verb and the absence of temporal and modal adverbs.

Episodic propositions, in contrast, require specification of the STAR in order for the content to be coded correctly (CLaTR Chap. 10). In the English surface, episodic propositions are characterized by the possible non-present tense, non-indicative mood, and non-neutral aspect of the finite verb form, and the possible presence of temporal and modal adverbs.

Formally, episodic and absolute propositions at the context level may be distinguished by the value of their prn attribute, as illustrated by the following example showing the proplets of the propositions corresponding to English A dog is an animal (absolute) and The dog is tired (episodic):
5.2.1 ABSOLUTE AND EPISODIC PROPLETS IN A WORD BANK

owner value animal: ... [noun: animal, fnc: be, mdr:, prn: a-11]
owner value be: ... [verb: be, arg: dog, mdr: tired, prn: e-23], [verb: be, arg: dog animal, mdr:, prn: a-11]
owner value dog: ... [noun: dog, fnc: be, mdr:, prn: a-11], [noun: dog, fnc: be, mdr:, prn: e-23]
owner value tired: ... [adj: tired, mdd: be, mdr:, prn: e-23]
In addition to the different prn values, here a-11 versus e-23, absolute and episodic proplets differ in that the core value of absolute proplets is a concept type, while the core value of episodic proplets is a concept token:

5.2.2 CORE VALUES AS CONCEPT TYPES AND CONCEPT TOKENS

[Diagram: the token line of dog, with the owner value dog as a type. The absolute proplet [noun: dog, fnc: be, mdr:, prn: a-11] has the type as its core value; the episodic proplet [noun: dog, fnc: be, mdr:, prn: e-23] has a token; the now front is at the right.]
The core values of absolute proplets at the context level are assigned their type by the associated owner value, while the token core value of an episodic proplet may originate in recognition. The type-token relation between absolute and episodic proplets is the basis for many kinds of non-literal use (5.4.4).
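The prn convention may be illustrated with a trivial sketch, assuming string values with a- and e- prefixes as in 5.2.1; the helper is purely illustrative.

```python
# Sketch of distinguishing absolute from episodic proplets by the prefix
# of their prn value (illustrative only).

def is_absolute(proplet):
    return str(proplet["prn"]).startswith("a-")

dog_absolute = {"noun": "dog", "fnc": "be", "prn": "a-11"}  # core value: type
dog_episodic = {"noun": "dog", "fnc": "be", "prn": "e-23"}  # core value: token
print(is_absolute(dog_absolute), is_absolute(dog_episodic))  # True False
```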
5.3 Inference: Reconstructing Modus Ponens in DBS

As the other part of the think mode besides selective activation, let us consider inferencing. Beginning with the classical inferences of propositional calculus, we use the following conventions: A and B are true propositions, ¬ is sentential negation, ∧ is logical and, ∨ is logical or, → is logical implication, the expressions above the horizontal line are the premise or premises, and the expression below is the conclusion, indicated by ⊢ (turnstile):

5.3.1 INFERENCE SCHEMATA OF PROPOSITIONAL CALCULUS

1. A, B ⊢ A ∧ B
2. A ∨ B, ¬A ⊢ B
3. A → B, A ⊢ B
4. A → B, ¬B ⊢ ¬A
5. A ∧ B ⊢ A
6. A ⊢ A ∨ B
7. ¬A ⊢ A → B
8. ¬¬A ⊢ A

In inference schema 1, called conjunction, the truth of two arbitrary propositions A and B implies the truth of the complex proposition A ∧ B. If the inference of conjunction were reconstructed in DBS, it would amount to an operation which establishes new extrapropositional relations between unconnected propositions, based on the conjunction and. This operation may be characterized as the following cross-copying operation:
5.3.2 OPERATION FOR THE INFERENCE OF conjunction (HYPOTHETICAL)

inf1: [verb: α, nc:, prn: K] [verb: β, pc:, prn: K+M] =⇒ [verb: α, nc: (β K+M), prn: K] [verb: β, pc: (α K), prn: K+M]
The two patterns to the left of the arrow match the verb proplets of two propositions, thereby binding the variables α, β, K, and M. The same patterns are shown to the right of the arrow, but with the core values cross-copied as long (extrapropositional) addresses into the nc-pc slots.3 Thus, inf1 may produce the relation of coordination between two hitherto unconnected propositions, enabling new navigations.

In its truth-conditional interpretation, the inference expresses a conjunction of truth and is, as such, intuitively obvious. From this point of view, it is not relevant which propositions are conjoined. The only condition for the inference to be valid is that the two propositions in the premise happen to be true. From the viewpoint of DBS, in contrast, there arises the question of why two – previously unconnected – propositions should be concatenated with and.
3 The content of a text consists of a sequence of coordinated propositions. Therefore, the nc and pc slots of most top verbs will be filled. There are, however, the usual exceptions for the first and the last conjunct (8.1.3), which would make it possible to connect previously unrelated texts by the hypothetical conjunction inference 5.3.2. The extrapropositional coordination between top verbs is complemented by the extrapropositional coreference between nouns. Coreference may supply an initial referent, e.g. Fido, with an unlimited number of coreferential copies which use the address as their core value (2.6.7; CLaTR Sect. 4.4). While the coordination relation is defined across token lines (5.1.1, 1), the coreference relation is defined along token lines (5.1.1, 2).
For example, even though conjoining The dog barked and The tablecloth is white might not result in asserting a falsehood, it would make little sense. Because an uncontrolled application of inf1 would create new, unmotivated connections between hitherto unconnected propositions, it would destroy the coherence of content in a word bank.

The example 5.3.2 of conjunction has shown how a classical inference can be reconstructed in DBS. We have also seen that the classical inference schemata may change their character substantially when transferred to DBS. Therefore they should not be transferred blindly. Let us turn now to modus ponens. In propositional calculus, its inference schema (3 in 5.3.1) may be illustrated as follows:

5.3.3 MODUS PONENS IN PROPOSITIONAL CALCULUS
premise 1: If Fido is a dog, then Fido is an animal.   A → B
premise 2: Fido is a dog.   A
conclusion: Fido is an animal.   ⊢ B

Because propositional calculus treats propositions as unanalyzed constants, e.g. A and B, this form of modus ponens is suitable for asserting the inference relation between sentences, but not between individuals. Let us therefore consider modus ponens in first-order predicate calculus (Bochenski 1961, 16.07 ff., 43.16 ff.), which analyzes the internal structure of propositions using the connectives ¬, ∧, ∨, and → of propositional calculus plus one- and two-place functor constants like f and g, and the quantifiers ∃ and ∀ horizontally binding the variables x and y:

5.3.4 MODUS PONENS IN PREDICATE CALCULUS
premise 1: ∀x[f(x) → g(x)]
premise 2: ∃y[f(y)]
conclusion: ∃y[g(y)]

If f is realized as dog and g as animal, the inference reads as follows:

5.3.5 PARAPHRASING THE MODUS PONENS OF PREDICATE CALCULUS
premise 1: For all x, if x is a dog, then x is an animal.
premise 2: There exists a y such that y is a dog.
conclusion: There exists a y such that y is an animal.

Because the binding of variables in predicate calculus is horizontal (e.g., ∃y [... y]), but vertical in DBS, and because the parts of a logical formula are ordered (linear notation), while proplets are order-free, this inference cannot be transferred to DBS directly. Instead, modus ponens may be reconstructed by (i) turning premise 1 into the conditional form α is a dog implies that α is an animal, (ii) using premise 2 as the input to the conditional, and (iii) treating the conclusion as the output of the conditional:

5.3.6 PARAPHRASING PREDICATE CALCULUS MODUS PONENS IN DBS
inference: α is a dog implies that α is an animal.
input: Fido is a dog.
output: Therefore, Fido is an animal.

Using proplet patterns, this DBS inference may be restated as follows:

5.3.7 APPLYING THE FORMALIZED MODUS PONENS OF DBS
pattern level: [noun: α, fnc: is_dog, prn: K] [verb: is_dog, arg: α, prn: K] impl [noun: α, fnc: is_animal, prn: K+1] [verb: is_animal, arg: α, prn: K+1]
content level: input ⇑ [noun: Fido, fnc: is_dog, prn: 23] [verb: is_dog, arg: Fido, prn: 23], output ⇓ [noun: Fido, fnc: is_animal, prn: 23+1] [verb: is_animal, arg: Fido, prn: 23+1]
By vertically binding the value Fido at the content (input/output) level to the variable α at the inference level, the consequent derives the new content Fido is an animal as output. The easy derivation of new inferences relies on the absence of quantifiers (Sect. 6.4) and on the automatic transformation of any content into a set of matching pattern proplets (CLaTR Sect. 6.5).

Because DBS inferences may apply to complete propositions as input and derive a new proposition as output, the number of proplet patterns in the antecedent and the consequent is open. Also, there is an open number of different connectives, such as impl(y), caus(e), sum(marize), prom(ise), exec(ute), inst(antiate), and cm (countermeasure). The hear mode operations, in contrast, have two input patterns, derive two or one output proplet(s), and are of four kinds (3.6.3). The operations of selective activation in the think mode are also of four kinds (3.6.4), but have one input pattern and derive one output proplet.

In other respects, however, inferences resemble hear and navigation operations. For example, inferences apply by pattern matching their antecedent (CLaTR 13.5.1) or their consequent (CLaTR 13.5.2) to activated content (self-organization). Like the hear mode operations, inferences derive new content as output.
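The application 5.3.7 may be approximated as follows; the '?' prefix for variables, the special-cased '?k+1', and the omission of a cross-pattern consistency check are simplifications for illustration, not the DBS operation format.

```python
# Sketch of a DBS inference applied by forward chaining (illustrative
# only): the antecedent patterns are matched against content proplets,
# binding variables; the consequent derives new proplets with an
# incremented proposition number.

def bind(pattern, proplet):
    """Match one pattern proplet; '?'-prefixed values are variables.
    Returns the variable bindings, or None on a constant mismatch."""
    bindings = {}
    for attr, val in pattern.items():
        if isinstance(val, str) and val.startswith("?"):
            bindings[val] = proplet.get(attr)
        elif proplet.get(attr) != val:
            return None
    return bindings

def forward_chain(antecedent, consequent, content):
    bindings = {}
    for pat in antecedent:           # every pattern must match some proplet
        hits = [b for p in content if (b := bind(pat, p)) is not None]
        if not hits:
            return None
        bindings.update(hits[0])     # simplified: no consistency check
    new = []
    for pat in consequent:
        proplet = {}
        for attr, val in pat.items():
            proplet[attr] = bindings["?k"] + 1 if val == "?k+1" \
                            else bindings.get(val, val)
        new.append(proplet)
    return new

antecedent = [{"noun": "?a", "fnc": "is_dog", "prn": "?k"},
              {"verb": "is_dog", "arg": "?a", "prn": "?k"}]
consequent = [{"noun": "?a", "fnc": "is_animal", "prn": "?k+1"},
              {"verb": "is_animal", "arg": "?a", "prn": "?k+1"}]
content    = [{"noun": "Fido", "fnc": "is_dog", "prn": 23},
              {"verb": "is_dog", "arg": "Fido", "prn": 23}]

print(forward_chain(antecedent, consequent, content))
# [{'noun': 'Fido', 'fnc': 'is_animal', 'prn': 24},
#  {'verb': 'is_animal', 'arg': 'Fido', 'prn': 24}]
```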
5.4 Non-Literal Uses as a Secondary Coding

Non-literal uses of language, such as metaphor4 and metonymy, originate at the agent's context level as seeing a certain content as something else, e.g. an orange crate as a table or a dog as an animal. Without language, such a secondary coding of a content is a form of reasoning. For example, seeing an orange crate as a table has the practical effect of finding a place to put down the coffeepot. With language, such a secondary coding may be imparted from the speaker to the hearer. In DBS, a secondary coding is provided by an inference, such as the following, which helps to interpret an orange crate as a table:

5.4.1 NARROWING5 INFERENCE

α is table impl α has flat horizontal surface

By matching the observation the orange crate has flat horizontal surface with the consequent, the agent binds orange crate to α. Backward chaining (CLaTR 13.5.2) to the antecedent results in the secondary coding orange crate is table. Let us use this inference to reanalyze an intuitive example of a non-literal language use in FoCL, Sect. 5.2: a hearer has just entered a room containing only an orange crate. If the speaker requests Put the coffee on the table!, the hearer will infer that table refers to the crate. This is because (i) an orange crate and a table are similar in terms of their flat horizontal surface and (ii) the limited context of use6 does not provide a better candidate (principle of best match).

4 The importance of metaphor as a basic principle of natural language communication has been justly emphasized by Lakoff and Johnson (1980) and subsequent work. However, because their approach is based on the meaning definition by Grice (1957, 1965), it differs from the one taken by DBS and the SLIM theory of language (FoCL Sect. 4.5, Example II; CLaTR 5.1.1 f.). This may be shown with the following example from Lakoff and Johnson (1980, p. 12). They present the utterance Please sit in the apple juice seat as referring to the seat with the apple juice setting, and continue: "In isolation this sentence has no meaning at all, since the expression 'apple juice seat' is not a conventional way of referring to any kind of object." The question for the SLIM theory of language is whether Lakoff and Johnson use the term "meaning" as the literal meaning1 of the sign or as the speaker meaning2 of the utterance (2.6.1). If "meaning" is to be interpreted as the literal meaning1 of the expression apple juice seat, then the sentence clearly does have such a meaning, compositionally derived from the meaning1 of its words. If "meaning" is to be interpreted as meaning2, however, then an associated context of interpretation is required. Without it, the sentence in question is a sign type in isolation, which by definition does not have a meaning2. Intuitively, the metaphoric reference in this example comes about by matching the meaning1 of the expression with a context containing a seat with an apple juice setting. The meaning1 derived from the words apple, juice, and seat is minimal in the sense of seat related to apple juice. A limited set of potential candidates is crucial for the successful interpretation of natural language. In the context in question, the literal meaning of the sign is sufficient to pick out the intended counterpart based on the principle of best match.
5 In contradistinction to a widening inference such as α is table impl α is furniture (hypernymy), 5.4.1 narrows the range of referents by distinguishing table from other concepts such as lamps, cushions, or bathtubs.
5.4.2 NON-LITERAL USE OF THE WORD table

[Diagram: the speaker, facing an orange crate, says Put the coffee on the table; the hearer matches the concept of table in the request against the orange crate in the shared context of use.]
However, if a prototypical table were placed next to the orange crate, the hearer would interpret the sentence differently, putting the coffee not on the crate but on the table. This is not caused by a change in the meaning1 of the word table, but by the fact that the context of use has changed, providing an additional and more suitable candidate for best match.

In the technical reanalysis, the speaker binds the primary coding orange crate to the variable α in the consequent of 5.4.1 and uses the antecedent to derive the secondary coding orange crate is table (backward chaining, CLaTR 13.5.2). The hearer, in contrast, binds each candidate available in the limited context of use to the variable α in the antecedent and checks which of them best fits the property asserted by the consequent (forward chaining).7 In the situation of the example, this candidate is the orange crate, enabling the hearer to correctly identify the referent intended by the speaker.

In the speak mode, an inference interpreting a non-literal language use applies before the mapping from the context level to the language level; in the hear mode, it applies after the mapping from the language level to the context level. Furthermore, rather than loosening the matching conditions defined in 3.2.3 (as might be suggested by the intuitive approach illustrated in 5.4.2), the inference, e.g. 5.4.1, provides a strict matching between the primary and the secondary coding, in the speak mode and the hear mode alike.

As a language example of a secondary coding based on a widening inference, i.e. an inference mapping into a superset, consider the reference to a dog as animal in the sentence The animal is tired:
6 Selection and delimitation of a context of use is based on the sign's STAR (2.6.2; CLaTR Chap. 10).
7 In the terminology of Richards (1937), orange crate in 5.4.2 would be the tenor (elsewhere called ground or target) and table the vehicle (elsewhere called figure or source).
5.4.3 WIDENING INFERENCE

α is a dog impl α is an animal

This inference widens the meaning2 of dog by relating it to the hypernym animal and may be illustrated graphically as follows:

5.4.4 CONTEXT INFERENCE UNDERLYING A NON-LITERAL USE

[Diagram: at the language level, the secondary coding pairs the types animal and tired; internal matching relates them to the context level, where the inference maps the primary coding, the tokens dog and tired, onto the token animal.]
The speak mode matches the referent dog with the α in the antecedent of the inference 5.4.3, resulting in the secondary coding animal tired as output (forward chaining). This output is matched directly with language proplets. The hear mode matches the language proplet animal of the secondary coding directly with the α in the consequent. Backward chaining renders dog tired as a possible primary coding, which may be confirmed by observation.

The examples 5.4.2 (narrowing inference) and 5.4.3 (widening inference) have in common that the condition 3.2.3 is strictly fulfilled by their matchings between a non-literal expression and its context counterpart. They differ in that the speak mode uses backward chaining in the narrowing inference and forward chaining in the widening inference, and conversely for the hear mode. The handling of figurative language by means of inferences which provide secondary codings has the following advantages:

5.4.5 ADVANTAGES OF HANDLING INDIRECT USES VIA INFERENCES

1. Inferencing, including indirect uses, applies at the context level. Therefore, agents with and without language may use the same method of reasoning.
2. Primary and secondary coding of content employ the same method of strict internal matching (3.2.3), which facilitates computational realization.
3. Inferencing at the level of context is more powerful and more flexible than inferencing based on isolated signs of language.
4. Assuming that natural language reflects the context coding directly, the context inferences can be studied by analyzing their language reflections.
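The hearer's side of this procedure may be sketched as follows, with the context of use reduced to a list of referents and their properties; the names context and interpret are hypothetical, and best match is trivialized to taking the single fitting candidate.

```python
# Sketch of interpreting a non-literal use via the narrowing inference
# 5.4.1 (illustrative only): bind each candidate in the context of use to
# alpha in the antecedent and keep those satisfying the consequent.

context = [
    {"referent": "orange_crate", "props": {"flat_horizontal_surface", "wooden"}},
    {"referent": "cushion",      "props": {"soft"}},
]

def interpret(context, consequent_prop):
    """Forward chaining in the hear mode: candidates satisfying the
    consequent's property qualify as referents (principle of best match)."""
    candidates = [c["referent"] for c in context if consequent_prop in c["props"]]
    return candidates[0] if candidates else None

print(interpret(context, "flat_horizontal_surface"))  # orange_crate
```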
In accordance with a generalized notion of reference (HBTR 3.1.2), the matching of proplets (i) is based on content8 and (ii) is not limited to nouns. For example, in The marines hit the beach the verb hit is used metaphorically.9 In addition to metaphor and metonymy, classical rhetoric distinguishes such non-literal uses as allegory, allusion, analogy, catachresis, euphemism, hyperbole, irony, litotes, parable, paradox, pars pro toto, periphrasis, personification, simile, and synecdoche. Their treatment depends less on the agent's current environment than does the inferencing for maintaining a state of balance.
5.5 Secondary Coding as Perspective Taking

In DBS, direct and indirect uses of language differ in whether the content representing the meaning1 is the original (primary coding) or instead some other content related to the original via an inference (secondary coding). The derivation of secondary codings is independent of whether or not the agent has the language faculty. Accordingly, even a dog, for example, could view a content from a certain subjective perspective, such as viewing the mailman as an intruder. The role of secondary codings in agents with and without language may be shown schematically as follows:

5.5.1 CODING LEVELS IN AGENTS WITH AND WITHOUT LANGUAGE

[Diagram: in the agent without language, input and output connect to the primary coding, from which a secondary coding is derived. In the agent with language, a language coding is added, serving direct language uses via the primary coding and indirect language uses via the secondary coding.]
8 Reimer and Michaelson (2014) use the term "representational tokens" (instead of language expressions) for defining reference. DBS includes non-language content among the representational tokens, as when the agent identifies a current non-language recognition with something seen before.
9 Whether figurative language (i) always has a literal counterpart which is (ii) required for understanding is a matter of controversy in the literature (Ricoeur 1975). This holds especially for the figurative use of verbs, as in The tiger melts back into the grass (NatGeo Wild, Secrets of Wild India, 2012), accompanied by vivid TV images of the tiger's becoming invisible. Without the need to determine the correct referent, as in the figurative use of nouns (5.4.2, 5.4.4), figurative uses of verbs may well be employed directly in the primary coding. For example, one would be hard pressed to instantly provide play the market or steal the show with a literal counterpart. This holds independently of their conventionalized use because they are immediately understood at first hearing.
Agents without and with language use the same inference-based relation between primary and secondary codings of content. It is just that agents without language do not use contents as a meaning1, while agents with language do. The functions of the three kinds of coding may be summarized as follows:

5.5.2 FUNCTIONS OF CODING LEVELS IN DATABASE SEMANTICS

1. Primary coding at the context level: represents context recognition and action at a low level of abstraction, i.e. essentially using the format (data structure) resulting from the peripheral interface procedures, in order to ensure veracity.
2. Secondary coding at the context level: consists of inferencing over the primary coding in order to obtain sufficient expressive power at varying levels of abstraction.
3. Language coding: may use primary and secondary context coding as language meaning1.

The combination of primary and secondary coding allows content to be represented from different points of view at different levels of abstraction. In short-term memory, primary context coding may be left intact. Long-term storage, however, may involve a fusion of primary and secondary coding into a condensed form, depending on the amount of content and the limits of storage.

While the primary coding of a given content should be basically the same in different agents, their secondary codings may vary widely, depending on their knowledge and purposes. For example, an experienced scout and a greenhorn may see the same broken twig in primary coding. In secondary coding, however, the scout may spontaneously see the twig as a trace of whatever they pursue, while the greenhorn does not. Using language, the scout may then adjust the greenhorn's secondary coding to his or her own.
5.6 Shades of Meaning

Popular opinion seems to take it for granted that a native speaker's intuitions about the meaning of a word could never be modeled in an artificial agent. This is unjustified. All a computational modeling requires is a detailed analysis of the several well-defined ingredients of which the meaning of a word consists. That some of them consist of (i) personal experiences and (ii) implicit assumptions of the surrounding culture does not mean that they cannot be modeled in an agent-based approach.
Leaving function words, indexicals, and names aside, let us consider words with the reference mechanism of concepts, i.e. symbols. Their meaning is based on the distinctions (i) between concept types and concept tokens, (ii) between episodic and absolute propositions, and (iii) between primary and secondary codings.

Take the word dog, for example. First, there is the result of lexical lookup (3.2.2), which provides the semantic core. It is shown below at the centre as the concept type, serving as the meaning1. Then there are the absolute proplets containing the concept type, e.g. dog, listed in the token line of this concept. These proplets are connected: based on their continuation values, all the absolute propositions about dogs may be activated, lighting up a sub-network in the word bank. In this way, the semantic core of the lexical proplet (meaning1) is complemented compositionally with the agent's general knowledge about dogs, shown below as the second circle, called absolute connections:

5.6.1 COMPLEMENTATION OF A LITERAL MEANING1 (CONCEPT TYPE)

[Diagram: concentric circles around the concept type at the centre, labeled concept tokens, absolute connections, and episodic connections.]
In addition, there are the episodic proplets in the token line of the concept. If the concept is dog, for example, then the episodic proplets in its token line (similar to 5.1.2) complement the core value type of the lexical item with all the core value tokens of dogs the agent has encountered and stored so far. These core value tokens may have any degree of detail, such as shape, color, size, the sound of barking, etc., depending on memory capacity and knowledge as well as the relative importance of each dog to the agent (concept tokens).

Finally, the episodic proplets in the token line of dog are the starting points of all the episodic propositions involving dogs, including secondary codings based on inferences. Using the retrieval mechanism of the word bank, the proplets of these propositions may be activated. In this way, the generalized aspect of meaning represented by the set of dog tokens is complemented compositionally with the agent's individual dog experiences (episodic connections).
Remark Concluding Part I

This part has presented a high-level description of Database Semantics (DBS). As an abstract model of natural language communication between human prototypes, it provides the theoretical foundation for building artificial cognitive agents with which human users can communicate freely in their accustomed natural language. The more this goal is reached by means of continuous cycles of upscaling, the more the use of programming languages may be relegated to the maintenance routines of specialized "robot doctors." Accordingly, the keyboard and the screen of today's computers may be reduced to the artificial agent's service channel, while the communication with the user is entrusted to the newly developed auto-channel (1.4.3) using natural language.

In the following Part II, our focus changes from the interfaces and components of a cognitive agent to a systematic analysis of the major constructions of natural language, using English as our example. The purpose of these analyses is to show (i) how the compositional aspect of natural language meaning is coded in the semantic representation of content, (ii) how the hear mode derives this content from the natural surface, and (iii) how the speak mode derives the natural surface from the content.
Part II
Major Constructions of Natural Language
6. Intrapropositional Functor-Argument
The semantic relations of structure in natural language are (i) functor-argument and (ii) coordination, occurring at (a) the elementary, (b) the phrasal, and (c) the clausal level of grammatical complexity. Elementary and phrasal relations are intrapropositional, while clausal relations are extrapropositional.

Analyzing the two kinds of relations at the three levels requires the use of examples. Moreover, in order to be concrete, the examples must be from real natural languages. Thus, our linguistic analysis has a sign-based aspect. Nevertheless, our approach is also agent-based insofar as the examples are analyzed in the hear, the think, and the speak mode. In the hear mode, a LAGo hear grammar maps a language-dependent surface (input) into cognitive content, represented as a set of proplets which are stored in the agent's word bank. In the think mode, content is activated selectively by a LAGo think grammar which navigates along the semantic relations of structure between stored proplets. In the speak mode, lexicalization rules embedded into the LAGo think operations realize language surfaces from the proplets traversed (output).

This chapter begins the systematic investigation of the major constructions with intrapropositional functor-argument relations. Each example is analyzed as (i) a graphical representation of the time-linear hear mode derivation, (ii) a content as a set of proplets, and (iii) a graphical representation of the think mode navigation underlying the speak mode.
6.1 Obligatory Subject/Predicate and Object\Predicate

In a declarative sentence, the functor-argument relations may be (i) obligatory, i.e. subject/predicate and object\predicate, or (ii) optional, i.e. adnominal|noun and adverbial|verb. A semantic relation is obligatory if an expression is incomplete without it. For example, the expression John crossed is incomplete because cross is a two-place verb which requires an object\predicate relation. An expression like The little dog barked, in contrast, is intuitively complete even if the optional adnominal modifier little is omitted.
Our first example is The man gave the child an apple. The content derived from this surface is a proposition with a three-place verb and the obligatory functor-argument relations of (i) subject/predicate, (ii) indirect_object\predicate, and (iii) direct_object\predicate:

6.1.1 HEAR MODE DERIVATION OF CONTENT WITH THREE-PLACE VERB
[Derivation display: the surface The man gave the child an apple is processed word by word. Lexical lookup provides a proplet for each word form, i.e. [noun: n_1] for the, [noun: man], [verb: give], [noun: n_2] for the, [noun: child], [noun: n_3] for an, and [noun: apple], each with the attributes fnc or arg, mdr, and prn. Syntactic-semantic parsing proceeds in six time-linear steps (lines 1–6), each adding exactly one next word; the result is the connected proplet set of 6.1.2, with [arg: man child apple] in the verb, [fnc: give] in the nouns, and the prn value 23 throughout.]
That the derivation is time-linear (1.6.5) is apparent from the stair-like structure resulting from adding exactly one new "next word" in each proplet line. The derivation is also surface compositional (1.6.1) because (i) each word form in the surface has a lexical analysis and (ii) no "zero elements" (i.e. lexical proplets with empty sur slots) are postulated in the input. In line 1, the substitution value1 n_1 of the determiner proplet is replaced by the value man of the noun proplet, which is then discarded. The result is shown in the first proplet of line 2.

1 To distinguish different determiners in the same proposition, the substitution values n_1, n_2, n_3, etc., are automatically incremented during lexical lookup (13.4.3, 13.4.6). This is motivated by such constructions as der die das Kind fütternde Frau liebende Mann in German.
In the combination of the proplets man and give, the core value of man is copied into the arg slot of the verb and the core value of give into the fnc slot of the noun (cross-copying). In line 3, the value give is copied into the fnc attribute of the next word proplet the, and the substitution value n_2 of the is copied into the arg slot of the verb. In line 4, both instances of the substitution value n_2 are replaced by the value child of the next word (simultaneous substitution). The integration of the noun proplet apple in lines 5 and 6 is analogous to that of child in lines 3 and 4.

The content derived in 6.1.1 is a set of proplets. It is order-free in that the proplets may be arranged in any order without affecting the semantic relations between them (CLaTR 3.2.7). Thus, a content may be shown not only in the order induced by the time-linear hear mode derivation, but equivalently in the alphabetical order induced by the core values, here with added cat and sem features:

6.1.2 CONTENT DERIVED IN 6.1.1 IN ALPHABETICAL ORDER

[noun: apple, cat: snp, sem: indef sg, fnc: give, prn: 23]
[noun: child, cat: snp, sem: def sg, fnc: give, prn: 23]
[verb: give, cat: #n′ #d′ #a′ decl, sem: past, arg: man child apple, prn: 23]
[noun: man, cat: snp, sem: def sg, fnc: give, prn: 23]
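The cross-copying that produces these features may be sketched as follows; determiners and simultaneous substitution are omitted, so the sketch covers only the exchange of core values between the nouns and the verb. Names are illustrative.

```python
# Sketch of hear mode cross-copying (illustrative only): the noun's core
# value is copied into the verb's arg slot, the verb's core value into
# the noun's fnc slot, establishing a functor-argument relation.

def cross_copy(noun, verb):
    verb["arg"].append(noun["noun"])  # noun core value into arg slot
    noun["fnc"] = verb["verb"]        # verb core value into fnc slot

verb = {"verb": "give", "arg": [], "prn": 23}
nouns = [{"noun": n, "fnc": "", "prn": 23} for n in ("man", "child", "apple")]
for n in nouns:
    cross_copy(n, verb)

print(verb)  # {'verb': 'give', 'arg': ['man', 'child', 'apple'], 'prn': 23}
# The arg order codes the case roles: subject, indirect object, direct object.
```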
The proplets are connected by the values of the continuation attributes fnc and arg. These values are addresses implemented as pointers. The functor-argument relations are coded as follows: the three-place verb proplet give has the continuation feature [arg: man child apple], whereby the short address (4.2.2, 4) values man, child, and apple point at the core values of the corresponding noun proplets with the same prn value (intrapropositional relations).2 The noun proplets man, child, and apple, in turn, have the continuation feature [fnc: give], whereby the short address give points at the core value of the corresponding verb proplet with the same prn value. The relations of intrapropositional coordination are coded similarly (8.1.3), except that the continuation attributes are nc (next conjunct) and pc (previous conjunct). Coding the semantic relations of structure by means of proplet-internal addresses is the basis for the selective activation of content, navigating from one proplet to the next based on the value of a continuation attribute.

2 The order of the arg values specifies the case roles, i.e. the value in the first arg slot is the subject, the value in the second slot is the indirect_object, and the value in the third slot is the direct_object.

Graphically, the semantic relations of structure between proplets are represented by using different lines for different relations. More specifically, "/" is used for subject/predicate, "\" for object\predicate, "|" for modifier|modified, and "−" for conjunct−conjunct (CLaTR Sect. 7.1). This method of assigning differentiated semantic interpretations to different kinds of lines was pioneered by Frege (1878).3 The graphical representation of a content is derived by matching the proplets of a content like 6.1.2 with the following patterns:

6.1.3 INTRAPROPOSITIONAL RELATIONS DERIVED FROM PATTERNS
subject/predicate: [noun: α, fnc: β, prn: K] [verb: β, arg: α Y, prn: K] ⇒ N/V
object\predicate: [noun: α, fnc: β, prn: K] [verb: β, arg: X α Y, prn: K] ⇒ N\V
adnominal|noun: [adj: β, mdd: α, prn: K] [noun: α, mdr: β, prn: K] ⇒ A|N
adverbial|verb: [adj: α, mdd: β, prn: K] [verb: β, mdr: α, prn: K] ⇒ A|V
noun−noun: [noun: α, nc: β, prn: K] [noun: β, pc: α, prn: K] ⇒ N−N
verb−verb: [verb: α, nc: β, prn: K] [verb: β, pc: α, prn: K] ⇒ V−V
adj−adj: [adj: α, nc: β, prn: K] [adj: β, pc: α, prn: K] ⇒ A−A
The two patterns in the first group derive the obligatory functor-argument relations, the two in the second group the optional functor-argument relations, and the three in the third group the optional coordination relations. The N/V and N\V relations differ in that the arg feature in the former is [arg: α Y] (α in the initial slot), but is [arg: X α Y], X ≥ 1, Y ≥ 0, in the latter.
Applying 6.1.3 to the proplets in 6.1.2 results in the relations man/give, child\give, and apple\give. In the DBS analysis underlying the speak mode these relations are composed graphically as follows:

6.1.4 DBS GRAPH ANALYSIS OF PROPOSITION WITH THREE-PLACE VERB

    (i) semantic relations graph (SRG): give at the root, with man, child, and apple attached below
    (ii) signature: V at the root, with three N nodes below
    (iii) numbered arcs graph (NAG): six arcs, numbered 1–6, traversing down and up between give and man (arcs 1, 2), give and child (arcs 3, 4), and give and apple (arcs 5, 6)
    (iv) surface realizations:
        a. 1 The_man  2 gave  3 the_child  4 (empty)  5 an_apple  6 .
           V/N  N/V  V\N  N\V  V\N  N\V
        b. 1 The_man  2 gave  5 an_apple  6 (empty)  3 to_the_child  4 .
           V/N  N/V  V\N  N\V  V\N  N\V
The analysis presents four different views: the (i) semantic relations graph (SRG) is based on the core values of the proplet set 6.1.2, the (ii) signature uses the core attributes, the (iii) numbered arcs graph (NAG) shows a navigation along the arcs, known as pre-order in graph theory, and the (iv) surface realization shows which surface is realized in which numbered arc of the NAG.
Subject and object(s) are placed below the predicate, just as the modifier is placed below the modified (6.2.3). This is expressed in the linear notation subject/predicate, object\predicate, and modifier|modified by placing the lower term before the higher one. In coordination, the notation conjunct−conjunct indicates that the node before the "−" line precedes in the graph (8.2.3).
The graphs (i) and (ii) on the left show a static view of the semantic relations in the content 6.1.2. The (iii) NAG adds a dynamic aspect in the form of numbered arcs. For example, the static N/V relation is traversed downward as V/N (arc 1) and upward as N/V (arc 2).4 The pre-order numbering of the arcs in an intrapropositional NAG starts and ends at the root, here the top verb. It traverses all nodes in a top-down, left-to-right fashion, and returns bottom-up, right to left. Beginning and ending a traversal with the highest verb is needed linguistically to enable the extrapropositional V−V coordination of propositions (9.2.3, 14.4.2) in a text.
The content 6.1.2 happens to have two different surface realizations in English, shown in 6.1.4 as the a- and b-variants. They are based on alternative navigations through the NAG: the a-variant follows the traversal sequence 1, 2, 3, 4, 5, 6, while the b-variant follows the sequence 1, 2, 5, 6, 3, 4. The semantic nature of the transitions through the NAG is shown in the (iv) surface realization by rule names like V/N (14.4.3) and N/V (14.4.4), shown underneath the space between two adjacent surfaces. Surfaces are realized from the goal node of a traversal. In this way more than one word form surface may be realized, e.g. the_man in arc 1 of 6.1.4 (function word precipitation, 12.5.2), but there may also be "empty" traversals in which no surface is realized, e.g. arc 4 in the a-variant and arc 6 in the b-variant of the surface realization in 6.1.4.

3 Frege (1878) used the method for a graphical (two-dimensional) notation of second order predicate calculus. Frege's goal was to free thought from language – which DBS fulfills in the form of concept (symbol) and pointer (indexical) proplets with empty sur slots. Historically, Frege's method was superseded by the Polish notation, which uses a linear order, like a sequence of letters. To the best of our knowledge, Frege's method has not been utilized in other applications until now. Phrase structure trees (3.4.3) are also two-dimensional. They differ from Frege's and our method, however, in that phrase structure trees express an abstract hierarchy in terms of dominance and precedence. Compared to functor-argument and coordination as used in predicate calculus and DBS, dominance and precedence are semantically vacuous.
4 The direction of a traversal is underspecified in relations between proplets of the same kind, for example, A|A or A−A. When required, we code direction by using the distinction between A|A and A↑A or between A−A and A←A (14.2.4, 14.2.5).
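The patterns of 6.1.3 can be applied mechanically to a set of proplets. The following sketch (ours; a simplification covering only the two obligatory patterns) matches noun and verb proplets with a common prn value and derives the N/V and N\V relations of the content 6.1.2:

    # Sketch (ours): deriving subject/predicate and object\predicate
    # relations by matching the first two patterns of 6.1.3.
    content = [
        {"noun": "apple", "fnc": "give", "prn": 23},
        {"noun": "child", "fnc": "give", "prn": 23},
        {"verb": "give", "arg": ["man", "child", "apple"], "prn": 23},
        {"noun": "man", "fnc": "give", "prn": 23},
    ]

    def relations(proplets):
        out = []
        for n in proplets:
            for v in proplets:
                if "noun" not in n or "verb" not in v or n["prn"] != v["prn"]:
                    continue                       # same proposition only
                if n["fnc"] == v["verb"] and n["noun"] in v["arg"]:
                    if v["arg"][0] == n["noun"]:   # [arg: alpha Y]: initial slot
                        out.append(n["noun"] + "/" + v["verb"])    # N/V
                    else:                          # [arg: X alpha Y]
                        out.append(n["noun"] + "\\" + v["verb"])   # N\V
        return out

    print(relations(content))   # ['apple\\give', 'child\\give', 'man/give']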
6.2 Optional Adnominal Modifier|Modified

The optional functor-arguments adnominal|noun and adverbial|verb resemble the obligatory functor-arguments subject/predicate and object\predicate in that they are binary relations between two single nodes, each representing a content proplet (6.1.3). There are no non-terminal nodes in DBS. The following hear mode derivation shows the incremental build-up of modified nouns in pre- and post-verbal position:

6.2.1 HEAR MODE DERIVATION OF PROPOSITION WITH MODIFIED NOUNS

    the little dog found a big bone

    lexical lookup:
    [noun: n_1 | fnc: | mdr: | prn: ]  [adj: little | mdd: | prn: ]  [noun: dog | fnc: | mdr: | prn: ]  [verb: find | arg: | mdr: | prn: ]  [noun: n_2 | fnc: | mdr: | prn: ]  [adj: big | mdd: | prn: ]  [noun: bone | fnc: | mdr: | prn: ]

    syntactic-semantic parsing:
    1  [noun: n_1 | fnc: | mdr: | prn: 2]  [adj: little | mdd: | prn: ]
    2  [noun: n_1 | fnc: | mdr: little | prn: 2]  [adj: little | mdd: n_1 | prn: 2]  [noun: dog | fnc: | mdr: | prn: ]
    3  [noun: dog | fnc: | mdr: little | prn: 2]  [adj: little | mdd: dog | prn: 2]  [verb: find | arg: | mdr: | prn: ]
    4  [noun: dog | fnc: find | mdr: little | prn: 2]  [adj: little | mdd: dog | prn: 2]  [verb: find | arg: dog | mdr: | prn: 2]  [noun: n_2 | fnc: | mdr: | prn: ]
    5  [noun: dog | fnc: find | mdr: little | prn: 2]  [adj: little | mdd: dog | prn: 2]  [verb: find | arg: dog n_2 | mdr: | prn: 2]  [noun: n_2 | fnc: find | mdr: | prn: 2]  [adj: big | mdd: | prn: ]
    6  [noun: dog | fnc: find | mdr: little | prn: 2]  [adj: little | mdd: dog | prn: 2]  [verb: find | arg: dog n_2 | mdr: | prn: 2]  [noun: n_2 | fnc: find | mdr: big | prn: 2]  [adj: big | mdd: n_2 | prn: 2]  [noun: bone | fnc: | mdr: | prn: ]

    result of syntactic-semantic parsing:
    [noun: dog | fnc: find | mdr: little | prn: 2]  [adj: little | mdd: dog | prn: 2]  [verb: find | arg: dog bone | mdr: | prn: 2]  [noun: bone | fnc: find | mdr: big | prn: 2]  [adj: big | mdd: bone | prn: 2]
In line 1, the core values of the and little are cross-copied (adnominal modifier|modified relation). In line 2, the two occurrences of n_1 are replaced by the core value of the dog proplet (simultaneous substitution), which is then discarded. In line 3, the core values of dog and find are cross-copied (subject/predicate relation). In line 4, the core values of the verb and the indefinite article proplet are cross-copied (direct_object\predicate relation). In line 5, the core values of the indefinite article and the adnominal modifier are cross-copied (adnominal|noun relation). In line 6, all three occurrences of the substitution variable n_2 are replaced by the core value of the bone proplet, which is then discarded.
The result shows the proplets in an order reflecting the derivation of the English surface. The same proplets may also be shown in the alphabetical order induced by their core values, with added cat and sem features:

6.2.2 CONTENT OF A PROPOSITION WITH TWO MODIFIED NOUNS

    [adj: big | cat: adnv | sem: pad | mdd: bone | mdr: | prn: 2]
    [noun: bone | cat: snp | sem: indef sg | fnc: find | mdr: big | prn: 2]
    [noun: dog | cat: snp | sem: def sg | fnc: find | mdr: little | prn: 2]
    [verb: find | cat: #n′ #a′ decl | sem: past | arg: dog bone | mdr: | prn: 2]
    [adj: little | cat: adnv | sem: pad | mdd: dog | mdr: | prn: 2]
The nouns dog and bone are modified by the adjectives little and big, respectively. These relations are coded by the value dog in the mdd slot of the adjective proplet little, and by the value little in the mdr slot of the noun proplet dog, and similarly for big bone. The cat value adnv of little and big indicates that these adjectives may be used adnominally and adverbially. Using 6.2.2 as input to 6.1.3 results in the following DBS graph analysis:

6.2.3 GRAPH ANALYSIS OF CONTENT WITH ADNOMINAL MODIFIERS

    (i) semantic relations graph (SRG): find at the root, with dog and bone below, little below dog, and big below bone
    (ii) signature: V over two Ns, each with an A below
    (iii) numbered arcs graph (NAG): eight arcs, numbered 1–8
    (iv) surface realization:
        1 The  2 little  3 dog  4 found  5 a  6 big  7 bone  8 .
        V/N  N|A  A|N  N/V  V\N  N|A  A|N  N\V
V/N (arc 1) realizes the definite article from the [sem: def sg] feature of the N dog. N|A (arc 2) realizes the adnominal modifier from the A little. A|N (arc 3) realizes the noun from the N dog. N/V (arc 4) realizes the finite verb from the V find. V\N (arc 5) realizes the indefinite article from the feature [sem: indef sg] of the N bone. N|A (arc 6) realizes the adnominal modifier from the A big. A|N (arc 7) realizes the noun from the N bone. N\V (arc 8) realizes the period from the feature [cat: #n′ #a′ decl] of the V.
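The traversal just described can be written down as data: each numbered arc is paired with its transition name and the surface realized from its goal proplet. The following sketch (ours) replays the surface realization of 6.2.3:

    # Sketch (ours): the NAG traversal of 6.2.3 as a list of
    # (arc number, transition, realized surface) triples.
    nag = [
        (1, "V/N", "The"),     # definite article from [sem: def sg] of dog
        (2, "N|A", "little"),  # adnominal modifier from the A little
        (3, "A|N", "dog"),     # noun from the N dog
        (4, "N/V", "found"),   # finite verb from the V find
        (5, "V\\N", "a"),      # indefinite article from [sem: indef sg] of bone
        (6, "N|A", "big"),
        (7, "A|N", "bone"),
        (8, "N\\V", "."),      # period from [cat: #n' #a' decl] of find
    ]
    print(" ".join(surface for _, _, surface in nag))
    # The little dog found a big bone .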
6.3 Optional Adverbial Modifier|Modified

The other modifier|modified relation of natural language besides adnominal|noun is adverbial|verb. In DBS, adnominals like beautiful in the beautiful tree and adverbials like beautifully in is singing beautifully are variants of the same word class, adjective, just as a finite form like gave and a non-finite form like given are variants of the same word class, verb. In proplets, the use of an adjective is coded by the features [cat: adn] for adnominal, [cat: adv] for adverbial, and [cat: adnv] for forms which can be used as either, for example, fast in the fast car (adnominal) and in driving fast (adverbial). The following hear mode derivation of Julia has been sleeping soundly shows an adverbial modification following the modified:

6.3.1 HEAR MODE DERIVATION OF ADVERBIAL MODIFIER FOLLOWING THE MODIFIED

    Julia has been sleeping soundly

    lexical lookup:
    [noun: Julia | fnc: | mdr: | prn: ]  [verb: v_1 | arg: | mdr: | prn: ]  [verb: v_2 | arg: | mdr: | prn: ]  [verb: sleep | arg: | mdr: | prn: ]  [adj: sound | mdd: | prn: ]

    syntactic-semantic parsing:
    1  [noun: Julia | fnc: | mdr: | prn: 3]  [verb: v_1 | arg: | mdr: | prn: ]
    2  [noun: Julia | fnc: v_1 | mdr: | prn: 3]  [verb: v_1 | arg: Julia | mdr: | prn: 3]  [verb: v_2 | arg: | mdr: | prn: ]
    3  [noun: Julia | fnc: v_2 | mdr: | prn: 3]  [verb: v_2 | arg: Julia | mdr: | prn: 3]  [verb: sleep | arg: | mdr: | prn: ]
    4  [noun: Julia | fnc: sleep | mdr: | prn: 3]  [verb: sleep | arg: Julia | mdr: | prn: 3]  [adj: sound | mdd: | prn: ]

    result of syntactic-semantic parsing:
    [noun: Julia | fnc: sleep | mdr: | prn: 3]  [verb: sleep | arg: Julia | mdr: sound | prn: 3]  [adj: sound | mdd: sleep | prn: 3]
Just as multiple determiners in a proposition are distinguished by the different substitution values n_1, n_2, etc. (6.1.1, 6.2.1), multiple auxiliaries are distinguished by the different substitution values v_1, v_2, etc., as their core values (13.3.6). Substitution values are used for cross-copying during the time-linear hear mode derivation, and are simultaneously substituted when the associated content word, here sleeping, arrives. The contribution of the auxiliaries is coded as the feature [sem: pres perf ...] of the resulting verb proplet:

6.3.2 CONTENT OF A PROPOSITION WITH AN ADVERBIAL MODIFIER

    [noun: Julia | cat: snp | sem: nm | fnc: sleep | mdr: | prn: 3]
    [verb: sleep | cat: #n′ decl | sem: pres perf prog | arg: Julia | mdr: sound | prn: 3]
    [adj: sound | cat: adv | sem: pad | mdd: sleep | mdr: | prn: 3]
The modifier|modified relation between the adjective sound and the verb sleep is coded by the value sleep in the mdd slot of sound, and by the value sound in the mdr slot of sleep. Needed for the speak mode, the adverbial form of the adjective is specified as the value adv of the sound proplet's cat attribute. As in 6.2.3, the graphical analysis of the content 6.3.2 uses the line "|" for the modifier|modified relation, but in the A|V instead of the A|N constellation:

6.3.3 DBS GRAPH ANALYSIS OF CONTENT WITH ADVERBIAL MODIFIER

    (i) semantic relations graph (SRG): sleep at the root, with Julia and sound below
    (ii) signature: V over N and A
    (iii) numbered arcs graph (NAG): four arcs, numbered 1–4
    (iv) surface realization:
        1 Julia  2 has_been_sleeping  3 soundly  4 .
        V/N  N/V  V|A  A|V
V/N (arc 1) realizes the subject from the N proplet Julia. N/V (arc 2) realizes the complex verb form has_been_sleeping from the sleep proplet, using the sem values. V|A (arc 3) realizes the adverbial modifier from the A proplet sound. A|V (arc 4) realizes the period from the V proplet sleep.
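The replacement of the substitution values v_1, v_2, etc., by the core value of the arriving content word can be illustrated in a few lines of code. The following sketch (ours) implements simultaneous substitution over a set of proplets, applied to the state of 6.3.1 just before sleeping arrives:

    # Sketch (ours): simultaneous substitution of a substitution value
    # by the core value of the arriving content word (cf. 6.3.1).
    def substitute(proplets, old, new):
        for p in proplets:
            for attr, val in list(p.items()):
                if val == old:
                    p[attr] = new
                elif isinstance(val, list):
                    p[attr] = [new if x == old else x for x in val]

    state = [{"noun": "Julia", "fnc": "v_2", "prn": 3},
             {"verb": "v_2", "arg": ["Julia"], "prn": 3}]
    substitute(state, "v_2", "sleep")         # next word: sleeping
    print(state[0]["fnc"], state[1]["verb"])  # sleep sleep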
6.4 Semantic Interpretation of Determiners

The operational semantics of DBS is grounded in the automatic recognition and action procedures of the agent's external interfaces. The semantics of symbolic logic, in contrast, is based on metalanguage definitions relative to set-theoretical models (2.3.1). As human-made artifacts, the models are a little crude compared to the external reality as viewed by today's science. Also, the logical formulas coding truth conditions (FoCL 19.3.2) relative to the models do not reflect the linguistic structure of natural language.
This may be shown with a meaning analysis of the words some and all, which are treated as determiners in DBS, but as quantifiers in symbolic logic (Peters and Westerståhl 2006). The following examples illustrate the universal quantifier ∀, pronounced for all, in predicate calculus and its interpretation relative to a set-theoretical model:

6.4.1 PREDICATE CALCULUS ANALYSIS OF All girls sleep

    ∀x [girl(x) → sleep(x)]

This formula may be paraphrased as For all x, if x is a girl then x is sleeping. The dependence of the truth value of this formula on the actual definition of the model and a variable assignment is represented by Montague (1973) by adding @ (representing the model) and g (representing a variable assignment) as superscripts at the end of the formula:

6.4.2 INTERPRETATION OF 6.4.1 RELATIVE TO A MODEL

    ∀x [girl(x) → sleep(x)]@,g

The model @ is defined as a tuple (A,F), where A is a set of individuals and F is an assignment function which assigns to every one-place predicate in the object language an element of 2^A (i.e. the power set of A) as an interpretation, and accordingly for two-place predicates, etc. The interpretation of the quantifier ∀ based on g is as follows: the whole formula is true relative to the model @ if it holds for all possible variable assignments g′ that the formula without the outermost quantifier is true:

6.4.3 ELIMINATION OF THE OUTERMOST QUANTIFIER

    [girl(x) → sleep(x)]@,g′

The purpose of eliminating the quantifier is to reduce the predicate calculus formula to propositional calculus and its truth tables (Bochenski 1961). This may be achieved by defining the formal model @ =def (A,F) explicitly. For example, let us define the set of individuals as A =def {a0, a1, a2, a3} and the denotation function as F(girl) =def {a1, a2} and F(sleep) =def {a0, a2, a3}.
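Under these explicit definitions, the interpretation of 6.4.2 can be computed mechanically. The following sketch (ours) evaluates the quantifier-free formula 6.4.3 for every variable assignment, anticipating the step-by-step evaluation described below:

    # Sketch (ours): evaluating forall x [girl(x) -> sleep(x)] relative
    # to the model @ = (A, F) defined above.
    A = {"a0", "a1", "a2", "a3"}
    girl = {"a1", "a2"}           # F(girl)
    sleep = {"a0", "a2", "a3"}    # F(sleep)

    # p -> q is false only if p is true and q is false:
    result = all((x not in girl) or (x in sleep) for x in A)
    print(result)   # False: the assignment g'(x) = a1 falsifies the formula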
Now g′ may systematically assign values in A to the variable x in 6.4.3, i.e. g′(x) = a0, g′(x) = a1, g′(x) = a2, and g′(x) = a3, and determine the truth value of the subformulas girl(x) and sleep(x) for each of the assignments. The first assignment g′(x) = a0 makes the formula true: a0 is not in the set denoted by F(girl) in @; therefore, based on the truth table of p → q in propositional calculus, the formula in 6.4.3 is true for this assignment. The second assignment g′(x) = a1, in contrast, makes the formula in 6.4.3 false: a1 is in the set denoted by F(girl), but not in the set denoted by F(sleep) in @. Having shown that not all variable assignments g′ make the formula in 6.4.3 true, the interpretation of the formula in 6.4.2 is determined to be false relative to our set-theoretical model @ (FoCL Sect. 19.3). Given how we manually defined the model, this is in accordance with intuition.
For building a talking robot, however, reducing the interpretation of a formula like 6.4.1 to the truth tables of propositional calculus is a dead end. Moreover, using quantifiers at the highest level of the logical syntax for treating determiners leads to artificial scope ambiguities5 caused by different possible orders when more than one quantifier is used. For example, the standard translation of Every man loves a woman into predicate calculus has the following readings, resulting from alternative orders of the quantifiers:

6.4.4 ANALYZING Every man loves a woman IN PREDICATE CALCULUS

    Reading 1: ∀x [man(x) → ∃y [woman(y) ∧ love(x,y)]]
    Reading 2: ∃y [woman(y) ∧ ∀x [man(x) → love(x,y)]]

On reading 1, it holds for every man that there is some woman whom he loves. On reading 2, there is a certain woman, e.g. Marilyn Monroe, who is loved by every man.
The two formulas of predicate calculus are based on the notions of functor-argument, coordination, and coreference, though in a manner different from their use in DBS. Functor-argument is used in man(x), woman(y), and love(x,y); coordination is used in [man(x) → P] and [woman(y) ∧ Q]; and coreference is expressed by the quantifiers and the horizontally bound variables in ∀x [man(x) ... love(x,y)] and ∃y [woman(y) ... love(x,y)].

5 Kempson and Cormack (1981). In recent years, Minimal Recursion Semantics (MRS, Copestake, Flickinger et al. 2006) has devoted much work to avoiding the unnatural proliferation of readings caused by different quantifier scopes, using "semantic under-specification." In MRS, under-specification is limited to quantifier scope (see also Steedman 2005). In Database Semantics, which has neither quantifiers nor quantifier scope, semantic under-specification applies to all content coded at the language level and is used for the matching with a delimited context of use (FoCL Sect. 5.2).
In DBS, in contrast, the determiners every and a are treated as function words rather than quantifiers. The hear mode derivation replaces their lexical (A.4.3) core values, i.e. the substitution variables n_1 and n_2, with the constants man and woman, respectively (similar to 6.1.1). Their semantic properties are expressed by the lexical sem values pl, exh, sg, and indef (defined set-theoretically in 6.4.7), which they contribute to the sem attribute of the noun proplets (Choe et al. 2006):

6.4.5 RESULT OF PARSING Every man loves a woman IN DBS

    [sur: | noun: man | cat: snp | sem: pl exh | fnc: love | prn: 4]
    [sur: | verb: love | cat: #ns3′ #a′ decl | sem: pres | arg: man woman | prn: 4]
    [sur: | noun: woman | cat: snp | sem: indef sg | fnc: love | prn: 4]

This analysis uses only intrapropositional functor-argument: as in the natural surface, and unlike the formulas in 6.4.4, there is neither coordination nor coreference. Also, unlike the predicate calculus analysis, DBS does not analyze the sentence as ambiguous. Instead, 6.4.5 represents only reading 1 of 6.4.4 – which is entailed by reading 2 (i.e. reading 1 is true whenever reading 2 is true, but not vice versa). In other words, whether or not all of the men happen to love the same woman is treated in the agent's pragmatics, and not as a structural ambiguity of the expression's syntax-semantics (FoCL 12.5.1).
The atomic values exh (exhaustive), sel (selective), sg (singular), pl (plural), def (definite), and indef (indefinite) are used in different combinations to characterize the following kinds of phrasal noun surfaces of English:

6.4.6 THE cat AND sem VALUES OF DETERMINER-NOUN COMBINATIONS

    a girl       [cat: snp]  [sem: indef sg]
    some girls   [cat: pnp]  [sem: indef pl sel]
    every girl   [cat: snp]  [sem: exh pl]
    all girls    [cat: pnp]  [sem: exh pl]
    the girl     [cat: snp]  [sem: def sg]
    the girls    [cat: pnp]  [sem: def pl]

The cat values snp and pnp stand for singular noun phrase and plural noun phrase, respectively (A.3.2, 4). The atomic sem values are defined as patterns which are used by the robot in the speak and in the hear mode. The patterns are procedurally defined as the following set-theoretic constellations:
6.4.7 SET-THEORETIC INTERPRETATION OF exh, sel, sg, pl, def, indef

    [Figure: six set diagrams, labeled exh, sel, sg, pl, def, and indef, each showing a constellation of referents within a domain.]
The value exh refers to all members of a set, called the domain, while sel refers only to some. The value sg refers to a single member of the domain, while pl refers to more than one. The value def refers to a pre-specified subset of the domain, while no such subset is presumed by indef. Each value can only be combined with a value from one of the other pairs. Thus exh cannot combine with sel, sg cannot combine with pl, and def cannot combine with indef. However, the combinations exh pl, sel sg, sel pl, def sg, def pl, indef sg, indef pl, etc., are legitimate and have different meanings. The combination exh sg is theoretically possible, but makes little sense (pace Russell 1905) because the domain would have to be a unit set.
If a robot in the speak mode perceives the set-theoretic constellation corresponding to exh and pl as shown in 6.4.7, it will use the determiner all,6 and similarly with the other values. Correspondingly, if a robot in the hear mode interprets the noun phrase all girls, for example, it will be able to draw the corresponding set-theoretic constellation or to choose the right schema from several alternatives.
The Database Semantics approach differs from predicate calculus in that predicate calculus uses the words some and all in the metalanguage to define the words some and all in the object-language (as shown by the use of the variable assignment function g′ described above), while Database Semantics is based on a procedural interpretation. This difference is based on the different ontological assumptions of the two approaches, illustrated in 2.3.1 with the simplest sentence Julia sleeps (FoCL Sect. 20.4).
In summary, the semantics of predicate calculus is based on the elementary notions of true and false, while the semantics of DBS is based on the agent's recognition and action procedures. In DBS, truth is reconstructed as a derived notion: a proposition is true if the agent's cognition works as it should. For example, if a robot observes that all girls are sleeping and communicates this fact by saying all girls are sleeping, it is speaking truly. Semantically, the sentence asserts that there is a set of more than one girl (pl) and all elements of the set (exh) participate in whatever is asserted by the verb.

6 The determiner every has the same sem values as all, but the cat values differ (A.4.3).
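The sem values may be viewed as procedural tests on a set of referents relative to a domain. The following sketch (ours; the sets are illustrative assumptions, not definitions from the text) shows how a speak mode choice between all and some could be based on such tests:

    # Sketch (ours): sem values as set-theoretic tests (cf. 6.4.7).
    def exh(referents, domain): return referents == domain   # all members
    def sel(referents, domain): return referents < domain    # proper subset
    def pl(referents, domain):  return len(referents) > 1
    def sg(referents, domain):  return len(referents) == 1

    domain = {"g1", "g2", "g3"}          # a contextually given set of girls

    referents = domain                   # speaking about all of them
    if exh(referents, domain) and pl(referents, domain):
        print("all girls")               # sem: exh pl

    referents = {"g1", "g2"}             # speaking about only some of them
    if sel(referents, domain) and pl(referents, domain):
        print("some girls")              # sem: indef pl sel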
6.5 Complexity Levels of Surfaces and their Contents

In natural language, the expressions serving semantically as arguments, relations, and modifiers occur at three levels of complexity, called elementary, phrasal, and clausal (15.1.1), and at two levels of analysis, namely the surface and the content. The notion elementary has different interpretations depending on whether it is applied to the surface or to the content. A surface is called elementary if it consists of a single word form. A content is called elementary if it consists of a single proplet. What is phrasal at the surface may be represented by an elementary content, as shown by the following hear mode example (13.3.3):

6.5.1 FUSING PHRASAL DET + NOUN INTO AN ELEMENTARY CONTENT

    [sur: every | noun: n_1 | cat: sn′ snp | sem: pl exh | fnc: | mdr: | prn: 5]
    + [sur: child | noun: child | cat: sn | sem: sg | fnc: | mdr: | prn: ]
    ⇒ [sur: every child | noun: child | cat: snp | sem: pl exh | fnc: | mdr: | prn: 5]
The content derived from the phrasal surface every child is elementary. Deriving the phrasal surface in the speak mode is based on the features [cat: snp] and [sem: pl exh] in the elementary content (function word precipitation, 12.5.2). In the next example, the surface and the content are both phrasal, but the number of their elements is different. The hear mode connects three surface items into a two proplet content (13.2.7, 13.2.8):

6.5.2 FUSING PHRASAL DET + ADN + NOUN INTO A PHRASAL CONTENT

    [sur: the | noun: n_1 | cat: nn′ np | sem: def | fnc: | mdr: | prn: 6]
    + [sur: beautiful | adj: beautiful | cat: adn | sem: pad | mdd: | mdr: | prn: ]
    + [sur: child | noun: child | cat: sn | sem: sg | fnc: | mdr: | prn: ]
    ⇒ [sur: the child | noun: child | cat: snp | sem: def sg | fnc: | mdr: beautiful | prn: 6]
      [sur: beautiful | adj: beautiful | cat: adn | sem: pad | mdd: child | mdr: | prn: 6]
A corresponding hear mode derivation is illustrated in 6.2.1 for phrasal nouns in pre- and postverbal position. Phrasal surface structures of verbs involve auxiliaries which code various values of tense and verbal mood in English in combination with a nonfinite form of the main verb which provides the oblique valency positions. Consider the hear mode reduction of a phrasal verb into a single proplet, representing an elementary content:
6.5.3 FUSING PHRASAL AUX + AUX + AUX + NFV INTO ELEMENTARY CONTENT

    [sur: could | verb: v_1 | cat: n′ v | sem: cond | arg: | mdr: | prn: 7]
    + [sur: have | verb: v_2 | cat: n-s3′ hv′ v | sem: hv_pres | arg: | mdr: | prn: ]
    + [sur: been | verb: v_3 | cat: be′ hv | sem: be_perf | arg: | mdr: | prn: ]
    + [sur: sleeping | verb: sleep | cat: be | sem: prog | arg: | mdr: | prn: ]
    ⇒ [sur: could have been sleeping | verb: sleep | cat: n′ v | sem: cond hv_pres be_perf prog | arg: | mdr: | prn: 7]
For a related hear mode derivation see 6.3.1. Happily, in natural languages verbal constructions with "function word recursion" cannot go on indefinitely, but seem to be limited grammatically to at most three or four function words.
Finally we turn to phrasal modifiers. The following example shows the prepositional phrase on+the+big+table in the style of 6.5.1–6.5.3:

6.5.4 FUSING PHRASAL PREP + DET + ADN + NOUN INTO PHRASAL CONTENT

    [sur: on | noun: n_1 | cat: adnv | sem: on | mdd: | mdr: | prn: 8]
    + [sur: the | noun: n_2 | cat: nn′ np | sem: def | fnc: | mdr: | prn: ]
    + [sur: big | adj: big | cat: adn | sem: pad | mdd: | mdr: | prn: ]
    + [sur: table | noun: table | cat: sn | sem: sg | fnc: | mdr: | prn: ]
    ⇒ [sur: on the table | noun: table | cat: snp | sem: on def sg | mdd: | mdr: big | prn: 8]
      [sur: big | adj: big | cat: adn | sem: pad | mdd: table | mdr: | prn: 8]
The preposition on is fused with the determiner the into on_the, which is connected with big into on_the + big, into which table is fused as on_the_table+big.
The core attribute of the preposition is noun rather than adj. This is motivated in part by languages like classical Latin, which use nouns inflected in a case called ablative to express modifier roles such as location, instrument, and time. English uses nouns syntactically as modifiers in expressions like kitchen|table, which other languages handle in the morphology as compounds.
The core attribute noun for prepositional phrases is also motivated by their speak mode derivation. Prepositional phrases come in such varieties as in Paris, in the city, in the beautiful city, etc., for which the full range of phrasal nouns must be realized. For this, the use of a core attribute adj would complicate the speak mode realization of the noun part. The core attribute noun, the continuation attribute mdd, and the initial sem value on originate in the lexical entry of the preposition (A.4.5). They characterize a prepositional phrase as a modifier and are absent in noun proplets which serve as subjects or objects. Because a prepositional phrase may be used as an adnominal as well as an adverbial modifier, the cat value of a preposition is adnv. The value of the mdd attribute connects a prepositional phrase to the modified noun (adnominal use) or the modified verb (adverbial use).
The examples of a phrasal surface with an elementary content (6.5.1, 6.5.3) raise the question of whether or not the inverse, i.e. a phrasal content with an elementary surface, is also possible. An application would be the lexical definition of a word like kill as cause to become not alive, as proposed by Lakoff (1971), McCawley (1976), and others within the obsolete framework of generative semantics. The DBS alternative is the use of inferences (Sect. 5.3). Thus, an elementary content may have a phrasal surface, but an elementary surface may not have a phrasal content (at least not directly).
Whether an elementary content has a phrasal surface or not depends on the natural language. For example, what is expressed by the complex expression I should have carried in English is expressed equivalently by the single word form portavissem in Latin. In other words, tense and mood are coded in the English surface by a syntactic-semantic combination of function words, while the Latin surface uses the morphology instead. At the content level, however, the different surfaces of these languages are assigned the same single proplet (disregarding the sur value).
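The fusions shown in 6.5.1–6.5.4 all follow the same scheme: the function word contributes its cat and sem values, its substitution value is replaced by the core value of the content word, and the content word's lexical proplet is discarded. A minimal sketch (ours) for the determiner-noun case of 6.5.1:

    # Sketch (ours): function word absorption of a determiner and a noun.
    every = {"sur": "every", "noun": "n_1", "cat": "sn' snp",
             "sem": "pl exh", "fnc": "", "mdr": "", "prn": 5}
    child = {"sur": "child", "noun": "child", "cat": "sn", "sem": "sg"}

    def absorb(det, noun):
        result = dict(det)
        result["noun"] = noun["noun"]           # n_1 replaced by child
        result["cat"] = det["cat"].split()[-1]  # sn' canceled by sn: snp remains
        result["sur"] = det["sur"] + " " + noun["sur"]
        return result                           # the noun proplet is discarded

    print(absorb(every, child))
    # {'sur': 'every child', 'noun': 'child', 'cat': 'snp', 'sem': 'pl exh', ...}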
6.6 Transparent vs. Opaque

Using a prepositional phrase like 6.5.4 as an adnominal or adverbial modifier exemplifies an intrapropositional opaque construction because it defines the modifier|modified relation as N|N or N|V, and not as transparent A|N or A|V. The transparent-opaque distinction7 applies to the semantic relations of structure at the three levels of grammatical complexity (CLaTR Sect. 9.6):

6.6.1 TRANSPARENT AND OPAQUE RELATIONS AT THE THREE LEVELS

                 subject  object     adnominal  adverbial  coordination
    clausal:     V/V      V\V        V|N        V|V        ∅8   V−V  ∅
    phrasal:     N/V      N\V  V\V   N|N        N|V        N−N  V−V  A−A
    elementary:  N/V      N\V        A|N  N|N   A|V        N−N  V−V  A−A

In DBS, a semantic relation between two proplets is called opaque if it requires (i) an additional feature in one proplet and (ii) a non-standard value in the other. Otherwise it is transparent.

7 This is an appropriation of terminology used earlier by Quine (1960) for what the mediaeval logicians called de dicto/de re (HBTR Sect. 3.6). Other terms for roughly the same dichotomy are even/uneven (Frege) and intensional/extensional (Montague). The term opacity has also been used in phonology (Kiparsky 1973).
8 Extrapropositional coordination holds only between two verbs (V−V). The absence of clausal N−N and A−A coordinations is indicated by ∅.
For example, in 6.1.2, transparent N\V codes the object\predicate relation (i) with the standard feature [fnc: give] in the noun proplet apple and (ii) the standard value apple in the arg slot of the verb proplet give. The opaque (shaded) counterpart V\V of the infinitive 6.6.6, in contrast, needs (i) the additional feature [fnc: try] in read and (ii) the non-standard value read in the arg slot of try to establish the relation.
While modifiers are usually optional, there exist verbs which require a modifier for the construction to be complete. For example, the verb put requires a subject, an object, and a locative modifier. Consider the following example:

6.6.2 PREPOSITIONAL PHRASE SERVING AS AN OBLIGATORY MODIFIER

    Julia put the flowers in a vase

    lexical lookup:
    [noun: Julia | fnc: | prn: ]  [verb: put | arg: | mdr: | prn: ]  [noun: n_1 | fnc: | prn: ]  [noun: flower | fnc: | prn: ]  [noun: n_2 | mdd: | prn: ]  [noun: n_3 | fnc: | prn: ]  [noun: vase | fnc: | prn: ]

    syntactic-semantic parsing:
    1  [noun: Julia | fnc: | prn: 25]  [verb: put | arg: | mdr: | prn: ]
    2  [noun: Julia | fnc: put | prn: 25]  [verb: put | arg: Julia | mdr: | prn: 25]  [noun: n_1 | fnc: | prn: ]
    3  [noun: Julia | fnc: put | prn: 25]  [verb: put | arg: Julia n_1 | mdr: | prn: 25]  [noun: n_1 | fnc: put | prn: 25]  [noun: flower | fnc: | prn: ]
    4  [noun: Julia | fnc: put | prn: 25]  [verb: put | arg: Julia flower | mdr: | prn: 25]  [noun: flower | fnc: put | prn: 25]  [noun: n_2 | mdd: | prn: ]
    5  [noun: Julia | fnc: put | prn: 25]  [verb: put | arg: Julia flower | mdr: n_2 | prn: 25]  [noun: flower | fnc: put | prn: 25]  [noun: n_2 | mdd: put | prn: 25]  [noun: n_3 | fnc: | prn: ]
    6  [noun: Julia | fnc: put | prn: 25]  [verb: put | arg: Julia flower | mdr: n_3 | prn: 25]  [noun: flower | fnc: put | prn: 25]  [noun: n_3 | mdd: put | prn: 25]  [noun: vase | fnc: | prn: ]

    result of syntactic-semantic parsing:
    [noun: Julia | fnc: put | prn: 25]  [verb: put | arg: Julia flower | mdr: vase | prn: 25]  [noun: flower | fnc: put | prn: 25]  [noun: vase | mdd: put | prn: 25]
The prepositional argument in_a_vase is obligatory in the sense that the expression Julia put the flowers is incomplete. It is opaque because the core attribute of the prepositional phrase is noun. If the modifier were elementary, e.g. down, the modification relation would be transparent A|V. In the alphabetical order induced by the core values, the proplet set representing the content of 6.6.2 is as follows, with added cat and sem features:

6.6.3 VERB TAKING AN OBLIGATORY MODIFIER

    [noun: flower | cat: pnp | sem: def pl | fnc: put | mdr: | prn: 25]
    [noun: Julia | cat: snp | sem: nm f sg | fnc: put | mdr: | prn: 25]
    [verb: put | cat: #n′ #mod′ #a′ decl | sem: past | arg: Julia flower | mdr: vase | prn: 25]
    [noun: vase | cat: adnv snp | sem: in indef sg | mdd: put | mdr: | prn: 25]
The obligatory nature of the adverbial modifier is made formally explicit by the mod′ slot (A.3.2, 5) in the verb's cat feature. The modifier function of the prepositional phrase is coded by the additional feature [mdd: put] (opaque). The graph analysis underlying the speak mode of 6.6.2 is as follows:

6.6.4 DBS GRAPH OF CONTENT WITH PREPOSITIONAL OBJECT

    (i) semantic relations graph (SRG): put at the root, with Julia, flower, and vase below, vase attached by the "|" line
    (ii) signature: V over N N N
    (iii) numbered arcs graph (NAG): six arcs, numbered 1–6
    (iv) surface realization:
        1 Julia  2 put  3 the_flowers  4 (empty)  5 in_a_vase  6 .
        V/N  N/V  V\N  N\V  V|N  N|V
Julia put the flowers in a vase (obligatory modifier) and Julia read a book without her glasses (optional modifier) have similar hear and speak mode derivations, but differ in the cat values of their verbs. Because phrasal modifiers must follow their modified in adnominal use and may follow their modified in adverbial use (Sect. 15.1), they may cause a syntactic ambiguity. For example, on the adnominal interpretation of Julia ate the apple on the table, it is the apple which is on the table (15.1.5), while on the adverbial interpretation it is the eating which takes place on the table (15.1.6). These readings differ explicitly in their hear mode derivation, their proplet set, and their speak mode derivation.
Another intrapropositional example of an opaque construction besides phrasal adnominal N|N and adverbial N|V is the infinitive V\V. Consider the following example:

6.6.5 HEAR MODE DERIVATION OF John tried to read a book

    John tried to read a book .

    lexical lookup:
    [noun: John | cat: snp | sem: nm m sg | fnc: | prn: ]  [verb: try | cat: n′ a′ v | arg: | mdr: | prn: ]  [verb: v_1 | cat: inf | fnc: | arg: | prn: ]  [verb: read | cat: n-s3′ a′ v | arg: | mdr: | prn: ]  [noun: n_1 | cat: sn′ snp | sem: indef sg | fnc: | prn: ]  [noun: book | cat: sn | sem: sg | fnc: | prn: ]  [verb: | cat: v′ decl | prn: ]

    syntactic-semantic parsing:
    1  [noun: John | cat: snp | sem: nm m sg | fnc: | prn: 32]  [verb: try | cat: n′ a′ v | arg: | mdr: | prn: ]
    2  [noun: John | cat: snp | sem: nm m sg | fnc: try | prn: 32]  [verb: try | cat: #n′ a′ v | arg: John | mdr: | prn: 32]  [verb: v_1 | cat: inf | fnc: | arg: | prn: ]
    3  [noun: John | cat: snp | sem: nm m sg | fnc: try | prn: 32]  [verb: try | cat: #n′ #a′ v | arg: John v_1 | mdr: | prn: 32]  [verb: v_1 | cat: inf | fnc: try | arg: John | prn: 32]  [verb: read | cat: n-s3′ a′ v | arg: | mdr: | prn: ]
    4  [noun: John | cat: snp | sem: nm m sg | fnc: try | prn: 32]  [verb: try | cat: #n′ #a′ v | arg: John read | mdr: | prn: 32]  [verb: read | cat: #n-s3′ a′ inf | fnc: try | arg: John | prn: 32]  [noun: n_1 | cat: sn′ snp | sem: indef sg | fnc: | prn: ]
    5  [noun: John | cat: snp | sem: nm m sg | fnc: try | prn: 32]  [verb: try | cat: #n′ #a′ v | arg: John read | mdr: | prn: 32]  [verb: read | cat: #n-s3′ #a′ inf | fnc: try | arg: John n_1 | prn: 32]  [noun: n_1 | cat: sn′ snp | sem: indef sg | fnc: read | prn: 32]  [noun: book | cat: sn | sem: sg | fnc: | prn: ]
    6  [noun: John | cat: snp | sem: nm m sg | fnc: try | prn: 32]  [verb: try | cat: #n′ #a′ v | arg: John read | mdr: | prn: 32]  [verb: read | cat: #n-s3′ #a′ inf | fnc: try | arg: John book | prn: 32]  [noun: book | cat: snp | sem: indef sg | fnc: read | prn: 32]  [verb: | cat: v′ decl | prn: ]

    result:
    [noun: John | cat: snp | sem: nm m sg | fnc: try | prn: 32]  [verb: try | cat: #n′ #a′ decl | arg: John read | mdr: | prn: 32]  [verb: read | cat: #n-s3′ #a′ inf | fnc: try | arg: John book | prn: 32]  [noun: book | cat: snp | sem: indef sg | fnc: read | prn: 32]
In line 2, the conjunction to cancels the a′ valency position in the cat slot of try and adds a proplet with the substitution value v_1, which is copied into the arg slot of try. The additional fnc attribute of read originates in the lexical entry of to (A.5.7). Using the alphabetical order induced by the core values, the content resulting from the hear mode derivation 6.6.5 may be shown as follows:

6.6.6 INFINITIVE CONSTRUCTION AS A SET OF PROPLETS

    [noun: book | cat: snp | sem: indef sg | fnc: read | prn: 32]
    [noun: John | cat: snp | sem: nm m sg | fnc: try | prn: 32]
    [verb: read | cat: #n′ #a′ inf | arg: John book | fnc: try | prn: 32]
    [verb: try | cat: #n′ #a′ decl | arg: John read | mdr: | prn: 32]
The nonfinite verb of the infinitive makes the construction intrapropositional, formally indicated in DBS by a common prn value, here 32. The graphical analysis of the content 6.6.6 uses the "\" line for the direct_object\predicate relation between try and read, like 6.2.3, though in the opaque V\V instead of the transparent N\V constellation:

6.6.7 DBS GRAPH ANALYSIS OF AN INFINITIVE

    (i) semantic relations graph (SRG): try at the root, with John and read below, and book below read
    (ii) signature: V over N and V, the lower V over N
    (iii) numbered arcs graph (NAG): six arcs, numbered 1–6
    (iv) surface realization:
        1 John  2 tried  3 to__read  4 a__book  5 (empty)  6 .
        V/N  N/V  V\V  V\N  N\V  V\V
Instead of an infinitive, try may also take a noun, as in John tried a cookie (transparent). This is in contrast to the verb appear, which takes an infinitive as object, as in John appeared to sleep, but not a noun, as shown by *John appeared a cookie. There are also three-place verbs taking an infinitive as one of their arguments, such as persuade and promise (CLaTR Sect. 15.6).
Next let us consider the English auxiliaries be, have, and do. As function words with the core values b_1, h_1, and d_1 (substitution variables), they combine transparently with nonfinite forms of the main verb (FoCL 17.3.2):
6.6.8 AUXILIARIES IN COMPLEX VERB CONSTRUCTIONS

    does give:  ns3′ do′ v + d′ a′ do ⇒ ns3′ d′ a′ v
    has given:  ns3′ hv′ v + d′ a′ hv ⇒ ns3′ d′ a′ v
    is giving:  ns3′ be′ v + d′ a′ be ⇒ ns3′ d′ a′ v
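The combinations in 6.6.8 amount to canceling a matching valency position and concatenating the remaining ones. The following sketch (ours) computes them; the list representation of the cat values is an assumption for illustration:

    # Sketch (ours): category cancellation for auxiliary + nonfinite verb.
    def combine(aux_cat, nfv_cat):
        filler = nfv_cat[-1]                 # e.g. 'do', 'hv', or 'be'
        assert filler + "'" in aux_cat       # matching valency position
        rest = [s for s in aux_cat if s not in (filler + "'", "v")]
        return rest + nfv_cat[:-1] + ["v"]

    print(combine(["ns3'", "hv'", "v"], ["d'", "a'", "hv"]))
    # ["ns3'", "d'", "a'", "v"]  -- has given, second line of 6.6.8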
The cat values of the finite auxiliary and of the nonfinite main verb combine into the same cat feature as that of the corresponding elementary verb form. In addition to their use as auxiliaries, they function as main verbs in copula constructions such as does the boogie woogie, has a car, and is a doctor:

6.6.9 HEAR MODE DERIVATION OF Julia is a doctor.

    Julia is a doctor .

    lexical lookup:
    [noun: Julia | cat: snp | sem: nm f sg | fnc: | prn: ]  [verb: b_1 | cat: ns3′ be′ v | sem: pres | arg: | prn: ]  [noun: n_1 | cat: sn′ snp | sem: indef sg | fnc: | prn: ]  [noun: doctor | cat: sn | sem: sg | fnc: | prn: ]  [verb: | cat: v′ decl | arg: | mdr: | prn: ]

    syntactic-semantic parsing:
    1  [noun: Julia | cat: snp | sem: nm f sg | fnc: | prn: 38]  [verb: b_1 | cat: ns3′ be′ v | sem: pres | arg: | prn: ]
    2  [noun: Julia | cat: snp | sem: nm f sg | fnc: b_1 | prn: 38]  [verb: b_1 | cat: #ns3′ be′ v | sem: pres | arg: Julia | prn: 38]  [noun: n_1 | cat: sn′ snp | sem: indef sg | fnc: | prn: ]
    3  [noun: Julia | cat: snp | sem: nm f sg | fnc: b_1 | prn: 38]  [verb: b_1 | cat: #ns3′ #be′ v | sem: pres | arg: Julia n_1 | prn: 38]  [noun: n_1 | cat: sn′ snp | sem: indef sg | fnc: b_1 | prn: 38]  [noun: doctor | cat: sn | sem: sg | fnc: | prn: ]
    4  [noun: Julia | cat: snp | sem: nm f sg | fnc: b_1 | prn: 38]  [verb: b_1 | cat: #ns3′ #be′ v | sem: pres | arg: Julia doctor | prn: 38]  [noun: doctor | cat: snp | sem: indef sg | fnc: b_1 | prn: 38]  [verb: | cat: v′ decl | arg: | mdr: | prn: ]

    result:
    [noun: Julia | cat: snp | sem: nm f sg | fnc: b_1 | prn: 38]  [verb: b_1 | cat: #ns3′ #be′ decl | sem: pres | arg: Julia doctor | prn: 38]  [noun: doctor | cat: snp | sem: indef sg | fnc: b_1 | prn: 38]
The core value of the is proplet is the substitution variable b_1. In line 1, it is cross-copied with the core value Julia into the fnc and arg slots of the first two proplets, and the ns3′ valency position in the cat slot of the is proplet is canceled by #-marking. In line 2, the core values of is and a are cross-copied, and the be′ valency position in the cat slot of the is proplet is canceled. In line 3, the core value doctor replaces the two occurrences of the substitution variable n_1. The construction is opaque because the valency position be′ is canceled with the nominal filler category snp. It is special because the substitution value b_1 appears as the verb's core value in the result. The speak mode graph analysis, however, shows a standard transitive verb construction:

6.6.10 DBS GRAPH ANALYSIS OF Julia is a doctor.

    (i) semantic relations graph (SRG): be at the root, with Julia and doctor below
    (ii) signature: V over N N
    (iii) numbered arcs graph (NAG): four arcs, numbered 1–4
    (iv) surface realization:
        1 Julia  2 is  3 a_doctor  4 .
        V/N  N/V  V\N  N\V
Traversal of arc 1 realizes Julia, traversal of arc 2 realizes is, traversal of arc 3 realizes a_doctor, and traversal of arc 4 realizes the period.
A second copula construction of English with the auxiliary be takes an adnominal adjective as its second argument, as in Fido is hungry. The content has the following proplet representation:

6.6.11 CONTENT OF Fido is hungry.

    [noun: Fido | cat: snp | sem: nm sg | fnc: b_1 | prn: 39]
    [verb: b_1 | cat: #ns3′ #be′ decl | sem: pres | arg: Fido hungry | prn: 39]
    [adj: hungry | cat: adn | sem: pad | fnc: b_1 | prn: 39]
The second arg position of the auxiliary is filled opaquely with the adj hungry. The valency position be′ of the auxiliary’s cat attribute is #-marked (canceled) by the adjective’s cat value adn. The adjective proplet is opaque because it has an additional fnc attribute. As in 6.6.9, the substitution value b_1 of the auxiliary is used as the verb’s core value.
7. Extrapropositional Functor-Argument
Two clauses connected by an extrapropositional coordination are “equal” (parataxis), whereas an extrapropositional functor-argument distinguishes between the “higher” and the “lower” clause (hypotaxis). A lower clause serves in the higher clause as (i) a clausal argument or as (ii) a clausal modifier. A clausal adnominal modifier, aka relative clause, is established between the verb of the subclause and the noun of the higher clause (V|N), while the other clausal functor-arguments are established between two verbs, i.e. V/V (clausal subject), V\V (clausal object), and V|V (clausal adverb).
7.1 Clausal Subject

The verb of a subclause serves a double function, namely as (i) the predicate of the subclause and (ii) the argument or modifier in the higher clause:

7.1.1 EXTRA- VS. INTRAPROPOSITIONAL FUNCTOR-ARGUMENTS

    [Figure: three pairs of SRGs, each showing a clausal construction above its elementary counterpart. In each clausal SRG, a dotted rectangle marks the subclause verb bark (with Fido attached below).
    clausal subject: bark serves as the subject of surprise (with object Mary): That Fido barked surprised Mary. Corresponding elementary subject: The answer surprised Mary.
    clausal object: bark serves as the object of hear (with subject Mary): Mary heard that Fido barked. Corresponding elementary object: Mary heard the answer.
    clausal adverbial: bark serves as the adverbial modifier of smile (with subject Mary): Mary smiled when Fido barked. Corresponding elementary adverb: Mary smiled happily.]
The double function of bark as (i) the verb of the subclause and (ii) the subject, object, or adverbial modifier of the main clause is shown by the dotted rectangles in their SRGs. No such double function exists in the elementary counterparts shown at the bottom.
The semantic relations between the verb of the lower sentence and the verb of the higher sentence are strictly local. For example, the remainder of the subject sentence comes into play only insofar as Fido is connected to, and is dragged along by, bark. The remainder of the main clause comes into play only insofar as Mary is connected to, and is dragged along by, surprise.1
The verbs which may take a clausal subject include amaze, amuse, anger, annoy, bore, concern, excite, interest, irritate, satisfy, surprise, tire, and worry. Called psych verbs in lexicology, they have two argument positions. Instead of a clause, the subject may also be a noun (intrapropositional transparent), e.g. The story amused Mary. The object noun is called the human experiencer. The following hear mode derivation illustrates the time-linear DBS interpretation of extrapropositional That Fido barked amused Mary:

7.1.2 INTERPRETING A CLAUSAL SUBJECT CONSTRUCTION

    That Fido barked amused Mary

    lexical lookup:
    [verb: v_1 | arg: | fnc: | prn: ]  [noun: Fido | fnc: | prn: ]  [verb: bark | arg: | prn: ]  [verb: amuse | arg: | prn: ]  [noun: Mary | fnc: | prn: ]

    syntactic-semantic parsing:
    1  [verb: v_1 | arg: | fnc: | prn: 27]  [noun: Fido | fnc: | prn: ]
    2  [verb: v_1 | arg: Fido | fnc: | prn: 27]  [noun: Fido | fnc: v_1 | prn: 27]  [verb: bark | arg: | prn: ]
    3  [verb: bark | arg: Fido | fnc: | prn: 27]  [noun: Fido | fnc: bark | prn: 27]  [verb: amuse | arg: | prn: ]
    4  [verb: bark | arg: Fido | fnc: (amuse 28) | prn: 27]  [noun: Fido | fnc: bark | prn: 27]  [verb: amuse | arg: (bark 27) | prn: 28]  [noun: Mary | fnc: | prn: ]

    result of syntactic-semantic parsing:
    [verb: bark | arg: Fido | fnc: (amuse 28) | prn: 27]  [noun: Fido | fnc: bark | prn: 27]  [verb: amuse | arg: (bark 27) Mary | prn: 28]  [noun: Mary | fnc: amuse | prn: 28]
Clausal functor-arguments have in common that they are opaque. In 7.1.2, the lower verb has the additional feature [fnc: (amuse 28)], while the higher verb has the clausal arg value (bark 27). The additional continuation attribute fnc (normally reserved for nouns) for connecting the subclause to the (verb of the) next higher clause is introduced lexically by the subordinating conjunction that2 (A.4.4); it has the core attribute verb and absorbs bark. As an extrapropositional construction, two different prn values, 27 and 28, are used.
More specifically, in line 1, the substitution value v_1 of the core attribute of that is copied into the fnc slot of Fido, and the core value of Fido is copied into the arg slot of that. In line 2, the two instances of v_1 are replaced by the core value of the bark proplet, which is then discarded. In line 3, the bark core value of the initial proplet is copied into the arg slot of the verb proplet amuse, and amuse is copied into the fnc slot of the bark proplet, establishing the desired V/V connection between the subordinate subject clause and the higher clause.
Consider the content derived in 7.1.2 as a set of proplets shown in the alphabetical order induced by the core values, with added cat and sem features:

7.1.3 CONTENT OF A CLAUSAL SUBJECT CONSTRUCTION

    [verb: amuse | cat: #n′ #a′ decl | sem: past | arg: (bark 27) Mary | prn: 28]
    [verb: bark | cat: #n′ v | sem: that past | arg: Fido | fnc: (amuse 28) | prn: 27]
    [noun: Fido | cat: snp | sem: nm m | fnc: bark | prn: 27]
    [noun: Mary | cat: snp | sem: nm f | fnc: amuse | prn: 28]

The bark and amuse proplets are connected by an opaque extrapropositional V/V relation, coded by the additional [fnc: (amuse 28)] feature in bark and the long (extrapropositional) address in [arg: (bark 27) Mary] of amuse. The subordinating conjunction that is coded as the first value of the sem attribute of bark. Production in the speak mode is based on the LAGo think navigation along the semantic relations of structure connecting the proplets in 7.1.3. They may be shown graphically as follows:

1 This is different from linguistic structuralism and its later ramifications, which are based on possible substitutions and combine whole constituents of arbitrary size. The conglomeration into constituents is shown by the nonterminal nodes in PSG trees (e.g. FoCL 8.4.4). In DBS graphs, nonterminals are absolutely absent.
2 In the 1st Ed. the special core attribute n/v was used redundantly to indicate the double function of the verb in a subject clause, and accordingly for object, adnominal modifier, and adverbial modifier clauses. We now rely on the additional continuation attribute as the sole indicator of a subclause verb.
7.1.4 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 7.1.2

    (i) semantic relations graph (SRG): amuse at the root, with bark and Mary below, and Fido below bark
    (ii) signature: V over V and N, the lower V over N
    (iii) numbered arcs graph (NAG): six arcs, numbered 1–6
    (iv) surface realization:
        1 That  2 Fido  3 barked  4 amused  5 Mary  6 .
        V/V  V/N  N/V  V/V  V\N  N\V
The "/" line connecting the two Vs in the (ii) signature is the same as in the V/V transitions of the (iv) surface realization. The construction is minimal in that the verb of the subordinate clause, here bark, is one-place, while the psych verb of the matrix, here amuse, has to be two-place. The subject/predicate relation between the verb of the subclause and the verb of the matrix clause remains unaffected by grammatical complexity in that the semantic relation between the subject clause and the matrix clause is defined solely between two individual verb proplets.3 As a more complex construction consider the analysis of That the little dog found a bone amused the young woman:

7.1.5 A MORE COMPLEX CLAUSAL SUBJECT CONSTRUCTION

    (i) semantic relations graph (SRG): amuse at the root, with find (the subject clause) and woman below; dog and bone below find, little below dog, and young below woman
    (ii) signature: V over V and N; the lower V over two Ns, with an A below the first; the higher N with an A below
    (iii) numbered arcs graph (NAG): twelve arcs, numbered 1–12
    (iv) surface realization:
        1 That  2 the  3 little  4 dog  5 found  6 a_bone  7 (empty)
        V/V  V/N  N|A  A|N  N/V  V\N  N\V
        8 amused  9 the  10 young  11 woman  12 .
        V/V  V\N  N|A  A|N  N\V

3 For example, instead of a content corresponding to That Fido barked, we could have used the content corresponding to That the cute little dog, which John had bought for Suzy yesterday, had been barking all night for the V/V relation in question; and similarly for the matrix clause.
The V/V transitions 1 and 8 in 7.1.5 are just as elementary and direct as the V/V transitions 1 and 4 in 7.1.4.
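The pre-order numbering of the arcs in a NAG (6.1.4, 7.1.4, 7.1.5) can be generated automatically from an SRG: every edge is traversed once downward and once upward, starting and ending at the root. A minimal sketch (ours):

    # Sketch (ours): pre-order arc numbering of a NAG from an SRG tree.
    def number_arcs(tree):
        node, subtrees = tree
        arcs = []
        for sub in subtrees:
            arcs.append((node, sub[0]))    # downward arc
            arcs.extend(number_arcs(sub))  # traverse the subtree
            arcs.append((sub[0], node))    # upward (return) arc
        return arcs

    # SRG of 7.1.4: amuse over bark (over Fido) and Mary
    srg = ("amuse", [("bark", [("Fido", [])]), ("Mary", [])])
    for i, arc in enumerate(number_arcs(srg), 1):
        print(i, arc)
    # 1 (amuse, bark) ... 6 (Mary, amuse), matching arcs 1-6 of 7.1.4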
7.2 Clausal Object

The verbs which may take a clausal object (also called a sentential complement) include agree, believe, decide, forget, hear, know, learn, remember, regret, say, see, suspect, and want. Called mental state verbs in lexicology, they take two arguments. The subject is the living (e.g. human) experiencer and the object is a clause which describes an event or other cognitive content inducing the mental experience named by the verb. Often, the object may also be an elementary or phrasal noun (intrapropositional transparent), as in John heard Mary and John forgot his keys, or a prepositional phrase. The hear mode derivation of Mary heard that Fido barked is as follows:

7.2.1 INTERPRETING A CLAUSAL OBJECT CONSTRUCTION

    Mary heard that Fido barked

    lexical lookup:
    [noun: Mary | fnc: | prn: ]  [verb: hear | arg: | prn: ]  [verb: v_1 | arg: | fnc: | prn: ]  [noun: Fido | fnc: | prn: ]  [verb: bark | arg: | prn: ]

    syntactic-semantic parsing:
    1  [noun: Mary | fnc: | prn: 30]  [verb: hear | arg: | prn: ]
    2  [noun: Mary | fnc: hear | prn: 30]  [verb: hear | arg: Mary | prn: 30]  [verb: v_1 | arg: | fnc: | prn: ]
    3  [noun: Mary | fnc: hear | prn: 30]  [verb: hear | arg: Mary (v_1 31) | prn: 30]  [verb: v_1 | arg: | fnc: (hear 30) | prn: 31]  [noun: Fido | fnc: | prn: ]
    4  [noun: Mary | fnc: hear | prn: 30]  [verb: hear | arg: Mary (v_1 31) | prn: 30]  [verb: v_1 | arg: Fido | fnc: (hear 30) | prn: 31]  [noun: Fido | fnc: v_1 | prn: 31]  [verb: bark | arg: | prn: ]

    result of syntactic-semantic parsing:
    [noun: Mary | fnc: hear | prn: 30]  [verb: hear | arg: Mary (bark 31) | prn: 30]  [verb: bark | arg: Fido | fnc: (hear 30) | prn: 31]  [noun: Fido | fnc: bark | prn: 31]
The subordinating conjunction that (A.4.4) is the same as the one used in the subject clause analyzed in Sect. 7.1. More specifically, in line 2, the core value of the proplet that, i.e. the substitution value v_1, is copied into the arg slot of the proplet hear, taking second (i.e. object) position. Also, the core value of the proplet hear is copied into the fnc slot of the proplet that (cross-copying). In line 3, v_1 is copied into the fnc slot of the Fido proplet, and the core value Fido is copied into the arg slot of the that proplet. In line 4, the three instances of v_1 are replaced4 by the core value of the proplet bark, which is then discarded.
Consider the content derived in 7.2.1 as a set of proplets shown in the alphabetical order induced by the core values, with added cat and sem features:

7.2.2 CONTENT OF Mary heard that Fido barked

    [verb: bark | cat: #n′ decl | sem: that past | arg: Fido | fnc: (hear 30) | prn: 31]
    [noun: Fido | cat: snp | sem: nm m | fnc: bark | prn: 31]
    [verb: hear | cat: #n′ #a′ decl | sem: past | arg: Mary (bark 31) | prn: 30]
    [noun: Mary | cat: snp | sem: nm f | fnc: hear | prn: 30]

4 Simultaneous substitution (3.6.3, 2).
The bark and hear proplets are connected by an opaque V\V relation, coded by the additional feature [fnc: (hear 30)] of bark and the extrapropositional value in [arg: Mary (bark 31)] of hear. The LAGo think navigation through the proplets of 7.2.2 underlying the speak mode production may be shown as follows:

7.2.3 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 7.2.1

    (i) semantic relations graph (SRG): hear at the root, with Mary and bark below, and Fido below bark
    (ii) signature: V over N and V, the lower V over N
    (iii) numbered arcs graph (NAG): six arcs, numbered 1–6
    (iv) surface realization:
        1 Mary  2 heard  3 that  4 Fido  5 barked  6 .
        V/N  N/V  V\V  V/N  N/V  V\V
The "\" line connecting the two Vs in the signature is the same as in the V\V transitions of the surface realization. As in 7.1.4, there is no empty traversal.5

5 There is the related content corresponding to English Mary heard Fido bark. It is not analyzed as an extrapropositional object clause construction in DBS, but as an intrapropositional bare infinitive, like Mary saw the accident happen (CLaTR 15.6.2). It is similar in Korean (Prof. Kiyong Lee, personal communication), though a case-marking suffix is used, enabling its basic SOV order.
7.3 Clausal Adnominal with Subject Gap

Unlike clausal arguments, clausal modifiers are optional and are not restricted to any subset of the matrix verbs, e.g. psych verbs (Sect. 7.1) or mental state verbs (Sect. 7.2). Clausal modifiers may be adnominal or adverbial. Also called relative clauses, adnominal clauses use the modified noun, also called the head noun, as the implicit subject or object of the subclause.6 As a brief introduction to relative clauses in DBS, let us compare the SRGs of four relative clauses with those of their main clause counterparts:

7.3.1 MAIN CLAUSES AND CORRESPONDING RELATIVE CLAUSES

    [Figure: four pairs of SRGs, with the shared degree sequence shown between each pair.
      The man sleeps.  (11)  The man who sleeps
      The man loves a woman.  (211)  The man who loves a woman
      The man gives the woman a flower.  (3111)  The man who gives the woman a flower
      The woman gives the man a kiss.  (3111)  The man whom the woman gives a kiss
    In each main clause SRG the verb is the root; in each relative clause SRG the modified noun man is the root, directly connected to the verb of the relative clause, and the gapped argument position does not appear.]

6 For example, the following relative clauses on the left have a subject gap, the ones on the right an object gap:
    English: The dog which saw Mary barked.    English: The dog which Mary saw barked.
    Korean: Mary_acc see dog_nom bark.         Korean: Mary_nom see dog_nom bark.
Graph-theoretically (CLaTR Sect. 9.1), the degree sequence of a main clause and its corresponding relative clause are the same (shown between the two). The graphs show that relative clauses use their modified (head noun) as their subject or object. For example, in The man who loves a woman (subject gap) the graph shows no subject, just as in The man whom the woman loves (object gap) the graph would show no object. The "shared" noun, i.e. the modified one, is directly connected to the verb of the relative clause, which is no further than the distance between a noun and the verb in the corresponding main clause construction.
The following relative clause has a subject gap, i.e. the modified dog serves as the implicit subject of the subclause which saw Mary:

7.3.2 INTERPRETATION OF The dog which saw Mary barked

    the dog which saw Mary barked

    lexical lookup:
    [noun: n_1 | fnc: | mdr: | prn: ]  [noun: dog | fnc: | mdr: | prn: ]  [verb: v_1 | arg: ∅ | mdd: | prn: ]  [verb: see | arg: | prn: ]  [noun: Mary | fnc: | prn: ]  [verb: bark | arg: | prn: ]

    syntactic-semantic parsing:
    1  [noun: n_1 | fnc: | mdr: | prn: 32]  [noun: dog | fnc: | mdr: | prn: ]
    2  [noun: dog | fnc: | mdr: | prn: 32]  [verb: v_1 | arg: ∅ | mdd: | prn: 33]
    3  [noun: dog | fnc: | mdr: (v_1 33) | prn: 32]  [verb: v_1 | arg: ∅ | mdd: (dog 32) | prn: 33]  [verb: see | arg: | prn: ]
    4  [noun: dog | fnc: | mdr: (see 33) | prn: 32]  [verb: see | arg: ∅ | mdd: (dog 32) | prn: 33]  [noun: Mary | fnc: | prn: ]
    5  [noun: dog | fnc: | mdr: (see 33) | prn: 32]  [verb: see | arg: ∅ Mary | mdd: (dog 32) | prn: 33]  [noun: Mary | fnc: see | prn: 33]  [verb: bark | arg: | prn: ]

    result of syntactic-semantic parsing:
    [noun: dog | fnc: bark | mdr: (see 33) | prn: 32]  [verb: see | arg: ∅ Mary | mdd: (dog 32) | prn: 33]  [noun: Mary | fnc: see | prn: 33]  [verb: bark | arg: dog | prn: 32]
The conjunction which with the core attribute verb, also called a relative pronoun, lexically (A.6.3) introduces the additional continuation attribute mdd. It characterizes the grammatical role of a relative clause as a modifier and serves to connect the subclause to (the verb of) the higher clause. The core value is the substitution value v_1. The value ∅ in the first arg slot of the conjunction indicates the subject gap.

More specifically, in line 2 the core value of dog is copied into the mdd slot of the verb proplet, and the v_1 value of the verb proplet is copied into the mdr slot of the dog proplet. In line 3, both occurrences of v_1 are replaced by the core value of the proplet see, which is then discarded. In line 4, the object of the relative clause is added by copying Mary as the second value into the arg slot of the verb proplet, and the core value see of the verb proplet into the fnc slot of the proplet Mary. In line 5, the verb of the main clause is added by copying bark into the fnc slot of dog and dog into the arg slot of bark. Consider the content resulting from 7.3.2 as a set of proplets, shown in the order induced by the hear mode derivation, with added cat and sem values:

7.3.3 ADNOMINAL CLAUSE CONSTRUCTION WITH SUBJECT GAP
noun: dog        verb: see          noun: Mary    verb: bark
cat: snp         cat: #n′ #a′ v     cat: snp      cat: #n′ decl
sem: def sg      sem: which past    sem: nm f     sem: past
fnc: bark        arg: ∅ Mary        fnc: see      arg: dog
mdr: (see 33)    mdd: (dog 32)      prn: 33       prn: 32
prn: 32          prn: 33
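The order-free proplet format and its address-based linking lend themselves to a compact implementation sketch. The following Python fragment is hypothetical (invented names, not the original DBS software); it encodes the four proplets of 7.3.3 as records and shows how the opaque V|N relation can be navigated in both directions via (core value, prn) addresses:

    # Minimal sketch (not the original DBS software): proplets as Python dicts,
    # with addresses like (see 33) encoded as ("see", 33) tuples.
    proplets = [
        {"noun": "dog", "cat": "snp", "sem": "def sg", "fnc": "bark",
         "mdr": ("see", 33), "prn": 32},
        {"verb": "see", "cat": "#n' #a' v", "sem": "which past",
         "arg": [None, "Mary"],            # None encodes the subject gap
         "mdd": ("dog", 32), "prn": 33},
        {"noun": "Mary", "cat": "snp", "sem": "nm f", "fnc": "see", "prn": 33},
        {"verb": "bark", "cat": "#n' decl", "sem": "past",
         "arg": ["dog"], "prn": 32},
    ]

    def lookup(core, prn):
        """Retrieve a proplet by its (core value, prn) address."""
        for p in proplets:
            if p["prn"] == prn and core in (p.get("noun"), p.get("verb")):
                return p

    see = lookup(*lookup("dog", 32)["mdr"])   # dog --mdr--> see
    dog = lookup(*see["mdd"])                 # see --mdd--> dog
    assert see["verb"] == "see" and dog["noun"] == "dog"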
The value implicitly filling the subject gap ∅ in the arg attribute of see is specified in 7.3.3 by the feature [mdd: (dog 32)] directly underneath. Next consider the DBS graph underlying the speak mode production:

7.3.4 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 7.3.2

[Graph display: (i) semantic relations graph (SRG) relating bark, dog, see, and Mary, (ii) signature V-N-V-N, (iii) numbered arcs graph (NAG) with arcs 1-6]

(iv) surface realization
 1 The_dog (V/N)   2 which_saw (N|V)   3 Mary (V\N)   4 - (N\V)   5 - (V|N)   6 barked_. (N/V)
The modified dog serves implicitly as the subject of the subclause. In the signature, the extrapropositional relation is shown graphically as opaque V|N (instead of transparent A|N, 6.2.3). There are two empty traversals, 4 and 5.
7.4 Clausal Adnominal with Object Gap

Whether an adnominal clause has an implicit subject or an implicit object is indicated in the English surface by the relative order of the noun and verb following the conjunction (relative pronoun). For example, in which saw Mary the implicit argument is the subject, while in which Mary saw it is the object:

7.4.1 INTERPRETATION OF The dog which Mary saw barked

[Hear mode derivation display: lexical lookup for the dog which Mary saw barked, followed by syntactic-semantic parsing lines 1-5; the resulting proplets are shown in 7.4.2]
The subordinating conjunction which (A.6.3) is the same as the one used in the adnominal clause construction 7.3.2 with a subject gap.
Up to line 3, the derivation resembles 7.3.2. The derivations diverge in line 4: in 7.3.2, the core value of the Mary proplet is copied into the second position. In 7.4.1, in contrast, the core value of the Mary proplet is copied into the first position of the arg slot of see, thus pushing the gap indicator ∅ into second (object) position. The remainders of the two derivations are analogous. Consider the content derived in 7.4.1 as a set of proplets, shown in the alphabetical order induced by the core values, with added cat and sem features:

7.4.2 ADNOMINAL CLAUSE CONSTRUCTION WITH OBJECT GAP

verb: bark       noun: dog        noun: Mary    verb: see
cat: #n′ decl    cat: snp         cat: snp      cat: #n′ #a′ v
sem: past        sem: def sg      sem: nm f     sem: which past
arg: dog         fnc: bark        fnc: see      arg: Mary ∅
prn: 34          mdr: (see 35)    prn: 35       mdd: (dog 34)
                 prn: 34                        prn: 35
The see and dog proplets are connected by an opaque V|N relation, coded by the additional feature [mdd: (dog 34)] of the see proplet and the extrapropositional value (long address) in the [mdr: (see 35)] feature of the dog proplet. The LAGo think navigation through the proplets 7.4.2 underlying the speak mode production may be shown as follows:

7.4.3 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 7.4.1

[Graph display: (i) SRG relating bark, dog, see, and Mary, (ii) signature V-N-V-N, (iii) NAG with arcs 1-6]

(iv) surface realization
 1 The_dog (V/N)   2 which (N|V)   3 Mary (V/N)   4 saw (N/V)   5 - (V|N)   6 barked_. (N/V)
The implicit object of the adnominal subclause is clearly shown. The extrapropositional relation is opaque V|N, in contradistinction to transparent A|N in 6.2.3. Step 5 is an empty traversal.
7.5 Clausal Adverbial

Unlike clausal subjects and objects but like clausal adnominals, clausal adverbials are optional and are not restricted to any subset of matrix verbs. Unlike clausal adnominals, however, clausal adverbials have no gap. In English, they are introduced by subordinating conjunctions like when, before, after, or even though.

7.5.1 INTERPRETATION OF When Fido barked Mary laughed

[Hear mode derivation display: lexical lookup for when Fido barked Mary laughed, followed by syntactic-semantic parsing lines 1-4; the resulting proplets are shown in 7.5.2]
The arg values of adverbial when differ from those of adnominal which (7.4.1) in that they do not contain any ∅ slot fillers (A.6.4).
After completing the adverbial subclause in line 2, the subject Mary of the main clause is added to the set of proplets at the now front (Sect. 11.3), but cannot be related grammatically to any of the proplets there. Instead, it is suspended until the verb of the main clause is available (line 4). Consider the content derived in 7.5.1 in the alphabetical order induced by the core values, with added cat and sem features:

7.5.2 CONTENT OF AN ADVERBIAL CLAUSE CONSTRUCTION

verb: bark         noun: Fido    verb: laugh       noun: Mary
cat: #n′ v         cat: snp      cat: #n′ decl     cat: snp
sem: when past     sem: nm m     sem: past         sem: nm f
arg: Fido          fnc: bark     arg: Mary         fnc: laugh
mdd: (laugh 28)    prn: 27       mdr: (bark 27)    prn: 28
prn: 27                          prn: 28
The bark and laugh proplets are connected by an opaque V|V relation, coded by the additional [mdd: (laugh 28)] feature in bark and the extrapropositional value in the [mdr: (bark 27)] feature of laugh. The subordinating conjunction when is specified in the first sem slot of bark. The LAGo think navigation through the proplets 7.5.2 underlying the speak mode production may be shown as follows:

7.5.3 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 7.5.1

[Graph display: (i) SRG with laugh connected to Mary and bark, and bark connected to Fido, (ii) signature, (iii) NAG with arcs 1-6]

(iv) surface realization
 3 When (V|V)   4 Fido (V/N)   5 barked (N/V)   6 - (V|V)   1 Mary (V/N)   2 laughed_. (N/V)
The |-line connecting the two Vs in the signature is the same as in the V|V transitions of the surface realization. There is an empty traversal in step 6.
In contradistinction to 7.5.1, no suspension7 occurs if the adverbial modifier follows the modified verb. Consider the following example:

7.5.4 INTERPRETATION OF Mary laughed when Fido barked.

[Hear mode derivation display: lexical lookup for Mary laughed when Fido barked, followed by syntactic-semantic parsing lines 1-4]
The DBS graph analysis underlying the speak mode uses the same (i) semantic relations graph (SRG), (ii) signature, and (iii) numbered arcs graph (NAG) as in 7.5.3, but differs in the (iv) surface realization:

7.5.5 SURFACE REALIZATION OF 7.5.4

(iv) surface realization
 1 Mary (V/N)   2 laughed (N/V)   3 when (V|V)   4 Fido (V/N)   5 barked (N/V)   6 . (V|V)
7 Extrapropositional suspension is standard in the extrapropositional coordination (13.3.2, 13.4.2) of SVO and SOV (Greenberg 1963) languages. Intrapropositional suspension is common in SOV languages like Korean. For example, Korean suspends all three arguments in man_nom her_dat apple_acc gave until the verb finally arrives (Lee 2002, 2004). Even though a similar verb-final word order occurs in German subordinate clauses, there is no suspension due to the presence of a clause-initial subordinating conjunction. It contains a substitution value in the verb slot and an open arg slot into which the arguments are collected until the lower verb arrives (Mehlhaff 2007).
Apart from the prn values, the content equals that of 7.5.2 (paraphrase, Sect. 6.6). Given (i) the absence of a suspension and (ii) the navigation following the pre-order numbering of the NAG, this construction may be regarded as unmarked, in contradistinction to 7.5.3, which would be the marked one.
7.6 Subclause Recursion

By using verbs taking object clauses (Sect. 7.2) also in the subclause, the object clause construction may be recursive. For example, in John said that Bill believes that Mary loves Tom the first subclause takes the object clause verb believe. If the verb in the second subclause, love, is replaced by another object clause verb like claim, yet another object clause may be constructed: John said that Bill believes that Mary claims that Suzy loves Tom. Consider the DBS graph analysis of a minimal object clause recursion:8

7.6.1 PRODUCING John said that Bill believes that Mary loves Tom

[Graph display: (i) SRG with say dominating John and believe, believe dominating Bill and love, and love dominating Mary and Tom, (ii) signature, (iii) NAG with arcs 1-12]

(iv) surface realization
 1 John (V/N)   2 said (N/V)   3 that (V\V)   4 Bill (V/N)   5 believes (N/V)   6 that (V\V)   7 Mary (V/N)   8 loves (N/V)   9 Tom (V\N)   10 - (N\V)   11 - (V\V)   12 . (V\V)

8 For the hear mode derivation of an object clause example without recursion see 7.2.1.
The iteration/recursion9 may be continued indefinitely – provided the verb of each object clause takes another object clause as its argument, based on the verbs listed at the beginning of Sect. 7.2. Like any content in the declarative, the recursive object clause construction analyzed in 7.6.1 may serve as the subject (7.1.2) of a suitable verb, as in That John said that Bill believes that Mary loves Tom surprised Priscilla. It may also serve as the object (7.2.1) of a suitable verb, as in Priscilla knows that John said that Bill believes that Mary loves Tom. Furthermore, it may be turned into a Yes/No interrogative, here Did John say that Bill believes that Mary loves Tom? And it may be turned into a WH interrogative, as in Whom did John say that Bill believes that Mary loves? The latter construction has been called an 'unbounded dependency.'

7.6.2 DBS GRAPH ANALYSIS OF AN UNBOUNDED DEPENDENCY

[Graph display: (i) SRG with say dominating John and believe, believe dominating Bill and love, and love dominating Mary and WH, (ii) signature as in 7.6.1, (iii) NAG with arcs 1-12, (iv) surface realization of Whom does John say that Bill believes that Mary loves?]
In DBS, this content has the same (ii) signature as 7.6.1, but the (i) SRG, the (iii) NAG, and the (iv) surface realization are different. In the surface, the dependency is between the initial whom (A.6.3) and the object position of the final subclause; it is unbounded because there is no limit on the number of intervening iterated object clauses. In the (iv) surface realization, there is a multiple visit (CLaTR Sect. 9.3): transition 11 is used twice, on the way from whom to does and on the way from love to ?. The transitions 3, 6, 10, and 11 are empty. The graphical analysis has the following proplet representation:

9 In computer science, every recursive function may be rewritten as an iteration, and vice versa. The choice between the two usually depends on the application and associated considerations of efficiency.
7.6.3 SIMPLIFIED CONTENT REPRESENTATION OF 7.6.2
verb: say                noun: John    verb: believe         noun: Bill
arg: John (believe 3)    fnc: say      arg: Bill (love 1)    fnc: believe
prn: 2                   prn: 2        fnc: (say 2)          prn: 3
                                       prn: 3

verb: love          noun: Mary    noun: WH
arg: Mary WH        fnc: love     fnc: love
fnc: (believe 3)    prn: 1        prn: 1
prn: 1
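As a hypothetical illustration (not part of the original software), the opaque arg chain of this content can be followed mechanically from the top verb down to the WH filler:

    # Sketch: descending through the iterated object clauses of 7.6.3 by
    # following the (core, prn) addresses in the object position of arg.
    content = {
        ("say", 2):     {"arg": ["John", ("believe", 3)]},
        ("believe", 3): {"arg": ["Bill", ("love", 1)]},
        ("love", 1):    {"arg": ["Mary", "WH"]},
    }

    def descend(address):
        """Collect the verbs from the top clause down to the filler."""
        chain, obj = [address[0]], content[address]["arg"][-1]
        while isinstance(obj, tuple):          # a tuple is an embedded clause
            chain.append(obj[0])
            obj = content[obj]["arg"][-1]
        return chain, obj

    print(descend(("say", 2)))                 # (['say', 'believe', 'love'], 'WH')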
The order of the proplets follows the NAG in 7.6.2. The content is opaque because the verb proplets believe and love require additional fnc attributes; the extrapropositional nature of the subclauses is shown by different prn values. In the DBS hear mode derivation, the unbounded dependency has the form of an unbounded suspension:

7.6.4 Whom does John say that Bill believes that Mary loves?

[Hear mode derivation display: lexical lookup and syntactic-semantic parsing lines 1-9]
The suspension is in line 1: there is no cross-copying between the proplets representing whom and does; only different prn values are assigned. In lines 2 to 7, the object clause construction plays out in the same way as in 7.6.1. In line 8 the suspension is resolved: the wh_1 value of the whom proplet is copied into the second arg slot of the second that proplet, and the v_3 value of the second that proplet is copied into the fnc slot of the whom proplet. In addition, the usual cross-copying of core values applies between the proplets representing the second that and Mary. Finally, simultaneous substitution replaces all occurrences of the variable v_3 with the core value of the love proplet, which is then discarded (line 9). Because the duration of the suspension is only determined at the end of the input, the time-linear hear mode derivation is subject to the following ambiguity structure:

7.6.5 AMBIGUITY STRUCTURE OF AN UNBOUNDED SUSPENSION

A  Whom | does John love?                                                          prn: 1 | 1
B  Whom | does John say | that Bill loves?                                         prn: 1 | 2 | 1
C  Whom | does John say | that Bill believes | that Mary loves?                    prn: 1 | 2 | 3 | 1
D  Whom | does John say | that Bill believes | that Mary claims | that Suzy loves? prn: 1 | 2 | 3 | 4 | 1
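The clause-level prn assignment of the hypotheses A-D can be sketched as follows (a hypothetical helper, for illustration only): the clause containing whom keeps prn 1, while every intervening object clause opens a new proposition:

    # Sketch: prn assignment for one hypothesis of the unbounded suspension.
    def prn_hypothesis(clauses):
        """whom and the final (gap-containing) clause keep prn 1;
        intervening object clauses get incremented prn values."""
        prns, next_prn = {"Whom": 1}, 2
        for clause in clauses[:-1]:            # intervening object clauses
            prns[clause] = next_prn
            next_prn += 1
        prns[clauses[-1]] = 1                  # clause containing the gap
        return prns

    # Hypothesis C:
    print(prn_hypothesis(["does John say", "that Bill believes",
                          "that Mary loves?"]))
    # {'Whom': 1, 'does John say': 2, 'that Bill believes': 3,
    #  'that Mary loves?': 1}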
Whom must belong to a proposition with a two-place (transitive) or three-place (ditransitive) verb, because it is marked morphologically as oblique. Viewing the sentence in isolation, the prn value of the WH proplet is clear: it can only be 1 because whom is initial. The ambiguity is whether the next word, here does, should be assigned the same prn value (line A) or whether the prn value should be incremented (line B). In line A, whom is the object of an elementary proposition with a transitive verb which does not take a clausal object. Thus all proplets in line A share the prn value 1. In line B, in contrast, the matrix verb takes a clausal object to which whom belongs. Thus, does John say has the prn value 2 and Bill loves whom has the prn value 1. The construction in line B terminates because the verb of the object clause does not take a clausal object. Line C branches off line B because the first object clause uses a verb which takes a second object clause as its oblique argument. Thus, does John say continues to have the prn value 2, but the new object clause Bill believes X has the new prn value 3, whereby X is that Mary loves whom with the prn value 1. The construction in C terminates because the verb of the object clause does not take a clausal object.
Line D branches off C because the second object clause uses a verb which takes a third object clause as its argument. Thus, does John say continues to have the prn value 2, Bill believes X continues to have the prn value 3, but the new object clause that Mary claims X has the new prn value 4, whereby X is that Suzy loves whom with the prn value 1. The construction in D terminates because the verb of the object clause does not take a clausal object.

The point is that even though an unbounded suspension (i) may be continued indefinitely and (ii) systematically causes a syntactic ambiguity, it does not increase the computational complexity of natural language (FoCL Sect. 11.5). This is because one of the two branches of each time-linear ambiguity split terminates, each similar to a garden path sentence (FoCL 11.3.6).

Another example of subclause recursion is relative clauses with subject gaps (Sect. 7.3). An English example is The dog heard the man who loves the woman who fed the child., which has the following graph analysis:

7.6.6 DBS GRAPH ANALYSIS OF A RELATIVE CLAUSE RECURSION

[Graph display: (i) SRG with hear dominating dog and man, man modified by love, love dominating woman, woman modified by feed, and feed dominating child, (ii) signature, (iii) NAG with arcs 1-12]

(iv) surface realization (English, subject gap)
 1 The_dog (V/N)   2 heard (N/V)   3 the_man (V\N)   4 who_loves (N|V)   5 the_woman (V\N)   6 who_fed (N|V)   7 the_child (V\N)   8 - (N\V)   9 - (V|N)   10 - (N\V)   11 - (V|N)   12 . (N\V)
After arc 3 realizes the head noun man, the navigation continues on the left (arcs 4–7), realizing almost all surfaces. Then it moves up on the right with empty traversals (arcs 8–11) all the way to the root (verb) to realize the period in arc 12. The subclause recursion may be continued indefinitely, processing limits and considerations of style permitting.
8. Intrapropositional Coordination
Coordination (parataxis) is the other primary semantic relation of structure in natural language, besides functor-argument (hypotaxis). The expressions concatenated in a coordination are called conjuncts. The relations between conjuncts may be conjunctive, disjunctive, or adversative, as specified by the function words and, or, and but (Sect. A.7). Called coordinating conjunctions, they are used at all three levels of grammatical complexity, in contradistinction to the subordinating conjunctions of functor-argument,1 which are used at the clausal level only (extrapropositionally).
8.1 Comparing Simple and Complex Coordination

In natural language, two basic kinds of coordination must be distinguished, called simple and complex.2 Complex coordination is also known as gapping. Further distinctions are between noun, verb, and adjective conjuncts in simple coordinations, and between subject gapping, predicate gapping, and object gapping in complex coordination. Simple and complex coordinations raise the question of their similarities and their differences. There are examples in which the word forms of a complex coordination may be re-assembled into a simple coordination:

8.1.1 SAME WORD FORMS IN A SIMPLE AND A COMPLEX COORDINATION

(i) Simple coordination
Bob bought, cooked, and ate a pizza, a hamburger, and a taco.
(ii) Complex coordination
Bob bought a pizza, # cooked a hamburger, and # ate a taco.

1 Examples of subordinating conjunctions are that (clausal subject, Sect. 7.1; clausal object, Sect. 7.2), which (clausal adnominal, Sects. 7.3 and 7.4), and when (clausal adverbial, Sect. 7.5).
2 The terminological distinction between simple and complex coordination follows Greenbaum and Quirk (1990, pp. 271 ff.). Within nativism, simple coordination is called "constituent coordination," and complex coordination is called "non-constituent coordination."
The two examples are assembled from the same set of word forms. They differ, however, in meaning: while (i) asserts that three actions were performed on three kinds of food, (ii) associates each action with a single kind of food. Another difference is that a complex coordination using words other than those in 8.1.1 may be analogous to (ii) without a corresponding simple coordination like (i) making any sense:

8.1.2 COMPLEX COORDINATION WITHOUT SIMPLE COUNTERPART

(i) Simple coordination
*Bob ate, walked, and read an apple, his dog, and the paper.
(ii) Complex coordination
Bob ate an apple, # walked his dog, and # read the paper.

As in 8.1.1, the complex coordination is an instance of subject gapping. The most important difference is that the relations constituting a complex coordination occur only intrapropositionally, in contrast to simple coordination, which is the means for the extrapropositional concatenation of propositions in a text or dialog (9.2.1). These dissimilarities show that simple and complex coordinations are essentially different, mapping into different proplet sets – unlike the a- and the b-variant in 6.1.4, which show unmarked and marked surface variants of English for the same content (paraphrase).

Regarding frequency, März (2005) determined the following results for German, using the LIMAS corpus.3 Of a total of 77 500 sentences, 27 500 contain the coordinating conjunctions und (and), oder (or), or aber (but). The conjunction und was by far the most frequent, used in 21 420 sentences. Regarding the coordination kind, 57% consist of nouns and 24.3% of verbs. Of the noun coordinations, 25% function in prepositional phrases, 13% as subjects, 7% as objects, and 6.5% in genitives. Gapping occurs in only 0.1% of the coordinations, with no example of object gapping (Sect. 8.6). Given that all of these constructions provide clear grammaticality intuitions and may be found in a wide range of languages (such as Chinese, Czech, Georgian, Korean, Russian, and Tagalog),4 the least frequent constructions are just as relevant for a systematic linguistic analysis as the most frequent ones.

3 The LIMAS corpus (Heß, Lenders, et al. 1983) was designed as a balanced and representative corpus of German, built in analogy to the Brown corpus (Kučera and Francis 1967).
4 I would like to thank Hsiao-Yun Huang-Senechal (China), Marie Hučinova (Czech Republic), Sofia Tkemaladze-Fornwald (Georgia), Soora Kim (South Korea), Katja Kalender (Russia), and Guerly Söllch (The Philippines) for their native speaker grammaticality judgments during a seminar at CLUE.
The intra- and the extrapropositional uses of simple coordination differ in that the connections within an intrapropositional coordination are duplex, while the connections within an extrapropositional coordination are simplex. The reason is the speak mode: once an intrapropositional coordination has been traversed, the navigation has to get back to the top verb (8.2.3, 8.2.7, 8.3.4), while an extrapropositional coordination simply proceeds from one proposition to the next (9.2.3). Intrapropositional duplex coordinations are built by cross-copying into the nc (next conjunct) and pc (previous conjunct) attributes, as in the following content corresponding to the man, the woman, and the child:

8.1.3 SIMPLE INTRAPROPOSITIONAL NOUN COORDINATION

noun: man&       noun: woman      noun: child
cat: snp         cat: snp         cat: snp
sem: def sg m    sem: def sg f    sem: and def sg −mf
fnc:             fnc:             fnc:
mdr:             mdr:             mdr:
nc: woman        nc: child        nc:
pc:              pc: man&         pc: woman
prn: 26          prn: 26          prn: 26
The nc attribute of the man proplet specifies woman as the next conjunct; the pc and nc attributes of the woman proplet specify man as the previous and child as the next conjunct; and the pc attribute of the child proplet specifies woman as the previous conjunct. The elements of the coordination are held together by a common proposition number, [prn: 26]. All instances of the core value of the initial conjunct, here man, are marked with &. A language-independent variant of and is written into the sem slot of the last conjunct, recording the semantically relevant choice of the coordinating conjunction.

The integration of a coordination into the proposition's functor-argument is based on using only the initial conjunct for establishing the grammatical relation in question. This is shown by the following example, in which the coordination the man, the woman, and the child (8.1.3) is combined with the verb sleep in the subject/predicate relation, forming an elementary proposition:

8.1.4 INTEGRATING A COORDINATION INTO FUNCTOR-ARGUMENT

noun: man&       noun: woman      noun: child            verb: sleep
cat: snp         cat: snp         cat: snp               cat: #n−s3 v
sem: def sg m    sem: def sg f    sem: and def sg −mf    sem: pres
fnc: sleep       fnc:             fnc:                   arg: man&
mdr:             mdr:             mdr:                   mdr:
nc: woman        nc: child        nc:                    nc:
pc:              pc: man&         pc: woman              pc:
prn: 26          prn: 26          prn: 26                prn: 26
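The cross-copying that builds 8.1.3 and its integration in 8.1.4 may be sketched in code (hypothetical helper names, not the original DBS operations):

    # Sketch: building a simple noun coordination by nc/pc cross-copying.
    def coordinate(cores, cnj, prn):
        cores = [cores[0] + "&"] + cores[1:]       # &-mark the initial conjunct
        ps = [{"noun": c, "sem": "", "fnc": "", "nc": "", "pc": "", "prn": prn}
              for c in cores]
        for left, right in zip(ps, ps[1:]):        # cross-copy nc and pc values
            left["nc"], right["pc"] = right["noun"], left["noun"]
        ps[-1]["sem"] = cnj                        # conjunction in last conjunct
        return ps

    def integrate(ps, verb):
        """Only the initial conjunct codes the subject/predicate relation."""
        ps[0]["fnc"], verb["arg"] = verb["verb"], [ps[0]["noun"]]

    nouns = coordinate(["man", "woman", "child"], "and", 26)
    sleep = {"verb": "sleep", "arg": [], "prn": 26}
    integrate(nouns, sleep)
    assert sleep["arg"] == ["man&"] and nouns[1]["pc"] == "man&"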
The initial conjunct has an empty pc slot, the intermediate conjunct has nonempty nc and pc slots, and the final conjunct has an empty nc slot. In English, simple intrapropositional noun and verb coordinations have an explicit conjunction, as in Tom, Dick, and Harry or bought, peeled, and cooked. In coordinations of adnominals, as in the smart, pretty, rich girl, and extrapropositional coordinations, as in Julia slept. Susanne dreamed. John sang., in contrast, an explicit conjunction is optional. If there is an explicit conjunction, it is written into the sem slot of the last conjunct, where it will be needed for realization in the speak mode.

Using only the &-marked initial conjunct for establishing a functor-argument relation provides a coding of grammatical relations which is as complete as necessary and as parsimonious as possible – not only for modeling the time-linear interpretation and production of the surface, but also for supporting the required retrieval. For example, even though the sleep proplet in 8.1.4 does not explicitly specify child as one of its arg values, and the child proplet does not explicitly specify sleep as its fnc value, the correct answer to the query Did a child sleep? relative to a word bank containing 8.1.4 should be yes. The method to answer the query is to look through all sleep proplets (Sect. 5.1) and find the ones with the arg value child. Yet if child happens to be a non-initial conjunct, it will not be specified directly in the sleep proplet. However, if the proplet accessed via the core value of the verb, here sleep, has an &-marked arg value, here man&, this will trigger a subnavigation through all the non-initial conjuncts of the associated coordination, ensuring that the child conjunct in 8.1.4 will not be overlooked.
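The retrieval just described may be rendered as a sketch (hypothetical code; the actual word bank organization is more elaborate, cf. Sect. 5.1):

    # Sketch: answering "Did a child sleep?" against the content of 8.1.4
    # by subnavigating the nc chain of an &-marked arg value.
    word_bank = {
        "man&":  {"fnc": "sleep", "nc": "woman", "prn": 26},
        "woman": {"fnc": "", "nc": "child", "prn": 26},
        "child": {"fnc": "", "nc": "", "prn": 26},
        "sleep": {"arg": ["man&"], "prn": 26},
    }

    def verb_has_arg(verb, noun):
        """Check arg directly; on an &-marked value, follow the nc chain."""
        for arg in word_bank[verb]["arg"]:
            if arg == noun:
                return True
            if arg.endswith("&"):              # initial conjunct: subnavigate
                conjunct = word_bank[arg]
                while conjunct["nc"]:
                    if conjunct["nc"] == noun:
                        return True
                    conjunct = word_bank[conjunct["nc"]]
        return False

    print(verb_has_arg("sleep", "child"))      # True, though arg lists only man&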
8.2 Simple Coordination of Subject and Object Nouns

Having characterized the grammatical relations of a functor-argument containing a simple coordination (8.1.4), let us turn to modeling the cycle of natural language communication using expressions of English containing simple coordinations as subjects and as objects. The task consists in (i) mapping the surface into a content (hear mode interpretation), (ii) representing the content as an order-free set of proplets using the alphabetical order induced by the core values, and (iii) realizing the surface of English based on a think mode navigation along the semantic relations between proplets (speak mode production). The English surface The man, the woman, and the child slept. (8.1.4) consists of a predicate (here the elementary form of a one-place verb) and a simple noun coordination as subject. It has the following hear mode derivation:
8.2.1 SIMPLE NOUN COORDINATION IN SUBJECT POSITION

[Hear mode derivation display: lexical lookup for the man, the woman, and the child slept, followed by syntactic-semantic parsing lines 1-7; the result is shown in 8.2.2]
In line 1, the lexical proplets the and man are fused. In line 2, the core value n_2 of the second the is copied into the nc slot of man, and the core value of man is copied into the pc slot of the second the. This addition of another noun makes the man proplet an initial conjunct, reflected by &-marking its core value in line 3; also, the two occurrences of the substitution value n_2 are replaced by the core value of the lexical proplet woman. In line 4, the
conjunction is added (suspension). In line 5, its core value is written (8.2.2) into the sem slot of the third the, i.e. the beginning of the last conjunct, after which the conjunction proplet is discarded. Also, the core value n_3 is copied into the nc slot of woman, and the core value of woman is copied into the pc slot of the. In line 6, the two occurrences of n_3 are substituted by the core value of the lexical proplet child, which is then discarded. In line 7, the core value of the initial conjunct, i.e. man&, is copied into the arg slot of sleep, and the core value of sleep is copied into the fnc slot of the initial conjunct.5

Consider the content derived in 8.2.1 in the alphabetical order induced by the core values, with added cat and sem values:

8.2.2 PROPLET REPRESENTATION OF THE CONTENT DERIVED IN 8.2.1

noun: child            noun: man&       verb: sleep      noun: woman
cat: snp               cat: snp         cat: #n′ decl    cat: snp
sem: and def sg −mf    sem: def sg m    sem: past        sem: def sg f
fnc:                   fnc: sleep       arg: man&        fnc:
nc:                    nc: woman        prn: 26          nc: child
pc: woman              pc:                               pc: man&
prn: 26                prn: 26                           prn: 26

The initial conjunct man has an empty pc slot, a non-empty nc slot, and specifies sleep as its fnc value. The non-initial conjuncts woman and child have non-empty pc slots and empty fnc slots. Their fnc value may be obtained from the initial conjunct, to which the non-initial conjuncts are connected via chains of nc and pc values. The verb proplet sleep specifies only the initial conjunct man& in its arg slot. The conjunction of the coordination, and, is specified in the sem slot of the last conjunct, child. As an intrapropositional construction, the proplets of the coordination and the verb proplet all have the same prn value, here 26.

The think mode navigation through an unordered set of proplets representing the content of a declarative sentence begins with the top verb. While the verbs of subclauses are recognized by their additional fnc or mdd attributes, the top verb has only one continuation attribute, namely arg. In 8.2.2, the navigation uses the first arg value of the top verb to travel to the subject, here man. It is &-marked as the beginning of a coordination. From there, the order of the rest of the coordination may be reconstructed by following the chain of nc values. The end of the coordination is the proplet with an empty nc slot.
5 Given their disambiguating function, commata should eventually be represented as lexical proplets and added by suitable operations. For a syncategorematic treatment of commata see NEWCAT.
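The navigation described above may be sketched as follows (hypothetical helpers; the actual LAGo think operations are defined elsewhere in this book):

    # Sketch: think mode navigation through the order-free content of 8.2.2,
    # reconstructing the conjunct order from the nc values.
    content = {
        "child": {"nc": "", "pc": "woman", "fnc": ""},
        "man&":  {"nc": "woman", "pc": "", "fnc": "sleep"},
        "sleep": {"arg": ["man&"]},
        "woman": {"nc": "child", "pc": "man&", "fnc": ""},
    }

    def navigate(top_verb):
        """Yield core values: top verb, then the coordination in order."""
        yield top_verb
        conjunct = content[top_verb]["arg"][0]     # &-marked initial conjunct
        while conjunct:
            yield conjunct
            conjunct = content[conjunct]["nc"]     # empty nc ends the chain

    print(list(navigate("sleep")))   # ['sleep', 'man&', 'woman', 'child']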
The speak mode (re)production of the sentence from the set of proplets derived in 8.2.1 is based on the following DBS graph analysis.

8.2.3 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 8.2.1

[Graph display: (i) SRG with sleep dominating the coordination man-woman-child, (ii) signature V-N-N-N, (iii) NAG with arcs 1-6]

(iv) surface realization
 1 The_man (V/N)   2 the_woman (N−N)   3 and_the_child (N−N)   4 -   5 -   6 slept_. (N/V)
The intrapropositional coordination is transparent because no additional attributes are required for the conjunct proplets man, woman, child, and all continuation values are standard. Return to the verb requires two empty traversals, 4 and 5, based on the pc values in 8.2.2.

While a noun coordination in subject position is built first and then combined with the verb, a noun coordination in object position is attached to the verb incrementally (just as in phrasal subject and object nouns, 6.1.1). This is indicated by the following schematic characterization of the grammatical relations in the example The man, the woman, and the child bought an apple, a pear, and a peach:

8.2.4 SIMPLE COORDINATIONS OF SUBJECT AND OBJECT NOUNS

noun: man&   noun: woman   noun: child   verb: buy           noun: apple&   noun: pear   noun: peach
fnc: buy     fnc:          fnc:          arg: man& apple&    fnc: buy       fnc:         fnc:
nc:          nc:           nc:                               nc:            nc:          nc:
pc:          pc:           pc:                               pc:            pc:          pc:
When the subject coordination is combined with the verb, the initial conjunct has been &-marked, here [noun: man&], and has no pc value. Thus it is clear which proplet of the coordination must serve as the initial arg value of the verb. When the object noun is started, in contrast, it is open whether or not it will turn out to be a coordination. In any case, the first object noun encountered will be the proper value for the second arg slot of the verb. If the derivation continues with a conjunct, the core value of the initial object conjunct and all its copies are &-marked. The following example shows the incremental build-up of a noun coordination serving as the object, using a name as subject:
8.2.5 SIMPLE NOUN COORDINATION IN OBJECT POSITION

[Hear mode derivation display: lexical lookup for John bought an apple, a pear, and a peach, followed by syntactic-semantic parsing lines 1-8; the result is shown in 8.2.6]
The first three proplets in line 4 are the same as they would be in the derivation of John bought an apple. The fourth proplet, however, continues the object noun apple as a noun coordination: the lexical proplet a(n) attaches to the apple proplet by copying the substitution value n_2 into the nc slot of apple and the core value of apple into the pc slot of a(n). The remainder of the derivation is analogous to that of the subject coordination in 8.2.1, lines 1–6.
In the result, apple is &-marked as an initial conjunct and buy is specified as its fnc value (similar to the subject coordination 8.2.1). The non-initial noun conjuncts pear and peach have no fnc value. It may be obtained from the initial conjunct, however, to which the non-initial conjuncts are connected via chains of nc and pc values. The verb proplet contains the core value of the initial nominal conjunct, here apple&, as the object in its second arg slot. Consider the content derived in 8.2.5 in the alphabetical order induced by the core values, with added cat and sem values:

8.2.6 PROPLET REPRESENTATION OF THE CONTENT DERIVED IN 8.2.5

noun: apple&     verb: buy            noun: John   noun: peach          noun: pear
cat: snp         cat: #n′ #a′ decl    cat: snp     cat: snp             cat: snp
sem: indef sg    sem: past            sem: nm m    sem: and indef sg    sem: indef sg
fnc: buy         arg: John apple&     fnc: buy     fnc:                 fnc:
nc: pear         prn: 27              prn: 27      nc:                  nc: peach
pc:                                                pc: pear             pc: apple&
prn: 27                                            prn: 27              prn: 27
The conjunction and was added to the sem slot of the last conjunct in line 7 of 8.2.5. The construction is transparent because no additional attributes are required for the conjunct proplets apple, pear, and peach, and all continuation values are standard. The speak mode (re)production of this sentence from the set of proplets derived in 8.2.5 is based on the following DBS graph analysis.

8.2.7 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 8.2.5

[Graph display: (i) SRG with buy dominating John and the coordination apple-pear-peach, (ii) signature, (iii) NAG with arcs 1-8]

(iv) surface realization
 1 John (V/N)   2 bought (N/V)   3 an_apple (V\N)   4 a_pear (N−N)   5 and_a_peach (N−N)   6 -   7 -   8 . (N\V)
The navigation moves from the predicate to the subject (arc 1), back to the predicate (arc 2), on to the object (arc 3), and through the conjunction (arcs 4 and 5). Returning to the predicate via two empty traversals (arcs 6 and 7), the navigation realizes the period (arc 8) and prepares for a possible continuation to the predicate of a next proposition (extrapropositional coordination).
8.3 Simple Coordination of Verbs and of Adjectives

The intrapropositional coordination of verbs is restricted insofar as their valency structure has to be the same. For example, the verb forms slept (one-place), wrote (two-place), and gave (three-place) cannot be combined into a coordination because they differ in their valency structure. Thus, *Peter slept, bought, and gave Mary a letter is ungrammatical because the relation between the valency positions in the respective verbs and their fillers is not properly defined.6 In contrast, Peter wrote, sealed, and mailed the letter is well-formed because the three verbs are all two-place.

With respect to other properties, however, the verbs in an intrapropositional coordination may vary. For example, Peter has written, is sealing, and will mail the letter, though contrived, is grammatical. The variation in tense in the coordination of verbs corresponds to a possible variation in number in the coordination of nouns, as shown by the well-formedness of Peter ate some müsli, one boiled egg, and two apples.

While the initial conjunct of a noun coordination serves as a value in the arg slot of the verb (8.2.2), the initial conjunct of a verb coordination serves as the value of the fnc slot of the noun(s). This partial inversion of the grammatical relations in the coordination of verbs versus nouns is shown by the non-empty nc, the empty pc, and the non-empty arg slot of the initial verb conjunct, as well as the non-empty pc and the empty arg slot of the non-initial verb conjuncts. The core value of the initial conjunct and all its copies are marked with &, as in the following schematic characterization of John bought, cooked, and ate the pizza, i.e. a declarative sentence with a coordination of three verbs.

8.3.1 SIMPLE COORDINATION OF TWO-PLACE VERBS

noun: John   verb: buy&         verb: cook     verb: eat   noun: pizza
fnc: buy&    arg: John pizza    arg:           arg:        fnc: buy&
nc:          nc: cook           nc: and eat    nc:         nc:
pc:          pc:                pc: buy&       pc: cook    pc:

The nouns serving as the arguments of the two-place verbs in the coordination, namely John and pizza, contain only the first conjunct, buy&, in their fnc slot. Of the verbs in the coordination, namely buy, cook, and eat, only the first contains John and pizza in its arg slot. In this way, functor-argument and coordination are properly related, yet remain unaffected by each other. The example has the following time-linear hear mode derivation:

6 Peter bought and gave Mary a book is acceptable, while ?Peter bought and gave a book to Mary is questionable.
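The valency restriction stated at the beginning of this section amounts to a simple compatibility check, sketched here with hypothetical arity values standing in for the cat-coded valency positions:

    # Sketch: verbs may be coordinated only if their valency structures agree.
    VALENCY = {"sleep": 1, "write": 2, "seal": 2, "mail": 2, "give": 3}

    def may_coordinate(verbs):
        """True if all verbs share the same number of valency positions."""
        return len({VALENCY[v] for v in verbs}) == 1

    print(may_coordinate(["write", "seal", "mail"]))   # True
    print(may_coordinate(["sleep", "write", "give"]))  # False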
8.3.2 VERB COORDINATION: John bought, cooked, and ate the pizza

[Hear mode derivation display: lexical lookup for John bought, cooked, and ate the pizza, followed by syntactic-semantic parsing lines 1-6; the result is shown in 8.3.3]
In line 2, the arrival of a second verb triggers its attachment to the first verb as a verbal conjunct, coded by copying the core value of buy into the pc slot of cook and the core value of cook into the nc slot of buy. In line 3, the conjunction is added (suspension); in line 4, its core value is written into the sem slot (not shown, see 8.3.3) of the last verb conjunct eat, after which the conjunction proplet is discarded. Also, the core values of eat and cook are cross-copied into their respective nc and pc slots. In line 5, the core value buy& of the
initial verb conjunct is copied into the fnc slot of the beginning of the object, i.e. the determiner the, and the substitution value n_1 is copied into the second arg slot of buy. In line 6, the two occurrences of n_1 are replaced by the core value of the lexical proplet pizza, which is then discarded. Consider the content derived in 8.3.2 in the alphabetical order induced by the core values, with added cat and sem values.

8.3.3 PROPLET REPRESENTATION OF A SIMPLE VERB COORDINATION

verb: buy&           verb: cook      verb: eat        noun: John   noun: pizza
cat: #n′ #a′ decl    cat: n′ a′ v    cat: n′ a′ v     cat: snp     cat: snp
sem: past            sem: past       sem: and past    sem: nm m    sem: def sg
arg: John pizza      arg:            arg:             fnc: buy&    fnc: buy&
nc: cook             nc: eat         nc:              prn: 28      prn: 28
pc:                  pc: buy&        pc: cook
prn: 28              prn: 28         prn: 28
The buy proplet is &-marked as the initial verb conjunct, and specifies John and pizza as its arg values. The non-initial verb conjuncts cook and eat have no arg values. They may be obtained from the initial conjunct, however, to which the non-initial conjuncts are connected via chains of nc and pc values. The noun proplets John and pizza specify only buy& as their fnc value, marking the presence of non-initial verb conjuncts for possible querying. The conjunction and appears in the sem slot of the last conjunct, eat, making it available for a possible speak mode production. Consider the LAGo think navigation through the proplets 8.3.3:

8.3.4 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 8.3.2

[Graph display: (i) SRG with the coordination buy-cook-eat and buy dominating John and pizza, (ii) signature, (iii) NAG with arcs 1-8]

(iv) surface realization
 1 John (V/N)   2 bought (N/V)   5 cooked (V−V)   6 and_ate (V−V)   7 -   8 -   3 the_pizza (V\N)   4 . (N\V)
The navigation moves from the predicate to the subject (arc 1), back to the predicate (arc 2), and on through the predicate conjunction (arcs 5 and 6). After returning empty to the initial predicate (arcs 7 and 8), the object is realized (arc 3). The final return to the predicate (arc 4) realizes the period and prepares for a possible continuation to the predicate of a next proposition.
The third proplet kind participating in simple coordination is the adjective. For brevity, the graph analyses of the hear and the speak mode are omitted.7 Limiting the analyses to contents, we begin with John loves a smart, pretty, and rich girl, i.e. a coordination of elementary adnominal adjectives. The proplet set is shown in the order induced by the hear mode derivation:

8.3.5 SIMPLE COORDINATION OF ADNOMINAL ADJECTIVES

noun: John   verb: love        noun: girl     adj: smart&   adj: pretty   adj: rich
fnc: love    arg: John girl    fnc: love      sem: pad      sem: pad      sem: and pad
mdr:         mdr:              mdr: smart&    mdd: girl     mdd:          mdd:
nc:          nc:               nc:            nc: pretty    nc: rich      nc:
pc:          pc:               pc:            pc:           pc: smart&    pc: pretty
prn: 21      prn: 21           prn: 21        prn: 21       prn: 21       prn: 21
The adjective conjunct smart is &-marked as initial, and its modified, here the noun girl, is specified as its mdd value. The non-initial adjectival conjuncts pretty and rich have no mdd value. It is obtainable from the initial conjunct, to which the non-initial conjuncts are connected via their nc and pc values. The intrapropositional coordination of adnominal modifiers resembles the extrapropositional coordination of propositions in that the conjunction may be omitted. Here, however, it is explicit (and) and written into the sem slot of the last conjunct rich. Next, consider the content of John talked slowly, gently, and seriously, i.e. a coordination of adverbials, as a proplet set in the order induced by the hear mode derivation:

8.3.6 SIMPLE COORDINATION OF ADVERBIAL ADJECTIVES

noun: John   verb: talk     adj: slow&    adj: gentle    adj: serious
fnc: talk    arg: John      sem: pad      sem: pad       sem: and pad
mdr:         mdr: slow&     mdd: talk     mdd:           mdd:
nc:          nc:            nc: gentle    nc: serious    nc:
pc:          pc:            pc:           pc: slow&      pc: gentle
prn: 29      prn: 29        prn: 29       prn: 29        prn: 29
The conjunct slow is &-marked as initial, and the modified (here the verb talk) is specified by its mdd value. The non-initial adjective conjuncts have no mdd value. It is obtainable from the initial conjunct, however, to which they are connected via chains of nc and pc values. The modified proplet talk specifies the initial adverbial conjunct in its mdr slot. The modifier construction is transparent because no additional attributes are required for the conjunct proplets slow, gentle, and serious, and all continuation values are standard.

7 Related examples are shown in Sects. 15.2–15.4 for constructions combining elementary and phrasal modifiers.
8.4 Overview of Complex Coordination (Gapping)

A veritable pinnacle of syntactic-semantic complexity is gapping (Kapfer 2010). In contrast to clausal arguments and modifiers, which are extrapropositional8 (Chap. 7), gapping is an intrapropositional functor-argument9 construction: one or more gapping parts share a subject, a predicate, or an object with a complete sentence; in addition, there is noun gapping, in which several adnominals share a noun (8.6.4–8.6.6). In the spoken language of everyday use, gapping rarely occurs.10 Yet native speakers from a wide range of languages (CLaTR 3.5.2) affirm with confidence that the various forms of gapping exist in their native tongue. We may conclude that gapping mirrors content structures which are shared by many natural languages (universal?). The most basic structural distinction between different kinds of gapping in English is whether the filler precedes or follows the gap(s):

8.4.1 EXAMPLES OF COMPLEX COORDINATION

1. Subject gapping: complex predicate-object coordination (8.4.3, 8.5.1)
   Bob bought an apple, # peeled a pear, and # ate a peach.
2. Predicate gapping: complex subject-object coordination (8.4.2, 8.5.4)
   Bob bought an apple, Jim # a pear, and Bill # a peach.
3. Object gapping: complex subject-predicate coordination (8.4.4, 8.6.1)
   Bob bought #, Jim peeled #, and Bill ate a peach.

The gap is indicated by #. In subject and predicate gapping, the gapped item precedes the gap; in object gapping,11 the gapped item follows the gap.12

The syntactic-semantic analysis in DBS is based on a different coding of the interconjunct relations. In predicate gapping, it consists in writing multiple subject-object pairs into the arg slot of the sole verb and writing the core value of the verb into the fnc slot of the subject and object proplets constituting the complex conjuncts (8.5.4–8.5.6):

8 The propositions of an extrapropositional construction have different prn values, while the proplets of an intrapropositional construction share a common prn value.
9 This analysis of gapping differs from the 1st Ed., Sects. 8.4–8.6, which ran the gapping relation via the coordination attributes nc and pc. Here, in contrast, the gapping relations are run via the continuation attributes of functor-argument, i.e. fnc, arg, mdd, and mdr. The reason for the change is predicate (or verb) gapping, which is not susceptible to being treated as a coordination.
10 Regarding the LIMAS corpus, März (2005) estimates that only 0.25% of the sentences contain gapping constructions.
11 This construction has also been called "right-node-raising." Jackendoff (1972); Hudson (1976); Sag, Gazdar, et al. (1985). See Van Oirsouw (1987) for an overview of the nativist literature on gapping.
12 In noun gapping (8.6.4, 8.6.5) the gapped item also follows the gap.

8.4.2 PROPLET ANALYSIS OF PREDICATE GAPPING

noun: Bob   verb: buy         noun: apple   noun: Jim   noun: pear   noun: Bill   noun: peach
fnc: buy    arg: Bob apple    fnc: buy      fnc: buy    fnc: buy     fnc: buy     fnc: buy
prn: 31          Jim pear     prn: 31       prn: 31     prn: 31      prn: 31      prn: 31
                 Bill peach
            prn: 31
The intrapropositional nature of gapping is formally indicated by using only one prn value, here 31. This approach extends to subject (8.5.1–8.5.3) and object (8.6.1–8.6.3) gapping:

8.4.3 PROPLET ANALYSIS OF SUBJECT GAPPING

noun: Bob   verb: buy         noun: apple   verb: peel       noun: pear   verb: eat         noun: peach
fnc: buy    arg: Bob apple    fnc: buy      arg: Bob pear    fnc: peel    arg: Bob peach    fnc: eat
     peel   prn: 32           prn: 32       prn: 32          prn: 32      prn: 32           prn: 32
     eat
prn: 32
Here multiple values are written into the fnc slot of the gapped item Bob and the core value of the gapped item is written into the first arg slot of the verbs. Finally consider an example of object gapping:

8.4.4 PROPLET ANALYSIS OF OBJECT GAPPING

noun: Bob   verb: buy         noun: Jim    verb: peel        noun: Bill   verb: eat         noun: peach
fnc: buy    arg: Bob peach    fnc: peel    arg: Jim peach    fnc: eat     arg: Bill peach   fnc: buy
prn: 33     prn: 33           prn: 33      prn: 33           prn: 33      prn: 33                peel
                                                                                                 eat
                                                                                            prn: 33
The gapped item peach has multiple fnc values and the verbs of the complex conjuncts have peach in their second arg slot. The remaining sections of this chapter continue with a more detailed analysis of subject, predicate, and object gapping.
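As a hypothetical illustration of these codings, the subject gapping content 8.4.3 can be unfolded into one elementary subject-verb-object triple per complex conjunct:

    # Sketch: reading the subject gapping content of 8.4.3 as three
    # elementary propositions sharing the gapped subject Bob.
    verbs = {"buy":  {"arg": ["Bob", "apple"]},
             "peel": {"arg": ["Bob", "pear"]},
             "eat":  {"arg": ["Bob", "peach"]}}
    bob = {"noun": "Bob", "fnc": ["buy", "peel", "eat"]}   # multiple fnc values

    triples = [(verbs[f]["arg"][0], f, verbs[f]["arg"][1]) for f in bob["fnc"]]
    print(triples)
    # [('Bob', 'buy', 'apple'), ('Bob', 'peel', 'pear'), ('Bob', 'eat', 'peach')]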
8.5 Subject Gapping and Predicate Gapping

Pretheoretically, the subject gapping 8.4.1 (1) may be shown as follows:13

Bob   buy  apple
      peel pear
      eat  peach

13 A similar analysis was proposed by T. Winograd in a lecture at Stanford U. in 1985 or 86.
Formally, it is realized by writing the predicates into the fnc slot of Bob and the shared subject into the first arg slot of the predicates (8.4.3). The objects are attached to their predicates by the usual cross-copying. These semantic relations of structure are established by the following hear mode derivation:

8.5.1 Bob bought an apple, # peeled a pear, and # ate a peach

[Hear mode derivation display: lexical lookup and syntactic-semantic parsing lines 1-10; the result is shown in 8.5.2]
The complex conjuncts are buy apple, peel pear, and eat peach. The gapped item is Bob. The cross-copying between the predicates of the conjuncts and the lone subject occurs in lines 4 and 8. The time-linear hear mode derivation shows the incremental buildup of three fnc values in the gapped item and the adding of the core value of the gapped item to the first arg slot of each of the three predicates. The object\predicate relations are based on the usual cross-copying between a noun and a verb proplet. Consider the content derived in 8.5.1 in the order induced by the hear mode derivation, with added cat and sem values:

8.5.2 REPRESENTING CONTENT OF SUBJECT GAPPING AS PROPLET SET

[noun: Bob, cat: snp, sem: nm m sg, fnc: buy peel eat, prn: 30]
[verb: buy, cat: #n′ #a′ decl, sem: past, arg: Bob apple, prn: 30]
[noun: apple, cat: snp, sem: indef sg, fnc: buy, prn: 30]
[verb: peel, cat: #n′ #a′ v, sem: past, arg: Bob pear, prn: 30]
[noun: pear, cat: snp, sem: indef sg, fnc: peel, prn: 30]
[verb: eat, cat: #n′ #a′ v, sem: and past, arg: Bob peach, prn: 30]
[noun: peach, cat: snp, sem: indef sg, fnc: eat, prn: 30]
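The buildup just described lends itself to a small procedural illustration. The following Python sketch is our own and not part of the DBS software: proplets are modeled as plain dicts, all three predicates are treated uniformly for simplicity, and the operation name cross_copy_subject is hypothetical.

    # Sketch only: proplets as Python dicts; not the DBS software.

    def new_proplet(kind, core, prn):
        """Create a bare proplet with an empty continuation slot."""
        slot = {"noun": "fnc", "verb": "arg"}[kind]
        return {kind: core, slot: [], "prn": prn}

    def cross_copy_subject(subject, verb):
        """Subject gapping: the verb's core value is added to the shared
        subject's fnc slot, and the subject's core value to the verb's
        first arg position."""
        subject["fnc"].append(verb["verb"])
        verb["arg"].insert(0, subject["noun"])

    bob = new_proplet("noun", "Bob", 30)
    for v_core, obj in [("buy", "apple"), ("peel", "pear"), ("eat", "peach")]:
        verb = new_proplet("verb", v_core, 30)
        verb["arg"].append(obj)        # usual object cross-copying
        cross_copy_subject(bob, verb)  # shared subject of the conjunct

    print(bob)  # {'noun': 'Bob', 'fnc': ['buy', 'peel', 'eat'], 'prn': 30}

Running the sketch prints Bob's proplet with the three fnc values accumulated in the order of the derivation.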
The conjunction and was copied into the sem slot of the last conjunct eat in line 8 of 8.5.1 (not shown). The speak mode (re)production of this sentence from the set of proplets derived in 8.5.1 is based on the following DBS graph analysis:

8.5.3 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 8.5.1

[Four panels: (i) semantic relations graph (SRG) connecting the verbs buy, peel, and eat to the shared subject Bob and to their respective objects apple, pear, and peach; (ii) signature with three V nodes over one shared N node and three object N nodes; (iii) numbered arcs graph (NAG) with arcs 1-12; (iv) surface realization: Bob bought an_apple peeled a_pear and_ate a_peach .]
The numbering of the arcs is strictly pre-order. The traversal begins and ends with the initial verb. There is a multiple visit in transition 1, and there are empty traversals in the steps 12-1, 8-9, and 4-5. Graph-theoretically (CLaTR Sect. 9.1), the n=7 signature has the degree sequence 3222111.

Next let us turn to the example 8.4.1 (2) of predicate gapping.14 Pretheoretically, it may be shown as follows:

    Bob buy apple    Jim pear    Bill peach
The complex conjuncts are Bob # apple, Jim # pear, and Bill # peach. In analogy to 8.5.1 (subject gapping), the core values of these conjuncts are written into the arg slot of the sole verb, buy, while the core value of the verb is written into the fnc slot of the six nouns making up the complex conjuncts (8.4.2):

8.5.4 Bob bought an apple, Jim # a pear, and Bill # a peach.

[Hear mode derivation: lexical lookup followed by time-linear syntactic-semantic parsing in lines 1-10, resulting in a set of proplets with the common prn value 31.]
As shown by the common prn value, here 31, predicate gapping resembles subject and object gapping in that the construction is intrapropositional. The time-linear hear mode derivation shows the incremental buildup of three subject-object pairs in the arg slot of the shared verb buy and the writing of the value buy into the fnc slots of the nouns in the three argument pairs Bob apple, Jim pear, and Bill peach. The cross-copying between the (elements of the) complex conjuncts and the lone verb occurs in lines 4, 5, 8, and 9. There are no suspensions. In line 8, and is copied into the sem slot (not shown) of Bill.

In the proplet representation of the content derived in 8.5.4, the shared item, i.e. the verb buy, lists the three subject-object pairs Bob apple, Jim pear, and Bill peach in its arg slot, just as the six nouns list buy in their fnc slot (bidirectional pointering):

8.5.5 REPRESENTING PREDICATE GAPPING CONTENT AS PROPLET SET

[noun: Bob, cat: snp, sem: nm m sg, fnc: buy, prn: 31]
[verb: buy, cat: #n′ #a′ decl, sem: past, arg: Bob apple Jim pear Bill peach, prn: 31]
[noun: apple, cat: snp, sem: indef sg, fnc: buy, prn: 31]
[noun: Jim, cat: snp, sem: nm m sg, fnc: buy, prn: 31]
[noun: pear, cat: snp, sem: indef sg, fnc: buy, prn: 31]
[noun: Bill, cat: snp, sem: and nm m sg, fnc: buy, prn: 31]
[noun: peach, cat: snp, sem: indef sg, fnc: buy, prn: 31]
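The bidirectional pointering may be made explicit as a consistency check over such a proplet set. The following Python sketch is our own illustration with simplified proplets (cat, sem, and prn values omitted); pointering_consistent is a hypothetical helper.

    # Sketch only: checking bidirectional pointering in a predicate
    # gapping content. The shared verb lists subject-object pairs in its
    # arg slot; each noun lists the verb's core value in its fnc slot.

    verb = {"verb": "buy",
            "arg": [("Bob", "apple"), ("Jim", "pear"), ("Bill", "peach")]}
    nouns = [{"noun": n, "fnc": "buy"}
             for n in ("Bob", "apple", "Jim", "pear", "Bill", "peach")]

    def pointering_consistent(verb, nouns):
        """Every noun must name the verb in fnc and occur in an arg pair."""
        in_arg = {core for pair in verb["arg"] for core in pair}
        return all(n["fnc"] == verb["verb"] and n["noun"] in in_arg
                   for n in nouns)

    assert pointering_consistent(verb, nouns)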
The proplets are shown in the order induced by the hear mode derivation, with added cat and sem values. The LAGo think navigation through this content, underlying the speak mode production, may be shown graphically as follows:14

14 We use the term predicate gapping (rather than verb gapping) to be consistent with the terms subject gapping and object gapping. The term verb gapping would correspond to noun gapping, losing the distinction between subject and object. We are aware, however, that the trichotomy subject/predicate/object, on the one hand, and the dichotomy noun/verb, on the other, have been mixed in the linguistic literature, as in the term SVO (subject verb object) language.
8.5.6 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 8.5.4

[Four panels: (i) semantic relations graph (SRG) connecting the shared verb buy to Bob, apple, Jim, pear, Bill, and peach; (ii) signature with one V node over six N nodes; (iii) numbered arcs graph (NAG) with arcs 1-12; (iv) surface realization: Bob bought an_apple Jim a_pear and_Bill a_peach .]
The numbering of the arcs in (iii), the NAG, is pre-order. As shown by (iv), the surface realization, the initial sentence is realized by a 1-2-11-12 traversal, the second (gapped) sentence by a 3-4-9-10 traversal, and the third (gapped) sentence by a 5-6-7-8 traversal. The n=7 signature has the degree sequence 6111111. No arc in the NAG is traversed more than once by the surface realization (no multiple visits). There are, however, multiple traversals of buy in 4-5, 6-7, and 10-11, and multiple realizations of an_apple in 3, a_pear in 7, and a_peach in 11. The continuity condition 3.6.5 is fulfilled.
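The interplay of a NAG and a surface realization may be illustrated computationally. The following Python sketch is our own construction, not the LAGo think-speak grammar; the arc numbering and the arc-to-surface assignment below are a simplified, hypothetical version for 8.5.4 and do not reproduce the numbering of 8.5.6.

    # Sketch only: realizing a surface by visiting NAG arcs in numerical
    # order (hypothetical arc numbering and surface assignment).

    nag = {  # arc number -> (from node, to node)
        1: ("buy", "Bob"),    2: ("Bob", "buy"),
        3: ("buy", "apple"),  4: ("apple", "buy"),
        5: ("buy", "Jim"),    6: ("Jim", "buy"),
        7: ("buy", "pear"),   8: ("pear", "buy"),
        9: ("buy", "Bill"),  10: ("Bill", "buy"),
       11: ("buy", "peach"), 12: ("peach", "buy"),
    }
    surfaces = {1: "Bob", 2: "bought", 3: "an apple", 5: "Jim",
                7: "a pear", 9: "and Bill", 11: "a peach", 12: "."}

    def realize(nag, surfaces):
        """Visit the arcs in numerical order; arcs without a surface
        (here 4, 6, 8, and 10) are empty traversals."""
        return " ".join(surfaces[n] for n in sorted(nag) if n in surfaces)

    print(realize(nag, surfaces))
    # Bob bought an apple Jim a pear and Bill a peach .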
8.6 Object Gapping and Noun Gapping

The example of object gapping 8.4.1 (3) may be shown pretheoretically as follows:

    Bob buy    Jim peel    Bill eat    peach
Compared to subject and predicate gapping, object gapping is special in that the gapped item follows the gaps. This results in numerous suspensions until the gapped object finally arrives:
8.6.1 Bob bought #, Jim peeled #, and Bill ate a peach.

[Hear mode derivation: lexical lookup followed by time-linear syntactic-semantic parsing in lines 1-8, resulting in a set of proplets with the common prn value 33.]
The complex conjuncts in this example are Bob bought, Jim peeled, and Bill ate. The gapped item is the object peach. The cross-copyings between the object and the predicates of the complex conjuncts occur in line 7, using the replacement value n_1 of the determiner as the valency filler. There are suspensions in lines 2, 4, and 5, which are resolved all at once in line 7. Because operations apply whenever conditions allow (self-organization, Sect. 3.6), the suspension compensations in line 7 do not require any extra work (Sect. 11.4). Consider the content derived in 8.6.1 in the order induced by the hear mode derivation, with added cat and sem values.

8.6.2 REPRESENTING OBJECT GAPPING CONTENT AS PROPLET SET

[noun: Bob, cat: snp, sem: nm m, fnc: buy, prn: 33]
[verb: buy, cat: #n′ #a′ decl, sem: past, arg: Bob peach, prn: 33]
[noun: Jim, cat: snp, sem: nm m, fnc: peel, prn: 33]
[verb: peel, cat: #n′ #a′ v, sem: past, arg: Jim peach, prn: 33]
[noun: Bill, cat: snp, sem: and nm m, fnc: eat, prn: 33]
[verb: eat, cat: #n′ #a′ v, sem: past, arg: Bill peach, prn: 33]
[noun: peach, cat: snp, sem: indef sg, fnc: buy peel eat, prn: 33]
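The resolution of the suspensions may be pictured as follows. The Python sketch below is our own simplification: it skips the determiner's replacement value n_1 and cross-copies the arriving object directly into every verb at the now front whose object valency is still open.

    # Sketch only: late filling of the shared object in object gapping.

    now_front = [
        {"verb": "buy",  "arg": ["Bob"],  "prn": 33},
        {"verb": "peel", "arg": ["Jim"],  "prn": 33},
        {"verb": "eat",  "arg": ["Bill"], "prn": 33},
    ]
    peach = {"noun": "peach", "fnc": [], "prn": 33}

    for verb in now_front:           # compensation of the suspensions
        if len(verb["arg"]) < 2:     # object valency still unfilled
            verb["arg"].append(peach["noun"])
            peach["fnc"].append(verb["verb"])

    print(peach["fnc"])  # ['buy', 'peel', 'eat']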
Only the initial verb proplet buy has the final cat value decl; it is traversed first and last, as shown by the LAGo think navigation through 8.6.2:

8.6.3 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 8.6.1

[Four panels: (i) semantic relations graph (SRG) connecting the verbs buy, peel, and eat to their subjects Bob, Jim, and Bill and to the shared object peach; (ii) signature with three V nodes over three subject N nodes and one shared N node; (iii) numbered arcs graph (NAG) with arcs 1-12; (iv) surface realization: Bob bought Jim peeled and_Bill ate a_peach .]
The navigation of object gapping15 follows the pre-order numbering of the NAG. As in predicate gapping, all transitions are used only once (no multiple visits). There are empty traversals in transitions 3, 4, and 7. The navigation returns to the initial verb to realize the period and to connect to the verb of a possible next proposition (Chap. 9). The n=7 degree sequence is the same as in subject gapping (8.5.3), namely 3222111.

15 Less than 0.01% of the sentences in the BNC show this construction of object gapping.

In conclusion we turn to noun gapping. In contradistinction to subject, predicate, and object gapping, which have a clausal appearance but are intrapropositional, noun gapping is obviously intrapropositional because it applies to the adnominal modifiers inside a phrasal noun construction.16 Consider the analysis of Bob ate the red #, the green #, and the blue berries.:

16 Less than 0.02% of the BNC sentences show this construction of noun gapping.

8.6.4 DBS GRAPH ANALYSIS OF NOUN GAPPING

[Four panels: (i) semantic relations graph (SRG) connecting eat to Bob and berry, with berry shared by the adnominals red, green, and blue (drawn as three parallel vertical lines); (ii) signature with V over two N nodes, the object N dominating three parallel A nodes; (iii) numbered arcs graph (NAG) with arcs 1-10; (iv) surface realization: Bob ate the red the green and_the blue berries .]
As in object gapping, the noun gapping parts precede the shared filler. The degree sequence is 421111. The node represented by the noun berry is shared by the adnominally used adjectives red, green, and blue, as indicated by the three parallel vertical lines. The surface realization shows no empty traversals, one multiple realization, and no multiple visits. The proplet representation is as follows:17

8.6.5 REPRESENTING NOUN GAPPING CONTENT AS PROPLET SET

[noun: Bob, cat: snp, sem: nm m sg, fnc: eat, prn: 34]
[verb: eat, cat: #n′ #a′ decl, sem: past, arg: Bob berry, prn: 34]
[adj: red, cat: adn, sem: pad, mdd: berry, prn: 34]
[adj: green, cat: adn, sem: pad, mdd: berry, prn: 34]
[adj: blue, cat: adn, sem: pad, mdd: berry, prn: 34]
[noun: berry, cat: snp, sem: def pl, mdr: red green blue, fnc: eat, prn: 34]
The noun gapping construction has a distributive interpretation in the sense that each berry is either red, green, or blue. This is in contradistinction to the corresponding adnominal coordination, which has a collective interpretation in the sense that each berry may have all three colors at the same time. Consider the analysis of Bob ate the red, green, and blue berries.

8.6.6 DBS GRAPH ANALYSIS OF ADNOMINAL COORDINATION

[Four panels: (i) semantic relations graph (SRG) connecting eat to Bob and berry, with berry modified by the coordination red−green−blue; (ii) signature with V over two N nodes, the object N dominating a chain of three A nodes; (iii) numbered arcs graph (NAG) with arcs 1-10; (iv) surface realization: Bob ate the red green and_blue berries .]
The n=7 degree sequence is 222211. Compare the following proplet representation of the adnominal coordination 8.6.6 with that of noun gapping 8.6.5:

8.6.7 ADNOMINAL COORDINATION CONTENT AS PROPLET SET

[noun: Bob, cat: snp, sem: nm m sg, fnc: eat, prn: 35]
[verb: eat, cat: #n′ #a′ decl, sem: past, arg: Bob berry, prn: 35]
[adj: red&, cat: adn, sem: pad, mdd: berry, nc: green, pc: , prn: 35]
[adj: green, cat: adn, sem: pad, mdd: , nc: blue, pc: red&, prn: 35]
[adj: blue, cat: adn, sem: pad, mdd: , nc: , pc: green, prn: 35]
[noun: berry, cat: pnp, sem: def pl, mdr: red&, fnc: eat, prn: 35]

The connections between the adnominal modifiers are run via the nc and pc slots of an adjective coordination, and not via multiple connections between the mdd slots of the adjectives and the mdr slot of the noun, as in the gapping construction 8.6.5.17

17 When gapped items have different determiners, as in Bob ate all red, some blue, and three green berries., multiple sem values may have to be used as well. If the determiners differ in number, as in Bob ate three red, two blue, and one green berry?/berries?, the construction cannot have a grammatical surface (a natural 'bug'; similar in German).
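The structural contrast between 8.6.5 and 8.6.7 can be made concrete with simplified dict proplets. The Python sketch below is our own illustration; attribute values are reduced to the slots under discussion.

    # Sketch only: noun gapping (distributive) vs adnominal coordination
    # (collective) as simplified proplet dicts.

    gapping = {       # distributive (8.6.5): each adjective modifies berry
        "red":   {"mdd": "berry"},
        "green": {"mdd": "berry"},
        "blue":  {"mdd": "berry"},
        "berry": {"mdr": ["red", "green", "blue"]},
    }
    coordination = {  # collective (8.6.7): only the &-marked initial
        "red&":  {"mdd": "berry", "nc": "green", "pc": None},
        "green": {"mdd": None, "nc": "blue", "pc": "red&"},
        "blue":  {"mdd": None, "nc": None, "pc": "green"},
        "berry": {"mdr": ["red&"]},
    }

    # In the gapping content every adjective carries its own mdd link:
    assert all(p.get("mdd") == "berry"
               for k, p in gapping.items() if k != "berry")
    # In the coordination only red& does; the others hang off the nc/pc chain:
    assert [k for k, p in coordination.items() if p.get("mdd")] == ["red&"]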
Noun gapping may be combined with subject gapping, as in the following example.

8.6.8 COMBINATION OF NOUN GAPPING AND SUBJECT GAPPING

The smart #, the pretty #, and the small girl ate an apple, # walked the dog, and # read the paper.

Noun gapping may also be combined with object gapping:

8.6.9 COMBINATION OF NOUN GAPPING AND OBJECT GAPPING

Bob bought #, Jim peeled #, and Bill ate the red #, the green #, and the yellow apple.

In summary, a gapped content is special because parallel semantic relations of structure use a node, e.g. the subject (8.5.1), the predicate (8.5.4), the object (8.6.1), or the modified noun (8.6.4), more than once. This makes gapping constructions highly relevant for a linguistic understanding of the semantic relations of structure as coded by natural language, regardless of how (in)frequently they occur.
Summary

In DBS, the conjuncts of a simple intrapropositional coordination are concatenated via their nc and pc values. The core value of the initial conjunct and all its copies are &-marked because only the initial conjunct participates in the functor-argument of the construction. The non-initial conjuncts do not specify functor-argument continuation values, but share them with the initial conjunct instead. During retrieval, the &-marker initiates a conjunction navigation along the nc and pc relations to check the non-initial conjuncts (Sect. 8.1).

In complex coordinations such as subject gapping (8.5.1), predicate gapping (8.5.4), and object gapping (8.6.1), in contrast, the conjuncts are concatenated via their functor-argument continuation values and participate in the functor-argument structure of the construction. For example, in the hear mode derivation 8.5.1 the proplet of the shared item, i.e. the subject Bob, lists the predicates buy, peel, and eat in its fnc slot, and the complex conjuncts buy apple, peel pear, and eat peach list the core value of the shared item, i.e. Bob, in the first arg slot of their verbs. Thus, there is no need for &-marking or for conjunction navigation during retrieval.

Simple and complex coordinations are alike, however, in that they may use different conjunctions, e.g. and, or, and but. These have to be recorded by the hear mode derivation, not only for later use in a possible speak mode realization, but more generally because they contribute to the content. The conjunction, if present, is stored in the sem slot of the last conjunct of a simple coordination (8.2.2, 8.2.6, 8.3.3, 8.3.5, 8.3.6, 9.3.2, 9.3.5, 9.3.8, 9.4.2, 9.4.5, 9.4.8, 9.5.5, 9.5.8, 9.6.2, 9.6.5, 9.6.8) and at the beginning of the last complex conjunct of a complex coordination (8.5.2, 8.5.5, 8.6.2, 10.1.2, 10.2.2, 10.3.2, 10.4.2, 10.5.2).
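As a rough illustration of this difference, consider the following Python sketch (our own, not the DBS software): retrieving the subjects of sleep in a simple coordination requires the conjunction navigation triggered by the &-marker, here implemented by the hypothetical helper subjects_of.

    # Sketch only: conjunction navigation in a simple coordination
    # (simplified proplets of "Bob, Jim, and Bill slept", cf. 9.3.2).

    content = {
        "sleep": {"arg": ["Bob&"]},
        "Bob&":  {"fnc": "sleep", "nc": "Jim"},
        "Jim":   {"nc": "Bill", "pc": "Bob&"},
        "Bill":  {"nc": None, "pc": "Jim"},
    }

    def subjects_of(verb, content):
        """The &-marker on the initial conjunct triggers a navigation
        along the nc values to collect the non-initial conjuncts."""
        first = content[verb]["arg"][0]
        subjects = [first]
        if first.endswith("&"):            # &-marked initial conjunct
            nxt = content[first]["nc"]
            while nxt:
                subjects.append(nxt)
                nxt = content[nxt]["nc"]
        return subjects

    print(subjects_of("sleep", content))   # ['Bob&', 'Jim', 'Bill']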
9. Simple Coordination in Clausal Constructions
In this chapter, simple coordination is integrated into extrapropositional (clausal) constructions. Beginning with (i) extrapropositional coordination, we go systematically through the clausal functor-arguments analyzed in Chap. 7, namely (ii) clausal arguments serving as subject, (iii) clausal arguments serving as object, (iv) clausal modifiers in adnominal use with a subject gap (relative clauses with the head serving as the lower subject), (v) clausal modifiers in adnominal use with an object gap (relative clauses with the head serving as a lower object), and (vi) clausal modifiers in adverbial use. Each construction is analyzed as (1) a schematic hear mode derivation, (2) a content defined as a set of connected proplets, and (3) a speak mode production based on navigating along the semantic relations between proplets.
9.1 Combining Intra- and Extrapropositional Coordination

At the language level, simple extrapropositional coordination may occur intrasententially, as in Julia sleeps and John sings. (complex sentence), and extrasententially, as in Julia sleeps. John sings. (sequence of sentences in a text or dialog). Intrasentential extrapropositional coordination requires a conjunction like and, placed before the final conjunct. In extrasentential coordination, the mere sequencing of sentences is sufficient; if there is a coordinating conjunction, it is placed at the beginning of the conjunct. Consider the following set of proplets representing Julia sang. Then Sue slept. John read.:

9.1.1 GRAMMATICAL RELATIONS BETWEEN PROPOSITIONS

1 [noun: Julia, fnc: sing, mdr: , nc: , pc: , prn: 10]
2 [verb: sing, arg: Julia, mdr: , nc: (sleep 11), pc: , prn: 10]
3 [noun: Sue, fnc: sleep, mdr: , nc: , pc: , prn: 11]
4 [verb: sleep, arg: Sue, mdr: , nc: (read 12), pc: , prn: 11]
5 [noun: John, fnc: read, mdr: , nc: , pc: , prn: 12]
6 [verb: read, arg: John, mdr: , nc: , pc: , prn: 12]
The propositional conjuncts are connected by means of addresses in the verbs' nc slots. The extrapropositional nature of the coordination is indicated by the use of three different prn values and by long addresses in the nc slots, e.g. [nc: (read 12)] in proplet 4. The optional conjunction of an extrapropositional coordination is specified in the sem slot of the top verb of a conjunct, e.g. [sem: then past] of sleep (not shown).

Extrapropositional coordinations are special in that they occur only as V−V. This is because an extrapropositional coordination concatenates the highest verbs of propositions. Intrapropositional coordination, in contrast, allows N−N, V−V, and A−A at the elementary and the phrasal level (6.6.1). Extra- and intrapropositional V−V coordinations may be combined, as in Sue slept. John bought, cooked, and ate a pizza. Julia sang. The extrapropositional V−V coordination is sleep−buy−sing, while the intrapropositional V−V coordination is buy−cook−eat in the middle sentence. Given that both coordinations are run via the verbs' nc slots, let us see how the two levels may be accommodated in a strictly time-linear analysis:

9.1.2 COMBINING INTRA- AND EXTRAPROPOSITIONAL COORDINATION

1 [noun: Sue, fnc: sleep, mdr: , nc: , pc: , prn: 26]
2 [verb: sleep, arg: Sue, mdr: , nc: (buy& 27), pc: , prn: 26]
3 [noun: John, fnc: buy&, mdr: , nc: , pc: , prn: 27]
4 [verb: buy&, arg: John pizza, mdr: , nc: cook, pc: , prn: 27]
5 [verb: cook, arg: , mdr: , nc: eat, pc: buy&, prn: 27]
6 [verb: eat, arg: , mdr: , nc: (sing 28), pc: cook, prn: 27]
7 [noun: pizza, fnc: buy&, mdr: , nc: , pc: , prn: 27]
8 [noun: Julia, fnc: sing, mdr: , nc: , pc: , prn: 28]
9 [verb: sing, arg: Julia, mdr: , nc: , pc: , prn: 28]

Because the last intrapropositional conjunct eat has an empty nc slot, this slot may be used for the extrapropositional coordination. More specifically, the first proposition 26 is connected extrapropositionally to the second proposition 27 by the feature [nc: (buy& 27)] of sleep (dashed line, long address, simplex). Buy, as the first intrapropositional conjunct of proposition 27, is connected to the second, cook, by the feature [nc: cook], and the second conjunct is connected to the first by the feature [pc: buy&], and similarly for the coordination of the second and the third conjunct (solid lines, short addresses, duplex). The nc slot of eat, unutilized by the last intrapropositional coordination, is used to establish the extrapropositional simplex coordination between eat and sing.

In summary, extrapropositional and intrapropositional coordinations differ in (i) the form of their connection values (long vs. short addresses), (ii) &-marking being limited to intrapropositional coordination, and (iii) one being simplex and the other duplex (3.1.2 f.).
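The difference between short and long addresses may be mirrored in a small lookup sketch. The following Python illustration is our own, with the content of 9.1.2 reduced to a store keyed by (core value, prn); the helper resolve is hypothetical.

    # Sketch only: resolving short (intrapropositional) vs long
    # (extrapropositional) nc addresses in a store keyed by (core, prn).

    store = {
        ("sleep", 26): {"nc": ("buy&", 27)},   # long address: (core, prn)
        ("buy&", 27):  {"nc": "cook"},         # short address: core only
        ("cook", 27):  {"nc": "eat"},
        ("eat", 27):   {"nc": ("sing", 28)},
        ("sing", 28):  {"nc": None},
    }

    def resolve(addr, current_prn):
        """A short address stays in the current proposition; a long
        address carries its own prn value."""
        if isinstance(addr, tuple):       # long: (core, prn)
            return addr
        return (addr, current_prn)        # short: same prn

    # Follow the combined intra- and extrapropositional V-V chain:
    key, chain = ("sleep", 26), []
    while key:
        chain.append(key)
        nxt = store[key]["nc"]
        key = resolve(nxt, key[1]) if nxt else None
    print(chain)
    # [('sleep', 26), ('buy&', 27), ('cook', 27), ('eat', 27), ('sing', 28)]

The printed chain shows the navigation crossing the proposition boundaries at sleep−buy& and eat−sing while staying within prn 27 for the intrapropositional conjuncts.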
9.2 Extrapropositional Coordination

In recognition, extrapropositional coordination encodes the temporal sequence of incoming propositions, at the language and the context level. Consider the derivation of an extrasentential coordination at the language level:

9.2.1 DERIVATION OF Julia slept. John sang. Suzy dreamed.

[Hear mode derivation: lexical lookup followed by time-linear syntactic-semantic parsing in lines 1-8, connecting the three propositions via the verbs' nc slots, with the prn values 1, 2, and 3.]
In line 2, the value "." of the pnc (punctuation) proplet is used to specify the sentence mood in the verb proplet as declarative (not shown here, but see 11.6.2). In line 3, there is an instance of suspension: the subject of the second sentence is added to the set of proplets at the current now front without establishing a semantic relation of structure (11.6.3). In line 4, two cross-copying operations apply. The first reconstructs the functor-argument of John sings, copying John into the arg slot of the proplet sing, and sing into the fnc slot of the proplet John (duplex, 11.6.4). The second establishes the extrapropositional connection between the two verb proplets by copying an address of sing into the nc slot of the proplet sleep (simplex, 11.6.5). From there, the earlier steps are repeated by the derivation until there is no more next word.

Thus, even without an explicit conjunction, the sentences are extrapropositionally conjoined: the nc slot in the verb proplet of the first sentence specifies the long (extrapropositional) address of the verb proplet of the second sentence, and the nc slot in the verb proplet of the second sentence specifies the address of the third. Consider the content as a set of proplets in the order induced by the hear mode derivation 9.2.1, with added cat and sem values:

9.2.2 CONTENT DERIVED IN 9.2.1 AS A SET OF PROPLETS

[noun: Julia, cat: snp, sem: nm f sg, fnc: sleep, nc: , pc: , prn: 11]
[verb: sleep, cat: #n′ decl, sem: past, arg: Julia, nc: (sing 12), pc: , prn: 11]
[noun: John, cat: snp, sem: nm m sg, fnc: sing, nc: , pc: , prn: 12]
[verb: sing, cat: #n′ decl, sem: past, arg: John, nc: (dream 13), pc: , prn: 12]
[noun: Suzy, cat: snp, sem: nm f sg, fnc: dream, nc: , pc: , prn: 13]
[verb: dream, cat: #n′ decl, sem: past, arg: Suzy, nc: , pc: , prn: 13]
The speak mode (re)production of the three sentences from the set of proplets derived in 9.2.1 is based on the following DBS graph analysis.

9.2.3 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 9.2.1

[Four panels: (i) semantic relations graph (SRG) with the extrapropositionally coordinated verbs sleep, sing, and dream, each over its subject; (ii) signature with three V nodes in a V−V coordination, each over one N node; (iii) numbered arcs graph (NAG) with arcs 0-8; (iv) surface realization: Julia slept_. John sang_. Suzy dreamed_.]
V/N (arc 1) realizes Julia. N/V (arc 2) realizes slept_. V−V (arc 3) navigates empty to the verb of the second proposition.1 V/N (arc 4) realizes John. N/V (arc 5) realizes sang_. And similarly for the remainder. The LAGo hear and LAGo think-speak grammars handling 9.2.1 and 9.2.3, respectively, are explicitly defined in 11.5.6 and 12.4.1.

1 The extrapropositional coordination transitions (arcs 3 and 6) are simplex, i.e. they are traversed only once. See CLaTR Sect. 9.4 for a more detailed discussion.
9.3 Simple Coordination in Clausal Subject

Integrating a simple intrapropositional coordination into the subject clause of an extrapropositional functor-argument is illustrated by That Bob, Jim, and Bill slept surprised Mary. The matrix clause is X surprised Mary, X is the subject sentence that Y slept, and Y is the coordination Bob, Jim, and Bill.2 The conjunction that (A.4.4) is like a subclause-initial connection center providing the structural basis for upcoming cross-copyings, opaque but avoiding any suspension. In addition to the continuation attribute arg for the valency position(s) of the subclause, that introduces the continuation attribute fnc (normally reserved for nouns) for connecting the subclause to the verb of the next higher clause.

2 This way of disassembling a complex expression into parts uses the principle of possible substitution. It is useful for explaining the hierarchical architecture of a construction (structuralism, nativism), but contributes little to understanding the functioning of natural language communication. In analogy, dismembering an old car into a set of wheels, the engine, transmission, seats, exhaust pipe, etc., is appropriate as a junkyard routine for recycling, but destroys the procedural interaction which held between the components while they were still part of a well-running machine. Merely naming the former relations between the separated parts is not enough for an agent-based computational approach.

9.3.1 SIMPLE SUBJECT COORDINATION IN CLAUSAL SUBJECT

That Bob, Jim, and Bill slept surprised Mary.

[Hear mode derivation: lexical lookup followed by time-linear syntactic-semantic parsing in lines 1-7, with the prn values 8 (subject clause) and 9 (matrix clause).]
In line 1, the core value Bob of the subclause subject is written into the arg slot of that, which will become the verb of the subject clause in line 5. When the second conjunct Jim is added in line 2, Bob is &-marked as the initial conjunct. In line 3, the coordinating conjunction is added, but suspended. In line 4, the conjunction's core value is written into the sem slot of the last noun conjunct Bill, after which it is discarded. The grammatical relation between the subject sentence X = that Y slept and the matrix sentence X surprised Mary is established in lines 6 and 7 by the features [fnc: (surprise 9)] of the proplet sleep and [arg: (sleep 8) Mary] of the verb proplet surprise. Consider the result as a set of proplets in the order induced by the hear mode derivation 9.3.1, with added cat and sem values:

9.3.2 CONTENT OF That Bob, Jim, and Bill slept surprised Mary.

[noun: Bob&, cat: snp, sem: nm m, fnc: sleep, nc: Jim, pc: , prn: 8]
[noun: Jim, cat: snp, sem: nm m, fnc: , nc: Bill, pc: Bob&, prn: 8]
[noun: Bill, cat: snp, sem: and nm m, fnc: , nc: , pc: Jim, prn: 8]
[verb: sleep, cat: #n-s3′ v, sem: that past, arg: Bob&, fnc: (surprise 9), prn: 8]
[verb: surprise, cat: #n′ #a′ decl, sem: past, arg: (sleep 8) Mary, prn: 9]
[noun: Mary, cat: snp, sem: nm f, fnc: surprise, prn: 9]
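The opaque V/V relation may be pictured with (core value, prn) addresses. The following Python sketch is our own illustration of 9.3.2: the subclause verb's fnc value and the first arg value of the matrix verb point at each other across the two prn values; vv_linked is a hypothetical helper.

    # Sketch only: the opaque V/V relation of 9.3.2 as mutual long addresses.

    sleep = {"verb": "sleep", "prn": 8,
             "arg": ["Bob&"], "fnc": ("surprise", 9)}
    surprise = {"verb": "surprise", "prn": 9,
                "arg": [("sleep", 8), "Mary"]}

    def vv_linked(sub, matrix):
        """True if the subclause and the matrix verb point at each other."""
        return (sub["fnc"] == (matrix["verb"], matrix["prn"])
                and (sub["verb"], sub["prn"]) in matrix["arg"])

    assert vv_linked(sleep, surprise)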
Of the two verbs, only the mainclause verb surprise indicates the syntactic mood with the cat value decl. The verb sleep of the subordinate clause, in contrast, still has the lexical v as its last cat value. The valency positions in the cat slots of surprise and sleep are all #-canceled. The conjunction values that and and appear in the sem slots of sleep and Bill, respectively, and originate in the lexical entries A.4.4 and A.7.1. The LA think navigation through the proplets of 9.3.2 underlying the speak mode production may be shown as follows:

9.3.3 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 9.3.1

[Four panels: (i) semantic relations graph (SRG) with surprise over sleep and Mary, and sleep over the coordination Bob−Jim−Bill; (ii) signature with V over V and N, the lower V over three N nodes; (iii) numbered arcs graph (NAG) with arcs 1-10; (iv) surface realization: that Bob Jim and_Bill slept surprised Mary .]
The construction resembles 7.1.4, except that the subject Bob of the subclause is extended into a simple noun coordination. The connection between the subclause verb and the matrix verb is opaque V/V. Unlike 7.1.4, there are empty transitions in arcs 5 and 6, caused by the need to return from the end of the simple intrapropositional coordination back to the root (top verb).

Proceeding systematically through subject, predicate, and object coordinations in subject sentences, we turn next to a predicate coordination. The example is That the man bought, cooked, and ate a pizza surprised Mary. The matrix sentence is X surprised Mary, in which X is the sentential subject that the man Y a pizza, and the verb Y is the simple predicate coordination bought, cooked, and ate. The example resembles 7.1.4, like 9.3.1, but differs from the latter in that it contains a predicate, instead of a subject, coordination in the subject subclause:
9.3.4 SIMPLE PREDICATE COORDINATION IN A CLAUSAL SUBJECT

That the man bought, cooked, and ate a pizza surprised Mary.

[Hear mode derivation: lexical lookup followed by time-linear syntactic-semantic parsing in lines 1-10, with the prn values 9 (subject clause) and 10 (matrix clause).]
When the second conjunct cook is added in line 4, buy is &-marked as the initial conjunct. The chain of nc values beginning with the buy proplet ensures that the non-initial conjuncts of the verb coordination can be accessed during retrieval. The opaque V/V relation between the subject sentence and the main sentence is established in lines 9 and 10 by the [fnc: (surprise 10)] feature of the proplet buy and the extrapropositional value in the [arg: (buy& 9) Mary] feature of the higher verb proplet surprise. Consider the content derived in 9.3.4 in the alphabetical order of the core values, with added cat and sem values:

9.3.5 That the man bought, cooked, and ate a pizza surprised Mary.

[verb: buy&, cat: #n′ #a′ v, sem: that past, arg: man pizza, fnc: (surprise 10), nc: cook, pc: , prn: 9]
[verb: cook, cat: n′ a′ v, sem: past, arg: , nc: eat, pc: buy&, prn: 9]
[verb: eat, cat: n′ a′ v, sem: and past, arg: , nc: , pc: cook, prn: 9]
[noun: man, cat: snp, sem: def sg, fnc: buy&, prn: 9]
[noun: pizza, cat: snp, sem: indef sg, fnc: buy&, prn: 9]
[verb: surprise, cat: #n′ #a′ decl, sem: past, arg: (buy& 9) Mary, prn: 10]
[noun: Mary, cat: snp, sem: nm f, fnc: surprise, prn: 10]
The valency positions in the cat slots of the verbal conjuncts cook and eat are not #-canceled and their arg slots have no values. They may be retrieved, however, by navigating to the initial conjunct via the nc-pc values. The conjunction values that and and appear in the sem slots of buy and eat, respectively, and originate in the lexical entries A.4.4 and A.7.1. The LA think navigation through the proplets of 9.3.5 underlying the speak mode production may be shown graphically as follows:

9.3.6 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 9.3.4

[Four panels: (i) semantic relations graph (SRG) with surprise over the coordination buy−cook−eat and Mary, and buy over man and pizza; (ii) signature with V over V and N, the lower V coordinated with two further V nodes and over two N nodes; (iii) numbered arcs graph (NAG) with arcs 1-12; (iv) surface realization: that the_man bought cooked and_ate a_pizza surprised Mary .]
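The retrieval of the unfilled valencies of cook and eat may be sketched computationally. The following Python illustration is our own: a conjunct with an empty arg slot navigates backwards along its pc values to the &-marked initial conjunct and borrows its arg values; effective_args is a hypothetical helper.

    # Sketch only: borrowing the arg values of non-initial verbal
    # conjuncts from the initial conjunct (cf. 9.3.5).

    content = {
        "buy&": {"arg": ["man", "pizza"], "nc": "cook", "pc": None},
        "cook": {"arg": [], "nc": "eat", "pc": "buy&"},
        "eat":  {"arg": [], "nc": None, "pc": "cook"},
    }

    def effective_args(verb, content):
        """If the conjunct's own arg slot is empty, follow the pc chain
        to the initial conjunct and use its arg values instead."""
        current = verb
        while not content[current]["arg"] and content[current]["pc"]:
            current = content[current]["pc"]
        return content[current]["arg"]

    print(effective_args("eat", content))   # ['man', 'pizza']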
The construction resembles 7.1.4, except that the verb buy of the subclause is transitive and extended into a simple coordination. It remains to analyze a construction containing a subject clause with an object coordination, such as That Bob bought an apple, a pear, and a peach surprised Mary. The X in the matrix sentence X surprised Mary is the subject sentence that Bob bought Y, and the object Y is the simple coordination an apple, a pear, and a peach:
9.3.7 SIMPLE OBJECT COORDINATION IN CLAUSAL SUBJECT

That Bob bought an apple, a pear, and a peach surprised Mary.

[Hear mode derivation: lexical lookup followed by time-linear syntactic-semantic parsing in lines 1-11, with the prn values 11 (subject clause) and 12 (matrix clause).]
The verb buy representing the subject sentence and the matrix verb surprise are connected in line 10 by cross-copying; the apparent distance between them is of no consequence because (i) there are only five candidates for surprise to connect with at the now front (Sect. 11.3) and (ii) proplets are order-free. Consider the result as a set of proplets shown in the order induced by the above hear mode derivation, with added cat and sem values:
9.3.8 That Bob bought an apple, a pear, and a peach surprised Mary.

[noun: Bob, cat: snp, sem: nm m, fnc: buy, prn: 11]
[verb: buy, cat: #n′ #a′ v, sem: that past, arg: Bob apple&, fnc: (surprise 12), prn: 11]
[noun: apple&, cat: snp, sem: indef sg, fnc: buy, nc: pear, pc: , prn: 11]
[noun: pear, cat: snp, sem: indef sg, fnc: , nc: peach, pc: apple&, prn: 11]
[noun: peach, cat: snp, sem: and indef sg, fnc: , nc: , pc: pear, prn: 11]
[verb: surprise, cat: #n′ #a′ decl, sem: past, arg: (buy 11) Mary, prn: 12]
[noun: Mary, cat: snp, sem: nm f, fnc: surprise, prn: 12]
The conjunction values that and and appear in the sem slots of buy and peach, respectively, and originate in the lexical entries A.4.4 and A.7.1. The relation between the subject sentence X = that Bob bought Y and the main sentence X surprised Mary is established by the [fnc: (surprise 12)] feature of buy and the extrapropositional address in the [arg: (buy 11) Mary] feature of the higher verb surprise. The noun conjuncts are connected via their nc and pc values. The semantic relations of structure and the transitions underlying realization in the speak mode may be shown graphically as follows:

9.3.9 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 9.3.7

[Four panels: (i) semantic relations graph (SRG) with surprise over buy and Mary, and buy over Bob and the coordination apple−pear−peach; (ii) signature with V over V and N, the lower V over one subject N and three coordinated N nodes; (iii) numbered arcs graph (NAG) with arcs 1-12; (iv) surface realization: That Bob bought an_apple a_pear and_a_peach surprised Mary .]
The construction resembles 7.1.4, except that the subclause uses the two-place verb buy and its object is extended into a simple noun coordination.
9.4 Simple Coordination in Clausal Object

The example Mary saw that Bob, Jim, and Bill slept consists of the matrix sentence Mary saw X, in which X is the subclause that Y slept serving as the sentential object, and Y is the simple noun coordination Bob, Jim, and Bill:

9.4.1 SIMPLE SUBJECT COORDINATION IN A CLAUSAL OBJECT

Mary saw that Bob, Jim, and Bill slept.

[Hear mode derivation: lexical lookup followed by time-linear syntactic-semantic parsing in lines 1-7, with the prn values 13 (matrix clause) and 14 (object clause).]
Consider the result in the order induced by the hear mode derivation, with added cat and sem values.

9.4.2 CONTENT OF Mary saw that Bob, Jim, and Bill slept.

[noun: Mary, cat: snp, sem: nm f, fnc: see, prn: 13]
[verb: see, cat: #n′ #a′ decl, sem: past, arg: Mary (sleep 14), prn: 13]
[verb: sleep, cat: #n′ v, sem: that past, arg: Bob&, fnc: (see 13), nc: , pc: , prn: 14]
[noun: Bob&, cat: snp, sem: nm m, fnc: sleep, nc: Jim, pc: , prn: 14]
[noun: Jim, cat: snp, sem: nm m, fnc: , nc: Bill, pc: Bob&, prn: 14]
[noun: Bill, cat: snp, sem: and nm m, fnc: , nc: , pc: Jim, prn: 14]
The crucial formal difference between the examples in this section and those in Sect. 9.3 is that the clausal and non-clausal arg values of the higher verb proplet are in inverse order: in 9.3.2 the higher verb is surprise and its arg feature is [arg: (sleep 8) Mary] (clausal subject), while in 9.4.2 the higher verb is see and its arg feature is [arg: Mary (sleep 14)] (clausal object). The LA think navigation through the proplet set 9.4.2 underlying the speak mode production may be shown graphically as follows:

9.4.3 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 9.4.1

[Four panels: (i) semantic relations graph (SRG) with see over Mary and sleep, and sleep over the coordination Bob−Jim−Bill; (ii) signature with V over N and V, the lower V over three N nodes; (iii) numbered arcs graph (NAG) with arcs 1-10; (iv) surface realization: Mary saw that Bob Jim and_Bill slept .]
The construction resembles 7.2.3, except that the subject of the subclause is extended into a simple noun coordination.

The next example is Mary saw that the man bought, cooked, and ate a pizza. It consists of the matrix clause Mary saw X, where X is the object clause that the man Y a pizza, and Y is the simple predicate coordination buy, cook, and eat:
9.4.4 SIMPLE PREDICATE COORDINATION IN A CLAUSAL OBJECT

Mary saw that the man bought, cooked, and ate a pizza.

[Hear mode derivation: lexical lookup followed by time-linear syntactic-semantic parsing in lines 1-10, with the prn values 15 (matrix clause) and 16 (object clause).]
The subordinating conjunction that turns into the initial conjunct by absorbing buy (line 5). The subsequent conjuncts are connected to it via their nc and pc values (lines 6-8). In lines 9-10 the object is connected to the initial conjunct.
Consider the result in the order induced by the hear mode derivation, with added cat and sem values.

9.4.5 Mary saw that the man bought, cooked, and ate a pizza.

[noun: Mary, cat: snp, sem: nm f, fnc: see, prn: 15]
[verb: see, cat: #n′ #a′ decl, sem: past, arg: Mary (buy& 16), prn: 15]
[verb: buy&, cat: #n′ #a′ v, sem: that past, arg: man pizza, fnc: (see 15), nc: cook, pc: , prn: 16]
[noun: man, cat: snp, sem: def sg, fnc: buy&, prn: 16]
[verb: cook, cat: n′ a′ v, sem: past, arg: , nc: eat, pc: buy&, prn: 16]
[verb: eat, cat: n′ a′ v, sem: and past, arg: , nc: , pc: cook, prn: 16]
[noun: pizza, cat: snp, sem: indef sg, fnc: buy&, prn: 16]
The LA think navigation through the proplets of 9.4.5 underlying the speak mode production may be shown graphically as follows:

9.4.6 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 9.4.4

[Four panels: (i) semantic relations graph (SRG) with see over Mary and the coordination buy−cook−eat, and buy over man and pizza; (ii) signature with V over N and V, the lower V coordinated with two further V nodes and over two N nodes; (iii) numbered arcs graph (NAG) with arcs 1-12; (iv) surface realization: Mary saw that the_man bought cooked and_ate a_pizza .]
The construction resembles 7.1.4, except that the verb of the subclause is two-place3 and is extended into a simple predicate coordination.

3 In a most minimal version of this construction, the verbs would be one-place.

Finally let us consider Mary saw that Bob bought an apple, a pear, and a peach. It consists of the matrix clause Mary saw X, where X is the object clause that Bob bought Y, and Y is the simple noun coordination an apple, a pear, and a peach serving as the object of the subordinate clause:
9.4.7 SIMPLE OBJECT COORDINATION IN A CLAUSAL OBJECT

Mary saw that Bob bought an apple, a pear, and a peach.

[Hear mode derivation: lexical lookup followed by time-linear syntactic-semantic parsing in lines 1-11, with the prn values 15 (matrix clause) and 16 (object clause).]
Consider the result in the order induced by the hear mode derivation, with added cat and sem values.
9.4.8 Mary saw that Bob bought an apple, a pear, and a peach.

[noun: Mary, cat: snp, sem: nm f, fnc: see, prn: 15]
[verb: see, cat: #n′ #a′ decl, sem: past, arg: Mary (buy 16), prn: 15]
[verb: buy, cat: #n′ #a′ v, sem: that past, arg: Bob apple&, fnc: (see 15), prn: 16]
[noun: Bob, cat: snp, sem: nm m, fnc: buy, prn: 16]
[noun: apple&, cat: snp, sem: indef sg, fnc: buy, nc: pear, pc: , prn: 16]
[noun: pear, cat: snp, sem: indef sg, fnc: , nc: peach, pc: apple&, prn: 16]
[noun: peach, cat: snp, sem: and indef sg, fnc: , nc: , pc: pear, prn: 16]
For an analogous example without an object coordination see Sect. 7.2.

9.4.9 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 9.4.7

[Four panels: (i) semantic relations graph (SRG) with see over Mary and buy, and buy over Bob and the coordination apple−pear−peach; (ii) signature with V over N and V, the lower V over one subject N and three coordinated N nodes; (iii) numbered arcs graph (NAG) with arcs 1-12; (iv) surface realization: Mary saw that Bob bought an_apple a_pear and_a_peach .]
V\V (arc 3) realizes that from the sem slot of the buy proplet. Following the empty transitions 9–11, arc 12 returns to see to realize the period.
9.5 Simple Coordination in Clausal Adnominal

In transitive declarative sentences, there are the following basic constructions with clausal adnominal modifiers (aka relative clauses):

9.5.1 SIMPLE COORDINATIONS IN RELATIVE CLAUSE CONSTRUCTIONS

A. Adnominal clause modifying subject

1. subject gap, predicate coordination
The man who bought, cooked, and ate the pizza surprised Mary.
The matrix is X surprised Mary, where X is the man who Y the pizza and Y is bought, cooked, and ate.

2. object gap, predicate coordination
The pizza which the man bought, cooked, and ate surprised Mary.
The matrix is X surprised Mary, where X is the pizza which the man Y and Y is bought, cooked, and ate.

3. subject gap, object coordination
The man who bought an apple, a pear, and a peach pleased Mary.
The matrix is X pleased Mary, where X is the man who bought Y and Y is an apple, a pear, and a peach.

4. object gap, subject coordination
The man whom Bob, Jim, and Bill know kissed Mary. (9.5.4)
The matrix is X kissed Mary, where X is the man whom Y know and Y is Bob, Jim, and Bill.

B. Adnominal clause modifying object

5. subject gap, predicate coordination
Mary saw the man who bought, cooked, and ate the pizza. (9.5.7)
The matrix is Mary saw X, where X is the man who Y the pizza and Y is bought, cooked, and ate.

6. object gap, predicate coordination
Mary saw the pizza which the man bought, cooked, and ate.
The matrix is Mary saw X, where X is the pizza which the man Y and Y is bought, cooked, and ate.

7. object gap, subject coordination
Mary saw the pizza which Bob, Jim, and Bill ate.
The matrix is Mary saw X, where X is the pizza which Y ate and Y is Bob, Jim, and Bill.

8. subject gap, object coordination
Mary saw the man who bought an apple, a pear, and a peach.
The matrix is Mary saw X, where X is the man who bought Y and Y is an apple, a pear, and a peach.

Constructions using one- or three-place verbs, different syntactic moods, etc., may be analyzed as straightforward variants of those listed in 9.5.1. The combinations subject gap, subject coordination (cf. 3) and object gap, object coordination (cf. 7) are structurally excluded because a gap cannot participate in a simple coordination. This is shown by the following examples:

9.5.2 Mary saw the man who(*, Jim, and Bill) slept
9.5.3 The man whom(*, Jim, and Bill) Suzy knew kissed Mary

The variants with the conjuncts in parentheses are not just ungrammatical, but incomprehensible.
The following hear mode derivation shows example 4, i.e. a declarative sentence with a modified subject, whereby the clausal adnominal combines an object gap with a subject coordination.

9.5.4 SIMPLE SUBJECT COORDINATION IN CLAUSAL ADNOMINAL (OBJECT GAP, MODIFYING MAIN CLAUSE SUBJECT)

[Display: time-linear hear mode derivation of The man whom Bob, Jim, and Bill know kissed Mary, showing the lexical lookup and the syntactic-semantic parsing steps 1-9. The relative pronoun whom introduces a verb proplet v_1 with an object gap (∅ in its arg slot); cross-copying yields mdr: (know 22) in the man proplet (prn 21) and mdd: (man 21) in the subclause verb (prn 22); the subject conjuncts Bob&, Jim, and Bill are chained by their nc and pc values; v_1 is replaced by know, and the matrix verb kiss finally takes man and Mary as its arg values.]
Consider the result in the order induced by the hear mode derivation, with added cat and sem values.

9.5.5 The man whom Bob, Jim, and Bill know kissed Mary.

[Proplet display: man (sem: def sg, fnc: kiss, mdr: (know 22), prn: 21); know (sem: whom past, arg: Bob& ∅, mdd: (man 21), prn: 22); kiss (sem: past, arg: man Mary, prn: 21); Mary (sem: nm f, fnc: kiss, prn: 21); the subject conjuncts Bob& (nc: Jim), Jim (nc: Bill, pc: Bob&), and Bill (sem: and nm m, pc: Jim), all with prn: 22.]
The object gap of the relative clause is indicated by ∅ in the second arg position of know and introduced by the lexical entry of whom (A.6.3). The graphical analysis underlying the speak mode production of the surface is as follows:

9.5.6 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 9.5.4

[Display: (i) semantic relations graph (SRG): kiss dominates man and Mary; man is modified by know, which dominates the coordination Bob-Jim-Bill. (ii) signature. (iii) numbered arcs graph (NAG) with arcs 1-12. (iv) surface realization: the_man whom Bob Jim and_Bill know kissed Mary . realized over arcs 1-12.]
Finally consider the analysis of example 5 in 9.5.1. It differs from example 4 (9.5.4) in that the adnominal modifies the higher object rather than subject, has a subject rather than an object gap, and shows a simple coordination of the lower predicate instead of the subject (compare 9.5.6 with 9.5.9):
9.5.7 SIMPLE PREDICATE COORDINATION IN A CLAUSAL ADNOMINAL

[Display: time-linear hear mode derivation of Mary saw the man who bought, cooked, and ate the pizza, showing the lexical lookup and the syntactic-semantic parsing steps 1-10. The relative pronoun who introduces a verb proplet v_1 with a subject gap (∅ in its arg slot); cross-copying yields mdr: (buy& 20) in the man proplet (prn 19) and mdd: (man 19) in the subclause verb (prn 20); the predicate conjuncts buy&, cook, and eat are chained by their nc and pc values, and the object pizza fills the second arg position of buy&.]
The subject gap of the relative clause is indicated by ∅ in the first arg position of buy&, and introduced by the lexical entry who (line 4). The V|N relation between the subclause and the head noun is coded by the feature
[mdr: (buy& 20)] of the man proplet and the extrapropositional address in the feature [mdd: (man 19)] of the buy proplet. Consider the result in the order induced by the hear mode derivation, with added cat and sem values.

9.5.8 Mary saw the man who bought, cooked, and ate the pizza.

[Proplet display: Mary (sem: nm f, fnc: see, prn: 19); see (sem: past, arg: Mary man, prn: 19); man (sem: def sg, fnc: see, mdr: (buy& 20), prn: 19); the predicate conjuncts buy& (sem: who past, arg: ∅ pizza, mdd: (man 19), nc: cook), cook (nc: eat, pc: buy&), and eat (sem: and past, pc: cook), all with prn: 20; pizza (sem: indef sg, fnc: buy&, prn: 20).]
The subordinating conjunction who is specified in the sem slot of the buy proplet; it originates in the lexical entry A.6.3. The LA think navigation through the proplets 9.5.8 underlying the speak mode production may be shown graphically as follows:

9.5.9 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 9.5.7

[Display: (i) semantic relations graph (SRG): see dominates Mary and man; man is modified by the predicate coordination buy-cook-eat, with pizza as the object of buy. (ii) signature. (iii) numbered arcs graph (NAG) with arcs 1-12. (iv) surface realization: Mary saw the_man who_bought cooked and_ate the_pizza . realized over arcs 1-12.]
9.6 Simple Coordination in Clausal Adverbial

Finally we turn to simple coordination in adverbial clauses. The following example consists of the matrix sentence X Fido barked, in which X is the adverbial clause when Y slept, and Y is the simple noun coordination Bob, Jim, and Bill, serving as the subject of the modifier clause:

9.6.1 SIMPLE SUBJECT COORDINATION IN A CLAUSAL ADVERBIAL

[Display: time-linear hear mode derivation of When Bob, Jim, and Bill slept, Fido barked, showing the lexical lookup, the syntactic-semantic parsing steps 1-7, and the result. The subordinating conjunction when introduces a verb proplet v_1 with an initially empty mdd slot; the subject conjuncts Bob&, Jim, and Bill are chained by their nc and pc values; v_1 is replaced by sleep, and the V|V relation is coded by the cross-copied features mdd: (bark 24) in sleep and mdr: (sleep 23) in bark.]
The double function of sleep (i) as the predicate of the subclause and (ii) as an adverbial modifier of the higher clause verb bark is introduced lexically by the continuation attributes arg (predicate function in the subclause) and mdd (adverbial function in the higher clause) of the subordinating conjunction when. The arg attribute of the lower verb, here sleep, has no ∅ value (gap), in contradistinction to adnominal modifier clauses (Sects. 7.3, 7.4, 9.5). The V|V relation between the main and the adverbial subclause is coded by the features [mdd: (bark 24)] and [mdr: (sleep 23)] in line 8. Building this relation involves a suspension: in line 6, the subject Fido of the main clause cannot be connected to the matrix verb bark because the latter has not yet arrived. This is made up for in line 7. Consider the result in the alphabetical order induced by the core values, with added cat and sem values:

9.6.2 When Bob, Jim, and Bill slept, Fido barked.

[Proplet display: bark (sem: past, arg: Fido, mdr: (sleep 23), prn: 24); Fido (sem: nm -mf, fnc: bark, prn: 24); sleep (sem: when past, arg: Bob&, mdd: (bark 24), prn: 23); the subject conjuncts Bob& (nc: Jim), Jim (nc: Bill, pc: Bob&), and Bill (sem: and nm m, pc: Jim), all with prn: 23.]
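The cross-copying and the suspension may be sketched procedurally. The following Python fragment is illustrative only (dict-based proplets; the helper names are ours, not DBS operations); it mimics lines 6 and 7 of 9.6.1 under these simplifying assumptions:

    # Minimal sketch, not the DBS software: the subject Fido arrives before
    # the matrix verb bark, so its cross-copying is suspended by one step.
    sleep = {"verb": "sleep", "prn": 23, "arg": ["Bob&"], "mdd": None}
    fido  = {"noun": "Fido",  "prn": 24, "fnc": None}
    bark  = {"verb": "bark",  "prn": 24, "arg": [], "mdr": None}

    suspended = []

    def add_subject(noun, verb):
        """Cross-copy the subject/verb relation."""
        noun["fnc"] = verb["verb"]
        verb["arg"].insert(0, noun["noun"])

    # line 6: bark has not yet arrived, so Fido is suspended
    suspended.append(fido)

    # line 7: bark arrives; the suspension is made up for,
    # and the V|V relation is coded by symmetric addresses
    add_subject(suspended.pop(), bark)
    bark["mdr"] = (sleep["verb"], sleep["prn"])   # [mdr: (sleep 23)]
    sleep["mdd"] = (bark["verb"], bark["prn"])    # [mdd: (bark 24)]

    print(bark)  # {'verb': 'bark', 'prn': 24, 'arg': ['Fido'], 'mdr': ('sleep', 23)}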
The speak mode (re)production of the sentence from the set of proplets derived in 9.6.1 is based on the following DBS graph analysis.

9.6.3 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 9.6.1

[Display: (i) semantic relations graph (SRG): bark dominates Fido and the adverbial sleep, which dominates the coordination Bob-Jim-Bill. (ii) signature. (iii) numbered arcs graph (NAG) with arcs 1-10. (iv) surface realization: the arcs are traversed in the order 3 4 5 6 7 8 9 10 1 2, realizing when Bob Jim and_Bill slept Fido barked_ .]
The next example shows a predicate rather than a subject coordination in the adverbial subclause:
9.6.4 SIMPLE PREDICATE COORDINATION IN A CLAUSAL ADVERBIAL

[Display: time-linear hear mode derivation of When Bob bought, cooked, and ate the pizza, Fido barked, showing the lexical lookup, the syntactic-semantic parsing steps 1-9, and the result. The predicate conjuncts buy&, cook, and eat are chained by their nc and pc values; buy& takes Bob and pizza as its arg values; the V|V relation is coded by the cross-copied features mdd: (bark 26) in buy& and mdr: (buy& 25) in bark.]
The V|V connection between the main clause and the adverbial subclause resembles that in 9.6.1, including a suspension, here in line 8. Consider the result in the alphabetical order induced by the core values, with added cat and sem values:
9.6.5 When Bob bought, cooked, and ate a pizza, Fido barked.

[Proplet display: bark (sem: past, arg: Fido, mdr: (buy& 25), prn: 26); Fido (sem: nm -mf, fnc: bark, prn: 26); Bob (sem: nm m, fnc: buy&, prn: 25); the predicate conjuncts buy& (sem: when past, arg: Bob pizza, mdd: (bark 26), nc: cook), cook (nc: eat, pc: buy&), and eat (sem: and past, pc: cook), all with prn: 25; pizza (sem: def sg, fnc: buy&, prn: 25).]
The [mdd: (bark 26)] feature of the buy proplet codes one direction of the V|V relation, while the feature [mdr: (buy& 25)] of bark codes the other. As the initial conjunct of a simple intrapropositional coordination, buy is &-marked, has an empty pc slot and non-empty nc and mdd slots, and is connected to the other conjuncts by chains of nc and pc values. The value when in the sem slot of buy originates in the lexical entry A.6.4 of the subordinating conjunction. The speak mode (re)production of the sentence from the set of proplets derived in 9.6.4 is based on the following DBS graph analysis.

9.6.6 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 9.6.4

[Display: (i) semantic relations graph (SRG): bark dominates Fido and the adverbial coordination buy-cook-eat, with Bob and pizza as the arguments of buy. (ii) signature. (iii) numbered arcs graph (NAG) with arcs 1-12. (iv) surface realization: when Bob bought cooked and_ate a_pizza Fido barked_ .]
As in 9.6.3, the adverbial clause precedes the modified main clause.4 However, adverbial clauses may also follow, as shown by the next example.

4 This is in contradistinction to clausal adnominals, which must follow the modified noun in English.
9.6.7 SIMPLE OBJECT COORDINATION IN A CLAUSAL ADVERBIAL

[Display: time-linear hear mode derivation of Fido barked when Bob bought an apple, a pear, and a peach, showing the lexical lookup, the syntactic-semantic parsing steps 1-11, and the result. The V|V relation is established early, in lines 2 and 3, by the cross-copied features mdr: (v_1 26) in bark and mdd: (bark 25) in the when proplet; the object conjuncts apple&, pear, and peach are chained by their nc and pc values, with apple& filling the second arg position of buy.]
The derivation shows a simple object coordination, in contradistinction to the adverbial modifiers in 9.6.1 and 9.6.4, which contain simple subject and predicate coordinations, respectively. The V|V relation is established in lines 2 and 3, resulting in the features [mdr: (v_1 26)] in the bark proplet and [mdd: (bark 25)] in the when proplet. Because the verb of the main clause is already available and the subordinating
conjunction will turn into the verb of the subclause, there is no suspension, in contradistinction to 9.6.1 and 9.6.4. Consider the result in the alphabetical order induced by the core values, with added cat and sem values:

9.6.8 Fido barked when Bob bought an apple, a pear, and a peach.

[Proplet display: bark (sem: past, arg: Fido, mdr: (buy 26), prn: 25); Fido (sem: nm -mf, fnc: bark, prn: 25); buy (sem: when past, arg: Bob apple&, mdd: (bark 25), prn: 26); Bob (sem: nm m, fnc: buy, prn: 26); the object conjuncts apple& (sem: indef sg, fnc: buy, nc: pear), pear (nc: peach, pc: apple&), and peach (sem: and indef sg, pc: pear), all with prn: 26.]
The speak mode (re)production of the sentence from the set of proplets derived in 9.6.7 is based on the following DBS graph analysis.

9.6.9 DBS GRAPH ANALYSIS UNDERLYING PRODUCTION OF 9.6.7

[Display: (i) semantic relations graph (SRG): bark dominates Fido and the adverbial buy, which dominates Bob and the coordination apple-pear-peach. (ii) signature. (iii) numbered arcs graph (NAG) with arcs 1-12. (iv) surface realization: the arcs are traversed in the order 1-12, realizing Fido barked when Bob bought an_apple a_pear and_a_peach .]
Semantically equivalent variants with the adverbial clause preceding (9.6.3, 9.6.6) and following (9.6.9) the verb differ in the traversal order through the (iii) NAG, as indicated by the (iv) surface realization. They differ also in that modifiers preceding the verb result in suspensions (9.6.1, 9.6.4), while modifiers which follow the verb do not (9.6.7). The absence of a suspension may be regarded as unmarked, and its presence as marked.
10. Gapping in Clausal Constructions
Unlike simple coordination, complex coordination (gapping) cannot concatenate propositions in a paratactic, extrapropositional manner. Instead, the relations establishing complex coordinations are intrapropositional (Sects. 8.5, 8.6). They may, however, be used as clausal arguments or clausal modifiers in extrapropositional (hypotactic) functor-argument constructions. Thus, a proposition with subject gapping may serve as a clausal subject, object, adnominal, or adverbial, and similarly for predicate gapping and object gapping. Of special interest are constructions with two gaps (Sects. 10.3, 10.4), e.g., combining (i) an adnominal clause with a subject gap and (ii) a complex predicate-object coordination (subject gapping). Complex coordination (gapping) is coded by additional addresses in certain continuation slots (8.5.2, 8.5.5, 8.6.2, 10.1.2, 10.2.2, 10.3.2, 10.4.2, 10.5.2). Simple intrapropositional coordination, in contrast, is based on address values in the nc (next conjunct) and pc (previous conjunct) slots of the conjuncts.
10.1 Subject Gapping in a Clausal Subject

The example That John hit Bob, kicked Fido, and kissed Suzy amused Mary contains a subject clause with a gapped subject. The top clause is X amused Mary, the subject X is that John Y, and Y consists of the complex predicate-object conjuncts hit Bob, kicked Fido, and kissed Suzy. Of these, the first complex conjunct combines with John into the regular declarative sentence John hit Bob, while the following conjuncts are gapped in that they miss a subject. It is implicitly shared with the subject of the initial declarative sentence (Sect. 8.5). The pretheoretical structure is

[Display: amuse dominates Mary and the that-clause, in which the shared subject John combines with the conjuncts hit Bob, kick Fido, and kiss Suzy.]

and has the following time-linear derivation in the DBS hear mode:
10.1.1 CLAUSAL SUBJECT CONTAINING SUBJECT GAPPING

[Display: time-linear hear mode derivation of That John hit Bob, kicked Fido, and kissed Suzy amused Mary, showing the lexical lookup, the syntactic-semantic parsing steps 1-10, and the result. The shared subject John accumulates the fnc values hit, kick, and kiss; each complex conjunct takes John as its first arg value; the clausal subject is connected to the matrix verb by the features fnc: (amuse 27) in the hit proplet (prn 26) and arg: (hit 26) Mary in the amuse proplet (prn 27).]
The connections between the gapped subject John and the complex conjuncts kick Fido and kiss Suzy are built by cross-copying in lines 4, 5 and 7, 8, respectively. The relation between the main clause X amused Mary and the clausal subject that John Y is established by the features [fnc: (amuse 27)] of the verb proplet hit and [arg: (hit 26) Mary] of the higher verb proplet amuse (lines 9 and 10).
Consider the result in the order induced by the hear mode derivation with added cat and sem values:

10.1.2 That John hit Bob, kicked Fido, and kissed Suzy amused Mary.

[Proplet display: John (sem: nm m, fnc: hit kick kiss, prn: 26); hit (sem: that past, arg: John Bob, fnc: (amuse 27), prn: 26); Bob (fnc: hit, prn: 26); kick (sem: past, arg: John Fido, prn: 26); Fido (sem: nm -mf, fnc: kick, prn: 26); kiss (sem: and past, arg: John Suzy, prn: 26); Suzy (sem: nm f, fnc: kiss, prn: 26); amuse (sem: past, arg: (hit 26) Mary, prn: 27); Mary (fnc: amuse, prn: 27).]
The conjunction values that and and appear in the sem slots of hit and kiss, respectively, and originate in the lexical entries A.4.4 and A.7.1. Corresponding to 8.5.3, the semantic relations and the LAGo think navigation underlying the English speak mode may be shown graphically as follows:

10.1.3 DBS GRAPH ANALYSIS UNDERLYING THE PRODUCTION OF 10.1.1

[Display: (i) semantic relations graph (SRG): amuse dominates the clausal subject hit-kick-kiss and Mary; John is the shared subject of the three conjuncts, with Bob, Fido, and Suzy as their objects. (ii) signature. (iii) numbered arcs graph (NAG) with arcs 1-16. (iv) surface realization: that John hit Bob kicked Fido and_kissed Suzy amused Mary . (see the arc-by-arc description below).]
V/V (arc 1) realizes that from the sem slot of hit. V/N (arc 2) realizes John. N/V (arc 11) returns to the verb and realizes hit. V\N (arc 12) realizes Bob. N\V (arc 13) moves empty to hit and completes the initial proposition. V/N (arc 2) moves empty to the shared subject John. N/V (arc 7) realizes kicked. V\N (arc 8) realizes Fido. N\V (arc 9) and V/N (arc 10) move empty to John. N/V (arc 3) realizes and_kissed. V\N (arc 4) realizes Suzy. N\V (arc 5) and N/V (arc 6) move empty to John. N/V (arc 11) moves empty to hit. V/V (arc 14) realizes amused. The remainder is completed as usual.
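The traversal just described may also be transcribed as data. In the following Python sketch, the tuples up to arc 14 are transcribed from the arc-by-arc description above, while the last two tuples, realizing Mary and the period, are our assumed completion of "the remainder"; the function name and representation are ours, not the DBS software:

    # Illustrative sketch of a speak mode traversal: each numbered arc either
    # realizes a surface or is an empty transition (surface None).
    nag_traversal = [
        ("V/V", 1, "that"), ("V/N", 2, "John"), ("N/V", 11, "hit"),
        ("V\\N", 12, "Bob"), ("N\\V", 13, None), ("V/N", 2, None),
        ("N/V", 7, "kicked"), ("V\\N", 8, "Fido"), ("N\\V", 9, None),
        ("V/N", 10, None), ("N/V", 3, "and_kissed"), ("V\\N", 4, "Suzy"),
        ("N\\V", 5, None), ("N/V", 6, None), ("N/V", 11, None),
        ("V/V", 14, "amused"),
        # assumed completion of "the remainder":
        ("V\\N", 15, "Mary"), ("N\\V", 16, "."),
    ]

    def realize(traversal):
        """Concatenate the surfaces realized by the non-empty transitions."""
        return " ".join(s for _, _, s in traversal if s is not None)

    print(realize(nag_traversal))
    # that John hit Bob kicked Fido and_kissed Suzy amused Mary .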
10.2 Predicate Gapping in a Clausal Object

The example Ann knows that Bob loves Julia, Jim Suzy, and Bill Liz consists of the main clause Ann knows X, the clausal object X = that Bob loves Julia Y, and Y = Jim Suzy, and Bill Liz, i.e. two conjuncts of predicate gapping. The DBS hear mode derivation may be shown graphically as follows:

10.2.1 CLAUSAL OBJECT CONTAINING GAPPED PREDICATE

[Display: time-linear hear mode derivation of Ann knows that Bob loves Julia, Jim Suzy, and Bill Liz, showing the lexical lookup, the syntactic-semantic parsing steps 1-10, and the result. The clausal object is connected to the matrix verb in lines 2 and 3 by the features arg: Ann (v_1 29) in know and fnc: (know 28) in the subclause verb; the shared verb love accumulates the arg pairs Bob Julia, Jim Suzy, and Bill Liz, while each of the six nouns takes love as its fnc value.]
The relation between Ann knows X and the clausal object X = that Bob loves Julia Y is established in lines 2 and 3 by the feature [arg: Ann (v_1 29)] of the know proplet and the feature [fnc: (know 28)] of the verb proplet. Consider the content derived in 10.2.1 with added cat and sem values:

10.2.2 Ann knows that Bob loves Julia, Jim Suzy, and Bill Liz.

[Proplet display: Ann (sem: nm f, fnc: know, prn: 28); know (sem: pres, arg: Ann (love 29), prn: 28); love (sem: that pres, arg: Bob Julia, Jim Suzy, Bill Liz, fnc: (know 28), prn: 29); the nouns Bob, Julia, Jim, Suzy, Bill (sem: and nm m), and Liz, each with fnc: love and prn: 29.]
The conjunction values that and and appear in the sem slots of love and Bill, respectively, and originate in the lexical entries A.4.4 and A.7.1. Corresponding to 8.5.6, the semantic relations and the LAGo think navigation underlying the English speak mode may be shown graphically as follows:

10.2.3 DBS GRAPH ANALYSIS UNDERLYING THE PRODUCTION OF 10.2.1

[Display: (i) semantic relations graph (SRG): know dominates Ann and the clausal object love, which dominates the subject-object pairs Bob-Julia, Jim-Suzy, and Bill-Liz. (ii) signature. (iii) numbered arcs graph (NAG) with arcs 1-16. (iv) surface realization: Ann knows that Bob loves Julia Jim Suzy and_Bill Liz . (see the arc-by-arc description below).]
N\V (arc 3) realizes that from the sem slot of love. V/N (arc 4), N/V (arc 5) and V\N (arc 14) realize Bob loves Julia. N\V (arc 15) returns empty to the shared verb love. V/N (arc 6), N/V (arc 7), and V\N (arc 12) realize the gapped conjunct Jim Suzy. N\V (arc 13) returns empty to love. V/N (arc 8), N/V (arc 9), and V\N (arc 10) realize and_Bill Liz. N\V (arc 11) returns empty to love. V\V (arc 16) returns to know to realize the period.
10.3 Clausal Adnom. (Subject Gap) with Subject Gapping

An adnominal clause with a subject gap can only be combined with subject gapping. For example, the X in The man X smiled may be the adnominal clause X = who Y, and Y the complex coordination hit Bob, kicked Fido,... (subject gap). A predicate gap like Bob an apple, Jim a pear, ... (8.5.4) or an object gap like Bob bought, Jim peeled, ... (8.6.1), in contrast, is impossible.

10.3.1 PREDICATE-OBJECT COORDINATION (SUBJECT GAPPING)

[Display: time-linear hear mode derivation of The man who hit Bob, kicked Fido, and kissed Liz smiled, showing the lexical lookup, the syntactic-semantic parsing steps 1-10, and the result. The relative pronoun who introduces a verb proplet with a subject gap (∅) and the cross-copied addresses mdr: (v_1 31) in man (prn 30) and mdd: (man 30) in the subclause verb (prn 31); the verb proplet accumulates the core values hit, kick, and kiss, and the objects Bob, Fido, and Liz take the respective conjunct verbs as their fnc values; finally, smile takes man as its arg value.]
The semantic relation between The man X smiled and the adnominal modifier X = who Y is established in lines 2 and 3 by the feature [mdr: (v_1 31)] in the man proplet and the feature [mdd: (man 30)], representing who, in the verb proplet. The verbs of the complex conjuncts are written into the verb proplet in lines 3, 5, and 8, and into their object proplets in lines 4, 6, and 9. Consider the result in the order induced by the hear mode derivation with added cat and sem values:

10.3.2 The man who hit Bob, kicked Fido, and kissed Liz smiled.

[Proplet display: man (sem: def sg, fnc: smile, mdr: (hit 31), prn: 30); the gapped verb proplet hit kick kiss (sem: who past, arg: ∅ Bob, mdd: (man 30), prn: 31); Bob (fnc: hit, prn: 31); kick (sem: past, arg: (man 30) Fido, prn: 31); Fido (sem: nm f, fnc: kick, prn: 31); kiss (sem: and past, arg: (man 30) Liz, prn: 31); Liz (sem: nm f, fnc: kiss, prn: 31); smile (sem: past, arg: man, prn: 30).]
The subject gap of the adnominal modifier is represented by ∅ in the first arg slot of the gapped verb proplet. The value represented by ∅, i.e. the head noun, is specified directly underneath by the mdd value (man 30). The value who in the sem slot of the verb proplet originates in the lexical entry A.6.3. Corresponding to 7.3.4 and 8.5.3, the semantic relations and the navigation underlying the speak mode production may be shown graphically as follows:

10.3.3 DBS GRAPH ANALYSIS UNDERLYING THE PRODUCTION OF 10.3.1

[Display: (i) semantic relations graph (SRG): smile dominates man, which is modified by the conjuncts hit, kick, and kiss with their objects Bob, Fido, and Liz. (ii) signature. (iii) numbered arcs graph (NAG) with arcs 1-16. (iv) surface realization: the_man who hit Bob kicked Fido and_kissed Liz smiled_ .]
While the subject gap of an adnominal clause cannot participate in a simple subject coordination (9.5.2), it is the sole possible subject for complex predicate-object conjuncts in subject gapping. In 10.3.3, the node implicitly representing the subject gap in the adnominal clause is the modified, here the noun proplet man (compare 7.3.4).
10.4 Clausal Adnominal (Object Gap) with Object Gapping

Adnominal clauses (i) with a subject gap and (ii) with an object gap are in complementary distribution insofar as those with a subject gap can only combine with subject gapping (Sect. 10.3), while those with an object gap can only combine with object gapping. For example, the matrix clause Liz knows the man X with the adnominal clause X = whom Y (possible object gap) and Y = Bob hit, Fido bit, and Suzy kissed is grammatical and uses whom as the gapped object. Variants with Y = hit Bob, kicked Fido, and kissed Suzy (subject gap) or Y = the man Bob, Jim Fido, and Bill Suzy (predicate gap), in contrast, are structurally excluded. Consider the hear mode derivation of Liz knows the man whom Bob hit, Fido bit, and Suzy kissed:

10.4.1 SUBJECT-PREDICATE COORDINATION (OBJECT GAPPING)

[Display: time-linear hear mode derivation of Liz knows the man whom Bob hit, Fido bit, and Suzy kissed, showing the lexical lookup, the syntactic-semantic parsing steps 1-12, and the result. The relative pronoun whom introduces a verb proplet with an object gap (∅ in the second arg position) and the cross-copied addresses mdr: (v_1 33) in man (prn 32) and mdd: (man 32) in the subclause verb (prn 33); the mdr slot of man accumulates the addresses (hit 33), (bite 33), and (kiss 33), while bite and kiss each take (man 32) as their second arg value.]
The relation between the object noun man and its adnominal subclause whom Bob hit... with an object gap is established in lines 4 and 5 by the features [mdr: (v_1 33)] of the man proplet and the long address of the [mdd: (man 32)] feature in the resulting hit proplet (line 7). The connection between the complex (gapped) conjuncts and their modified is coded by the three mdr values (hit 33), (bite 33), and (kiss 33) of the head noun man and the mdd value in the verbal part hit of the initial complex conjunct. The other conjuncts are related to the head noun by the second arg value, i.e. (man 32), of their predicate. Consider the result in the order induced by the hear mode derivation with added cat and sem values:

10.4.2 Liz knows the man whom Bob hit, Fido bit, and Suzy kissed.

[Proplet display: Liz (sem: nm f, fnc: know, prn: 32); know (sem: past, arg: Liz man, prn: 32); man (sem: def sg, fnc: know, mdr: (hit 33) (bite 33) (kiss 33), prn: 32); hit (sem: whom past, arg: Bob ∅, mdd: (man 32), prn: 33); Bob (fnc: hit, prn: 33); bite (sem: past, arg: Fido (man 32), prn: 33); Fido (sem: nm -mf, fnc: bite, prn: 33); kiss (sem: past, arg: Suzy (man 32), prn: 33); Suzy (sem: and nm f, fnc: kiss, prn: 33).]
The conjunction values whom and and appear in the sem slots of hit and Suzy, respectively, and originate in the lexical entries A.6.3 and A.7.1. Using the solution for adnominal modifiers with an object gap (7.4.3) and for object gapping (8.6.3), the semantic relations and the navigation underlying the English speak mode production may be shown graphically as follows:

10.4.3 DBS GRAPH ANALYSIS UNDERLYING THE PRODUCTION OF 10.4.1

[Display: (i) semantic relations graph (SRG): know dominates Liz and man; man is modified by the conjuncts hit, bite, and kiss with their subjects Bob, Fido, and Suzy. (ii) signature. (iii) numbered arcs graph (NAG) with arcs 1-18. (iv) surface realization: Liz knows the_man whom Bob hit Fido bit and_Suzy kissed . realized over arcs 1-18.]
As in 10.3.3, the node implicitly representing the object gap in the adnominal clause is the modified, here the noun man. The relations (lines) are needed for building the complex subject-predicate conjuncts, analogous to 8.6.3 (in which the common object noun is explicit). Assuming two basic kinds of relative clauses (subject gap and object gap) and two kinds of coordination (simple and complex), which in turn each come in three kinds, namely simple subject, predicate, and object conjuncts, and complex subject-predicate, subject-object, and predicate-object conjuncts, there are 2*2*3 = 12 possibilities. Of these, (i) a relative clause with a subject gap and a simple subject coordination, (ii) a relative clause with an object gap and a simple object coordination, (iii, iv) a relative clause with a subject gap and a complex coordination with subject-predicate or subject-object conjuncts, and (v, vi) a relative clause with an object gap and complex coordinations with predicate-object or subject-object conjuncts are structurally impossible. Thus only 6 of the 12 possibilities actually occur in English. Given its deep semantic cause, this constraint may be a real universal for all natural languages which have (a) relative clauses and (b) gapping. We know of none which don't.
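The count of 2*2*3 = 12 possibilities minus the six exclusions can be checked mechanically. In the following Python sketch, the exclusion set simply transcribes (i)-(vi) above; the data representation is ours:

    gaps = ["subject gap", "object gap"]
    kinds = {
        "simple":  ["subject", "predicate", "object"],
        "complex": ["subject-predicate", "subject-object", "predicate-object"],
    }

    # Exclusions (i)-(vi): a gap cannot participate in the coordination it gaps.
    excluded = {
        ("subject gap", "simple",  "subject"),            # (i)
        ("object gap",  "simple",  "object"),             # (ii)
        ("subject gap", "complex", "subject-predicate"),  # (iii)
        ("subject gap", "complex", "subject-object"),     # (iv)
        ("object gap",  "complex", "predicate-object"),   # (v)
        ("object gap",  "complex", "subject-object"),     # (vi)
    }

    possible = [(g, k, c)
                for g in gaps
                for k, conjs in kinds.items()
                for c in conjs
                if (g, k, c) not in excluded]

    print(len(possible))   # 6 of the 2*2*3 = 12 combinations occur in English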
10.5 Complex Coordination in Clausal Adverbial

As an example of an adverbial modifier clause consider the DBS hear mode derivation of John smiled when Bob kissed Suzy, Liz Jim, and Bill Ann:

10.5.1 ADVERBIAL SUBCLAUSE WITH PREDICATE GAPPING

[Display: time-linear hear mode derivation of John smiled when Bob kissed Suzy, Liz Jim, and Bill Ann, showing the lexical lookup, the syntactic-semantic parsing steps 1-10, and the result. The subordinating conjunction when introduces a verb proplet v_1; the V|V relation is coded in lines 2 and 3 by mdr: (v_1 35) in smile and mdd: (smile 34) in the when proplet; v_1 is replaced by kiss, whose arg slot accumulates the pairs Bob Suzy, Liz Jim, and Bill Ann, while each of the six nouns takes kiss as its fnc value.]
Because the modifier follows the main clause verb, there is no suspension. The relation between the higher verb smile and the adverbial clause is established between the mdr slot of smile and the mdd slot of the subordinating conjunction (lines 2 and 3), which will become the verb proplet of the adverbial modifier clause in line 4. The relation between the gapped verb and the complex conjuncts of the adverbial modifier clause is coded step by step (i) by the value pairs Bob Suzy, Liz Jim, and Bill Ann in the arg slot of the subordinating conjunction (which is turned into the sole verb of the subclause) and (ii) as the value kiss in the fnc slot of all the complex subject-object conjuncts. Consider the result in the order induced by the hear mode derivation with added cat and sem values:

10.5.2 John smiled when Bob kissed Suzy, Liz Jim, and Bill Ann.

[Proplet display: John (sem: nm m, fnc: smile, prn: 34); smile (sem: past, arg: John, mdr: (kiss 35), prn: 34); kiss (sem: when past, arg: Bob Suzy, Liz Jim, Bill Ann, mdd: (smile 34), prn: 35); the nouns Bob, Suzy, Liz, Jim, Bill (sem: and nm m), and Ann, each with fnc: kiss and prn: 35.]
The value when in the sem slot of kiss originates in the conjunction (A.6.4). Combining the solution for predicate gapping (8.5.6, 10.2.3) with the solution for adverbial subclauses (7.5.3), we arrive at the following graph analysis:

10.5.3 DBS GRAPH ANALYSIS UNDERLYING THE PRODUCTION OF 10.5.1

[Display: (i) semantic relations graph (SRG): smile dominates John and the adverbial kiss, which dominates the subject-object pairs Bob-Suzy, Liz-Jim, and Bill-Ann. (ii) signature. (iii) numbered arcs graph (NAG) with arcs 1-16. (iv) surface realization: John smiled when Bob kissed Suzy Liz Jim and_Bill Ann .]
Unlike clausal adnominal modifiers, the adverbial counterparts do not have a gap. Therefore they are unconstrained with respect to complex coordinations. This is shown by the following examples of subject and object gapping in an adverbial subclause, which systematically complement the predicate gapping in 10.5.1:

10.5.4 ADVERBIAL SUBCLAUSE WITH SUBJECT GAPPING
Liz smiled when Bob bought an apple, peeled a pear, and ate a peach

10.5.5 ADVERBIAL SUBCLAUSE WITH OBJECT GAPPING
Liz smiled when Bob bought, Jim peeled, and Bill ate an apple

In all three constructions, the V|V relation between the verb of the main clause and the verb of the adverbial subclause (10.5.3) is established by the subordinating conjunction when (A.6.4).
10.6 Gapping in Contents with a Three-Place Verb

A structural precondition of complex coordination in any grammatical context is that its verb is at least two-place (transitive). Otherwise the complex conjuncts predicate-object (subject gapping), subject-object (predicate gapping), and subject-predicate (object gapping) are structurally impossible. While verbs with less than two argument places (intransitive) may not participate in complex coordinations, verbs with more than two places may. It seems, however, that the maximum number of valency positions in English verbs is three, namely subject, indirect_object, and direct_object.1 Consider the following example:

10.6.1 SUBJECT GAPPING WITH A DITRANSITIVE VERB CONJUNCTION
John sold Bill a bike, gave Julia a book, and told Suzy a story

Here, the complex conjuncts consist of the three parts (i) verb, (ii) indirect_object, and (iii) direct_object: sell Bill bike, give Julia book, and tell Suzy story. Another structural possibility is complex conjuncts consisting of two parts and having two readings, as in the following example:

1 German has ditransitive verbs with a dative (indirect_object) and an accusative (direct_object) valency like geben (give), two accusative valencies like lehren (teach), and a genitive and an accusative valency like beschuldigen (accuse).
10.6.2 GAPPED THREE-PLACE VERB WITH TWO-PART CONJUNCTS
John gave Mary a book, Suzy a cookie, and Liz a kiss

This sentence is structurally ambiguous. On one reading, the subject and the predicate are gapped: John gave Mary a book, (John gave) Suzy a cookie, and (John gave) Liz a kiss. On the other reading, the predicate and the indirect_object are gapped: John gave Mary a book, Suzy (gave Mary) a cookie, and Liz (gave Mary) a kiss. It is also possible that the gap of complex conjuncts includes the direct_object, as in the following example:
10.6.3 POSSIBILITY OF GAPPING A DIRECT_OBJECT
John gave a book to Mary, Bill to Suzy, and Jim to Liz

On one reading, the gap consists of the predicate give and the direct_object book. This interpretation corresponds to John gave a book to Mary, Bill (gave a book) to Suzy, and Jim (gave a book) to Liz. On the other reading, the gap consists of the subject John and the verb gave. This interpretation corresponds to John gave a book to Mary, (John gave) Bill to Suzy, and (John gave) Jim to Liz. Apparently, a gap interpretation is possible only if the parts of the gap are adjacent in the surface. For example, John gave Mary a book is semantically equivalent to John gave a book to Mary (6.1.4). However, John gave Mary a book, Bill Suzy, and Jim Liz supports the second reading of 10.6.3, but not the first, as shown by *John gave Mary a book, Bill (gave) Suzy (a book), and Jim (gave) Liz (a book).
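The adjacency condition may be approximated by a contiguity check over the surface order of the construction. The following Python sketch is a rough formalization under simplifying assumptions (a fixed linear order of constituent roles per surface variant; the role names are ours, not DBS attributes):

    def gap_possible(order, gapped):
        """A gap interpretation requires the gapped parts to be adjacent
        (contiguous) in the surface order of the construction."""
        pos = sorted(order.index(role) for role in gapped)
        return pos == list(range(pos[0], pos[0] + len(pos)))

    # John gave a book to Mary: subj verb dobj iobj
    to_dative = ["subj", "verb", "dobj", "iobj"]
    print(gap_possible(to_dative, {"verb", "dobj"}))   # True:  Bill (gave a book) to Suzy
    print(gap_possible(to_dative, {"subj", "verb"}))   # True:  (John gave) Bill to Suzy

    # John gave Mary a book: subj verb iobj dobj
    double_obj = ["subj", "verb", "iobj", "dobj"]
    print(gap_possible(double_obj, {"subj", "verb"}))  # True:  (John gave) Suzy a cookie
    print(gap_possible(double_obj, {"verb", "dobj"}))  # False: *Bill (gave) Suzy (a book)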
Part III
Declarative Specification of Formal Fragments
11. DBS.1: Hear Mode
Part I presented the declarative specification at the highest level of abstraction. After settling some crucial methodological issues, it described a cognitive agent, natural or artificial, in terms of (i) its external interfaces, (ii) the internal memory, (iii) the time-linear algorithm for (a) transporting content between the interfaces and the memory and (b) processing content in memory, (iv) the data structure of the content, and (v) the database schema of the memory.1

Part II described the cog wheels of natural language communication. They are the ontological distinction between objects (nouns), relations (verbs), and properties (adj), the semantic relations of functor-argument and coordination, and the elementary, phrasal, and clausal levels of grammatical complexity.

Part III formalizes the declarative specification at the lowest level of abstraction and the highest degree of empirical detail. Directly below this level there is the actual software implementation.2 It has accidental properties, but is the best method to verify and correct the declarative specification (Sect. 1.2).

In this chapter, we begin with the coordination of propositions, as in a text. Long recognized as a basic structure of natural language, the coordination of propositions was formalized by propositional calculus as the truth conditions of the connectives ∨, ∧, →, and ¬. The DBS alternative is the time-linear hear mode derivation and concatenation of simple sentences. Chap. 12 provides the corresponding speak mode.
1 The framework is not limited to applications involving natural language. For example, it may control an autonomous cognitive agent which recognizes its current location as provided by GPS (input), computes a goal by processing content in memory (inferencing), and guides itself to the goal (output).
2 The software evolution of DBS up to the year 2011 is described in Handl (2012), chapters 6 and 7.

11.1 Automatic Word Form Recognition

In language communication, the hear mode is based on word form recognition, i.e. the assigning of lexical analyses to incoming unanalyzed modality-dependent external surfaces (4.5.1) of the natural language at hand (Handl 2008). Word form recognition in natural and artificial agents may be characterized as a combination of surface recognition and lexical lookup:
11.1.1 SURFACE RECOGNITION AND LEXICAL LOOKUP

[Display: the hearer's surface recognition maps the external sign surfaces (tokens) Julia sleeps onto unanalyzed surfaces (types), which lexical lookup maps onto analyzed surfaces (isolated proplets), e.g. [sur: Julia, noun: Julia, cat: snp, sem: nm f, fnc: ..., prn: ] and [sur: sleeps, verb: sleep, cat: ns3′ v, sem: pres, arg: , prn: ].]
Surface recognition matches the unanalyzed external sign (token, Sect. 2.2) with the corresponding internal sign (type) provided by the agent's word bank (memory, Sects. 4.3–4.5). In artificial agents, this may be achieved by automatic speech recognition, optical character recognition (OCR), or by simply typing the letters of the intended surface into the computer. In the latter procedure, the typewritten surface is converted directly into a digitally coded, modality-independent form.

The representation of the recognized, but otherwise unanalyzed, surface is submitted to lexical lookup. In artificial agents, lexical lookup is based on an electronic list of lexical entries which characterize the syntactic and semantic properties of the word forms, here in the form of proplets. If the sur value of a lexical proplet matches the unanalyzed surface, the matching proplet is returned as the result (11.1.2). Depending on whether lexical lookup is based on (a) complete word forms or on their parts, namely (b) the morphemes or (c) the allomorphs, three kinds of automatic word form recognition may be distinguished. They are called (i) the full form, (ii) the morpheme, and (iii) the allomorph method (FoCL Sect. 13.5).

The full form method is the simplest. Based on a finite list, however, it can only recognize a fixed number of items. More specifically, a full form lexicon (i) cannot handle neologisms on the fly, (ii) ignores the linguistic phenomena of word formation, and (iii) would have to list all possible word forms as lexical entries to be complete. The latter include those which result from productive inflection/agglutination, derivation, and compounding of the language, a complete listing of which is theoretically and practically impossible.

Systems of morphological analysis recognize a word form by (i) segmenting the surface into smaller parts, which are (ii) analyzed by lookup in a suitable lexicon and then (iii) put together again by the composition rules of morphology.
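As a point of comparison for the compositional method, the full form method amounts to a plain key-value lookup from surfaces to lexical proplets. A minimal Python sketch (dict-based proplets with attribute values taken from 11.1.1 and 11.1.3; not the DBS software):

    # Full form lookup: the surface itself is the key; no segmentation is
    # needed, but every word form must be listed (no neologisms on the fly).
    full_form_lexicon = {
        "Julia":  {"sur": "Julia",  "noun": "Julia", "cat": "snp",    "sem": "nm f"},
        "sleeps": {"sur": "sleeps", "verb": "sleep", "cat": "ns3' v", "sem": "pres"},
        "wolf":   {"sur": "wolf",   "noun": "wolf",  "cat": "sn",     "sem": "animal"},
        "wolves": {"sur": "wolves", "noun": "wolf",  "cat": "pn",     "sem": "animal"},
    }

    def recognize(surface):
        """Return the lexical proplet whose sur value matches the surface."""
        return full_form_lexicon.get(surface)

    print(recognize("wolves")["cat"])   # pn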
11.1 Automatic Word Form Recognition
195
ogy. The smallest concrete parts of a word form (FoCL Sect. 13.3) are called allomorphs, e.g., learn, ing, or ed. Depending on the modality, e.g., speech or writing, allomorphs divide further into allophones and allographs. For example, the stem wolf has the allographs wolf and wolv in English writing. If the modality of realization is not an issue, the term allomorph is used. The crucial question for any compositional approach to automatic word form recognition is how to segment the surface into allomorphs. While the word form boundaries are explicitly indicated by spaces in written English, the allograph boundaries in a surface like over/eat/er/s are implicit. The computational method of choice for dissembling a complex word form into all possible allograph sequences is the use of a trie (Briandais 1959, Fredkin 1960). A trie is an efficient electronic facility for storing and retrieving lexical entries automatically. It allows to associate any letter sequence (key) with any kind of entry, regardless of whether the entry represents a full form, an allograph, the entries of a library catalogue, or any other letter sequence. Consider the following example, which shows the storage of the full forms [wolf] and [wolves] on the left and the allographs [wolv-] and [es] on the right: 11.1.2 T RIE (i) a
b
STRUCTURES FOR FULL FORMS AND FOR ALLOGRAPHS
c ... w ... x o
y
z
(ii) a
b
c ... e
...
y
z
o
s [es] l
l [wolf] f
w ... x
v
[wolf] f
v [wolv−]
e [wolves] s
Horizontally, the top line consists of the letters of the alphabet. Vertically, some top letters are continued downwards to spell the word form surfaces (trie i) or the allographs (trie ii) of the language. For example, the letter w is continued downwards with o, which in turn is continued with l. At this point, the continuation in (i) splits into f and v. The f, as the final letter of wolf, is used to store and retrieve the lexical entry shown in 11.1.3 (a). The v is continued downwards with e and s. The s, as the final letter of wolves, is used to store and retrieve the entry shown in 11.1.3 (b). Trie (ii) shows the compositional alternative to full form lookup: the allograph [wolv-] is stored at the letter v. When the next letter arrives, the navigation jumps up to the e in the top line and continues downward from there to the s. As the final letter of the plural allograph es, it is used for storing and retriev-
196
11. DBS.1: Hear Mode
ing its lexical analysis. The sequence of analyzed allographs retrieved is combined into an analyzed word form by morphological rules (FoCL Sect. 14.4). Computer science provides highly efficient algorithms for building a trie structure from any traditional lexicon, suitable for retrieving data items stored at the end of any vertical letter sequence in the trie structure. The items may take any form in any medium with any degree of detail. Consider the lexical analysis of two forms belonging to the same word: 11.1.3 DBS
WORD FORM RECOGNITION OF
sur: wolf noun: wolf cat: sn sem: animal (a) fnc: mdr: nc: pc: prn:
wolf AND wolves
sur: wolves noun: wolf cat: pn sem: animal (b) fnc: mdr: nc: pc: prn:
The analyses are related by a common core feature, namely [noun: wolf]. They differ in their surfaces, [sur: wolf] vs. [sur: wolves], and their cat values, [cat: sn] vs. [cat: pn] (A.3.2, 4). Automatic word form recognition allows the coding of firmly established, relevant grammatical detail, which is indispensable for scientific work in linguistics and which statistics alone could never deliver (FoCL Sect. 15.5).
11.2 Integrating the Steps of the Hear Mode For testing and debugging, the components of a DBS software system may be run in isolation. For example, a list of word form surfaces ordered according to frequency or alphabetically may be used as input to (i) automatic word form recognition, resulting in a list of lexical proplets; ordered as a coherent text or as a list of test sentences these proplets may be used as input to (ii) syntacticsemantic parsing, resulting in a content or a list of contents; once a content has been derived, its proplets may be stored in (iii) a database, etc. As part of the DBS communication cycle, in contrast, the components are tightly integrated. In the hear mode, the input of a single next word surface is followed by (i) lexical lookup, (ii) storage of the lexical item at the now front (3.3.1), and (iii) syntactic-semantic parsing (3.4.2). This cycle is repeated as long as a next input surface is provided. The only ways to interrupt the procedure before reaching the last input surface are (a) an unrecognized word form, (b) ungrammatical input, or (c) an event distracting the agent’s attention.
11.2 Integrating the Steps of the Hear Mode
197
The cycle of recognizing the next word, storing it at the now front, and concatenating it syntactic-semantically to a proplet at the now front is shown in the following hear mode derivation of Fido found a bone: 11.2.1 A LTERNATING
LOOKUP, STORAGE , AND CONCATENATION
lexical lookup of first word and storage at the now front Fido 0
noun: Fido fnc: prn: lexical lookup of next word and storage at the now front found verb: find arg: prn:
1
noun: Fido fnc: prn: 1
verb: find arg: prn:
application of operation
lexical lookup of next word and storage at the now front a noun: n_1 fnc: prn: 2
noun: Fido fnc: find prn: 1
verb: find arg: Fido prn: 1
noun: n_1 fnc: prn:
application of operation
lexical lookup of next word and storage at the now front bone noun: bone fnc: prn: 3
noun: Fido fnc: find prn: 1
verb: find arg: Fido n_1 prn: 1
noun: n_1 fnc: find prn: 1
result noun: Fido fnc: find prn: 1
verb: find arg: Fido bone prn: 1
noun: bone fnc: find prn: 1
noun: bone fnc: prn:
application of operation
198
11. DBS.1: Hear Mode
For ease of explanation, the proplets at the now front are shown in the order of their arrival – and not in the alphabetical order of token lines (Sect. 11.3). In the standard DBS presentation, used here throughout (3.4.2, 6.1.1, etc.), the hear mode derivation 11.2.1 is formatted as follows: 11.2.2 S TANDARD Fido
PRESENTATION OF
11.2.1 bone
a
found
lexical lookup noun: Fido fnc: prn:
verb: find arg: prn:
noun: n_1 fnc: prn:
noun: bone fnc: prn:
syntactic−semantic parsing noun: Fido 1 fnc: prn: 1
verb: find arg: prn:
noun: Fido 2 fnc: find prn: 1
verb: find arg: Fido prn: 1
noun: Fido 3 fnc: find prn: 1
verb: find arg: Fido n_1 prn: 1
noun: n_1 fnc: prn: noun: n_1 fnc: find prn: 1
noun: bone fnc: prn:
result noun: Fido fnc: find prn: 1
verb: find arg: Fido bone prn: 1
noun: bone fnc: find prn: 1
The point is that the standard format of a hear mode derivation should be read as (i) the sequence of lexical lookups (horizontal) and (ii) the sequence of applying operations (vertical) progressing in tandem.
11.3 Managing the Now Front The storage of a next word proplet at the now front activates all operations which match this proplet with their second pattern (11.5.4). Activated operations which find a proplet at the now front matching their first pattern apply (11.5.5). The number of proplets at the current now front is quite small, usually no more than five or six (self-organization). In other words, LAGo hear operations apply in situ to proplets which are already in their final storage position. Showing the now front in the alphabetical order of the token lines, the operation in line 1 of 11.2.2 applies as follows:
11.3 Managing the Now Front
11.3.1 A PPLYING member proplets
AN OPERATION
now front
in situ AT
199
THE NOW FRONT
owner values bone
.... ....
....
noun: Fido fnc: prn: 27
Fido
.... ....
verb: find arg: prn: ....
find
....
The next word is find, which matches the second pattern of NOM+FV (11.6.1). The proplet at the now front matching the first pattern is Fido. The result of this operation is shown by the first two proplets in line 2 of 11.2.2. To make room for the next input to be concatenated, the now front must be cleared in regular intervals. For this, the proplets currently stored at the now front are pushed to the left, as in a pushdown automaton, – or, equivalently, the now front and the column of owner values move to the right, leaving the proplets to be removed from the now front behind in the word bank’s member records (11.3.4, 11.3.6), like sediment. This mechanism of clearing the now front creates a faithful record of the time-linear arrival order. The only way to correct content stored in a word bank is by adding a comment at the current now front, like a diary entry. The relation between the comment and the content to be corrected is solely by address, thus leaving the sediment untouched.3 In a token line, the number of proplets affected by a clearance is usually either zero or one, except for several proplets with the same sur and prn value, as in Mary, Mary, and Mary or slept and slept and slept. In such a case, the token line of sleep, for example, would contain three instances of sleep proplets at the now front. Having the same prn value, they will all be left behind in the sediment during the next now front clearance. The moment for clearing proplets from the now front depends on when these proplets have ceased to to be candidates for possible additional concatenations. To determine this moment, let us consider the complement, i.e. the situations in which a proplet does retain the potential to be needed. 3
Unlike coordinate-addressable databases, there are no consistency checks and repairs in a word bank. This facilitates protecting the data from attempts at intrusion, manipulation, and fraud (“correction”).
200
11. DBS.1: Hear Mode
The first case is intrapropositional functor-argument, as shown in 11.3.1. In this particular case, one might argue that only the find proplet is still needed, but as a general rule we take the position that the proplets of an elementary proposition should not cleared from the now front as long as this proposition is not complete. This holds also for intrapropositional coordination (8.2.1). The second case is extrapropositional functor-argument. As an example consider the subject/predicate construction That Fido barked amused Mary (7.1.2) at the moment when the clausal subject is connected to the main verb: 11.3.2 V/V member proplets
CONCATENATION IN A SUBJECT SENTENCE now front ....
owner values .... amuse
....
verb: amuse arg: prn:
bark
....
verb: bark arg: Fido fnc: prn: 27 noun: Fido fnc: bark prn: 27 ....
Fido
....
....
As in 11.3.1, the token lines are shown in the alphabetical order induced by their owner value. The arg and fnc values of the bark and Fido proplets of the subclause show that they have already been connected. To connect the verbs bark (subclause) and amuse (main clause) in line 3 of 7.1.2, the operation relies on the additional fnc attribute (opaque) of bark, which originated lexically in the subordinating conjunction that (A.4.4). The example illustrates the general principle of DBS that semantic relations of structure are defined solely between two single content proplets, intra- and extrapropositionally (no non-terminal nodes). Regarding the moment of clearance from the now front, one might argue that only the amuse proplet is still needed, but as a general rule the proplets of an extrapropositional functor-argument should not be cleared as long as the construction is not complete. This holds also for clausal object\predicate (7.2.1), adnominal|noun (7.3.2, 7.4.1), and adverbial|verb (7.5.1) constructions. Remains the extrapropositional V−V relation between two propositions. Consider the concatenation between That Fido barked amused Mary. and Suzy gave Fido a bone. The first step is completing the first sentence with the application of S+IP (11.6.2), resulting in the following now front state:
11.3 Managing the Now Front
11.3.3 S TATE 1
OF AN EXTRAPROPOSITIONAL
member proplets
now front
COORDINATION
owner values amuse
....
verb: amuse arg: (bark 27) Mary nc: pc: prn: 28
bark
....
verb: bark arg: Fido fnc: (amuse 28) prn: 27
....
noun: Fido fnc: bark prn: 27
Fido
noun: Mary fnc: amuse prn: 28
Mary
....
V−V
201
At this point, the only proplet still needed for potential concatenation is the verb amuse. As the one verb proplet without an additional attribute, it is recognizable as the highest verb in the clausal construction. Accordingly all other proplets at the now front end up in the sediment of the word bank (body flush): 11.3.4 S TATE 2
OF AN EXTRAPROPOSITIONAL
member proplets
....
now front verb: amuse arg: (bark 27) Mary nc: pc: prn: 28
COORDINATION
owner values amuse
bark
....
verb: bark arg: Fido fnc: (amuse 28) prn: 27
....
noun: Fido fnc: bark prn: 27
Fido
noun: Mary fnc: amuse prn: 28
Mary
....
V−V
When the first word of the next proposition, i.e. the Suzy proplet, is added there is a suspension: the intrapropositional N/V functor-argument and the extrapropositional V−V coordination are postponed until the finite verb, here
202
11. DBS.1: Hear Mode
give, of the next proposition arrives. The next hear mode derivation step compensates the suspension by applying two operations at once (self-organization): 11.3.5 S TATE 3
OF AN EXTRAPROPOSITIONAL
member proplets
....
now front verb: amuse arg: (bark 27) Mary nc: pc: prn: 28
amuse
bark
....
....
noun: Fido fnc: bark prn: 27
Fido
....
....
noun: Mary fnc: amuse prn: 28 ....
COORDINATION
owner values
verb: bark arg: Fido fnc: (amuse 28) prn: 27
verb: give arg: nc: pc: prn:
V−V
give
Mary
noun: Suzy fnc: prn: 29
Suzy
The amuse proplet was left behind at the now front by the body flush shown in 11.3.4. The Suzy proplet was added to the now front by IP+START (not shown, but see 11.6.3 for an analogous application). The subject/predicate relation between Suzy and give is established by the operation NOM+FV (11.6.4) which cross-copies Suzy into the arg slot of give and give into the fnc slot of Suzy (lower half). The verb−verb relation between amuse and give is established by the operation V+V (11.6.5) which cross-copies amuse unidirectionally into the nc slot of give (upper half). From here on, parsing the remainder of the second sentence continues independently of its predecessor. When S+IP will apply again at the end of the second sentence, a head flush will move the amuse proplet stemming from the first sentence into the sediment because it is has ceased to be a potential candidate for additional concatenations. The result of the cross-copyings indicated in 11.3.5 by means of arrows and of the anticipated head flush is as follows:
11.4 Intrapropositional Suspension
11.3.6 S TATE 4
OF AN EXTRAPROPOSITIONAL
member proplets
now front
verb: amuse arg: (bark 27) Mary nc: (give 29) pc: prn: 28
....
V−V amuse
bark
....
....
noun: Fido fnc: bark prn: 27
Fido
....
verb: give arg: Suzy nc: pc: (amuse 28) prn: 29
give
Mary
noun: Mary fnc: amuse prn: 28 ....
COORDINATION
owner values
verb: bark arg: Fido fnc: (amuse 28) prn: 27
....
203
noun: Suzy fnc: give prn: 29
Suzy
In summary, the now front is cleared when the sentence-final hear mode operation S+IP (11.6.2) applies: (i) a body flush moves all proplets except the current top verb (head) and (ii) a head flush moves the top verb of the previous sentence into the sediment (member proplets) of the agent’s word bank. Using the application of sentence-final S+IP as the trigger for such a now front clearance ensures that all proplets needed for a possible concatenation are constantly available at the now front. As a simple, regular, general mechanism, the managing of the now front is treated in the background, as part of the parser’s book-keeping operations, similar to assigning prn values.
11.4 Intrapropositional Suspension A systematically managed now front is essential for our transition from LA grammar rules with rule packages to LAGo operations without (Sect. 3.6). This is because it provides backward looking LAGo hear operations with only a limited set of proplet candidates to match with their first pattern, thus reducing the potential for unnecessary computation and inappropriate ambiguity.
204
11. DBS.1: Hear Mode
The omission of rule packages is motivated further by the phenomenon of suspension, which is handled better by LAGo operations than by LA grammar rules. Consider the following construction with a sentence-initial adverb: 11.4.1 I NITIAL
ADVERB CAUSING INTRAPROPOSITIONAL SUSPENSION
Yesterday+Mary (danced). The elementary adverb yesterday modifies the finite verb. However, danced has not yet arrived in the time-linear input sequence. 11.4.2 I NTRAPROPOSITIONAL Yesterday
SUSPENSION
Mary
danced
lexical lookup adj: yesterday mdd: prn: syntactic−semantic parsing adj: yesterday 1 mdd: prn: 39 adj: yesterday 2 mdd: prn: 39
noun: Mary fnc: prn: noun: Mary fnc: prn: noun: Mary fnc: prn: 39
verb: dance arg: mdr: prn:
noun: Mary fnc: arrive prn: 39
verb: dance arg: Mary mdr: yesterday prn: 39
result adj: yesterday mdd: arrive prn: 39
verb: dance arg: mdr: prn:
When the finite verb arrives in line 2, two semantic relations of structure must be established: (i) subject/predicate by cross-copying into the fnc and arg slots and (ii) adverb|verb by cross-copying into the mdr and mdd slots. If we wanted to use an LA hear rule rather than LAGo operations to compensate for the suspension in line 1, we would have to define an extra rule which takes all three proplets in line 2 as input. In 11.4.2, this hypothetical rule, named ADV+NOM+FV, would be called by the start state, in On the table Mary danced it would be called by the rule package of DET+CN, but in In Paris Mary danced it would be called by rule package of PREP+NP. Thus, defining a new LA hear rule instead of self-organizing LAGo operations brings about the additional burden of correctly expanding the finite state transition network defined between the rule names and the rule packages. The application of hypothetical ADV+NOM+FV in line 2 is as follows:
11.4 Intrapropositional Suspension
205
11.4.3 S USPENSION - COMPENSATING LA RULE ADV+NOM+FV ADV+NOM+FV {FV+NP, AUX+NFV, S+IP} adj: α verb: γ verb: γ adj: α noun: β noun: β cat: ADV arg: β arg: cat: ADV fnc: γ fnc: mdr: α mdr: ⇒ mdd: γ mdd: prn: K prn: K prn: K prn: prn: K prn: K sur: sur: sur: sur: sur: sur: adj: yesterday noun: Maryverb: dance adj: yesterday noun: Maryverb: dance cat: adnv cat: snp cat: n′ v cat: adnv cat: snp cat: #n′ v sem: sem: nm f sem: past sem: sem: nm f sem: past mdd: fnc: arg: mdd: dance fnc: dance arg: Mary mdr: mdr: mdr: mdr: mdr: mdr: yesterday nc: nc: nc: nc: nc: nc: pc: pc: pc: pc: pc: pc: prn: 9 prn: 9 prn: 9 prn: 9 prn: prn: 9
The core value of dance is copied into the mdd slot of yesterday and the core value of yesterday into the mdr slot of dance. The core value of Mary is copied into the arg slot of dance and the core value of dance into the fnc slot of Mary. If the adverb follows the verb, in contrast, no suspension arises and no extra rule is needed. This is shown by the following word order variant of 11.4.1: 11.4.4 A BSENCE
OF A SUSPENSION IN A POSTVERBAL ADVERB
(Mary) danced+yesterday. Consider the graphical hear mode derivation, analogous to 11.4.2 11.4.5 H EAR
MODE DERIVATION OF Mary
danced
11.4.4 yesterday
lexical lookup noun: Mary fnc: prn: syntactic−semantic parsing
verb: arrive arg: mdr: prn:
noun: Mary 1 fnc: prn: 40
verb: dance arg: mdr: prn:
noun: Mary 2 fnc: arrive prn: 40
verb: dance arg: Mary mdr: prn: 40
adj: yesterday mdd: prn:
verb: dance arg: Mary mdr: yesterday prn: 40
adj: yesterday mdd: arrive prn: 40
result noun: Mary fnc: arrive prn: 40
adj: yesterday mdd: prn:
206
11. DBS.1: Hear Mode
Here the subject/predicate and the verb|adv relations are established separately in lines 1 and 2, respectively. For a similar minimal pair, but at the clausal level, consider 7.5.1 (suspension) and 7.5.4 (no suspension). The hypothetical suspension compensating rule 11.4.3 may be reformulated as the following LAGo operations called ADV+FV and NOM+FV: 11.4.6 D EFINITION
OF THE OPERATION
ADV+FV
ADV+FV adj: α adj: α verb: γ verb: γ cat: ADV cat: ADV mdd: mdr: ⇒ mdd: γ mdr: α prn: prn: K prn: K prn: K ADV ǫ {adv, adnv}
While ADV+FV is based on the first and the third pattern of hypothetical ADV+NOM+FV (11.4.3), NOM+FV is based on the second and the third: 11.4.7 D EFINITION
OF THE OPERATION
NOM+FV
NOM+FV verb: γ verb: γ noun: β noun: β ′ ′ cat: NP cat: #NP X VT cat: NP cat: NP X VT fnc: ⇒ fnc: γ arg: β arg: prn: K prn: K prn: prn: K
In line 2 of 11.4.2, the second pattern of ADV+FV and NOM+FV are both matched by the next word proplet dance. ADV+FV finds yesterday at the now front to match its first pattern, while NOM+FV finds Mary, resulting in the required concatenations – without the need for defining an additional rule. In 11.4.5, the second pattern of NOM+FV is matched by dance in line 1 and finds Mary at the now front to match its first pattern. In line 2, however, the next word yesterday requires the operation FV+ADV: 11.4.8 D EFINITION
OF THE OPERATION
FV+ADV
FV+ADV adj: α adj: α verb: γ verb: γ cat: ADV cat: ADV mdr: mdd: ⇒ mdr: α mdd: γ prn: K prn: K prn: K prn: ADV ǫ {adv, adnv}
ADV+FV and FV+ADV differ only in the order of the patterns.4 4
Should ADV+FV and FV+ADV be fused into a single operation? The cost would be defining a new operator and an adjusted interpretation of “first pattern” and “second pattern.” Also, there exist inversions of order which change the meaning, such as You did and Did you. We therefore refrain from such a “generalization.”
11.5 The Three Steps of Applying a Lago Hear Operation
207
The operations approach exceeds the methods of a traditional complexity analysis (FoCL Part II). This is because (i) addresses defining semantic relations between order-free proplets, (ii) automatic word form recognition storing proplets at the now front, and (iii) the systematic clearance of proplets from the now front (Sect. 11.3) are not in the arsenal of traditional algorithms and their complexity analysis. However, the new constructs of LAGo allow to show the efficiency of the operations approach in other ways. First, DBS complexity analysis is based on the time-linear derivation order: a(n unambiguous) string with n words requires n-1 combination steps and each step is based on the addition of a single word. Second, an operation is only activated (i) by the current next word proplet matching its second input pattern (hear mode) or (ii) by a current word bank proplet matching its first input pattern (speak mode). Third, the number of candidates available to potentially match “the other” input pattern of an activated operation is usually no more than four or five. The only source of increased computational complexity is recursive syntactic ambiguity (TCS’92; FoCL Sect. 12.5) also in LAGo.5
11.5 The Three Steps of Applying a Lago Hear Operation The declarative specification (1.2.1) of a complex software system must be defined at a level of detail which is sufficient for a straightforward implementation in a programming language of choice. Given the large structural variety of natural language expressions, the size of the lexicon, and the complexity of natural language communication, let us begin with a small “fragment.” By a fragment6 of natural language communication we mean a system which has limited data coverage, but is functionally complete in that it models the hear mode, the think mode (limited here to an autonomous navigation through the content of the agent’s database), and the speak mode. Given the impediment of having to use today’s standard computers rather than autonomous robots, the following fragments do not include context recognition and action, i.e. they are capable of mediated reference only (2.5.1). 5
6
For proving the complexity class of different formal languages we continue to rely on LA grammars with rule packages. This is justified because there memory operations are not taken into consideration. For a detailed complexity analysis of LA grammar with rule packages see Handl (2011). The term fragment of a natural language was introduced by Richard Montague (1930–1971), who was a student of Alfred Tarski at UCLA and received his Ph.D. there in 1957. Despite Tarski’s (1935, 1944) proof that inconsistency arises if the object language contains a truth predicate (FoCL Sect. 19.5), Montague (1973) applies Tarski’s semantics to English. He does not mention his former advisor’s result – except in his clever use of the term fragment (ter Meulen and Heusinger 2013) : Montague grammars are necessarily incomplete because the language modeled, i.e. the object language, may not contain the words true and false.
208
11. DBS.1: Hear Mode
Our first fragment of English is called DBS.1. As an agent-based counterpart to the propositional calculus of symbolic logic, the focus is on concatenating propositions (extrapropositional coordination). The components of DBS.1 are automatic word form recognition and production, and the grammars LAGo Hear.1 and LAGo Think-Speak.1. In Chaps. 13 and 14, DBS.1 will be upscaled into an agent-based counterpart to prepositional calculus (intrapropositional functor-argument). Called DBS.2, it will maintain the functionality and data coverage DBS.1 (systematic upscaling). To show the concatenation of propositions based on proplets and a timelinear derivation order, the data coverage of LAGo Hear.1 is limited to the small text analyzed informally in Sect. 9.2: 11.5.1 DATA
COVERAGE OF
LAG O H EAR .1
Julia sleeps. John sings. Suzy dreams. The words occurring in this minimal fragment have the following lexical definitions as proplets: 11.5.2 T HE
LEXICAL ENTRIES OF
LAG O
HEAR .1
7
names: sur: Suzy sur: John sur: Julia noun: noun: noun: cat: snp cat: snp cat: snp sem: nm f sem: nm m sem: nm f fnc: fnc: fnc: mdr: mdr: mdr: nc: nc: nc: pc: pc: pc: prn: prn: prn: one-place verbs inflected for third-person singular present indicative and punctuation sign: sur: dreams sur: sings sur: sleeps verb: dream verb: sing verb: sleep cat: ns3′ v cat: ns3′ v cat: ns3′ v sem: pres sem: pres sem: pres sur: . verb: v_1 arg: arg: arg: cat: v′ decl mdr: mdr: mdr: prn: nc: nc: nc: pc: pc: pc: prn: prn: prn: 7
In DBS, lexical name proplets have a sur value, but no core value. The core value is provided non-lexically in an act of baptism: the surface of the name is copied into the sur slot of the initial referent and the address of the referent is copied into the core slot of the name proplet (2.6.7). In linguistic examples, name proplets do not have a named referent. For simplicity, the name surface is used as the lexical core value instead. Such a core value is needed for cross-copying during hear mode derivations (3.4.2).
11.5 The Three Steps of Applying a Lago Hear Operation
209
The different attributes and their values are described in 4.2.2 and defined in Sects. A.1–A.3 of the appendix.8 Example 11.5.1 corresponds to p∧q∧r in propositional calculus, the most simple version of symbolic logic (CLaTR Sect. 6.4). Yet its LAGo hear analysis is subject to suspensions. This is shown by the graphical derivation 9.2.1. At the software level, each line in the graph analysis must be translated into an explicit hear mode operation. A hear mode operation is called by the input of the current next word, lexically analyzed by automatic word form recognition (11.5.2). In other words, a hear mode operation will not be attempted unless the current next word matches its second input pattern (self-organization efficiency). The first next word in the hear mode derivation 9.2.1 is the verb proplet sleep in line 1. Of the nine operations defined in LAGo Hear.2 (13.5.1), four match a verb proplet with their second input pattern, among them NOM+FV: 11.5.3 F IRST
STEP OF APPLYING A
LAG O
HEAR OPERATION
NOM+FV verb: β verb: β noun: α noun: α ′ ′ pattern cat: NP cat: NP X v cat: NP cat: #NP X v ⇒ fnc: β arg: α arg: fnc: level prn: K prn: K prn: prn: K NP ǫ {snp, pnp, s3, p3}; NP′ ǫ {n′ , ns3′ , n-s3′ }. If NP ǫ {s3, snp} then NP′ ǫ {ns3′ , n′ }; otherwise, NP′ ǫ {n-s3′ , n′ }. sur: sleep verb: sleep cat: n′ v sem: pres content arg: level mdr: nc: pc: prn: 5
Now that the operation NOM+FV has been activated, it must find a proplet matching its first input pattern. The range of candidates is limited to the proplets currently stored at the now front of the agent’s word bank (Sect. 11.3). In the case of derivation 9.2.1, only the initial Julia proplet has been stored there by automatic word form recognition. It happens to match the first pattern of the operation NOM+FV as follows: 8
No matter which method of automatic word form recognition is used (FoCL Sect. 13.5), it should work for any language. Changing to another language should require no more than providing another language-specific lexicon and, if present, other rules of morphological analysis. Also, it must be possible at any time to replace the current component of automatic word form recognition by a more advanced one (modularity). For this, the new component must merely be input/output equivalent with the old one: it should take unanalyzed surface tokens as input and render lexically analyzed surface types (i.e. isolated proplets) as output, suitable to be concatenated by the hear mode parser.
210
11. DBS.1: Hear Mode
11.5.4 S ECOND
STEP OF APPLYING A
LAG O
NOM+FV verb: β noun: α noun: α ′ pattern cat: NP cat: NP X v cat: NP fnc: arg: ⇒ fnc: β level prn: K prn: K prn:
HEAR OPERATION
verb: β ′ cat: #NP X v arg: α prn: K
NP ǫ {snp, pnp, s3, p3}; NP′ ǫ {n′ , ns3′ , n-s3′ }. If NP ǫ {s3, snp} then NP′ ǫ {ns3′ , n′ }; otherwise, NP′ ǫ {n-s3′ , n′ }. sur: sur: sleep noun: Julia verb: sleep cat: snp cat: n′ v sem: nm f sem: pres content fnc: arg: level mdr: mdr: nc: nc: pc: pc: prn: 1 prn: 1 5
By binding the core values Julia and sleep to the variable α and β, respectively, the output of the operation may be realized: 11.5.5 T HIRD
STEP OF APPLYING A
LAG O
NOM+FV noun: α verb: β noun: α ′ pattern cat: NP cat: NP cat: NP X v ⇒ fnc: β arg: fnc: level prn: K prn: K prn:
HEAR OPERATION verb: β ′ cat: #NP X v arg: α prn: K
NP ǫ {snp, pnp, s3, p3}; NP′ ǫ {n′ , ns3′ , n-s3′ }. If NP ǫ {s3, snp} then NP′ ǫ {ns3′ , n′ }; otherwise, NP′ ǫ {n-s3′ , n′ }. sur: sleep sur: sur: sur: verb: sleep verb: sleep noun: Julia noun: Julia cat: snp cat: n′ v cat: snp cat: #n′ v sem: nm f sem: pres sem: nm f sem: pres content arg: fnc: sleep arg: Julia fnc: level mdr: mdr: mdr: mdr: nc: nc: nc: nc: pc: pc: pc: pc: prn: 1 prn: prn: 1 prn: 1 1 1 5 5
The three step procedure of applying a hear mode operation holds not only for cross-copying, but also for substitution, absorption, and suspension (3.6.3). In the hear mode derivation 9.2.1, absorption is used in lines 2 and 5, and suspension is used in lines 3 and 6. Due to the absence of phrasal expressions, there is no substitution (examples are the application of AUX+NFV in 13.3.6 and 13.3.7 and of DET+CN in 13.2.4, 13.2.8, 13.3.3, 13.4.3, 13.4.7, and 13.4.9). The derivation in 9.2.1 requires the operations NOM+FV, V+V, S+IP, and IP+START, which may be summarized as the following (minimal) LAGo
11.6 Parsing a Small Fragment of English
211
hear grammar for a very small fragment of English modeling propositional calculus in the agent-based approach of DBS: 11.5.6 LAG O H EAR .1 LX: proplets defined in 11.5.2 NOM+FV noun: α verb: β verb: β noun: α ′ ′ cat: NP cat: NP cat: #NP X VT cat: NP X VT fnc: ⇒ fnc: β arg: α arg: prn: K prn: K prn: K prn: NP ǫ {snp, pnp, s3, p3}; NP′ ǫ {n′ , ns3′ , n-s3′ }. If NP ǫ {s3, snp} then NP′ ǫ {ns3′ , n′ }; otherwise, NP′ ǫ {n-s3′ , n′ }. VT ǫ {v, vi}. V+V verb: γ verb: γ ′ ′ cat: #X SM verb: β cat: #X SM verb: β prn: K ⇒ nc: (β K) prn: K nc: prn: K-1 prn: K-1 S+IP verb: β ′ cat: #X VT verb: v_1 ′ cat: VT SM prn: K
⇒
verb: β ′ cat: #X SM ip+ prn: K
VT ǫ {v, vi, vimp} and SM ǫ {decl, interrog, impv}. If VT = v, then SM ǫ {decl, interrog}; if VT = vi, then SM = interrog; if VT = vimp, then SM = impv. IP+START verb: β noun: α ′ cat: #X SM ip+ cat: NP prn: K prn:
SM ǫ {decl, interrog, imp}
⇒
verb: β noun: α ′ cat: #X SM cat: NP prn: K prn: K+1
The operations NOM+FV, S+IP, and IP-START show the use of variable restrictions for controlling (i) matching with the input and (ii) agreement.
11.6 Parsing a Small Fragment of English Using the lexicon defined in 11.5.2 and the operations of LAGo Hear.1 defined in 11.5.6, let us parse the test sample 11.5.1 from beginning to end. The derivation begins with storing the proplets derived by automatic word form recognition from the first two words, Julia and sleeps, at the now front. Of the four operations V+V, NOM+FV, S+IP, and IP+START provided by LAGo Hear.1, only the first two match the next word proplet sleep with their second pattern (11.5.3). Of these, only NOM+FV also finds a proplet matching
212
11. DBS.1: Hear Mode
its first pattern at the now front, namely Julia (11.5.4). The variable restrictions and agreement conditions of this operation are fulfilled and the application is successful: 11.6.1 C OMBINING Julia AND sleeps NOM+FV verb: β verb: β noun: α noun: α pattern cat: NP cat: #NP′ X VT cat: NP′ X VT cat: NP arg: α arg: ⇒ fnc: β level fnc: prn: K prn: K prn: prn: K NP ǫ {snp, pnp , s3, p3}; NP′ ǫ {n′ ,ns3′ , n-s3′ }. If NP ǫ {s3, snp} then NP′ ǫ {ns3′ , n′ }; otherwise, NP′ ǫ {n-s3′ , n′ }. VT ǫ {v, vi}. ⇑ ⇓ sur: sur: sleeps sur: sur: Julia noun: Julia verb: sleep noun: Julia verb: sleep cat: snp cat: #ns3′ v cat: snp cat: ns3′ v sem: nm f sem: pres sem: nm f sem: pres content fnc: sleep arg: Julia fnc: arg: level mdr: mdr: mdr: mdr: nc: nc: nc: nc: pc: pc: pc: pc: prn: 1 prn: 1 prn: 1 prn: 1 1 2 2
The ⇑ indicates the matching and binding of the input to the input pattern of the operation, while the ⇓ indicates the output of the operation. The matching complies with the conditions of 3.2.3. The variable restrictions shown underneath the pattern level are fulfilled by the content input. More specifically, NOM+FV relates the two input proplets by means of cross-copying (3.6.3, 1): the core value of Julia is bound to the variable α and the core value of sleep to β (⇑). In the pattern of the consequent, α appears in (is copied into) the arg slot of the verb and β appears in the fnc slot of the noun. At the content level, this causes (⇓) a corresponding copying of the values Julia and sleep bound to these variables, thus establishing the semantic relation subject/predicate between the two proplets. The cat patterns of NOM+FV code the valency canceling. In the input, the cat value snp (A.3.2, 4) of Julia is bound to the variable NP (valency filler) and the cat value ns3′ (A.3.2, 5) of sleep is bound to the variable NP′ (valency position). In the output of NOM+FV, the verb’s valency position ns3′ in the cat slot is canceled by #-marking the category segment, resulting in #ns3′ . Canceling by #-marking differs from canceling by deletion, as in categorial grammar. For sign-based categorial grammar, deleting a filled valency position may seem acceptable, but not for agent-based DBS. Apart from general considerations of principle, the information is needed in the speak mode for choosing the right word form.
11.6 Parsing a Small Fragment of English
213
For example, the English present tense distinguishes between the unmarked form, for example know with the valency position n-s3′ (nominative without third person singular), and the marked form, for example knows with the valency position ns3′ (nominative third person singular). When canceled, the differentiated nominative valencies #n′ , #ns3′ , and #n-s3′ are still availabe for subject-verb agreement in the speak mode.9 Next, automatic word form recognition provides the period10 as the next word proplet. It matches the second pattern only of S+IP. S+IP finds the sleep proplet at the now front matching its first pattern and applies: 11.6.2 C OMBINING sleeps AND . S+IP verb: α verb: α pattern verb: . ′ ′ cat: #X SM ip+ cat: #X VT ⇒ level cat: VT′ SM prn: K prn: K VT ǫ {v, vi, vimp} and SM ǫ {decl, interrog, impv}. If VT = vi, then SM = interrog; if VT = vimp, then SM = impv. sur: verb: sleep cat: #ns3′ v sem: pres content arg: Julia level mdr: nc: pc: prn: 1 2
⇑
sur: . verb: v_1 cat: v′ decl prn: 3
⇓ sur: verb: sleep cat: #ns3′ decl ip+ sem: pres arg: Julia mdr: nc: pc: prn: 1 2
As a function word, the period is absorbed (3.6.3, 3) into the verb proplet, replacing the cat value v with decl. The application of S+IP triggers a body flush (Sect. 11.3), so that the only proplet remaining at the now front is sleep. It will be needed for the extrapropositional V−V coordination (11.6.5) between the first and the second proposition of the sample sequence. The addition of the +ip marker (A.3.2, 10) in the cat slot of the output enables the application of IP+START. The possibility of immediately reapplying IP+START is prevented by removing the +ip marker in its output (11.6.3). Thus, ungrammatical Julia sleeps . . is not accepted by LAGo Hear.1. Next automatic word form recognition provides the John proplet as the next word to the now front. It is matched by the second pattern only of IP+START, which is a suspension operation (3.6.3, 4). 9
10
Additional valency segments are required for the agreement with auxiliaries, as in I+am, you+are, and he+is (A.4.2, A.5.4–A.5.6, A.3.3). Lexically, the core attribute of sentential punctuation signs is verb because they contribute to the finite verb proplet of a proposition. As function words, the main contribution of punctuation signs is their cat value, i.e. decl, interrog, or impv.
214
11. DBS.1: Hear Mode
11.6.3 C OMBINING Julia sleeps. AND John IP+START verb: α noun: β pattern cat: NP cat: #X′ SM ip+ ⇒ level prn: prn: K SM ǫ {decl, interrog, imp} ⇑ sur: sur: John verb: sleep noun: John cat: #ns3′ decl ip+ cat: snp sem: pres sem: nm m content arg: Julia fnc: level mdr: mdr: nc: nc: pc: pc: prn: prn: 1 4 2
verb: α cat: #X′ SM prn: K
noun: β cat: NP prn: K+1
⇓ sur: sur: verb: sleep noun: John cat: #ns3′ decl cat: snp sem: pres sem: nm m arg: Julia fnc: mdr: mdr: nc: nc: pc: pc: prn: 1 prn: 2 2 4
The application of the IP+START operation supplies the John proplet with an incremented prn value and deletes the ip+ marker in the sleep proplet. Next automatic word form recognition provides the sing proplet as the next word to the now front. Of the four operations in LAGo Hear.1, it is matched by the second patterns of (i) NOM+FV and (ii) V+V. Both apply successfully (self-organizing suspension compensation). More specifically, the operation NOM+FV finds the John proplet at the current now front matching its first pattern, enabling the following application: 11.6.4 A PPLYING NOM+FV
TO
John AND sing
NOM+FV verb: β verb: β noun: α noun: α ′ ′ pattern cat: NP cat: NP X VT cat: NP cat: #NP X VT ⇒ fnc: β arg: α arg: level fnc: prn: K prn: K prn: K prn: NP ǫ {snp, pnp , s3, p3}; NP′ ǫ {n′ ,ns3′ , n-s3′ }. If NP ǫ {s3, snp } then NP′ ǫ {ns3′ , n′ }; otherwise, NP′ ǫ {n-s3′ , n′ }. VT ǫ {v, vi}. ⇑ ⇓ sur: sings sur: sur: sur: noun: John verb: sing noun: John verb: sing cat: snp cat: ns3′ v cat: snp cat: #ns3′ v sem: nm m sem: pres sem: nm m sem: pres content fnc: arg: fnc: sing arg: John level mdr: mdr: mdr: mdr: nc: nc: nc: nc: pc: pc: pc: pc: prn: 2 prn: 2 prn: prn: 2 4 4 5 5
NOM+FV establishes the intrapropositional N/V relation by cross-copying the core values of John and sing into the respective fnc and arg slots. The V+V operation finds the sleep proplet, left by the body flush triggered by S+IP (11.6.2) at the now front, matching its first pattern and applies as well:
11.6 Parsing a Small Fragment of English
11.6.5 A PPLYING V+V V+V verb: γ ′ pattern cat: #X SM level nc: prn: K-1
TO
215
sleep AND sing
verb: β prn: K
⇑ sur: sur: verb: sleep verb: sing cat: #sn3′ decl cat: sn3′ decl sem: pres sem: pres content arg: Julia arg: level mdr: mdr: nc: nc: pc: pc: prn: 1 prn: 2 2 4
verb: γ ′ cat: #X SM ⇒ nc: (β K) prn: K-1
verb: β prn: K
⇓ sur: sur: verb: sleep verb: sing cat: #ns3′ decl cat: sn3′ decl sem: pres sem: pres arg: Julia arg: mdr: mdr: nc: (sing 2) nc: pc: pc: prn: 1 prn: 2 2 4
The extrapropositional V−V coordination is established by unidirectionally cross-copying the (address of the) core value of sing into the nc slot of sleep. The derivation continues as in the previous steps until there is no more input. Using the line numbers of 9.2.1, the sequence of operations is as follows: 11.6.6 T IME - LINEAR line 1 2 3 4 4 5 6 7 7 8
operation NOM+FV S+IP IP+START NOM+FV V+V S+IP IP+START NOM+FV V+V 2 S+IP
PARSING OF input proplets Julia sleep sleep . sleep John John sing sleep sing sing . sing Suzy Suzy dream sing dream dream .
9.2.1
WITH
LAG O H EAR .1
application 11.6.1 11.6.2 11.6.3 11.6.4 11.6.5
The hear mode derivation has resulted in a content. Its proplets fill an empty word bank as follows: 11.6.7 S TORAGE
OF THE
member records
now front sur: verb: dream cat: #n-s3′ decl sem: pres arg: Suzy mdr: nc: pc: prn: 3
...
LAG O H EAR .1
DERIVATION IN A WORD BANK
owner values dream
216
11. DBS.1: Hear Mode
member records sur: noun: John cat: snp sem: nm m fnc: sing mdr: nc: pc: prn: 2 sur: noun: Julia cat: snp sem: nm f fnc: sleep mdr: nc: pc: prn: 1 sur: verb: sing cat: #n-s3′ decl sem: pres arg: John mdr: nc: (dream 3) pc: prn: 2 sur: verb: sleep cat: #n-s3′ decl sem: pres arg: Julia mdr: nc: (sing 2) pc: prn: 1 sur: noun: Suzy cat: snp sem: nm f fnc: dream mdr: nc: pc: prn: 3
now front
owner values John
Julia
sing
sleep
Suzy
As the top verb of the last sentence, the dream proplet is still at the now front, available for concatenation with the verb of a possible next proposition. The owner values are represented by the unmarked form, e.g. sing. Because the sample text 11.5.1 contains each content word only once, there is only one member proplet in each token line. Normally a token line has many member proplets (5.1.2).
12. DBS.1: Speak Mode
This chapter completes the definition of DBS.1 by modeling the speak mode for the content derived by LAGo Hear.1 (11.5.6) from the sequence of unanalyzed English surfaces in 11.5.1. The fragment’s speak mode is based on a LAGo think grammar, called LAGo Think.1, which navigates through the content of the word bank 11.6.7 by taking a member proplet as input and computing a continuation proplet as output (selective activation). Navigating through the content of a word bank is like moving a focus point from one proplet to the next, temporarily highlighting the proplets traversed. The procedure reintroduces the macro word order of the language, lost during the conversion of surfaces into a set of order-free proplets in the hear mode. The micro word order, in contrast, is reintroduced by lexicalization rules which are integrated into the LAGo think operations.1 Utilizing the grammatical information contained in the proplets traversed, this process includes not only the language-dependent precipitation of function words, but also the correct realization of content word forms and the handling of agreement.
12.1 Definition of LAGo Think.1 A LAGo Think system consists of (i) the proplets in a word bank, (ii) a set of LAGo navigation operations, and (iii) a set of LAGo inferences (Sect. 5.3). Driven by the principle of balance (3.5.5), these components provide conceptualization – which is the speaker’s process of selecting (navigation) or deriving (inferencing) what to say. Just as the hear mode takes ordered surfaces as input which are turned into proplets by automatic word form recognition, the think mode takes connected proplets in the agent’s word bank as input which are turned into surfaces by automatic word form production. While the content is relatively language-independent (modulo different valency structures in lexicalization, CLaTR Sect. 3.6), the navigation is language1
The First Edition used separate LA think and LA speak grammars taking turns during production. During upscaling, this proved to be prohibitively complex for programming.
218
12. DBS.1: Speak Mode
dependent insofar as it provides the macro word order of the natural language or language class at hand.2 It holds in general that the hear, the think, and the speak mode of a LAGo system must be co-designed. For example, given that the operation patterns of LAGo Think.1 apply to proplets which have been derived by means of the lexicon and the operations of LAGo Hear.1, LAGo Think.1 must use the same attributes and the same values as the LAGo Hear.1 system defined in 11.5.6. LAGo Think.1 is defined as the following set of navigation operations: 12.1.1 F ORMAL
DEFINITION OF
LAG O T HINK .1
LX: member proplets in the word bank 11.6.7 V/N verb: α arg: β Y prn: K
noun: β ⇒fnc: α prn: K
#-mark β in the arg slot of proplet α
N/V noun: β fnc: α prn: K
verb: α ⇒arg: #β Y prn: K
#-mark α in the fnc slot of proplet β
V−V verb: α verb: β arg: #X nc: (β K+1)⇒ prn: K+1 prn: K
V stands for a verb proplet and N for a noun proplet. The “/” line stands for the semantic relation of subject/predicate and the “−” line for conjunct−conjunct. The instruction following the first operation patterns is short for #-mark the value bound to the variable β in the arg slot of the input matching the proplet α (12.3.3), and similarly for the instruction of the second operation. The operation V/N navigates from the predicate to the subject, the operation N/V from the subject back to the predicate, and the operation V−V from the verb (predicate) of the current proposition to the verb of the next proposition (see the numbered arcs graph (NAG) and the surface realization in 9.2.3) The overall purpose of navigating along the semantic relations between proplets is the derivation of blueprints for action (including language action) which maintain the agent in a state of balance (3.5.5) vis à vis a continuously changing external and internal environment. Here, however, let us pursue the 2
A basic difference between natural languages is the order of the subject, the predicate, and the object. In language typology, this has been described by Greenberg (1963) as the distinction between SVO (subject-verb-object), VSO (verb-subject-object), and SOV (subject-object-verb) languages.
12.2 The Two Steps of Applying a LAGo Think Operation
219
more modest goal of mapping a content into an adequate sequence of unanalyzed English surfaces which equals the original language input to the hear mode (3.5.2 ff.). Given the simplicity of the text 11.5.1, the traversal through the word bank 11.6.7 provided by LAGo Think.1 will be equally simple.
12.2 The Two Steps of Applying a LAGo Think Operation In the hear mode, an operation is activated by a next word proplet matching its second pattern (11.5.3). If there is a proplet at the now front matching the first pattern (11.5.4), the operation applies. As output of the hear mode operation, the two input proplets are concatenated (11.5.5) or combined (11.6.2). In the think mode, in contrast, there is neither a next word provided by automatic word form recognition nor a limitation to the proplets at the current now front. Instead, any proplet in the agent’s word bank may be activated by an external or an internal stimulus (CLaTR Sect. 13.4). Taking this proplet as the input to its first pattern, a think mode operation advances to a next proplet by computing the address from a continuation and the prn value of the current proplet. If there are several continuation values available, the one considered best for maintaining the agent in a state of balance (3.5.5) is chosen. For present purposes, however, we select the continuation required for mapping the content back into the original hear mode surface. The application of a LAGo think operation consists of two steps. They are illustrated below with V/N applying to the sleep member proplet in the word bank 11.6.7: 12.2.1 F IRST
STEP OF APPLYING A
V/N verb: α noun: β pattern arg: β Y ⇒ fnc: α level prn: K prn: K ⇑ ⇓ verb: sleep ′ cat: #n-s3 decl sem: pres content arg: Julia level mdr: nc: (sing 2) pc: prn: 1
LAG O
THINK OPERATION
#-mark β in the arg slot of proplet α
The first proplet pattern matches the input proplet at the content level successfully: the attributes verb, arg, and prn of the pattern have counterparts
220
12. DBS.1: Speak Mode
in the input proplet, and the variables of the pattern are compatible with the corresponding values of the input proplet. During matching, the verb value sleep is bound to the variable α, the arg value Julia to the variable β, and the prn value 1 to the variable K. Because the second pattern has the same variables as the first, though in different slots, this value assignment allows a navigation to (retrieval of) the second content proplet: 12.2.2 S ECOND
STEP OF APPLYING A
V/N verb: α noun: β pattern arg: β Y ⇒ fnc: α level prn: K prn: K ⇑ ⇓ sur: sur: verb: sleep noun: Julia cat: #n-s3′ declcat: snp sem: pres sem: nm f content arg: Julia fnc: sleep level mdr: mdr: nc: (sing 2) nc: pc: pc: prn: 1 prn: 1
LAG O
THINK OPERATION
#-mark β in the arg slot of proplet α
While LAGo hear operations use the # marker to cancel valency positions in cat slots, LAGo think operations use it to cancel values in the continuation slots arg, fnc, and mdr.
12.3 Navigating with LAGo Think.1 Let us traverse the content derived from the test sequence 11.5.1 using the word bank 11.6.7 and the operations of LAGo Think.1 (12.1.1). Anticipating the speak mode, we begin with the DBS graph analysis (repeated from 9.2.3): 12.3.1 DBS
GRAPH ANALYSIS UNDERLYING PRODUCTION OF
(iii) numbered arcs graph (NAG)
(i) semantic relations graph (SRG) sleep
Julia
sing
John
11.5.1
dream
sleep 1 2 Julia
Suzy
3
sing
6
dream 7
4
8
5 John
Suzy
(ii) signature
V
V
V
(iv) surface realization 1 2 3 4 5 6 7 8 Julia sleeps_ John sings_ Suzy dreams_ V/N N/V V−V V/N N/V V−V V/N N/V
.
N
N
N
.
.
12.3 Navigating with LAGo Think.1
221
The member proplet sleep in the word bank 11.6.7 has been activated by an external or internal stimulus. The only LAGo Think.1 operations which match a verb with their first pattern are V−V and V/N. Because the arg value of sleep has not yet been #-marked, the attempt by V−V fails. For the same reason V/N succeeds: 12.3.2 M OVING pattern level
V/N verb: α arg: β Y prn: K
⇑ verb: sleep
FROM
TO
N (arc 13 )
noun: β ⇒fnc: α prn: K
′
V
cat: #n-s3 decl sem: pres content arg: Julia level mdr: nc: (sing 2) pc: prn: 1
#-mark β in the arg slot of proplet α (see goal proplet in 12.3.3)
⇓ noun: Julia cat: snp sem: nm f fnc: sleep mdr: nc: pc: prn: 1
The clause following the operation patterns is called an instruction. Instructions supplement the expressive power of think mode operations consisting of two patterns; for more see 14.2.2 following. Now the noun Julia is the current proplet. In the think mode, it must be matched by the input pattern (12.2.1) of the next operation. In LAGo Think.1, this is the operation N/V, which results in the following transition: 12.3.3 M OVING
FROM
N
TO
N/V noun: β verb: α pattern fnc: α ⇒arg: #β Y level prn: K prn: K ⇑ noun: Julia
cat: snp sem: nm f content fnc: sleep level mdr: nc: pc: prn: 1
V (arc 2)
#-mark α in the fnc slot of proplet β
⇓ verb: sleep ′ cat: #n-s3 decl sem: pres arg: #Julia mdr: nc: (sing 2) pc: prn: 1
The #-marking of the Julia value in the arg slot of sleep, provided by the instruction of V/N in 12.3.1, is now shown in the goal proplet. The N\V move 3
The arc numbers in the headings refer to the NAG in 12.3.1.
222
12. DBS.1: Speak Mode
back to the verb enables (i) the speak mode realization of sleeps (12.4.3) and (ii) a V−V traversal to the next proposition. Now the current proplet is the verb sleep. The LAGo Think.1 operations matching a verb with their first input pattern are V/N and V−V. V/N fails because the first (and only) arg value of sleep is #-marked. For the same reason V−V succeeds: 12.3.4 M OVING
FROM ONE PROPOSITION TO THE NEXT
(arc 3)
V−V verb: α verb: β pattern arg: #X ⇒ prn: K+1 nc: (β K+1) level prn: K ⇑ ⇓ verb: sleep verb: sing ′ ′ cat: #n-s3 decl cat: #n-s3 decl sem: pres sem: pres arg: John content arg: #Julia mdr: level mdr: nc: (sing 2) nc: (dream 3) pc: pc: prn: 1 prn: 2
This think mode navigation is along a semantic relation which was established in the hear mode by V+V (11.6.5). In the speak mode, the traversal would be empty in that lexverb (12.4.3, line 1) would not produce a surface. The successful application of V−V renders the verb proplet sing as the current proplet. Because its arg value is not yet #-marked, an attempt to reapply V−V fails. For the same reason V/N succeeds, similar to 12.3.2. The second and third proposition of our example (9.2.3) may be traversed in the same manner as shown here for the first.
12.4 The Lexicalization Rule lexverb The navigation powered by LAGo think may run either as pure thought (5.1.1), i.e. without concomitant language production, or there may be a simultaneous mirroring of the navigation by natural language surfaces in the speak mode. This mirroring is provided by lexicalization rules which are embedded into the LAGo think operations. The lexicalization rules of the speak mode may be regarded as the inverse of automatic word form recognition in the hear mode. While automatic word form recognition in the hear mode may be fudged by full-form lookup (11.5.2, 13.1.2), this is not advisable for automatic word form production. To be empirically adequate, lexicalization rules must produce
12.4 The Lexicalization Rule lexverb
223
proper tense, mood, aspect, and number of simple verb forms (swam), proper agreement in complex verb forms (has been swimming), proper agreement in determiner-noun combinations (every girl, all girls), different surfaces from the same proplet (when ... kissed, 10.5.3), etc. Extending LAGo Think.1 (12.1.1) to LAGo Think-Speak.1 is based on the definition of lexicalization rules which are embedded into the sur slot of the goal proplet. Their names are derived automatically from the core attribute. Because there are only three core attributes in DBS, there are only three lexicalization rules, called lexnoun, lexverb, and lexadj. A lexicalization rule takes an argument, as in lexverb(ˆ α), and produces a language-dependent surface. The argument, pronounced “hat alpha,” is a variant of the core value bound to α. For example, the “hat version” of the core value sleep is schlaf(en) in German, dorm(ir) in French, and spa´c in Polish. 12.4.1 F ORMAL
DEFINITION OF
LAG O T HINK -S PEAK .1
LX: member proplets in the word bank 11.6.7 V/N verb: α arg: β Y prn: K
ˆ sur: lexnoun(β) ⇒noun: β fnc: α prn: K sur: lexverb(α) ˆ noun: β verb: α fnc: α ⇒ arg: #β Y prn: K prn: K ˆ verb: α sur: lexverb(β) verb: β arg: #X nc: (β K+1)⇒ arg: Z prn: K prn: K+1
#-mark β in the arg slot of proplet α N/V
#-mark α in the fnc slot of proplet β V−V
Lexicalization tables, built from traditional bilingual dictionaries, convert language-independent core values into their language-dependent counterparts: 12.4.2 S CHEMA
OF A LEXICALIZATION TABLE
If α = f, then α ˆ = g. where α and α ˆ are variables, f is a language-independent concept4 serving as a proplet core value, and g is its language-dependent counterpart. 4
Because concepts are represented here by corresponding English content words, the lexicalization tables for English would appear to be rather trivial. In a nontrivial implementation, the lexicalization table would show the surfaces of different languages for the same concept, e.g., book, Buch, livre, libro, etc. This would be useful for implementing translingual information retrieval systems as described in Frederking, Mitamura, Nyberg, and Carbonell (1997).
224
12. DBS.1: Speak Mode
The correct morphological variant of the stem is determined by the proplet’s cat and sem values. Consider the lexicalization function lexverb as it realizes the English surface slept from the core value of sleep: 12.4.3 L EXICALIZATION
RULE
lexverb
sur: lexverb(α) ˆ verb: α rule cat: NOM′ OBQ′ SM level sem: T arg: ARG1 ARG2 ARG3
⇒
sur: SUR PNC verb: α cat: NOM′ OBQ′ SM sem: T arg: ARG1 ARG2 ARG3
If SM = decl and if PNC 6= NIL, then PNC = . If lexverb(α) ˆ is called by 1 V−V, then SUR = NIL and PNC = NIL. 2 N/V and ARG2 = NIL, then SUR 6= NIL and PNC 6= NIL. 3 N/V, and ARG2 6= NIL, then SUR 6= NIL and PNC = NIL. 4 N\V, ARG3 6= NIL and not #-marked, then SUR = NIL and PNC = NIL. 5 N\V, ARG3 = NIL or #-marked, then SUR = NIL and PNC 6= NIL. If SM = interrog and if PNC 6= NIL, then PNC = ? V−V?: If WH 6 ǫ X (Yes/No) ... If WH ǫ X (WH) ... If SM = impv and if PNC 6= NIL, then PNC = ! V−V!: . . . ... 6 7 8 9 10 11 12 ... 13 ... 14 ... 15 ...
If SUR 6= NIL, OBQ′ 6 ∩ {#hv′ #be′ }, NOM′ = #n-s3′ , and T = pres, then SUR = α. ˆ NOM′ = #ns3′ , and T = pres, then SUR = α ˆ +s. NOM′ = #n′ , T = past, and α ˆ ǫ {list of regular verbs}, then SUR = α ˆ +ed. α ˆ = arise, then SUR = arose, α ˆ = bear, then SUR = bore, α ˆ = buy, then SUR = bought, α ˆ = sleep, then SUR = slept If SUR 6= NIL, OBQ′ ∩ {#hv′ #be′ }, cat = #n-s3′ #hv′ #be′ decl, and sem = past perf prog, then SUR = had been α+ing ˆ .
sur: lexverb(sleep) verb: sleep input cat: #n′ decl outputsem: past level . . . pc: prn: 1 input
sur: slept_. verb: sleep cat: #n′ decl sem: past . . . pc: prn: 1 output
The rule level consists of two proplet patterns, one for matching the input, the other for deriving the output. The output is specified by means of two
12.5 The Lexicalization Rule lexnoun
225
substitution variables, SUR and PNC, the realization of which is defined by conditions which “read” values in the input proplet, here illustrated by sleep (see input output level). The conditions are organized in three parts, one for each syntactic mood. Each part first specifies the punctuation sign, depending on the value bound to SM (syntactic mood). Then, the conditions for the LA think operations are defined in terms of whether or not the variables SUR and PNC are replaced by NIL (here only for the declarative, lines 1–5). For lexverb, the speak mode operations having a V as their goal proplet are V−V, N/V, and N\V (14.5.1). Following the skeletal condition for the interrogative and the imperative, the precise realizations of non-NIL SUR and PNC values are defined independently of the syntactic mood (lines 6–14). For example, the conditions in lines 6 and 9, i.e. If SUR 6= NIL, OBQ′ 6 ∩ {#hv′ #be′ }, NOM′ = #n′ , and T = past, then SUR = α ˆ +ed. would realize the past tense of a regular input proplet, e.g. walk, as walk+ed. The realization of complex verb forms is listed (e.g., line 14 and 15), here illustrated with the pattern has been α ˆ +ing (Sect. 14.3). The input output level at the bottom specifies the input proplet sleep with the sur value lexverb(sleep) and various grammatical values such as #n′ and past. In the output proplet, the sur value lexverb(sleep) is replaced by slept_.. As an irregular past tense form, slept is listed below the patterns for regular forms (e.g. lines 10–13). The hear mode transition from one sentence to the next in 12.3.4 does not provide a coordinating conjunction. Therefore, the speak mode operation of extrapropositional coordination does not realize a surface in DBS.1. This is formulated in 12.4.3 by the condition for V−V in line 1, which assigns NIL to both replacement variables, SUR and PNC.
12.5 The Lexicalization Rule lexnoun Let us complement the lexicalization rule lexverb (12.4.3) with the rule lexnoun. It must realize elementary nouns like pronouns and names,5 unmodified phrasal nouns like the_car, and the beginning and the end of modified 5
A simplified lexical analysis of names is illustrated in 11.5.2. Their surface realization in English is easy because there is no morphological variation. By setting the language-dependent stem βˆ equal to the core value β, lexnoun simply realizes the value bound to β. We may leave the genitive aside because it does not occur in our example.
226
12. DBS.1: Speak Mode
phrasal nouns like the heavy old car. As an example of elementary nouns consider the definition of the personal pronouns as provided by automatic word form recognition: 12.5.1 L EXICAL
ANALYSIS OF THE PERSONAL PRONOUNS OF
E NGLISH
Different pronominal forms with the pro1 pointer to first person sur: us sur: we sur: me sur: I noun: pro1 noun: pro1 noun: pro1 noun: pro1 cat: obq cat: obq cat: p1 cat: s1 sem: sg sem: sg sem: pl sem: pl fnc: fnc: fnc: fnc: mdr: mdr: mdr: mdr: nc: nc: nc: nc: pc: pc: pc: pc: prn: prn: prn: prn: The pronominal form with the pro2 pointer to second person sur: you noun: pro2 cat: sp2 sem: fnc: mdr: nc: pc: prn: Different pronominal forms with the pro3 pointer to third person sur: he sur: him sur: she sur: her sur: it sur: they sur: them noun: pro3 noun: pro3 noun: pro3 noun: pro3 noun: pro3 noun: pro3 noun: pro3 cat: s3 cat: obq cat: s3 cat: s3 cat: snp cat: p3 cat: obq sem: sg m sem: sg m sem: sg f sem: obq sem: -mf sg sem: pl sem: pl fnc: fnc: fnc: fnc: fnc: fnc: fnc: mdr: mdr: mdr: mdr: mdr: mdr: mdr: nc: nc: nc: nc: nc: nc: nc: pc: pc: pc: pc: pc: pc: pc: prn: prn: prn: prn: prn: prn: prn:
The personal pronouns use three indexical pointers, pro1, pro2, and pro3, serving as the core values (2.6.6; A.3.2, 2; A.4.2). The sem values m, f, and -mf distinguish gender, and sg and pl distinguish number (A.3.2, 7). Further distinctions (FoCL 17.2.1) are the cat values s1 for nominative singular first person, p1 for nominative plural first person, sp2 for nominative and oblique, singular and plural of second person, s3 for nominative singular third person, p3 for nominative plural third person, obq for most of the oblique (nonnominative) forms, and snp for the -mf nominative and oblique third person singular (A.3.2, 4). Together with the analysis of determiners (Sect. 6.4) these distinctions suffice for a precise surface realization of elementary and phrasal nouns in En-
12.5 The Lexicalization Rule lexnoun
227
glish by the operations with an N as their second input pattern, i.e. V/N, V\N, and A|N, of 14.5.1: 12.5.2 T HE
LEXICALIZATION RULE
lexnoun
ˆ sur: DET NN sur: lexnoun(β) noun: β rule noun: β cat: NP ⇒ cat: NP level sem: QL sem: QL mdr: Y mdr: Y V/N, V\N, A|N: 1 If nm ǫ QL or β ∩ {pro1, pro2, pro3}, then DET = NIL and NN 6= NIL 2 If nm 6 ǫ QL, β 6 ∩ {pro1, pro2, pro3}, and 3 Y 6= NIL and not #-marked, then DET 6= NIL and NN = NIL, 4 Y 6= NIL and #-marked, then DET = NIL and NN 6= NIL, 5 Y = NIL, then DET 6= NIL and NN 6= NIL. determiner 6 If QL ⊃ {indef sg}, then DET ǫ {a(n), some}. 7 If QL ⊃ {indef pl}, then DET ǫ {some, NIL}. 8 If QL ⊃ {exh pl} and NP = snp, then DET = every. 9 If QL ⊃ {exh pl} and NP = pnp, then DET = all. 10 If QL ⊃ {def}, then DET = the. common noun 11 If nm 6 ǫ QL, β 6 ∩ {pro1, pro2, pro3}, and 12 NP = snp, then NN = βˆ ˆ 13 NP = pnp and βˆ ǫ {list of regular nouns}, then NN = β+s ˆ 14 NP = pnp and β = child, then NN = children ... 15 NP = pnp and βˆ = man, then NN = men ... name 16 If NP = snp, and nm ǫ QL, then DET = NIL and NN = βˆ pronoun 17 If β ∩ {pro1, pro2, pro3}, then DET = NIL. 18 If β = pro1 and NP = s1, then NN = I. 19 If β = pro1, NP = obq, and QL ⊃ {sg}, then NN = me. 20 If β = pro2, then NN = you. 21 If β = pro3, NP = s3, and QL ⊃ {m}, then NN = he. 22 If β = pro3, NP = obq, and QL ⊃ {m sg}, then NN = him. 23 If β = pro3, NP = s3, and QL ⊃ {f sg}, then NN = she. 24 If β = pro3, NP = obq, and QL ⊃ {f sg}, then NN = her. 25 If β = pro3, NP = snp, then NN = it. 26 If β = pro1, NP = p1, then NN = we. 27 If β = pro1, NP = obq, and QL ⊃ {pl}, then NN = us. 28 If β = pro3, NP = p3, then NN = they. 29 If β = pro3, NP = obq, and QL ⊃ {pl}, then NN = them. sur: noun: cookie input cat: snp output sem: indef sg level mdr: . . . prn: 3 input
sur: a_cookie noun: cookie cat: snp sem: indef sg mdr: . . . prn: 3 output
Unlike lexverb (12.4.3), there is no top level distinction corresponding to the different syntactic moods. Also, the decision of whether or not the substitution
228
12. DBS.1: Speak Mode
variables, here DET and NN, are NIL does not depend on the operations V/N, V\N, and A|N, but on whether or not the noun is elementary and if phrasal, on whether or not it has at least one elementary modifier (lines 1–5). Like lexverb, the non-NIL values are specified independently of the initial conditions (lines 6–29). The rule may be easily extended to possessives, i.e. determiners with an indexical component. Like pronouns, determiners constitute a closed class in the sense of morphology: they form of a small stable set of frequently used word forms. However, while pronouns are content words, determiners are function words. In DBS, a word is called a content word if its semantic kind has its own mechanism of reference. This holds for symbols (iconic, 2.6.4), pronouns (pointers, 2.6.5), and names (2.6.7, baptism). It also holds for coreferential uses of symbols and pronouns based on addresses (HBTR, Sect. 3.5). A word is called a function word, in contrast, if it does not have its own reference mechanism. Instead, it uses the reference mechanism of the associated content word. This holds, for example, for determiners. As function words, determiners are absorbed in the hear mode and precipitated in the speak mode (unlike pronouns6 ). The reference mechanism of a determiner_noun combination is that of the common noun, i.e. symbol. In 12.5.2, the input proplet cookie is an unmodified singular phrasal noun and realized as a_cookie.
12.6 The Lexicalization Rule lexadj The third proplet class besides V and N is A, i.e. the adjective. It serves as a property in the sense of philosophy and as a modifier in the sense of symbolic logic. Of the operations in the LAGo Think-Speak.2 grammar 14.5.1, N|A, A−A, and A←A, have an A as their goal proplet. In the speak mode, they call the lexicalization rule lexadj. The surfaces to be realized by lexadj are elementary adnominal adjs in the positive, comparative, and superlative such as quick, quicker, quickest and their adverbial counterpart quickly in the positive.7 6
7
There are languages like classical Latin which treat the nominative of a personal pronoun as a verbal suffix, as in am/o (I love). This option integrates an indexical property into the verb – like tense, but it does not turn separate pronouns (indexicals), e.g. ego, into function words. See Dauses (1997 et seq.) for discussions of the redundancies found in many natural languages. Phrasal modifiers, i.e. prepositional phrases such as on the table, may be used adnominally (15.2.2) and adverbially (15.2.3). As explained in connection with 6.5.4, prepositional phrases have the core attribute noun, introduced lexically by the preposition (A.4.5). Therefore, their speak mode realization is a task of lexnoun, though omitted in 12.5.2.
12.6 The Lexicalization Rule lexadj
12.6.1 L EXICALIZATION sur: lexadj(ˆ γ) rule adj: γ level cat: CAT sem: QL
RULE
⇒
229
lexadj sur: ADJ adj: γ cat: CAT sem: QL
1 If lexadj is called by A←A, then ADJ = NIL. 2 If lexadj is called by N|A or A−A and 3 γˆ ǫ {list of regular elementary adjectives}, 4 CAT ǫ {adn, adnv}, and 5 QL = pad, then ADJ = γˆ ; 6 QL = cad, then ADJ = γ ˆ +er; 7 QL = sad, then ADJ = γˆ +est. 8 CAT = adv, QL = pad, then ADJ = γˆ +ly; 9 If γˆ = good and QL = pad, then ADJ = good, 10 QL = cad, then ADJ = better, 11 QL = sad, then ADJ = best. ...
input output level
sur: lexadn(big) adj: big cat: adnv sem: pad mdr: mdd: table nc: pc: prn: 8 7 input
sur: big adj: big cat: adnv sem: pad mdr: mdd: table nc: pc: prn: 8 7 output
Like lexverb (12.4.2), lexadj first specifies for each of the operations whether the substitution variable ADJ is NIL or non-NIL. Then the non-NIL values are specified independently of the operations. The only word form variation in synthetic8 elementary adjectives of English is comparation, as in heavy, heavier, heaviest, coded by the sem values pad (positive adjective), cad (comparative adjective), sad (superlative adjective), respectively (A.3.2, 9). In English, there are no agreement restrictions between an adnominal adjective and the preceding determiner or the following noun. The regular cases are handled by means of a template (pattern proplet) containing the variable γˆ (hat gamma), which is restricted to the list of all regular synthetic adjectives of the language (lines 1–8). Irregular cases, in contrast, like the suppletive paradigm of good-better-best, are handled by explicit listing (lines 9–11). The method for an efficient programming of the language-dependent surface realization in the speak mode of DBS may be summarized as follows: during 8
A synthetic comparation of an adjective is exemplified by quicker, in contrast to an analytic comparation as in more beautiful.
230
12. DBS.1: Speak Mode
the hear mode derivation, the grammatical properties relevant for the word order of content words, the correct choice of the morphological variant for agreement, and the appropriate function word precipitation are coded into the proplets representing the content. This involves values like decl, interrog, and imp for syntactic mood, sg and pl for number, #n′ , #ns3′ , and #n-s3′ for different nominative valency positions, and so on. In the speak mode, the macro word order is determined by the LAGo think navigation. The remaining detail of the micro word order is provided by the lexicalization rules which are integrated into the navigation operations. Various variable conditions are used by the lexicalization rules to read the relevant proplet values by means of pattern matching and use them for precipitating the correct function words at the correct moment, choosing the correct word forms for agreement, and generally providing a time-linear realization of surfaces which is as adequate as it is efficient. This method may be extended to complicated constructions, such as those shown in Chap. 10. The following chapters provide an extension of DBS.1 to the more basic constellations det_adn_adn_noun, det_adn_noun, det_noun, and aux_aux_nfverb, as well as the use of a two-place and a three-place verb, first in the hear mode (Chap. 13) and then in the speak mode (Chap. 14). The Sects. 13.6 (hear mode) and 14.6 (speak mode) discuss verb-initial constructions like English Yes/No interrogatives and imperatives.
13. DBS.2: Hear Mode
Corresponding to the upscaling from propositional to predicate calculus (CLaTR Sect. 6.4), this chapter extends LAGo Hear.1 (11.5.6) to LAGo Hear.2 (13.5.1). The upscaling applies solely to the data coverage of English. Based on changes in the linguistic definitions such as an extended lexicon, an extended handling of agreement, and additional operations for handling a further test sequence, the upscaling does not involve any changes in the general LAGo Hear software machine (1.2.1, i). The latter consists of the data structure of proplets, the database schema of a word bank, automatic word form recognition, the format of LAGo hear operations, the matching between operation patterns and language proplets, the time-linear derivation order of syntacticsemantic parsing, and the managing of the now front.
13.1 Lexical Lookup for the DBS.2 Example Let us upscale DBS.1 to DBS.2 by adding the following sequence of sentences to the earlier data coverage 11.5.1: 13.1.1 T HE
TEST SEQUENCE ADDED IN
DBS.2
The heavy old car hit a beautiful tree. The car had been speeding. A farmer gave the driver a lift. Using full-form lookup (FoCL Sect. 13.5) as a short cut, these English word form surfaces have the following word form analyses in DBS: 13.1.2 T HE
LEXICAL PROPLETS OF THE TEST SEQUENCE
sur: the sur: heavy sur: old sur: car sur: hit sur: a(n) sur: beautiful noun: n_1 adj: heavyadj: oldnoun: carverb: hit noun: n_2 adj: beautiful cat: nn′ npcat: adn cat: adj cat: sn cat: n′ a′ vcat: sn′ snp cat: adn sem: def sem: sem: sem: sg sem: past sem: indef sgsem: fnc: mdd: mdd: fnc: arg: fnc: mdd: mdr: mdr: mdr: mdr: mdr: mdr: mdr: nc: nc: nc: nc: nc: nc: nc: pc: pc: pc: pc: pc: pc: pc: prn: prn: prn: prn: prn: prn: prn:
232
13. DBS.2: Hear Mode
sur: had sur: tree verb: v_1 noun: tree ′ ′ cat: sn cat: n hv sem: past sem: sg sur: . verb: . fnc: cat: v′ declarg: mdr: mdr: prn: nc: nc: pc: pc: prn: prn: sur: driver sur: lift noun: drivernoun: lift cat: sn cat: sn sem: sg sem: sg fnc: fnc: mdr: mdr: nc: nc: pc: pc: prn: prn:
sur: gave sur: been sur: speeding sur: farmer verb: v_2 verb: speed noun: farmerverb: give ′ cat: n′ d′ a′ cat: sn v cat: be hvcat: be sem: past sem: perf sem: prog sem: sg arg: fnc: arg: arg: mdr: mdr: mdr: mdr: nc: nc: nc: nc: pc: pc: pc: pc: prn: prn: prn: prn:
v
The proplets are shown in the surface order of 13.1.1. However, each word form proplet is listed only once. Thus, the, a, and car are missing towards the end of the above sequence of lexical proplets, because have already occurred. Because the hear mode analysis of the test sequence is quite large, the three sentences are shown in the separate Sects. 13.2–13.4. Each of these sections begins with a graphical hear mode derivation, followed by the sequence of explicit operation applications, similar to the hear mode derivation in Sect. 11.6. As in LAGo Hear.1, the suspensions encountered by the extrapropositional coordinations connecting the three test sentences are handled by managing the now front (Sect. 11.3): the operation V+V (11.6.5) cross-copies the core value of the present top verb into the nc slot of the preceding top verb. The hear mode operations deriving the test sequence 13.1.1 are summarized in Sect. 13.5 as LAGo Hear.2, complete with the value definitions and variable conditions. The proplets resulting from the hear mode derivation of the example 13.1.1 are shown in 14.1.1 as they are stored in the agent’s word bank. The simplest way to upgrade the data coverage of DBS.2 further is by extending the lexicon and automatic word form recognition.
13.2 Hear Mode Analysis of the First Sentence The first test sentence The heavy old car hit a beautiful tree shows an elementary form of the two place verb hit and phrasal forms of the subject car and the direct object tree. For returning to the noun, the intrapropositional coordination heavy old modifying the subject requires duplex cross-copying into the respective nc and pc slots (13.2.3, 14.2.1), in contradistinction to the simplex cross-copying in extrapropositional coordination (Sect. 9.2).
13.2 Hear Mode Analysis of the First Sentence
The graphical hear mode analysis is as follows: 13.2.1 G RAPHICAL The lexical lookup noun: n_1 fnc: mdr: prn:
heavy
adj: heavy mdd: nc: pc: prn:
HEAR MODE DERIVATION OF THE FIRST SENTENCE
old
adj: old mdd: nc: pc: prn:
hit
car
noun: car verb: hit arg: fnc: nc: prn: pc: prn:
a
noun: n_2 fnc: mdr: prn:
beautiful
adj: beautiful mdd: nc: pc: prn:
tree
. .
noun: tree verb: cat: v’ decl fnc: prn: mdr: prn:
syntactic−semantic parsing noun: n_1 adj: heavy mdd: 1 fnc: nc: mdr: pc: prn: 4 prn: noun: n_1 2 fnc: mdr: heavy prn: 4
adj: heavy mdd: n_1 nc: pc: prn: 4
adj: old mdd: nc: pc: prn:
noun: n_1 3 fnc: mdr: heavy& prn: 4
adj: heavy& mdd: n_1 nc: old pc: prn: 4
adj: old noun: car fnc: mdd: prn: nc: pc: heavy& prn: 4
noun: car fnc: 4 mdr: heavy& prn: 4
adj: heavy& mdd: car nc: old pc: prn: 4
adj: old mdd: nc: pc: heavy& prn: 4
verb: hit arg: nc: pc: prn:
noun: car fnc: hit 5 mdr: heavy& prn: 4
adj: heavy& mdd: car nc: old pc: prn: 4
adj: old mdd: nc: pc: heavy& prn: 4
verb: hit arg: car nc: pc: prn: 4
noun: n_2 fnc: mdr: prn:
noun: car fnc: hit 6 mdr: heavy& prn: 4
adj: heavy& mdd: car nc: old pc: prn: 4
adj: old mdd: nc: pc: heavy& prn: 4
verb: hit arg: car n_2 nc: pc: prn: 4
noun: n_2 fnc: hit mdr: prn: 4
adj: beautiful mdd: nc: pc: prn:
noun: car fnc: hit 7 mdr: heavy& prn: 4
adj: heavy& mdd: car nc: old pc: prn: 4
adj: old mdd: nc: pc: heavy& prn: 4
verb: hit arg: car n_2 nc: pc: prn: 4
noun: n_2 fnc: hit mdr: beautiful prn: 4
adj: beautiful mdd: n_2 nc: pc: prn: 4
noun: car fnc: hit 8 mdr: heavy& prn: 4
adj: heavy& mdd: car nc: old pc: prn: 4
adj: old mdd: nc: pc: heavy& prn: 4
verb: hit arg: car tree nc: pc: prn: 4
noun: tree fnc: hit mdr: beautiful prn: 4
adj: beautiful mdd: tree nc: pc: prn: 4
adj: heavy& mdd: car nc: old pc: prn: 4
adj: old mdd: nc: pc: heavy& prn: 4
verb: hit arg: car tree nc: pc: prn: 4
noun: tree fnc: hit mdr: beautiful prn: 4
adj: beautiful mdd: tree nc: pc: prn: 4
result noun: car fnc: hit mdr: heavy& prn: 4
noun: tree fnc: mdr: prn:
.
verb: cat: v’ decl prn:
233
234
13. DBS.2: Hear Mode
Line 1 connects the determiner the and the adnominal heavy by functorargument, based on cross-copying into their mdr and mdd slots (3.6.3, 1). In line 2, the second adnominal old is connected to the first by coordination, based on cross-copying into their nc and pc slots (3.6.3, 1). To mark heavy as an initial conjunct, all instances are &-flagged (cf. condition in 13.2.3). In line 3, the value n_1 in the determiner the and the adnominal heavy is replaced by the core value of the car proplet, which is then discarded (3.6.3, 2). In line 4, the subject car and the predicate hit are connected by crosscopying into their fnc and arg slots (3.6.3, 1). The apparent distance between them is of no consequence because (i) there are only three candidates for hit to connect with at the now front (Sect. 11.3) and (ii) proplets are order-free. Line 5 shows the concatenation between the predicate and the beginning of a complex direct object by cross-copying into the arg and fnc slots of hit and a(n). The grammatical role of the noun is characterized by its core value n_2 taking second (object) position in the arg slot of hit. Line 6 continues the object noun by connecting the determiner a and the adnominal beautiful by cross-copying into their mdr (modifier) and mdd (modified) slots. In line 7 the object noun is completed by replacing the substitution value n_2 introduced by a(n) with the core value of tree (simultaneous substitution), which is then discarded. In line 8, the punctuation sign changes the final cat value of the verb from v to decl and is then discarded (13.2.9). Next let us turn to the sequence of detailed LA hear operations deriving the sentence. The first two proplets are produced by automatic word form recognition from the input surfaces The and heavy. Of the nine operations (13.5.1), DET+ADN and ADN+ADN match heavy with their second pattern. However, only DET+ADN matches the sole proplet at the now front with its first pattern: 13.2.2 C OMBINING The AND heavy DET+ADN adj: α adj: α noun: N_n noun: N_n cat: ADN cat: ADN pattern cat: CN′ NP cat: CN′ NP mdd: ⇒ mdd: N_n level mdr: mdr: α nc: nc: ea+ prn: K prn:K prn: K prn: ⇑ ⇓ sur: The sur: sur: heavy sur: noun: n_1 adj: heavy noun: n_1 adj: heavy cat: nn′ np cat: adn cat: nn′ np cat: adn sem: def sem: pad sem: def sem: pad content fnc: mdd: fnc: mdd: n_1 level mdr: mdr: heavy mdr: mdr: nc: nc: nc: nc: ea+ pc: pc: pc: pc: prn: prn: 4 prn: 4 prn: 4 2 2 1 1
13.2 Hear Mode Analysis of the First Sentence
235
The ea+ marker (for elementary adnominal +) in the nc slot of heavy allows adding an elementary adn conjunct, as shown by the next combination. Because the next word old is an adn, the only potential operations (13.5.1) are DET+ADN and ADN+ADN. DET+ADN fails because the mdr slot of the determiner is not empty any more (13.2.2), but ADN+ADN succeeds: 13.2.3 C OMBINING The heavy AND old ADN+ADN adj: β adj: β adj: α adj: α cat: ADN cat: ADN cat: ADN cat: ADN pattern mdd: mdd: mdd: Z mdd: Z ⇒ level nc: ea+ nc: nc: β nc: ea+ pc: pc: α prn: K prn: K prn: prn: K If Z 6= NIL, then α is &-flagged. ⇑ ⇓ sur: sur: sur: old sur: sur: sur: noun: n_1 adj: heavyadj: old noun: n_1 adj: heavy&adj: old cat: nn′ np cat: adn cat: adj cat: nn′ np cat: adn cat: adj sem: def sem: pad sem: pad sem: def sem: pad sem: pad content fnc: mdd: n_1 mdd: fnc: mdd: n_1 mdd: level mdr: heavymdr: mdr: mdr: heavy&mdr: mdr: nc: nc: ea+ nc: nc: nc: old nc:ea+ pc: pc: pc: pc: heavy& pc: pc: prn: 4 prn: prn: 4 prn: 4 prn: 4 prn: 4
The proplet heavy satisfies the condition If Z 6= NIL, i.e. it has a modified ([mdd: n_1]); this identifies heavy as an initial conjunct to be &-flagged (8.1.3, 8.1.4, 15.3.3, 15.3.5). We show the determiner proplet at the content level because the value of its mdr slot participates in the &-flagging. During retrieval, non-initial conjuncts may be found via their nc–pc concatenation (Sect. 8.2). The only adn proplet at the now front matching the first pattern of ADN+ADN is the one with the ea+ marker (A.3.2, 10) in its nc slot. By moving ea+ from the first to the second adn in the output, ADN+ADN moves the possible “growth point” of a prenominal chain forward, here from heavy& to old, and enables ADN+ADN to add yet another elementary adnominal. In our sample sentence, however, DET+CN completes the phrasal noun by simultaneously substituting all n_1 instances with car, after which the common noun proplet is discarded: 13.2.4 C OMBINING The heavy old AND car DET+CN noun: N_n pattern cat: CN′ NP sem: Y level prn: K
noun: α cat: CN sem: Z prn:K
⇒
noun: α cat: NP sem: Y Z prn: K
236
13. DBS.2: Hear Mode Delete any ea+ marker. CN′ ǫ {nn′ , sn′ , pn′ }, CN ǫ {sn, pn}, and NP ǫ {np, snp, pnp}. If CN′ = sn′ , then CN = sn. If CN′ = pn′ , then CN = pn. If CN′ = nn′ and CN = sn, then NP = snp. If CN′ = nn′ and CN = pn, then NP = pnp. ⇑ sur: sur: sur: car noun: n_1 adj: heavy& noun: car cat: nn′ np cat: adn cat: sn sem: def sem: pad sem: sg fnc: mdd: n_1 fnc: mdr: heavy& mdr: mdr: nc: nc: old nc: pc: pc: pc: prn: 4 prn: 4 prn: 4 2 4 1
content level
⇓ sur: sur: noun: car adj: heavy& cat: snp cat: adn sem: def sg sem: fnc: mdd: car mdr: heavy& mdr: nc: nc: old pc: pc: prn: 4 prn: 4 1 2
Even though the operation pattern does not include an adnominal, we show the heavy proplet at the content level because its mdd value n_1 participates in the simultaneous substitution. The adnominal old (not shown) has no mdd value and is therefore not affected. The ea+ marker (A.3.2, 10) in the nc slot of old (13.2.3) is deleted, ending the possibility of adding any more elementary1 prenominal adns to the phrasal noun at hand. The operation application 13.2.4 shows explicitly how all occurrences of the substitution value n_1 (line 3 of 13.2.1) are replaced with the core value of the car proplet and how the latter is then discarded. According to the variable conditions of DET+CN, the cat value of the resulting noun proplet car is snp (singular noun phrase).2 The new next word hit activates the operations NOM+FV and V+V. V+V fails because no previous proposition had occasion to leave its top verb at the now front (Sect. 11.3). The first pattern of NOM+FV, however, matches the newly derived (13.2.2–13.2.4) car proplet at the now front and succeeds: 13.2.5 C OMBINING The heavy old car AND hit NOM+FV verb: β verb: β noun: α noun: α ′ ′ pattern cat: NP cat: NP X VT cat: NP cat: #NP X VT fnc: ⇒ fnc: β arg: α arg: level prn: K prn: K prn: prn: K NP ǫ {snp, pnp, s3, p3}; NP′ ǫ {n′ , ns3′ , n-s3′ }. If NP ǫ {s3, snp} then NP′ ǫ {ns3′ , n′ }; otherwise, NP′ ǫ {n-s3′ , n′ }. VT ǫ {v, vi}. ⇑ ⇓ sur: hit sur: sur: sur: verb: hit verb: hit noun: car noun: car cat: snp cat: n′ a′ v cat: snp cat: #n′ a′ v sem: def sg sem: past sem: def sg sem: past content arg: fnc: hit arg: car fnc: level mdr: heavy& mdr: mdr: heavy& mdr: nc: nc: nc: nc: pc: pc: pc: pc: prn: 4 prn: 4 prn: prn: 4 1 1 5 5
13.2 Hear Mode Analysis of the First Sentence
237
The subject and the predicate are connected by cross-copying the core values into the respective fnc and arg slots. The subject valency is canceled by #-marking the n′ value in the cat slot of the verb. Because the next word a(n) is analyzed as a noun proplet, the only potential operations are those with a noun as their second input pattern, i.e. FV+NP, DET+CN, and IP+START. Of these, only FV+NP finds a proplet at the now front which matches its first pattern. 13.2.6 C OMBINING The heavy old car hit AND a FV+NP noun: α noun: α verb: β verb: β ′ ′ ′ ′ ′ ′ pattern cat: #X NP Y v cat: CN NP cat: #X #NP Y v cat: CN NP ⇒ arg: Z α fnc: β fnc: arg: Z level prn: K prn: prn: K prn: K NP ǫ {obq, snp, pnp, np}, NP′ ǫ {d′ , a′ }, and CN′ ǫ {NIL, nn′ , sn′ , pn′ }.
content level
sur: verb: hit cat: #n′ a′ sem: past arg: car mdr: nc: pc: prn: 4
v
5
⇑ sur: a noun: n_2 cat: sn′ snp sem: indef sg fnc: mdr: nc: pc: prn: 6
sur: verb: hit cat: #n′ #a′ v sem: past arg: car n_2 mdr: nc: pc: prn: 4 5
⇓
sur: noun: n_2 cat: sn′ snp sem: indef sg fnc: hit mdr: nc: pc: prn: 4 6
The operation cross-copies the values assigned to α and β, and cancels the non-initial valency position in the verb’s cat slot by #-marking the value a′ . Whereas a phrasal noun in preverbal position is put together first and then combined with the verb, a phrasal noun in postverbal position is added to the verb incrementally, as required by the strictly time-linear derivation order of natural language.3 Despite this apparent asymmetry in the derivation of phrasal nouns in pre- versus postverbal position, the operations DET+ADN and DET+CN apply alike in either case. This is made possible by the substitution variable N_n in the operation patterns and the corresponding substitution values n_1, n_2, ...,4 used as a placeholder in the core attribute of a determiner.5 When the common noun is finally added in a complex noun construc1 2
3
4
5
Unlike elementary adnominals, phrasal and clausal adnominals follow their modified in English. The English determiner the takes its number from the common noun, in contradistinction to a(n), all, or every (Sect. 6.4; 12.5.2), which control the number of the resulting noun phrase (agreement). See FoCL Sect. 17.1 for further discussion from a syntactic point of view. The analogous case for pre- and postverbal noun coordination is discussed above in connection with 8.2.4. Each time a lexical placeholder value is introduced into a derivation, it is automatically incremented, e.g., n_1 in 13.2.2 and n_2 in 13.2.6. Similar substitution values may occur also in the core feature of a verb (13.3.6) and the mdd feature of an adnominal (13.2.2) or adverbial adjective proplets.
238
13. DBS.2: Hear Mode
tion, all occurrences of the value n_n are substituted simultaneously,6 in pre(13.2.4) as well as postverbal (13.2.8) position. Because the next word beautiful is recognized as an adn proplet, the only potential operations are those with an adn pattern in second position, i.e. DET+ADN and ADN+ADN. ADN+ADN looks at the now front for a proplet matching its first pattern. The current now front provides two potential candidates, heavy and old. However, because neither has an ea+ marker in its nc slot anymore (13.2.4), ADN+ADN fails. DET+ADN, in contrast, succeeds: 13.2.7 C OMBINING The heavy old car hit a AND beautiful DET+ADN noun: N_n ′ pattern cat: CN NP level mdr: prn: K
adj: α cat: ADN ⇒ mdd: nc: prn: ⇑ sur: sur: beautiful noun: n_2 adj: beautiful cat: sn′ snp cat: adn sem: indef sg sem: content mdd: fnc: hit level mdr: mdr: nc: nc: pc: pc: prn: prn: 4 7 6
adj: α cat: ADN mdd: N_n nc: ea+ prn: K ⇓ sur: sur: noun: n_2 adj: beautiful cat: sn′ snp cat: adn sem: indef sg sem: fnc: hit mdd: n_2 mdr: beautiful mdr: nc: nc: ea+ pc: pc: prn: 4 prn: 4 7 6
noun: N_n ′ cat: CN NP mdr: α prn: K
Next automatic word form recognition produces the noun proplet tree. The operations with a noun as their second pattern (13.5.1) are IP+START, DET+CN, and FV+NP. IP+START fails because the last cat value of the verb proplet at the now front is still v. FV+NP fails because the cat value sn of tree is not in the restriction set of the variable NP (13.2.6). Remains DET+CN: 13.2.8 C OMBINING The heavy old car hit a beautiful AND tree pattern level
DET+CN noun: N_n noun: α noun: α cat: NP cat: CN cat: CN′ NP ⇒ sem: Y Z sem: Z sem: Y prn: K prn: K prn: K Delete any ea+ marker. CN′ ǫ {nn′ , sn′ , pn′ }, CN ǫ {sn, pn}, and NP ǫ {np, snp, pnp}. If CN′ = sn′ , then CN = sn. If CN′ = pn′ , then CN = pn. If CN′ = nn′ and CN = sn, then NP = snp. If CN′ = nn′ and CN = pn, then NP = pnp. ⇑
6
⇓
Determiner–adnominal–noun combinations are illustrated in 13.2.2–13.2.4 (preverbal) and 13.2.6– 13.2.8 (postverbal), determiner–noun combinations in 13.3.3 and 13.4.3 (preverbal), and 13.4.7 and 13.4.9 (postverbal). Auxiliary-verb combinations (13.3.6, 13.3.7) use the same technique.
13.2 Hear Mode Analysis of the First Sentence sur: sur: sur: tree noun: n_2 verb: hit noun: tree cat: #n′ #a′ v cat: sn′ snp cat: sn sem: indef sg sem: sg sem: past content fnc: arg: car n_2 fnc: hit level mdr: beautiful mdr: mdr: nc: nc: +a nc: pc: pc: pc: prn: prn: 4 prn: 4 8 6 5
239
sur: sur: verb: hit noun: tree cat: #n′ #a′ v cat: snp sem: indef sg sem: past arg: car tree fnc: hit mdr: beautiful mdr: nc: nc: pc: pc: prn: 4 prn: 4 6 5
The substitution value n_2 in (i) the verb proplet hit, (ii) the determiner proplet a, and (iii) the adjective proplet beautiful (omitted) is simultaneously replaced by the value tree, after which the tree proplet is discarded. When automatic word form recognition provides the “.” (period) proplet, the only operation matching it with its second pattern is S+IP. At the now front, it finds the hit proplet matching its first pattern and applies as follows: 13.2.9 C OMBINING The heavy old car hit a beautiful tree AND . pattern level
content level
S+IP verb: β verb: β verb: V_n ′ ′ cat: #X VT ⇒ cat: #X SM ip+ cat: VT′ SM prn: K prn: K VT ǫ {v, vi, vimp} and SM ǫ {decl, interrog, impv}. If VT = v, then SM ǫ {decl, interrog}; if VT = vi, then SM = interrog; if VT = vimp, then SM = impv. sur: verb: hit cat: #n′ #a′ v sem: past arg: car tree mdr: nc: pc: prn: 4 5
⇑
sur: . verb: v_1 cat: v′ decl prn: 8
⇓ sur: verb: hit cat: #n′ #a′ decl ip+ sem: past arg: car tree mdr: nc: pc: prn: 4 5
This S+IP application is analogous to 11.6.2, and triggers a body flush (Sect. 11.3) at the now front. The derivation of the first sentence is now completed as follows: 13.2.10 R ESULT
OF PARSING
The heavy old car hit a beautiful tree.
sur: sur: sur: sur: sur: sur: noun: car adj: heavy&adj: old noun: tree adj: beautiful verb: hit cat: snp cat: adn cat: adn cat: #n′ #a′ decl ip+cat: snp cat: adn sem: def sg sem: pad sem: pad sem: past sem: indef sg sem: fnc: hit mdd: car mdd: fnc: hit mdd: tree arg: car tree mdr: heavy&mdr: mdr: beautiful mdr: mdr: mdr: nc: nc: old nc: nc: nc: nc: pc: pc: pc: pc: heavy& pc: pc: prn: 4 prn: 4 prn: 4 prn: 4 prn: 4 prn: 4 1 5 2 6 3 7
240
13. DBS.2: Hear Mode
During the time-linear lexical look-up of the current next word in the hear mode analysis 13.2.1, each of these proplets, in their lexical form, has been stored step by step in the appropriate token line at the now front of the agent’s word bank (Sect. 11.2). Thus, the syntactic-semantic concatenation of lexical proplets by the operation sequence shown above has been entirely in situ. In other words, the order-free proplets in 13.2.10 do not need to be sorted into the agent’s word bank because they are already in place.
13.3 Hear Mode Analysis of the Second Sentence For reasons of space, the following hear mode derivation shows the second test sentence of 13.1.1 in isolation, instead of being surrounded by the first and the third sentence.7 13.3.1 H EAR The lexical lookup
MODE DERIVATION OF car
had
been
The car had been speeding.
speeding
noun: n_1 noun: car verb: v_1 verb: v_2 verb: speed arg: fnc: fnc: arg: arg: prn: prn: prn: prn: prn: syntactic−semantic parsing noun: n_1 noun: car fnc: 1 fnc: prn: prn: 5 noun: car 2 fnc: prn: 5
verb: v_1 arg: prn:
noun: car 3 fnc: v_1 prn: 5
verb: v_1 verb: v_2 arg: car arg: prn: 5 prn:
noun: car 4 fnc: v_2 prn: 5
verb: v_2 arg: car prn: 5
noun: car 5 fnc: speed prn: 5 result noun: car fnc: speed prn: 5
verb: speed arg: car prn: 5
. .
verb: cat: v’ decl prn:
verb: speed arg: prn:
.
verb: cat: v’ decl prn:
verb: speed arg: car prn: 5
The sentence shows a phrasal form of the one-place verb speed. 7
For the hear mode derivation of three properly connected, maximally short propositions see 9.2.1.
13.3 Hear Mode Analysis of the Second Sentence
241
In line 1, the proplets the and car are combined into a single proplet (borderline case of simultaneous substitution). In line 2, the noun is combined with the auxiliary had by means of cross-copying. In line 3, the auxiliary form been is added: the substitution value v_1 in the car and have proplets is replaced by the new substitution value v_2,8 after which the been proplet is discarded. In line 4, the present participle of speed substitutes the value v_2 in the car proplet and the proplet which resulted from the absorption of been into had (simultaneous substitution), after which the speed proplet is discarded. This successive fusion of had and been into had+been and of had+been and speeding into had+been+ speeding resembles a successive prep+det+noun fusion (6.5.4) in that the second proplet is integrated into the first, and the third proplet is integrated into that. Line 5 concludes the derivation by adding the period. As shown by the application of S+IP (13.3.8), the cat value v in the verb is replaced by decl. We turn now to a step by step derivation of the second sentence, beginning with the application of IP+START (interpunctuation plus start), which connects the second sentence to the first. The second pattern of IP+START matches the determiner proplet (next word) provided by automatic word form recognition. Its first pattern matches hit, which was left at the now front by the body flush (Sect. 11.2) triggered by the previous S+IP operation (13.2.9). 13.3.2 C OMBINING The heavy old car hit a beautiful tree. AND The IP+START verb: β noun: α pattern ′ cat: NP cat: #X SM ip+ ⇒ level prn: prn: K SM ǫ {decl, interrog, im p} ⇑ sur: sur: The verb: hit noun: n_1 cat: #n′ #a′ decl ip+ cat: nn′ np sem: past sem: def content arg: car tree fnc: level mdr: mdr: nc: nc: pc: pc: prn: 4 prn: 8 1
verb: β ′ cat: #X SM prn: K
noun: α cat: NP prn: K+1
⇓ sur: sur: verb: hit noun: n_1 cat: #n′ #a′ decl cat: nn′ np sem: past sem: def arg: car tree fnc: mdr: mdr: nc: nc: pc: pc: prn: 4 prn: 5 8 1
This step is analogous to 11.6.3. As an instance of a suspension, the operation is limited to incrementing the prn value of the next word and deleting the ip+ marker in the output. The determiner proplet in the output is the beginning of the second sentence. 8
Also, the sem value of the been proplet is added to the had proplet, as shown by the application of AUX+NFV in 13.3.6.
242
13. DBS.2: Hear Mode
Now the new next word provided by automatic word form recognition is the common noun proplet car. The operations with a second pattern matching a noun are FV+NP, DET+CN, and IP+START. FV+NP fails because the cat value sn of car is not in the restriction set of the variable NP (13.2.6). IP+START fails because its previous application (13.3.2) has removed the ip+ marker, which was introduced by S+IP into the cat slot of hit (13.2.9). Remains DET+CN, here combining a determiner and a noun without intervening adnominal adjectives (in contradistinction to the previous derivation in Sect. 13.2). Thus, deletion of the ea+ marker to prevent adding any further elementary prenominal modifier (13.2.4, 13.2.8) is vacuous. 13.3.3 C OMBINING The AND car DET+CN noun: N_n noun: α noun: α pattern ′ cat: NP cat: CN NP cat: CN ⇒ level sem: Z sem: Y Z sem: Y Delete any ea+ marker. CN′ ǫ {nn′ , sn′ , pn′ }, CN ǫ {sn, pn}, and NP ǫ {np, snp, pnp}. If CN′ = sn′ , then CN = sn. If CN′ = pn′ , then CN = pn. If CN′ = nn′ and CN = sn, then NP = snp. If CN′ = nn′ and CN = pn, then NP = pnp. ⇑ ⇓ sur: sur: sur: car noun: car noun: n_1 noun: car cat: snp cat: nn′ np cat: sn sem: def sg sem: def sem: sg content fnc: fnc: fnc: level mdr: mdr: mdr: nc: ea+ nc: nc: pc: pc: pc: prn: 5 prn: prn: 5 1 2 1
The noun proplet is absorbed into the determiner, including the sem value sg. The next proplet provided by automatic word form recognition is had. The operations with a second pattern matching a verb proplet (13.5.1) are V+V, NOM+FV, AUX+NFV, and S+IP. The second pattern of S+IP fails to match had and the first pattern of AUX+NFV fails to match the proplets hit and car currently at the now front. NOM+FV and V+V, however, apply simultaneously and successfully (self-organization compensating the suspension 13.3.2). The first input pattern of NOM+FV matches the car proplet available at the current now front: 13.3.4 C OMBINING The car AND had NOM+FV verb: β noun: α noun: α ′ cat: NP pattern cat: NP cat: NP X VT ⇒ fnc: β arg: fnc: level prn: K prn:K prn:
verb: β ′ cat: #NP X VT arg: α prn: K
13.3 Hear Mode Analysis of the Second Sentence
243
NP ǫ {snp, pnp, s3, p3}; NP′ ǫ { n′ , ns3′ , n-s3′ }. If NP ǫ {s3, snp} then NP′ ǫ {ns3′ , n′ }; otherwise, NP′ ǫ {n-s3′ , n′ }. VT ǫ {v, vi}. sur: noun: car cat: snp sem: def sg content fnc: level mdr: nc: pc: prn: 5
⇑ sur: had verb: v_1 cat: n′ hv′ sem: past arg: mdr: nc: pc: prn:
sur: noun: car cat: snp sem: def sg fnc: v_1 mdr: nc: pc: prn: 5
v
⇓ sur: verb: v_1 cat: #n′ hv′ sem: past arg: car mdr: nc: pc: prn: 5
v
The second input pattern of NOM+FV does not distinguish between a finite main verb (13.2.5) and a finite auxiliary. In either case NOM+FV connects the core values by cross-copying into the respective fnc and arg slots, and cancels the nominative valency, provided the agreement conditions are fulfilled. While NOM+FV provides an intrapropositional functor-argument relation, V+V provides an extrapropositional coordination: 13.3.5 E XTRAPROPOSITIONAL V+V verb: γ ′ pattern cat: #X SM nc: level prn: K-1 sur: verb: hit cat: #n′ #a′ decl sem: past content arg: car tree level mdr: nc: pc: prn: 4
COORDINATION BETWEEN
verb: β prn: K ⇑ sur: had verb: v_1 cat: #n′ hv′ sem: past arg: car mdr: nc: pc: prn: 5
⇒
v
verb: γ ′ cat: #X SM nc: (β K) prn: K-1
hit AND have
verb: β prn: K
⇓ sur: sur: verb: v_1 verb: hit cat: #n′ #a′ decl cat: #n′ hv′ sem: past sem: past arg: car arg: car tree mdr: mdr: nc: nc: (v_1 5) pc: pc: prn: 5 prn: 4
v
The first input pattern of V+V matches the hit proplet left at the now front by the body flush (11.3.4) associated with the application of S+IP (13.2.9). The two proplets are connected by unidirectional cross-copying the second verb’s core value into the first verb’s nc slot. By allowing the hit proplet to remain at the now front until S+IP applies (Sect. 11.3), the value (v_1 5) of its nc slot will be available for the simultaneous substitution with speed (13.3.7). The new next word proplet is been. As a verb, it is potentially matched by the second input pattern of V+V, NOM+FV, AUX+NFV, and S+IP. V+V fails because the cat slot of v_1 contains the uncanceled valency position hv′ . NOM+FV fails because the nominative valency of had is already canceled (13.3.4). S+IP fails because the cat values of been do not match its second pattern. AUX+NFV, however, succeeds:
244
13. DBS.2: Hear Mode
13.3.6 C OMBINING The car had AND been AUX+NFV verb: V_n verb: α ′ ′ pattern cat: #W AUX VT cat: Y AUX sem: Z level sem: X prn: prn: K AUX ǫ {be, hv, do}. VT ǫ {v, vi}. ⇑ sur: been sur: verb: v_2 verb: v_1 cat: #n′ hv′ v cat: be′ hv sem: perf sem: past content arg: arg: car level mdr: mdr: nc: nc: pc: pc: prn: prn: 5 4 3
⇒
verb: α ′ ′ cat: #W #AUX Y VT sem: X Z prn: K ⇓ sur: verb: v_2 cat: #n′ #hv′ be′ sem: past perf arg: car mdr: nc: pc: prn: 5
v
3
The absorption of non-finite been changes the syntactic category of the finite auxiliary had from #n′ hv′ v to #n′ #hv′ be′ v, preparing the addition of a progressive form. The substitution value v_2 of the been proplet replaces all occurrences of v_1. The semantic effect is the addition of the value perf to the sem attribute of the result. Next, AUX+NFV reapplies: 13.3.7 C OMBINING The car had been AND speeding AUX+NFV verb: V_n verb: α ′ ′ pattern cat: #W AUX VT cat: Y AUX ⇒ sem: Z level sem: X prn: prn: K AUX ǫ {be, hv, do}. VT ǫ {v, vi}. ⇑ sur: speeding sur: verb: speed verb: v_2 cat: #n′ #hv′ be′ v cat: be sem: prog sem: past perf content arg: arg: car level mdr: mdr: nc: nc: pc: pc: prn: 5 prn: 3 5
verb: α ′ ′ cat: #W #AUX Y VT sem: X Z prn: K ⇓ sur: verb: speed cat: #n′ #hv′ #be′ v sem: past perf prog arg: car mdr: nc: pc: prn: 5 3
AUX+NFV absorbs the non-finite main verb speeding into the proplet representing the complex auxiliary had been, completing the fusion of had+been+ speeding into a single proplet. The (incremented) variable v_2 in the nc slot of the previous verb hit left at the now front (13.3.5) is also replaced by speed, thus completing the coordination between the first and the second proposition. It remains for S+IP to add the period, and to trigger a body and a head flush at the now front (Sect. 11.3).
13.4 Hear Mode Analysis of the Third Sentence
245
13.3.8 C OMBINING The car had been speeding AND . S+IP verb: β verb: β verb: V_n ′ ′ cat: #X VT ⇒ cat: #X SM ip+ cat: VT′ SM prn: K prn: K VT ǫ {v, vi, vimp} and SM ǫ {decl, interrog, impv}. If VT = v, then SM ǫ {decl, interrog}; if VT = vi, then SM = interrog; if VT = vimp, then SM = impv.
pattern level
sur: verb: speed cat: #n′ #hv′ #be′ v sem: past perf prog content arg: car level mdr: nc: pc: prn: 5 3
⇑
sur: . verb: v_1 cat: v′ decl prn: 6
⇓ sur: verb: speed cat: #n′ #hv′ #be′ decl ip+ sem: past perf prog arg: car mdr: nc: pc: prn: 5 3
The contribution of the period is replacing v with decl in the cat slot of the resulting speed proplet. Introduction of the ip+ marker in that same slot enables the subsequent application of IP+START in 13.4.2. The time-linear hear mode derivation of the second test sentence may be summarized as the following order-free set of proplets: 13.3.9 R ESULT
OF PARSING
sur: noun: car cat: np sem: def sg fnc: speed mdr: nc: pc: prn: 5 1
The car had been speeding.
sur: verb: speed cat: #n′ #hv′ #be′ decl ip+ sem: past perf prog arg: car mdr: nc: pc: prn: 5 3
All function words, i.e. the, had, been, and . (period), have been absorbed into content word proplets, where they are reflected by sem and cat values. An extrapropositional nc-pc relation to the previous proposition has been established. The speak mode production (Sect. 14.3) will realize the six input surfaces from the two proplets in 13.3.9 by utilizing the various cat and sem values accumulated during the above hear mode derivation.
13.4 Hear Mode Analysis of the Third Sentence The third sentence in the test sequence 13.1.1 is A farmer gave the driver a lift. It uses a three-place verb, in contradistinction to the two-place verb hit in the first and the one-place verb speed in the second sentence:
246
13. DBS.2: Hear Mode
13.4.1 G RAPHICAL A
farmer
HEAR MODE DERIVATION OF THE THIRD SENTENCE gave
the
driver
a
lift
.
lexical lookup noun: n_1 noun: farmer verb: give fnc: arg: fnc: prn: prn: prn: syntactic−semantic parsing noun: n_1 noun: farmer 1 fnc: fnc: prn: 6 prn: noun: farmer 2 fnc: prn: 6
verb: give arg: prn:
noun: farmer 3 fnc: give prn: 6
verb: give arg: farmer prn:
noun: n_2 fnc: prn:
noun: driver fnc: prn:
noun: n_3 fnc: prn:
noun: lift fnc: prn:
.
verb: cat: v’ decl prn:
noun: n_2 fnc: prn:
noun: farmer 4 fnc: give prn: 6
verb: give arg: farmer n_2 prn: 6
noun: farmer 5 fnc: give prn: 6
verb: give arg: farmer driver prn: 6
noun: farmer 6 fnc: give prn: 6
verb: give arg: farmer driver n_3 prn: 6
noun: driver fnc: give prn: 6
noun: n_3 fnc: give prn: 6
noun: farmer 7 fnc: give prn: 6
verb: give arg: farmer driver lift prn: 6
noun: driver fnc: give prn: 6
noun: lift fnc: give prn: 6
result noun: farmer fnc: give prn: 6
verb: give arg: farmer driver lift prn: 6
noun: driver fnc: give prn: 6
noun: lift fnc: give prn: 6
noun: n_2 fnc: give prn: 6
noun: driver fnc: prn:
noun: driver fnc: give prn: 6
noun: n_3 fnc: prn: noun: lift fnc: prn:
.
verb: cat: v’ decl prn:
In line 1, the noun is absorbed into the determiner. In line 2, the subject/predicate relation between farmer and give is established by cross-copying the core values into the fnc and the arg slots. In line 3, the indirect_object\predicate relation between the and give is established, again by cross-copying. In line 4, the indirect_object is completed by substituting the n_2 values, introduced by the second determiner, with the core value of driver. Line 5 establishes the direct_object\predicate relation by cross-copying into give and the third determiner a(n). The apparent distance between them is of no consequence because (i) there are only three candidates for a(n) to try to connect with at the now front (Sect. 11.3) and (ii) proplets are order-free. Similar considerations hold for the simultaneous substitution of n_3 by lift in line 6. The derivation is completed in line 7 by adding the period.
13.4 Hear Mode Analysis of the Third Sentence
247
As in the previous test sentences, we turn next to the explicit step by step derivation. The third sentence begins with the noun proplet derived from The. The operations matching a noun with their second pattern are FV+NP, DET+CN, and IP+START. DET+CN fails to find a determiner at the noun front. FV+NP fails because the verb hit left at the now front after the last body flush has no uncanceled valency. For the same reason, IP+START succeeds: 13.4.2 C OMBINING The car had been speeding. AND A IP+START verb: β pattern cat: #X′ SM ip+ level prn: K SM ǫ {decl, interrog, imp}
noun: α cat: NP prn:
verb: β ⇒ cat: #X′ SM prn: K
⇑ sur: sur: A verb: speed noun: n_1 cat: #n′ #hv′ #be′ decl ip+ cat: sn′ snp sem: past perf prog sem: indef sg content fnc: arg: car level mdr: mdr: nc: nc: pc: pc: prn: 5 prn: 3 1
noun: α cat: NP prn: K+1
⇓ sur: sur: verb: speed noun: n_1 cat: #n′ #hv′ #be′ decl cat: sn′ snp sem: past perf prog sem: indef sg arg: car fnc: mdr: mdr: nc: nc: pc: pc: prn: 5 prn: 6 3 1
As in 13.3.2, this application removes the IP+ marker and causes a suspension. The new next word provided by automatic word form recognition is the common noun proplet farmer. The situation resembles that of 13.3.3: the only operation matching the input is DET+CN. It replaces the substitution value n_1 with the core value of the farmer proplet, which is then discarded: 13.4.3 C OMBINING A AND farmer DET+CN noun: N_n noun: α noun: α ′ pattern cat: CN NP cat: CN cat: NP ⇒ sem: Z sem: Y sem: Y Z level prn: K prn: prn: K Delete any ea+ marker. CN′ ǫ {nn′ , sn′ , pn′ }, CN ǫ {sn, pn}, and NP ǫ {np, snp, pnp}. If CN′ = sn′ , then CN = sn. If CN′ = pn′ , then CN = pn. If CN′ = nn′ and CN = sn, then NP = snp. If CN′ = nn′ and CN = pn, then NP = pnp. ⇑ sur: sur: farmer noun: n_1 noun: farmer cat: sn′ snp cat: sn sem: indef sg sem: sg content fnc: fnc: level mdr: mdr: nc: nc: pc: pc: prn: prn: 6 2 1
⇓ sur: noun: farmer cat: snp sem: indef sg fnc: mdr: nc: pc: prn: 6 1
248
13. DBS.2: Hear Mode
The definite article (13.3.3) and the indefinite article (13.4.3) are distinguished by the values def vs. indef (6.4.6, 6.4.7) in the respective sem slots. The new next word is the proplet representing the finite verb form gave. It enables the application of two operations in a single hear mode step, namely (i) NOM+FV establishing the intrapropositional subject/predicate relation (13.3.4) and (ii) V+V establishing the extrapropositional verb−verb relation (13.3.5). This compensates the suspension caused by IP+START (13.4.2): 13.4.4 C OMBINING A farmer AND gave NOM+FV verb: β verb: β noun: α noun: α ′ ′ pattern cat: NP cat: NP X VT cat: NP cat: #NP X VT fnc: ⇒ arg: fnc: β arg: α level prn: K prn: K prn: prn: K NP ǫ {snp, pnp, s3, p3}; NP′ ǫ {n′ , ns3′ , n-s3′ }. If NP ǫ {s3, snp} then NP′ ǫ {ns3′ , n′ }; otherwise, NP′ ǫ {n-s3′ , n′ }. VT ǫ {v, vi}. ⇑ sur: gave sur: noun: farmer verb: give cat: n′ d′ a′ cat: snp sem: indef sg sem: past language arg: fnc: content mdr: mdr: nc: nc: pc: pc: prn: 6 prn:
v
⇓ sur: sur: noun: farmer verb: give cat: #n′ d′ a′ cat: snp sem: indef sg sem: past arg: farmer fnc: give mdr: mdr: nc: nc: pc: pc: prn: 6 prn: 6
v
The body flush (11.3.4) associated with the application of S+IP (13.3.8) has left the verb speed of the previous proposition remaining at the now front. This allows V+V to establish a unidirectional extrapropositional V−V coordination by cross-copying the core value of give into nc slot of speed: 13.4.5 C OMBINING speed AND give V+V verb: γ ′ pattern cat: #X SM nc: level prn: K-1
verb: β prn: K
verb: γ ′ cat: #X SM ⇒ nc: (β K) prn: K-1
⇑ sur: sur: verb: give verb: speed cat: #n′ #hv′ #be′ decl ip+cat: #n′ d′ a′ sem: past sem: past perf prog content arg: farmer arg: car level mdr: mdr: nc: nc: pc: pc: prn: 6 prn: 5
⇓ sur: sur: verb: give verb: speed cat: #n′ #hv′ #be′ decl ip+cat: #n′ d′ a′ sem: past sem: past perf prog arg: farmer arg: car mdr: mdr: nc: nc: (give 6) pc: pc: prn: 6 prn: 5
v
verb: β prn: K
v
13.4 Hear Mode Analysis of the Third Sentence
249
The next word the is analyzed by automatic word form recognition as a noun proplet. LAGo hear operations matching a noun proplet with their second input pattern are FV+NP, DET+CN, and IP+START. Of these, only FV+NP finds a proplet at the now front which matches its first pattern (similar to 13.2.6). 13.4.6 C OMBINING A farmer gave AND the FV+NP noun: α verb: β noun: α verb: β ′ ′ ′ ′ ′ ′ pattern cat: #X NP Y v cat: CN NP cat: #X #NP Y v cat: CN NP fnc: β ⇒ arg: Z α fnc: arg: Z level prn: K prn: K prn: prn: K NP ǫ {obq, snp, pnp}, NP′ ǫ {d′ , a′ }, and CN′ ǫ {NIL, nn′ , sn′ , pn′ }.
content level
sur: verb: give cat: #n′ d′ a′ sem: past arg: farmer mdr: nc: pc: prn: 6
v
3
⇑ sur: the noun: n_2 cat: nn′ np sem: def fnc: mdr: nc: pc: prn: 4
sur: verb: give cat: #n′ #d′ a′ v sem: past arg: farmer n_2 mdr: nc: pc: prn: 6 3
⇓
sur: noun: n_2 cat: nn′ np sem: def fnc: give mdr: nc: pc: prn: 6 4
This operation application connects the verb and its indirect_object by crosscopying (13.4.1, line 3). The first oblique valency position d′ in the cat attribute of the verb is canceled by #-marking. Next automatic word form recognition provides the common noun proplet driver. The situation resembles that of 13.2.4, 13.2.8, 13.3.3, and 13.4.3. Accordingly, the only LAGo hear operation matching the input is DET+CN: 13.4.7 C OMBINING A farmer gave the AND driver pattern level
DET+CN noun: N_n noun: α noun: α cat: CN′ NP cat: CN cat: NP ⇒ sem: Y sem: Z sem: Y Z Delete any ea+ marker. CN′ ǫ {nn′ , sn′ , pn′ }, CN ǫ {sn, pn}, and NP ǫ {np, snp, pnp}. If CN′ = sn′ , then CN = sn. If CN′ = pn′ , then CN = pn. If CN′ = nn′ and CN = sn, then NP = snp. If CN′ = nn′ and CN = pn, then NP = pnp.
⇑ sur: sur: sur: driver verb: give noun: n_2 noun: driver cat: #n′ #d′ a′ v cat: nn′ np cat: sn sem: past sem: def sem: sg content arg: farmer n_2 fnc: give fnc: level mdr: mdr: mdr: nc: nc: nc: pc: pc: pc: prn: prn: 6 prn: 6 5 3 4
⇓ sur: sur: verb: give noun: driver cat: #n′ #d′ a′ v cat: np sem: past sem: def sg arg: farmer driver fnc: give mdr: mdr: nc: nc: pc: pc: prn: 6 prn: 6 4 3
250
13. DBS.2: Hear Mode
This application of DET+CN replaces the substitution value n_2 in the verb and in the determiner with driver, after which the driver proplet is discarded. The next word is a(n), analyzed by automatic word form recognition as a noun proplet. This addition of (the beginning of) a direct_object resembles 13.2.6. Accordingly, only the patterns of FV+NP find matching input. 13.4.8 C OMBINING A farmer gave the driver AND a FV+NP noun: α verb: β noun: α verb: β ′ ′ ′ ′ ′ ′ pattern cat: #X NP Y v cat: CN NP cat: #X #NP Y v cat: CN NP fnc: β ⇒ arg: Z α fnc: arg: Z level prn: K prn: K prn: prn: K NP ǫ {obq, snp, pnp}, NP′ ǫ {d′ , a′ }, and CN′ ǫ {NIL, nn′ , sn′ , pn′ }.
content level
sur: verb: give cat: #n′ #d′ a′ sem: past arg: farmer mdr: nc: pc: prn: 6
⇑ sur: a noun: n_3 ′ v cat: sn snp sem: indef fnc: mdr: nc: pc: prn: 6 3
⇓ sur: verb: give cat: #n′ #d′ #a′ v sem: past arg: farmer n_2 mdr: nc: pc: prn: 6 3
sur: noun: n_3 cat: sn′ snp sem: indef fnc: give mdr: nc: pc: prn: 6 6
The direct_object, represented by the determiner, and the verb are connected by cross-copying (13.4.1, line 5). The operation cancels the second oblique valency position a′ in the cat attribute of the verb by #-marking. Adding the next word lift9 resembles the situation 13.4.7. Accordingly, the only LAGo hear operation finding matching input is DET+CN: 13.4.9 C OMBINING A farmer gave the driver a AND lift DET+CN noun: N_n noun: α noun: α ′ cat: CN NP cat: CN cat: NP ⇒ sem: Z sem: Y Z sem: Y ′ ′ ′ ′ Delete any ea+ marker. CN ǫ {nn , sn , pn }, CN ǫ {sn, pn}, and NP ǫ {np, snp, pnp}. If CN′ = sn′ , then CN = sn. If CN′ = pn′ , then CN = pn. If CN′ = nn′ and CN = sn, then NP = snp. If CN′ = nn′ and CN = pn, then NP = pnp. ⇑ sur: sur: noun: n_3 sur: lift verb: give cat: sn′ snp noun: lift cat: #n′ #d′ #a′ v sem: indef sem: past cat: sn arg: farmer driver n_3 fnc: give sem: sg fnc: mdr: mdr: mdr: nc: nc: prn: pc: pc: 5 prn: 6 prn: 6 4 3
⇓ sur: sur: noun: lift verb: give cat: snp cat: #n′ #d′ #a′ v sem: indef sg sem: past arg: farmer driver lift fnc: give mdr: mdr: nc: nc: pc: pc: prn: 6 prn: 6 7 3
13.4 Hear Mode Analysis of the Third Sentence
251
This application of DET+CN replaces the substitution value n_3 in the verb and the determiner with the common10 noun lift, after which the lift proplet is discarded. This completes the direct_object. The period following next has the core attribute verb. The situation resembles 13.2.9 and 13.3.8. Only the patterns of S+IP find matching input. 13.4.10 C OMBINING A farmer gave the driver a lift AND . S+IP verb: β verb: β pattern verb: V_n cat: #X′ SM ip+ cat: #X′ VT ⇒ level cat: VT′ SM prn: K prn: K VT ǫ {v, vi, vimp} and SM ǫ {decl, interrog, impv}. If VT = v, then SM ǫ {decl, interrog}; if VT = vi, then SM = interrog; if VT = vimp, then SM = impv. sur: verb: give cat: #n′ #d′ #a′ v sem: past content arg: farmer driver lift level mdr: nc: pc: prn: 6 3
⇑
⇓ sur: verb: give cat: #n′ #d′ #a′ decl ip+ sem: past arg: farmer driver lift mdr: nc: pc: prn: 6 3
sur: . verb: v_1 cat: v′ decl prn: 6
The combination of the sentence start and the period results in replacing the v segment at the end of the verb’s cat slot with decl, after which the period proplet is discarded (function word absorption). This completes the hear mode derivation of the third and last sentence in the test sequence 13.1.1 and results in the following proplet set: 13.4.11 R ESULT
OF PARSING
sur: noun: farmer cat: snp sem: indef sg fnc: give mdr: nc: pc: prn: 6 2
A farmer gave the driver a lift.
sur: verb: give cat: #n′ #d′ #a′ decl ip+ sem: past arg: farmer driver lift mdr: nc: pc: prn: 6 3
sur: noun: driver cat: snp sem: def sg fnc: give mdr: nc: pc: prn: 6 4
sur: noun: lift cat: snp sem: indef sg fnc: give mdr: nc: pc: prn: 6 6
As the semantic representation of a last sentence, the nc slot of the verb proplet give is empty and the ip+ marker remains in place. If the text were continued, the next operation to apply would be IP+START. 10
The distinction between common nouns and proper nouns (names) originated in philosophy of language. A noun is called common if refers to a class, like table; in English it is preceded by a determiner, except for indefinite plurals and mass nouns. A noun is called proper if it names a unique referent. In reality, names may be combined with a determiner and occur as plurals, as in the Millers.
252
13. DBS.2: Hear Mode
13.5 Defining LAGo Hear.2 LAGo Hear.2 extends LAGo Hear.1 (11.5.6) by adding the operations DET+CN (determiner plus noun), DET+ADN (determiner plus adnominal), ADN+ADN (adnominal plus adnominal), FV+NP (finite verb plus noun phrase), and AUX+NFV (auxiliary plus non-finite verb). The LAGo Hear.1 sample 11.5.1 continues to be parsed by LAGo Hear.2. The operations used in Sects. 13.2– 13.4 are summarized as the LAGo Hear.2 grammar of DBS.2: 13.5.1 LAG O H EAR .2 NOM+FV verb: β verb: β noun: α noun: α cat: NP cat: #NP′ X VT cat: NP cat: NP′ X VT crco ⇒ fnc: β arg: α arg: fnc: prn: K prn: K prn: K prn: NP ǫ {snp, pnp, s3, p3}; NP′ ǫ {n′ , ns3′ , n-s3′ }. If NP ǫ {s3, snp} then NP′ ǫ {ns3′ , n′ }; otherwise, NP′ ǫ {n-s3′ , n′ }. VT ǫ {v, vi}. V+V verb: γ verb: γ cat: #X′ SM verb: β cat: #X′ SM verb: β nc: prn: K ⇒ nc: (β K) prn: K prn: K-1 prn: K-1
crco
AUX+NFV verb: V_n verb: α verb: α ′ ′ ′ ′11 cat: #W #AUX Y VT cat: #W AUX VT cat: Y AUX ⇒ sem: X Z sem: Z sem: X prn: prn: K prn: K AUX ǫ {be, hv, do}. VT ǫ {v, vi}.
sbst
S+IP verb: β verb: β verb: V_n ′ ′ cat: #X VT ⇒ cat: #X SM ip+ cat: VT′ SM prn: K prn: K
absp
FV+NP noun: α verb: β noun: α verb: β ′ ′ ′ ′ ′ ′ cat: CN NP cat: #X #NP Y v cat: CN NP cat: #X NP Y v fnc: β ⇒ arg: Z α fnc: arg: Z prn: K prn: K prn: prn: K NP ǫ {obq, snp, pnp, np}, NP′ ǫ {d′ , a′ }, and CN′ ǫ {NIL, nn′ , sn′ , pn′ }
crco
DET+CN noun: N_n noun: α noun: α ′ cat: CN NP cat: CN cat: NP sem: Z ⇒ sem: Y Z sem: Y prn: prn: K prn: K
sbst
VT ǫ {v, vi, vimp} and SM ǫ {decl, interrog, impv}. If VT = v, then SM ǫ {decl, interrog}; if VT = vi, then SM = interrog; if VT = vimp, then SM = impv.
13.5 Defining LAGo Hear.2
253
Delete any ea+ marker. CN′ ǫ {nn′ , sn′ , pn′ }, CN ǫ {sn, pn}, and NP ǫ {np, snp, pnp}. If CN′ = sn′ , then CN = sn. If CN′ = pn′ , then CN = pn. If CN′ = nn′ and CN = sn, then NP = snp. If CN′ = nn′ and CN = pn, then NP = pnp. IP+START verb: β verb: β noun: α noun: α ′ ′ cat: #X SM ip+ cat: NP ⇒ cat: #X SM cat: NP prn: K prn: prn: K prn: K+1 SM ǫ {decl, interrog, imp} DET+ADN noun: N_n ′ cat: CN NP mdr: prn: K
adj: α noun: N_n cat: ADN ′ cat: CN NP ⇒ mdr: α mdd: nc: prn: K prn:
ADN+ADN adj: β adj: α cat: ADN cat: ADN mdd: Z mdd: ⇒ nc: ea+ nc: pc: pc: prn: prn: K If Z 6= NIL, α is &-flagged.
adj: α cat: ADN mdd: Z nc: β pc: prn: K
adj: α cat: ADN mdd: N_n nc: ea+ prn: K
adj: β cat: ADN mdd: nc: ea+ pc: α prn: K
susp
crco
crco
The nine hear mode operations are ordered according to their second pattern: first those matching a verb, then those matching a noun, and finally those matching an adj. There are four general kinds of hear mode operations (3.6.3): (i) cross-copying (crco), (ii) absorption (absp), (iii) substitution (sbst), and (iv) suspension (susp). The operations of LAGo Hear.2 parse the test sentences 13.1.1 as the following sequence of operations, using DET and core values to represent the input proplets: 13.5.2 S EQUENCE
OF OPERATIONS IN
S ECTS . 13.3, 13.4,
operation DET+ADN ADN+ADN DET+CN NOM+FV FV+NP DET+ADN DET+CN S+IP
input proplets DET heavy heavy old car car hit hit DET DET beautiful DET tree hit .
operation kind crco crco sbst crco crco crco sbst absp
application 13.2.2 13.2.3 13.2.4 13.2.5 13.2.6 13.2.7 13.2.8 13.2.9
IP+START DET+CN NOM+FV V+V AUX+NFV AUX+NFV S+IP
hit DET DET car car have hit have have be have_be speed speed .
susp sbst crco crco sbst sbst absp
13.3.2 13.3.3 13.3.4 13.3.5 13.3.6 13.3.7 13.3.8
AND
13.5
254
13. DBS.2: Hear Mode speed DET DET farmer farmer give speed give give DET DET driver give DET DET lift give .
IP+START DET+CN NOM+FV V+V FV+NP DET+CN FV+NP DET+CN S+IP
susp sbst crco crco crco sbst crco sbst absp
13.4.2 13.4.3 13.4.4 13.4.5 13.4.6 13.4.7 13.4.8 13.4.9 13.4.10
This way of characterizing a derivation resembles 11.4.4, except that the operations are more varied and classified in terms of crco, sbst, absp, and susp. It also shows that the continuity condition 3.6.5 for the think mode does not hold for the hear mode.
13.6 Hear Mode Extension to a Verb-Initial Construction English declaratives have the finite verb in post-nominative position (FoCL Sect. 18.3). The sentential moods of Yes/No interrogatives like Did you give the driver a lift? or imperatives like Give the driver a lift!, in contrast, have the finite verb in initial position. WH interrogatives like Whom did the farmer give a lift? have the finite verb in second position whereby the first position is taken by the WH word which may assume the grammatical role of the subject (nominative, who), a noun object (oblique, whom), a prepositional object (oblique, to whom), or an adverbial (where, when, why, how). In DBS, such alternative word orders are treated by adding operations which feed back into existing transitions as defined for the declarative. For example, Yes/No interrogatives are treated by adding a single operation, called AUX+NOM?, to LAGo Hear.2. Its application is shown below in line 1: 13.6.1 DBS Did lexical lookup
GRAPH ANALYSIS OF A
you
give
verb: v_1 noun: pro2 verb: give arg: arg: fnc: prn: prn: prn: syntactic−semantic parsing verb: v_1 noun: pro2 arg: fnc: 1 prn: 7 prn: verb: v_1 2 arg: pro2 prn: 7
noun: pro2 fnc: v_1 prn: 7
verb: give arg: prn:
the
noun: n_1 fnc: prn:
Y ES /N O
INTERROGATIVE
driver
noun: driver fnc: prn:
a
noun: n_2 fnc: prn:
lift
noun: lift fnc: prn:
?
verb: ? cat: v’ interrog prn:
13.6 Hear Mode Extension to a Verb-Initial Construction verb: v_1 2 arg: pro2 prn: 7
noun: pro2 fnc: v_1 prn: 7
verb: give 3 arg: pro2 prn: 7
noun: pro2 fnc: give prn: 7
verb: give 4 arg: pro2 n_1 prn: 7
255
verb: give arg: prn:
noun: pro2 fnc: give prn: 7
noun: n_1 fnc: prn: noun: n_1 fnc: give prn: 7
noun: driver fnc: prn:
verb: give 5 arg: pro2 driver prn: 7
noun: pro2 fnc: give prn: 7
noun: driver fnc: give prn: 7
verb: give 6 arg: pro2 driver prn: 7
noun: pro2 fnc: give prn: 7
noun: driver fnc: give prn: 7
noun: n_2 fnc: prn:
verb: give 7 arg: pro2 driver n_2 prn: 7
noun: pro2 fnc: give prn: 7
noun: driver fnc: give prn: 7
noun: n_2 fnc: give prn: 7
verb: give 8 arg: pro2 driver lift prn: 7 result verb: give arg: pro2 driver lift prn: 7
noun: pro2 fnc: give prn: 7
noun: driver fnc: give prn: 7
noun: lift fnc: give prn: 7
noun: pro2 fnc: give prn: 7
noun: driver fnc: give prn: 7
noun: lift fnc: give prn: 7
noun: lift fnc: prn: verb: ? cat: v’ interrog prn:
When automatic word form recognition produces the first proplet do, it is matched by the second pattern of the operation V+V (13.3.5, 13.4.5). If the Yes/No interrogative is initial, i.e. has no predecessor in the dialog (CLaTR Chap. 10), V+V will fail because it cannot find a verb proplet at the now front matching its first pattern. If the expression is not initial, in contrast, V+V will establish a relation of extrapropositional coordination by cross-copying the core values into the respective nc and pc slots (not shown). Next automatic word recognition produces the second proplet pro2 from the surface you (12.5.1). At this point, any potential hear mode operation must match a noun with its second pattern and a verb at the now front with its first. The operations in 13.5.1 matching a noun with their second pattern are FV+NP, DET+CN, and IP+START. The second patterns of FV+NP and IP+START do not match the do proplet at the now front because its nominative valency is still uncanceled. DET+CN doesn’t match because do at the now front is a verb. Thus, the first derivation step shown in 13.6.1 requires the definition of an additional operation, called AUX+NOM?. The second pattern of this operation matches the next word proplet pro2, while its first pattern matches a finite auxiliary with an uncanceled nominative valency at the now front:
256
13. DBS.2: Hear Mode
13.6.2 C OMBINING Did AND you WITH AUX+NP? AUX+NOM? verb: V_n noun: α ′ ′ pattern cat: NP AUX vcat: NP fnc: ⇒ level arg: prn: prn: K ⇑ sur: Did sur: you verb: v_1 noun: pro2 cat: n′ do′ v cat: s2 sem: do_past sem: def sg content fnc: arg: level mdr: mdr: nc: nc: pc: pc: prn: prn: 7
verb: V_n noun: α ′ ′ cat: #NP AUX vicat: NP fnc: V_n arg: α prn: K prn: K ⇓ sur: sur: verb: v_1 noun: pro2 cat: #n′ do′ vi cat: s2 sem: do_past sem: def sg arg: pro2 fnc: v_1 mdr: mdr: nc: nc: pc: pc: prn: 7 prn: 7
The cat value v of did is changed to vi in the output, ensuring that the punctuation mark added by S+IP will be ?. The continuation with existing AUX+NFV may be shown as follows: 13.6.3 C OMBINING Did you AND give WITH AUX+NFV AUX+NFV verb: α verb: V_n verb: α ′ ′ ′ ′ pattern cat: #W AUX VT cat: #W #AUX Y VT cat: Y AUX ⇒ sem: X Z sem: Z level sem: X prn: prn: K prn: K AUX ǫ {be, hv, do}. VT ǫ {v, vi}. ⇑ sur: give sur: sur: noun: pro2 verb: give verb: v_1 cat: n-s3′ d′ a′ cat: #n′ do′ vicat: s2 sem: do_past sem: def sgsem: pres content fnc: v_1 arg: arg: pro2 level mdr: mdr: mdr: nc: nc: nc: pc: pc: pc: prn: prn: 7 prn: 7
v
⇓ sur: sur: noun: pro2 verb: give cat: #n-s3′ #do′ d′ a′ vicat: s2 sem: def sg sem: past fnc: give arg: pro2 mdr: mdr: nc: nc: pc: pc: prn: 7 prn: 7
The operation AUX+NFV simultaneously replaces the substitution value v_1 with the core value of the give proplet, which is then discarded. The remainder of the hear mode derivation corresponds to the steps 13.4.6–13.4.9. Following the principle of Surface Compositionality (Sect. 1.6), we use the unmarked form of a verb, i.e. non-third-person singular present tense, also for the infinitive, such as give in 13.6.1, read in 6.6.5, and for the imperative, as in Give the driver a lift!. Instead of postulating several lexical readings for one and the same surface, the necessary adjustments are performed by the operations in question modifying certain cat and sem values.12
13.6 Hear Mode Extension to a Verb-Initial Construction
257
The sequence of operations in 13.6.1 may be compared with some similar constructions as follows: 13.6.4 D IFFERENT Y ES /N O Did+you +give +the AUX+NOM? AUX+NFV FV+NP
INTERROGATIVE CONSTRUCTIONS +driver +a +lift DET+CN FV+NP DET+CN
Did+the +farmer AUX+NOM? DET+CN
+give +the AUX+NFV FV+NP
Was+the +car AUX+NOM? DET+CN
+speeding +? AUX+NFV S+IP
+driver DET+CN
+? S+IP
+a +lift +? FV+NP DET+CN S+IP
Was+the +heavy +old +car +speeding +? AUX+NOM? DET+ADN ADN+ADN DET+CN AUX+NFV S+IP
The first pair of examples shows how the different beginnings Did+you and Did+the+farmer feed into the same continuations FV+NP, DET+CN, FV+NP, DET+CN, S+IP. It is similar in the second pair of examples, in which the beginnings Was+the+car and Was+the+heavy+old+car feed into the same continuations AUX+NFV, S+IP. The data coverage may be extended to the imperative by defining the operation FV+NP!. Its first pattern matches a sentence-initial finite verb with an uncanceled n-s3′ valency position in its cat slot and the pres value in its sem slot. The categorial operation of FV+NP! changes the v in the cat slot to vimp, thus telling S+IP that the expression should end with an exclamation mark. For the speak mode, this extension is shown in the schematic derivation 14.6.4.
12
For a more detailed discussion of the infinitive see CLaTR Sect. 15.4.
258
14. DBS.2: Speak Mode
This chapter provides the speak mode derivations for the small text 13.1.1, analyzed in the previous chapter in the hear mode. The proplets representing the content derived from the text are shown in 14.1.1 as they are stored in the agent’s word bank. These proplets and their retrieval (activation) serve the same role in the speak mode as is served by the lexicon and automatic word form recognition in the hear mode. In Sects. 14.2, 14.3, and 14.4, the speak mode derivations of the three test sentences begin with the DBS graph analysis underlying their intrapropositional production, followed by the explicit navigation steps of LAGo Think.2. The surfaces are realized by the language-dependent lexicalization rules defined in Sects. 12.4–12.6. The navigation operations are summarized in Sect. 14.5 as the grammar LAGo Think-Speak.2. Sect. 14.6 concludes with an outlook towards extending the system further by upscaling to the sentential moods Yes/No interrogative and imperative.
14.1 Content to be Realized in the Speak Mode The production of coherent language presupposes coherent content. For the agent, the most reliable manner of acquiring coherent content is direct observation in its ecological niche. Thereby, the coherence of the cognitive content follows from the coherence of the external world (FoCL Sect. 24.5). Another possibility to acquire content is the interpretation of natural language signs. Because such signs are produced by authors who may reorder and reconnect elements that are familiar from observation, content coded by the signs of a language may be incoherent.1 Only agents capable of direct observation have the means to evaluate whether or not a given language-coded content is coherent or natural or normal, and to 1
Reassembling familiar parts in new ways occurs already in the earliest remnants of human nonlanguage cognition, as witnessed by mythical beings such as the sphinx, the griffon, or the Egyptian goddess Sekhmet, the “Lady of Slaughter,” depicted as a woman with a lion’s head. An example from recent folklore is the jackalope (German Wolpertinger).
260
14. DBS.2: Speak Mode
appraise it accordingly. This, however, requires not only a fairly orderly environment (CLaTR Chap. 6), but also autonomous context recognition and action. Because robots with these abilities are not yet available in computational linguistics, providing coherent content as a precondition for coherent language production is still the responsibility of the content provider. For designing LAGo Think-Speak.2 let us use the coherent content of 13.1.1, derived as a set of proplets by LAGo Hear.2 and stored as follows in the agent’s word bank:2 14.1.1 P ROPLETS
OF
LAG O H EAR .2
member proplets (connected proplets) sur: adj: beautiful cat: adn sem: pad mdd: tree mdr: nc: pc: prn: 4 sur: sur: noun: car noun: car cat: snp cat: snp sem: def sg sem: def sg fnc: speed fnc: hit mdr: heavy& mdr: nc: nc: pc: pc: prn: 5 prn: 4 sur: noun: driver cat: snp sem: def sg fnc: give mdr: nc: pc: prn: 6 sur: noun: farmer cat: snp sem: indef sg fnc: give mdr: nc: pc: prn: 6 2
now front
TEST SEQUENCE IN A WORD BANK owner values
beautiful
car
driver
farmer
The sur values provided by automatic word form recognition have been deleted by the LA Hear.2
14.1 Content to be Realized in the Speak Mode sur: verb: give cat: #n′ #d′ #a′ decl sem: past arg: farmer driver lift mdr: nc: pc: prn: 6
sur: adj: heavy& cat: adn sem: mdd: car mdr: nc: old pc: prn: 4 sur: verb: hit cat: #n′ #a′ decl sem: past arg: car tree mdr: nc: (speed 5) pc: prn: sur: noun: lift cat: snp sem: indef sg fnc: give mdr: nc: pc: prn: 6 sur: adj: old cat: adn sem: mdd: car mdr: nc: pc: heavy& prn: 4
sur: verb: speed cat: #n-s3′ #hv′ #be′ decl sem: past perf prog arg: car mdr: nc: (give 6) pc: prn: 5
261
give
heavy
hit
lift
old
speed
operations (Chap. 13). The following LAGo Think-Speak.2 operations will show how the sur values are reconstructed from the proplets by the lexicalization rules defined in 12.4.3, 12.5.2, and 12.6.1.
262
14. DBS.2: Speak Mode sur: noun: tree cat: snp sem: indef sg fnc: hit mdr: beautiful nc: pc: prn: 4
tree
The alphabetical order of the token lines is induced by the owner values. Because the word car occurs twice in the sample text, the token line of car has two member proplets. The adnominals heavy and old are connected intrapropositionally by their nc and pc values (13.2.3). The verbs hit, speed, and give are connected extrapropositionally by the nc values of hit (13.3.5) and speed (13.4.5). The give proplet remains at the now front for concatenation with a possible next proposition (Sect. 11.3).
14.2 Speak Mode Production of the First Sentence As in Chap. 13, we begin the analysis of each sentence in 13.1.1 with a graphical analysis, though here for the speak rather than the hear mode: 14.2.1 G RAPHICAL S PEAK M ODE (i) semantic relations graph (SRG) hit car
tree
heavy
old
V
A
(iii) numbered arcs graph (NAG) hit 10 1 6 7 car tree 2 5 heavy 4 3
old
8 9 beautiful
(iv) surface realization 1 2 3 4 5 6 7 8 9 10 The heavy old car hit a beautiful tree V/N N|A A−A A A A|N N/V V\N N|A A|N N\V
(ii) signature
N
beautiful
ANALYSIS OF THE FIRST SENTENCE
.
N A
A
The intrapropositional derivation starts with the activation of hit as the current proplet. The LAGo Think-Speak.2 operations (14.5.1) matching a verb with their first pattern (12.2.1) are V−V, V/N, and V\N. V−V fails because the arg values of hit have not been #-marked. Similarly for V\N, which fails because the nominative valency position of hit is still uncanceled. However, V/N succeeds, going from the verb to the noun and realizing the surface The:
14.2 Speak Mode Production of the First Sentence
14.2.2 NAVIGATING
FROM
263
hit TO car (arc 1)
V/N verb: α pattern arg: β Y level prn: K
ˆ sur: lexnoun(β) ⇒noun: β fnc: α prn: K ⇑ ⇓ sur: sur: The verb: hit noun: car cat: #n′ #a′ decl cat: snp sem: past sem: def sg content arg: car tree fnc: level mdr: mdr: heavy& nc: (speed 5) nc: pc: pc: prn: 4 prn: 4
#-mark β in the arg slot of proplet α (see goal proplet in 14.2.7)
The instruction following the operation patterns #-marks the first valency filler in the arg slot of hit, i.e. [arg: #car tree]. Instructions help ensure that navigation steps apply in the proper order. For example, the object of a declarative sentence is accessed only after traversing the subject and the finite verb. Based on the navigation order and the properties of the goal proplet, lexicalization rules realize bits and pieces of the sentence surface. For example, lexnoun (12.5.2, line 3) realizes only the determiner from the car proplet because it is modified (its mdr attribute has the value heavy&). It is realized as The because the sem attribute has the value def (12.5.2, line 10). The next traversal requires (3.6.5) an operation with an initial N pattern. N/V fails because the value heavy& in the mdr slot of the noun proplet is not yet #-marked. N\V fails because the non-initial arg value of hit has not yet been #-marked by V\N (14.2.8). Remains3 N|A, which applies as follows: 14.2.3 NAVIGATING N|A noun: β pattern mdr: γ level prn: K
FROM
car TO heavy (arc 2)
sur: lexadj(ˆ γ) adj: γ ⇒ mdd: β prn: K ⇑ ⇓ sur: sur: heavy noun: car adj: heavy& cat: snp cat: adn sem: def sg sem: pad content fnc: hit mdd: car level mdr: heavy& mdr: nc: nc: old pc: pc: prn: 4 prn: 4 3
#-mark γ in the mdr slot of proplet β (see goal proplet in 14.2.6)
N−N is not defined in 14.5.1, but would fail because no nc value is available.
264
14. DBS.2: Speak Mode
The relevant lines of the lexicalization rule lexadj (12.6.1) are 2–5. The next traversal requires an operation with an initial A pattern. Of the two operations available, A|N fails because the nc value old of heavy has not yet been #-marked (see 14.2.6 for a successful application of A|N.) For the same reason, A−A succeeds: 14.2.4 NAVIGATING A−A adj: α pattern nc: β level prn: K
FROM
heavy TO old (arc 3)
ˆ sur: lexadj(β) ⇒adj: β pc: α prn: K ⇑ ⇓ sur: sur: old adj: heavy& adj: old cat: adn cat: adn sem: pad sem: pad content mdd: car mdd: level mdr: mdr: nc: old nc: pc: pc: heavy& prn: 4 prn: 4
#-mark β in the nc slot of proplet α (see goal proplet in 14.2.5)
The operation A−A uses the lexicalization rule lexadj 12.6.1, like N|A in 14.2.3, this time for realizing old. Of the speak mode operations with an initial A pattern, A−A cannot reapply because the nc slot of old does not provide a next conjunct. Remains A←A, which returns empty (12.6.1, line 1) to the initial conjunct, as shown in 14.2.1, transition 4:4 14.2.5 R ETURNING A←A adj: β pattern pc: #α level prn: K
old TO heavy (arc 4)
sur: lexadj(α) ˆ adj: α ⇒ nc: #β prn: K
⇑ sur: adj: old cat: adn sem: pad content mdd: level mdr: nc: pc: heavy& prn: 4
EMPTY FROM
⇓ sur: adj: heavy& cat: adn sem: pad mdd: car mdr: nc: #old pc: prn: 4
Without a pc value, A←A cannot reapply. Thus, only A|N applies next:
14.2 Speak Mode Production of the First Sentence
14.2.6 NAVIGATING
FROM
heavy& BACK
A|N ˆ adj: γ sur: lexnoun(β) pattern mdd: β ⇒ noun: β nc: Z level mdr: #γ prn: K prn: K where Z is #-marked or NIL ⇑ ⇓ sur: sur: car adj: heavy& noun: car cat: adn cat: snp sem: pad sem: def sg content fnc: hit mdd: car level mdr: mdr: #heavy& nc: #old nc: pc: pc: prn: 4 prn: 4
TO
265
car (arc 5)
#-mark β in the mdd slot of proplet γ
In addition to the instruction to the right of the output pattern there is the condition “where Z is #-marked or NIL” below the pattern level at the matching frontier. It ensures that there is either no coordination of elementary adns or the return to the initial conjunct has been completed (arc 4 in 14.2.1, arcs 7 and 8 in 15.3.3). The common noun condition of lexnoun (12.5.2, line 4) is fulfilled. The surface car is realized by lines 11 and 12 of lexnoun. Because the noun car functions as the subject, as shown by its initial position in the arg slot of hit, N/V (but not N\V, 14.2.8) succeeds next: 14.2.7 R ETURNING
TO AND REALIZING
N/V sur: lexverb(α) ˆ noun: β pattern verb: α fnc: α mdr: Z ⇒ arg: #β Y level prn: K prn: K where Z is #-mar ked or NIL ⇑ ⇓ sur: hit sur: noun: car verb: hit cat: snp cat: #n′ #a′ decl sem: def sg sem: past content fnc: hit arg: #car tree level mdr: #heavy& mdr: nc: nc: (speed 5) pc: pc: prn: 4 prn: 4
hit (arc 6) #-mark α in the fnc slot of proplet β (see goal proplet in 14.2.8)
Lexverb 12.4.3 matches the goal with line 3 and a variant of lines 9–12 (for irregular verb forms of English) realizes hit. 4
Marking direction explicitly in A−A vs. A←A is necessary because the different proplets serving as conjuncts are represented by the same letter – in contradistinction to functor-argument relations like V/N vs. N/V. A similar distinction between traversal operators is N|N and N↑N, as used in 15.3.6
266
14. DBS.2: Speak Mode
Because the fnc slot of hit contains a non-initial value, i.e. tree, which is not yet #-marked, V−V fails, but V\N succeeds. The direct object is modified; therefore, lexnoun realizes only the determiner a(n), analogous to 14.2.2: 14.2.8 B EGINNING
REALIZATION OF THE DIRECT _ OBJECT
V\N verb: α pattern arg: #X β Y level prn: K
ˆ sur: lexnoun(β) ⇒noun: β fnc: #α prn: K ⇑ ⇓ sur: a(n) sur: noun: tree verb: hit cat: #n′ #a′ decl cat: snp sem: indef sg sem: past content arg: #car tree fnc: #hit level mdr: beautiful mdr: nc: (speed 5) nc: pc: pc: prn: 4 prn: 4
tree (arc 7)
#-mark β in the arg slot of proplet α (see goal proplet in 14.2.11)
The result of the instruction, i.e. the #-marking canceling the second valency filler tree in hit, is shown in the output of N\V, which realizes the return from the object to the verb. First, however, the adnominal modifier must be realized by N|A: 14.2.9 NAVIGATING N|A noun: β pattern mdr: γ level prn: K
FROM
tree TO beautiful (arc 8)
sur: lexadj(ˆ γ) adj: γ ⇒ mdd: β prn: K ⇑ ⇓ sur: sur: beautiful noun: tree adj: beautiful cat: snp cat: adn sem: indef sg sem: pad content fnc: #hit mdd: tree level mdr: beautiful mdr: nc: nc: pc: pc: prn: 4 prn: 4
#-mark γ in the mdr slot of proplet β (see goal proplet in 14.2.10)
Lexadj realizes beautiful analogous to 14.2.3. Of the navigation operations beginning with A, A−A fails because the nc slot of beautiful is empty. A|N, however, is successful. It returns to the noun and realizes tree with lexnoun, analogous to 14.2.6:
14.3 Speak Mode Production of the Second Sentence
14.2.10 NAVIGATING
FROM
beautiful BACK
A|N ˆ adj: γ sur: lexnoun(β) pattern mdd: β ⇒ noun: β nc: Z level mdr: #γ prn: K prn: K where Z is #-marked or NIL ⇑ ⇓ sur: tree sur: noun: tree adj: beautiful cat: snp cat: adn sem: def sg sem: pad content fnc: #hit mdd: tree level mdr: #beautiful mdr: nc: nc: pc: pc: prn: 4 prn: 4
TO
267
tree (arc 9)
#-mark β in the mdd slot of proplet γ
N\V returns to the verb to realize the period with lexverb 12.4.3, line 5: 14.2.11 M OVING
FROM
tree BACK
TO
hit (arc 10)
N\V sur: lexverb(α) ˆ noun: β pattern fnc: #α verb: α ⇒ mdr: Z arg: #X #β Y level prn: K prn: K where Z is #-marked or NIL ⇑ ⇓ sur: . sur: verb: hit noun: tree cat: snp cat: #n′ #a′ decl sem: def sg sem: past content fnc: #hit arg: #car #tree level mdr: #beautiful mdr: nc: nc: (speed 5) pc: pc: prn: 4 prn: 4
After concluding the realization of the first sample sentence successfully, V\N and V−V are tried next. Given that the arg values of hit are all #-marked at this point, V\N does not succeed in adding another object. For the same reason, V−V succeeds in navigating along the extrapropositional coordination relating hit and speed, thus beginning the traversal of the second sample sentence.
14.3 Speak Mode Production of the Second Sentence The second sentence of 13.1.1 contains a complex form of the one-place verb speed. While the hear mode analysis of had been speeding . uses the oper-
268
14. DBS.2: Speak Mode
ation sequence NOM+FV AUX+NFV AUX+NFV S+IP to add the four surfaces one by one, the speak mode uses the lexicalization rule lexverb 12.4.1 to realize them from the sem and cat values of the speed proplet all at once. The DBS graph analysis is as follows: 14.3.1 G RAPHICAL S PEAK M ODE (i) semantic relations graph (SRG) speed
ANALYSIS OF THE SECOND SENTENCE (iii) numbered arcs graph (NAG) 0
1 car
speed 2
car (ii) signature (iv) surface realization
V
0
1 2 The_car had_been_speeding_ V−V V/N N/V
N
.
The (iv) surface realization uses the V/N traversal to realize the_car and the N/V traversal to realize had_been_ speeding_.. This is in contrast to Latin, for example, which represents the complex verb form of English as the single surface celeraverat5 . In either language, the surface realization is based on a cocktail of grammatical values which was assembled in the V proplet during the hear mode derivation (13.3.1), either by automatic word form recognition (Latin morphology) or by hear mode parsing (English syntax-semantics). Before sentence-initial V/N can apply, the extrapropositional V−V coordination between the first and the second sentence must be traversed (12.3.4). 14.3.2 NAVIGATING V−V verb: α pattern arg: #X nc: (β K+1) level prn: K
EMPTY FROM
hit TO speed (arc 0)
ˆ sur: lexverb(β) ⇒verb: β arg: Z prn: K+1 ⇑ ⇓ sur: sur: verb: speed verb: hit cat: #n-s3′ #a′ decl cat: #n′ #hv′ #be′ decl sem: past perf prog sem: past content arg: car arg: #car #tree level mdr: mdr: nc: (give 6) nc: (speed 5) pc: pc: prn: 5 prn: 4 5
Because Latin has no forms for the progressive aspect in the verbal paradigm, we are using the past perfect form celeraverat.
14.3 Speak Mode Production of the Second Sentence
269
This think mode navigation is along a semantic relation which was established in the hear mode by V+V (13.3.5). In the speak mode, it is an empty traversal because the input to lexverb produces no surface (12.4.3, line 1). Next V/N applies as follows: 14.3.3 NAVIGATING V/N verb: α pattern arg: β Y level prn: K
FROM
speed TO car (arc 1)
ˆ sur: lexnoun(β) ⇒noun: β fnc: α prn: K ⇑ ⇓ sur: sur: The_car verb: speed noun: car cat: #n′ #hv′ #be′ decl cat: np sem: past perf prog sem: def sg content arg: car fnc: speed level mdr: mdr: nc: (give 6) nc: pc: pc: prn: 5 prn: 5
#-mark β in the arg slot of proplet α (see goal proplet in 14.3.4)
V/N calls the lexicalization rule lexnoun. Because the mdr slot of car is empty, lines 2, 5, 6, and 12 of 12.5.2 realize the complete phrasal surface The_car. The result of the instruction is shown in the output of N/V (14.3.4). The remainder of the sentence surface, consisting of the complex verb surface had_been_speeding and the period, is realized by lines 2 and 15 of the lexicalization function lexverb 12.4.3, called by N/V on the way from the subject back to the predicate, which happens to be a one-place verb: 14.3.4 NAVIGATING
FROM
car BACK
TO
speed (arc 2)
N/V sur: lexverb(α) ˆ noun: β pattern fnc: α verb: α #-mark α in the fnc slot of proplet β ⇒ arg: #β Y level mdr: Z prn: K prn: K where Z is #-marked orNIL ⇑ ⇓ sur: had_been_speeding_. sur: noun: car verb: speed cat: np cat: #n′ #hv′ #be′ decl sem: def sg sem: past perf prog content fnc: speed arg: #car level mdr: mdr: nc: nc: (give 6) pc: pc: prn: 5 prn: 5
270
14. DBS.2: Speak Mode
The lexicalization rule lexverb realizes the phrasal surface had_been_speeding_. from the sem values past, perf, prog and the cat values #n′ , #hv′ , #be′ , decl, accumulated in the hear mode derivation (Sect. 13.3).
14.4 Speak Mode Production of the Third Sentence The third sentence of the example 13.1.1 contains a three-place verb with an elementary surface. The navigation shows the time-linear realization of a preverbal subject and of an indirect_ and a direct_object in postverbal position. Before sentence-initial V/N can apply, the extrapropositional V−V coordination between the second and the third sentence must be traversed (12.3.4). 14.4.1 G RAPHICAL S PEAK M ODE
ANALYSIS OF THE THIRD SENTENCE (iii) numbered arcs graph (NAG)
(i) semantic relations graph (SRG) give
0
1 farmer
driver lift farmer
2
give 3
4
driver
6 5 lift
(ii) signature
V
(iv) surface realization 0
1 2 3 4 5 6 The_farmer gave the_driver a_lift V−V V/N N/V V\N N\V V\N N\V
.
N NN Arc 0 moves from the verb speed of the second to the verb give of the third proposition. Analogous to 14.3.2, the traversal is empty: 14.4.2 E MPTY
TRAVERSAL FROM
V−V verb: α pattern arg: #X nc: (β K+1) level prn: K
speed TO give (arc 0)
ˆ lexverb(β) ⇒verb: β arg: Z prn: K+1 ⇑ ⇓ sur: sur: verb: speed verb: give cat: #n′ #hv′ #be′ decl cat: #n′ #d′ #a′ decl sem: past perf prog sem: past content arg: #car arg: farmer driver lift level mdr: mdr: nc: (give 6) nc: pc: pc: prn: 5 prn: 6
14.4 Speak Mode Production of the Third Sentence
271
This think mode navigation is along a semantic relation which was established in the hear mode by V+V (13.4.5). Given the unmarked initial arg value of give, V/N navigates to the subject: 14.4.3 NAVIGATING
FROM
V/N verb: α pattern arg: β Y level prn: K
give TO farmer (arc 1)
ˆ sur: lexnoun(β) ⇒noun: β fnc: α prn: K ⇑ ⇓ sur: sur: A_farmer noun: farmer verb: give cat: #n′ #d′ #a′ decl cat: snp sem: indef sg sem: past content arg: farmer driver lift fnc: give level mdr: mdr: nc: nc: pc: pc: prn: 6 prn: 6
#-mark β in the arg slot of proplet α (see goal proplet in 14.4.4)
As shown in the output of V/N (14.4.4), the instruction #-marks the first arg value farmer of give. Because the mdr slot of farmer is empty, lines 2, 5, 6, and 12 of lexnoun (12.5.2) realize the complete phrasal noun A_farmer, analogous to 14.3.3. For the same reason N|A fails next, but N/V succeeds. In the hear mode, the semantic N/V relation was established in 13.4.4. In the speak mode, it is the transition for realizing the finite verb: 14.4.4 R ETURNING
TO
give AND
REALIZING THE VERB
N/V sur: lexverb(α) ˆ noun: β pattern fnc: α verb: α ⇒ mdr: Z arg: #β Y level prn: K prn: K where Z is #-marked or NIL ⇑ ⇓ sur: gave sur: noun: farmer verb: give cat: snp cat: #n′ #d′ #a′ decl sem: indef sg sem: past content fnc: give arg: #farmer driver lift level mdr: mdr: nc: nc: pc: pc: prn: 6 prn: 6
(arc 2)
#-mark α in the fnc slot of proplet β (see goal proplet in 14.4.5)
N/V calls lexverb (12.4.3). Lines 3 and a variant of line 10 realize gave. The instruction #-marks the fnc value give in the noun proplet farmer.
272
14. DBS.2: Speak Mode
The unmarked arg values in the give proplet cause V−V to fail, but V\N succeeds in navigating to the indirect_object: 14.4.5 NAVIGATING
FROM
give TO driver (arc 3)
V\N verb: α pattern arg: #X β Y level prn: K
ˆ sur: lexnoun(β) ⇒noun: β fnc: #α prn: K ⇑ ⇓ sur: sur: the_driver verb: give noun: driver cat: #n′ #d′ #a′ decl cat: snp sem: past sem: def sg content fnc: #give arg: #farmer driver lift level mdr: mdr: nc: nc: pc: pc: prn: 6 prn: 6
#-mark β in the arg slot of proplet α (see goal proplet in 14.4.6)
Because the mdr slot in the driver proplet is empty, lexnoun (12.5.2) realizes the complete phrasal noun the_driver, analogous to 14.3.3 and 14.4.4. The instruction #-marks the indirect_object driver in the arg slot of the verb, thus preventing the navigation from entering into a loop. Next N\V returns to the verb and calls lexverb (12.4.3). According to line 4, no surface is realized (gave was already realized by N/V in 14.4.4). 14.4.6 E MPTY
RETURN TO THE VERB
(arc 4)
N\V lexverb(α) ˆ noun: β pattern fnc: #α verb: α ⇒ mdr: Z arg: #X #β Y level prn: K prn: K where Z is #-marked or NIL ⇑ ⇓ sur: sur: noun: driver verb: give cat: snp cat: #n′ #d′ #a′ decl sem: def sg sem: past content fnc: #give arg: #farmer #driver lift level mdr: mdr: nc: nc: pc: pc: prn: 6 prn: 6
If the grammatical objects had all been #-marked, lexverb(ˆ α) would realize the period, as in 14.2.11. The unmarked arg value lift of give, however, allows V\N to apply once more:
14.4 Speak Mode Production of the Third Sentence
14.4.7 NAVIGATING
FROM
273
give TO lift (arc 5)
V\N verb: α pattern arg: #X β Y level prn: K
ˆ sur: lexnoun(β) ⇒noun: β fnc: #α prn: K ⇑ ⇓ sur: a_lift sur: noun: lift verb: give cat: snp cat: #n′ #d′ #a′ decl sem: indef sg sem: past content arg: #farmer #driver lift fnc: #give level mdr: mdr: nc: nc: pc: pc: prn: 6 prn: 6
#-mark β in the arg slot of proplet α (see goal proplet in 14.4.8)
It remains to return to the verb to realize the period (12.4.2, line 5): 14.4.8 F INAL
RETURN TO THE VERB TO THE REALIZE PERIOD
(arc 6)
N\V lexverb(α) ˆ noun: β pattern fnc: #α verb: α ⇒ mdr: Z arg: #X #β Y level prn: K prn: K where Z is #-marked or NIL ⇑ ⇓ sur: . sur: verb: give noun: lift cat: #n′ #d′ #a′ decl cat: snp sem: past sem: indef sg content arg: #farmer #driver #lift fnc: #give level mdr: mdr: nc: nc: pc: pc: prn: 6 prn: 6
At this point V\N cannot reapply because there is no unmarked arg value left in give. The operation V−V cannot apply either because the nc slot of give is empty. Thus, further navigation is prevented by a lack of suitable continuation values, regardless of the number of proplets which might be contained in the agent’s word bank. Starting a new navigation would require (i) activation of a new initial proplet by an external or internal stimulus, or (ii) the application of an inference with the antecedent (forward chaining) or the consequent (backward chaining) matching the content traversed (CLaTR Chap. 5). The surfaces realized by a lexicalization rule do not need to be contiguous. For example, , lexnoun realizes The in the V/N traversal 14.2.2 and car in the N/V traversal 14.2.6 from the same proplet, with heavy old intervening.
274
14. DBS.2: Speak Mode
14.5 Definition of LAGo Think-Speak.2 The following LAGo Think-Speak.2 grammar is used for the derivations in Sects. 12.3 and 14.2–14.4. The nine operations have six instructions and three conditions. Operations beginning with a V are ordered first, followed by those beginning with an N, and followed by those beginning with an A: 14.5.1 LAG O T HINK -S PEAK .2 LX: proplets in the word banks 11.6.7 and 14.1.1 V−V verb: α ˆ lexverb( β) arg: #X nc: (β K+1) ⇒ verb: β prn: K+1 prn: K V/N ˆ sur: lexnoun(β) verb: α noun: β arg: β Y ⇒ fnc: α prn: K prn: K
#-mark β in the arg slot of proplet α
V\N ˆ sur: lexnoun(β) verb: α noun: β arg: #X β Y ⇒ fnc: #α prn: K prn: K
#-mark β in the arg slot of proplet α
N/V sur: lexverb(α) ˆ noun: β verb: α fnc: α mdr: Z ⇒ arg: #β Y prn: K prn: K where Z is #-marked or NIL N\V lexverb(α) ˆ noun: β verb: α fnc: #α mdr: Z ⇒ arg: #X #β Y prn: K prn: K where Z is #-marked or NIL N|A sur: lexadj(ˆ γ) noun: β adj: γ mdr: γ ⇒ mdd: β prn: K prn: K A|N ˆ adj: γ sur: lexnoun(β) noun: β mdd: β nc: Z ⇒ mdr: #γ prn: K prn: K where Z is #-marked or NIL
#-mark α in the fnc slot of proplet β
#-mark γ in the mdr slot of proplet β
#-mark β in the mdd slot of proplet γ
14.5 Definition of LAGo Think-Speak.2 A−A ˆ sur: lexadj(β) adj: α adj: β nc: β ⇒ pc: α prn: K prn: K
275
#-mark β in the nc slot of proplet α
A←A sur: lexadj(α) ˆ adj: β adj: α pc: #α ⇒ nc: #β prn: K prn: K
Mapping the content derived in Chap. 13 from the text 13.1.1 back into the original surface may be shown as the following sequence of LAGo think traversals along the semantic relations holding between the proplets in 14.1.1: 14.5.2 I NTERPROPLET
TRAVERSALS REALIZING
13.1.1 applications 14.2.2 14.2.3 14.2.4 14.2.5 14.2.6 14.2.7 14.2.8 14.2.9 14.2.10 14.2.11
arc 1 2 3 4 5 6 7 8 9 10
operations V/N N|A A−A A←A A|N N/V V\N N|A A|N N\V
transitions hit car car heavy heavy old old heavy heavy car car hit hit tree tree beautiful beautiful tree tree hit
realizations The heavy old ∅ car hit a(n) beautiful tree .
0 1 2
V−V V/N N/V
hit speed speed car car speed
∅ 14.3.2 The_car 14.3.3 had_been_speeding_. 14.3.4
0 1 2 3 4 5 6
V−V V/N N/V V\N N\V V\N N\V
speed give give farmer farmer give give driver driver give give lift lift give
∅ A_farmer gave the_driver ∅ a_lift .
14.4.2 14.4.3 14.4.4 14.4.5 14.4.6 14.4.7 14.4.8
This sequence of operation applications satisfies the continuity condition 3.6.5 on LAGo think operations. The sign “∅” is used here to indicate an empty traversal, i.e. a traversal without a surface realization.
276
14. DBS.2: Speak Mode
Taking Stock The fragment of English handled by the LAGo hear and think-speak grammars defined in 11.5.6 and 12.4.1, and upscaled in 13.5.1 and 14.5.1, is small, but constitutes proof of feasibility. For approximating complete data coverage and maintaining it at a high level of correct linguistic detail year after year (RMD corpus, CLaTR Sect 15.3), much more upscaling is required. Systematic long-term upscaling cannot be successful unless the overall approach is based on the correct method, ontology, and functional flow. The method of DBS allows to detect errors automatically,6 locate them precisely in the software, and correct them permanently.7 The ontology of DBS is agentbased, i.e. the external reality is treated as given and the scientific work concentrates on modeling the agent’s recognition of and action in its external surroundings.8 The functional flow of DBS reconstructs the speak and the hear mode as mappings between the external modality-dependent unanalyzed language surfaces and the content in the agent’s memory.9 By integrating method, ontology, and functional flow into the design of a talking robot, DBS ensures compatibility between components and has a broader empirical base than any other theory of language. The basic motor driving central cognition is the changes in the agent’s ecological niche, which compel cognition to constantly maintain and regain a state of balance. Modeling the agent’s survival in an environment requires actual robots which interact with actual surroundings by means of their autonomous recognition and action interfaces. The environment may be artificial, e.g., a test course codesigned for a certain kind of robot, but it must be real. Simulating the external environment on a standard computer is not an option. For example, in virtual reality the goal of modeling the action of raising a cup is a convincing appearance, e.g., getting the shadows right. The same action by a real robot, in contrast, requires building the actual gripping action of the artificial hand and the actual movement of the artificial arm. Central concerns are not to break the cup and not to spill the liquid; a convincing handling of the shadows is left to nature. 6 7
8
9
They are revealed by parsing failures which occur when processing test lists and free text. For the statistical method, error correction is limited to manual “post-editing”, and must be done all over again from one text to the next. This is also the reason why statistical tagging cannot be used for spontaneous dialog involving nontrivial content. Truth-conditional semantics bypasses the cognitive operations of the agent by constructing settheoretical models which relate language signs directly to an abstraction of “the world.” The phrase structure grammars of nativism generate different expressions from the same start node S, without interfaces for recognition and action, without an agent-internal memory, and without a viable distinction between the speak and the hear mode.
14.6 Speak Mode Extension to a Verb-Initial Construction
277
14.6 Speak Mode Extension to a Verb-Initial Construction An intrapropositional speak mode navigation begins and ends with the highest verb (Chaps. 6, 8). Furthermore, for surface realization, the proplet matching the second pattern of the current operation is used. For example, in a subject/predicate relation such as Peter gave, the “downward” V/N traversal realizes Peter from the goal proplet matching the N, and the “upward” N/V traversal realizes gave from the goal proplet matching the V. But what about verb-initial constructions, as discussed for the hear mode in Sect. 13.6? To realize the initial verb from a goal node and to start the speak mode navigation with the highest verb, variants of the proposition-connecting V−V operation (12.3.4, 14.3.2, 14.4.2) must be used. These variants, called V−V? and V−V!, differ from V−V in that (i) they do not traverse empty (they realize the sentence-initial verb) and (ii) change the last cat value from v to vi in Yes/No interrogatives and to vimp in imperatives. The hear mode example 13.6.1 has the following speak mode graph analysis: 14.6.1 G RAPH
OF AN INITIAL
(i) semantic relations graph (SRG)
Y ES /N O
INTERROGATIVE PRODUCTION
(iii) numbered arcs graph (NAG) 0
give? 1 2 pro2
driver lift
(ii) signature
V N
V?
pro2
give 6 4 5 3 driver lift
(iv) surface realization 0 1 2 3 4 5 6 Did you give the_driver a_lift ? V−V? V/N N/V V\N N\V V\N N\V
NN
The graphs (i–iii) use a “start line” for the V−V? transition, i.e. the first V is shown as a blank, as in −give?. This is motivated as follows. If the speaker produces a Yes/No interrogative in the middle of a dialog, the V of the preceding proposition will be available in the word bank. However, because 14.6.1 is a linguistic example, we do not really care which verb preceded and therefore represent it as a blank. The other possibility is that the speaker uses the Yes/No interrogative at the beginning of a dialog. However, even if there is no preceding verb proplet for the first V pattern of the V−V? operation at the language level, there will surely be some preceding content prompting the question, thus allowing the arc 0 transition to realize the initial auxiliary.
278
14. DBS.2: Speak Mode
Given the example 13.1.1, we may assume a dialog at the point in which the sentence analyzed in Sect 13.2 just ended. This supplies the hit proplet, which matches the first V pattern of an extrapropositional coordination. Continuing with the Yes/No interrogative Did you give the driver a lift? (Sect. 13.6), the second pattern of V−V? is matched by give: 14.6.2 A PPLYING V−V? (arc 0) V−V? ˆ verb: α sur: lexverb(β) pattern arg: #Y ⇒verb: β nc: (β K+1) level cat: #X interrog prn: K prn: K+1 ⇑ ⇓ sur: sur: Did verb: hit verb: give cat: #n′ #a′ decl cat: #n′ #d′ #a′ interrog sem: past sem: past content arg: pro2 driver lift arg: #car #tree level mdr: mdr: nc: (give 5) nc: pc: pc: prn: 4 prn: 5 5
The verbs hit and give are coordinated via their nc and pc values. The first V pattern does not require a specific sentential mood (no cat pattern), while the second V pattern requires an interrogative. The realization of the sur value Did in the output presumes an extension of the lexicalization rule lexverb (12.4.3). Next transition 1 in 14.6.1 moves with V/N to the arg value pro2 to realize Did+you. From then on, the derivation continues as in 14.4.4–14.4.7. For the realization of ?, the operation N\V (14.4.8) calls the “interrogative section”n of the lexicalization rule lexverb 12.4.3. To shorten the derivation of the speak mode production shown graphically in 14.6.1 let us use the format of 14.5.2: 14.6.3 Y ES /N O
INTERROGATIVE STARTING WITH A
arc 0 1 2 3 4 5 6
transition hit give give you you give give driver driver give give lift lift give
operation V−V? V/N N/V V\N N\V V\N N\V
realization Did you give the_driver ∅ a_lift ?
V−V?
applications 14.6.2
14.4.5 14.4.6 14.4.7
TRANSITION
14.6 Speak Mode Extension to a Verb-Initial Construction
279
The navigation uses the operations of LAGo Think-Speak.2 (14.5.1) except for the first, which is defined in 14.6.2. Similarly in the following navigation underlying the realization of an imperative. 14.6.4 I MPERATIVE operation V−V! V\N N\V V\N N\V
STARTING WITH A
transition hit give give driver driver give give lift lift give
realization Give the_driver ∅ a_lift !
V−V!
TRANSITION
applications 14.4.5 14.4.6 14.4.7 14.4.8
Controlled by the incremental adding of #-markings to continuation values, the same lexicalization rule may produce different surfaces during different visits to the same proplet. For example, in 14.6.1 the V−V? transition (arc 0) uses lexverb to produce Did from the give proplet. Then N/V (arc 2) uses the same V proplet and the same lexverb to produce the unmarked form give. Finally, N\V (arc 6) returns to the give proplet to produce the question mark with lexverb.10
10
For the DBS analysis of a coherent dialog with expressions in the declarative, the interrogative, and the imperative see CLaTR, Chapter 10.
Remark on the Relation between DBS and Symbolic Logic Just as the Chaps. 11 (hear mode) and 12 (speak mode) are the DBS counterpart to the propositional calculus of symbolic logic, the Chaps. 13 (hear mode) and 14 (speak mode) are the DBS counterpart to predicate calculus. However, in contradistinction to symbolic logic, the hear mode of DBS is surface compositional and time-linear. The quantifiers of predicate calculus are replaced by determiners (Sect. 6.4). Defined as function words they are absorbed in the hear mode and precipitated in the speak mode. Instead of reducing the truth conditions of the quantifiers to propositional calculus and defining the meaning of “functors” like man′ , love′ , and woman′ by manually constructing ad hoc set-theoretic models (if at all), the elementary meaning of the content words is based on the agent’s automatic recognition and action procedures (autochannel, 1.4.3). In the hear mode, these meanings are combined into complex content by establishing semantic relations of structure based on addresses implemented as pointers. The contents are stored and processed in the agent’s memory. In the think mode, the contents are selectively activated by navigating along the semantic relations. If the agent is in the speak mode, the content traversed is realized in the natural language of choice. What is the elementary notion of truth (at least bi-polar) for truth-conditional semantics is the elementary notion of balance (mono-polar) for DBS (CLaTR 5.1.2). The goal of a DBS system is not a characterization of meaning in terms of truth conditions, as in a sign-based approach, but survival in the agent’s ecological niche, as in an agent-based approach. Based on this, the further goal of DBS is free natural language communication between natural and artificial agents.
280
15. Ambiguity, Word Order, and Collocation
A classic linguistic example of a syntactic-semantic ambiguity is Flying planes may be dangerous (Chomsky 1965). The ambiguity is caused by the possible interpretation of flying planes as planes which fly (intransitive) or as for someone to fly planes (transitive). The English verbs allowing intransitive as well as transitive use also include burn, boil, drown, and melt. Another syntactic-semantic ambiguity is caused by phrasal modifiers, such as on the table. They may be used adnominally, as in the apple on the table, or adverbially, as in ate on the table. Because phrasal modifiers may be chained, as in on the table under the tree in the garden..., the alternative between possible adnominal and adverbial interpretations may be construed as resulting in an exponential number of readings (FoCL Sect. 12.5). However, a high computational complexity of natural language is not plausible (Harman 1963, Gazdar 1982). A sign-based method to disarm syntacticsemantic ambiguities is “packing” as used in computer science (Tomita 1986): instead of representing each reading by a separate analysis, e.g. a separate tree, the information is packed into a single analysis which provides the information when needed and parses in linear time (semantic doubling, FoCL 12.5.6). As an agent-based alternative, this chapter uses the limitation of syntacticsemantic ambiguity to the hear mode:1 for communication to be successful it is sufficient if a cooperative hearer selects a meaning1 reading (and, by consequence, the meaning2 ) which corresponds to the one intended by the speaker.
15.1 Adnominal and Adverbial Uses of Phrasal Modifiers In natural language, the obligatory semantic relations of structure are (i) subject/predicate and (ii) object\predicate, depending on the valency properties of the predicate (verb). The optional relations are (iii) modifier|modified 1
The speak mode, in contrast, simply chooses a meaning1 (2.6.1) for the communicative purpose at hand. The special case of “diplomatic ambiguity,” as in the underhanded formulation of international peace treaties (Pehar 2001), is accommodated by the speaker’s ability to view a phrasing from the perspective of the hearer’s interpretation (turn-taking, 1.1.1).
282
15. Ambiguity, Word Order, and Collocation
and (iv) conjunct−conjunct. Modifier|modified divides further into (iii.a) adnominal|noun and (iii.b) adverbial|verb, while conjunct−conjunct divides further into (iv.a) noun−noun, (iv.b) verb−verb, and (iv.c) adj−adj (6.6.1). Orthogonal to the distinction between the four semantic relations of structure and the grammatical roles of subject, object, predicate, modifier and conjunct are the grammatical complexity levels (i) phrasal, (ii) elementary, and (iii) clausal (CLaTR Sect. 9.6). For adnominal and adverbial modification, these levels may be illustrated as follows: 15.1.1 M ODIFIERS
AT ELEMENTARY, PHRASAL , AND CLAUSAL LEVEL
adnominal iii clausal: the apple which Julia bought ii phrasal: the apple on the table i elementary: the beautiful apple
adverbial sang when John appeared sang on the table sang beautifully
At the (i) elementary2 and the (iii) clausal3 level, the adnominal and the adverbial modification are concretely distinct. At the (ii) phrasal level, however, the surface form does not provide a concrete distinction between the adnominal and the adverbial use.4 Depending on the position of the phrasal modifier in the sentence, this causes possible ambiguities, as in the following example: 15.1.2 A MBIGUOUS
PHRASAL MODIFICATION
Julia ate the apple on the table. The position of the phrasal modifier following both the verb and the object noun supports the following readings: on the adnominal reading, it is the apple located on the table which is eaten; on the adverbial reading, it is the eating which is located on the table. Syntactic ambiguity resembles suspension compensation (11.6.4, 11.6.5) in that both (i) are hear mode phenomena which (ii) apply more than one operation in a single derivation step. They also have in common that (iii) the core value of the next word is mapped into two different proplets at the now front. 2
3 4
An adnominal form like beautiful is unmarked in English, while the adverbial counterpart beautifully is marked by a suffix. In German, in contrast, it is the adverbial, e.g., singt schön, which is unmarked, while the adnominal forms are marked by suffixes which agree with the determiner, e.g., der schöne, ein schöner, dem schönen (FoCL Sect. 18.1). See Sects. 7.3, 7.4 for the adnominal and Sect. 7.5 for the adverbial clauses in English. There are also elementary adjectives such as fast which may be used adnominally, as in the fast car, and adverbially, as in the car drove fast (CLaTR 3.5.5). However, due to their prenominal position in English, elementary adnominals can not cause ambiguity like phrasal modifiers, which take a postnominal position when modifying a noun.
15.1 Adnominal and Adverbial Uses of Phrasal Modifiers
283
For example, the suspension compensation in Sect. 13.4 copies the core value give of the next word into the fnc slot of farmer with NOM+FV (13.4.4) and also into the nc slot of speed with V+V (13.4.5). In the syntactic-semantic ambiguity caused by the phrasal modifier on the table (15.1.2), V+PREP (15.1.5) may connect the prepositional phrase to the verb proplet eat and N+PREP (15.1.6) may connect it to the noun proplet apple. They differ, however, in that a suspension compensation makes up for a relation which could not be established earlier, resulting in a single content (meaning1 ), while an ambiguity establishes an additional relation, resulting in two5 contents. The choice between these contents may be handled by the hearer in the following four ways: 15.1.3 O PTIONS 1. 2. 3. 4.
FOR TREATING ADNOMINAL - ADVERBIAL AMBIGUITY
both relations are disregarded to avoid crowding in the modifier’s mdd slot only the adnominal relation is established, triggered by priming only the adverbial relation is established, triggered by priming both relations are established simultaneously, using packing
The options 1–3 take an agent-based point of view. Option 4 takes a sign-based point of view, though with the efficient (linear) method of semantic doubling (FoCL 12.5.6). While the mdr (modifier) slot of a noun or a verb proplet may have more than one value (15.3.9, 15.4.3), the mdd (modified) slot of a modifier usually has at most one value. Thus, writing more than one value into an mdd slot causes crowding.6 The hearer may avoid crowding by using option 1: 15.1.4 O PTION 1: noun: Julia sem: nm f fnc: eat mdr: prn: 1
BOTH RELATIONS ARE OMITTED
verb: eat sem: past arg: Julia apple mdr: prn: 1
noun: apple sem: def sg fnc: eat mdr: prn: 1
noun: table sem: on def sg mdd: mdr: prn: 1
The mdr slots of apple and eat, and the mdd slot of table are empty. Nevertheless, the proplets are connected by their prn value, here 1. 5
6
In a “modifier recursion” such as Julia ate the apple on the table under the tree in the garden (15.3.6) the alternatives between an adnominal and an adverbial interpretation may accumulate, depending on the method of analysis and the number of phrasal modifiers. The occurrence of crowding may be described in two ways, depending on whether multiple operations are assumed to apply simultaneously or in sequence. If multiple operations in a single hear mode derivation step are applied simultaneously, V+PREP and N+PREP would try to write into the mdd slot of the next word table at the same time. If they are applied in sequence, in contrast, the second operation would find an mdd slot in the table proplet which is already occupied by a value.
284
15. Ambiguity, Word Order, and Collocation
Option 2 would be chosen if the hearer’s preceding cognition involved an apple on the table, thus priming an adnominal interpretation of the phrasal modifier: 15.1.5 O PTION 2: noun: Julia sem: nm f fnc: eat mdr:
prn: 2
ONLY THE ADNOMINAL RELATION IS ESTABLISHED
verb: eat sem: past arg: Julia apple mdr: prn: 2
noun: apple sem: def sg fnc: eat mdr: table prn: 2
noun: table sem: on def sg mdd: apple mdr: prn: 2
In contradistinction to 15.1.4, the mdr slot of apple has the value table and the mdd slot of table has the value apple. Option 3 would be chosen if the hearer’s preceding cognition involved eating on the table, thus priming an adverbial interpretation of the phrasal modifier: 15.1.6 O PTION 3: noun: Julia sem: nm f fnc: eat mdr: prn: 3
ONLY THE ADVERBIAL RELATION IS ESTABLISHED
verb: eat sem: past arg: Julia apple mdr: table prn: 3
noun: apple sem: def sg fnc: eat mdr: prn: 3
noun: table sem: on def sg mdd: eat mdr: prn: 3
In contradistinction to 15.1.5, it is the mdr slot of eat which has the value table, while the mdd slot of table has the value eat. Option 4, finally, is the view of the sign-based linguist: the sign is analyzed in isolation, without concern for possible preceding or current contexts. However, in order to avoid the unappetizing complexity-theoretical consequences of multiplying out each reading as a separate analysis, e.g. a phrase structure tree, a formula of predicate calculus, or a set of proplets, the two readings are "packed" as alternative values of the mdd attribute of the table proplet:

15.1.7 OPTION 4: ESTABLISHING BOTH RELATIONS SIMULTANEOUSLY

[noun: Julia | sem: nm f | fnc: eat | mdr: | prn: 4]
[verb: eat | sem: past | arg: Julia apple | mdr: table | prn: 4]
[noun: apple | sem: def sg | fnc: eat | mdr: table | prn: 4]
[noun: table | sem: on def sg | mdd: apple $ eat | mdr: | prn: 4]
This analysis is the opposite of the crowding analysis 15.1.4: instead of ignoring a modifier with more than one modified, it integrates the alternative readings 15.1.5 and 15.1.6 into a single set of proplets. The mdr slots of apple and eat have the value table, and the table proplet has the feature [mdd: apple $ eat], using "$" to indicate alternative readings. As a sign-based analysis, a speak mode derivation from this set of proplets is neither required nor envisaged.
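To make the packing mechanism concrete, consider the following minimal sketch, which is illustrative and not part of the DBS software: proplets are represented as Python dictionaries, and a second modified is stored as an additional mdd value, corresponding to the "$"-notation, instead of multiplying out two separate proplet sets.

    # Minimal sketch (not the DBS implementation): proplets as dicts,
    # with "$"-packed alternatives represented as a list in the mdd slot.

    def pack_mdd(proplet, new_value):
        """Add an alternative mdd value instead of duplicating the content."""
        current = proplet.setdefault("mdd", [])
        if new_value not in current:
            current.append(new_value)
        return proplet

    table = {"noun": "table", "sem": ["on", "def", "sg"],
             "mdd": ["apple"], "prn": 4}
    pack_mdd(table, "eat")
    print(table["mdd"])   # ['apple', 'eat']  -- i.e. [mdd: apple $ eat]

Because the alternatives are confined to one slot of one proplet, the cost of the ambiguity remains linear in the number of readings, in line with the method of semantic doubling mentioned above.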
15.2 Graphical Analyses of the Hear and the Speak Mode

The following hear mode graph analysis of 15.1.2 shows the four options of 15.1.3 simultaneously in the form of the dotted arrows in line 4:

15.2.1 HEAR MODE DERIVATION OF 15.1.2

[Derivation display: lexical lookup provides a proplet for each of the surfaces Julia, ate, the, apple, on, the, table, and the period, followed by time-linear syntactic-semantic parsing in lines 1–7. In line 4, the potential cross-copyings between apple and the prepositional proplet and between eat and the prepositional proplet appear as dotted arrows, recorded as the feature [mdd: eat*apple]. The result is the set of proplets Julia, eat, apple, and table, held together by the common prn value 8.]
If the dotted arrows are ignored, option 1 is obtained. Option 2 is represented by accepting the cross-copying between apple and table, but ignoring the cross-copying between eat and table. Option 3 is represented by accepting the cross-copying between eat and table, but ignoring the cross-copying between apple and table. Option 4, finally, is obtained by accepting both cross-copyings. Regardless of which cross-copying is accepted in line 4, the remainder of the derivation continues in the same way.

In contrast to the hear mode analysis, the speak mode does not have to deal with options 1 (15.1.4) and 4 (15.1.7). Instead, the agent simply chooses the adnominal or the adverbial interpretation in accordance with communicative intent. Consider the graph underlying the adnominal reading (15.1.5):

15.2.2 SPEAK MODE GRAPH UNDERLYING PHRASAL ADNOMINAL

[(i) semantic relations graph (SRG): eat connects to Julia (subject) and apple (object); table modifies apple. (ii) signature: V over N and N, with N below the object N. (iii) numbered arcs graph (NAG): arcs 1–2 between eat and Julia, arc 3 from eat to apple, arcs 4–5 between apple and table, arc 6 from apple back to eat.]

(iv) surface realization
1 Julia (V/N), 2 ate (N/V), 3 the_apple (V\N), 4 on_the_table (N|N), 5 empty return (N|N), 6 . (N\V)
The graph underlying the alternative adverbial reading (15.1.6) is as follows:

15.2.3 SPEAK MODE GRAPH UNDERLYING PHRASAL ADVERBIAL

[(i) SRG: eat connects to Julia and apple; table modifies eat. (ii) signature: V over N, N, and N. (iii) NAG: arcs 1–2 between eat and Julia, arcs 3–4 between eat and apple, arcs 5–6 between eat and table.]

(iv) surface realization
1 Julia (V/N), 2 ate (N/V), 3 the_apple (V\N), 4 empty return (N\V), 5 on_the_table (V|N), 6 . (N|V)
Both modifications are opaque: N|N instead of A|N in 15.2.2, and N|V instead of A|V in 15.2.3. Constructing a similar ambiguity with elementary modifiers is not possible in English. For example, even if the adnominal and the adverbial uses of an elementary modifier are not distinguished morphologically, as in fast, the alternative uses cannot be combined in the same surface: in adnominal use the elementary modifier must precede the modified, as in The fast car won, in contradistinction to the adverbial use, as in The car drove fast. At the clausal level, finally, the adnominal and the adverbial modifiers cannot even be related. For example, the adnominal clause which Julia bought has no adverbial counterpart. Conversely, the adverbial clause when John appeared⁷ has no adnominal counterpart (15.1.1).
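The relation between a NAG and its surface realization, as in 15.2.2 and 15.2.3, may be illustrated with a short sketch. The following Python fragment is a hypothetical rendering for illustration only: it lists the numbered arcs of the NAG in 15.2.2 together with the surface, if any, realized while traversing each arc.

    # Illustrative sketch of a speak mode traversal: the NAG of 15.2.2 as an
    # ordered list of numbered arcs, some of which realize a surface.

    nag = [  # (arc number, from, to, relation, surface or None)
        (1, "eat", "Julia", "V/N", "Julia"),
        (2, "Julia", "eat", "N/V", "ate"),
        (3, "eat", "apple", "V\\N", "the_apple"),
        (4, "apple", "table", "N|N", "on_the_table"),
        (5, "table", "apple", "N|N", None),   # empty return traversal
        (6, "apple", "eat", "N\\V", "."),
    ]

    def realize(nag):
        """Follow the numbered arcs in order and emit the surfaces, if any."""
        return " ".join(s for (_, _, _, _, s) in nag if s is not None)

    print(realize(nag))   # Julia ate the_apple on_the_table .

Choosing the adverbial reading of 15.2.3 instead merely changes the arc list, while the traversal procedure stays the same.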
15.3 Intrapropositional Modifier Chaining (Adnominal)

Because modifiers are optional, their number is open, i.e. it may be zero, one, two, or more. Intrapropositionally, multiple modifiers raise the question of (i) how they are related to each other (8.1.3) and of (ii) how they are related to their modified (8.1.4). These questions must be answered separately for elementary and phrasal adnominals and adverbials. Let us begin with an example of elementary adnominals:

15.3.1 CHAINING OF ELEMENTARY ADNOMINALS

Julia ate the sweet little red apple

In English, elementary adnominals must be placed between the determiner and the modified noun. Because sweet does not modify little and sweet little does not modify red, the relation between them is coordination rather than modification, as shown by the following proplet set representing the content:

15.3.2 PROPLET SET SHOWING AN ELEMENTARY ADNOMINAL CHAIN

[sur: | noun: Julia | cat: snp | sem: nm f | fnc: eat | mdr: | nc: | pc: | prn: 52]
[sur: | verb: eat | cat: #n′ #a′ decl | sem: past | arg: Julia apple | mdr: | nc: | pc: | prn: 52]
[sur: | noun: apple | cat: snp | sem: def sg | fnc: eat | mdr: sweet& | nc: | pc: | prn: 52]
[sur: | adj: sweet& | cat: adn | sem: pad | mdd: apple | mdr: | nc: little | pc: | prn: 52]
[sur: | adj: little | cat: adn | sem: pad | mdd: | mdr: | nc: red | pc: sweet& | prn: 52]
[sur: | adj: red | cat: adn | sem: pad | mdd: | mdr: | nc: | pc: little | prn: 52]

⁷ According to the stylists of English, adnominal uses as in the moment when John appeared are ungrammatical: the adverbial conjunction when should be replaced here with adnominal at which.
The elementary adnominals are coordinated via their nc and pc values. The coordination is connected to the modified apple by the conjunct sweet, &-marked as initial (8.1.4). By using only the initial proplet of the coordination, the relation to the modified is the same as for a single elementary adnominal (6.2.2). The speak mode realization from this set of proplets is based on the following DBS graph analysis:

15.3.3 ELEMENTARY ADNOMINALS CHAINED BY COORDINATION

[(i) SRG: eat connects to Julia and apple; the coordination sweet–little–red hangs off apple. (ii) signature: V over N and N, with A−A−A below the object N. (iii) NAG: arcs 1–2 between eat and Julia, arc 3 from eat to apple, arc 4 from apple to sweet, arcs 5–6 along the chain to little and red, arcs 7–8 back along the chain, arc 9 from sweet to apple, arc 10 from apple back to eat.]

(iv) surface realization
1 Julia (V/N), 2 ate (N/V), 3 the (V\N), 4 sweet (N|A), 5 little (A−A), 6 red (A−A), 7 empty return (A−A), 8 empty return (A−A), 9 apple (A|N), 10 . (N\V)
The speak mode navigation realizes the determiner the when first traversing the apple proplet (arc 3), realizes sweet little red when it continues along the adnominal chain (arcs 4, 5, 6), returns empty (arcs 7, 8) to apple to realize the common noun (arc 9), and returns to eat to realize the period (arc 10).

Next we turn to adnominals at the phrasal level, using the following example:

15.3.4 CHAINING OF PHRASAL ADNOMINALS

Julia ate the apple on the table under the tree in the garden.

In English, phrasal adnominals must directly follow the modified noun. If they are chained, no other material may intervene. Because under the tree modifies on the table and in the garden modifies on the table under the tree, the relation between them is modification rather than coordination, as shown by the following proplet set representing the content of 15.3.4:

15.3.5 PROPLET SET SHOWING A PHRASAL ADNOMINAL CHAIN

[sur: | noun: Julia | cat: snp | sem: nm f | fnc: eat | mdr: | nc: | pc: | prn: 53]
[sur: | verb: eat | cat: #n′ #a′ decl | sem: past | arg: Julia apple | mdr: | nc: | pc: | prn: 53]
[sur: | noun: apple | cat: snp | sem: def sg | fnc: eat | mdr: table& | nc: | pc: | prn: 53]
[sur: | noun: table& | cat: adnv snp | sem: on def sg | mdd: apple | mdr: tree | nc: | pc: | prn: 53]
[sur: | noun: tree | cat: adnv snp | sem: under def sg | mdd: table& | mdr: garden | nc: | pc: | prn: 53]
[sur: | noun: garden | cat: adnv snp | sem: in def sg | mdd: tree | mdr: | nc: | pc: | prn: 53]
Apple has table& in its mdr slot, table has apple in its mdd and tree in its mdr slot, and tree has garden in its mdr slot. The prepositions on, under, and in are specified in the sem slot (6.5.4, A.4.5) of the phrasal adnominals, here followed by determiner values. Modification-based chaining resembles coordination-based chaining in that only the first modifier is specified in the mdr slot of the modified, as shown by the apple proplet. To support exhaustive retrieval (8.1.4), the initial phrasal modifier table is &-marked, just like the initial conjunct of a coordination. The speak mode realization from this set of proplets is based on the following DBS graph analysis:

15.3.6 PHRASAL ADNOMINALS CHAINED BY MODIFICATION

[(i) SRG: eat connects to Julia and apple; the chain table–tree–garden hangs off apple. (ii) signature: V over N and N, with N−N−N below the object N. (iii) NAG: arcs 1–2 between eat and Julia, arc 3 from eat to apple, arc 4 from apple to table, arcs 5–6 down the chain to tree and garden, arcs 7–9 back up the chain to apple, arc 10 from apple back to eat.]

(iv) surface realization
1 Julia (V/N), 2 ate (N/V), 3 the_apple (V\N), 4 on_the_table (N|N), 5 under_the_tree (N|N), 6 in_the_garden (N|N), 7 empty return (N|N), 8 empty return (N|N), 9 empty return (N|N), 10 . (N\V)
The speak mode navigation realizes the_apple in arc 3, on_the_table in arc 4, under_the_tree in arc 5, in_the_garden in arc 6, returns empty (arcs 7, 8, 9) to apple, and finally to eat to realize the period (arc 10).
A prenominal elementary adnominal chain may be combined with a postnominal phrasal adnominal chain, as shown by the following example:⁸

15.3.7 COMBINING ELEMENTARY AND PHRASAL ADNOMINALS

Julia ate the small red apple on the table under the tree in the garden.

15.3.8 DBS GRAPH ANALYSIS UNDERLYING THE PRODUCTION OF 15.3.7

[(i) SRG: eat connects to Julia and apple; the elementary chain small–red and the phrasal chain table–tree–garden both hang off apple. (ii) signature: V over N and N, with A−A and N−N−N below the object N. (iii) NAG: arcs 1–2 between eat and Julia, arc 3 from eat to apple, arc 4 from apple to small, arcs 5–6 between small and red, arc 7 from small back to apple, arc 8 from apple to table, arcs 9–10 down the phrasal chain to tree and garden, arcs 11–13 back up the chain to apple, arc 14 from apple back to eat.]

(iv) surface realization
1 Julia (V/N), 2 ate (N/V), 3 the (V\N), 4 small (N|A), 5 red (A−A), 6 empty return (A−A), 7 apple (A|N), 8 on_the_table (N|N), 9 under_the_tree (N|N), 10 in_the_garden (N|N), 11–13 empty returns (N|N), 14 . (N\V)
The object of the transitive declarative Julia ate the apple is modified by the prenominal chain of elementary small red and the postnominal chain of phrasal on the table under the tree in the garden. The mdr slot of a phrasal noun, here apple, may contain at most two chain-initial modifiers, one beginning a pre-, the other beginning a postnominal chain:

15.3.9 PROPLET SET REPRESENTING THE CONTENT OF 15.3.8

[sur: | noun: Julia | cat: snp | sem: nm f | fnc: eat | mdr: | nc: | pc: | prn: 54]
[sur: | verb: eat | cat: #n′ #a′ decl | sem: past | arg: Julia apple | mdr: | nc: | pc: | prn: 54]
[sur: | noun: apple | cat: snp | sem: def sg | fnc: eat | mdr: small& table& | nc: | pc: | prn: 54]
[sur: | adj: small& | cat: adn | sem: pad | mdd: apple | mdr: | nc: red | pc: | prn: 54]
[sur: | adj: red | cat: adn | sem: pad | mdd: | mdr: | nc: | pc: small& | prn: 54]
[sur: | noun: table& | cat: adnv snp | sem: on def sg | mdd: apple | mdr: tree | nc: | pc: | prn: 54]
[sur: | noun: tree | cat: adnv snp | sem: under def sg | mdd: table& | mdr: garden | nc: | pc: | prn: 54]
[sur: | noun: garden | cat: adnv snp | sem: in def sg | mdd: tree | mdr: | nc: | pc: | prn: 54]

⁸ In Korean as an SOV or head-final language, in contrast, the modification begins with the phrasal adnominals followed by the elementary adnominals and concludes with the modified noun, as in Julia in garden under tree on table small red apple ate (Prof. Kiyong Lee, personal communication).

The prenominal elementary modifiers small and red are connected via their nc and pc values (coordination). The postnominal phrasal modifiers table, tree, and garden are connected via the values of their mdd and mdr slots (modification). The chain-initial items, represented by the values small and table, are &-marked and have the value apple in their respective mdd slots.
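The two chaining mechanisms may be illustrated as follows. In this minimal sketch, which assumes a dictionary representation of the proplets in 15.3.9 rather than the actual DBS data structure, an elementary chain is collected by following nc values and a phrasal chain by following mdr values, starting from the &-marked chain-initial modifiers:

    # Sketch (assumed representation): the modifier proplets of 15.3.9,
    # with the two chains reachable from the mdr slot of apple.

    proplets = {
        "apple":  {"mdr": ["small&", "table&"]},
        "small&": {"mdd": "apple", "nc": "red"},
        "red":    {"pc": "small&"},
        "table&": {"mdd": "apple", "mdr": "tree"},
        "tree":   {"mdd": "table&", "mdr": "garden"},
        "garden": {"mdd": "tree"},
    }

    def follow(start, attr):
        """Collect a modifier chain by repeatedly following one attribute."""
        chain, item = [], start
        while item:
            chain.append(item)
            item = proplets[item].get(attr)
        return chain

    print(follow("small&", "nc"))   # ['small&', 'red']            coordination
    print(follow("table&", "mdr"))  # ['table&', 'tree', 'garden'] modification

Storing only the two chain-initial values in the mdr slot of apple keeps the noun proplet small, while the complete chains remain retrievable by traversal.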
15.4 Multiple Adverbials

While the position of adnominal modifiers in English is fixed to the order det + adn⁺ + noun + prep⁺, where adn indicates an elementary and prep a phrasal adnominal modifier,⁹ the position of adverbial modifiers is relatively free. For example, the adverbial reading of 15.1.2 has the following word order variant:

15.4.1 UNAMBIGUOUS ADVERBIAL MODIFICATION

On the table Julia ate the apple.

The hear mode derivation of this sentence results unambiguously in the content 15.1.6. The speak mode graph analysis has the same (i) SRG, the same (ii) signature, and the same (iii) NAG as 15.2.3, but a different (iv) surface realization.

Within a sentence, all possible positions of adverbial modifiers may be filled. Consider the following example with four elementary adverbials:

15.4.2 MULTIPLE ELEMENTARY ADVERBIAL MODIFIERS

Perhaps Julia is still eating the apple there now.

Because there is no coordination or modification relation between perhaps, still, there, and now, each adverbial is connected to the verb directly, as shown by the following proplet representation:

15.4.3 REPRESENTING THE CONTENT OF 15.4.2 AS A SET OF PROPLETS

[noun: Julia | sem: nm f | fnc: eat | prn: 63]
[verb: eat | sem: pres prog | arg: Julia apple | mdr: perhaps still there now | prn: 63]
[noun: apple | sem: def sg | fnc: eat | prn: 63]
[adj: perhaps | sem: mod | mdd: eat | prn: 63]
[adj: still | sem: temp | mdd: eat | prn: 63]
[adj: there | sem: loc | mdd: eat | prn: 63]
[adj: now | sem: temp | mdd: eat | prn: 63]

⁹ The notation adn⁺ and prep⁺ is borrowed and adapted from RegEx, which uses .+ to indicate a list of one or more elements. RegEx in turn borrowed and adapted .+ from the positive closure LX⁺, which is used to indicate the free monoid without epsilon over a finite lexicon LX (FoCL Sect. 7.1). Similarly for our use of .* (Kleene star) and ǫ (element of a set).
The mdr slot of the eat proplet contains the values perhaps, still, there, and now, while the mdd slots of the perhaps, still, there, and now proplets all have the value eat. These relations are shown by the following DBS graph underlying the speak mode:

15.4.4 GRAPH ANALYSIS UNDERLYING THE PRODUCTION OF 15.4.2

[(i) SRG: eat connects to Julia and apple, and each of the adverbials perhaps, still, there, and now is attached to eat directly. (ii) signature: V over N, N, A, A, A, A. (iii) NAG: twelve arcs, a down-and-up pair between eat and each of perhaps, Julia, still, apple, there, and now.]

(iv) surface realization
1 Perhaps (V|A), 2 empty return (A|V), 3 Julia (V/N), 4 is (N/V), 5 still (V|A), 6 eating (A|V), 7 the_apple (V\N), 8 empty return (N\V), 9 there (V|A), 10 empty return (A|V), 11 now (V|A), 12 . (A|V)
The order of the arcs in the NAG is strictly pre-order. The direct modification of the verb is the third possibility of connecting multiple modifiers to their modified – in addition to coordination and modification. The latter were illustrated in 15.3.3 and 15.3.6, respectively, for adnominals, but may also be shown for adverbials. Regarding the order of adverbial modifiers in the various slots possible in an English sentence, there are general recommendations of a stylistic nature, but no strict grammatical rules. To capture these collocational word orders, DBS
may employ the ability of a talking robot to record the uses encountered in the hear mode interpretation of native natural expressions, ranked by frequency, and to apply these recorded uses in the speak mode.
15.5 Inferences for Ritualized Phrases and Idioms

Because autonomous robots are not yet available in today's computational linguistics, let us turn to linguistic phenomena which (i) lie outside the linguistic laboratory set-up (3.5.2 f.), yet (ii) make do with a standard computer. An example is the handling of question asking (interrogative) and question answering (responsive) in Sect. 5.1. It goes beyond the laboratory set-up because it involves the derivation of a query pattern from the question in the hear mode (5.1.6), application of the pattern to content in the hearer's word bank (5.1.7), and deriving an answer. Yet it stays within the limits of a standard computer and may be used for querying databases such as the Internet in natural language. Another such application is the use of inferences in the production and interpretation of a non-literal use (Sect. 5.4).

Inferencing is also used in conventionalized uses. Let us begin with a ritualized phrase such as How are you?. Customarily used after a greeting like Hello Mrs. Smith, it has the same syntactic-semantic structure as How are they?. However, while the latter is used as a genuine enquiry, the utterer of How are you? is typically not interested in the addressee's case history. What is expected of Mrs. Smith in the speak mode is instead the formulaic reply Fine, thank you. How are you? This may be implemented as the following DBS inference:

15.5.1 INFERENCE RESPONDING TO A GREETING

[sur: how | adj: wh-manner | fnc: be | prn: K]
[sur: are | verb: be | cat: #n2′ #be′ interrog | sem: pres | arg: pro2 wh-manner | prn: K]
[sur: you | noun: pro2 | cat: sp2 | fnc: be | prn: K]   =⇒   Fine, thank you. How are you?

The antecedent of the inference is a pattern. It was derived from the proplet set representing How are you? by replacing the prn value with the variable K. The consequent is shown here in a simplified manner. In the robot taking the place of Mrs. Smith, the inference is triggered by incoming Hello. The connection between the incoming greeting and the outgoing reply is adjacency in the time-linear sequence of events.
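How such an inference might be triggered can be sketched in a few lines of Python. The representation below is an assumption for illustration, not the DBS software: the antecedent proplets carry the variable K in their prn slot, and matching binds K to the prn value of the incoming content.

    # Illustrative sketch of DBS-style pattern matching with a prn variable K.

    ANTECEDENT = [
        {"adj": "wh-manner", "fnc": "be", "prn": "K"},
        {"verb": "be", "arg": ["pro2", "wh-manner"], "prn": "K"},
        {"noun": "pro2", "fnc": "be", "prn": "K"},
    ]
    CONSEQUENT = "Fine, thank you. How are you?"

    def match(pattern, content):
        """Bind K consistently and check that every pattern proplet matches."""
        binding = {}
        for pat in pattern:
            for prop in content:
                if all(prop.get(a) == v for a, v in pat.items() if a != "prn"):
                    k = binding.setdefault("K", prop["prn"])
                    if prop["prn"] == k:
                        break
            else:
                return None
        return binding

    heard = [
        {"adj": "wh-manner", "fnc": "be", "prn": 12},
        {"verb": "be", "arg": ["pro2", "wh-manner"], "prn": 12},
        {"noun": "pro2", "fnc": "be", "prn": 12},
    ]
    if match(ANTECEDENT, heard):
        print(CONSEQUENT)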
Another use of inferences without external action (and thus within the limits of a standard computer) is the interpretation of idioms, such as Cuff'm, A is calling the shots, B is doing time, C is cooking the books, and D is getting out of hand. Their conventionally intended meanings may be given by more literal paraphrases, such as the following:

15.5.2 INTERPRETING IDIOMS BY MEANS OF INFERENCES

Cuff'm!                    =⇒ Put them in handcuffs!
Take a hike!               =⇒ Leave!
A is calling the shots     =⇒ A is in command
B is doing time            =⇒ B is in prison
C is cooking the books     =⇒ C is falsifying financial data
D is getting out of hand   =⇒ D cannot be controlled

As a set of proplet patterns, an inference may adapt the grammatical properties of the consequent to those of the premise by binding the variables in the antecedent to values such as the number and gender of the subject and the tense and mood of the verb.
15.6 Collocations and Proverbs

A third area outside the DBS laboratory set-up, but inside the means of a standard computer, are frozen phrases. As an example consider collocations, such as the preference of strong/weak over *high/low in combination with the noun current, but of high/low over *strong/weak in combination with the noun voltage. For the hearer, these preferences are recorded in the form of concatenated proplets in the word bank, as in the following example:

15.6.1 THE high voltage PATTERNS

member proplets (selection)                                              owner values
. . .
[sur: high | adj: high | mdd: voltage | prn: 10]   [sur: high | adj: high | mdd: voltage | prn: 16]   [sur: high | adj: high | mdd: voltage | prn: 23]   high
. . .
[sur: voltage | noun: voltage | mdr: high | prn: 10]   [sur: voltage | noun: voltage | mdr: high | prn: 16]   [sur: voltage | noun: voltage | mdr: high | prn: 23]   voltage
. . .

The rightmost proplet of each token line is at the now front.
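A minimal sketch of how the speak mode might consult such token lines, assuming a dictionary-of-lists word bank rather than the actual DBS database, ranks the modifiers recorded for an owner value by frequency:

    # Sketch: choosing an adnominal for 'voltage' from recorded uses,
    # ranked by frequency. Illustrative, not the DBS implementation.

    from collections import Counter

    word_bank = {
        "voltage": [
            {"noun": "voltage", "mdr": "high", "prn": 10},
            {"noun": "voltage", "mdr": "high", "prn": 16},
            {"noun": "voltage", "mdr": "high", "prn": 23},
        ],
    }

    def preferred_adnominal(owner):
        """Rank the modifiers in the owner's token line by frequency."""
        counts = Counter(p["mdr"] for p in word_bank[owner] if p.get("mdr"))
        return counts.most_common(1)[0][0] if counts else None

    print(preferred_adnominal("voltage"))   # high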
Apparently, the agent has only encountered voltage modified by high. This convention of the surrounding language community has consequences when the agent switches into the speak mode and is in the process of choosing the appropriate adnominal for the noun voltage. Previous uses and their relative frequency are easily checked and emulated.¹⁰

It is not obvious to the language learner why *strong_voltage, for example, should be avoided, while high_voltage should be favored. Though violating collocational conventions does not cause communication to fail, they may be easily accommodated in a DBS robot simply by choosing the same words as those previously used by native speakers. A computational memory makes this especially easy: filling a word bank with the proplets resulting from parsing corpora provides a faithful recording of the collocations in the language.

Related to collocations are "constructions" (Fillmore and Kay 1996), i.e. syntactic patterns which are frequently used. As an example consider the following "A and B" constructions, some with an indication of the domain:

15.6.2 SOME INSTANCES OF THE "A AND B" CONSTRUCTION

arts and crafts; bag and tag (crime scene investigation); bait and switch (party girl hiring); bed and breakfast; big and stupid; black and blue; break and enter; bump and bite (shark attack); by and by; cocked and locked (rifle); crash and burn (landing airplane); cut and run; down and out; easy and slow; feed and breed; free and easy; good-bye and good riddance; high and dry; high and mighty; hit and rip (mugging); hit and run; hold and mold; idle and wild; in and out; kick and punch; lean and mean; left and right; life and death; life and limb; light and shadow; locked and loaded; loud and proud; meet and greet (election campaign); name and shame; near and dear; observe and report; older and wiser; pain and gain; part and parcel; pay and display (parking); pay and play; plug and play; pump and dump (stock market); pump and burn (fitness); quick and dirty; ram and rip (antique naval battle); rape and burn; rest and recreation; rise and shine; rock and roll; rock and ride; search and destroy; search and rescue; shake and bake; shake and fidget; short and easy; short and narrow; short and sweet; shut up and buckle; slash and burn (jungle); slice and dice; spic and span; straight and narrow; stupid and simple; tag and release (modern fish research); up and away; up and down; up and running; walk and talk; wild and wooly

¹⁰ For a recent analysis of non-compositional (non-Fregean) verb-noun collocations such as take a step or draw a conclusion see Kolesnikova and Gelbuk (2015).
A DBS robot may acquire such "constructions" just as easily as collocations, namely by choosing the same words in the same syntactic constellation as observed in use by native speakers. Finally let us consider a proverb:

15.6.3 EXAMPLE OF A PROVERB

The early bird catches the worm.¹¹

A human unfamiliar with this proverb may well guess the intended meaning as something like An early start leads to success, using analogical reasoning. In a DBS robot, proverbs and their proper use may simply be written directly into the language-dependent parts of the software. The real challenge in the design of a talking robot, however, is the ability to transfer new content spontaneously, including the production and interpretation of figures of speech. As an example of such a non-literal use, consider our earlier example Put the coffee on the table, with table intended to refer to an orange crate (Sect. 5.4).

¹¹ This proverb was first recorded in John Ray's A collection of English proverbs of the year 1670. The German analog is the rhyming Morgenstund hat Gold im Mund, which transliterates into English as Morning hour has gold in mouth.
Concluding Remark

Over the centuries, the language sciences have described many grammatical details of various natural languages, such as the morphological variations of word forms, the regularities of word order, the restrictions of agreement, the semantic relations of functor-argument and coordination, the sentential and the verbal moods, the complexity levels of elementary, phrasal, and clausal constructions, and the sign kinds of symbol, indexical, and name. These are the cogwheels of natural language communication.

More recently, computer science has introduced the methodology of building electronic software systems. Their design and implementation is based on such notions as interfaces for input and output, data structure, algorithm, database schema, and functional flow. Their testing, debugging, and upscaling on large amounts of relevant data serves not only to build practical applications, but also provides a verification method for systems modeling cognition.

Database Semantics combines the two approaches by designing a talking robot. Presented as a declarative specification, DBS is based on such notions as surface compositionality, a time-linear algorithm computing possible continuations, a content-addressable memory, the agent's switching between the hear, the think, and the speak mode, the coding of semantic relations by means of addresses, pattern matching based on the type-token relation, and the distinction between the literal meaning of language expressions (meaning1) and the speaker meaning of utterances (meaning2).

DBS reconstructs language communication by using nothing but unanalyzed modality-dependent external surfaces for transferring content from one agent to another. Everything else is cognitive processing inside the natural or artificial agent. The mechanism is practical in that it runs in real time, as required for free human-robot dialog.

The DBS robot's cognitive operations are driven by the principle of balance, i.e., the inherent necessity to maintain an equilibrium vis-à-vis a constantly changing environment in the agent's ecological niche. Balance is maintained by inferences which derive blueprints for action by matching currently activated content. Inferencing is also the basis for the coding and decoding of non-literal uses such as metaphor.
Appendix
A. Attributes, Values, Lexical Entries, Examples
In the declarative specification (1.2.1) of a DBS system, two aspects are to be distinguished.¹ One is the definition of an abstract software machine which characterizes the necessary properties of a cognitive agent with language. This concerns the language-independent external interfaces, components, functional flow, data structure, algorithm, and database schema. The other aspect is the application of the abstract system to different natural languages. This requires the definition of language-dependent attributes, their values as constants and variables, restrictions on variables, lexical proplets, and rule systems for automatic word form recognition, syntactic-semantic interpretation in the hear mode, semantic-syntactic activation in the speak mode, inferencing, and automatic word form production. This appendix summarizes the attributes, values, and lexical proplets used by the DBS fragments of English.
A.1 Number of Attributes in a Proplet

In the DBS system for English defined in Part III, the standard number of attributes in a content proplet (4.2.1) is nine. Consider the following examples:

A.1.1 STANDARD PROPLETS REPRESENTING CONTENT WORDS

noun: [1. sur: | 2. noun: car | 3. cat: snp | 4. sem: def sg | 5. fnc: hit | 6. mdr: heavy& | 7. nc: | 8. pc: | 9. prn: 4]
verb: [1. sur: | 2. verb: hit | 3. cat: #n′ #a′ decl | 4. sem: past | 5. arg: car tree | 6. mdr: | 7. nc: | 8. pc: | 9. prn: 4]
adjective: [1. sur: | 2. adj: heavy& | 3. cat: adn | 4. sem: pad | 5. mdd: car | 6. mdr: | 7. nc: old | 8. pc: | 9. prn: 4]

¹ Together, these two aspects of a declarative specification may be realized in an unlimited variety of computational implementations which differ in their accidental (non-necessary) properties, such as the choice of the programming language, the hardware, and the idiosyncrasies of the programmer.
The number of attributes in function words, in contrast, may differ. For example, the subordinating conjunctions have ten attributes. Consider the following lexical examples of that, who, and when:²

A.1.2 PROPLETS REPRESENTING SUBORDINATING CONJUNCTIONS

subject and object sentence conjunction (A.4.4): [1. sur: that | 2. verb: v_1 | 3. cat: | 4. sem: that | 5. arg: | 6. fnc: | 7. mdr: | 8. nc: | 9. pc: | 10. prn:]
adnominal subclause conjunction (A.6.3): [1. sur: who | 2. verb: v_1 | 3. cat: | 4. sem: who | 5. arg: ∅ | 6. fnc: | 7. mdr: | 8. nc: | 9. pc: | 10. prn:]
adverbial subclause conjunction (A.6.4): [1. sur: when | 2. verb: v_1 | 3. cat: | 4. sem: when | 5. arg: | 6. mdd: | 7. mdr: | 8. nc: | 9. pc: | 10. prn:]
The additional fnc attributes in that and who and the additional mdd attribute in when accommodate the double function of a subclause as (i) the argument or modifier in the higher clause, and as (ii) the predicate of the subclause (the hear mode derivation will replace the substitution value v_1 with the core value of the lower verb). The specific kind of the subordinating conjunction is stored as the first value of the 4. sem slot, using its English form, e.g., that. The 5. arg attributes are used for the verbal function of the subclause conjunction (Chap. 7). The additional attributes 6. fnc and 6. mdd are continuation attributes which serve to connect the verb of the subclause to the higher verb (extrapropositional relation).

In Part I and Part II, we use proplets which are reduced in that attributes without values are often omitted, as in the following examples:

A.1.3 REDUCED PROPLETS OMITTING ATTRIBUTES WITHOUT VALUES

noun: [noun: car | cat: snp | sem: def sg | fnc: hit | mdr: heavy& | prn: 4]
verb: [verb: hit | cat: #n′ #a′ decl | sem: past | arg: car tree | prn: 4]
adjective: [adj: heavy& | cat: adn | sem: pad | mdd: car | nc: old | prn: 4]

In Part III, however, we use complete proplets like those illustrated in A.1.1 and A.1.2, though without any attribute numbering, at the content but not at the operation level.

² Other exceptions are the punctuation signs (A.5.7) and the coordinating conjunctions (Sect. A.7).
Beginners usually use more attributes and more values than necessary for the coding of semantic relations between proplets. This is like using suspenders and a belt, or even several suspenders with several belts. The reasoning is that it can't hurt. This is mistaken, however. In long-term upscaling, unnecessary attributes and values get in the way, leading to more and more confusion and complexity. Distinctions should be implemented only when they are concretely given by the grammar of the language (methodological principle of surface compositionality).
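As an illustration of the standard nine-attribute format of A.1.1, the following sketch fixes the attribute inventory and order by construction. The class is a hypothetical rendering for illustration, not the declarative specification itself:

    # Minimal sketch of a standard nine-attribute noun proplet (A.1.1),
    # with the attribute order fixed by the declaration order of the fields.

    from dataclasses import dataclass

    @dataclass
    class NounProplet:
        # lexical attributes
        sur: str = ""
        noun: str = ""        # core attribute (verb/adj for the other kinds)
        cat: str = ""
        sem: str = ""
        # continuation attributes: obligatory (fnc) before optional (mdr, nc, pc)
        fnc: str = ""
        mdr: str = ""
        nc: str = ""
        pc: str = ""
        # book-keeping attribute comes last
        prn: int = 0

    car = NounProplet(noun="car", cat="snp", sem="def sg",
                      fnc="hit", mdr="heavy&", prn=4)
    print(car)

The fixed field order of the class mirrors the ordering principles stated next in A.2.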
A.2 Standard Order of Attributes in a Proplet

Besides the standard number of attributes and their names in any given proplet, there is their standard order. Over the years, the following principles emerged:

A.2.1 LEXICAL ATTRIBUTES PRECEDE CONTINUATION ATTRIBUTES

There are four lexical attributes, namely 1. sur, 2. core, 3. cat, and 4. sem:

1. The sur(face) attribute is used in the hear mode for finding the relatively language-independent concept associated with a word form. For example, the German sur value Hund may be used to look up the proplet with the core value dog (3.2.2).
2. The core attribute is used in the speak mode for finding the language-dependent base form of a word. There are three kinds, namely noun, verb, and adj. For example, the core value dog may be used to look up proplets with the sur value Hund, chien, or pies (3.2.2).
3. The cat(egory) attribute takes values which code the distinction between singular and plural in nouns, the number and kind of valency positions in verbs, and the distinction between adnominal and adverbial use in adjs.
4. The sem(antics) attribute takes values which code grammatical gender in nouns, tense, mood, and aspect in verbs, and comparation in adjs, thus contributing to the control of agreement and to the word form selection in the speak mode.
A.2.2 OBLIGATORY PRECEDE OPTIONAL CONTINUATION ATTRIBUTES

There are six continuation attributes, 1. fnc, 2. arg, 3. mdd, 4. mdr, 5. nc, and 6. pc. In a proplet, the number of continuation attributes is four or five. The obligatory continuation attributes fnc, arg, and mdd depend on the proplet kind, while the optional continuation attributes mdr, nc, and pc are the same for all kinds.

1. Origin of values: The continuation attributes fnc (functor), arg (argument), mdd (modified), mdr (modifier), nc (next conjunct), and pc (previous conjunct) receive their values via operation-based cross-copying in the hear mode. This is in contrast to the lexical attributes, which receive their values during automatic word form recognition.
2. Obligatory vs. optional values: The continuation attributes fnc, arg, and mdd are obligatory in the sense that a content is incomplete without them having suitable values. The continuation attributes mdr, nc, and pc, in contrast, are optional in that a content may be complete even if they do not have any values.
3. Which proplet kinds take which obligatory continuation attributes: Proplets with the core attribute noun take fnc, proplets with the core attribute verb take arg, and proplets with the core attribute adj take mdd as their obligatory continuation attributes (A.1.1, attribute 5). In subclause proplets with the core attribute noun, the obligatory continuation attributes are arg and fnc, while in subclauses with the core attribute verb they are arg and mdd (A.1.2, attributes 5 and 6).
4. Optional continuation attributes are common to all proplet kinds: The optional continuation attributes are mdr (modifier, attribute 6 in A.1.1), nc (next conjunct, attribute 7 in A.1.1 and attribute 8 in A.1.2), and pc (previous conjunct, attribute 8 in A.1.1 and attribute 9 in A.1.2). They follow the obligatory continuation attributes and are the same for noun, verb, and adj proplets.
ATTRIBUTES COME AT THE END
The book-keeping attributes of DBS are prn, trn, and wrd. 1. The value of the prn attribute (preposition number) indicates the token line position of a proplet in the time-linear arrival order and holds the proplets
A.3 Summary of Proplet Attributes and Values
305
of an elementary proposition together.3 It is positioned as the last attribute of a proplet. 2. The value of the trn attribute (transition number) keeps track of how often a proplet in the agent’s word bank was traversed. 3. The wrn attribute (word number) indicates from which word form in the surface a proplet was derived. In the hear mode derivations of Part II, it is written as a proplet subscript. Of the book-keeping attributes, only prn and its value are obligatory in the declarative specification, while trn and wrn are only used in the software.
A.3 Summary of Proplet Attributes and Values A.3.1 ATTRIBUTES 1. sur
takes a language dependent word form surface as its value in language proplets and NIL in context proplets
2. Core attributes
noun = represents an object in philosophy and an argument in logic verb = represents a relation in philosophy and a functor in logic adj = represents a property in philosophy and a modifier in logic
3. Continuation attributes for functor-argument arg fnc mdd mdr
= = = =
(argument) (functor) (modified) (modifier)
continuation attribute of verb proplets continuation attribute of noun proplets continuation attribute of adj proplets continuation attribute of noun, verb, and adj proplets
4. Continuation attributes for coordination
nc = (next conjunct) continuation attribute for coordination of noun, verb, or adj proplets pc = (previous conjunct) continuation attribute for coordination of noun, verb, or adj proplets
5. Grammatical attributes
cat = takes syntactic properties as values which directly control combinatorics sem = takes semantic properties as values which do not directly control combinatorics
6. Book-keeping attributes
prn = proposition number trc = transition counter (not used as here) wrn = word number (used as here as a subscript)
A.3.2 C ONSTANT VALUES 1. Surface values
The value of a sur attribute is a language dependent word form, for example, Italian cani, French chien, or German Hunde as counterparts of English dogs.
3
This resembles the binding function of quantifiers in symbolic logic (CLaTR Sect. 6.4).
2. Core values
The value of a core attribute is an elementary content defined lexically as a concept (sign kind symbol) or a pointer (sign kind indexical). The sign kind name receives its core value in a generalized act of baptism, resulting in a supplemented name proplet, i.e. a name supplemented with the address of its referent. As core values, concepts and pointers are used at the language and the context level. Concepts are represented here as English content words, e.g., farmer (A.4.5), serving as place holders for operationally defined recognition and action procedures of the cognitive agent. Indexicals, e.g., pro1 (A.4.2), are interpreted relative to a STAR (CLaTR Chap. 10). Core values may be represented by substitution values n_1, n_2, n_3, . . . for a noun and substitution values v_1, v_2, v_3, . . . for a verb.

3. Continuation values
The value of a continuation attribute is defined as an address. Extrapropositional relations are coded by long address values, consisting of a core and a prn value, e.g. [nc: (give 6)] in 13.4.5. Intrapropositional relations, in contrast, are coded by short address values, e.g. [arg: farmer] in 13.4.4 (no prn value).

4. Cat values of a noun proplet (FoCL 17.2.1)
sn = singular noun filler
sn′ = singular noun valency position in a determiner⁴
pn = plural noun filler
pn′ = plural noun position
nn = noun filler unmarked for number
nn′ = noun valency position unmarked for number
nm = name as noun filler
snp = singular noun phrase
pnp = plural noun phrase
s1 = first person singular nominative
sp2 = second person singular or plural nominative or oblique
s3 = third person singular nominative
p1 = first person plural nominative
p3 = third person plural nominative
obq = oblique

5. Cat values of a verb proplet (FoCL 17.2.2)
n′ = unrestricted nominative valency position (sang)
n-s3′ = nominative valency position excluding 3rd person singular filler (sing)
ns3′ = nominative valency position restricted to 3rd person singular filler (sings)
ns1′ = nominative valency position restricted to 1st person singular filler (am)
n-s13′ = nominative valency position excluding 1st or 3rd person singular filler (were)
ns13′ = nominative valency position restricted to 1st or 3rd person singular filler (was)
d′ = oblique valency position for indirect object (dative)
a′ = oblique valency position for direct object (accusative)
mod′ = oblique valency position for obligatory modifier
v = result category of a finite verb in a declarative and a Yes/No interrogative
v′ = valency position in the punctuation mark .
vi = result category of a finite verb in a WH interrogative
vimp = result category of a finite verb in an imperative
decl = syntactic mood declarative
interrog = syntactic mood interrogative (CLaTR Sect. 10.5)
impv = syntactic mood imperative (CLaTR Sect. 10.6)
be = valency filler for auxiliary be (FoCL Sect. 17.3)
be′ = valency position for auxiliary be
hv = valency filler for auxiliary have
hv′ = valency position for auxiliary have
do = valency filler for auxiliary do
do′ = valency position for auxiliary do

6. Cat values of an adj proplet
adnv = adjective which may be used adnominally or adverbially
adn = adjective restricted to adnominal use
adv = adjective restricted to adverbial use

7. Sem values of a noun proplet (6.4.6)
def = definite
indef = indefinite
sg = singular
pl = plural
exh = exhaustive
sel = selective
m = masculine
f = feminine
-mf = neuter⁵

8. Sem values of a verb proplet
pres = present tense
past = past tense
perf = perfect tense
prog = progressive aspect
subj = subjunctive mood
past/perf = past tense or perfect participle
be_pres = present tense of the auxiliary be
be_past = past tense of the auxiliary be
be_prog = progressive form of the auxiliary be
and similarly for the auxiliaries have and do
neg = negation
ind = indicative verbal mood⁶
sbjv = subjunctive verbal mood
impv = imperative verbal mood

9. Sem values of an adj proplet
pad = adjective in the positive degree
cad = adjective in the comparative degree
sad = adjective in the superlative degree

⁴ For coding agreement by identity, the same category segment, here sn, is used for the valency filler and the valency position. To distinguish the filler and the position, the segment representing the position is marked with an apostrophe, here sn′. This convention is applied to valency positions in general, including also instances of agreement by definition (FoCL Sect. 16.5).
⁵ The letter n is already used for the unrestricted nominative valency position (A.5.1, 3).
⁶ For a more detailed discussion of verbal moods see Quirk et al. (1985:149–150).
10. Markers
′ = identifies a category segment as a valency position, e.g., n-s3′; used only in the cat values of verbs
# = indicates a canceled valency position in a cat slot (hear mode) and a traversed proplet in a continuation slot (speak mode)
& = indicates the initial conjunct of a coordination; used with core and continuation values
ea+ = marks an elementary adnominal as the currently last element of a coordination and as a growth point for the possible addition of another elementary conjunct
pa+ = marks a phrasal adnominal or adverbial as the currently last element of a coordination and as a growth point for the possible addition of another phrasal conjunct
ip+ = marks a sentence as complete and enables addition of a next sentence beginning; introduced by the hear mode operation S+IP and eliminated by IP+START

11. Book-keeping values
value of prn attribute = natural number assigned by the parser to the proplets of an elementary proposition
value of trc attribute = natural number assigned by the parser indicating how often a proplet in the word bank has been traversed
value of wrn attribute = natural number assigned by the parser to the next word proplet
A.3.3 VARIABLES OF PROPLET PATTERNS

1. List of variables
α, β, γ = core value
SM = sentence mood
VT = verb type filler
VT′ = verb type valency position
NP = noun phrase filler
NOM = nominative noun phrase filler
OBQ = oblique noun phrase filler
OBQ′ = oblique noun phrase valency position
CN = common noun filler
CN′ = common noun valency position
ADJ = adnominal or adverbial modifier
ADN = adnominal modifier
ADV = adverbial modifier
AUX = auxiliary filler
AUX′ = auxiliary valency position
X, Y, Z = .?.?.?.? (arbitrary sequence up to length 4)
#W′ = one or more canceled (#) valency positions (′), such as #n′ or #n′#hv′ (13.3.6, 13.3.7)
N_n = simultaneous substitution variable for a noun
V_n = simultaneous substitution variable for a verb
RV.n = replacement variable for a value (4.2.4)
RA.n = replacement variable for an attribute (5.1.3)
K = variable for a prn value
SM7 ǫ {decl, interrog, impv} VT ǫ {v, vi, vimp} VT′ ǫ {v′ , vi′ , vimp′ } NP ǫ {snp, pnp, s1, sp2, s3, p1, p3, obq} NP′ ǫ {n′ , n-s3′ , ns3′ , d′ , a′ } NOM ǫ {snp, pnp, s1, sp2, s3, p1, p3} OBQ ǫ {snp, pnp, sp2, pn, obq} OBQ′ ǫ {d′ , a′ , adj′ } CN ǫ {sn, pn} CN′ ǫ {nn′ , sn′ , pn′ } ADJ ǫ {adj, adn, adv} ADN ǫ {adn, adj} ADV ǫ {adv, adj} AUX ǫ {do, hv, be} AUX′ ǫ {do′ , hv′ , be′ } N_n ǫ {n_1, n_2, n_3, . . . } V_n ǫ {v_1, v_2, v_3, . . . } RV.n ǫ {core values} RA.n ǫ {core attributes}
3. Agreement conditions if NP = s1 if NP = s3, if NP = sp2, if NP ǫ {p1, p3} if NP ǫ {nm, snp}, if NP = pnp, if NP = np, if NP = obq, if AUX′ = do′ , if AUX′ = hv′ , if AUX′ = be′ , if N′ = nn′ , if N′ = sn′ , if N′ = pn′ ,
then NP′ ǫ {n′ , n-s3′ , ns1′ , ns13′ }} then NP′ ǫ {n′ , ns3′ , ns13′ } then NP′ ǫ {n′ , n-s3′ , n-s13′ , d′ , a′ } then NP′ ǫ {n′ , n-s3′ , n-s13′ } then NP′ ǫ {n′ , ns3′ , ns13′ , d′ , a′ } then NP′ ǫ {n′ , n-s3′ , n-s13′ , d′ , a′ } then NP′ ǫ {n′ , ns3′ , ns13′ , n-s3′ , n-s13′ , d′ , a′ } then NP′ ǫ {d′ , a′ } then AUX ǫ {snp, pnp, nm, obq} then AUX ǫ {hv, n′ } then AUX = be then N ǫ {nn, sn, pn} then N ǫ {nn, sn} then N ǫ {nn, pn}
A.4 Lexical Analysis of Nouns

A.4.1 NAMES

[sur: Julia | noun: | cat: snp | sem: nm f | fnc: | mdr: | nc: | pc: | prn:]
[sur: John | noun: | cat: snp | sem: nm m | fnc: | mdr: | nc: | pc: | prn:]
[sur: Fido | noun: | cat: snp | sem: nm -mf | fnc: | mdr: | nc: | pc: | prn:]
Names are special in that they have no lexical core value. Instead, they receive their core value in an act of generalized baptism. The result is a supplemented name, the core value of which is the address of the initial referent (2.6.7).

A.4.2 PERSONAL PRONOUNS

[sur: I | noun: pro1 | cat: s1 | sem: sg]
[sur: me | noun: pro1 | cat: obq | sem: sg]
[sur: we | noun: pro1 | cat: p1 | sem: pl]
[sur: us | noun: pro1 | cat: obq | sem: pl]
[sur: you | noun: pro2 | cat: sp2 | sem: ]
[sur: he | noun: pro3 | cat: s3 | sem: sg m]
[sur: him | noun: pro3 | cat: obq | sem: sg m]
[sur: she | noun: pro3 | cat: s3 | sem: sg f]
[sur: her | noun: pro3 | cat: obq | sem: sg f]
[sur: it | noun: pro3 | cat: snp | sem: sg -mf]
[sur: they | noun: pro3 | cat: p3 | sem: pl]
[sur: them | noun: pro3 | cat: obq | sem: pl]
PoP-5 (2.6.5, 2.6.6)

A.4.3 DETERMINERS

[sur: a(n) | noun: n_1 | cat: sn′ snp | sem: indef sg]
[sur: some | noun: n_1 | cat: nn′ np | sem: indef sel]
[sur: all | noun: n_1 | cat: pn′ pnp | sem: pl exh]
[sur: every | noun: n_1 | cat: sn′ snp | sem: pl exh]
[sur: the | noun: n_1 | cat: nn′ np | sem: def]
In DBS, determiners are analyzed as the beginning of phrasal nouns (Sect. 6.4).

A.4.4 SUBORDINATING CONJUNCTIONS FOR SUBJECT AND OBJECT CLAUSE

[sur: that | verb: v_1 | cat: snp | sem: that | arg: | fnc: | mdr: | nc: | pc: | prn:]
[sur: how | verb: v_1 | cat: snp | sem: how | arg: | fnc: | mdr: | nc: | pc: | prn:]

The substitution value v_1 will be replaced by the verb of the subclause (Chap. 7). The fnc slot will take the address of the higher verb as value. The kind of the subordinating conjunction is specified as the initial sem value.
A.4.5 PREPOSITIONS

[sur: in | noun: n_1 | cat: adnv | sem: in | mdd: | mdr: | nc: | pc: | prn:]
[sur: on | noun: n_1 | cat: adnv | sem: on | mdd: | mdr: | nc: | pc: | prn:]
[sur: under | noun: n_1 | cat: adnv | sem: under | mdd: | mdr: | nc: | pc: | prn:]

The cat value adnv enables adnominal as well as adverbial use.

A.4.6 COMMON NOUNS

[sur: car | noun: car | cat: sn | sem: sg]
[sur: cars | noun: car | cat: pn | sem: pl]
[sur: driver | noun: driver | cat: sn | sem: sg]
[sur: drivers | noun: driver | cat: pn | sem: pl]
[sur: farmer | noun: farmer | cat: sn | sem: sg m]
[sur: farmers | noun: farmer | cat: pn | sem: pl]
[sur: lift | noun: lift | cat: sn | sem: sg]
[sur: lifts | noun: lift | cat: pn | sem: pl]
[sur: tree | noun: tree | cat: sn | sem: sg -mf]
[sur: trees | noun: tree | cat: pn | sem: pl]

The specification of gender in the sem slot is needed for choosing the correct 3rd person singular pronoun, i.e. he, him, she, her, or it. In the case of man and woman, the choice is clear. In the case of dog it might be m, f, or -mf.
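The use of the sem values for pronoun selection may be sketched as follows; the mapping function is hypothetical and covers only the third person singular cases mentioned above:

    # Sketch: selecting a 3rd person singular pronoun from sem values.

    def third_person_pronoun(sem, oblique=False):
        """Map the gender values m / f / -mf to he-him / she-her / it."""
        if "m" in sem and "-mf" not in sem:
            return "him" if oblique else "he"
        if "f" in sem:
            return "her" if oblique else "she"
        return "it"

    print(third_person_pronoun(["sg", "m"]))                # he
    print(third_person_pronoun(["sg", "f"], oblique=True))  # her
    print(third_person_pronoun(["sg", "-mf"]))              # it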
A.5 Lexical Analysis of Verbs

A.5.1 MAIN VERBS, ONE-PLACE

1. Non-third person present tense
[sur: sleep | verb: sleep | cat: n-s3′ v | sem: pres]
[sur: sing | verb: sing | cat: n-s3′ v | sem: pres]
[sur: dream | verb: dream | cat: n-s3′ v | sem: pres]
[sur: speed | verb: speed | cat: n-s3′ v | sem: pres]
In English, this form of the finite verb is traditionally regarded as unmarked and used as the name (citation form) of the verbal paradigm (FoCL 13.1.3). It is also used as the infinitive (6.6.5–6.6.7; CLaTR Sect. 15.4) and the imperative form in English (CLaTR Sect. 10.6).

2. Third person singular present tense
[sur: sleeps | verb: sleep | cat: ns3′ v | sem: pres]
[sur: sings | verb: sing | cat: ns3′ v | sem: pres]
[sur: dreams | verb: dream | cat: ns3′ v | sem: pres]
[sur: speeds | verb: speed | cat: ns3′ v | sem: pres]

The valency position ns3′ for the nominative is restricted to the 3rd person singular, in contradistinction to n-s3′ of the unmarked form, which accepts any nominative except 3rd person singular (complementary distribution).

3. Past tense – past participle
[sur: slept | verb: sleep | cat: n′ v | sem: past/perf]
[sur: sang | verb: sing | cat: n′ v | sem: past]
[sur: dreamt | verb: dream | cat: n′ v | sem: past/perf]
[sur: sped | verb: speed | cat: n′ v | sem: past/perf]

In English main verbs, the nominative valency position of the past tense form, here represented by the cat value n′, is unrestricted for the filler. The agreement conditions between the auxiliary have and the past/perf form of a regular main verb allow the combination into a complex verb form, e.g., has slept. In this way, a lexical ambiguity is avoided (surface compositionality).

4. Separate past participle
[sur: sung | verb: sing | cat: hv | sem: perf]

Verbs which have different forms for the past tense (sang) and the past participle (sung) are classified as irregular, while verbs which do not distinguish morphologically between the two uses (learned) are classified as regular in English.
5. Present participle
[sur: sleeping | verb: sleep | cat: be | sem: prog]
[sur: singing | verb: sing | cat: be | sem: prog]
[sur: dreaming | verb: dream | cat: be | sem: prog]
[sur: speeding | verb: speed | cat: be | sem: prog]

A.5.2 MAIN VERBS, TWO-PLACE

1. Non-third person singular present tense
[sur: hit | verb: hit | cat: n-s3′ a′ v | sem: pres]
[sur: know | verb: know | cat: n-s3′ a′ v | sem: pres]

In English, the second valency position of a two-place verb is always a direct object, represented as a′ (accusative). German, in contrast, distinguishes between two-place verbs like helfen (help) which take an indirect object d′ (dative) and those like töten (kill) which take a direct object a′ (accusative).

2. Third person singular present tense
[sur: hits | verb: hit | cat: ns3′ a′ v | sem: pres]
[sur: knows | verb: know | cat: ns3′ a′ v | sem: pres]

3. Past tense – past participle
[sur: hit | verb: hit | cat: n′ a′ v | sem: past/perf]
[sur: knew | verb: know | cat: n′ a′ v | sem: past]
4. Separate past participle
[sur: known | verb: know | cat: a′ hv | sem: perf]

Combined with a finite form of the auxiliary have (A.5.5) in the active voice, the cat value a′ introduces a valency position for a direct object.

5. Present participle
[sur: hitting | verb: hit | cat: a′ be | sem: prog]
[sur: knowing | verb: know | cat: a′ be | sem: prog]

The cat value be ensures combination with the correct auxiliary.

A.5.3 FORMS OF A THREE-PLACE MAIN VERB

[sur: give | verb: give | cat: n-s3′ d′ a′ v | sem: pres]
[sur: gives | verb: give | cat: ns3′ d′ a′ v | sem: pres]
[sur: gave | verb: give | cat: n′ d′ a′ v | sem: past]
[sur: given | verb: give | cat: d′ a′ hv | sem: perf]
[sur: giving | verb: give | cat: d′ a′ be | sem: prog]

Three-place verbs of English always take an indirect object, represented as d′ (dative), and a direct object, represented as a′ (accusative). German, in contrast, distinguishes between three-place verbs like geben (give) which take an indirect object d′ (dative) and a direct object a′ (accusative), similar to English, and those like lehren (teach) which take two direct objects a′ (accusative). There are also three-place verbs in German which take an accusative and a genitive, such as verdächtigen (to suspect so. of sth.).
A.5.4 FORMS OF THE AUXILIARY be

[sur: am | verb: v_1 | cat: ns1′ be′ v | sem: be_pres]
[sur: ain't | verb: v_1 | cat: n′ be′ v | sem: be_pres neg]
[sur: is | verb: v_1 | cat: ns3′ be′ v | sem: be_pres]
[sur: isn't | verb: v_1 | cat: ns3′ be′ v | sem: be_pres neg]
[sur: are | verb: v_1 | cat: ns2p′ be′ v | sem: be_pres]
[sur: aren't | verb: v_1 | cat: ns2p′ be′ v | sem: be_pres neg]
[sur: was | verb: v_1 | cat: ns13′ be′ v | sem: be_past]
[sur: wasn't | verb: v_1 | cat: ns13′ be′ v | sem: be_past neg]
[sur: were | verb: v_1 | cat: ns2p′ be′ v | sem: be_past]
[sur: weren't | verb: v_1 | cat: ns2p′ be′ v | sem: be_past neg]
[sur: been | verb: v_1 | cat: be′ hv | sem: be_perf]
[sur: being | verb: v_1 | cat: be′ be | sem: be_prog]

The paradigm of the auxiliary be has the greatest number of verbal forms in English. Consequently, the agreement conditions for the nominative are the most differentiated (FoCL 17.3.1, CLaTR 7.5.6–7.5.8).

A.5.5 FORMS OF THE AUXILIARY have

[sur: have | verb: v_1 | cat: n-s3′ hv′ v | sem: hv_pres]
[sur: haven't | verb: v_1 | cat: n-s3′ hv′ v | sem: hv_pres neg]
[sur: has | verb: v_1 | cat: ns3′ hv′ v | sem: hv_pres]
[sur: hasn't | verb: v_1 | cat: ns3′ hv′ v | sem: hv_pres neg]
[sur: had | verb: v_1 | cat: n′ hv′ v | sem: hv_past]
[sur: hadn't | verb: v_1 | cat: n′ hv′ v | sem: hv_past neg]
[sur: having | verb: v_1 | cat: be′ hv | sem: hv_prog]
A.5.6 FORMS OF THE AUXILIARY do

[sur: do | verb: v_1 | cat: n-s3′ do′ v | sem: do_pres]
[sur: don't | verb: v_1 | cat: n-s3′ do′ v | sem: do_pres neg]
[sur: does | verb: v_1 | cat: ns3′ do′ v | sem: do_pres]
[sur: doesn't | verb: v_1 | cat: ns3′ do′ v | sem: do_pres neg]
[sur: did | verb: v_1 | cat: n′ do′ v | sem: do_past]
[sur: didn't | verb: v_1 | cat: n′ do′ v | sem: do_past neg]
[sur: done | verb: v_1 | cat: a′ hv | sem: do_perf]
[sur: doing | verb: v_1 | cat: a′ be | sem: do_prog]

A.5.7 INTRAPROPOSITIONAL SUBORDINATING CONJUNCTION

[sur: to | verb: v_1 | cat: inf | fnc: | arg: | prn:]   (CLaTR 7.6.5, Sect. 15.4)

A.5.8 SENTENTIAL PUNCTUATION SIGNS (11.5.2)⁸

[sur: . | verb: v_1 | cat: v′ decl | prn:]
[sur: ? | verb: v_1 | cat: vi′ interrog | prn:]
[sur: ! | verb: v_1 | cat: vimp′ impv | prn:]
A.6 Lexical Analysis of Adjectives

A.6.1 FORMS OF ADNOMINALS

[sur: beautiful | adj: beautiful | cat: adn | sem: pad]
[sur: heavy | adj: heavy | cat: adn | sem: pad]
[sur: heavier | adj: heavy | cat: adn | sem: cad]
[sur: heaviest | adj: heavy | cat: adn | sem: sad]
[sur: old | adj: old | cat: adnv | sem: pad]
[sur: older | adj: old | cat: adnv | sem: cad]
[sur: oldest | adj: old | cat: adnv | sem: sad]

A.6.2 FORMS OF ADVERBIALS

[sur: heavily | adj: heavy | cat: adv | sem: pad]
[sur: beautifully | adj: beautiful | cat: adv | sem: pad]
A.6.3 SUBORDINATING CONJUNCTIONS FOR ADNOMINAL CLAUSE

[sur: who | verb: v_1 | cat: | sem: who (7.6.6) | arg: ∅ (9.5.7) | mdd: | nc: | pc: | prn:]
[sur: whom | verb: v_1 | cat: | sem: whom (7.6.4) | arg: ∅ (9.5.5) | mdd: | nc: | pc: | prn:]
[sur: which | verb: v_1 | cat: | sem: which -mf (7.3.2, 7.4.1) | arg: ∅ | mdd: | nc: | pc: | prn:]

A.6.4 SUBORDINATING CONJUNCTIONS FOR ADVERBIAL CLAUSE

[sur: when | verb: v_1 | cat: | sem: when | arg: | mdd: | mdr: | nc: | pc: | prn:]   (7.5.1, 7.5.4, 9.6.1, 9.6.7)
[sur: before | verb: v_1 | cat: | sem: before | arg: | mdd: | mdr: | nc: | pc: | prn:]
[sur: after | verb: v_1 | cat: | sem: after | arg: | mdd: | mdr: | nc: | pc: | prn:]

For prepositions see A.4.5.
A.7 Coordinating Conjunctions

The coordinating (paratactic) conjunctions and and or may concatenate nouns with nouns, verbs with verbs, and adjs with adjs. This is formally expressed by using the replacement variable RA_m (5.1.3) as their core attribute:

⁸ There are also clausal punctuation marks such as the comma, the semicolon, the colon, the hyphen, and the quotation marks. Their lexical, syntactic, and semantic handling in the hear and the speak mode is omitted. For an early treatment see NEWCAT.
A.7.1 Conjunctive

[sur: and, RA_m: and, prn: ]   where RA_m ∈ {noun, verb, adj}

A.7.2 Disjunctive

[sur: or, RA_m: or, prn: ]   where RA_m ∈ {noun, verb, adj}
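As a rough illustration of this mechanism, the following Python sketch substitutes the replacement variable RA_m with the core attribute of the absorbing conjunct. The dictionary representation of proplets is an assumption made for the example, not the book's code.

    # Illustrative sketch: the conjunction's replacement variable RA_m is
    # replaced by the core attribute (noun, verb, or adj) of the conjunct
    # that absorbs the conjunction.
    CORE_ATTRIBUTES = ("noun", "verb", "adj")   # the range of RA_m

    def substitute_ra(conjunction: dict, conjunct: dict) -> dict:
        core = next(a for a in CORE_ATTRIBUTES if a in conjunct)
        result = dict(conjunction)
        result[core] = result.pop("RA_m")       # e.g. RA_m: and  ->  verb: and
        return result

    and_proplet = {"sur": "and", "RA_m": "and", "prn": None}
    ate_proplet = {"sur": "ate", "verb": "eat", "prn": 12}
    print(substitute_ra(and_proplet, ate_proplet))
    # {'sur': 'and', 'prn': None, 'verb': 'and'}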
The paratactic conjunctions resemble the punctuation signs of A.5.8 in that their attribute number is reduced. They differ, however, in that punctuation signs are absorbed into a preceding finite verb (no suspension), while paratactic conjunctions are absorbed into a following noun, verb, or adj (suspension).

A.7.3 Adversative

The conjunction but is traditionally listed with and and or. However, its use is much more restricted. For example, bought, cooked, and ate is acceptable, but *bought, cooked, but ate is not. It seems that but mainly concatenates pairs of conjuncts at the clause level, as in Mary walked, but Suzy talked or Mary declined the potatoes, but ate the broccoli.

[sur: but, verb: but, prn: ]
A.8 Viewing DBS abstractly

The DBS model of cognition is built on the stream of time. Speaking metaphorically, time flows through the agent like a motion picture film being moved through a projector. However, while a film is a recording, such that the individual images going in are the same as the images going out, the data going into the agent are recognition while the data going out are action. This difference is due to the nature of the agent's external interfaces. The interfaces of recognition turn incoming raw data into content, and the interfaces of action turn outgoing content into raw data. Agent-internally, the external interfaces are connected by a memory for storing, retrieving, and processing cognitive content. Cognition is driven by the changes in the agent's external and internal environment, which must be compensated by action to maintain a continuous state of balance.
The structure of the data is a single, time-linear coordination of chunks of content. The top item of a chunk, e.g. the verb, has a dual function: it serves (i) as the item being coordinated extrapropositionally and (ii) as the entrance and exit of the internal structure of the chunk, which is based on (a) intra- and (b) extrapropositional functor-argument, and (c) intrapropositional coordination. Formally, chunks are built from proplets which are connected by address.

A.8.1 Extrapropositional coordination of smaller chunks

proposition K:
  [verb: α, continuation: δ, nc: (β K+1), prn: K]
  [core: δ, continuation: κ, prn: K]
  [core: κ, continuation: α, prn: K]
proposition K+1:
  [verb: β, continuation: ζ, nc: (γ K+2), prn: K+1]
  [core: ζ, continuation: ϕ, prn: K+1]
  [core: ϕ, continuation: β, prn: K+1]
proposition K+2:
  [verb: γ, continuation: χ, nc: , prn: K+2]
  [core: χ, continuation: η, prn: K+2]
  [core: η, continuation: γ, prn: K+2]

The numbered arrows of the original graphic run as follows: transition 1 enters the verb α; transitions 2 and 3 continue to δ and κ; transition 4 returns to α; transition 5 is the extrapropositional coordination to β; transitions 6, 7, and 8 traverse ζ and ϕ and return to β; transition 9 continues extrapropositionally to γ.
The example is simplified in that it distinguishes only between (i) extrapropositional coordination and (ii) generic intrapropositional relations, leaving the linguistic distinctions between intra- and extrapropositional functor-argument, and intrapropositional coordination, aside. Instead it concentrates on showing the basic coding of relations by means of proplet attributes and their values (6.1.3). Compared to the linguistic example 6.1.4 (iii), the intrapropositional relations proceed directly, e.g., from δ to κ instead of returning from δ to the verb to get the address of κ.⁹ More specifically, the α proplet is accessed by transition 1 and is continued intrapropositionally by transition 2 to its continuation value δ. From there, transition 3 continues to the continuation value κ. Then the schematic navigation¹⁰ returns to α by transition 4, completing the intrapropositional activation and enabling an extrapropositional coordination transition from the current verb to the next: transition 5 navigates from α to β, using the nc value (β K+1). From there, transitions 6, 7, and 8 navigate intrapropositionally through the proposition (β K+1) and continue on to the next proposition (γ K+2) by transition 9, etc. The coding of interproplet relations by means of proplet-internal addresses makes proplets order-free (CLaTR 3.2.7). Their storage in a word bank is according to the alphabetical order induced by their core value and their order of arrival, without any reliance on graphical properties, in contrast to the linear notation of symbolic logic.¹¹ This explains why the time-linear input and output streams seem to disappear during storage and activation (e.g. 14.1.1): the time-linear nature of the data is preserved by address rather than by position.

⁹ Similar simplifications are used in CLaTR 4.1.2 and 4.6.3.
¹⁰ For more complex contents, including various forms of gapping, copula, infinitive, and unbounded dependency, see Parts II and III.
¹¹ On the one hand, a formula of propositional calculus like [p ∧ q ∧ r] becomes incorrect if the linear order is scrambled into ∧ ] ∧ p q [ r, for example. On the other hand, [p ∧ q ∧ r] may be mistaken to be time-linear; in fact, however, the rule of symmetry states that [p ∧ q] and [q ∧ p] are equivalent, thus eliminating the asymmetry of time-linearity.
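The order-free coding and the address-based navigation just described may be pictured with a toy sketch; the data structures and names below are assumptions made for illustration, not the DBS software. Proplets are stored in token lines keyed by their core value, and the schematic transitions of A.8.1 are retraced purely by following continuation values and nc addresses.

    # Toy sketch: order-free proplets in a word bank, navigated by address.
    word_bank = {}   # token lines keyed by core value

    def store(core, prn, continuation=None, nc=None):
        word_bank.setdefault(core, []).append(
            {"core": core, "prn": prn, "continuation": continuation, "nc": nc})

    def retrieve(core, prn):
        return next(p for p in word_bank[core] if p["prn"] == prn)

    # The schematic content of A.8.1 (storage order is irrelevant).
    store("alpha", 1, continuation="delta", nc=("beta", 2))
    store("delta", 1, continuation="kappa")
    store("kappa", 1, continuation="alpha")
    store("beta",  2, continuation="zeta", nc=("gamma", 3))
    store("zeta",  2, continuation="phi")
    store("phi",   2, continuation="beta")
    store("gamma", 3, continuation="chi")
    store("chi",   3, continuation="eta")
    store("eta",   3, continuation="gamma")

    def navigate(core, prn, steps):
        """Follow continuation values within a proposition; on returning
        to the verb, follow its nc address to the next proposition."""
        p = retrieve(core, prn)
        for _ in range(steps):
            p = retrieve(p["continuation"], p["prn"])
            if p["nc"]:   # back at a verb: continue extrapropositionally
                p = retrieve(*p["nc"])
        return p

    print(navigate("alpha", 1, 6)["core"])   # gamma (transitions 2-9)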
In a graphical manner, the time-linear nature of building new and selectively activating existing content may be shown as follows:

A.8.2 The flow of time in a complex content

[Graphic not reproduced: a left-to-right chain of arcs numbered 1–9, continuing with "etc."; the upper arcs represent extrapropositional coordination, the lower arcs intra- and extrapropositional functor-argument and intrapropositional coordination.]
As indicated by the direction and numbering of the arrows, the traversal of a content is always forward, even when returning to a point of origin. This is in contradistinction to a locomotive, for example, which may traverse a track forward or backward.¹²

¹² In earlier work, extrapropositional coordination was treated as bidirectional (slant duplex, CLaTR2 16.6.7), similar to intrapropositional coordination. However, an interpretation or navigation uses extrapropositional coordination only unidirectionally. Therefore, the connections between, e.g., John left the house. Then he crossed the street. and John crossed the street. Earlier he left the house. (NLC'06 Sect. 9.6) are now coded separately as slant simplex, and their temporal equivalence is treated by an inference.

A.9 List of Examples

Linguistic examples, figures, definitions, tables, and summaries are contained in risci. Typographically, a riscus is a standard general-purpose container beginning with a numbered heading and ending when normal text resumes. The three-part number of a heading specifies (i) the chapter and (ii) the section in which the riscus is contained, as well as (iii) the number of the riscus within the section. For convenience, these headings, altogether 594, are listed below:

Part I: Communication Mechanism of Cognition

1 Matters of Method
1.1 Sign- or Agent-Based Analysis of Language?
1.1.1 The basic model of turn-taking
1.1.2 Two views of turn-taking
1.2 Verification Principle
1.2.1 Declarative specification and its implementations
1.3 Equation Principle
1.3.1 The equation principle of Database Semantics
1.4 Objectivation Principle
1.4.1 Constellations providing different kinds of data
1.4.2 Data channels of communicative interaction
1.4.3 Interaction between user, robot, and scientist
1.5 Equivalence Principles
1.5.1 The Interface Equivalence Principle
1.5.2 The Input/Output Equivalence Principle
1.6 Surface Compositionality and Time-Linearity
1.6.1 Surface compositionality
1.6.2 Analysis violating surface compositionality
1.6.3 The categories of 1.6.2
1.6.4 PSG rules computing substitutions for deriving 1.6.2
1.6.5 Satisfying surface compositionality and time-linearity
1.6.6 LA rules computing continuations for deriving 1.6.5
1.6.7 Application of a rule computing a possible continuation
1.6.8 Variable definition of the rules deriving 1.6.5
2 Interfaces and Components
2.1 Cognitive Agents without Language
2.1.1 Support for building the context component first
2.1.2 Interfaces of a cognitive agent without language
2.2 Sensory Modalities and their Media
2.2.1 Modality-independent and modality-dependent coding
2.3 Alternative Ontologies for Referring with Language
2.3.1 Reference in alternative ontologies
2.3.2 External interfaces of a cognitive agent with language
2.4 Theory of Language and Theory of Grammar
2.4.1 Structuring central cognition in agents with language
2.5 Immediate Reference and Mediated Reference
2.5.1 Use of external interfaces in mediated reference
2.5.2 Disadvantages of not having context data
2.6 The SLIM Theory of Language
2.6.1 First principle of pragmatics (PoP-1, FoCL 4.3.3)
2.6.2 Second principle of pragmatics (PoP-2, FoCL 5.3.3)
2.6.3 Third principle of pragmatics (PoP-3, FoCL 5.4.5)
2.6.4 Fourth principle of pragmatics (PoP-4, FoCL 6.1.3)
2.6.5 Fifth principle of pragmatics (PoP-5, FoCL 6.1.4)
2.6.6 The seven pointers in indexical words of English
2.6.7 Sixth principle of pragmatics (PoP-6, FoCL 6.1.5)
2.6.8 Seventh principle of pragmatics (PoP-7)
2.6.9 Reference kinds and ontological roles
2.6.10 Different reference kinds in the same construction
3 Data Structure, Database Schema, and Algorithm
3.1 Proplets for Coding Propositional Content
3.1.1 Context proplets representing dog bark. (I) run.
3.1.2 Connecting proplets into a complex cognitive content
3.2 Matching between Language and Context Proplets
3.2.1 Language proplets representing dog barks. (I) run.
3.2.2 Keys for lexical lookup in the speak and the hear mode
3.2.3 Conditions on successful matching
3.2.4 Impact of interproplet relations on matching
3.3 Storing Proplets in a Content-Addressable Database
3.3.1 Storing the proplets of 3.2.4 in a word bank
3.4 Possible Continuations vs. Possible Substitutions
3.4.1 Applying Input/Output Equivalence to language
3.4.2 Continuation-based time-linear hear mode derivation
3.4.3 Substitution-based non-time-linear derivation of 3.4.2
3.5 Cycle of Natural Language Communication
3.5.1 Turn taking from the perspective of grammar
3.5.2 Turn taking at the level of language
3.5.3 The recognition-action cycle at the level of context
3.5.4 Turn taking including reference
3.5.5 The DBS Balance Principle
3.6 Operations of the DBS Algorithm
3.6.1 LA Grammar for a^k b^k c^k
3.6.2 Applying r1 to two initial input words
3.6.3 The four kinds of LAGo hear operations
3.6.4 The four kinds of LAGo navigation operations
3.6.5 Continuity Condition for LAGo navigation
3.6.6 Schematic time-linear think mode navigation
4 Content and the Theory of Signs
4.1 Formal Concept Types and Concept Tokens
4.1.1 Type and token of the concept square
4.1.2 Concepts as values of the core attributes
4.1.3 Why the type-token relation is important
4.2 Proplet Kinds and their Mechanisms of Reference
4.2.1 Examples of the six main kinds of proplets
4.2.2 Proplet attributes and their values
4.2.3 Substituting a replacement variable with a core value
4.2.4 Proplet shell, lexical item, and operation pattern
4.3 Context Recognition
4.3.1 Concept type and token in context recognition
4.3.2 Concept type and concept token in recognizing a square
4.4 Context Action
4.4.1 Concept types and concept tokens in action
4.5 Language Interpretation and Production
4.5.1 Lexical lookup of German Apfel in the hear mode
4.5.2 Three type-token relations in the communication cycle
4.6 Universal versus Language-Dependent Properties
4.6.1 Hierarchy of notions and distinctions in DBS
5 Forms of Thinking
5.1 Retrieving Answers to Questions
5.1.1 Three different kinds of thought
5.1.2 Example of a token line
5.1.3 Definition of LA think.line
5.1.4 Example of an LA think.line operation application
5.1.5 Basic kinds of query expression in natural language
5.1.6 Search proplets illustrating the two basic kinds
5.1.7 WH search pattern checking a token line
5.1.8 Definition of LAGo think.Q1 (WH interrogative)
5.1.9 Definition of LAGo think.Q2 (Yes/No interrogative)
5.2 Episodic versus Absolute Propositions
5.2.1 Absolute and episodic proplets in a word bank
5.2.2 Core values as concept types and concept tokens
5.3 Inference: Reconstructing Modus Ponens in DBS
5.3.1 Inference schemata of propositional calculus
5.3.2 Operation for the inference of conjunction (hypothetical)
5.3.3 Modus ponens in propositional calculus
5.3.4 Modus ponens in predicate calculus
5.3.5 Paraphrasing the modus ponens of predicate calculus
5.3.6 Paraphrasing predicate calculus modus ponens in DBS
5.3.7 Applying the formalized modus ponens of DBS
5.4 Non-Literal Uses of Language
5.4.1 Narrowing inference
5.4.2 Non-literal use of the word table
5.4.3 Widening inference
5.4.4 Context inference underlying a non-literal use
5.4.5 Advantages of handling indirect uses via inferences
5.5 Secondary Coding as Perspective Taking
5.5.1 Coding levels in agents with and without language
5.5.2 Functions of coding levels in database semantics
5.6 Shades of Meaning
5.6.1 Complementation of a literal meaning₁ (concept type)
Part II: Major Constructions of Natural Language

6 Intrapropositional Functor-Argument
6.1 Obligatory Subject/Predicate and Object\Predicate
6.1.1 Hear mode derivation of content with three-place verb
6.1.2 Content derived in 6.1.1 in alphabetical order
6.1.3 Intrapropositional relations derived from patterns
6.1.4 DBS graph analysis of proposition with three-place verb
6.2 Optional Adnominal Modifier|Modified
6.2.1 Hear mode derivation of proposition with modified nouns
6.2.2 Content of a proposition with two modified nouns
6.2.3 Graph analysis of content with adnominal modifiers
6.3 Optional Adverbial Modifier|Modified
6.3.1 Hear mode derivation of adverbial modifier following
6.3.2 Content of a proposition with an adverbial modifier
6.3.3 DBS graph analysis of content with adverbial modifier
6.4 Semantic Interpretation of Determiners
6.4.1 Predicate calculus analysis of All girls sleep
6.4.2 Interpretation of 6.4.1 relative to a model
6.4.3 Elimination of the outermost quantifier
6.4.4 Analyzing Every man loves a woman in predicate calculus
6.4.5 Result of parsing Every man loves a woman in DBS
6.4.6 The cat and sem values of determiner-noun combinations
6.4.7 Set-theoretic interpretation of exh, sel, sg, pl, def, indef
6.5 Complexity Levels of Surfaces and their Contents
6.5.1 Fusing phrasal det+noun into an elementary content
6.5.2 Fusing phrasal det+adn+noun into a phrasal content
6.5.3 Fusing phrasal aux+aux+aux+nfv into elemen. content
6.5.4 Fusing phrasal prep+det+adn+noun into phras. content
6.6 Transparent vs. Opaque
6.6.1 Transparent and opaque relations at the three levels
6.6.2 Prepositional phrase serving as an obligatory modifier
6.6.3 Verb taking an obligatory modifier
6.6.4 DBS graph of content with prepositional object
6.6.5 Hear mode derivation of John tried to read a book
6.6.6 Infinitive construction as a set of proplets
6.6.7 DBS graph analysis of an infinitive
7 Extrapropositional Functor-Argument
7.1 Clausal Subject
7.1.2 Interpreting a clausal subject construction
7.1.3 Content of a clausal subject construction
7.1.4 DBS graph analysis underlying production of 7.1.2
7.1.5 A more complex clausal subject construction
7.2 Clausal Object
7.2.1 Interpreting a clausal object construction
7.2.2 Content of Mary heard that Fido barked
7.2.3 DBS graph analysis underlying production of 7.2.1
7.3 Clausal Adnominal with a Subject Gap
7.3.2 Interpretation of The dog which saw Mary barked
7.3.3 Adnominal clause construction with subject gap
7.3.4 DBS graph analysis underlying production of 7.3.2
7.4 Clausal Adnominal with an Object Gap
7.4.1 Interpretation of The dog which Mary saw barked
7.4.2 Adnominal clause construction with object gap
7.4.3 DBS graph analysis underlying production of 7.4.1
7.5 Clausal Adverbial
7.5.1 Interpretation of When Fido barked Mary laughed
7.5.2 Content of an adverbial clause construction
7.5.3 DBS graph analysis underlying production of 7.5.1
7.5.4 Interpretation of Mary laughed when Fido barked.
7.5.5 Surface realization of 7.5.4
7.6 Subclause Recursion
7.6.1 Interpreting John said that Bill believes that Mary loves Tom
7.6.2 DBS graph analysis of an unbounded dependency
7.6.3 Simplified content representation of 7.6.2
7.6.4 Whom does John say that Bill believes that Mary loves?
7.6.5 Ambiguity structure of an unbounded suspension
7.6.6 DBS graph analysis of a relative clause recursion
8 Intrapropositional Coordination
8.1 Comparing Simple and Complex Coordination
8.1.1 Same word forms in a simple and a complex coordination
8.1.2 Complex coordination without simple counterpart
8.1.3 Internal structure of a simple coordination
8.1.4 Integrating a coordination into functor-argument
8.2 Simple Coordination of Subject and Object Nouns
8.2.1 Simple noun coordination in subject position
8.2.2 Proplet representation of the content derived in 8.2.1
8.2.3 DBS graph analysis underlying production of 8.2.1
8.2.4 Simple coordinations of subject and object nouns
8.2.5 Simple noun coordination in object position
8.2.6 Proplet representation of the content derived in 8.2.5
8.2.7 DBS graph analysis underlying production of 8.2.5
8.3 Simple Coordination of Verbs and of Adjectives
8.3.1 Simple coordination of two-place verbs
8.3.2 Verb coordination: John bought, cooked, and ate the pizza
8.3.3 Proplet representation of a simple verb coordination
8.3.4 DBS graph analysis underlying production of 8.3.2
8.3.5 Simple coordination of adnominal adjectives
8.3.6 Simple coordination of adverbial adjectives
8.4 Overview of Complex Coordination (Gapping)
8.4.1 Examples of complex coordination
8.4.2 Proplet analysis of predicate gapping
8.4.3 Proplet analysis of subject gapping
8.4.4 Proplet analysis of object gapping
8.5 Complex Coordinations with Subject and Predicate Gap
8.5.1 Bob bought an apple, # peeled a pear, and # ate a peach
8.5.2 Representing content of subject gapping as proplet set
8.5.3 DBS graph analysis underlying production of 8.5.1
8.5.4 Bob bought an apple, Jim # a pear, and Bill # a peach.
8.5.5 Representing predicate gapping content as proplet set
8.5.6 DBS graph analysis underlying production of 8.5.4
8.6 Object Gapping and Noun Gapping
8.6.1 Bob bought #, Jim peeled #, and Bill ate a peach.
8.6.2 Representing object gapping content as proplet set
8.6.3 DBS graph analysis underlying production of 8.6.1
8.6.4 DBS graph analysis of noun gapping
8.6.5 Representing noun gapping content as proplet set
8.6.6 DBS graph analysis of adnominal coordination
8.6.7 Adnominal coordination content as proplet set
8.6.8 Combination of noun gapping and subject gapping
8.6.9 Combination of noun gapping and object gapping
9 Simple Coordination in Clausal Constructions
9.1 Combining Intra- and Extrapropositional Coordination
9.1.1 Grammatical relations between propositions
9.1.2 Combining intra- and extrapropositional coordination
9.2 Extrapropositional Coordination
9.2.1 Derivation of Julia slept. John sang. Suzy dreams
9.2.2 Content derived in 9.2.1 as a set of proplets
9.2.3 DBS graph analysis underlying production of 9.2.1
9.3 Simple Coordination in Clausal Subject
9.3.1 Simple subject coordination in clausal subject
9.3.2 Content of That Bob, Jim, and Bill slept surprised Mary.
9.3.3 DBS graph analysis underlying production of 9.3.1
9.3.4 Simple predicate coordination in a clausal subject
9.3.5 That the man bought, cooked, and ate a pizza surprised Mary.
9.3.6 DBS graph analysis underlying production of 9.3.4
9.3.7 Simple object coordination in clausal subject
9.3.8 That Bob bought an apple, a pear, and a peach, surprised Mary.
9.3.9 DBS graph analysis underlying production of 9.3.7
9.4 Simple Coordination in Clausal Object
9.4.1 Simple subject coordination in a clausal object
9.4.2 Content of Mary saw that Bob, Jim, and Bill slept.
9.4.3 DBS graph analysis underlying production of 9.4.1
9.4.4 Simple predicate coordination in a clausal object
9.4.5 Mary saw that the man bought, cooked, and ate a pizza.
9.4.6 DBS graph analysis underlying production of 9.4.4
9.4.7 Simple object coordination in a clausal object
9.4.8 Mary saw that Bob bought an apple, a pear, and a peach.
9.4.9 DBS graph analysis underlying production of 9.4.7
9.5 Simple Coordination in Clausal Adnominal
9.5.1 Simple coordinations in relative clause constructions
9.5.2 Mary saw the man who(*, Jim, and Bill) slept
9.5.3 The man whom(*, Jim, and Bill) Suzy knows kissed Mary
9.5.4 Simple subject coordination in clausal adnominal
9.5.5 The man whom Bob, Jim, and Bill know kissed Mary
9.5.6 DBS graph analysis underlying production of 9.5.4
9.5.7 Simple predicate coordination in clausal adnominal
9.5.8 Mary saw the man who bought, cooked, and ate the pizza.
9.5.9 DBS graph analysis underlying production of 9.5.7
9.6 Simple Coordination in Clausal Adverbial
9.6.1 Simple subject coordination in a clausal adverbial
9.6.2 When Bob, Jim, and Bill slept, Fido barked.
9.6.3 DBS graph analysis underlying production of 9.6.1
9.6.4 Simple predicate coordination in a clausal adverbial
9.6.5 When Bob bought, cooked, and ate a pizza, Fido barked.
9.6.6 DBS graph analysis underlying production of 9.6.4
9.6.7 Simple object coordination in a clausal adverbial
9.6.8 Fido barked when Bob bought an apple, a pear, and a peach.
9.6.9 DBS graph analysis underlying production of 9.6.7
10 Gapping in Clausal Constructions
10.1 Subject Gapping in a Clausal Subject
10.1.1 Clausal subject containing subject gapping
10.1.2 That John hit Bob, kicked Fido, and kissed Suzy amused Mary.
10.1.3 DBS graph analysis underlying the production of 10.1.1
10.2 Predicate Gapping in a Clausal Object
10.2.1 Clausal object containing gapped predicate
10.2.2 Ann knows that Bob loves Julia, Jim Suzy, and Bill Liz
10.2.3 DBS graph analysis underlying the production of 10.2.1
10.3 Clausal Adnom. (Subject Gap) with Subject Gapping
10.3.1 Predicate-object coordination (subject gapping)
10.3.2 The man who hit Bob, kicked Fido, and kissed Liz smiled.
10.3.3 DBS graph analysis underlying the production of 10.3.1
10.4 Clausal Adnominal (Object Gap) with Object Gapping
10.4.1 Subject-predicate coordination (object gapping)
10.4.2 Liz met the man whom Bob hit, Fido bit, and Suzy kissed.
10.4.3 DBS graph analysis underlying the production of 10.4.1
10.5 Complex Coordination in Clausal Adverbial
10.5.1 Adverbial subclause with predicate gapping
10.5.2 John smiled when Bob kissed Julia, Jim Suzy, and Bill Ann.
10.5.3 DBS graph analysis underlying the production of 10.5.1
10.5.4 Adverbial subclause with subject gapping
10.5.5 Adverbial subclause with object gapping
10.6 Gapping in Content with Three-Place Verbs
10.6.1 Subject gapping with a ditransitive verb conjunction
10.6.2 Gapped three-place verb with two-part conjuncts
10.6.3 Possibility of gapping a direct_object

Part III: Specification of Formal Fragments

11 DBS.1: Hear Mode
11.1 Automatic Word Form Recognition
11.1.1 Surface recognition and lexical lookup
11.1.2 Trie structures for full forms and for allographs
11.1.3 DBS word form recognition of wolf and wolves
11.2 Integrating the Steps of the Hear Mode
11.2.1 Alternating lookup, storage, and concatenation
11.2.2 Standard presentation of 11.2.1
11.3 Managing the Now Front
11.3.1 Applying an operation in situ at the now front
11.3.2 V/V concatenation in a subject sentence
11.3.3 State 1 of an extrapropositional V−V coordination
11.3.4 State 2 of an extrapropositional V−V coordination
11.3.5 State 3 of an extrapropositional V−V coordination
11.3.6 State 4 of an extrapropositional V−V coordination
11.4 Intrapropositional Suspension
11.4.1 Initial adverb causing intrapropositional suspension
11.4.2 Intrapropositional suspension
11.4.3 Suspension-compensating LA Rule ADV+NOM+FV
11.4.4 Absence of a suspension in a postverbal adverb
11.4.5 Hear mode derivation of 11.4.4
11.4.6 Definition of the operation ADV+FV
11.4.7 Definition of the operation NOM+FV
11.4.8 Definition of the operation FV+ADV
11.5 The Three Steps of Applying a LAGo Hear Operation
11.5.1 Data coverage of LAGo Hear.1
11.5.2 The lexical entries of LAGo hear.1
11.5.3 First step of applying a LAGo hear operation
11.5.4 Second step of applying a LAGo hear operation
11.5.5 Third step of applying a LAGo hear operation
11.5.6 LAGo Hear.1
11.6 Parsing a Small Fragment of English
11.6.1 Combining Julia and sleeps
11.6.2 Combining sleeps and .
11.6.3 Combining Julia sleeps. and John
11.6.4 Applying NOM+FV to John and sing
11.6.5 Applying V+V to sleep and sing
11.6.6 Time-linear parsing of 9.2.1 with LAGo Hear.1
11.6.7 Storage of the LAGo Hear.1 derivation in a word bank
12 DBS.1: Speak Mode
12.1 Definition of LAGo Think.1
12.1.1 Formal definition of LAGo Think.1
12.2 Three Steps of Applying a LAGo Think Operation
12.2.1 First step of applying a LAGo navigation operation
12.2.2 Second step of applying a LAGo navigation operation
12.3 Navigating with LAGo Think.1
12.3.1 DBS graph analysis underlying production of 11.5.1
12.3.2 Moving from V to N (arc 1)
12.3.3 Moving from N to V (arc 2)
12.3.4 Moving from one proposition to the next (arc 3)
12.4 Turning LAGo Think.1 into LAGo Think-Speak.1
12.4.1 Formal definition of LAGo Think-Speak.1
12.4.2 Schema of a lexicalization table
12.4.3 Lexicalization rule lexverb
12.5 Hear and Speak Mode Analysis of Personal Pronouns
12.5.1 Lexical analysis of the personal pronouns of English
12.5.2 The lexicalization rule lexnoun
12.6 Hear and Speak Mode Analysis of Determiners
12.6.1 Lexicalization rule lexadj
13 DBS.2: Hear Mode
13.1 Lexical Lookup for the DBS.2 Example
13.1.1 The test sequence added in DBS.2
13.1.2 The lexical proplets of the test sequence
13.2 Hear Mode Analysis of the First Sentence
13.2.1 Graphical hear mode derivation of first sentence
13.2.2 Combining The and heavy
13.2.3 Combining The heavy and old
13.2.4 Combining The heavy old and car
13.2.5 Combining The heavy old car and hit
13.2.6 Combining The heavy old car hit and a
13.2.7 Combining The heavy old car hit a and beautiful
13.2.8 Combining The heavy old car hit a beautiful and tree
13.2.9 Combining The heavy old car hit a beautiful tree and .
13.2.10 Result of parsing The heavy old car hit a beautiful tree.
13.3 Hear Mode Analysis of the Second Sentence
13.3.1 Hear mode derivation of The car had been speeding.
13.3.2 Combining The heavy old car hit a beautiful tree. and The
13.3.3 Combining The and car
13.3.4 Combining The car and had
13.3.5 Extrapropositional coordination between hit and had
13.3.6 Combining The car had and been
13.3.7 Combining The car had been and speeding
13.3.8 Combining The car had been speeding and .
13.3.9 Result of parsing The car had been speeding.
13.4 Hear Mode Analysis of the Third Sentence
13.4.1 Graphical hear mode derivation of the third sentence
13.4.2 Combining The car had been speeding. and A
13.4.3 Combining A and farmer
13.4.4 Combining A farmer and gave
13.4.5 Combining speed and give
13.4.6 Combining A farmer gave and the
13.4.7 Combining A farmer gave the and driver
13.4.8 Combining A farmer gave the driver and a
13.4.9 Combining A farmer gave the driver a and lift
13.4.10 Combining A farmer gave the driver a lift and .
13.4.11 Result of parsing A farmer gave the driver a lift.
13.5 Defining LAGo Hear.2
13.5.1 LAGo Hear.2
13.5.2 Sequence of operations in Sects. 13.3, 13.4, and 13.5
13.6 Hear Mode Extension to a Verb-Initial Construction
13.6.1 DBS graph analysis of a Yes/No interrogative
13.6.2 Connecting Did and you with AUX+NP?
13.6.3 Connecting Did you and give with AUX+NFV
13.6.4 Different Yes/No interrogative constructions
14 DBS.2: Speak Mode
14.1 Content to be Produced in the Speak Mode
14.1.1 Proplets of LAGo Hear.2 test sequence in a word bank
14.2 Speak Mode Production of the First Sentence
14.2.1 Graphical Speak Mode analysis of the first sentence
14.2.2 Navigating from hit to car (arc 1)
14.2.3 Navigating from car to heavy (arc 2)
14.2.4 Navigating from heavy to old (arc 3)
14.2.5 Returning empty from old to heavy (arc 4)
14.2.6 Navigating from heavy& back to car (arc 5)
14.2.7 Returning to and realizing the verb (arc 6)
14.2.8 Beginning realization of the direct_object (arc 7)
14.2.9 Navigating from tree to beautiful (arc 8)
14.2.10 Navigating from beautiful back to tree (arc 9)
14.2.11 Moving from tree back to hit (arc 10)
14.3 Speak Mode Production of the Second Sentence
14.3.1 Graphical Speak Mode analysis of the second sentence
14.3.2 Navigating empty from hit to speed (arc 0)
14.3.3 Navigating from speed to car (arc 1)
14.3.4 Navigating from car back to speed (arc 2)
14.4 Speak Mode Production of the Third Sentence
14.4.1 Graphical Speak Mode analysis of the third sentence
14.4.2 Empty traversal from speed to give (arc 0)
14.4.3 Navigating from give to farmer (arc 1)
14.4.4 Returning to give and realizing the verb (arc 2)
14.4.5 Navigating from give to driver (arc 3)
14.4.6 Empty return to the verb (arc 4)
14.4.7 Navigating from give to lift (arc 5)
14.4.8 Final return to the verb to realize the period (arc 6)
14.5 Definition of LAGo Think-Speak.2
14.5.1 LAGo Think-Speak.2
14.5.2 Interproplet traversals realizing 13.1.1
14.6 Speak Mode Extension to a Verb-Initial Construction
14.6.1 Graph of an initial Yes/No interrogative production
14.6.2 Applying V−V? (arc 0)
14.6.3 Yes/No interrogative starting with a V−V? transition
14.6.4 Imperative starting with a V−V! transition
15 Ambiguity, Word Order, and Collocation
15.1 Adnominal and Adverbial Uses of Phrasal Modifiers
15.1.1 Modifiers at elementary, phrasal, and clausal level
15.1.2 Ambiguous phrasal modification
15.1.3 Options for treating adnominal-adverbial ambiguity
15.1.4 Option 1: both relations are omitted
15.1.5 Option 2: only the adnominal relation is established
15.1.6 Option 3: only the adverbial relation is established
15.1.7 Option 4: establishing both relations simultaneously
15.2 Graphical Analyses of the Hear and the Speak Mode
15.2.1 Hear mode derivation of 15.1.2
15.2.2 Phrasal adnominal modifier
15.2.3 Phrasal adverbial modifier
15.3 Intrapropositional Modifier Chaining (Adnominal)
15.3.1 Chaining of elementary adnominals
15.3.2 Proplet set showing an elementary adnominal chain
15.3.3 Elementary adnominals chained by coordination
15.3.4 Chaining of phrasal adnominals
15.3.5 Proplet set showing a phrasal adnominal chain
15.3.6 Phrasal adnominals chained by modification
15.3.7 Combining elementary and phrasal adnominals
15.3.8 DBS graph analysis underlying the production of 15.3.7
15.3.9 Proplet set representing the content of 15.3.8
15.4 Multiple Adverbials
15.4.1 Unambiguous adverbial modification
15.4.2 Multiple elementary adverbial modifiers
15.4.3 Representing the content of 15.4.2 as a set of proplets
15.4.4 DBS graph analysis underlying the production of 15.4.2
15.5 Inferences for Ritualized Phrases and Idioms
15.5.1 Inference responding to a greeting
15.5.2 Interpreting idioms by means of inferences
15.6 Collocations and Proverbs
15.6.1 The high voltage and strong current patterns
15.6.2 Some instances of the “A and B” construction
15.6.3 Example of a proverb
Appendix

A.1 The Number of Attributes in a Proplet
A.1.1 Standard proplets representing content words
A.1.2 Proplets representing subordinating conjunctions
A.1.3 Reduced proplets omitting attributes without values
A.2 Standard Order of Attributes in a Proplet
A.2.1 Lexical attributes precede continuation attributes
A.2.2 Obligatory precede optional continuation attributes
A.2.3 Book-keeping attributes come at the end
A.3 Summary of Proplet Attributes and Values
A.3.1 Attributes
A.3.2 Constant Values
A.3.3 Variables of Proplet Patterns
A.4 Lexical Analysis of Nouns
A.4.1 Names
A.4.2 Personal pronouns
A.4.3 Determiners
A.4.4 Conjunctions for subject and object clause
A.4.5 Prepositions
A.4.6 Common nouns
A.5 Lexical Analysis of Verbs
A.5.1 Main verbs, one-place
A.5.2 Main verbs, two-place
A.5.3 Forms of a three-place main verb
A.5.4 Forms of the auxiliary be
A.5.5 Forms of the auxiliary have
A.5.6 Forms of the auxiliary do
A.5.7 Intrapropositional subordinating conjunction
A.5.8 Sentential punctuation signs
A.6 Lexical Analysis of Adjectives
A.6.1 Forms of adnominals
A.6.2 Forms of adverbials
A.6.3 Subordinating conjunctions for adnominal clause
A.6.4 Subordinating conjunctions for adverbial clause
A.7 Coordinating Conjunctions
A.7.1 Conjunctive
A.7.2 Disjunctive
A.7.3 Adversative
A.8 Viewing DBS abstractly
A.8.1 Extrapropositional coordination of smaller chunks
A.8.2 The flow of time in a complex content
Bibliography
Abney, S. (1991) “Parsing by chunks,” in R. Berwick, S. Abney, and C. Tenny (eds.) Principle-Based Parsing. Dordrecht: Kluwer Academic
Ágel, V. (2000) Valenztheorie. Tübingen: Gunter Narr
Aho, A.V., and J.D. Ullman (1977) Principles of Compiler Design. Reading, MA: Addison-Wesley
AIJ = Hausser, R. (2001) “Database Semantics for natural language,” Artificial Intelligence 130.1:27–74
Ajdukiewicz, K. (1935) “Die syntaktische Konnexität,” Studia Philosophica 1:1–27
Anderson, J.R., and C. Lebiere (1998) The Atomic Components of Thought. Mahwah, NJ: Lawrence Erlbaum
Austin, J.L. (1962) How to Do Things with Words. Oxford: Clarendon Press
Bar-Hillel, Y. (1964) Language and Information. Selected Essays on Their Theory and Application. Reading, MA: Addison-Wesley
Barwise, J., and J. Perry (1983) Situations and Attitudes. Cambridge, MA: MIT Press
Bauer, M. (2011) Eine Morphologische Analyse des Englischen mit JSLIM, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Beaugrande, R.-A., and W.U. Dressler (1981) Einführung in die Textlinguistik. Tübingen: Niemeyer
ben-Zineb, R. (2010) Implementierung eines automatischen Wortformerkennungssystems für das Arabische in JSLIM, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Berners-Lee, T., J. Hendler, and O. Lassila (2001) “The Semantic Web,” Scientific American, May 17, 2001
Bertossi, L., G. Katona, K.-D. Schewe, and B. Thalheim (eds.) (2003) Semantics in Databases, Springer
Beutel, B. (1997) Malaga 4.0, CLUE-Manual, CLUE, Univ. Erlangen–Nürnberg
Biederman, I. (1987) “Recognition-by-components: a theory of human image understanding,” Psychological Review 94:115–147
Bloomfield, L. (1933) Language, New York: Holt, Rinehart, and Winston
Bochenski, I. (1961) A History of Formal Logic. Notre Dame, Indiana: Univ. of Notre Dame Press
Bransford, J.D., and J.J. Franks (1971) “The abstraction of linguistic ideas,” Cognitive Psychology 2:331–350
Bresnan, J. (ed.) (1982) The Mental Representation of Grammatical Relations. Cambridge, MA: MIT Press
Bresnan, J. (2001) Lexical-Functional Syntax. Malden, MA: Blackwell
de la Briandais, R. (1959) “File Searching Using Variable Length Keys,” Proc. Western Joint Computer Conf. 15, 295–298
Brill, E. (1993) “Automatic grammar induction and parsing free text: a transformation-based approach,” in Proceedings of the 1993 European ACL, Utrecht, The Netherlands
Brill, E. (1994) “Some advances in rule-based part of speech tagging,” AAAI 1994
Brooks, R. (1985) “A robust layered control system for a mobile robot,” Cambridge, MA: MIT AI Lab Memo 864, pp. 227–270
Brooks, R. (1990) “Elephants don’t play chess,” MIT Artificial Intelligence Laboratory, Robotics and Autonomous Systems, Vol. 6:3–15
Brooks, R. (1991) “Intelligence without representation,” Artificial Intelligence, Vol. 47:139–160
Brooks, R. (2002) Flesh and Machines: How Robots Will Change Us. New York, NY: Pantheon
Burnard, L. (ed.) (1995) Users’ Reference Guide British National Corpus Version 1.0, Oxford Univ. Computing Services, England: Oxford
Burnard, L. (ed.) (2007) Reference Guide for the British National Corpus (XML Edition), Oxford Univ. Computing Services, England: Oxford
Carbonell, J.G., and R. Joseph (1986) “FrameKit+: a knowledge representation system,” Carnegie Mellon Univ., Department of Computer Science
Chang, Suk-Jin (1996) Korean. Philadelphia: John Benjamins
Charniak, E. (2001) “Immediate-Head parsing for language models,” Proceedings of the 39th Annual Meeting of the ACL
Chen, Fang (2006) Human Factors in Speech Interface Designs, Springer
Cocke, J., and J.T. Schwartz (1970) “Programming languages and their compilers,” preliminary notes, Courant Institute of Mathematical Sciences, New York University, New York
Choe, Jay-Woong, and R. Hausser (2006) “Handling quantifiers in database semantics,” in Y. Kiyoki, H. Kangassalo, and M. Duží (eds.) Information Modelling and Knowledge Bases XVII. Amsterdam: IOS Press Ohmsha
Choi, Key-Sun, and Gil Chang Kim (1986) “Parsing Korean: a free word-order language,” Literary and Linguistic Computing 1.3:123–128
Chomsky, N. (1957) Syntactic Structures. The Hague: Mouton
Chomsky, N. (1965) Aspects of the Theory of Syntax. Cambridge, MA: The MIT Press
Chomsky, N. (1995) The Minimalist Program. Cambridge, MA: MIT Press
Clark, H.H. (1996) Using Language. Cambridge, UK: Cambridge Univ. Press
CLaTR’11 = Hausser, R. (2011) Computational Linguistics and Talking Robots, Processing Content in Database Semantics, pp. 286, Springer
CoL = Hausser, R. (1989) Computation of Language, An Essay on Syntax, Semantics, and Pragmatics in Natural Man-Machine Communication, Symbolic Computation: Artificial Intelligence, pp. 425, Springer
Collins, M.J. (1999) Head-driven Statistical Models for Natural Language Parsing. Univ. of Pennsylvania, Ph.D. Dissertation
Comrie, B. (1988) “Passive and voice,” in Shibatani (ed.) (1988), pp. 9–24
Copestake, A., D. Flickinger, C. Pollard, and I. Sag (2006) “Minimal Recursion Semantics: an introduction,” Research on Language and Computation 3.4:281–332
Dauses, A. (1997) Einführung in die allgemeine Sprachwissenschaft: Sprachtypen, sprachliche Kategorien und Funktionen. Stuttgart: Steiner
Déjean, H. (1998) Concepts et algorithmes pour la découverte des structures formelles des langues. Thèse pour l’obtention du Doctorat de l’université de Caen
Dorr, B. (1993) Machine Translation: A View from the Lexicon. Cambridge, MA: MIT Press
Earley, J. (1970) “An efficient context-free parsing algorithm,” Commun. ACM 13.2:94–102, reprinted in B. Grosz, K. Sparck Jones, and B.L. Webber (eds.) Readings in Natural Language Processing (1986), Morgan Kaufmann
Eberhard, K.M., M.J. Spivey-Knowlton, J.C. Sedivy, and M.K. Tanenhaus (1995) “Eye movements as a window into real-time spoken language comprehension in natural contexts,” Journal of Psycholinguistic Research 24.6:409–437
Elmasri, R., and S.B. Navathe (2010) Fundamentals of Database Systems, 6th edition. Reading, MA: Addison-Wesley
Engel, U. (1991) Deutsche Grammatik. 2nd ed., Heidelberg: Julius Groos
Feng, Qiuxiang (2011) LAG Application to Chinese Parsing, Inaugural Dissertation, CLUE, Univ. Erlangen–Nürnberg
Fillmore, C.J. (1992) “‘Corpus linguistics’ vs. ‘Computer-aided armchair linguistics’,” in J. Svartvik (ed.) Directions in Corpus Linguistics, Berlin: Mouton de Gruyter, pp. 35–60
Fillmore, C.J., and P. Kay (1996) Construction Grammar Coursebook, unpublished manuscript, UC Berkeley
Fischer, W. (2004) Implementing Database Semantics as an RDMS. Diplom Thesis, Department of Computer Science, Univ. Erlangen–Nürnberg
FoCL’99 = Hausser, R. (1999/2001/2014) Foundations of Computational Linguistics, Human–Computer Communication in Natural Language, 3rd ed., pp. 522, Springer
Fornwald, S. (2012) Implementierung eines Parsers für elementare, phrasale und clausale Nomina im Georgischen, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Frederking, R., T. Mitamura, E. Nyberg, and J.G. Carbonell (1997) “Translingual information access,” presented at the AAAI Spring Symposium on Cross-Language Text and Speech Retrieval
Fredkin, E. (1960) “Trie memory,” Communications of the ACM 3:9, pp. 490–499
Frege, G. (1967) Kleine Schriften. I. Angelelli (ed.), Darmstadt: Wissenschaftliche Buchgesellschaft
Gao, M. (2007) An Automatic Word Form Recognition System for German in Java, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Gazdar, G., E. Klein, G. Pullum, and I. Sag (1985) Generalized Phrase Structure Grammar. Cambridge, MA: Harvard Univ. Press
Gerstner, C. (1988) Über Generizität: generische Nominalausdrücke in singulären und generellen Aussagen, München: Wilhelm Fink Verlag
Gibson, E.J. (1969) Principles of Perceptual Learning and Development. Englewood Cliffs, NJ: Prentice Hall
Girstl, A. (2006) Development of allomorph-rules for German morphology within the framework of JSLIM, B.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Givón, T. (1997) Grammatical Relations: A Functionalist Perspective. Amsterdam: John Benjamins
Graf, S. (2011) Implementierung eines Chinesischen Lexikons in JSLIM, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Greenbaum, S., and R. Quirk (1990) A Student’s Grammar of English. London: Longman
Greenberg, J. (1963) “Some universals of grammar with particular reference to the order of meaningful elements,” in J. Greenberg (ed.) Universals of Language. Cambridge, MA: MIT Press
Grice, P. (1957) “Meaning,” Philosophical Review 66:377–388
Grice, P. (1965) “Utterer’s meaning, sentence meaning, and word meaning,” Foundations of Language 4:1–18
Grün, A. von der (1998) Wort-, Morphem- und Allomorphhäufigkeit in domänenspezifischen Korpora des Deutschen, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Habermas, J. (1981) Theorie des Kommunikativen Handelns, 8th ed. 2011, Berlin: Suhrkamp Verlag
Halliday, M.A.K., and R. Hasan (1976) Cohesion in English. London: Longman
Handl, J. (2008) Entwurf und Implementierung einer abstrakten Maschine für die oberflächenkompositionale inkrementelle Analyse natürlicher Sprache, Diplom Thesis, Department of Computer Science, Univ. Erlangen–Nürnberg
Handl, J. (2012) Inkrementelle Oberflächenkompositionale Analyse und Generierung Natürlicher Sprache, Inaugural Dissertation, CLUE, Univ. Erlangen–Nürnberg (available at http://opus4.kobv.de/opus4-fau/frontdoor/index/index/docId/3933)
Hanrieder, G. (1996) Incremental Parsing of Spoken Language Using a Left-Associative Unification Grammar, Inaugural Dissertation, CLUE, Univ. Erlangen–Nürnberg [in German]
Harman, G. (1963) “Generative Grammar without Transformational Rules: a Defense of Phrase Structure,” Language Vol. 39:597–616
Hauser, M.D. (1996) The Evolution of Communication. Cambridge, MA: MIT Press
Hausser, R.: for references cited as SCG, NEWCAT, CoL, TCS, FoCL, AIJ, and L&I’05, see p. xv above. For online papers and a list of publications, see lagrammar.net
Hausser, R. (ed.) (1996) Linguistische Verifikation. Dokumentation zur Ersten Morpholympics, Tübingen: Max Niemeyer Verlag
Hausser, R. (2002a) “Autonomous control structure for artificial cognitive agents,” in H. Kangassalo, H. Jaakkola, E. Kawaguchi, and T. Welzer (eds.) Information Modeling and Knowledge Bases XIII, Amsterdam: IOS Press Ohmsha
Hausser, R. (2005b) “What if Chomsky were right?” Comments on the paper “Multiple solutions to the logical problem of language acquisition” by Brian MacWhinney, Journal of Child Language 31:919–922
Hellwig, P. (2003) “Dependency unification grammar,” in V. Ágel, L.M. Eichinger, H.-W. Eroms, P. Hellwig, H.-J. Heringer, and H. Lobin (eds.) Dependency and Valency. An International Handbook of Contemporary Research, Berlin: Mouton de Gruyter
Heusinger, K. von, and Meulen, A. ter (eds.) (2013) Meaning and the Dynamics of Interpretation, Selected Papers of Hans Kamp, Leiden: Brill
Heß, K., J. Brustkern, and W. Lenders (1983) Maschinenlesbare Deutsche Lexika. Dokumentation – Vergleich – Integration. Tübingen: Niemeyer
Huang-Senechal, Hsiao-Yun (2011) The Grammatical Analysis of Three Structural Particles in Modern Chinese in JSLIM, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Huezo, R. (2003) Time Linear Syntax of Spanish, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Hubel, D.H., and T.N. Wiesel (1962) “Receptive fields, binocular interaction, and functional architecture in the cat’s visual cortex,” Journal of Physiology 160:106–154
Hudson, R.A. (1976) “Conjunction reduction, gapping and right-node-raising,” Language 52.3:535–562
Hudson, R.A. (1991) English Word Grammar. Oxford: Blackwell
Ickler, T. (1994) “Zur Bedeutung der sogenannten ‘Modalpartikeln’,” Sprachwissenschaft 19.3-4:374–404
Ide, N., and J. Pustejovsky (2010) “What does interoperability mean, anyway? Toward an operational definition of interoperability,” Proc. Second International Conference on Global Interoperability for Language Resources (ICGL 2010), Hong Kong, China
Ivanova, M. (2009) Syntax und Semantik des Bulgarischen im Rahmen von JSLIM, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Jackendoff, R.S. (1972) “Gapping and related rules,” Linguistic Inquiry 2:21–35
Jägers, S. (2010) Entwurf und Implementierung von Grammatikregeln für die Analyse von Komposita im Rahmen der SLIM-Sprachtheorie, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Jezzard, P., P.M. Matthews, and S. Smith (2001) Functional Magnetic Resonance Imaging: An Introduction to Methods. Oxford: Oxford Univ. Press
Kabashi, B. (2003) Automatische Wortformerkennung für das Albanische, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Kalender, E. (2010) Automatische JSLIM-Analyse der russischen Syntax im Rahmen der Datenbanksemantik, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Kamp, J.A.W., and U. Reyle (1993) From Discourse to Logic. Dordrecht: Kluwer
Kapfer, J. (2009) Inkrementelles und oberflächenkompositionales Parsen von Koordinationsellipsen, Inaugural Dissertation, CLUE, Univ. Erlangen–Nürnberg
Kasami, T. (1965) “An efficient recognition and syntax algorithm for context-free languages,” Technical Report AFCLR-65-758
Kay, M. (1980) “Algorithm schemata and data structures in syntactic processing,” reprinted in B.J. Grosz, K. Sparck Jones, and B. Lynn Webber (eds.) Readings in Natural Language Processing (1986). San Mateo, CA: Morgan Kaufmann, 35–70
Kempson, R., and A. Cormack (1981) “Ambiguity and quantification,” Linguistics and Philosophy 4:259–309
Kim, S. (2009) Automatische Wortformerkennung für das Koreanische im Rahmen der LAG, Inaugural Dissertation, CLUE, Univ. Erlangen–Nürnberg
Kiparsky, P. (1973) “Abstractness, opacity and global rules (Part 2 of ‘Phonological representations’),” in O. Fujimura (ed.) Three Dimensions of Linguistic Theory, Tokyo Institute for Advanced Studies of Language, pp. 57–86
Kirkpatrick, K. (2001) “Object recognition,” Chapter 4 of G. Cook (ed.) Avian Visual Cognition. Cyberbook in cooperation with Comparative Cognitive Press, http://www.pigeon.psy.tufts.edu/avc/toc.html
Knight, K. (1989) “Unification: A Multidisciplinary Survey,” ACM Computing Surveys 21.1:93–124
Kohonen, T. (1988) Self-Organization and Associative Memory, 2nd ed., Heidelberg Berlin New York: Springer
Kolesnikova, O., and A. Gelbukh (2015) “Measuring Non-Compositionality of Verb-Noun Collocations using Lexical Functions and Wordnet Hypernyms,” in O.R. Lagunas et al. (eds.) MICAI 2015, Part II, LNAI 9414, pp. 3–25
Kosche, S. (2011) Syntax und Semantik des Französischen im Rahmen von JSLIM, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Kučera, H., and W.N. Francis (1967) Computational Analysis of Present-Day English. Providence, Rhode Island: Brown Univ. Press
Kycia, A. (2004) An Implementation of Database Semantics in Java. M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Lakoff, G. (1971) “On generative semantics,” in D.D. Steinberg and L.A. Jakobovits (eds.), pp. 232–296
Lakoff, G., and M. Johnson (1980) Metaphors We Live By. Chicago: The Univ. of Chicago Press
Lakoff, G., and S. Peters (1969) “Phrasal conjunction and symmetric predicates,” in D.A. Reibel and S. Schane (eds.), 113–142
Lee, Kiyong (2002) “A simple syntax for complex semantics,” in Language, Information, and Computation, Proceedings of the 16th Pacific Asia Conference, 2–27
Lee, Kiyong (2004) “A computational treatment of some case-related problems in Korean,” Perspectives on Korean Case and Case Marking, 21–56, Seoul: Taehaksa
Leidner, J. (2000) Linksassoziative morphologische Analyse des Englischen mit stochastischer Disambiguierung, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
L&I’05 = Hausser, R. (2005) “Memory-Based pattern completion in Database Semantics,” Language and Information 9.1:69–92, Seoul: Korean Society for Language and Information
Lenders, W. (1990) “Semantische Relationen in Wörterbuch-Einträgen – Eine Computeranalyse des DUDEN-Universalwörterbuchs,” Proceedings of the GLDV-Jahrestagung 1990, 92–105
Lenders, W. (1993) “Tagging – Formen und Tools,” Proceedings of the GLDV-Jahrestagung 1993, 369–401
Leśniewski, S. (1929) “Grundzüge eines neuen Systems der Grundlagen der Mathematik,” Warsaw: Fundamenta Mathematicae 14:1–81
Lieb, H.H. (1976) “Zum Verhältnis von Sprachtheorien, Grammatiktheorien und Grammatiken,” in D. Wunderlich (ed.) Wissenschaftstheorie der Linguistik, Kronberg: Athenäum, pp. 200–214
LIMAS-Korpus (1973) Mannheim: Institut für Deutsche Sprache
Lipp, S. (2010) Automatische Wortformerkennung für das Schwedische unter Verwendung von JSLIM, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Liu, Haitao (2001) “Some ideas on natural language processing,” Terminology Standardization and Information Technology 1:23–27
Lorenz, O. (1996) Automatische Wortformerkennung für das Deutsche im Rahmen von Malaga, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Lorenz, O., and G. Schüller (1994) “Präsentation von LA-MORPH,” LDV-Forum, Vol. 11.1:39–51. Reprinted in R. Hausser (ed.) 1996
MacWhinney, B. (ed.) (1999) The Emergence of Language from Embodiment. Hillsdale, NJ: Lawrence Erlbaum
März, B. (2005) The Handling of Coordination in Database Semantics. M.A. thesis, CLUE, Univ. Erlangen–Nürnberg [in German]
Mahlow, K. (2000) Automatische Wortformanalyse für das Spanische, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Mann, W.C., and S.A. Thompson (eds.) (1993) Discourse Description: Diverse Linguistic Analyses of a Fund-Raising Text. Amsterdam: John Benjamins
Matthews, P.M., J. Adcock, Y. Chen, S. Fu, J.T. Devlin, M.F.S. Rushworth, S. Smith, C. Beckmann, and S. Iversen (2003) “Towards understanding language organisation in the brain using fMRI,” Human Brain Mapping 18.3:239–247
McCawley, J.D. (1976) Grammar and Meaning, New York: Academic Press
Mehlhaff, J. (2007) Funktor-Argument-Struktur und Koordination im Deutschen, eine Implementation in JSLIM, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Mel’čuk, I.A. (1988) Dependency Syntax: Theory and Practice. New York: State Univ. of New York Press
Mel’čuk, I.A., and A. Polguère (1987) “A formal lexicon in the meaning-text theory,” Computational Linguistics 13.3-4:13–54
Meulen, A. ter, and Heusinger, K. von (2013) “Interview with Hans Kamp,” in Heusinger and ter Meulen (eds.)
Modrak, D. (2001) Aristotle’s Theory of Language and Meaning, New York: Cambridge Univ. Press
Montague, R. (1973) “The Proper Treatment of Quantification in Ordinary English,” in Jaakko Hintikka, Julius Moravcsik, and Patrick Suppes (eds.) Approaches to Natural Language, Dordrecht: Reidel, 221–242
Neisser, U. (1964) “Visual search,” Scientific American 210.6:94–102
NEWCAT’86 = Hausser, R. (1986) NEWCAT: Parsing Natural Language Using Left-Associative Grammar, LNCS 231, pp. 540, Springer
Newton, I. (1687/1999) The Principia: Mathematical Principles of Natural Philosophy. I.B. Cohen and A. Whitman (translators), Berkeley: Univ. of California Press
Niedobijczuk, K. (2009) Implementierung eines automatischen Wortformerkennungssystems des Polnischen in JSLIM, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Nyberg, E.H., and T. Mitamura (2002) “Evaluating QA systems on multiple dimensions,” paper presented at the LREC Workshop on Resources and Evaluation for Question Answering Systems, Las Palmas, Canary Islands, Spain, May 28, 2002
Östman, J.-O., and M. Fried (eds.) (2004) Construction Grammars: Cognitive Grounding and Theoretical Extensions. Amsterdam: John Benjamins
Pandea, P. (2010) Implementierung eines automatischen Wortformerkennungssystems im Rumänischen mit JSLIM, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Pehar, B. (2001) Language and Diplomacy, Saarbrücken: Lambert Academic Publishing
Peirce, C.S. (1931–1935) Collected Papers. C. Hartshorne and P. Weiss (eds.), Cambridge, MA: Harvard Univ. Press
Pepiuk, S. (2010) Implementierung eines automatischen Wortformerkennungssystems für das Französische in JSLIM, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Pereira, F., and S. Shieber (1987) Prolog and Natural-Language Analysis. Stanford: CSLI
Peters, S., and D. Westerståhl (2006) Quantifiers in Language and Logic, Oxford: Oxford Univ. Press
Piaget, J. (1954) The Language and Thought of the Child. London: Routledge
Pollard, C., and I. Sag (1987) Information-Based Syntax and Semantics. Vol. I, Fundamentals. Stanford: CSLI
Pollard, C., and I. Sag (1994) Head-Driven Phrase Structure Grammar. Stanford: CSLI
Portner, P. (2005) “The semantics of imperatives within a theory of clause types,” in Kazuha Watanabe and R.B. Young (eds.) Proceedings of Semantics and Linguistic Theory 14, Ithaca, NY: CLC Publications
Post, E. (1936) “Finite combinatory processes – formulation I,” Journal of Symbolic Logic I:103–105
Quillian, M. (1968) “Semantic memory,” in M. Minsky (ed.) Semantic Information Processing, 227–270, Cambridge, MA: MIT Press
Quine, W.v.O. (1960) Word and Object. Cambridge, MA: MIT Press
Quirk, R., J. Svartvik, G. Leech, and S. Greenbaum (1985) A Comprehensive Grammar of the English Language. New York: Longman
Reibel, D.A., and S.A. Schane (eds.) (1969) Modern Studies of English. Englewood Cliffs, NJ: Prentice Hall
Reiter, E., and R. Dale (1997) “Building applied natural-language generation systems,” Journal of Natural-Language Engineering 3:57–87
Ricoeur, P. (1975) The Rule of Metaphor: Multi-Disciplinary Studies in the Creation of Meaning in Language, trans. Robert Czerny with Kathleen McLaughlin and John Costello, S.J., London: Routledge and Kegan Paul 1978 (Toronto: University of Toronto Press 1977)
Richards, I.A. (1937) The Philosophy of Rhetoric, Oxford: Oxford Univ. Press
Reimer, M., and E. Michaelson (2014) “Reference,” The Stanford Encyclopedia of Philosophy (Winter 2014 Edition), Edward N. Zalta (ed.), URL =
Rosch, E. (1975) “Cognitive representations of semantic categories,” J. of Experimental Psychology: General 104:192–253
Rudat, S. (2011) Implementierung eines Systems der automatischen Wortformerkennung des Portugiesischen mit dem Programm JSLIM, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Russell, B. (1905) “On denoting,” Mind 14:479–493
Sabeva, S. (2010) Erstellung eines linksassoziativen morphologischen Analysesystems für das Bulgarische, Inaugural Dissertation, CLUE, Univ. Erlangen–Nürnberg
Sag, I., G. Gazdar, T. Wasow, and S. Weisler (1985) “Coordination and how to distinguish categories,” Natural Language and Linguistic Theory 3:117–171
Sag, I., and T. Wasow (2011) “Performance-compatible competence grammar,” in R. Borsley and K. Borjars (eds.) Non-Transformational Syntax, Wiley-Blackwell
Saussure, F. de (1972) Cours de linguistique générale. Édition critique préparée par Tullio de Mauro, Paris: Éditions Payot
Schegloff, E. (2007) Sequence Organization in Interaction, New York: Cambridge Univ. Press
SCG’84 = Hausser, R. (1984) Surface Compositional Grammar, pp. 274, München: Wilhelm Fink Verlag
Searle, J.R. (1969) Speech Acts. Cambridge, England: Cambridge Univ. Press
Sedivy, J.C., M.K. Tanenhaus, C.G. Chambers, and G.N. Carlson (1999) “Achieving incremental semantic interpretation through contextual representation,” Cognition 71:109–147
Shibatani, M. (ed.) (1988) Passive and Voice, Amsterdam: John Benjamins
Shieber, S. (ed.) (2004) The Turing Test. Verbal Behavior as the Hallmark of Intelligence. Cambridge, MA: MIT Press
Söllch, G. (2009) Syntaktisch-semantische Analyse des Tagalog im Rahmen der Datenbanksemantik, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Sowa, J.F. (1984) Conceptual Structures. Reading, MA: Addison-Wesley
Sowa, J.F. (2000) Conceptual Graph Standard. Revised version of December 6, 2000, http://www.bestweb.net/~sowa/cg/cgstand.htm
Spivey, M.J., M.K. Tanenhaus, K.M. Eberhard, and J.C. Sedivy (2002) “Eye movements and spoken language comprehension: effects of visual context on syntactic ambiguity resolution,” Cognitive Psychology 45:447–481
Steedman, M. (2005) “Surface-Compositional scope-alternation without existential quantifiers,” Draft 5.1, Sept 2005
Steels, L. (1999) The Talking Heads Experiment. Antwerp: limited pre-edition for the Laboratorium exhibition
Stefanskaia, I. (2005) Regelbasierte Wortformerkennung für das Deutsche als Grundlage eines linksassoziativen Sprachmodells für einen automatischen Spracherkenner mit Prototyping einer Sprachsteuerungskomponente zur Steuerung dezidierter Arbeitsabläufe in einem radiologischen Informationssystem, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Stein, N.L., and T. Trabasso (1982) “What’s in a story? An approach to comprehension,” in R. Glaser (ed.) Advances in the Psychology of Instruction, Vol. 2, 213–268, Hillsdale, NJ: Lawrence Erlbaum
Steinberg, D.D., and L.A. Jakobovits (eds.) (1971) Semantics: An Interdisciplinary Reader in Philosophy, Linguistics and Psychology. Cambridge: Cambridge Univ. Press
Tarski, A. (1935) “Der Wahrheitsbegriff in den formalisierten Sprachen,” Studia Philosophica I:262–405
Tarski, A. (1944) “The semantic concept of truth,” Philosophy and Phenomenological Research 4:341–375
TCS’92 = Hausser, R. (1992) “Complexity in Left-Associative Grammar,” Theoretical Computer Science 106.2:283–308
Tesnière, L. (1959) Éléments de syntaxe structurale. Paris: Editions Klincksieck
Thórisson, K. (2002) "Natural turn-taking needs no manual: computational theory and model, from perception to action," in B. Granström, D. House, and I. Karlsson (eds.) Multimodality in Language and Speech Systems, 173–207, Dordrecht: Kluwer
Tomita, M. (1986) Efficient Parsing for Natural Language. Boston: Kluwer Academic
Twiggs, M. (2005) The Handling of Passive in Database Semantics, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg [in German]
van Oirsouw, R.R. (1987) The Syntax of Coordination. London: Croom Helm
Vergne, J., and E. Giguet (1998) "Regards théoriques sur le 'tagging'," in Proceedings of the Fifth Annual Conference Le Traitement Automatique des Langues Naturelles (TALN)
Vorontsova, L. (2010) Automatische Wortformerkennung für das Russische im Rahmen der LAG, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Weber, C. (2007) Implementierung eines automatischen Wortformerkennungssystems des Italienischen mit dem Programm JSLIM, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Weber, C. (2011) Syntaktisch-semantisches Parsing des Italienischen – Implementierung unter Berücksichtigung von Korpusdaten, Inaugural Dissertation, CLUE, Univ. Erlangen–Nürnberg. Frankfurt am Main: Peter Lang 2012
Wetzel, C. (1996) Erstellung einer Morphologie für Italienisch in Malaga, CLUE-betreute Studienarbeit der Informatik, Univ. Erlangen–Nürnberg
Weydt, H. (1969) Abtönungspartikel. Die deutschen Modalwörter und ihre französischen Entsprechungen. Bad Homburg: Gehlen
Wiener, N. (1948) Cybernetics: Or Control and Communication in the Animal and the Machine. Paris: Hermann and Cie; 2nd revised ed. 1961, Cambridge, MA: MIT Press
Wierzbicka, A. (1991) Cross-Cultural Pragmatics: The Semantics of Human Interaction. Berlin: Mouton de Gruyter
Wittgenstein, L. (1921) "Logisch-philosophische Abhandlung," Annalen der Naturphilosophie 14:185–262
Younger, D. (1967) "Recognition and parsing of context-free languages in time n³," Information and Control 10.2:189–208
Zheng, Weiwei (2011) Verbalkomplexe im Chinesischen – Eine Implementierung in JSLIM, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Zipf, G.K. (1949) Human Behavior and the Principle of Least Effort, Cambridge, MA: Addison-Wesley
Zischler, L. (2011) Implementierung eines Syntaxparsers für das Chinesische, M.A. thesis, CLUE, Univ. Erlangen–Nürnberg
Name Index
Abelson, R., xx
Abney, S., xix
Ágel, V., xx
Ajdukiewicz, K., xix
Anderson, J.R., ix
Aristotle, xxiii, 7, 36
Austin, J.L., xx
Bar-Hillel, Y., xix
Barwise, J., xxiii
Bauer, M., xxii
Beaugrande, R.-A., xx
ben-Zineb, R., xxii
Berners-Lee, T., xx
Bertossi, L., viii
Beutel, B., xxii
Biederman, I., 49, 56
Bochenski, I., 70
Bransford, J., 49
Bresnan, J., xx, 33
Briandais, R., 195
Brill, E., xix
Brooks, R., 44
Burnard, L., xxi
Carbonell, J., 223
Charniak, E., xix
Chen, Fang, 11
Choe, J.W., 92
Chomsky, N., xx, 281
Clark, H.H., 3
Cocke, J., xix
Collins, M.J., xix
Copestake, A., 91
Cormack, A., 91
Dale, R., xx
Dauses, A., 228
Déjean, H., xix
Dorr, B., xx
Dressler, W., xx
Earley, J., xix
Eberhard, K.M., 28
Elmasri, R., 38
Engel, U., 62
Feng, Qiuxiang, xxii
Fillmore, C., xx, 295
Flickinger, D., 91
Francis, W.N., 124
Franks, J., 49
Frederking, R., 223
Fredkin, E., 195
Frege, G., xxiii, 84, 85, 97
Fried, M., xx
Gao, M., xxii
Gazdar, G., xx, 33, 136, 281
Gelbuk, A., 295
Gerstner, C., 67
Gibson, J., 49
Giguet, E., xix
Girstl, A., xxii
Graf, S., xxii
Greenbaum, S., 123
Greenberg, J., 116, 218
Grice, P., xx, 72
Grün, A. von der, xxii
Habermas, J., 11
Halliday, M.A.K., xx
Handl, J., xxii, xxvi, 46, 193
Harman, G., 281
Hasan, R., xx
Hauser, M.D., 18
Hellwig, P., xx
Hendler, J., xx
Heß, K., 124
Heusinger, K. von, 207
Huang, H.-Y., 124
Hubel, D., 49
Hučinova, M., 124
Hudson, R.A., xx, 136
Ickler, T., 62
Ide, N., xxi
Jackendoff, R.S., 136
Jägers, S., xxii
Jezzard, P., 8
Johnson, M., 72
Kalender, K., 124
Kamp, J.A.W., xxiii
Kapfer, J., xxvi, 136
Kaplan, R., 39
Kasami, T., xix
Katona, G., viii
Kay, M., xix
Kay, P., xx, 295
Kempson, R., 91
Kim, Soora, xxii, 124
Kiparsky, P., 97
Kirkpatrick, K., 49
Knight, K., 41
Kohonen, T., 46
Kolesnikova, O., 295
Kučera, H., 124
Kycia, A., xxvi
Lakoff, G., 72, 96
Lassila, O., xx
Lebiere, C., ix
Lee, Kiyong, 109, 116, 290
Leśniewski, S., xix
Leech, G., xx
Leidner, J., xxii
Lenders, W., 124
Lieb, H.H., 24
Lipp, S., xxii
Liu, Haitao, 7
Lorenz, O., xxii
MacWhinney, B., 17
Mahlow, K., xxii
Mann, W.C., xx
März, B., 124, 136
Matthews, P.M., 8
McCawley, J.D., 96
Mehlhaff, J., 116
Mel'čuk, I.A., xx
Meulen, A. ter, 207
Mitamura, T., 223
Modrak, D., 36
Montague, R., xx, 90, 97, 207
Navathe, S.B., 38
Neisser, U., 49
Neumann, J. von, xvii
Newton, I., vii
Niedobijczuk, K., xxii
Nyberg, E., 223
Oirsouw, R. van, 136
Östman, J.-O., xx
Pandea, P., xxii
Pehar, B., 281
Peirce, C.S., xxiii, 49
Pepiuk, S., xxii
Pereira, F., xix
Perry, J., xxiii
Peters, S., 90
Piaget, J., 30, 49
Pollard, C., xx, 33
Portner, P., 62
Post, E., xx
Pustejovsky, J., xxi
Quillian, R., xx
Quine, W.v.O., 97
Quirk, R., 123, 307
Reiter, E., xx
Reyle, U., xxiii
Richards, I.A., 73
Ricoeur, P., 75
Rosch, E., 49
Rudat, S., xxii
Russell, B., 93
Sabeva, S., xxii
Sag, I., xx, 27, 33, 136
Saussure, F. de, xxiii, 58
Schank, R., xx
Schegloff, E., 5
Schewe, K.-D., viii
Schüller, G., xxii
Schwartz, J., xix
Scott, D., 7
Searle, J.R., xx
Sedivy, J.C., 28
Shieber, S., xix, 39
Söllch, G., 124
Sowa, J.F., xx
Spivey, M.J., 28
Steedman, M., 91
Steels, L., 56
Stefanskaia, I., xxii
Stein, N.L., 49
Tanenhaus, M.K., 28
Tarski, A., xx, 207
Tesnière, L., xx, 14
Thalheim, B., viii
Thompson, S.A., xx
Thórisson, K., 5
Tkemaladze, S., 124
Tomita, M., xix, 281
Trabasso, T., 49
Vergne, J., xix
Vorontsova, L., xxii
Weber, C., xxii
Westerståhl, D., 90
Wetzel, C., xxii
Weydt, H., 62
Wiener, N., 44
Wierzbicka, A., xx
Wiesel, T., 49
Winograd, T., 137
Wittgenstein, L., 10
Younger, D., xix
Zheng, W., 31
Zipf, G., xxii
Zischler, L., 31
Subject Index
ablative, 95
absolute proposition, 59, 67, 68, 71, 76
abstract data type, xvii
action, xvii–xix, xxiv, xxv, 4, 8, 11, 13, 18, 21, 22, 25, 26, 56, 57, 59, 76
adjective, 87, 89
  adnominal, 135, 316
  adverbial, 135, 317
adnominal coordination, 146
agreement
  definition-based, 16
  identity-based, 16
algebraic definition, 6
ambiguity, 91, 92, 120, 128, 190
annotation, xx, 24
ASCII, 19
attribute condition, 36
auto-channel, 9
automatic dialogue systems, viii
auxiliary, 100, 315
balance, 44
behaviorism, 27
best match, 73
binding variable, 212
BNC, xxi, 24, 144, 145
body flush, 203
bookkeeping attribute, 53
Brill tagger/parser, xix
cat attribute, 87
categorial grammar, xix, 4, 212
central cognition, 19
chart parser, xix
Chinese, 31
chunk parser, xix
cognitive content, 35
coherence, 26, 70, 259
collective plural, 146
comma, 128
common noun, 251
complexity
  mathematical, 8, 14
  computational, 207
conceptualization, xix, 24, 217
conjunction
  coordinating, 124
  subordinating, 105
constituent structure, 8
constraint-based unification grammar, 27
construction grammar, xx
content, 98, 119, 120, 139, 143, 145, 146
content extraction, viii
continuation value, 53
continuations, 105
continuity condition, 47, 48, 254
control structure, xvii, xxiv
coordinating conjunction, 124, 149
coordination, 69, 123, 149
copula, 101, 102
core attribute, 52
coreference, 91, 92
corpus linguistics, xx, 4
crowding, 283
cybernetics, 44
CYK Parser, xix
data structure, xvii
database schema, 37, 38
declarative specification, viii, xxvi, 3, 5, 6, 11, 24, 42, 54, 60, 207
definite article, 92, 242
dependency grammar, xx
determiner, 83, 90, 92, 93
diplomatic ambiguity, 281
direct use of language, 72, 74
Discourse Semantics, xxiii
distributive plural, 146
duplex, 35, 125, 150
Earley algorithm, xix
English, 16
Epimenides Paradox, 8
episodic proposition, 67, 68, 71, 76
explicit hypothesis formation, 7
external interfaces, xvii–xix, 3, 4, 6, 12, 17, 19, 21, 22, 25, 33, 42, 49, 55
extrapolation of introspection, 9
extrapropositional relation, 69, 150, 152, 153
extrasentential, 149, 151
feature approach, 49
fragment, xix, xxv, 207
frequency, 124
function word absorption, 83, 89, 101, 242
functor-argument, 81
gapping, xxv, 8, 124, 136, 139, 145, 147
GB, xxiii, 41
generative grammar, 4, 8
generic sentence, 67
geon approach, 49
German, 16, 35, 58, 116, 124, 282
GPSG, xx, xxiii, 33, 41
greenhorn, 76
head flush, 203
head-driven parser, xix
hidden Markov model (HMM), xx
horizontal variable binding, 70, 71, 91
HPSG, xx, xxiii, 33, 41
human experiencer, 104
hypernym, 73
immediate action, 20
immediate recognition, 20
immediate reference, 25, 26, 59
indirect use of language, 74
inference, 69, 71
instruction, 221, 263
intention, 56, 57
internal matching, 27, 59, 74
Internet querying, viii
interrogative, 65, 66
intrasentential, 149
isolated proplet, 40, 58
kind of sign, xviii
Korean, 116, 290
LA grammar a^k b^k c^k, 44
LAGo Hear.1, 211
LAGo Hear.2, 252
LAGo Think.1, 218
LAGo Think-Speak.1, 223
LAGo Think-Speak.2, 274
language production, xix, xx, xxiii, 4, 28, 57, 75, 129, 131, 152, 172, 174, 176
lexical lookup, xviii, xxv, 36, 41, 58, 59, 77, 82
lexicon, 24, 52, 208, 231
LFG, xx, xxiii, 33, 41
LIMAS corpus, 136
linguistic laboratory set-up, 43, 293
literal meaning1, 13, 27, 29, 72
machine translation, viii, xx
macro word order, 230
marked, 282
matching condition, 36
mediated recognition, 20
mediated reference, 25, 26, 207
medium (verbal voice), 20
mental state verbs, 107
metadata mark-up, vii, xx
metalanguage, xvii, xxiii, 21, 23
metaphor, 72
micro word order, 230
modality, 13, 19, 20, 57, 194
model theory, xx, 27, 44, 90
modifier recursion, 135
modus ponens, 70
monopolar (balance), 44
mood
  sentence, 66, 152
  verb, 66
morphology, xviii, 24
name, 30, 208
nativism, xxv, 27, 33, 41, 123
non-literal use, 68
nonterminal nodes, 105
noun, 311
noun gapping, 145
nouvelle AI, 44
object gapping, 144
ontological role, 30
opaque, 96
OWL, xx
particle, 62
perceptual grounding, xix
peripheral cognition, 19
personal pronoun, 226
perspective taking, 75
phrase structure derivation, 41
phrase structure grammar, xix, xx, 4, 40, 42
Polish notation, 85
practical linguistics, vii
practical mechanics, vii
pre-order, 85
precision, viii
predicate calculus, 70, 71, 90, 91, 93
predicate gapping, 142
prepositional phrase, 98
primary coding, 74, 76, 77
primary relations, 81, 123
proof of feasibility, 276
proplet, xxiv, 33
proplet shell, 53
propositional attitudes, 8
propositional calculus, 68, 70
prototype approach, 49
psych verbs, 104
punctuation, 208
quantifier, 70, 90, 91
RDF, xx
real time, xxii, 24
recall, viii
recognition, xvii–xix, xxiv, xxv, 4, 11, 13, 18, 19, 21, 22, 25, 26, 55–57, 59, 76
reference, xvii, xxv, 4, 20, 21, 51, 55, 72
relative clause, 111
reliability of data, 199
right-node-raising, xxv, 136
schema approach, 49
scrap yard procedure, 153
secondary coding, 74–77
secondary relation, 76
sediment, 199
self-organization, 46, 71, 214, 242
semantic doubling, 281
semantic networks, xx
semantic relations of structure, 34
semantics, xviii, 24
sensory modality, 11
sentence mood, 66, 152
service channel, 9
sign recognition/production, 22
simple coordination, 126, 129, 132
simplex, 35, 125, 150
Situation Semantics, xxiii
SLIM theory of language, xxv, 27, 59, 72
smart solution, vii, 7
solipsism, 10
speech act theory, xx, 27
speech recognition, viii, 194
STAR, 28, 67, 73
statistical tagging, xx
statistics, vii, xix, 56, 196
structuralism, 27
subject gapping, 139, 177
subordinating conjunction, 116
substitution value, 82, 89, 105, 108, 111, 127, 130, 237, 244
substitutions, 105
sur attribute, 52
Surface Compositionality, 13–15, 27, 82
suspension, 115, 152, 204, 213
syntax, xviii, xix, 24
template approach, 49
text linguistics, 4
theoretical linguistics, vii
theoretical mechanics, vii
thought, xxiv, 28, 63
time-linear, xxiii–xxv, 4, 13–16, 27, 28, 39, 40, 82, 126
transparent, 96
tree bank, 24
truth, 44
truth-conditional semantics, xxiii, 4, 8, 21, 69, 90
turn-taking, xxiii, 4, 5
type-token, 50, 51, 56–58, 60, 67, 68
unbounded dependency, 118
unification, 41
unmarked, 282
upscaling, viii, 8, 9, 11, 12
user-friendliness, 7
utterance meaning2, 27, 72
vagueness, 8
valency theory, xx, 14–16
value condition, 36
variable, 15, 50
  binding, 54, 66, 212
  replacement, 53, 64, 67
  substitution, 237
verb, 311, 314
verbal mood, 307
verification, viii, 3, 7, 11
vertical variable binding, 70
von Neumann machine, xvii
WH interrogative, 65, 66
word bank, 38, 67, 215, 216
XML, xx, 24
Yes/No interrogative, 65, 66
Back Cover
Everyday life would be easier if we could simply talk with machines instead of having to program them. Before such talking robots can be built, however, there must be a theory of how communicating with natural language works. This requires not only a grammatical analysis of the language signs, but also a model of the cognitive agent with interfaces for recognition and action, an internal database, and an algorithm for reading content in and out.

In Database Semantics, these ingredients are used for reconstructing natural language communication as a mechanism for transferring content from the database of the speaker to the database of the hearer, using nothing but a time-linear sequence of modality-dependent unanalyzed external language surfaces.

Part I of this book presents a high-level description of an artificial agent which humans can freely communicate with in their accustomed language. Part II analyzes the major constructions of natural language, i.e., intra- and extrapropositional functor–argument structure, coordination, and coreference, in the speak and the hear mode. Part III defines declarative specifications for fragments of English, which are used for an implementation in Java™. An appendix describes the basic software machine JSLIM, which is applicable to different languages.

The book provides researchers, graduate students, and software engineers with a functional framework for the theoretical analysis of natural language communication and for practical applications of natural language processing.