Dynamic Lambda Calculus

Michael Kohlhase, Susanna Kuschert
Universität des Saarlandes
August 11, 1997

Abstract
The goal of this paper is to lay a logical foundation for discourse theories by providing an algebraic foundation of compositional formalisms for discourse semantics as an analogue to the simply typed λ-calculus. Just as that can be specialized to type theory by simply providing a special type for truth values and postulating the quantifiers and connectives as constants with fixed semantics, the proposed dynamic λ-calculus DLC can be specialized to λ-DRT by essentially the same measures. This yields a much more principled and modular treatment of λ-DRT than before, and is also expected to provide a conceptually simple basis for studying higher-order unification for compositional discourse theories.
Contents

1 Introduction
2 Dynamic λ-Calculus
  2.1 Types
  2.2 Well-formed Formulae
  2.3 An Application: λ-DRT's syntax
3 Operational Equality
  3.1 Alphabetic Variants
  3.2 Substitutions
  3.3 Reductions
4 Semantics
  4.1 Constructing the Denotations
  4.2 Application to λ-DRT
5 Meta-logical Properties of DLC
6 Conclusion
A Towards Higher-Order Unification for DLC
  A.1 Long Head Normal Forms
  A.2 General Bindings
1 Introduction

Over the past few years, there has been a series of attempts [Zee89, GS90, EK95, Mus96, KKP96, Kus96] to combine Montague's type-theoretic framework for compositional semantics construction [Mon74] with dynamic approaches to discourse semantics, such as DRT [Kam81]. The motivation for these developments is to obtain a general logical framework for discourse semantics that combines compositionality and dynamic binding. Aside from their intrinsic theoretical interest, some of these formalisms have gained practical relevance in NLP applications. For instance, λ-DRT [KKP96, Kus96] has been used for several years as the semantic representation formalism of the dialogue translation project Verbmobil [BMM+94].

For semantics construction, all of these approaches give higher-order lexical semantics to the basic sentence constituents and use β-reduction as a computational device for constructing the final (first-order) semantics. Let us look at an example of compositional semantics construction in λ-DRT for the sentence "A_i man sleeps." (i denoting an index for anaphoric binding). We write a DRS with discourse referents $U$ and conditions $C$ in the linear form $[U \mid C]$:

\[
\begin{array}{rl}
 & (\lambda Q.\, [U_i \mid man(U_i)] \otimes Q(U_i))\ (\lambda X.\, [\ \mid sleep(X)]) \\
\longrightarrow & [U_i \mid man(U_i)] \otimes [\ \mid sleep(U_i)] \\
\longrightarrow & [U_i \mid man(U_i),\ sleep(U_i)]
\end{array}
\]
Here ⊗ is the λ-DRT conjunction operator, which intuitively merges two DRSes by merging their sets of discourse referents and their sets of conditions.

Unfortunately, the above-mentioned unified formalisms have so far failed to duplicate a key aspect of type theory that has led to interesting linguistic analyses in computational linguistics. Type theory (or higher-order logic) is a two-layered formalism, where the algebraic content (the behavior of higher-order functions) is neatly packaged into a formalism of its own, namely the simply typed λ-calculus, while the logical content (the specific semantics of connectives and quantifiers) is built on top of it. Thus the use of type theory allows us to deal with the complexities of natural language semantics on two distinct levels: the simply typed λ-calculus provides the theory of β-reduction (which is the motor of compositionality), whereas the logical side of semantics is dealt with by a system that is rather like predicate logic. By focusing on known mechanisms for dealing with each of the two subsystems, it has proved possible to use type-theoretic techniques for natural language processing systems.

Higher-order unification [Hue75] solves equations in the simply typed λ-calculus and leads to analyses of ellipsis [DSP91, GK96b, GK97, PRN97] and focus [GK96a, Pul94]. Note that these accounts are inadequate for the treatment of the logical structure, so they make insufficient predictions about connectives and quantifiers, which are treated as ordinary uninterpreted constants.

First-order automated theorem proving [Fit90] is used to reason about the logical structure of natural language, for presuppositions, and to integrate world knowledge into natural language semantics. Note that these approaches normally cannot capture higher-order aspects of the semantics like compositionality or underspecified (e.g. elliptic) semantic elements.

Only recently, logic formalisms for higher-order theorem proving [Koh95] have appeared that are generalizations of both higher-order unification and automated theorem proving. These can be used to integrate world knowledge into the unification-based approaches [GKvL96].
The goal of this paper is to lay the foundation for analyses like the above by providing an algebraic foundation of compositional formalisms for discourse semantics as an analogue to the simply typed λ-calculus. Just as that can be specialized to type theory by simply providing a special type o for truth values and postulating the quantifiers and connectives as constants with fixed semantics, the proposed Dynamic Lambda-Calculus (DLC) can be specialized to λ-DRT (see sections 2.3 and 4.2) by essentially the same measures, yielding a much more principled and modular treatment of λ-DRT than before.

The central theme of DRT is the establishment of anaphoric binding in discourse; λ-DRT, like the other formalisms mentioned above, establishes such bindings in a Montagovian construction process, given coindexation of the pronoun with its antecedent by the syntactic analysis. As an example, consider continuing the above sentence with "He_i snores.", which is represented by the DRS $[\ \mid snore(U_i)]$ (no discourse referents, one condition). From the representations of the two sentences, the representation of the discourse may then be constructed by applying $\lambda P.\lambda Q.\, P \otimes Q$ to them and reducing:

\[
\begin{array}{rl}
 & (\lambda P.\lambda Q.\, P \otimes Q)\ [U_i \mid man(U_i),\ sleep(U_i)]\ [\ \mid snore(U_i)] \\
\longrightarrow & (\lambda Q.\, [U_i \mid man(U_i),\ sleep(U_i)] \otimes Q)\ [\ \mid snore(U_i)] \\
\longrightarrow & [U_i \mid man(U_i),\ sleep(U_i)] \otimes [\ \mid snore(U_i)] \\
\longrightarrow & [U_i \mid man(U_i),\ sleep(U_i),\ snore(U_i)]
\end{array}
\]
Note that in this process the free variable in the representation of "He_i snores.", standing for the pronoun he_i, has in the course of β-reduction been captured by the declaration of the discourse referent U_i in the representation of the first sentence, which is part of the representation of a_i man. Indeed, the capturing of free variables, something that is impossible in the pure λ-calculus, is the driving force behind the establishment of anaphoric binding in λ-DRT.

The proposed formalism DLC focuses on the interaction of dynamic binding (declaration of discourse referents), function abstraction and function application. It is spelt out by the interaction of two distinct abstraction operators, the well-known λ-operator and the dynamic δ-operator. One of the central themes here is the capturing of free variables by formulae containing dynamic binding constructs, shown in the above example to be the means of establishing the binding of anaphors in the construction of discourse representations. This leads us to a thorough study of the variables known from the λ-calculus and the variables that can thus be captured. Indeed, DLC shows that we have variables of inherently different character: λ-bound variables, which have no linguistic meaning and are merely place holders, and dynamic variables, which stand for linguistic objects.

Most of the burden of the interaction of the two kinds of variables, and in particular of the respective abstraction mechanisms that go with them, is carried by an elaborate type system that takes into account structural properties of dynamic systems, such as binding power and the accessibility relation. We therefore have a means to describe the binding properties of any expression, its mode, and thus some of its discourse properties.

First experiments have shown that this information proves very useful in different linguistic applications. For instance, if we augment the linear higher-order unification analysis for quantifier underspecification from [Pin96] with DLC types, the algorithm produces information about the dynamic potentials of the readings that can be used to restrict the search space for solutions [EK97].

Furthermore, this paper (in particular, the general binding Lemma A.6) provides the basic building blocks for higher-order unification algorithms for DLC. Since DLC covers the structural level of compositional formalisms for discourse semantics such as λ-DRT, such a DLC-unification algorithm (which we will develop in another paper) can serve as the analogue of Huet's algorithm in dynamic extensions of the static HOU analyses discussed above. First experiments with the formal system have shown that these will be more intuitive than those for the static case.
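As a rough illustration of the two reductions shown in this introduction, the following Python sketch models a DRS as a pair of discourse referents and conditions, and the merge as the union of both. The names `DRS`, `merge` and `free_variables` are hypothetical and not taken from λ-DRT or DLC, and Python closures merely stand in for λ-abstraction, so this approximates the construction under those assumptions rather than implementing the calculus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DRS:
    """A DRS [referents | conditions]; conditions are (predicate, argument) pairs."""
    referents: frozenset
    conditions: tuple


def merge(a, b):
    """Assumed reading of the merge: union the referents, append the conditions."""
    return DRS(a.referents | b.referents, a.conditions + b.conditions)


def free_variables(drs):
    """Variables used in conditions but not declared as discourse referents."""
    return {arg for _, arg in drs.conditions} - drs.referents


# "A_i man": takes a property Q and merges [U_i | man(U_i)] with Q(U_i).
a_man = lambda q: merge(DRS(frozenset({"U_i"}), (("man", "U_i"),)), q("U_i"))
# "sleeps": maps X to [ | sleep(X)].
sleeps = lambda x: DRS(frozenset(), (("sleep", x),))

first = a_man(sleeps)                               # "A_i man sleeps."
he_snores = DRS(frozenset(), (("snore", "U_i"),))   # "He_i snores."
discourse = merge(first, he_snores)

print(free_variables(he_snores))   # U_i is still free in the second sentence
print(free_variables(discourse))   # nothing is free: U_i has been captured
```

In this sketch the capture of U_i happens simply because `merge` puts the declaration from the first DRS next to the condition from the second one; no renaming is attempted, which is exactly the behavior that ordinary capture-avoiding substitution would forbid.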
2 Dynamic λ-Calculus

Previous work, e.g. on λ-DRT, has shown that the identification of functional and dynamic properties, and of their interaction, is the key to understanding the character of compositional dynamic formalisms. [KKP96, Kus96] observe that we can identify two different kinds of variables together with two kinds of abstraction of very different nature: the well-known standard λ-abstraction, used for the compositional construction of representations and abstracting over variables that act as place holders, and a dynamic abstraction that we call δ-abstraction. The δ-abstracted variables stand for linguistic objects that are introduced or quantified over, and can therefore be considered to have proper linguistic meaning; for this reason we shall sometimes also call them linguistic variables.

The most prominent difference between the two abstraction operators is their notion of scope: while λ-abstraction has the traditional notion of scoping (defined with respect to the applicative structure of formulae), the scoping of δ-abstraction is given by the notion of accessibility, motivated and defined by some linguistic theory (e.g. DRT or DPL). Their interaction is non-trivial, since δ-abstraction can capture free variables during β-reduction (in fact this is the driving motor of dynamicity), which breaks a taboo on abstraction in traditional λ-calculi.

The central idea of DLC is to encode the information necessary for describing the two forms of scoping (in particular the accessibility relation governing dynamic abstraction) and variable capture in an elaborate type system, which can encode knowledge about δ-bound variables (which have the power to capture variables) and knowledge about free variables (which are liable to being captured).

As a consequence, DLC's types include mode information for the variables that are visible from the outside of an expression: in addition to the standard λ-calculus types (constructed from base types by function type construction) there are types of the form $\Gamma\#\alpha$, where the mode $\Gamma$ marks a variable with a '+' if the variable has capturing power, and with a '−' if it is prone to being captured (i.e. if it is a free variable). As a first example, if the expression A has type $X^-, U^+, \Gamma\#\alpha$, we know that A is of the standard (type-theoretic) type $\alpha$ (e.g. if $\alpha$ is the base type o, then A is a proposition) and that A contains at least the free variable X and the δ-bound variable U. λ-bound variables in A do not appear in the mode of its type, since they are not visible to the outside of A.
2.1 Types
In the following we presuppose a countably infinite set $V$ of variables, which is partitioned into two disjoint subsets of variables of inherently different character: the set $V^{fun}$ of functional variables and the set $V^{dyn}$ of dynamic variables. For a discussion of the two notions of variables, see below. We shall use X, Y, Z for first-order functional variables and P, Q, R for higher-order functional variables. U, V, W will be used for dynamic variables, and A, B, C for variables that are either functional or dynamic. Lower-case letters are used for constants, bold face letters for well-formed formulae.

Definition 2.1 (Modes). A mode is a finite partial function from the set of variables to the signs + and −, together with a set of natural numbers serving as place-holders. We write Dom(·) for the domain of such a function. We reserve uppercase Greek letters such as $\Gamma$ and $\Delta$ as meta-variables for modes and write the mode $\Gamma$ that declares U positive, Y negative, and has a place-holder 0 as $\Gamma = U^+, 0, Y^-$. Formally, modes are constructed from mode declarations like $U^+$ and $X^-$ for concrete variables and from place-holders by the concatenation operator ",". Here we assume that positive declarations always overwrite negative ones: for instance $U^+, U^- = U^+ = U^-, U^+$. We use $\Gamma^+$ ($\Gamma^-$) for the positive (negative) submode of $\Gamma$. For example, if $\Gamma = U^+, Y^-, 0, P^+$, then $\Gamma^+ = U^+, P^+$ and $\Gamma^- = Y^-$.
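The bookkeeping in Definition 2.1 can be tried out with a small Python sketch. The representation chosen here (a dictionary from variable names to signs plus a set of place-holders) and the helper names `mode`, `concat`, `submode` and `dom` are assumptions made for illustration; they are not notation from the paper.

```python
def mode(*items):
    """Build a mode from declarations such as "U+", "Y-" and place-holders such as 0."""
    signs, holes = {}, set()
    for item in items:
        if isinstance(item, int):
            holes.add(item)              # natural numbers act as place-holders
        else:
            var, sign = item[:-1], item[-1]
            if signs.get(var) != "+":    # a positive declaration is never overwritten
                signs[var] = sign
    return signs, frozenset(holes)


def concat(m1, m2):
    """The concatenation operator ",": positive declarations overwrite negative ones."""
    merged = dict(m1[0])
    for var, sign in m2[0].items():
        if merged.get(var) != "+":
            merged[var] = sign
    return merged, m1[1] | m2[1]


def submode(m, sign):
    """Positive (sign "+") or negative (sign "-") submode; place-holders are dropped."""
    return {v: s for v, s in m[0].items() if s == sign}, frozenset()


def dom(m):
    """Dom(.): the variables on which the mode is defined."""
    return set(m[0])


gamma = mode("U+", "Y-", 0, "P+")
print(submode(gamma, "+") == mode("U+", "P+"))   # the example from Definition 2.1
print(submode(gamma, "-") == mode("Y-"))
print(concat(mode("U+"), mode("U-")) == mode("U+") == concat(mode("U-"), mode("U+")))
```

Dropping the place-holders in `submode` follows the worked example in the definition, where the place-holder 0 appears in neither submode.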
2.2 Well-formed Formulae
Definition 2.2 (DLC Types and Their Constituents). A type in DLC is either a base type, a moded type $\Gamma\#\alpha$, or a function type $\alpha \to \beta$, where $\alpha, \beta$ are again types. We denote the set of all types by $T$ and that of base types by $BT$. We call a type simple, iff it is not moded (i.e. it is a base type or a function type), and identify a simple type $\alpha \in ST$ with the type $\emptyset\#\alpha$. Note that simple types may contain modes on the functional level, e.g. $\Gamma\#\alpha \to \Delta\#\beta$ is simple. We will call the subset of types that are generated from $BT$ only by function type construction (i.e. those types without #) pure, since they correspond to standard λ-calculus types. Finally, we identify $\Gamma\#(\Delta\#\alpha)$ with $(\Gamma, \Delta)\#\alpha$. Thus we can always rewrite a type $\alpha$ as $\alpha = \Gamma\#\beta$ such that $\beta$ is simple. In this case we call $\beta$ the characteristic type and $\Gamma$ the characteristic mode of $\alpha$. If the mode of a type contains positive variables, then we call the type dynamic, else static. Thus adding further positive variables can be viewed as adding further degrees of dynamicity.

Notation 2.3. We shall impose the convention that # binds stronger than the function type constructor →, i.e. the characteristic type of $\Gamma\#(\alpha \to \beta)$ is $\alpha \to \beta$ and the characteristic type of $\Gamma\#\alpha \to \beta$ is the type itself.

Definition 2.4 (Components of a Type $\alpha$). To manipulate types, we will need a function that extracts the top-level positive variables of a type $\alpha$. We recursively define $\alpha^+$ as $\Gamma^+ \cup \beta^+$ if $\alpha = \Gamma\#\beta$, and $\alpha^+ = \emptyset$ if $\alpha$ is a simple type, and analogously $\alpha^-$ for the negative top-level mode. Note that for any type $\alpha = \Gamma\#\beta$, where $\beta$ is simple, the characteristic mode $\Gamma$ is $\alpha^+ \cup \alpha^-$; we therefore take this as the definition of the characteristic mode for arbitrary types. Further, we shall need a function which collects the variables of all levels of a type, the binding potential of $\alpha$, defined as $BP(\alpha) = \alpha^+ \cup \alpha^-$ if $\alpha$ is not a function type, and $BP(\alpha) = BP(\beta) \cup BP(\gamma)$ if $\alpha = \beta \to \gamma$. We write the positive and negative parts of $\alpha$'s binding potential as $BP^+(\alpha)$ and $BP^-(\alpha)$.

The name binding potential reflects that $BP(\alpha)$ gives full information on which variables contribute to the binding behavior of an expression, if that expression is of type $\alpha$. The binding behavior needs to contain both positive and negative variables since, as has been said before, positive variables are those which do the capturing, while negative variables are those which are liable to being captured. Further, we need to consider all functional levels of the type, since they give information on what happens on function application.
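The functions of Definition 2.4 can be sketched in Python as follows. The class names `Base`, `Fun` and `Moded` and the function names are hypothetical; the sketch ignores place-holders and the overwriting of negative by positive declarations, and it follows the literal wording of the definition, in which the binding potential descends only through function types. It should be read as one possible rendering under these assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Base:
    name: str            # a base type such as o


@dataclass(frozen=True)
class Fun:
    arg: object          # argument type
    res: object          # result type


@dataclass(frozen=True)
class Moded:
    pos: frozenset       # variables declared '+'
    neg: frozenset       # variables declared '-'
    body: object         # the type to the right of '#'


def top(t, sign):
    """Top-level positive ("+") or negative ("-") variables of a type."""
    if isinstance(t, Moded):
        here = t.pos if sign == "+" else t.neg
        return here | top(t.body, sign)
    return frozenset()   # simple types have no top-level mode


def binding_potential(t, sign):
    """Positive or negative part of the binding potential BP."""
    if isinstance(t, Fun):
        return binding_potential(t.arg, sign) | binding_potential(t.res, sign)
    return top(t, sign)


def characteristic_type(t):
    """Strip all top-level modes, leaving a simple type."""
    while isinstance(t, Moded):
        t = t.body
    return t


def is_dynamic(t):
    """A type is dynamic if its mode contains positive variables, else static."""
    return bool(top(t, "+"))


o = Base("o")
nested = Moded(frozenset({"U"}), frozenset(), Moded(frozenset(), frozenset({"Y"}), o))
fun = Fun(Moded(frozenset({"U"}), frozenset({"X"}), o), Moded(frozenset({"V"}), frozenset(), o))

print(top(nested, "+"), top(nested, "-"), characteristic_type(nested))
print(is_dynamic(nested), is_dynamic(fun))
print(binding_potential(fun, "+"), binding_potential(fun, "-"))
```

Here `nested` plays the role of a type of the form Γ#(Δ#o), whose two modes are collected at the top level, while `fun` is a simple type whose modes sit only on the functional level and therefore show up in the binding potential but not in the top-level mode.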
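With types including variables, it is natural to define a type substitution in parallel to substitution within expressions (which is to be defined later):

Definition 2.5 (Mode Substitution). A mode substitution is a pair $[\Delta/A]$ of a mode $\Delta$ and a variable $A$. We define its application $[\Delta/A]\alpha$ to a type $\alpha$ by

\[
[\Delta/A]\alpha =
\begin{cases}
[\Delta/A]\beta \to [\Delta/A]\beta' & \text{for } \alpha = \beta \to \beta' \\
[\Delta/A]\Gamma \# [\Delta/A]\alpha' & \text{for } \alpha = \Gamma\#\alpha' \\
\alpha & \text{for } \alpha \in BT
\end{cases}
\]

\[
[\Delta/A]\Gamma =
\begin{cases}
\Delta, (\Gamma - A^-) & \text{for } A^- \in \Gamma \\
[\Delta/A]n, [\Delta/A](\Gamma - n) & \text{for } n \in \mathbb{N} \\
\Gamma & \text{for } A^- \notin \Gamma,\ \Gamma \cap \mathbb{N} = \emptyset
\end{cases}
\]

\[
[\Delta/A]n = [\Delta/A]n
\]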