Logic Grammars, Compositional Semantics, and Overgeneration

James H. Andrews
Department of Computer Science University of Western Ontario London, Ontario, Canada Email:
[email protected]
Veronica Dahl
School of Computing Science Simon Fraser University Burnaby, BC Canada V5A 1S6 Email:
[email protected]
Bharat Jayaraman
Department of Computer Science & Engineering State University of New York at Buffalo Buffalo, NY 14260 Email:
[email protected]
Abstract
First-order treatments of long-distance phenomena such as relativization typically suffer from overgeneration. Higher-order-inspired extensions of Prolog have been proposed with varying degrees of success, but they still suffer from overgeneration in the case of imbricated structures. We first propose an Assumption Grammar based treatment which deals successfully with this case both for analysis and for generation, and which maintains semantic compositionality as well. We then propose a cleaner, true higher-order logic approach which solves the same problems. We argue that this approach is superior to other kinds of grammars dealing with long-distance dependencies, and we advocate the development of a mixed platform (λ-Prolog plus continuation-based assumptions) where the best of both worlds can be exploited.
1 Introduction

Overgeneration (the acceptance by a language processor of incorrect as well as correct sentences) has long been a problem in logic-based as well as other computational accounts of language. Typically, such accounts focus on analysis more than on generation. Under the assumption that users master the human language described by the system, incorrect sentences are not as problematic in analysis as they are in generation mode. One of the trickiest cases of overgeneration involves structures in which two long-distance constituents need to be related. In particular, if such structures can be recursively embedded (e.g.
relative or interrogative clauses), it can be difficult to determine just which dependencies hold. For instance, a grammar may admit the incorrect sentence "every house that the man that built painted collapsed", by construing the subject of "built" as "the man" and its object as "the house". Linguists have come up with various schemes to prevent overgeneration. Of notable interest to the logic programming community are categorial type logic treatments of resource management (see [5] for an overview), which rely on modally decorated type assignments to obtain structural relaxation, or to impose structural constraints. Approaches based on hypothetical reasoning have also been studied. For instance, assumption grammars [2] have been used for postulating candidate co-specification entities when they appear, and confirming them later, when their long-distance correlating entity shows up. While more portable than categorial treatments, such approaches per se still do not solve the problem of overgeneration. In this article we first discuss the use of continuation-based assumptions not only for proposing and confirming co-specifiers, but also for enforcing the linguistic restrictions under which they can be used, including scope. We show a proof-of-concept toy grammar for relativization, and we discuss how compositional semantics, an often neglected aspect in the above-mentioned developments, can be conveniently preserved within our approach for the cases of sentence analysis and generation. We next present an alternative, more elegant approach: a higher-order treatment in λ-Prolog which produces semantics without unbound variables. This approach also maintains semantic compositionality and truly avoids overgeneration. We argue that it is preferable to other kinds of grammars dealing with long-distance dependencies.
2 Background and Motivation

Hypothetical reasoning extensions of logic programming and logic grammars have made it possible to give elegant, executable logic accounts of the types of higher-order functions often involved in language processing (e.g. [8]). Previous approaches to long-distance dependencies, for instance, had cluttered grammars with extra parameters that had to be carried around even by parts of the grammar that were not involved in the particular dependency treated. These parameters served, for instance, to relate the missing noun phrase (the "gap") at the end of the relative clause in "the house that Jack built" with its antecedent, "the house". Other than cluttering the grammar, such approaches made it easy to admit incorrect sentences as well as correct ones (by "finding" missing elements outside the structure where they belong, e.g. postulating a missing noun phrase outside the relative clause). Specialized logic grammars (e.g. [1]) solved the cluttering problem by admitting several left-hand-side symbols in a single rule, so that two distant constituents can be explicitly related within it. But they still suffer from overgeneration. Substructural logics treat such dependencies elegantly by temporarily augmenting the grammar with a resource that is available only where needed. For instance, upon encountering a relative pronoun, an empty noun phrase is made available for the duration of the relative clause's parse. However, overgeneration has remained a problem. Intuitionistic treatments of relativization (e.g. [7]) admit for instance (incorrect) sentences such as "the house that Jack built the house", where the relative clause's antecedent, "the house", is referred to twice (once by the relative pronoun and once by the second, overt occurrence of "the house"). This is because the intuitionistic assumption of a missing noun
phrase introduced by the pronoun is not required to be used. Other approaches solve such problems by resorting to linear implication [3, 2], so that introduced assumptions must be consumed. Yet overgeneration remains a problem for imbricated structures, as exemplified in our Introduction.
3 An Assumption Grammar Approach to Overgeneration

Assumption grammars [2] extend standard logic grammars with linear and intuitionistic implications ranging over the entire continuation. We will only use linear assumptions in this work. The linear assumption operator +/1 adds a clause usable at most once in subsequent proofs. Being usable at most once distinguishes affine linear logic from Girard's original framework, where linear assumptions must be used exactly once. The assumption vanishes on backtracking. The consumption of an assumption is denoted -/1. Terminal symbols are preceded by #, and there is no notational difference between rewriting and "if".
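To make the intended semantics concrete, here is a minimal sketch of these two operators in plain Prolog, for readers without BinProlog at hand. We emulate the assumption store with SWI-Prolog's backtrackable global variables (b_setval/2 and b_getval/2), which, unlike assert/retract, preserve variable bindings; this is essential later for linking gaps to their antecedents. The names init_assumptions, assume and consume are ours, not BinProlog's.

:- use_module(library(lists)).        % for select/3

init_assumptions :- b_setval(store, []).

assume(A) :-                          % emulates +A: assume A linearly
    b_getval(store, As),
    b_setval(store, [A|As]).          % undone automatically on backtracking

consume(A) :-                         % emulates -A: consume one matching
    b_getval(store, As),              % assumption, removing it from the store
    select(A, As, Rest),
    b_setval(store, Rest).

For example, after init_assumptions and assume(referent(X)), a later call consume(referent(Y)) unifies X with Y and removes the assumption, so a second consume(referent(_)) fails: the assumption is usable at most once.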
3.1 An Example: Relativization

We take the common view of relative clauses as sentences introduced by a relative pronoun, from which a noun phrase is missing. The place where the missing noun phrase should have occurred is called a gap, often written [ ], as in "The house that₁ Jack built [ ]₁". Indices show co-reference between the pronoun and the noun phrase it refers to. In terms of meaning, the gap can be represented by the same variable X introduced by the quantifier, so that we can obtain meaning representations such as the(X,house(X),built(jack,X)), for instance. With assumption grammars, we can keep in an assumption the variable representing a quantified noun phrase, as a potential referent for a missing noun phrase to be found later. Omitting all other variables for clarity (the complete grammar can be seen in Appendix 1), we might have:

np(X) :- det, noun(X),
         +referent(X),       % assumes X as a potential referent
         relative_clause.

relative_clause :- #that, s.
The parsing of a relative clause would then proceed like that of a complete sentence, except that upon expecting a noun phrase that is not found, we can consume the assumption just introduced, by adding as the last np rule:

np(X) :- -referent(X).
This has the effect of binding the variable introduced by the quantifier with the variable representing the missing noun phrase.

Making the Scope More Precise. In order to ensure that a relative clause's assumption is consumed inside that clause, we can restrict the scope of assumptions by defining, as in Appendix 1, a predicate used(X) which ensures that its argument X has been consumed, e.g.:

np(X) :- det, noun(X),
         +referent(X),       % assumes X as a potential referent
         relative_clause,
         used(referent(X)).
The stipulation "inside that clause", however, is too permissive with embedded relatives, as we have also seen through the example "every house that₁ the man that₂ [ ]₂ built [ ]₁ painted collapsed". The reason this sentence can be admitted is that, while both gaps introduced by the relative pronouns are used, and none is used outside a relative clause, the innermost relative clause can greedily consume both gaps. Requiring a relative clause to consume exactly one gap is no great help either, since the outermost relative clause must consume two (via one of them being consumed by its embedded relative clause). A grammatical example of an imbricated structure is "Every dancer that invited a partner that knows salsa excelled".

Incorporating Linguistic Constraints. One way of solving the problem is by incorporating linguistic constraints. For instance, Ross' 1974 Complex NP constraint, which says: "No element contained in a sentence dominated by an np with a lexical head can move out of that np by transformation" (in other words, you cannot relate such an element with an antecedent outside that np), can be applied to our example (see Figure 1).
s
  np
    det: every
    n(X): house
    rel
      pro(X): that
      s
        np*
          det: the
          n(Y): man
          rel
            pro(Y): that
            s*
              np(Z)
              vp
        vp
  vp

Figure 1: Partial analysis of "Every house that the man that built painted collapsed"
In this case, the constraint says that Z cannot be equated to X while postulating np(Z) as a missing noun phrase, because s* is a sentence dominated by an np with a lexical head (man), and X is outside the corresponding np (marked np* in the figure). This application of the complex NP constraint rules out the reading "X built Y". Now, Z can be unified with Y instead, but then, when we continue with the expansion of vp as shown in Figure 2, the complex NP constraint blocks W from being unified with X for the same reason (np(W) is another element contained in a sentence (s*) dominated by an np (np*) with a lexical head, and X is outside that np). So the reading "Y built X" is also ruled out.
vp
  v: built
  np(W)

Figure 2: Expansion of vp continued from Fig. 1

More general variants of this constraint must also look outside the particular structure being parsed at the time. Chomsky's subjacency constraint, for instance, identifies specific nodes as "bounding nodes" (a language-dependent notion), and stipulates that two constituents (e.g. an antecedent and a missing np) cannot be related if there are two bounding nodes in the derivation path between one and the other. For our example, since "s" (sentence) is a bounding node for English, and by the time we are inside the innermost relative clause we have crossed two s nodes after the introduction of X by "every house", X cannot serve as antecedent within the innermost embedded relative. This disallows both incorrect readings mentioned ("X built Y" and "Y built X").

Making Resource Manipulation Sensitive to Linguistic Structure. Continuing with our relativization example, the complex NP constraint means that missing noun phrases inside an embedded relative clause (say, the innermost one in Figure 1) cannot relate to antecedents outside the noun phrase immediately dominating that relative clause (np* in our example). So the variable X, introduced outside np* in Figure 1, is forbidden from relating to any missing np inside the innermost relative, thus disallowing the incorrect readings "X built Y" and "Y built X". All we have to do, then, is to collect variables introduced by noun phrases with relatives as we go along, and mark them as forbidden when we enter the analysis (or generation) of a new embedded relative clause. As well as disallowing incorrect sentences, this will disallow incorrect readings of correct sentences (e.g. "every house that the man that loves Susan built collapsed" cannot have a reading in which every house loves Susan and the man built himself!). Rather than waiting for the constraint to disallow such a reading, however, it is more expedient to choose for missing noun phrases, among the list of candidate antecedent variables, the most recently introduced one among all those still available and not forbidden. Appendix 1 implements this approach for a toy grammar.

A similar technique can be used to implement subjacency: we can, for instance, assume +bounding1(X) upon encountering the first bounding node after a given antecedent X has been introduced, assume +bounding2(X) upon encountering a second bounding node after the antecedent's introduction, and, when trying to consume a missing noun phrase's variable as this X, fail if -bounding2(X) succeeds.
Using subjacency would, of course, also allow us to avoid overgenerating with structures other than relative clauses. For instance, the incorrect interrogative sentence "Who do you wonder why John likes?" is ruled out because two "s" nodes intervene between "Who" and the missing noun phrase after "likes".
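A minimal sketch of this bookkeeping, using the assume/consume emulation sketched at the start of this section (the predicate names bounding1 and bounding2 follow the text; cross_s_node/1 and antecedent_ok/1 are our own names, and the sketch only tracks up to the two crossings that matter):

cross_s_node(X) :-                    % called when the parse enters an s node
    (   consume(bounding1(X))         % second s node crossed since X appeared
    ->  assume(bounding2(X))
    ;   assume(bounding1(X))          % first s node crossed since X appeared
    ).

antecedent_ok(X) :-                   % X may fill a gap only if fewer than
    \+ consume(bounding2(X)).         % two bounding nodes intervene

A grammar rule for s would call cross_s_node/1 for each antecedent still available, and the rule that consumes a missing noun phrase would guard its choice of antecedent with antecedent_ok/1.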
3.2 Semantic Compositionality

Compositional semantics is often not adequately covered in linear/intuitionistic assumption-based analyses. Our grammar of Appendix 1 compositionally builds meaning representations while using the techniques proposed above. We choose a λ-calculus based representation, in which each constituent's representation is built up by applying one lambda expression to another. For instance, given the following representations:

John    λP.@(P,john)
paints  λX.paints(X)
every   λP1.λP2.every(X,implies(@(P1,X),@(P2,X)))
man     λX.man(X)

we can build paints(john) as the meaning representation for "John paints", by applying the noun phrase's representation (that of "John") to the verb phrase's (that of "paints"). Since we are using Prolog, we must transform these functional representations into relational ones, e.g.:

proper_name(P\Q):- #john, @(P,john,Q).
det(P1\P2\every(X,Q1 => Q2)):- #every, @(P1,X,Q1), @(P2,X,Q2).
The calls to @ use the following Prolog implementation of beta-reduction:

@(X\P, X, P).

BinProlog (http://www.cs.unt.edu/BinProlog) supports assumption grammars, but does not have true lambda expressions. We therefore represent λX.P by the first-order term X\P. We similarly represent universally quantified variables, as in every(X,P). Ideally, we would like a platform supporting both assumption grammars and true lambda expressions.

Building the representation of a relative clause is not significantly different from building that of an ordinary sentence: the verb phrase's representation is applied to the noun phrase's, except that one of the noun phrases in the relative clause will be implicit. This is captured by the rule:

np(NP):- det(D), n(X\P),
         add_referent(X),
         relclause(P1),
         close_scope(X),
         @(D,X\and(P,P1),NP).
Applying the determiner's meaning to λX.and(P,P1), where P is man(X) and P1 is the relative clause's meaning (say, paints(X)), yields the value of NP as:

λP2'.every(X,implies(@(λX.and(man(X),paints(X)),X),@(P2',X)))
If we now apply a verb phrase's representation to this noun phrase's representation, say that of sings, we get:

every(X,implies(and(man(X),paints(X)),sings(X)))
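The whole pipeline can be reproduced in a few lines of self-contained Prolog. The following sketch (demo/1 is our own name, and we compose the pieces by hand rather than through the grammar) builds the simpler meaning of "every man sings" with the same apply/@ encoding:

:- op(300, xfy, \).

apply(X\P, X, P).                % first-order beta-reduction, as above

demo(S) :-
    N = X1\man(X1),              % noun "man"
    V = X2\sings(X2),            % verb "sings"
    apply(N, Y, Q1),             % Q1 = man(Y)
    apply(V, Y, Q2),             % Q2 = sings(Y)
    S = every(Y, implies(Q1, Q2)).

% ?- demo(S).
% S = every(Y, implies(man(Y), sings(Y))).

Adding the relative clause "that paints" amounts to conjoining paints(Y) into the restriction, exactly as in the term above.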
Appendix 1 shows several sample tests as well as the complete grammar. Of course, we could have chosen any other kind of meaning representation while combining semantic compositionality with our techniques for linguistically sensitive resource management. The grammar presented here is merely a proof of concept.
4 A Higher Order Approach with Relativization

Finally, we present an approach to the relative clause problem using higher-order logic. This approach is based on the basic "gap threading" scheme of Pereira and Shieber [9], but uses the higher-order meaning terms themselves to direct the parse, thus avoiding complex additional parameters. The higher-order meaning terms can be "relativized" in a natural way to indicate the presence of gaps, and thus contain all the information needed in order to parse clauses correctly. This fact provides further support to the common observation that semantic analysis must be an integral part of syntactic analysis. We use the higher-order logic programming language Lambda Prolog for this task. Lambda Prolog supports the higher-order features of popular higher-order functional languages such as ML, but retains the useful backtracking features of Prolog. In contrast to BinProlog, which supports assumption grammars but does not have true lambda expressions, Lambda Prolog has true lambda expressions, but does not support assumption grammars. However, assumption grammars are not necessary for the approach we take in this section. The version of Lambda Prolog we used (Terzo) also does not have explicit support for DCGs, so we use explicit list-consuming predicates. The grammar discussed in this section is in Appendix 2.

Scope of Relative Clauses. Consider the sentence

The dog that₁ Jack owns [ ]₁ chased the cat that₂ [ ]₂ slept.

where "missing noun phrases" are denoted by [ ], and indices show the relationship between missing noun phrases and their corresponding relative pronouns. This example and similar examples may lead us to believe that every relative pronoun corresponds to the next missing noun phrase. This is the explicit or implicit assumption of many incomplete approaches to the relative clause problem. However, the following example shows that the assumption does not always hold for imbricated relative clauses:

The cat that₁ the dog that₂ Jack owns [ ]₂ chased [ ]₁ slept.

It is more accurate to say that a relative pronoun has a certain scope, namely the rest of its relative clause. This scope extends down the sentence from just after the pronoun to the end of the clause (marked here with angle brackets):

The dog that₁ ⟨Jack owns [ ]₁⟩ chased the cat that₂ ⟨[ ]₂ slept⟩.
The cat that₁ ⟨the dog that₂ ⟨Jack owns [ ]₂⟩ chased [ ]₁⟩ slept.
It is also the case that, in English at least, a relative clause contains exactly one missing noun phrase that is not within the scope of an interior relative clause. Observations about the scope of relative clauses form the basis for the "gap threading" approach to parsing, pioneered by Pereira and Shieber [9]. In gap threading approaches, we must pass an extra flag (or a difference list of flags) to each rule of our grammar in order to indicate whether or not the construct can have a missing noun phrase. In the higher-order approach described here, such flags are unnecessary.
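For contrast, here is what the flag-passing looks like in a classic gap-threading grammar; this is our own DCG reconstruction of the Pereira-Shieber idea, not their code. A gap flag is threaded through every rule as a pair of arguments, and gap(np) licenses exactly one empty noun phrase:

s(G0, G)   --> np(G0, G1), vp(G1, G).
np(G, G)   --> [jack].
np(G, G)   --> [the], noun.
np(gap(np), nogap) --> [].                 % realize the pending gap as empty
vp(G0, G)  --> tv, np(G0, G).              % transitive verb + object
vp(G, G)   --> iv.
rel        --> [that], s(gap(np), nogap).  % the gap must be used up

noun --> [dog] ; [cat].
tv   --> [owns] ; [chased].
iv   --> [slept].

Here "that jack owns" parses (the object position consumes the gap), while "that jack owns the dog" does not, because the gap flag reaches the end of the clause unconsumed. It is precisely these threaded arguments that the higher-order approach below renders unnecessary.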
4.1 Relativized Meanings of Sentences

One of our goals in this work is to maintain compositional semantics, that is, to ensure that the meaning of a grammatical construct is composed from the meanings of its constituent constructs. This goal leads us to a view of constructs as being either "normal" or "relativized" (having missing constituents). In a higher-order treatment, the natural way to represent relativization is to use lambda abstraction. This representation turns out to allow us not only to maintain compositionality, but also to avoid overgeneration.

Consider the sentence "Mary slept." The natural representation of the meaning of this sentence in higher-order logic is the assertion (slept mary), where mary is an individual and slept is a predicate (a function from individuals to assertions).¹ We have established that we want the relative clause "that slept" to be parsed as the relative pronoun "that" followed by a sentence with a missing noun phrase ("slept"). We need this partial sentence to return an appropriate meaning on being parsed; since we know we will later fill in the missing noun phrase, the most appropriate meaning term in a higher-order logic approach is a sentence with a variable in place of the noun phrase, all enclosed within a lambda abstraction of that variable: x\ (slept x). For the purposes of this paper, we call the complete sentence a "normal" sentence and the partial sentence a "relativized" sentence, to indicate that it is a sentence relative to the noun phrase which completes it.

As another example, we would like the "normal" sentence "Mary chased John" to have the meaning (chased mary john), while the "relativized" sentence "chased John" should have the meaning x\ (chased x john). Thus, the parse of a true sentence returns an assertion (that is, in Lambda Prolog, a term of type o), while the parse of a relativized sentence returns a function from individuals to assertions (that is, a term of type i -> o). This would seem to be a problem, since in higher-order logic every parameter to a predicate must have a fixed type. However, the type structure of Lambda Prolog allows for polymorphic constructors. We therefore define two of them in order to ensure that grammar rules can always return the same type for meaning terms. The constructor normal takes a term of any type A and returns a term of the same type. Its companion, the constructor reld (short for "relativized"), takes a function from individuals to terms of any type A, and returns a term of type A. Thus the meaning term for "Mary chased John", in this scheme, is (normal (chased mary john)), and the meaning term for "chased John" is (reld x\ (chased x john)). Both are assertions, that is, Lambda Prolog terms of type o.

¹ Recall that in higher-order logic, functions and predicates are separated from their arguments by spaces; thus (slept mary) instead of slept(mary).
Other Relativized Constructs. Because we are able to parse sentences which have been relativized, we must be able to parse the grammatical constructs within them which have been relativized. Thus, a relativized verb phrase is a verb phrase with a missing noun phrase. Where a normal verb phrase returns a predicate (a function from individuals to assertions), a relativized verb phrase returns a function from individuals to predicates. In particular, a relativized noun phrase is a noun phrase with a missing noun phrase; for example, the missing noun phrase itself. Where a normal noun phrase returns a function from predicates to assertions, such as y\ (y mary), the missing noun phrase returns a function from an individual (standing for the referent) to a function from predicates to assertions. The actual term returned is always the same: (reld x\ y\ (y x)).

The key to ensuring that rules are not made unnecessarily complex by this scheme is the general predicate prapp, whose name is short for "possibly relativized application". This is the replacement for the usual function application construct which we would use to compose meaning terms. For instance, in a simple grammar, the meaning for a sentence would be composed by applying the meaning of its subject to the meaning of its predicate:

s S --> np NP, vp VP, {S = (NP VP)}.
Because in our scheme either NP or VP may be relativized, we instead call prapp to compose the meaning of the sentence:

s S --> np NP, vp VP, {prapp NP VP S}.

prapp simply applies its first argument to its second to yield its third, but pays attention to whether its arguments are normal or reld. If one of its arguments is relativized, the relativization is "lifted" outside the application. prapp is called in three rules in the example grammar to compose meanings.

The particular scheme given in Appendix 2 handles only single gaps. As Hodas [3] points out, this is inadequate as a basis for parsing sentences like "Which violins are these sonatas difficult to play [ ] on [ ]?", in which two gaps must be matched in the final relativized sentence. However, the scheme can be generalized by nesting one reld structure inside another (with a normal structure at the innermost level) and taking account of this in the definition of prapp. For clarity we have presented only one-level relativization here.
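As a worked instance of this lifting with the Appendix 2 grammar: in "every man that sees mary", the relativized sentence "sees mary" is parsed with the missing subject np returning (reld x\ y\ (y x)) and the verb phrase returning (normal y\ (sees y mary)); the second prapp clause then lifts the relativization outside the application, yielding (reld x\ (sees x mary)) after beta-conversion.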
4.2 Avoiding Overgeneration

The technique of returning the "natural" higher-order term as the meaning of a construct or relativized construct also has the effect of allowing us to avoid overgeneration. This can be checked informally by running the goal (words Words, sentence Words Meaning), which returns valid sentences to a depth of at least three levels of imbricated relative clauses. The system has also been tested on a variety of valid sentences, to see whether it undergenerates (it does not). Overgeneration is avoided essentially by the correspondence between those meaning terms that are generated at a lower level and those that are expected at a higher level of the analysis. Meaning terms have either a normal wrapper or a reld wrapper, from the level just above that of
individual words. Correspondingly, high-level rules always require a normal meaning term where a normal construct is expected, and a reld meaning term where a relativized construct is expected. A meaning term in this grammar can therefore be said to give three pieces of information: first, whether it is normal or relativized; second, where the missing constituent occurs in the meaning term (i.e., the location(s) of the lambda-abstracted variable); and third, how to substitute the referent for the missing constituent (implicit in the automatic beta-contraction of Lambda Prolog).

Agreement. Agreement in our example grammar is attained through the use of extra arguments. For instance, a declarative sentence requires a finite verb (e.g. chases, saw), while a yes-no question requires an infinitive (e.g. chase, see). The form of verb required is given by a simple flag, passed as an argument. This approach to agreement has been implemented for simplicity, since agreement is not the main focus of our paper. In general, it may clutter the semantics with agreement parameters. It may be preferable to achieve agreement instead with other methods, such as those involving assumption grammars, linear implication, or static discontinuity grammars.
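As a small usage illustration of the Appendix 2 grammar, the following query and answer are what we would expect at Terzo's top level (the answer is shown beta-normalized):

?- test1 ("mary"::"runs"::nil) M.

M = decl (runs mary).

Here np returns (normal pred\(pred mary)), vp returns (normal runs), and the first prapp clause composes them into (normal (runs mary)).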
5 Comparison and Related Work

Both approaches presented here address the same problem, namely that of precluding overgeneration while maintaining semantic compositionality. Our λ-Prolog approach, in our view the more elegant of the two, is preferable to other kinds of grammars dealing with long-distance dependencies:
- Grammars which pass ground first-order meaning terms containing all three pieces of information. These grammars require some extralogical symbol generation facility in order to ensure uniqueness of referents, and require a re-implementation of the substitution operation in order to replace a missing constituent by its referent. All this is handled automatically and logically by Lambda Prolog.

- Grammars which pass first-order meaning terms with free variables, such as the scheme introduced in Section 3. While not technically overgenerating, as pointed out earlier, such schemes require the explicit implementation of linguistic constraints in order to avoid returning meaning terms which do not behave properly upon instantiation. For instance, were it not for the complex NP constraint implemented, if we first got the semantics M for a grammatical sentence such as "every man that saw every man that john saw sings", and then used the grammar to generate a sentence with that semantics, we would get both correct sentences with different semantics (e.g. "every man that saw every man that saw john sings") and nonsensical sentences (e.g. "every man that john saw every man that saw sings").
The λ-Prolog grammar is also preferable in some ways to that of Hodas [3], because it assumes only higher-order logic and does not require linear implication, and because (as our other approach does too) it returns meaning terms rather than only parse trees. Moreover, Hodas' grammar would return parse trees which do not distinguish between the multiple gaps in an imbricated relative clause, requiring an extra parse tree analysis to do so. However, the two schemes are not really comparable, because Hodas also considers missing prepositional phrases (e.g. for "where"-type questions), and multiple gaps in unnested relativized sentences.
On the other hand, our λ-Prolog approach would lead to a proliferation of our polymorphic constructors when extended to accommodate other long-distance dependency phenomena (e.g. adding interrogative clauses would necessitate an "intd" constructor parallel to our "reld" constructor, and so on). Having the best of both worlds would seem the ideal solution: a λ-Prolog platform extended with assumptions ranging over the current continuation, as in Assumption Grammars, would allow us both to exploit lambda abstraction to avoid having to pass extra flags and parameters, and to straightforwardly implement linguistic constraints such as subjacency, which would cover several cases of long-distance dependencies at once, without the need for specialized extra constructors.
6 Conclusion

We have shown how a simple extension to logic programming (continuation-based hypothetical reasoning) allows us to manage resource manipulation in a way that is sensitive to linguistic structure, while avoiding clutter in the grammar and at the same time allowing easy access to the variables needed for compositional semantics. We have also shown an alternative, more elegant, higher-order treatment which also avoids overgeneration while making compositional semantics a focus. This approach is novel in that it uses the higher-order meaning terms themselves to direct the parse, thus avoiding complex additional parameters; does not necessitate extra flags as in the traditional gap threading approach; and uses lambda abstraction both to avoid overgeneration and to maintain semantic compositionality. With this work we hope to stimulate further research into grammatical platforms supporting true lambda expressions, perhaps in combination with continuation-based assumption reasoning.
7 Acknowledgements

Thanks are due to the anonymous referees for useful comments on this article's first draft. This research was made possible by grants from NSERC and the NSF.
References

[1] A. Colmerauer. Metamorphosis Grammars. Lecture Notes in Computer Science 63, pages 133-189, Springer-Verlag, 1978.

[2] V. Dahl, P. Tarau and R. Li. Assumption Grammars for Processing Natural Language. In Proceedings of the International Conference on Logic Programming '97, pages 256-270, 1997.

[3] J. Hodas. Specifying Filler-Gap Dependency Parsers in a Linear-Logic Programming Language. In Krzysztof Apt, editor, Proceedings of the Joint International Conference and Symposium on Logic Programming, pages 622-636, MIT Press, Cambridge, Massachusetts, 1992.

[4] R. A. Kowalski. Logic for Problem Solving. North-Holland, 1979.

[5] M. Moortgat. Categorial Type Logics. In J. van Benthem and A. ter Meulen, editors, Handbook of Logic and Language, pages 93-177, North-Holland, 1997.

[6] R. Pareschi. A Definite Clause Version of Categorial Grammar. In Proceedings of the 26th Annual Meeting of the Association for Computational Linguistics, pages 270-277, Munich, 1987.

[7] R. Pareschi and D. Miller. Extending Definite Clause Grammars with Scoping Constructs. In D. H. D. Warren and P. Szeredi, editors, Proceedings of the 1990 International Conference on Logic Programming, pages 373-386, MIT Press, 1990.

[8] F. C. N. Pereira. Semantic Interpretation as Higher-Order Deduction. In Proceedings of the Second European Workshop on Logics and AI, Springer-Verlag, 1990.

[9] F. Pereira and S. Shieber. Prolog and Natural Language Analysis. CSLI Lecture Notes Number 10, Stanford, California, 1987.
Appendix 1: Implementing Linguistic Constraints Through Continuation-Ranging Assumptions

N.B. The BinProlog predicates dcg_def/1 and dcg_val/1 respectively give input to the grammar and check which part of the input has not yet been used (thus, to analyze Ws from s/1 we write: dcg_def(Ws), s(S), dcg_val([])).

:- op(300,xfy,\).

apply(X\P,X,P).

% Lexicon:
pn(P\Q):- #john, apply(P,john,Q).
pn(P\Q):- #mary, apply(P,mary,Q).
pn(P\Q):- #prolog, apply(P,prolog,Q).
n(X\man(X)):- #man.
n(X\lesson(X)):- #lesson.
det(P1\P2\every(X,Q1 => Q2)):- #every, apply(P1,X,Q1), apply(P2,X,Q2).
det(P1\P2\exists(X,and(Q1,Q2))):- #a, apply(P1,X,Q1), apply(P2,X,Q2).

vi(P\Q):- #sings, apply(P,X\sings(X),Q).
vi(P\Q):- #paints, apply(P,X\paints(X),Q).

vt(P1\P2\Q2):- #saw, apply(P2,X\Q1,Q2), apply(P1,Y\saw(X,Y),Q1).
vt(P1\P2\Q2):- #learnt, apply(P2,X\Q1,Q2), apply(P1,Y\learnt(X,Y),Q1).

% Syntax:
s(S):- np(NP), vp(VP), apply(VP,NP,S), all_consumed.
np(NP):- pn(NP).
np(P\Q):- get_ant(X),                % gets X from candidate antecedents
          apply(P,X,Q).
np(NP):- det(D), n(X\P), apply(D,X\P,NP).
np(NP):- det(D), n(X\P),
         add_referent(X),
         relclause(P1),
         close_scope(X),
         apply(D,X\and(P,P1),NP).

get_ant(X):-
    -rel([X|List]),                  % retrieve most recently assumed antecedent
    allowed(X),                      % within the present embedded sentence
    +rel(List).                      % make remaining list available

add_referent(X):-
    -rel(L), +rel([X|L]),            % adds new referent X, LIFO fashion
    -forbid(Old), +forbid([L|Old]).  % forbid antecedents in L

close_scope(X):-
    used(X),                         % if X is not consumed, report overgeneration
    -forbid([L|Old]), +forbid(Old).  % delete L, which is local to the current np

allowed(X):-
    -forbid([H|T]),                  % get list of variables forbidden at this level
    \+ member(X,H),                  % ensure X is not one of them
    +forbid([H|T]).                  % restore info on forbidden variables

used(X):- -rel([X|_]), !,            % X is unconsumed after the rel. clause
          report, fail.
used(_).

report:- nl, write('Relative clause should have an implicit noun phrase.'), nl.

relclause(Rel):- #that, s(Rel).

vp(VP):- vi(VP).
vp(VP):- vt(V), np(NP), apply(V,NP,VP).

% Utility predicates:
words([]).
words([_|Ws]):- words(Ws).

test(Ws,S):- +rel([]), +forbid([]), words(Ws),
             dcg_def(Ws), s(S), dcg_val([]), -forbid([]).

all_consumed:- \+ -(missing(_)).

test:- +rel([]), +forbid([]), sentence(X), dcg_def(X), s(S),
       dcg_val([]), write(S), -forbid([]), nl.
A few sample tests:
Ws = [john,saw,every,man,that,saw,every,man,that,saw,john],
S = every(_x2645, and(man(_x2645),
      every(_x3068, and(man(_x3068),saw(_x3068,john)) => saw(_x2645,_x3068)))
    => saw(john,_x2645)) ;

Ws = [john,saw,every,man,that,saw,every,man,that,saw,john],
S = every(_x2645, and(man(_x2645),
      every(_x3068, and(man(_x3068),saw(_x3068,john)) => saw(_x2645,_x3068)))
    => saw(john,_x2645)) ;

Ws = [every,man,that,saw,every,man,that,saw,every,man,that,saw,john,saw,every,man],
S = every(_x2635, and(man(_x2635),
      every(_x3035, and(man(_x3035),
        every(_x3458, and(man(_x3458),saw(_x3458,john)) => saw(_x3035,_x3458)))
      => saw(_x2635,_x3035)))
    => every(_x4110, man(_x4110) => saw(_x2635,_x4110))) ;

Ws = [every,man,that,every,man,that,saw,sings,sings],
no

Ws = [every,man,that,john,saw,john,sings],
Relative clause should have an implicit noun phrase.
Appendix 2: Lambda Prolog Formulation of Relativization

Below is a Terzo Lambda Prolog program for the relativization problem.

module Relativized.

kind i        type.
kind meaning  type.

type mary  i.
type man   i -> o.
type runs  i -> o.
type sees  i -> i -> o.

kind verb_type   type.
type finite_verb verb_type.
type inf_verb    verb_type.

type s         o -> verb_type -> (list string) -> (list string) -> o.
type relclause o -> (list string) -> (list string) -> o.
type vp        (i -> o) -> verb_type -> (list string) -> (list string) -> o.
type np        ((i -> o) -> o) -> (list string) -> (list string) -> o.
type n         (i -> o) -> (list string) -> (list string) -> o.
type vt        (i -> i -> o) -> verb_type -> (list string) -> (list string) -> o.
type vi        (i -> o) -> verb_type -> (list string) -> (list string) -> o.
type pn        i -> (list string) -> (list string) -> o.
type inv_aux   (list string) -> (list string) -> o.

type decl o -> meaning.
type ynq  o -> meaning.
type whq  (i -> o) -> meaning.

type sentence (list string) -> meaning -> o.

type normal A -> A.
type reld   (i -> A) -> A.

type prapp (A -> B) -> A -> B -> o.
type words (list string) -> o.
type test1 (list string) -> meaning -> o.
sentence Words (decl Decl) :-
    s (normal Decl) finite_verb Words nil.
sentence Words (ynq Question) :-
    % Auxiliary verb + sentence with infinitive main verb
    inv_aux Words W2,
    s (normal Question) inf_verb W2 nil.
sentence Words (whq Question) :-
    % "who" + verb phrase with finite verb
    Words = ("who"::W2),
    vp (normal Question) finite_verb W2 nil.
sentence Words (whq Question) :-
    % "who" + aux verb + sentence with infinitive verb and
    % non-missing subject, but missing noun phrase
    Words = ("who"::W2),
    inv_aux W2 W3,
    np NP W3 W4,
    vp (reld VP) inf_verb W4 nil,
    prapp NP (reld VP) (reld Question).

s S Verb_type W1 W3 :-
    np NP W1 W2,
    vp VP Verb_type W2 W3,
    prapp NP VP S.
np NP W1 W2 :-
    pn PN W1 W2,                      % proper noun
    NP = (normal pred\(pred PN)).
np (reld x\ y\ (y x)) W1 W1.          % missing np
np NP W1 W4 :-
    W1 = ("every"::W2),               % determiner +
    n N W2 W3,                        % noun +
    relclause (reld LxS) W3 W4,       % relative clause
    NP = (normal pred\(pi x\((N x), (LxS x) => (pred x)))).

relclause (reld LxS) W1 W3 :-
    W1 = ("that"::W2),                % relative pronoun
    s (reld LxS) finite_verb W2 W3.   % relativized sentence with finite verb
vp VP Verb_type W1 W2 :-
    vi VI Verb_type W1 W2,            % intransitive verb
    VP = (normal VI).
vp VP Verb_type W1 W3 :-
    vt VT Verb_type W1 W2,            % transitive verb +
    np NP W2 W3,                      % noun phrase
    prapp (normal (x\ y\ (x (VT y)))) NP VP.

% prapp X Y Z: Z is the application of X to Y, where either X or Y
% might be relativized, in which case so is Z
prapp (normal X) (normal Y) (normal (X Y)).
prapp (reld LxX) (normal Y) (reld x\ ((LxX x) Y)).
prapp (normal X) (reld LyY) (reld y\ (X (LyY y))).

% auxiliary verb for inverted sentences
inv_aux ("does"::W2) W2.

pn mary ("mary"::W2) W2.
n man ("man"::W2) W2.
vi runs finite_verb ("runs"::W2) W2.
vi runs inf_verb ("run"::W2) W2.
vt sees finite_verb ("sees"::W2) W2.
vt sees inf_verb ("see"::W2) W2.

% predicates useful for testing
words nil.
words (_::Words) :- words Words.

test1 Words Meaning :-
    words Words,
    sentence Words Meaning.