A Computational Account of Some Constraints on Language - CiteSeerX

3 downloads 0 Views 1MB Size Report
Constraints on Language. Mitchell Marcus. MIT Artificial intelligence Laboratory. In a series of papers over the last several years,. Noam Chomsky has argued for ...
A Computational Account of Some C o n s t r a i n t s on Language Mitchell Marcus MIT Artificial intelligence Laboratory

(jrammarians. The account which I will present b e l o w is crucially linked to Chomsky's, however, in that t r a c e t h e o r y is at the heart of this account.)

In a series of papers over the last several years, Noam Chomsky has argued for several specific properties of language which he claims are universal to all human languages [Chomsky 78, 76, 76]. These properties, which form one of the cornerstones of his current linguistic theory, are embodied in a set of constraints on language, a s e t of r e s t r i c t i o n s on the operation of rules of grammar.

Because of space limitations, this paper deals only with those grammatical processes characterized by the c o m p e t e n c e rule "MOVE NP"; the constraints imposed by the grammar interl)reter upon those processes c h a r a c t e r i z e d by the rule "MOVE WH-phrase" are discussed at length in [Marcus 77] where I show that the behavior c h a r a c t e r i z e d by Ross's Complex NP Constraint [Ross 6 7 ] itself follows directly from the structure of the grammar i n t e r p r e t e r for rather different reasons than the behavior considered in this section. Also because of s p a c e limitations, I will not attempt to show that the t w o constraints I will deal with here necessarily follow from t h e grammar interpreter, but rather only .that they n a t u r a l l y follow from the interpreter, in particular from a simple, natural formulation of a rule for passivization which itself d e p e n d s heavily upon the structure of the i n t e r p r e t e r . Again, n e c e s s i t y is argued for in detail in [Marcus 77].

This paper will outline two arguments p r e s e n t e d at length in [Marcus 7 7 ] demonstrating that important subc a s e s of t w o of these constraints, the Subjacency Principle and the Specified Subject Constraint, fall out naturally from the structure "of a grammar interpreter called PARSIFAL, w h o s e structure is in turn based upon the hypothesis that a natural language parser needn't simulate a nondeterministic machine. This "Determinism Hypothesis" claims that natural language can be parsed by a computationally simple mechanism that uses neither backtracking nor pseudoparallelism, and in which all grammatical structure c r e a t e d by the parser is "indelible" in that it must all be output as part of the structural analysis of the parser's input. Once built, no grammatical structure can be discarded or a l t e r e d in the course of the parsing process.

This pal)er will first outline the structure of the grammar interpreter, then present the PASSIVE rule, and tho.n finally show how Chomsky% constraints "fall o u t " of the formulation of PASSIVE.

In particular, this paper will show that the s t r u c t u r e of the grammar interpreter constrains its operation in such a w a y that, by and large, grammar rules cannot parse s e n t e n c e s which violate either .the Specified S u b j e c t Constraint or the Subjacency. Principle. The component of the grammar interpreter upon which this result principally d e p e n d s is motivated by the Determinism Hypothesis; this result thus provides indirect evidence for the hypothesis. This result also depends upon the use within a compLitational framework of the closely related notions of annotated surface structure and trace theory, which also d e r i v e from Chomsky's recent work.

Before proceeding with the body of this paper, t w o o t h e r important properties of the parser should be mentioned which will not be discussed here. Both are discussed at lengtll in [Marcus 77]; the first is s k e t c h e d as well in [Marcus 781: 1) Simple rules of grammar can by w r i t t e n for this interpreter which elegantly capture the significant generalizations behind not only passivization, but also such constructions as y e s / n o questions, imperatives, and s e n t e n c e s with existential there. These rules are reminiscent of the sorts of rules proposed within the framework of the theory of generative grammar, d e s p i t e the f a c t that the rules presented here must recover underlying s t r u c t u r e given only the terminal string of the surface form of the sentence.

(It should be noted that these constraints are far from universally accepted. They are currently the source of much controversy; for various critiques of Chomsky's position see [Postal 74; Bresnan 76]. However, w h a t is p r e s e n t e d below does not argue for these constraints, per se, but rather provides a different sort of explanation, b a s e d on a processing model, of why the sorts of s e n t e n c e s which these constraints forbid are bad. While the e x a c t formulation of these constraints is controversial, the f a c t t h a t some s e t of constraints is needed to account for this range of data is generally agreed upon by most g e n e r a t i v e

2)The grammar interpreter provides a simple e×planation for the difficulty caused by "garden p a t h " s e n t e n c e s , such as "The cotton clothing is made of grows in Mississippi." Rules can be written for this i n t e r p r e t e r t o

236

some of t h e c o n t e x t surrounding a given constituent, it does n o t allow a r b i t r a r y Iook-allead. The length of the b u f f e r is s t r i c t l y limited; in the version of the parser p r e s e n t e d here, t h e b u f f e r has only three cells. (The buffer must be e x t e n d e d to f i v e cells to allow the parser to build NPs in a nlanner which is t r a n s p a r e n t to the "clause l e v e l " grammar rules which will be p r e s e n t e d in this paper. This e x t e n d e d p a r s e r still has a window of only t h r e e cells, but the e f f e c t i v e s t a r t of the buffer can be changed through an " a t t e n t i o n ~hifting mechanism" w h e n e v e r the p a r s e r is building an NP. I n e f f e c t , this e x t e n d e d p a r s e r has t w o " l o g i c a l " buffers of length three, one for NPs and a n o t h e r for clauses, witll t h e s e t w o buffers implemented b y allowing an o v e r l a p in one larger buffer. For details, see [ M a r c u s

r e s o l v e local structural ambiguities which might seem t o r e q u i r e nondeterministic parsing; the power of such rules, h o w e v e r , depends upon a parameter of the mechanism. Most s t r u c t u r a l ambiguities can be resolved, given an a p p r o p r i a t e s e t t i n g of this parameter, but those which t y p i c a l l y cause garden paths cannot.

The S t r u c t u r e o f P A R S I F A L PARSIFAL maintains t w o major data s t r u c t u r e s : a pushdown s t a c k of incomplete donstituents called the active node stack, and a small t h r e e - p l a c e constituent buffer which c o n t a i n s c o n s t i t u e n t s which are complete, but w h o s e higher l e v e l grammatical function is as y e t uncertain.

77].)

Figure 1 below shows a snapshot of the p a r s e r ' s d a t a s t r u c t u r e s t a k e n while parsing the s e n t e n c e "John should h a v e scheduled -the meeting.". Note t h a t the a c t i v e node s t a c k in shown growing downward, so t h a t the s t r u c t u r e of the stack reflects the structure of t h e emerging parse tree. At the bottom of the s t a c k is an a u x i l i a r y node labelled with the f e a t u r e s modal, past, etc., which has as a daughter the modal "should". Above t h e b o t t o m of t h e s t a c k is an S node with an NP as a daughter, dominating the word "John". There are t w o words in t i m b u f f e r , t h e v e r b " h a v e " in the first buffer cell and the w o r d " s c h e d u l e d " in the second. The t w o words " t h e m e e t i n g " h a v e not y e t come to the attention of the parser. (The structures of form "(PARSE-AUX CPOOL)" and the like will be e x p l a i n e d below.)

Note t l l a t each of the three cells in the b u f f e r can hold a grammatical consHtuent of any t y p e , .where a c o n s t i t u e n t is any t r e e t h a t the parser has c o n s t r u c t e d under a single root node. The size of the s t r u c t u r e u n d e r n e a t h the node is immaterial; both " t h a t " and " t h a t t h e big green cookie monster's toe got s t u b b e d " are perfectly good constituents once the parser has c o n s t r u c t e d a subordinate clause from the l a t t e r phrase, The c o n s t i t u e n t buffer and the a c t i v e node s t a c k are a c t e d upon by a grammar which is made up of p a t t e r n / a c t i o n rules; this grammar can be v i e w e d as an a u g m e n t e d form o f Newell and Simon's production s y s t e m s [ N e w e l l & Simon 72]. Each rule is made up of a p a t t e r n , which is matched against some subset of the c o n s t i t u e n t s of the b u f f e r and the accessible nodes in the a c t i v e node s t a c k ( a b o u t which more will be said below), and an action, a s e q u e n c e of operations which acts on t h e s e c o n s t i t u e n t s . Each rule is assigned a numerical priority, which t h e grammar i n t e r p r e t e r uses to arbitrate simultaneous matches.

The A c t i v e Node Stack $1 (S DECL MAJOR S) / (PARSE-AUX CPOOL) NP : (John) AUX1 (MODAL PAST VSPL AUX) / (BUILD-AUX) MODAL : (should)

1 ;

2:

The Buffer WORD8 (~HAVE VERB TNSLESS AUXVERB PRES V-eS) : ( h a v e ) WORD4 ("SCHEDULE COMP-OBJ VERB INF-OBJ V-eS ED=EN EN PART PAST ED) : ( s c h e d u l e d )

Y e t unseen words:

the m e e t i n g .

Figure 1 - PARSIFAL's t w o major data structures. The c o n s t i t u e n t bt,ffer is the heart of the grammar i n t e r p r e t e r ; it is the central f e a t u r e t h a t distinguishes this p a r s e r from all others. The words that make up the p a r s e r ' s input first come to its attention when t h e y a p p e a r at t h e end of this buffer a f t e r morphological analysis. Triggered b y t h e words at the beginning of the buffer, the parser may d e c i d e to c r e a t e a n e w grammatical constituent, c r e a t e a n e w node at the bottom of the a c t i v e node stack, and t h e n begin to a t t a c h the constituents in the b u f f e r to it. A f t e r this n e w c o n s t i t u e n t is completed, the parser will then pop t h e f~ew c o n s t i t u e n t from the a c t i v e node s t a c k ; if t h e grammatical role of this larger structure is as y e t u n d e t e r m i n e d , the parser will insert it into the first cell of the buffer. The parser is free to examine the c o n s t i t u e n t s in the buffer, to a c t upon them, and to o t h e r w i s e use the b u f f e r as a w o r k s p a c e .

The grammar as a whole is s t r u c t u r e d into rule packets, clumps of grammar rules which can be a c t i v a t e d and d e a c ( i v a t e d as a group; the grammar i n t e r p r e t e r only a t t e m p t s to match rules in p a c k e t s t h a t h a v e b e e n a c t i v a t e d by the grammar. Any grammar rule can a c t i v a t e a p a c k e t b y associating that p a c k e t with the c o n s t i t u e n t a t t h e bottom of the a c t i v e node stack. As long as t h a t node is a t the bottom of the stack, the packets a s s o c i a t e d w i t h it are a c t i v e ; when t h a t node is pushed into the s t a c k , t h e p a c k e t s remain associated with it, but become a c t i v e again only w h e n t h a t node reaches the bottom of the stack. For e x a m p l e , in figure 1 above, the p a c k e t BUILD-AUX is a s s o c i a t e d with the bottom of the stack, and is thus a c t i v e , while the p a c k e t PARSE-AUX is associated with the S node a b o v e t h e auxiliary. The grammar rules themselves are w r i t t e n in a l a n g u a g e called PIDGIN, an English-like formal language t h a t is t r a n s l a t e d into LISP by a simple grammar translator b a s e d on t h e notion of t o p - d o w n operator p r e c e d e n c e [ P r a t t 7 3 ] . This use of pseudo-English is similar to tile use of p s e u d o English in the grammar for Sager's STRING parser [ S a g e r 73.]. Figure 2 below gives a schematic o v e r v i e w of the o r g a n i z a t i o n of the grammar, and exhibits some of t h e rules t h a t make up the p a c k e t PARSE-AUX. A f e w comments on the grammar notation i t s e l f a r e

While the buffer allows the parser to e x a m i n e

237

shifting time b u f f e r to time riglmt to c r e a t e a gap (and causing aim error if t h e b u f f e r was already full). If t h e c o n s t i t u e n t s in t h e b u f f e r provide sufficient e v i d e n c e t h a t a c o n s t i t u e n t of a g i v e n t y p e should be initiated, a new node of t h a t t y p e can be c r e a t e d and pushed onto time stack; this n e w node can also be a t t a c h e d to the node at the bottom of t h e s t a c k b e f o r e the s t a c k is pushed, if time grammatical function of t h e n e w constituent is clear when it is c r e a t e d .

in order. The general form of each grammar rule is: {Rule ( n a m e ) priority: ] [ ] --) ] [ ] [ ] [ ] --) PACKET2 [ ] --> [ ] --)

1 ) A d e t e r m i n i s t i c parser must be at least partially data driven. A grammar for PARSIFAL is made up o f p a t t e r n / a c t i O n rules which are t r i g g e r e d w h e n c o n s t i t u e n t s which fulfill specific d e s c r i p t i o n s a p p e a r in the buffer.

Action

2) A deterministic parser must be able to r e f l e c t expectations that follow from the partial structures built up during the parsing process. P a c k e t s o f rules can be a c t i v a t e d and d e a c t i v a t e d b y grammar rules to r e f l e c t the properties of the c o n s t i t u e n t s in the a c t i v e node stack.

ACTION1 ACTION2 ACTION8

3)

ACTION4 ACTION5

(a) - Time structure of time grammar. {RULE START-AUX PRIORITY: 10. IN PARSE-AUX [ = v e r b ] --> C r e a t e a n e w a u x node. Label C w i t h the meet of the f e a t u r e s of 1st and pres, past, future, tnsless. Activate build-aux.} {RULE TO-INFINITIVE PRIORITY: 10. IN PARSE-AUX [=~to, a u x v e r b ] [ = t n s l e s s ] - - ) Label a n e w a u x node inf. A t t a c h 1st t o C as to. Activate build-aux.}

A deterministic parser must have some sort o f constrained look-ahead facility. PARSIFAL's b u f f e r p r o v i d e s this constrained look-ahead. Because t h e b u f f e r can hold s e v e r a l constituents, a grammar rule can examine the c o n t e x t t h a t follows t h e f i r s t c o n s t i t u e n t in the buffer b e f o r e deciding w h a t grammatical role it fills in a higher l e v e l s t r u c t u r e . The k e y idea is t h a t the size of the b u f f e r can be sharply constrained if each location in the b u f f e r can hold a single complete constituent, r e g a r d l e s s of t h a t c o n s t i t u e n t ' s size. It must be stressed that this look-ahead ability must be constrained in some manner, as it is here by l i m i t i n g the length of the buffer~ otherwise the "determinism" claim is VACUOUS.

T h e G e n e r a l Gran~matical F r a m e w o r k - T r a c e s The form of the structures t h a t time c u r r e n t grammar builds is based on the notion of Annotated Surface Structure. This term has been used in t w o d i f f e r e n t s e n s e s b y Winograd I-Winograd 7 1 ] and CImomsky [Cllomsky 7,3]; t h e u s a g e of time term here can be thought of as a s y n t h e s i s of the t w o concepts. Following Winograd, tlmis term will be used to refer to a notion of s u r f a c e s t r u c t u r e annotated by the addition of a set of features to each node in a parse t r e e . Following Chomsky, time term will be used t o r e f e r to a notion of surface structure a n n o t a t e d b y the addition of aim element called trace to indicate t h e " u n d e r l y i n g position" of " s h i f t e d " NPs.

(b) - Some grammar rules t h a t initiate auxiliaries. Figure 2 Time parser (i.e. time grammar interpreter interpreting some grammar) o p e r a t e s by attaching c o n s t i t u e n t s which are in the buffer to the c o n s t i t u e n t a t t h e bottom of the s t a c k ; functionally, a c o n s t i t u e n t is in t h e s t a c k w h e n the parser is attempting to find its d a u g h t e r s , and in the b u f f e r when the parser is attempting to find its mother. Once a constituent in the buffer has b e e n a t t a c h e d , t h e grammar i n t e r p r e t e r will automatically r e m o v e it from t h e buffer, filling in t h e ' g a p by shifting to the l e f t the c o n s t i t u e n t s formerly to its right. When the p a r s e r has c o m p l e t e d the c o n s t i t u e n t at the bottom of the s t a c k , it pops t h a t c o n s t i t u e n t from the a c t i v e node s t a c k ; the c o n s t i t u e n t e i t h e r remains a t t a c h e d to its parent, if it w a s a t t a c h e d to some larger constituent When it w a s c r e a t e d , or e l s e it falls into the first cell of the c o n s t i t u e n t b u f f e r ,

In current linguistic theory, a t r a c e is e s s e n t i a l l y a "phonologically null" NP in the surface structure r e p r e s e n t a t i o n of a s e n t e n c e that has 11o daughters but is " b o u n d " to t h e NP t h a t filled t h a t position at some l e v e l of underlying structure. In a sense, a t r a c e can be v i e w e d as a " d u m m y " NP t h a t s e r v e s as a placeholder for the NP t h a t e a r l i e r filled t h a t position; in the same sense, t h e trace's

238

binding can be v i e w e d as simply a pointer to t h a t NP. It should be s t r e s s e d at the outset, however, t h a t a t r a c e is indistinguishable from a normal NP in terms of normal grammatical processes; a trace is an NP, e v e n though it is an NP t h a t dominates no lexical material.

c l a u s e t h a t dominates the embedded complement. Note t h a t w h a t is c o n c e p t u a l l y tile underlying s u b j e c t of the e m b e d d e d clause has been passivized into s u b j e c t position of t h e m a t r i x S, a phenomenon commonly called "raising". The analysis of tiffs phenomenon assumed here d e r i v e s from [ C h o m s k y 7 3 ] ; it is an a l t e r n a t i v e to the classic analysis which i n v o l v e s "raising" the s u b j e c t of the e m b e d d e d c l a u s e into o b j e c t position of the matrix S b e f o r e p a s s i v i z a t i o n (for details of this later analysis s e e [ P o s t a l 74]).

There are s e v e r a l reasons for choosing a p r o p e r l y annotated surface structure as a primary output r e p r e s e n t a t i o n for s y n t a c t i c analysis. While a d e e p e r analysis is n e e d e d to r e c o v e r the p r e d i c a t e / a r g u m e n t s t r u c t u r e of a s e n t e n c e (either in terms of Fillmore c a s e r e l a t i o n s [Fillmore 6 8 ] or Gruber/Jackendoff " t h e m a t i c r e l a t i o n s " [Gruber 65; Jackendoff 72]), phenomena such as focus, theme, pronominal reference, scope of quantification, and the like can be r e c o v e r e d only from the s u r f a c e s t r u c t u r e of a s e n t e n c e . By means of proper annotation, it is possible to encode in the surface structure the "deep" syntactic i.formation necessary to recover underlying predicate/argument relations, and tht,s to encode in the same formalism both deep syntactic relations and the s u r f a c e order n e e d e d for pronominal r e f e r e n c e and t h e o t h e r phenomena listed above.

T h e Passive Rule In this section and the n e x t , I will briefly s k e t c h a solution t o the phenomena of passivization and "raising" in t h e c o n t e x t of a grammar for PARSIFAl_. This s e c t i o n will p r e s e n t the Passive rule; the n e x t section will s h o w how this rule, w i t h o u t alteration, handles the "raising" cases. Let us begin with the parser in the s t a t e shown in figure 4 below, in the midst of parsing 0.2a a b o v e . The analysis process for the s e n t e n c e prior to this point is e s s e n t i a l l y parallel to the analysis of any simple d e c l a r a t i v e w i t h one exception= the rvle PASSIVE-AUX in p a c k e t BUILDAUX has d e c o d e d the passive morphology in the a u x i l i a r y and g i v e n the auxiliary the f e a t u r e passive (although this f e a t u r e is not visible in figure 4). At the point w e begin our e x a m p l e , t i l e p a c k e t SUBJ-VERB is active.

Some examples of the use of t r a c e are given in Figure 3 immediately below. (la) (lb}

lJhat d i d John g i v e t o Sue? IJhat d i d John g i v e t t o Sue?

(lc)

John gave what t o Sue.

(2a) (2b}

The m e e t i n g Has s c h e d u l e d f o r Wednesdag. The m e e t i n g uas s c h e d u l e d t f o r Wednesdag.

I

I

I (2c}

I

C:

V s c h e d u l e d ameeting f o r Wednesdag.

( 3 a ) J o h n gas b e l i e v e d ( 3 b } J o h n gas b e l i e v e d

I Figure

The Active Node Stack ( 1. deep) $21 (S DECL MAJOR) / (SS-FINAL) NP : (The meeting) AUX : (was)

VERB : (scheduled)

to be happg. [S t t o be h a p p g ] . 2:

I

3 - Some e x a m p l e s o f

t h e use o f

VP:¢ VPI 7 (VP) / (SUBJ-VERB)

trace.

The Buffer PP14 (PP) : (for Wednesday) WORD162 (". FINALPUNC PUNC) : (.) Figure 4 - Partial analysis of a passive s e n t e n c e : a f t e r the verb has been a t t a c h e d .

One use of trace is to indicate the underlying position of the w h - h e a d of a question or r e l a t i v e clause. Thus, the s t r u c t u r e built by the parser for ,3.1a would include the t r a c e shown in 8.1b, with tile t r a c e ' s binding shown b y t h e line under the sentence. The position of the t r a c e indicates t h a t ,3.1a has an underlying s t r u c t u r e analoqous to the o v e r t surface structure of 3.1c.

Tile p a c k e t SUBJ-VERB contains, among o t h e r rules, the rule PASSIVE, shown in figure 5 below. The p a t t e r n of this rule is fulfilled if the auxiliary of the S node dominating t h e c u r r e n t a c t i v e node (which will always be a VP node if p a c k e t SUBJ-VERB is a c t i v e ) has the f e a t u r e passive, and t h e S node has not y e t been labelled np-preposed. (The n o t a t i o n ""~ C" indicates that this rule matches against the t w o a c c e s s i b l e nodes in the stack, not against the c o n t e n t s of t h e buffer.) The action of the rule PASSIVE simply c r e a t e s a t r a c e , s e t s the binding of the t r a c e to t h e s u b j e c t of the dominating S node, and then drops the n e w t r a c e into the buffer.

Another use of trace is to indicate the underlying position of the surface s u b j e c t of a passivized clause. For e x a m p l e , 8.2a will be parsed into a structure t h a t includes a t r a c e as shown as 8.2b; this trace indicates t h a t t h e s u h j e c t of the passive has the underlying position shown in 3.2e. The symbol "V" signifies the f a c t t h a t t h e s u b j e c t position of ( 2 c ) is filled by an NP t h a t dominates no l e x i c a l s t r u c t u r e . (Following Chomsky, I assume t h a t a p a s s i v e s e n t e n c e in f a c t has no underlying subject, t h a t an a g e n t i v e " b y NP" prepositional phrase originates as such in u n d e r l y i n g strl,cture.) The trace in (,3b) indicates t h a t t h e p h r a s e " t o be happy", which the brackets show is r e a l l y an e m b e d d e d clause, has an underlying s u b j e c t which is i d e n t i c a l witll the sorface s u b j e c t of the matrix S, the

239

While t h e r e is not space to continue the e x a m p l e h e r e in detail, llote t h a t the rule OBJECTS will t r i g g e r w i t h t h e p a r s e r in t h e s t a t e shown in figure 6 above, and will a t t a c h NP53 as the o b j e c t of the v e r b "schedule. OBJECTS is thus t o t a l l y indifferent both to the f a c t t h a t NP53 w a s not a regular NP, but rather a trace, and the f a c t t h a t NP53 did not originate in the input string, but was placed into t h e b u f f e r by grammatical processes. Whether or not this rule is e x e c u t e d is absolutely u n a f f e c t e d by d i f f e r e n c e s b e t w e e n an a c t i v e s e n t e n c e and its passive form; t h e analysis process for either is identical as of this point in the parsing process. Thus, the an'alysis process will be e x a c t l y parallel in both cases a f t e r the PASSIVE rule has b e e n e x e c u t e d . (I remind the reader t h a t the analysis of p a s s i v e assumed a b o v e , following Chomsky, does not assume a p r o c e s s of " a g e n t deletion", " s u b j e c t postposing" or t h e like.)

{RULE PASSIVE IN SUBJ-VERB [~" c; t h e aux of the s above c is passive; the s a b o v e c is not n p - p r e p o s e d ] --> Label t h e s a b o v e c np-preposed. C r e a t e a n e w np node labelled trace. S e t t h e binding of c to the np of the s above c. Drop c.} Figure 5

-

Six lines of code captures np-preposing.

The s t a t e of the parser a f t e r this rule has been e x e c u t e d , w i t h t h e parser previously in the s t a t e in figure 4 a b o v e , is s h o w n in figure 6 below. $21 is now labelled with t h e featurenp-preposed, and there is a trace, NP53, in t h e f i r s t b u f f e r position. NP53, as a trace, has no daughters, but is bound to the s u b j e c t of $21.

Passives in Embedded Complements -

The Active Node Stack ( 1. deep) $21 (NP-PREPOSED S DECL MAJOR) / (SS-FINAL) NP : (The meeting) AU× : (was) VP:~

C:

1 :

2: 3:

"Raising"

The r e a d e r =nay have w o n d e r e d w h y PASSIVE drops the t r a c e it c r e a t e s into the buffer r a t h e r t h a n i m m e d i a t e l y a t t a c h i n g the new t r a c e to the VP node. As w e will n e e below, such a formulation of PASSIVE also c o r r e c t l y a n a l y z e s passives like 3.3a above which involve "raising", but w i t h no additional c o m p l e x i t y added to the grammar, c o r r e c t l y capturing an important generalization a b o u t English. To show the range of the generalization, t h e e x a m p l e which w e will i n v e s t i g a t e in this section, s e n t e n c e ( t ) in f i g . r e 8 below, is y e t a level more complex than ,3.3a a b o v e ; its analysis is shown schematically in 8.2. In this e x a m p l e t h e r e are t w o traces: the first, the s u b j e c t of t h e e m b e d d e d clause, is bound to the s u b j e c t of t h e major clause, t h e second, tile o b j e c t of the embedded S, is bound t o t h e first t r a c e , and is thus ultimately bound t o t h e s u b j e c t of t h e higher S as well. Thus t h e underlying position of tile NP " t h e meeting" can be v i e w e d as being t h e o b j e c t position of the embedded S, as shown in 8.3.

VP17 (VP) / (SUBJ-VERB) VERB : (scheduled) The Buffer NP53 (NP TRACE) : bound to: (The meeting) PP14 (PP) : (for Wednesday) WORD162 (*. FtNALPUNC PUNC) : (.) Figure 6 - After PASSIVE has been e x e c u t e d .

Now rules will rut] which will a c t i v a t e t h e ' t w o p a c k e t s SS-VP and INF-COMP, given that the v e r b of VP17 is " s c h e d u l e " . These t w o p a c k e t s contain rules for parsing simple o b j e c t s of non-embedded Ss, and infinitive complements, r e s p e c t i v e l y . Two such rules, each of which utilize an NP immediately following a verb, are given in figure 7 helow. The rule OBJECTS, in p a c k e t SS-VP, picks up an NP a f t e r the v e r b and a t t a c h e s it to the VP node an a simple o b j e c t . The rule INF-S-STARTI,in p a c k e t INF-COMP, t r i g g e r s w h e n an NP is followed by " t o " and a t e n s e l e s s v e r b ; it initiates an infinitive complement and a t t a c h e s t h e NP as its s u b j e c t . (An example of such a s e n t e n c e is "We w a n t e d John to give a seminar n e x t w e e k " . ) The rule INFS-START1 must have a higher priority than OBJECTS b e c a u s e tlie p a t t e r n of OBJECTS is fulfilled by any situation t h a t fulfills the p a t t e r n of INF-S-START1; if both rules are in a c t i v e p a c k e t s and match, the higher priority of INF-SSTART1 will cause it to be run instead of OBJECTS.

( 1 )The meeting was b e l i e v e d to h a v e been s c h e d u l e d for Wednesday. ( 2 ) T h e meeting was b e l i e v e d [s t to have been s c h e d u l e d t for W e d n e s d a y ] ( 3 ) v b e l i e v e d [s v to have scheduled the meeting for Wednesday]. Figure 8 - This e x a m p l e shows simple passive and raising. We begin our example, once again, right a f t e r " b e l i e v e d " has been a t t a c h e d to VP20, the c u r r e n t a c t i v e node, as shown in figure 9 below. Note t h a t the AUX node has b e e n labelled passive, although this f e a t u r e is not shown here.

{RULE OBJECTS PRIORITY: 10 IN SS-VP [=np] --> A t t a c h 1st to c as np.} {RULE INF-S-START1 PRIORITY: 5. IN INF-COMP [ = n p ] [ : ~ t o , a u x v e r b ] [ : t n s l e s s ] --> Label a n e w s node sac, inf-s. A t t a c h 1st to c as np. Activate parse-aux.} Figure 7 - Two rules which utilize an NP following a verb.

240

C:

The Active Node Stack ( 1. deep) $ 2 2 (S DECL MAJOR) / (SS-FINAL) NP : (The meeting) AUX : (was) VP:$ VP20 (VP) / (SUBJ-VERB) VERB : (believed)

Tile A c t i v e Node Stack ( 2. deep) $ 2 2 (NP-PREPOSED S DECL MAJOR) / (SS-FINAL) NP : (The meeting) A U X : (was) VP:$ VP20 (VP) / (SS-VP THAT-COMP INF-COMP) VERB : (believed) $2,3 (SEC INF-S S) / (PARSE-AUX) NP : bound to: (The meeting)

C: 1 :

2:

The Buffer WORD166 (~TO PREP AUXVERB) : (to) WORD107 ("HAVE VERB TNSLESS AUXVERB PRES ...) : ( h a v e )

1 :

2: Figure 9 - After the v e r b has been a t t a c h e d .

Y e t u n s e e n words:

The p a c k e t SUBJ-VERB is now a c t i v e ; the PASSIVE rule, c o n t a i n e d in this p a c k e t now matches and is e x e c u t e d . This rule, as s t a t e d above, c r e a t e s a trace, binds it t o t h e s u b j e c t of the current clause, and drops tile t r a c e into t h e first cell in the buffer. The resulting s t a t e is shown in figure 10 below.

C:

1 :

2: 3:

been scheduled for W e d n e s d a y .

Figure 11 o After INF-S-START1 has been e x e c u t e d . We are now well on our w a y to the desired analysis. An embedded infinitive has been initiated, and a t r a c e bound to the s u b j e c t of the dominating S has b e e n a t t a c h e d as its subject, although no rule has e x p l i c i t l y " l o w e r e d " t h e t r a c e from one clause into the other.

The Active Node Stack ( 1. deep) $ 2 2 (NP-PREPOSED S DECL MAJOR) / (SS-FINAL) NP : (The meeting) AUX : (was) VP:$ VP20 (VP) / (SUBJ-VERB) VERB : (believed)

The parser will now proceed e x a c t l y as in t h e p r e v i o u s example. It will build tile auxiliary, a t t a c h it, and a t t a c h t i l e v e r b "scheduled" to a new VP node. Once again PASSIVE will match and be e x e c u t e d , creating a t r a c e , binding it to the s u b j e c t of the clause (in this case i t s e l f a t r a c e ) , and dropping the new trace into the buffer. Again t h e rule OBJECTS will a t t a c h the trace NP67 as the o b j e c t of VP21, and the parse will then be completed b y grammatical p r o c e s s e s which will not be discussed here. An e d i t t e d form of the t r e e structure which results is shown in figure 12 below. A t r a c e is indicated in this t r e e b y giving t h e terminal string of its ultimate binding in p a r e n t h e s e s .

The Buffer NP55 (NP TRACE) : bound to: (The meeting) WORD166 (~TO PREP AUXVERB) : (to) WORD167 (~HAVE VERB TNSLESS AUXVERB PRES ...) : ( h a v e )

Yet unseen words:

Tile Buffer WORD166 (*TO PREP AUXVERB) : (to) WORD167 (*HAVE VERB TNSLESS AUXVERB PRES ...) : ( h a v e )

been scheduled for W e d n e s d a y .

Figure 10 - After PASSIVE has been e x e c u t e d .

(NP-PREPOSED S DECL MAJOR) NP: (MODIBLE NP DEF DET NP) The meeting AUX: (PASSIVE PAST V18S AUX)

Again, rules will now be e x e c u t e d which will a c t i v a t e the p a c k e t SS-VP (which contains the rule OBJECTS) and, since " b e l i e v e " takes infinitive complements, t h e p a c k e t INF-COMP (which contains INF-S-START1), among others. (These rules will also d e a c t i v a t e the p a c k e t SUI',J-VERB.) Now the patterns of OBJECTS and INF-SSTART1 will botll match, and INF-S-START1, shown a b o v e in figure 7, will be e x e c u t e d by the i n t e r p r e t e r since it has tile higher priority. (Note once again t h a t a t r a c e is a p e r f e c t l y normal NP from the point v i e w of tile p a t t e r n matching process.) This rule now c r e a t e s a n e w S node labelled infinitive a n d a t t a c h e s the trace NP,.55 to t h e n e w i n f i n i t i v e as its subject. The resulting s t a t e is shown in figure 11 below.

WaS

VP: (VP) VERB: b e l i e v e d NP: (NP COMP) S: (NP-PREPOSED sEC INF-S S) NP: (NP TRACE) (bound* to: The meeting) AUX: (PASSIVE PERF INF AUX) to have been VP: (VP) VERB: scheduled NP: (NP TRACE) (bound* to: The meeting) PP: (PP) PREP: for NP: (NP TIME DOW) Wednesday Figure 1 2 - Tile final t r e e structure. This e x a m p l e demonstrates t h a t the simple formulation of the PASSIVE rule p r e s e n t e d above, i n t e r a c t i n g w i t h o t h e r simply formulated grammatical rules

241

for parsing o b j e c t s and initiating embedded infinitives, allows a t r a c e to be a t t a c h e d either as the o b j e c t of a v e r b or as the s u b j e c t of an embedded infinitive, w h i c h e v e r is t h e a p p r o p r i a t e analysis for a given grammatical situation. B e c a u s e the PASSIVE rule is formulated in such a w a y t h a t it drops the t r a c e it c r e a t e s into the buffer, l a t e r rules, a l r e a d y formulated to trigger on an NP in the buffer, will a n a l y z e s e n t e n c e s with NP-preposing e x a c t l y the same as t h o s e w i t h o u t a preposed subject. Thus, w e s e e t h a t t h e a v a i l a b i l i t y of the b u f f e r mechanism is crucial to capturing this generalization; such a generalization can only be s t a t e d by a parser with a mechanism much like the b u f f e r u s e d here.

[e...Y...[~

z,,.x...]...Y...]

Figure 18 - SSC: No rule can involve X and Y in this structure. The SSC explains why the surface s u b j e c t position of v e r b s like " s e e m s " and "is certain" which h a v e no underlying s u b j e c t can be filled only by the s u b j e c t and not t h e o b j e c t of the embedded S: The rule "MOVE NP" is f r e e to shift any NP into the empty s u b j e c t position, but is c o n s t r a i n e d by the SSC so t h a t the o b j e c t of the e m b e d d e d S c a n n o t be moved out o f t h a t clause. This e x p l a i n s w h y (a) in figure 14 below, but not 14b, can be d e r i v e d from 1 4 c ; the d e r i v a t i o n of 14b from 14c would v i o l a t e t h e SSC.

The G r a m m a r I n t e r p r e t e r and C h o m s k y ' s Constraints Before turning now to a sketch of a computational a c c o u n t of Chomsky's constraints, t h e r e are s e v e r a l i m p o r t a n t limitations of this work which must be e n u m e r a t e d .

(a) John seems to like Mary. ( b ) * M a r y seems John to like. (c) v seems is John to like M a r y ]

First of all, while t w o of Chomsky's c o n s t r a i n t s s e e m to fall out of the grammar interpreter, t h e r e seems t o be no app.arent account of a third, the Propositional Island Constraint, in terms of this mechanism.

Figure 14 - Some examples illustrating the 8S(3. In e s s e n c e , then, the Specified S u b j e c t Constraint c o n s t r a i n s the rule "MOVE NP" in such a w a y t h a t only t h e s u b j e c t of a clause can be moved out of t h a t clause into a position in a higher S. Thus, if a t r a c e in an a n n o t a t e d s u r f a c e s t r u c t u r e is bound to an NP Dominated by a higher S, t h a t t r a c e must fill the s u b j e c t position of the l o w e r clause.

Second, Chomsky's formulation of these c o n s t r a i n t s is intended to apply to all rules of grammar, b o t h s y n t a c t i c rules (i.e. transformations) and those rules of s e m a n t i c i n t e r p r e t a t i o n which Chomsky calls "rules of c o n s t r u a l " , a s e t of shallow semantic rules which g o v e r n anaphoric p r o c e s s e s [Chomsky 77]. The discussion h e r e will only touch on purely s y n t a c t i c phenomena; the q u e s t i o n of how rules of semantic interpretatiqn can be meshed w i t h t h e f r a m e w o r k p r e s e n t e d in this document has y e t to be investigated.

In t h e remainder of this section I will show t h a t t h e grammar i n t e r p r e t e r constrains grammatical p r o c e s s e s in such a w a y t h a t a n n o t a t e d surface s t r u c t u r e s c o n s t r u c t e d b y thp. grammar i n t e r p r e t e r will have this same p r o p e r t y , g i v e n the formulation of the PASSIVE rule p r e s e n t e d a b o v e . In torms of the parsing process, this means t h a t if a t r a c e is " l o w e r e d " from one clause to another as a result of a "MOVE N P " - t y p e operation during the parsing process, t h e n it will be a t t a c h e d as the s u b j e c t of the second clause. To be more precise, if a trace is a t t a c h e d so t h a t it is Dominated by some S node $1', and the t r a c e is bound to an NP Dominated by some other S node $2, then t h a t t r a c e will n e c e s s a r i l y be a t t a c h e d so t h a t it fills the s u b j e c t position of $1. This is d e p i c t e d in figure 15 below.

Third, the arguments p r e s e n t e d below deal only w i t h English, and in f a c t depend strongly upon s e v e r a l f a c t s a b o u t English s y n t a x , most crucially upon the f a c t t h a t English is subject-initial. Whether these arguments can be s u c c e s s f u l l y e x t e n d e d to other language t y p e s is an open question, and to this e x t e n t this work must be c o n s i d e r e d e x ploratory. And finally, I will not show t h a t t h e s e c o n s t r a i n t s must be true without exception; as w e will see, t h e r e are v a r i o u s situations in which the constraints imposed b y the grammar i n t e r p r e t e r can be circumvented. Most of t h e s e situations, though, will be shown to demand much more c o m p l e x grammar formulations than those t y p i c a l l y n e e d e d in t h e grammar so far constructed. This is quite in k e e p i n g w i t h the suggestion made by Chomsky [Chomsky 77"] t h a t t h e constraints are not necessarily w i t h o u t e x c e p t i o n , but r a t h e r t h a t e x c e p t i o n s will be "highly marked" and t h e r e f o r e will count heavily against any grammar t h a t includes them.

The A c t i v e Node Stack S2

... / ...

NP2 C:

sl

... / ...

NP: NP1 (NP TRACE) : bound to NP2 Figure 15 - NP1 must be a t t a c h e d as the s u b j e c t of $1 since it is bound to an NP Dominated by a higher S.

The Specified Subject Constraint Looking back at tile complex p a s s i v e e x a m p l e involving "raising" p r e s e n t e d above, w e see t h a t t i l e parsing process results in a structure e x a c t l y like t h a t s h o w n a b o v e . The original point of the example, of course, w a s t h a t the r a t h e r simple PASSIVE rule handles this c a s e w i t h o u t t h e need for some mechanism to e x p l i c i t l y l o w e r t h e NP. The PASSIVE rule captures this generalization b y

The Specified Subject Constraint (SSC), s t a t e d informally, says t h a t no rule may involve t w o c o n s t i t u e n t s t h a t are Dominated by d i f f e r e n t cyclic nodes unless t h e l o w e r of the t w o is the s u b j e c t of an S or NP. Thus, no rule may involve c o n s t i t u e n t s X and Y in tl~e structure shown in figure 13 below, if cl and ~ are cyclic nodes and Z is t h e s u b j e c t of c(, Z distinct from X.

242

dropping t h e t r a c e it c r e a t e s into the b u f f e r ( a f t e r a p p r o p r i a t e l y binding the trace), thus allowing o t h e r rules w r i t t e n to handle normal NPs (e.g. OBJECTS and INF-SSTART1) to c o r r e c t l y place the trace.

COMP NP AUX or

NP AUX [vP VERB ...]. (The COMP node will dominate flags like " t h a t " or " f o r " t h a t mark t h e I)eginning of a complement clause.) But then, if a t r a c e , i t s e l f an NP, is one of tile first s e v e r a l c o n s t i t u e n t s a t t a c h e d to an embedded clause, the only position it can fill will be t i l e s u b j e c t of the clause, exactly the empirical c o n s e q u e n c e of Cbomsky's Specified S u b j e c t Constraint in such c a s e s as e x p l a i n e d above,

This s t a t e m e n t of PASSIVE does more, h o w e v e r , than simply c a p t u r e a generalization about a s p e c i f i c construction. As I will argue in detail below, the b e h a v i o r s p e c i f i e d by both the Specified Subject Constraint and S u b j a c e n c y follows almost immediately from this formulation. In [ M a r c u s 7 7 ] , I argue that this formulation of PASSIVE is t h e only simple, non-ad hoc, formulation of this rule possible, and t h a t all o t h e r rules characterized by the c o m p e t e n c e rule "MOVE NP" must o p e r a t e similarly; here, h o w e v e r , I will only s h o w t h a t t h e s e constraints follow naturally.from this formulation of PASSIVE, leaving the question of n e c e s s i t y aside. I will also assume one additional constraint below, t h e Left-to-Right Constraint, which will be briefly m o t i v a t e d l a t e r in this paper as a natural condition on the formulation of a grammar for this mechanism.

T h e L - t o - R Constraint Let us now return to the motivation for the L - t o - R Constraint. Again, I will not a t t e m p t to prove t h a t this c o n s t r a i n t must be true, but merely to show why it is plausible. Empirically, tile L e f t - t o - R i g h t Constraint Seems t o hold for the most part; for the grammar of English discussed in this paper, and, it would seem, for any grammar of English t h a t a t t e m p t s to capture the same range of generalizations as this grammar, the constituents in the buffer are utilized in l e f t - t o - r i g l l t order, with a small range of e x c e p t i o n s . This u s a g e is c l e a r l y not enforced by the grammar i n t e r p r e t e r as p r e s e n t l y implemented; it is quite possible to w r i t e a s e t o f grammar rules t h a t specifically ignores a constituent in t h e b u f f e r until some arbitrary point in the clause, though such a s e t of rL,les would be highly adhoc. However, t h e r e r a r e l y s e e m s to be a need to remove o t h e r than tile f i r s t c o n s t i t u e n t in the buffer.

The L e f t - t o - R i g h t Constraint: tile c o n s t i t u e n t s in t h e b u f f e r are (almost always) a t t a c h e d to higher l e v e l c o n s t i t u e n t s in l e f t - t o - r i g h t order, i.e. t h e first c o n s t i t n e n t in the buffer is (almost a l w a y s ) a t t a c h e d before the second constituent. I will now show t h a t a trace c r e a t e d by PASSIVE which is bound to an NP in one clause can only s e r v e as t h e s u b j e c t of a clause dominated by t h a t first clause. Given tile formulation of PASSIVE, a t r a c e can be " l o w e r e d " into one clause from another only by the i n d i r e c t r o u t e of dropping it into the buffer before the s u b o r d i n a t e c l a u s e node is c r e a t e d , which is e x a c t l y how the PASSIVE rule o p e r a t e s . This means that the ordering of the operation.~ is crucially: 1) c r e a t e a trace and drop it into tile buffer, 2) c r e a t e a subordinate S node, 3) a t t a c h t h e t r a c e to the n e w l y c r e a t e d S node. Tile k e y point is t h a t at t i l e time t h a t the subordinate clause node is c r e a t e d and b e c o m e s tile current a c t i v e node, tile trace must be sitting in the buffer, filling one of the three buffer positions. Thus, t i l e p a r s e r will be in the s t a t e shown in figure 16 below, w i t h t h e t r a c e , in fact, most likely in tile first b u f f e r ~)osition.

The one e x c e p t i o n to the L-to-R Constraint s e e m s t o be t h a t a c o n s t i t u e n t C~ may be a t t a c h e d b e f o r e t h e c o n s t i t u e n t to its left, Ci. I, if Ci does not appear in s u r f a c e s t r u c t u r e in its underlying position (or, if one prefers, in its unmarked position) and if its removal from the b u f f e r reestablishes the unmarked order of the remaining c o n s t i t u e n t s , as in the case of the AUX-INVERSION rule d i s c u s s e d earlier in this paper. To capture this notion, t h e L-to-R Constraint can be r e s t a t e d as follows: All c o n s t i t u e n t s must be a t t a c h e d to higher level c o n s t i t u e n t s a c c o r d i n g ' to tile l e f t - t o - r i g h t order of constituents in t h e unmarked case of t h a t constituent's structure. This reformulation is interesting in t h a t it would b e a natural c o n s e q u e n c e of the operation of the grammar i n t e r p r e t e r if p a c k e t s w e r e associated with the phrase s t r u c t u r e rules of an explicit "base component", and t h e s e rules w e r e used as templates to build up the s t r u c t u r e a s s i g n e d by tile grammar interpreter. A p a c k e t of grammar rules would tllen be e x p l i c i t l y associated with each symbol oil t h e right hand side of each phrase structure rule. A c o n s t i t u e n t of a given t y p e would then be c o n s t r u c t e d b y a c t i v a t i n g the p a c k e t s associated with each node t y p e of t h e a p p r o p r i a t e phrase structure rule in l e f t - t o - r i g h t order. Since t h e s e base rules would r e f l e c t the unmarked I - t o - r o r d e r of constituents, the constraint s u g g e s t e d here would t h e n simply falt out of the i n t e r p r e t e r mechanism.

The Active Node Stack C:

S 1 2 3 (S SEC ...) / ... Tile Buffer NP123 (NP TRACE) : bound to NP in S a b o v e $ 1 2 3

Figure 1 6 -

Parser s t a t e a f t e r embedded S c r e a t e d .

Now, given the L-to-R Constraint, a t r a c e which is m the b u f f e r at the time that an embedded S node is first c r e a t e d must be one of the first s e v e r a l c o n s t i t u e n t s a t t a c h e d to the S node or its daughter nodes. From t h e s t r u c t u r e of English, we know t h a t tile leftmost t h r e e c o n s t i t u e n t s of an embedded S node, ignoring t o p i c a l i z e d c o n s t i t u e n t s , must either be

Subjacency Before turning to the Subjacency Principle, a f e w a u x i l i a r y technical terms need to be defined: If we can

243

Having s t a t e d S u b j a c e n c y in terms of t h e a b s t r a c t c o m p e t e n c e t h e o r y of g e n e r a t i v e grammar, I now will s h o w t h a t a parsing c o r r e l a t e of S u b j a c e n c y follows from t h e s t r u c t u r e of t h e grammar interpreter. Specifically, I will s h o w t h a t t h e r e are only limited c a s e s in which a t r a c e g e n e r a t e d b y a "MOVE-NP" p r o c e s s can be " l o w e r e d " more t h a n one clause, i.e. t h a t a t r a c e c r e a t e d and bound while a n y g i v e n S is current must almost a l w a y s be a t t a c h e d e i t h e r to t h a t S or to an S which is Dominated b y t h a t S.

t r a c e a p a t h up t h e t r e e from a given node X to a g i v e n n o d e Y, t h e n w e s a y X is dominated by Y, or e q u i v a l e n t l y , Y dominates X. If Y dominates X, and no o t h e r nodes i n t e r v e n e (i.e. X is a d a u g h t e r of Y), then Y immediately (or directly) dominates X. [Akmajian & Heny 7,5]. One n o n - s t a n d a r d d e f i n i t i o n will p r o v e useful: I will s a y t h a t if Y d o m i n a t e s X, and Y is a c y c l i c node, i.e. an S or NP node, and t h e r e is no o t h e r c y c l i c node Z such t h a t Y dominates Z and Z d o m i n a t e s 'X (i.e. t h e r e is no intervening c y c l i c node Z b e t w e e n Y and X) then Y Dominates X.

L e t us begin b y examining w h a t it would mean t o l o w e r a t r a c e more than one clause. Given t h a t a t r a c e c a n o n l y be " l o w e r e d " b y dropping it into the b u f f e r and t h e n c r e a t i n g a s u h o r d i n a t e S node, as d i s c u s s e d a b o v e , l o w e r i n g a t r a c e more than one clause n e c e s s a r i l y implies t h e following s e q u e n c e of e v e n t s , d e p i c t e d in figure 19 b e l o w : First, a t r a c e NP1 must (a) be c r e a t e d with some S node, $1, as t h e current S, (b) bound to some NP Dominated b y t h a t S and then (c) dropped into t h e buffer. By rlefinition, it will be i n s e r t e d into the first cell in t h e b u f f e r . (This is s h o w n in figure l g a ) T h e n a s e c o n d S, $ 2 , must b e c r e a t e d , stq~planting $1 as the ct=rrent S, and t h e n y e t a third S, $3, must he c r e a t e d , becoming the current S. During all t h e s e s t e p s , t h e t r a c e NP1 remains sitting in t h e b u f f e r . Finally, NP1 is a t t a c h e d ¢,nder $3 (fig. 19b). By t h e S p e c i f i e d S u b j e c t Constraint, NP1 must then a t t a c h t o $ 3 as its s u b j e c t .

The principle of S u b j a c e n c y , informally s t a t e d , s a y s t h a t no rule can involve c o n s t i t u e n t s t h a t a r e s e p a r a t e d b y more than one Cyclic node. Let us s a y t h a t a node X is subjacent to a node Y if t h e r e is at most o n e c y c l i c node, i.e. at most one NP or S node, b e t w e e n t h e c y c l i c node t h a t Dominates Y and the node X. Given this d e f i n i t i o n , t h e S u b j a c e n c y principle s a y s t h a t no rule c a n i n v o l v e c o n s t i t u e n t s t h a t are not s u b j a c e n t . The S u l ) j a c e n c y principle implies t h a t m o v e m e n t rules a r e c o n s t r a i n e d so t h a t t h e y can move a c o n s t i t u e n t only into positions t h a t the c o n s t i t u e n t w a s s u b j a c e n t to, i.e. only within t h e clause (or NP) in which it o r i g i n a t e s , or into t h e c l a u s e (or NP) t h a t Dominates t h a t c l a u s e (...). This means t h a t if c~, ~, and ( in figure 17 are c y c l i c nodes, no rl, le can move a c o n s t i t u e n t from position X to e i t h e r o f t h e p o s i t i o n s Y, w h e r e [~...X...] is distinct from [ ~ X ] .

The A c t i v e Node S t a c k [,...Y..-[/3...[~...X...]...].-.Y..-] C: Figure 1 7 - S u b j a c e n c y : No nile can involve X and Y in this s t r u c t u r e . 1 st:

S u h j a c e n c y implies that if a c o n s t i t u e n t is to b e " l i f t e d " up more than one level in c o n s t i t u e n t s t r u c t u r e , this o p e r a t i o n must he done by r e p e a t e d operations. Thus, t o use one of Chomsky's examples, tile s e n t e n c e g i v e n in f i g u r e 18a, with a d e e p s t r u c t u r e analogous to 18b, must b e d e r i v e d as follows (assuming that '*is c e r t a i n " , like " s e e m s " , has no s u b j e c t in underlying s t r u c t u r e ) : The d e e p s t r u c t u r e must f i r s t u n d e r g o a movement operation t h a t r e s u l t s in a s t r u c t u r e analogous to 18c, and then a n o t h e r m o v e m e n t o p e r a t i o n t h a t results in 18(I, each of t h e s e m o v e m e n t s l e a v i n g a t r a c e as shown. That 18c is in f a c t an i n t e r m e d i a t e s t r u c t u r e is s u p p o r t e d b y t h e e x i s t e n c e of s e n t e n c e s such as 18e, which p u r p o r t e d l y result w h e n t h e V in t h e m a t r i x S is r e p l a c e d b y the l e x i c a l item " i t " , and t h e e m b e d d e d S is t e n s e d rather than infinitival. The s t r u c t u r e g i v e n in 18f is ruled out as a possible a n n o t a t e d s u r f a c e s t r u c t u r e , b e c a u s e the single t r a c e could only b e l e f t if t h e NP " J o h n " w a s moved in one fell s w o o p from its u n d e r l y i n g position to its position in s u r f a c e s t r u c t u r e , w h i c h would violate Subjacency. (a) (b) (c) (d) (e) (f)

$1 ... / ... The Buffer NP1 (NP TRACE) : bound to NP Dominated b y $1 (a) - NP1 is d r o p p e d into t h e b u f f e r while $1 is the current S. Tile A c t i v e Node S t a c k S1 ... / ...

C:

S2

...

/

...

S3

...

/

...

NP1 (NP TRACE) : bound to NP Dominated b y S1 ( b ) - After 52 and $3 are c r e a t e d , NP1 is a t t a c h e d to $3 as its s u b j e c t ( b y t h e SSC).

Figl,re l g - Lowering a t r a c e more than 1 c l a u s e But this s e q u e n c e of e v e n t s is highly unlikely. e s s e n c e of t h e argument is this:

The

Nothing in tile buffer can change b e t w e e n t i l e time t h a t $ 2 is c r e a t e d and $3 is c r e a t e d if NP1 remains in t h e b u f f e r . NP1, like any o t h e r node t h a t is d r o p p e d from t h e a c t i v e node s t a c k into the buffer, is i n s e r t e d into t h e f i r s t b u f f e r position. But then, b y the L-to-R Constraint, nothing to t h e right of NP1 can be a t t a c h e d to a higher l e v e l c o n s t i t u e n t until NP1 is a t t a c h e d . (One can s h o w t h a t it is most unlikely t h a t any c o n s t i t u e n t s will e n t e r to t h e l e f t of NP1 a f t e r it is d r o p p e d into tile buffer, but I will s u p p r e s s this d e t a i l h e r e ; tile full argument is included in [ M a r c u s

John seems to be certain to win. V seems IS V to be certain Is John to w i n ] ] v seems Is John to be certain [s t to w i n ] ] John seems Is t to be certain [s t to w i n ] ] It seems t h a t John is certain to win. John seems [s V to be certain [S t to w i n ] ]

Figure 18 - An e x a m p l e demonstrating S u b j a c e n c y .

77].)

244

p r e ~ e n t e d earlier make clear. Tl~e trace is simply d r o p p e d into t h e b u f f e r b e c a u s e its grammatical function is not clear, and t h e creation of tile second S follows from other i f l d e p e n d e n t l y m o t i v a t e d grammatical processes. From t h e point of v i e w of this processing theory, w e can h a v e our c a k e and e a t it too; to the e x t e n t t h a t it makes s e n s e t o map results from the realm of processing into the realm of c o m p e t e n c e , in a sense both the c l a u s e m a t e / " r a i s i n g " and the S u b j a c e n c y positions are correct.

But if the contents of the buffer do not c h a n g e b e t w e e n t h e creation of $2 and $3, then w h a t can possibly m o t i v a t e the creation of both $2 and $ 3 ? The c o n t e n t s of t h e b u f f e r mr,st necessarily provide clear e v i d e n c e t h a t botl] of t h e s e clauses are present, since, b y the Determinism Hypothesis, the parser must be c o r r e c t if it i n i t i a t e s a constittient. Thus, the same three c o n s t i t u e n t s in t i m b u f f e r must provide convincing evidence not only for t h e c r e a t i o n of $2 hut also for $3. Furthermore, if NP1 is t o b e c o m e tile s u b j e c t of $3, and if $2 Dominates $3, then it woL,Id seem t h a t tile constituents t h a t follow NP1 in the b u f f e r nit=st also be constituents of $3, since $ 3 must be c o m p l e t e d b e f o r e it is dropped from the a c t i v e node s t a c k and c o n s t i t u e n t s can then be a t t a c h e d to $2. But t h e n $ 2 must be c r e a t e d e n t i r e l y on the basis of e v i d e n c e p r o v i d e d by t h e c o n s t i t u e n t s of another clatise (unless Sa has less than t h r e e constituents). Thus, it would seem t h a t t h e c o n t e n t s of the buffer cannot provide e v i d e n c e for t h e p r e s e n c e of both clauses unless tile presence of $3, b y itself, is enough to provide confirming e v i d e n c e for the p r e s e n c e of $2. Tiffs would be the case only if t h e r e w e r e , say, a clausal construction that could only appear (perhaps in a particular environment) as tile initial c o n s t i t u e n t of a higher clause. In tiffs case, if there are such constructions, a violation of S u b j a c e n c y should be possible.

Evidence for the Determinism Hypothesis In closing, I would like to show t h a t the p r o p e r t i e s of t h e grammar i n t e r p r e t e r crucial to capturing the b e h a v i o r of Chomsky's constraints w e r e originally m o t i v a t e d ' b y the Determinism Hypothesis, and thus, to some e x t e n t , the Determinism Hypotlmsis explains Chomsky's constraints. Tile s t r o n g e s t form of such an argument, of course, w o u k l he to show t h a t (a) either (i) the grammar i n t e r p r e t e r a c c o u n t s for all of Chomsky's constraints in a manner which is c o n c l u s i v e l y ¢miversal or (ii) the constraints t h a t it will not a c c o u n t for are wrong and t h a t (b) the properties of t h e grammar i n t e r p r e t e r w h i c h w e r e crucial for this proof w e r e f o r c e d b y the Determinism Hypothesis. If such an argument cotdd be made, it would show t h a t tile Determinism H y p o t h e s i s provides a natural processing account of the linguistic d a t a c h a r a c t e r i z e d by Chomsky's constraints, giving strong confirmation to the Determinism Hypothesis.

With the one e x c e p t i o n just mentioned, t h e r e is no m o t i v a t i o n for creating t w o clauses in such a situation, and tllus the initiation of only one such clause can be m o t i v a t e d . But if only one clause is initiated before NP1 is a t t a c h e d , t h e n NP1 must he a t t a c h e d to this clause, and this clause is n e c e s s a r i l y s u b j a c e n t t o the clause which Dominates t h e NP to which it is bound. Thus, the grammar i n t e r p r e t e r will b e h a v e as if it enforces the Subjacency Constraint.

I h a v e shown none Of the above, and thus' my claims must be p r o p o r t i o n a t e l y more modest. I h a v e argued only t h a t important sub-cases of Chomsky's Constraints f o l l o w from t h e grammar interpreter, and while I can s h o w t h a t t h e Determinism Hypothesis strongly motivates the mechanisms from which these arguments follow, I c a n n o t show necessity. The e x t e n t to which this argument p r o v i d e s e v i d e n c e for tile Determinism Hypothesis must thus be l e f t to t h e reader; no o b j e c t i v e measure e x i s t s for such matters.

As a concluding point, it is w o r t h y of n o t e t h a t willie t h e grammar i n t e r p r e t e r appears to b e h a v e e x a c t l y as if it w e r e consD:ained by the Subjacency principle, it is in f a c t c o n s t r a i n e d by a version of the Clausemate Constraint! (The Clausemate Constraint, long t a c i t l y assumed b y linguists but first e x p l i c i t l y stated, I believe, b y Postal [ P o s t a l 6 4 ] , s t a t e s t h a t a transformation can only i n v o l v e c o n s t i t u e n t s t h a t are Dominated by the same cyclic node. This constraint is at the heart of Postal's a t t a c k on t h e c o n s t r a i n t s t h a t are discussed above and his argument for a " r a i s i n g " analysis.) The grammar interpreter, as w a s s t a t e d a b o v e , limits grammar rules from examining any node in the a c t i v e node s t a c k higher than tile current cyclic node, which is to s a y t h a t it can only examine clausemates. T h e trick is t h a t a t r a c e is c r e a t e d and bound while it is a " c l a u s e m a t e " of the NP to which it is bound in t h a t t h e c u r r e n t cyclic node at that time is the node t o which t h a t NP is a t t a c h e d . The trace is then dropped into the b u f f e r and a n o t h e r S node is created, t h e r e b y d e s t r o y i n g the c l a u s e m a t e relationship. The trace is then a t t a c h e d to this n e w 5 node. Thus, in a sense, the t r a c e is l o w e r e d from one clause to another. The crucial point is t h a t while this lowering goes on as a result of the operation of tile grammar i n t e r p r e t e r , it is only implicitly lowered in t h a t 1) t h e t r a c e w a s n e v e r attached to the higher S and 2) it is not d r o p p e d into t i l e b u f f e r because of any realization t h a t it must be " l o w e r e d " ; in f a c t it may end up a t t a c h e d as a c l a u s e m a t e of the NP to which it is bound - as the passive e x a m p l e s

The ability to drop a t r a c e into the b u f f e r is at the h e a r t of t h e arguments p r e s e n t e d here for S u b j a c e n c y and t h e 55C as c o n s e q u e n c e s of the functioning of the grammar i n t e r p r e t e r ; this is the central operation upon which t h e a b o v e arguments are based. But the buffer itself, and t h e f a c t t h a t a c o n s t i t u e n t can be dropped into the b u f f e r if its grammatical function is uncertain, are d i r e c t l y m o t i v a t e d b y t h e Determinism Hypothesis. Given tiffs, it is fair to claim t h a t if Chomsky's constraints follow from the o p e r a t i o n of t h e grammar i n t e r p r e t e r , then t h e y are strongly linked to the Determinism Hypothesis. If Chomsky's constraints are In f a c t true, then the arguments p r e s e n t e d in this p a p e r p r o v i d e solid e v i d e n c e in support of the Determinism Hypothesis.

Acknowledgments This paper summarizes one result p r e s e n t e d in my Ph.D. thesis; I would like to e x p r e s s my gratitude to the many p e o p l e who Contributed to the technical c o n t e n t of t l l a t work: Jon Alien, my thesis advisor, to whom I o w e a s p e c i a l d e b t of tllanks, Ira Goldstein, Seymour Papert, Bill

245

Martin, Bob Moore, Chuck Rieger, Mike Genesereth, Gerry Sussman, Mike Brady, Craig Thiersch, Beth Levin, Candy Bullwinkle, Kl,rt VanLehn, Dave McDonald, and Chuck Rich.

Winograd, T. [ 1 9 7 1 ] Procedures as a Representation for Data in a Computer Program for Understanding Natural Language, Project MAC-TR 84, MIT, Cambridge, Mass.

This paper describes research done at the Artificial Intelligence Laboratory of the Massachusetts Institute of Technology. Support for the laboratory's artificial intelligence research is provided in part by the Advanced .Research Projects Agency of the Department of Defence under Office of Naval Research Contract N0001475-C-0043.

Woods, W. A. [ 1 9 7 0 ] "Transition Network Grammars for Natural Language Analysis", Communications of the ACM 13:591.

BIBLIOGRAPHY Akmajian, A. and F. Heny [1975] An Introduction to the Principles of Transformational Syntax, MIT Press, Cambridge, Mass. Bresnan, d. W. [ 1 9 7 6 ] "Evidence for a Theory of Unbounded Transformations", Linguistic Analysis 2:853. Chomsky, N. [ 1 9 7 3 ] "Conditions on Transformations", in S. Anderson and P. Kiparsky, ads., A Festschrift for Morris Halle, Itolt, Rinehart and Winston, N.Y. Chomsky, N. [ 1 9 7 5 ] Reflections on Language, Pantheon, N.Y. Chomsky, N. [ 1 9 7 6 ] "Conditions on Rules ofjGrammar", LingUistic Analysis 2:803. Chomsky, N. [ 1 9 7 7 ] "Oil Wh-Movement", in A. Akmajian, P. Culicover, and T. Wasow, eds., Formal Syntax, Academic Press, N.Y. Fillmore, C. J. [ 1 9 6 8 ] "The Case for Case" in Universals in Linguistic Theory, E. Bach and R..T. Harms, eds., Holt, Rinehart, and Winston, N.Y. Gruber, J. S. [196,5] Studies unpublished Ph.D. thesis, MIT.

in

Lexical

Relations,

Jackendoff, R. S. [1972] Semantic Interpretation Generative Grammar, MIT Press, Cambridge, Mass.

in

Marcus, M. P. [ 1 9 7 7 ] A Theory of Syntactic Recognition for Natural Language, unpublished Ph.D. thesis, MIT. Marcus, M. P. [ 1 0 7 8 ] "Capturing Linguistic Generalizations in a Parser for English", in the proceedings of The 2nd National Conference of the Canadian Society for Computational Studies of Intelligence, Toronto, Ca.nada. Newell, A. and H.A. Simon [1972] Human Problem Solving, Prentice-Hall, Englewood Cliffs, N.J. Postal, P. M. [ 1 9 7 4 ] On Raising, MIT Press, Cambridge, Mass. Pratt, V. R. [ 1 9 7 3 ] "Top-Down Operator Precedence", in the proceedings of The SIGACT/SIGPLAN Symposium on Principles of Programming Languages, Boston, Mass. Sager, N. [197,3] "The String Literature", in [Rustin 73].

Parser

for

Scientific

246