THE EQUIVALENCE OF FOUR EXTENSIONS OF CONTEXT-FREE GRAMMARS

K. Vijay-Shanker
Department of Computer and Information Science
University of Delaware
Newark, DE 19716

David J. Weir
School of Cognitive and Computing Sciences
University of Sussex
Brighton, BN1 9QH

January 25, 1994

Abstract

There is currently considerable interest among computational linguists in grammatical formalisms with highly restricted generative power. This paper concerns the relationship between the class of string languages generated by several such formalisms, viz. Combinatory Categorial Grammars, Head Grammars, Linear Indexed Grammars and Tree Adjoining Grammars. Each of these formalisms is known to generate a larger class of languages than Context-Free Grammars. The four formalisms under consideration were developed independently and appear superficially to be quite different from one another. The result presented in this paper is that all four of the formalisms under consideration generate exactly the same class of string languages.

1 Introduction

There is currently considerable interest among computational linguists in grammatical formalisms with highly restricted generative power. This is based on the argument that a grammar formalism should not merely be viewed as a notation, but as part of the linguistic theory [6]. It should make predictions about the structure of natural language, and its value is lessened to the extent that it supports both good and bad analyses. In order for a grammar formalism to have such predictive power its generative capacity must be constrained. This has led to interest in the use of context-free grammars (cfg) [16, 6] as a notation with which to express linguistic theories. However, it is now generally accepted that cfg lack the generative power needed for this purpose [19, 3]. As a result there is substantial interest in the development and study of constrained grammar formalisms whose generative power exceeds that of cfg. This paper concerns the relationship between several of the most widely studied such formalisms. We compare the class of languages that are generated by combinatory categorial grammars, head grammars, linear indexed grammars and tree adjoining grammars. Each of these formalisms is known to generate a larger class of languages than cfg. Furthermore, this class includes the non-context-free languages that have been used as the basis for the most widely accepted arguments that generative power greater than that of cfg is needed to generate certain natural languages [19, 3]. The four formalisms under consideration were developed independently and superficially differ considerably from one another. Informally, differences between the formalisms can be explained in terms of the way in which they can be seen to extend cfg. For example, in addition to string concatenation, head grammars involve a wrapping operation with which one pair of strings can be wrapped around a second pair. In other respects head grammars are identical to cfg, since the derivation process involves context-free rewriting of members of a finite set of nonterminal symbols. Both combinatory categorial grammars and linear indexed grammars, on the other hand, involve only string concatenation. However, they differ from cfg in that their derivation process involves rewriting of unbounded stack-like structures.

* This work has been supported by NSF grants MCS-82-19116-CER, MCS-82-07294, DCR-84-10413 and IRI-8909810, ARO grant DAA29-84-9-0027, and DARPA grant N0014-85-K0018. The authors would like to thank Aravind Joshi and Mark Steedman for their important contributions to the research that led to this paper. We are also extremely grateful to Joost Engelfriet and the anonymous reviewers for their helpful critiques of earlier versions of the paper.
The status of tree adjoining grammars, a tree manipulating system, is ambiguous since it is possible to interpret tree adjoining grammars as extending cfg in either of these ways. Despite these differences, evidence existed suggesting that the weak generative capacity of these formalisms may be closely related. It appeared that while each of these formalisms had greater power than cfg, this extra power was very limited. In addition each formalism was apparently limited to a similar extent. This evidence involved the lack of an example of a language that could be generated by one formalism and not another. The languages { ww | w ∈ {a, b}* }, { w w^R w w^R | w ∈ {a, b}* } (where w^R is the reverse of w) and { a^n b^n c^n d^n | n ≥ 0 } are examples of languages that were known to be generated by all four formalisms, whereas the languages { www | w ∈ {a, b}* } and { a^n b^n c^n d^n e^n | n ≥ 0 } are not generated by any of the formalisms [17, 22]. The result presented in this paper is that all four of the formalisms under consideration generate exactly the same class of string languages. Preliminary versions of some of these results have appeared in [25, 28]. In Section 2 we define each of the formalisms and give the equivalence proofs in Section 3.

2 Definitions of Formalisms

Three of the four formalisms (tree adjoining grammars, head grammars, and combinatory categorial grammars) have been used as the notation underlying various linguistic theories, and linear indexed grammars have been discussed in [5] in connection with the relevance of indexed grammars to natural language. We will not discuss these theories here; however, as we define each formalism we will refer to relevant linguistic work.

2.1 Head Grammars

Head grammars (hg) were introduced in [15] where their linguistic relevance was investigated. Some of their formal properties were studied in [17]. They can be viewed as a generalization of cfg in which a wrapping operation is used in addition to concatenation. The nonterminals of a cfg derive strings of terminals. The nonterminals of a hg derive headed strings, or pairs of terminal strings (u, v), which we denote u↑v. In Pollard's original (equivalent) definition [15], headed strings involved the additional specification that either the last terminal symbol in u or the first terminal symbol in v was the head. We find it mathematically cleaner to view a headed string simply as a pair, since in Pollard's notation the empty headed string was rather problematic in that it contained no head.

Definition 2.1

A hg is a four-tuple G = (V_N, V_T, S, P) where V_N is a finite set of nonterminal symbols, V_T is a finite set of terminal symbols, S ∈ V_N is the start symbol and P is a finite set of productions of the form A → f(α_1, …, α_n) where A ∈ V_N, n ≥ 1, f ∈ { W, C_{1,n}, C_{2,n}, …, C_{n,n} }, α_1, …, α_n ∈ V_N ∪ (V_T* × V_T*), and if f = W then n = 2.

C_{i,n} : (V_T* × V_T*)^n → (V_T* × V_T*) is a concatenation operation where

C_{i,n}(u_1↑v_1, …, u_i↑v_i, …, u_n↑v_n) = u_1 v_1 … u_i↑v_i … u_n v_n

W : (V_T* × V_T*)^2 → (V_T* × V_T*) is the wrapping operation where

W(u_1↑v_1, u_2↑v_2) = u_1 u_2 ↑ v_2 v_1

Given a hg, G = (V_N, V_T, S, P), ⇒_G is defined as follows.

- u↑v ⇒_G^0 u↑v for all u↑v ∈ V_T* × V_T*.
- If A → f(α_1, …, α_n) ∈ P then A ⇒_G^k f(u_1↑v_1, …, u_n↑v_n) if α_i ⇒_G^{k_i} u_i↑v_i (1 ≤ i ≤ n) and k = 1 + Σ_{1 ≤ i ≤ n} k_i.

The string language L(G) generated by a hg, G = (V_N, V_T, S, P), is defined as

L(G) = { uv ∈ V_T* | S ⇒_G* u↑v }

where ⇒_G* = ∪_{k ≥ 0} ⇒_G^k.

Example 2.1

The hg G = ({ S, T }, { a, b, c, d }, S, P) generates the language { a^n b^n c^n d^n | n ≥ 0 } where P is as follows. Note that ε denotes the empty string.

P = { S → C_{1,1}(ε↑ε),  S → C_{2,3}(a↑ε, T, d↑ε),  T → W(S, b↑c) }

Consider the following derivation of the string aabbccdd.

S ⇒_G^1 C_{1,1}(ε↑ε) = ε↑ε
hence  T ⇒_G^2 W(ε↑ε, b↑c) = b↑c
hence  S ⇒_G^3 C_{2,3}(a↑ε, b↑c, d↑ε) = ab↑cd
hence  T ⇒_G^4 W(ab↑cd, b↑c) = abb↑ccd
hence  S ⇒_G^5 C_{2,3}(a↑ε, abb↑ccd, d↑ε) = aabb↑ccdd
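The two operations of Definition 2.1 are simple enough to state as executable code. The following is a minimal Python sketch (our own illustration, not from the paper) that models a headed string u↑v as a pair of strings and replays the derivation of aabbccdd above; the helper names C and W are assumptions of the sketch.

```python
# A headed string u↑v is modelled as a pair (u, v) of terminal strings.

def C(i, *args):
    """Concatenation C_{i,n}: keep the head position of the i-th argument
    (1-based); every other pair is flattened into plain string material."""
    u = "".join(a + b for a, b in args[: i - 1]) + args[i - 1][0]
    v = args[i - 1][1] + "".join(a + b for a, b in args[i:])
    return (u, v)

def W(p1, p2):
    """Wrapping: W(u1↑v1, u2↑v2) = u1 u2 ↑ v2 v1, i.e. the first headed
    string is wrapped around the second."""
    (u1, v1), (u2, v2) = p1, p2
    return (u1 + u2, v2 + v1)

# Replaying the derivation of Example 2.1:
s = C(1, ("", ""))                     # S: ε↑ε
t = W(s, ("b", "c"))                   # T: b↑c
s = C(2, ("a", ""), t, ("d", ""))      # S: ab↑cd
t = W(s, ("b", "c"))                   # T: abb↑ccd
s = C(2, ("a", ""), t, ("d", ""))      # S: aabb↑ccdd
```

The string derived is the concatenation of the two components, here aabbccdd.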

2.2 Tree Adjoining Grammars

Tree adjoining grammars (tag) were introduced in [8] and their formal properties investigated in [7, 22]. The linguistic use of tag is discussed in [12, 13, 10, 11, 18, 14].

Let N_+ denote the set of positive integers. D is a tree domain if it is a nonempty finite subset of N_+* such that if d ∈ D and d = d_1 d_2 then d_1 ∈ D, and if di ∈ D where i ∈ N_+ then dj ∈ D for all 1 ≤ j ≤ i. A tree is denoted by a partial function γ : N_+* → Σ where dom(γ) is a tree domain and Σ is the set of node labels (dom(γ) is the usual domain of the function γ). We say that the elements of a tree domain are addresses of the nodes of the tree and γ(d) is the label of the node with address d. Note that ε is the address of the root node of the tree. If d ∈ dom(γ) for some tree γ then γ/d, the subtree rooted at d in γ, is defined such that for all d' ∈ N_+*, γ/d(d') = γ(dd').

Tree adjoining grammars manipulate trees whose nodes are labelled either by terminal symbols or by triples of the form ⟨A, sa, oa⟩ where A is a nonterminal symbol, sa is a set of tree labels (that determines which of the trees of the grammar can be adjoined at that node) and oa is either true (indicating that adjunction is obligatory) or false (indicating that adjunction is optional). We call sa and oa the adjunction constraints at that node. A node at which the value of oa is true is said to have an OA constraint. A node at which sa = ∅ is said to have an NA constraint. Let V_N be a set of nonterminal symbols, V_T a set of terminal symbols, V_L a set of tree labels, and let V_T^ε = V_T ∪ { ε }.

Initial Trees. For each A ∈ V_N, init(V_N, V_T, V_L, A) is the set of trees γ : D → V_T^ε ∪ (V_N × 2^{V_L} × { true, false }) where D is a tree domain and the following hold.

- The root of γ is labelled ⟨A, sa, oa⟩ for some sa ⊆ V_L and oa ∈ { true, false }.
- All internal nodes of γ are labelled ⟨B, sa, oa⟩ for some B ∈ V_N, sa ⊆ V_L and oa ∈ { true, false }.
- All leaf nodes of γ are labelled by some u ∈ V_T^ε.

Auxiliary Trees. For each A ∈ V_N, aux(V_N, V_T, V_L, A) is the set of trees γ : D → V_T^ε ∪ (V_N × 2^{V_L} × { true, false }) where D is a tree domain and the following hold.

- The root of γ is labelled ⟨A, sa, oa⟩ for some sa ⊆ V_L and oa ∈ { true, false }.
- All internal nodes of γ are labelled ⟨B, sa, oa⟩ for some B ∈ V_N, sa ⊆ V_L and oa ∈ { true, false }.
- All leaf nodes of γ except one are labelled by some u ∈ V_T^ε. The remaining leaf node is called the foot node and is labelled ⟨A, sa, oa⟩ for some sa ⊆ V_L and oa ∈ { true, false }. The address of the foot node of γ is denoted ft(γ).

Let aux(V_N, V_T, V_L) = ∪_{A ∈ V_N} aux(V_N, V_T, V_L, A).

Elementary Trees. elem(V_N, V_T, V_L, A) = init(V_N, V_T, V_L, A) ∪ aux(V_N, V_T, V_L).



For each β ∈ aux(V_N, V_T, V_L) let spine(β) = { d | ft(β) = dd' for some d' ∈ N_+* }, i.e., spine(β) is the set of addresses on the spine of β, which is the path from the root to the foot node of β. We now define the tree adjunction operation

r : elem(V_N, V_T, V_L, A) × aux(V_N, V_T, V_L) × N_+* → elem(V_N, V_T, V_L, A)

for every A ∈ V_N. For γ ∈ elem(V_N, V_T, V_L, A), β ∈ aux(V_N, V_T, V_L), d ∈ dom(γ) we have γ' = r(γ, β, d) where for all d' ∈ N_+*

γ'(d') = γ(d')     if d is not a prefix of d'
γ'(d') = β(d'')    if d' = dd'' and d'' ∈ dom(β)
γ'(d') = γ(dd'')   if d' = d d_f d'' such that d'' ≠ ε and d_f = ft(β)

The adjunction operation is shown in Figure 1.

[Figure 1: Adjunction. The subtree of γ rooted at address d, whose root is labelled B, is excised; the auxiliary tree β, whose root and foot are labelled B, is inserted at d; and the excised material is reattached below the foot of β.]
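The three cases in the definition of r(γ, β, d) translate almost directly into code. The following hypothetical Python sketch (our own; the paper itself gives no code) represents a tree as a dict from address strings to labels, with single-digit child indices and the foot address of β passed in explicitly; adjunction constraints are ignored.

```python
def adjoin(gamma, beta, d, foot):
    """Return r(gamma, beta, d): splice beta in at address d of gamma and
    re-attach the excised material of gamma below beta's foot node."""
    out = {}
    for addr, label in gamma.items():
        if not addr.startswith(d):
            out[addr] = label              # case 1: d is not a prefix of addr
        elif addr != d:
            sub = addr[len(d):]            # case 3: material strictly below d
            out[d + foot + sub] = label    # moves below the foot of beta
    for addr, label in beta.items():       # case 2: beta occupies d itself
        out[d + addr] = label
    return out

# Toy instance: gamma = S(a, S) with the lower S at address "2";
# beta = S(b, S_foot, c) with foot address "2".
gamma = {"": "S", "1": "a", "2": "S"}
beta = {"": "S", "1": "b", "2": "S", "3": "c"}
result = adjoin(gamma, beta, "2", "2")
```

Note that the node of γ at address d itself contributes nothing: its label is superseded by the root and foot labels of β, exactly as in cases 2 and 3 of the definition.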

Definition 2.2

A tag is a seven-tuple G = (V_N, V_T, V_L, S, I, A, φ) where V_N is a finite set of nonterminals, V_T is a finite set of terminals, V_L is a finite set of tree labels, S ∈ V_N is a distinguished nonterminal, I is a finite subset of init(V_N, V_T, V_L, S), A is a finite subset of aux(V_N, V_T, V_L) and φ : A → V_L is a bijection, the auxiliary tree labelling function.

We use φ(β) to denote the result of applying φ to β. Given a tag G = (V_N, V_T, V_L, S, I, A, φ), ⇒_G is defined as follows. If γ ∈ elem(V_N, V_T, V_L, A') for some A' ∈ V_N, γ(d) = ⟨B, sa, oa⟩, β ∈ A ∩ aux(V_N, V_T, V_L, B), and φ(β) ∈ sa then

γ ⇒_G r(γ, β, d)

Note that the same nonterminal, B, must label the adjunction node and the root and foot of the adjoined tree. Note that for all γ ∈ elem(V_N, V_T, V_L, A'), β_1, β_2 ∈ aux(V_N, V_T, V_L), d_1 ∈ dom(γ), and d_2 ∈ dom(β_1)

r(r(γ, β_1, d_1), β_2, d_1 d_2) = r(γ, r(β_1, β_2, d_2), d_1)

Thus, for all γ ∈ elem(V_N, V_T, V_L, A'), β_1, β_2 ∈ aux(V_N, V_T, V_L) and d ∈ dom(γ), if γ ⇒_G* r(γ, β_1, d) and β_1 ⇒_G* β_2 then γ ⇒_G* r(γ, β_2, d), where ⇒_G* is the reflexive, transitive closure of ⇒_G.

A tree γ has no OA nodes if for all d ∈ dom(γ) either γ(d) = ⟨A', sa, false⟩ for some A' and sa, or γ(d) = u for some u ∈ V_T^ε. The tree language T(G) generated by G = (V_N, V_T, V_L, S, I, A, φ) is defined as

T(G) = { γ | α ⇒_G* γ for some α ∈ I and γ has no OA nodes }

In defining yield(γ), the yield of a tree γ, we consider only symbols in V_T and V_N, i.e., we do not give the SA or OA constraints of nodes involving nonterminals. For a tree γ:

- if γ consists of a single node, then if γ(ε) = u for some u ∈ V_T^ε then yield(γ) = u, and if γ(ε) = ⟨A, sa, oa⟩ for some A, sa and oa then yield(γ) = A;
- otherwise, if the root of γ has k children (k > 0) then yield(γ) = yield(γ/1) … yield(γ/k).

The string language, L(G), generated by G is defined as L(G) = { yield(γ) | γ ∈ T(G) }.

When displaying a tree graphically we use several conventions. The nonterminal or terminal component of a node label is always shown. The adjunction constraints for a node labelled ⟨A, sa, oa⟩ are specified as follows. When sa is the empty set the node is annotated NA. The case in which sa includes the labels of all auxiliary trees in the grammar that are in aux(V_N, V_T, V_L, A) (note that A is the same) is the default, so no annotation is used. Otherwise, the set sa is given in full. The case in which oa is false is the default and no annotation is used. When oa is true the node is annotated OA.
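The definition of yield(γ) can likewise be sketched over an address-dict representation of trees (a hypothetical encoding of ours, with single-digit child indices): nonterminal labels are tuples whose first component is the nonterminal, terminal labels are plain strings.

```python
def tree_yield(tree, d=""):
    """Concatenate the leaf symbols of the subtree at address d, left to
    right; a leaf labelled by a triple contributes its nonterminal only."""
    children = []
    i = 1
    while d + str(i) in tree:
        children.append(d + str(i))
        i += 1
    if not children:                        # leaf node
        label = tree[d]
        return label if isinstance(label, str) else label[0]
    return "".join(tree_yield(tree, c) for c in children)

# A toy tree with two S nodes and yield "abcd" (ε is the empty string):
t = {"": ("S", set(), False), "1": "a", "2": ("S", set(), False),
     "21": "b", "22": "", "23": "c", "3": "d"}
```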

Example 2.2

The tag G = ({ S }, { a, b, c, d }, { l }, S, { α }, { β }, φ), with φ(β) = l, generates the language { a^n b^n c^n d^n | n ≥ 0 }. The trees α and β and a derivation of the string aabbccdd are given in Figure 2.

[Figure 2: Example of a tag. The initial tree α consists of an S node dominating ε. The auxiliary tree β has a root labelled S (annotated NA) with children a, S and d; the inner S node has children b, the foot node S (annotated NA) and c. Adjoining β at the S node of α, and then again at the unannotated S node of the resulting tree, yields a derived tree with yield aabbccdd.]

2.3 Linear Indexed Grammars

Linear indexed grammars (lig) were first discussed, though not named, in [5].² In a lig, strings are derived from nonterminals with an associated stack, denoted A[l_1 … l_n] where A is a nonterminal, each l_i is a stack symbol for 1 ≤ i ≤ n, and l_n is the top of the stack. A[] denotes the nonterminal A associated with the empty stack. Since, during a derivation, stacks can grow to be of unbounded size, we need some way of partially specifying unbounded stacks in lig productions. We use A[◦◦ l_1 l_2 … l_n] to denote the nonterminal A associated with any stack whose top n symbols are l_1, l_2, …, l_n, where n ≥ 0. We denote the set of all nonterminals in V_N associated with stacks whose symbols come from V_I by the expression V_N[V_I*].

² In [4] the name linear indexed grammars was also used to name a different restriction of indexed grammars [1] in which only one nonterminal can appear on the right of a production.

Definition 2.3

A lig is a five-tuple G = (V_N, V_T, V_I, S, P) where V_N is a finite set of nonterminals, V_T is a finite set of terminals, V_I is a finite set of indices (stack symbols), S ∈ V_N is the start symbol and P is a finite set of productions, having one of the following two forms.

A[◦◦ σ] → Γ_1 A'[◦◦ σ'] Γ_2        A[] → Γ

where A, A' ∈ V_N, σ, σ' ∈ V_I*, and Γ, Γ_1, Γ_2 ∈ (V_N[V_I*] ∪ V_T)*.

Given a lig G = (V_N, V_T, V_I, S, P), ⇒_G is defined as follows.

- If A[◦◦ σ] → Γ_1 A'[◦◦ σ'] Γ_2 ∈ P then for all Δ_1, Δ_2 ∈ (V_N[V_I*] ∪ V_T)* and σ'' ∈ V_I* we have

  Δ_1 A[σ'' σ] Δ_2 ⇒_G Δ_1 Γ_1 A'[σ'' σ'] Γ_2 Δ_2

  In this derivation step we say that the occurrence of A'[σ'' σ'] shown on the right is the distinguished successor of the occurrence of A[σ'' σ] shown on the left.

- If A[] → Γ ∈ P then for all Δ_1, Δ_2 ∈ (V_N[V_I*] ∪ V_T)*

  Δ_1 A[] Δ_2 ⇒_G Δ_1 Γ Δ_2

Let distinguished descendent be the reflexive, transitive closure of distinguished successor. The language, L(G), generated by G is

L(G) = { w ∈ V_T* | S[] ⇒_G* w }

where ⇒_G* is the reflexive, transitive closure of ⇒_G.

Example 2.3

The lig G = ({ S, T }, { a, b, c, d }, { l }, S, P) generates the language { a^n b^n c^n d^n | n ≥ 0 } where P is as follows.

P = { S[◦◦] → a S[◦◦ l] d,  S[◦◦] → T[◦◦],  T[◦◦ l] → b T[◦◦] c,  T[] → ε }

The following is a derivation of the string aabbccdd.

S[] ⇒_G a S[l] d
    ⇒_G aa S[ll] dd
    ⇒_G aa T[ll] dd
    ⇒_G aab T[l] cdd
    ⇒_G aabb T[] ccdd
    ⇒_G aabbccdd
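The lig of Example 2.3 is small enough to run directly. In the hypothetical sketch below (our own), each nonterminal becomes a function that receives the index stack explicitly; the distinguished occurrence on each right-hand side is the recursive call that inherits the stack.

```python
def derive_S(stack, n_left):
    """S[◦◦] → a S[◦◦ l] d (push l), applied n_left times, then S[◦◦] → T[◦◦]."""
    if n_left > 0:
        return "a" + derive_S(stack + ["l"], n_left - 1) + "d"
    return derive_T(stack)

def derive_T(stack):
    """T[◦◦ l] → b T[◦◦] c (pop l); T[] → ε once the stack is empty."""
    if stack and stack[-1] == "l":
        return "b" + derive_T(stack[:-1]) + "c"
    assert not stack
    return ""

word = derive_S([], 2)    # the derivation of aabbccdd shown above
```

Because the a/d pairs are produced while pushing and the b/c pairs while popping, the counts necessarily agree, which is exactly how the stack enforces a^n b^n c^n d^n.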

2.4 Combinatory Categorial Grammars

Combinatory categorial grammars (ccg) are an extension of classical categorial grammars [2] in which function composition is used in addition to application. The version of ccg that we consider was developed by Steedman [21, 20]. The objects that derive terminals in a ccg are categories. The set of categories, cat(V_N), over the alphabet V_N, is the smallest set such that:

- V_N ⊆ cat(V_N) (members of V_N are the atomic categories), and
- if c_1 and c_2 are categories in cat(V_N) then (c_1/c_2) and (c_1\c_2) are in cat(V_N).

Intuitively, values having the categories (c_1/c_2) and (c_1\c_2) are functions that can combine with a value of category c_2 at their right and left, respectively, to give a value having the category c_1.

For each category c ∈ cat(V_N) we define its target category, target(c) ∈ V_N, argument categories args(c) ⊆ cat(V_N) and arity arity(c) ∈ N.

- For c ∈ V_N: target(c) = c, args(c) = ∅ and arity(c) = 0.
- For categories (c_1/c_2) and (c_1\c_2):
  - target((c_1/c_2)) = target((c_1\c_2)) = target(c_1),
  - args((c_1/c_2)) = args((c_1\c_2)) = args(c_1) ∪ { c_2 }, and
  - arity((c_1/c_2)) = arity((c_1\c_2)) = 1 + arity(c_1).

For example,

target((((A/B)\(C/D))\E)) = A
args((((A/B)\(C/D))\E)) = { E, (C/D), B }
arity((((A/B)\(C/D))\E)) = 3
target((((S\NP)/NP)/PP)) = S
args((((S\NP)/NP)/PP)) = { PP, NP }
arity((((S\NP)/NP)/PP)) = 3

An object of category c can be combined with arity(c) objects (whose categories are in args(c)) to give an object having the atomic category target(c). Since the size of categories derived in a ccg can be unbounded, we need some way of denoting unbounded categories. We use terms of the form (…((x|_1 x_1)|_2 x_2) … |_n x_n) where n ≥ 0, x, x_1, …, x_n are variables ranging over cat(V_N) and |_i ∈ { /, \ } (1 ≤ i ≤ n). In addition, for each variable x and atomic category A ∈ V_N we have the target-restricted variable x^A, which ranges over { c ∈ cat(V_N) | target(c) = A }, and we say that the variable x has been target-restricted to A. We define the set of combinatory schemata C as follows: C = ∪_{n ≥ 0} (F_n ∪ B_n) where for each n ≥ 0

- the set of forward schemata F_n consists of all

  (x/y) (…(y|_1 z_1)|_2 … |_n z_n) → (…(x|_1 z_1)|_2 … |_n z_n)    with |_i ∈ { /, \ };

- the set of backward schemata B_n consists of all

  (…(y|_1 z_1)|_2 … |_n z_n) (x\y) → (…(x|_1 z_1)|_2 … |_n z_n)    with |_i ∈ { /, \ }.

In these schemata (x|y) and (…(y|_1 z_1)|_2 … |_n z_n) are called the primary and secondary components, respectively. Each grammar restricts itself to the use of a finite selection of instances of schemata in C, which we call combinatory rules. For each r ∈ C such that (x|y) and (…(y|_1 z_1)|_2 … |_n z_n) are the primary and secondary components of r, respectively, let

r(V_N) = { r[(x, w), (y, w'), (z_1, w_1), …, (z_n, w_n)] | w ∈ { x } ∪ { x^A | A ∈ V_N }, w' ∈ { y } ∪ cat(V_N) and w_i ∈ { z_i } ∪ cat(V_N) for 1 ≤ i ≤ n }

where r[(x_1, w_1), …, (x_k, w_k)] denotes the result of substituting w_i for x_i in r (1 ≤ i ≤ k). The set of combinatory rules C(V_N) is as follows.

C(V_N) = ∪_{r ∈ C} r(V_N)

We extend the terms primary and secondary components to members of C(V_N) in the obvious way. c'c'' → c''' is a ground instance of r ∈ C(V_N) if there are c, c_0, …, c_n ∈ cat(V_N) such that (c'c'' → c''') = r[(w, c), (w', c_0), (w_1, c_1), …, (w_n, c_n)] and the following hold.

- The primary and secondary components of r are (w|w') and (…(w'|_1 w_1)|_2 … |_n w_n), respectively.
- If w is a target-restricted variable x^A then target(c) = A.
- If w_i ∈ cat(V_N) then c_i = w_i (0 ≤ i ≤ n).

Definition 2.4

A ccg is a five-tuple (V_N, V_T, S, f, R) where V_N is a finite set of nonterminals (atomic categories), V_T is a finite set of terminals (lexical items), S is a distinguished member of V_N, f is a function that maps elements of V_T ∪ { ε } to finite subsets of cat(V_N), and R is a finite subset of C(V_N).

Each step in the derivations of a ccg, G = (V_N, V_T, S, f, R), involves the use of a combinatory rule in R. For every φ_1, φ_2 ∈ cat(V_N)*

φ_1 c φ_2 ⇒_G φ_1 c' c'' φ_2

if there is some r ∈ R such that c'c'' → c is a ground instance of r. Whichever of c' or c'' is the primary component of the rule is said to be the primary child of c with respect to this derivation. Let the primary descendent relation be the reflexive, transitive closure of the primary child relation. Note that in defining the derivations we are applying the rules backwards.

The string language, L(G), generated by G is defined as follows.

L(G) = { w_1 … w_n | S ⇒_G* c_1 … c_n for some c_1, …, c_n ∈ cat(V_N) and c_i ∈ f(w_i) where w_i ∈ V_T ∪ { ε } (1 ≤ i ≤ n) }

Example 2.4

The ccg G = ({ S, T, A, B, D }, { a, b, c, d }, S, f, R) generates the language { a^n b^n c^n d^n | n ≥ 0 } where f and R are as follows. Note that in this example parentheses have been omitted except when needed to override the left associativity of the two slashes.

f(a) = { (A/D) }
f(b) = { B }
f(c) = { (T\A/T\B) }
f(d) = { D }
f(ε) = { (S/T), T }

(x^S/T) (T\A/T\B) → (x^S\A/T\B)    (rule 1)
(A/D) (x^S\A) → (x^S/D)            (rule 2)
(x^S/y) y → x^S                    (rule 3)
y (x^S\y) → x^S                    (rule 4)

The following is a derivation of a string of categories that map to the string aabbccdd, using the definition of L(G) above.

S ⇒_G (S/D) D                                               rule 3
  ⇒_G (A/D) (S\A) D                                         rule 2
  ⇒_G (A/D) (S\A/D) D D                                     rule 3
  ⇒_G (A/D) (A/D) (S\A\A) D D                               rule 2
  ⇒_G (A/D) (A/D) (S\A\A/T) T D D                           rule 3
  ⇒_G (A/D) (A/D) B (S\A\A/T\B) T D D                       rule 4
  ⇒_G (A/D) (A/D) B (S\A/T) (T\A/T\B) T D D                 rule 1
  ⇒_G (A/D) (A/D) B B (S\A/T\B) (T\A/T\B) T D D             rule 4
  ⇒_G (A/D) (A/D) B B (S/T) (T\A/T\B) (T\A/T\B) T D D       rule 1
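The functions target, args and arity of Section 2.4 have a direct implementation once categories are given a concrete shape. The nested-tuple encoding below, (result, slash, argument), is an assumption of this sketch (the paper defines the functions only abstractly).

```python
def target(c):
    """target((c1/c2)) = target((c1\\c2)) = target(c1); an atom is its own target."""
    while isinstance(c, tuple):
        c = c[0]
    return c

def args(c):
    """The set of argument categories of c."""
    out = set()
    while isinstance(c, tuple):
        out.add(c[2])
        c = c[0]
    return out

def arity(c):
    """The number of arguments c combines with to reach its target."""
    n = 0
    while isinstance(c, tuple):
        n += 1
        c = c[0]
    return n

# (((A/B)\(C/D))\E) from the worked example in Section 2.4:
c = ((("A", "/", "B"), "\\", ("C", "/", "D")), "\\", "E")
```

With this encoding, target(c) is A, arity(c) is 3, and args(c) contains E, (C/D) and B, matching the example.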

Example 2.5

Figure 3 gives a derivation for the sentence John might eat apples. Note that in the tree, parentheses have only been included when needed to override the left associativity of the slashes. The rule B0 is applied with (S\NP) and NP as the primary and secondary constituents, respectively. The rule F0 is applied with ((S\NP)/NP) and NP as the primary and secondary constituents, respectively. The rule F1 is applied with ((S\NP)/(S\NP)) and ((S\NP)/NP) as the primary and secondary constituents, respectively. John and apples are in f(NP), eat is in f(((S\NP)/NP)) and might is in f(((S\NP)/(S\NP))).

[Figure 3: Example ccg derivation. S splits into NP (John) and S\NP; S\NP splits into S\NP/NP and NP (apples); S\NP/NP splits into S\NP/(S\NP) (might) and S\NP/NP (eat).]

3 Proofs of Equivalence

Let the classes of languages generated by ccg, hg, lig, and tag be ccl, hl, lil, and tal, respectively. In this section we show that they are identical by proving that ccl ⊆ lil, lil ⊆ hl, hl ⊆ tal and tal ⊆ ccl.

3.1 ccl ⊆ lil

We show that only one of the variables used in the combinatory rules of a ccg ranges over an unbounded number of categories. The use of the other variables can be viewed as merely a convenient way of describing a finite number of alternative rules.

Lemma 3.1

Given a ccg, G = (V_N, V_T, S, f, R), c, c_1, …, c_n ∈ cat(V_N), and w_1, …, w_n ∈ V_T ∪ { ε }, if c ⇒_G* c_1 … c_n and c_i ∈ f(w_i) (1 ≤ i ≤ n) then args(c) ⊆ args(G), where

args(G) = { c | c ∈ args(c') for some c' ∈ f(w) and w ∈ V_T ∪ { ε } }

Proof. This follows by trivial induction on the length of the derivation in G.

Lemma 3.2

For every ccg, G = (V_N, V_T, S, f, R), there is an equivalent ccg G' = (V_N, V_T, S, f, R') such that the rules in R' have one of the following two forms.

(x^A/c) (…(c|_1 c_1)|_2 … |_n c_n) → (…(x^A|_1 c_1)|_2 … |_n c_n)

(…(c|_1 c_1)|_2 … |_n c_n) (x^A\c) → (…(x^A|_1 c_1)|_2 … |_n c_n)

where n ≥ 0, |_i ∈ { /, \ } (1 ≤ i ≤ n), A ∈ V_N and c, c_1, …, c_n ∈ cat(V_N).

Proof. From a ccg G = (V_N, V_T, S, f, R) we construct the ccg G' = (V_N, V_T, S, f, R') satisfying the conditions of Lemma 3.2 such that L(G) = L(G'). For each r ∈ R, from Lemma 3.1 we know that for any ground instance of the rule r used to derive a string in L(G), the categories substituted for variables in the secondary component of r must be members of the finite set args(G). Thus, for every A ∈ V_N and c_0, …, c_n ∈ args(G) include the rule

r[(w, x^A), (w', c_0), (w_1, c_1), …, (w_n, c_n)]

in R' when the following conditions hold.

- The primary and secondary components of r are (w|w') and (…(w'|_1 w_1)|_2 … |_n w_n), respectively.
- w is either the variable x or the target-restricted variable x^A.
- If w_i ∈ cat(V_N) then w_i = c_i (0 ≤ i ≤ n).

Let G = (V_N, V_T, S, f, R) be a ccg satisfying the conditions of Lemma 3.2. We construct an equivalent lig, G' = (V_N, V_T, V_I, S, P) where V_I = { /, \, (, ) } ∪ V_N. The proof is straightforward since the rules in R can be seen as simple notational variants of lig productions. For c ∈ cat(V_N) and | ∈ { /, \ } we call |c a directional category. Consider the category (…(A|_1 c_1) … |_n c_n) where A ∈ V_N is the target category and c_1, …, c_n are the arguments of the category. This can be viewed as the atomic category A associated with n arguments |_i c_i that are directional categories (1 ≤ i ≤ n). The combinatory rules of G manipulate the string of directional categories in a stack-like way that can be imitated by the lig G'. We begin by defining a translation function enc : cat(V_N) → V_N[V_I*] that determines how the categories of G will be encoded in the lig G'.

- For A ∈ V_N let enc(A) = A[].
- enc((c_1/c_2)) = A[σ /c_2] and enc((c_1\c_2)) = A[σ \c_2] where enc(c_1) = A[σ]. For example, enc(((A/(B\C))\D)) = A[/(B\C)\D].
- Consider the forward rule

  (x^A/c_0) (…(c_0|_1 c_1)|_2 … |_n c_n) → (…(x^A|_1 c_1)|_2 … |_n c_n) ∈ R

  where n ≥ 0 and for 1 ≤ i ≤ n, |_i ∈ { /, \ }, A ∈ V_N and c_0, c_1, …, c_n ∈ cat(V_N). Corresponding to this rule we add the following production to P:

  A[◦◦ σ] → A[◦◦ /c_0] B[σ']

  where σ = |_1 c_1 … |_n c_n and enc((…(c_0|_1 c_1)|_2 … |_n c_n)) = B[σ'].
- The corresponding backward rule is treated in a similar way.
- If c ∈ f(w) for w ∈ V_T ∪ { ε } then let A[σ] → w ∈ P where enc(c) = A[σ].

Lemma 3.3

c ⇒_G c_1 c_2 if and only if enc(c) ⇒_{G'} enc(c_1) enc(c_2)

Proof. This follows trivially from the above construction.

Thus, S ⇒_G* c_1 … c_n if and only if S[] ⇒_{G'}* enc(c_1) … enc(c_n), and for each 1 ≤ i ≤ n, enc(c_i) → w if and only if c_i ∈ f(w). Thus it is clear that L(G) = L(G').
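The encoding enc is a simple recursion on the category structure. The following hypothetical sketch (our own; it reuses a nested-tuple category encoding (result, slash, argument)) returns the target nonterminal together with the stack of directional categories, bottom first.

```python
def enc(c):
    """Return (target, stack): the stack lists the directional categories
    |1 c1 ... |n cn innermost first, so the last entry is the stack top."""
    if isinstance(c, str):          # atomic category: A[] has an empty stack
        return (c, [])
    result, slash, arg = c          # c = (c1, "/", c2) or (c1, "\\", c2)
    target, stack = enc(result)
    return (target, stack + [(slash, arg)])

# enc(((A/(B\C))\D)) = A[ /(B\C) \D ], as in the text:
cat = (("A", "/", ("B", "\\", "C")), "\\", "D")
encoded = enc(cat)
```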

Example 3.1

G' = (V_N, V_T, S, f, R') is a ccg equivalent to G = (V_N, V_T, S, f, R) of Example 2.4 and satisfying the conditions of Lemma 3.2. The above construction converts G' into the lig

G'' = (V_N, V_T, { /, \, (, ) } ∪ V_N, S, P)

Note that args(G) = { A, B, D, T }.

First, we show how the rules in R' are derived from those in R.

Rule in R                              Corresponding rule(s) in R'
(x^S/T) (T\A/T\B) → (x^S\A/T\B)        unchanged
(A/D) (x^S\A) → (x^S/D)                unchanged
(x^S/y) y → x^S                        (x^S/A) A → x^S
                                       (x^S/B) B → x^S
                                       (x^S/D) D → x^S
                                       (x^S/T) T → x^S
y (x^S\y) → x^S                        A (x^S\A) → x^S
                                       B (x^S\B) → x^S
                                       D (x^S\D) → x^S
                                       T (x^S\T) → x^S

Second, we show those productions in P that are derived from f.

Component of f            Corresponding production in P
(A/D) ∈ f(a)              A[/D] → a
B ∈ f(b)                  B[] → b
D ∈ f(d)                  D[] → d
(T\A/T\B) ∈ f(c)          T[\A/T\B] → c
(S/T) ∈ f(ε)              S[/T] → ε
T ∈ f(ε)                  T[] → ε

Third, we show how the remaining productions in P are derived from the rules in R'.

Rule in R'                             Corresponding production in P
(x^S/T) (T\A/T\B) → (x^S\A/T\B)        S[◦◦ \A/T\B] → S[◦◦ /T] T[\A/T\B]
(A/D) (x^S\A) → (x^S/D)                S[◦◦ /D] → A[/D] S[◦◦ \A]
(x^S/A) A → x^S                        S[◦◦] → S[◦◦ /A] A[]
(x^S/B) B → x^S                        S[◦◦] → S[◦◦ /B] B[]
(x^S/D) D → x^S                        S[◦◦] → S[◦◦ /D] D[]
(x^S/T) T → x^S                        S[◦◦] → S[◦◦ /T] T[]
A (x^S\A) → x^S                        S[◦◦] → A[] S[◦◦ \A]
B (x^S\B) → x^S                        S[◦◦] → B[] S[◦◦ \B]
D (x^S\D) → x^S                        S[◦◦] → D[] S[◦◦ \D]
T (x^S\T) → x^S                        S[◦◦] → T[] S[◦◦ \T]

3.2 lil ⊆ hl

This proof is derived from the standard technique used in the simulation of pushdown automata by context-free grammars. In that proof nonterminals are triples ⟨p, q, l⟩ where p is the state of the machine when the stack symbol l is placed on top of the pushdown and q is the state of the machine at the point that l is eventually removed. This nonterminal derives the terminal string w if w is the string read by the machine between these two points in the computation. In our proof nonterminals of the lig take the place of states and we have nonterminals in the hg that are triples of the form ⟨A, B, l⟩. Figure 4 shows a derivation in which B[] is the distinguished descendent of A[l] and the stack symbol l has been popped. The terminal yield between these two points can be expressed as a pair w_1↑w_2, and by wrapping this around the yield of B[] (the string u), the string derived from A[l] is obtained (the string w_1 u w_2).

[Figure 4: l is popped from stack. A derivation tree rooted at A[l] contains an occurrence of B[] that is a distinguished descendent of A[l]; w_1 is the terminal yield to the left of B[], u is the yield of B[], and w_2 is the yield to the right of B[].]

Let G = (V_N, V_T, V_I, S, P) be a lig where, without loss of generality, we assume that productions in P have the following forms, where A, A_1, …, A_n ∈ V_N, l ∈ V_I and w ∈ V_T ∪ { ε }.

- A[◦◦] → A_1[] … A_{i-1}[] A_i[◦◦ l] A_{i+1}[] … A_n[]
- A[◦◦ l] → A_1[] … A_{i-1}[] A_i[◦◦] A_{i+1}[] … A_n[]
- A[] → w

We construct a hg, G' = (V_N', V_T, S, P'), where

V_N' = V_N ∪ { (A_1, A_2, η) | A_1, A_2 ∈ V_N and η ∈ V_I ∪ { ε } }

and P' is as follows.

- If A[◦◦] → A_1[] … A_{i-1}[] A_i[◦◦ l] A_{i+1}[] … A_n[] ∈ P then for all B ∈ V_N let
  (A, B, ε) → C_{i,n}(A_1, …, A_{i-1}, (A_i, B, l), A_{i+1}, …, A_n) ∈ P'
- If A[◦◦ l] → A_1[] … A_{i-1}[] A_i[◦◦] A_{i+1}[] … A_n[] ∈ P then for all B ∈ V_N let
  (A, B, l) → C_{i,n}(A_1, …, A_{i-1}, (A_i, B, ε), A_{i+1}, …, A_n) ∈ P'
- If A[] → w ∈ P then let A → C_{1,1}(ε↑w) ∈ P'
- For every A, B, C ∈ V_N, and η ∈ V_I ∪ { ε } let
  (A, B, η) → W((A, C, ε), (C, B, η)) ∈ P'
  (A, A, ε) → C_{1,1}(ε↑ε) ∈ P'
  A → W((A, B, ε), B) ∈ P'

To show that L(G) = L(G') we prove Lemma 3.4.
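The production-building step of this construction is mechanical, and a toy instance may make the bookkeeping clearer. The sketch below is our own; the grammar, with V_N = {A, B} and V_I = {l}, is invented for illustration. Each hg production is rendered as a tuple (left-hand side, operation, argument list), and P' is enumerated for one push production, one pop production, and the general wrapping and termination productions.

```python
V_N = ["A", "B"]
V_I = ["l"]
EPS = ""   # stands for the eta value ε (no stack symbol)

P_hg = []
# lig push production  A[◦◦] → A[◦◦ l] B[]   (i = 1, n = 2):
for X in V_N:
    P_hg.append((("A", X, EPS), "C_1,2", [("A", X, "l"), "B"]))
# lig pop production   A[◦◦ l] → B[] A[◦◦]   (i = 2, n = 2):
for X in V_N:
    P_hg.append((("A", X, "l"), "C_2,2", ["B", ("A", X, EPS)]))
# general wrapping, zero-length and termination productions:
for X in V_N:
    for Y in V_N:
        P_hg.append((X, "W", [(X, Y, EPS), Y]))
        for Z in V_N:
            for eta in V_I + [EPS]:
                P_hg.append(((X, Y, eta), "W", [(X, Z, EPS), (Z, Y, eta)]))
    P_hg.append(((X, X, EPS), "C_1,1", [EPS]))
```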

Lemma 3.4 Both of the following two statements hold. Part 1 For all A; B 2 VN ,  2 VI [ f  g and w1; w2 2 VT A[] =G) w1B []w2 where B [] is a distinguished descendent of A[] if and only if

(A; B; ) =G) 0 w1" w2

Part 2 For all A 2 VN and w 2 VT A[] =G) w if and only if w = w1w2 and A =G) 0 w1" w2 for some w1 and w2 . Proof We prove this by simultaneously inducting on the length of derivations in both parts of the lemma. There are four implications to consider. Basis: Part 1 only if Consider the zero-step derivation A[] =G) w1B []w2 where A = B ,  = , and w1 = w2 = . Using the production (A; A; ) ! C1;1(") of P 0 we have the desired

derivation in G0. Part 1 if The only one-step derivation of the appropriate form is (A; A; ) =G) 0 " which  involves use of the production (A; A; ) ! C1;1(") 2 P 0. Since =G) is re exive we have A[] =G) A[].

Part 2 A[] =G) w if and only if w 2 VT and A[] ! w 2 P if and only if A ! C1;1("w) 2 P 0 if and only if A =G) 0 "w. Inductive step:

Part 1 only if Suppose that A[] =Gk) w1B []w2 where B [] is a distinguished descendent of A[].

 If the derivation begins with the production A[ ] ! A1[] : : : Ai?1[]Ai[ l]Ai+1[] : : : An[] 16

then it can be decomposed as follows. A[] =G) A1[] : : :Ai?1[]Ai[l]Ai+1[] : : : An[] 1 =kG) uAAi[l]vA k2 =G) uAuiB 0[]vivA 3 =kG) uAuiuB0 B []vB0 vivA = w1B []w2 where k1, k20, and k3 are less than k and for each 1  j  n such that j 6= i we kj wAj where kj < k and uA = wA1 : : : wAi?1 and vA = wAi+1 : : : wAn , have Aj [] =G) and we have the following derivations k2 1 +) Ai[l] k= uiB 0[]vi G

and

3 u 0 B []v 0 B 0[] =kG) B B

By the inductive hypothesis we have (B 0; B; ) =G) (Ai; B 0; l) =G) 0 uB 0 " vB 0 0 ui "vi ;

and

Aj =G) 0 uj " vj

for each 1  j  n such that j 6= i and wAj = uj vj . By the construction we know that the following productions are in P 0. (A; B 0; ) ! Ci;n(A1; : : :; Ai?1; (Ai; B 0; l); Ai+1; : : : ; An) (A; B; ) ! W ((A; B 0; ); (B 0; B; )) Then we have (A; B 0; ) =G) Ci;n(u1"v1; : : :; ui?1"vi?1; ui"vi; ui+1"vi+1; : : :; un"vn) 0 = u1v1 : : : ui?1vi?1ui"viui+1vi+1 : : : unvn thus, (A; B; ) =G) W (u1v1 : : :ui?1vi?1ui"viui+1vi+1 : : : unvn; uB0 "vB0 ) 0 = u1v1 : : : ui?1vi?1uiuB0 "vB0 viui+1vi+1 : : :un vn = w 1 " w2

• Alternatively, suppose that the derivation begins with the production

A[··l] → A1[] … A_{i−1}[] A_i[··] A_{i+1}[] … A_n[]

In this case we know that γ = l, and the derivation can be decomposed as follows:

A[l] ⇒_G A1[] … A_{i−1}[] A_i[] A_{i+1}[] … A_n[]
⇒*_G u_A A_i[] v_A
⇒*_G u_A u_i B[] v_i v_A = w1 B[] w2

where for each 1 ≤ j ≤ n such that j ≠ i we have A_j[] ⇒*_G w_{A_j}, with u_A = w_{A_1} … w_{A_{i−1}} and v_A = w_{A_{i+1}} … w_{A_n}, and we have A_i[] ⇒*_G u_i B[] v_i. By the inductive hypothesis we have

(A_i, B, ε) ⇒*_{G′} u_i↑v_i   and   A_j ⇒*_{G′} u_j↑v_j

for each 1 ≤ j ≤ n such that j ≠ i, where w_{A_j} = u_j v_j. By the construction we know that the following production is in P′:

(A, B, l) → C_{i,n}(A1, …, A_{i−1}, (A_i, B, ε), A_{i+1}, …, A_n)

Hence, we have the following derivation:

(A, B, l) ⇒*_{G′} C_{i,n}(u1↑v1, …, u_n↑v_n) = w1↑w2

Part 1, if. Suppose that (A, B, γ) ⇒^k_{G′} w1↑w2.

• Suppose that γ = ε and this derivation begins with use of the production

(A, B, ε) → C_{i,n}(A1, …, A_{i−1}, (A_i, B, l), A_{i+1}, …, A_n)

and that w1↑w2 = u1v1 … u_i↑v_i … u_nv_n, where (A_i, B, l) ⇒^{k_i}_{G′} u_i↑v_i and for each 1 ≤ j ≤ n such that j ≠ i we have A_j ⇒^{k_j}_{G′} u_j↑v_j, where k1, …, k_n are all less than k. Thus, by induction we know that for each 1 ≤ j ≤ n such that j ≠ i we have the derivation A_j[] ⇒*_G u_j v_j, and that A_i[l] ⇒*_G u_i B[] v_i. By the construction we know that P contains the following production:

A[··] → A1[] … A_{i−1}[] A_i[··l] A_{i+1}[] … A_n[]

Thus, we have the following derivation in G:

A[] ⇒_G A1[] … A_{i−1}[] A_i[l] A_{i+1}[] … A_n[]
⇒*_G u1v1 … u_{i−1}v_{i−1} u_i B[] v_i u_{i+1}v_{i+1} … u_nv_n = w1 B[] w2

• Alternatively, suppose that γ ∈ V_I and this derivation begins with use of the production

(A, B, γ) → C_{i,n}(A1, …, A_{i−1}, (A_i, B, ε), A_{i+1}, …, A_n)

and that w1↑w2 = u1v1 … u_i↑v_i … u_nv_n, where for each 1 ≤ j ≤ n such that j ≠ i we have A_j ⇒^{k_j}_{G′} u_j↑v_j and (A_i, B, ε) ⇒^{k_i}_{G′} u_i↑v_i, where each k_j is less than k (1 ≤ j ≤ n). Thus, by induction we know that for each 1 ≤ j ≤ n such that j ≠ i we have the derivation A_j[] ⇒*_G u_j v_j, as well as the derivation A_i[] ⇒*_G u_i B[] v_i. By the construction we know that P contains the following production:

A[··γ] → A1[] … A_{i−1}[] A_i[··] A_{i+1}[] … A_n[]

Thus, we have the following derivation in G:

A[γ] ⇒_G A1[] … A_{i−1}[] A_i[] A_{i+1}[] … A_n[]
⇒*_G u1v1 … u_{i−1}v_{i−1} A_i[] u_{i+1}v_{i+1} … u_nv_n
⇒*_G u1v1 … u_{i−1}v_{i−1} u_i B[] v_i u_{i+1}v_{i+1} … u_nv_n = w1 B[] w2


• Alternatively, suppose that the derivation begins with use of the production

(A, B, γ) → W((A, C, ε), (C, B, γ))

where γ ∈ V_I ∪ {ε} and w1↑w2 = u1u2↑v2v1, such that (A, C, ε) ⇒^{k1}_{G′} u1↑v1 and (C, B, γ) ⇒^{k2}_{G′} u2↑v2, where k1 and k2 are less than k. Thus, by induction we know that A[γ] ⇒*_G u1 C[γ] v1 and C[γ] ⇒*_G u2 B[] v2. Thus, we have the following derivation in G:

A[γ] ⇒*_G u1 C[γ] v1 ⇒*_G u1 u2 B[] v2 v1 = w1 B[] w2

Part 2, only if. Suppose that A[] ⇒^k_G w. In this case we have the following derivation:

A[] ⇒_G A1[] … A_{i−1}[] A_i[l] A_{i+1}[] … A_n[]
⇒^{k1}_G u_A A_i[l] v_A
⇒^{k2}_G u_A u_i B[] v_i v_A
⇒^{k3}_G u_A u_i w_B v_i v_A

= w

where k1, k2 and k3 are less than k; in the subderivation A_i[l] ⇒^{k2}_G u_i B[] v_i, B[] is a distinguished descendent of A_i[l]; furthermore B[] ⇒^{k3}_G w_B. Thus, by induction we know that (A_i, B, l) ⇒*_{G′} u_i↑v_i, that B ⇒*_{G′} u_B↑v_B where u_B v_B = w_B, and that for each 1 ≤ j ≤ n such that j ≠ i we have the derivation A_j ⇒*_{G′} u_j↑v_j, such that u1v1 … u_{i−1}v_{i−1} = u_A and u_{i+1}v_{i+1} … u_nv_n = v_A. From the construction we have the following productions in P′:

(A, B, ε) → C_{i,n}(A1, …, A_{i−1}, (A_i, B, l), A_{i+1}, …, A_n)
A → W((A, B, ε), B)

Hence, we have the following:

(A, B, ε) ⇒*_{G′} C_{i,n}(u1↑v1, …, u_{i−1}↑v_{i−1}, u_i↑v_i, u_{i+1}↑v_{i+1}, …, u_n↑v_n)
= u1v1 … u_{i−1}v_{i−1} u_i↑v_i u_{i+1}v_{i+1} … u_nv_n

and thus

A ⇒*_{G′} W(u1v1 … u_{i−1}v_{i−1} u_i↑v_i u_{i+1}v_{i+1} … u_nv_n, u_B↑v_B)
= u1v1 … u_{i−1}v_{i−1} u_i u_B↑v_B v_i u_{i+1}v_{i+1} … u_nv_n = u_A u_i u_B↑v_B v_i v_A

where w = u_A u_i u_B v_B v_i v_A.

Part 2, if. Suppose that A ⇒^k_{G′} w1↑w2. This derivation begins with the production A → W((A, B, ε), B), such that w1↑w2 = u1u2↑v2v1, (A, B, ε) ⇒^{k1}_{G′} u1↑v1 and B ⇒^{k2}_{G′} u2↑v2, where k1 and k2 are less than k. Thus, by induction we know that

A[] ⇒*_G u1 B[] v1 ⇒*_G u1 u2 v2 v1

3.3 hl ⊆ tal

The wrapping operation of hg can be simulated by the adjunction operation of tag. Suppose we have B ⇒*_G u1↑v1 and C ⇒*_G u2↑v2, and we have the production A → W(B, C). From this we have A ⇒*_G u1u2↑v2v1. This can be simulated by a tag as follows. Assume that, corresponding to the derivations B ⇒*_G u1↑v1 and C ⇒*_G u2↑v2, we have trees β1 and β2 such that yield(β1) = u1Bv1 and yield(β2) = u2Cv2. Corresponding to the production A → W(B, C) we include a tree β in the grammar in which β1 can be adjoined at the parent of a node at which β2 can be adjoined; then we can derive a tree β′ such that yield(β′) = u1u2Av2v1. From a hg, G = (V_N, V_T, S, P), we construct a tag, G′ = (V_N, V_T, V_L, S, I, A, φ), such that L(G) = L(G′). Without loss of generality, we assume that the productions in P are of the following form, where A, B, C ∈ V_N, w1, w2 ∈ V_T* and f ∈ {C1,2, C2,2, W}.

A → f(B, C)        A → C1,1(w1↑w2)

The elementary trees of G′ are given in Figure 5: the initial tree α, whose root is labelled S and dominates a single ε, together with the auxiliary trees for cases 1–4 below, built from nodes labelled A, B and C carrying OA and NA adjoining constraints, with ε, w1 and w2 at their frontiers.

Figure 5: Trees in G′

I contains the single initial tree α shown in Figure 5. The set, A, of auxiliary trees is as follows.

Case 1: If A → C1,2(B, C) ∈ P then include the tree shown in Figure 5 for case 1 in A.
Case 2: If A → C2,2(B, C) ∈ P then include the tree shown in Figure 5 for case 2 in A.
Case 3: If A → W(B, C) ∈ P then include the tree shown in Figure 5 for case 3 in A.
Case 4: If A → C1,1(w1↑w2) ∈ P then include the tree shown in Figure 5 for case 4 in A.
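The operations C_{i,n} and W used in these cases act on headed strings u↑v. As a quick informal illustration (our own sketch, not part of the construction), they can be written out in Python with a headed string represented as a pair (u, v):

```python
# A headed string u↑v is represented as the pair (u, v). Names are illustrative.

def concat(i, args):
    """C_{i,n}: concatenate headed strings, keeping the head position of the
    i-th argument (1-based) and flattening the others around it."""
    left = "".join(u + v for u, v in args[:i - 1])
    ui, vi = args[i - 1]
    right = "".join(u + v for u, v in args[i:])
    return (left + ui, vi + right)

def wrap(x, y):
    """W: wrap the first headed string around the second:
    W(u1↑v1, u2↑v2) = u1 u2 ↑ v2 v1."""
    (u1, v1), (u2, v2) = x, y
    return (u1 + u2, v2 + v1)

print(wrap(("a", "b"), ("c", "d")))              # -> ('ac', 'db')
print(concat(1, [("u1", "u2"), ("v1", "v2")]))   # -> ('u1', 'u2v1v2')
```

Note that C1,2 keeps the head of its first argument, so C1,2(u1↑u2, v1↑v2) = u1↑u2v1v2, which is the case used later in the proof of Lemma 3.5.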

Let V_L and φ be such that V_L is the set of tree labels {γ1, …, γn}, where A = {β1, …, βn}. G′ simulates derivations of G in such a way that whenever a production is used in G the corresponding tree is used in G′. To show that L(G) = L(G′) we prove the following lemma by induction.

Lemma 3.5

For all A ∈ V_N and w1, w2 ∈ V_T*, A ⇒*_G w1↑w2 if and only if there exist β, β′ such that β ⇒*_{G′} β′, β ∈ A ∩ aux(V_N, V_T, V_L, A), β′ has no OA nodes, and yield(β′) = w1 A w2.

Proof The basis of the induction involves proving the above lemma for derivations in G

involving the use of one production and derivations in G′ involving the adjunction of one tree. Thus, the only relevant part of the construction is case 4, and the desired equivalence clearly holds. For the inductive step, proving each direction of the equivalence involves consideration of cases 1–3 of the above construction. The details are straightforward, so rather than enumerating all three cases for each direction we show one case for each direction.

Suppose that A ⇒^k_G w1↑w2 for some A ∈ V_N, involving use of the production A → C1,2(B, C) and the subderivations B ⇒^{k1}_G u1↑u2 and C ⇒^{k2}_G v1↑v2, where k1 and k2 are less than k and C1,2(u1↑u2, v1↑v2) = w1↑w2. Thus, by the induction hypothesis we know that β1 ⇒*_{G′} β′1 for some β1 ∈ A ∩ aux(V_N, V_T, V_L, B) such that yield(β′1) = u1Bu2, and β2 ⇒*_{G′} β′2 for some β2 ∈ A ∩ aux(V_N, V_T, V_L, C) such that yield(β′2) = v1Cv2. Let β be the tree introduced in case 1 of the construction due to the presence of A → C1,2(B, C) in P. We have β ⇒*_{G′} r(r(β, β′1, 1), β′2, 2), where yield(r(r(β, β′1, 1), β′2, 2)) = u1Au2v1v2.

Now suppose that β ⇒^k_{G′} β′ for some β ∈ A ∩ aux(V_N, V_T, V_L, A). Suppose that β was introduced in case 3 of the construction and that β′ = r(r(β, β′2, 11), β′1, 1), where β1 ⇒^{k1}_{G′} β′1 for some β1 ∈ A ∩ aux(V_N, V_T, V_L, B), and β2 ⇒^{k2}_{G′} β′2 for some β2 ∈ A ∩ aux(V_N, V_T, V_L, C), where k1 and k2 are less than k. Let yield(β′1) = u1Bu2 and yield(β′2) = v1Cv2. Thus yield(β′) = u1v1Av2u2. By induction we know that B ⇒*_G u1↑u2 and C ⇒*_G v1↑v2. Since P contains the production A → W(B, C) we have A ⇒*_G W(u1↑u2, v1↑v2) = u1v1↑v2u2.

Example 3.2

Consider the hg, G = ({S, T_a, T_b, A, B}, {a, b}, S, P), that generates the language {ww | w ∈ {a, b}*}, where P is as follows.

P = { S → C1,1(↑),  S → C2,2(A, T_a),  S → C2,2(B, T_b),
      T_a → W(S, A),  T_b → W(S, B),  A → C1,1(↑a),  B → C1,1(↑b) }
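To see how this grammar doubles its strings, one derivation can be traced by evaluating the productions bottom-up. The following sketch (our own helper names, reusing the pair representation of headed strings) follows one such derivation:

```python
def c11(x):  # C1,1 builds a headed string directly
    return x

def c22(x, y):  # C2,2(u1↑v1, u2↑v2) = u1 v1 u2 ↑ v2 (head of second argument)
    (u1, v1), (u2, v2) = x, y
    return (u1 + v1 + u2, v2)

def wrap(x, y):  # W(u1↑v1, u2↑v2) = u1 u2 ↑ v2 v1
    (u1, v1), (u2, v2) = x, y
    return (u1 + u2, v2 + v1)

A  = c11(("", "a"))   # A ⇒ ↑a
S0 = c11(("", ""))    # S ⇒ ↑
Ta = wrap(S0, A)      # Ta ⇒ W(↑, ↑a) = ↑a
S1 = c22(A, Ta)       # S ⇒ C2,2(↑a, ↑a) = a↑a, yield "aa"
B  = c11(("", "b"))   # B ⇒ ↑b
Tb = wrap(S1, B)      # Tb ⇒ W(a↑a, ↑b) = a↑ba
S2 = c22(B, Tb)       # S ⇒ C2,2(↑b, a↑ba) = ba↑ba, yield "baba"
print(S1[0] + S1[1], S2[0] + S2[1])  # -> aa baba
```

Both yields have the form ww: "aa" with w = a, and "baba" with w = ba.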

The following tag, G′, would be produced by the above construction:

G′ = ({S, T_a, T_b, A, B}, {a, b}, {γ1, …, γ7}, S, {α}, A, φ)

where φ is as given in the construction and the trees in A are shown in Figure 6.

[Figure 6 shows the seven auxiliary trees β1, …, β7 of G′, one for each production in P.]

Figure 6: Auxiliary trees of G′

3.4 tal ⊆ ccl

Consider a tag derivation in which trees are adjoined at the deepest adjunction points first. Figure 7 illustrates an adjunction at a node labelled B. Adjunctions have been completed on the path marked with the solid line, whereas adjunctions at nodes on the path marked with the broken line have yet to be made. The broken line can be viewed as a stack of nodes, and the effect of adjoining a tree can be viewed as "pushing" a new path (the spine of the adjoined tree) onto the top of the stack. The stacking of paths seen in adjunction can be simulated by ccg


Figure 7: Stacking of paths

composition rules. To do this we encode auxiliary trees with ccg categories. This is achieved by arranging that the auxiliary trees are pruned. Informally, an auxiliary tree is pruned if it is at most binary branching and the siblings of nodes on the spine have OA constraints and a single child labelled by ε. An example of such a tree is shown in Figure 8. Trees that are pruned can be encoded by ccg categories, adjunction at the spine simulated by composition


Figure 8: Example of a pruned tree

rules (instances of the schemata F_n and B_n where n ≥ 1), and adjunction at siblings of the spine simulated by application rules (instances of the schemata F0 and B0). In the proof we use a result, given in Appendix A, showing that an arbitrary tag can be converted into an equivalent tag, G = (V_N, V_T, V_L, S, {α}, A1 ∪ A2, φ), such that α is as shown in Figure 5 and the two sets that make up the auxiliary trees are as follows. A1 contains the trees that introduce terminal symbols. Each auxiliary tree β ∈ A1 has the form of the tree shown in Figure 9, where A ∈ V_N and w ∈ V_T*.


Figure 9: Form of trees in A1

Each tree β ∈ A2 has the property that ⟨β, ft(β)⟩ is pruned, where ⟨β, d⟩ is pruned if and only if the following conditions hold.

• β ∈ aux(V_N, V_T, V_L) and d ∈ spine(β).
• d′3 ∉ dom(β) for each d′ that is a proper prefix of d, i.e., there is at most binary branching.
• For each d′ that is a prefix of d, either β(d′) = ⟨A, ∅, false⟩ for some A ∈ V_N or β(d′) = ⟨A, V_L, true⟩, i.e., either adjunction is not allowed, or adjunction is obligatory with any tree whose root is labelled by the nonterminal A.
• For each d′ that is a proper prefix of d, let i = 1 if d′1 is not a prefix of d, and otherwise i = 2. Then β(d′i) = ⟨A, V_L, true⟩, β(d′i1) = ε and β(d′i2) is undefined, for some A ∈ V_N, i.e., each sibling of a node on the path from the root to address d of β is labelled by a member of V_N, has an OA constraint with no restriction on which trees can be adjoined, and has a single child labelled with ε.

We make the further assumption, satisfied by the construction in Appendix A, that none of the trees in A1 can be adjoined at nodes on the spine of trees in A2.

From this tag, G, we construct an equivalent ccg, G′ = (V′_N, V_T, S, f, R), where f and R are defined below and V′_N = V_N ∪ {Â | A ∈ V_N}. First we define an encoding function enc : aux(V_N, V_T, V_L) × N*+ → cat(V′_N), with which each ⟨β, d⟩ that is pruned is encoded by a member of cat(V′_N). Let β ∈ aux(V_N, V_T, V_L).

• If β(ε) = ⟨A, ∅, false⟩ then enc(β, ε) = A. If β(ε) = ⟨A, V_L, true⟩ then enc(β, ε) = (A/Â).

• Let d be a proper prefix of ft(β) and enc(β, d) = c.
  – If d1 ∈ spine(β) and β(d2) = ⟨C, V_L, true⟩ then: if β(d1) = ⟨B, V_L, true⟩ then enc(β, d1) = ((c/C)/B̂), and if β(d1) = ⟨B, ∅, false⟩ then enc(β, d1) = (c/C).
  – If d2 ∈ spine(β) and β(d1) = ⟨B, V_L, true⟩ then: if β(d2) = ⟨C, V_L, true⟩ then enc(β, d2) = ((c\B)/Ĉ), and if β(d2) = ⟨C, ∅, false⟩ then enc(β, d2) = (c\B).
  – If d2 ∉ dom(β) and d1 ∈ dom(β) then: if β(d1) = ⟨B, V_L, true⟩ then enc(β, d1) = (c/B̂), and if β(d1) = ⟨B, ∅, false⟩ then enc(β, d1) = c.
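Read as string building, enc walks down the spine, adding one argument per sibling and one hatted argument per OA spine node. A small sketch (our own helper, writing the hat as a postfix ^) reproduces the category obtained from the tree in Figure 8:

```python
def encode(levels, target="A"):
    """Build the category string for a pruned spine.
    levels: list of (side, sibling, spine_oa) triples, one per level going
    down the spine; side is '\\' for a left sibling or '/' for a right one,
    sibling is the sibling's label, and spine_oa is the label of the next
    spine node if it carries an OA constraint (None if it is NA)."""
    c = target
    for side, sib, spine_oa in levels:
        c = "(%s%s%s)" % (c, side, sib)          # application argument
        if spine_oa is not None:
            c = "(%s/%s^)" % (c, spine_oa)       # hatted composition argument
    return c

# The pruned tree of Figure 8: siblings A1, A4, A5, A7; OA spine nodes A2, A3, A6.
levels = [("\\", "A1", "A2"), ("/", "A4", "A3"),
          ("\\", "A5", "A6"), ("/", "A7", None)]
print(encode(levels))
# -> (((((((A\A1)/A2^)/A4)/A3^)\A5)/A6^)/A7)
```

This matches the first of the two categories given for Figure 8 below; the second is the same category with a hatted target.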

The function f of G′ is defined as follows.

• For each β ∈ A1 such that β(ε) = ⟨A, ∅, false⟩ and β(1) = w, where A ∈ V_N and w ∈ V_T*,

include A in f(w).
• For each β ∈ A2 include c and ĉ in f(ε), where enc(β, ft(β)) = c and ĉ is identical to c with the exception of the target category³.

The tree in Figure 8 would lead to the following:

(((((((A\A1)/Â2)/A4)/Â3)\A5)/Â6)/A7) ∈ f(ε)
(((((((Â\A1)/Â2)/A4)/Â3)\A5)/Â6)/A7) ∈ f(ε)

We complete the construction of G′ by defining the set of combinatory rules in R. Let k = max({arity(c) | c ∈ f(w) for some w ∈ V_T*}). For each A ∈ V_N, each 0 ≤ i ≤ k, and j1, …, ji ∈ {/, \}, include the following rules in R.

• (x/A) A → x
• A (x\A) → x
• (x/Â) (…(Â j1 z1) j2 … ji zi) → (…(x j1 z1) j2 … ji zi)

The combinatory rules in R permit composition (when i > 0) only when the target category of the secondary component is of the form Â. This corresponds to adjunction with a tree in A2 at a node on the spine. In the above encoding of trees by categories, every symbol of the form Â is preceded by a forward slash; therefore, we only need forward composition rules. Before proving the equivalence of G and G′ we define a new derivation order for the above tag, G. For each tree β ∈ aux(V_N, V_T, V_L) we define next(β), which gives the address at which the next adjunction should occur. If the deepest OA node in β is not on the spine then this is the next adjunction point; otherwise it is the deepest OA node on the spine. Note that next(β) is not defined when there are several deepest OA nodes not on the spine. More formally, next(β) = d if d ∈ dom(β) and one or other of the following holds.

³ Formally, the hatted version of (c1/c2) is (ĉ1/c2), and the hatted version of (c1\c2) is (ĉ1\c2).


• d ∉ spine(β), β(d) = ⟨A, V_L, true⟩ for some A ∈ V_N, and there is no d′ ∈ dom(β), d′ ≠ d, such that β(d′) = ⟨B, V_L, true⟩ for some B ∈ V_N and |d′| ≥ |d|. (We use |d| to denote the length of the address d ∈ N*+; this gives the depth of the node with that address.)
• d ∈ spine(β), β(d) = ⟨A, V_L, true⟩ for some A ∈ V_N, and there is no d′ ∈ dom(β) such that β(d′) = ⟨B, V_L, true⟩ for some B ∈ V_N and |d′| > |d|.

Let the set T_k(G) include the trees derivable in k or fewer steps, for k ≥ 0, as follows.

• T_0(G) = A1 ∪ A2.
• T_k(G) is the union of T_{k−1}(G) with the set of trees β = r(β′, β″, next(β′)) such that β′, β″ ∈ T_{k−1}(G), β′(next(β′)) = ⟨A, V_L, true⟩ for some A ∈ V_N, and, when next(β′) ∉ spine(β′), β″ ∈ aux(V_N, V_T, V_L, A) where β″ has no OA nodes; otherwise (when next(β′) ∈ spine(β′)), β″ ∈ A2 ∩ aux(V_N, V_T, V_L, A).

It should be clear that T(G), as defined in Section 2.2, is equal to the following set.

{r(α, β, ε) | β ∈ T_k(G) ∩ aux(V_N, V_T, V_L, S) for some k ≥ 0, and β has no OA nodes}

L(G) = L(G′) follows from Lemma 3.6. For convenience we first define the middle of a tree. For β ∈ T_k(G) (k ≥ 0), let middle(β) = ε if β contains no OA nodes; otherwise middle(β) = d where d ∈ spine(β) and |d| = |next(β)|. In other words, the middle of a tree is the address of the node on the spine closest to the root that is as deep as any OA node in the tree. Note that for each tree β ∈ T_k(G) for some k > 0, the entire terminal yield of the tree is dominated by the node with address middle(β), i.e., yield(β) = yield(β/middle(β)). Note also that ⟨β, middle(β)⟩ is pruned.

Lemma 3.6

For all A ∈ V_N, c ∈ cat(V′_N), and w1, w2 ∈ V_T*, the following two statements are equivalent:

1. there exists β ∈ T_k(G) ∩ aux(V_N, V_T, V_L, A) for some k ≥ 0 such that yield(β) = w1Aw2 and enc(β, middle(β)) = c;

2. there exist c1, …, cn and u1, …, un such that c ⇒*_{G′} c1 … cp … cn, where cp is the primary descendent of c, target(c) = A, w1 = u1 … up, w2 = u_{p+1} … un, uj ∈ V_T* and cj ∈ f(uj) (1 ≤ j ≤ n).

Proof We first prove that statement 1 implies statement 2, by induction on k. The case k = 0 follows directly from the construction. Suppose that for some k ≥ 1, β ∈ T_k(G) ∩ aux(V_N, V_T, V_L, A), yield(β) = w1Aw2, ⟨β, middle(β)⟩ is pruned and enc(β, middle(β)) = c. Let β = r(β′, β″, next(β′)) for β′, β″ ∈ T_{k−1}(G) and β′(next(β′)) = ⟨B, V_L, true⟩ for some B ∈ V_N.

• Suppose that next(β′) ∉ spine(β′), in which case β″ ∈ aux(V_N, V_T, V_L, B) where β″

has no OA nodes. It is either the case that next(β′) = d1 for some d (see case 1a of Figure 10) or next(β′) = d2 for some d (see case 1b of Figure 10). These two cases are very similar, so we consider only the first possibility, in which the adjunction node is to the left of the spine.

[Figure 10 shows the three possibilities: cases 1a and 1b, in which a tree with yield v3 B v4 is adjoined at an ε-dominating node to the left (respectively right) of the spine of a tree with yield v1 A v2, and case 2, in which adjunction takes place at a node on the spine, giving a tree with yield w1 A w2.]

Figure 10: Three possibilities

In this case d2 = middle(β′) and enc(β′, d2) = (c\B). There are strings v1, v2, v3 and v4 such that yield(β′) = v1Av2 and yield(β″) = v3Bv4; thus w1 = v3v4v1 and w2 = v2. By induction (c\B) ⇒*_{G′} c1 … cp … cn, where cp is the primary descendent of (c\B), target((c\B)) = A, v1 = u1 … up, v2 = u_{p+1} … un and cj ∈ f(uj) (1 ≤ j ≤ n). Since β″ has no OA nodes, middle(β″) = ε and enc(β″, ε) = B. By induction we know that B ⇒*_{G′} c′1 … c′m, v3v4 = u′1 … u′m and c′j ∈ f(u′j) (1 ≤ j ≤ m). (We are not concerned with the primary descendent of B in this derivation.) By the rules in R we have the following derivation:

c ⇒_{G′} B (c\B) ⇒*_{G′} c′1 … c′m c1 … cp … cn

where cp is the primary descendent of c.

• Alternatively, suppose that next(β′) ∈ spine(β′) (see case 2 of Figure 10), in which case β″ ∈ A2 ∩ aux(V_N, V_T, V_L, B) and, therefore, β″ contains no terminal symbols and yield(β′) = w1Aw2. Note that middle(β′) = next(β′). By the construction we know that ĉ″ ∈ f(ε), where

enc(β″, middle(β″)) = enc(β″, ft(β″)) = c″ = (…(B j1 B1) j2 … jm Bm)

for some Bi ∈ V′_N and some m > 0. In addition, we know that ⟨β, next(β′)·middle(β″)⟩ is pruned, that enc(β′, next(β′)) = (c′/B̂), and that

enc(β, middle(β)) = enc(β, next(β′)·middle(β″)) = (…(c′ j1 B1) j2 … jm Bm) = c

By induction, (c′/B̂) ⇒*_{G′} c1 … cp … cn, where cp is the primary descendent of (c′/B̂), target((c′/B̂)) = A, w1 = u1 … up, w2 = u_{p+1} … un, and cj ∈ f(uj) (1 ≤ j ≤ n). By the combinatory rules in R we have the derivation

c = (…(c′ j1 B1) j2 … jm Bm) ⇒_{G′} (c′/B̂) (…(B̂ j1 B1) j2 … jm Bm)
⇒*_{G′} c1 … cp … cn (…(B̂ j1 B1) j2 … jm Bm)


where cp is the primary descendent of (…(c′ j1 B1) j2 … jm Bm), and since (…(B̂ j1 B1) j2 … jm Bm) ∈ f(ε) we have the desired terminal strings w1 and w2 (take u_{n+1} = ε).

We prove that statement 2 implies statement 1 of the lemma by induction on the number of derivation steps used in derivations in G′. The basis involves consideration of the categories in the range of f; it follows from the construction that the lemma holds of these categories. Suppose that c ⇒^k_{G′} c1 … cp … cn, where cp is the primary descendent of c, target(c) = A, w1 = u1 … up, w2 = u_{p+1} … un, uj ∈ V_T* and cj ∈ f(uj) (1 ≤ j ≤ n).

• Suppose the combinatory rule (x/B) B → x is used in the first step of the derivation. There are l and p′, where p ≤ l < p′ ≤ n, such that

c ⇒_{G′} (c/B) B ⇒^{k′}_{G′} c1 … cp … cl B ⇒^{k″}_{G′} c1 … cp … cl c_{l+1} … c_{p′} … cn

where k′ and k″ are less than k. Thus, by induction we have for some k1 ≥ 0 a tree β1 ∈ T_{k1}(G) ∩ aux(V_N, V_T, V_L, A) such that target(c) = A, yield(β1) = u1 … up A u_{p+1} … ul, ⟨β1, middle(β1)⟩ is pruned and enc(β1, middle(β1)) = (c/B). There is some d ∈ dom(β1) such that middle(β1) = d1, next(β1) = d2 and β1(d2) = ⟨B, V_L, true⟩. Also by induction we have for some k2 ≥ 0 a tree β2 ∈ T_{k2}(G) ∩ aux(V_N, V_T, V_L, B) such that ⟨β2, middle(β2)⟩ is pruned, enc(β2, middle(β2)) = B (which implies that β2 has no OA nodes) and

yield(β2) = u_{l+1} … u_{p′} B u_{p′+1} … un

Thus, β = r(β1, β2, d2) ∈ T_k(G) for some k, where

yield(β) = u1 … up A u_{p+1} … u_{p′} u_{p′+1} … un

Furthermore, ⟨β, middle(β)⟩ is pruned and enc(β, middle(β)) = c.

• The case in which the combinatory rule B (x\B) → x is used is similar to the case just considered.

• Suppose that for some 0 ≤ m ≤ arity(c) the rule

(x/B̂) (…(B̂ j1 z1) j2 … jm zm) → (…(x j1 z1) j2 … jm zm)

is used in the first step of the derivation, and that c = (…(c′ j1 c′1) j2 … jm c′m). Thus we have

c = (…(c′ j1 c′1) j2 … jm c′m) ⇒_{G′} (c′/B̂) (…(B̂ j1 c′1) j2 … jm c′m)

Since a category whose target category is B̂ cannot instantiate the right-hand side of a rule in R, and from the construction of f, it must be the case that

(…(B̂ j1 c′1) j2 … jm c′m) ∈ f(ε)

Thus cn = (…(B̂ j1 c′1) j2 … jm c′m) and un = ε, and

c = (…(c′ j1 c′1) j2 … jm c′m) ⇒_{G′} (c′/B̂) (…(B̂ j1 c′1) j2 … jm c′m)
⇒^{k′}_{G′} c1 … cp … c_{n−1} (…(B̂ j1 c′1) j2 … jm c′m)

where k′ is less than k. Thus, by induction we have for some k ≥ 0 a tree β1 ∈ T_k(G) ∩ aux(V_N, V_T, V_L, A) such that target(c′) = A, yield(β1) = u1 … up A u_{p+1} … u_{n−1}, ⟨β1, middle(β1)⟩ is a pruned tree and enc(β1, middle(β1)) = (c′/B̂). We also have next(β1) = middle(β1) (i.e., the next point of adjunction into β1 will be on the spine) and β1(middle(β1)) = ⟨B, V_L, true⟩. Since (…(B̂ j1 c′1) j2 … jm c′m) ∈ f(ε), there must be a tree β2 ∈ A2 ∩ aux(V_N, V_T, V_L, B) such that enc(β2, ft(β2)) = (…(B j1 c′1) j2 … jm c′m). Note that yield(β2) = B. Thus, β = r(β1, β2, middle(β1)) ∈ T_{k+1}(G), where yield(β) = u1 … up A u_{p+1} … u_{n−1}, ⟨β, middle(β)⟩ is a pruned tree, and enc(β, middle(β)) = c = (…(c′ j1 c′1) j2 … jm c′m).

Example 3.3

The tag G = ({S, A, B}, {a, b}, {γ1, …, γ4}, S, {α}, A1 ∪ A2, φ) generates the language {aⁿbⁿ | n ≥ 0}, where α is as shown in Figure 5, A1 contains the trees β1 and β2, and A2 contains the trees β3 and β4, all shown in Figure 11.


Figure 11: Auxiliary trees of G

The above construction generates the ccg

G′ = ({S, Ŝ, A, Â, B, B̂}, {a, b}, S, f, R)

where f is defined as follows:

f(a) = {A}    f(b) = {B}    f(ε) = {(((S\A)/Ŝ)/B), (((Ŝ\A)/Ŝ)/B), S, Ŝ}

and R includes the following rules (we have only included those that can be used in derivations of G′):

R = { (x/B) B → x,  A (x\A) → x,  (x/S) S → x,
      (x/Ŝ) (((Ŝ\A)/Ŝ)/B) → (((x\A)/Ŝ)/B),  (x/Ŝ) Ŝ → x }


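The derivation of the string ab under these categories can be checked mechanically. The sketch below (our own encoding, not part of the construction) implements only the two application rules; longer strings of the language additionally require the composition rule (x/Ŝ) (((Ŝ\A)/Ŝ)/B) → (((x\A)/Ŝ)/B).

```python
# A category is (target, [(slash, argument), ...]) with the outermost
# argument listed last; '^' marks a hatted symbol. Illustrative sketch only.
c1 = ("S", [("\\", "A"), ("/", "S^"), ("/", "B")])   # (((S\A)/S^)/B)

def reduce_pair(left, right):
    """Forward application (x/A) A -> x and backward application A (x\\A) -> x."""
    lt, largs = left
    rt, rargs = right
    if largs and not rargs and largs[-1] == ("/", rt):
        return (lt, largs[:-1])          # (x/A) A -> x
    if rargs and not largs and rargs[-1] == ("\\", lt):
        return (rt, rargs[:-1])          # A (x\\A) -> x
    return None

# The string ab, with two empty categories interleaved: a ε b ε.
step1 = reduce_pair(c1, ("B", []))       # (((S\A)/S^)/B) B -> ((S\A)/S^)
step2 = reduce_pair(step1, ("S^", []))   # ((S\A)/S^) S^ -> (S\A)
step3 = reduce_pair(("A", []), step2)    # A (S\A) -> S
print(step3)                             # -> ('S', [])
```

Reaching the category S with no remaining arguments confirms that ab is derived.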

4 Conclusions

As mentioned in the introduction, notational differences between ccg, hg, lig and tag can be understood in terms of the way in which they extend cfg. On the one hand, hg maintains the context-freeness of cfg while adding the operation of wrapping. On the other hand, lig and ccg make use of unbounded stack-like structures to control the use of rules in derivations. From the equivalence results presented in this paper, we conclude that these two superficially very different approaches are equivalent extensions of cfg. Since these are independently conceived formalisms, each intended to capture certain aspects of the structure of natural language, our result showing their equivalence lends credence to each of the approaches. The understanding of the relationship between these two approaches gained from the constructions used in the proofs given in Section 3 has led to interesting discoveries regarding the relationship between parsing algorithms for these systems. Differences between these formalisms gave rise to what appear to be distinct styles of parsing algorithms. The CKY algorithm [9, 29] for cfg has been extended in two ways. In [15] a CKY-style parsing algorithm for hg is given. This algorithm resembles the cfg case in that subsets of the nonterminal alphabet are stored in array entries; the extension over cfg involves the use of a four-dimensional array to encode pairs of substrings of the input string. Given the close relationship between hg and tag established in this paper, it was possible to adapt this algorithm to give a tag parser [22]. In [23, 24] a CKY-style parsing algorithm for lig is given. This algorithm extends the cfg case in that encodings of stacks are stored in the array entries. Given the close relationship between lig, ccg and tag described in this paper, it was possible to adapt this algorithm to give ccg and tag parsers [24].
Obviously, as a result of the weak equivalence of ccg, hg, lig and tag, any result shown for one formalism applies to the other three. Such results include the definition of a string automaton for this class [22], various closure and decidability properties, including the result that the class forms an AFL [22, 17], and the definition of an infinite language hierarchy extending the progression from cfl to this class [26, 27].


References

[1] A. V. Aho. Indexed grammars – an extension of context-free grammars. J. ACM, 15:647–671, 1968.
[2] K. Ajdukiewicz. Die syntaktische Konnexität. Studia Philosophica, 1:1–27, 1935. English translation in: Polish Logic 1920–1939, ed. by Storrs McCall, 207–231. Oxford University Press.
[3] C. Culy. The complexity of the vocabulary of Bambara. Ling. and Philosophy, 8:345–351, 1985.
[4] J. Duske and R. Parchmann. Linear indexed languages. Theor. Comput. Sci., 32:47–60, 1984.
[5] G. Gazdar. Applicability of indexed grammars to natural languages. In U. Reyle and C. Rohrer, editors, Natural Language Parsing and Linguistic Theories, pages 69–94. D. Reidel, Dordrecht, Holland, 1988.
[6] G. Gazdar, E. Klein, G. K. Pullum, and I. A. Sag. Generalized Phrase Structure Grammars. Blackwell Publishing, Oxford, 1985. Also published by Harvard University Press, Cambridge, MA.
[7] A. K. Joshi. How much context-sensitivity is necessary for characterizing structural descriptions – Tree Adjoining Grammars. In D. Dowty, L. Karttunen, and A. Zwicky, editors, Natural Language Processing – Theoretical, Computational and Psychological Perspectives, pages 206–250. Cambridge University Press, New York, NY, 1985. Originally presented in 1983.
[8] A. K. Joshi, L. S. Levy, and M. Takahashi. Tree adjunct grammars. J. Comput. Syst. Sci., 10(1):136–163, 1975.
[9] T. Kasami. An efficient recognition and syntax algorithm for context-free languages. Technical Report AF-CRL-65-758, Air Force Cambridge Research Laboratory, Bedford, MA, 1965.
[10] A. S. Kroch. Asymmetries in long distance extraction in a tag grammar. In M. Baltin and A. S. Kroch, editors, New Conceptions of Phrase Structure. University of Chicago Press, Chicago, IL, 1986.
[11] A. S. Kroch. Subjacency in a tree adjoining grammar. In A. Manaster-Ramer, editor, Mathematics of Language. John Benjamins, Amsterdam, 1987.
[12] A. S. Kroch and A. K. Joshi. Linguistic relevance of tree adjoining grammars.
Technical Report MS-CIS-85-18, Department of Computer and Information Science, University of Pennsylvania, Philadelphia, 1985.
[13] A. S. Kroch and A. K. Joshi. Analyzing extraposition in a tree adjoining grammar. In G. Huck and A. Ojeda, editors, Syntax and Semantics: Discontinuous Constituents. Academic Press, New York, NY, 1986.
[14] A. S. Kroch and B. Santorini. The derived constituent structure of the West Germanic verb-raising construction. In Proceedings of the Princeton Workshop on Grammar, 1987.
[15] C. Pollard. Generalized Phrase Structure Grammars, Head Grammars and Natural Language. PhD thesis, Stanford University, 1984.
[16] G. Pullum and G. Gazdar. Natural languages and context-free languages. Ling. and Philosophy, 4:471–504, 1982.
[17] K. Roach. Formal properties of head grammars. In A. Manaster-Ramer, editor, Mathematics of Language. John Benjamins, Amsterdam, 1987.


[18] B. Santorini. The West Germanic verb-raising construction: A tree adjoining grammar analysis. Master's thesis, University of Pennsylvania, Philadelphia, PA, 1986.
[19] S. M. Shieber. Evidence against the context-freeness of natural language. Ling. and Philosophy, 8:333–343, 1985.
[20] M. Steedman. Combinators and grammars. In R. Oehrle, E. Bach, and D. Wheeler, editors, Categorial Grammars and Natural Language Structures, pages 417–442. Foris, Dordrecht, 1986.
[21] M. J. Steedman. Dependency and coordination in the grammar of Dutch and English. Language, 61:523–568, 1985.
[22] K. Vijay-Shanker. A Study of Tree Adjoining Grammars. PhD thesis, University of Pennsylvania, Philadelphia, PA, 1987.
[23] K. Vijay-Shanker and D. J. Weir. The recognition of combinatory categorial grammars and linear indexed grammars. In M. Tomita, editor, Current Issues in Parsing Technology. Kluwer Academic Publishers, 1991.
[24] K. Vijay-Shanker and D. J. Weir. Parsing constrained grammar formalisms. Comput. Ling., in press.
[25] K. Vijay-Shanker, D. J. Weir, and A. K. Joshi. Tree adjoining and head wrapping. In 11th International Conference on Comput. Ling., pages 202–207, 1986.
[26] D. J. Weir. Characterizing Mildly Context-Sensitive Grammar Formalisms. PhD thesis, University of Pennsylvania, Philadelphia, PA, 1988.
[27] D. J. Weir. Automata theory as relevant to linguistics. In W. Bright, editor, Oxford International Encyclopedia of Linguistics, volume 1, pages 145–146. Oxford University Press, Oxford, England, 1992.
[28] D. J. Weir and A. K. Joshi. Combinatory categorial grammars: Generative power and relationship to linear context-free rewriting systems. In 26th Meeting Assoc. Comput. Ling., pages 278–285, 1988.
[29] D. H. Younger. Recognition and parsing of context-free languages in time n³. Inf. Control, 10(2):189–208, 1967.


A A normal form for tag

We give a five-phase conversion process to show that for every tag there is an equivalent tag that satisfies the conditions given in Section 3.4. We do not give a detailed proof, since the conversion at each stage is straightforward.

1. Let G1 = (V_N, V_T, V_L,1, S, I1, A1, φ) be a tag. We assume that trees in I1 ∪ A1 do not contain any OA nodes that are also NA nodes. We also assume that for every node label ⟨A, sa, oa⟩: sa ⊆ {γ_β | β ∈ aux(V_N, V_T, V_L,1, A)}. G1 can be converted into an equivalent tag, G2 = (V_N ∪ {S′}, V_T, V_L,2, S′, {α}, A1 ∪ A2, φ), where α is such that α(ε) = ⟨S′, sa_{S′}, true⟩ and α(1) = ε, with sa_{S′} = {γ_β | β ∈ A2}. For each tree α0 ∈ I1 a tree β_{α0} is included in A2, where β_{α0}/1 = α0 and β_{α0}(ε) = β_{α0}(2) = ⟨S′, ∅, false⟩ for the new nonterminal S′. The left subtree of β_{α0} is α0 and the right subtree is a single node that is the foot of the tree. Note that the tree r(α, β_{α0}, ε) corresponds very closely to α0, as shown in Figure 12.


Figure 12: Conversion of initial trees

V_L,2 = V_L,1 ∪ {γ_β | β ∈ A2}. Technically, we should specify a different tree-labelling function for each grammar; however, for each grammar the function denoted by φ should be clear from the set of tree labels of the grammar.

2. From G2 we construct an equivalent tag, G3 = (V_N ∪ {S′}, V_T, V_L,3, S′, {α}, A3, φ), such that every node of a tree in A3 that is labelled by a nonterminal has either an NA constraint or an OA constraint. In other words, there will be no nodes at which adjunction is optional. For each β ∈ A1 ∪ A2 and D ⊆ dom(β) such that for all d ∈ D we have β(d) = ⟨A, sa, false⟩ for some A ∈ V_N and nonempty sa ⊆ V_L,2, include β_D in A3, where for each d ∈ N*+

β_D(d) = ⟨A, sa, true⟩   if d ∈ D and β(d) = ⟨A, sa, false⟩
β_D(d) = ⟨A, ∅, false⟩   if d ∉ D and β(d) = ⟨A, sa, false⟩ for some sa
β_D(d) = β(d)            otherwise

V_L,3 = {γ_β | β ∈ A3}.

3. From G3 we construct an equivalent tag, G4 = (2^{V_L,3}, V_T, V_L,4, sa_{S′}, {α′}, A4, φ), such that the root and foot of each tree in A4 have an NA constraint and all other internal

nodes have OA constraints with no restriction on which trees can be adjoined except that the nonterminals must match. To do this we use the set of labels forming the SA constraint at each node as its nonterminal label. We want a tree to be adjoinable at a node whose nonterminal label contains . Thus, for every set of labels sa containing there must be an instance of with a new root and foot nodes having nonterminal label sa. 0 is such that 0(1) =  and 0() = hsaS0 ; VL;4; truei where saS0 is such that () = hS 0; saS0 ; truei. For each 2 A3 let 0 be such that 0(d) = hsa; VL;4; truei if (d) = hA; sa; oai for some A and oa, and 0(d) = (d) otherwise. For each sa  VL;3 such that 2 sa include the tree sa in A4 where sa=1 = 0, sa() = sa(ft ( ) 1) = hsa; ;; falsei. Also include in A4 a tree ; where ;() = ;(1) = h;; ;; falsei and ft ( ;) = 1. VL;4 = j 2 A4 . o

n
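The relabelling of step 3 can be sketched under the same assumed dictionary encoding of trees (addresses are strings of child indices, the empty string is the root). The placement of the new foot node one step below the old foot is our reading of the construction; all names are hypothetical.

```python
def relabel(gamma, V_L4):
    """Step 3, sketched: each node's SA set becomes its nonterminal label,
    and every node gets an unrestricted OA constraint over V_L4."""
    return {d: (frozenset(sa), V_L4, True) for d, (A, sa, oa) in gamma.items()}

def wrap(gamma_prime, foot_addr, sa):
    """gamma_sa, sketched: hang the relabelled tree below a new NA root
    labelled sa, and hang a new NA foot below the old foot."""
    t = {"": (sa, frozenset(), False)}                   # new root <sa, {}, false>
    for d, lab in gamma_prime.items():
        t["1" + d] = lab                                 # gamma_sa / 1 = gamma'
    t["1" + foot_addr + "1"] = (sa, frozenset(), False)  # new foot node
    return t
```

One such wrapped instance would be built for every set $sa$ of labels containing the tree's own label.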

4. From $G_4$ we construct an equivalent TAG $G_5 = (V_N', V_T, V_{L,5}, sa_{S'}, \{\alpha'\}, \mathrm{prune}(A_4), \phi)$, where

\[
\mathrm{prune}(A) = \bigcup_{\gamma \in A} \mathrm{prune}(\gamma)
\]

and $\mathrm{prune}(\gamma)$ is defined below. First, note that tree substitution can be simulated (with respect to terminal yield) by adjunction if the node at which adjunction takes place dominates a single node labelled by $\varepsilon$. Informally, a tree is pruned by removing subtrees rooted at siblings of the spine and arranging that a simulated substitution of the removed subtree can occur. The removed subtree will be turned into an auxiliary tree with the addition of a foot node, and will itself be pruned. Thus, through a series of simulated substitutions, a tree corresponding closely to the original tree can be recreated. Let $\gamma' \in \mathrm{prune}(\gamma)$ where for each $d \in \mathbb{N}_+^*$

\[
\gamma'(d) = \begin{cases}
\gamma(d) & \text{if } d \in \mathrm{spine}(\gamma)\\
\langle (\gamma, d), V_{L,5}, \mathit{true}\rangle & \text{if } d \notin \mathrm{spine}(\gamma) \text{ and } d = d'i \text{ where } i \geq 1 \text{ and } d' \in \mathrm{spine}(\gamma)\\
\varepsilon & \text{if } d \notin \mathrm{spine}(\gamma) \text{ and } d = d'i1 \text{ where } i \geq 1,\ d' \in \mathrm{spine}(\gamma), \text{ and } d'i \notin \mathrm{spine}(\gamma)\\
\text{undefined} & \text{otherwise}
\end{cases}
\]

$\gamma'$ is produced from $\gamma$ such that every node with address $d$ that is the sibling of a node on the path from the root to the foot of $\gamma$ is relabelled with the new nonterminal $(\gamma, d)$, given an OA constraint, and, in addition, the subtree under this node is replaced by a single node labelled $\varepsilon$. For each $d \in \mathrm{dom}(\gamma)$ such that $d \notin \mathrm{spine}(\gamma)$ and $d = d'i$ where $i \geq 1$ and $d' \in \mathrm{spine}(\gamma)$ (i.e., $d$ is the sibling of a node on the spine of $\gamma$), trees are included in $\mathrm{prune}(\gamma)$ as follows. If $\gamma(d) = w$ for some $w \in V_T \cup \{\varepsilon\}$ then include the tree $\gamma_w$ in $\mathrm{prune}(\gamma)$, where for each $d' \in \mathbb{N}_+^*$

\[
\gamma_w(d') = \begin{cases}
\langle (\gamma, d), \emptyset, \mathit{false}\rangle & \text{if } d' = \varepsilon \text{ or } d' = 2\\
w & \text{if } d' = 1\\
\text{undefined} & \text{otherwise}
\end{cases}
\]

Otherwise, let $\mathrm{prune}(\gamma_d) \subseteq \mathrm{prune}(\gamma)$, where $\gamma_d$ is a slight modification of the subtree $\gamma/d$ that is defined as follows. Let $\gamma_d'$ be such that $\gamma_d'/1 = \gamma/d$ and $\gamma_d'(\varepsilon) = \langle (\gamma, d), \emptyset, \mathit{false}\rangle$, i.e., $\gamma_d'$ is the subtree of $\gamma$ rooted at the node with address $d$, with the addition of a new root with nonterminal label $(\gamma, d)$ and an NA constraint. We now add a foot node to $\gamma_d'$ to get $\gamma_d$. Let $d'$ be such that $\gamma_d'(d') = \langle A, sa, oa\rangle$ for some $A$, $sa$ and $oa$, and for all $d''$ such that $\gamma_d'(d'') = \langle A, sa, oa\rangle$ for some $A$, $sa$ and $oa$, $|d'| \geq |d''|$ ($d'$ is the address of a deepest nonterminal node in $\gamma_d'$). Let $i \geq 1$ be such that $d'(i-1) \in \mathrm{dom}(\gamma_d')$ and $d'i \notin \mathrm{dom}(\gamma_d')$. Let $\gamma_d$ be equal to $\gamma_d'$ except that $\gamma_d(d'i) = \langle (\gamma, d), \emptyset, \mathit{false}\rangle$. $V_{L,5} = \{\,\ell(\gamma) \mid \gamma \in \mathrm{prune}(A_4)\,\}$. The tree $\beta$ at the left of Figure 13 would be converted into the four trees at its right.
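A single pruning pass can be sketched as follows, again under the assumed dictionary encoding of trees (one character per address step, labels treated as opaque values); the recursive pruning and foot-addition on the removed subtrees are omitted. All names are hypothetical.

```python
def prune_one(gamma, spine):
    """One pruning pass (step 4, sketched): every sibling of a spine node is
    relabelled with a fresh nonterminal (gamma, d), given an OA constraint,
    and made to dominate a single epsilon leaf; the subtree it dominated is
    split off, to be wrapped and pruned as an auxiliary tree of its own."""
    pruned, removed = {}, {}
    for d, lab in sorted(gamma.items()):
        if d in spine:
            pruned[d] = lab                            # spine nodes kept as-is
        elif d[:-1] in spine:                          # sibling of a spine node
            pruned[d] = (("gamma", d), "V_L5", True)   # <(gamma,d), V_L5, true>
            pruned[d + "1"] = "eps"                    # single epsilon leaf
            removed[d] = {d2[len(d):]: l2 for d2, l2 in gamma.items()
                          if d2.startswith(d)}
        # deeper nodes under a removed sibling are dropped here
    return pruned, removed
```

Adjoining the (wrapped) removed subtree at its OA node over the $\varepsilon$ leaf then simulates the substitution described in the text.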

Figure 13: Pruned trees

5. Finally, we give a TAG $G = (V_N', V_T, V_{L,5}, sa_{S'}, \{\alpha'\}, A, \phi)$, equivalent to $G_4$, such that for every $\beta$ in $A$ there is at most one OA node at any level of $\beta$ and the nodes of $\beta$ have at most 2 children. Let $\gamma$ be a tree in $\mathrm{prune}(A_4)$. Note that from the above construction only nodes on the spine of $\gamma$ can have more than one child. The conversion of $\gamma$ is shown in Figure 14, in which we show how an arbitrary node on the spine of $\gamma$ can be stretched out to satisfy the above condition. Every node on the spine would be treated in the same way. Note that where adjunction constraints remain unchanged they have been omitted.

Figure 14: Conversion to binary branching
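The stretching of Figure 14 can be sketched as a simple recursion, assuming a tuple encoding (node, children) where a node is a (nonterminal, SA set, OA flag) triple; this is an illustration of the binary-branching idea, not the paper's formal definition.

```python
def stretch(node, children):
    """Figure 14, sketched: a node with more than two children is replaced
    by a chain of copies of itself, each level introducing one child, so
    that every node ends up with at most two children.  The new internal
    copies carry an NA constraint (empty SA set, OA flag false)."""
    A, sa, oa = node
    if len(children) <= 2:
        return (node, children)
    na_copy = (A, frozenset(), False)       # NA copy of the stretched node
    return (node, [children[0], stretch(na_copy, children[1:])])
```

Since each level of the resulting chain introduces at most one original child, at most one OA node appears at any level, as required of the trees in $A$.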
