Parsing beyond context-free grammar: Tree Adjoining Grammar Parsing. Laura
Kallmeyer, Wolfgang Maier. University of Tübingen. ESSLLI Course 2008.
Tree Adjoining Grammars (1)
A Tree Adjoining Grammar (TAG) (Joshi & Schabes 1997) is a tree-rewriting system, i.e., a set of elementary trees with two operations:
• adjunction: replacing an internal node with a new tree. The new tree is an auxiliary tree and has a special leaf, the foot node.
• substitution: replacing a leaf with a new tree. The new tree is an initial tree.
Notation: γ[p, γ′] is the tree one obtains from replacing the node at position p in γ with the tree γ′ (by substitution or adjunction).
Overview
1. Tree Adjoining Grammars
2. An Earley parser for TAG
(a) Introduction
(b) Items
(c) Inference Rules
3. LR Parsing
(a) Introduction
(b) Construction of the automaton
(c) The recognizer
Tree Adjoining Grammars (2)
(1) John sometimes laughs
Elementary trees (bracketed notation; VP∗ marks the foot node):
• initial tree for "laughs": [S NP [VP [V laughs]]]
• initial tree for "John": [NP John]
• auxiliary tree for "sometimes": [VP [ADV sometimes] VP∗]
Derived tree: [S [NP John] [VP [ADV sometimes] [VP [V laughs]]]]
Derivation: laugh[1, john][2, sometimes]
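The two operations can be illustrated on example (1) with a minimal Python sketch. The class `Node` and the helpers `substitute`, `adjoin`, `node_at`, `find_foot` and `leaves` are hypothetical names chosen for this illustration; they are not part of the course material, and adjunction constraints are ignored.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    foot: bool = False  # True only for the foot node of an auxiliary tree


def copy_tree(n: Node) -> Node:
    return Node(n.label, [copy_tree(c) for c in n.children], n.foot)


def node_at(tree: Node, address: Tuple[int, ...]) -> Node:
    """Follow a Gorn address (1-based daughter indices) down from the root."""
    node = tree
    for i in address:
        node = node.children[i - 1]
    return node


def find_foot(n: Node) -> Optional[Node]:
    if n.foot:
        return n
    for c in n.children:
        f = find_foot(c)
        if f is not None:
            return f
    return None


def substitute(tree: Node, address: Tuple[int, ...], initial: Node) -> Node:
    """Replace the leaf at `address` with a copy of an initial tree."""
    result = copy_tree(tree)
    target = node_at(result, address)
    if target.children or target.label != initial.label:
        raise ValueError("substitution needs a leaf labelled like the root")
    target.children = copy_tree(initial).children
    return result


def adjoin(tree: Node, address: Tuple[int, ...], aux: Node) -> Node:
    """Splice a copy of an auxiliary tree in at the node at `address`."""
    result = copy_tree(tree)
    target = node_at(result, address)
    new = copy_tree(aux)
    foot = find_foot(new)
    if foot is None or foot.label != target.label or new.label != target.label:
        raise ValueError("root and foot must carry the label of the target")
    # The old subtree moves below the foot; the auxiliary tree takes its place.
    foot.children, foot.foot = target.children, False
    target.children = new.children
    return result


def leaves(n: Node) -> List[str]:
    if not n.children:
        return [n.label]
    return [x for c in n.children for x in leaves(c)]


laugh = Node("S", [Node("NP"), Node("VP", [Node("V", [Node("laughs")])])])
john = Node("NP", [Node("John")])
sometimes = Node("VP", [Node("ADV", [Node("sometimes")]), Node("VP", foot=True)])

# laugh[1, john][2, sometimes]
derived = adjoin(substitute(laugh, (1,), john), (2,), sometimes)
print(" ".join(leaves(derived)))
```

Both functions copy their input, so the elementary trees stay reusable across derivations.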
Tree Adjoining Grammars (3)
A Tree Adjoining Grammar (TAG) is a quadruple G = ⟨N, T, I, A⟩ such that
• T and N are disjoint alphabets of terminals and nonterminals,
• I is a finite set of initial trees, and
• A is a finite set of auxiliary trees.
The trees in I ∪ A are called elementary trees. G is lexicalized iff each elementary tree has at least one leaf with a terminal label.
TAG allows one to specify for each node
1. whether adjunction is mandatory and
2. which trees can be adjoined.
Tree Adjoining Grammars (4)
A derivation starts with an initial tree. In a final derived tree, all leaves must have terminal labels:
Let G = ⟨N, T, I, A⟩ be a TAG. Let γ and γ′ be finite trees.
• γ ⇒ γ′ in G iff there is a node position p and a tree γ0′ (an instance of some γ0 ∈ I ∪ A, or a tree derived from one) such that γ′ = γ[p, γ0′].
• ⇒* is the reflexive transitive closure of ⇒.
• The tree language of G is L_T(G) := {γ | there is an α ∈ I such that α ⇒* γ, all leaves in γ have terminal labels and there are no OA (obligatory adjunction) nodes in γ}.
Tree Adjoining Grammars (5)
Languages TAG can generate:
• {ww | w ∈ {a, b}*}
• L4 := {a^n b^n c^n d^n | n ≥ 0}
Languages TAG cannot generate:
• {w^n | w ∈ {a, b}*} for any n > 2.
⇒ TAGs generate only a limited amount of cross-serial dependencies.
• Lk := {a_1^n a_2^n a_3^n … a_k^n | n ≥ 0} for any k > 4.
⇒ TAG can "count up to 4, not further".
• L := {a^(2^n) | n ≥ 0}.
⇒ TAG cannot generate languages whose word lengths grow exponentially.
Tree Adjoining Grammars (6)
TAGs are mildly context-sensitive:
• TAGs are slightly more powerful than CFG; they can describe a limited amount of cross-serial dependencies.
• TAGs are polynomially parsable (complexity O(n^6)).
• TALs are of constant growth.
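To make the example languages and the constant-growth property concrete, here is a small Python sketch. The names `in_L4`, `in_copy` and `length_gaps` are illustrative only, and these are plain string checks, not TAG parsers. Constant growth is read informally here as "the gaps between successive word lengths stay bounded".

```python
import re
from typing import List


def in_L4(s: str) -> bool:
    """Membership in L4 = {a^n b^n c^n d^n | n >= 0}."""
    m = re.fullmatch(r"(a*)(b*)(c*)(d*)", s)
    return m is not None and len({len(g) for g in m.groups()}) == 1


def in_copy(s: str) -> bool:
    """Membership in the copy language {ww | w in {a, b}*}."""
    half, rem = divmod(len(s), 2)
    return rem == 0 and set(s) <= {"a", "b"} and s[:half] == s[half:]


def length_gaps(lengths: List[int]) -> List[int]:
    """Differences between successive word lengths."""
    return [b - a for a, b in zip(lengths, lengths[1:])]


# Word lengths in L4 are 0, 4, 8, ...: the gap is constant.
print(length_gaps([4 * n for n in range(5)]))
# Word lengths in {a^(2^n)} are 1, 2, 4, 8, ...: the gaps grow without bound.
print(length_gaps([2 ** n for n in range(5)]))
```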
Earley Parsing: Introduction (1)
• Left-to-right CKY parser (Vijay-Shanker & Joshi, 1985): very slow, O(n^6) in the worst case and in the best case (just as in the CFG version of CKY, too many partial trees not pertinent to the final tree are produced).
• This behaviour is due to the pure bottom-up approach; no predictive information whatsoever is used.
• Goal: an Earley-style parser! First presented in Schabes & Joshi (1988). Here, we present the algorithm from Joshi & Schabes (1997). We assume a TAG without substitution nodes.
Earley Parsing: Introduction (2)
• Earley Parsing: Left-to-right scanning of the string (using predictions to restrict the hypothesis space).
• Traversal of elementary trees, current position marked with a dot. The dot can have exactly four positions with respect to a node: left above (la), left below (lb), right above (ra), right below (rb).
Earley Parsing: Introduction (3)
General idea: Whenever we are
• left above a node, we can predict an adjunction and start the traversal of the adjoined tree;
• left of a foot node, we can move back to the adjunction site and traverse the tree below it;
• right of an adjunction site, we continue the traversal of the adjoined tree at the right of its foot node;
• right above the root of an auxiliary tree, we can move back to the right of the adjunction site.
Earley Parsing: Items (1)
What kind of information do we need in an item characterizing a partial parsing result?
[α, dot, pos, i, j, k, l, sat?] where
• α ∈ I ∪ A is a (dotted) tree, dot and pos are the address and location of the dot,
• i, j, k, l are indices on the input string, where i, l ∈ {0, …, n}, j, k ∈ {0, …, n} ∪ {−}, n = |w|, and − means unbound value,
• sat? is a flag. It controls (prevents) multiple adjunctions at a single node (sat? = 1 means that something has already been adjoined to the dotted node).
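One possible encoding of such items is sketched below in Python. The class name `Item`, the use of `None` for the unbound value −, a Boolean for sat?, and the empty tuple as root address are assumptions made for this illustration. Making the item immutable and hashable lets a chart be a plain set, so that an item derived twice is stored once.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

POSITIONS = ("la", "lb", "ra", "rb")


@dataclass(frozen=True)
class Item:
    tree: str                # name of the elementary tree (alpha)
    dot: Tuple[int, ...]     # Gorn address of the dotted node, () for the root
    pos: str                 # one of la, lb, ra, rb
    i: int
    j: Optional[int]         # None encodes the unbound value "-"
    k: Optional[int]
    l: int
    sat: bool = False        # False encodes nil, True encodes 1

    def __post_init__(self) -> None:
        if self.pos not in POSITIONS:
            raise ValueError(f"unknown dot position: {self.pos}")
        if not 0 <= self.i <= self.l:
            raise ValueError("expected 0 <= i <= l")


chart = set()
chart.add(Item("beta", (2,), "lb", 3, None, None, 3))
chart.add(Item("beta", (2,), "lb", 3, None, None, 3))  # duplicate, ignored
print(len(chart))
```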
Earley Parsing: Items (2)
What do the items mean?
• [α, dot, la, i, j, k, l, nil]: In α, the part left of the dot ranges from i to l. If α is an auxiliary tree, the part below the foot node ranges from j to k.
• [α, dot, lb, i, −, −, i, nil]: In α, the part below the dotted node starts at position i.
• [α, dot, rb, i, j, k, l, sat?]: In α, the part below the dotted node ranges from i to l. If α is an auxiliary tree, the part below the foot node ranges from j to k. If sat? = nil, nothing was adjoined to the dotted node; sat? = 1 means that adjunction took place.
• [α, dot, ra, i, j, k, l, nil]: In α, the part left of and below the dotted node ranges from i to l. If α is an auxiliary tree, the part below the foot node ranges from j to k.
Earley Parsing: Items (3)
Some notational conventions:
• We use Gorn addresses for the nodes: 0 is the address of the root, i (1 ≤ i) is the address of the ith daughter of the root, and for p ≠ 0, p · i is the address of the ith daughter of the node at address p.
• For a tree α and a Gorn address dot, α(dot) denotes the node at address dot in α (if defined).
• For a node n, Adj(n) is the set of trees adjoinable at n. nil ∈ Adj(n) signifies that adjunction is not obligatory. Adj(n) = ∅ if n has a terminal or ε as label.
Earley Parsing: Inference Rules (1)
ScanTerm:
  [α, dot, la, i, j, k, l, nil]
  ─────────────────────────────────   α(dot) labelled w_{l+1}
  [α, dot, ra, i, j, k, l + 1, nil]
Scan-ε:
  [α, dot, la, i, j, k, l, nil]
  ─────────────────────────────   α(dot) labelled ε
  [α, dot, ra, i, j, k, l, nil]
Earley Parsing: Inference Rules (2)
PredictAdjoinable:
  [α, dot, la, i, j, k, l, nil]
  ─────────────────────────────   β ∈ Adj(α(dot))
  [β, 0, la, l, −, −, l, nil]
PredictNoAdj:
  [α, dot, la, i, j, k, l, nil]
  ─────────────────────────────   nil ∈ Adj(α(dot))
  [α, dot, lb, l, −, −, l, nil]
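The scan and predict rules above can be sketched as functions from one item to the list of items it licenses. Everything about the encoding is an assumption for illustration: the toy grammar (`alpha` = S over a, `beta` = S over b and a foot node), the tables `LABEL`, `TERMINALS` and `ADJ`, `None` standing for nil and for −, and the empty tuple standing for the root address that the slides write as 0.

```python
from typing import List, NamedTuple, Optional, Tuple


class Item(NamedTuple):
    tree: str
    dot: Tuple[int, ...]
    pos: str
    i: int
    j: Optional[int]
    k: Optional[int]
    l: int
    sat: bool = False


EPS = ""  # label used for epsilon leaves
TERMINALS = {"a", "b"}
# Hypothetical toy grammar: node labels by (tree, Gorn address).
LABEL = {("alpha", ()): "S", ("alpha", (1,)): "a",
         ("beta", ()): "S", ("beta", (1,)): "b", ("beta", (2,)): "S"}
# Adj(n): adjoinable trees; None plays the role of nil.
ADJ = {("alpha", ()): {None, "beta"},
       ("beta", ()): {None},
       ("beta", (2,)): {None}}


def scan(item: Item, words: List[str]) -> List[Item]:
    """ScanTerm and Scan-eps: move the dot from la to ra over a leaf."""
    if item.pos != "la":
        return []
    label = LABEL[(item.tree, item.dot)]
    if label == EPS:
        return [item._replace(pos="ra")]
    # w_{l+1} in the rule is words[l] with 0-based indexing.
    if label in TERMINALS and item.l < len(words) and words[item.l] == label:
        return [item._replace(pos="ra", l=item.l + 1)]
    return []


def predict(item: Item) -> List[Item]:
    """PredictAdjoinable and PredictNoAdj for an item with the dot at la."""
    if item.pos != "la":
        return []
    out = []
    for choice in ADJ.get((item.tree, item.dot), set()):
        if choice is None:   # PredictNoAdj: step below the node
            out.append(Item(item.tree, item.dot, "lb", item.l, None, None, item.l))
        else:                # PredictAdjoinable: start the auxiliary tree
            out.append(Item(choice, (), "la", item.l, None, None, item.l))
    return out


start = Item("alpha", (), "la", 0, None, None, 0)
print(predict(start))
```

For terminal and ε leaves `ADJ` has no entry, which mirrors Adj(n) = ∅ and makes `predict` return nothing there.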
Earley Parsing: Inference Rules (3)
PredictAdjoined:
  [β, dot, lb, l, −, −, l, nil]
  ──────────────────────────────   dot = foot(β), β ∈ Adj(α(dot′))
  [α, dot′, lb, l, −, −, l, nil]
Earley Parsing: Inference Rules (4)
Complete I:
  [α, dot, rb, i, j, k, l, nil], [β, dot′, lb, i, −, −, i, nil]
  ──────────────────────────────   dot′ = foot(β), β ∈ Adj(α(dot))
  [β, dot′, rb, i, i, l, l, nil]
Earley Parsing: Inference Rules (5)
Complete II:
  [α, dot, rb, i, j, k, l, sat?], [α, dot, la, h, −, −, i, nil]
  ──────────────────────────────   α(dot) ∈ N
  [α, dot, ra, h, j, k, l, nil]
or
  [α, dot, rb, i, −, −, l, sat?], [α, dot, la, h, j, k, i, nil]
  ──────────────────────────────   α(dot) ∈ N
  [α, dot, ra, h, j, k, l, nil]
Earley Parsing: Inference Rules (6)
Adjoin:
  [β, 0, ra, i, j, k, l, nil], [α, dot, rb, j, p, q, k, nil]
  ──────────────────────────────   β ∈ Adj(α(dot))
  [α, dot, rb, i, p, q, l, 1]
sat? = 1 prevents the new item from being reused in another Adjoin application.
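The index bookkeeping of Adjoin and the role of the sat? flag can be checked with a short Python sketch. The function name `adjoin`, the callback `adjoinable` (standing in for the test β ∈ Adj(α(dot))) and the item encoding (`None` for −, a Boolean for sat?, the empty tuple for the root address) are assumptions for this illustration.

```python
from typing import Callable, NamedTuple, Optional, Tuple


class Item(NamedTuple):
    tree: str
    dot: Tuple[int, ...]
    pos: str
    i: int
    j: Optional[int]
    k: Optional[int]
    l: int
    sat: bool = False


def adjoin(b: Item, a: Item,
           adjoinable: Callable[[str, str, Tuple[int, ...]], bool]) -> Optional[Item]:
    """Adjoin: combine a finished auxiliary tree b with the subtree item a.

    b = [beta, root, ra, i, j, k, l, nil], a = [alpha, dot, rb, j, p, q, k, nil].
    Returns [alpha, dot, rb, i, p, q, l, 1], or None if the rule does not apply.
    """
    applies = (
        b.dot == () and b.pos == "ra" and not b.sat
        and a.pos == "rb" and not a.sat          # nothing adjoined at a yet
        and b.j is not None and (a.i, a.l) == (b.j, b.k)  # a fills the foot gap
        and adjoinable(b.tree, a.tree, a.dot)
    )
    if not applies:
        return None
    return Item(a.tree, a.dot, "rb", b.i, a.j, a.k, b.l, True)


def always(beta: str, alpha: str, dot: Tuple[int, ...]) -> bool:
    return True  # toy stand-in: every auxiliary tree is adjoinable everywhere


beta_done = Item("beta", (), "ra", 0, 1, 2, 3)        # foot gap from 1 to 2
alpha_sub = Item("alpha", (), "rb", 1, None, None, 2)  # subtree spanning 1 to 2
result = adjoin(beta_done, alpha_sub, always)
print(result)

# A second auxiliary tree whose foot gap matches the new span is still rejected,
# because the resulting item carries sat? = 1.
beta_again = Item("beta", (), "ra", 0, 0, 3, 3)
print(adjoin(beta_again, result, always))
```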
Earley Parsing: Inference Rules (7)
Move the dot to daughter/sister/mother:
MoveDown:
  [α, p, lb, i, j, k, l, nil]
  ───────────────────────────────   α(p · 1) is defined
  [α, p · 1, la, i, j, k, l, nil]
MoveRight:
  [α, p, ra, i, j, k, l, nil]
  ───────────────────────────────   α(p + 1) is defined
  [α, p + 1, la, i, j, k, l, nil]
MoveUp:
  [α, p · m, ra, i, j, k, l, nil]
  ───────────────────────────────   α(p · m + 1) is not defined
  [α, p, rb, i, j, k, l, nil]
Earley Parsing: Inference Rules (8)
Initialize:
  ───────────────────────────   α ∈ I
  [α, 0, la, 0, −, −, 0, nil]
Goal item: [α, 0, ra, 0, −, −, n, nil], α ∈ I
Earley Parsing: Summary
• We have seen an Earley-type recognition algorithm for TAG.
• The parser has Complete, Scan and Predict operations plus an Adjunction operation.
• The algorithm has an upper time bound of O(n^6).
• The parser does not have the Valid Prefix Property. Ensuring this property for TAG parsing is costly.
• We can turn our recognizer into a parser by storing each item with a set of pairs of other items from which it can be inferred.
LR parsing: Introduction (1)
• LR parsing: Left-to-right scanning and Right-to-left reduction
• We compile a finite-state automaton from the grammar (offline) and use it to guide actions during parsing (online).
• What does the automaton represent?
– States: correspond to sets of items closed under prediction
– Edges: correspond to scanning a terminal symbol or consuming an already recognized nonterminal
Roughly, LR parsing is Earley parsing with precompiled predictions.
LR parsing: Introduction (2)
An LR automaton is typically represented by two tables.
• The Action table lists what action must be performed (shift or reduce). This action depends on
– the current state in the automaton
– the next preterminal to be read
• The Goto table lists the states where the automaton has to go after reducing a production.
LR parsing: Introduction (3)
In CFG LR parsing, we have two operations on a stack at our disposal:
1. shift(k): Scans a terminal, pushes the corresponding pre-terminal on the stack and switches to state k.
2. reduce(A): The RHS of some production A → A1 … An has been recognized, i.e., is on the stack. reduce(A) pops A1, …, An from the stack and pushes the LHS A on the stack, then switches to the next state (provided by the Goto table).
LR parsing: Introduction (4)
• Nederhof (1998) extends traditional LR parsing to TAG.
• His algorithm is based on
– an LR parse automaton,
– a function to scan the next symbol of the input: shift(Δ, aw),
– two functions to reduce partial results on the stack: reduceSubtree(Δ, w) and reduceAuxtree(Δ, w),
where w is the input and Δ is the LR stack.
• Nederhof (1998) mentions an implementation of the parser generator.
• LR automaton generation for the XTAG grammar seemed to be feasible.
LR parsing: Introduction (5)
Notations:
• 𝒩(t) is the set of nodes of a tree t.
• children(N) is the list of the children of a node N, given in linear precedence order.
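Before moving to TAG, the CFG shift and reduce operations and the Action and Goto tables described above can be tried out on a toy grammar. The sketch below is an assumption-laden illustration, not taken from the course: the grammar S → ( S ) | x, the hand-built LR(0) tables `ACTION` and `GOTO`, the end marker `$` and the function name `recognize` are all chosen here. The stack alternates states and symbols, as in the description of shift and reduce.

```python
from typing import Iterable

# Hand-built LR(0) tables for the toy grammar  S -> ( S ) | x.
# States: 0 start, 1 after S (accept), 2 after "(", 3 after "x",
#         4 after "( S", 5 after "( S )".
ACTION = {
    (0, "("): ("shift", 2), (0, "x"): ("shift", 3),
    (1, "$"): ("accept",),
    (2, "("): ("shift", 2), (2, "x"): ("shift", 3),
    (3, ")"): ("reduce", "S", 1), (3, "$"): ("reduce", "S", 1),
    (4, ")"): ("shift", 5),
    (5, ")"): ("reduce", "S", 3), (5, "$"): ("reduce", "S", 3),
}
GOTO = {(0, "S"): 1, (2, "S"): 4}


def recognize(tokens: Iterable[str]) -> bool:
    stack = [0]                       # alternates states and symbols
    rest = list(tokens) + ["$"]
    pos = 0
    while True:
        action = ACTION.get((stack[-1], rest[pos]))
        if action is None:
            return False              # no table entry: reject
        if action[0] == "accept":
            return True
        if action[0] == "shift":
            stack += [rest[pos], action[1]]   # push symbol, then new state
            pos += 1
        else:
            _, lhs, rhs_len = action
            del stack[-2 * rhs_len:]          # pop the RHS symbols and states
            # The state now on top decides where to go via the Goto table.
            stack += [lhs, GOTO[(stack[-1], lhs)]]


print(recognize("((x))"), recognize("(x"))
```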
LR parsing: Introduction (6)
• The elementary trees are extended with artificial new nodes:
– For each t ∈ I ∪ A, we add a unique node ⊤ immediately dominating R_t (the root of t).
– For each t ∈ A, we add a unique node ⊥ immediately dominated by F_t (the foot of t).
• For a t ∈ I ∪ A, (t, N) denotes the subtree of t rooted in N. 𝒯 = I ∪ A ∪ {(t, N) | t ∈ I ∪ A, N ∈ 𝒩(t)} is the set of all subtrees of elementary trees, including the elementary trees themselves.
Assume that our TAG has no substitution nodes and does not contain empty words.
LR parsing: Construction of the automaton (1)
• The states of the LR automaton are sets of items.
• Transitions are labeled with terminals and nonterminals.
An item represents a subtree of height 1 (mother node N and its daughters) in one of the τ ∈ 𝒯 together with a dot • that specifies up to which daughter the subtree has been recognized. This subtree is written as a dotted production N → α • β.
LR parsing: Construction of the automaton (2)
Items have the form [τ, N → α • β], where
• τ ∈ 𝒯,
• N ∈ 𝒩(τ), and
• αβ are the daughters of N.
An item is called completed if it has the form
• either [t, ⊤ → R_t •] with t ∈ I ∪ A,
• or [(t, N), N → α •].
LR parsing: Construction of the automaton (3)
• The construction of the set of states of the automaton starts with an initial LR state q_in = {[t, ⊤ → • R_t] | t ∈ I}.
• From each state, new states can be computed using the functions goto and goto⊥.
• To compute these functions for a given state q, one needs the closure closure(q) of this state.
Intuition: the closure contains all items that can be obtained from an item [τ, …] in q by moving down or up in τ or predicting an adjunction or predicting the part below a foot node.
LR parsing: Construction of the automaton (4)
Definition of the closure of a state q: Let q be a set of items. closure(q) is then defined by the following inference rules:
•
  ─────   x ∈ q
    x
•
  [τ, N → α • M β]
  ────────────────   nil ∈ Adj(M), children(M) = γ
  [τ, M → • γ]
•
  [τ, N → α • M β]
  ────────────────   t ∈ Adj(M)
  [t, ⊤ → • R_t]
•
  [τ, F_t → • ⊥]
  ──────────────────   t ∈ Adj(N), N ∈ 𝒩(t′), children(N) = γ
  [(t′, N), N → • γ]
LR parsing: Construction of the automaton (6)
Now we can define the set Q of LR states of our automaton as follows:
•
  ─────
  q_in
•
    q
  ─────   q′ = goto(q, M) ≠ ∅ for some node M
    q′
•
    q
  ─────   q′ = goto⊥(q, M) ≠ ∅ for some node M
    q′
A state is final (in Q_fin) if its closure contains a completed item for some initial tree:
Q_fin = {q ∈ Q | closure(q) ∩ {[t, ⊤ → R_t •] | t ∈ I} ≠ ∅}
LR parsing: Construction of the automaton (5)
Intuition behind the goto functions: goto shifts the dot over a node, goto⊥ shifts the dot over a ⊥ (i.e., a foot node daughter).
Definition of goto and goto⊥: Let q be a set of items, M a terminal leaf or a node with Adj(M) ∩ A ≠ ∅ (no NA constraint).
• goto(q, M) = {[τ, N → α M • β] | [τ, N → α • M β] ∈ closure(q)}
• goto⊥(q, M) = {[τ, F_t → ⊥ •] | [τ, F_t → • ⊥] ∈ closure(q) ∧ t ∈ Adj(M)}
LR parsing: Construction of the automaton (7)
For the definition of the recognizer, we also need the notion of reductions(q) for a given state q.
Intuition: If the closure of q contains a completed item, then the LHS node of the dotted production or, if this is a ⊤ in an auxiliary tree, the whole tree is part of the reductions.
Definition of reductions(q) for a given state q:
reductions(q) = {t ∈ A | [t, ⊤ → R_t •] ∈ closure(q)} ∪ {N ∈ 𝒩 | [(t, N), N → α •] ∈ closure(q)}
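The closure rules listed for closure(q), together with goto and goto⊥, can be sketched in Python on a tiny grammar. All names and the encoding are assumptions for illustration: the toy grammar (initial tree `a1` = S over a, auxiliary tree `b1` = S over b and a foot), node names such as `a1.S`, the tables `TOP`, `ROOT`, `CHILDREN`, `BOTTOM_TREE` and `ADJ`, and `None` standing for nil. Only the closure rules stated in this section are implemented, and the restriction on which nodes M may be passed to `goto` is left to the caller.

```python
from typing import FrozenSet, Iterable, NamedTuple, Tuple, Union


class Item(NamedTuple):
    tau: Union[str, Tuple[str, str]]  # tree name, or (tree, node) for a subtree
    lhs: str                          # mother node N
    before: Tuple[str, ...]           # daughters left of the dot
    after: Tuple[str, ...]            # daughters right of the dot


# Hypothetical toy grammar, already extended with top and bottom nodes.
TOP = {"a1": "a1.⊤", "b1": "b1.⊤"}
ROOT = {"a1": "a1.S", "b1": "b1.Sr"}
CHILDREN = {"a1.⊤": ("a1.S",), "a1.S": ("a1.a",),
            "b1.⊤": ("b1.Sr",), "b1.Sr": ("b1.b", "b1.Sf"),
            "b1.Sf": ("b1.⊥",)}
BOTTOM_TREE = {"b1.⊥": "b1"}          # bottom node -> its auxiliary tree
TREE_OF = {n: n.split(".")[0] for n in CHILDREN}
# Adj(M); None plays the role of nil. Leaves have no entry (empty set).
ADJ = {"a1.S": {None, "b1"}, "b1.Sr": {None}, "b1.Sf": {None}}


def closure(q: Iterable[Item]) -> FrozenSet[Item]:
    result, agenda = set(q), list(q)

    def add(item: Item) -> None:
        if item not in result:
            result.add(item)
            agenda.append(item)

    while agenda:
        item = agenda.pop()
        if not item.after:
            continue
        m = item.after[0]
        if m in BOTTOM_TREE:
            # [tau, F_t -> . bottom]: predict the part below the foot node.
            t = BOTTOM_TREE[m]
            for n, adj in ADJ.items():
                if t in adj:
                    add(Item((TREE_OF[n], n), n, (), CHILDREN[n]))
            continue
        adj = ADJ.get(m, set())
        if None in adj:               # no adjunction at M: move down
            add(Item(item.tau, m, (), CHILDREN[m]))
        for t in adj - {None}:        # predict adjunction of t at M
            add(Item(t, TOP[t], (), (ROOT[t],)))
    return frozenset(result)


def goto(q: Iterable[Item], m: str) -> FrozenSet[Item]:
    return frozenset(Item(i.tau, i.lhs, i.before + (m,), i.after[1:])
                     for i in closure(q) if i.after and i.after[0] == m)


def goto_bottom(q: Iterable[Item], m: str) -> FrozenSet[Item]:
    return frozenset(Item(i.tau, i.lhs, i.before + (i.after[0],), ())
                     for i in closure(q)
                     if i.after and i.after[0] in BOTTOM_TREE
                     and BOTTOM_TREE[i.after[0]] in ADJ.get(m, set()))


q_in = frozenset({Item("a1", "a1.⊤", (), ("a1.S",))})
q1 = goto(q_in, "b1.b")               # state reached after reading b
print(sorted(map(str, closure(q1))))
```

In this toy grammar the closure of `q1` already predicts the subtree item for `a1.S`, i.e., the material that must be found below the foot node of `b1`.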
LR parsing: Construction of the automaton (8)
We also need the definition of cross-sections through a tree rooted at some node N.
Intuition: the sequences on the stack that can be reduced, i.e., that correspond roughly to the RHS of some completed dotted production, are cross-sections.
A cross-section of a node N is either the node N or a sequence of cross-sections of the daughters of N in linear precedence order.
Furthermore, nodes dominating foot nodes are paired with a stack of nodes (indicating where subsequent adjunctions took place).
LR parsing: Construction of the automaton (9)
Definition of the cross-sections CS(N) of a node N:
Define ℳ := 𝒩 ∪ (𝒩 × 𝒩*), where 𝒩 is the set of nodes.
Then for a given node N:
• N ∈ CS(N) if N does not dominate a foot node,
• (N, L) ∈ CS(N) for each L ∈ 𝒩* if N dominates a foot node,
• x1 … xm ∈ CS(N) if children(N) = M1 … Mm and xi ∈ CS(Mi) for 1 ≤ i ≤ m.
Furthermore, CS+(N) := CS(N) \ ({N} ∪ {(N, L) | L ∈ 𝒩*}) (the cross-sections without the node itself).
LR parsing: The recognizer (1)
• The stack Δ contains states and symbols. The latter are either terminal nodes or nonterminal nodes equipped with a stack.
• A configuration (Δ, w) consists of a stack and a word (the remaining part of the input string).
• There are three operations that allow the automaton to make a transition (i.e., to change configuration): shift, reduce subtree and reduce aux tree.
LR parsing: The recognizer (2)
shift pushes the next input symbol followed by a new state on the stack:
(Δq, aw) ⊢ (Δq a q′, w) if q′ = goto(q, a) ≠ ∅.
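The definition of CS(N) and CS+(N) can be explored with a short Python sketch. The function names `cross_sections` and `proper_cross_sections`, the dictionary encoding of `children`, and the explicit set `spine` of nodes that count as dominating a foot node are assumptions for illustration. Since (N, L) ∈ CS(N) holds for every stack L, the sketch does not enumerate stacks; it writes the placeholder string `"L"` in their place.

```python
from itertools import product
from typing import Dict, FrozenSet, List, Tuple

PLACEHOLDER = "L"  # stands for an arbitrary stack of nodes


def cross_sections(node: str,
                   children: Dict[str, Tuple[str, ...]],
                   spine: FrozenSet[str] = frozenset()) -> List[tuple]:
    """All cross-sections of `node`, the node itself first."""
    own = (node, PLACEHOLDER) if node in spine else node
    result = [(own,)]
    kids = children.get(node, ())
    if kids:
        # One cross-section per daughter, concatenated in precedence order.
        parts = [cross_sections(k, children, spine) for k in kids]
        for combo in product(*parts):
            result.append(tuple(x for part in combo for x in part))
    return result


def proper_cross_sections(node, children, spine=frozenset()):
    """CS+(N): the cross-sections without the node itself."""
    return cross_sections(node, children, spine)[1:]


# The initial tree for "laughs" from example (1), without foot nodes.
laugh = {"S": ("NP", "VP"), "VP": ("V",), "V": ("laughs",)}
print(proper_cross_sections("S", laugh))

# The auxiliary tree for "sometimes"; which nodes belong to the spine
# is supplied by hand here.
sometimes = {"VPr": ("ADV", "VPf")}
print(cross_sections("VPr", sometimes, frozenset({"VPr", "VPf"})))
```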
LR parsing: The recognizer (3)
• The stack is initialized with the initial state q_in.
• The stack always contains an alternation of states q ∈ Q and nodes or nodes with stacks X ∈ ℳ.
• A parse is successful if, in a sequence of transitions (i.e., applications of shift, reduce subtree and reduce aux tree), the input is completely consumed and the automaton reaches a final state:
Some input v is recognized if (q_in, v) ⊢* (q_in Δ q, ε) such that q ∈ Q_fin.
LR parsing: The recognizer (4)
Reduce aux tree is applied once an auxiliary tree has been recognized. We then go back to the node where the adjunction occurred.
(Δ q0 X1 q1 … Xm qm, w) ⊢ (Δ q0 X q′, w) if
• there is a t ∈ reductions(qm) with X1 … Xm ∈ CS+(R_t),
• q′ = goto(q0, N) ≠ ∅ where N is obtained from the unique Xj of the form M[N L], and
• if L = [ ], then X = N, otherwise X = N[L].
LR parsing: The recognizer (5)
Reduce subtree is applied when having completed a subtree rooted in N such that an adjunction occurs at N. In other words, it recognizes the part below a foot node.
(Δ q0 X1 q1 … Xm qm, w) ⊢ (Δ q0 (⊥, [N L]) q′, w) if
• N ∈ reductions(qm), X1 … Xm ∈ CS+(N), q′ = goto⊥(q0, N) ≠ ∅, and
• L is defined as follows: if some Xj is of the form (M, L), then this provides L, otherwise L = [ ].
LR parsing: Summary
• LR parsing techniques can be applied to TAG.
• Shift-reduce parser guided by a precompiled automaton.
• General idea: precompile predictions and moves into states and precompile shifts and reductions into transitions of an automaton.
• Problem: LR automata get very big.