Kallmeyer/Maier
ESSLLI 2008
Parsing beyond context-free grammar: Range Concatenation Grammar Parsing
ESSLLI Course 2008
Laura Kallmeyer, Wolfgang Maier
University of Tübingen

Overview
1. Range Concatenation Grammars (RCG)
2. Parsing RCG
   (a) Directional top-down parsing
   (b) Earley-style parsing
3. Uses of RCG

Range Concatenation Grammar
• The idea behind range concatenation grammar (RCG) is comparable to the idea behind MCFG.
• While in MCFG a string is generated, in RCG a string is reduced to $\epsilon$.
• A predicate can be true or false for a certain string.
• A string $w$ is in the language of an RCG if the start predicate is true for $w$.
• Predicate-rewriting clauses describe ranges which are not necessarily adjacent.

Expressivity of RCG
• RCG exactly covers the class of PTIME recognizable languages (Bertsch & Nederhof, 2001).
• Simple RCG (basically non-deleting, non-copying RCG) is equivalent to MCFG.
• RCG can represent languages beyond mild context-sensitivity.
Definition of RCGs: Grammar Definition
An RCG is a tuple $G = \langle N, T, V, P, S \rangle$ such that
• $N$ is a finite set of predicates, each with a fixed arity,
• $T$ and $V$ are disjoint finite sets of terminals and variables,
• $S \in N$ is the start predicate of arity 1, and
• $P$ is a finite set of clauses of the form
  $A_0(x_{01}, \ldots, x_{0a_0}) \to \epsilon$
  or
  $A_0(x_{01}, \ldots, x_{0a_0}) \to A_1(x_{11}, \ldots, x_{1a_1}) \ldots A_n(x_{n1}, \ldots, x_{na_n})$
  with $n \geq 1$, $A_i \in N$, $x_{ij} \in (T \cup V)^*$, and $a_i$ being the arity of $A_i$.
A predicate $A_n(x_{n1}, \ldots, x_{na_n})$ can be written as $A_n(\vec{x}_n)$.

Definition of RCGs: Instantiation
A given clause $C$ is instantiated with respect to a string $w$ if variables and arguments are consistently replaced by ranges of $w$.
Example:
• $A(\langle i, j \rangle) \to B(\langle i+1, j \rangle)$
is an instantiation of the clause
• $A(a X_1) \to B(X_1)$
if $w_{i+1} = a$.

Definition of RCGs: Derivation Relation, Language
• The derivation relation is defined as follows: For a predicate $A$ of arity $k$, a clause $A(\ldots) \to \ldots$, and ranges $\langle i_1, j_1 \rangle, \ldots, \langle i_k, j_k \rangle$ with respect to a given $w$: if there is an instantiation of this clause with LHS $A(\langle i_1, j_1 \rangle, \ldots, \langle i_k, j_k \rangle)$, then $A(\langle i_1, j_1 \rangle, \ldots, \langle i_k, j_k \rangle)$ can be replaced with the RHS of this instantiation.
• The language of an RCG $G$ is the set of strings that can be reduced to the empty word:
  $L(G) = \{ w \mid S(\langle 0, |w| \rangle) \stackrel{*}{\Rightarrow} \epsilon \text{ with respect to } w \}$.

A sample RCG (1)
Sample RCG $G$ for the string language $\{a^n b^k a^n \mid k, n \in \mathbb{N}\}$: an RCG with $N = \{S, A, B\}$, $T = \{a, b\}$, $V = \{X, Y, Z\}$, start predicate $S$ and clauses
• $S(X\,Y\,Z) \to A(X, Z)\,B(Y)$,
• $A(a\,X, a\,Y) \to A(X, Y)$,
• $B(b\,X) \to B(X)$,
• $A(\epsilon, \epsilon) \to \epsilon$,
• $B(\epsilon) \to \epsilon$
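The sample grammar can be tried out with a small Python sketch. It assumes a naive top-down strategy with memoization, not either of the parsing algorithms presented later in these slides. The clause encoding and the names `CLAUSES`, `bindings` and `recognize` are hypothetical, introduced only for illustration. Right-hand-side arguments are restricted to single variables, which suffices for this grammar, and the sketch assumes that no clause can call itself on the same ranges.

```python
from functools import lru_cache

# Clauses of the sample grammar: (predicate, LHS arguments, RHS predicates).
# Upper-case symbols are variables, lower-case symbols are terminals.
# An empty tuple as an argument stands for epsilon.
CLAUSES = [
    ("S", (("X", "Y", "Z"),), (("A", ("X", "Z")), ("B", ("Y",)))),
    ("A", (("a", "X"), ("a", "Y")), (("A", ("X", "Y")),)),
    ("B", (("b", "X"),), (("B", ("X",)),)),
    ("A", ((), ()), ()),
    ("B", ((),), ()),
]


def bindings(arg, lo, hi, w, env):
    """Yield every extension of env under which the symbols of arg span w[lo:hi]."""
    if not arg:
        if lo == hi:
            yield env
        return
    sym, rest = arg[0], arg[1:]
    if sym.islower():  # terminal: must match the next input symbol
        if lo < hi and w[lo] == sym:
            yield from bindings(rest, lo + 1, hi, w, env)
    elif sym in env:  # variable already bound: must denote the same range
        if env[sym][0] == lo and env[sym][1] <= hi:
            yield from bindings(rest, env[sym][1], hi, w, env)
    else:  # unbound variable: try every possible right boundary
        for mid in range(lo, hi + 1):
            yield from bindings(rest, mid, hi, w, {**env, sym: (lo, mid)})


def recognize(w, clauses=CLAUSES, start="S"):
    """Return True if the start predicate can be reduced to epsilon for w."""

    @lru_cache(maxsize=None)
    def holds(pred, ranges):
        for name, lhs, rhs in clauses:
            if name != pred or len(lhs) != len(ranges):
                continue
            # Instantiate the clause: bind the LHS arguments to the given ranges.
            envs = [{}]
            for arg, (lo, hi) in zip(lhs, ranges):
                envs = [e2 for e in envs for e2 in bindings(arg, lo, hi, w, e)]
            # The instantiation succeeds if every RHS predicate holds in turn.
            for env in envs:
                if all(holds(p, tuple(env[v] for v in args)) for p, args in rhs):
                    return True
        return False

    return holds(start, ((0, len(w)),))
```

With this sketch, `recognize("aabaa")` succeeds and `recognize("aab")` fails. Enumerating all instantiations is exponential in the worst case without the memoization, so this is meant as an illustration of instantiation and reduction rather than as an efficient parser.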
A sample RCG (2)
As an example consider the reduction of $w = aabaa$:
  $S(X\,Y\,Z) \to A(X, Z)\,B(Y)$
  with $X = w_{0,2} = aa$, $Y = w_{2,3} = b$, $Z = w_{3,5} = aa$.
With this instantiation, $S(w_{0,5}) \Rightarrow A(w_{0,2}, w_{3,5})\,B(w_{2,3})$. Then

A sample RCG (3)
  $B(b\,X) \to B(X)$
  with $b = w_{2,3}$, $X = w_{3,3} = \epsilon$
and $B(\epsilon) \to \epsilon$
lead to $A(w_{0,2}, w_{3,5})\,B(w_{2,3}) \Rightarrow A(w_{0,2}, w_{3,5})\,B(w_{3,3}) \Rightarrow A(w_{0,2}, w_{3,5})$.

A sample RCG (4)
  $A(a\,X, a\,Y) \to A(X, Y)$
  with the first $a$ instantiated as $w_{0,1}$, $X = w_{1,2} = a$, the second $a$ as $w_{3,4}$, and $Y = w_{4,5} = a$
leads to $A(w_{0,2}, w_{3,5}) \Rightarrow A(w_{1,2}, w_{4,5})$. Then

A sample RCG (5)
  $A(a\,X, a\,Y) \to A(X, Y)$
  with the first $a$ instantiated as $w_{1,2}$, $X = w_{2,2} = \epsilon$, the second $a$ as $w_{4,5}$, and $Y = w_{5,5} = \epsilon$
and $A(\epsilon, \epsilon) \to \epsilon$
lead to $A(w_{1,2}, w_{4,5}) \Rightarrow A(w_{2,2}, w_{5,5}) \Rightarrow \epsilon$.
Definition of RCGs: Other properties (1)
• An RCG with maximal predicate arity $k$ is called an RCG of arity $k$ (also called a $k$-RCG).
• An RCG is called non-combinatorial if each argument in the right-hand sides of its clauses is a single variable.
• An RCG is called linear if no variable appears more than once in the left-hand side of a clause and no variable appears more than once in the right-hand side of a clause.

RCG parsing: Treatment of terminals
Without loss of generality, we presuppose that all non-$\epsilon$ clauses contain no terminals in their arguments.
For each $t \in T$, we introduce a new clause $T_t(t) \to \epsilon$, and for each clause $C \in P$,
• we replace each occurrence $t'$ of $t$ in all arguments of all predicates with a variable $V_{t'}$,
• for each $V_{t'}$, we add the predicate $T_t(V_{t'})$ to the RHS of $C$.
Furthermore, for all clauses we assume that their variables are continuously numbered from 1 to some $j$.
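The terminal treatment described above can be sketched in Python. This is an illustration under stated assumptions, not the authors' implementation: clauses are encoded as hypothetical triples `(predicate, LHS arguments, RHS predicates)`, every argument is a tuple of symbols, the fresh variable names `V1, V2, ...` are assumed not to occur in the grammar already, and $\epsilon$-clauses are left untouched. The final renumbering of variables from 1 to $j$ is omitted.

```python
def eliminate_terminals(clauses, terminals):
    """Return clauses whose non-epsilon clauses carry no terminals in their arguments.

    clauses:   list of (pred, lhs_args, rhs), with rhs a tuple of (pred, args)
               and every argument a tuple of symbols (hypothetical encoding).
    terminals: the set T of terminal symbols.
    """
    # One clause T_t(t) -> epsilon per terminal t.
    out = [(f"T_{t}", ((t,),), ()) for t in sorted(terminals)]
    for pred, lhs, rhs in clauses:
        if not rhs:  # epsilon clause: kept as it is
            out.append((pred, lhs, rhs))
            continue
        fresh = []  # (terminal, new variable) pairs introduced for this clause

        def strip(arg):
            """Replace each terminal occurrence in arg by a fresh variable."""
            new = []
            for sym in arg:
                if sym in terminals:
                    var = f"V{len(fresh) + 1}"  # assumed unused in the grammar
                    fresh.append((sym, var))
                    new.append(var)
                else:
                    new.append(sym)
            return tuple(new)

        new_lhs = tuple(strip(a) for a in lhs)
        new_rhs = tuple((p, tuple(strip(a) for a in args)) for p, args in rhs)
        # Add T_t(V) to the RHS for every variable V that replaced a terminal t.
        new_rhs += tuple((f"T_{t}", ((v,),)) for t, v in fresh)
        out.append((pred, new_lhs, new_rhs))
    return out
```

Applied to the clause $A(a\,X, a\,Y) \to A(X, Y)$ of the sample grammar, the sketch yields $A(V_1 X, V_2 Y) \to A(X, Y)\,T_a(V_1)\,T_a(V_2)$ together with the clauses $T_a(a) \to \epsilon$ and $T_b(b) \to \epsilon$.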
Definition of RCGs: Other properties (2)
• An RCG is called non-erasing if for each clause, each variable occurring in the left-hand side also occurs in the right-hand side and vice versa.
• An RCG is called simple if it is non-combinatorial, linear and non-erasing.
• A simple RCG is called ordered simple if the range variables are ordered the same way in the RHS and the LHS predicates. Ordered simple RCG is equivalent to simple RCG.

RCG parsing: Range vectors
We will use range vectors similar to those used for MCFG parsing. Range vectors are used to describe variable bindings.
• $\phi = (\langle x_1, y_1 \rangle, \ldots, \langle x_k, y_k \rangle)$ is a range vector in $w$ if all $\langle x_i, y_i \rangle$ are ranges in $w$ for $1 \leq i \leq k$.
• $\phi = (\langle x_1, y_1 \rangle, \ldots, \langle x_k, y_k \rangle)$ is a range constraint vector if it contains pairs $\langle x, y \rangle$ where $x, y \in Pos(w) \cup V_r$ ($V_r$ is a set $\{r_1, r_2, \ldots\}$ of range boundary variables) such that if $\langle x, y \rangle \in Pos(w)^2$, then it is a range.
• $k$ is called the dimension of $\phi$.
• $\phi(i).l$ denotes the first component and $\phi(i).r$ the second component of the $i$th element of $\phi$.
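One possible encoding of these definitions in Python is sketched below. It assumes that positions in $Pos(w)$ are integers from 0 to $|w|$, that range boundary variables are strings such as `"r1"`, and that a vector is a tuple of pairs. All function names are hypothetical.

```python
def is_range_vector(phi, n):
    """True if every element of phi is a range <l, r> in w, i.e. 0 <= l <= r <= n."""
    return all(
        isinstance(l, int) and isinstance(r, int) and 0 <= l <= r <= n
        for l, r in phi
    )


def is_range_constraint_vector(phi, n):
    """True if all components are positions or boundary variables, and every
    pair that consists of two positions is a range."""
    for l, r in phi:
        for x in (l, r):
            if isinstance(x, int):
                if not 0 <= x <= n:  # not a position of w
                    return False
            elif not isinstance(x, str):  # neither position nor variable
                return False
        if isinstance(l, int) and isinstance(r, int) and l > r:
            return False
    return True


def left(phi, i):
    """phi(i).l, using the 1-based index i of the text."""
    return phi[i - 1][0]


def right(phi, i):
    """phi(i).r, using the 1-based index i of the text."""
    return phi[i - 1][1]
```

Under this encoding every range vector is also a range constraint vector, while a vector such as `((0, "r1"), ("r1", "r2"))` is only the latter, because some boundaries are still unspecified.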
RCG parsing: Variable constraint vectors
The variable constraint vector $\phi$ of a non-$\epsilon$ clause $A(\vec{x}) \to \Phi$ is a range constraint vector of dimension $j$, $j$ being the highest variable index in the clause. It contains only pairs from $V_r \times V_r$ and must be consistent with the variable adjacencies in the clause.
Formally, the elements of $\phi$ are pairs from $V_r \times V_r$ such that $\phi(h).r = \phi(i).l$ iff $X_h X_i$ occurs as a substring in one of the arguments of the clause.

Update of range vectors
We define an update $\phi'$ of a range constraint vector $\phi$ with respect to an identity $x = y$, $x, y \in Pos(w) \cup V_r$, as follows:
• if $x = y$, then $\phi' = \phi$;
• else if $x \in V_r$ and the result $\psi$ of replacing all occurrences of $x$ in $\phi$ with $y$ is a range constraint vector, then $\phi' = \psi$;
• else if $y \in V_r$ and the result $\psi$ of replacing all occurrences of $y$ in $\phi$ with $x$ is a range constraint vector, then $\phi' = \psi$;
• otherwise, $\phi'$ is undefined.

Directional top-down parsing
Corresponds to the algorithm presented in Boullier (2000).
Item form:
• Active items: $[A(\vec{x}) \to \Phi \bullet \Psi, \phi]$
• Passive items: $[A, \psi, flag]$
where
• $\phi$ is a range vector of dimension $j$, $j$ being the highest variable index in the clause,
• $\psi$ is a range vector of dimension $k$, $k$ being the arity of $A$,
• $flag \in \{p, c\}$ indicates whether a passive item is predicted or completed.

Directional top-down parsing (axiom and goal)
• Axiom: $[S, (\langle 0, n \rangle), p]$
• The goal item is $[S, (\langle 0, n \rangle), c]$.
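The update of a range constraint vector with respect to an identity $x = y$, as defined above, can be sketched in Python as follows. The encoding is an assumption made for illustration: positions are integers between 0 and $n = |w|$, range boundary variables are strings, a vector is a tuple of pairs, and an undefined update is represented by `None`. The function names are hypothetical.

```python
def is_range_constraint_vector(phi, n):
    """Positions must lie in 0..n; a pair of two positions must be a range."""
    for l, r in phi:
        for x in (l, r):
            if isinstance(x, int) and not 0 <= x <= n:
                return False
        if isinstance(l, int) and isinstance(r, int) and l > r:
            return False
    return True


def update(phi, x, y, n):
    """Update phi with respect to the identity x = y; None means 'undefined'."""
    if x == y:
        return phi
    # First try to replace x by y (if x is a variable), then y by x.
    for old, new in ((x, y), (y, x)):
        if isinstance(old, str):  # old is a range boundary variable
            psi = tuple(
                (new if l == old else l, new if r == old else r) for l, r in phi
            )
            if is_range_constraint_vector(psi, n):
                return psi
    return None
```

As a usage example, take the vector `(("r1", "r2"), ("r2", "r3"))`, in which two variables are adjacent, and $n = 5$. Updating with `r1 = 0` and then `r2 = 2` gives `((0, 2), (2, "r3"))`. A further update with `r3 = 1` is undefined, since $\langle 2, 1 \rangle$ is not a range.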
Directional top-down parsing (predict-rule)
We have two predict operations.
Predict-rule predicts an active item for a previously introduced passive item.
  $\dfrac{[A, \psi, p]}{[A(\vec{x}) \to \bullet \Psi, \phi]}$
where the variable bindings in $\phi$ applied to $\vec{x}$ yield $\psi$. Furthermore, $\phi$ respects the adjacency constraints imposed by the variable constraint vector of the clause.

Directional top-down parsing (predict-pred)
Predict-pred predicts a passive item.
  $\dfrac{[A(\ldots) \to \Phi \bullet B(\vec{x}) \Psi, \phi]}{[B, \psi, p]}$
where $\psi$ results from applying $\phi$ to $\vec{x}$.

Directional top-down parsing (scan)
Scan:
  $\dfrac{[A, (\langle l, r \rangle), p]}{[A, (\langle l, r \rangle), c]}$ with $A(x) \to \epsilon$, $\langle l, r \rangle(w) = x$

Directional top-down parsing (complete)
Complete moves the dot over a predicate in the RHS of an active item if the corresponding passive item has been completed.
  $\dfrac{[B, \phi_B, c], \quad [A(\ldots) \to \Phi \bullet B(\vec{x}) \Psi, \phi]}{[A(\ldots) \to \Phi B(\vec{x}) \bullet \Psi, \phi]}$
where $\phi_B$ must be the result of applying $\phi$ to $\vec{x}$.
Directional top-down parsing (convert)
Once the dot has arrived at the right end of the RHS of a clause, we can convert the active item to a passive item.
Convert:
  $\dfrac{[A(\vec{x}) \to \Phi \bullet, \phi]}{[A, \psi, c]}$
where $\psi$ results from applying $\phi$ to $\vec{x}$.

Earley-style parsing
Presented in Kallmeyer & Maier (2009) (in preparation).
Item form:
• Active items: $[A(\vec{x}) \to \Phi \bullet \Psi, \phi]$
• Passive items: $[A, \psi, flag]$
where
• $\phi$ is a range constraint vector of dimension $j$, $j$ being the highest variable index in the clause,
• $\psi$ is a range constraint vector of dimension $k$, $k$ being the arity of $A$,
• $flag \in \{p, c\}$ indicates whether a passive item is predicted or completed.

Earley-style parsing (initialization and goal)
• Initialize: $[S, (\langle 0, n \rangle), p]$
• The goal item is $[S, (\langle 0, n \rangle), c]$.

Earley-style parsing (predict)
We have two predict operations.
As for the top-down case, predict-rule predicts active items with the dot on the left of the RHS, for a given previously introduced passive item.
  $\dfrac{[A, \psi, p]}{[A(x_1 \ldots y_1, \ldots, x_k \ldots y_k) \to \bullet \Psi, \phi']}$
where, starting from the variable constraint vector $\phi$ of the clause, we obtain $\phi'$ by updating with the following identities: $\phi(x_i).l = \psi(i).l$, $\phi(y_i).r = \psi(i).r$ for all $1 \leq i \leq k$.
Note the difference to the top-down case: we are now dealing with range constraint vectors, i.e., some variable boundaries remain unspecified.
Earley-style parsing (predict-pred)
Also as for the top-down case, predict-pred predicts a passive item for the predicate following the dot in an active item.
  $\dfrac{[A(\ldots) \to \Phi \bullet B(x_1 \ldots y_1, \ldots, x_k \ldots y_k) \Psi, \phi]}{[B, \psi, p]}$
where $\psi(i).l = \phi(x_i).l$, $\psi(i).r = \phi(y_i).r$ for all $1 \leq i \leq k$.

Earley-style parsing (scan)
Scan:
  $\dfrac{[A, (\langle l, r \rangle), p]}{[A, (\langle l', r' \rangle), c]}$ with $A(x) \to \epsilon$, $\langle l', r' \rangle(w) = x$, and $\langle l, r \rangle$ compatible with $\langle l', r' \rangle$
• Scan reduces a single terminal to $\epsilon$ (recall the treatment of terminals).
• Here, "compatible with" means that there is a function $f : \{l, r\} \to \{l', r'\}$ such that $f(l) = l'$, $f(r) = r'$ and $f(x) = x$ if $x \in Pos(w)$.

Earley-style parsing (complete)
Complete moves the dot over a predicate in the RHS of an active item if the corresponding passive item has been completed.
  $\dfrac{[B, \phi_B, c], \quad [A(\vec{x}) \to \Phi \bullet B(x_1 \ldots y_1, \ldots, x_k \ldots y_k) \Psi, \phi]}{[A(\vec{x}) \to \Phi B(x_1 \ldots y_1, \ldots, x_k \ldots y_k) \bullet \Psi, \phi']}$
where $\phi'$ is $\phi$ updated with all new constraint information coming from $\phi_B$, i.e., $\phi'$ is an update of $\phi$ with respect to the identities $\phi(x_j).l = \phi_B(j).l$ and $\phi(y_j).r = \phi_B(j).r$ for all $1 \leq j \leq k$.

Earley-style parsing (convert)
Convert turns an active item with the dot at the end of the right-hand side into a completed passive item.
  $\dfrac{[A(x_1 \ldots y_1, \ldots, x_k \ldots y_k) \to \Psi \bullet, \phi]}{[A, \psi, c]}$
where $\psi(i).l = \phi(x_i).l$ and $\psi(i).r = \phi(y_i).r$ for all $1 \leq i \leq k$.
RCG as a tool
One can use RCG as an intermediary device, i.e., as a pivot formalism. We will see two applications:
• RCG for TAG parsing
• RCG for Syntax-Directed Machine Translation

TAG → simple RCG
A TAG can straightforwardly be converted into an RCG.
• Introduce a predicate for each elementary tree.
• A predicate corresponding to
  – an auxiliary tree $\beta$ has the form $\beta(L, R)$, where $L$ and $R$ cover the yield of $\beta$ to the left and the right of the foot node, including all material added to it,
  – an initial tree $\alpha$ has the form $\alpha(X)$, with $X$ covering the yield of $\alpha$ and all trees added to it.
• A predicate $\alpha$/$\beta$ reduces the input by determining which parts of the string come from $\alpha$/$\beta$, respectively, and which parts come from substituted/adjoined trees.

TAG → simple RCG: Example
[Figure: elementary trees of the TAG: an initial tree $\alpha$ ($S$ over $\epsilon$) and two auxiliary trees $\beta_1$, $\beta_2$ with nodes $S_{NA}$, $S$, foot node $S^*_{NA}$ and terminals $a$ (in $\beta_1$) and $b$ (in $\beta_2$).]
Start predicate: $S(X) \to \alpha(X)$
  $\alpha(\epsilon) \to \epsilon$
  $\alpha(B_1 B_2) \to \beta_1(B_1, B_2) \mid \beta_2(B_1, B_2)$
  $\beta_1(a B_1, a B_2) \to \beta_1(B_1, B_2) \mid \beta_2(B_1, B_2)$
  $\beta_2(b B_1, b B_2) \to \beta_1(B_1, B_2) \mid \beta_2(B_1, B_2)$
  $\beta_1(a, a) \to \epsilon$
  $\beta_2(b, b) \to \epsilon$

RCG for MT
• Binary 2-RCG can be used for efficient syntax-based machine translation.
• Intuitively, the first argument of a clause specifies the source language input, while the parser determines the destination language via string variables, i.e., variables in the parser input that are instantiated by lexical items in parsing.
• Main advantage over previous systems based on synchronous versions of CFG/TAG/etc.: higher expressivity through the availability of copying/deleting while still in the same complexity class ($O(n^6)$).
• Refer to Søgaard (2008) (COLING 2008) for a complete presentation.
RCG for MT: Example (1)
Example grammar:
  $S(X_1 X_2, Y_1 Y_2) \to NP(X_1, Y_1)\,VP(X_2, Y_2)$
  $VP(X_1 X_2, Y_1 Y_2 Y_3) \to V(X_1, Y_1)\,ObjP(X_2, Y_2)\,Part(X_1, Y_3)$
  $VP(X_1, Y_1 Y_2) \to V(X_1, Y_1)\,Part(X_1, Y_2)$
  $ObjP(X_1, Y_1 Y_2) \to NP(X_1, Y_2)\,Prep(X_1, Y_1)$
  $NP(X_1 X_2, Y_1 Y_2) \to Art(X_1, Y_1)\,N(X_2, Y_2)$
  $NP(\textit{he}, \textit{er}) \to \epsilon$
  $V(\textit{entered}, \textit{trat}) \to \epsilon$
  $Part(\textit{entered}, \textit{ein}) \to \epsilon$
  $Prep(\textit{the room}, \textit{in}) \to \epsilon$
  $Art(\textit{the}, \textit{das}) \to \epsilon$
  $N(\textit{room}, \textit{Zimmer}) \to \epsilon$

RCG for MT: Example (2)
• Call the parser with the input string $w$ = He entered S, where S is a string variable, and the start predicate $S(X_1 X_2, Y_1 Y_2)$.
• The algorithm should infer that S = $Y_1 Y_2$ = trat ein in order to reduce $X_1 X_2$ to $\epsilon$.
Example derivation:
[Figure: example derivation relating the source words he, entered, the, room (the room) to the target words er, trat, in, das, Zimmer, ein.]

Conclusions
• Range concatenation languages coincide with the class of PTIME recognizable languages.
• We have seen a top-down algorithm and an Earley-style algorithm.
• Other parsing strategies are possible (cf. Kallmeyer & Maier (2009)).
• Range concatenation grammars are used as an intermediary formalism in different applications.