Three Algorithms for Deterministic Dependency Parsing

Joakim Nivre and Jens Nilsson

Växjö University, School of Mathematics and Systems Engineering
nivre@msi.vxu.se

Abstract

We present three algorithms for deterministic dependency parsing and evaluate their accuracy in parsing unrestricted Swedish text. With very simple grammars, two of the algorithms achieve precision and recall near 85%, which compares favorably with results published previously. We conclude that the algorithms merit further attention and suggest several ways in which parsing accuracy can be improved in the future.

1 Introduction

Dependency-based models have come to play an increasingly important role in the field of natural language parsing in recent years. First of all, dependency relations have proven very useful for statistical disambiguation in more traditional constituency-based parsing (Collins, 1999; Charniak, 2000). In addition, there seems to be a growing interest in parsing systems that produce dependency structures rather than traditional phrase structure representations (Eisner, 1996a; Tapanainen and Järvinen, 1997; Barbero et al., 1998; Arnola, 1998; Menzel and Schröder, 1998; Courtin and Genthial, 1998; Duchier, 1999; Samuelsson, 2000). Many of the systems developed for dependency parsing use algorithms that are straightforward modifications of the algorithms developed for phrase structure grammar (Eisner, 1996a; Barbero et al., 1998; Courtin and Genthial, 1998; Samuelsson, 2000). Faced with massive ambiguity and nondeterminism, these algorithms rely on dynamic programming and tabulation to derive compact representations of multiple analyses with reasonable efficiency, but there is no attempt to resolve ambiguities or remove nondeterminism. Even when these algorithms are combined with statistical disambiguation, the basic parsing strategy is usually the same, with the statistical model being used to sort candidate analyses in order of decreasing probability, or possibly to prune low-probability branches from the search tree.

It can be argued that in order to bring out the full potential of dependency grammar as a framework for natural language parsing, we also need to explore alternative parsing algorithms. In this vein, Covington (1990) has proposed an algorithm that is better suited for languages with free word order, which are often said to be more naturally described with dependency grammars. Other researchers have argued that dependency parsing should be cast as a constraint satisfaction problem and solved using constraint programming (Maruyama, 1990; Menzel and Schröder, 1998; Duchier, 1999). Here we will instead follow the line of Arnola (1998) and investigate deterministic algorithms for dependency parsing.

In the past, deterministic approaches to parsing have often been motivated by psycholinguistic concerns, as in the famous Parsifal system (Marcus, 1980). However, deterministic parsing also has the more direct advantage of providing efficient disambiguation. If the disambiguation can be performed with high accuracy and robustness, deterministic parsing becomes an interesting alternative to more traditional algorithms for natural language parsing. Note that accuracy does not have to be perfect in order for this to be true. First of all, there are potential applications of parsing, for example in information retrieval, where it may be sufficient if lexical dependency relations can be identified with good enough precision and recall, even if the algorithm does not always find the correct complete dependency structure for a sentence. Secondly, it may be possible to improve parse accuracy by adding customized post-processing in order to correct typical errors introduced by the parser. In this way, deterministic dependency parsing can be viewed as an interesting compromise between so-called deep and shallow processing. It is a kind of deep processing in that the goal is to build a complete syntactic analysis for the input string, not just identify basic constituents as in partial parsing. But it resembles shallow processing in being robust, efficient and deterministic.

In this paper we present three different algorithms for deterministic dependency parsing and provide an experimental evaluation of their accuracy with respect to the problem of parsing unrestricted Swedish text. In section 2 we give a brief introduction to dependency grammar and the kind of grammar rules required by the algorithms, and in section 3 we present the three algorithms. The main results of the paper are found in section 4, where we report an experimental evaluation of the three algorithms using test data from the Stockholm-Umeå Corpus of written Swedish (SUC, 1997), while conclusions are stated in section 5.

2 Dependency Grammar

By and large, dependency grammar is a rather vague concept which is probably best understood as an umbrella term covering a large family of grammatical theories and formalisms that share certain basic assumptions about grammatical structure (Tesnière, 1959; Sgall et al., 1986; Mel'čuk, 1988; Hudson, 1990). Foremost among these is the assumption that syntactic structure consists of lexical nodes linked by binary relations called dependencies. Thus, the common formal property of dependency structures, as compared to the more common syntactic representations based on constituency (or phrase structure), is the lack of phrasal nodes. In a dependency structure, every lexical node is dependent on at most one other lexical node, usually called its head or regent, which means that the structure can be represented as a directed graph, with nodes representing lexical elements and edges representing dependency relations. Normally we also require that the graph is connected and acyclic, which means that it will in fact be a rooted tree with the root node representing the head of the sentence. Figure 1 illustrates the difference between a phrase structure tree (top) and a dependency graph (bottom) for the simple English sentence she bought a car.

Given these basic assumptions about dependency structure, there are a number of parameters that can vary between different dependency grammars. For example, lexical nodes can be assumed to represent words, strings of words or even parts of words. Dependency relations can be labeled with syntactic functions, semantic roles, or not at all. Another set of issues concerns the relation between dependency structure and word order (or surface realization generally). In many versions of dependency grammar, including the influential work of Tesnière (1959), dependency structure is assumed to be independent of surface realization, which means that there is no linear order imposed on the nodes of the dependency graph. In other versions, including most dependency-based approaches to natural language parsing, the nodes of the dependency graph are ordered by a linear precedence relation representing the word order of the sentence. We will say that the latter kind of dependency graph is linear and the former non-linear. By extension, we will also talk about linear and non-linear dependency grammars.

The distinction between linear and non-linear dependency grammars is related to, but distinct from, the more well-known distinction between projective and non-projective dependency grammars. Projectivity (sometimes called planarity) is the requirement that, in surface structure, a head and its dependent can only be separated by other dependents of the same head (and dependents of these dependents, etc.). In a linear dependency grammar, this amounts to the requirement that there are no crossing edges (which is the motivation for the term planarity).

If there are many open issues with respect to the representation of dependency structure, there is even more diversity when it comes to defining dependency grammars as formal systems. Starting with the work of Hays (1964) and Gaifman (1965), there are many different proposals to be found in the literature, some of which define dependency grammars as very similar to (or even as a special case of) context-free phrase structure grammars, others being more or less radically different (Barbero et al., 1998; Menzel and Schröder, 1998; Eisner, 2000; Samuelsson, 2000; Duchier, 2001; Gerdes and Kahane, 2001; Kruijff, 2001).
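
Since the no-crossing-edges formulation of projectivity recurs in section 3, it may help to state it operationally. The following Python sketch is our illustration only (the representation and names are ours, not part of any cited formalism); it tests the condition for a linear dependency graph given as a set of (head, dependent) position pairs:

    def is_projective(edges):
        """Check that no two edges of a linear dependency graph cross.

        edges -- a set of (head, dependent) pairs of word positions.
        Two edges cross iff exactly one endpoint of one edge lies
        strictly inside the span of the other.
        """
        spans = [(min(h, d), max(h, d)) for h, d in edges]
        for a, (lo1, hi1) in enumerate(spans):
            for lo2, hi2 in spans[a + 1:]:
                if lo1 < lo2 < hi1 < hi2 or lo2 < lo1 < hi2 < hi1:
                    return False
        return True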

[Figure 1: Phrase structure tree (top) and dependency graph (bottom) for the sentence she bought a car; the dependency edges are labeled SUBJ, DET and OBJ.]

In the present paper, we will make very weak assumptions about the information available in a dependency grammar, roughly along the lines of Covington (1990). Let W be a vocabulary (a set of word forms) and let R be a set of dependency relations. Given two words wi, wj ∈ W and a dependency relation r ∈ R, a grammar G should minimally provide answers to the following two questions (made concrete in the sketch below):

1. Can wi be a left dependent of wj with dependency relation r?
2. Can wj be a right dependent of wi with dependency relation r?
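
To make this interface concrete, a grammar can be represented simply as two tables of permissible pairs. The following is a minimal sketch of our own (class and method names are ours, not the paper's), anticipating the simplification to unlabeled relations made just below:

    class DependencyGrammar:
        """A grammar as two yes/no relations over word (or tag) pairs."""

        def __init__(self, left_rules, right_rules):
            # left_rules:  pairs (wi, wj) such that wi <- wj is a rule,
            #              i.e. wi may be a left dependent of head wj
            # right_rules: pairs (wi, wj) such that wi -> wj is a rule,
            #              i.e. wj may be a right dependent of head wi
            self.left_rules = set(left_rules)
            self.right_rules = set(right_rules)

        def can_left(self, head, dep):
            # May dep be a left dependent of head?
            return (dep, head) in self.left_rules

        def can_right(self, head, dep):
            # May dep be a right dependent of head?
            return (head, dep) in self.right_rules

With this convention, the Link operation introduced in section 3 only needs the two predicates can_left and can_right.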

If one or both of these questions are answered in the affirmative, we say that G contains the rule wi ←r wj and/or the rule wi →r wj. In this paper, we will generally assume that R is a singleton set, which is equivalent to the assumption that dependency relations are unlabeled. Consequently, we will often simplify the notation and write rules as wi ← wj and wi → wj.

3 Three Algorithms

The algorithms defined in this section take as input a string of words w1, ..., wn in the vocabulary W of a dependency grammar G and build a linear dependency graph with nodes w1, ..., wn by adding edges (wi, wj) that are compatible with the constraints of G. In the interest of robustness, we do not require that the output be a rooted tree, nor do we impose the projectivity constraint, although the desired output for a given input will in most cases satisfy at least the first of these requirements. The algorithms are deterministic in the sense that once an edge has been added to the dependency graph it can never be removed and will therefore block the addition of other possible edges, given the constraint that each word can have at most one head. Given that syntactic relations tend to be local, all three algorithms have a preference for closer links over more distant ones. But they differ in the way that this preference is balanced against other constraints.

3.1 The Basic Algorithm

The basic algorithm constructs a dependency graph by linking each word to its closest possible regent (proceeding left-to-right through the input):

    for width k = 1 to n-1 do
        for position i = 1 to n-k do
            Link(wi, wi+k)

The operation Link(wi, wj) is defined as:

    if wi has no head and wi ← wj then
        add the edge (wj, wi)
    else if wj has no head and wi → wj then
        add the edge (wi, wj)

The basic algorithm runs through the input words from left to right n-1 times (where n is the number of words in the input), considering possible links of length k during iteration k, which gives a running time that is quadratic in the length of the input.
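
As a concrete rendering of the pseudocode, here is a minimal Python sketch of the basic algorithm (ours, using the grammar interface sketched in section 2; it is meant to mirror the pseudocode, not to reproduce the authors' implementation):

    def parse_basic(words, can_left, can_right):
        """Link each word to its closest possible regent, left to right.

        words     -- tokens w1..wn (0-indexed here)
        can_left  -- can_left(head, dep): may dep be a left dependent?
        can_right -- can_right(head, dep): may dep be a right dependent?
        Returns the set of edges (head, dependent) as word positions.
        """
        n = len(words)
        head = [None] * n      # head[i] = position of the head of wi, if any
        edges = set()

        def link(i, j):        # the Link operation, for i < j
            if head[i] is None and can_left(words[j], words[i]):
                head[i] = j    # wj becomes the head of wi
                edges.add((j, i))
            elif head[j] is None and can_right(words[i], words[j]):
                head[j] = i    # wi becomes the head of wj
                edges.add((i, j))

        for k in range(1, n):          # link width, closest links first
            for i in range(n - k):
                link(i, i + k)
        return edges

Since each of the O(n²) calls to link does constant work here, the running time is quadratic, as stated above.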

3.2 The Incremental Algorithm

While the basic algorithm links each word to its closest possible regent, the incremental algorithm is instead biased towards linking each new word to a previous word, with only a secondary preference for closer links over more distant ones:

    for position j = 2 to n do
        for position i = j-1 to 1 do
            Link(wi, wj)

The incremental algorithm only runs through the input from left to right once, but for each new word it considers possible links to all preceding words (from right to left). The resulting time complexity is therefore the same as for the basic algorithm.
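
The corresponding sketch for the incremental algorithm (again ours, under the same assumptions as above) differs only in the loop structure: each new word is offered links to all preceding words, closest first:

    def parse_incremental(words, can_left, can_right):
        """Process words left to right; for each new word wj, try to
        link it to the preceding words wj-1, ..., w1 (closest first),
        using the same Link logic as the basic algorithm."""
        n = len(words)
        head = [None] * n
        edges = set()
        for j in range(1, n):               # each new word wj
            for i in range(j - 1, -1, -1):  # preceding words, right to left
                if head[i] is None and can_left(words[j], words[i]):
                    head[i] = j
                    edges.add((j, i))
                elif head[j] is None and can_right(words[i], words[j]):
                    head[j] = i
                    edges.add((i, j))
        return edges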

3.3 The Projective Algorithm

Without further constraints, the first two algorithms are not guaranteed to produce connected and acyclic dependency graphs, let alone projective ones. The projective algorithm is an extension of the basic algorithm that eliminates the last two problems, i.e. the resulting graph is guaranteed to be acyclic and projective (but not necessarily connected). Let us first define a notion of accessibility for graph nodes:

    A node wj is accessible iff there is no edge (wi, wk) such that i < j < k.

Given the notion of accessibility, we can define the projective algorithm simply by defining a more complex operation Link(wi, wj):

    if wi has no head and wi ← wj
            and wi and wj are accessible
            and there is no path from wi to wj then
        add the edge (wj, wi)
    else if wj has no head and wi → wj
            and wi and wj are accessible
            and there is no path from wj to wi then
        add the edge (wi, wj)

There are a number of equivalent ways of implementing the more complex Link operation, but they all have a worst-case running time that is linear in the number of nodes, which means that the complexity of the projective algorithm is O(n³).
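
As a concrete, deliberately naive sketch (ours; helper names are our own), the following Python version implements the accessibility and path conditions literally rather than with the linear-time bookkeeping just mentioned, so it is easy to check against the definitions:

    def parse_projective(words, can_left, can_right):
        """Basic closest-link loop with the projective Link operation."""
        n = len(words)
        head = [None] * n
        edges = set()

        def accessible(j):
            # wj is accessible iff no edge (wi, wk) satisfies i < j < k.
            return not any(min(h, d) < j < max(h, d) for h, d in edges)

        def reachable(src, dst):
            # Directed path src ->* dst along head-to-dependent edges?
            stack, seen = [src], set()
            while stack:
                node = stack.pop()
                if node == dst:
                    return True
                if node not in seen:
                    seen.add(node)
                    stack.extend(d for h, d in edges if h == node)
            return False

        def link(i, j):  # i < j
            if (head[i] is None and can_left(words[j], words[i])
                    and accessible(i) and accessible(j)
                    and not reachable(i, j)):
                head[i] = j
                edges.add((j, i))
            elif (head[j] is None and can_right(words[i], words[j])
                    and accessible(j) and accessible(i)
                    and not reachable(j, i)):
                head[j] = i
                edges.add((i, j))

        for k in range(1, n):
            for i in range(n - k):
                link(i, i + k)
        return edges

The path check blocks any edge that would close a cycle, and the accessibility check blocks any edge that would cross an existing one, which together give acyclicity and projectivity.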

4 Parsing Swedish Text

In order to estimate the accuracy that can be expected from the three algorithms described in the preceding section, we have performed a series of experiments with parsing sentences from the Stockholm-Umeå Corpus of written Swedish (SUC, 1997), which is a balanced corpus consisting of fictional and non-fictional texts, organized in the same way as the Brown corpus of American English. In this section, we describe our methodology (section 4.1), report the most important results (section 4.2), and discuss the implications of these experiments (section 4.3).

4.1 Method

The Stockholm-Umeå Corpus is annotated for parts of speech (and manually corrected), and in these experiments we used the tag sequences as input to the parser. This means that our grammars were not lexicalized and could only express dependency constraints at the part of speech level. Moreover, we eliminated most morphosyntactic features, so that the tagset used contained only 27 of the 156 available distinct tags.

Two independent data sets were sampled, each consisting of roughly 2000 words, corresponding to 115 and 142 sentences, respectively. Both samples were manually annotated with dependency graphs by one of the authors. The first data set was used for validation and optimization of grammars; the second was reserved for the final evaluation.

When annotating the sentences, major delimiters such as colon and semicolon, as well as all kinds of parentheses and brackets, were treated as barriers for dependency relations. This means that the strings occurring on each side of such a delimiter were treated as separate parse units even if they were not, strictly speaking, separate sentences. It is also worth mentioning some of the principles used in choosing between alternative structural analyses:

- Syntactic dependencies are preferred over semantic ones (Mel'čuk, 1988, 105-128), meaning among other things that:
  1. Nouns depend on prepositions.
  2. Finite verbs depend on subjunctions and fronted Wh-words.
  3. Main verbs depend on auxiliary verbs.

- Coordinated items are treated as multiple dependents of their mutual head (if any), while the coordinating conjunctions are left unattached.

- Multi-word proper names are treated as coordinated items.

- Nominal appositive constructions are consistently analyzed as left-headed.

Several of these principles are illustrated by the dependency graph in Figure 2. The complete annotated data sets are available from the authors on request.

The initial grammar used in the experiment was hand-crafted and contained a total of 139 rules, divided into 100 left-headed rules (of the form wi → wj) and 39 right-headed rules (of the form wi ← wj). This grammar was then optimized with respect to the validation data set, by iteratively removing low-precision rules until no further improvement was possible. The grammar was optimized separately for each of the three algorithms.

Precision and recall were calculated per sentence by comparing the dependency graphs built by the parsers to the manually annotated gold standard:

    Precision = |Correct edges in parse| / |Edges in parse|
    Recall    = |Correct edges in parse| / |Edges in gold standard|
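
In code, the two measures are simple set operations over edges. The following sketch is ours and makes explicit that an edge counts as correct iff it occurs in the gold standard; the convention for empty edge sets is our own assumption:

    def precision_recall(parsed, gold):
        """Per-sentence precision and recall over unlabeled edges.

        parsed, gold -- sets of (head, dependent) position pairs.
        """
        correct = len(parsed & gold)
        precision = correct / len(parsed) if parsed else 0.0
        recall = correct / len(gold) if gold else 0.0
        return precision, recall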

The overall precision and recall were then calculated as the mean precision and recall over sentences consisting of at least two words. As a baseline for comparison we included the dependency graphs obtained by letting each word in a sentence be the head of the immediately following word (left-headed dependencies being more common than right-headed dependencies in Swedish). The baseline obtained in this way was 30.4% precision and 33.9% recall on the validation data set, and 31.8% precision and 34.6% recall on the test data set.
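
Using precision_recall from the sketch above, the overall scores and the baseline can be stated in a few lines; this too is our illustration, since the paper does not specify these details in code:

    def baseline_edges(n):
        # Each word heads the immediately following word: edges (i, i+1).
        return {(i, i + 1) for i in range(n - 1)}

    def mean_scores(sentences, parse):
        """Mean per-sentence precision and recall over sentences of at
        least two words; each sentence is a pair (tokens, gold_edges)."""
        scores = [precision_recall(parse(tokens), gold)
                  for tokens, gold in sentences if len(tokens) >= 2]
        if not scores:
            return 0.0, 0.0
        precision = sum(p for p, _ in scores) / len(scores)
        recall = sum(r for _, r in scores) / len(scores)
        return precision, recall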

4.2 Results

Table 1 shows the precision and recall obtained for each of the three algorithms on the validation data set (V) and the final test set (T). The grammar G0 is the initial hand-crafted grammar, while the grammar Gm is the grammar that is produced by subtracting the m rules with lowest precision from G0, measured over the validation data set V. For the basic algorithm, maximum precision and recall were achieved with m = 14, while the optimal grammars for the incremental and projective algorithms were obtained with m = 24 and m = 12, respectively.

As can be seen from Table 1, the best scores were obtained with the basic algorithm and the projective algorithm, while the incremental algorithm performs consistently worse, both with respect to precision and recall and both before and after optimization. These differences are statistically significant beyond the .01 level (paired t-test). Comparing the basic algorithm with the projective algorithm, we see that the former appears to favor recall while the latter favors precision. Thus, the basic algorithm generally has better recall than precision, while the projective algorithm exhibits the opposite tendency. Moreover, whereas the basic algorithm achieves higher recall than the projective algorithm at all data points, the projective algorithm consistently has higher precision than the basic algorithm. However, only the differences in precision turn out to be statistically significant (paired t-test, α = .01).

The effect of grammar optimization can be seen as an increase in both precision and recall of about 2 percentage points for the validation data set V. As expected, the improvement is generally smaller for the independent test data set T, and only the gain in precision is statistically significant for all algorithms, which is only natural given that low precision is the criterion for subtracting a rule from the grammar.

[Figure 2: Annotated sentence example. The dependency graph annotates the tag sequence PP NN VB PN VS JJ KN PC NN HP VB PM PM for the sentence: På 60-talet hade han målat djärva och utmanande tavlor som retade Nikita Chrusjtjov. (In the-60's had he painted bold and daring pictures which annoyed Nikita Chrustjev.)]

    Algorithm            |  Validation (V)        |  Test (T)
                         |   G0         Gm        |   G0         Gm
                         |  P    R     P    R     |  P    R     P    R
    Baseline             | 30.4 33.9  30.4 33.9   | 31.8 34.6  31.8 34.6
    Basic (m = 14)       | 84.4 84.7  86.2 86.5   | 82.8 82.9  84.2 83.9
    Incremental (m = 24) | 75.4 75.7  81.2 78.2   | 75.2 75.3  81.4 78.3
    Projective (m = 12)  | 84.9 84.4  86.8 85.9   | 84.3 82.7  85.5 83.3

    Table 1: Precision (P) and recall (R) for parsing Swedish text

4.3 Discussion

The most important result of the empirical evaluation is that both the basic algorithm and the projective algorithm can achieve a high accuracy, measured in terms of precision and recall, even with the very simple grammars used so far. The incremental algorithm performs consistently worse, which seems to indicate that a preference for close links, which is common to the other two algorithms, is more useful than the strategy of linking to previous material, at least when other constraints on possible links are very weak and linking is deterministic without any kind of repair mechanism. In addition, the algorithms are robust in the sense that they output a dependency graph for every input sentence, although the graph is not guaranteed to be connected. For certain applications of dependency parsing, such as the detection of dependency relations for information retrieval in unrestricted text, the level of precision obtained here may well be sufficient to make the algorithms useful, given their efficiency and robustness. For other applications, it may be possible to improve the accuracy by specialized post-processing.

At the same time, it is difficult to say how good the results are, since there are no previously published results for dependency parsing of unrestricted Swedish text that can be used for direct comparison. In order to provide some perspective on the results, we will therefore relate them to three previous studies that are in different ways relevant to our concerns:

- Megyesi (2002) reports an accuracy of close to 95% for shallow parsing (chunking) of unrestricted Swedish text, tested on data from the Stockholm-Umeå Corpus. Just comparing raw figures, these results are obviously much better than the ones reported here, but dependency parsing is probably a more difficult task than chunking.

- Arnola (1998) reports an accuracy of around 85% for deterministic dependency parsing of Finnish, measuring accuracy as the proportion of sentences for which the parser produces a completely correct analysis, which is a much more conservative metric than the ones used in the present study. In addition, Arnola assumes labeled dependency relations, which should make the parsing problem more difficult. On the other hand, the evaluation is based on a test suite rather than on a statistical sample of unrestricted text, which again makes it hard to compare the results directly.

- Eisner (1996b) reports an accuracy of 90% for probabilistic dependency parsing of English text, sampled from the Wall Street Journal section of the Penn Treebank. Moreover, if the correct part of speech tags are given with the input, accuracy increases to almost 93%. Although the evaluation metric used by Eisner, called attachment score, is not exactly equivalent to the precision and recall measures used in the present study, it seems reasonable to view Eisner's results as indicative of the accuracy level that can be achieved in dependency parsing also for Swedish, and which we therefore set as a goal for the future.

Comparing the two best algorithms in more detail, we see that the basic algorithm favors recall over precision, while the opposite is true of the projective algorithm. However, while the end result is that recall is comparable for the two algorithms, precision is significantly higher for the projective algorithm. It thus seems that the constraint of projectivity, which has a rather controversial status in the theoretical literature on dependency grammar, is beneficial in deterministic parsing by filtering out more bad links than good ones. However, it should also be kept in mind that the difference in precision is not very large, and that the basic algorithm is both faster and easier to implement.

If we take a more detailed look at the kind of errors made by the parsers, restricting our attention to the two best algorithms, we can divide them into two broad categories, which are about equally important quantitatively. In the first category we find all the well-known problems having to do with attachment ambiguity for prepositional phrases and other adjuncts, where our algorithms always prefer the closest possible link. In the second category we find problems that seem more specific to our algorithms and the way in which they interact with the very simple form of grammars used, such as violations of valence or subcategorization constraints and linking across syntactic barriers. One important source of these errors is the occurrence of non-canonical syntactic forms, in particular elliptical constructions where syntactic heads are omitted.

There are several ways in which our grammars and algorithms can be extended in order to cope with these and other problems. First, grammar rules need to be lexicalized to a much higher degree, in the way that the best probabilistic parsers are lexicalized (Eisner, 1996b; Collins, 1999; Charniak, 2000). Secondly, we need to add some kind of valence and subcategorization constraints, as well as constraints on the context separating a head from its dependent(s). Thirdly, it seems that preprocessing of certain multi-word units, such as proper names and verb chains, may improve parsing accuracy. Finally, it is possible to add post-processing to correct special kinds of errors, as in the Link Grammar parsing system of Sleator and Temperley (1993).

Before concluding, we also want to point out some of the limitations of the present study. First of all, evaluation was performed on manually corrected part of speech sequences, which means that the parsing accuracy for arbitrary Swedish text is likely to be lower because of errors introduced in the part of speech tagging phase. Secondly, the dependency structures built by the parser, as well as the structures included in the manually constructed gold standard, have two important shortcomings: they are not labeled with syntactic functions, and they do not incorporate a fully adequate analysis of coordinate structures. However, we believe that both of these shortcomings can be overcome in the future.

5 Conclusion

We have presented three algorithms for deterministic dependency parsing and evaluated them with respect to the task of parsing unrestricted Swedish text. The main conclusion is that algorithms based on the closest-link strategy can achieve good parsing accuracy even with very weak grammatical constraints, while an incremental parsing strategy gives substantially lower accuracy in this setting. In addition, the algorithms are both efficient and robust. Adding a projectivity constraint to the closest-link strategy gives a significant improvement in precision, while increasing the running time of the algorithm from quadratic to cubic in the size of the input. We conjecture that accuracy can be further improved by incorporating more complex grammatical constraints.

References

H. Arnola. 1998. On Parsing Binary Dependency Structure Deterministically. In S. Kahane and A. Polguère (eds.) Workshop on Processing of Dependency-Based Grammars, COLING-ACL 1998, 68-77.

C. Barbero, L. Lesmo, V. Lombardo and P. Merlo. 1998. Integration of Syntactic and Lexical Information in a Hierarchical Dependency Grammar. In S. Kahane and A. Polguère (eds.) Workshop on Processing of Dependency-Based Grammars, COLING-ACL 1998, 78-87.

E. Charniak. 2000. A Maximum-Entropy-Inspired Parser. In NAACL-2000.

M. Collins. 1999. Head-Driven Statistical Models for Natural Language Parsing. PhD Thesis, University of Pennsylvania.

J. Courtin and D. Genthial. 1998. Parsing with Dependency Relations and Robust Parsing. In S. Kahane and A. Polguère (eds.) Workshop on Processing of Dependency-Based Grammars, COLING-ACL 1998, 95-101.

M. Covington. 1990. A Dependency Parser for Variable-Word-Order Languages. Technical Report AI-1990-01, University of Georgia.

D. Duchier. 1999. Axiomatizing Dependency Parsing Using Set Constraints. In Sixth Meeting on Mathematics of Language, Orlando, Florida, 115-126.

D. Duchier. 2001. Lexicalized Syntax and Topology for Non-projective Dependency Grammar. In Joint Conference on Formal Grammars and Mathematics of Language FGMOL'01, Helsinki, Finland.

J. M. Eisner. 1996a. Three New Probabilistic Models for Dependency Parsing: An Exploration. In COLING-96, Copenhagen.

J. M. Eisner. 1996b. An Empirical Comparison of Probability Models for Dependency Grammar. Technical Report IRCS-96-11, Institute for Research in Cognitive Science, University of Pennsylvania.

J. M. Eisner. 2000. Bilexical Grammars and Their Cubic-Time Parsing Algorithms. In H. Bunt and A. Nijholt (eds.) Advances in Probabilistic and Other Parsing Technologies, Kluwer.

H. Gaifman. 1965. Dependency Systems and Phrase-Structure Systems. Information and Control 8, 304-337.

K. Gerdes and S. Kahane. 2001. Word Order in German: A Formal Dependency Grammar Using a Topological Hierarchy. In ACL 2001, Toulouse, France.

D. G. Hays. 1964. Dependency Theory: A Formalism and Some Observations. Language 40, 511-525.

R. A. Hudson. 1990. English Word Grammar. Blackwell.

G-J. M. Kruijff. 2001. A Categorial-Modal Logical Architecture of Informativity: Dependency Grammar Logic and Information Structure. PhD Thesis, Charles University, Prague.

M. P. Marcus. 1980. A Theory of Syntactic Recognition for Natural Language. MIT Press.

H. Maruyama. 1990. Structural Disambiguation with Constraint Propagation. In ACL 1990, Pittsburgh, 31-38.

B. Megyesi. 2002. Shallow Parsing with PoS Taggers and Linguistic Features. Journal of Machine Learning Research 2, 639-668.

I. Mel'čuk. 1988. Dependency Syntax: Theory and Practice. State University of New York Press.

W. Menzel and I. Schröder. 1998. Decision Procedures for Dependency Parsing Using Graded Constraints. In S. Kahane and A. Polguère (eds.) Workshop on Processing of Dependency-Based Grammars, COLING-ACL 1998, 58-67.

C. Samuelsson. 2000. A Statistical Theory of Dependency Syntax. In COLING-2000.

P. Sgall, E. Hajičová and J. Panevová. 1986. The Meaning of the Sentence in Its Semantic and Pragmatic Aspects. Reidel.

D. Sleator and D. Temperley. 1993. Parsing English with a Link Grammar. In Third International Workshop on Parsing Technologies, Carnegie Mellon University.

SUC. 1997. Stockholm-Umeå Corpus. Version 1.0. Department of Linguistics, Umeå University and Department of Linguistics, Stockholm University.

P. Tapanainen and T. Järvinen. 1997. A Non-Projective Dependency Parser. In ANLP 1997.

L. Tesnière. 1959. Éléments de syntaxe structurale. Editions Klincksieck.
