A Parallel CKY Parsing Algorithm on Large-Scale Distributed-Memory ...

A Parallel CKY Parsing Algorithm on Large-Scale Distributed-Memory Parallel Machines

NINOMIYA Takashi, TORISAWA Kentaro, TAURA Kenjiro, TSUJII Jun'ichi
Department of Information Science, University of Tokyo
Hongo 7-3-1, Tokyo 113, Japan
fninomi,torisawa,tau,[email protected]

Abstract

This paper describes an efficient parallel CKY algorithm for CFG. We intend to obtain an efficient HPSG parsing algorithm by using this parallel CKY algorithm in Torisawa's HPSG parsing algorithm. Torisawa's parsing algorithm for HPSG consists of two phases. At Phase 1, a parser enumerates possible parse trees using CFG rules compiled from lexical entries in HPSG. At Phase 2, the parser solves the constraints that cannot be covered by the CFG. We realized a parallel parsing algorithm for Phase 1 on the massively parallel computer AP1000+ (256 SuperSPARC, 50 MHz) using the concurrent object-oriented programming language ABCL/f. The average parsing time for a corpus of 2,173 sentences (average length 41.43 words) was 120.6 msec. The speedup obtained with 256 nodes was about 45 times when the average length of the input sentences was 119.0 words.

1 Introduction

This paper proposes a parallel CFG parsing algorithm intended for practical use in terms of speed, data distribution and memory efficiency. Although many parallel CFG parsing algorithms exist, a parallel parser that can be used for parsing real-world texts has not yet been developed. We developed a parallel CFG parser for practical use and applied this algorithm to a parser using a more sophisticated grammar formalism, HPSG[1]. Recent Natural Language Processing (NLP) based on HPSG has attracted a great deal of researchers' attention[2], but only a few works go beyond theoretical speculation or experiments with a small grammar. We aim to construct a framework and an environment based on HPSG in order to develop several NLP techniques on them, including knowledge acquisition, machine translation and information extraction. To accomplish these aims, Torisawa developed an efficient two-phased HPSG parsing algorithm[3]. The key ideas of Torisawa's algorithm are compilation of HPSG and a two-phased parsing technique. At compile time, the lexical entries in HPSG are compiled into CFG rules. At Phase 1, a parser enumerates possible parse trees using bottom-up chart parsing for the CFG obtained by the compiler. The remaining constraints, which cannot be covered by the CFG, are solved at Phase 2.

This paper describes a parallel parsing algorithm for Phase 1. We chose the CKY algorithm[4][5] as the basis of our parallel CFG parsing algorithm. A parallel CKY algorithm is desirable from the viewpoints of speedup, distribution of data and memory efficiency. The next section describes the sequential CKY algorithm. Section 3 describes our parallel CKY algorithm. The effectiveness of our method is exemplified by a series of experiments using real-world text in Section 4, and the performance limit of our algorithm is discussed in Section 5.
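As a point of reference for the sections that follow, a sequential CKY recognizer for a grammar in Chomsky Normal Form might be sketched as below. This is only an illustrative sketch, not the implementation used in this paper (which was written in ABCL/f): the function name `cky_recognize`, the rule encodings, and the toy grammar `LEXICAL`/`BINARY` are all assumptions introduced for the example.

```python
from collections import defaultdict


def cky_recognize(words, lexical_rules, binary_rules, start):
    """Return True if `words` can be derived from `start` under a CNF grammar.

    lexical_rules: set of (A, w) pairs standing for rules A -> w
    binary_rules:  set of (A, B, C) triples standing for rules A -> B C
    (Both encodings are hypothetical, chosen for this sketch.)
    """
    n = len(words)
    # chart[(i, j)] holds the nonterminals that derive words[i:j]
    chart = defaultdict(set)

    # Spans of length 1: apply lexical rules A -> w
    for i, w in enumerate(words):
        for a, terminal in lexical_rules:
            if terminal == w:
                chart[(i, i + 1)].add(a)

    # Longer spans: try every split point k and every rule A -> B C
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for a, b, c in binary_rules:
                    if b in chart[(i, k)] and c in chart[(k, j)]:
                        chart[(i, j)].add(a)

    return start in chart[(0, n)]


# Toy grammar (hypothetical, for illustration only)
LEXICAL = {("NP", "she"), ("NP", "fish"), ("V", "eats")}
BINARY = {("S", "NP", "VP"), ("VP", "V", "NP")}

print(cky_recognize(["she", "eats", "fish"], LEXICAL, BINARY, "S"))
print(cky_recognize(["eats", "she"], LEXICAL, BINARY, "S"))
```

The three nested loops over span length, start position and split point are the part of the computation that a parallel CKY algorithm has to distribute; cells for spans of the same length do not depend on one another.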

2 Sequential CKY Algorithm

This section describes the sequential CKY algorithm. Let $G = (V_N, V_T, P, \sigma)$ be a context-free grammar, where $V_N$ is a set of nonterminal symbols, $V_T$ is a set of terminal symbols, $P$ is a set of rewriting rules and $\sigma$ is the starting symbol. For any input string $w = w_1 w_2 \ldots w_n$, $S_{i,j}$ is defined as the subset of $V_N$ such that $A \in S_{i,j}$ if and only if $A \to^{*} w_{i+1} \ldots w_j$. The string $w$ belongs to $L(G)$ if and only if $\sigma$ is in $S_{0,n}$. When the set of rewriting rules $P$ is in Chomsky Normal Form (CNF) (i.e. each rule is of the form $A \to BC$ or $A \to w$, where $A, B, C \in V_N$ and $w \in V_T$), the following constraints on the $S_{i,j}$ hold.

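$$\alpha \in S_{i-1,i} \Leftrightarrow \exists w_i\,(\alpha \to w_i \in P) \tag{1}$$

$$\alpha \in S_{i,j} \Leftrightarrow \exists k, \beta, \gamma\,(i < k < j,\ \beta \in S_{i,k},\ \gamma \in S_{k,j},\ \alpha \to \beta\gamma \in P) \tag{2}$$

We introduce $T_{i,k,j}$ for convenience in describing our parallel CKY algorithm. $T_{i,k,j}$ is defined as

$$\alpha \in T_{i,k,j} \Leftrightarrow \exists \beta, \gamma\,(\beta \in S_{i,k},\ \gamma \in S_{k,j},\ \alpha \to \beta\gamma \in P) \tag{3}$$

That is, $T_{i,k,j}$ is a subset of $S_{i,j}$ and Si;j = Si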