Compiling HPSG-style Grammar to an Object-Oriented Language

Kentaro Torisawa

Jun'ichi Tsujii

Department of Information Science, University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo, 113, Japan ([email protected])
Centre for Computational Linguistics, UMIST, PO Box 88, Manchester M60 1QD, U.K. ([email protected])

Abstract

This paper describes the compilation techniques used in our HPSG-based parser (Pollard and Sag, 1987; Pollard and Sag, 1993). As the first step, our parser builds up, as fast as possible, the parse trees which cover the whole input, by ignoring complex constraints. In the second phase, these ignored constraints are solved, but only for the completed trees covering the whole sentence. Our HPSG compiler generates code for the Common Lisp Object System (CLOS), which is used in the first phase. The original grammar is written using feature structures and definite clause programs (Carpenter, 1992). The compiled code contains unification routines specialised for a given grammar and a fixed parsing strategy, i.e., bottom-up parsing, as well as reduced constraints derived from the original grammar. Preliminary experiments show that our parser can be faster than a parser with conventional unification routines.

1 Introduction

Our aim is to build an efficient and robust HPSG-based parser that can be used as a component of a knowledge acquisition system from unrestricted text (Horiguchi et al., 1995; Torisawa and Tsujii, 1995). In knowledge acquisition, we cannot expect a full grammar description to be available to the parser. We only give a concise set of general schemata which can handle most of the linguistic constructions that appear in unrestricted texts, together with underspecified lexical descriptions. Starting with these, the system has to learn more detailed constraints from corpora. The parser must be able to handle the overgeneration caused by such an incomplete grammar description. Conventional HPSG-based parsers (Franz, 1990; Carpenter, 1993; Kodric et al., 1992) are not robust and efficient enough for such a situation.

In addition, we believe that conventional unification-based formalisms like PATR, which rely heavily on CFG skeletons, are not appropriate for knowledge acquisition. This is because an incomplete grammar description in such formalisms implies undergeneration, which is more serious than overgeneration for knowledge acquisition (Kiyono and Tsujii, 1994; Torisawa and Tsujii, 1995). As a result, conventional optimization techniques based on CFG skeletons are not effective for our purpose.

Unlike other HPSG-based parsers, our parser uses a phased parsing architecture with the following two phases, in order to cope with the inefficiency caused by an overgenerating grammar.

Phase 1: Bottom-up parsing with compiled object-oriented code.
Phase 2: Applying a full grammar to parse trees covering the whole sentence.

This phased architecture allows us to parse a sentence using only simple but strong constraints in Phase 1, and to delay expensive and complex parts of constraint solving. The purpose of Phase 1 is to build up syntactic structures which cover the whole input, and to save time and storage which would be wasted in dead-end parsing paths in Phase 2. The compiler for Phase 1 not only ignores certain constraints, but also exploits characteristics of a given HPSG grammar and a fixed parsing strategy in order to optimize the compiled code. Furthermore, ignoring certain constraints in Phase 1 enables an effective packing of signs according to subsumption ordering. This also improves the efficiency in terms of both space and time.

In this paper, we focus on the compilation techniques for generating object-oriented code for Phase 1. Though they are developed for a parser for our knowledge acquisition system, the proposed phased architecture and the compiling techniques can be used to improve the efficiency and robustness of HPSG-based parsers in general.
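The control flow of the two phases can be illustrated with a very small Python sketch. It is not the authors' CLOS implementation: real Phase 1 parsing builds the trees bottom-up, whereas here the candidates are simply given, and every name (`phased_parse`, `cheap_ok`, `full_ok`) is hypothetical. The point it illustrates is only that the expensive check runs on complete trees that survived the cheap one.

```python
from typing import Callable, Iterable, List, Tuple

# A candidate analysis as (start, end, label); a hypothetical stand-in for a parse tree.
Tree = Tuple[int, int, str]

def phased_parse(candidates: Iterable[Tree],
                 sentence_length: int,
                 cheap_ok: Callable[[Tree], bool],
                 full_ok: Callable[[Tree], bool]) -> List[Tree]:
    # Phase 1: keep candidates that pass the cheap constraints
    # and cover the whole input.
    complete = [t for t in candidates
                if cheap_ok(t) and t[0] == 0 and t[1] == sentence_length]
    # Phase 2: apply the expensive constraints to the complete trees only.
    return [t for t in complete if full_ok(t)]

expensive_calls = []

def expensive(tree: Tree) -> bool:
    expensive_calls.append(tree)   # record how often the costly check runs
    return tree[2] != "S-bad"

cands = [(0, 2, "NP"), (0, 5, "S"), (0, 5, "S-bad"), (2, 5, "VP"), (0, 5, "X")]
result = phased_parse(cands, 5, lambda t: t[2] != "X", expensive)
```

In this toy run the expensive check is called for two of the five candidates; the partial trees and the candidate rejected by the cheap check never reach it.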

2 Compiling HPSG-style Grammar

One of the reasons why HPSG (or unification-based formalisms in general) has attracted the attention of so many researchers is that it provides a uniform and well-defined formalism whose interpretation is neatly given by the universal algorithm of unification. By using the same algorithm, we can build a generator as well as a parser. Unification is also used for treating heterogeneous aspects of language including morphology, syntax, semantics, etc.

However, the desirable characteristics of simplicity and universality have adverse effects on actual processing with a specific purpose. This is because the uniform unification algorithm is overspecified with respect to a specific grammar and its specific use, i.e., parsing or generation and, if parsing, which parsing strategy is used, etc. Although there have been attempts at efficient implementation of unification, they have natural limits unless they commit to more specific grammar formalisms and their specific uses.

An obvious overspecification of unification is its bi-directionality. Bottom-up parsing and top-down generation are completely different tasks with respect to the direction of moving/copying feature structures. Roughly, the former moves/copies daughters' feature structures in order to create a mother sign, while the movement goes in the opposite direction in the latter. A unification routine has to decide this direction at run-time. However, if we can circumscribe the context in which unification is used, and restrict possible feature structures and their movements, unifiability checking and the decision on feature structure movement at run-time can be eliminated. In an extreme case, unification can be replaced by simple assignment.

Furthermore, different types of information are represented by using arbitrary feature structures unless we commit to specific grammar formalisms. A universal unification algorithm cannot use the semantics given to feature structures by a specific grammar. Syntactic/semantic constraints and linguistic representations of various levels (e.g. logical form), for example, have to be treated equally. However, in parsing, unification between feature structures representing syntactic/semantic constraints fails frequently, but once unifiability checking succeeds, the result of unification is hardly required by subsequent parsing. This means that, for those constraints, only unifiability checking is crucial and the creation of unified structures is not necessary in most cases. On the other hand, unification for building representations such as logical form does not fail very often, while the result of the unification is indispensable. This is the opposite of the case for constraints, i.e., unifiability checking can be eliminated.

The compilation of HPSG grammar we propose is based on the above observations. In short, we can provide efficient procedures which replace a universal unification algorithm by exploiting properties of a particular grammar and forms of possible lexical entries. In the next section, we describe various compilation techniques for HPSG and show how our method can be combined with them. Section 4 sketches our grammar formalism, whose characteristics affect our compilation method. The results of our experiments are discussed in Section 5.4.
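The distinction drawn in this section between checking unifiability and building the unified structure can be made concrete with a small Python sketch. It assumes a deliberately simplified encoding that is not the paper's: feature structures are nested dictionaries with atomic string values and no structure sharing, and the names `unify`, `unifiable` and `Clash` are hypothetical.

```python
class Clash(Exception):
    """Raised when two atomic values conflict."""

def unify(a, b):
    # Builds and returns a new structure containing the information of both inputs.
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for key, value in b.items():
            out[key] = unify(a[key], value) if key in a else value
        return out
    if a == b:
        return a
    raise Clash(f"{a!r} vs {b!r}")

def unifiable(a, b):
    # Only answers yes or no; no result structure is allocated.
    if isinstance(a, dict) and isinstance(b, dict):
        return all(unifiable(a[key], value)
                   for key, value in b.items() if key in a)
    return a == b

# A constraint-like use: the check is what matters, the result is not needed.
wanted = {"head": {"maj": "N"}}
noun = {"head": {"maj": "N"}, "sem": {"pred": "dog"}}
verb = {"head": {"maj": "V"}}
check_noun = unifiable(wanted, noun)
check_verb = unifiable(wanted, verb)

# A representation-building use: failure is rare, the result is what matters.
logical_form = unify({"sem": {"pred": "bark"}}, {"sem": {"arg": "dog"}})
```

Because the toy encoding has no reentrancy, checking feature by feature is sufficient here; with structure sharing a check would have to track shared nodes as well.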

3 Related Work

Although there have been previous works on the compilation of HPSG, the `compiler' has a different purpose in each case. The previous attempts can be categorized according to source formalism and target formalism:

1) Compilation of definite clause programs to Prolog (Carpenter, 1993; Carpenter and Penn, 1994).
2) Compilation of a non-computationally-oriented HPSG formalism into definite clause programs (Goetz and Meurers, 1995) or other high-level formalisms (Kasper et al., 1994).

The purpose of our compilation is similar to the first one, i.e., the compilation speeds up each application of a rule. The compilers in the first category can be used to improve the efficiency of Phase 2 parsing in our parser. The difference is that, by ignoring certain constraints, the compiled code from our compiler is faster than any parser with a built-in unification routine, though the compiled code may overgenerate. In spite of this overgeneration, the required storage is decreased drastically by specializing the unification routine. Even if some structures are overgenerated, the total amount of storage is less than that of parsing with an original grammar description.

The attempts in the second category are motivated by filling gaps between different formalisms. As a result, the research of Kasper et al. (1994) improves the efficiency by introducing a notion external to original HPSG. However, a single type of unification is used. Our compilation can be used for further speed-up if another compilation technique exists from the more high-level description to our formalism (or an extension of it).

Rewriting Rule: Mother([1]) -> Non-Head([2]) Head([3])
Principles: (head-feature, subcat-right, semantic)
Mother FS:
  [1] sign
      syn       [ subcat-right ⊥
                  adjunct1     [4]
                  adjunct2     [5]
                  adjuncts     [6]
                  subcat-left  [7] ]
      head-dtr  [3] sign
                syn [ subcat-right ⟨[sign], ...⟩
                      adjunct1     [4]
                      adjunct2     [5]
                      adjuncts     [6]
                      subcat-left  [7] ]
      comp-dtr  [2] sign
                syn [ subcat-right ⟨⟩
                      adjunct1     non-sign
                      adjunct2     non-sign
                      adjuncts     ⊥
                      subcat-left  ⊥ ]

Figure 1: An example of a rule schema.

4 Grammar Formalism

Name: Adjuncts
Mother FS:
  sign
  syn          [ adjuncts [1] ]
  head-dtr     sign
               syn [ adjuncts [2] ]
  adjunct-dtr  [3] [ sign ]
Definite Clause Program: cancel-a-member([2], [3], [1]).

Figure 2: Adjuncts Principle

HPSG grammar is represented in our system by the following three distinct components.

Rule Schemata: A rule schema consists of a unary/binary rewriting rule and a mother feature structure, accompanied by a set of principles. The principles are applied to the feature structure of the mother after it is unified with the other signs in the schema. An example of a rule schema for English is given in Figure 1.

Principles: A principle consists of a mother feature structure and a definite clause program. The definite clause program is written in a logic programming language whose arguments are feature structures. The definite clause program performs complex operations involving, for example, set-valued feature structures or building complex logical forms. An example of a principle for English is given in Figure 2.

Lexical Entries: While schemata and principles only define mutual relationships among the feature structures of the mother and daughter constituents, more specific constraints are given by the selecting features of lexical entries for individual words, such as subcat-left, subcat-right, etc. A selecting feature specifies what kinds of signs the sign containing the word can be combined with. In general, when a rule schema is applied to construct a mother constituent, the accompanying principles are applied which check, among other things, the compatibility of the selector (i.e. the sign containing the selecting feature) and the selectee (i.e. the sign which is to be unified with the selecting feature).

Our current version of English grammar has the following three types of selecting features.

Singleton Type: A single feature structure which is unified with another sign.

List Type: A list of feature structures. Each feature structure is unified with another sign according to the order of the list.

Set Type: A set of feature structures. Any of the feature structures in it can be unified with another sign, regardless of the order. (This includes slash.)

We assume that a rule schema is accompanied either by 1) the use of at least one selecting feature, i.e., its unification with another sign, or by 2) a transfer of an element of a selecting feature to another selecting feature. Though this assumption is more restrictive than usual, we observed that it does not hamper the coverage of our grammar. More importantly, this restriction enables us to produce more efficient compiled code.
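How the three types of selecting features differ in use can be sketched in Python. This is only an illustration under strong assumptions: specifications and signs are reduced to plain category strings, matching is an arbitrary predicate passed in by the caller, and the function names are hypothetical. Each function returns the list of possible remainders of the selecting feature after one sign has been consumed.

```python
def use_singleton(spec, sign, matches):
    # Singleton type: one specification; nothing remains after it is used.
    return [None] if matches(spec, sign) else []

def use_list(specs, sign, matches):
    # List type: only the first element may be used, in list order.
    if specs and matches(specs[0], sign):
        return [specs[1:]]
    return []

def use_set(specs, sign, matches):
    # Set type: any element may be used, so there is one outcome per match.
    # This loosely mirrors the role of cancel-a-member in the Adjuncts principle.
    return [specs[:i] + specs[i + 1:]
            for i, spec in enumerate(specs) if matches(spec, sign)]

def same(spec, sign):
    return spec == sign

list_first = use_list(("NP", "PP"), "NP", same)
list_second = use_list(("NP", "PP"), "PP", same)
set_second = use_set(("NP", "PP"), "PP", same)
set_twice = use_set(("NP", "NP"), "NP", same)
single = use_singleton("NP", "NP", same)
```

The list type rejects a sign that matches only a later element, while the set type accepts it; a set with two matching members yields two alternative remainders.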

5 Compilation to Object-Oriented Code

Our compiler produces the following items in the Common Lisp Object System (CLOS).

Sign Objects: An object corresponding to a sign.

Rule Methods: Procedures which play the roles of rule schemata and principles. Rule methods take sign objects representing daughter signs as input and produce sign objects corresponding to mothers.

Sign objects have slots corresponding to only part of the feature structures. The operations performed by a unification routine in rule schemata and principles are replaced by low-level procedures in rule methods. The low-level procedures operate directly on slot values of sign objects and play the role of the unification routine in the original grammar. Thus, our compiler eliminates feature structures that have no corresponding part in a slot value format or in a slot value itself. For example, the sem value does not have a corresponding slot in sign objects. This is because unifications of sem values do not fail very often and are not necessary to restrict possible parse trees.

5.1 Soundness

Before going into detail, we examine the soundness of our compilation, though the following simple formalization cannot capture our compilation completely. Sign objects contain the data structures corresponding to only a part of the feature structures which describe a sign in the original grammar. In general, sign objects do not contain semantic representations or other rather complex structures. A property of sign objects can be stated as follows, if we ignore the difference in implementation between sign objects and feature structures.

Theorem 1 (Compilation of Signs)

For any sign $S$ in an original grammar with a feature structure formalism and its compiled sign object $S'$, $S' \sqsubseteq S$.

A similar property holds between a compiled rule method and its original rule schema and principles. In general, the feature structures preserved in a compiled grammar are specified by programmers so that drastic overgeneration can be avoided. The soundness of our compiled code is defined as follows.

Definition 1 (Soundness)

If a sign $S_1$ is produced by an original grammar, there must be a sign $S'$ produced by its compiled grammar such that $S' \sqsubseteq S_1$.
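Theorem 1 and Definition 1 both rest on the subsumption relation, which can be checked on toy structures with a short Python sketch. The encoding is an assumption made for illustration only: feature structures are nested dictionaries without structure sharing, and "compilation" is imitated by a hypothetical `project` function that simply drops top-level features.

```python
def subsumes(general, specific):
    # True if every piece of information in `general` is also present in `specific`.
    if isinstance(general, dict):
        return (isinstance(specific, dict)
                and all(key in specific and subsumes(value, specific[key])
                        for key, value in general.items()))
    return general == specific

def project(fs, kept):
    # Stand-in for compilation: keep only the named top-level features.
    return {key: value for key, value in fs.items() if key in kept}

original = {"syn": {"head": {"maj": "V"}}, "sem": {"pred": "run"}}
compiled = project(original, {"syn"})

compiled_is_more_general = subsumes(compiled, original)
original_is_more_general = subsumes(original, compiled)
```

Dropping information can only make a structure more general, so the projected structure subsumes the original one but not the other way round.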

Since parsing consists of applications of rule schemata and principles and the interpretation of definite clause programs, which amounts to a series of unification steps (see footnote 1), the soundness of a compiled grammar can be proven by 1) the subsumption relation between an original grammar and its compiled grammar, and 2) the monotonicity of unification (i.e., for any feature structures $F_0$, $F_1$, $F_0'$, $F_1'$, if $F_0 \sqsubseteq F_0'$, $F_1 \sqsubseteq F_1'$ and $F_0' \sqcup F_1'$ exists, then $F_0 \sqcup F_1$ exists and $F_0 \sqcup F_1 \sqsubseteq F_0' \sqcup F_1'$). In the following, we present examples of compiled grammars and explain the differences between the actual compiled code and the above simple formalization.

Footnote 1: In fact, the compilation removes some definite clause programs. The soundness can be proven even with this operation because definite clause programs do not contain control structures such as negation as failure or cut. However, the elimination of feature structures by compilation may prevent definite clause programs from terminating. Termination must be guaranteed somehow.

5.2 Sign Objects

Each slot of a sign object corresponds to a feature structure in a sign. It contains a fragment of the feature structure or other Lisp objects converted from the feature structure, such as symbols representing types. Figure 3 shows the sign object compiled from the lexical entry in Figure 4.

(SLOT-NAME     SLOT-VALUE)
(SUBCAT-RIGHT  ((E-LIST NIL #)))
(SUBCAT-LEFT   ((NIL NON-OBLIGATORYSIGN #)))
(ADJUNCTS      ((E-LIST NIL #)))
(ADJUNCTS-1    (NON-SIGN #))
(ADJUNCTS-2    (SIGN #
                ((ADJUNCT1 NON-SIGN :SINGLETON))
                (SELECTING-FEATURE-SHARING SUBCAT-LEFT SUBCAT-LEFT)))
(SLASH         ((E-LIST # NIL)))
(REL           (NON-SIGN # NIL NIL))

NIL is ⊥ in a feature structure notation.

Figure 3: A compiled lexical entry

sign
syn [ head         [ maj Aux ]
      subcat-left  ⟨[1]⟩
      adjunct1     [2]
      adjunct2     sign
                   syn [ head         [3] [ maj V, infl infinitive ]
                         subcat-left  ⟨[1]⟩
                         adjunct1     [4] non-sign ] ]

Figure 4: An original lexical entry

Typical slot values represent selecting features. The value for a singleton type selecting feature, e.g., ADJUNCTS-2, is a tuple of the following elements:

sign-type: The type of the selected sign, e.g., sign or non-sign.

head-feature: A feature structure which is the head feature of the selecting feature.

selecting-feature-constraints: Some selecting features contain constraints on the selecting features of the selected sign. This part contains the commands to check the constraints.

structure-sharing-commands: Commands which perform the transfer of feature structure information that would be performed through structure sharing in a full grammar.

In the case of ADJUNCTS-2 in Figure 3, the sign-type is sign and the head-feature is a feature structure #, which represents the feature structure denoted by the tag [3] in Figure 4. The third element is selecting-feature-constraints. This element in Figure 3 represents the constraint that the ADJUNCT1 value of a selected sign must be non-sign. This corresponds to the feature structure tagged as [4] in Figure 4. The fourth element is a command to transfer the subcat-left value of a selected sign to the same slot of the mother. This transfer is represented by the structure sharing [1] in the original lexical entry.
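The four-element tuple described above can be pictured with a small Python sketch. It is an illustration rather than the compiler's actual data layout: the class and function names are hypothetical, head features are flattened to a dictionary of atomic values, and head matching is reduced to equality on the listed features.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Sign:
    sign_type: str = "sign"
    head: Dict[str, str] = field(default_factory=dict)
    slots: Dict[str, object] = field(default_factory=dict)

@dataclass
class SingletonSelector:
    sign_type: str                      # type of the selected sign
    head_feature: Dict[str, str]        # required head features
    constraints: List[Tuple[str, str]]  # (slot of the selected sign, required value)
    sharing: List[Tuple[str, str]]      # (source slot on selectee, target slot on mother)

def accepts(selector: SingletonSelector, selectee: Sign) -> bool:
    if selector.sign_type != selectee.sign_type:
        return False
    if any(selectee.head.get(k) != v for k, v in selector.head_feature.items()):
        return False
    return all(selectee.slots.get(slot) == value
               for slot, value in selector.constraints)

def run_sharing(selector: SingletonSelector, selectee: Sign, mother: Sign) -> None:
    # Commands prepared ahead of time: plain slot-to-slot transfers.
    for source, target in selector.sharing:
        mother.slots[target] = selectee.slots.get(source)

# Loosely modelled on the ADJUNCTS-2 value of Figure 3.
adjuncts_2 = SingletonSelector(
    sign_type="sign",
    head_feature={"maj": "V", "infl": "infinitive"},
    constraints=[("adjunct1", "non-sign")],
    sharing=[("subcat-left", "subcat-left")])

vp = Sign(head={"maj": "V", "infl": "infinitive"},
          slots={"adjunct1": "non-sign", "subcat-left": ["NP"]})
np = Sign(head={"maj": "N"}, slots={"adjunct1": "non-sign"})

mother = Sign()
if accepts(adjuncts_2, vp):
    run_sharing(adjuncts_2, vp, mother)
```

Keeping the sharing commands as data means that which slots to copy is decided once, when the entry is compiled, and not rediscovered at each rule application.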

Most structure sharing in an original sign does not have a corresponding representation in a sign object. The only structure sharing preserved by the compiler is represented as a command in structure-sharing-commands (like SELECTING-FEATURE-SHARING in Figure 3). As the commands are generated at compile time, run-time checking of structure sharing is omitted.

5.3 Rule Methods

As mentioned in Section 4, rule schemata with their principles are categorized as 1) rule schemata which use selecting features and 2) rule schemata which transfer selecting features. The first category is divided further into three according to the type of the selecting features (singleton, list or set) in a rule schema. The compiler generates compiled code by using a template of CLOS code which is prepared for each type of rule schemata. A rule method is generated by filling out the unfilled parts with references to a rule schema and its principles. The differences among code templates reflect the differences of the definite clause programs to be evoked in the application of each type of rule schemata. For example, a transfer of selecting features is realized by a definite clause program for the trace principle. The other three templates are also accompanied by different programs. Cancel-a-member in Figure 2, for example, is the program for selecting features of set type. The behaviour of such important parts of the definite clause programs, which constitute the basic skeleton of HPSG, is reflected directly in the templates (see footnote 2).

Footnote 2: Other parts of the definite clause programs do not have corresponding parts in the compiled code, except for the literals whose definition contains only facts.

In the following, we only describe how to compile the rule schema which treats selecting features of list type. The other types of schemata are compiled in similar ways. The following is the template for a rule schema for list type selecting features. The variable parts of the template are shown as <...>.

    (lambda (selector selectee)
      (if (and (match-head-feature?
                (first-element (<...> selector))
                selectee)
               <...>)
          (progn
            (let ((mother-sign (create-sign)))
              (setf (<...> mother-sign)
                    (rest-elements (<...> selector)))
              (interpret-structure-sharing-commands
               selector
               (first-element (<...> selector)))
              mother-sign))
          nil))
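The shape of the template above can be mirrored in a short Python sketch. This is not the generated CLOS code: signs are plain dictionaries, each element of a list type selecting feature is a dictionary with hypothetical `head` and `sharing` entries, and the factory `make_list_rule` with its parameters stands in for the variable parts that the compiler fills in.

```python
def make_list_rule(slot, match_head_feature, fill_slots):
    """Build a rule method for a list type selecting feature stored in `slot`."""
    def rule(selector, selectee):
        specs = selector.get(slot, [])
        # Only the first element of the list may be used.
        if not specs or not match_head_feature(specs[0], selectee):
            return None
        mother = {}                                # plays the role of create-sign
        fill_slots(mother, selector, selectee)     # slots dictated by the schema
        mother[slot] = specs[1:]                   # the used element is removed
        for source, target in specs[0].get("sharing", []):
            mother[target] = selectee.get(source)  # value transfer, no unification
        return mother
    return rule

verb = {"head": "V",
        "subcat-right": [
            {"head": "N", "sharing": [("subcat-left", "subcat-left")]},
            {"head": "P", "sharing": []}]}
noun = {"head": "N", "subcat-left": ["Det"]}

rule = make_list_rule(
    "subcat-right",
    lambda spec, sign: spec["head"] == sign["head"],
    lambda mother, selector, selectee: mother.update(head=selector["head"]))

mother = rule(verb, noun)
rejected = rule(verb, {"head": "P"})
```

The mother ends up holding the very same list object as the daughter's `subcat-left` value, which illustrates how value transfer shares data between sign objects where unification would build a new structure.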

A rule method takes daughter sign objects, which are bound to the variables selector and selectee in the argument list, and produces their mother sign object, bound to the variable mother-sign. At first, a rule method checks compatibility between the selecting feature elements of selector and selectee (match-head-feature?). Then the code checks the compatibility of a rule schema and its principles with selector and selectee. If they are satisfied, the mother sign object is created (create-sign) and the slots of the created sign object are filled. Then the structure-sharing-commands corresponding to structure sharing in the original sign are interpreted. In the following, we explain how to fill the variable parts of the template.

match-head-feature?: Unifiability checking of the first element of the slot value corresponding to a selecting feature against a selected sign object. Because the used selecting feature is removed from a mother sign by HPSG convention, this code does not produce the result of unification.

Compatibility check: This is the substitute for the unifiability checking between rule schemata/principles and daughter signs. In general, it directly calls a function which only checks type compatibility.

Slot filling: This fills the slots of a mother sign according to a rule schema or its principles. If structure sharing occurs between a mother sign and a daughter sign in a rule schema or its principles, the value of the daughter's slot is simply transferred to the mother's corresponding slot. Unifiability checking is not performed because we can assume that a mother sign is rather empty.

(interpret-structure-sharing-commands)

This interprets the structure-sharing-commands stored in a sign object. Execution of the commands plays the role of the structure sharing in signs which is not captured by the slot-filling step. Most commands simply transfer the value of one slot to another.

In general, a unification evoked by structure sharing is replaced by a simple transfer of values from one slot to another. The simple formalization of compilation described before cannot capture this substitution. We observed that structure-shared nodes in a feature structure are likely to be empty in HPSG-style grammars. In most cases, structure sharing tends to be used only to transfer the information given by another feature structure. Therefore, a simple transfer of a slot value is sufficient in place of unification. If a destination slot is assumed to be empty, even unifiability checking can be omitted. This increases the sharing of data among sign objects and decreases the required storage.

More importantly, this substitution of unification does not affect the soundness of our compilation. Both inputs to a unification always subsume the output. Therefore, if a sign object is created by code containing a slot value transfer instead of a unification, it subsumes the sign object created with a unification. Thus, a sign object created with our compiled code always subsumes the original sign.

5.4 Performance

We measured the execution time of the compiled code for creating a mother sign from two daughter signs. The rule schema used in this experiment is presented in Figure 1. The code is about 43 times as fast as the application of a rule schema and its principles. The required storage is about 58 times less than the storage for the original grammar.

It is difficult to measure the total speed-up of parsing because it depends on the input sentence and other components in the parser. As a preliminary experiment, we show only one case. The target sentence has 25 words. Bottom-up parsing with the compiled code, followed by applications of the original grammar to the completed parse trees, was 3.1 times as fast as parsing with only the original grammar. The required storage for compiled grammar was 370% less than that of the original grammar.

6 Conclusion

We presented a compilation technique for HPSG-style grammar to an object-oriented programming language. The resulting code is far faster than a conventional parser with a unification routine, though the compiled grammar tends to overgenerate. Such overgeneration can be eliminated by applying the original grammar to completed parse trees covering a whole sentence. Thus, the entire parsing process is sped up.

Linguistically well-defined grammar formalisms such as HPSG have been regarded as inappropriate for dealing with unrestricted real-world text. In order to build feasible systems, researchers have relied on more procedural grammars whose well-definedness is difficult to show. However, by using our compilation technique, we will be able to develop a robust and efficient HPSG-based parser which can be a component of a practical system.

References

Bob Carpenter and Gerald Penn. 1994. The Attribute Logic Engine User's Guide (Version 2.0.1). Carnegie Mellon University.

Bob Carpenter. 1992. The Logic of Typed Feature Structures. Cambridge University Press.

Bob Carpenter. 1993. Compiling typed attribute-value logic grammars. In Third International Workshop on Parsing Technologies.

Alex Franz. 1990. A parser for HPSG. Laboratory for Computational Linguistics Report CMU-LCL-90-3, Laboratory for Computational Linguistics, Carnegie Mellon University.

Thilo Goetz and Walt Detmar Meurers. 1995. Compiling HPSG type constraints into definite clause programs. In ACL-95.

Keiko Horiguchi, Kentaro Torisawa, and Jun'ichi Tsujii. 1995. Automatic acquisition of content words using an HPSG-based parser. To appear in NLPRS'95.

Robert Kasper, Bernd Kiefer, Klaus Netter, and K. Vijay-Shanker. 1994. Compilation of HPSG to TAG. In ACL 95.

Masaki Kiyono and Jun-ichi Tsujii. 1994. Hypothesis selection in grammar acquisition. In Coling 94.

Sandi Kodric, Fred Popowich, and Carl Vogel. 1992. The HPSG-PL system: Version 1.2. Technical Report CSS-IS TR 92-05, Simon Fraser University.

Carl Pollard and Ivan A. Sag. 1987. Information-Based Syntax and Semantics Vol. 1. CSLI lecture notes no. 13.

Carl Pollard and Ivan A. Sag. 1993. Head-Driven Phrase Structure Grammar. University of Chicago Press and CSLI Publications.

Kentaro Torisawa and Jun'ichi Tsujii. 1995. An HPSG-parser for automatic knowledge acquisition. In the 4th International Workshop on Parsing Technologies.
