A Formalization Tool for Comparing Syntax

A Formalization Tool for Comparing Syntax Structures in Written and Spoken MSA Corpora

Sameh Al-Ansary, TCMNO, Nijmegen University, P.O. Box 9103, HD Nijmegen, The Netherlands

Abstract

This paper presents a formal grammar for the automatic analysis of the syntactic structures of Modern Standard Arabic (MSA). The resulting parser is intended to serve as the basis for comparing two corpora of spoken and written MSA. At this initial stage of the work, the grammar is limited to the noun phrase (NP) level. The formal description is implemented in Affix Grammars over a Finite Lattice (AGFL) and represents a linguistic approach in terms of functions and categories that accounts for the sequences inside noun phrases together with the relations governing these sequences. The two corpora representing the spoken and written varieties were selected from the Egyptian media. The linguistic approach, together with a detailed description of the Arabic word classes, has led to an accuracy exceeding 97%. The comparison revealed striking and reliable differences, associated with frequencies, between written and spoken MSA structures, which may have a considerable impact on computational processing aimed at building automatic natural language applications based on both varieties.

1 Introduction

There is a long history of attempts to analyze the syntactic structures of MSA. Cantarino (1974) followed a “meaning-based theory of grammar” in his syntactic analysis. He defined the sentence and subdivided it into the nominal sentence, in which only nominal elements are used as constituents, and the verbal sentence, which includes a verb as a constituent. Aoun (1979) used Chomsky’s transformational model to discuss the internal structure of the NP in Standard Arabic and explained the linguistic facts of definiteness and genitive case assignment in terms of generation with subsequent local movement by means of a transformation. Bakir (1980) used the framework of Government and Binding theory to study word order in Arabic. Ayoub (1981) discussed WH-movement and the complement structure of complex transitive verbs involving a direct object. Fassi Fehri (1985) adopted an approach by which he described sentence structure in Arabic; he emphasized the role of the lexicon in making transformations more realistic, or at least in restricting the number of transformations. Edwards (1983) developed a Generalized Phrase Structure Grammar (GPSG), which was used for the description of Arabic data and the analysis of Egyptian newspaper data; he accounted for relative clauses in Egyptian Arabic. Motawakkil (1985) explored the descriptive capabilities of Functional Grammar (FG) with respect to the syntax and features of the Arabic language. He studied “topic”, “focus”, “theme”, “tail” and “vocative” constructions, each with its pragmatic function. Finally, Ditters (1992) used the immediate constituent approach combined with alternating function and category layers to describe MSA phrase structures. He used a context-free grammar in a two-level approach and succeeded in giving a largely accurate description of the syntactic structures of Modern Standard Arabic. Our approach to building a formalized tool for the automatic comparison of two corpora of MSA is based on that of Ditters, with a few elaborations that follow from our detailed description of the parts of speech (POS) in Arabic.

2 The proposed linguistic approach for describing the NP in MSA

In describing the NP, we deal with immediate constituents and with the smaller units into which these NPs can be analyzed. Constituent analysis proceeds through alternating function and category layers until the terminal lexical entries are reached. In this respect, a distinction is made between the functions which these units can fulfill and the categories in which these functions can be realized. The same distinction between functions and categories is used in describing the elements into which a phrasal constituent can be analyzed. Some structures consist of a single obligatory central element with optional expansions, which leads to a description in terms of head and modifiers. Other structures, such as a prepositional phrase that is sometimes embedded within the NP, are instead described in terms of ‘header’ and ‘complement’. In this context the relations between the elements of a constituent are dealt with. For more details about this linguistic approach cf. Owens (1988) and Ditters (1992, 2000, 2001).

Two issues must be made clear at this point. First, the NP selected for the study is not the smallest building block of the sentence, as far as nouns are concerned, but the unit that is supposed to have one function in the sentence. Second, the adequacy of the grammar is limited at the present stage to the extent of its coverage of the corpora being studied. Once the grammar is felt to be adequate, it will be possible to analyze large amounts of text and thus obtain global information about speech and writing in MSA. This discussion of the NP is accompanied by a detailed subclassification of the Arabic class ‘noun’ as far as the terminal entries of the description are concerned (cf. El-Kareh and Al-Ansary 2000). In addition, a number of semantic features are used to complete the description of the lexical entries.

Elaborated in this way, we believe that our descriptive model can be used to build a powerful formal tool, which in turn can be used for the automated comparison of NPs in different corpora. For information about the structure of the corpora compared cf. Al-Ansary (2001).

3 Linguistic perspectives

In this section we briefly explain the linguistic framework in which the NP category occurs. Within the NP, a number of functions can be distinguished: the head, the determiners and the postmodifier. These functions range between determination and postmodification of the nucleus of the NP, that is, the element occurring in the head function. The NP, in its simplest form, consists only of a head. Even in a more complicated NP, only one head function is to be distinguished. The principle of mono-headedness does not cause any undesired analyses of the phrases, especially in cases of coordination, since it belongs to the functional level, while coordination occurs at the categorial level. The head is defined as the unit which is marked for its function at the next higher level of description and which cannot be deleted without affecting the meaning of the constituent (cf. Ditters 1992). By this definition, the head function of an NP can only be realized by the category noun. Owens (1988) distinguished several subcategories able to realize this function (cf. Owens 1984, Ch. 9). According to our subclassification of nouns, a common noun, pronoun, proper noun, present participle, passive participle, adjectival noun, standard infinitive (verbal noun), noun of title, etc. are examples of heads of an NP (for more details cf. El-Kareh and Al-Ansary 2000).

In extension of the head, an element can function as a determiner of the head of the NP. The element in this function may occur before or after the head, which leads us to differentiate between a “predeterminer” (PREDET) and a “postdeterminer” (POD). The two are mutually exclusive in relation to the head, i.e. they cannot occur together.
The category in the function of predeterminer is mainly the prefixed article “’al”, while the category in the function of postdeterminer is a normal NP marked for genitive case. The postmodifier function is always placed after the head of the NP and is for this reason called “postmodifier” (POM). In our approach, postmodification can, according to its categorial realization, be classified into PPOM, ADJPOM, NPOM or ADVPOM, realized by a prepositional phrase, adjective phrase, noun phrase and adverbial phrase respectively.

An additional element can be distinguished that functions as a complement of the head of the NP (COMPL). Like the postmodifier, the complement is always realized after the head. However, the two should not both be treated as postmodification, since the complement has a particular syntactic relation to the head. A postmodifier agrees with its head in ‘definiteness’, ‘number’, ‘gender’ and ‘case’, whereas there is no direct agreement relation between the head and its complement; on the contrary, the head imposes specific values on its complement.

In summary, the general structure of the NP in MSA in terms of functions can be represented by the rules in (3) (note that ( ) is traditionally used to represent optional elements and { } to represent choice):

(3) a- NP → (PREDET) HEAD (POM) (COMPL)
    b- NP → HEAD POD (POM) (COMPL)
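The functional patterns summarized in (3) can be checked mechanically. The following is a minimal Python sketch, not part of the AGFL grammar itself. It assumes that PREDET and POD exclude each other (as stated above), that POD, POM and COMPL follow the head in that order, and that POM may be repeated; the function name `is_valid_np` and the regular-expression encoding are illustrative choices only.

```python
import re

# Two functional patterns for the NP. The relative order POD < POM < COMPL
# and the repeatability of POM are assumptions of this sketch.
NP_PATTERNS = [
    re.compile(r"^(PREDET )?HEAD( POM)*( COMPL)?$"),  # pattern (a): optional predeterminer
    re.compile(r"^HEAD POD( POM)*( COMPL)?$"),        # pattern (b): postdeterminer
]


def is_valid_np(functions):
    """Return True if the sequence of function labels matches pattern (a) or (b)."""
    sequence = " ".join(functions)
    return any(pattern.match(sequence) for pattern in NP_PATTERNS)
```

For instance, `is_valid_np(["PREDET", "HEAD", "POD"])` is rejected because a predeterminer and a postdeterminer cannot co-occur, while `is_valid_np(["HEAD", "POD", "POM"])` is accepted.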

4 Formal perspectives

In this section we discuss a formal representation of the linguistic description of the NP in MSA presented in the previous section. We concentrate on the rules of the first and second levels of description and demonstrate the descriptive power and economy of the combination of both levels as realized within the AGFL formalism. We give an overall description of the rewriting process at a lower linguistic level. The non-terminal and terminal elements, the affix variables and values, and the lexical rules are also briefly dealt with.

4.1 The first and second levels of description

The linguistic description of the NP in MSA can be represented by means of context-free rules. ROOT is the start symbol of our grammar, and ROOT is rewritten as the phrasal category NP. The NP is in turn rewritten as a sequence of optional and obligatory functional elements. The description alternates between functions and categories until the description in lexical terms has been reached. Starting with our initial label ROOT, the first rule is:

ROOT: NP.

A number of restrictions are applied via linguistic features to determine the dependencies and relations between the elements of the NP. Thus our start rule can be revised as:

ROOT: NP(definiteness, number, gender, person, case).

By means of the non-terminal affix variables, we deal with the elements of the first and second levels of description. At the first level, non-terminal elements are arranged in phrase structure rules called syntax rules or hyper rules. These phrase structure rules are context-free rules describing syntactic structures. As we have seen with ‘definiteness’, ‘number’, ‘gender’, ‘person’ and ‘case’, affix variables can be attached to the non-terminals of the first level. These meta affixes constitute the second level of description. Conventions for the writing of affix rules and syntax rules can be found in Koster (1991).

4.1.1 Affix rules

subclass::NPGOD;NW;NIA;NIS;NOP;NNC;NNP;NE;NPP;NT;NDS;NIM;NDP;NF;NDJ;NDC;ND;NU;NQTM;NQTUM;NDO;NDE;NNL.
notsubclass::NM;NPH;NCONF;NPRO;notnph.
defness::def;INDEF.
def::DEF BY PREDET;DEF BY POD;DEF BY ITSELF.
gender::FEM;MASC;BOTH.
number::SING;DUAL;plu.
plu::BPLU;RPLU.
person::FIRST;SECOND;THIRD.
case::GEN;NOM;ACC.

In our formal description we wish to highlight information about the realization of the head of the NP. For this reason one of the affix non-terminals used is ‘subclass’, whose values correspond to the subclasses of the Arabic noun adopted in this study. To allow for different values of the same affix non-terminal occurring more than once in the right-hand side of a rule, we add a digit to the affix name, hinting at a possible difference in the realized value. Other affixes, such as those listed below, will be commented upon in the course of our discussion.

humanity::HUMAN;NONHUMAN;GOD.
function::ENUMERATIVE;PARTITIVE;PURPOSE;ASSOCIATIVE;LOCATIVE;INSTRUMENTAL;EXCLUSIVE;SIMILE;TEMPORAL;direction.
direction::SOURCE;TARGET.
transitivity::INTRANS;MONOTRANS;DITRANS;EMPTY.
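To illustrate how affix rules define finite domains, the following Python sketch encodes a few of the rules above as a dictionary and expands a lower-case affix non-terminal into the set of upper-case terminal values it can take. This is only an illustration of the idea, not AGFL itself; the names `AFFIX_RULES` and `terminal_values` are hypothetical.

```python
# A subset of the affix rules, copied from the listing above.
# Keys are affix non-terminals; values are their alternatives.
AFFIX_RULES = {
    "defness": ["def", "INDEF"],
    "def": ["DEF BY PREDET", "DEF BY POD", "DEF BY ITSELF"],
    "gender": ["FEM", "MASC", "BOTH"],
    "number": ["SING", "DUAL", "plu"],
    "plu": ["BPLU", "RPLU"],
    "person": ["FIRST", "SECOND", "THIRD"],
    "case": ["GEN", "NOM", "ACC"],
}


def terminal_values(name):
    """Return the set of terminal affix values reachable from an affix name."""
    if name not in AFFIX_RULES:
        # Not a left-hand side of any affix rule: the name is itself terminal.
        return {name}
    values = set()
    for alternative in AFFIX_RULES[name]:
        values |= terminal_values(alternative)
    return values
```

Under this encoding, `terminal_values("number")` yields the four values SING, DUAL, BPLU and RPLU, because the non-terminal `plu` is itself expanded into its two plural types.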

4.1.2 Syntax rules

For the description of the NP we want to include the affix name ‘subclass’ and to keep a record of the realized values for ‘definiteness’, ‘number’, ‘gender’, ‘person’ and ‘case’. Therefore, our first rule is reformulated as:

(3) ROOT: NP(defness, subclass, number, gender, person, case).

To keep the rules short, the name of the feature ‘definiteness’ is contracted to ‘defness’.

The affix non-terminal ‘subclass’ is always instantiated by the subclass of the main HEAD of the NP. This affix keeps track of the subclass of the head of the NP, which plays a noticeable role in eliminating many undesired alternatives by filtering the subclass of the head according to the context. A specific value for ‘definiteness’ results from the presence or absence of a predeterminer (DEF BY PREDET) or a postdeterminer (DEF BY POD). An entry may also be lexically determined, for example when it belongs to the subclass of personal pronouns or proper nouns (DEF BY ITSELF). The value of ‘gender’ is exclusively linked with the specific semantic or syntactic realization of the head element. ‘Number’ distinguishes singular, dual and plural (both regular and broken), which is crucial for the agreement between subconstituents inside the NP at a lower linguistic level. The affix ‘person’ is needed for the description of structures in which a personal pronoun occurs; all other nouns are THIRD person. The value of ‘case’ depends on the function of the elements in the context.

4.1.2.1 The first cycle

In the preceding section we presented the overall structure of the NP with its obligatory and optional elements. We now discuss the elements occurring on the right-hand side, starting with the rewrite rule of the phrasal category NP:

(4) NP(defness, notndj, number, gender, THIRD, case):
      NHEAD(notndj, number, gender, THIRD, transitivity, humanity),
      POD(defness1, number1, gender1, person, GEN),
      DEFNESS HERE IS(defness, defness1),
      PPOM(defness1, number2, gender2, person2, GEN),
      [PPOM(defness2, number3, gender3, person3, GEN)].

The rule can be paraphrased as follows: an NP is realized by a nominal head followed by a postdeterminer and then postmodified by one or two prepositional phrases (PPOM). The rule shows several features on which the grammar is based. First, the values of the affix non-terminal ‘subclass’ are grouped flexibly according to the needs of the position to be filled: ‘notndj’ stands for any noun subclass that cannot fill the position of an ADJP. We come back to this issue in the section on filtering rules (4.1.4). Second, no agreement in number and gender is required either between the head and the POD or between the head and the prepositional postmodifiers. Such agreement is not even necessary between two successive PPOMs (when the optional second PPOM occurs). The final value for the definiteness of the whole NP depends on the value of the sub-NP realized in the function of the POD. This is controlled by the predicate rule:

(5) DEFNESS HERE IS(defness, defness1).

Predicate rules are dealt with in more detail in section 4.1.4, as they are helpful both in expressing relations between elements and in filtering out undesired alternatives that may result from an analysis.

Since the Arabic language uses different devices for marking the NP as definite, we distinguish three types of the first function layer, each imposing one of the affix values INDEF, DEF BY PREDET and DEF BY ITSELF. The value DEF BY POD cannot be imposed, since it depends on whether the head is followed by a postdeterminer or not. (To be precise, only one case has occurred so far in which the value DEF BY POD could be imposed: the one in which the function of the NP is confirmation. In this case, the head of the NP must be a confirmational noun (CONF) followed by a POD which is in turn realized by a pronoun (NPRO).) It also depends on the definiteness value of the head of the NP realized in the postdeterminer function. Therefore, it falls within the scope of rule (4), since the affix ‘definiteness’ can be instantiated either by INDEF or by DEF BY POD, depending on the ‘definiteness’ value of the postdeterminer.

Consequently, we can draw up the following alternatives for the NP according to the value of ‘definiteness’:

(6) NP(INDEF, notnproper, number, gender, THIRD, case):
      NHEAD(notnproper, number, gender, THIRD, transitivity, humanity).

(7) NP(DEF BY PREDET, subclass, number, gender, THIRD, case):
      PREDET, NHEAD(subclass, number, gender, THIRD, transitivity, humanity).

(8) NP(DEF BY ITSELF, NPH, number, gender, THIRD, case):
      NHEAD(NPH, number, gender, THIRD, EMPTY, HUMAN).

According to the imposed value, many alternative rules to (4), (6), (7) and (8) are formalized. These alternatives show that the inclusion of some syntactic features is necessary. A typical example is the affix non-terminal ‘transitivity’. This affix assigns ‘INTRANS’, ‘MONOTRANS’ or ‘DITRANS’ to all derivatives and verbal nouns (infinitives), and ‘EMPTY’ to the remaining subclasses. The attachment of these values enabled the grammar to detect the complements of those derivatives or verbal nouns that have a verb-like complement valence (see the sample in section 5). Since the complement function is normally realized by an NP, the grammar sometimes fails to differentiate between an NP realizing a complement function and NPs realizing other functions. This leads to more ambiguity, that is, to more parses, although the correct analysis is among them. However, this affix limits the application of the complement rules to nominal heads that are marked for transitivity, and the problem is therefore localized.

The formal grammar also includes some semantic features to support the application of the syntactic rules. The affix ‘humanity’ is crucial in controlling gender agreement between different elements of the NP. Without this affix, it is impossible to establish gender agreement between the nominal head and an adjective postmodifier, since there is no gender agreement in the case of nonhuman heads, as we see in (9) below:

(9) a- ’alrijaalu ’al’aqweyaa’u ‘The strong men’
       NHEAD (HUMAN-MASC) + ADJPOM (MASC)
    b- ’alše’unu ’alseyaseyyatu ‘The political matters’
       NHEAD (NONHUMAN-MASC) + ADJPOM (FEM)
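The role of ‘humanity’ in (9) can be made concrete with a small Python sketch. It is a simplification under stated assumptions: gender agreement between head and adjective postmodifier is enforced only for HUMAN heads, and any other ‘humanity’ value is left unconstrained. The function name `adjpom_gender_agrees` is hypothetical, and the actual grammar expresses this through affix values in its syntax rules rather than through a procedure.

```python
def adjpom_gender_agrees(head_humanity, head_gender, adj_gender):
    """Check head/ADJPOM gender compatibility, conditioned on 'humanity'."""
    if head_humanity == "HUMAN":
        # Human heads: the adjective postmodifier must share the head's gender.
        return head_gender == adj_gender
    # Nonhuman heads (and, by assumption, any other value): no gender
    # agreement is required, as in example (9b).
    return True
```

With this check, the combination in (9a) (HUMAN, MASC head with MASC adjective) and the one in (9b) (NONHUMAN, MASC head with FEM adjective) are both accepted, whereas a HUMAN, MASC head with a FEM adjective is rejected.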

Another semantic affix is used to express the semantic function of some particles needed for the description of the NP at a lower linguistic level. At this stage we restrict ourselves to prepositions. The affix ‘function’ is used to express this kind of relation. It has also been shown that a detailed description of the ‘noun’ in MSA guides the grammar to attach the PPOM to the appropriate head, either the nearest or the main nominal head. In (10) the PPOM ‘biAsyut’ is attached to the main head ‘tafjiru’ (a verbal noun); otherwise the grammar would be free to attach the PPOM to the nearest noun (NHEAD) of the immediate constituent.

(10) tafjiru ba’Di ’al munša’ati ’alHukumeyyati biAsyut
     ‘Exploding some of the governmental establishments in Asyut’

4.1.3 Lexical rules

Non-terminal categories that carry affix names but are not further rewritten as function elements can be rewritten as lexical items, since the values of the associated affixes have already been specified in the meta affixes. This rewriting of the categories in final terms yields true alternatives, or realizations, which differ only at the affix level. The lexical rule given for defining the noun in our grammar is:

(11) NOUN(subclass, number, gender, person, transitivity, humanity).

When the value of ‘subclass’ is defined, a given set of nouns is specified. For example, when the subclass is instantiated as a pronoun (NPRO), only the subclass of pronouns is defined. The affixes ‘number’, ‘gender’ and ‘person’ then determine which pronoun is needed. The following lexical items are examples of entries that are defined by instantiating the affix values:

NOUN(NPRO, SING, BOTH, FIRST, EMPTY, HUMAN): "[Arabic text]".
NOUN(NPRO, BPLU, MASC, SECOND, EMPTY, HUMAN): "[Arabic text]".
NOUN(NPRO, SING, FEM, THIRD, EMPTY, humanity): "[Arabic text]".
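The way instantiated affix values select lexical entries can be sketched in Python as a lookup over a small table. This is an illustrative model only: the class `Noun`, the function `lookup` and the placeholder forms such as `<pronoun-1>` are hypothetical (the word forms themselves are deliberately left as placeholders), and an unspecified affix such as the free ‘humanity’ in the third entry is modeled as `None`, which matches any requested value.

```python
from typing import NamedTuple, Optional


class Noun(NamedTuple):
    subclass: str
    number: str
    gender: str
    person: str
    transitivity: str
    humanity: Optional[str]  # None models an affix left uninstantiated in the entry
    form: str                # placeholder for the word form


LEXICON = [
    Noun("NPRO", "SING", "BOTH", "FIRST", "EMPTY", "HUMAN", "<pronoun-1>"),
    Noun("NPRO", "BPLU", "MASC", "SECOND", "EMPTY", "HUMAN", "<pronoun-2>"),
    Noun("NPRO", "SING", "FEM", "THIRD", "EMPTY", None, "<pronoun-3>"),
]


def lookup(**constraints):
    """Return the forms of all entries compatible with the given affix values."""
    return [
        entry.form
        for entry in LEXICON
        if all(getattr(entry, name) in (None, value)
               for name, value in constraints.items())
    ]
```

Calling `lookup(subclass="NPRO", person="FIRST")` narrows the lexicon to the single first-person entry, mirroring how ‘number’, ‘gender’ and ‘person’ determine which pronoun is selected once ‘subclass’ is fixed.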

4.1.4 Filtering rules

Two types of rules are used to filter out alternatives that are irrelevant to the string being analyzed: predicate rules and what we call “word-class filtering rules”. Predicate rules are part of the Extended Affix Grammar (EAG) formalism, while word-class filtering rules are linguistic filters by which we study the relation between a slot and the subclass of the word that can fill that slot.

a) Predicate rules

The formal definition of this component of EAGs can be found in the references given in section 4.1. Here we limit ourselves to a very informal definition of ‘predicate’ (as given by Ditters 1992) that serves our purposes. A predicate is an explicit listing of the affix values instantiated in a given rewrite rule in order to examine the relation between meta affixes. We illustrate this informal definition with the predicate rule that tests the relation between the meta affixes responsible for definiteness. The rule in (5) is used to control the definiteness relation between the elements of the NP in order to transfer the definiteness value from the POD to the whole constituent. To this end, all possible instantiations of the variables ‘defness’ and ‘defness1’ are listed, as in (12) below:

(12) a) DEFNESS HERE IS(INDEF, INDEF):.
     b) DEFNESS HERE IS(DEF BY POD, DEF BY PREDET):.
     c) DEFNESS HERE IS(DEF BY POD, DEF BY POD):.
     d) DEFNESS HERE IS(DEF BY POD, DEF BY ITSELF):.
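Because a predicate is an explicit listing of admissible value pairs, it can be modeled as a finite relation. The Python sketch below encodes the four instantiations in (12) as a set of pairs; the helper `np_defness_from_pod`, which reads the relation from the POD value to the value of the whole NP, is a hypothetical convenience and not part of the grammar.

```python
# The four admissible (whole NP, POD) definiteness pairs listed in (12).
DEFNESS_HERE_IS = {
    ("INDEF", "INDEF"),
    ("DEF BY POD", "DEF BY PREDET"),
    ("DEF BY POD", "DEF BY POD"),
    ("DEF BY POD", "DEF BY ITSELF"),
}


def defness_here_is(np_defness, pod_defness):
    """The predicate succeeds only for explicitly listed value pairs."""
    return (np_defness, pod_defness) in DEFNESS_HERE_IS


def np_defness_from_pod(pod_defness):
    """Values for the whole NP that are compatible with a given POD value."""
    return sorted(whole for whole, pod in DEFNESS_HERE_IS if pod == pod_defness)
```

Any pair not listed, for example an INDEF whole NP over a definite POD, makes the predicate fail, so the corresponding parse alternative is filtered out.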

The first affix value inside the brackets is designed to be transferred to the whole constituent once the second affix value has been instantiated by the POD function at a lower linguistic level. Let us take a concrete example to show how predicate rules work.

(13) [Arabic text] ‘Ending the violence works’
     (Rule 12c) DEF BY POD; (Rule 12c) DEF BY POD; (Syntax rule) DEF BY PREDET

b) Word-class filtering rules

We believe that not every noun fits in every noun position. Even if the same (traditional) noun subclass can fit in more than one position, given that it is in any case a noun, it should be assigned (computationally) different subclasses in relation to the context. If this is correct, the rules describing the syntax of the language should reflect this hypothesis by listing the subclasses that may fill each functional slot. This in turn guides the disambiguation process and gives a more accurate view of how words are actually used in the language. The usefulness of this subclassification points to an urgent need for a linguistic subclassification of words for computational purposes.

So far we have discussed a first cycle of rewrite rules in terms of functions and categories. A number of categories have been rewritten as terminal values (lexical items). Other categories can be further described in terms of functions and categories in a second cycle (at a lower linguistic level). The remaining categories may now figure as left-hand sides; they are again rewritten as combinations of functions at the second functional layer.

4.2 Rewrite cycles at a lower linguistic level

In the first cycle, the head may be followed by other elements: POD or POM (i.e. NPOM, ADJPOM, PPOM or ADVPOM). The possible accumulation of different functions inside the NP has been accounted for. In this cycle, the functional elements occur on the right-hand side of the context-free rules. In this section of the grammar, the rewriting process deals with these functions at a lower linguistic level: a second cycle starts with the functional elements on the left-hand side, to be rewritten in terms of categories. We limit ourselves in this section to the head function and the postdeterminer function.

a) The nominal head

The second layer starts by defining the head on the left-hand side, as in (14) below:

(14) NHEAD(subclass, number, gender, person, transitivity, humanity):
       NOUN(subclass, number, gender, person, transitivity, humanity).

Here it is stated that the head function can be realized by a noun of a given subclass, according to the instantiation of the affix non-terminal ‘subclass’.

b) The postdeterminer function

A postdeterminer may follow the head in two ways: either as a part of it (suffixed to it, as in the case of pronouns) or following it separately. The rules in (15) state that the category NP occurs in the postdeterminer function. The head of this NP must be filtered to fit this functional slot. For example, it makes no sense to test a subclass that is meant to fill the head of a postmodifier against a postdeterminer slot. The rules below describe the postdeterminer function in MSA:

(15) a) POD(defness, number, gender, person, case):
          POD-NP-FILTER(defness, number, gender, person, case);
          COORDINATED NPs(defness, subclass, number, gender, person, case).
     b) POD-NP-FILTER(DEF BY PREDET, number, gender, THIRD, GEN):
          PREDET, NHEAD(notnph, number, gender, THIRD, transitivity, humanity1);
          PRE XTENSION(NM, number, gender, humanity), PREDET, NHEAD(notnph, number1, gender1, THIRD, transitivity, humanity1);
          PREDET, NHEAD(notnph, number, gender, THIRD, transitivity, humanity), CONFIRMATION(DEF BY POD, number1, gender1, THIRD, GEN);
          PREDET, NHEAD(notnph, number, gender, THIRD, transitivity, humanity), ADJPOM(DEF BY PREDET, number1, gender1, person, GEN), [ADJPOM(DEF BY PREDET, number1, gender1, person, GEN)].

In (15a) the postdeterminer function is rewritten in the form of an NP, which is either a simple NP or a coordinated NP. In (15b) a categorial filter is introduced to select the subclasses of the head that can occur in this NP. The same kind of word-class filter has proved efficient in controlling coordination (in the next cycle), since only similar constituents can be coordinated. Since the postdeterminer function is normally realized by an NP, the grammar returns to the first-cycle rules to account for the structure in terms of this category. With other functions, such as ADJPOM, the grammar enters a deeper cycle to account for the categorial realization of the function concerned.
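A word-class filter of the kind used in (15b) can be sketched in Python as a mapping from functional slots to admissible subclass groups. The sketch rests on several assumptions: only a handful of the subclass codes from the affix rules are included, the groups ‘notnph’ and ‘notndj’ are approximated as simple complements (every listed subclass except NPH or NDJ respectively), and the slot names and the function `can_fill` are hypothetical.

```python
# A small subset of the noun subclass codes from the affix rules.
SUBCLASSES = {"NNC", "NNP", "NDS", "NDJ", "NM", "NPH", "NCONF", "NPRO"}

# Assumption: each "not..." group is modeled as a complement set.
GROUPS = {
    "notnph": SUBCLASSES - {"NPH"},
    "notndj": SUBCLASSES - {"NDJ"},
}

# Which group filters the head in which slot (slot names are illustrative).
SLOT_FILTERS = {
    "POD-NP-FILTER head": "notnph",   # as in rule (15b)
    "NP head in rule (4)": "notndj",  # as in rule (4)
}


def can_fill(slot, subclass):
    """Return True if a noun of this subclass may head the given slot."""
    return subclass in GROUPS[SLOT_FILTERS[slot]]
```

In this model a head of subclass NPH is never tried in the postdeterminer slot, which is how the filter removes undesired alternatives before they multiply in later cycles.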

5 An output sample

Elaborated according to what has been surveyed, the grammar can successfully handle very complicated NPs. In (16) below we see a sample of moderate complexity.

(16) [Figure: parse tree of the sample NP. Recoverable node labels: S, NP, Nhead, Complement, POD, Coord, CONJ, ADJP, ADJhead, ADVP, ADVhead, Pre, Modifier. Recoverable word-class labels: V. Noun, NDS, Comm. N., Nesba.]

6 Using the formalized tool to compare spoken and written MSA corpora

The formalized tool is used to compare spoken MSA with written MSA at the NP level. The parsed corpora were converted into searchable relational linguistic databases using the AGFL Parse-Tree reader, a hand-built interface programmed in Visual Basic v. 5.00. The interface stores all the information obtained from the parser for each NP, including structure, relations, number of levels and affix values, in addition to the number of the alternative executed to produce each layer in the parse tree of each NP. We deliberately kept a record of every piece of information, even if it is not used for the time being, since this will be important for future use of the database.

The comparison focused on three points: first, the overall structure of the NP in spoken and written MSA; second, the structure of each functional element at a lower linguistic level; third, the distribution of noun subclasses within different functions. The frequency of occurrence is included with each of these points. The results revealed striking and reliable differences between the two varieties of Arabic text data, which led to the conclusion that grammars and tools implemented for written language applications cannot be used for spoken language applications, and vice versa. Using tools built for one variety with the other will hinder the systems for both varieties. For all details of the comparison cf. Al-Ansary (2001).
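The kind of frequency comparison described above can be sketched with Python's built-in `sqlite3` module. This is not the original Visual Basic interface: the table layout, the column names and the function `structure_frequencies` are hypothetical, and the rows used in the usage note below are toy placeholders, not results from the corpora.

```python
import sqlite3


def build_database(rows):
    """Create an in-memory database with one row per parsed NP."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE np ("
        " id INTEGER PRIMARY KEY,"
        " corpus TEXT,"          # 'spoken' or 'written'
        " structure TEXT,"       # functional structure of the NP
        " head_subclass TEXT,"   # noun subclass of the head
        " defness TEXT)"         # definiteness value of the NP
    )
    conn.executemany(
        "INSERT INTO np (corpus, structure, head_subclass, defness)"
        " VALUES (?, ?, ?, ?)",
        rows,
    )
    return conn


def structure_frequencies(conn):
    """Count how often each NP structure occurs in each corpus."""
    cursor = conn.execute(
        "SELECT corpus, structure, COUNT(*) FROM np"
        " GROUP BY corpus, structure ORDER BY corpus, structure"
    )
    return cursor.fetchall()
```

A usage example with made-up rows: after `conn = build_database([("written", "HEAD POD", "NNC", "DEF BY POD"), ("spoken", "HEAD", "NPRO", "DEF BY ITSELF")])`, calling `structure_frequencies(conn)` returns one count per corpus and structure, which is the shape of data needed to compare the two varieties.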

7 Conclusion

This paper dealt with building a formal tool to compare the syntactic structures of spoken and written MSA corpora. The immediate constituent approach to grammatical analysis has been shown to be very powerful in describing Arabic syntactic structures when combined with alternating function and category layers until the terminal symbols are accounted for. The study raised an important issue concerning the significant role played by the parts of speech in the syntactic analysis of Arabic. The detailed subclassification helps in filtering the terminal categories according to the realized function. This in turn can be applied to the automatic elimination of undesired parses and to the development of automatic rule-based POS taggers that assign a word a tag according to its syntactic context. AGFL proved to be efficient for the description because its two-level power allows restrictions to be imposed inside the syntax rules.

References

Al-Ansary, S. (2001). NP Structure-Types in Spoken and Written Modern Standard Arabic (MSA) Corpora: A Formal-Based Approach. Paper presented at the 15th ALS Annual Symposium on Arabic Linguistics, 2-3 March 2001, Salt Lake City, Utah, the United States.

Aoun, Youssef (1979). Parts of Speech: a case of redistribution, in Analyses Theories 2, 1-25.

Ayoub, Georgine (1981). Structure de la phrase verbale en arabe standard. Universite de Paris VII. Analyses Theories 1.

Cantarino, Vicente (1974). Syntax of Modern Arabic Prose. 3 vols. Bloomington: Indiana University Press.

Ditters, W.E. (1992). A Formal Approach to Arabic Syntax: The Noun Phrase and Verb Phrase. Ph.D., Nijmegen University.

---- (2000). Basic Structures of Modern Standard Arabic Syntax in Terms of Function and Categories. In Proceedings of the International Conference on Artificial and Computational Intelligence for Decision Control and Automation in Engineering and Industrial Applications. Natural Language Processing panel, pp. 83-88. 22-24 March 2000, Monastir, Tunisia.

---- (2001). Distinct(ive) Sentence Functions in Descriptive Arabic Linguistics. Paper presented at the 15th ALS Annual Symposium on Arabic Linguistics, 2-3 March 2001, Salt Lake City, Utah, the United States.

Edwards, Malcolm (1983). Relative clauses in Egyptian Arabic. Paper presented to the Autumn Meeting of the Linguistics Association of Great Britain, University of Newcastle, September 21-23, 1983.

El-Kareh, S. and Al-Ansary, S. (2000). An Interactive Multi-Features POS Tagger. In Proceedings of the International Conference on Artificial and Computational Intelligence for Decision Control and Automation in Engineering and Industrial Applications. Natural Language Processing panel, pp. 83-88. 22-24 March 2000, Monastir, Tunisia.

Fassi Fehri, Abdelkader (1985). ’allisaniyyat wa ’allughatu ’alcarabiyya, in Proceedings of the 2nd Conference on Arabic Computational Linguistics. Kuwait, 295-/1-14.

Koster, C.H.A. (1991). Affix Grammars for Natural Languages. In: Attribute Grammars, Applications and Systems, International Summer School SAGA, Prague, Czechoslovakia, June 1991. Lecture Notes in Computer Science, volume 545. Springer-Verlag.

Motawakkil, Ahmed (1985). Topics in Arabic: Towards a Functional Analysis, in Bolkestein, A.C. de Groot and Mackenzie (eds.): Syntax and Pragmatics in Functional Grammar. Dordrecht: Foris Publications.

Owens, Jonathan (1988). The Foundations of Grammar: An Introduction to Medieval Arabic Grammatical Theory. John Benjamins Publishing Company, Amsterdam.
