1st International Conference on Computer & Information Engineering, 26-27 November, 2015
Organizer: Dept. of CSE, Rajshahi University of Engineering & Technology, Rajshahi, Bangladesh
An Empirical Framework for Parsing Bangla Assertive, Interrogative and Imperative Sentences
Mohammed Safayet Arefin, Lamia Alam, Shayla Sharmin, Mohammed Moshiul Hoque
Dept. of Computer Science & Engineering, Chittagong University of Engineering & Technology, Chittagong, Bangladesh
Email: [email protected]
Abstract- To interpret language, we need to determine a sentence's structure. To do this, we must know the rules by which the sentences of a language are organized and have an algorithm to analyze sentences given those rules. Parsing serves in language to combine the meaning of words and phrases. Parsing a sentence then involves finding a possible legal structure for the sentence. This paper proposes a set of context-sensitive grammars (CSGs) to parse Bangla sentences, including assertive, interrogative and imperative sentences. Experimental results reveal that the proposed framework can parse Bangla sentences with over 80% accuracy.
Index Terms- Natural language processing, context-sensitive grammars, lexicon, and parse tree.

I. INTRODUCTION
Parsing is the most important part of natural language processing for Bangla. To analyze language, we must have a good idea of sentence structures. To do this, we must know the rules by which a language is organized and have an algorithm to analyze language given those rules. By analyzing the structure of a sentence, we can find out how its words relate to each other. The result of parsing is usually a parse tree or structural representation [2]. Machine translation from Bangla to other languages requires parsing. Most previous systems have used CFGs to parse Bangla sentences into English. However, CFGs are not sufficient to parse all types of sentences, and hence to translate them [1, 6]. To parse different kinds of Bangla sentences, we have to use CSGs because of their ability to handle agreement between subject and verb and person class [6, 8]. For example, the sentence "�f.j;- � �" cannot be parsed by a CFG.
Most previous parsing systems parse Bangla sentences into English by structure (i.e., simple, complex, and compound) rather than by function or the purpose of the speaker. Bangla sentences may be classified according to the purpose of the speaker or writer into five categories, namely assertive, interrogative, imperative, optative, and exclamatory sentences. The main contribution of this work is to develop a parser that can parse three types of Bangla sentences (assertive, interrogative, and imperative) using a set of CSG rules. The parser output is given as a list in the paper. We have verified the system with several types of examples and found that its performance is satisfactory.

II. PREVIOUS WORK
Parsing of Bangla sentences is still at a rudimentary stage. A method to translate Bangla sentences into English using context-sensitive grammar rules, which accepts assertive, interrogative and imperative Bangla sentences, is implemented in [1]; that work emphasizes machine translation rather than parsing. A parsing technique for simple sentences was implemented using a set of CFG rules in [2]. A comprehensive approach using CFG rules to parse all types of sentences, including complex, compound, exclamatory and optative sentences, was shown in [4]. Anwar et al. developed a technique to parse Bangla sentences using context-sensitive grammar rules that accept almost all types of Bangla sentences, including simple, compound and complex sentences [5]. It also describes how to decompose a complex sentence into dependent and independent clauses and a compound sentence into simple sentences. In addition, [3] analyzes the syntax of various types of Bangla sentences and designs transformational generative grammar (TGG) rules for them. A detailed explanation of Bangla phrases and different types of sentences using TGG was given in [7].

III. PROPOSED PARSER MODULE
The schematic representation of the proposed Bangla natural language parser module is illustrated in Fig. 1. The module is described in detail in the following subsections.
[Figure: block diagram with the labels "Source Language Sentence (Bangla)", "Rule Generator" and "Source Language Structure (Parse tree)"]
Fig. 1. Proposed Bangla natural language parser
978-1-4673-8343-1/15/$31.00 © 2015 IEEE
A. Input sentence
Bangla sentences are taken as input for the parsing framework. In this system, only assertive, interrogative and imperative Bangla sentences, together with their negative forms, are considered as input.
B. Tokenizer
The tokenizer is the program module that accepts a sentence to be parsed as an unbroken string and breaks it into individual words called tokens. Tokens are stored in a list for further access. Each token is then checked against the lexicon for validity. Some words, if necessary, should be combined into groups because two or more words may represent a single word type [1]. For example, for the input sentence "3E!E � �Q:'"-;;f", the output of the tokenizer can be represented as:
Output = (C�' EI:� : �.�,:!,.::l "I.!l" '�')
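The tokenizer steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `tokenize`, the entries of `LEXICON`, the two-word name "হাসান আলী" and the way "?" is split off are all hypothetical choices made for the example.

```python
# Hypothetical lexicon: token -> feature list (illustrative entries only).
LEXICON = {
    "হাসান": ["N"],
    "হাসান আলী": ["N"],   # assumed two-word group acting as one noun
    "নদী": ["N"],
    "ভালবাসে": ["V"],
    "না": ["ind"],
    "?": ["IM"],
}

SAMPLE = "হাসান নদী ভালবাসে না"


def tokenize(sentence, lexicon, max_group=2):
    """Split a sentence into tokens, merging word groups found in the lexicon.

    Returns (tokens, unknown), where `unknown` lists tokens that are
    missing from the lexicon.
    """
    # Separate the interrogative marker from the preceding word.
    words = sentence.replace("?", " ? ").split()
    tokens, unknown = [], []
    i = 0
    while i < len(words):
        # Try the longest candidate group first, then fall back to one word.
        for size in range(min(max_group, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + size])
            if candidate in lexicon or size == 1:
                tokens.append(candidate)
                if candidate not in lexicon:
                    unknown.append(candidate)
                i += size
                break
    return tokens, unknown


if __name__ == "__main__":
    print(tokenize(SAMPLE, LEXICON))
```

Trying the longest group first means a multi-word lexicon entry wins over its individual words, which matches the remark that two or more words may represent a single word type.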
C. Lexicon
The typical entries in the lexicon used in our system are shown in Table 1.

TABLE 1: TYPICAL ENTRIES IN LEXICON
Bangla    Features
�         [PR, Per1]
�         [N, Per2]
�         [PR, Per3]
�         [IW]
�         [N]
\Of       [ind]
না        [ind]
�         [V]
?         [IM]
[Abbreviations: PR: Pronoun, N: Noun, Per1: First person, Per2: Second person, Per3: Third person, IW: Interrogative word, IM: Interrogative marker, ind: Indeclinable, V: Verb]

D. Rule Generator
In this paper, Bangla CSGs are used for the different kinds of Bangla sentences discussed in the following subsections. The list of Bangla CSG rules used to parse the sentences is given in Table 2. While parsing, "Null" is used whenever a token on the right side of a rule is not needed to parse the sentence.

TABLE 2: BANGLA CSG RULES
Rule No   Bangla CSG rule
5         NP → (Qntfr) (PP) N | PN
6         NP → N (Biv) (Adj)
7         NP → N | PN IW
8         NP → Null
9         VP → (NP) VF
10        VP → (NP) VF IM
11        VF → V (Con) (Aux) (ind)
12        N → �, �, �, �, ,¥!, 1SR'if'l, ...
13        PN → �, �, �, ...
14        V → 9fi!', C"-1"!, C'11f, �, 'It, ...
15        Adj → '5l'1, �, ...
16        Biv → n, "I, ...
17        Con → C'I', ...
18        Aux → "ffClI, 'l'l:l!, ...
19-22     �, ...; �, ...; CIO, "I, �, ...; r, ...
[Abbreviations: S: Sentence, AS: Assertive sentence, IRS: Interrogative sentence, IS: Imperative sentence, NP: Noun phrase, N: Noun, PN: Pronoun, VP: Verb phrase, VF: Verb form, V: Verb, Qntfr: Quantifier, PP: Preposition, Biv: Bivokti (inflection), Adj: Adjective, Con: Concord, Aux: Auxiliary, ind: Indeclinable, IW: Interrogative word, IM: Interrogative marker]
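The role of "Null" for optional right-hand-side symbols can be made concrete with a short Python sketch. It stores the phrase-level rules 5-7 and 9-11 of Table 2 as data and enumerates every concrete right-hand side, replacing an unused optional symbol with "Null". The names `RULES` and `expand` and the list encoding of the rules are assumptions made for illustration, not part of the described system.

```python
from itertools import product

# Phrase-level rules from Table 2, keyed by rule number.
# "(X)" marks an optional symbol, "A|B" marks alternatives.
RULES = {
    5: ("NP", ["(Qntfr)", "(PP)", "N|PN"]),
    6: ("NP", ["N", "(Biv)", "(Adj)"]),
    7: ("NP", ["N|PN", "IW"]),
    9: ("VP", ["(NP)", "VF"]),
    10: ("VP", ["(NP)", "VF", "IM"]),
    11: ("VF", ["V", "(Con)", "(Aux)", "(ind)"]),
}


def expand(rhs):
    """List every concrete form of a right-hand side.

    An optional symbol is either kept or written as "Null";
    an alternative "A|B" contributes one form per option.
    """
    choices = []
    for symbol in rhs:
        if symbol.startswith("(") and symbol.endswith(")"):
            choices.append([symbol[1:-1], "Null"])
        else:
            choices.append(symbol.split("|"))
    return [list(form) for form in product(*choices)]


if __name__ == "__main__":
    for number, (lhs, rhs) in sorted(RULES.items()):
        for form in expand(rhs):
            print(number, lhs, "->", " ".join(form))
```

For rule 9, for instance, this yields the two forms `NP VF` and `Null VF`; the second is the form used when the verb phrase has no object.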
S → AS [Rule no: 1]
→ NP VP [Rule no: 2]
→ (Qntfr) (PP) N VP [Rule no: 5]
→ Null (PP) N VP
→ Null Null N VP
→ Null Null হাসান VP [Rule no: 12]
→ Null Null হাসান (NP) VF [Rule no: 9]
→ Null Null হাসান N (Biv) (Adj) VF [Rule no: 6]
→ Null Null হাসান নদী (Biv) (Adj) VF [Rule no: 12]
→ Null Null হাসান নদী Null (Adj) VF
→ Null Null হাসান নদী Null Null VF
→ Null Null হাসান নদী Null Null V (Con) (Aux) (ind) [Rule no: 11]
→ Null Null হাসান নদী Null Null ভালবাস (Con) (Aux) (ind) [Rule no: 14]
→ Null Null হাসান নদী Null Null ভালবাস এ (Aux) (ind) [Rule no: 17]
→ Null Null হাসান নদী Null Null ভালবাস এ Null (ind)
→ Null Null হাসান নদী Null Null ভালবাস এ Null না [Rule no: 19]
Structural representation (SR) of this assertive sentence is:
Fig. 2. Structural representation of "হাসান নদী ভালবাসে না"

2) CSG for Interrogative Sentences
In a Bengali interrogative sentence, we always find an interrogative word (IW) and an interrogative marker (IM) at the end of the sentence. Negative-interrogative sentences combine the negative and interrogative rules. In this type of Bangla sentence, we find the indeclinable "না" as well as the IW and IM in the same sentence.
As an example, we can consider the Bangla sentence "� f.j5- � �

3) CSG for Imperative Sentences
In imperative sentences, if the subject is second person, then the subject may remain hidden. Negative-imperative sentences are the same as normal imperative sentences except for the indeclinable "না" [1]. Let us consider the negative-imperative Bangla sentence "�0f'3 না". There is no noun or pronoun in the first noun phrase. There is no use of the second person in this type of sentence. To get the parse tree, we have used the CSG rules from Table 2.
S → IS [Rule no: 1]
→ NP VP [Rule no: 4]
→ Null VP
→ Null (NP) VF [Rule no: 9]
→ Null N (Biv) (Adj) VF [Rule no: 6]
→ Null ¥l (Biv) (Adj) VF [Rule no: 12]
→ Null ¥l '-'1 (Adj) VF [Rule no: 16]
→ Null ¥l '-'1 Null VF
Fig. 5. Parser output of "�� � না"
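The derivations above can be checked mechanically on the level of word categories. The Python sketch below is a simplified recognizer, not the parser described in the paper: it uses only the phrase-level rules of Table 2 plus the rule AS → NP VP taken from the assertive derivation, works on category tags rather than Bangla words, and does not model the context-sensitive agreement between subject and verb. The names `GRAMMAR`, `ends` and `accepts` are hypothetical.

```python
# Phrase-level rules; "(X)" is optional (may be Null), "A|B" is a choice.
# An empty right-hand side encodes rule 8, NP -> Null.
GRAMMAR = {
    "AS": [["NP", "VP"]],
    "NP": [
        ["(Qntfr)", "(PP)", "N|PN"],
        ["N", "(Biv)", "(Adj)"],
        ["N|PN", "IW"],
        [],
    ],
    "VP": [["(NP)", "VF"]],
    "VF": [["V", "(Con)", "(Aux)", "(ind)"]],
}


def ends(symbol, tags, pos):
    """Return every position where `symbol` can end if it starts at `pos`."""
    if symbol.startswith("("):
        # Optional symbol: either skipped (Null) or matched.
        return {pos} | ends(symbol[1:-1], tags, pos)
    if "|" in symbol:
        result = set()
        for alternative in symbol.split("|"):
            result |= ends(alternative, tags, pos)
        return result
    if symbol not in GRAMMAR:
        # Word category: must equal the tag at this position.
        if pos < len(tags) and tags[pos] == symbol:
            return {pos + 1}
        return set()
    result = set()
    for rhs in GRAMMAR[symbol]:
        positions = {pos}
        for part in rhs:
            positions = {e for p in positions for e in ends(part, tags, p)}
        result |= positions
    return result


def accepts(tags, start="AS"):
    """True if the whole tag sequence can be derived from `start`."""
    return len(tags) in ends(start, tags, 0)


if __name__ == "__main__":
    # Tags of the tokens হাসান নদী ভালবাস এ না from the assertive derivation.
    print(accepts(["N", "N", "V", "Con", "ind"]))
```

Tracking sets of end positions lets the recognizer try both the "Null" and the non-"Null" reading of each optional symbol without explicit backtracking code.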
REFERENCES
[10] M. Z. Iqbal, Dipu Number Two, 1996.