RULE BASED MACHINE TRANSLATION SYSTEM FOR ENGLISH TO MALAYALAM LANGUAGE
A Thesis
Submitted for the degree of
Master of Science (by research)
in the School of Engineering
By
R. HARSHAWARDHAN
Centre for Excellence in Computational Engineering and Networking Amrita School of Engineering Amrita Vishwa Vidyapeetham Coimbatore - 641112
December, 2011
AMRITA SCHOOL OF ENGINEERING
AMRITA VISHWA VIDYAPEETHAM, COIMBATORE - 641112
BONAFIDE CERTIFICATE
This is to certify that the thesis entitled “RULE BASED MACHINE TRANSLATION SYSTEM FOR ENGLISH TO MALAYALAM LANGUAGE”, submitted by “R. HARSHAWARDHAN” (Reg. No.: CB.EN.M*CEN09002) for the award of the degree of Master of Science (by Research) in the School of Engineering, is a bonafide record of the research work carried out by him under my guidance. He has satisfied all the requirements put forth for the project and has completed all the formalities regarding the same to the fullest of my satisfaction.
Ettimadai, Coimbatore. Date:
Dr.K.P.Soman, Research Guide and Head, CEN.
AMRITA SCHOOL OF ENGINEERING
AMRITA VISHWA VIDYAPEETHAM, COIMBATORE - 641112
Centre for Excellence in Computational Engineering and Networking
DECLARATION
I, R.Harshawardhan (Reg. No.: CB.EN.M*CEN09002), hereby declare that this thesis entitled RULE BASED MACHINE TRANSLATION SYSTEM FOR ENGLISH TO MALAYALAM LANGUAGE is the record of the original work done by me under the guidance of Dr.K.P.Soman, Head, Centre for Excellence in Computational Engineering and Networking, Amrita School of Engineering, Coimbatore and to the best of my knowledge this work has not formed the basis for the award of any degree/ diploma/ associateship/ fellowship or a similar award, to any candidate in any University.
Place: Ettimadai
Date:
Countersigned by
Dr.K.P.Soman, Professor and Head, CEN, Amrita Vishwa Vidyapeetham, Coimbatore.
Signature of the Student
ABSTRACT
A rule-based machine translation system for the English to Malayalam language pair has been developed (Model). The system takes an English sentence as input and parses it with the help of the Stanford Parser. The Stanford Parser is used for four main purposes in the source (English) side processing: parsing, POS tagging, stemming and morphological analysis. An English to Malayalam bilingual dictionary has been created. Several Malayalam font converters were built to convert the collected data into Unicode (UTF-8) format. The Malayalam words of the dictionary are converted into lexicons with the help of linguists. The system takes the parsed output, separates the source text word by word with POS category and searches for the corresponding target words in the bilingual dictionary. The output of this stage is the set of Malayalam words with their POS categories. For named entities, the SVM based English to Malayalam transliterator developed by CEN, Amrita Vishwa Vidyapeetham, is used. Words that are not available in the dictionary are also transliterated. The Malayalam words from the dictionary are romanized with the help of a mapping file, which has been created for English to Malayalam and vice versa. The target word form is synthesized by making use of the morphological information of the English words from the parser. The system processes words through an FST model that has been developed to incorporate Malayalam morphology. Orthographic rules have been written for Malayalam inflections. The nominal and verbal forms of Malayalam are synthesized by the morphological synthesizer. The output of this stage is the morphologically inflected Malayalam words. Reordering rules for the Malayalam language have been written. The transfer rules for reordering the English parse tree with respect to Malayalam yield output in the syntactic pattern of the target language. After the reordering rules are applied, the English sentence is syntactically reordered to suit Malayalam. The reordered output after morphological generation of Malayalam words is displayed as the final output of the machine translation system. The system is freely available online at http://nlp.amrita.edu:8080/Eng2Mal/ .
CONTENTS
Abstract
List of Tables
List of Figures
Acronyms and Abbreviations
CHAPTER 1
INTRODUCTION
1.1 Need for Translation
1.2 Translation Research
CHAPTER 2
LITERATURE SURVEY
2.1 Machine Translation
2.1.1 Related Works
2.2 Statistical Machine Translation
2.2.1 Related Works
2.3 Example Based Machine Translation
2.3.1 Related works
2.4 Rule Based Machine Translation
2.4.1 Related Works
2.5 Morphological Synthesizer and Analyzer
2.5.1 Related Works
2.6 Reordering
2.6.1 Related Works
CHAPTER 3
STUDY ON ORTHOGRAPHIC RULES OF MALAYALAM NOUNS AND VERBS
3.1 Malayalam Morphology
3.2 Orthographic (Sandhi) Rules
3.3 Malayalam Nouns
3.4 Noun Inflections
3.5 Inflections for plural numbers
3.6 Exceptions in plural formation
3.7 Plural forms of pronouns
3.8 Malayalam Case Markers
3.9 Nominative Case Marker
3.10 Accusative Case Marker
3.11 Dative Case Marker
3.12 Sociative Case Marker
3.13 Locative Case Marker
3.14 Instrumental Case Marker
3.15 Genitive Case Marker
3.16 Benefactive Case Marker
3.17 Ablative Case Marker
3.18 Adjectivization
3.19 Adverbalization
3.20 Clitics
3.21 Emphatic particles
3.22 Interrogative particles
3.23 ‘And’ Coordination
3.24 ‘Or’ Coordination
3.25 Malayalam Verbs - Morphology
3.26 Malayalam Verb Base Forms
3.27 Intransitive (akarmaka)
3.28 Transitive (sakarMaka)
3.29 Causative (prayOjaka)
3.30 Tense Forms
3.31 Past Tense (bhUtaM)
3.32 Present Tense (vartamAnam)
3.33 Future Tense (bhAvi)
3.34 Continuous Tense
3.35 Perfect Tense
3.36 Perfect Continuous Tense
3.37 Voice (prayOga)
3.38 Auxiliary Verbs
3.39 Negation
3.40 Question Verbs
3.41 Infinite Verbs
CHAPTER 4
IMPLEMENTATION OF RULE BASED MACHINE TRANSLATION SYSTEM
4.1 English Parser
4.1.1 Introduction
4.1.2 Usage
4.2 English to Malayalam Transliteration
4.2.1 Introduction
4.2.2 Preparation
4.2.3 Usage
4.3 English-Malayalam Bilingual Dictionary
4.3.1 Introduction
4.3.2 Preprocessing
4.3.3 Implementation
4.4 Malayalam Morphological Generator
4.4.1 Introduction
4.4.2 Preparation
4.4.3 Building FST Model
4.4.4 Writing orthographic rules
4.4.5 Working of FST
4.5 Malayalam Morphological Analyzer
4.5.1 Introduction
4.5.2 Working
4.6 Reordering by Transfer Rules
4.6.1 Introduction
4.6.2 Preparation
4.6.3 Implementation
CHAPTER 5
RESULTS
5.1 Results of Malayalam morphological generator and analyzer
5.2 Discussion about results of Malayalam morphological generator and analyzer
5.3 Results of Rule Based Machine Translation System
5.4 Discussion about the results of Rule Based Machine Translation System
5.5 Screenshots
CHAPTER 6
CONCLUSION
6.1 Limitations
6.2 Applications
6.3 Future Work
REFERENCES
APPENDIX - A
A.1. Penn Treebank Tag set for POS category
APPENDIX - B
B.1. Hand coded Reordering Rules for RBMT
APPENDIX - C
C.1. Inflection Markers used in morphology of Malayalam Nouns
C.2. Inflection Markers used in morphology of Malayalam Verbs
APPENDIX - D
D.1. FST State Transition Table modeled for Noun Morphotactics
D.2. FST State Transition Table modeled for Verb Morphotactics
D.3. Orthographic Rules for Malayalam Nouns
D.4. Orthographic Rules for Malayalam Verbs
APPENDIX - E
E.1. Tested sentences of Machine Translation System with Rankings
PUBLICATIONS
LIST OF TABLES
Table.2.1 Some Machine Translation projects in India
Table.3.1 Classification of nouns based on stem ends
Table.3.2 Examples for Plural forms of English and Malayalam nouns
Table.3.3 Exceptions to ‘kaL’
Table.3.4 Exceptions to adding ‘mAr’ as plural marker
Table.3.5 Other exceptions of Plural forms
Table.3.6 Plural forms of pronouns
Table.3.7 Nominative case markers for various stem ends
Table.3.8 Special cases of Past Tense forms for Malayalam Verbs
Table.4.1 Sandhi rules for various stem endings of Malayalam nouns
Table.4.2 Sandhi rules for various stem endings of Malayalam verbs in past tense
Table.5.1 Statistics of morphology for Malayalam nouns
Table.5.2 Statistics of morphology for Malayalam verbs
Table.5.3 Testing results of morph generator for Malayalam nouns
Table.5.4 Testing results of morph generator for Malayalam verbs
Table.5.5 Testing results of morph analyzer for Malayalam nouns
Table.5.6 Testing results of morph analyzer for Malayalam verbs
Table.5.7 Coverage of Malayalam nouns and verbs
Table.5.8 Statistics of Rule Based Machine Translation System
Table.5.9 Testing results for English-Malayalam machine translation system
LIST OF FIGURES
Fig.4.1 Block Diagram of English-Malayalam Rule Based Machine Translation System
Fig.4.2 Sample Parse Tree for the sentence “I am writing a book”
Fig.4.3 First Transition of FST
Fig.4.4 Second Transition of FST
Fig.4.5 Third Transition of FST
Fig.4.6 Final Transition of FST
Fig.4.7 Parse Tree of English Sentence ‘I am eating an apple’
Fig.4.8.a Reordered Parse Tree of English Sentence by executing Rule – 1
Fig.4.8.b Reordered Parse Tree of English Sentence by executing Rule – 2
Fig.4.9 Parse Tree of English Sentence ‘I work hard to finish the work and achieve the goal’
Fig.4.10 Reordered Parse Tree of English Sentence - 2 by executing Rule – 1
Fig.4.11 Reordered Parse Tree of English Sentence - 2 by executing Rule – 2
Fig.4.12 Reordered Parse Tree of English Sentence - 2 by executing Rule – 3
Fig.4.13 Reordered Parse Tree of English Sentence - 2 by executing Rule – 4
Fig.5.1 Screenshot of Machine Translation system
Fig.5.2 Screenshot of online translation system with proper font rendering of Malayalam text
Fig.5.3 Screenshot of Morph Generator for Malayalam nouns
Fig.5.4 Screenshot of Morph Analyzer for Malayalam nouns
Fig.5.5 Screenshot of Morph Generator for Malayalam verbs
Fig.5.6 Screenshot of Morph Analyzer for Malayalam verbs
ACRONYMS AND ABBREVIATIONS
NLP - Natural Language Processing
MT - Machine Translation
SMT - Statistical Machine Translation
RBMT - Rule Based Machine Translation
EBMT - Example Based Machine Translation
POS - Parts of Speech
RE - Regular Expression
FSA - Finite State Automata
FSM - Finite State Machine
FST - Finite State Transducer
CFG - Context Free Grammar
SVM - Support Vector Machines
CHAPTER 1
INTRODUCTION
1.1 Need for Translation
Have you ever imagined a centralized education system for the whole nation? Such a system requires all study materials in local languages. There is nothing wrong in creating such an educational database, working from dawn to dusk, in all the languages of India, but we all know that “Time is Money”. The problem can be tackled more easily by translation. Translations are language specific, however, and manual translation is again time-consuming. Machine translation is the only hope in all these situations. If we could develop a perfect machine translation system, then all information from Kindergarten to PhD could be centralized and learnt in the local languages. Not only could education be made available in local languages, but government policies and official documents could also easily be made local with such translation engines. Developing such a system is not an easy task and involves various processes.
1.2 Translation Research
Translation research is carried out all over the world to build efficient translation systems; however, the basic idea remains the same for all languages. In India, machine translation projects have been carried out at many places for years, but we are still in need of a good translation system. Any basic translation requires two main viewpoints: the linguistic point of view and the mathematical point of view. The development of a machine translation system should therefore be carried out hand in hand by language experts and engineers. Most of the world designs translation systems from English because of its international use, and the data available in English is immense. For machine translation, we need data in both the source and target languages to train the system. The more data available, the higher the accuracy that can be reached. It is worth mentioning here that Google, one of the world’s leading companies, does not have a machine translation system for English to Malayalam as of this writing.
CHAPTER 2
LITERATURE SURVEY
2.1 Machine Translation
Machine Translation is the process of translating sentences from a source language into a target language by the use of computers, with or without human assistance. The idea of building a machine translation system came into existence during the days of the World War, for encoding and decoding [1]. Human translation of any language is highly time consuming and expensive as well. With the advent of machine translation systems, the task of human translators has been greatly reduced. Machine translation systems are built world-wide, and the field has become an active research area of computer science. There are three approaches to machine translation: statistical, example based and rule based. The three classical techniques of machine translation are direct translation, Interlingua based translation and transfer based translation; they are described below along with knowledge-based, corpus-based and hybrid translation.
i. Direct Translation – This is a direct word by word (substitution) translation. No detailed linguistic aspects are followed. This primitive approach is not used in recent times.
ii. Interlingua-based Translation – This technique linguistically analyzes the source text and converts it to an intermediate semantic representation called Interlingua. The advantage of this approach is that the Interlingua can be used for translation into many target languages.
iii. Transfer-based Translation – This technique grammatically analyzes the source text and transfers it to the target grammatical representation by hand written rules. The advantage of this approach is that it is well-suited for domain wise translations.
iv. Knowledge-based Translation – This technique uses a knowledge base (KB) as the source of information for translation. The knowledge base has to be created based on ontology and the semantic web.
v. Corpus-based Translation – This technique uses a parallel corpus of source and target sentences. A huge corpus is required to get better results with this technique.
vi. Hybrid Translation – This technique combines two or more of the techniques discussed above.
2.1.1 Related Works
Some of the MT projects for Indian languages are shown in Table.2.1, from [2].
Table.2.1 Some Machine Translation projects in India
Project | Languages | Domain/Main Application | Approach/Formalism | Strategy
Anglabharati (IIT-K, ER&DCI-N) | Eng-IL (Hindi) | General (Health) | Transfer/Rules | Post-edit
Anusaaraka (IIT-K, UoH) | IL-IL (5IL>Hindi) | General (Children) | LWG mapping/PG | Post-edit
MaTra (NCST) | Eng-IL (Hindi) | General (News) | Transfer/Frames | Pre-edit
Mantra (CDAC) | Eng-IL (Hindi) | Govt. notifications | Transfer/XTAG | Post-edit
UCSG MAT (UoH) | Eng-IL (Kannada) | Govt. circulars | Transfer/UCSG | Post-edit
UNL MT (IIT-B) | Eng/IL (Hindi, Marathi) | General | Interlingual/UNL | Post-edit
Tamil Anusaaraka (AU-KBC) | IL-IL (Tamil-Hindi) | General (Children) | LWG mapping/PG | Post-edit
MAT (JadavpurU) | Eng-IL (Hindi) | News Sentences | Transfer/Rules | Post-edit
Anuvadak (Super Infosoft) | Eng-IL (Hindi) | General | N/A | Post-edit
StatMT (IBM) | Eng-IL | General | Statistical | Post-edit
Apart from these, a consortium of ten Indian universities along with CDAC has been working on machine translation for Indian languages under DIT, India. MHRD, India is also taking several measures to make machine translation serve the people.
2.2 Statistical Machine Translation
According to [3], statistical machine translation (SMT) is a machine translation paradigm in which translations are generated on the basis of statistical models whose parameters are derived from the analysis of bilingual text corpora. SMT is a corpus based approach, where a massive parallel corpus is required for training the SMT systems.
SMT systems are built on two probabilistic models: the language model and the translation model. The advantage of an SMT system is that linguistic knowledge is not required for building it. The difficulty lies in creating a massive parallel corpus. SMT systems work well for machine translation of English to European languages because the word order is almost preserved in such translations. For machine translation of English to Indian languages, the parallel corpora have to be preprocessed (changing the word order) before the SMT system is trained.
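As a compact restatement of how the two models combine, the standard noisy-channel formulation of SMT can be written as follows. This is the textbook form rather than an equation taken from the cited works; the symbols s (source sentence) and t (target sentence) are notation assumed here for illustration.

```latex
\hat{t} \;=\; \arg\max_{t} P(t \mid s) \;=\; \arg\max_{t} \underbrace{P(s \mid t)}_{\text{translation model}} \; \underbrace{P(t)}_{\text{language model}}
```

The translation model scores how well t accounts for the words of s, while the language model scores how fluent t is in the target language.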
2.2.1 Related Works
Many SMT systems are being built nowadays for Indian languages. For Malayalam, [4] reports an SMT system from Amrita – CEN trained with 10,000 sentences using rule based reordering and morphological processing of Malayalam; it achieves BLEU scores of 13.15 for the baseline, 15.6 for baseline and syntax, and 16.1 for baseline, syntax and morphology. [5] reports another SMT system with a reordering approach from Amrita – CEN; trained with 2000 sentences, it achieves BLEU scores of 10.76 for the baseline and 15.6 for baseline and reordering. [6] reports an SMT system from Amrita – CEN that incorporates morphology and syntax and is trained with 1000 sentences; it achieves BLEU scores of 15.9 for the baseline, 19.3 for baseline and syntax, and 24.9 for baseline, syntax and morphology. [7] and [8] report SMT systems for Malayalam trained with 250 sentences, with BLEU scores of 0.48 for the baseline and 0.69 for suffix separated on one set, and 0.22 for the baseline and 0.38 for suffix separated on another set.
2.3 Example Based Machine Translation
Example based machine translation (EBMT) is a corpus based approach without any statistical models. Like SMT systems, example based systems are built from a parallel corpus of example sentences. However, they generally do not learn from the corpus; they store the parallel corpus and use matching algorithms to search and retrieve sentences.
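The store-and-retrieve idea can be illustrated with a minimal sketch. It is not the matching algorithm of any system cited here: the toy memory, the placeholder target strings, the function names and the 0.6 threshold are all assumptions made for illustration, and word-level matching from Python's standard `difflib` module stands in for whatever matching a real system would use.

```python
from difflib import SequenceMatcher

# Hypothetical toy translation memory: (source sentence, stored translation).
# The target strings are placeholders, not real Malayalam translations.
TRANSLATION_MEMORY = [
    ("I am writing a book", "<target sentence 1>"),
    ("I am eating an apple", "<target sentence 2>"),
    ("He is reading a letter", "<target sentence 3>"),
]


def similarity(a: str, b: str) -> float:
    """Word-level similarity ratio in [0, 1] between two sentences."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def retrieve(query: str, memory=TRANSLATION_MEMORY, threshold: float = 0.6):
    """Return (source, target, score) for the closest stored example,
    or None when no stored example reaches the (assumed) threshold."""
    best_source, best_target = max(memory, key=lambda pair: similarity(query, pair[0]))
    score = similarity(query, best_source)
    if score < threshold:
        return None
    return best_source, best_target, score
```

Calling `retrieve("I am eating the apple")` returns the stored pair for "I am eating an apple", since four of the five words align; a query that shares almost no words with the memory returns `None`, which a real system would hand to a fallback such as a human translator.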
The translation memory is one of the example based machine translation systems. Translation memories (TM) are built to aid human translators by serving as an assisting tool for translation. Their advantages are that they are easy to implement and that linguistic knowledge is not required. Translation memories are used not only for translation purposes but also for dictionary search of words, idioms, proverbs, etc.
2.3.1 Related works
EBMT systems are getting implemented for Indian languages. In [9], an EBMT
N
system is built for four Indian languages (Tamil, Malayalam, Kannada and Telugu). With 18,000 sentence rules, the BLEU score of 0.7164 is achieved. Translation memories for
o
Tamil and Malayalam have been built. TM For Tamil in [10] is stored with a parallel corpus of 44833 sentences, 761 idioms and phrases, 1776 proverbs and loaded with a
D
terminology dictionary of 2,13,202 words. TM for Malayalam in [11] is stored with a parallel corpus of 2499 sentences and a dictionary of 11,569 words. The Malayalam grammars are also stored: Paryayam – 176, Arththa Vyathsyam – 316, Samasam – 47, Ashayavipulanam – 8, Santhi – 113, Lingam – 107, Nanartham – 133, Ottapatham – 90, Vipareetham – 279.
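The store-and-match idea behind a translation memory can be sketched as follows. This is a minimal illustration only, not the implementation of any of the cited systems: the two-entry memory, the function name tm_lookup and the 0.8 threshold are assumptions, and the similarity measure is the one provided by Python's standard difflib module.

```python
from difflib import SequenceMatcher

# Hypothetical toy memory: stored source sentence -> stored translation.
TRANSLATION_MEMORY = {
    "I gave her the book": "njAn avaLkk pustakaM koTuttu",
    "She came with a friend": "avaL kUTTukAriyOT kuuTi vannu",
}


def tm_lookup(sentence, memory=TRANSLATION_MEMORY, threshold=0.8):
    """Return (stored_source, translation, score) for the best fuzzy match,
    or None when no stored sentence is similar enough."""
    best = None
    for source, target in memory.items():
        # Character-level similarity in [0, 1]; case is ignored.
        score = SequenceMatcher(None, sentence.lower(), source.lower()).ratio()
        if best is None or score > best[2]:
            best = (source, target, score)
    if best is not None and best[2] >= threshold:
        return best
    return None
```

Nothing is learned from the corpus here: the sentences are only stored and searched, which is the property that distinguishes this approach from SMT.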
2.4 Rule Based Machine Translation
A rule based machine translation system translates the source text into the target text by a set of linguistic rules. Three techniques of machine translation (Direct, Interlingua and Transfer based) are applicable to rule based machine translation. The system is developed with hand coded rules for translation. It requires good linguistic knowledge to write the rules, and a bilingual dictionary is also needed. Other MT systems like SMT and EBMT require a huge parallel corpus for training, which is not readily available for Indian languages. The sources of parallel corpora are the internet and printed texts. Parallel texts are not widely available on the internet, and in multi-lingual text books the alignment of sentences varies widely. On the other hand, rule based systems are well suited for translation from English to Indian languages because a bilingual dictionary can be collected more easily than a parallel corpus, and the rules can be written well with the help of linguists. The rule based system developed in this work follows the transfer based approach with reordering rules. The drawback of a rule based system is that it is confined to its rules, and the rules have to evolve with the language over time.
2.4.1 Related Works
Rule based systems are also being developed for Indian languages. In [12], a rule based machine translation system for noun phrases of the Punjabi language, trained with 2000 phrases, reaches an accuracy of 75%-85%. AnglaMalayalam from CDAC has been developed for Malayalam in the health and tourism domains with 75-80% accuracy [13]. A rule based system from Amrita – CEN [14] has also been developed by utilizing dependencies from a parser, a POS tagger, transfer link rules for reordering and rules for morphology.
2.5 Morphological Synthesizer and Analyzer
Malayalam is a morphologically rich and highly agglutinative language. A morphological synthesizer and analyzer are required for machine translation from English to Malayalam. There are many methods to develop a morphological synthesizer and analyzer. Some of them are discussed below:
i. Paradigm based method
This approach is suitable for inflectionally rich languages. The words are classified into different paradigms based on their morpho-phonemes, and a paradigm table is created. The word to be analyzed or generated is assigned to a particular paradigm, and the corresponding inflections of that paradigm apply to it.
ii. Suffix stripping method
This approach uses a stem dictionary that stores the root words and a suffix dictionary that stores all possible suffixes of nouns and verbs. The morphotactics and orthography can also be stored and used for suffix stripping.
iii. Directed Acyclic Word Graph (DAWG) method
The DAWG data structure is used for both morphological analysis and generation. This approach is language independent: it does not require any morphological rules or any other special linguistic information.
iv. Corpus based method
The corpus based approach needs a large amount of morphologically variant data in order to train the system. The SVM based morphological analyzer and generator belongs to this category. Collecting the data in the format required to train the system is difficult in this approach.
v. Finite State Transducer (FST) based method
FST based morphological analyzers and generators are widely implemented for many languages. FST systems are mainly used in speech recognition and speech processing while building the language models. The morph analyzer and generator can be built in a bidirectional manner using FST.
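As an illustration of the suffix stripping method, the sketch below analyzes a word by removing a known suffix and an optional increment and then checking the remainder against a stem dictionary. The miniature dictionaries, the increment list and the function name analyze are assumptions made for this example; a real analyzer would load complete stem and suffix dictionaries together with morphotactic constraints.

```python
# Hypothetical miniature dictionaries for illustration only.
STEMS = {"amma", "guru", "pUkkaL"}
SUFFIXES = {"e": "ACC", "OT": "SOC", "il": "LOC", "uTe": "GEN"}
# Glides and augments that may sit between a stem and a suffix.
INCREMENTS = ("", "y", "v", "vin")


def analyze(word):
    """Return every (stem, tag) pair obtained by stripping one suffix and
    one optional increment such that the remainder is a known stem."""
    analyses = []
    for suffix, tag in SUFFIXES.items():
        if not word.endswith(suffix):
            continue
        rest = word[: -len(suffix)]
        for inc in INCREMENTS:
            if not rest.endswith(inc):
                continue
            stem = rest[: len(rest) - len(inc)]
            if stem in STEMS:
                analyses.append((stem, tag))
    return analyses
```

With these toy dictionaries, analyze("ammaye") yields the stem amma with the tag ACC, because the suffix e and the glide y are stripped before the dictionary lookup.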
2.5.1 Related Works
Morphological analyzers and generators are being developed for many Indian languages with all the above mentioned approaches. Unsupervised learning of morphological analysis for inflectionally rich Indian languages (Hindi) has been proposed in [15], with a primitive morph coverage of 32-63% and an advanced morph coverage of 96-97%. At Amrita – CEN, a morph analyzer and generator for Tamil has been built with a rule based FST approach [16]. Three approaches for a Malayalam morph analyzer (paradigm, suffix stripping and hybrid) have been compared in [17]. A morph analyzer and generator for Malayalam to Tamil machine translation have also been developed in [18].
2.6 Reordering
Malayalam is a free word-order language. Reordering is essential for machine translation from English to any Indian language. Reordering in rule based systems is different from that in statistical systems. Many levels of reordering are possible, such as word level reordering and phrase level reordering. In word level reordering, the source words are reordered to the target format, or the target words are reordered after translation. In phrase level reordering, the phrases are reordered syntactically. In rule based reordering, the source sentence is reordered into the order of the target language.
Reordering is carried out with the help of parse trees. The parser can be a dependency parser or a Context Free Grammar (CFG) based parser. The reordering rules are written based on the parser used. A pattern based reordering for Tamil [19] has been developed at Amrita – CEN, which is useful for building the rules for Malayalam.
2.6.1 Related Works
The related works on the reordering aspects of English to Malayalam machine translation are discussed in [4], [5], [6] and [14].
CHAPTER 3
STUDY ON ORTHOGRAPHIC RULES OF MALAYALAM NOUNS AND VERBS
3.1 Malayalam Morphology
In linguistics, morphology is the study of word forms. Morphology varies with the language and its word structures. The smallest meaningful unit of morphology is the morpheme. Morphology plays an important role in the inflection of words in a language, and is therefore an important aspect of machine translation. In this chapter, we discuss only those aspects of the morphology of Malayalam nouns and verbs that are considered for writing the orthographic rules used to build the morphological generator for Malayalam. Only a handful of the morphological aspects of Malayalam are considered in our system. Each inflection aspect for which a rule is written is detailed under the corresponding inflection type. The chapter is mainly devoted to the transformation of words by writing rules for each type. The morphology of the Malayalam language is well explained in [20], and we followed that book in building the morphological synthesizer.
3.2 Orthographic (Sandhi) Rules
The orthographic or sandhi rules form the backbone of the language morphology. The Sanskrit word ‘sandhi’ means 'to join together'. Sandhi denotes the phonological changes that occur at morpheme boundaries depending on the grammatical function of the neighboring morphemes. For adjacent words, when the last letter of the previous word and the first letter of the next word come in contact, the two words may be unified into a larger word according to the sandhi rules. There are two types of sandhi: Internal Sandhi and External Sandhi. Internal sandhi involves changes within a word, whereas external sandhi involves changes at word boundaries. Sandhi is further classified into four categories, namely: loopa, aagama, dvitva and aadeesa.
i. Elision (Loopa)
It is the reduction in the duration of a phoneme by omitting the last vowel of the first word at the boundary.
Example: padi + ikk → padi+ikk → padikk
ii. Augmentation (Aagama)
It is the insertion of a new consonant between the two vowels at the boundaries of the adjacent words.
Example: amma+OT → amma+y+OT → ammayOT
iii. Reduplication (Dvitva)
It is the duplication of the consonant that follows the vowel at the word boundary.
Example: pU+kaL → pU+k+kaL → pUkkaL
iv. Substitution (Aadeesa)
It is the substitution of a particular phoneme with another phoneme at the word boundary.
Example: maraM+kaL → maraM+ngng+kaL → marangngaL
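The four sandhi operations can be written as small string functions. The sketch below is only an illustration on the transliterated forms used in the examples; the function names are assumptions, and the substitution function covers only the ‘M’ + ‘k’ → ‘ngng’ case shown above.

```python
def elision(first, second):
    """Loopa: drop the final vowel of the first morpheme, then join."""
    return first[:-1] + second


def augmentation(first, second, glide="y"):
    """Aagama: insert a consonant between the two boundary vowels."""
    return first + glide + second


def reduplication(first, second):
    """Dvitva: double the consonant that follows the boundary vowel."""
    return first + second[0] + second


def substitution(first, second):
    """Aadeesa: final 'M' and initial 'k' are replaced by 'ngng'."""
    return first[:-1] + "ngng" + second[1:]
```

Each function reproduces one of the examples: padi + ikk gives padikk, amma + OT gives ammayOT, pU + kaL gives pUkkaL and maraM + kaL gives marangngaL.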
3.3 Malayalam Nouns
In any language, nouns are words that indicate people, beings, things, places, phenomena, qualities or ideas. Nouns that indicate individual entities, such as names of persons, places or organizations, are called proper nouns. Examples of some English and Malayalam nouns are: Arm-കയ്, Language-ഭാഷ, Cheeks-കവിള്, and Mother-അമ്മ. For generating the morphology of Malayalam nouns, the whole list of available nouns is first categorized into four main categories based on their stem endings (phonology) and morphological changes. The categories are listed in table.3.1 below. Apart from these four categories, we treat human nouns like ‘amma’ and non-human nouns as two separate categories to take care of the two different plural markers (‘mAr’ and ‘kaL’).
Table.3.1 Classification of nouns based on stem ends
Stem Endings               Example
Vowels(a,A,e,E,i,I,o,O)    amma, mAla
Vowels(u,U)                guru, pU
Vowel(aM)                  maraM, varaM
Consonant                  vIT, vAtil
3.4 Noun Inflections
In the present work, three main categories of inflections are considered for nouns. They are: Plural Markers, Case Markers and Clitics.
3.5 Inflections for plural numbers
Plurals are grammatical numbers, typically referring to more than one of the referent in the real world. In English, singular and plural are the only grammatical numbers. Plural and singular play an important role in Malayalam morphology; therefore they need special attention. Some examples are given in table.3.2:
Table.3.2 Examples for Plural forms of English and Malayalam nouns
English Plural          Malayalam Plural
book / books            പുസ്തകം / പുസ്തകങ്ങള്
daughter / daughters    പുത്രി / പുത്രിമാര്
There are two separate plural markers, ‘mAr’ and ‘kaL’, chosen according to whether the singular noun is human or non-human. When singular nouns are changed to their corresponding plural form by adding ‘kaL’, the sandhi operates in different ways based on the end phonemes of the singular nouns. For example:
i. tira+N+PL → tira+kaL → tirakaL (sandhi without change)
ii. guru+N+PL → guru+k+kaL → gurukkaL (sandhi with doubling of ‘k’)
iii. In the third category, for words which end in ‘anuswAram’, adding ‘kaL’ results in the change of ‘M’ and ‘k’ into ‘ngng’.
maraM+N+PL → maraM+ngng+kaL → marangngaL (sandhi with substitution)
iv. In the fourth category, for words which end in a consonant, a new phoneme is inserted between the word and ‘kaL’.
kUT+N+PL → kUT+u+kaL → kUTukaL (Augmentation of ‘u’)
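The plural rules above can be sketched as a function that first determines the stem-end category and then applies the matching sandhi. This is a minimal sketch under stated assumptions: the function names and the human flag are hypothetical, the human/non-human decision is assumed to come from the dictionary, and exceptional plurals are not handled.

```python
def noun_category(stem):
    """Return the stem-end category (1 to 4) described in table 3.1."""
    if stem.endswith("M"):
        return 3  # anuswAram
    if stem[-1] in "uU":
        return 2
    if stem[-1] in "aAeEiIoO":
        return 1
    return 4  # consonant


def pluralize(stem, human=False):
    """Attach 'mAr' to human nouns and 'kaL' (with sandhi) to others."""
    if human:
        return stem + "mAr"
    category = noun_category(stem)
    if category == 1:
        return stem + "kaL"            # no change
    if category == 2:
        return stem + "k" + "kaL"      # doubling of 'k'
    if category == 3:
        return stem[:-1] + "ngngaL"    # 'M' + 'k' -> 'ngng'
    return stem + "u" + "kaL"          # augmentation of 'u'
```

Checking the anuswAram ending before the plain vowel endings keeps maraM out of the vowel categories.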
3.6 Exceptions in plural formation
There are exceptions to the categorization of nouns into human and non-human on the basis of the plural markers ‘mAr’ and ‘kaL’. The following tables 3.3, 3.4 and 3.5 show this:
Table.3.3 Exceptions to ‘kaL’
Singular      Plural
kurangngan    kurangnganmAr
kurukkan      kurukkanmAr
Table.3.4 Exceptions to adding ‘mAr’ as plural marker
Human Singular    Plural
makan/makaL       makkaL
peN               peNungngaL
dATA              dATAkkaL
suhartt           suharttukaL
shatru            shatrukkaL
vakkIl            vakkIlanmAr
Table.3.5 Other exceptions of Plural forms
Singular                  Plural
paNikkAran/paNikkAri      paNikkAr
kalAkAran/kalAkAri        kalAkkAr
kaTankAran/kaTankAri      kaTankAr
vITTukAran/vITTukAri      vITTukAr
toTTakAran/toTTakAri      toTTakAr
addhyApakan/addhyApika    addhyApakar
sahOdaran/sahOdari        sahOdarar
snEhitan/snEhiti          snEhitar
dEvan/dEvi                dEvar
3.7 Plural forms of pronouns
The plural forms of pronouns need different treatment. It is better to consider them as pairs of forms, rather than deriving one form from the other. Some of the words in this category and their inflections are shown in table.3.6 below:
Table.3.6 Plural forms of pronouns
Root Word    Inflected Plural form
atu          ava
njAn         njangngal
nI           ningngal
avan         avaR
avaL         avaR
3.8 Malayalam Case Markers
The bound case affixes denote the syntactic and semantic functions of nouns and are added to the oblique bases of nouns. The addition of case markers is an inflectional process of nouns. The case markers considered for Malayalam are: nominative, accusative, dative, sociative, locative, instrumental and genitive. The changes which occur while adding case suffixes to the oblique noun bases can be captured by sandhi rules. The case markers are added to singular stems or plural stems (N±PL+Case Suffix).
3.9 Nominative Case Marker
In Malayalam, the nominative case form is unmarked. The base form itself functions as the nominative case form. Table.3.7 below shows that the addition of the null marker for the nominative case does not make any change to the nominal forms.
Table.3.7 Nominative case markers for various stem ends
Noun Category              Examples    Nominative    Inflections
Vowels(a,A,e,E,i,I,o,O)    amma        Ф             amma
Vowels(u,U)                guru        Ф             guru
Vowel(aM)                  maraM       Ф             maraM
Consonant                  kUT         Ф             kUT
3.10 Accusative Case Marker
The accusative case marker of Malayalam is ‘e’. The addition of case suffixes to the nominal forms of nouns changes them into oblique forms, with changes due to sandhi. The changes that occur are given below:
i. In the first category of nouns ending in the vowels ‘a,A,e,E,i,I,o or O’, the on glide ‘y’ appears between the base form and the suffix ‘e’.
Eg: amma+ACC → amma+y+e → ammaye
ii. In the second category of nouns ending in the vowels ‘u or U’, the accusative case suffix ‘e’ is added to the nominal base augmented by the inflectional increment ‘vin’.
Eg: guru+ACC → guru+vin+e → guruvine
iii. In the third category of nouns ending in ‘M’ (AnusvAram), ‘M’ is deleted and ‘tt’ is augmented to the base when the suffix ‘e’ is added.
Eg: maraM+ACC → maraM+tt+e → maratte
iv. In the fourth category of nouns ending in consonants, ‘in’ is augmented to the base when the case suffix ‘e’ is added.
Eg: kUT+ACC → kUT+in+e → kUTine
v. The accusative case is added to the plural stems without any change due to sandhi.
Eg: pUkkaL+ACC → pUkkaL+e → pUkkaLe
The accusative form normally indicates the object of the verb. Eg:
English: The child asked about his mother (child mother_ACC about ask_PAST)
Malayalam: kuTTi ammaye paRRi cOdiccu
3.11 Dative Case Marker
The dative case marker for Malayalam is ‘kk’, which alternates with ‘in’.
i. The dative case suffix ‘kk’ is added directly, without change, to the first category of nouns ending in the vowels ‘a,A,i,I,e,E,o or O’. ‘(u)’ occurs with ‘an’ ending nouns; elsewhere ‘kk’ occurs as the dative suffix.
Eg: amma+DAT → amma+kk → ammakk
ii. In the second category of nouns ending in the vowels ‘u or U’, the dative suffix ‘in’ is added to the nominal base augmented by the inflectional increment ‘v’.
Eg: guru+DAT → guru+v+in → guruvin
iii. In the third category of nouns ending in AnusvAram ‘M’, the dative case suffix ‘in’ is added to the nominal base augmented by the inflectional increment ‘tt’, with the deletion of ‘M’.
Eg: maraM+DAT → maraM+tt+in → marattin
iv. In the fourth category of nouns ending in consonants, the dative case suffix ‘in’ is added to the nominal bases.
Eg: kUT+DAT → kUT+in → kUTin
v. The dative case marker ‘kk’ is added to the plural stems without any sandhi changes.
Eg: pUkkaL+DAT → pUkkaL+kk → pUkkaLkk
The dative form usually indicates the indirect object of the verb. Eg:
English Sentence: I gave her the book (I she_DAT book give_PAST)
Malayalam Sentence: njAn avaLkk pustakaM koTuttu
3.12 Sociative Case Marker
The sociative case marker is ‘OT’. The addition of ‘OT’ makes changes to the base nouns depending on the four categories to which they belong.
i. In the first category of nouns, the on glide ‘y’ is added when the sociative case suffix is attached to the nominal base.
Eg: amma+SOC → amma+y+OT → ammayOT
ii. In the second category of nouns ending in ‘u or U’, ‘vin’ is augmented to the base when ‘OT’ is added.
Eg: guru+SOC → guru+vin+OT → guruvinOT
iii. In the third category of nouns ending in ‘M’, ‘ttin’ is augmented to the nominal bases with the deletion of ‘M’.
Eg: maraM+SOC → maraM+ttin+OT → marattinOT
iv. In the fourth category of nouns ending in consonants, ‘in’ is augmented to them when the sociative case suffix ‘OT’ is added.
Eg: kUT+SOC → kUT+in+OT → kUTinOT
v. The sociative case suffix ‘OT’ is added to the plural bases without any sandhi change.
Eg: pUkkaL+SOC → pUkkaL+OT → pUkkaLOT
The sociative case usually denotes the accompanying person. Eg:
English Sentence: She came with a friend. (she friend_SOC along with come_PAST)
Malayalam Sentence: avaL kUTTukAriyOT kuuTi vannu
3.13 Locative Case Marker
The locative case marker of Malayalam is ‘il’. The addition of the locative case suffix to the nominal forms makes changes to the base depending on the four categories to which they belong.
i. In the first category, the on glide ‘y’ is added when ‘il’ is suffixed to the noun.
Eg: amma+LOC → amma+y+il → ammayil
ii. In the second category of nouns, the on glide ‘v’ is added when ‘il’ is suffixed to the base.
Eg: guru+LOC → guru+v+il → guruvil
iii. In the third category of nouns ending in ‘M’, the nominal bases are augmented by ‘tt’ when suffixed with ‘il’.
Eg: maraM+LOC → maraM+tt+il → marattil
iv. In the fourth category of nouns ending in consonants, the suffix ‘il’ is added without any sandhi change (optionally the final consonant may geminate).
Eg: kUT+LOC → kUT+il → kUTTil
v. The locative case suffix ‘il’ is added to the plural nominal forms without any sandhi change.
Eg: pUkkaL+LOC → pUkkaL+il → pUkkaLil
The locative forms usually denote the location concerned with the verb. Eg:
English: No one came from home (house_LOC from anyone come_PAST_NEG)
Malayalam: vITTil ninnu AruM Vannilla
3.14 Instrumental Case Marker
The instrumental case marker is ‘Al’. The sandhi changes and augmentation occur depending on the four categories of the nouns.
i. In the first category, the on glide ‘y’ is added when ‘Al’ is suffixed to the noun.
Eg: amma+INS → amma+y+Al → ammayAl
ii. In the second category of nouns, ‘vin’ is augmented to the base when ‘Al’ is suffixed.
Eg: guru+INS → guru+vin+Al → guruvinAl
iii. In the third category of nouns ending in ‘M’, the nominal bases are augmented by ‘ttin’ when suffixed with ‘Al’.
Eg: maraM+INS → maraM+ttin+Al → marattinAl
iv. In the fourth category of nouns ending in consonants, ‘in’ is augmented to the base when the suffix ‘Al’ is added.
Eg: kUT+INS → kUT+in+Al → kUTinAl
v. The instrumental case suffix ‘Al’ is added to the plural nominal forms without any sandhi change.
Eg: pUkkaL+INS → pUkkaL+Al → pUkkaLAl
The instrumental forms usually indicate the instrumental role of the noun concerned with the verb. Eg:
English Sentence: He stabbed a tiger with a knife (He one tiger_ACC knife_INS stab_PAST)
Malayalam Sentence: avan oru puliye kattiyAl kutti
3.15 Genitive Case Marker
The genitive case marker is ‘uTe’. ‘Re’ occurs after nominal bases or oblique bases ending in ‘n’, whereas ‘uTe’ occurs elsewhere.
i. In the first category, the on glide ‘y’ is added when ‘uTe’ is suffixed to the noun.
Eg: amma+GEN → amma+y+uTe → ammayuTe
ii. In the second category of nouns, ‘vin’ is augmented to the base when ‘Re’ is suffixed.
Eg: guru+GEN → guru+vin+Re → guruvinRe
iii. In the third category of nouns ending in ‘M’, the nominal bases are augmented by ‘ttin’ when suffixed with ‘Re’.
Eg: maraM+GEN → maraM+ttin+Re → marattinRe
iv. In the fourth category of nouns ending in consonants, ‘in’ is augmented to the base when the suffix ‘Re’ is added.
Eg: kUT+GEN → kUT+in+Re → kUTinRe
v. The genitive case suffix ‘uTe’ is added to the plural nominal forms without any sandhi change.
Eg: pUkkaL+GEN → pUkkaL+uTe → pUkkaLuTe
The genitive case suffix links a noun with another noun by possession. Eg:
English Sentence: The book is on the table (Book table+GEN on be-PRES)
Malayalam Sentence: pustakaM mESayuTe mEle unTe
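Since every primary case marker follows the same pattern (stem-end category, inflectional increment, suffix), the rules of sections 3.10 to 3.15 can be held in one table. The sketch below is an illustration only: the table entries are read off the worked examples in those sections, the names CASE_RULES, PLURAL_SUFFIX and inflect are assumptions, and the optional gemination in the locative (kUTTil) and the exceptional forms are not modelled.

```python
def noun_category(stem):
    """Return the stem-end category (1 to 4) described in table 3.1."""
    if stem.endswith("M"):
        return 3
    if stem[-1] in "uU":
        return 2
    if stem[-1] in "aAeEiIoO":
        return 1
    return 4


# case -> category -> (increment, suffix), taken from the worked examples.
CASE_RULES = {
    "ACC": {1: ("y", "e"), 2: ("vin", "e"), 3: ("tt", "e"), 4: ("in", "e")},
    "DAT": {1: ("", "kk"), 2: ("v", "in"), 3: ("tt", "in"), 4: ("", "in")},
    "SOC": {1: ("y", "OT"), 2: ("vin", "OT"), 3: ("ttin", "OT"), 4: ("in", "OT")},
    "LOC": {1: ("y", "il"), 2: ("v", "il"), 3: ("tt", "il"), 4: ("", "il")},
    "INS": {1: ("y", "Al"), 2: ("vin", "Al"), 3: ("ttin", "Al"), 4: ("in", "Al")},
    "GEN": {1: ("y", "uTe"), 2: ("vin", "Re"), 3: ("ttin", "Re"), 4: ("in", "Re")},
}

# Plural stems take the bare suffix without any sandhi change.
PLURAL_SUFFIX = {"ACC": "e", "DAT": "kk", "SOC": "OT",
                 "LOC": "il", "INS": "Al", "GEN": "uTe"}


def inflect(stem, case, plural_stem=False):
    """Attach a case marker to a singular stem or to a plural stem."""
    if plural_stem:
        return stem + PLURAL_SUFFIX[case]
    category = noun_category(stem)
    increment, suffix = CASE_RULES[case][category]
    base = stem[:-1] if category == 3 else stem  # delete final 'M'
    return base + increment + suffix
```

Keeping the rules as data rather than as branches means that a new case marker or a corrected increment only changes one table row.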
3.16 Benefactive Case Marker
The benefactive and ablative case suffixes are secondary case suffixes added to the primary dative and locative case markers respectively. Benefactive is expressed by adding the postpositions ‘aayi’ or ‘vENTi’ to a noun suffixed with the dative case marker. There is no sandhi change when ‘vENTi’ is added to the dative forms. Eg:
English Sentence: She lives for her children (she children_DAT for_BEN live_PRES)
Malayalam Sentence: avaL makkaLkk vEnTi jIvikkunnu
3.17 Ablative Case Marker
Ablative is expressed by adding the postposition ‘ninnu’ to nouns suffixed with the locative case marker ‘il’. There is no sandhi change when ‘ninnu’ is added to the locative form. Eg:
English Sentence: We took this ball from the basket (we basket_LOC from_BEN this ball take_PAST)
Malayalam Sentence: njangngal sanciyilninnu E pant eduttu
3.18 Adjectivization
Adjectives are formed by adding the adjectivizer ‘Aya’ to the noun.
i. The addition of ‘Aya’ to the first category of nouns inserts an on glide ‘y’ between the base and the suffix.
Eg: amma+ADJZ → amma+y+Aya → ammayAya
ii. In the second category of nouns, the on glide ‘v’ is inserted between the base and the suffix.
Eg: guru+ADJZ → guru+v+Aya → guruvAya
iii. In the third category of nouns ending in ‘M’, the addition of ‘Aya’ changes ‘M’ to ‘m’.
Eg: maraM+ADJZ → maraM+m+Aya → maramAya
iv. In the fourth category of nouns ending in consonants, the addition of ‘Aya’ does not make any sandhi changes.
Eg: kUT+ADJZ → kUT+Aya → kUTAya
3.19 Adverbalization
Adverbs are derived by adding ‘Ayi’ to the nouns. The changes that take place due to sandhi are the same as those of adjectivization.
3.20 Clitics
A clitic is a morpheme that is grammatically independent but phonologically dependent on another word or phrase. It functions like an affix, but works at the phrase level. Clitics are of two types: i) Free Clitics; ii) Bound Clitics. Free clitics can occur freely (without being attached to a verb, a noun or another clitic). Bound clitics occur after verbs, nouns and other clitics. The free clitics include interjections and ideophones and, sometimes, manner adverbs (verb attributes). The bound clitics include proclitics, which are mainly prefixes of Sanskrit origin, along with infinite attributive quantifiers, and enclitics, which comprise a somewhat miscellaneous group of suffixed elements. There are different categories of clitic forms in the Malayalam language. Presently, the following types of clitics are considered.
3.21 Emphatic particles
The emphatic clitics ‘tanne’ or ‘E’ can be seen as sentence particles when attached to a verb in sentence-final position. Eg:
amma tanne, guru tanne, maraM tanne, kUT tanne; ammayE, guruvE, maramE, kUTE.
3.22 Interrogative particles
‘ANO’ is the interrogative clitic, and it is added after nominal bases, case markers, plural markers, etc.
i. With the first category of nouns, the on glide ‘y’ appears when ‘ANO’ is suffixed to the nominal base.
Eg: amma+CLI_ANO → amma+y+ANO → ammayANO
ii. With the second category of nouns, the on glide ‘v’ appears when ‘ANO’ is suffixed to the nominal base.
Eg: guru+CLI_ANO → guru+v+ANO → guruvANO
iii. With the third category of nouns ending in ‘M’, the suffixation of ‘ANO’ changes ‘M’ to ‘m’.
Eg: maraM+CLI_ANO → maraM+m+ANO → maramANO
iv. With the fourth category of nouns ending in consonants, the suffixation of ‘ANO’ does not make any sandhi change.
Eg: kUT+CLI_ANO → kUT+ANO → kUTANO
'A', which is another interrogative clitic, can replace ‘ANO’, subject to the above mentioned sandhi changes.
Example: ammayA, guruvA, maramA, kUTA
3.23 ‘And’ Coordination
‘um’ is the clitic for denoting coordination between nouns. The sandhi changes are as described in the case of the interrogative clitics.
amma+CLI_um → amma+y+um → ammayum
guru+CLI_um → guru+v+um → guruvum
maraM+CLI_um → maraM+m+um → maramum
kUT+CLI_um → kUT+um → kUTum
Eg:
English Sentence: Unni too went to Kottayam
Malayalam Sentence: unniyum kOTTayatteekk pOyi
3.24 ‘Or’ Coordination
For ‘Or’ coordination of Malayalam nouns, 'O' is added after the nominal bases, similar to the clitics discussed before.
Example: ammayO, guruvO, maramO, kUTO
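The vowel-initial clitics (‘ANO’, 'A', ‘um’, 'O', ‘E’) and the adjectivizer ‘Aya’ all trigger the same sandhi, so one function can serve them all. The following is a minimal sketch; the function names are assumptions and only nominal bases are covered.

```python
def noun_category(stem):
    """Return the stem-end category (1 to 4) described in table 3.1."""
    if stem.endswith("M"):
        return 3
    if stem[-1] in "uU":
        return 2
    if stem[-1] in "aAeEiIoO":
        return 1
    return 4


GLIDE = {1: "y", 2: "v"}  # on glide inserted after vowel-final stems


def attach_clitic(stem, clitic):
    """Attach a vowel-initial clitic to a nominal base with sandhi."""
    category = noun_category(stem)
    if category in GLIDE:
        return stem + GLIDE[category] + clitic
    if category == 3:
        return stem[:-1] + "m" + clitic  # 'M' changes to 'm'
    return stem + clitic                 # consonant: no change
```

For example, attach_clitic("amma", "ANO") gives ammayANO and attach_clitic("maraM", "um") gives maramum, matching the forms listed above.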
3.25 Malayalam Verbs - Morphology
In a language, there are words which are categorized as verbs. Verbs usually carry tense and function mostly as predicates. In Malayalam, verbs inflect or get modified for tense, mood, negation, aspect and voice. The base forms of Malayalam verbs are arrived at by removing the infinitive suffix ‘uka’ from the infinitive form of the verbs.
Example:
cirikkuka → cirikkuka+V → cirikk
paRayuka → paRayuka+V → paRay
According to the strategy adopted here, ‘cirikk’ and ‘paRay’ are the base forms. The inflections are accounted for by making use of the base forms.
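Deriving the base form amounts to stripping the infinitive suffix, which can be sketched in a few lines. The function name verb_base is an assumption made for this illustration.

```python
def verb_base(infinitive):
    """Return the verb base by removing the infinitive suffix 'uka'."""
    suffix = "uka"
    if not infinitive.endswith(suffix):
        raise ValueError("not an infinitive in 'uka': " + infinitive)
    return infinitive[: -len(suffix)]
```

The explicit error keeps a word that is already a base form, or that is not a verb, from being silently truncated.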
3.26 Malayalam Verb Base Forms
Three base forms of Malayalam verbs are considered. They are the Intransitive verb, the Transitive verb and the Causative verb.
3.27 Intransitive (akarmaka)
A verb without an object is called an intransitive verb. The intransitive verb denotes the state, process or action of the subject. The intransitive forms are usually the root forms to which the transitive and causative suffixes can be added to arrive at the respective base or stem forms. A simple rule, with no sandhi change for the intransitive base, is written below:
tinn+INT → tinn+ε → tinn
3.28 Transitive (sakarMaka)
The verbs which can take an object noun are called transitive verbs. Not all transitive verbs are derived from an intransitive counterpart. Some verbs are inherently transitive (eg. maRakkuka). Only those verbs which are derived from their intransitive counterparts are considered for derivation. The intransitive verbs are converted into transitive verbs by at least the following fourteen kinds of processes:
i. muR+TRA → muRukk
ii. AT+TRA → ATT
iii. kayaR+TRA → kayaRR
iv. uRangng+TRA → uRakk
v. kUmpu+TRA → kUpp
vi. nIL+TRA → nITT
vii. cuzal+TRA → cuzaRR
viii. tIr+TRA → tIrkk
ix. kariy+TRA → karikk
x. paRakk+TRA → paRatt
xi. viz+TRA → vIztt
xii. irikk+TRA → irutt
xiii. nilkk+TRA → nirtt
xiv. poTT+TRA → poTTikk
3.29 Causative (prayOjaka)
If the cause of the action denoted by the verb is the subject of the sentence and is reflected in the verb form by the causative marker, then the verb concerned is a causative verb. The transitive verbs are converted into causative verbs by at least three kinds of processes, which are listed below:
i. ceyy+CAU → ceyyikk
ii. uNN+CAU → UTT
iii. kAN+CAU → kATT
iv. kELkk+CAU → kELppikk
In general, the transitive forms are accompanied by the tense markers of Malayalam verbs. Consider the example sentence where the transitive form comes.
3.30 Tense Forms
The time at which the action of the verb takes place is denoted by tense. There are three different tenses:
• Past (bhUtaM)
• Present (vartamAnam)
• Future (bhAvi)
The morphotactics of Malayalam verbal forms are as follows:
Verb+TRA+CAU+TENSE
3.31 Past Tense (bhUtaM)
The past tense of the verb denotes an action that has already taken place. There are three sets of past tense suffixes: ‘i, tu, ntu’. Morphophonemic changes occur when these suffixes are added to the verbal bases. Accordingly we get the following alternants of the past tense suffix: ‘i, t, T, R, njnj, NT, N, nn, tt, and cc’.
i. kayaR+PAST → kayaR+i → kayaRi
ii. ceyy+PAST → ceyy+tu → ceytu
iii. kAN+PAST → kAN+tu → kaNTu
iv. viT+PAST → viT+tu → viTTu
v. peR+PAST → peR+tu → peRRu
vi. paRay+PAST → paRay+ntu → paRanjnju
vii. curuL+PAST → curuL+ntu → curuNTu
viii. vIz+PAST → vIz+ntu → vINu
ix. cEr+PAST → cEr+ntu → cErnnu
x. tar+PAST → tar+ntu → tannu
xi. koTukk+PAST → koTukk+ttu → koTuttu
xii. vilkk+PAST → vilkk+tu → viRRu
xiii. kELkk+PAST → kELkk+tu → kETTu
xiv. kaTikk+PAST → kaTikk+tu → kaTiccu
There are verbs whose past tense formations are not predictable. They are handled as special cases before any generalized orthographic rule is applied. Some of the special cases of past tense forms of Malayalam verbs are given in table.3.8 below.
Table.3.8 Special cases of Past Tense forms for Malayalam Verbs
Verb base    Past tense form
nOv          nonTu
vEv          venTu
taLL         taLLi
coll         colli
koll         konnu
pO           pOyi
vA           vannu
tA           tannu
pOk          pOyi
kAN          kantu
var          vannu
tar          tannu
tinn         tinnu
koT          koTuttu
vey          veccu
veykk        veccu
cAv          cattu
cAk          cattu
nilkk        ninnu
uNN          unTu
kakk         kaTTu
nakk         nakki
cikk         cikki
uzh          uzhutu
vIzh         vINu
tAzh         tAzhNu
koLL         koNTu
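The order of processing described here, special cases first and generalized rules afterwards, can be sketched as a dictionary lookup followed by an ordered rule list. Only a few rows of table 3.8 are included, and the ending-based rules are assumptions read off four of the worked past tense examples; they are not the complete rule set of the system.

```python
# Excerpt of table 3.8: unpredictable past tense forms.
SPECIAL_PAST = {
    "pO": "pOyi",
    "var": "vannu",
    "tar": "tannu",
    "tinn": "tinnu",
    "koLL": "koNTu",
}

# Ordered (base ending, replacement) rules; illustrative assumptions only.
PAST_RULES = [
    ("ikk", "iccu"),    # kaTikk -> kaTiccu
    ("ay", "anjnju"),   # paRay  -> paRanjnju
    ("yy", "ytu"),      # ceyy   -> ceytu
    ("T", "TTu"),       # viT    -> viTTu
]


def past_tense(base):
    """Return the past form: special cases first, then general rules."""
    if base in SPECIAL_PAST:
        return SPECIAL_PAST[base]
    for ending, replacement in PAST_RULES:
        if base.endswith(ending):
            return base[: -len(ending)] + replacement
    return base + "i"   # fallback suffix 'i', as in kayaR -> kayaRi
```

Because the lookup runs first, a verb such as var never reaches the general rules, which is what keeps the irregular forms from being overwritten.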
3.32 Present Tense (vartamAnam)
The present tense denotes the current state of action. The present tense is marked by adding the suffix ‘unnu’ to the verbal bases.
Intransitive: ceyy+INT+PRES → ceyy+unnu → ceyyunnu
Transitive: ceyy+TRA+PRES → ceyy+ikk+unnu → ceyyikkunnu
Causative: ceyy+CAU+PRES → ceyy+ippikk+unnu → ceyyippikkunnu
The unpredictable present tense formations of verbs are handled separately.
var+INT+PRES → var+unnu → vannu
3.33 Future Tense (bhAvi)
The future tense denotes an action that is going to take place. The future tense is marked by adding the suffix ‘um’ to the verbal bases.
Intransitive: ceyy+INT+FUT → ceyy+um → ceyyum
Transitive: ceyy+TRA+FUT → ceyy+ikk+um → ceyyikkum
Causative: ceyy+CAU+FUT → ceyy+ippikk+um → ceyyippikkum
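The present and future forms follow the morphotactics Verb+TRA+CAU+TENSE directly, so they can be sketched as plain concatenation. The derivational suffixes ‘ikk’ and ‘ippikk’ below are the ones shown for ceyy in the examples; other verbs would need the transitive and causative processes of sections 3.28 and 3.29, and the names used here are assumptions.

```python
# Derivational suffixes as shown for 'ceyy' in the examples above.
DERIVATION = {"INT": "", "TRA": "ikk", "CAU": "ippikk"}
# Tense suffixes for the regular present and future forms.
TENSE = {"PRES": "unnu", "FUT": "um"}


def conjugate(base, form, tense):
    """Build base + derivational suffix + tense suffix."""
    return base + DERIVATION[form] + TENSE[tense]
```

For instance, conjugate("ceyy", "CAU", "FUT") gives ceyyippikkum.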
3.34 Continuous Tense
The continuous tense denotes the continuity of the action denoted by the verb. The continuous tense is marked by adding the compound auxiliary verb ‘konTirikk’ to the past participle form of the main verb.
ceyy+CONT+PAST → ceytu+konTirukk+nnu → ceytukonTirunnu
ceyy+CONT+PRES → ceytu+konTirukk+unnu → ceytukonTirikkunnu
ceyy+CONT+FUT → ceytu+konTirukk+um → ceytukonTirukkum
3.35 Perfect Tense
Perfect tense is realized by adding one of three compound auxiliaries, ‘iTTuNT’, ‘iTTuNTAyirunnu’ and ‘iTTuNTAvum’, to the past participle form of the verb, forming the present perfect, past perfect and future perfect respectively.
ceyy+PERF+PAST → cey+tu+iTTuNTAyirunnu → ceytiTTuNTAyirunnu
ceyy+PERF+PRES → cey+tu+iTTuNT → ceytiTTuNT
ceyy+PERF+FUT → cey+tu+iTTuNTAvum → ceytiTTuNTAvum
3.36 Perfect Continuous Tense
It is realized by adding the complex auxiliary string ‘koNTirukkukayAyiru’ to the past participle form of the verb.
paTi+PERFCONT+PAST → paTi+cci+konTirukkukayAyiru+unnu → paTiccikonTirukkukayAyirunnu
paTi+PERFCONT+PRES → paTi+cci+konTirukkukayAyirikku+unnu → paTiccikonTirukkukayAyirikkunnu
paTi+PERFCONT+FUT → paTi+cci+konTirukkukayAyirikku+um → paTiccikonTirukkukayAyirikkum
3.37 Voice (prayOga)
Voice is divided into two categories: Active and Passive. The passive voice is marked and the active voice is unmarked. The passive voice is realized by adding the auxiliary verb ‘peT’ to the infinitive form of a transitive or causative verb.
ceyy+INF+PASS+PAST → ceyy+a+peT+Tu → ceyyappeTTu
ceyy+INF+PASS+PRES → ceyy+a+peT+unnu → ceyyappeTunnu
ceyy+INF+PASS+FUT → ceyy+a+peT+um → ceyyappeTum
3.38 Auxiliary Verbs
The auxiliary verbs such as can, may, should and could are handled by adding the suffixes ‘AM, EkkAM, aNaM and Anpatti’ respectively.
ceyy+AUX_CAN → ceyy+AM → ceyyAM
ceyy+AUX_MAY → ceyy+tu+EkkAM → ceytEkkAM
ceyy+AUX_SHOULD → ceyy+aNaM → ceyyaNaM
ceyy+AUX_COULD → ceyy+Anpatti → ceyyAnpatti
3.39 Negation
Negation is marked by adding the suffix ‘illa’ to the tense forms of the verbs.
ceyy+PAST+NEG_NOT → ceyy+tu+illa → ceytilla
3.40 Question Verbs
The ‘yes’ or ‘no’ questions are marked by adding the suffix ‘O’ to the tense forms of the verbs.
ceyy+PAST+QUES → ceyy+tu+O → ceytO
3.41 Infinite Verbs
The infinite forms of verbs are marked by adding the suffix ‘An’ to the base forms of the verbs without any tense.
ceyy+INF → ceyy+An → ceyyAn
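In the negation and question examples the final ‘u’ of the past form disappears before the vowel-initial suffix (ceytu + illa gives ceytilla), which is an instance of elision. A minimal sketch of this step follows; the helper names are assumptions, and the past form is assumed to be produced by an earlier stage.

```python
VOWELS = set("aAeEiIoOuU")


def attach(form, suffix):
    """Join form and suffix, dropping a final 'u' before a vowel."""
    if form.endswith("u") and suffix[0] in VOWELS:
        form = form[:-1]
    return form + suffix


def negate(tense_form):
    """Negation: tense form + 'illa'."""
    return attach(tense_form, "illa")


def question(tense_form):
    """Yes/no question: tense form + 'O'."""
    return attach(tense_form, "O")


def infinitive(base):
    """Infinite form: untensed base + 'An'."""
    return base + "An"
```

The same attach helper also reproduces ceytu + EkkAM → ceytEkkAM from the auxiliary verb examples.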
CHAPTER 4
IMPLEMENTATION OF RULE BASED MACHINE TRANSLATION SYSTEM
A rule-based machine translation system for the English to Malayalam language pair has been developed (Model). Each module of the system is discussed in detail in this chapter. The block diagram of the system is shown in fig.4.1 below:
[Block diagram. Blocks shown: ENGLISH SENTENCE (INPUT SOURCE TEXT); STANFORD ENGLISH PARSER; ENGLISH – MALAYALAM BILINGUAL DICTIONARY; ENGLISH – MALAYALAM TRANSLITERATOR; MALAYALAM MORPHOLOGICAL SYNTHESIZER; REORDERING BY TRANSFER RULES; MALAYALAM SENTENCE (OUTPUT TARGET TEXT)]
Fig.4.1 Block Diagram of English-Malayalam Rule Based Machine Translation System
4.1 English Parser
4.1.1 Introduction
A parser is an algorithm that produces a syntactic structure for a given input. The parser is the first component of the rule based machine translation system and it is used on the source (English) side. The statistical Stanford Parser, which is based on a probabilistic context free grammar (PCFG), is used in the system. The English sentence is given directly to the parser without preprocessing. The Penn Treebank POS tagset used by the Stanford Parser is given in Appendix – A.1.
4.1.2 Usage
The Stanford Parser is used for four main purposes in the machine translation system.
i. The parser is used for syntactic analysis of the English sentence, giving the parse tree structure of the sentence according to the context free grammar. An example of parsing is shown in fig.4.2 below:
(ROOT (S (NP (PRP I)) (VP (VBP am) (VP (VBG writing) (NP (DT a) (NN book))))))
Fig.4.2 Sample Parse Tree for the sentence “I am writing a book”
This tree structure is required for reordering the source (English) sentence with respect to the target (Malayalam) sentence by transfer rules.
ii. The parser is used for Parts of Speech (POS) tagging of the English sentence, giving the English words and their corresponding POS tags based on the Penn Treebank tagset. An example of POS tagging is as follows:
I will bring the pen => I(PRP) will(MD) bring(VB) the(DT) pen(NN)
These POS tagged words are used to search for the target equivalent of each English word in the bilingual dictionary, to synthesize the morphology of Malayalam words and also to reorder the English text with respect to the Malayalam text.
iii. The parser is used for stemming the words of the English sentence, to get their corresponding root words. An example of stemming is as follows:
children are playing => (children‐child) (are‐be) (playing‐play)
The English root words obtained after stemming are used to find the equivalent Malayalam words in the bilingual dictionary.
iv. The parser is used for the morphological analysis of the words in the English sentence, to get the morphology of the English words.
I am going to Chennai => (I) (go+V+PRES+CONT) (Chennai+N+DAT)
The morphological information of English is used in the morphological synthesis of the equivalent Malayalam words.
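The textual formats shown in items ii and iv can be read back into structured data with a short Python sketch. The function names `parse_tagged` and `split_analysis` are hypothetical, and the sketch assumes the parser output has already been written in exactly the `word(TAG)` and `(root+FEATURE+...)` notation used above; it does not call the Stanford Parser itself.

```python
import re

# One token looks like word(TAG), e.g. bring(VB); tags are upper case,
# optionally with a dollar sign as in PRP$.
TAGGED_TOKEN = re.compile(r"(\S+?)\(([A-Z$]+)\)")


def parse_tagged(text):
    """Return a list of (word, tag) pairs from a 'word(TAG) ...' string."""
    return TAGGED_TOKEN.findall(text)


def split_analysis(analysis):
    """Split '(go+V+PRES+CONT)' into its root and a list of feature tags."""
    root, *features = analysis.strip("()").split("+")
    return root, features


if __name__ == "__main__":
    print(parse_tagged("I(PRP) will(MD) bring(VB) the(DT) pen(NN)"))
    print(split_analysis("(go+V+PRES+CONT)"))
```

The (word, tag) pairs are what the dictionary lookup and reordering stages consume, while the feature list is what would be passed on to the morphological synthesizer.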
4.2 English to Malayalam Transliteration
4.2.1 Introduction
Transliteration is the process of writing the text of one language in the script of another. In English to Malayalam transliteration, the English text is replaced with Malayalam text while preserving the spelling. The SVM based Multilingual Amrita English-Malayalam Transliteration tool [21] was developed by Amrita – CEN, and we use the same tool in the machine translation system.
4.2.2 Preparation
The Amrita English-Malayalam transliteration system is implemented using SVMTool. First, a corpus of English words is collected and preprocessed. The preprocessing involves two-level romanization, segmentation and alignment. The English words are romanized into Malayalam words by English-Malayalam mapping. The romanized Malayalam words are then romanized back to English by Malayalam-English mapping. The regular English words and the romanized English words are segmented by the phonetics of digrams, trigrams, etc. In the segmented parallel corpus, one to one alignments are kept as they are, and other alignments are completed with the help of empty variables (^). The system is trained with 20000 names using SVMlight and the model is tested with 1000 names using SVMTeval, giving an accuracy of 90% [21].
4.2.3 Usage
In machine translation, proper nouns such as person names and place names, and other named entities, may not have equivalent Malayalam words in the bilingual dictionary. In such cases, the translation system will not produce good output. Such words should not be translated; they have to be transliterated. The Amrita English-Malayalam Transliteration tool is used for transliteration in the rule based machine translation system. The transliteration is invoked after parsing the English sentence with the Stanford Parser, because only after parsing can the proper nouns be identified by their POS tags, NNP (proper noun singular) or NNPS (proper noun plural) (an easy way is to identify the words with capital letters). Any word with the ‘NNP’ or ‘NNPS’ POS category is directly transliterated without entering the other translation modules. The transliteration serves three main purposes in the machine translation system:
i. The transliteration is used for transliterating proper nouns such as place names and person names. Since we are using the machine translation system for Indian languages, the transliteration system is trained with local places and persons in India and Kerala. The sample output for transliteration of proper nouns is as follows:
Harshawardhan Pondichery
ഹ ഷവ ധന് െപാnിെഛര ്
ii. The transliteration is used for transliterating named entities such as organization names and university names. The machine translation system should not translate such names, because their words could be mapped to their equivalent meanings in Malayalam and the prepositions in between might also be morphologically processed with the words. The sample output for transliteration of named entities is as follows:
CEN Renault
െസന് െരെനൗ ്
iii. The transliteration is used for transliterating the words that do not have equivalent Malayalam words. Some English words do not require equivalents in Malayalam because they might be contributing to the morphological processing of Malayalam. The machine translation system will not give any equivalent for a word that is not available in the dictionary. In such cases, those words are transliterated to notify the user that the word has to be added to the dictionary. This transliteration helps to improve the system accuracy by aiding manual testing. The sample output for transliteration of words not available in the dictionary is as follows:
Cytoplasm Chromosome
െ ാp മ് െ ഛാെമാെസാെമ
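The three uses of transliteration amount to a simple routing decision for each token, which can be sketched as below. The names `transliterate` and `translate_token` are hypothetical, `transliterate` is only a placeholder that marks the word (it does not call the Amrita tool), and the one-entry dictionary with the romanized target ‘maraM’ is purely illustrative.

```python
PROPER_NOUN_TAGS = {"NNP", "NNPS"}


def transliterate(word):
    """Placeholder for the transliteration tool: just marks the word."""
    return "[translit:" + word + "]"


def translate_token(word, tag, dictionary):
    """Route one POS tagged English word to transliteration or dictionary."""
    if tag in PROPER_NOUN_TAGS:
        # Proper nouns and named entities bypass the other modules.
        return transliterate(word)
    target = dictionary.get(word.lower())
    if target is None:
        # Word missing from the dictionary: transliterate it so the user
        # can see that it still has to be added.
        return transliterate(word)
    return target


if __name__ == "__main__":
    toy_dictionary = {"tree": "maraM"}  # illustrative entry only
    print(translate_token("Chennai", "NNP", toy_dictionary))
    print(translate_token("tree", "NN", toy_dictionary))
    print(translate_token("Cytoplasm", "NN", toy_dictionary))
```

Lowercasing the lookup key is an assumption of the sketch; the thesis does not say how case is handled in the dictionary search.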
4.3 English-Malayalam Bilingual Dictionary
4.3.1 Introduction
A dictionary contains words and their corresponding meanings. A bilingual dictionary has the words in one language and their meanings in the other. The English-Malayalam bilingual dictionary is used in the machine translation system for translating English words into equivalent Malayalam words.
4.3.2 Preprocessing
Around 21,000 English-Malayalam bilingual entries and more than 40,000 Malayalam-English bilingual entries have been collected. The bilingual data is manually typed and preprocessed. The preprocessing of the dictionary goes through various stages depending on the data.
i. Font-converting – The Malayalam data has to be in Unicode for the system to process it. Font converters are built for converting many Malayalam fonts into Unicode (UTF-8) format by distinct mappings.
ii. Aligning – The English words have to be aligned with the equivalent Malayalam words with respect to their meanings. The inflections in the data are also removed and only the root words are taken.
iii. POS tagging – The POS category of each English-Malayalam bilingual pair has to be tagged. The same word may belong to many POS categories, and all the categories of that word have to be POS tagged.
iv. Lexicalizing – This is the most important stage of preprocessing. If the Malayalam meaning of an English word is given as a detailed description, there will be ambiguity in the morphology generation of the word and the reordered output may not be readable. The challenge in creating a bilingual dictionary for machine translation is to find the one-word Malayalam equivalent of the English word. So, most of the Malayalam words of the dictionary are manually lexicalized.
v. Adding synonyms – One English word may have one or more equivalent Malayalam words (depending on the sense) in the dictionary. The primary lexicon (first sense) of the Malayalam word is stored as the equivalent target word for English and all other secondary lexicons (other senses) are stored as synonyms in the dictionary along with the primary lexicon. The synonyms of the Malayalam word are stored along with the secondary lexicons in the dictionary.
vi. Removing duplicates – Duplicate English words with the same Malayalam equivalents have to be removed so that each English word can be matched and its equivalent retrieved.
After preprocessing, the entire dictionary has to be checked manually.
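Stages v and vi can be illustrated with a small Python sketch that collapses aligned pairs into one entry per English word. The function name `build_entries` is hypothetical, the targets ‘t1’ and ‘t2’ are stand-ins rather than real Malayalam words, and normalizing the English key by stripping and lowercasing is an assumption; the manual lexicalizing and checking described above are not modelled.

```python
def build_entries(pairs):
    """Collapse (english, malayalam) pairs into primary target and synonyms.

    The first Malayalam word seen for an English word is kept as the primary
    lexicon; later distinct words become synonyms; exact duplicates are dropped.
    """
    senses_by_word = {}
    for english, malayalam in pairs:
        key = english.strip().lower()  # assumed normalization of the key
        senses = senses_by_word.setdefault(key, [])
        if malayalam not in senses:
            senses.append(malayalam)
    return {
        word: {"target": senses[0], "synonyms": senses[1:]}
        for word, senses in senses_by_word.items()
    }


if __name__ == "__main__":
    aligned = [("Tree", "t1"), ("tree", "t2"), ("tree", "t1")]
    print(build_entries(aligned))
```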
4.3.3 Implementation
The preprocessed bilingual dictionary is loaded into a database, for which a MySQL server is used. Based on the POS categories, the dictionary is separated into seven different databases: Noun, Verb, Adjective, Adverb, Pronoun, Preposition and General (excluding reordering rules, auxiliary tense and Stanford dependencies). Their names suggest the content of each database, while the General category stores all other POS categories such as conjunctions, interjections, determiners, particles, cardinals, etc. Each database has five fields: source, target, category, feature and synonym.
i. The field ‘source’ stores the English words.
ii. The field ‘target’ stores the Malayalam words.
iii. The field ‘category’ stores the POS category of the source and target words.
iv. The field ‘feature’ stores the person-number-gender (PNG) marker, which is not required for Malayalam, so the feature column is left empty.
v. The field ‘synonym’ stores the synonyms of the Malayalam word for that particular English word.
All the fields are set to the type ‘varchar’ and the source field is set as the primary key, because the English words have to be unique for better search and retrieval.
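The five-field layout can be sketched with Python's built-in `sqlite3` module. This is a stand-in chosen so that the example runs without a server; the system described here uses MySQL. The table name `noun`, the column widths, the helper names and the sample row (with the romanized target ‘maraM’) are all assumptions made for illustration.

```python
import sqlite3


def create_noun_table(conn):
    """Create one dictionary table with the five fields described above."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS noun (
               source   VARCHAR(255) PRIMARY KEY,  -- English word, unique
               target   VARCHAR(255),              -- Malayalam equivalent
               category VARCHAR(32),               -- POS category
               feature  VARCHAR(32),               -- PNG marker, left empty
               synonym  VARCHAR(255)               -- other senses
           )"""
    )


def lookup(conn, word):
    """Return the target word for an English source word, or None."""
    row = conn.execute(
        "SELECT target FROM noun WHERE source = ?", (word,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    create_noun_table(connection)
    connection.execute(
        "INSERT INTO noun VALUES (?, ?, ?, ?, ?)", ("tree", "maraM", "NN", "", "")
    )
    print(lookup(connection, "tree"))
```

Making `source` the primary key means that a second row for the same English word is rejected, which is why the duplicates have to be removed during preprocessing.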
4.4 Malayalam Morphological Generator
4.4.1 Introduction
The morphological synthesizer adds morphology to the words. A bi-directional Morphological Generator cum Morphological Analyzer has been developed for Malayalam, for synthesizing the morphology of Malayalam words. A Finite State Transducer (FST) is used to model the morphology, and the orthographic rules of Malayalam are written for it. The FST based Malayalam morphological synthesizer is used in the machine translation system.
4.4.2 Preparation
Around 1000 Malayalam nouns and 1500 Malayalam verbs belonging to various stem ends have been collected. A rule based approach is followed here. The words are distributed based on their stem endings. The collected words are manually inflected for different types of inflections. There are approximately 7500 inflected forms for Malayalam nouns and 12500 inflected forms for Malayalam verbs. The inflections considered for nouns are: Plural markers; Case markers: Accusative, Dative, Locative, Instrumental, Genitive, Ablative, Sociative, Benefactive; Adjectivization; Adverbalization; Cliticization. The inflections considered for verbs are: Transitive marker; Causative marker; Tense markers: Simple Present, Simple Past, Simple Future, Continuous Present, Continuous Past, Continuous Future, Present Perfect, Past Perfect, Future Perfect, Present Perfect Continuous, Past Perfect Continuous, Future Perfect Continuous; Passive, negative, question types and infinitive verbs. The inflected word forms are manually analyzed in order to model the Morphological Generator for Malayalam, which is implemented in the open source software OpenFst 1.2.7. The linguistic aspects of Malayalam morphology and the orthographic rules are discussed in the previous chapter.
4.4.3 Building FST Model
The morph synthesizer is used in the rule based machine translation system for English to Malayalam translation. The morphological information about the English words is transferred to the Malayalam words. The Stanford Parser is used to stem the English words of the input sentence and also to get the morphologically analyzed information. The equivalent Malayalam words are extracted from the English to Malayalam bilingual dictionary. The target words are romanized by Unicode to Roman character mapping and given as input to the FST. We therefore have the romanized target word along with the morphological information from the source side. The dependency information from the parser is converted into the format required for morphological synthesis, and this dependency transfer information is stored separately for nouns and verbs in the database. The FST model is studied from [22], where FST based morphology is used for Speech Recognition.
The input to the FST is treated as a regular expression. For example, the plural marking for the noun maraM is written as:
FST input: maraM+NOUN+PLURAL
maraM+N+PL
Here ‘+’ is used for our convenience and is stored separately in a variable to distinguish it from the regular expression operator ‘+’. All the romanized alphabets of Malayalam are categorised as variables for use in the FST. The next step is to build the FST. Let us consider the plural marker of a noun as an example. The working of the FST will be explained later, with the example of a verb in Present Perfect Continuous form.
The romanized Malayalam word along with the morphological information from the source parser is taken as input. The romanized characters are stored in a variable and the FST transition from state ‘a’ to state ‘b’ takes place, as shown in fig.4.3.
a --maraM--> b
Fig.4.3 First Transition of FST
In the next transition, the Parts of Speech category (N for noun) of the word is taken. No inflection occurs for Malayalam nouns in this transition from state ‘b’ to ‘c’, shown in fig.4.4, so it is marked with ‘epsilon’:
a --maraM--> b --(+N : ε)--> c
Fig.4.4 Second Transition of FST
For the next transition, the morphological information from the source language is considered, in our case the plural marking. The inflection for plural has to be added to the Malayalam noun during the transition from state ‘c’ to ‘d’, shown in fig.4.5: ‘mAr’ has to be appended for ‘human’ nouns and ‘kaL’ for the rest. Since most of the words we use are not ‘human’ nouns, we define the plural marking as ‘kaL’. Also, the ‘+’ sign is replaced with the ‘~’ sign to identify the markings:
a --maraM--> b --(+N : ε)--> c --(+PL : ~kaL)--> d
Fig.4.5 Third Transition of FST
The final step is to define the final state ‘d’, as shown in fig.4.6:
a --maraM--> b --(+N : ε)--> c --(+PL : ~kaL)--> d (final state)
Fig.4.6 Final Transition of FST
It is observed that it is always better to have separate final states in the FST for future use, if the model is going to be extended.
Output of FST: maraM+N+PL => maraM~kaL
Thus the FST model for plural marking of Malayalam nouns has been built. The important aspect in modeling the FST is to follow the morphotactics. The morphotactics of Malayalam are discussed in the previous chapter.
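The four-state transducer built in this section can be imitated with a plain transition table in Python. This is a simplified sketch rather than the OpenFst model itself: it treats the whole stem as a single input symbol instead of reading it character by character, and the names `TRANSITIONS`, `FINAL_STATES` and `transduce` are hypothetical.

```python
# (state, input symbol) -> (next state, output); None means "copy the stem".
STEM = "STEM"  # stands for any romanized stem
TRANSITIONS = {
    ("a", STEM): ("b", None),     # a -> b: read the stem and copy it
    ("b", "+N"): ("c", ""),       # b -> c: POS tag, epsilon output
    ("c", "+PL"): ("d", "~kaL"),  # c -> d: plural marker
}
FINAL_STATES = {"d"}


def transduce(fst_input):
    """Map an input such as 'maraM+N+PL' to its marked output form."""
    stem, *tags = fst_input.split("+")
    symbols = [STEM] + ["+" + tag for tag in tags]
    state, output = "a", []
    for symbol in symbols:
        if (state, symbol) not in TRANSITIONS:
            raise ValueError(f"no transition from {state!r} on {symbol!r}")
        state, emitted = TRANSITIONS[(state, symbol)]
        output.append(stem if emitted is None else emitted)
    if state not in FINAL_STATES:
        raise ValueError("input ended in a non-final state")
    return "".join(output)


if __name__ == "__main__":
    print(transduce("maraM+N+PL"))
```

The ‘~’ in the output is the boundary marker that the orthographic rules of the next section rewrite into the surface form.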
4.4.4 Writing orthographic rules
As Malayalam is an inflectionally rich language, Malayalam words have to be classified into different categories by defining them with different sets of orthographic rules. The linguistic aspects of the orthographic rules are covered in the previous chapter. The computational aspects of the orthographic rules are discussed in this chapter. The rule notation of Chomsky and Halle is followed for Malayalam orthographic rules. The rule ‘a -> b / c __ d’ states that ‘a’ is replaced with ‘b’ in the position between ‘c’ and ‘d’, when a word ending with ‘c’ joins with a word starting with ‘d’.
For plural marking, the nouns are categorized based on the following phonological endings of stems and semantic forms: (consonants), (u|U), (M), (general), (human). Their orthographic rules are given in table.4.1 as follows:
Table.4.1 Sandhi rules for various stem endings of Malayalam nouns
i. Stem end [human]: amma[b2]kaL -> ammamAr / __
ii. Stem end M: M[b2]k -> angng / __ aL
iii. Stem end [u or U]: [b2] -> k / [u|U] __ kaL
iv. Stem end [consonants]: [b2] -> u / [CON] __ kaL
v. Stem end [general]: [b2] -> [] / __ kaL
Here the variable [b2] denotes the ‘~’ sign, the variable [CON] denotes all consonants of Malayalam, and [epsilon], written as [], is an empty variable. The suffixes are phonologically, morphologically as well as semantically conditioned.
The rule (i) states that human nouns are marked with ‘mAr’, and the ‘kaL’ that is added during pluralization has to be replaced with ‘mAr’. Example: accan~kaL => accanmAr
The rule (ii) states that nouns with ‘M’ stem ends are marked with ‘kaL’, where ‘M~k’ is replaced with ‘angngaL’. Example: maraM~kaL => marangngaL
The rule (iii) states that nouns with ‘u’ or ‘U’ stem ends are marked with ‘kaL’, where ‘~’ is replaced with ‘k’. Examples: pasu~kaL => pasukkaL ; pU~kaL => pUkkaL
The rule (iv) states that nouns with ‘consonant’ stem ends are marked with ‘kaL’, where ‘~’ is replaced with ‘u’. Example: kall~kaL => kallukaL
The rule (v) states that nouns with all other stem ends are marked with ‘kaL’, where ‘~’ is replaced with nothing in general.
The four categories (consonants), (u|U), (M), (general) are common to all inflections of Malayalam nouns. The two other categories to be considered are (a|A,i|I,e|E,o|O) and (aL), the latter resulting from plural marking by the morphotactics of Malayalam. Therefore six categories are sufficient for marking each inflection of Malayalam nouns.
The order of the orthographic rules is important. The special rules have to be considered before the general rules. Exceptions are treated as special rules because they do not follow the general rules. For example, ‘accan’ ends with a consonant sound, but the special (human) rule overrides the general consonant rule.
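The ordering of special and general rules can be sketched with ordered regular expression rewrites in Python. Several things here are assumptions made for the sketch: the human nouns are recognized through a small hypothetical lexicon `HUMAN_NOUNS`, the consonant class is approximated as "anything that is not a listed vowel or M", and rule (ii) is implemented as ‘M~k’ becoming ‘ngng’ so that the output matches the example marangngaL.

```python
import re

HUMAN_NOUNS = {"amma", "accan"}  # assumed lexicon of human nouns

# Ordered (pattern, replacement) pairs: the first rule that fires wins.
PLURAL_RULES = [
    (re.compile(r"M~k(?=aL$)"), "ngng"),                 # rule ii: stem end M
    (re.compile(r"(?<=[uU])~(?=kaL$)"), "k"),            # rule iii: u or U
    (re.compile(r"(?<=[^aAiIeEoOuUM])~(?=kaL$)"), "u"),  # rule iv: consonant
    (re.compile(r"~(?=kaL$)"), ""),                      # rule v: general
]


def apply_plural_sandhi(marked):
    """Rewrite a marked form such as 'maraM~kaL' into its surface form."""
    stem = marked.split("~", 1)[0]
    if stem in HUMAN_NOUNS:  # rule i: special rule, checked first
        return stem + "mAr"
    for pattern, replacement in PLURAL_RULES:
        result, count = pattern.subn(replacement, marked)
        if count:
            return result
    return marked


if __name__ == "__main__":
    for form in ("accan~kaL", "maraM~kaL", "pasu~kaL", "pU~kaL", "kall~kaL"):
        print(form, "=>", apply_plural_sandhi(form))
```

If the human rule were placed after the consonant rule, ‘accan~kaL’ would be rewritten by rule (iv) instead, which is the ordering problem described above.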
The same FST strategies that are applied to Malayalam nouns are also applicable to Malayalam verbs, and an FST model is built for Malayalam verbs too. The categories considered for Malayalam verbs are different from the groups of Malayalam nouns. For Malayalam nouns, the 35 stem ends considered are grouped into 7 categories in order to optimize the rules of all inflections, whereas for Malayalam verbs, 71 stem ends are considered and grouped into 10 categories, the majority of which cause inflections in the past tense markings.
The ‘uka’ forms of Malayalam verbs, for example ‘ceyyuka’, are taken as base forms because they help in exploring the verbal inflections; in clear terms, the ‘uka’ forms come in handy for explaining the morphophonemic changes. Before the morphology is generated, the ‘uka’ suffix has to be removed. Consider the Malayalam word ‘paRayuka’: its root is ‘paRa’ and the corresponding future tense form is ‘paRayum’. The form ‘paRayum’ can easily be derived from ‘paRayuka’ as it has ‘y’ in it. As already mentioned, the past tense marker brings about most of the inflections in Malayalam verbs. Some of the rules for the past tense marker are given in table.4.2:
Table.4.2 Sandhi rules for various stem endings of Malayalam verbs in past tense
i. [b2] -> Ayirunn / iTTuNT __ u
ii. [b2] -> unn / koNTirikk __ u
iii. koNTAyirikk[b2] -> koNTAyirunn / __ u
iv. pOk[b2]u -> pOyi / __
v. kAN[b2] -> kaNT / __ u
vi. var[b2] -> vann / __ u
vii. tar[b2] -> tann / __ u
viii. tinn[b2] -> tinn / __ u
ix. r[b2] -> rnn / __ u
x. zh[b2] -> N / __ u
xi. ll[b2] -> nn / __ u
xii. yy[b2] -> yt / __ u
xiii. y[b2] -> nj / __ u
xiv. iT[b2] -> iTT / __ u
xv. ikk[b2] -> icc / __ u
xvi. rkk[b2] -> rtt / __ u
xvii. ukk[b2] -> utt / __ u
xviii. akk[b2] -> ann / __ u
xix. lkk[b2] -> nn / __ u
xx. [b2]u -> i / __
The first eight rules are written as special rules that deviate from the general rules, whereas the next twelve are all general rules common to most of the verbs that are considered. The first three rules are written to aid the continuous tense marker, perfect tense marker and perfect continuous tense marker, because these three markers first convert the verb into past tense, then create inflections depending on continuous or perfect or both, and then move on to the tense marker again. All other Malayalam verb tense markers and inflections have only three to four common rules.
4.4.5 Working of FST
The working of the FST by following the Sandhi rules is discussed below with the example of the Malayalam verb ‘ceyyuka’ in present perfect continuous form.
Input: ceyyuka + Present Perfect Continuous Tense
Input to FST: ceyyuka+V+PAST+PERFCONT+PRES
i. FST reads the input:
Output: ceyyuka
ii. FST goes to the next state (POS category verb ‘V’ state), checks for the rule, then truncates ‘uka’ by the rule:
Rule 1: uka[b2]uka -> [] / __
Output: ceyyuka => ceyy
iii. FST moves to the ‘PAST’ tense state and replaces ‘yy’ by ‘yt’ by the rule:
Rule 2: yy[b2] -> yt / __ u
Output: ceyy => ceytu
iv. FST makes a transition to the ‘PERFCONT’ tense state and appends ‘koNTAyirikk’ by the rule:
Rule 3: [b2] -> [] / __ koNTAyirikk
Output: ceytu => ceytukoNTAyirikk
v. FST comes back to the tense states, now the ‘PRESENT’ state, and appends ‘unnu’ by the rule:
Rule 4: [b2] -> [] / __ unnu
Output: ceytukoNTAyirikk => ceytukoNTAyirikkunnu
FST final output of Malayalam Morph Generator: ceytukoNTAyirikkunnu
By employing four rules in morphotactic order, we get the correct output for the given input. The built FST model is used directly in the rule based machine translation system. The parser dependencies are converted into the required format of inflection marker inputs of the FST.
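The five steps of this walk-through can be traced with a small Python sketch that applies one rewrite per feature tag, in the order the tags appear in the FST input. It covers only the four rules used for ‘ceyyuka’; the names `STEPS`, `strip_uka`, `past_yy` and `generate` are hypothetical, and a real generator would need the full rule tables instead of the single past tense case handled here.

```python
def strip_uka(form):
    """Rule 1: remove the 'uka' suffix of the base form."""
    return form[:-3] if form.endswith("uka") else form


def past_yy(form):
    """Rule 2 only: 'yy' becomes 'yt' before the past marker 'u'."""
    if form.endswith("yy"):
        return form[:-2] + "yt" + "u"
    raise ValueError("stem ending not covered by this sketch")


# Feature tag -> rewrite applied when the FST reaches that state.
STEPS = {
    "V": strip_uka,
    "PAST": past_yy,
    "PERFCONT": lambda form: form + "koNTAyirikk",  # Rule 3
    "PRES": lambda form: form + "unnu",             # Rule 4
}


def generate(fst_input):
    """Return the final form and the list of intermediate forms."""
    form, *tags = fst_input.split("+")
    trace = [form]
    for tag in tags:
        form = STEPS[tag](form)
        trace.append(form)
    return form, trace


if __name__ == "__main__":
    final_form, steps = generate("ceyyuka+V+PAST+PERFCONT+PRES")
    print(" => ".join(steps))
```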
4.5 Malayalam Morphological Analyzer
4.5.1 Introduction
The morphological analyzer analyzes the morphology of words. With a similar FST model, the morph analyzer is built for Malayalam nouns and verbs based on the morphotactics. This is the added advantage of using an FST in building the Malayalam morphological generator. The Malayalam morph analyzer gives multiple outputs in the order of expected results because of the multiple connected paths in the FST model. Although we have built the Malayalam morph analyzer, it is not used in the rule based machine translation system. It could, however, be useful for future purposes such as Malayalam to English machine translation.
4.5.2 Working
In general, the first of the multiple outputs of the morph analyzer is taken as the exact output.
The morph analyzer is discussed with the example of a Malayalam noun and verb as follows:
FST Morph Analyzer
Noun Input: pUkkaL
Expected Output: pU+N+PL
FST Outputs:
i. pU+N+PL
ii. pUkkaL+V
iii. pUkkaL+V+INT
iv. pUkkaLuka+V
v. pUkkaLuka+V+INT
The first output is taken as the exact output. The second and third outputs are logically incorrect. The last two outputs (iv & v) are entirely wrong. This problem of wrong outputs could be solved by building the morph analyzer separately for nouns and verbs.
FST Morph Analyzer
Verb Input: ceytu
Expected Output: ceyy+V+PAST
FST Outputs:
i. ceyy+V+PAST
ii. ceyy+V+INT+PAST
iii. ceyyuka+V+PAST
iv. ceyyuka+V+INT+PAST
v. ceytu+V
vi. ceytu+V+INT
vii. ceytuuka+V
viii. ceytuuka+V+INT
The first four outputs are correct outputs and the first output is taken as the expected output. The last two outputs (vii & viii) are wrong outputs. This problem of outputs in morph analyzing the Malayalam verbs could be solved by developing the morph generator and analyzer separately.
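The effect of restricting the analyzer to one category can be illustrated on the noun outputs listed above. The Python sketch below does not implement the FST analyzer; it only filters an already produced candidate list by POS tag, and the function name `filter_by_category` is hypothetical.

```python
def filter_by_category(analyses, category):
    """Keep analyses whose first tag after the stem is the given category."""
    kept = []
    for analysis in analyses:
        parts = analysis.split("+")
        if len(parts) > 1 and parts[1] == category:
            kept.append(analysis)
    return kept


# Candidate analyses of 'pUkkaL' as listed in section 4.5.2.
NOUN_OUTPUTS = [
    "pU+N+PL",
    "pUkkaL+V",
    "pUkkaL+V+INT",
    "pUkkaLuka+V",
    "pUkkaLuka+V+INT",
]

if __name__ == "__main__":
    print(filter_by_category(NOUN_OUTPUTS, "N"))
```

With a noun-only analyzer the four verb readings would not be generated in the first place; the filter merely shows which candidates would remain.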
4.6 Reordering by Transfer Rules
4.6.1 Introduction
In machine translation, reordering denotes the change in the syntactic structure of the source text with respect to the target text. The reordering can be machine learned, executed by rules, or both. Reordering by rules is followed in the machine translation system to put the English sentence in the order of the Malayalam sentence.
4.6.2 Preparation
English is a Subject-Verb-Object (SVO) language, whereas Malayalam is a Subject-Object-Verb (SOV) language. Therefore reordering is necessary for English to Malayalam translation. The pattern based reordering approach proposed for Tamil in [19] is followed for Malayalam. The pattern based reordering is based on the pattern of CFG rules from the parser (Stanford Parser). The transfer rules are written as transfer links, which carry the order of the child nodes from the parse tree. First, a parallel corpus of English and Malayalam has been collected from the open source Kerala Government school texts. The corpus is manually typed, converted to Unicode format and aligned manually, and has around 2500 parallel sentences. These sentences are analyzed for all possible combinations of sentence structures such as simple, compound and complex. Then the reordering rules are written based on the syntactic variation of the structure of the English sentence with respect to the Malayalam sentence. At present, we have written around 100 reordering rules for Malayalam. Since Malayalam is a free word-ordered language, the reordering rules are flexible. The rules are loaded into a separate database with two fields: source pattern and target pattern. The source pattern stores the English CFG rules and the target pattern stores the corresponding transfer CFG rules of Malayalam.
4.6.3 Implementation
The reordering forms the last component of the machine translation system. The syntactic information of the English sentence from the Stanford Parser is checked for a match in the database of reordering rules. If the syntactic pattern of the English sentence matches a source rule, then the corresponding Malayalam rule is taken and the source tree structure from the parser is modified with respect to the target rule. The transfer rule is applied to the child nodes of all matching branches in the parse tree, so that the system output is syntactically reordered to suit the Malayalam language. We are therefore not transforming the parse tree as a whole, which might require one rule for each type of sentence, an unbounded number. Instead, the tree is transformed branch by branch with respect to the child nodes, so that many transfer rules can be applied to reorder one sentence and the same rules can also be applied to other sentences. The reordered output after morphological generation of the Malayalam words is displayed as the final output of the machine translation system. Some of the English-Malayalam reordering rules for two different types of sentences, and their step by step application, are shown in the following examples.
i. “I am eating an apple”: this sentence is reordered with respect to Malayalam by executing two reordering rules:
Rule – 1: VP (VBP: VP) -> VP (VP: VBP)
Rule – 2: VP (VBG: NP) -> VP (NP: VBG)
(ROOT (S (NP (PRP I)) (VP (VBP am) (VP (VBG eating) (NP (DT an) (NN apple))))))
Fig.4.7 Parse Tree of English Sentence ‘I am eating an apple’
By applying rule – 1, the reordered parse tree is shown in Fig.4.8.a, and by applying rule – 2, the reordered parse tree is shown in Fig.4.8.b.
(ROOT (S (NP (PRP I)) (VP (VP (VBG eating) (NP (DT an) (NN apple))) (VBP am))))
Fig.4.8.a Reordered Parse Tree of English Sentence by executing Rule – 1
(ROOT (S (NP (PRP I)) (VP (VP (NP (DT an) (NN apple)) (VBG eating)) (VBP am))))
Fig.4.8.b Reordered Parse Tree of English Sentence by executing Rule – 2
After applying the above two rules, the English sentence is in the order of the required Malayalam sentence, as shown in Fig.4.8.b.
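The branch-wise application of transfer rules in this example can be sketched in Python. The sketch assumes a simple tree encoding (a node is a `(label, children)` pair and a leaf is a `(tag, word)` pair) and a `RULES` dictionary holding only the two rules of example i; these names and the encoding are illustrative and do not describe the database format of the system. The tree is visited bottom-up, so the rules fire in a different order from the walk-through above, but the resulting tree is the same.

```python
# (parent label, child label sequence) -> reordered child label sequence
RULES = {
    ("VP", ("VBP", "VP")): ("VP", "VBP"),  # Rule 1
    ("VP", ("VBG", "NP")): ("NP", "VBG"),  # Rule 2
}


def reorder(node):
    """Apply the transfer rules to every branch of the tree."""
    label, children = node
    if isinstance(children, str):  # leaf: (POS tag, word)
        return node
    children = [reorder(child) for child in children]
    pattern = tuple(child[0] for child in children)
    target = RULES.get((label, pattern))
    if target is not None:
        remaining = list(children)
        reordered = []
        for wanted in target:
            match = next(c for c in remaining if c[0] == wanted)
            remaining.remove(match)
            reordered.append(match)
        children = reordered
    return (label, children)


def leaves(node):
    """Read the words of the tree from left to right."""
    label, children = node
    if isinstance(children, str):
        return [children]
    return [word for child in children for word in leaves(child)]


TREE = ("ROOT", [("S", [
    ("NP", [("PRP", "I")]),
    ("VP", [("VBP", "am"),
            ("VP", [("VBG", "eating"),
                    ("NP", [("DT", "an"), ("NN", "apple")])])]),
])])

if __name__ == "__main__":
    print(" ".join(leaves(reorder(TREE))))
```

Because each rule only looks at one parent and its immediate children, the same two entries would also reorder any other sentence that contains these branch patterns.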
ii. “I work hard to finish the work and achieve the goal”: the parse tree of this sentence is shown in fig.4.9. It is reordered with respect to Malayalam by executing the following four reordering rules, and the reordered parse trees after executing rule-1, rule-2, rule-3 and rule-4 are shown in fig.4.10, 4.11, 4.12 and 4.13 respectively.
Rule – 1: VP (VB: ADJP) -> VP (ADJP: VB)
Rule – 2: ADJP (JJ: S) -> ADJP (S: JJ)
Rule – 3: VP (TO: VP) -> VP (VP: TO)
Rule – 4: VP (VB: NP) -> VP (NP: VB)
The complete set of reordering rules that have been written is given in Appendix – B.1. The rules have certain limitations when applied to the source sentences. Some sentences may have the same structure but need to be reordered in different ways. In such cases, the only option is to apply the transfer rule that is written, and such sentences create ambiguity in the machine translation output. This problem is not handled in the present system. It is yet to be handled by separating such ambiguous sentences from the sentences that follow the general rule, identifying them by their lexicons and structure, and pre-processing or post-processing them.
(ROOT (S (NP (PRP I)) (VP (VBP work) (ADJP (JJ hard) (S (VP (TO to) (VP (VP (VB finish) (NP (DT the) (NN work))) (CC and) (VP (VB achieve) (NP (DT the) (NN goal))))))))))
Fig.4.9 Parse Tree of English Sentence ‘I work hard to finish the work and achieve the goal’
(ROOT (S (NP (PRP I)) (VP (ADJP (JJ hard) (S (VP (TO to) (VP (VP (VB finish) (NP (DT the) (NN work))) (CC and) (VP (VB achieve) (NP (DT the) (NN goal))))))) (VBP work))))
Fig.4.10 Reordered Parse Tree of English Sentence - 2 by executing Rule – 1 (VP (VB: ADJP) -> VP (ADJP: VB))
(ROOT (S (NP (PRP I)) (VP (ADJP (S (VP (TO to) (VP (VP (VB finish) (NP (DT the) (NN work))) (CC and) (VP (VB achieve) (NP (DT the) (NN goal)))))) (JJ hard)) (VBP work))))
Fig.4.11 Reordered Parse Tree of English Sentence - 2 by executing Rule – 2 (ADJP (JJ: S) -> ADJP (S: JJ))
(ROOT (S (NP (PRP I)) (VP (ADJP (S (VP (VP (VP (VB finish) (NP (DT the) (NN work))) (CC and) (VP (VB achieve) (NP (DT the) (NN goal)))) (TO to))) (JJ hard)) (VBP work))))
Fig.4.12 Reordered Parse Tree of English Sentence - 2 by executing Rule – 3 (VP (TO: VP) -> VP (VP: TO))
(ROOT (S (NP (PRP I)) (VP (ADJP (S (VP (VP (VP (NP (DT the) (NN work)) (VB finish)) (CC and) (VP (NP (DT the) (NN goal)) (VB achieve))) (TO to))) (JJ hard)) (VBP work))))
Fig.4.13 Reordered Parse Tree of English Sentence - 2 by executing Rule – 4 (VP (VB: NP) -> VP (NP: VB))
CHAPTER 5
RESULTS
5.1 Results of Malayalam morphological generator and analyzer
The details and statistics of the morphological generator system are given in the following tables. Table.5.1 shows the statistics of nouns and table.5.2 shows the statistics of verbs.
NOUN STEMS
STEMS
INFLECTIONS
y
Table.5.1 Statistics of morphology for Malayalam nouns
(END) COVERED
GROUPED
COVERED
MARKER
Noun
-
Plural
kaL/mAr
Case Markers
-
Nominative case
-
Accusative case
e
kha
amma
A
kurunn
aval
al
M
makaL
aL
makaL
M
amma
ngng
u,U
App
nj
a,A,i,I,e,E,o,O
Dative case
in
ar
Ol
aL
Locative case
il
at
OL
consonants
Sociative case
OT
atu
u
njAn
Instrumental case
Al
av
U
ni
Genetive case
uTe
Ay
ub
Ablative case
ilninnu
Azh
Ud
Benefactive case
inuvENTi
chcha
ud
o
Adjectivization
Aya
chchat
UN
Adverbalization
Ayi
D
N
C
a
ot
op
INFLECTION
ER
Im
Clitics
-
i
k
Clitics_um
um
I
r
Clitics_E
E
Ih
ni
Clitics_A
A
Clitics_O
O
Clitics_tanne
tanne
Clitics_ANO
ANO
njAn
TOTAL = 37
TOTAL = 10
TOTAL = 21
Total Number of Sandhi Rules Written = 152
Table.5.2 Statistics of morphology for Malayalam verbs

INFLECTIONS COVERED | INFLECTION MARKERS
Verb | uka
Intransitive | -
Transitive | ikk
Causative | ippikk
Tense | -
Past Tense | u
Present Tense | unnu
Future Tense | um
Continuous | koNTirikk
Perfect | iTTuNT
Perfect Continuous | koNTAyirikk
Passive | appeT
Auxiliary-can | AM
Auxiliary-may | EkkAM
Auxiliary-should | aNaM
Auxiliary-could | Anpatti
Negative | illa
Question | O
Infinite | An
TOTAL = 19

VERB STEMS (END) COVERED: Azh, ll, ak, enc, opp, Akk, eNN, Or, akk, eNT, oRR, al, ER, ott, aL, ERR, rkk, alk, ES, rtt, aLL, eT, tt, AN, ett, uk, aNG, ikk, ukk, ANG, Ikk, Ukk, ar, iLK, UL, AR, inn, UL, aRR, Int, umm, ARR, ippi, uNG, as, iT, upp, AT, ITT, uR, att, iy, uRR, ATT, iZh, avv, Lmp, ay, ng, Okk, ngng, yy, ntt, zhtt, US, utt, ykk, L, oT, okk, Ak
TOTAL = 71

VERBS GROUPED: ikk; ar; rkk; yy; tt; akk,Akk,ukk; Okk,ykk; iT,eT,oT; ng,ngng; others
TOTAL = 10

Total Number of Sandhi Rules Written = 103
Table.5.3 Testing results of morph generator for Malayalam nouns

Morphological Generator for Malayalam Nouns
Number of nouns taken for testing | 100
Number of inflections considered for testing | 7
Total number of nouns in testing corpus | 700
Number of correct outputs | 498
Number of wrong outputs | 202
Accuracy (in %) | 71.14

Table.5.4 Testing results of morph generator for Malayalam verbs

Morphological Generator for Malayalam Verbs
Number of verbs taken for testing | 100
Number of inflections considered for testing | 12
Total number of verbs in testing corpus | 1200
Number of correct outputs | 872
Number of wrong outputs | 328
Accuracy (in %) | 72.66

Table.5.5 Testing results of morph analyzer for Malayalam nouns

Morphological Analyzer for Malayalam Nouns
Number of nouns taken for testing | 100
Number of inflections considered for testing | 7
Total number of inflected nouns in testing corpus | 700
Number of correct outputs | 475
Number of wrong outputs | 225
Accuracy (in %) | 67.85

Table.5.6 Testing results of morph analyzer for Malayalam verbs

Morphological Analyzer for Malayalam Verbs
Number of verbs taken for testing | 100
Number of inflections considered for testing | 12
Total number of inflected verbs in testing corpus | 1200
Number of correct outputs | 786
Number of wrong outputs | 414
Accuracy (in %) | 65.5

Table.5.7 Coverage of Malayalam nouns and verbs

Coverage for Malayalam nouns
Number of nouns taken | 1000
Number of nouns inflected | 7500

Coverage for Malayalam verbs
Number of verbs taken | 1500
Number of verbs inflected | 12500
5.2 Discussion about results of Malayalam morphological generator and analyzer
A testing corpus of 100 nouns and 100 verbs was taken. The nouns include all types of human nouns and stem suffixes, and the verbs include all the different kinds of special categories. The inflections considered are 7 for nouns and 12 for verbs, so there are 700 inflected forms of nouns and 1200 inflected forms of verbs for testing. The difference between the testing of the morph generator and that of the morph analyzer is the testing corpus. For the morph generator, the testing corpus contains word stems and inflection information separately, whereas the testing corpus of the morph analyzer consists of the inflected words: 100 nouns inflected into 700 forms and 100 verbs inflected into 1200 forms. Table.5.3, table.5.4, table.5.5 and table.5.6 show the testing results of the morph generator and analyzer for Malayalam nouns and verbs, and table.5.7 shows their coverage.
At present, the morphological generator handles plural markers, case markers, adjectivization, adverbalization and clitic markers for Malayalam nouns. For Malayalam verbs it handles base forms, tense markers in all forms, voice, auxiliaries, negative types, question types and infinitive verbs. The morphological synthesizer for Malayalam nouns gives an accuracy of 71.14 percent for the inflections considered. This accuracy could be improved by considering more human nouns. The morphological synthesizer for Malayalam verbs gives an accuracy of 72.66 percent for the inflections considered. This accuracy could be improved by considering more special cases of past tense inflections of verbs. The question and negative types also have to be handled effectively for all cases. The morphological analyzer gives an accuracy of 67.85 percent for Malayalam nouns and 65.5 percent for Malayalam verbs. The factors that affect the accuracy of the morphological analyzer are:
i. The same inflection marker occurs for different categories, e.g. clitic ‘um’ and future tense ‘um’. Example: ‘pAmpum’ and ‘Odum’.
ii. Different words end up as the same word after morphological generation. Example: ‘avan’, ‘avaL’ => ‘avar’.
iii. Special cases and general cases vary completely in their inflections. Example: ‘mAr’ and ‘kaL’. When a human form is given, the output would be with ‘kaL’.
iv. For Malayalam nouns, the first output amongst all candidates is taken as the morph analysed output. For verbs, the output is given in the order of the rules written, and ambiguity arises in choosing the best possible output.
These problems of the Malayalam morph analyzer could be solved by developing the morph analyzer and generator separately, each with its own FST model and orthographic rules. Inflection markers used in the morphology of Malayalam nouns and verbs are given in Appendix – C.1, C.2. FST state transition tables of Malayalam nouns and verbs are given in Appendix – D.1, D.2. Orthographic rules written for Malayalam nouns and verbs are given in Appendix – D.3, D.4 respectively.
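The accuracy figures in table.5.3 to table.5.6 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `accuracy_percent` is hypothetical, and it assumes that the reported percentages are truncated (not rounded) to two decimals, which is what the reported values 72.66 and 67.85 suggest.

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Percentage of correct outputs, truncated to two decimal places.

    Integer arithmetic is used before the final division so that the
    truncation is not affected by floating-point rounding.
    """
    return (correct * 10000 // total) / 100


# (correct outputs, total test items) as reported in table.5.3 to table.5.6
RESULTS = {
    "morph generator, nouns": (498, 700),
    "morph generator, verbs": (872, 1200),
    "morph analyzer, nouns": (475, 700),
    "morph analyzer, verbs": (786, 1200),
}

for name, (correct, total) in RESULTS.items():
    print(f"{name}: {accuracy_percent(correct, total)} %")
```

The total in each case is the number of stems multiplied by the number of inflections considered (100 x 7 for nouns, 100 x 12 for verbs).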
5.3 Results of Rule Based Machine Translation System
The details and statistics of the English-Malayalam rule based machine translation system are given in table.5.8.

Table.5.8 Statistics of Rule Based Machine Translation System

Bilingual Dictionary Database Statistics
Adjectives | 3576
Adverbs | 832
Pronouns | 90
Prepositions | 100
Verbs | 4276
Nouns | 11909
Others | 195
Total words in Database | 20,978
Dependency for nouns considered | 40
Dependency for verbs considered | 292
Number of Reordering rules | 99
Number of Orthographic rules for nouns | 152
Number of Orthographic rules for verbs | 103

Table.5.9 Testing results for English-Malayalam machine translation system

Testing of Rule Based Machine Translation system
Number of sentences tested | 757
Number of correctly translated sentences | 406
Number of understandable (readable) translations | 253
Number of wrong translations | 98
Overall System Accuracy (in %) | 53.63
5.4 Discussion about results of Rule Based Machine Translation System
The testing results of the English-Malayalam rule based machine translation system are given in table.5.9. The test corpus consists of 757 random simple sentences collected from short stories. While testing, the translations are ranked into three categories: 1) Exact translations, 2) Understandable or readable translations and 3) Wrong translations. Complex sentences that involve various clauses, exclamations, negations and questions give wrong reordering outputs because they are not handled in the system. Words that are not available in the dictionary, and phrases in sentences, resulted in semantically wrong translations. Consider the example below, where the translation is in the wrong order and the word ‘how’ also got translated.
Eg: He teaches them how to read » aവന് aവെരtെnെയ വായിkുക e െന പഠിpിkുnു
There are some sentences which are correct in semantics and reordering but are wrong in the particular context. Also, some translations convey the meaning but not in a clear sense. These kinds of sentences are considered as understandable, readable or acceptable translations. Consider the example below, where the translation is correct word by word, but ‘powerful’ and ‘strikes’ are wrong in context.
Eg: Powerful earthquake strikes Philippines » ഫിലിpിേനസകള് വീരനായ ഭൂമികുലുkം-പണിമുടkുകള്
The correct outputs of sentences without any errors are considered as exact translations. Consider the example below, where the translation is semantically, morphologically and syntactically exact.
Eg: We celebrated his birthday » ഞ ള് aവന്െറ ജന്മദിനം ആേഘാഷിc
The major problem with the rule based system is that the same rule is executed for different sentences with the same structure, even when they require different transfer rules in the target.
Eg: Please open the door » കതകിെന തുറn സേnാഷിpിkുക
The sentence follows the reordering rule (VP:VB PRT NP) (VP:NP PRT VB), which gives syntactically correct output for all other sentences with the same structure, like ‘Wake up the kid’, but the same rule gives wrong output for the above sentence.
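The way a reordering rule such as (VP:VB PRT NP) (VP:NP PRT VB) acts on a parse tree can be sketched as follows. This is a minimal illustration and not the code of the system: the `Node` class, the `RULES` table and the function names are hypothetical, and the tree for ‘Wake up the kid’ is written by hand instead of being taken from the parser.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """A parse tree node; leaf words are stored as plain strings."""
    label: str
    children: List[Union["Node", str]] = field(default_factory=list)


# (parent label, source child labels) -> target child labels.
# Only two rules are listed here for illustration.
RULES = {
    ("VP", ("VB", "PRT", "NP")): ("NP", "PRT", "VB"),
    ("VP", ("VB", "NP")): ("NP", "VB"),
}


def reorder(node):
    """Reorder children bottom-up wherever a rule matches."""
    if isinstance(node, str):
        return node
    kids = [reorder(child) for child in node.children]
    labels = tuple(k.label if isinstance(k, Node) else k for k in kids)
    target = RULES.get((node.label, labels))
    if target is not None:
        remaining = list(kids)
        ordered = []
        for wanted in target:
            # take the first unused child carrying the wanted label
            index = next(i for i, k in enumerate(remaining)
                         if isinstance(k, Node) and k.label == wanted)
            ordered.append(remaining.pop(index))
        kids = ordered
    return Node(node.label, kids)


def leaves(node):
    """Collect the leaf words from left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node.children:
        words.extend(leaves(child))
    return words


tree = Node("VP", [
    Node("VB", ["wake"]),
    Node("PRT", [Node("RP", ["up"])]),
    Node("NP", [Node("DT", ["the"]), Node("NN", ["kid"])]),
])
print(" ".join(leaves(reorder(tree))))
```

Because the rule is selected only by the node labels, ‘Please open the door’ is reordered by the same rule whenever its parse has the same structure, which is the limitation discussed above.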
The system gives an overall accuracy of 53.63 percent for exact translations of the test sentences used. The complete list of tested sentences with rankings is given in Appendix – E.1. The system accuracy could be improved by increasing the number of lexical items in the bilingual dictionary, increasing the number of rules and considering more inflections of Malayalam morphology. The accuracy can also be improved by taking care of semantic ambiguity among the lexical items. The semantic ambiguity could be resolved by linking the dictionary with the Malayalam WordNet.
The screenshot of a sample output of the machine translation system is shown in fig.5.1. The first text box in the translation window is for typing the input sentence. The second text box gives the corresponding Malayalam translation when the ‘Translate’ button is pressed. The third box shows the parsed output with the mapping of English words to equivalent Malayalam words, along with the reordered structure. The fourth text box shows the morphological generation of Malayalam nouns and verbs. The fifth box shows the occurrence of the words of the input sentence in the bilingual dictionary along with their synonyms, if available. The font used in the ‘Netbeans IDE’ is ‘Arial Unicode MS’, in order to display both Malayalam and English words. The Malayalam letters are displayed in a separated fashion because of a font rendering problem with the ‘jdk’. When the output is displayed in a text editor or html document, however, the combined letters of Malayalam are rendered properly, by setting the default font to ML-TT Karthika in the text editor or UTF-8 encoding in html. The screenshot of the online translation system is shown in fig.5.2, where the font of the Malayalam text is rendered properly. The system is available online at http://nlp.amrita.edu:8080/Eng2Mal/ .
The screenshots of sample outputs of the Malayalam morph generator and analyzer for nouns are shown in fig.5.3 and fig.5.4 respectively, and those for verbs are shown in fig.5.5 and fig.5.6 respectively. Here the command ‘mg.sh’ executes the morph generator for the given input and the command ‘ma.sh’ executes the morph analyzer for the given input. The input is given along with the executable command, and the outputs are displayed on the lines following the input.
5.5 Screenshots
Fig.5.1 Screenshot of Machine Translation system
Fig.5.2 Screenshot of online translation system with proper font rendering of Malayalam text
Fig.5.3 Screenshot of Morph Generator for Malayalam nouns
Fig.5.4 Screenshot of Morph Analyzer for Malayalam nouns
Fig.5.5 Screenshot of Morph Generator for Malayalam verbs
Fig.5.6 Screenshot of Morph Analyzer for Malayalam verbs
CHAPTER 6
CONCLUSION
A rule based machine translation system for English to Malayalam has been developed. The main focus of the thesis is the morphological synthesizer used in the rule based system; the morphology is modeled as a state diagram with transitions. The reason for this focus is that all other components depend only on source text processing, for which tools are already developed and available as open source to aid the development of machine translation systems from English into various languages. The goal of developing such a translation system is to make the resources available to everyone.
6.1 Limitations
The machine translation system has many unavoidable limitations arising from the NLP modules it uses. Multiple parse trees are not handled by the Stanford parser, and the dependency parser is not used in the translation system, so the system cannot handle verbal phrases. The transliterator is limited to Indian names, so international names give wrong transliterations. The bilingual dictionary lacks word sense information, so semantic ambiguity arises for many words. The morph generator is implemented for certain cases, but the parser does not give the dependency information for many inflectional categories; such cases work well in the morph generator but not in the translation of sentences. The reordering rules are confined to the nodes of the branches, and the same rule cannot be handled differently for different cases with the same syntactic structure. The system has to be improved by resolving all the ambiguities discussed earlier.
6.2 Applications
The translation system is flexible enough to be further developed for speech access and speech to speech translation with human computer interaction. If a user gives a speech input, the system has to be smart enough to recognize it, translate the documents into the user’s language and read them aloud. Such systems would improve the literacy rate in the country if they are installed at public places like ATM machines or carried as a source of education in mobile schools. With these hopes we are releasing our machine translation system as open source software, to be used by anyone who wants to contribute to society. The machine translation system will be useful in schools for pedagogical purposes in teaching grammar. The corpus for SMT systems could be created using this RBMT system. The system could be installed in hand held devices such as mobile phones, for easy access to language. The system might be installed in restaurants and hotels for translation of food menus. A good translation system will be able to provide all educational resources available on the internet in the local language, to improve our society.
6.3 Future Work
i. The system can be utilized as such for developing speech systems. A smart and intelligent speech to speech translation system with human computer interaction will have a high impact on society.
ii. The system can be further enhanced by using a massive bilingual dictionary database for a better choice of words.
iii. The major part of morphology is covered; more morphological categories can be handled and further reordering rules can be added.
iv. The phrase based approach can be used to develop a translation system.
v. Rules always evolve as the language evolves. Therefore it is always better to have a statistical machine translation system for prolonged usage.
vi. A Word Sense Disambiguation system can be developed on the target side for Malayalam to English translation, to avoid semantic ambiguities.
vii. Translation memories could be used for handling idioms, phrases and proverbs to aid children.
viii. The word alignment system in [23] could be used for handling question and negative type sentences.
ix. More data for the Indian languages could be made available on the internet in the public interest, to favor translation oriented research.
x. Every student should take up translation work as an initiative by contributing data, which still remains a coveted resource for research.
REFERENCES
[1] Philipp Koehn, Statistical Machine Translation, Cambridge University Press, 2010.
[2] Durgesh Rao, “Machine Translation in India: A Brief Survey”, in Proc. of SCALLA 2001 Conf., Bangalore, India, 2001.
[3] Chris Manning and Hinrich Schutze, Foundations of Statistical Natural Language Processing, MIT Press, Cambridge, MA, May, 1999.
[4] C. Rahul, K. Dinunath, Remya Ravindran and K. P. Soman, “Rule Based Reordering and Morphological Processing For English-Malayalam Statistical Machine Translation”, in Proc. of Int. Conf. on Advances in Computing, Control, and Telecommunication Technologies (ACT), Trivandrum, India, December, 2009.
[5] P. G. Raji, “Reordering Approach in English-Malayalam Statistical Machine Translation”, Master’s Thesis, Department of Computational Engineering and Networking, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India, July, 2010.
[6] P. Unnikrishnan, P. J. Antony and K. P. Soman, “A Novel Approach for English to South Dravidian Language Statistical Machine Translation System”, in Int. Journal on Computer Science and Engineering (IJCSE), ISSN: 2749-2759, Vol. 2, No. 8, November, 2010.
[7] Mary Priya Sebastian, K. Sheena Kurian and G. Santhosh Kumar, “English to Malayalam Translation: A Statistical Approach”, in Proc. of the 1st Amrita ACM-W Celebration on Women in Computing, India, September, 2010.
[8] Mary Priya Sebastian, K. Sheena Kurian and G. Santhosh Kumar, “Alignment Model and Training Technique in SMT from English to Malayalam”, in Contemporary Computing, Vol. 94, pp. 305-315, 2010.
[9] Rashmi Gangadharaiah and N. Balakrishnan, “Application of Linguistic Rules to Generalized Example Based Machine Translation for Indian Languages”, in Proc. of the First National Symposium on Modeling and Shallow Parsing of Indian Languages (MSPIL), Mumbai, India, April, 2006.
[10] R. Harshawardhan, Mridula Sara Augustine and K. P. Soman, “Phrase based English – Tamil Translation System by Concept Labeling using Translation Memory”, in Int. Journal of Computer Applications (IJCA), ISSN: 0975 – 8887, Vol. 20, No. 3, April, 2011.
[11] R. Harshawardhan, Mridula Sara Augustine and K. P. Soman, “Advanced English – Malayalam Translation Memory for Natural Language Processing Applications”, in Proc. of Nat. Conf. on Indian Language Computing (NCILC), February, 2011.
[12] Kamaljeet Kaur Batra and G. S. Lehal, “Rule Based Machine Translation of Noun Phrases from Punjabi to English”, in Int. Journal of Computer Science Issues (IJCSI), ISSN: 1694-0814, Vol. 7, Issue 5, September, 2010.
[13] Centre for Development of Advanced Computing (CDAC), Annual Report, 2007-2008 [Online]. Available at: http://www.cdacindia.com/html/about/annual.aspx
[14] Remya Rajan, Remya Sivan, Remya Ravindran and K. P. Soman, “Rule Based Machine Translation from English to Malayalam”, in Proc. of Int. Conf. on Advances in Computing, Control and Telecommunication Technologies (ACT), Trivandrum, India, December, 2009.
[15] Rajeev Sangal, S.M. Bendre, Pavan Kumar and Aishwarya, “Unsupervised Improvement of Morphological Analyzer for Inflectionally Rich Languages”, in Proc. of 6th NLP Pacific Rim Symposium, Tokyo, November, 2001.
[16] A.G. Menon, S. Saravanan, R. Loganathan and K. P. Soman, “Amrita Morph Analyzer and Generator for Tamil: a Rule based approach”, in Proc. of Int. Tamil Internet conference 2009, Univ. of Cologne, Germany, October, 2009.
[17] Jisha P. Jayan, R.R. Rajeev and S. Rajendran, “Morphological Analyser for Malayalam - A Comparison of Different Approaches”, in Int. Journal of Computer Science and Information Technology (IJCST), Vol. 2, No. 2, pp. 155-160, December, 2009.
[18] Jisha P. Jayan, R. R. Rajeev and S. Rajendran, “Morphological Analyser and Morphological Generator for Malayalam - Tamil Machine Translation”, in Int. Journal of Computer Applications (IJCA), ISSN: 0975 – 8887, Vol. 13, No. 8, January, 2011.
[19] S. Saravanan, A. G. Menon and K. P. Soman, “Pattern Based English-Tamil Machine Translation”, in Proc. of Tamil Internet conference, Chap. 4, No. 6, pp. 295 - 300, Coimbatore, India, 2010.
[20] R. E. Asher and T. C. Kumari, Malayalam, Routledge, 1997.
[21] Sumaja Sasidharan, R. Loganathan and K. P. Soman, “English to Malayalam Transliteration Using Sequence Labeling Approach”, in Int. Journal of Recent Trends in Engineering, Vol. 1, No. 2, May, 2009.
[22] Daniel Jurafsky and James H. Martin, Speech and Language Processing - An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Prentice Hall, Second Edition, 2009.
[23] R. Harshawardhan, Mridula Sara Augustine and K. P. Soman, “A Simplified Approach to Word Alignment Algorithm for English-Tamil Translation”, in Indian Journal of Computer Science and Engineering (IJCSE), ISSN: 0976-5166, Vol. 2, No. 1, 2011.
APPENDIX - A
A.1. Penn Treebank Tag set for POS category

Tag | Description | Examples
'' | closing quotation mark | ' ''
-- | dash | --
$ | dollar | $ -$ --$ A$ C$ HK$ M$ NZ$ S$ U.S.$ US$
( | opening parenthesis | ( [ {
) | closing parenthesis | ) ] }
, | comma | ,
. | sentence terminator | . ! ?
: | colon or ellipsis | : ; ...
`` | opening quotation mark | ` ``
CC | conjunction, coordinating | & 'n and both but either et for less minus neither
CD | numeral, cardinal | mid-1890 nine-thirty one-tenth ten million 0.5
DT | determiner | all an another any each neither no some such that the
EX | existential there | there
FW | foreign word | gemeinschaft hund ich jeux habeas
IN | preposition or conjunction, subordinating | astride among upon whether out inside pro despite on
JJ | adjective or numeral, ordinal | third ill-mannered pre-war regrettable oiled calamitous
JJR | adjective, comparative | bleaker braver breezier briefer brighter brisker broader
JJS | adjective, superlative | calmest cheapest choicest classiest cleanest
LS | list item marker | A A. B B. D E F First I J K One SP-44001 SP-44002 Third
MD | modal auxiliary | can cannot could couldn't dare may might must
NN | noun, common, singular or mass | common-carrier cabbage knuckle-duster Casino
NNP | noun, proper, singular | Motown Venneboerger Czestochwa Ranzer Conchita
NNPS | noun, proper, plural | Americans Americas Amharas Amityvilles Amusements
NNS | noun, common, plural | undergraduates scotches bric-a-brac products
PDT | pre-determiner | all both half many quite such sure this
POS | genitive marker | ' 's
PRP | pronoun, personal | hers herself him himself hisself it itself me myself
PRP$ | pronoun, possessive | her his mine my our ours their thy your
RB | adverb | occasionally unabatingly maddeningly adventurously
RBR | adverb, comparative | further gloomier grander graver greater grimmer harder
RBS | adverb, superlative | best biggest bluntest earliest farthest first furthest
RP | particle | aboard about across along apart around aside at away
SYM | symbol | % & ' '' ''. ) ). * + ,. < = > @ A[fj] U.S U.S.S.R \* \*\* \*\*\*
TO | to as preposition or infinitive marker | to
UH | interjection | Goodbye Goody Gosh Wow Jeepers Jee-sus Hubba Hey
VB | verb, base form | ask assemble assess assign assume atone attention
VBD | verb, past tense | dipped pleaded swiped regummed soaked tidied
VBG | verb, present participle or gerund | telegraphing stirring focusing angering judging
VBN | verb, past participle | multihulled dilapidated aerosolized chaired
VBP | verb, present tense, not 3rd person singular | predominate wrap resort sue twist spill cure
VBZ | verb, present tense, 3rd person singular | bases reconstructs marks mixes displeases seals
WDT | WH-determiner | that what whatever which whichever
WP | WH-pronoun | that what whatever whatsoever which who whom
WP$ | WH-pronoun, possessive | whose
WRB | Wh-adverb | how however whence whenever where whereby
APPENDIX - B
B.1. Hand coded Reordering Rules for RBMT
1:3 2:2 3:1 1:2 2:1 1:2 2:1 1:2 2:1 1:2 2:1 1:2 2:1 1:2 2:1 1:2 2:1 1:3 2:2 3:1 1:2 2:1 1:2 2:1 1:2 2:1 1:2 2:1 1:2 2:3 3:1 1:3 2:1 3:2 1:2 2:1 1:2 2:1 1:2 2:3 3:4 4:5 5:1 1:2 2:1 1:2 2:1 1:3 2:2 3:1 1:3 2:2 3:1 1:4 2:1 3:2 4:3 1:2 2:1 1:2 2:3 3:1 1:2 2:3 3:1 1:2 2:3 3:1 1:2 2:3 3:4 4:5 5:1 1:2 2:1 1:2 2:3 3:4 4:1 1:2 2:3 3:1 1:3 2:1 3:2 1:3 2:2 3:1 1:2 2:1 1:2 2:1 1:2 2:1 1:2 2:1 1:4 2:2 3:3 4:1 1:2 2:3 3:4 4:1 1:2 2:1 1:2 2:1 1:2 2:1 1:2 2:1 1:2 2:1
SBAR NP PP NP SBAR NP VP NP NP IN PP IN S IN NP TO S IN RB S WHADVP S WHNP S WHPP S IN ADVP VP MD VP MD RB VP MD VP TO ADVP S VB* ADJP VB* ADVP VBZ PP ADVP VB* VP ADVP VB* PP VB* CC VB* NP VB* NP ADVP VB* NP NP VB* NP PP VB* NP PP , PP VB* PP VB* PP , PP VB* PP SBAR VB* ADJP VB* RB VP RB VB* S VB* SBAR VB* VP VB* WHNP IN SBAR NP NP VB* ADVP ADJP PP VB* PP ADJP PP JJ S JJ PP VB* SBAR ADVP
NP SBAR NP PP NP SBAR NP VP IN NP IN PP IN S TO NP RB IN S WHADVP S WHNP S WHPP S IN S MD ADVP VP MD RB VP MD VP TO VP VB* ADVP S VB* ADJP VB* ADVP VB* ADVP PP VB* ADVP VP VB* CC VB* PP VB* NP VB* NP ADVP VB* NP NP VB* NP PP VB* NP PP , PP VB* PP VB* PP , PP VB* PP SBAR VB* RB ADJP VB* RB VP VB* S VB* SBAR VB* VP IN WHNP VB* NP NP SBAR VB* ADVP ADJP PP ADJP PP JJ PP JJ S VB* PP ADVP SBAR
NP NP NP NP PP PP PP PP SBAR SBAR SBAR SBAR VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP WHPP VP VP ADJP ADJP ADJP ADJP ADVP
RB RB NP PP NP-TMP DT NN CD DT NN S JJ NN S NN S NP NP NP NP QP RB RB CD NN CC NP IN ADVP VB* NP IN CD TO CD RB JJR IN CD ADVP NP VP NP ADVP VP IN S XS WHNP SQ PP VP NP VB* NP ADJP MD NP VP MD NP VP MD RB NP VP
RB RB PP NP-TMP NP CD DT NN S NN DT S JJ NN S NN NP NP NP NP RB QP CD NN RB NP CC ADVP IN NP VB* CD TO CD IN CD IN RB JJR NP ADVP VP NP VP ADVP S IN SX SQ WHNP PP NP PP NP ADJP VB* NP VP MD NP VP MD NP VP RB MD
SQ
S MD NP VP
S NP VP MD
SQ SQ SQ SQ SQ VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP VP
VB* NP ADJP VB* NP NP VB* NP PP VB* NP VP VB* NP VP ADVP VB* NP ADVP VB* SBAR NN ADVP SBAR VB* ADJP , SBAR VB* ADJP ADVP VB* ADJP PP VB* ADVP ADJP VB* ADVP ADVP VB* ADVP NP VB* ADVP NP-TMP VB* FRAG VB* NP NP-TMP VB* NP PP PP VB* NP SBAR VB* NP-TMP VB* PP ADVP VB* PP NP-TMP
NP ADJP VB* NP NP VB* PP NP VB* NP VP VB* NP VP VB* NP ADVP VB* ADVP SBAR VB* ADVP NN SBAR ADJP VB* , SBAR ADVP ADJP VB* PP ADJP VB* ADVP ADJP VB* ADVP ADVP VB* ADVP NP VB* NP-TMP ADVP VB* FRAG VB* NP-TMP NP VB* PP PP NP VB* NP SBAR VB* NP-TMP VB* ADVP PP VB* PP NP-TMP VB*
1:2 2:1 1:2 2:3 3:1 4:4 1:3 2:1 3:2 1:3 2:2 3:1 1:3 2:1 3:2 1:2 2:1 1:2 2:1 1:2 2:1 3:3 1:2 2:1 1:2 2:3 3:1 1:2 2:1 1:2 2:1 1:2 2:1 1:2 2:3 3:4 4:1 1:4 2:3 3:1 4:2 1:2 2:1 3:3 4:4 1:1 2:3 3:2 4:4 1:2 2:1 1:2 2:1 1:2 2:1 3:3 1:1 2:3 3:2 4:4 1:2 2:3 3:1 1:2 2:3 3:1 1:2 2:3 3:1 4:4 1:3 2:4 3:2 4:1 5:5 1:1 2:2 3:4 4:5 5:3 6:6 1:2 2:3 3:1 4:4 1:2 2:3 3:1 4:4 1:3 2:2 3:1 1:2 2:3 3:1 1:2 2:3 3:1 4:4 1:3 2:1 3:2 1:1 2:3 3:2 1:2 2:1 3:3 1:2 2:1 3:3 4:4 1:3 2:2 3:1 1:3 2:2 3:1 1:2 2:3 3:1 1:3 2:2 3:1 1:2 2:3 3:1 1:3 2:2 3:1 1:2 2:1 1:3 2:2 3:1 1:3 2:4 3:2 4:1 1:2 2:3 3:1 1:2 2:1 1:3 2:2 3:1 1:2 2:3 3:1
ADVP FRAG NP NP NP NP NP NP NP NP PP PP PP QP QP S S SBAR SBAR SBARQ SINV SINV SQ SQ SQ
VP VP VP VP VP VP VP
VB* PP PP VB* PP S VB* PRT VB* PRT NP VB* PRT NP-TMP PP VB* PRT PP VB* RB NP
PP PP VB* S PP VB* PRT VB* NP PRT VB* NP-TMP PRT PP VB* PP PRT VB* NP VB* RB
1:2 2:3 3:1 1:3 2:2 3:1 1:2 2:1 1:3 2:2 3:1 1:3 2:2 3:4 4:1 1:3 2:2 3:1 1:3 2:1 3:2
APPENDIX - C
C.1. Inflection Markers used in morphology of Malayalam Nouns

NOMINAL INFLECTION | INFLECTION MARKER
Noun | N
Plural | PL
Accusative case | ACC
Dative case | DAT
Locative case | LOC
Sociative case | SOC
Instrumental case | INS
Genitive case | GEN
Ablative case | ABL
Benefactive case | BEN
Adjectivization | ADJZ
Adverbalization | ADVZ
Clitics_um | CLI_um
Clitics_E | CLI_E
Clitics_A | CLI_A
Clitics_O | CLI_O
Clitics_tanne | CLI_tanne
Clitics_ANO | CLI_ANO

C.2. Inflection Markers used in morphology of Malayalam Verbs

VERBAL INFLECTION | INFLECTION MARKER
Verb | V
Intransitive | INT
Transitive | TRA
Causative | CAU
Past Tense | PAST
Present Tense | PRES
Future Tense | FUT
Continuous | CONT
Perfect | PERF
Perfect Continuous | PERFCONT
Passive | PART
Auxiliary-can | AUX_CAN
Auxiliary-may | AUX_MAY
Auxiliary-should | AUX_SHOULD
Auxiliary-could | AUX_COULD
Auxiliary-would | AUX_CAN
Auxiliary-might | AUX_MAY
Auxiliary-shall | FUT
Negative | NEG_NOT
Question | QUES
Infinite | INF
APPENDIX - D D.1. FST State Transition Table modeled for Noun Morphotactics
e e f1 f2 f3 f4 f5 f6 f7 f8 g g g g g g g g
ACCUSATIVE DATIVE LOCATIVE SOCIATIVE INSTRUMENTAL GENETIVE ABLATIVE BENEFICIAL
kaL
: : : : : : : :
e in il OT Al uTe ilninnu inuvENTi
C
ot
o D
: :
h h
ADJECTIVIZATION ADJECTIVIZATION
: :
Aya Aya
i i
ADVERBALIZATION ADVERBALIZATION
: :
Ayi Ayi
j j j k1 k2 k3 k4 k5 k6 l l l l l
CLITICS1 CLITICS2 CLITICS3 CLITICS4 CLITICS5 CLITICS6
: : : : : :
um E A O tanne ANO
80
SURFACE TAPE
y
LEXICAL TAPE CHARACTER SET NOUN PLURAL
op
NEXT STATE b c d
N
CURRENT STATE a b c d: c d e e e e e e e e f1 f2 f3 f4 f5 f6 f7 f8 g: c d h: c d i: c d g j j j j j j k1 k2 k3 k4 k5
k6 l:
l
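How a morphotactics table of this kind drives generation can be sketched in a few lines. The sketch below is a simplification under stated assumptions and not the FST of the thesis: the state names, the `TRANSITIONS` table and the `generate` function are hypothetical, case markers and ADJZ/ADVZ are assumed to share one slot, and the output is the raw concatenation of stem and surface markers before any orthographic rule is applied. The tags follow Appendix C.1 and the surface strings follow the surface tape above.

```python
# Lexical tag -> surface string (before orthographic rules are applied)
SURFACE = {
    "PL": "kaL",
    "ACC": "e", "DAT": "in", "LOC": "il", "SOC": "OT", "INS": "Al",
    "GEN": "uTe", "ABL": "ilninnu", "BEN": "inuvENTi",
    "ADJZ": "Aya", "ADVZ": "Ayi",
    "CLI_um": "um", "CLI_E": "E", "CLI_A": "A", "CLI_O": "O",
    "CLI_tanne": "tanne", "CLI_ANO": "ANO",
}


def tag_class(tag: str) -> str:
    """Map a tag to the slot it fills in the simplified state diagram."""
    if tag in ("N", "PL"):
        return tag
    if tag.startswith("CLI_"):
        return "CLITIC"
    return "SUFFIX"  # assumption: case markers, ADJZ and ADVZ share a slot


# Simplified state diagram: stem -> N -> (PL) -> (SUFFIX) -> (CLITIC)
TRANSITIONS = {
    "start": {"N": "noun"},
    "noun": {"PL": "plural", "SUFFIX": "suffix", "CLITIC": "clitic"},
    "plural": {"SUFFIX": "suffix", "CLITIC": "clitic"},
    "suffix": {"CLITIC": "clitic"},
    "clitic": {},
}


def generate(stem: str, tags, boundary: str = "+") -> str:
    """Walk the state diagram and join the stem with the surface markers."""
    state = "start"
    parts = [stem]
    for tag in tags:
        if tag != "N" and tag not in SURFACE:
            raise ValueError(f"unknown tag: {tag}")
        next_state = TRANSITIONS[state].get(tag_class(tag))
        if next_state is None:
            raise ValueError(f"tag {tag} is not allowed in state {state}")
        state = next_state
        if tag != "N":
            parts.append(SURFACE[tag])
    if state == "start":
        raise ValueError("the tag sequence must begin with N")
    return boundary.join(parts)


print(generate("amma", ["N", "PL", "ACC"]))
print(generate("amma", ["N", "DAT", "CLI_um"]))
```

A tag sequence that does not follow the slot order, such as a plural after a case marker, is rejected because no transition exists for it.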
D.2. FST State Transition Table modeled for Verb Morphotactics a
b
b c:
c
LEXICAL TAPE CHARACTER SET VERB
c d d d e1 e2 e3 f:
d e1 e2 e3 f f f
INTRANSITIVE TRANSITIVE CAUSATIVE
c f k m1 o q1 g g g h1 h2 h3 i:
g g g g g g h1 h2 h3 i i i
D
o
:
uka
: : :
ikk ippikk
: : :
unnu um u
C
N
ot
PRESENT FUTURE PAST
i j k:
j k
CONTINUOUS
:
koNTirikk
c f l m1:
l l m1
PARTICIPLE
:
appeT
i n o:
n o
PERFECT
:
iTTuNT
81
SURFACE TAPE
y
NEXT STATE
op
CURRENT STATE
i
p
p
q1
PERFECT CONTINUOUS
:
koNTAyirikk
: : :
AM AnpaRRi aNaM
q1: r r r r r r s1 s2 s3 t t t
AUX_CAN AUX_COULD AUX_SHOULD
i u v:
u v
AUX_MAY
:
EkkAM
i t v w x1:
w w w x1
NEG_NOT
:
illa
y1 y1 y1 y1 z
QUES
:
O
aa aa bb
INF
:
An
op
C
ot N
D
o
i t v x1 y1 z: c f aa bb:
82
y
c f k m1 o q1 r r r s1 s2 s3 t:
D.3. Orthographic Rules for Malayalam Nouns #Plural "kaL"
avan[b2]kaL ivaL[b2]kaL avaL[b2]kaL tAn[b2]kaL nI[b2]kaL njAn[b2]kaL nAM[b2]kaL amma[b2]kaL accan[b2]kaL cETTan[b2]kaL cEcci[b2]kaL makan[b2]kaL makaL[b2]kaL ammAyi[b2]kaL muttaccan[b2]kaL muttaSSi[b2]kaL
/
__
ivar
/
__
avar
/
__
ivar
/
__
avar
/
__
tAngngaL
/
__
ningngaL
/
__
njangngaL
/
__
nammaL
/
__
/
__
/
__
/
__
/
__
makkaL
/
__
makkaL
/
__
ammAvanmAr
/
__
ammAyimAr
/
__
muttaccanmAr
/
__
muttaSSimAr
/
__
avar
/
__
ivar
/
__
mantrimAr
/
__
manushyar
/
__
rAjAkkanmAR
/
__
tampurAkkannmAR
/
__
tampurATTikaL
/
__
kalAkAranmAR
/
__
anujanmAR
/
__
anujattimAR
/
__
AngaLamAR
/
__
pengnganmAR
/
__
ANungngaL
/
__
peNNungngaL
/
__
AtmAkkaL
/
__
addhyApakanmAR
/ __
addhyapikamAR
/ __
ammamAr accanmAr cETTanmAr cEccimAr
N
ayAL[b2]kaL
__
ava
ot
ammAvan[b2]kaL
/
iyAL[b2]kaL
mantri[b2]kaL
manushyan[b2]kaL
o
rAjAv[b2]kaL
tampurAn[b2]kaL
D
tampurATTi[b2]kaL kalAkAraN[b2]kaL anujan[b2]kaL anujatti[b2]kaL AngngaLa[b2]kaL pengngaL[b2]kaL AN[b2]kal peNN[b2]kal AtmAv[b2]kal
addhyApakan[b2]kaL addhyApika[b2]kaL
83
y
ivan[b2]kaL
iva
op
atu[b2]kaL
‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐> ‐>
C
itu[b2]kaL
bhaTan[b2]kaL vakkIl[b2]kaL purushan[b2]kaL bhArya[b2]kaL ay[b2] NN[b2] M[b2]k [b2] [b2] [b2]
/ __
‐> bhaTanmAR ‐> vakkIlanmAR ‐> purushanmAR ‐> bhAryamAR
/ __
‐> ‐> ‐> ‐> ‐> ‐>
aya
/
__ kaL
NNu
/
__ kaL
ngng
/
__ aL
k
/ (u|U) __ kaL
u
/ [CON] __ kaL
/ __
/ __ / __ / __
y
sahOdari[b2]kaL
‐> sahOdaranmAR ‐> sahOdarimAR
[]
op
sahOdaran[b2]kaL
/ (a|A|i|I|e|E) __ kaL
# CASE MARKERS
‐> ‐>
njAn[b2]e
[b2] [b2] M[b2]
D
o
[b2]
njAn[b2]n nI[b2]n [b2]e [b2]n [b2]n [b2]
M[b2][] [b2]
/
__
ninne
/
__
‐> ‐> ‐> ‐> ‐> ‐>
[]
/ (L) __ e
y
/ (a|A|i|I|e|E) __ e
v
/ (u|o) __ e
vin
/ (U|O) __ e
tt
/
in
/ [CON] __ e
N
[b2] [b2]
enne
ot
nI[b2]e
C
# Accusative "e"
# Dative "n" ‐> enikk ‐> ninakk ‐> u
/
__
/
__
‐> ‐> ‐> ‐> ‐>
kk
/ (L) __
kk
/ (a|A|i|I|e|E) __
vi
/ (u|U|o|O) __ n
tti
/
i
/ [CON] __ n
/ n __
84
__ e
__ n
# Locative "il"
nI[b2]il
[b2] [b2] M[b2] [b2]
‐> ‐>
ennil
/
__
ninnil
/
__
‐> ‐> ‐> ‐>
y
/ (a|A|i|I|e|E) __ il
v
/ (u|U|o|O) __ il
tt
/
[]
/ [CON] __ il
[b2] [b2] M[b2][] [b2]
ennOT
‐> ‐> ‐> ‐> ‐>
[]
/ (L) __ OT
y
/ (a|A|i|I|e|E) __ OT
ninnOT
vin
/
__
/
__
/ (u|U|o|O) __ OT
ot
[b2]
‐> ‐>
C
nI[b2]OT
op
# Sociative "OT" njAn[b2]OT
__ il
y
njAn[b2]il
ttin
/
__ OT
in
/ [CON] __ OT
N
# Instrumental "Al"
njAn[b2]Al
o
nI[b2]Al
D
[b2] [b2]
M[b2] [b2]
‐> ‐>
ennAl
/
__
ninnAl
/
__
‐> ‐> ‐> ‐>
y
/ (a|A|i|I|e|E) __ Al
v
/ (u|U|o|O) __ Al
tt
/
[]
/ [CON] __ Al
__ Al
# Genitive "e" njAn[b2]Te nI[b2]Te
‐> ‐>
enRe
/
__
ninRe
/
__
85
‐> ‐> ‐> ‐> ‐>
[b2]Te [b2]Te [b2]Te M[b2]Te [b2]Te
uTe
/ (L) __
yuTe
/ (a|A|i|I|e|E) __
vinRe
/ (u|U|o|O) __
attinRe
/
inRe
/ [CON] __
__
# Benefactive "inuvENTi"
[b2]in [b2]in [b2] M[b2][]
__
ninakkuvENTi
/
__
‐> ‐> ‐> ‐> ‐>
kk
/ (L) __ uvENTi
kk
/ (a|A|i|I|e|E) __ uvENTi
v tt
y
/
/ (u|U|o|O) __ inuvENTi /
[]
ot
[b2]
enikkuvENTi
op
nI[b2]inuvENTi
‐> ‐>
C
njAn[b2]inuvENTi
__ inuvENTi
/ [CON] __ inuvENTi
# Ablative "ilninn"
njAn[b2]ilninn
[b2]
ennilninn
/
__
ninnilninn
/
__
‐> ‐> ‐> ‐>
y
/ (a|A|i|I|e|E) __ ilninn
v
/ (u|U|o|O) __ ilninn
att
/
[]
/ [CON] __ ilninn
N
nI[b2]ilninn
‐> ‐>
o
[b2]
M[b2]
D
[b2]
__ ilninn
# RootWord + N + ADJECTIVIZATION
njAn[b2]Aya ‐> njAnAya / __
nI[b2]Aya ‐> nIyAya / __
[b2] ‐> y / (a|A|i|I|e|E) __ Aya
[b2] ‐> v / (u|U|o|O) __ Aya
M[b2] ‐> m / __ Aya
[b2] ‐> [] / [CON] __ Aya
# RootWord + N + ADVERBIALIZATION
njAn[b2]Ayi ‐> njAnAyi / __
nI[b2]Ayi ‐> nIyAyi / __
[b2] ‐> y / (a|A|i|I|e|E) __ Ayi
[b2] ‐> v / (u|U|o|O) __ Ayi
M[b2] ‐> m / __ Ayi
[b2] ‐> [] / [CON] __ Ayi
# RootWord + N + Clitics " um "
njAn[b2]um ‐> njAnum / __
nI[b2]um ‐> nIyum / __
[b2]u ‐> yu / (a|A|i|I|e|E) __ m
[b2]u ‐> vu / (u|U|o|O) __ m
M[b2]u ‐> vu / __ m
[b2]u ‐> u / [CON] __ m
# RootWord + N + Clitics " E "
njAn[b2]E ‐> njAnE / __
nI[b2]E ‐> nIyE / __
[b2] ‐> y / (a|A|i|I|e|E) __ E
[b2] ‐> v / (u|U|o|O) __ E
M[b2] ‐> tt / __ E
[b2] ‐> [] / [CON] __ E
# RootWord + N + Clitics " A "
njAn[b2]A ‐> njAnA / __
nI[b2]A ‐> nIyA / __
[b2] ‐> y / (a|A|i|I|e|E) __ A
[b2] ‐> v / (u|U|o|O) __ A
M[b2] ‐> m / __ A
[b2] ‐> [] / [CON] __ A
# RootWord + N + Clitics " O "
njAn[b2]O ‐> njAnO / __
nI[b2]O ‐> nIyO / __
[b2] ‐> y / (a|A|i|I|e|E) __ O
[b2] ‐> v / (u|U|o|O) __ O
M[b2] ‐> m / __ O
[b2] ‐> [] / [CON] __ O
# RootWord + N + Clitics " tanne "
njAn[b2]tanne ‐> njAntanne / __
nI[b2]tanne ‐> nItanne / __
[b2] ‐> [] / __ tanne
# RootWord + N + Clitics " ANO"
njAn[b2]ANO ‐> njAnANO / __
nI[b2]ANO ‐> nIyANO / __
[b2] ‐> y / (a|A|i|I|e|E) __ ANO
[b2] ‐> v / (u|U|o|O) __ ANO
M[b2] ‐> m / __ ANO
[b2] ‐> [] / [CON] __ ANO
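All of the noun rules above share one shape, `LHS -> RHS / LEFT __ RIGHT`, where `[]` stands for the empty string. As a rough sketch of how such a line could be turned into an executable rewrite, the following code compiles one rule into a function on word forms. The names `compile_rule` and `context_to_regex` are hypothetical, and reading `[CON]` as "any Latin letter other than a, e, i, o, u" is an assumption. The rule file does not say how the actual system interprets these lines.

```python
import re

# Assumption: [CON] is read as "any Latin letter other than a, e, i, o, u".
CON = "[b-df-hj-np-tv-zB-DF-HJ-NP-TV-Z]"


def context_to_regex(context):
    """Translate one side of a rule context into a regex fragment."""
    parts = []
    for token in re.split(r"(\[CON\]|\([^()]*\))", context.strip()):
        if token == "[CON]":
            parts.append(CON)
        elif token.startswith("("):
            # An alternation such as (a|A|i|I|e|E): escape each option.
            options = token[1:-1].split("|")
            parts.append("(?:" + "|".join(map(re.escape, options)) + ")")
        else:
            parts.append(re.escape(token.strip()))
    return "".join(parts)


def compile_rule(line):
    """Compile 'LHS -> RHS / LEFT __ RIGHT' into a function on word forms."""
    line = line.replace("\u2010>", "->")  # the listing uses a Unicode hyphen
    rule, _, context = line.partition(" / ")
    lhs, rhs = (part.strip() for part in rule.split("->"))
    left, _, right = context.partition("__")
    replacement = "" if rhs == "[]" else rhs
    # The left context is captured and re-emitted; the right one is a lookahead.
    source = "(?P<left>" + context_to_regex(left) + ")" + re.escape(lhs)
    if right.strip():
        source += "(?=" + context_to_regex(right) + ")"
    pattern = re.compile(source)
    return lambda form: pattern.sub(
        lambda m: m.group("left") + replacement, form, count=1
    )


rule = compile_rule("[b2] -> y / (a|A|i|I|e|E) __ ilninn")
print(rule("kuTTi[b2]ilninn"))
```

Capturing the left context instead of using a lookbehind keeps the sketch valid for left contexts of any length, such as the longer verb contexts in section D.4.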
D.4. Orthographic Rules for Malayalam Verbs
######********VERB********########
uka[b2]uka ‐> [] / __
[b2]uka ‐> [] / __
######********INTRANSITIVE********########
[b2]i [b2]ikk [b2]ikk [b2]
tt
/
r __
I
/
T __ kk
[] [] []
‐> ‐> ‐> ‐> ‐>
[b2]ikk
######********TRANSITIVE********########
ikk __
/
akk __
/
__ ikk
######********CAUSATIVE********########
[b2] ‐> tt / r __ ippikk
[b2]i ‐> I / T __ ppikk
ikk[b2] ‐> [] / __ ippikk
akk[b2]i ‐> a / __ ppikk
[b2] ‐> [] / __ ippikk
######********CONTINUOUS********########
[b2] ‐> [] / __ koNTirikk
######********PERFECT CONTINUOUS********########
[b2] ‐> [] / __ koNTAyirikk
######********PASSIVE********########
[b2] ‐> [] / __ appeT
####PRESENT#####
[b2] ‐> kk / iri __ unnu
pO[b2] ‐> pOk / __ unnu
vA[b2] ‐> var / __ unnu
tA[b2] ‐> tar / __ unnu
koT[b2] ‐> koTukk / __ unnu
vey[b2] ‐> veykk / __ unnu
[b2]unnu ‐> T / iTTuN __
[b2] ‐> [] / __ unnu
####FUTURE#####
[b2] ‐> kk / iri __ um
[b2] ‐> TAv / iTTuN __ um
[b2] ‐> [] / __ um
####PAST#####
######## SPECIAL CASES #############
[b2] ‐> TAyirunn / iTTuN __ u
ikk[b2] ‐> unn / koNTirikk __ u
koNTAyirikk[b2] ‐> koNTAyirunn / __ u
nOv[b2]u ‐> nontu / __
vEv[b2]u ‐> ventu / __
taLL[b2]u ‐> taLLi / __
coll[b2]u ‐> colli / __
koll[b2] ‐> konn / __ u
pO[b2]u ‐> pOyi / __
vA[b2] ‐> vann / __ u
tA[b2] ‐> tann / __ u
pOk[b2]u ‐> pOyi / __
kAN[b2] ‐> kaNT / __ u
var[b2] ‐> vann / __ u
tar[b2] ‐> tann / __ u
tinn[b2] ‐> tinn / __ u
koT[b2] ‐> koTutt / __ u
vey[b2] ‐> vecc / __ u
veykk[b2] ‐> vecc / __ u
cAv[b2] ‐> catt / __ u
cAk[b2]u ‐> cattu / __
nilkk[b2] ‐> ninn / __ u
uNN[b2] ‐> uNT / __ u
kAkk[b2]u ‐> kAttu / __
nakk[b2]u ‐> nakki / __
cikk[b2]u ‐> cikki / __
uzh[b2] ‐> uzhut / __ u
vIzh[b2] ‐> vIN / __ u
tAzh[b2] ‐> tAzhn / __ u
koLL[b2] ‐> koNT / __ u
pUkk[b2] ‐> pUtt / __ u
iri[b2] ‐> irunn / __ u
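The past-tense special cases pair an irregular stem with a fixed past form. A plain lookup consulted before any general rule is one possible way to apply them, and is sketched below under that assumption. The table copies a handful of the special-case rules as (left-hand side, replacement, right context) triples. The function name `past_surface` and the fallback that simply drops the `[b2]` marker are hypothetical and are not rules from this listing.

```python
# (left-hand side, replacement, right context) for a few SPECIAL CASES rules.
SPECIAL_PAST = [
    ("nOv[b2]u", "nontu", ""),
    ("coll[b2]u", "colli", ""),
    ("pO[b2]u", "pOyi", ""),
    ("koll[b2]", "konn", "u"),
    ("vA[b2]", "vann", "u"),
    ("kAN[b2]", "kaNT", "u"),
]


def past_surface(marked_form):
    """Return the surface past form for a stem + [b2] + u sequence."""
    for lhs, replacement, right in SPECIAL_PAST:
        # A rule applies when the whole form is its left-hand side plus context.
        if marked_form == lhs + right:
            return replacement + right
    # Hypothetical fallback, not one of the thesis rules: drop the marker.
    return marked_form.replace("[b2]", "")


for form in ["koll[b2]u", "pO[b2]u", "kAN[b2]u"]:
    print(form, "->", past_surface(form))
```

Matching the whole form is a simplification. In the rule file the same rewrites would also have to apply when further suffixes follow the tense marker.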
‐> ‐> ‐>
nn
__ u
/
LL __
NT
/
__ u
‐> ‐>
yt
/
__ u
njnj
/
__ u
‐> ‐> ‐> ‐>
iTT
/
__ u
eTT
/
__ u
oTT
/
__ u
aTT
/
__ u
‐> ‐>
icc
/
__ u
IRR
/
__ u
ykk[b2]
‐>
ycc
/
__ u
rkk[b2]
rtt
/
__ u
ukk[b2]
‐> ‐>
utt
/
__ u
akk[b2]
‐>
ann
/
__ u
lkk[b2]
‐> ‐>
RR
/
__ u
TT
/
__ u
L[b2] yy[b2] y[b2]
eT[b2] oT[b2] aT[b2]
iT[b2]
ikk[b2]
Ikk[b2]
Lkk[b2]
i
/
[b2]u
__ u
ll[b2]
/
__ u
############GENERAL CASE############# ‐> R[b2] RR ‐> r[b2] rnn
‐>
[b2]u
i
/
__
[]
/
__ iTTuN
y
/
i __ iTTuN
[]
/
__ iTTuN
‐> ‐> ‐>
u[b2] [b2] [b2]
y
######********PERFECT********########
kk
[b2]
‐> ‐> ‐> ‐>
[b2]
‐>
[]
[b2]
[] TAk
[b2]
[]
iri __ AM
/
koNTAyirikk __ AM
/
koNTirikk __ AM
/
iTTuN __ AM
/
__ AM
‐> ‐> ‐> ‐>
kk
/
iri __ AnpaRRi
[]
/
koNTAyirikk __ AnpaRRi
[]
/
koNTirikk __ AnpaRRi
TAk
/
iTTuN __ AnpaRRi
‐>
[]
/
__ AnpaRRi
kk
/
iri __ aNaM
[]
/
koNTAyirikk __ aNaM
[]
/
koNTirikk __ aNaM
[b2]
‐> ‐> ‐> ‐>
TAk
/
iTTuN __ aNaM
[b2]
‐>
[]
/
__ aNaM
u[b2]
‐> ‐>
[]
/
__ EkkAM
[]
/
__ EkkAM
[b2] [b2] [b2]
[b2]
[b2]
[b2] [b2] [b2]
i[b2]
/
[b2]
######********AUXILIARY********########
‐> ‐> ‐> ‐>
Ekk
/
__ illa
AnpaRR
/
__ illa
AnpaRR
/
__ illa
aNTa
/
__
[]
/
y
/
um[b2]
‐> ‐> ‐>
y
######********NEGATIVE********########
[b2]
‐>
[]
AM[b2] AnpaRRi[b2] aNaM[b2]illa u[b2]
[]
i __ illa
/
__ illa
/
__ illa
[b2]
__ illa
EkkAM[b2]
######********QUESTION********########
iTTuNTAyirunnu[b2] koNTirunnu[b2] irunnu[b2] unnu[b2]
/
illa __ O
u[b2]
[b2] [b2]
‐> ‐> ‐> ‐>
iTTuNTAyirunn
/
__ O
koNTirunn
/
__ O
koNTAyirunn
/
__ O
irunn
/
__ O
‐> ‐> ‐>
unnuNT
/
__ O
[]
/
__ O
y
/
i __ O
‐>
[]
/
__ O
kk
/
iri __ An
[]
/
__ An
koNTAyirunnu[b2]
y
‐>
[b2]
######********INFINITIVE********######## [b2] [b2]
‐> ‐>
APPENDIX - E
E.1. Tested sentences of Machine Translation System with Rankings
RANK
ENGLISH INPUT SENTENCE
MALAYALAM OUTPUT SENTENCE
She loves her work
1
He put his towel on the towel rack
aവള് aവള െട േജാലിെയ സേനഹിkുnു aവന് aവന്െറ തൂവാലെയ തൂവാല-പലകt ില് i
1
It was a new red bike
aത് oരു പുതിയ ചുവn േമ ാ ൈസkിള് ആയിരുnു
1
The bike lock was on the ground
േമാ ാര്ൈസkിള്-പൂ ് തറയില് ആയിരുnു
1
It was an old book
aത് oരു പഴയ പുസതകം ആയിരുnു
1 1
Buy a goat She chopped off the two ends of the carrot
1
She had a mole on her face
oരു ആടിെന വാ ുക aവള് ശീമമുll ിെcടിയുെട ര ട് a നുറുtു aവ k് aവള െട മുഖtില് oരു മറുക് u ടായിരുnു
1
She hated it
aവള് aതിെന െവറുkുnു
1
It was a dark brown circle
aത് oരു iരു
1
It killed people
1
I love that store
1
She was not married
1
He lived in Los Angeles
aവന് െലാസ്-a
1
They were in love
aവര് സേനഹtില് ആയിരുnു
1
Your family is in Los Angeles
നി
1
You have your brothers
നി
1
He looked outside
aവന് പുറേtk് േനാkി
1
I hate it
1
His car was dark blue
aവന്െറ കാര് iരു
1
She walked outside
aവള് പുറേtk് നടnു
1
ട തവി നിറമുll വ ം ആയിരുnു
aത് ആള കള കെള െകാnു ഞാന് ആ കലവറെയ സേനഹിkുnു
aവള് കല ാണംകഴിkെp ില െചെലസയില് ജീവിc
ള െട കുടുംബം െലാസ്-a k് നി
െചെലസയില് ആണ്
ള െട സേഹാദര മാര് u
ട്
ഞാന് aതിെന െവറുkുnു ട നീല ആയിരുnു
But it had a big mouth
enാലും aതിന് oരു വലിയ വായ് u
He sliced an onion
aവന് oരു uളളിെയ േഛദിc
He looked at the kitchen floor
aവന് aടുkള-തറയില് േനാkി
1
Water was on the kitchen floor
െവളളം aടുkള-തറയില് ആയിരുnു
1
He put it on the kitchen floor
aവന് aതിെന aടുkള-തറയില് i
1
Fifty stars are on the flag
amത് നk ത
1
They live on the street
aവര് െതരുവില് ജീവിkുnു
1
Everything was on sale
സകലതും വി പനയില് ആയിരുnു
1
I hate tomatoes
ഞാന് തkാളികെള െവറുkുnു
1
She fell down
aവള് താെഴ വീണു
1
She hit her head on a rock
aവള് aവള െട തലെയ oരു പാറയില് iടിc
1
Where is the mustard
e
1
The name of the magazine was Time
മാസികയുെട േപര് സമയം ആയിരുnു
1 1
1
െള കള
ടായിരുnു
ള് പതാകയില് ആണ്
് ആണ് കടുക്
്
1
A cop saw her The farmer put a dozen apples into a bag
oരു േപാലീ േകാ s ള് aവെള ക ടു കര്ഷകന് oരു പ n ട് ആpി pഴ െള oരു സ ചിk് ullില് i
1
I love my newspaper
ഞാന് eന്െറ വ tമാനപ തെt സേനഹിkുnു
1
We live in a rainbow world
ഞ
1
A lemon is yellow
oരു െചറുനാര
1
An apple is red or green
oരു ആpിള്pഴം ചുവn aഥവാ പc ആണ്
1
Your teeth are white
നി
1
She started screaming
aവള് കൂകിവിളിkുക ആരംഭിc
1
Money has germs
പണtിന് aണുkള് u
1
People have germs
1
Germs live on people for months
ആള കള ക k് aണുkള് u ട് aണുkള് മാസ kുേവ ടി ആള കള കളില് ജീവിkുnു
1
Two people died
ര
1
Eight people were hurt
e ് ആള കള കള് വണെpടുtെp
1
He had a job
aവന് oരു േജാലി u
1
It was a bad job
aത് oരു ചീtയായ േജാലി ആയിരുnു
1
He was a waiter
aവന് oരു േഹാ
1
Luggage hit people in the face
യാ താസാമാന
ള് ആള കള കെള മുഖtില് iടിc
1
Luggage hit people in the head
യാ താസാമാന
ള് ആള കള കെള തലയില് iടിc
1
The police came
1
He was helpful
1
This was her money
iത് aവള െട പണം ആയിരുnു
1
She bought five lottery tickets
aവള് a
1
She looked for dark red
aവള് iരു
1
She liked cherry pink
aവള് രk പസാദമുll പാടലവ
1
She looked in a mirror
aവള് oരു ദ pണtില് േനാkി
1
She gave the cashier $20
aവള് പണംസൂkിkുnവന് $--5 20 െകാടുtു
1
He gave her a little change
aവന് aവ k് oരു െചറിയ മാ െt െകാടുtു
1
They are not taking vacations
aവര് aവധിkാല
1
she asked again He sells the corn at his vegetable stand
aവള് വീ ടും േചാദിc aവന് aവന്െറ പckറി-ത ില് ധാന െt വി kുnു
1
It is bright yellow
aത് േശാഭമയമായ മ
1
The cow loves the corn
പശു ധാന െt സേനഹിkുnു
1
A winter storm is coming
oരു ശീതകാലം-െകാടു ാ ് വnുെകാ
1
He has pretty pictures and maps
aവന് നിസര്gസുnരമായ ചി തം uം ഭൂപടം u
1
He talks and talks
aവന് സംസാരിkുnു uം സംസാരിkുnു
1
The wind started to blow
കാ ് കാ വ്ീശുക ആരംഭിc
1
Paper flew through the air
വര്tമാനp തം വായു മുേഖന പറnു
1
The rain started to fall
മഴ വീഴുക ആരംഭിc
1
It was a flood
aത് oരു പവാഹം ആയിരുnു
1
Then he saw lightning
പിnീട് aവന് മിnലിെന ക
ആണ്
ള െട പല കള് െവll ആണ്
ട്
ട് ആള കള കള് മരിc
ടായിരുnു
പരിചാരകന് ആയിരുnു
മ
aവന് സഹായകമായ ആയിരുnു
1
ള് oരു മഴവില്-ഭൂമിയില് ജീവിkുnു
േപാലീസ് വnു
1
ച് ഭാഗ kുറി-കുറിമാന
െള വാ
ി
ട ചുവpിനുേവ ടി േനാkി െള i െp
െള eടുtുെകാ
ടിരിkുnില
ആണ്
ടു
ടിരിkുnു ട്
1
It was a very cold night
1
His head feels like it will explode
aത് oരു വളെര തണുt രാ തി ആയിരുnു aവന്െറ തല aത് െപാ ിെtറിkുമ് േപാെല aനുഭവിkുnു
1
They liked their jobs
aവര് aവരുെട േജാലികെള i െp
1
His head ache started an hour ago
aവന്െറ തല-േവദന oരു മണിkൂര് മുm് ആരംഭിc
1
They started shooting
aവര് െവടിവ kുക ആരംഭിc
1
The cop gave him the ticket
േപാലീ േകാ
1
He brushed his teeth
aവന് aവന്െറ പല കെള തുടc വൃtിയാkി
1
He went into his bedroom
aവന് aവന്െറ കിടpറk് ullില് േപായി
1
He drove fast
aവന് േവഗtില് വ
1
She had a new friend
aവ k് oരു പുതിയ കൂ കാരന് u
1
They put a big towel on the sand
aവര് oരു വലിയ തൂവാലെയ മണലില് i
1
They watched the sun go down
aവര് സൂര ന് താെഴ േപാകുnു ക
1
It was huge and orange
aത് ഗംഭീരമായ uം ഓറ
1
I like the beach
ഞാന് കട tീരെt i െpടുnു
1
A triangle has three sides
oരു തിേകാണtിന് മൂn് വശ
1
Faces have various shapes
മുഖ
1
Clouds have various shapes
േമഘ
1 1
Paul is a good man Some people stand on the edge of cliffs
1
They killed their friend
1
They feel the wind
1
Their marriage ends
aവരുെട വിവാഹം aവസാനിkുnു
1
We have a nice house
ഞ
1
It has three bedrooms
aതിന് മൂn് കിടpറകള് u
1
It has three bathrooms
aതിന് മൂn് കുളിമുറികള് u
1
We live on a quiet street
1
The baby had a rare disease
ഞ ള് oരു ശാnമായ െതരുവില് ജീവിkുnു െചറിയകു ിk് oരു aസാധാരണമായ േരാഗം u ടായിരുnു
ള് aവന് കുറിമാനെt െകാടുtു
ടിേയാടിc
ടു
ച് ആയിരുnു
k് വിവിധ
ള് u
ട്
ളായ ആകൃതികള് u
k് വിവിധ
ട്
ളായ ആകൃതികള് u
ട്
േപാള് oരു നല പുരുഷന് ആണ് a പം ആള കള കള് കിഴുkാംതൂkായപാറകള െട സീമയില് നി kുnു
aവര് aവരുെട കൂ കാരനിെന െകാnു aവര് കാ ിെന aനുഭവിkുnു k് oരു ഹൃദ മായ വീട് u
ട്
ട് ട്
1
ടായിരുnു
He looked in the paper for another job
ഓേരാ രാ തി aവന് oരു സ നം u ടായിരുnു aവന് വ tമാനp തtില് മെ ാരു േജാലിkുേവ ടി േനാkി
1
She was a nurse
aവള് oരു ആയ ആയിരുnു
1
Her husband was a doctor
aവള െട ഭര്tാവ് oരു ൈവദ ന് ആയിരുnു
1
She saw another aeroplane
aവള് മെ ാരു വിമാനെt ക
1
They bought their tickets
aവര് aവരുെട കുറിമാന
1
He looked at the bubbles
aവന് കുമിളകളില് േനാkി
1
aവന് ക ാടിെയ േമശയില് i പണkാരനായ-ആള കള കള് aമിതവിലയുll കുറിമാന െള വാ ി
1
He put the glass on the table Rich people bought the expensive tickets The name of the lake was Yellow Lake
1
It had many colors
aതിന് ധാരാളമായ നിറ
Every night he had a dream
1
1
കായലി െറ േപര് മ
s
ടു
െള വാ
ി
-കായല് ആയിരുnു ള് u
ടായിരുnു
1
But its belly was white
enാലും aതിന്െറ uദരം െവll ആയിരുnു
1
But usually they lose
enാലും പതിവായി aവര് േതാ kുnു
1
It was her birthday
aത് aവള െട ജ മദിനം ആയിരുnു
1
Mumbai is a big city
മുംൈബ oരു വലിയ നഗരം ആണ്
1
He examined the wife
aവന് ഭാര െയ പരിേശാധിc
1
He saw the baby
aവന് െചറിയകു ിെയ ക
1
They both smiled
aവര് iരുവരും പു
1
It attacked Donna
aത് െദാnെയ ആ കമിc
1
He loved his plants
aവന് aവന്െറ െചടികെള സേനഹിc
1
His plants were in pots
aവന്െറ െചടികള് പാ ത
1
It was Friday
aത് െവllിയാ ച ആയിരുnു
1
He went outside
aവന് പുറേtk് േപായി
1
The airplane had two engines
വിമാനtിന് ര
1
A big bird flew into each engine
oരു വലിയ പkി ഓേരാ യ ntിന് ullില് പറnു
1
She examined the pot
aവള് പാ തെt പരിേശാധിc
1
They ran
aവര് ഓടി
1
It was a problem
aത് oരു പ നം ആയിരുnു
1
Herman agreed
െഹര്മന് സmതിc
1
He had bought a ticket
1
The baby smiles
1
Air comes through the window
1
It is warm air
1
The boy goes to the kitchen
ആണ്കു ി aടുkളk് േപാകുnു
1
The ball is on the floor
പn് തറയില് ആണ്
1
It is a red ball
aത് oരു ചുവn പn് ആണ്
1
He sits down
aവന് താെഴ iരിkുnു
1
It is a rubber ball
aത് oരു റbര്-പn് ആണ്
1
The dog barks
നായ കുര kുnു
1
The bird sings
പkി പാടുnു
He throws the ball
aവന് പnിെന eറിയുnു
ചിരിc
ളില് ആയിരുnു
ള് u
ടായിരുnു
ട് യ n
aവന് oരു കുറിമാനെt വാ െചറിയകു ി പു
ിയി
ചിരിkുnു
aത് iളംചൂടുളള വായു ആണ്
വായു ജനല് മുേഖന വരുnു
1
The cat licks its paws
പൂc aതിന്െറ ൈകptികെള നkുnു
1
The cat licks its belly
പൂc aതിന്െറ uദരെt നkുnു
1
The dog licks its paws
നായ aതിന്െറ ൈകptികെള നkുnു
1
He sees a package
aവന് oരു െപാതിെk ിെന കാണുnു
1
He opens the refrigerator
aവന് ശീതീകരണയ nെt തുറkുnു
1
It has a red cover
aതിന് oരു ചുവn ആവരണം u
1
She likes animals
aവള് മൃഗ
െള i െpടുnു
1
She has two cats
aവ k് ര
ട് പൂcകള് u
1
She likes her cats
aവള് aവള െട പൂcകെള i െpടുnു
1
He has a job
aവന് oരു േജാലി u
1
He is a teacher
aവന് oരു ad ാപകന് ആണ്
1
ടു
ട്
ട്
ട്
ടായിരുnു
1
He teaches kids
aവന് കു ികെള പഠിpിkുnു
1
He likes his job
aവന് aവന്െറ േജാലിെയ i െpടുnു
1
He likes kids
aവന് കു ികെള i െpടുnു
1
He chews it
aവന് aതിെന ചവ kുnു
1
He sees an airplane
aവന് oരു വിമാനെt കാണുnു
1
The airplane is in the sky
വിമാനം ആകാശtില് ആണ്
1
The baby cried again
െചറിയകു ി വീ
1
It has two wings
aതിന് ര
1
It has a tail
aതിന് oരു വാല് u
1
She has a doll
aവ k് oരു പാവ u
1
The doll has long hair
പാവk് നീളമുളള തലമുടി u
1
It crawls slowly
aത് മnമnം iഴയുnു
1
She watches it
aവള് aതിെന കാണുnു
1
She puts it in her mouth
aവള് aതിെന aവള െട വായില് iടുnു
1
She likes the monkeys
aവള് കുര
1
They have long tails
aവ k് നീളമുളള വാലുകള് u
ട്
1
There are six monkeys in the cage
aവിെട തടവറയില് ആറ് കുര
ുകള് ആണ്
1
The snow falls
മ
1
It covers the ground
1
He goes outside
1
He looks at his bicycle
1
It is an old bike
1
She watches the ants
aവള് uറുmുകെള കാണുnു
1
She has a dog
aവ k് oരു നായ u
1
The snow falls from the sky
മ
1
He sees a butterfly
aവന് oരു പൂmാ െയ കാണുnു
1
The music starts
സംഗീതം ആരംഭിkുnു
1
She looks in the mirror
aവള് ദ pണtില് േനാkുnു
1
It has white hair
aതിന് െവള t തലമുടി u
They always hug her
aവര് aവെള eേpാഴും െക ിpിടിkുnു
The big fish eat the small fish
വലിയ മt ം െചറിയ മt െt തിnുnു
1
She loves him
aവള് aവെന സേനഹിkുnു
1
The hero solves the problem
കഥാനായകന് പ നെt പരിഹരിkുnു
1
The story has a happy ending
കഥk് oരു സnുഷ്ടമായ സമാ തി u
1
It is blue
aത് നീല ആണ്
1
He talks with his wife
aവന് aവന്െറ ഭാര േയാട് സംസാരിkുnു
1
He sneezes
aവന് തുmുnു
1
It was dark green
aത് iരു
1
He loved animals
aവന് മൃഗ
1
She went under the water
aവള് െവളളtിന് aടിയില് േപായി
1
She was a little girl
aവള് oരു െചറിയ െപ
ട്
ട് ട്
ട്
ുകെള i െpടുnു
് വീഴുnു
aത് തറെയ മൂടുnു
aവന് aവന്െറ ൈസkിളില് േനാkുnു aത് oരു പഴയ േമ ാ ൈസkിള് ആണ്
ട് ചിറകുകള് u
1
ു
aവന് പുറേtk് േപാകുnു
1
ടും കര
ട്
് ആകാശtി നിn് വീഴുnു
ട്
ട്
ട പc ആയിരുnു െള സേനഹിc കു ി ആയിരുnു
1
His new car is green
aവന്െറ പുതിയ കാര് പc ആണ്
1
Dora loved her mom
െദാര aവള െട amെയ സേനഹിc
1
He loves his red bicycle
aവന് aവന്െറ ചുവn ൈസkിെള സേനഹിkുnു
1
He was at school
aവന് വിദ ാലയtില് ആയിരുnു
1
His friend laughed
aവന്െറ കൂ കാരന് ചിരിc
1
His brother took the apple
aവന്െറ സേഹാദരന് ആpി pഴെt eടുtു
1
The baby cried
െചറിയകു ി കര
1
I love my mom
ഞാന് eന്െറ amെയ സേനഹിkുnു
1
She stood in the water
aവള് െവളളtില് നിnു
1
It was a big lake
aത് oരു വലിയ കായല് ആയിരുnു
1
Wash your hands
നി
1
Be a good boy
oരു നല ആണ്കു ി ആകുക
1
Crow is a bird
കാk oരു പkി ആണ്
1
It was a good book
aത് oരു നല പുസതകം ആയിരുnു
1
It was closed
aത് aട kെp
1
It has three legs
aതിന് മൂn് കാലുകള് u
1
He has sheep
aവന് െചmരിയാട് u
1
She has blue eyes
aവ k് നീല ക
1
He loved her
1
It was his money
1
It was a big nail
1
He followed her
1
It was under the tree
aത് മരtിന് aടിയില് ആയിരുnു
1
I am a teacher
ഞാന് oരു ad ാപകന് ആണ്
1
I am from Delhi
1
He killed a police officer
aവന് oരു േപാലീസ്-തലവെന െകാnു
1
I hate her
ഞാന് aവെള െവറുkുnു
1
My friend went to Delhi
eന്െറ കൂ കാരന് െഡ ഹിk് േപായി
1
My dad is in Mumbai
eന്െറ acന് മുംൈബയില് ആണ്
I have three brothers
eനിk് മൂn് സേഹാദര മാര് u
It was a good question
aത് oരു നല േചാദ ം ആയിരുnു
1
My brother is a teacher
eന്െറ സേഹാദരന് oരു ad ാപകന് ആണ്
1
This is a good place
iത് oരു നല iടം ആണ്
1
She is dancing
aവള് നൃtംെച തുെകാ
1
He saw a cow
aവന് oരു പശുവിെന ക
1
It was a big house
aത് oരു വലിയ വീട് ആയിരുnു
1
Eight people stayed there
e ് ആള കള കള് aവിെട ത
1
My grandmother lived there
eന്െറ മുt ി aവിെട ജീവിc
1
It is a village
aത് oരു ഗാമം ആണ്
1
I went there
ഞാന് aവിെട േപായി
1
He is a good cook
aവന് oരു നല പാചകkാരന് ആണ്
ട്
കള് u
ട്
aത് oരു വലിയ നഖം ആയിരുnു aവന് aവെള പിnുട nു
ഞാന് െഡ ഹിയി നിn് ആണ്
ട്
aത് aവന്െറ പണം ആയിരുnു
1
ള െട ൈകകള് കഴുകുക
aവന് aവെള സേനഹിc
1
ു
ട്
ടിരിkുnു ടു ി
1
He called his wife
aവന് aവന്െറ ഭാര െയ വിളിc
1
James is a lawyer
െജയിംസ് oരു വkീല് ആണ്
1
He was a soldier
aവന് oരു േയാdാവ് ആയിരുnു
1
He went by train
aവന് തീവ
1
He is a cobbler
aവന് oരു െചരുp കുtി ആണ്
1
Rain is coming
1
She is working in an office
മഴ വnുെകാ ടിരിkുnു aവള് oരു കാര ാലയtില് പണിെയടുtുെകാ ടിരിkുnു
1
He saw a police officer
aവന് oരു േപാലീസ്-തലവെന ക
1
She had fever
aവ k് പനി u
1
She stood there
aവള് aവിെട നിnു
1
I love my grandmother
ഞാന് eന്െറ മുt ിെയ സേനഹിkുnു
1
She is a widow
aവള് oരു വിധവ ആണ്
1
He plays well
aവന് തൃ തികരമായി കളിkുnു
1
He has a boat
aവന് oരു വ
1
I went there
ഞാന് aവിെട േപായി
1
I met her
ഞാന് aവെള ക
1
He has my books
aവന് eന്െറ പുസതക
1
I gave my pen to him
1
He is my friend
1
We both went to Chennai
1
ടു
ടായിരുnു
ചി u
ട്
ടുമു ി
ള് u
ട്
ഞാന് eന്െറ േപനെയ aവന് െകാടുtു aവന് eന്െറ കൂ കാരന് ആണ് ള് iരുവരും െചൈnk് േപായി
We met him at the station
ഞ
ള് aവെന താവളtില് ക
1
Then we went out
പിnീട് ഞ
1
He is my nephew
1
She is my aunt
1
Economy is in a good state
സാmtികവ വs oരു നല നിലയില് ആണ്
1
My name is Geetha
1
My daughter is working in a school
eന്െറ േപര് ഗീത ആണ് eന്െറ മകള് oരു വിദ ാലയtില് പണിെയടുtുെകാ ടിരിkുnു
1
Today is a holiday
in് oരു aവധിദിവസം ആണ്
This is a red ball
iത് oരു ചുവn പn് ആണ്
ടുമു ി
ള് പുറt് േപായി
aവന് eന്െറ aനnരവന് ആണ് aവള് eന്െറ amായി ആണ്
1
ഞ
He asked for a ball
aവന് oരു പnിനുേവ
1
She was new in town
aവള് െചറുപ ണtില് പുതിയ ആയിരുnു
1
She is enjoying
aവള് ആസ ദിc െകാ
1
He replied to her
aവന് aവ k് മറുപടിന കി
1
He is singing
aവന് പാടിെകാ
1
An idea can change your life
oരു ആശയം നി
1
It is a good idea
aത് oരു നല ആശയം ആണ്
1
This is a red flower
iത് oരു ചുവn പൂവ് ആണ്
1
I saw a beautiful girl
ഞാന് oരു സുnരിയായ െപ കു ിെയ ക
1
She has three goats
aവ k് മൂn് ആടുകള് u
1
It was a lie
aത് oരു നുണ ആയിരുnു
1
ടിയില് േപായി
ടി േചാദിc ടിരിkുnു
ടിരിkുnു ള െട ജീവിതകാലെt മാ ാം
ട്
ടു
1
Sunday is a holiday
ഞായറാഴ്ച oരു aവധിദിവസം ആണ്
1
You should wait for her
നി
1
Chief minister went to Delhi
മുഖ -മ nി െഡ ഹിk് േപായി
1
The rain was very loud
മഴ വളെര uറെkയുളള ആയിരുnു
1
They are playing
aവര് കളിc െകാ
1
This is a cart
iത് oരു കാളവ
1
He did not eat meat
aവന് iറcിെയ തിnില
1
He is a writer
aവന് oരു eഴുtുകാരന് ആണ്
1
He is a famous doctor
aവന് oരു േപരുേക ൈവദ ന് ആണ്
1
I saw a lion in the zoo
ഞാന് oരു സിംഹെt മൃഗശാലയില് ക
1
I saw a yellow parrot
ഞാന് oരു മ
1
She had a beautiful mom
aവ k് oരു സുnരിയായ am u
1
It was an interesting story
aത് oരു രസകരമായ കഥ ആയിരുnു
1
He was a good farmer
aവന് oരു നല ക ഷകന് ആയിരുnു
1
He worked in a hotel
aവന് oരു േഭാജനശാലയില് പണിെയടുtു
1
She lived in a village
aവള് oരു ഗാമtില് ജീവിc
1
This is a hill
iത് oരു കുn് ആണ്
1
She is coming from a city
aവള് oരു നഗരtി നിn് വnുെകാ
1
He is a stranger
1
He went to his room
1
This aeroplane had two engines
1
I saw another aeroplane
ഞാന് മെ ാരു വിമാനെt ക
1
He is the owner of the house
aവന് വീടി െറ uടമസഥന് ആണ്
1
I bought a goat
ഞാന് oരു ആടിെന വാ
1
Colors are beautiful
നിറ
1
He is a merchant
aവന് oരു കcവടkാരന് ആണ്
1
My father is a professor
eന്െറ acന് oരു ആചാര ന് ആണ്
1
His elder brother is a teacher
aവന്െറ മൂt സേഹാദരന് oരു ad ാപകന് ആണ്
1
She did not believe her doctor
aവള് aവള െട ൈവദ നിെന വിശ സിcില
He failed in the examination
aവന് പരീkയില് േതാ േപായി
He became a doctor
aവന് oരു ൈവദ ന് ആയിtീ nു
1
His brother is a lawyer
1
Kerala seeks permission for new dam
aവന്െറ സേഹാദരന് oരു വkീല് ആണ് േകരള പുതിയ aണെk ിനുേവ ടി aനുമതിെയ ആരായുnു
1
They were killed in an accident
aവര് oരു aപകടtില് െകാലെp
1
We celebrated his birthday
ഞ
1
The man did not come out
പുരുഷന് പുറt് വnില
1
It was written in Hindi
aത് ഹിnിയില് eഴുതെp
1
I have a sister
eനിk് oരു സേഹാദരി u
1
I invited her
ഞാന് aവെള kണിc
1
He would have rushed into the street
aവന് െതരുവിന് ullില് പാ
1
He is a student
aവന് oരു വിദ ാ tി ആണ്
ടി ആണ്
ടു
തtെയ ക
ടു
ടായിരുnു
ടിരിkുnു
aവന് aവന്െറ മുറിk് േപായി iത് വിമാനtിന് ര
ടിരിkുnു
ട് യ n
1
ടി കാtിരിkണം
aവന് oരു aപരിചിതന് ആണ്
1
ള് aവ kുേവ
ള് u
ടായിരുnു
ടു
ി
ള് സുnരിയായ ആണ്
ള് aവന്െറ ജന്മദിനം ആേഘാഷിc
ട് ുകയ ി
ടാകാം
1
He saw a strange sight
aവന് oരു aസാധാരണമായ കാ ചെയ ക
1
Viju was the son of a gardener
വിജു oരു േതാ kാരനി െറ മകന് ആയിരുnു
1
Kochi is a famous city
െകാcി oരു േപരുേക നഗരം ആണ്
1
He was the chief guest
aവന് മുഖ മായ aതിഥി ആയിരുnു
1
A scientist came here
oരു ശാ
1
We made arrangements
ഞ
1
I saw an eagle
ഞാന് oരു കഴുകനിെന ക
1
He did not see the red light
aവന് ചുവn െവളിcെt ക
1
She is an angel
aവള് oരു മാലാഖ ആണ്
1
This is an island
iത് oരു തുരുt് ആണ്
1
Geetha is my aunt
ഗീത eന്െറ amായി ആണ്
1
He is our servant
aവന് നmുെട ഭൃത ന് ആണ്
1
The speaker was Gabriel
പസംഗകര്tാവ് ഗ ബിേയല് ആയിരുnു
1
She was very sad
aവള് വളെര ദുഃഖകരമായ ആയിരുnു
1
I saw a crocodile in the lake
ഞാന് oരു മുതലെയ കായലില് ക
1
Something was wrong
eേnാവസ്തു െത ായ ആയിരുnു
1
It was a strange town
aത് oരു aസാധാരണമായ െചറുപ ണം ആയിരുnു
1
It is delicious
aത് സ ാദുളള ആണ്
1
The water was not hot
1
He drove to the hotel
1
Shankar felt sorry for him
1
They were very scared
aവര് വളെര േപടിkെp
1
It was late at night
aത് രാ തിയില് കാലംെത ിയ ആയിരുnു
1
I am hungry
1
He did not like jail
1
He had tried
aവന് ശമിcി
1
My name is Aazaad
eന്െറ േപര് ആസാട് ആണ്
1
Old news was interesting
പഴയ വാര്t രസകരമായ ആയിരുnു
1
Rama lived in a small village The small village had a small school
രമ oരു െചറിയ ഗാമtില് ജീവിc െചറിയ ഗാമtിന് oരു െചറിയ വിദ ാലയം u ടായിരുnു
It was a forest
aത് oരു വനം ആയിരുnു
1
He went to his room
aവന് aവന്െറ മുറിk് േപായി
1
Everyone laughed
eലാവരും ചിരിc
1
They had short legs
aവ k് കുറിയ കാലുകള് u
1
Life is hard
ജീവിതകാലം ദൃഢമായ ആണ്
1
He was a carpenter
aവന് oരു ആശാരി ആയിരുnു
1
She saw an ad in the paper
aവള് oരു പരസ െt വ tമാനp തtില് ക
1
My brother is in hospital
1
My brother is coming from Mumbai
eന്െറ സേഹാദരന് ആശുപ തിയില് ആണ് eന്െറ സേഹാദരന് മുംൈബയി നിn് വnുെകാ ടിരിkുnു
1
Thaar is a desert
താര് oരു മരുഭൂമി ആണ്
ടാkി ടു ടില
ള് പdതികെള u
ടു
െവളളം ചൂടുll ആയിരുnു aവന് േഭാജനശാലk് വ ഷ ര് aവനുേവ
ടിേയാടിc
ടി നിn മായ aനുഭവിc
ഞാന് വിശp ll ആണ് aവന് ജയിലിെന i െp ില
തjന് iവിേടk് വnു
1
1
ടു
ടായിരുnു
ടായിരുnു
ടു
1
Wait until tomorrow
നാെള വെര കാtിരിkുക
1
A woman came out of her house
oരു
1
He was a great man
aവന് oരു മഹാനായ പുരുഷന് ആയിരുnു
1
He was a Prince
aവന് oരു യുവരാജാവ് ആയിരുnു
1
It was a rabbit
aത് oരു മുയല് ആയിരുnു
1
Autumn arrived in the forest
ശരല്kാരം വനtില് etിേc nു
1
I saw a deer in the forest
ഞാന് oരു മാനിെന വനtില് ക
1
He was a blacksmith
aവന് oരു െകാലpണിkാരന് ആയിരുnു
1
I hope
ഞാന് പതീkിkുnു
1
I have some books
eനിk് a പം പുസതക
1
your father called yesterday
നി
1
I am a housewife
ഞാന് oരു വീ m ആണ്
1
I am fine
ഞാന് സുഖം ആണ്
1
I am bold
ഞാന് ധീരമായ ആണ്
1
Forget the past
ഭൂതകാലെt മറkുക
1
Lock the door
കതക് പൂ ്
1
Reduce the volume
വ ാ തെt കുറ kുക
1
Return it safely
1
Put on your shirt
1
send him inside
1
they will win
1
They will not listen to me.
aവര് eെnk് േക kില
1
I like reading books
ഞാന് പുസതക
1
I like walking in the morning sun
ഞാന് പഭാതം-സൂര നില് നടkുക i െpടുnു
1
I like listening to music
ഞാന് സംഗീതtിന് േകള്kുക i െpടുnു
1
I like travelling by train
ഞാന് തീവ
1
I keep my books here
ഞാന് eന്െറ പുസതക
1
I wait for him at the station
ഞാന് aവനുേവ
I have a scooter
eനിk് oരു
We have a Maruthi Car
ഞ
1
I have two brothers
eനിk് ര
1
I have three sisters and a brother
eനിk് മൂn് സേഹാദരി uം oരു സേഹാദരന് u
1
I will never forget your help I have come to the end of my patience.
ഞാന് നി
1
Don't try my patience. You are always complaining about something.
eന്െറ kമെയ ശമിkുക നി ള് eേnാവസ്തു പ ി പരാതിെp െകാ ടിരിkുnു eേpാഴും
1
go to hell
നരകtിന് േപാകുക
1
Anyone can make a mistake
ഏെത ിലുെമാരാള് oരു പിഴെയ u
1
I too have the same problem
eനിk് aധികമായി തുല മായ പ നം u
1 1
ട്
ള െട acന് inെല വിളിc
ള െട uടുpില് iടുക
aവെന aകേtk് aയ kുക
aവര് വിജയിkും െള വായിkുക i െpടുnു
ടിയില് യാ തെചy ക i െpടുnു െള iവിേടk് വ kുക
ടി താവളtില് കാtിരിkുക
കൂ ര് u
ട്
k് oരു മാരുതി-കാര് u
ട്
ട് സേഹാദര മാര് u
ട്
ള െട സഹായെt മറkില
ഞാന് eന്െറ kമയുെട a tിന് വnി
ള് u
നി
1
ടു
aതിെന സുരkിതമായി തിരിെctുക
1
തീ aവള െട വീടി െറ പുറt് വnു
ട്
ടാkാം ട്
ട്
1
He came to my office
aവന് eന്െറ കാര ാലയtിന് വnു
1
He came to meet my father
aവന് eന്െറ acനിെന ക
1
He came along with his wife
aവന് aവന്െറ ഭാര േയാട് േചര്n് വnു
1
He came here alone
aവന് തനിcാkി iവിേടk് വnു
1
I give her a pen
ഞാന് aവ k് oരു േപനെയ െകാടുkുnു
1
My friend Ravi has a scooter
eന്െറ കൂ കാരന്-രവിk് oരു
1
I would have come yesterday
ഞാന് inെല വnി
1
He is on leave
aവന് aവധിയില് ആണ്
1
He has gone out.
aവന് പുറt് േപായിയി
1
Milk is white
പാല് െവll ആണ്
1
I am eating an apple
ഞാന് oരു ആpി pഴെt തിnുെകാ
2 2
Her shoes are old All shoes were on sale at the shoe store
aവള െട പാദരkകള് പഴയ ആണ് eലാ പാദരkകള് പാദരk-കലവറയില് വി പനയില് ആയിരുnു
2
They were very comfortable
aവര് വളെര സുഖ പദമായ ആയിരുnു
2
They felt good
aവര് നല aനുഭവിc
2
He opened his travel bag
aവന് aവന്െറ യാ ത-സ
2
His bike was gone
e
2
It was cut in two
2
It was delicious
2
She thought about it
2
Your dog is lazy
2
It was delicious
2 2
He was so happy He turned on the water and rinsed his face
aവന് a പകാരം സnുഷ്ടമായ ആയിരുnു aവന് െവളളtില് തിരി ു uം aവന്െറ മുഖെt കഴുകി
2
She rinsed the carrot peeler
2
I am going to adopt two baby children
aവള് ശീമമുll ിെcടി-േതലുരിkുnവെന കഴുകി ഞാന് ര ട് െചറിയകു ി-കു ികെള ദെtടുkുക േപായിെകാ ടിരിkുnു ഞാന് oരു െചറിയ െപണ്കു ി uം oരു െചറിയ ആണ്കു ി ആവശ െpടുnു aവ k് ര ട് െചറിയകു ിക kുേവ ടി കാtിരിkുക u ടായിരുnു
ടാകാം ട്
ടിരിkുnു
ചിെയ തുറൈn
ടില് മുറിkെp
aവള് aത് പ ി ചിnിc
I want a little girl and a little boy She had to wait for the two babies
നി
ള െട നായ aലസനായ ആണ്
aത് സ ാദുളള ആയിരുnു
2
I can wait one year Every night she scrubbed her cheek extra hard
2
He sat down and read the newspaper
ഞാന് on് െകാലം കാtിരിkാം ഓേരാ രാ തി aവള് aവള െട കദളെt aതിയായ ദൃഢമായ uര c aവന് താെഴ iരിc uം വ tമാനപ തെt വായിkുnു
2
She poured the beans into the grinder
aവള് പയറുകെള ആ കലിന് ullില് oഴിc
2
Today I bought a winter cap
ഞാന് oരു ശീതകാലം-െതാpി വാ
2
The reception was bad
സ കാരം ചീtയായ ആയിരുnു
2
He went to jail
aവന് ജയിലിന് േപായി
2
It barked in the morning
aത് പഭാതtില് കുര kുക
2
He washed the plate
aവന് പാ തെt കഴുകി
2
ട്
aത് സ ാദുളള ആയിരുnു
2
കൂ ര് u
് ആയിരുnു aവന്െറ േമാ ാര്ൈസkിള്
aത് ര
2
ടുമു ക വnു
ുക
2
She said she was fine
aവള് aവള് സുഖം ആയിരുnു പറ
2
Something was leaking
eേnാവസ്തു േചാ nുെകാ ടിരിc
2
He called his landlord
aവന് aവന്െറ ഭൂവുടമസഥനിെന വിളിc
2
What was wrong with him
ഏത് aവേനാട് െത ായ ആയിരുnു
2
He closed his eyes
aവന് aവന്െറ ക
2
He was sitting in his chair
aവന് aവന്െറ പീഠtില് iരിc െകാ
2
I am an adult
2
I want to drink a soda
ഞാന് oരു പായപൂ tിയായ ആണ് ഞാന് oരു േസാഡാെവllെt കുടിkുക ആവശ െpടുnു
2
I want to work
ഞാന് പണിെയടുkുക ആവശ െpടുnു
2
They were going to the airport
aവര് വിമാനtാവളtിന് േപായിെകാ
2
Something can always go wrong
eേnാവസ്തു eേpാഴും െത ായ േപാകാം
2
Where are we
e
2
I will send the police
ഞാന് േപാലീസെയ aയ kുമ്
2
The magazine was one year old
മാസിക on് െകാലം പഴയ ആയിരുnു
2
She was angry
aവള് കുപിതനായ ആയിരുnു
2
She was angry at her husband
aവള് aവള െട ഭ tാവില് കുപിതനായ ആയിരുnു
2
She was crying
aവള് കര
2
He walked into his house
2
She cooked the raw apples
2
She saw the flashing light
2
Superman was strong
സുെപര്മന് ശkമായ ആയിരുnു
2
Superman could pick up a house
സുെപര്മന് oരു വീടിെന േമേല eടുkാ പ ി
2
He looked down the street
aവന് െതരുവിെന താെഴ േനാkി
2
He sat down on the bench
aവന് െബ
2
The wind was blowing
2
The bus accident was near a dam
കാ ് കാ വ്ീശിെകാ ടിരിc ബസ്-aപകടം oരു aണെk ് aരിെകയുളള ആയിരുnു
2
He was poor
aവന് ദരി ദമായ ആയിരുnു
2
The meals were cheap
ഊണുകള് വിലകുറ
Floating is so easy
oഴുകുക a പകാരം eള pമായ ആണ്
Your mother said to tell the truth
നി
2
They say that lying is evil
aവര് നുണ േദാഷകരമായ ആണ് പറയുnു
2
That is a big lie
ആ oരു വലിയ നുണ ആണ്
2
Look at the people around you
നി
2
The weather got cold
കാലാവs തണുt കി ി
2
He turned up the volume
aവന് വ ാ തെt േമേല തിരി
2
He looked outside his door
aവന് aവന്െറ കതക് പുറം േനkി
2
Then they pulled out guns
പിnീട് േതാkുകള് പുറt് വലിkുക aവര്
2
He was late
aവന് കാലംെത ിയ ആയിരുnു
2
They sat on the towel
2
He talked to them for a minute
aവര് തൂവാലയില് iരിc aവന് aവെരtെnk് oരു kണtിനുേവ ടി സംസാരിc
ടിരിc
ടിരിc
ള്
ടിരിc
ുെകാ
aവള് aപക മായ ആpി pഴ
െള പാചകംെചy ക
aവള് മിnിkുക െവളിcെt ക
ടു
ചില് താെഴ iരിc
ആയിരുnു
ള െട am േനരിെന പറയുക പറ
ു
ള് ചു പാടും ആള കള കളില് േനാkുക
കെള aട c
aവന് aവന്െറ വീടിന് ullില് നടnു
2
2
് ആണ് ഞ
ു
ു
2
They are alone
aവര് തനിcാkി ആണ്
2
I must find another job
ഞാന് മെ ാരു േജാലിെയ ക
2
They had four children
aവ k് നാലാം കു ികള് u
2
She was always sick
aവള് eേpാഴും ദീനമായ ആയിരുnു
2
It was nice and cold
aത് ഹൃദ മായ uം ചീരാp് ആയിരുnു
2
They drove to the lake
aവര് കായലിന് വ
2
They got out of the car
aവര് കാരി െറ പുറt് കി ി
2
Then they went home
പിnീട് aവര് വീടിെന േപായി
2
She played for several hours
aവള് പേത കമായ മണിkൂരുക kുേവ
2
They were crying
aവര് കര
2
powerful earthquake strikes philipines
വീരനായ ഭൂമികുലുkം-പണിമുടkുകള് ഫിലിപിെനunു
2
Your dog does not eat grass
നി
2
He did not listen to his friends
aവന് aവന്െറ കൂ കാരനുക k് േക ില
2
Her husband could not believe it
aവള െട ഭര്tാവ് aതിെന വിശ സിkാ പ ില
2
She could not handle it
aവള് aതിെന ൈകകാര ംെചyാ പ ില
2
I am doing good
ഞാന് നല െച തുെകാ
2
Few persons seemed to love him
ചുരുkം ആള കള് aവെന
2
He pushed me to the door
aവന് eെnെയ കതകിന് തllി
2
I reached my home
2
Grandmother took her to the garden
2
Life was difficult for Somu
2
He had made a new friend on his way
മുt ി aവെള പൂേnാ tിന് eടുtു ജീവിതകാലം െസാമുവിനുേവ ടി ദു കരമായ ആയിരുnു aവന് oരു പുതിയ കൂ കാരനിെന aവന്െറ വഴിയില് u ടാkിയി ടായിരുnു
2
Bablu did not go to school
ബാ
2
I do not want to go to school
ഞാന് വിദ ാലയtിന് േപാകുക ആവശ െpടുnില
2
I will teach you to read and write
ഞാന് നി
2
Somu loved to read ghost stories
െസാമു ഭൂതം-കഥകെള വായിkുക സേനഹിc
2
They loved everyone
2
A farmer had some puppies
aവര് eലാവരുെt സേനഹിc oരു ക ഷകനിന് a പം നാ kു u ടായിരുnു
ടിേയാടിc
ടി കളിc
ടിരിc
ള െട നായ പുലിെന തിnുnില
ടിരിkുnു
േനഹിkുക േതാnി
ഞാന് eന്െറ വീടിെന etിേc nു
I threw the brick
ള വിദ ാലയtിന് േപായിയില ള് വായിkുക uം eഴുതുക പഠിpിkുമ്
ുകള്
I planted a tree in my garden
2
He called me yesterday
aവന് inെല eെnെയ വിളിc
2
He explained to me
aവന് eെnk് വിവരിc
3
Then she turned on the stove
3 3
She took an egg out of the refrigerator He picked up the can of shaving cream
പിnീട് aടുpില് തിരിയുക aവള് aവള് eടുtു oരു a െt പുറt് ശീതീകരണയ natി െറ aവന് പാ pാടെയ kൗരംെചy ക കഴിയുകെയ േമേല eടുtു
3
Then he shaved his upper lip
3
He had red bites all over his body
3
She told Kim to fill out many forms
2
ഞാന് i ികെയ eറി ു ഞാന് oരു മരെt eന്െറ പൂേnാ tില് ന പിടിpിc
പിnീട് aവന്െറ upറ് ചിറി kൗരംെചy ക aവന് aവന് ചുവn കടികള് eലാ േമല് aവന്െറ ശരീരം u ടായിരുnു aവള് കിം ധാരാളമായ രൂപ െള പുറt് നിറ kുക പറ ു
ടായിരുnു
2
ുെകാ
െടtുക
3
She asked her mom to cut the mole off with a razor
aവള് aവള െട amെയ മുറിkുക മറുകിെന കള oരു kരktിേയാട് േചാദിc
3
He got up
3
3
It was still frozen He cut the brown spots out of the apple She took the bag of coffee beans out of the freezer She put a paper filter into a plastic cone She waited until the cup was full of hot coffee She sipped her coffee while she read the newspaper
3
I can pull the cap down over my ears
aവന് േമേല കി ി aത് നി ലമായിരിkുn തണുp െകാ ടുറc േപാകെp aവന് മുറിc തവി നിറമുll പുളളികെള പുറt് ആpി pഴatി െറ aവള് പുറt് fെരezeര് കാpി-പയറ് സ ചിെയ eടുtു aവള് oരു വര്tമാനp തം-ഫി തറിെന oരു ള തിക് കൂ പിന് ullില് i aവള് കp് ചൂടുll കാpിയുെട നിറ ആയിരുnു വെര കാtിരിc aവള് aവള െട കാpിെയ aവള് വ tമാനപ തെt വായിkുക aേpാള് മുtിkുടിc ഞാന് വലിkാം െതാpിെയ താെഴ eന്െറ െചവികള് േമല്
3
Paula said she wasn't pretty
3
I do not have anyone
3
I do not have any brothers
3
What was the noise
3
He drove it out to the street He wants to play golf with me next week
3
3
3
3
െപൗല aവള് നിസര്gസുnരമായ ആയിരുnു പറ ു eനിk് ഏെത ിലുെമാരാള് u ടായിരിkുക iല െചy ക eനിk് ഏെത ിലും സേഹാദര മാര് u ടായിരിkുക iല െചy ക
3
ശbം ആയിരുnു ഏത്
aവന് വ ടിേയാടിc aതിെന പുറt് െതരുവിന് aവന് കളിkുക െഗാ ഫ്~e eെnേയാട് aടുt ആഴ്ച ആവശ െpടുnു ഞാന് നി െള സഹായിkുക aത് eടുkുമ് ധാരാളമായ ആ ചകള് നി ള െട െഗാല്ഫ്-തൂkിയിടുകെയ പുേരാഗമിkുക
His friends didn't believe him Will you help me improve my golf swing
aവന്െറ കൂ കാരനുകള് aവെന വിശ സിkുക നി ള് eെn eന്െറ െഗാല്ഫ്-തൂkിയിടുകെയ പുേരാഗമിkുക സഹായിkുമ്
Don't drink and drive Let me make you a cup of hot chocolate The landlord said he would talk to the lady
കുടിkുക uം വ ടിേയാടിkുക eെn നി ള് ചൂടുll െചാെകാലെ യുെട oരു കp് ആയി u ടാkുക aനുമതിെകാടുkുക ഭൂവുടമsന് aവന് മാന സ തീk് സംസാരിkാം പറ ു enാലും aവന് െചy ക
3
But he never did The dog's mouth was bigger than the dog
3
One day he yelled at the dog
aവന് നായയില് aലറുക on് പകല്
3
The lady got angry
മാന
3
aവള് മറുപടിപറയുക
3
She couldn't answer He knew that electricity was dangerous The teacher walked into the classroom
3
He didn't know what to do
aവന് െചy ക ഏത് aറിയുക
3
What was wrong
െത ായ ആയിരുnു ഏത്
3
But he didn't tell them how he felt
enാലും aവന് aവെരtെnെയ aവന് aനുഭവിc
3 3 3 3
3
3
3
I can't help you It will take many weeks to improve your golf swing
3
3
നായ ് വായ് നായ കാള് ബിgറ് ആയിരുnു തീ കുപിതനായ കി ി
aവന് ആലkികത ആപല്kരമായ ആകുക aറിയുക ീചറ് kസ സൂtിന് ullില് നടnു
്
e
െന പറയുക
3
He didn't say anything
aവന് eെn ിലുെt പറയുക
3
Many planes were behind it
ധാരാളമായ നിരpായകള് aത് പിnില് ആയിരുnു
3 3
Planes are for flying, not sitting The doctor asked her a lot of questions
നിരpായകള് പറkുക , iരിkുക ആണ് ൈവദ ന് aവള് േചാദ ള െട oരു eലാം ആയി േചാദിc
3
The doctor asked for a blood sample
ൈവദ ന് oരു രkം-ആദ ശtിനുേവ
3
You need to exercise
നി
3
Walk up stairs every day
േകാണികെള േമേല നടkുnു ഓേരാ പകല്
3
She didn't believe her doctor
aവള് aവള െട ൈവദ നിെന വിശ സിkുക
3
She didn't exercise
aവള് aഭ സിkുക
3
I am not a kid
ഞാന് oരു കു ി ആണ്
3
I have no food
eനിk് iലാെത ഭkണം u
3
But water has no taste
enാലും െവളളtിന് iലാെത രുചി u
3
My car has no bed
3
Many people do not have a car
eന്െറ കാരിന് iലാെത കിടk u ട് ധാരാളമായ ആള കള ക k് oരു കാര് u iല െചy ക
3
A street has no bed
oരു െതരുവിന് iലാെത കിടk u
3
I don't know what to do
ഞാന് െചy ക ഏത് aറിയുക
3
I don't know where to go
3
Carol said they must leave early
3
You never know what can go wrong
3
They left two hours early
3
The train made many stops
തീവ
3
Target was having a sale
unം oരു വി പനെയ u
3
We have no more mustard
ഞ
3
She told him to take a seat
aവള് aവന് oരു iരിpിടെt eടുkുക പറ
3
He looked at the date on the magazine
aവന് ദിനtില് മാസികയില് േനkി
3
He didn't mind
aവന് ശdിkുക ആള കള കള് സുെപ മനിന് ക k eലാ െള െകാടുtു
He was at the bus stop
ട്
ട്
ടായിരിkുക
ട്
നി ള് െത ായ േപാകാം ഏത് aറിയുക aവര് ര ട് മണിkൂരുകള് iളം പായtില് uേപkിkലി ടി ധാരാളമായ ബs്~കെള u
ടാkി
ടായിരുnുെകാ
k് iലാെത െമാെര കടുക് u
ടിരിc
ട് ു
ടatി െറ
aവന് ബസ്-ബs്~iല് ആയിരുnു aത് etിേcരുക ബസkുേവ ടി സമയം ആയിരുnു
3
It was time for the bus to arrive
3
He stood up again
3 3
I also get the news from my friends I can read the newspaper any time I want
3
I can read any story I want
aവന് നിnു േമേല വീ ടും ഞാന് aതുകൂടാെത കി nു വാ tെയ eന്െറ കൂ കാരനുകളി നിn് ഞാന് വായിkാം വ tമാനപ തെt ഏെത ിലും സമയം ഞാന് ആവശ െpടുnു ഞാന് ഞാന് ആവശ െpടുക ഏെത ിലും കഥെയ വായിkാം
3
But it didn't stop
enാലും aത് നിര്tിവ kുക
3
It kept going I have only one problem with my newspaper
aത് േപാകുക വ c
3
eനിk് on് പ നം onുമാ തമായ eന്െറ
ടിവരിെകൗnു
ഞാന് േപാകുക e ് aറിയുക ഭkിഗീതം aവര് iളം പായtില് uേപkിkല് പറ ു
People gave Superman lots of candy
3
3
ള് aഭ സിkുക േവ
ടി േചാദിc
വര്tമാനപ തം u
3
People like good news He told her that he was in love with her
നല വാര്t േപാെല ആള കള കള് aവന് aവള് aവന് aവേളാട് സേനഹtില് ആയിരുnു പറ ു
3
He did not even know her
aവന് േപാലും aവെള aറി
3
He didn't know anything about her
aവന് eെn ിലുെt aവള് പ ി aറിയുക
3
She did not see him or hear him
aവള് aവെന ക
3 3
Everything you touch has germs Wash your hands after you touch other people
നി ള് aണുkള് u ട് െതാടുക കഴുകുക നി ള് പിnീടു നി ള െട ൈകകള് പര്ശിkുക മേ തായ ആള കള കള്
3
It is not a wide road
aത് oരു വി താരമുളള നിരt് ആണ്
3
I will plug it in
ഞാന് aട kുമ് aതിെന aകേtk്
3
But it wasn't a good job
enാലും aത് oരു നല േജാലി ആയിരുnു
3
That made him angry
ആ aവന് കുപിതനായINF u
3
He didn't want to go to jail
aവന് ജയിലിന് േപാകുക ആവശ െpടുക
3
Right now life was bad
ipം aര്ഹത ചീtയായ ആയിരുnു ജീവിതകാലം
3
But he would make it better
enാലും aവന് aത് കൂടുത നലതായINF u
3
This was not their money She went across the street to the liquor store
iത് aവരുെട പണം ആയിരുnു aവള് െതരുവ് വിരുdമായ ദാവകം-കലവറk് േപായി
3
ില
ടില aഥവാ aവെന േകള്kുക
ടാkി
ടാkാം
3
a പകാരം aവള് aവള െട ബസെയ ന െpടുക a പകാരം aവള െട തലവന് aവള് കാലംെത ിയ ആയിരുnു aറി ില
3
But she couldn't find dark red
enാലും aവള് iരു
3
Are you coming or not
നി
3
Why wouldn't I
ഊi െവൗല്ദ് ഞാന്
3
Your grandma told you not to lie
നി
3
No one can tell the truth all the time
iലാെത oരു eലാം സമയം േനരിെന പറയാം
3
Everyone lies sometimes
eെവര്െയാെന കൂെടkൂെട നുെണൗnു
3
You lie to be polite
You lie to get something you want
നി ള് നുണ സഭ മായINF ആകുക നി ള് നുണ നി ള് േനഹിkുക െസാെമാെനെയ പതിേരാധിkുക നി ള് നുണ നി ള് ആവശ െpടുക eേnവസതുവിെന കി ക
3
You lie to be popular
നി
3
They say that they never lie
aവര് aവര് നുണ പറയുnു
3
തറയില് മ
3
He plants yellow corn in the ground He plants the yellow corn in the spring
3
They don't pay for it
aവര് aതിനുേവ
3
They eat it while it is in the field
aവര് aത് aത് പാടtില് ആകുക aേpാള് തിnുnു
3
They don't cook it
aവര് aതിെന പാചകംെചy ക
3
They eat it raw
aവര് aത് aപക മായINF തിnുnു
3
The farmer doesn't get angry
കര്ഷകന് കുപിതനായ കി ക
3
Is that true
ആണ് ആ തു
You lie to protect someone you love
3
3
3
So she didn't miss her bus So her boss did not know that she was late
3
ട ചുവpിെന ക
െടtുക
ള് aഥവാ iലെയ വരുക ആണ് ള െട ഗന് മ നി
ള് നുണ പറ
ു
ള് നുണ ജനകീയമായINF ആകുക ധാന ം aവന് െചടികള്
വസnകാലtില് മ
ട്
ധാന ം aവന് െചടികള്
ടി വിലെകാടുkുക
3
We do not have fancy technology
ഞ k് ഭാവന പയുkശാ iല െചy ക
3
Then it got louder
പിnീട് െലൗദറ് കി ക aത്
3
It was a storm
aത് oരു െകാടു ാ ് ആയിരുnു
3
The mountains will have snow
മലക k് മ
3
3
Then he can read again But right now he must live with the pain They both worked for the same supermarket But you can't make a snow man out of hail They planned to get married and live together
3
But they needed a down payment
പിnീട് വീ ടും വായിkുക aവന് enാലും ipം aര്ഹത aവന് േവദനേയാട് ജീവിkുക aവര് iരുവരും പണിെയടുkുക തുല മായ സുെപ മ െക ിനുേവ ടി enാലും നി ള് u ടാkുക oരു മ ്-പുരുഷനിെന പുറt് വിളിയുെട aവര് മറീട് കി ക uം െ ാെഗtറ് ജീവിkുക uേdശ ംi enാലും aവര് oരു പkിcിറകിെലമൃദുേരാമം വീ ലിെന േവ ടിവരിൈക
3
Two people were dead
ര
3
Time always went too fast
സമയം eേpാഴും േപായി aധികമായി േവഗtില്
3
They shook the sand out of the towel Thank you for taking me to the beach today
aവര് ചലിc മണലിെന പുറt് തൂവാലയുെട നി െള eെnെയ കടല്tീരം-inിന് eടുkുക കൃതjതകാ ക
3
3
ട് ആള കള കള് പൂര് മായി ആയിരുnു
3
Other things have weird shapes
aവര് aവരുെട കാലുകള് നന INF കി ി aവര് ക ടു ധാരാളമായ ആള കള കെള തമാശെയ u ടായിരിkുക െകൗ ര്ീസക k് മ nവാദസംബnമായ ആകൃതികള് u ട്
3
3
Then he visited her at home But it was the best thing that ever happened to you Prisoners can't hide in their orange uniforms A recession is a time when people have only a little money
പിnീട് aവെള വീ ില് സn ശിkുക aവന് enാലും aത് തുടര്cയായി നി k് സംഭവിkുക ത ് െബs് സംഗതി ആയിരുnു തടവുകാരനുകള് aവരുെട മധുരനാര യൂണിേഫാറ ളില് oളിkുക oരു പിന്വാ ല് ആള കള ക k് onുമാ തമായ oരു െചറിയ പണം u ട് eേpാള് oരു സമയം ആണ്
3
They don't buy new things
aവര് പുതിയ സംഗതികെള വാ
3
They don't go to Disneyland
aവര് ദിസെനഐല ഡിന് േപാകുക
They point the gun at a friend
aവര് േതാkിെന oരു കൂ കാരനില് നിയമിkുnു
They get cancer
3 3
3
3
3
They got their feet wet They watched many people having fun
3
ുക
It does not have any stairs
3
It doesn't have a second floor They had to take her to the doctor often He said she would be healthy in a few years
aതിന് oരു aടിtറ u ടായിരിkുക ന് ് െചy ക aവ k് eടുkുക aവെള ൈവദ നിന് കൂെടkൂെട u ടായിരുnു aവന് aവള് oരു ചുരുkം െകാല ളില് ആേരാഗ കരമായ ആകുക പറ ു
3
The ice cubes floated to the top He took a cigarette out of the Marlboro box
iെക-സമചതുരഷ ഭുജ ള് പ പരtിന് oഴുകി aവന് eടുtു oരു ചിഗെരെtെയ പുറt് മര്ല്െബാെരാ-െപ ിയുെട
3
But it was not gold
enാലും aത് സ
3
Then one of the kids caught a fish
പിnീട് oരു മt െt പിടിkുക കു ികള െട on്
3
aവര് ഞ ടിെന കി ക aതിന് ഏെത ിലും േകാണികള് u െചy ക
3 3 3
ടായിരിkുക വില്ല്
3
ടായിരിkുക
3
്u
തം u
ടായിരിkുക iല
ം ആയിരുnു
3
They took a chance with their money
3
People said they don't have money
aവര് oരു aവസരെt aവരുെട പണtിേനാട് eടുtു ആള കള കള് aവ k് പണം u ടായിരിkുക ന് ് െചy ക പറ ു
3
They have no money
aവ k് iലാെത പണം u
3
3
The wife couldn't believe it They passed a new law to protect children They jumped when they heard a strange sound
ഭാര aതിെന വിശ സിkുക aവര് കടnുേപായി oരു പുതിയ നിയമെt കു ികെള പതിേരാധിkുക aവര് aവര് oരു aസാധാരണമായ ആേരാഗ മുllെയ േക eേpാള് ചാടി
3
I think she likes to argue
ഞാന് aവള് വാദിkുക i െpടുnു ചിnിkുnു
3
I wanted it closed
3 3
I guess my vocabulary is not so good The pilot said he felt sorry for the two dead birds
ഞാന് aത് aട c ആവശ െp ഞാന് eന്െറ പദാവലി a പകാരം നല ആണ് ഊഹിkുnു ൈവമാനികന് aവന് ര ട് മരിc പkിക kുേവ നിn മായ aനുഭവിc പറ ു
3
No one died
iലാെത oരു മരിc
3
she didn't care
aവള് ശd
3
But it didn't stop
enാലും aത് നിര്tിവ kുക
3
But Richard did not pull over
enാലും രിചര്ദ് oവറ് വലിcില
3
He never changed his car clock
aവന് aവന്െറ കാര്-ഘടികാരെt മാ ക
3
It wasn't broken
3
He put it back on the stove
3
I hope she is all right
3 3
Their mom wasn't going She takes another cookie out of the package
3
He teaches them how to read
3
She takes it out of her mouth
aവരുെട am േപാകുക aവള് പുറt് െപാതിെk ് മെ ാരു കൂകീെയ eടുkുnു aവന് aവെരtെnെയ വായിkുക e െന പഠിpിkുnു aവള് eടുkുnു aതിെന പുറt് aവള െട വായി െറ
3
They are pictures with words
aവര് പദ
3
He takes his glasses off
aവന് i aതിെന പിnില് aടുpില് ഞാന് aവള് aര്ഹത ആണ് പൂര് മായി പതീkിkുnു
േളാട് ചി തം ആയി ആണ്
3
He did not have any friends in school
3
She jumped in again
aവള് ചാടി aകേtk് വീ ടും
3
The other was the manager Delhi Police files FIR against Kiran Bedi You can not wash your hands too often
മേ തായ േമ േനാ kാരന് ആയിരുnു െഡല്ഹി-േപാലീസ് കിരണ്-െബദി eതിേര ശ കപാദപെt രാണ് നി ള് നി ള െട ൈകകെള കൂെടkൂെട aധികമായി കഴുകാ പ ില
Her mom told her not to worry James took the milk out of the refrigerator
aവള െട am aവള് വ ാകുലെpടുക പറ െജയിംസ് eടുtു പാലിെന പുറt് ശീതീകരണയ natി െറ
3
He needs his glasses to read She does not have many germs on her hands
aവന് eടുkുnു aവന്െറ ക ാടികെള കള ് aവന് േവ ടിവരിെകൗnു aവന്െറ ക ാടികെള വായിkുക aവ k് aവള െട ൈകകളില് ധാരാളമായ aണുkള് u ടായിരിkുക iല െചy ക aവന് വിദ ാലയtില് ഏെത ിലും കൂ കാരനുകള് u ടായിരിkുക iല െചy ക
3
3 3 3 3
ടി
aത് െപാ ിkുക
3
ട്
ു
3
Both mayors were happy
iരുവരും നഗരപതികള് സnുഷ്ടമായ ആയിരുnു
3 3
You can not miss them My mother wished me to have a good education
3
I don't need new workers right now
3
I will teach you how to be even better
നി ള് aവെരtെnെയ ന െpടാ പ ില eന്െറ am eെnk് oരു നല വിദ ാഭ ാസം u ടായിരിkുക ആ ഗഹിc ഞാന് പുതിയ പണിkാരിെന ipം aര്ഹത േവ ടിവരിക ഞാന് നി െള േപാലും െബ റ്INF ആകുക e പഠിpിkുമ്
3
He didn't like that idea Place your hands firmly on the ground A group of frogs were traveling through the woods
aവന് ആ ആശയെt i െpടുക
Some people say that lying is bad You cannot talk about colors to blind people He had many arms and multiple heads.
a പം ആള കള കള് നുണ ചീtയായ ആണ് പറയുnു നി ള് നിറ ള് പ ി anമായ ആള കള ക k് സംസാരിkാ പ ില
The waves at the beach will be high The next morning the weather was clear
കട tീരtില് തിരമാലകള് uയര്n ആകുക
3 3 3 3 3 3
She was also having fun He had one thousand children with his wives
3
3
വ kുക നി ള െട ൈകകെള മുറുെക തറയില് തവളകള െട oരു സംഘം തടികള് മുേഖന യാ തെച തുെകാ ടിരിc
aവന് ധാരാളമായ ൈക uം മട
ടായിരുnു
കാലാവs വ k ആകുക aടുt പഭാതം aവള് തമാശെയ aതുകൂടാെത u ടായിരുnുെകാ ടിരിc aവന് on് ആയിരം കു ികള് aവന്െറ ഭാര മാര് u ടായിരുnു
aത് oരു മാനുഷികമായ oc ആയിരുnു കു ികള് ഓേരാ മേ തായെയ ക ടുമു ക സnുഷ്ടമായ ആയിരുnു
3
But they can be dangerous
3
I know what you are thinking
enാലും aവര് ആപല്kരമായ ആകുക ഞാന് നി ള് ചിnിc െകാ ടിരിkുnു ഏതിെന aറിയുnു
3
They were meeting after 10 years Your father had an accident while driving to office
3
3
It made him feel very proud They waited till everyone else was asleep it became difficult to preserve the peace
3
3
It was a human voice The children were never happy to meet each other
3
3 3 3 3 3
Rama gave Hanuman Sita's ring Rotate your fist clockwise and anticlockwise. Lower you head to face your navel. Place your hands firmly on the ground
aവര് 10 െകാല ള് പിnീടു ക ടുമു ിെകാ ടിരിc നി ള െട acനിന് കാര ാലയtിന് വ ടിേയാടിkുക oരു aപകടം-aേതസമയം u ടായിരുnു aത് aവന് വളെര aഭിമാനമുll aനുഭവിkുnു u ടാkി aവര് eലാവരും aലായിരുnുെവ ില് നി ദയില് ആയിരുnു aതുവെര കാtിരിkുക aത് ശാnിെയ പുലര്tുക ദു കരമായ ആയിtീ nു രമ ഹനുമന്-സീത ് വളയെt െകാടുtു നി ള െട ൈകpട ഘടികാരദിശയില് uം aണ് ിെ ാk ിെസ കറkുക നി ള് െലാവറ് നി ള െട െപാkിെള േനരിടുക തലവ kുnു വ kുക നി
ള െട ൈകകെള മുറുെക തറയില്
ൈദവ
3
The gods lived in Heaven In the skies there were magical creatures.
3
It was inhabited by humans.
aത് മനുഷ ഗുണ
3
ആകാശ
ായ തല u
3
െന
ള് ആശാശtില് ജീവിc ളില് aവിെട മഗികല് ആകുക ജnുkള്. ള llകളാല് പാ kെp
3
He went about disguised as Indra
aത് രാജാവ്-െ ാസെരാ ാല് ഭരിkെp aത് oരു ക ാടി-ആരാധനാസഥലtില് iലാtയാള് aവെന െകാലാ പ ി a പകാരം വ kെp aവന് സംബnിc് ദിസഗ ിെസദിെന iന് ദ a പകാരം േപായി
3
He seduced many of the goddesses. He had one thousand children with his wives The demons became gradually more aggressive and powerful Janaka gave her the name Sita and brought her up
aവന് േദവികള െട മനിെയ വഴിെത ിc aവന് on് ആയിരം കു ികള് aവന്െറ ഭാര മാര് u ടായിരുnു പിശാചുകള് പടിപടിയായി െമാെര aെ gsിവ് uം വീരനായ ആയിtീ nു ജനക aവ k് േപര്-സീതെയ െകാടുtു uം െകാ ടുവരിൈക aവെള േമേല
He was educated by Shiva Hanuman's mother had told him to join Rama, He advised Rama against proceeding towards Lanka on his own. They would need a powerful army to succeed Hanuman mobilised support from friendly kings, Rama instructed Hanuman to find the way to Lanka and to carry a message for Sita.
aവന് ശിവയാല് പഠിpിkെp ഹനുമന് ് am aവന് രമെയ േയാജിpിkുക പറ ി ടായിരുnു , aവന് ല േലk് aവന്െറ o നില് മുേnാ നീ eതിേര രമെയ uപേദശിc aവര് േവ ടിവരികആം oരു വീരനായ േസനെയ തുട cയായിവരിക ഹനുമന് ചാെയ സൗഹാര്dപരമായ രാജാk മാരി നിn് പടെയരുkംെച തു
3 3 3 3 3 3 3
രമ ഹനുമന് വഴിെയ ക െടtുക നിര്േdശിkുക ല aന്ഡ് സീതkുേവ ടി oരു സേnശെt ചുമkുക
aവന് സീത ് വളയെt പിnില് രമ െകാടുtു രാവണാ സീതെയ ല k് െകാ ടുവരിൈകയി ടായിരുnു uം utരവാകുക aവന്െറ ആയിരം മkെള aവെള കാkുക രമ uം ലkമന് സീതയുെട aേന ാഷിkുകയില് പുറt് സഥാപിc aടിയnിരമായി ഹനുമന് oരു രാജകുമാരികള െട മകന് a പകാരം സഹിkെp ഹനുമന് aവന്െറ acന് ് ബലം uം മഗികല് ശkി aനnരവകാശമായികി ി
3
Sadayu attacked Ravana in the air
he tried to lure her to leave Rama when she refused, Ravana shifted back to his true shape. Ravana abducted her and flew away with her towards Lanka
സദയു വായുവില് രാവണാെയ ആ കമിc രാവണാ oരു ചാkുഷവിദ വളയെt സീത ് ൈകവിരലി നിn് eടുtു uം aതിെന സദയുവില് eറി ു രാവണാ aവnെnെയ oരു മുനിk് ullില് പരിവ tിc aവന് aവള് രമെയ uേപkിkല് പേലാഭിpിkുക ശമിc aവള് നിരസിc eേpാള് , രാവണാ aവന്െറ തു ആകൃതിk് പിnില് സഥാനംമാ ്റു രാവണാ aവെള ത ിെkാ ടുേപാകുക uം പറnു ദൂെര aവേളാട് ല േലk്
he decided to seize her he went to the forest together with one of his men They went near the place where Rama lived with his wife and his brother. Mareet transformed himself into a beautiful golden deer.
aവന് aവെള കyടkുക തീരുമാനിc aവന് വനtിന് െ ാെഗtറ് aവന്െറ പുരുഷ മാരി െറ on് േപായി aവര് iടം aരിെകയുളള രമ aവന്െറ ഭാര uം aവന്െറ സേഹാദരന് ജീവിc e ് േപായി മരീ aവnെnെയ ullില് പരിവ tിkുക a സുnരിയായ കനകമയമായ മാന്
3
3
Ravana took a magic ring from Sita's finger and threw it at Sadayu Ravana transformed himself into a hermit
3
3
3
3
he gave Sita's ring back to Rama. Ravana had brought Sita to Lanka and ordered his thousand sons to guard her. Rama and Lakshaman immediately set out in search of Sita. Hanuman was born as the son of a princess Hanuman inherited his father's strength and magical powers.
3
3 3 3 3 3 3 3
ുക
3
3
3
It was ruled by King Tosarot. It was kept in a glass shrine so that nobody could kill him.
3
3
Rama went hunting for the deer he cried out for help with the voice of Rama
രമ ഹു ടി aവന് കര ocേയാട്
3
Lakshaman heard the cry
ലkമന് കരയുകെയ േക
3
He and Rama remained friends,
aവന് uം രമ കൂ കാരന് ആയി aവേശഷിc
3
he refused to accept the throne. Rama would not break his father's promise. He went to live in the deep forests together with Sita and his younger brother, Lakshaman.
aവന് സിംഹാസനെt സ ീകരിkുക നിരസിc
Barata accepted the throne He vowed to kill himself after fourteen years
ബരത സിംഹാസനെt സ ീകരിc aവന് aവnെnെയ പതിnാല് െകാല െകാല ക െവാവി
3
Rama did not return to retrieve it Rama and Sita lived happily together for a while. Kaiyakesi was the second wife of King Tosarot of Ayudhya
3
King Tosarot, had promised her to fulfil any of her wishes.
3
she wished that her son Barata should rule as a king for fourteen years.
രമ aതിെന രkെpടുtുക തിരിെctിയില രമ uം സീത ജീവിc സേnാഷേtാെട െ ാെഗtറ് oരു aേതസമയtിനുേവ ടി ൈകയെകസി aയുധ ായുെട രാജാവ്-െ ാസെരാ ി െറ aധികമായ ഭാര ആയിരുnു രാജാവ്-െ ാസെരാ ് , u ടായിരിkുക aവള് നിറേവ ക പതിൈj aവള െട ഏെത ിലും ആ ഗഹിkുnു. aവള് aവള െട മകന്-ബരത പതിnാല് െകാല kുേവ ടി oരു രാജാവ് a പകാരം ഭരിkണം ആ ഗഹിc
3
King Tosarot became very unhappy He would not eat nor drink until he died of grief.
രാജാവ്-െ ാസെരാ ് വളെര aസnു ആയിtീ nു aവന് തിnാ പ ില aതുമില aവന് ദുഖatി െറ മരിc വെര കുടിkുnു
3
Once he heard the story.
aവന് കഥെയ േക
3
3
3
ള് പിnീടു
3
3
aവന് വന k് െ ാെഗtറ് സീത uം aവന്െറ െയൗ റ് സേഹാദരന് , ലkമന് േപായി
3
ിെന മാനിനുേവ ടി േപായി ു പുറt് സഹായtിനുേവ ടി രമയുെട
രമ aവന്െറ acന് ് വാkിെന െപാ ിkാ പ ില
3
3
oരിkല് .
PUBLICATIONS
[1] R. Harshawardhan, Mridula Sara Augustine, K. P. Soman, “Phrase based English – Tamil Translation System by Concept Labeling using Translation Memory”, in Int. Journal of Computer Applications (IJCA), ISSN: 0975-8887, Vol. 20, No. 3, April 2011.
[2] R. Harshawardhan, Mridula Sara Augustine, K. P. Soman, “A Simplified Approach to Word Alignment Algorithm for English-Tamil Translation”, in Indian Journal of Computer Science and Engineering (IJCSE), ISSN: 0976-5166, Vol. 2, No. 1, 2011.
[3] R. Harshawardhan, Mridula Sara Augustine, K. P. Soman, “Advanced English – Malayalam Translation Memory for Natural Language Processing Applications”, in Proc. of Nat. Conf. on Indian Language Computing (NCILC), February 2011.