Iterative Translation by Monolinguists implementation and tests of the new approach

Anna Potępa, Piotr Płonka, Mateusz Pytel, Dominik Radziszowski∗

Department of Computer Science, AGH University of Science and Technology, Al. Mickiewicza 30, 30-059 Kraków, Poland

Abstract

The paper presents Iterative Translation by Monolinguists (ITM), a new approach to the translation process in which a translation is carried out cooperatively by persons without a common language, i.e. by two monolinguists instead of a translator. To prove the effectiveness of the approach, a pilot ITM platform and its dedicated extensions for both text translation and instant messaging communication have been designed, implemented and tested. The results of the tests confirm ITM’s usability in terms of both quality and efficiency. The measured translation quality is acceptable and the translation provisioning time is significantly lower. The concept is likely to reduce the multilingual information flow barrier and to serve as a transitory solution while an infallible machine translation engine is being created.

Keywords: iterative translation, monolingual human translation, web services, machine translation, translation service platform, translation quality testing

1. Introduction

Pierre from France and Lisa from the USA want to communicate with each other, but each of them knows only their native language. They do not want to use a professional translator’s help because it would be too expensive, inconvenient or embarrassing. The only remaining option is to use one of the Machine Translation (MT) engines, which provide a fast and free way of having a text translated. Although widely used and continuously enhanced, MT engines still introduce many errors into the translations they produce. What should Lisa do with machine-translated phrases she does not understand?

In our solution Lisa highlights the incomprehensible fragments of the translated text, and Pierre, seeing the problem, tries to say the same thing using simpler words. Ideally, this time the MT engine translates the paraphrased text correctly and Lisa is able to grasp a meaning that was previously inaccessible to her. If the MT engine fails again, the process may be repeated. As a result, after one or more iterations Lisa should receive a correct and comprehensible text.

The above real-life example relies on the ability of almost any person (especially a highly motivated one) to recognize and correct errors in texts written in his or her native language and to express ideas in different ways (to paraphrase). Exploiting these facts in combination with machine translation at its current stage of maturity makes it possible to prepare translations of acceptable quality. This approach has been called ITM – Iterative Translation by Monolinguists. With ITM, translation is carried out cooperatively by two or more people without a common language, i.e. by monolingual users instead of a translator – Fig. 1. Such a translation aims to be more reliable than automated algorithms, yet far cheaper and faster than the traditional approach involving a professional translator.

∗ Corresponding author, tel. +48 12 6333982, fax: +48 12 6339406. Email addresses: [email protected] (Anna Potępa), [email protected] (Piotr Płonka), [email protected] (Mateusz Pytel), [email protected] (Dominik Radziszowski)

Figure 1: Simple paraphrasing process between users in ITM communication model

To prove the effectiveness of this approach, a pilot implementation of an ITM system has been developed and tested. This paper presents the results of research on the ITM platform and its dedicated extensions for both text translation and instant messaging communication, together with a series of simulated test scenarios performed by monolingual users. Section 2 focuses on a classification of machine translation output, which is crucial for the arrangement of ITM’s decision flow and for the translation model proposed in Section 3. Section 4 gives technical details of the ITM platform implementation and is followed by the results of effectiveness tests collected in Section 5. Section 6 describes how ITM data can benefit MT engines, and Section 7 presents conclusions and the benefits that the ITM system can bring to different translation parties.

2. Quality Classification of MT results

In general, the ITM approach is based on an MT engine and assumes that its result is enhanced by both a target and a source language speaker. As automatically translated segments are far from perfect, and are sometimes completely incomprehensible, the MT output has to be divided into quality-related categories that determine the necessary corrective actions. From the point of view of the target language monolinguist, the following categories were introduced:

1. VALID - comprehensible and correct output.

2. CORRECTABLE - comprehensible output which requires some post-editing changes in style, punctuation, vocabulary or grammar.
Example:
MT translation (EN): This solution dedicate to companies who reduce capital expenditure.
Correct translation (EN): This solution is dedicated to companies that reduce capital expenditure.

3. INCOMPREHENSIBLE - the translation may be nonsensical; there is no chance that a monolingual target language speaker will know how to post-edit it without additional information.
Example:
Source segment (FR): À l’impossible nul n’est tenu.
MT translation (EN): In the impossible no one is required.
Correct translation (EN): No one is bound to do the impossible.

4. DUBIOUS - the output is comprehensible (correct, or in need of some post-editing changes), but it may contort the original meaning. The likelihood of such a mismatch is judged primarily on the basis of the text’s context.
Example:
Translation: I went to bed, I could only think of a sleeve.
Possible correction: ’sleeve’ to ’sleep’, as indicated by the word ’bed’, but one cannot be certain that the original sentence did not mention ’sleeve’.

5. FALSE - comprehensible and correct target language output which carries an improper meaning. Such output introduces false information into the translated text.
Example:
Source segment (FR): C’est un homme qui a annoncé qui avait remporté la vente aux enchères.
MT translation (EN): This is the man who announced that he had won an auction.
Correct translation (EN): This is the man who announced who had won an auction.
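The five categories can be written down as a small data structure. The Python sketch below is only an illustration, not part of the pilot platform: the action labels and the helper `needs_source_speaker` are hypothetical names, and the mapping from category to action is an assumption based on the descriptions above.

```python
from enum import Enum


class Category(Enum):
    """Quality categories assigned by the target-language monolinguist."""
    VALID = 1             # comprehensible and correct
    CORRECTABLE = 2       # comprehensible, needs post-editing
    INCOMPREHENSIBLE = 3  # cannot be post-edited without more information
    DUBIOUS = 4           # comprehensible, but may contort the meaning
    FALSE = 5             # reads correctly, carries the wrong meaning


# Hypothetical action labels; the mapping itself is an illustrative assumption.
CORRECTIVE_ACTION = {
    Category.VALID: "accept",
    Category.CORRECTABLE: "post-edit",
    Category.INCOMPREHENSIBLE: "request paraphrase",
    Category.DUBIOUS: "request paraphrase",
    Category.FALSE: "none (looks valid to a monolingual reader)",
}


def needs_source_speaker(category: Category) -> bool:
    """True when the target-language user cannot resolve the segment alone."""
    return category in (Category.INCOMPREHENSIBLE, Category.DUBIOUS)


for category in Category:
    print(category.name, "->", CORRECTIVE_ACTION[category])
```

Note that a FALSE segment, by definition, cannot be told apart from a VALID one by a reader who sees only the target text, which is why the sketch assigns no action to it.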

There are existing systems which help human translators to manage and correct the output of translation memories [1, 2] or MT engines [3] by displaying both the original and the translated segments. A bilingual person can then compare them and fix possible mistakes. The ITM approach addresses the problems in groups 2, 3, 4 and 5 so that they can be resolved by monolingual users.

A translation produced by an MT engine can be easily enhanced by a target language speaker. Machine-translated segments need to be revised and assigned to the appropriate groups. Someone who is familiar with the target language can recognize accurate sentences from the first category, correct the mistakes in sentences from the second group, and thereby introduce a more appropriate style to the final text.

When another user who knows the source language joins the process, new possibilities emerge. If the first user (the target language speaker) cannot understand a segment or finds it dubious, a paraphrase of its original form can be requested. The same phrase, written in other words in the source language and carrying the same meaning, can be machine translated into the target language again; with a bit of luck the translation is understandable this time, or at least gives a hint about the meaning of the original one. If the translation of the paraphrase is not sufficient, there are other ways to clarify the meaning, such as images and comments, which are discussed later. In the worst case, when everything fails, the sentence has to be translated by a bilingual human; this is an acceptable fallback, as such problems are expected to occur rarely.

The main advantage of this method is that it is easier to find two monolingual people than one person who speaks both languages. This benefit is even more significant when the two languages are unlikely to be known by the same person, for example Dutch and Armenian. In that case the standard approach is to have the text translated into English and then into the target language, which doubles the work and may introduce more errors. Instead of having the text processed twice by bilingual translators, monolingual people could use ITM to translate the text directly and work together to enhance the translation. What is more, cooperative translation enables more than two users to work on the same text concurrently.

3. ITM Translation Model

An ITM system that enables users to translate texts or messages cooperatively needs to provide a number of features to ensure that comprehension is achieved. Those features define the actions performed on individual segments, which are the smallest logical units of the text and mostly correspond to single sentences. At the beginning of the process, the text is split into segments [4], and then each segment is machine translated. At this point, the output returned by the MT engine is presented to the user who speaks the target language. For each segment, this user decides which category it belongs to and undertakes an appropriate action. In case of incomprehension, the paraphrasing process is triggered.

Fig. 2 shows the actions and decision flow of the target language monolinguist. The diagram includes three basic ITM actions (in black) and optional actions (in grey), which define additional means of communication that can significantly speed up the entire process.

Figure 2: Target language monolinguist actions and decision flow

Basic actions cover:

• Acceptance - The segment is found comprehensible and correct.
• Edition - The segment is grammatically wrong, or contains any other kind of mistake introduced by the MT engine that can be fixed. After editing, the user may accept the segment. In the background, the improved translation is sent back to the MT engine as a suggestion or stored in a database.
• Paraphrase Request - The segment or a part of it is incomprehensible. The target language user requests a paraphrase and puts further editing of the segment on hold until a response is delivered. Once the translated paraphrase is available, the user should try to fix the segment translation using the provided information.

Optional actions supported by the ITM system:

• Alternative Translation - The system may use two or more MT engines. Provided with multiple translations of a given segment, the target language monolinguist can pay special attention to text that the engines translated differently and request a paraphrase of it, reducing the number of false facts introduced into the translation, i.e. errors in category 4 or 5 (comprehensible, but with a contorted sense).
• Verification Request - Sometimes the editing user cannot decide whether the fixed sentence conveys the real meaning of the original one in the source language. The edited sentence should then be reverse-translated and confirmed or rejected (and paraphrased) by a source language user.
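The target-side actions listed above can be read as transitions of a per-segment state machine. The following Python sketch shows one possible encoding; it is an assumption for illustration rather than the platform's data model, and the status names, the `Segment` class and its methods are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List


class Status(Enum):
    PENDING = auto()                # waiting for the target-language user
    ACCEPTED = auto()
    AWAITING_PARAPHRASE = auto()    # editing is on hold
    AWAITING_VERIFICATION = auto()  # reverse translation sent to the source user


@dataclass
class Segment:
    source: str
    translation: str
    status: Status = Status.PENDING
    hints: List[str] = field(default_factory=list)  # translated paraphrases

    def _require(self, expected: Status) -> None:
        if self.status is not expected:
            raise ValueError(f"expected {expected.name}, got {self.status.name}")

    def accept(self) -> None:
        self._require(Status.PENDING)
        self.status = Status.ACCEPTED

    def edit(self, new_translation: str) -> None:
        self._require(Status.PENDING)
        self.translation = new_translation

    def request_paraphrase(self) -> None:
        self._require(Status.PENDING)
        self.status = Status.AWAITING_PARAPHRASE

    def deliver_paraphrase(self, translated_paraphrase: str) -> None:
        self._require(Status.AWAITING_PARAPHRASE)
        self.hints.append(translated_paraphrase)
        self.status = Status.PENDING

    def request_verification(self) -> None:
        self._require(Status.PENDING)
        self.status = Status.AWAITING_VERIFICATION

    def verify(self, confirmed: bool) -> None:
        self._require(Status.AWAITING_VERIFICATION)
        # A rejected edit falls back to paraphrasing by the source-language user.
        self.status = Status.ACCEPTED if confirmed else Status.AWAITING_PARAPHRASE
```

Keeping every transition behind a status check makes the "on hold" rule explicit: a segment that is waiting for a paraphrase cannot be edited or accepted until the paraphrase arrives.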

Figure 3: Source language monolinguist actions and decision flow

Source language monolinguist’s actions are performed on demand, according to the flow in Fig. 3. There is only one basic action:

• Providing Paraphrase - Whenever a segment is incomprehensible to the target user, a source language speaker should provide an alternative source phrase which means the same as the original segment for which help has been requested. If necessary, the sentence should be analyzed in the context of the surrounding segments.

Optional actions include:

• Acceptance of reverse-translated segments which the target user post-edited without being sure that the edit was correct. This process can introduce further errors, but in our tests it proved to be a valuable addition to the system.
• Segment splitting - replacing long sentences with two or more shorter ones, which may be easier for a machine translation engine to translate.

To sum up, in the presented method of translation the target language user is responsible for the final form of the translated text. He or she alone decides whether help is needed and when a sentence can be accepted, while the source language user’s help is provided on demand. The presented translation model was used in the implementation of the experimental ITM platform. A graphical interface was also created which enables users to perform the described actions and thus use the ITM process to translate a text.
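The interplay between the two users amounts to a bounded loop: translate, check comprehension, paraphrase, repeat. A minimal Python sketch of that loop is given below. The function name, its parameters and the limit of two paraphrases are assumptions made for illustration; the callables stand in for the MT engine and the two human participants.

```python
from typing import Callable, Optional, Tuple


def translate_with_paraphrases(
    source: str,
    translate: Callable[[str], str],           # stands in for the MT engine
    is_comprehensible: Callable[[str], bool],  # target-language user's judgement
    paraphrase: Callable[[str], str],          # source-language user's rewrite
    max_paraphrases: int = 2,                  # assumed limit before escalation
) -> Tuple[Optional[str], int]:
    """Return (translation, paraphrases used); translation is None on failure."""
    text = source
    for used in range(max_paraphrases + 1):
        candidate = translate(text)
        if is_comprehensible(candidate):
            return candidate, used
        if used < max_paraphrases:
            text = paraphrase(text)
    # Nothing worked: the segment would be escalated to a bilingual translator.
    return None, max_paraphrases


# Toy stand-ins (invented strings, not real MT output).
FAKE_MT = {
    "complex source sentence": "garbled output",
    "simpler source sentence": "clear output",
}
result = translate_with_paraphrases(
    "complex source sentence",
    translate=FAKE_MT.__getitem__,
    is_comprehensible=lambda text: text == "clear output",
    paraphrase=lambda text: "simpler source sentence",
)
print(result)  # ('clear output', 1)
```

Returning the number of paraphrases alongside the translation makes it easy to collect the kind of statistics reported in the effectiveness tests.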

4. Pilot implementation

During the research, a pilot implementation of the ITM platform was developed. The main aim of the platform is to provide basic functionality and programming interfaces that enable its usage in a number of services designed for different translation purposes. To fulfill this task, the ITM platform builds on a number of machine translation engines (including Google Translate [5] and Babel Fish [6]) and on data storage and access facilities (including a database and a content repository).

The overall system architecture is presented in Fig. 4. It is implemented in Java as a 3-tier application. The bottom, data persistence layer currently supports two storage options: Hibernate [7] with a MySQL [8] database, and the Apache Jackrabbit [9] implementation of the Java Content Repository [10], which was chosen for scalability reasons. The middle layer gathers the platform’s business logic and exposes it as a package of services and also as SOAP [11] web services for remote usage. Built upon the Spring Framework [12], the implementation of ITM is easily configurable and open to further development. The platform itself provides a common WebService [13] interface which enables other applications to provide ITM-driven translation.

Figure 4: ITM - General Platform Architecture
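Because the platform sits on top of several MT engines, the Alternative Translation action of Section 3 can be supported by a thin engine abstraction. The Python sketch below illustrates the idea only; it is not the platform's Java code. The engine callables are stubs, and the similarity measure (the `ratio` of `difflib.SequenceMatcher`) and the 0.6 threshold are assumptions chosen for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Callable, Dict

# Hypothetical engine interface: source text in, translation out.
Engine = Callable[[str], str]


def alternative_translations(segment: str, engines: Dict[str, Engine]) -> Dict[str, str]:
    """Translate one segment with every configured engine."""
    return {name: engine(segment) for name, engine in engines.items()}


def engines_disagree(translations: Dict[str, str], threshold: float = 0.6) -> bool:
    """Flag a segment when any two outputs are less similar than the threshold."""
    for first, second in combinations(translations.values(), 2):
        similarity = SequenceMatcher(None, first.lower(), second.lower()).ratio()
        if similarity < threshold:
            return True
    return False


# Stub engines standing in for real MT services.
engines = {
    "engine_a": lambda text: "I could only think of a sleeve.",
    "engine_b": lambda text: "Sleep was all that occupied my mind.",
}
outputs = alternative_translations("(source segment)", engines)
print(outputs, engines_disagree(outputs))
```

A flagged segment would be a candidate for a paraphrase request even if each individual translation reads fluently, which is the situation the DUBIOUS and FALSE categories describe.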

The core system functionality can be extended with a number of plug-ins that enable ITM usage in different contexts and technologies. Our research efforts focused mostly on:

• Translation of a text – with the main aim of producing a coherent, accurate translation that is understandable to a target language user and, as far as possible, free of grammar and style mistakes.
• Real-time chat – aimed at providing a means of communication for monolingual users. The messages exchanged during the chat should be understandable to all participants. There is no need to enhance sentences which contain even serious grammar or style errors but are comprehensible. In this case the collaborative translation service should provide means of indicating comprehension problems and the resources needed to clarify the expressed thoughts.

There are other possible areas of practical application of the ITM technology, including social network services and the exchange of internal documents in international enterprises; they will be researched in the near future. The currently implemented extensions are an online web application, which provides users with the tools they need to translate a text in cooperation with others, and a solution integrated with Google Wave technology, which allows monolinguists to communicate instantly across languages. They are described in detail in the following subsections.

4.1. Online Web Application

The Online Web Application (ITM-OWA) is the ITM platform extension that provides translation to web users. Its main functionality covers the upload of a text for translation (.txt and .doc files as well as manual text input) and the download of the translated results. ITM-OWA provides essential utilities for target and source language monolinguist editors to organize their work on the text: a target editor view and a source editor view, as presented in Fig. 5. An uploaded text is split into segments using a set of rules in the SRX (Segmentation Rules Exchange) standard [14]. The segmentation is especially important because all work in the system is performed on text fragments. In the next step the separated segments are initially translated using the Google Translate [5] engine and then made available to both editors via the mentioned views.

Figure 5: ITM-OWA - source and target monolinguist editor panels
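The segmentation step can be illustrated with a few rules in the spirit of SRX, where each rule pairs a pattern before a candidate break with a pattern after it, is marked as break or no-break, and the first matching rule decides. The Python sketch below is a simplified stand-in, not an SRX parser and not the rule set used by ITM-OWA; the two rules and the abbreviation list are assumptions.

```python
import re
from typing import List

# (is_break, pattern before the position, pattern after the position).
# Rules are tried in order and the first full match decides.
RULES = [
    (False, r"\b(?:e\.g|i\.e|Fig|Tab)\.", r"\s"),  # no break after abbreviations
    (True, r"[.?!]", r"\s+"),                      # break after sentence punctuation
]
COMPILED = [
    (is_break, re.compile(r"(?:%s)\Z" % before), re.compile(after))
    for is_break, before, after in RULES
]


def split_segments(text: str) -> List[str]:
    """Split text into segments at positions accepted by a break rule."""
    segments, start = [], 0
    for pos in range(1, len(text)):
        for is_break, before, after in COMPILED:
            if before.search(text[:pos]) and after.match(text, pos):
                if is_break:
                    segments.append(text[start:pos].strip())
                    start = pos
                break  # first matching rule wins
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments


print(split_segments("See Fig. 2 for details. It works! Does it?"))
# ['See Fig. 2 for details.', 'It works!', 'Does it?']
```

Placing the no-break rule first is what keeps "Fig. 2" inside one segment; with the order reversed, the generic punctuation rule would split after the abbreviation.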

The target editor view is used to enhance the translated text and to perform the other actions described in Section 3. It is used to review translation segments and to classify them into the main groups depending on which action is necessary. The view displays a list of text segments in the target language, marked in different colors depending on their current status (accepted, sent for paraphrase, etc.). The user may change this status using the appropriate option from the menu.

The source editor view is a perspective dedicated to source language monolinguists. Their only concern is to provide paraphrases of the segments for which help has been requested and to verify reverse-translated segments by comparing them to the original ones. Users of the source editor view may use some additional means to help the target language user to better understand the meaning of a segment: although paraphrasing is successful in most cases, additional metadata such as comments and images can be provided when it fails.

4.2. Google Wave extension

Figure 6: Short conversation using ITM-GWE


The ITM Google Wave extension (ITM-GWE) allows users without a common language to chat in real time via Google Wave [15] and is implemented using a mix of Wave Robot [16] and Google Gadget [17] technologies. Each user may choose their native language and use it in the communication, while the provided robot instantly translates all messages into the chosen language. Whenever any part of a translation is incomprehensible, it can be highlighted by the interlocutors. The robot then finds the original phrase and highlights it in the author’s window. The author of the sentence is thus made aware of the translation problem and should rewrite the problematic expression using other words (perhaps with a simpler syntax), so that the other users benefit from a translation which is provided immediately after the paraphrasing is finished. Fig. 6 shows such a conversation.

During the pilot implementation visual comments (downloaded and drawn images) were considered as an aid, but they proved ineffective and were rarely used. The main reason is that images are most suitable for describing single nouns. Machine translation engines rarely have problems with single words; it is in fact the general meaning of a sentence which can be misunderstood. Although visual aids can be helpful to the users in some special cases, they are not indispensable, and the context of a phrase is usually more useful than a picture, as it usually indicates the preferred meaning.

With our web application available, a set of tests was performed to determine whether the ITM process can be used for real-life translation projects.

5. Effectiveness Tests

The main goals of the effectiveness tests were to verify the quality of an ITM translation in comparison to the traditional process and to estimate the time and effort needed to perform the translation. Texts of three different types were translated in three language combinations - Tab. 1. Professional translations of these texts served as the references in the effectiveness evaluation. The ITM translation process was performed by three native speakers of English, Polish and French, who could communicate only in their own mother tongue, making use of the ITM system. The translators as well as the judges (high school teachers, engineers and an accountant) had no experience in the translation business and were not familiar with the texts before the experiment.

Table 1: Tested texts characteristics

Property \ Test       I               II              III             IV              V               VI
Language combination  English-Polish  English-Polish  English-Polish  Polish-English  Polish-English  French-English
Type of the text      Adventure book  Technical text  Article         Technical text  Article         Article
Number of segments    33              45              28              20              28              10
Number of words       633             804             610             321             533             192

To measure effectiveness, the metrics proposed by Callison-Burch et al. in [18] were used. The quality of the translations was evaluated using different metrics:

• Automated evaluation - BLEU
• Human evaluation - assigning scores based on 5-point adequacy and fluency scales
• Human evaluation - ranking translated sentences relative to each other (ties allowed).

BLEU [19] is the standard in machine translation quality assessment. It calculates n-gram precision and a brevity penalty, and can also use multiple reference translations to determine allowable variations in the translation (for example ’We will discuss...’ instead of ’We will talk about...’); however, in most cases only one version of the reference text was used for the experiment. In human evaluation the five-point scale for adequacy indicates how much of the meaning expressed in the reference translation is also expressed in the hypothesis translation. The second five-point scale indicates how fluent the translation is - Tab. 2.

The results achieved - Tab. 3 - showed a considerable improvement in terms of BLEU (56% to 118%). In the human assessment by bilinguals - Tab. 4 - the scores showed that the produced text appears correct and accurate, and that most of the meaning was preserved. As expected, the most significant improvement was noted in fluency (up to 75% in comparison to ’bare’ MT), and the average score can be described as good.
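The two ingredients of BLEU named above, clipped n-gram precision and the brevity penalty, fit in a few lines. The Python sketch below is a simplified single-reference, sentence-level variant without smoothing, with whitespace tokenisation; it is meant to show the mechanics and is not the scoring script used in the experiment. The function names and the `max_n` parameter are assumptions.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(hypothesis: str, reference: str, max_n: int = 4) -> float:
    """Unsmoothed BLEU against a single reference, whitespace-tokenised."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        total = sum(hyp_counts.values())
        # Clip each n-gram count by its count in the reference.
        clipped = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # the geometric mean collapses without smoothing
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: 1 for long hypotheses, exp(1 - r/c) for short ones.
    penalty = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / len(hyp))
    return penalty * math.exp(sum(log_precisions) / max_n)


mt = "This solution dedicate to companies who reduce capital expenditure."
ref = "This solution is dedicated to companies that reduce capital expenditure."
print(bleu(mt, ref, max_n=2), bleu(mt, ref, max_n=4))
```

For the CORRECTABLE example from Section 2 the unsmoothed score with 4-grams is zero, because no 4-gram of the MT output occurs in the reference. This is why BLEU is normally computed over a whole test set rather than over a single sentence.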

Table 2: Translation fluency and adequacy scales

Adequacy    Fluency
5 = All     5 = Flawless
4 = Most    4 = Good
3 = Much    3 = Non-native
2 = Little  2 = Disfluent
1 = None    1 = Incomprehensible

Table 3: BLEU results

Results \ Test          I      II     III    IV     V      VI
BLEU (5-grams) for MT   0.152  0.207  0.101  0.192  0.172  0.243
BLEU (5-grams) for ITM  0.301  0.356  0.221  0.341  0.316  0.381
Improvement in BLEU     98%    71%    118%   77%    83%    56%
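The improvement row of Table 3 is the relative change of the BLEU score, (ITM - MT) / MT. The short Python check below recomputes it from the table values. The scores are stored in thousandths so that the arithmetic is exact; truncating to a whole percent is an assumption, chosen because it happens to reproduce the published figures.

```python
# BLEU scores from Table 3, in thousandths (0.152 -> 152) to keep the arithmetic exact.
MT = {"I": 152, "II": 207, "III": 101, "IV": 192, "V": 172, "VI": 243}
ITM = {"I": 301, "II": 356, "III": 221, "IV": 341, "V": 316, "VI": 381}


def improvement_percent(before: int, after: int) -> int:
    """Relative improvement in percent, truncated to a whole number."""
    return (after - before) * 100 // before


for test in MT:
    print(test, f"{improvement_percent(MT[test], ITM[test])}%")
```

For test I this gives (301 - 152) * 100 // 152 = 98, and the remaining tests yield 71, 118, 77, 83 and 56, matching the table.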

Table 4: Adequacy and fluency results (each cell: adequacy / fluency)

Results \ Test        I          II         III        IV         V          VI
rating (1-5) for MT   2.2 / 2.0  3.1 / 2.1  2.9 / 2.0  3.2 / 2.9  3.0 / 1.9  3.3 / 3.1
rating (1-5) for ITM  3.0 / 3.1  4.2 / 4.2  3.9 / 3.5  4.3 / 4.0  3.8 / 3.2  4.1 / 4.1
improvement           36% / 55%  28% / 52%  34% / 75%  43% / 38%  26% / 68%  24% / 32%

Because fluency and adequacy ratings are not standardized and depend only on the judges’ impressions, [18] suggests a separate evaluation in which people are asked to rank translations in relation to one another. The instructions for this task are: “Rank each whole sentence translation from best to worst relative to the other choices (ties are allowed)”. Judges were asked to rank the initial machine translation and its version post-edited in the ITM process in relation to the professional translation. The point of interest was how good a quality can be obtained in relation to the professional one. The results show the percentage of segments which were ranked as being of equally good quality as the reference translation.
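The reported percentage can be derived from the judges' rankings as the share of segments in which a system output is placed at least as high as the reference. The Python sketch below shows the computation on invented toy rankings; the data layout (one dictionary of ranks per segment, rank 1 being best), the function name and the "no worse than the reference" reading of a tie are assumptions.

```python
from typing import Dict, List


def share_as_good_as_reference(rankings: List[Dict[str, int]], system: str) -> float:
    """Percentage of segments where `system` is ranked no worse than the reference."""
    if not rankings:
        raise ValueError("no rankings given")
    hits = sum(1 for ranks in rankings if ranks[system] <= ranks["reference"])
    return 100.0 * hits / len(rankings)


# Invented judgements for four segments (rank 1 = best, ties allowed).
TOY_RANKINGS = [
    {"reference": 1, "itm": 1, "mt": 2},
    {"reference": 1, "itm": 2, "mt": 3},
    {"reference": 1, "itm": 1, "mt": 1},
    {"reference": 1, "itm": 2, "mt": 2},
]
print(share_as_good_as_reference(TOY_RANKINGS, "itm"))  # 50.0
print(share_as_good_as_reference(TOY_RANKINGS, "mt"))   # 25.0
```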

Table 5: Judges rank

Results \ Test                                          I    II   III  IV   V    VI
MT seg. perceived as professional translation quality   6%   5%   4%   15%  7%   20%
ITM seg. perceived as professional translation quality  26%  44%  36%  40%  32%  53%

During the relative assessment - Tab. 5 - only 5-20% of MT output segments were perceived to be as correct and fluent as the professional translation (mostly very short sentences), while 26% up to 44% of ITM-produced segments were considered to be of as good quality as the reference professional translation. This was especially noticeable for technical, straightforward text. The worst result in this test was achieved for the adventure book, where judges expected the best quality.

Table 6 presents a classification of the segments translated by the MT engine, made by the target language editor. The number of segments classified into the categories (VALID, CORRECTABLE, INCOMPREHENSIBLE, DUBIOUS) described in Section 2 varies considerably between different types of text. For all texts, less than half of the segments were accepted by the target language editor immediately or after post-edition. For texts with long, complex and difficult sentences (e.g. Text I) the MT engine returned a correct or nearly correct translation for only 28% of them. The other segments were unintelligible, so paraphrases were requested for them, or verification by the source language editor was required. The effectiveness of the MT engine is not a subject of this research, but the conclusion is that ITM must be prepared to handle any type of text and make no assumptions about how many paraphrases or verifications will be needed.

FALSE segments (those contorting the general meaning) were counted by bilingual judges after the translation had been completed. It is notable that only a few segments were categorized as FALSE: most of the false facts introduced by MT were eliminated thanks to the available mechanisms.

Table 6: Segment classification

Category \ Test   I    II   III  IV   V    VI
VALID             6%   5%   4%   20%  7%   20%
CORRECTABLE       22%  40%  28%  25%  43%  50%
INCOMPREHENSIBLE  36%  20%  46%  15%  32%  20%
DUBIOUS           36%  35%  22%  40%  18%  10%
FALSE             3%   0%   3%   0%   7%   0%

Observations have shown that technical texts, written in simple sentences and using straightforward language, required the smallest number of paraphrases, whereas most problems with understanding MT output appeared in the case of the adventure novel, written in vivid language. In the latter case some segments (4%) had to be paraphrased twice or more. The most frequent paraphrases consisted of splitting a complex segment into several simpler phrases, so that the target editor could understand the meaning of the whole sentence by analysing it part by part.

Tests have shown that verification of edited segments whose meaning is uncertain to the target editor is a very useful tool. It is easy and fast in comparison to paraphrasing, which takes up more of the source editor’s time. During the tests the verified segments were mostly correct and only 10% of them were rejected. The problem with the verification tool is that correctly translated segments are sometimes rejected because the back-translation fails. Even though verification is not completely reliable, our testers found it useful, as it reduced the number of sentences which needed paraphrasing.

Table 7: Translation time comparison

Category \ Test                                        I    II   III  IV   V    VI
Working time of the target language editor             50   90   45   25   50   12
Working time of the source editor                      18   25   20   5    15   3
Working time of the professional translation           200  260  150  110  165  60
Working time of the professional translator with CAT*  170  210  130  85   145  50
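The time comparison in Table 7 can be made explicit by adding the working times of both ITM editors and relating the sum to the professional translation time. The Python sketch below does this with the table values; the table does not state a time unit, so none is assumed, and the variable names are illustrative.

```python
TESTS = ["I", "II", "III", "IV", "V", "VI"]
TARGET_EDITOR = [50, 90, 45, 25, 50, 12]
SOURCE_EDITOR = [18, 25, 20, 5, 15, 3]
PROFESSIONAL = [200, 260, 150, 110, 165, 60]
PROFESSIONAL_CAT = [170, 210, 130, 85, 145, 50]

# Combined working time of both monolingual editors per test.
itm_total = [t + s for t, s in zip(TARGET_EDITOR, SOURCE_EDITOR)]

# Fraction of the professional working time that ITM needed.
ratio_professional = [itm / pro for itm, pro in zip(itm_total, PROFESSIONAL)]
ratio_cat = [itm / cat for itm, cat in zip(itm_total, PROFESSIONAL_CAT)]

for name, total, rp, rc in zip(TESTS, itm_total, ratio_professional, ratio_cat):
    print(f"{name}: ITM total {total}, {rp:.0%} of professional, {rc:.0%} of CAT")
```

With these values the combined ITM time is 68, 115, 65, 30, 65 and 15 for tests I to VI, which is below half of the unaided professional time in every test.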

The results achieved by ITM clearly show that the average quality of the translations is good (4.1 on the 1-5 scale) or at least acceptable (3.1 - non-native) in the worst test case. The main advantage of using ITM is of course time (and money). Table 7 gathers time statistics for three translation methods: ITM, traditional translation, and translation with a modern CAT tool (XTRF-TM). The overall working time of both persons involved in the ITM process is at least half shorter than the time needed by a professional bilingual translator (even one working with CAT tools).

* Translation time with CAT is shorter mostly not because of segment repetitions, but because of the unique ability of the XTRF-TM [21] tool to operate concurrently and perform proofreading just after a segment has been translated.

6. Benefits for MT Engines

Thanks to the data obtained during work on translations, the ITM system may improve the effectiveness of MT engines. For example, the web version of the Google Translate service [5] provides an interface for contributing a better translation of a given segment. Each human-revised and accepted phrase in the ITM system can be stored together with its original form and its accepted translation. The collected translations may be submitted via the mentioned interface or made available to the MT engine as a web service. In this way the most significant improvement can be achieved in style and reader friendliness. A human proofreader can express the meaning in more accurate words or remove ambiguity introduced by the MT engine in a way that no algorithm can achieve.

Not only does the ITM system collect improved translations, but it can also deliver a set of paraphrases, which are supplied by users of the source language whenever the target language user does not understand a machine-translated sentence. As it has been shown that a paraphrase corpus can be crucial for training MT engines [20], an engine which could use a data set gathered during the ITM process would increase its efficiency. There are existing methods for extracting paraphrases automatically from texts [22], but what the ITM technology offers is a way to provide human-created, fluent paraphrases for specific sentences which are currently translated incorrectly.


7. Conclusions

Iterative Translation by Monolinguists (ITM) is a very promising approach to translation across languages. The created platform not only proves that the concept can be implemented, but has also been tested in a real runtime environment. The results of the tests confirm ITM’s usability in terms of both quality and efficiency. The measured translation quality is acceptable, the parties involved in the translation process are required to know only one language, and the translation provisioning time is radically lower. The concept is likely to bring a significant reduction of the multilingual information flow barrier and to change the current translation market. The most important promises of ITM are:

• an increase in the number of participants in multilingual communication, as well as in the number of people who may be engaged in the translation process: a shift from the number of persons with a good knowledge of at least two languages to all Internet users,
• a chance to expand the translation market into areas that until now have been treated as unprofitable due to the cost and time of translation,
• a significant improvement in the effectiveness of machine translation engines, by providing feedback about the translation quality of segments, paraphrases and correct translations.

In conclusion, ITM can be seen as a transitory solution during the process which aims to produce an infallible, fully automated translation engine.

References

[1] F. Casacuberta, J. Civera, E. Cubel, A. L. Lagarda, G. Lapalme, E. Macklovitch, E. Vidal, Human interaction for high-quality machine translation, Communications of the ACM, 2009.
[2] S. Barrachina, F. Casacuberta, E. Cubel, A. Lagarda, J. Tomas, J.-M. Vilar, O. Bender, J. Civera, S. Khadivi, H. Ney, E. Vidal, Statistical Approaches to Computer-Assisted Translation, Computational Linguistics, 2007.
[3] S. Barrachina, F. Casacuberta, E. Cubel, A. Lagarda, J. Tomas, J.-M. Vilar, O. Bender, J. Civera, S. Khadivi, H. Ney, E. Vidal, Correcting Automatic Translations through Collaborations between MT and Monolingual Target-Language Users, Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, 2009.
[4] M. Milkowski, J. Lipski, Using SRX standard for sentence segmentation in LanguageTool, 2009.
[5] Google Translate Homepage, http://translate.google.com/, 2010.
[6] BabelFish Homepage, http://babelfish.yahoo.com/, 2010.
[7] Hibernate Website, http://www.hibernate.org/, 2009.
[8] MySQL Website, http://www.mysql.com/, 2009.
[9] Apache Jackrabbit Website, http://jackrabbit.apache.org/, 2009.
[10] Content Repository for Java Technology API Specification, http://www.day.com/specs/jcr/1.0/, 2009.
[11] Simple Object Access Protocol W3C Recommendation, http://www.w3.org/TR/soap/, 2009.
[12] SpringSource Community Website, http://www.springsource.org/, 2010.
[13] Apache CXF Framework Official Website, http://cxf.apache.org/, 2009.
[14] Segmentation Rules Exchange Specification, http://www.lisa.org/fileadmin/standards/srx20.html, 2009.
[15] Google Wave, http://wave.google.com/, 2009.
[16] Google Wave Robot - Developer Guide, http://code.google.com/intl/pl/apis/wave/extensions/robots/, 2009.
[17] Google Wave Gadget - Developer Guide, http://code.google.com/intl/pl/apis/wave/extensions/gadgets/guide.html, 2009.
[18] C. Callison-Burch, C. Fordyce, P. Koehn, C. Monz, J. Schroeder, Evaluation of Machine Translation, Proceedings of the Second Workshop on Statistical Machine Translation, 2007.
[19] K. Papineni, S. Roukos, T. Ward, W.-J. Zhu, Bleu: a Method for Automatic Evaluation of Machine Translation, Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), 2002.
[20] C. Callison-Burch, P. Koehn, M. Osborne, Improved Statistical Machine Translation Using Paraphrases, Proceedings of the Human Language Technology Conference of the North American Chapter of the ACL, 2006.
[21] XTRF-TM Translation Management System, http://www.xtrf.eu/, 2010.
[22] Y. Shinyama, S. Sekine, K. Sudo, Automatic Paraphrase Acquisition from News Articles, Proceedings of Human Language Technology Conference, 2002.
