Translation in Digital Space: Machine Translation, CAT and Localization
Adam Bednarek and Joanna Dróżdż University of Łódź
Abstract: Contemporary solutions for translation have far exceeded technological requirements in recent years. This has caused a shift in both professional enterprise and scientific investigation. Global communication systems increase translation output and ensure quality, while hypertext documents have become far more frequent. Recent technological developments have thus found their way into translation services. An array of translation tools has come into popular use over the last decade, while translation has entered the realm of project management. The paper focuses on the use of computer tools aimed at facilitating the translation process and presents an overview of the challenges that lie ahead in digital space.
Keywords: corpus, CAT, Translation Memory, Terminology Management, Machine Translation
1. Introduction
The term 'translation' is somewhat ambiguous, since its scope of meaning encompasses different notions, though they are inevitably correlated with one another. The process aims to recreate the content of the source language message by means of the target language in such a way that the semantic subject matter of the former is represented as accurately as possible by the latter. Consequently, as stated by Savory (1968: 50), the target text is supposed to be a reproduction of the words, ideas and style contained in the source text. However, since the selection of lexis and grammar, along with values and the perception of meaning, is unique to each person, the idiosyncrasies incorporated in the translated text result in a personal message being attached to it (Newmark, 1981: 8). Yet the purpose of this chapter is not to focus on the theoretical background of translation, but rather on current developments in the digital realm.
Contemporary solutions for translation have far exceeded technological requirements in recent years. This has caused a shift in both professional enterprise and scientific investigation. Files received from customers may come in different formats. Clients currently provide entire web content as well as collections of text strings extracted from software source code. Global communication systems increase translation output and ensure quality, while hypertext documents have become far more frequent. Recent technological developments have thus found their way into translation services. An array of translation tools has come into popular use over the last decade, while translation has entered the realm of project management. Hence the term localization, which can be defined as the second phase of translation project work, accounting for socio-cultural, linguistic and technical distinctions within the relevant markets. The following chapter focuses on the use of computer tools aimed at facilitating the translation process and presents an overview of the challenges that lie ahead in digital space.
2. From fiction to fact: machine translation
In order to understand the complexity of machine translation (MT), one must first refer to the concept of corpus linguistics. The principle behind this area involves collecting authentic texts of practical application as linguistic data, which become the subject of further study. With regard to the number of languages within a corpus, it is possible to distinguish monolingual corpora, consisting of texts written in one language exclusively, bilingual corpora, composed of texts in two languages, and multilingual corpora, containing texts in more than two languages. The use of corpora, specifically those which include parallel texts, i.e. printed texts written originally in the target language with the same communicative purpose as the source text, also contributes greatly to the work of translators, since they provide access to terminology and phraseology as well as to the style and format appropriate in a given contextual framework in the target language (Bowker 2002: 43). In line with contemporary technological development, printed corpora have gradually been replaced with electronic versions, which are unrestricted in size and are therefore able to encompass an immense range of records. Since nowadays the term 'corpus' unequivocally denotes an electronic corpus, it will henceforth be used in this sense.
The customary names for a bilingual compilation of texts are 'bitext' and 'parallel corpus', the latter of which can also denote a multilingual group of texts. Both terms are relevant only when the source language texts and their translated versions have undergone alignment, i.e. particular segments of the original text have been linked with their matching parts in the target language texts. It is essential to emphasize that parallel corpora must not be mistaken for printed parallel texts, which merely belong to the same text type and deal with the same topic as the source texts but are not translations of them. Additionally, among corpora intended specifically to facilitate translation there are bilingual comparable corpora, which share the fundamental idea of printed parallel texts, namely they consist of texts with a common subject matter and purpose. However, the records incorporated in bilingual comparable corpora must not be translations; the texts in each language must have been originally written in that language.
Corpora can play an important role not only in assisting translators in their work, but also in theorising about the character of translated text. An example of such text collections are monolingual comparable corpora, composed of records in a single language only but divided into a group of texts translated into this language and a group of texts originally written in it. The use of this type of corpus in translation studies does not aim at assessing the quality of translation; its ultimate purpose is to investigate translation as a process (Baker 1996: 175) and the inherent attributes of its result. In the domain of machine translation, the concept of corpora led to the emergence of corpus-based (as opposed to rule-based) machine translation systems, which draw on large corpora as sources of examples.
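The notion of alignment described above can be illustrated with a minimal sketch. It assumes the simplest possible case, in which the source and target texts have already been split into the same number of segments, so that they can be paired by position; real alignment tools must also handle merged, split or omitted sentences. The names AlignedSegment and align_by_position, as well as the sample sentences, are invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedSegment:
    """One unit of a bitext: a source segment linked to its translation."""
    source: str
    target: str


def align_by_position(source_segments, target_segments):
    """Pair segments one-to-one by position (a simplifying assumption)."""
    if len(source_segments) != len(target_segments):
        raise ValueError("segment counts differ; positional alignment does not apply")
    return [AlignedSegment(s, t) for s, t in zip(source_segments, target_segments)]


# Hypothetical sample data: an English text and its Polish translation.
english = ["Open the file.", "Save your changes."]
polish = ["Otwórz plik.", "Zapisz zmiany."]

bitext = align_by_position(english, polish)
for unit in bitext:
    print(unit.source, "<->", unit.target)
```

A bilingual comparable corpus, by contrast, could not be represented in this way, since its texts are not translations of one another and there is nothing to link segment by segment.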
Briefly, translation is achieved by recognizing the elements that overlap between the source language (SL) text entered into the system and the contents of the corpus, and then generating the target language (TL) text on the basis of the aligned units in the corpus, with both steps performed automatically. However, since corpus-based machine translation is further divided into statistical machine translation and example-based machine translation, the procedures are more complex and vary with each type. As regards computer-assisted translation, it was the idea of parallel corpora that prompted the development of the translation memory, which enables translators to reuse their previous translations while working on a new project. Furthermore, all tools for retrieving data from corpora are at the same time computer-assisted translation tools, since they help translators with such tasks as selecting the more common target language terms within a particular subject field or establishing accurate equivalents in their target language environment.
The fundamental assumption of machine translation is to attain a final translation product of the highest possible quality, with the computer responsible for the whole process. In practice, this turns out to be so complex a task that its accomplishment is highly demanding, if not impossible; hence various approaches to MT have emerged since its beginnings, as researchers have sought ever more satisfactory results. Consequently, depending on factors such as the character of the SL text, the languages involved, the purpose of the TL text or the time available for the translation, an MT system may complete the translation alone (which would obviously be preferable), or a human may contribute at different stages of the process in order to enhance the output quality. However, if a human is involved, his or her function must remain supplementary, since in MT it is the system that is expected to perform the main task.
According to the principle behind rule-based machine translation (RBMT), translation requires analysing the SL text in order to derive its semantic or semantico-syntactic representation, which is indispensable for the synthesis of the TL text. RBMT systems typically operate on labelled tree representations that are transformed from one into another, e.g. a morphological tree is converted into a syntactic one, which is subsequently converted into a semantic tree, and so forth (Hutchins, 1995: 3).
In the course of the gradual evolution of MT, an innovative approach, namely corpus-based machine translation, emerged in the late 1980s owing to progress within the field of corpus linguistics, especially the easier availability of text corpora from large databanks, as well as the re-discovery of Makoto Nagao's proposal from 1981 and the revival of the earliest statistical methods. The two major corpus-based methods are example-based machine translation and statistical machine translation, neither of which involves the application of any linguistic information (Hutchins, 1995: 6). The foundation of statistical machine translation (SMT) lies in a translation model composed of SL-TL frequencies and a language model incorporating word sequences in the TL. Both are derived from a previously aligned bilingual corpus at the level of phrases, word groups and, eventually, single words, although the same corpus does not necessarily have to be used for each of the two models.
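How the two models interact can be shown with a toy sketch. It assumes the common formulation in which a candidate translation is scored by multiplying a translation-model probability by a language-model probability (here added in log space); the probabilities below are invented for illustration and are not estimated from any corpus. The function names score and best_translation are likewise hypothetical.

```python
import math

# Invented toy values: P(source word | target word), as a translation model might store them.
translation_model = {
    ("dom", "house"): 0.7,
    ("dom", "home"): 0.3,
}

# Invented toy values: P(target word | previous target word), a bigram language model.
language_model = {
    ("the", "house"): 0.02,
    ("the", "home"): 0.001,
}

UNSEEN = 1e-9  # small floor so that unseen pairs do not produce log(0)


def score(source_word, candidate, previous_word):
    """Log-probability of a candidate: translation model plus language model."""
    tm = translation_model.get((source_word, candidate), UNSEEN)
    lm = language_model.get((previous_word, candidate), UNSEEN)
    return math.log(tm) + math.log(lm)


def best_translation(source_word, candidates, previous_word):
    """Pick the candidate with the highest combined score."""
    return max(candidates, key=lambda c: score(source_word, c, previous_word))


print(best_translation("dom", ["house", "home"], previous_word="the"))
```

In this toy setting both models happen to favour the same candidate; the design becomes more interesting when they disagree, since a fluent TL word sequence can then outweigh a slightly more frequent SL-TL pairing.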
If the whole process of translation is performed with no human participation whatsoever, it is referred to as fully automatic machine translation (FAMT), although its output quality may frequently be questionable. The application of FAMT is therefore adequate when the purpose of the TL text does not require a flawless translation, i.e. it allows for certain inconsistencies in comparison with the SL input or for stylistic shortcomings. In fact, such a rapidly produced rough translation may also be very helpful, e.g. for a scientist who needs immediate access to the content of a text within their domain written in a distant foreign language for which traditional human translation is not easily available, since he or she can employ their knowledge of the subject field to retrieve the intended meaning from the imperfect TL output (Somers 1998: 137–138). Although the idea of accomplishing fully automatic high quality machine translation (FAHQMT) is essentially unfeasible on account of the inability of computers to reason in the same way humans do (Bar-Hillel 1960: 41–42), a considerable reduction of mistakes and inaccuracies in FAMT output is nonetheless possible. One means to this end is pre-processing of the SL input, for instance tokenization, which is especially useful when translating between languages with different writing systems, where one-to-one correspondence between SL and TL tokens may not occur. In contrast to pre-processing, which influences the TL text indirectly, post-processing involves introducing modifications directly into the final output (Bogucki 2009: 89). Accordingly, the main task of post-processing is to clean up the rough translation by means of error correction or revision of particular fragments, as well as detokenization and the restoration of appropriate letter case.
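The pre- and post-processing steps mentioned above can be sketched as follows. This is a deliberately simple illustration that assumes whitespace-delimited text and a small set of punctuation marks; the function names are invented, and production MT pipelines use considerably more elaborate, language-specific rules.

```python
import re


def tokenize(text):
    """Pre-processing: split text into word tokens and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def detokenize(tokens):
    """Post-processing: join tokens and re-attach punctuation to the preceding word."""
    text = " ".join(tokens)
    return re.sub(r"\s+([.,;:!?])", r"\1", text)


def restore_sentence_case(text):
    """Post-processing: capitalize the first character of the sentence."""
    return text[:1].upper() + text[1:]


tokens = tokenize("hello, world!")
print(tokens)
print(restore_sentence_case(detokenize(tokens)))
```

The sketch only covers the mechanical part of post-processing; error correction and revision of fragments remain tasks for a human editor or for further dedicated components.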
Human involvement in shaping the MT output can take place not only before or after the MT proper; the system may also allow for user intervention during the process of translation, which is known as interactive machine translation (IMT).
3. Computer assistance
While machine translation is intended to perform the translation proper for a human translator and leaves their influence on the results relatively limited, in computer-assisted translation (CAT) it is the human translator who remains intellectually responsible throughout for carrying out the process of translation and for its final outcome, albeit with the support of various computerised tools that facilitate the work (Freigang 1998: 134). On the one hand, such technological developments as word processors, dictionaries on CD-ROM, online dictionaries, the Internet in general and spelling and grammar checkers may, in a broad sense, be considered CAT tools; on the other hand, they are basic tools that are applied for a vast range of purposes beyond the profession of translation (Bowker 2002: 8). Consequently, excluding these multi-purpose inventions, the term CAT tools denotes more specific tools and encompasses terminology management systems, translation memory systems, localization tools and corpus-analysis tools. These are designed particularly for translators, or for linguists, and have been adopted by translators because the mechanism of CAT in part overlaps with, and is in fact based on, the idea of corpus linguistics. Since CAT shares the intellectual function of the translator with traditional human translation and the involvement of computer systems with machine translation, it can be placed approximately midway between these two extremes.
3.1. Terminology management
Since translation inevitably depends on selecting appropriate TL equivalents for SL terms, which may appear in a text repeatedly, and since a given term (especially in technical translation) should be translated in the same way throughout the whole text, it is necessary to pay special attention to the use of terminology. In addition, when certain terms have only one formally established and accepted equivalent, translators must not invent their own. Assembling subject-specific terminology within a particular domain is therefore one of the fundamental factors in translation, as it increases the quality of the TL text by providing accurate equivalents. Accordingly, CAT tools offer a number of solutions to facilitate terminology management. A terminology bank (term bank) is a compilation of electronically stored term records which contain information, frequently detailed, about these terms. An entry may typically include such data as definitions, contexts, synonyms, foreign language equivalents or grammatical information (Bowker 2003: 50). Term banks first appeared in the 1960s as a consequence of the evolution of printed dictionaries, specifically within technical domains. Their predominant aims were to unify terminology, resulting in increased terminological consistency of translations performed by multiple translators, and to make translators' work more efficient. As proposed by Nkwenti-Azeh (1998: 249–50), term banks can be classified according to the form of data representation or the function performed. The former taxonomy distinguishes such categories as the number of languages (i.e., monolingual, bilingual or multilingual term banks), the number of subject fields (i.e., monodisciplinary or multidisciplinary term banks), thematic orientation (i.e., whether a term bank is oriented towards terms or concepts) and lexical orientation (i.e., whether a term bank encompasses both terms and words, exclusively terms, or terms, phrases and sentences combined). The latter divides term banks into those whose aim is to facilitate technical and scientific translation (e.g., such multilingual term banks as Eurodicautom or Termium) and those intended to document term and concept data (e.g., Normaterm). An inevitable consequence of storing terminological data is the need to access the assembled compilations; hence terminology management software provides a number of methods for retrieving the stored terms, the most rudimentary of which is locating exact matches. More advanced systems also offer wildcard searches, applied on the same basis as in the case of concordancers (see section 1.3.3.2.), and fuzzy matches, which return forms similar to, yet slightly different from, the search pattern, including morphological and spelling variants, spelling errors and multi-word terms whose elements appear in the wrong order (Bowker 2002: 79–81).
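The three retrieval methods just described (exact, wildcard and fuzzy matching) can be sketched with Python's standard library. The term bank below is a hypothetical three-entry example, and the 0.8 similarity cutoff is an arbitrary assumption; commercial terminology management systems use their own matching algorithms and record structures.

```python
import difflib
import fnmatch

# Hypothetical term records: each entry stores a Polish equivalent and a short definition.
term_bank = {
    "translation memory": {"pl": "pamięć tłumaczeniowa", "definition": "database of aligned segments"},
    "term bank": {"pl": "bank terminologiczny", "definition": "collection of term records"},
    "localization": {"pl": "lokalizacja", "definition": "adaptation of a product to a locale"},
}


def exact_match(term):
    """Return the record stored under exactly this term, or None."""
    return term_bank.get(term)


def wildcard_search(pattern):
    """Return all terms matching a shell-style pattern such as 't*'."""
    return sorted(t for t in term_bank if fnmatch.fnmatch(t, pattern))


def fuzzy_search(query, cutoff=0.8):
    """Return stored terms that are similar, but not necessarily identical, to the query."""
    return difflib.get_close_matches(query, list(term_bank), n=3, cutoff=cutoff)


print(exact_match("term bank"))
print(wildcard_search("t*"))
print(fuzzy_search("localisation"))  # a spelling variant of a stored term
```

The fuzzy search is what allows a spelling variant or a typing error to reach the stored record, at the cost of occasionally returning a term the translator did not intend.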
3.2. Translation memory
The most significant aspect of CAT is, arguably, the translation memory (TM), which is a form of database, or more precisely an aligned parallel corpus, since it is built of previous translations along with their corresponding SL texts, both divided into segments (typically corresponding to sentences), at the level of which the alignment is performed. The fundamental purpose of a TM is to provide access to previously translated texts and their SL counterparts while a new text is being translated, so that equivalent TL segments can be applied directly, or at least suggested as possible solutions, if certain fragments recur. Although the idea of TM originated in the 1970s and was first put into practice, in fact with MT systems, in the following decade, it was not until the mid-1990s that TM became widely available to assist human translation (Somers 2003: 31). The first approach to creating a TM is referred to as interactive translation, which involves aligning equivalent SL and TL segments and adding them one by one to an empty TM as the text is being translated. The next time the translator opens the TM system, he or she may choose whether to continue building the previous TM or to start another one. The former option is recommended when the new text covers a similar subject matter or is assigned by the same client, since a TM is typically more effective if it stores segments within a particular domain or from orders placed by a particular customer. Although extremely time-consuming, this method ensures a high quality TM; however, it does not allow for incorporating translations performed prior to the implementation of the TM system.
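The interactive approach to building a TM, together with the retrieval of exact and near matches, can be sketched as follows. The class name TranslationMemory, the 0.75 similarity threshold and the sample segments are assumptions made for this illustration; the similarity measure is the generic one from Python's difflib and is not the scoring used by any particular TM product.

```python
import difflib


class TranslationMemory:
    """A toy TM: stores SL segments with their TL equivalents as they are translated."""

    def __init__(self):
        self.units = {}  # source segment -> target segment

    def add(self, source, target):
        """Store one aligned pair, as done segment by segment in interactive translation."""
        self.units[source] = target

    def lookup(self, segment, threshold=0.75):
        """Return (stored source, stored target, similarity) for the best match, or None."""
        best_source, best_score = None, 0.0
        for source in self.units:
            similarity = difflib.SequenceMatcher(None, segment, source).ratio()
            if similarity > best_score:
                best_source, best_score = source, similarity
        if best_source is None or best_score < threshold:
            return None
        return best_source, self.units[best_source], best_score


tm = TranslationMemory()
tm.add("Click the Save button.", "Kliknij przycisk Zapisz.")

print(tm.lookup("Click the Save button."))  # exact match
print(tm.lookup("Click the Open button."))  # near match, offered as a suggestion
print(tm.lookup("xyz"))                     # nothing similar enough
```

An exact match can be inserted directly, whereas a near match is only a suggestion that the translator must adapt, which mirrors the division of responsibility between tool and translator described above.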
4. Localization – current trends
Localization has become a popular trend in current 'translation related markets'. One might say that its 'creation' was a natural result of the development of software engineering. The first localization companies were set up in the United States in order to increase sales and expand into different linguistic, and thus cultural, environments. Esselink (2000) mentions INK (now Lionbridge) and IDOC (now Bowne) as the first multi-language vendors. The popularity of localization reached a new level with the introduction of HTML and the ever-expanding World Wide Web. Although in a broad sense localization encompasses the adjustment of a product to make it relevant for the target receivers in terms of their language and culture, the term is frequently used with reference to a narrower concept limited to translating and adapting a software or Web product (Esselink 2002: 1, 3). The extent to which a given product needs to be localized is determined by the degree of similarity between the SL and TL cultures, including their technological development; thus, the more akin the two cultures are, the less demanding localization becomes (Bogucki 2009: 97). With a multitude of new tasks ahead, this process started to separate itself from mainstream translation. As with any growing field, localization still lacks a solid methodological background and support, so that one ventures into terra incognita. The process, as defined by the Localization Industry Standards Association (LISA), is perceived as the second phase, following internationalization, which is abbreviated as I18N. To quote:
Internationalization is the process of generalizing a product so that it can handle multiple languages and cultural conventions without the need for re-design. Internationalization takes place at the level of program design and document development (LISA).
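What 'generalizing a product' can mean in practice is sketched below: the program logic contains no language-specific text or date conventions, all of which are looked up in per-locale catalogues. The catalogue layout, the locale identifiers chosen and the function name render are assumptions made for this example, not a description of any specific internationalization framework.

```python
# Hypothetical per-locale catalogues: the code below never hard-codes wording or date order.
MESSAGES = {
    "en-US": {
        "greeting": "Welcome, {name}!",
        "date": "{month}/{day}/{year}",
    },
    "pl-PL": {
        "greeting": "Witaj, {name}!",
        "date": "{day:02d}.{month:02d}.{year}",
    },
}

DEFAULT_LOCALE = "en-US"


def render(locale, key, **values):
    """Fill the message stored under `key` for `locale`, falling back to the default locale."""
    catalogue = MESSAGES.get(locale, MESSAGES[DEFAULT_LOCALE])
    return catalogue[key].format(**values)


print(render("en-US", "greeting", name="Anna"))
print(render("pl-PL", "greeting", name="Anna"))
print(render("pl-PL", "date", day=5, month=3, year=2010))
```

With such a design, supporting a further market means adding one more catalogue rather than re-designing the program, which is the point of the LISA definition quoted above.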
In other words, as mentioned by Lommel & Ray (2007), it is 'enabling' a product at a technical level for localization. This requires, among other things, international software development within multicultural environments or, as one might say, finding out what other cultures want and how they function. One thus removes any culture-specific instances visible within the product, making it generally acceptable to all users. Note the use of the term 'product': from here on, we shall refer to the finished translation project as a product. As a result, the methodological approach changes instantly, especially with reference to translation quality assessment (TQA). Localization, or L10N, in turn means taking the product and making it linguistically, culturally and technically adequate for the target locale (Yunker 2002). A locale can be defined as 'a set of parameters' used to identify the user's language and preferences (Sandrini, 2008). Localization is often treated as a mere 'high-tech translation', but this view does not capture its importance, its complexity or what it encompasses. In general, localization also addresses the non-textual components of products or services. Translation thus becomes just a part of this process. Similarly, cultural, or rather cross-cultural, knowledge becomes a prerequisite. R. A. Hudson (1980) writes that “culture is something that everybody has” and involves some “property of a community, especially that which might distinguish it from other communities” (Hudson 1980: 73). Furthermore, it may be defined as “the kind of knowledge which we learn from other people, either by direct instruction or by watching their behavior” (Hudson, 1980: 81). The author claims that if “culture is knowledge”, then “it can exist only inside people’s heads” (Hudson 1980: 74). When outlining the basic assumptions of modern discourse, Deborah Schiffrin (1987) states that “language always occurs in context” and that “language is context sensitive” (Schiffrin 1987: 3).
She implies that one’s background knowledge of the world is a key factor in understanding linguistic elements and assumes that “language always occurs in some kind of context, including cognitive contexts, in which past experience and knowledge is stored and drawn upon, cultural contexts consisting of shared meanings and world views, and social contexts through which both self and others draw upon institutional and interactional orders to construct definitions of situation and action” (Schiffrin 1987: 4). In effect, culture shapes experience and affects our view of reality, thus becoming an important component of localization.
4.1. Website localization
When referring to localization, one cannot omit web marketing. By definition, this is the process of using digital space in order to increase product output and, in certain situations, to create demand among consumers. Naturally this requires all available channels; however, website value (and subsequent localization) plays an important role in the process. Website translation and localization is a particularly delicate and complex operation, involving diverse skills and requiring experience in the coordination of work phases. Van der Meer (2002) and Sandrini (2008) regard such a process as modifying the site in order to make it more “accessible, usable and culturally suitable”, while Yunker (2003) describes it as project-oriented, including linguistic elements and 'digital assets'. With this in mind, the localizer becomes an agent of internet marketing. Let us now consider the very basics of this process. Firstly, one must take into account the marketing function; secondly, the sociocultural aspect; and finally, linguistic and technical issues. Thus, first and foremost, the localizer needs to address an array of non-linguistic issues. While this may seem an obvious statement, many forget that one must above all consider the informative value of website content. This allows for further co-authoring of localized products. A website is in effect the symbol of the brand or company in question and should therefore set trends for prospective consumers. Note the following figure.
[Figure: WEBSITE, linked to three functions]
PRODUCT: Accurate presentation with valid information
ESTHETIC VALUE: Creating a good image of the company/product
CONSUMER: Consumer oriented approach
Fig 1. Website functions.
Bearing in mind that our primary role as localizers is delivering properly functioning products, the process will require work on the following:
the contents must be adapted to the linguistic and cultural system of the target language;
the communication tone must be suitable for technical standards and stylistic requirements in the target market;
the graphical components must also undergo any necessary transformations to meet the linguistic and cultural communication requirements.
Furthermore, page formatting must be taken into account and made compatible with the original graphical structure, and with the demands of search engine and directory positioning. This allows Richard Sikes (2009) to present three possible methodological approaches, which include:
Controlled authoring
The Global Pyramid Capstone (Globalize – Internationalize – Localize)
Optimization
In the case of controlled authoring, the localizer may predict difficulties and thus have considerable influence on the original product. Schewe (2001) establishes a close link between marketing strategy and localization with reference to language choice. Texts must be adapted both to domestic marketing strategies and to domestic communication standards. This allows the translator/localizer to obtain the intended pragmatic/marketing effect or pragmatic/semantic equivalence. The effect is thus based on cultural filtering and on the difference in expectation norms between recipients. In such a case, translation is treated as recontextualization; as mentioned by House (1981), it means creating the recipients' contextual conditions. This approach sees texts as bound to their original and new recipients, which is the basis for the equivalence relation. What is important is the text's communicative performance. The text, and in this case an entire website product, aims at functional pragmatic equivalence. Generally, the L2 product must be a semantic and pragmatic equivalent. General translation theory allows one to create a set of parameters to assess the quality of translation. With reference to Newmark (1988), Hatim and Mason (1990), House (1997) and Baker (1992), one can arrive at the following points:
Typology and Tenor
Formal Correspondence
Cohesion and Coherence
Dynamic equivalence
Lexical and grammatical quality
Nature of the text
Purpose of the text
Audience
House (1997) assumes that, unless influenced by pragmatic variance, only minimal violations of the above-mentioned parameters are acceptable in translation. Since no two languages are alike, and thus no two cultures are alike, the nature of the message, the purpose of the producer and the intended audience seem to be the variables best suited to the quality assessment of both general translation projects and websites. Further investigation within the area of assessment requires referring to Skopos, ‘a technical term for the aim or purpose of a translation’ (Vermeer 2000: 221). Skopos theorists believe that any action has a particular aim, or rather a purpose. Thus, translation is considered not as a process of transcoding but as a form of human action having its own purpose (Schäffner 1998b: 235; Hönig 1998: 9). The skopos of a translation, as explained by Vermeer (2000), is the goal, defined by the commission and adjusted by the translator. He defines commission as “the instruction, given by oneself or by someone else, to carry out a given action [which could be translation]” (Vermeer 2001: 229). Such considerations can likewise contribute greatly to the methodological background for website localization. Let us now focus on strategies and procedures as used by translators. Krings (1986) defines translation strategy as “translator’s potentially conscious plans for solving concrete translation problems in the framework of a concrete translation task” (Krings 1986: 18). Furthermore, Bell (1998) differentiates between global and local strategies and confirms that this distinction results from various kinds of translation problems (Bell 1998: 188). On the other hand, Nida (1964) sees procedures as performance actions, which include making judgments on syntactic and semantic approximations. These approaches have been analysed in depth in Newmark (1988).
The remaining open question is how far these are applicable to in-depth text intervention during the process of localization, since the procedure itself requires applying functional constraints. Technical requirements often prove to be of great difficulty to average translators, which naturally results in group, or rather project team, work. Eva Muller (2009) thus suggests the following with reference to the linguistic realm of the localization project:
Localization tests covering linguistics must be performed for each language independent of the internationalization test. Typical test tasks to be fulfilled are proofreading of translated objects such as the UI and user documentation… All items must be named consistently and according to the specification in order to build a user-friendly localized product, and uniformity between user documentation and online help, for example.
We can thus conclude that the localization process puts stronger emphasis on translation tools and technology than standard translation does. The localization industry is rather young, as its origins can be traced back to the 1980s; therefore, as mentioned at the beginning of this paper, one is still in the dark concerning established methodological considerations. The following is thus a proposal for reaching set criteria for quality assessment based on project work. Judging the quality of website products against objective requirements involves the preparation of a model for assessment. Project work concluded with the following points required for WLQA (Website Localization Quality Assessment), based on earlier scholarly considerations:
Skopos
Functionality (Pragmatic/Semantic Function)
Technicality
Encoding of text elements
Displayability
HTML/XML acceptability
Marketing value
Target audience
The presented parameters are an approximation of assessment criteria for product quality and may be grouped according to three proposed variables: cross-cultural communicative strategy, market function and digital acceptability. In other words, the pragmatic/semantic function of the localized text must fulfill socio-cultural requirements (in accordance with Skopos); the site must fulfill its market/advertising role with reference to the target audience; and finally, displayability, that is, the combination of text and image, must support the purpose of the product. What has also been observed is the need for further training in this area, as contemporary translation courses rarely touch upon localization. Within this new field of translation studies, experts call for further training of future translators. Given global monetary difficulties and the increase in the number of translators, the offer of training for localizers seems to be of great market value. Note this quote from Muller (2009):
Train the translators in the software to be localized to make sure they are familiar with the product, the target group, your style guide and the workflow they should follow. Provide a teach-the-trainer course for your Multilanguage provider. In this instance, you have to conduct a single training for all languages since you’re cooperating with a multi-language provider.
4.2. Software and video game localization
This is yet another challenging aspect of localization. Software is a much more complicated issue and requires specialized knowledge gained through applicable training or experience. Working on such projects requires dividing the material into code and data segments or blocks. The code is reserved for developers, while the data segment is stripped of source code elements. If properly prepared, this division is made by developers at the internationalization stage. Next, localizers deal with linguistic constraints: keeping the language as simple as possible, unambiguous and terminologically consistent. In the words of Esselink (2000: 27–28), “Typical examples of a writing style that could make the translation work easier are short sentences, simple vocabulary, consistency in terminology, and careful use of punctuation”. This also applies to online help and documentation. The use of CAT systems in software localization is inevitable owing to the repetitive nature of software-related linguistic data, which are an excellent example of text suited to translation with such systems and allow their capabilities to be exploited extensively. Besides the facilities incorporated in typical CAT tools (i.e., terminology management systems, translation memory systems, corpus-analysis tools), software localization tools offer additional functions, the scope of which varies with the type of tool and the producer, since software localization deals with more than linguistic and cultural aspects. Accordingly, software localization tools are expected to support additional file formats, as they handle entire Web pages or software applications in which pure text is only a small part of a greater entity, and therefore also manage such issues as the physical distribution of text or shortcut key adjustment.
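The division into code and data segments, together with the concern for the physical distribution of text, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `UiString` structure, the keys, the Polish strings and the `max_chars` limits are hypothetical, and counting characters is only a rough stand-in for the pixel-width checks a real localization tool would perform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UiString:
    """One translatable entry in the data segment (hypothetical structure)."""
    key: str        # identifier the code segment uses to look the text up
    source: str     # source-language text
    max_chars: int  # assumed space available in the interface, in characters


# Data segment: plain text keyed by identifier, with no source code in it.
SOURCE_STRINGS = [
    UiString("btn.save", "Save", 8),
    UiString("btn.cancel", "Cancel", 8),
]

# Target-language texts supplied by the localizer (Polish, for illustration).
TARGET_PL = {"btn.save": "Zapisz", "btn.cancel": "Anuluj zmiany"}


def find_problems(strings, target):
    """Return keys whose translation is missing or exceeds the assumed limit."""
    problems = []
    for entry in strings:
        text = target.get(entry.key)
        if text is None or len(text) > entry.max_chars:
            problems.append(entry.key)
    return problems


def label(key, target, strings=SOURCE_STRINGS):
    """Code-segment lookup: fall back to the source text if untranslated."""
    fallback = {entry.key: entry.source for entry in strings}
    return target.get(key, fallback[key])
```

In this sketch the localizer only ever touches `TARGET_PL`, while the program retrieves interface text through `label`. Calling `find_problems(SOURCE_STRINGS, TARGET_PL)` flags `btn.cancel`, whose translation would need to be shortened or given more room.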
In practice, it may, for instance, turn out that a TL equivalent of a SL word on a button is too long to fit within the width constraints. Some systems therefore provide a preview of the resulting user interface (so that one can verify whether a given TL word is of suitable length or needs to be abbreviated or replaced with a shorter one) or even allow the button width to be modified to accommodate a longer translation (Bogucki 2009: 99–100; Bowker 2002: 131). Furthermore, the source code of a piece of software or a Web page combines text to be translated with tags, the latter of which must not be changed even slightly, let alone translated, despite the fact that some of their elements deceptively take the form of translatable words. In fact, removing or altering a single element within a tag may mean that the expected user interface is displayed incorrectly or not at all. Producers of software localization tools have therefore taken a number of steps to prevent such incidents: some tools display tags and translatable text in different colours, other tools block any modification of the tags, and the most advanced ones extract the text to be translated in isolation from the tags and copy it to another file, at the same time creating one more file that preserves the annotation needed to reinsert the TL text in the proper place.
Considering video game localization, apart from basics such as graphical elements and menus in the aforementioned code blocks, there are other significant elements involved in producing a complete and valid product. These take localizers very much into realms familiar to both literary and audio-visual translators. Many have attributed this to ‘domestication’, as in the case of dubbing. Some claim, however, that it is dubbing, and not any other form of screen translation, that can aspire to being the 'ideal' form of film translation in terms of faithfulness, on the assumption that strictly linguistic considerations should not determine the overall value of a translation. In dubbing, the translator has to be faithful not only in the theatrical sense but also in terms of phonological synchronisation (Pieńkos 1993: 131). Similar observations are frequently put forward in the localization of video games.
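To return briefly to the tag-protection strategies described for software localization tools, the most advanced approach (extracting the translatable text and keeping a separate record of where it belongs) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the method of any particular tool: the function names `extract` and `reinsert` are hypothetical, and the regular expression treats anything between `<` and `>` as a protected tag, which would not cope with a `>` character inside an attribute value.

```python
import re

# Simplifying assumption: a tag is anything between "<" and the next ">".
TAG = re.compile(r"<[^>]+>")


def _add_text(text, skeleton, segments):
    """Store translatable text as a segment; keep bare whitespace verbatim."""
    if text.strip():
        skeleton.append(len(segments))  # an int marks a translatable slot
        segments.append(text)
    elif text:
        skeleton.append(text)


def extract(markup):
    """Split markup into a protected skeleton and a list of text segments."""
    skeleton, segments, pos = [], [], 0
    for match in TAG.finditer(markup):
        _add_text(markup[pos:match.start()], skeleton, segments)
        skeleton.append(match.group())  # tag copied unchanged
        pos = match.end()
    _add_text(markup[pos:], skeleton, segments)
    return skeleton, segments


def reinsert(skeleton, translated):
    """Rebuild the markup, placing each translated segment in its slot."""
    return "".join(
        translated[part] if isinstance(part, int) else part
        for part in skeleton
    )


page = '<button class="save">Save</button><p>Cancel</p>'
skeleton, segments = extract(page)
localized = reinsert(skeleton, ["Zapisz", "Anuluj"])
```

Here the translator sees only the segments `Save` and `Cancel`; the attribute value `save`, which deceptively looks like a translatable word, stays inside the protected skeleton and cannot be altered by accident.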
5. Conclusions
In this sense, following Baker (1998), who states that film is a semiotic composition consisting of four channels, the same logic may be applied to the matter at hand, as many new products make use of subtitling in their ‘video intermissions’ during thematically structured games (cf. Schwarz 2002):
The verbal auditory channel, which includes dialogue and background voices and maybe lyrics.
The non-verbal auditory channel, which is made up of natural sound, sound effects, as well as music.
The verbal visual channel, comprising the sub-titles and any writing within the film, as for example, letters, posters, books, newspapers, graffiti, or advertisements.
The non-verbal visual channel, which includes the composition of the image, camera positions and movement as well as the editing which controls the general flow and mood of the movie.
In effect, the translator needs to create subtitles in such a way that they fulfill their role within this polysemiotic environment (Schwarz 2002). Finally, faithfulness to the lore or genre should be mentioned. Players frequently reject localized products if their overall composition departs from the original. Character names, place names and language-specific elements should be as close as possible to the ‘world of the game’. As with any other undertaking, one needs to remember that a direct and ideally domesticated product can in fact become a poor example of the localization process.
References
Alabau, V., Sanchis, A., & Casacuberta, F. (2011) Improving on-line handwritten recognition using translation models in multimodal interactive machine translation. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (pp. 389–394). Portland (USA), 19–24 June 2011.
ALPAC. (1966) Language and Machines: Computers in Translation and Linguistics. A report by the Automatic Language Processing Advisory Committee (ALPAC). Division of Behavioral Sciences, National Academy of Sciences, National Research Council Publication 1416. Washington: NAS/NRC.
Baker, M. (1992) In Other Words: A Coursebook on Translation. London and New York: Routledge.
Baker, M. (1996) Corpus-Based Translation Studies: The Challenges that Lie Ahead. In Somers, H. (Ed.), Terminology, LSP and Translation: Studies in Language Engineering in Honour of Juan C. Sager (175–86). Amsterdam and Philadelphia: John Benjamins.
Barnbrook, G. (1996) Language and computers: A practical introduction to the computer analysis of language. Edinburgh: Edinburgh University Press.
Barrachina, S., Bender, O., Casacuberta, F., Civera, J., Cubel, E., Khadivi, S. Vidal, E. (2009) Statistical approaches to computer-assisted translation. Computational Linguistics, 35(1): 3–28.
Bell, R. T. (1998) Psychological/cognitive approaches. In Baker, M. (Ed.), Routledge encyclopedia of translation studies. London & New York: Routledge.
Benson, M., Benson, E., & Ilsen, R. F. (1986) The BBI combinatory dictionary of English: A guide to word combinations. Amsterdam and Philadelphia: John Benjamins.
Bogucki, Ł. (2009) Tłumaczenie wspomagane komputerowo. Warszawa: WN PWN.
Bouma, G. (2009) Normalized (Pointwise) Mutual Information in Collocation Extraction. In Chiarcos, C., de Castilho, R. E., and Stede, M. (Eds.), From Form to Meaning: Processing Texts Automatically. Proceedings of the Biennial GSCL Conference 2009 (31–40). Tübingen: Gunter Narr Verlag.
Bowker, L. (2002) Computer-Aided Translation Technology: A Practical Introduction. Ottawa: University of Ottawa Press.
Bowker, L. (2003) Terminology Tools for Translators. In Somers, H. (Ed.), Computers and Translation: A translator's guide (49–65). Amsterdam and Philadelphia: John Benjamins.
Bromberger, S. (1992) On What We Know We Don't Know. Chicago: University of Chicago Press.
Craciunescu, O., Gerding-Salas, C., & Stringer-O'Keeffe, S. (2004) Machine Translation and Computer-Assisted Translation: a New Way of Translating? Translation Journal, Volume 8, No. 3, July 2004.
Dryden, J. (1956) Preface to Ovid's Epistles. In Hooker, E.N., & Swedenberg, H.T. (Eds.), The Works of John Dryden: Poems 1649–1680 (Vol. 1, pp. 109–119). Berkeley and Los Angeles, CA: University of California Press. (Original work published 1680)
Esselink, B. (2000) A Practical Guide to Localization. Amsterdam and Philadelphia: John Benjamins.
Freigang, K-H. (1998) Machine-aided translation. In Baker, M. (Ed.), Routledge Encyclopaedia of Translation Studies (134–136). London: Routledge. Retrieved from http://carynannerisly.wikispaces.com/file/view/Routledge+Encyclopedia+of+Translation+Studies.pdf
Gries, S. T. (2010) Useful statistics for corpus linguistics. In Sánchez, A., & Almela, M. (Eds.), A mosaic of corpus linguistics: Selected approaches (269–291). Frankfurt am Main: Peter Lang.
Halliday, M. A. K. (1978) Language as social semiotic. London: Edward Arnold.
Hatim, B., & Mason, I. (1990) Discourse and the Translator. London: Longman.
Holley, R. (2009) How Good Can It Get? Analysing and Improving OCR Accuracy in Large Scale Historic Newspaper Digitisation Programs. D-Lib Magazine, 2009, vol. March/April 15, n. 4. Retrieved from http://www.dlib.org/dlib/march09/holley/03holley.html
Hönig, H. G. (1998) Positions, power and practice: Functionalist approaches and translation quality assessment. In Schäffner, C. (Ed.), Translation and quality (pp. 6–34). Philadelphia: Multilingual Matters.
House, J. (1977) A Model for Translation Quality Assessment. Tübingen: Gunter Narr.
House, J. (1997) Translation Quality Assessment: A Model Revisited. Tübingen: Narr.
House, J. (2000) "Translation Quality Assessment: Linguistic Description versus Social Evaluation". Journal des traducteurs / Meta: Translators' Journal, vol. 46, n° 2, 2001, p. 243–257.
Hudson, R.A. (1980) Sociolinguistics. Cambridge: C.U.P.
Hunston, S. (2002) Corpora in Applied Linguistics. Cambridge: Cambridge University Press.
Hutchins, J. (1986) Machine translation: Past, present, future. Chichester, England: Ellis Horwood Limited.
Hutchins, J. (1995) A new era in machine translation research. Aslib Proceedings 47 (1): 211–219. Retrieved from http://www.hutchinsweb.me.uk/AslibProc-1995.pdf
Hutchins, J. (2001) Machine translation over fifty years. Histoire, Epistemologie, Langage: Vol. 23 (1): 7–31. Retrieved from http://www.hutchinsweb.me.uk/HEL-2001.pdf
Hutchins, J. (2003) Machine Translation: A General Overview. In Mitkov, R. (Ed.), The Oxford Handbook of Computational Linguistics (501–11). Oxford: University Press.
Hutchins, J. (2005) Example-based Machine Translation: A review and commentary. Machine Translation 19 (197–211).
Hutchins, J. (2006) Machine translation: history of research and use. In Brown, K. (Ed.), Encyclopedia of Languages and Linguistics: 2nd Edition. Oxford: Elsevier. Retrieved from http://www.hutchinsweb.me.uk/CUHK-2006.pdf
Hutchins, J. (2007) Machine Translation: A concise history. In Wai, C. (Ed.), Computer aided translation: Theory and practice. Hong Kong: Chinese University of Hong Kong.
Hyland, K. (2006) English for Academic Purposes: An advanced resource book. London: Routledge.
Hymes, D.H. (1972) On Communicative Competence. In Pride, J.B., & Holmes, J. (Eds.), Sociolinguistics. Selected Readings. Harmondsworth: Penguin.
Krings, H.P. (1986) Translation problems and translation strategies of advanced German learners of French. In House, J., & Blum-Kulka, S. (Eds.), Interlingual and intercultural communication (pp. 263–75). Tübingen: Gunter Narr.
Lommel, A., & Ray, R. (2007) The Globalization Industry Primer. An introduction to preparing your business and products for success in international markets. The Localization Industry Standards Association.
Luther, M. (1963) Sendbrief vom Dolmetschen. In Störig, H.J. (Ed.), Das Problem des Übersetzens (pp. 14–32). Darmstadt: Wissenschaftliche Buchgesellschaft. (Original work published 1530)
Manning, C. D., & Schütze, H. (1999) Foundations of Statistical Natural Language Processing. Cambridge, Massachusetts: The MIT Press.
McKeown, K. R., & Radev, D. R. (2000) Collocations. In Dale, R., Moisl, H. & Somers, H. (Eds.), A Handbook of Natural Language Processing (507–523). New York: Marcel Dekker.
Muller, E. (2009) Localization: The Global Pyramid Capstone. Multilingual: Localization, Getting Started.
Munday, J. (2001) Introducing Translation Studies: Theories and Applications. London: Routledge.
Newmark, P. (1988) A textbook in translation. New York: Prentice-Hall International.
Nida, E. A. (1964) Towards a Science of Translating. Leiden: E. J. Brill.
Nkwenti-Azeh, B. (1998) Machine-aided translation. In Baker, M. (Ed.), Routledge encyclopedia of translation studies. London: Routledge. Retrieved from http://carynannerisly.wikispaces.com/file/view/Routledge+Encyclopedia+of+Translation+Studies.pdf
Norton, P., & Clark, S. (2002) Peter Norton's New Inside the PC. Indianapolis, Ind.: Sams Publishing.
Piotrowski, T. (2008) The Translator and Polish-English Corpora. In Anderman, G. & Rogers, M. (Eds.), Incorporating Corpora: The Linguist and the Translator. Clevedon: Multilingual Matters.
Reinke, U. (2013) State of the Art in Translation Memory Technology. Translation: Computation, Corpora, Cognition, 3(1).
Samuelsson-Brown, G. (2010) A Practical Guide for Translators. Salisbury: Short Run Press.
Sandrini, P. (2005) Website localization and Translation. MuTra 2005 – Challenges of Multidimensional Translation: Conference Proceedings.
Savory, T. (1960) The Art of Translation. Boston: The Writer.
Schäffner, C. (1998) Skopos theory. In Baker, M. (Ed.), Routledge encyclopedia of translation studies (pp. 235–38). London: Routledge.
Schewe, T. (2001) ‘Multilingual Communication in the Global Network Economy’. In Eschenbach, J., & Schewe, T. (Eds.), Über Grenzen gehen – Kommunikation zwischen Kulturen und Unternehmen. Halden/Norwegen: Hogskolen i Ostfold: 195–209.
Schwarz, B. (2002) “Translation in a confined space”. Translation Journal Vol. 6 No. 4, October.
Sikes, R. (2009) Localization: The Global Pyramid Capstone. Multilingual: Localization, Getting Started.
Somers, H. (2003) Translation memory systems. In Somers, H. (Ed.), Computers and Translation: A translator's guide (31–47). Amsterdam and Philadelphia: John Benjamins.
Somers, L. (1998) Machine translation, applications. In Baker, M. (Ed.), Routledge Encyclopaedia of Translation Studies (136–139). London: Routledge. Retrieved from http://carynannerisly.wikispaces.com/file/view/Routledge+Encyclopedia+of+Translation+Studies.pdf
Somers, L. (1998) Machine translation, history. In Baker, M. (Ed.), Routledge Encyclopaedia of Translation Studies (140–143). London: Routledge. Retrieved from http://carynannerisly.wikispaces.com/file/view/Routledge+Encyclopedia+of+Translation+Studies.pdf
Somers, L. (1998) Machine translation, methodology. In Baker, M. (Ed.), Routledge Encyclopaedia of Translation Studies (143–148). London: Routledge. Retrieved from http://carynannerisly.wikispaces.com/file/view/Routledge+Encyclopedia+of+Translation+Studies.pdf
Tomás, J., & Vidal, E. (2009) Statistical approaches to computer-assisted translation. Computational Linguistics, 35(1), 3–28.
Tytler, A. F. (1970) Essay on the Principles of Translation. New York: Garland. (Original work published 1797)
Van der Meer, J. (2002, March 3) Where’s that Localization Business Model 2.0? LISA Newsletter.
Venuti, L. (1995) The Translator's Invisibility: A History of Translation. London: Routledge.
Vermeer, H. (2000) “Skopos and Commission in Translational Action”. In Venuti, L. (Ed.), The Translation Studies Reader. New York: Routledge.
Wills, W. (1982) The Science of Translation: problems and methods. Tübingen: Narr.
Yunker, J. (2002) Beyond Borders. Web Globalization Strategies. Indianapolis: New Riders Publishing.