Feature-based natural language processing for GSL synthesis

Eleni Efthimiou¹, Stavroula-Evita Fotinea¹, and Galini Sapountzaki²
¹Institute for Language and Speech Processing, Athens / ²University of Thessaly

The work reported in this study is based on research that has been carried out while developing a sign synthesis system for Greek Sign Language (GSL): theoretical linguistic analysis as well as lexicon and grammar resources derived from this analysis. We focus on the organisation of linguistic knowledge that initiates the multi-functional processing required to achieve sign generation performed by a virtual signer. In this context, structure rules and lexical coding support sign synthesis of GSL utterances, by exploitation of avatar technologies for the representation of the linguistic message. Sign generation involves two subsystems: a Greek-to-GSL conversion subsystem and a sign performance subsystem. The conversion subsystem matches input strings of written Greek to GSL structure patterns, exploiting Natural Language Processing (NLP) mechanisms. The sign performance subsystem uses parsed output of GSL structure patterns, enriched with sign-specific information, to activate a virtual signer for the performance of properly coded linguistic messages. Both the conversion and the synthesis procedure are based on adequately constructed electronic linguistic resources. Applicability of sign synthesis is demonstrated with the example of a Web-based prototype environment for GSL grammar teaching.

Keywords: Greek Sign Language (GSL), sign synthesis, rule-based linguistic analysis, grammar coding, feature-driven coding

Sign Language & Linguistics 10:1 (2007), 1–21. issn 1387–9316 / e-issn 1569–996x © John Benjamins Publishing Company

1. Introduction

Greek Sign Language (GSL) is the natural language of the Greek Deaf Community. Like all known sign languages, GSL uses three-dimensional space to articulate linguistic utterances. The language’s fundamental semantic units are base signs that correspond to the traditional notion of morpheme. Sign units are either monomorphemic or morphologically complex and can be described with respect to their phonological composition. The phonological composition of a sign includes values for both manual sign formation features — namely handshape, location, movement, orientation, and number of hands — and all obligatory non-manually articulated elements — such as mouth pattern, head and shoulder movement, facial expression, and eye-gaze — which also constitute part of a sign’s articulation (Stokoe 1978).

Until recently, research on the grammar of GSL has been limited and in most cases either focused on the analysis of specific syntactic phenomena (Antzakas & Woll 2002; Sapountzaki 2003; Antzakas 2006) or made general theoretical assumptions about its status as a natural language (Papaspyrou 1990). The lack of a full-scale analysis of the GSL linguistic system has been the major obstacle to developing a wide range of language-based communicative and educational products, most strikingly reflected in the lack of a concise grammar handbook for deaf education, although Greek legislation (2817/2000) recognises GSL as an official language of the Hellenic State.

Traditionally, video is used to convey the linguistic content of sign languages. Given the need for three-dimensional representation of natural sign language production, video has been successfully used for lemma presentation when developing electronic educational tools, which exploit data of mainly lexicographic character (see Efthimiou, Vacalopoulou, Fotinea & Steinhauer (2004) for the application of video in GSL electronic lexicography). Although video fully preserves the naturalness of signing, it is still a “static” mechanism for the representation of linguistic utterances, in the sense that it cannot easily be modified to support the dynamic nature of human language performance in real use situations.
Hence, the problem of dynamic language representation remains, which in turn heavily affects the generation of extensive and easily updated language material. With respect to the latter, ease of update is a crucial parameter for the use of sign languages on the Web and consequently for the access of deaf individuals to widely available electronic communication facilities, tools and services.

In the last decade, a technological solution towards sign production has been explored to partly fill the gap of dynamic sign language content representation. This technical solution exploits virtual signing technology for the development of sign synthesis systems, where signs are performed by a virtual human signer (avatar).1

1.  Projects with the goal to have an avatar perform movements with specific linguistic value include the European VISICAST project (http://www.visicast.sys.uea.ac.uk) as well as the eSIGN project (http://www.sign-lang.uni-hamburg.de/eSIGN/Overview.html), the American




All avatar-based systems for SL generation obligatorily rely on vast linguistic resources and employ specific Natural Language Processing (NLP) mechanisms derived from, or at least strongly influenced by, the area of Machine Translation. The work reported here is based on systematic data collection and theoretical linguistic research on GSL grammar, which has been carried out by the Sign Language Research Lab of the Institute for Language and Speech Processing (ILSP) in Athens since 1999, with the aim of developing electronic resources, tools and systems for the incorporation of GSL into sign language technology environments.

The paper is organised as follows: Section 1 presents some background information on GSL, as well as general information on the technologies underlying sign synthesis machines. Section 2 deals with the framework of linguistic analysis related to GSL coding for sign synthesis, while Section 3 presents the available Natural Language resources, focusing on the sign lexicon database and the organisation of structure rules. Section 4 discusses Greek-to-GSL conversion procedures and 3D sign performance by a virtual agent. Section 5 offers an example of the exploitation of feature-based sign synthesis in the framework of a GSL grammar teaching educational platform. Finally, Section 6 summarises current potentials and limitations of sign language NLP and the performance of signing avatars.

2. GSL coding for sign synthesis

Just like other known sign languages (Stokoe 1978; Kyle & Woll 1985; Valli & Lucas 1995; Sutton-Spence & Woll 1999; Sandler & Lillo-Martin 2006), GSL makes use of sequential (linear) and simultaneous (non-linear) means to combine linguistic units at all levels of grammar. The linear part of the language involves all sequences of linguistic tokens and their syntactic relations, while non-linear mechanisms are encountered at all four levels of the grammar (phonology, morphology, syntax, and pragmatics). In the case of complex sign or sign phrase production, non-linear information is expressed by non-manual features that simultaneously combine with manual signs. These non-manual features result in a (strong) modification of the linguistic message of the base sign or signed utterance. In this respect, GSL grammar is organised following the principles and core mechanisms for linguistic message generation utilised by all sign languages (Bellugi & Fischer 1972; Liddell 1980).

SignSynth project (http://www.unm.edu/~grvsmth/signsynth), the DePaul University ASL Project (http://asl.cs.depaul.edu/), and the South African SASL-MT Project (http://www.cs.sun.ac.za/~lynette/SASL/), among others.


In our theoretical linguistic analysis, we follow generally accepted natural language research principles and standards based on systematic data collection of native GSL signers for both lexicon (Efthimiou, Vacalopoulou, Fotinea & Steinhauer 2004) and syntactic-semantic analysis (Fotinea, Efthimiou & Kouremenos 2005; Efthimiou, Fotinea & Sapountzaki 2006a). Research at the lexical level has supported the definition of concepts and a collection of GSL representations of the basic GSL vocabulary (Efthimiou & Katsoyannou 2001). Moreover, it has allowed for the analysis of the signs’ morpho-phonological structure. This knowledge has led to a formal description of sign formation mechanisms, which are the basis for vocabulary extension, in turn leading to the development of a methodology for terminology creation (a recent overview of related work can be found in Efthimiou & Fotinea (2007)).

Corpus-based syntactic analysis of sign utterances is a prerequisite for the study of grammatical phenomena, taking into account various instantiations of linguistic performance of different native informants (Efthimiou & Fotinea, in press), while semantic analysis involves on the one hand the study of allowed feature combinations on the level of sign formation, and on the other hand constituent relations within a sign utterance (Efthimiou, Fotinea & Sapountzaki 2006a; Efthimiou, in press; for a similar approach to ASL linguistic analysis see Neidle (2002)).

The theoretical analysis of GSL has enabled us to create adequate electronic resources for the development of sign language technologies at ILSP. Our approach to GSL synthesis is based heavily on the experience gained from NLP processes of syntactic parsing and Speech Synthesis applied to oral languages.
In GSL (as in any sign language) there is a closed set of phonological components (Stokoe & Kuschel 1979; Fischer & Gough 1979; Sutton-Spence & Woll 1999; Efthimiou & Katsoyannou 2001), various combinations of which generate every possible sign. With respect to oral languages, speech technology has exploited properties of the phonological composition of words to develop speech synthesis tools for unrestricted text input. In the case of sign languages, a similar approach is being experimented with, to generate signs (word level linguistic units of sign languages) not by mere video recording, but rather by composition of their phonological components.2

To achieve phonological synthesis of GSL signs, a library of sign notation features, among other linguistic principles and standards, has been converted into motion parameters of a virtual agent (avatar). The same set of features that

2.  A number of scientific attempts in this research domain describe the outline of the adopted approach. These attempts employ different phonology coding schemes and address distinct purposes of use (e.g. HamNoSys (http://www.sign-lang.uni-hamburg.de/projects/HamNoSys.html) as a sign phonology transcription system and SignWriter (http://www.signwriting.org) as a widely used writing system for SLs).




comprises the library content is utilised to represent the phonological structure of sign lemmata in the system’s sign database. The sign phonology part of the database directly feeds the 3D sign generation module, which translates the phonological components of a given sign into its respective set of virtual movement instructions, performed by the avatar. Thus the avatar can perform any new sign, given that its phonology is broken down into features describing all the essential elements of a sign’s formation (handshape, location, palm orientation, movement) (Karpouzis, Caridakis, Fotinea & Efthimiou, in press). In our case, the phonological structure of GSL signs is analysed into sign formation components, according to the HamNoSys notation system (Prillwitz, Leven, Zienert, Hanke & Henning 1989). Signs are broken down as to handshape, location, movement, palm orientation, number of hands, and their obligatory non-manual elements.

Complex sign formation is based on morphological rules combining free and bound morphemes that appear as individual records in the lexical database, thereby creating new signs or inflected forms of a sign. The combinatorial possibilities are restricted by specific features that accompany lemma entries, such as markings for the various plural formation rules or movement length, speed, and repetition indicating different grammatical categories of the same base sign (i.e. single movement = verb, repeated movement = derived nominal). In order to extend the system’s generative capacity to the phrase level, a set of core grammatical rules provides structure patterns for grammatical GSL sentences, which may receive unrestricted sign units on the leaf level.
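The decomposition of a sign into a feature bundle that drives avatar motion can be pictured with a short Python sketch. All names here (the `SignPhonology` class and the instruction strings) are hypothetical illustrations; the actual system encodes phonology in HamNoSys and uses its own avatar instruction set.

```python
from dataclasses import dataclass, field

@dataclass
class SignPhonology:
    """Illustrative feature bundle for one sign (not the ILSP schema)."""
    handshape: str          # e.g. a HamNoSys handshape symbol
    location: str
    movement: str
    palm_orientation: str
    two_handed: bool = False
    nonmanual: dict = field(default_factory=dict)  # e.g. {"mouthing": "OL7"}

def to_motion_parameters(sign: SignPhonology) -> list:
    """Translate phonological components into an ordered list of
    (purely illustrative) virtual-signer motion instructions."""
    params = [
        f"SET_HANDSHAPE {sign.handshape}",
        f"MOVE_TO {sign.location}",
        f"ORIENT_PALM {sign.palm_orientation}",
        f"EXECUTE_PATH {sign.movement}",
    ]
    if sign.two_handed:
        params.insert(0, "ACTIVATE_NONDOMINANT_HAND")
    # Non-manual features are layered simultaneously with the manual stream.
    params += [f"PARALLEL {k.upper()} {v}" for k, v in sign.nonmanual.items()]
    return params
```

Because the feature set is closed, any new lemma added to the database with a complete feature bundle becomes performable without recording new video, which is the point of the synthesis approach described above.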

3. Natural Language (NL) resources for GSL sign synthesis

The GSL knowledge implemented in the system consists of two parts: (i) a lexicon that provides information relevant to all levels of grammar (sign database) and that is multiply exploited by the system’s modules (Section 3.1), and (ii) a set of structure rules utilising strings of morphemes to compose core sign utterances of GSL (Section 3.2). GSL synthesis is heavily based on the Natural Language (NL) knowledge of the system, which is necessary in order to guarantee, to an acceptable extent, the linguistic adequacy of the signing performed. Coding of GSL knowledge involves a lexicon of annotated signs organised in a multi-purpose database environment and a set of rules for structuring core grammar phenomena in GSL, making extensive use of feature properties and structuring options.


3.1 The sign lexicon

The sign lexicon contains lemmata of morphemic linguistic value of both free and bound morpheme categories. Information accompanying these lemmata is organised with respect to both phonological composition of the respective lemma and related syntactic-semantic properties. In order for the different modules of the sign synthesis system to efficiently exploit the information necessary for a specific task, the sign lexicon is organised in a database which incorporates manual and various non-manual features of sign phonology, annotated in HamNoSys, along with features indicating morpho-syntactic properties of sign lemmata, allowing for grammatical synthesis of morphologically complex signs.

In accordance with the above, multilayer phonological information is given in a set of fields for mouthing patterns, features for facial expressions, head and shoulder(s) movement, and eye gaze. These features, when positively marked, have to be realised simultaneously with the sign’s movement. For example, in the case of marking for eye gaze, eyes obligatorily follow the movement path, when e.g. indicating pronouns, inflecting verbs of movement like walk, or using classifiers like airplane-flies (Sutton-Spence & Woll 1999; Berenz 2002; Thompson, Emmorey & Kluender 2006). In all known sign languages, these features are also utilised for prosodic reasons, for instance, to indicate stress and focus.

Information related to morpho-syntactic and semantic properties is coded in fields that feed different parts of the linguistic production. For instance, one of the possible groupings involves the identification of a lemma as a member of a specific morphologically marked GSL word family. Such a family is based on the main morphological feature of sign formation from which a number of derived signs may be produced. For example, the base morpheme masculine in combination with a number of bound morphemes yields the signs for man, boy, and uncle (Fernald & Napoli 2000).
A more representative example of this kind of coding in a sign language environment is the base morpheme surface in the creation of signs such as table, field, and yard. Word level sign formation features form an extensive set of fields in the sign database. These features often add either morphological or syntactic and semantic values to base sign morphemes based on pragmatic parameters. To demonstrate how lemmata assigned these properties are modified, we present below a representative subset of this type of features.

Modulation for opposites involves a clustering of parameters which incorporate mirror-image movement, as in the case of the sign pairs open – close or go away – come here, or change of orientation, as in the pairs go-upstairs – go-downstairs or upwards – downwards.




A feature of significant importance adds information as to the lemma’s ability to become a classifier. This happens when the handshape used for the formation of a base sign can behave like a classifier, that is, the handshape characterises the formation of a whole set of semantically and pragmatically related signs. In GSL, lemmata like glass, airplane, walk, table etc. may function as classifiers by virtue of their handshape. A complementary feature to the one signifying the lemma’s ability to become a classifier indicates that the lemma it is assigned to may utilise a standard classifier for the indication of movement or location parameters. This is especially productive in the case of concrete object linguistic representations; for example, the sign pencil utilises classifier Δ (delta), the sign bottle utilises classifier C, the sign field utilises classifier 5, etc.

A further marking of base morphemes signifies the distinction between the grammatical categories verb and noun, by movement repetition and/or modification of speed and size. Typical examples of this type are love(verb) — love(noun), sit — chair, and eat — food, among others. Similar production rules apply in other typological categories, which are sometimes unclear in GSL, as is the case for grow-up versus adult. At the current stage, database information does not code the direction of derivation or genetic relations between such pairs.

The ‘default’ feature marks a lemma positively when the lemma functions as a base sign for a given word family. Yes/no ‘default’ values are combined with further parameters such as plural formation and aspectual options, i.e. the sign man is the base sign for a word family, the members of which follow its pattern of plural formation. The same happens with the base sign give as regards its aspectual options, whereas airplane forms a word family and also passes on its plural formation and movement patterns (real and syntactic) to its members.
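Two of the features just described, standard classifier assignment and the verb/noun movement distinction, can be pictured as simple lookup tables. The sketch below is illustrative only: the dictionary layout and tag names are invented for exposition, though the classifier assignments (pencil–Δ, bottle–C, field–5) and the single-versus-repeated movement rule come from the text.

```python
# Hypothetical fragment of the feature fields accompanying lemma entries.
LEXICON = {
    "pencil": {"classifier": "Δ"},         # concrete object uses classifier Δ
    "bottle": {"classifier": "C"},
    "field":  {"classifier": "5"},
    "love":   {"verb_noun_pair": True},    # love(verb) vs. love(noun)
    "sit":    {"verb_noun_pair": True},    # sit vs. chair
}

def movement_spec(lemma, category):
    """Single movement realises the verb; repeated movement the derived
    nominal, per the verb/noun marking on the base morpheme."""
    entry = LEXICON[lemma]
    if not entry.get("verb_noun_pair"):
        raise ValueError(f"{lemma!r} has no coded verb/noun distinction")
    return "single" if category == "verb" else "repeated"
```

The point of coding these distinctions as features rather than as separate lexical entries is that one base record can yield several surface signs at synthesis time.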
In Figure 1, one may track the word family ‘child’, the base sign of which is provided by the lemma little-child (id 241), while it also includes child (id 277), short (id 200), and get-older (id 231) as its members.

ID    Lemma                    Word family    Default
241   ΜΙΚΡΟ-ΠΑΙΔΙ              Παιδί          Yes
277   ΠΑΙΔΙ                    Παιδί          No
200   ΚΟΝΤΟΣ                   Παιδί          No
231   ΜΕΓΑΛΩΝΩ-ΣΕ-ΗΛΙΚΙΑ       Παιδί          No

Figure 1.  Sign lexicon database view of the fields for ‘word family’ and ‘default’ in the case of word family ‘child’ (παιδί).

Morpho-phonological properties of sign languages often entail a pragmatically oriented parameter. The ‘real movement’ feature is related to the utilisation of classifiers and real space representation, i.e. for the composition of signs for


vehicle and signs expressing physical motion. An example is the group of space verbs, which includes formations like i-jump, i-walk, airplane/insect-flies, car-moves, etc.

Plural in GSL can be expressed either by free or bound morphemes. In both cases, the features which mark the different plural formations belong to the class of morphological markers. Thus, nominal lemmata may receive a plural formation value that indicates their pluralization behaviour, when plural is formed by a numeric value or quantifier on the singular sign (e.g. ‘2, 3, … days’, ‘2, 3, … pencils’ etc.) or groups of many or few objects in real space. Otherwise, plural may be marked by movement repetition and/or change in space (as e.g. in the case of book, tree, and child). Yet another option is provided by the [2hand plural] coding tag3 in the sign lexicon database, if the base sign is formed with one hand (e.g. airplane) and its plural form requires two-handedness.

A set of syntactic-semantic features assigns (multilayer) properties of phrase formation to lemmata that can function as phrase heads. Assignment of aspectual values to predicative heads, for instance, provides a typical example of the behaviour of a sign language in this respect. [Durative aspect] indicates that the sign movement continues for longer than default in order to express durative aspect. [Diminutive aspect] expresses a small span of movement to indicate minimal action/event (e.g. with predicative signs such as it-is-blowing, i-walk, i-speak, i-eat, etc.). [Intensive aspect] marking requires a bigger span and abrupt pauses in movement in order to indicate intensity (e.g. with signs such as feel-a-pain, it-rains, etc.). [Repetitive aspect] marking requires repetition of the sign’s movement with interval pauses (e.g. with signs such as ask or travel). Different aspects can be mutually exclusive or can occur simultaneously on a given predicate.
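A minimal sketch of how such aspect tags might modulate a sign's default movement parameters before performance; the numeric values and parameter names are placeholders, not measurements from the actual system, and the tag names follow the bracketed labels in the text.

```python
# Default movement parameters of a base sign (illustrative units).
DEFAULTS = {"span": 1.0, "duration": 1.0, "repetitions": 1, "pauses": False}

# How each aspect tag overrides the defaults, per the descriptions above.
ASPECT_RULES = {
    "durative":   {"duration": 2.0},                   # longer than default
    "diminutive": {"span": 0.5},                       # small span of movement
    "intensive":  {"span": 1.5, "pauses": True},       # bigger span, abrupt pauses
    "repetitive": {"repetitions": 3, "pauses": True},  # repetition with intervals
}

def apply_aspects(aspects):
    """Overlay one or more (compatible) aspect markings on the defaults."""
    params = dict(DEFAULTS)
    for a in aspects:
        params.update(ASPECT_RULES[a])
    return params
```

Since some aspects are mutually exclusive, a fuller implementation would validate the combination before overlaying it; the sketch simply applies tags in order.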
In contrast, the feature ‘syntactic movement’ is related to the conjugation of the so-called agreement verbs (e.g. ask, scold, pay, etc.), a group where sign formation necessarily incorporates the subcategorisation frame of the head predicate. Finally, a sentence level structural feature is indicated as ‘can be a topic’; this feature signifies that the lemma it is assigned to may be placed at the beginning of a sentence, obligatorily accompanied by an eyebrow raise and followed by a pause (= topic marking on X0 level signs for concrete entities and place specifiers).

Figure 2 provides a view of the sign lexicon database where the fields for concept representation (in Greek) and HamNoSys lemma annotation are demonstrated. The phonological decoding of lemmata reveals a number of sign formation properties as, for example, the combination of root morphemes of different semantic categories for the formation of signs in the same concept group. An example is the HamNoSys representation of the signs sister (id 45) and brother

3.  Bracketed labels in the text correspond to coding tags in the sign lexicon database.




ID    Lemma       HamNoSys
45    ΑΔΕΛΦΗ      [HamNoSys notation not reproduced] + [HamNoSys notation not reproduced]
46    ΑΔΕΛΦΟΣ     [HamNoSys notation not reproduced] + [HamNoSys notation not reproduced]

Figure 2.  Sign lexicon database view of the fields for concept representation (in Greek) and HamNoSys lemma annotation.

ID    Lemma        HamNoSys                             Mouthing    Eye gaze
358   ΤΡΕΧΩ        [HamNoSys notation not reproduced]   OL7
227   ΜΑΛΩΝΩ       [HamNoSys notation not reproduced]   CN17
185   ΚΑΤΗΓΟΡΩ     [HamNoSys notation not reproduced]   CN17
379   ΦΙΛΩ         [HamNoSys notation not reproduced]   CN14
107   ΕΣΕΙΣ        [HamNoSys notation not reproduced]               YES

Figure 3.  Examples of non-manual feature coding in the sign lexicon database.

(id 46), respectively, where a semantic unit roughly interpreted as ‘born by X’ is combined with the signs boy and girl, respectively. In Figure 3, the marking for obligatory non-manual features in the sign lexicon database is demonstrated, among which the mouthing gestures combining with the signs for the predicates ‘run’ (id 358), ‘scold’ (id 227), ‘accuse’ (id 185), and ‘kiss’ (id 379). In the case of eye gaze, the ‘yes’ value denotes obligatory simultaneous performance of the specific feature with the hand motions (annotated in HamNoSys), as, for example, in the second person plural of the personal pronoun you (id 107 in Figure 3), which obligatorily requires eye gaze towards the direction of the pointing finger.

3.2 GSL structure rules

Following theoretical linguistic analysis, a set of GSL structure rules was developed taking into account experience-based decisions about the efficiency of computational grammar rule writing (Shieber 1992; Carpenter 1992). Cross-sign linguistic evidence was also taken into account (Fischer 1996; Boyes-Braem & Sutton-Spence 2001; Zeshan 2004), in order to support the formulation of rules with generative capacity (Efthimiou, Fotinea & Sapountzaki 2006a) that adequately describe natural signing patterns. The resulting rule set is utilised, on the one hand, in the Greek-to-GSL conversion environment as part of a Machine Translation (MT) based engine, and on the other hand as input to the sign synthesis engine to facilitate phrase level generation of linguistic utterances.


The fundamental hypothesis that forms the basis for the computational linguistic analysis we adopt is related to the level of representation of linguistic units. Structural representations are based on morpheme classes instead of the widely accepted word class level. This choice was based on the observation and analysis of GSL data, which verified that an analysis using word level categories cannot account for all instantiations of use of one and the same item in the language’s core grammar structures. Moreover, a number of syntactic-semantic functions, which are ordered linearly in spoken languages and thus need the identification of a syntactic category in order to participate in the internal structure of phrase level formations, are expressed by multilayer feature assignment on structural heads in GSL.

As a result, the rules for GSL analysis and generation use morpheme units with well-defined grammatical functions, which may fill standard positions in string-like formations for the creation of signing utterances, with multilayer features adding mainly semantic values to maximal phrases. Poly-morphemic sign formation involves processes in which either (i) adjunct morphemes are allowed to attach to base morphemes to provide cumulative or derivational morphological information, or (ii) information is added requiring multilayer processing, related to the various semantic functions of non-manual features. In this respect, the structure rules that provide linear orders are combined with condition-dependent feature insertion rules which allow for the representation of values assigned to heads of phrases, in order to express semantic properties which in spoken languages are usually expressed by means of various structural categories (X0-XP) and particles participating in string formations. Furthermore, for computational efficiency, the main decision on rule writing was to restrict structural depth to level one (1).
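The combination of depth-one linear rules with condition-dependent feature insertion can be sketched as follows. The rule names, slot labels, and feature values are invented for illustration; only the general organisation (flat morpheme slots plus non-linear features attached to the head) comes from the text.

```python
# A rule is a flat, ordered sequence of morpheme slots: structural depth
# is restricted to level one, so there is no recursive phrase embedding.
RULES = {
    "S": ["SUBJ", "PRED"],
}

def generate(rule, fillers, head_features=None):
    """Fill each slot with a lexical leaf, then attach multilayer features
    (e.g. sentential negation) non-linearly to the predicate head."""
    structure = []
    for slot in RULES[rule]:
        feats = head_features if (head_features and slot == "PRED") else {}
        structure.append((slot, fillers[slot], feats))
    return structure
```

For example, a negated sentence would carry the negation marker as a feature on the predicate slot rather than as an extra position in the string, mirroring the non-manual realisation of negation described later in the text.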
Therefore, sentence level signed utterances involve the filling of well-defined structured positions for morphemes in a way that facilitates an efficient description of the language’s generative capacity. The leaves of the created structures are the lemmata of the GSL lexicon. Lemmata and rules comprise the GSL coded knowledge. The rules generate linearly ordered surface structures that correspond to base sign sequences in a phrase. However, the maximal phrase level representations also contain features that provide linguistic information which is expressed non-linearly by non-manual mechanisms.

Typical non-manual elements in phrase formation are markers expressing sentential negation and interrogation. Negation, for instance, is indicated by a complex non-manual feature on the sentence level, which accompanies (at least) the predicate head sign. In addition, the expression of qualitative value within nominal phrases also requires the addition of non-manual




information. Adjectives like nice/good, for example, are formed by incorporating the adjectival value on the nominal head morpheme by means of an appropriate mouth gesture. Hence, the GSL equivalent of ‘nice apple’ is expressed by signing the head apple while performing the mouth gesture that accompanies the qualitative adjective nice. Finally, various adverbial values require non-manually articulated information on (parts of) the verb phrase. Meanings like ‘walking comfortably’, ‘eating nicely’ etc. are expressed in GSL by a single morpheme, with adverbial information superimposed on that morpheme by specific facial expressions.

4. Greek-to-GSL conversion

To the best of our knowledge, all systems that attempt conversion from an oral to a sign language entail a module that exploits some kind of transfer mechanism. Indicatively, we refer to the conversion procedures reported in Marshall & Safar (2002); Lancaster et al. (2003); and Huenerfauth (in press). This transfer component is required by all MT systems to match the linguistic units of the source to those of the target language. Translation from an oral to a sign language must preserve information utilised by the multilayer message structure of the target language. This is a major goal of all reported systems, equally true for early as well as recent attempts (Zhao, Kipper, Schuler, Vogler, Badler & Palmer 2000; Zijl & Combrink 2006; Fotinea, Efthimiou, Karpouzis & Caridakis, in press). The possibility to extend such a system to cover a wider range of linguistic phenomena within one language pair, but also its potential adaptation for a different language pair, rely heavily on the continuous enrichment of appropriately coded linguistic resources.

In order to handle written Greek input for conversion to GSL, we make use of a previously developed statistical parser for Greek (Boutsis, Prokopidis, Giouli & Piperidis 2000). On the basis of tag annotations on the input, which consists of lemmatised word strings, this parser provides syntactic chunks as output. These chunks are mapped onto GSL structures, which generate sign string patterns to be performed by the avatar. Structural and lexical mapping incorporates standard MT procedures that are able to handle the addition or deletion of non-matching linguistic elements between source and target language, as well as perform feature insertion on GSL heads in order to provide formations that utilise multilayer means of expression, characteristic of natural (complex) sign performance.
The adopted organisation of linguistic knowledge allows for robust conversion of written Greek text into GSL signing, resulting in a tool which is, in principle, independent of the application environment, and, when enriched with appropriate resources, may support access by deaf users to a wide range of e-content.


4.1 NLP procedures

The Greek-to-GSL conversion tool consists of three submodules: Shallow Parsing for Greek, Greek-to-GSL Mapping, and GSL Synthesis. The Greek-to-GSL conversion tool is schematically depicted in Figure 4.

In the Shallow Parsing submodule, Greek written sentences (text) are processed by a shallow, statistical parser that also makes use of linguistic information based on morphological tags on lemmatised words of phrasal strings. Parsing results in structured chunks that correspond to grammatically adequate Greek syntactic units, with feature values for morpho-syntactic annotations on input words and structural annotation on phrases. Incorporation of the procedure described above into the conversion tool takes advantage of the shallow parsing capability to successfully handle large amounts of natural language data. The chunks created by the parser serve as input to the Greek-to-GSL mapping submodule, where the present target is acceptable performance in a well-defined sublanguage environment.

The Greek-to-GSL Mapping module transfers written Greek chunks into equivalent GSL structures, and aligns input tagged words with corresponding signs or features on sign heads (Fotinea, Efthimiou & Kouremenos 2005). For instance, the Greek noun phrase ‘nice apple’ has to match the structural needs of GSL: the specific noun phrase has to be realised as a complex sign by performing the manual sign for the nominal head (‘apple’) simultaneously with the mouthing gesture for ‘nice’. In the mapping module, the adjective chunk is replaced by a corresponding mouthing feature. This procedure is combined with a general mapping rule for the semantic tag ‘qualitative’ on adjective heads and deletes the input chunk related to the adjective, while at the same time creating a corresponding feature on the nominal head. This feature may receive several values deriving from mapping between specific adjectives and mouthing gestures.
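The adjective-deletion and feature-insertion step for the 'nice apple' case could look like the following sketch. The chunk format and the mouth-gesture label `MG_nice` are assumptions for illustration, not the tool's actual data format.

```python
# Hypothetical adjective-to-mouth-gesture table for qualitative adjectives.
MOUTHING = {"nice": "MG_nice"}

def map_np(chunks):
    """Delete qualitative adjective chunks and re-express them as a
    mouthing feature on the following nominal head."""
    out = []
    pending = None
    for c in chunks:
        if c["cat"] == "adj" and c["lemma"] in MOUTHING:
            pending = MOUTHING[c["lemma"]]   # drop the chunk, remember the feature
        elif c["cat"] == "noun":
            head = dict(c)
            if pending:
                head["mouthing"] = pending   # non-manual feature on the head
                pending = None
            out.append(head)
        else:
            out.append(c)
    return out
```

The linear adjective of the Greek source thus disappears from the sign string and survives only as a simultaneous non-manual layer, which is exactly the multilayer behaviour the mapping module must produce.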
[Figure 4: pipeline diagram showing Greek parsed chunk input (Shallow Parsing) feeding the Greek-to-GSL mapping, which feeds the GSL synthesis and the VRML signer, drawing on the GSL lexicon and GSL rules.]

Figure 4.  Schematic presentation of the Greek-to-GSL conversion tool.

We exemplify sentence-level mapping by the handling of a Greek predicate with an empty pronominal subject generating a double deictic pronoun subject in GSL. Under this rule, sentential strings such as ‘eat(-I)’ are mapped onto a GSL structure that results in the string ‘i-eat-i’, an instantiation of GSL subject doubling. This string is the preferred grammatical realisation for the construction of the predicate eat in this situation, as shown in a study on the instantiations of the phenomenon within the GSL Corpus (GSLC; Efthimiou & Fotinea, in press). The rule for mapping the relevant chunks is presented in Figure 5. The chunks of the verb group (vg) on the left-hand side are the output of the shallow parser for Greek, while the corresponding verb group for GSL is shown on the right-hand side. The rule generates positions for the deictic pronoun that serves as subject and that has to be signed by the virtual signer in a grammatical signed utterance.

Input (shallow parser output for Greek):
[cl [vg *sing τρώω W *Pr01Sg* vb_sg *sing /vg] [*XP] /cl]

Output (GSL structure with doubled deictic pronoun):
[cl [Pr_deictic εγώ *Pr01Sg* /Pr_nm] [vg *sing τρώω W *Pr01Sg* vb_sg [Pr_deictic εγώ *Pr01Sg* /Pr_nm] *sing /vg] [*XP] /cl]

Figure 5.  Output example of the empty pronominal subject mapping with doubled deictic pronoun.

The right-hand side of the mapping rules is identical to the structural rule content of the GSL Synthesis module. This module contains all structural representations that correspond to the rules which generate core GSL sentences. In order to provide input to the virtual signer, it interacts with the sign lexicon database and the library of features that define avatar motion under various conditions. In the case of the ‘nice apple’ example described above, the chunk description provides the information necessary for signing the nominal head, while adding in parallel the mouthing gesture corresponding to the modifier (adjective). In order for the avatar to perform this example, the module reads the corresponding HamNoSys notation as well as the mouthing gesture from the library. The conversion procedure from written text to GSL structure representation aims to provide controlled input to the 3D sign generation, the technological background of which is presented below.
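The subject-doubling rule can likewise be sketched as a chunk transformation. This is an illustrative simplification of the rule shown in Figure 5, with hypothetical labels (`Pr_deictic`, `agr`, etc.) loosely echoing the rule notation rather than reproducing the system's actual rule format.

```python
# Illustrative sketch of the subject-doubling rule: a Greek verb group with
# an empty (pro-dropped) 1sg subject is mapped onto a GSL clause in which a
# deictic pronoun is signed both before and inside the verb group,
# yielding 'i-eat-i'. All labels are hypothetical stand-ins.

def double_subject(vg_chunk):
    person = vg_chunk["agr"]                      # e.g. 'Pr01Sg'
    pron = {"cat": "Pr_deictic", "lemma": "I", "agr": person}
    # the deictic pronoun precedes the verb group and is repeated after
    # the verb, giving the doubled-subject GSL structure
    return [pron, {"cat": "vg", "verb": vg_chunk["lemma"],
                   "agr": person, "post": pron}]

clause = double_subject({"cat": "vg", "lemma": "eat", "agr": "Pr01Sg"})
signed = [clause[0]["lemma"], clause[1]["verb"], clause[1]["post"]["lemma"]]
print("-".join(signed).lower())  # -> i-eat-i
```

The agreement value carried by the Greek verb group (here 1sg) is what licenses generating the two pronoun positions, mirroring how the rule in Figure 5 copies *Pr01Sg* into both deictic pronoun slots.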


4.2 3D Sign Performance

The adopted Web 3D technologies for signing by a Virtual Character (VC) use a VRML (Virtual Reality Modeling Language), h-anim compatible model, controlled by the STEP engine (Scripting Technology for Embodied Persona) (Huang, Eliens & Visser 2002). To perform manual features, the HamNoSys-annotated GSL input is decoded and transformed into sequences of scripted commands. To demonstrate the productive capacity of the adopted engine, we discuss the example of GSL plural formation by means of numerical classifiers.4

Plural formation in GSL makes use of a set of rules, the application of which is appropriately marked in our lexicon database. Figure 6 shows the VC signing the GSL sign day, while Figure 7 depicts its numerical plural form two-days. In this case, different lexical coding results in the appropriate VC performance, with a two-finger handshape used to perform the basic sign movement, instead of the default straight-index-finger handshape. In Figure 7, the VC is shown in a frontal view to demonstrate the corresponding property of Blaxxun Contact 5 (VRML plug-in), which allows for better perception of this specific sign detail (Kennaway 2001, 2003).

Figure 6.  The GSL sign ‘day’.
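The lexical coding behind the numerical plural can be sketched as follows. This is a hedged illustration under the assumption that the lexicon flags which signs allow numeral incorporation; the field names and handshape labels are invented, not the actual HamNoSys/STEP coding.

```python
# Sketch of the numerical-classifier plural: the lexicon marks which signs
# allow it, and the rule keeps the base movement contour while substituting
# a numeral handshape for the default one. All feature names are
# illustrative placeholders.

LEXICON = {
    "day": {"movement": "arc_downward", "handshape": "index_finger",
            "numeral_plural": True},
}

def numeral_plural(gloss, n):
    entry = LEXICON[gloss]
    if not entry.get("numeral_plural"):
        raise ValueError(f"{gloss} does not allow numeral incorporation")
    # same movement contour, numeral handshape instead of the default
    return {"gloss": f"{n}-{gloss}s", "movement": entry["movement"],
            "handshape": f"{n}_finger"}

sign = numeral_plural("day", 2)
print(sign["gloss"], sign["handshape"])  # -> 2-days 2_finger
```

This mirrors the day / two-days contrast of Figures 6 and 7: the movement component is shared, and only the handshape feature changes in the lexical coding passed to the VC.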

Non-manual features of GSL compose the multilayer information that has to be processed in parallel with the manual phonological sign components, in order to yield a grammatically well-formed sign. For some of these non-manual features, there is no established representation in HamNoSys (version 4); we therefore use the classification by Sutton-Spence & Day (2001). Among non-manual features, head movement, eye gaze, and turning of the torso play a significant role in sign language syntax. In particular, these features may convey specific grammatical meanings on the word or phrase level (e.g. negation, verb agreement, sentential tense, role in discourse, etc.) and they typically follow the movement contour of the hand. The implementation of these features significantly increases the acceptance of the performed sign by natural signers. Again, while head movement can be coded in HamNoSys, eye gaze and turning of the torso are transcribed and represented with internally established code numbers.

Figure 7.  The frontal view of ‘two-days’.

The significance of head movement is evident in discourse situations, where by default the signer faces his/her interlocutor and must use (various) positions in the signing space to place the referent(s) involved in the narration. Hence, when a third person is included in the plot, the signer’s head and gaze are turned towards the specified position to indicate reference to events related to this person. In Figure 8, an example from a narrative is shown which involves two persons other than the interlocutor(s). In the picture pairs (a) and (b), the signer conveys information related to the first absent person, while in pairs (c) and (d) a head turn accompanied by a change in the direction of eye gaze signifies reference to the second absent person involved in the same narrative.

Grammatical information that is realised by means of head movement and eye gaze and coded in the GSL grammar module allows for synthesis of utterances by the avatar at minimal technical cost. For example, for temporal relations expressed by different eye gaze positions, the avatar may assign sentential tense values to the utterances it composes by exploiting the relevant features whenever they are present in the output of the Greek-to-GSL conversion procedure.

4.  See http://www.image.ece.ntua.gr/~gcari/gslv/ for a demonstration of the current sign synthesis machine.
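The parallel layering of manual and non-manual tiers described above can be sketched as a simple data structure. The code numbers and field names below are invented placeholders: the text only states that eye gaze and torso turn use internally established codes, not what those codes are.

```python
# Minimal sketch of layering non-manual features over the manual tier:
# head movement is codable in HamNoSys, while eye gaze and torso turn are
# represented with internal code numbers. All codes below are invented
# placeholders for illustration.

EYE_GAZE_CODES = {"interlocutor": 0, "position_A": 1, "position_B": 2}

def compose_sign(manual, head=None, gaze=None, torso=None):
    """Bundle one sign's manual notation with its parallel non-manual tier."""
    return {
        "manual": manual,                       # HamNoSys string (placeholder)
        "nonmanual": {
            "head": head,                       # codable in HamNoSys
            "gaze": EYE_GAZE_CODES.get(gaze),   # internal code number
            "torso": torso,                     # internal code number
        },
    }

# referring to absent person A (cf. Figure 8a-b): head and gaze turn
# towards position A while the manual sign is performed
sign = compose_sign("HNS:sit-down", head="turn_A", gaze="position_A", torso=1)
print(sign["nonmanual"]["gaze"])  # -> 1
```

The design point is that the non-manual tier travels with each sign rather than being a global setting, so a change of gaze code between two signs is enough to shift reference from person A to person B.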


Figure 8.  (a) Signer positioning the first absent person A in signing space; (b) Signer signing ‘A sits down’; (c) Signer positioning the second absent person B in signing space; (d) Signer signing ‘B sits down’.

5. Sign synthesis in a grammar teaching environment

A prototype educational platform for GSL grammar teaching was used to experiment with feature-based processing for GSL synthesis; in this platform, educational material is presented by a virtual tutor (Efthimiou, Sapountzaki, Karpouzis & Fotinea 2004; Karpouzis, Caridakis, Fotinea & Efthimiou, in press). An electronic handbook of GSL grammar, intended for primary school education, comprises the grammar resources which support the prototype, in combination with educational scenarios generated for the presentation of teaching material (Efthimiou & Fotinea 2007). Lecture topics are organised in units, with initial focus on the theoretical presentation of the educational object, followed by a unit designed to wrap up the knowledge previously presented. The last units of each lecture require full user participation, with pupils undergoing a number of drills similar to those found in standard language teaching environments. These drills may require selection of the correct option from a number of candidates, making links between unordered pairs, indication of non-matching elements, etc.5

5.  For an extensive reference to evaluation procedures of virtual signer performance and the prototype platform, see Sapountzaki, Efthimiou & Fotinea, in press a, b; and Efthimiou, Fotinea & Sapountzaki 2006b.




The resources are organised in a way that allows for the modification of lectures according to the needs of class members, with exercises that can dynamically change content (Efthimiou & Fotinea 2004). Moreover, the sign lexicon database (see Section 3.1) can be used to extract new educational material, thereby supporting the presentation of various grammar topics. The extraction of educational material can be conducted according to specific parameters, either via the database search facilities or by selection of lemmata sorted on specific fields.
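Parameter-driven extraction of drill material from the lexicon database can be sketched as a filter over lemma records. The database fields and entries below are invented for illustration; the actual database schema is described in Section 3.1 and is not reproduced here.

```python
# Hypothetical sketch of extracting educational material from the sign
# lexicon database by grammatical parameters; field names and entries are
# invented for illustration, not the actual database schema.

LEXICON_DB = [
    {"lemma": "day",   "category": "noun", "numeral_plural": True},
    {"lemma": "apple", "category": "noun", "numeral_plural": False},
    {"lemma": "eat",   "category": "verb", "numeral_plural": False},
]

def extract_material(**criteria):
    """Select lemmata whose fields match all given parameter values."""
    return [entry["lemma"] for entry in LEXICON_DB
            if all(entry.get(k) == v for k, v in criteria.items())]

# build a drill on numeral-classifier plurals: only signs that allow
# numeral incorporation qualify as exercise items
print(extract_material(category="noun", numeral_plural=True))  # -> ['day']
```

A query like this would let a drill on, say, classifier plurals be regenerated dynamically whenever the lexicon database is enriched, in line with the adaptable-exercise design described above.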

6. Concluding remarks: Potentials and limitations

Sign language technologies, as a branch of NLP, are heavily dependent on the amount of available resources and on available technological solutions. Thus, the enrichment of the sign lexicon databases and computational grammars that are utilised in conversion and synthesis systems may crucially enhance performance in terms of coverage, as well as naturalness of signing. The problem of limited resources is generally acknowledged as a major obstacle to developing language-based applications of wide scope, restricting sign representation capacity to well-defined language domains. Constant enrichment of resources is the only way to reach a point where the handling of sign data will result in acceptable performance with close to unrestricted input. In this direction, sign language corpora may contribute significantly to sign language research, both for linguistic model extraction and for the validation of sign language NLP modules. Among relevant NLP modules, conversion tools are of utmost importance, since the development of adequate translation tools between written/spoken and sign languages is an essential aspect of communication between deaf and hearing communities.

However, current signing avatar implementations face well-known technical limitations concerning the degree of naturalness of movement and the precision of non-manual feature performance, as has been reported by all groups involved in related research. In this respect, a number of questions remain open. For instance, problems such as producing cyclic and wavy movements and achieving shoulder independence from the torso, while preserving finger-detailed performance and facial parameters, have not yet been addressed. Nevertheless, the use of sign synthesis machines is absolutely necessary for the development of systems and tools for Human-Computer Interaction, for dynamic real-time information exchange, Web communication, and services accessible to the Deaf.
Video, though the best means to preserve naturalness, is unable to capture and communicate rapidly changing digital information content. Besides, acceptability measurements of signing avatar performance have been reported (see, for example, Morimoto, Kurokawa, Isobe & Miyashita 2004) that provide evidence in favour of introducing signing avatars in everyday information exchange environments that would otherwise require human interpreters to bridge communication between deaf and hearing individuals.

Acknowledgements

This work is partially funded by the European Union, in the framework of the Greek national project grants SYNNENOESE (GSRT: eLearning 44) and DIANOEMA (GSRT: M3.3, id 35).

References

Antzakas, Klimis. 2006. The use of negative head movements in Greek Sign Language. In U. Zeshan (ed.), Interrogative and negative constructions in sign languages, 258–269. Nijmegen: Ishara Press.

Antzakas, Klimis & Bencie Woll. 2002. Head movements and negation in Greek Sign Language. In I. Wachsmuth & T. Sowa (eds.), Gesture and sign language in human-computer interaction, 193–196. Berlin: Springer.

Bellugi, Ursula & Susan Fischer. 1972. A comparison of sign language and spoken language. Cognition 1, 173–200.

Berenz, Norine. 2002. Insights into person deixis. Sign Language & Linguistics 5(2), 203–227.

Blaxxun Contact 5. 2004. http://www.blaxxun.com/en/products/contact/

Boutsis, Sotiris, Prokopis Prokopidis, Voula Giouli & Stelios Piperidis. 2000. A robust parser for unrestricted Greek text. In Proceedings of the 2nd Language Resources and Evaluation Conference, Athens, 467–474.

Boyes Braem, Penny & Rachel Sutton-Spence (eds.). 2001. The hands are the head of the mouth: The mouth as articulator in sign languages. Hamburg: Signum.

Carpenter, Bob. 1992. The logic of typed feature structures. Cambridge: Cambridge University Press.

Efthimiou, Eleni. In press. Processing cumulative morphology information in GSL: The case of pronominal reference in a three-dimensional morphological system. To appear in Volume in honour of G. Babiniotis, National and Kapodistrian University of Athens.

Efthimiou, Eleni & Stavroula-Evita Fotinea. 2004. An adaptation-based technique on current educational tools to support universal access: The case of a GSL e-Learning platform. In Proceedings of the TeL’04 Workshop on Technology Enhanced Learning, Toulouse, France, 177–186.

Efthimiou, Eleni & Stavroula-Evita Fotinea. 2007. An environment for deaf accessibility to educational content. In Proceedings of the First International Conference on Information and Communication Technology and Accessibility, Hammamet, Tunisia, 125–130.

Efthimiou, Eleni & Stavroula-Evita Fotinea. In press. GSLC: Creation and annotation of a Greek Sign Language corpus for HCI. In Proceedings of the 12th International Conference on Human-Computer Interaction, Beijing, P.R. China.




Efthimiou, Eleni, Stavroula-Evita Fotinea & Galini Sapountzaki. 2006a. Processing linguistic data for GSL structure representation. In Proceedings of the Workshop on the Representation and Processing of Sign Languages: Lexicographic matters and didactic scenarios, Genova, Italy, 49–54.

Efthimiou, Eleni, Stavroula-Evita Fotinea & Galini Sapountzaki. 2006b. E-accessibility to educational content for the deaf. EURODL (European Journal of Open and Distance Learning) 2006/II. Electronically available since 15.12.06 at: http://www.eurodl.org/materials/contrib/2006/Eleni_Efthimiou.htm.

Efthimiou, Eleni & Marianna Katsoyannou. 2001. Research issues on GSL: A study of vocabulary and lexicon creation. Studies in Greek Linguistics 2, 42–50 (in Greek).

Efthimiou, Eleni, Galini Sapountzaki, Costas Karpouzis & Stavroula-Evita Fotinea. 2004. Developing an e-Learning platform for the Greek Sign Language. Lecture Notes in Computer Science 3118, 1107–1113.

Efthimiou, Eleni, Anna Vacalopoulou, Stavroula-Evita Fotinea & Gregory Steinhauer. 2004. A multipurpose design and creation of GSL dictionaries. In Proceedings of the Workshop on the Representation and Processing of Sign Languages “From SignWriting to Image Processing: Information techniques and their implications for teaching, documentation and communication”, Lisbon, Portugal, 51–58.

Fernald, Theodore B. & Donna J. Napoli. 2000. Exploitation of morphological possibilities in signed languages: Comparison of American Sign Language with English. Sign Language & Linguistics 3(1), 3–58.

Fischer, Susan D. 1996. The role of agreement and auxiliaries in sign language. Lingua 98, 103–119.

Fischer, Susan D. & Bonnie Gough. 1979. Verbs in American Sign Language. In E.S. Klima & U. Bellugi (eds.), The signs of language. Cambridge, MA: Harvard University Press.

Fotinea, Stavroula-Evita, Eleni Efthimiou, Costas Karpouzis & George Caridakis. 2005. Dynamic GSL synthesis to support access to e-content. In Proceedings of the 3rd International Conference on Universal Access in Human-Computer Interaction (UAHCI 2005), Las Vegas, Nevada, USA.

Fotinea, Stavroula-Evita, Eleni Efthimiou, Costas Karpouzis & George Caridakis. In press. A knowledge-based sign synthesis architecture. In E. Efthimiou, S.-E. Fotinea & J. Glauert (eds.), Universal Access in the Information Society (Special issue on emerging technologies for deaf accessibility in the information society).

Fotinea, Stavroula-Evita, Eleni Efthimiou & Dimitris Kouremenos. 2005. Generating linguistic content for Greek to GSL conversion. In Proceedings of the 7th Hellenic European Conference on Computer Mathematics & its Applications (HERCMA 2005), Athens, Greece.

Huang, Zhisheng, Anton Eliens & Cees Visser. 2002. STEP: A scripting language for embodied agents. In Proceedings of the Workshop on Lifelike Animated Agents 2002.

Huenerfauth, Matt. In press. Misconceptions, technical challenges, and new technologies for generating American Sign Language animation. Universal Access in the Information Society (Special issue on emerging technologies for deaf accessibility in the information society).

ISO/IEC 19774: Humanoid Animation (H-Anim). http://www.h-anim.org.

Karpouzis, Costas, George Caridakis, Stavroula-Evita Fotinea & Eleni Efthimiou. In press. Educational resources and implementation of a Greek Sign Language synthesis architecture. Computers and Education.


Kennaway, Richard. 2001. Synthetic animation of deaf signing gestures. In Proceedings of the International Gesture Workshop, City University, London.

Kennaway, Richard. 2003. Experience with, and requirements for, a gesture description language for synthetic animation. In Proceedings of the 5th International Workshop on Gesture and Sign Language based Human-Computer Interaction, Genova.

Kyle, Jim G. & Bencie Woll. 1985. Sign language: The study of deaf people and their language. Cambridge: Cambridge University Press.

Lancaster, G., K. Alkoby, J. Campen, R. Carter, M.J. Davidson, D. Ethridge, J. Furst, D. Hinkle, B. Kroll, R. Layesa, B. Loeding, J. McDonald, N. Ougouag, J. Schnepp, L. Smallwood, P. Srinivasan, J. Toro & R. Wolfe. 2003. Voice activated display of American Sign Language for airport security. In Technology and Persons with Disabilities Conference 2003, California State University at Northridge, Los Angeles, CA, March 2003.

Liddell, Scott. 1980. American Sign Language syntax. The Hague: Mouton.

Marshall, Ian & Eva Safar. 2002. Sign language synthesis using HPSG. In Proceedings of the Ninth International Conference on Theoretical and Methodological Issues in Machine Translation (TMI), Keihanna, Japan.

Morimoto, K., T. Kurokawa, N. Isobe & J. Miyashita. 2004. Design of computer animation of Japanese Sign Language for hearing-impaired people in stomach X-ray inspection. In K. Miesenberger, J. Klaus, W. Zagler & D. Burger (eds.), Lecture Notes in Computer Science (LNCS) 3118, 1114–1120. Berlin: Springer.

Neidle, Carol. 2002. SignStream™ annotation: Conventions used for the American Sign Language Linguistic Research Project. American Sign Language Linguistic Research Project Report No. 11. Boston, MA: Boston University.

Papaspyrou, Chrissostomos. 1990. Gebärdensprache und universelle Sprachtheorie [Sign language and universal language theory]. Hamburg: Signum.

Prillwitz, Siegmund, Regina Leven, Heiko Zienert, Thomas Hanke & Jan Henning. 1989. HamNoSys, Version 2.0: Hamburg Notation System for Sign Language. An introductory guide. Hamburg: Signum.

Sandler, Wendy & Diane Lillo-Martin. 2006. Sign languages and linguistic universals. Cambridge: Cambridge University Press.

Sapountzaki, Galini. 2003. Auxiliary markers in the verb system of Greek Sign Language. In Proceedings of the 6th International Conference for Greek Language, Rethimno, Greece.

Sapountzaki, Galini, Eleni Efthimiou & Stavroula-Evita Fotinea. In press a. Digital technology in Greek Sign Language teaching in primary school curriculum. In Proceedings of the Hellenic Conference on Digital Educational Material: Issues of creation, exploitation and evaluation, Volos, Greece (in Greek).

Sapountzaki, Galini, Eleni Efthimiou & Stavroula-Evita Fotinea. In press b. e-Education in the Greek Sign Language using an Internet based platform. In Proceedings of the Workshop: e-Learning using ICT, Lamia, Greece (in Greek).

Shieber, Stuart M. 1992. Constraint-based grammar formalisms. Cambridge, MA: MIT Press.

Stokoe, William. 1978. Sign language structure (revised ed.). Silver Spring, MD: Linstok Press.

Stokoe, William & Rolf Kuschel. 1979. Field guide for sign language research. Silver Spring, MD: Linstok Press.

Sutton-Spence, Rachel & Linda Day. 2001. Mouthings and mouth gestures in British Sign Language. In P. Boyes Braem & R. Sutton-Spence (eds.), The hands are the head of the mouth: The mouth as articulator in sign languages, 69–86. Hamburg: Signum.




Sutton-Spence, Rachel & Bencie Woll. 1999. The linguistics of British Sign Language: An introduction. Cambridge: Cambridge University Press.

Thompson, Robin, Karen Emmorey & Robert Kluender. 2006. The relationship between eye gaze and verb agreement in American Sign Language: An eye-tracking study. Natural Language & Linguistic Theory 24(2), 571–604.

Valli, Clayton & Ceil Lucas. 1995. Linguistics of American Sign Language (2nd ed.). Washington, DC: Gallaudet University Press.

Zeshan, Ulrike. 2004. Hand, head and face: Negative constructions in sign languages. Linguistic Typology 8(1), 1–58.

Zhao, Liwei, Karin Kipper, William Schuler, Christian Vogler, Norman Badler & Martha Palmer. 2000. A machine translation system from English to American Sign Language. Lecture Notes in Computer Science 1934 (Proceedings of the Association for Machine Translation in the Americas 2000), 54–67.

Zijl, Lynette van & Andries Combrink. 2006. The South African Sign Language Machine Translation Project: Issues on non-manual sign generation. In Proceedings of SAICSIT06, Somerset West, South Africa, 127–134.

Authors’ addresses

Eleni Efthimiou & Stavroula-Evita Fotinea
Institute for Language and Speech Processing
Artemidos 6 & Epidavrou
GR 15125, Athens
Greece
[email protected]
[email protected]

Galini Sapountzaki
Dept. of Special Education
Greek Sign Language Lab
University of Thessaly
Argonauton & Filellinon
GR 38221, Volos
Greece
[email protected]
