Coding Hand Gestures: A Reliable Taxonomy and a Multi-media Support

Fridanna Maricchiolo (1), Augusto Gnisci (2), and Marino Bonaiuto (3)

(1) Dipartimento di Studi dei Processi Formativi, Culturali e Interculturali nella Società Contemporanea, Università degli Studi Roma Tre, via Milazzo 11b, 00185 Rome, Italy
(2) Dipartimento di Psicologia, Seconda Università degli Studi di Napoli, Italy
(3) Dipartimento di Psicologia dei Processi di Sviluppo e Socializzazione, Sapienza Università di Roma, Italy

[email protected]
[email protected]
[email protected]
Abstract. A taxonomy of hand gestures and a digital tool (CodGest) are proposed in order to describe the different types of gesture used by speakers during speech in different social contexts. It is an exhaustive and mutually exclusive category system, meant to be shared within the scientific community for studying multimodal signals and their contribution to interaction. Classical taxonomies from the gesture literature were integrated into a comprehensive taxonomy, which was tested in five different social contexts; its reliability across them was measured through inter-observer agreement indexes. A multi-media tool was realized as a digital support for coding gestures in observational research.

Keywords: coding gesture, multi-media tool, reliability, observational research.
1 Introduction
Research on hand gesture falls within the behavioural analysis of interactions. Behavioural studies often involve observational methods, i.e., the adoption and/or adaptation of reliable coding systems shared by the scientific community [1]. Nevertheless, within the gesture literature, a number of gesture types and different classifications of them have been offered (e.g., [2-5]; Wundt, 1921/1973, in [6]), along different dimensions and criteria. The focus of this article is to propose a reliable coding system, based on a taxonomy of hand gestures, for the observation of social interaction within different social contexts and situations. To this end, a multi-media manual is proposed as coding support. Up to now, there is no universally shared category system for hand gestures. Many authors have described differently the variety of ways in which gestures occur and are used by the speaker. These differences are also evident in the criterion (or criteria) at the basis of each taxonomy, which probably depends on the scientific discipline their authors refer to and on the aims they pursue.

A. Esposito et al. (Eds.): Cognitive Behavioural Systems 2011, LNCS 7403, pp. 405–416, 2012. © Springer-Verlag Berlin Heidelberg 2012
Ekman and Friesen [3], using three taxonomic criteria (usage, origin, and coding), distinguished five main categories of gestures: Illustrators, conveying semantic content; Emblems, conventional and cultural signs; Regulators, controlling conversational flow; Emotional displays, expressing emotional states; and Adaptors, contact and manipulation hand movements. Kendon [4], referring to Gesticulation (gesture that is incomplete without speech accompaniment), used a continuum criterion: from Spontaneous Gesticulation, through Language-slotted, Pantomime, and Emblems, to Signs. McNeill [5] distinguishes gestures belonging to the ideation process (propositional gestures, representing linguistic referents: iconics, metaphorics, and deictics) from gestures characterizing discursive activity (non-propositional gestures: cohesives and beats). Bavelas and colleagues [2] distinguish Topic gestures, i.e., referential or semantic gestures (similar to illustrators), and Interactive gestures, having an intrinsically interpersonal character (often realized by pointing to the addressee, like deictics). Krauss and colleagues [7] order hand movements along a continuum of lexicalization (level of resemblance or closeness to words): from Symbolic gestures, with a higher degree of lexicalization, through Conversational gestures (distinguished into Lexical gestures, connected with the semantic content of the speech, and Motor movements, coordinated with the prosody of the speech), with a medium degree of lexicalization, down to Adapters (communicatively meaningless and not related to speech), with a lower degree of lexicalization. Poggi [8] distinguishes communicative gestures according to various criteria. One of these is cognitive construction: codified vs. creative gestures. The former are meaningful signs stably represented in memory (e.g., emblems); creative gestures are performed to represent or evoke some referent (e.g., iconic, but also deictic, gestures).
According to the author, creative gestures are motivated, while codified ones, though having an iconic dimension, are often arbitrary. Not all these classifications are exhaustive. Moreover, they often adopt different taxonomic criteria: in some cases there is not a single criterion but more than one [3, 8]; in some cases the categories are not structured into specific sub-categories [2, 8], as they are simplifications of categories described by previous authors [9]. Some contributions only develop gesture annotation schemes [10] that maintain temporal structure and location information for capturing the original gestures and replicating them on an animated character. Allwood and colleagues [11] developed MUMIN, a multimodal annotation scheme dedicated to the study of gestures (hand gestures and facial signals), with particular regard to their functions in feedback, turn management, and sequencing; gestures are also described through features of their shape and dynamics. Classifying a behavioural phenomenon such as hand gesture in a system of mutually exclusive and exhaustive categories is desirable, as is using coding systems shared by the literature and proved to be reliable [12]. Making a taxonomy usable, available, and widespread in the scientific community requires both a conceptual and an operational definition of each specific category, as well as its concrete and exact description. Moreover, establishing a classification system requires an empirical check through behavioural analysis of interactions in which gestures naturally occur, as well as coding of the different specific categories across several real interactive contexts. Then, in order to check whether the proposed categories of gesture are recognized in a reliable way by the coders, it is necessary to compute inter-observer agreement indexes such as Cohen's K [13] or Krippendorff's α [14]. Furthermore,
since nowadays the researcher's work is mainly done through computers, a digital multi-media tool would be useful to make the behavioural analysis and coding job easier and more straightforward. In spite of these arguments, studies on gesture classification do not always include, within a single contribution, these minimal requirements, i.e.: empirical behavioural observation within different social situations, statistical analysis for reliability evaluation, and multi-media supports. As a consequence, the general aim of this contribution is a hand gesture taxonomy and a multi-media tool for coding in observational research. Specific aims are: 1) definition and illustration of the proposed taxonomy and of its gesture categories; 2) establishment of criteria for assessing whether observers trained to the gesture taxonomy code many interactions from different contexts in a reliable way; 3) description of a digital manual for coding gestures. Each of the next three sections addresses one of these aims.
2 A Hand Gesture Taxonomy for Observational Research
The taxonomy presented combines classical gesture categories (mainly, [3, 5]; but also, in some way, [2, 7]). Many authors (e.g., amongst others, [15, 5]) have stressed the importance of gestural-verbal co-occurrence in both theoretical and methodological terms. Following this position, the fundamental criterion of the proposed taxonomy is whether gestures are, or are not, linked to speech. The term "link" here is intended not in terms of gesture-speech "co-occurrence", but with reference either to the "content" of the speech or to the "structure" of the speech. The taxonomy is organized along three hierarchical levels (Fig. 1): macro-, specific, and sub-categories. As reported in Fig. 1, according to the taxonomy, gestures are divided into Speech Linked Gestures (SLG) and Speech Non-linked Gestures (SNG). The SLG macro-category includes "cohesive" [5, 16], "rhythmic" [5, 16], and "ideational" gestures (specific categories). The cohesive category is divided into specific gestures (sub-categories), named according to the specific movement shape traced in the air by the hand(s) (e.g., "weaver", "whirlpool", "nipper", etc.). The ideational category includes "emblems" [3, 15] and "illustrators" [3]; illustrators, in turn, include "iconic", "metaphoric", and "deictic" [16] gestures. SNG are divided into "self-adaptor gestures" [3, 17] and "hetero-adaptor gestures" [3, 18], which can be "person-addressed adaptors" or "object-addressed adaptors". The conceptual definitions (CD) of each gesture category, explaining the theoretical notion (i.e., its correspondence with previously given definitions in the literature), and the operative definitions (OD), explaining observable elements useful for gesture recognition (i.e., how to observe and recognize the gesture), are synthetically reported in the Appendix. Since it is impossible to classify gestures without considering their context and function [8, 19], such indications are given in the ODs.
Keeping the names long established in the literature is probably the best way to share the present work within the scientific community and to connect it with previous research.
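As a rough illustration, the three-level coding tree of Fig. 1 can be represented as a nested data structure. The sketch below is our own illustration (the dictionary layout and function name are assumptions, not part of the taxonomy itself); it enumerates the most specific codable categories.

```python
# Sketch: the three-level coding tree (Fig. 1) as a nested dict.
# Category names follow the taxonomy; the structure itself is an assumption.
TAXONOMY = {
    "SLG": {  # Speech Linked Gestures
        "cohesive": ["weaver", "whirlpool", "nipper"],  # named by movement shape
        "rhythmic": [],
        "ideational": {
            "emblem": [],
            "illustrator": ["iconic", "metaphoric", "deictic"],
        },
    },
    "SNG": {  # Speech Non-linked Gestures
        "self-adaptor": [],
        "hetero-adaptor": ["person-addressed", "object-addressed"],
    },
}

def leaf_categories(node):
    """Return the most specific codable categories under a node."""
    if isinstance(node, dict):
        leaves = []
        for name, child in node.items():
            sub = leaf_categories(child)
            leaves.extend(sub if sub else [name])
        return leaves
    return list(node)

print(leaf_categories(TAXONOMY))
```

A coder assigns each observed gesture to one of these leaves, which keeps the system mutually exclusive at the most specific level while preserving the macro- and specific-category groupings.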
Fig. 1. Coding tree of the proposed taxonomy
3 Evaluating Reliability in Different Interactive Contexts
Hand gestures performed by persons interacting within different social contexts were observed and coded, according to the above gesture category system, by trained coders. The sample of video-recorded social contexts was selected from the archives of videotapes collected over many years at the Social Psychology Laboratory (Department of Psychology of Development and Socialization Processes, Sapienza University of Rome). The following natural contexts (broadcast by Italian TV) were considered: 1) television political interviews: two Italian political leaders of opposite alliances (Silvio Berlusconi and Francesco Rutelli, N=2), interviewed separately during the campaign for the 2001 Italian political elections; 2) courtroom examinations: a prosecution witness (N=1) testifying against the man suspected of having killed her adoptive son. The following laboratory contexts were considered: 3) simulation of small group discussion: four subjects (N=4) role-played members of an advisory group who had to discuss two "business cases" and find one unanimous written solution for each case [20]; 4) simulations of dyadic discussion: two subjects (N=2) had the same task as in the group simulation (see above); 5) simulations of examination: five subjects (N=5) were individually interrogated by a confederate within a simulated context, with the task of answering truthfully in one part and lying in another [21]. All the subjects in the simulations were blind to the objectives of this research. In order to have comparable data across the settings, only 20 minutes from each context
were selected, for a total of 100 minutes. Interactions were transcribed with an adapted conversational analysis system [22], integrated with alternate lines in which gestures were marked. The observer annotated gestures directly in the transcripts of the speech of the interactions while watching the videotapes. Speech-gesture synchrony observation leads to a degree of accuracy that permits assessment of how meaningful gestural movements co-occur with speech, syllable by syllable [19]. The segmentation (beginning and end) of each gesture was marked with square brackets as an overlap on the verbal transcripts. In this way it was possible to segment gestures by anchoring them to the speech and to fix all the co-occurrences between spoken language and gestures. Hand gestures produced by the speakers during the videotaped interactions were coded according to the taxonomy categories. A coder "blind" to the research aims had previously been trained by an expert coder (the first author of this chapter) in the use of the coding system and in the identification of each gesture category. Under her supervision, the observer examined various videotapes (working on a different sample of transcripts) and was trained to recognize all the different categories of the taxonomy. Only when the coder showed herself capable of coding gestures according to the coding system did she begin to code the whole material selected for the empirical test of the hand gesture taxonomy. Having finished the coding, the observer checked her own codings by observing all the video-recorded material again. Any remaining ambiguous case was resolved in a discussion with her coding supervisor. The frequency and percentage of each gesture category in each context is shown in Table 1.
Table 1. Amount and percentage (of context total) of gestures observed within five different contexts for each specific gesture category of the proposed taxonomy

Category        Political    Courtroom    Group        Dyads        Sim. exam.   Total
                f      %     f      %     f      %     f      %     f      %     f      %
Cohesive        140   20.6   54    10.5   205   20.0   170   24.0   235   22.9   804   20.3
Rhythmical      70    10.3   93    18.1   138   13.4   30     4.2   59     5.8   390    9.9
Emblem          19     2.8   106   20.6   83     8.1   48     6.8   76     7.4   332    8.4
Iconic          2      0.3   5      1.0   2      0.2   3      0.4   58     5.7   70     1.8
Metaphoric      108   15.9   90    17.5   207   20.2   61     8.6   141   13.8   607   15.3
Deictic         266   39.1   102   19.8   159   15.5   113   15.9   137   13.4   777   19.6
Illustrator     376   55.3   197   38.2   368   35.8   177   25.0   336   32.8   1454  36.8
Self-adaptor    16     2.4   38     7.4   94     9.1   215   30.3   268   26.2   631   15.9
Object-ad.      59     8.7   29     5.6   137   13.3   68     9.6   50     4.9   343    8.7
Person-ad.      0      0     0      0     2      0.2   2      0.3   0      0     4      0.1
Tot. adaptor    75    11.0   65    12.6   233   22.7   284   40.0   318   31.1   975   24.6
Total           680   100    515   100    1027  100    709   100    1024  100    3955  100
Results show that almost all the gesture categories of the taxonomy are present in each social context. A certain differentiation can be noted in the use of hand gestures across the different social situations: some gesture categories occur more often in particular social contexts than in others. But some categories (i.e., illustrators) are used more often than all the other categories across all the interaction contexts.
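The square-bracket segmentation of gestures over the verbal transcript can also be handled programmatically. The sketch below assumes a hypothetical inline convention, [gesture span]{category code}, which is our own illustration and not the transcription format actually used in this study:

```python
import re

# Hypothetical transcript line: gesture spans in square brackets, each
# followed by a taxonomy code in braces (an assumed convention).
LINE = "we must [find a solution]{deictic} before [the end]{rhythmic} of the day"

def extract_gestures(line):
    """Return (gesture span, category code) pairs from one transcript line."""
    return re.findall(r"\[([^\]]+)\]\{([\w-]+)\}", line)

print(extract_gestures(LINE))
```

Each extracted pair anchors a coded gesture to the co-occurring stretch of speech, which is the information needed to fix speech-gesture co-occurrences.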
The results also demonstrate that it is possible to recognize and code all the hand gestures observed during any kind of interaction using the gesture categories provided by the taxonomy: the category system is exhaustive. To reach more efficient outcomes, especially in terms of mutual exclusiveness, it was necessary to evaluate the reliability of the coding system used for these observations. For this purpose, inter-rater agreement indexes on the general coding system were calculated for each context. Systematic observation and coding of hand gestures were also carried out by another independent observer, separately trained to use the coding system. The two observers (O1 and O2) separately and independently coded the whole video-recorded sampled material, according to the procedure described above. Statistical analyses were carried out to evaluate the taxonomy's reliability, measuring the concordance between the two independent observers across the different contexts. The percentage of agreement on gesture segmentation and the inter-rater agreement on the whole coding system, overall and separately for each context, were calculated by means of Cohen's K [13] (using the software ComKappa [23]). According to [1], Cohen's K > .75 is set as the threshold for coding reliability. Given that the length of each interaction is the same (20 minutes), simple frequencies of each coding category were used as the unit of analysis. The percentage of agreement between O1 and O2 on the recognition of a gesture (reliability on unitizing) is 91.6%, thus excellent [24]. The total agreement on the whole coding system is K = .82, which can be considered good (K > .75). The K indexes were also calculated separately for each specific gesture category and for the whole gesture category system in each observed context.
The minimum value of K is .75, for each of the eight specific gesture categories and for the whole system in each of the five contexts considered. The gestures most reliably identified by coders are Iconics and Self-adaptors (K = .94), while the least reliably identified are Rhythmics and Metaphorics (K = .75). Similarly, for the context "Simulated interrogatory" the K index turns out to be the lowest; nevertheless, it is still acceptable even according to a conservative standard such as Bakeman and Gottman's [1]. Results as a whole demonstrate that the categories of the proposed taxonomy can be recognized, discriminated, and identified in a reliable way in different interaction contexts. Since the intercultural debate on gestures focuses on the culture-specificity of gesture [25], we have taken a first step toward an inter-cultural validation of the taxonomy. The presented taxonomy was tested in dyadic interactions (two young women talking about life and work problems) among members of a culture very different from the Italian one in many respects (economic, religious, social, ethnic, and educational conditions): women in Burkina Faso. Moreover, these persons interacted using two different languages at two different moments [26, 27]. Further publications reporting full details of that research are in progress. The gestures of the Burkinabe participants were compared to those of Italians (in the same type of conversation), and the same gesture categories of the taxonomy (though in different amounts) were also observed in the non-Italian participants. Such a test suggests that this system is based on abstract categories and not on culture-specific functions. Across cultures, however, gesturing varies in the frequency of each category, in the amplitude of movement, in the use of space, and in the types and meanings of emblems [25].
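Cohen's K, used above for the reliability checks, compares observed agreement between two coders with the agreement expected by chance. A minimal sketch follows, with invented coder data; the analyses in this chapter were actually run with the ComKappa software [23].

```python
from collections import Counter

def cohen_kappa(coder1, coder2):
    """Cohen's K for two coders' nominal codings of the same units."""
    n = len(coder1)
    observed = sum(a == b for a, b in zip(coder1, coder2)) / n
    c1, c2 = Counter(coder1), Counter(coder2)
    # chance agreement from the coders' marginal category frequencies
    expected = sum(c1[cat] * c2[cat] for cat in c1) / (n * n)
    return (observed - expected) / (1 - expected)

# invented example: two observers' codings of six segmented gestures
o1 = ["iconic", "deictic", "rhythmic", "iconic", "self-adaptor", "deictic"]
o2 = ["iconic", "deictic", "rhythmic", "emblem", "self-adaptor", "deictic"]
print(round(cohen_kappa(o1, o2), 2))  # → 0.79, above the .75 threshold [1]
```

Because K subtracts chance agreement, it is more conservative than raw percentage agreement, which is why it is preferred for testing coding reliability.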
4 Multimedia Tool for Coding Hand Gestures - CodGest
A multi-media manual (presently only in Italian), called CodGest, has been realized as a tool to describe the gesture taxonomy, to support observational coding, and to make it shareable with other researchers. Upon request, the CodGest manual is available from the authors. This instrument offers audiovisual support for gesture study and, in particular, for learning and using the gesture category system, in order to make the taxonomy easier to consult and use during observation. The tool is developed on a digital interactive multimedia support: normal text is thus integrated with important information in the form of images as well as audio-video clips reproducing speakers performing hand gestures. Texts, images, and videos were assembled through Macromedia Flash MX. CodGest is composed as follows: a) a home page and a brief theoretical introduction, with hypertext, summarizing the salient points and basic principles on which the taxonomy is based, with bibliographical references; b) a brief paragraph for each gesture category, including both a conceptual and an operational definition, as well as a verbal description of the shape and movement(s) performed by the hand(s) (see Appendix); c) three examples ("ideal", "prototypical", "problematic") of each gesture category, in video images, one taken from ad hoc videos ("ideal" examples built with actors for this purpose) and two from "field" samples (see Section 3: both "prototypical", i.e., clear, and "problematic", i.e., dubious, examples); d) an example of gesture, for each category, through a three- or four-picture sequence, realized ad hoc in the laboratory and aimed at showing the prototypical shape and movement of the gesture; e) coding notes to facilitate and resolve possible problems in assigning a code to each gesture.
CodGest, referring to the verified category system, was developed in digital format but in compliance with traditional methodological criteria for coding manual realisation [1]: a phenomenon description articulated into conceptual and operative definitions of the different categories, with ideal, real, typical, and problematic examples of them. The advantage is that the digital multi-media interface permits the addition to the text of important information, such as photographic and audio-visual examples: these are fundamental for completely and appropriately understanding any coding system, as well as for sharing it. Some authors, in fact, maintain that in observational research it is desirable for the scientific community to have shared tools at its disposal, allowing researchers to compare their own data and outcomes with those of others [12].
5 Conclusion
The coding system of hand gestures proposed in this study integrates and synthesizes the main existing gesture classifications (in particular, [2, 3, 5, 16]), with the aim of identifying a general taxonomy useful for recognizing and coding hand gestures within a range of social situations. The observation and coding of hand gestures carried out within different contexts, and the consequent identification of the amount of occurrence
for each gesture category in each context, demonstrate the usefulness of the proposed taxonomy for studying hand gestures in different dyadic or group interactions. The inter-observer agreement indexes, calculated in order to measure how reliably the categories of the coding system can be recognized, turn out, in general, to be very satisfactory. These results support the use of this taxonomy as a good tool for coding hand gestures in the study of interaction in various social contexts. Different authors maintain that in observational research it would be desirable for the scientific community to share tools that permit researchers to compare their own data and outcomes with those of others [12]; and this is particularly true for social interaction research, which heavily relies upon observational techniques. In some fields of bodily communication such tools partly exist (e.g., facial expression and the recognition of emotion), while in other fields they are mostly lacking. The multimedia support for hand gesture coding, CodGest, represents a first step in this direction within the field of hand gestures. Such an instrument offers different possibilities, in the form of images, videos, and texts, for knowing and recognizing specific categories of gesture. An important aspect of this study is that the same system of gesture categories has been tested in five different social contexts, contrary to previous studies in the field, which almost exclusively focused on a single category or context each time [2, 28-31]. The high agreement indexes obtained in all the contexts confirm not only that the categories can be recognized in a reliable way, but also the system's appropriateness and effectiveness as a tool for the observation and study of gestures in various social situations, implying some stability in the "kind" of gestures used across different settings of social interaction.
This result lends a degree of generalization to the outcomes obtained here. However, it needs to be submitted to more thorough verification via a generalizability analysis [32], in order to estimate whether and how much this category system discriminates between subjects or between contexts (between-variance) and between gesture categories, rather than between observers (within-variance). Another aspect of generalization and validation is cross-cultural comparison, which has already been carried out by our research team with good outcomes [26, 27]. A further development consists in analyses aimed at checking each category's function through its co-occurrence with specific verbal phenomena and devices [33]. A planned improvement to the digital manual is an exercise section for users, aimed at training them to use the taxonomy and, therefore, at calculating the reliability of their measures with reference to a standard coding protocol. In this way it will be possible to compare the reliability of observers trained through the digital manual with the reliability of observers trained through a traditional method (i.e., via a textual manual only, plus agreement sessions with researchers). Further taxonomic developments should certainly address more detailed issues, such as, for example, the movement components or parameters characterizing each gesture category or sub-category: therefore, more fine-grained analyses could usefully be enclosed within the presented hand gesture coding system [34, 6]. Similarly, the presented specific categories could be enclosed within a higher-order hierarchical level of coding, i.e., more general concepts such as that of "family of gestures" [6]. This means that the presented categories could lie at an intermediate level of abstraction, with the "family of gestures" coding level above, and the level of the single movements, parameters, or phases composing each gesture's single enactment below.
References

1. Bakeman, R., Gottman, J.M.: Observing Interaction: An Introduction to Sequential Analysis, 2nd edn. Cambridge University Press, New York (1997)
2. Bavelas, J.B., Chovil, N., Lawrie, D.A., Wade, A.: Interactive gestures. Discourse Processes 15, 469–489 (1992)
3. Ekman, P., Friesen, W.V.: The repertoire of nonverbal behavior. Semiotica 1, 49–98 (1969)
4. Kendon, A.: Gesture and speech: How they interact. In: Wiemann, J.M., Harrison, R.P. (eds.) Nonverbal Interaction, pp. 13–45. Sage Publications, Beverly Hills (1983)
5. McNeill, D.: Hand and Mind. The University of Chicago Press, Chicago (1992)
6. Kendon, A.: Gesture: Visible Action as Utterance. Cambridge University Press, Cambridge (2004)
7. Krauss, R.M., Chen, Y., Chawla, P.: Nonverbal behavior and nonverbal communication: What do conversational hand gestures tell us? Adv. Exp. Soc. Psychol. 28, 389–450 (1996)
8. Poggi, I.: Iconicity in different types of gestures. Gesture 8, 45–61 (2008)
9. McNeill, D.: So you think gestures are nonverbal? Psychol. Rev. 92, 350–371 (1985)
10. Kipp, M., Neff, M., Albrecht, I.: An annotation scheme for conversational gestures: How to economically capture timing and form. Lang. Resour. Eval. 41(3-4), 325–339 (2007)
11. Allwood, J., Cerrato, L., Jokinen, K., Navarretta, C., Paggio, P.: The MUMIN coding scheme for the annotation of feedback, turn management and sequencing phenomena. Lang. Resour. Eval. 41(3-4), 273–287 (2007)
12. Bakeman, R., Gnisci, A.: Sequential observational methods. In: Eid, M., Diener, E. (eds.) Handbook of Multimethod Measurement in Psychology, pp. 127–140. American Psychological Association, Washington (2006)
13. Cohen, J.: A coefficient of agreement for nominal scales. Educ. Psychol. Meas. 20, 37–46 (1960)
14. Krippendorff, K.: Estimating the reliability, systematic error, and random error of interval data. Educ. Psychol. Meas. 30, 61–70 (1970)
15.
Kendon, A.: Gestures as illocutionary and discourse structure markers in Southern Italian conversation. J. Pragmatics 23, 247–279 (1995)
16. McNeill, D., Levy, E.T.: Cohesion and gesture. Discourse Processes 16, 363–386 (1993)
17. Rosenfeld, H.M.: Instrumental affiliative functions of facial and gestural expressions. J. Pers. Soc. Psychol. 4, 65–72 (1966)
18. Edelmann, R.J., Hampson, S.: Embarrassment in dyadic interaction. Soc. Behav. Personal. 9, 171–178 (1981)
19. Duncan, S.: Coding manual (2004). Technical report, available from http://www.mcneilllab.uchicago.edu
20. Maricchiolo, F., Livi, S., Bonaiuto, M., Gnisci, A.: Hand gestures and perceived influence in small group interaction. Span. J. Psychol. 14, 755–764 (2011)
21. Caso, L., Maricchiolo, F., Bonaiuto, M., Vrij, A., Mann, S.: The impact of deception and suspicion on different hand movements. J. Nonverbal Behav. 30, 1–19 (2006)
22. Jefferson, G.: On the interactional unpackaging of a "Gloss". Lang. Soc. 14, 435–466 (1985)
23. Robinson, B.F., Bakeman, R.: ComKappa: A Windows 95 program for calculating kappa and related statistics. Behav. Res. Meth. Ins. C. 30, 731–732 (1998)
24. Fleiss, J.L.: Measuring nominal scale agreement among many raters. Psychol. Bull. 76, 378–382 (1971)
25. Kita, S.: Cross-cultural variation of speech-accompanying gesture: A review. Lang. Cognitive Proc. 24, 145–167 (2009)
26. Bonaiuto, M., Gnisci, A., Maricchiolo, F.: Struttura e funzioni dei gesti delle mani durante la conversazione: nodi concettuali e possibili sviluppi futuri (Structure and functions of hand gestures during conversation: conceptual issues and possible future developments). Presentation at "Gesture in the Mediterranean: Recent Research in Southern Europe", Procida, October 21-23 (2005)
27. Bonaiuto, M., Maricchiolo, F., Orlacchio, T.: Cultura e gestualità delle mani durante la conversazione: un confronto tra donne native dell'Italia e del Burkina Faso (Culture and hand gestures during conversation: a comparison between native women of Italy and of Burkina Faso). Presentation at the workshop "Intersoggettività. Identità e Cultura", Urbino, October 14-15 (2005)
28. Beattie, G., Shovelton, H.: Iconic hand gestures and the predictability of words in context in spontaneous speech. Brit. J. Psychol. 91, 473–492 (2000)
29. Beattie, G., Shovelton, H.: An experimental investigation of some properties of individual iconic gestures that mediate their communicative power. Brit. J. Psychol. 93, 179–192 (2002)
30. Contento, S., Stame, S.: Déixis verbale et non verbale dans la construction de l'espace interpersonnel (Verbal and nonverbal deixis in the construction of interpersonal space). Dialogue Analisis 5, 427–433 (1997)
31. Feyereisen, P., Havard, I.: Mental imagery and production of hand gestures while speaking in younger and older adults. J. Nonverbal Behav. 23, 153–171 (1999)
32. Cronbach, L.J., Gleser, G.C., Nanda, H., Rajaratnam, N.: The dependability of behavioural measurements: Theory of generalizability for scores and profiles. Wiley, New York (1972)
33. Maricchiolo, F., Bonaiuto, M., Gnisci, A.: Hand gestures in speech: studies on their roles in social interaction. In: Mondada, L. (ed.) Proceedings of the 2nd ISGS Conference, Interacting Bodies, Lyon, France, June 15-18. Ecole Normale Supérieure Lettres et Sciences humaines, Lyon (2007)
34. Calbris, G.: Elements of Meaning in Gesture. John Benjamins Publishing, Amsterdam (2011)
35.
Wiener, M., Devoe, S., Rubinow, S., Geller, J.: Nonverbal behavior and nonverbal communication. Psychol. Rev. 79, 185–214 (1972)
36. Barakat, R.: Arabic gestures. J. Pop. Cult. 6, 749–792 (1973)
37. De Jorio, A.: La mimica degli antichi investigata nel gestire napoletano (The mimicry of the ancients investigated through Neapolitan gesturing). Fibreno, Napoli (1832)
38. Ricci Bitti, P.E., Poggi, I.A.: Symbolic nonverbal behavior: Talking through gestures. In: Feldman, R.S., Rimé, B. (eds.) Fundamentals of Nonverbal Behavior, pp. 433–457. Cambridge University Press, New York (1991)
39. Morris, D.: Manwatching: A Field Guide to Human Behavior. Andromeda Oxford Limited and Jonathan Cape Limited, London (1977)
40. Haviland, J.B.: Pointing, gesture spaces, and mental maps. In: McNeill, D. (ed.) Language and Gesture, pp. 13–46. Cambridge University Press, Cambridge (2002)
41. Rosenfeld, H.M.: Instrumental affiliative functions of facial and gestural expressions. J. Pers. Soc. Psychol. 4, 65–72 (1966)
42. Freedman, N., Hoffman, S.P.: Kinetic behavior in altered clinical states: approach to objective analysis of motor behavior during clinical interviews. Percept. Motor Skill. 24, 527–539 (1967)
43. Kimura, D.: Manual activity during speaking. Neuropsychologia 11, 45–55 (1973)
44. Bull, P., Connelly, G.: Body movement and emphasis in speech. J. Nonverbal Behav. 9, 169–187 (1985)
6 Appendix
Speech Linked Gestures (SLG). CD: these gestures are performed during speech. The presence of concurrent verbal discourse is a necessary but not sufficient condition for their use. OD: these gestures have a semantic, referential, or structural link to the speech. The sound, word, or verbal utterance to which these gestures are linked is not strictly synchronous with the gesture: words can slightly precede or follow the concurrent gesture (or be omitted).

Cohesive Gestures. CD: cohesive gestures [9] refer to utterance structure, creating linkages across narrative texts: they are linked to the syntactic aspects of the spoken utterance that determine its structure. OD: cohesive gestures are repetitive, similar hand movements [16] performed in the same place and with the same shape (each single type having its own idiosyncratic shape, e.g., circular, forward-backward, or right-left hand movements). Each sub-category of cohesive gesture is named after the specific shape of the movement in the air: for example, "Weaving" (Matassa in Italian), in which both hands move horizontally, as if they were weaving.

Rhythmic Gestures. CD: these gestures do not refer to the actual speech content but to prosodic aspects of the verbal utterance. OD: they are rhythmical, pulsing hand/finger movements (up-down, right-left) in time with the co-occurring vocal peak. They are repeated along with the rhythmical pulsation and stress of the speech.

Ideational Gestures. CD: these hand movements are related to the semantic content of the speech they accompany. OD: in such gestures, the hands perform movements whose shape explicitly refers to (indicating or representing) concrete or abstract content(s) expressed in the concurrent speech.

Emblems [3].
CD: emblems, also called autonomous gestures [4], conventional gestures [15], formal pantomimic gestures [35], and semiotic gestures [36], are probably the first kind of gestures to have been treated systematically and scientifically, e.g., [37]. They include all symbolic gestures [38] whose specific meaning is widely culturally shared. The same emblem can have different meanings in different cultures; nevertheless, some hand emblems are "trans-cultural". OD: emblems are easily recognizable because, in spite of their arbitrary link with the speech they refer to, they have a direct verbal translation, usually consisting of one or two words or a whole sentence (often a traditional expression shared within a specific culture). The concurrent words can be completely replaced by an emblematic gesture, as with the so-called "bag hand", meaning in Italian culture something like "Well, what do you want from me?" (see [39] for a Sardinian example).

Illustrators. CD: these hand movements, also called substantive gestures [15], topic gestures [2], and propositional gestures [9], illustrate the content of what the speaker says. They enlarge or complete the communication content, indicating something in space or outlining the shapes of objects or movements. Contrary to emblems, the link between an illustrator's shape and its meaning is not arbitrary but alludes to some verbal-gesture relation. OD: the shape or the movement drawn by the hand(s) refers to the verbal content (representing or indicating it). To recognize this category, it is necessary
to identify its (actual or ideal) referent in the speech [8]. The category of illustrator gestures includes "iconic", "metaphoric", and "deictic" gestures [9].

Iconic gestures reproduce concrete aspects of the verbal content. They have a "formal" relation with the referent, since their form conveys the meaning and at the same time is determined by it. Operatively, the hand(s) draw(s), in the air, pictures of objects cited in the discourse (e.g., drawing the form of a cube when the speaker mentions "a box").

Metaphoric gestures are also pictorial, like iconic ones, but they refer to abstract idea(s), which are concretized through a specific gestural shape. Operatively, the moving hand "draws", in the air, shapes which can represent a metaphor of an abstract idea, e.g., forming a fist when referring to strength: the fist becomes a metaphor of the abstract concept of strength. In an example reported in a video in CodGest, the speaker says "they are trying to make forget the past Governance" and performs a gesture representing a movement to one side, or rather of "putting aside": this movement of putting aside is a metaphor of the abstract action of "forgetting".

Deictic gestures (from the ancient Greek deìknymi, "to show"), also named pointing gestures [40], indicate entities which can be actually present in the physical environment of the gesturer (e.g., indicating objects, persons, or places) or ideally present in the discourse content (e.g., pointing upwards when speaking about the north, or backwards to indicate the past). They can also be used to point to the interlocutor (as in the case of the "interactive" gestures of [2]).

Speech Non-linked Gestures (SNG). CD: according to some authors, e.g., [6], these categories could be referred to as hand movements rather than hand gestures. However, following Poggi [8, p. 46], gestures are any hand movements performed "to do things, to touch objects, other people or themselves, and finally to communicate".
Therefore, SNG can also be produced during speech, but they bear no clear, evident relation either to speech content or to speech structure (whether in its prosodic or syntactic aspects). OD: these hand movements are mainly acts of contact with, and/or manipulation of, a part of the speaker's body, objects, or other persons, like the adaptors (or adapters) described by Ekman and Friesen [3] (see also [41]). SNG are distinguished into "self-adaptor gestures" and "hetero-adaptor gestures".

Self-adaptor gestures, also called body-focused movements [42], self-touching gestures [41, 43], or self-manipulators [41], are gestures of self-contact. Operatively, the hands touch parts of one's own body, e.g., touching one's own hair, scratching oneself, or rubbing the hands together.

Hetero-adaptors, also called manipulative gestures [18] or contact acts [44], are gestures of contact with what is external to the performer. They can be "person-addressed adaptors" or "object-addressed adaptors". Operatively, object-adaptors are gestures of contact with objects, e.g., touching (manipulating) some object in the physical space, such as a pen or a paper; person-adaptors are contacts with other persons, e.g., touching another person's hand, arm, or shoulder.
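Because the taxonomy is designed as an exhaustive and mutually exclusive category system, it lends itself naturally to a tree representation in observational-coding software. The following is a minimal illustrative sketch (all identifiers are ours, not taken from the actual CodGest tool) of the hierarchy described above, with a helper that flattens it into the set of codable leaf categories:

```python
# Hypothetical sketch of the gesture taxonomy as a nested category tree.
# Dict keys are super-categories; a non-empty list holds terminal sub-types;
# an empty list marks a node that is itself a codable leaf category.
TAXONOMY = {
    "speech_linked": {                       # SLG: linked to concurrent speech
        "cohesive": ["weaving"],             # repetitive movements tied to utterance structure
        "rhythmic": [],                      # beats in time with vocal peaks
        "ideational": {                      # related to semantic content
            "emblems": [],                   # culturally shared, directly translatable
            "illustrators": ["iconic", "metaphoric", "deictic"],
        },
    },
    "speech_non_linked": {                   # SNG: no relation to speech content or structure
        "self_adaptors": [],                 # contact with one's own body
        "hetero_adaptors": ["object_addressed", "person_addressed"],
    },
}

def leaf_categories(tree, prefix=()):
    """Flatten the tree into tuples naming each mutually exclusive leaf category."""
    leaves = []
    if isinstance(tree, dict):
        for name, sub in tree.items():
            leaves.extend(leaf_categories(sub, prefix + (name,)))
    elif tree:                               # non-empty list of terminal sub-types
        for name in tree:
            leaves.append(prefix + (name,))
    else:                                    # empty list: the node itself is a leaf
        leaves.append(prefix)
    return leaves
```

A coder's observation can then be validated by checking that its label is one of the tuples returned by `leaf_categories(TAXONOMY)`, which enforces the mutual exclusivity of the category system before agreement indexes are computed.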