Rendering Gestures as Line Drawings

Frank Godenschweger, Thomas Strothotte, Hubert Wagener
Department of Simulation and Graphics
Otto-von-Guericke University of Magdeburg, Germany
E-mail: {godens|tstr|wagener}@isg.cs.uni-magdeburg.de

Abstract. This paper discusses computer-generated illustrations and animation sequences of hand gestures. Animation of gestures is especially useful in teaching sign language. We propose algorithms for rendering 3D models of hands as line drawings and for designing animations of line-drawn gestures. Presenting gestures as line drawings, as opposed to photorealistic representations, has several advantages. Most importantly, the abstract nature of line drawings emphasizes the essential information a picture is to express and thus supports easier cognition. Especially when line drawings are rendered from simple 3D models (of human parts), they are aesthetically more pleasing than photorealistic renderings of the same model. This leads us to the assumption that simpler 3D models suffice for line-drawn illustrations and animations of gestures, which in consequence facilitates the 3D modeling task and speeds up the rendering. Other advantages of line drawings include fast transmission in networks such as the Internet and the wide scale-independence they exhibit.

1 Introduction

Modeling and rendering human bodies or body parts is a well-known problem that has attracted many research groups. It seems that our perception of humans (and of their photorealistic images) is very sensitive to many fine details: the cell structure of the tissue (wrinkles etc.), the hair, and, in animations, the exact movement of muscles that press and stretch the tissue. Various approaches deal with these problems and try to incorporate many details, but at the moment there exists no aesthetically fully satisfying solution for representations that look realistically human. In the field of facial animation, for example, Parke proposed in his pioneering work [5] a parameterized model for facial animation. He identifies for each facial expression a set of parameter values such that an appropriate change of these values results in a distinct facial expression. Terzopoulos and Waters [8] developed a hierarchical trilayer as a physically based 3D model of the human face in order to represent bones, muscles, and tissue. The appearance of the tissue is approximated in their model by a simulation taking elastic forces between different parts of the model into account. These and similar approaches can be adapted for rendering illustrations and animations of hand gestures.

But the required models are extremely complex and the rendering times are very high. Thus, at least for interactive applications, this approach is prohibitive. When designing a teaching system for sign language, interactive speed certainly is more rewarding than aesthetically pleasing graphics. Geitz et al. [1] followed this route by providing such a teaching system for use on the Internet. The 3D VRML models they use for representing the human hand are very coarse. This allows fast rendering, and the system can be used interactively. Currently, single static signs of the manual alphabet can be viewed from arbitrary directions by rotating the model online. A disadvantage of such crude models of human parts, especially when rendered in a photorealistic style, is the tendency of viewers to devote at least part of their attention to criticizing the aesthetic shortcomings of the presentation. It has been observed that viewers of line drawings are less concerned with criticizing the crudity of the underlying model, and their concentration is focused more on the conveyed information (cf. [7] for more on this subject). This leads us to the assumption that graphical presentations of hand signs are given more appropriately in a line-drawn style. Books dedicated to teaching sign language usually present the signs as line drawings instead of photos, mainly because line drawings abstract from irrelevant details and thus force the focus onto the essentials (for example [6]). The main problem in employing line drawings as a presentation style stems from the fact that rendering engines (and hardware support) for producing line drawings are quite rare, if available at all. Furthermore, animation in line-drawing style poses additional problems with respect to the selection of characteristic lines as well as frame-to-frame coherence. We develop in this paper an application for designing and animating hand gestures in order to illustrate sign language as line drawings. An interactive animation module controls sequences of gestural expression, which allows easy repetition of misunderstood sequences of sign gestures. Furthermore, the underlying 3D representation makes it possible to view the gesture sequences from arbitrary directions for a more rigorous inspection of illustrated words and phrases in the sign language.

2 Internal Representation of Gestures

In our application, gestures are represented within the computer as geometric 3D models of hands. But before we detail this representation, it is useful to classify the gestures we have to present. Here we follow a classification scheme given by Harling and Edwards [3]. In this classification, a distinction between hand posture and hand location is made, where hand posture refers to the individual positions of the fingers relative to the hand. In performing a gesture, the hand posture as well as the hand location may either be static or dynamic. This leads to four different classes of gestures. For example, the gesture representing the letter J in the manual alphabet consists of a static hand posture and a dynamic hand location, while the letter A is static in both posture and location.
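To make this four-class scheme concrete, here is a minimal sketch in Python; the type and names are ours, not from Harling and Edwards:

```python
from dataclasses import dataclass
from enum import Enum

class Mode(Enum):
    STATIC = "static"
    DYNAMIC = "dynamic"

@dataclass(frozen=True)
class GestureClass:
    posture: Mode   # positions of the fingers relative to the hand
    location: Mode  # position/movement of the hand in space

# Two of the four classes, with the manual-alphabet examples from the text:
LETTER_A = GestureClass(posture=Mode.STATIC, location=Mode.STATIC)
LETTER_J = GestureClass(posture=Mode.STATIC, location=Mode.DYNAMIC)
```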

In the following parts of this section, the representation of the finger and hand movements forming a chosen gesture is described.

2.1 Representing the Hand and its Movement

The 3D model of the hand we use in our application is composed of freeform surfaces (actually tensor-product cubic B-spline surfaces) which have been modeled with Alias|Wavefront. Using freeform surfaces for deformations, such as those performed in hand movements, gives the advantage of a more realistic shape but is harder to control. This is why many systems dealing with hand and finger movements use rough representations, where for example the phalanges of a finger consist of cylinders. In our application, each finger (including the nail) is modeled by two patches; for the palm and the lower arm, one additional patch each is required. Besides this freeform surface modeling the tissue of the hand, we supply a skeleton structure that approximately mimics the natural skeleton of the hand. This simplified skeleton is illustrated in Figure 1 (b).
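As a reminder of what a tensor-product cubic B-spline patch is, the following sketch evaluates a point on a single uniform bicubic patch from a 4 × 4 control net. This is a textbook formulation, not the Alias|Wavefront representation used in the paper; all names and the example control net are ours:

```python
import numpy as np

def cubic_bspline_basis(t: float) -> np.ndarray:
    """Uniform cubic B-spline basis functions at parameter t in [0, 1]."""
    return np.array([
        (1 - t) ** 3,
        3 * t**3 - 6 * t**2 + 4,
        -3 * t**3 + 3 * t**2 + 3 * t + 1,
        t**3,
    ]) / 6.0

def eval_patch(control: np.ndarray, u: float, v: float) -> np.ndarray:
    """Point on a uniform bicubic tensor-product patch.

    control: 4 x 4 x 3 array of control points for one patch.
    """
    bu = cubic_bspline_basis(u)  # basis weights along u
    bv = cubic_bspline_basis(v)  # basis weights along v
    # Tensor product: sum_i sum_j bu[i] * bv[j] * control[i, j]
    return np.einsum("i,j,ijk->k", bu, bv, control)

# Example: a flat 4x4 control net in the z = 0 plane
ctrl = np.zeros((4, 4, 3))
ctrl[..., 0], ctrl[..., 1] = np.meshgrid(range(4), range(4), indexing="ij")
print(eval_patch(ctrl, 0.5, 0.5))
```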

Fig. 1: Deformation model of the hand. The freeform surfaces of the hand (a) are supplemented with a skeleton (b). The method of inverse kinematics is used to perform deformations, from which line drawings are calculated (c).

The joints in this skeleton structure have different degrees of freedom, chosen to match those of a real hand. Lee and Kunii [4] studied exhaustively the constraints of hand movements and finger flexions. They defined constraint functions for the hand and each phalanx of the fingers. In our application, we only include the constraints for single joints, so that for example a finger cannot rotate backwards. Including constraints concerning combinations of rotations is intended as future work.
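A per-joint constraint of this kind can be sketched as simple range clamping; the joint names and limit values below are illustrative assumptions, not the constraint functions of Lee and Kunii:

```python
# Per-joint rotation limits in degrees (illustrative values, not from the paper).
# A flexion range starting at 0 forbids backward rotation of a finger joint.
JOINT_LIMITS = {
    "index_mcp_flexion": (0.0, 90.0),
    "index_pip_flexion": (0.0, 110.0),
    "index_dip_flexion": (0.0, 90.0),
}

def clamp_rotation(joint: str, angle: float) -> float:
    """Clamp a requested joint rotation into its allowed range."""
    lo, hi = JOINT_LIMITS[joint]
    return max(lo, min(hi, angle))

assert clamp_rotation("index_pip_flexion", -15.0) == 0.0   # no backward bend
assert clamp_rotation("index_pip_flexion", 45.0) == 45.0   # in range, unchanged
```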

A posture in our system can be applied to the freeform surface modeling the tissue of the hand by a technique called cluster-based deformation, which is a well known feature in computer animation systems. For defining a posture of the skeleton, a set of rotation angles for the different joints is specified. Using the principles of inverse kinematics a posture of the hand is computed by a set of rotations. In order to specify a gesture, it suffices to give the rotation of all joints. Actually the designer of static gestures in our system has to be careful, since no collision detection for the resulting model is performed. This means that although all rotation parameters lie in their appropriately constraint range, self penetrations in the resulting model may occur. We consider this deficiency to be neglectable, because in a resulting line graphic the penetration areas are very small. Indeed this tolerance in the perception of line graphics supports our preference for this presentation style.
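Building on the clamping sketch above, a posture specification might then look as follows; the names are hypothetical, and, matching the limitation just noted, no collision detection is attempted:

```python
from typing import Dict

Posture = Dict[str, float]  # joint name -> rotation angle in degrees

def make_posture(angles: Posture) -> Posture:
    """Build a posture, clamping each joint into its constrained range.

    Note: as in the paper, no collision detection is done here, so a posture
    with valid per-joint angles may still self-penetrate slightly.
    """
    return {joint: clamp_rotation(joint, a) for joint, a in angles.items()}

# A fragment of a fist-like posture (angles are made-up examples)
fist_fragment = make_posture({
    "index_mcp_flexion": 85.0,
    "index_pip_flexion": 100.0,
    "index_dip_flexion": 70.0,
})
```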

2.2 Temporal Control

Many gestures, especially those accompanying speech, consist of many hand movements, i.e. they are inherently dynamic. Such gestures are best illustrated by animations. Our general approach to generating such animations consists of specifying a sequence of static gestures that constitute the keyframes of the animation. The transition between consecutive keyframes is obtained by an interpolation that can be chosen from different interpolation schemes. A meaningful animation of a series of gestures requires appropriate timing. When a human spells, for example, a word in sign language, the signer makes fast movements between the gestures representing single letters (she/he "throws" the letters), and then remains in each position for a few milliseconds to give the viewer time to recognize the letter in the case of a static gesture, or performs a dynamic gesture with quite controlled timing. The timing specification incorporated into our system is currently quite rudimentary, but easy to use. For each specified keyframe (i.e. static gesture), a hold time, a preceding period, and a following period are specified. The time spent moving from a keyframe Ki to a keyframe Ki+1 is given by the sum of the following period specified for Ki and the preceding period for Ki+1. Up to now this timing scheme has not been fully validated. An easy alternative scheme that allows for finer time-tuning consists of simply defining the transition time between any pair of gestures. The disadvantage of this is that the effort to integrate new gestures grows with the number of gestures available in the system. Moreover, the size of the timing specification grows quadratically in the number of gestures, as opposed to the linear growth of the former scheme. The time-tuning values, as well as the parameter sets for the joint rotations that represent each gesture, are taken from a library, which is described in the next section.
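The per-keyframe scheme can be summarized in a few lines; the field names and numeric values are our own illustration:

```python
from dataclasses import dataclass

@dataclass
class TimedKeyframe:
    gesture: str      # name of the static gesture, e.g. "A"
    preceding: float  # seconds spent moving into this keyframe
    hold: float       # seconds the posture is held
    following: float  # seconds spent moving out of this keyframe

def transition_time(k_i: TimedKeyframe, k_next: TimedKeyframe) -> float:
    """Time to move from keyframe Ki to Ki+1: following(Ki) + preceding(Ki+1)."""
    return k_i.following + k_next.preceding

a = TimedKeyframe("A", preceding=0.10, hold=0.25, following=0.10)
b = TimedKeyframe("B", preceding=0.15, hold=0.25, following=0.10)
print(transition_time(a, b))  # 0.25 seconds
```

Note that this per-keyframe specification grows linearly with the number of gestures, whereas the pairwise transition table of the alternative scheme would grow quadratically.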

3 The Gesture Library

The gestures available in our system are contained in a library which can be extended by end-users in a quite convenient way. This gesture library consists of a table with an entry for each individual gesture. (Up to now the library contains gestures representing the letters of the manual alphabet.) Within this table, the geometry of the gesture is coded and the timing specifications described above are recorded. The geometric information of a given static gesture simply consists of the rotations at the skeleton joints necessary to form the hand posture and location. In the case of dynamic gestures, a sequence of keyframes and an interpolation method are stored, where the keyframes in turn are specified as static gestures.
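A possible shape for such a table entry, covering both static and dynamic gestures, might look as follows (a sketch under our own naming assumptions, not the system's actual file format):

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class GestureEntry:
    name: str                                      # e.g. a letter of the manual alphabet
    preceding: float                               # timing values as described above
    following: float
    rotations: Optional[Dict[str, float]] = None   # static gesture: joint -> angle
    keyframes: List["GestureEntry"] = field(default_factory=list)  # dynamic gesture
    interpolation: str = "linear"                  # interpolation between keyframes

library: Dict[str, GestureEntry] = {}

def add_gesture(entry: GestureEntry) -> None:
    """End-user extension point: register a new gesture in the library."""
    library[entry.name] = entry

add_gesture(GestureEntry("A", preceding=0.1, following=0.1,
                         rotations={"index_mcp_flexion": 85.0}))
```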

The following sections describe the user interface for building and expanding the library.

3.1 Interactive Dialogs for Library Maintenance

Entering the timing values for a gesture into the library is a trivial task. The interactive dialog window for adding and editing timing values is shown in Figure 2.

Fig. 2: Dialog window for adding and editing preceding and following time periods of each manual sign

After choosing a gesture in the list, the preceding period and the following period can be entered or edited. Each gesture has a reference to its geometric specification. Clicking the "skeleton button" opens a dialog called the gesture builder, which allows inspection and redefinition of this specification. The dialog window for defining and modifying the geometry of static hand gestures is shown in Figure 3. Each individual joint can be selected and a rotation can be assigned.

Fig. 3: Dialog window for performing and modifying hand gestures. In the schematic illustration of the skeleton, the different joints can be selected by radio buttons. The rotation for the selected joint is entered in edit boxes.

The user has to be aware that only basic constraints for the degrees of freedom are known to the system; in particular, constraints concerning combinations of rotations are missing. This means that a user can specify an unrealistic gesture without obtaining a warning from the system. This problem is mitigated by displaying a graphic of the defined gesture in an associated window, which is updated immediately when a parameter is redefined. In this way we ensure instant graphical inspection of the definition process, which has proved useful in gesture design.

3.2 Building Gesture Sequences

A user can trigger an animation simply by entering a sequence of manual letters. The application extracts from the library the geometric specification for each manual letter in turn and produces specifications of the gesture transitions by interpolation, in accordance with the defined timing constraints. This results in a sequence of static gesture specifications, each of which is rendered as a line drawing. The rendering process is discussed in the next section.
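Combining the hypothetical structures sketched above, the expansion of a fingerspelled word into interpolated frames could look like this (linear interpolation only, static gestures only, hold times omitted for brevity):

```python
from typing import Dict, Iterator

def lerp_posture(p0: Dict[str, float], p1: Dict[str, float],
                 t: float) -> Dict[str, float]:
    """Linear interpolation between two postures (same joint sets), 0 <= t <= 1."""
    return {j: (1 - t) * p0[j] + t * p1[j] for j in p0}

def animate_word(word: str, fps: int = 25) -> Iterator[Dict[str, float]]:
    """Yield one posture per frame for a fingerspelled word.

    Uses the `library` of GestureEntry objects from the previous sketch;
    assumes every letter is a static gesture with `rotations` set.
    """
    letters = [library[ch] for ch in word.upper()]
    for cur, nxt in zip(letters, letters[1:]):
        seconds = cur.following + nxt.preceding   # transition time, as in Sect. 2.2
        frames = max(1, round(seconds * fps))
        for f in range(frames):
            yield lerp_posture(cur.rotations, nxt.rotations, f / frames)
    yield letters[-1].rotations                    # settle on the final posture
```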

4 Generating Line Drawings of Freeform Surfaces

We now come to the central topic of this paper, the generation of line drawings from a geometric 3D model. The specification of a hand gesture as described in Section 2 is transferred to the geometric hand model. Therefore, the first step consists of deforming the given hand model according to the joint rotations specifying the particular gesture, yielding an explicit 3D model for the gesture. In the rendering pipeline we then perform the following tasks in turn:

1. The freeform surface is transformed into a special polygonal mesh. During this transformation, we carry out the following process: (a) reparameterisation of the patches, so that a nearly natural parameterisation is obtained, and (b) approximation of the freeform patches by a polygonal mesh which is evenly spread in parametric space. In this way a polygonal mesh of the freeform model is obtained whose quadrangles are more evenly spread in 3D space than with standard isoparametric meshes (for a more rigorous description see [2]). A consequence of this property is that regions of high curvature exhibit edges whose faces span a relatively large or small angle.

2. Analytic rendering is performed, i.e. a vector-oriented description of the visible parts is computed. With this vector-oriented description, we are able to apply different line styles for contours as well as edges.

3. The contour lines are identified and the inner edges are classified with respect to the angles of adjacent faces. This classification is used to control the level of detail in the resulting drawing.

4. Chains of contour lines and inner edges of given classifications are approximated by cubic splines which can be drawn in different line styles.

The concept of this rendering pipeline allows interactive tuning of the desired line drawing. Changing the density of the polygon mesh, which actually adjusts the number of faces in the mesh, results in a finer or coarser approximation of the freeform model. A most prominent reason for employing 3D models of hand gestures is the arbitrary adjustment of the viewing direction. After changing the viewing direction, redrawing the hand requires only steps 2 to 4 of the rendering pipeline, giving a fast update. A readjustment of the level of detail does not require redoing all steps, but only step 4 of the rendering pipeline, which results in a very fast update. The following variations of drawing styles are included in our application:

1. All lines, such as contour lines and edge lines, can be drawn with different line widths and brightnesses. This gives the possibility to emphasize patches of the model (e.g. focusing on a finger).

2. Certain inner edge lines can be drawn at a desired level of detail. The revealing of an edge depends on the angle between its adjacent faces and therefore corresponds to the curvature; a sketch of this classification follows below. A minimal level of detail reveals only the edges whose adjacent faces span a large angle, which depict the parts of high curvature. The desired level of detail can be adjusted in a nearly continuous range. At the maximal level of detail the underlying polygonal mesh is drawn (a result that usually is unwanted).

3. Shading can be computed by drawing simple hatching lines, cross-hatching, or stippling (i.e. dots). Users should avoid high levels of detail, since these show the artifacts of a crude 3D model more clearly and thus annul in particular the cognitive benefits of line drawings over photorealistic presentations.
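As an illustration of the edge classification in step 3 and the level-of-detail control in variation 2, the following sketch classifies inner edges by the angle between adjacent face normals; the mesh representation is our own assumption:

```python
import numpy as np

def dihedral_angle(n0: np.ndarray, n1: np.ndarray) -> float:
    """Angle in degrees between the normals of two adjacent faces.

    0 degrees means a flat region; large values indicate high curvature.
    """
    c = np.dot(n0, n1) / (np.linalg.norm(n0) * np.linalg.norm(n1))
    return np.degrees(np.arccos(np.clip(c, -1.0, 1.0)))

def visible_edges(edges, normals, lod_threshold: float):
    """Keep inner edges whose adjacent faces span more than the threshold.

    edges: iterable of (edge_id, face_a, face_b); normals: face_id -> normal.
    Lowering lod_threshold reveals more edges (more detail); at threshold 0
    the whole polygonal mesh would be drawn.
    """
    return [e for e, fa, fb in edges
            if dihedral_angle(normals[fa], normals[fb]) > lod_threshold]
```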

A short demonstration of these different styles is given in the next section.

4.1 Demonstration of Drawing Styles

We are now in a position to demonstrate some line drawings comparing different parameterisations of our rendering. Figure 4 (a) shows a minimal line drawing illustrating the manual letter A. Only the contour of each freeform patch is drawn. The same rendition, but with a greater level of detail, is given in Figure 4 (b).

Fig. 4: Presentation of the letter A in different drawing styles: (a) just outlines, (b) outlines and some simulated wrinkles

The variations realisable by gradually increasing the level of detail are illustrated more completely in Figure 7. Figure 5 gives a comparison between a line drawing and a photorealistic presentation with shading performed at today's standard quality. The depicted hand posture can be recognized more easily in the line drawing, while the photorealistic presentation conveys more spatial detail. The line drawing can also be sent over a network much more quickly. Indeed, the speed advantage is such that even if the photorealistic image is desired, the line drawing can be transmitted first, followed by an interlaced transmission of the photorealistic image. Figure 5 (c) shows the overlay of the photorealistic image and the line drawing. For users who prefer a more spatial presentation, we provide shaded line drawings, where the shading is realized either by cross-hatching or stippling. Figure 6 contrasts line drawings with and without shading, where stippling is used as the shading method in Figure 6 (a). In Figure 6 (b) a line style is applied where the outlines are drawn with multiple lines in order to direct the viewer's attention. Further examples, particularly using animation, can be viewed on the Internet at http://isgwww.cs.uni-magdeburg.de/∼godens/gesture.html.

Fig. 5: Presentation of the letter Y in different styles: (a) photorealistic, (b) line drawing, (c) the line drawing laid over the photorealistic rendition, which can enhance the recognition of gestures.

Fig. 6: Presentation of the letter L in a shaded drawing style, where the shading is performed with dots (a), and in a style where the outlines are drawn in an exaggerated line style (b)

Fig. 7: The effect of gradually increasing the level of detail

5 Conclusions and Future Work

We introduced in this paper an application for illustrating gestures either as images or as animation sequences. The most prominent characteristic of our system is the presentation of gestures as line drawings. Line drawings convey the important information when illustrating gestures (e.g. the hand posture) better than photorealistic presentations. Besides this, they have other advantages, such as fast transmission in networks, effortless integration into black-and-white publications (no special print media are required), and a compact representation. Another important characteristic of our application is its extensibility. Integrating new gestures does not require 3D modeling; specifying a few joint rotations suffices, for example, to define a static gesture. In the future we intend to extend our system in several respects. In its current state, only gestures performed with one hand are handled. We plan to gradually incorporate the presentation of both hands, arms, and even facial expressions. These extensions yield much more complex models whose handling requires improved support. In defining new gestures for complex models, automatic collision or touch detection would be very helpful. Sophisticated support in designing new dynamic gestures would also be desirable. Some guiding tools for the average user should be developed, especially as the underlying model becomes more complex.

References

1. Sarah Geitz, Timothy Hanson, and Stephen Maher. Computer Generated 3-Dimensional Models of Manual Alphabet Handshapes for the World Wide Web. In Assets '96, pages 27–31, Vancouver, British Columbia, Canada, 1996. ACM.
2. Frank Godenschweger, Thomas Strothotte, and Hubert Wagener. Presentation of Freeform Surfaces as Line Drawings. In 3D Image Analysis and Synthesis '96, pages 87–93, Erlangen, Germany, November 1996.
3. Philip A. Harling and Alistair D. N. Edwards. Hand Tension as a Gesture Segmentation Cue. In Progress in Gestural Interaction, pages 75–88, University of York, UK, March 1996.
4. Jintae Lee and Tosiyasu L. Kunii. Model-based Analysis of Hand Posture. IEEE Computer Graphics and Applications, pages 77–86, September 1995.
5. Frederic Parke. Parameterized Models for Facial Animation. IEEE Computer Graphics and Applications, 2(9):61–68, November 1982.
6. W. C. Stokoe, D. C. Casterline, and C. D. Croneberg. A Dictionary of American Sign Language. Linstock Press, Silver Spring, 1976.
7. T. Strothotte, B. Preim, A. Raab, J. Schumann, and D. R. Forsey. How to Render Frames and Influence People. In Proceedings of EUROGRAPHICS '94, volume 13(3), pages 455–466, 1994.
8. Demetri Terzopoulos and Keith Waters. Physically-based Facial Modelling, Analysis, and Animation. The Journal of Visualization and Computer Animation, 1:73–80, 1990.
