A Taxonomy of Complex Models for Visualizing Humans

Nadia Magnenat Thalmann
University of Geneva, Geneva, Switzerland, and HEC, University of Montreal

Daniel Thalmann
Computer Graphics Lab, Swiss Federal Institute of Technology, Lausanne, Switzerland
Abstract

This paper proposes a new classification of models for animating synthetic actors, both according to the method of controlling motion and according to the kinds of interactions the actors have. The paper discusses the use of geometric, physical and behavioral models for animating human characters. Various situations of a virtual scene are described, including the simulation of the motion of a single virtual human, the interaction between two virtual humans and the interaction between a virtual human and a real animator. Specific problems are emphasized, such as cloth animation, facial emotions, individualized walking, vision-based obstacle avoidance and physically-based interactive motion control using 3D input devices.
1. Introduction

The classification of approaches to computer animation can help us impose conceptual order on an area characterized by rapid and piecemeal innovation in many directions simultaneously, to systematically analyze the differences and similarities among these approaches, and to better understand the way in which the field has been evolving. In this paper, we show the evolution over time of research into animation models. The first computerized models to be defined in computer animation were mainly geometric. Since computer animation derives from traditional animation, the first trend was to imitate how traditional animators produce traditional films. The accent was put more on the graphic result per se than on the models, for example for the creation of a synthetic actor (synthetic actors will simply be called actors in this paper). To make the movement more realistic, physics-based models have been introduced. The problem with these models is that all actors behave the same way. Because humans do not act solely according to physical laws, behavioral models have been introduced more recently to take into account the individuality of a character: besides physical laws, another kind of control is necessary for simulating human motions. Concurrently with the evolution of motion control models, there have been major developments in the relationship between an actor and his environment. The emergence of techniques like artificial intelligence and object-oriented programming, increases in computer speed and new interactive devices have made it possible to take into account the interactions of an actor with his environment, of an actor with another actor and of an actor with the animator. These kinds of interaction and our three categories of models (geometric, physical and behavioral) give rise to a classificatory array with four rows and three columns. Each cell in the table represents existing and potential approaches to animation control, with increasing complexity from top to bottom and from left to right.
2. Categories of Motion Control Methods and Categories of Actor Interfaces

2.1 Categories of Motion Control Methods

Computer animation scenes involving synthetic actors may be classified both according to the method of controlling motion and according to the kinds of interactions the actors have. A motion control method (MCM) specifies how an actor is animated and may be characterized according to the type of information it privileges in animating the synthetic actor. For example, in a keyframe system for an articulated body, the privileged information to be manipulated is the joint angle. In a forward dynamics-based system, the privileged information is a set of forces and torques; of course, in solving the dynamic equations, joint angles are also obtained in this system, but we consider these as derived information. In fact, any MCM will eventually have to deal with geometric information (typically joint angles), but only geometric MCMs explicitly privilege this information at the level of animation control. The nature of the privileged information for the motion control of actors falls into three categories: geometric, physical and behavioral, giving rise to three corresponding categories of MCM.

Geometric MCMs

In the first group of MCMs, the privileged information is of a geometric nature. Typically, motion is defined in terms of coordinates, angles and other shape characteristics. Although geometric MCMs have been mainly concerned with determining the motion of the skeleton, they may also be applied in calculating deformations of bodies or faces.

Physical MCMs

A physical MCM uses physical characteristics and laws as a basis for calculating motion. Privileged information for these MCMs includes physical characteristics such as mass, moments of inertia and stiffness. The physical laws involved are mainly those of mechanics, and more particularly dynamics. Physical laws are used to control the motion of skeletons, but they also have an important application in calculating deformations of bodies and faces, since these deformations are muscular in origin and muscle action is most appropriately characterized in mechanical terms.

Behavioral MCMs

A behavioral MCM specifies the motion of an actor in terms of his behavior. Behavior is often defined as the way that animals and human beings act, and is usually described in natural language terms which have social, psychological or physiological significance but which are not necessarily easily reducible to the movement of one or two muscles, joints or end effectors. Examples would be: an actor smiling at another actor, speaking a certain sentence, getting up and walking from a chair to the door. Terms for describing variations among actors with respect to personality and, for a single actor, changes in mood, emotion and style are also typical of the information privileged by behavioral MCMs, which is less formalized than that of geometry and physics.

2.2 Categories of Actor Interfaces

As we have seen, control of an actor's motion may be based on the processing of geometric, physical or behavioral information. We should also consider the relationship between the actor and the rest of the world. We call this the actor interface and distinguish four basic cases:

1) the single actor situation: the synthetic actor is alone in the scene; there is no interaction with other objects.
2) the actor-environment interface: the synthetic actor is moving in an environment and is conscious of this environment.

3) the actor-actor interface: actions performed by one actor are known to another actor and may change his behavior.

4) the animator-actor interface: not only may the animator communicate information to the actor, but the actor is also able to respond to it and communicate information to the animator.

For each type of interface, we will consider our three categories of MCM (geometric, physical and behavioral) and list examples of specific MCMs in Table 1.
| geometric MCMs | physical MCMs | behavioral MCMs
single actor | rotoscopy, keyframe techniques, kinematic constraints | dynamics | muscular tiredness model, Facial Action Coding System
actor-environment interface | obstacle avoidance, detection of intersection | collision models, deformable models | sensors, vision-based obstacle avoidance
actor-actor interface | geometric collisions between actors | dynamic collisions between actors | communication of emotions between actors
animator-actor interface | design of trajectories using a dataglove | input of forces and device feedback | vision input using a video camera

Table 1. A Classification of Animation Techniques according to Motion Control Method and Actor Interface
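For experimentation, the classification of Table 1 can be captured directly as a small data structure. The following Python sketch (the names and organization are chosen here for illustration; the paper defines no such code) indexes the example techniques of Table 1 by actor interface and MCM category.

```python
from enum import Enum

class MCMCategory(Enum):
    GEOMETRIC = "geometric"
    PHYSICAL = "physical"
    BEHAVIORAL = "behavioral"

class ActorInterface(Enum):
    SINGLE_ACTOR = "single actor"
    ACTOR_ENVIRONMENT = "actor-environment interface"
    ACTOR_ACTOR = "actor-actor interface"
    ANIMATOR_ACTOR = "animator-actor interface"

# Cells of Table 1: example techniques for each (interface, MCM category) pair.
TABLE_1 = {
    (ActorInterface.SINGLE_ACTOR, MCMCategory.GEOMETRIC): ["rotoscopy", "keyframe techniques", "kinematic constraints"],
    (ActorInterface.SINGLE_ACTOR, MCMCategory.PHYSICAL): ["dynamics"],
    (ActorInterface.SINGLE_ACTOR, MCMCategory.BEHAVIORAL): ["muscular tiredness model", "Facial Action Coding System"],
    (ActorInterface.ACTOR_ENVIRONMENT, MCMCategory.GEOMETRIC): ["obstacle avoidance", "detection of intersection"],
    (ActorInterface.ACTOR_ENVIRONMENT, MCMCategory.PHYSICAL): ["collision models", "deformable models"],
    (ActorInterface.ACTOR_ENVIRONMENT, MCMCategory.BEHAVIORAL): ["sensors", "vision-based obstacle avoidance"],
    (ActorInterface.ACTOR_ACTOR, MCMCategory.GEOMETRIC): ["geometric collisions between actors"],
    (ActorInterface.ACTOR_ACTOR, MCMCategory.PHYSICAL): ["dynamic collisions between actors"],
    (ActorInterface.ACTOR_ACTOR, MCMCategory.BEHAVIORAL): ["communication of emotions between actors"],
    (ActorInterface.ANIMATOR_ACTOR, MCMCategory.GEOMETRIC): ["design of trajectories using a dataglove"],
    (ActorInterface.ANIMATOR_ACTOR, MCMCategory.PHYSICAL): ["input of forces and device feedback"],
    (ActorInterface.ANIMATOR_ACTOR, MCMCategory.BEHAVIORAL): ["vision input using a video camera"],
}

def techniques(interface, category):
    """Return the example techniques listed in Table 1 for a given cell."""
    return TABLE_1.get((interface, category), [])

print(techniques(ActorInterface.SINGLE_ACTOR, MCMCategory.GEOMETRIC))
```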
3. The Single Actor Situation

3.1 Introduction

The synthetic actor is alone in the scene, and there is no interaction with other objects. Character motion is determined by the animator using "batch" directives. There is no real-time interaction between the actor and the animator. This is in fact the most common situation in animation. Even if the scene includes decors or other moving objects, the actor is completely unconscious of the environment. Only the animator may prevent collisions, by calculating appropriate trajectories in advance.

3.2 Geometric MCMs

Among the best-known methods in the category of geometric MCMs for animating single actors, we may consider rotoscopy, a method which uses sensors to provide the coordinates of specific points and the joint angles of a real human for each frame. As already mentioned, keyframe systems are typical of systems that manipulate angles; from key angles at selected times, they calculate angles for intermediate frames by interpolation. Inverse kinematic methods may also be considered as being in this category; they determine values of joint angles from values of end effectors. The extension of the principle of kinematic constraints to the imposition of trajectories on specific points of the body is also of a geometric nature.
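As a concrete illustration of the keyframe idea, the following minimal Python sketch (the joint name and key values are hypothetical) linearly interpolates a joint angle between key frames, which is essentially what a keyframe system does for every angle of the articulated body.

```python
from bisect import bisect_right

def interpolate_angle(keyframes, t):
    """Linearly interpolate a joint angle at time t from (time, angle) keyframes.

    keyframes: list of (time, angle_in_degrees) pairs, sorted by time.
    """
    times = [k[0] for k in keyframes]
    if t <= times[0]:
        return keyframes[0][1]
    if t >= times[-1]:
        return keyframes[-1][1]
    i = bisect_right(times, t)            # index of the first keyframe strictly after t
    (t0, a0), (t1, a1) = keyframes[i - 1], keyframes[i]
    u = (t - t0) / (t1 - t0)              # normalized position between the two keys
    return a0 + u * (a1 - a0)

# Hypothetical elbow flexion keys: straight arm at t=0, bent at t=1 s, half-bent at t=2 s.
elbow_keys = [(0.0, 0.0), (1.0, 90.0), (2.0, 45.0)]
print(interpolate_angle(elbow_keys, 0.5))   # 45.0 degrees at the midpoint of the first segment
```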
3.3 Physical MCMs

Kinematic-based systems are generally intuitive but lack dynamic integrity: the animation does not seem to respond to basic physical facts like gravity or inertia. Only the modeling of objects as they move under the influence of forces and torques can be realistic. For example, the motion in Fig. 1 is difficult to achieve without dynamics. Forces and torques cause linear and angular accelerations. The motion is obtained from the dynamic equations of motion relating the forces, torques, constraints and the mass distribution of objects. Typical physical MCMs for single actors, which consider no other aspect of the environment, animate articulated figures through forces and torques applied to limbs. The physical laws involved are mainly those of mechanics, such as the Newton-Euler equations, the Lagrange equation, the Gibbs-Appel equation or the d'Alembert principle of virtual work.

3.4 Behavioral MCMs

We will consider as a behavioral MCM any method consisting of driving the behavior of a single actor by providing high-level directives indicating a specific behavior without any other stimulus. A typical example is the definition of a command to impose a degree of fatigue on an actor, as suggested by Lee et al. in their method of Strength Guided Motion1. The behavior of a single actor may also be manually simulated by the animator using directives about emotions, as defined by the Facial Action Coding System (FACS) introduced by the psychologists Ekman and Friesen2. We have described a system allowing this kind of emotional directive, which includes the possibility of synchronizing them with the generation of phonemes.

3.5 The complexity of skeletons for single actor motion

Most animation systems use a simple articulated body (a skeleton) made of segments and joints to model the motion of the human body. The use of dynamic algorithms certainly improves the fluidity of the motion. However, there is a severe limitation to the use of such techniques: they assume a simple skeleton, but the human skeleton is very different from this structure. As a result, dynamics-based motions are more appropriate for robot-like structures than for human characters. One question arises: should we continue to develop very complex algorithms for controlling the motion of primitive structures that do not correspond to the human skeleton? Let us consider a typical example: the shoulder. This is a very complex joint. There is a considerable amount of research on arm motion, but when the shoulder is modelled by a simple joint with 3 degrees of freedom, it is extremely difficult to ensure the continuity of the surface of the body around the shoulder and to endow it with a natural shape for all possible values of the joint angles. In fact, the very location of the joint cannot be the same for all values of the flexing angle. When the joint is far enough from the body surface, the results are good for large angles; but for small angles, the shoulder looks like a flexible pipe. With the joint nearer to the body surface, the results are good for small angles; but vertices tend to go inside the body for large angles. For this reason, in our Human Factory system3 we devised a moving joint based on the lengthening of the clavicle, providing good results for a shoulder flexion angle between 0° and about 100°. Fig. 2 shows an example.
This, however, is just a trick, and the only proper solution would be to introduce a more complex model of the skeleton. The bony arrangement of the shoulder joint consists of a shallow socket to which is joined the half-spherical head of the humerus. The joint is a ball-and-socket type of arrangement; it is a multiaxial joint that can move through the following movements: flexion, extension, medial and lateral rotation, adduction and abduction, and transverse adduction and abduction. Moreover, the joint is stabilized by ligaments.
The differences between the classical skeleton used in computer animation and the true human skeleton are very significant and tend to make any human motion look robot-like. These differences are shown in Table 2.

Human skeleton made of solid bones | Computer animation skeleton made of segments
joints are very complex and each one is different | joints are rotational and considered in one point
there are over 200 degrees of freedom | joints are generally limited to about 50
there are 33 vertebrae; there are articulations of one vertebral body on another, separated by the intervertebral discs, and articulations between the articular facets of successive vertebrae | 5 or 6 rotational joints are generally used to simulate the vertebral column
bones have some flexibility | segments are rigid
Table 2. Comparison between the true human skeleton and the classical skeleton used in computer animation

3.6 A case study for the single actor situation: an individualized walking model

The use of primitive methods like keyframe animation allows the animator to specify every detail of a motion. However, this is an extremely tedious task. Research in automatic motion control provides ways and tools for reducing this problem, such as task-level command languages. But these raise another problem: how to introduce individual differences into the generic activities which are generated automatically? For example, in the task of walking, everybody walks more or less the same way, following more or less the same laws. It is the "more or less" which is difficult to model. Even the same person does not walk the same way every day. If he is tired, or happy, or has just received some good news, his way of walking will appear somewhat different. As in traditional animation, an animator can create a lot of keyframes to simulate a tired character walking, but this is a very costly and time-consuming task. To individualize human walking, we have developed4 a model built from experimental data based on a wide range of normalized velocities. The model is structured on two levels. At the first level, global spatial and temporal characteristics (normalized length and step duration) are generated. At the second level, a set of parameterized trajectories produce both the position of the body in space and the internal body configuration, in particular the pelvis and the legs. This is performed for a standard structure and an average configuration of the human body. The experimental context corresponding to the model is extended by allowing continuous variation of the global spatial and temporal parameters for altering the motion to try to achieve the effect desired by the animator. The model is based on a simple kinematic approach designed to preserve the intrinsic dynamic characteristics of the experimental model. What is important is that this approach allows individualization of the walking action in an interactive real-time context in most cases (see Fig. 3).
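To make the two-level structure concrete, here is a minimal Python sketch of an individualized walking parameterization. The numerical relations are placeholders, not the biomechanical formulas of the published model4; they only illustrate how the first-level global characteristics and the second-level parameterized trajectories fit together.

```python
import math

def step_characteristics(v_norm):
    """First level: global spatial and temporal characteristics from normalized velocity.

    v_norm is the walking velocity normalized by the leg height.  The linear relations
    below are illustrative placeholders, not the experimental curves of the model.
    """
    step_length = 0.5 + 0.6 * v_norm                # normalized step length (placeholder)
    step_duration = max(0.4, 1.0 - 0.3 * v_norm)    # seconds per step (placeholder)
    return step_length, step_duration

def pelvis_trajectory(t, v_norm, bounce=0.015, sway=0.02):
    """Second level: a parameterized trajectory for the pelvis.

    Returns (forward, lateral, vertical) offsets at time t; a real model would use
    trajectories fitted to experimental data for the pelvis and the legs.
    """
    step_length, step_duration = step_characteristics(v_norm)
    phase = 2.0 * math.pi * (t / step_duration)
    forward = step_length * (t / step_duration)     # distance covered after t/step_duration steps
    lateral = sway * math.sin(phase / 2.0)          # one lateral sway cycle per two steps
    vertical = bounce * math.cos(phase)             # one vertical oscillation per step
    return forward, lateral, vertical

# Individualization: the animator may scale the global parameters before evaluating the
# trajectories, e.g. a "tired" walk could shorten the step and slow it down.
print(pelvis_trajectory(0.25, v_norm=1.0))
```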
The use of a physical MCM requires switching from the problem of specifying positional joint parameters to that of specifying applied forces and torques. This new parameter space is no easier to manipulate. The introduction of inverse dynamics for dynamically processing predefined trajectories does not solve the problem of how to find the natural trajectory for a specified task. This goal is partially reached by using criterion-optimizing methods (energy-based constraints). Combined with other criteria integrating the physiological limitations of joints, dynamic models have been shown to be appropriate for the specification of certain movements: leg balance in walking5, movement of the hand from one location to another6. When the motion involves an interaction with the environment, however (e.g. the contact force in a walking action), the expression of such criteria is not trivial and the problem is still open. Moreover, individualized parameters for motion control may satisfy the animator but may be incompatible from a physical point of view. Likewise, results obtained by solving equations may lead to stereotyped movements for persons with the same anatomic configuration. Another drawback of dynamic models is the excessive cost in terms of CPU time, which prevents the assessment of the motion in real time. This is a major restriction for the design of a movement, especially walking, which involves expressive information of a behavioral, social and cultural nature. In this context, we have proposed a method based on a mathematical parameterization derived from biomechanical experimental data. The main idea of this method is to take advantage of the intrinsic dynamics of the motion as studied and to extend its application context to a wider range, while still producing results which are realistic and interesting for the animator. Fig. 4 shows a frame from the film Still Walking.

3.7 A second case study for the single actor situation: control of facial animation

One main objective of this research is to model human facial anatomy exactly (see human faces with hair rendering7 in Fig. 5 and Fig. 6) and its movements in both their structural and functional aspects. Although all movements may be rendered by muscles, the use of a solely muscle-based model makes the results somewhat unpredictable. This suggests that more abstract entities should be defined in order to create a system which better represents reality. A multi-layered approach may be the most convenient for this. The high-level layers are the most abstract and specify "what to do"; the low-level layers describe "how to do it". Each level is seen as an independent layer with its own input and output. Our system, called SMILE8, is composed of five layers. The intermediate levels are mainly software implementation allowing better modularity. The first level corresponds to the basic animation system. In our case, the software implementation is currently based on the Abstract Muscle Action procedures as previously introduced9. These actions are very specific to the various muscles and give the illusion of the presence of a bony structure. This approach is a typical geometric MCM. By replacing this approach by a structure based on bones and muscles, solved using the finite-element method, we are simply switching from a geometric MCM to a physical MCM without changing the role of this level. At a higher level, our system may be considered as a behavioral MCM where the privileged information consists of words and emotions. A word may be specified by the sequence of its component phonemes. The decomposition is based on the use of a dictionary that is created interactively as needed: each time an unknown word is detected, the user enters the decomposition, which is then stored in the dictionary. Optional commands may affect the intensity, duration and emphasis of each word, and pauses may also be added in order to control the rhythm and intonation of the sentence.
An emotion is defined in terms of the short-term evolution of the human face over time: it is a sequence of expressions with various durations and intensities. Our emotion model is based on the general form of an envelope2: signal intensity = f(t). An emotion has a specific average duration, but it is context-sensitive. For example, a smile may generally have a duration of 5-6 seconds, but it may last 30 seconds in a particularly funny situation. It is also important to note that the duration of each stage of the emotion is not equally
sensitive to the time expansion. To take this non-proportional expansion into account, we introduce a sensitivity factor associated with each stage. In order to introduce a degree of natural variability in rendering each emotion, mechanisms based on statistical distributions have been implemented. For example, we may define a stage duration of 5 ± 1 seconds according to a uniformly distributed law, or an intensity of 0.7 ± 0.05 according to a Gaussian distribution. The various facial actions should be synchronized: emotions, word flow in a sentence and eye motion. In this layer we introduce mechanisms for specifying the starting time, the ending time and the duration of an action. This implies that each action can be executed independently of the current state of the environment, because the synchronization depends on time alone. Our approach to synchronization is based on the manipulation language HLSS (High Level Script Scheduler), which provides simple synchronization mechanisms. Fig. 7 shows a facial expression.
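The sketch below shows one plausible reading of this emotion model in Python: an emotion is a sequence of stages, each with a duration drawn from a statistical law and a sensitivity factor controlling the non-proportional time expansion. The concrete stage names and numbers are illustrative assumptions, not the SMILE implementation.

```python
import random

class Stage:
    def __init__(self, name, mean_duration, duration_spread, sensitivity):
        self.name = name
        self.mean_duration = mean_duration      # seconds
        self.duration_spread = duration_spread  # half-width of the uniform duration law
        self.sensitivity = sensitivity          # how much this stage stretches with the emotion

    def sample_duration(self, expansion=1.0):
        base = random.uniform(self.mean_duration - self.duration_spread,
                              self.mean_duration + self.duration_spread)
        # Non-proportional time expansion: each stage stretches according to its sensitivity.
        return base * (1.0 + self.sensitivity * (expansion - 1.0))

def sample_smile(expansion=1.0):
    """Return a list of (stage name, duration, peak intensity) for one smile instance."""
    stages = [Stage("onset", 1.0, 0.2, 0.2),
              Stage("apex", 3.0, 1.0, 1.0),     # the apex absorbs most of a longer smile
              Stage("offset", 1.5, 0.3, 0.3)]
    intensity = random.gauss(0.7, 0.05)         # intensity 0.7 +/- 0.05, Gaussian law
    return [(s.name, s.sample_duration(expansion), intensity) for s in stages]

# Around 5-6 s by default; larger expansion values stretch mainly the apex,
# in the spirit of the 30 s smile of a particularly funny situation.
print(sample_smile())
print(sample_smile(expansion=5.0))
```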
4. The actor-environment interface

4.1 Introduction

The synthetic actor is moving in an environment and he is conscious of this environment. This implies that the motion of the actor is dependent on parts of the environment. The actor will avoid obstacles or collide with them, grasp objects (Fig. 8), etc. This dependence on the environment can be understood from a geometric, a physical or a behavioral point of view.

4.2 Geometric MCMs

In geometric MCMs, motion may be determined based on the environment using geometric operations like the intersection of the actor with the decor. Consider, for example, the problem of walking without collision among obstacles. One strategy is based on the Lozano-Perez algorithm10. The first step consists of forming a visibility graph. The vertices of this graph are composed of the vertices of the obstacles, the start point S and the goal point G. Edges are included if a straight line can be drawn joining the vertices without intersecting any obstacle. The shortest collision-free path from S to G is the shortest path in the graph from S to G. Lozano-Perez and Wesley10 describe a way of extending this method to moving objects which are not points. Schröder and Zeltzer11 (1988) introduced the Lozano-Perez algorithm into their interactive animation package BOLIO. Brooks12 (1983) suggests another method called the freeway method. His algorithm finds obstacles that face each other and generates a freeway for passing between them. This path segment is a generalized cylinder. A freeway is an elongated piece of free space that constitutes a path between obstacles. Breen13 proposes a technique employing cost functions to avoid obstacles. These functions are used to define goal-oriented motions and actions and can be defined so that the variables are the animated parameters of a scene. These parameters are modified in such a way as to minimize the cost function.
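To make the visibility-graph strategy concrete, the following Python sketch plans a collision-free path for a point actor among polygonal obstacles, following the construction described above (obstacle vertices plus S and G, edges where the straight line is unobstructed, then a shortest-path search). It is a simplified illustration: it ignores degenerate grazing contacts and the growing of obstacles needed for an actor that is not a point.

```python
from itertools import combinations
import heapq, math

def segments_cross(p, q, a, b):
    """True if open segments pq and ab properly intersect (shared endpoints ignored)."""
    def orient(o, u, v):
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])
    if {p, q} & {a, b}:
        return False
    d1, d2 = orient(a, b, p), orient(a, b, q)
    d3, d4 = orient(p, q, a), orient(p, q, b)
    return (d1 * d2 < 0) and (d3 * d4 < 0)

def visible(p, q, obstacles):
    """A straight line can join p and q without properly crossing any obstacle edge."""
    for poly in obstacles:
        edges = list(zip(poly, poly[1:] + poly[:1]))
        if any(segments_cross(p, q, a, b) for a, b in edges):
            return False
    return True

def shortest_collision_free_path(S, G, obstacles):
    """Visibility-graph planner in the spirit of the Lozano-Perez algorithm (point actor)."""
    nodes = [S, G] + [v for poly in obstacles for v in poly]
    graph = {n: [] for n in nodes}
    for u, v in combinations(nodes, 2):
        if visible(u, v, obstacles):
            d = math.dist(u, v)
            graph[u].append((v, d))
            graph[v].append((u, d))
    # Dijkstra search from S to G over the visibility graph.
    dist, prev, heap = {S: 0.0}, {}, [(0.0, S)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == G:
            break
        if d > dist.get(u, math.inf):
            continue
        for v, w in graph[u]:
            if d + w < dist.get(v, math.inf):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    path, node = [], G
    while node in prev or node == S:
        path.append(node)
        if node == S:
            break
        node = prev[node]
    return list(reversed(path))

# A single square obstacle between start and goal (coordinates are arbitrary);
# the planner routes around it via the corner (2, 4) instead of cutting through.
obstacles = [[(2, 1), (5, 1), (5, 4), (2, 4)]]
print(shortest_collision_free_path((0, 0), (6, 6), obstacles))
```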
4.3 Physical MCMs

The reaction of an actor to the environment may also be considered using dynamic simulation in the processing of interactions between bodies. The interaction is first identified and then a response is generated. The most common example of interaction with the environment is the collision. Analytical methods for calculating the forces between colliding rigid bodies have been presented. Moore and Wilhelms14 modelled simultaneous collisions as a slightly staggered series of single collisions and used non-analytical methods to deal with bodies in resting contact. Baraff15 presented an analytical method for finding the forces between contacting polyhedral bodies, based on linear programming techniques; the solution algorithm used is heuristic. A method for finding simultaneous impulsive forces between colliding polyhedral bodies is also described. Baraff16 also proposed a formulation of the contact forces between curved surfaces that are completely unconstrained in their tangential movement, together with a collision detection algorithm exploiting the geometric coherence between successive time steps of the simulation. Von Herzen et al.17 developed a collision algorithm for time-dependent parametric surfaces. Terzopoulos et al.18 and Platt and Barr19 proposed to surround the surfaces of deformable models by a self-repulsive collision force; this is a penalty method. Lafleur et al.20 address the problem of detecting collisions of very flexible objects, such as clothes, with almost rigid bodies, such as human bodies. In their method, collision avoidance also consists of creating a very thin force field around the obstacle surface to avoid collisions (see Section 4.5). Hahn21 describes the simulation of the dynamic interaction among rigid bodies, taking into account various physical characteristics such as elasticity, friction, mass and moment of inertia to produce rolling and sliding contacts. Terzopoulos and Fleischer22 (1988) propose dynamic models for simulating inelastic behaviors: viscoelasticity, plasticity and fracture. Gourret et al.23 develop a finite element method for simulating deformations of objects and of the hand of a synthetic character during a grasping task.

4.4 Behavioral MCMs

For solving the problem of a synthetic actor crossing a room with furniture (table, chairs etc.), the use of an algorithm like the Lozano-Perez algorithm will certainly provide a trajectory avoiding the obstacles. But this trajectory won't be "natural": no human would follow such a path! The decision of where to pass is based on our vision, and we require a certain additional room for comfort; we try to keep a "security distance" from any obstacle. This is a typical behavioral problem that cannot be solved by graph theory or mathematical functions. Moreover, walking depends on our knowledge of the location of obstacles, and it is only when we see them that we start to include them in our calculations for adapting the velocity. A more complete example is described in Section 4.6.

4.5 A case study of the actor-environment interface: cloth animation

Cloth animation in the context of human animation involves the modelling of garments on the human body and their animation. In our film "Rendez-vous à Montréal"24, featuring Humphrey Bogart and Marilyn Monroe, clothes were simulated as a part of the body with no autonomous motion. For modeling more realistic clothes, two separate problems have to be solved: cloth animation without considering collisions (only the basic shape and the deformations due to gravity and wind are considered), and collision detection of the cloth with the body and with itself. As interactions with the actor's body are ignored, the first problem may be considered as a single actor problem where the "actor" is the cloth itself. However, a cloth has no autonomy and a behavioral MCM for cloth animation has no meaning. But we may consider geometric and physical MCMs for animating clothes without collisions.
In geometric MCMs, the shape of flexible objects is entirely described by mathematical functions. This is not very realistic and cannot create complicated deformable clothes, but it is fast. The geometric approach is suitable for representing single pieces of objects or clothes with simple shapes, which are easily computed, but geometric flexible models like Weil's model (a model that can create realistic-looking folds)25 or Hinds and McCartney's model26 do not incorporate the concept of quantities varying with time, and are weak in representing physical properties of cloth such as elasticity, anisotropy and viscoelasticity. Only physical MCMs like
Terzopoulos' model18 and Aono's model27 may correctly simulate these properties. Another interesting approach, by Kunii and Gotoda28, incorporates both the kinetic and geometric properties for generating garment wrinkles. Our elastic cloth surface model is based on the one introduced by Terzopoulos et al.18; it has been extended to polygonal regions. First simulations may produce a rather elastic cloth; parameters should then be adjusted to correct this. These parameters are geometric and dynamic. Geometric parameters determine cloth dimensions and shape. Even with as simple a shape as a skirt or a scarf, results are rather realistic. Dynamic parameters have an impact on motion. Generally, most parameters are fixed during cloth creation and are not modified during the animation. However, nothing prevents us from varying them in order to refine the motion and obtain more realistic results. These dynamic parameters may be classified into two categories: internal and external. Internal parameters drive local forces in the cloth like stretching and curvature. External parameters are forces acting on the cloth such as gravity and the wind effect (see Fig. 9). When we consider collisions between the cloth and the body, we have a situation of actor-environment interface using a physical MCM. Collision detection adds extra constraints and requires a specific algorithm. For very flexible objects like clothes, it is necessary to introduce self-collision detection. In our method, collision avoidance consists of creating a very thin force field around the obstacle surface to avoid collisions. This force field acts like a shield rejecting the points. In our approach, the collision detection process is almost automatic: the animator has only to provide the list of obstacles to the system and indicate whether they are moving or not. For a walking synthetic actor, the moving legs are of course considered as a moving obstacle. A number of parameters are provided in order to modify the behavior of the collision detection method: shield depth, shield force and damping factor. Fig. 10 shows frames from the film Flashback with cloth animation, and Figs. 11 and 12 show new clothes modelled using an interactive cloth design system29.
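As an illustration of this shield idea, the sketch below applies a repulsive, damped penalty force to any cloth point that enters a thin layer around an obstacle surface. The interface (a signed distance to the obstacle and its outward surface normal) and the exact force law are assumptions made for the example, not the published algorithm20.

```python
import numpy as np

def shield_force(velocity, distance_to_surface, surface_normal,
                 shield_depth=0.01, shield_strength=500.0, damping=5.0):
    """Repulsive force on one cloth point inside the thin shield around an obstacle.

    distance_to_surface: signed distance from the point to the obstacle surface (>0 outside).
    surface_normal: unit outward normal of the obstacle at the closest surface point.
    """
    if distance_to_surface >= shield_depth:
        return np.zeros(3)                        # outside the shield: no reaction
    penetration = shield_depth - distance_to_surface
    normal_speed = float(np.dot(velocity, surface_normal))
    # Stiff repulsion proportional to the penetration, damped along the normal direction.
    magnitude = shield_strength * penetration - damping * normal_speed
    return max(magnitude, 0.0) * surface_normal

# Example: a cloth point 2 mm above a horizontal obstacle, falling at 0.5 m/s.
f = shield_force(velocity=np.array([0.0, 0.0, -0.5]),
                 distance_to_surface=0.002,
                 surface_normal=np.array([0.0, 0.0, 1.0]))
print(f)   # pushes the point back along +z
```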
4.6 A case study of a behavioral MCM: a vision-based approach to obstacle avoidance

To model this kind of situation, we introduced the concept of synthetic vision30. The initial objective was simple and quite general: to create an animation involving a synthetic actor automatically moving in a corridor and avoiding obstacles. To simulate this behaviour, each synthetic actor uses synthetic vision for its perception of the world and as the unique input to its behavioural model. This model is based on the concept of Displacement Local Automata (DLA), which is similar to the concept of a script for natural language processing. A DLA is an algorithm that can deal with a specific environment. Two typical DLAs are called follow-the-corridor and avoid-the-obstacle. Vision simulation is the heart of this system. It has the advantage of avoiding all the problems of pattern recognition involved in robotic vision. As input, we have a database containing the description of the 3D objects of the environment and the camera, characterized by its eye and interest point. As output, the view consists of a 2D array of pixels; each pixel contains the distance between the eye and the point of the object of which this pixel is the projection.

The implementation is based on the IRIS 4D architecture; it extensively uses the Graphics Engine with z-buffer and double frame buffer. The front buffer is used to display the projection of objects, which allows the animator to know what the synthetic actor sees. In the back buffer, for each pixel, the object identifier is stored. Fig. 13 shows an example.
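The following Python sketch illustrates how such a depth image can drive obstacle avoidance: it examines a central band of the 2D distance array and steers toward the freer side. The decision rule is only an illustrative stand-in for the follow-the-corridor and avoid-the-obstacle DLAs.

```python
import numpy as np

def steering_from_vision(depth_image, safety_distance=2.0):
    """Choose a steering direction from a synthetic-vision depth image.

    depth_image: 2D array; each pixel holds the distance from the eye to the visible point,
    as produced by rendering the scene into a z-buffer.  Returns 'left', 'right' or 'straight'.
    """
    height, width = depth_image.shape
    band = depth_image[height // 3: 2 * height // 3]    # central horizontal band of the view
    left, centre, right = np.array_split(band, 3, axis=1)
    clearances = [float(np.min(part)) for part in (left, centre, right)]
    if clearances[1] > safety_distance:
        return "straight"                                # the corridor ahead is free
    # Otherwise turn toward the side with the larger clearance (avoid-the-obstacle).
    return "left" if clearances[0] > clearances[2] else "right"

# Synthetic example: a far wall everywhere except a nearby obstacle on the right of the view.
view = np.full((120, 160), 10.0)
view[:, 100:] = 1.0
print(steering_from_vision(view))   # -> 'left'
```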
5. The actor-actor interface

5.1 Introduction

Actions performed by a synthetic actor are known to another actor and may change his behavior without any action from the animator. This means that there exists some sort of communication between actors. Based on our classification of MCMs, we may distinguish three categories of communication systems: geometric, physical and behavioral.

5.2 Geometric communication systems

Geometric information is transmitted from one actor to another actor. This is typically the case of detecting geometric collisions between the two actors' shapes.

5.3 Physical communication systems

In a physical communication system, an actor may apply a force that changes the motion of another actor. No human animation system has yet included such possibilities. However, if we consider actors made of rigid bodies, it should be possible to extend algorithms like Hahn's algorithm or Baraff's algorithm to this situation.

5.4 Behavioral communication systems

These should be the most important in the future, simply because this is the most common way of communicating between real people. In fact, our behavior is generally based on the response to stimuli31 from other people. In particular, we participate in linguistic and emotional exchanges with other people. Thus the facial expressions of an actor may be in response to the facial expressions of another actor. For example, the actress Marilyn may smile just because the actor Bogey smiles.
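In its simplest form, such a behavioral communication system reduces to a rule mapping a perceived emotion to a facial response; the tiny Python sketch below is purely illustrative, and the rules are hypothetical.

```python
# A minimal, illustrative behavioral communication rule: the perceived emotion of one actor
# is mapped to the facial response of the other (the rule table is hypothetical).
RESPONSE_RULES = {
    "smile": "smile",        # Marilyn smiles because Bogey smiles
    "anger": "sadness",
    "surprise": "surprise",
}

def respond_to(perceived_emotion, default="neutral"):
    """Return the facial expression an actor produces in response to a perceived emotion."""
    return RESPONSE_RULES.get(perceived_emotion, default)

print(respond_to("smile"))   # -> 'smile'
```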
6. The animator-actor interface

6.1 Introduction

In the context of interactive animation systems, the relationship between the animator and the synthetic actors should be privileged. With the existence of graphics workstations able to display complex scenes containing several thousand polygons at interactive speed, and with the advent of such new interactive devices as the Spaceball, EyePhone and DataGlove, it is possible to create computer-generated characters based on a full 3D interaction metaphor in which the specifications of deformations or motion are given in real time. True interaction between the animator and the actor requires two-way communication: not only may the animator give commands to the actor, but the actor is also able to answer him. Finally, we may arrive at a virtual reality situation in the context of synthetic actors: complete integration between the animator and the actor. Consider again our three categories of MCMs for driving the motion of synthetic actors in this virtual reality context. Is it possible to have two-way communication between the animator and the actor at the geometric level, at the physical level and at the behavioral level? The answer is yes. But consider first a classical example of bidirectional communication: human-machine speech communication. As shown in Fig. 14a, the operator speaks using a microphone, and the phonemes and words are recognized by a speech recognizer program that forms sentences. On the basis of these sentences, answers or new sentences are composed. A speech synthesizer
generates the corresponding sounds, which are amplified and may be heard by the operator. This two-way communication is technically possible for our three types of MCMs, so we return to the topic of communication systems.

6.2 Geometric communication systems

At the geometric level, 3D devices allow the animator to communicate any geometric information to the actor. For example, the animator may use a spaceball to define a trajectory to be followed by the actor. He may use a dataglove for defining a certain number of hand positions.
Fig. 14a The speech communication (mouth and ears of the real person; microphone, speech recognizer, dialog coordinator, speech synthesizer, amplifier and loudspeakers on the computer side)

Fig. 14b The hand gesture communication (hands and eyes of the real person; dataglove controller, gesture recognizer, dialog coordinator and hand animation system producing the hand animation sequence)

Fig. 14c The force communication (the real person's hand; force-sensitive device, physical model controller, force coordinator, force synthesizer and force feedback device)

Fig. 14d The emotional communication (face and eyes of the real person; camera and living video digitizer, emotion recognizer, emotional dialog coordinator and high-level facial animation system producing the facial animation sequence)

This possibility may be exploited to create dialogue based on hand gestures, such as a dialogue between a deaf animator and a deaf synthetic actor using American Sign Language. The animator signs using two datagloves, and the coordinates are transmitted to the computer.
Then a sign-language recognition program interprets these coordinates in order to recognize gestures. A dialogue coordination program then generates an answer or a new sentence. The sentences are then translated into hand signs and given to a hand animation program which generates the appropriate hand positions, as shown in Fig. 15. The complete process is explained in Fig. 14b.

6.3 Physical communication systems

Devices like the Spaceball are able to communicate forces and torques to the workstation. This permits the creation of object paths based on the input of these forces and torques, as shown in Section 3. In the same way, using a force transducer, a force or a torque may be communicated to an actor, who can himself apply a force that may be felt by the animator using a force feedback device. It is for example possible to simulate a virtual reality scene where the animator and the actor tug on the two ends of a rope. The complete process is explained in Fig. 14c.

6.4 Behavioral communication systems

At the behavioral level, we consider emotional communication between the actor and the animator. We may restrict emotions to a few, such as happiness, anger and sadness, and consider only facial expressions as manifestations of these emotions. In such a behavioral communication system or, more accurately in this case, an emotional communication system, the animator may smile; his face is recorded in real time using a device like the living video digitizer and the emotion is detected using an image processing program. The dialog coordinator decides which emotion should be generated in response to the received emotion. This emotion is translated into facial expressions to be generated by the facial animation system. The complete process is explained in Fig. 14d. Consider the example where Marilyn smiles when the animator smiles. The difficulty in such a process is to decide whether the animator is smiling, based on the analysis of the image captured by the living video digitizer. At present, only small images with rather limited processing are possible with current hardware; this implies that the detection of subtleties of the face is not yet feasible.

6.5 A case study of a dynamic communication system: Physically-Based Interactive Motion Control Using 3D Input Devices

In this application32, naturalistic interaction and realistic-looking motion are achieved by using a physically-based model of the virtual object's behavior. The approach consists of creating an abstract physical model of the object, using the laws of classical mechanics, which is used to simulate the virtual object's motion in real time in response to force data from the various 3D input devices (e.g. the Spaceball or DataGlove). The behavior of the model is determined by several physical parameters, such as mass, moment of inertia and various friction coefficients, which can all be varied interactively, and by constraints on the object's degrees of freedom, which can be simulated by setting certain friction parameters to very high values. This allows us to explore a continuous range of physically-based metaphors for controlling object motion. A physically-based object control model provides a powerful, general-purpose metaphor for controlling virtual objects in interactive 3D environments. Because it is based on a real model, it is natural for the user to control. Its parameters are physically based and, therefore, easy to understand and intuitive for the user to manipulate.
Its generality and control parameters make it configurable to emulate a continuum of object behaviors ranging from pure position control to pure acceleration control. As it is fully described by its physical parameters, it is possible to construct more sophisticated virtual object control metaphors by varying the parameters as a function of space, time, application data or other user input. Also, when used with force-calibrated input devices, the object metaphor can be reproduced exactly on different hardware and software platforms, providing a predictable standard interactive "feel". Obviously, pressure-sensitive input devices are usually more appropriate because they provide
a passive form of "force feedback". In our case, the device that gave the best results is the Spaceball. The relationship between device input and virtual object motion is not as straightforward as one might think. Usually, some sort of mathematical function or "filter" has to be placed between the raw 3D input device data and the virtual camera viewing parameters. Several recent papers have proposed and compared different metaphors for virtual camera motion control in virtual environments using input devices with six degrees of freedom33,34. These metaphors are usually based on a kinematic model of control, where the virtual camera position, orientation or velocity is set as a direct function of an input device coordinate. The interactive object control metaphor is based on physical modeling of the virtual object, using forward dynamics for motion specification. The important mechanical properties of this model which affect its motion are its mass, its moments of inertia, and the coefficients of friction and elastic forces imposed by the object. The general motion of the rigid body can be decomposed into a linear motion of its center of mass under the control of an external net force and a rotational motion about the center of mass under the control of an external net torque. Fig. 16 shows the principle of physically-based interactive camera motion control using a Spaceball and a Polhemus digitizer.
Figure 16. Principle of a physically-based interactive camera motion control (at each clock tick, the raw input from the Spaceball and the Polhemus digitizer is mapped by the virtual devices to a force and a torque, the physical model controller updates the transformation, and the camera view is redrawn)
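A minimal sketch of such a control loop is given below: at each clock tick the net device force moves the center of mass and the net device torque rotates the body, with viscous friction terms playing the role of the configurable friction parameters. The rotational dynamics are deliberately simplified (diagonal inertia, small-angle orientation, no gyroscopic term), and the class and parameter names are assumptions, not the published system32.

```python
import numpy as np

class PhysicalObjectModel:
    """Minimal forward-dynamics controller for a virtual object driven by device forces.

    The state is integrated once per clock tick; very large friction values effectively
    freeze a degree of freedom, which is how constraints are emulated in this sketch.
    """
    def __init__(self, mass=1.0, inertia=(0.1, 0.1, 0.1),
                 linear_friction=0.5, angular_friction=0.5):
        self.mass = mass
        self.inertia = np.array(inertia)          # diagonal inertia tensor (body axes)
        self.linear_friction = linear_friction
        self.angular_friction = angular_friction
        self.position = np.zeros(3)
        self.orientation = np.zeros(3)            # small-angle orientation, illustration only
        self.velocity = np.zeros(3)
        self.angular_velocity = np.zeros(3)

    def tick(self, device_force, device_torque, dt=1.0 / 60.0):
        """One step: the net force moves the centre of mass, the net torque rotates the body."""
        force = np.asarray(device_force) - self.linear_friction * self.velocity
        torque = np.asarray(device_torque) - self.angular_friction * self.angular_velocity
        self.velocity += (force / self.mass) * dt
        self.angular_velocity += (torque / self.inertia) * dt
        self.position += self.velocity * dt
        self.orientation += self.angular_velocity * dt
        return self.position, self.orientation

# Example: a constant Spaceball-like push along x and a small twist, for one second of ticks.
model = PhysicalObjectModel()
for _ in range(60):
    model.tick(device_force=[1.0, 0.0, 0.0], device_torque=[0.0, 0.0, 0.1])
print(model.position, model.orientation)
```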
Conclusion

Through our examples of geometric, physical and behavioral MCMs, we have shown that our classification of animation techniques according to Motion Control Method and Actor Interface accommodates most types of interface. In current behavioral MCMs, we are in effect conflating the semantics and the syntax of behavior; in fact, they are intrinsically linked. Future work on behavioral MCMs will perhaps entail a distinction between these two levels: a properly syntactic behavioral MCM and a subjective or semantic MCM. The syntactic behavioral MCM would be closely linked to formal languages of signs. The semantic behavioral MCM would take into account emotions, intentions, interpretations and reasonings, emotional reactions and the other causalities that relate all these thoughts and feelings. This is a more complex model.
Consider an example. If an actor says to another actor or to the animator "I am happy to see you", the syntactic aspect and the standard corresponding facial expressions can be generated relatively easily. Now, what intervenes between this utterance and the response of a real animator or another actor? At the syntactic level, the sentence and facial expressions are mechanically recognized. The interpretation of these and the assimilation of the interpreted meaning must be combined with the knowledge of and attitudes towards the actor in the present context, giving rise to a logical conclusion or emotional reaction which in turn generates a response at the syntactic level. This semantic processing and its control by the animator will privilege a largely different type of information from current behavioral models. Future work will certainly focus on this topic.
Acknowledgments

The authors would like to thank Arghyro Paouri for the design of most of the pictures in this paper. They are also grateful to the referees for their helpful comments. This research was supported by "Le Fonds National Suisse pour la Recherche Scientifique", the Natural Sciences and Engineering Research Council of Canada and the FCAR foundation in Quebec.
References

1. Lee P, Wei S, Zhao J, Badler NI (1990) Strength Guided Motion, Proc. SIGGRAPH '90, Computer Graphics, Vol. 24, No 4, pp. 253-262
2. Ekman P, Friesen W (1978) Facial Action Coding System, Consulting Psychologists Press, Palo Alto
3. Magnenat-Thalmann N, Thalmann D (1990) Synthetic Actors in Computer-Generated Films, Springer-Verlag, Heidelberg
4. Boulic R, Magnenat-Thalmann N, Thalmann D (1990) A Global Human Walking Model with Real Time Kinematic Personification, The Visual Computer, Vol. 6, No 6, pp. 344-358
5. Bruderlin A, Calvert TW (1989) Goal Directed, Dynamic Animation of Human Walking, Proc. SIGGRAPH '89, Computer Graphics, Vol. 23, No 3, pp. 233-242
6. Girard M (1990) Constrained Optimization of Articulated Animal Movement in Computer Animation, in: Badler NI, Barsky BA, Zeltzer D (eds) Making Them Move, Morgan Kaufmann, San Mateo, California, pp. 209-232
7. LeBlanc A, Turner R, Magnenat Thalmann N (1991) Rendering Hair using Pixel Blending and Shadow Buffers, The Journal of Visualization and Computer Animation, Vol. 2, No 3
8. Kalra P, Mangili A, Magnenat-Thalmann N, Thalmann D (1991) SMILE: a Multilayered Facial Animation System, Proc. IFIP Conference on Modelling in Computer Graphics, Springer, Tokyo, Japan
9. Magnenat-Thalmann N, Primeau E, Thalmann D (1988) Abstract Muscle Action Procedures for Human Face Animation, The Visual Computer, Vol. 3, No 5
10. Lozano-Perez T, Wesley MA (1979) An Algorithm for Planning Collision-Free Paths Among Polyhedral Obstacles, Comm. ACM, Vol. 22, No 10, pp. 560-570
11. Schröder P, Zeltzer D (1988) Pathplanning inside Bolio, in: Synthetic Actors: The Impact of Artificial Intelligence and Robotics on Animation, Course Notes SIGGRAPH '88, ACM, pp. 194-207
12. Brooks RA (1983) Planning Collision-Free Motions for Pick-and-Place Operations, International Journal of Robotics, Vol. 2, No 4, pp. 19-26
13. Breen DE (1989) Choreographing Goal-Oriented Motion Using Cost Functions, in: Magnenat-Thalmann N, Thalmann D (eds) State-of-the-Art in Computer Animation, Springer, Tokyo, pp. 141-152
14. Moore M, Wilhelms J (1988) Collision Detection and Response for Computer Animation, Proc. SIGGRAPH '88, Computer Graphics, Vol. 22, No 4, pp. 289-298
15. Baraff D (1989) Analytical Methods for Dynamic Simulation of Non-Penetrating Rigid Bodies, Proc. SIGGRAPH '89, Computer Graphics, Vol. 23, No 3, pp. 223-232
16. Baraff D (1990) Curved Surfaces and Coherence for Non-Penetrating Rigid Body Simulation, Proc. SIGGRAPH '90, Computer Graphics, Vol. 24, No 4, pp. 19-28
17. Von Herzen B, Barr AH, Zatz HR (1990) Geometric Collisions for Time-Dependent Parametric Surfaces, Proc. SIGGRAPH '90, Computer Graphics, Vol. 24, No 4, pp. 39-46
18. Terzopoulos D, Platt J, Barr A, Fleischer K (1987) Elastically Deformable Models, Proc. SIGGRAPH '87, Computer Graphics, Vol. 21, No 4, pp. 205-214
19. Platt JC, Barr AH (1988) Constraint Methods for Flexible Models, Proc. SIGGRAPH '88, Computer Graphics, Vol. 22, No 4, pp. 279-288
20. Lafleur B, Magnenat Thalmann N, Thalmann D (1991) Cloth Animation with Self-Collision Detection, Proc. IFIP Conference on Modeling in Computer Graphics, Springer, Tokyo
21. Hahn JK (1988) Realistic Animation of Rigid Bodies, Proc. SIGGRAPH '88, Computer Graphics, Vol. 22, No 4, pp. 299-308
22. Terzopoulos D, Fleischer K (1988) Modeling Inelastic Deformation: Viscoelasticity, Plasticity, Fracture, Proc. SIGGRAPH '88, Computer Graphics, Vol. 22, No 4, pp. 269-278
23. Gourret JP, Magnenat Thalmann N, Thalmann D (1989) Simulation of Object and Human Skin Deformation in a Grasping Task, Proc. SIGGRAPH '89, Computer Graphics, Vol. 23, No 3, pp. 21-30
24. Magnenat-Thalmann N, Thalmann D (1987) The Direction of Synthetic Actors in the Film Rendez-vous à Montréal, IEEE Computer Graphics and Applications, Vol. 7, No 12, pp. 9-19
25. Weil J (1986) The Synthesis of Cloth Objects, Proc. SIGGRAPH '86, Computer Graphics, Vol. 20, No 4, pp. 49-54
26. Hinds BK, McCartney J (1990) Interactive Garment Design, The Visual Computer, Vol. 6, pp. 53-61
27. Aono M (1990) A Wrinkle Propagation Model for Cloth, Proc. Computer Graphics International '90, Springer-Verlag, Tokyo, pp. 96-115
28. Kunii TL, Gotoda H (1990) Modeling and Animation of Garment Wrinkle Formation Processes, Proc. Computer Animation '90, Springer, Tokyo, pp. 131-147
29. Magnenat Thalmann N, Yang Y, Thalmann D (1991) The Problematics of Cloth Modeling and Animation, Proc. 2nd International Conference on CAD and CG, Hangzhou, China, International Academic Publishers
30. Renault O, Magnenat-Thalmann N, Thalmann D (1990) A Vision-based Approach to Behavioral Animation, The Journal of Visualization and Computer Animation, Vol. 1, No 1
31. Wilhelms J (1990) A "Notion" for Interactive Behavioral Animation Control, IEEE Computer Graphics and Applications, Vol. 10, No 3, pp. 14-22
32. Turner R, Balaguer F, Gobbetti E, Thalmann D (1991) Physically-Based Interactive Camera Motion Control Using 3D Input Devices, Proc. Computer Graphics International '91, MIT, Boston
33. Ware C, Osborne S (1990) Exploration and Virtual Camera Control in Virtual Three Dimensional Environments, Proc. 1990 Workshop on Interactive 3D Graphics, ACM, pp. 175-183
34. Mackinlay JD, Card SK, Robertson G (1990) Rapid Controlled Movement Through a Virtual 3D Workspace, Proc. SIGGRAPH '90, Computer Graphics, Vol. 24, No 4, pp. 171-176
List of captions for color pictures
Figure 1. A motion difficult to achieve without dynamics
Figure 2. Body deformations (shoulder)
Figure 3. Individualized walking
Figure 4. A frame from the film Still Walking
Figure 5. Human faces with hair rendering
Figure 6. Synthetic Marilyn with hair
Figure 7. Facial expression
Figure 8. Object grasping (from the film IAD)
Figure 9. Wind effect (frame from the film Flashback)
Figure 10. 4 frames from the film Flashback
Figure 11. Cloth modeling
Figure 12. Cloth modeling
Figure 13. Vision-based obstacle avoidance
Figure 15. American Sign Language