INTERACTIVE EVOLUTIONARY APPROACH TO CHARACTER ANIMATION

M Nordin Zakaria
Faculty of Computer Science and Information Technology
Universiti Putra Malaysia, Malaysia
[email protected]

ABSTRACT

We describe in this paper an approach to the animation of articulated figures using a combination of forward kinematics, neural nets and genetic algorithms. A set of kinematic motion data for an articulated model is evolved to generate a set of varied motions that perform according to an objective function set by the user. A neural net is used to constrain the search space of possible motions. User interaction with the process ensures that a certain style evolves from the myriad of possibilities. We show how our technique makes it possible to create libraries of reusable motions from a small number of example motions.

KEYWORDS: Computer Animation, Genetic Algorithm

1. INTRODUCTION

This paper is concerned with the semi-automatic generation of kinematics-based computer animation of articulated figures. The focus is on character animation as an art form, rather than as a physics- or biomechanics-based simulation. Character animation is perceived as a creative, non-routine [Rosen97] design process in which knowledge about the relationships between the requirements for a design and the form required to satisfy those requirements is intuitive rather than methodological. The animator's own vision and feel for the character and its 'illusion of life' set the requirements. There is no clear mapping from requirement to exact form of the animation. The animator creates in passes, engaging in trial and error and refining the current best set of solutions. The task is in essence an evolutionary act. The idea of using genetic algorithms to generate character animation is hence appealing. A genetic algorithm can be used to explore the space of possible animations of a character model. While genetic algorithms have been used to generate controllers for physically modelled articulated figures [Sims94], our approach relies on kinematic data rather than a dynamic model. A set of kinematic motion data for an articulated model is evolved to generate a set of varied motions that perform according to an objective function set by the user. A neural net is used to constrain the search space of possible motions. User interaction with the process ensures that a certain style evolves from the myriad of possibilities.

The idea is that instead of tediously toiling over the kinematic or physics-based details of possible character motions, the animator plays a reactive or supervisory role while the computer does the exploration. The user needs only to supply a small set of initial motions; a wide range of motions is then evolved from this much smaller set. As in [Ventr95], the emphasis is on "optimization of expressivity". The animator is in the optimization loop to aesthetically influence the evolution of motion.

2. CURRENT APPROACHES

Motion control of articulated figures is by nature tedious. Animation of articulated models has commonly been done using keyframing. High-quality results are the product of highly developed human skill and artistry: figures are posed painstakingly for every keyframe using either forward or inverse kinematics. In the bigger studios, motion capture is commonly used for familiar figures such as humans and animals. Motion capture relies on kinematic knowledge of motion and locomotion [Boulc90]. Relying on the motion of real actors, the technique can produce startlingly convincing results. The cost, however, is prohibitive, and much work is involved in getting the right motion for a character. Using keyframing or motion capture, the end result sought is a library of reusable motions for a specific character. The set forms a basis set of motions from which additional motion can be warped [Witkn95] or interpolated [Wiley97, Rose98].

In much of the computer animation research community, the holy grail of research is often implied to be the complete simulation of the physical world, complete with cognitive-inspired modelling of the minds and emotions that drive the motion and reaction of intelligent or lifelike articulated figures. In contrast to kinematic approaches, dynamic approaches [Raibt91, Hodgn95, Grzes98] rely on knowledge about the physics and biomechanics of the motion. The goal is to leave the issue of motion control to the laws of forces: an arm moves due to forces acting on it, a character falls due to the pull of gravity. The problem then becomes one of specifying the forces. While animating a character falling under gravity is relatively trivial using dynamics packages such as SD/FAST [SD/FAST] and MathEngine [MathEngine], putting the character to work on motions even as seemingly simple as walking has drawn intensive research, testifying to the difficulty of the approach. Moreover, expecting the animator to draw upon intimate knowledge of forces to animate a character seems hardly a practical idea. A quick fix frequently takes the form of a higher-level user interface that makes animation as easy as providing a high-level specification such as "walk up the stairs". While automation leads to an easy life for anyone who intends to produce a quick animation, it leads to less control by the animator over the animation.
This may be undesirable, especially for the highly skilled animator working on top-notch animation, where the slightest of details sets apart the great from the also-rans. The animation may be good enough for purposes such as ergonomic study, as evident in the Jack Human Modelling and Simulation System [Jack], but it may not be appealing enough to find its way into the entertainment world.

The crux of our approach lies in the genetic algorithm. Genetic algorithms, a class of optimization algorithms, offer a technique for the automatic generation of new motion specifications, while offering limited user control by means of a fitness function. In a genetic algorithm, a Darwinian "survival of the fittest" approach is employed to search for optima in a large multidimensional space. In computer graphics and animation, genetic algorithms have been used for the synthesis of controllers for physically based articulated figures [Sims94], and for the generation of interesting images [Sims93] and designs [Rosen97, Jacks99]. The evolutionary approach allows a more diverse area of the solution space to be traversed compared with other methods; furthermore, its probabilistic selection method directs the random generative process towards meaningful and satisfactory solutions.

The approach in the literature closest to ours is that of Lim and Thalmann [Lim99]. Lim and Thalmann's pro-active interactive evolution synthesizes example motions from a single motion. The primary difference between their approach and ours is that in our approach, new motions evolve from a set that can consist of more than one motion. The newer motions may be the result of crossover of the original motions; as a result, a new motion may be a combination of old motions. For example, a kicking-and-punching action may result from a set that contains a kick motion and a punch motion. There is no mention of a crossover operator in [Lim99]. Also, we use a neural net to constrain the possible motions generated by the evolution.

3. APPROACH

Our approach draws upon a combination of forward kinematics, a neural net and genetic algorithms. For the purpose of discussion, we assume that the model is represented as a hierarchy of rigid links connected by joints. Each joint has up to 3 degrees of freedom (DOF). The variation of each DOF is represented as a discrete function θ(T) of time. The function may be visualized and manipulated as a curve, as shown in Figure 1.

Figure 1: A DOF function (θ against time)
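As a minimal sketch (ours, not the paper's code), a DOF function of this kind can be held as a list of (time, angle) control points and sampled by interpolation; the function and variable names below are hypothetical:

```python
import numpy as np

def sample_dof(control_points, t):
    """Linearly interpolate a DOF curve at normalized time t in [0, 1].

    control_points: list of (time, angle) pairs with times ascending in [0, 1].
    """
    times = np.array([p[0] for p in control_points])
    angles = np.array([p[1] for p in control_points])
    return float(np.interp(t, times, angles))

# The example gene from Figure 3 (normalized times, angles in degrees)
gene = [(0.0, 10), (0.1, 20), (0.3, 15), (0.5, 12),
        (0.7, 9), (0.75, 8), (0.86, 20), (1.0, 10)]
print(sample_dof(gene, 0.2))   # midway between 20 and 15 -> 17.5
```

Storing only control points and interpolating on demand is what lets a DOF set stay small regardless of the number of frames.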

A set of DOF functions for a character model is specified for a defined duration. We shall refer to this set simply as a DOF set. Not every joint angle for each and every frame or time unit need be stored in a DOF set; only the control points of the representative curve need be stored. The user needs to specify a collection of q DOF sets, D1 ... Dq, each set corresponding to a distinct character motion. The collection need not be big; it only has to be sufficiently varied. The motion data may be handcrafted, or it may be motion-captured. It may even have been simulated by dynamics-based models. For a model with n joints, each of which has m DOFs, there would be, for each DOF set, n*m motion curves to deal with. Let the ordered set of n*m curves be denoted by s = (c1, c2, ..., cn*m). Each such ordered set forms a genome in our genetic algorithm. Among the DOF functions in a DOF set that the user has supplied, he or she marks one or more DOF functions as 'interesting'. We denote the subset of DOF functions in each DOF set Di marked as interesting by di, i = 1..q. The genetic algorithm first creates its initial population by forming an ordered Cartesian product of the di. Mutation, crossover and breeding operators are then performed on this population for a finite number of generations. Figure 2 illustrates the concept involved here.
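The Cartesian-product initialization can be sketched as follows. This is an illustrative reconstruction under assumed data structures (a DOF set as a dict from DOF name to control-point curve), not the paper's implementation:

```python
import itertools

# Hypothetical data: the 'interesting' subsets d1, d2 of two user-supplied
# DOF sets, each mapping a DOF name to a (time, angle) control-point curve.
d1 = {"hip_x": [(0.0, 0), (0.5, 30), (1.0, 0)],
      "knee_x": [(0.0, 0), (0.5, -45), (1.0, 0)]}
d2 = {"hip_x": [(0.0, 10), (0.5, 60), (1.0, 10)],
      "knee_x": [(0.0, 0), (0.4, -90), (1.0, 0)]}

def initial_population(interesting_sets):
    """Form the ordered Cartesian product of the interesting DOF subsets:
    each genome picks, for every DOF, a curve from one of the supplied sets."""
    dofs = sorted(interesting_sets[0])            # fixed DOF ordering
    choices = [[s[dof] for s in interesting_sets] for dof in dofs]
    return [dict(zip(dofs, combo)) for combo in itertools.product(*choices)]

pop = initial_population([d1, d2])
print(len(pop))   # 2 DOFs x 2 sets -> 4 genomes
```

The product mixes curves across the user's example motions, so the very first generation already contains combinations of motions rather than copies of the examples.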

3.1. DETAILS OF THE GENETIC ALGORITHM

The three generic tasks in using a genetic algorithm, regardless of the exact form of the algorithm itself, are:

1. Defining a representation
2. Defining the genetic operators
3. Defining the objective function

3.1.1. DEFINING A REPRESENTATION

We seek a representation that is minimal but completely expressive. The representation should not contain information beyond that needed to represent a solution to the problem, and it should be able to represent any solution to the problem. In our approach, each genome is an array, whereby each element of this array is a 2D array. The inner 2D array holds the discrete values of a DOF function: the first dimension holds the values of normalized time (ranging from 0.0 to 1.0), while the second dimension holds the rotational values. Figure 3 illustrates an example gene, while Figure 4 illustrates an example genome.

time:  0.0  0.1  0.3  0.5  0.7  0.75  0.86  1.0
angle: 10   20   15   12   9    8     20    10

Figure 3: An Example Gene

Figure 2: Initializing a Population

Before evolution commences, our approach relies on the user providing knowledge to the system on the type of motions possible. A neural net is trained so that the system recognizes a motion when it 'sees' one. For example, the user can supply the system with a moderate training set of motions that correspond to walking, running, kicking, punching, and so on. The recognition capability is used to constrain the type of solutions produced by the genetic algorithm.

Note that the number of control points for each DOF function need not be the same. In an actual implementation, however, we need to set the maximum possible number of control points. One concern in genetic algorithms is how to design a representation such that it cannot represent infeasible solutions to a problem. In the animation problem, it is important that the character not be led to perform motions considered undesirable or impossible (such as rotating the leg 360 degrees about the x-axis). While there is nothing in our representation to prevent such an occurrence, we deal with the issue by using an evaluation function that assigns a low score to undesirable or impossible motions.

Gene 1: (0.0, 0) (0.2, 2) (0.4, 5) (0.6, 6) (0.8, 6.5) (1.0, 8)
Gene 2: (0.0, 10) (0.1, 20) (0.2, 15) (0.3, 12) (0.5, 9) (0.7, 8) (0.8, 20) (1.0, 10)
Gene 3: (0.0, 10) (0.1, 20) (0.3, 15) (0.5, 12) (0.7, 9) (0.7, 8) (0.8, 20) (1.0, 10)
Gene 4: (0.0, 0) (0.2, 2) (0.3, 15) (0.5, 6) (0.6, 5) (0.8, 18) (1.0, 19)
Gene 5: (0.0, 10) (0.2, 20) (0.3, 15) (0.5, 12) (0.7, 19) (0.7, 18) (0.9, 20) (1.0, 24)

Figure 4: An Example Genome
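As a sketch of ours (not the paper's code), a genome like the one in Figure 4 can be held as a list of control-point lists, with the perturbation mutator and single-point crossover of the next subsection acting directly on it; all names and parameter values below are hypothetical:

```python
import random

# A genome is a list of genes; each gene is a DOF curve given as
# (normalized time, angle) control points, as in Figures 3 and 4.
genome_a = [
    [(0.0, 0), (0.2, 2), (0.4, 5), (0.6, 6), (0.8, 6.5), (1.0, 8)],
    [(0.0, 10), (0.1, 20), (0.3, 15), (0.5, 12), (1.0, 10)],
]
genome_b = [
    [(0.0, 5), (0.5, 25), (1.0, 5)],
    [(0.0, 0), (0.4, -30), (1.0, 0)],
]

def mutate(genome, rate=0.1, sigma=5.0):
    """Randomly perturb control-point angles with probability `rate`."""
    return [[(t, a + random.gauss(0.0, sigma)) if random.random() < rate
             else (t, a) for (t, a) in gene] for gene in genome]

def crossover(mum, dad):
    """Single-point crossover: first k genes from one parent, rest from the other."""
    k = random.randint(1, len(mum) - 1)
    return mum[:k] + dad[k:]

child = crossover(genome_a, genome_b)
assert len(child) == len(genome_a)
```

Because a gene is a whole DOF curve, crossover swaps complete per-joint motions between parents, which is what allows, say, a kick and a punch to combine into one motion.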

3.1.2. DEFINING GENETIC OPERATORS

Each genome has three primary operators: initialization, mutation, and crossover. The initialization operator determines how the genome is initialized; it is called whenever the genetic algorithm or its population is initialized. We have already described the mechanism of initialization at the beginning of this section: the genetic algorithm initializes its population by forming an ordered Cartesian product of subsets of its DOF sets. The mutation operator defines the procedure for mutating each genome. In many introductory genetic algorithm examples, a typical mutator for a binary string genome simply flips the bits in the string with a defined probability. In our approach, we randomly perturb the control points for each DOF in each DOF set. The crossover operator defines the procedure for generating a child from two parent genomes. We use a standard single-point crossover to generate offspring: given two individuals (candidate solutions to a problem) in a population, the genetic algorithm generates a child solution by randomly taking some of the genes from one individual and the remaining genes from the other. The structure of the genome is stored in a single data structure. The genetic algorithm creates a population of solutions based on this data structure and then operates on it to evolve the best solution. We have experimented with the simple algorithm described by Goldberg in [Goldb89]. Goldberg's standard simple genetic algorithm uses non-overlapping populations, with elitism optional; each generation, the algorithm creates an entirely new population of individuals. Possibilities for future experimentation include steady-state, incremental and deme algorithms, which differ in the way they create new individuals and replace old individuals during the course of an evolution.

3.1.3. FITNESS EVALUATION

Any reasonable objective function may be used to evaluate an animation. Examples of such evaluation functions would be the maximum height of an arm swing, the duration of a kick, or the proximity of an arm's end-effector to a spatial position. Hence, if we wish to constrain the end-effector of a model to follow a certain path, we can create an objective function that assigns a high score to genomes that fulfill this requirement, and a low score to others. The objective function, together with the neural net, defines the kinematics of motion the user desires; both place kinematic constraints on the motion. In evolving animations, however, as in the evolution of engineering designs, many subjective requirements comprise critical aspects of the evaluation that cannot be formulated objectively. One way to deal with this is to allow users to interact with the process by selecting solutions for propagation, or by ranking solutions based on visual preferences. Rather than relying strictly on an objective function, as is the case in traditional genetic algorithms, the approach exploits human adeptness at visual judgement. The product may not be perfectly physically accurate; it is up to the user to influence the evolution towards realistic or cartoon-like animation. As Ventrella has said, animation is more about motion-based expression and the communication of certain ideas and feelings than it is about accurate simulation of physics or animal behavior [Ventr95]. Cartoon laws may reign supreme: physics and biological realism may be violated in favor of expressive extravagance and character personalization. In his study on the genetic evolution of physically initiated motion, Ventrella noted that any resulting inconsistencies with true-world motion are negligible and almost never detract from people's attention to the engaging issues of evolution and the development of a figure's body language.
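An evaluation function of the kind described, combined with the neural-net constraint and the low-score treatment of undesirable motions, might look like the following sketch. The helpers `classify` (wrapping the trained recognizer) and `end_effector_at` (wrapping forward kinematics) are assumptions, not the paper's API:

```python
import math

def kick_fitness(genome, classify, end_effector_at, box_pos, frames=100):
    """Score a genome by how often the figure's foot reaches the box, while
    scoring motions the recognizer rejects out of the population.

    classify(genome) -> motion label; end_effector_at(genome, t) -> (x, y, z).
    Both are hypothetical helpers wrapping the recognizer and forward kinematics.
    """
    if classify(genome) != "kick":          # neural-net (kinematic) constraint
        return 0.0
    hits = 0
    for i in range(frames):
        t = i / (frames - 1)
        if math.dist(end_effector_at(genome, t), box_pos) < 0.1:
            hits += 1
    return float(hits)
```

Sampling only a subset of frames instead of all of them, as discussed in the conclusion, trades accuracy of the score for evaluation speed.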

3.2. AUTOMATED RECOGNITION

The curve corresponding to the control points in a DOF set is interpolated and serves as an indicator of the type of motion for a particular segment. The discrete points making up the digital representation of the curve are stored in an array whose length is equal to the number of frames or time units for the motion. Automated recognition, by means of a neural net, is used to delimit the search space of the genetic algorithm. We do not directly use the original curve data points as the input to the neural net; instead, we reduce their size by means of a Fourier transformation. Recognition of the overall motion or motions of a figure occurs in parts. The movable limbs of the figure are logically partitioned into separate sections. For example, the parts of a leg (the upper leg, the lower leg and the foot) form one section, while the parts of an arm (the upper arm, the lower arm and the hand) form another. The Fourier representations of the parts in any one section are concatenated together to form a single input vector to the neural net. The following two subsections describe this in greater detail.

3.2.1. FOURIER REPRESENTATION

Any signal can be made up by adding together the correct sine waves with appropriate amplitudes and phases. We imagine each DOF function as comprising samples of a continuous motion signal. Let fk be the discrete samples of a continuous motion signal f(t). If f(kT) and F(r s0) are the kth and rth samples of f(t) and its spectrum F(s) respectively, and N0 is the number of samples in one period T0 of the signal, then the discrete Fourier transform (DFT) is defined as

    Fr = Σ_{k=0..N0-1} fk exp(-i r Ω0 k)

where Ω0 = 2π/N0, fk = (T0/N0) f(kT), Fr = F(r s0), and s0 = 2π/T0.
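The reduction from a sampled DOF signal to a compact neural-net input can be sketched as follows; the choice of 8 low-frequency coefficients and the function names are our assumptions for illustration:

```python
import numpy as np

def motion_features(signal, n_low=8):
    """Reduce a sampled DOF motion signal to a small feature vector: the first
    n_low coefficients of the real part of its DFT.  For a real signal the
    spectrum is symmetric, so only the first half carries information, and
    most of that sits in the low frequencies."""
    spectrum = np.fft.fft(signal)
    return np.real(spectrum)[:n_low]

def section_vector(signals, n_low=8):
    """Concatenate the feature vectors of all parts in one body section
    (e.g. upper leg, lower leg, foot) into a single neural-net input."""
    return np.concatenate([motion_features(s, n_low) for s in signals])

# Toy check: a 100-frame sinusoidal 'walking' DOF signal for three leg parts
frames = np.arange(100)
leg = 20 * np.sin(2 * np.pi * frames / 50)
vec = section_vector([leg, leg * 0.5, leg * 0.25])
print(vec.shape)   # (24,)
```

Truncating to the low frequencies keeps the input layer small regardless of the number of frames in the motion.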

We use the Fast Fourier Transform (FFT) to transform motion signals into their Fourier representations. Figure 5 shows an example motion signal (a 100-frame walking motion) for the upper leg, together with its real and imaginary spectra. As the motion signal is real, both spectra are symmetric around the center, which means only half of each carries valuable information. It can be seen that almost all the information is contained within the low frequencies. Thus it seems reasonable to assume that these frequencies will also provide the best input for a recognition system. The variance of each frequency across a set of different motions also testifies to its usefulness. We also find that the real spectrum alone is sufficient for recognition. The real spectra for the motion curves of parts that belong to the same section are concatenated together to form a single vector.

3.2.2. BACKPROPAGATION

We train a feedforward backpropagation network to classify an input frequency vector according to the type of motion that it represents. The network that we use consists of 3 layers: the first layer is the input layer, the second layer has weights coming from the input, and the last layer is the network output. All except the last layer have biases. Each output neuron uses a tan-sigmoid transfer function. Each layer's weights and biases are initialized with small random values, based on the technique of Nguyen and Widrow [Nguyn90]. This approach generates initial weights and bias values for a layer such that the active regions of the layer's neurons are distributed roughly evenly over the input space. We use a Levenberg-Marquardt learning algorithm together with steepest descent with momentum to train the network. Training proceeds initially in batch: the weights and biases of the network are updated only after the entire training set has been applied to the network, with the gradients calculated at each training example added together to determine the change in the weights and biases. To ease the burden on the user of coming up with the perfect training input set, we also allow the user, aside from batch training, to incrementally retrain the network during the course of the interaction.

Figure 5: Example Motion Signal (a), with its Real (b) and Imaginary (c) Spectra

Figure 6: Various Styles of Kicking

4. EXAMPLES

Our implementation is based on Walker, an animation system written by Philip Winston and Kanishka Agarwal as a class project at Harvey Mudd College. The program, simple and open-source yet attractive, proves to be a useful platform for experimenting with genetic algorithms and neural nets for kinematics-based character animation. Much of the genetic algorithm code is derived from GAlib [Galib], while the neural net and digital signal processing functionality is provided by MatLab [MatLab]. We use a simple 10-DOF articulated model, building and running the implementation on a Windows-based PC.

In one of the experiments we have conducted, we first supply the system with a set of motions corresponding to walking, running, kicking and dunking. We batch train the system to recognize all these actions, including actions that are considered impossible or undesirable. Next we set the constraint to be such that the motion evolved must be a kick, and we use as an evaluation function the number of times the figure kicks a box placed in front of it. Figure 6 shows the various styles of kicking produced by the figure after 5 generations. Figure 7 shows the results when the box is raised higher and the evolution has run for 30 generations; the leftmost image shows the original kick, while the other images show various attempts to kick the box. In another experiment, we followed a similar procedure, but constrained the motion such that the figure walks without kicking a box placed on the floor in its path. Figure 8 shows the result, with the first image in the figure showing the original walking motion. In both our experiments, each generation takes approximately 8 s to evolve.

Figure 7: Kicking Higher than Usual

5. CONCLUSION

Figure 8: Walk but Don't Kick the Box

We have described a simple approach to applying genetic algorithms and neural nets in concert to generate kinematics-based character animation. We have shown how a set of kinematic motion data for an articulated model can be evolved to generate a set of varied motions that perform according to an objective function set by the user. A neural net is used to constrain the search space of possible motions. User interaction with the process ensures that a certain style evolves from the myriad of possibilities. Using our approach, instead of tediously toiling over the kinematic or physics-based details of possible character motions, the animator plays a reactive or supervisory role while the computer does the exploration.

There are a number of areas that we need to address before the method described can be considered mature. A primary issue is the computational requirement, both efficiency- and memory-wise. Memory and processing time are proportional to the number of DOFs and the size of each population in the evolution. Our current implementation uses 8 Kbytes per genome. To reduce memory consumption, one can either reduce the number of DOFs in the model or reduce the size of the population in the genetic evolution. The bulk of the computation lies in the evaluation of the population: in each generation, each genome has to be evaluated to determine its fitness relative to the rest of the population. The amount of computation needed for each genome depends on how we design the evaluation function. An evaluation function that requires evaluating the orientation of the model at each frame or time unit for the entire duration of a motion would require more processing than one that requires such information only for a sampling of the total number of frames or time units. The evaluation can be sped up by means of parallelization. We have not implemented a parallel version of the algorithm described here, but parallel genetic algorithms are a well-studied area and should fit well into our problem.

Another important area to explore is the interoperability of the technique with existing techniques and tools. Motion interpolation techniques may be used with the library of motions evolved using our approach. Spacetime constraints [Witkn88, Gleit95, Popov99], a motion editing technique that seeks to preserve the quality of the original motion, may be used to perturb the motion curves for selected DOF functions. Yet another area to be explored is the possibility of evolving entirely novel yet realistic motion. We have experimented with creating variants of predefined motions; it should also be possible to evolve entirely novel motion by placing the right constraints. To understand how this can be done, we need to distinguish between positive and negative constraints. A positive constraint is when we set the evolution to produce motion of a specific type or types. To evolve totally novel motion, we have to apply negative constraints: a negative constraint is when we set the evolution to produce any motion as long as it is not of a specific type or types. The use of automated neural-network-based recognition should make this an almost trivial problem. In conclusion, animation by means of evolution and interaction has the potential of actually reinventing the art and science of character animation. It is hoped that the ideas presented in this paper point the way towards the realization of such potential.
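The positive/negative constraint distinction above can be sketched as a fitness wrapper; `classify` and `objective` are hypothetical helpers (the trained recognizer and a base objective function), not part of the paper's code:

```python
def negative_constraint_fitness(genome, classify, objective,
                                banned=("walk", "run", "kick")):
    """Negative constraint: any motion is acceptable so long as the recognizer
    does not label it as one of the banned, already-known types.  A positive
    constraint would instead require a specific label."""
    if classify(genome) in banned:
        return 0.0          # known motion type: score it out of the evolution
    return objective(genome)
```

Under this scheme, novelty emerges simply because every motion the recognizer already knows is penalized, leaving the objective function free to shape whatever remains.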

REFERENCES

[Boulc90] Boulic R., Thalmann N. M. and Thalmann D., A Global Human Walking Model with Real-Time Kinematic Personification, The Visual Computer, 6:344-358, 1990.
[Galib] GAlib, http://lancet.mit.edu/ga/
[Gleit95] Gleicher M., Motion Editing with Spacetime Constraints, 1997 ACM Symposium on Interactive 3D Graphics, pages 139-148, April 1997.
[Goldb89] Goldberg D. E., Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, 1989.
[Grzes98] Grzeszczuk R., Terzopoulos D. and Hinton G., NeuroAnimator: Fast Neural Network Emulation and Control of Physics-Based Models, SIGGRAPH '98, pages 9-20, 1998.
[Hodgn95] Hodgins J. K. et al., Animating Human Athletics, SIGGRAPH '95, pages 71-78, 1995.
[Jack] Jack Human Modeling and Simulation System, http://www.cis.upenn.edu/~hms/jack.html
[Jacks99] Jackson H., A Symbiotic Coevolutionary Approach to Architecture, Symposium on Creative Evolutionary Systems at the AISB'99 Convention, 6th-7th April 1999.
[Lim99] Lim I. K. and Thalmann D., Pro-actively Interactive Evolution for Computer Animation, Proceedings of Eurographics Workshop on Animation and Simulation '99, pages 45-52, Milan, Italy, September 1999.
[MathEngine] MathEngine, www.mathengine.com
[MatLab] MatLab, www.mathworks.com
[Nguyn90] Nguyen D. and Widrow B., Improving the Learning Speed of 2-Layer Neural Networks by Choosing Initial Values of the Adaptive Weights, Proceedings of the International Joint Conference on Neural Networks, vol 13, pages 21-26, 1990.
[Popov99] Popovic Z. and Witkin A., Physically Based Motion Transformation, SIGGRAPH '99, pages 11-20, 1999.
[Raibt91] Raibert M. H. and Hodgins J. K., Animation of Dynamic Legged Locomotion, SIGGRAPH '91, pages 349-358, July 1991.
[Rose98] Rose C., Cohen M. F. and Bodenheimer B., Verbs and Adverbs: Multidimensional Motion Interpolation, IEEE Computer Graphics and Applications, pages 32-40, September 1998.
[Rosen97] Rosenman M. A., An Exploration into Evolutionary Models for Non-Routine Design, Artificial Intelligence in Engineering, 11(3):287-293, 1997.
[SD/FAST] SD/FAST, www.symdyn.com
[Sims93] Sims K., Interactive Evolution of Equations for Procedural Models, The Visual Computer, v9, pages 466-476, 1993.
[Sims94] Sims K., Evolving Virtual Creatures, SIGGRAPH '94, pages 15-22, 1994.
[Ventr95] Ventrella J., Disney Meets Darwin: The Evolution of Funny Animated Figures, Proceedings of Computer Animation '95, pages 35-43, April 1995.
[Wiley97] Wiley D. and Hahn J. K., Interpolation Synthesis of Articulated Figure Motion, IEEE Computer Graphics and Applications, pages 39-45, November 1997.
[Witkn88] Witkin A. and Kass M., Spacetime Constraints, SIGGRAPH '88, pages 159-168, August 1988.
[Witkn95] Witkin A. and Popovic Z., Motion Warping, SIGGRAPH '95, pages 105-108, 1995.
