
CONSTRUCTION OF ANIMATION MODELS OUT OF CAPTURED DATA Ik Soo Lim and Daniel Thalmann Virtual Reality Lab (VRlab), Swiss Federal Institute of Technology (EPFL) CH-1015 Lausanne, Switzerland [email protected], [email protected]

ABSTRACT

This article describes a method to construct parametric models out of captured motion and skeleton data. Casting the problem as scattered data interpolation, our work is based on a multi-step approximation of the interpolation function, with motion data compressed by principal component analysis. This leads to smaller storage and faster computation than previous approaches based on classical methods of exact interpolation. As a result, motion models can be constructed out of a rich set of example data and still be used for real-time applications. We demonstrate a motion model controllable by attributes including those invariant for each individual, such as age, gender, height and weight. A parametric skeleton model is also constructed and demonstrated.

1. INTRODUCTION

Motion control of articulated figures such as humans has been a challenging task in computer animation[2]. Once an acceptable motion segment has been created, whether by key-framing, motion capture or physical simulation, its reuse is important. This article describes our work on parametric models for the motion and skeleton of human figures based on example data. By separating motion or skeleton generation into a time-consuming pre-process and a fast process based on an efficient data representation, it is well suited to real-time synthesis for applications such as games and virtual reality while taking advantage of a rich set of examples.

Much of the recent research in computer animation has been directed towards editing and reuse of existing motion data. Stylistic variations are learned from a training set of very long unsegmented motion-capture sequences[4]. Interactive multi-resolution motion editing has been proposed for fast and fine-scale control of the motion[11]. Whereas most other methods may produce results violating the laws of mechanics[21], an editing method maintaining physical validity has been suggested[13].
Motion editing is also done in the frequency domain[5][19]. Interpolation of existing motion data is employed for on-the-fly synthesis[15][20]. These two interpolation-based approaches are in the same spirit as ours: they use sets of example motions with an interpolation scheme to construct new motions while providing a set of meaningful, high-level control parameters to an animator or a runtime system. Being based on classical techniques of exact interpolation, however, they have to use as many basis functions as there are example motions[15][20], and a global linear system has to be solved to determine the parameters of the interpolation functions[15]. In addition, naive representations of motion data are used without taking advantage of redundancies among the example motions: for example, over a thousand real-valued interpolation problems have to be solved if captured motion data are used[15]. Due to this high demand on storage and computation, their applications are limited to motion models whose control parameters are only those that vary for a single individual, such as emotional and physical parameters: in walking, happy-tired walks or walks at different radii of turn.

In contrast, our work is based on a multi-step function approximation for the interpolation, with motion data compressed by principal component analysis. Fewer basis functions than examples can then be used, no global linear system needs to be solved, and only a few dozen real-valued interpolations have to be done even if densely sampled motion capture data are used. This enables motion models that are constructed out of a rich set of examples yet computed quickly. Therefore, motion models can be controlled by attributes invariant for each individual, such as age, gender, height and weight, which require many example motions from different individuals: for instance, male-female walks and young-old walks.

This work was supported by the Swiss Federal Office for Education and Science in the framework of the PAVR EU project.
Differences in age, gender, height or weight lead not only to different motions but also to different skeletons. For completeness, parametric skeleton models are therefore also necessary; they are constructed by the same function-approximation technique.

In the following section, the problem is formulated as scattered data interpolation and the pre-processing of motion data necessary for the interpolation is described. Then, a multi-step approximation of functions is introduced for the scattered data interpolation, and results of our approach with captured motion and skeleton data are presented. We conclude with some future work.

2. PRE-PROCESSED DATABASE

A set of similar but distinct example motions $M_i$ from different individuals is used to construct a parametric motion model. Each example motion is annotated by hand with $p_i$, which describes a set of meaningful, high-level control parameters: age, gender, height and weight, which are invariant for each individual, as well as emotional and physical parameters, which vary for each individual. For each example motion, a set of key-times $t_i$ is also specified, which describes the phrasing of the motion, i.e., the relative timing of its structural elements. Out of this example database $\{(M_i, p_i, t_i)\}_{i=1}^{n}$, a function mapping an arbitrary parameter $p$ to its corresponding motion $M$ is to be constructed to serve as a continuous parametric motion model. Since the construction of this continuous function is formulated as a scattered data interpolation problem, the database needs to be pre-processed to be suitable for this blending / interpolation.

2.1. Euclidean Representation of Orientation

Each motion $M_i$ consists of a human figure's position and the orientation of each joint over time. To avoid the drawbacks of Euler angles, unit quaternions are used to represent the orientations. While standard interpolation algorithms for quaternions consider only single-parameter splines, we are faced with a multi-parameter interpolation problem. For Euclidean multi-dimensional interpolation applicable to orientation data, we use a technique based on tangent space interpolation[10]. Unit quaternions have the polar form

$$q = \exp(\theta \hat{n}) = \left( \sin\tfrac{\theta}{2}\, \hat{n},\ \cos\tfrac{\theta}{2} \right),$$

and the log map takes quaternions into the Euclidean tangent space, $\ln q = \theta \hat{n} \in \mathbb{R}^3$, where Euclidean multi-dimensional interpolation is applicable:

1. Map the example quaternions into log space: $y_i = \ln q_i$

2. Use a Euclidean interpolation method: $y_{interp} = F(x; E')$

3. Lift the result to a unit quaternion by the exponential map: $q_{interp} = \exp(y_{interp})$

Accordingly, each motion $M_i$ is Euclideanized into $\tilde{M}_i$.
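The three steps above can be sketched in Python with NumPy as follows. This is only a minimal illustration: the function names, the (x, y, z, w) storage order, and the use of a plain weighted average in log space as a stand-in for the interpolation method $F$ are assumptions made for this sketch, not details taken from the paper.

```python
import numpy as np

def quat_log(q):
    """Map a unit quaternion (x, y, z, w) to theta * n in R^3.

    Convention assumed here: the vector part is sin(theta/2) * n and the
    scalar part is cos(theta/2), as in the polar form of Section 2.1.
    """
    q = np.asarray(q, dtype=float)
    v, w = q[:3], q[3]
    s = np.linalg.norm(v)                 # equals sin(theta / 2)
    if s < 1e-12:                         # (near-)identity rotation
        return np.zeros(3)
    theta = 2.0 * np.arctan2(s, w)
    return theta * v / s

def quat_exp(y):
    """Lift a tangent-space vector theta * n back to a unit quaternion."""
    y = np.asarray(y, dtype=float)
    theta = np.linalg.norm(y)
    if theta < 1e-12:
        return np.array([0.0, 0.0, 0.0, 1.0])
    n = y / theta
    return np.concatenate([np.sin(theta / 2.0) * n, [np.cos(theta / 2.0)]])

def blend_orientations(quats, weights):
    """Hypothetical stand-in for F: a weighted average taken in log space."""
    ys = np.array([quat_log(q) for q in quats])           # step 1
    w = np.asarray(weights, dtype=float)
    y_interp = (w[:, None] * ys).sum(axis=0) / w.sum()    # step 2
    return quat_exp(y_interp)                             # step 3
```

For instance, blending the identity with a 90-degree rotation about the z-axis using equal weights gives a 45-degree rotation about z. In the paper's setting, the weighted average would be replaced by the scattered data interpolation of Section 3, applied to each of the three log-space components.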

2.2. Time Reparametrization

The example motions differ not only in duration; structurally similar poses also occur at different time values. For example, in walking, the time at which the left foot strikes the ground differs among the example motions. Thus, a simple blend of them would result in odd motions. Time-warping between them has to be done before any blending, for which the key-times given above can be used[15]. Describing the phrasing of its motion, each set of key-times $t_i = \{t_{ik}\}_{k=1}^{m}$ defines a piecewise linear mapping from $t \in [0, t_{im}]$ to a generic time $\hat{t} \in [0, 1]$:

$$\hat{t}(t) = \left( j - 1 + \frac{t - t_{ij}}{t_{i(j+1)} - t_{ij}} \right) \frac{1}{m-1}$$

for the largest $j$ such that $t > t_{ij}$. Once re-parametrized to this generic time, all examples lie at the same structural point for a given time, regardless of phrasing or overall duration. Accordingly, each Euclideanized motion $\tilde{M}_i$ is warped or resampled into $M'_i$ in this generic time frame.

2.3. Dimension Reduction of Motion Data

For multi-dimensional outputs, the interpolation problem for a real-valued function has to be solved for each output dimension. Hence, a naive representation of captured motion data for human figures typically leads to over a thousand interpolation problems. We use Principal Component Analysis (PCA) to reduce the dimension of the motion data to a few dozen. In our use of PCA, the reduction of dimension comes from the redundancies between motion data of articulated human figures, whereas typical applications of PCA in computer graphics take advantage of redundancies between poses of deformable bodies[1][3][12]: eigenmotions vs. eigenposes. PCA performs a basis transformation to an orthogonal coordinate system formed by the eigenvectors of the covariance matrix computed over the differences between the motions and their average (the eigenvectors being sorted in descending order of their eigenvalues):

1. Compute the average motion $M'_{avg}$ of the $n$ example motions $\{M'_i\}_{i=1}^{n}$: $M'_{avg} = \frac{1}{n} \sum_{i=1}^{n} M'_i$

2. Create a matrix $D$ of the differences $\Delta M'_i = M'_i - M'_{avg}$: $D = \left[ \Delta M'_1, \ldots, \Delta M'_n \right]$

3. Calculate the eigenvectors $e_i$ and eigenvalues $\lambda_i$ of the covariance matrix $C_{M'} = D D^T$

4. If the number of example motions is less than the dimension of the motion data, then a smaller matrix $\hat{C}_{M'} = D^T D$, with eigenvectors $\hat{e}_i$ and eigenvalues $\hat{\lambda}_i$, can be used in computing the eigenvectors and eigenvalues from the following relation[9]: $\lambda_i = \hat{\lambda}_i$, $e_i = \frac{1}{\sqrt{\hat{\lambda}_i}} D \hat{e}_i$
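As an illustration only, the pre-processing of Sections 2.2 and 2.3 might be sketched in Python with NumPy as below. The function names generic_time and pca_motions, the (d, n) column layout of the motion matrix and the argument r (the number of leading eigenvectors to keep) are assumptions made for this sketch and do not come from the original implementation. The sketch uses the smaller matrix of step 4 and assumes that the r retained eigenvalues are strictly positive.

```python
import numpy as np

def generic_time(t, key_times):
    """Piecewise-linear map from clip time t in [0, t_m] to generic time in [0, 1]."""
    k = np.asarray(key_times, dtype=float)     # t_1 < t_2 < ... < t_m
    m = len(k)
    # 0-based index of the largest key-time strictly below t,
    # clamped so that t = t_1 and t = t_m fall in a valid segment.
    j = int(np.clip(np.searchsorted(k, t, side="left") - 1, 0, m - 2))
    return (j + (t - k[j]) / (k[j + 1] - k[j])) / (m - 1)

def pca_motions(M, r):
    """PCA of n example motions stored as the columns of M (shape d x n, d >> n).

    Returns the average motion, the r leading eigenvectors (as columns) and
    the r-dimensional coefficient vector of every example (as columns).
    """
    M = np.asarray(M, dtype=float)
    M_avg = M.mean(axis=1, keepdims=True)      # step 1
    D = M - M_avg                              # step 2: difference matrix
    lam, e_hat = np.linalg.eigh(D.T @ D)       # step 4: small n x n problem
    order = np.argsort(lam)[::-1][:r]          # r largest eigenvalues first
    lam, e_hat = lam[order], e_hat[:, order]
    E = (D @ e_hat) / np.sqrt(lam)             # e_i = D e_hat_i / sqrt(lambda_i)
    alphas = E.T @ D                           # coefficients of each example
    return M_avg, E, alphas
```

A motion is then approximated as M_avg + E @ alpha for its coefficient vector alpha. The eigen-decomposition only involves an n x n matrix, which is the point of step 4 when the motion dimension is far larger than the number of examples.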

Using the eigenvectors of the $r$ largest eigenvalues ($r < n - 1$), motion data can be approximately represented and hence its dimension is reduced:

$$M' = M'_{avg} + \sum_{i=1}^{n-1} \alpha_i e_i \simeq M'_{avg} + \sum_{i=1}^{r} \alpha_i e_i .$$

Hence, each $M'_i$ is approximated by $m_i = (\alpha_1, \ldots, \alpha_r) \in \mathbb{R}^r$.

3. SCATTERED DATA INTERPOLATION

Given a labeled set of the example data, we cast the construction of a parametric motion model as a scattered data interpolation problem, which serves as a canonical formulation of many tasks in computer graphics: given a finite set of examples $E = \{(x_i, f_i)\}_{i=1}^{n}$, construct a real-valued function $f: \mathbb{R}^u \to \mathbb{R}$ such that $f_i = f(x_i)$. For a vector-valued function, a real-valued interpolation function needs to be constructed for each dimension. Construction of closed surfaces from CT or MRI images, derivation of smooth morphing between objects, reconstruction of landscapes from contour maps and many other problems share the same formulation. For this interpolation problem, we use a radial basis function (RBF) network consisting of Gaussians:

$$net^{q}(x) = \sum_{i=1}^{q} w_i \exp\left( -\beta_i \| x - c_i \|^2 \right).$$

Radial basis functions, though with basis functions other than Gaussians, have been used in computer graphics for image warping[16][8], modeling laser scan data[17][7] and multidimensional motion interpolation[15].

The construction of the interpolation function is then realized by determining the number of Gaussians and the weight, variance and center of each Gaussian: $q$ and $\{(w_i, \beta_i, c_i)\}_{i=1}^{q}$. As adopted in most of those RBF applications in computer graphics, the standard approach to this task keeps as many Gaussians as there are example data points and solves a global linear system: $q = n$, $c_i = x_i$, $\beta_i$ chosen by heuristic considerations and, finally, $w_i$ obtained by solving a global linear system[14]. This strategy is known not to scale well, causing both numerical and memory allocation problems as the number of examples grows.

To avoid these drawbacks, we employ a method based on an iterative process which adds one Gaussian at a time in an adaptive way[6]. It can result in an interpolation function with fewer Gaussians than example data points, without solving any global linear system:

1. All the examples from the example set $E$ are presented to the net. Then, the example $(x_s, f_s)$ for which the net approximation is worst is chosen:

$$\left| f_s - net^{q-1}(x_s) \right|^2 = \max_{i=1,\ldots,n} \left| f_i - net^{q-1}(x_i) \right|^2$$

where $q - 1$ is the number of Gaussians present in the net.

2. A new Gaussian is added to the net; its center $c_q$ and weight $w_q$ are determined as follows: $c_q = x_s$, $w_q = f_s - net^{q-1}(x_s)$.

3. The variance $\beta_q$ of the new Gaussian is determined so as to minimize the mean square error of the net, $\varepsilon^{q} = \frac{1}{n} \sum_{i=1}^{n} \left( f_i - net^{q}(x_i) \right)^2$.

4. The whole example set is presented to the new configuration of the net and the error-cost function is computed. If it is lower than a prescribed degree of accuracy, the iteration stops. Otherwise, it continues starting at step 1.

4. PARAMETRIC MODEL CONSTRUCTION

Given labeled sets of the pre-processed motion data $\{(p_i, m_i)\}_{i=1}^{n}$ and the key-time data $\{(p_i, t_i)\}_{i=1}^{n}$, interpolation functions $m$ and $t$ are constructed incrementally as described above: together, these interpolation functions compute un-timewarped motion data for an arbitrary input parameter $p$. Our parametric skeleton model for human figures is also formulated as an interpolation function over a labeled set of example skeleton data. The resulting interpolation functions are mappings from the multi-dimensional parameter space of age, gender, height, weight and other attributes to the space of motion data or skeleton data.

5. RESULTS

From 8 people varying in age, gender, height and weight, we captured 68 ‘punching’ motions over various target positions and also measured skeleton data such as the lengths of arms and legs. After being aligned with respect to time, the motion data went through PCA for dimension reduction. The first 24 eigenvectors were enough to describe 99% of the variation among the motion data, so these data were represented in a 24-dimensional coordinate system formed by the eigenvectors (Figure 1). Interpolation functions are constructed in the iterative way for both the motion model and the skeleton model. Once constructed, these models can compute and synthesize the corresponding motion and skeleton data for a given set of control parameters for age, gender, height, weight and target location (Figure 2): an articulated body structure is modified according to the data computed by the skeleton model. The quality or precision of these parametric models degrades smoothly with a decreasing number of Gaussians used in the interpolation functions.
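The incremental construction of the Gaussian network described in Section 3 (choose the worst-approximated example, place a Gaussian there, tune its variance, re-check the error) can be sketched as follows. This is a rough illustration under stated assumptions rather than the implementation used for the results above: the function names, the stopping tolerance, and in particular the grid search over candidate values of $\beta$ in step 3 are hypothetical choices, since the text does not say how the minimization is carried out.

```python
import numpy as np

def net(x, centers, weights, betas):
    """Evaluate sum_i w_i * exp(-beta_i * ||x - c_i||^2) at each row of x."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    if len(weights) == 0:                          # empty net outputs zero
        return np.zeros(len(x))
    c = np.asarray(centers, dtype=float)
    d2 = ((x[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)   # shape (n, q)
    return np.exp(-d2 * np.asarray(betas)) @ np.asarray(weights)

def fit_incremental(X, f, tol=1e-4, max_gaussians=None, beta_grid=None):
    """Grow a Gaussian network one unit at a time for one real-valued output."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    f = np.asarray(f, dtype=float)
    if max_gaussians is None:
        max_gaussians = len(X)
    if beta_grid is None:
        beta_grid = np.logspace(-2, 3, 60)         # assumed candidate values
    centers, weights, betas = [], [], []
    for _ in range(max_gaussians):
        residual = f - net(X, centers, weights, betas)
        if np.mean(residual ** 2) < tol:           # step 4: accurate enough
            break
        s = int(np.argmax(residual ** 2))          # step 1: worst example
        c, w = X[s], residual[s]                   # step 2: center and weight
        d2 = ((X - c) ** 2).sum(axis=1)
        # Step 3: pick the beta that minimizes the mean square error.
        errs = [np.mean((residual - w * np.exp(-b * d2)) ** 2) for b in beta_grid]
        centers.append(c)
        weights.append(w)
        betas.append(beta_grid[int(np.argmin(errs))])
    return np.array(centers), np.array(weights), np.array(betas)
```

For a vector-valued target such as the PCA coefficients $m_i$, one such network would be fitted per output dimension. No linear system is solved at any point, and the loop may stop with fewer Gaussians than examples once the prescribed accuracy is reached.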

Figure 1: Eigenmotions. (top) The average motion; (bottom) two of the eigenvectors, scaled by ± constants, added to the average.

Figure 2: Snapshots of synthesized ‘punching’ for different people and different targets (counterclockwise starting from upper-left). For clarity, motions are shown in the generic time frame.

6. CONCLUSION

Motion synthesis based on interpolation of captured data provides both realism and controllability for real-time applications. For this purpose, we reported the incremental construction of interpolation functions and an approximation of the resulting functions with fewer basis functions and lower dimension. This approach offers better efficiency in storage and computation than others based on classical exact interpolation. We were able to construct both motion and skeleton models controllable even by attributes invariant for each individual, which is rarely addressed by other approaches.

We will continue to enhance the reported work. More motion and skeleton data are being collected over a wider range so that the parametric models can be used for nontrivial applications such as crowd simulation[18]. Radial basis functions will be placed over a hierarchical grid structure to make even faster interpolation possible.

7. REFERENCES

[1] M. Alexa and W. Muller. Representing animations by principal components. In Proceedings of EUROGRAPHICS 2000, 2000.
[2] N. Badler, C. Phillips, and B. Webber. Simulating Humans: Computer Graphics, Animation, and Control. Oxford University Press, 1993.
[3] V. Blanz and T. Vetter. A morphable model for the synthesis of 3d faces. In Proceedings of SIGGRAPH 1999, pages 187–194, 1999.
[4] M. Brand and A. Hertzmann. Style machines. In Proceedings of SIGGRAPH 2000, pages 183–192, 2000.
[5] A. Bruderlin and L. Williams. Motion signal processing. In Proceedings of SIGGRAPH '95, pages 97–104, 1995.
[6] A. Esposito, M. Marinaro, D. Oricchio, and S. Scarpetta. Approximations of continuous and discontinuous mappings by a growing neural rbf-based algorithm. Neural Networks, 13:651–665, 2000.
[7] J. C. Carr et al. Reconstruction and representation of 3d objects with radial basis functions. In Proceedings of SIGGRAPH 2001, pages 67–76, 2001.
[8] N. Arad et al. Image warping by radial basis functions: Application to facial expressions. Computer Vision, Graphics and Image Processing: Graphical Models and Image Processing, 56(2):161–172, 1994.
[9] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins Univ Press, 1996.
[10] F. Grassia. Practical parametrization of rotations using the exponential map. Journal of Graphics Tools, 3(3):29–48, 1998.
[11] J. Lee and S. Y. Shin. A hierarchical approach to interactive motion editing for human-like figures. In Proceedings of SIGGRAPH 1999, pages 39–48, 1999.
[12] A. P. Pentland and J. Williams. Good vibrations: Modal dynamics for graphics and animation. In Proceedings of SIGGRAPH '89, pages 215–222, 1989.
[13] Z. Popovic and A. Witkin. Physically based motion transformation. In Proceedings of SIGGRAPH 1999, pages 11–20, 1999.
[14] M. J. D. Powell. Radial basis functions for multivariable interpolation: A review. In Algorithms for Approximation, pages 143–167, Oxford, UK, 1987. Oxford University Press.
[15] C. Rose, M. Cohen, and B. Bodenheimer. Verbs and adverbs: Multidimensional motion interpolation. IEEE Computer Graphics and Applications, 18(5):32–48, September/October 1998.
[16] R. Ruprecht and H. Muller. Image warping with scattered data interpolation. IEEE Computer Graphics and Applications, 15(2):37–43, 1995.
[17] G. Turk and J. F. O'Brien. Shape transformation using variational implicit surfaces. In Proceedings of SIGGRAPH 1999, pages 335–342, 1999.
[18] B. Ulicny and D. Thalmann. Crowd simulation for interactive virtual environments and vr training systems. In Proceedings of Eurographics Workshop on Computer Animation and Simulation 2001, pages 163–170, 2001.
[19] M. Unuma, K. Anjyo, and R. Takeuchi. Fourier principles for emotion-based human figure animation. In Proceedings of SIGGRAPH '95, pages 91–96, 1995.
[20] D. Wiley and J. Hahn. Interpolation synthesis of articulated figure motion. IEEE Computer Graphics and Applications, 17(6):39–45, November/December 1997.
[21] A. Witkin and Z. Popovic. Motion warping. In Proceedings of SIGGRAPH '95, pages 105–108, 1995.