Reuse of Motion Capture Data in Animation: A Review

Weidong Geng, Gino Yu
Multimedia Innovation Centre, The Hong Kong Polytechnic University, Hong Kong
{weidong.geng, mcgino}@polyu.edu.hk
Abstract. The reuse of motion capture data is receiving increasing attention in computer animation. This interest is motivated by a wide spectrum of time-critical applications, such as video game development and animation production. This paper gives an overview of the tasks and techniques involved in reusing motion capture data, organized around the motion authoring pipeline. We assume that a user asked to build motions will first retrieve or browse the pre-recorded motions in a motion database, look for the best-fit candidate motion segments, pieces, or clips, and then adapt them to fit the specified requirements. The two core issues in motion reuse, motion adaptation techniques and motion library construction, are the focus of this review.
V. Kumar et al. (Eds.): ICCSA 2003, LNCS 2669, pp. 620-629, 2003. Springer-Verlag Berlin Heidelberg 2003

1 Introduction
Generating realistic motion for character animation remains one of the great challenges in computer graphics, as people have proved adept at discerning the subtleties of human movement and identifying inaccuracies. Motion capture is the process of recording motion data in real time from live actors and mapping it onto computer characters. It is one of the most promising technologies for bringing realistic and natural motions into character animation, and its use is currently most widespread and well accepted in video games and animation [1]. However, motion capture also has its share of weaknesses [2, 3, 4]. Motion capture systems are still very expensive to use, and the capture process is labor-intensive and time-consuming for actors and technicians. In order to make motion capture widely available, the captured data needs to be made reusable [5], so that new motions can be created from pre-recorded motion capture data. Furthermore, with the increased availability of motion capture data and motion editing techniques, there is a growing trend to create the required motion by piecing together example motions from a database [3]. This alternative approach potentially provides a relatively cheap and time-saving way to obtain high-quality motion data for animating characters. This survey concentrates on the two core parts of motion reuse: how to build the motion library and how to adapt motions to new needs. Section 2 presents a general overview of how motion capture data is reused. Section 3 covers research on motion adaptation techniques. Section 4 extends the discussion to motion capture data representation and motion library construction. Finally, Section 5
concludes the paper with general comments on existing work in the area of motion reuse and discusses future directions of research in this area.
2 Overview of Motion Reuse
Motion reuse is typically carried out by applying appropriate motion transformation techniques to the motion clips in a motion capture database. Lamouret and van de Panne proposed using a large set of representative motions together with techniques for tailoring these motions to make them fit new situations [6]. Molina-Tanco and Hilton built statistical models from a database of motion capture examples [7], then re-arranged segments of the original data in the motion library and generated realistic synthetic motions from them. Jeong et al. developed the motion editing system Marionette version 0.1 [8]. It has a motion database and a rich set of motion operations such as motion cut/paste, rotation, transition, blending, interpolation and retargeting. From the point of view of implementation, one representative system architecture for motion reuse is shown in Fig. 1: a user interface accepts the requirements of the desired motion and presents resulting motions in a motion viewer; a query/retrieval/browsing component selects candidate motions from the motion library; motion adaptation operations (modification, blending, concatenation, etc.) turn the motions to be reused into results; and the library itself is served by motion data input, management and maintenance tools.

Fig. 1. A potential architecture for a motion reuse system
Whenever the user is asked to build motions, he or she will first retrieve or browse the pre-captured motions in the motion database, look for the best-fit candidate motion segments, pieces, or clips, and adapt them to the specified requirements. The resulting motions are presented to the user in the motion viewer; if they are accepted, the user has successfully obtained the desired motions. From a technical point of view, motion reuse firstly requires sufficient motion data available for reuse, and secondly the means to make candidate motions fit new situations. This motion reuse approach is therefore mainly limited by
1. The library of motions available for reuse, including the range of motion types, the quantity of data, the secondary information needed for reuse (e.g. the actor's dimensions and skeleton), and convenient tools for database construction and motion data retrieval.
2. The quality of the tools available for adapting motions, i.e. how, and to what acceptable level, the realism, aesthetics, and naturalness of the source motions can be maintained despite all the changes made in building new motions from them.
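The retrieve-then-adapt pipeline of Fig. 1 can be sketched in a few lines. This is a minimal illustration only; the clip representation, the annotation scheme, and the `retrieve`/`adapt` functions are hypothetical, standing in for the query and adaptation components discussed in Sections 3 and 4.

```python
# Sketch of the Fig. 1 pipeline: retrieve candidate clips from a motion
# library, then adapt the best-fit clip. All names here are illustrative.

def retrieve(library, query):
    """Return candidate clips whose annotated type matches the query."""
    return [clip for clip in library if clip["type"] == query["type"]]

def adapt(clip, requirements):
    """Placeholder for modification/blending/concatenation (Section 3):
    here we merely trim the clip to a requested maximum length."""
    adapted = dict(clip)
    limit = requirements.get("max_frames", len(clip["frames"]))
    adapted["frames"] = clip["frames"][:limit]
    return adapted

# A toy library: each clip is a type label plus a list of frames.
library = [
    {"type": "walk", "frames": list(range(120))},
    {"type": "run",  "frames": list(range(90))},
]
candidates = retrieve(library, {"type": "walk"})
result = adapt(candidates[0], {"max_frames": 60})
```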
3 Motion Adaptation Principles and Operations
3.1 Adjustment Principles
The fundamental issue in motion adaptation is how to generate convincing human motion transformations while retaining the desired characteristics of a motion and changing its undesired aspects. Recently, much research has been devoted to developing various kinds of editing tools that modify and vary existing motions, and/or produce a convincing motion from prerecorded motion clips. These are briefly reviewed in Table 1. Other noteworthy work includes retargeting motions to new characters [9], motion transition generation [10] and motion path editing [11].

Table 1. A summary of motion adjustment principles

Methodology/Principles | Assumption and algorithmic ideas | Relevant work
Signal processing | Motion parameters are treated as a sampled signal, and techniques from the image and signal processing domain are adapted to edit, modify, blend and align the motion parameters of articulated figures. | [12], [13], [14], [15]
Constrained optimization | Some of the features to be preserved or changed in a motion are made explicit as spatial or geometric constraints, such as spacetime constraints, inverse kinematics, physical laws and momentum constraints. Constraint-based problem solving is then employed to generate the desired motion. | [16], [17], [18], [19]
Synthesis by statistical models | Statistical models (e.g. Hidden Markov Models) are used to "learn" and extract "meaningful" attributes (e.g. motion patterns) and statistical properties of example motions, and motion synthesis is then performed on them. | [7], [20], [21]
Interpolation | It is assumed that motions can be parameterized by an interpolation scheme; linear or nonlinear methods are then used empirically to create the individual motion clips. | [22], [23]
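As a concrete illustration of the first row of Table 1, a joint-angle curve can be treated as a 1-D sampled signal and smoothed with a low-pass filter. The moving-average filter and the synthetic "noisy knee angle" curve below are our own illustrative choices, not taken from any of the cited systems [12, 13], which use more sophisticated filter banks.

```python
# Treating a joint-angle curve as a sampled signal and low-pass filtering
# it with a simple moving average, in the spirit of the signal-processing
# row of Table 1. Illustrative sketch only.
import math

def moving_average(curve, window=5):
    """Smooth a 1-D joint-angle curve (list of floats) with a boxcar filter."""
    half = window // 2
    out = []
    for i in range(len(curve)):
        lo, hi = max(0, i - half), min(len(curve), i + half + 1)
        out.append(sum(curve[lo:hi]) / (hi - lo))
    return out

# A hypothetical noisy knee-angle curve (degrees) over 100 frames:
# a slow sinusoid plus frame-to-frame jitter.
curve = [30 + 20 * math.sin(i / 8.0) + ((-1) ** i) * 3 for i in range(100)]
smooth = moving_average(curve, window=5)
```

The smoothed curve keeps the slow sinusoidal component while the alternating jitter, which lives in the high-frequency band, is largely averaged away.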
3.2 Operations Involved in Motion Reuse
Motion manipulation operations can be classified in many ways, depending on which aspect of the techniques the criterion focuses on. A popular way to categorize them is by the number of motions involved in an operation. Sudarsky and House defined motion editing performed on a single motion sequence as unary operations, and manipulation of two or more motion clips as binary operations [24]. Jeong et al. define them as inter-motion and intra-motion operations [8]: inter-motion operations need two or more motions, while intra-motion operations need only one. Here we present an empirical classification by their roles and functions in the motion building pipeline for motion-captured animation.
3.2.1 Motion Modification
Motion modification changes the non-essential attributes of a source motion. Its basic assumption is that most of the central characteristics are preserved in the resulting new motion. Operations of this kind comprise
− Joint repositioning. This means changing the pose of the body. For example, Bruderlin and Williams decouple the representation of the original motion curve and the displacement curve [12]. The user can then substitute the displacement curve with another one, and accordingly change the positions of joints in each frame.
− Motion smoothing. The smoothness of a motion depends on the density of control points in a motion curve representation. A motion can be smoothed by changing the number of control points using a filtering system [15, 17], or by increasing/decreasing motion signals in high or low frequency bands [13].
− Motion warping [14]. The animator interactively defines a set of keyframe-like positional (time) constraints, which are used to derive a smooth deformation that preserves the fine structure of the original motion.
− Changing the orientation. Basically this changes the rotation parameters of the root or pivot joint in the animated skeleton, for example changing a walking motion on a straight line into walking on a circle.
3.2.2 Motion Synthesis
Motion synthesis can be described as generating new motions by piecing together a set of pre-captured example motion data. Two or more motions can be synthesized at the same time. The resulting motion is a relatively new motion, somewhat similar to the source motions in content and/or style. Existing motion synthesis methods include
− Synthesis by statistical models. This is a typical way to synthesize novel movements from examples. It usually first learns a statistical model from the captured data that enables realistic synthesis of new movements by sampling the original captured sequences. The statistical model then identifies the segments of the original motion capture data used to generate a novel motion sequence according to the specified synthesis requirements.
− Synthesis by blending. Parts of clipped motions are mixed to create a new motion. In a motion curve representation, this generates a curve that lies somewhere
624
W. Geng and G. Yu
in between the two original curves, according to a blending function or parameters. Partial motion combination can be considered an extreme case of blending, a kind of "substitution"; this operation is useful when only part of a motion is meaningful.
− Synthesis by interpolation. This generates intermediate motions from a set of example motions and associated parameters. The most impressive results for this operation come from Rose et al. [23], who used a combination of linear approximation and radial basis functions to calculate a fitting hyper-surface approximating the example points. It is particularly effective for style manipulation, since various motions can be generated without establishing a computational model of styles or nuances.
− Synthesis as graph search [5]. A collection of motion sequences can be represented as a directed graph whose nodes are individual motion sequences. The edges connect frames of incident nodes (motion clips): an outgoing edge originates from the last frame of the current motion, and an incoming edge points to the first frame of the next motion. Motion generation is then cast as searching for a suitable path through the graph under the soft and hard constraints specified by the user.
3.2.3 Motion Concatenation
Motion concatenation combines two motion clips seamlessly via transitional motions, removing discontinuities at the boundary between the two motions. The resulting motion is the "addition" of the source motions in both time interval and content. Cyclification can be considered a special case of concatenating the same motion twice or more. The key issue is how to smoothly generate transitional motions. Several existing transition generation approaches are
− Blending the overlapping parts of the motions. This is similar to image morphing. The blending weights can be a monotonically decreasing function in the range [0, 1], which fades out one motion and fades in the next one smoothly [24].
It is simple and intuitive, but it may cause unnatural motions such as footskate.
− Spacetime constraints [10]. This transition generation uses a combination of spacetime constraints and inverse kinematic constraints to generate seamless and dynamically plausible transitions between motion segments.
− Basis controllers [25]. A small set of parameterized basis controllers, such as a balancing controller and a landing controller, is used to create continuous transitional actions between the two source motions. This can be considered a simplified version of keyframe animation by controller.
3.2.4 Motion Retargeting
The direct mapping of captured movements from live subjects to a virtual actor potentially yields observable unnaturalness if the skeletons or surrounding environments differ greatly. Motion retargeting is introduced to solve this problem. The resulting motion shares the same motion content or high-level motion behavior with the source motion, but differs at the lowest motion data level. Motion retargeting has two typical scenarios:
− Retargeting motion to new characters. Gleicher converts the motion retargeting operations into a kinematic and spacetime constraint
formulation, based on the assumption that the skeletons have identical structure but different sizes [9]. Bindiganavale and Badler proposed a method to automatically extract important events using spatial constraints [26], so that visual attention can be kept after the motion is mapped to the virtual character.
− Retargeting motion to a new environment. This aims to alter the captured motion to fit a new situation such as a new path or new terrain, the path being an abstraction of the positional aspects of the movements. Gleicher presented a path transformation approach that permits a single motion to be applied in a wide variety of settings [11].
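The cross-fade transition of Section 3.2.3 can be sketched as follows. For clarity, a pose is reduced to a single joint angle per frame and the fade uses plain linear weights; real systems blend full poses (typically interpolating joint rotations as quaternions), so this is an illustrative simplification, not a specific published implementation.

```python
# Cross-fade concatenation (Section 3.2.3): over an overlap window, the
# first motion is faded out and the second faded in with a monotonically
# decreasing weight in [0, 1]. Poses are simplified to single joint angles.

def crossfade(motion_a, motion_b, overlap):
    """Concatenate two motions, linearly blending the overlapping frames."""
    head = motion_a[:-overlap]          # unblended part of the first motion
    tail = motion_b[overlap:]           # unblended part of the second motion
    blend = []
    for i in range(overlap):
        w = 1.0 - (i + 1) / (overlap + 1)   # fades from near 1 toward 0
        a = motion_a[len(motion_a) - overlap + i]
        b = motion_b[i]
        blend.append(w * a + (1.0 - w) * b)
    return head + blend + tail

walk = [float(i) for i in range(10)]        # toy "walk" joint-angle curve
run = [float(20 - i) for i in range(10)]    # toy "run" joint-angle curve
out = crossfade(walk, run, overlap=4)
```

The result keeps the first six frames of `walk` and the last six of `run` untouched, with four blended frames in between; the footskate risk noted above arises because such per-joint blending ignores foot-ground contact.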
4 Motion Database Construction
The motion database is the basis for motion reuse. The major weakness of motion capture data is that it lacks structure and adaptability. While a captured motion can record the specific nuances and details of a performance, it also records only that performance: a specific performer, performing a specific action, in a specific way [11]. This specificity makes the data difficult to alter, especially since the key "essence" of the motion is not distinguished from the large amount of potentially irrelevant detail. We should therefore ease the difficulty of motion adaptation by storing sufficient information in the motion database, including the actor's skeleton, marker information, marker placement, visual attention, footplants, and so on.
4.1 Motion Data Organization
A typical strategy for motion data organization is based on a directed graph. Rose et al. employ "verb graphs", in which the nodes represent verbs and the arcs represent transitions between verbs [23]. The verb graph acts as the glue that assembles verbs (defined as sets of example motions) and their adverbs into a runtime data structure enabling seamless verb-to-verb transitions for the simulated figures within an interactive runtime system. Arikan and Forsyth present a similar framework that generates human motions by cutting and pasting motion capture data [5]. The collection of motion sequences is represented as a directed graph in which each frame is a node, with an edge from every frame to every frame that could follow it in an acceptable splice; all nodes (frames) belonging to the same motion sequence are then collapsed together. Kovar et al. construct a directed graph called a motion graph that encapsulates connections among the clips in the database [3]. In a motion graph, edges contain either pieces of original motion data or automatically generated transitions, so all edges correspond to clips of motion.
Nodes serve as choice points connecting these clips, i.e. each outgoing edge is potentially the successor of any incoming edge. New motion can be generated simply by building walks on the graph.
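A toy version of such a graph walk can be written in a few lines. The graph below, with nodes as choice points and edges carrying clips, follows the structure described for motion graphs [3, 5], but the clip names and transitions are invented for illustration.

```python
# A toy motion graph: nodes are choice points, each edge carries a motion
# clip and leads to the next node. New motion is a random walk on the graph.
# Clip names and graph structure are hypothetical.
import random

graph = {
    "stand": [("walk_start", "walk")],
    "walk": [("walk_cycle", "walk"), ("walk_stop", "stand")],
}

def generate(start, steps, rng):
    """Build a motion by walking the graph, collecting traversed clip names."""
    node, clips = start, []
    for _ in range(steps):
        edges = graph.get(node)
        if not edges:                   # dead end: no outgoing clips
            break
        clip, node = rng.choice(edges)  # pick one outgoing edge at random
        clips.append(clip)
    return clips

sequence = generate("stand", 5, random.Random(0))
```

In a real system the random choice is replaced by a search that scores candidate walks against user constraints, as Section 4.2 describes.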
4.2 Retrieval of the Best-Fit Motion
The retrieval of the best-fit motion is mainly determined by the motion editing requirements. Arikan and Forsyth classify motion editing requirements as hard constraints (which can be satisfied exactly) and soft constraints (which generally cannot), including [5]
• The total number of frames should be a particular number.
• The body should be at a particular position and orientation at a particular time.
• A particular joint should be at a particular position (possibly with a specific velocity) at a specific time.
• The motion should have a specified style (such as happy or energetic) at a particular time.
At the algorithmic level, one of the key issues in such motion retrieval is what similarity measure should be used to define the fitness of a motion. A best-fit motion primitive must satisfy several possibly conflicting preconditions, such as a good match of character, environment and style. The similarity metric must weight these factors appropriately in order to efficiently extract the best-fit motion from the set of motion examples. For instance, Lamouret and van de Panne defined the similarity as the following distance over the candidate samples m [6]:

d_min = min_m (d_state + k1 · d_env + k2 · d_user)

where d_state measures the compatibility of the initial motion state, d_env the compatibility of the environment (e.g. terrain), and d_user the compatibility with particular user specifications; k1 and k2 adjust the relative importance of the terms. The best-fit motion sequence is the one with the minimal distance according to the above equation.
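The metric above amounts to a weighted argmin over the candidate set, which can be sketched directly. The candidate clips and their distance values below are made up purely to exercise the formula.

```python
# Sketch of the Lamouret & van de Panne similarity metric: each candidate m
# scores d_state + k1*d_env + k2*d_user, and the minimal distance wins.
# Candidate names and distance values are invented for illustration.

def best_fit(candidates, k1, k2):
    """Return the candidate clip with the minimal weighted distance."""
    return min(
        candidates,
        key=lambda m: m["d_state"] + k1 * m["d_env"] + k2 * m["d_user"],
    )

candidates = [
    # A hop captured on flat ground: good state match, poor terrain match.
    {"name": "hop_flat", "d_state": 0.2, "d_env": 0.9, "d_user": 0.1},
    # A hop captured on a slope: worse state match, good terrain match.
    {"name": "hop_slope", "d_state": 0.4, "d_env": 0.1, "d_user": 0.2},
]
chosen = best_fit(candidates, k1=1.0, k2=1.0)
```

With equal weights the slope clip wins (0.7 versus 1.2); raising k1 would penalize the terrain mismatch of the flat clip even more, showing how the weights trade off the competing preconditions.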
5 Discussion and Summary
As the availability of motion capture data increases, there has been growing interest in using it as a basis for creating computer animations where life-like motion is desired. The idea of reusing motions is not new, and existing work has already addressed many of the difficulties, especially the need for novel movement generation. Aiming to explore novel motion reuse approaches for the entertainment industry, we have also developed a Kungfu motion library and a suite of motion editing tools for game development [27, 28]. The implemented system shows promising results in providing a cost-effective approach to quickly building motions for character animation. A reusable motion library should be flexible and convenient to access, and its content should be sufficient and well structured. For example, the animator should be able to browse a human movement library and instantiate a specific movement by demanding realism and specifying the character dimensions, foot placements, and even the emotional state of the character. Once the candidate motions are chosen, how can they be adapted to precisely fit the current situation? There are no perfect solutions yet. Besides the adaptation operations, we should also clean up artifacts such as footskate from
the newly generated motions [29]. An analysis of the advantages and drawbacks of existing motion adaptation approaches is given in Table 2.

Table 2. Comparison of motion adaptation approaches

Methodology/Principles | Strengths | Limitations/difficulties
Signal processing | It can provide analytic solutions at interactive speed, and many existing algorithms from signal processing can be "borrowed" directly. | Signal processing does not explicitly describe the operation in terms of features of the motion, which makes high-level motion editing difficult.
Constrained optimization | It can potentially provide high-level motion editing operations, and the user can adjust an animated character through direct manipulation. | The resulting motions are restricted by the range of the constraints and the performance of the solvers; the user cannot specify "extra" adaptations.
Synthesis by statistical models | It structures and models the "essential" part of the motion capture data, helping the data to be reused more plausibly. | Because it synthesizes motion from abstractions of the data rather than the actual data, it risks losing important details.
Interpolation | It is simple, well known and easily implemented in keyframe animation. | The resulting motions depend heavily on the parameters specified by the user and the interpolation methods employed.
The tendencies and future work in motion reuse are
• Motion retrieval and synthesis in the designer's way. It is preferable that the animator just roughly drafts a motion sequence in an intuitive way, such as by video, and the system then helps compose the "goal motion". This allows interactive manipulation of the constraints. Higher-level stylistic constraints can also be incorporated into the motion synthesis framework; they can be labeled with the intrinsic style by hand, or by learning. For example, Brand and Hertzmann analyzed patterns in a motion dataset and tried to infer styles and obtain higher-level descriptions of motions [20].
• A hybrid approach to motion adaptation by controller and example. Animators expect methods that generate both realistic and controllable motion from a database of motion capture. Keyframe animation has the advantage of allowing the animator precise control over the actions of the character. Motion capture provides a complete data set with all the detail of live motion, but the animator does not have full control over the result. Pullen and Bregler have already made efforts towards combining the strengths of keyframe animation with those of motion capture data [4].
• Embedding AI technology into motion reuse. Motion domain knowledge and the human skeleton model can be used to guide and facilitate the reuse of motion capture data. For example, Sun and Metaxas automate gait generation using a model of the lower body [30]. Furthermore, some AI algorithms, such as state-space search, can be directly applied to motion reuse. Arikan and Forsyth treat selecting a collection of clips that yields an acceptable motion as a
combinatorial problem, and then use a randomized search over a hierarchy of graphs to build new motions [5]. We can also try to build visually plausible motions by knowledge-based simulation [31].
• High-level motion planning. While a motion database helps define the immediate capabilities of a character, it does not provide a means of ordering the motion primitives to achieve a given artistic or expressive goal. In fact, many actions require planning and anticipation. Planning techniques thus need to be developed to achieve higher-level semantic goals.
• Extension to multi-person scenarios with dynamic external interactions. Up to now, most existing motion adaptation techniques deal only with pure motions, without any external interaction from other characters or the environment. However, many application scenarios, such as multi-person combat, demand tools to simultaneously edit and direct the motions of multiple characters. Some pioneering work has been done by Zordan and Hodgins [32].
Acknowledgement This work is supported by the internal research grant of The Hong Kong Polytechnic University.
References
1. Menache, A.: Understanding Motion Capture for Computer Animation and Video Games. Morgan Kaufmann (Academic Press), San Diego, USA (2000)
2. Gleicher, M., Ferrier, N.: Evaluating Video-Based Motion Capture. Proceedings of Computer Animation 2002, 75-80
3. Kovar, L., Gleicher, M., Pighin, F.: Motion Graphs. SIGGRAPH 2002, 473-482
4. Pullen, K., Bregler, C.: Motion Capture Assisted Animation: Texturing and Synthesis. SIGGRAPH 2002, 501-508
5. Arikan, O., Forsyth, D.A.: Interactive Motion Generation from Examples. SIGGRAPH 2002, 483-490
6. Lamouret, A., van de Panne, M.: Motion Synthesis by Example. 7th Eurographics Workshop on Animation and Simulation (1996) 199-212
7. Molina-Tanco, L., Hilton, A.: Realistic Synthesis of Novel Human Movements from a Database of Motion Capture Examples. Proceedings of the Workshop on Human Motion 2000, 137-142
8. Jeong, I.-K., Park, K.-J., Baek, S.-M., Lee, I.: Implementation of a Motion Editing System. Proceedings of Virtual Systems and Multimedia (VSMM'01), IEEE Computer Society (2001) 761-769
9. Gleicher, M.: Retargetting Motion to New Characters. SIGGRAPH'98, 33-42
10. Rose, C., Guenter, B., Bodenheimer, B., Cohen, M.F.: Efficient Generation of Motion Transitions using Spacetime Constraints. SIGGRAPH'96 (1996) 147-154
11. Gleicher, M.: Motion Path Editing. ACM Symposium on Interactive 3D Graphics 2001, 195-203
12. Bruderlin, A., Williams, L.: Motion Signal Processing. SIGGRAPH'95, 97-104
13. Unuma, M., Anjyo, K., Takeuchi, R.: Fourier Principles for Emotion-based Human Figure Animation. SIGGRAPH'95, 91-96
14. Witkin, A., Popovic, Z.: Motion Warping. SIGGRAPH'95, 105-108
15. Sudarsky, S., House, D.: Motion Capture Data Manipulation and Reuse via B-splines. CAPTECH'98, LNAI 1537, Springer-Verlag Berlin Heidelberg (1998) 55-69
16. Gleicher, M.: Motion Editing with Spacetime Constraints. ACM Symposium on Interactive 3D Graphics (1997) 139-148
17. Lee, J., Shin, S.-Y.: A Hierarchical Approach to Interactive Motion Editing for Human-like Figures. SIGGRAPH'99, 39-48
18. Gleicher, M.: Comparing Constraint-based Motion Editing Tools. Graphical Models 63 (2001) 107-134
19. Liu, C.K., Popovic, Z.: Synthesis of Complex Dynamic Character Motion from Simple Animations. SIGGRAPH 2002, 408-416
20. Brand, M., Hertzmann, A.: Style Machines. SIGGRAPH 2000, 183-192
21. Li, Y., Wang, T.-S., Shum, H.-Y.: Motion Texture: A Two-Level Statistical Model for Character Motion Synthesis. SIGGRAPH 2002, 465-472
22. Wiley, D.J., Hahn, J.K.: Interpolation Synthesis of Articulated Figure Motion. IEEE CG&A 17(6) (1997) 39-45
23. Rose, C., Cohen, M.F., Bodenheimer, B.: Verbs and Adverbs: Multidimensional Motion Interpolation. IEEE CG&A 18(5) (1998) 32-38
24. Sudarsky, S., House, D.: An Integrated Approach towards the Representation, Manipulation and Reuse of Pre-recorded Motion. Proceedings of Computer Animation 2000, IEEE Computer Society (2000) 56-61
25. Wooten, W.L., Hodgins, J.K.: Transitions between Dynamically Simulated Motions: Leaping, Tumbling, Landing, and Balancing. Visual Proceedings of ACM SIGGRAPH'97, August 3-8, Los Angeles, California, USA
26. Bindiganavale, R., Badler, N.I.: Motion Abstraction and Mapping with Spatial Constraints. CAPTECH'98, LNAI 1537, Springer-Verlag Berlin Heidelberg (1998) 70-82
27. Geng, W.-D., Lai, C.-S., Yu, G.: Design of a Kungfu Library for 3D Game Development. The 2nd International Conference on Application and Development of Computer Games, Hong Kong (2003) 138-141
28. Geng, W.-D., Chan, M., Lai, C.-S., Yu, G.: Implementation of Runtime Motion Adjustment in Game Development. The 2nd International Conference on Application and Development of Computer Games, Hong Kong (2003) 142-147
29. Kovar, L., Gleicher, M., Schreiner, J.: Footskate Cleanup for Motion Capture Editing. ACM Symposium on Computer Animation 2002, 97-104
30. Sun, H.C., Metaxas, D.N.: Automating Gait Generation. SIGGRAPH 2001, 261-269
31. Barzel, R., Hughes, J.F., Wood, D.N.: Plausible Motion Simulation for Computer Graphics Animation. EUROGRAPHICS Workshop on Computer Animation and Simulation (1996) 183-197
32. Zordan, V.B., Hodgins, J.K.: Motion Capture-Driven Simulations that Hit and React. Proceedings of the ACM SIGGRAPH Symposium on Computer Animation, San Antonio, Texas (2002) 89-96