On-line Adapted Transition between Locomotion and Jump

Pascal Glardon*    Ronan Boulic    Daniel Thalmann

Ecole Polytechnique Fédérale de Lausanne (EPFL), Virtual Reality Lab, CH-1015 Lausanne, Switzerland

*e-mail: [email protected]
ABSTRACT

Motion blending is widely accepted as a standard technique in computer animation, allowing the generation of new motions by interpolation and/or transition between motion capture sequences. To ensure smooth and seamless results, an important property has to be taken into account: similar constraint sequences have to be time-aligned. Traditional blending approaches, however, let the user choose the transition time and duration manually. In addition, depending on the animation context, blending operations should not be performed immediately: they can only occur during a precise period of time, while preserving specific physical properties. We present in this paper an improved blending technique allowing automatic, controlled transitions between motion patterns whose parameters are not known in advance. This approach ensures coherent movements over the parameter space of the original input motions. To illustrate our approach, we focus on walking and running motions blended with jumps, where animators may vary the jump length and style. The proposed method automatically specifies the support phases of the input motions and controls a correct transition time on the fly. Moreover, the current locomotion type and speed are smoothly adapted to a given jump type and length.

CR Categories: I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism

Keywords: motion blending, motion adaptation, motion anticipation

1 INTRODUCTION

In Computer Graphics, many methods and techniques have been developed to animate virtual humans, with the same principal objectives: believability, efficiency and adaptability. Motion capture [13] incontestably produces realistic human animations, but requires post-processing to filter and adapt the raw recorded data.

Figure 1: A transition generated by our method

Therefore, editing methods such as motion blending have been developed, producing new motions by combining multiple clips according to time-varying weights. To produce realistic results, motion blending requires time warping techniques, where correspondences between the motions are identified and then aligned. Instead of applying time warping methods in which all input motions have to be pre-processed [10, 16], we apply an approach similar to [14], which allows motions not known in advance to be blended dynamically. Its disadvantage, however, is that correspondences between motion structures have to be established manually, by annotating the time constraints of the feet, namely the support phases. We propose a semi-automatic method based on a subset of pre-labelled motions with time information for every constraint. Owing to a relation between these motions and their support phases, the time constraints of a newly generated motion are detected on the fly.

All of the previous works mentioned above perform blending as soon as it is requested by an animator. In some situations, the resulting motion is not coherent: blending two walking motions that are out of phase, for example. This stems either from an incorrectly chosen transition time and/or duration, or from a violation of dynamic motion properties. Hence, our method also controls the blending by choosing the appropriate transition time and duration, and by automatically finding a motion that minimizes dynamic incoherence.

Our methodology consists, first, of an off-line process that identifies the support phase types (e.g. left foot on the floor) during which input motions can be blended. Second, these types are detected automatically on the fly, activating a smooth blending. In addition, this blending is controlled by searching and generating new motions over a multi-dimensional parameter space, to best fit the different dynamic properties of the input motions.

To illustrate our method, we focus on the blending of cyclic with episodic motions. An episodic motion is played once, while a cyclic motion is played repeatedly. We combine a locomotion engine, able to vary its type continuously between walking and running, with two jumping engines: walking and running jumps. An animator can steer a virtual human moving at a specific speed with a given type of locomotion. When a jump of a given length is requested, the presented method, based on a state machine, first searches for the best speed and type according to the jump parameters. The motion is then modified until the correct transition starting time is determined, by automatically detecting the support phase. At the end of the jump, the model resets the locomotion parameters to their previous values.

In the remainder of this paper, we discuss previous results in the next section. Section 3 then describes the generation of the cyclic and episodic motions. An overview of our method is presented in section 4, and the control of a blending operation is explained in section 5. Finally, we conclude in section 6 with some results and future work.

2 RELATED WORK

Nowadays it is common to use motion capture data to generate parameterized animations either by interpolating, assembling or blending original motion clips.

Two or more motions can be interpolated to produce a new parameterized one, applying different techniques. Bi-linear [9] or multi-variate [16] interpolation methods generate new motions over a multi-dimensional parameter space. Statistical methods based on PCA (Principal Component Analysis) [8, 7] produce parameterized motions in a very efficient way, by computing them in a low-dimensional space.

Concerning the assembly of animation sequences, a large motion capture database can be organized in a motion graph [11, 2, 12], where edges represent pieces of original motions or generated transitions, and nodes are choice points where motions join seamlessly. These points are computed automatically by comparing pairs of motions frame by frame. A metric function measures the distance between two frames: a result smaller than a specified threshold indicates a possible transition between these two frames. However, this transition threshold has to be quantified manually, as different kinds of motions have different fidelity requirements. In addition, some metric functions contain non-intuitive weight values that are difficult to parameterize. For example, the set of weights proposed for the cost metric function in [12] is compared to an optimal set in [19]. Even if the presented user study - applied to one performer without highly dynamic motions - reveals better results for the optimal set of weights, it remains difficult to evaluate it under various conditions. Another problem of the motion graph technique concerns the transition duration: even if its starting point is found automatically, its duration has to be determined manually. Moreover, adding a new motion on the fly is impossible, as all input motions have to be compared pairwise to construct the directed graph.

Motion blending is a basic way either to create new motions by interpolation or to generate transitions between clips. Precursor works treat the motion as a time-varying signal. Signal processing techniques have therefore been developed to perform blending between motions by varying the frequency bands of the signal [5], or the Fourier coefficients [18]. However, disadvantages appear: the transition method in [18] is not invertible, support phases of the feet are not considered in [5], and transition time and duration have to be specified by hand. In [15, 10], blending is controlled by changing the blend weights of the input motions during a period of time given by the user. Unlike [11], the time constraints (or support phases) change smoothly according to these weight values. But this solution does not allow a new input motion to be added dynamically: it needs to be compared to the existing ones. In contrast, [14] proposes a synchronization method to perform dynamic blending. If a requested transition is currently unfeasible due to incompatible support phases, the blending is shifted until it becomes possible. Another approach [3] proposes a decoupled blending to resolve diverse coordination configurations between upper and lower body-halves, restricted to cyclic motions. Relevant events of the input sequences are manually labelled and correspondences between them are established. Therefore all motions have to be known in advance. In addition, the blending is performed at an interactive frame rate.

Even if all these blending techniques are widely accepted and used, transitions are not totally controlled and can produce incoherent results. In the study of [19], sequences with fixed or variable transition durations have been evaluated by volunteers. As a result, the users preferred transition durations that are adapted to the context. Finally, approaches taking the dynamics into account, like [17, 20], can control valid transitions by modifying physical parameters such as angular velocity or center of mass. But the setting of the parameters is not straightforward and the motion generation is an off-line process. Recently, a momentum-based method has been developed in [1]. Starting from a single motion capture sequence, a space of new physically plausible motions is pre-computed without parameterization effort. Although on-line blending can be performed between these motions, the method is limited to highly dynamic movements. Furthermore, transition time and duration are constant.

This short survey of transition generation reinforces our goal to develop a methodology able to compute automatically not only the transition duration, but also the transition time, and to adapt motion parameters on the fly to ensure the coherence of the blended motions.

3 MOTION PATTERNS

We use motion capture data to generate cyclic and episodic motion patterns. A PCA algorithm is applied to a motion capture data matrix, where joint rotations are expressed using the exponential map formulation. For walking and running motions, we apply the engine based on hierarchical PCA spaces presented in [8], with five different subjects captured at various speeds. One cycle, parameterized by speed, type of locomotion and personification weights, is computed on the fly and played repeatedly as long as the context does not change. Similarly, for episodic motions, we have captured seven subjects performing walking and running jumps of various lengths, from 0.4 to 1.8 meters. We asked the performers to adapt their speed naturally in order to execute the requested jump. We improve upon the method of [8] by distinctly separating walking jumps from running jumps. Indeed, since a jump is constrained by gravity, its parameters do not change once started. Moreover, more variability is integrated by considering seven subjects, with a total of 105 walking and 103 running jumps. To summarize, we have implemented three motion patterns: one that produces a locomotion cycle, and two others generating walking and running jumps. Notice that all jump motions begin with the same event: a right heel strike.
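As an illustration of the kind of PCA decomposition used here, the following is a minimal sketch (not the authors' implementation) of building and sampling a PCA space over motion vectors with NumPy; the fixed resampling to a common frame count and all variable names are our own assumptions.

```python
import numpy as np

def build_pca_space(motions, n_components=10):
    """motions: (n_clips, n_frames * n_dofs) array, one row per clip resampled to a
    common frame count, with joint rotations in exponential map form."""
    mean = motions.mean(axis=0)
    centered = motions - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]            # principal motion components
    coeffs = centered @ basis.T          # per-clip coordinates in the PCA space
    return mean, basis, coeffs

def synthesize(mean, basis, coeff):
    """Reconstruct a full motion vector from low-dimensional coefficients."""
    return mean + coeff @ basis

# e.g. a new jump could be obtained by interpolating the coefficients of two
# captured jumps of neighbouring lengths and reconstructing with synthesize().
```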

Figure 2: The state machine for controlled blending (states 0-5: walking or running, motion adaptation, blending, jump, blending, parameters as before)

4 METHOD OVERVIEW

When a cyclic motion is played in real time, an episodic motion may occur at the request of an animator. A blending operation therefore has to be executed to generate a seamless transition from walking (or running) to jumping and back to walking (or running). Without loss of generality, we assume in this paper that a jump starts with the right foot. To control this operation, we model a state machine composed of six states (Figure 2). In addition, we define the eight foot events needed to identify the transition time and duration:

Right Heel Strike (RHS) / Off (RHO)
Right Toe Strike (RTS) / Off (RTO)
Left Heel Strike (LHS) / Off (LHO)
Left Toe Strike (LTS) / Off (LTO)

The neutral state of our state machine is the cyclic motion. As soon as the animator or the application level (for example an autonomous agent) requests a jump, the model moves to the next state, where speed and locomotion type are adapted.

When the motion properties are compatible with the desired jump, the model starts the transition once the support phases of both motions contain similarities. However, it is not sufficient to compare double with single support phases as in [14]. Indeed, as illustrated in Figure 10, blending a double support phase with the right foot backward (walking) with a single support phase with the right foot forward (jump) produces an incorrect result. Therefore the transition starts only when the event RHS is detected (state 3). To ensure on-line performance, this detection is carried out using an approximation explained in section 5.1. A blending operation is then performed, from the walking (or running) motion to the jumping motion, until the RTO event. After that, the jumping motion is played until the LHS event arises. In the next state, the transition back to walking (or running) is executed until the LTO event is detected. Finally, the locomotion engine returns to its neutral state, resetting the motion parameters to the values they had before the jump request.

Figure 3: Incompatible support phases to blend a walk with a jump motion
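A minimal sketch of how such a controller could be organized as a transition table is given below; the state and event names are ours, for illustration only, and are not the paper's code.

```python
from enum import Enum, auto

class State(Enum):
    LOCOMOTION = auto()   # walking or running cycle (neutral state)
    ADAPTATION = auto()   # adapt speed and locomotion type to the requested jump
    BLEND_IN = auto()     # blend locomotion -> jump, between RHS and RTO
    JUMP = auto()         # play the jump clip until LHS
    BLEND_OUT = auto()    # blend jump -> locomotion, between LHS and LTO
    RESTORE = auto()      # restore the locomotion parameters used before the jump

TRANSITIONS = {
    (State.LOCOMOTION, "jump_request"): State.ADAPTATION,
    (State.ADAPTATION, "RHS"): State.BLEND_IN,
    (State.BLEND_IN, "RTO"): State.JUMP,
    (State.JUMP, "LHS"): State.BLEND_OUT,
    (State.BLEND_OUT, "LTO"): State.RESTORE,
    (State.RESTORE, "restored"): State.LOCOMOTION,
}

def next_state(state, event):
    """Advance on a foot event or user request; irrelevant events are ignored."""
    return TRANSITIONS.get((state, event), state)
```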

5 CONTROLLED BLENDING

As we want to control the motion blending according to the context of the animation, the support phases and their durations have to be considered, as in traditional methods [16, 10, 14]. In addition, our contribution guarantees coherence between the current speed and type of locomotion and the desired jump length.

5.1 Support Phase

Blending can fail when motions have similar support phases that occur at different normalized (or generic) times. Therefore, the motions have to be labelled, in order to know which type of support phase happens at which generic time. In our situation, only a few walking and running motions generated by the engine have been labelled (∼14% of the input data), by selecting specific speeds and by considering the eight foot events described in the previous section. We observe that, given a subject and a type of locomotion, the different events occur almost independently of the speed, for values between 0.8 and 2.2 [ms⁻¹]. Analogously, a subset of the jumping motions (∼25%) has been labelled, leading to the conclusion that, given a subject, the event timings are linearly dependent on the jump length, as shown in Figure 13. Therefore the feet constraints of any walking, running or jumping motion generated by the engines can be computed instantaneously.

Figure 4: The different events for each captured subject performing walking (left) and running (right) jumps
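Since the event timings of the jumps vary linearly with the jump length (Figure 13), a per-subject and per-event line can be fitted off-line and evaluated on the fly. The following is a minimal sketch of that idea, with hypothetical names; it is not the authors' code.

```python
import numpy as np

def fit_event_timing(jump_lengths, event_frames):
    """Least-squares line frame = a * length + b for one foot event of one subject.
    jump_lengths: (n,) labelled jump lengths in meters.
    event_frames: (n,) generic frame index at which the event occurs."""
    a, b = np.polyfit(jump_lengths, event_frames, deg=1)
    return a, b

def predict_event_frame(a, b, jump_length):
    """Estimate the generic frame of that event for an unlabelled jump length."""
    return a * jump_length + b

# usage sketch: one (a, b) pair per subject and per event (RHO, RTO, LHS, LTS, LHO, LTO)
```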

5.2 Time Warping

As a support phase can have different durations depending on the motion properties, a time warping operation is necessary to align similar constraints. Let A be a motion in which the frames fa1 and fa2 delimit a footprint phase, to be blended with a motion B in which the frames fb1 and fb2 delimit a similar footprint. According to the elapsed time ∆t, we have to compute the corresponding frames fa and fb, as illustrated in Figure 5.

Figure 5: Frame alignment between motion A and B

Frequencies Fa and Fb are associated with motions A and B. As the two motions have different frequencies, we define a frequency F, initially equal to Fa, and a phase variable ϕcur (Eq. 1), as described in [8]. The frame fa is then determined using Eq. 2, where fmax is the number of frames of each normalized motion sequence (walking, running or jumping).

ϕcur = ϕprev + ∆t F    (1)

fa = ϕcur fmax    (2)

The frame fb is computed by a linear interpolation between fb1 and fb2 with the blend weight parameter p, varying from 0 (motion A) to 1 (motion B), defined in Eq. 3.

p = (fa − fa1) / (fa2 − fa1)    (3)

Finally, the frame to be displayed is computed by applying spherical quaternion interpolation between fa and fb, using the parameter p. Additionally, the parameter p is also used to adapt the frequency F by a linear interpolation between the frequencies Fa and Fb. For an episodic motion, the frequency is defined as the inverse of its duration.
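The per-frame computation of Eqs. (1)-(3) followed by the quaternion interpolation can be sketched as follows; the clip representation, the wrapping of the phase into [0, 1) and the helper names are our own assumptions, not the paper's implementation.

```python
import numpy as np

def slerp(q0, q1, t):
    """Spherical linear interpolation between two unit quaternions (w, x, y, z)."""
    dot = float(np.dot(q0, q1))
    if dot < 0.0:                       # take the shorter arc
        q1, dot = -q1, -dot
    if dot > 0.9995:                    # nearly parallel: fall back to normalized lerp
        q = q0 + t * (q1 - q0)
        return q / np.linalg.norm(q)
    theta = np.arccos(dot)
    return (np.sin((1.0 - t) * theta) * q0 + np.sin(t * theta) * q1) / np.sin(theta)

def blend_step(phi_prev, dt, F, fa1, fa2, fb1, fb2, clip_a, clip_b, f_max):
    """One displayed frame of the transition, following Eqs. (1)-(3).
    clip_a, clip_b: arrays of shape (f_max, n_joints, 4) holding unit quaternions."""
    phi_cur = (phi_prev + dt * F) % 1.0                        # Eq. (1), phase in [0, 1)
    fa = phi_cur * f_max                                       # Eq. (2)
    p = float(np.clip((fa - fa1) / (fa2 - fa1), 0.0, 1.0))     # Eq. (3), blend weight
    fb = (1.0 - p) * fb1 + p * fb2                             # aligned frame in motion B
    pose_a = clip_a[int(fa) % f_max]
    pose_b = clip_b[int(fb) % f_max]
    pose = np.array([slerp(qa, qb, p) for qa, qb in zip(pose_a, pose_b)])
    # the frequency used at the next step would itself be (1 - p) * Fa + p * Fb
    return phi_cur, pose
```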

5.3 Coherence

When an animator triggers a jump, multiple conditions have to be satisfied before the transition is effectively performed.

Motion adaptation

First, to preserve the context properties, a jump of a specific length can only be performed within a range of locomotion speeds. Speed and jump length have been normalized by the leg length of the captured performers, allowing motion retargeting. We then extract the mean speed of each jump. A curve is finally computed by performing a linear least-squares approximation between these speeds and the corresponding jump lengths. A positive and negative tolerance is added to this curve to form a stripe of acceptable combinations of normalized locomotion speed versus jump length, as depicted in Figures 11 and 12.

Figure 6: Comparison, for walking jumps, between their mean speed and length. The stripes are delimited by the linear approximation (middle dashed line, y = 0.9785x − 0.0432) and the tolerances (tol = 0.09)

Figure 7: Comparison for running jumps (y = 1.1621x − 0.13, tol = 0.15)

Therefore, given a jump length, a range of speed values is determined, implying two possible scenarios if the current locomotion speed is outside the computed range.

• The priority is given to the jump execution. A new jump is generated, with an appropriate jump length and jump type corresponding to the current locomotion speed and type. This solution is well adapted to game environments, for example.

• The priority is given to the jump parameters. A continuous modification of the current locomotion speed is performed until the correct speed is reached. Additionally and simultaneously, the current type of locomotion is adapted to the type required by the jump. When the jump is finished, the locomotion parameters are gradually restored to their previous values.

To determine the duration of this adaptation, the necessary acceleration ∆V is computed. This value, normalized by the remaining time until the next possible blending time ti, is compared to a given tolerance tol. Below this tolerance, the speed is linearly increased (or decreased) until the time ti. On the contrary, when ∆V is beyond tol, the next time ti+1 is selected. The type of locomotion is adapted analogously.

Transition time and duration

Previous methods like [14] can blend a jump between the LHS and RTO events (double support in walking), which would produce an undesirable motion, as the jump starts with the right leg. Such transitions are avoided in our method by guaranteeing that the jump starts between the events RHS and RTO, and similarly that the jump ends between the events LHS and LTO.
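A sketch of the speed coherence test, using the fitted stripe parameters shown in Figures 11 and 12 and the acceleration tolerance reported in the results; the function names and signatures are ours, for illustration only.

```python
def speed_range_for_jump(jump_length, slope, intercept, tol):
    """Acceptable normalized locomotion speeds for a given jump length, from the
    fitted stripe speed = slope * length + intercept with tolerance tol
    (e.g. slope=0.9785, intercept=-0.0432, tol=0.09 for the walking jumps)."""
    mid = slope * jump_length + intercept
    return mid - tol, mid + tol

def can_adapt_speed(current_speed, target_speed, time_to_blend, accel_tol=0.8):
    """Check whether the speed change fits under the normalized acceleration
    tolerance (0.8 [s^-2] in the reported experiments) before the next possible
    blending time; otherwise the following blending time should be used."""
    if time_to_blend <= 0.0:
        return False
    return abs(target_speed - current_speed) / time_to_blend <= accel_tol
```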

6 RESULTS AND CONCLUSION

We have integrated our state machine model into a 3D application where an animator can steer a virtual human by changing the speed, the type of locomotion and the personification. A jump can be parameterized with a desired length, personification and type. At any time, the user can trigger a jump. The model automatically adapts the current motion on the fly to preserve speed coherence, based on observations made on 105 walking and 103 running captured jumps. The speed and type of locomotion may therefore be modified during a specific period of time (Figure 8, bottom). After some experiments, we chose a normalized acceleration tolerance tol of 0.8 [s⁻²]. Additionally, the method decides when and for how long to perform the blending operation to the jump, by taking the motion support phases into account. The generated transition is seamless and performed in real time, with a duration varying from 0.2 to 0.5 seconds. Our blending algorithm is based on spherical linear interpolation between the quaternions of each joint, using a linear weight function. Other transition schemes such as [11, 10, 16] produce similar results with only subtle differences.

Compared to earlier approaches [11, 2, 12] for finding possible transition times over a large motion capture database, our method presents several advantages. First, we do not need to perform a pre-processing step involving all the clips of the database, which allows us to generate dynamically parameterized motions. Then, our methodology determines transition time and duration automatically, in contrast to [11] where transition thresholds have to be specified by hand.

Figure 8: Comparison of walking blended with a running jump: the top image without motion adaptation, the bottom image with locomotion speed and type adaptation.

Finally, we adapt the current motion by modifying its parameters to enforce the requested blending, unlike previous methods that either do not allow the transition, or perform it without adaptation [10] or at an incorrect support phase [14].

The top image of Figure 8 shows a transition from walking to a running jump, using a traditional blending method such as [15]. One can observe that the jump length is large with respect to the step length, leading to a dynamic incoherence. The bottom image shows the same situation with our technique applied: the motion is adapted by modifying the speed and the type of locomotion according to the requested jump. For a better perception of the improvements proposed by our method, the reader can also view the movies attached to this paper.

Some artifacts may nevertheless occur due to the approximation used to detect the feet constraints. Therefore, like many other blending methods, we want to add IK control on the feet to prevent sliding effects, by extending the prioritized IK described in [4, 6] to the automatic generation of constraints in a dynamically evolving context. This correction should be efficient enough not to alter the real-time performance of the animation. Finally, we think that this work is a required stage on the way to correctly handling motion anticipation, where virtual humans have to decide and adapt their movements autonomously according to their environment.

ACKNOWLEDGMENTS

The authors would like to thank Simon Barbey and Mirouille for the post-processing of the motion capture data, and Benoît for Maya support. The Maya licenses have been granted by Alias through their research donation program. The present research has been funded by the Federal Office for Education and Science.

REFERENCES

[1] Yeuhi Abe, C. Karen Liu, and Zoran Popovic. Momentum-based parameterization of dynamic character motion. In Proceedings of ACM SIGGRAPH/Eurographics Symposium on Computer Animation 2004, 2004.
[2] O. Arikan and D. Forsyth. Interactive motion generation from examples. In Proceedings of ACM SIGGRAPH 2002, 2002.
[3] G. Ashraf and K. C. Wong. Generating consistent motion transition via decoupled framespace interpolation. In Proceedings of Eurographics 2000, 2000.
[4] Paolo Baerlocher and Ronan Boulic. An inverse kinematic architecture enforcing an arbitrary number of strict priority levels. The Visual Computer, 20(6):402–417, 2004.
[5] Armin Bruderlin and Lance Williams. Motion signal processing. In Proceedings of ACM SIGGRAPH 95, pages 97–104, August 1995.
[6] Benoît Le Callennec and Ronan Boulic. Interactive motion deformation with prioritized constraints. In Proceedings of ACM SIGGRAPH/Eurographics Symposium on Computer Animation 2004, 2004.
[7] Arjan Egges, Tom Molet, and Nadia Magnenat-Thalmann. Personalised real-time idle motion synthesis. In Proceedings of Pacific Graphics, 2004.
[8] Pascal Glardon, Ronan Boulic, and Daniel Thalmann. A coherent locomotion engine extrapolating beyond experimental data. In Proceedings of Computer Animation and Social Agents, 2004.
[9] S. Guo and J. Roberge. A high-level control mechanism for human locomotion based on parametric frame space interpolation. In Proceedings of the Eurographics Workshop on Computer Animation and Simulation, pages 95–107, September 1996.
[10] Lucas Kovar and Michael Gleicher. Flexible automatic motion blending with registration curves. In Proceedings of ACM SIGGRAPH/Eurographics Symposium on Computer Animation 2003, pages 214–224, 2003.
[11] Lucas Kovar, Michael Gleicher, and Fred Pighin. Motion graphs. In Proceedings of ACM SIGGRAPH 2002, pages 473–482, 2002.
[12] J. Lee, J. Chai, P. Reitsma, J. Hodgins, and N. Pollard. Interactive control of avatars animated with human motion data. In Proceedings of ACM SIGGRAPH 2002, 2002.
[13] Alberto Menache. Understanding Motion Capture for Computer Animation and Video Games. Academic Press, San Diego, 2000.
[14] S. Menardais, R. Kulpa, and B. Arnaldi. Synchronisation for dynamic blending of motions. In Proceedings of ACM SIGGRAPH/Eurographics Symposium on Computer Animation 2004, 2004.
[15] Sang Il Park, Hyun Joon Shin, and Sung Yong Shin. On-line locomotion generation based on motion blending. In Proceedings of ACM SIGGRAPH/Eurographics Symposium on Computer Animation 2002, 2002.
[16] Charles Rose, Michael F. Cohen, and Bobby Bodenheimer. Verbs and adverbs: Multidimensional motion interpolation. IEEE Computer Graphics and Applications, 18(5):32–41, 1998.
[17] Charles Rose, Brian Guenter, Bobby Bodenheimer, and Michael F. Cohen. Efficient generation of motion transitions using spacetime constraints. In Proceedings of ACM SIGGRAPH 96, pages 147–154, 1996.
[18] Munetoshi Unuma, Ken Anjyo, and Ryozo Takeuchi. Fourier principles for emotion-based human figure animation. In Proceedings of ACM SIGGRAPH 95, pages 91–96, August 1995.
[19] Jing Wang and Bobby Bodenheimer. Computing the duration of motion transitions: An empirical approach. In Proceedings of ACM SIGGRAPH/Eurographics Symposium on Computer Animation 2004, pages 335–344, 2004.
[20] W. L. Wooten and J. K. Hodgins. Simulating leaping, tumbling, landing and balancing humans. In Proceedings of the IEEE International Conference on Robotics and Automation, April 2000.

Figure 9: A transition generated by our method

Figure 10: Incompatible support phases to blend a walk with a jump motion

Figure 11: Comparison, for walking jumps, between their mean speed and length. The stripes are delimited by the linear approximation (middle dashed line, y = 0.9785x − 0.0432) and the tolerances (tol = 0.09)

Figure 12: Comparison for running jumps (y = 1.1621x − 0.13, tol = 0.15)

Figure 13: The different events for each captured subject performing walking (left) and running (right) jumps