Using Vision for Animating Virtual Humans

Ioannis Kakadiaris
Visual Computing Lab
Department of Computer Science
University of Houston

Visual Computing
- The field of Visual Computing is concerned with the analysis, numerical manipulation, querying, display, storage, and transmission of data.

Application Domains
- Human Motion Analysis
- Biomedical Data Analysis
- Seismic Data Analysis

Overview
✓ Motivation
- Theoretical Framework
  – Distributed Approximating Functionals
  – Physics-Based Modeling
- Human Motion Analysis
- Biomedical Data Analysis
- Conclusion

Physics-Based Models: Computer Vision
- Objective
  – Represent nonrigid shapes
  – Reconstruct nonrigid shapes from noisy data
  – Estimate the motion of nonrigid objects
- Solution
  – Use the principles of physics to approximate the shape of objects and their behavior

Physics-Based Models: Computer Graphics
- Objective
  – Model nonrigid objects and their interaction with the physical world
  – Realistically simulate and animate the motion of articulated objects with deformable parts
- Previous Attempts
  – Geometric modeling techniques have had limited success
- Solution
  – A mathematical representation of an object (or its behavior) that incorporates physical characteristics such as forces, torques, and energies into the model, allowing numerical simulation of its behavior


Geometry of Rigid/Deformable Models

Geometry: Global Deformations
- Geometric primitive: $e(u; a_1, a_2, \ldots)$
- Parameterized deformations: $T(e; b_1, b_2, \ldots)$
- Composed surface: $s = T(e(u; a_1, a_2, \ldots); b_1, b_2, \ldots)$
- Global deformation parameter vector: $q_s = (a_1, a_2, \ldots, b_1, b_2, \ldots)^T$

Geometry: Global Deformations (cont.)
- Example: Superquadric

Geometry: Global Deformations (cont.)
- Example: Superquadric with deformation

Geometry: Local Deformations
- Finite elements
  – Local deformation: $d$
  – Linear combination of nodal displacements: $d = S q_d$
- $S$: matrix of local finite element shape functions
- Implementation: linear triangular elements
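To make the geometry layer concrete, here is a minimal sketch (Python/NumPy) of a superquadric primitive $e(u; a_1, a_2, \ldots)$ composed with one example global deformation $T(e; b_1, b_2, \ldots)$, a linear tapering along the z-axis. The slides do not specify which global deformations are used, so the tapering here is only an assumed illustration.

```python
import numpy as np

def superquadric(eta, omega, a1, a2, a3, eps1, eps2):
    """Superquadric primitive e(u; a1, a2, a3, eps1, eps2).

    eta, omega: surface parameters; a1..a3: size; eps1, eps2: squareness.
    """
    def spow(base, p):  # signed power, keeps the sign of the base
        return np.sign(base) * (np.abs(base) ** p)

    x = a1 * spow(np.cos(eta), eps1) * spow(np.cos(omega), eps2)
    y = a2 * spow(np.cos(eta), eps1) * spow(np.sin(omega), eps2)
    z = a3 * spow(np.sin(eta), eps1)
    return np.stack([x, y, z], axis=-1)

def taper(points, b1, b2, a3):
    """Example global deformation T(e; b1, b2): linear tapering along z."""
    x, y, z = points[..., 0], points[..., 1], points[..., 2]
    f = 1.0 + (b1 / a3) * z          # tapering factor applied to x
    g = 1.0 + (b2 / a3) * z          # tapering factor applied to y
    return np.stack([f * x, g * y, z], axis=-1)

# Usage: sample the surface and apply the deformation, s = T(e(u; a...); b...)
eta = np.linspace(-np.pi / 2, np.pi / 2, 32)
omega = np.linspace(-np.pi, np.pi, 64)
E, O = np.meshgrid(eta, omega, indexing="ij")
e = superquadric(E, O, a1=1.0, a2=0.6, a3=2.0, eps1=0.9, eps2=0.9)
s = taper(e, b1=0.4, b2=0.4, a3=2.0)
print(s.shape)  # (32, 64, 3)
```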

Kinematics
- Generalized coordinate vector: $q = (q_c^T, q_\theta^T, q_s^T, q_d^T)^T$ (translation, rotation, global deformation, local deformation)
- Velocity of points on the model: $\dot{x} = [I \;\; B \;\; RJ \;\; RS]\, \dot{q} = L \dot{q}$
- The Jacobian $L$ maps from $q$-space to 3-space
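The analytic Jacobian $L = [I \; B \; RJ \; RS]$ depends on the particular model. As an illustration only, the sketch below approximates $L$ by finite differences of a toy (hypothetical) point-position function $x(q)$ and then uses it both to map generalized velocities to point velocities ($\dot{x} = L\dot{q}$) and, anticipating the generalized-forces slide later, to map a 3-D force back to generalized forces ($f_q = L^T f$).

```python
import numpy as np

def numeric_jacobian(x_of_q, q, eps=1e-6):
    """Finite-difference Jacobian L = dx/dq of a point position x(q)."""
    x0 = x_of_q(q)
    L = np.zeros((x0.size, q.size))
    for k in range(q.size):
        dq = np.zeros_like(q)
        dq[k] = eps
        L[:, k] = (x_of_q(q + dq) - x0) / eps
    return L

# Toy model (hypothetical): a point on a sphere of radius q[3], translated by
# q[0:3]; real models use the full q = (q_c, q_theta, q_s, q_d).
def point_on_model(q, eta=0.3, omega=1.1):
    c, r = q[:3], q[3]
    return c + r * np.array([np.cos(eta) * np.cos(omega),
                             np.cos(eta) * np.sin(omega),
                             np.sin(eta)])

q = np.array([0.1, 0.0, 0.5, 1.0])
qdot = np.array([0.0, 0.2, 0.0, 0.05])   # generalized velocities
L = numeric_jacobian(point_on_model, q)
xdot = L @ qdot                          # x_dot = L q_dot
f = np.array([0.0, 0.0, -1.0])           # a 3-D force applied at the point
fq = L.T @ f                             # generalized force f_q = L^T f
print(xdot, fq)
```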


Dynamics
- Lagrange equations of motion
- Second-order system: $M \ddot{q} + D \dot{q} + K q = g_q + f_q + f_{gc}$
  – $M$: block symmetric mass matrix
  – $D$: Rayleigh damping matrix, $D = \alpha M + \beta K$
  – $K$: stiffness matrix
  – $f_q(u, t)$: generalized external forces
  – $g_q(u, t)$: generalized inertial forces

Numerical Simulation of Motion Equations
- Vision-Shape: $\dot{q} + K q = f_q$
- Vision-Motion: $\ddot{q} + \dot{q} = f_q$
- Graphics: $M \ddot{q} + D \dot{q} + K q = f_q + g_q$
- Numerically integrate through time
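As a minimal illustration of the "numerically integrate through time" step, the sketch below advances the first-order vision-shape equation $\dot{q} + Kq = f_q$ with an explicit Euler step. The stiffness matrix, forcing, and step size are placeholders; the slides do not state which integrator is actually used.

```python
import numpy as np

def integrate_shape(q0, K, fq_of_t, dt=1e-2, steps=2000):
    """Explicit Euler integration of q_dot + K q = f_q (the vision-shape case)."""
    q = q0.copy()
    for i in range(steps):
        qdot = fq_of_t(i * dt) - K @ q   # q_dot = f_q - K q
        q = q + dt * qdot
    return q

n = 4
K = 0.5 * np.eye(n)                      # placeholder stiffness matrix
fq = lambda t: np.ones(n)                # placeholder generalized forces
q_final = integrate_shape(np.zeros(n), K, fq)
print(q_final)                           # each entry approaches K^{-1} f_q = 2.0
```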

Dynamics: Generalized Forces
- Generalized external forces: $f_q = L^T f = (f_c^T, f_\theta^T, f_s^T, f_d^T)^T$ (translation, rotation, global-deformation, and local-deformation force components)

Dynamics: Generalized Forces (cont.)
- Generalized inertial forces: $g_q = -\int \mu\, L^T \dot{L} \dot{q}\, du$, where $\dot{L} \dot{q} = \omega \times (\omega \times R p) + 2\, \omega \times R \dot{p}$ (centrifugal and Coriolis terms)

Overview
✓ Motivation
✓ Theoretical Framework
- Human Motion Analysis
  – Inferring Structure in 2D
  – Human Body Model Acquisition
  – Human Body Tracking
  – Estimating Anthropometry and Pose from a Single Camera
- Biomedical Data Analysis
- Conclusion

Motion-Based Part Segmentation
- Given an image sequence of a multi-part object whose parts move relative to one another ...
- Recover a structured description in terms of moving parts, without a priori knowledge of the object or the object domain.
- Accurately estimate the parts' shape and motion parameters.


Part Segmentation Algorithm (PSA)

Motion-Based Part Segmentation
- Advantages
  – Integrates the processes of part segmentation and fitting
  – Allows reliable shape description of the parts
  – Estimates the location of the joints between the parts (if any)
  – Detects multiple joints
  – Does not require an a priori model of the multi-part object or of the shape of the parts

Contribution
- New framework for the two-dimensional part segmentation, shape, and motion estimation of multi-part objects.

Overview
✓ Motivation
✓ Theoretical Framework
- Human Motion Analysis
  – Inferring Structure in 2D
  – Human Body Model Acquisition
  – Human Body Tracking
  – Estimating Anthropometry and Pose from a Single Camera
- Biomedical Data Analysis
- Conclusion

Human Model Acquisition
- Given image sequences (from multiple views) of a moving human ...
- Automatically segment the apparent contour and estimate the 2D shape of the subject's body parts (without a prior model for the human body or for the shape of the parts).
- Contribution: automatically acquire a three-dimensional model of the subject's body parts.

Experimental Setup
- [Figure: sagittal plane, coronal plane, transverse plane]


Human Body Model Acquisition
- Protocol of movements: MovA
  1. Head motion
  2. Left upper body extremities motions
  3. Right upper body extremities motions

Human Body Model Acquisition
- Protocol of movements: MovA (cont.)
  4. Left lower body extremities motions
  5. Right lower body extremities motions

Results
- Human head and left arm

Results
- Human leg

Results
- 3D models for the arm and the leg

Validation and Performance Analysis
- 3D shape estimation of a subject's body parts


Challenges
- Humans perform complex 3D non-rigid motions
- Body parts may not be visible from certain viewpoints

Validation and Performance Analysis
- 3D shape estimation of a subject's body parts
  – min error: 0.001 mm
  – max error: 3.736 mm
  – mean: 1.459 mm
  – std dev: 1.170 mm

Overview
✓ Motivation
✓ Theoretical Framework
- Human Motion Analysis
  – Inferring Structure in 2D
  – Human Body Model Acquisition
  – Human Body Tracking
  – Estimating Anthropometry and Pose from a Single Camera
- Biomedical Data Analysis
- Conclusion

Human Motion Capture
- Given image sequences of a moving human ...
- Estimate over time the 3D position and orientation of a subject's body parts.

3D Model-Based Tracking
- Input
  – Image sequences of the moving human from three views, and
  – The 3D models of the subject's body parts (as obtained with our method)
- Output
  – The 3D position and orientation over time of each of the subject's body parts

Human Motion Capture
- Advantages of our approach
  – Obviates the need for markers or special equipment
  – Model obtained from observations
  – Mitigates difficulties arising due to occlusion among body parts
  – Selects a subset of the cameras in an active and time-varying fashion


Model-Based Tracking: Steps
- Predict
- Select
- Match
- Update

Model-Based Tracking: Predict
- Extended Kalman Filter over the state $(\dot{q}, q)$:
  $\begin{bmatrix} \ddot{q} \\ \dot{q} \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \dot{q} \\ q \end{bmatrix} + \begin{bmatrix} f_q(t) \\ 0 \end{bmatrix} + w(t), \qquad z(t) = h(\dot{q}, q)(t) + v(t)$
  – $z(t)$: vector of observations
  – $h(t)$: nonlinear function which relates the input data to the model's state
  – $w(t)$: modeling error
  – $v(t)$: measurement noise
- Predicted occluding contour

Model-Based Tracking: Select
- Observability Index: $I = \sum_i \mathrm{area}(c_i, c_{i+1}, P_{i+1}, P_i)$

Model-Based Tracking: Update
- Lagrange equations of motion: $\ddot{q} + \dot{q} = f_q$
  – $q(t)$: the generalized coordinate vector
  – $f_q(t)$: generalized external forces
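The sketch below is a generic Extended Kalman Filter predict/update cycle for the state $(\dot{q}, q)$ with the dynamics above, using a simple Euler discretization and a finite-difference Jacobian of the measurement function. The measurement function $h$, the noise covariances, and the example numbers are placeholders rather than the ones used in the slides.

```python
import numpy as np

def ekf_step(s, P, fq, h, z, dt, Q, R):
    """One predict/update cycle of an EKF for the state s = (q_dot, q).

    Continuous dynamics (from q_ddot + q_dot = f_q):
        d/dt [q_dot; q] = [[-I, 0], [I, 0]] [q_dot; q] + [f_q; 0] + w(t),
    discretized here with a simple Euler step.
    """
    n = s.size // 2
    A = np.block([[-np.eye(n), np.zeros((n, n))],
                  [ np.eye(n), np.zeros((n, n))]])
    b = np.concatenate([fq, np.zeros(n)])

    # Predict
    F = np.eye(2 * n) + dt * A                 # discrete state-transition Jacobian
    s_pred = s + dt * (A @ s + b)
    P_pred = F @ P @ F.T + Q

    # Finite-difference Jacobian H of the measurement function h
    eps = 1e-6
    z0 = h(s_pred)
    H = np.zeros((z0.size, s.size))
    for k in range(s.size):
        ds = np.zeros_like(s_pred)
        ds[k] = eps
        H[:, k] = (h(s_pred + ds) - z0) / eps

    # Update
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)        # Kalman gain
    s_new = s_pred + K @ (z - z0)
    P_new = (np.eye(s.size) - K @ H) @ P_pred
    return s_new, P_new

# Placeholder usage: one generalized coordinate, hypothetical direct observation of q
h = lambda s: s[1:2]
s, P = np.zeros(2), np.eye(2)
Q, R = 1e-3 * np.eye(2), 1e-2 * np.eye(1)
s, P = ekf_step(s, P, fq=np.array([0.5]), h=h, z=np.array([0.1]), dt=0.03, Q=Q, R=R)
print(s)
```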

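For the camera-selection step, a literal reading of the observability index $I = \sum_i \mathrm{area}(c_i, c_{i+1}, P_{i+1}, P_i)$ is sketched below: each term is taken as the area of the quadrilateral with vertices $c_i$, $c_{i+1}$, $P_{i+1}$, $P_i$, computed with the shoelace formula. The exact meaning of the corresponding 2-D point lists $c$ and $P$ is not spelled out in the slides, and the sample coordinates are made up.

```python
import numpy as np

def quad_area(p1, p2, p3, p4):
    """Area of the quadrilateral p1-p2-p3-p4 via the shoelace formula."""
    pts = np.array([p1, p2, p3, p4])
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def observability_index(c, P):
    """I = sum_i area(c_i, c_{i+1}, P_{i+1}, P_i) over corresponding point lists."""
    return sum(quad_area(c[i], c[i + 1], P[i + 1], P[i]) for i in range(len(c) - 1))

# Made-up corresponding points, just to exercise the formula
c = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.0)]
P = [(0.0, 1.0), (1.0, 1.2), (2.0, 0.9)]
print(observability_index(c, P))
```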

Human Body Model Acquisition
- [Figure: input frames 001-035 from Camera 1, Camera 2, and Camera 3]

Human Motion Capture

Validation and Performance Analysis
- 3D Model-Based Tracking

Validation and Performance Analysis: 3D Model-Based Tracking

  Error (mm)   XY Plane   Z (height)
  Min Error    0.38       1.01
  Max Error    9.47       5.80
  Mean         4.31       3.67
  Std Dev      1.72       1.03
  RMS          1.67       0.54

Video Presentation

Tracking Using Monocular Images
- There are several applications for which video recordings from only one view are available.


Motivation
- Performance measurement for human factors engineering

Motivation (cont.)
- Posture and gait analysis for training athletes and physically challenged individuals
  (http://www.motionanalysis.com)

Motivation (cont.)
- Human body, hands, and face animation
  (http://ligwww.epfl.ch/)

Motivation (cont.)
- Automatic annotation of human activities in video databases

Problem Statement
- Given a set of points in an image that correspond to the projection of landmark points of a human subject ...
- Estimate both the anthropometric measurements (up to a scale) of the subject and his/her pose that best match the observed image.
- [Figure: image landmarks and the corresponding 3D human model (anthropometry up to scale)]

Our Approach
- Novelty
  ✓ Using anthropometric statistics to constrain the estimation process
- Advantages
  ✓ Estimation of both anthropometry and pose simultaneously
  ✓ Able to estimate anthropometry and pose from a single image


Overview
- Novelty: using anthropometric statistics to constrain the estimation process
- Step 1: Selection of projected landmarks on the image
- Step 2: Initial anthropometric estimates; initial pose estimates
- Steps 3, 4, ...: Iterative minimization over lengths and angles
- Output: human model

Step 1: Selection of Projected Landmarks
- Through a simple interface, the user:
  – Selects the projection of visible landmarks of the subject's body
  – Marks segments whose orientation is almost parallel to the image plane
  – Marks pairs of segments that have similar orientation

Output
- Image coordinates of the selected projected landmarks
- A set of ratios (of projected lengths) using the segments selected by the user

Human Body Model
- 22 segments, 17 joints, and 64 DOF (a sketch of one way to encode this model follows the table)

  ID  Joint               From  To   DOF              PR
  at  atlanto-occipital   NK    HD   Tz*Rz*Ry*Rx      3
  sp  solar plexus        UT    NK   Tz*Ry*Rz*Rx      2
  la  left ankle          LLL   LF   Tx*Rz*Rx*Ry      4
  lc  left clavicle       UT    LC   Tz*Rx*Ry         3
  le  left elbow          LUA   LLA  Tz*Ry            5
  lh  left hip            LT    LUL  Tz*Rz*Rx*Ry      2
  lk  left knee           LUL   LLL  Tz*R-y           3
  ls  left shoulder       LC    LUA  Tz*Rz*Rx*Ry      4
  lw  left wrist          LLA   LHD  Tz*Ry*Rx*Rz      6
  ra  right ankle         RLL   RF   Tx*R-z*R-x*Ry    4
  rc  right clavicle      UT    RC   Tz*R-x*Ry        3
  re  right elbow         RUA   RLA  Tz*Ry            5
  rh  right hip           LT    RUL  Tz*R-z*R-x*Ry    2
  rk  right knee          RUL   RLL  Tz*R-y           3
  rs  right shoulder      RC    RUA  Tz*R-z*R-x*Ry    4
  rw  right wrist         RLA   RHD  Tz*Ry*R-x*R-z    6
  wt  waist               LT    UT   Tz*Ry*Rz*Rx      1
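A minimal sketch of one way the articulated body model above might be encoded (not the authors' data structure): a table of joints with parent/child segments, the DOF string, and the PR column, which is read here, as an assumption, as the priority later used by the hierarchical solver. Only a few of the 17 joints are filled in.

```python
from dataclasses import dataclass

@dataclass
class Joint:
    joint_id: str   # e.g. "le" for left elbow
    name: str
    parent: str     # "From" segment in the table above
    child: str      # "To" segment in the table above
    dof: str        # translation/rotation chain, e.g. "Tz*Ry"
    pr: int         # PR column (assumed here to be a solver priority)

# A few rows of the joint table above; the full model has 17 joints.
JOINTS = [
    Joint("wt", "waist",             "LT",  "UT",  "Tz*Ry*Rz*Rx", 1),
    Joint("sp", "solar plexus",      "UT",  "NK",  "Tz*Ry*Rz*Rx", 2),
    Joint("at", "atlanto-occipital", "NK",  "HD",  "Tz*Rz*Ry*Rx", 3),
    Joint("ls", "left shoulder",     "LC",  "LUA", "Tz*Rz*Rx*Ry", 4),
    Joint("le", "left elbow",        "LUA", "LLA", "Tz*Ry",       5),
]

def rotational_dofs(joint: Joint) -> int:
    """Count the rotational degrees of freedom encoded in the DOF string."""
    return sum(1 for token in joint.dof.split("*") if token.startswith("R"))

# Visit the joints in (assumed) priority order, root first.
for j in sorted(JOINTS, key=lambda joint: joint.pr):
    print(j.joint_id, j.name, rotational_dofs(j), "rotational DOF")
```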

Family of Human Body Models
- 2187 human body models based on anthropometric statistics
- The cadre family is a representation of the population distribution which spans the space to capture a significant amount of the variance

Step 2: Initial Anthropometric Estimates
- Input: a set of ratios using the segments selected by the user
- Output: the initial human model $q^*$ from our cadre family of 2187 human models:
  $q^* = \arg\min_{q} \sum_{i, j \in I} \left( r_{ij}^q - p_{ij} \right)^2$
  where $r_{ij}^q = l_i^q / l_j^q$, $i < j$, are the ratios of the segments of each cadre family member that correspond to the segments selected by the user.
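A direct sketch of Step 2 as stated above: choose from the cadre family the model whose segment-length ratios $r_{ij}^q = l_i^q / l_j^q$ best match the measured ratios $p_{ij}$ in the least-squares sense. The three-member "family" below is a made-up stand-in for the 2187-member cadre family.

```python
def best_cadre_model(families, measured):
    """Select q* = argmin_q sum_{(i,j)} (r_ij^q - p_ij)^2.

    families: dict model_name -> dict segment_name -> length l_i^q
    measured: dict (seg_i, seg_j) -> measured ratio p_ij from the image
    """
    def cost(lengths):
        return sum((lengths[i] / lengths[j] - p) ** 2
                   for (i, j), p in measured.items())
    return min(families, key=lambda name: cost(families[name]))

# Made-up stand-in for the 2187-model cadre family (lengths in arbitrary units)
families = {
    "model_A": {"LC": 0.33, "UT": 0.30, "LUA": 0.28, "LLA": 0.26},
    "model_B": {"LC": 0.36, "UT": 0.29, "LUA": 0.30, "LLA": 0.27},
    "model_C": {"LC": 0.31, "UT": 0.31, "LUA": 0.27, "LLA": 0.28},
}
measured = {("LLA", "LUA"): 0.93, ("LC", "UT"): 1.15}   # example ratios p_ij
print(best_cadre_model(families, measured))
```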


Steps 3 and 4: Estimates for Pose and Anthropometry
- Goal
  ✓ Minimize the discrepancy between the synthesized appearance of the Stick Model (for that pose) and the image data of the subject in the given image:
    $\min f(x_j)$ subject to $L_j \le x_j \le U_j, \; j = 1, \ldots, K$,
    where $x_j$ can be an angle, a length, or a ratio, and $L_j$ and $U_j$ are its lower and upper values.

Initial Pose Estimates
- We use a geometric method to provide two initial guesses for the pose of some segments, as follows:
- Points along the viewing ray from the camera's center of projection $o$ are written as $o + \lambda d_i$; requiring $\| o + \lambda d_i - j \| = l_i$ gives the two solutions
  $\lambda_{1,2} = d_i \cdot (j - o) \pm \sqrt{[\, d_i \cdot (j - o) \,]^2 - \| o - j \|^2 + l_i^2}$
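The two initial guesses come from intersecting the viewing ray $o + \lambda d_i$ with the sphere of radius $l_i$ centered at $j$, i.e. solving $\|o + \lambda d_i - j\| = l_i$ for $\lambda$. A direct sketch, assuming $d_i$ is a unit vector and using made-up coordinates:

```python
import numpy as np

def pose_lambdas(o, d, j, l):
    """Solve ||o + lambda*d - j|| = l for lambda (d assumed unit length).

    Returns lambda_{1,2} = d.(j - o) +/- sqrt([d.(j - o)]^2 - ||o - j||^2 + l^2),
    or None if the ray misses the sphere (negative discriminant).
    """
    b = np.dot(d, j - o)
    disc = b * b - np.dot(o - j, o - j) + l * l
    if disc < 0:
        return None
    r = np.sqrt(disc)
    return b + r, b - r

# Made-up example: camera center o, unit ray direction d, joint j, segment length l
o = np.array([0.0, 0.0, 0.0])
d = np.array([0.0, 0.0, 1.0])
j = np.array([0.2, 0.0, 2.0])
print(pose_lambdas(o, d, j, l=0.5))   # two candidate depths along the viewing ray
```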

The Objective Function
- The sum of squared distances between the Stick Model's sites and the closest points on the lines formed by the camera's center of projection and the corresponding landmarks
- [Figure: image, 3D model, and the camera's center of projection]

To guide the minimization process to a solution for a pose that is anthropometrically plausible, we apply:
✓ a geometric method for the initial pose estimation
✓ a hierarchical solver
✓ various constraints

Hierarchical Solver
- We assign a priority to each joint and site, and we schedule the optimization process accordingly
- [Flow: input data → minimization process (Steps 3, 4) → virtual human model]

Constraints
- Three classes of constraints are applied:
  – Constraints derived from the joint-limit information associated with the range of motion of each joint,
  – Constraints that enforce the symmetry between the left and right sides of the subject, and
  – Constraints that enforce proportions.
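Putting Steps 3 and 4 together, here is a hedged sketch of the bounded minimization $\min f(x)$, $L_j \le x_j \le U_j$, with the objective written as the sum of squared distances between the model's sites and the viewing rays through their landmarks. The stick model, landmarks, and bounds are made up, SciPy's L-BFGS-B is used purely for illustration (the slides do not name an optimizer), and the joint-limit, symmetry, and proportion constraints would be layered on top of the simple bounds shown here.

```python
import numpy as np
from scipy.optimize import minimize

def point_to_ray_distance(p, o, d):
    """Distance from point p to the ray o + t*d (d unit length)."""
    v = p - o
    return np.linalg.norm(v - np.dot(v, d) * d)

def sites_from_params(x):
    """Hypothetical stick model: two sites placed by one length and two angles."""
    length, theta, phi = x
    s1 = np.array([0.0, 0.0, 2.0])                     # fixed root site
    s2 = s1 + length * np.array([np.sin(theta) * np.cos(phi),
                                 np.sin(theta) * np.sin(phi),
                                 np.cos(theta)])
    return [s1, s2]

def objective(x, o, rays):
    """Sum of squared site-to-ray distances (one ray per landmark)."""
    return sum(point_to_ray_distance(s, o, d) ** 2
               for s, d in zip(sites_from_params(x), rays))

o = np.zeros(3)                                        # camera center of projection
rays = [np.array([0.0, 0.0, 1.0]),                     # unit rays through the landmarks
        np.array([0.3, 0.0, 1.0]) / np.linalg.norm([0.3, 0.0, 1.0])]
bounds = [(0.2, 0.8), (0.0, np.pi), (-np.pi, np.pi)]   # L_j <= x_j <= U_j
res = minimize(objective, x0=[0.5, 0.5, 0.0], args=(o, rays),
               method="L-BFGS-B", bounds=bounds)
print(res.x)
```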


Results
- Synthetic experiment

  Accuracy    LC/(UT+LT)  LLA/LUA  LHP/LUA  LF/LUL  LF/LUL
  Actual      0.6553      0.9829   0.5700   0.6397  0.6341
  Estimated   0.6517      0.9781   0.5713   0.6377  0.6329
  PE %        0.5494      0.4810   0.2281   0.3090  0.1876

Results
- Subject: Vanessa

  Accuracy    LC/(UT+LT)  LLA/LUA  LHP/LUA  LF/LUL  LF/LUL
  Actual      0.6279      0.8625   0.6949   0.5517  0.4778
  Estimated   0.6266      0.8516   0.6925   0.5468  0.4767
  PE %        0.1958      1.2638   0.3180   0.8957  0.2302

Results
- Field work

Results
- Golfer

Golf

Results
- Tennis Player


Tennis

Results
- Cyclist

VERI
- Research Thrusts
  – Intelligent Systems
  – Computational Biomedicine
  – Biomedical Robotic Systems
  – Geophysical Data Analysis and Visualization
  – Analysis, Modeling, Simulation, Visualization
  – Multimodal Human Computer Interaction

Conclusions
- "We live in interesting times"
- Abundance of sensors
- Large volumes of information-rich data
- New efficient and robust methods for analyzing, querying, visualizing, and storing data

Our Team
- Post-Doc
  – Geovanni Martinez
- Ph.D. Students
  – Carlos Barron
  – Amol Pednekar
- M.S. Student
  – Anthony Do
- Undergraduate Student
  – Vanessa Zavaletta

Acknowledgements
- NSF (CAREER Award)
- NASA JSC
- Texas Higher Education Coordinating Board
- American Honda
- Shell Foundation
