Graphical Models 66 (2004) 1–23 www.elsevier.com/locate/gmod

An example-based approach to human body manipulation

Hyewon Seo* and Nadia Magnenat-Thalmann

MIRALab, University of Geneva, 24 rue du General Dufour, CH-1211 Geneva, Switzerland

Received 16 August 2002; received in revised form 3 June 2003; accepted 25 July 2003

Abstract

We discuss a set of example-based techniques for generating realistic, controllable human whole-body models. Users are assisted in automatically generating a new model, or modifying an existing one, by controlling the provided parameters. Our approach is based on examples and consists of three major parts. First, each example from the 3D range scanner is preprocessed so that the topology of all examples is identical. Second, the system that we call the modeling synthesizer learns from these examples the correlation between the parameters and the body geometry. After this learning process, the synthesizer is devoted to generating body geometry of appropriate shape and proportion through interpolation. Finally, we demonstrate our modifier synthesizer for more subtle manipulations of example models, using high-level parameters such as fat percentage. On any synthesized model, the underlying bone and skin structure is properly adjusted, so that the model remains fully animatable through joint animation. By allowing automatic modification from a set of parameters, our approach may eventually lead to the automatic generation of a variety of population models. © 2003 Elsevier Inc. All rights reserved.

Keywords: Anthropometry; Somatotyping; 3D scanned data; Human body modeling; Statistical estimation; Example-based approach; Interpolation; Regression

* Corresponding author. Fax: +41-22-705-7780.
E-mail addresses: [email protected] (H. Seo), [email protected] (N. Magnenat-Thalmann).
1524-0703/$ - see front matter © 2003 Elsevier Inc. All rights reserved.
doi:10.1016/j.gmod.2003.07.004


H. Seo, N. Magnenat-Thalmann / Graphical Models 66 (2004) 1–23

1. Introduction

The problem of modeling the human body is a central task in computer graphics. A variety of human body modeling methodologies are now available; they can be classified into the creative approach and the reconstructive approach. Anatomically based modelers, such as [29,31] and [34], fall into the former category. They observe that models should mimic the actual components of the body, and their models consist of multiple layers simulating individual muscles, bones, and tissues. While allowing for interactive design, they require considerable user intervention and thus suffer from relatively slow production times and a lack of efficient control facilities. Lately, much work has been devoted to the reconstructive approach, which builds the 3D geometry of a human automatically by capturing an existing shape [11,17,21,23]. One limiting factor of these techniques is that they give the user hardly any control; i.e., it is very difficult to automatically modify the resulting models to the different shapes the user intends. Example-based techniques [5,32] are good alternatives for overcoming such limitations, since they provide high-level control of the target model while maintaining the quality that exists in the example models. In particular, methods that extract useful statistics from the examples are promising because the statistics can guide the estimation of uncertain information and thus add robustness to the modeler. Indeed, human modeling seems to be an area where statistics are most useful [5,9]. Anthropometric modeling systems [3,10,14] typically involve intensive use of statistical information previously compiled through sizing surveys. With the advent of 3D scanning technology, the use of scan data has become common in modern anthropometry [7], replacing the conventional tape measure.
Given these favorable conditions, we suggest an alternative approach to anthropometric modeling: unlike earlier anthropometric modelers, we directly make use of a scan dataset rather than a statistically analyzed form of anthropometric data. Arguably, the captured body geometry of real people provides the best available resource for modeling and estimating correlations between measurements and shape. In this work, we show how example-based techniques can be used to regulate the realism of the human body shape when building a body model that satisfies a number of measurements on various parts. Unlike previous approaches to anthropometric modeling, the subtle correlation between the measurements and the shape of the body is captured through a set of measured body geometries from a range scanner. This paper contains three main contributions:

• Our modeler builds on a template body model that has been designed primarily for real-time animation applications. The model encapsulates all required structural components: bones and skin-attachment data, which allow the model to be animated at any time during synthesis through skin-attachment recalculation, and landmarks and contours, through which it remains measurable.

• We present a modeling synthesizer that takes a set of sizing control parameters as input and generates whole-body geometry. We use RBF networks to learn from examples the mapping of the parameters to the body geometry. We also show that the synthesizer can naturally be extended to the automated generation of population models and of exaggerated models.

• One natural manipulation of the body geometry is the modification of existing models. Based on a statistical analysis of the example models, we demonstrate the use of shape measures in controlling the global shape of the model.

The benefits of such a system are threefold. First, a user can easily and rapidly control the resulting model by changing intuitive parameters. Second, very realistic geometries are produced. Finally, the generated models are readily usable in a visualization or simulation environment. A preliminary version of this work was presented in [30]. A video sequence showing the resulting models can be found on our website http://www.miralab.unige.ch/~seo/VideoGM.

1.1. Related work

Since anthropometry was first introduced in computer graphics [13], a number of researchers have investigated the application of anthropometric data to the creation of virtual humans. Azuola et al. [3] and Grosso et al. [15] introduced systems that create variously sized human models based on anthropometric data [25]. The system automatically generates the dimensions of each segment of an initial human figure based upon population data supplied as input. Alternatively, the given dimensions of a person can be used directly in the creation of a virtual human model. For each segment or body structure with a geometrical representation, three measurements were considered, namely the segment length, width, and depth (or thickness). The desired dimension was implemented primarily by rigid scaling of each component, although they later showed an extension of the system equipped with partially deformable models [3]. In our application, the challenge is to obtain seamless, deformable, and realistic body models.

More recently, DeCarlo et al. [10] have shown that the problem of generating face geometries can be reduced to that of generating sets of anthropometric measurements by adopting a variational modeling technique. The idea is to generate a shape that shares, as much as possible, the important properties of a prototype face and yet still respects a given set of anthropometric measurements. They cast the problem as a constrained optimization: anthropometric measurements are treated as constraints, and the remainder of the face is determined by optimizing an objective function on the surface. A variety of faces can then be generated automatically for a particular population. This is an interesting approach that, unfortunately, is slow in creation time (approximately one minute per face) owing to the nature of variational modeling. Also, the shape remains a passive constituent: the prototype shape is conformed to satisfy the measurement constraints while 'fairness', i.e., smoothness of the shape, is maximized. Therefore, every desirable facial feature observable in real faces, such as a hooked nose or a double chin, has to be explicitly specified as a constraint in order to obtain a realistic shape in the resulting model.


A good alternative for building an anthropometric body model is an example-based one, in which the captured sizes and shapes of real people are used to determine the shape in relation to the given measurements. Various example-based modeling techniques have been introduced. In work with goals similar to ours but applied to face models, other researchers [5] introduced a 'morphable face model' for manipulating an existing model according to changes in certain facial attributes. New faces are modeled by forming linear combinations of prototypes collected from 200 scanned face models. Manual assignment of attributes is used to define shape and texture vectors that, when added to or subtracted from a face, manipulate a specific attribute. Several methods in the literature [24,28,32] use radial basis functions (RBFs) [26] to represent and model objects of the same class as functional approximations. Lewis et al. [24] demonstrated the skin deformation of an articulated character as a function of pose. They attempt to overcome the limitations of geometric skin deformation by using sculpted skin surfaces at varying postures and blending them during animation. Each example is annotated with its attributes and located in the attribute and shape spaces. For any point in the attribute space, a new shape is derived through RBF interpolation of the example shapes. Recently, Kry et al. [22] proposed an extension of that technique using principal component analysis (PCA), allowing for optimal reduction of the data and thus faster deformation. Sloan et al. [32] have shown similar results using RBFs for blending facial models and arms to generate new ones. Their blending function is obtained by solving the linear system per example rather than per degree of freedom, by making use of an equivalent of cardinal basis functions, resulting in improved performance.

This paper adopts RBF interpolation similarly to the works described above and applies it to blending examples, but it distinguishes itself from them in several key points. First, like [2], we start from unorganized scan data, whilst the others start from existing models that have been sculpted by artists. Also, instead of dealing with a linear system per vertex coordinate, of which there is potentially a large number, we introduce an approximate, compact representation of the data: the skeleton-displacements representation combined with the eigenbasis allows us to handle the complex geometry of the articulated body efficiently. Last but not least, we take a broader scope and manipulate the whole body, particularly focusing on the torso. Allen et al. [2] present yet another example-based method for creating realistic skeleton-driven deformation. Our approach resembles theirs in that it makes use of a template model and unorganized scanned data, although it adopts a unique and compact shape description of the body geometry and, unlike other systems, handles the whole body. Also, we focus on generating diversity of appearance, and thus the interpolation is among different individual models rather than within one character model at various postures, although they have also shown that they can alter the appearance of the animated character's upper body by simply scaling the control points relative to the bones.


1.2. Overview and the organization of the paper

The overall system is illustrated in Fig. 1. Our method is based on examples, and we start in Section 2 with a description of the database of 3D body scans on which our synthesizers are built. Because we want to represent the body geometry as a vector in the body space, as described in Section 3, the topology of all the body meshes in the example database must be identical. The procedures we use to prepare the example models are outlined in the same section. A prerequisite of parameterized modeling is a choice of control parameters; we describe our two main categories of parameters in Section 4. There are two different types of synthesizers: modeling synthesizers, which create new body geometries using body measurements as input parameters, and modifier synthesizers, which manipulate existing models. In Section 5, we introduce our modeling synthesizers and explain how we derive them by making use of the database. An RBF network is used to determine the mapping from the parameters to the body geometry. Section 6 gives a description of the modifier synthesizer; its main purpose is to obtain variation of the body geometry according to certain body attributes whilst keeping the distinctiveness of the individual as much as possible. Once the synthesizers are constructed, new shapes are created simply by controlling the parameters, which drives the desired deformation of the reference model through evaluation of the constructed synthesizers. This is the subject of Section 7. Finally, we conclude the paper in Section 8.

Fig. 1. Overview of the proposed approach.


2. Data acquisition and preprocessing

2.1. 3D scanned data

The example models used in this paper were obtained from 100 scanned bodies (50 for each) of European adults, acquired mostly with the Tecmath scanner [33]. Texture data was not available and is not within the scope of this work. All subjects are in an erect posture with arms and legs slightly apart, and are lightly clothed, which allowed us to carry out 'touchless' measurements. Each 3D scan was measured at the various desired parts (detailed in Section 4) and annotated with its measured values.

2.2. Reference model

As shown in Fig. 2, we use a two-layer model for the reference: skeleton and skin mesh, without an intermediate layer representing fat tissue and/or muscle. The skeleton hierarchy that we use is composed of 33 joints, including the root [16]. The template mesh is essentially a set of horizontal contours and vertical lines forming a set of Bézier quad-patches. Such a grid structure enables us to take all supported measurements (girths and lengths) immediately, at any time during the process. Given the skin and the skeleton, we must perform a process called skinning, or skin attachment, so that a smooth skin deformation is obtained whilst the joints are transformed; the curious reader may find details in [24]. Despite significant work on visualizing realistic, dynamic skin shape [2,22,24], most character animation in interactive applications is based on this geometric skeletal deformation technique [6], which is also employed here.

Fig. 2. The template model: (a) skeleton hierarchy; (b) skin mesh at a lower resolution; and (c) skin mesh at a higher resolution.

2.3. Preprocessing

Throughout this paper, we represent the body geometry as a vector of fixed size (i.e., the topology is known a priori), which we obtain by conforming the template model onto each scanned model. An assumption made here is that any body geometry can be obtained by deforming the template model. A number of existing methods such as [11] and [30] could be used successfully; in this work, we adopt a feature-based method presented in [30]. The basic idea is to use a set of pre-selected landmark or feature points to measure the fitting accuracy and to guide the conformation through optimization. There are two main phases in the algorithm: skeleton fitting and fine refinement. The skeleton fitting phase finds a linear approximation (posture and proportion) of the scanned model by conforming the template model to the scanned data through skeleton-driven deformation. Based on the feature points, the most likely joint parameters are found that minimize the distance between corresponding feature locations. The fine refinement phase then iteratively improves the fitting accuracy by minimizing the shape difference between the template and the scanned model. The resulting shape difference is saved in the displacement map of the target scan.

3. Representation

3.1. Skeleton parameter and displacement map

As mentioned in Section 2.3, the deformation of the template model to acquire a new one has two distinct entities, namely the skeleton and displacement components. The skeleton component is the linear approximation of the physique, which is determined by the joint transformations through the skinning (see Section 2.2). The displacement component is essentially a set of vertex displacements which, when added to the skin surface resulting from the skeletal deformation, depicts the detailed shape of the body.
Since each example model has a different posture, and since the vertex displacements are posture-dependent, the captured skeleton component is normalized by applying a predefined constant rotation, so that every example shares the same posture. Thus, we denote the skeleton component as

J = (t_x^1, t_y^1, t_z^1, s_x^1, s_y^1, s_z^1, t_x^2, ..., t_y^m, t_z^m, s_x^m, s_y^m, s_z^m)^T ∈ R^{6m},

where t_x^j and s_x^j are the translation and scale of joint j (j = 1, ..., m) along the x-axis, with the rotation excluded. Similarly, the displacement component is represented by

D = (d_x^1, d_y^1, d_z^1, d_x^2, ..., d_y^n, d_z^n)^T ∈ R^{3n},


where d_x^v is the displacement of vertex v (v = 1, ..., n) along the x-axis on the skin mesh. We therefore represent the geometry by a body vector P = (J, D).

3.2. Principal component analysis (PCA)

Although the transformations computed independently for each bone could be used as above, significant redundancy exists in the bone transformations, with neighboring bones exhibiting similar transformations. For instance, bodies with larger hip girths than average (large s_x and s_y of the 'sacroiliac' joint) will most of the time also have larger thighs (large 'hip' joints). A similar redundancy exists for the skin displacements. Thus, we seek simplifications that allow the synthesizer to operate on a compact vector space. In both cases, we adopt PCA [27], one of the common techniques for reducing data dimensionality. Upon finding the orthogonal basis of eigenvectors, the original data vector x of dimension n can be represented by its projection onto the first M (M << n) eigenvectors corresponding to the M largest eigenvalues. In our case, the first 25 bases were enough to describe 99% of the variation among the data, so a 25-dimensional space was formed. Thus, the final representation of the body vector P_N = (J_N, D_N) is composed of two sets of 25 coefficients, obtained by projecting the initial body vector onto each set of eigenvectors, where

J_N = (s_1, s_2, ..., s_25)^T ∈ R^{25},    D_N = (d_1, d_2, ..., d_25)^T ∈ R^{25}.
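As a concrete illustration of the reduction step, the projection of body vectors onto the leading eigenvectors can be sketched in a few lines of NumPy. This is not the authors' code: the variance threshold and the toy data are hypothetical, and the paper's separate treatment of the joint and displacement components is omitted for brevity.

```python
import numpy as np

def pca_basis(X, var_target=0.99):
    """Eigenbasis of the example vectors (rows of X) capturing at
    least `var_target` of the total variance."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the eigenvectors
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    var = (s ** 2) / (s ** 2).sum()
    M = int(np.searchsorted(np.cumsum(var), var_target)) + 1
    return mean, Vt[:M]

def project(x, mean, basis):
    """Compact coefficients of a body vector in the reduced space."""
    return basis @ (x - mean)

def reconstruct(coeffs, mean, basis):
    """Map reduced coefficients back to a full-dimensional body vector."""
    return mean + basis.T @ coeffs
```

Applied to both the skeleton and the displacement components, this yields the compact coefficient vectors on which the synthesizers operate.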

4. Control parameters: size vs. shape

Arguably, sizing parameters or anthropometric measurements allow the most complete control over the shape of the body, but providing all the measurements required to detail the model would be impractical. In this paper, eight anthropometric measurements (five girths and three lengths) are chosen as sizing parameters. Supporting dozens of measurements is beyond the scope of this work; instead, our application focuses on the eight primary measurements listed in Fig. 3, which have been defined as the primary body measurements for product-independent size assignment [19]. Using such a small measurement set not only provides compact parameters for the body geometry representation but also allows the parameters to be obtained easily by anyone, enabling applications such as an online clothing store where a user is asked to enter his/her measurements for customized apparel design. Throughout this paper, we assume that these key measurements are provided as input.

Body attributes such as hourglass or pear/apple shape are typically those that provide a global description and dramatically reduce the number of parameters. The closest metric that maps these attributes to numbers is the hip-to-waist ratio (HWR; hip girth divided by waist girth). Another metric that describes global change of the physique is the fat percentage. It requires, however, information that is not typically available from the scanned data; for example, in 'anthropometric somatotyping' [9], skinfolds at four selected parts of the body are required to give a rating of approximate fat percentage. Fortunately, a number of empirical results allow us to estimate the fat percentage from several anthropometric measurements [18].1 Here, we adopt HWR and fat percentage as the shape parameters of the modifier synthesizer.

Fig. 3. Anthropometric measurements of the human body.

5. Modeling synthesizer

Once the system is provided with the example models prepared in Section 2, we build synthesizers for each component of the body vector through interpolation. These synthesizers allow runtime evaluation of the shape from the given input parameters. We split the synthesizers into two kinds: joint synthesizers handle each DoF of the joints, while displacement synthesizers find the appropriate displacements on the template skin from the input parameters (see Fig. 4). As mentioned earlier, we model the body geometry by controlling size parameters. One might think this can be done by directly deriving appropriate transformations of the corresponding bones from the sizing parameters. Such an approach has a number of limitations: first, one must somehow determine the transformations that are not directly given by the size parameters. The second problem, which is essentially a consequence of the first, is that it hardly guarantees a realistic shape in the resulting geometry.

1 The regression equations used to estimate the fat percentage of a Caucasian woman and a Caucasian man are 163.205 · log10(abdomen + hip - neck) - 97.684 · log10(height) - 78.387 and 86.010 · log10(abdomen - neck) - 70.041 · log10(height) + 36.76, respectively.
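The footnote's coefficients match the circumference-based body-fat equations popularized by the US Navy, which take measurements in inches; a direct transcription follows. The function name and the inch-unit assumption are ours, not the paper's.

```python
from math import log10

def fat_percentage(height, neck, abdomen, hip=None, female=False):
    """Estimate body-fat percentage from girth measurements using the
    regression equations quoted in the footnote. Measurements are
    assumed to be in inches (units are not stated in the paper)."""
    if female:
        if hip is None:
            raise ValueError("hip girth is required for the female equation")
        return (163.205 * log10(abdomen + hip - neck)
                - 97.684 * log10(height) - 78.387)
    return (86.010 * log10(abdomen - neck)
            - 70.041 * log10(height) + 36.76)
```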


Fig. 4. Modeling of a body by our modeling synthesizer. (a) Template model (mesh and the skeleton). (b) After the skeleton adjustment. (c) Output model.

Similarly to many example-based methods [2,5,32], we look for a smooth interpolation that maps the parameter space onto the body geometry space, using the examples as interpolants. Given the set of examples obtained from the previous stage, we consider that the measurements define a dimension space in which each measurement represents a separate axis. Each example is assigned a location in this space. The goal is to produce, at any point p in the dimension space, a new deformation X(p) of the template model derived through interpolation of the example shapes. When p equals the position p_i of a particular example model i, X(p) should equal X_i, the shape of example i. In between the examples, smooth, intuitive changes should take place.

The use of Gaussian radial basis functions (GRBFs) is a common approach to the scattered-data interpolation problem and can also be used here. Consider a mapping from a d-dimensional input space to a one-dimensional output space. The output s(x) of the mapping is a linear combination of the basis functions:

s(x) = Σ_{i=1}^{N} w_i · exp(-β ‖x - x_i‖²),    x ∈ R^d, w_i ∈ R,

where the N input vectors x_i constitute the example set, β is a width parameter that controls the smoothness of the interpolating function, and ‖·‖ denotes the Euclidean norm. The weights w_i can be determined by solving the associated linear system, as presented in [24,30,32]. One drawback of such analytic solutions is that they can produce large negative weights [2]. Another problem is that they do not scale well, since they use an analytic linear system solver of complexity O(n²) [8]. In addition, the linear system introduces erroneous solutions as the number of examples becomes larger. Although we have used only 100 examples here, it is not uncommon to have hundreds of examples in practical applications.


To avoid these drawbacks, we adopt an alternative: the RBF network. A frequent research aim is to set the network parameters through a suitable learning rule, and a rich set of algorithms has thus been devoted to the efficient calculation of the w_i and to basis-function optimization. An in-depth discussion and applications in the neural network field can be found in [4]. Recently, Esposito et al. [14] developed an incremental learning strategy that performs fast training. Instead of having a common width parameter β, each activation function is given its own width β_i, whose value is determined during training:

s(x) = Σ_{i=1}^{N} w_i · exp(-β_i ‖x - x_i‖²),    x ∈ R^d, w_i ∈ R.

The learning procedure is based on random search. For each newly added neuron, a population of β_i's is generated from a Gaussian probability distribution. For both types of our synthesizers, we found that generating 20 populations gave satisfactory results.

• Joint synthesizers: Since we are interested in R^8 → R^25 mappings, and the RBF network deals with R^8 → R interpolation, we construct 25 networks, one for each element of the joint vector J_N. It should be noted that the skeleton is essentially a hierarchical tree and that there are two ways of describing the transformation (scale, for instance) of a joint: relative or absolute. A relative transformation specifies the transformation of each segment relative to its parent, whereas with absolute transformations each segment is transformed independently relative to its origin. When relative transformations are used, the joint configurations exhibit larger variation. Since higher variation in the training set would mean noisier data for the RBF network, we have chosen absolute transformations to describe the joint parameters, each independent of its parent.

• Displacement synthesizers: Similarly to the joint synthesizers, 25 networks are constructed, one for each element of the displacement vector D_N.

5.1. Normalization

Radial basis functions have values determined solely by one parameter: the distance from a center point in the multi-dimensional space. In our application, the measurements define this parameter space, where each measurement represents a separate axis. Since each measurement has a different average and variation, and since measurements are often correlated with each other, it is problematic to define the space directly in measurement units. Instead, we normalize the parameter space using PCA.

5.2. Population synthesizer

Now that we have the individual synthesizer described above, the problem of automatically generating a population is reduced to that of generating the desired number of plausible parameter sets.


Fig. 5. Exaggeration of a body is obtained by making use of the example database and the average body. (a) Anti-exaggeration. (b) Original model. (c) Exaggeration.

When automatically generating a set of anthropometric measurements, one must consider the dependencies or correlations among different measurements, as in [10]. In our case, as described earlier, we have removed the dependencies and formulated a normalized parameter space defined by orthogonal bases. Thus, we can generate each parameter independently to produce automatically the desired number of parameter sets needed to generate new individuals. We adopted a Gaussian random number generator [27]. The variance amongst the population is adjustable by the user; by default it is set to the variance of the database.

5.3. Exaggeration

In our framework, exaggeration and anti-exaggeration can be implemented in four stages. (a) First, we measure the body geometry to be exaggerated to form a size vector in the parameter space. (b) The size vector is normalized and fed into the synthesizers, which output a body vector in the normalized body space. (c) The body vector is multiplied by a scalar factor (negative in the case of "anti-exaggeration") that is set by the level of exaggeration. (d) An exaggerated body model is created by combining the scaled body vector with the eigenskeleton and eigendisplacements, resulting in a skeleton configuration and a displacement map applied to the reference model. Fig. 5 shows an exaggeration example produced by our modeler.
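A minimal sketch of the two operations just described: drawing independent Gaussian parameters in the decorrelated space (Section 5.2), and scaling a normalized body vector for (anti-)exaggeration (Section 5.3). Function names and the per-axis statistics are hypothetical.

```python
import numpy as np

def sample_population(mean_params, std_params, count, variance_scale=1.0, seed=None):
    """Draw `count` plausible parameter sets; the axes are assumed to be
    already decorrelated by PCA, so each parameter is sampled independently."""
    rng = np.random.default_rng(seed)
    scale = np.sqrt(variance_scale) * np.asarray(std_params)
    return rng.normal(mean_params, scale, size=(count, len(mean_params)))

def exaggerate(body_coeffs, factor):
    """Scale a normalized body vector: factor > 1 exaggerates deviations
    from the average body, factor < 0 'anti-exaggerates'."""
    return factor * np.asarray(body_coeffs)
```

Because the normalized coefficients measure deviation from the database mean, scaling them moves the model away from (or past) the average body.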

6. Modifier synthesizer

In the previous sections we described methods for synthesizing a new individual or a population model from a number of anthropometric measurement parameters. One limitation of that modeling synthesizer is that a unique, identical model is produced for a given set of parameters. Often, we want to start with a particular individual and apply modifications according to certain attributes while keeping the identifiable characteristics of the physique. A typical example: how would a person look after losing weight? This challenging problem has recently been addressed in computer graphics, specifically in the domain of facial modeling. Blanz and Vetter [5] use example database models from scanners and a linear function that maps facial attributes (gender, weight, and expression) onto the 3D model. Kähler et al. [20] make use of anthropometric data to calculate landmark-based facial deformation according to changes in growth and age. Our modifier synthesizer is built upon regression models, using shape parameters as estimators and each component of the body vector as response variables. Our example set is, however, relatively small (n = 50) and skewed. We therefore start with sample calibration, as described in Section 6.1.

6.1. Sample calibration

We observed that tall-slim and short-overweight bodies are overrepresented in our database, exhibiting a high correlation between height and fat percentage (r = 2.6155, p < 0.0001). Directly using such unequally distributed data may result in false estimation by the modifier synthesizer. We use a sample calibration technique [12] to avoid such erroneous estimation. It improves the representativeness of the sample in terms of the size, distribution, and characteristics of the population by assigning a weight to each element of the sample. Here, we wish to determine the weights for each sample so that the linear function mapping height to fat percentage has slope 0. Consider a sample consisting of n elements. Associated with each element k are a target variable y_k and a vector x_k of p auxiliary variables.
Consider also that the x_k's are correlated with the y_k by the regression Y = XB, where (X)_{kj} = x_k^j denotes the jth variable of element k, Y denotes the vector of n target variables, and the coefficient vector B = [b_1, ..., b_p]^T is known for the population. The calibration method aims to compute a weight w_k for each element so that the sample distribution of the variables X agrees with the population distribution. The calibration problem can be formulated as follows:

Problem 1. Minimize the distance

Σ_{k=1}^{n} G(w_k)    (1)

subject to a calibration constraint defined by the weighted least-squares equation

X^T W_m X B = X^T W_m Y,    (2)

where W_m = diag(w_1, ..., w_n). The so-called distance function G measures the difference between the original weight values (a uniform weighting of 1.0 in our case) and the values w_k. The objective is to derive new weights that are as close as possible to the original weights. In our


case, the regression model is linear (p = 2); x_i^j and y_i are the fat percentage of element i raised to the power j − 1 and the height of element i, respectively. The coefficient vector B is [b_1, 0]^T. Substituting these values, Eq. (2) becomes

    L · W_v = 0,   with L_{jk} = x_k^j (b_1 − y_k)  (j = 1, ..., p; k = 1, ..., n)
    and W_v = [w_1, ..., w_n]^T.                                          (3)

The linear system in (3) is underdetermined, since n > p. The weights have to be computed so that Eq. (1) is minimized. To find such a solution, we substitute w_k = w'_k + 1 and rewrite (3) as

    L · W'_v = −[Σ_{k=1}^{n} x_k^1 (b_1 − y_k), ..., Σ_{k=1}^{n} x_k^p (b_1 − y_k)]^T,   (4)

where W'_v = [w'_1, ..., w'_n]^T. In addition, we add one more equation,

    Σ_{k=1}^{n} w'_k = 0,

to (4), since we want the mean of the weights w_k to be equal to 1. The distance function G used in our approach is the quadratic function G(w_k) = (1/2)(w_k − 1)^2 [12]. The underdetermined linear system is solved using the least norm solution, which involves the Cholesky decomposition [27].

6.2. Parameterized shape modification of individual models

As with the parameter-driven modeling synthesizer, we wish to obtain the modification of the body geometry as a function evaluation of the chosen parameter, in our case fat percentage. Clearly, we can make use of the examples to derive such a function. Unlike the modeling synthesizer, however, we would like the attributes that characterize a particular individual model to remain unchanged as much as possible during the modification, while other attributes change according to the control parameter. In general, it is difficult to identify what is invariant in an individual through changes due to various factors (sport, diet, aging, etc.). In addition, 3D body examples of one individual under various changes in his/her appearance are rare. (Although one can relatively easily obtain such examples when the aim is to build a pose parameter space, as used in [2].) Therefore, the problem of identifying (i) the geometric elements that characterize an individual and (ii) the controllable elements in the given examples needs to be solved. We approach the problem by first computing the linear regression model between a shape parameter and each element of the body vector. The formulation of the regression model, based on the weighted least square method, is:


Fig. 6. Shape variation with the regression line and the residual.

    B = (X^T W_m X)^{-1} (X^T W_m Y),

where W_m is the sampling weight matrix defined in Section 6.1, X holds the shape parameter values of every element in the sample, Y holds the ith component p_i of the body vector of every element in the sample, and B holds the coefficients of the regression function E_i(x). For a given shape parameter x, the regression function E_i(x) gives the average value p̂_i of the p_i's (see Fig. 6). The difference e_i = p_i − p̂_i is called the residual of the regression. We consider this residual value as the distinctiveness of the body, i.e., the deviation of the body vector component from its average value. Similarly to [20], we assume that a body component keeps its deviation from the statistical mean over the changes: a shoulder that is relatively large will remain large. By computing the regression functions E_i(x) for every p_i, it is possible to compute the average body for a given shape parameter. Therefore, for a given body vector for which we know the shape parameter, we can compute the residual values of all the components of the body vector. Given x_src, the shape parameter of the input body, and x_trg, the shape parameter for which we want to generate the body, the new value p'_i is given by:

    p'_i = E_i(x_trg) + (p_i − E_i(x_src)).

7. Results

We have used the proposed synthesizers to produce variously sized human body models. The system runs on a PC under Windows NT as a 3ds max [1] plug-in. Given a new arbitrary set of measurements, an input size vector is determined from the normalization. With this input, the joint parameters as well as the displacements to be applied to the reference model are evaluated by the joint and displacement synthesizers, respectively. Figs. 7 and 8 show these models with the corresponding input measurements listed in Tables 1 and 2. Notice that some of the measurements used go beyond the range of the parameter space captured from the examples. All our models are animatable with motion sequences, through the update of the vertex-to-bone weights initially assigned to the template model (see Appendix for the skin attachment recalculation). In Fig. 9, a captured, key-frame-based motion data sequence is used for the animation of our models.

7.1. Computational cost

In order to reduce the computation time as much as possible at runtime, the Gaussian radial basis function was implemented using a pre-calculated lookup table. The evaluation of 60 synthesizers (30 for the joints and 30 for the displacements) upon receiving the user input parameters takes less than one second on a 1.0 GHz

Fig. 7. Male models generated from our modeling synthesizer using the input measurements listed in Table 1.


Fig. 8. Female models generated from our modeling synthesizer using the input measurements listed in Table 2.

Table 1
Measurements (cm) used to generate male bodies shown in Fig. 7

Subject  Neck girth  Bust girth  Underbust  Waist girth  Hip girth  Height  Crotch length  Arm length
(a)      38          105         100        93           98         177     72             59
(b)      42          105         101        107          109        179     68             67
(c)      36          87          84         79           94         184     73             70
(d)      35          102         95         84           92         180     76             64
(e)      44          114         108        102          101        177     70             63
(f)      40          102         100        101          98         179     73             65
(g)      46          120         117        114          115        171     66             60
(h)      36          99          95         88           94         180     74             67
(i)      38          103         96         92           97         174     68             60
(j)      41          105         101        98           101        182     75             66
(k)      42          113         108        106          108        196     77             66
(l)      35          89          83         77           92         195     83             70
(m)      41          103         99         99           101        184     73             66
(n)      44          115         109        98           100        182     74             67


Table 2
Measurements (cm) used to generate female bodies shown in Fig. 8

Subject  Neck girth  Bust girth  Underbust  Waist girth  Hip girth  Height  Crotch length  Arm length
(a)      34          103         84         89           107        167     72             56
(b)      34          92          76         75           112        184     82             57
(c)      32          87          75         68           94         170     74             58
(d)      32          85          71         73           93         154     66             49
(e)      33          91          74         63           93         174     75             53
(f)      41          124         109        116          122        177     74             58
(g)      40          114         103        113          122        164     70             55
(h)      32          88          73         63           89         168     69             53
(i)      34          98          80         72           103        175     78             58
(j)      37          98          89         87           103        154     67             52
(k)      38          116         102        104          116        170     73             54
(l)      33          89          75         70           99         175     77             60
(m)      32          85          74         71           98         169     76             60
(n)      36          101         82         87           99         160     71             50
(o)      37          91          81         81           98         162     71             56
(p)      35          102         86         79           107        173     73             59

Fig. 9. Motion captured animation applied to our models.

Pentium 3. This includes the time for updating the vertex-to-bone weights initially assigned to the template model, so the synthesized models are immediately animatable with motion sequences.
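As a concrete sketch of this optimization, a Gaussian kernel can be tabulated once and then evaluated by index lookup. The kernel width, table size, and cutoff radius below are illustrative assumptions, not values from the paper:

```python
import numpy as np

# Hypothetical kernel width and table resolution (not taken from the paper).
SIGMA = 1.0
TABLE_SIZE = 1024
R_MAX = 4.0 * SIGMA  # beyond ~4 sigma the Gaussian is effectively zero

# Precompute exp(-r^2 / (2 sigma^2)) on a uniform grid of radii.
_r = np.linspace(0.0, R_MAX, TABLE_SIZE)
_table = np.exp(-_r**2 / (2.0 * SIGMA**2))

def gaussian_rbf(r):
    """Approximate the Gaussian kernel by nearest-entry table lookup."""
    idx = np.minimum((np.abs(r) / R_MAX * (TABLE_SIZE - 1)).astype(int),
                     TABLE_SIZE - 1)
    return _table[idx]

def evaluate(x, centers, weights):
    """RBF synthesizer output: a weighted sum of kernels centred on examples."""
    r = np.linalg.norm(centers - x, axis=1)  # distances to example centres
    return weights @ gaussian_rbf(r)
```

A finer table, or linear interpolation between entries, trades memory for accuracy; the nearest-entry lookup keeps the sketch minimal.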


Table 3
Cross validation results of the modeling synthesizer

                 Bust girth  Underbust girth  Waist girth  Hips girth  Neck girth
Corr. coeff.     0.973       0.984            0.989        0.982       0.974
Mean diff (cm)   0.654       0.585            0.692        0.618       0.567
Mean (cm)        92.765      82.970           67.776       93.056      32.091

7.2. Cross validation

Our modeling synthesizer not only produces visually pleasing body shapes but also faithfully reproduces models that are consistent with the input parameters. To quantify this performance, we report cross validation results of the modeling synthesizers using all examples in the database. In turn, each example was excluded from the example database when training the synthesizer and was then used as a test input for it. The output models were measured and compared with the input measurements used to synthesize them. For the five girth-related measurements, the average correlation coefficient was 0.980 and the average mean difference 0.623 cm. The correlation coefficient and the mean difference for each measurement are given in Table 3.

7.3. Modifier synthesizer

Tables 4 and 5 summarize the regression models used in our modifier synthesizer. While there is a good deal of consistency, our regression models on the body vector should be distinguished from those for anthropometric body measurements. The regression model of 'scale_x' of the bone, for instance, shows that the physical fat of subjects is partly captured by, and interpreted as, the linearity in the model. Fig. 10 shows the result of the modifier synthesizer. It is clear that the bodies remain identifiable during the modification.
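The pipeline of Sections 6.1 and 6.2 (calibration weights from a least-norm solve, a weighted regression per body-vector component, and the residual-preserving update) can be sketched end to end. All data below is synthetic and the dimensions are illustrative; this is a sketch of the method, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
fat = rng.uniform(10.0, 35.0, n)                  # shape parameter (fat %)
height = 190.0 - 0.8 * fat + rng.normal(0, 2, n)  # synthetic correlated heights

# --- Section 6.1: calibration weights (quadratic distance, least-norm solve).
# Rows of L: x_k^j * (b1 - y_k) for the linear model (p = 2), B = [b1, 0]^T.
b1 = height.mean()
L = np.vstack([fat**j * (b1 - height) for j in range(2)])
rhs = -L.sum(axis=1)              # from substituting w_k = w'_k + 1 into L w = 0
A = np.vstack([L, np.ones(n)])    # extra row: sum of w'_k = 0 (mean weight 1)
b = np.append(rhs, 0.0)
w_prime, *_ = np.linalg.lstsq(A, b, rcond=None)   # minimum-norm solution
w = 1.0 + w_prime

# --- Section 6.2: weighted regression of one body-vector component p_i,
# then the residual-preserving update p' = E(x_trg) + (p - E(x_src)).
comp = 0.4 * fat + 12.0 + rng.normal(0, 0.5, n)   # synthetic component values
X = np.column_stack([np.ones(n), fat])
W = np.diag(w)
B = np.linalg.solve(X.T @ W @ X, X.T @ W @ comp)

def E(x):
    """Regression function E_i(x) for this component."""
    return B[0] + B[1] * x

p_new = E(25.0) + (comp[0] - E(fat[0]))  # modify subject 0 towards fat = 25
```

With the calibrated weights, the weighted fit of height against fat percentage has slope 0, which is exactly the calibration goal stated in Section 6.1; the residual of subject 0 is carried over unchanged to the target parameter.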

8. Conclusions and future work

We have described a new framework to generate and modify whole-body models according to various control parameters. Our contributions include a new compact and efficient representation of the body geometry and a scalable database, both based on an evolutionary RBF network. Our approach is inspired by the fact that size and shape are often correlated. Based on the example database models, our synthesizers allow a reasonable estimation of the shape and size of the whole body even when only a little information is available from the given parameters. Although the scanned database used here is sufficiently large to derive some statistical results (most authors recommend at least 10 to 20 times as many observations as variables in order to have stable estimates


Table 4
Regression models of fat percentage on the skeleton component

Explanatory variable  Regression coefficient  Standard error  F-ratio    p-value
PC2 of scaleX         −0.02849                0.003272        75.82538   7.23e-12
PC5 of scaleX         0.003328                0.001485        5.024998   0.029115
PC7 of scaleX         −0.00261                0.001122        5.420798   0.023673
PC9 of scaleX         −0.00184                0.000738        6.186507   0.015994
PC1 of scaleY         0.043125                0.004694        84.39235   1.26e-12
PC1 of scaleZ         −0.00975                0.003704        6.926509   0.011048
PC2 of transX         0.025454                0.012929        3.876025   0.054122
PC3 of transX         0.02357                 0.01203         3.838827   0.05525
PC6 of transX         0.022441                0.007838        8.196972   0.005961
PC3 of transY         −0.06049                0.018584        10.59697   0.001958
PC5 of transZ         0.07172                 0.025202        8.098249   0.00625
PC8 of transZ         0.03009                 0.014531        4.287934   0.043178

Table 5
Regression models of fat percentage on the displacement component

Explanatory variable   Regression coefficient  Standard error  F-ratio    p-value
PC1 of displacementX   −2.9363                 0.039295        8.621829   0.004872
PC5 of displacementX   −0.06207                0.029464        4.438548   0.039797
PC6 of displacementX   0.121447                0.024977        23.64189   1.04e-05
PC1 of displacementY   0.214258                0.04131         26.90133   3.3e-06
PC4 of displacementY   −0.07232                0.03268         4.897688   0.031139
PC5 of displacementY   0.107049                0.033195        10.39955   0.002141
PC2 of displacementZ   −0.08119                0.028622        8.046944   0.006405
PC3 of displacementZ   −0.06428                0.025047        6.58599    0.013085
PC10 of displacementZ  −0.064                  0.012807        24.97014   6.48e-06

of the regression line), it can become much larger as more scan data becomes available. As the size of the database grows, the ability to rapidly construct the synthesizers will become more and more critical. The RBF network presented here should scale well to larger databases, because it solves for the weights and variances of its nodes in an iterative manner instead of solving the whole system at once. Currently, we rely on the initial skinning and the attachment recalculation on nonzero displacements for the skin deformation during joint-driven animation. As a result, pose-dependent shape changes are predicted only by the initial skinning. To further enhance the deformation, we are investigating a model that can control both the dynamic (pose-dependent) and the static (appearance) shape deformation. Body vectors and their distribution also depend on the initial skinning. When there is a strong weight on the upper arm, a small scale value is sufficient to produce a large arm, while a large scale is required to have the same effect when the attachment is loose. An automatic refinement of the skin attachment to obtain an


Fig. 10. Modification of two individuals controlled by fat percent, hip-waist ratio, and height.

optimal skinning could be sought, so that, for instance, the variation of the body vectors over all examples is minimized. Although in this work we have experimented mainly with size and shape parameters, there are other types of criteria that we believe are worth exploring, such as sports practice, aging, and ethnicity. Combining these parameters with the sizing parameters is certainly one possibility for extending our system.

Acknowledgments

The skeleton-driven deformation module by Frederic Cordier was used in both the preprocessing and the runtime application. Marlene Arevalo-Poizat and Nedjma Cadi helped us with the reference models. This research is supported by the Swiss National Research Foundation (FNRS).

Appendix. Skin attachment recalculation

Once the body shape has been modified through the displacements, the skin attachment data needs to be adapted accordingly so that the model retains its smooth skin deformation capability. Generally, the deformed vertex location p is computed as


    p = Σ_{i=1}^{n} w_i M_i D_i^{-1} p_d,

where M_i and w_i are the transformation matrix and influence weight of the ith influencing bone, D_i is the transformation matrix of the ith influencing bone at the time of skin attachment (in most cases the D_i's are chosen at the so-called dress-pose, with open arms and moderately open legs), and p_d is the location of p at the dress-pose, described in the global coordinate system. Recomputing the skin attachment data involves updating the location p_d for each of its influencing bones. Note that the model has to be put back into the dress-pose when the recalculation takes place.
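The blending formula above can be written out directly for homogeneous 4×4 matrices. The bones, weights, and poses below are made up for illustration:

```python
import numpy as np

def skin_vertex(p_dress, bones):
    """Linear blend skinning: p = sum_i w_i * M_i * D_i^{-1} * p_dress.

    bones is a list of (w_i, M_i, D_i): influence weight, current bone
    transform, and bone transform at the dress pose. All matrices are 4x4
    and p_dress is a homogeneous 4-vector.
    """
    p = np.zeros(4)
    for w, M, D in bones:
        p += w * (M @ np.linalg.inv(D) @ p_dress)
    return p

def translation(t):
    """Homogeneous 4x4 translation matrix (helper for the example)."""
    T = np.eye(4)
    T[:3, 3] = t
    return T

# Illustrative check: if every bone is still at its dress pose (M_i = D_i)
# and the weights sum to 1, the vertex must stay exactly where it was.
D1, D2 = translation([1, 0, 0]), translation([0, 2, 0])
pd = np.array([0.5, 0.5, 0.0, 1.0])
p = skin_vertex(pd, [(0.3, D1, D1), (0.7, D2, D2)])
```

Because M_i D_i^{-1} is the identity whenever a bone remains at its dress pose, the dress-pose configuration reproduces the vertex exactly, which is why the recalculation must be done with the model put back into the dress-pose.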

References

[1] 3ds max. Available from http://www.discreet.com/products/3dsmax/.
[2] B. Allen, B. Curless, Z. Popovic, Articulated body deformation from range scan data, in: Proceedings SIGGRAPH 2002, Addison-Wesley, 2002, pp. 612–619.
[3] F. Azuola, N.I. Badler, P. Ho, I. Kakadiaris, D. Metaxas, B. Ting, Building anthropometry-based virtual human models, in: Proc. IMAGE VII Conf., Tucson, AZ, June 1994.
[4] C.M. Bishop, Chapter 5: Radial basis functions, in: Neural Networks for Pattern Recognition, Oxford University Press, 1995.
[5] V. Blanz, T. Vetter, A morphable model for the synthesis of 3D faces, in: Computer Graphics Proceedings SIGGRAPH '99, Addison-Wesley, 1999, pp. 187–194.
[6] Bones Pro 2, Digimation. Available from http://www.digimation.com.
[7] CAESAR project. Available from http://www.sae.org/technicalcommittees/caesarhome.htm.
[8] J.C. Carr, R.K. Beatson, J.B. Cherrie, T.J. Mitchell, W.R. Fright, B.C. McCallum, T.R. Evans, Reconstruction and representation of 3D objects with radial basis functions, in: Computer Graphics Proceedings SIGGRAPH '01, Addison-Wesley, 2001, pp. 67–76.
[9] J.E.L. Carter, B.H. Heath, Somatotyping—Development and Applications, Cambridge University Press, Cambridge, 1990.
[10] D. DeCarlo, D. Metaxas, M. Stone, An anthropometric face model using variational techniques, in: Proceedings SIGGRAPH '98, Addison-Wesley, 1998, pp. 67–74.
[11] L. Dekker, 3D Human Body Modeling from Range Data, PhD thesis, University College London, 2000.
[12] J.C. Deville, C.E. Särndal, Calibration estimators in survey sampling, Journal of the American Statistical Association 87 (1992) 376–382.
[13] M. Dooley, Anthropometric modeling programs—a survey, IEEE Computer Graphics and Applications 2 (9) (1982) 17–25.
[14] A. Esposito, M. Marinaro, D. Oricchio, S. Scarpetta, Approximation of continuous and discontinuous mappings by a growing neural RBF-based algorithm, Neural Networks 13 (2000) 651–665.
[15] M. Grosso, R. Quach, E. Otani, J. Zhao, S. Wei, P.-H. Ho, J. Lu, N.I. Badler, Anthropometry for Computer Graphics Human Figures, Technical Report MS-CIS-89-71, Department of Computer and Information Science, University of Pennsylvania, 1989.
[16] H-Anim specification. Available from http://ece.uwaterloo.ca/H-ANIM/spec1.1.
[17] A. Hilton, D. Beresford, T. Gentils, R. Smith, W. Sun, Virtual people: capturing human models to populate virtual worlds, in: Proc. Computer Animation, IEEE Press, 1999, pp. 174–185.
[18] J.A. Hodgdon, K. Friedl, Development of the DoD Body Composition Estimation Equations, Technical Document No. 99-2B, Naval Health Research Center, 1999.
[19] Hohenstein, Uniform Body Representation, Workpackage Input, E-Tailor project, IST-1999-10549.
[20] K. Kähler, J. Haber, H. Yamauchi, H.-P. Seidel, Head shop: generating animated head models with anatomical structure, in: Proc. ACM SIGGRAPH Symposium on Computer Animation, 2002, pp. 55–64.
[21] I.A. Kakadiaris, D. Metaxas, 3D human body model acquisition from multiple views, in: Proceedings of the Fifth International Conference on Computer Vision, Boston, MA, June 20–23, 1995, pp. 618–623.
[22] P.G. Kry, D.L. James, D.K. Pai, EigenSkin: real time large deformation character skinning in graphics hardware, in: ACM SIGGRAPH Symposium on Computer Animation, 2002, pp. 153–159.
[23] W. Lee, J. Gu, N. Magnenat-Thalmann, Generating animatable 3D virtual humans from photographs, Computer Graphics Forum 19 (3) (Proc. Eurographics 2000, Interlaken, Switzerland, August 2000) 1–10.
[24] J.P. Lewis, M. Cordner, N. Fong, Pose space deformations: a unified approach to shape interpolation and skeleton-driven deformation, in: Proceedings SIGGRAPH '00, 2000, pp. 165–172.
[25] Man-Systems Integration Standards, NASA-STD-3000, vol. I, Revision B, July 1995.
[26] M.J.D. Powell, Radial basis functions for multivariate interpolation: a review, in: J.M. Mason, M.G. Cox (Eds.), Algorithms for Approximation, Oxford University Press, Oxford, 1987, pp. 143–167.
[27] W.H. Press, B.P. Flannery, S.A. Teukolsky, W.T. Vetterling, Numerical Recipes in C: The Art of Scientific Computing, Cambridge University Press, Cambridge, 1988.
[28] C. Rose, M. Cohen, B. Bodenheimer, Verbs and adverbs: multidimensional motion interpolation using RBF, IEEE Computer Graphics and Applications 18 (5) (1998) 32–40.
[29] F. Scheepers, R.E. Parent, W.E. Carlson, S.F. May, Anatomy-based modeling of the human musculature, in: Proceedings SIGGRAPH '97, 1997, pp. 163–172.
[30] H. Seo, N. Magnenat-Thalmann, An automatic modelling of human bodies from sizing parameters, in: Proceedings SIGGRAPH Symposium on Interactive 3D Graphics, 2003, pp. 19–26, 234.
[31] J. Shen, D. Thalmann, Interactive shape design using metaballs and splines, in: Proc. Implicit Surfaces '95, Grenoble, France, 1995, pp. 187–196.
[32] P.P. Sloan, C. Rose, M. Cohen, Shape by example, in: Symposium on Interactive 3D Graphics, March 2001.
[33] Tecmath AG. Available from http://www.tecmath.com. Scanner operated by courtesy of ATC (Athens Technology Center S.A.).
[34] J. Wilhelms, A. Van Gelder, Anatomically based modeling, in: Proceedings SIGGRAPH '97, 1997, pp. 173–180.