Fitting Optimal Curves to Time-Indexed, Noisy Observations of Stochastic Processes on Nonlinear Manifolds

J. Su†, I. L. Dryden∗, E. Klassen‡, H. Le§ and A. Srivastava†

† Department of Statistics, Florida State University, Tallahassee, FL 32306
‡ Department of Mathematics, Florida State University, Tallahassee, FL 32306
§ School of Mathematical Sciences, University of Nottingham, University Park, Nottingham NG7 2RD, U.K.
∗ Department of Statistics, University of South Carolina, Columbia, SC 29208

Abstract

We address the problem of estimating optimal curves for interpolation, smoothing, and prediction of values along partially-observed stochastic processes. In particular, we focus on processes that evolve on certain nonlinear manifolds of importance in computer vision applications. The observations are given as a set of time-indexed points on a manifold, denoting noisy observations of the process at those times. The fitted curves amount to geometrically meaningful and efficiently computable splines on these manifolds. We adopt the framework of Samir et al. (2010), which develops a Palais-metric-based steepest-descent algorithm applied to the weighted sum of a fitting-related and a regularity-related cost function. Using the rotation group, the space of positive-definite matrices, and Kendall's shape space as three representative manifolds, we develop the proposed algorithm for curve fitting. This algorithm requires expressions for exponential maps, inverse exponential maps, parallel transport of tangents, and curvature tensors on the chosen manifolds. These ideas are illustrated using a large number of experimental results on both simulated and real data.

Keywords: motion tracking, splines on manifolds, energy minimization, interpolation, steepest-descent, DTI fiber tracking.

1. Introduction

This paper studies the interaction of two disparate but inherent aspects of computer vision systems: the temporal evolution of a dynamic scene and the nonlinear representations of features of interest. Specifically, we are interested in dynamical systems where a process of interest evolves over time on a nonlinear manifold and one observes this process only at limited times. The goal is to estimate/predict the remaining process using the observed values under some predetermined criterion.

The motivation for such a problem comes from many applications. Consider the evolution of the shape of a human silhouette in a video, in a situation where one has an unobstructed view of the person in only a few frames. Given these observed shapes, along with their observation times, one would like to estimate shapes at some intermediate times and perhaps even predict future shape evolution. Additionally, if the observed shapes are considered noisy (for example, due to the process of extracting shapes from image frames), one would like to smooth the observed shapes using their temporal structure. A similar problem arises in tracking the rigid motion of an object using video data, e.g. in tracking a human speaker using webcam video. Focusing on the rotation component, one observes the object's orientation at certain times and seeks to estimate and smooth the rotation process over the whole observation interval.

In addition to the focus on the estimation/smoothing of processes, this paper considers problems in which the processes of interest evolve on nonlinear manifolds. Many problems in computer vision and image understanding are naturally posed as optimization or statistical inference problems on nonlinear manifolds. This is because intrinsic constraints on the pertinent features restrict the corresponding representations to these manifolds.
While the nonlinearities of different manifolds can be very diverse, certain types of manifolds are encountered more frequently than others in computer vision applications. These are: (1) unit spheres, (2) matrix Lie groups, (3) quotient spaces of spheres, and (4) quotient spaces of matrix Lie groups. We provide some examples of their occurrences in computer vision problems.

Preprint submitted to Elsevier, March 2, 2011

1. Unit Spheres: A simple unit-norm constraint makes the representation space a unit sphere: when a vector of features is scaled to have norm one, the set of all such vectors is naturally a unit sphere. For example, in the study of directional data (Mardia and Jupp (2000)), where one is interested in analyzing the directions of moving objects or imaging viewpoints, the space of all directions is S^2. In the landmark-based shape analysis of objects (Dryden and Mardia (1998)), where 2D objects are represented by configurations of salient points or landmarks, the set of all such configurations, after removing translation and scale, is a real sphere S^{2n−3} (for configurations with n landmarks). Similarly, in an elastic shape analysis of curves in Euclidean spaces (Srivastava et al. (2010)), the space of all unit-length curves is a unit Hilbert sphere S^∞. A number of applications involve functions that warp domains of interest to themselves in a diffeomorphic way. For example, in the problem of activity recognition, the sequence of events that form an activity may be performed in the same order but at different rates (Veeraraghavan et al. (2009)). A closely related problem is that of analyzing probability density functions, because the derivatives of warping functions in one dimension are probability densities (or histograms, which are frequently used as features in texture analysis). It was illustrated in Srivastava et al. (2007), among others, that a natural way to study one-dimensional warping functions, probability density functions, and histograms is to use a square-root representation under the Fisher-Rao metric. The resulting set of such functions forms a Hilbert sphere S^∞.

2. Matrix Lie Groups: The transformation groups represented by GL(n), the set of n × n invertible matrices, and its subgroups are important in many situations involving the actions of these groups on objects in Euclidean spaces.
For example, consider a transformation of the type x ↦ Ax + b, where b ∈ R^n: (i) if A ∈ GL(n), it is called an affine transformation; (ii) if A ∈ SL(n), the subgroup of GL(n) consisting of matrices with determinant +1, it is a volume-preserving transformation; and (iii) if A ∈ SO(n), the set of n × n orthogonal matrices with determinant +1, it is a rigid (rotation plus translation) transformation. In the problem of tracking and recognizing objects in video data, their poses relative to the camera are important. The pose of a rigid object is conveniently denoted as an element of the special orthogonal group SO(n), where n = 2 for planar objects and n = 3 for 3D objects (Grenander et al. (1998); Moakher (2002)). Rotation matrices are preferred over angles (e.g. pitch, yaw, and roll for 3D pose) for representing pose since they are unique for every pose.

3. Quotient Spaces of Spheres: Some problems study the quotient spaces of spheres rather than the spheres themselves. For example, the landmark shape space of 2D configurations, after removing the rotation variability, becomes the complex projective space CP^{n−2} = S^{2n−3}/S^1 (Kendall (1984); Kendall et al. (1999); Dryden and Mardia (1998)). In an elastic shape analysis of curves, the space of all unit-length curves is a unit Hilbert sphere S^∞, and after removing the rotation and re-parameterization variability one obtains a quotient space S^∞/(SO(2) × Γ), where Γ is the re-parameterization group (Srivastava et al. (2010)).

4. Quotient Spaces of Matrix Lie Groups: Perhaps the most prominent examples of this category are the Grassmann and Stiefel manifolds, which are useful in orthogonal linear transformations, including dimension reduction. Depending on the chosen criterion, one can generalize the problem of dimension reduction to include Fisher's linear discriminant analysis (Belhumeur et al. (1997)) and optimal component analysis (Liu et al. (2004)).
Such problems are solved on Stiefel and/or Grassmann manifolds which, in turn, are quotient spaces of SO(n). In a number of applications, there has been great interest in the statistical analysis of symmetric, positive-definite (SPD) matrices, which can be viewed as a quotient space of GL(n). For example, the basic unit in Diffusion Tensor Magnetic Resonance Imaging (DT-MRI) is a 3 × 3 SPD matrix that represents the diffusion tensor at each point of a volume. The goal in this problem is to estimate, interpolate, and smooth a field of diffusion tensors, and these tasks are performed using the Riemannian geometry of the underlying tensor space (Pennec et al. (2006); Moakher (2005); Schwartzman (2006); Fletcher and Joshi (2007)). Another application of this Riemannian framework is the analysis of Gaussian probability densities. These densities are completely characterized by their means and covariances; assuming the means are zero, the densities are compared through their covariance matrices, which are by definition SPD (Dryden et al. (2009)).

An important issue in the problem of fitting smooth paths is: what should be the criterion for this estimation? A simple solution is to connect the given points with shortest (geodesic) paths on the manifold, say M; in other words, interpolate linearly between the given observations using piecewise geodesics. If M is a Riemannian manifold, its Riemannian metric defines, and allows us to compute, geodesic paths between arbitrary pairs of points. This estimate is illustrated using a solid line in Fig. 1. As depicted, this estimate has discontinuities in the first derivative at the

Figure 1: Estimation of unobserved process using: (1) piecewise geodesics (solid line) and (2) second-order smooth curve (broken line).

observation times, which does not often match the actual underlying process. If the given points are considered noisy, one does not require the estimated path to pass strictly through them, i.e. the given points are treated as soft constraints rather than hard constraints. This argument leads to an optimization problem of the following type. Let {(t1, p1), (t2, p2), ..., (tn, pn)} denote the given observations, where ti ∈ [0, 1] and pi ∈ M, and we are interested in a path γ̂ such that

γ̂ = argmin_{γ:[0,1]→M} [ (λ1/2) Σ_{i=1}^n d(γ(ti), pi)² + (λ2/2) R(γ) ],    (1)

where R is a smoothness penalty on γ, d denotes the (Riemannian) geodesic distance on M, and λ1, λ2 are the weights attached to the two terms. (Instead of selecting two weights, one can equivalently use their ratio λ = λ2/λ1; we retain the two separate weights for computational convenience.) The next issue is to select the smoothness penalty. There are two main ideas in this regard:

• Use a penalty that relates to the first derivative of γ:

R1(γ) = ∫_0^1 ⟨γ̇(t), γ̇(t)⟩ dt.

Since γ̇(t) ∈ T_{γ(t)}(M), the inner product inside the integral is defined using the Riemannian metric. It can be shown that the solutions of Eqn. 1 under this smoothness penalty are piecewise geodesics, although they may not pass through the given points. This choice suffers from the same problem as the piecewise-geodesic interpolation between the pi's, i.e. the discontinuity of γ̇(t).

• Another idea is to penalize the second derivative of γ according to:

R2(γ) = ∫_0^1 ⟨D²γ/dt², D²γ/dt²⟩ dt.

Here D²γ/dt² = (D/dt)γ̇(t) denotes the covariant derivative of the tangent vector field γ̇(t) and, once again, the inner product inside the integral is given by the Riemannian metric. Under this choice of R, the solution of Eqn. 1 will be smooth up to second order and will be more natural than the previous choice. Such a solution is depicted using the broken line in Fig. 1.
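The contrast between the two penalties can be checked numerically. The following sketch (our own illustration, not from the paper) works in the Euclidean case M = R², where the covariant derivative reduces to the ordinary derivative: a piecewise-linear path is piecewise-geodesic, so R1 stays bounded, but the velocity jump at the knot makes the discretized R2 blow up as the grid is refined.

```python
import numpy as np

# Euclidean illustration: a piecewise-linear ("tent") path through
# (0,0) -> (0.5,1) -> (1,0), sampled on a uniform grid.
T = 100
delta = 1.0 / T
t = np.linspace(0.0, 1.0, T + 1)
gamma = np.stack([t, 1.0 - 2.0 * np.abs(t - 0.5)], axis=1)

vel = np.diff(gamma, axis=0) / delta   # finite-difference velocity
acc = np.diff(vel, axis=0) / delta     # finite-difference acceleration

R1 = np.sum(vel ** 2) * delta          # discretized first-order penalty
R2 = np.sum(acc ** 2) * delta          # discretized second-order penalty

# R1 is about 5; R2 is dominated by a single O(1/delta) spike at the knot t = 0.5.
print(R1, R2)
```

Refining the grid (larger T) leaves R1 essentially unchanged while R2 grows without bound, which is exactly why the first-order penalty cannot detect the kink that the second-order penalty heavily punishes.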

Choosing the second criterion for the smoothness penalty, the formal goal of this paper can be stated as follows.

Problem Statement: Let M be a finite-dimensional Riemannian manifold. Given a set of observation times t1, t2, ..., tn in [0, 1] and the measurements p1, p2, ..., pn ∈ M at those times, our goal is to estimate a path γ : [0, 1] → M that solves the optimization problem given in Eqn. 1 using R2 as the smoothness penalty.

We summarize some important past work in this area next.
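As a minimal illustration of the objective in Eqn. 1, the following sketch evaluates a discretized version of the energy in the Euclidean case M = R^d, where the geodesic distance is the usual norm and the covariant second derivative is an ordinary second difference. The function name `energy` and its conventions are ours, for illustration only.

```python
import numpy as np

def energy(gamma, times, points, lam1=1.0, lam2=1.0):
    """Discretized Eqn. 1 with the R2 penalty, Euclidean case (a sketch;
    on a general manifold the norm and second difference become geodesic
    distance and a covariant second difference)."""
    T = gamma.shape[0] - 1
    delta = 1.0 / T
    # Data term: squared distances from the path to the observations (t_i, p_i).
    idx = np.rint(np.asarray(times) * T).astype(int)
    E_d = 0.5 * lam1 * sum(np.sum((gamma[i] - p) ** 2) for i, p in zip(idx, points))
    # Smoothing term: integrated squared norm of the second derivative.
    acc = np.diff(gamma, n=2, axis=0) / delta ** 2
    E_s = 0.5 * lam2 * np.sum(acc ** 2) * delta
    return E_d + E_s

# A straight line through two of its own observations has (numerically) zero energy.
path = np.stack([np.linspace(0, 1, 101), np.linspace(0, 2, 101)], axis=1)
obs_t = [0.0, 1.0]
obs_p = [np.array([0.0, 0.0]), np.array([1.0, 2.0])]
print(energy(path, obs_t, obs_p))
```

Increasing lam1 relative to lam2 pulls the minimizer toward the observations, mirroring the soft-constraint interpretation above.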

1.1. Past Work

Previous work on splines on manifolds can be divided into a few broad categories. The first category of papers adapts algorithms for finding smooth paths in Euclidean spaces to nonlinear manifolds, but does not explicitly provide an optimality statement for these paths on the manifolds: while it is possible to establish the optimality of these solutions in Euclidean spaces, the corresponding results on nonlinear manifolds are not provided. Examples of these papers include Crouch et al. (1999); Altafini (2000); Hofer and Pottmann (2004); Gu et al. (2005); Jakubiak et al. (2006); Popiel and Noakes (2007). For instance, Crouch et al. (1999); Altafini (2000); Popiel and Noakes (2007) generalized the de Casteljau algorithm for finding Bézier curves to manifolds. Hofer and Pottmann (2004) extended the familiar cubic spline curves and splines in tension to parametric surfaces, level sets, triangle meshes, and point samples of surfaces. Jakubiak et al. (2006) presented a geometric two-step algorithm to generate splines of an arbitrary degree of smoothness in Euclidean spaces, then extended the algorithm to matrix Lie groups and applied it to generate smooth motions of 3D objects. Gu et al. (2005) studied the affine structure of domain manifolds in depth and proved that the existence of manifold splines is equivalent to the existence of an affine atlas of the manifold; a set of practical algorithms was developed to generalize triangular B-spline surfaces from planar domains to manifold domains. A related approach to interpolation on manifolds consists of mapping the data points onto a vector space, usually the tangent space at a particular point of the manifold, computing an interpolating curve in that space, and finally mapping the resulting curve back to the manifold. The mapping can be defined, e.g., by a rolling procedure; see Jupp and Kent (1987); Morris et al. (1999); Kent et al. (2001); Hüper and Leite (2007); Kume et al. (2007); Kenobi et al. (2010).
For instance, Jupp and Kent (1987) proposed a method that can be described as "unwrapping" the data onto the plane, where standard curve fitting techniques can then be applied. The papers Morris et al. (1999); Kenobi et al. (2010); Kent et al. (2001) studied the problem of fitting a smooth curve in the planar shape space.

The second category of papers studies optimality criteria of the type given in Eqn. 1 and uses a calculus-of-variations approach. Such papers typically derive differential equations that the solution must satisfy, but solving these fourth- and higher-order differential equations is non-trivial except in some simple cases, for example the planar shape space (Kume et al. (2007)). In other words, such approaches may lead to a characterization of the solution that is not constructive or algorithmic. The generalization of cubic splines to more general Riemannian manifolds was pioneered by Noakes et al. (1989). This variational approach was followed by other authors, including Crouch and Leite (1991, 1995). Splines of class C^k were generalized to Riemannian manifolds by Camarinha et al. (1995). This approach was also followed by Machado et al. (2006) using the first-order smoothing term (R1) and by Machado and Leite (2006) for the second-order smoothing term (R2).

The next category of papers takes a filtering or smoothing approach to path fitting at discrete times, using explicit models for the unknown process and the observation process. While such problems in Euclidean spaces are solved using Kalman filters, extended Kalman filters, and particle filters, the corresponding problems on manifolds are solved mostly using particle filtering. For example, Solo (2009) considered state estimation for partially observed processes evolving on a Riemannian manifold and obtained some new results on un-normalized nonlinear filters.
Srivastava and Klassen (2004) studied the problem of estimating trajectories on Grassmann manifolds using a particle filtering approach. Tompkins and Wolfe (2007) considered a similar problem on a Stiefel manifold. Rathi et al. (2007) formulated a particle filtering algorithm in the geometric active contour framework that can be used for tracking moving and deforming objects. Vaswani et al. (2005); Das and Vaswani (2010); Vaswani et al. (2010) developed solutions to the problem of tracking landmark-based shapes in video data. These solutions assume that one has an observation, albeit nonlinear and noisy, at each discrete time of interest in the interval, and in this sense they differ from the current problem, where only a few measurements are given.

In this paper we adapt the approach presented in Samir et al. (2010), which develops a gradient-based method for solving the optimization problem in Eqn. 1 with the smoothness penalty given by R2. That paper considers a general, finite-dimensional Riemannian manifold M and derives the gradient of the cost function on M using a second-order Palais metric. Let γ be a twice-differentiable path on M and let v, w be two smooth vector fields along γ; that is, for any t ∈ [0, 1], v(t), w(t) ∈ T_{γ(t)}(M). Then, the second-order Palais metric (Palais (1963)) is given by:

⟨⟨v, w⟩⟩_{2,γ} = ⟨v(0), w(0)⟩_{γ(0)} + ⟨(Dv/dt)(0), (Dw/dt)(0)⟩_{γ(0)} + ∫_0^1 ⟨D²v/dt², D²w/dt²⟩_{γ(t)} dt.    (2)

The objective function is a functional on an appropriate space of twice-differentiable paths on M, and one needs a

metric on this space to express the gradient as a vector field on the current path. Although there are several choices of metrics, including the standard L² metric, the use of the second-order Palais metric greatly simplifies the expression for the gradient of the second term in Eqn. 1, albeit at the cost of a slight increase in the complexity of the gradient of the first term. Samir et al. (2010) provides an expression for the gradient vector on a general manifold in terms of its differential geometry, and demonstrates these ideas using examples from two simple manifolds: R² and S².

1.2. Contributions of This Paper

We take the framework developed in Samir et al. (2010) and apply the methods to some general manifolds relevant to problems in computer vision. In particular, we take the manifolds that are most frequently used in computer vision and solve the problem stated in Eqn. 1 on those manifolds. The specific manifolds studied here are: (i) SO(3); (ii) P(n), the set of symmetric positive-definite matrices with determinant one; and (iii) Kendall's shape space. This requires utilizing the differential geometry of these manifolds to derive expressions for exponential maps, inverse exponential maps, parallel transport of tangents, and curvature tensors.

2. Mathematical Framework

In this section we summarize the main result presented in Samir et al. (2010) for fitting smooth curves through time-indexed points on Riemannian manifolds. Let M be a Riemannian manifold and γ : [0, 1] → M be an appropriately differentiable path on M. Our goal is to find a path that minimizes the energy function:

E(γ) = (λ1/2) Σ_{i=1}^n d²(γ(ti), pi) + (λ2/2) ∫_0^1 ⟨D²γ/dt², D²γ/dt²⟩ dt.    (3)

The first term is referred to as the data term, Ed, and the second as the smoothing term, Es. We will use a steepest-descent algorithm for minimizing E, where the steepest-descent direction is defined with respect to the second-order Palais metric. According to Samir et al.
(2010), the gradients of these two terms are given as follows:

1. Data Term: The gradient of the function Ed : γ ↦ (1/2) Σ_{i=1}^n d²(pi, γ(ti)) at γ, with respect to the second-order Palais metric (Eqn. 2), is given by the vector field G = Σ_{i=1}^n gi(t) along γ, where

gi(t) = (1 + ti·t + (1/2)ti·t² − (1/6)t³) ṽi(t),   0 ≤ t ≤ ti,
gi(t) = (1 + t·ti + (1/2)t·ti² − (1/6)ti³) ṽi(t),   ti ≤ t ≤ 1,    (4)

where ṽi is the parallel translation of vi along γ and vi = −exp⁻¹_{γ(ti)}(pi).

2. Smoothing Term: The gradient vector field of the function Es along γ, with respect to the second-order Palais metric, is given by H2(t) + H3(t), where:

• H2 is given by

H2(t) = Ĥ2(t) − (1/6)t³ S̃(t) − (1/2)t² (Q̃(t) − S̃(t)) − t (Q̃(t) − S̃(t)) + S̃(t),    (5)

• Ĥ2 is given by

(D⁴Ĥ2/dt⁴)(t) = R((D²γ/dt²)(t), γ̇(t))(γ̇(t)),

with the initial conditions Ĥ2(0) = (DĤ2/dt)(0) = (D²Ĥ2/dt²)(0) = (D³Ĥ2/dt³)(0) = 0. Here Q̃ and S̃ are the parallel transports along γ of Q = (D²Ĥ2/dt²)(1) and of S = (D³Ĥ2/dt³)(1).

• H3 is given by D²H3/dt² = D²γ/dt² with the initial conditions H3(0) = (DH3/dt)(0) = 0.

• The mapping R(·, ·)(·) denotes the Riemannian curvature tensor of the manifold. When a vector is parallel transported around a small closed loop, it may not return to its original value; the Riemannian curvature tensor measures the difference between the starting and ending vectors. More precisely, it is given in terms of the Levi-Civita connection ∇ by the formula

R(u, v)w = ∇_u ∇_v w − ∇_v ∇_u w − ∇_{[u,v]} w,

where [u, v] is the Lie bracket of vector fields. For details, we refer readers to Boothby (2002).
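The piecewise-cubic weights in Eqn. 4 are ordinary scalar polynomials and can be transcribed directly. The sketch below (the function name is ours) checks that the two branches agree at t = ti, so each gi is continuous there, and that the weight is symmetric in (t, ti).

```python
import numpy as np

def data_weight(t, ti):
    """Scalar weight multiplying the transported shooting vector in the
    data-term gradient of Eqn. 4 (a direct transcription; name is ours)."""
    if t <= ti:
        return 1.0 + ti * t + 0.5 * ti * t ** 2 - t ** 3 / 6.0
    return 1.0 + t * ti + 0.5 * t * ti ** 2 - ti ** 3 / 6.0

# The two branches are the same polynomial with t and ti exchanged, so they
# agree at t = ti; there both equal 1 + ti^2 + ti^3/3.
ti = 0.4
print(np.isclose(data_weight(ti, ti), 1.0 + ti ** 2 + ti ** 3 / 3.0))  # True
```

Continuity of the weight (and hence of gi) at the observation time is what makes the gradient field well defined along the whole path.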

Combining the two terms, the gradient of E under the second-order Palais metric is given by:

∇E = λ1 G + λ2 (H2 + H3).

This gradient expression is used in deriving a steepest-descent algorithm for finding the optimal path γ̂. This iterative algorithm is summarized below.

Algorithm 1. Iterative Algorithm for Gradient-Based Minimization of E.
At each iteration k = 1, 2, ..., we have the path γk sampled at the times {tδ | t = 0, 1, 2, ..., T}, δ = 1/T.

1. Data Term Gradient: For i = 1, 2, ..., n:
(a) Find the shooting vector vi = −exp⁻¹_{γk(ti)}(pi).
(b) Parallel transport vi to each point along γk to form a vector field ṽi: start at γk(ti) and proceed in both directions (increasing and decreasing t) in steps of δ, transporting the tangent vector from the previous point to the current point.
(c) Compute gi(tδ) using Eqn. 4.
Evaluate G(tδ) = Σ_{i=1}^n gi(tδ).

2. Smoothing Term Gradient: For each t = 0, 1, ..., T:
(a) Compute γ̇k(tδ) = exp⁻¹_{γk(tδ)}(γk((t + 1)δ)).
(b) Parallel transport γ̇k(tδ) to γk((t + 1)δ); call it γ̇k∥(tδ). Compute (D²γk/dt²)((t + 1)δ) = (γ̇k((t + 1)δ) − γ̇k∥(tδ))/δ.
(c) Compute the curvature tensor R((D²γk/dt²)(tδ), γ̇k(tδ))(γ̇k(tδ)) at γk(tδ).
(d) Solve for Ĥ2 by covariantly integrating R four times using Algorithm 2, each time with initial condition zero.
(e) Compute Q̃(tδ) and S̃(tδ) as backward parallel translations of (D²Ĥ2/dt²)(1) and (D³Ĥ2/dt³)(1), respectively. For any vector v(1) at γk(1), its backward parallel translation along γk is computed in backward steps: start at t = T and, for each t, parallel transport v((t + 1)δ) at γk((t + 1)δ) to γk(tδ).
(f) Use Eqn. 5 to compute H2.
(g) Similarly, solve for H3 by covariantly integrating D²γk/dt² twice using Algorithm 2, each time with initial condition zero.

3. Total Gradient: The gradient of E is now given by F = λ1 G + λ2 (H2 + H3).

4. Update of Curve: For a pre-defined step size ϵ, compute γ_{k+1}(tδ) = exp_{γk(tδ)}(−kϵ F(tδ)) for t = 0, 1, 2, ..., T, and compute E(γ_{k+1}). We start at k = 1 and continue in increments of one until E(γ_{k+1}) > E(γk); we then select the best k based on the minimum E achieved, and the corresponding updated path is γk.

Algorithm 2. Covariant integration of a vector field v along γ, producing u.

1. Initialize u(0) = 0.
2. For each t = 0, 1, ..., T − 1:
(a) parallel transport u(tδ) from γ(tδ) to γ((t + 1)δ); call it u∥((t + 1)δ);
(b) parallel transport v(tδ) from γ(tδ) to γ((t + 1)δ); call it v∥((t + 1)δ);
(c) define u((t + 1)δ) = u∥((t + 1)δ) + δ v∥((t + 1)δ).
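Algorithm 2 only needs a parallel-transport routine on the underlying manifold. Below is a sketch with a pluggable `transport` callback (the names `covariant_integral` and `transport` are ours), written as a standard covariant Euler step; in the Euclidean case transport is the identity and the procedure reduces to a cumulative sum.

```python
import numpy as np

def covariant_integral(v, transport, delta):
    """Sketch of Algorithm 2: v is an array of tangent vectors sampled along
    the path, and transport(t, w) parallel-transports w from sample t to
    sample t + 1. Returns the running covariant integral u, with u[0] = 0."""
    T = len(v) - 1
    u = [np.zeros_like(v[0])]
    for t in range(T):
        u_par = transport(t, u[-1])       # carry the running integral forward
        v_par = transport(t, v[t])        # carry the integrand forward
        u.append(u_par + delta * v_par)   # one covariant Euler step
    return np.array(u)

# Euclidean sanity check (transport is the identity): integrating a constant
# vector field yields a linear ramp ending near [1, 1].
T = 10
u = covariant_integral(np.ones((T + 1, 2)), transport=lambda t, w: w, delta=1.0 / T)
print(u[-1])
```

Nesting this routine (four times for Ĥ2, twice for H3) reproduces the covariant integrations called for in Algorithm 1, step 2.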

To implement this method, we need the following items from the differential geometry of M:

• Exponential Map: For any point p ∈ M and tangent vector v ∈ T_p(M), we need to be able to find the point exp_p(v) ∈ M.

• Inverse Exponential Map: For any two points p1, p2 on M, we should be able to compute the inverse exponential exp⁻¹_{p1}(p2). This item is needed in the gradient of Ed.

• Parallel Transport: For any two points p1 and p2, and a tangent vector v1 ∈ T_{p1}(M), we should be able to parallel transport v1 to p2 along a geodesic path connecting p1 and p2. This item is used in numerical implementations of covariant integration and differentiation using finite sums and differences, respectively.

• Riemannian Curvature: For any point p ∈ M and tangent vectors X, Y, Z ∈ T_p(M), we should be able to compute the Riemannian curvature tensor R(X, Y)(Z).

Performance: We have implemented this algorithm for three manifolds: SO(3); P(3), the 3 × 3 SPD matrices with determinant 1; and CP^{n−2}. Although we present the detailed results in later sections, we mention the computational costs of this algorithm here. To illustrate the computational efficiency, Table 1 reports the computational cost per iteration of the path update on the three manifolds (using Matlab on a 2.4 GHz Intel processor).

Manifold     SO(3)     P(3)      CP^{n−2}
Time (sec)   0.2454    0.2327    0.1045

Table 1: Computation costs for path updating on three different manifolds

In the next few sections, we take specific examples of manifolds, describe the relevant parts of their differential geometries, and implement Algorithms 1 and 2 for finding optimal paths on those manifolds.

3. Three-Dimensional Rotation Group SO(3)

Here we are interested in fitting smooth paths to given time-indexed points on the rotation group SO(3) under the standard Euclidean metric.

3.1. Basic Geometric Tools

This space is a Lie group under matrix multiplication, and the identity element is the 3 × 3 identity matrix I3. The tangent space T_{I3}(SO(3)) is the set of all 3 × 3 skew-symmetric matrices, and T_O(SO(3)) = {OA | A ∈ T_{I3}(SO(3))}. The standard Riemannian metric is given by ⟨X1, X2⟩ = trace(X1ᵀ X2). For any point O ∈ SO(3) and a tangent vector X = OA (where Aᵀ = −A), the geodesic path from O in the direction X is given by t ↦ O e^{tA}, where e denotes the matrix exponential. The required items for computing the gradient of E for paths in SO(3) are:

• Exponential Map: Given a point O ∈ SO(3) and a tangent vector X ∈ T_O(SO(3)), the exponential map is given by:

exp_O(X) = O e^{Oᵀ X}.

• Inverse Exponential Map: For any two points O1, O2 ∈ SO(3), we can compute the inverse exponential using the formula:

exp⁻¹_{O1}(O2) = O1 log(O1ᵀ O2),

where log denotes the matrix logarithm.

• Parallel Transport: For any two elements O1 and O2 in SO(3), and a tangent vector W ∈ T_{O1}(SO(3)), the tangent vector at O2 which is the parallel transport of W along the shortest geodesic from O1 to O2 is:

O1 (O1ᵀ O2)^{1/2} (O1ᵀ W) (O1ᵀ O2)^{1/2}.

Since W is tangent to SO(3) at O1, it follows that O1ᵀW is tangent at the identity, i.e., O1ᵀW is skew-symmetric. The natural way to calculate (O1ᵀO2)^{1/2} is to let A be the skew-symmetric matrix logarithm of O1ᵀO2; then (O1ᵀO2)^{1/2} = e^{A/2}. With this A, the geodesic from O1 to O2 is simply O1 e^{tA}, where t goes from 0 to 1, and the parallel transport becomes:

W′ = O1 e^{A/2} (O1ᵀ W) e^{A/2}.

• Riemannian Curvature: For any point O ∈ SO(3) and tangent vectors X, Y, Z ∈ T_O(SO(3)), the Riemannian curvature tensor R(X, Y)(Z) is given by:

R(X, Y)(Z) = O [[A, B], C], where A = OᵀX, B = OᵀY, C = OᵀZ.

Here [A, B] denotes AB − BA ∈ R^{3×3}.
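The four tools above can be transcribed almost verbatim. The sketch below uses Rodrigues-type formulas for the exponential and logarithm of 3 × 3 skew-symmetric and rotation matrices; the function names are ours, and the logarithm assumes rotation angles below π.

```python
import numpy as np

def so3_exp(A):
    """Matrix exponential of a 3x3 skew-symmetric matrix (Rodrigues formula)."""
    theta = np.sqrt(0.5 * np.sum(A * A))  # rotation angle
    if theta < 1e-12:
        return np.eye(3) + A
    return (np.eye(3) + (np.sin(theta) / theta) * A
            + ((1.0 - np.cos(theta)) / theta ** 2) * (A @ A))

def so3_log(R):
    """Principal matrix logarithm of a rotation matrix (angle < pi assumed)."""
    theta = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    if theta < 1e-12:
        return 0.5 * (R - R.T)
    return theta / (2.0 * np.sin(theta)) * (R - R.T)

def exp_map(O, X):
    """exp_O(X) = O e^{O^T X}."""
    return O @ so3_exp(O.T @ X)

def inv_exp_map(O1, O2):
    """exp^{-1}_{O1}(O2) = O1 log(O1^T O2)."""
    return O1 @ so3_log(O1.T @ O2)

def parallel_transport(O1, O2, W):
    """W' = O1 e^{A/2} (O1^T W) e^{A/2}, with A = log(O1^T O2)."""
    A = so3_log(O1.T @ O2)
    return O1 @ so3_exp(A / 2.0) @ (O1.T @ W) @ so3_exp(A / 2.0)

def curvature(O, X, Y, Z):
    """R(X, Y)(Z) = O [[A, B], C] with A = O^T X, B = O^T Y, C = O^T Z."""
    A, B, C = O.T @ X, O.T @ Y, O.T @ Z
    AB = A @ B - B @ A
    return O @ (AB @ C - C @ AB)

# Round trip: shooting from O1 with inv_exp_map(O1, O2) lands back on O2.
O1 = np.eye(3)
A = np.array([[0.0, -0.3, 0.1], [0.3, 0.0, -0.2], [-0.1, 0.2, 0.0]])
O2 = so3_exp(A)
print(np.allclose(exp_map(O1, inv_exp_map(O1, O2)), O2))  # True
```

A generic dense matrix exponential/logarithm would work equally well here; the closed-form Rodrigues versions are simply cheaper for 3 × 3 rotations.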

Figure 2: Minimization of E s (λ1 = 0, λ2 = 1): In each row, the left panel shows the initial curve, the second panel shows some intermediate curves, the third panel shows the final curve, and the last panel shows the corresponding decrease in E s during evolution.

Figure 3: Minimization of Ed (λ1 = 1, λ2 = 0): Each case shows the initial path (dotted line) and the final path (solid line), along with the evolution of Ed versus the iteration index.

3.2. Experimental Results

We have implemented the gradient algorithm given earlier for minimizing the total energy E associated with a given path. To display a path γ : [0, 1] → SO(3), we use the following idea: we take the unit vector v = (1, 1, 1)/√3 ∈ R³, generate the process X(t) = vγ(t) in R³, and display points along X(t) at any iteration.

1. Smoothing Term Only: As the first experiment, we set λ1 = 0 and minimize only Es using its gradient H2 + H3. In other words, we ignore the points (ti, pi) and the data term Ed and simply try to minimize Es. Some results

[Figure 4 panels: λ1 = 0, λ2 = 1; λ1 = 1, λ2 = 1; λ1 = 10, λ2 = 1; λ1 = 50, λ2 = 1; λ1 = 100, λ2 = 1; λ1 = 100, λ2 = 0.1.]
Figure 4: Initial and final paths for optimization of E under different combinations of λ1 and λ2 . In all these cases we use the same data points (ti , pi ) and the same initial path γ for starting the algorithm.

are shown in Figure 2, where each row shows the initial curve (left panel), some intermediate curves (second panel), and the final curve (third panel), along with the evolution of Es. As seen in these examples, the gradient method proves very effective in minimizing Es to zero rapidly. The minimizer of Es is simply a geodesic path in SO(3) and, when visualized via its action on (1, 1, 1)/√3, it traces an arc on the sphere.

2. Data Term Only: In this case we set λ1 = 1 and λ2 = 0, and use the gradient G to minimize only the data term Ed. We generate a set of indexed points (ti, pi) on SO(3), select a path γ, and use Algorithm 1 to perform gradient-based updates of γ. Some of the results are shown in Figure 3; each case shows the initial and the final paths (left panels) and the evolution of Ed (right panels). We note that, in each case, the decrease in Ed is very rapid at the start but slows considerably in the later stages. This is due to the use of the second-order Palais metric in evaluating the gradient: while Ed is a zeroth-order quantity, since it simply involves computing distances between points and no derivatives, the Palais metric is a second-order metric involving second derivatives of γ(t). From a practical perspective, there is a possibility of stopping at a point that is not a global minimizer of Ed. This is visible in the examples shown in Figure 3, where the final curve does not strictly pass through the given points. In practice we can partially address this problem by initializing with a path that is close to the given points; for example, a piecewise-geodesic path is used to initialize the algorithm in the later experiments in this paper. Another notable phenomenon occurs when the largest time index ti is significantly smaller than one: the portion of γ beyond the last observation evolves differently from the rest of the path.

3. Both Terms: In this case we study the full gradient λ1G + λ2(H2 + H3) for different values of λ1 and λ2. Some results are presented in Figure 4. In this experiment we generate a set of n = 4 data points and keep them

fixed. We initialize the algorithm each time with a path that passes through these points at the corresponding times and, hence, Ed is zero for the starting path γ. Different combinations of λ1 and λ2 result in different evolutions of γ and the energy E. In the cases where λ2 is not negligible, the smoothing term dominates the evolution; here, the energy quickly decreases to a very small value on the resulting path, i.e., a geodesic path on SO(3). As λ1 gets larger, and/or λ2 becomes negligible, the data term starts having an effect and consequently the solution stays close to the given points.

4. Symmetric Positive-Definite Matrices with Determinant 1

In this section we present experimental results on the space of SPD matrices with determinant 1.

4.1. Basic Geometric Tools

Let P̃(n) be the space of n × n symmetric positive-definite matrices and P(n) = {P | P ∈ P̃(n) and det(P) = 1}. It is known that for any P̃ ∈ P̃(n), P = P̃/det(P̃)^{1/n} ∈ P(n). The space P(n) is a well-known Riemannian symmetric manifold, i.e., the quotient of the special linear group SL(n) = {G ∈ GL(n) | det(G) = 1} by its closed subgroup SO(n) acting on the right (Jost (1998)), with an SL(n)-invariant metric. Since n = 3 is important in diffusion tensor imaging, we will focus on this case. We are interested in fitting smooth paths to given time-indexed points on P(3); the treatment extends easily to P̃(3). The tangent space of P(3) at I is T_I(P(3)) = {A | A^T = A and tr(A) = 0}, and the tangent space at P is T_P(P(3)) = {PA | A ∈ T_I(P(3))}. For any point P ∈ P(3) and a tangent vector V, the geodesic path from P in the direction V is given by t ↦ √P e^{2tA} √P, where √· denotes the matrix square root and A = P^{-1}V. Now consider the required items for computing the gradient of E for paths on P(3).

• Exponential Map: Given a point P ∈ P(3) and a tangent vector V ∈ T_P(P(3)), the exponential map is given by: exp_P(V) = √P e^{2P^{-1}V} √P.
• Inverse Exponential Map: For any two points P1, P2 ∈ P(3), the inverse exponential map is: exp⁻¹_{P1}(P2) = (1/2) P1 log(√(P1⁻¹) P2 √(P1⁻¹)).

• Parallel Transport: For any two elements P1 and P2 in P(3), and a tangent vector V ∈ T_{P1}(P(3)), the tangent vector at P2 which is the parallel transport of V along the shortest geodesic from P1 to P2 is: P2 T12 B T12^T, where B = P1⁻¹ V, T12 = √(P2⁻¹) P12 √(P1⁻¹) and P12 = √(P1⁻¹ P2) P1.

• Riemannian Curvature: For any point P ∈ P(3) and tangent vectors X, Y, Z ∈ T_P(P(3)), the Riemannian curvature tensor R(X, Y)(Z) is given by: R(X, Y)(Z) = −P[[A, B], C], where A = P⁻¹X, B = P⁻¹Y, C = P⁻¹Z.

4.2. Experimental Results

It is difficult to visualize an element of P(3) directly. We will use an ellipsoid, formed as a level curve of the function X ↦ X^T P X, to display the matrix P.

1. Smoothing Term Only: As the first experiment, we set λ1 = 0 and minimize only Es using its gradient H2 + H3. In other words, we ignore the points (ti, pi) and the data term Ed and simply try to minimize Es. Some results are shown in Figure 5, where the left panels show the initial path (first row) and the final path (second row), along with the evolution of Es on the right panels. As seen in these examples, the gradient method proves very effective in minimizing Es to zero rapidly. The minimizer of Es is simply a geodesic path in P(3).
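As an illustration, the exponential and inverse exponential maps above are straightforward to implement numerically. The sketch below is our own code, not the authors' (function names are ours); it uses NumPy/SciPy and checks only the defining round-trip property exp_{P1}(exp⁻¹_{P1}(P2)) = P2.

```python
import numpy as np
from scipy.linalg import expm

def spd_sqrt(P):
    # symmetric square root of an SPD matrix via its eigendecomposition
    w, U = np.linalg.eigh(P)
    return (U * np.sqrt(w)) @ U.T

def spd_log(P):
    # principal matrix logarithm of an SPD matrix
    w, U = np.linalg.eigh(P)
    return (U * np.log(w)) @ U.T

def exp_map(P, V):
    # exp_P(V) = sqrt(P) e^{2 P^{-1} V} sqrt(P)
    S = spd_sqrt(P)
    return S @ expm(2.0 * np.linalg.solve(P, V)) @ S

def inv_exp_map(P1, P2):
    # exp_{P1}^{-1}(P2) = (1/2) P1 log( sqrt(P1^{-1}) P2 sqrt(P1^{-1}) )
    Si = spd_sqrt(np.linalg.inv(P1))
    return 0.5 * P1 @ spd_log(Si @ P2 @ Si)

# two random determinant-one SPD matrices, as elements of P(3)
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)); P1 = A @ A.T + 3 * np.eye(3)
B = rng.standard_normal((3, 3)); P2 = B @ B.T + 3 * np.eye(3)
P1 /= np.linalg.det(P1) ** (1 / 3)
P2 /= np.linalg.det(P2) ** (1 / 3)

V = inv_exp_map(P1, P2)
assert np.allclose(exp_map(P1, V), P2, atol=1e-6)  # round trip recovers P2
```

The eigendecomposition-based square root and logarithm keep all intermediate results real and symmetric, which is convenient here since every matrix passed to them is SPD by construction.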

Figure 5: Minimization of Es (λ1 = 0, λ2 = 1): In each case, the left panel shows the initial path (first row) and the final path (second row), and the right panel shows the corresponding decrease in Es during the evolution.

Figure 6: Minimization of Ed (λ1 = 1, λ2 = 0): Each case shows the given data (first row), the initial path (second row) and the final path (third row), along with the evolution of Ed versus the iteration index.

2. Data Term Only: In this case we set λ1 = 1 and λ2 = 0, and use the gradient G to minimize only the data term Ed. In this experiment we generate a set of n = 4 data points, select a path γ passing through the given points

and use Algorithm 1 to perform gradient-based updates of γ. Some of the results are shown in Figure 6. The top row shows the given time-indexed points, the middle row shows the initial γ and the bottom row shows the final γ. The right panel shows the evolution of E. Note that, in each case, the decrease in Ed is very rapid at the start but slows down considerably in the later stages.

3. Both Terms: In this case we study the full gradient term λ1 G + λ2 (H2 + H3) for different values of λ1 and λ2; some results are presented in Figures 7 and 8. We initialize the algorithm each time with a piecewise-geodesic path that passes through the given points at the corresponding times and, hence, Ed is zero for the starting path γ. Different combinations of λ1 and λ2 result in different evolutions of γ and the energy E. In the cases where λ2 is not negligible, the smoothing term dominates the evolution; here, the energy quickly decreases to a very small value on the resulting path, i.e., a geodesic path on P(3). As λ1 gets larger, and/or λ2 becomes negligible, the data term starts having an effect and consequently the solution stays close to the given points.
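The piecewise-geodesic initialization used here can be written generically in terms of the exponential and inverse exponential maps of the manifold at hand. The sketch below is our own illustration (function names are ours), demonstrated on the unit sphere, where both maps have simple closed forms:

```python
import numpy as np

def piecewise_geodesic(times, points, exp_map, inv_exp_map, t):
    """Evaluate at time t the piecewise-geodesic path interpolating
    (times[k], points[k]); exp_map/inv_exp_map define the manifold."""
    k = int(np.clip(np.searchsorted(times, t, side="right") - 1, 0, len(times) - 2))
    s = (t - times[k]) / (times[k + 1] - times[k])
    return exp_map(points[k], s * inv_exp_map(points[k], points[k + 1]))

# closed-form maps on the unit sphere S^2, used here only as a demo manifold
def sphere_exp(p, v):
    nv = np.linalg.norm(v)
    return p if nv < 1e-12 else np.cos(nv) * p + np.sin(nv) * v / nv

def sphere_log(p, q):
    r = np.clip(np.dot(p, q), -1.0, 1.0)
    d = np.arccos(r)
    return np.zeros_like(p) if d < 1e-12 else (d / np.sin(d)) * (q - r * p)

times = np.array([0.0, 0.5, 1.0])
points = [np.eye(3)[i] for i in range(3)]   # e1, e2, e3 on S^2
gamma = lambda t: piecewise_geodesic(times, points, sphere_exp, sphere_log, t)

assert np.allclose(gamma(0.5), points[1])                 # interpolates the data
assert abs(np.linalg.norm(gamma(0.25)) - 1.0) < 1e-10     # stays on the sphere
```

Swapping in the exponential and inverse exponential maps of P(3) (or any other manifold in this paper) yields the corresponding piecewise-geodesic starting path, since only those two maps enter the construction.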

Figure 7: Initial and final paths for optimization of E under different combinations of λ1 and λ2 (λ2 = 1 with λ1 = 0, 1, 10, 100). In all these cases we use the same data points (ti, pi) and the same initial path γ for starting the algorithm.

Figure 8: Same as Figure 7.

5. Kendall's Shape Space

In this case we consider the shape space formed by n landmarks in ℝ².

5.1. Basic Geometric Tools

Let x ∈ ℝ^{n×2} represent n ordered points selected from the boundary of an object. It is often convenient to identify points in ℝ² with elements of ℂ, i.e., xi ≡ zi = xi,1 + j xi,2, where j = √−1. Thus, in this complex representation, a configuration of n points x becomes z ∈ ℂⁿ. We remove translation by restricting to those elements of ℂⁿ whose average is zero, and scale variability by rescaling the complex vector to have norm one. This results in the set D = {z ∈ ℂⁿ | (1/n) Σ_{i=1}^{n} zi = 0, ∥z∥ = 1}. Here, D is a unit sphere and one can utilize the geometry of a sphere to analyze points on it. Let [z] be the set of all rotations of a configuration z, [z] = {e^{jϕ} z | ϕ ∈ S¹}. One defines an equivalence relation on D by setting all elements of this set as equivalent, i.e., z1 ∼ z2 if there exists an angle ϕ such that z1 = e^{jϕ} z2. The set of all such equivalence classes is the quotient space D/S¹. This space is called the complex projective space and is denoted by CP^{n−2}. The tangent space of D is T_z(D) = {v ∈ ℂⁿ | ℜ(⟨v, z⟩) = 0, (1/n) Σ_{i=1}^{n} vi = 0} and the tangent space of the orbit is T_z([z]) = {jxz | x ∈ ℝ}. Since T_z(D) = T_z([z]) ⊕ T_z(CP^{n−2}), the tangent space is T_z(CP^{n−2}) = {v ∈ ℂⁿ | ⟨v, z⟩ = 0, (1/n) Σ_{i=1}^{n} vi = 0}.

A geodesic between two elements z1, z2 ∈ CP^{n−2} is given by computing a geodesic between z1 and z2*, where z2* = e^{jϕ*} z2 and ϕ* is the optimal rotational alignment of z2 to z1, given by ϕ* = θ with ⟨z1, z2⟩ = r e^{jθ}. Now consider the required items for computing the gradient of E for paths on CP^{n−2}.

• Exponential Map: The exponential map, exp : T_z(CP^{n−2}) → CP^{n−2}, has a simple expression: exp_z(v) = cos(∥v∥) z + sin(∥v∥) v/∥v∥.

• Inverse Exponential Map: The exponential map is a bijection if we restrict ∥v∥ so that ∥v∥ ∈ [0, π/2). The inverse exponential map is: exp⁻¹_{z1}(z2) = (d/sin(d)) (z2* − r z1), where d = cos⁻¹(r).

• Parallel Transport: If we parallel transport a vector v ∈ T_{z1}(CP^{n−2}) along the shortest geodesic (i.e., great circle) from z1 to z2, the result is: v − ⟨v, z2*⟩(z1 + z2*)/(1 + r) ∈ T_{z2*}(CP^{n−2}).

• Riemannian Curvature: For X, Y, Z ∈ T_z(CP^{n−2}), the curvature is given by: R(X, Y)Z = −⟨Y, Z⟩X + ⟨X, Z⟩Y + ⟨Y, jZ⟩jX − ⟨X, jZ⟩jY − 2⟨X, jY⟩jZ.

5.2. Experimental Results

Now we show some experimental results for fitting optimal curves on CP^{n−2}. We present results from two sets of experiments: in one we simulate data using simple shapes, while in the other we use real shape data from the INRIA 4D repository.

• Simulated Data

1. Smoothing Term Only: As the first experiment, we set λ1 = 0 and minimize only Es using its gradient H2 + H3. In other words, we ignore the points (ti, pi) and the data term Ed and simply try to minimize Es. Some results are shown in Figure 9, where the left panels show the initial path (top row) and the final path (bottom row), along with the evolution of Es on the right. As seen in these examples, the gradient method proves very effective in minimizing Es to zero rapidly. The minimizer of Es is simply a geodesic path in CP^{n−2}.

2. Data Term Only: In this case we set λ1 = 1 and λ2 = 0, and use the gradient G to minimize only the data term Ed. In this experiment we generate a set of n = 4 data points, select a path γ passing through the given points and use Algorithm 1 to perform gradient-based updates of γ. Some of the results are shown in Figure 10. The top row shows the given time-indexed points, the middle row shows the initial γ and the bottom row shows the final γ. The right panel shows the evolution of E. Note that, in each case, the decrease in Ed is very rapid at the start but slows down considerably in the later stages.

3. Both Terms: In this case we study the full gradient term λ1 G + λ2 (H2 + H3) for different values of λ1 and λ2. Some results are presented in Figures 11 and 12.
We initialize the algorithm each time with a piecewise-geodesic path that passes through the given points at the corresponding times and, hence, Ed is zero for the starting path γ. Different combinations of λ1 and λ2 result in different evolutions of γ and the energy E. The results of this experiment are similar to those for the two manifolds discussed earlier. In the cases where λ2 is not negligible, the smoothing term dominates the evolution; here, the energy quickly decreases to a very small value on the resulting path, i.e., a geodesic path on CP^{n−2}. As λ1 gets larger, and/or λ2 becomes negligible, the data term starts having an effect and consequently the solution stays close to the given points. Some more examples with λ1 = λ2 = 1 are given in Figure 13.
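The preshape construction and the maps of Section 5.1 translate directly into code. The following NumPy sketch is our own illustration (names ours, not the authors' code); the assertions check only properties stated above: the round trip exp_{z1}(exp⁻¹_{z1}(z2)) returns the rotationally aligned configuration z2*, and the log lies in the tangent space T_{z1}(CP^{n−2}).

```python
import numpy as np

def to_preshape(x):
    # n x 2 landmark matrix -> centered, unit-norm complex vector in D
    z = x[:, 0] + 1j * x[:, 1]
    z = z - z.mean()
    return z / np.linalg.norm(z)

def align(z1, z2):
    # optimal rotation of z2 onto z1: z2* = e^{j phi*} z2, phi* = angle(<z1, z2>)
    return np.exp(1j * np.angle(np.vdot(z2, z1))) * z2

def exp_map(z, v):
    nv = np.linalg.norm(v)
    return z if nv < 1e-12 else np.cos(nv) * z + np.sin(nv) * v / nv

def inv_exp_map(z1, z2):
    z2s = align(z1, z2)
    r = np.clip(np.real(np.vdot(z2s, z1)), -1.0, 1.0)
    d = np.arccos(r)
    return np.zeros_like(z1) if d < 1e-12 else (d / np.sin(d)) * (z2s - r * z1)

rng = np.random.default_rng(1)
z1 = to_preshape(rng.standard_normal((6, 2)))
z2 = to_preshape(rng.standard_normal((6, 2)))
v = inv_exp_map(z1, z2)
assert np.allclose(exp_map(z1, v), align(z1, z2))   # round trip, up to rotation
assert abs(np.vdot(z1, v)) < 1e-10                  # <v, z1> = 0 (tangency)
```

Note that `np.vdot(a, b)` conjugates its first argument, so `np.vdot(z2, z1)` computes ⟨z1, z2⟩ under the convention ⟨a, b⟩ = Σ ai conj(bi) used above.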

Figure 9: Minimization of Es (λ1 = 0, λ2 = 1): In each case, the left panel shows the initial path (top row) and the final path (bottom row), and the right panel shows the corresponding decrease in Es during the evolution.

Figure 10: Minimization of Ed (λ1 = 1, λ2 = 0). Each case shows the given data (top row), the initial path (middle row) and the final path (bottom row), along with the evolution of Ed versus the iteration index.

Figure 11: Initial and final paths for optimization of E under different combinations of λ1 and λ2 (λ2 = 1 with λ1 = 0.1, 1, 100). In all these cases we use the same data points (ti, pi) and the same initial path γ for starting the algorithm.

Figure 12: Same as Figure 11.

Figure 13: Initial and final paths for optimization of E under λ1 = λ2 = 1. In each case, each pair of panels shows the initial path (top row) and the final path (bottom row) on the left, along with the corresponding energy on the right.

• Real Data

We have also applied this algorithm to real data taken from the INRIA 4D repository. We select two real sequences of a dancer from this repository, shown in the top rows of Figures 14 and 15. In each case, we select 4 data points from the real sequence and generate a piecewise geodesic between these points as the initial path. Different combinations of λ1 and λ2 result in different evolutions of γ and the energy E, and we compare the estimated shapes with the ground truth.

Figure 14: Initial and final paths for optimization of E under different combinations of λ1 and λ2 (λ2 = 1 with λ1 = 0.1, 1, 100). In all these cases we use the same data points (ti, pi) and the same initial path γ for starting the algorithm.

Figure 15: Same as Figure 14.

6. Estimation of Optimal Weight Ratio Using Cross-Validation

An important open issue in this framework is the choice of the ratio λ = λ1/λ2 in estimating the optimal path. Clearly, different values of λ lead to different results, so this choice becomes important. While, in some cases, it may be possible to choose λ according to the application, this is difficult in most situations. Here we suggest a cross-validation approach, which uses the training data to automatically determine an optimal value of λ. The optimal value of λ is given by:

λ̂ = argmin_λ Σ_{i=1}^{n} d(γ_{−i}(ti, λ), pi)²,

where γ_{−i}(·, λ) is the optimal curve obtained when pi is omitted from the data, under the weight ratio λ. In other words, we omit pi, estimate the optimal curve for the current λ, and compute the distance of the estimated point γ_{−i}(ti, λ) from the held-out value pi. Summing these squared distances over all the points gives the cost function for finding the optimal λ. We demonstrate this method using the example studied earlier in Figure 11. First, we compute the cross-validation cost function over λ, as shown in the left panel of Figure 16. Then, we select the optimal λ and use it to compute the optimal path, shown in the right panel.
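This leave-one-out procedure is easy to express in code. The sketch below is our own generic illustration: `fit` stands in for the manifold curve-fitting solver (Algorithm 1 run with weight ratio λ), and, to keep the demo runnable, we substitute a Euclidean ridge-penalized polynomial fit and the Euclidean distance in place of the geodesic distance d(·, ·); all names here are ours.

```python
import numpy as np

def cv_cost(ts, ps, lam, fit):
    # leave-one-out: omit (t_i, p_i), refit with weight ratio lam,
    # and accumulate the squared distance at the held-out point
    ts, ps = np.asarray(ts, float), np.asarray(ps, float)
    total = 0.0
    for i in range(len(ts)):
        keep = np.arange(len(ts)) != i
        gamma = fit(ts[keep], ps[keep], lam)
        total += (gamma(ts[i]) - ps[i]) ** 2
    return total

def select_lambda(ts, ps, lambdas, fit):
    costs = [cv_cost(ts, ps, lam, fit) for lam in lambdas]
    return lambdas[int(np.argmin(costs))]

def poly_ridge_fit(ts, ps, lam, deg=3):
    # Euclidean stand-in for the manifold solver: ridge-penalized cubic fit
    X = np.vander(ts, deg + 1)                       # columns t^deg, ..., t^0
    w = np.linalg.solve(X.T @ X + lam * np.eye(deg + 1), X.T @ ps)
    return lambda t: np.polyval(w, t)

ts = np.linspace(0.0, 1.0, 12)
ps = np.sin(2 * np.pi * ts) + 0.05 * np.random.default_rng(2).standard_normal(12)
best = select_lambda(ts, ps, [1e-6, 1e-3, 1e-1, 10.0], poly_ridge_fit)
assert best in [1e-6, 1e-3, 1e-1, 10.0]
```

Replacing `poly_ridge_fit` by the actual gradient-descent solver, and the squared difference by the squared geodesic distance on the manifold, yields the cost function minimized in this section.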

Figure 16: Left: Cross-validation cost function for automatically estimating λ = λ1 /λ2 . Right: The optimal path under the optimal weight ratio.

7. Conclusion

We have discussed the problem of fitting optimal curves to time-indexed data points on a closed Riemannian submanifold M by means of a Palais-metric-based gradient method. We have studied three specific manifolds: SO(3), the set of determinant-one symmetric positive-definite matrices, and Kendall's shape space, and have described the relevant parts of their differential geometries. As a result, we have derived the gradients of the cost function on these manifolds and have applied the gradient algorithm to find optimal curves for a variety of real and simulated data examples. In future work, we would like to extend this framework to other manifolds, including the shape space of elastic curves (Srivastava et al. (2010)), with applications to elastic shape tracking in video data.

References

Altafini, C., 2000. The De Casteljau algorithm on SE(3). Nonlinear Control 258, 23–34.
Belhumeur, P. N., Hespanha, J. P., Kriegman, D. J., 1997. Eigenfaces vs. fisherfaces: Recognition using class specific linear projection. IEEE Transactions on PAMI 19 (7), 711–720.
Boothby, W. M., 2002. An Introduction to Differentiable Manifolds and Riemannian Geometry. Academic Press.
Camarinha, M., Leite, F. S., Crouch, P., 1995. Splines of class C^k on non-Euclidean spaces. IMA J. Math. Control and Information 12, 399–410.
Crouch, P., Kun, G., Leite, F. S., 1999. The De Casteljau algorithm on Lie groups and spheres. Journal of Dynamical and Control Systems 5, 397–429.
Crouch, P., Leite, F. S., 1991. Geometry and the dynamic interpolation problem. In: Proc. American Control Conference. pp. 1131–1137.
Crouch, P., Leite, F. S., 1995. The dynamic interpolation problem: On Riemannian manifolds, Lie groups, and symmetric spaces. Journal of Dynamical and Control Systems 1, 177–202.
Das, S., Vaswani, N., 2010. Nonstationary shape activities: Dynamic models for landmark shape change and applications. IEEE Transactions on PAMI 32 (4), 579–592.
Dryden, I. L., Koloydenko, A., Zhou, D., 2009. Non-Euclidean statistics for covariance matrices with applications to diffusion tensor imaging. The Annals of Applied Statistics 3 (3), 1102–1123.
Dryden, I. L., Mardia, K. V., 1998. Statistical Shape Analysis. John Wiley & Sons.
Fletcher, P., Joshi, S., 2007. Riemannian geometry for the statistical analysis of diffusion tensor data. Signal Processing 87 (2), 250–262.
Grenander, U., Miller, M. I., Srivastava, A., 1998. Hilbert-Schmidt lower bounds for estimators on matrix Lie groups for ATR. IEEE Transactions on PAMI 20 (8), 790–802.
Gu, X., He, Y., Qin, H., 2005. Manifold splines. In: Proc. 2005 ACM Symposium on Solid and Physical Modeling. pp. 27–38.
Hofer, M., Pottmann, H., 2004. Energy-minimizing splines in manifolds. In: Proc. SIGGRAPH. pp. 284–293.
Hüper, K., Leite, F. S., 2007. On the geometry of rolling and interpolation curves on S^n, SO_n, and Grassmann manifolds. Journal of Dynamical and Control Systems 13, 467–502.
Jakubiak, J., Leite, F. S., Rodrigues, R. C., 2006. A two-step algorithm of smooth spline generation on Riemannian manifolds. Journal of Computational and Applied Mathematics 194 (2), 177–191.
Jost, J., 1998. Riemannian Geometry and Geometric Analysis. Springer-Verlag.
Jupp, P. E., Kent, J. T., 1987. Fitting smooth paths to spherical data. Journal of the Royal Statistical Society, Series C (Applied Statistics) 36 (1), 34–46.
Kendall, D. G., 1984. Shape manifolds, Procrustean metrics and complex projective spaces. Bulletin of the London Mathematical Society 16, 81–121.
Kendall, D. G., Barden, D., Carne, T. K., Le, H., 1999. Shape and Shape Theory. Wiley, Chichester.
Kenobi, K., Dryden, I. L., Le, H., 2010. Shape curves and geodesic modeling. Advances in Applied Probability 97, 567–584.
Kent, J. T., Mardia, K. V., Morris, R. J., Aykroyd, R. G., 2001. Functional models of growth for landmark data. In: Proc. Functional and Spatial Data Analysis. University of Leeds Press, pp. 109–115.
Kume, A., Dryden, I. L., Le, H., 2007. Shape-space smoothing splines for planar landmark data. Biometrika 94, 513–528.
Liu, X., Srivastava, A., Gallivan, K., 2004. Optimal linear representations of images for object recognition. IEEE Transactions on PAMI 26 (5), 662–666.
Machado, L., Leite, F. S., 2006. Fitting smooth paths on Riemannian manifolds. Int. J. Appl. Math. Stat. 4, 25–53.
Machado, L., Leite, F. S., Hüper, K., 2006. Riemannian means as solutions of variational problems. LMS J. Comput. Math. 9, 86–103.
Mardia, K. V., Jupp, P. E., 2000. Directional Statistics (2nd edition). John Wiley & Sons.
Moakher, M., 2002. Means and averaging in the group of rotations. SIAM Journal on Matrix Analysis and Applications 24 (1), 1–16.
Moakher, M., 2005. A differential geometric approach to the geometric mean of symmetric positive-definite matrices. SIAM Journal on Matrix Analysis and Applications 26 (3), 735–747.
Morris, R. J., Kent, J. T., Mardia, K. V., Fidrich, M., Aykroyd, R. G., Linney, A., 1999. Analysing growth in faces. In: Proc. International Conference on Imaging Science, Systems and Technology. pp. 404–410.
Noakes, L., Heinzinger, G., Paden, B., 1989. Cubic splines on curved spaces. IMA Journal of Mathematical Control and Information 6 (4), 465–473.
Palais, R. S., 1963. Morse theory on Hilbert manifolds. Topology 2, 299–340.
Pennec, X., Fillard, P., Ayache, N., 2006. A Riemannian framework for tensor computing. International Journal of Computer Vision 66 (1), 41–66.
Popiel, T., Noakes, L., 2007. Bézier curves and C^2 interpolation in Riemannian manifolds. Journal of Approximation Theory 148 (2), 111–127.
Rathi, Y., Vaswani, N., Tannenbaum, A., Yezzi, A., 2007. Tracking deforming objects using particle filtering for geometric active contours. IEEE Transactions on PAMI 29 (8), 1470–1475.
Samir, C., Absil, P.-A., Srivastava, A., Klassen, E., 2010. A gradient-descent method for curve fitting on Riemannian manifolds. Foundations of Computational Mathematics, in review.
Schwartzman, A., 2006. Random ellipsoids and false discovery rates: statistics for diffusion tensor imaging data. Ph.D. thesis, Stanford University.
Solo, V., 2009. On nonlinear state estimation in a Riemannian manifold. In: Proc. 48th IEEE Conference on Decision and Control. pp. 8500–8505.
Srivastava, A., Jermyn, I. H., Joshi, S. H., 2007. Riemannian analysis of probability density functions with applications in vision. In: IEEE Conference on Computer Vision and Pattern Recognition. pp. 1–8.
Srivastava, A., Klassen, E., 2004. Bayesian and geometric subspace tracking. Advances in Applied Probability 136 (1), 43–56.
Srivastava, A., Klassen, E., Joshi, S. H., Jermyn, I. H., 2010. Shape analysis of elastic curves in Euclidean spaces. IEEE Transactions on PAMI, accepted for publication.
Tompkins, F., Wolfe, P. J., 2007. Bayesian filtering on the Stiefel manifold. In: 2nd IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing. pp. 261–264.
Vaswani, N., Rathi, Y., Yezzi, A., Tannenbaum, A., 2010. Deform PF-MT: Particle filter with mode tracker for tracking nonaffine contour deformations. IEEE Transactions on Image Processing 19 (4), 841–857.
Vaswani, N., Roy-Chowdhury, A. K., Chellappa, R., 2005. "Shape Activity": a continuous-state HMM for moving/deforming shapes with application to abnormal activity detection. IEEE Transactions on Image Processing 14 (10), 1603–1616.
Veeraraghavan, A., Srivastava, A., Roy-Chowdhury, A. K., Chellappa, R., 2009. Rate-invariant recognition of humans and their activities. IEEE Transactions on Image Processing 8 (6), 1326–1339.
