Ergodic Algorithms on Special Euclidean Groups for ATR Anuj Srivastava, Michael I. Miller and Ulf Grenander
Abstract
The variabilities in orientations and positions of rigid objects can be modeled by applying rotation and translation groups on their surface manifolds. Following the deformable template theory, the rigid templates, given by two-dimensional surface descriptions, are rotated and translated to conform to individual objects in a particular scene. The fundamental group generating rigid motion is the special Euclidean group $SE(n)$, the semi-direct product of the special orthogonal group $SO(n)$ and the translation group $\mathbb{R}^n$. Under this model the scene representations take values in Cartesian products of the curved Lie group, $SE(n)^J$. Given the observations of a scene obtained from a set of standard remote sensors, we generate the conditional mean estimates of transformation groups modeling that scene. Techniques based on ergodic jumping stochastic gradient flows are developed which accommodate the curved geometry of these groups. Algorithms and simulation results are presented in the context of pose estimation in $SO(2)$, $SO(3)$ and target trajectory estimation in $SE(3)$.
Keywords: pattern theory, jump-diffusion processes, Monte-Carlo sampling, Lie groups, stochastic flows
1. Introduction

Footnote: This paper appears in the book Systems and Control in the Twenty-First Century, edited by Byrnes, Datta, Gilliam and Martin, published by Birkhäuser, Boston. This work is supported by ARO DAAH04-95-1-0494, ARO DAALO3-92-G-0115, ONR N00014-95-1-0859, and ARL MDA972-93-1-0012.

Footnote: Anuj Srivastava and Michael I. Miller are with the Department of Electrical Engineering, Washington University, St. Louis, MO 63130; Ulf Grenander is with the Division of Applied Mathematics, Brown University, Providence, RI 02912.

This paper describes a technique for estimating motions of rigid targets based on the deformable template representations of complex scenes. The efficient modeling of representations for variabilities manifested by objects,
shapes and scenes supporting invariant recognition is crucial. There are several kinds of variabilities fundamental to these representations: (i) the variability in target pose and placement, and (ii) variability in target numbers and identities. To model the first variability, templates are constructed from two-dimensional CAD surfaces representing the rigid objects. Using the deformable template approach, these templates are varied via the basic rigid transformations involving the translation and rotation groups. Since complex scenes are composed of multiple moving targets, the complete scene transformations are finite Cartesian products of these Lie groups. Given a set of observations of a particular scene, the inference constitutes generating minimum mean squared error (MMSE) estimates and, hence, optimizing on the curved geometry of Lie manifolds.

To account for the other variability, namely the variable target numbers and their identities, the algorithm is equipped with jump transitions over the set of discrete parameters modeling those variations. Similar to a Poisson process, these jump transitions are performed at random, exponentially separated times but with the transition measure specified by the a-posteriori probability associated with the scene. These jumps result in implicit solutions of the classical detection and recognition steps in the automated target recognition (ATR) problem as described in our earlier work [Sri93, MSG95, SMG95].

The goal of this paper is to demonstrate an intrinsic geometric technique for constructing stochastic flows through the curved manifolds of Lie groups. The target types and numbers are, therefore, assumed known and fixed, with the focus of inference being only on the continuum of orientations and translations. The authors have previously presented a random sampling algorithm based on the jump-diffusion processes for target tracking/recognition [MSG95, SMG95].
This work modeled the flat parameter spaces based on the conventional representation of rigid orientations through the Euler angles (pitch, yaw and roll). In modeling rigid motions, the parameters are more naturally restricted to curved manifolds; see e.g. [SGJM96, SFPss, SPss, Bro72, Dun90]. Due to the constraints on the rigid shapes and motions of the objects, these representations take values in Lie groups such as the special Euclidean group $SE(n) \equiv SO(n) \ltimes \mathbb{R}^n$, $n = 2, 3$, where $\ltimes$ stands for the semi-direct product (pg. 92 [Boo86]). $SE(n)$ forms a Lie group with group operation given by

$$(O_1, p_1) \cdot (O_2, p_2) = (O_1 O_2,\; O_1 p_2 + p_1), \qquad O_1, O_2 \in SO(n),\; p_1, p_2 \in \mathbb{R}^n. \tag{1.1}$$
This operation can also be expressed via $(n+1) \times (n+1)$ matrix multiplication,

$$\begin{bmatrix} O_1 & p_1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} O_2 & p_2 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} O_1 O_2 & O_1 p_2 + p_1 \\ 0 & 1 \end{bmatrix}. \tag{1.2}$$

In general, the parameterizations consist of multiple targets, both ground- and air-based, requiring estimation in the space $SE(n)^J$, $n = 2, 3$, $J$ being the total number of observations of all targets. This paper extends the jump-diffusion methodology to the curved geometry of matrix Lie groups, $SE(n)^J$ in particular. The basic issue is that the curved manifolds are fundamentally different from the flat Euclidean spaces. We will essentially be studying stochastic gradients on the curved manifolds $SE(n)^J$ to optimize some cost function. To illustrate, define a $C^1$ function $H : \mathbb{R}^n \to \mathbb{R}_+$ and examine the ordinary differential equation (ODE) in $\mathbb{R}^n$,

$$\frac{d\phi(t)}{dt} = -\nabla H(\phi(t)). \tag{1.3}$$
The solution $\phi : [0, \infty) \to \mathbb{R}^n$, given by $\phi(t + \delta) = \phi(t) + \int_t^{t+\delta} -\nabla H(\phi(s))\, ds$, can be viewed as a gradient curve in $\mathbb{R}^n$ parameterized by a single parameter $t$. In preparation for curved manifolds we rewrite this using representations associated with the tangent spaces, the tangent vectors being studied via their actions on smooth functions on the manifold. For example, a vector $a = [a_1\ a_2\ \ldots\ a_n]^\dagger$ in $\mathbb{R}^n$ is described through the directional derivative $Af = \sum_{i=1}^n a_i \frac{\partial f}{\partial x_i}$ for all smooth functions $f$. In this sense the vector $a$ is said to be equivalent to the derivative operator $A = \sum_{i=1}^n a_i \frac{\partial}{\partial x_i}$. The action of $A$ on a large enough family of functions completely specifies the vector $a$. In $\mathbb{R}^n$, inherent to its flat geometry is the fact that the basis vectors of the tangent space at each point are identical (left panel of Figure 1), and are given by the direct partial derivatives $\frac{\partial}{\partial x_i}$, $i = 1, \ldots, n$. In vector notation we denote them as $\frac{\partial}{\partial x_i} \equiv Y_i \in \mathbb{R}^n$, where $Y_i$ is an $n$-vector with 1 at the $i$-th coordinate and 0's elsewhere. In this notation the ODE (Eqn. 1.3) may be rewritten as

$$\frac{d\phi(t)}{dt} = -\sum_{i=1}^n (Y_{i,\phi(t)} H)\, Y_i, \quad \text{where } Y_{i,\phi(t)} H \equiv \frac{\partial H}{\partial x_i}\Big|_{\phi(t)}. \tag{1.4}$$

The vector $\frac{d\phi(t)}{dt}$ is interpreted as the velocity vector of a particle moving along the curve $\phi(t)$ in $\mathbb{R}^n$ and is an element of the tangent space of the manifold with the basis $Y_1, Y_2, \ldots, Y_n$. For small positive $\delta$ it gives

$$\phi(t + \delta) = \phi(t) + \delta \Big( -\sum_{i=1}^n (Y_{i,\phi(t)} H)\, Y_i \Big) + o(\delta), \tag{1.5}$$
defining a parameterized curve in $\mathbb{R}^n$. Notice that the incremental translations in $\mathbb{R}^n$ are generated by component-wise addition.

Figure 1: The basis elements of the tangent space are identified at all points for Euclidean spaces, while for arbitrary manifolds the basis vectors are defined separately at each point.

We are concerned with inferences on the curved manifolds $X = SE(n)^J$ where, due to curvature, the usual notions of derivatives and translations have to be modified. Examining Eqn. 1.5, which is appropriate for $\mathbb{R}^n$, demonstrates two fundamental departures as we move from the flat Euclidean spaces to the curved Lie groups:

1. The first issue is that the tangent vectors at every point on a curved manifold can be different, shown in the right panel of Figure 1 for a torus. In other words, $Y_i \equiv \frac{\partial}{\partial x_i}$ is not generally tangent to $X$ at all $x \in X$. This implies that the tangent spaces at each point must be explicitly tracked. The locally Euclidean property is utilized to define tangent spaces at each point on the manifold. In addition, a Lie group has a further structure, called parallelizability, where the tangent spaces can be defined across the whole space in a unified manner. First, the tangent space at the identity element is defined, and the tangent spaces at all other points are just its rotated versions in the sense made precise later.
2. The second aspect fundamental to curved Lie groups is that addition of two elements $(g_1 + g_2)$ does not define translations on $X$. Instead, $+$ is replaced by the group operation, $g_1 \cdot g_2 \in X$ if $g_1, g_2 \in X$, to describe translations on Lie groups.

The rigorous mathematical theory for constructing stochastic flows on $SE(n)^J$ is presented in [SGJM96, Sri96]; herein we focus only on the algorithmic and implementation aspects of these methods illustrated by two
specific ATR scenarios: (i) ground-based rigid objects such as tanks, trucks, jeeps, and (ii) flying objects such as airplanes, helicopters, missiles. Section 2 explains the parameterization of rigid motion using group actions. The geometry of $SE(n)$ is studied in Section 3, leading to tools for constructing stochastic flows. Section 4 sets up a Bayesian formulation of the problem. The algorithm for solving the motion estimation problem in this Bayesian context is outlined in Section 5, with the implementation results for various specific scenarios presented in Section 6.
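The group operation of Eqn. 1.1 and its matrix form in Eqn. 1.2 can be illustrated numerically. The following is a minimal sketch for $SE(2)$, assuming NumPy; the helper names `se_compose`, `se_to_matrix` and `rot2` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def se_compose(O1, p1, O2, p2):
    """Pairwise rule of Eqn. 1.1: (O1, p1) . (O2, p2) = (O1 O2, O1 p2 + p1)."""
    return O1 @ O2, O1 @ p2 + p1

def se_to_matrix(O, p):
    """Embed (O, p) as the (n+1) x (n+1) block matrix used in Eqn. 1.2."""
    n = O.shape[0]
    M = np.eye(n + 1)
    M[:n, :n] = O
    M[:n, n] = p
    return M

def rot2(theta):
    """Planar rotation by angle theta, an element of SO(2)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Two arbitrary elements of SE(2) (illustrative values only).
O1, p1 = rot2(0.3), np.array([1.0, 2.0])
O2, p2 = rot2(-1.1), np.array([-0.5, 0.25])

# Compose with the pair rule and with the matrix product.
O12, p12 = se_compose(O1, p1, O2, p2)
M12 = se_to_matrix(O1, p1) @ se_to_matrix(O2, p2)
print(np.allclose(se_to_matrix(O12, p12), M12))
```

The two routes give the same element, which is why the later sections can treat $SE(n)$ as a matrix Lie group.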
2. Representations via Deformable Templates: Group Actions

Representation is the core element of image understanding and therefore recognition. The efficient generation of compact models for representations of object shapes supporting invariant recognition is crucial. The infinity of poses manifested by rigid motions during scene evolution should be accommodated. This is accomplished using the deformable template representations in the following way. Define two target classes:
$$A_{ground} = \{\mathrm{truck}, \mathrm{T72}, \mathrm{MT1}, \mathrm{jeep}, \ldots\}, \qquad A_{air} = \{\mathrm{F16}, \mathrm{X29}, \mathrm{767}, \mathrm{cessna}, \ldots\},$$

with $A = A_{ground} \cup A_{air}$. Let $I(\alpha)$ be the complete physical description of the object $\alpha \in A$ including its shape, size, reflectivity, thermodynamic profile, etc., i.e. $I(\alpha)$ incorporates all the target attributes reflected in the sensor outputs. For the sensors considered here, $I(\alpha)$ will correspond to the set of CAD surface models consisting of a finite polygonal patch description, including the vertices and the normals, and their reflectivity properties. Figure 2 shows the CAD surfaces for some objects used in the simulations later.

Figure 2: CAD surface models for various targets.

Individual targets in a given scene are modeled by the action of transformation groups on the associated rigid surface models; the groups for modeling rigid motion are the rotation and the translation groups. All
possible occurrences of targets form a homogeneous space, i.e. for any target attribute (pose and position) there exists a rotation and translation transformation pair whose action models that target occurrence. The dimensions of these transformation spaces are specified by the allowed degrees of motion for a given object.

1. For ground-based objects, such as tanks, trucks, and jeeps, the motion allows fixed-axis rotations and two-dimensional translations. The translations are parameterized by the elements of $\mathbb{R}^2$, though the rotation can be equivalently parameterized by elements of the special orthogonal group $SO(2) = \{g \in \mathbb{R}^{2 \times 2} : g^\dagger g = I, \det(g) = 1\}$, or the circle $S^1 = \{g \in \mathbb{R}^2 : g^\dagger g = 1\}$, or the torus $T^1 = \{g \in [0, 2\pi] : 0, 2\pi \text{ identified}\}$. Note that these groups can all be uniquely identified with each other. We choose $SO(2)$ to represent the orientations of ground-based objects. Shown in Figure 3 are three orientations of a tank template.
Figure 3: A tank template transformed by three elements of the rotation group $SO(2)$.

The composite transformation (rotation and translation) is parameterized by elements of the special Euclidean group $SE(2) \equiv SO(2) \ltimes \mathbb{R}^2$.

2. For rigid motion in $\mathbb{R}^3$, the orientation can be represented in several ways [Kan], for example by unit quaternions ($S^3$), Cayley-Klein parameters ($SU(2)$), Euler angles ($S^2 \times S^1$), or rotation matrices ($SO(3)$), not all of them Lie groups. To obtain uniqueness of parameterization and a simple law of composition we employ the special orthogonal group, $SO(3)$. The motion of a rigid template is given by the special Euclidean group $SE(3) \equiv SO(3) \ltimes \mathbb{R}^3$, the semi-direct product of $SO(3)$ and $\mathbb{R}^3$. Figure 4 shows a rendering of a transformed template with the transformations parameterized by elements of the translation and rotation groups.

The trajectories of moving targets are parameterized by the finite
Cartesian products of the basic unit $SE(n)$.

Figure 4: Shown in the left panel is an airplane template transformed by a rotation $O \in SO(3)$ and a translation $p \in \mathbb{R}^3$ as shown in the right panel.

For a moving object, the object positions in $\mathbb{R}^n$ and orientations in $SO(n)$ parameterize the group transformations at each time along the path. The complete trajectory of object motion observed at $J$ sample times is parameterized by the elements of $SE(n)^J$. The concatenation, in the natural time sequence, of segments formed during target motion forms a linear graph called a track, as shown for an airplane trajectory in Figure 5.

Figure 5: Discrete representation of a flight path: the concatenation of elements from the motion group $SE(3)$ at sample times forms a linear graph structure.

In a general scene consisting of multiple moving targets, both on ground and in air, the estimation problem is posed on the parameter space

$$X = SE(2)^{J_1} \times SE(3)^{J_2}, \qquad J = J_1 + J_2, \tag{2.1}$$

where $J_1$ components describe ground-based objects and $J_2$ components describe flying objects, $J$ being the total number of motion components.

Remark: More generally, the transformations include some discrete components to accommodate the variability in the complexity of the scene,
including the parametric dimension $J$ (model order) as well as the target identities $\alpha \in A$. To restrict the analysis to inferences via stochastic flows, throughout this paper $J$ and $\alpha$ will be assumed known and fixed.
3. Optimization Via Stochastic Flows

Estimating transformation parameters from the observed images becomes optimization on the constrained manifolds of the representation. In the literature, there exists a large class of constrained optimization techniques (such as conjugate gradient, Newton's, Lagrangian) with solutions posed on Euclidean spaces, $\mathbb{R}^n$. As described in [Smi93], these extrinsic techniques depend on embedding the constraint surfaces in bigger Euclidean spaces $\mathbb{R}^n$ and utilizing one of the two standard variational optimization methods: (i) projective methods, where the solutions are evaluated in $\mathbb{R}^n$ and then projected onto the surface, and (ii) Lagrangian methods, where the cost function is the original function plus a constraint term. Because they do not utilize the underlying geometric structure of the constraint surface, these extrinsic methods are often less efficient than the geometry-based intrinsic methods. We rely most fundamentally on the basic tools from differential geometry for the variational calculus on these manifolds. We start with the basic geometric features of the curved component of $SE(n)$, the special orthogonal group $SO(n)$.
3.1 Tangent Vector Fields on SO(n) Generating Flows

The goal is to construct the equivalent of Eqns. 1.4 and 1.5 for the matrix Lie group $SO(n)$. We utilize the fact that Lie groups are parallelizable, i.e. the tangent vectors $Y_{1,g}, Y_{2,g}, \ldots, Y_{n,g}$ at $g \in SO(n)$ can all be generated by a rotation on the tangent vectors at the identity $e \in SO(n)$, $Y_{1,e}, Y_{2,e}, \ldots, Y_{n,e}$. Using the basic results from [Boo86, Spi79], we now establish this tangent structure on $SO(n)$. Denote the vector space of tangents to the manifold $SO(n)$ at a point $g$ by $T_g(SO(n))$. The dimension $d(n)$ of $SO(n)$ is given by $n(n-1)/2$, so $d(2) = 1$, $d(3) = 3$, and so on. It can be shown (see pg. 150 [Boo86]) that the tangent space at the identity, $T_e(SO(n))$, may be identified with the space of $n \times n$ skew-symmetric matrices, with the basis elements $Y_{i,e}$, $i = 1, \ldots, d(n)$, resulting in:
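1. For $SO(2)$, $d(2) = 1$ with one basis element

$$Y_{1,e} = \frac{1}{\sqrt{2}} \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}. \tag{3.1}$$

2. For $SO(3)$, $d(3) = 3$ with three basis elements

$$Y_{1,e} = \frac{1}{\sqrt{2}} \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad Y_{2,e} = \frac{1}{\sqrt{2}} \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{bmatrix}, \quad Y_{3,e} = \frac{1}{\sqrt{2}} \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}. \tag{3.2}$$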
To extend this tangent structure to other points, define a left rotation parameterized by $g \in SO(n)$ as $L_g(h) = g \cdot h$, the $n \times n$ matrix product. Due to continuity of the group operation (matrix product in this case), each vector $Y_{i,e}$ determines uniquely a vector field $Y_i$ on the whole space $SO(n)$, i.e. evaluated at any point $g \in SO(n)$ this field gives a tangent vector $Y_{i,g} = L_g(Y_{i,e}) = g \cdot Y_{i,e}$ in $T_g(SO(n))$. In addition, the fields $Y_i$ have the property that for any $g \in SO(n)$ the vectors $Y_{i,g}$ form an orthonormal basis of the tangent space $T_g(SO(n))$ under the metric

$$\langle Y_1, Y_2 \rangle = \mathrm{tr}(Y_1^\dagger \cdot Y_2), \quad \text{for any } Y_1, Y_2 \in T_g(SO(n)),$$

where $\mathrm{tr}(\cdot)$ is the matrix trace and the composition rule $(\cdot)$ is $n \times n$ matrix multiplication. As an example, for $g \in SO(2)$, define the left rotation

$$L_g(h) = g \cdot h = \begin{bmatrix} g_1 & -g_2 \\ g_2 & g_1 \end{bmatrix} \begin{bmatrix} h_1 & -h_2 \\ h_2 & h_1 \end{bmatrix}, \quad \text{for all } h \in SO(2).$$

Then, the rotated tangent vector $L_h(Y_{1,e}) = h \cdot Y_{1,e}$ is tangent to $SO(2)$ at $h$. Hence, $Y_{1,e}$ generates a vector field over the complete set $SO(2)$ described by the rotation $Y_{1,h} = L_h(Y_{1,e})$.

The difficulty of forcing the estimation processes to stay on the manifold $X = SE(n)^J$ or $SO(n)^J$ is handled by constructing flows as the solutions of the ODE built from the tangent basis derived above. A vector field $Y$ on a manifold $X$ is said to generate a flow, $t \to \phi(t) \in X$, if the velocity vector $\frac{d\phi(t)}{dt}$ at any point on the curve is equal to the vector field evaluated at that point. We use the fact that every smooth vector field $Y$ generates a smooth
flow $\phi(t)$ on the manifold (see pg. 61 [SW86]). Let $H_1 : SO(n) \to \mathbb{R}_+$ and $H_2 : \mathbb{R}^n \to \mathbb{R}_+$ be two $C^1$ functions and let $Y_1, \ldots, Y_{d(n)}$ be the orthonormal vector fields on $SO(n)$ derived above ($Y_{i,g} = g \cdot Y_{i,e}$). The vector fields $-\sum_{i=1}^{d(n)} (Y_{i,g} H_1)\, Y_{i,g}$ and $-\sum_{i=1}^{n} \frac{\partial H_2}{\partial x_i} \frac{\partial}{\partial x_i}$ generate flows satisfying the equations

$$\frac{d\phi_1(t)}{dt} = -\sum_{i=1}^{d(n)} (Y_{i,\phi_1(t)} H_1)\, Y_{i,\phi_1(t)}, \qquad \frac{d\phi_2(t)}{dt} = -\nabla H_2(\phi_2(t)). \tag{3.3}$$

Call $\phi_1(t)$ and $\phi_2(t)$ the deterministic gradient flows on $SO(n)$ and $\mathbb{R}^n$, respectively. For matrix Lie groups, these flows are given by exponential maps, i.e. for $A \in T_e(SO(n))$, an $n \times n$ skew-symmetric matrix, the parameterized curve $\phi : t \to g \cdot \exp(At)$ is a flow on $SO(n)$ with its generator given by the vector field $Y_g = g \cdot A$. The matrix exponential is given by the infinite series (see [Cur79]) $\exp(A) = e + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \ldots$, and for a skew-symmetric matrix $A$, $\exp(A)$ is an orthogonal matrix ([Boo86, Cur79]).
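The tangent structure of this subsection can be checked numerically. The sketch below, assuming NumPy, builds the basis of Eqn. 3.2, verifies orthonormality under the trace metric at the identity and at a left-translated point, and confirms that a curve of the form $g \cdot \exp(At)$ stays on $SO(3)$. The function names `inner` and `expm_series` are hypothetical, and the exponential is a plain truncated series rather than a library routine.

```python
import numpy as np

s = 1.0 / np.sqrt(2.0)
# Basis of the 3 x 3 skew-symmetric matrices, as in Eqn. 3.2.
Y = [
    s * np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 0.]]),
    s * np.array([[0., 0., 1.], [0., 0., 0.], [-1., 0., 0.]]),
    s * np.array([[0., 0., 0.], [0., 0., -1.], [0., 1., 0.]]),
]

def inner(A, B):
    """Trace metric <A, B> = tr(A^T B)."""
    return np.trace(A.T @ B)

def expm_series(A, terms=30):
    """Truncated series e + A + A^2/2! + A^3/3! + ... (adequate for small norms)."""
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

# Gram matrix of the basis at the identity.
gram_e = np.array([[inner(a, b) for b in Y] for a in Y])

# Left-translate the basis to another point g of SO(3): Y_{i,g} = g Y_{i,e}.
g = expm_series(0.7 * Y[0] + 0.2 * Y[2])
Yg = [g @ y for y in Y]
gram_g = np.array([[inner(a, b) for b in Yg] for a in Yg])

# A curve t -> g exp(A t) generated by a skew-symmetric A.
A = 0.4 * Y[0] - 1.3 * Y[1] + 0.9 * Y[2]
curve = [g @ expm_series(A * t) for t in np.linspace(0.0, 2.0, 5)]

print(np.allclose(gram_e, np.eye(3)), np.allclose(gram_g, np.eye(3)))
print(all(np.allclose(h.T @ h, np.eye(3)) for h in curve))
```

Because every point of the curve is obtained by a group multiplication, no projection back onto the manifold is needed, which is the practical benefit of the intrinsic construction.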
3.2 Stochastic Flows

The same framework can be extended to generate a diffusion process as a stochastic flow by adding noise terms in Eqn. 3.3 [Kun84, AM92, SGJM96]. Basically, a diffusion process is simulated through randomization via addition of independent random perturbations to the directional derivative terms in the directions given by the associated tangent vectors. Define stochastic flows on the $(3J_1 + 6J_2)$-dimensional product group $X = SE(2)^{J_1} \times SE(3)^{J_2}$ as follows. Consider a $C^1$ function $H : X \to \mathbb{R}_+$; then the resulting flow has $J_1$ components on $SE(2)$ and $J_2$ components on $SE(3)$, i.e.

$$\phi(t) = [\phi^{(1)}(t), \phi^{(2)}(t), \ldots, \phi^{(J_1)}(t), \phi^{(J_1+1)}(t), \phi^{(J_1+2)}(t), \ldots, \phi^{(J_1+J_2)}(t)].$$
For $j = 1, 2, \ldots, J_1$ the components $\phi^{(j)}(t) = [\phi_1^{(j)}(t), \phi_2^{(j)}(t)] \in SE(2)$ satisfy the equations, obtained by particularizing Eqn. 3.3 for $n = 2$,

$$\begin{aligned}
\phi_1^{(j)}(t) &= \phi_1^{(j)}(t_0) + \int_{t_0}^{t} \Big[ -\big(Y^{(j)}_{1,\phi_1^{(j)}(s)} H\big)\, Y^{(j)}_{1,\phi_1^{(j)}(s)}\, ds + Y^{(j)}_{1,\phi_1^{(j)}(s)} \circ dW_1^{(j)}(s) \Big], \\
\phi_2^{(j)}(t) &= \phi_2^{(j)}(t_0) + \int_{t_0}^{t} \Big[ -\nabla_p H\big(\phi_2^{(j)}(s)\big)\, ds + dW_2^{(j)}(s) \Big],
\end{aligned} \tag{3.4}$$

where $W_1^{(j)}(t) \in \mathbb{R}$, $W_2^{(j)}(t) \in \mathbb{R}^2$ are standard Wiener processes and $\circ$ denotes the Stratonovich interpretation of the stochastic integral. Here $Y^{(j)}_{1,\phi_1^{(j)}(t)} H$ is the directional derivative of $H$ in the direction tangent to the $j$-th orthogonal group component of $X$; $\nabla_p H$ is the gradient with respect to the position vector in the $j$-th group component of $X$. Similarly for $j = J_1 + 1, \ldots, J_1 + J_2$, the components $\phi^{(j)}(t) = [\phi_1^{(j)}(t), \phi_2^{(j)}(t)] \in SE(3)$
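are generated by the SDEs

$$\begin{aligned}
\phi_1^{(j)}(t) &= \phi_1^{(j)}(t_0) + \int_{t_0}^{t} \Big[ -\sum_{i=1}^{3} \big(Y^{(j)}_{i,\phi_1^{(j)}(s)} H\big)\, Y^{(j)}_{i,\phi_1^{(j)}(s)}\, ds + \sum_{i=1}^{3} Y^{(j)}_{i,\phi_1^{(j)}(s)} \circ dW_i^{(j)}(s) \Big], \\
\phi_2^{(j)}(t) &= \phi_2^{(j)}(t_0) + \int_{t_0}^{t} \Big[ -\nabla_p H\big(\phi_2^{(j)}(s)\big)\, ds + dW_2^{(j)}(s) \Big],
\end{aligned} \tag{3.5}$$

where $Y^{(j)}_{i,\phi_1^{(j)}(t)} H$, $i = 1, 2, 3$, are the directional derivatives of $H$ in the three basis directions tangent to the $j$-th orthogonal group component of $X$.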
3.3 Reference Measures on SO(n)

To reference probability measures and to evaluate expectations one needs to define some base (or reference) measure on the underlying space, $X$. For the flat component, $\mathbb{R}^n$, the Lebesgue measure provides the reference; in geometry it is expressed as the reference volume element $dx_1 \wedge dx_2 \wedge \ldots \wedge dx_n$. For the curved component $SO(n)$ the reference measure needs to be specified. Being a compact, connected Lie group, $SO(n)$ has a unique bi-invariant volume element which forms the base measure for the Bayesian formulation of the inference problem. This volume element on $SO(n)$ can be expressed in the desired local coordinates as follows.

1. In $SO(2)$, the invariant volume form in terms of the rotation angle $\theta$ is given by $d\theta$. To evaluate the invariant form in the Cartesian coordinates, $\begin{bmatrix} x_1 & -x_2 \\ x_2 & x_1 \end{bmatrix} \in SO(2)$, such that $x_1^2 + x_2^2 = 1$, substitute $x_1 = \cos(\theta)$ and $x_2 = \sin(\theta)$, resulting in $-x_2\, dx_1 + x_1\, dx_2$.

2. In $SO(3)$, there are several choices for local charts, e.g. Euler angles, exponential coordinates, quaternions. For use in deriving dynamics-based prior measures on rigid motions, we are interested in the volume form on $SO(3)$ in terms of local coordinates given by the exponential map. Rigid body dynamics are naturally expressed in these exponential coordinates associated with the body-frame angular velocities of the rotating object. The chart $\exp : \mathbb{R}^3 \to SO(3)$ relating the exponential coordinates to the elements of $SO(3)$ is given by

$$(q_1, q_2, q_3) \to \exp(Q), \quad \text{for } Q \text{ skew-symmetric with elements } q_1, q_2, q_3. \tag{3.6}$$
The invariant volume element on $SO(3)$ in terms of exponential coordinates is given by (see [Sri96] for proof)

$$\frac{1}{2}\, \frac{\sin^2(|q|)}{|q|^2}\; dq_1 \wedge dq_2 \wedge dq_3, \qquad |q| = \sqrt{q_1^2 + q_2^2 + q_3^2}. \tag{3.7}$$
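As a quick check of the $SO(2)$ case in item 1 of this subsection, the Cartesian expression for the invariant form follows by direct substitution. This is only a worked restatement of the claim in the text, using the same symbols $x_1$, $x_2$ and $\theta$:

```latex
\begin{aligned}
x_1 &= \cos(\theta), \qquad x_2 = \sin(\theta), \\
dx_1 &= -\sin(\theta)\, d\theta, \qquad dx_2 = \cos(\theta)\, d\theta, \\
-x_2\, dx_1 + x_1\, dx_2 &= \sin^2(\theta)\, d\theta + \cos^2(\theta)\, d\theta = d\theta .
\end{aligned}
```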
4. Posterior for Object Tracking

Taking a Bayesian approach, the variability and uncertainty of the transformation parameters is represented by the observed data ensemble contributing to the posterior measure on $X = SE(n)^J$. We begin by defining a prior density $\pi_0$ on $X$. The underlying ideal scene identified with its pose $x \in X$ cannot be observed directly; $x \in X$ is sensed through the remote sensor as $y \in Y$, $Y$ being the observation space. Since the data are assumed random we characterize them via a statistical transition law, called the likelihood function $L(\cdot|\cdot) : X \times Y \to \mathbb{R}$, describing completely the mapping from the input $x$ to the output $y$. The posterior measure with density $\pi$ becomes the product of the prior density $\pi_0(x)$, the density of the underlying true scene $x$, and the likelihood of the data $y$ according to

$$\pi(x|y) = \frac{\pi_0(x)\, L(y|x)}{\int_X \pi_0(x)\, L(y|x)\, \gamma(dx)} = \frac{e^{-H(x)}}{Z}, \quad x \in X, \tag{4.1}$$

where $H$ is called the posterior energy, $Z$ is the normalizer and $\gamma$ is the base measure on $X$.

To make inferences from a given observed data set, the posterior distribution is simulated by solving a stochastic flow equation on the manifolds, combined with non-local jumps to cover $X$, extending to Lie manifolds the jump-diffusion processes as described in [SGJM96, Sri96]. These stochastic flows in $X$ are constructed so that their stationary measure has the density $\pi$ on $X$ with respect to the base measure. This allows for the generation of several classical estimators such as MMSE and MAP as well as the mean and covariance statistics. Define the posterior energy
$$H(x) = E(x) + P(x), \quad x \in X, \tag{4.2}$$
where $E(x)$ reflects the energy associated with the sensor data likelihood, $e^{-E(x)} \propto L(y|x)$ in Eqn. 4.1, and $P$ is the energy associated with the prior on $X$, $e^{-P(x)} \propto \pi_0(x)$ in Eqn. 4.1.

Prior: The prior on rigid motions is based on the formulation of target dynamics using standard rigid body analysis (neglecting the earth's
curvature, motion and wind effects), through the Newtonian differential equations [Fri86]. The prior is induced via the non-linear differential operator associated with these equations of motion, assuming suitable statistical models for the forcing functions, as described in [MSG95, Sri96, SGJM96].

Data Likelihood: In ATR there are generally several possible remote sensors providing simultaneous data for the inference. Two examples are: (i) low-resolution trackers used for global detection and azimuth-elevation tracking of unresolved targets, and (ii) high-resolution optical sensors, for collecting detailed information on pose and identity of the target.

1. For pose sensing and target identification a high-resolution optical imaging system (see Figure 6) is used. It registers a two-dimensional projection of target profiles on the camera focal plane. For typical
optical imaging systems [SHW93], the observation is given by the orthographic projection convolved with the point spread function of the camera, plus additive noise. Under additive white Gaussian noise models the imaged data is assumed to be a Gaussian random field with mean field given by the target's projection profile. Some data samples are shown in Figure 7.

Figure 6: The far-field orthographic imaging system for observing the targets at a high resolution, taking the target scene and projecting onto the 2-D lattice with additive noise.

2. For position tracking, a cross array of 64 narrowband, isotropic sensors (32 elements in each direction at half-wavelength spacing) is assumed as in [MSG95, SMG95], using the standard narrowband signal
model developed in [Sch81]. The phase lags of signals received at the sensor elements provide information about the source locations. For the additive white Gaussian noise model, the data samples are complex Gaussian vectors with the mean given by the signal component.

Figure 7: Sample data sets for the ATR problem: the left panel shows the high-resolution video data for a truck, the middle panel displays an airplane and the right panel shows a tank image.
5. Ergodic Algorithm on $X = SE(2)^{J_1} \times SE(3)^{J_2}$
Having obtained a posterior measure $\mu(dx) = \frac{e^{-H(x)}}{Z}\, \gamma(dx)$ on the scene representations taking values in $X = SE(2)^{J_1} \times SE(3)^{J_2}$, a jump-diffusion process is constructed to generate the MMSE estimates. This Markov process $X(t)$ satisfies jump-diffusion dynamics through $X$ in the sense that (i) on random exponential times the process jumps across $X$, and (ii) between jumps it follows SDEs of the type Eqns. 3.4, 3.5 generating a diffusion process. Formally stated, let $t_0, t_1, \ldots$ be independent and exponentially distributed with parameter $\lambda$ (a constant). Let $\{\tilde{X}(i), i = 1, \ldots\}$ be a Markov chain in $X$ with some transition function $Q(x, F)$, i.e. $P\{\tilde{X}(i+1) \in F \mid \tilde{X}(1), \ldots, \tilde{X}(i)\} = Q(\tilde{X}(i), F)$, $F \subset X$. Then define a process $X(t)$ on $X$ by, for $i = 1, 2, \ldots$,
[...] for $\epsilon > 0$ small enough,

$$Y_g H \approx \frac{1}{\epsilon} \big( H(\phi_g(\epsilon)) - H(g) \big). \tag{5.3}$$
As mentioned earlier, in the case of the orthogonal groups, $\phi$ is given by the exponential map, i.e. $\phi_g(t) = g \cdot \exp(Y_e t)$. The formulas to evaluate the matrix exponentials for $SO(n)$, $n = 2, 3$, are given below.

2. On a computer the SDE in Eqn. 5.1 is approximated by a stochastic difference equation as follows: define $\phi_i(t)$, $-\infty < t < \infty$, $i = 0, 1, \ldots, m$, as the flows generated by the vector fields $Y_i$ used in Eqn. 5.1. Let $w_1, w_2, \ldots, w_m$ be independent standard Brownian motions. Define a composite flow,

$$\Gamma(t_1, t_2)x = \phi_m(w_m(t_2) - w_m(t_1)) \cdots \phi_1(w_1(t_2) - w_1(t_1))\, \phi_0(t_2 - t_1)\, x.$$

Choose $\Delta > 0$ and consider the discrete time Markov process

$$X(k\Delta) = X((k-1)\Delta)\, \Gamma((k-1)\Delta, k\Delta), \tag{5.4}$$

where $X(0) = x_0$. It is shown in [Ami91] that this approximation approaches the diffusion in Eqn. 5.1 when $\Delta \to 0$, over finite time intervals. As mentioned earlier, in the case of matrix Lie groups the flow is given by the exponentiation of the corresponding tangent vectors, i.e. in $SO(n)$, $\phi_{i,g}(t) = g \cdot \exp(Y_i t)$, $i = 1, 2, 3$. For $SO(2)$, for all $A \in \mathbb{R}^{2 \times 2}$ skew-symmetric,

$$\exp(A) = \begin{bmatrix} \cos(a) & -\sin(a) \\ \sin(a) & \cos(a) \end{bmatrix}, \qquad a = A_{2,1}, \tag{5.5}$$
while for $SO(3)$, this matrix exponential can be evaluated by the formula, for all $A \in \mathbb{R}^{3 \times 3}$ skew-symmetric,

$$\exp(A) = e + \frac{\sin(a)}{a} A + \frac{1 - \cos(a)}{a^2} A^2, \qquad a = \sqrt{\frac{1}{2} \sum_{i,j} A_{i,j}^2}. \tag{5.6}$$
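The closed forms of Eqns. 5.5 and 5.6 can be compared against the defining power series. The following is a minimal sketch assuming NumPy; the names `exp_so2`, `exp_so3` and `expm_series` are hypothetical, and the small-angle fallback in `exp_so3` is an implementation choice added here, not something stated in the text.

```python
import numpy as np

def exp_so2(A):
    """Closed form of Eqn. 5.5 for a 2 x 2 skew-symmetric A, with a = A[1, 0]."""
    a = A[1, 0]
    return np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])

def exp_so3(A):
    """Closed form of Eqn. 5.6 for a 3 x 3 skew-symmetric A."""
    a = np.sqrt(0.5 * np.sum(A * A))
    if a < 1e-12:
        # Assumption: first-order fallback to avoid dividing by a tiny angle.
        return np.eye(3) + A
    return np.eye(3) + (np.sin(a) / a) * A + ((1.0 - np.cos(a)) / a**2) * (A @ A)

def expm_series(A, terms=40):
    """Reference value from the truncated series e + A + A^2/2! + ..."""
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

# Illustrative skew-symmetric inputs.
A2 = np.array([[0.0, -0.8], [0.8, 0.0]])
A3 = np.array([[0.0, -0.3, 0.5], [0.3, 0.0, -1.2], [-0.5, 1.2, 0.0]])

print(np.allclose(exp_so2(A2), expm_series(A2)))
print(np.allclose(exp_so3(A3), expm_series(A3)))
```

Using the closed forms avoids summing a series at every step of the discretized flow in Eqn. 5.4.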
5.2 Computation of the Conditional Mean

The process $X(t)$ constructed by the algorithm has the ergodic property (see [SGJM96, Sri96] for details) that the posterior $\mu$ is its unique invariant measure and, for any measurable function $f$,

$$\lim_{t \to \infty} \frac{1}{t} \int_0^t f(X(s))\, ds = \int_X f(x)\, d\mu(x).$$

Define $\|\cdot\|$ to be the Hilbert-Schmidt norm on $SO(n)$ (by embedding it in $\mathbb{R}^{n \times n}$), i.e. $\|O\|^2 = \sum_{i,j=1}^n O_{ij}^2$. Using the regular Euclidean norm on $\mathbb{R}^n$, we can extend it to $(p, O) \in SE(n)$ by $\|(p, O)\| = \|p\| + \|O\|$ and to $SE(n)^J$ by taking sums over the components. The ergodic result dictates that for the posterior $\mu$ the sample average of the distance function $f(\cdot) = \|\cdot - x\|^2$, $x \in X$, converges to its expectation,

$$\lim_{t \to \infty} \frac{1}{t} \int_0^t \|X(s) - x\|^2\, ds = \int_X \|x - y\|^2\, \mu(dy), \quad \forall x \in X. \tag{5.7}$$

Approximate the minimum mean squared error (MMSE) estimate, $\tilde{x}_N$ for $N$ assumed large, by

$$\tilde{x}_N = \arg\min_{x \in X} \frac{1}{N} \sum_{i=1}^N \|x_i - x\|^2, \tag{5.8}$$

for $x_i = X(i\Delta)$. In other words, given the samples $x_i$, the MMSE estimate corresponds to the point $\tilde{x}_N \in X$ having the least total distance from the samples $x_i$, $i = 1, 2, \ldots, N$. In $\mathbb{R}^n$ with the regular Euclidean norm this is just the sample average, $\tilde{x}_N = \frac{1}{N} \sum_{i=1}^N x_i$, but on curved manifolds the answer is different. For $SO(n)$ with the Hilbert-Schmidt norm it reduces to

$$\tilde{x}_N = \arg\max_{x \in SO(n)} \mathrm{tr}(a^\dagger x), \quad \text{where } a = \frac{1}{N} \sum_{i=1}^N x_i.$$
From [GV89], it can be shown that, if $a = u \Sigma v^\dagger$ is the singular value decomposition of $a$, then

$$\tilde{x}_N = \begin{cases} u v^\dagger, & \text{if } \det(a) \ge 0, \\[4pt] u L v^\dagger, \quad L = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & -1 \end{bmatrix}, & \text{if } \det(a) < 0. \end{cases} \tag{5.9}$$

In the case of $SO(2)$ this formula simplifies further. The samples generated on $SO(2)$ are of the type $\begin{bmatrix} \cos(\theta_i) & -\sin(\theta_i) \\ \sin(\theta_i) & \cos(\theta_i) \end{bmatrix}$ and hence the average matrix $a$ also has the structure $a = \begin{bmatrix} a_1 & -a_2 \\ a_2 & a_1 \end{bmatrix}$. Both the singular values of matrix $a$ are $\sqrt{a_1^2 + a_2^2}$ and the orthogonal matrix closest to $a$ (in Hilbert-Schmidt distance) is

$$\tilde{x}_N = \frac{1}{\sqrt{a_1^2 + a_2^2}}\, a. \tag{5.10}$$
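The estimator of Eqns. 5.9 and 5.10 amounts to averaging the sampled matrices and mapping the average back onto the rotation group. A minimal sketch assuming NumPy is given below; `project_to_son`, `mmse_rotation` and `rot2` are hypothetical names, and the sample angles are illustrative values rather than data from the paper.

```python
import numpy as np

def project_to_son(a):
    """Map a square matrix to SO(n) following Eqn. 5.9 (SVD with a sign fix)."""
    u, _, vt = np.linalg.svd(a)          # a = u @ diag(s) @ vt
    L = np.eye(a.shape[0])
    if np.linalg.det(a) < 0:
        L[-1, -1] = -1.0                 # flip the last singular direction
    return u @ L @ vt

def mmse_rotation(samples):
    """Average the samples in the embedding space, then project to SO(n)."""
    a = np.mean(samples, axis=0)
    return project_to_son(a)

def rot2(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Illustrative SO(2) samples scattered around an orientation near 1 radian.
samples = np.array([rot2(t) for t in (0.9, 1.0, 1.1, 1.25)])
estimate = mmse_rotation(samples)

# For SO(2) the same answer follows from the normalization in Eqn. 5.10.
a = samples.mean(axis=0)
closed_form = a / np.sqrt(a[0, 0] ** 2 + a[1, 0] ** 2)
print(np.allclose(estimate, closed_form))
```

Note that the plain average `a` is generally not a rotation, since averaging shrinks its singular values below one; the projection step restores orthogonality.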
6. Results

Now we examine three specific examples to illustrate the methodology: (i) estimate the orientation of a truck from an image, (ii) estimate the orientation of an airplane from an image, and (iii) estimate the trajectory of a flying airplane from a sequence of images.
6.1 Estimating Orientation of a Ground-Based Object

Assume a ground-based target with unknown orientation in $SO(2)$. The posterior measure is the product of data likelihood and Haar measure on $SO(2)$. The algorithm is as follows:

Algorithm: Let $i = 0$ and $X(0) = g_0 \in SO(2)$ be any initial condition.

1. Generate a sample $u$ of an exponential random variable with constant mean $\lambda$.

2. Diffusion: follow the approximation Eqn. 5.4 of the SDE, i.e. let $l = 0$.

(a) For $\epsilon > 0$ small enough, numerically approximate the directional derivative $Y_{1,X(i\Delta)} H$ via $\beta_1$, where

$$-\beta_1 = \frac{1}{\epsilon} \Big( H\big(X(i\Delta)\, e^{\epsilon Y_{1,e}}\big) - H(X(i\Delta)) \Big).$$

(b) Generate a Gaussian random variable, $w_1 \sim N(0, 1)$.

(c) Update the process by $X((i+1)\Delta) = X(i\Delta) \exp\big((\sqrt{\Delta}\, w_1 + \Delta \beta_1)\, Y_{1,e}\big)$. The exponential of a $2 \times 2$ skew-symmetric matrix is given in Eqn. 5.5.

(d) $i = i + 1$, $l = l + 1$; if $l < u$ then go to (a).

3. Metropolis jump move: generate a uniform random variable $\theta \in [0, 2\pi]$. Evaluate $g = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$. Calculate $H(g)$. If $H(X(i\Delta)) > H(g)$, set $X(i\Delta) = g$. Else set $X(i\Delta) = g$ with probability $e^{-(H(g) - H(X(i\Delta)))}$.

4. $i = i + 1$, go to 1.
This algorithm generates a sequence $\{X(i\Delta) \in SO(2),\ i = 1, 2, \ldots\}$ of samples from the posterior distribution, from which the MMSE estimate is generated in the following way. The Hilbert-Schmidt (H-S) norm on SO(2) is given by $\|g_1 - g_2\|^2 = 4 - 2\,\mathrm{trace}(g_1 g_2^{\dagger})$. For the samples $x(i) = X(i\Delta)$, $i = 1, 2, \ldots, N$, the MMSE estimate $\tilde{x}_N$ under the H-S norm is given by Eqn. 5.10.

This algorithm was implemented on an SGI Onyx workstation, using its graphics engine both for rendering the three-dimensional objects and for generating the orthographic projection profiles that simulate the video camera image, as shown in Figure 6. The projection of a truck, rendered at the true orientation $x_{true}$ in the left panel of Figure 9, was sampled on a $64 \times 64$ lattice, and i.i.d. Gaussian noise was added at each pixel to simulate the noisy observation shown in the middle panel. The estimation algorithm outlined above was run to generate 100 samples from the posterior; the truck rendered at the estimated orientation $\tilde{x}_{100}$ is shown in the right panel of Figure 9.

The plot in Figure 10 describes the evolution of the algorithm. The two curves correspond to the H-S distance of the samples $x_N$ and of the estimate $\tilde{x}_N$ from the true orientation as a function of $N$. The dotted curve, displaying $\|x_{true} - x_N\|^2$, shows the algorithm jumping at times $N = 1, 2, 8, 9, 21, 42$, with diffusions at other times, while the distance profile of the estimated orientation, $\|x_{true} - \tilde{x}_N\|^2$, is plotted by the dark line.
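The H-S identity quoted in this subsection follows from a short calculation, sketched here for real rotation matrices in SO(n). The angle notation $\theta_1, \theta_2$ for the two SO(2) elements is introduced only for this illustration.

```latex
\begin{align*}
\|g_1 - g_2\|^2
  &= \mathrm{trace}\big( (g_1 - g_2)(g_1 - g_2)^{\dagger} \big) \\
  &= \mathrm{trace}(g_1 g_1^{\dagger}) + \mathrm{trace}(g_2 g_2^{\dagger})
     - \mathrm{trace}(g_1 g_2^{\dagger}) - \mathrm{trace}(g_2 g_1^{\dagger}) \\
  &= 2n - 2\,\mathrm{trace}(g_1 g_2^{\dagger}),
\end{align*}
% since g g^\dagger = I_n and trace(g_2 g_1^\dagger) = trace(g_1 g_2^\dagger) for real matrices.
% For n = 2 with rotation angles \theta_1 and \theta_2:
\[
\|g_1 - g_2\|^2 = 4 - 4\cos(\theta_1 - \theta_2) \in [0, 8].
\]
```

Setting $n = 2$ gives the constant 4 used here, and $n = 3$ gives the constant 6 that appears for SO(3). On SO(2) the squared distance therefore ranges over $[0, 8]$, with the maximum attained for orientations that differ by $\pi$.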
Figure 9: The middle panel shows the noisy image of a truck rendered at the true orientation shown in the left panel. The truck is rendered at the MMSE estimate $\tilde{x}_{100}$ in the right panel.

6.2 Estimating Orientation of an Airplane
Now expand the space to SO(3) to estimate the orientation of an airplane from its noisy images. A jump-diffusion algorithm is used to search in SO(3), with the image data likelihood and the Haar measure on SO(3) contributing to the posterior measure. The jumps here correspond to moves within SO(3), with the diffusions being the solutions of the SDE on SO(3). The H-S norm is given by $\|g_1 - g_2\|^2 = 6 - 2\,\mathrm{trace}(g_1 g_2^{\dagger})$ and the corresponding MMSE estimate $\tilde{x}_N$ is given in Eqn. 5.9. The algorithm becomes:

Algorithm: Let $i = 0$ and let $X(0) \in SO(3)$ be any initial condition.
1. Generate a sample $u$ of an exponential random variable with constant mean.
2. Diffusion: follow the approximation Eqn. 5.4 of the SDE, as in step 2 of Algorithm 1, for $u$ cycles; i.e. let $l = 0$ and let $Y_{1,e}, Y_{2,e}, Y_{3,e}$ be the three orthonormal basis elements of the space of skew-symmetric matrices given in Eqn. 3.2.
   (a) For $\epsilon > 0$ small enough, numerically approximate the directional derivatives $Y_{j,X(i\Delta)} H$ by $\beta_j$, $j = 1, 2, 3$, using Eqn. 5.3.
   (b) Generate $w_1, w_2, w_3$, i.i.d. Gaussian random variables with mean zero and variance 1.
   (c) Update the process according to $X((i+1)\Delta) = X(i\Delta)\, \Gamma(X(i\Delta))$, where
       $$ \Gamma(X(i\Delta)) = \exp(\sqrt{\Delta}\, w_3 Y_{3,e}) \exp(\sqrt{\Delta}\, w_2 Y_{2,e}) \exp(\sqrt{\Delta}\, w_1 Y_{1,e}) \exp\Big( \Delta \sum_{j=1}^{3} \beta_j Y_{j,e} \Big). $$
       The exponential of a $3 \times 3$ skew-symmetric matrix is given by Eqn. 5.6.
   (d) $i = i + 1$, $l = l + 1$; if $l < u$ then go to (a).
[Figure 10 plot: "Distance from Truth vs Sample Index". Horizontal axis: Simulation Index (0 to 120); vertical axis: SO(2) Distance from True Orientation (0 to 8).]
Figure 10: Evolution of the sampling process: the broken line shows the H-S distance of the process $X(s)$ from the reference point $x_{true}$, while the solid line shows the distance function for the sample averages $\tilde{x}_N$ evolving over time.

3. Metropolis jump move: generate $g$ uniformly over SO(3) and calculate $H(g)$. If $H(X(i\Delta)) > H(g)$, set $X(i\Delta) = g$. Else set $X(i\Delta) = g$ with probability $e^{-(H(g) - H(X(i\Delta)))}$.
4. $i = i + 1$, go to 1.

The jump step (step 3) involves generating samples from the uniform measure on SO(3) in the following way. There exists a diffeomorphism between the upper half of the unit 3-sphere, $S^3_+ = \{(q_0, q_1, q_2, q_3) \in S^3 \mid q_0 > 0\}$, and SO(3), given by $(q_0, q_1, q_2, q_3) \in S^3_+ \leftrightarrow Q \in SO(3)$, where

$$
Q = \begin{bmatrix}
q_0^2 + q_1^2 - q_2^2 - q_3^2 & -2(q_0 q_3 - q_1 q_2) & 2(q_0 q_2 + q_1 q_3) \\
2(q_0 q_3 + q_1 q_2) & q_0^2 - q_1^2 + q_2^2 - q_3^2 & -2(q_0 q_1 - q_2 q_3) \\
-2(q_0 q_2 - q_1 q_3) & 2(q_0 q_1 + q_2 q_3) & q_0^2 - q_1^2 - q_2^2 + q_3^2
\end{bmatrix}.
\tag{6.1}
$$

The uniform measure on the sphere $S^3_+$ in polar coordinates is given by $\sin^2(\theta_1) \sin(\theta_2)\, d\theta_1\, d\theta_2\, d\theta_3$ (see [Spi79] for example), where $0 \leq \theta_1 \leq \pi/2$, $0 \leq \theta_2 \leq \pi$, $0 \leq \theta_3 \leq 2\pi$. Set

$$
q_0 = \cos\theta_1, \quad q_1 = \sin\theta_1 \cos\theta_2, \quad q_2 = \sin\theta_1 \sin\theta_2 \cos\theta_3, \quad q_3 = \sin\theta_1 \sin\theta_2 \sin\theta_3 .
\tag{6.2}
$$

A sample from the uniform measure on SO(3) can be generated by the following steps:
(i) Generate a uniform random variable $x \in [0, \pi/4]$ and solve the transcendental equation $\theta_1/2 - \sin(2\theta_1)/4 = x$ for $\theta_1$.
(ii) Generate a uniform random variable $x \in [-1, 1]$ and evaluate $\theta_2 = \cos^{-1}(x)$.
(iii) Generate a uniform random variable $\theta_3 \in [0, 2\pi]$.
(iv) Evaluate $(q_0, q_1, q_2, q_3) \in S^3_+$ from Eqn. 6.2.
(v) Evaluate $Q \in SO(3)$ using Eqn. 6.1.

The algorithm was implemented in the same simulation environment as earlier. The plot in Figure 11 illustrates the state of the algorithm: the thin line plots $\|x_{true} - x_N\|^2$ and the thick line plots $\|x_{true} - \tilde{x}_N\|^2$ against the index $N$. The jumps are made at $N = 1, 13, 39, 59$, reflecting the random times when the candidates are better matched to the data image than the present state $x_N$. Shown in the upper panels of Figure 12 is the target rendered at $x_N \in SO(3)$ for $N = 1, 41, 81, 121, 401$. The middle panels show the data match generated by removing the hypothesized contribution from the target at the current sample $x_N$ from the data. The lower panels display the evolution of the sample averages $\tilde{x}_N$ for the same times.

[Figure 11 plot. Horizontal axis: sample index (0 to 400); vertical axis: Distance from true state (0 to 8).]

Figure 11: Sampling on SO(3): the thin line plots the H-S distance of $x_N$ from $x_{true}$, while the thick line plots the H-S distance of $\tilde{x}_N$ from $x_{true}$.
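Two building blocks of the SO(3) algorithm in this subsection can be sketched as follows: the uniform jump proposal of steps (i) to (v), and one diffusion update of step 2(c). This is a hedged sketch, not the authors' implementation. The bisection solver, the basis $Y_1, Y_2, Y_3$ (taken here as the standard generators of rotations about the coordinate axes, which may differ from Eqn. 3.2 by a normalization), the closed-form exponential (Rodrigues formula, standing in for Eqn. 5.6), the sign convention for $\beta_j$, and all function names and step sizes are assumptions made for illustration.

```python
import numpy as np

def quat_to_rot(q):
    """Rotation matrix Q of Eqn. 6.1 for a unit vector q = (q0, q1, q2, q3)."""
    q0, q1, q2, q3 = q
    return np.array([
        [q0*q0 + q1*q1 - q2*q2 - q3*q3, -2*(q0*q3 - q1*q2), 2*(q0*q2 + q1*q3)],
        [2*(q0*q3 + q1*q2), q0*q0 - q1*q1 + q2*q2 - q3*q3, -2*(q0*q1 - q2*q3)],
        [-2*(q0*q2 - q1*q3), 2*(q0*q1 + q2*q3), q0*q0 - q1*q1 - q2*q2 + q3*q3],
    ])

def solve_theta1(x, tol=1e-12):
    """Solve theta/2 - sin(2*theta)/4 = x on [0, pi/2] by bisection (left side is increasing)."""
    lo, hi = 0.0, np.pi / 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid / 2 - np.sin(2 * mid) / 4 < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sample_uniform_so3(rng):
    """Draw one rotation from the uniform measure on SO(3), following steps (i)-(v)."""
    t1 = solve_theta1(rng.uniform(0.0, np.pi / 4))        # step (i)
    t2 = np.arccos(rng.uniform(-1.0, 1.0))                # step (ii)
    t3 = rng.uniform(0.0, 2.0 * np.pi)                    # step (iii)
    q = (np.cos(t1), np.sin(t1) * np.cos(t2),             # step (iv), Eqn. 6.2
         np.sin(t1) * np.sin(t2) * np.cos(t3),
         np.sin(t1) * np.sin(t2) * np.sin(t3))
    return quat_to_rot(q)                                 # step (v), Eqn. 6.1

def expm_so3(v):
    """exp(v[0]*Y_1 + v[1]*Y_2 + v[2]*Y_3) via the Rodrigues formula (assumed basis)."""
    A = np.array([[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]])
    phi = np.linalg.norm(v)
    if phi < 1e-12:
        return np.eye(3) + A                              # first-order limit for tiny angles
    return np.eye(3) + (np.sin(phi) / phi) * A + ((1.0 - np.cos(phi)) / phi ** 2) * (A @ A)

def diffusion_step_so3(X, H, rng, delta=0.01, eps=1e-4):
    """One update X -> X @ Gamma(X) of step 2(c); delta and eps are illustrative values."""
    e = np.eye(3)
    # (a) beta_j = minus the finite-difference derivative of H along Y_j (assumed sign)
    beta = np.array([-(H(X @ expm_so3(eps * e[j])) - H(X)) / eps for j in range(3)])
    w = rng.standard_normal(3)                            # (b) i.i.d. N(0, 1) increments
    s = np.sqrt(delta)
    gamma = (expm_so3(s * w[2] * e[2]) @ expm_so3(s * w[1] * e[1])
             @ expm_so3(s * w[0] * e[0]) @ expm_so3(delta * beta))
    return X @ gamma
```

A typical use draws a proposal with `sample_uniform_so3(np.random.default_rng(0))` for the Metropolis step and calls `diffusion_step_so3` between jumps. Both functions return matrices that are orthogonal with unit determinant up to rounding.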
Figure 12: Samples from the jump-diffusion process on SO(3) evolving over time: (i) the upper panels show the samples at simulation indices 1, 41, 81, 121, 401, and (ii) the lower panels show the difference images between the observed data and the synthesized data corresponding to the orientation state at these times.

6.3 Estimating Trajectories in $SE(3)^J$
Simultaneous tracking and recognition of multiple targets requires additional jump moves to account for target detection and identification. Here we examine the tracking of a single target via diffusions on the motion components forming a target trajectory in $SE(3)^J$. Shown in Figure 13 is a diffusion transforming a track in $SE(3)^4$. The position and orientation of the target at each sample point are modified using the SDEs as described in the last section. The motion components forming a track are related through a dynamics-based prior as described in [SGJM96]. For the detailed algorithm and implementation results, please refer to [SGJM96].
Figure 13: These panels describe a sequence of similarity transformations deforming a 4-length track in $SE(3)^4$.
7. Conclusion
We have presented algorithms for automated target recognition with parameterizations on Cartesian products of matrix Lie groups, SE(n), n = 2, 3 in particular. Tools have been developed for intrinsic optimization of Bayesian cost functions on these parametric manifolds. Stochastic flows with desired statistical properties are constructed to simulate the posterior distribution on the representation space. The algorithmic details are illustrated through specific applications in rigid object tracking and recognition. The algorithms are presented for three specific scenarios: (i) estimating the orientation of a ground-based object in SO(2), (ii) estimating the orientation of an airborne object in SO(3), and (iii) estimating the target trajectory for a given target in $SE(3)^{40}$.
References
[AM92] Y. Amit and M. I. Miller. Ergodic properties of jump-diffusion processes. Monograph of Electronic Signals and Systems Research Laboratory, Washington University, St. Louis, December 1992.
[Ami91] Y. Amit. A multiflow approximation to diffusions. Stochastic Processes and their Applications, 37(2):213-238, 1991.
[Boo86] William M. Boothby. An Introduction to Differential Manifolds and Riemannian Geometry. Academic Press, Inc., 1986.
[Bro72] R. W. Brockett. System theory on group manifolds and coset spaces. SIAM Journal on Control, 10(2):265-84, May 1972.
[Cur79] Morton L. Curtis. Matrix Groups. Springer-Verlag, New York, 1979.
[Dun90] T. E. Duncan. An estimation problem in compact Lie groups. Systems and Control Letters, 10(4):257-63, April 1990.
[Fri86] Bernard Friedland. Control System Design: An Introduction to State-Space Methods. McGraw-Hill Book Company, 1986.
[GV89] Gene H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University Press, Baltimore, 1989.
[Kan] Kenichi Kanatani. Group-Theoretical Methods in Image Understanding. Springer-Verlag.
[Kun84] H. Kunita. Stochastic differential equations and stochastic flows of diffeomorphisms. In École d'Été de Probabilités de Saint-Flour, XII - 1982, number 1097. Springer-Verlag L.N.M., 1984.
[MSG95] M. I. Miller, A. Srivastava, and U. Grenander. Conditional-expectation estimation via jump-diffusion processes in multiple target tracking/recognition. IEEE Transactions on Signal Processing, 43(11):2678-2690, November 1995.
[Sch81] R. Schmidt. A signal subspace approach to multiple emitter location and spectral estimation. Ph.D. Dissertation, Stanford University, Palo Alto, CA, November 1981.
[SFPss] Stefano Soatto, Ruggero Frezza, and Pietro Perona. Motion estimation via dynamic vision. IEEE Transactions on Automatic Control, in press.
[SGJM96] A. Srivastava, U. Grenander, G. R. Jensen, and M. I. Miller. Inferences via jump-diffusion processes on matrix Lie groups. Submitted to Advances in Applied Probability, August 1996.
[SHW93] D. L. Snyder, A. M. Hammoud, and R. L. White. Image recovery from data acquired with a charge-coupled-device camera. Journal of the Optical Society of America A, 10(5):1014-1023, May 1993.
[SMG95] A. Srivastava, M. I. Miller, and U. Grenander. Multiple target direction of arrival tracking. IEEE Transactions on Signal Processing, 43(5):1282-85, May 1995.
[Smi93] Steven Thomas Smith. Geometric Optimization Methods for Adaptive Filtering. Ph.D. Thesis, Harvard University, Cambridge, Massachusetts, May 1993.
[Spi79] Michael Spivak. A Comprehensive Introduction to Differential Geometry, Vol. I & II. Publish or Perish, Inc., Berkeley, 1979.
[SPss] Stefano Soatto and Pietro Perona. Recursive 3-D visual motion estimation using subspace constraints. International Journal of Computer Vision, in press.
[Sri93] A. Srivastava. Automated Target Tracking and Recognition Using Jump Diffusion Processes. M.S. Thesis, Washington University, St. Louis, Missouri, December 1993.
[Sri96] A. Srivastava. Inferences on Transformation Groups Generating Patterns on Rigid Motions. D.Sc. Thesis, Washington University, St. Louis, Missouri, July 1996.
[SW86] D. H. Sattinger and O. L. Weaver. Lie Groups and Algebras with Applications to Physics, Geometry and Mechanics. Springer-Verlag, New York, 1986.