Procrustes Methods for Projective Shape - Isaac Newton Institute

Procrustes Methods for Projective Shape

John Kent
University of Leeds
http://maths.leeds.ac.uk/~john

Seminar at Newton Institute 12 February 2008

Introduction

- Modern (similarity) shape analysis was developed 20 years ago by D.G. Kendall (1984) and F.L. Bookstein (1986).
- The starting point is a configuration of k points in m ≥ 2 dimensions; earlier work focused on selected distances.
- (Similarity) shape consists of the geometrical information invariant under similarity transformations: translation, rotation and scaling.

Overview of talk

- Review basic theory for similarity shape
  - Bookstein coordinates
  - Procrustes tangent coordinates
  - statistical inference
- Applications of similarity shape analysis
  - morphometrics (PCA on mouse vertebrae)
  - imaging application: object tracking (fish)
  - growth models (rat skulls)
- Projective shape analysis
  - basic theory: projective transformations and camera imaging
  - projective invariants: cross ratio
  - Procrustes standardization
  - new representation for cross ratio
- Extensions
  - beyond landmarks: deformations, curves, surfaces
  - unlabelled configurations

Similarity changes for a triangle

- Translation (horizontal and vertical)
- Size
- Rotation

(2 df left for shape)

Coordinate systems

To work statistically with shapes we need a coordinate system.

I. Bookstein coordinates (in 2-d):

(a) Move the first landmark to the origin.
(b) Rotate and scale so that the second landmark lies at (1, 0).
(c) The remaining k − 2 landmarks lie in R^2.
(d) But the choice of baseline matters:
    (i) there are nonlinear effects if the data are not highly concentrated;
    (ii) the PCA changes with the baseline even if the data are highly concentrated.
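Steps (a) to (c) can be sketched in a few lines by treating each landmark as a complex number. This is a minimal illustration only; the function name `bookstein_coords` and the example triangle are assumptions made for this sketch.

```python
import numpy as np

def bookstein_coords(landmarks, baseline=(0, 1)):
    """Bookstein coordinates of a 2-d configuration given as a (k, 2) array.

    The two baseline landmarks are sent to (0, 0) and (1, 0); the remaining
    k - 2 landmarks are returned as a (k - 2, 2) array.
    """
    pts = np.asarray(landmarks, dtype=float)
    z = pts[:, 0] + 1j * pts[:, 1]           # landmarks as complex numbers
    i, j = baseline
    w = (z - z[i]) / (z[j] - z[i])           # translate, then rotate and scale
    rest = np.delete(w, [i, j])              # drop the two baseline landmarks
    return np.column_stack([rest.real, rest.imag])

# Hypothetical example: a triangle and a rotated, scaled, shifted copy of it.
triangle = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 1.0]])
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
moved = 3.0 * triangle @ R.T + np.array([5.0, -2.0])

coords_original = bookstein_coords(triangle)
coords_moved = bookstein_coords(moved)       # same shape, so same coordinates
```

Changing `baseline` to another pair of landmarks gives different coordinates for the same shape, which is the dependence noted in (d).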

Bookstein allometry example

II. Procrustes tangent coordinates (in 2-d):

(a) Shape space is a manifold (a sphere for k = 3; more generally, CP^{k−2} for k > 3), which can be naturally embedded in a Euclidean space.
(b) Given n configurations, find the “average” shape.
(c) Project the data onto the tangent space to shape space at this average shape.
(d) Carry out statistical inference in the tangent space (MVA).

Procrustes details (2-d)

(a) Represent a configuration of k points by a complex vector z (k × 1).
(b) Given n configurations {z_i}, the mean shape µ is defined by minimizing

$$\sum_{i=1}^{n} \| \mu - \beta_i \exp(i\theta_i)(z_i - \gamma_i 1_k) \|^2$$

over β_i > 0 (scale), θ_i ∈ [0, 2π) (rotation) and γ_i ∈ C (location), and over µ, a centered ($\sum_{j=1}^{k} \mu_j = 0$) and scaled ($\sum_{j=1}^{k} |\mu_j|^2 = 1$) configuration.
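For 2-d data this minimization has a standard closed form: after centering and scaling each z_i, minimizing over β_i and θ_i leaves n − µ* S µ with S = Σ z_i z_i*, so µ is the dominant eigenvector of S. The sketch below assumes that reduction; the helper names `preshape` and `full_procrustes_mean` and the test data are hypothetical.

```python
import numpy as np

def preshape(z):
    """Center a complex configuration and scale it to unit norm."""
    z = np.asarray(z, dtype=complex)
    z = z - z.mean()
    return z / np.linalg.norm(z)

def full_procrustes_mean(configs):
    """Mean shape as the dominant eigenvector of S = sum_i z_i z_i^*."""
    Z = np.array([preshape(z) for z in configs])   # (n, k), one preshape per row
    S = Z.T @ Z.conj()                             # (k, k) Hermitian matrix
    eigvals, eigvecs = np.linalg.eigh(S)           # eigenvalues in ascending order
    return eigvecs[:, -1]                          # centered, unit norm

# Hypothetical data: one base shape seen under three similarity transformations.
rng = np.random.default_rng(0)
z0 = rng.normal(size=5) + 1j * rng.normal(size=5)
configs = [b * np.exp(1j * t) * z0 + g
           for b, t, g in [(2.0, 0.3, 1 + 1j), (0.5, 2.0, -3j), (1.7, -1.1, 4.0)]]

mu = full_procrustes_mean(configs)
# |mu^* z0| = 1 means mu equals the base preshape up to a rotation.
agreement = abs(np.vdot(mu, preshape(z0)))
```

The eigenvector is only determined up to a unit complex factor, which corresponds to the arbitrary rotation of the mean shape.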

Procrustes tangent projection (partial and full)

Geometric interpretation when k = 3, where shape space = sphere S^2 ⊂ R^3:

- full Procrustes tangent projection = orthogonal projection
- partial Procrustes tangent projection = equal-area projection

Algebraic interpretation: WLOG suppose the {z_i} are centered and scaled. Then the partial Procrustes tangent projection about µ is given by

$$v_i = \exp(-i\theta_i) P z_i, \qquad P = I - \mu\mu^*, \qquad \theta_i = \arg\{\mu^* z_i\}.$$
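The algebraic form translates directly into code. The following sketch assumes µ and z are already centered and scaled to unit norm, as in the WLOG above; the function name `partial_tangent` and the example vectors are hypothetical.

```python
import numpy as np

def partial_tangent(mu, z):
    """Partial Procrustes tangent coordinates of z about mu.

    Both arguments are assumed to be centered complex vectors of unit norm.
    """
    mu = np.asarray(mu, dtype=complex)
    z = np.asarray(z, dtype=complex)
    inner = np.vdot(mu, z)                # mu^* z (vdot conjugates its first argument)
    theta = np.angle(inner)               # theta = arg(mu^* z)
    Pz = z - mu * inner                   # (I - mu mu^*) z
    return np.exp(-1j * theta) * Pz

# Hypothetical centered, unit-norm configurations with k = 3 landmarks.
mu = np.array([1.0, -1.0, 0.0], dtype=complex) / np.sqrt(2.0)
z = np.array([1.0 + 0.2j, -1.0, -0.2j])
z = z / np.linalg.norm(z)

v = partial_tangent(mu, z)
v_rotated = partial_tangent(mu, np.exp(0.9j) * z)   # rotating z leaves v unchanged
```

The result is orthogonal to µ, and a configuration with the same shape as µ maps to the origin of the tangent space.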

Mouse vertebra data, n = 23, k = 6

- Mean shape
- Marginal covariance at each landmark
- First principal component – toothpaste

Fish — finding objects in images

- From training images with fish, identify landmarks by hand
- Use first few principal components to model variability
- Include a prior for location and scale
- Run an edge detector on test image to build a likelihood
- Run MCMC to find fish in test image

Fish — typical image and edge detector

Fish — principal components from training data

Fish — results of MCMC search

Deformations

- Landmarks sit in R^2 = ambient space.
- We can move landmarks individually, or deform the ambient space.
- One way to construct (small) deformations is using thin-plate splines (TPS).
- Deformations can be represented by bi-orthogonal grids, showing what happens to a square grid after deformation.
- TPS deformations combine very neatly with Procrustes tangent analysis.
- Eigenfunctions from TPS give a method of decomposing shape change as a linear combination of components ranging from low to high “frequency”, analogous to Fourier series or orthogonal polynomials.

Growth models for rats (spatial-temporal shape change)

- Rat skulls: 8 landmarks observed at 8 times between birth and adult
- Average over the different rats
- Blow up shape change by a factor of 5
- Growth model takes the tensor product form

$$v_{jt} = \sum_{\alpha=1}^{p} \sum_{\beta=1}^{q} a_{\alpha\beta} f_{(\alpha)}(\mu_j) g_{(\beta)}(t),$$

  where f_(α)(µ_j) represents a mode of change in space and g_(β)(t) a mode of change in time.
- The best fitting model can be described as (general space × linear time) + (linear space × quadratic time)

Rats: data and models

[Figure: four scatter plots, panels (a) to (d), each with axes running from −0.6 to 0.6.]

Projective shape — Introduction

Projective geometry is based on “projective invariants”. These are usually defined algebraically but have no geometric (metric) meaning. A prime example is the cross ratio, to be discussed later. We give a new definition of a “distance” between two cross ratios and the beginnings of a statistical theory of projective shape.

Background — homogeneous coordinates

Consider a Euclidean vector v (m × 1). Define the homogeneous coordinates of v by

$$x = \alpha \begin{pmatrix} v \\ 1 \end{pmatrix} \quad (p \times 1), \qquad p = m + 1, \qquad \alpha \neq 0,$$

where all scalar multiples of x are treated as equivalent. Then x and v contain the same information.

Background — projective transformation

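Let

$$y = Bx = \begin{pmatrix} B_{11} & c \\ d^T & e \end{pmatrix} \begin{pmatrix} v \\ 1 \end{pmatrix} = \begin{pmatrix} B_{11} v + c \\ d^T v + e \end{pmatrix} \in \mathbb{R}^{m+1},$$

where B (p × p) is nonsingular. In Euclidean coordinates,

$$w = y_1 / y_p = (B_{11} v + c)/(d^T v + e) \in \mathbb{R}^m,$$

where y_1 denotes the first m components of y and y_p its last component.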
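As a small numerical illustration of the last two slides, the sketch below maps a Euclidean vector to homogeneous coordinates, applies B, and maps back, then compares with the closed form for w. The helper names and the particular matrix B are assumptions chosen for the example.

```python
import numpy as np

def to_homogeneous(v, alpha=1.0):
    """Homogeneous coordinates alpha * (v, 1) of a Euclidean vector v."""
    v = np.asarray(v, dtype=float)
    return alpha * np.append(v, 1.0)

def to_euclidean(x):
    """Recover v from homogeneous coordinates by dividing by the last entry."""
    x = np.asarray(x, dtype=float)
    return x[:-1] / x[-1]

def projective_transform(B, v):
    """Apply the projective transformation with matrix B to a Euclidean v."""
    return to_euclidean(B @ to_homogeneous(v))

# Hypothetical nonsingular B with m = 2, p = 3.
B = np.array([[2.0, 0.5,  1.0],
              [0.0, 1.5, -1.0],
              [0.1, 0.2,  1.0]])
v = np.array([1.0, 2.0])

w = projective_transform(B, v)

# Block form: B11 is m x m, c and d are m-vectors, e is a scalar.
B11, c, d, e = B[:2, :2], B[:2, 2], B[2, :2], B[2, 2]
w_direct = (B11 @ v + c) / (d @ v + e)
```

Multiplying B, or the homogeneous vector, by any nonzero scalar leaves w unchanged, which is the sense in which scalar multiples are equivalent.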

Background — Application to vision

For simplicity, let m = 1, i.e. p = 2. Consider a line in the plane,

$$x = u l + a, \qquad l, a \ \text{(fixed)} \in \mathbb{R}^2,$$

as u varies over the real numbers. Treat x as being in homogeneous coordinates and represent it in Euclidean coordinates by setting y = x/x_2. Then

$$y = \begin{pmatrix} (u l_1 + a_1)/(u l_2 + a_2) \\ 1 \end{pmatrix},$$

so y_1 is a projective transformation of u.

Camera view of 4 collinear points

[Figure: camera view of four collinear points.]

Camera geometry

This is the geometry of a camera with focal point at the origin and 1-d film on the line x_2 = 1. Suppose a scene contains k collinear points, but that the values of l and a are unknown. What features of these points can be recovered from a camera image?

Answer: projective invariants, i.e. those features of the points which are unchanged under all projective transformations $X \to D X B^T$, where D (k × k) = diag(d_i) is nonsingular and B (2 × 2) is nonsingular.

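Cross ratio

Consider k = 4 collinear points in a 2-d scene, as a 4 × 2 matrix

$$X = \begin{pmatrix} x_1^T \\ x_2^T \\ x_3^T \\ x_4^T \end{pmatrix}; \qquad \text{let } R_{ij} = \det \begin{pmatrix} x_i^T \\ x_j^T \end{pmatrix},$$

so R_ij is the determinant of the 2 × 2 matrix formed by rows i and j of X. The cross ratio can be defined by

$$\tau = \frac{R_{12} R_{34}}{R_{13} R_{24}} = \frac{(u_1 - u_2)(u_3 - u_4)}{(u_1 - u_3)(u_2 - u_4)},$$

where u_i is the Euclidean coordinate of point i, so that x_i is proportional to (u_i, 1)^T. It is the only projective invariant in this setting (a special case of Plücker coordinates).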
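The invariance can be checked numerically. The sketch below computes τ from the 2 × 2 determinants and confirms that it is unchanged under X → D X B^T; the function name `cross_ratio`, the positions u and the matrices D and B are all hypothetical choices for illustration.

```python
import numpy as np

def cross_ratio(X):
    """Cross ratio R12 R34 / (R13 R24) of a 4 x 2 homogeneous configuration."""
    X = np.asarray(X, dtype=float)

    def R(i, j):
        # Determinant of the 2 x 2 matrix formed by rows i and j (0-based).
        return np.linalg.det(X[[i, j], :])

    return (R(0, 1) * R(2, 3)) / (R(0, 2) * R(1, 3))

# Hypothetical positions of four points along a line.
u = np.array([0.0, 1.0, 2.5, 4.0])
X = np.column_stack([u, np.ones(4)])         # rows x_i^T = (u_i, 1)

tau = cross_ratio(X)
tau_direct = ((u[0] - u[1]) * (u[2] - u[3])) / ((u[0] - u[2]) * (u[1] - u[3]))

# Arbitrary nonsingular row scalings D and projective transformation B.
D = np.diag([2.0, -0.5, 3.0, 1.2])
B = np.array([[1.0, 2.0],
              [-0.3, 0.8]])
tau_transformed = cross_ratio(D @ X @ B.T)
```

Each R_ij picks up a factor d_i d_j det(B) under the transformation, and these factors cancel between numerator and denominator.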

Example — Four lanterns

To illustrate the cross ratio, I shall use an example from my back yard, where 4 solar-powered lanterns have been inserted in the ground. The next slides show two views of the lanterns at dusk. Note the different spacing of the lights from the two different camera angles. I extracted the coordinates of the lights from the two images and computed τ in each case. The answers are remarkably similar!

τ_1 = 0.489, τ_2 = 0.487.

My back garden

View 1 of lanterns

View 2 of lanterns

Standardization 1

One way to study projective shape is through projective invariants. Another approach is to standardize X. In general in projective geometry we treat X ≡ D X B^T, where D (k × k) is diagonal (because we are working with homogeneous coordinates) and B (p × p) is nonsingular (representing projective transformations).

Choice of D: we can scale each row of X so that the last element is 1 (linear film) or so that the row has norm 1 (circular film).

Choice of B: ensure the columns of the configuration are orthonormal, up to a factor k/p.

This standardization must be carried out jointly.

Standardization 2

“Procrustes projective shape coordinates” (new): choose D and B so that Y = DXB satisfies

$$y_i^T y_i = 1 \quad \text{(rows of $Y$ are unit vectors)},$$

$$Y^T Y = \frac{k}{p} I_p \quad \text{(columns of $Y$ orthonormal, up to a factor $k/p$)},$$

so that

$$\operatorname{tr} Y^T Y = \sum_{i=1}^{k} \sum_{j=1}^{p} y_{ij}^2 = \sum_{i=1}^{k} y_i^T y_i = \frac{k}{p} \operatorname{tr} I_p = k.$$

Standardization 3

The existence and (partial) uniqueness of this standardization follows from a result on the robust estimation of a covariance matrix (Tyler). Consider the equation

$$A = \frac{p}{k} \sum_{i=1}^{k} \frac{x_i x_i^T}{x_i^T A^{-1} x_i}.$$

Note that A is a covariance-type matrix of the data X that does not depend on the scaling of the {x_i} and is equivariant under linear transformations of the data. Tyler shows that under mild regularity conditions A is uniquely defined (up to scale). Also, it can be found by a simple iterative algorithm.

Standardization 4

If we let B denote a square root of A^{-1} (satisfying B B^T = A^{-1}; recall B is unique up to post-multiplication by a p × p orthogonal matrix Q), and set

$$y_i = \pm B^T x_i / \sqrt{x_i^T A^{-1} x_i},$$

where the sign can be chosen separately for each row, then

$$I_p = \frac{p}{k} \sum_{i=1}^{k} y_i y_i^T = \frac{p}{k} \sum_{i=1}^{k} y_i y_i^T / y_i^T y_i.$$

That is, after transformation by B, the axes of the data are uniformly spread around the unit sphere in R^p. Note that the standardized configuration Y is unique up to (a) post-multiplication by an orthogonal matrix, Y → YQ, and (b) changing the sign of each row.
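The standardization can be sketched numerically as follows. The fixed-point iteration used here (substitute the current A into the right-hand side of Tyler's equation and renormalize the trace) is one common way to solve that equation and is an assumption of this sketch, as are the function names, the Cholesky choice of square root, and the example data.

```python
import numpy as np

def tyler_scatter(X, n_iter=500, tol=1e-12):
    """Solve A = (p/k) sum_i x_i x_i^T / (x_i^T A^{-1} x_i) by fixed-point iteration."""
    X = np.asarray(X, dtype=float)
    k, p = X.shape
    A = np.eye(p)
    for _ in range(n_iter):
        # q_i = x_i^T A^{-1} x_i for every row of X
        q = np.einsum("ij,jk,ik->i", X, np.linalg.inv(A), X)
        A_new = (p / k) * (X / q[:, None]).T @ X
        A_new *= p / np.trace(A_new)          # A is only defined up to scale
        if np.linalg.norm(A_new - A) < tol:
            return A_new
        A = A_new
    return A

def standardize(X):
    """Rows y_i = B^T x_i / sqrt(x_i^T A^{-1} x_i), with B B^T = A^{-1}."""
    X = np.asarray(X, dtype=float)
    A_inv = np.linalg.inv(tyler_scatter(X))
    B = np.linalg.cholesky(A_inv)             # lower triangular, B B^T = A^{-1}
    q = np.einsum("ij,jk,ik->i", X, A_inv, X)
    return (X @ B) / np.sqrt(q)[:, None]      # positive sign chosen for each row

# Hypothetical configuration: k = 4 collinear points, p = 2.
u = np.array([0.0, 1.0, 2.5, 4.0])
X = np.column_stack([u, np.ones(4)])
k, p = X.shape

Y = standardize(X)
row_norms = np.linalg.norm(Y, axis=1)         # should all equal 1
gram = Y.T @ Y                                # should equal (k/p) I_p
```

Any other square root B Q with Q orthogonal gives the configuration Y Q, which is the remaining rotation/reflection ambiguity.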

Standardization 5

Thus we have reduced our original configuration X to a (semi-)standardized configuration Y, which is unique up to sign changes and rotations/reflections: Y ≡ SYQ, where S = diag(s_i), s_i = ±1, i = 1, ..., k, and Q (p × p) is orthogonal.

Standardization — cross-ratio

In the case k = 4, p = 2 it can be shown that Y takes the form

$$Y = \begin{pmatrix} u(-\delta/2) \\ u(\delta/2) \\ u(\pi/2 - \delta/2) \\ u(\pi/2 + \delta/2) \end{pmatrix} = \begin{pmatrix} c & -s \\ c & s \\ s & c \\ s & -c \end{pmatrix},$$

where

- u(θ) = (cos(θ), sin(θ)),
- c = cos(δ/2), s = sin(δ/2), 0 < δ < π/4,
- Y is unique up to (a) permutation of landmarks, (b) sign of each row, (c) rotation/reflection of the data around the circle (the last row of the matrix is written with its sign changed).

Then τ is related to δ by one of the trig functions

sin² δ, cos² δ, sec² δ, csc² δ, −tan² δ, −cot² δ,

depending on the permutation.
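As a short worked check of the first of these relations, take the rows of Y in the order (c, −s), (c, s), (s, c), (s, −c) and evaluate the determinants R_ij from the definition of the cross ratio. The ordering is an assumption; other permutations of the landmarks give the other five functions.

```latex
\begin{align*}
R_{12} &= \det\begin{pmatrix} c & -s \\ c & s \end{pmatrix} = 2cs = \sin\delta, &
R_{34} &= \det\begin{pmatrix} s & c \\ s & -c \end{pmatrix} = -2cs = -\sin\delta, \\
R_{13} &= \det\begin{pmatrix} c & -s \\ s & c \end{pmatrix} = c^2 + s^2 = 1, &
R_{24} &= \det\begin{pmatrix} c & s \\ s & -c \end{pmatrix} = -(c^2 + s^2) = -1, \\
\tau &= \frac{R_{12} R_{34}}{R_{13} R_{24}} = \frac{-\sin^2\delta}{-1} = \sin^2\delta.
\end{align*}
```

Here 2cs = 2 sin(δ/2) cos(δ/2) = sin δ by the double-angle formula.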

Standardized representation of 4 collinear points

[Figure: standardized configuration Y, four points marked X, on axes running from −1.0 to 1.0.]

Embedding

To remove the remaining indeterminacy (S and Q) in the standardized configuration Y, we consider embedding projective shape space in a Euclidean space, so that each projective shape is represented by a single point. Define an “absolute inner product” matrix M (k × k) by

$$m_{ij} = |y_i^T y_j|, \qquad i, j = 1, \ldots, k.$$

Then
(a) m_ij does not depend on S or Q;
(b) at least for p = 2, it is possible to reconstruct Y (up to the choice of S and Q).

Example of embedding: k = 4, p = 2

In this case

$$Y = \begin{pmatrix} c & -s \\ c & s \\ s & c \\ s & -c \end{pmatrix}, \qquad c = \cos(\delta/2), \quad s = \sin(\delta/2),$$

where 0 < δ < π/2. Then

$$M = \begin{pmatrix} 1 & C & 0 & S \\ C & 1 & S & 0 \\ 0 & S & 1 & C \\ S & 0 & C & 1 \end{pmatrix},$$

where C = cos(δ), S = sin(δ). Note that

$$m_{12}^2 + m_{13}^2 + m_{14}^2 = 1$$

with one structural 0, so M can be represented on the edges of a spherical triangle, in the unit sphere in R^3.
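The form of M, and its invariance to the sign matrix S and the orthogonal matrix Q, can be checked numerically for this example. The value of δ, the particular S and Q, and the function name `abs_inner_product` are arbitrary choices made for this sketch.

```python
import numpy as np

def abs_inner_product(Y):
    """Absolute inner product matrix with entries m_ij = |y_i^T y_j|."""
    Y = np.asarray(Y, dtype=float)
    return np.abs(Y @ Y.T)

# Standardized configuration for k = 4, p = 2 with a hypothetical delta.
delta = 0.6
c, s = np.cos(delta / 2), np.sin(delta / 2)
Y = np.array([[c, -s],
              [c,  s],
              [s,  c],
              [s, -c]])
M = abs_inner_product(Y)

# Arbitrary row sign changes and a rotation of the plane.
signs = np.diag([1.0, -1.0, -1.0, 1.0])
phi = 1.1
Q = np.array([[np.cos(phi), -np.sin(phi)],
              [np.sin(phi),  np.cos(phi)]])
M_transformed = abs_inner_product(signs @ Y @ Q)

# First row of M gives the point (m12, m13, m14) on the unit sphere in R^3.
point = M[0, 1:]
```

Because one of the three coordinates is always a structural zero, the point lies on a great-circle arc joining two coordinate axes, that is, on an edge of the spherical triangle.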

Spherical triangle for cross ratio

[Figure: spherical triangle with labels 12=34, 13=24 and 14=23.]

At the vertices, pairs of landmarks coincide.

Interpretation of the spherical triangle

The position of the structural 0 in M is closely related to the ordering of the landmarks. Suppose landmark 1 is perpendicular to landmark 2 along one edge of the spherical triangle (and so landmark 3 is perpendicular to landmark 4). At one end of this edge (i.e. a vertex of the spherical triangle), landmarks 1 & 3 coalesce (as do landmarks 2 & 4). At the other vertex, landmarks 1 & 4 coalesce (as do landmarks 2 & 3).

Open issues

- Extension of this embedding to k > 4 (and p > 2).
- Development of useful statistical models for projective shape (cf. Maybank: 4 iid N(0, 1) points on a line).
- Procrustes tangent projections for small-scale variability.