For this reason represent- ... be processed represent some aspect of the real .... filter k. α = 1/(m â 1); m beeing the di- mensionaliy of T . I is the identity tensor.
Representing Local Structure Using Tensors. Hans Knutsson Link¨oping University Computer Vision Laboratory S-581 83 Link¨oping Sweden
ABSTRACT
formly stretched and all pairs of antipodal points maps onto the same tensor. It is demonstrated that the above mapping can be realized by sampling the 3D space using a specified class of symmetrically distributed quadrature filters. It is shown that 6 quadrature filters are necessary to realize the desired mapping, the orientations of the filters given by lines trough the vertices of an icosahedron. The desired tensor representation can be obtained by simply performing a weighted summation of the quadrature filter outputs. This situation is indeed satisfying as it implies a simple implementation of the theory and that requirements on computational capacity can be kept within reasonable limits.
The fundamental problem of finding a suitable representation of the orientation of 3D surfaces is considered. A representation is regarded suitable if it meets three basic requirements: Uniqueness, Uniformity and Polar separability. A suitable tensor representation is given. At the heart of the problem lies the fact that orientation can only be defined mod 180◦ , i.e the fact that a 180◦ rotation of a line or a plane amounts to no change at all. For this reason representing a plane using its normal vector leads to ambiguity and such a representation is consequently not suitable. The ambiguity can be eliminated by establishing a mapping between R3 and a higherdimensional tensor space.
Noisy neigborhoods and/or linear combinations of tensors produced by the mapping will in general result in a tensor that has no direct counterpart in R3 . In an adaptive hierarchical signal processing system, where information is flowing both up (increasing the level of abstraction) and down (for adaptivity and guidance), it is necessary that a meaningful inverse exists for each levelaltering operation. It is shown that the point in R3 that corresponds to the best approximation of a given tensor is given by the largest eigenvalue times the corresponding eigenvector of the tensor.
The uniqueness requirement implies a mapping that map all pairs of 3D vectors x and -x onto the same tensor T. Uniformity implies that the mapping implicitly carries a definition of distance between 3D planes (and lines) that is rotation invariant and monotone with the angle between the planes. Polar separability means that the norm of the representing tensor T is rotation invariant. One way to describe the mapping is that it maps a 3D sphere into 6D in such a way that the surface is uni1
INTRODUCTION A meaningful application of most signal processing concepts requires that the data to be processed represent some aspect of the real world in an orderly way. More precisely, it is generally required that an increased difference between real world events results in an increased distance between the data points that represent these events. Working with representations where these reguirements are not met would make most signal processing concepts meaningless, e.g. averaging and differentiation. Thus, it must be concluded that a suitable representation of events to be analysed is the basis for succesful signal processing. For many real-world aspects, however, establishing well behaved representations is a non-trivial task and in these cases a first and necessary step of any analysis is to find such a representation. In this paper the problem of finding a suitable representation of 3-D surface orientation is considered. As an introductory example the equivalent 2-D case will be discussed. The problem then of course reduces to finding a suitable representation for 2-D lines. In the following a line will be taken to mean a symmetric separator of two regions such that no relation between the reqions can be inferred, whereas by an edge will be understod an asymmetric separator such that an ordering of the two regions is always implied. Note that a consequence of this definition is that lines can only be unambigously assosiated with an angle over an intervall of 180◦ ,while edges can be unambigously assosiated with angles spanning an intervall of 360◦ . Another way to view the situation is that both lines and edges are assosiated with angles spanning a 180◦ interval and that in addition, edges are ‘signed’ but lines are ‘unsigned’. Considering the above, it is argued that a general partitioning element must have the properties of a line. The reason for this is that
it is not possible to consistently define an ordering of regions if the data constituting the different regions are intrinsically more than 1dimensional and consequently the consept of an edge is not applicable. To see why the edge concept does not work for multi-dimensional data, consider the case of differently colored adjacent patches (to a human observer color is intrinsically 2-dimensional). It does not seem to make much sense to say that one color has a higher value than any other (equal brightness and saturation assumed) and furthermore it is easy to convince oneself that all attempts to impose an ordering of the color space will result in unreasonable distances between at least some of the colors. Considering the fact that a substantial part of the concepts used in image processing are of multi-dimensional nature (color, texture, shape etc.) it becomes apparent that, in any image processing theory claiming generality, a partitioning element must have the above mentioned line properties. Another and perhaps more fundamental view of the situation is the following. The representation should be invariant to what is in the regions and only represent the geometric qualities of the separator itself i.e. in this case, its orientation. In trying to find a representation for line orientation vectors defined over a 2-dimensional half plane immediately suggests itself, the vector angle being identical to the line orientation and the vector length being a positive valued function of some other property of the line (e.g its energy, probability or some other definition of its ‘magnitude’). A space resulting from this type of direct mapping will be referred to as the original space. It is obvious in the present case that, for a given vector length, the original space corresponds to a half circle. The problem with this original space is that it leads to an unacceptable distance measure as vectors representing two lines differing by a small angle can end up being represented by
vectors that are very far apart, i.e. one line is represended close to one end of the half circle and the other line is represented close to the opposite end.
THE MAPPING A mapping M that maps the vector x onto the tensor T and meets all the above criteria is given by:
MAPPING REQUIREMENTS To remove this discontinuity a mapping is reguired that maps the original space into a suitable representation space. To guarantee a suitable mapping three basic requirements should be met.
kTk2 ≡
(1)
where: x is a vector in the original space. T is the map of x.
The mapping should locally preserve the angle metric of the original space, i.e (2)
where: r = kxk c is a ‘stretch’ constant.
X n
λ2n
(5)
tij are the components of T and λn are the eigenvalues of T. Since T is a quadratic form a change in the sign of x will have no effect and it follows directly that the ‘uniqueness’ requirement is met. That the ‘polar separability’ requirement is met is easily demonstrated by calculating the norm of T. X ij
r−2 x2i x2j = r2
(6)
where xi(j) are the components of x. Showing that the norm of T is equal to the norm of x. That the ‘uniform stretch’ requirement is met is shown in appendix A. A FILTERING REALISATION
The ‘polar separability’ requirement: As the information carried by the magnitude of the original vector x does not normally depend on the vector angle, it is reasonable to reguire that: kTk = f (kxk)
t2ij =
where:
kTk2 =
The ‘uniform stretch’ requirement:
kδTk = c kδxkr=const.
X ij
It is evident that the discontinuity will be removed if the mapping is such that it maps the vectors x and −x of the original space onto the same vector in the representation space i.e.
(4)
The norm of T is taken to be the Frobenius norm and is given by:
The ‘uniqueness’ requirement:
T(x) = T(−x)
T ≡ r−1 xxT
M:
(3)
i.e the norm of T is independent of the direction of x.
Having found a suitable mapping the question arises: Can the representation implied by the mapping be realized using measurements on actual image data, where lines (or other structures) are represented as local grayscale correlations? It will be shown that by combining the outputs from polar separable quadrature filters, it is possible to produce a representation corresponding exactly to eqn. (4).
The exactness relies on the image data beeing locally 1-dimensional, i.e. on the existence of a locally well defined orientation. The case where the 1-dimensionality assumption does not hold is discussed in ‘THE INVERSE’ section. There are four features of the above procedure relevant to the present paper. 1. The quadrature filter concept. The quadrature filter concept forms a basis for obtaining phase invariant information. A quadrature filter can be defined in Fourier space as a filter beeing positive over half of the Fourier space and zero over the other half, or more precisely defined by: Fk (u) > 0
if u · nk > 0
F (u) = 0 k
otherwise
the difference in angle between u and the filter direction nk . (In general the filters are spatially localized, but this is a separate issue not discussed in the present paper.) 3. The minimum number of filters. The discussion regarding the minimum number of filters can be found in appendix B. The result is that the minimum number of filters is 3 for 2D and 6 for 3D. 4. The combination of filter outputs. The final result T0 can be obtained by linear summation of the quadrature filter output magnitudes as indicated by eqn. 9. T0 =
(7)
where: nk is the filter directing vector. u is the frequency. The output qk of the corresponding quadrature filter will be a complex number. The magnitude kqk k of qk will be phase invariant (implying local shift invariance) and the argument arg(qk ) represents the local phase.
X k
kqk k( nk nTk − α I )
(9)
where: qk is the output from quadrature filter k. nk is the orientation of quadrature filter k. α = 1/(m − 1); m beeing the dimensionaliy of T0 . I is the identity tensor. The calculations leading to this result can be found in appendix C.
2. The filter shape. In addition to the quadrature requirement, eqn.(7), it is required that the frequency responce of the filters when (u · nk ) > 0 can be expressed as: Fk (u) = g(kuk)(u · nk )2
(8)
In other words the filter shape is polar separable, the radial part of the function is arbitrary but positive (usually some type of bandpass function) and the angular part varies as cos2 (ϕ), where ϕ is
THE INVERSE If the neighborhood is not 1-dimensional, due to noise or surface curvature, it is nessesary to find a best approximation T to T0 where T corresponds to a 1-dimensional neighborhood. This is simply done by finding the x that minimizes: ∆ = kT0 − r−1 xxT k
(10)
It is shown in appendix D that x is given by: x = λ 1 e1
(11)
Combining the results yields: S(x + rv) = x + rv S(x − rv) = rv − x
giving: T = λ1 e1 eT1
(12)
where: λ1 is the largest eigenvalue of T0 and e1 is the corresponding eigenvector. The value of ∆ indicates how well the 1dimensionality hypothesis fits the neighbourhood, the smaller the value the better the fit.
showing that (x+rv) and (x−rv) are eigenvectors of S the eigenvalues beeing 1 and -1 respectively. Since all other eigenvectors of S are ortogonal to x and v it follows from eqn.(15) that all other eigenvalues are zero. Then, according to eqn.(5), the norm of S is given by: kSk =
S = lim ²→0
T(x + ²v) − T(x) ²
q
λ21 + λ22 =
√
2
Appendix B FILTER OUTPUT The analysis in this appendix will deal only with real valued neighbourhoods of onedimensional variation i.e. neighbourhoods that can be expressed as:
(13)
ξ(s) = f (s · nξ )
where:
where:
kvk = 1 and
s is the spatial coordinate and
v·x = 0
(18)
showing that the ‘uniform stretch’ requirement is met.
Appendix A UNIFORM STRETCH To show that the uniform stretch requirement is met by the mapping is fairly straightforward. Add a small perpendicular vector ²v to x and calculate the relative difference in the norm of T. To start define S to be:
(17)
(19)
nξ is a unit vector oriented along the axis of maximal signal variation.
then kδTk = kSkkδxkr=const.
(14)
Carrying out the limit calculation is simple and yields: S = r−1 (xvT + vxT )
(15)
Let S operate on x and v. Sx = rv Sv = r −1 x
(16)
For this type of signal the Fourier transform is non-zero only on the line defined by: u ∝ nξ
(20)
Thus the situation can be treated as 1dimensional and, using eqn. (8) and 1dimensional filter theory, it is not hard to show that the magnitude of the quadrature fiter output (as a function of the signal orientation) is given by: kqk k = d(nξ · nk )2
(21)
where d can be considered a constant as it is independent of the filter orientation and depends only on the magnitude and radial distribution of the signal spectrum.
It has been shown in [2] that the minimum number of filters required when N = 2 is 3, the filter orientations given by: n1 = (1
, 0) √ n2 = (−0.5 , 3/2) √ n1 = (−0.5 , − 3/2)
Appendix C FILTER COMBINATION As in appendix B the analysis in this appendix will deal only with real valued neighbourhoods of one-dimensional variation. In addition it is assumed that the filters axes should be symmetrically distributed over the orientation space. It is felt that this is a reasonable assumption as the final result, T, by definition is invariant to rotation of the filters. It is helpful in the following discussion to bear in mind that: 1. The Fourier transform is invariant to rotation of the coordinate system. 2. The Fourier transform of a signal of onedimensional variation is a line through the origin paralel to the signal orienting vector, eqns. (19) and (20). 3. The quadratue filter output is invariant to rotation around its axis (given by nk ) and also diametrically symmetric so that kq(u)k = kq(−u)k, eqn.(21). N −1
quadrature filConsider the case of 2 ters, having symmetry axes passing through the corners of a cube in N dimensions, giving a fully symmetric distribution of filters. Consider the contribution to the filters from frequencies on a line through the center of two opposing cube faces. Since the angle between the line and any filter axis will be the same it is clear that all the filters will give the same output. Consequently the filter set is incapable of giving information sufficient to determine wich pair of cube faces the line passes through and thus, the orientation of the signal is undecideable. It can be concluded, therefore, that more than 2N −1 quadrature filters must be used.
(22)
For N = 3 the number of filters must be greater than 4 but, since there does not exist a way of distributing 5 filters in 3-D in a fully symmetrical fashion, the next possible number is 6. (In fact the only possible numbers are those given by half the number of vertices (or faces) of a diametrically symmetric regular polyhedron, leaving only the numbers 3, 4, 6 and 10. Note that this is in contrast to the 2-D case where the only symmetry restriction is K > 2.) It turns out that the minimum required number of quadrature filters K is 6. To attain the final result for the 3-D case, eqn.(31), a number of steps have to be taken. The orientations of the filters are given by vectors pointing to the vertices of a hemiicosahedron, see fig. 1. The 6 normal vectors are given in cartesian coordinates by: n1 = c ( a , 0 , b )T n2 = c (−a , 0 , b )T n3 = c ( b , a , 0 )T n4 = c ( b , −a , 0 )T n5 = c ( 0 , b , a )T n6 = c ( 0 , b , −a )T where: a = 2 b = (1 +
√
5) √ c = (10 + 2 5)−1/2
(23)
then the magnitude of the outputs from the 6 quadrature filters are, according to eqn.(21), given by:
kq1 k = dc2 r−2 (a2 x2 + 2abxz + b2 z 2 ) kq2 k = dc2 r−2 (a2 x2 − 2abxz + b2 z 2 ) kq3 k = dc2 r−2 (b2 x2 + 2abxy + a2 y 2 ) kq4 k = dc2 r−2 (b2 x2 − 2abxy + a2 y 2 )
(26)
kq5 k = dc2 r−2 (b2 y 2 + 2abyz + a2 z 2 ) Figure 1: An icosahedron (one of the 5 Platonic polyhedra).
kq6 k = dc2 r−2 (b2 y 2 − 2abyz + a2 z 2 ) Next, calculating the sum
Then the elements of the nk nTk :s are given by:
n1 nT1 = c2
a2 0 ab
n2 nT2
a2 = c2 0 −ab
n3 nT3
b2 ab 0
0 0 0
n4 nT4
0 0 0
n5 nT5 = c2
n6 nT6
= c2
0 0 0
0 b2 ab
t33 = d0 (z 2 r−2 + 12 ) t12 = t21 = d0 xyr−2
b2 −ab a2 = c2 −ab 0 0
(28)
t13 = t31 = d0 xzr−2 (24)
t23 = t32 = d0 yzr−2 where d0 = 45 d. It is evident that if the quantity 12 d0 is subtracted from the diagonal elements of T00 the result will be of the desired form.
0 ab a2
T0 = T00 − 12 d0 = d0 nξ nTξ
0 0 0 2 0 b −ab 0 −ab a2
(29)
Finally calculate the sum of all quadrature filter output magnitudes.
Let the signal orienting vector be given by:
X k
nξ = r−1 (x, y, z)
(27)
t22 = d0 (y 2 r−2 + 12 )
k
kqk kxk xTk
t11 = d0 (x2 r−2 + 12 )
0 −ab 0 0 0 b2 0 0 0
X
yields the components of T00 :
ab 0 2 b
ab a2 0
= c2
T00 =
(25)
kqk k = 2d
(30)
Combining eqns. (27),(29) and (30) yields the desired result: T0 (d0 nξ ) =
X k
1 kqk k(nk nTk − I) 5
(31)
A few comments about higher dimensional spaces are appropriate. For N = 4 the number of filters must be greater than 8 and the only possible number is 12. No calculations for the 4-dimensional case have been carried out. For N greater than 4 no regular polyhedron having more vertices than a cube exists. Appendix D THE INVERSE The norm of a tensor is invariant under rotation of the coordinate system and eqn.(10) can be rewritten as: −1
0
−1
T
∆ = kA (T − r xx )Ak
(32)
∆ = kA−1 T0 A − A−1 r−1 xxT Ak
(33)
giving:
where A is an ortogonal matrix. Let A be such that A−1 T0 A is diagonal and note that only one eigenvalue of xxT is nonzero. Then, since the norm of A−1 T0 A is the sum of the squares of its elements, it is clear that ∆ is minimized if A−1 r−1 xxT A removes the largest of these values, i.e. the largest eigenvalue of A−1 T0 A. Thus, if the eigenvalues are numbered in decreasing order, ∆ is given by: ∆ =
sX n6=1
λ2n
(34)
Then, since T0 and r−1 xxT are subject to identical rotation, it is clear that the x wich minimizes ∆ is given by: x = λ1 e1
(35)
where e1 is the eigenvector corresponding to the largest eigenvalue of T.
Acknowledgement The author whishes to thank the members of the Computer Vision group, especially Prof. G¨osta Granlund for beeing a constant source of inspiration and Dr Josef Big¨ un for showing me the elegant trick involved in appendix A. This work was supported by the Swedish National Board for Technical Developement.
References [1] G. H. Granlund. “In Search of a General Picture Processing Operator”,Computer Graphics and Image Processing 8, 155-173, 1978. [2] H. Knutsson. Filtering and Reconstruction in Image Processing. Diss. No. 88, Link¨ oping University, 1982. [3] H. Knutsson, G. H. Granlund. “Texture Analysis Using Two-Dimensional Quadrature Filters”. CAPAIDM Workshop, Pasadena, California, October 1983. [4] H. Knutsson. “Producing a Continuous and Distance Preserving Representation of 3-D Orientation”. CAPAIDM Workshop, Miami Beach, Florida, November 1985. [5] Big¨ un, J., 1988, “Local Symmetry Features in Image Processing”, Diss. No. 179, Link¨ oping University, Sweden. [6] R. Wilson, H. Knutsson. “Uncertainty and Inference in the Visual System”. IEEE Systems, Man and Cybernetics Volume 18, No 2, 1988. [7] H. Knutsson, G. H. Granlund. “Compact Association Representation of Structural Information”. Report LiTH-ISY-I0931, Link¨ oping University, 1988.