Algorithms for Comparing Geometric Patterns (Algorithmen zum Vergleich geometrischer Muster)
Dissertation submitted for the doctoral degree at the
Department of Mathematics and Computer Science of the Freie Universität Berlin, 2001, by
Christian Knauer
Institut für Informatik, Freie Universität Berlin, Takustraße 9, 14195 Berlin
[email protected]
Advisor:
Prof. Dr. Helmut Alt, Institut für Informatik, Freie Universität Berlin, Takustraße 9, D-14195 Berlin, Germany
[email protected]
Reviewers:
Prof. Dr. Helmut Alt, Institut für Informatik, Freie Universität Berlin, Takustraße 9, 14195 Berlin, Germany
[email protected]
Prof. Dr. Remco Veltkamp, Institute of Information and Computing Sciences, Universiteit Utrecht, Padualaan 14, De Uithof, 3584 CH Utrecht, The Netherlands
[email protected]
Submitted for review: 25 October 2001
Date of defense: 2 May 2002
Version of: 2 July 2002
© Christian Knauer, 2001, 2002
Contents

1 Introduction  1

I Point set pattern matching in d-dimensional space  5

2 Testing the congruence of point sets in R^d  9
  2.1 The dimension reduction technique . . . . . . . . . . . . . . . . . . . .  9
  2.2 A refined approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

3 Measuring the Hausdorff distance of point sets in R^d  17
  3.1 The one-sided Hausdorff distance of a point set to a semialgebraic set . 17

II Matching of plane curves  21

4 Matching polygonal curves with respect to the Fréchet distance  27
  4.1 Computing the Fréchet distance . . . . . . . . . . . . . . . . . . . . . . 28
  4.2 Minimizing the Fréchet distance . . . . . . . . . . . . . . . . . . . . . . 31
  4.3 Approximately minimizing the Fréchet distance . . . . . . . . . . . . . . 38

5 Bounding the Fréchet distance by the Hausdorff distance  43
  5.1 The upper bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
  5.2 Computing the reparametrization . . . . . . . . . . . . . . . . . . . . . . 46
  5.3 Recognizing κ-straight curves . . . . . . . . . . . . . . . . . . . . . . . 49

III Computing the Hausdorff distance between polyhedral surfaces in R^3  59
  6.1 Outline of the method . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
  6.2 The LBP-Hausdorff distance of ∆-patterns . . . . . . . . . . . . . . . . . 64
  6.3 The Hausdorff distance of ∆-patterns . . . . . . . . . . . . . . . . . . . 66

Appendices
Bibliography  74
Index  75
Glossary  77
List of Definitions  81
List of Lemmata  83
List of Theorems  85
List of Algorithms  87
Curriculum Vitae  89
Zusammenfassung  91
Chapter 1

Introduction

Geometric pattern matching deals with important practical as well as interesting theoretical problems. The basic pattern matching problems read as follows:

Problem 1.1 (δ-measure problem for Π). Given two patterns P, Q from a set of valid patterns Π, and a distance measure δ : Π × Π → R on Π. Compute the distance δ(P, Q).

Problem 1.2 (Equivalence problem for Π with respect to T). Given two patterns P, Q from a set of valid patterns Π, and a set of admissible transformations T of Π. Decide whether there is a transformation τ ∈ T such that τ(P) = Q.

Problem 1.3 (δ-matching problem for Π with respect to T). Given two patterns P, Q from a set of valid patterns Π, a distance measure δ : Π × Π → R on Π, and a set of admissible transformations T of Π. Find a transformation τ ∈ T such that δ(τ(P), Q) is as small as possible.

Needless to say, these problems and their variants have numerous applications in a wide variety of scientific disciplines, such as character recognition, geographical information systems, and computer aided design, to name just a few.

In the field of geometric pattern matching, the patterns Π are modelled by 'simple' geometric objects like point sets, sets of line segments, polygons, or polyhedral surfaces in d-dimensional Euclidean space R^d for some d > 0, with the cases d = 2, 3 being the most prominent choices (for obvious reasons). The distance measure that is used, like the Hausdorff distance δ_H (c.f. Definition 1.4 on page 7) or the Fréchet distance δ_F (c.f. Definition 3.10 on page 23), is usually a metric on Π (or on a larger class of sets), and the admissible transformations T are 'natural' transformations of the underlying space R^d like translations, Euclidean motions, or affine mappings. This provides a mathematically sound and rigorous foundation for the pattern matching problem — something that many 'classical' approaches like, e.g., neural nets, syntactic matching, or feature based techniques
fail to achieve. Moreover, it is possible to apply the powerful machinery and techniques of computational geometry; as it turns out, this is indeed a very good approach — the field has developed interesting connections to a variety of mathematical subjects and has produced nice algorithmic as well as mathematical results over the last years. The survey article of Alt and Guibas [15] provides an excellent and extensive overview; the interested reader is referred to this paper and the references therein.

We should note that in order to apply results from geometric pattern matching to problems arising in 'real world applications', where one has to handle, e.g., pixel images from a digital camera, point clouds from a laser scanner, or volumetric data from a magnetic resonance scanner, it is necessary to convert the input data into the abstract mathematical objects used to model the problem in a 'reasonable' manner. This gives rise to various interesting problems and questions and constitutes an active field of research on its own. However, we will not look into this preprocessing step in this thesis; instead we concentrate on the geometric pattern matching problem from a purely theoretical point of view and try to develop efficient algorithms for various incarnations of it.
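For finite point sets and a finite candidate set of transformations, the three basic problems above can be written down directly. The following Python sketch is purely illustrative (the function names and the toy distance measure are ours, not the thesis's); real instances range over a continuum of transformations, so the brute-force minimization only mirrors the definitions:

```python
def delta_sym(P, Q):
    # A toy distance measure on finite point sets: the size of the
    # symmetric difference (0 iff P = Q as sets).
    return len(set(P) ^ set(Q))

def measure_problem(P, Q, delta):
    # Problem 1.1: compute delta(P, Q).
    return delta(P, Q)

def equivalence_problem(P, Q, transformations):
    # Problem 1.2, over a finite candidate set T: is tau(P) = Q for some tau?
    return any(set(map(tau, P)) == set(Q) for tau in transformations)

def matching_problem(P, Q, delta, transformations):
    # Problem 1.3, over a finite candidate set T: return the tau in T
    # minimizing delta(tau(P), Q).
    return min(transformations, key=lambda tau: delta(list(map(tau, P)), Q))
```

For instance, with T a small set of horizontal translations, `matching_problem` returns the translation whose image of P best matches Q under `delta_sym`.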
Preliminaries

The model of computation we adopt in this thesis is the real RAM as proposed in [43]. It is a suitable adaptation of a classical machine model commonly used in algorithmics (the so-called random-access machine, RAM for short, described, e.g., in [5]) to the field of computational geometry. The main difference is that each register of the machine can hold a single real number, and that arithmetic operations, comparisons, as well as the application of analytic functions (like, e.g., k-th roots, trigonometric functions, exponentiation, or logarithms) are all available at unit cost.

Some of our algorithms are randomized; thus we assume that the machine provides access to a random bit generator. All randomized algorithms in this thesis are of the Las Vegas type, i.e., they always return the correct result, but their runtime is a random variable; when we specify the runtime of a randomized algorithm as a function of the input size only, we mean the expectation of that random variable.

Throughout this thesis the O_ǫ-notation will be employed to emphasize that the constants involved may depend on a parameter ǫ > 0[a]. A statement of the form 'We can compute ... in O_ǫ(T(n, ǫ)) time.' actually means 'We can compute ... in O_ǫ(T(n, ǫ)) time, for any ǫ > 0.'

Most of the major definitions are contained in the introductory sections of the individual parts. Some less important ones are provided only when needed and are dispersed throughout the text and footnotes. For easier reference the appendix contains an index and a glossary that repeats most of the crucial definitions. The reader is referred to the — by now — standard textbooks about computational geometry, like [43] and [32], for basic definitions, notions, and techniques from that field.
[a] To be more precise: T(n) = O_ǫ(f(n, ǫ)) iff there is a function C(ǫ) such that for every ǫ > 0 and all n we have T(n) ≤ C(ǫ) · f(n, ǫ).
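The Las Vegas convention can be illustrated by randomized selection: the returned value is always correct, and only the running time is a random variable (expected O(n) in the unit-cost model). The sketch below is our own illustration, not an algorithm from this thesis:

```python
import random

def quickselect(xs, k):
    """Las Vegas selection: returns the k-th smallest element (0-based).
    The answer is always correct; only the running time depends on the
    random pivot choices (expected linear time)."""
    xs = list(xs)
    while True:
        pivot = random.choice(xs)
        lo = [x for x in xs if x < pivot]
        eq = [x for x in xs if x == pivot]
        if k < len(lo):
            xs = lo                      # answer lies among the smaller elements
        elif k < len(lo) + len(eq):
            return pivot                 # the pivot itself is the answer
        else:
            k -= len(lo) + len(eq)       # recurse into the larger elements
            xs = [x for x in xs if x > pivot]
```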
Overview

This thesis is roughly organized as follows: it consists of three parts that (essentially) can be read independently of each other. Each of the parts deals with pattern matching problems for different kinds of geometric objects. More specifically, patterns are modelled with finite sets of points in Part I, with plane polygonal curves in Part II, and with polyhedral surfaces in Part III. Throughout this work P and Q will denote sets of m and n such 'simple' geometric objects. The introduction to each of the parts gives a short problem description and motivation and provides the necessary definitions. A survey of previous results on the problem under consideration follows, and finally a summary of our own results is given. Our main contributions are the following (we omit all necessary definitions and refer to the introductory sections of the individual parts):

• In chapter 2 we present an algorithm for the d-dimensional congruence test problem of finite point patterns that runs in O(n^⌈d/3⌉ log n) time (c.f. Theorem 2.2 on page 13).

• In chapter 3 we present efficient algorithms to measure the one-sided Hausdorff distance of a d-dimensional m-point set P to a set Q of n geometric objects of constant 'size' each. To be more precise, we look at the case where Q is a set of n semialgebraic sets in R^d, each of constant description complexity. We develop an algorithm to compute δ̃_H(P, Q) in O_ǫ(mn^ǫ log m + m^{1+ǫ−1/(2d−2)} n) randomized time (c.f. Theorem 3.8 on page 20).

• In chapter 4 we present exact and approximation algorithms to solve the δ_F-matching problem for polygonal curves with respect to translations. To be more precise, we describe an algorithm that solves the corresponding decision problem in O((mn)^3 (m+n)^2) time when we consider the Fréchet distance, and in O((mn)^3) time when we look at the weak Fréchet distance (c.f. Theorem 4.23 on page 37 and Theorem 4.24 on page 38).
We complement the exact solution with an O(ǫ^{−2} mn) time approximation algorithm that yields a Fréchet distance which differs from the optimum value by a factor of (1 + ǫ) only (c.f. Theorem 4.29 on page 40). To this end we describe reference points for the Fréchet distance, and use them to obtain the aforementioned approximation algorithms. We conclude with a negative result which shows that no such reference points exist for affine maps (c.f. Theorem 4.31 on page 41).

• In chapter 5 we show that for a certain class of curves — the so-called κ-straight curves — there is a close relationship between the Fréchet and the Hausdorff distance (c.f. Theorem 5.3 on page 44). The parameter κ ≥ 1, roughly speaking, measures how much these curves 'resemble' a straight line. This result gives rise to a randomized approximation algorithm that computes an upper bound on the Fréchet distance between two such curves that is off from the exact value by a multiplicative factor of (κ + 1). The algorithm runs in O((m + n) log^2(m + n) 2^{α(m+n)}) time (c.f. Theorem 5.6 on page 46 and Corollary 5.8 on page 49). We also provide an O(n log^2 n) time algorithm to decide, for any κ ≥ 1, whether a given polygonal curve on n vertices is κ-straight (c.f. Theorem 5.9 on page 49).
• In Part III we develop efficient algorithms for the δ_H-measure problem for simple polyhedral surfaces in R^3. We give an algorithm that computes δ_H(P, Q) in O_ǫ((mn)^{15/16+ǫ}(m^{17/16} + n^{17/16})) randomized time (c.f. Theorem 6.19 on page 68). For variants of the Hausdorff distance, which we get when using a polyhedral distance function, we obtain an algorithm that solves the decision problem in O_ǫ((m + n)^{2+ǫ} + mn(√m^{1+ǫ} + √n^{1+ǫ})) time (c.f. Theorem 6.12 on page 66).
All of these results improve upon earlier approaches to the problems under consideration; some of them even constitute the first non-trivial algorithmic solution. Most of the results presented in this thesis have been published in [24], [16], [17], [1], and [11].

Acknowledgements

I would like to thank a lot of people. The following individuals (in alphabetical order) contributed, in one way or another, to this thesis: Pankaj Kumar Agarwal, Helmut Alt, Peter Braß, Herbert Edelsbrunner, Stefan Felsner, Frank Hoffmann, Rolf Klein, Klaus Kriegel, Günter Rote, Micha Sharir, Carola Wenk.

First of all I have to thank Helmut Alt for the obvious things: he was an excellent advisor and very often much more than that. I am very grateful to Peter Braß and Günter Rote for sharing some of their insights with me and for asking the right questions at the right time. Thanks also go to my coauthors, from whom I learned so much: Pankaj Kumar Agarwal, Helmut Alt, Peter Braß, Alon Efrat, Frank Hoffmann, Rolf Klein, Klaus Kriegel, Günter Rote, Micha Sharir, and Carola Wenk. My colleagues from the work group 'Theoretical Computer Science' at the Freie Universität Berlin deserve credit for providing a constant source of inspiration and motivation, for patiently answering my stupid questions, and for putting up with my weird personality for so long. Special thanks go to Helmut Alt and Stefan Felsner for reading earlier drafts of this thesis. Last but definitely not least I want to thank my family for their love and their support.
Part I Point set pattern matching in d-dimensional space
A first and natural approach to model geometric patterns is to represent them by point sets in d-dimensional Euclidean space. Needless to say, the cases d = 2, 3 are the most prominent ones; in fact, geometric pattern matching problems for planar and spatial point sets have received considerable attention in the literature, see, e.g., the survey by Alt and Guibas [14] and the references therein. However, although probably less interesting from a practical point of view, in this part we will investigate two matching problems for point sets in higher dimensions.

First, in chapter 2, we consider the question whether two point sets P and Q, with m and n points (m ≤ n), respectively, in d-dimensional Euclidean space are congruent, i.e., whether there exists a rigid motion µ with µ(P) = Q. Recall that a rigid motion is obtained by combining a translation with a rotation and (possibly) a reflection. The congruence testing problem can be seen as a special case of the general pattern matching problem described in the introduction, where the distance measure is the discrete metric d_discr, with d_discr(P, Q) = 0 if P = Q and d_discr(P, Q) = 1 otherwise, and the set of admissible transformations is the set of rigid motions of R^d. We present an algorithm for the d-dimensional congruence test problem that runs in O(n^⌈d/3⌉ log n) time (c.f. Theorem 2.2 on page 13). The exponential dependence on d is somewhat unsatisfactory, since the best known lower bound (which already holds in dimension one) is Ω(n log n); some dependence on the dimension is unavoidable, however, since the congruence testing problem without restriction on the dimension is NP-hard, as is shown in [7]. Obviously the discrete metric is extremely sensitive to noise and omissions, and therefore congruence is usually too strong a notion for assessing the similarity of point patterns, especially in practical applications where the patterns arise from appropriately sampled real world data.
The Hausdorff distance is a commonly used similarity measure for geometric patterns that circumvents these problems (at least to some extent); for two sets P and Q it is the smallest δ such that P is completely contained in the δ-neighborhood of Q, and vice versa:

Definition 1.4 (Hausdorff distance, one-sided Hausdorff distance). Let P and Q be compact sets in R^d, and let ||z|| denote the Euclidean norm of z ∈ R^d. Then δ_H(P, Q) denotes the Hausdorff distance between P and Q, defined as

    δ_H(P, Q) := max{ δ̃_H(P, Q), δ̃_H(Q, P) },

with

    δ̃_H(P, Q) := max_{x∈P} min_{y∈Q} ||x − y||,

the one-sided Hausdorff distance from P to Q.
Intuitively speaking the ’pattern’ P has a small one-sided Hausdorff distance to Q if it is ’similar’ to a ’subpattern’ of Q. In chapter 3 we will present an efficient algorithm to measure the one-sided Hausdorff distance of a d-dimensional m-point set P to a set Q of n geometric objects of constant ’size’ each. As we already noted in the introduction this also can be seen as a special case of the general pattern matching problem, where the set of admissible transformations consists of the identity only.
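For finite point sets, Definition 1.4 translates directly into the following naive O(mn)-time Python sketch (floating-point arithmetic stands in for the exact real RAM; this is for illustration only). Note how the one-sided distance is asymmetric:

```python
from math import dist  # Euclidean norm ||x - y|| for point tuples

def one_sided_hausdorff(P, Q):
    # One-sided Hausdorff distance from P to Q:
    # max over x in P of the distance from x to its nearest point of Q.
    return max(min(dist(x, y) for y in Q) for x in P)

def hausdorff(P, Q):
    # Two-sided Hausdorff distance: the larger of the two one-sided distances.
    return max(one_sided_hausdorff(P, Q), one_sided_hausdorff(Q, P))
```

For P = {(0, 0)} and Q = {(0, 0), (3, 4)} the one-sided distance from P to Q is 0 (P is 'similar' to a subpattern of Q), while the two-sided distance is 5.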
To be more precise, we look at the case where Q is a set of n semialgebraic sets in R^d, each of constant description complexity. We develop an algorithm to compute δ̃_H(P, Q) in O_ǫ(mn^ǫ log m + m^{1+ǫ−1/(2d−2)} n) randomized time (c.f. Theorem 3.8 on page 20).

Recall that a set S ⊆ R^d is called semialgebraic if it can be defined by a polynomial expression, which is any finite boolean combination of atomic polynomial expressions; these in turn are of the form P(x) ≤ 0, where P ∈ R[x_1, ..., x_d] is a d-variate polynomial. The description complexity of a polynomial expression B involving the polynomials P_1, ..., P_N is the length of an encoding of that expression over a fixed finite alphabet, disregarding the length of the encoding of the coefficients of the P_i (which we can afford, since we work in the unit-cost model anyway). The description complexity of a semialgebraic set is the minimum description complexity of an expression defining that set. When we talk about algorithms that work on a set of n semialgebraic sets, each of constant description complexity, in time O(T(n)), we actually mean that for each constant C > 0 the runtime of these algorithms on a set of n semialgebraic sets, each of description complexity at most C, is O(T(n)); the constant hidden in the O-notation may depend on C.

The results in this part have partially been obtained in collaboration with Peter Braß. Some of the material has already been published in [24].
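As an illustration of the definition (the set and all names below are our own example, not from the thesis): the closed annulus 1/4 ≤ x² + y² ≤ 1 is semialgebraic, defined by a boolean combination of two atomic expressions P_i(x, y) ≤ 0; its description complexity is the length of that formula, with the coefficients not counted:

```python
def in_annulus(x, y):
    # Membership test for a semialgebraic set given as a boolean
    # combination of atomic polynomial expressions P(x) <= 0.
    p1 = x * x + y * y - 1.0      # P1 <= 0 : inside the unit circle
    p2 = 0.25 - (x * x + y * y)   # P2 <= 0 : outside the circle of radius 1/2
    return p1 <= 0.0 and p2 <= 0.0
```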
Chapter 2

Testing the congruence of point sets in R^d

The simplest point pattern matching problem is to decide whether two patterns are actually the same, up to congruence.

Problem 2.1 (Congruence testing problem for d-dimensional point sets). Given two point sets P, Q ⊆ R^d of n points each. Decide whether there exists a rigid motion µ such that µ(P) = Q.

This problem is well understood in dimensions at most three, where there are good algorithms which reach the asymptotic lower bound of Ω(n log n) [18, 21, 22, 37, 39, 48]. For higher dimensions, however, the situation is different: only dimension reduction methods are known, which lead to an O(n^{d−2} log n) algorithm by Alt et al. [18], a Monte Carlo O(n^⌊d/2⌋ log n) algorithm by Akutsu [7], an unpublished deterministic algorithm of the same complexity by Matoušek (mentioned in [7]), and the O(n^⌈d/3⌉ log n) algorithm described below (c.f. Theorem 2.2 on page 13).
2.1 The dimension reduction technique
A congruence that maps P to Q consists of two parts: a translation and a rotation. The translational part is easy to determine, since the image of the centroid c(P ) of P under that congruence has to be the centroid c(Q) of Q, and the centroid can be determined fast, i.e., in O(n) time. So each algorithm preprocesses the sets by translating P to P − c(P ) and Q to Q − c(Q), and searches for a rotation around 0 that maps one set on the other. Also the distances of the points to the centroid have to be preserved, so we can replace each point by a point on the unit sphere which carries the distance (or an ordered list of distances) of the point (or points) in that direction as a label. So in the following we have two sets P ′ , Q′ , each consisting of n labeled points (counting multiplicities) on the unit sphere with center 0, and we are looking for a label-preserving rotation that maps P ′ on Q′ .
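The preprocessing just described can be sketched as follows (our own illustrative code; exact real arithmetic is assumed on the real RAM, and the grouping of several points in the same direction into one multi-label list is omitted):

```python
import math

def normalize(points):
    # Translate the centroid to the origin and replace each point by its
    # direction on the unit sphere, labeled with its distance to the centroid.
    d, n = len(points[0]), len(points)
    c = [sum(p[i] for p in points) / n for i in range(d)]
    labeled = []
    for p in points:
        v = [p[i] - c[i] for i in range(d)]
        r = math.sqrt(sum(x * x for x in v))
        if r == 0.0:
            labeled.append((tuple(0.0 for _ in v), 0.0))  # the centroid itself
        else:
            labeled.append((tuple(x / r for x in v), r))
    return labeled
```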
In the two-dimensional case it is easy to reduce this problem to the classical substring matching problem: the points of P′ are cyclically ordered on the circle, and each is described by the pair of its label and the angle to the next point on the circle (which together is just some symbol in an alphabet). So starting at an arbitrary point and going around the circle once, we can encode P′ as a string in that alphabet; and going around the circle twice for Q′, we encode that set in another string. Then P and Q are congruent if and only if the string of P′ is a substring of the string for Q′, which can be tested in O(n log n) time.

The algorithm of Alt et al. [18] for the three-dimensional case (similarly also Sugihara [48]) again involves the creation of a combinatorial model and the solution of the problem on that model: each of the sets consists of n labeled points on the unit sphere, so its convex hull is some polyhedron, which we can describe by its edge graph augmented by edge labels carrying the length of the edge and the angular distance to the next edge around that vertex (i.e., the interior angle of the face at that vertex). This information completely specifies the polyhedron since, according to Cauchy's rigidity theorem, once the edge lengths and interior angles of each face are prescribed, there is at most one convex realization of a polyhedron. But the isomorphism of (labeled) polyhedral graphs (three-connected and planar) can be tested in O(n log n) time.

For higher dimensions no such direct method is known, but it is possible to reduce one higher-dimensional problem to a number of alternative lower-dimensional ones. The underlying idea is that if we know the correct image point y ∈ Q′ for some point x ∈ P′, then the subspace through x is mapped on the subspace through y, and the same holds for their orthogonal complements.
Thus we can orthogonally decompose each point of P′ and each point of Q′ into a pair of points (one on a line, one on the orthogonal hyperplane), and these have to be mapped on each other by the congruence. But the point on the line is uniquely identified by its signed distance to 0 (taking the directions of x, resp. y, as positive), so we can just append this number to the label of the point on the hyperplane (in addition to all previous labels of the original point), and we have reduced the original problem to a problem of points with longer labels which lie in a hyperplane. Thus if we know the correct images of k linearly independent points, we can reduce the dimension of the space by k, appending k additional labels (coordinates) to each point.

If we do not know the correct image of any point, the simplest way is to take a fixed x ∈ P′ and try each of the potential image points y ∈ Q′ in turn: if we are successful in one of the lower-dimensional problems, we can extend the solution to a solution of the original problem by removing that additional label and adding x (or y) times that label to each point; if we are not successful for any choice of y, then the sets P and Q are not congruent. This is the algorithm of Alt et al. [18], which runs in time O(n^{d−2} log n), reducing one d-dimensional problem of n points to n alternative (d−1)-dimensional problems in each step.

If we use the fact that any closest pair in P′ has to be mapped on a closest pair in Q′, and that the number of closest pairs is only linear in n (for fixed dimension d: the degree of the graph of closest pairs is bounded by a constant, the 'kissing number' or 'Newton number', which grows exponentially in d [46, 51]), we can reduce the problem in each step by two dimensions, to O(n) alternative (d−2)-dimensional problems. This is the
algorithm of Matoušek [7], which runs in O(n^⌊d/2⌋ log n) time.

If we try to extend this approach to triples, a difficulty arises which is caused by the inner symmetries of the point set: since the point k-tuples which define the alternative reductions are themselves defined by metric properties, the set of all these alternative reductions will be closed under inner isometries of the point set. But for dimension d ≥ 4 this set can be quite big, since rotation subgroups in orthogonal planes are possible. So, e.g., in dimension four the set consisting of two regular (n/2)-gons with common center 0, but in orthogonal planes, allows n²/4 rotations, and any full-dimensional triple of points (that could be used to reduce the dimension by three) will generate an orbit of Ω(n²) other possible reductions. Similar constructions work also in higher dimensions. This big number of alternative reductions is not really needed, since they all give the same result, but they have to be recognized.
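The orthogonal decomposition that drives all of these reductions, i.e. splitting each point into its signed coordinate along a fixed unit direction (appended as a label) and its component in the orthogonal hyperplane, can be sketched as follows (an illustration with our own names):

```python
def decompose(p, x):
    # Orthogonally decompose p with respect to the unit vector x:
    # returns the signed distance along x and the component of p
    # in the orthogonal hyperplane of x.
    s = sum(pi * xi for pi, xi in zip(p, x))          # projection length
    perp = tuple(pi - s * xi for pi, xi in zip(p, x))  # component in x's complement
    return s, perp
```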
2.2 A refined approach
The solution to this problem is to recognize whenever the point set lies in orthogonal subspaces (which can be matched independently), and to use an extended version of the smallest distance graph in the other cases, in such a way that we get a dimension reduction of three with a linear number of alternatives in each step.

For this purpose we add for each point x ∈ P′ the antipodal point −x, labeled as 'new', if it is not already in the set P′ (in the following we always perform the same operations for Q′). Then the smallest distance occurring between two antipodal pairs p, −p and q, −q on the unit sphere is at most √2, and √2 is reached if and only if p is orthogonal to q. We can extend this from two antipodal pairs to larger point sets by constructing a graph with the extended set as vertices, which has in the beginning the antipodal pairs as connected components. Any graph containing this graph as a subgraph has the property that the smallest distance occurring between points of distinct connected components is at most √2, and reaches this value if and only if the connected components lie in orthogonal subspaces. So we start from the antipodal pairs graph, and in each step extend the graph by all those edges between points of distinct components that are of minimum length among all such point pairs. Then we either find after some steps a connected component which spans a subspace of dimension at least three, which allows a dimension reduction, or we find that all the connected components lie in orthogonal two-dimensional subspaces, which can be treated independently. Algorithm Congruence Test is shown on page 12.

In the following, edges between two antipodal points will be called antipodal edges, whereas the other edges will be called strong edges; two points sharing such an edge will be called strongly adjacent or strong neighbours.
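The √2 bound is elementary: for unit vectors, ||p − q||² = 2 − 2⟨p, q⟩ and ||p + q||² = 2 + 2⟨p, q⟩, so the smallest distance between the pairs {p, −p} and {q, −q} is √(2 − 2|⟨p, q⟩|) ≤ √2, with equality iff p ⊥ q. A small numeric check (our own sketch, floating point in place of exact reals):

```python
import math

def antipodal_pair_distance(p, q):
    # Smallest Euclidean distance between the antipodal pairs {p, -p} and
    # {q, -q} of unit vectors: sqrt(2 - 2*|<p, q>|), which is at most sqrt(2),
    # with equality exactly when p and q are orthogonal.
    dot = abs(sum(pi * qi for pi, qi in zip(p, q)))
    return math.sqrt(max(0.0, 2.0 - 2.0 * dot))
```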
Note that — for the sake of simplicity — we omitted several checks that are performed in the course of the algorithm, e.g., testing all the time whether the two graphs have the same number of connected components, etc. If any of these checks fails, the algorithm gives a negative answer. In the following two sections we will prove that the algorithm is correct and that it runs within the claimed time bounds.
Algorithm Congruence Test(P, Q, d)
Input: Two sets P, Q of n labeled points each, in R^d.
Output: Decides whether there is a label-preserving congruence that maps P to Q.

 1. If the dimension is at most three, use one of the known O(n log n) algorithms to decide the problem.
 2. Preprocess P and Q to P′ and Q′ by moving the centroid to 0 and projecting each point onto the unit sphere around 0, with the projection distance appended as an additional label.
 3. Construct the extended sets P′′ and Q′′ by adding for each point x ∈ P′ and each point y ∈ Q′ the antipodal points −x, −y, labeled as 'new', if they were not already contained in the set. Construct the initial graphs for both sets by joining each point to its antipodal point.
 4. repeat
 5.   if there is only one component left
 6.   then (∗ c.f. Lemma 2.4 ∗)
 7.     Compute for P′′ and Q′′ the coordinates of the points in the two-dimensional linear subspaces spanned by the sets, and apply one of the known O(n log n) algorithms to decide the two-dimensional problem.
 8.   else
 9.     Determine the smallest distance δ_min between two points of distinct connected components.
10.     if δ_min = √2
11.     then (∗ c.f. Lemma 2.4 ∗)
12.       Decompose P′′ and Q′′ into their connected components, computing the coordinates of the points in the two-dimensional linear subspaces spanned by the components.
13.       for each matching of the P′′-components to the Q′′-components
14.       do
15.         for each component pair (α, β) in the matching
16.         do
17.           Apply one of the known O(n log n) algorithms for the two-dimensional case to decide whether α and β are congruent.
18.       if there is a matching such that each matched pair is congruent
19.       then (∗ c.f. Observation 2.3 ∗)
20.         P and Q are congruent
21.       else
22.         they are not congruent.
23.     else (∗ δ_min < √2 ∗)
24.       Add all edges of length δ_min that join points in distinct connected components to the graph.
25.       if there is a triple T_P = (point, neighbour_1, neighbour_2) of P′′ that spans a linear subspace of dimension three
26.       then
27.         for all triples T_Q = (point, neighbour_1, neighbour_2) of Q′′ that span a linear subspace of dimension three
28.         do
29.           If T_P and T_Q are congruent, perform the dimension reduction determined by T_P and T_Q, and recursively call Congruence Test for the (d−3)-dimensional problem.
30.         if one of the potential image triples from Q′′ gives congruent sets
31.         then
32.           P and Q are congruent
33.         else
34.           they are not congruent.
35. until a reduction is found, or the set is decomposed into orthogonal subsets.
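The two-dimensional base case invoked in steps 1, 7, and 17 is the string-matching reduction from Section 2.1. A simplified sketch (our own code: rotations only, floating-point angles rounded in place of exact reals, and naive substring search instead of an O(n log n) string matcher):

```python
import math

def circular_encoding(pts):
    # pts: list of (label, angle) for labeled points on the unit circle.
    # Encode the cyclic sequence of symbols (label, gap to the next point);
    # rounding the gaps stands in for exact real arithmetic.
    pts = sorted(pts, key=lambda t: t[1])
    n = len(pts)
    return [(pts[i][0],
             round((pts[(i + 1) % n][1] - pts[i][1]) % (2 * math.pi), 9))
            for i in range(n)]

def congruent_by_rotation(P, Q):
    # P and Q coincide after a rotation about 0 iff the encoding of P
    # occurs as a substring of the doubled encoding of Q.
    ep, eq = circular_encoding(P), circular_encoding(Q)
    if len(ep) != len(eq):
        return False
    eq2 = eq + eq
    return any(eq2[i:i + len(ep)] == ep for i in range(len(eq)))
```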
Theorem 2.2 (Correctness and complexity of algorithm Congruence Test). The algorithm Congruence Test is correct and runs in O(n^⌈d/3⌉ log n) time.
2.2.1 Correctness
The correctness of the algorithm is obvious if d ≤ 3; the claimed time bound also follows immediately in that case. It was already observed that once we end up with a set of connected components of pairwise distance √2, we have decomposed the original problem into orthogonal subproblems that can be solved independently:

Observation 2.3. Let P, Q ⊂ R^d be two finite d-dimensional point sets, and let U, V ⊆ R^d be two linear subspaces. Let U^ortho denote the orthogonal complement of U, and assume that

1. P = P_1 ∪ P_2, P_1 ⊆ U, P_2 ⊆ U^ortho,
2. Q = Q_1 ∪ Q_2, Q_1 ⊆ V, Q_2 ⊆ V^ortho,
3. P_1 = κ_1(Q_1) for a congruence κ_1 with κ_1|V^ortho = identity, and
4. P_2 = κ_2(Q_2) for a congruence κ_2 with κ_2|V = identity.

Then P = κ(Q) for the congruence κ = κ_1 ∘ κ_2.

Next we will show that the following invariant holds throughout the algorithm, just before δ_min is recomputed and edges are added (and components are merged), as long as no dimension reduction is found and δ_min is smaller than √2:

Lemma 2.4 (Invariant of algorithm Congruence Test). During the execution of algorithm Congruence Test the following invariant holds at the start of the repeat-loop: each connected component consists of points distributed on a great circle of the unit sphere, and is therefore planar.

Proof. We proceed by induction. In the beginning the invariant obviously holds. Now assume that δ_min is computed, δ_min < √2, and we add all minimum length edges between vertices in distinct connected components. We show that either a dimension reduction is found (i.e., the graph now contains three points spanning a three-dimensional subspace) or the invariant also holds after the merging step. To this end we look at two connected components c_1 and c_2 that are merged in the merging step. Note that due to the antipodal vertices, the process is symmetric, i.e., if vertex u becomes adjacent to vertex v in the merging process, then the antipodal vertices u′ and v′ also become adjacent in the same step.
By induction there are two great circles C1 and C2 containing these two components. We distinguish the following two cases: The two great circles C1 and C2 coincide. In that case the invariant clearly holds after the merging step.
The two great circles C_1 and C_2 differ. In that case the two circles share exactly two points. Let us assume that the merging process establishes a strong adjacency between p_1 ∈ c_1 and a vertex p_2 ∈ c_2.

If #c_1 = #c_2 = 2 then #(c_1 ∪ c_2) = 4, and so either the invariant holds after the merging step or the new graph contains three points spanning a three-dimensional subspace. Therefore we can assume in the following, wlog, that #c_1 ≥ 4, so p_1 has at least one strong neighbour p′_1 ∈ c_1.

In case p_2 ∈ c_2 − C_1, the new graph contains three points spanning a three-dimensional subspace. In case all the points of c_2 minimizing the distance to c_1 lie on C_1, we see that either

#c_2 = 2, i.e., c_2 = C_1 ∩ C_2, and so the invariant holds, or

#c_2 > 2, i.e., p_2 has at least one strong neighbour p′_2 ∈ c_2 − C_1, and therefore the new graph contains three points spanning a three-dimensional subspace.
The correctness of the algorithm immediately follows from our previous considerations: If we find a dimension reduction at some point, then the correctness follows by induction over the dimension, as was argued in the introduction. If we end up with a single connected component, the problem is actually only two-dimensional by Lemma 2.4, and if we end up with a set of connected components of pairwise distance √2 we have decomposed the original problem into two-dimensional orthogonal subproblems according to Lemma 2.4 that can be solved independently by Observation 2.3.
2.2.2 Analysis
To facilitate the analysis of the algorithm, we have to specify the implementation in some more detail. We store the edges of the complete graph on the point set (except the edges connecting antipodal vertices) in a list L, sorted by their length. The initialization time for this list is O(n^2 log n). Furthermore we use a union-find data structure C to store the points in each connected component. This structure is initialized with n sets, each containing two points (namely a point along with its antipodal point, i.e., C = {Cp | p ∈ P} and Cp = {p, −p}); this initial step takes O(n) time. Finally we keep the n × n adjacency matrix G of the modified smallest distance graph. It can be set up in O(n^2) time as Gp,q = 1 if q = −p and Gp,q = 0 otherwise. Throughout the algorithm, for each point p ∈ P the structure Cp will contain exactly those points that are in the same connected component as p, and L will contain all point pairs of distance at least δmin, i.e., (p, q) ∈ L ⇐⇒ ||p − q|| ≥ δmin.

In order to determine the new minimal intercomponent distance in step 9, we consider the length δmin of the first element of L. If δmin = √2 we have decomposed the original problem into orthogonal subproblems and the data structures need not be updated anymore. Otherwise we extract all edges of length δmin from L and store them in a list Lmin.
If l denotes the length of this list, the time for this step is bounded by O(l). For all edges (p, q) ∈ Lmin we first check whether Cp = Cq, i.e., if they join points in the same connected component before edges are added. If not, we make them adjacent in G. After that we update C by merging two different components in case we just added edges between them. The procedure we just described requires at most 2l find queries to C and at most l union operations on C, so the overall time required is of order O(l log* n). Now, although l can be as large as O(n) (see below), we see that the amortized cost for determining the minimal intercomponent distances in step 9 and maintaining the data structures L and C in step 24 is of order O(n^2 log* n), since each edge is ‘touched’ at most once in the course of the algorithm.

The task of projecting a set of points to the linear subspace it spans and computing a basis for that space, along with the appropriate coordinate computation, can be done in linear time. The parts of the algorithm that handle the case d ≤ 3 (step 1), the case of all points lying on a plane (step 7), and the case of orthogonal subproblems (steps 12–22), respectively, are called at most once during the execution of the algorithm. Step 1 as well as step 7 takes O(n log n) time. In steps 12–22 there are at most d components in each set, since the components are mutually orthogonal, so we have to try at most d! = O(1) possibilities for the matching, and for each matching we have to solve at most d = O(1) two-dimensional subproblems; the total time required for these steps is therefore O(n log n), too. The repeat loop is executed O(n^2) times and in each step O(n) edges are added (see below). But since each edge will be used only once in this process, the overall time spent for finding minimum-length edges and merging components is bounded by O(n^2 log* n), as argued above.
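As an illustration of this bookkeeping, here is a minimal Python sketch (names and data layout are hypothetical, and the antipodal-pair initialization is omitted): it pops one block of minimum-length edges from the sorted edge list and merges the components they connect via union-find.

```python
class UnionFind:
    """Union-find over the points; find uses path halving."""
    def __init__(self, points):
        self.parent = {p: p for p in points}

    def find(self, p):
        while self.parent[p] != p:
            self.parent[p] = self.parent[self.parent[p]]  # path halving
            p = self.parent[p]
        return p

    def union(self, p, q):
        """Merge the components of p and q; report whether they were distinct."""
        rp, rq = self.find(p), self.find(q)
        if rp == rq:
            return False
        self.parent[rp] = rq
        return True


def process_min_edges(L, C):
    """Extract all edges of minimum length delta_min from the sorted edge
    list L (entries (length, p, q)) and merge the components they join;
    returns (delta_min, number of merges)."""
    if not L:
        return None, 0
    delta_min = L[0][0]
    merges = 0
    while L and L[0][0] == delta_min:
        _, p, q = L.pop(0)
        if C.union(p, q):           # only edges between distinct components merge
            merges += 1
    return delta_min, merges
```

The sketch keeps only path halving for brevity; the O(l log* n) bound quoted in the analysis additionally requires union by rank.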
The number of recursive calls of the algorithm in step 29 is bounded by the number of triples TQ found in step 27. To prove the O(n^⌈d/3⌉ log n) time bound, we have to show that not too many recursive calls are generated. To be more precise, we prove that steps 24–34 generate only O(n) possible reductions. For this we look at the last time that δmin was recomputed and edges were added. From Lemma 2.4 we know that each connected component was planar before the edges were added, i.e., the points of the component were distributed on a great circle of the unit sphere. Now consider a point p from a component c that is merged with the components c1, . . . , ck (k ≥ 1) in the current step. In each component ci there is a set of points Pi such that ||p − p′|| = δmin for all p′ ∈ Pi and ||p − p′′|| > δmin for all p′′ ∈ ci − Pi. Furthermore the distances between points from different components are larger than δmin, i.e., for all i ≠ j and for all p′ ∈ Pi, p′′ ∈ Pj we have that ||p′ − p′′|| > δmin. If we pick a point pi from each Pi and place a sphere of radius δmin/2 around each pi we get an arrangement of k congruent mutually disjoint spheres touching the sphere of radius δmin/2 around p. The largest number of spheres in such a packing is called the ‘kissing number’ or ‘Newton number’ kd of dimension d, and is less than 3^d = O(1). This implies that a vertex from a component can only have new neighbours in at most kd = O(1) components. Furthermore it has at most two neighbours in each of these components, since a sphere around that point intersects the great circle containing the other connected component in at most two points, unless the center of the
sphere is in the subspace orthogonal to the plane spanned by the circle. But this is not possible, since δmin = √2 in that case, which we excluded in steps 12–22. We see that at most O(n) edges were added in this merging step and that each vertex has degree at most O(1), and therefore the number of triples in the nearest neighbour graph is bounded by O(n).

The time T(n, d) that algorithm Congruence Test needs to decide the congruence of a d-dimensional n-point set can be estimated as

T(n, d) ≤ c · n^2 log n + c · n · T(n, d − 3)

for a sufficiently large constant c > 0, where the first term accounts for the time spent in the loop and for the parts of the algorithm that handle the initialization and the base cases. By iterating this inequality it is easily seen that

T(n, d) ≤ k · c^k · n^{k+1} log n + c^k · n^k · T(n, d − 3k)

for all k ≥ 1. For k0 = ⌈(d − 3)/3⌉ we have that d − 3k0 ≤ 3 and thus T(n, d − 3k0) ≤ c · n log n (if c is sufficiently large), so that T(n, d) ≤ 2 · k0 · c^{k0+1} · n^{k0+1} log n. We see that T(n, d) = O(n^⌈d/3⌉ log n), which proves the claim.
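For completeness, the iteration behind this bound can be written out (a sketch using the same constant c as above):

```latex
\begin{aligned}
T(n,d) &\le c\,n^{2}\log n + c\,n\,T(n,d-3)\\
       &\le c\,n^{2}\log n + c\,n\bigl(c\,n^{2}\log n + c\,n\,T(n,d-6)\bigr)\\
       &\le \dots \le k\,c^{k}\,n^{k+1}\log n + c^{k}\,n^{k}\,T(n,d-3k).
\end{aligned}
```

Setting k0 = ⌈(d − 3)/3⌉ makes the remaining subproblem at most three-dimensional, and since k0 + 1 = ⌈d/3⌉ the two terms combine to the claimed O(n^⌈d/3⌉ log n).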
Chapter 3

Measuring the Hausdorff distance of point sets in R^d

In this chapter we will develop efficient algorithms to measure the one-sided Hausdorff distance of a d-dimensional point set P to a set Q of n geometric objects of constant ‘size’ each. To be more precise, in section 3.1, we look at the case where Q is a set of n semialgebraic sets in R^d, each of constant description complexity, i.e., we look at the following problem:

Problem 3.1 (δ˜H-measure problem: point set vs. semialgebraic set in R^d). Given a point set P ⊆ R^d of m points, and a set Q ⊆ R^d of n semialgebraic sets, each of constant description complexity. Compute δ˜H(P, Q).

Of course one can always compute the distance of each point x ∈ P to each set y ∈ Q in O(1) time (c.f., Lemma 3.3 on the next page) and thus solve the problem in O(mn) time; we will refer to this algorithm as the ‘brute-force approach’. We will present a randomized algorithm that computes δ˜H(P, Q) in Oǫ(mn^ǫ log m + m^{1+ǫ−1/(2d−2)} n) expected time (c.f., Theorem 3.8 on page 20). This is – to the best of our knowledge – the first result that constitutes an improvement upon the brute-force approach.
3.1 The one-sided Hausdorff distance of a point set to a semialgebraic set
In this section we look at the problem of computing the one-sided Hausdorff distance from a set of m points P in R^d to a set Q of n semialgebraic sets of constant description complexity each (c.f., page 8). We will first consider the corresponding decision problem, and apply a randomized technique afterwards to compute δ˜H(P, Q). The following result will turn out to be a crucial ingredient in our algorithms:

Theorem 3.2 (Point location among real algebraic varieties, Chazelle et al., [25]). Let T be a set of k d-variate polynomials of bounded degree. A data structure of size Oǫ(k^{2d−2+ǫ})
that allows O(log k) time queries among the varieties defined by the set T can be built in Oǫ(k^{2d−2+ǫ}) randomized expected time.

We first describe how to determine δ˜H(P, Q) by ‘brute force’ in O(mn) time by computing all pairwise distances from the points in P to the sets in Q. Later we will use this result as a subroutine in a randomized reduction that turns the algorithm for the decision problem into an algorithm that computes δ˜H(P, Q).

Lemma 3.3 (Computing δ˜H(P, Q) by brute force). We can compute δ˜H(P, Q) in O(mn) time.

Proof. It suffices to show how the distance d(p, Γ) between a semialgebraic set Γ ∈ Q of constant description complexity and a point p ∈ P can be computed. The claim then follows, since we can compute δ˜H(P, Q) by simply computing the mn distances from the points in P to the sets in Q. Let Γ(x) be a polynomial expression that defines Γ. Consider the following Tarski sentence[a]:

HDΓ,p(z) := ∀ǫ ∃y : Γ(y) ∧ ||p − y||^2 ≤ z^2 + ǫ^2.

When δ ∈ R satisfies HDΓ,p, this means that the distance between p and Γ is at most |δ|. We can transform HDΓ,p to prenex form and eliminate quantifiers with the algorithm of Collins, c.f., [30], to get a polynomial expression which we will also denote by HDΓ,p. The runtime of this algorithm is doubly exponential in d and polynomial in the number of polynomials forming the expression Γ(x). The expression HDΓ,p defines a semialgebraic set in R, i.e., a finite set of intervals of constant size, which we can compute in O(1) time with the algorithm of Theorem 3.2. From that set we can read off the smallest δ ≥ 0 for which HDΓ,p holds and return it as d(p, Γ). Since the description complexity of Γ (and therefore of all other semialgebraic sets derived from it) is independent of the input size (i.e., m and n), we can compute d(p, Γ) in O(1) time (where the hidden constant depends doubly exponentially on d), and the claimed time bound follows.

As we already mentioned, we will first consider the decision version of Problem 3.1:

Problem 3.4 (δ˜H-decision problem: point set vs. semialgebraic set in R^d). Given a point set P ⊆ R^d of m points, and a set Q ⊆ R^d of n semialgebraic sets, each of constant description complexity, and some δ ≥ 0. Decide whether δ˜H(P, Q) ≤ δ.

We have that δ˜H(P, Q) ≤ δ iff for each point in P there is a point of Q that is δ-close. Therefore it is reasonable to look at the set of all points that are δ-close to Q:
[a] A Tarski sentence consists of a polynomial expression that is prefixed by a finite number of existential (∃) and universal (∀) quantifiers.
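To make Lemma 3.3 concrete in a simple special case, suppose every set in Q is a line segment in the plane; then d(p, Γ) has a closed form and no quantifier elimination is needed. A Python sketch (names hypothetical):

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to segment ab: clamp the orthogonal projection
    of p onto the supporting line to the parameter range [0, 1]."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:                     # degenerate segment
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def one_sided_hausdorff(P, Q):
    """Brute-force one-sided Hausdorff distance from the point set P to the
    segment set Q: each of the mn distances is computed in O(1) time."""
    return max(min(point_segment_distance(p, a, b) for (a, b) in Q) for p in P)
```

For general semialgebraic sets the O(1) distance computation goes through the quantifier-elimination argument above instead of this closed form.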
Definition 3.5 (δ-neighborhood). Let Q be a compact set in R^d. Then nhδ(Q) denotes the δ-neighborhood of Q, defined as

nhδ(Q) := {x ∈ R^d | d(x, Q) ≤ δ}.

Our result is based on the following simple observation:

Observation 3.6. Let P, Q be compact sets in R^d, and δ > 0. Then the one-sided Hausdorff distance from P to Q is at most δ iff all points of P are contained in the δ-neighborhood of Q, i.e., δ˜H(P, Q) ≤ δ ⇐⇒ P ⊆ nhδ(Q).

The algorithm for the decision problem computes from Q a data structure that represents nhδ(Q) and allows efficient point-containment queries. Then it queries this data structure with all points of P and determines all points that are not contained in the δ-neighborhood – this will be needed to perform the randomized reduction that solves the optimization problem.

Lemma 3.7 (Computing δ˜H(P, Q) – decision problem). We can compute the set X = {x ∈ P | d(x, Q) > δ} in Oǫ(mn^ǫ + m^{1+ǫ−1/(2d−2)} n) randomized expected time.

Proof. We first describe a randomized algorithm that computes X in Oǫ(n^{2d−2+ǫ} + m log n) expected time, which we speed up with a simple batching technique afterwards. Let us first argue that for Γ ∈ Q the set nhδ(Γ) is semialgebraic and that it (i.e., a polynomial expression defining it) can be computed in O(1) time. To this end, let Q(x) be a polynomial expression that defines Γ and consider the following Tarski sentence:

NHΓ,δ(x) := ∃y : Q(y) ∧ ||x − y||^2 ≤ δ^2.
Obviously nhδ(Γ) = {x ∈ R^d | NHΓ,δ(x) holds}. We can eliminate the quantifier to obtain a polynomial expression which we will also denote by NHΓ,δ; it will be identified with the sequence (n^(i)_{Γ,δ}(x))_{1≤i≤b} of d-variate polynomials that form it. Since the description complexity of Γ is independent of the input size, the set NHΓ,δ (i.e., the polynomials n^(i)_{Γ,δ}) can be computed in O(1) time.

Let F = ∪_{Γ∈Q} {n^(i)_{Γ,δ}} denote the set of polynomials that appear in the atomic polynomial expressions forming the expressions NHΓ,δ. By the above reasoning this set can be computed in O(n) time. With the algorithm of Theorem 3.2 we can compute a point-location data structure of size Oǫ(n^{2d−2+ǫ}) in Oǫ(n^{2d−2+ǫ}) time for the arrangement of the varieties n^(i)_{Γ,δ} = 0 defined by F. The signs of all polynomials in F, and therefore the validity of each polynomial expression NHΓ,δ, is constant for each cell of the decomposition of R^d induced by these varieties. Thus a point-location query to this data structure determines whether the query point lies in nhδ(Q), and the set X can be computed by querying the data structure with all points in P. The overall runtime of this method is Oǫ(n^{2d−2+ǫ} + m log n). We gain a significant speedup with a simple batching technique. To this end distinguish the following cases:
m ≤ n^{2d−2}: We partition Q into g = ⌈n/m^{1/(2d−2)}⌉ groups of at most k = m^{1/(2d−2)} ≤ n sets each. For each group, we run the algorithm described above. The total time spent is Oǫ(g(k^{2d−2+ǫ} + m log k)) = Oǫ(m^{1+ǫ−1/(2d−2)} n).

n^{2d−2} ≤ m: In that case the runtime is O(mn^ǫ).

Theorem 3.8 (Computing δ˜H(P, Q)). We can compute δ˜H(P, Q) in Oǫ(mn^ǫ log m + m^{1+ǫ−1/(2d−2)} n) randomized expected time.

Proof. We follow a strategy similar to that proposed in [2]. Initially we set δ = 0 and X = P. Then we repeat the following steps until X becomes empty: Choose a random point x ∈ X and compute δ′ = δ˜H(x, Q) in O(n) time with the algorithm from Lemma 3.3. Set δ to max(δ, δ′). Now compute the set X′ = {x ∈ X | δ˜H(x, Q) > δ} in Oǫ(mn^ǫ + m^{1+ǫ−1/(2d−2)} n) time with the algorithm from Lemma 3.7. Finally set X to X′. Obviously the last value of δ will be δ˜H(P, Q). As is shown in [28], the expected number of iterations is O(log m), and therefore the expected time to compute δ˜H(P, Q) with this algorithm is Oǫ(mn^ǫ log m + m^{1+ǫ−1/(2d−2)} n).
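The random-sampling loop of this proof can be sketched as follows (Python; `dist_to_Q` is a hypothetical stand-in for the O(n)-time distance computation of Lemma 3.3, and the filtering comprehension plays the role of the point-location structure of Lemma 3.7):

```python
import random

def compute_delta(P, dist_to_Q):
    """Turn the decision/filter subroutine into an algorithm computing the
    one-sided Hausdorff distance: raise delta by the distance of a random
    unfinished point, then discard all points already within distance delta."""
    delta = 0.0
    X = list(P)
    while X:
        x = random.choice(X)                        # random point of X
        delta = max(delta, dist_to_Q(x))            # delta' = distance of x to Q
        X = [p for p in X if dist_to_Q(p) > delta]  # the new set X'
    return delta
```

The loop always terminates, since the chosen point itself is discarded; the expected number of rounds is O(log m), as cited from [28].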
Part II Matching of plane curves
Often two-dimensional shapes are given by the planar curves forming their boundaries.

Definition 3.9 (Curve, Closed curve). Let S0 denote the unit interval [0, 1], and S1 denote the unit circle R mod 2π.
1. A continuous mapping f : S0 → R^2 is called a curve. A polygonal curve is a curve P, such that there exists an n > 0, and for all 0 ≤ i < n each Pi := P|[i/n,(i+1)/n] is affine, i.e., P(i/n + λ/n) = (1 − λ)P(i/n) + λP((i + 1)/n) for all λ ∈ [0, 1]. The set of all curves will be denoted by K0.
2. A continuous mapping f : S1 → R^2 is called a closed curve. A closed polygonal curve is a closed curve P, such that there exists an n > 0, and for all 0 ≤ i < n each Pi := P|[2πi/n,2π(i+1)/n] is affine. The set of all closed curves will be denoted by K1.
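As a small illustration of part 1 of this definition, a Python sketch (hypothetical names) that evaluates a polygonal curve, given by its n + 1 vertices, at a parameter t ∈ [0, 1]:

```python
def eval_polygonal(V, t):
    """Evaluate the polygonal curve P with vertex list V = [P(0), ..., P(1)]
    at t in [0, 1]: on the piece [i/n, (i+1)/n] the curve is affine, so
    P(i/n + lam/n) = (1 - lam) * P(i/n) + lam * P((i+1)/n)."""
    n = len(V) - 1                      # number of affine pieces
    i = min(int(t * n), n - 1)          # piece containing t (clamp t = 1)
    lam = t * n - i                     # position within the piece
    return ((1 - lam) * V[i][0] + lam * V[i + 1][0],
            (1 - lam) * V[i][1] + lam * V[i + 1][1])
```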
The problem of measuring the similarity between two shapes then directly leads to the problem of comparing two planar curves. There are several possible distance measures to assess the ‘resemblance’ of the shapes, and there are also different kinds of transformations that are allowed to match them. In this Part, we will focus on the Fréchet distance δF and on the Hausdorff distance δH (c.f., Definition 1.4 on page 7) for polygonal curves.

Definition 3.10 (Fréchet distance, weak Fréchet distance). Let P : Sk → R^2 and Q : Sk → R^2 with k ∈ {0, 1} be (closed) curves. Then δF(P, Q) denotes the Fréchet distance between P and Q, defined as

δF(P, Q) := inf_{α, β : Sk → Sk} max_{t ∈ Sk} ||P(α(t)) − Q(β(t))||,
where α and β range over all continuous and increasing[b] bijections on Sk. If we only require the reparametrizations α and β to be continuous bijections (with the additional constraint that α(0) = 0, α(1) = 1, β(0) = 0, and β(1) = 1 in the case of curves), we get a distance measure that is called the weak Fréchet distance between P and Q, and is denoted by δ˜F(P, Q).

As a popular illustration of the Fréchet distance between two curves, suppose a man is walking his dog; he is walking on one curve, the dog on the other. Both are allowed to control their speed, but are not allowed to go backwards. Then the Fréchet distance of the curves is the minimal length of a leash that is necessary. In the case of two closed curves, both (man and dog) are not only allowed to control their speed but also to choose their starting point on the curve, in order to minimize the length of the leash. If both are also allowed to go backwards, we get the weak Fréchet distance between the two curves.

For any two curves P and Q the following inequality holds: δH(P, Q) ≤ δ˜F(P, Q) ≤ δF(P, Q), but as shown in Figures 3.1 and 3.2 neither the ratio between δH and δF, nor the ratio between δ˜F and δF, is bounded in general. In chapter 5 however, we will show that for a
Figure 3.1: Two curves with small Hausdorff distance having a large Fréchet distance.
Figure 3.2: Two curves with small weak Fréchet distance having a large Fréchet distance.
certain class of curves there is a close relationship between the Fréchet and the Hausdorff distance (c.f. Theorem 5.3 on page 44). These curves are defined with respect to a real-valued parameter κ ≥ 1, which – roughly speaking – measures how much they ‘resemble’ a straight line; they will be called κ-straight. Our result gives rise to a randomized approximation algorithm that computes an upper bound on the Fréchet distance between two such curves that is off from the exact value by a multiplicative factor of (κ + 1). The algorithm runs in O((m + n) log^2(m + n) 2^{α(m+n)}) time for given polygonal curves P, Q with m and n vertices (c.f., Corollary 5.8 on page 49). We also provide an O(n log^2 n) time algorithm to decide for any κ ≥ 1, if a given polygonal curve on n vertices is κ-straight (c.f. Theorem 5.9 on page 49).

[b] A bijection α : S1 → S1 is called increasing, if α|[0, α^{−1}(0)] and α|[α^{−1}(0), 2π] are increasing (considered as functions on [0, 2π]).

In chapter 4 we will present exact and approximation algorithms to compute a translation which, when applied to the first curve, minimizes the (weak) Fréchet distance to the second one. To be more precise, we describe an algorithm that solves the corresponding decision problem in O((mn)^3 (m + n)^2) (O((mn)^3) in case of δ˜F) time (c.f., Theorem 4.23 and Theorem 4.24 on page 38). We complement the exact solution with an O(ǫ^{−2} mn) time approximation algorithm that does not necessarily compute the optimal transformation, but one that yields a Fréchet distance which differs from the optimum value by a factor of (1 + ǫ) only, thus trading exactness for feasibility (c.f., Theorem 4.29 on page 40). The approximation algorithm is based on the concept of reference points; we conclude chapter 4
with a negative result that rules out the existence of such reference points for affine maps (c.f., Theorem 4.31). The results in this Part have partially been obtained in collaboration with Helmut Alt and Carola Wenk (chapters 4 and 5), and with Pankaj Agarwal, Rolf Klein, and Micha Sharir (chapter 5). Some of the material has already been published in [16], [17], and [1].
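The man-and-dog picture from the beginning of this Part has an especially simple discrete analogue, the discrete Fréchet distance, where both walk from vertex to vertex; it is not used in this thesis, but the following Python sketch may help fix the intuition of monotone reparametrizations:

```python
import math
from functools import lru_cache

def discrete_frechet(P, Q):
    """Discrete Frechet distance of the vertex sequences P and Q: the
    shortest leash with which man and dog can traverse their chains
    vertex by vertex, never moving backwards."""
    @lru_cache(maxsize=None)
    def c(i, j):
        leash = math.dist(P[i], Q[j])       # leash needed at this vertex pair
        if i == 0 and j == 0:
            return leash
        moves = []
        if i > 0:
            moves.append(c(i - 1, j))       # man advanced last
        if j > 0:
            moves.append(c(i, j - 1))       # dog advanced last
        if i > 0 and j > 0:
            moves.append(c(i - 1, j - 1))   # both advanced
        return max(min(moves), leash)
    return c(len(P) - 1, len(Q) - 1)
```

The dynamic program fills O(mn) states; the continuous Fréchet distance of Definition 3.10 is computed instead via the free-space diagram of chapter 4.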
Chapter 4

Matching polygonal curves with respect to the Fréchet distance

In this chapter we will develop exact and approximation algorithms for the following problem:

Problem 4.1 (Matching problem – optimization version). Given two polygonal curves P, Q ∈ K0. Find a translation τ such that δF(τ(P), Q) (δ˜F(τ(P), Q)) is as small as possible.

To be more precise, we will provide an approximation algorithm in section 4.3 which does not necessarily compute the optimal transformation, but one that yields a Fréchet distance which differs from the optimum value by a factor of (1 + ǫ). To this end, we observe that it is easy to generalize the notion of a reference point to the Fréchet metric and apply the machinery of reference point based matching. The algorithm will run in time O(ǫ^{−2} mn), where m and n denote the number of vertices of P and Q, respectively (c.f., Theorem 4.29 on page 40); it constitutes the first approximation algorithm for Problem 4.1. We conclude the section with a proof of the fact that there are no reference points for affine maps (c.f., Theorem 4.31 on page 41).

In section 4.2 we will consider the decision problem version of the exact matching problem:

Problem 4.2 (Matching problem – decision version). Given two polygonal curves P, Q ∈ K0, and δ ≥ 0. Decide whether there exists a translation τ such that δF(τ(P), Q) ≤ δ (δ˜F(τ(P), Q) ≤ δ).

We describe the first algorithm that solves the decision problem; it runs in O((mn)^3 (m + n)^2) (O((mn)^3)) time (c.f., Theorem 4.23 on page 37 and Theorem 4.24 on page 38). Efrat et al. [33] have independently developed an algorithm for the decision problem. However we should point out that the runtime they achieve is by a factor mn slower than ours; furthermore their result is rather complicated and relies on complex data structures. Their
work is based on [49], where an algorithm is presented that solves the decision problem for translations in a fixed direction in O((mn)^2 (m + n)) time.
4.1 Computing the Fréchet distance
Let us first describe how to compute the (weak) Fréchet distance between two polygonal curves, according to Alt and Godau, [13]. This will provide us with some of the main ingredients of our algorithm. For the remainder of this Part, P : [0, m] → R^2 and Q : [0, n] → R^2 will be polygonal curves, δ ≥ 0 is a fixed real parameter, and T2 denotes the set of planar translations. By slightly generalizing our notation, we will assume that a polygonal curve P is parametrized over [0, n], where n is the number of segments of P, and P|[i,i+1] is affine for all 0 ≤ i < n. A translation τ = ⟨(x, y) ↦ (x + δx, y + δy)⟩ ∈ T2 can be specified by the pair (δx, δy) ∈ R^2 of its parameters. The set of parameters of all translations in T2 is called the parameter space of T2, or translation space for short, and we identify T2 with its parameter space R^2. In the sequel we will use the notion of a free space which was introduced in [13]:

Definition 4.3 (Free space, Alt/Godau, [13]). The set Fδ(P, Q) := {(s, t) ∈ [0, m] × [0, n] | ||P(s) − Q(t)|| ≤ δ}, or Fδ for short, denotes the free space of P and Q. Sometimes we refer to [0, m] × [0, n] as the free space diagram; the feasible points p ∈ Fδ will be called ‘white’ and the infeasible points p ∈ [0, m] × [0, n] − Fδ will be called ‘black’ (for obvious reasons, c.f. Figure 4.1).

Consider [0, m] × [0, n] as composed of the mn cells Ci,j := [i − 1, i] × [j − 1, j], 1 ≤ i ≤ m, 1 ≤ j ≤ n. Then Fδ(P, Q) is composed of the mn free spaces for each pair of edges Fδ(Pi−1, Qj−1) = Fδ(P, Q) ∩ Ci,j. By L and R, we will denote the lower left and the upper right corner of Fδ, respectively, i.e., L := (0, 0), and R := (m, n). The following results from [13] describe the structure of the free space and link it to the problem of computing δF and δ˜F.

Lemma 4.4 (Properties of the free space, Alt/Godau, [13]).

1.
The free space of two line segments is the intersection of the unit square with an affine image of the unit disk, i.e., with an ellipse, possibly degenerated to the space between two parallel lines.

2. For polygonal curves P and Q we have δ˜F(P, Q) ≤ δ, exactly if there exists a path within Fδ(P, Q) from L to R.

3. For polygonal curves P and Q we have δF(P, Q) ≤ δ, exactly if there exists a path within Fδ(P, Q) from L to R which is monotone in both coordinates; such a path will be called bi-monotone.

Definition 4.5 (δF-path, δ˜F-path). A path π in Fδ from L to R will be called a δ˜F-path. If π is also bi-monotone, it will be called a δF-path.
For a proof of the Lemma we refer to [13]. Figure 4.1 shows polygonal curves P, Q, a distance δ, and the corresponding diagram of cells Ci,j with the free space Fδ. Observe that the curve π as a continuous mapping from S0 to [0, m] × [0, n] directly gives feasible reparametrizations, i.e., two reparametrizations α and β, such that max_{t∈S0} ||P(α(t)) − Q(β(t))|| ≤ δ.
Figure 4.1: Two polygonal curves P and Q and their free space diagram for a given δ. An example δF-path π in the free space is drawn bold.

For (i, j) ∈ {1, . . . , m} × {1, . . . , n} let Li,j := {i − 1} × [ai,j, bi,j] (Bi,j := [ci,j, di,j] × {j − 1}) be the left (bottom) line segment bounding Ci,j ∩ Fδ (see Figure 4.2). The segment αi,j := {(i − 1, y) | j − 1 ≤ y < ai,j} ⊆ F̄δ is called a bottom-spike, and the segment βi,j := {(i − 1, y) | bi,j ≤ y < j} ⊆ F̄δ is called a top-spike. Likewise, the segment γi,j := {(x, j − 1) | i − 1 ≤ x < ci,j} ⊆ F̄δ is called a left-spike, and the segment δi,j := {(x, j − 1) | di,j ≤ x < i} ⊆ F̄δ is called a right-spike. A bottom-spike αi,j and a top-spike βk,j will be called aligned, if ai,j = bk,j. Likewise, a left-spike γi,j and a right-spike δk,j will be called aligned, if ci,j = dk,j. By induction it can easily be seen that those parts of the segments Li,j and Bi,j which are reachable from L by a bi-monotone path in Fδ are also line segments. Using a dynamic programming approach one can compute them, and thus decide if δF(P, Q) ≤ δ. For details we refer to the proof of the following Theorem in [13]:

Theorem 4.6 (Computing the Fréchet distance (decision problem), Alt/Godau, [13]). One can decide in O(mn) time, whether δF(P, Q) ≤ δ (δ˜F(P, Q) ≤ δ).
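The dynamic program behind Theorem 4.6 can be sketched as follows (Python; a simplified illustration assuming both curves are given as vertex lists with at least two vertices, not the authors' implementation): compute the free interval of every cell boundary, then propagate the parts reachable from L by a bi-monotone path.

```python
import math

EPS = 1e-9

def free_interval(p, a, b, delta):
    """Parameter interval of {x on segment ab : ||x - p|| <= delta},
    clipped to [0, 1]; None if empty (one quadratic equation)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    ex, ey = a[0] - p[0], a[1] - p[1]
    A = dx * dx + dy * dy
    B = 2 * (dx * ex + dy * ey)
    C = ex * ex + ey * ey - delta * delta
    if A < EPS:                                   # degenerate segment
        return (0.0, 1.0) if C <= 0 else None
    disc = B * B - 4 * A * C
    if disc < 0:
        return None
    r = math.sqrt(disc)
    lo = max((-B - r) / (2 * A), 0.0)
    hi = min((-B + r) / (2 * A), 1.0)
    return (lo, hi) if lo <= hi + EPS else None

def frechet_decision(P, Q, delta):
    """Decide delta_F(P, Q) <= delta in O(mn) time by propagating the
    reachable parts of the free-space cell boundaries (cells are convex)."""
    m, n = len(P) - 1, len(Q) - 1                 # numbers of segments
    if math.dist(P[0], Q[0]) > delta or math.dist(P[-1], Q[-1]) > delta:
        return False                              # corners L and R must be free
    Lf = [[free_interval(P[i], Q[j], Q[j + 1], delta) for j in range(n)]
          for i in range(m + 1)]                  # free part of each left edge
    Bf = [[free_interval(Q[j], P[i], P[i + 1], delta) for i in range(m)]
          for j in range(n + 1)]                  # free part of each bottom edge
    LR = [[None] * n for _ in range(m + 1)]       # reachable part, left edges
    BR = [[None] * m for _ in range(n + 1)]       # reachable part, bottom edges
    LR[0][0], BR[0][0] = Lf[0][0], Bf[0][0]
    for j in range(1, n):                         # left boundary of the diagram
        f = Lf[0][j]
        if LR[0][j - 1] and LR[0][j - 1][1] > 1 - EPS and f and f[0] < EPS:
            LR[0][j] = f
    for i in range(1, m):                         # bottom boundary
        f = Bf[0][i]
        if BR[0][i - 1] and BR[0][i - 1][1] > 1 - EPS and f and f[0] < EPS:
            BR[0][i] = f
    for i in range(m):
        for j in range(n):
            left, bottom = LR[i][j], BR[j][i]
            f = Lf[i + 1][j]                      # right edge of cell (i, j)
            if f is not None:
                if bottom is not None:            # from below, all of f is reachable
                    LR[i + 1][j] = f
                elif left is not None and left[0] <= f[1] + EPS:
                    LR[i + 1][j] = (max(left[0], f[0]), f[1])  # monotone in y
            f = Bf[j + 1][i]                      # top edge of cell (i, j)
            if f is not None:
                if left is not None:
                    BR[j + 1][i] = f
                elif bottom is not None and bottom[0] <= f[1] + EPS:
                    BR[j + 1][i] = (max(bottom[0], f[0]), f[1])
    top, right = LR[m][n - 1], BR[n][m - 1]
    return (top is not None and top[1] > 1 - EPS) or \
           (right is not None and right[1] > 1 - EPS)
```

The per-cell propagation uses the convexity of the free space within a cell (Lemma 4.4); the weak variant would propagate in all four directions instead.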
As we have already mentioned, each (possibly clipped) ellipse in Fδ is the affine image of a unit disk. Each such ‘ellipse’ in Fδ varies continuously in δ; to be more precise, the cell boundaries ai,j(δ), bi,j(δ), ci,j(δ), and di,j(δ) are continuous functions of δ. This implies that when δ is as small as possible, i.e., δ = δF(P, Q) (δ = δ˜F(P, Q)), all δF-paths (δ˜F-paths) in the diagram have to contain some (in fact at least two) of the extremal points of the free space, i.e., the extreme points of the cell boundaries: ai,j, bi,j, ci,j, and di,j.

Definition 4.7 (Clamped path). Let π be a δ˜F-path in Fδ. We call π clamped in the j-th row (column) if ai,j = bk,j (cj,i = dj,k) for some i ≤ k, i.e., if αi,j and βk,j (γj,i and
Figure 4.2: Intervals of the free space on the boundary of a cell.
δj,k) are aligned, and (i, ai,j) as well as (k, bk,j) ((cj,i, i) as well as (dj,k, k)) lie on π. We call π horizontally (vertically) clamped if it is clamped in some j-th row (column). We call π clamped if it is horizontally or vertically clamped. We call π clamped in cell (j, i) if i = k.
Figure 4.3: The path π is horizontally clamped in the j-th row.

Figure 4.3 shows an excerpt of Fδ and a horizontally clamped path in the j-th row of the diagram. The following Lemma from [13] subsumes our observations.

Lemma 4.8 (Each δF-path in FδF(P,Q) is clamped, Alt/Godau, [13]).

1. If δ = δF(P, Q), then Fδ(P, Q) contains at least one δF-path, and each such path is clamped.

2. If δ = δ˜F(P, Q), then Fδ(P, Q) contains at least one δ˜F-path, and each such path is clamped in some cell.

A clamped path also has a geometric interpretation. Figure 4.4 shows the geometric situations that correspond to a (horizontally) clamped path. In case (a) the reparametrization (i.e., the path) maps the point P(i − 1) to the only point on the edge Qj that has
distance δ from P (i − 1). This corresponds to a δ˜F -path that is clamped in cell (i, j). In case (b) it maps the part of P between P (i − 1) and P (k − 1) to the only point on the edge Qj that has distance δ from P (i − 1) and P (k − 1). This corresponds to a δF -path that is clamped in the j-th row. These situations cover the case of horizontally clamped paths. The geometric situations that involve a vertically clamped passage are similar, with the roles of P and Q interchanged.
Figure 4.4: The geometric situations corresponding to a horizontally clamped path.
4.2 Minimizing the Fréchet distance
First we give a rough sketch of the basic idea of our algorithm: Assume that there is at least one translation τ≤ that moves P to a Fréchet distance of at most δ to Q. Then we can move P to a position τ= where the Fréchet distance to Q is exactly δ. According to Lemma 4.8 the free space diagram Fδ(τ=(P), Q) then contains at least one clamped path. As a consequence, one of the geometric situations from Figure 4.4 must occur. Therefore the set of translations that attain a Fréchet distance of exactly δ is a subset of the set of translations that realize at least one of those geometric situations. The set of translations that create a geometric situation involving the two different vertices P(i − 1) and P(k − 1) from P and the edge Qj from Q consists of two segments in transformation space, i.e., it can be described geometrically.

Now assume that the geometric situation from above is specified by the two vertices P(i − 1) and P(k − 1) and the edge Qj. When we move P in such a way that P(i − 1) and P(k − 1) remain at distance δ from a common point on an edge of Q (i.e., we shift P ‘along’ Q), we will preserve one geometric situation (namely the one involving P(i − 1) and P(k − 1) and some edge of Q). During this process the Fréchet distance may vary, but at some point we will reach a placement τ=′ where it becomes δ again. According to Lemma 4.8 the free space diagram Fδ(τ=′(P), Q) then also contains a clamped path and another geometric situation from Figure 4.4 must occur. So whenever there is a translation
that attains a Fréchet distance of exactly δ, there is a translation that realizes at least two such geometric situations. Moreover it can be shown that the number of translations that realize at least two such situations is only polynomial. After this informal description of the basic ideas let us go into more detail now. Let us first take a look at the free space Fδ(τ(P), Q) as τ varies over T2:

Lemma 4.9 (Continuity lemma I). δF(τ(P), Q) and δ˜F(τ(P), Q) are continuous functions of (the parameters of) τ.

Proof. Let δ := δF(P, Q) and consider two reparametrizations α and β of P and Q, such that ||P(α(t)) − Q(β(t))|| ≤ δ for all t ∈ [0, 1]. The triangle inequality implies that ||τ(P(α(t))) − Q(β(t))|| ≤ δ + ||τ|| for all t ∈ [0, 1], so that δF(τ(P), Q) ≤ δF(P, Q) + ||τ||. The same argument yields that δF(P, Q) = δF(τ^{−1}(τ(P)), Q) ≤ δF(τ(P), Q) + ||τ^{−1}|| = δF(τ(P), Q) + ||τ||, so that |δF(τ(P), Q) − δF(P, Q)| ≤ ||τ||, which proves the claim. The same reasoning applies for the weak Fréchet distance.

Lemma 4.10 (Continuity lemma II). The cell boundaries ai,j, bi,j, ci,j, and di,j in Fδ are continuous partial functions of (the parameters of) τ. The domains of these functions are closed sets.

Proof. Let us focus on a cell Ci,j of the free space diagram Fδ(τ(P), Q) as τ varies. We will show that ai,j and bi,j are continuous partial functions of (the parameters of) τ. A symmetric argument proves the claim for ci,j and di,j. Consider the point p = P(i − 1) and the segment s = Qj. Let pδ denote the circle with center p and radius δ. Recall that Li,j denotes the left boundary of the free space of the cell Ci,j. This boundary essentially corresponds to the intersection of pδ and s. To be more precise, a point y ∈ Li,j corresponds to a point q = Q(y) ∈ s with ||p − q|| ≤ δ, i.e., a point in pδ ∩ s, and vice versa.
Since ai,j and bi,j are defined whenever Li,j is non-empty, we see that the domain of these two functions is the set of all translations τ such that τ(pδ) and s intersect, or — equivalently — such that τ(p) lies in the Minkowski sum of s with a circle of radius δ, i.e., dom(ai,j) = dom(bi,j) = s ⊕ (δ·S¹) − p. This is a closed set. Of course pδ ∩ s changes continuously in τ.

The following result is an immediate consequence of the previous lemma:

Corollary 4.11. There exists ε > 0 such that for each translation τ with ||τ|| < ε the following holds for all i, j, k with i ≤ k:
1. If Li,j = ∅, then Li,j(τ) = ∅.
2. If Li,j ≠ ∅ and Lk,j ≠ ∅, then
(a) Li,j(τ) = ∅ and ai,j = bi,j, or
(b) Lk,j(τ) = ∅ and ak,j = bk,j, or
(c) Li,j(τ) ≠ ∅ and Lk,j(τ) ≠ ∅.
3. If Li,j ≠ ∅, Lk,j ≠ ∅, Li,j(τ) ≠ ∅, Lk,j(τ) ≠ ∅, and
(a) ai,j > bk,j, then ai,j(τ) > bk,j(τ).
(b) ai,j < bk,j, then ai,j(τ) < bk,j(τ).
The corresponding statement holds for ci,j, dk,j, Bi,j, and Bk,j.

Definition 4.12 (Configuration). A triple c = (p, p′, s) that consists of two (not necessarily distinct) vertices p and p′ of P (Q) and an edge s of Q (P) is called an h-configuration (v-configuration) of P and Q. A configuration is an h-configuration or a v-configuration. If p = p′ then c is called degenerate; otherwise it is called non-degenerate. Let τ be a translation. If c is an h-configuration, then τ(c) := (τ(p), τ(p′), s), whereas if c is a v-configuration, then τ(c) := (p, p′, τ(s)).

Let us briefly rephrase the statement of Lemma 4.8 and the discussion following it:

Observation 4.13.
1. If δ = δF(P, Q), then there is a configuration c = (p, p′, s) of P and Q such that there exist at most two points q ∈ s with ||p − q|| = ||p′ − q|| = δ.
2. If δ = δ̃F(P, Q), then there is a degenerate configuration c = (p, p, s) of P and Q such that there exist at most two points q ∈ s with ||p − q|| = δ.

Definition 4.14 (δ–critical translation). A translation τ is called δ–critical for a configuration c of P and Q, if τ(c) = (p, p′, s) and there exist at most two points q ∈ s such that ||p − q|| = ||p′ − q|| = δ. A translation is called δ–critical if it is δ–critical for some configuration. The set of all translations that are δ–critical for c will be denoted by Tδcrit(c).

Observation 4.15. Let c = (P(i − 1), P(k − 1), Qj), with i ≤ k, be an h-configuration. Then Tδcrit(c) ⊆ dom(ai,j) ∩ dom(bk,j) and ai,j(τ) = bk,j(τ) for all τ ∈ Tδcrit(c).

Of course the statement of this observation remains true if we consider v-configurations (where the roles of P and Q are interchanged).

Lemma 4.16 (Characteristic feasible translations – weak version).
If there is a translation τ≤ such that δF(τ≤(P), Q) ≤ δ (δ̃F(τ≤(P), Q) ≤ δ), then there is a δ–critical translation τ= such that δF(τ=(P), Q) = δ (δ̃F(τ=(P), Q) = δ).

Proof. We formulate the proof in terms of the Fréchet distance only; the proof for the weak Fréchet distance can be copied verbatim. Pick any translation τ> such that δF(τ>(P), Q) > δ. Since δF(τ(P), Q) is a continuous function of τ according to Lemma 4.9, there exists a translation τ= on any curve between τ≤ and τ> in translation space such that δF(τ=(P), Q) = δ. By Observation 4.13 the translation τ= is δ–critical for some configuration.
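The quantities ai,j and bi,j used throughout these proofs are simply the endpoints of the intersection pδ ∩ s, expressed in the parameter of the segment s. A minimal sketch of this computation (Python; written by us for illustration, the function name is ours, not code from the thesis):

```python
import math

def cell_boundary(p, a, b, delta):
    """Free interval on a cell boundary: the parameters t in [0, 1] for
    which the point a + t*(b - a) on the segment s = ab lies inside the
    circle of radius delta around p.  Returns (a_ij, b_ij), or None if
    the boundary L_ij is empty."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    ex, ey = a[0] - p[0], a[1] - p[1]
    # |p - (a + t*(b-a))|^2 <= delta^2 is the quadratic A t^2 + B t + C <= 0
    A = dx * dx + dy * dy
    B = 2 * (ex * dx + ey * dy)
    C = ex * ex + ey * ey - delta * delta
    disc = B * B - 4 * A * C
    if A == 0 or disc < 0:          # degenerate segment, or circle misses the line
        return None
    r = math.sqrt(disc)
    lo = max((-B - r) / (2 * A), 0.0)
    hi = min((-B + r) / (2 * A), 1.0)
    return (lo, hi) if lo <= hi else None
```

For a translated curve one would simply pass the translated point τ(p); continuity of (a_ij, b_ij) in τ, as asserted by Lemma 4.10, is visible in the formulas.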
This result states that in order to check whether there is a translation that moves P into Fréchet distance at most δ to Q, it is sufficient to check all the δ–critical translations. So let us take a closer look at the set of δ–critical translations in T2: Let c = (P(i − 1), P(k − 1), Qj), with i ≤ k, be an h-configuration. Consider the points p = P(i − 1), p′ = P(k − 1) and the segment s = Qj.

First assume that i < k, i.e., c is non-degenerate. By definition a translation τ is δ–critical for c iff there exist at most two points q ∈ s such that ||τ(p) − q|| = ||τ(p′) − q|| = δ, i.e., one of the two points of τ(pδ) ∩ τ(p′δ) = τ(pδ ∩ p′δ) lies on s. So if pδ ∩ p′δ = {r, r′} we have that Tδcrit(c) = {τ | τ(r) ∈ s or τ(r′) ∈ s} = (s − r) ∪ (s − r′). Thus for a non-degenerate configuration (which corresponds to case (b) in Figure 4.4) the set of δ–critical translations is described by two parallel line segments in translation space, where each line segment is a translate of s.

In case that i = k, i.e., c is degenerate, a translation τ is δ–critical for c iff d(τ(p), s) = δ, i.e., the circle τ(pδ) 'touches' s, or — equivalently — τ(p) lies on the boundary of the Minkowski sum of s with a circle of radius δ, i.e., Tδcrit(c) = bd(s ⊕ (δ·S¹)) − p = bd(dom(ai,j)). Thus, for a degenerate configuration (which is case (a) in Figure 4.4) the set of δ–critical translations is described by a 'racetrack' in translation space, which is the locus of all points having distance δ to a translate of s. Such a 'racetrack' consists of two parallel line segments, each a translate of s, and two semicircles of radius δ. Again a similar statement remains true if we consider v-configurations (where the roles of P and Q are interchanged). These results are subsumed in the following:

Observation 4.17.
For a non-degenerate configuration involving the segment s, the set of δ–critical translations consists of two parallel line segments in translation space, each a translate of s. For a degenerate configuration involving the segment s, the set of δ–critical translations consists of two parallel line segments, each a translate of s, and two semicircles of radius δ. The critical translation sets of two different configurations either overlap or intersect in at most four points.

Lemma 4.18 (Characteristic feasible translations for δF – strong version). Let c be a configuration and assume that there is a translation τ= ∈ Tδcrit(c) such that δF(τ=(P), Q) = δ. Let s be the segment or semicircle of Tδcrit(c) containing τ= and assume that for the two endpoints τ1, τ2 of s we have δF(τ1(P), Q) > δ and δF(τ2(P), Q) > δ. Then there is a configuration c′ ≠ c with |Tδcrit(c) ∩ Tδcrit(c′)| ≤ 4, and a translation τ=′ ∈ Tδcrit(c) ∩ Tδcrit(c′), such that δF(τ=′(P), Q) = δ.

Proof. Let us assume w.l.o.g. that c = (P(i − 1), P(k − 1), Qj), with i ≤ k, is an h-configuration; the case that c is a v-configuration is analogous. It follows from Lemma 4.9 that the set {τ | δF(τ(P), Q) ≤ δ} is closed. Therefore the set s≤ := {τ ∈ s | δF(τ(P), Q) ≤ δ} consists of segments or circular arcs on s whose endpoints also belong to s≤ (loosely speaking, s≤ is a closed subset of s). Let s0≤ denote
4.2 Minimizing the Fr´ echet distance
35
the set of these endpoints; this set is non-empty since τ= ∈ s≤. Lemma 4.9 also implies that for any τ ∈ s0≤ we have δF(τ(P), Q) = δ. Pick any τ=′ ∈ s0≤. Since {τ1, τ2} ∩ s≤ = ∅ there is an ε′ > 0 such that for all 0 < µ < ε′ there is a τ> ∈ s \ s≤ with ||τ=′ − τ>|| ≤ µ. We have for all τ ∈ s that (c.f. Observation 4.15)

    Li,j(τ) ≠ ∅, Lk,j(τ) ≠ ∅, and ai,j(τ) = bk,j(τ).    (4.1)
Let ε > 0 be such that a translation that is at most ε away from τ=′ does not change the relative position of any spikes that are not aligned. Such an ε indeed exists according to Corollary 4.11. We can conclude that for all 0 < µ < min(ε, ε′) there exists some τ> ∈ s \ s≤ with ||τ=′ − τ>|| ≤ µ that does not change the relative position of any spikes that are not aligned (in the sense of Corollary 4.11). Since Fδ(τ>(P), Q) does not contain any δF-paths, the diagram must differ from Fδ(τ=′(P), Q). The only possibilities that prevent such paths are:
1. Lr,t(τ=′) ≠ ∅, but Lr,t(τ>) = ∅ for some r, t, or
2. ar,s(τ=′) = bt,s(τ=′), but ar,s(τ>) > bt,s(τ>) for some r, s, t, or
3. Br,t(τ=′) ≠ ∅, but Br,t(τ>) = ∅ for some r, t, or
4. cr,s(τ=′) = dt,s(τ=′), but cr,s(τ>) > dt,s(τ>) for some r, s, t.
In the first case Corollary 4.11 yields that ar,t(τ=′) = br,t(τ=′), and our considerations (4.1) above show that (r, t) ≠ (i, j) and (r, t) ≠ (k, j). Thus we get a new degenerate h-configuration c′ = (P(r − 1), P(r − 1), Qt) for which τ=′ is critical. In the second case our considerations (4.1) above show that (r, s, t) ≠ (i, j, k). Thus we get a new h-configuration c′ = (P(r − 1), P(t − 1), Qs) for which τ=′ is critical. In either case τ> ∉ Tδcrit(c′), so in the neighbourhood of τ=′ the two sets Tδcrit(c) and Tδcrit(c′) do not overlap. Since τ=′ ∈ Tδcrit(c) ∩ Tδcrit(c′) we can conclude that they do not overlap at all. The other two cases are analogous and yield a v-configuration c′ for which τ=′ is critical.

The proof of this lemma is easily modified to yield the corresponding result for the weak Fréchet distance; in fact only the cases 1 and 3 need to be taken into account. This yields:

Lemma 4.19 (Characteristic feasible translations for δ̃F – strong version). Let c be a degenerate configuration and assume that there is a translation τ= ∈ Tδcrit(c) such that δ̃F(τ=(P), Q) = δ.
Let s be the segment or semicircle of Tδcrit(c) containing τ= and assume that for the two endpoints τ1, τ2 of s we have δ̃F(τ1(P), Q) > δ and δ̃F(τ2(P), Q) > δ. Then there is a degenerate configuration c′ ≠ c with |Tδcrit(c) ∩ Tδcrit(c′)| ≤ 4, and a translation τ=′ ∈ Tδcrit(c) ∩ Tδcrit(c′), such that δ̃F(τ=′(P), Q) = δ.
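The two translates (s − r) and (s − r′) that make up the δ–critical translations of a non-degenerate configuration can be computed directly from the intersection points r, r′ of the circles pδ and p′δ. A small sketch (Python; ours, not from the thesis, and it assumes the two circles really do intersect):

```python
import math

def circle_intersections(p, q, delta):
    """Intersection points of the two circles of radius delta centered
    at p and q; one or two points, or [] if the circles do not meet."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    d = math.hypot(dx, dy)
    if d == 0 or d > 2 * delta:
        return []
    h = math.sqrt(max(delta * delta - (d / 2) ** 2, 0.0))
    mx, my = p[0] + dx / 2, p[1] + dy / 2   # midpoint of the axis pq
    ux, uy = -dy / d, dx / d                # unit normal of the axis
    pts = [(mx + h * ux, my + h * uy)]
    if h > 0:
        pts.append((mx - h * ux, my - h * uy))
    return pts

def critical_translations(p, pp, s, delta):
    """delta-critical translations of the non-degenerate configuration
    (p, pp, s): the segments (s - r) and (s - r'), each given by its
    two endpoints in translation space."""
    return [((s[0][0] - r[0], s[0][1] - r[1]),
             (s[1][0] - r[0], s[1][1] - r[1]))
            for r in circle_intersections(p, pp, delta)]
```

As stated in Observation 4.17, the result is a pair of parallel translates of s; a degenerate configuration would additionally contribute the two semicircular caps of the 'racetrack'.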
The previous two results show that the translations that correspond to the endpoints of s deserve special attention:

Definition 4.20 (δ–supercritical translation). Let c be a configuration and assume that Tδcrit(c) is the disjoint union of the sets h, h′, s, and s′, where s and s′ are parallel line segments, and h and h′ are either empty (if c is non-degenerate) or semicircles of radius δ (if c is degenerate). The translations that correspond to the endpoints of s and s′ will be called δ–supercritical. By T′δcrit(c) we denote the set of the four translations that are δ–supercritical for c.

Lemma 4.21 below summarizes our considerations: In order to decide whether there is a translation τ such that δF(τ(P), Q) ≤ δ, we can restrict our attention to translations that are δ–critical for two different configurations or that are δ–supercritical.

Lemma 4.21 (Characteristic feasible translations for δF). If there is a translation τ≤ such that δF(τ≤(P), Q) ≤ δ, then
1. either there is a configuration c and a translation τ≤′ ∈ T′δcrit(c), such that δF(τ≤′(P), Q) ≤ δ,
2. or there are two different configurations c ≠ c′ with |Tδcrit(c) ∩ Tδcrit(c′)| ≤ 4, and a translation τ=′ ∈ Tδcrit(c) ∩ Tδcrit(c′), such that δF(τ=′(P), Q) = δ.
Proof. By Lemma 4.16 there is a translation τ= with δF(τ=(P), Q) = δ that is δ–critical for some configuration c, i.e., τ= ∈ Tδcrit(c). Let s be the segment or semicircle of Tδcrit(c) containing τ= and let τ1, τ2 ∈ T′δcrit(c) be the two endpoints of s.
1. If δF(τ1(P), Q) ≤ δ or δF(τ2(P), Q) ≤ δ, the claim of part 1 follows with τ≤′ := τ1 or τ≤′ := τ2, respectively.
2. If δF(τ1(P), Q) > δ and δF(τ2(P), Q) > δ, the claim of part 2 follows from Lemma 4.18.
Again, the proof of this lemma is easily modified to yield the corresponding result for the weak Fréchet distance:

Lemma 4.22 (Characteristic feasible translations for δ̃F). If there is a translation τ≤ such that δ̃F(τ≤(P), Q) ≤ δ, then
1. either there is a degenerate configuration c and a translation τ≤′ ∈ T′δcrit(c), such that δ̃F(τ≤′(P), Q) ≤ δ,
2. or there are two different degenerate configurations c ≠ c′ with |Tδcrit(c) ∩ Tδcrit(c′)| ≤ 4, and a translation τ=′ ∈ Tδcrit(c) ∩ Tδcrit(c′), such that δ̃F(τ=′(P), Q) = δ.
So in order to solve the decision problem for a given δ it is sufficient to check all translations τ that correspond either to δ–supercritical translations or to the intersection of the non-overlapping δ–critical translation sets of two distinct configurations.

Algorithm Fréchet Match(P, Q, δ)
Input: Two polygonal curves P, Q ∈ K0, and δ ≥ 0.
Output: Decides whether there exists a translation τ such that δF(τ(P), Q) ≤ δ.
1. for all configurations c
2.   do Compute τ′crit, the δ–supercritical translations for c
3.      for τ ∈ τ′crit
4.        do if δF(τ(P), Q) ≤ δ
5.             then return (True)
6. for all pairs of configurations c and c′
7.   do Compute τcrit, the δ–critical translations for c
8.      Compute τ′crit, the δ–critical translations for c′
9.      for τ ∈ τcrit ∩ τ′crit
10.       do if δF(τ(P), Q) ≤ δ
11.            then return (True)
12. return (False)

The correctness of Algorithm Fréchet Match follows from Lemma 4.21. The complexity of the algorithm depends on the number of configurations. There are O(m²n) many h-configurations and O(n²m) many v-configurations. We thus have altogether O((mn)²(m + n)²) translations, for each of which we check in O(mn) time whether it brings P into Fréchet distance at most δ to Q. This solves Problem 4.2 for the Fréchet distance and yields the following theorem:

Theorem 4.23 (Correctness and complexity of Fréchet Match). In O((mn)³(m + n)²) time Algorithm Fréchet Match decides whether there is a translation τ ∈ T2 such that δF(τ(P), Q) ≤ δ.

We can copy Algorithm Fréchet Match almost verbatim to devise a procedure for solving the corresponding decision problem for the weak Fréchet distance.

Algorithm Weak Fréchet Match(P, Q, δ)
Input: Two polygonal curves P, Q ∈ K0, and δ ≥ 0.
Output: Decides whether there exists a translation τ such that δ̃F(τ(P), Q) ≤ δ.
1. for all degenerate configurations c
2.   do Compute τ′crit, the δ–supercritical translations for c
3.      for τ ∈ τ′crit
4.        do if δ̃F(τ(P), Q) ≤ δ
5.             then return (True)
6. for all pairs of degenerate configurations c and c′
7.   do Compute τcrit, the δ–critical translations for c
8.      Compute τ′crit, the δ–critical translations for c′
9.      for τ ∈ τcrit ∩ τ′crit
10.       do if δ̃F(τ(P), Q) ≤ δ
11.            then return (True)
12. return (False)
The correctness of Algorithm Weak Fréchet Match follows from Lemma 4.22. Since there are only O(mn) many degenerate configurations, we altogether have O((mn)²) translations, for each of which we check in O(mn) time whether it brings P into weak Fréchet distance at most δ to Q.

Theorem 4.24 (Correctness and complexity of Weak Fréchet Match). The algorithm Weak Fréchet Match decides in O((mn)³) time whether there is a translation τ ∈ T2 such that δ̃F(τ(P), Q) ≤ δ.

In order to find a translation that minimizes the (weak) Fréchet distance between the two polygonal curves one can apply the parametric search paradigm of Megiddo [40] and the improvement by Cole [29]. Details can be found in [17] and [50]; the latter also gives a generalization of our technique to other sets of transformations, like, e.g., rigid motions.
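Both matching procedures rely on the decision test for a fixed translation, i.e., checking δF(τ(P), Q) ≤ δ via the free space diagram in O(mn) time. The following compact sketch (Python; our own simplified transcription of the standard free-space reachability test, not code from the thesis) propagates reachable intervals cell by cell, exploiting the convexity of the free space inside each cell:

```python
import math

EPS = 1e-9

def free_interval(p, a, b, delta):
    # Parameters t in [0, 1] with |p - (a + t*(b - a))| <= delta, or None.
    dx, dy = b[0] - a[0], b[1] - a[1]
    ex, ey = a[0] - p[0], a[1] - p[1]
    A = dx * dx + dy * dy
    B = 2 * (ex * dx + ey * dy)
    C = ex * ex + ey * ey - delta * delta
    disc = B * B - 4 * A * C
    if A < EPS or disc < 0:
        return None
    r = math.sqrt(disc)
    lo = max((-B - r) / (2 * A), 0.0)
    hi = min((-B + r) / (2 * A), 1.0)
    return (lo, hi) if lo <= hi + EPS else None

def decide_frechet(P, Q, delta):
    # Is the Frechet distance of the polygonal curves P and Q at most delta?
    n, m = len(P) - 1, len(Q) - 1
    if math.dist(P[0], Q[0]) > delta + EPS or math.dist(P[-1], Q[-1]) > delta + EPS:
        return False
    # Free intervals: L[i][j] on vertical boundaries (vertex P[i] vs. edge Q[j]),
    # Bo[i][j] on horizontal boundaries (vertex Q[j] vs. edge P[i]).
    L = [[free_interval(P[i], Q[j], Q[j + 1], delta) for j in range(m)]
         for i in range(n + 1)]
    Bo = [[free_interval(Q[j], P[i], P[i + 1], delta) for j in range(m + 1)]
          for i in range(n)]
    RL = [[None] * m for _ in range(n + 1)]    # reachable parts of L
    RB = [[None] * (m + 1) for _ in range(n)]  # reachable parts of Bo
    # Climb the left and bottom boundaries of the diagram from the origin.
    for j in range(m):
        f = L[0][j]
        if f is None or f[0] > EPS or (j > 0 and RL[0][j - 1][1] < 1 - EPS):
            break
        RL[0][j] = f
    for i in range(n):
        f = Bo[i][0]
        if f is None or f[0] > EPS or (i > 0 and RB[i - 1][0][1] < 1 - EPS):
            break
        RB[i][0] = f
    # Propagate reachability; within a cell the free space is convex.
    for i in range(n):
        for j in range(m):
            fr, ft = L[i + 1][j], Bo[i][j + 1]   # right / top cell boundary
            if fr is not None:
                if RB[i][j] is not None:
                    RL[i + 1][j] = fr
                elif RL[i][j] is not None and fr[1] + EPS >= RL[i][j][0]:
                    RL[i + 1][j] = (max(fr[0], RL[i][j][0]), fr[1])
            if ft is not None:
                if RL[i][j] is not None:
                    RB[i][j + 1] = ft
                elif RB[i][j] is not None and ft[1] + EPS >= RB[i][j][0]:
                    RB[i][j + 1] = (max(ft[0], RB[i][j][0]), ft[1])
    top, right = RL[n][m - 1], RB[n - 1][m]
    return (top is not None and top[1] >= 1 - EPS) or \
           (right is not None and right[1] >= 1 - EPS)
```

For every candidate translation τ produced by the enumeration above, one would call this test on the translated copy of P.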
4.3 Approximately minimizing the Fréchet distance
The algorithms we described so far cannot be considered efficient. To remedy this situation, we present approximation algorithms which do not necessarily compute the optimal transformation, but one that yields a (weak) Fréchet distance which differs from the optimal value by at most a constant factor. To this end, we generalize the notion of a reference point, c.f. [9] and [8], to the Fréchet metrics and observe that all reference points for the Hausdorff distance are also reference points for the (weak) Fréchet distance.

We first need the concept of a reference point that was introduced in [8]. A reference point of a figure is a characteristic point with the property that similar figures have reference points that are close to each other. Therefore we get a reasonable matching of two figures if we simply align their reference points.

Definition 4.25 (Reference point, [8]). Let K be a set of planar figures and δ : K × K → R be a distance measure on K. A mapping R : K → R² is called a δ–reference point for K of quality c > 0 with respect to a set of transformations T on K, if the following holds for any two figures P, Q ∈ K and each transformation τ ∈ T:

(Equivariance)            R(τ(P)) = τ(R(P))                  (4.2)
(Lipschitz continuity)    ||R(P) − R(Q)|| ≤ c · δ(P, Q)      (4.3)
The set of δ–reference points for K of quality c with respect to T is denoted R(K, δ, c, T).
So if δ is a metric on K, a reference point is a Lipschitz-continuous mapping between the metric spaces (K, δ) and (R², ||·||) with Lipschitz constant c which is equivariant under T. Various reference points are known for a variety of distance measures and classes of transformations, like, e.g., the centroid of a convex polygon, which is a reference point of quality 11/3 for translations, using the area of the symmetric difference as a distance measure, see [12]. However, most work on reference points has focused on the Hausdorff distance, see [8]. We will only mention the following result that provides a δH–reference point for polygonal curves with respect to similarities, the so-called Steiner point. The Steiner point of a polygonal curve is the weighted average of the vertices of the convex hull of the curve, where each vertex is weighted by its exterior angle divided by 2π.

Theorem 4.26 (Steiner point, Aichholzer et al., [8]). The Steiner point is a δH–reference point with respect to similarities of quality 4/π. It can be computed in linear time.

Note that the Steiner point is an optimal δH–reference point with respect to similarities, i.e., the quality of any δH–reference point for that transformation class is at least 4/π, see [8]. Two feasible reparametrizations α and β of P and Q demonstrate that for each point P(α(t)) there is a point Q(β(t)) with ||P(α(t)) − Q(β(t))|| ≤ δ (and vice versa); thus δH(P, Q) ≤ δ̃F(P, Q) ≤ δF(P, Q). This is summarized in the following:

Observation 4.27. Let c > 0 be a constant and T be a set of transformations on K0 ∪ K1. Then each δH–reference point with respect to T is also a δ̃F–reference point with respect to T, and each δ̃F–reference point is also a δF–reference point of the same quality, i.e., R(K0 ∪ K1, δH, c, T) ⊆ R(K0 ∪ K1, δ̃F, c, T) ⊆ R(K0 ∪ K1, δF, c, T).

This shows that we can use the known δH–reference points to obtain δ̃F–reference points and δF–reference points.
However, since for non-closed curves each reparametrization has to map P(0) to Q(0), the distance ||P(0) − Q(0)|| is a lower bound for δ̃F(P, Q) as well as for δF(P, Q). So we get a new reference point that is substantially better than all known reference points for the Hausdorff distance.

Observation 4.28. The mapping Ro : K0 → R², P ↦ P(0), is a δ̃F–reference point for curves of quality 1 with respect to translations, i.e., Ro ∈ R(K0, δ̃F, 1, T2).

The quality of this reference point, i.e., 1, is better than the quality of the Steiner point, which is 4/π. Since the latter is an optimal reference point for the Hausdorff distance, this
shows that for the Fréchet distance substantially better reference points exist. For closed curves, however, Ro is not defined at all, so we have to stick to the reference points we get from Observation 4.27. Based on the existence of these reference points for T2 we obtain the following algorithm for approximate matchings of (closed) curves with respect to the (weak) Fréchet distance under the group of translations, which is the same procedure as already used in [8] for the Hausdorff distance. In what follows, C ∈ {K0, K1} and δ ∈ {δ̃F, δF}.

Algorithm Approx Fréchet Match(P, Q, R)
Input: Two polygonal curves P, Q ∈ C and a reference point R ∈ R(C, δ, c, T2).
Output: A value δapprox and a translation τapprox such that
1. δapprox = δ(τapprox(P), Q), and
2. δapprox ≤ (c + 1) · δopt, where δopt = minτ δ(τ(P), Q).
1. Compute R(P) and R(Q)
2. τapprox := R(Q) − R(P)
3. δapprox := δ(τapprox(P), Q)
4. return (δapprox, τapprox)
Theorem 4.29 (Correctness and complexity of Approx Fréchet Match). Suppose that R can be computed in O(TR(n)) time. Then Algorithm Approx Fréchet Match produces a (c + 1)-approximation to Problem 4.1 in O(mn log^{k+1}(mn) + TR(m) + TR(n)) time, where k = 0 if δ = δF, and k = 1 if δ = δ̃F.

Proof. The correctness of this procedure follows easily from the results of [8]. The claimed time bounds follow from the results (Theorems 6, 7b, and 11) of [13].

Note that with an idea from [45] it is possible to reduce the approximation constant for reference point based matching to (1 + ε) for any ε > 0: the idea places a sufficiently fine grid of size O(1/ε²) around the reference point of Q and checks each grid point as a potential image point for the reference point of P. The runtime increases by a factor proportional to the grid size.
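As an illustration of the reference point scheme, the following toy transcription (Python; ours, not from the thesis) aligns the start points, i.e., it uses the reference point Ro(P) = P(0) of Observation 4.28, and evaluates the resulting distance with the discrete Fréchet distance on the vertex sequences as a stand-in for δ:

```python
import math
from functools import lru_cache

def discrete_frechet(P, Q):
    """Discrete Frechet distance of the vertex sequences P and Q
    (used here only as a simple stand-in for the continuous distance)."""
    @lru_cache(maxsize=None)
    def c(i, j):
        d = math.dist(P[i], Q[j])
        if i == 0 and j == 0:
            return d
        if i == 0:
            return max(c(0, j - 1), d)
        if j == 0:
            return max(c(i - 1, 0), d)
        return max(min(c(i - 1, j), c(i, j - 1), c(i - 1, j - 1)), d)
    return c(len(P) - 1, len(Q) - 1)

def approx_match(P, Q):
    """Approximate matching via the reference point Ro(P) = P(0):
    translate P so that its start point coincides with Q's start point,
    then report the resulting distance and the translation used."""
    tx, ty = Q[0][0] - P[0][0], Q[0][1] - P[0][1]
    Pt = [(x + tx, y + ty) for (x, y) in P]
    return discrete_frechet(Pt, Q), (tx, ty)
```

Since Ro has quality 1, the continuous analogue of this procedure is a 2-approximation of the optimal translation for δ̃F on open curves.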
4.3.1 Reference points for affine maps
We conclude this chapter with a negative result on the approximate matching problem. We have already seen that – compared to the Hausdorff distance – better reference points for the Fréchet distance exist (for polygonal curves), i.e., R(K0, δH, T2) ⊊ R(K0, δ̃F, T2), where R(K0, δ, T2) := ⋃c>0 R(K0, δ, c, T2) is the set of all δ–reference points for K0 with respect to T2. Note that we can easily strengthen Observation 4.28 as follows:
Observation 4.30. Let Ro be as above, and let T be a set of transformations on K0 such that Ro is equivariant under T. Then Ro is a δ̃F–reference point for K0 of quality 1 with respect to T, i.e., Ro ∈ R(K0, δ̃F, 1, T).

In particular this implies that R(K0, δ̃F, A2) ≠ ∅, where A2 is the set of affine transformations of the plane. This is complemented by the following result:

Theorem 4.31 (Non-existence of δH–reference points for affine maps). There are no δH–reference points for affine maps, i.e., R(K0 ∪ K1, δH, A2) = ∅.

Proof. First observe that for two curves P1, P2 ∈ K0 ∪ K1 with P1 ≠ P2 and δH(P1, P2) = 0, and a reference point R ∈ R(K0 ∪ K1, δH, T), we have that ||R(P1) − R(P2)|| = 0, so R(P1) = R(P2); this means that δH–reference points only take the geometry of the curves into account. In the following c(X) denotes the center of gravity of the vertices of the convex hull of X.

Assume R is a δH–reference point with respect to affine maps. Let ∆ be a curve in the shape of an equilateral triangle. Let αr be the affine map that rotates ∆ around c(∆) by an angle of 2π/3 in counterclockwise order. Let ∆r = αr(∆) denote the image of ∆ under αr. Note that c(∆) is the only fixpoint of αr. Now since R is a δH–reference point, and δH(∆, ∆r) = 0, we can conclude from our initial remark that R(∆) = R(∆r) = R(αr(∆)) = αr(R(∆)), so R(∆) is a fixpoint of αr, and thus R(∆) = c(∆).

Now consider an arbitrary triangle ∆′. Since ∆′ is the image of an equilateral triangle ∆ under some affine transformation α′, we get R(∆′) = R(α′(∆)) = α′(R(∆)) = α′(c(∆)). Since c is equivariant under affine maps, R(∆′) = c(α′(∆)) = c(∆′). However, as Figure 4.5 illustrates, c is not a δH–reference point, since it is not Lipschitz-continuous. To this end, observe that

    lim_{h→0, w→∞} δH(∆1, ∆2) = 0, whereas lim_{h→0, w→∞} ||c(∆1) − c(∆2)|| = ∞.
Figure 4.5: The center of gravity of the vertices of the convex hull is not a δH–reference point.
Chapter 5

Bounding the Fréchet distance by the Hausdorff distance

As we have already seen (c.f. Figures 3.1 and 3.2), neither the ratio between δH and δF, nor the ratio between δF and δ̃F is bounded in general. However, the following result from [10] (see also [35]) shows that for certain classes of curves the three distance measures are closely related:

Theorem 5.1 (δH vs. δF for convex closed curves, Alt et al., [10]). For any pair of convex closed curves P and Q, δH(P, Q) = δ̃F(P, Q) = δF(P, Q).

In the following we will consider κ-straight curves; for these curves the arclength between any two points is at most a constant κ times their Euclidean distance.

Definition 5.2 (κ-straightness). A planar (closed) rectifiable curve P ∈ K0 ∪ K1 is called κ-straight for some real parameter κ ≥ 1, if the following holds for any two points x and y on P: dP(x, y) ≤ κ||x − y||, where dP(x, y) is the arclength of the shortest piece of P connecting x and y.

Examples of straight curves are the curves with increasing chords of [44], where κ ≤ 2π/3, or the self-approaching curves of [6]. We will see below that the straightness condition rules out the possibility of curves with small Hausdorff distance having a large Fréchet distance: In section 5.1 we show that the Fréchet distance of κ-straight curves is at most (κ + 1) times their Hausdorff distance (c.f. Theorem 5.3 on the next page). This result gives rise to a randomized approximation algorithm that computes an upper bound on the Fréchet distance between two κ-straight curves that is off from the exact value by a multiplicative factor of at most (κ + 1). The algorithm runs in O((m + n) log²(m + n) 2^{α(m+n)}) time for given polygonal curves P, Q with m and n
vertices (c.f. Corollary 5.8 on page 49), and thus outperforms the fastest known algorithm to compute the Fréchet distance exactly, which requires O(mn log(mn)) time, see [13]. In section 5.3 we also provide the first non-trivial algorithm to decide, for any κ ≥ 1, whether a given polygonal curve is κ-straight; it runs in O(n log² n) time for a polygonal curve on n vertices (c.f. Theorem 5.9 on page 49).
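For a polygonal curve the straightness condition can at least be probed at the vertices. The following naive O(n²) sketch (Python; ours, not the algorithm of section 5.3) computes the largest ratio dP(x, y)/||x − y|| over all vertex pairs of an open curve; note that this is only a lower bound for the true straightness constant, since the maximum may also be attained at interior points of edges:

```python
import math

def kappa_at_vertices(P):
    """Smallest kappa such that d_P(x, y) <= kappa * |x - y| holds for
    all pairs of *vertices* of the open polygonal curve P; a lower
    bound for the true straightness constant of P."""
    n = len(P)
    # prefix arclengths along the curve
    arc = [0.0]
    for k in range(1, n):
        arc.append(arc[-1] + math.dist(P[k - 1], P[k]))
    kappa = 1.0
    for i in range(n):
        for j in range(i + 1, n):
            eu = math.dist(P[i], P[j])
            if eu > 0:
                kappa = max(kappa, (arc[j] - arc[i]) / eu)
    return kappa
```

On a straight polyline this returns 1; on an L-shaped curve it returns √2, the ratio between the arclength and the chord of the two legs.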
5.1 The upper bound
In this section we will show that for κ-straight polygonal curves the Fréchet distance is at most a factor of (κ + 1) away from the Hausdorff distance.

Theorem 5.3 (δH vs. δF for κ-straight polygonal curves). For any pair of κ-straight polygonal curves P, Q ∈ K0, δF(P, Q) ≤ (κ + 1)δH(P, Q), if max(||P(0) − Q(0)||, ||P(1) − Q(1)||) ≤ δH(P, Q).

Proof. Let δ = δH(P, Q). Since both ||P(0) − Q(0)|| and ||P(1) − Q(1)|| are bounded by δ, we know that the points L and R in Fδ(P, Q) are white. To each x ∈ [0, 1] we assign the value h(x) defined by h(x) = max{y | ∃x′ ≤ x s.t. (x′, y) ∈ Fδ}. h is a piecewise continuous monotone function consisting of elliptic arcs and horizontal segments. At the points of discontinuity we add vertical segments and obtain a curve ϕδ from the graph of h (see Figure 5.1). In addition, we add the vertical segment Lh(0) so that ϕδ starts in L and ends in R.
Figure 5.1: The reparametrization ϕδ in Fδ(P, Q).

The bi-monotone curve ϕδ consists of elliptical arcs and vertical and horizontal line segments. The left endpoint of a horizontal segment coincides with the upper endpoint
of an elliptical arc and the upper endpoint of a vertical segment coincides with the left endpoint of an elliptical arc. The line segments are always black (apart from the endpoint they share with an elliptical arc) and the elliptical arcs are always white. We claim that for all points (x, y) on ϕδ we have ||P(x) − Q(y)|| ≤ (κ + 1)δ, so that two corresponding reparametrizations yield a Fréchet distance of at most (κ + 1)δ. If (x, y) is white we are done, since ||P(x) − Q(y)|| ≤ δ in that case; so let us consider a black point (x1, y0) on ϕδ that lies on a horizontal black segment (this implies that x1 < 1) with white left endpoint (x0, y0). It remains to show that ||P(x1) − Q(y0)|| ≤ (κ + 1)δ. We first claim that there exists a white point (x2, y0) ∈ Fδ with x2 > x1. To see this, observe that, since δ = δH(P, Q), there exists for each y an x such that (x, y) is white, so for any ε > 0 there exists an x such that (x, y0 + ε) is white. Since (x1, y0) lies on ϕδ and the region above ϕδ is black (by definition), we can conclude that for any ε > 0 there exists an x > x1 such that (x, y0 + ε) is white, and moreover, since Fδ is closed (by definition), that there exists an x2 > x1 such that (x2, y0) is white.
By applying the triangle inequality twice we obtain

    ||P(x1) − Q(y0)|| ≤ min(||P(x0) − Q(y0)|| + ||P(x0) − P(x1)||, ||P(x2) − Q(y0)|| + ||P(x1) − P(x2)||)
                      ≤ min(dP(P(x0), P(x1)), dP(P(x1), P(x2))) + δ.
So with min(dP(P(x0), P(x1)), dP(P(x1), P(x2))) ≤ (1/2)·dP(P(x0), P(x2)), and

    dP(P(x0), P(x2)) ≤ κ||P(x0) − P(x2)|| ≤ κ(||P(x0) − Q(y0)|| + ||P(x2) − Q(y0)||) ≤ 2κδ,
it follows that ||P(x1) − Q(y0)|| ≤ (κ + 1)δ. If the black point (x1, y0) lies on a vertical black segment we can apply the same reasoning.

Since δH(P, Q) ≤ δ̃F(P, Q) and max(||P(0) − Q(0)||, ||P(1) − Q(1)||) ≤ δ̃F(P, Q), we can conclude:

Corollary 5.4 (δ̃F vs. δF for κ-straight polygonal curves). For any pair of κ-straight polygonal curves P, Q ∈ K0, δF(P, Q) ≤ (κ + 1)δ̃F(P, Q).
5.2 Computing the reparametrization
In this section we describe a divide-and-conquer algorithm that computes the reparametrization ϕδ. Our approach makes crucial use of the following result:

Theorem 5.5 (Combination Lemma, Agarwal/Sharir, [47]). Let B be a set of mB blue Jordan arcs, and R be a set of mR red Jordan arcs with the property that any two of these arcs intersect at most a constant number of times, and let P be a set of n points with the property that none of these points lies on one of the arcs; let N = mR + mB + n. Then the total complexity of all the regions induced by the red and the blue arcs that contain a point from P is O(N), and these regions can be computed in O(N log N) time.

We show the following:

Theorem 5.6 (Computation of the upper envelope of Φδ(P, Q)). Let P, Q ∈ K0 be simple polygonal curves with n and m vertices, and δ > 0. Then ϕδ can be computed in O((m + n) log²(m + n) 2^{α(m+n)}) [a] time by a randomized algorithm, and in O((m + n) log³(m + n) 2^{α(m+n)}) time by a deterministic one.

[a] Here, α(n) = min{k ≥ 1 | A(k, k) ≥ n} is the functional inverse of the Ackermann function A(k, n), which is defined as follows: A(1, n) = 2n for n ≥ 1; A(k, 1) = 2 for k ≥ 2; A(k, n) = A(k − 1, A(k, n − 1)) for k ≥ 2, n ≥ 2. See chapter 2 in [47] for a detailed discussion.
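The recursion in the footnote can be transcribed literally; the following sketch (Python; ours) also computes the diagonal inverse α(n). The loop is only practical for n ≤ A(3, 3) = 16, since for larger n it would try to evaluate the astronomically large value A(4, 4):

```python
def A(k, n):
    # Ackermann function as defined in footnote [a].
    if k == 1:
        return 2 * n
    if n == 1:
        return 2
    return A(k - 1, A(k, n - 1))

def alpha(n):
    # Functional inverse on the diagonal: min{k >= 1 | A(k, k) >= n}.
    # Only practical for n <= A(3, 3) = 16; evaluating A(4, 4) is infeasible.
    k = 1
    while A(k, k) < n:
        k += 1
    return k
```

For instance A(2, n) = 2^n and A(3, n) is a tower of exponentials, which is why the factor 2^{α(m+n)} in the running times grows extremely slowly.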
Proof. The algorithm proceeds as follows: In a first step, we determine the intersection of ϕδ with the vertical line in Fδ that corresponds to the midpoint Q(1/2) of Q in O((m + n) log²(m + n) 2^{α(m+n)}) deterministic time, and O((m + n) log(m + n) 2^{α(m+n)}) randomized time (see below); this yields a point P(y1/2) on P, c.f. Figure 5.2. Then we split P into PB := P|[0,y1/2] and PT := P|[y1/2,1], and Q into QL := Q|[0,1/2] and QR := Q|[1/2,1], recursively compute ϕδ(PB, QL) and ϕδ(PT, QR), and glue them together.
Figure 5.2: The first point on P (starting from p) that intersects bdδ(QL).

In order to compute P(y1/2), we imagine the following process: First we cut off the right half of Fδ(P, Q), i.e., we only look at Fδ(P, QL). Now we start sweeping a horizontal line from y = 1 downwards to y = 0 until it hits the first white point. This will happen at y1/2. Another interpretation of this process is the following: We start walking on P at P(1) towards P(0). We walk until we arrive at a point that has distance at most δ to QL. This will happen at P(y1/2). Of course we cannot afford to compute the diagram Fδ(P, Q) (or larger parts of it) explicitly, so we have to proceed in a different way.

Let p := P(1) be the endpoint of P. In a first step we check in O(m) time if p is δ-close to QL. If this is the case, we are already finished. Otherwise we consider nhδ(QL), the δ-neighborhood of QL. This set can be described as the union of rectangles of width δ and circles of radius δ. The boundary of nhδ(QL) will be called bdδ(QL); it consists of circular arcs (of radius δ) and line segments. Since Q is simple, this boundary has complexity O(m) (c.f. [38] and [41]), and can be computed in O(m log m) time (c.f. [23]). The basic idea of our algorithm is based on the following simple observation:

Observation 5.7. The first point on P (starting from p) that intersects bdδ(QL) is part of the boundary of the cell C of the arrangement A induced by P and bdδ(QL) that contains p.
48
Bounding the Fr´ echet distance by the Hausdorff distance
If we consider the segments and circular arcs of bdδ(QL) as a set of O(m) red Jordan arcs, and the segments of P as a set of O(n) blue Jordan arcs, we can conclude with Theorem 5.5 that the cell C has complexity O(m + n). So all we have to do is to compute the cell C and check its boundary. For technical reasons we need a point p′ in this cell that is neither part of P nor of bdδ(QL); in particular we cannot use p itself. Instead we compute the intersection of the line supporting the last segment of P with P ∪ bdδ(QL) by brute force in O(m + n) time, and let p′ be the midpoint between p and the intersection point that is closest to p.

Since any two of the Jordan arcs of P and bdδ(QL) intersect at most twice, the cell of A that contains the point p′ can be computed in O(λ4(m + n) log²(m + n)) = O((m + n) log²(m + n) 2^{α(m+n)}) deterministic time ([36], see also [47], Theorem 6.11), or in O(λ4(m + n) log(m + n)) = O((m + n) log(m + n) 2^{α(m+n)}) randomized time ([26], see also [47], Theorem 6.15).

Complexity. Let nX denote the number of vertices of the curve PX for X ∈ {T, B}, and let mY denote the number of vertices of the curve QY for Y ∈ {L, R}. We have nT + nB = n and max(mL, mR) ≤ m/2, so the runtime T(n, m) of the algorithm obeys the recursion

T(n, m) ≤ T(nB, mL) + T(nT, mR) + d · (n + m) f(n + m)
        ≤ T(nB, m/2) + T(nT, m/2) + d · (n + m) f(n + m),

where f is a monotone function and d > 0 is a suitable constant. We prove by induction on m that T(n, m) ≤ c · (n + m) f(n + m) log m for a suitable constant c > 0.
For positive integers n, s, a sequence U = ⟨u1, . . . , um⟩ over {1, . . . , n} is called a Davenport–Schinzel sequence of order s over an n-element alphabet if any two consecutive elements of U are distinct, and U does not contain a subsequence of length s + 2 of the form a b a b . . . for a ≠ b. The maximum length of a Davenport–Schinzel sequence of order s over an n-element alphabet is denoted by λs(n). It is known that λ4(n) = O(n · 2^{α(n)}); see [47] for more details.
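The alternation condition of the footnote can be checked directly; the following sketch (function name is ours) tests whether a given sequence satisfies the order-s Davenport–Schinzel conditions:

```python
def is_davenport_schinzel(seq, s):
    """Check the order-s Davenport-Schinzel conditions for seq."""
    # adjacent elements must be distinct
    if any(x == y for x, y in zip(seq, seq[1:])):
        return False
    # no pair a != b may alternate a b a b ... with length >= s + 2
    symbols = set(seq)
    for a in symbols:
        for b in symbols:
            if a == b:
                continue
            want, length = a, 0
            for x in seq:          # greedy longest alternation of a and b
                if x == want:
                    length += 1
                    want = b if want == a else a
            if length >= s + 2:
                return False
    return True
```

For a fixed ordered pair (a, b) the greedy scan finds the longest alternating subsequence starting with a, and both orders are tried, so the check is exact (at quadratic cost in the alphabet size).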
Since T(n, 1) = O(n), the induction starts readily. Now

T(n, m) ≤ c · (nB + m/2) f(nB + m/2) log(m/2) + c · (nT + m/2) f(nT + m/2) log(m/2) + d · (n + m) f(n + m)
        ≤ c · (nB + m/2 + nT + m/2) f(n + m)(log m − 1) + d · (n + m) f(n + m)
        ≤ c · (n + m) f(n + m) log m,

if c ≥ d.

With an algorithm of Alt et al. [9] we can compute δ = δH(P, Q) in O((m + n) log(m + n)) time. Combining this with Theorems 5.3 and 5.6, we can find a (κ + 1)-approximation to δF(P, Q), together with a reparametrization ϕapp that witnesses this fact, within the time bounds stated in Theorem 5.6.

Corollary 5.8 (δF-approximation for κ-straight polygonal curves). For any pair of κ-straight polygonal curves P, Q ∈ K0 with max(||P(0) − Q(0)||, ||P(1) − Q(1)||) ≤ δH(P, Q), we can compute a (κ + 1)-approximation to δF(P, Q), together with a reparametrization ϕapp, by a deterministic (randomized) algorithm in O((m + n) log³(m + n) 2^{α(m+n)}) (resp. O((m + n) log²(m + n) 2^{α(m+n)})) time.
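The thesis works with the continuous Fréchet distance δF; as a self-contained illustration of the coupling idea behind it, the discrete Fréchet distance between the vertex sequences of two polygonal curves can be computed by a simple dynamic program (a coarser measure than δF; the function name is ours):

```python
import math
from functools import lru_cache

def discrete_frechet(P, Q):
    """Discrete Frechet distance between vertex sequences P and Q."""
    @lru_cache(maxsize=None)
    def c(i, j):
        # cost of the best coupling of P[0..i] and Q[0..j]
        d = math.hypot(P[i][0] - Q[j][0], P[i][1] - Q[j][1])
        if i == 0 and j == 0:
            return d
        if i == 0:
            return max(c(0, j - 1), d)
        if j == 0:
            return max(c(i - 1, 0), d)
        return max(min(c(i - 1, j), c(i - 1, j - 1), c(i, j - 1)), d)
    return c(len(P) - 1, len(Q) - 1)
```

The table entry c(i, j) mirrors the monotone paths through the free-space diagram Fδ used in the continuous setting.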
5.3 Recognizing κ-straight curves
In this section we describe an efficient algorithm to decide whether a given polygonal curve P is κ-straight. The results are summarized in the following theorem.

Theorem 5.9 (Recognition of κ-straight curves). Let P ∈ K0 ∪ K1 be a simple (closed) polygonal curve on n vertices, and let κ ≥ 1. One can decide in O(n log² n) time whether P is κ-straight.

The theorem is a combination of Lemmas 5.13 and 5.15; they will be proved in the next two subsections. If P is κ-straight, it is also κ′-straight for any κ′ ≥ κ. The smallest κ such that P is κ-straight is called the detour of P.

Definition 5.10 (Detour of a polygonal curve). Let P be a simple polygonal curve, and let x, y ∈ P with x ≠ y. Then

δP(x, y) := dP(x, y) / ||x − y||

denotes the P-detour between x and y. For X, Y ⊆ P,

δP(X, Y) := sup_{x∈X, y∈Y, x≠y} δP(x, y)
denotes the P -detour between X and Y . Finally δ(P ) := δP (P, P ) denotes the detour of P . If P is understood from the context, we will write δ instead of δP . The decision problem version of the detour computation problem is equivalent to the problem of recognizing κ-straight curves.
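As an illustration of Definition 5.10, the detour restricted to vertex pairs can be evaluated by brute force from prefix arclengths. This is only a lower bound on δ(P), since the supremum ranges over all points of P, and an O(n²) stand-in for the algorithms developed below (function name is ours):

```python
import math

def vertex_detour(P):
    """Lower bound on the detour of an open polygonal curve P:
    the maximum of d_P(u, v)/||u - v|| over all vertex pairs u != v."""
    # prefix arclengths: arc[i] = length of P between P[0] and P[i]
    arc = [0.0]
    for (ax, ay), (bx, by) in zip(P, P[1:]):
        arc.append(arc[-1] + math.hypot(bx - ax, by - ay))
    best = 1.0                      # the detour of any curve is at least 1
    for i in range(len(P)):
        for j in range(i + 1, len(P)):
            dist = math.hypot(P[j][0] - P[i][0], P[j][1] - P[i][1])
            if dist > 0:
                best = max(best, (arc[j] - arc[i]) / dist)
    return best
```

For an L-shaped curve through (0,0), (1,0), (1,1) the maximum is attained between the two endpoints, giving detour 2/√2 = √2.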
5.3.1 Simple polygonal curves
Let us first consider the case where P ∈ K0 is a simple polygonal curve. The detour problem in this setting has been studied before by Ebbers-Baumann et al. [31]. They give a (1 + ǫ)-approximation algorithm for computing the detour of P that runs in O(n log n) time. Their approach makes use of the fact that the detour of a curve is always attained at a vertex of the curve; we will also exploit that property in the proof of Lemma 5.12 below.

Lemma 5.11 (Ebbers-Baumann et al. [31]). Let P ∈ K0 be a simple polygonal curve, and let L ⊆ P and R ⊆ P be two subcurves of P. Then δ(L, R) = max(δ(L0, R), δ(L, R0)).
We first show the following technical lemma, which is of independent interest and will also be used in the next section.
Lemma 5.12 (Bichromatic disjoint subcurve detour). Let P ∈ K0 be a simple polygonal curve, and let κ ≥ 1. Let L ⊆ P and R ⊆ P be two disjoint subcurves of P with |L| = mL and |R| = mR, and let m = mL + mR. Assume that for any two vertices (l, r) ∈ L × R the distance dP(l, r) can be computed in O(1) time, and that max(δP(L), δP(R)) ≤ κ. Then one can decide in O(m log m) time whether δP(L, R) ≤ κ.

Proof. From Lemma 5.11 we know that δ(L, R) = max(δ(L0, R), δ(L, R0)). Therefore it is sufficient to test whether δ(L0, R) ≤ κ and δ(L, R0) ≤ κ. Let (l, r) ∈ L × R0, and let m be a vertex on P that separates L and R. Then

δ(l, r) ≤ κ ⇐⇒ dP(l, r) / ||l − r|| ≤ κ
         ⇐⇒ (dP(l, m) + dP(m, r)) / ||l − r|| ≤ κ
         ⇐⇒ dP(l, m)/κ ≤ ||l − r|| − dP(m, r)/κ.

Now look at the bivariate function

Cr,m,κ : R² → R,  x ↦ ||x − r|| − dP(m, r)/κ.
If r = (rx, ry), the graph of Cr,m,κ is a cone in R³ with apex (rx, ry, −dP(m, r)/κ), apex angle π/2, and its principal axis parallel to the z-axis. If l = (lx, ly), then

δ(l, r) ≤ κ ⇐⇒ (lx, ly, dP(l, m)/κ) lies below Cr,m,κ.

Consider the lifting map

Λm,κ : L → R³,  l = (lx, ly) ↦ (lx, ly, dP(l, m)/κ).

The image of L under this lifting map is a polygonal curve in R³ whose projection to the xy-plane is L. Now we have

δ(L, r) ≤ κ ⇐⇒ Λm,κ(L) lies below Cr,m,κ,

and therefore

δ(L, R0) ≤ κ ⇐⇒ Λm,κ(L) lies below Cr,m,κ for all r ∈ R0.

If Cm,κ := min_{r∈R0} Cr,m,κ denotes the lower envelope of all Cr,m,κ, then

δ(L, R0) ≤ κ ⇐⇒ Λm,κ(L) lies below Cm,κ.

The minimization diagram of Cm,κ, i.e., the projection of the edges of this lower envelope to the xy-plane, is the additively weighted Voronoi diagram AVDm,κ(R0) of the points r ∈ R0 with weights −dP(m, r)/κ. This diagram can be computed in O(mR log mR) time, cf. [34].

For a vertex r ∈ R0 let V(r) denote the Voronoi cell of r, and let LV(r) be the set of maximal connected edge segments of L inside V(r). Then

δ(L, R0) ≤ κ ⇐⇒ Λm,κ(e) lies below Cr,m,κ for all r ∈ R0 and for all e ∈ LV(r).

Let A denote the arrangement that results from superimposing AVDm,κ(R0) with the polygonal curve L. For a vertex r ∈ R0 let A(r) denote the cell of A that contains r, and let LA(r) be the set of maximal connected edge segments of L on the boundary of A(r). Then

δ(L, R0) ≤ κ ⇐⇒ Λm,κ(e) lies below Cr,m,κ for all r ∈ R0 and for all e ∈ LA(r).

To see this, pick some e ∈ LV(r) and a point z on e, see Figure 5.3. Since V(r) is star-shaped with respect to r, the line segment s from r to z lies completely inside V(r). Assume that s crosses the segments e1, . . . , ek = e from LV(r) in that order; in particular e1 ∈ LA(r), i.e., e1 lies on the boundary of A(r). Let r = p0, p1, . . . , pk = z denote the corresponding points of intersection with the ei for 1 ≤ i ≤ k.
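The below-the-cone predicate used above is just an arithmetic inequality. A minimal numeric sketch, with the curve distances dP(l, m) and dP(m, r) passed in as plain numbers (hypothetical values, not computed from a curve), shows its equivalence to the direct detour test:

```python
import math

def below_cone(l, r, d_lm, d_mr, kappa):
    """Lifted-point-below-cone predicate: equivalent to delta(l, r) <= kappa
    when d_P(l, r) = d_lm + d_mr, i.e. when m separates l and r on the curve."""
    dist = math.hypot(l[0] - r[0], l[1] - r[1])
    return d_lm / kappa <= dist - d_mr / kappa

def detour_at_most(l, r, d_lm, d_mr, kappa):
    """Direct test d_P(l, r)/||l - r|| <= kappa."""
    dist = math.hypot(l[0] - r[0], l[1] - r[1])
    return (d_lm + d_mr) <= kappa * dist
```

Multiplying the cone inequality by κ recovers the direct test, so the two predicates always agree.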
Figure 5.3: The cell A(r) in the superposition of AVDm,κ(R0) and L.

We have that ||z − r|| = |s| = Σi ||pi − pi+1|| and dP(z, r) ≤ Σi dP(pi, pi+1), so

δ(z, r) ≤ (Σi dP(pi, pi+1)) / (Σi ||pi − pi+1||).

Observe that for any two sequences of positive numbers x1, . . . , xn and y1, . . . , yn we have that

(Σi xi) / (Σi yi) ≤ maxi xi/yi.

This is easily seen, since if maxi xi/yi = x1/y1 then

(1/Σi yi) Σi xi ≤ (1/Σi yi) Σi yi (x1/y1) = x1/y1.

Thus

δ(z, r) ≤ (Σi dP(pi, pi+1)) / (Σi ||pi − pi+1||)
        ≤ maxi dP(pi, pi+1) / ||pi − pi+1||
        ≤ max(δ(p1, r), δ(L))
        ≤ max(δ(e1, r), δ(L))
        ≤ max(δ(LA(r), r), δ(L)).
Since z was an arbitrary point of LV(r), this implies

δ(LV(r), r) ≤ max(δ(LA(r), r), δ(L)).

So with δ(L) ≤ κ we get

δ(LV(r), r) ≤ κ ⇐⇒ δ(LA(r), r) ≤ κ.
Thus for all r ∈ R0,

Λm,κ(e) lies below Cr,m,κ for all e ∈ LV(r) ⇐⇒ Λm,κ(ẽ) lies below Cr,m,κ for all ẽ ∈ LA(r).

The edges in LA(r) are part of the boundary of the cell of A that contains r. Therefore the total number of these edges is bounded by the total complexity of all cells of A that contain some r ∈ R0. We will now show that the total complexity of all these cells is O(m), and that the boundaries of all these cells (and therefore all the edges in LA(r)) can be computed in O(m log m) time.

To this end, consider the segments of L as a set of mL blue Jordan arcs, and the edges of AVDm,κ(R0) as a set of O(mR) red Jordan arcs. Any two of these Jordan arcs intersect at most twice, and none of the points in R0 lies on one of them. The total complexity of all the regions induced by the red arcs that contain a point from R0 is O(mR); moreover, we have an explicit description of these regions, since they correspond to the Voronoi regions of the vertices from R0. The total complexity of all the regions (this is in fact only one region) induced by the blue arcs that contain a point from R0 is O(mL); again we have an explicit description of this region. With Lemma 5.5 we can conclude that the total complexity of all the regions induced by the red and blue arcs that contain a point from R0 is O(mL + mR) = O(m), and that these regions can be computed in O(m log m) time.

The following lemma contains the main result of this section.

Lemma 5.13 (Detour of a polygonal curve). Let P ∈ K0 be a simple polygonal curve on n vertices, and let κ ≥ 1. Then one can decide in O(n log² n) time whether δ(P) ≤ κ.

Proof. In a preprocessing step we traverse P in O(n) time, starting from P(0), and store in each vertex P(i) the distance dP(P(0), P(i)). This will enable us to determine the distance on P between any pair of vertices of P in O(1) time (this is required in order to apply Lemma 5.12).
Now we split P into two subcurves L := P|[0, ⌊n/2⌋] and R := P|[⌊n/2⌋, n] with m = P(⌊n/2⌋) as the split vertex; observe that δ(P) = max(δ(L), δ(R), δ(L, R)). Then we check recursively whether δ(L) ≤ κ and δ(R) ≤ κ. If this is not the case, then δ(P) > κ. Finally we run the algorithm from the proof of Lemma 5.12 to decide in O(n log n) time whether δ(L, R) ≤ κ. The recursion only adds an additional logarithmic factor to this runtime.
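The recursion of Lemma 5.13 can be sketched as follows, with an O(n²) brute-force vertex-pair scan standing in for the O(m log m) bichromatic test of Lemma 5.12 (so this sketch runs in roughly O(n² log n), not the stated bound, and it only inspects vertex pairs; all names are ours):

```python
import math

def _arc(P):
    """Prefix arclengths along the polygonal curve P."""
    arc = [0.0]
    for (ax, ay), (bx, by) in zip(P, P[1:]):
        arc.append(arc[-1] + math.hypot(bx - ax, by - ay))
    return arc

def detour_at_most_kappa(P, kappa, arc=None, lo=None, hi=None):
    """Decide whether every vertex pair of P|[lo, hi] has detour <= kappa."""
    if arc is None:
        arc, lo, hi = _arc(P), 0, len(P) - 1
    if hi - lo <= 1:
        return True
    mid = (lo + hi) // 2
    # recurse on both halves
    if not (detour_at_most_kappa(P, kappa, arc, lo, mid) and
            detour_at_most_kappa(P, kappa, arc, mid, hi)):
        return False
    # brute-force bichromatic check across the split vertex
    # (stand-in for the Voronoi-based test of Lemma 5.12)
    for i in range(lo, mid + 1):
        for j in range(mid, hi + 1):
            dist = math.hypot(P[j][0] - P[i][0], P[j][1] - P[i][1])
            if dist > 0 and arc[j] - arc[i] > kappa * dist:
                return False
    return True
```

Every vertex pair is separated at exactly one level of the recursion, so the combination of the two recursive calls and the cross check covers all pairs.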
5.3.2 Simple closed polygonal curves
Let us now consider the case where P ∈ K1 is a simple closed polygonal curve. We are not aware of any related work on the problem of computing the detour for closed curves. For a closed rectifiable curve P and two points x, y ∈ P with x ≠ y, the distance on P between these two points, dP(x, y), is the length of the shortest subcurve of P connecting x and y. The detour between two points on P and the detour of P itself are defined in the same manner as for open curves.

Since the statement of Lemma 5.11 is no longer valid for closed curves, we take a different approach and use a recursive partition technique that enables us to apply our previous techniques to the detour problem for closed curves; we first prove a technical result:

Lemma 5.14 (Top-bottom detour of a closed polygonal curve). Let κ ≥ 1, and let P ∈ K1 be a simple closed polygonal curve with four vertices tL, tR, bR, bL on P (in that order) such that

1. dP(tL, tR) = dP(bR, bL),
2. dP(tR, bR) = dP(bL, tL), and
3. max(δP(T ∪ R), δP(R ∪ B), δP(B ∪ L), δP(L ∪ T)) ≤ κ,

with T := P|[tL, tR], R := P|[tR, bR], B := P|[bR, bL], and L := P|[bL, tL]. Let n be the total number of vertices of T and B. Assume that for any two vertices l, r of P the distance dP(l, r) can be computed in O(1) time. Then one can decide in O(n log² n) time whether δP(T, B) ≤ κ.

Proof. Let us assume wlog that T contains more than n/2 vertices; call this number nT. We claim that the following algorithm correctly decides if δP(T, B) ≤ κ.

Algorithm TB Closed Curve Detour (P, tL, tR, bL, bR, κ)

1. Choose a point t ∈ P on T so that TL := P|[tL, t] and TR := P|[t, tR] contain nT/2 vertices each.
2. Split B with a (possibly new) vertex b into BL := P|[b, bL] and BR := P|[bR, b], such that dP(bR, b) = dP(tL, t) and dP(b, bL) = dP(t, tR).
3. Check if δPL(TL, BL) ≤ κ, where PL := TL ∪ L ∪ BL.
4. Check if δPR(TR, BR) ≤ κ, where PR := TR ∪ R ∪ BR.
5. Recursively call the algorithm with t′L = tL, t′R = t, b′R = bR, b′L = b to check that δP(TL, BR) ≤ κ.
6. Recursively call the algorithm with t′′L = t, t′′R = tR, b′′R = b, b′′L = bL to check that δP(TR, BL) ≤ κ.
[Figure: the closed curve P split by the points tL, t, tR, bR, b, bL into the subcurves TL, TR (forming T), R, BR, BL (forming B), and L.]
Correctness: First we prove that δP(PL) ≤ κ whenever δPL(TL, BL) ≤ κ. To see this, observe that dP(u, v) = dPL(u, v) for u, v ∈ PL and therefore δP(TL, BL) = δPL(TL, BL) ≤ κ; similarly δP(TR, BR) = δPR(TR, BR) ≤ κ. Now since

δP(TL, BL) ≤ κ,
δP(L) ≤ δP(L ∪ T) ≤ κ,
δP(L, TL) ≤ δP(L ∪ TL) ≤ δP(L ∪ T) ≤ κ,
δP(L, BL) ≤ δP(L ∪ BL) ≤ δP(L ∪ B) ≤ κ,
δP(TL) ≤ δP(L ∪ TL) ≤ δP(L ∪ T) ≤ κ, and
δP(BL) ≤ δP(L ∪ BL) ≤ δP(L ∪ B) ≤ κ,

we can conclude that δP(PL) ≤ κ. A symmetric argument yields that δP(PR) ≤ κ if δPR(TR, BR) ≤ κ.

In order to apply induction we have to argue that the subproblems created in Steps 5 and 6 of the algorithm meet the preconditions of Lemma 5.14: To this end, observe that dP(tL, t) = dP(bR, b) and dP(t, bR) = dP(b, tL). With T′ = TL, R′ = TR ∪ R, B′ = BR, and L′ = BL ∪ L it follows that

• δP(T′ ∪ R′) ≤ κ, since T′ ∪ R′ = T ∪ R, and δP(T ∪ R) ≤ κ,
• δP(R′ ∪ B′) ≤ κ, since R′ ∪ B′ = PR, and δP(PR) ≤ κ,
• δP(B′ ∪ L′) ≤ κ, since B′ ∪ L′ = B ∪ L, and δP(B ∪ L) ≤ κ, and finally
• δP(L′ ∪ T′) ≤ κ, since L′ ∪ T′ = PL, and δP(PL) ≤ κ.
Similarly, dP(t, tR) = dP(b, bL) and dP(tR, b) = dP(bL, t), and with T′′ = TR, R′′ = R ∪ BR, B′′ = BL, and L′′ = L ∪ TL it follows that

• δP(T′′ ∪ R′′) ≤ κ, since T′′ ∪ R′′ = PR and δP(PR) ≤ κ,
• δP(R′′ ∪ B′′) ≤ κ, since R′′ ∪ B′′ = R ∪ B and δP(R ∪ B) ≤ κ,
• δP(B′′ ∪ L′′) ≤ κ, since B′′ ∪ L′′ = PL and δP(PL) ≤ κ, and finally
• δP(L′′ ∪ T′′) ≤ κ, since L′′ ∪ T′′ = L ∪ T and δP(L ∪ T) ≤ κ.

The correctness of the algorithm therefore follows inductively.

Complexity: Let nX denote the number of vertices of the curve X, where X ∈ {T, B, TL, TR, BL, BR}. We have

n = nT + nB = nTL + nTR + nBL + nBR = 2nTL + nBL + nBR,   (5.1)

and

n/2 ≤ nT = nTL + nTR = 2nTL = 2nTR.   (5.2)
Observe that in order to decide if δPL(TL, BL) ≤ κ and δPR(TR, BR) ≤ κ in Steps 3 and 4 of Algorithm TB Closed Curve Detour, we can use the result of Lemma 5.12, since max(δPL(TL), δPL(BL)) ≤ κ and max(δPR(TR), δPR(BR)) ≤ κ. This follows since (as noted before) dP(u, v) = dPL(u, v) for u, v ∈ PL, and therefore δPL(TL) = δP(TL) ≤ δP(T ∪ R) ≤ κ; the other inequalities can be deduced in the same way.

So the costs of Steps 3 and 4 are O((nTL + nBL) log(nTL + nBL)) and O((nTR + nBR) log(nTR + nBR)), respectively. Therefore the runtime T(n) of Algorithm TB Closed Curve Detour obeys the following recursion:

T(n) ≤ T(nTL + nBR) + T(nTR + nBL) + c(nTL + nBL) log(nTL + nBL) + c(nTR + nBR) log(nTR + nBR)
     ≤ T(nTL + nBR) + T(nTR + nBL) + c n log n
for some constant c > 0. The size of each subproblem is at most 3n/4, since (5.1) and (5.2) imply that

n/4 ≤ nTL ≤ nTL + nBR = nTL + n − 2nTL − nBL = n − nTL − nBL ≤ 3n/4 − nBL ≤ 3n/4,

and the total size of the subproblems remains the same at each level of the recursion. Thus the recursion tree has logarithmic depth, and the total work amounts to T(n) = O(n log² n).

Lemma 5.15 (Detour of a closed polygonal curve). Let P ∈ K1 be a simple closed polygonal curve on n vertices, and let κ ≥ 1. Then one can decide in O(n log² n) time whether δ(P) ≤ κ.

Proof. In a preprocessing step we pick an arbitrary vertex P0 on P and handle P as if it were an open polygonal curve P′ with starting and ending point P0; we traverse P′ in O(n) time, starting from that vertex, and store in each vertex v the distance dP′(P0, v), together with the total arclength of P. This will enable us to determine the distance on P between any pair of vertices on P, and the distance between any pair of vertices on a subcurve of P, in O(1) time (this is required in order to apply Lemma 5.14).

Now we proceed as follows: First we split P into four consecutive pieces P1, . . . , P4 of equal arclength. Since δ(P) = max_{1≤i≤j≤4} δP(Pi, Pj), we have that δ(P) ≤ κ ⇐⇒ δP(Pi, Pj) ≤ κ for all 1 ≤ i ≤ j ≤ 4. Next we define four overlapping polygonal curves Qi := Pi ∪ Pi+1 (indices taken mod 4) for 1 ≤ i ≤ 4. Observe that dP(u, v) = dQi(u, v) for u, v ∈ Qi, and δ(Qi) = max(δP(Pi), δP(Pi+1), δP(Pi, Pi+1)) ≤ δ(P) for all 1 ≤ i ≤ 4. Now we check if δ(Qi) ≤ κ for 1 ≤ i ≤ 4 with the algorithm from Lemma 5.13. If this is not the case, we are done, since δ(P) ≥ δ(Qi) > κ. Otherwise we know that δP(Pi) ≤ κ and δP(Pi, Pi+1) ≤ κ for 1 ≤ i ≤ 4, so it remains to verify whether δ(Pi, Pi+2) ≤ κ for i = 1, 2. This can be decided in O(n log² n) time by running the algorithm from the proof of Lemma 5.14 twice.
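For intuition, a brute-force vertex-pair version of the closed-curve detour, with dP taken as the shorter of the two arcs, can be sketched as follows (again only a lower bound on δ(P), and an O(n²) stand-in for the O(n log² n) algorithm; names are ours):

```python
import math

def closed_vertex_detour(P):
    """Lower bound on the detour of a closed polygonal curve P
    (vertex pairs only); d_P is the shorter of the two connecting arcs."""
    n = len(P)
    arc = [0.0]
    for k in range(n):                         # closing edge included
        ax, ay = P[k]; bx, by = P[(k + 1) % n]
        arc.append(arc[-1] + math.hypot(bx - ax, by - ay))
    total = arc[-1]
    best = 1.0
    for i in range(n):
        for j in range(i + 1, n):
            d_arc = min(arc[j] - arc[i], total - (arc[j] - arc[i]))
            dist = math.hypot(P[j][0] - P[i][0], P[j][1] - P[i][1])
            if dist > 0:
                best = max(best, d_arc / dist)
    return best
```

On the unit square the maximum over vertex pairs is attained at a diagonal: arc distance 2 against Euclidean distance √2.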
Part III Computing the Hausdorff distance between polyhedral surfaces in R 3
Many real-world applications, e.g., molecular docking, CAD/CAM, or terrain comparison in GIS, give rise to geometric pattern matching problems in three-dimensional space. In most of these applications the patterns are modeled by triangulated simple (i.e., not self-intersecting) polyhedral surfaces; such surfaces can be seen as sets of triangles with the additional property that any two distinct triangles in the set do not intersect in their relative interiors.

Definition 6.1 (∆-pattern). A finite set P of triangles in R3 with the property that any two distinct triangles in P do not intersect in their relative interiors will be called a ∆-pattern. The size of the ∆-pattern P is the number of triangles in the set P.

In this part the similarity of ∆-patterns will be assessed with the aid of the Hausdorff distance δH (cf. Definition 1.4 on page 7) and a generalization thereof, the so-called LB-Hausdorff distance, which is defined with respect to a centrally symmetric convex body B ⊆ R3. Recall that such a body B defines a metric on R3 in the following way:

Definition 6.2 (LB-distance, LB-norm). Let x and y be two points in R3. Then dLB(x, y) denotes the LB-distance between x and y,

dLB(x, y) := ||x − y||LB,

with ||z||LB being the LB-norm of z ∈ R3,

||z||LB := min{c ≥ 0 | z ∈ c · B}.

By modifying the definition of δH (cf. Definition 1.4) to use the LB-distance instead of the Euclidean distance between points in R3, we can define a distance measure that constitutes a generalization of the Hausdorff distance δH:

Definition 6.3 (LB-Hausdorff distance, one-sided LB-Hausdorff distance). Let P and Q be compact sets in R3. Then δ^B_H(P, Q) denotes the LB-Hausdorff distance between P and Q, defined as

δ^B_H(P, Q) := max(δ̃^B_H(P, Q), δ̃^B_H(Q, P)),

with δ̃^B_H(P, Q) := sup_{x∈P} inf_{y∈Q} dLB(x, y), the one-sided LB-Hausdorff distance from P to Q.
Variants of the Hausdorff distance that are frequently considered for comparing geometric patterns are obtained by choosing B = B1 = {(x, y, z) ∈ R3 | |x| + |y| + |z| ≤ 1} or B = B∞ = {(x, y, z) ∈ R3 | max(|x|, |y|, |z|) ≤ 1}, the unit balls of the L1 and L∞ metric, respectively. For B = B2 = {(x, y, z) ∈ R3 | x² + y² + z² ≤ 1} (the three-dimensional unit ball) the resulting distance is the Hausdorff distance δH.

In the following we give an algorithm that computes δH(P, Q) in Oǫ((mn)^{15/16+ǫ}(m^{17/16} + n^{17/16})) randomized time for given ∆-patterns P and Q of size m and n, respectively (cf. Theorem 6.19 on page 68). We also present an algorithm that decides whether the Hausdorff distance with respect to a polyhedral distance function between P and Q is at most δ in Oǫ((m + n)^{2+ǫ} + mn(√m^{1+ǫ} + √n^{1+ǫ})) randomized time (cf. Theorem 6.12 on page 66). Some of the material has already been published in [11].
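For the standard unit balls the LB-norm min{c ≥ 0 | z ∈ c · B} has a closed form; a small sketch with our own function names:

```python
def norm_B1(z):
    """L_B-norm for B = B1 (cross-polytope): |x| + |y| + |z|."""
    return sum(abs(c) for c in z)

def norm_Binf(z):
    """L_B-norm for B = B_inf (cube): max(|x|, |y|, |z|)."""
    return max(abs(c) for c in z)

def norm_B2(z):
    """L_B-norm for B = B2 (Euclidean ball)."""
    return sum(c * c for c in z) ** 0.5

def in_scaled_ball(z, c, norm):
    """Membership test z in c*B, so norm(z) == min{c >= 0 | z in c*B}."""
    return norm(z) <= c
```

The membership test makes the Minkowski-functional definition concrete: z lies in c · B exactly when its LB-norm is at most c.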
In the following we develop efficient algorithms for the measuring problem for ∆-patterns in R3 with respect to the Hausdorff distance.

Problem 6.4 (∆-pattern δH-measure problem – optimization version). Given two ∆-patterns P, Q ⊆ R3, compute δH(P, Q).

In the course of that we will first devise efficient methods to tackle the corresponding decision problem.

Problem 6.5 (∆-pattern δH-measure problem – decision version). Given two ∆-patterns P, Q ⊆ R3 and some δ > 0, decide whether δH(P, Q) ≤ δ.

Godau [35] gives a randomized algorithm that computes δH(P, Q) in Oǫ(mn(n^{1+ǫ} + m^{1+ǫ})) expected time, where m and n denote the size of P and Q, respectively; this algorithm is cubic in the diagonal case, where m = Θ(n). In Section 6.3 below we improve upon these results and give an algorithm that computes δH(P, Q) in Oǫ((mn)^{15/16+ǫ}(m^{17/16} + n^{17/16})) randomized expected time (cf. Theorem 6.19 on page 68); in the diagonal case this becomes Oǫ(n^{47/16+ǫ}) ≈ O(n^{2.9375}) and constitutes the first subcubic solution for Problem 6.4. For variants of the Hausdorff distance, which we get when using a polyhedral distance function instead of the Euclidean distance to measure the distance between points in R3, we improve our results in Section 6.2 and obtain an algorithm that solves the decision problem in Oǫ((m + n)^{2+ǫ} + mn(√m^{1+ǫ} + √n^{1+ǫ})) time (cf. Theorem 6.12 on page 66); in the diagonal case, where m = Θ(n), this becomes Oǫ(n^{5/2+ǫ}) ≈ O(n^{2.5}). Again, the algorithm of Godau was the most efficient procedure known thus far for that problem.

For the remainder of this chapter, P and Q are ∆-patterns of size m and n, respectively, and δ > 0 is fixed. As before, P0 will denote the set of vertices of P.
6.1 Outline of the method
The basic idea is best illustrated if we consider the problem in a somewhat simplified setting. To this end we make use of a different distance function, which we get when using a polyhedral distance function instead of the Euclidean distance to measure the distance between points in R3. For the remainder of this chapter, B will be a centrally symmetric convex body, BP will be a centrally symmetric convex polyhedron, and B2 = {(x, y, z) ∈ R3 | x² + y² + z² ≤ 1} will be the three-dimensional unit ball.

We have that δ̃^B_H(P, Q) ≤ δ iff for each point of P there is a point of Q that is δ-close (wrt the LB-distance). Therefore it makes sense to look at the set of all points that are δ-close to Q:
Definition 6.6 (LB-δ-neighborhood, LB-δ-boundary). Let Q be a compact set in R3. Then nh^B_δ(Q) denotes the LB-δ-neighborhood of Q, defined as

nh^B_δ(Q) := {x ∈ R3 | dLB(x, Q) ≤ δ} = Q ⊕ (δ · B),

and bd^B_δ(Q) denotes the boundary of the LB-δ-neighborhood of Q, i.e.,

bd^B_δ(Q) := {x ∈ R3 | dLB(x, Q) = δ}.
Our results are based on the following simple observation, which is subsumed in Lemma 6.7 below: The one-sided LB-Hausdorff distance from P to Q is at most δ iff all vertices of P are contained in the LB-δ-neighborhood of Q and none of the triangles in P intersects the boundary bd^B_δ(Q).

Lemma 6.7 (The connection between δ̃^B_H and bd^B_δ). Let P, Q be ∆-patterns in R3, and let δ > 0. Then

δ̃^B_H(P, Q) < δ ⇐⇒ P ⊂ nh^B_δ(Q) ⇐⇒ P0 ⊂ nh^B_δ(Q) and P ∩ bd^B_δ(Q) = ∅.

So we are left with the task of verifying whether P0 ⊂ nh^B_δ(Q) ('inclusion property') and P ∩ bd^B_δ(Q) = ∅ ('intersection property') hold. The first property can be checked in O(mn) steps by computing the distance of each vertex in P0 to each triangle in Q in O(1) time. Note that this method can also compute the triangles ∆ ∈ P that contain a vertex outside of nh^B_δ(Q).

Lemma 6.8 (Checking the inclusion property). For B ∈ {BP, B2} we can decide whether P0 ⊂ nh^B_δ(Q) in O(mn) time.

In the following two sections we will describe efficient algorithms to verify the second property when B is a convex polyhedron or a ball. The basic approach will be the same in both cases, but the details (and the runtime) will differ marginally.
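On finite point sets the one-sided LB-Hausdorff distance of Definition 6.3 is a direct sup-inf computation. The following sketch (our names; vertex-to-triangle distances are replaced by point-to-point distances) also illustrates why the inclusion check of Lemma 6.8 costs O(mn) distance evaluations:

```python
def one_sided_hausdorff(P, Q, dist):
    """One-sided Hausdorff distance from finite set P to finite set Q:
    max over p in P of min over q in Q of dist(p, q)."""
    return max(min(dist(p, q) for q in Q) for p in P)

def linf(p, q):
    """L_B-distance for B = B_inf (the unit cube)."""
    return max(abs(a - b) for a, b in zip(p, q))
```

The asymmetry of the one-sided variant is visible immediately: swapping the arguments can change the result.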
6.2 The LBP-Hausdorff distance of ∆-patterns
For a triangle ∆ ∈ Q, the set nh^BP_δ(∆) is the convex hull of three copies of δ · BP centered at the vertices of ∆. Its boundary is a convex polyhedron and, since bd^BP_δ(Q) is contained in the union of these boundaries, bd^BP_δ(Q) is a polyhedral set, i.e., it is the union of a finite set of triangles that do not intersect in their relative interiors.
Figure 6.1: The LB1-δ-neighborhood of a triangle in R3.
The complexity of bd^BP_δ(Q) is the number of its vertices, edges, and 2-faces.

Theorem 6.9 (Computation of bd^BP_δ, Aronov/Sharir, [19, 20]). The boundary bd^BP_δ(Q) has complexity O(n²) and can be computed in O(n² log² n) randomized expected time.

The algorithm of Aronov/Sharir computes a description of bd^BP_δ(Q) where each 2-face is partitioned into triangles. In order to verify the intersection property we need a method to detect triangle-triangle intersections:

Theorem 6.10 (Triangle intersection queries for bd^BP_δ, Pellegrini, [42]). Let T be a set of k triangles in R3 with disjoint interiors. We can build a data structure of size Oǫ(k^{4+ǫ}) in Oǫ(k^{4+ǫ}) time, such that for any query triangle ∆ we can decide in Oǫ(k^ǫ) time if ∆ intersects T.
Lemma 6.11 (Checking the intersection property – polyhedral case). We can decide whether P ∩ bd^BP_δ(Q) = ∅ in Oǫ(n^{2+ǫ} + m n^{3/2+ǫ}) randomized expected time.

Proof. In a first step we compute a description of bd^BP_δ(Q) with the algorithm from Theorem 6.9. This can be done in O(n² log² n) time and yields a set of O(n²) triangles that partition the boundary of nh^BP_δ(Q). Now we distinguish two cases:

m² ≤ n: We run the algorithm of Theorem 6.10 to build a data structure of size Oǫ(m^{4+ǫ}) in Oǫ(m^{4+ǫ}) time that supports triangle intersection queries to P in Oǫ(m^ǫ) time, and then we query this data structure with all triangles in bd^BP_δ(Q) to test for intersections in Oǫ(n² m^ǫ) steps. The total time spent is Oǫ(m^{4+ǫ} + n² m^ǫ) = Oǫ(n^{2+ǫ} + n² m^ǫ) = Oǫ(n^{2+ǫ}).

n ≤ m²: We partition P into g = ⌈m/n^{1/2}⌉ groups of k = n^{1/2} ≤ m triangles each. For each group, we run the algorithm of Theorem 6.10 to build a data structure of size Oǫ(k^{4+ǫ}) in Oǫ(k^{4+ǫ}) time that supports triangle intersection queries in Oǫ(k^ǫ) time, and then we query this data structure with all triangles in bd^BP_δ(Q) to test for intersections in Oǫ(n² k^ǫ) steps. The total time spent is Oǫ(g(k^{4+ǫ} + n² k^ǫ)) = Oǫ(g n^{2+ǫ/2}) = Oǫ(m n^{3/2+ǫ}).
By applying Lemmas 6.8 and 6.11 twice we get the main result of this section:

Theorem 6.12 (∆-pattern δ^BP_H-measure problem – decision version). We can decide whether δ^BP_H(P, Q) ≤ δ in Oǫ((m + n)^{2+ǫ} + mn(√m^{1+ǫ} + √n^{1+ǫ})) randomized expected time.
6.3 The Hausdorff distance of ∆-patterns
In the Euclidean case we follow the same basic approach as in the previous section. However, since the geometric structure of the boundary bd^B2_δ(Q) is more complicated, we have to resort to data structures that do not match the efficiency of the ones we used in the polyhedral case. Therefore the runtime of the algorithm will be slightly worse.

For a triangle ∆, the set nh^B2_δ(∆) is the convex hull of three copies of a δ-ball centered at the vertices of ∆; it is the (non-disjoint) union of three balls of radius δ around the vertices of ∆, three cylinders of radius δ around the edges of ∆, and a triangular prism of height 2δ around ∆.

Figure 6.2: The LB2-δ-neighborhood of a triangle in R3.

The complexity of bd^B2_δ(Q) is the number of vertices, edges, and 2-faces of bd^B2_δ(Q). A 2-face of the boundary is a maximal connected closed subset of bd^B2_δ(Q) contained in one spherical, cylindrical, or triangular portion of bd^B2_δ(∆), for some ∆ ∈ Q. An edge of bd^B2_δ(Q) is a maximal connected subset of bd^B2_δ(Q) contained in two 2-faces, and a vertex of bd^B2_δ(Q) is contained in three 2-faces.

Theorem 6.13 (Computation of bd^B2_δ, Agarwal/Sharir, [3, 4]). The boundary bd^B2_δ(Q) has complexity Oǫ(n^{2+ǫ}) and can be computed in Oǫ(n^{2+ǫ}) randomized expected time.
The algorithm of Agarwal/Sharir computes a description of bd^B2_δ(Q) where each 2-face is partitioned into semialgebraic surface patches of constant description complexity. Each of these surface patches is contained in one spherical, cylindrical, or triangular portion of bd^B2_δ(∆) for some ∆ ∈ Q (the same ∆ that contains the corresponding 2-face) and is bounded by at most four arcs. Each arc in turn is part of the intersection of the portion of the boundary that contains the patch with either a plane, a δ-sphere, or a δ-cylinder. A polynomial expression defining a patch is formed by the conjunction of five atomic expressions of degree at most two: one polynomial equation describing the portion of bd^B2_δ(∆) that contains the patch (i.e., a cylinder, a sphere, or a plane) and at most four polynomial inequalities defining the arcs (again these are equations describing a cylinder, a sphere, or a plane).

In order to verify the intersection property we need a method to detect intersections between the triangles in P and the surface patches of bd^B2_δ(Q). We will apply a standard approach suggested in [27] and [25] and transform this problem to a semialgebraic point-location problem.

Lemma 6.14 (Triangle intersection queries for bd^B2_δ). Let Ω be a set of k semialgebraic sets of constant description complexity in R3. We can build a data structure of size Oǫ(k^{16+ǫ}) in Oǫ(k^{16+ǫ}) randomized expected time, such that for any query triangle ∆ we can decide in Oǫ(k^ǫ) time if ∆ intersects Ω.
Proof. Let ∆(p1, p2, p3; x) be a polynomial expression that defines a triangle ∆ depending on its three vertices p1, p2, and p3, i.e., ∆ = {x ∈ R3 | ∆(p1, p2, p3; x) holds}; we can form ∆ as the conjunction of three linear inequalities and one linear equation. Let Γ(x) be a polynomial expression that defines a set Γ ∈ Ω, i.e., Γ = {x ∈ R3 | Γ(x) holds}. For some fixed Γ, consider the set

CΓ = {(p1, p2, p3) ∈ R9 | (∃x : ∆(p1, p2, p3; x) ∧ Γ(x)) holds}.

If we look at R9 as the configuration space of the set of all triangles in 3-space, then CΓ is the set of (the parameters of) all triangles that intersect Γ. By quantifier elimination [30] we can find a polynomial expression CΓ(p1, p2, p3) that defines CΓ; therefore this set is semialgebraic, too. Let F = {fΓ1, . . . , fΓl | Γ ∈ Ω} denote the set of O(k) many polynomials that appear in the atomic polynomial expressions forming the expressions CΓ. With the algorithm of Theorem 3.2 we can compute a point-location data structure of size Oǫ(k^{16+ǫ}) in Oǫ(k^{16+ǫ}) time for the arrangement of the varieties fΓi = 0 defined by F. Since the signs of all polynomials in F, and therefore the validity of each polynomial expression CΓ, are constant on each cell of the decomposition of R9 induced by these varieties, the claim follows.

Lemma 6.15 (Checking the intersection property – Euclidean case). We can decide whether P ∩ bd^B2_δ(Q) = ∅ in Oǫ(mn^ǫ + n^{2+ǫ} m^{15/16+ǫ}) randomized expected time.

Proof. In a first step we compute a description of bd^B2_δ(Q) with the algorithm from Theorem 6.13. This can be done in Oǫ(n^{2+ǫ}) time and yields a set of Oǫ(n^{2+ǫ}) semialgebraic surface patches of constant description complexity that partition the boundary of nh^B2_δ(Q). Now we distinguish two cases:
n^{32} ≤ m: We run the algorithm of Lemma 6.14 to build a data structure of size Oǫ(n^{32+ǫ}) in Oǫ(n^{32+ǫ}) time that supports triangle intersection queries to bd_δ^{B2}(Q) in Oǫ(n^ǫ) time, and then we query this data structure with all triangles in P to test for intersections in Oǫ(mn^ǫ) steps. The total time spent is Oǫ(n^{32+ǫ} + mn^ǫ) = Oǫ(mn^ǫ).
n^2 ≤ m ≤ n^{32}: We partition bd_δ^{B2}(Q) into g = ⌈n^{2+ǫ}/m^{1/16}⌉ groups of k = m^{1/16} surface patches each. For each group, we run the algorithm of Lemma 6.14 to build a data structure of size Oǫ(k^{16+ǫ}) in Oǫ(k^{16+ǫ}) time that supports triangle intersection queries in Oǫ(k^ǫ) time, and then we query this data structure with all triangles in P to test for intersections in Oǫ(mk^ǫ) steps. The total time spent is
Oǫ(g(k^{16+ǫ} + mk^ǫ)) = Oǫ(gm^{1+ǫ/16}) = Oǫ(n^{2+ǫ} m^{15/16+ǫ}).

Note that this algorithm can also compute the triangles ∆ ∈ P that intersect bd_δ^{B2}(Q).
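As a sanity check on the balancing choice k = m^{1/16} in the second case, the following sketch (with all constants and ǫ-factors dropped, and the function name chosen for illustration only) compares the total work of the grouped strategy against the two extreme group sizes:

```python
def total_cost(n, m, k):
    """Total work of the grouped strategy, constants and epsilon-factors
    dropped: g = ceil(n^2 / k) groups of k patches each; each group costs
    k^16 to preprocess (Lemma 6.14) plus m triangle queries."""
    g = -(-n**2 // k)  # ceil(n^2 / k)
    return g * (k**16 + m)

n, m = 10, 10**16          # a regime with n^2 <= m <= n^32
k = round(m ** (1 / 16))   # the balancing choice k = m^(1/16)

# The balanced group size beats both extremes: one structure per patch
# (k = 1) and one structure over all patches (k = n^2).
assert total_cost(n, m, k) < total_cost(n, m, 1)
assert total_cost(n, m, k) < total_cost(n, m, n**2)
```

With k = m^{1/16}, preprocessing (k^{16} = m) and querying (m per group) contribute equally, which is exactly why this choice minimizes the total up to ǫ-factors.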
Putting Lemma 6.8 and Lemma 6.15 together, we obtain

Lemma 6.16 (∆-pattern δ̃H-measure problem – decision version). We can compute the set X = {∆ ∈ P | δ̃H(∆, Q) > δ} in Oǫ(mn + n^{2+ǫ} m^{15/16+ǫ}) randomized expected time.

By applying Lemma 6.16 twice we get an algorithm for Problem 6.5:

Theorem 6.17 (∆-pattern δH-measure problem – decision version). We can decide whether δH(P, Q) ≤ δ in Oǫ((mn)^{15/16+ǫ} (m^{17/16} + n^{17/16})) randomized expected time.

With the result described in the following theorem and the well-known Clarkson/Shor technique, cf. [28], we can easily turn the algorithm for the decision problem into a randomized procedure that actually computes the minimal distance.

Theorem 6.18 (Computing δ̃H of a triangle to a ∆-pattern, Godau, [35]). Let ∆ be a triangle in R^3. There is a randomized algorithm that computes δ̃H(∆, Q) in Oǫ(n^{2+ǫ}) expected time.

Theorem 6.19 (∆-pattern δH-measure problem – optimization version). We can compute δH(P, Q) in Oǫ((mn)^{15/16+ǫ} (m^{17/16} + n^{17/16})) randomized expected time.

Proof. First we give a randomized algorithm to compute δ̃H(P, Q). We follow a strategy similar to that proposed in [2]. Initially we set δ = 0 and X = P. Then we repeat the following steps until X becomes empty: Choose a random triangle ∆ ∈ X and compute δ′ = δ̃H(∆, Q) in Oǫ(n^{2+ǫ}) time with the algorithm from Theorem 6.18. Set δ to max(δ, δ′). Now compute the set X′ = {∆ ∈ X | δ̃H(∆, Q) > δ} in Oǫ(mn + n^{2+ǫ} m^{15/16+ǫ}) time with the algorithm from Lemma 6.16. Finally, set X to X′. Obviously, the last value of δ will be δ̃H(P, Q). As is shown in [28], the expected number of iterations is O(log m), and therefore the expected time to compute δ̃H(P, Q) with this algorithm is Oǫ((n^{2+ǫ} + mn + n^{2+ǫ} m^{15/16+ǫ}) log m).
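The control flow of the randomized loop in this proof can be sketched generically. The function name and the stand-in distance values below are purely illustrative; what the sketch shows is the Clarkson/Shor-style strategy of [2, 28]: pick a random element, evaluate it exactly, then discard everything not exceeding the current maximum in one batched decision step.

```python
import random

def randomized_max(P, f):
    """Generic version of the loop from the proof of Theorem 6.19:
    computes max over P of f.  Each round performs one exact evaluation
    (Theorem 6.18 in the text) and one batched filtering step
    (Lemma 6.16); the expected number of rounds is O(log m)."""
    delta, X = 0.0, list(P)  # distances are nonnegative, so delta = 0 is safe
    while X:
        delta = max(delta, f(random.choice(X)))
        X = [x for x in X if f(x) > delta]  # keep only elements still "too far"
    return delta

# Stand-in for the distance of a triangle to Q: a stored value per element.
values = {i: (i * 37) % 101 for i in range(200)}
assert randomized_max(values, values.get) == max(values.values())
```

The batched filtering is what makes this pay off: an exact evaluation is expensive and is performed only an expected O(log m) times, while the filtering reuses the cheaper decision algorithm.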
Appendices
Bibliography

[1] P. K. Agarwal, R. Klein, C. Knauer, and M. Sharir. Computing the detour of polygonal curves. Technical Report B 02-03, Freie Universität Berlin, Fachbereich Mathematik und Informatik, 2002.

[2] P. K. Agarwal and M. Sharir. Efficient randomized algorithms for some geometric optimization problems. Discrete Comput. Geom., 16:317–337, 1996.

[3] P. K. Agarwal and M. Sharir. Motion planning of a ball amid polyhedral obstacles in three dimensions. In Proc. 10th ACM-SIAM Sympos. Discrete Algorithms, pages 21–30, 1999.

[4] P. K. Agarwal and M. Sharir. Pipes, cigars, and kreplach: The union of Minkowski sums in three dimensions. In Proc. 15th Annu. ACM Sympos. Comput. Geom., pages 143–153, 1999.

[5] A. V. Aho, J. E. Hopcroft, and J. D. Ullman. Data Structures and Algorithms. Addison-Wesley, Reading, MA, 1983.

[6] O. Aichholzer, F. Aurenhammer, C. Icking, R. Klein, E. Langetepe, and G. Rote. Generalized self-approaching curves. Discrete Applied Mathematics, November 2000. To appear.

[7] T. Akutsu. On determining the congruence of point sets in d dimensions. Comput. Geom. Theory Appl., 9(4):247–256, 1998.

[8] H. Alt, O. Aichholzer, and G. Rote. Matching shapes with a reference point. Internat. J. Comput. Geom. Appl., 7:349–363, 1997.

[9] H. Alt, B. Behrends, and J. Blömer. Approximate matching of polygonal shapes. Ann. Math. Artif. Intell., 13:251–266, 1995.

[10] H. Alt, J. Blömer, M. Godau, and H. Wagener. Approximation of convex polygons. In Proc. 17th Internat. Colloq. Automata Lang. Program., volume 443 of Lecture Notes Comput. Sci., pages 703–716. Springer-Verlag, 1990.
[11] H. Alt, P. Brass, M. Godau, C. Knauer, and C. Wenk. Computing the Hausdorff distance of geometric patterns and shapes. Technical Report B 01-15, Freie Universität Berlin, Fachbereich Mathematik und Informatik, 2001.

[12] H. Alt, U. Fuchs, G. Rote, and G. Weber. Matching convex shapes with respect to the symmetric difference. Algorithmica, 21:89–103, 1998.

[13] H. Alt and M. Godau. Computing the Fréchet distance between two polygonal curves. Internat. J. Comput. Geom. Appl., 5:75–91, 1995.

[14] H. Alt and L. Guibas. Resemblance of geometric objects. In J.-R. Sack and J. Urrutia, editors, Handbook of Computational Geometry, pages 121–153. Elsevier Science Publishers B.V. North-Holland, Amsterdam, 1999.

[15] H. Alt and L. J. Guibas. Discrete geometric shapes: Matching, interpolation, and approximation. In J.-R. Sack and J. Urrutia, editors, Handbook of Computational Geometry, pages 121–153. Elsevier Science Publishers B.V. North-Holland, Amsterdam, 2000.

[16] H. Alt, C. Knauer, and C. Wenk. Bounding the Fréchet distance by the Hausdorff distance. In Abstracts 17th European Workshop Comput. Geom., pages 166–169. Freie Universität Berlin, 2001.

[17] H. Alt, C. Knauer, and C. Wenk. Matching polygonal curves with respect to the Fréchet distance. In Proceedings 18th International Symposium on Theoretical Aspects of Computer Science, pages 63–74, 2001.

[18] H. Alt, K. Mehlhorn, H. Wagener, and E. Welzl. Congruence, similarity and symmetries of geometric objects. Discrete Comput. Geom., 3:237–256, 1988.

[19] B. Aronov and M. Sharir. On translational motion planning in 3-space. In Proc. 10th Annu. ACM Sympos. Comput. Geom., pages 21–30, 1994.

[20] B. Aronov and M. Sharir. On translational motion planning of a convex polyhedron in 3-space. SIAM J. Comput., 26:1785–1803, 1997.

[21] M. J. Atallah. On symmetry detection. IEEE Trans. Comput., C-34:663–666, 1985.

[22] M. D. Atkinson. An optimal algorithm for geometrical congruence. J. Algorithms, 8:159–172, 1987.

[23] F. Aurenhammer. Improved algorithms for discs and balls using power diagrams. J. Algorithms, 9:151–161, 1988.

[24] P. Brass and C. Knauer. Testing the congruence of d-dimensional point sets. In Proc. 16th Annu. ACM Sympos. Comput. Geom., pages 310–314, 2000.
[25] B. Chazelle, H. Edelsbrunner, L. J. Guibas, and M. Sharir. A singly-exponential stratification scheme for real semi-algebraic varieties and its applications. Theoret. Comput. Sci., 84:77–105, 1991.

[26] B. Chazelle, H. Edelsbrunner, L. J. Guibas, M. Sharir, and J. Snoeyink. Computing a face in an arrangement of line segments and related problems. SIAM J. Comput., 22:1286–1302, 1993.

[27] B. Chazelle and M. Sharir. An algorithm for generalized point location and its application. J. Symbolic Comput., 10:281–309, 1990.

[28] K. L. Clarkson and P. W. Shor. Applications of random sampling in computational geometry, II. Discrete Comput. Geom., 4:387–421, 1989.

[29] R. Cole. Slowing down sorting networks to obtain faster sorting algorithms. J. ACM, 34(1):200–208, 1987.

[30] G. E. Collins. Quantifier elimination for real closed fields by cylindrical algebraic decomposition. In Proc. 2nd GI Conference on Automata Theory and Formal Languages, volume 33 of Lecture Notes Comput. Sci., pages 134–183. Springer-Verlag, 1975.

[31] A. Ebbers-Baumann, R. Klein, E. Langetepe, and A. Lingas. A fast algorithm for approximating the detour of a polygonal chain. In ESA 2001 — European Symposium on Algorithms, 2001.

[32] H. Edelsbrunner. Algorithms in Combinatorial Geometry, volume 10 of EATCS Monographs on Theoretical Computer Science. Springer-Verlag, Heidelberg, 1987.

[33] A. Efrat, P. Indyk, and S. Venkatasubramanian. Pattern matching for sets of segments. In Proceedings of the 12th ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 295–304, Washington, D.C., USA, 2001.

[34] S. J. Fortune. A sweepline algorithm for Voronoi diagrams. Algorithmica, 2:153–174, 1987.

[35] M. Godau. On the complexity of measuring the similarity between geometric objects in higher dimensions. PhD thesis, Department of Computer Science, Freie Universität Berlin, 1999.

[36] L. J. Guibas, M. Sharir, and S. Sifrony. On the general motion planning problem with two degrees of freedom. Discrete Comput. Geom., 4:491–521, 1989.

[37] P. T. Highnam. Optimal algorithms for finding the symmetries of a planar point set. Inform. Process. Lett., 22:219–222, 1986.
[38] K. Kedem, R. Livne, J. Pach, and M. Sharir. On the union of Jordan regions and collision-free translational motion amidst polygonal obstacles. Discrete Comput. Geom., 1:59–71, 1986.

[39] G. K. Manacher. An application of pattern matching to a problem in geometrical complexity. Inform. Process. Lett., 5:6–7, 1976.

[40] N. Megiddo. Applying parallel computation algorithms in the design of serial algorithms. J. ACM, 30(4):852–865, 1983.

[41] J. Pach and M. Sharir. On the boundary of the union of planar convex sets. Discrete Comput. Geom., 21:321–328, 1999.

[42] M. Pellegrini. On collision-free placements of simplices and the closest pair of lines in 3-space. SIAM J. Comput., 23(1):133–153, 1994.

[43] F. P. Preparata and M. I. Shamos. Computational Geometry: An Introduction. Springer-Verlag, 3rd edition, Oct. 1990.

[44] G. Rote. Curves with increasing chords. Math. Proc. Camb. Phil. Soc., 115:1–12, 1994.

[45] S. Schirra. Über die Bitkomplexität der ǫ-Kongruenz. Master's thesis, Fachbereich Informatik, Universität des Saarlandes, 1988.

[46] C. Shannon. Probability of error for optimal codes in a Gaussian channel. Bell System Tech. J., 38:611–656, 1959.

[47] M. Sharir and P. K. Agarwal. Davenport-Schinzel Sequences and Their Geometric Applications. Cambridge University Press, New York, 1995.

[48] K. Sugihara. An n log n algorithm for determining the congruity of polyhedra. J. Comput. Syst. Sci., 29:36–47, 1984.

[49] S. Venkatasubramanian. Geometric Shape Matching and Drug Design. PhD thesis, Department of Computer Science, Stanford University, August 1999.

[50] C. Wenk. Shape matching in higher dimensions. PhD thesis, Department of Computer Science, Freie Universität Berlin. In preparation.

[51] A. D. Wyner. Capabilities of bounded discrepancy decoding. AT&T Tech. J., 44:1061–1122, 1965.
Index

L, 28
R, 28
||z||_LB, see LB-norm
A(k, n), see Ackermann function
A2, see set of affine transformations
α(n), see inverse Ackermann function
Ackermann function, 46
atomic polynomial expression, 8
B, 63
B2, 63
BP, 63
bd_δ^B(Q), see LB-δ-boundary
bi-monotone, 28
δ-critical translation, 33
clamped δF-path, 29
clamped δ̃F-path, 29
clamped path, see clamped δF-path, clamped δ̃F-path
closed curve, 23
closed polygonal curve, 23
configuration, see h-configuration, v-configuration
curve, 23
δ-neighborhood, 19
δ(P), see detour
δP(X, Y), see detour
δP(x, y), see detour
d_LB(x, y), see LB-distance
dP(x, y), see κ-straightness
d_discr, see discrete metric
δF-path, 28
δF(P, Q), see Fréchet distance
δH(P, Q), see Hausdorff distance
δH^B(P, Q), see LB-Hausdorff distance
δ̃H(P, Q), see one-sided Hausdorff distance
δ̃H^B(P, Q), see one-sided LB-Hausdorff distance
δ̃F-path, 28
δ̃F(P, Q), see weak Fréchet distance
∆-pattern, 61
Davenport-Schinzel sequence, 48
description complexity, 8
detour, 49
discrete metric, 7
F_δ(P, Q), see free space
Fréchet distance, 23
free space, 28
free space diagram, see free space
Hausdorff distance, 7
h-configuration, 33
inverse Ackermann function, 46
K1, see closed curve
κ-straight curve, see κ-straightness
κ-straightness, 43
K0, see curve
LB-δ-boundary, 63
LB-δ-neighborhood, 63
LB-Hausdorff distance, 61
LB-distance, 61
LB-norm, 61
λs(n), see Davenport-Schinzel sequence
nh_δ^B(Q), see LB-δ-neighborhood
nh_δ(Q), see δ-neighborhood
Oǫ-notation, 2
one-sided LB-Hausdorff distance, 61
one-sided Hausdorff distance, 7
polygonal curve, 23
polynomial expression, 8
R(K, δ, T2), see reference points
R(K, δ, c, T), see reference points
reference points, 38
δ-supercritical translation, 36
semialgebraic set, 8
set of affine transformations, 41
set of planar translations, 28
T_crit^δ(c), see δ-critical translation
T2, see set of planar translations
T′_crit^δ(c), see δ-supercritical translation
Tarski sentence, 18
translation space, 28
v-configuration, 33
weak Fréchet distance, 23
Glossary

L: The lower left corner of F_δ, 28

R: The upper right corner of F_δ, 28

||z||_LB: The LB-norm of z, ||z||_LB := min{c ≥ 0 | z ∈ c · B}, 61

A(k, n): A(k, 1) = 2 for k = 1, A(k, n) = A(k − 1, A(k, n − 1)) for k ≥ 2, 46

A2: The set of affine transformations of the plane, 41

α(n): min{k ≥ 1 | A(k, k) ≥ n}, 46

atomic polynomial expression: An expression of the form P(x) ≤ 0, where P ∈ R[x1, ..., xd], 8

B: A centrally symmetric convex body, 63

B2: The three-dimensional unit ball, 63

BP: A centrally symmetric convex polyhedron, 63

bd_δ^B(Q): The boundary of the LB-δ-neighborhood of Q, i.e., the set of all x such that d_LB(x, Q) = δ, 63

δ(P): The detour of P; δ(P) = max_{x,y ∈ P, x ≠ y} δP(x, y), 49

δP(X, Y): The P-detour between X and Y; δP(X, Y) = max_{x ∈ X, y ∈ Y, x ≠ y} δP(x, y), 49

δP(x, y): The P-detour between x and y; δP(x, y) = dP(x, y)/||x − y||, 49

d_LB(x, y): The LB-distance between x and y, d_LB(x, y) := ||x − y||_LB, 61

dP(x, y): The arc length of the piece of the curve P between x and y, 43

d_discr: The discrete metric, i.e., d_discr(P, Q) = 0 if P = Q and d_discr(P, Q) = 1 otherwise, 7

δF-path: A bi-monotone curve within F_δ(P, Q) from L to R, 28
δF(P, Q): The Fréchet distance between P and Q, 23

δH(P, Q): The Hausdorff distance between P and Q, 7

δH^B(P, Q): The LB-Hausdorff distance between P and Q, 61
nh_δ(Q): The δ-neighborhood of Q, i.e., the set of all x such that d(x, Q) ≤ δ, 19

δ̃H(P, Q): The one-sided Hausdorff distance from P to Q, 7

δ̃H^B(P, Q): The one-sided LB-Hausdorff distance from P to Q, 61
δ̃F-path: A curve within F_δ(P, Q) from L to R, 28

δ̃F(P, Q): The weak Fréchet distance between P and Q, 23

Davenport-Schinzel sequence: A sequence U = ⟨u1, ..., um⟩ over {1, ..., n} such that any two consecutive elements of U are distinct, and U does not contain a subsequence of length s + 2 of the form ababab... for a ≠ b, 48

∆-pattern: A finite set of triangles with the property that any two distinct triangles in the set do not intersect in their relative interiors, 61

description complexity (of a semialgebraic set): This accounts for the number and the maximum degree of the polynomials in a polynomial expression defining the set, 8

F_δ(P, Q): The free space of P and Q, 28

K1: The set of all closed plane curves, 23

κ-straight curve: A curve P such that max_{x ≠ y} dP(x, y)/||x − y|| ≤ κ, 43

K0: The set of all plane curves, 23

λs(n): The maximum length of a Davenport-Schinzel sequence of order s over an n-element alphabet, 48

nh_δ^B(Q): The LB-δ-neighborhood of Q, i.e., the set of all x such that d_LB(x, Q) ≤ δ, 63

Oǫ(f(n, ǫ)): The set of all functions T(n) such that there is a function C(ǫ), and for any ǫ > 0 and for all n, T(n) ≤ C(ǫ) · f(n, ǫ) holds, 2

p_δ: The circle with center p and radius δ, 32

polynomial expression: Any finite boolean combination of atomic polynomial expressions, 8

R(K, δ, T): The set of δ-reference points for K with respect to T, 40
R(K, δ, c, T): The set of δ-reference points for K of quality c with respect to T, 38

semialgebraic set: A set satisfying a polynomial expression, 8

T_crit^δ(c): The set of all translations that are δ-critical for c, 33
T2: The set of planar translations, 28

T′_crit^δ(c): The set of all translations that are δ-supercritical for c, 36
Tarski sentence: A polynomial expression prefixed by a finite number of ∃ and ∀ quantifiers, 18
List of Definitions

Point set pattern matching in d-dimensional space
Hausdorff distance, δH(P, Q), 7
One-sided Hausdorff distance, δ̃H(P, Q), 7

Computing the Hausdorff distance between d-dimensional point sets
δ-neighborhood, nh_δ(Q), 19

Matching of plane curves
Curve, K0, 23
Polygonal curve, 23
Closed curve, K1, 23
Closed polygonal curve, 23
Fréchet distance, δF(P, Q), 23
Weak Fréchet distance, δ̃F(P, Q), 23

Matching polygonal curves with respect to the Fréchet distance
Free space, F_δ(P, Q), 28
δF-path, 28
δ̃F-path, 28
Clamped δF-path, 29
Clamped δ̃F-path, 29
Clamped path, 29
Configuration, 33
h-configuration, 33
v-configuration, 33
δ-critical translation, T_crit^δ(c), 33
δ-supercritical translation, T′_crit^δ(c), 36

Approximately minimizing the Fréchet distance
Reference point, R(K, δ, c, T), 38

Bounding the Fréchet distance by the Hausdorff distance
κ-Straightness, 43
Recognizing κ-straight curves
Detour of a polygonal curve, δP(x, y), δP(X, Y), δ(P), 49

Computing the Hausdorff distance between polyhedral surfaces in R^3
LB-distance, d_LB(x, y), 61
LB-norm, ||z||_LB, 61
LB-Hausdorff distance, δH^B(P, Q), 61
One-sided LB-Hausdorff distance, δ̃H^B(P, Q), 61
LB-δ-neighborhood, nh_δ^B(Q), bd_δ^B(Q), 63
List of Lemmata

Testing the congruence of point sets in R^d
Invariant of algorithm Congruence Test, 13
Computing the Hausdorff distance between d-dimensional point sets
Computing δ̃H(P, Q) by brute force, 18
Computing δ̃H(P, Q) – decision problem, 19

Matching polygonal curves with respect to the Fréchet distance
Continuity lemma I, 32
Continuity lemma II, 32
Characteristic feasible translations – weak version, 33
Characteristic feasible translations for δF – strong version, 34
Characteristic feasible translations for δ̃F – strong version, 35
Characteristic feasible translations for δF, 36
Characteristic feasible translations for δ̃F, 36

Approximately minimizing the Fréchet distance

Recognizing κ-straight curves
Bichromatic disjoint subcurve detour, 50
Detour of a polygonal curve, 53
Top-Bottom detour of a closed polygonal curve, 54
Detour of a closed polygonal curve, 57

Computing the Hausdorff distance between polyhedral surfaces in R^3
The connection between δ̃H^B and bd_δ^B, 64
Checking the inclusion property, 64
Checking the intersection property – polyhedral case, 65
Checking the intersection property – Euclidean case, 67
∆-pattern δ̃H-measure problem – decision version, 68
List of Theorems

Testing the congruence of point sets in R^d
Correctness and complexity of algorithm Congruence Test, 13

Computing the Hausdorff distance between d-dimensional point sets
Computing δ̃H(P, Q), 20

Matching polygonal curves with respect to the Fréchet distance
Correctness and complexity of Fréchet Match, 37
Correctness and complexity of Weak Fréchet Match, 38

Approximately minimizing the Fréchet distance
Correctness and complexity of Approx Fréchet Match, 40
Non-existence of δH-reference points for affine maps, 41

Bounding the Fréchet distance by the Hausdorff distance
δH vs. δF for κ-straight polygonal curves, 44
Computation of the upper envelope of Φ_δ(P, Q), 46

Recognizing κ-straight curves
Recognition of κ-straight curves, 49

Computing the Hausdorff distance between polyhedral surfaces in R^3
∆-pattern δH^{BP}-measure problem – decision version, 66
Triangle intersection queries for bd_δ^{B2}, 67
∆-pattern δH-measure problem – decision version, 68
∆-pattern δH-measure problem – optimization version, 68
List of Algorithms

Testing the congruence of point sets in R^d
Algorithm Congruence Test, 12
Matching polygonal curves with respect to the Fréchet distance
Algorithm Fréchet Match, 37
Algorithm Weak Fréchet Match, 37

Approximately minimizing the Fréchet distance
Algorithm Approx Fréchet Match, 40

Recognizing κ-straight curves
Algorithm TB Closed Curve Detour, 54
Curriculum Vitae

Personal data:
• Name: Christian Michael Knauer
• Date of birth: 27 June 1971
• Place of birth: Kronach

Education:
• Volksschule Reichenbach, 1977 to 1981
• Frankenwaldgymnasium Kronach, 1981 to 1990
• Studies in computer science at the Friedrich-Alexander Universität Erlangen-Nürnberg, 1 November 1991 to 5 May 1997
• Member of the graduate program "Diskrete Algorithmische Mathematik" at the Freie Universität Berlin, 1 June 1997 to 29 April 1998
• Research assistant in the "Theoretische Informatik" group at the Institut für Informatik of the Freie Universität Berlin, since 1 May 1998

Scholarships:
• Scholarship under the Bavarian gifted-student support act (Begabtenförderungsgesetz), 1 November 1991 to 31 March 1995
• Doctoral scholarship in the graduate program "Diskrete Algorithmische Mathematik", 1 June 1997 to 29 April 1998

Other:
• Military service (W12) with PzPiKp 110 in Bogen, 1 July 1990 to 30 June 1991
Zusammenfassung (Summary)

The geometric pattern matching problem is the following: given two patterns P and Q from a set of admissible patterns Π, together with a distance measure δ for such patterns, find a transformation τ from a set of admissible transformations T such that the distance δ(τ(P), Q) becomes as small as possible. This dissertation presents efficient algorithms for several variants of this problem; the results can be grouped by the kind of admissible patterns:

Point patterns in R^d: In the first part we consider point patterns; the figures are sets of m and n points in R^d, respectively.

Congruence testing: We give an algorithm that decides in O(n^{⌈d/3⌉} log n) time (for m ≤ n) whether there is a congruence mapping P onto Q (this can be regarded as the special case of the pattern matching problem in which the distance measure is the discrete metric), thereby improving a result of Akutsu [7].

Hausdorff distance: We describe the first non-trivial algorithm for computing the directed Hausdorff distance δ̃H(P, Q) from a set P of points to a set Q of semialgebraic sets of constant description complexity (this can be regarded as the special case of the pattern matching problem in which the identity is the only admissible transformation); the running time is Oǫ(mn^ǫ log m + m^{1+ǫ−1/(2d−2)} n).

Plane curves: The second part deals with pattern matching problems for polygonal curves in the plane; the figures are polygonal curves P and Q with m and n vertices, respectively, and as distance measure we consider the Fréchet distance δF(P, Q) of P and Q.

Matching under translations: We develop the first algorithm that solves the matching problem for polygonal curves with respect to the Fréchet distance under translations; the running time is O((mn)^3 (m + n)^2). In addition, by generalizing the reference point methods of [8], we give an O(ǫ^{−2} mn) approximation algorithm of quality (1 + ǫ) for this problem. Furthermore, we show that no such reference points exist for affine maps.

Hausdorff vs. Fréchet distance: We show that for a certain class of curves the Fréchet distance is bounded linearly in the Hausdorff distance. For curves of this kind we give an O((m + n) log^2(m + n) 2^{α(m+n)}) approximation algorithm for computing δF(P, Q). Finally, we describe the first non-trivial algorithm for recognizing such curves; its running time is O(n log^2 n).

Simple polyhedral surfaces in R^3: In the last part we consider patterns composed of sets P and Q of m and n disjoint triangles in R^3, respectively.

Hausdorff distance: We improve a result of Godau [35] and develop an algorithm that computes δH(P, Q) in Oǫ((mn)^{15/16+ǫ} (m^{17/16} + n^{17/16})) time.
... but in the end it doesn’t even matter ...
Rev. Final