IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 22, NO. 2, FEBRUARY 2000
Implicit Polynomials, Orthogonal Distance Regression, and the Closest Point on a Curve

Nicholas J. Redding, Member, IEEE

Abstract: Implicit polynomials (i.e., multinomials) have a number of properties that make them attractive for modeling curves and surfaces in computer vision. This paper considers the problem of finding the best fitting implicit polynomial (or algebraic curve) to a collection of points in the plane using an orthogonal distance metric. Approximate methods for orthogonal distance regression have been shown by others to be prone to the problem of cusps in the solution and this is confirmed here. Consequently, this work focuses on exact methods for orthogonal distance regression. The most difficult and costly part of exact methods is computing the closest point on the algebraic curve to an arbitrary point in the plane. This paper considers three methods for achieving this in detail. The first is the standard Newton's method, the second is based on resultants, which are recently making a resurgence in computer graphics, and the third is a novel technique based on successive circular approximations to the curve. It is shown that Newton's method is the quickest, but that it can fail sometimes, even with a good initial guess. The successive circular approximation algorithm is not as fast, but is robust. The resultant method is the slowest of the three, but does not require an initial guess. The driving application of this work was the fitting of implicit quartics in two variables to thinned oblique ionogram traces.

Index Terms: Fitting, orthogonal distance regression, implicit polynomials, algebraic curve, successive circular approximation, resultants, ionograms.
1 INTRODUCTION
Implicit polynomials (i.e., multinomials) have a number of properties that make them attractive for modeling curves and surfaces in computer vision. In particular, determining whether a point is on or off the curve or surface is a simple matter as compared with parametric representations. Determining the union of two objects defined by the zero set is also straightforward [3], [31]. Implicit polynomials are also superior in many applications to local boundary models, such as splines and snakes, because the parameters of implicit polynomials capture global properties of the curve's shape and, so, they allow extrapolation [21], [23], [24]. They also have the desired properties of good shape descriptors as elucidated in [17]. In this paper, consideration is given to the problem of finding the best fitting implicit polynomial (or algebraic curve) to a collection of points $S$ in the plane using an orthogonal distance metric. This is a nonlinear least-squares problem that requires the closest point on the curve be computed for every $\mathbf{x} \in S$ at each step. This is computationally very expensive, so many different approaches have been taken to approximate the problem using various assumptions, e.g., [9], [11], [30]. It has been shown, however, that these methods can contain cusps in the solutions that they determine [29]. This observation is confirmed here using two different approximation techniques for the orthogonal distance regression (ODR) problem. As a consequence of my own unsatisfactory experience with approximate ODR techniques, I present here the details of exact methods for the ODR problem which include some novel developments. The most difficult part of the ODR problem is the robust computation of the closest point on a curve for an arbitrary
N.J. Redding is with the Defence Science and Technology Organisation, Surveillance Systems Division, PO Box 1500, Salisbury SA 5108, Australia. E-mail: [email protected].
Manuscript received 25 Mar. 1998; accepted 13 Jan. 2000. Recommended for acceptance by D. Kriegman.
point in the plane. Sullivan et al. [29] use Newton's method, but I have found this to be unsatisfactory in some cases, even with a reasonably good initial guess. Therefore, two alternative methods are examined. The first is based upon the century-old technique of resultants from elimination theory now making a resurgence in computer graphics [15] and the second is a novel technique using successive circular approximations to the curve. The particular application to which this work is applied requires the fitting of implicit quartic polynomials in two variables to oblique ionograms, which are measurements of the high frequency properties of the ionosphere. Oblique ionograms are composed of many different traces (long curved clusters of points). These traces are long and narrow, curving, and highly varying line-like structures (thinned examples are shown in Fig. 3). The traces that are input to the feature extraction steps considered here have already been preprocessed by filtering, gray-scale thinning, and trimming to make them pixel-wide lines [13], [21], [22]. This paper is divided in the following manner: Section 2 details the model and orthogonal distance metric, Section 3 presents a review of orthogonal distance regression, and Section 4 presents methods for computing the closest point on a curve to an arbitrary point in the plane. Section 5 presents the necessary details for implementing a robust ODR procedure and Sections 6 and 7 present simulation results and conclusions, respectively.
2 MODELING PRELIMINARIES
We need to choose a parametrization family $\mathcal{F}$ of curves $C_\theta$ that meet the requirements for a trace model. Let us assume that a trace is fitted by a model of the form $f(\mathbf{x}; \theta) = 0$. Then, we define the family by $\mathcal{F} = \bigcup_\theta C_\theta$, where the curves are given by

$$C_\theta = \{\mathbf{x} : f(\mathbf{x}; \theta) = 0\}, \qquad (1)$$

where $\mathbf{x} = (x, y)$ is the vector of pixel coordinates, $\theta$ is a parameter vector, and

$$f(\mathbf{x}; \theta) = \sum_{j=0}^{p} \sum_{i=0}^{m-j} a_{ij}\, x^i y^j = \sum_{j=0}^{p} \sum_{i=0}^{m-j} a_{ij}\, \varphi_{ij}(x, y), \qquad (2)$$

where the $\varphi_{ij}(x, y)$ are the monomials $x^i y^j$ and $m \geq p$. The model parameter vector $\theta$ defines a curve $C = C_\theta$; its components will be denoted by $a_{ij}$. The complete trace model chosen is given by (1), where the function $f$ is a bivariate quartic given by

$$f(\mathbf{x}; \theta) = \sum_{j=0}^{4} \sum_{i=0}^{4-j} a_{ij}\, x^i y^j. \qquad (3)$$
Given the form of the model, the next issue to be addressed is which metric should be used to determine how well the model fits the trace. Let $S = \{\mathbf{x}_n\}_{n=1}^{N}$ be the set of data points to which we wish to fit a particular curve $C_\theta$ from the model family $\mathcal{F}$. To do so, we need to define a metric $d(C, S)$ for the goodness of fit of $C$ to $S$. Once $d$ has been chosen, we then choose a particular curve $C = C_\theta$ by solving $\min_\theta d(C_\theta, S)$. The properties that we would like $d$ to have are, first, that the best fit to a rotated data set should be the rotation of the best fit to the original data and, second, that the best fit to magnified and translated data should be the magnified and translated version of the best fit to the original data. Therefore, a possible candidate for the metric $d$ is the standard distance metric: the Euclidean distance $d(\mathbf{x}, \mathbf{y}) = \|\mathbf{x} - \mathbf{y}\|$ is invariant under rotation and translation and linear in the magnification of the data. Therefore, we define $d(C_\theta, S)$ to be the sum of the squared Euclidean distances of the set of points $S$ from the curve $C_\theta$:

$$d(C_\theta, S) = \sum_{n=1}^{N} \min_{\mathbf{x} \in C_\theta} \|\mathbf{x}_n - \mathbf{x}\|^2. \qquad (4)$$
In other words, we are attempting to minimize the distance between each data point in the set $S$ and the closest point to it on the curve $C_\theta$ by varying $\theta$. The disadvantage of this metric is that the function $\tilde{\mathbf{x}} = \arg\min_{\mathbf{y} \in C_\theta} \|\mathbf{x}_n - \mathbf{y}\|^2$ can be a discontinuous function of the data point $\mathbf{x}_n$. Thus, $d(C_\theta, S)$ may have discontinuities in its first and higher derivatives. Also, the metric is not consistent with any reasonable statistical model for how the data is formed, for example, with the assumption that the data points recorded are formed by random perturbation of points on the underlying curve. An alternative metric that is more consistent with such a reasonable statistical model is presented in [19]. However, this metric is much more complicated, so I will continue here with the standard one above.
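As an aside, to make the model concrete, the following minimal Python sketch (with NumPy; not from the paper) evaluates the bivariate quartic (3) for a coefficient array a, where a[i, j] holds the coefficient of x^i y^j, and evaluates the metric (4) once the closest points have been obtained by one of the methods of Section 4. The coefficient layout is an assumption made for illustration only.

import numpy as np

def eval_quartic(a, x, y):
    # f(x, y; theta) = sum over i + j <= 4 of a[i, j] * x**i * y**j
    # (entries of a with i + j > 4 are ignored).
    f = 0.0
    for j in range(5):
        for i in range(5 - j):
            f += a[i, j] * x**i * y**j
    return f

def odr_metric(closest_points, data_points):
    # Metric (4): sum of squared distances from each data point to its
    # closest point on the curve (closest points assumed already computed).
    d = np.asarray(closest_points) - np.asarray(data_points)
    return float(np.sum(d * d))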
3 ORTHOGONAL DISTANCE REGRESSION
Given that the model, the family $\mathcal{F}$ of parameterized curves $C_\theta$, is specified by (1) and (2) and the fitting metric $d(C_\theta, S)$ by (4), it remains to be determined which parameter $\theta$ corresponds to a curve $C_\theta$ that best fits a given trace $S$. That is, given a trace consisting of the set of points $S = \{\mathbf{x}_n\}$, $\mathbf{x}_n = (x_n, y_n)$, $n = 1, \ldots, N$, we wish to find parameter values $\theta$ such that the curve $f(\mathbf{x}; \theta) = 0$ best fits the data. The function $f$ is smooth and, usually, the number of data points $N$ is much larger than the number of model parameters $|\theta|$. Therefore, the fitting problem is the one of determining a $\hat{\theta}$ such that

$$\hat{\theta} = \arg\min_\theta d(C_\theta, S).$$

We can express the above fitting problem in the following way: Let $\tilde{\mathbf{x}}_n$ denote the closest (using Euclidean distance) point on the curve $f(\mathbf{x}; \theta) = 0$ to point $\mathbf{x}_n$. Let $\delta_n = \tilde{\mathbf{x}}_n - \mathbf{x}_n$ be the displacement between them. With these definitions,

$$\|\delta_n\|^2 = \min_{\mathbf{x} \in C_\theta} \|\mathbf{x}_n - \mathbf{x}\|^2,$$

and the fitting problem can be expressed as

$$\min_\theta \sum_{n=1}^{N} w_n \|\delta_n\|^2 \quad \text{subject to} \quad f(\mathbf{x}_n + \delta_n; \theta) = 0 \;\; \forall\, \mathbf{x}_n \in S, \qquad (5)$$

where $\|\cdot\|$ is the Euclidean norm and the $w_n$ are a set of weights that are proportional to the intensity of the $n$th pixel in $S$. For the remainder of this paper, I will assume that the weights have unit value. This problem is called orthogonal distance regression (ODR). (The problem gets its name from the fact that the line $\tilde{\mathbf{x}}_n - \mathbf{x}_n$ is orthogonal to the curve $f$ at $\tilde{\mathbf{x}}_n$.)

A number of different approaches have been taken in the literature to solve the ODR problem. In principle, (5) is a nonlinear least squares problem for which standard methods, such as the Levenberg-Marquardt method [5], would be appropriate. However, the evaluation of the orthogonal distance causes difficulties because it requires the computation of the closest point on the curve $C_\theta$ for the current value of $\theta$ for each point in $S$. This computationally intensive step has caused many authors to develop fast approximate algorithms for the ODR problem. In order to reduce its computational complexity, authors have taken the approaches of linearizing the constraints, e.g., [9], linearizing the orthogonal distance metric, e.g., [30], repeated linearizations [12], or penalty function methods (ODRpack, available from netlib) [1].
All these methods are fraught with problems that render them unsatisfactory for our application. Essentially, the problems are due to the approximate method not ensuring that the constraint functions $f(\mathbf{x}; \theta)$ are exactly zero at the point to which they implicitly compute the orthogonal distance. As a result, the distance of the curve to the data point is traded off against the magnitude of $f(\mathbf{x}; \theta)$ and, when the true zero contour is plotted against the points in $S$, it can be plagued by cusps and gaps. An example using the method in [9] is shown in Fig. 1. Fig. 1b shows the surface $f(\mathbf{x}; \theta)$ that was obtained from fitting the data points in Fig. 1a. The surface looks as if a reasonable fit has been obtained because a valley lies along the data points. However, the actual zero set is a complicated, unrealistic curve with a cusp because the ravine does not actually reach the zero plane in one small region on the right. This confirms the statements made in [29] regarding the problems of approximate methods.
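To illustrate how the exact problem (5) can be handed to a standard nonlinear least squares routine, the following is a minimal Python sketch using SciPy's least_squares. It assumes a user-supplied closest_point(theta, x) routine (any of the methods of Section 4) and lets the solver difference the residuals numerically rather than supplying the analytic Jacobian of Section 5; both the function names and this design choice are assumptions made for illustration.

import numpy as np
from scipy.optimize import least_squares

def odr_residuals(theta, data_points, closest_point):
    # Residuals for (5): the orthogonal distance of each data point to the
    # current curve f(x; theta) = 0, computed by an exact closest-point routine.
    return np.array([np.linalg.norm(closest_point(theta, x) - x)
                     for x in data_points])

def fit_odr(theta0, data_points, closest_point):
    # Levenberg-Marquardt on the orthogonal distances (the exact ODR outer loop).
    result = least_squares(odr_residuals, theta0, method='lm',
                           args=(data_points, closest_point))
    return result.x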
4 COMPUTING THE CLOSEST POINT ON A CURVE
Now, because the approximate methods are often unsatisfactory, let us examine exact methods for computing the closest point on a curve. To simplify our notation, let $\mathbf{x}_c$ be the closest point on the curve $f(\mathbf{x}; \theta) = 0$ to the point $\mathbf{x}_p$. There are a number of possible approaches to the problem of computing the closest point on the curve. The most obvious is to express the closest point as a nonlinearly constrained minimization, where

$$\mathbf{x}_c = \arg\min_{\mathbf{x}} \|\mathbf{x}_p - \mathbf{x}\| \quad \text{subject to} \quad f(\mathbf{x}; \theta) = 0. \qquad (6)$$

Methods of solving this nonlinear programming problem as a series of quadratic programming steps are available [26], [32]. However, this was not found to be robust in practice.
4.1 Newton's Method
The closest point can also be expressed as the solution of two simultaneous nonlinear equations. One equation states that the closest point satisfies the curve $f(\mathbf{x}; \theta) = 0$. The other is derived from the fact that the closest point must be at the base of the line perpendicular to the curve that passes through our point of interest $\mathbf{x}_p$. If $\mathbf{x}_p = (x_p, y_p)$ is the coordinate of the point and its closest point on the curve is denoted by $\mathbf{x}_c = (x_c, y_c)$, then the two equations are

$$f(x_c, y_c; \theta) = 0, \qquad \frac{\partial f}{\partial x}(y_p - y_c) - \frac{\partial f}{\partial y}(x_p - x_c) = 0. \qquad (7)$$
There are various methods for solving these simultaneous nonlinear equations. A Newton-Raphson root finding method is presented in [20, Section 9.6, p. 379], which requires a good initial guess. Sullivan et al. [29] use Lagrange multipliers and Newton's algorithm to compute the closest point on the curve for each point in $S$ and arrive at a very similar algorithm. From [29], the equation

$$\begin{pmatrix} 2 + \lambda\frac{\partial^2 f}{\partial x^2} & \lambda\frac{\partial^2 f}{\partial x \partial y} & \frac{\partial f}{\partial x} \\[4pt] \lambda\frac{\partial^2 f}{\partial y \partial x} & 2 + \lambda\frac{\partial^2 f}{\partial y^2} & \frac{\partial f}{\partial y} \\[4pt] \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} & 0 \end{pmatrix}
\begin{pmatrix} \Delta x \\ \Delta y \\ \Delta\lambda \end{pmatrix}
= -\begin{pmatrix} -2(x_p - x) + \lambda\frac{\partial f}{\partial x} \\ -2(y_p - y) + \lambda\frac{\partial f}{\partial y} \\ f(x, y; \theta) \end{pmatrix} \qquad (8)$$

is repeatedly solved for increments $\Delta x$, $\Delta y$, $\Delta\lambda$, with $(x, y)$ initially set to the initial guess and $\lambda = 0$. This method alone was found to be unsatisfactory because it sometimes failed to converge even with a reasonably good initial guess. (Stopping conditions of $10^{-8}$ were used on changes in $x$ and $y$ or in the square of the orthogonal distance between $\mathbf{x}_p$ and $\mathbf{x} = (x, y)$.)
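A minimal Python sketch of the Newton iteration (8) follows; f, grad, and hess are assumed user-supplied callables for the curve, its gradient, and its Hessian at a point, and the 40-iteration cap matches the limit quoted in Section 6. In the fitting code described there, a failure here triggers the SCA routine of Section 4.3.

import numpy as np

def newton_closest_point(f, grad, hess, xp, x0, tol=1e-8, max_iter=40):
    # Newton iteration (8) for the closest point on f(x, y) = 0 to xp,
    # using a Lagrange multiplier lam (initially 0).
    x, lam = np.asarray(x0, dtype=float), 0.0
    for _ in range(max_iter):
        g = grad(x)                      # (fx, fy)
        H = hess(x)                      # [[fxx, fxy], [fxy, fyy]]
        A = np.array([[2.0 + lam * H[0][0], lam * H[0][1], g[0]],
                      [lam * H[1][0], 2.0 + lam * H[1][1], g[1]],
                      [g[0], g[1], 0.0]])
        rhs = -np.array([-2.0 * (xp[0] - x[0]) + lam * g[0],
                         -2.0 * (xp[1] - x[1]) + lam * g[1],
                         f(x)])
        dx, dy, dlam = np.linalg.solve(A, rhs)
        x += (dx, dy)
        lam += dlam
        if dx * dx + dy * dy < tol:      # change in x and y small enough
            return x
    raise RuntimeError("Newton's method failed to converge")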
Homotopy methods are an alternative [18], [34]. However, they have robustness problems as summarized by [10] and as quoted in [16], and they are considered to be computationally expensive [16].
Fig. 1. An example of a cusp resulting from an approximate ODR fitting method applied to the trace in (a), indicated by the dots. The surface (b) appears to be good: the trough lies in the region of the data points. However, the zero set (a) reveals that the surface does not actually reach the zero plane in one region on the right. The result is that the apparent curve of points is not filled by a single curve as would be expected, but by a complicated two-component curve consisting of two closed segments which nearly meet at two cusps, shown in more detail in the insert of (a). (a) Trace and zero set $f(\mathbf{x}; \theta) = 0$. (b) Surface $f(\mathbf{x}; \theta)$.
Another approach, taken by Sederberg, is to bound each of the intersection points with a triangle and then shrink this enclosing triangle with each successive step, splitting the triangle four ways when multiple intersections are enclosed in one triangle [28]. In the following sections, we consider two further alternatives that have certain advantages over the methods considered above.
4.2 Resultants
A class of methods that work well for curves of low degree (up to and including quartics) employ resultants. Using classical elimination theory, it is possible to construct a resultant polynomial from $n$ polynomial (multinomial) equations in $n$ variables such that the roots of the resultant polynomial in one variable correspond to the solutions of the $n$ simultaneous equations. There are a number of century-old methods for constructing a resultant for a system of two polynomials in two unknowns [25], the best known of which are Sylvester's resultant and Cayley's statement of Bézout's method, e.g., [2], [4], [27]. We will use Cayley's statement of Bézout's method because it is more compact and is simpler to derive symbolically.
Let us now consider the resultant method. Let $f(x, y) = 0$ and $g(x, y) = 0$ be the two polynomial equations of degree $m$ and $n$ ($m \geq n$), respectively, in $x$ and of degree $p$ ($p \leq m$) and $q$ ($q \leq n$), respectively, in $y$, with the total degree of each being $m$ and $n$, respectively. Let us assume for the moment that $y$ is a constant. The first part of the following exposition is based upon [2]. The full development of the theory is presented here because it is spread across many sources, some of which may not be readily accessible to the reader.

Cayley observed that the expression

$$\delta(x, \alpha) = f(x, y)\, g(\alpha, y) - f(\alpha, y)\, g(x, y)$$

is zero when $x = \alpha$ or when $(x, y)$ is a common solution of $f = 0$ and $g = 0$. Therefore, $x - \alpha$ divides $\delta$ and gives a polynomial $\Delta(x, \alpha)$ defined by

$$\Delta(x, \alpha) = \frac{f(x, y)\, g(\alpha, y) - f(\alpha, y)\, g(x, y)}{x - \alpha}$$

that is symmetric in $x$ and $\alpha$. Furthermore, $\Delta(x_0, \alpha) = 0$ for any common solution $x_0$ of $f = 0$ and $g = 0$. Now, because $\Delta(x, \alpha)$ is of degree $l = \max(m, n) - 1$ in $\alpha$, the coefficients of $\alpha^l, \alpha^{l-1}, \ldots, \alpha, 1$ must be polynomials in $x$ of degree less than or equal to $l$ (and also of $y$, but remember that we are assuming for the time being that $y$ is a constant). Therefore, it can be shown that

$$\Delta = \left(\alpha^l \;\; \alpha^{l-1} \;\; \ldots \;\; \alpha \;\; 1\right)\, \mathbf{B}(y)\, \left(x^l, x^{l-1}, \ldots, x, 1\right)^T, \qquad (9)$$

where $\mathbf{B}$ is the Bézout matrix of $f(x, y)$ and $g(x, y)$ (with elements composed of polynomials in $y$) that is free of the variables $\alpha$ and $x$. Now, $\Delta(x_0, \alpha) = 0$ for all $\alpha$ (where $x_0$ is a common solution of $f = 0$ and $g = 0$); therefore,

$$\mathbf{B}(y)\, \left(x_0^l, x_0^{l-1}, \ldots, x_0, 1\right)^T = 0. \qquad (10)$$

Therefore, $\det(\mathbf{B}(y)) = 0$ at the common solution and $\left(x_0^l, x_0^{l-1}, \ldots, x_0, 1\right)^T$ is in the corresponding null space of $\mathbf{B}(y)$. Therefore, the determinant of the Bézout matrix is the resultant (with an extraneous factor if $m \neq n$ [4]). Now, the determinant of the Bézout matrix is a polynomial in $y$ and, by Bézout's theorem [3], there are at most $mn$ intersections of $f = 0$ and $g = 0$, which are all roots of this polynomial.
So, computing the roots of this polynomial gives the $y$-coordinates of the $mn$ possible closest points on the curve.

Manocha [15] avoided computing the determinant of the Bézout matrix in the following way. He recognized that $\mathbf{B}(y)$ is a matrix polynomial

$$\mathbf{B}(y) = \mathbf{B}_r y^r + \mathbf{B}_{r-1} y^{r-1} + \cdots + \mathbf{B}_1 y + \mathbf{B}_0, \qquad (11)$$

where $\mathbf{B}_r, \mathbf{B}_{r-1}, \ldots, \mathbf{B}_1, \mathbf{B}_0$ are the $(l+1) \times (l+1)$ matrix coefficients of $y^r, y^{r-1}, \ldots, y, 1$, respectively, and $r = p + q$. Then, in the nonmonic case ($\mathbf{B}_r \neq \mathbf{I}_{l+1}$, the $(l+1) \times (l+1)$ identity matrix), we can form the linearization of this matrix polynomial $\mathbf{B}(y)$ [7] by $\mathbf{C}_L(y) = \mathbf{C}_1 y + \mathbf{C}_2$, where $\mathbf{C}_L$ is an $r(l+1) \times r(l+1)$ matrix and $\mathbf{C}_1$ and $\mathbf{C}_2$ are the companion matrices given by

$$\mathbf{C}_1 = \begin{pmatrix} \mathbf{I}_{l+1} & 0 & \cdots & 0 & 0 \\ 0 & \mathbf{I}_{l+1} & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & \mathbf{I}_{l+1} & 0 \\ 0 & 0 & \cdots & 0 & \mathbf{B}_r \end{pmatrix}, \qquad
\mathbf{C}_2 = \begin{pmatrix} 0 & -\mathbf{I}_{l+1} & 0 & \cdots & 0 \\ 0 & 0 & -\mathbf{I}_{l+1} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & -\mathbf{I}_{l+1} \\ \mathbf{B}_0 & \mathbf{B}_1 & \mathbf{B}_2 & \cdots & \mathbf{B}_{r-1} \end{pmatrix}. \qquad (12)$$

Then, $\det(\mathbf{C}_L(y)) = \det(\mathbf{B}(y))$ from the special structure of (12) and, furthermore, the generalized eigenvalues of $\mathbf{B}(y)$, i.e., the zeros of the scalar polynomial $\det(\mathbf{B}(y))$, and of $\mathbf{C}_1 y + \mathbf{C}_2$ are the same. Therefore, the generalized eigenvalues of $\mathbf{C}_1 y + \mathbf{C}_2$ are the zeros of $\det(\mathbf{B}(y))$. Note that generalized eigenvalues [8] are solutions of the matrix equation $\mathbf{A}\mathbf{z} = \lambda\mathbf{D}\mathbf{z}$, so that the generalized eigenvalue $\lambda$ is a solution of $\det(\lambda\mathbf{D} - \mathbf{A}) = 0$ and, so, in this form, $\mathbf{A} = -\mathbf{C}_2$ and $\mathbf{D} = \mathbf{C}_1$. So, by computing the generalized eigenvalues $\lambda(-\mathbf{C}_2, \mathbf{C}_1)$, we have determined one coordinate of the possible solutions of (7) without computing the determinant of the Bézout matrix $\mathbf{B}(y)$.

In the cases where the generalized eigenvalues have multiplicity one, we can simply compute the other coordinate $x_i$ for each $y_i = \lambda_i(-\mathbf{C}_2, \mathbf{C}_1)$ using the special form of the corresponding generalized eigenvectors in these cases. Now, because $\det(\mathbf{B}(y_i)) = 0$, it has a null space. Following Manocha [14], let us assume that the generalized eigenvectors of $\mathbf{C}_1 y + \mathbf{C}_2$ have the form $\mathbf{u}_i = (\mathbf{v}_i, y_i\mathbf{v}_i, \ldots, y_i^{r-1}\mathbf{v}_i)^T$, where $\mathbf{v}_i = (v_{i0}, \ldots, v_{il})^T$ is a vector in the null space of $\mathbf{B}(y_i)$ (i.e., $\mathbf{B}(y_i)\mathbf{v}_i = 0$). Then, it is easy to show that $\mathbf{C}_2\mathbf{u}_i = -y_i\mathbf{C}_1\mathbf{u}_i$, confirming that $\mathbf{u}_i$ is indeed the form of the generalized eigenvector corresponding to eigenvalue $y_i$. Now, the only way $\mathbf{v}_i$ can satisfy $\mathbf{B}(y_i)(x^l, x^{l-1}, \ldots, x, 1)^T = 0$ from (9) and $\Delta = 0$ for any $\alpha$ is if

$$x_i = \frac{v_{i0}}{v_{i1}} = \cdots = \frac{v_{i(l-1)}}{v_{il}},$$

which gives the other coordinate. In cases where an eigenvalue has multiplicity greater than one, the corresponding null space can have dimension greater than one and the simple relationship to the other coordinate $x_i$ breaks down. In these cases, we determine the other coordinate by substituting the $y$-coordinate into $f = 0$ or $g = 0$ and solving for its roots, which are then checked to determine which is the corresponding $x$-coordinate [15].

The Bézout matrix polynomial that results from the system of polynomial equations in (7), using the quartic in (3), is too large to include here. However, it is easy to derive this matrix polynomial using the steps described leading up to (9) using a symbolic mathematics package. When the algorithm is applied in practice, the terms of this matrix polynomial can be precomputed symbolically in terms of the parameter symbol vector $\theta$ and converted into a suitable programming language so that the actual entries of the matrix are only computed at run time. In this way, it is not necessary to use symbolic algebra during the execution of this algorithm. However, if at run time the higher degree coefficients of the curve $C_\theta$ in (3) are zero, then the appropriate lower degree Bézout matrix polynomial must be used. This approach was implemented using quadratic, cubic, and quartic versions of (3) in conjunction with conditionals to select the appropriate resultant at run time. (A publicly available package for solving systems of polynomial equations using resultants is presented in [33], which also refers to a number of other packages that have not been widely distributed.)
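Once the matrix coefficients $\mathbf{B}_0, \ldots, \mathbf{B}_r$ have been formed, the linearization (11)-(12) and the generalized eigenvalue computation are straightforward to set up numerically. The following Python sketch is an illustration only, with SciPy's QZ-based eig standing in for whatever generalized eigenvalue routine is used; the input layout (a list of square arrays B[0], ..., B[r]) is an assumption.

import numpy as np
from scipy.linalg import eig

def matrix_polynomial_eigenvalues(B):
    # Zeros of det(B(y)) for the matrix polynomial B(y) = sum_k B[k] * y**k,
    # via the companion linearization (11)-(12) and a generalized eigenproblem.
    r = len(B) - 1
    n = B[0].shape[0]
    I = np.eye(n)
    C1 = np.zeros((r * n, r * n))
    C2 = np.zeros((r * n, r * n))
    for k in range(r - 1):
        C1[k * n:(k + 1) * n, k * n:(k + 1) * n] = I          # identity blocks
        C2[k * n:(k + 1) * n, (k + 1) * n:(k + 2) * n] = -I   # -I super-diagonal
    C1[(r - 1) * n:, (r - 1) * n:] = B[r]                      # leading coefficient B_r
    for k in range(r):
        C2[(r - 1) * n:, k * n:(k + 1) * n] = B[k]             # last block row: B_0 ... B_{r-1}
    # Generalized eigenvalues of (-C2, C1): solutions of det(y*C1 + C2) = 0.
    vals, vecs = eig(-C2, C1)
    return vals, vecs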
4.3 Successive Circular Approximation
As a second alternative, let us consider a novel procedure for finding the closest point on the curve. This procedure is based on successive approximations using local circular fits to the radius of curvature of $C_\theta$ and is called successive circular approximation (SCA). For an arbitrary $f = 0$, the radius of curvature $r$ is given by the well-known result that

$$r = -\frac{\left[\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2\right]^{3/2}}{\frac{\partial^2 f}{\partial x^2}\left(\frac{\partial f}{\partial y}\right)^2 + \frac{\partial^2 f}{\partial y^2}\left(\frac{\partial f}{\partial x}\right)^2 - 2\frac{\partial^2 f}{\partial x \partial y}\frac{\partial f}{\partial x}\frac{\partial f}{\partial y}} \qquad (13)$$

(a derivation is available upon request), where all partial derivatives are computed at a point of interest $(x_0, y_0)$ on the curve and a positive sign means the center of curvature is in the direction of the gradient. We use (13) in conjunction with the gradient of $f$ to determine the osculating circle for any point on the curve.

The principle behind the algorithm is shown in Fig. 2a. Here, the osculating circle at $\mathbf{x}_k$ has its center at $\mathbf{x}_r$. If $\mathbf{x}_k$ is our current estimate of the closest point on the curve to the point $\mathbf{x}_p$, then we can compute an improved estimate by determining the point of intersection of the line between $\mathbf{x}_p$ and $\mathbf{x}_r$ and the curve. We determine the intersection point by using the equation of this line to eliminate one variable from the equation for the algebraic curve, resulting in a quartic in one variable. We then determine the roots of this quartic (which is efficiently done using the modified Bernoulli method [35]) and the closest solution to $\mathbf{x}_p$ becomes our improved estimate.

So far the algorithm is straightforward; however, there are special cases that need to be taken into account for the algorithm to work in general. We will now consider the full algorithm, including these special cases. An initial guess, $\mathbf{x}_k$, $k = 0$, is found for the closest point on the curve to $\mathbf{x}_p \in S$. Determining this initial guess will be discussed in Section 4.4. A circular approximation to curve $C_\theta$ at $\mathbf{x}_k$ is then made using (13) and the gradient vector at $\mathbf{x}_k$ to determine the center of the osculating circle. As shown in Fig. 2a, we then compute the point $\mathbf{x}'_k$ on the curve $C_\theta$ by determining its intersection with the line $\mathbf{x}_p \leftrightarrow \mathbf{x}_r$. If this new point $\mathbf{x}'_k$ is closer to $\mathbf{x}_p$, it is called $\mathbf{x}_{k+1}$ and the iteration continues until the stopping conditions are satisfied.

There are two types of special cases that we have to consider, the second of which has two subtypes that have to be dealt with slightly differently. The first case (Type 1) occurs when projection along $\mathbf{x}_p \leftrightarrow \mathbf{x}_r$ does not strike $C_\theta$ at a point such that the distance to the curve is reduced (Fig. 2b). The second type of special case occurs when the line $\mathbf{x}_p \leftrightarrow \mathbf{x}_r$ does not strike the curve at all. This has two cases to be considered: the first (Type 2a) occurs when $\mathbf{x}_p$ is not on (or not very close to) the osculating circle (Fig. 2c) and the second (Type 2b) when it is (Fig. 2d).
Fig. 2. These four diagrams (generated from real examples encountered during fitting of the traces in Fig. 3) illustrate the various conditions that are encountered in the successive circular approximation algorithm for finding the closest point on the curve. See the text for their interpretation. (a) Usual case. (b) Special case type 1. (c) Special case type 2a. (d) Special case type 2b.
Note that, in the Type 2 cases, $\mathbf{x}'_k$ has different meanings that will be explained in a moment. In each of these special cases, we initiate a procedure called line halving, where we take a series of points on the line from $\mathbf{x}'_k$ to $\mathbf{x}_k$, halving the distance to $\mathbf{x}_k$ each time (to within a tolerance of $\mathbf{x}_k$ given by the stopping conditions). We successively try to find the intersection with the curve $C_\theta$ of the line from $\mathbf{x}_p$ through each of these points in turn. If an intersection point can be determined that is closer to $\mathbf{x}_p$ than $\mathbf{x}_k$, then line halving ceases with this new point designated $\mathbf{x}_{k+1}$ and the main iterations continue. If no such intersection can be found, the stopping conditions will have been satisfied and the closest point on the curve will be $\mathbf{x}_k$.

The difference between each of the special cases occurs in the definition of $\mathbf{x}'_k$ used in line halving. For a Type 1 special case, $\mathbf{x}'_k$ is the intersection point of line $\mathbf{x}_p \leftrightarrow \mathbf{x}_r$ and $C_\theta$ that was further from $\mathbf{x}_p$ than $\mathbf{x}_k$ (Fig. 2b). In the Type 2 cases, there is no such intersection point and we need to determine a substitute. For Type 2a, this is met by defining $\mathbf{x}'_k$ to be the intersection of line $\mathbf{x}_p \leftrightarrow \mathbf{x}_r$ with the osculating circle (Fig. 2c), but this will not work for Type 2b because the points $\mathbf{x}_p$ and $\mathbf{x}'_k$ would then be coincident (within some tolerance) and we would not have a line along which to choose a converging series of points for the line-halving procedure. In this last case, we choose $\mathbf{x}'_k$ to be the intersection of the tangent line at $\mathbf{x}_k$ and the line through $\mathbf{x}_p$ which is parallel to the gradient of $f$ at $\mathbf{x}_k$ (Fig. 2d).

The stopping condition used in implementing the algorithm, apart from that already mentioned in relation to line halving, was that the algorithm terminates if the change in distance from $\mathbf{x}_k$ to $\mathbf{x}_p$ is sufficiently small from one iteration to the next. No iteration limit was found to be necessary, nor was any condition regarding a change in $\mathbf{x}_k$ from one iteration to the next used, for the following reason: Consider the case of $\mathbf{x}_p$ at the center of a circle; any point
on the circumference of the circle is a close point and, so, $\mathbf{x}_k$ may swing wildly around on this circumference and still satisfy our requirement of being a close point. All of these points on the circumference would satisfy the stopping condition that their distance from $\mathbf{x}_p$ is the same (within some tolerance) from one iteration to the next.

The algorithm can be expressed concisely by the following pseudocode. Note that ε_stop = 10^-8 is typically used.

QUARTIC-LINE-INTERSECTION(x_k, x_p, f)
1  f' <- f = 0 with y = y_k + (x - x_k)(y_k - y_p)/(x_k - x_p) substituted
2  Solve f' = 0 for x_i in R, i = 1, ..., 4, by the modified Bernoulli method
3  y_i <- y_k + (x_i - x_k)(y_k - y_p)/(x_k - x_p), i = 1, ..., 4
4  x'_k <- argmin_i ||x_p - x_i||^2
5  return x'_k if it exists or NO-SOLUTION otherwise

LINE-HALVING(x'_k, x_k, x_p, f)
1  repeat
2    x'_k <- (x'_k + x_k)/2
3    if ||x'_k - x_k||^2 < ε_stop then
4      return x_k
5    x''_k <- QUARTIC-LINE-INTERSECTION(x'_k, x_p, f)
6  until ||x''_k - x_p||^2 < ||x_k - x_p||^2
7  return x''_k

SUCCESSIVE-CIRCULAR-APPROXIMATION(x_0, x_p, f)
1   k <- -1
2   repeat
3     k <- k + 1
4     g <- grad f at x_k                                         (denote g = (g_x, g_y))
5     r <- -(g_x^2 + g_y^2)^(3/2) / (f_xx g_y^2 + f_yy g_x^2 - 2 f_xy g_x g_y) at x_k
6     x_r <- x_k + r g / ||g||
7     x'_k <- QUARTIC-LINE-INTERSECTION(x_r, x_p, f)
8     if x'_k = NO-SOLUTION then
9       if r^2 - ||x_p - x_r||^2 > ε_stop then                   (Type 2a case)
10        x_i <- Solve_x {||x - x_r||^2 - r^2 = 0, Line(x_r <-> x_p)}, i = 1, 2
11        x'_k <- argmin_{x_i} ||x_i - x_p||^2
12      else                                                     (Type 2b case)
13        x'_k <- Solve_x {g . (x - x_k) = 0, (y - y_p) g_x - (x - x_p) g_y = 0}
14      x_{k+1} <- LINE-HALVING(x'_k, x_k, x_p, f)
15    else if ||x'_k - x_p||^2 > ||x_k - x_p||^2 - ε_stop then   (Type 1 case)
16      x_{k+1} <- LINE-HALVING(x'_k, x_k, x_p, f)
17    else                                                       (usual case)
18      x_{k+1} <- x'_k
19  until ||x_k - x_p||^2 - ||x_{k+1} - x_p||^2 < ε_stop
20  return x_{k+1}
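The two geometric primitives of the algorithm, the osculating circle of (13) and the line-quartic intersection, can be prototyped as in the following Python sketch (an illustration, not the implementation used in the paper). Here np.roots stands in for the modified Bernoulli solver, the restriction of $f$ to the line is recovered by interpolation at five points (exact because $f$ restricted to a line is a quartic in the line parameter), and f, grad, and hess are assumed user-supplied callables. For the usual SCA step, p0 is $\mathbf{x}_r$ and direction is $\mathbf{x}_p - \mathbf{x}_r$.

import numpy as np

def osculating_center(grad, hess, xk):
    # Radius of curvature (13) and center x_r of the osculating circle at xk.
    fx, fy = grad(xk)
    (fxx, fxy), (_, fyy) = hess(xk)
    r = -(fx**2 + fy**2)**1.5 / (fxx * fy**2 + fyy * fx**2 - 2.0 * fxy * fx * fy)
    g = np.array([fx, fy])
    return r, xk + r * g / np.linalg.norm(g)

def quartic_line_intersection(f, p0, direction, xp):
    # Restrict f to the line x = p0 + t*direction (a quartic in t, recovered
    # here by interpolation at five samples), find its real roots, and return
    # the intersection closest to xp, or None if no real intersection exists.
    t_samples = np.arange(-2.0, 3.0)                  # five sample parameters
    vals = [f(p0 + t * direction) for t in t_samples]
    coeffs = np.polyfit(t_samples, vals, 4)           # exact for a quartic
    roots = np.roots(coeffs)
    real_t = roots[np.abs(roots.imag) < 1e-10].real
    if real_t.size == 0:
        return None
    candidates = p0 + np.outer(real_t, direction)
    dists = np.linalg.norm(candidates - xp, axis=1)
    return candidates[np.argmin(dists)]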
4.3.1 Properties of the SCA Algorithm
The SCA algorithm has the following properties: First, because the distance of the current estimate of the closest point on the curve never increases (it is strictly decreasing or the algorithm terminates), the algorithm is a stable optimization procedure. Second, at every iteration, the current point $\mathbf{x}_k$ is a feasible point (i.e., a point on $f = 0$). Third, as is shown below, it converges to a locally closest point under reasonable assumptions.

Let the initial guess $\mathbf{x}_0$ be contained in a neighborhood $N_b(\mathbf{x}_p)$ of points on $f = 0$ within a radius $b$ of the point $\mathbf{x}_p$, i.e., $\|\mathbf{x}_0 - \mathbf{x}_p\| \leq b$. Because the distance of $\mathbf{x}_k$ from $\mathbf{x}_p$ is nonincreasing, the sequence of points $\mathbf{x}_k$ satisfies $\|\mathbf{x}_k - \mathbf{x}_p\| \leq b$, i.e., $\mathbf{x}_k \in N_b(\mathbf{x}_p)$. Because $N_b(\mathbf{x}_p)$ is a compact set, when $\|\mathbf{x}_{k+1} - \mathbf{x}_p\| < \|\mathbf{x}_k - \mathbf{x}_p\|$ for all $k$, then $\mathbf{x}_k$ has a convergent subsequence with limit $\hat{\mathbf{x}}$. If the point $\hat{\mathbf{x}}$ is not a locally closest point, we have a contradiction because, as we will show, the algorithm is guaranteed not to stop at or return to $\hat{\mathbf{x}}$ or any point in its neighborhood, $N(\hat{\mathbf{x}}, \epsilon_{\rm stop})$, under reasonable assumptions, as follows. By Steps 14 and 16 of the algorithm, line halving is always employed when the local osculating circle fails to find a closer point. Therefore, to prove that the algorithm will not stop in $N(\hat{\mathbf{x}})$, it is sufficient to show that line halving is guaranteed to find a point outside $N(\hat{\mathbf{x}})$ that is closer to $\mathbf{x}_p$ than $\hat{\mathbf{x}}$.
First, we assume that the osculating circle does not deviate too much from the curve in the neighborhood of $\hat{\mathbf{x}}$. Let $\theta$ be the angle between the vectors $\mathbf{x}_r - \hat{\mathbf{x}}$ and $\mathbf{x}_p - \hat{\mathbf{x}}$, and let $s$ be arc length from $\hat{\mathbf{x}}$ along the osculating circle or tangent at $\hat{\mathbf{x}}$ in the direction of the tangent approaching $\mathbf{x}_p$ (so, then, $0 \leq \sin\theta \leq 1$). Let $\mathbf{x}(s)$ be a point on the osculating circle at arc length $s$, $\mathbf{x}'(s)$ be arc length $s$ along the tangent, and the curvature (unsigned) of the osculating circle be $\kappa = 1/r$. Next, we square both sides of the inequality $\|\mathbf{x}(s) - \mathbf{x}_p\| \leq \|\mathbf{x}'(s) - \mathbf{x}(s)\| + \|\mathbf{x}'(s) - \mathbf{x}_p\|$ and, using the law of cosines, express $\|\mathbf{x}'(s) - \mathbf{x}_p\|$ in terms of $\|\hat{\mathbf{x}} - \mathbf{x}_p\|$, $s$, and $\theta$. Then, substituting $\|\mathbf{x}'(s) - \mathbf{x}(s)\| = \kappa s^2/2 + O(s^3)$ (dropping higher order terms) for the distance of the osculating circle from the tangent at $\hat{\mathbf{x}}$ and noting that

$$\sqrt{\|\hat{\mathbf{x}} - \mathbf{x}_p\|^2 + s^2 - 2s\|\hat{\mathbf{x}} - \mathbf{x}_p\|\sin\theta} \;\leq\; \sqrt{\|\hat{\mathbf{x}} - \mathbf{x}_p\|^2 + s^2} \;<\; \|\hat{\mathbf{x}} - \mathbf{x}_p\| + \frac{s^2}{2\|\hat{\mathbf{x}} - \mathbf{x}_p\|},$$

we obtain that $s$ must be less than the only positive nonzero root of

$$g(s) = s^4\left(\frac{\kappa^2}{4} + \frac{\kappa}{2d}\right) + s^2(1 + \kappa d) - 2sd\sin\theta,$$

where $d = \|\hat{\mathbf{x}} - \mathbf{x}_p\|$. The algebraic expression for this root is too complicated to be shown here, so, instead, we will now derive a smaller upper bound upon $s$. Let $g_0(s) = g(s)/s$. Because $g_0(s)$ has positive first and second derivatives ($s > 0$), it is strictly increasing, concave upwards, and negative at $s = 0$. Consequently, a line joining $(0, g_0(0))$ and $(d, g_0(d))$ will intersect the horizontal axis at a value $s_0$ that is less than the root of $g(s)$, as long as $g_0(s) \leq 0$ at $s = d$ (i.e., $\kappa \leq (-3 + \sqrt{5 + 8\sin\theta})/d$). Solving for the intersection, we obtain that it is sufficient for

$$s \leq s_0 \triangleq \frac{8\|\hat{\mathbf{x}} - \mathbf{x}_p\|\sin\theta}{4 + 6\kappa\|\hat{\mathbf{x}} - \mathbf{x}_p\| + \kappa^2\|\hat{\mathbf{x}} - \mathbf{x}_p\|^2}$$

for $\|\mathbf{x}(s) - \mathbf{x}_p\| < \|\hat{\mathbf{x}} - \mathbf{x}_p\|$, i.e., for $\mathbf{x}(s)$, $s \leq s_0$, to be closer to $\mathbf{x}_p$ than $\hat{\mathbf{x}}$. (For the sake of the proof, we will assume that all line halving occurs through points along the tangent (Fig. 2d). This is not true for regions of $f = 0$ with large curvature or large changes in curvature. However, the algorithm could be modified to treat the Type 1 and Type 2a cases the same as Type 2b, for the sake of the proof, at the cost of a small increase in computational complexity. The proof could be modified to accommodate these two other cases by considering the projection of the line used for halving onto the tangent.) Assuming $s$ is small, the angle between $\mathbf{x}'(s) - \mathbf{x}_p$ and the tangent at $\hat{\mathbf{x}}$ is approximately $\pi/2 - \theta$. Then, by solving for the intersection of the line joining $\mathbf{x}'(s)$ and $\mathbf{x}_p$ and the osculating circle, line halving will strike the curve (under our first assumption) if $s \leq s_1 \triangleq (1 - \sin\theta)/(\kappa\cos\theta)$. If line halving is initiated through any point $\mathbf{x}'(s)$, $s > s_0$, then it is guaranteed to either find a closer point or pass through a point in the interval $[\mathbf{x}'(s_0/2), \mathbf{x}'(s_0)]$. Consequently, if $s_0 \leq s_1$, line halving is guaranteed to find a closer point to $\mathbf{x}_p$ than $\hat{\mathbf{x}}$. (The distance by which $\mathbf{x}'(s_0/2)$ is closer must be larger than the stopping condition $\epsilon_{\rm stop}$.) Therefore, limit cycles are eliminated. Furthermore, because the algorithm only ever moves to a new estimate $\mathbf{x}_{k+1}$ if it is closer, it will never return to the neighborhood $N(\hat{\mathbf{x}})$ that does not contain a locally closest point. Therefore, we have a contradiction and $\hat{\mathbf{x}}$ is not a limit point of a convergent subsequence.

In contrast, if the point $\hat{\mathbf{x}}$ is locally a closest point (defined as a point on $f = 0$ at which the gradient of $f$ passes through $\mathbf{x}_p$), then it is a stationary point of the algorithm, i.e., $\hat{\mathbf{x}} = \mathrm{SCA}(\hat{\mathbf{x}}, \mathbf{x}_p, f)$, because, by definition, line halving would fail to find a closer point in the neighborhood of $\hat{\mathbf{x}}$. (It is possible that it could find a neighborhood further away that is closer, but then the above arguments are repeated. This could not occur ad infinitum because the distance from $f = 0$ to $\mathbf{x}_p$ is bounded below and the stopping condition $\epsilon_{\rm stop}$ is tested at each step to ensure significant improvement occurs.) Therefore, the limit points of any convergent subsequences must be locally closest points and these points will also be limit points of the entire sequence $\mathbf{x}_k$.
4.4 Initial Guess for Closest Point
Newton's method and the SCA algorithm both require an initial guess of the closest point on the curve $C_\theta$ for a given point $\mathbf{x}_p \in S$. Furthermore, this initial guess must be on the curve $C_\theta$. This was achieved in the following way. First, the closest point on the curve to $\mathbf{x}_p$ the last time one was computed is likely to be close to the current one. It will not be the same, however, because the curve will be slightly different due to
perturbations in the parameter $\theta$ from one computation of the orthogonal distance (4) to the next. To ensure that the initial guess thus obtained is on the curve, a gradient descent is performed from that point on the surface $f^2(\mathbf{x}; \theta)$ to a minimum, which will mostly be on the zero contour of $f(\mathbf{x}; \theta)$. If it is not, which can occur due to $f^2(\mathbf{x}; \theta)$ having a local minimum, the alternatives are to slightly perturb the point with additive noise and try descending $f^2(\mathbf{x}; \theta)$ again or, better still, to use the closest point for a neighboring point in $S$ to $\mathbf{x}_p$ determined during the current computation of (4), if such exists. This is an entirely feasible strategy for the application at which this work is aimed because the thinned traces of ionograms have a natural ordering of their pixels and, so, have well-defined neighbors. So, as long as the point $\mathbf{x}_p$ currently being considered is not the first one in the trace, there will always be a pixel to the "left" that is very close and for which we do know the closest point on the current curve $C_\theta$.
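One simple way to realize the descent of $f^2(\mathbf{x}; \theta)$ is the normal-flow step $\mathbf{x} \leftarrow \mathbf{x} - f(\mathbf{x})\nabla f/\|\nabla f\|^2$, sketched below in Python. The step choice is an assumption for illustration (the text above only specifies a gradient descent on $f^2$), and a None return signals the local-minimum case in which the caller perturbs the point or reuses a neighbor's closest point.

import numpy as np

def project_to_curve(f, grad, x0, tol=1e-10, max_iter=200):
    # Descend the surface f(x)**2 from x0 toward the zero contour of f.
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        fx = f(x)
        if fx * fx < tol:
            return x                     # on (or very near) the curve
        g = grad(x)
        gg = float(g @ g)
        if gg == 0.0:                    # stuck at a local minimum of f**2
            return None
        x = x - (fx / gg) * g            # step along -grad(f**2) scaled to hit f = 0 to first order
    return x if f(x)**2 < tol else None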
5 IMPLEMENTATION
Scaling the coordinate system is important for numerical stability when computing with polynomials. As a consequence, it is necessary to examine the behavior of the model (1), (2) under the change of size and position

$$\mathbf{x}' = \mu(\mathbf{x} - \mathbf{u}), \qquad (14)$$

where $\mathbf{u} = (u, v)$ denotes the shift in origin and $\mu$ the magnification parameter. Under the coordinate change (14), the curve of best fit transforms from $C_\theta$ to $C_{\theta'}$, where $\theta' = \mathbf{A}\theta$. The transformation matrix $\mathbf{A}$ is given by substituting for $x$ and $y$ in (2) using (14), expanding powers, and then finding the linear equations which determine the coefficients $\theta'$ by collecting the terms of $x'$ and $y'$. Matrix $\mathbf{A}$ is lower triangular and, for the bivariate quartic model (3), is $15 \times 15$. Furthermore, the metric defining the best fit transforms as $d(C_{\theta'}, S') = \mu^2\, d(C_\theta, S)$. Thus, we see that the best fitting curve $C_\theta$ (1), (2) remains essentially unchanged under (14) and in the fitting metric $d(C_\theta, S)$ (4). Furthermore, the transformation is linear.

Now, the robustness of fit of the curve to the trace is greatly improved by performing the fitting in a normalized local coordinate system. Given an individual segment, we normalize its coordinate system for robustness' sake in the following fashion: Suppose that the set $S$ of data points lies in the minimum enclosing rectangle $[x_{\min}, x_{\max}] \times [y_{\min}, y_{\max}]$. Then, we rescale the data and curve fitting problem (preserving the aspect ratio) by moving to a new coordinate system such that $[x_{\min}, x_{\max}] \times [y_{\min}, y_{\max}] \rightarrow [-z, z] \times [0, 1]$, where

$$z = \frac{x_{\max} - x_{\min}}{2(y_{\max} - y_{\min})}. \qquad (15)$$

(If $y_{\max} - y_{\min} < 10$, then we set $y_{\max} - y_{\min} = 10$ in (15) so that short horizontal traces are numerically stable.) The reason for this normalization is to ensure that the constant $a_{00}$ in the best fitting polynomial over this region will almost certainly be comparable in size to the coefficients of higher order powers. This, in turn, allows us to replace the "correct" normalization $\sum_{ij} a_{ij}^2 = 1$ by the ad hoc, but easier to work with, normalization $a_{00} = 1$. The choice $a_{00} = 1$ is the most convenient to work with, but it may result in numerical instability if $a_{00}$ is small compared to the remaining parameters. The above choice of coordinates, in which the curve is deliberately offset from the origin along the x-axis, should avoid this problem.
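A minimal Python sketch of the rescaling (15) follows; whether the 10-pixel guard also rescales the $y$ mapping, as done here, is an implementation choice not specified above.

import numpy as np

def normalise_trace(points, min_height=10.0):
    # Rescale to the local coordinate system of (15):
    # [xmin, xmax] x [ymin, ymax] -> [-z, z] x [0, 1], preserving aspect ratio.
    points = np.asarray(points, dtype=float)
    xmin, ymin = points.min(axis=0)
    xmax, ymax = points.max(axis=0)
    height = max(ymax - ymin, min_height)        # guard for short horizontal traces
    z = (xmax - xmin) / (2.0 * height)
    scaled = np.empty_like(points)
    scaled[:, 0] = (points[:, 0] - xmin) / height - z    # [xmin, xmax] -> [-z, z]
    scaled[:, 1] = (points[:, 1] - ymin) / height        # [ymin, ymax] -> [0, 1]
    return scaled, z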
The ODR problem can be solved efficiently using standard algorithms if we can provide the nonlinear least squares algorithm with an algorithmic expression for the Jacobian of the functions $g_n(\theta) = \|\delta_n\|$ with respect to the parameters $\theta$. Using the approach suggested by [29], the Jacobian can be determined as follows. (For convenience, we will occasionally drop the double subscript on $a$ and $\varphi$; a single index $k$ or $l$ indicates the lexicographic ordering of $ij$, $j = 0, \ldots, p$, $i = 0, \ldots, m - j$.) First, it follows from $\delta_n = \tilde{\mathbf{x}}_n - \mathbf{x}_n$ that

$$\frac{\partial \|\delta_n\|}{\partial a_l} = \frac{\delta_n}{\|\delta_n\|} \cdot \frac{\partial(\tilde{x}_n, \tilde{y}_n)}{\partial a_l}, \qquad (16)$$

where $\tilde{\mathbf{x}}_n = (\tilde{x}_n, \tilde{y}_n)$ is the closest point on $f(\mathbf{x}; \theta) = 0$ to $\mathbf{x}_n = (x_n, y_n)$. Second, we have

$$\sum_k a_k\, \varphi_k(\tilde{x}_n, \tilde{y}_n) = 0, \qquad \sum_k a_k\left[\frac{\partial\varphi_k}{\partial\tilde{x}_n}(\tilde{y}_n - y_n) - \frac{\partial\varphi_k}{\partial\tilde{y}_n}(\tilde{x}_n - x_n)\right] = 0, \qquad (17)$$

because $f(\tilde{\mathbf{x}}_n; \theta) = 0$ and $\nabla f$ at $(\tilde{x}_n, \tilde{y}_n)$ is parallel to $\delta_n$. Differentiating (17) with respect to $a_l$ gives

$$\nabla f(\tilde{x}_n, \tilde{y}_n; \theta) \cdot \frac{\partial(\tilde{x}_n, \tilde{y}_n, 0)}{\partial a_l} = -\varphi_l(\tilde{x}_n, \tilde{y}_n), \qquad \nabla f(\tilde{x}_n, \tilde{y}_n; \theta) \times \frac{\partial(\tilde{x}_n, \tilde{y}_n, 0)}{\partial a_l} = -\nabla\varphi_l \times \delta'_n, \qquad (18)$$

where $\delta'_n = (\tilde{x}_n - x_n, \tilde{y}_n - y_n, 0)$ and the gradient is similarly augmented with a zero third component. Solving the two independent equations (18) for the two nonzero components of $\partial(\tilde{x}_n, \tilde{y}_n, 0)/\partial a_l$ gives that

$$\frac{\partial(\tilde{x}_n, \tilde{y}_n, 0)}{\partial a_l} = -\frac{\varphi_l(\tilde{x}_n, \tilde{y}_n)}{\|\nabla f\|^2}\,\nabla f - \frac{(\nabla\varphi_l \times \delta'_n) \times \nabla f}{\|\nabla f\|^2}. \qquad (19)$$

Substituting (19) into (16) gives the Jacobian

$$\frac{\partial \|\delta_n\|}{\partial a_l} = -\frac{\|\delta_n\|}{\delta_n \cdot \nabla f}\left(\varphi_l(\tilde{x}_n, \tilde{y}_n) - \delta_n \cdot \nabla\varphi_l\right) - \frac{\|\delta_n\|}{\|\nabla f\|^2}\left(\nabla\varphi_l \cdot \nabla f\right). \qquad (20)$$

A conic fit [36] to $S$ was used as an initial guess for the nonlinear least squares routine.
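For illustration, one entry of the Jacobian can be assembled as in the following Python sketch, given the closest point, the displacement $\delta_n$, and the gradients of $f$ and of the monomials at the closest point (all assumed precomputed); it follows the form of (20) as reconstructed above and is a sketch rather than the implementation used for the results below.

import numpy as np

def jacobian_entry(l, delta, grad_f, monomials, monomial_grads):
    # d||delta_n|| / d a_l per (20).  monomials[l] = phi_l(x~, y~) and
    # monomial_grads[l] = grad phi_l(x~, y~) at the closest point.
    delta = np.asarray(delta, dtype=float)
    grad_f = np.asarray(grad_f, dtype=float)
    d_norm = np.linalg.norm(delta)
    term1 = -(d_norm / float(delta @ grad_f)) * (monomials[l] - float(delta @ monomial_grads[l]))
    term2 = -(d_norm / float(grad_f @ grad_f)) * float(monomial_grads[l] @ grad_f)
    return term1 + term2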
6 SIMULATIONS AND DISCUSSION
Fig. 3 shows the result of fitting the bivariate quartic model to a set of ionogram traces. Each of the 11 traces was fitted separately using the curve $C_\theta$ given by (3). The resulting fits are plotted in the window defined by the domain and range of data points of each trace. The fits are free of the effects of cusps, except for the fit to trace 7. These results are a dramatic improvement over those reported in [21] using ODRpack.

The question arises as to how well each of the three methods described in detail in Section 4 performs in computing the closest point on the curve. This was assessed by performing an exact ODR fit, using the Levenberg-Marquardt nonlinear least squares algorithm to fit (3) to the traces in Fig. 3, with different arrangements of the closest point determining algorithms. First, Newton's algorithm in (8) was used as the primary method, using the initial guesses to each closest point on the curve determined by the methods described in Section 4.4. This method failed to converge to the closest point (within the limit of 40 iterations) in 18 of the 3,010 invocations of the algorithm during the fitting to the traces in Fig. 3. This may not sound like many, but a single failure means that this method is not satisfactory on its own. Each failure caused the initiation of an SCA procedure to compute the closest point so that fitting could continue. The mean number of iterations required by Newton's algorithm over the 3,010 invocations was 3.10.
Fig. 3. An example of a thinned and trimmed oblique ionogram with a plot of the fits to its traces to the same scale below. Each individual trace is indicated by a label. The fit to trace 7 has a cusp, but all other traces are free of this problem.
These results are indicative of those obtained over many fits to ionogram traces. It is possible that, in some cases, the algorithm converged to a point that was not the closest point, but no examples of this were found and no anomalies were observed in the value of the distance measure (4) during fitting when the cases of convergence failure were handled by the SCA algorithm.

For comparison, the results of the same tests as above are reported here, but this time using solely the SCA algorithm to compute the closest points on the curve. The initial guess for each of the closest points was the same as before, using the methods outlined in Section 4.4. The SCA algorithm was invoked 1,741 times during the fitting of all the individual traces in Fig. 3 to compute closest points. On average, 1.28 osculating circles were determined for each invocation. The SCA algorithm converged to a solution in all of the cases tested. The line halving procedure was called on average only 1.00 times for each closest point determined. Only in one case in the 1,741 invocations of the SCA algorithm was it necessary to use three line halving procedures in computing a single closest point, and this is the maximum measured (in one case, it was called twice). On average, over 1,744 calls to line halving, only 1.20 halvings or intersections had to be computed for line halving to successfully exit. At most, 16 halvings were required and this occurred in four cases. Finally, for each closest point computed by the SCA algorithm, including any invocations of line halving, it was necessary to compute the intersection points of a line and quartic an average of 2.49 times in 1,741 cases. At most, 17 intersections were required and this occurred in four cases. Again, these results are indicative of those obtained over many fits to ionogram traces.

The costs associated with the resultant procedure for finding the closest point on the curve outlined in Section 4.2 are constant because it is not iterative. Having now established representative figures for the average required number of iterations for each of the three techniques of Section 4, let us now examine the computational complexity involved in the iterations of each of them. We will only consider the most costly operations in each. The computationally most complex step required in each iteration of Newton's method is the solution of the linear equation in (8). The solution was obtained by LU factorization with partial pivoting of the matrix on the left hand side of (8). According to [8], this involves $2n^3/3$ flops, where, in this case, $n = 3$. This is multiplied by 2.5 for the average number of iterations to give a total of 45 flops for the rate determining step.
The SCA algorithm's most costly step is determining the roots of the quartic equation in one variable after substitution of the equation for a line into $f(\mathbf{x}; \theta) = 0$ from (3). This is efficiently solved [6] using the modified Bernoulli method [35], where the problem is expressed as an eigenvalue problem. From [8], eigenvalues computed by the QR algorithm require $10n^3$ flops based on empirical observation, where, in this case, $n = 4$. (Essentially, $n = 4$ comes from the four coefficients of $x^3, x^2, x, 1$ of the quartic, normalized by the coefficient of $x^4$, and these normalized coefficients form one row of a lower Hessenberg matrix.) This cost is multiplied by the average of 2.5 iterations to give 1,600 flops for the rate determining step in total.

The resultant method of Section 4.2 requires generalized eigenvalues and eigenvectors computed once per invocation. From [8], generalized eigenvalues and eigenvectors computed using the QZ algorithm require $66n^3$ flops based on empirical observation. Solving (7), with $f = 0$ given by (3), using resultants means that $\mathbf{C}_1$ and $\mathbf{C}_2$ in (12) are $28 \times 28$ matrices (from $r = 7$ and $l = 3$), so the computational complexity is $66n^3$ flops with $n = 28$, for a total of 1,448,832 flops for the rate determining step.

The resultant method has the advantage of not requiring an initial guess but, when one is available, it is clear from the computational complexity calculations above that Newton's method and the SCA algorithm are much faster. Including the cost of computing an initial guess by the methods of Section 4.4 will not change this assessment. Newton's method is the fastest of the three, but it has the disadvantage of failing to find the closest point in some cases. A sensible strategy to cope with this is to initiate the SCA algorithm when Newton's algorithm has failed, and this was the final strategy settled upon for the driving ionogram application of this work.
7 CONCLUSIONS
The experience reported in this paper supports the arguments in [29] for using exact orthogonal distance regression to compute the best fitting algebraic curve to data points rather than approximate methods because the result will be less prone to the problem of cusps. This exact approach, however, raises the difficult problem of determining the closest point on the curve for an arbitrary point in the plane. This paper considers three methods for achieving this in detail. The first is Newton's method used by others, the second is based on resultants which are recently making a resurgence in computer graphics, and the third is a novel technique based on successive circular approximations to the curve. It is shown that
Newton's method is the quickest, but can fail sometimes even with a good initial guess. The successive circular approximation algorithm is not as fast, but is robust. The resultant method is the slowest of the three, but does not use an initial guess. Therefore, the recommended approach for fitting implicit polynomials by ODR is to view it as a nonlinear least squares problem and to exactly solve the subproblem of determining the closest point on a given curve to each data point by Newton's method, backed up by SCA or resultants when Newton's method fails to converge.
ACKNOWLEDGMENTS
The author would like to thank Bob Whatmough for his help with the Jacobian, David Kettler for his contribution to simulation of the fitting methods, Maarten Gulliksson and Inge Soderkvist for kindly providing the Matlab code of their algorithm, and Garry Newsam, David Crisp, and Lang White for their helpful discussions.
REFERENCES
[1] P.T. Boggs, R.H. Byrd, J.E. Rogers, and R.B. Schnabel, "User's Reference Guide for ODRPACK Version 2.01: Software for Weighted Orthogonal Distance Regression," Applied and Computational Math. Division, Nat'l Inst. of Standards and Technology, U.S. Dept. of Commerce, NISTIR 92-4834, 1992.
[2] E.-W. Chionh, "Base Points, Resultants, and the Implicit Representation of Rational Surfaces," PhD thesis, Univ. of Waterloo, Ontario, Canada, 1990.
[3] D. Cox, J. Little, and D. O'Shea, Ideals, Varieties and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra, second ed., Springer-Verlag, 1997.
[4] Y. De Montaudouin and W. Tiller, "The Cayley Method in Computer Aided Geometric Design," Computer Aided Geometric Design, vol. 1, pp. 309-326, 1984.
[5] J.E. Dennis Jr. and R.B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Prentice Hall, 1983.
[6] S. Goedecker, "Remark on Algorithms to Find Roots of Polynomials," SIAM J. Scientific Computing, vol. 15, no. 5, pp. 1059-1063, 1994.
[7] I. Gohberg, P. Lancaster, and L. Rodman, Matrix Polynomials. Academic Press, 1982.
[8] G.H. Golub and C.F. Van Loan, Matrix Computations, second ed., Johns Hopkins Univ. Press, 1989.
[9] M. Gulliksson and I. Söderkvist, "Surface Fitting and Parameter Estimation with Nonlinear Least Squares," Optimization Methods and Software, vol. 5, pp. 247-269, 1995.
[10] B.K.P. Horn, "Relative Orientation Revisited," J. Optical Soc. Am., vol. 8, no. 10, pp. 1630-1638, 1991.
[11] K. Kanatani, "Statistical Bias of Conic Fitting and Renormalization," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 16, no. 3, pp. 320-326, Mar. 1994.
[12] K. Kanatani, Statistical Optimization for Geometric Computation: Theory and Practice. Elsevier Science, 1996.
[13] D.I. Kettler and N.J. Redding, "A Trimming Algorithm to Clean Thinned Features for Feature Extraction in Image Understanding," Proc. Fourth Australian and New Zealand Conf. Intelligent Information Systems, pp. 304-307, 1996.
[14] D. Manocha, "Algebraic and Numeric Techniques for Modelling and Robotics," doctoral dissertation, Univ. of California, Berkeley, 1992.
[15] D. Manocha, "Solving Systems of Polynomial Equations," IEEE Computer Graphics and Applications, pp. 46-55, Mar. 1994.
[16] D. Manocha and S. Krishnan, "Solving Algebraic Systems Using Matrix Computations," ACM Sigsam Bulletin, vol. 30, no. 4, pp. 4-21, 1996.
[17] F. Mokhtarian and A.K. Mackworth, "A Theory of Multiscale, Curvature-Based Shape Representation for Planar Curves," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 14, no. 8, pp. 789-805, Aug. 1992.
[18] A.P. Morgan, "Polynomial Continuation and Its Relationship to the Symbolic Reduction of Polynomial Systems," Symbolic and Numerical Computation for Artificial Intelligence, B. Donald et al., eds., pp. 23-45, Academic Press, 1992.
[19] G.N. Newsam and N.J. Redding, "Fitting the Most Probable Curve to Noisy Observations," Proc. Int'l Conf. Image Processing, vol. 2, pp. 752-755, 1997.
[20] W.H. Press, B.P. Flannery, S.A. Teukolsky, and W.T. Vetterling, Numerical Recipes in C: The Art of Scientific Computing, second ed., Cambridge Univ. Press, 1992.
[21] N. Redding, "The Autoscaling of Oblique Ionograms," DSTO Electronics and Surveillance Research Laboratory, Salisbury, South Australia, Australia, Research Report DSTO-RR-0074, 1996. www.dsto.defence.gov.au/corporate/reports/DSTO-RR-0074.pdf.
[22] N.J. Redding, "Image Understanding of Oblique Ionograms: The Autoscaling Problem," Proc. Fourth Australian and New Zealand Conf. Intelligent Information Systems, pp. 155-160, Nov. 1996.
[23] N.J. Redding, "Fitting Implicit Polynomials to Use as Features in Image Understanding," Proc. Fourth Australian and New Zealand Conf. Intelligent Information Systems, pp. 161-164, Nov. 1996.
[24] N.J. Redding and R. Whatmough, "Fitting Implicit Quartics for Use in Feature Extraction," Proc. Int'l Conf. Image Processing, vol. 2, pp. 410-413, Oct. 1997.
[25] G. Salmon, Modern Higher Algebra. Dublin: Hodges, Smith & Co., 1866.
[26] K. Schittkowski, "NLPQL: A FORTRAN Subroutine Solving Constrained Nonlinear Programming Problems," Annals of Operations Research, vol. 5, pp. 485-500, 1986.
[27] T.W. Sederberg, D.C. Anderson, and R.N. Goldman, "Implicit Representation of Parametric Curves and Surfaces," Computer Vision, Graphics, and Image Processing, vol. 28, pp. 72-84, 1984.
[28] T.W. Sederberg, "Algorithm for Algebraic Curve Intersection," Computer Aided Design, vol. 21, no. 9, pp. 547-554, 1989.
[29] S. Sullivan, L. Sandford, and J. Ponce, "Using Geometric Distance Fits for 3D Object Modeling and Recognition," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 16, no. 2, pp. 1183-1196, Feb. 1994.
[30] G. Taubin, "Estimation of Planar Curves, Surfaces, and Nonplanar Space Curves Defined by Implicit Equations with Applications to Edge and Range Image Segmentation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 13, no. 11, pp. 1115-1138, Nov. 1991.
[31] G. Taubin, F. Cukierman, S. Sullivan, J. Ponce, and D.J. Kriegman, "Parameterized Families of Polynomials for Bounded Algebraic Curve and Surface Fitting," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 16, no. 3, pp. 287-303, Mar. 1994.
[32] Visual Numerics, Inc., IMSL C/Math/Library: C Functions for Mathematical Applications. Visual Numerics, Inc., 1995.
[33] A. Wallack, I.Z. Emiris, and D. Manocha, "MARS: A Maple/Matlab/C Resultant-Based Solver," Proc. Int'l Symp. Symbolic and Algebraic Computation, pp. 244-251, 1998.
[34] L.T. Watson, S.C. Billups, and A.P. Morgan, "Algorithm 652: HOMPACK: A Suite of Codes for Globally Convergent Homotopy Algorithms," ACM Trans. Math. Software, vol. 13, pp. 281-310, 1987.
[35] D.M. Young and R.T. Gregory, A Survey of Numerical Mathematics, vol. 1. Dover Publications, 1973.
[36] Z. Zhang, "Parameter Estimation Techniques: A Tutorial with Application to Conic Fitting," Image and Vision Computing, vol. 15, pp. 59-97, 1997.