3L Fitting of Higher Degree Implicit Polynomials

Z. Lei, M. M. Blane and D. B. Cooper
Division of Engineering, Brown University, Providence, RI 02912
zbl, mmb, [email protected]

In Proceedings of the 3rd IEEE Workshop on Applications of Computer Vision, Sarasota, FL, pp. 148-153, December 1996.
Abstract
Implicit polynomial 2D curves and 3D surfaces are potentially among the most useful object or data representations for use in computer vision and image analysis. That is because of their interpolation property, Euclidean and affine invariants, and Bayesian recognizers. This paper studies and compares various fitting algorithms in a unified framework of stability analysis. It presents a new robust 3L fitting method that is repeatable, numerically stable and computationally fast, and that can be used with high degree implicit polynomials to capture complex object structure. With this, we lay down a foundation that enables a technology based on implicit polynomial curves and surfaces for applications to indexing into pictorial databases, robot vision, CAD for free-form shapes, etc.
1 Introduction
This paper lays a foundation for the fitting of 2D algebraic curves and 3D algebraic surfaces to data, which enables a new developing technology for a variety of purposes including object recognition [1, 3, 4, 7, 10, 11, 13, 14, 15], computer graphics, morphing, and CAD [5, 8, 12]. We briefly mention two applications of interest.

1. Given a 2D sketch of an object boundary or an image of an object, find an image in a pictorial database that contains the object but is seen from a different camera-object position and perhaps was taken at a much different time. An approach is to compute a vector of object feature measurements from the data presented and compare that with vectors of object feature measurements stored with each image in the pictorial database. Color and texture (widely used now) will often not be applicable, so shape will be crucial. For real-time indexing into a large database, a vector of measured features must be compared with a vector of features for an object in the database in $O(10^{-3})$ seconds or orders of magnitude less. Our approach [10, 11] is to fit one or more implicit polynomial curves of 4th or higher degree to object boundary data,
compute a number of algebraic self-invariants for each fitted curve [14] or mutual invariants [7] for pairs of fitted curves, and, using Bayesian methods, compare those measured in the data with those stored with each of $10^4$ to $10^7$ images in the pictorial database. We can meet the computational requirements with these algebraic invariant features!

2. Real-time 2D or 3D position invariant object recognition for robot or other applications, when data may be sparse and self or mutual occlusion is present (e.g., a laser range finder senses an object from one direction), requires fast object model fitting from sparse, partially occluded data, and then recognition. Implicit 3D polynomial technology is ideally suited [14]. Presently, there does not appear to be any other representation that is as well suited to these applications.

The new concept and algorithm for implicit polynomial fitting presented in this paper is needed to deal with two shortcomings of present methods for certain applications. 1. Current methods for fitting a 4th degree polynomial to on the order of two hundred data points use nonlinear minimization and require on the order of a second of computation time on a Sparc 10. Fitting polynomials of degree greater than 6 is probably impractical. Our new approach uses two to three orders of magnitude less time, and can practically fit up to 18th degree polynomials. It can do a good fitting job on sparse data. The increased speed permits model-based segmentation capability. 2. Current methods for fitting 4th degree polynomials usually return unusable fits if the data requires a higher degree polynomial for accurate representation. Our new approach provides very stable, repeatable, pleasing, usable approximations to this complex data.
2 Implicit Polynomial Fitting Methods
2.1 The Fitting Problem
What do we mean by "fitting an implicit polynomial to a collection of points $Z = \{z_i = (x_i, y_i)\}_{i=1}^{N}$"? Our purpose here is to minimize some kind of error metric
$$\mathrm{Err}(f, Z) = \sum_{i=1}^{N} \mathrm{err}^2(f, z_i).$$
In the general least squares sense, $\mathrm{err}(f, z_i)$ is just $f(z_i) = f(x_i, y_i)$ for an implicit polynomial $f(x, y)$, and so $\mathrm{Err}(f, Z) = \sum_{i=1}^{N} f^2(z_i)$. However, this will not be a good estimate of the real distance (or error) of a point $z_i = (x_i, y_i)$ to the zero set of the implicit polynomial $f(x, y)$. This is known as the 0th order distance approximation. A commonly used 1st order distance approximation is [13, 15]:
$$\mathrm{err}(f, z_i) = d(z_i, Z(f)) = \frac{|f(z_i)|}{\|\nabla f(z_i)\|} \qquad (1)$$
and iterative optimization techniques are used to find the optimal solution.

This is a nonlinear optimization problem. Many good algorithms have been presented for solving it to get the best fitting polynomial, among which are [6] and [15]. However, as with all global optimization problems, these algorithms have the computational complexity of NP-hard problems, and usually the solution cannot be guaranteed to be the global optimum. Various perturbation methods and stopping rules have been used to improve performance, but there is still no standard approach to the fitting problem, especially when the data set $Z$ is intrinsically complex or ill conditioned (too complicated to be fit by a single 4th degree polynomial). These iterative methods either fail to give a physically meaningful solution, or suffer from numerical instability. Furthermore, there is no guarantee that the final solution, which depends on the starting point and stopping rule, is anywhere close to the global optimal solution.

In [8, 9], a novel approach using Linear Programming was introduced. This technique efficiently yields a suboptimal solution without performing the iterative search for the global optimal one. It is especially efficient and robust if the data can actually be fit well by an implicit polynomial of a certain degree (so that feasible solutions exist). If the data is more complex and noisy, then the distance approximation has to be fine-tuned (for example, by using a higher-order distance approximation) or some global geometric constraints, like a boundedness condition, have to be added.

In this paper we present a modified least squares fitting method that is far more robust than the original least squares fitting algorithm, and obviates the need for any iterative searching to find the optimal solution. We have named this approach the 3L fitting algorithm because we use three different level sets in
the fitting procedure. We will show in the following sections why 3L fitting is better than least squares fitting and other nonlinear optimization fitting methods, and why only three level sets are sufficient. In applications such as pictorial database retrieval and interactive computer graphics, where both the accuracy and the speed of the fitting are an issue, 3L fitting can be used to quickly generate an initial approximation to complex data by a low-degree polynomial. Linear Programming or higher degree polynomials can then be used to improve the result, or the data can be broken into pieces which can be fit separately for the purpose of recognition [7].
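For concreteness, the short numpy sketch below (our own illustration, not code from the paper) evaluates the 0th order (algebraic) error and the 1st order distance approximation of Equation (1) for a bivariate polynomial; the coefficient layout a[i, j] for the monomial x^i y^j is an assumption made only for this example.

```python
# Sketch (not the authors' code) of the 0th and 1st order distance
# approximations for a bivariate polynomial f(x, y) = sum a[i, j] x^i y^j,
# i + j <= n.  The coefficient layout a[i, j] is assumed for illustration.
import numpy as np


def poly_eval(a, x, y):
    """Evaluate f(x, y); a is an (n+1) x (n+1) coefficient matrix."""
    n = a.shape[0] - 1
    return sum(a[i, j] * x**i * y**j
               for i in range(n + 1) for j in range(n + 1 - i))


def poly_grad(a, x, y):
    """Gradient (df/dx, df/dy) of the same polynomial."""
    n = a.shape[0] - 1
    fx = sum(i * a[i, j] * x**(i - 1) * y**j
             for i in range(1, n + 1) for j in range(n + 1 - i))
    fy = sum(j * a[i, j] * x**i * y**(j - 1)
             for i in range(n + 1) for j in range(1, n + 1 - i))
    return fx, fy


def err_0th(a, x, y):
    """0th order (algebraic) error |f(z_i)|."""
    return abs(poly_eval(a, x, y))


def err_1st(a, x, y, eps=1e-12):
    """1st order distance approximation |f(z_i)| / ||grad f(z_i)||, Eq. (1)."""
    fx, fy = poly_grad(a, x, y)
    return abs(poly_eval(a, x, y)) / max(np.hypot(fx, fy), eps)
```

The eps guard only avoids division by zero at singular points of f, where the 1st order approximation itself breaks down.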
2.2 Stability of the Fitting Method
In order for the fitted implicit polynomial to be a useful representation for recognition purposes, the fitting procedure has to be robust: a small change in the data should not cause a huge change in the coefficients of the fitted implicit polynomial. This is extremely important because these coefficients are used to compute the algebraic invariants which serve as the primary geometric descriptors of the shape in the Bayesian recognition system [7, 14]. We claim that the stability of the fitting is more important than the goodness of the fitting itself. Unfortunately, previous fitting methods all focus on the error metric of the goodness of the fitting, and do not directly address the stability and robustness issues of the fitting.

Suppose $\Gamma$ represents the boundary of an object $S$. Let us write
$$f(x, y) = \sum_{i, j \ge 0;\; i + j \le n} a_{ij} x^i y^j = \alpha^T \tilde{X}$$
where $\alpha$ is the coefficient vector and $\tilde{X}$ is the vector of monomials $x^i y^j$. The least squares fitting problem can be written as:
$$\min \int_{\Gamma} f^2(x(s), y(s))\, ds \qquad (2)$$
where $s$ is the arc length. The solution of (2) will actually define a potential field on the whole plane $P$, including the object $S$. But in (2), all the constraints are put on the boundary $\Gamma$ of $S$. There are no restrictions on the whole object shape $S$, i.e., the area enclosed by $\Gamma$. So the resulting potential field could be wildly behaved, especially when the object shape $S$ is complex. For example, the singular points of $f(x, y)$ (i.e., points where $\partial f / \partial x = \partial f / \partial y = 0$) could be anywhere in the plane $P$.
Figure 1: (a) A saddle point $p_0$ close to the object boundary; (b) no sign consistency around a saddle point.

Suppose a saddle point $p_0 = (x_0, y_0)$ is close to the object boundary $\Gamma$ and the implicit polynomial fit of (2) to the data set is $f(x, y) = (x - x_0)(y - y_0) + \epsilon$, for some small positive constant $\epsilon$ (Figure 1(a)). At the closest point $(x_1, y_1)$ on the data set,
$$\frac{\partial f}{\partial x}\Big|_{(x_1, y_1)} = y_1 - y_0, \qquad \frac{\partial f}{\partial y}\Big|_{(x_1, y_1)} = x_1 - x_0.$$
So the tangent direction at $(x_1, y_1)$ will be
$$k = -\frac{\partial f / \partial x}{\partial f / \partial y}\Big|_{(x_1, y_1)} = -\frac{y_1 - y_0}{x_1 - x_0},$$
$$dk = \frac{y_1 - y_0}{(x_1 - x_0)^2}\, dx_1 - \frac{1}{x_1 - x_0}\, dy_1.$$
Since $p_0$ is close to $\Gamma$, $y_1 - y_0$ and $x_1 - x_0$ are both very small. Observe that if $(x_0, y_0)$ does not change too much, then $\frac{dk}{dx_1} \simeq \frac{y_1 - y_0}{(x_1 - x_0)^2} \gg 1$. Hence, if there is a small amount of noise in the data set, then either $(x_0, y_0)$ will change a lot, or $dk$ will change a lot in the neighborhood of $(x_1, y_1)$. Either the singular points of the fitted polynomial potential field will be redistributed totally differently, or the topological structure of the zero set of $f(x, y)$ will change a lot around the area of the singular points. Both cases will cause the potential field defined by $f(x, y)$, and hence its coefficients, to change greatly, which makes the coefficient vector of the fitted polynomial an unstable descriptor of the shape $S$.
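To make the $\alpha^T \tilde{X}$ parameterization and the classical least squares fit of (2) concrete, here is a minimal discrete sketch (our illustration, with assumptions noted in the comments): the boundary integral is replaced by a sum over boundary points, and the trivial all-zero solution is avoided with the common normalization $\|\alpha\| = 1$, solved by the smallest right singular vector of the monomial matrix.

```python
# Discrete sketch (an illustration, not the paper's formulation) of the
# classical least squares fit: minimize sum_i f^2(z_i) over the boundary
# points subject to ||alpha|| = 1, a common normalization that avoids the
# trivial all-zero coefficient vector.
import numpy as np


def monomial_matrix(x, y, degree):
    """Rows are Xtilde(z_i): the monomials x^i y^j with i + j <= degree."""
    return np.column_stack([x**i * y**j
                            for i in range(degree + 1)
                            for j in range(degree + 1 - i)])


def classical_lsf(points, degree=4):
    """points: (N, 2) array of boundary points.  Returns the coefficient
    vector alpha belonging to the smallest singular value of A."""
    pts = np.asarray(points, dtype=float)
    A = monomial_matrix(pts[:, 0], pts[:, 1], degree)
    _, _, vt = np.linalg.svd(A, full_matrices=False)
    return vt[-1]
```

The monomial_matrix helper reappears, with the same assumed monomial ordering, in the later sketches.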
2.3 Potential Field Fitting
This property of the fitting, that small amounts of positional change cause a huge amount of tangential change, and hence a change in the shape of the zero set of the implicit polynomial, is undesirable. It can be avoided if we control not only the zero set of the polynomial being fitted, but the behavior of the polynomial away from its zero set as well. We have to put additional constraints on the fitting procedure to filter out the unstable candidates for the potential field (i.e., the polynomial).
Figure 2: (a) Sign consistency introduced by the object shape; (b) three level sets for a stable potential field.

In a nutshell, we want the singular points of $f(x, y)$ to be far away from the data set so that the noise in the data set will not affect the structure of the fitted potential field too much. To achieve this we will use not only the boundary information $\Gamma$ but also the whole object shape $S$ in the fitting. Notice that around singular points there is no sign consistency of the potential field (Figure 1(b)). We can enforce the inside/outside concepts introduced by the object shape on the fitted potential field and force it to behave in a manner similar to that of the object shape when walking along its boundary $\Gamma$, i.e., we want to preserve the sign consistency (Figure 2(a)). The distance transform [2] will generate an ideal stable potential field which preserves the sign consistency of the object shape. The fitting procedure is then to find the best polynomial potential field that closely resembles the potential field from the distance transform:
$$\min \int\!\!\int_{D} (f(x, y) - d(x, y))^2\, dx\, dy \qquad (3)$$
where $d(x, y)$ is the distance transform result and $D$ is some region containing the object shape $S$. Equation (3) gets rid of the singular point problem and is numerically stable, since the distance transform $d(x, y)$ tends to smooth out small variations and noise (Section 2.5). Unfortunately, we have put too great a constraint on the polynomial being fit to the data. We are fitting the whole polynomial potential field $f(x, y)$ to the distance transform potential field. We cannot hope to get a polynomial zero set that closely approximates the data this way.
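A discrete analogue of Equation (3) is easy to sketch (again our own illustration, not the authors' implementation): compute a signed distance transform of a binary mask of the object shape S and fit the polynomial to it over a grid region D by ordinary linear least squares. The mask-based signed distance and the coordinate normalization are assumptions of the sketch.

```python
# Discrete sketch of Eq. (3) (not the authors' implementation): fit a
# polynomial potential field to the signed distance transform of a binary
# object mask over a grid region D containing the shape S.
import numpy as np
from scipy.ndimage import distance_transform_edt


def monomial_matrix(x, y, degree):
    """Monomials x^i y^j with i + j <= degree, one row per point."""
    return np.column_stack([x**i * y**j
                            for i in range(degree + 1)
                            for j in range(degree + 1 - i)])


def fit_to_distance_field(mask, degree=4):
    """mask: boolean image, True inside the object shape S."""
    h, w = mask.shape
    scale = float(max(h, w))
    # Signed distance: positive inside S, negative outside, in box units.
    d = (distance_transform_edt(mask) - distance_transform_edt(~mask)) / scale
    ys, xs = np.mgrid[0:h, 0:w]
    x = (xs.ravel() - w / 2.0) / scale      # center and scale the grid
    y = (ys.ravel() - h / 2.0) / scale      # coordinates (cf. Section 3)
    A = monomial_matrix(x, y, degree)
    alpha, *_ = np.linalg.lstsq(A, d.ravel(), rcond=None)
    return alpha
```

As the section notes, constraining f everywhere in D in this way stabilizes the fit but lets the zero set drift away from the data.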
2.4 3L Implicit Polynomial Fitting
What we want from the fitting procedure is an implicit polynomial that is close to the data set and locally preserves the sign consistency (or the gradient of the potential field) along the data set. Since the gradient direction is perpendicular to the tangential direction at a point on the zero set of the polynomial, we want the fitting to capture both the positional and tangential information of the object. They are both important geometric structures of the object shape. Higher order structures can be studied as well; however, such issues are beyond the scope of this paper. Denote by $G(s)$ the gradient direction of the object boundary $\Gamma(s)$. The fitting problem becomes:
$$\min \int_{\Gamma(s)} \Big( f^2(x(s), y(s)) + \underbrace{\|\nabla f(x(s), y(s)) - G(s)\|^2}_{\text{Regularization part}} \Big)\, ds \qquad (4)$$

Figure 3: $A, B, B', C$ on the data set and $A_+, B_+, B'_+, C_+$ on the higher level set of the distance transform.
One way to compute $G(s)$ is via the distance transform, but any potential field with a zero level set close to the data set can be used. The regularization part of (4) can be replaced by the corresponding positional constraints from the level sets. Suppose we choose as inside and outside level sets $\Gamma_+ = 5$ and $\Gamma_- = -5$, respectively (Figure 2(b)). For any point $p_0$ on the object boundary $\Gamma_0$, there exist two points $p_- \in \Gamma_-$ and $p_+ \in \Gamma_+$ such that $p_- p_+$ is perpendicular to the boundary tangent direction at $p_0$. If we force $f(x, y)|_{p_+} = 5$ and $f(x, y)|_{p_-} = -5$, then this is equivalent to forcing the regularization part of (4) to zero. The final three level set (3L) fitting algorithm is:
$$\min \int_{\Gamma_0, \Gamma_+, \Gamma_-} (f(x(s), y(s)) - d(x(s), y(s)))^2\, ds \qquad (5)$$
where $d(x(s), y(s))$ denotes the distance transform value at the point $(x(s), y(s))$. The solution is a linear least squares problem.
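To show that (5) really reduces to a single linear least squares problem, here is a minimal 3L-style sketch (our illustration, not the authors' released implementation). It generates the two extra level sets by offsetting each boundary point along its estimated normal by plus or minus delta and asks the polynomial to take the values 0, +delta and -delta there; the paper instead obtains the level sets and target values from the distance transform, so the offset construction and the generic delta are simplifying assumptions.

```python
# Minimal 3L-style sketch (an illustration, not the authors' released code).
# The inner and outer level sets are built here by offsetting each boundary
# point along its estimated unit normal by +/- delta; the paper derives them
# from the distance transform instead.
import numpy as np


def monomial_matrix(x, y, degree):
    """Monomials x^i y^j with i + j <= degree, one row per point."""
    return np.column_stack([x**i * y**j
                            for i in range(degree + 1)
                            for j in range(degree + 1 - i)])


def three_level_fit(points, degree=4, delta=0.05):
    """points: (N, 2) ordered points of a closed boundary, already centered
    and scaled into a small box (cf. the preprocessing in Section 3)."""
    pts = np.asarray(points, dtype=float)
    # Unit normals from central differences along the closed contour.
    tangent = np.roll(pts, -1, axis=0) - np.roll(pts, 1, axis=0)
    norms = np.linalg.norm(tangent, axis=1, keepdims=True)
    tangent = tangent / np.maximum(norms, 1e-12)
    normal = np.column_stack([-tangent[:, 1], tangent[:, 0]])

    gamma_0 = pts                        # f = 0 here (the data set)
    gamma_plus = pts + delta * normal    # f = +delta here
    gamma_minus = pts - delta * normal   # f = -delta here

    all_pts = np.vstack([gamma_0, gamma_plus, gamma_minus])
    A = monomial_matrix(all_pts[:, 0], all_pts[:, 1], degree)
    n = len(pts)
    b = np.concatenate([np.zeros(n), np.full(n, delta), np.full(n, -delta)])
    alpha, *_ = np.linalg.lstsq(A, b, rcond=None)
    return alpha
```

Whether +delta ends up inside or outside depends on the contour orientation; flipping it only changes the sign of the fitted polynomial.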
2.5 Solution and Stability Analysis for 3L Fitting
The solution $\alpha$ of the original least squares fitting procedure satisfies $A\alpha = b$, where $A$ is the matrix of monomials for all points on the data set and $b$ is a vector of constant values. To solve for $\alpha$ one needs to compute the pseudoinverse of the matrix $A$, which is numerically unstable if eigenvalues of $A^t A$ are close to zero. Suppose a change in the data set causes the matrix $A$ to change by $\Delta A$. The change in $A^t A$ is $(A + \Delta A)^t (A + \Delta A) - A^t A \simeq 2 A^t \Delta A$, and the ratio of the norm of the change to the norm of $A^t A$ is $\frac{\|2 A^t \Delta A\|}{\|A^t A\|}$. We will show that the relative change for 3L fitting is smaller than this.

The solution of the 3L fitting algorithm satisfies:
$$\begin{pmatrix} A \\ A' \end{pmatrix} \alpha = \begin{pmatrix} b \\ b' \end{pmatrix} \qquad (6)$$
where $A'$, $b'$ are for the level sets $\Gamma_-$ and $\Gamma_+$. Multiplying both sides of (6) by $(A^t \; A'^t)$ we have $(A^t A + A'^t A')\alpha = A^t b + A'^t b'$. We need to compute the pseudoinverse of $(A^t A + A'^t A')$. The changes $\Delta A$ and $\Delta A'$ in $A$ and $A'$ will cause $(A^t A + A'^t A')$ to change by approximately $2(A^t \Delta A + A'^t \Delta A')$. However, $\Delta A'$ will be much smaller than $\Delta A$ because of the smoothing effect of the distance transform. (For example, in Figure 3, the distance $d_{B_+} = \|B_+ - B'_+\|$ in the higher level set is smaller than $d_B = \|B - B'\|$.) We conclude that the ratio of change for $(A^t A + A'^t A')$ in 3L fitting is much smaller than the ratio of change for $A^t A$ in the original least squares fitting, because
$$\frac{\|2(A^t \Delta A + A'^t \Delta A')\|}{\|A^t A + A'^t A'\|} \le \frac{\|2 A^t (\Delta A + \Delta A')\|}{\|A^t A + A'^t A'\|}, \qquad (7)$$
so if $\|A'\| \ll \|A\|$, then the RHS of (7) is $\simeq \frac{\|2 A^t \Delta A\|}{\|A^t A\|}$. If $\|A'\| \simeq \|A\|$, then the RHS of (7) is $\simeq \frac{\|2 A^t \Delta A\|\,(1 + \|\Delta A'\| / \|\Delta A\|)}{\|A^t A + A'^t A'\|} \le \frac{\|2 A^t \Delta A\|}{\|A^t A\|}$. A similar result holds when $\|A'\| \gg \|A\|$. So
$$\frac{\|2(A^t \Delta A + A'^t \Delta A')\|}{\|A^t A + A'^t A'\|} \le \frac{\|2 A^t \Delta A\|}{\|A^t A\|},$$
and 3L fitting is in general more robust and stable than least squares fitting. This is especially important when fitting higher degree implicit polynomials to more complex shapes, because then higher order monomials have to be computed reliably. A small amount of noise in the raw data set can quickly destroy the fitting if it is not controlled properly.
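The two perturbation ratios compared above are easy to probe numerically. The following sketch (synthetic random matrices, so purely illustrative) computes the plain least squares ratio and the 3L ratio, with the level-set perturbation deliberately made much smaller than the data perturbation, as the distance-transform smoothing suggests it should be.

```python
# Numerical probe (illustrative only) of the perturbation ratios above:
#   plain LSF: ||2 A^t dA|| / ||A^t A||
#   3L:        ||2 (A^t dA + A'^t dA')|| / ||A^t A + A'^t A'||
import numpy as np


def perturbation_ratios(A, dA, Ap, dAp):
    lsf = np.linalg.norm(2 * A.T @ dA) / np.linalg.norm(A.T @ A)
    threel = (np.linalg.norm(2 * (A.T @ dA + Ap.T @ dAp))
              / np.linalg.norm(A.T @ A + Ap.T @ Ap))
    return lsf, threel


# Synthetic example in which dA' is much smaller than dA.
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 15))     # monomials of the data points
Ap = rng.normal(size=(400, 15))    # monomials of the two extra level sets
dA = 0.1 * rng.normal(size=A.shape)
dAp = 0.01 * rng.normal(size=Ap.shape)
print(perturbation_ratios(A, dA, Ap, dAp))
```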
3 Experiments and Future Work
In this section we present the results of applying the 3L fitting algorithm to silhouettes of various objects. We assume that segmentation has already been done on the original images, and we fit polynomials directly to the raw $(x, y)$ data points. In Figure 4, 3L fits of different higher degree polynomials are shown for a hand shape. Here red points represent data, and green points are on the zero sets of the polynomials. For comparison, standard least squares fitting (LSF) and bounded least squares fitting (BLSF), together with 3L fitting (3LF), are shown in Figure 5 for the original hand data and its transformed version (translation, 15 percent scale, and 27 degree rotation). The previous methods cannot provide useful representations in general because they are not stable and robust.
Figure 4: Level sets and higher degree implicit polynomial fits (degrees 4, 8, 12, 14, and 18) for a hand shape.

Figure 5: Comparison of least squares fit (LSF), bounded least squares fit (BLSF), and 3L fit (3LF) for the original hand data and its transformed version.
The separate page of figures shows 3L fits for different shapes: car, tree, guitar-body, face, airplane, and butterfly. The number listed under each fit is the degree of the polynomial used. These shapes are complicated and cannot be fit stably by implicit polynomials of higher degree using known methods. The 3L fitting algorithm can be used to generate meaningful 4th degree approximations as well as higher degree, more accurate polynomial representations. As the degree increases, more structure is captured.

The speed for 4th degree polynomial fitting is about a thousandth of a second on a SparcStation 10. Higher degree polynomials require more time, ranging from a hundredth of a second to one or two seconds. Most of the time is spent on the computation of moments. The highest degree we have tried so far is 18th. Higher degrees are too time consuming and also numerically unstable. It is also very hard to find the zero set, because one has to numerically solve a high degree polynomial for its roots. The speed can improve significantly if more sophisticated algorithms (for example, the FFT) are used to compute the moments for the polynomials.

In order to fit the higher degree polynomials, it is not necessary to do a lot of preprocessing. The data need only be centered at the origin and scaled to fit properly into a small, say 1 by 1, box. This is for the stability of the numerical computation, since any number greater than 1 raised to the 20th power, for example, is a huge number to handle in the computation. Simple cleaning algorithms are used to get rid of the spurious points of the zero sets of the implicit polynomials in order to properly display the structure of the fitted implicit polynomial for the data set. This is possible because spurious points of 3L fitting polynomials are far away from the level sets and hence are easy to separate.
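The only preprocessing the section calls for, centering the data at the origin and scaling it into roughly a unit box, amounts to a few lines of code; the helper below is our own sketch, and the exact scaling convention is an assumption.

```python
# Sketch (not the authors' code) of the preprocessing described above:
# translate the data to the origin and scale it into a small box so that
# the high powers of the coordinates stay numerically manageable.
import numpy as np


def normalize_points(points, box=1.0):
    """Center (N, 2) points at the origin and scale the larger side of
    their bounding box to `box` (e.g. a 1 by 1 box).  Returns the scaled
    points plus the center and extent needed to undo the mapping."""
    pts = np.asarray(points, dtype=float)
    center = 0.5 * (pts.max(axis=0) + pts.min(axis=0))
    extent = float((pts.max(axis=0) - pts.min(axis=0)).max())
    return (pts - center) * (box / extent), center, extent
```

The returned center and extent allow a polynomial fitted in the normalized frame to be mapped back to the original coordinates if needed.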
References
[1] R. Bolle and D.B. Cooper. Bayesian Recognition of Local 3D Shape by Approximating Image Intensity Functions with Quadric Polynomials. IEEE Transactions on Pattern Analysis and Machine Intelligence, July 1984.
[2] G. Borgefors. Distance Transformations in Arbitrary Dimensions. Computer Vision, Graphics, and Image Processing, 27, 1984.
[3] T.E. Boult and A.D. Gross. Recovery of Superquadrics from Depth Information. In Proceedings, AAAI Workshop on Spatial Reasoning and Multisensor Integration, October 1987.
[4] D. Forsyth, J.L. Mundy, A. Zisserman and C.M. Brown. Projectively Invariant Representation Using Implicit Algebraic Curves. In Proceedings, First European Conference on Computer Vision, 1990.
[5] C.M. Hoffmann. Implicit Curves and Surfaces in CAGD. IEEE Computer Graphics and Applications, January 1993.
[6] D. Keren, D.B. Cooper and J. Subrahmonia. Describing Complicated Objects by Implicit Polynomials. IEEE Transactions on Pattern Analysis and Machine Intelligence, January 1994.
[7] Z. Lei, D. Keren and D.B. Cooper. Computationally Fast Bayesian Recognition of Complex Objects Based on Mutual Algebraic Invariants. In Proceedings, International Conference on Image Processing, Washington, D.C., October 1995.
[8] Z. Lei and D.B. Cooper. Linear Programming Fitting of Implicit Polynomials with Applications in Object Recognition and Interactive Computer Graphics. LEMS Report 146, Division of Engineering, Brown University, October 1995.
[9] Z. Lei and D.B. Cooper. New, Faster, More Controlled Fitting of Implicit Polynomial 2D Curves and 3D Surfaces to Data. In Proceedings, Computer Vision and Pattern Recognition Conference, San Francisco, CA, June 1996.
[10] Z. Lei and Y. Lin. 3D Shape Inferencing and Modeling for Semantic Video Retrieval. To appear in Multimedia Storage and Archiving Systems, SPIE Symposium on Voice, Video and Data Communications, Boston, MA, November 1996.
[11] Z. Lei, H. Civi, and D.B. Cooper. Free-form Object Modeling and Inspection. To appear in Proceedings, Automated Optical Inspection for Industry, SPIE's Photonics China '96, Beijing, China, November 1996.
[12] J. Ponce, A. Hoogs and D.J. Kriegman. On Using CAD Models to Compute the Pose of Curved 3D Objects. CVGIP: Image Understanding, March 1992.
[13] V. Pratt. Direct Least Squares Fitting of Algebraic Surfaces. Computer Graphics, July 1987.
[14] J. Subrahmonia, D.B. Cooper, and D. Keren. Practical Reliable Bayesian Recognition of 2D and 3D Objects Using Implicit Polynomials and Algebraic Invariants. IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 505-519, May 1996.
[15] G. Taubin. Estimation of Planar Curves, Surfaces and Nonplanar Space Curves Defined by Implicit Equations, with Applications to Edge and Range Image Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, November 1991.
[16] G. Taubin, F. Cukierman, S. Sullivan, J. Ponce and D.J. Kriegman. Parameterized Families of Polynomials for Bounded Algebraic Curve and Surface Fitting. IEEE Transactions on Pattern Analysis and Machine Intelligence, March 1994.