Approximation of a Polyline with a Sequence of Geometric Primitives

Eugene Bodansky and Alexander Gribov
Environmental Systems Research Institute (ESRI), 380 New York St., Redlands, CA 92373-8100, USA
Abstract. The problem of recognition of a polyline as a sequence of geometric primitives is important for the resolution of applied tasks such as post-processing of lines obtained as a result of vectorization; polygonal line compression; recognition of characteristic features; noise filtering; and text, symbol, and shape recognition. Here, a method is proposed for the approximation of polylines with straight segments, circular arcs, and free curves.

Keywords: Polyline, Polygonal line, Geometric primitives, Recognition, Approximation, Compression, Smoothing, Noise filtering, Straight segments, Circular arcs, Free curves.
A. Campilho and M. Kamel (Eds.): ICIAR 2006, LNCS 4142, pp. 468–478, 2006. © Springer-Verlag Berlin Heidelberg 2006

1 Introduction

There have been many attempts to develop algorithms that produce a representation of a polyline that is compact, accurate, and suitable for further processing. Early research concentrated on generating polygonal approximations. Later, algorithms were suggested for approximating polygonal lines with a set of straight segments and circular arcs. Usually these algorithms consist of two parts: segmentation of the polygonal line into fragments (short polygonal lines) corresponding to geometric primitives, and approximation of these fragments with primitives.

This paper addresses the problem of approximating polygonal lines with geometric primitives; fragments that cannot be approximated with geometric primitives with acceptable precision are recognized as free curves. This is a more realistic assumption when restoring polygonal lines obtained by vectorization of line drawings (maps, engineering drawings, and so on).

1.1 Compression (Approximation with Straight Segments)

In the past, the problem of compressing source polygonal lines was popular (see, for example, [1] or [2]). The result of compression is a new, approximating polygonal line with fewer vertices than the source polygonal line. The approximating polygonal line has to match the source polygonal line with a given accuracy, and its vertices have to coincide with vertices of the source polygonal line.
In [1] and [2], the maximum distance between a source polygonal line and its approximating polygonal line is used as the measure of the approximation error, and it must not exceed the threshold dt. These and similar algorithms are widely used to post-process the results of vectorization of line drawings. These algorithms are not optimal: they produce approximating polygonal lines with the given precision but not necessarily with the minimum number of vertices, although the number of vertices of the approximating polygonal line never exceeds that of the source polygonal line. Optimal compression algorithms that ensure the minimum number of vertices have been developed as well (see, for example, [3]), but because of their greater computational complexity they are less common. In [4], the measure of the approximation error is termed the significance and is defined as the ratio of the maximum deviation from the straight segment to the length of the straight segment, g = D / L (see Fig. 1).
Fig. 1. Error of approximation with a straight segment
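As an illustration of threshold-based compression, the following is a minimal Python sketch of the recursive splitting scheme commonly attributed to [1] and [2], together with the significance g = D / L. The function names `compress` and `significance` and the helper `point_segment_distance` are hypothetical, and the sketch is not taken from the cited papers.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the orthogonal projection, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def compress(points, dt):
    """Keep a subset of the vertices so that every dropped vertex lies
    within dt of the straight segment that replaces it."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, d_max = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > d_max:
            index, d_max = i, d
    if d_max <= dt:
        return [first, last]
    # Split at the farthest vertex and process both halves recursively.
    left = compress(points[:index + 1], dt)
    right = compress(points[index:], dt)
    return left[:-1] + right

def significance(points):
    """g = D / L for the chord joining the first and last vertex
    (assumes the two end vertices are distinct)."""
    first, last = points[0], points[-1]
    chord = math.hypot(last[0] - first[0], last[1] - first[1])
    deviation = max(point_segment_distance(p, first, last) for p in points)
    return deviation / chord
```

The retained vertices are always vertices of the source polygonal line, which is exactly the property discussed in this section.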
Using the coefficient of significance, it is possible to develop compression algorithms that do not require a threshold. At first glance, algorithms without parameters seem more convenient. However, such algorithms are less flexible than ones with thresholds and cannot be tuned even if the operator has some information about the noise of the source polygonal lines.

The vertices of approximating polygonal lines obtained with compression coincide with vertices of the source polygonal line. Because of this, compression can corrupt the shape of the polygonal line. This is the main shortcoming of compression; confirming examples can be found in [5]. The compression algorithms of [1] and [2] are convenient for removing redundant vertices, but they must be used carefully with larger threshold values. In spite of this, compression is used very often for generalization of linear objects of maps and engineering drawings. The compression precision can be substantially improved by smoothing the source polygonal line beforehand (see [5]).

1.2 Approximation of a Source Polygonal Line with Straight Segments and Circular Arcs

By approximating polygonal lines with straight segments and circular arcs, it is possible to obtain better precision of approximation and a higher degree of compression than with straight segments alone. The precision of approximation of each part of the source polygonal line can be evaluated with Darc and a coefficient garc that are analogous to D and g used in the case of approximation with straight segments. Darc is the maximum deviation of the source polygonal line from the corresponding arc (see Fig. 2), and garc = Darc / Larc, where Larc is the length of the approximating circular arc. An algorithm for compression with straight segments and circular arcs was suggested in [6]. The algorithm looks for a segmentation of the source polygonal line and an approximation of the resulting fragments that minimize g and garc.
Fig. 2. Error of approximation with a circular arc
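The quantities Darc and garc can be sketched in Python as follows. This is only an illustration under stated assumptions: deviations are measured radially and only at the vertices, the arc length is taken as the radius times the angle swept by the vertices around the center, and the function name `arc_error` is hypothetical.

```python
import math

def arc_error(points, center, radius):
    """Return (D_arc, g_arc) for a fragment and a given circle.

    Assumes the fragment sweeps a nonzero angle around the center.
    """
    cx, cy = center
    # Maximum radial deviation of the vertices from the circle.
    d_arc = max(abs(math.hypot(x - cx, y - cy) - radius) for x, y in points)
    # Signed angle swept by consecutive vertices as seen from the center.
    sweep = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        a0 = math.atan2(y0 - cy, x0 - cx)
        a1 = math.atan2(y1 - cy, x1 - cx)
        sweep += (a1 - a0 + math.pi) % (2.0 * math.pi) - math.pi
    l_arc = radius * abs(sweep)
    return d_arc, d_arc / l_arc
```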
To simplify the problem of connecting neighboring primitives, [1], [2], and [6] approximate fragments of the polygonal line with geometric primitives whose ends coincide with the ends of the fragments. Therefore the precision of approximation can be poor because it depends on the errors of the source polygonal lines. Even the algorithm from [6], which uses the least squares method to evaluate the radius of the approximating arc, is rough because the ends of this arc coincide with the ends of the fragment of the source polygonal line.

1.3 Segmentation of the Source Polygonal Line

Using the least squares method with free ends gives good precision of approximation of fragments of polygonal lines with geometric primitives. But this method takes into account only the vertices and does not consider their sequence. The same approximating arc can be obtained for different polygonal lines with the same vertices (see Fig. 3a and 3b). Even if the distances between the vertices of the source polygonal line and the approximating circle (Fig. 3a) do not exceed a given threshold, it is obvious that this arc cannot be used to approximate the corresponding polygonal line.
Fig. 3. Example of the same approximating arc for two different polygonal lines (a) and (b)
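The order-independence of a vertex-only least squares fit can be checked with a small Python sketch. It uses a simple algebraic least squares circle fit (often called the Kåsa fit) as a stand-in; it is not the method of [6] or [11], and the name `fit_circle` is hypothetical.

```python
import math

def fit_circle(points):
    """Algebraic least squares circle fit; returns (x0, y0, R).

    Only sums over the vertices are used, so the result cannot
    depend on the order in which the vertices are connected.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    suu = suv = svv = suuu = svvv = suvv = svuu = 0.0
    for x, y in points:
        u, v = x - mx, y - my
        suu += u * u
        suv += u * v
        svv += v * v
        suuu += u * u * u
        svvv += v * v * v
        suvv += u * v * v
        svuu += v * u * u
    det = suu * svv - suv * suv
    if abs(det) < 1e-12:
        raise ValueError("vertices are (nearly) collinear")
    rhs_u = 0.5 * (suuu + suvv)
    rhs_v = 0.5 * (svvv + svuu)
    # Solve the 2x2 normal equations by Cramer's rule.
    uc = (rhs_u * svv - rhs_v * suv) / det
    vc = (suu * rhs_v - suv * rhs_u) / det
    radius = math.sqrt(uc * uc + vc * vc + (suu + svv) / n)
    return mx + uc, my + vc, radius

# The same vertices connected in two different orders.
arc = [(2 + 5 * math.cos(t), -1 + 5 * math.sin(t))
       for t in (0.0, 0.4, 0.8, 1.2, 1.6, 2.0)]
zigzag = [arc[0], arc[3], arc[1], arc[4], arc[2], arc[5]]
```

Calling `fit_circle(arc)` and `fit_circle(zigzag)` gives the same circle up to rounding, although only the first polyline actually resembles an arc. This is why the fit has to be combined with a segmentation that respects the vertex sequence.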
To resolve this problem, the source polygonal line shown in Fig. 3a has to be segmented into two fragments divided at vertex A. This can be done simultaneously with the approximation by geometric primitives (one-step method) or before it (two-step method). Using the one-step method in combination with minimization of the approximation error gives better precision. However, optimal methods have too high a computational complexity. If the approximation error is minimized with suboptimal algorithms, it is difficult to say which type of method gives the better result. In [6], a one-step method was used. To decrease computational complexity, it uses a rough approximation of polygonal lines with geometric primitives and, even more problematic, a rough method of segmenting the source polygonal lines into fragments. Because of this, many possible fragmentations of the polygonal line are not analyzed. Thus the precision of the approximation method of [6] can often be considered insufficient.
2 An Optimal Segmentation

The first step in the two-step method is segmentation of the source polygonal line into fragments corresponding to different geometric primitives. It is possible to use the algorithms from [1] and [2] to segment the source polygonal lines into fragments corresponding to straight segments. But as noted before, these algorithms are not optimal and the result is sensitive to errors of the source polygonal lines. An optimal algorithm for segmenting a polygonal line into fragments corresponding to straight segments, with acceptable computational complexity, is suggested in [7]. The algorithm is based on finding fragments such that the functional $F(m, \Delta)$ is minimized. The functional is the weighted sum of the accumulated square residuals $\sum_{j=1}^{m} \varepsilon_j$ of the approximation of the fragments of the source polygonal line and the number of straight segments $m$:

$$F(m, \Delta) = \sum_{j=1}^{m} \varepsilon_j + m \cdot \Delta,$$

where $\varepsilon_j$ is the accumulated square residual of the approximation of the $j$-th fragment of the source polygonal line and $\Delta$ is the penalty for each segment. The shorter the fragments of the source polygonal line, the better the accuracy of their approximation with a straight segment, but it is necessary to “pay” for each segment. A new straight segment is added only when $\Delta$ is less than the resulting decrease of the accumulated square residuals. This method is preferable to those of [1] and [2] because it uses a global measure of the approximation error and can reduce the influence of errors of the source polygonal lines.
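To make the functional concrete, here is a minimal Python sketch that minimizes the sum of the fragment residuals plus Δ per fragment by dynamic programming. It is a generic O(n²) formulation under stated assumptions, not the algorithm of [7]: the residual of a fragment is taken as the sum of squared orthogonal distances of its vertices to their best-fit line (rather than an integral residual), adjacent fragments share their boundary vertex, and the name `segment_polyline` is hypothetical.

```python
import math
from itertools import accumulate

def segment_polyline(points, delta):
    """Return (breakpoint indices, minimal F) for the penalty delta."""
    n = len(points)

    def prefix(values):
        return [0.0] + list(accumulate(values))

    px = prefix(x for x, _ in points)
    py = prefix(y for _, y in points)
    pxx = prefix(x * x for x, _ in points)
    pyy = prefix(y * y for _, y in points)
    pxy = prefix(x * y for x, y in points)

    def residual(i, j):
        """Squared orthogonal residual of vertices i..j (inclusive)."""
        k = j - i + 1
        sx, sy = px[j + 1] - px[i], py[j + 1] - py[i]
        sxx = pxx[j + 1] - pxx[i] - sx * sx / k
        syy = pyy[j + 1] - pyy[i] - sy * sy / k
        sxy = pxy[j + 1] - pxy[i] - sx * sy / k
        # Smallest eigenvalue of the 2x2 scatter matrix.
        lam = 0.5 * ((sxx + syy) - math.hypot(sxx - syy, 2.0 * sxy))
        return max(lam, 0.0)

    best = [math.inf] * n   # best[j]: minimal F for vertices 0..j
    back = [0] * n          # back[j]: start vertex of the last fragment
    best[0] = 0.0
    for j in range(1, n):
        for i in range(j):
            cost = best[i] + residual(i, j) + delta
            if cost < best[j]:
                best[j], back[j] = cost, i

    breaks = [n - 1]
    while breaks[-1] != 0:
        breaks.append(back[breaks[-1]])
    return breaks[::-1], best[-1]
```

With a small Δ an L-shaped polyline is split at its corner; with a very large Δ the single-segment solution becomes cheaper, which illustrates the role of the penalty.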
3 Increment of the Tangent Direction and Space (s, τ)

The method suggested in [7] can be modified for segmentation of a polygonal line into fragments corresponding to straight segments and arcs. But the computational complexity of such a method will be too great (at least one degree higher than that of the unmodified method). The same task (segmentation of a polygonal line into fragments corresponding to straight segments and arcs) can be solved with this method without increasing the computational complexity by applying it to the description of the source polygonal line in the plane (s, τ). Here s is the length of the part of the polygonal line between its beginning and the current point, and τ is the angle between the tangent at the current point and the horizontal axis.

Using graphs τ(s) for segmenting polylines and recognizing geometric primitives was suggested more than 30 years ago (see, for example, [8]). Graphs τ(s) were used for segmentation of polylines later as well (see [9] and [10]), but more often researchers tried to use the curvature of source polylines, splitting source polygonal lines at points of maximum curvature. Points of maximum curvature can be used only for finding corners, not for recognizing points of inflection. Therefore, this method cannot be used for approximation of polylines with circular arcs because some of them can join at points of inflection (points C, D, E in Fig. 4a). Using graphs τ(s) allows recognizing points of inflection as well as corners. In addition, it allows the use of the optimal segmentation algorithm [7] because arcs in the plane (X, Y) are transformed into straight segments of τ(s).
To build the graph τ(s) that corresponds to the polygonal line, it is necessary to calculate the tangent direction at a given point. A stable algorithm for building tangents can be derived from the smoothing algorithm described in [5]. The source polygonal line shown in Fig. 4a was obtained from a raster linear object as the result of vectorization. Fig. 4b shows the result of transforming the source polygonal line into τ(s).

The transformation of polylines into τ(s) has the following properties:

• Straight parts of the polyline on the plane (x, y) (for example, segments AB, BC, GH, and HJ in Fig. 4a) correspond to horizontal segments on the plane (s, τ) because the direction of a straight line is constant.
• Circular arcs of the polyline on the plane (x, y) (for example, segments CD, DE, and EF in Fig. 4a) correspond to sloping straight lines on the plane (s, τ) because the angle between the tangents at the endpoints of a circular arc is proportional to the length of the arc. The slope of the straight line in τ(s) is the curvature of the corresponding arc, that is, it is inversely proportional to the radius of curvature.

Therefore corners (points B and H in Fig. 4a) are transformed into almost vertical segments of the graph τ(s) (segments B′B′′ and H′H′′), and the ends of adjacent geometric primitives (points C, D, and E) are transformed into corners of the graph τ(s). These singular points of the graph τ(s) can be found with the algorithm from [7].

Fig. 4. (a) A source polygonal line, (b) corresponding τ(s), and (c) the result of fragmentation of τ(s) with algorithm [7]
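The properties of the transformation can be checked numerically with a short Python sketch. As a simplifying assumption, the tangent direction is taken as the direction of each edge of the polyline and s is measured to the midpoint of the edge; the paper instead derives tangents from the smoothing algorithm of [5]. The name `tangent_graph` is hypothetical.

```python
import math

def tangent_graph(points):
    """Return lists (s, tau) with one sample per non-degenerate edge.

    tau is unwrapped so that it varies continuously along the polyline
    instead of jumping by 2*pi.
    """
    s_values, tau_values = [], []
    s = 0.0
    previous = None
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        if length == 0.0:
            continue  # skip repeated vertices
        angle = math.atan2(y1 - y0, x1 - x0)
        if previous is not None:
            # Choose the representative of the angle closest to the
            # previous (already unwrapped) value.
            turn = (angle - previous + math.pi) % (2.0 * math.pi) - math.pi
            angle = previous + turn
        s_values.append(s + 0.5 * length)
        tau_values.append(angle)
        s += length
        previous = angle
    return s_values, tau_values

# A finely sampled arc of radius 10: tau(s) is a line of slope about 1/10.
radius = 10.0
arc = [(radius * math.cos(0.01 * k), radius * math.sin(0.01 * k))
       for k in range(401)]
s_arc, tau_arc = tangent_graph(arc)
slope = (tau_arc[-1] - tau_arc[0]) / (s_arc[-1] - s_arc[0])
```

For a straight polyline the same function returns a constant τ, which corresponds to a horizontal segment on the plane (s, τ).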
4 Two-Step Approximation of Polygonal Lines with Primitives

Taking into account the properties of the graph τ(s), it is possible to suggest the following workflow for approximating the ground truth line with straight segments and circular arcs:

1. Densify the vertices of the source polygonal line.
2. Build the tangent at the current point of the source polygonal line.
3. Build the graph τ(s) by calculating the distance s from the beginning of the source polygonal line to the current point along the polygonal line and the angle τ between the tangent at this point and the horizontal axis.
4. Segment the graph τ(s) into parts corresponding to straight segments.
5. Segment the source polygonal line into fragments corresponding to the parts of the graph τ(s).
6. Approximate each obtained fragment of the source polygonal line with a straight line and with a circular arc.
7. Compare the errors of approximation with the straight line and the circular arc, and decide which primitive is more pertinent for the analyzed fragment of the source polygonal line: a straight segment or a circular arc.
8. Connect adjacent geometric primitives.

Almost the same workflow is suggested in [8] and [9], but a stable and accurate result could not be obtained. The main cause of the failure is that the authors did not have good algorithms for filtering errors of source polygonal lines, smoothing, building tangents, segmentation of τ(s), and approximating polygonal lines with circular arcs. A stable and accurate algorithm for approximating source polygonal lines is developed here using the graph τ(s), an accurate algorithm for segmentation of polygonal lines [7], an effective smoothing algorithm [5], and accurate and effective algorithms for approximating polylines with straight lines [7] and circular arcs [11].
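Step 1 of the workflow can be sketched in Python as follows. The paper does not specify how vertices are densified, so this is only one plausible reading: each edge is subdivided uniformly so that no piece is longer than a chosen step. The name `densify` and the parameter `max_step` are hypothetical.

```python
import math

def densify(points, max_step):
    """Insert vertices so that no edge is longer than max_step.

    Assumes a non-empty list of (x, y) tuples and max_step > 0.
    Original vertices are always kept.
    """
    if max_step <= 0:
        raise ValueError("max_step must be positive")
    result = [points[0]]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        pieces = max(1, math.ceil(length / max_step))
        for k in range(1, pieces + 1):
            t = k / pieces
            result.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return result
```

A denser polyline gives more samples of τ(s) per fragment, which is useful for the segmentation in steps 4 and 5.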
In addition, an algorithm for approximating polygonal lines in universal vector editors and vectorization systems has to take into account that a polyline can contain free curves. Fig. 4c illustrates the result of segmenting τ(s) into straight segments with the algorithm suggested in [7]. Sometimes the graph τ(s) contains almost vertical straight lines. They correspond to corners of the processed polylines or to arcs with large curvature. Such fragments are called “critical”. If several critical fragments follow one after another, they are joined (see fragment FG in Fig. 4b). The corresponding fragment of the source polygonal line is referred to as a free curve (see curve FG in Fig. 4a).

Each fragment of the source polygonal line, except fragments corresponding to free curves, is approximated with both a straight line and a circular arc. The parameters of the approximating circle (center x0, y0 and radius R) are calculated with the iterative method described in [11], which has good precision and converges quickly. The approximating straight line is calculated with a method that minimizes integral square residuals [7].
Fig. 5. The source polygonal line and the result of approximation with enlarged fragments in the vicinity of points B , C , E , and H
After approximation of each fragment, it is necessary to decide which geometric primitive corresponds to the analyzed fragment. The decision depends on the errors of approximation by a straight line and by a circular arc, on the threshold, and on the control parameter K. If the errors of approximation of the fragment with both primitives exceed the threshold, the fragment of the source polygonal line is a free curve. Otherwise the fragment is a geometric primitive. An arc generally fits at least as well as a straight segment, since a straight line is the limiting case of an arc of infinite radius, so the two errors are not compared directly. If the error of approximation with a circular arc is at least K times smaller than the error of approximation with a straight segment, the fragment is a circular arc; otherwise it is a straight segment. To conclude the task of approximating the source polygonal line with geometric primitives and free curves, it is necessary only to connect them. Fig. 5 shows the source polygonal line and the result of approximation.
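The decision rule described above can be written as a small Python sketch. The function name `classify_fragment` and the string labels are hypothetical, and K is assumed to be greater than 1; the paper does not state how the errors are normalized.

```python
def classify_fragment(line_error, arc_error, threshold, k):
    """Choose the primitive for one fragment from its two fit errors."""
    if line_error > threshold and arc_error > threshold:
        # Neither primitive fits with acceptable precision.
        return "free curve"
    if arc_error * k <= line_error:
        # The arc is at least k times more accurate than the segment.
        return "circular arc"
    return "straight segment"
```

A larger K makes the rule more conservative, so that a fragment is declared an arc only when the arc fit is clearly better than the straight segment.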
Fig. 6. Recognition of the polygonal line (Fig. 3a) as two arcs with the algorithm described in Section 4
5 Conclusion

A method for approximating a polygonal line with geometric primitives and free curves has been suggested. It is a two-step method. The first step is segmentation of the polygonal line into fragments corresponding to primitives; the second step is approximation of the fragments with straight segments, circular arcs, and free curves. To perform the segmentation, the source polygonal line is transformed into the graph τ(s), where τ is the angle between the tangent at the current point and the horizontal axis and s is the distance between the beginning and the current point measured along the polygonal line. Straight segments and arcs of a polyline correspond to straight segments of the graph τ(s). Therefore, segmentation is performed with the optimal algorithm of polygonization [7] applied to τ(s). Each fragment of the source polygonal line obtained with segmentation is approximated and recognized as a straight segment, circular arc, or free curve. After building geometric primitives and free curves, it is necessary to connect them.

Fig. 6 shows the result of processing the source polygonal line shown in Fig. 3a. Using the suggested method, it is successfully recognized as two circular arcs. Fig. 7 includes a raster image of an architectural drawing and the result of vectorization. The obtained polygonal lines were recognized as sequences of geometric primitives and free curves. Figs. 7b, 7c, and 7d show enlarged fragments of the drawing. Points of contact in Figs. 7c and 7d are marked with crosses. The results shown in Figs. 6 and 7 were obtained with the universal vectorization system ArcScan for ArcGIS.

One problem remains for further investigation: the algorithm of polygonization [7] uses one parameter that has no simple physical meaning, so the algorithm can currently be tuned only in interactive mode.

Fig. 7. An architectural drawing (a) and enlarged fragments (b), (c), and (d)
References

1. Ramer, U.: An Iterative Procedure for the Polygonal Approximation of Plane Curves. Computer Graphics and Image Processing, Vol. 1 (1972) 244–256
2. Douglas, D., Peucker, Th.: Algorithms for the Reduction of the Number of Points Required to Represent a Digitized Line or its Caricature. The Canadian Cartographer, Vol. 10, No. 2 (December 1973) 112–122
3. Chan, W.S., Chin, F.: On Approximation of Polygonal Curves with Minimum Number of Line Segments or Minimum Error. International Journal on Computational Geometry and Applications, Vol. 6 (1996) 59–77
4. Lowe, David G.: Three-Dimensional Object Recognition from Single Two-Dimensional Images. Artificial Intelligence, Vol. 31, No. 3 (March 1987) 355–395
5. Bodansky, E., Gribov, A., Pilouk, M.: Post-Processing of Lines Obtained by Raster-to-Vector Conversion. International Journal on Document Analysis and Recognition, No. 3 (2000) 67–72
6. Rosin, Paul L., West, Geoff A.W.: Segmentation of Edges into Lines and Arcs. Image and Vision Computing, 7(2) (1989) 109–114
7. Gribov, A., Bodansky, E.: A New Method of Polygonal Approximation. Lecture Notes in Computer Science, Springer-Verlag Heidelberg, Vol. 3138 (2004) 504–511
8. Perkins, W.A.: A Model-Based Vision System for Industrial Parts. IEEE Transactions on Computers, Vol. C-27, No. 2 (February 1978) 126–143
9. Grimson, W.E.L.: On the Recognition of Curved Objects. MIT Artificial Intelligence Laboratory Memo No. 983 (July 1987)
10. Saund, E.: Identifying Salient Circular Arcs on Curves. Computer Vision, Graphics, and Image Processing: Image Understanding, Vol. 58, No. 3 (1993) 327–337
11. Bodansky, E., Gribov, A.: Approximation of Polylines with Circular Arcs. Llados, J., Kwon, Y.-B. (eds.): Fifth IAPR International Workshop on Graphics Recognition (GREC 2003), LNCS 3088 (2004) 193–198