Genetic Algorithm for Line Extraction
Luc Baron, P. Eng., Ph.D.
Département de génie mécanique
Email:
[email protected]
August 1998
École Polytechnique de Montréal
Abstract
The extraction of lines from an image can be seen as a hard optimization problem with many local optima, each line of the image being a different optimum. Genetic algorithms (GA) are powerful stochastic optimization techniques and are considered here to solve this problem. The optimization model for line extraction is shown to be equivalent to the Hough Transform (HT). The GA has the advantage of evaluating the objective function at a minimum number of points in the parameter space, while HT must build the whole parameter space. The algorithm has a breakdown point of approximately 95%, which is the maximum percentage of outliers tolerated by the algorithm. GA also allows the extraction of a few lines simultaneously, in a single pass. Multiple passes allow the extraction of a much larger number of lines by reducing the set of points at each pass. The use of GA and robust statistics in the extraction model makes it possible to generalize the extraction to any type of geometric primitive.
Contents
1 Introduction
  1.1 Background
  1.2 Objectives
2 Optimization Model for Line Extraction
  2.1 Line Model
  2.2 Extraction Model
3 Genetic Algorithms
  3.1 Crossover
  3.2 Mutation
  3.3 Natural Selection
  3.4 Coding
4 Experimental Results
  4.1 Equivalence with Hough Transform
  4.2 Robustness of the Extraction Model
  4.3 Parallelism of Genetic Algorithm
  4.4 Line Extraction on a Real Scene
  4.5 Weakness of the Extraction Model
5 Conclusions
List of Figures
1 Genetic as an optimisation technique
2 Crossover of two individuals of the population
3 Equivalence between the objective function and Hough Transform
4 3D representation of the objective function in the parameter space
5 Extraction of a line with 75% outliers
6 Extraction of a line with 95% outliers
7 Extraction of 4 lines with 75% outliers
8 Image of a real scene
9 Edges of the real scene
10 Extracted lines superimposed on the real scene
11 Objective function of the real scene over the parameter space
1 Introduction
In model-based vision, the image-understanding process involves several levels of processing in order to recognize objects. The extraction of geometric primitives along the edges of an object belongs to the third level of this process [15, p. 11]. The edges are represented by a set of point positions in image space. These point positions can be obtained from a simple edge detection and the thresholding of an intensity image. A geometric primitive is a curve that can be described by an equation with a number of free parameters. In this report, the geometric primitives are limited to lines, and hence are described with two free parameters. The extraction of lines from a set of edge points is the determination of the two free parameters related to each line.

Since the edge detection operation is usually done on images with multiple noisy edges of different shapes, the presence of outliers must be considered among the set of edge points. An outlier is a point which is far enough from a line not to be taken into account in the extraction and determination of the free parameters of that line. An outlier can be either a noisy point or a point of another primitive. Conversely, an inlier is a point which is close enough to a line to be taken into account in its extraction. The breakdown point is the maximum percentage of outliers that can be tolerated by a method before failure [19]. In the context of line extraction, the breakdown point is a measure of the robustness of an extraction method, based on its ability to perform the extraction on multiple noisy edge points, in the presence of outliers.
1.1 Background
The Least Squares (LS) method is a statistical method which optimally fits a geometric primitive to a set of noisy data points. The method minimizes the sum of the squares of the distances between the primitive and the data points. It is robust with respect to noise, but it has a breakdown point of 0%. This means that the presence of a single outlier can produce completely false results for the extracted parameters [7]. Hence, according to the previous definition, the LS method is not robust and cannot be used for primitive extraction.
The Least Median of Squares (LMedS) is a robust statistics (RS) method which optimally fits a geometric primitive to a set of noisy data points. The method randomly selects a subset of a minimum number of points, say two points for a line, called the minimum subset, and computes the square of the residuals of every other point relative to that line. The median of the squared residuals is associated with that subset. The subset which has the least median of squares is the one used to decide which points are inliers and which are outliers, based on a threshold value [16]. The LMedS has a breakdown point of 50%. It is a robust method and can be used for primitive extraction.

A widely used primitive extraction method is the Hough Transform (HT). The basic concept is to project edge points into a different domain, the parameter space, in which parameter values are more clearly manifested [15]. The extraction of a specific kind of geometric primitive becomes a search in its corresponding parameter space, which leads to a robust algorithm. The breakdown point of HT is much greater than 50%. HT has been shown to be equivalent to template matching, where the templates are the cells of the parameter space. The dimension of this space is determined by the number of independent parameters necessary to describe the geometric primitive. Lines and circles need two and three independent parameters, respectively, and lead to two- and three-dimensional parameter spaces. Higher-order primitives lead to higher-dimensional spaces. The size of the search space is an exponential function of the dimension of that space [3]. Consequently, HT is mainly used for low-order primitive extraction, such as lines, and is less attractive for higher-order primitive extraction.

Primitive extraction is essentially an optimization problem where the goal is to find the global optimum of an objective function which, potentially, has many local optima [18].
By using an appropriate objective function, it has been shown that HT and RS methods are equivalent and that both have a high breakdown point (much greater than 50%). Since the objective function is not necessarily differentiable, gradient-based optimization techniques cannot be used. The optimization method must use only function evaluations to converge to the global optimum. Since the objective function is usually complex, the optimization technique must perform the minimum number of evaluations of the objective function.
Random sampling is an optimization method that evaluates the objective function on a repeated grid [5]. The confidence of obtaining the global optimum depends on the step of that grid. The accuracy of the optimum is achieved by iteratively subdividing the most interesting region. This method requires quite a large number of evaluations of the objective function [18], but fewer than the HT method.

Simulated annealing is an optimization method that evaluates the objective function at a minimum number of points. It is based on the analogy of material annealing: a material is heated to a temperature that permits many atomic rearrangements, then carefully and slowly cooled until it freezes into a good crystal. The method uses an iterative improvement strategy to move toward an optimum. Through a temperature reduction schedule, perturbations are allowed so that the solution can move in any direction. This allows the possibility of jumping out of a local optimum and potentially falling into a more promising path [20]. This method has been successfully used in optimal histogram partitioning [1], pose determination [8] and correspondence determination [21].

Genetic algorithm (GA) is an optimization method that evaluates the objective function at a minimum number of points. It is based on the analogy of the mechanics of natural genetics, and imitates the Darwinian survival-of-the-fittest approach [4]. The method uses strings of bits as chromosomes to represent all the parameters of a potential solution. The use of several chromosomes allows several solutions to be considered in the search space at each iteration. The method uses iterative improvement of the chromosome population by crossover, and applies natural selection by evaluating an objective function for each new member of the population. This allows movement of the population toward several optima simultaneously.
Mutation of bits in the population of chromosomes allows movement in any direction and allows the possibility of jumping out of a local optimum and potentially falling into a more promising region [10]. GA has been successfully applied to fuzzy logic [14], neural network learning [2] and curve fitting [13], used as a process of induction [9], and finally, applied to primitive extraction [17]. A very recent in-depth treatment of GA methods is presented by its creator in [11].
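For comparison with the robust-statistics baseline described earlier in this section, the LMedS procedure can be sketched as follows. This is a minimal illustration and not the implementation of [16]: the function names, the number of trials and the seeded random generator are assumptions introduced here, and the inlier/outlier thresholding step is left out.

```python
import math
import random
from statistics import median

def line_through(p, q):
    """Return (a, b, c) with a*x + b*y + c = 0 and a^2 + b^2 = 1 through p and q."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = math.hypot(a, b)
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)

def lmeds_line(points, trials=200, seed=0):
    """Least-median-of-squares line fit from random two-point minimum subsets.

    Needs at least three points. `trials` and `seed` are illustrative choices.
    """
    rng = random.Random(seed)
    best_line, best_median = None, math.inf
    for _ in range(trials):
        i, j = rng.sample(range(len(points)), 2)
        if points[i] == points[j]:
            continue  # coincident points do not define a line
        a, b, c = line_through(points[i], points[j])
        # squared residuals of every other point relative to that line
        others = [pt for k, pt in enumerate(points) if k not in (i, j)]
        med = median((a * x + b * y + c) ** 2 for x, y in others)
        if med < best_median:
            best_line, best_median = (a, b, c), med
    return best_line, best_median
```

With fewer than 50% outliers, at least one sampled pair is very likely to contain two inliers, and the median of its squared residuals is then small; this is the sense in which the breakdown point is 50%.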
1.2 Objectives
Since many successful applications of GA have been realized in the past, it would be interesting to apply GA to the problem of line extraction. The following assumptions are made relative to this work:
- The extraction is performed on a set of 2D point positions, obtained from an edge detection and the thresholding of an intensity image.
- The extraction is performed for lines only.
- Multiple primitives of any type are allowed in the image.
- Noise is allowed on point positions.
- Outliers are allowed in the set of point positions.

The objectives of this work are:
- To experiment with the use of GA to perform line extraction.
- To show how GA can be related to the HT.
- To evaluate the robustness of line extraction with GA.

This project differs from previous work in that the algorithm can extract multiple lines from a set of point positions in a single execution. It is therefore not necessary to remove the points belonging to an extracted line from the set of points and then re-execute the GA to extract a second line, and so on.
2 Optimization Model for Line Extraction
2.1 Line Model
A line in an image can be represented by the following slope-intercept equation:

$$ y = mx + b \tag{1} $$

where both $m$ (the slope) and $b$ (the intercept of the line with the y-axis) are fixed parameters representing that line. Since these two parameters are unbounded, it is more appropriate to use the parameterization proposed in [3]:

$$ \rho = x\cos\theta + y\sin\theta \tag{2} $$

where the new parameters are bounded as follows:

$$ 0 \le \rho \le (l_x^2 + l_y^2)^{1/2}; \quad -\pi \le \theta \le +\pi \tag{3} $$

while $l_x$ and $l_y$ are the image dimensions. With this formulation, every line of the image space is mapped into a single point in the parameter space. Since line extraction is a search for line parameters, it is easier to perform the search in the parameter space, which is bounded and where lines are represented by points of high intensity.
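The relation between the two parameterizations can be made concrete with a small conversion sketch. The function names are hypothetical and the sign convention (flipping the normal so that $\rho \ge 0$) is an assumption chosen to match the bounds of eq. (3).

```python
import math

def slope_intercept_to_normal(m, b):
    """Convert y = m*x + b into (rho, theta) with rho = x*cos(theta) + y*sin(theta)."""
    # y = m*x + b  <=>  -m*x + y = b ; normalise the normal vector (-m, 1).
    norm = math.hypot(m, 1.0)
    nx, ny, rho = -m / norm, 1.0 / norm, b / norm
    if rho < 0:  # flip the normal so that rho is non-negative
        nx, ny, rho = -nx, -ny, -rho
    return rho, math.atan2(ny, nx)

def normal_to_slope_intercept(rho, theta):
    """Inverse conversion; undefined for vertical lines, where sin(theta) == 0."""
    s = math.sin(theta)
    if abs(s) < 1e-12:
        raise ValueError("vertical line: the slope is unbounded")
    return -math.cos(theta) / s, rho / s
```

The `ValueError` branch is exactly the case that motivates eq. (2): a vertical line has no finite slope, yet it maps to an ordinary point of the bounded $(\rho, \theta)$ space.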
2.2 Extraction Model
In order to extract a line from a set of point positions, let us define $d_i$ as the distance between a point $P_i$ and a line, as follows:

$$ d_i = \|\mathbf{d}_i\| = \sqrt{\mathbf{d}_i^T \mathbf{d}_i} \tag{4} $$

where $\|\mathbf{d}_i\|$ is the Euclidean norm of the vector $\mathbf{d}_i$. Since the parameter space is bounded, it is not a Euclidean space and consequently, the Euclidean norm is not defined in that space. Therefore, let us define the distance $d_i$ in image space, which is Euclidean, as the minimum distance between point $P_i$ and the line represented by the parameters $\rho$ and $\theta$, as follows:

$$ d_i = \|\mathbf{d}_i\| = \sqrt{(x_i - x_o)^2 + (y_i - y_o)^2}; \quad \mathbf{d}_i = \mathbf{p}_i - \mathbf{p}_o; \quad \mathbf{p}_i = [x_i, y_i]^T; \quad \mathbf{p}_o = [x_o, y_o]^T \tag{5} $$

where $\mathbf{p}_i$ is the position vector of point $P_i$ and $\mathbf{p}_o$ is the position vector of the point of the line closest to $P_i$. This point is readily obtained as follows:

$$ \mathbf{p}_o = \begin{cases} x_o = \rho; \quad y_o = y_i & \text{if } \theta = j\pi \;\; \forall\, j = -1, 0, 1 \\[4pt] \begin{aligned} x_o &= x_i \sin^2\theta - y_i \sin\theta\cos\theta + \rho\cos\theta \\ y_o &= y_i \cos^2\theta - x_i \sin\theta\cos\theta - \rho\,\frac{\cos^2\theta}{\sin\theta} + \frac{\rho}{\sin\theta} \end{aligned} & \text{otherwise.} \end{cases} \tag{6} $$

The extraction model can be formulated as the following optimization problem:

$$ z(\rho, \theta) = \sum_{i=1}^{n} \frac{1}{1 + d_i} \rightarrow \max_{\rho,\,\theta} \tag{7} $$

subject to the constraints

$$ 0 \le \rho \le (l_x^2 + l_y^2)^{1/2}; \quad -\pi \le \theta \le +\pi \tag{8} $$

where $n$ is the number of points in the set and $d_i$ is given by eqs. 5 and 6. This objective function makes it possible to find a line, which has the maximum likelihood in the set of points, while searching in the parameter space. It will be shown later that this objective function has many local optima and is equivalent to HT.
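The objective function of eq. (7) is short enough to sketch directly. The sketch below relies on the standard identity that the perpendicular distance from $(x_i, y_i)$ to the line $\rho = x\cos\theta + y\sin\theta$ is $|x_i\cos\theta + y_i\sin\theta - \rho|$, which gives the same $d_i$ as eqs. 5 and 6 without the special case for $\sin\theta = 0$. The function names are illustrative, not taken from the report.

```python
import math

def distance_to_line(x, y, rho, theta):
    """Perpendicular distance from (x, y) to the line rho = x*cos(theta) + y*sin(theta)."""
    return abs(x * math.cos(theta) + y * math.sin(theta) - rho)

def objective(points, rho, theta):
    """z(rho, theta) = sum over all points of 1 / (1 + d_i), as in eq. (7)."""
    return sum(1.0 / (1.0 + distance_to_line(x, y, rho, theta)) for x, y in points)
```

A point lying exactly on the line contributes 1 to the sum, a point one pixel away contributes 0.5, and distant outliers contribute almost nothing, which is why the maximum of $z$ tolerates many outliers.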
3 Genetic Algorithms
Genetic Algorithms (GA) are powerful stochastic optimization techniques based on an analogy with the mechanics of biological genetics; they imitate the Darwinian survival-of-the-fittest approach. The method uses iterative improvement of the individuals of the population at each generation to converge toward the optimum. This is done by means of three operations: crossover, mutation and natural selection.

Figure 1: Genetic as an optimisation technique (population of solutions, individuals 1 to p; genotype: coded value of ρ and θ; phenotype: parameters ρ and θ, obtained by decoding; fitness: objective function z(ρ, θ))

Each individual in the population represents a potential solution to the problem, as shown in Figure 1. The phenotype of an individual is the set of its physical characteristics, like the color of its eyes, and represents the parameters under optimization. The fitness of an individual is its ability to survive natural selection and represents the value of the objective function at the value of its phenotype. The genotype of an individual is the coding of the phenotype in its chromosomes and corresponds to the bit string representation of the parameters.
3.1 Crossover
The evolution of the population at each generation is achieved by the reproduction of the "best" individuals, based on their ability to survive natural selection. Reproduction is performed by the crossover of the genotypes of two parents to obtain the genotypes of two children. Figure 2 shows one of the simplest and most efficient ways of implementing this operation, which is as follows:
- The parents are randomly selected based on their fitness.
- The genotype of each parent is split into two parts at a randomly selected crossover site.
- The genotype of each child is formed by recombining one part of the genotype of each of its parents.
- The fitness of the new members is evaluated by the objective function.

Figure 2: Crossover of two individuals of the population (old generation, parents: Father 1 01011 10 and Mother 1 00111 11; new generation, children: Boy 1 01011 11 and Girl 1 00111 10)
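The crossover steps above can be sketched on bit strings as follows. The function names are hypothetical, and the roulette-wheel rule used for parent selection is an assumption: the report only states that parents are selected randomly based on their fitness.

```python
import random

def select_parents(population, fitness, rng=random):
    """Fitness-proportional (roulette-wheel) choice of two parents; an assumed scheme."""
    return rng.choices(population, weights=fitness, k=2)

def crossover(father, mother, site=None, rng=random):
    """Single-point crossover of two equal-length bit strings."""
    if len(father) != len(mother):
        raise ValueError("parents must have the same genotype length")
    if site is None:
        site = rng.randint(1, len(father) - 1)  # cut strictly inside the string
    boy = father[:site] + mother[site:]
    girl = mother[:site] + father[site:]
    return boy, girl
```

For instance, `crossover("10101110", "10011111", site=6)` exchanges the last two bits of the parents, which corresponds to the bit patterns that can be read from Figure 2.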
3.2 Mutation
Mutation is the inversion of a gene, i.e. one bit, in the genotype of a new member of the population. Mutation makes it possible to try a completely different solution. The probability of mutation should be small in order to let the population improve itself by crossover. This way of seeking different solutions does not imply any control on the direction, as in gradient-based or other techniques.
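A minimal sketch of mutation on a bit-string genotype is given below. Applying the mutation probability independently to every bit is an assumption made for illustration; the report does not say whether the probability applies per bit or per individual, and `p_mutation` is a hypothetical parameter name.

```python
import random

def mutate(genotype, p_mutation=0.05, rng=random):
    """Flip each bit of a bit string independently with probability p_mutation."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[bit] if rng.random() < p_mutation else bit
                   for bit in genotype)
```

With `p_mutation=0.0` the genotype is returned unchanged, and with `p_mutation=1.0` every bit is inverted; useful values lie close to the lower end so that crossover remains the main source of improvement.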
3.3 Natural Selection
Natural selection is performed on the population by keeping the most promising individuals, based on their fitness. In this way, it is possible to keep the size of the population constant, for convenience. This is equivalent to keeping the solutions that are closest to an optimum.
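One simple way to realize this step is truncation selection, sketched below under the assumption that the population is a list of genotypes with a parallel list of fitness values. The function name and the tie-breaking behaviour are illustrative choices.

```python
def natural_selection(population, fitness, size):
    """Keep the `size` fittest individuals; returns (survivors, their fitness values)."""
    ranked = sorted(zip(fitness, population),
                    key=lambda pair: pair[0], reverse=True)[:size]
    return [ind for _, ind in ranked], [fit for fit, _ in ranked]
```

Applied after each round of reproduction to the union of parents and children, this keeps the population size constant and never discards the best solution found so far.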
3.4 Coding
Just as nature has its own method of coding the phenotype into the genotype, every optimisation problem must define its own way of coding the optimization parameters into a string of bits. The number of bits allocated to each parameter determines its maximum resolution. Here, 10 bits have been allocated to each parameter and the following coding has been used:
$$ \rho = \frac{g_1}{1023}\sqrt{l_x^2 + l_y^2}; \quad 0 \le g_1 \le 1023 \tag{9} $$
$$ \theta = \pi\,\frac{g_2 - 0.5}{512}; \quad -512 \le g_2 \le 511 \tag{10} $$

where $g_1$ is the genotype of the parameter $\rho$, $g_2$ is the genotype of the parameter $\theta$, and $l_x$ and $l_y$ are, as previously mentioned, the image dimensions. This method of coding several parameters into a single bit string is crucial in GA. When the number of parameters increases, GA allows only a polynomial increase in the size of the search space, while other optimisation techniques show an exponential increase.
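The decoding of a 20-bit genotype into the two parameters can be sketched as follows. Several details are assumptions made for illustration: the first 10 bits are taken to hold $g_1$ and the last 10 bits $g_2$, the signed range of $g_2$ is obtained by offset binary, and $\theta$ is scaled linearly so that $g_2$ spans approximately the interval $[-\pi, +\pi]$ of the constraints on $\theta$.

```python
import math

BITS = 10  # bits per parameter, as stated in the text

def decode(genotype, lx, ly):
    """Decode a 20-bit string into (rho, theta)."""
    g1 = int(genotype[:BITS], 2)                      # unsigned: 0..1023
    g2 = int(genotype[BITS:], 2) - 2 ** (BITS - 1)    # offset binary: -512..511 (assumed)
    rho = g1 / 1023 * math.hypot(lx, ly)
    theta = math.pi * (g2 - 0.5) / 512                # assumed scaling to about [-pi, pi]
    return rho, theta
```

With 10 bits, the resolution on $\rho$ for a 200 × 200 image is about 0.28 pixel, and the resolution on $\theta$ is about 0.006 radian.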
4 Experimental Results
The experiments are performed using a real scene and artificially generated images. The artificial images are binary images of 200 × 200 pixels, where a white pixel represents a detected point of an edge. Here, they are called edge point images. Inliers are points that are randomly spread along the line in the image, while outliers are points that are randomly spread over the whole image.
Figure 3: Equivalence between the objective function and Hough Transform (a: image space; b: parameter space)
4.1 Equivalence with Hough Transform
The evaluation of the objective function (defined in eqs. 7 and 8) for a given value of the parameters gives the level of correlation between the set of points and the given line. The objective function is a nonlinear mapping of every point of the image onto a single point in the parameter space. Alternatively, every point of the image space is mapped into a set of points in the parameter space lying on a sine curve. The latter is known as the Hough Transform (HT) and is equivalent to evaluating the objective function over the whole parameter space.

Figure 3a) shows an image of a line ($\rho = 31.8$ pixels, $\theta = -0.558$ radian) with 50% inliers and 50% outliers. The brightness in Figure 3b) shows the value of the objective function over the parameter space. The sine curves obtained in the parameter space are equivalent to those usually obtained with HT. The point of maximum brightness in the parameter space corresponds to the parameters of the line in the image space.

Figure 4 shows the evaluation of the objective function over the parameter space, which is a 3D representation of Figure 3b). The position of the strong global maximum corresponds to the parameters of the line in the image. It is clear from this figure that the objective function has many local optima. It is easier to find the maximum in the parameter space by using several solutions, rather than a single one, which would have to travel a great deal. GA also has the advantage of evaluating the objective function at a minimum number of points in the parameter space, while HT must build the whole parameter space. These are the reasons why GA are appropriate for this kind of optimization problem.

Figure 4: 3D representation of the objective function in the parameter space (horizontal axes: Rho and Teta)
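To illustrate what "building the whole parameter space" means, the sketch below evaluates the objective function on a regular grid of $(\rho, \theta)$ cells and returns the brightest one. The grid sizes and function names are assumptions; the objective is the sum of $1/(1 + d_i)$ terms of Section 2.2.

```python
import math

def objective(points, rho, theta):
    """Sum of 1 / (1 + d_i) over all points for the line (rho, theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return sum(1.0 / (1.0 + abs(x * c + y * s - rho)) for x, y in points)

def hough_rho(x, y, theta):
    """Sine curve traced in parameter space by a single image point."""
    return x * math.cos(theta) + y * math.sin(theta)

def evaluate_parameter_space(points, lx, ly, n_rho=64, n_theta=64):
    """Evaluate z(rho, theta) on a regular grid, as HT would fill its accumulator."""
    rho_max = math.hypot(lx, ly)
    rhos = [rho_max * i / (n_rho - 1) for i in range(n_rho)]
    thetas = [-math.pi + 2 * math.pi * j / (n_theta - 1) for j in range(n_theta)]
    grid = [[objective(points, r, t) for t in thetas] for r in rhos]
    return rhos, thetas, grid

def grid_maximum(rhos, thetas, grid):
    """Return (z, rho, theta) of the brightest cell."""
    return max((grid[i][j], rhos[i], thetas[j])
               for i in range(len(rhos)) for j in range(len(thetas)))
```

The cost is `n_rho * n_theta` evaluations regardless of the image content, whereas a GA evaluates the objective only at the individuals of each generation.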
4.2 Robustness of the Extraction Model
As defined in the introduction, the breakdown point is a measure of the robustness of an algorithm when performing the extraction on multiple noisy edge points, in the presence of outliers. It is defined as the maximum percentage of outliers that can be tolerated by the method before failure.

Figures 5a) and 6a) show the same image as Figure 3a), but with 75% and 95% outliers, respectively. Figures 5b) and 6b) show the line extracted by the GA. The extraction works well up to approximately 95% outliers, where it starts to become inaccurate and sometimes fails. The breakdown point of the optimisation model for extraction is approximately 95%, which is similar to the one obtained by [18].

Figure 5: Extraction of a line with 75% outliers (a: edge point image; b: extracted line)
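A test set of the kind used in these experiments can be generated with a sketch such as the one below. It is only an assumed reconstruction of the procedure: points are kept as real coordinates rather than rounded to pixels, and the function name, the seed and the retry limit are illustrative.

```python
import math
import random

def make_edge_points(rho, theta, n_points, outlier_fraction,
                     lx=200, ly=200, seed=0, max_tries=100000):
    """Return (inliers, outliers): inliers on the line, outliers spread over the image."""
    rng = random.Random(seed)
    n_out = round(n_points * outlier_fraction)
    c, s = math.cos(theta), math.sin(theta)
    reach = math.hypot(lx, ly)
    inliers, tries = [], 0
    while len(inliers) < n_points - n_out:
        tries += 1
        if tries > max_tries:
            raise ValueError("the line does not seem to cross the image")
        t = rng.uniform(-reach, reach)
        x, y = rho * c - t * s, rho * s + t * c   # point at signed offset t along the line
        if 0 <= x < lx and 0 <= y < ly:
            inliers.append((x, y))
    outliers = [(rng.uniform(0, lx), rng.uniform(0, ly)) for _ in range(n_out)]
    return inliers, outliers
```

For example, `make_edge_points(31.8, -0.558, 400, 0.75)` produces 100 inliers on the line of Figure 3a) and 300 outliers, the proportion used for Figure 5.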
4.3 Parallelism of Genetic Algorithm
GA have an implicit parallelism based on the fact that they use a population of solutions to converge toward many local optima simultaneously. The final population of solutions usually contains several optima, of which a few may correspond to different real lines in the image space. Therefore, it becomes possible to extract several lines of the image simultaneously in a single pass of the GA, without having to repeat the process of removing the points belonging to the extracted line from the set of points and doing another pass of the GA. The distinction between solutions corresponding to the same optimum and those corresponding to different ones can be made with their respective distances in the parameter space. The selection of these solutions in the final population is done as follows:
1. Select the solution with the highest fitness.
2. In the final population of solutions, remove every other similar solution around the solution previously selected, based on two threshold values corresponding to each parameter.
3. Repeat the first two steps until the desired number of lines is extracted.
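The three steps above can be sketched as follows, assuming the final population is available as a list of (fitness, rho, theta) tuples. The names `rho_tol` and `theta_tol` stand for the two thresholds and are hypothetical; the wrap-around of $\theta$ near $\pm\pi$ is ignored in this sketch.

```python
def select_lines(solutions, n_lines, rho_tol, theta_tol):
    """solutions: list of (fitness, rho, theta). Returns up to n_lines distinct optima."""
    remaining = sorted(solutions, reverse=True)   # highest fitness first
    selected = []
    while remaining and len(selected) < n_lines:
        best = remaining[0]
        selected.append(best)
        # discard the chosen solution and every similar solution around it
        remaining = [s for s in remaining[1:]
                     if abs(s[1] - best[1]) > rho_tol
                     or abs(s[2] - best[2]) > theta_tol]
    return selected
```

A solution is treated as a duplicate of the selected one only when it is close in both parameters, so two parallel lines with different $\rho$ remain distinguishable.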
Figure 6: Extraction of a line with 95% outliers (a: edge point image; b: extracted line)

The algorithm was applied to an artificial image containing four lines and having 75% outliers, see Figure 7a). As shown in Figure 7b), it successfully performed the extraction in a single pass of the GA.
4.4 Line Extraction on a Real Scene
The algorithm was applied to the real scene shown in Figure 8. The result of a simple edge detection of that scene is shown in Figure 9. Again, the extraction of lines from the result of the edge detection was quite successful, as seen in Figure 10. Three lines were successfully extracted in a single pass of the GA, while the fourth line is obviously wrong. Other lines can be extracted by removing the points in Figure 9 corresponding to the extracted edges, and re-executing the GA over the reduced set of points. The objective function of the optimisation model for line extraction is shown in Figure 11 for the real scene.
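The reduction of the point set between two passes can be sketched as below. The distance threshold `max_distance` is a hypothetical parameter: the report does not state how points are matched to an extracted edge.

```python
import math

def remove_inliers(points, lines, max_distance=2.0):
    """Drop every point lying within max_distance of any extracted (rho, theta) line."""
    def near(x, y, rho, theta):
        return abs(x * math.cos(theta) + y * math.sin(theta) - rho) <= max_distance
    return [(x, y) for x, y in points
            if not any(near(x, y, rho, theta) for rho, theta in lines)]
```

Each pass then works on fewer points, so the peaks of the remaining lines are no longer dominated by the lines already found.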
4.5 Weakness of the Extraction Model
Unfortunately, the algorithm for extracting several lines has a number of weaknesses. The selection of the two thresholds limits the relative distance and relative orientation between two distinguishable lines in image space. The number of lines extracted from the final population should be much smaller than the size of that population. The number of generations in the GA must be kept relatively small, say less than 40. The probabilities of crossover and mutation must follow the usual rule of approximately 50% crossover and 5% mutation.

Figure 7: Extraction of 4 lines with 75% outliers (a: edge point image; b: extracted lines)
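To show how the quoted settings (fewer than 40 generations, about 50% crossover and 5% mutation) fit together, a compact sketch of a complete GA pass is given below. It is not the author's program. The interpretation of the two probabilities (fraction of the population replaced by children per generation, and probability of inverting one gene of a new child), the population size, the offset-binary decoding and the $\theta$ scaling are all assumptions.

```python
import math
import random

BITS = 10  # bits per parameter (rho, theta), as in the Coding subsection

def decode(genotype, lx, ly):
    g1 = int(genotype[:BITS], 2)
    g2 = int(genotype[BITS:], 2) - 2 ** (BITS - 1)   # offset binary: -512..511 (assumed)
    return g1 / 1023 * math.hypot(lx, ly), math.pi * (g2 - 0.5) / 512

def fitness(genotype, points, lx, ly):
    rho, theta = decode(genotype, lx, ly)
    c, s = math.cos(theta), math.sin(theta)
    return sum(1.0 / (1.0 + abs(x * c + y * s - rho)) for x, y in points)

def run_ga(points, lx=200, ly=200, pop_size=100, generations=40,
           p_crossover=0.5, p_mutation=0.05, seed=0):
    """Return (final population as (fitness, rho, theta), best fitness per generation)."""
    rng = random.Random(seed)
    pop = ["".join(rng.choice("01") for _ in range(2 * BITS)) for _ in range(pop_size)]
    fit = [fitness(g, points, lx, ly) for g in pop]
    history = [max(fit)]
    for _ in range(generations):
        children = []
        while len(children) < int(p_crossover * pop_size):
            father, mother = rng.choices(pop, weights=fit, k=2)   # fitness-based parents
            site = rng.randint(1, 2 * BITS - 1)                   # crossover site
            for child in (father[:site] + mother[site:], mother[:site] + father[site:]):
                if rng.random() < p_mutation:                     # invert one gene
                    k = rng.randrange(2 * BITS)
                    child = child[:k] + ("1" if child[k] == "0" else "0") + child[k + 1:]
                children.append(child)
        pop += children
        fit += [fitness(g, points, lx, ly) for g in children]
        ranked = sorted(zip(fit, pop), reverse=True)[:pop_size]   # natural selection
        fit, pop = [f for f, _ in ranked], [g for _, g in ranked]
        history.append(fit[0])
    return [(f, *decode(g, lx, ly)) for f, g in zip(fit, pop)], history
```

Because selection keeps the best of parents and children, the best fitness never decreases from one generation to the next; the final population returned here is the input expected by the selection procedure of Section 4.3.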
5 Conclusions
The optimization model for line extraction is equivalent to HT. The GA has the advantage of evaluating the objective function at a minimum number of points in the parameter space, while HT must build the whole parameter space. The algorithm has a breakdown point of approximately 95%, which means that the extraction model can tolerate up to 95% outliers. The GA allows the extraction of only a few lines simultaneously, in a single pass. Multiple passes allow the extraction of a much larger number of lines by reducing the set of points at each pass. The use of GA and robust statistics in the extraction makes it possible to generalize the extraction to any type of geometric primitive.
Figure 8: Image of a real scene.
Figure 9: Edges of the real scene.
Figure 10: Extracted lines superimposed on the real scene.
Figure 11: Objective function of the real scene over the parameter space.

References
[1] Brunelli, R., "Optimal Histogram Partitioning Using a Simulated Annealing Technique", Pattern Recognition Letters, Vol. 13, No. 8, pp. 581-586, August 1992.
[2] Caudill, M., "Evolutionary Neural Networks", AI Expert, pp. 28-33, March 1991.
[3] Duda, R. O., Hart, P. E., "Use of Hough Transform to Detect Lines and Curves in Pictures", Communications of the ACM, Vol. 15, No. 1, pp. 11-15, January 1972.
[4] Eliot, L. B., "Building Better Algorithms", AI Expert, pp. 11-13, March 1991.
[5] Fischler, M. A., Bolles, R. C., "Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography", Communications of the ACM, Vol. 24, No. 6, pp. 381-395, June 1981.
[6] Goldberg, D. E., Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, 412 pages, 1989.
[7] Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J., Stahel, W. A., Robust Statistics: The Approach Based on Influence Functions, Wiley series in probability and mathematical statistics, 502 pages, 1986.
[8] Han, Y. S., Snyder, W. E., Bilbro, G. L., "Pose Determination using Tree Annealing", IEEE Int. Conf. on Robotics and Aut., Vol. 1, pp. 427-432, 1990.
[9] Holland, J. H., Holyoak, K. J., Nisbett, R. E., Thagard, P. R., Induction: Process of Inference, Learning, and Discovery, MIT Press, 385 pages, 1986.
[10] Holland, J. H., "Genetic Algorithms", Scientific American, pp. 66-72, July 1992.
[11] Holland, J. H., Adaptation in Natural and Artificial Systems, MIT Press, 211 pages, Cambridge, 1992.
[12] Huber, P. J., Robust Statistics, Wiley series in probability and mathematical statistics, 308 pages, 1981.
[13] Karr, C. L., Stanley, D. A., Scheiner, B. J., "Genetic Algorithm Applied to Least Squares Curve Fitting", US Department of the Interior, Bureau of Mines, RI-9339, 8 pages, 1991.
[14] Karr, C. L., "Applying Genetics to Fuzzy Logic", AI Expert, pp. 38-43, March 1991.
[15] Levine, M. D., Vision in Man and Machine, McGraw Hill, 574 pages, 1985.
[16] Roth, G., Levine, M. D., "Segmentation of Geometric Signals using Robust Fitting", 10th International Conference on Pattern Recognition, pp. 826-831, June 1990.
[17] Roth, G., Levine, M. D., "A Genetic Algorithm for Primitive Extraction", Proc. Fourth International Conference on Genetic Algorithms, July 13-16, pp. 487-494, 1991.
[18] Roth, G., Levine, M. D., "Extracting Geometric Primitives", McRCIM report CIM-921, 57 pages, August 1992.
[19] Rousseeuw, P. J., Leroy, A. M., Robust Regression and Outlier Detection, Wiley series in probability and mathematical statistics, 329 pages, 1987.
[20] Rutenbar, R. A., "Simulated Annealing Algorithms: An Overview", IEEE Circuits and Devices Magazine, pp. 19-26, January 1989.
[21] Wang, T., Zhuang, X., Xing, X., "Robust 3-D Motion Estimation without Point Correspondences", IEEE Int. Journal on Robotics and Aut., Vol. 7, No. 2, pp. 64-69, April 1992.

Bibliography
[22] Ballard, D. H., "Generalizing the Hough Transform to Detect Arbitrary Shapes", Pattern Recognition, Vol. 13, No. 2, pp. 111-122, 1981.
[23] Davis, L. S., "Hierarchical Generalized Hough Transforms and Line-Segment Based Generalized Hough Transforms", Pattern Recognition, Vol. 15, No. 4, pp. 277-285, 1982.
[24] Davis, L., Genetic Algorithms and Simulated Annealing, Morgan Kaufmann, San Mateo, 1987.
[25] Dudani, S. A., Luk, A. L., "Locating Straight-Line Edge Segments on Outdoor Scenes", Pattern Recognition, Vol. 10, No. 3, pp. 145-147, 1978.
[26] Fairhurst, M. C., Computer Vision for Robotic Systems, Prentice Hall, 193 pages, New York, 1988.
[27] Horn, B. K. P., Robot Vision, The MIT Press, 509 pages, Cambridge, 1986.
[28] Kierkegaard, P., "A Method for Detection of Circular Arcs Based on the Hough Transform", Machine Vision Applications, Vol. 5, pp. 249-263, September 1992.