Optimal Triangulation in 3D Computer Vision Using a Multi-objective Evolutionary Algorithm
Israel Vite-Silva1, Nareli Cruz-Cortés1, Gregorio Toscano-Pulido2, and Luis G. de la Fraga1
1 CINVESTAV, Department of Computing, Av. IPN 2508, 73060 México, D.F., México
{ivite,nareli}@computacion.cs.cinvestav.mx, [email protected]
2 CINVESTAV, Unidad Tamaulipas, Km. 6 carretera Cd. Victoria-Monterrey, 87276, Tamps, México
Abstract. Triangulation is the process by which the 3D position of a point is calculated from two images in which that point is visible. This process requires the intersection of two known lines in space. In the presence of noise, however, this intersection does not occur, and it becomes necessary to estimate the best approximation. One option for achieving this goal is the use of evolutionary algorithms. In general, evolutionary algorithms are very robust optimization techniques; in some cases, however, they have trouble finding the global optimum and become trapped in a local optimum. To overcome this situation, some authors have suggested removing the local optima in the search space by transforming the single-objective problem into a multi-objective one. This process is called multi-objectivization. In this paper we successfully apply multi-objectivization to the triangulation problem.
Keywords: Evolutionary Multi-Objective Optimization, 3D Computer Vision, Triangulation.
1 Introduction
One of the foremost 3D Computer Vision problems is how to calculate the three-dimensional reconstruction of a 3D object's visible surface from two images [1,2]. The problem reduces to calculating the three-dimensional point $X$ that best fits two points $(x, x')$ in the images; this problem is known as triangulation. In principle, one projects two rays through the two points in the known 2D images so that they intersect in the reconstructed 3D space. This is not possible in the presence of noise: noise in the location of the points is transmitted to the projection matrices $(M, M')$. Noiseless conditions are ideal and rarely found in real images, which makes it necessary to look for alternative methods to find the best point of intersection in 3D space.
The triangulation methods can be applied to three types of reconstruction [2, Ch. 2], namely: the projective reconstruction, where neither metric nor parallelism exists; the affine reconstruction, where the concept of parallelism exists but there is no specific metric on each coordinate axis; and the metric reconstruction, in which there exist both a specific metric for each axis and the concept of parallelism. It would be desirable to find a triangulation method that is invariant to the type of reconstruction; this means that under any geometric transformation (e.g., translation and rotation) the geometric properties remain unchanged.
M. Giacobini et al. (Eds.): EvoWorkshops 2007, LNCS 4448, pp. 330–339, 2007. © Springer-Verlag Berlin Heidelberg 2007
One option for achieving this goal is the use of evolutionary heuristics. The best-known Evolutionary Algorithms are the Genetic Algorithms (GAs). In general, GAs are very robust optimization techniques capable of finding good solutions even in noisy search spaces. They are less susceptible to becoming trapped in local optima than other traditional optimization techniques, mainly because of their stochastic nature and population-based scheme. However, if the GA is dealing with a search space containing a huge number of local optima, it may have trouble finding the global optimum and become trapped in a local optimum, because no small modification of the current GA state will produce a better solution.
To overcome this situation, Knowles et al. [3] suggest removing the local optima in the search space by transforming the single-objective problem into a multi-objective one. This process is called multi-objectivization. To perform it, one must either replace the original single objective of a given problem with a set of new objectives, or add new objectives to the original function, such that the resulting multi-objective problem has a Pareto optimal front that coincides with the optimum of the original problem. M. T. Jensen [4] showed a successful application of multi-objectivization to a very complex problem.
In our case, we applied a single-objective Evolutionary Algorithm (EA) to the triangulation problem and obtained very poor results. As a consequence, we decided to multi-objectivize the problem, aiming to improve the solutions found. The main contribution of this paper is to show how a representative state-of-the-art evolutionary multi-objective algorithm (NSGA-II [5]) was applied to solve the triangulation problem by "multi-objectivizing" the objective function. Our experiments indicate that the results obtained by the multi-objective evolutionary algorithm are much better than those obtained by the single-objective EA. Furthermore, when compared against other known triangulation methods, ours obtains better results if only a few correspondence points are available, and similar performance as the number of available points increases.
The rest of the paper is organized as follows: Section 2 states the triangulation problem; Section 3 discusses a number of triangulation methods; Section 4 defines the single- and multi-objective optimization problems; Section 5 presents the single-objective evolutionary triangulation problem; Section 6 shows the multi-objectivized triangulation problem; Section 7 presents the experiments and results; and Section 8 draws a number of conclusions.
2 The Triangulation Problem Statement
Reconstruction is the method by which the spatial layout of a scene and its cameras can be recovered from two views [2]. Suppose that the set of 3D points is unknown and that a set of image correspondences $x_i \Leftrightarrow x'_i$ is given. The goal of reconstruction is to find the camera matrices $M$ and $M'$ and the 3D points $X_i$ such that $x_i = M X_i$ and $x'_i = M' X_i$ for all $i$. The reconstruction method follows the next three steps¹:
1) Compute the fundamental matrix $F$ from the point correspondences. This matrix relates the points of both images and contains the rotation, translation and intrinsic camera parameters.
2) Compute the projection matrices $M$ and $M'$ from the fundamental matrix.
3) For each point correspondence $x \Leftrightarrow x'$, compute the point in space that projects to these two image points.
This third step is known as triangulation. Thus, triangulation can be thought of as the last part of the reconstruction method. If the projection matrices $M$ and $M'$ have already been calculated, then we only need to recover the point in three dimensions. However, because the point positions in the images are inaccurate, and because of the noise caused by the inherent error propagation of floating-point arithmetic, it is highly likely that the lines projected from the two image points do not intersect in space, so it is necessary to look for a way to obtain the best approximation. That is the problem addressed in this paper.
3 Methods to Calculate the Triangulation
There exist several triangulation methods [6]; we describe here only the most relevant ones.
The middle point triangulation method, known as Midpoint, takes the midpoint of the shortest segment between the two projected lines [7,8]. This method is relatively easy to implement; its main disadvantage is that it is neither affine nor projective invariant, because in these cases neither angles nor distances have a defined metric.
The Linear Least Squares method (LLS) finds an approximation to the triangulation in the least-squares sense by solving a system of homogeneous linear equations using the singular value decomposition (SVD). It is affine and Euclidean invariant and its execution time is very low. However, its main weaknesses are that it is not invariant under a projective reconstruction and that it can be unstable because of the matrix inversion it involves.
The Poly-Abs method [6] attempts to minimize the sum of the absolute values of the distances between the correspondence points $x \Leftrightarrow x'$ and their corresponding epipolar lines, that is, $d(x, \lambda) + d(x', \lambda')$, where $\lambda$ and $\lambda'$ are the corresponding epipolar lines. The epipolar lines are computed from the fundamental matrix. The method is affine and projective invariant. It can find very good results if the fundamental matrix is computed with high precision; if not, the error is very large.
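As an illustration of the linear approach, the following is a minimal sketch of a common homogeneous (DLT-style) linear triangulation solved with the SVD. It is not taken from the paper, and the exact LLS variant used by the authors may differ. The camera matrices `M1`, `M2`, the test point `X_true`, and the helper names are hypothetical values chosen only for the example.

```python
import numpy as np

def project(M, X):
    """Project a 3D point X with a 3x4 matrix M and dehomogenize."""
    u, v, w = M @ np.append(X, 1.0)
    return np.array([u / w, v / w])

def triangulate_linear(M1, M2, x1, x2):
    """Homogeneous linear triangulation: least-squares solution of A X = 0."""
    # Each image point contributes two linear equations in the homogeneous X.
    A = np.vstack([
        x1[0] * M1[2] - M1[0],
        x1[1] * M1[2] - M1[1],
        x2[0] * M2[2] - M2[0],
        x2[1] * M2[2] - M2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]              # right singular vector of the smallest singular value
    return X[:3] / X[3]     # back to inhomogeneous world coordinates

# Hypothetical cameras: identity pose and a unit shift along the x axis.
M1 = np.hstack([np.eye(3), np.zeros((3, 1))])
M2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, -0.2, 4.0])

X_est = triangulate_linear(M1, M2, project(M1, X_true), project(M2, X_true))
print(X_est)
```

With noise-free image points the two rays intersect and the sketch recovers the original point; with noisy points it returns the algebraic least-squares compromise instead.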
¹ Many variants on this method are possible.
4 Single and Multi-objective Global Optimization
The general global single-objective optimization problem can be defined as follows (let us assume minimization): find the vector $a$ that minimizes
$$f(a), \tag{1}$$
where $a = [a_1, \ldots, a_n]$ is the vector of $n$ decision variables and the objective function $f$ maps $\mathbb{R}^n \to \mathbb{R}$.
The general multi-objective optimization problem can be defined as follows: find $a$ that minimizes
$$F(a), \tag{2}$$
where $a = [a_1, \ldots, a_n]$ is the vector of $n$ decision variables and $F(a)$ is the vector of $k$ objective functions $[f_1(a), \ldots, f_k(a)]$.
In general, there is no single solution that is minimal for all objectives. Instead, there is a set of solutions $P^*$, called the Pareto optimal set, with the property that
$$\forall a^* \in P^* \;\; \neg\exists a \mid a \prec a^*, \tag{3}$$
where $a \prec a^* \leftrightarrow (\forall i \in \{1, \ldots, k\},\ f_i(a) \le f_i(a^*) \;\wedge\; \exists i \in \{1, \ldots, k\} : f_i(a) < f_i(a^*))$. The expression $a \prec a^*$ is read as "$a$ dominates $a^*$". In addition, for two solutions $a$ and $a'$, we say $a \sim a'$ if and only if $\exists i \in \{1, \ldots, k\},\ f_i(a) < f_i(a') \;\wedge\; \exists j \in \{1, \ldots, k\},\ j \ne i,\ f_j(a') < f_j(a)$. Such a pair of solutions are said to be incomparable, and each is nondominated with respect to the other. Pareto optimal solutions are also termed non-inferior, admissible, or efficient solutions, and their corresponding vectors are nondominated.
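The dominance and incomparability relations defined above translate directly into code. The sketch below is illustrative only; the function names and the sample objective vectors are assumptions made for the example, and minimization is assumed throughout.

```python
def dominates(fa, fb):
    """True if objective vector fa dominates fb: no worse everywhere, better somewhere."""
    return (all(x <= y for x, y in zip(fa, fb))
            and any(x < y for x, y in zip(fa, fb)))

def incomparable(fa, fb):
    """True if each vector is strictly better than the other in some objective."""
    return (any(x < y for x, y in zip(fa, fb))
            and any(y < x for x, y in zip(fa, fb)))

def nondominated(vectors):
    """Keep the vectors that no other vector in the list dominates."""
    return [p for p in vectors if not any(dominates(q, p) for q in vectors)]

# Hypothetical objective vectors (f1, f2) for four candidate solutions.
vectors = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
front = nondominated(vectors)
print(front)  # (3.0, 3.0) is dropped because (2.0, 2.0) dominates it
```

The quadratic filter in `nondominated` is adequate for small sets; NSGA-II itself uses a faster non-dominated sorting procedure [5].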
5 Applying an Evolutionary Single Objective Algorithm to the Triangulation Problem
In this section we show how the triangulation problem was adapted to be solved by an evolutionary single-objective optimization algorithm. We experimented with a simple Genetic Algorithm (GA) [9] and a Particle Swarm Optimization algorithm (PSO) [10] with different parameter values. The statement of the problem and the experiments are presented next.
5.1 Single-Objective Triangulation Problem Definition
The triangulation problem we want to solve can be defined as follows: find the 3D point $X$ in world coordinates $(X_w, Y_w, Z_w)$ that minimizes
$$f(X) = d(x, \hat{x}) + d(x', \hat{x}'), \tag{4}$$
where $d(\cdot,\cdot)$ are Euclidean distances, $x$ and $x'$ represent the correspondence 2D points from the first and second images respectively, and $\hat{x}$ and $\hat{x}'$ are the 2D reconstructed (estimated) points from the first and second images respectively. $\hat{x}$ has coordinates $(\hat{x}, \hat{y})$ calculated by
$$\hat{x} = \hat{u}/\hat{w}, \qquad \hat{y} = \hat{v}/\hat{w}, \qquad [\hat{u}, \hat{v}, \hat{w}]^T = M\,[X_w, Y_w, Z_w, 1]^T, \tag{5}$$
and $\hat{x}'$ has coordinates $(\hat{x}', \hat{y}')$ computed by
$$\hat{x}' = \hat{u}'/\hat{w}', \qquad \hat{y}' = \hat{v}'/\hat{w}', \qquad [\hat{u}', \hat{v}', \hat{w}']^T = M'\,[X_w, Y_w, Z_w, 1]^T. \tag{6}$$
$M$ and $M'$ are the $3 \times 4$ projection matrices from the first and second images respectively.
5.2 Experiments for the Single-Objective Triangulation Problem
The first set of experiments consisted of executing a Genetic Algorithm (GA) [9] and a Particle Swarm Optimization algorithm (PSO) [10] to minimize Equation (4). For all the experiments we used synthetic images generated by projecting a regular polyhedron. To set the correspondence points $x \Leftrightarrow x'$ we selected 24 point pairs from each image. Each point was then perturbed with bi-dimensional Gaussian noise ($k$ RMS pixels in each axial direction) with zero mean and a standard deviation of $k$ pixels, for $k$ from 1 to 8.
We experimented with a large set of different parameter values and operators; for reasons of space, only some of them are presented here. The average results from 30 independent runs of both the GA and the PSO are presented in Figure 1. They are compared against the Linear Least Squares (LLS) and Poly-Abs methods [6], which are among the best-known methods. Only eight point pairs out of the 24 available were randomly selected to compute the fundamental matrix.
The parameter values used by the GA were the following: Representation = Binary with Gray Codes, Selection Type = Binary Tournament, Crossover Type = Two Points, Number of Generations = 20000, Population Size = 100. The parameter values used by the PSO were the following: Number of Generations = 1000000, Population Size = 50, C1 = 1.4962, C2 = 1.4962, W = 0.7298.
It is easy to see that these results are not competitive at all. This failure was our main reason for applying multi-objectivization to the problem. The corresponding experiment is presented in the next section.
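For concreteness, the objective of Equation (4) that the GA and PSO tried to minimize can be sketched as below. This is an illustrative reading of Equations (4) to (6), not the authors' implementation; the function names, camera matrices and test point are assumptions made for the example.

```python
import numpy as np

def project(M, X):
    """Eqs. (5)/(6): project world point X with the 3x4 matrix M."""
    u, v, w = M @ np.append(X, 1.0)
    return np.array([u / w, v / w])

def f_single(X, M1, M2, x1, x2):
    """Eq. (4): sum of the two Euclidean reprojection distances."""
    return (np.linalg.norm(x1 - project(M1, X))
            + np.linalg.norm(x2 - project(M2, X)))

# Hypothetical cameras and observations (noise-free for the check below).
M1 = np.hstack([np.eye(3), np.zeros((3, 1))])
M2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, -0.2, 4.0])
x1, x2 = project(M1, X_true), project(M2, X_true)

print(f_single(X_true, M1, M2, x1, x2))                           # zero at the true point
print(f_single(X_true + np.array([0.1, 0, 0]), M1, M2, x1, x2))   # positive elsewhere
```

An evolutionary algorithm would treat `X` as the three decision variables and call `f_single` as the fitness function.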
6 Applying a Multi-objective Evolutionary Algorithm to the Multi-objectivized Triangulation Problem
Due to the disappointing results obtained by the single-objective evolutionary algorithms (GA and PSO), we decided to multi-objectivize the problem [3,4].
Fig. 1. Comparing the LLS and Poly-Abs methods under a Projective reconstruction against (a) Genetic Algorithm and (b) the Particle Swarm Optimization
[Plots: 2D error (vertical axis, 0 to 60) versus Noise (horizontal axis, 1 to 8). Curves in panel (a): LLS, PolyAbs, GA. Curves in panel (b): LLS, PolyAbs, PSO.]
To multi-objectivize the triangulation problem, we formulate it as a two-objective optimization problem by decomposing the original function in the following way.
6.1 Two-Objective Triangulation Problem Definition
Find the 3D point $X$ in world coordinates $(X_w, Y_w, Z_w)$ that minimizes
$$f_1(X) = d(x, \hat{x}) \quad \text{and} \quad f_2(X) = d(x', \hat{x}'), \tag{7}$$
where $d(\cdot,\cdot)$ are Euclidean distances, $x$ and $x'$ represent the correspondence 2D points from the first and second images respectively, and $\hat{x}$ and $\hat{x}'$ are the 2D reconstructed (estimated) points from the first and second images respectively. $\hat{x}$ has coordinates $(\hat{x}, \hat{y})$ calculated by Equation (5) and $\hat{x}'$ has coordinates $(\hat{x}', \hat{y}')$ computed by Equation (6).
6.2 The NSGA-II Algorithm
The Evolutionary Multi-Objective Optimization (EMOO) area is a very active one², and numerous algorithms are published every year [11,12,13]. NSGA-II [5] is a representative state-of-the-art multi-objective optimization algorithm and one of the most competitive to date. We applied NSGA-II to solve our multi-objectivized problem³ shown in Eq. (7).
² An updated EMOO repository is available at: http://delta.cs.cinvestav.mx/~ccoello/EMOO/
³ The last version of this software is available at: http://www.iitk.ac.in/kangal/codes.shtml
Our two-objective optimization problem was fitted to NSGA-II in the following manner. The three coded variables of each individual are $X = (X_w, Y_w, Z_w)$, using a binary representation. The fitness of an individual is given by the objective functions $f_1(X)$ and $f_2(X)$ (Eq. 7). The parameter values are the following: Number of generations = 300, Crossover rate = 0.9, Mutation rate = 0.33, Population size = 100, Chromosome length = 106.
The multi-objective problem has a Pareto front that coincides with the single-objective optimum, so the solution to our original single-objective problem is taken precisely from the Pareto set. We evaluate the single-objective function (Eq. 4) at all the points of the Pareto front, and the solution with the smallest error is selected.
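The decomposition of Eq. (7) and the final selection step can be sketched as follows. This is only an illustration under stated assumptions: the list `front` is a hypothetical stand-in for the nondominated set that NSGA-II would return (the algorithm itself is not implemented here), and the camera matrices, points and function names are made up for the example.

```python
import numpy as np

def project(M, X):
    """Eqs. (5)/(6): project world point X with the 3x4 matrix M."""
    u, v, w = M @ np.append(X, 1.0)
    return np.array([u / w, v / w])

def objectives(X, M1, M2, x1, x2):
    """Eq. (7): one reprojection distance per image."""
    f1 = np.linalg.norm(x1 - project(M1, X))
    f2 = np.linalg.norm(x2 - project(M2, X))
    return f1, f2

def pick_from_front(front, M1, M2, x1, x2):
    """Select the front member with the smallest Eq. (4) value, f1 + f2."""
    return min(front, key=lambda X: sum(objectives(X, M1, M2, x1, x2)))

# Hypothetical cameras and observed points.
M1 = np.hstack([np.eye(3), np.zeros((3, 1))])
M2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, -0.2, 4.0])
x1, x2 = project(M1, X_true), project(M2, X_true)

# Stand-in for the Pareto set produced by NSGA-II (assumed, not computed).
front = [X_true + np.array([0.2, 0.0, 0.0]),
         X_true,
         X_true + np.array([0.0, -0.1, 0.3])]

best = pick_from_front(front, M1, M2, x1, x2)
print(best, objectives(best, M1, M2, x1, x2))
```

Because the sum $f_1 + f_2$ is exactly the single-objective function of Eq. (4), the selection step costs one extra evaluation per member of the final front.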
7 Experiments of the Multi-objectivized Triangulation Problem
For all experiments we used synthetic images generated by projecting a regular polyhedron. To set the correspondence points $x \Leftrightarrow x'$, we selected 24 point pairs from each image. Each point was then perturbed with bi-dimensional Gaussian noise ($k$ RMS pixels in each axial direction) with zero mean and a standard deviation of $k$ pixels, with $k$ from 1 to 8. The Projective and Affine reconstructions were tested. For each of them, we experimented by randomly taking different numbers of point pairs to compute the fundamental and projection matrices, namely 8, 12, 16 and 20 point pairs out of the 24 available.
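The perturbation and subset selection described above can be sketched as follows. The image coordinates here are random placeholders rather than projections of a polyhedron, and the image size, seed and variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily for repeatability

def perturb(points, k, rng):
    """Add zero-mean Gaussian noise with standard deviation k pixels per axis."""
    return points + rng.normal(0.0, k, size=points.shape)

# Placeholder for the 24 image points of one view (hypothetical 512x512 image).
points = rng.uniform(0.0, 512.0, size=(24, 2))

for k in range(1, 9):                 # noise levels k = 1..8 pixels
    noisy = perturb(points, k, rng)

# Randomly choose 8 of the 24 pairs for estimating the fundamental matrix.
subset = rng.choice(24, size=8, replace=False)
print(sorted(subset.tolist()))
```

The same index subset would be applied to both images so that the selected points remain in correspondence.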
Fig. 2. Comparing the NSGA-II, LLS and Poly-Abs methods under a Projective reconstruction. The number of point pairs taken to calculate the fundamental matrix are: (a) eight pairs, (b) twelve, (c) sixteen and (d) twenty.
[Plots: 2D error (vertical axis, 0 to 60) versus Noise (horizontal axis, 1 to 8) in four panels (a) to (d). Curves in each panel: LLS, PolyAbs, NSGA-II.]
Fig. 3. Comparing the NSGA-II, LLS and Poly-Abs methods under an Affine reconstruction. The number of point pairs taken to calculate the fundamental matrix are: (a) eight pairs, (b) twelve, (c) sixteen and (d) twenty.
[Plots: 2D error (vertical axis, 0 to 60) versus Noise (horizontal axis, 1 to 8) in four panels (a) to (d). Curves in each panel: LLS, PolyAbs, NSGA-II.]
The average results from 30 independent executions of the algorithm with noise from 1 to 8 pixels, taking 8, 12, 16 and 20 correspondence points, are shown in Figures 2 and 3 under Projective and Affine reconstructions, respectively. These results are compared against the LLS and Poly-Abs methods [6].
For the experiment under projective reconstruction, if we take only 8 or 12 correspondence points to compute the fundamental matrix (Fig. 2(a) and (b)), it can be observed that NSGA-II clearly outperforms the LLS and Poly-Abs methods. The 2D error decreases when the noise increases. However, when taking 16 or 20 correspondence points (Fig. 2(c) and (d)), the advantage of NSGA-II over the LLS method is lost, and the two show very similar performance.
For the experiment under an affine reconstruction with 8 correspondence points (Fig. 3(a)), NSGA-II is clearly better than the other two methods. For the cases with 12 and 16 correspondence points (Fig. 3(b) and (c)), NSGA-II is still better, but its advantage over the LLS method is reduced. When taking 20 correspondence points, NSGA-II and LLS appear to show similar results. In all cases Poly-Abs performs poorly.
7.1 Discussion
The results obtained for the multi-objectivized problem show that the Evolutionary Algorithm outperforms the LLS and Poly-Abs triangulation methods if only a few correspondence points are available. However, if more correspondence points are available, then the performances of LLS and the EA are very similar. These correspondence points are difficult to compute, especially for real images.
On the other hand, it is important to note that for all the experiments presented in this paper we assume that the correspondence points $x \Leftrightarrow x'$ from the two images are noisy, which is a realistic situation for almost all real images. This, of course, implies that the fundamental and projection matrices are noisy too.
Furthermore, both the single-objective and the multi-objectivized triangulation problems solved with an EA are invariant to all types of reconstruction. The multi-objectivized approach can be applied successfully if there are only a few very noisy correspondence points and the accuracy obtained matters more than the execution time of the algorithm.
8 Conclusions
We have shown a number of experimental results obtained by applying single-objective Evolutionary Algorithms to the triangulation problem; they present a very large error. These disappointing results improved dramatically when multi-objectivization was applied to the problem and it was solved with the NSGA-II algorithm. The multi-objectivized approach can be applied successfully if there are only a few very noisy correspondence points, obtaining better results than other known triangulation techniques.
Based on our experiments, we can argue that the multi-objectivization methodology performs quite well for the presented triangulation problem. It is worth remarking that such application problems are rather rare. To the best of our knowledge, there are only two other approaches in the specialized literature reporting the successful application of the multi-objectivization methodology: [3] (where it was originally proposed) and [4].
Acknowledgments. This work was partially supported by CONACyT, México, under grant 45306 and scholarship 173964.
References
1. T. Jebara, A. Azarbayejani, and A. Pentland. 3D Structure from 2D Motion. IEEE Signal Processing Magazine, 6(3):66–84, May 1999.
2. R. I. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, ISBN: 0521540518, 2nd edition, 2004.
3. J. D. Knowles, R. A. Watson, and D. W. Corne. Reducing Local Optima in Single-Objective Problems by Multi-Objectivization. In E. Zitzler, K. Deb, L. Thiele, C. A. Coello Coello, and D. Corne, editors, Proceedings of the First International Conference on Evolutionary Multi-Criterion Optimization (EMO 2001), volume 1993 of LNCS, pages 269–283, Berlin, 2001. Springer-Verlag.
4. M. T. Jensen. Guiding Single-Objective Optimization Using Multi-objective Methods. In G. Raidl et al., editor, Applications of Evolutionary Computing. Evoworkshops 2003: EvoBIO, EvoCOP, EvoIASP, EvoMUSART, EvoROB, and EvoSTIM, pages 199–210, Essex, UK, April 2003. Springer. Lecture Notes in Computer Science Vol. 2611.
5. K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan. A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation, 6(2):182–197, April 2002.
6. R. Hartley and P. Sturm. Triangulation. Computer Vision and Image Understanding, 68(2):146–157, 1997.
7. P. A. Beardsley, A. Zisserman, and D. W. Murray. Navigation Using Affine Structure and Motion. In J. O. Eklundh, editor, Proc. 3rd European Conference on Computer Vision - ECCV'94, volume 800 of LNCS, pages 85–96. Springer-Verlag, 1994.
8. P. A. Beardsley, A. Zisserman, and D. W. Murray. Sequential Updating of Projective and Affine Structure from Motion. International Journal of Computer Vision, 23(3):235–259, 1997.
9. A. E. Eiben and J. E. Smith. Introduction to Evolutionary Computing. Springer, Berlin, 2003.
10. J. Kennedy and R. C. Eberhart. Swarm Intelligence. San Mateo, CA: Morgan Kaufmann Publishers, 2001.
11. D. W. Corne, N. R. Jerram, J. D. Knowles, and M. J. Oates. PESA-II: Region-Based Selection in Evolutionary Multiobjective Optimization. In L. Spector, E. D. Goodman, A. Wu, W. B. Langdon, H. M. Voigt, M. Gen, S. Sen, M. Dorigo, S. Pezeshk, M. H. Garzon, and E. Burke, editors, Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001), pages 283–290, San Francisco, California, USA, 7-11 2001. Morgan Kaufmann.
12. K. Deb, M. Mohan, and S. Mishra. Evaluating the Epsilon-Domination Based Multi-Objective Evolutionary Algorithm for a Quick Computation of Pareto-Optimal Solutions. Evolutionary Computation, 13(4):501–525, 2005.
13. L. V. Santana-Quintero and C. A. Coello Coello. An Algorithm Based on Differential Evolution for Multi-Objective Problems. International Journal of Computational Intelligence Research, 1(2):151–169, 2005.