Match Selection in Batch Mosaicing Using Mutual Information
Armagan Elibol, Nuno Gracias, and Rafael Garcia
Computer Vision and Robotics Group, University of Girona, 17071, Spain
{aelibol,ngracias,rafa}@eia.udg.edu
Abstract. Large area photo-mosaics are widely used in many different applications such as optical mapping, panorama creation and autonomous vehicle navigation. When the trajectory of the camera provides an overlap between non-consecutive images (closed-loop trajectory), it is essential to detect such events in order to get globally coherent mosaics. Recent advances in image matching methods allow for registering pairs of images in the absence of prior information on orientation, scale or overlap between images. Owing to this, recent batch mosaicing algorithms attempt to detect non-consecutive overlapping images using exhaustive matching of image pairs. This paper proposes the use of Observation Mutual Information as a criterion to evaluate the benefit of potential matches between pairs of images. This allows for ranking and ordering a list of potential matches in order to make the loop-closing process more efficient. In this paper, the Observation Mutual Information criterion is compared against other strategies and results are presented using underwater imagery.
1
Introduction
One of the most important steps in building a mosaic is image matching [1]. In the absence of other sensor data, time-consecutive images are generally assumed to have an overlapping area. This overlap allows the images to be registered and an initial estimate of the camera trajectory over time to be obtained. This initial estimate suffers from rapid accumulation of registration errors and can be very far from the real trajectory. However, it provides useful information for predicting which non-time-consecutive images overlap. Matching those images helps to refine the topology by using global alignment methods [2,3]. With the refined topology, new non-time-consecutive overlapping images can be predicted and matching can be attempted on them. This iterative matching and optimization process continues until no new overlapping images are found. Topology estimation was first addressed in [4], where iterative topology inference was proposed assuming that time-consecutive images have an overlapping area. Recent advances in image matching techniques, such as the Scale Invariant Feature Transform (SIFT) [5], allow for registering pairs of images in the absence of prior information on orientation, scale or overlap between images. Such techniques are behind the recent proliferation of panorama creation algorithms, since they allow creating panoramas with minimal user input [6,7]. Several approaches attempt to match all images against all others or rely on manual selection of spatially overlapping images [8,9]. While this is feasible for small sets, it becomes impractical for larger problems, such as creating underwater mosaics, where useful surveys may comprise many thousands of images.

The objective of this work is to study different strategies for selecting the image pairs to be matched, and to obtain the best estimate of the topology of the surveyed area by exploring the contributions of image matches and by choosing which images are matched first. We assume that all images have already been acquired and that time-consecutive images have an overlapping area. Therefore, we are free to choose the order in which image pairs are matched.

H. Araujo et al. (Eds.): IbPRIA 2009, LNCS 5524, pp. 104–111, 2009. © Springer-Verlag Berlin Heidelberg 2009
2
Topology Estimation Using Extended Kalman Filter
Our approach is inspired by image mosaicing methods based on the Extended Kalman Filter (EKF), which have been studied over the last decade, especially in the context of mosaic-based navigation [10,11,12]. As matching non-consecutive image pairs provides more information about the topology and improves the trajectory estimate, it is essential to detect such pairs while estimating the topology of the surveyed area. In this context, it is important to measure the contribution of matching one image pair in terms of how much information it will provide about the topology. In this work, the potential gain of matching an image pair is predicted by treating image matching between potentially overlapping images as an observation or measurement. The predicted gain is then calculated as the amount of information the observation contributes to the information matrix of the whole system. This is obtained by calculating the Observation Mutual Information (OMI). As our interest is batch mosaicing, we do not require any control input and do not use the state prediction equations; only the observation update equations are used. As a design option, each image is used at most once in each iteration of the algorithm. This ensures independence among the observation elements and allows us to adapt existing methods for sensor fusion, selection and management.
2.1
Definitions
The state vector, $x$, is composed of absolute homographies, ${}^{m}H_i = \mathrm{mat}(x_i)$, which relate all images to the mosaic frame. We use similarity homographies [13], which have 4 degrees of freedom (scaling, rotation, and translation along both the x and y axes). Let $P$ be the covariance matrix of $x$, with

$x_i = [a_i, b_i, c_i, d_i]^T, \quad i = 1, 2, 3, \ldots, N$

where $N$ is the total number of images. A new measurement (observation) is obtained when two images, $i$ and $j$, are successfully matched. The observation is represented by the homography between the corresponding images at iteration $k$,

$z_k = {}^{i}H_m \cdot {}^{m}H_j + v_k$

where $v_k$ is the observation noise vector and $m$ denotes the mosaic frame. We follow the usual assumption that observation noise is Gaussian and not correlated with state noise. The covariance matrix of the observation noise is denoted by $R_k$ and can be estimated from the points and matches [14].

A potential observation is a non-time-consecutive image pair for which overlap is predicted and for which matching has not yet been attempted. The predicted overlap is computed using the state vector and its covariance matrix. An OMI score is calculated for each potential observation. This OMI score is the Predicted Information Gain of the observation,

$I(k, z_k) = \frac{1}{2} \log \frac{|Y(k|k)|}{|Y(k|k-1)|}$    (1)

where $Y$ is the information matrix [15] and $|\cdot|$ denotes the determinant. The OMI score requires knowledge of $R_k$. For potential observations (where image matching has not been attempted yet), a generic $R_k$ is used. This generic $R_k$ was obtained experimentally from the matches among consecutive image pairs by selecting the one with the highest uncertainty in this set. Each matched image pair allows one $R_k$ to be estimated.
2.2
Implementation
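As a practical note on evaluating the OMI score of Eq. (1), the derivation below is a sketch under assumptions that are not stated in the text: it assumes the standard information-filter update for a linearized observation, and it introduces the symbol $J_k$ (hypothetical here) for the Jacobian of the observation with respect to the state, with $P(k|k-1) = Y(k|k-1)^{-1}$ and $\mathrm{Id}$ the identity matrix.

```latex
\begin{align}
Y(k|k) &= Y(k|k-1) + J_k^{T} R_k^{-1} J_k \\
I(k, z_k) &= \tfrac{1}{2}\log\frac{|Y(k|k)|}{|Y(k|k-1)|}
           = \tfrac{1}{2}\log\left|\mathrm{Id} + P(k|k-1)\, J_k^{T} R_k^{-1} J_k\right| \\
          &= \tfrac{1}{2}\log\left|\mathrm{Id} + R_k^{-1} J_k\, P(k|k-1)\, J_k^{T}\right|
           = \tfrac{1}{2}\log\frac{\left|J_k\, P(k|k-1)\, J_k^{T} + R_k\right|}{|R_k|}
\end{align}
```

The step between the second and third lines uses the identity $|\mathrm{Id} + AB| = |\mathrm{Id} + BA|$. Under these assumptions the score only needs determinants of matrices the size of one observation, and the last expression stays defined even when $P$ is singular, which may be convenient given that the covariance of the reference image is set to zero.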
Our algorithm is composed of five blocks: Initialization, Potential Observation List Generation, Selection, Image Matching and Filter Update. The pipeline is illustrated in Fig. 1.

Fig. 1. Pipeline of EKF based topology estimation framework

In the Initialization block, the initial values of $x$ and $P$ are computed from correspondences between time-consecutive image pairs. The first image frame is chosen as the global (reference) frame. The absolute homography of image $i$, ${}^{1}H_i$, is calculated by accumulating the relative homographies, ${}^{1}H_i = {}^{1}H_{i-1} \cdot {}^{i-1}H_i$, $i = 2, 3, \ldots, N$. As the first image is chosen as the global frame, its covariance matrix is set to zero. The uncertainties of the relative homographies are calculated from the points and matches by using first-order propagation [14]. The covariance matrices of the initial absolute homographies are also obtained by using a first-order approximation of the accumulation, assuming that the covariances of time-consecutive homographies are not correlated.

Once the initial covariance matrix is computed, a Potential Observation List can be generated. For every possible image pair, the overlapping area is computed using a numerical approximation. If the overlap is bigger than a given threshold, the image pair is considered an overlapping image pair and is added to the potential observation list.

After generating the list, the Selection step starts. For each potential observation in the list that has not been processed in previous iterations, different scores can be calculated depending on the tested strategy. The aim of the Selection step is to choose the subset of observations that maximizes the score. The Selection step is modeled as a linear assignment problem [16]. The maximum number of observations to be selected is equal to $N/2$, as one image can only be used in one observation.

After the potential observations have been generated and selected, image matching starts. The image matching step is composed of two sub-steps: SIFT [5] is used to detect the features in the images, and RANSAC [13] is used to reject outliers and estimate the homography. Matching is attempted only once for each image pair. If it is successful, the noise covariance is calculated from the correspondences by using first-order noise propagation [14]. The final Filter Update step updates the state and covariance by using the EKF equations.
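The constraint of the Selection step, that each image appears in at most one selected observation per iteration, can be illustrated with a short sketch. Note that this is not the linear assignment formulation used in the paper: it is a simple greedy stand-in that picks the highest-scoring pairs first and is not guaranteed to maximize the total score. The function name `select_observations` and the `(score, i, j)` tuple layout are assumptions made for illustration only.

```python
def select_observations(scored_pairs):
    """Greedily pick image pairs so that each image is used at most once.

    scored_pairs: iterable of (score, i, j) tuples, where i and j are image
    indices and score is any selection criterion (overlap, OMI, ...).
    Returns the selected (i, j) pairs in decreasing order of score.
    """
    selected = []
    used_images = set()
    # Visit candidates from the highest score to the lowest.
    for score, i, j in sorted(scored_pairs, key=lambda t: t[0], reverse=True):
        # Skip the pair if either image is already part of a selected pair.
        if i in used_images or j in used_images:
            continue
        selected.append((i, j))
        used_images.update((i, j))
    return selected


if __name__ == "__main__":
    candidates = [(0.9, 1, 5), (0.8, 1, 7), (0.7, 2, 7), (0.1, 5, 7)]
    print(select_observations(candidates))
```

Because every selected pair consumes two distinct images, the result can never contain more than N/2 observations, which matches the bound stated above.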
3
Experimental Results
Several criteria can be devised to order the pairs of potential image matches, so that the most relevant ones are attempted first. We propose and test the following criteria: (1) Expected Overlap, (2) OMI, (3) Overlap Weighted OMI, and (4) Random Order. The Expected Overlap criterion selects the pairs which have higher overlap, and thus a higher chance of being successfully matched. This criterion takes into account the uncertainty in the trajectory (using a numerical approximation to compute the overlap under uncertainty) and has been used before in the context of underwater mosaicing [17]. The OMI criterion selects the pairs that contribute the most in terms of Mutual Information, with the aim of quickly reducing the uncertainty on the trajectory. The Overlap Weighted OMI combines the first two. Finally, the Random Order criterion selects the pairs randomly. It is included as a comparison baseline to assess the performance of the other criteria.

The data set is an underwater image sequence that consists of 169 images of size 512 × 384 pixels. In order to compare the results, we computed the set of homographies using a standard iterative bundle adjustment approach [2]. Bundle adjustment minimizes the reprojection error over the trajectory parameters. The result of applying bundle adjustment iteratively is given in the last row of Table 1. Fig. 2 shows the resulting trajectory. The resulting homography set is used as a reference to compare the results of the EKF based topology estimation strategies. Our comparison criterion is the average reprojection error over all correspondences that were previously found by iteratively applying bundle adjustment and image matching.

Fig. 2. Final topology with bundle adjustment. Numbers correspond to the image centers and lines denote the overlapping image pairs which were successfully matched.

Table 1 gives the summary of the results. Tested strategies are listed in the first column. The second column shows the total number of image pairs that have been matched successfully. The third column contains the total number of image pairs that were not matched successfully, referred to as unsuccessful observations. Adding the second and the third column gives the total number of matching attempts, which is shown in the fourth column. The fifth column denotes how many iterations have been executed. The last column shows the average reprojection error in pixels, calculated by using all correspondences with the resulting set of homographies for each tested strategy. To assess the performance of the random strategy, we have run our algorithm 10 times and computed the average values.

Table 1. Summary of Results

Strategy               Successful Obs.  Unsuccessful Obs.  Total Obs.  Iterations  Avg. Error (pixels)
Expected Overlap       869              967                1836        38          11.04
OMI                    870              1036               1906        40          9.58
Overlap Weighted OMI   871              1011               1882        40          9.54
Random Order           870.3            1022.3             1892.7      38          10.67
Bundle Adjustment      872              2141               3013        -           8.78

From Table 1, one can conclude that OMI-based observation selection (OMI and Overlap Weighted OMI, at 9.58 and 9.54 pixels respectively) produces the lowest reprojection error among the tested strategies and is also the closest to bundle adjustment (8.78 pixels). Plots of the total number of successful observations and of the total matching attempts vs. reprojection error are given in Figs. 4 and 5.

Fig. 3. Cumulative number of successful observations for each iteration. X axis shows iterations and Y axis shows the number of successful observations in cumulative order.

Fig. 4. Average Reprojection Error for all correspondences for tested selection strategies. X axis shows the number of successful observations in cumulative order and Y axis shows the Average Reprojection Error in Pixels. The small plot contains a zoomed area of the bigger plot between 300 and 900 cumulative successful observations.

Fig. 5. Average Reprojection Error for all correspondences for tested selection strategies. X axis shows the total number of matching attempts in cumulative order and Y axis shows the Average Reprojection Error in Pixels. The small plot contains a zoomed area of the bigger plot between 500 and 2000 cumulative total matching attempts.

It can be noted that the order of observations makes a difference and has an effect on the resulting trajectory. The OMI-based selection strategy has the largest total number of image matching attempts among the tested selection strategies (1906). Especially after the initialization step, the generated potential observation list has several entries which do not actually overlap, because of the high uncertainty of the state. During the first iterations, the total number of successful observations in the OMI-based selection strategy is low. This can be seen in Fig. 3. OMI selects the observations that provide the most information to the system, and this reduces the reprojection error rapidly. This is clearly illustrated in Fig. 4. It is worth noting that random selection has produced relatively good results in terms of total matching attempts. This is partly due to the characteristics of the data set, where trajectory lines are very close to each other (see Fig. 2). This leads to a high probability of having overlap between any two images.
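To make the comparison criterion more concrete, the following is a minimal sketch of an average reprojection error over a set of correspondences. It rests on assumptions that the text does not spell out: the parameter layout of the similarity homography (here $x_i = [a, b, c, d]$ maps $(x, y)$ to $(ax - by + c,\; bx + ay + d)$), and the choice of measuring the error in the mosaic frame as the distance between the two mapped points. All function names are hypothetical.

```python
import math


def apply_similarity(params, point):
    """Map an image point into the mosaic frame.

    Assumed layout: params = [a, b, c, d] stands for the matrix
    [[a, -b, c], [b, a, d], [0, 0, 1]].
    """
    a, b, c, d = params
    x, y = point
    return (a * x - b * y + c, b * x + a * y + d)


def average_reprojection_error(homographies, correspondences):
    """Mean mosaic-frame distance between matched points.

    homographies: dict mapping image index -> [a, b, c, d]
    correspondences: list of (i, point_in_i, j, point_in_j) tuples
    """
    total = 0.0
    for i, p_i, j, p_j in correspondences:
        xi, yi = apply_similarity(homographies[i], p_i)
        xj, yj = apply_similarity(homographies[j], p_j)
        total += math.hypot(xi - xj, yi - yj)
    return total / len(correspondences)


if __name__ == "__main__":
    # Toy example: image 2 is image 1 shifted by 10 pixels along x.
    H = {1: [1.0, 0.0, 0.0, 0.0], 2: [1.0, 0.0, 10.0, 0.0]}
    matches = [(1, (15.0, 5.0), 2, (5.0, 5.0)),
               (1, (15.0, 5.0), 2, (5.0, 8.0))]
    print(average_reprojection_error(H, matches))
```

With this definition, a lower value means that the estimated homographies bring matched points closer together in the mosaic, which is the sense in which the strategies in Table 1 are ranked.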
4
Discussion
When there is no a priori information on the topology, a conceptually simple strategy is to use exhaustive "all against all" matching. If there is a priori information, such as consecutive images having overlap, then an iterative bundle adjustment approach can be employed, as shown in the previous section. For the test data set (169 images), the all-against-all matching strategy would try to match 14,196 image pairs. If the prior information that time-consecutive images overlap is available, then an initial estimate of the trajectory can be computed by accumulating the relative homographies. Non-consecutive overlapping pairs can be predicted by using this initial estimate. As this initial estimate is far from the real trajectory, predicting non-consecutive image pairs can generate several false image pairs. The total number of matching attempts for this approach (3013) can be seen in the last row of Table 1. Using prior information helps to reduce the total number of matching attempts compared to the all-against-all strategy. In our EKF based framework, we were able to reduce the total number of matching attempts even further by incorporating the covariances of the image positions while generating the potential observation list. This eliminates several non-overlapping image pairs.
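The figures quoted in this discussion can be checked with a few lines of arithmetic. The sketch below recomputes the number of all-against-all pairs for 169 images and expresses the total matching attempts reported in Table 1 as a fraction of that exhaustive count. The constant and function names are illustrative only.

```python
from math import comb

N_IMAGES = 169  # size of the test data set

# Total matching attempts per strategy, copied from Table 1.
TOTAL_ATTEMPTS = {
    "Expected Overlap": 1836,
    "OMI": 1906,
    "Overlap Weighted OMI": 1882,
    "Random Order": 1892.7,
    "Bundle Adjustment": 3013,
}


def exhaustive_pairs(n_images):
    """Number of unordered image pairs in all-against-all matching."""
    return comb(n_images, 2)


def fraction_of_exhaustive(attempts, n_images=N_IMAGES):
    """Share of the exhaustive pair count that was actually attempted."""
    return attempts / exhaustive_pairs(n_images)


if __name__ == "__main__":
    print("all-against-all pairs:", exhaustive_pairs(N_IMAGES))
    for name, attempts in TOTAL_ATTEMPTS.items():
        share = 100 * fraction_of_exhaustive(attempts)
        print(f"{name}: {attempts} attempts ({share:.1f}% of exhaustive)")
```

Under these numbers, iterative bundle adjustment attempts roughly 21% of the exhaustive pairs, while the EKF based strategies attempt roughly 13%.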
5
Conclusions
In this paper, we have proposed a ranking criterion to evaluate the benefit of potential matches between pairs of images of a sequence. Proper ordering of these matches makes loop-closing more efficient when building large area photo-mosaics. Successfully matched image pairs do not all contribute equally to reducing uncertainty and reprojection error. An important conclusion of this study is that the order in which overlapping images are matched plays an important role when trying to reduce the reprojection error of the trajectory while taking into account the uncertainties of the parameters. Although all tested strategies had nearly the same number of observations, the resulting reprojection errors are not the same. This emphasizes the importance of properly selecting the pairs to be matched and their order. In this context, different strategies for ordering image matching have been tested and their performance has been compared.
Acknowledgments. This work has been partially funded by the Spanish Ministry of Education and Science (MEC) under grant CTM2007-64751 and the FREESUBNET EU project MRTN-CT-2006-036186. Nuno Gracias has been supported by the Ramon y Cajal program and Armagan Elibol has been funded by the Generalitat de Catalunya under grant 2004FI-IQUC1/00130.
References
1. Zitová, B., Flusser, J.: Image registration methods: A survey. Image and Vision Computing 21(11), 977–1000 (2003)
2. Triggs, B., McLauchlan, P., Hartley, R., Fitzgibbon, A.: Bundle adjustment – A modern synthesis. In: Triggs, B., Zisserman, A., Szeliski, R. (eds.) ICCV-WS 1999. LNCS, vol. 1883, pp. 298–375. Springer, Heidelberg (2000)
3. Gracias, N., Zwaan, S., Bernardino, A., Santos-Victor, J.: Mosaic based navigation for autonomous underwater vehicles. IEEE Journal of Oceanic Engineering 28(3), 609–624 (2003)
4. Sawhney, H., Hsu, S., Kumar, R.: Robust video mosaicing through topology inference and local to global alignment. In: Burkhardt, H., Neumann, B. (eds.) ECCV 1998. LNCS, vol. 1407, pp. 103–119. Springer, Heidelberg (1998)
5. Lowe, D.: Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60(2), 91–110 (2004)
6. Yao, J., Chamb, W.K.: Robust multi-view feature matching from multiple unordered views. Pattern Recognition 40(11), 3081–3099 (2007)
7. Brown, M., Lowe, D.G.: Automatic panoramic image stitching using invariant features. International Journal of Computer Vision 74(1), 59–73 (2007)
8. Szeliski, R.: Image mosaicing for tele-reality applications. In: IEEE Workshop on Applications of Computer Vision, pp. 44–53 (1994)
9. Can, A., Stewart, C., Roysam, B.: Robust hierarchical algorithm for constructing a mosaic from images of the curved human retina. In: IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, p. 292 (1999)
10. Richmond, K., Rock, S.M.: An operational real-time large-scale visual mosaicking and navigation system. In: MTS/IEEE OCEANS Conference, Boston, USA (2006)
11. Garcia, R., Puig, J., Ridao, P., Cufí, X.: Augmented state Kalman filtering for AUV navigation. In: IEEE International Conference on Robotics and Automation, Washington D.C., vol. 3, pp. 4010–4015 (2002)
12. Caballero, F., Merino, L., Ferruz, J., Ollero, A.: Homography based Kalman filter for mosaic building. Applications to UAV position estimation. In: IEEE International Conference on Robotics and Automation, pp. 2004–2009 (2007)
13. Hartley, R., Zisserman, A.: Multiple View Geometry in Computer Vision, 2nd edn. Cambridge University Press, Cambridge (2004)
14. Haralick, R.: Propagating covariance in computer vision. In: Proceedings of the Theoretical Foundations of Computer Vision, TFCV on Performance Characterization in Computer Vision, Germany, pp. 95–114 (1998)
15. Grocholsky, B.: Information-Theoretic Control of Multiple Sensor Platforms. PhD thesis, University of Sydney (2002)
16. Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)
17. Gracias, N., Victor, J.: Underwater mosaicing and trajectory reconstruction using global alignment. In: MTS/IEEE OCEANS Conference, vol. IV, pp. 2557–2563 (2001)