Robust Weighted Graph Transformation Matching for Rigid and Nonrigid Image Registration

Mohammad Izadi and Parvaneh Saeedi, Member, IEEE

Abstract—This paper presents an automatic point matching algorithm for establishing accurate match correspondences in two or more images. The proposed algorithm utilizes a group of feature points to explore their geometrical relationship in a graph arrangement. The algorithm starts with a set of matches (including outliers) between the two images. A set of nondirectional graphs is then generated for each feature and its K nearest matches (chosen from the initial set). Using the angular distances between edges that connect a feature point to its K nearest neighbors in the graph, the algorithm finds a graph in the second image that is similar to the first graph. If a graph includes outliers, the algorithm removes them (one by one, according to their strength) from the graph and re-evaluates the angles until the two graphs are matched or discarded. This is a simple, intuitive, and robust algorithm inspired by a previous work. Experimental results demonstrate the superior performance of this algorithm under various conditions, such as rigid and nonrigid transformations, ambiguity due to partial occlusions or match correspondence multiplicity, scale change, and larger view variation.

Index Terms—Feature-point matching, graph-based algorithm, image registration, outlier detection, structural similarity.

Manuscript received October 14, 2011; revised April 2, 2012; accepted May 23, 2012. Date of publication July 16, 2012; date of current version September 13, 2012. This work was supported by NSERC Canada and MacDonald Dettwiler and Associates Ltd. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Xilin Chen. The authors are with the Laboratory for Robotic Vision, School of Engineering Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada (e-mail: [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TIP.2012.2208980

I. INTRODUCTION

Image registration is the process of estimating an optimal transformation between two images that relates positions in one image to their corresponding positions in another image. The image registration procedure includes four basic steps: feature detection, feature matching, mapping function estimation, and image transformation and re-sampling [1]. Among these four steps, the most challenging one is feature matching. Image features can be points, lines, curves, or surface parameterizations. Feature matching is the procedure of establishing match correspondences between features in two images of the same scene/object taken at different times and under different conditions, including point-of-view variation, different sensor modalities, and rigid or non-rigid shape changes. In addition, there are always some features with no correspondences (e.g., due to changes in view, shape, or occlusions). Such features further complicate the matching problem. Robust and accurate image registration is a challenging problem, and its accuracy directly affects the registration outcomes and the reliability of its dependent applications. Image registration has numerous applications in areas such as remote sensing (image mosaicing, map generation, change detection, correction of geometric distortion), medical image processing (change detection, pre-operative modeling, measuring response to treatment), and machine vision (simultaneous localization and mapping, trajectory tracking, 3D scene reconstruction).

In this paper, we propose a simple and intuitive method for fast matching of image features by incorporating structural relationships between image features (points) and their neighboring features. The algorithm is built upon the previous work in [2] and is tested on various image types. The performance of the proposed work is compared against [2] and [3].

A. Related Work

Automatic image point matching algorithms are classified into two groups [4]: area-based and feature-based.

1) Area-based methods use image region content to match points between two images. These methods compare similarities of image regions using measures such as correlation (e.g., normalized/non-normalized mean-squared difference/cross-correlation) [5], mutual information [6], and the Euclidean distance of normalized (zero-mean, unit-variance) regions. The main advantage of such methods is their high registration accuracy (up to a fraction of a pixel). Besides being computationally expensive, area-based methods are sensitive to intensity variation, noise level, imaging conditions, and sensor modalities. Moreover, the two images should be at least statistically dependent, and large scene variations can cause failure in the matching process. Area-based methods are more suitable for images with less distinctive information.

2) Feature-based methods utilize distinctive features to establish match correspondences or estimate the registration parameters between two or more images. The most commonly used features include salient points, lines, curves, edges, line intersections, and image regions around each feature. The main advantage of feature-based methods is their capability to handle larger and/or more complex misalignments or variations between scenes. Because they use only a limited number of pixels associated with each feature, these methods are usually faster than area-based methods. They are also capable of registering images of different nature.

Due to the higher efficiency of feature-based methods, they are more commonly adopted by machine learning and vision-based systems.

Numerous previous works have been proposed for reliable and robust point matching. Most of these methods use local image information (such as color, texture, edge density, pixel intensity values, etc.) around each feature point to assess or measure the similarity between correspondences. The main issue with these algorithms is their high sensitivity to the local characteristics of each feature. For instance, matching image feature points using correlation of image templates can easily fail under large view variation, where the observed local background of each feature changes dramatically.

Many of the more recent approaches rely on the relationship between a set of feature points to establish the matching. For instance, Chui et al. [7] introduced a feature-based method named TPS-RPM (thin-plate spline robust point matching) for non-rigid registration. The algorithm utilized Softassign [8], deterministic annealing, and the thin-plate spline [9] for spatial mapping and outlier rejection, and it solved for both the correspondence matching and the projection transformation parameters. Navy et al. [10] proposed a method for matching two clusters of points extracted from satellite images. They estimated the parameters of a homography transformation by minimizing a cost function. The homography transformation enabled projection of any point from the first image onto its correspondence in the second image. Xiong et al. [11] proposed an algorithm for point matching in high-resolution satellite images using super points (Harris corners with a control network for each). The correspondences were established using the minimum distance in position and angle in the control network, and the best correspondences were selected through thresholding.

Recently, the concept of a graph has been utilized for establishing a higher-level geometrical or spatial relationship between feature points in the context of image registration. Here, the main idea is that, regardless of the transformation between the two images, the spatial relationship between image feature points (if not for all, at least for some) is maintained. For instance, Robles-Kelly and Hancock [12] reported a graph matching method based on matching string representations. They defined a serial order of a graph's nodes using the steady-state random walk. In the steady state, the probability of visiting a node is related to the leading eigenvector of the adjacency matrix. They solved the graph matching problem by minimizing the edit distance. They showed results (the fraction of correct correspondences versus the number of nodes deleted) for the House image sequence dataset [13]. Zheng and Doermann [14], [15] also proposed a graph-based approach for the point matching problem. They established graphs between neighboring points using a minimum Euclidean distance constraint. They used a relaxation labeling method to find the optimal matching between two graphs by maximizing the number of matched edges. Although, according to the problem definition, the proposed method was general and could cover both rigid and non-rigid matching problems, the complexity of the search space was very high. It was also sensitive to matching initialization and could only converge to locally optimal solutions. Hong et al. [16] proposed a graph-cut-based sparse stereo matching method in which multi-scale feature points were extracted and used to construct a weighted sparse graph using graph cut theory.
To find the matches, a graph labeling problem was solved by optimizing an energy function. This method utilized the epipolar geometry of the stereo images and was therefore more suitable for stereo matching. Aguilar et al. [2] proposed a robust point matching method named Graph Transformation Matching (GTM). They created an initial one-to-one matching between points of the two images based on correlation. Then a K-nearest-neighbor graph was constructed for each image. Finally, vertices that introduced structural dissimilarity between the two graphs were iteratively eliminated. The remaining graphs represented the corresponding matches between the two images. Liu and An [17] proposed a graph-based point matching method using the spatial order of adjacent points instead of only the adjacency relationship used by the GTM. Under the assumption of a rigid transformation, the similarity of two candidate points was compared using the local spatial order of the neighboring points through a cyclic string matching method. While limited results (using three image pairs) showed better performance for this method compared to both RANSAC and GTM, the authors failed to provide any quantitative comparison of the results on a larger number of images with different conditions, including scale variation or deformation. Sanroma et al. [18] proposed a graph matching method using the Expectation Maximization (EM) algorithm. They utilized a mixture model to evaluate the geometrical arrangement of nodes and their structural relationships (consistency). They solved the graph matching problem using EM first and then approximated the solution using Softassign. They evaluated the performance of their method using five image pairs (in each pair the second image was transformed manually) and compared those results against methods such as RANSAC and GTM.

One of the main difficulties of most of the above-mentioned methods is correct matching under large transformations between the images. In this case, the local visual properties around each feature can vary significantly, and therefore establishing correct match correspondences becomes very hard, if not impossible. We experienced such a problem in our research on 3D reconstruction of scenes from multiple oblique views. The camera yaw angles between two views of a scene can vary by up to 180 degrees, with an additional 45 ± 10 degrees of variation in the pitch angles. Under such conditions, even the most robust methods fail to find a high number of correct matches between the two images of the scene. The GTM algorithm provides good matching results for images with small view variation, but it failed to establish match correspondences in our images. The work proposed in this paper is an improved version of the method proposed in [2] (Graph Transformation Matching, or GTM). The main contributions of this work are as follows:

1) Introducing a very robust and accurate approach to identify correct feature point matches in two or more images under a number of conditions including translation, rotation, scale variation, deformation, partial occlusion, and multiplicity of correspondences.

2) Providing a dynamic scheme that establishes geometrical (angular distance) relationships between features that most likely stay unchanged as acquisition parameters change between images.

II. PROPOSED METHOD

Since the proposed approach is inspired by the Graph Transformation Matching (GTM) algorithm [2], we first describe this algorithm and discuss some of its issues. The principle of the GTM approach is to enforce coherent adjacency relationships of match correspondences between the two images. This is done through an iterative approach in which correspondences that disrupt a predefined neighborhood relationship are eliminated. According to [2], GTM's matching quality is superior to that of RANSAC [3] and Softassign [8].

A. Graph Transformation Matching (GTM)

GTM assumes two sets of match correspondences (including a number of outliers) between the two images: P = {p_i} and P' = {p'_i} of size N, where p_i matches p'_i. It creates a median K-nearest-neighbor (K-NN) graph G_P = (V_P, E_P) using the following definition:

1) Define a vertex v_i for each match correspondence p_i, such that V_P = {v_1, ..., v_N}. A non-directed edge (i, j) exists when p_j is one of the K closest neighbors of p_i and also ||p_i - p_j|| ≤ η. Here η is defined by

$$\eta = \operatorname*{median}_{(l,m)\,\in\, V_P\times V_P} \left\| p_l - p_m \right\| . \tag{1}$$

Through the above condition, two constraints are imposed: 1) a vertex only validates the structures of its closest neighbors, and 2) a filtering scheme is applied in which outliers that cause structural deformations are removed. Therefore, two K-NN graphs are generated for the two sets, G_P and G_{P'}. G_P has an N×N adjacency matrix A_P, where A_P(i, j) = 1 if (i, j) ∈ E_P and 0 otherwise. Similarly, the graph G_{P'} has the adjacency matrix A_{P'}. If all correspondences were correct, the two graphs would be isomorphic. Since this is not the case in general, structural differences exist between the two graphs.

2) Iterate through the following steps:

a) Select an outlier (j_out) from the matching set. The outlier is selected according to the structural disparity, which is approximated by computing the residual adjacency matrix R = |A_P − A_{P'}| and selecting the j_out that yields the maximal number of different edges in the two graphs:

$$j^{\mathrm{out}} = \operatorname*{argmax}_{j=1\ldots N} \; \sum_{i=1}^{N} R(i, j). \tag{2}$$

b) Once an outlier is identified, the outlier and its correspondence (along with all references to these two vertices) are removed, and N is decreased by one.

Fig. 1. Wrong matches with similar isomorphic graphs.

c) Re-generate both K-NN graphs and continue with the next iteration. The algorithm stops when it reaches the null residual matrix (R(i, j) = 0, ∀ i, j).

In the above algorithm, two constraints are enforced: 1) since the K-NN graphs G_P and G_{P'} are non-directed, A_P and A_{P'} are symmetric; and 2) any matching point with fewer than K neighbors is eliminated from the process.

There are two main assumptions in the GTM algorithm: 1) if all the correspondences are correct, graphs G_P and G_{P'} will be isomorphic; and 2) the transformation between the images is reasonably smooth.

The first assumption can, however, be satisfied even when the matching is wrong; for example, sometimes two falsely matched points have the same neighbors. As shown in Figure 1, one of the matches (shown with a blue line in the left image and a red dot in the right image) is an outlier, but the two created graphs are still isomorphic. Therefore, GTM could fail to remove all outliers. Also, if the transformation between the two images is not reasonably smooth and the input images include some distortions due to deformation, camera motion, large scene coverage, or the use of a wide lens, the neighbors of two truly matching points might not be at the same relative distances in the two images, and consequently the K-NN graphs of the two will not be the same. Here, we propose a new method to overcome these potential limitations.

B. Weighted Graph Transformation Matching (WGTM)

Instead of relying only on the adjacency relationship between neighboring features, the proposed method utilizes the angular distances between the edges connecting a feature to its neighbors. Using angular distance rather than linear distance for matching graphs of point correspondences is advantageous because the angular distance is invariant with respect to scale and rotation. The angular distance is also less sensitive to noise originating from lens distortion. It is important to note that under partial deformation or a large perspective transformation, some of the points go through dramatic changes and therefore do not retain the invariance property with respect to rotation or scale. This, however, is not the case for all points in the image. The proposed algorithm therefore relies on the points that do maintain invariance with respect to the image transformation, while removing points with high error or points that are variant with respect to that particular transformation.
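The invariance claim above can be checked numerically. The short sketch below (illustrative only; the points, rotation angle, scale factor, and translation are made up) builds a feature point with five neighbors, applies a similarity transform, and confirms that the angular differences between neighboring edges are preserved while the edge lengths are not:

```python
import numpy as np

rng = np.random.default_rng(0)
pts = rng.random((6, 2))                      # one feature point plus five neighbours (hypothetical)
theta, scale = np.deg2rad(40.0), 1.7          # an arbitrary rotation and scale change
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
pts2 = scale * pts @ R.T + np.array([3.0, -1.0])   # similarity-transformed copy

def neighbour_angles(p):
    """Angles of the edges joining the first point to its neighbours."""
    v = p[1:] - p[0]
    return np.arctan2(v[:, 1], v[:, 0])

def neighbour_lengths(p):
    return np.linalg.norm(p[1:] - p[0], axis=1)

# Angular differences between consecutive edges are preserved (up to numerical precision) ...
print(np.diff(neighbour_angles(pts)) % (2 * np.pi))
print(np.diff(neighbour_angles(pts2)) % (2 * np.pi))
# ... while the edge lengths are all scaled by the zoom factor 1.7
print(neighbour_lengths(pts2) / neighbour_lengths(pts))
```

Under a similarity transform, every edge vector is rotated by the same angle and scaled by the same factor, so pairwise angular differences cancel the rotation, whereas Euclidean distances change with the zoom.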

For the above reasons, in this work the matching between corresponding features in two images proceeds only if the angles between each feature and its neighboring features provide sufficient support for that matching. The following steps describe the proposed algorithm.

1) We assume there are two sets of N feature points with an initial one-to-one relationship (P = {p_i} and P' = {p'_i}). In our method, two median K-NN graphs G_P = (V_P, E_P) (with adjacency matrix A_P) and G_{P'} = (V_{P'}, E_{P'}) (with adjacency matrix A_{P'}) are generated for the sets P and P', respectively. We define a vertex v_i for each match correspondence p_i, such that V_P = {v_1, ..., v_N}. A directed edge (i, j) exists when p_j is one of the K closest neighbors of p_i and also ||p_i − p_j|| ≤ η. Here η is defined by

$$\eta = \operatorname*{median}_{(l,m)\,\in\, V_P\times V_P} \left\| p_l - p_m \right\| . \tag{3}$$

Therefore, two K-NN graphs are generated for the two sets, G_P and G_{P'}. G_P has an N×N adjacency matrix A_P, where A_P(i, j) = 1 if (i, j) ∈ E_P and 0 otherwise. Similarly, the graph G_{P'} has the adjacency matrix A_{P'}. There are two differences in creating these matrices compared to those generated by the GTM:

a) First, the two graphs G_P and G_{P'} are directed graphs, and therefore the matrices A_P and A_{P'} may not be symmetric. This simply means that, in the constructed K-NN graphs, having a vertex v_j as one of the K neighbors of vertex v_i does not imply that v_i is also one of the K neighbors of vertex v_j; v_i is one of the K neighbors of v_j only if it satisfies the condition ||p_i − p_j|| ≤ η.

b) Second, points with fewer than K neighbors (as long as they have at least two neighbors) are not eliminated.

2) Find all vertices of G_P with at most one edge (connection) to other vertices ({v_i | Σ_{∀j,(i,j)∈E_P} A_P(i, j) ≤ 1}), and remove them and their correspondences from G_P and G_{P'}. Recompute both G_P and G_{P'}. Repeat this step until all vertices of G_P have a minimum of two edges.

3) A weight matrix W is generated for each point p_i using graph G_P. For the edge that connects vertex v_i to v_m, compute a weight value using the following equation:

$$W(i, m) = \left| \operatorname{Acos}\!\left( \frac{(p_m - p_i)\cdot\big((p'_m - p'_i)\,\mathrm{Rot}(\theta(k_{\min}, i))\big)}{\|p_m - p_i\|\;\|p'_m - p'_i\|} \right) \right| \tag{4}$$

where

$$\mathrm{Rot}(\theta(k_{\min}, i)) = \begin{bmatrix} \cos(\theta(k_{\min}, i)) & \sin(\theta(k_{\min}, i)) \\ -\sin(\theta(k_{\min}, i)) & \cos(\theta(k_{\min}, i)) \end{bmatrix}. \tag{5}$$

Here, k_min selects the optimal rotation angle between the pair of matches (see (6) below); p_i and p_m are the 2D vectors of image coordinates for vertices v_i and v_m, and Acos is the inverse cosine function. The optimal rotation angle is defined as the angle that minimizes the sum of angular distances between the edge vectors of v_i and v'_i:

$$k_{\min} = \operatorname*{argmin}_{\forall k,\,(i,k)\in E_P} \; \sum_{\forall j,\,(i,j)\in E_P} \left| \operatorname{Acos}\!\left( \frac{(p_j - p_i)\cdot\big((p'_j - p'_i)\,\mathrm{Rot}(\theta(k, i))\big)}{\|p_j - p_i\|\;\|p'_j - p'_i\|} \right) \right| \tag{6}$$

and

$$\theta(k, i) = \operatorname{Atan}_{-\pi,\pi}(p_k - p_i) - \operatorname{Atan}_{-\pi,\pi}(p'_k - p'_i) \tag{7}$$

where

$$\operatorname{Atan}_{-\pi,\pi}(v) = \begin{cases} \operatorname{Atan}(v_y / v_x), & v_x \ge 0 \\ \operatorname{Atan}(v_y / v_x) + \pi, & v_y \ge 0,\; v_x < 0 \\ \operatorname{Atan}(v_y / v_x) - \pi, & v_y < 0,\; v_x < 0. \end{cases} \tag{8}$$

In the above equation, v is a 2D image vector with x and y coordinates v_x and v_y. At this point, the weight matrix W is filled with the angular distances between the vectors of corresponding matched vertices and all of their connected vertices. All elements of W that have not been set to any value are set to null.

4) For each vertex v_i, find the percentage of edges connected to v_i whose correspondences are also connected to v'_i. If this percentage is smaller than 50%, the weight values of the differing edges in the weight matrix are replaced by π. In other words:

$$\forall i, j: \; W(i, j) = \pi, \quad \text{if} \;\; \frac{\sum_{\forall k,\,(i,k)\in E_P} A_{P'}(i, k)}{\sum_{\forall k,\,(i,k)\in E_P} A_P(i, k)} < 0.5. \tag{9}$$

5) For each vertex v_i of G_P, compute the mean of the weights of all edges connected to it:

$$w(i) = \operatorname*{mean}_{\forall j,\,(i,j)\in E_P} \big( W(i, j) \big). \tag{10}$$

Remove the vertex corresponding to the maximum value of w, together with its corresponding vertex, from P and P'.

6) Find the maximum element of W,

$$w_{\max} = \max_{\forall i, j} \big( W(i, j) \big), \tag{11}$$

and set

$$\mu_{\mathrm{new}} = \operatorname*{mean}_{\forall i} \big( w(i) \big). \tag{12}$$

If w_max < π and |μ_new − μ_old| < ε, stop; otherwise, set μ_old = μ_new and proceed to the next iteration. The value of ε is set to 0.001 (found empirically) for all the results presented in this work. The value of μ_old is initially set to 2π. At every iteration, one outlier is identified and eliminated from the list of corresponding matches.

Such elimination reduces the value of μ_new. Once all outliers are removed, the algorithm identifies the worst inlier as an outlier. Removing an inlier changes the value of μ_new only by a small amount; once the first inlier is removed, the difference between μ_new and μ_old is small and the algorithm stops. The error value computed in each iteration allows treating both highly distorted and less distorted images properly and without additional constraints. Figure 2 shows the algorithm presented above.

Fig. 2. Weighted graph transformation matching algorithm.
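As a concrete illustration of steps 1 and 3, the sketch below implements the median-threshold K-NN graph (Eq. (3)) and the angular weight row of a single vertex (Eqs. (4)-(8)) in Python/NumPy. It is a minimal reading of the equations above, not the authors' C++ implementation; the function and variable names are ours, and the outer loop (steps 4-6, the 50% consistency rule, and the stopping test) is only indicated in the trailing comment.

```python
import numpy as np

def median_knn_graph(pts, k):
    """Directed median K-NN graph (Eq. 3): edge i -> j iff j is among the K
    nearest neighbours of i and ||p_i - p_j|| <= eta (median pairwise distance)."""
    n = len(pts)
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    eta = np.median(d[np.triu_indices(n, k=1)])
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        order = np.argsort(d[i])[1:]                      # nearest first, skip i itself
        neigh = [j for j in order if d[i, j] <= eta][:k]  # at most K, all within eta
        adj[i, neigh] = True
    return adj

def atan_pm_pi(v):
    """Signed angle of a 2-D vector in (-pi, pi], i.e. the Atan of Eq. (8)."""
    return np.arctan2(v[1], v[0])

def weight_row(i, adj, P, Pp):
    """Row i of the weight matrix W (Eqs. 4-7). P and Pp are the matched point sets."""
    neigh = np.flatnonzero(adj[i])

    def costs(theta):
        # angular distance of every neighbour after rotating the primed edge by theta
        c, s = np.cos(theta), np.sin(theta)
        rot = np.array([[c, s], [-s, c]])                 # Rot(theta), Eq. (5)
        out = {}
        for j in neigh:
            a = P[j] - P[i]
            b = (Pp[j] - Pp[i]) @ rot
            cosang = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
            out[j] = abs(np.arccos(np.clip(cosang, -1.0, 1.0)))
        return out

    # Eq. (7): candidate rotation angle defined by each neighbour k
    theta = {k: atan_pm_pi(P[k] - P[i]) - atan_pm_pi(Pp[k] - Pp[i]) for k in neigh}
    # Eq. (6): keep the neighbour whose rotation minimises the summed angular cost
    k_min = min(neigh, key=lambda k: sum(costs(theta[k]).values()))
    row = np.full(len(P), np.nan)                         # "null" entries stay NaN
    for j, wij in costs(theta[k_min]).items():            # Eq. (4)
        row[j] = wij
    return row

# Outer loop (not shown): build both graphs, apply the 50% rule (Eq. 9), take the
# row means (Eq. 10), remove the match with the largest mean weight, and repeat
# until w_max < pi and |mu_new - mu_old| < epsilon (step 6).
```

One iteration of the full algorithm would call weight_row for every vertex, take the row means (Eq. (10)), and delete the match with the largest mean before refreshing the graphs.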

C. Algorithm Evaluation

In this part, we evaluate the performance of the WGTM and compare its accuracy (under identical conditions) with the GTM [2] and RANSAC [3]. Since both the proposed algorithm and the GTM algorithm use K-NN graphs, the performance is evaluated for two values of K, K = 5 and K = 20. The results in this section are divided into three groups. The first group contains the results for the image dataset used in the GTM paper; it compares the accuracy of the three methods as outliers are introduced. The second group represents the performance of the proposed algorithm on the House sequence [13]; this set is used by many previous works. The third group demonstrates the performance of the proposed method for aerial Pictometry oblique-angle images. The significance of these results lies in the fact that the capturing cameras have large yaw angle differences, with an additional pitch variation that can result in substantial local background variation around feature points.

1) Test Results for the GTM Image Set: In this section, we show the performance of the algorithm on the set of images used by the GTM algorithm. To ensure that the results obtained in this work are under the same conditions as presented for the GTM algorithm, we recreated the same conditions and estimated the same measures. In these images, corner features are first extracted [19] and matched. All the outliers are removed manually. Among all remaining inliers, 60 correct matches are chosen randomly. Outliers are then added by matching arbitrary points between the two images. The percentage of additive outliers is changed from 5% to 95% in increments of 10%. The algorithms are then executed 100 times for each case, and each time the added random error is regenerated. The mean results over all these runs are then reported. Four measures are used for each set of results:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FN + FP} \tag{13}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{14}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{15}$$

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{16}$$

Here, the Accuracy measures the degree of closeness of an identified match to the actual (true) match, and the Precision (also called reproducibility or repeatability) represents the degree to which repeated matching under unchanged conditions shows the same results. The Recall (also called sensitivity) measures the proportion of actual matches that are correctly identified as such, and the Specificity computes the proportion of wrong matches that are correctly identified.

2) Camera Movement: This set corresponds to combinations of camera movements, including translation, rotation, and scale change. It includes twenty image pairs. For each image pair, the percentage of outliers is varied from 5% to 95%. The number of outliers is computed by

$$\text{no. of outliers} = \frac{\text{outlier percent} \times \text{no. of inliers}}{100 - \text{outlier percent}}. \tag{17}$$
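For reference, Eqs. (13)-(17) amount to the small helper below (an illustrative sketch; the example counts at the bottom are hypothetical, not values from the paper's experiments):

```python
def matching_scores(tp, fp, fn, tn):
    """Accuracy, precision, recall (sensitivity) and specificity, Eqs. (13)-(16)."""
    return {
        "accuracy":    (tp + tn) / (tp + tn + fn + fp),
        "precision":   tp / (tp + fp),
        "recall":      tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

def num_outliers(outlier_percent, num_inliers):
    """Number of outliers to add for a target outlier percentage, Eq. (17)."""
    return round(outlier_percent * num_inliers / (100.0 - outlier_percent))

# 60 inliers at a 75% outlier rate require 180 added outliers.
print(num_outliers(75, 60))
print(matching_scores(tp=55, fp=3, fn=5, tn=177))
```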

Figures 3(a) to 3(d) show the mean Accuracy, Precision, Recall, and Specificity of the three algorithms for this dataset. As shown, RANSAC generally performs better than the GTM algorithm, except for the specificity values, which are slightly lower than those of the GTM (about 3% on average). The GTM algorithm seems to perform better for K = 5 than for K = 20. This can probably be explained by the fact that if an outlier is identified, it (along with its correspondence and all references to these two vertices) is removed from the process, and any matching point with fewer than K neighbors is also eliminated from the matching process.

Fig. 3. Comparison of RANSAC, GTM, and WGTM methods for camera movements (translation-rotation-scale variation). (a) Accuracy plots: trans+rot+zoom. (b) Precision plots: trans+rot+zoom. (c) Recall plots: trans+rot+zoom. (d) Specificity plots: trans+rot+zoom.

Therefore, a high value of K does not seem to favor the GTM algorithm. With higher K values, the WGTM algorithm performs better, because it enforces a tighter bound on the network of neighboring points where possible. When a network of points contains outliers, those outliers are removed, but the graph is not dismissed completely; it is compared at a lower order of K (with a minimum of K_min = 2). Figure 4 shows one of the test images, which includes rotation, translation, and scale change. Here, yellow lines show correct match correspondences, blue lines depict inliers falsely identified by the programs, and cyan lines show the inliers missed by the programs. In this example, both the GTM and RANSAC algorithms produce false inliers as well as missed inliers. WGTM has outperformed both GTM and RANSAC.

3) Correspondence Ambiguity: In this section, the performance of the algorithm is tested under ambiguous conditions. There are several sources of image ambiguity, including partial occlusion and multiple match correspondences due to duplication of some of the matching features. Here, the test set includes twenty image pairs with partial occlusions and multiple matches associated with one feature point. The feature points are found using the Harris corner detector and are matched against each other manually (ground truth). A number of outliers are added to the system. The overall percentage of outliers in the total number of matched points varies from 5% to 95%.

Fig. 4. Examples of matching results for camera movement with 75% outliers in the initial matches. (a) RANSAC+Epipolar Geometry. (b) GTM. (c) WGTM.

Figure 5 shows the Accuracy, Precision, Recall, and Specificity for this set of tests. In these graphs, each value represents the average over 1000 runs of the system; in each run, the added outliers are chosen randomly. The WGTM algorithm outperforms both the GTM and RANSAC algorithms. Figure 6 shows a visual example (chosen from the above set) with lines projecting the matching points from one image to the other. In this figure, yellow lines show correct match correspondences, blue lines represent falsely identified inliers (by the program), and cyan lines show the inliers that the algorithm has missed.

Fig. 5. Performance comparison for RANSAC, GTM, and WGTM methods under correspondence ambiguities. (a) Accuracy plots. (b) Precision plots. (c) Recall plots. (d) Specificity plots.

Fig. 6. Examples of matching results for an ambiguous matching condition with 75% added outliers. (a) RANSAC+Epipolar Geometry. (b) GTM. (c) WGTM.

4) Deformation: In this section, the performance of the algorithm is tested on deformable objects, including cloth, paper, and plastic wrappers. Once again, the feature points are found using the Harris corner detector and are matched against each other manually. A number of outliers are added to the system. The overall percentage of outliers in the total number of matched points varies from 5% to 95%. Figure 7 shows the Accuracy, Precision, Recall, and Specificity for this set. Figure 8 presents two sample test images with a visual presentation of the matches between the two.

5) Test Results for the House Image Sequence: In this section, we show the performance of our algorithm on the House dataset [13], which is commonly used to assess the performance of many existing point matching algorithms.

This set includes 111 images of a toy house taken from different viewpoints while the capturing camera moved through space; the scale in this set stays almost unchanged. In each image, we have manually marked 60 identical feature points as the ground truth. These ground truth points are instrumental in assessing the performance of the proposed algorithm. For this set of data, two groups of experiments are carried out. The first test computes the matching error rate of the proposed algorithm. For this test, the 60 manually found ground truth points were matched between two frames using the Mutual Information algorithm [20]. The matching was performed between the first frame (House.seq0.png) and each of the 110 remaining frames in this image dataset. Then, the proposed algorithm (with K = 20) was applied to the initial matched points. The error rate is computed as 1 minus the ratio of the number of correct matches retained by the WGTM to the number of correct matches in the initial set. Figure 9 presents the results for this test. Figure 10 shows three typical matching results for the WGTM, GTM, and RANSAC algorithms for a pair of images with a 180-degree viewing difference. In the second test, random outliers are added at rates of 15%, 35%, 55%, 75%, and 95%. The performance of the proposed algorithm is then verified by comparing the matches found by the proposed algorithm against the existing ground truth.

Fig. 7. Performance comparison for RANSAC, GTM, and WGTM methods under deformation. (a) Accuracy plots. (b) Precision plots. (c) Recall plots. (d) Specificity plots.

Fig. 8. Comparison of the methods for deformed objects. (a) RANSAC+Epipolar Geometry. (b) GTM. (c) WGTM.

In each case, the algorithm is executed 50 times, and each time the random outliers are regenerated. The average results over the 50 runs are then reported. We show the performance for two values of K, K = 5 and K = 20. The same mechanism is applied to the GTM algorithm, and its performance is compared with the proposed algorithm. For the GTM algorithm, we only show the results for K = 5, since a higher value of K causes the GTM to perform worse. To reduce the amount of data, the mean Precision and Recall are computed for four clusters of the images in the dataset. These four clusters include frames 1 to 28, 29 to 56, 57 to 84, and 85 to 111. Figure 11(a) to (d) presents the Precision rates for these four groups of frames. For both sets of tests in this section, ε is set to 0.01.

Fig. 9. WGTM error rate for the House sequence image set.

6) Test Results for Pictometry Images: In this section, we present the application of the proposed point matching algorithm to oblique-view images from the Pictometry dataset. The parent application in this case was developed for the reconstruction of 3D building models using aerial oblique Pictometry images. Pictometry imagery includes multiple overlapping images of city regions acquired at different times and from different viewing directions. A set of images of a particular scene with three different viewings is shown in Figure 12.

As can be seen, for instance, the local image information around corresponding corners of the scene differs across the three views. Establishing accurate epipolar geometry in these scenes is of essential importance for accurate 3D reconstruction of the building model. To create the epipolar geometry, one option is to use the camera and acquisition parameters that accompany each image in the form of metadata. Using these parameters by themselves, however, leads to inconsistent and inaccurate epipolar geometry, as they include a considerable amount of uncertainty due to imprecise values. Another way of establishing the epipolar geometry is to use the method of [21]. In this method, however, we need between 8 and 12 matching points across the two images. The accuracy and uniform distribution of these points are key elements in the accuracy of the estimated epipolar geometry and the fundamental matrix. Tables I and II present the results of the WGTM along with the GTM and RANSAC+Epipolar Geometry for 15 sets of Pictometry image pairs. In these tables, the column Image Pair represents the id of the image pair as well as the yaw angle between the viewing cameras. The Pictometry image acquisition platform includes five cameras: one nadir-looking and four with slant viewing, positioned with 90 degrees (in yaw) of separation (0°, 90°, 180°, and 270°). Therefore, two images of one scene may differ by up to 180 degrees in yaw. The heading IC in these tables represents the Initial Correspondences, i.e., the total number of initially matched points. CIC (Correct Initial Correspondences) represents the number of correct matches, and IIC stands for the number of Incorrect Initial Correspondences. IIC in percentage represents the ratio IIC/IC in percent.

As shown in Table I (K = 5), WGTM leads to a higher number of correct matches (TP) and a smaller number of missed correct matches (FN) compared to the GTM. WGTM performs better than both GTM and RANSAC for the 0° and 180° views. In the 90° case, WGTM performs better than the GTM, but RANSAC outperforms WGTM in four cases (rows 6, 7, 9, and 10). At K equal to 20, WGTM once again beats the GTM algorithm; however, RANSAC marginally outperforms WGTM in two cases (rows 8 and 10). Overall, the average percentage of correct matches found by WGTM (82.53% for K = 5 and 90.86% for K = 20) is higher than that found by the GTM (45.5% for K = 5 and 27.7% for K = 20) and RANSAC (80.3%). K = 20 leads to higher performance, as it enforces a stronger geometrical relationship (where possible) by incorporating a higher number of points.

Fig. 10. Results of WGTM, GTM, and RANSAC for Pictometry oblique images with 180° viewing difference. (a) WGTM (K = 20), TP = 117, FP = 0, FN = 2, TN = 100. (b) GTM (K = 20), TP = 21, FP = 0, FN = 98, TN = 100. (c) RANSAC+EG, TP = 90, FP = 0, FN = 29, TN = 100.

D. Sensitivity to the Stopping Condition Threshold (ε)

The proposed algorithm relies on a stopping threshold, ε, which is used to terminate the iterative process of outlier rejection. Generally, the value of ε depends on the complexity of the transformation between the two images whose points are matched, as well as the level of accuracy required by the system. The following experimental results and discussion provide a brief analysis of the sensitivity of the algorithm to ε and the proper setting of this parameter with respect to the transformation between the two images. For the experimental testing, the House sequence [13] was utilized. This set includes 111 images of a toy house taken from different viewpoints as the capturing camera moves through space; the scale in this set stays almost unchanged. The transformation across the set increases in complexity as the frame number increases. In this set of tests, 30 identical correspondences are first identified manually (as the ground truth) across the entire set of images. Next, 30 random outliers are added to the manually found ground truth, the algorithm is executed for different values of ε, and the final result in each case is inspected by computing the Recall and Precision values. In these experiments, the value of K is set to 20. Figure 13 displays these two values for different frames at different settings of ε, and Figure 14 shows the mean Precision and Recall values at different settings of ε. From these results, it can be concluded that there is always a trade-off between Recall and Precision as a function of ε: while smaller values of ε result in higher Precision, they lead to lower Recall, and vice versa. Therefore, one may set the value of ε according to the importance of these two measures in a specific application. For instance, if a system cannot cope with outliers but can function with a small number of inliers, it is better to set ε to a very small value. In this work, for all the results presented in Section II-C.1 (original GTM images) and Section II-C.3 (Pictometry images), ε was set to 0.001. For the House image sequence (Section II-C.2), however, the value of ε was set to 0.01.

Fig. 11. Performance comparison for GTM and WGTM methods for House image dataset. (a) Precision plots (frames 1–28). (b) Recall plots (frames 1–28). (c) Precision plots (frames 29–56). (d) Recall plots (frames 29–56). (e) Precision plots (frames 57–84). (f) Recall plots (frames 57–84). (g) Precision plots (frames 85–110). (h) Recall plots (frames 85–110).

E. Sensitivity With Respect to the Number of Inliers

The proposed algorithm explores the geometrical relationships between networks of points to identify and separate matching inliers from outliers. The maximum number of points in each network (K) is a parameter of the algorithm that controls its accuracy and time complexity. In this section, we present a brief discussion of the effect of the number of inliers (regardless of the number of outliers) on the accuracy of the proposed method. We compare this dependency for the proposed, GTM, and RANSAC algorithms. The original work in [2] used 60 inliers with a maximum of 1140 added outliers (at 95%). It was noticed that if the number of inliers is decreased (regardless of the percentage of added outliers), the matching quality of this algorithm drops significantly. Figure 15 shows the comparison of the Precision graphs for GTM, WGTM, and RANSAC under three typical conditions (the camera movement, ambiguity, and deformable objects image sets). In these plots, the number of inliers is increased from 10 to 60, and in each case the percentage of added outliers varies from 15% to 95%.

Fig. 12. Three typical views of a sample Pictometry scene. (a) 0° view. (b) 90° view. (c) 180° view.

As seen, when the number of inliers is smaller than 50, the accuracy is substantially lower than when there are more than 50 inliers. This behavior can be explained by the fact that when there are fewer inliers in the two sets of initial matches, each network of points has a weaker structure, as many of the points in that network are outliers. While the algorithm runs until a predefined accuracy is achieved, many of the initial networks of feature points are doomed to fail. The remaining networks are also reduced in the number of points, as there are physically not many close inliers in the neighborhood. Ultimately, a smaller number of networks with fewer points in each network causes the accuracy of both algorithms to drop.

TABLE I
RESULTS FOR K = 5

| No. | Image pair | IC | CIC | IIC | IIC % | WGTM, K=5 (TP/FP/FN/TN) | GTM, K=5 (TP/FP/FN/TN) | RANSAC+EG (TP/FP/FN/TN) |
| 1 | 0°-1 | 100 | 67 | 33 | 33 | 64/0/3/33 | 45/0/22/33 | 60/0/7/33 |
| 2 | 0°-2 | 100 | 36 | 64 | 64 | 34/0/2/64 | 30/1/6/63 | 32/2/4/62 |
| 3 | 0°-3 | 100 | 69 | 31 | 31 | 66/1/3/30 | 39/0/30/31 | 60/0/9/31 |
| 4 | 0°-4 | 100 | 62 | 38 | 38 | 60/0/2/38 | 27/0/35/38 | 49/0/13/38 |
| 5 | 0°-5 | 100 | 93 | 7 | 7 | 89/0/4/7 | 88/0/5/7 | 87/0/6/7 |
| 6 | 90°-1 | 220 | 120 | 100 | 45 | 100/1/19/99 | 20/0/99/100 | 103/0/16/100 |
| 7 | 90°-2 | 220 | 120 | 100 | 45 | 34/2/64/98 | 36/0/62/100 | 74/1/24/99 |
| 8 | 90°-3 | 220 | 120 | 100 | 45 | 94/0/28/100 | 17/0/105/100 | 91/0/31/100 |
| 9 | 90°-4 | 220 | 120 | 100 | 45 | 63/0/58/100 | 17/1/104/99 | 102/1/19/99 |
| 10 | 90°-5 | 220 | 120 | 100 | 45 | 93/1/27/99 | 30/0/90/100 | 100/0/20/100 |
| 11 | 180°-1 | 220 | 120 | 100 | 45 | 110/0/9/100 | 45/0/74/100 | 90/0/29/100 |
| 12 | 180°-2 | 220 | 120 | 100 | 45 | 91/0/27/100 | 89/1/29/99 | 88/0/30/100 |
| 13 | 180°-3 | 220 | 120 | 100 | 45 | 109/2/8/98 | 46/3/71/97 | 84/1/33/99 |
| 14 | 180°-4 | 220 | 120 | 100 | 45 | 104/1/11/99 | 67/0/48/100 | 90/0/25/100 |
| 15 | 180°-5 | 220 | 120 | 100 | 45 | 114/0/6/100 | 39/0/81/100 | 98/0/22/100 |
| Mean [%] | | | | | | TP 82 / FN 15 | TP 45 / FN 52 | TP 80 / FN 17 |

TABLE II
RESULTS FOR K = 20

| No. | Image pair | IC | CIC | IIC | IIC % | WGTM, K=20 (TP/FP/FN/TN) | GTM, K=20 (TP/FP/FN/TN) | RANSAC+EG (TP/FP/FN/TN) |
| 1 | 0°-1 | 100 | 67 | 33 | 33 | 65/0/2/33 | 21/0/46/33 | 60/0/7/33 |
| 2 | 0°-2 | 100 | 36 | 64 | 64 | 34/0/2/64 | 19/3/17/61 | 32/2/4/62 |
| 3 | 0°-3 | 100 | 69 | 31 | 31 | 67/0/2/31 | 21/0/48/31 | 60/0/9/31 |
| 4 | 0°-4 | 100 | 62 | 38 | 38 | 60/0/2/38 | 21/0/41/38 | 49/0/13/38 |
| 5 | 0°-5 | 100 | 93 | 7 | 7 | 90/0/3/7 | 34/0/59/7 | 87/0/6/7 |
| 6 | 90°-1 | 220 | 120 | 100 | 45 | 116/0/3/100 | 41/1/78/99 | 103/0/16/100 |
| 7 | 90°-2 | 220 | 120 | 100 | 45 | 87/3/11/97 | 14/7/84/93 | 74/1/24/99 |
| 8 | 90°-3 | 220 | 120 | 100 | 45 | 88/0/34/100 | 20/1/102/99 | 91/0/31/100 |
| 9 | 90°-4 | 220 | 120 | 100 | 45 | 106/0/15/100 | 32/0/89/100 | 102/1/19/99 |
| 10 | 90°-5 | 220 | 120 | 100 | 45 | 93/0/27/100 | 31/0/89/100 | 100/0/20/100 |
| 11 | 180°-1 | 220 | 120 | 100 | 45 | 117/0/2/100 | 21/0/98/100 | 90/0/29/100 |
| 12 | 180°-2 | 220 | 120 | 100 | 45 | 110/1/8/99 | 42/1/76/99 | 88/0/30/100 |
| 13 | 180°-3 | 220 | 120 | 100 | 45 | 113/0/4/100 | 21/0/96/100 | 84/1/33/99 |
| 14 | 180°-4 | 220 | 120 | 100 | 45 | 109/1/6/99 | 21/0/94/100 | 90/0/25/100 |
| 15 | 180°-5 | 220 | 120 | 100 | 45 | 118/1/2/99 | 35/1/85/99 | 98/0/22/100 |
| Mean [%] | | | | | | TP 90 / FN 7 | TP 27 / FN 70 | TP 80 / FN 17 |

However, under the same conditions, WGTM still performs more robustly compared to the GTM and RANSAC algorithms. As the algorithm advances, many of the outliers are eliminated and, consequently, the order of each network of points decreases (from its initial K value). By the time the algorithm stops, each network has a different number of matches in its structure.

F. System Portability

Like any other complex algorithm, our algorithm has a few parameters that should be set according to the application and the required precision. These parameters are the value of ε (used in the stopping condition), the percentage of connected edges (set to 50% for all the tests presented in this paper), and the value of K.

Section II-D is devoted to the discussion of ε and how it should be set according to the requirements of the parent program. We suggest setting ε to very small values for applications where accuracy is the most important aspect and the number of matches is less critical. The percentage of connected edges (in Equation (9)) is set to 50%. This value was found empirically, and it is unchanged for all the test cases presented in this paper.

Fig. 13. Performance of the WGTM algorithm for different values of the stopping threshold ε for the House dataset. (a) Precision and recall, ε = 0.001. (b) Precision and recall, ε = 0.005. (c) Precision and recall, ε = 0.01. (d) Precision and recall, ε = 0.05.

The value of K can affect the accuracy of the matching process. Smaller values of K do not allow more complex networks of points to contribute to the matching process, and potentially wrong matches with a similar number of neighboring points could be mistaken for correct matches, which decreases the precision. The nature of the input images, the number of outliers, and the type of transformation or distortion between the two images are also important in setting a proper value for K. For images with high distortion or deformable transformations, a value larger than 5 (10 or 15) is suggested for K.

G. Execution Issues

In this section, we discuss issues related to the execution of the proposed algorithm. All of our code is written in C++, and we have also implemented an optimized version of the GTM algorithm in C++. Both systems were run on a computer with a 64-bit Intel Core 2 Duo processor at 3 GHz, 8 GB of RAM, and the Windows 7 64-bit operating system.

1) Optimization: Creating the K-NN graphs has complexity O(n² log(n)), where n represents the number of matched points between the two images. Since in every iteration of our algorithm these graphs must be regenerated (using the remaining inliers), the system can be computationally expensive. To reduce the time complexity of our method, the K-NN graphs, the weight matrix W, and the mean weight vector w are not reconstructed in every iteration. Instead, these graphs and matrices are constructed once in the first iteration and, in subsequent iterations, updated only for the vertices that have lost one or more feature points from their networks.

Fig. 14. Mean precision and recall for various values of ε.

Since only a maximum of K features are connected to each vertex, only about K vertices need to be updated in each graph per iteration. The time complexity breakdown for creating and updating each variable in our system is as follows. In the first iteration:
1) Creating the distance matrices: O(n²)
2) Sorting the distance matrices to find the medians: O(n² log(n))
3) Creating the two K-NN graphs: O(n² log(n))
4) Removing vertices with at most one edge: O(n²)

Fig. 15. Performance comparison of precision for GTM, WGTM, and RANSAC methods for different number of inliers. (a) GTM: camera movement. (b) GTM: matching ambiguities. (c) GTM: nonrigid transformation. (d) WGTM: camera movement. (e) WGTM: matching ambiguities. (f) WGTM: nonrigid transformation. (g) RANSAC: camera movement. (h) RANSAC: matching ambiguities. (i) RANSAC: nonrigid transformation.

5) Creating the weight matrix W: O(K²n²) → O(n²) (since K is a constant)
6) Creating the mean weight vector w: O(n²)
In the other iterations:
1) Updating the two K-NN graphs: O(Kn log(n)) → O(n log(n))
2) Removing vertices with at most one edge: O(Kn) → O(n)
3) Updating the weight matrix W: O(KnK²) → O(n)
4) Updating the mean weight vector w: O(Kn) → O(n)
In the worst-case scenario, where all n matches are outliers, the algorithm runs for n iterations and therefore results in a total time complexity of O(n² log(n)).
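The bookkeeping behind the per-iteration update described above can be sketched as follows. This is only an illustration of the update idea, not the authors' implementation: rebuild_row is a caller-supplied routine (an assumption of the sketch) that re-selects vertex i's K nearest remaining neighbours and refills row i of W, for example with the weight_row function sketched in Section II-B.

```python
import numpy as np

def remove_match_and_update(r, adj, adj_p, W, w, rebuild_row):
    """Reject match r and recompute data only for the vertices that had r as a
    neighbour (at most about K per graph), rather than rebuilding everything."""
    affected = np.flatnonzero(adj[:, r] | adj_p[:, r])
    adj[r, :] = adj[:, r] = False        # detach r from G_P ...
    adj_p[r, :] = adj_p[:, r] = False    # ... and from G_P'
    W[r, :] = W[:, r] = np.nan           # invalidate its weights
    w[r] = np.nan
    for i in affected:
        rebuild_row(i)                   # caller re-picks i's K-NN and refills W[i, :]
        w[i] = np.nanmean(W[i, :])       # Eq. (10)
    return affected
```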

2) Running Time: The running times for both the GTM and WGTM algorithms are presented in Table III. In implementing the GTM algorithm, we were able to reduce its time complexity to O(n² log(n)). This improvement was achieved by updating the K-NN graphs in each iteration instead of regenerating them. Clearly, the execution time depends on the number of initial matches and the percentage of outliers. In this comparison, the RANSAC algorithm's time is not presented; we refer to the values reported in [2]. The running times for the GTM and WGTM algorithms are measured until the stopping conditions are satisfied. In each case, 60 inliers were selected, and a number of outliers, determined by the outlier percentage, was randomly generated and added to them. Each time value in this table represents the average execution time over 50 runs of the algorithm.

TABLE III
TIME REPORTED FOR BOTH GTM AND WGTM AT K = 5 AND K = 20

| Out (%) | TC | GTM, K=5 (NoI / Time) | WGTM, K=5 (NoI / Time) | GTM, K=20 (NoI / Time) | WGTM, K=20 (NoI / Time) |
| 5 | 63 | 8 / 0.043 | 6 / 0.138 | 43 / 0.076 | 6 / 0.146 |
| 15 | 70 | 17 / 0.075 | 13 / 0.193 | 53 / 0.099 | 13 / 0.273 |
| 25 | 79 | 27 / 0.113 | 22 / 0.271 | 61 / 0.128 | 22 / 0.469 |
| 35 | 92 | 43 / 0.167 | 35 / 0.419 | 77 / 0.185 | 36 / 0.808 |
| 45 | 109 | 65 / 0.256 | 52 / 0.657 | 93 / 0.252 | 53 / 1.288 |
| 55 | 133 | 95 / 0.388 | 77 / 0.975 | 122 / 0.388 | 77 / 2.038 |
| 65 | 171 | 133 / 0.622 | 117 / 1.576 | 168 / 0.680 | 114 / 3.445 |
| 75 | 239 | 206 / 1.181 | 188 / 2.775 | 238 / 1.355 | 181 / 6.162 |
| 85 | 399 | 370 / 3.285 | 353 / 6.500 | 398 / 3.991 | 342 / 13.511 |
| 95 | 1199 | 1188 / 52.849 | 1172 / 64.939 | 1198 / 62.408 | 1148 / 100.058 |

III. CONCLUSION

A point matching algorithm that establishes accurate match correspondences between two or more images is presented in this paper. The algorithm uses a graph-based approach that employs a network of feature points around each matching feature point. We compared the performance of the proposed algorithm with the state of the art using different input images.

ACKNOWLEDGMENT

The authors would like to thank W. Aguilar for her help with the graph transformation matching algorithm and its image dataset.

REFERENCES

[1] B. Zitová and J. Flusser, "Image registration methods: A survey," Image Vis. Comput., vol. 21, no. 11, pp. 977–1000, 2003.
[2] W. Aguilar, Y. Frauel, F. Escolano, M. M. Perez, A. E. Romero, and M. Lozano, "A robust graph transformation matching for non-rigid registration," Image Vis. Comput., vol. 27, no. 7, pp. 897–910, Jun. 2009.
[3] M. A. Fischler and R. C. Bolles, "Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography," Commun. ACM, vol. 24, pp. 381–395, Jun. 1981.
[4] X. Peng, M. Ding, C. Zhou, and Q. Ma, "A practical two-step image registration method for two-dimensional images," Inf. Fusion, vol. 5, no. 4, pp. 283–298, 2004.
[5] P. Fua, "A parallel stereo algorithm that produces dense depth maps and preserves image features," Mach. Vis. Appl., vol. 6, no. 1, pp. 35–49, 1993.
[6] P. Viola and W. M. Wells, "Alignment by maximization of mutual information," in Proc. 5th Int. Conf. Comput. Vis., 1995, pp. 16–23.
[7] H. Chui and A. Rangarajan, "A new point matching algorithm for nonrigid registration," Comput. Vis. Image Understand., vol. 89, nos. 2–3, pp. 114–141, 2003.
[8] S. Gold and A. Rangarajan, "A graduated assignment algorithm for graph matching," IEEE Trans. Pattern Anal. Mach. Intell., vol. 18, no. 4, pp. 377–388, Apr. 1996.
[9] F. Bookstein, "Principal warps: Thin-plate splines and the decomposition of deformations," IEEE Trans. Pattern Anal. Mach. Intell., vol. 11, no. 6, pp. 567–585, Jun. 1989.
[10] P. Navy, V. Pagé, E. Grandchamp, and J. Desachy, "Matching two clusters of points extracted from satellite images," Pattern Recognit. Lett., vol. 27, no. 4, pp. 268–274, 2006.
[11] Z. Xiong and Y. Zhang, "A novel interest-point-matching algorithm for high-resolution satellite images," IEEE Trans. Geosci. Remote Sens., vol. 47, no. 12, pp. 4189–4200, Dec. 2009.
[12] A. Robles-Kelly and E. R. Hancock, "String edit distance, random walks and graph matching," Int. J. Pattern Recognit. Artif. Intell., vol. 18, no. 3, pp. 315–327, 2004.
[13] CMU/VASC Image Database [Online]. Available: http://vasc.ri.cmu.edu//idb/html/motion/house/index.html
[14] Y. Zheng and D. Doermann, "Robust point matching for two-dimensional nonrigid shapes," in Proc. 10th IEEE Int. Conf. Comput. Vis., vol. 2, 2005, pp. 1561–1566.
[15] Y. Zheng and D. Doermann, "Robust point matching for nonrigid shapes by preserving local neighborhood structures," IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 4, pp. 643–649, Apr. 2006.
[16] H. Zhang, Y. Mu, Y. You, and J. Li, "Multi-scale sparse feature point correspondence by graph cuts," Sci. China Inf. Sci., vol. 53, no. 6, pp. 1224–1232, 2010.
[17] Z. Liu and Z. Liu, "A robust point matching algorithm for image registration," in Proc. 3rd Int. Conf. Mach. Vis., 2010, pp. 66–70.
[18] G. Sanromà, R. Alquézar, and F. Serratosa, "Smooth simultaneous structural graph matching and point-set registration," in Proc. 8th Int. Conf. Graph-Based Represent. Pattern Recognit., 2011, pp. 142–151.
[19] C. Harris and M. Stephens, "A combined corner and edge detector," in Proc. 4th Alvey Vis. Conf., vol. 15, 1988, pp. 147–151.
[20] A. Collignon, F. Maes, D. Delaere, D. Vandermeulen, and G. Marchal, "Automated multi-modality image registration based on information theory," in Proc. 14th Int. Conf. Inf. Process. Med. Imag., 1995, pp. 263–274.
[21] Z. Zhang, R. Deriche, O. Faugeras, and Q.-T. Luong, "A robust technique for matching two uncalibrated images through the recovery of the unknown epipolar geometry," Artif. Intell. J., vol. 78, nos. 1–2, pp. 87–119, 1994.

Mohammad Izadi received the B.S. degree in computer engineering from the Isfahan University, Isfahan, Iran, and the M.S. degree in computer engineering (artificial intelligence) from the Amirkabir University of Technology, Tehran, Iran, in 2003 and 2006, respectively. He is currently pursuing the Ph.D. degree in engineering science (intelligent systems and control) with Simon Fraser University, Burnaby, BC, Canada. His current research interests include computer vision, pattern recognition, neural networks, automatic 3-D map generation, and object tracking.

Parvaneh Saeedi (M'97) received the B.A.Sc. degree in electrical engineering from the Iran University of Science and Technology, Tehran, Iran, and the Master's and Ph.D. degrees in electrical and computer engineering from the University of British Columbia, Vancouver, BC, Canada, in 1998 and 2004, respectively. She was a Research Associate with MacDonald Dettwiler and Associates Ltd., Richmond, BC, Canada, from 2004 to 2006. She has been an Assistant Professor with the School of Engineering Science, Simon Fraser University, Burnaby, BC, since 2007. Her current research interests include pattern recognition, machine vision, image understanding, and artificial intelligence.
