Journal of Electronic Imaging 19(1), 013002 (Jan–Mar 2010)
Efficient registration of multitemporal and multisensor aerial images based on alignment of nonparametric edge features

Sokratis Makrogiannis, GE Global Research, Visualization and Computer Vision Lab, One Research Circle, Niskayuna, New York 12309, [email protected]

Nikolaos G. Bourbakis, Wright State University, College of Engineering and Computer Science, Assistive Technology Research Center, 3640 Colonel Glenn Highway, Dayton, Ohio 45435-0001
Abstract. The topic of aerial image registration attracts considerable interest within the imaging research community due to its significance for several applications, including change detection, sensor fusion, and topographic mapping. Our interest is focused on finding the optimal transformation between two aerial images that depict the same visual scene in the presence of pronounced spatial, temporal, and sensor variations. We first introduce a stochastic edge estimation process suitable for geometric shape-based registration, which we also compare to intensity-based registration. Furthermore, we propose an objective function that weights the L2 distances of the edge estimates by the feature points' energy, which we denote the sum of normalized squared differences and compare to standard objective functions, such as mutual information and the sum of absolute centered differences. In the optimization stage, we employ a genetic algorithm within a multiscale image representation to enhance the registration accuracy and reduce the computational load. Our experimental tests, measuring registration accuracy, rate of convergence, and statistical properties of registration errors, suggest that the proposed edge-based representation and objective function, in conjunction with genetic algorithm optimization, are capable of addressing several forms of imaging variations and producing encouraging registration results. © 2010 SPIE and IS&T. [DOI: 10.1117/1.3293436]
Paper 08160RR received Oct. 16, 2008; revised manuscript received Nov. 4, 2009; accepted for publication Dec. 2, 2009; published online Feb. 17, 2010. 1017-9909/2010/19(1)/013002/15/$25.00 © 2010 SPIE and IS&T.

1 Introduction

Image registration is the process of determining spatial correspondences between two images of the same scene that have been acquired under varying conditions. It is usually employed as an intermediate stage in image analysis or computer vision pipelines with applications in several domains, such as biomedical imaging,1–3 remote sensing,4,5 robot vision, motion detection and guidance systems,6–8 and stereo vision.9 In the remote sensing field, it is frequently used for change detection,10 topographic mapping, sensor fusion, and urban planning, among others. More specifically, in change-detection applications, the objective is to find the differences between two images of the same scene that have been taken from variable viewpoints, at different times, or using different sensors.11,12

The selection of the most suitable registration method relies on the nature of the problem and the imposed requirements.13 The main categories of registration problems have traditionally been divided into the following: (i) template registration, where a specific pattern has to be matched with an image,14 (ii) viewpoint registration, which has to compensate for variations caused by image acquisition from different viewpoints,8,15 (iii) multisensor registration, which is the registration of images acquired by different sensors,2,16–18 and (iv) multitemporal registration, where the matching process is applied to images taken of the same scene but at different times.19,20

In general, the image-registration problem may be defined as follows:

Definition 1: Let S_R be a set of points in the reference image R and S_I the corresponding set in the misaligned input image I, so that S_R, S_I ⊂ S, where S is the domain of the image spatial coordinates, S = {(x, y) : (x, y) ∈ ℕ²} (in the nonrestrictive 2-D case). The registration task is to determine the optimal spatial transformation (ST) in the parameter space P that minimizes the dissimilarity measure D_ST(R, I) between the two images, defined in the set of real numbers ℝ.

Existing registration approaches may typically be classified into correlation-based, spatial frequency domain, and numerical optimization-based approaches. The correlation-based methods, which appeared first in the literature, were mostly employed for template matching.14,21,22 According
to these methods, a statistical similarity measure is applied between the two test images. The objective is to find the best match between the template and the tested image, usually by shifting and rotating the template in the reference spatial domain. Common template-matching metrics include the cross-correlation, cross-covariance, correlation coefficient, and the sum of absolute or squared differences. These methods are most suitable for small translations and rigid transformations, even in the presence of white noise. However, correlation-based measures are not sufficient for images corrupted by other types of noise or by illumination differences. Under these circumstances, it is more suitable to carry out the registration in the Fourier domain.23–26 These approaches utilize properties of the Fourier transform related to the shift theorem to account for translation, rotation, and scale differences. In addition, they perform well in the presence of frequency-dependent noise and efficiently handle illumination changes and variations due to different sensors. Another advantage of these methods is that their execution times may be optimized using the FFT. On the other hand, a common problem of Fourier-based techniques is the aliasing effect, which can be resolved using windowing operations. Furthermore, these methods are limited to specific well-defined transformations, mainly translation and rotation. When local misalignments are present (as, for example, in temporal registration) or when the images contain different parts of the same scene, these methods present shortcomings. These difficulties can be resolved by proper utilization of numerical optimization-based registration methods, as described next.

The optimization-based registration methods make use of a numerical optimizer and an image similarity function to determine the spatial correspondence between two images.
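To make the correlation-based matching described earlier in this section concrete, here is a minimal sketch of exhaustive template matching with the correlation coefficient, one of the metrics listed above. The function names and the brute-force search are our illustration, not a published implementation.

```python
import numpy as np

def correlation_coefficient(patch, template):
    """Zero-mean correlation coefficient between two equally sized arrays."""
    p = patch - patch.mean()
    t = template - template.mean()
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    return (p * t).sum() / denom if denom > 0 else 0.0

def match_template(image, template):
    """Exhaustively slide the template over the image and return the
    offset (row, col) with the highest correlation coefficient."""
    th, tw = template.shape
    best, best_pos = -np.inf, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            score = correlation_coefficient(image[r:r + th, c:c + tw], template)
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos
```

In practice, the exhaustive search over shifts (and rotations) is precisely what the Fourier-domain and optimization-based methods discussed next accelerate.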
Previous reports have shown that methods of this category may efficiently address complicated forms of misalignment.8,9,22,27,28 These approaches can be subdivided into the intensity-based and feature-based groups. In the first group, the original intensity or color-pixel attributes are readily employed, followed by a spatial transformation and estimation of dissimilarity between the aligned and reference images. On the contrary, feature-based methods rely on a feature extraction step as described next.

1.1 Extraction of Salient Image Features

At this stage, specific features are extracted from the two images to facilitate the matching process. Usually these features are related to intensity, edge, texture, or region-based estimates.3–7,14,17,27,29–32 The goal of this step is to measure features that can distinguish the points of interest, such as object corners, lines, or shapes, and can be efficiently matched in the following steps.

1.2 Spatial Transformation

The spatial transform is defined as the function that operates on the domain of spatial coordinates, ST_P : S_I → S_R, using the parameter set P. The type of transformation depends on the requirements set by the application of image registration. This transform may also be defined as a function that maps the spatial coordinates to the n-dimensional real domain ℝⁿ, where n is the dimensionality of the image space, followed by a resampling operation that produces the transformed image values. Some common models are the conformal, affine, projective, and polynomial transforms. Nevertheless, these global models cannot address the registration of nonrigid objects, such as human or animal organs and body structures, with sufficient accuracy. These problems require the use of locally adaptive transformation models,13 better known as deformation fields, which represent a popular research area in the medical-imaging and computer-vision community with typical applications including computational anatomy3,31 and shape-based analysis.33

1.3 Estimation of the Dissimilarity Measure

This function serves as an objective measure of the correctness and accuracy of a spatial transformation that maps the input image to the reference spatial domain. In this context, our goal is to find the set of spatial transformation parameters P_min that minimizes the selected dissimilarity measure D:

P_{\min} = \arg\min_{P_i} D\{R(\omega), I[ST_{P_i}(\omega)]\},   (1)

where i is the index that spans the spatial transform parameter space, P_i is the transformation parameter set at i, ST_{P_i}(ω) is the transformed set of spatial coordinates using P_i, and P_min denotes the optimal set of transformation parameters.

Another significant component of every registration scheme is the search strategy, which seeks the global optimum of the dissimilarity measure in the parameter space determined by the transformation model. Previous optimization strategies include exhaustive search, gradient-based approaches, the Simplex algorithm, and Levenberg–Marquardt,34 or more advanced approaches, such as simulated annealing, TRUST,35 and genetic algorithms.36,37

Further categorization of registration methodologies originates from implementation considerations and user-defined requirements. For example, a potential solution for time-sensitive applications that require optimization of a linear registration model would be the utilization of the Radon transform.23,38 This transform is defined by summing up the pixel values along multiple directions. In most cases, the Radon transform is estimated over a number of arbitrary spatial orientations, and the optimization problem is projected from the original image space to a number of 1-D feature spaces.

This work deals with registration of aerial images that include viewpoint, temporal, and sensor variations. Because of the different physical characteristics of various sensors, the relationship between the intensities of matching pixels is often complicated and not known a priori. Features present in one image might appear only partially in the other image or not appear at all. Contrast reversal may occur in some image regions while not in others, and multiple intensity values in one image may map to a single intensity value in the other. Furthermore, imaging sensors may produce considerably dissimilar images of the same scene when configured with different imaging parameters.
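The directional pixel sums that define the Radon transform mentioned above can be approximated in a few lines of NumPy; the nearest-neighbour grid rotation below is our simplification of the usual interpolated implementation.

```python
import numpy as np

def radon_projection(image, angle_deg):
    """Sum pixel values along lines at the given angle by rotating the
    sampling grid (nearest-neighbour) and summing columns."""
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = np.deg2rad(angle_deg)
    ys, xs = np.mgrid[0:h, 0:w]
    # Rotate each output coordinate back into the input image.
    xr = np.cos(theta) * (xs - cx) - np.sin(theta) * (ys - cy) + cx
    yr = np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    xi, yi = np.rint(xr).astype(int), np.rint(yr).astype(int)
    valid = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
    rotated = np.where(valid, image[yi.clip(0, h - 1), xi.clip(0, w - 1)], 0.0)
    return rotated.sum(axis=0)  # one 1-D profile for this projection angle
```

Estimating such profiles over several orientations reduces a 2-D alignment search to a set of 1-D matching problems, which is the speed advantage noted above.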
The misaligned images might also include variable terrains
Fig. 1 Outline of the employed registration methodology that serves as a test bed for comparisons of different image representations and registration cost functions.
due to temporal variability, and in addition, the alignment process is frequently required to be automated. In order to overcome difficulties related to intensity variations, we first introduce a stochastic edge representation using kernels for nonparametric probability density estimation. We then propose a registration cost function that is specific to this geometric representation and compare it to mutual information (MI) and an L2 distance-based measure for validation. An overview of the registration test bed that we use in our comparisons is outlined in Fig. 1: starting from the coarsest scale, the spatial transform parameters are estimated using genetic-algorithm-based optimization, and the approximate solution serves as the center of the search space in a multiscale optimization scheme. The contributions of this work are the multivalued edge-based representation based on nonparametric probability density estimation and its associated dissimilarity function, which aim for the alignment of image pairs characterized by considerable temporal, sensor, and viewpoint variations.

The structure of this paper is organized as follows: In Sec. 2, we describe the proposed registration scheme with emphasis on the considerations of image representation and the associated similarity measures. Section 3 presents our experimental methodologies for measuring different aspects of the proposed scheme, such as the registration accuracy, convergence rate to the final solution, and computational complexity. A discussion of the applicability and originality of the previously presented ideas is also reported in the same section. Finally, conclusions and future goals are presented in Sec. 4.
2 Proposed Registration Approach

In this section, we introduce the main components of our image registration scheme (i.e., the data representation, similarity measure, search optimization algorithm, and multiscale processing). Our interest is concentrated on the original parts of this work (i.e., the edge-based shape representation and the feature-weighting similarity function).

2.1 Image Representation

The image data are regularly represented by the original pixel values, wavelet parameters, or content-based descriptors such as edge or region maps and graphs. The original pixel representation is sufficient for simpler registration problems; however, when multitemporal or multimodal variations occur, other representations may be used to facilitate the optimization process. In this work, we propose a statistical edge estimation approach for producing a shape-based representation of the image content, detailed next.

2.1.1 Minimal density edge map estimator

The fundamental problem of edge detection in a multivariate space is to find the boundary between two homogeneous regions, where homogeneity may be defined and measured using intensity or color information. In contrast to conventional edge detection methods, which try to localize intensity or color gradients, we introduce a method that produces a probabilistic edge map using nonparametric probability modeling and estimation. Assuming that the image field can be locally approximated by homogeneous regions or edges, we proceed from the observation that the shape of the probability distribution of the image intensity at edge areas is approximately bimodal, whereas areas of uniform intensity are characterized by a unimodal distribution, as explained in Ref. 39. Hence, homogeneous regions tend to form clusters in the multivariate space characterized by a high concentration of uniformly distributed intensity values. A basic feature indicating the degree of homogeneity is the cluster density. Clusters of neighboring regions occupy different locations, and the in-between space is identified by the absence of samples and, thus, by low probability density in the feature space, as also indicated in Ref. 39, where an analogous probability density function (pdf) model was employed for image denoising filters. The boundary edge line should therefore be positioned at these low-density areas and ideally located at the density minimum. According to this formulation, instead of trying to locate intensity gradients, the edge detection problem is interpreted as seeking intensity or color points of discontinuity identified by their low probability density values.

Usually pixels are located on either of the two homogeneous levels, which correspond to pdf modes, and in general none of the existing sample points takes on low pdf values. In order to locate this minimum, a resampling of the density function at an appropriate location is required. In this work, the pdf of the image intensity samples, which belong to a local area of the image plane, is estimated at the sample mean, and this estimate provides an implicit measure of the bimodality of the sample distribution, which consequently indicates the presence of an edge according to our previous assumption.

Fig. 2 One-dimensional example of kernel (dotted lines) nonparametric density estimation (solid line) of five intensity sample points forming an edge.
More specifically, in the case of a bimodal distribution, the probability density at the mean will be relatively low, whereas it is expected to be significantly higher when the distribution is approximately unimodal. A case of a bimodal distribution is depicted in Fig. 2, where the probability density is approximated at the samples' mean using Parzen kernels, as described next.
Edge detection filters are commonly implemented using a k × k window W that slides over the image plane. If W is positioned over a homogeneous area of the image, then the histogram of the samples within W is expected to be unimodal, whereas if W lies on the border between two uniform areas, the histogram should have a bimodal shape because it will include intensity samples from two different regions. In the unimodal case, a high density value is expected everywhere within the sample domain defined by W. In the second case, although the pdf might take on relatively high values at the sample points, it is expected to have low values in the valley between the two peaks (as seen in Fig. 2).

Regarding the estimation of the probability density of the intensity samples within W, there is a large number of methods40 to choose from. Although the histogram calculation might suffice for several applications, here we employ a nonparametric kernel method because of the small sample size and the presence of noise in our data. Nonparametric techniques have the advantage that they do not make any explicit assumptions about the underlying distribution; instead, the data sample itself is used as a model. A nonparametric method widely used by the statistical community is the Parzen window or multivariate kernel.40 The unknown density p(X) is estimated by a sum of radial kernel functions centered at the sample points, and all contributions are averaged to produce a smoothed version of the true pdf. A probability density estimate is thus available for every point in the sample space.

Let X_i, i = 1, …, N, be intensity samples that are located within the spatial window W in the image's spatial domain and correspond to independent and identically distributed observations. The density value \hat{p}_N(\hat{X}) produced by the N sample vectors X_i at position \hat{X} may be estimated by

\hat{p}_N(\hat{X}) = \frac{1}{N h^d} \sum_{i=1}^{N} K\left(\frac{\hat{X} - X_i}{h}\right),   (2)
where K is the employed kernel function, h is the standard deviation of the kernel, and d is the dimensionality of the employed feature space. In our application, d = 1 and \hat{X} is determined by the samples' mean estimate, i.e., \hat{X} = \frac{1}{N}\sum_{i=1}^{N} X_i. A common choice for the kernel function K is the second-order multivariate Gaussian, which may also be regarded as a spherically symmetric pdf,

K(X) = (2\pi)^{-d/2} \exp\left(-\frac{X^2}{2}\right).   (3)
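As a rough sketch of the estimator in Eqs. (2) and (3), the code below evaluates the Gaussian-kernel density at the mean of each k × k window and returns its reciprocal as an edge score; the edge-replicating pad and the small stabilizing constant are our assumptions, not part of the published method.

```python
import numpy as np

def density_at_mean(samples, h):
    """Parzen estimate (Eq. 2) of the 1-D intensity pdf, evaluated at the
    sample mean with a Gaussian kernel (Eq. 3)."""
    x_hat = samples.mean()
    u = (x_hat - samples) / h
    kernels = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    return kernels.sum() / (samples.size * h)

def edge_map(image, k=5, h=20.0):
    """Slide a k-by-k window over the image; low density at the window mean
    signals a bimodal (edge) neighbourhood, so we return its reciprocal."""
    r = k // 2
    padded = np.pad(image.astype(float), r, mode='edge')
    out = np.zeros(image.shape, dtype=float)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            win = padded[y:y + k, x:x + k].ravel()
            out[y, x] = 1.0 / (density_at_mean(win, h) + 1e-12)
    return out
```

On a homogeneous window all kernels peak at the mean, so the density is high and the edge score low; on a step edge the mean falls in the valley between the two modes and the score rises sharply.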
The performance of the kernel density estimator defined in Eq. (2) is measured using the distance between the true and estimated densities. The consistency of this operator depends on the L1 or L2 error criterion, and it has been observed that, in the L2 approach, the tail behavior of a density function becomes less important. It also turns out that the choice of h is much more significant for the quality of the estimate than the choice of K. The h parameter determines the kernel's rate of decrease with distance, in other words, the smoothness of the kernel function. In the present application, the image can be considered as the superposition of a deterministic signal plus noise; therefore, the kernel width must be smaller than the sharpest detail in the unknown target density. If h is too large, then the estimate will be too smooth, causing undesirable blurring, and might not reveal structural features such as an existing bimodality. If it is too small, then the estimate \hat{p}_N(\hat{X}) will suffer from statistical instability, not being able to filter out the random variability of the sample (due to noise). In this work, a global value for h is used. Its value is mainly determined by the finest detail in the image that needs to be preserved (signal characteristics) and is increased thereafter, when noise is present, according to the noise variance. Here, we typically use h = 20 for window width k = 5.

Each density estimate \hat{p}_N(\hat{X}) is attributed to the center pixel of window W, which slides over the complete image domain to generate the final edge map. This edge estimation algorithm produces a geometric image representation that will be used in conjunction with a shape feature-weighting cost function to address multisensor variability, as described next.

Fig. 3 Graphs of (a) the normalized squared differences (NSD) and (b) absolute differences (AD) functions, applied to the edge- and intensity-based image representations, respectively. It is worth noting that NSD introduces feature weighting into the similarity estimation process.

2.2 Similarity Measures

According to the previous section, a fitness function needs to be optimized in order to establish the optimal spatial correspondence. Because our goal is to achieve efficient image registration, an effective cost function has to be defined in order to express the (dis)similarity of the transformed image in the reference image spatial domain. Here we introduce a cost function that estimates the similarity between two edge maps by means of L2 distances, denoted the sum of normalized squared differences (SNSD), and we outline the MI and sum of absolute centered differences (SACD) functions, two commonly used image similarity measures that will be used for comparisons in Sec. 3.

2.2.1 SNSD

A (dis)similarity measure for the edge-based representation is introduced here. It originates from the idea that the contribution of each point to the objective function should be weighted by the significance of the corresponding feature. Therefore, points with higher edge probability or strength should be attributed larger weights in the calculation of the total image similarity cost. Assuming that ω is a point in the reference coordinate space Ω and ST_p(·) is the spatial transformation model that maps ω ∈ Ω to ST_p(ω) ∈ Ω, the local dissimilarity estimate at each point is defined as the L2 distance between the edge estimates of the spatially transformed input image I_E[ST_p(ω)] [where I_E(·) denotes the input image] and the reference image R_E(ω), divided by the sum of their energies. The summation of the local (i.e., pixelwise) dissimilarity estimates over Ω defines the overall cost function SNSD:
\mathrm{SNSD}_{ST_p}(R_E, I_E) = \sum_{\omega \in \Omega} \frac{\{I_E[ST_p(\omega)] - R_E(\omega)\}^2}{I_E[ST_p(\omega)]^2 + R_E(\omega)^2}.   (4)
This objective function may be considered a hybrid feature- and image-based criterion. This is because the significance of each pair of points is weighted by an energy term that is explicitly derived from their probability of being feature points, while all image pixels are still used for calculating the overall degree of similarity. Therefore, an advantage of this approach is that it does not require a feature selection process, which is often complicated and sensitive to noise and outliers. It is also worth noting that SNSD is expected to be more efficient when applied to multivalued edge representations. In this application, we generate the edge maps from the reciprocal of Eq. (2), such that larger values correspond to higher edge likelihood. The effect of the energy-weighting term renders SNSD suitable for registration of intermodal imaging data because it will be minimized when the edge estimates are aligned. The graph of the pixelwise normalized squared differences function, which is utilized in Eq. (4), is depicted in Fig. 3(a).
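Equation (4) transcribes almost directly into NumPy; the small eps guard for flat regions, where both edge energies vanish, is our addition and not part of the published formula.

```python
import numpy as np

def snsd(edge_ref, edge_in, eps=1e-12):
    """Sum of normalized squared differences (Eq. 4): pixelwise squared
    difference weighted by the edge energy of each point pair."""
    num = (edge_in - edge_ref) ** 2
    den = edge_in ** 2 + edge_ref ** 2 + eps  # eps guards flat, zero-edge pixels
    return (num / den).sum()
```

Each pixelwise term lies in [0, 1], so strongly disagreeing edge pixels dominate the sum while flat background contributes almost nothing, which is the feature weighting noted in Fig. 3(a).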
2.2.2 MI

This is a statistical similarity measure that estimates the degree of dependence of two variables A and B by measuring the distance between the joint distribution p_{A,B}(a, b) and the statistically independent case p_A(a) · p_B(b).2 In the image registration field, A and B denote the intensity variables of the reference image R and input image I. The best matching corresponds to the maximum value of this quantity. The MI is accordingly estimated by the following expression:
\mathrm{MI}_{ST_p}(R, I) \equiv \mathrm{MI}_{A,B}^{ST_p} = \sum_a \sum_b p_{A,B}(a, b) \log \frac{p_{A,B}(a, b)}{p_A(a) \cdot p_B(b)}.   (5)
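A minimal sketch of Eq. (5) follows, with the joint pdf estimated from a 2-D histogram of corresponding pixel intensities, the simple estimator whose artifacts are discussed next; the bin count is an arbitrary choice.

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """Eq. (5): MI of two equally shaped images, with the joint pdf taken
    from a 2-D histogram of corresponding pixel intensities."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)   # marginal of A (column vector)
    p_b = p_ab.sum(axis=0, keepdims=True)   # marginal of B (row vector)
    nz = p_ab > 0                           # 0 log 0 is taken as 0
    return (p_ab[nz] * np.log(p_ab[nz] / (p_a @ p_b)[nz])).sum()
```

Identical images give MI equal to the marginal entropy, while an image compared against a constant (independent) signal gives zero.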
This estimation of MI may produce several artifacts, such as multiple local optima, that complicate the task of search optimization.19 More robust pdf estimation may be accomplished by interpolation or nonparametric density estimation. MI is considered to be a reliable image similarity measure under radiometric, multisensor, and multitemporal variations.2,19

2.2.3 SACD

This measure is expressed by

\mathrm{SACD}_{ST_p}(R, I) = \sum_{\omega \in \Omega} \left| I[ST_p(\omega)] - \hat{I}_{ST_p} - R(\omega) + \hat{R} \right|,   (6)
where R and I are the reference and input images, while \hat{R} and \hat{I}_{ST_p} are the respective expected values calculated over the overlapping area of the image pair. The reciprocal of this function operates as a fitness measure. The graph of the pixelwise absolute differences function that is calculated in Eq. (6) is shown in Fig. 3(b).

2.3 Genetic Algorithm (GA)-Based Optimization Approach

A significant stage of the registration process is the search for the optimal parameter set of the spatial transform. This is frequently complicated by the occurrence of outliers, digitization effects, noise, and illumination variations that affect the imaging process. In order to overcome these difficulties, a numerical optimization scheme may be employed to define the optimal transformation. Some early approaches to this problem, such as relaxation, clustering, and hierarchical search algorithms, were limited to simple variations or were characterized by impractical computational complexity. Although several optimization and feature extraction methods have been proposed to address these limitations,13,35,36 the optimization task becomes challenging when a significant amount of uncorrected variation occurs; therefore, it remains an open research topic.

In our scheme, the genetic algorithm41 is used for finding the optimal registration parameters in the global search space. The employed search space corresponds to the class of rigid spatial transformations combined with isometric scaling, which is sufficient for a number of aerial image registration problems. Moreover, a complicated feature selection process is not necessary in this approach; the overlapping areas of the examined images are compared instead. A more detailed explanation of this methodology follows.

2.3.1 GA principles

The genetic algorithm approach is a stochastic method well suited for optimization problems in several fields.41,42 The initial search space is reduced using principles of evolution theory in order to optimize a user-defined fitness function, as defined next.

Definition 2: Let f be the function that we seek to optimize, G the domain of genes or chromosomes, and X a gene that belongs to G, X ∈ G. The function f maps values from the parameter space G to the domain ℝ of real values, f : G → ℝ. The goal of optimization is to find the gene X_min that minimizes f, X_min = arg min_i {f(X_i)}, where i spans the search space. The optimization problem thus follows the same formulation as that of image registration (see Definition 1).

Previous theoretical studies show that genetic techniques provide a highly efficient heuristic for information gathering in noncontinuous search spaces.41 It has also been indicated that GAs produce reliable optimization results in practice. The classic gradient search is more efficient for problems that satisfy tighter constraints (i.e., continuity, low dimensionality, unimodality). Nevertheless, GAs outperform both gradient search and other forms of stochastic search, e.g., simulated annealing, on more complicated problems closer to real conditions, such as discontinuous, noisy, high-dimensional, and multimodal objective functions.35,36,41 Notably, the image registration field includes discontinuous problems due to partial area overlapping, difficulties caused by noise and illumination variations during image acquisition, multimodal complexities introduced by sensor variations, and multidimensional difficulties due to the use of several image cues in the registration process.
Next, we explain the details of the GA application to this optimization problem.

2.3.2 Parameter settings and encoding

The input variables of the genetic algorithm, known as chromosomes or genes, are bit strings that contain the parameters of the spatial transformation. In order to construct these bit strings, we first need to define the parameter model of the search space, which in our application is the conformal mapping model, a member of the family of linear transformations. The parameters may now be encoded from the real-value format to the bit-string representation by means of

B_j = \frac{(2^{L_j - 1} - 1) \cdot (C_j - C_j^{\min})}{C_j^{\max} - C_j^{\min}},   (7)
where j is the index in the parameter space, B_j is the decimal number that is subsequently represented in binary format, L_j is the granularity of input parameter C_j, which takes on real values in the employed search space, and [C_j^{\min}, C_j^{\max}] is the range of values of C_j. Figure 4 displays an example of the affine parameter representation and coding for one generation of k genes.

Fig. 4 The spatial transformation parameters are encoded and represented by binary strings to produce the gene populations that are processed by the GA algorithm.

After the range and granularity of the initial search space have been determined, the population of chromosomes is selected randomly in the initial search space using a uniform probability distribution. A fraction of the good solutions is selected, while the rest are eliminated.

2.3.3 GA operations

The chromosomes that remain are combined according to the three basic operations of reproduction, mutation, and crossover. More specifically, reproduction is the operation by which the strings that produce high fitness values (or, equivalently, lower costs) remain in the next generation. This is carried out by ranking the gene performance and assigning higher selection probabilities to the best genes. After the reproduction has been completed, a mutation probability is assigned to each chromosome of the current generation. When a mutation operation occurs, a random value is chosen for the corresponding bit in the string. Crossover is the random combination of the best strings coming from the previous generation, accomplished by exchanging random parts between successive genes in the current population. Crossover is executed for the first N genes of the generation, where N is proportional to the crossover probability. Two points are selected, and the parts of the bit strings between these points are exchanged. We note here that the reproduction and crossover operations are responsible for the searching capability of genetic algorithms, whereas the mutation operation prevents the algorithm from converging to a local solution by randomly changing the binary value at a location in the bit string. After each generation has been created, the fitness values of the current genes are estimated. In the image registration context, the fitness function corresponds to an image similarity measure.
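The reproduction, crossover, and mutation operations described above can be condensed into a small self-contained GA loop; the population size, rates, and keep-half selection rule below are illustrative choices, not the settings used in our experiments.

```python
import random

def crossover(g1, g2):
    """Two-point crossover: swap the segment between two random cut points."""
    i, j = sorted(random.sample(range(len(g1)), 2))
    return g1[:i] + g2[i:j] + g1[j:], g2[:i] + g1[i:j] + g2[j:]

def mutate(gene, p_mut):
    """Flip each bit independently with probability p_mut."""
    return ''.join(b if random.random() > p_mut else str(1 - int(b)) for b in gene)

def evolve(cost, length, pop_size=40, generations=60, p_cross=0.6, p_mut=0.02):
    """Minimal GA loop: rank by cost, keep the better half (reproduction),
    refill with crossover of the survivors, then mutate the offspring."""
    pop = [''.join(random.choice('01') for _ in range(length)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)                 # lower cost = fitter
        survivors = pop[:pop_size // 2]    # elitist: survivors pass unchanged
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            if random.random() < p_cross:
                a, b = crossover(a, b)
            children += [mutate(a, p_mut), mutate(b, p_mut)]
        pop = survivors + children[:pop_size - len(survivors)]
    return min(pop, key=cost)
```

In the registration setting, cost would decode each bit string into transform parameters via the encoding of Sec. 2.3.2 and evaluate the chosen similarity measure over the image overlap.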
The above optimization process is iterated until the algorithm converges to the final solution according to the termination criterion, which is typically related to convergence properties; for example, when the same solution is repeated for a user-specified number of consecutive generations.

2.4 Transformation Model

In most of the literature, the registration problem is limited to addressing relatively small variations in translation and rotation. In this work, an effort is made to solve more extended variations in scale, rotation, and translation. The employed search space is therefore defined by a rigid transform plus isotropic scaling, also known as a conformal mapping. This transform originates from a simplified version of the affine model without the skew and shear operations, and has proven sufficient for the image misalignments of our application. Both transformation models are outlined next for the sake of completeness.

Fig. 5 A synthetic example of two aerial images that are characterized by scale, rotation, and translation misalignments.

2.4.1 Affine transform

A point of the input image spatial domain SI, represented in homogeneous coordinates [i.e., (x, y, 1)], is mapped onto the registered image domain SR by means of the spatial transform STp, such that SR = STp(SI). This operation may also be expressed by the matrix product

SR = [Tr] · [Sc] · [Rt] · [Sk] · SI,    (8)

where Tr, Sc, Rt, and Sk represent the matrices of translation, scaling, rotation, and skew, respectively.6 The parameter space AP = {tx, ty, sx, sy, θ, sk} is the domain of affine parameters that cover several forms of misalignment (i.e., translation, scale, rotation, shear, and skew). It should be noted that affine transforms are linear because they preserve straight lines. When the shear and skew operations are omitted, the corresponding transform is called a conformal mapping, and its parameter set is represented by CMP = {tx, ty, s, θ}. A synthetic example of this transformation model is illustrated in Fig. 5. Another consideration when applying a spatial transform is the implementation of the resampling process. An interpolation scheme is normally required for arbitrary values of the transform coefficients; in this work, the resampling is carried out using B-spline interpolation.34 It is also worth noting that the presented registration scheme can be readily extended to other linear transformation models. The selection of a transformation model is contingent on the nature and requirements of each application.

2.5 Hierarchical Multiscale Structure

As mentioned in the previous sections, the GA-based optimization scheme is considered suitable for ill-posed registration problems that include extensive scaling and rotation differences. Even so, the results of this process still need to be refined to achieve subpixel registration, which is essential for processes that typically follow the image registration stage, such as change detection.12,33 This level of accuracy could be accomplished by increasing the number of generations in the GA optimization; nevertheless, this policy would introduce additional computational cost. In this work, we instead employ a multiscale representation scheme using a hierarchical Gaussian pyramid.
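A sketch of the pyramid construction follows; for brevity, a 2×2 box average stands in for the Gaussian smoothing kernel actually implied by the name, and an image is represented as a plain list of rows:

```python
def downsample(img):
    """One pyramid level: 2x2 averaging followed by decimation by 2.
    (A true Gaussian/binomial kernel would be used in practice.)"""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * r][2 * c] + img[2 * r][2 * c + 1] +
              img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(w)]
            for r in range(h)]

def gaussian_pyramid(img, levels):
    """Hierarchical pyramid: list of images ordered from finest to coarsest."""
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(downsample(pyr[-1]))
    return pyr
```

Each level halves the spatial resolution, so the coarsest level supports a fast, approximate search while the finer levels allow progressive refinement.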
In previous works, it was concluded that the generation of multiple scales is advantageous for various image analysis processes, including registration.43 Some standard hierarchical multiresolution approaches in the literature are wavelet schemes and
013002-7
Downloaded From: http://spiedigitallibrary.org/ on 03/07/2014 Terms of Use: http://spiedl.org/terms
Jan–Mar 2010/Vol. 19(1)
Gaussian or Laplacian pyramids.34 Multiscale processing schemes have been used in biomedical imaging applications with outstanding results as well.3 Here we employ a Gaussian pyramid that serves two goals: (i) find a coarse solution (affine parameter set) in reduced time, and (ii) refine the registration results as the scales become finer. The overall process may be divided into the following steps:
Table 1 Image-acquisition information that indicates the type of temporal and sensor variations in the utilized data set.
1. A Gaussian image pyramid is constructed.
2. Processing starts from the coarsest scale and proceeds toward the finest.
3. The GA registration algorithm is applied at the current scale.
4. The affine parameters are stored.
5. The search space range is reduced by a factor of 2, using the detected registration solution as the reference (center) value.
6. The overall process is completed after the finest scale is processed, and the input image is resampled based on the final parameters.

One of the main advantages of this hierarchical scheme is its computational efficiency, because it is a top-down approach (i.e., an approximate optimum is found at the coarsest scale, while enhanced registration accuracy is accomplished at finer scales). At the intermediate scale(s), in order to further localize the optimization process in the vicinity of the previously found solution, a portion of the initial gene population is randomly generated around the optimized parameters. These genes are generated by adding Gaussian noise to the previous optimization result. The standard deviation of this noise was set to a fraction of the search space range (typically, 1/8). We produce 30% of the initial population using this strategy, while the remaining genes are generated in the previously defined search space. At the finest scale (i.e., the original image), a reduced number of iterations may be used because an approximate solution of the optimization function has already been found. The multiscale processing often leads to subpixel capability, even for extensive misalignments, as will be shown in Sec. 3. A disadvantage of this strategy is that the final accuracy relies on the initial estimation obtained at the coarsest scale.
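The coarse-to-fine loop of steps 2-6 might be sketched as follows, where `run_ga` stands in for the GA optimization of Sec. 2.3 and both function names are illustrative:

```python
def halve_range(bounds, center):
    """Step 5: shrink each parameter interval by a factor of 2,
    re-centered on the solution found at the current scale."""
    return [(c - (hi - lo) / 4.0, c + (hi - lo) / 4.0)
            for (lo, hi), c in zip(bounds, center)]

def multiscale_register(pyramid, bounds, run_ga):
    """Steps 2-6: optimize from the coarsest pyramid level to the finest;
    run_ga(level, bounds) returns the best parameter vector found."""
    params = None
    for level in reversed(pyramid):  # pyramid is ordered finest -> coarsest
        params = run_ga(level, bounds)
        bounds = halve_range(bounds, params)
    return params
```

Because each level halves the search range around the previous solution, the finest level only needs a local refinement, which is where the computational savings come from.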
3 Experiments and Discussion
The proposed scheme was tested on several aerial images that include viewpoint, temporal, and multisensor variations. In addition, the test images include different terrain parts and, in several cases, the image quality differs significantly between the reference and input images, which further complicates the registration task. Here we present results on three sets of test images that belong to groups denoted by A-E, which include images acquired with different sensors (visual and thermal) of variable resolutions and intrinsic parameters. Table 1 displays the sensor types and years of acquisition for a subset of our test images that are used here for evaluating the aforementioned approaches. This evaluation was carried out qualitatively by visual assessment of alignment correctness and quantitatively by a measure of registration accuracy, as described next.
Image    Year    Mode
A1       1974    Visual
A2       1974    Visual
A3       1978    Visual
A4       1974    Visual
T_A1     1974    Thermal
B1       1978    Visual
B2       1978    Visual
B3       1985    Visual
B4       1978    Visual
C1       1975    Visual
C2       1974    Visual
D2       1974    Visual
D4       1975    Visual
D5       1978    Visual
T_D1     1985    Thermal
E1       1974    Visual
E2       1974    Visual
3.1 Registration Consistency Measure

The assessment of registration accuracy is relatively simple when ground-truth data are available (i.e., the affine parameters are known in advance). Conversely, in the absence of ground truth, alternative strategies have to be followed. An indicative measure of registration accuracy in this case is the registration consistency1 (RC). This measure requires the estimation of the forward and inverse affine registration parameters, followed by a composition operation. The input image is subsequently resampled using the resulting parameters, and a spatial distance measure is averaged over the overlapping area to calculate the final difference. The RC between the reference image R and the input image I is computed as

RC(R, I) = (1/NR) · Σ_{x ∈ Ω} ‖x − ST_{R,I} ∘ ST_{I,R}(x)‖,    (9)

where ST_{R,I} and ST_{I,R} are the forward and inverse spatial transformations between the input and reference images, ∘ is the composition operator, x is a point in the reference coordinate space Ω, and NR is the area of overlapping pixels in the reference image space. Our experimental comparisons were carried out on the basis of data representation, similarity measures, and the use of single- or multiple-scale information. Considering the
Fig. 6 (a) First set of images, denoted A1, A2, A3, A4, and T_A1 (left to right), and (b) their color-coded edge map representations. (Color online only.)
data representation, we tested the algorithm using the original intensity values and the geometrical features derived from the edge estimation process that was previously introduced. In addition, we compared the similarity measures SACD, MI, and SNSD. The final comparison element was the potential use of a hierarchical pyramid scheme to carry out registration at multiple scales. Following this experimental methodology, Figs. 6-9 depict qualitative registration results, while Figs. 10-14 show graphs of our quantitative measures. More specifically, Figs. 6-9 display representative test images from our data sets, their color-coded edge-based representations, and the corresponding registration results. It can be seen that although several variations are present in the original images due to different sensors, times of acquisition, and viewpoints, the edge-based representation considerably reduces the sensor and temporal variations and thus assists the registration process. In addition, Figs. 10-14 provide an objective overview of registration capabilities and facilitate the comparisons described next.

3.2 Similarity Measure Comparisons

Figures 10 and 11 display the graphs of the objective functions versus the registration parameters in the search space. For the intensity representation, MI and SACD were compared [Figs. 10(a) and 11(a)]. We conclude from these graphs that the MI measure produces a smoother curve with fewer local minima than SACD. In addition, the SNSD function was tested on the edge-based image representation and compared to the previous two with respect to registration accuracy. SNSD appears more amenable to optimization using this representation [Figs. 10(b) and 11(b)]. This observation becomes more obvious for the image pair A1,
Fig. 7 (a) Representative registration results using MI and original pixel values with multiple scales (left to right, reference image A3 and aligned A1 and A4), and (b) SNSD on the edge map representation with multiple scales (left to right, reference image A1 and aligned A2 and T_A1).
Fig. 8 (a) Second set of images, denoted B1, B2, B3, and B4 (left to right), and (b) their color-coded edge map representations. (Color online only.)
A3 in Fig. 11, which corresponds to multiple types of variability (i.e., scale, rotation, translation, and temporal), compared to A1, A2 (Fig. 10), which includes translational variability only. In another series of experiments, we used the RC quantity to evaluate the accuracy of our registration algorithm applied to pairings (Nimagepairs = 15) drawn from the 17 test images in Table 1. The image pairs were determined after grouping the original data with respect to the image scenes and the types of variations (i.e., temporal, sensor, and viewpoint). The images in Table 1 are grouped according to the visual scene; for example, images with identifiers beginning with the letter A correspond to the same scene. A statistical evaluation of these results is displayed in the box plots of Fig. 12. The boxes denote the range of RC measurements with respect to the utilized cost function. The central line of each box plot denotes the median estimate, and the box boundaries correspond to the median absolute differences. The statistical outliers are symbolized by plus signs. The outlier detection criterion calculates the distance of each measurement from the median, divided by the standard deviation of the samples. Figure 12(a) corresponds to the original intensity-based image representation, whereas Fig. 12(b) shows results from the shape-based approach. It is worth noting that SACD was not included in the comparison of edge-based registration [Fig. 12(b)], because it is a similarity measure developed for intensity-based registration only, and SNSD was not included in Fig. 12(a) because it is an edge-based cost function. From this experiment, we also concluded that MI and SNSD are more efficient, especially when using the multiscale scheme; nevertheless,

Fig. 9 (a) Representative registration results using MI and original pixel values with multiple scales (left to right, reference image B1 and aligned B2 and B3), and (b) SNSD on the edge map representation with multiple scales (left to right, reference image B1 and aligned B2 and B4).

Fig. 10 Comparison of the objective function values (presented as normalized costs) for the registration of images A1 and A2 on the (a) original and (b) edge-based representation in the employed search space. The MI-based registration scheme in this example is better suited for optimization, because it has a smoother curve and fewer local minima, than SACD and, to a smaller extent, SNSD.
MI is computationally demanding and more sensitive to noise. We have also carried out experiments to compare the computational loads of SNSD and MI, and the average computational times from repeated experiments on our data sets, where Ntest = Nimagepairs × Ntrials = 15 × 10 = 150, are
Fig. 11 Comparison of the objective function values (presented as normalized costs) for the registration of images A1 and A3 on the (a) original and (b) edge-based representation in the employed search space. SNSD shows favorable optimization properties compared to MI in this example of increased complexity.
Fig. 12 Box plots of the registration consistency measure using (a) original pixel values and (b) the edge representation for the data set of Table 1. It becomes obvious that the multiscale schemes are characterized by better registration accuracy. In addition, the multiscale SNSD approach applied to the shape-based representation and the multiscale MI on original pixel intensities produce lower median values and reduced variability.
TSNSD = 67 ± 16 s and TMI = 217 ± 41 s on a PC platform with an Intel Xeon processor at 2.4 GHz and 3 GB of RAM, suggesting an approximately 224% improvement in computational speed. The reason for this substantial improvement is that SNSD involves computationally efficient arithmetic operations, whereas MI requires the calculation of the intensity histograms of the two compared images. The total execution time of this system may be significantly reduced by replacing the sophisticated GA optimization algorithm with a simpler one based on gradient descent or Marquardt-Levenberg, which would nevertheless compromise the convergence resilience to noise and discontinuities in the search space. The experimental results depicted in Figs. 10-12 indicate that the combined use of the edge-based shape representation and SNSD is capable of producing registration results as efficient as MI at a considerably lower computational cost. An additional advantage of this edge representation is that it captures the basic geometry of the image scene, alleviating the need for radiometric calibration of images coming from different sensors. Another interesting outcome is that the global parametrization of the transformation, in conjunction with the edge representation, leads to efficient registration of temporal discrepancies that include terrain, shadow, and viewpoint variations. Nevertheless, it should be noted that the edge detection parameters might need to be calibrated according to the application domain, because the accuracy of edge estimation is crucial for the final registration results, especially at the coarser scales. Overall, these experimental comparisons suggest that SNSD represents a reliable registration cost function with feature-based properties that can be readily employed in a hierarchical scheme.

3.3 Search Algorithm

As stated previously, when no ground-truth data are available, the registration process is further complicated. In this work, to facilitate the search optimization, an approximate estimate of the ground truth was obtained by means of ground-control-point registration (i.e., manual selection of point correspondences on the reference and input data). The search space was defined by a percentage of variations
Fig. 13 GA convergence estimated by the Bias measure for the image pairs (a) A1, A2 and (b) B1, B2, using different settings of image representation and image similarity measures. We observe that the edge-based representation leads to faster convergence of the optimization algorithm than the intensity-based one.
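For reference, the Bias convergence measure plotted here can be computed over a bit-string population as the average, across bit positions, of the population frequency of each position's dominant binary value, so it approaches 1.0 as the GA converges (a sketch; the function name is ours):

```python
def bias(population):
    """Average frequency of the most prominent value ('0' or '1') per bit
    position; rises toward 1.0 as the GA population converges."""
    n = len(population)
    length = len(population[0])
    total = 0.0
    for pos in range(length):
        ones = sum(gene[pos] == '1' for gene in population)
        total += max(ones, n - ones) / n
    return total / length
```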
Fig. 14 Optimization values (SACD function) for image pair A1, A2, using (a) single-scale and (b) multiscale (four scales) structure for the same number of total trials (registration cost evaluations). The multiscale strategy shows improved optimization capability.
around the approximate ground truth and was initially set to 30% of the overall search space in order to test the optimization capabilities. Considering the GA implementation, the population consists of 120 chromosomes. Efficient registration results were also produced when the range was raised to 50% by increasing the generation size. If each parameter is encoded using 9 bits, as was typically the case, then the length of the total bit string is 4 × 9 = 36 bits. The crossover and mutation operations are applied iteratively with the corresponding probabilities Pc and Pm. After extensive experimentation, it was concluded that faster convergence is reached for Pc = 0.9 and Pm = 0.05. The proposed method regularly converged to the final solution within 5000 trials using elite search, which preserves and passes the best-matching gene to the next generation. The convergence rate of the GAs may be evaluated by the Bias measure, which indicates the average probability of each bit position taking its most prominent binary value. For example, a Bias of 0.8 means that, on average, each position takes the value "0" or "1" with probability 0.8. Figure 13 displays the graphs of Bias for two different image pairs, using the SNSD, MI, and SACD measures combined with the original or edge map representation. One should note that because GAs seek maximization of the fitness function, we use the reciprocals of SACD and SNSD, whereas MI does not need to be reciprocated in the optimization step. From Fig. 13, it is concluded that the edge representation is more efficient in terms of convergence rate than the original intensity values.

3.4 Single versus Multiple Scale Processing

As expected, the use of visual information from multiple scales enhances the registration accuracy and reduces the computational load.
Table 2 displays the final optimization values as a function of the number of scales in the hierarchical scheme and the total number of trials in the GA optimization. From Table 2, it is concluded that as the number of scales grows, more accurate registration results are produced. It is interesting to note that even for a relatively small number of trials (around 2000, which corresponds to 28 generations), efficient registration results may be produced by the multiscale methodology. Moreover, Fig. 14 displays the convergence of the GAs for single-scale and multiscale registration. A total of 20,000 trials is used
for the single-scale version, while the multiscale version employs four scales with 5000 trials each, to carry out a fair comparison. It is obvious that the multiscale version yields better optimization results using the SACD function, and similar results were also observed for SNSD and MI. Considering the execution time, the single-scale version takes approximately 273 ± 52 s, while the multiscale scheme takes 67 ± 16 s, leading to a significant improvement in this regard, too. Similar conclusions are drawn from the other images of the employed data set. On the other hand, this pyramidal scheme is dependent on the initial estimation of the affine parameters at the coarsest scale, which may not include some meaningful visual detail. As a result, the number of generated scales should be determined from the spatial resolution and level of detail in the acquired images. It is worth noting that the presented scheme shows subpixel capabilities in many cases, which is quite encouraging considering the type and extent of variations that are present in the employed test sets. The edge-based shape representation is more effective for addressing sensor and temporal variability, and the registration accuracy is further enhanced by the introduction of multiscale processing.

4 Conclusions and Future Research
A linear registration scheme is proposed in this work and tested on aerial imaging data. The registration problem in this case comprises combined viewpoint, multitemporal, and multisensor registration that includes local variations. We propose to address this problem by the use of a multivalued edge representation based on nonparametric probability density estimation, coupled with an adaptive feature-weighting objective function denoted SNSD. To the best of our knowledge, this pairing of a multivalued stochastic edge estimator with a feature-weighting function has not been reported previously for image registration applications. Different data representations (intensity, edge-based) and similarity measures (MI, SACD) were implemented and compared in a registration test bed that uses GA optimization in a hierarchical fashion. From these experiments, it was concluded that the proposed elements of this work produce encouraging results at a moderate computational cost. It is also worth noting that the proposed methodology has been successfully tested for single- and multimodal rigid registration of biomedical brain images by further delimiting the range of the search space in the optimizer input.10,32 Our future goals include the use and comparison of spatial graph structures and statistical tests to estimate the image similarity, the consideration of different multiscale representation schemes (scale-space, feature domain), and the development of a change detection approach based on this scheme.

Table 2 Optimization results (using SACD) with respect to the number of scales in the hierarchical structure and the number of trials in the GA for image pairs A1-A2, B1-B2, and E1-E2.

A1,A2                     Number of scales
Number of trials      4        3        2        1
6000                  961.0    959.2    954.1    943.5
4000                  960.8    957.8    953.4    938.9
2000                  959.2    952.0    944.5    928.9

B1,B2                     Number of scales
Number of trials      4        3        2        1
6000                  970.6    970.5    970.5    967.8
4000                  970.7    970.3    969.6    967.5
2000                  970.6    970.5    965.6    964.1

E1,E2                     Number of scales
Number of trials      4        3        2        1
6000                  951.7    950.5    934.4    912.9
4000                  952.4    951.1    952.0    910.6
2000                  899.9    944.4    943.4    906.6

References
1. M. Holden, D. L. G. Hill, E. R. E. Denton, J. M. Jarosz, T. C. S. Cox, T. Rohlfing, J. Goodey, and D. J. Hawkes, "Voxel similarity measures for 3-D serial MR brain image registration," IEEE Trans. Med. Imaging 19, 94-102 (2000).
2. F. Maes, A. Collignon, D. Vandermeulen, G. Marchal, and P. Suetens, "Multimodality image registration by maximization of mutual information," IEEE Trans. Med. Imaging 16, 187-198 (1997).
3. D. Shen and C. Davatzikos, "HAMMER: hierarchical attribute matching mechanism for elastic registration," IEEE Trans. Med. Imaging 21(11), 1421-1439 (2002).
4. Y. Bentoutou, N. Taleb, K. Kpalma, and J. Ronsin, "An automatic image registration for applications in remote sensing," IEEE Trans. Geosci. Remote Sens. 43(9), 2127-2137 (2005).
5. D. Liu, P. Gong, M. Kelly, and Q. Guo, "Automatic registration of airborne images with complex local distortion," Photogramm. Eng. Remote Sens. 72(9), 1049-1060 (2006).
6. L. Shapiro and G. C. Stockman, Computer Vision, Chap. 11, Prentice Hall (2001).
7. L.-K. Shark, A. A. Kurekin, and B. J. Matuszewski, "Development and evaluation of fast branch-and-bound algorithm for feature matching based on line segments," Pattern Recogn. 40(5), 1432-1450 (2007).
8. S.-D. Wei and S.-H. Lai, "Robust and efficient image alignment based on relative gradient matching," IEEE Trans. Image Process. 15(10), 2936-2943 (2006).
9. B. D. Lucas and T. Kanade, "An iterative image registration technique with an application to stereo vision," in Proc. of 7th Int. Joint Conf. on Artificial Intelligence, Vancouver, pp. 674-679 (1981).
10. X. Dai and S. Khorram, "The effects of image misregistration on the accuracy of remotely sensed change detection," IEEE Trans. Geosci. Remote Sens. 36(5), 1566-1577 (1998).
11. N. Rowe and L. Grewe, "Change detection for linear features in aerial photographs using edge-finding," IEEE Trans. Geosci. Remote Sens. 39(7), 1608-1612 (2001).
12. J. R. G. Townshend, C. O. Justice, C. Gurtney, and J. McManus, "The impact of misregistration on change detection," IEEE Trans. Geosci. Remote Sens. 30(5), 1054-1060 (1992).
13. L. G. Brown, "A survey of image registration techniques," ACM Comput. Surv. 24(4), 325-376 (1992).
14. T. Kim and Y.-J. Im, "Automatic satellite image registration by combination of matching and random sample consensus," IEEE Trans. Geosci. Remote Sens. 41(5), 1111-1117 (2003).
15. Q. Zheng and R. Chellappa, "A computational vision approach to image registration," IEEE Trans. Image Process. 2(3), 311-326 (1993).
16. G. Hermosillo, C. Chefd'Hotel, and O. Faugeras, "Variational methods for multimodal image matching," Int. J. Comput. Vis. 50(3), 329-343 (2002).
17. M. Irani and P. Anandan, "Robust multi-sensor image alignment," in Proc. of Int. Conf. on Computer Vision, pp. 959-966 (1998).
18. Y. Keller and A. Averbuch, "Multisensor image registration via implicit similarity," IEEE Trans. Pattern Anal. Mach. Intell. 28(5), 794-801 (2006).
19. H.-M. Chen, P. K. Varshney, and M. K. Arora, "Performance of mutual information similarity measure for registration of multitemporal remote sensing images," IEEE Trans. Geosci. Remote Sens. 41(11), 2445-2454 (2003).
20. V. Murino, U. Castellani, A. Etrari, and A. Fusiello, "Registration of very time-distant aerial images," in Proc. of Int. Conf. on Image Processing (ICIP 2002), Vol. 3, pp. 989-992 (2002).
21. R. Brooks, T. Arbel, and D. Precup, "Anytime similarity measures for faster alignment," Comput. Vis. Image Underst. 110(3), 378-389 (2008).
22. B. K. P. Horn, Robot Vision, MIT Press, Cambridge, MA (1986).
23. C. Cain, M. M. Hayat, and E. E. Armstrong, "Projection-based image registration in the presence of fixed-pattern noise," IEEE Trans. Image Process. 10(12), 1860-1872 (2001).
24. C. D. Kuglin and D. C. Hines, "The phase correlation image alignment method," in Proc. of IEEE Conf. on Cybernetics and Society, pp. 163-165 (1975).
25. B. S. Reddy and B. N. Chatterji, "An FFT-based technique for translation, rotation and scale-invariant image registration," IEEE Trans. Image Process. 5(8), 1266-1271 (1996).
26. H. Stone, M. Orchard, E.-C. Chang, and S. Martucci, "A fast direct Fourier-based algorithm for subpixel registration of images," IEEE Trans. Geosci. Remote Sens. 39(10), 2235-2243 (2001).
27. J. W. Hsieh, H. Y. M. Liao, K. C. Fan, M. T. Ko, and Y. P. Hung, "Image registration using a new edge-based approach," Comput. Vis. Image Underst. 67(2), 112-130 (1997).
28. M. D. Pritt, "Automated subpixel image registration of remotely sensed imagery," IBM J. Res. Dev. 38(2), 157-166 (1994).
29. B. Ma, A. Hero, J. Gorman, and O. Michel, "Image registration with minimum spanning tree," in Proc. of IEEE Int. Conf. on Image Processing (ICIP 2000), Vol. 1, pp. 481-484 (2000).
30. M. R. Sabuncu and P. Ramadge, "Using spanning graphs for efficient image registration," IEEE Trans. Image Process. 17(5), 788-797 (2008).
31. M. I. Miller, "Computational anatomy: shape, growth, and atrophy comparisons via diffeomorphisms," Neuroimage 23(1), S19-S33 (2004).
32. J. Shi and C. Tomasi, "Good features to track," in Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR 1994), pp. 593-600 (1994).
33. G. E. Christensen, R. D. Rabbitt, and M. I. Miller, "Deformable templates using large deformation kinematics," IEEE Trans. Image Process. 5(10), 1435-1447 (1996).
34. P. Thevenaz, U. E. Ruttimann, and M. Unser, "A pyramid approach to subpixel registration based on intensity," IEEE Trans. Image Process. 7(1), 27-41 (1998).
35. Y. Chen, R. R. Brooks, S. S. Iyengar, N. S. V. Rao, and J. Barhen, "Efficient global optimization for image registration," IEEE Trans. Knowl. Data Eng. 14(1), 79-92 (2002).
36. P. Chalermwat and T. El-Ghazawi, "Multi-resolution image registration using genetics," in Proc. of IEEE Int. Conf. on Image Processing (ICIP 1999), Vol. 2, Kobe, Japan, pp. 452-456 (1999).
37. S. Makrogiannis, N. G. Bourbakis, and S. Borek, "A stochastic optimization scheme for automatic registration of aerial images," in Proc. of IEEE Int. Conf. on Tools with Artificial Intelligence (ICTAI 2004), pp. 328-336 (2004).
38. D. Robinson and P. Milanfar, "Fast local and global projection-based methods for affine motion estimation," J. Math. Imaging Vision 18, 35-54 (2003).
39. S. Fotopoulos, A. Fotinos, and S. Makrogiannis, "Fuzzy rule-based color filtering using statistical indices," in Fuzzy Filters for Image Processing, pp. 72-97, Springer-Verlag, Heidelberg (2003).
40. E. Parzen, "On estimation of a probability density function and mode," Ann. Math. Stat. 33, 1065-1076 (1962).
41. D. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, New York (1989).
42. S. Makrogiannis, G. Economou, and S. Fotopoulos, "A fuzzy dissimilarity function for region based segmentation of color images," Int. J. Pattern Recognit. Artif. Intell. 15(2), 255-267 (2001).
43. J. L. Barron and M. Khurana, "Determining optical flow for large motions using parametric models in a hierarchical framework," in Proc. of Vision Interface (VI '97), Kelowna, BC, pp. 47-56 (1997).

Sokratis Makrogiannis received his PhD in 2002 from the University of Patras, Greece. His research interests are mainly related to image analysis and computer vision. He has authored several journal and conference papers in these areas. Dr. Makrogiannis is an image processing developer-consultant at the Visualization and Computer Vision Laboratory at General Electric's Global Research Center in Niskayuna, New York.

Nikolaos G. Bourbakis received his PhD from the University of Patras, Greece, in 1983. His research interests are in assistive technology, information security, intelligent systems and robotics, computer vision, and wearable systems. He has published in international refereed journals and conference proceedings. Dr. Bourbakis is currently an OBR Distinguished Professor of IT and the director of the Assistive Technology Research Center (ATRC) at the College of Engineering and Computer Science at Wright State University, Ohio.