
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 48, NO. 6, JUNE 2010

A Spatial-Feature-Enhanced MMI Algorithm for Multimodal Airborne Image Registration

Xiaofeng Fan, Member, IEEE, Harvey Rhody, Life Member, IEEE, and Eli Saber, Senior Member, IEEE

Abstract—Maximization of mutual information (MMI) is widely used in multimodal image registration. The conventional MMI algorithm is based on statistics derived from image sampling and does not use spatial features. In this paper, we propose an improved MMI registration algorithm that incorporates spatial information through a feature-based selection mechanism. A Harris corner filter is chosen to classify the pixels. The classification results are used to focus sample selection, joint entropy estimation, and weighted mutual information (MI) calculation. Although features play an important role, conventional feature matching is not used, which removes an important challenge to the use of feature-based information. The advantages of the proposed algorithm are improved robustness through the incorporation of spatial features and a significant reduction in computing time for registration. Experiments with challenging synthetic images and multimodal airborne infrared images are provided to demonstrate the improvement.

Index Terms—Harris corner detector, maximization of mutual information (MMI), multimodal image registration.

I. INTRODUCTION

MULTIMODAL image registration is a challenging problem for satellites, aircraft, and other surveillance systems. The sensors typically run at various capture rates, possess significant differences in resolution, and are not cross-registered spatially. Registration techniques can be divided into two categories: feature-based and area-based. The former often employ four steps: feature detection, feature matching, mapping, and resampling [1]. They can generally be classified as point-based [2]–[4], line-based [5], [6], and region-based [7]–[9]. Conventional feature-matching techniques often require manual intervention to improve accuracy and efficiency, and they often fail because of the disparity in capture rates, resolution, and spectral properties [10], [11]. Area-based methods [12]–[14] that are based on correlation [15] have been applied for registering images. These are often ineffective in handling multimodal images due to the different intensity variations in the different modalities.

Manuscript received October 23, 2008; revised July 15, 2009. Date of publication March 4, 2010; date of current version May 19, 2010. This work was supported by the Wildfire Airborne Sensor Program (WASP) Project team, funded by the National Aeronautics and Space Administration under Contracts NAG13-02051 and NAG13-03026. X. Fan and H. Rhody are with the Department of Image Science, Rochester Institute of Technology, Rochester, NY 14623 USA (e-mail: [email protected]; [email protected]). E. Saber is with the Department of Electrical Engineering, Rochester Institute of Technology, Rochester, NY 14623 USA (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TGRS.2010.2040390

The maximization of mutual information (MMI) technique introduced by Viola [16] overcomes many of the problems with the registration of multimodal imagery. The MMI algorithm was originally applied to challenging medical imaging situations [17], [18] and has been extended to remote sensing by several investigators [19], [20]. However, it only takes statistical intensity information into account and does not employ spatial information that may be available in the form of image features [21]. Recent research has focused on incorporating spatial information into the MMI algorithm. Butz and Thiran [22] applied MI to edge measures defined by different edge operators. However, the attraction range is narrow, which increases the difficulty of the optimization procedure [1]. Pluim et al. [23] proposed the inclusion of spatial information by multiplying the MI measure by an external local gradient term. Holden et al. [24] registered two images by maximizing the multidimensional MI of the corresponding features. Gan and Chung [21] proposed a spatial feature, the maximum distance-gradient magnitude. Their technique requires MMI in 4-D and is computationally intensive. These conventional approaches calculate MI directly from raw pixel values and are sensitive to image noise that can sometimes modify the histogram data used for registration. Guo et al. [25] used MI to select the valid bands from hyperspectral images for classification and registration. Cariou and Chehdi [26] studied the application of MI to the automatic georeferencing of airborne pushbroom scanner images with missing ancillary data.

Although MMI is a powerful method for multimodal image registration, there are still some problems for infrared (IR) image registration. 1) The robustness is questionable. Our experiments show that it often fails in the registration of IR images due to the flat search surface. 2) Remote sensing applications usually produce a huge volume of raw data from airborne or satellite images. The associated efficiency requirement is difficult to meet with MMI due to its computational intensity. These two problems are obstacles to the application of the conventional MMI algorithm to remote sensing registration. The objective of this paper is to overcome those shortcomings and extend the MMI algorithm to a robust algorithm for multiband image registration. Recognizing the importance of bridging the spatial and statistical information, this paper proposes a different approach that enhances MMI by combining it with feature-based methods. Our algorithm uses feature information to focus on pixels, such as those on edges or corners, that are rich in

0196-2892/$26.00 © 2010 IEEE

FAN et al.: SPATIAL-FEATURE-ENHANCED MMI ALGORITHM


information and are therefore more important for registration. In contrast with other approaches to improve MMI, e.g., [21] and [23], this technique eliminates the necessity of incorporating feature descriptions or parameters into the MMI process and actually enables the process to be accelerated. We use a Harris corner detector to locate these information-rich pixels by classifying the image into edges, corners, and uniform regions. However, the feature extraction is not limited to the Harris corner algorithm. Other similar feature detectors could be used as well, making the algorithm easy to adapt to different applications. Another novel characteristic of the proposed algorithm is that the MI calculation may be applied to Harris corner label (HCL) classification images, rather than to the original images. The experiments also show that the MMI calculation based on the Harris corner classification result is more robust, more stable, and faster than one using the original image. This is due to a number of factors, which include the use of sample values that are determined by the classification process. This tends to reduce the noise through the associated filtering process. We have experimented with two search methods for optimization, namely, exhaustive search and gradient search, and have found that the feature-enhanced process is beneficial in both cases. An optional step in the process optimization can utilize a wavelet pyramid to achieve coarse-to-fine processing to further improve the algorithm efficiency [27]. This is possible because the feature information is equally applicable to pixel selection at all pyramid levels.

The remainder of this paper is organized as follows: Section II provides background on MMI and its application to remote sensing image registration. Section III describes our HCL-enhanced MMI algorithm and briefly analyzes its performance. Section IV covers the two different search methods: exhaustive search and gradient search.
Section V gives the introduction to Wildfire Airborne Sensor Program (WASP) imagery and the wavelet-pyramid-enhanced algorithm [28] used in registration of IR airborne imaging. Experimental results and analysis are presented in Section VI.

II. MI ALGORITHM

Viola introduced the MMI image registration algorithm for multimodal image applications [16]. It has been widely used in the medical imaging community and has been applied in remote sensing as well [29]. This section sketches the main elements of the MMI algorithm. Let $A$ be a discrete random variable with values $a_i$, $i = 1, 2, \ldots, m$, and probability $P_A(a_i)$. The entropy of $A$ is defined as

$$E(A) = -\sum_{i=1}^{m} P_A(a_i) \log_2 P_A(a_i). \tag{1}$$

Let $B$ be a second discrete random variable with values $b_j$, $j = 1, 2, \ldots, n$. Its entropy $E(B)$ is calculated in the same manner. The joint entropy $E(A, B)$ is

$$E(A, B) = -\sum_{i=1}^{m} \sum_{j=1}^{n} P_{AB}(a_i, b_j) \log_2 P_{AB}(a_i, b_j). \tag{2}$$

If $A$ and $B$ are statistically independent, then $P_{AB}(a_i, b_j) = P_A(a_i) P_B(b_j)$ for all $(i, j)$, and it follows immediately that

$$E(A, B) = E(A) + E(B). \tag{3}$$

It is readily shown that the maximum joint entropy is reached with statistically independent random variables. The MI is the difference between the maximum possible and the actual joint entropy, i.e.,

$$I(A, B) = E(A) + E(B) - E(A, B). \tag{4}$$

The MI can be scaled by dividing by the sum of the entropies of $A$ and $B$:

$$R(A, B) = \frac{I(A, B)}{E(A) + E(B)}. \tag{5}$$
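The quantities in (1)–(5) can be estimated from a normalized joint histogram. The following is a minimal sketch, assuming NumPy and equally sized sample arrays; the function names and the default bin count are illustrative choices and are not taken from the paper.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability array of any shape, as in (1) and (2)."""
    p = p[p > 0]  # terms with zero probability contribute nothing
    return float(-np.sum(p * np.log2(p)))

def mutual_information(a, b, bins=16):
    """Return (I, R) as in (4) and (5) for two equally sized sample arrays.

    The bin count is an assumed parameter, not a value from the paper.
    """
    h_ab, _, _ = np.histogram2d(np.ravel(a), np.ravel(b), bins=bins)
    p_ab = h_ab / h_ab.sum()   # normalized joint histogram approximates P_AB
    p_a = p_ab.sum(axis=1)     # marginal approximating P_A
    p_b = p_ab.sum(axis=0)     # marginal approximating P_B
    e_a, e_b, e_ab = entropy(p_a), entropy(p_b), entropy(p_ab)
    i_ab = e_a + e_b - e_ab
    r_ab = i_ab / (e_a + e_b) if (e_a + e_b) > 0 else 0.0
    return i_ab, r_ab
```

For two identical sample sets the estimate of I equals the entropy of either set, and for independent sets it is close to zero, which matches (3) and (4).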

In our application, the random variables $A$ and $B$ correspond to the brightness values of the two images that are to be registered. Let $X = [x_1, x_2, \ldots, x_n]$ represent a set of pixel locations in image $A$, and let $A(x_i)$ denote the brightness value at position $x_i$. The samples can be used to construct a histogram $h_A(A(X))$ and to estimate the entropy $E(A)$. We can use the same approach to estimate $E(B)$. However, as will be seen later, these are never actually needed in the matching algorithm. Let $T$ be a spatial transform, and let $Y = T(X)$ be a set of locations in image $B$. Then, $B(Y) = B(T(X))$ is a set of samples of $B$, and $[A(X), B(T(X))]$ is a set of sample pairs from images $A$ and $B$ associated with the transform $T$. Our goal is to find the transform that provides the best spatial match of the images. Conventional MMI searches for the transform that provides the maximum value of the MI between the sample sets. An estimate $\hat{I}(A, B)$ of the MI can be computed by finding the histogram $h_{AB}(A(X), B(T(X)))$, which, after normalization, is an approximation to $P_{AB}$. The transformation can be estimated by $\hat{T} = \arg\max_T \hat{I}(A(X), B(T(X)))$ or $\hat{T} = \arg\max_T \hat{R}(A(X), B(T(X)))$.

For a fixed set $X$ of samples, $E(A(X))$ will be constant and will therefore not be of interest in searching for a maximum. Although the sample set $Y = T(X)$ will vary with the choice of $T$, it is reasonable to assume that the statistics of the image samples $B(Y)$ will not change in a systematic or significant manner during the search. If this turns out to be untrue, then it is a simple matter to calculate $h_B(B(Y))$ and include the estimate of $E(B(Y))$ in the calculation. If $E(B(Y))$ remains constant in the application or the MI is normalized as in (5), $E(B(Y))$ can generally be omitted to reduce the number of calculations. Then, maximizing $I(A(X), B(T(X)))$ is equivalent to minimizing $E(A(X), B(T(X)))$ over the range of parameters in the transformation $T$.

The aforementioned estimates are based entirely upon the histogram, which, in turn, is determined by the set $X$ and the mapping $T$. Therefore, we are led to consider ways to choose a good sample set. Let us consider which qualities of the sample set are important. Let $T^*$ be the transformation that maximizes $I$, and let $\hat{T}$ be the transformation that maximizes $\hat{I}$. Since the goal is to find $T^*$, we want to choose the samples such that $\hat{T} = T^*$ and ensure that the value of $\hat{I}$ is sensitive to variations in $T$. Note that we are not concerned with making an



Fig. 1. Flowchart for the HCL-based MI registration.

accurate estimate of the MI. The primary requirement is to have the information peak produced by the estimate have the same parameter-space location as the true peak. These considerations lead us to select locations for the set X that fall on or near the edges and corners in image A. This will provide a histogram that is skewed away from the true probability distribution, but it will tend to increase the sensitivity to parameter variation. We will return to this topic later.

Fig. 2. WASP airborne image and the corresponding HCL mapping.

III. FEATURE-ENHANCED MMI ALGORITHM

The proposed technique encompasses a set of related algorithms for multimodal image registration. It includes two basic steps: The first extracts spatial features, and the second uses them to complete the MMI registration. The algorithm flowchart is shown in Fig. 1. In our implementation, we use the Harris corner detector to extract the feature information because it is fast and is invariant to translation and rotation. However, it can be replaced by other corner detectors, such as the Kanade–Lucas–Tomasi Feature Tracker detector [30], which might be a good match for a feature tracking application.

The robustness of the MMI algorithm for multimodal image registration is based on the statistical relationships between corresponding sets of pixels. However, such a relationship is likely to be affected by the sensitivity variation of the IR sensors, which sometimes causes the performance of the conventional MMI algorithm to be unstable. On the other hand, some spatial features, such as edges and corners, are richer and more constant in information. The key point of the proposed algorithm is to utilize a focus-of-attention mechanism to improve the robustness of MMI. The Harris filter conveniently provides a label map with three levels, [0, 1, 2], which are associated with background (no corner or edge), edge, and corner, respectively, at each pixel location. A similar classification can be done with a variety of detectors, and it is likely that other detectors would give similar results. The three-level label map can be used in place of an image for the purpose of matching. With suitable Harris threshold settings, it tends to provide a less noisy parameter search surface than is available with the original images.

A. Harris Corner Detectors

The Harris corner detector is a popular interest-point detector due to its strong invariance to translation, rotation, and illumination variation and its relative tolerance of image noise [31]. It utilizes the average squared gradient $c(x, y)$ computed over a small region $w$ as follows:

$$c(x, y) = \sum_{(x_i, y_i) \in N(x, y)} \left[ S(x_i, y_i) - S(x_i + \Delta x, y_i + \Delta y) \right]^2 \tag{6}$$

where $(x, y)$ is the pixel position and $(x_i, y_i)$ are the neighboring pixels located in the window centered on $(x, y)$. $S(x, y)$ represents the intensity of the corresponding pixel. This can be expressed in matrix form as $c(x, y) \approx d^T M(x, y) d$, where $d = [\Delta x, \Delta y]^T$ and

$$M = \begin{bmatrix} \sum_{w} \left( S_x(x_i, y_i) \right)^2 & \sum_{w} S_x(x_i, y_i) S_y(x_i, y_i) \\ \sum_{w} S_y(x_i, y_i) S_x(x_i, y_i) & \sum_{w} \left( S_y(x_i, y_i) \right)^2 \end{bmatrix} \tag{7}$$

where $S_x(x_i, y_i) = S(x_i, y_i) - S(x_i + \Delta x, y_i)$ and $S_y(x_i, y_i) = S(x_i, y_i) - S(x_i, y_i + \Delta y)$ are gradient measures. Based on the relationship between the eigenvalues of the matrix $M(x, y)$, the pixels are classified into three categories [0, 1, 2]. Our algorithm uses the label information to identify pixels located on edges and corners and gives them priority in choosing the sample locations. This has the effect of incorporating spatial information into the selection of data for the MMI calculations while avoiding the difficulty of maintaining an explicit description of spatial data. The labels can be used to guide the sampling of either the image data or a filtered image. We have found that the label image itself can be used as a surrogate for the original image in the registration process (see Fig. 2). This reduces the range of histogram values and the sensitivity to image noise.
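The three-level labeling can be illustrated with a short sketch. The paper does not give its eigenvalue rule or thresholds, so the version below makes loud assumptions: gradients come from `np.gradient`, the window is a square box of hypothetical radius, and a pixel is called an edge when the larger eigenvalue of M exceeds a fraction of the image-wide maximum and a corner when the smaller eigenvalue does. All function names and threshold values are illustrative only.

```python
import numpy as np

def box_sum(img, radius):
    """Sum of img over a (2*radius+1)^2 window (edge-padded), via an integral image."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    ii = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    return ii[k:, k:] - ii[:-k, k:] - ii[k:, :-k] + ii[:-k, :-k]

def harris_labels(img, radius=2, edge_thresh=0.05, corner_thresh=0.05):
    """Label map with 0 = background, 1 = edge, 2 = corner.

    The thresholds are assumed fractions of the largest eigenvalue in the image;
    they are not the settings used in the paper.
    """
    img = img.astype(float)
    sy, sx = np.gradient(img)          # derivatives along rows (y) and columns (x)
    a = box_sum(sx * sx, radius)       # windowed sum of S_x^2
    b = box_sum(sx * sy, radius)       # windowed sum of S_x S_y
    c = box_sum(sy * sy, radius)       # windowed sum of S_y^2
    half_trace = 0.5 * (a + c)
    root = np.sqrt((0.5 * (a - c)) ** 2 + b ** 2)
    lam_max = half_trace + root        # larger eigenvalue of [[a, b], [b, c]]
    lam_min = half_trace - root        # smaller eigenvalue
    labels = np.zeros(img.shape, dtype=np.uint8)
    scale = lam_max.max()
    if scale > 0:
        labels[lam_max > edge_thresh * scale] = 1    # one strong direction: edge
        labels[lam_min > corner_thresh * scale] = 2  # two strong directions: corner
    return labels

def feature_sample_locations(labels):
    """(row, col) coordinates of edge and corner pixels, usable as the sample set X."""
    return np.argwhere(labels > 0)
```

Restricting the sample set to the coordinates returned by `feature_sample_locations` is one simple way to give edge and corner pixels priority; other weighting schemes would fit the description equally well.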



Fig. 3. Illustration for (3). x- and y-axes are the shift parameters. (a) Normalized 4-D high-dimensional MI. (b) Normalized MI between the HCL mapping. (c) Normalized conditional entropy of pixel intensity.

TABLE I COMPARISON OF MI

B. HCL Directed MMI

In this section, we discuss the relationship between the traditional MMI method and the label-directed MMI algorithm. Suppose that $S^r$ and $S^f$ are the intensity domains for the reference and floating images, respectively, and $H^r$ and $H^f$ are their corresponding HCL mappings. The high-dimensional MI can be constructed by integrating the HCL:

$$I(S^f, H^f, S^r, H^r) = \sum_{S^f, H^f, S^r, H^r} p(S^f, H^f, S^r, H^r) \log_2 \frac{p(S^f, H^f, S^r, H^r)}{p(S^f, H^f)\, p(S^r, H^r)}. \tag{8}$$

The probability distribution $p(S^f, H^f, S^r, H^r)$ over the sampling set can be approximated by the normalized 4-D joint histogram $h(S^f, H^f, S^r, H^r)$:

$$p(S^f, H^f, S^r, H^r) = \frac{h(S^f, H^f, S^r, H^r)}{\sum_{S^f, H^f, S^r, H^r} h(S^f, H^f, S^r, H^r)}. \tag{9}$$

A similar approximation can be made for $p(S^f, H^f)$ and $p(S^r, H^r)$, i.e.,

$$p(S^f, H^f) = \frac{h(S^f, H^f)}{\sum_{S^f, H^f} h(S^f, H^f)}, \qquad p(S^r, H^r) = \frac{h(S^r, H^r)}{\sum_{S^r, H^r} h(S^r, H^r)}.$$
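The histogram estimates above can be sketched in a few lines. This is a minimal illustration, assuming NumPy and assuming the intensities have already been quantized to integer levels in `[0, n_levels)` and the labels to `[0, n_labels)`; the function names and the quantization step are assumptions, not part of the paper.

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy in bits of a probability array of any shape."""
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def joint_probability_4d(s_f, h_f, s_r, h_r, n_levels, n_labels=3):
    """Normalized 4-D joint histogram over (S^f, H^f, S^r, H^r), as in (9).

    Inputs are equally sized integer arrays of paired samples.
    """
    shape = (n_levels, n_labels, n_levels, n_labels)
    idx = tuple(np.ravel(v).astype(np.intp) for v in (s_f, h_f, s_r, h_r))
    flat = np.ravel_multi_index(idx, shape)
    hist = np.bincount(flat, minlength=int(np.prod(shape))).reshape(shape)
    return hist / hist.sum()

def mi_4d(p):
    """MI of (8), using the identity I = E(S^f,H^f) + E(S^r,H^r) - E(S^f,H^f,S^r,H^r)."""
    p_f = p.sum(axis=(2, 3))  # marginal p(S^f, H^f)
    p_r = p.sum(axis=(0, 1))  # marginal p(S^r, H^r)
    return entropy_bits(p_f) + entropy_bits(p_r) - entropy_bits(p)
```

The histogram has `n_levels**2 * n_labels**2` cells, which makes the memory and time cost of the 4-D estimate easy to see for 256 intensity levels.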

The aforementioned equations provide a good estimate of the 4-D MI. However, one disadvantage of the 4-D MI is its computational intensity. To speed up the calculation, we express the 4-D MI as

$$\begin{aligned} I(S^f, H^f, S^r, H^r) &= \sum_{S^f, H^f, S^r, H^r} p(S^f, H^f, S^r, H^r) \log_2 \frac{p(S^f, H^f, S^r, H^r)}{p(S^f, H^f)\, p(S^r, H^r)} \\ &= E(H^f) + E(S^f | H^f) + E(H^r) + E(S^r | H^r) - \left[ E(H^f, H^r) + E(S^f, S^r | H^f, H^r) \right] \\ &= MI(H^f, H^r) + O(S^f, H^f, S^r, H^r) \end{aligned} \tag{10}$$

where $O(S^f, H^f, S^r, H^r) = E(S^f | H^f) + E(S^r | H^r) - E(S^f, S^r | H^f, H^r)$. The derivation shows that the 4-D MI is composed of two items: the MI between the HCL mappings and the remaining terms. Our experiments show that the MMI of the HCL is dominant (Fig. 3) and that the HCL maps provide effective image registration with MMI. The sampling is directed by the HCLs. The use of the HCL maps for registration reduces the number of histogram bins that are required, which can speed up the MMI calculation significantly.

To compare the sensitivity of differently scaled MI measures, we can use the kurtosis to measure the sharpness of each search surface. The sharper the search surface is, the higher is the kurtosis [32]. Table I shows that the kurtosis of the MI between the HCL maps is about double that of the original pixel intensities. This means that the MI between the HCL maps is more effective for registration, which appears as the sharper peak in the search surface [Fig. 3(b)]. Edge descriptors are an important component of human visual registration, and the HCL-enhanced MMI algorithm follows that idea. Note that in order to increase the capture range, we use the Harris corner detector to emphasize both edge and corner candidate pixels instead of only the corner pixels. This improves the registration's smoothness and makes the algorithm more robust.

IV. SEARCH METHOD

The use of MMI to find the parameters of the transformation that maximizes the MI can be approached in two ways: exhaustive search and gradient search. Both of them were tested with our algorithm.

A. Exhaustive Search

In exhaustive search, each of the transformation parameters is stepped across its range of possible values. This covers a parameter-space region with a resolution that depends on the step sizes:

$$\hat{T} = \arg\max_{T} I \tag{11}$$

where the maximization is over the parameter space of $T$.
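As an illustration of (11), the sketch below steps an integer translation over a square range and keeps the shift with the largest MI between two label maps. It is a simplified assumption-laden example: only translation is searched, both images have the same shape, pixel values are integers in `[0, n_levels)`, and all names are hypothetical.

```python
import numpy as np

def mi_bits(a, b, n_levels):
    """MI in bits between two equally shaped integer arrays with values in [0, n_levels)."""
    flat = np.ravel(a).astype(np.intp) * n_levels + np.ravel(b).astype(np.intp)
    p_ab = np.bincount(flat, minlength=n_levels * n_levels).reshape(n_levels, n_levels)
    p_ab = p_ab / p_ab.sum()

    def h(p):
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    return float(h(p_ab.sum(axis=1)) + h(p_ab.sum(axis=0)) - h(p_ab))

def exhaustive_shift_search(ref, flt, max_shift, n_levels=3):
    """Return (dy, dx, mi) maximizing MI between ref[y, x] and flt[y + dy, x + dx]."""
    rows, cols = ref.shape
    best = (0, 0, -np.inf)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Part of ref whose shifted position (y + dy, x + dx) lies inside flt.
            y0, y1 = max(0, -dy), min(rows, rows - dy)
            x0, x1 = max(0, -dx), min(cols, cols - dx)
            a = ref[y0:y1, x0:x1]
            b = flt[y0 + dy:y1 + dy, x0 + dx:x1 + dx]
            score = mi_bits(a, b, n_levels)
            if score > best[2]:
                best = (dy, dx, score)
    return best
```

The number of MI evaluations is `(2 * max_shift + 1) ** 2` for translation alone, so adding rotation or scale multiplies the cost by the number of steps in each extra parameter.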



The computational expense depends upon the volume of the search region in parameter space. External knowledge, such as data from the inertial measurement units, can be very helpful by limiting the search region. A variety of tactics, such as a pyramid search structure in parameter space, can be used to achieve the necessary level of precision while managing computing time. However, a full parameter-space search can be computationally expensive. To reduce the computing time, we can adjust the parameters to fit the application. A general coarse-to-fine algorithm can be constructed by standard pyramid techniques. This will be discussed next.

B. Gradient Search

Gradient search is an attempt to reduce the computation required by exhaustive search. It depends upon an estimate of the gradient of the MI with respect to the transformation parameters. The Parzen window method was used to build a parametric estimate of the probability distribution from which the derivative of the MI with respect to the various search parameters can be estimated. The detailed derivation is described in [16]. The parameters are updated at each step of the gradient search by $T = T - \lambda (dI/dT)$. The process is iterated until a maximum of the MI is located. Our experiments show that the gradient search is more efficient than the exhaustive search. However, the tradeoff is that the gradient search is sensitive to noise and can be trapped in a local minimum. The compromise is to apply exhaustive search in the high layer of the multiresolution pyramid to provide the initial point and then use gradient search around the initial point in the corresponding lower layer.

V. IMPLEMENTATION AND RESULTS

In this section, we will describe an experiment that challenges the robustness of the registration algorithm and presents a case where the MMI algorithm fails but the HCL-MMI algorithm succeeds. We will then describe an experiment that utilizes real-world multimodal imagery gathered with the four-camera WASP sensor. Finally, we will present an analysis of the relative computational effort of the HCL-MMI and traditional MMI algorithms. To reduce the risk of being trapped in a local minimum, the exhaustive search method was used in the following experiments.

A. Experiment to Test Robustness

This experiment compares the performance of the conventional MMI registration algorithm and the HCL-based MMI algorithm on a pair of synthetic images at different noise levels to simulate data from two different noisy sensors with different spectral responses [Fig. 4(a) and (b)]. The image in Fig. 4(c) has a constant background, and the object is covered with spatially independent random pixel noise with a normal distribution. The image in Fig. 4(d) has a reversal of the structure of Fig. 4(c). Both the conventional MMI and the HCL-MMI algorithms have no difficulty registering the original images, Fig. 4(e) and (f). However, the conventional MMI algorithm fails to register Fig. 4(c) and (d). Note that when they are aligned, every pixel is random in one image and constant in the other, making every

Fig. 4. Noised image registration example. (a) and (b) Original images. (c) and (d) represent two noised images. (e) and (f) show the search surface for two images without noise. (g) and (h) show the search surface for conventional MMI and HCL-enhanced MMI registration algorithm.

pixel in one image statistically independent of its counterpart in the other image. Any misalignment leads to overlaps of like regions. Overlaps of random pixel areas maintain pairwise independence, causing the entropy to be high for all shift positions. Conventional MMI works with low noise. However, it is very sensitive to additive random noise in the test [33]. Fig. 4(g) shows the MMI surface as a function of horizontal and vertical shift that is achieved with Fig. 4(c) and (d). In contrast, the HCL-enhanced MMI registration performs well on both the noisy images and the noiseless pair. Although the pixel intensity inside the region is random, there is useful information at the object edges, which the algorithm is able to extract. Focusing on these pixels enables the MMI to detect misalignments. From Fig. 4(h), it can be seen that although the noise causes roughness in the search surface of the HCL-MMI algorithm, the peak value is still very clear and easy to locate.

It is expected that the performance of both algorithms deteriorates as more noise is added. To test this, we conducted 100 registration runs with each algorithm at each of several noise levels. A registration is counted as a success if the translation errors along both the x- and y-axes are below two pixels. Table II shows the percentage of successful registrations for each algorithm over a range of noise levels. In each pair, we used white noise



Fig. 5. Bar graph for correct ratio of Table II.

TABLE II CORRECT RATIO FOR HCL-MMI AND MMI ALGORITHM ON SYNTHETIC IMAGE REGISTRATION

TABLE III WASP CAMERA DESCRIPTION

Fig. 6. Three bands of WASP image: SW band image, middlewave band image, and longwave band image, from left to right.

Fig. 7. Synthetic-noised IR images.

TABLE IV CORRECT RATIO FOR HCL-MMI AND MMI ALGORITHM ON WASP IMAGE REGISTRATION: I) THE LEFT TWO COLUMNS ARE THE CORRECT RATIOS FOR HCL-MMI AND MMI ALGORITHM ON SW BAND IMAGE REGISTRATION. II) THE RIGHT TWO COLUMNS ARE THE CORRECT RATIOS FOR HCL-MMI AND MMI ALGORITHM ON MULTIBAND IMAGE REGISTRATION

with standard deviation that varied from 1% to 9% of the pixel intensity. The performance of the traditional MMI algorithm falls rapidly as noise is added, while the decline of the HCL-MMI algorithm is much slower because the edge information is more stable than the individual pixel intensity. This is evident in visual form in the bar graph of the fraction of correct alignments as a function of noise level shown in Fig. 5.

B. Experiments on IR Images

Difficult images of real scenes are provided by the WASP sensor, which employs three IR cameras and a high-resolution visible camera. The specifications of the system are shown in Table III. The images from different IR bands provide spectral information that can be used for fire detection as well as for other kinds of environmental data gathering. The images are difficult to register spatially because of the differences in intensity variations (see Fig. 6). The shortwave band (SW)

depends mainly on reflected light, while the longwave band depends upon the thermal values across the scene, and the midwave band is a combination of both. The different pixel intensity variations with temperature and reflectivity provide a challenge for registration by conventional feature-based and correlation algorithms.

The algorithms were tested on both real WASP imagery and simulated imagery. The simulated imagery was used to provide accurate ground truth and complete control of testing conditions. The tests were then run on real imagery to demonstrate that the algorithms performed as expected. The simulated imagery was produced by the DIRSIG system [34], which can produce synthetic scenes of high complexity that are physically realistic in terms of photometry. The images were produced for sensors with the same spectral response as the WASP IR cameras but with pixel sizes reduced by an order of magnitude to support testing at subpixel registration.

1) Tests With Synthetic IR Images: Synthetic images such as those shown in Fig. 7 were used. These images are of size 640 by 510 WASP pixels. The floating images were rotated by an angle in the range of [−10, 10] degrees and translated by [50, 150] WASP pixels in the x- and y-directions. The registration algorithms were run 100 times each for various levels of noise that was added to each image. Both translation and rotation errors were calculated relative to the ground truth. To be sure that all registration successes were counted, a combined outcome with less than a two-pixel translation offset and less than a two-degree rotation difference was accepted. The results are shown in Table IV and Fig. 8. The results show clearly that the HCL-MMI is more robust than the conventional MMI. Although the success rate for both



Fig. 8. Bar graph for correct ratio versus noise level for HCL-MMI and MMI algorithm on WASP image registration. (a) Intraband registration between SW band. (b) Interband registration between SW and MW band.

TABLE V MEANS AND STANDARD DEVIATIONS OF ERRORS FOR THE SUCCESS RESULT OF HCL-MMI AND MMI ALGORITHM ON WASP IMAGE REGISTRATION. HIGHLIGHTS IN BLACK REPRESENT THE BETTER RESULTS

algorithms is reduced with increasing noise, the HCL-MMI algorithm outperformed the traditional MMI algorithm at all noise levels. The registration accuracy can be measured by the means and standard deviations of the rotation error and shift error. Table V shows the comparison results (the better ones are shown in boldface). In most cases, the registration accuracy and stability are improved. Both algorithms generally achieved subpixel accuracy.

2) Intraband Registration With Real Imagery: The images shown in Fig. 9(a) and (b) represent two frames taken 4 s apart from the shortwave IR band of the WASP system. Note the significant 2-D translational difference between the two frames. Fig. 9(c) shows the first two levels of the wavelet decomposition. Fig. 9(d) shows the parameter search function created by the MMI algorithm for the four-layer decomposition utilized in this example, arranged in raster scan from "coarse" to "fine." The position of the peak indicates the optimum X and Y coordinates for registration for the layer under analysis. The registered images are overlaid [Fig. 9(e)] to indicate the degree of accuracy obtained by our proposed approach. Fig. 9(f) and (g) show the pixel values from the same row and column of the

Fig. 9. Intraband registration example. (a) and (b) represent two sequential frames. (c) Wavelet decomposition of frame shown in Fig. 7(a). (d) MMI parameter search function. (e) Overlaid registered images. (f) and (g) show [shown black in (e)] the pixel intensity of (a) and (b) on the same row and column.

two serial frames, and the slices show alignment of the images both horizontally and vertically.

3) Interband Registration With Real Imagery: In this section, we demonstrate the effectiveness of our proposed algorithm in registering multimodal imagery acquired by the WASP system using the three IR bands described in Table III. The short, medium, and long IR images are shown in Fig. 10(a)–(c), respectively. Note the translation, scale, and significant intensity variations. The addition of the scale parameter increases the dimensionality of the search space from three to four. The results computed by our algorithm are shown in Fig. 10 and displayed in Table VI, where the SW band



Fig. 10. Interband registration example. (a), (b), and (c) are the SW-band, middlewave-band, and longwave-band images, respectively. (d) The corresponding registration image for the three bands. (e) and (f) show [shown black in (d)] the pixel intensity of the three bands on the same row and column.

TABLE VI INTERBAND REGISTRATION RESULT

was used as the base image. Fig. 10(e) and (f) compare the image intensity along the row and column indicated by the lines in Fig. 10(d). Although the intensity variation may be in different directions in the different image bands, one can see that the fluctuations that are caused by major features are in close alignment. This demonstrates interband alignment by the MMI and HCL-MMI algorithms. The main advantage of HCL-MMI in this example is that it is much faster.

TABLE VII TIME CONSUMPTION FOR HCL-MMI AND MMI ALGORITHM ON WASP IMAGE REGISTRATION

C. Computational Effort

This section compares the computational effort of conventional MMI and HCL-MMI. The conventional MMI registration uses a set of points $X$ that are distributed randomly over the image. The samples are used to construct the joint histogram of data values between the images. The registration process looks for a transformation that minimizes the joint entropy of the histogram constructed from $A(X)$ and $B(T(X))$. The computational effort depends upon the number of samples $N$, the number of histogram bins $K$, and the number of steps $S$ used in the parameter search process. Since the computational effort is in direct proportion to the number of samples and the number of bins required in the joint histogram, the computational costs associated with the samples and the histogram bins are $O(N)$ and $O(K^2)$, respectively. The total computation needed is $O(T) = O(N) \cdot O(K^2) \cdot O(S)$.

The proposed algorithm improves the efficiency in three ways: 1) In conventional MMI, the samples are taken uniformly across the images. The sample grid must be small enough to ensure the capture of a sufficient number of points with high information value. The HCL-MMI algorithm reduces the computational intensity by selecting the information-rich pixels, which reduces the sample size. In our experiment, the number of samples per image can be reduced by 80% without loss of registration accuracy. 2) The label maps, rather than the original imagery, can also be used in the MI calculation. In the HCL maps, the pixels on background, edges, and corners are denoted by classification values 0, 1, and 2. This removes most of the background variation but preserves essential information about the location of the edges and corners. The reduction of the number of histogram bins from 256 (for typical bytescale normalized images) to three enables the 2-D histogram of sample values to be reduced from $256^2$ to $3^2$ bins. This can save substantial memory and computing time. 3) A pyramid approach can be used to enable a hierarchical search for the optimum parameter values. An approximate registration can be accomplished by using larger steps in parameter values with lower resolution images. The parameter estimate obtained at low resolution is used as the starting point for a refined search at the next higher resolution. The Haar wavelet [35] is used to build the pyramid, and we have found that the pyramid algorithm leads to about a 90% reduction in the volume of the search region without accuracy loss compared with the exhaustive search [36].

Taken together, the three steps described previously can reduce the computing time by 90%. For intraband registration, we estimated rotation and shift; for interband registration, we estimated scale and shift. The running time for a typical registration was reduced from 2 min to about 10 s. Table VII shows the comparison of the time consumption (the running



environment is a Pentium IV at 2.0 GHz with 1 GB RAM, running IDL 6.0). This improvement is significant and, with optimized code and specialized hardware, may make the proposed algorithm suitable for serial image registration in near-real-time applications.

VI. CONCLUSION

In this paper, we have proposed an improved MMI algorithm for effectively registering multimodal images. MMI registration is robust to the brightness difference caused by multimodal sensors. HCL classification identifies information-rich areas for registration based on local neighborhood knowledge. Our algorithm improves the stability and accuracy of the MMI registration algorithm. The experiments show that this algorithm is capable of handling a wide variety of translations and practical scale variations by searching the parameter space in order to maximize MI. Efficiency is improved by using features to select information-rich areas, by using the classification label image to reduce the joint histogram calculation, and by using pyramid techniques to achieve coarse-to-fine analysis. The reduction in computing time is not achieved at the expense of registration accuracy.

The main limitation lies in the assumption that the images include enough structural information for registration. If the Harris corner detector cannot identify enough information-rich pixels in the image, accuracy and stability will decline significantly. In this case, the technique can easily be extended with other feature detectors, which may be of interest in some applications, or it can fall back to the conventional MMI approach.

REFERENCES

[1] B. Zitova and J. Flusser, “Image registration methods: A survey,” Image Vis. Comput., vol. 21, no. 11, pp. 977–1000, Oct. 2003. [2] W. Wang and Y. Chen, “Image registration by control point pairing using the invariant properties of line segments,” Pattern Recognit. Lett., vol. 18, no. 3, pp. 269–281, Mar. 1997. [3] B. L. D. Lavine and L.
Kanal, “Recognition of spatial point patterns,” Pattern Recognit., vol. 16, no. 3, pp. 289–295, 1983. [4] J. Ton and A. Jain, “Registering Landsat images by point matching,” IEEE Trans. Geosci. Remote Sens., vol. 27, no. 5, pp. 642–651, Sep. 1989. [5] S. Moss and E. Hancock, “Multiple line-template matching with EM algorithm,” in Proc. IGARSS, Singapore, 1997, pp. 1283–1292. [6] B. Zhukov, A. S. Vasileisky, and M. Berger, “Automated image coregistration based on linear feature recognition,” in Proc. 2nd Conf. Fusion Earth Data, Sophia Antipolis, France, 1998, pp. 59–66. [7] G. S. A. Goshtasby and C. Page, “A region-based approach to digital image registration with subpixel accuracy,” IEEE Trans. Geosci. Remote Sens., vol. GRS-24, no. 3, pp. 390–399, May 1986. [8] M. Holm, “Towards automatic rectification of satellite images using feature based matching,” in Proc. IGARSS, 1991, pp. 2439–2442. [9] P. Andersen and M. Nielsen, “Non-rigid registration by geometry constrained diffusion,” Med. Image Anal., vol. 5, no. 2, pp. 81–88, Jun. 2001. [10] L. G. Brown, “A survey of image registration techniques,” ACM Comput. Surv., vol. 24, no. 4, pp. 326–376, Dec. 1992. [11] C. Morandi and E. De Castro, “Registration of translated and rotated images using finite Fourier transform,” IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-9, no. 5, pp. 700–703, Sep. 1987. [12] F. P. Ferrie, T. M. Peters, and M. A. Audette, “An algorithmic overview of surface registration techniques for medical imaging,” Med. Image Anal., vol. 4, no. 3, pp. 201–217, Sep. 2000. [13] W. Pratt, Digital Image Processing, 2nd ed. New York: Wiley, 1991.

[14] R. Bracewell, The Fourier Transform and Its Applications. New York: McGraw-Hill, 1965. [15] W. Pratt, “Correlation techniques of image registration,” IEEE Trans. Aerosp. Electron. Syst., vol. AES-10, no. 3, pp. 353–358, May 1974. [16] P. Viola, “Alignment by maximization of mutual information,” Ph.D. dissertation, MIT, Cambridge, MA, 1995. [17] A. Rangarajan, H. Chui, and J. Duncan, “Rigid point feature registration using mutual information,” Med. Image Anal., vol. 3, no. 4, pp. 1–17, Dec. 1999. [18] D. Rueckert, C. Hayes, C. Studholme, P. Summers, M. Leach, and D. Hawkes, “Non-rigid registration of breast MR images using mutual information, proceedings of the medical image computing and computer-assisted intervention,” in Proc. MCIAA, Cambridge, MA, 1998, pp. 1144–1152. [19] A. Cole-Rhodes, K. Johnson, and J. LeMoigne, “Multiresolution registration of remote sensing imagery by optimization of mutual information using a stochastic gradient,” IEEE Trans. Image Process., vol. 12, no. 12, pp. 1495–1511, Dec. 2003. [20] H. Chen, P. K. Varshney, and M. K. Arora, “Performance of mutual information similarity measure for registration of multitemporal remote sensing images,” IEEE Trans. Geosci. Remote Sens., vol. 41, no. 11, pp. 2445–2454, Nov. 2003. [21] R. Gan and A. C. Chung, “Multi-dimensional mutual information based robust image registration using maximum distance-gradient-magnitude,” in Proc. IPMI, vol. 3565, Lecture Notes in Computer Science, I. G. E. Christensen and M. Sonka, Eds., Berlin, Germany, 2005, pp. 210–221. [22] T. Butz and J.-P. Thiran, “Affine registration with feature space mutual information,” in Proc. MICCAI, vol. 2208, Lecture Notes in Computer Science, W. Niessen and M. Viergever, Eds., Berlin, Germany, 2001, pp. 549–556. [23] J. P. W. Pluim, J. B. A. Maintz, and M. A. Viergever, “Image registration by maximization of combined mutual information and gradient information,” IEEE Trans. Med. Imag., vol. 19, no. 8, pp. 809–814, Aug. 2000. [24] M. 
Holden, L. D. Griffin, N. Saeed, and D. L. Hill, “Multichannel mutual information using scale space,” in Proc. MICCAI, vol. 3216, Lecture Notes in Computer Science, C. Barillot, D. R. Haynor, and P. Hellier, Eds., Berlin, Germany, 2004, pp. 797–804. [25] B. Guo, S. R. Gunn, R. I. Damper, and J. D. B. Nelson, “Band selection for hyperspectral image classification using mutual information,” IEEE Geosci. Remote Sens. Lett., vol. 3, no. 4, pp. 522–526, Oct. 2006. [26] C. Cariou and K. Chehdi, “Automatic georeferencing of airborne pushbroom scanner images with missing ancillary data using mutual information,” IEEE Trans. Geosci. Remote Sens., vol. 46, no. 5, pp. 1290–1300, May 2008. [27] P. Thevenaz and M. Unser, “A pyramid approach to sub-pixel image fusion based on mutual information,” in Proc. IEEE Int. Conf. Image Process., Lausanne, Switzerland, 1996, pp. 265–268. [28] R. Turcajova and J. Kautsky, “A hierarchical multiresolution technique for image registration,” in Proc. SPIE Math. Imaging: Wavelet Appl. Signal Image Process., 1996, pp. 686–696. [29] J. P. Kern and M. S. Pattichis, “Robust multispectral image registration using mutual-information models,” IEEE Trans. Geosci. Remote Sens., vol. 45, no. 5, pp. 1494–1505, May 2007. [30] C. Tomasi, “Detection and tracking of point features,” Carnegie Mellon Univ., Pittsburgh, PA, 1991. [31] C. Harris and M. Stephens, “A combined corner and edge detector,” in Proc. 4th Alvey Vis. Conf., 1988, pp. 147–151. [32] C. S. Regazzoni, C. Sacchi, A. Teschioni, and S. Giulini, “Higher-orderstatistics-based sharpness evaluation of a generalized Gaussian pdf model in impulsive noisy environments,” in Proc. 9th IEEE SP Workshop Statist. Signal Array Process., 1998, pp. 411–414. [33] W. T. Wisniewski and R. A. Schowengerdt, “Information in the joint aggregate pixel distribution of two images,” in Proc. SPIE-Vis. Inf. Process. XIV, 2005, pp. 167–178. [34] S. D. Brown, R. Raqueno, and J. R. 
Schott, “Incorporation of bidirectional reflectance characteristics into the digital imaging and remote sensing image generation model,” in Proc. SPIE-Opt. Sci., Eng., Instrum., Aug. 1997, pp. 1508–1511. [35] M. Ehlers and D. Fogel, An Introduction to Wavelets. San Diego, CA: Academic, 1992. [36] X. Fan, H. Rhody, and E. Saber, “A comparison of exhaustive search vs. gradient search for automatic imagery registration based on MMI,” in Proc. Western New York Image Process. Workshop, Rochester, NY, Sep. 2006. [37] X. Fan, H. Rhody, and E. Saber, “Automatic registration of multi-sensor airborne imagery,” in Proc. IEEE AIPR Annu. Workshop, Washington, DC, Oct. 2005, p. 86.
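
The joint-histogram cost discussed in Section V-C can be illustrated with a minimal Python/NumPy sketch. It is not the authors' IDL implementation; the function names `joint_histogram` and `mutual_information`, the synthetic label map, and the circular shift are assumptions made purely for illustration. The sketch estimates MI from a K × K joint histogram, so a three-level label map (0, 1, 2) needs only a 3 × 3 histogram rather than the 256 × 256 histogram of byte-scaled imagery.

```python
import numpy as np

def joint_histogram(a, b, bins):
    """K x K joint histogram of two equally sized integer arrays with values in [0, bins)."""
    a = np.asarray(a, dtype=np.int64).ravel()
    b = np.asarray(b, dtype=np.int64).ravel()
    # Encode each (a, b) pair as a single index, then count occurrences.
    counts = np.bincount(a * bins + b, minlength=bins * bins)
    return counts.reshape(bins, bins)

def mutual_information(a, b, bins):
    """MI in bits, H(A) + H(B) - H(A, B), estimated from the joint histogram."""
    p_ab = joint_histogram(a, b, bins).astype(float)
    p_ab /= p_ab.sum()
    p_a = p_ab.sum(axis=1)   # marginal of the first image
    p_b = p_ab.sum(axis=0)   # marginal of the second image

    def entropy(p):
        p = p[p > 0]         # empty cells contribute nothing
        return -np.sum(p * np.log2(p))

    return entropy(p_a) + entropy(p_b) - entropy(p_ab.ravel())

# Hypothetical data: a random 0/1/2 label map and a misaligned copy of it.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=(64, 64))
shifted = np.roll(labels, 5, axis=1)

mi_aligned = mutual_information(labels, labels, bins=3)
mi_shifted = mutual_information(labels, shifted, bins=3)
print(mi_aligned, mi_shifted)
```

In this synthetic case the aligned pair yields a higher MI than the shifted pair, which is the property the parameter search relies on. The histogram holds 9 cells for the label maps, compared with 65 536 cells for 256-level imagery.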

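The hierarchical search in item 3) of Section V-C can be sketched in the same spirit. The following is an illustration under stated assumptions, not the method of [36]: it handles integer translation only, uses circular shifts via `np.roll`, builds the pyramid from 2 × 2 block averages (the Haar approximation band up to a scale factor), and the names `mi`, `haar_lowpass`, `best_shift`, and `pyramid_register` are hypothetical. A wide search is run only at the coarsest level; each finer level refines the doubled estimate within one pixel.

```python
import numpy as np

def mi(a, b, bins):
    """MI in bits between two images with values in [0, 1]."""
    h = np.histogram2d(a.ravel(), b.ravel(), bins=bins, range=[[0, 1], [0, 1]])[0]
    p = h / h.sum()
    pa = p.sum(axis=1, keepdims=True)
    pb = p.sum(axis=0, keepdims=True)
    nz = p > 0
    return float(np.sum(p[nz] * np.log2(p[nz] / (pa @ pb)[nz])))

def haar_lowpass(img):
    """One pyramid level: 2 x 2 block averages (assumed stand-in for the Haar approximation band)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2])

def best_shift(a, b, center, radius, bins):
    """Exhaustive search over integer shifts within +/- radius of center, maximizing MI."""
    best, best_mi = center, -np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            score = mi(a, np.roll(b, (dy, dx), axis=(0, 1)), bins)
            if score > best_mi:
                best, best_mi = (dy, dx), score
    return best

def pyramid_register(a, b, levels=2, radius=4, bins=16):
    """Coarse-to-fine estimate of the shift that aligns b to a."""
    pyramid = [(a, b)]
    for _ in range(levels):
        a, b = haar_lowpass(a), haar_lowpass(b)
        pyramid.append((a, b))
    shift = (0, 0)
    for level, (pa, pb) in enumerate(reversed(pyramid)):
        r = radius if level == 0 else 1            # wide search only at the coarsest level
        shift = best_shift(pa, pb, shift, r, bins)
        if level < levels:
            shift = (2 * shift[0], 2 * shift[1])   # carry the estimate to the next finer level
    return shift

# Hypothetical test image and a circularly shifted copy of it.
rng = np.random.default_rng(1)
base = rng.random((128, 128))
moved = np.roll(base, (-8, 12), axis=(0, 1))
estimated = pyramid_register(base, moved)
print(estimated)
```

With these settings the sketch evaluates 81 candidate shifts at the coarsest level and 9 at each finer level, instead of scanning the full-resolution search window exhaustively. The actual savings reported in the paper depend on the parameter space and implementation and are not reproduced by this toy example.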

Xiaofeng Fan (M’07) received the B.S. and M.S. degrees in electronic engineering from the University of Science and Technology of China, Hefei, China, in 1997 and 2001, respectively. Since 2003, he has been working toward the Ph.D. degree in the Image Science Department, Rochester Institute of Technology, Rochester, NY. Currently, he participates in the project of developing the Wildfire Airborne Sensor System to improve the detection and management of forest fires. His research interests include image registration, imaging systems, imaging algorithms, and processing.

Harvey Rhody (S’68–M’70–LM’05) received the B.S.E.E. degree from the University of Wisconsin, Madison, the M.S.E.E. degree from the University of Cincinnati, Cincinnati, OH, and the Ph.D. degree in electrical engineering from Syracuse University, Syracuse, NY. He has spent nearly four decades with the Rochester Institute of Technology (RIT), Rochester, NY, as a Faculty Member and Administrator, previously serving as Head of the Department of Electrical Engineering and President of the RIT Research Corporation, and where he is currently a Professor of imaging science with the Chester F. Carlson Center for Imaging Science. His research interests include imaging systems, remote sensing, imaging algorithms, and image processing, and he has conducted numerous projects designed in those areas. His current focus is in the area of 3-D imaging, where he is developing courses and leading research projects in 3-D sensing. He is the Founder and Past Director of the Laboratory for Imaging Algorithms and Systems within the Carlson Center, which focuses on improving imaging systems and data collection. The laboratory is currently developing the Wildfire Airborne Sensor System to improve the detection and management of forest fires and also worked with NASA to develop the data cycle system for the Stratospheric Observatory for Infrared Astronomy, a joint project with the German Aerospace Center.


Eli Saber (S’91–M’96–SM’00) received the B.S. degree in electrical and computer engineering from the University of Buffalo, Buffalo, NY, in 1988, and the M.S. and Ph.D. degrees in electrical and computer engineering from the University of Rochester, Rochester, NY, in 1992 and 1996, respectively. From 1988 to 2004, he was with Xerox Corporation in a variety of positions, ending as the Product Development Scientist and the Manager in the Business Group Operations Unit. From 1997 to 2004, he was an Adjunct Faculty Member with the Department of Electrical Engineering, Rochester Institute of Technology (RIT), Rochester, and with the Department of Electrical and Computer Engineering. He is currently a Professor with the Electrical Engineering Department, RIT. His research interests are in the areas of digital image and video processing, including image/video segmentation, object tracking, content-based image/video analysis and summarization, multicamera surveillance processing, 3-D scene reconstruction, and image/video understanding. He is also currently serving as an Area Editor for the Journal of Electronic Imaging. He has authored many conference and journal publications in the area of image and video processing. Dr. Saber is a Senior Member of the Imaging Science and Technology Society, a member of the Image and Multidimensional Digital Signal Processing Technical Committee, and a member and former Chair of the IEEE Industry Technology Committee. He has served as an Associate Editor for the IEEE TRANSACTIONS ON IMAGE PROCESSING and has served on several conference committees, including as Finance Chair for the International Conference on Image Processing (ICIP) 2002 and as Tutorial Chair for ICIP 2007 and ICIP 2009. He is currently serving as the General Chair for ICIP 2012.
