Image Fusion Based on Steerable Pyramid and PCNN*
Haibo Deng, Yide Ma†
School of Information Science & Engineering, Lanzhou University, Lanzhou, 730000, China
[email protected]
Abstract- A new image fusion algorithm based on the steerable pyramid and the Pulse Coupled Neural Network (PCNN) is proposed in this paper. First, the source images are decomposed into several subbands of different levels and orientations by the steerable pyramid. Then the low frequency subbands are fused by weighting and the high frequency subbands are fused by PCNN. The fused image is obtained by the inverse steerable pyramid transform. Experiments on four different pairs of test images show that our approach outperforms wavelet-based fusion in both subjective visual effect and objective evaluation criteria.

Index Terms- Image Fusion; Steerable Pyramid; PCNN; Wavelets

1. Introduction
Image fusion synthesizes source images of the same scene taken by different kinds of sensors, or by the same sensor at different times; the fused image offers a better description of the scene and reduces its uncertainty. With the development of image fusion technology, its applications have branched into medical imaging and diagnosis, remote sensing, intelligent traffic, the military, etc.

The procedure of image fusion can be divided into two stages: decomposition of the source images and selection of coefficients from the decomposed images. Common decomposition methods include blocks [1] and multiresolution transforms (such as wavelets and pyramid transforms [2-4]). The selection stage generally first calculates a certain index, such as the absolute value, energy, or spatial frequency [1], and then selects the coefficients by comparing that index. The block method suits multifocus image fusion, but the block size and threshold must be adjusted when fusing different kinds of images. The discrete wavelet transform (DWT) is another commonly used method, but it is not shift invariant and requires strict registration. Furthermore, an image can only be decomposed into three orientations at each scale; since texture and edges are continuous, this limited set of orientations restricts DWT in image fusion. Pyramid decompositions, such as the Laplacian pyramid, contrast pyramid, ratio pyramid, and morphological pyramid, are consistent with the human visual system, but each has its own deficiencies in image fusion [5]. For example, the contrast pyramid loses too much information from the source images, the ratio pyramid introduces false information that does not exist in the source images, and the morphological pyramid produces bad edges. Furthermore, pyramid transforms are not oriented, so important information from the source images cannot be effectively decomposed into different subbands.

This paper proposes a new image fusion scheme based on the steerable pyramid [6] and PCNN. First, the source images are decomposed into several subbands of different scales and orientations by the steerable pyramid; then the low frequency subbands are fused by weighting and the high frequency subbands by PCNN. The fused image is obtained by the inverse steerable pyramid transform. The steerable pyramid not only preserves the advantages of compactly supported orthogonal wavelets, but is also shift invariant, self-inverting, and orientation steerable. These advantages make the steerable pyramid a sound choice for decomposing the source images into different levels and orientations for fusion. In the subband selection stage, PCNN is introduced to properly synthesize the information of the central pixel and its surrounding pixels. This PCNN-based selection procedure considers the correlation between neighboring pixels in the subbands and reduces misjudgment in subband fusion.

This paper is organized as follows: the steerable pyramid and PCNN are introduced in sections 2 and 3 respectively. Section 4 describes the fusion algorithm based on the steerable pyramid and PCNN. Section 5 gives the experimental results and evaluates the performance of the various methods. Conclusions are drawn at the end.

* The work is supported by the National Natural Science Foundation of China (No. 60872109).
† Corresponding author. Email: [email protected]
2. Introduction of steerable pyramid
If a function can be expressed as a linear combination of its rotated versions, the function is steerable. The steerable pyramid [6] is built on a set of basis filters: once the responses of the basis filters are known, the response at an arbitrary orientation can be synthesized as a linear combination of these basis responses. The steerable pyramid decomposes an image into a series of multiscale and multiorientation subbands. This decomposition not only preserves the advantages of compactly supported orthogonal wavelets, but is also shift invariant, self-inverting, and orientation steerable. The first level of the decomposition and reconstruction in the frequency domain is shown in Fig.1.
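The following minimal Python sketch illustrates the steerability property with the classic pair of first-derivative-of-Gaussian basis filters; the kernel size and Gaussian width here are arbitrary choices for the demonstration, not values taken from this paper.

```python
import numpy as np

# Basis filters: x- and y-derivatives of a Gaussian, a classic steerable pair.
x, y = np.meshgrid(np.arange(-4, 5), np.arange(-4, 5))
g = np.exp(-(x ** 2 + y ** 2) / 4.0)   # Gaussian envelope (width chosen arbitrarily)
g1_0 = -x * g                          # basis filter oriented at 0 degrees
g1_90 = -y * g                         # basis filter oriented at 90 degrees

# The filter at any angle theta is an exact linear combination of the two bases.
theta = np.deg2rad(30)
g1_theta = np.cos(theta) * g1_0 + np.sin(theta) * g1_90
```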
[Fig.1 here: analysis/synthesis block diagram with filters H0(-ω), L0(-ω), B0(-ω), ..., BK(-ω), L1(-ω), downsampling and upsampling by 2, and the mirrored synthesis filters L1(ω), B0(ω), ..., BK(ω), L0(ω), H0(ω)]
Fig.1 First level of the steerable pyramid

As shown in Fig.1, the left part of the diagram is the decomposition procedure and the right part is the reconstruction. The input image is first split into high-pass and low-pass subbands by the nonoriented high-pass filter $H_0(\omega)$ and low-pass filter $L_0(\omega)$. The low-pass subband is then decomposed into a series of oriented subbands, by the oriented band-pass filters $B_k(\omega)$, $k = 0, 1, \ldots, n$, and a new low-pass subband, by the low-pass filter $L_1(\omega)$. The new low-pass image can be decomposed again by the same procedure until the required number of levels is reached. The reconstruction in the frequency domain can be written as

$X^*(\omega) = \Big\{ |H_0(\omega)|^2 + |L_0(\omega)|^2 \Big( |L_1(\omega)|^2 + \sum_{k=0}^{n} |B_k(\omega)|^2 \Big) \Big\} X(\omega) + \mathrm{a.t.}$   (1)

where a.t. stands for aliasing terms. To guarantee perfect reconstruction, the system should meet the following constraints.
(1) To avoid aliasing, $L_1(\omega)$ is band-limited:

$L_1(\omega) = 0 \quad \text{for} \quad |\omega| > \pi/2.$   (2)

(2) To avoid amplitude distortion, the transfer functions should satisfy

$|H_0(\omega)|^2 + |L_0(\omega)|^2 \Big( |L_1(\omega)|^2 + \sum_{k=0}^{n} |B_k(\omega)|^2 \Big) = 1.$   (3)

(3) To allow the image to be decomposed recursively, the next-level transfer functions should satisfy

$|L_1(\omega/2)|^2 = |L_1(\omega/2)|^2 \Big( |L_1(\omega)|^2 + \sum_{k=0}^{n} |B_k(\omega)|^2 \Big).$   (4)

The angular constraint on the band filters is $B_k(\omega) = B(\omega)\,[-j\cos(\theta - \theta_k)]^n$, where $\theta = \arg(\omega)$, $\theta_k = k\pi/(n+1)$, and $B(\omega)^2 = \sum_{k=0}^{n} |B_k(\omega)|^2$.
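To make the decomposition concrete, the following NumPy sketch computes one level of a steerable-pyramid-like split in the frequency domain. The raised-cosine radial mask and the unnormalized $\cos^n$ angular factor are common but assumed design choices; the paper does not specify its exact filter profiles, and the subsampling step is omitted for simplicity.

```python
import numpy as np

def steerable_level(img, n_orient=6):
    """One frequency-domain level of a steerable-pyramid-like split:
    a low-pass band (to be decomposed recursively) plus n_orient
    oriented band-pass subbands."""
    rows, cols = img.shape
    wy = np.fft.fftfreq(rows)[:, None] * 2 * np.pi   # rad/sample, in [-pi, pi)
    wx = np.fft.fftfreq(cols)[None, :] * 2 * np.pi
    r = np.hypot(wy, wx)                             # radial frequency
    theta = np.arctan2(wy, wx)                       # orientation of each sample

    # Raised-cosine low-pass: 1 below pi/4, 0 above pi/2, so Eq.(2) holds.
    t = np.clip(np.log2(np.maximum(r, 1e-12) / (np.pi / 4)), 0.0, 1.0)
    L1 = np.cos(np.pi / 2 * t)
    # Complementary radial mask, |L1|^2 + |B|^2 = 1 everywhere (the residual
    # high-pass H0 is folded into B in this simplified sketch).
    B = np.sqrt(np.clip(1.0 - L1 ** 2, 0.0, 1.0))

    F = np.fft.fft2(img)
    n = n_orient - 1
    bands = []
    for k in range(n_orient):
        theta_k = k * np.pi / (n + 1)
        # Angular factor from the paper: [-j cos(theta - theta_k)]^n
        # (overall normalization across orientations omitted here).
        A = (-1j * np.cos(theta - theta_k)) ** n
        bands.append(np.fft.ifft2(F * B * A).real)
    low = np.fft.ifft2(F * L1).real                  # input to the next level
    return low, bands
```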
3. The model of PCNN
Local features of an image are usually represented by several pixels together, which means the neighboring pixels are highly correlated. Traditional pixel-based fusion methods consider only a single decomposed coefficient and ignore this correlation, which leads to misjudgment during image fusion. The Pulse Coupled Neural Network (PCNN) [7] is based on Eckhorn's research on the visual cortex of mammals. PCNN has been proved very suitable for image processing tasks such as image enhancement, segmentation, and denoising. One of the most important characteristics of PCNN is that clusters of similar pixels fire in the same iteration. With proper parameters, PCNN can synthesize local information and make a sound decision. The mathematical model of PCNN is described by the following equations, where the indexes i and j indicate the pixel location in the image.
$F_{i,j}[n] = e^{-\alpha_F} F_{i,j}[n-1] + V_F \sum_{k,l} m_{ijkl} Y_{k,l}[n-1] + S_{i,j}$   (5)

$L_{i,j}[n] = e^{-\alpha_L} L_{i,j}[n-1] + V_L \sum_{k,l} w_{ijkl} Y_{k,l}[n-1]$   (6)

$U_{i,j}[n] = F_{i,j}[n] \big(1 + \beta L_{i,j}[n]\big)$   (7)

$Y_{i,j}[n] = \operatorname{step}\big(U_{i,j}[n] - T_{i,j}[n]\big)$   (8)

$T_{i,j}[n] = e^{-\alpha_T} T_{i,j}[n-1] + V_T Y_{i,j}[n]$   (9)
A single pulse coupled neuron is composed of three parts: the dendritic tree, the linking modulation, and the pulse generator. The dendritic tree has two branches, the linking branch and the feeding branch. As shown in Eq.(5) and (6), the feeding branch receives both the external stimulus $S_{i,j}$ and the local stimulus from neighboring pulses, while the linking branch receives only the local stimulus; $\alpha_F$ and $\alpha_L$ are the time attenuation constants. In the linking modulation, the internal activity $U_{i,j}$ is a dual-channel term whose value is the product of the feeding branch and the modulated linking branch, as described by Eq.(7), where $\beta$ is the linking parameter. If $U_{i,j} > T_{i,j}$, the neuron generates a pulse according to Eq.(8), and the threshold is renewed according to Eq.(9), where $V_T$ and $\alpha_T$ are the normalization constant and the time attenuation constant of the threshold.
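The iteration can be sketched in NumPy as follows, using the kernel and attenuation constants reported in section 5.1. The amplitude constants VF, VL, VT, the initial threshold, and the iteration cap are illustrative assumptions, since the paper does not report them.

```python
import numpy as np
from scipy.ndimage import convolve

def pcnn_fire_counts(S, beta=0.5, aF=0.2, aL=0.2, aT=0.012,
                     VF=0.5, VL=0.5, VT=20.0, max_iter=200):
    """Iterate Eqs.(5)-(9) until every neuron has fired at least once,
    returning how many times each neuron fired. S is the stimulus image,
    assumed normalized to [0, 1]."""
    w = np.array([[0.707, 1.0, 0.707],
                  [1.0,   0.0, 1.0  ],
                  [0.707, 1.0, 0.707]])    # linking/feeding kernel, Sec. 5.1 (w = m)
    F = np.zeros_like(S); L = np.zeros_like(S)
    Y = np.zeros_like(S)
    T = np.ones_like(S)                    # initial threshold (assumed)
    fires = np.zeros_like(S)
    for _ in range(max_iter):
        K = convolve(Y, w, mode='constant')        # weighted neighbor pulses
        F = np.exp(-aF) * F + VF * K + S           # Eq.(5), feeding branch
        L = np.exp(-aL) * L + VL * K               # Eq.(6), linking branch
        U = F * (1 + beta * L)                     # Eq.(7), linking modulation
        Y = (U > T).astype(float)                  # Eq.(8), pulse generator
        T = np.exp(-aT) * T + VT * Y               # Eq.(9), threshold update
        fires += Y
        if (fires > 0).all():                      # stop once all neurons fired
            break
    return fires
```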
4. Image fusion algorithm based on steerable pyramid and PCNN
The fusion algorithm can be briefly described as follows:
(1) Decompose the input source images into several subbands of different orientations and scales by the steerable pyramid, and group the corresponding subbands from the two sources into pairs.
(2) Fuse the low frequency subband pair by weighting.
(3) Fuse each high frequency oriented subband pair by PCNN (see the sketch after this list). The absolute values of the subband pair are taken as the stimuli of two PCNNs. These stimuli are first fed into the corresponding neurons by Eq.(5) and (6); then the signals are synthesized in the internal activity terms by Eq.(7). If $U_{i,j} > T_{i,j}$, the neuron fires; the total number of fired neurons and the firing times of each neuron are recorded. If some neurons have still not fired, the thresholds are attenuated according to Eq.(9) and Eq.(5)-(9) are repeated. The iteration stops when every neuron has fired at least once. For each location, we choose the coefficient from the subband whose neuron has the higher firing frequency. The detailed parameter settings are given in section 5.
(4) The fused image is obtained by applying the inverse steerable pyramid transform to the fused subbands.
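Steps (2) and (3) can be sketched as follows, reusing the pcnn_fire_counts function from section 3; the per-pixel tie-break toward the first source in fuse_band is an arbitrary choice of this sketch.

```python
import numpy as np

def fuse_band(band_a, band_b):
    """Fuse one high-frequency subband pair: run a PCNN on the absolute
    values of each subband and keep, per pixel, the coefficient whose
    neuron fired more often (Eqs.(5)-(9) iterated until all neurons fire)."""
    fires_a = pcnn_fire_counts(np.abs(band_a))
    fires_b = pcnn_fire_counts(np.abs(band_b))
    return np.where(fires_a >= fires_b, band_a, band_b)

def fuse_low(low_a, low_b):
    """Fuse the low-frequency pair by simple equal weighting."""
    return 0.5 * low_a + 0.5 * low_b
```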
5. Results and discussion
In this section we use the proposed method to fuse different kinds of images. We first give the detailed fusion procedures of our method and of the wavelets method, then describe the objective evaluation standards, and finally compare the two methods on four different pairs of images, i.e. CT and MRI images, remote sensing images, infrared and visual images, and multifocus images. The fused images are assessed by the objective standards.
5.1 Details of the proposed and wavelets based fusion algorithms
Our method first decomposes the two input source images into 3 levels with 6 orientations, so there are 1 + 3*6 + 1 = 20 subbands for each source image. The corresponding subbands are then grouped into pairs. The low frequency pair is fused by weighting; the high frequency oriented pairs are fused by PCNN. The parameters of PCNN are set as follows: β = 0.5, the convolution kernels are w = m = [0.707, 1, 0.707; 1, 0, 1; 0.707, 1, 0.707], αF = αL = 0.2, and αT = 0.012. The fused image is obtained by the inverse steerable pyramid transform.
The wavelets based fusion method first decomposes the source images into 3 levels with the db1 wavelet basis. Then the low frequency subbands are fused by weighting and the high frequency subbands are fused by choosing the coefficients with maximum absolute value. The fused result is obtained by the inverse wavelet transform.
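The wavelets baseline can be reproduced in a few lines with PyWavelets; using PyWavelets itself is our assumption, while the db1 basis, 3 levels, and the fusion rules follow the description above.

```python
import numpy as np
import pywt

def wavelet_fuse(img_a, img_b, wavelet='db1', level=3):
    """Baseline of section 5.1: average the approximation subband,
    take the larger-magnitude coefficient in each detail subband."""
    ca = pywt.wavedec2(img_a, wavelet, level=level)
    cb = pywt.wavedec2(img_b, wavelet, level=level)
    fused = [(ca[0] + cb[0]) / 2.0]                      # low frequency: weighting
    for details_a, details_b in zip(ca[1:], cb[1:]):
        fused.append(tuple(
            np.where(np.abs(da) >= np.abs(db), da, db)   # high frequency: max-abs
            for da, db in zip(details_a, details_b)))
    return pywt.waverec2(fused, wavelet)
```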
5.2 Objective evaluation standards
Due to the complexity of image fusion, there is no evaluation criterion yet that reflects the quality of a fused image precisely and objectively, and subjective evaluation of the result may vary from person to person. RMSE, which compares the fused result with a standard reference image pixel by pixel, is adopted in much of the literature; it does reflect fusion performance, but in most conditions the standard image does not exist, as with CT and MRI. Here we use two groups of criteria to evaluate the fused image from different perspectives. The first group, including entropy and average gradient, measures the information and clarity of the fused image alone, without considering the source images. The second group, including the edge preservation degree [8] and mutual information [9], measures how much "information" is obtained from the source images; note that the mutual information used here is defined as the sum of the mutual information between the fused image and each of the two source images. By evaluating the fused image from different perspectives, we can draw a more objective and precise conclusion.
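For reference, minimal NumPy implementations of entropy, average gradient, and mutual information are sketched below; the bin counts and normalizations follow common conventions rather than values stated in the paper, and definitions of these criteria vary slightly across the literature.

```python
import numpy as np

def entropy(img, bins=256):
    """Shannon entropy (bits) of the grey-level histogram."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 255))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def average_gradient(img):
    """Mean gradient magnitude, a common measure of image clarity."""
    img = img.astype(float)
    gx = np.diff(img, axis=1)[:-1, :]       # horizontal differences
    gy = np.diff(img, axis=0)[:, :-1]       # vertical differences
    return np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0))

def mutual_information(a, b, bins=256):
    """Mutual information between two images via the joint histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)
    py = p.sum(axis=0, keepdims=True)
    nz = p > 0
    return np.sum(p[nz] * np.log2(p[nz] / (px @ py)[nz]))
```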
5.3 Result analysis
To verify the validity of the proposed method, the experiments use four different kinds of source images [10]: CT and MRI images, remote sensing images of different wave bands, infrared and visual images, and multifocus images. For each group (see Fig.2-5), images (a) and (b) are the source images, and images (c) and (d) are the fused results of the wavelets method and our method. In the following tables, E, AG, EPD and MI stand for Entropy, Average Gradient, Edge Preservation Degree and Mutual Information.
In group 1, (a) and (b) are CT and MRI images. These two images clearly contain complementary information: the CT image (a) shows the bone structures, while the MRI image (b) shows the soft tissues. As we can see from (c) and (d), our method is superior to wavelets in both visual effect and objective criteria.
[Figure: panels (a)-(d)]
Fig.2 Group 1: (a) and (b) are CT and MRI images, (c) and (d) are fused results by wavelets and our method

Table 1 Comparison of fused images by wavelets and our method (Group 1)
            E        AG       EPD      MI
Wavelets    5.4611   0.0308   0.6382   0.0208
Proposed    5.8175   0.0293   0.6525   0.0373

Group 2 is a pair of remote sensing images of different wave bands. Image (a) is much clearer in the rivers and the land, while (b) is clearer in the buildings and roads. Although there is little visible difference between the two fused images, the objective criteria show that our method is better: its entropy, edge preservation degree and mutual information are all greater than those of the wavelets method.

[Figure: panels (a)-(d)]
Fig.3 Group 2: (a) and (b) are remote sensing images, (c) and (d) are fused results by wavelets and our method

Table 2 Comparison of fused images by wavelets and our method (Group 2)
            E        AG       EPD      MI
Wavelets    7.5724   0.0523   0.9216   1.7285
Proposed    7.5935   0.0539   0.9571   2.0497

In Fig.4, (a) is an infrared image, in which the person can be seen clearly, and (b) is a visual image, in which the surroundings, such as the baluster and the house, are clear. As we can see from (c) and (d), the person and the baluster produced by our method are much better than those in (c). The objective criteria also support this conclusion.

[Figure: panels (a)-(d)]
Fig.4 Group 3: (a) and (b) are infrared and visual images, (c) and (d) are fused results by wavelets and our method

Table 3 Comparison of fused images by wavelets and our method (Group 3)
            E        AG       EPD      MI
Wavelets    6.5282   0.0306   0.9294   0.4252
Proposed    6.6902   0.0327   0.9292   0.4358

(a) and (b) in Fig.5 are multifocus images. The clock in (a) is in focus while the background is out of focus; the background in (b) is in focus while the clock is out of focus. As we can see from (c) and (d), the fused books in (c) are blurry but very clear in (d). This is due to the capability of PCNN to synthesize local information.

[Figure: panels (a)-(d)]
Fig.5 Group 4: (a) and (b) are multifocus images, (c) and (d) are fused results by wavelets and our method

Table 4 Comparison of fused images by wavelets and our method (Group 4)
            E        AG       EPD      MI
Wavelets    7.3150   0.0201   0.9673   19.1051
Proposed    7.3698   0.0201   0.9634   19.2012
6. Conclusion
Image fusion technology has been applied successfully in many fields such as medical imaging and diagnosis, remote sensing, intelligent traffic and the military. Considering the deficiencies of the existing fusion methods, this paper proposes a steerable pyramid and PCNN based method. First, the steerable pyramid is applied to decompose the important information in the input images, such as edges and texture, into different levels and orientations. Then PCNN is used to select the "better" coefficients from the decomposed high frequency subbands. The fused image is obtained by the inverse steerable pyramid transform. Four different kinds of image pairs are used to test the performance of this method against the wavelets based fusion method. The results show that our method is effective in both subjective visual effect and objective evaluation criteria.

References
[1] Shutao Li, James T. Kwok, Yaonan Wang. Combination of images with diverse focuses using the spatial frequency. Information Fusion, 2001, 2(3): 169-176.
[2] Gonzalo Pajares, Jesús Manuel de la Cruz. A wavelet-based image fusion tutorial. Pattern Recognition, 2004, 37(9): 1855-1872.
[3] Alexander Toet. Hierarchical image fusion. Machine Vision and Applications, 1990, 3(1): 1-11.
[4] Z. Liu, K. Tsukada, K. Hanasaki, et al. Image fusion by using steerable pyramid. Pattern Recognition Letters, 2001, 22(9): 929-939.
[5] Zhaobin Wang, Yide Ma. Medical image fusion using m-PCNN. Information Fusion, 2008, 9(2): 176-185.
[6] W.T. Freeman, E.H. Adelson. The design and use of steerable filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1991, 13(9): 891-906.
[7] R. Eckhorn, H.J. Reitboeck, et al. A neural network for feature linking via synchronous activity: results from cat visual cortex and from simulations. In: R.M.J. Cotterill (Ed.), Models of Brain Function. Cambridge, UK: Cambridge University Press, 1989: 255-272.
[8] C.S. Xydeas, V. Petrovic. Objective image fusion performance measure. Electronics Letters, 2000, 36(4): 308-309.
[9] Guihong Qu, Dali Zhang, Pingfan Yan. Information measure for performance of image fusion. Electronics Letters, 2002, 38(7): 313-315.
[10] Oliver Rockinger. Image fusion. http://www.imagefusion.org/, April 2009.