Improved Dynamic Image Fusion Scheme for Infrared and Visible Sequences Based on an Image Fusion System

GANG XIAO, KUN WEI, ZHONGLIANG JING
Institute of Aerospace Science and Technology, Shanghai Jiaotong University
[email protected], [email protected], [email protected]

Abstract - An improved dynamic image fusion scheme for infrared and visible sequences, based on an image fusion system, is introduced in this paper. As the first step, a target detection technique is employed to segment the source images into target and background regions. Different fusion rules are then adopted in the target and background regions. Two quantitative performance indexes are fed back to determine the optimum weight coefficients for the fusion rules; this feedback of performance information improves the fusion result and its effectiveness in the target regions. Fusion experiments on real-world image sequences indicate that the improved method is effective and efficient, achieving better performance than fusion methods without feedback.

Keywords: image sequence fusion, infrared and visible image, feedback performance information

1 Introduction

The techniques of multi-source image fusion originated in the military field, and their impetus also came from there. Battlefield sensing technology, with multi-source image fusion at its core, has become one of the most important advanced military technologies, covering target detection, tracking, recognition and scene awareness. Image fusion is a specialization of the more general topic of data fusion, dealing with image and video data [1]. It is the process by which multi-modality sensor imagery of the same scene is intelligently combined into a single view of the scene with extended information content. Image fusion has important applications in the military, medical imaging, remote sensing, and security and surveillance fields. Its benefits include improved spatial awareness, increased accuracy in target detection and recognition, reduced operator workload and increased system reliability [2]. Image fusion processing must satisfy the following requirements, as described in [3][24]: preserve (as far as possible) all salient information in the source images; introduce no artifacts or inconsistencies; be shift invariant; and be temporally stable and consistent. The last two points are especially important in dynamic image fusion (image sequence fusion), as the human visual system is highly sensitive to moving artifacts introduced by a shift-dependent fusion process [3]. Fusion can be performed at different levels of information representation, sorted in ascending order of abstraction: signal, pixel, feature and symbol levels [4]. From the simplest weighted pixel averaging to more complicated multi-resolution (MR) methods (including pyramidal and wavelet schemes), pixel-based fusion methods have been well researched [3][5]-[10]. Recently, feature-level fusion with region-based fusion schemes has been reported to give both qualitative and quantitative improvements over pixel-based methods, as more intelligent, semantic fusion rules can be applied to actual features [11]-[16]. An improved dynamic image fusion scheme for infrared and visible sequences based on feedback-determined optimum weight coefficients is introduced in this paper.

2 Generic Pixel-Based Image Fusion Scheme

The generic pixel-based fusion scheme is briefly reviewed here; more details can be found in [3][5]-[16]. Fig. 1 illustrates the generic scheme, which can be divided into three steps. First, all source images are decomposed using a multi-resolution method, which can be the pyramid transform (PT) [5][6], discrete wavelet transform (DWT) [7]-[9], discrete wavelet frames (DWF) [3] or dual-tree complex wavelet transform [10], etc. Then the decomposition coefficients are fused by applying a fusion rule, which can be a point-based maximum-selection (MS) rule or a more sophisticated area-based rule [6][7]. Finally, the fused image is reconstructed by applying the corresponding inverse transform to the fused coefficients.

Fig. 1 Generic pixel-based image fusion scheme (images A and B: pre-processing, MR transform, fusion of coefficients, inverse transform, fused image)

Source sensors for most image fusion systems have different fields of view, resolutions, lens distortions and frame rates. It is vital to align the input images properly with each other, both spatially and temporally, a problem addressed by image registration [13]. To minimize this problem, the imaging sensors in many practical systems
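For concreteness, the three steps above can be sketched in a few lines of Python. The sketch below uses PyWavelets for the MR transform and the point-based MS rule on the detail coefficients; the function name and the choice to average the approximation band are illustrative assumptions, not part of the scheme itself.

```python
# Minimal sketch of the generic pixel-based fusion pipeline of Fig. 1,
# assuming two registered grayscale images of equal size.
import numpy as np
import pywt

def fuse_dwt_ms(img_a, img_b, wavelet="db4", level=3):
    coeffs_a = pywt.wavedec2(img_a, wavelet, level=level)
    coeffs_b = pywt.wavedec2(img_b, wavelet, level=level)
    # Average the coarsest approximation band (an illustrative choice).
    fused = [0.5 * (coeffs_a[0] + coeffs_b[0])]
    for sub_a, sub_b in zip(coeffs_a[1:], coeffs_b[1:]):
        # MS rule: keep the detail coefficient with the larger magnitude.
        fused.append(tuple(np.where(np.abs(a) >= np.abs(b), a, b)
                           for a, b in zip(sub_a, sub_b)))
    return pywt.waverec2(fused, wavelet)
```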


are rigidly mounted side-by-side and physically aligned as closely as possible.

3 Improved Dynamic Image Fusion Scheme Based on Region Target Detection

For pixel-based approaches, each MR decomposition coefficient is treated independently (MS rule) or filtered by a small fixed window (area-based rules). However, most applications of a fusion scheme are interested in features within the image, not in the actual pixels. Therefore, it seems reasonable to incorporate feature information into the fusion process [14]. A number of region-based fusion schemes have been proposed [11]-[16]. However, most region-based schemes are designed for still-image fusion, and in the image sequence case every frame of each source sequence is processed individually. These methods do not take full advantage of the wealth of inter-frame information within the source sequences. The novel region-based scheme proposed here for fusion of visible and infrared (IR) image sequences is shown in Fig. 2, where target detection (TD) techniques are introduced to segment target regions intelligently. For convenience, we assume both source sequences are well registered before fusion. First, both the visible and IR sequences are enhanced by a pre-processing operator. Then each frame of the source sequences is transformed by an MR method (the LR DWT is adopted in this paper; see Sec. 3.1). Simultaneously, the frames are segmented into target and background regions by a TD method. Different fusion rules are adopted in the target and background regions. Finally, the fused coefficients belonging to each region are combined, and the fused frames are reconstructed by the corresponding inverse transform.

Fig. 2 Region-based IR and visible dynamic image fusion scheme (IR and visible sequences → pre-processing → MR transform, with TD driving separate target-region and background-region fusion → inverse transform → fused sequence)

3.1 The Limited Redundancy Discrete Wavelet Transform


It is well known that the standard DWT produces a shift-dependent [17][18] signal representation, due to the down-sampling operations in every subband, which results in a shift-dependent fusion scheme, as described by Rockinger [3]. To overcome this problem, Rockinger presented a perfectly shift-invariant wavelet fusion scheme using the DWF. However, this method is computationally expensive due to the high redundancy of the representation ($2^m \times n : 1$ for m-dimensional signals and n-level decomposition). Bull et al. [10] further developed the wavelet fusion method by introducing the DT CWT, which provides approximate shift invariance with limited redundancy ($2^m : 1$ for m-dimensional signals at any number of levels). However, the DT CWT employs two filter banks, which must be designed rigorously to achieve appropriate delays while satisfying perfect reconstruction (PR) conditions. Moreover, the decomposition coefficients from the two trees must be regarded as the real and imaginary parts of complex coefficients, increasing the difficulty of subsequent processing in the fusion rule.

A new implementation of the DWT is therefore introduced in this paper, which provides approximate shift invariance and nearly perfect reconstruction while preserving the attractive properties of the DWT: computational efficiency and easy implementation. Fig. 3 shows the decomposition and reconstruction scheme for the new transform in cascading form, which can be extended easily to 2-D by separable filtering along rows and then columns. Because of the limited redundancy of the new transform (not more than $3 : 1$ for 1-D), we call it the LR DWT (limitedly redundant discrete wavelet transform) here, to distinguish it from the DWT and DWF.

Fig. 3 The LR DWT and its inverse transform (a cascading filter bank with analysis filters $H_0(\omega)$, $H_1(\omega)$, $H_0(2\omega)$, $H_1(2\omega)$ and down-sampling $\downarrow 2$, and synthesis filters $G_0(\omega)$, $G_1(\omega)$, $G_0(2\omega)$, $G_1(2\omega)$ with up-sampling $\uparrow 2$)
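The LR DWT itself is not a stock library transform, so the sketch below illustrates the shift-invariance idea with the undecimated wavelet transform (PyWavelets' SWT, which sits at the fully redundant DWF end of the spectrum); it is a hedged stand-in for discussion, not the authors' LR DWT.

```python
# Illustrative shift-invariant fusion via the undecimated wavelet
# transform (SWT); image sides must be divisible by 2**level.
import numpy as np
import pywt

def fuse_swt_ms(img_a, img_b, wavelet="db4", level=2):
    ca = pywt.swt2(img_a, wavelet, level=level)
    cb = pywt.swt2(img_b, wavelet, level=level)
    fused = []
    for (a_lo, a_hi), (b_lo, b_hi) in zip(ca, cb):
        lo = 0.5 * (a_lo + b_lo)                    # average approximations
        hi = tuple(np.where(np.abs(a) >= np.abs(b), a, b)
                   for a, b in zip(a_hi, b_hi))     # MS rule on details
        fused.append((lo, hi))
    return pywt.iswt2(fused, wavelet)
```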

3.2 Region Segmentation Algorithm

The target detection (TD) operator aims to segment both source frames into target regions, which contain the significant information such as moving people and vehicles, and background regions. A novel target detection method is proposed in this paper based on the characteristics of IR imaging. First, a region-merging method [19] is adopted to segment the initial IR frame; the target regions, which have high contrast with the neighboring background, are easy to find in the segmented IR frame. A confidence measure [20] is then computed for each candidate region. Since it would be very inefficient to compute the confidence measure for every candidate in every frame, a model-matching method is adopted to find the target regions in the subsequent frames. A target model is obtained from the intensity information of the target region in the previous frame, and only a small region of the following frame, corresponding to (and slightly larger than) the target region in the previous frame, is matched rather than the whole frame. The initial detection operator based on segmentation and the confidence measure is repeated whenever no target is detected over several successive frames. Target detection in the visible sequence is handled similarly to the IR sequence.
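The model-matching step can be sketched as a windowed template search: the target's intensity pattern from the previous frame is compared only against a slightly enlarged neighborhood in the next frame. The sum-of-squared-differences criterion, the margin parameter and the function interface below are illustrative assumptions; the paper does not specify the matching criterion.

```python
import numpy as np

def match_target(prev_frame, next_frame, bbox, margin=8):
    """Find the target from prev_frame inside a slightly enlarged window
    of next_frame. bbox = (row, col, height, width) in prev_frame."""
    r, c, h, w = bbox
    template = prev_frame[r:r + h, c:c + w].astype(np.float64)

    # Search window: previous target region enlarged by `margin` pixels.
    r0, c0 = max(r - margin, 0), max(c - margin, 0)
    r1 = min(r + h + margin, next_frame.shape[0])
    c1 = min(c + w + margin, next_frame.shape[1])

    best_ssd, best_pos = np.inf, (r, c)
    for i in range(r0, r1 - h + 1):
        for j in range(c0, c1 - w + 1):
            patch = next_frame[i:i + h, j:j + w].astype(np.float64)
            ssd = np.sum((patch - template) ** 2)   # SSD criterion
            if ssd < best_ssd:
                best_ssd, best_pos = ssd, (i, j)
    return best_pos
```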

3.3 Fusion Rules in the Target Region

To preserve as much information as possible, a special fusion rule is employed in the target regions. Assume that target detection gives $M$ target region maps $T_{IR} = \{t_{IR}^1, t_{IR}^2, \ldots, t_{IR}^M\}$ in the IR frame and $N$ target region maps $T_V = \{t_V^1, t_V^2, \ldots, t_V^N\}$ in the corresponding visible frame. Each target map is down-sampled by $2^m$ (in accordance with the resolution of the decomposition coefficients at level $m$) to give a decimated target map at each level. The target maps of both source frames are analyzed jointly, $T_J = T_{IR} \cup T_V$, and the frame is segmented into three sets: the single target regions, the overlapped target regions and the background. The overlapped target regions are $T_O = T_{IR} \cap T_V$; the single target regions are the target regions with no overlap, $T_S = T_J \setminus T_O$, so that clearly $T_J = T_S \cup T_O$; and the background is $B = \overline{T_J}$. In the single target regions, the fusion rule can be written as

$$c_f(x, y) = \begin{cases} c_{ir}(x, y), & \text{if } (x, y) \in T_{IR} \\ c_v(x, y), & \text{if } (x, y) \in T_V \end{cases} \quad (1)$$

In a connected overlapped target region $t \in T_O$, a similarity measure between the two sources is defined as

$$M(t) = \frac{2 \sum_{(x,y) \in t} I_{ir}(x, y) \, I_v(x, y)}{\sum_{(x,y) \in t} [I_{ir}(x, y)]^2 + \sum_{(x,y) \in t} [I_v(x, y)]^2} \quad (2)$$

where $I_{ir}$ and $I_v$ denote the IR and visible frames respectively. Then an energy index of the coefficients within the overlapped region is computed in each frame,

$$S_i(t) = \sum_{(x,y) \in t} c_i(x, y)^2, \qquad i = ir, v \quad (3)$$

A similarity threshold $\alpha \in [0, 1]$ is introduced; normally $\alpha = 0.85$ is appropriate. In case $M(t) < \alpha$, the fusion rule in the overlapped target region $t \in T_O$ can be written as

$$c_f(x, y) = \begin{cases} c_{ir}(x, y), & \text{if } S_{ir}(t) \ge S_v(t) \\ c_v(x, y), & \text{otherwise} \end{cases} \quad (4)$$

In case $M(t) \ge \alpha$, a weighted-average method is adopted:

$$c_f(x, y) = \begin{cases} \varpi_{max}(t) \, c_{ir}(x, y) + \varpi_{min}(t) \, c_v(x, y), & \text{if } S_{ir}(t) \ge S_v(t) \\ \varpi_{min}(t) \, c_{ir}(x, y) + \varpi_{max}(t) \, c_v(x, y), & \text{if } S_{ir}(t) < S_v(t) \end{cases} \quad (5)$$

where the weights $\varpi_{min}(t)$ and $\varpi_{max}(t)$ are obtained from

$$\varpi_{min}(t) = \frac{1}{2} \left( 1 - \frac{1 - M(t)}{1 - \alpha} \right), \qquad \varpi_{max}(t) = 1 - \varpi_{min}(t) \quad (6)$$

Finally, in the background regions, the simplest MS rule is adopted.
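Written as code, the overlapped-region logic of Eqs. (2)-(6) is compact. The sketch below operates on the coefficient and intensity values of one connected region $t$, flattened to arrays; the array-based interface and function name are illustrative assumptions.

```python
import numpy as np

def fuse_overlapped_region(c_ir, c_v, i_ir, i_v, alpha=0.85):
    """Fuse coefficients over one overlapped region t (Eqs. (2)-(6)).
    c_ir, c_v: decomposition coefficients over t; i_ir, i_v: source
    intensities over t; alpha: similarity threshold (alpha < 1)."""
    # Similarity measure M(t), Eq. (2).
    m = 2.0 * np.sum(i_ir * i_v) / (np.sum(i_ir ** 2) + np.sum(i_v ** 2))
    # Energy indexes S_ir(t), S_v(t), Eq. (3).
    s_ir, s_v = np.sum(c_ir ** 2), np.sum(c_v ** 2)

    if m < alpha:
        # Low similarity: coefficient selection, Eq. (4).
        return c_ir if s_ir >= s_v else c_v

    # High similarity: weighted average, Eqs. (5)-(6).
    w_min = 0.5 * (1.0 - (1.0 - m) / (1.0 - alpha))
    w_max = 1.0 - w_min
    if s_ir >= s_v:
        return w_max * c_ir + w_min * c_v
    return w_min * c_ir + w_max * c_v
```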

3.4 Optimum Weight Coefficients Based on Feedback of Performance Information

In the target region, the moving target is fused with the rules described above, the fused weight coefficients being calculated by Eqs. (1)-(6). When the target moves through the sequence, the fused result in this region is not influenced by the background; the fusion is static ("stiff"). Here, an improved dynamic image fusion based on feedback of performance information is proposed; its flowchart is shown in Fig. 4.

Fig. 4 The flowchart of the improved dynamic image fusion (target region fusion → fused target → evaluation indexes AG/En/SF/AMI/... → performance information on fused effectiveness fed back → optimum weight coefficients)

Here AG denotes the average gradient, En the entropy, SF the spatial frequency, AMI the average mutual information, and CE the cross-entropy. AG and AMI are taken as the two quantitative performance indexes that are fed back to determine the optimum weight coefficients. This feedback fusion rule can adjust the fusion coefficients according to the moving target and the background; in other words, the feedback of performance information improves the fusion result and its effectiveness in the target regions [23].
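One plausible realization of this feedback loop, sketched below under our own assumptions, scores a small set of candidate weight splits on the fused target region and keeps the best one. Entropy stands in for AMI here (AMI additionally needs the source frames), and the candidate grid and additive score are illustrative, not the authors' exact update rule.

```python
import numpy as np

def average_gradient(img):
    # AG: mean magnitude of the intensity gradient over the region.
    gy, gx = np.gradient(img.astype(np.float64))
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0)))

def entropy(img):
    # En: Shannon entropy of the gray-level histogram (8-bit data assumed).
    hist, _ = np.histogram(img, bins=256, range=(0, 256), density=True)
    hist = hist[hist > 0]
    return float(-np.sum(hist * np.log2(hist)))

def feedback_weight(region_ir, region_v, candidates=np.linspace(0.5, 1.0, 11)):
    """Pick the weight w_max whose fused target region scores best."""
    best_w, best_score = 0.5, -np.inf
    for w in candidates:
        fused = w * region_ir + (1.0 - w) * region_v
        score = average_gradient(fused) + entropy(fused)
        if score > best_score:
            best_w, best_score = w, score
    return best_w
```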

4 The Platform of the Visible-Infrared Dynamic Image Fusion System

4.1 Hardware of the Image Fusion System

The Visible-Infrared Dynamic Image Fusion System (VIDIF) consists of two subsystems: the Visible-Infrared Image Dynamic Acquisition sub-System (VIIDAS) and the Multi-Source Image Fusion sub-System (SIFS). VIIDAS performs the synchronous acquisition and real-time registration of visible and infrared data. It consists of a pan-tilt system, visible and infrared sensors, a visible electric


lens and its controller, video capture cards, a PC and other components. Figure 5 shows the hardware integration scheme of VIIDAS, and Figure 6 shows the actual VIIDAS.

Fig. 5 Hardware integration scheme of VIIDAS

Fig. 6 Photos of the VIIDAS (CEDIP infrared camera, UM-301 CCD camera, pan-tilt unit)

The visible imager is a UM-301 camera from UNIQ, a Hyper HAD CCD visible sensor with high-performance near-infrared imaging capability; the far-infrared imager is from CEDIP of France. The effective resolution of the UM-301 is 752×582. The V410 video capture card, from Micro-view Science & Technology Co., Ltd. of China, can achieve 768×576 (PAL) or 640×480 (NTSC) resolution on each of its four inputs. The visible electric lens of VIIDAS can zoom and change focus and aperture. Zooming the visible lens gives the visible and infrared sensors the same imaging angle of view, so that one target has the same size in the visible and infrared images and the subsequent registration is easier. The two-axis pan-tilt (PT) unit was made by Googol Technology Ltd. of Shenzhen. The vertical rotation range is 0°~320°, and the horizontal rotation range is -90°~90°. The rotation resolution is 0.00036°, and the maximum rotation speed is 180°/s. The visible and infrared cameras are fixed side by side on the PT, ensuring that the optical axis of the visible sensor is parallel to that of the infrared sensor and that their imaging coordinate axes run parallel; in this way, registration is less difficult.

4.2 Software of the Image Fusion System

SIFS is a software system developed in MATLAB (The MathWorks). It performs image enhancement, image registration, image fusion, performance evaluation and other tasks. Many fusion algorithms in common use recently, including the algorithm proposed in this paper, are implemented in SIFS. SIFS can simulate real-time dynamic image sequence fusion and evaluate fusion performance. The interface of the SIFS software is shown in Fig. 7.

Fig. 7 The interface of the SIFS software (modules for image enhancement, static image registration, dynamic image fusion, and others)

5 Experimental Results and Analysis

The proposed improved fusion scheme was examined on real-world dynamic images (image sequences). The infrared and visible sequences were acquired with the VIIDAS. Fig. 8 shows some fusion results for spatially registered visual (visible light) and infrared (IR, thermal, 3-5 μm) image sequences consisting of 32 frames per sensor. Visual inspection of the results of each method shows that the proposed improved dynamic scheme is superior to the generic scheme for the same transform [24].

Fig. 8 Fused results of visual (visible light) and IR image sequences. (a), (b): a pair of frames from the visual and IR sequences; (c): fused with the average gray arithmetic (AGA); (d): fused with the Laplacian pyramid transform (LPT); (e): fused with the contrast pyramid transform (CPT); (f): fused with the grade pyramid transform (GPT); (g): fused with the DWT (DB4); (h): fused with the LR DWT (DB4), using the proposed region-based target-detection dynamic fusion scheme. The original IR and visible images were taken with the CEDIP infrared camera and the UM-301 CCD camera on 17 Dec. 2006 at the Image Lab, SJTU.

To quantify temporal stability and consistency, we adopt the quantitative measure $I((S_v, S_{ir}), F)$ proposed by Rockinger; more details can be found in [3]. We computed the average mutual information (AMI) over the 31 sets of inter-frame differences (IFDs) for each scheme with each wavelet fusion method. In Table 1, DB4 means the Daubechies wavelet of order 4; BIOR4.4 denotes the biorthogonal wavelet, where the two 4s are the orders of the synthesis and analysis filters respectively; Q-shift9 is the filter designed by Kingsbury [21]; and "Generic" and "Proposed" denote the generic wavelet fusion scheme (fusing frames one by one) and the proposed improved region-based dynamic fusion method. Clearly, the results in Table 1 are consistent with the conclusions of the visual inspection. In addition, the LR DWT performs better than the DWT and DT CWT in both the generic and dynamic schemes, and it is comparable with the DWF in temporal stability and consistency while having a lower computational cost because of its lower redundancy.
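The temporal-stability measure can be sketched as follows: mutual information between each source IFD and the corresponding fused IFD, averaged over the sequence. This is a simplified single-source rendering of Rockinger's $I((S_v, S_{ir}), F)$; the bin count and function names are illustrative assumptions.

```python
import numpy as np

def mutual_information(a, b, bins=64):
    # MI between two images from their joint gray-level histogram.
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return float(np.sum(pxy[nz] *
                        np.log2(pxy[nz] / (px[:, None] * py[None, :])[nz])))

def ami_over_ifds(source_seq, fused_seq):
    """Average MI between source IFDs and fused IFDs (31 IFD pairs for a
    32-frame sequence); frames are assumed to be float arrays."""
    scores = [mutual_information(s2 - s1, f2 - f1)
              for s1, s2, f1, f2 in zip(source_seq[:-1], source_seq[1:],
                                        fused_seq[:-1], fused_seq[1:])]
    return float(np.mean(scores))
```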


Table 1 The AMI for the IFDs of the visual and IR image sequences.

Fusion      DWT      DWT        DT CWT     LR DWT   LR DWT     DWF      DWF
Scheme      DB4      BIOR4.4    Q-shift9   DB4      BIOR4.4    DB4      BIOR4.4
Generic     0.5233   0.5453     0.6362     0.8534   0.8624     0.9066   0.9173
Proposed    0.6211   0.6302     0.7108     0.9144   0.9200     0.9021   0.9104

Table 2 The quantitative evaluation of different fusion methods on the visual and IR image sequences.

Method   AG     En     SF     AMI    CE
AGA      1.61   6.53   4.14   4.00   1.68
LPT      1.77   6.54   4.55   4.10   1.86
CPT      2.21   6.83   4.89   3.89   1.65
GPT      2.03   6.59   3.35   3.65   1.78
DWT      2.14   6.76   5.51   3.83   1.81
LR DWT   1.99   6.82   3.91   4.02   1.63

All of these quantitative evaluation indexes are described in reference [22]. Two further infrared and visible sequence experiments were carried out with the improved dynamic image fusion method, using video data supplied by Sarnoff of the USA. The original infrared and visible images are shown in Fig. 9(a), (b) and Fig. 10(a), (b); the fusion results are shown in Fig. 9(c) and Fig. 10(c).

Fig. 9 Infrared and visible fusion using the LR DWT: (a) infrared image, (b) visible image, (c) fusion result

Fig. 10 Infrared and visible fusion using the LR DWT: (a) infrared image, (b) visible image, (c) fusion result

6 Conclusion

In this paper, an improved dynamic image fusion scheme for infrared and visible sequences based on region target detection is proposed. A target detection technique is first employed to segment the source images into target and background regions, and different fusion rules are adopted in the target and background regions. Two quantitative performance indexes are fed back to determine the optimum weight coefficients for the fusion rules. This feedback of performance information improves the fusion result and its effectiveness in the target regions, as the feedback fusion rule can adjust the fusion coefficients according to the moving target and the background. The platform of the VIIDAS, including its hardware and software, was also introduced. IR and visible dynamic image fusion experiments with the VIIDAS indicate that the proposed improved method is effective and efficient, achieving better performance than the generic fusion method without feedback information.

Acknowledgments

This work has been jointly supported by the National Defense Foundation of China (A1420060161-5), Foundation Project (2007AA704319) and Aviation Foundation Project (40204101).

References

[1] H. Maitre, I. Bloch, Image fusion, Vistas in Astronomy, 41(3), pp. 329-335, 1997.
[2] M.I. Smith, J.P. Heather, A review of image fusion technology in 2005, in Proc. SPIE, 5782, pp. 29-45, 2005.
[3] O. Rockinger, Image sequence fusion using a shift invariant wavelet transform, in Proc. IEEE International Conference on Image Processing, 3, pp. 288-291, 1997.
[4] M. Abidi, R. Gonzalez, Data Fusion in Robotics and Machine Intelligence, Academic Press, USA, 1992.

[5] A. Toet, Hierarchical image fusion, Machine Vision and Applications, 3, pp. 1-11, 1990.
[6] P.J. Burt, R.J. Kolczynski, Enhanced image capture through fusion, in Proc. 4th Int. Conf. on Computer Vision, pp. 173-182, 1993.
[7] H. Li, B.S. Manjunath, S.K. Mitra, Multisensor image fusion using the wavelet transform, in Proc. IEEE International Conference on Image Processing, Austin, Texas, 1, pp. 51-55, 1994.


[8] L.J. Chipman, T.M. Orr, L.N. Lewis, Wavelets and image fusion, in Proc. IEEE International Conference on Image Processing, 3, pp. 248-251, 1995.
[9] I. Koren, A. Laine, F. Taylor, Image fusion using steerable dyadic wavelet transforms, in Proc. IEEE International Conference on Image Processing, Washington, D.C., pp. 232-235, 1995.
[10] P. Hill, N. Canagarajah, D. Bull, Image fusion using complex wavelets, in Proc. 13th British Machine Vision Conference, University of Cardiff, pp. 144-152, 2002.
[11] Z. Zhang, R. Blum, Region-based image fusion scheme for concealed weapon detection, in Proc. 31st Annual Conference on Information Sciences and Systems, pp. 44-49, 1997.
[12] B. Matuszewski, L.-K. Shark, M. Varley, Region-based wavelet fusion of ultrasonic, radiographic and shearographic non-destructive testing images, in Proc. 15th World Conference on Non-Destructive Testing, Rome, 2000.
[13] G. Piella, H. Heijmans, Multiresolution image fusion guided by a multimodal segmentation, in Proc. Advanced Concepts for Intelligent Vision Systems, Ghent, Belgium, 2002.
[14] G. Piella, A region-based multiresolution image fusion algorithm, in Proc. ISIF Fusion 2002 Conference, Annapolis, 2002.
[15] G. Piella, A general framework for multiresolution image fusion: from pixels to regions, Information Fusion, 4, pp. 259-280, 2003.
[16] J.J. Lewis, R.J. O'Callaghan, S.G. Nikolov, D.R. Bull, C.N. Canagarajah, Region based fusion using complex wavelets, in Proc. 7th International Conference on Information Fusion, Stockholm, Sweden, pp. 555-562, 2004.
[17] E.P. Simoncelli, W.T. Freeman, E.H. Adelson, D.J. Heeger, Shiftable multiscale transforms, IEEE Trans. Information Theory, Special Issue on Wavelet Transforms and Multiresolution Signal Analysis, 38, pp. 587-607, 1992.
[18] G.T. Strang, Wavelets and dilation equations: a brief introduction, SIAM Review, 31(4), pp. 614-627, 1989.
[19] F. Nielsen, R. Nock, On region merging: the statistical soundness of fast sorting, with applications, in Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2, pp. 19-26, 2003.
[20] A. Yilmaz, K. Shafique, M. Shah, Target tracking in airborne forward looking infrared imagery, Image and Vision Computing, 21, pp. 623-635, 2003.
[21] N.G. Kingsbury, A dual-tree complex wavelet transform with improved orthogonality and symmetry properties, in Proc. IEEE International Conference on Image Processing, pp. 375-378, 2000.
[22] Xiao Gang, Jing Zhongliang, Wu Jianmin, Liu Congyi, Synthetically evaluation system for multi-source image fusion and experimental analysis, Journal of Shanghai Jiaotong University (Science), E-11(3), pp. 263-270, 2006.
[23] Xiao Gang, Jing Zhongliang, Shu Wang, Optimal color image fusion approach based on fuzzy integral, Imaging Science Journal, 55(4), pp. 189-196, 2007.
[24] Xiao Gang, Yang Bo, Jing Zhongliang, Infrared and visible dynamic image sequence fusion based on region target detection, in Proc. 10th International Conference on Information Fusion, Quebec, Canada, pp. 80-85, 2007.
