Comparison of Near Infrared and Visible Image Fusion Methods

Sami Varjo, Jari Hannuksela
Machine Vision Group, Infotech Oulu and Department of Electrical and Information Engineering
P.O. Box 4500, FI-90014 University of Oulu, Finland
Email: [email protected], [email protected]

Sakari Alenius
Nokia Research Center, Tampere, Finland
Email: [email protected]

Abstract—Visible images are usually merged with mid-range infrared images for surveillance enhancement or for non-visual tasks such as face recognition. IR images contain information that differs from that in visible-range images. We test the fusion of near infrared images with visible spectrum images for detail enhancement and contrast highlighting, evaluating the results with several quality metrics. It is shown that computationally light fusion methods can be utilized for visual enhancement. A PCA-based approach produces promising results for low ambient light images merged with flash near infrared images.

I. INTRODUCTION

IR images contain information that differs from that in visible-range images. The IR reflectance of objects may differ from their visible-light reflectance: foliage often appears much brighter in IR images, and some non- or semi-transparent objects may become transparent at IR wavelengths, and vice versa. Fusion of images with such differing contents could be used to enhance the image quality of mobile devices if suitable cameras are available.

Existing methods for combining infrared images with visible spectrum images concentrate heavily on surveillance and remote sensing applications. The fusion goal in surveillance is to highlight the objects of interest visible in thermal images against the visible-image surroundings [1]. The results can contain the IR band data highlighted with unnatural colors for good perception. Another goal is creating colored night vision images. Waxman et al. combine thermal images with visible images using a combination of IR-excited and IR-suppressed visible images with color remapping [2]. These pseudo-colored night images are more pleasing to look at than plain IR images, but the color mapping resembles natural daylight coloring only slightly [3]. In remote sensing applications, multiband image data is fused for increased spatial resolution and improved information representation [4]. Fusion of IR and visible light images has also been used for face recognition, but there the information is combined for increased information content, not for improved visual appearance [5].

Krishnan and Fergus have presented the idea of dark flash photography, where image pairs are taken with an IR flash under low ambient light conditions [6]. The motivation for using an IR flash was to obtain good low light images without disturbing the subjects, since the visible intensity of the applied IR flash was two orders of magnitude lower than that of a regular flash.

The Matlab implementation of their method was reported to have an execution time of 25 minutes for a 1.3 megapixel image, or 3 minutes if the spatial constraints of the algorithm were relaxed.

We propose fusing near infrared images with visible spectrum images to obtain results with improved contrast and increased information content. Several computationally lightweight approaches are tested for the fusion, and the results are compared using several fusion metrics.

II. METHODS

A. Fusion Quality Assessment

Fusion quality was estimated using several different metrics: Qx, the edge preservation value [7], and the universal image quality index based values Qp, considering the saliency of the inputs, Qw, weighting salient image areas, and Qe, the edge dependent measure [8]. Qc is a universal quality index measure that has been extended to take into account the similarity between the inputs and the resulting images [9]. As a different type of quality value, human perception based measures were calculated using the same contrast sensitivity filters as in [10], namely the Mannos-Sakrison filter for QM, Daly's filter for QD, and Ahumada's filter for QA. The combined mutual information MI and the fusion symmetry FS were also calculated [11].

B. Fusion

A method inspired by the unsharp masking technique was applied, combining the value channels of the input images while retaining the colors of the visible-range image (A). The normalized IR image was subtracted from the value channel of the visible image represented in HSV color space, and the renormalized difference image was then subtracted, with a given weight, from the original value channel. The PCA-based method for flash/no-flash fusion by Alenius and Bilcu [12] was also tested, in both RGB (B) and LAB (C) color spaces. Their method divides the image content into low- and high-pass bands, determines weights for tessellated data using PCA, and interpolates these fusion weights to cover the whole image.
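To make the fusion step concrete, below is a minimal sketch of method (A) as described above, assuming OpenCV for the HSV conversion and an illustrative weight value; the weight actually used in the experiments is not stated in the text.

import cv2
import numpy as np

def normalize(x):
    # Rescale to [0, 1]; the epsilon guards against flat images.
    x = x.astype(np.float32)
    return (x - x.min()) / (x.max() - x.min() + 1e-8)

def fuse_method_a(vis_rgb, ir_gray, weight=0.5):  # weight=0.5 is an assumption
    hsv = cv2.cvtColor(vis_rgb.astype(np.float32) / 255.0, cv2.COLOR_RGB2HSV)
    v = hsv[..., 2]
    # Normalized IR subtracted from the value channel, then renormalized.
    diff = normalize(v - normalize(ir_gray))
    # The renormalized difference is subtracted from the original value channel.
    hsv[..., 2] = np.clip(v - weight * diff, 0.0, 1.0)
    return (cv2.cvtColor(hsv, cv2.COLOR_HSV2RGB) * 255.0).astype(np.uint8)

The PCA-based weighting of [12] can be sketched in the same spirit; the Gaussian low-/high-pass split, the tile size, and the bilinear interpolation of the weight map below are assumptions of this illustration, not the authors' exact choices.

def pca_fusion(img1, img2, tile=32, sigma=5.0):  # tile and sigma are assumptions
    # Split each grayscale input into low- and high-pass bands.
    low1 = cv2.GaussianBlur(img1, (0, 0), sigma)
    low2 = cv2.GaussianBlur(img2, (0, 0), sigma)
    high1, high2 = img1 - low1, img2 - low2
    h, w = img1.shape
    wmap = np.zeros((h // tile, w // tile), np.float32)
    for i in range(wmap.shape[0]):
        for j in range(wmap.shape[1]):
            a = high1[i*tile:(i+1)*tile, j*tile:(j+1)*tile].ravel()
            b = high2[i*tile:(i+1)*tile, j*tile:(j+1)*tile].ravel()
            # Leading eigenvector of the 2x2 covariance gives the tile weights.
            vals, vecs = np.linalg.eigh(np.cov(np.stack([a, b])))
            v = np.abs(vecs[:, np.argmax(vals)])
            wmap[i, j] = v[0] / (v[0] + v[1] + 1e-8)
    # Interpolate the tessellated weights to cover the whole image.
    wfull = cv2.resize(wmap, (w, h), interpolation=cv2.INTER_LINEAR)
    return 0.5 * (low1 + low2) + wfull * high1 + (1.0 - wfull) * high2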

Fig. 1. Examples of visible spectrum and near infrared images: (left) the daylight images, (right) the lowlight and flash near infrared images.

Several other in-house methods were also tested: fusion weighting based on input variances (D); min/max fusion, where the result is a pixelwise sum of the maximum and minimum of the inputs with an additional weight for the visible data (E); and negative/positive feedback fusion inspired by [2], in both RGB (F) and HSV (G) color spaces. As external references, contrast pyramid fusion [13] (H) and Laplacian pyramid fusion [14] with a maximum value selection rule (I) were applied to the value channel of the HSV image and the grayscale IR image. Discrete wavelet decomposition based fusion [15] with a maximum coefficient selection policy was also applied (J).

C. Image Acquisition

The images for the IR-visible fusion experiments were obtained with a Sigma SD14 digital SLR camera and a Yashinon 50 mm f/1.4 lens with an M42 Pentax screw mount. The camera's own IR cut filter between the lens and the sensor (Foveon X3) was removed. For the ambient light images, a Schott RG IR band pass filter was used (B+W #486 IR/UV Cut); its stop band started at 670 nm, with less than 10 % transmission at 720 nm. The IR pass (visible cut) filter was an unbranded glass filter passing IR above 720 nm. The overlap of the IR cut and IR pass filters was determined to be very small (vis : IR : IR pass+cut = 1600 : 200 : 1). The lowlight scenes were captured underexposed by about 2 f-stops or more, as indicated by the camera's light meter. The RAW images were preprocessed by manually tuning the exposures, and the IR images were converted to grayscale. Examples of input image pairs are presented in Fig. 1.

III. RESULTS

A. Daylight Image Fusion

With abundant ambient light, the quality measures of the results were scattered; no single outstanding approach was clearly distinguishable, as shown by the mean quality values for 36 image pairs in Table I. Visual inspection suggests that the Laplacian pyramid based approach (I) yields slightly better results than (A) or (C), although the differences were small (Fig. 2). The edge preservation and universal quality index based metrics also support this observation. Color distortions are clearly the main problem in the fusion. A color space that separates intensity from the color values can be used to alleviate the problem, as can be seen from the PCA method in the RGB (B) and LAB (C) color spaces.
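As a concrete illustration of the best-performing daylight method (I), the following is a minimal sketch of Laplacian pyramid fusion [14] with per-pixel maximum-magnitude selection. It assumes OpenCV, equal-size float32 inputs in [0, 1] whose dimensions are divisible by 2**levels, and an assumed pyramid depth; the depth used in the experiments is not stated.

import cv2
import numpy as np

def laplacian_pyramid(img, levels):
    # Band-pass residuals plus the final low-pass level.
    pyr, cur = [], img
    for _ in range(levels):
        down = cv2.pyrDown(cur)
        up = cv2.pyrUp(down, dstsize=(cur.shape[1], cur.shape[0]))
        pyr.append(cur - up)
        cur = down
    pyr.append(cur)
    return pyr

def fuse_laplacian_max(vis_value, ir_gray, levels=4):  # levels=4 is an assumption
    pa = laplacian_pyramid(vis_value, levels)
    pb = laplacian_pyramid(ir_gray, levels)
    # Maximum value selection on the detail bands, average on the low-pass band.
    fused = [np.where(np.abs(a) >= np.abs(b), a, b)
             for a, b in zip(pa[:-1], pb[:-1])]
    out = 0.5 * (pa[-1] + pb[-1])
    for lap in reversed(fused):
        out = cv2.pyrUp(out, dstsize=(lap.shape[1], lap.shape[0])) + lap
    return np.clip(out, 0.0, 1.0)

The fused result would then replace the V channel of the HSV visible image, as in the other value-channel methods above.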

B. Flash-IR-Visible Image Fusion

The results for 25 image pairs taken in low ambient lighting and merged with IR-flash images are collected in Table II. Under low light conditions, PCA fusion in the LAB color space (C) clearly produces the best results by the edge preservation and universal image quality index based measures (Fig. 2). The same method applied in the RGB color space performs much worse on these metrics; visually, the RGB color space results contain some highlight anomalies and slight color distortions. The negative/positive feedback fusion on the value channel of HSV (G) gives the best values on the HVS based measures. However, visual inspection shows that the PCA-based approach yields the fewest color artifacts in the results.

The clear benefit of the PCA approach compared to the others tested is that it does not necessarily need color image smoothing, or at least the need is greatly reduced. While PCA in the LAB color space produced visually the best results, its main problem is washed out colors. This is partly because the low ambient light image does not contain sufficient color information; the mismatch between the visible and infrared intensities can also be too great. A tone mapping method might be required to correct the result colors.

The average execution times of the unoptimized Matlab implementations ranged between 1.6 and 8.1 seconds, implying that all the tested methods are computationally light. A more definitive analysis of the computational complexity cannot be made from the Matlab implementations.

IV. CONCLUSION

Images captured with abundant ambient light can be fused with several approaches. All the tested methods were computationally considerably lighter than the methods reported in the literature.

TABLE I
MEAN QUALITY VALUES FOR THE FUSED IR AND VISIBLE IMAGES WITH ABUNDANT AMBIENT LIGHT, WITH STANDARD ERRORS

Method | MI            | FS*           | MI/FS         | Qx            | Qp            | Qw            | Qe            | Qc            | Qm*           | Qd*           | Qa*
A      | 0.6689±0.0332 | 0.0755±0.0118 | 74.644±25.653 | 0.6531±0.0131 | 0.8225±0.0126 | 0.9101±0.0066 | 0.8678±0.0128 | 0.8121±0.0109 | 0.8520±0.0338 | 1.3472±0.0486 | 0.3415±0.0153
B      | 0.5748±0.0330 | 0.1349±0.0151 | 8.1601±28.959 | 0.4901±0.0116 | 0.6469±0.0122 | 0.8011±0.0057 | 0.7443±0.0091 | 0.6469±0.0091 | 0.8325±0.0332 | 1.3196±0.0484 | 0.3306±0.0137
C      | 0.5722±0.0330 | 0.1070±0.0151 | 54.782±0.3646 | 0.6915±0.0116 | 0.8329±0.0122 | 0.9022±0.0057 | 0.9045±0.0091 | 0.8579±0.0091 | 0.8414±0.0598 | 1.3465±0.0787 | 0.3275±0.0271
D      | 0.6595±0.0377 | 0.1164±0.0087 | 6.9538±0.7628 | 0.5064±0.0157 | 0.7122±0.0168 | 0.8352±0.0118 | 0.7458±0.0168 | 0.6934±0.0146 | 0.8145±0.0317 | 1.2929±0.0454 | 0.3244±0.0137
E      | 0.6483±0.0356 | 0.0646±0.0065 | 15.798±2.2455 | 0.5718±0.0135 | 0.7587±0.0146 | 0.8684±0.0087 | 0.8211±0.0116 | 0.7346±0.0129 | 0.8035±0.0282 | 1.2858±0.0403 | 0.3176±0.0115
F      | 0.6430±0.0349 | 0.0569±0.0066 | 29.452±6.4311 | 0.6386±0.0141 | 0.8127±0.0134 | 0.8966±0.0098 | 0.8636±0.0148 | 0.8046±0.0114 | 0.8276±0.0279 | 1.3099±0.0388 | 0.3294±0.0125
G      | 0.6701±0.0359 | 0.0666±0.0081 | 21.082±3.2982 | 0.6379±0.0146 | 0.7981±0.0146 | 0.8857±0.0129 | 0.8584±0.0161 | 0.7972±0.0126 | 0.8170±0.0304 | 1.2916±0.0398 | 0.3257±0.0135
H      | 1.8502±0.0214 | 0.4681±0.0231 | 4.4115±0.2796 | 0.1114±0.0098 | 0.3541±0.0169 | 0.5458±0.0181 | 0.6831±0.0230 | 0.4360±0.0267 | 0.5874±0.0149 | 0.8109±0.0327 | 1.2681±0.0454
I      | 0.5296±0.0289 | 0.0984±0.0157 | 24.175±5.7165 | 0.7078±0.0124 | 0.8186±0.0150 | 0.9150±0.0071 | 0.9224±0.0090 | 0.8339±0.0130 | 0.7108±0.0156 | 1.1782±0.0207 | 0.2779±0.0075
J      | 0.6569±0.0334 | 0.0696±0.0117 | 43.801±22.380 | 0.6402±0.0143 | 0.8194±0.0116 | 0.8991±0.0094 | 0.8607±0.0149 | 0.8054±0.0100 | 0.8435±0.0355 | 1.3383±0.0516 | 0.3381±0.0158

* a smaller value is better

TABLE II
MEAN QUALITY VALUES FOR FUSED FLASH-IR AND VISIBLE IMAGES, WITH STANDARD ERRORS

Method | MI            | FS*           | MI/FS          | Qx            | Qp            | Qw            | Qe            | Qc            | Qm*           | Qd*           | Qa*
A      | 0.4229±0.0283 | 0.1112±0.0216 | 8.9440±0.2675  | 0.4443±0.0257 | 0.6651±0.0263 | 0.7174±0.0287 | 0.6412±0.0340 | 0.7105±0.0243 | 0.7364±0.0266 | 1.1883±0.0357 | 0.2998±0.0129
B      | 0.4773±0.0354 | 0.1123±0.0196 | 11.2358±0.3684 | 0.3998±0.0254 | 0.5568±0.0287 | 0.6679±0.0272 | 0.5958±0.0374 | 0.5578±0.0291 | 0.7449±0.0317 | 1.2083±0.0414 | 0.3027±0.0152
C      | 0.3990±0.0265 | 0.1354±0.0220 | 8.5587±0.2466  | 0.5489±0.0125 | 0.7194±0.0156 | 0.7959±0.0144 | 0.7592±0.0210 | 0.7759±0.0153 | 0.8059±0.0376 | 1.2966±0.0518 | 0.3262±0.0168
D      | 0.4133±0.0297 | 0.1437±0.0181 | 5.1102±0.3360  | 0.3954±0.0225 | 0.6201±0.0292 | 0.7350±0.0195 | 0.6113±0.0296 | 0.6261±0.0298 | 0.8314±0.0565 | 1.3096±0.0726 | 0.3423±0.0245
E      | 0.4096±0.0308 | 0.0821±0.0120 | 18.4658±0.5237 | 0.4280±0.0253 | 0.6329±0.0265 | 0.6939±0.0289 | 0.6115±0.0356 | 0.6696±0.0234 | 0.7531±0.0292 | 1.2086±0.0383 | 0.3125±0.0160
F      | 0.4370±0.0309 | 0.1323±0.0185 | 6.9824±0.3401  | 0.4749±0.0273 | 0.6001±0.0285 | 0.6396±0.0368 | 0.5924±0.0429 | 0.7180±0.0233 | 0.7007±0.0301 | 1.1527±0.0391 | 0.2798±0.0146
G      | 0.4691±0.0286 | 0.1196±0.0173 | 6.4432±0.3377  | 0.4247±0.0366 | 0.5473±0.0376 | 0.5937±0.0447 | 0.5383±0.0477 | 0.6717±0.0330 | 0.6817±0.0215 | 1.1253±0.0281 | 0.2714±0.0102
H      | 0.3458±0.0206 | 0.0966±0.0151 | 17.8091±0.2792 | 0.3875±0.0184 | 0.5600±0.0259 | 0.6117±0.0289 | 0.4821±0.0358 | 0.6553±0.0216 | 0.8341±0.0946 | 1.2089±0.0715 | 0.3499±0.0450
I      | 0.3776±0.0289 | 0.1030±0.0189 | 7.4545±0.3120  | 0.4725±0.0272 | 0.6954±0.0261 | 0.7409±0.0265 | 0.7065±0.0307 | 0.7386±0.0239 | 0.7162±0.0219 | 1.1673±0.0296 | 0.2906±0.0110
J      | 0.3894±0.0281 | 0.1075±0.0181 | 9.2944±0.3178  | 0.3788±0.0206 | 0.6365±0.0298 | 0.7191±0.0209 | 0.6208±0.0241 | 0.6553±0.0295 | 0.8433±0.0471 | 1.3463±0.0705 | 0.3483±0.0228

* a smaller value is better

Fig. 2. Enlarged portions of fusion result examples for images taken in excess ambient light (top) and for images taken in lowlight (bottom).

The Laplacian pyramid decomposition of the intensity image and the infrared data, combined by maximum value selection, yielded the best fusion results for daylight image pairs. The PCA-based method originally developed for flash/no-flash image fusion performed the best in the lowlight cases where an IR flash was applied. In contrast to the other tested methods, a noisy lowlight image did not distort the colors in the result image, and the need for smoothing the color image input is greatly reduced. The main problem with the method appears to be washed out colors. Histogram stretching and white balancing could be used to enhance the colors; however, if the color information in the input images is poor, the colors cannot be enhanced in this way. Tone mapping is another possibility for improving the color representation of the resulting images.

The quality measures based on edge preservation and the universal quality index reflect the noisiness and detail transfer of the fusion results quite well when compared to visual inspections. The human perception based metrics did not perform as coherently, and none of the utilized metrics considers the correctness of the color information. When tone mapping or other color correction schemes are studied, a proper measure of color correctness must be sought and utilized.
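For reference, the universal image quality index that underlies several of the Q measures above can be sketched as follows; the sliding-window size and the handling of flat windows are assumptions of this illustration, not necessarily the choices made in [8], [9].

import numpy as np
from scipy.ndimage import uniform_filter

def uqi(x, y, win=8):  # win=8 is an assumed window size
    x, y = x.astype(np.float64), y.astype(np.float64)
    mx, my = uniform_filter(x, win), uniform_filter(y, win)
    vx = uniform_filter(x * x, win) - mx**2      # local variances
    vy = uniform_filter(y * y, win) - my**2
    cxy = uniform_filter(x * y, win) - mx * my   # local covariance
    num = 4.0 * cxy * mx * my
    den = (vx + vy) * (mx**2 + my**2)
    # Define Q = 1 where both windows are flat and equal (degenerate case).
    q = np.where(den > 1e-12, num / np.maximum(den, 1e-12), 1.0)
    return float(q.mean())                       # average over all windows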

The Laplacian pyramid decomposition gave good results in daylight scenes, and it might be beneficial to combine the PCA approach with the Laplacian pyramid approach. Currently the PCA is used to calculate weights for images whose high-pass and low-pass spectral components have been separated, whereas pyramid decomposition would enable fusion at multiple levels of detail.

REFERENCES

[1] A. Toet, J. IJspeert, A. Waxman, and M. Aguilar, "Fusion of visible and thermal imagery improves situational awareness," Displays, vol. 18, pp. 85–95, 1997.
[2] A. M. Waxman, M. Aguilar, D. A. Fay, D. B. Ireland, J. P. R. Jr., W. D. Ross, J. E. Carrick, A. N. Gove, M. C. Seibert, E. D. Savoye, R. K. Reich, B. E. Burke, W. H. McGonagle, and D. M. Craig, "Solid-state color night vision: Fusion of low-light visible and thermal infrared imagery," Lincoln Laboratory Journal, vol. 11, pp. 41–60, 1998.
[3] Y. Zheng and E. A. Essock, "A local-coloring method for night-vision colorization utilizing image analysis and fusion," Information Fusion, vol. 9, pp. 186–199, 2008.
[4] C. Pohl and J. L. van Genderen, "Multisensor image fusion in remote sensing: concepts, methods and applications," International Journal of Remote Sensing, vol. 19, pp. 823–854, 1998.
[5] X. Chen, P. J. Flynn, and K. W. Bowyer, "IR and visible light face recognition," Computer Vision and Image Understanding, vol. 99, pp. 332–358, 2005.

[6] D. Krishnan and R. Fergus, "Dark flash photography," ACM Transactions on Graphics (Proc. SIGGRAPH), 2009.
[7] C. Xydeas and V. Petrovic, "Objective image fusion performance measure," Electronics Letters, vol. 36, pp. 308–309, 2000.
[8] G. Piella and H. Heijmans, "A new quality metric for image fusion," in Proc. International Conference on Image Processing (ICIP), vol. 3, 2003, pp. III-173–176.
[9] N. Cvejic, A. Loza, D. Bull, and N. Canagarajah, "A similarity metric for assessment of image fusion algorithms," International Journal of Information and Communication Engineering, pp. 178–182, 2005.
[10] H. Chen and P. K. Varshney, "A human perception inspired quality metric for image fusion based on regional information," Information Fusion, vol. 8, pp. 193–207, 2007.
[11] C. Ramesh and T. Ranjith, "Fusion performance measures and a lifting wavelet transform based algorithm for image fusion," in Proc. 5th International Conference on Information Fusion, vol. 1, 2002, pp. 317–320.
[12] S. Alenius and R. Bilcu, "Combination of multiple images for flash re-lightning," in Proc. IEEE 3rd International Symposium on Communications, Control and Signal Processing, 2008, pp. 322–327.
[13] A. Toet, "Multi-scale contrast enhancement with applications to image fusion," Optical Engineering, vol. 31, pp. 1026–1031, 1992.
[14] P. Burt and E. Adelson, "The Laplacian pyramid as a compact image code," IEEE Transactions on Communications, vol. 31, pp. 532–540, 1983.
[15] S. Nikolov, P. Hill, D. Bull, and C. Canagarajah, Wavelets for image fusion.