IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 4, NO. 1, MARCH 2009
Improving Face Recognition via Narrowband Spectral Range Selection Using Jeffrey Divergence Hong Chang, Member, IEEE, Yi Yao, Andreas Koschan, Member, IEEE, Besma Abidi, and Mongi Abidi
Abstract—In order to achieve improved recognition performance in comparison with conventional broadband images, this paper presents a new method that automatically specifies the optimal spectral range for multispectral face images according to given illuminations. The novelty of our method lies in the introduction of a distribution separation measure and the selection of the optimal spectral range by ranking these separation values. The selected spectral ranges are consistent with the physics analysis of the multispectral imaging process. The fused images from these chosen spectral ranges are verified to outperform the conventional broadband images by 3%–20%, based on a variety of experiments with indoor and outdoor illuminations using two well-recognized face-recognition engines. Our findings can be applied to a new customized sensor design, associated with given illuminations, for improved face-recognition performance over conventional broadband images.

Index Terms—Face recognition, Jeffrey divergence, kernel density estimation, multispectral images, spectral distribution.
I. INTRODUCTION

Due to increasing security concerns, the accuracy of computer-based face-recognition systems for security applications, such as identity authentication and gate access, has attracted significant research attention. Existing face-recognition systems have demonstrated good recognition performance with frontal, centered faces acquired under controlled lighting conditions [1]. However, recognition performance deteriorates under varying illumination, especially when the images are acquired outdoors, even on the same day [2], [3]. Fig. 1 illustrates the changes in lighting that produce variations in face appearance. Illumination changes can vary the overall magnitude of the light intensity reflected back from a subject as well as the pattern of shading and shadows visible in an image. Chen et al. [4] summarized the main algorithms toward the goal of improving performance under varying lighting conditions. They classified these into three main algorithm categories: 1) preprocessing and normalization; 2) invariant feature
Manuscript received April 02, 2008; revised November 07, 2008. First published February 03, 2009; current version published February 11, 2009. This work was supported in part by the DOE University Research Program in Robotics under Grant #DOE-DEFG02-86NE37968 and in part by the National Science Foundation (NSF)-CITeR under Grant #01-598B-UT. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Davide Maltoni. The authors are with the Imaging, Robotics, and Intelligent Systems Laboratory, Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN 37996-2100 USA (e-mail: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TIFS.2008.2012211
Fig. 1. Face under different lighting conditions. (a) An RGB color image of a male subject under fluorescent light. (b) An RGB color image of the same subject by the same camera under daylight. (c) A monochromatic image of a female subject under fluorescent light. (d) A monochromatic image of the same female subject by the same camera under daylight.
extraction; and 3) face modeling. Preprocessing and normalization, such as histogram equalization and gamma correction [5], [6], are easy to implement but provide inferior performance. Edge maps, derivatives of the gray level, and Gabor-like filters [7] are all regarded as illumination-invariant signature images. However, empirical studies show that none of these representations is sufficient to overcome image variations due to changes in the direction of illumination. Under such circumstances, a 3-D face model can be used to render face images with different poses and under varying lighting conditions [8], [9]. Another promising approach to improve recognition performance under various illuminations combines near-infrared (IR) with visible images, since face images captured by near-infrared sensors are nearly invariant to changes in ambient illumination [10], [11]. The recognition results in [12] showed that visible and IR imagery perform similarly across algorithms and that fusion of visible and IR imagery is a viable means of enhancing performance beyond that of either acting alone. However, this approach requires a rigorous registration procedure or a specialized imaging system with hardware registration. In our previous work [13], we proposed a multispectral imaging system for improved face recognition in the visible spectrum without registration. The closest work to ours is by Pan et al. [14]–[16]. They acquired spectral images over the near-infrared spectrum (700–1000 nm) and demonstrated that spectral images of faces acquired in the near-infrared range can be used to recognize an individual under different poses and expressions. However, their recognition performance was not compared with that of the broadband images acquired with
1556-6013/$25.00 © 2009 IEEE
conventional cameras in the visible spectrum. It is evident from the literature that little research has been conducted using multispectral imaging in the visible domain to address the problem of face recognition, especially with respect to changes in illumination conditions. In this paper, our main contribution is an illumination-specific spectral range selection algorithm that provides a minimum set of narrowband spectral ranges for improved face-recognition performance in the visible domain. Let the total number of multispectral bands be $N$, and let $\lambda_i$ denote the central wavelength of the $i$th band. The complete set of multispectral bands is $\Lambda = \{\lambda_1, \ldots, \lambda_N\}$. Our method finds an optimal subset $\Lambda_K \subseteq \Lambda$ according to given illuminations such that the fused images from $\Lambda_K$ can outperform conventional broadband images. In this algorithm, the optimal subset $\Lambda_K$ with a given lighting condition is selected by ranking the separation values between genuine and imposter sets at different wavelengths. Kernel density estimation is used to estimate the distributions of the two sets, and Jeffrey divergence is employed to calculate the separation between the two distributions. Our experimental results demonstrate that the face-recognition rate can be substantially improved over that of conventional broadband images for indoor and outdoor environments. In addition, a simplified multispectral face imaging system with reduced acquisition and processing time can be built on our findings. For environments with fixed lighting conditions, such as offices, fixed narrowband filters with center wavelengths tuned to the selected spectral range can be used. For environments with possible illumination changes, tunable filters can be used, where the response of the filter is adjusted according to the current illumination conditions.
Therefore, in both scenarios, our findings open a new avenue in the customized design of imaging systems for face recognition and benefit security systems based on biometric recognition. Both our spectral range selection method and the corresponding hardware solutions can be implemented for applications with either fixed or varying lighting conditions. The multispectral imaging response is discussed in Section II, which serves as the physics analysis for the selection of the optimal spectral range. Our methodology is given in Section III. The proposed spectral range selection algorithm is presented in Section IV. A brief description of our multispectral imaging system and database is given in Section V. Experimental results are demonstrated in Section VI, and conclusions are drawn in Section VII.

II. PHYSICS ANALYSIS

In recent years, modern multispectral imagers have moved from mainly federally designated tasks, such as airborne and space-based military surveillance, to the commercial marketplace, including industrial, agricultural, geological, environmental, and medical communities. For example, "push-broom" designs, which incorporate stationary area-array detectors, are very common. Another attractive design uses combinations of charge-coupled device (CCD) cameras with various types of narrowband or broadband filters. The images are then processed by using normal high-capacity computational
Fig. 2. Spectral sensor response of (a) a monochromatic imaging system and (b) a multispectral imaging system.
machinery with software developed to properly treat the spectral data. With advances in filter technology, a tunable filter can be used in conjunction with a monochrome camera to produce a stack of images at a sequence of wavelengths, forming the multispectral images (MSIs). In this section, a physics analysis of the multispectral imaging process is presented to clarify the feasibility of selecting a set of narrowband images for improved face-recognition performance. There are two advantages of multispectral images over conventional images, which motivated our use of multispectral images for face recognition. First, it is well known that humans tend to easily spot color changes in skin tones. The main obstacle to universal color use in machine-vision applications is that cameras are not able to distinguish changes of surface color from color shifts caused by varying illumination [17]. Multispectral images in the visible domain can provide a new avenue to separate the color of a subject from the illumination. Second, with multispectral images, we have the freedom to emphasize and/or suppress the contribution of images from certain narrow bands. In contrast, conventional monochromatic and RGB images provide only one- or three-broadband responses. Fig. 2 compares the spectral sensor responses of a conventional monochromatic camera and a multispectral imaging system. In a multispectral imaging system, as shown in Fig. 3, there are four main factors that determine the intensity values of an image: 1) the spectral reflectance of a subject $R(\lambda)$; 2) the spectral distribution of the illumination $L(\lambda)$; 3) the spectral response of the
camera $S(\lambda)$; and 4) the transmittance of the liquid-crystal-tunable filter (LCTF) $T(\lambda)$. Therefore, the camera response $p_i$ corresponding to the $i$th band centered at wavelength $\lambda_i$, between $\lambda_i - \Delta$ and $\lambda_i + \Delta$, can be obtained by

$$p_i = \int_{\lambda_i - \Delta}^{\lambda_i + \Delta} L(\lambda) R(\lambda) S(\lambda) T_i(\lambda)\, d\lambda \qquad (1)$$

with $i = 1, \ldots, N$ and $N$ being the total number of multispectral bands. Therefore, the camera response is the result of an integration process, which can also be calculated in a discrete form as the summation of samples. Since each spectral image is acquired within a very narrow band, we use only one sample of each factor per band. Therefore, the sensor output at wavelength $\lambda_i$ can be represented as

$$p_i = L(\lambda_i) R(\lambda_i) S(\lambda_i) T(\lambda_i). \qquad (2)$$

We learn that the main difference in the skin reflectance of different ethnicities is in the level, not the shape, of the spectrum [18], [19]. The positions of peaks and valleys of the reflectance responses over the whole spectral range are similar for different ethnicities. The underlying reason for the closeness of skin reflectance is that the skin color appearance of all ethnic groups is formed from three colorants: 1) melanin; 2) carotene; and 3) hemoglobin [20]. Therefore, we are able to perform spectral band selection without distinguishing faces of different ethnicities. An example of normalized skin reflectance is shown in Fig. 4(a). A monochromatic CCD sensor response and the transmittance of a liquid-crystal-tunable filter are given in Fig. 4(b) and (c), respectively. These two factors are the same for different skin colors as long as the same imaging equipment is used for data acquisition. The combined spectral characteristics of these three factors, 1) skin reflectance, 2) sensor response, and 3) the transmittance of the LCTF, are plotted in Fig. 4(d). Therefore, given a particular camera, an LCTF, and a subject, the product $C(\lambda_i) = R(\lambda_i) S(\lambda_i) T(\lambda_i)$ remains the same. In this case, the camera response at band $\lambda_i$ under illumination $L$ is

$$p_i = C(\lambda_i) L(\lambda_i) \qquad (3)$$

which indicates that the camera response has a direct relationship to the incident illumination. Considering the illumination of halogen light, for instance, the spectral power distribution is plotted in Fig. 5(a). The hypothetical corresponding sensor response at the wavelengths in the visible domain is demonstrated in Fig. 5(b).
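Under the narrowband assumption, the discrete model in (2) and (3) reduces to an elementwise product of sampled factors. The following sketch illustrates this; the factor shapes, peak positions, and numerical values are purely illustrative assumptions, not measured spectra:

```python
import numpy as np

# Hypothetical spectral samples at band centers (480-720 nm, 10 nm steps).
# The factor names mirror Eq. (2): L = illumination, R = skin reflectance,
# S = sensor response, T = LCTF transmittance. Values are illustrative only.
wavelengths = np.arange(480, 730, 10)            # band centers in nm
L = np.exp(-((wavelengths - 610) / 80.0) ** 2)   # halogen-like peak near 610 nm
R = 0.2 + 0.5 * (wavelengths - 480) / 240.0      # reflectance rising with wavelength
S = np.full(wavelengths.shape, 0.8)              # flat sensor response
T = np.full(wavelengths.shape, 0.3)              # narrowband filter transmittance

# Eq. (2): one sample of each factor per narrow band.
p = L * R * S * T

# Eq. (3): with C = R*S*T fixed for a given camera/filter/subject,
# the band response tracks the illumination directly.
C = R * S * T
assert np.allclose(p, C * L)

best_band = wavelengths[np.argmax(p)]   # band with the strongest response
```

With these illustrative spectra, the strongest response lands near the illumination peak, consistent with the halogen example in Fig. 5.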
The magnitude of the resulting camera response reaches its maximum in the proximity of 610 nm.

III. ALGORITHM METHODOLOGY

In a face-recognition comparison, a gallery consists of a set of samples $\{g_1, \ldots, g_M\}$, where $M$ is the total number of samples in the gallery, with one sample per person. When a probe is presented to a system, it is compared with the entire gallery. The comparison between a probe and each gallery biometric sample produces a similarity score $s$.
Fig. 3. Camera response $p(\lambda)$ is the result of integration of all the factors involved, including the spectral distribution of the illumination $L(\lambda)$, the reflectance of the subject $R(\lambda)$, the transmittance of the LCTF $T(\lambda)$, and the spectral response of the camera $S(\lambda)$.
Since similarity scores quantitatively evaluate the similarity between two images under comparison, a larger similarity score indicates more resemblance between the two samples. Our approach starts from the similarity score calculation. Denote $s_{ij}^k$ as the similarity score between the probe image of the $i$th subject collected at the $k$th band and the gallery image of the $j$th subject. The similarity scores in each band can be divided into two groups, referred to as the genuine $G$ and imposter $I$ sets. The genuine and imposter sets are defined as $G^k = \{s_{ij}^k \mid i = j\}$ and $I^k = \{s_{ij}^k \mid i \neq j\}$, respectively. The genuine set contains similarity scores with probe and gallery images from the same subject, while the imposter set consists of similarity scores with probe and gallery images from different subjects. Ideally, the genuine and imposter sets should cluster at the high and low ends of the score scale, respectively, without overlap, so that an appropriate threshold can be derived to completely separate the genuine matches from the imposter ones. In such a condition, a perfect 100% recognition rate can be achieved. However, in practical situations, overlapping regions often exist between these two sets. Therefore, an important criterion in evaluating the effectiveness of the recognition system is the separation between the distributions of the similarity scores of the genuine and imposter sets. For a face-recognition system using multispectral images, the behavior of the similarity scores from various bands differs substantially, which results in varying face-recognition rates. Fig. 6 shows the probability density functions (PDFs) of the genuine and imposter sets for bands 480 nm and 720 nm. The x-axis shows the similarity score values varying from 0 to 4 and the y-axis is the estimated probability density.
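The grouping of similarity scores into genuine and imposter sets can be sketched as follows; the score matrix here is random and purely illustrative:

```python
import numpy as np

# Sketch: split a similarity-score matrix into genuine and imposter sets
# for one spectral band. scores[i, j] is the similarity between probe i
# and gallery j; subjects share the same index in both sets.
rng = np.random.default_rng(0)
n_subjects = 6
scores = rng.uniform(0.0, 4.0, size=(n_subjects, n_subjects))

mask = np.eye(n_subjects, dtype=bool)
genuine = scores[mask]     # probe and gallery from the same subject (i == j)
imposter = scores[~mask]   # probe and gallery from different subjects (i != j)
```

The diagonal of the score matrix yields the genuine set and the off-diagonal entries the imposter set, matching the definitions above.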
From visual inspection, the separation in band 720 nm is more conspicuous, which suggests that the probe set collected at band 720 nm should produce a higher face-recognition rate than the probe set at band 480 nm. In this paper, we propose using the separation between the genuine and imposter sets to select the optimal band range for given illumination conditions. To achieve this goal, we need an accurate estimation of the PDFs of the genuine sets $G^k$ and imposter sets $I^k$ and a quantified measure to evaluate the separation between them. Recall that the total number of multispectral bands is $N$ and that $\lambda_k$ denotes the central wavelength of the $k$th band. The complete set of multispectral bands is $\Lambda = \{\lambda_1, \ldots, \lambda_N\}$. We want to find an optimal subset $\Lambda_K = \{\lambda_{(1)}, \ldots, \lambda_{(K)}\} \subseteq \Lambda$ such that $J(\lambda_{(k)}) \geq J(\lambda)$ for all $\lambda \in \Lambda \setminus \Lambda_K$, where
Fig. 4. (a) Normalized skin reflectance for different skin colors. (b) The sensor response of a monochromatic camera. (c) The transmittance of an LCTF. (d) The combined spectral characteristics of three factors: skin reflectance, sensor response, and the transmittance of the LCTF.
Fig. 5. (a) Spectral power distribution of halogen light. (b) Product of the four factors including skin reflectance, sensor response, the transmittance of the LCTF, and spectral power distribution of halogen light.
$K$ is the number of bands to be selected and $J(\lambda_{(k)})$ returns the $k$th largest separation measure.
IV. ALGORITHM DESCRIPTION

The pipeline of the face-recognition algorithm with our automated band selection mechanism is illustrated in Fig. 7. Face recognition typically starts with image preprocessing, including segmentation and normalization. Afterward, salient features are extracted. Based on the features extracted from pairs of probe and gallery images, similarity scores are computed. Then, band selection is performed as follows: 1) the similarity score distributions of the genuine and imposter sets are estimated by using kernel density functions; 2) the Jeffrey divergence is calculated to quantitatively describe the separation between these two distributions; and 3) the optimal spectral range is selected according to the requirement. After that, images from the selected bands are fused and fed into a classification engine that outputs the recognition rate.
A. Probability Density Function Estimation

From the similarity scores of various subjects in a probe set, the distributions $f_G$ and $f_I$ of the genuine and imposter sets are estimated by using kernel density estimation (KDE) [21]

$$f_G(s) = \frac{1}{|G| h} \sum_{s_i \in G} K\!\left(\frac{s - s_i}{h}\right) \qquad (4)$$

and

$$f_I(s) = \frac{1}{|I| h} \sum_{s_i \in I} K\!\left(\frac{s - s_i}{h}\right). \qquad (5)$$
Here, $K(\cdot)$ denotes the kernel function used for density estimation, $h$ is the width, and $M$ represents the total number of samples in the probe data sets and the number of subjects in the gallery. Note that since face identification performance
PDF. Therefore, the Jeffrey divergence is used because the expectation is computed with respect to both PDFs. The Jeffrey divergence is given by

$$JD(f_G, f_I) = \int \left[ f_G(s) - f_I(s) \right] \log \frac{f_G(s)}{f_I(s)}\, ds. \qquad (9)$$
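A discrete sketch of the Jeffrey divergence on sampled PDFs follows; the two Gaussian score distributions below are illustrative stand-ins for estimated genuine and imposter densities, not measured data:

```python
import numpy as np

def jeffrey_divergence(f, g, ds, eps=1e-12):
    """Discrete approximation of Eq. (9): the symmetric (Jeffrey)
    divergence sum((f - g) * log(f / g)) * ds between two sampled PDFs."""
    f = np.asarray(f, dtype=float) + eps   # eps guards against log(0)
    g = np.asarray(g, dtype=float) + eps
    return float(np.sum((f - g) * np.log(f / g)) * ds)

# Illustrative genuine/imposter score densities on the score grid [0, 4].
s = np.linspace(0.0, 4.0, 400)
ds = s[1] - s[0]
f_G = np.exp(-0.5 * ((s - 2.5) / 0.3) ** 2) / (0.3 * np.sqrt(2 * np.pi))
f_I = np.exp(-0.5 * ((s - 1.0) / 0.3) ** 2) / (0.3 * np.sqrt(2 * np.pi))
jd = jeffrey_divergence(f_G, f_I, ds)   # larger value = better-separated sets
```

By construction the measure is symmetric in its two arguments and vanishes when the two densities coincide, which is why it suits the comparison of two independent PDFs here.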
C. Band Selection
Fig. 6. Illustration of different separations between the PDFs of the similarity scores from the genuine and imposter sets. More separated PDFs are observed from band 720 nm than from band 480 nm.
Given the number of bands to be selected, $K$, the optimal bands can be chosen by sorting the $J$ values in descending order. The bands corresponding to the $K$ largest values in the sorted sequence are selected. If the number of bands to be selected is unknown, the degradation percentage of the $J$ values can be used. Let $J_{(1)} \geq J_{(2)} \geq \cdots \geq J_{(N)}$ denote the sorted divergence values. The degradation percentage is then defined as

$$D_k = \frac{J_{(1)} - J_{(k)}}{J_{(1)}}. \qquad (10)$$
is investigated, the number of samples in a probe set is equal to the number of samples in the gallery. In our implementation, the Gaussian kernel

$$K(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \qquad (6)$$

is used.
Given a predefined degradation percentage $D_T$, the bands with $D_k \leq D_T$ are selected.
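The ranking rule and the degradation-percentage rule of (10) can be sketched as follows; the divergence curve is illustrative, not computed from real scores:

```python
import numpy as np

# Sketch of Section IV-C: rank bands by their divergence values and keep
# either the K largest, or all bands whose degradation percentage
# D_k = (J_(1) - J_(k)) / J_(1) stays below a threshold D_T.
wavelengths = np.arange(480, 730, 10)
J = np.exp(-((wavelengths - 620) / 40.0) ** 2)   # illustrative divergence curve

order = np.argsort(J)[::-1]                      # indices, descending divergence
top_k = np.sort(wavelengths[order[:3]])          # K = 3 best bands

J_sorted = J[order]
D = (J_sorted[0] - J_sorted) / J_sorted[0]       # degradation percentages, Eq. (10)
selected = np.sort(wavelengths[order[D <= 0.10]])  # bands within 10% of the peak
```

With this curve both rules select the bands around 620 nm, mirroring how the paper's experiments pick a compact range near the divergence peak.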
D. Band Fusion
The choice of the width $h$ significantly affects the efficiency of the KDE. We select the width $h$ such that the asymptotic mean integrated square error (AMISE) [22] is minimized

$$h = \left[ \frac{R(K)}{\mu_2(K)^2\, R(f'')\, n} \right]^{1/5} \qquad (7)$$
Wavelet-based methods have been widely used for image fusion. The Haar wavelet-based pixel-level fusion described in [25] is applied. Given the registered narrowband images from the selected spectral range, a 2-D discrete wavelet decomposition is performed on each image to obtain the wavelet approximation coefficients and detail coefficients. The coefficients used in the inverse wavelet transform for the fused image are obtained by choosing the maximum among each type of coefficient. The 2-D inverse discrete wavelet transform is then performed to construct the fused image.

V. DATABASE DESCRIPTION
where $R(g) = \int g(x)^2\, dx$, $\mu_2(K) = \int x^2 K(x)\, dx$, and $n$ is the number of samples, with $f''$ the second derivative of the underlying density.
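A minimal sketch of the Gaussian-kernel estimate of (4)–(6) follows. Since the AMISE-optimal width in (7) depends on the unknown density, the sketch substitutes Silverman's Gaussian-reference rule of thumb for $h$ (an assumption, not necessarily the paper's exact choice):

```python
import numpy as np

def gaussian_kde(samples, grid):
    """Gaussian-kernel density estimate in the form of Eqs. (4)-(6).
    The width h uses Silverman's rule of thumb, a common plug-in
    stand-in for the AMISE-optimal bandwidth of Eq. (7)."""
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    h = 1.06 * samples.std(ddof=1) * n ** (-1 / 5)        # rule-of-thumb width
    u = (grid[:, None] - samples[None, :]) / h
    kernel = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)   # Gaussian kernel, Eq. (6)
    return kernel.sum(axis=1) / (n * h)

# Illustrative genuine similarity scores clustered near the high end.
scores = np.random.default_rng(1).normal(2.0, 0.4, 200)
s_grid = np.linspace(0.0, 4.0, 400)
density = gaussian_kde(scores, s_grid)   # estimated genuine-set PDF
```

The resulting density integrates to approximately one over the score range, which is what the divergence computation in Section IV-B assumes.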
B. Divergence Calculation

Once the PDFs of the similarity scores from the genuine and imposter sets are estimated, the remaining question is how to describe the divergence between the two PDFs. The Kullback–Leibler (KL) divergence [23] is a popular measure that evaluates the extent to which two PDFs agree, and it is expressed as

$$KL(f \,\|\, g) = \int f(s) \log \frac{f(s)}{g(s)}\, ds \qquad (8)$$

where $f$ and $g$ denote the true and estimated PDFs. The Jeffrey divergence [24] is a symmetric version of the KL distance with respect to $f$ and $g$. Since we are interested in the distance between two independent PDFs in our application, it is not practical to assign either PDF to be the true or estimated
In this section, the database we use to obtain our experimental results is introduced, along with the corresponding multispectral imaging system. Our mobile multispectral imaging system consists of a monochromatic camera, a liquid-crystal-tunable filter (LCTF), a digital RGB camera, a spectrometer, a frame grabber, and an onboard computer. All components are integrated on one translational platform to acquire well-aligned face images, as shown in Fig. 8(a). A detailed view of the multispectral imaging components is shown in Fig. 8(b). The LCTF can electronically tune a narrowband filter centered at various wavelengths in the visible spectrum and provides narrowband filters with a full-width-at-half-maximum bandwidth of 7 nm. A maximum of 321 narrowband MSIs can be acquired by continuously tuning the LCTF. We have acquired datasets with different illumination scenarios. For example, the illumination setups of halogen light and daylight are given in Fig. 9. The quadruple halogen lights, with a pair on each side of the participant, are shown in Fig. 9(a). The face data were acquired in daylight with side illumination. This was due to the fact that many participants were unable
Fig. 7. Illustration of the algorithm pipeline. The proposed band selection algorithm is highlighted in yellow.
Fig. 8. (a) All-inclusive multimodal and multispectral mobile imaging system. (b) Lateral view of the imaging system. (c) Multispectral imaging components.
Fig. 10. Examples of eight subjects: (a) Male Asian under fluorescent light. (b) Female Caucasian under fluorescent light. (c) Male Caucasian under fluorescent light. (d) Female of African descent under fluorescent light. (e) Female Asian Indian under daylight. (f) Male Caucasian under daylight. (g) Female Asian under daylight. (h) Male of African descent under daylight. Fig. 9. Two illumination setups. 1) Quadruple halogen lights with a pair on each side of the halogen lighting setup and 2) daylight with side illumination.
to maintain pose or expression with bright sunlight and wind streaming directly into their eyes. An outdoor data-acquisition setup with side illumination is shown in Fig. 9(b). A spectrometer (Ocean Optics USB2000) with a cosine corrector and a light meter (EasyView30) are used to record the irradiance and illuminance of the lighting situation for each record. There are a total of 82 participants of different ethnicities, ages, facial-hair characteristics, and genders in our multispectral face database, with 2624 face images. The image resolution is 640 by 480 pixels and the interocular distance is about 120 pixels. The database was collected in 11 sessions between August 2005 and May 2006, with some participants photographed multiple times. The database consists of 76% male and 24% female participants; the ethnic composition is 57% Caucasian; 23% Asian (Chinese, Japanese, Korean, and similar ethnicity); 12% Asian Indian; and 8% of African descent. Fig. 10 illustrates the demographics of the database, including different ethnicities, age groups, facial-hair characteristics, and genders. Fig. 11 shows samples from one data record in the IRIS-M database with a variation in lighting conditions and elapsed time. Noticeable differences exist between spectral band images. For example, the image at band 660 nm looks brighter than that at band 580 nm. This can be explained by the image formation process of the imaging system.
study the situation where the illuminations for gallery and probe images are different. Therefore, the experiments are designed according to different acquisition illuminations for gallery and probe images. In our experiments, similarity scores are obtained via two well-known recognition engines: 1) Identix's FaceIt and 2) Cognitec's FaceVACS. These two engines are used as proof of the generality of our proposed approach. Jeffrey divergence values of each band's probe set of 35 samples are calculated and then normalized to [0, 1]. Before the selection of spectral ranges, polynomial smoothing is conducted to reduce undesired disturbances from noisy data. In addition, the embedded trend is easier to observe with the smoothed data. The spectral ranges are finally selected and the corresponding narrowband images are fused by Haar-wavelet fusion. A numerical measure is used to evaluate recognition performance. To compare the overall performances at different ranks, a mapping operation projecting the multiindex CMC curve to a single number, CMCM, is defined as

$$\mathrm{CMCM} = \frac{\sum_{r=1}^{M} N(r)/r}{M \sum_{r=1}^{M} 1/r} \qquad (11)$$

where $M$ is the number of gallery and probe images, $r$ represents the rank number, and $N(r)$ denotes the number of probe images that can be correctly identified at and below rank $r$. The factor $1/r$ can be viewed as a weight, which decreases monotonically as $r$ increases. As a result, rank one is dominant and contributes the most to the value of CMCM. Better face-recognition performance is indicated by a higher CMCM value, which varies between 0 and 1.

A. Gallery: Fluorescent, Probe: Halogen
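The CMCM mapping can be sketched as follows; the normalization shown (dividing by $M \sum_r 1/r$ so that a perfect system scores exactly 1) is an assumption consistent with the stated [0, 1] range, not a verbatim transcription of the paper's formula:

```python
import numpy as np

def cmcm(correct_at_rank, M):
    """Single-number CMC summary in the spirit of Eq. (11).
    correct_at_rank[r-1] = N(r), the number of probes identified at or
    below rank r; M is the number of gallery/probe images. Ranks are
    weighted by 1/r (so rank one dominates) and the result is
    normalized to lie in [0, 1]."""
    r = np.arange(1, M + 1)
    N = np.asarray(correct_at_rank, dtype=float)
    return float(np.sum(N / r) / (M * np.sum(1.0 / r)))
```

A perfect engine, with all probes correct at rank one so that N(r) = M for every r, scores 1; any rank-one misses pull the value below 1.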
Fig. 11. Sample images in a data record in the IRIS-M database; spectral image under daylight, side illumination at (a) band 580 nm, (b) band 620 nm, (c) band 660 nm, and (d) band 700 nm. Spectral image under indoor halogen light at (e) band 580 nm, (f) band 620 nm, (g) band 660 nm, and (h) band 700 nm. Conventional broadband image (i) under halogen light, (j) under fluorescent light, (k) under another fluorescent light, and (l) under daylight.
VI. EXPERIMENTAL RESULTS

The optimal spectral range is selected from a series of multispectral images under given illuminations so as to improve face-recognition performance in the following experiments. Three experiments are designed to investigate the recognition performance of fused images from the selected spectral range versus conventional images. In the real world, face images are often acquired under different lighting conditions and compared with the database images. It is reasonable and important to
In this experiment, the spectral range of multispectral face images under halogen light is selected via the proposed algorithm, while gallery images are acquired under another indoor lighting, fluorescent light. There are 25 sets of probe images involved in the selection. They are subspectral narrowband images between wavelengths 480 nm and 720 nm, with an increment of 10 nm, acquired under halogen light. The Jeffrey divergence value of each probe set is given in Fig. 12. The results using Identix are indicated by triangles and those using Cognitec by stars. From Fig. 12, we can see that the optimal range is between 600 nm and 640 nm. Therefore, two and three narrowband images from the selected range are fused to form new probe sets. Fig. 13 compares their recognition performance with that of the probe sets with single narrowband images and conventional broadband images based on the rank-one recognition rate and CMCM values. One data record used in this experiment is also shown in Fig. 14. The subspectral images under halogen light at wavelengths 580 nm, 620 nm, 660 nm, and 700 nm are shown in Fig. 14(a)–(d). A conventional broadband image, the fused image via two bands (610 nm and 640 nm), and the fused image via three bands (610 nm, 630 nm, and 640 nm) are also given in Fig. 14(e)–(g). The fusion of the top two and three optimal bands selected by Identix and Cognitec outperforms the conventional broadband monochromatic images in the rank-one rate
Fig. 12. Jeffrey divergence of 25 probe sets, including subspectral narrowband images from 480 nm to 720 nm with an increment of 10 nm under halogen light, while the gallery images are acquired under fluorescent light.
Fig. 14. (a)–(d) Subspectral images under halogen light at wavelength 580 nm, 620 nm, 660 nm, and 700 nm. (e) Conventional broadband image under halogen light as the comparison. (f) Fused image via two bands: 610 nm and 640 nm. (g) Fused image via three bands: 610 nm, 630 nm, and 640 nm. (h) Gallery image: conventional broadband image under fluorescent light.
Fig. 13. (a) Rank-one recognition rate and (b) CMCM values of different probe sets, including conventional broadband images, single subspectral images, and fused images from the selected spectral range in Experiment 1.
and by 4.5% in CMCM values, which verifies the effectiveness of the proposed band selection scheme.

B. Gallery: Daylight, Probe: Halogen

In this experiment, to study the influence of illumination changes on spectral range selection, gallery images are acquired under daylight instead of fluorescent light as in Experiment 1. The probe sets are the same.
Fig. 15. Jeffrey divergence of 25 probes, including subspectral narrowband images from 480 nm to 720 nm with an increment of 10 nm under halogen light, while the gallery images are acquired under daylight.
The spectral range of multispectral face images under halogen light is selected via Jeffrey divergence. Fig. 15 illustrates that both recognition engines select the optimal range from 600 nm to 640 nm. In parallel, Fig. 16 demonstrates the rank-one recognition rate of various probes, including the single subspectral band, conventional broadband, and fused
Fig. 16. (a) Rank-one recognition rate and (b) CMCM values of different probe sets, including conventional broadband images, single subspectral images, and fused images from a selected spectral range in Experiment 2.
images from two and three bands. As expected, the fused images from the selected narrow bands yield a higher recognition rate, indicated by a 20% relative improvement in the rank-one rate and a 6.7% relative improvement in CMCM in comparison with the conventional broadband image set. One data record used in this experiment is shown in Fig. 17. Note that the images from adjacent bands are correlated. In multispectral image fusion, more bands do not guarantee better performance or more useful information. On the contrary, they may even deteriorate the results. For instance, with the given lighting condition in Experiment 2, the fusion of bands 610 and 620 nm provides a better recognition rate than the fusion of the three bands 610, 620, and 630 nm. The time lapse between gallery and probe acquisitions is more than six months in Experiment 2, which is very close to the practical face-recognition situation. We found that even with different gallery illuminations in Experiment 1 and Experiment 2, a similar optimal spectral range is selected when the probe images are acquired under the same illuminant, halogen light.

C. Gallery: Fluorescent, Probe: Daylight

In this experiment, the most challenging lighting condition, daylight, is investigated for probe sets. To simulate practical face recognition, stable indoor fluorescent light is used for gallery
Fig. 17. (a)–(d) Subspectral images under halogen light at wavelength 580 nm, 620 nm, 660 nm, and 700 nm. (e) Conventional broadband image under halogen light as the comparison. (f) Fused image via two bands: 610 nm and 620 nm. (g) Fused image via three bands of 610 nm, 620 nm, and 630 nm. (h) Gallery image: conventional broadband image under daylight.
Fig. 18. Jeffrey divergence of 13 probes including subspectral narrowband images from 480 nm to 720 nm with an increment of 20 nm under varying daylight while the gallery images are acquired under fluorescent light.
images while all of the probes are acquired under varying daylight. The spectral range is selected among 13 sets of narrowband spectral images from 480 nm to 720 nm in increments of 20 nm. The divergence of each probe set is given in Fig. 18. The top three bands are 680 nm, 700 nm, and 720 nm. In Fig. 19, the rank-one recognition rates and CMCM values of various probes are illustrated. We found that probes from the single band at 720
Fig. 19. (a) Rank-one recognition rates and (b) CMCM of various probes, including band 680 nm, 700 nm, 720 nm, broadband monochromatic image, fused images from 680 nm and 720 nm, and fused images from 680 nm, 700 nm, and 720 nm.
nm and the fused images of two and three bands outperform the conventional broadband images by 3% in the rank-one rate and 3.4% in CMCM values. Fig. 20 shows one data record used in this experiment.
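The band-selection step described above can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the paper's implementation: the genuine/impostor similarity-score samples, the evaluation grid, the kernel bandwidth, and the band list are all hypothetical placeholders. The Jeffrey divergence is computed in the symmetric form of Rubner et al. [24], with m = (p + q)/2, over Gaussian kernel density estimates of the score distributions; the band whose genuine and impostor score distributions separate most is ranked first.

```python
import numpy as np

def kde_pdf(samples, grid, bandwidth):
    """Gaussian kernel density estimate, normalized into a discrete
    probability vector over the evaluation grid."""
    diffs = (grid[:, None] - samples[None, :]) / bandwidth
    dens = np.exp(-0.5 * diffs**2).sum(axis=1)
    return dens / dens.sum()

def jeffrey_divergence(p, q, eps=1e-12):
    """Jeffrey divergence between two discrete distributions,
    in the symmetric form with m = (p + q) / 2 (Rubner et al.)."""
    p = p + eps
    q = q + eps
    m = 0.5 * (p + q)
    return float(np.sum(p * np.log(p / m) + q * np.log(q / m)))

# Hypothetical example: rank narrowband probe sets by how well their
# genuine and impostor similarity scores separate (all values synthetic).
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 256)
bands = {480: (0.55, 0.45), 680: (0.75, 0.35), 700: (0.80, 0.30)}
ranking = {}
for band, (mu_gen, mu_imp) in bands.items():
    genuine = rng.normal(mu_gen, 0.05, 200)   # synthetic genuine scores
    impostor = rng.normal(mu_imp, 0.05, 200)  # synthetic impostor scores
    ranking[band] = jeffrey_divergence(
        kde_pdf(genuine, grid, 0.02), kde_pdf(impostor, grid, 0.02))

# A larger divergence means better-separated score distributions,
# so the top-ranked bands are the candidates for fusion.
best = max(ranking, key=ranking.get)
```

In this toy setup the heavily overlapping 480-nm score distributions receive a much smaller divergence than the well-separated long-wavelength bands, mirroring the ranking behavior reported above.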
Fig. 20. (a)–(d) Subspectral images under daylight at wavelength 580 nm, 620 nm, 660 nm, and 700 nm. (e) Conventional broadband image under daylight as the comparison. (f) Fused image via two bands: 680 nm and 720 nm. (g) Fused image via three bands: 680 nm, 700 nm, and 720 nm. (h) Gallery image: conventional broadband image under fluorescent light. TABLE I SUMMARY OF THREE EXPERIMENTS ON SELECTED SPECTRAL RANGE AND CORRESPONDING PERFORMANCE IMPROVEMENT
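The fusion of images from the selected bands can be illustrated with a simple weighted-average sketch. This is a stand-in, not the paper's physics-based fusion method [13]; the equal weights, image sizes, and pixel data below are hypothetical.

```python
import numpy as np

def fuse_bands(band_images, weights=None):
    """Fuse co-registered narrowband images into a single grayscale image
    by weighted averaging (a simple stand-in for the paper's fusion step)."""
    stack = np.stack([img.astype(np.float64) for img in band_images])
    if weights is None:
        weights = np.full(len(band_images), 1.0 / len(band_images))
    weights = np.asarray(weights, dtype=np.float64)
    weights = weights / weights.sum()
    # Contract the weight vector against the band axis of the stack.
    fused = np.tensordot(weights, stack, axes=1)
    # Rescale to the full 8-bit range before handing to a recognition engine.
    fused -= fused.min()
    if fused.max() > 0:
        fused *= 255.0 / fused.max()
    return fused.astype(np.uint8)

# Hypothetical 64x64 narrowband frames, e.g. at 680, 700, and 720 nm.
rng = np.random.default_rng(1)
frames = [rng.integers(0, 256, (64, 64), dtype=np.uint8) for _ in range(3)]
fused = fuse_bands(frames)
```

Non-uniform weights (e.g. favoring the highest-divergence band) drop in through the `weights` argument, which is one way the two-band fusion could be preferred over three bands when an added band contributes mostly redundant, correlated information.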
D. Summary

We summarize our experimental results in Table I. We observe that when the probes are under a given illumination (halogen light), even with different gallery images, the optimal spectral range remains the same (600 nm to 640 nm). The optimal spectral range for face recognition agrees with the physics analysis of the imaging system in Section II: the peak found at 610 nm by the physics analysis lies within the spectral range selected by our proposed algorithm. Our algorithm consistently locates the optimal spectral range without prior knowledge of the imaging system's configuration or the characteristics of the illumination. The observation that the fused images based on optimal spectral selection provide a better recognition rate than broadband images can be explained as follows. The beneficial facial information, including the spectral information carried in a broadband image, is compromised by the imaging (integration) process. In contrast, our method utilizes and emphasizes the spectral information that contributes most to face recognition, and thereby improves recognition performance. In other words, the most beneficial information for face recognition from different bands and the
fusion technique are analyzed and selected toward improved face-recognition performance. The performance improvement with probes under varying outdoor daylight is not as significant as that with probes under indoor lighting. This performance degradation is mainly attributed to the fact that illumination such as daylight varies from moment to moment. The spatially inhomogeneous lighting distribution, with varying shadows on different subjects, further degrades the performance. Different preprocessing and feature selection methods may produce different results. In our experiments, to illustrate the performance difference, similarity scores were obtained via two well-known recognition engines: 1) Identix's FaceIt and 2) Cognitec's FaceVACS. These two recognition engines employ different preprocessing, feature selection, and recognition methods. In Figs. 12, 15, and 18, different divergences are observed from
different recognition engines. However, from our experimental results, we can see that despite these performance differences, the optimal bands selected via the two engines are similar. This verifies the effectiveness of our algorithm, which is general enough to accommodate systems based on different preprocessing and feature selection methods. We deliberately used gallery and probe sets from different illuminations to resemble practical applications, in which one has the freedom to collect a gallery set in a controlled and well-lit environment, while probe sets are usually collected in real surveillance conditions under poor and possibly varying illumination. However, to complete our investigation of spectral range selection, the following experiment based on face images under the same lighting condition for gallery and probe was conducted. Gallery images were acquired under halogen light, and probe images were collected under the same halogen light with the same group of subjects at a different time. Fig. 21 gives one data record used in this experiment. In Figs. 22 and 23, the rank-one recognition rates and CMCM values of various probes are illustrated. The rank-one recognition rate of the different probes is the same at 95%. The CMCM values of the fused images (97.5%) from bands 600 nm, 610 nm, and 630 nm are higher than that of the conventional broadband images (96.67%). Note that a smaller performance improvement is observed here than in the scenarios where different lighting conditions are used. The reason is that performance is already high when gallery and probes are collected under the same lighting condition, which leaves only marginal room for further improvement. From the aforementioned experiments, we conclude that the fused images obtained via our spectral range selection method outperform the conventional broadband images both when gallery and probes share the same lighting condition and when their lighting conditions differ.

Fig. 21. (a) Conventional broadband face image under halogen light. (b) Conventional broadband image of the same subject under the same halogen light at a different time. (c) Fused image via two bands: 600 nm and 610 nm. (d) Fused image via three bands: 600 nm, 610 nm, and 630 nm.

Fig. 22. Rank-one recognition rates of various probes, including the broadband monochromatic image, fused images from 600 nm and 610 nm, and fused images from 600 nm, 610 nm, and 630 nm.

Fig. 23. CMCM of various probes, including the broadband monochromatic image, fused images from 600 nm and 610 nm, and fused images from 600 nm, 610 nm, and 630 nm.

VII. CONCLUSION
A variation in illumination dramatically degrades face-recognition performance. We proposed using narrowband subspectral images instead of conventional broadband images to improve recognition performance. A spectral range selection algorithm was developed to choose the optimal band images under given illumination conditions. In our experiments with both tested recognition engines, FaceIt and FaceVACS, the spectral ranges of 600 nm to 640 nm and 680 nm to 720 nm are the optimal choices for probes under indoor halogen light and varying daylight, respectively. The selected optimal spectral bands are consistent with those specified by physics analysis with known system configuration and illumination characteristics, and they result in a 3%–20% improvement in the face-recognition rate in comparison with conventional broadband images. A simplified multispectral face imaging system can be built on the basis of this work and can be practically used as a customized sensor, associated with given illuminations, to benefit face-recognition-based security systems. Both our hardware solutions and our spectral range selection method can be readily implemented for applications with either fixed or varying lighting conditions.

REFERENCES

[1] P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss, "The FERET evaluation methodology for face recognition algorithms," IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 10, pp. 1090–1104, Oct. 2000.
[2] P. J. Phillips, P. Grother, R. J. Micheals, D. M. Blackburn, E. Tabassi, and J. M. Bone, FRVT 2002: Evaluation Rep. [Online]. Available: http://www.frvt.org/FRVT2002/documents.htm
[3] S.-W. Lee, S.-H. Moon, and S.-W. Lee, "Face recognition under arbitrary illumination using illuminated exemplars," Pattern Recogn., vol. 40, no. 5, pp. 1605–1620, 2007.
[4] W. Chen, M. J. Er, and S. Wu, "Illumination compensation and normalization for robust face recognition using discrete cosine transform in logarithm domain," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 36, no. 2, pp. 458–466, Apr. 2006.
[5] S. Shan, W. Gao, B. Cao, and D. Zhao, "Illumination normalization for robust face recognition against varying lighting conditions," in Proc. IEEE Workshop AMFG, 2003, pp. 157–164.
[6] X. Xie and K.-M. Lam, "Face recognition under varying illumination based on a 2D face shape model," Pattern Recogn., vol. 38, pp. 221–230, 2005.
[7] T. F. Cootes, G. J. Edwards, and C. J. Taylor, "Active appearance models," in Proc. Eur. Conf. Computer Vision, 1998, vol. 2, pp. 484–498.
[8] B. Moghaddam, T. Jebara, and A. Pentland, "Bayesian face recognition," Pattern Recogn., vol. 33, pp. 1771–1782, 2000.
[9] C. Liu and H. Wechsler, "A unified Bayesian framework for face recognition," in Proc. IEEE ICIP, 1998, pp. 151–155.
[10] D. A. Socolinsky, L. B. Wolff, J. D. Neuheisel, and C. K. Eveland, "Illumination invariant face recognition using thermal infrared imagery," in Proc. IEEE ICPR, 2001, vol. 1, pp. 527–534.
[11] G. Bebis, A. Gyaourova, S. Singh, and I. Pavlidis, "Face recognition by fusing thermal infrared and visible imagery," Image Vis. Comput., vol. 24, pp. 727–742, 2006.
[12] S. G. Kong, J. Heo, F. Boughorbel, Y. Zheng, B. R. Abidi, A. Koschan, M. Yi, and M. A. Abidi, "Adaptive fusion of visual and thermal IR images for illumination-invariant face recognition," Int. J. Comput. Vis., vol. 71, no. 2, pp. 215–233, 2007.
[13] H. Chang, A. Koschan, B. Abidi, and M. Abidi, "Physics-based fusion of multispectral data for improved face recognition," in Proc. IEEE ICPR, 2006, vol. III, pp. 1083–1086.
[14] Z. Pan, G. Healey, M. Prasad, and B. Tromberg, "Face recognition in hyperspectral images," IEEE Trans. Pattern Anal. Mach. Intell., vol. 25, no. 12, pp. 1552–1560, Dec. 2003.
[15] Z. Pan, G. Healey, M. Prasad, and B. Tromberg, "Hyperspectral face recognition under variable outdoor illumination," in Proc. SPIE, 2004, vol. 5425, pp. 520–529.
[16] Z. Pan, G. Healey, M. Prasad, and B. Tromberg, "Multiband and spectral eigenfaces for face recognition in hyperspectral images," in Proc. SPIE, 2005, vol. 5779, pp. 144–151.
[17] S. Li and A. Jain, Handbook of Face Recognition. New York: Springer, 2004.
[18] F. H. Imai, N. Tsumura, H. Haneishi, and Y. Miyake, "Principal component analysis of skin colour and its applications to colorimetric reproduction on CRT display and hardcopy," J. Imaging Sci. Technol., vol. 40, no. 5, pp. 422–430, 1996.
[19] H. Nakai, Y. Manabe, and S. Inokuchi, "Simulation and analysis of spectral distribution of human skin," in Proc. ICPR, 1998, vol. 2, pp. 1065–1067.
[20] E. A. Edwards and S. Q. Duntley, "The pigments and color of living human skin," Amer. J. Anat., vol. 65, no. 1, pp. 1–33, 1939.
[21] L. Wasserman, All of Statistics: A Concise Course in Statistical Inference. New York: Springer, 2005.
[22] A.-R. Mugdadi and E. Munthali, "Relative efficiency in kernel estimation of the distribution function," J. Statist. Res., vol. 15, no. 4, pp. 579–605, 2003.
[23] S. Kullback, Information Theory and Statistics. New York: Dover, 1997.
[24] Y. Rubner, C. Tomasi, and L. J. Guibas, "The earth mover's distance as a metric for image retrieval," Int. J. Comput. Vis., vol. 40, no. 2, pp. 99–121, 2000.
[25] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 2nd ed. Upper Saddle River, NJ: Prentice-Hall, 2002.
Hong Chang (M'05) received the B.S. and M.S. degrees in electronics engineering from Beihang University, Beijing, China, in 1998 and 2001, respectively, and is currently pursuing the Ph.D. degree in electrical engineering and computer sciences in the Imaging, Robotics and Intelligent Systems Lab at the University of Tennessee, Knoxville. Her research interests include topics in image fusion, pattern recognition, multispectral image processing, and statistical modeling.
Yi Yao received the B.S. and M.S. degrees in electrical engineering from Nanjing University of Aeronautics and Astronautics, Nanjing, China, in 1996 and 2000, respectively, and the Ph.D. degree in electrical and computer engineering from the University of Tennessee, Knoxville, in 2008. Currently, she is an Electrical Engineer at the Global Research Center of General Electric. Her research interests include object tracking, sensor planning, and multicamera surveillance systems.
Andreas Koschan (M’90) received the M.S. degree in computer science and the Dr.-Ing. degree in computer engineering from the Technical University Berlin, Berlin, Germany, in 1985 and 1991, respectively. Currently, he is a Research Associate Professor in the Department of Electrical and Computer Engineering at the University of Tennessee, Knoxville. His work has primarily focused on color image processing and 3-D computer vision, including stereo vision and laser-range finding techniques. He is a coauthor of three textbooks on 3-D image processing. Dr. Koschan is a member of IS&T.
Besma R. Abidi received the Master's degree from the National Engineering School of Tunis, Tunisia, in 1986, and the Principal Engineer diploma in electrical engineering (Hons.) and the Ph.D. degree from The University of Tennessee, Knoxville, in 1985 and 1995, respectively. Currently, she is a Research Assistant Professor with the Department of Electrical and Computer Engineering at the University of Tennessee, Knoxville. Her general areas of research are 2-D and 3-D intelligent computer vision, sensor positioning and geometry, and scene modeling.
Mongi Abidi received the Principal Engineer diploma in electrical engineering from the National Engineering School of Tunis, Tunisia, in 1981, and the M.S. and Ph.D. degrees in electrical engineering from The University of Tennessee, Knoxville, in 1985 and 1987, respectively. Currently, he is Professor and Associate Department Head in the Department of Electrical and Computer Engineering, directing activities in the Imaging, Robotics, and Intelligent Systems Laboratory. He conducts research in the field of 3-D imaging, specifically in the areas of scene building, scene description, and data visualization.