Fast and Robust Color Constancy Algorithm Based on Grey Block-Differencing Hypothesis

OPTICAL REVIEW Vol. 20, No. 4 (2013) 341–347

Shiming LAI, Xin TAN, Yu LIU, Bin WANG, and Maojun ZHANG

College of Information System and Management, National University of Defense Technology, Changsha, Hunan 410073, China
E-mail address: [email protected]

(Received March 20, 2013; Accepted April 30, 2013)

Abstract: Color constancy is a fundamental research topic in color and vision. Numerous methods have been proposed in recent years. New methods are highly accurate but tend to be more complex. This paper proposes a simple low-level statistical algorithm based on a new hypothesis, the grey block-differencing hypothesis, which states that the average of the reflectance differences of adjacent blocks in a scene is achromatic. The new method has almost the same complexity as the simplest methods (i.e., Grey-World and max-RGB). Experimental results demonstrate that the accuracy of the proposed method is exceptional. © 2013 The Japan Society of Applied Physics

Keywords: color constancy, grey block-differencing hypothesis, illuminant estimation, white balance

1. Introduction

Color is very important for images: it is the most intuitive information for human vision, and it has been widely utilized as an effective cue in many computer vision algorithms, such as feature extraction and description.1,2) Image color depends not only on the scene but also on the light source; that is, an object presents different colors under different light sources (Fig. 1). Human vision has the natural ability to correct for the effects of different light sources, and similar behavior is highly desirable in an image acquisition device. Filtering out the influence of the light source is essential to obtain consistent color cues in computer vision applications. The goal of color constancy is to convert an image captured under an unknown light source into an image under a canonical (e.g., white) light source.

Fig. 1. (Color online) The same scene captured under different light sources appears different: (a) image captured under illuminant F (color temperature 2700 K) and (b) image captured under illuminant D65 (color temperature 6500 K).

The common digital camera illustrates the need for color constancy. The task of the digital camera is to record a "real" scene in various environments with different lighting conditions, under which the same object may register different colors on the complementary metal oxide semiconductor (CMOS) or charge-coupled device (CCD) image sensor. Here "real" means faithful to the observation of the human eye, which obliterates the impact of different light sources and identifies a consistent object color under different kinds of light. Therefore, digital cameras require color constancy: their image signal processing flow includes a white balance process whose purpose is to make white objects always appear white, free from the influence of the illumination. White balancing is a form of color constancy.

Color constancy involves two steps: estimating the image light source and transforming the image according to the estimate. Estimating the light source is the more difficult and important step, because the estimate can be applied directly to image enhancement, which, in general, obtains satisfactory results. For example, the white balance module of a digital camera usually utilizes the estimated color-bias information to set the gain factors of the sensor's color channels. Another reason is that almost all color constancy methods involve approximately the same transformation step; thus, the focus is on light source estimation.

Color constancy methods can be classified into three groups:3) static, gamut-based, and learning-based methods. Grey-World,4) max-RGB,5) Shades-of-Grey,6) and Grey-Edge7) are four typical static methods; they estimate the light source from the low-level statistics of a color image. The Grey-World method computes only the average value of all pixels, and the max-RGB method computes only the maximum values, so these methods are easy and quick to execute. However, Grey-World and max-RGB are inaccurate under certain circumstances (e.g., when the image has a dominant color or some pixel values are saturated). The Grey-Edge method utilizes the derivative structure of images and can achieve satisfactory results when its parameters are selected appropriately; however, selecting optimal parameters is difficult, and Grey-Edge requires a preprocessing step of smoothing the image with a Gaussian kernel, so the image convolution becomes time consuming. The gamut-based method was introduced by Forsyth;8) its aim is to find a suitable mapping from the image gamut (unknown illuminant) into the canonical gamut (known illuminant). Learning-based methods estimate the illuminant using a model learned from training data; various machine-learning techniques, such as neural networks,9) and advanced image features, such as scene semantics,10) have been adopted. Gamut-based and learning-based methods are highly accurate, but their practical application is limited by several drawbacks, including the need for training, difficult implementation, and slow execution. For example, in a digital camera, the image signal processor must complete a number of image



processing tasks such as color filter array (CFA) interpolation, denoising, and compression. White balance is merely one task among numerous image processing tasks. Available hardware resources for white balance are limited. Many color constancy algorithms have limited applications because of excessive consumption of resources. This study proposes a simple, fast, and accurate color constancy algorithm. A low-level statistical algorithm is provided based on a new hypothesis: the grey blockdifferencing hypothesis. The proposed algorithm is then experimented on and analyzed. The experiments of this paper includes validation of the algorithm on a PC and of applications in an image signal processing hardware platform based on field-programmable gate array (FPGA). The remainder of this paper is organized as follows: Section 2 presents our hypothesis and algorithm. Sections 3 and 4 describe the experimental evaluation. Section 5 summarizes the paper. 2.

Color Constancy with Image Block Difference

2.1 Lambertian reflection model

The Lambertian reflection model, also known as the ideal diffuse reflection model, is an idealized approximation of rough surfaces. It assumes that the reflected light intensity is the same in all directions and does not change with the incident ray or viewing direction. According to the Lambertian reflection model, the value f(x) of a point x in the image depends on the incident spectral power distribution e(λ), the reflectance s(x, λ) of point x at wavelength λ, and the camera sensitivity function c(λ):

\[ f(x) = \int_{\omega} e(\lambda)\, c(\lambda)\, s(x, \lambda)\, d\lambda, \quad (1) \]

where x denotes the spatial coordinates, λ is the spectral wavelength of the light, and ω is the entire visible spectrum (wavelengths of 380 to 780 nm). In general, image light estimation is equivalent to estimating the observed color e of the light source, which combines the incident spectral power distribution e(λ) and the camera sensitivity function c(λ):

\[ e = \int_{\omega} e(\lambda)\, c(\lambda)\, d\lambda. \quad (2) \]
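To make the model concrete, the following Python sketch discretizes Eqs. (1) and (2) over the visible spectrum. All spectra here (the Gaussian-shaped illuminant and sensitivities) are synthetic placeholders, not data from the paper; real e(λ), c(λ), and s(x, λ) would come from measurements.

```python
import numpy as np

# Discretized sketch of Eqs. (1) and (2); all spectra are hypothetical.
lam = np.arange(380.0, 781.0, 10.0)                  # visible range, 10 nm steps
e_spd = np.exp(-((lam - 550.0) / 150.0) ** 2)        # illuminant SPD e(lambda)
c_rgb = np.stack([np.exp(-((lam - mu) / 40.0) ** 2)  # sensitivities c(lambda)
                  for mu in (600.0, 540.0, 460.0)])  # r, g, b peaks
s_x = np.random.rand(lam.size)                       # reflectance s(x, lambda)

f_x = (c_rgb * e_spd * s_x).sum(axis=1) * 10.0       # Eq. (1): pixel value f(x)
e_obs = (c_rgb * e_spd).sum(axis=1) * 10.0           # Eq. (2): light color e
```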

Given that both e(λ) and c(λ) are unknown, the estimation of the image light is an under-constrained problem in the image formation process; without further assumptions, it is unsolvable.11)

2.2 Grey-Edge hypothesis and its limitations

The Grey-World hypothesis assumes that the average reflectance in a scene is achromatic. Under this hypothesis, the mean values of the red, green, and blue channels are equal, so the differences between the channel means directly reflect the color cast of the illuminant. After the original Grey-World method,4) several improved versions12,13) have been proposed.

The Grey-Edge hypothesis assumes that the average of the reflectance differences in a scene is achromatic, so higher-order image statistics can be adopted to estimate the scene illuminant. Based on this hypothesis, a Gaussian filter and the Minkowski norm are introduced into the Grey-Edge method:

\[ \left( \int \left| \frac{\partial^n f_{c,\sigma}(x)}{\partial x^n} \right|^p dx \right)^{1/p} = k\, e_c^{n,p,\sigma}, \quad c \in \{r, g, b\}, \quad (3) \]

where f_c(x) is the value of channel c (r, g, or b) of image f(x), and f_{c,σ}(x) = f_c(x) ∗ G_σ is the image after local smoothing with a Gaussian filter G_σ of standard deviation σ. Here n = 0, 1, 2 is the order of the image derivative, p is the Minkowski norm parameter, e^{n,p,σ} = [e_r^{n,p,σ}, e_g^{n,p,σ}, e_b^{n,p,σ}]^T is the estimated light source, and k is a constant that ensures ‖e^{n,p,σ}‖ = 1, where ‖·‖ is the two-norm operator. The framework of Eq. (3) describes numerous classic methods uniformly: e^{0,1,0} represents the Grey-World method, e^{0,∞,0} the max-RGB method, e^{0,p,0} the Shades-of-Grey method, and e^{n,p,σ} (n = 1, 2) the first- and second-order Grey-Edge methods.
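The framework is compact enough to sketch in a few lines. The following Python/NumPy implementation of Eq. (3) is our sketch, not the authors' code: the n-th derivative magnitude is approximated by repeated axis-wise gradients, which omits cross terms a reference implementation may include.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def grey_edge_estimate(img, n=1, p=6, sigma=2.0):
    """Unit-norm illuminant estimate e^{n,p,sigma} per Eq. (3).
    `img` is a linear float HxWx3 array."""
    e = np.zeros(3)
    for ch in range(3):
        f = gaussian_filter(img[..., ch], sigma) if sigma > 0 else img[..., ch]
        if n == 0:
            d = f
        else:
            gx, gy = f, f
            for _ in range(n):                 # repeated axis-wise derivatives
                gx = np.gradient(gx, axis=1)
                gy = np.gradient(gy, axis=0)
            d = np.sqrt(gx ** 2 + gy ** 2)     # cf. Eq. (4) for n = 1
        e[ch] = (np.abs(d) ** p).mean() ** (1.0 / p)  # Minkowski norm
    return e / np.linalg.norm(e)               # k enforces ||e|| = 1

# e.g., grey_edge_estimate(img, n=0, p=1, sigma=0) reproduces Grey-World.
```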

However, the Grey-Edge methods have a few limitations. First, although the Grey-Edge method can be implemented in a few lines of Matlab code, its computation involves Gaussian kernel convolution, which slows the algorithm down. For the second-order Grey-Edge method, 4 ≤ σ ≤ 7 gives the best results in the experiment; assume σ = 4 and a kernel size of 25 × 25. Even if the second-order convolution is separated into the x and y directions, it still requires two convolutions with a 1 × 25 kernel, so its calculation time is about 50 times that of the Grey-World method. Second, the parameters σ and p of the Grey-Edge method must be selected; if they are not selected properly, the results are degraded, especially when little a priori information on the input image is available.3) Third, the image gradient computation itself is expensive. For example, the first-order gradient is

\[ \frac{\partial f_{c,\sigma}(x)}{\partial x^1} = \sqrt{ \left( \frac{\partial^{1+0} f_{c,\sigma}(x)}{\partial x^1 \partial y^0} \right)^2 + \left( \frac{\partial^{0+1} f_{c,\sigma}(x)}{\partial x^0 \partial y^1} \right)^2 }, \quad (4) \]

so the gradient must be calculated in the x direction and then in the y direction, and square and square-root operations are required in the final step. All these limitations hinder the practical application of the Grey-Edge method.

2.3 Color constancy based on the grey block-differencing hypothesis

This study proposes a simple color constancy method based on a new hypothesis: the average of the reflectance differences of adjacent blocks in a scene is achromatic (the grey block-differencing hypothesis). The Grey-Edge hypothesis does not consider the influence of noise and texture, so the Grey-Edge algorithm applies Gaussian filtering to overcome these effects. The grey block-differencing hypothesis addresses the same issue by utilizing block differences instead of pixel differences.

A simple color constancy algorithm follows from the new hypothesis. Figure 2 presents the method's flow chart. First, we divide the image into B_W × B_H blocks, each of size s × s. The average pixel value (R̄, Ḡ, B̄) is calculated in each block, yielding a small image of size B_W × B_H. Then, the second-order gradient of the small image is computed with a discrete Laplace operator. Finally, the average value of each channel of the gradient image is computed to estimate the light source e. The method can be described as

\[ \int \left( \mathrm{Lap} * f_c^s(x) \right) dx = k\, e_c^s, \quad c \in \{r, g, b\}, \quad (5) \]

where f_c^s(x) denotes the small image on channel c after s × s block-averaging down-sampling, and Lap is the discrete isotropic Laplace operator given by Eq. (6). The estimated light source is e^s = [e_r^s, e_g^s, e_b^s]^T:

\[ \mathrm{Lap} = \begin{bmatrix} 1 & 1 & 1 \\ 1 & -8 & 1 \\ 1 & 1 & 1 \end{bmatrix}. \quad (6) \]

Fig. 2. Flow chart of the proposed algorithm based on image block gradients: the input image undergoes s × s block-averaging down-sampling to a small image of size B_W × B_H, which is Laplace-filtered; per-channel averaging then yields ke_r^s, ke_g^s, ke_b^s and hence the estimate e^s.
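A minimal NumPy sketch of this pipeline is given below. The function name and the use of scipy.ndimage.convolve are our choices, not the authors' implementation, and we average the absolute value of the Laplacian response (a Minkowski norm with p = 1); Eq. (5) as printed does not show the absolute value, so treat that detail as our reading.

```python
import numpy as np
from scipy.ndimage import convolve

# Eq. (6): discrete isotropic Laplace operator (center -8).
LAP = np.array([[1.0,  1.0, 1.0],
                [1.0, -8.0, 1.0],
                [1.0,  1.0, 1.0]])

def grey_block_diff_estimate(img, s=40):
    """Illuminant estimate per Eq. (5): s x s block-averaging down-sampling,
    Laplace filtering of the small image, then per-channel averaging.
    `img` is a linear float HxWx3 array."""
    h, w, _ = img.shape
    bh, bw = h // s, w // s
    # Block-averaging down-sampling to a (bh, bw, 3) small image.
    small = img[:bh * s, :bw * s].reshape(bh, s, bw, s, 3).mean(axis=(1, 3))
    e = np.array([np.abs(convolve(small[..., c], LAP, mode='nearest')).mean()
                  for c in range(3)])
    return e / np.linalg.norm(e)  # k enforces ||e^s|| = 1
```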

The block-averaging operation serves several purposes. First, it smooths the image, which has been shown to be an important pre-processing step for improving the robustness of color constancy algorithms;7,12) the general Grey-World and Grey-Edge methods both employ Gaussian convolution to smooth the image.


Second, the image size becomes 1/(s × s) of the original after the operation, and the gradient is computed on the resulting small image, so the amount of subsequent calculation is greatly reduced.

To correct the color cast, our method employs the diagonal transform,14) that is, it corrects the image from the estimated light e = [e_r, e_g, e_b]^T to the standard light e^0 = [e_r^0, e_g^0, e_b^0]^T = [1/√3, 1/√3, 1/√3]^T. The output image f′(x) is then

\[ f'_c(x) = \frac{e_c^0}{e_c}\, f_c(x), \quad c \in \{r, g, b\}. \quad (7) \]
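Equation (7) is a per-channel gain, which the following sketch applies; the function names are our hypothetical helpers, and a production version would also clip the output to the valid range.

```python
import numpy as np

def diagonal_correct(img, e):
    """Eq. (7): diagonal (von Kries-type) transform from the estimated
    light e to the canonical white e0 = [1/sqrt(3)] * 3."""
    e0 = np.full(3, 1.0 / np.sqrt(3.0))
    return img * (e0 / e)  # per-channel gains broadcast over HxWx3
```

Chaining it with the estimator above gives the complete method, e.g. `out = diagonal_correct(img, grey_block_diff_estimate(img, s=40))`.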

3. Experiments on Standard Dataset

3.1 Experimental setup

The experimental dataset is the linear Color Checker dataset.15,16) The original images of the Color Checker dataset15) were processed with the parameters of the capturing digital camera, including demosaicing, white balance, gamma correction, and color correction. Given that color constancy must operate on linear images, Shi and Funt averaged the two G values of the Bayer pattern when generating the RGB images to obtain the linear Color Checker dataset.16) The dataset contains 568 images, 246 of which are indoor images and the rest outdoor images. Figure 3 shows some sample images from the dataset; the Macbeth color checker reflects the dominant light color in each image. We use the angular error between the estimated light and the reference color-checker light to measure the accuracy of an algorithm:

\[ \mathrm{angular}(e_g, e_e) = \cos^{-1} \left( \frac{e_g \cdot e_e}{\|e_g\|\, \|e_e\|} \right), \quad (8) \]

where e_g is the reference image light, e_e is the estimated light, · is the vector dot product, and ‖·‖ is the vector length. The mean, median, and tri-mean errors, as well as the best and worst 25%, are selected to evaluate the performance of the algorithms.3)
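Equation (8) is a one-liner in NumPy; the sketch below reports the error in degrees, matching the values in Table 1, and the clip against floating-point round-off is our addition.

```python
import numpy as np

def angular_error(e_g, e_e):
    """Eq. (8): angular error (degrees) between the ground-truth
    illuminant e_g and the estimate e_e."""
    c = np.dot(e_g, e_e) / (np.linalg.norm(e_g) * np.linalg.norm(e_e))
    return np.degrees(np.arccos(np.clip(c, -1.0, 1.0)))
```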

Fig. 3. (Color online) Example images from the Color Checker dataset (top) and its linear version (bottom).


Fig. 4. (Color online) Experimental results for different parameter values: angular error (mean, median, and tri-mean) versus the number of blocks. The proposed algorithm obtains robust results over a wide parameter range (number of blocks between 0.1 K and 1 K).

3.2 Parameter setting

Our algorithm has only one parameter: the image block size s × s. We consider s = 10, 20, 40, 60, 80, 100, 120, 140, 160, 180, 200 in the experiment. The images of the Color Checker dataset contain approximately 500 K pixels each, so the number of image blocks is approximately 5 K when s = 10 and only about 13 when s = 200. We evaluate the results against the number of blocks (rather than s) to make the experimental results more meaningful for images of different sizes. Figure 4 plots the angular error against the number of blocks. The best performance is obtained when the number of blocks is approximately 0.3 K, and the results for 0.1 K to 1 K blocks are close to the best one; thus, the algorithm is not very sensitive to the parameter selection.
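For images of arbitrary size, a block size near the robust optimum can be derived from the target block count; the helper below is hypothetical (not from the paper) and simply inverts the pixel-count relationship.

```python
def block_size_for(height, width, target_blocks=300):
    """Choose s so the number of s x s blocks is near the ~0.3 K
    optimum reported in Fig. 4. Hypothetical helper, not from the paper."""
    return max(1, round((height * width / target_blocks) ** 0.5))

# e.g., a 500 K-pixel image gives s ~= 41, close to the s = 40 used below.
```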

Fig. 5. (Color online) Example results of the different methods: (a) input image, (b) Grey-World (6.36°), (c) 2nd-order Grey-Edge (4.60°), (d) Gamut Mapping (1.07°), (e) Natural Image Statistics (1.47°), and (f) Proposed Method (1.29°). The color constancy methods are applied to the linear images; all images (original and results) are transformed into sRGB space for better visualization. The value reported with each image is the angular error.

3.3 Accuracy comparison

Figure 5 shows example results obtained by several methods, and Table 1 compares our algorithm with the others. Our result uses the parameter s = 40, i.e., approximately 0.3 K blocks; the results of the other algorithms are taken from the study of Gijsenij et al.,3) with all results acquired under their best parameters. Table 1 indicates that the accuracy of our algorithm is higher than that of max-RGB, Grey-World, and second-order Grey-Edge: the accuracy improves by at least 34% with the tri-mean as the evaluation measure. Compared with the two relatively complicated algorithms (i.e., gamut mapping and natural image statistics), our algorithm is less accurate on some indexes and more accurate on others.

Table 1. Comparison of algorithms (angular error, degrees).

Method                                Mean   Median  Tri-mean  Best 25%  Worst 25%
Max-RGB (e^{0,∞,0})                   7.5    5.7     6.4       1.5       16.2
Grey-World (e^{0,1,0})                6.4    6.3     6.3       2.3       10.6
2nd-order Grey-Edge (e^{2,p,σ})       5.1    4.4     4.6       1.9       10.0
Gamut mapping8)                       4.1    2.5     3.0       0.6       10.3
Natural image statistics10)           4.2    3.1     3.5       1.0       9.2
Proposed method                       3.63   2.82    3.04      1.11      7.43

Our algorithm performs especially well on the worst-25% index: its error is about 20% lower than that of the best competing method (natural image statistics), which demonstrates that our algorithm performs best in "difficult" scenes and that its robustness is exceptional.

3.4 Complexity analysis

Let the image size (total number of pixels) be N. The time complexities of the Grey-World and max-RGB algorithms are both linear, i.e., O(N). Our algorithm comprises two steps: the block-averaging step, with time complexity O(N), and the gradient calculation, with time complexity O(N/s²); the time complexity of the entire algorithm is therefore O(N(s² + 1)/s²). We conclude that our algorithm approximates the time complexity of the two simplest algorithms, Grey-World and max-RGB. The Grey-Edge algorithm is burdened by image convolution: even if the convolution is separated into vertical and horizontal passes, the time complexity is still O(kN) for a k × k convolution kernel. Other algorithms, such as gamut mapping and learning-based statistical algorithms, are often more complicated still; they require learning and training and are difficult to implement, so this study does not analyze their complexity.

4. Experiments on FPGA-Based Real-Time Imaging System

We built an image acquisition system based on an FPGA hardware platform to further test the practicality of the proposed algorithm. The proposed algorithm is applied in the white balance module of the image signal processing pipeline, and the accuracy, resource consumption, and real-time performance of the algorithm are evaluated.

4.1 Implementation details

Figure 6 shows the experimental device of the imaging system. The system comprises a lens, a high-resolution CMOS image sensor, an FPGA image signal processor, and other components. The sensor captures high-resolution RAW images (Bayer format), which are transferred to the FPGA image signal processor through a parallel interface. The image signal processor processes each RAW image and converts it into a final RGB image; it also configures the sensor via the inter-integrated circuit (I2C) interface. The final RGB images are sent to the display terminal through an HDMI interface. The image signal processor is the core of the system.

Fig. 6. (Color online) Experimental imaging system.

Fig. 7. (Color online) Image signal processing flow chart: Bayer-domain processing (defective pixel correction, auto exposure/auto white balance, demosaicing) followed by RGB-domain processing (denoising, color correction, gamma correction) and HDMI output; the CMOS sensor is configured via I2C control.

Its process flow includes defective pixel correction, auto white balance, automatic exposure control, RAW image demosaicing, color correction, image denoising, gamma correction, and other modules. Figure 7 shows our image signal processing (ISP) flow chart.

White balance is a key step in the ISP, and we build the automatic white balance module of the system on the proposed color constancy algorithm. Demosaicing, gamma correction, color correction, and other processing modules alter the original characteristics of the Bayer image, such as its linearity; to avoid this problem, white balance is placed in the Bayer-domain processing stage of the ISP. The Bayer pattern is one of the most commonly used CFA patterns; the CFA is placed over the sensor to capture a color image. Given that only one color sample is obtained at each pixel, adjacent CFA pixels are designed to collect different colors, and a full-color RGB image is then obtained by interpolating the missing channels from neighboring samples, a process known as demosaicing.17)


Table 2. Hardware resource utilization.

Modules            FFs        LUTs        DSP48s  BRAM36/18  Clock freq (MHz)
DPC                2010       2230        1       2/2        255
Demosaicing        4188       3823        8       5/2        287.2
Color correction   99         45          9       0/0        362
Gamma correction   93         165         0       0/3        262
White balance      524 (8%)   839 (12%)   0       0/0        333.3


Fig. 8. (Color online) Sample experimental results: (a) light box, illuminant F (2700 K); (b) light box, illuminant CWF (4150 K); (c) light box, illuminant D65 (6500 K); (d) indoor; (e) outdoor; and (f) an extreme scene containing a large uniformly colored surface. Each scene shows two images: the original (top) and the final image after white balance (bottom).

The Bayer image has one R-channel pixel, one B-channel pixel, and two G-channel pixels in each 2 × 2 block, so the proposed color constancy algorithm can be applied directly to the Bayer image by utilizing the corresponding channel sub-images.
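Since each 2 × 2 Bayer cell already provides per-channel samples, the estimator can run on a half-resolution pseudo-RGB image built directly from the mosaic. The sketch below assumes an RGGB layout with even image dimensions; the paper does not state the sensor's actual CFA ordering, so treat the layout as an assumption. Averaging the two G samples follows the same convention used for the linear dataset.16)

```python
import numpy as np

def bayer_to_small_rgb(raw):
    """Build a half-resolution RGB image from a Bayer mosaic so the
    proposed estimator can run before demosaicing. Assumes RGGB layout
    and even dimensions (hypothetical)."""
    r  = raw[0::2, 0::2].astype(float)
    g1 = raw[0::2, 1::2].astype(float)
    g2 = raw[1::2, 0::2].astype(float)
    b  = raw[1::2, 1::2].astype(float)
    g = (g1 + g2) / 2.0  # average the two G samples per 2x2 block
    return np.stack([r, g, b], axis=-1)
```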

4.2 Hardware resource utilization

This subsection presents the hardware resource utilization of our algorithm. Table 2 provides the synthesis results, including the numbers of flip-flops (FFs), look-up tables (LUTs), DSP48s, and block RAMs (BRAMs), and the maximum clock frequency (Clock freq) on the FPGA; the results of the other ISP modules are provided for comparison. The results are reported for a Xilinx Virtex-5 XC5VLX50T FPGA. As shown in Table 2, the white balance module consumes few resources: it requires no multiplier (DSP48) or internal memory (block RAM) and consumes only 524 flip-flops (8% of the total consumption) and 839 LUTs (12% of the total consumption).

4.3 Results

Figure 8 shows sample results for six different scenes. The proposed white balance method accurately corrects the color cast in the different scenes.

5. Conclusions

This paper proposes a color constancy algorithm that is simple, fast, and robust. The algorithm is based on the assumption that the average of the reflectance differences of adjacent blocks in a scene is achromatic. It has only one parameter and is not very sensitive


to parameter selection. The experiments on the standard dataset show that the performance of the algorithm is exceptional, and the experiments on the FPGA-based real-time imaging system show that the algorithm is simple and practical.

Acknowledgment

This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant Nos. 61175006, 61175015, 61271438, and 61275016.

References

1) T. Gevers and A. W. M. Smeulders: Pattern Recognition 32 (1999) 453.
2) T. Gevers and A. W. M. Smeulders: IEEE Trans. Image Process. 9 (2000) 102.
3) A. Gijsenij, T. Gevers, and J. van de Weijer: IEEE Trans. Image Process. 20 (2011) 2475.
4) G. Buchsbaum: J. Franklin Inst. 310 (1980) 1.
5) E. H. Land: Sci. Am. 237 (1977) 108.
6) G. D. Finlayson and E. Trezzi: Proc. IS&T/SID Color Imaging Conf., 2004, p. 37.


7) J. van de Weijer, T. Gevers, and A. Gijsenij: IEEE Trans. Image Process. 16 (2007) 2207.
8) D. A. Forsyth: Int. J. Comput. Vis. 5 (1990) 5.
9) V. C. Cardei, B. Funt, and K. Barnard: J. Opt. Soc. Am. A 19 (2002) 2374.
10) A. Gijsenij and T. Gevers: IEEE Trans. Pattern Anal. 33 (2011) 687.
11) A. Gijsenij, T. Gevers, and J. van de Weijer: Proc. 26th IEEE Conf. Computer Vision and Pattern Recognition, 2009, p. 581.
12) K. Barnard, L. Martin, A. Coath, and B. Funt: IEEE Trans. Image Process. 11 (2002) 985.
13) B. Li, D. Xu, W. Xiong, and S. Feng: Color Res. Appl. 35 (2010) 304.
14) G. D. Finlayson, M. S. Drew, and B. V. Funt: Proc. 4th IEEE Int. Conf. Computer Vision, 1993, p. 164.
15) P. V. Gehler, C. Rother, A. Blake, T. Minka, and T. Sharp: Proc. 26th IEEE Conf. Computer Vision and Pattern Recognition, 2008, p. 1.
16) L. Shi and B. Funt: Re-processed version of the Gehler color constancy database of 568 images [http://www.cs.sfu.ca/~colour/data/].
17) R. Ramanath, W. E. Snyder, G. L. Bilbro, and W. A. Sander III: J. Electron. Imaging 11 (2002) 306.
