IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, VOL. 60, NO. 5, MAY 2011
Histogram Specification: A Fast and Flexible Method to Process Digital Images

Gabriel Thomas, Member, IEEE, Daniel Flores-Tapia, and Stephen Pistorius
Abstract—Histogram specification has been successfully used in digital image processing over the years. Mainly used as an image enhancement technique, methods such as histogram equalization (HE) can yield good contrast with almost no effort in terms of inputs to the algorithm or the computational time required. More elaborate histograms can take on problems faced by the HE at the expense of having to define the final histograms in innovative ways that may require some extra processing time but are nevertheless fast enough to be considered for real-time applications. This paper proposes a new technique for specifying a histogram to enhance the image contrast. To further demonstrate our confidence in histogram specification techniques, we also discuss methods to modify images, e.g., to help segmentation approaches. Thus, as advocates of these techniques, we would like to emphasize the flexibility of this image processing approach to do more than enhance images.

Index Terms—Contrast enhancement, histogram equalization (HE), histogram specification (HS), maximum entropy, segmentation.
I. INTRODUCTION
WHAT CONSTITUTES good contrast? That is a good question since image quality is very subjective and, as the phrase says, beauty is in the eye of the beholder. It depends on the user, and we would like to add here that it also depends on the scene itself. Back in the days of darkroom developing—we assume those days are almost over for the photography enthusiast—developing black-and-white pictures required a test strip, which was made by exposing the photographic paper to different exposure times, as shown in Fig. 1. The winning exposure time was the one that offered the best contrast, which usually meant the one that yielded an image with very dark blacks and very light whites, and everything in between. This last statement would suggest that histogram
equalization (HE) must be one of the most effective techniques. If only life were so simple, then yes, the HE would be the best algorithm, but the very fact that there is still a lot of research in this area suggests that there is more to be done. The HE dates back to the 1970s, and a patent was issued as early as 1976.

Now, good contrast also depends on the scene. If you are taking pictures in a zoo and a zebra is your subject, then yes, you want those blacks and whites, and all sorts of gray-level values that can appear in the background. However, if you happen to photograph a polar bear in winter, you need mostly whites. What is good contrast then? We will go for a simple answer here: a method that offers you more gray-level values, or more saturation values for the different color tones in the image, but does not degrade the image in a considerable way.

Since we first mentioned the HE, this paper starts in Section II with a brief introduction to this technique. Section III discusses ways to specify a histogram so that some problems faced by the HE can be solved by these new techniques, including the one originally proposed by the authors in [1], which is further discussed in this paper. Section IV deals with the idea that histogram specification (HS) can also be used as a preprocessing technique that can improve image segmentation [2]. Thus, we aim to demonstrate that the HS is a fast and flexible technique that can serve more than one purpose.

Manuscript received June 20, 2010; revised August 24, 2010; accepted August 25, 2010. Date of current version April 6, 2011. The Associate Editor coordinating the review process for this paper was Dr. Emil Petriu. G. Thomas is with the Department of Electrical and Computer Engineering, University of Manitoba, Winnipeg, MB R3T 5V6, Canada (e-mail: [email protected]). D. Flores-Tapia is with the Department of Medical Physics, CancerCare Manitoba, Winnipeg, MB R3E 0V9, Canada (e-mail: Daniel.Flores@cancercare.mb.ca). S. Pistorius is with the Department of Medical Physics, CancerCare Manitoba, Winnipeg, MB R3E 0V9, Canada, and also with the Faculty of Medicine and the Department of Physics and Astronomy, University of Manitoba, Winnipeg, MB R3T 5V6, Canada (e-mail: Stephen.Pistorius@cancercare.mb.ca). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TIM.2010.2089110

II. ENHANCEMENT BY HISTOGRAM MODIFICATION

A. HE

Modifying an image such that its histogram has a uniform distribution usually yields a better contrast. The technique is known as the HE, and the transformation T(r) needed to obtain this equalization can be formulated as

s = T(r) = ∫_0^r p_r(w) dw    (1)
where r is the intensity value of the original pixel, s is the pixel value of the transformed image, and p_r(r) is the probability density function (PDF) associated with the original image. It is assumed that the image is the outcome of a continuous random variable and that the histogram resembles its PDF. In this paper, PDF refers to a histogram that has been normalized so that its area is equal to 1. The HE is then calculated by using the cumulative distribution function (CDF) of the original image as the transformation function, as expressed in (1). It can be easily shown that the PDF of the transformed image
Fig. 1. Test strip examples. Different exposure times can be seen as stripes offering different contrasts. Darkroom-developing techniques look for high contrasts on final pictures by selecting the right exposure time.
Fig. 2. (a) Original image. (b) Results obtained using the HE. (c) Results obtained using darkroom-developing techniques.
is indeed uniformly distributed [3]. In its discrete form, (1) becomes

s_k = T(r_k) = Σ_{j=0}^{k} p_r(r_j),   for k = 0, 1, ..., L − 1    (2)
for an image with L gray-level values. The HE can yield bad results for images that contain noise and/or include a constant background. Because we are concentrating on enhancing images for general photography, the noise case is not of concern here; although the use of telephoto lenses of, for example, 500 mm can introduce haze noise, we are assuming that inexpensive cameras would have zoom capabilities of up to 75 mm, which does not introduce this type of noise.

Nevertheless, the HE can yield good results. Fig. 2(a) shows an original image that has been modified by the HE, as can be seen in Fig. 2(b). Note how brightening the dark shadows on the woman's face lets us see more details in the HE result. Fig. 2(c) shows the same image but developed using a darkroom technique called dodging, which consists of blocking the light coming from the enlarger, as illustrated in Fig. 3. Fig. 4 shows the histograms of the three images shown in Fig. 2. Because the HE was formulated assuming a continuous PDF, the final histogram is not necessarily uniform, and its histogram shows peaks and gaps [4]. Notice also in Fig. 4 how the picture enhanced using the dodging technique in the darkroom has a histogram that is not uniform but still yields fairly good results.
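The discrete form in (2) amounts to building a lookup table from the cumulative histogram. The following is a minimal sketch, assuming 8-bit gray levels; the function name and the synthetic test image are our own choices, not from the paper:

```python
import numpy as np

def equalize(img, L=256):
    """Histogram equalization: build the discrete CDF of (2) and use it
    as a lookup table mapping each input level r_k to s_k in [0, L-1]."""
    hist = np.bincount(img.ravel(), minlength=L)     # histogram of gray levels
    p = hist / hist.sum()                            # normalized PDF p_r(r_k)
    cdf = np.cumsum(p)                               # s_k = sum_{j<=k} p_r(r_j)
    lut = np.round(cdf * (L - 1)).astype(img.dtype)  # rescale to gray levels
    return lut[img]

# A dark, low-contrast synthetic image: levels clustered around 40.
rng = np.random.default_rng(0)
img = np.clip(rng.normal(40, 10, (64, 64)), 0, 255).astype(np.uint8)
out = equalize(img)
```

With the levels clustered near 40, the CDF rises steeply there, so the mapped image spreads across the full range, which is the stretching effect seen in Fig. 2(b).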
Fig. 3. Contrast enhancement based on blocking the light during the developing process.
As a matter of fact, the original and specified histograms are quite similar. Let us look at the histograms shown in Fig. 5, where the gray-level spreading and the general shape of the histograms are even more similar than the ones seen in Fig. 4. Fig. 6 shows the images of the aforementioned histograms. Fig. 6(a) is the original image that was taken on film at a full f-stop
Equating (1) and (3) can be used to form the transformation function that yields the specified histogram, i.e., z = G⁻¹(s) = G⁻¹[T(r)]. For digital normalized images with L gray-level values, the straightforward implementation of the HS is based on the formulation of

s_k = T(r_k) = Σ_{j=0}^{k} f_R(r_j),   for k = 0, 1/(L−1), 2/(L−1), ..., 1    (4)

s_k = G(z_k) = Σ_{i=0}^{k} f_Z(z_i),   for k = 0, 1/(L−1), 2/(L−1), ..., 1.    (5)

Fig. 4. Histograms of pictures shown in Fig. 2.
Fig. 7 shows the steps mentioned above for two CDFs corresponding to an original image and a specified histogram. Going back to the histograms shown in Figs. 4 and 5, one can see that the HS can potentially enhance images if only we know a priori the specified histogram that results in a better image. That is a big if, particularly if the algorithm proposed here is to be fully automatic. It is one thing to have a couple of parameters to adjust for an image enhancement technique to be considered in, for example, general photography; to specify L − 1 values for a final histogram is quite discouraging. Is there a way to specify these histograms automatically then? The succeeding sections discuss this possibility.
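The mapping z = G⁻¹(T(r)) implied by (4) and (5) reduces to two cumulative sums and a search. A sketch follows; the function name and the uniform demonstration target are our choices:

```python
import numpy as np

def specify(img, target_pdf, L=256):
    """Histogram specification per (4)-(5): compute T from the image,
    G from the desired PDF, and map each level r_k to the smallest z
    with G(z) >= T(r_k), i.e., a discrete G^{-1}(T(r))."""
    p = np.bincount(img.ravel(), minlength=L) / img.size
    T = np.cumsum(p)                       # CDF of the original image, (4)
    G = np.cumsum(target_pdf)              # CDF of the specified histogram, (5)
    lut = np.searchsorted(G, T).clip(0, L - 1).astype(img.dtype)
    return lut[img]

# Levels crowded into [0, 63]; a uniform target reproduces the HE as a special case.
rng = np.random.default_rng(1)
img = rng.integers(0, 64, size=(32, 32)).astype(np.uint8)
out = specify(img, np.ones(256) / 256)
```

Any valid target PDF can be passed in, e.g., a smoothed copy of the original histogram as used for Fig. 6(b).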
Fig. 5. Original histogram of Fig. 6(a) and specified histogram used to obtain Fig. 6(b).
III. HS TECHNIQUES FOR CONTRAST ENHANCEMENT

A. Brightness-Preserving HE
faster than required, that is, the shutter speed was set faster. During developing, it was underdeveloped one full step to try to compensate for the wrong setting. Furthermore, the image was scanned and saved as a Joint Photographic Experts Group (JPEG) file. All these steps caused the artifacts that can be seen on closer inspection and that will be amplified if the image is printed at a larger size. A zoom on the image shows these effects, too. Fig. 5 shows the histogram of the original image and a histogram obtained by simply smoothing the original histogram using low-pass filtering. This histogram was then used as the specified histogram, followed by median filtering with a 3 × 3 mask to remove any spurious pixels. The final image is shown in Fig. 6(b). These two examples suggest that a uniform final histogram is not necessarily the best choice. This brings us to the next subsection, which discusses a way to modify contrast according to a specified histogram.

B. Histogram Specification (HS)

The HS yields an image with a PDF that follows a specified shape f_Z(z) for z ∈ [0, 1]. If the HE is applied to this final image, the outcome is an image that also has a uniform PDF, i.e.,

G(z) = ∫_0^z f_Z(w) dw = s.    (3)
It is well known that, if the histogram of an image shows a strong peak because the image is dominated by a large area of a single gray-level value, this can cause problems when using the HE [3]. Fig. 8 shows an example where the HE produces bad results. Regardless of the shape of the histogram of the original image, the HE will yield an image with a final brightness level that is close to 0.5. For the example in Fig. 8, the mean of the histogram-equalized image is μ_HE = 0.4991. "There is nothing wrong with that," you may say, "since the overenhancement seems to be caused by the spreading, not the average brightness." That is true, but not entirely. Recall the scenario of a picture of a polar bear in a winter tundra scene. The image should be very white, and if the HE is applied, the polar bear may look more like a brown bear in the end. Something similar has happened in Fig. 8. The original mean is μ_o = 0.2854, and it can be seen that the HE produced a brighter result, way too bright in some areas, particularly the face of the woman sitting in the front.

Wang and Ye [5] proposed a technique that yields an image with a brightness level similar to that of the original. The idea is to find a specified histogram f_Z(z) whose mean, or average brightness level, is equal to the original one, subject to the constraint that the entropy is maximum. They called this technique brightness-preserving HE with maximum entropy (BPHEME), and since the uniform distribution has maximum entropy, this condition offers excellent contrast as well.
Fig. 6. (a) Original image. (b) Result obtained using the HS by simply smoothing the original histogram.
Mathematically, the method is expressed as

max_f {−∫_0^1 f_Z(z) ln f_Z(z) dz},   s.t.  f_Z(z) ≥ 0,  ∫_0^1 f_Z(z) dz = 1,  ∫_0^1 z f_Z(z) dz = μ_o    (6)

for z ∈ [0, 1] and μ_o = ∫_0^1 r f_R(r) dr. A functional can be formed as

J(f_Z(z)) = −∫_0^1 f_Z(z) ln f_Z(z) dz + λ_1 [∫_0^1 f_Z(z) dz − 1] + λ_2 [∫_0^1 z f_Z(z) dz − μ_o]    (7)

where λ_1 and λ_2 are Lagrange multipliers associated with the constraints in (6). This is solved using the calculus of variations, i.e., ∂J/∂f_Z(z) = −ln f_Z(z) − 1 + λ_1 + λ_2 z = 0, the solution of which is given by f_Z(z) = e^{λ_1 − 1} e^{λ_2 z} for z ∈ [0, 1]. Using the constraints,

f_Z(z) = 1,   if μ_o = 0.5
f_Z(z) = λ_2 e^{λ_2 z} / (e^{λ_2} − 1),   if μ_o ∈ (0, 0.5) ∪ (0.5, 1)

once again for z ∈ [0, 1]. The second Lagrange multiplier can be found from μ_o = ∫_0^1 z f_Z(z) dz, yielding μ_o = (λ_2 e^{λ_2} − e^{λ_2} + 1)/(λ_2 (e^{λ_2} − 1)), which can be solved using a lookup table as suggested in [5] or by an iterative algorithm that yields a better solution for the Lagrange multipliers, such as the one presented in [6] and used in this paper.

Fig. 9 shows the result obtained with this method. Note how keeping similar brightness results in an image that is as dark as the original but with enhanced contrast. However, the results in the dark region on the right are not as good as those with the HE.

Fig. 7. CDF of the original and specified histograms. The necessary steps to accomplish the HS are shown with the arrows.

B. Contrast Enhancement by Piecewise Linear Transformation

Recently, Tsai and Yeh [7] developed a contrast enhancement technique based on a piecewise linear transformation (PLT) function T(r_i) described as

T_{k−1}(r_i) = [(s_k − s_{k−1}) / (v_k − v_{k−1})] (r_i − v_{k−1}) + s_{k−1},   for k = 1, 2, ..., V    (8)
where V is the total number of segments, equal to the total number of valleys minus 1, found in a smooth version of the original histogram; the v_k values are the valley locations of the different modes found in the histogram, with r_i ∈ [v_{k−1}, v_k]; and the s_k values are computed as

s_k = Σ_{j=0}^{v_k} f_R(r_j),   for k = 0, 1/(L−1), 2/(L−1), ..., v_k.    (9)
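Taken together, (8) and (9) can be sketched as follows: smooth the histogram, take its local minima as the valleys v_k, and map linearly between consecutive valleys. Here σ_G is passed in directly rather than estimated from the valley-distance histogram described next in the text; all names are ours:

```python
import numpy as np

def plt_enhance(img, sigma, L=256):
    """Piecewise linear transform: valleys of the smoothed histogram become
    breakpoints v_k; each v_k is sent to s_k, the cumulative histogram value
    at v_k scaled to [0, L-1]; levels in between are interpolated linearly,
    which is the mapping of (8)."""
    p = np.bincount(img.ravel(), minlength=L) / img.size
    x = np.arange(-3 * int(sigma), 3 * int(sigma) + 1)
    g = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    ps = np.convolve(p, g / g.sum(), mode="same")    # Gaussian low-pass filter
    valleys = [k for k in range(1, L - 1)
               if ps[k] < ps[k - 1] and ps[k] <= ps[k + 1]]
    v = np.array([0] + valleys + [L - 1], dtype=float)
    s = np.cumsum(p)[v.astype(int)] * (L - 1)        # target levels from (9)
    s[0], s[-1] = 0.0, float(L - 1)                  # anchor black and white
    lut = np.interp(np.arange(L), v, s)              # piecewise linear map
    return lut[img].astype(img.dtype)

# Bimodal synthetic image: two populations around levels 60 and 180.
rng = np.random.default_rng(2)
levels = np.concatenate([rng.normal(60, 5, 2000), rng.normal(180, 5, 2000)])
img = np.clip(levels, 0, 255).astype(np.uint8).reshape(80, 50)
out = plt_enhance(img, sigma=5)
```

Because the knots (v_k, s_k) are non-decreasing, the resulting lookup table is monotone, so each mode is stretched without reordering gray levels.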
In order to find the valleys, the method first filters the original histogram with a low-pass Gaussian filter whose standard deviation σ_G is set to the most frequent distance between valleys in the original histogram. To compute σ_G, the maximum distance between valleys W_max is used to generate a histogram of distances h_d divided into ten equal segments between 0 and W_max, and σ_G is chosen as the most frequent distance, i.e., the one corresponding to the maximum value in histogram h_d.

Fig. 10 shows the image obtained using this method. Note how the dark areas have more contrast and how the bright areas were not emphasized as much as with the BPHEME. Note how the modes are equalized, yielding a mean value for each mode that is approximately at the center of each distance between valleys. The brightest mode (the last one) in the specified histogram is located between 0.92 and 1, and its mean is in between these values. However, in the original histogram, that mode is shifted toward 0.9961. This explains why there are not as many bright
Fig. 8. (a) Original image. (b) Histogram equalization.
Fig. 9. Image modified by the BPHEME, where μ = 0.28509. Original and specified histograms.

Fig. 11. Image obtained using the proposed method and original and specified histograms.
Fig. 10. Image and histograms obtained using the PLT method.
areas in Fig. 10 as in Fig. 9, particularly on the face of the woman in the middle.

C. Piecewise Maximum Entropy Histogram

As discussed in Section III-B, the separation of the different modes and the final transformation suggested in [7] yield good results in a very simple and fast way. The same can be said about the BPHEME. Thus, both methods were presented as excellent options for image enhancement in consumer
electronics such as digital photography. With this in mind, our method relies on the application of the concepts used in both approaches, but in a way that overcomes some of the difficulties faced with certain types of images. It is also implemented in a simple way and executes fast.

The new method, which we named piecewise maximum entropy (PME), relies on the idea that the separation of the modes as in the PLT approach offers very good results but can encounter problems by shifting the modes' means too far from the original means. Therefore, instead of using a linear transformation, we propose to form a piecewise transformation function whose segments preserve the original means of the modes while maximizing their entropy, as suggested in the BPHEME. For the discrete case, the normalized distribution of a discrete random variable with outcomes defined in [0, v], where v ≤ L − 1, that offers maximum entropy given a mean value μ is given by

f_Z(z_i) = C t^i,   for i = 0, 1/(L−1), ..., v/(L−1)    (10)
where C and t can be found by using the two constraints, i.e.,

Σ_{i=0}^{v/(L−1)} f_Z(z_i) = 1   and   Σ_{i=0}^{v/(L−1)} i f_Z(z_i) = μ.    (11)
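Solving (11) for C and t has no closed form, but the mean of C t^i is monotone in t, so a bisection works. A sketch, done in log space to avoid overflow; the function name and brackets are our choices:

```python
import numpy as np

def max_entropy_pmf(mu, v, L=256):
    """Maximum-entropy PMF f(z_j) = C t^j on the support z_j = j/(L-1),
    j = 0..v, with mean mu, per (10)-(11). t is found by bisection on its
    logarithm; C is absorbed by the final normalization."""
    z = np.arange(v + 1) / (L - 1)
    def pmf(log_t):
        logw = np.arange(v + 1) * log_t
        w = np.exp(logw - logw.max())      # numerically stable for extreme t
        return w / w.sum()
    lo, hi = -0.5, 0.5                     # brackets on log t, widened as needed
    while (z * pmf(lo)).sum() > mu:
        lo *= 2
    while (z * pmf(hi)).sum() < mu:
        hi *= 2
    for _ in range(100):                   # mean is monotone increasing in t
        mid = 0.5 * (lo + hi)
        if (z * pmf(mid)).sum() < mu:
            lo = mid
        else:
            hi = mid
    return pmf(0.5 * (lo + hi))

f = max_entropy_pmf(mu=0.3, v=255)
```

At log t = 0 the PMF is uniform; μ below (above) the uniform mean pulls t below (above) 1, giving the exponential shape of (10).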
Fig. 12. (a) Original PME image. (b) Modified PME image. (c) Original and specified histogram used in (b).
Let the original histogram f_R(r_k) for k = 0, 1/(L−1), 2/(L−1), ..., 1 be a mixture distribution that, for simplicity, will be assumed to have two modes A and B distributed as f_RA(r_k) for k = 0, Δv, 2Δv, ..., v_1, and f_RB(r_k) for k = v_1 + Δv, v_1 + 2Δv, ..., 1, where Δv = 1/(L−1) and v_1 is the valley found between the two modes. Moreover,

f_R(r_k) = p f_RA(r_k) + (1 − p) f_RB(r_k)    (12)
where p is the proportion of pixels in mode A with respect to the total number of pixels in the image and (1 − p) is likewise for B. If f_RA(r_k) and f_RB(r_k) are forced to be distributed as (10) with means equal to the original means, guaranteeing maximum entropy for the two modes, f_R(r_k) also has maximum entropy and can be used as the specified histogram.

Fig. 11 shows the results obtained with the proposed method. Note how the dark region on the right preserved the mean brightness while achieving a good contrast and how the whites were forced to be present in the final specification but without creating the same effect on the front woman's face. The histogram shows the final specification, and the circles correspond to the means of the original and specified segments, indicating that the average brightness of the modes remained very similar. The dotted vertical lines indicate the locations of the valleys, which correspond to the same locations used in the PLT. Note how the support of the first mode starts at 0 and the support of the last mode ends at 1 so that the blacks and the whites are considered in the final histogram. This helps eliminate the slight darkening found in the final images reported in [7]. Note also how the PME does not seem to have excessively brightened the face of the woman in the middle at the expense of reducing the contrast of the face of the person at the back.

The additional benefits of this method depend on the shape of the mode and the location of its mean. For example, note how, in Fig. 11, the mean of the mode located between 0 and 0.4 is almost at the middle of the two valleys. For this mode, the PME yields a result for that region that is actually similar to the PLT. Furthermore, if only one mode exists in the image, the PME will yield very similar results to the BPHEME.

Further contrast improvements can be achieved if the valleys between modes whose areas are much smaller than the rest are eliminated, choosing a single valley in between. Additionally, the number of valleys between two consecutive peaks that are close together should be reduced by eliminating the valley in the middle. There is no point in stretching these small modes, and these changes lead to the definition of a histogram that stretches the contrast even more. In Fig. 12, peaks that are 5 gray-level values apart or less are considered as one, and the valleys of modes whose total area is less than 0.5% were also eliminated, as indicated before. Note, for example, how this new histogram improves the contrast of the objects located in the shadows of the top shelf.

The color implementation is based on converting the color images that are originally in the red–green–blue color model to the hue–saturation–intensity (HSI) model and applying the
Fig. 13. Note how the BPHEME darkens the image at the left slightly more than expected and introduces small bright artifacts on the man’s shirt. At the right, every single method, including the HE, performs reasonably well, but the PME seems to favorably smooth the brightness levels on the sun and does not accentuate the lens glare shown at the right bottom. The PME also shows nicer smoothing for the shadows on the image on the left when compared with the PLT.
techniques mentioned in this paper to the intensity component I to form a new HSI image. Examples of contrast enhancement with color images can be found in Fig. 13.

Furthermore, three different illumination scenarios of the same scene are presented in Fig. 14: 1) bad contrast due to a dark image; 2) good contrast from the original; and 3) bad contrast because of a bright image. Cases 1 and 3 are particularly challenging. As expected, all the methods performed well when not much contrast enhancement is needed; this can be seen in the images located in the center column. Similar results
Fig. 14. Color image examples for pictures taken with dark, normal, and bright illuminations presented in the first, center, and last columns, respectively. The images at the first row are the original ones.
as obtained in the previous examples can be seen in the rest of the images.

IV. IMAGE ENHANCEMENT FOR SEGMENTATION

To the best of our knowledge, there is no single automatic segmentation algorithm that can deliver accurate results on different types of images such as magnetic resonance, thermal, synthetic aperture radar, etc. Even in cases involving one type of imagery where, for example, the illumination can vary drastically, frustration during the development of such an algorithm can build up rapidly.

Thinking in general terms for the sake of robustness, let us consider what type of histogram can be easily segmented regardless of the segmentation approach used. First, the histogram has to consist of modes (values that occur with the highest frequency in a distribution) with very well defined valleys. Second, the mixture probability distribution that can be approximated from the histogram should ideally be smooth; this facilitates the detection of peaks and valleys. Thinking
Fig. 15. Example of the HS using a manually designed specified PDF.
Fig. 16. Example used to evaluate the effect of reducing the entropy. (a) Original image with only two gray levels. (b) Gold standard obtained by thresholding (a). (c) Original image with added Gaussian noise. (d) Histogram of (c) [(a) No noise; (b) Po = 0.21673 Pb = 0.78327; (c) noisy; (d) noisy].
in this broad sense, one can infer that modifying an image in such a way that the final histogram has the characteristics mentioned above would help the segmentation stage. Fig. 15 shows an image before and after the HS. Note how the transformed image is a better candidate for segmentation. The regions within the background and within the object are smoother in the modified image. Quantitatively, the variances of the pixels within the object and the background are 1910.9 and 1478.2, respectively, in the original image and 1750.1 and 1227.9 in the modified image. This last example illustrates the potential of using the HS for segmentation purposes; what remains is to find ways to specify these histograms in at least a semiautomatic way, as was done in Section III.

A. Semiautomatic Specification

Let us define the original PDF and the specified PDF as p_O(z) and p_S(z), respectively. We would like to have p_S(z) resemble
Fig. 17. Original and two specified histograms. Entropy values are 1.7636, 1.6723, and 0.7846, respectively.
Fig. 18. Percentage of misclassified pixels versus entropy ratio. (Horizontal line) Error of the original Otsu segmentation.
a smooth version of p_O(z). This can be accomplished by low-pass filtering the histogram in the frequency domain or by simply averaging the samples in p_O(z). The amount of averaging will determine the number of modes (discernible peaks) in the specified PDF; too much smoothing will eliminate modes that are very close together. This, in fact, may be a desirable feature since, as suggested in [8], close histogram peaks may be formed by the same object and should be considered as one. Thus, the first step can be defined as p_Of(z) = p_O(z) ∗ w(z), where ∗ denotes convolution and w(z) is a low-pass filter. For example, w(z) can be defined as rect(z) = 1 for 0 < z < dist, where dist is twice the distance between peaks in the original histogram in case concatenation of these peaks is desired. Once again, the smoothing can be done in different ways, such as in the frequency domain, but keep in mind that the support of the filter determines the elimination of closely spaced modes in the histogram. Smoothing a histogram is not a new idea. For example, it has been done before for
Fig. 19. (a) Original segmentation. (b) Segmentation obtained using the HS.
the estimation of the noise variance when affected by speckle noise or image edges [9] and for noise reduction [10]. The locations of the peaks z_p and valleys z_v can be obtained by finding the zeros of p_D(z) = dp_Of(z)/dz. With these locations, one can design a specified histogram in different ways, as discussed next.

B. Specification by Double Smoothing

This method is based on incrementing the value of the original histogram at the positions of the peaks by a constant k_p and reducing the values at the positions of the valleys by another constant k_v, followed by a second smoothing, i.e.,
p(z) = [p_O(z) + k_p Σ_{i=1}^{P} p_O(z) δ(z − z_{p_i}) − k_v Σ_{i=1}^{V} p_O(z) δ(z − z_{v_i})] ∗ w(z)    (13)

where P and V indicate the number of peaks and valleys, respectively. Finally, to have a valid PDF,

p_S(z) = p(z) / ∫_{−∞}^{∞} p(z) dz.    (14)

Fig. 20. Modified and original PDFs of the first example.
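A sketch of (13) and (14) for a discrete histogram follows; the constants and the filter width are illustrative choices, not values from the paper:

```python
import numpy as np

def double_smooth_spec(p, peaks, valleys, kp=0.5, kv=0.5, width=5):
    """Specified PDF by double smoothing: raise the histogram at the peak
    bins by a factor (1 + kp), lower it at the valley bins by (1 - kv),
    smooth the result with a moving-average w(z), and renormalize, (13)-(14)."""
    q = p.astype(float).copy()
    q[list(peaks)] *= 1.0 + kp            # + kp * p_O(z) at each z_p
    q[list(valleys)] *= 1.0 - kv          # - kv * p_O(z) at each z_v
    w = np.ones(width) / width            # the second smoothing filter w(z)
    q = np.convolve(q, w, mode="same")
    return q / q.sum()                    # (14): renormalize to a valid PDF

# Bimodal original PDF with peaks near 60 and 180 and a valley near 120.
z = np.arange(256)
p = np.exp(-(z - 60.0) ** 2 / 200) + np.exp(-(z - 180.0) ** 2 / 200)
p /= p.sum()
ps = double_smooth_spec(p, peaks=[60, 180], valleys=[120])
```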
C. Specification Using Gaussian Modes

If all the modes in the original histogram resemble Gaussian ones and have similar variances, then the specified histogram can be defined by

p(z) = p_G(z) ∗ [Σ_{i=1}^{P} p_Of(z) δ(z − z_{p_i})]    (15)
where p_G(z) has a Gaussian shape with a smaller standard deviation σ_G than the ones seen in the original PDF. Subsequently, (14) guarantees a valid PDF.

D. Analysis of the Specified Histograms

The histograms specified in (13) and (15) should have lower entropy than the original ones, where entropy is defined by H = −∫ p(z) ln p(z) dz in the continuous case or H = −Σ_i p(z_i) ln p(z_i) in the discrete one, with z_i indicating the gray-level value. The implementation of the HS by (4) and (5) provides an approximation of the desired histogram with a reduced number of gray levels, and this is the main reason for the entropy decrease. Furthermore, this condition should yield specified modes that are sharper and therefore more separated from each other, defining better valleys for segmentation purposes, as will be shown in the examples here. This concept of sharper peaks and minimum entropy leads to interesting compensation algorithms for radar imaging, as explained in [11].

The filtering part of the histogram leads to dithering, as explained in [12] and [13], which is equivalent to adding noise to the image. The PDF of the sum of two independent random variables is given by the convolution of their PDFs. If Y and W are independent random variables distributed as p_Y(z) and p_W(z), then X = Y + W is distributed as p_X(z) = p_Y(z) ∗ p_W(z). Since p_W(z) is a valid PDF, all its values are nonnegative, and we are in fact filtering p_Y(z) with a low-pass filter. As the previous suggestion was to average samples, W in that case is uniformly distributed noise.

As mentioned before, the entropy of the specified histogram should be less than that of the original one. As stated in [14], for any two independent random vectors Y and W such that both H[Y] and H[W] exist, H[Y + W] ≥ H[Ỹ + W̃], where Ỹ and W̃ are two independent multivariate Gaussians with covariances proportional to those of Y and W, respectively. Therefore, if the original PDF modes are in fact non-Gaussian but the filtering steps approximate the specified PDF with Gaussian ones, then the entropy is less than that of the original histogram. If the specified histogram is a sum of Gaussians as specified in (15), the condition of minimum entropy is clearly satisfied, too.
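The entropy comparison above is easy to check numerically; a short sketch with two synthetic single-mode PDFs (the σ values are arbitrary):

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Discrete entropy H = -sum_i p(z_i) ln p(z_i) of a normalized histogram.
    Bins with negligible mass are skipped, since p ln p -> 0 as p -> 0."""
    q = p[p > eps]
    return float(-(q * np.log(q)).sum())

# A sharper (smaller sigma) mode carries less entropy than a broad one,
# and the uniform PDF carries the most.
z = np.arange(256)
broad = np.exp(-(z - 128.0) ** 2 / (2 * 40.0 ** 2)); broad /= broad.sum()
sharp = np.exp(-(z - 128.0) ** 2 / (2 * 8.0 ** 2));  sharp /= sharp.sum()
```

This matches the claim in the text: narrowing the specified modes with (15) lowers the entropy relative to the original histogram.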
Fig. 21. (Left) Original segmentation. (Right) Results obtained using the HS.
In order to verify the above entropy discussion, an example using a simple segmentation technique based on Otsu's thresholding [15] was tested. Fig. 16 shows the image used for this example. Fig. 17 shows the first and last specified histograms used for this simulation based on (15) followed by (14), changing the standard deviations of p_G(z) in (15); a total of 16 different standard-deviation values were used, from 1 to 8. The original PDF in Fig. 17 refers to the histogram shown in Fig. 16(d). Fig. 18 shows the percentage of misclassified pixels versus the ratio Σ p_O(z_i) ln p_O(z_i) / Σ p_S(z_i) ln p_S(z_i) of the entropy of the original histogram to the entropy of the specified histogram. As can be seen, reducing the standard deviations of the Gaussians yields lower entropy as the entropy ratio becomes larger.

It is interesting to note that reducing the entropy for this example, i.e., defining narrower Gaussian modes, does not necessarily yield fewer errors, as indicated in Fig. 18. What this example shows is that even a small reduction in the entropy yields better results. This translates to an almost effortless selection of parameters in (13) and (15) to obtain a better segmentation, as shown in the succeeding subsections. By no means can the discussion presented here be generalized for all images or all segmentation approaches. What is consistent, however, is that the modification does reduce the errors for all the segmentation cases presented in this paper, as well as all the entropy ratios in Fig. 18.

Testing all the major segmentation methods available is beyond the scope of this paper. The approach presented here was tested using different segmentation techniques offered in MATLAB. The details of the segmentation part can be found in the MATLAB documentation or on the MathWorks homepage. None of these segmentation algorithms were modified; the only difference was the input images, which were modified with the proposed approach.
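For reference, Otsu's threshold used in this experiment can be computed directly from the normalized histogram. A sketch, with a synthetic bimodal image of our own:

```python
import numpy as np

def otsu_threshold(img, L=256):
    """Otsu's method: choose the cut k maximizing the between-class variance
    sigma_B^2(k) = (m_G w0(k) - m(k))^2 / (w0(k)(1 - w0(k)))."""
    p = np.bincount(img.ravel(), minlength=L) / img.size
    w0 = np.cumsum(p)                      # probability of class {0..k}
    m = np.cumsum(p * np.arange(L))        # cumulative mean up to level k
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (m[-1] * w0 - m) ** 2 / (w0 * (1.0 - w0))
    return int(np.nanargmax(sigma_b))      # NaNs from empty classes are skipped

# Two well-separated populations around levels 60 and 190.
rng = np.random.default_rng(3)
levels = np.concatenate([rng.normal(60, 8, 3000), rng.normal(190, 8, 3000)])
img = np.clip(levels, 0, 255).astype(np.uint8)
t = otsu_threshold(img)
```

For two well-separated, equally populated modes, the maximizing cut falls between them, which is the behavior exploited in Fig. 18.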
E. Example Using Morphological Segmentation

The first segmentation algorithm presented here is based on a series of basic morphological operations such as dilation, closing, and opening. The edges of the image are calculated first using a Sobel mask, and these edges are then modified by the morphological operations to obtain a final segmentation. Once again, more details can be found in the MATLAB documentation. Fig. 19 shows the results using (15). Note how the modified image presents smoother regions and how it allows the detection of an object that the original segmentation algorithm missed. Note also how the segmentation appears to follow the objects better. Fig. 20 shows the modified and original PDFs of this image.

F. Example Using an Entropy Filter

Fig. 21 shows the results of using (13) with a segmentation approach that uses an entropy filter to calculate texture values. The entropy is calculated using a 9 × 9 mask, which gives an estimate of the "roughness" of the area. Morphological operations then eliminate artifacts and fill gaps after the entropy filter is applied. Note how the specified image obtained better results for the segmentation of the two textures. Note also how the two images are almost identical. This corroborates the results in Fig. 18 in the sense that even a small modification, based on a specified histogram whose entropy is almost the same as that of the original histogram, yields good results.

G. Example Using Thresholding

Here, classical segmentation by thresholding the histogram is investigated. The minimum-error-thresholding (MET) method [16] is used to assess a thresholding approach other than Otsu's because of the excellent results of this technique reported in [15], in which 40 different thresholding segmentation methods were investigated. In order to have a quantitative analysis of the improvement achieved by using the HS, 100 images of a wide variety of natural scenes were tested, and the segmentations were compared with the database ground-truth segmentations performed by human observers [17]. Fig. 22 shows an example using the HS and MET segmentations. A quality metric R was defined as

$$R = \frac{\sum_i \left|(S_{H_i}, S_M)\right|}{\sum_i \left|S_{H_i}\right|} - \frac{\sum_i \left|(\bar{S}_{H_i}, S_M)\right|}{\sum_i \left|\bar{S}_{H_i}\right|} \qquad (16)$$
where $|x|$ denotes the cardinality of $x$, $S_{H_i}$ is the segmentation done by the $i$th observer (the number of observers varies from more than three to less than five), $S_M$ is the segmentation done by the method being evaluated, $\bar{S}_{H_i}$ contains the background pixels of the $i$th observer segmentation, and $(S_{H_i}, S_M)$ denotes the set of pixels corresponding to the region in segmentation $S_{H_i}$ that contains the pixels in segmentation $S_M$.
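For binary masks, the metric in (16) can be sketched as follows. This is a simplifying interpretation of ours: we treat $(S_{H_i}, S_M)$ as the pixel-wise intersection of the two masks, whereas the paper's region-based definition is slightly more involved.

```python
import numpy as np

def quality_metric(observer_masks, method_mask):
    """Quality metric R of (16) for boolean masks: overlap of the method
    segmentation with the observers' foregrounds minus its overlap with
    their backgrounds, each normalized by the respective cardinality."""
    fg_hit = sum(np.logical_and(s, method_mask).sum() for s in observer_masks)
    fg_tot = sum(s.sum() for s in observer_masks)
    bg_hit = sum(np.logical_and(~s, method_mask).sum() for s in observer_masks)
    bg_tot = sum((~s).sum() for s in observer_masks)
    return fg_hit / fg_tot - bg_hit / bg_tot
```

A perfect match with all observers gives R = 1, while marking everything (or nothing) as foreground gives R = 0.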
Fig. 22. Examples using the MET technique with the HS. The first column is the original image; the second column corresponds to the MET segmentation (in green); the third column is the human segmentation; and the fourth column is the MET using the HS (in green). The red caption gives the value of the quality metric computed as in (16).

TABLE I. Results obtained using the HS on 100 images. The values indicate the average value of the quality metric R and should be compared with the average obtained when using no HS, which is equal to 0.090776.
For the metric defined in (16), 0 < R < 1, where 1 would indicate a perfect segmentation match between all the observers and the method. The red caption in Fig. 22 shows the computed value of R for each example. Table I shows the average value of the metric R obtained when the 100 images were processed following the two methods specified in (13) and (15). Parameter $w_h$ corresponds to the normalized cutoff frequency of the ideal low-pass filter used in $p_{O_f}(z) = p_O(z) \ast w(z)$.

H. Segmentation Using the HS as a Solo Approach

At this point, one may wonder whether the HS can be used as a segmentation approach on its own. It would just be a matter of defining the new histogram as two Dirac impulses, one for the background and one for the objects (assuming the objects have similar gray-level values). Let us consider the case of a mixture probability distribution $p(z) = P_1 p_1(z) + P_2 p_2(z)$ consisting of two Gaussian modes; the PDF is then defined by

$$p_G(z) = \frac{P_1}{\sqrt{2\pi}\,\sigma_1} e^{-\frac{(z-\mu_1)^2}{2\sigma_1^2}} + \frac{P_2}{\sqrt{2\pi}\,\sigma_2} e^{-\frac{(z-\mu_2)^2}{2\sigma_2^2}} \qquad (17)$$

where $P_1$ and $P_2$ are the probabilities of occurrence and $P_1 + P_2 = 1$. By specifying a histogram as two Dirac impulses, as indicated before, located right at the means of the Gaussian modes, the PDF of this mixture is

$$p_S(z) = \frac{\dfrac{P_1}{\sqrt{2\pi}\,\sigma_1}\,\delta(z-\mu_1) + \dfrac{P_2}{\sqrt{2\pi}\,\sigma_2}\,\delta(z-\mu_2)}{\dfrac{P_1}{\sqrt{2\pi}\,\sigma_1} + \dfrac{P_2}{\sqrt{2\pi}\,\sigma_2}} \qquad (18)$$

which can be simplified to

$$p_S(z) = \frac{P_1\sigma_2}{P_1\sigma_2 + P_2\sigma_1}\,\delta(z-\mu_1) + \frac{P_2\sigma_1}{P_1\sigma_2 + P_2\sigma_1}\,\delta(z-\mu_2). \qquad (19)$$
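The simplification from (18) to (19) is straightforward algebra, but it can also be checked numerically. A small sketch of ours (all numeric values illustrative):

```python
import math

def impulse_weights_from_18(P1, P2, s1, s2):
    """Normalized Dirac-impulse weights as they appear in (18)."""
    a1 = P1 / (math.sqrt(2.0 * math.pi) * s1)
    a2 = P2 / (math.sqrt(2.0 * math.pi) * s2)
    return a1 / (a1 + a2), a2 / (a1 + a2)

def impulse_weights_from_19(P1, P2, s1, s2):
    """Closed-form weights of (19)."""
    d = P1 * s2 + P2 * s1
    return P1 * s2 / d, P2 * s1 / d
```

The two forms agree to machine precision for any valid mixture, since the common factor $1/\sqrt{2\pi}$ and the product $\sigma_1\sigma_2$ cancel in the ratio.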
The optimal threshold that minimizes the average segmentation error is defined by [3]

$$P_1\, p_1(T) = P_2\, p_2(T) \qquad (20)$$
Fig. 23. HS used as a segmentation approach.
which, for the case of the Dirac impulses, yields

$$P_1 \sigma_2\, \delta(T - \mu_1) = P_2 \sigma_1\, \delta(T - \mu_2) \qquad (21)$$

and, for the Gaussian mixture,

$$P_1\, e^{-\frac{(T-\mu_1)^2}{2\sigma_1^2}} = P_2\, e^{-\frac{(T-\mu_2)^2}{2\sigma_2^2}}. \qquad (22)$$
For the Gaussian mixture, it is well known that, with $\sigma = \sigma_1 = \sigma_2$, the optimum threshold is defined by [3]

$$T_G = \frac{\mu_1 + \mu_2}{2} + \frac{\sigma^2}{\mu_1 - \mu_2} \ln\frac{P_2}{P_1}. \qquad (23)$$

Solving (21) by using a nascent delta function $\delta(x) = \lim_{a \to 0} \frac{1}{a\sqrt{\pi}}\, e^{-x^2/a^2}$, the optimum specified threshold $T_S$ is similar to (23) but with $\sigma = 0$, which places the optimum threshold exactly in the middle of the two Dirac impulses in the specified PDF, i.e., $T_S = (\mu_1 + \mu_2)/2$. This result is not a surprise since the specified distribution can be seen as the Gaussian mixture with equal variances approaching zero. The effect of $P_2/P_1$ in (23) is not taken into consideration when using the HS, and this may cause trouble when one of the modes is much more prominent than the other. Fig. 23 shows an example using only Dirac impulses as the specified PDF, that is,

$$p(z) = \sum_{i=1}^{P} p_{O_f}(z)\,\delta(z - z_{p_i}). \qquad (24)$$
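The gap between the Bayes-optimal threshold (23) and the midpoint threshold implied by the two-impulse specified PDF can be made concrete with a short sketch (our own illustration; all numeric values are arbitrary):

```python
import math

def bayes_threshold(mu1, mu2, sigma, P1, P2):
    """Optimum threshold (23) for an equal-variance Gaussian mixture."""
    return (mu1 + mu2) / 2.0 + (sigma ** 2 / (mu1 - mu2)) * math.log(P2 / P1)

def hs_threshold(mu1, mu2):
    """Threshold implied by the two-impulse specified PDF: the midpoint T_S."""
    return (mu1 + mu2) / 2.0
```

With $P_1 = P_2$ the two thresholds coincide; as one mode grows more prominent, $T_G$ drifts away from the midpoint, which is exactly the effect of $P_2/P_1$ that the HS ignores.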
The first column corresponds to the original images, the second column corresponds to the specified PDFs, the third column shows the modified images, and the last column shows the pixel positions in the modified images with gray-level values less than 115. The segmentation is not extremely accurate, but it is not bad at all, considering how easily it was obtained.

V. CONCLUSION

The image enhancement method presented in this paper runs in less than 2 s on a 2-GHz personal computer and produced satisfactory enhancements of images for which the HE yielded poor results. By combining and improving two different ways of enhancing contrast, our method compares well with each individual approach and solves some of the issues of the original techniques. The method has also shown good results on color images. Additionally, the HS has been proposed as a way to improve image segmentation. Specifying the final histogram is relatively easy: all it takes is the definition of a low-pass filter together with the amplification of the peaks and the attenuation of the valleys, or the standard deviation of the assumed Gaussian modes in the final specification. Examples showing better segmentation have been presented, and the possibility of using this approach as a standalone segmentation method has also been discussed. The attractive side of using the HS is the simple implementation needed to obtain
considerably better results for enhancement and segmentation purposes.

REFERENCES

[1] G. Thomas, D. Flores-Tapia, and S. Pistorius, "Fast image contrast enhancement for general use digital cameras," in Proc. IEEE Instrum. Meas. Technol. Conf., Austin, TX, May 2010, pp. 706–709.
[2] G. Thomas, "Image segmentation using histogram specification," in Proc. IEEE Int. Conf. Image Process., San Diego, CA, Oct. 2008, pp. 589–592.
[3] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 2nd ed. Englewood Cliffs, NJ: Prentice-Hall, 2001.
[4] M. Stamm and K. J. R. Liu, "Blind forensics of contrast enhancement in digital images," in Proc. IEEE Int. Conf. Image Process., San Diego, CA, Oct. 2008, pp. 3112–3115.
[5] C. Wang and Z. Ye, "Brightness preserving histogram equalization with maximum entropy: A variational perspective," IEEE Trans. Consum. Electron., vol. 51, no. 4, pp. 1326–1334, Nov. 2005.
[6] G. J. Erickson and C. R. Smith, Maximum Entropy and Bayesian Methods. Seattle, WA: Kluwer, 1991.
[7] C. M. Tsai and Z. M. Yeh, "Contrast enhancement by automatic and parameter-free piecewise linear transformation for color images," IEEE Trans. Consum. Electron., vol. 54, no. 2, pp. 213–219, May 2008.
[8] A. O. Silva, J. F. Camapum Wanderley, A. N. Freitas, H. de F. Bassani, R. A. de Vasconcelos, and F. M. O. Freitas, "Watershed transform for automatic image segmentation of the human pelvic area," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Montreal, QC, Canada, May 2004, vol. 5, pp. 597–600.
[9] W. Hagg and M. Sties, "Efficient speckle filtering of SAR images," in Proc. Geosci. Remote Sens. Symp., Pasadena, CA, Aug. 1994, vol. 4, pp. 2140–2142.
[10] A. Wrangsjo and H. Knutsson, "Histogram filters for noise reduction," in Proc. SSAB Symp. Image Anal., Stockholm, Sweden, Mar. 2003.
[11] J. S. Sok, G. Thomas, and B. C. Flores, Range Doppler Radar Imaging and Motion Compensation. Norwood, MA: Artech House, 2001.
[12] P.-E. Forssen, "Image analysis using soft histograms," in Proc. SSAB Symp. Image Anal., Stockholm, Sweden, Mar. 2002.
[13] R. M. Gray and T. G. Stockham, "Dithered quantizers," IEEE Trans. Inf. Theory, vol. 39, no. 3, pp. 805–812, May 1993.
[14] A. Dembo, T. M. Cover, and J. A. Thomas, "Information theoretic inequalities," IEEE Trans. Inf. Theory, vol. 37, no. 6, pp. 1501–1518, Nov. 1991.
[15] M. Sezgin and B. Sankur, "Survey over image thresholding techniques and quantitative performance evaluation," SPIE J. Electron. Imaging, vol. 13, no. 1, pp. 146–168, Jan. 2004.
[16] J. Kittler and J. Illingworth, "Minimum error thresholding," Pattern Recognit., vol. 19, no. 1, pp. 41–47, 1986.
[17] D. Martin, C. Fowlkes, D. Tal, and J. Malik, "A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics," in Proc. 8th Int. Conf. Comput. Vis., Jul. 2001, vol. 2, pp. 416–423.
Gabriel Thomas (S’89–M’95) received the B.Sc. degree in electrical engineering from the Monterrey Institute of Technology, Monterrey, Mexico, in 1991 and the M.Sc. and Ph.D. degrees in computer engineering from the University of Texas, El Paso, in 1994 and 1999, respectively. Since 1999, he has been a Faculty Member with the Department of Electrical and Computer Engineering, University of Manitoba, Winnipeg, MB, Canada, where he is currently an Associate Professor. He has coauthored the book Range Doppler Radar Imaging and Motion Compensation. His current research interests include digital image and signal processing, computer vision, and nondestructive testing.
Daniel Flores-Tapia received the B.Sc. degree in electrical engineering from the Monterrey Institute of Technology, Chihuahua, Mexico, in 2002 and the Ph.D. degree in computer engineering from the University of Manitoba, Winnipeg, MB, Canada, in 2009. He is currently a Postdoctoral Fellow at CancerCare Manitoba, Winnipeg. His current research interests include biomedical Fourier imaging, biomedical signal processing, and electrical impedance tomography.
Stephen Pistorius received the B.Sc. degree in physics and geography from the University of Natal, Durban, South Africa, in 1982, and the B.Sc. (Hons.) degree in radiation physics, the M.Sc. degree in medical science, and the Ph.D. degree in physics from the University of Stellenbosch, Bellville, South Africa, in 1983, 1984, and 1991, respectively. In 1986, he was certified as a Medical Physicist by the Health Professions Council of South Africa, and in 2002, he obtained the Professional Physicist designation from the Canadian Association of Physicists. He is the Provincial Director of Medical Physics, CancerCare Manitoba, Winnipeg, MB, Canada, and is an Associate Professor with the Faculty of Medicine and an Adjunct Professor with the Department of Physics and Astronomy, University of Manitoba, Winnipeg. His research interests include advanced imaging techniques and reconstruction, Monte Carlo simulation, and radiation transport/beam modeling for ionizing and nonionizing radiation. He has authored over 100 publications and presentations. Dr. Pistorius was the Chair of the Canadian Organization of Medical Physicists from 2006 to 2008. He has won a number of national and international awards.