GrayCut - Object Segmentation in IR-Images

Christian Ruwwe and Udo Zölzer

Department of Signal Processing and Communications

Helmut-Schmidt-University

University of the Federal Armed Forces

Hamburg, Germany


Abstract. Object segmentation is a crucial task for image analysis and has been studied widely in the past. Most segmentation algorithms rely on changes in contrast or on clustering the same colors only. Yet there seems to be no real once-and-for-all solution to the problem. Nevertheless, graph-based energy minimization techniques have been proven to yield very good results in comparison to other techniques. They combine contrast and color information into an energy minimization criterion. We give a brief overview of two recently proposed techniques and present some enhancements to them. Furthermore, a combination of them into the GrayCut algorithm leads to suitable results for segmenting objects in infrared images.

Key words: Image Segmentation, Energy Minimization, Graph-based Techniques, Infrared Images.

1 Introduction

The search for the holy grail of image segmentation has been a long one [1], [2], [3], [4]. Most segmentation algorithms rely on changes in contrast or on grouping the same colors only. Yet there seems to be no real once-and-for-all solution to the problem.

Recent advances show that graph-based techniques are superior to other algorithms in terms of quality and user interaction [5], [6], [7]. They combine contrast and color information into an energy minimization criterion. We explain them in a little more detail below, before we show how our new proposal fits into this evolution.

After the following short introduction to two graph-based algorithms, the next section contains the core description of our new segmentation algorithm, called GrayCut. Some experimental results and examples show the usage of our proposed method for the segmentation of objects in gray-valued images, especially infrared (IR) images. Finally, we conclude with a summary and give a brief outlook on future work.

1.1 GraphCut

The idea of using an energy minimization technique for image segmentation and solving it with graph-based algorithms was first described by Greig, Porteous and Seheult in 1989 [8]. We describe the GraphCut as it was later refined by Boykov and Jolly [5]. It combines two already known approaches to image segmentation: algorithms based on colors (or, more precisely, gray-levels) and segmentation based on the contrast in different regions of an image. For a successful segmentation, the energy formulation

$$E(z) = P(z) + \gamma \cdot C(z) \tag{1}$$

has to be minimized. The weighting parameter $\gamma$ controls the importance of one term over the other.

The fidelity term $P(z)$ gives rise to a cost function that penalizes false classification of a pixel $z$ of the image $I$ to the foreground ($\alpha = 1$) or to the background ($\alpha = 0$). Since the user provides a so-called trimap, in which two regions (sure foreground and sure background) have to be defined, one can easily calculate a probability distribution and cost functions $p_{z,\alpha}$ from the gray-valued pixels and the image histograms of these two regions, giving

$$P(z) = \sum_{z \in I} p_{z,\alpha} . \tag{2}$$

Costs can be calculated from the negative log-likelihood of the pixel belonging either to the foreground or to the background. Second, a prior term $C(z)$ representing the pairwise interactions between neighboring pixels is calculated from the contrast between each pair of neighboring pixels $z$ and $\hat{z}$:

$$C(z) = \sum_{(z,\hat{z}) \in N} c_{z,\hat{z}} . \tag{3}$$

The neighborhood $N$ is chosen such that only neighboring pixels around the segmentation boundary are summed up. These are only pixels $z$ and $\hat{z}$ belonging to two different foreground/background maps: $\alpha_z \neq \alpha_{\hat{z}}$. Only a 4-way neighborhood is used here. Therefore, the minimization criterion is to find the shortest possible segmentation border that gives the smallest sum over its contrast terms. The contrast between neighboring pixels $z$ and $\hat{z}$ can be expressed as

$$c_{z,\hat{z}} = \exp\left(-\frac{(I_z - I_{\hat{z}})^2}{2\sigma^2}\right), \tag{4}$$

where $I_z$ is the gray-value of the pixel $z$ in the range $0 \ldots 1$. The variance $\sigma^2$ over all differences in intensity can be seen as the noise floor present in the image. Choosing this parameter carefully lets the contrast term switch between almost zero for high contrast and almost one for low contrast. However, other functions that separate noise from real contrast in the same manner are also possible.
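As a hedged illustration of Eqs. (1) to (4), the following minimal sketch evaluates the energy of a given labelling of a small gray-level image. The function names, the number of histogram bins and the smoothing constant `eps` are assumptions made for this example and are not taken from the original method; plain Python lists keep the sketch free of dependencies.

```python
import math

def histogram_costs(pixels, seeds, bins=16, eps=1e-6):
    """Negative log-likelihood cost of every pixel under one seed histogram."""
    def bin_of(v):
        return min(int(v * bins), bins - 1)   # gray values lie in 0..1
    hist = [eps] * bins                       # eps (assumed) avoids log(0) for empty bins
    for v in seeds:
        hist[bin_of(v)] += 1.0
    total = sum(hist)
    return [-math.log(hist[bin_of(v)] / total) for v in pixels]

def contrast(i_z, i_zhat, sigma):
    """Eq. (4): close to 1 in flat regions, close to 0 across strong edges."""
    return math.exp(-((i_z - i_zhat) ** 2) / (2.0 * sigma ** 2))

def energy(image, alpha, fg_seeds, bg_seeds, gamma, sigma):
    """Eq. (1) for a labelling alpha (1 = foreground, 0 = background)."""
    h, w = len(image), len(image[0])
    flat = [v for row in image for v in row]
    cost = {0: histogram_costs(flat, bg_seeds), 1: histogram_costs(flat, fg_seeds)}
    # Fidelity term, Eq. (2): sum of the costs of the chosen labels.
    fidelity = sum(cost[alpha[y][x]][y * w + x] for y in range(h) for x in range(w))
    # Contrast term, Eq. (3): only 4-way pairs with different labels contribute.
    boundary = 0.0
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0)):   # each neighbor pair is visited once
                yy, xx = y + dy, x + dx
                if yy < h and xx < w and alpha[y][x] != alpha[yy][xx]:
                    boundary += contrast(image[y][x], image[yy][xx], sigma)
    return fidelity + gamma * boundary
```

For an image with a dark left half and a bright right half, a labelling whose border follows the gray-level edge should receive a lower energy than one whose border runs through the flat dark region, since the latter pays both a higher fidelity cost and a contrast term close to one per cut pair.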

Lecture Notes in Computer Science


From these two properties of each pixel (one belonging to the object or the background, the other being an edge or not) an undirected graph is built [9]. More precisely, a so-called S/T-graph is built, where the two terminals S and T represent the object and the background, respectively. Edges from and to these terminals are weighted with the corresponding foreground/background costs $p_{z,\alpha}$. Neighboring pixels are connected with edges in a 4-way neighborhood, weighted with the corresponding contrast terms $c_{z,\hat{z}}$.

Finally, solving this graph with a standard minimum-cut/maximum-flow (MC) algorithm has been proven to give the optimal segmentation border in terms of the energy formulation $E(z)$ defined in (1). The segmentation border corresponds to the edges representing the minimum cut in the graph.

1.2 GrabCut

GrabCut, published in 2004 by Rother, Kolmogorov and Blake [6], extends this useful scheme to color images. Instead of gray-level histograms, it makes use of Gaussian mixture models (GMM). Background and foreground are each described with five full-covariance Gaussian components $M_{z,k}$. The fidelity term $P(z)$ is now calculated from the superposition of the Gaussian components

$$M_{z,k} = \frac{1}{\sqrt{|2\pi\Sigma_k|}} \exp\left(-\frac{1}{2}(I_z - \mu_k)^T \Sigma_k^{-1} (I_z - \mu_k)\right), \tag{5}$$

where the term $I_z$ now reflects the three-valued RGB color of the pixel $z$. The $\mu_k$ are the mean colors of the components and the $\Sigma_k$ are full-covariance matrices reflecting color dependencies between the three color layers. Adaptation of the probability distributions $M_{z,k}$ to the RGB colors is carried out with the iterative expectation maximization (EM) algorithm [10], according to a predefined trimap given by the user.

Due to the three-dimensional color space, the contrast $c_{z,\hat{z}}$ is now calculated as

$$c_{z,\hat{z}} = \exp\left(-\frac{\|I_z - I_{\hat{z}}\|^2}{2\sigma^2 \cdot \|z - \hat{z}\|}\right), \tag{6}$$

where the norm $\|I_z - I_{\hat{z}}\|$ is the Euclidean distance in RGB space and $\|z - \hat{z}\|$ indicates the spatial (Euclidean) distance between two neighboring pixels $z$ and $\hat{z}$ (GrabCut uses an 8-way connectivity).
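The color contrast of Eq. (6) on an 8-way neighborhood could be computed as in the following minimal sketch. It follows the equation as written in this paper, with the spatial distance inside the exponent; the function names and the dictionary-based edge representation are assumptions chosen for illustration only.

```python
import math

# Offsets that cover every 8-way neighbor pair exactly once.
HALF_NEIGHBORHOOD = ((0, 1), (1, -1), (1, 0), (1, 1))

def color_contrast(c_z, c_zhat, offset, sigma):
    """Eq. (6): squared RGB distance, scaled by the spatial pixel distance."""
    color_sq = sum((a - b) ** 2 for a, b in zip(c_z, c_zhat))
    spatial = math.hypot(*offset)             # 1 for axial, sqrt(2) for diagonal pairs
    return math.exp(-color_sq / (2.0 * sigma ** 2 * spatial))

def contrast_edges(image, sigma):
    """Map ((y, x), (yy, xx)) -> contrast term for all 8-way neighbor pairs."""
    h, w = len(image), len(image[0])
    edges = {}
    for y in range(h):
        for x in range(w):
            for dy, dx in HALF_NEIGHBORHOOD:
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w:
                    edges[(y, x), (yy, xx)] = color_contrast(
                        image[y][x], image[yy][xx], (dy, dx), sigma)
    return edges
```

With this form, a diagonal pair receives a somewhat larger contrast term than an axial pair with the same color difference, because the larger spatial distance reduces the magnitude of the exponent.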

In this manner, the whole algorithm is laid out in an iterative way: after each EM iteration, an S/T-graph is built as in the GraphCut and solved with the minimum-cut algorithm. The resulting segmentation border is used to update the trimap describing the foreground and background regions. This new trimap is used for the next EM iteration, and so on. The alternating use of EM steps and MC solutions guarantees that the energy decreases monotonically over the iterations. The change in the overall energy $E(z)$ between two iterations might be used as a suitable stopping criterion for the algorithm.
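The graph construction and minimum-cut step that both GraphCut and GrabCut rely on can be illustrated with a small, self-contained sketch. The source does not state which maximum-flow solver is used; Edmonds-Karp is swapped in here only because it is short. The function name `min_cut_labels` and the input layout (flat pixel indices, a dictionary of neighbor pairs) are likewise assumptions for this example.

```python
from collections import deque

def min_cut_labels(cost_fg, cost_bg, pair_weights):
    """Label n pixels by a minimum S/T cut (Edmonds-Karp maximum flow).

    cost_fg[i], cost_bg[i]: cost of assigning pixel i to foreground / background.
    pair_weights: {(i, j): weight} for neighboring pixels (already scaled by gamma).
    Returns a list with 1 for foreground (source side) and 0 for background.
    """
    n = len(cost_fg)
    s, t = n, n + 1                           # terminals: S = object, T = background
    cap = [dict() for _ in range(n + 2)]      # residual capacities

    def add_edge(u, v, c_uv, c_vu):
        cap[u][v] = cap[u].get(v, 0.0) + c_uv
        cap[v][u] = cap[v].get(u, 0.0) + c_vu

    for i in range(n):
        add_edge(s, i, cost_bg[i], 0.0)       # cut if i ends up in the background
        add_edge(i, t, cost_fg[i], 0.0)       # cut if i ends up in the foreground
    for (i, j), w in pair_weights.items():
        add_edge(i, j, w, w)                  # undirected neighborhood link

    def reachable_from_source():
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = reachable_from_source()
        if t not in parent:
            break                             # no augmenting path left: flow is maximal
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push
    # After the last search, `parent` holds the source side of the minimum cut.
    return [1 if i in parent else 0 for i in range(n)]
```

For a row of four pixels with `cost_fg = [5, 5, 0.1, 0.1]`, `cost_bg = [0.1, 0.1, 5, 5]` and neighbor weights `{(0, 1): 1.0, (1, 2): 0.01, (2, 3): 1.0}`, the cut falls on the weak link between pixels 1 and 2, and the function returns `[0, 0, 1, 1]`. For real images a specialized solver would be far faster than this sketch.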

2 GrayCut for Infrared Images

Infrared images are gray-level representations of infrared emissions that cannot be seen directly by a human observer. The colorization is therefore purely virtual, much like in X-ray images, where bright gray-values represent dense materials. Colorization is normally chosen such that warm (hot) regions are displayed in bright gray-values, whereas cold regions are shown in darker gray-values, but this mapping can also be inverted. Furthermore, the relative infrared emissions depend on weather and climate conditions. In our application of naval ship images, the objects themselves - the ships - might be warmer (i.e. brighter) than the surrounding water, or even colder (i.e. darker) than the rest of the image. So using predefined colors for image segmentation will not work at all.

Additionally, infrared images in general do not display objects as sharply as daylight images. In fact, the noise floor present in the images is always much greater than in typical man-made pictures. Therefore, pure edge-based image segmentation will not be sufficient either.

We have already used image segmentation with a GrabCut-related algorithm to separate ships in marine images from their surrounding water [7]. However, that segmentation was carried out on RGB images. Now infrared (IR) images come into play. Since they contain only gray-values, we combine the advantages of both algorithms: GraphCut, which is gray-level-based, and GrabCut, which uses an iterative optimization scheme.

2.1 Gaussian mixture models

As in GrabCut, we use Gaussian mixture models again, but this time only to find distributions in the two gray-scale histograms: one for the user-defined background and one for the (unknown) rest. The possible range of values is reduced from the three-dimensional space of RGB colors to the purely one-dimensional gray-scale histogram. So the covariance matrix $\Sigma_k$ reduces to the simple scalar variance $\sigma_k^2$, and the components become

$$M_{z,k} = \frac{1}{\sqrt{2\pi\sigma_k^2}} \exp\left(-\frac{(I_z - \mu_k)^2}{2\sigma_k^2}\right). \tag{7}$$

Using five Gaussian components for each model gives too much fragmentation. We found that using two or three components each is more suitable for gray-value segmentation.

2.2 Iterations

Adaptation of the Gaussian mixture models is again carried out by expectation maximization, so the whole algorithm is iterative in nature. Starting with a random distribution for EM learning, as in GrabCut, is not a good starting point for the segmentation task. We therefore apply the very first EM step before the whole algorithm starts. This guarantees a proper initial distribution of the mixture models, but also ensures the adaptation to changes in the trimap based on intermediate segmentation results.

Since the possible range of values and the total number of components have been reduced, the overall algorithm performance has been slightly increased. Moreover, fewer iterations are needed for the Gaussian components to adapt to the gray-level histogram. Usually good results are already achieved after the first three to five iterations. Subsequent iterations only change a few pixels directly at the segmentation border.

Fig. 1. An example infrared image and the segmentation results after the first four iterations: The first row shows the input image and the background selection (black line) as applied by the user. The next two rows show the segmentation result (white line) after the first four iterations of the GrayCut algorithm.

2.3 Post-processing

In some cases, the segmentation can be improved by applying additional post-processing operations between subsequent iteration steps. We have already shown in [7] that widening the calculated segmentation border with a morphological dilation operation gives superior results when the color information in the image is too low for fast adaptation by the EM algorithm.

We have successfully used the dilation operation with a disc-like structuring element. However, other structuring elements might be better suited to represent certain image content, for example the many vertical edges from masts and antennas present on many ships. This is the subject of ongoing research.

Other morphological operations, like cleaning up small separated pixels in the foreground map and leaving only the main object part, may also be applied in some cases. We found that this kind of post-processing technique is more suitable when dealing with color images than when segmenting gray-valued IR images.

2.4 Don’t-Care Map

When dealing with real-world applications, (infrared) images may contain several optical symbols laid over the original image. These lines and symbols are generated synthetically and therefore provide a very high contrast with sharp edges. Segmentation algorithms would normally react very strongly to these sharp edges, which typically leads to wrong segmentation results.

Since these synthetically added symbols always remain at the same position, we introduce an additional map called the don’t-care-map. This map distinguishes image regions containing these special symbols from the rest of the real image and is used to exclude these pixels from the rest of the algorithm. The don’t-care-map influences the construction of the S/T-graph in several ways:

First, sharp edges from these optical overlays do not carry any information for the segmentation task and should be rejected. Contrast terms $c_{z,\hat{z}}$ belonging to these edges are set to 1, simulating a homogeneous region with no edge at all.

Moreover, the gray values of these pixels should not be adapted by the Gaussian mixture models, since it is not known whether the part of the image behind the optical symbol belongs to the object or to the background. Therefore, pixels present in the don’t-care-map are not used during the EM steps.

In addition to this uncertainty, any classification of pixels to the foreground or to the background has to be undone before the next minimum cut can be solved. All pixels from the don’t-care-map are set back to the unknown state, allowing the assignment of edge weights to the S/T-links again.

These slight modifications enable the algorithm to ignore special (predefined) parts of the image. The missing parts of the segmentation can move around freely inside the don’t-care-regions, since all constraints from the contrast terms are cleared. Therefore, solving the minimum cut connects all valid parts of the segmentation with the shortest possible segmentation border through this unknown region.

2.5 User Interaction

We want the segmentation task to require as few user interactions as possible. Defining only a rough background region seems to satisfy this goal, but further improvements towards fully automated segmentation are desirable. Nevertheless, the iterative algorithm structure enables the user to refine the trimap with additional background or additional foreground regions. Defining foreground in the beginning is usually not very helpful for the segmentation task and slows down the whole process from the user's point of view. Only in the rare case of difficult images does the user have to give additional constraints to the algorithm and apply a few more iterations to achieve the desired result.
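The interplay between the trimap and the don’t-care-map described in this section might be sketched as follows. This is a minimal illustration under stated assumptions: the label encoding, the dictionary-based data layout and both function names are hypothetical, and treating every neighbor pair that touches a don’t-care pixel as an overlay edge is one possible reading of the text.

```python
UNKNOWN, BACKGROUND, FOREGROUND = -1, 0, 1    # hypothetical trimap encoding

def apply_dont_care(trimap, pair_weights, dont_care):
    """Prepare trimap and contrast terms for the next minimum cut.

    trimap:       {pixel: label} with the labels defined above
    pair_weights: {(pixel, neighbor): contrast term c}
    dont_care:    set of pixels covered by synthetic overlay symbols
    """
    # Any earlier classification inside the overlay is undone.
    new_trimap = {p: (UNKNOWN if p in dont_care else label)
                  for p, label in trimap.items()}
    # Pairs touching the overlay behave like a homogeneous region (c = 1).
    new_weights = {(p, q): (1.0 if p in dont_care or q in dont_care else c)
                   for (p, q), c in pair_weights.items()}
    return new_trimap, new_weights

def em_samples(image, trimap, dont_care, label):
    """Gray values used to adapt the mixture model of `label`, overlay excluded."""
    return [image[p] for p, l in trimap.items()
            if l == label and p not in dont_care]
```

In such a layout, `apply_dont_care` would be called before each graph construction and `em_samples` before each EM step, so that the overlay neither attracts the segmentation border nor distorts the gray-level models.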

3 Experimental results and discussion

Our experiments on infrared images showing navy ships demonstrate the usability of the proposed GrayCut algorithm. Figure 1 shows an example segmentation on one of the infrared ship images. The first four iterations are shown. The black line indicates the background box originally drawn by the user, which evolves over time into the segmentation border shown in white. The difference between successive iteration steps becomes smaller and smaller, and more than five iterations are not needed at all.

Figure 2 demonstrates the evolution of the six Gaussian components during five iterations. In the beginning, all six distributions show an almost equal behavior due to the random initialization. In each iteration, the components tend to separate from each other through the expectation maximization. The dotted Gaussians represent the background; the solid lines are components of the foreground model. Again, only the first few iterations are useful. The last (fifth) iteration does not bring any real gain in the overall segmentation quality.
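The separation of the components by expectation maximization can be reproduced in principle with a small sketch of one EM iteration for a one-dimensional Gaussian mixture over gray values, using the component form of Eq. (7). This is a generic textbook EM step, not the authors' implementation; the function names and the variance floor `min_var` are assumptions for this example.

```python
import math

def gauss(v, mu, var):
    """Eq. (7): one-dimensional Gaussian component with variance var."""
    return math.exp(-((v - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_step(values, weights, means, variances, min_var=1e-4):
    """One EM iteration for a 1-D Gaussian mixture over gray values in 0..1."""
    k = len(means)
    # E-step: responsibility of every component for every gray value.
    resp = []
    for v in values:
        p = [weights[j] * gauss(v, means[j], variances[j]) for j in range(k)]
        total = sum(p) or 1e-300              # guard against complete underflow
        resp.append([pj / total for pj in p])
    # M-step: re-estimate weight, mean and variance of each component.
    new_w, new_mu, new_var = [], [], []
    for j in range(k):
        r_sum = sum(r[j] for r in resp) or 1e-300
        mu = sum(r[j] * v for r, v in zip(resp, values)) / r_sum
        var = sum(r[j] * (v - mu) ** 2 for r, v in zip(resp, values)) / r_sum
        new_w.append(r_sum / len(values))
        new_mu.append(mu)
        new_var.append(max(var, min_var))     # assumed floor keeps components from collapsing
    return new_w, new_mu, new_var
```

Calling `em_step` repeatedly on gray values drawn from a dark and a bright cluster, starting from two broad and nearly identical components, moves the means towards the two cluster centers within a few iterations, which mirrors the behavior described for Figure 2.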



Fig. 2. Evolution of the Gaussian mixture model components (left) over five iterations for the foreground (solid) and for the background (dotted) and the real gray-value histograms (right) of the foreground (solid) and the background (dotted).

The right part of Figure 2 shows the real histogram of the example image. Again, the dotted parts indicate the background and the solid bars show the histogram of the foreground. It is clearly visible that the object carries far less information than the rest of the image: the number of pixels belonging to the dark background region is much greater than the few pixels corresponding to the foreground object. This is one reason for applying two different Gaussian mixture models that describe foreground and background separately.

We used a set of 79 infrared images of varying quality. Figure 3 shows some examples of the segmentation results. The black rectangle is the initial background selection by the user, whereas the white line shows the final segmentation result (after 10 iterations). The left image in the third row shows an example application of the don’t-care-map, where the black lines overlaid on the rear part of the ship are ignored by the segmentation border.

The proposed GrayCut algorithm gives suitable results for 23 of the 79 images without the need for additional user interaction. A further 25 results can be improved by the user refining the foreground and background regions. The remaining 31 images result in wrong segmentation borders or none at all. This is mainly due to the bad signal-to-noise ratio of the infrared images. We presume that even a human user might have difficulties in drawing a suitable segmentation border in images of the last type seen in Figure 3.

4 Conclusions and future work

We have used already known algorithms to derive a new method called GrayCut for segmenting gray-level images. Some special extensions to the basic algorithm have been introduced, especially the don’t-care-map to ignore some parts of the image. The development was mainly driven by infrared ship images, but we think the method is also applicable in other fields of image processing and analysis, such as X-ray images for medical applications.



Fig. 3. More segmentation results on different images: The black line indicates the background region as defined by the user, the white line shows the segmentation result of the proposed GrayCut-algorithm after 10 iterations (or no result at all as in the last case in the lower right corner).



User interaction is reduced to the absolute minimum, i.e. only the background region is needed and has to be given by the user. The overall goal is to yield a perfect segmentation without additional user input, but the user can correct bad results by giving more constraints. Since the scene structure - a ship in the water - is nearly always the same, a fully automated algorithm that needs no user input at all is desirable.

The previous experiments show the usability of the proposed GrayCut algorithm. Nevertheless, different enhancements and parameter tuning have been shown to yield superior results. As a drawback, these parameters have to be adjusted carefully by the user before applying the segmentation, and they depend on the image content and the image quality. Currently no automatic parameter adjustment or post-processing selection is available. This remains a subject of future research.

On the other hand, pre-processing steps might be useful to improve the image quality before applying the segmentation algorithm. In particular, noise reduction as already proposed in [11] might be a very suitable step. The integration of such image enhancement algorithms has to be investigated in the future.

References

1. Mortensen, E., Barrett, W.: Intelligent scissors for image composition. ACM Siggraph (1995) pp. 191-198.
2. Kass, M., Witkin, A., Terzopoulos, D.: Snakes: Active contour models. International Conference on Computer Vision (1987) pp. 259-268.
3. Friedland, G., Jantz, K., Rojas, R.: Siox: Simple interactive object extraction in still images. International Symposium on Multimedia (2005) pp. 253-259.
4. Wippig, D., Klauer, B., Zeidler, H.: Extraction of ship silhouettes using active contours from infrared images. International Conference on Computer Vision (2005) pp. 172-177.
5. Boykov, Y., Jolly, M.: Interactive graph cuts for optimal boundary & region segmentation of objects in n-d images. International Conference on Computer Vision (2001) Vol. I, pp. 105-112.
6. Rother, C., Kolmogorov, V., Blake, A.: Grabcut - interactive foreground extraction using iterated graph cuts. Proc. ACM Siggraph (2004) pp. 309-314.
7. Rusch, O., Ruwwe, C., Zoelzer, U.: Image segmentation in naval ship images. 11. Workshop Farbbildverarbeitung (2005) pp. 63-70.
8. Greig, D., Porteous, B., Seheult, A.: Exact maximum a posteriori estimation for binary images. Journal of the Royal Statistical Society (1989) Series B, 51(2), pp. 271-279.
9. Kolmogorov, V., Zabih, R.: What energy functions can be minimized via graph cuts? IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) (2004) Vol. 26, No. 2, pp. 147-159.
10. Dempster, A., Laird, N., Rubin, D.: Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society (1977) Series B, 39(1), pp. 1-38.
11. Wippig, D., Klauer, B., Zeidler, H.: Denoising of infrared images by wavelet thresholding. International Conference on Industrial Electronics, Technology & Automation (2005)
