Automatic Multilevel Thresholding for Image Segmentation by the Growing Time Adaptive Self-Organizing Map

Hamed Shah-Hosseini and Reza Safabakhsh, Member, IEEE

Abstract—In this paper, a Growing TASOM (Time Adaptive Self-Organizing Map) network called "GTASOM," along with a peak finding process, is proposed for automatic multilevel thresholding. The proposed GTASOM is tested for image segmentation. Experimental results demonstrate that the GTASOM is a reliable and accurate tool for image segmentation and that its results outperform those of other thresholding methods.

Index Terms—Self-organizing map, image segmentation, automatic multilevel thresholding, histogram, time-adaptive, TASOM.
1 INTRODUCTION
MULTILEVEL thresholding is a process that segments a gray-level image into several distinct regions. This process is often designed using the gray-level histogram of the image. For example, maximization of an entropy measure for image thresholding was used by Kapur et al. [1]. With the assumption that an image is composed of two normal distributions, Kittler and Illingworth suggested a minimum error thresholding method [2]. The Otsu method [3] uses an interclass variance criterion for bilevel thresholding. A number of algorithms have also been introduced for multilevel thresholding [4], [5], [6], [7], [8], [9]. Multilevel thresholding can be used for gray-level or color quantization [10]. It can also be used for image segmentation [11]. For example, Khotanzad and Bouarfa used a clustering procedure along with a peak seeking algorithm for image segmentation [12]. An iterative form of Otsu's method was suggested by Reddi et al. [13] to generalize the method to multilevel thresholding. This method, called the "iterative threshold selection method," is fast, but its performance depends on the initial values of the thresholds [14]. For good performance, it is suggested that the initial thresholds be distributed evenly in the histogram space. The most difficult task is to determine the appropriate number of thresholds automatically. Unfortunately, many thresholding algorithms, in spite of their fame, are not able to automatically determine the required number of thresholds [15].

The SOM (Self-Organizing Map) is an unsupervised neural network composed of two layers: an input layer, which receives the input vectors, and an output layer, which contains the neurons of the SOM network in a discrete space called the lattice. The SOM uses a competitive learning rule with time-decreasing learning parameters: the learning rate and the neighborhood function. At the beginning of training, the learning rate is set to a value close to unity and is then decreased gradually during training. The learning rate $\eta(n)$ is usually described by an exponential decay over time $n$:
$$\eta(n) = \eta_0 \exp\!\left(-\frac{n}{\tau_1}\right), \qquad (1)$$
where $\tau_1$ is a time constant and $\eta_0$ is the initial value of the learning rate. The neighborhood function usually begins such that it includes all neurons in the network and then gradually shrinks with time. The neighborhood function $h_{j,i}(n)$ is usually described by a Gaussian function:

$$h_{j,i}(n) = \exp\!\left(-\frac{d_{j,i}^2}{2\sigma^2(n)}\right), \qquad (2)$$

such that the width $\sigma(n)$ is described by $\sigma(n) = \sigma_0 \exp(-n/\tau_2)$. Again, $\tau_2$ is a time constant and $\sigma_0$ is the initial value of $\sigma$ at the beginning of the SOM algorithm. The distance $d_{j,i}$ between neurons $i$ and $j$ is calculated on the lattice. The SOM network has been employed for bilevel thresholding [16] and for color quantization [17]. A fuzzy SOM network has been used to threshold gray-level images [18], and a combination of SOM and PCA networks is used for multilevel thresholding in [19] and [20].

TASOM (Time Adaptive SOM) networks are modified forms of SOM networks. Each neuron in a TASOM network has its own learning rate and neighborhood function, which are updated automatically with the incoming input vectors, whereas an SOM network has a single time-decreasing learning rate and neighborhood function. Due to their environment-dependent learning parameters, TASOM networks can operate in both stationary and nonstationary environments [21], [22]. The TASOM with neighborhood functions was introduced in [23] for bilevel thresholding and has been used for adaptive pattern classification in [24]. TASOM networks have also been used for adaptive principal component analysis [25].

In this paper, the TASOM is modified to have a growing number of neurons. The proposed GTASOM is used to estimate the peaks and valleys of the image histogram. A post-processing algorithm then recognizes each peak neuron along with its left and right limits. In this way, the number of regions, and the regions themselves, are determined. These properties make the proposed segmentation algorithm an automatic multilevel thresholding algorithm for image segmentation. The proposed GTASOM and the post-processing algorithm are described in Section 2. Experimental results are presented in Section 3, where the performance of the proposed segmentation algorithm is evaluated and compared with some other thresholding methods. Conclusions are given in Section 4.
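To make the classic SOM schedules of (1) and (2) concrete before turning to the proposed algorithm, the following Python sketch (numpy assumed; the constants $\eta_0$, $\tau_1$, $\sigma_0$, $\tau_2$ are arbitrary illustrative values, not values from this paper) evaluates both:

```python
import numpy as np

def som_learning_rate(n, eta0=0.9, tau1=1000.0):
    """Time-decaying SOM learning rate of eq. (1)."""
    return eta0 * np.exp(-n / tau1)

def som_neighborhood(j, i, n, sigma0=5.0, tau2=1000.0):
    """Gaussian neighborhood h_{j,i}(n) of eq. (2) on a 1D lattice.

    The width sigma(n) = sigma0 * exp(-n / tau2) shrinks with time, so the
    neighborhood eventually contains little more than the winner itself.
    """
    sigma = sigma0 * np.exp(-n / tau2)
    return np.exp(-float(j - i) ** 2 / (2.0 * sigma ** 2))
```

In contrast, the TASOM updates described below replace these global, time-only schedules with per-neuron, input-driven updates.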
2 THE PROPOSED IMAGE SEGMENTATION ALGORITHM
The proposed segmentation algorithm consists of two steps. In the first step, GTASOM is trained by the gray levels of the given image. In the second step, a peak finding process completes the segmentation using the neurons of GTASOM along with their winning frequencies. The output of the segmentation algorithm will be a segmented image composed of several distinct regions. The two steps are explained in detail in this section.
2.1 GTASOM: The Growing TASOM
The TASOM networks presented in this paper are able to find the number of regions in images. At the same time, similar to the automatic multilevel thresholding algorithms, they segment the image based on its gray levels. For this purpose, we design a growing neural network that begins its work with a small number of neurons. As the training goes on, the number of neurons may increase. The training data are the gray levels of the image to be segmented. The gray levels are used to train one-dimensional weight vectors of the neural network.
The proposed GTASOM network consists of two layers: an input layer with one node and an output layer with a variable but growing number of neurons. The input layer receives the gray levels of the image one by one in an arbitrary order and, at each time step, distributes the received gray level to all the neurons in the output layer. The neuron nearest to the current input value is selected as the winning neuron. The scaling value is the first parameter to be updated, and the width of the neighborhood function of the winning neuron is the second. In addition, the learning rates of all neurons are updated; this kind of updating is unique to the TASOM networks. After updating the learning parameters, the weights of all the current neurons are modified. At this point, it is decided whether or not additional neurons should be added to the present network. If no additional neuron is needed, the learning stops; otherwise, training with the image points is repeated. The complete algorithm of the proposed GTASOM is specified in the following ten steps:

1. Initialization. The output layer encompasses the neurons in a one-dimensional linear array called the lattice. At the beginning, three neurons are employed in the lattice, so $NON = 3$. The algorithm begins with $n = 0$. Choose some random values for the initial weights $w_j(0)$, where $j = 1, 2, 3$. The learning-rate parameters $\eta_j(0)$ should be initialized with values close to one. The neighborhood width parameters $\sigma_j(0)$ should be initialized with values large enough to include all the neurons in the lattice. The constant parameters $\alpha$, $\beta$, $\alpha_s$, and $\beta_s$ can have any positive value less than or equal to one. The constant parameters $s_f$ and $s_g$ should be chosen based on the application characteristics. The maximum distance between two adjacent neurons is controlled by the parameter $\theta_w$; the effect of $\theta_w$ on the performance of GTASOM is discussed in the next section. The parameters $E(0)$ and $E2(0)$ may be initialized with some small random values. The neighboring neurons of any neuron $j$ in the lattice constitute the set $NH_j$. In this paper, $NH_1 = \{2\}$, $NH_j = \{j-1, j+1\}$ for $1 < j < NON$, and $NH_{NON} = \{NON-1\}$.

2. Sampling. Get the next gray-level point $x$ of the image under consideration in an arbitrary order.

3. Similarity matching. Find the best-matching or winning neuron $i(x)$ at time $n$, using the minimum-distance Euclidean norm as the matching measure:
$$i(x) = \arg\min_j |x(n) - w_j(n)|, \quad j = 1, 2, \ldots, NON, \qquad (3)$$
where $|\cdot|$ denotes the absolute value.

4. Updating the scaling value. Adjust the scaling vector (in fact, a scalar) $sl(n+1)$ with the following equations:
$$sl(n+1) = \sqrt{s(n+1)}, \qquad (4)$$
where
$$s(n+1) = \left(E2(n+1) - E(n+1)^2\right)^{+}, \qquad (5)$$
$$E2(n+1) = E2(n) + \beta_s\left(x^2(n) - E2(n)\right), \qquad (6)$$
and
$$E(n+1) = E(n) + \alpha_s\left(x(n) - E(n)\right). \qquad (7)$$
The function $(z)^{+} = z$ if $z > 0$; otherwise, it is zero. Here, the selected values for the parameters $\alpha_s$ and $\beta_s$ are equal to 0.1.

5. Updating the neighborhood widths. Adjust the neighborhood width $\sigma_i(n)$ of the winning neuron $i(x)$ ($i$, for short) by the following equation:
$$\sigma_i(n+1) = \sigma_i(n) + \alpha\left(g\!\left(\frac{1}{s_g \cdot sl(n+1) \cdot \#(NH_i)} \sum_{j \in NH_i} |w_i(n) - w_j(n)|\right) - \sigma_i(n)\right), \qquad (8)$$
where $\alpha$ is a constant parameter between zero and one which controls how fast the neighborhood widths should follow the normalized neighborhood errors, and the function $\#(\cdot)$ gives the cardinality of a set. The neighborhood widths of the other neurons do not change. Moreover, $g(0) = 0$, $0 \le g(z) < NON$ for one-dimensional lattices of $NON$ neurons, and $dg(z)/dz \ge 0$ for positive values of $z$. In this paper, we use $g(z) = (NON - 1)\frac{z}{1+z}$. This function is used for normalization of the weight distances. Here, the selected value for the parameter $\alpha$ is equal to 0.1.

6. Updating the learning-rate parameters. Adjust the learning-rate parameters $\eta_j(n)$ of all neurons in the network by
$$\eta_j(n+1) = \eta_j(n) + \beta\left(f\!\left(\frac{|x(n) - w_j(n)|}{s_f \cdot sl(n+1)}\right) - \eta_j(n)\right) \quad \text{for all } j, \qquad (9)$$
where $\beta$ is a constant parameter between zero and one which controls how fast the learning rates should follow the normalized learning-rate errors. For the function $f(\cdot)$, we should have $f(0) = 0$, $0 \le f(z) \le 1$, and $df(z)/dz \ge 0$ for positive values of $z$. The function $f(z) = \frac{z}{1+z}$ is used in this paper. This function is used for normalization of the distance between the weight and input vectors. Here, the selected value for the parameter $\beta$ is equal to 0.1.

7. Updating the synaptic weights. Adjust the synaptic weight vectors $w_j(n)$ of all output neurons in the network using the following update rule:
$$w_j(n+1) = w_j(n) + \eta_j(n+1)\, h_{j,i(x)}(n+1)\left(x(n) - w_j(n)\right) \quad \text{for all } j, \qquad (10)$$
where $\eta_j(n+1)$ is the learning-rate parameter and $h_{j,i(x)}(n+1)$ is the Gaussian neighborhood function
$$h_{j,i(x)}(n+1) = \exp\!\left(-\frac{(j - i(x))^2}{2\sigma_{i(x)}^2(n+1)}\right),$$
which is centered at the winning neuron $i(x)$.

8. Continuation. Continue with Step 2 until all the image points have been presented to the network.

9. Growing. Calculate the following distance for every pair of adjacent neurons $j$ and $j+1$:
$$dw_j = |w_j - w_{j+1}|. \qquad (11)$$
If $dw_j > \theta_w$, a neuron $k$ is added between neurons $j$ and $j+1$ with the following parameters:
$$w_k = \frac{w_j + w_{j+1}}{2}, \quad \eta_k = \frac{\eta_j + \eta_{j+1}}{2}, \quad \sigma_k = \frac{\sigma_j + \sigma_{j+1}}{2}. \qquad (12)$$

10. Stopping. The learning stops when no new neuron is added in Step 9. Otherwise, go to Step 2.
The parameters $\alpha$ and $\beta$ in (8) and (9) are constant values between zero and one, determining the degree of updating for the neighborhood widths and the learning rates, respectively. These parameters are not crucial to the success or failure of the GTASOM algorithm. For example, if $\beta$ is very small for an application, the normalized learning-rate error remains high for a longer time and compensates for the small value; a similar effect happens when $\beta$ is very large. The same is true for $\alpha$. The constant parameters $s_f$ and $s_g$ are the slopes of the functions $f(\cdot)$ and $g(\cdot)$, respectively. The determination of these parameters depends on the application for which the TASOM algorithm is employed. Here, they are determined experimentally using the gray levels of one or more images available for the parameter tuning of GTASOM. In this paper, $s_f = 100$ and $s_g = 10{,}000$.

If we expand (7) over time and assume $E(0) = 0$, we get
$$E(n+1) = \alpha_s \sum_{k=0}^{n} (1 - \alpha_s)^k\, x(n-k). \qquad (13)$$
This is a weighted summation of the $x(k)$ values over time $n$ that approximates the mean of the $x(k)$'s, with more emphasis on recent values. The degree of emphasis on recent values is controlled by $\alpha_s$: it increases as $\alpha_s$ approaches one and decreases as $\alpha_s$ approaches zero. We know that $0 < \alpha_s < 1$ and $0 \le x(k) \le MG$ for all $k$, where $MG$ is the maximum possible gray level. Thus,
$$E(n+1) \le MG\left(1 - (1 - \alpha_s)^{n+1}\right),$$
which means that $E(n+1)$ is always bounded and never goes to infinity as time passes. A similar discussion applies to (6), in which the parameter $\beta_s$ plays the role of $\alpha_s$. Further discussion of the TASOM and its differences from the SOM can be found in [26].
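This bound can be checked numerically with the worst-case input $x(k) = MG$ for all $k$ (a standalone sketch; $E(0) = 0$ is assumed, as in the derivation of (13)):

```python
import numpy as np

a_s, MG = 0.1, 255.0
x = np.full(200, MG)                 # worst case: every input at the maximum
E = 0.0
for n, xn in enumerate(x):
    E += a_s * (xn - E)              # eq. (7)
    bound = MG * (1.0 - (1.0 - a_s) ** (n + 1))
    assert E <= bound + 1e-9         # the bound holds (with equality here)
print(E)                             # approaches MG = 255 but never exceeds it
```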
2.2 The Peak Finding Process
At this stage, it is assumed that the weights of the GTASOM have converged. Due to the topological ordering of the SOM family of networks, we expect the one-dimensional (scalar) weight vectors of the neurons to be ordered in an increasing or decreasing order. Specifically, if $w_j$ represents the weight of neuron $j$ in the linear array of the network output layer (lattice), then the weights are said to be topologically ordered if
$$w_1 \le w_2 \le \ldots \le w_{NON} \quad \text{or} \quad w_1 \ge w_2 \ge \ldots \ge w_{NON}, \qquad (14)$$
where $NON$ represents the total number of neurons in the output layer at the end of the training phase. Without loss of generality, we can assume that the weights are ordered in increasing order.

For each neuron $j$, a winning frequency $fwin(j)$ is calculated. This value represents the number of times the neuron wins the competition during one complete pass over the gray-level image; the winning frequencies should be calculated at the end of the training phase of the network, or during the last complete pass of the training over the image. The pairs $(w_j, fwin(j))$ approximate the pairs $(k, h(k))$ of the histogram of the gray-level image, where $j = 1, 2, \ldots, NON$ and $k = 1, 2, \ldots, G$. In this approximation, the maximum number of gray levels of the image, $G$, is reduced to $NON$ gray levels. The problem of finding the prominent peaks of the image histogram is thus resolved through the $(w_j, fwin(j))$ pairs. This substantially increases the noise tolerance of the image segmentation algorithm: since SOM-based networks can satisfactorily approximate their input space distributions and since the neighborhood functions employed in them have a smoothing property, they are less susceptible to noise and outliers. Working on the original image histogram is almost infeasible: there are numerous small and large peaks and valleys in the histograms of real images, so finding the correct number of regions and separating them using the original histograms cannot be done reliably. A process is needed to find the peaks of the regions, their left limits, and their right limits. We present one such process in the following.

First, we find the neurons that are considered peaks. A neuron $j$ is a peak if its left neighbor $j-1$ and its right neighbor $j+1$ both have smaller winning frequencies than neuron $j$; in other words,
$$fwin(j) > fwin(j-1) \quad \text{and} \quad fwin(j) > fwin(j+1). \qquad (15)$$
Also, a neuron $j$ is a peak if
$$fwin(j) > fwin(j-1) \quad \text{and} \quad fwin(j) = fwin(j+1), \qquad (16)$$
or
$$fwin(j) = fwin(j-1) \quad \text{and} \quad fwin(j) > fwin(j+1). \qquad (17)$$
It should be noted that, if two peak neurons have the same winning frequency and there is no other peak neuron between them, we keep one of them and eliminate the other. After finding the peak neurons of the regions, another process is needed to determine the left and right limits of the regions. Each region has one and only one peak neuron. In other words, the number of peak neurons is equivalent to the number of regions. We explain how to find the left limit of a peak neuron in the following lines. The right limit of a peak neuron is similarly found. We move from the peak neuron to its left adjacent neuron and continue this left movement until the new left neuron has a winning frequency higher than that of its right neuron. In this case, the weight of the right neuron represents the left limit of the region with the peak neuron under consideration. In fact, this left limit is a valley neuron in the approximated histogram. A region is completely segmented from the gray-level image with its left and right limits. The same algorithm is implemented to segment all other regions of the image represented by the peak neurons. It should be noted that no assumption about the shape of the regions is made in the peak finding and region segmentation process. So, unlike some image segmentation algorithms, the regions in the proposed algorithm may have different shapes and distributions.
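The following Python sketch (numpy assumed) implements one possible reading of this process; the treatment of the lattice endpoints and any tie-breaking beyond what the text specifies are our own choices:

```python
import numpy as np

def find_regions(w, fwin):
    """Peaks and limits from the (w_j, fwin(j)) pairs.

    Implements conditions (15)-(17), the equal-frequency peak elimination,
    and the valley search described above. Returns a list of
    (left_limit, peak_weight, right_limit) triples, one per region.
    """
    fw = np.asarray(fwin, dtype=float)
    non = len(fw)
    # (15)-(17): strictly higher on one side, not lower on the other.
    peaks = [j for j in range(1, non - 1)
             if (fw[j] > fw[j - 1] and fw[j] >= fw[j + 1])
             or (fw[j] >= fw[j - 1] and fw[j] > fw[j + 1])]
    # Equal-frequency peaks with no other peak between them: keep one.
    kept = []
    for j in peaks:
        if kept and fw[j] == fw[kept[-1]]:
            continue
        kept.append(j)
    regions = []
    for p in kept:
        j = p                 # walk left while the frequency keeps falling
        while j > 0 and fw[j - 1] <= fw[j]:
            j -= 1
        k = p                 # walk right, symmetrically
        while k < non - 1 and fw[k + 1] <= fw[k]:
            k += 1
        regions.append((w[j], w[p], w[k]))
    return regions
```

Each pixel is then assigned to the region whose left-right interval contains its gray level, for example by replacing it with the region's peak weight to produce the segmented image.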
3 EXPERIMENTAL RESULTS
To test the performance of GTASOM for gray-level image segmentation, several gray-level images, synthetic and natural, are employed.
3.1 Experiments with Synthetic Images
Images used in these experiments are all gray-level images with levels between 0 and 255. The first image used for segmentation is shown in Fig. 1a. As seen in the image histogram shown in Fig. 1b, it is a five-region synthetic image composed of five superimposed squares, where each square has only one gray level. The segmented images obtained by the iterative threshold selection [13], [14], Papamarkos's [19], Kapur's [1], Khotanzad's [12], and GTASOM methods are shown in Figs. 1c, 1d, 1e, 1f, and 1g, respectively. The number of misclassified pixels of a segmented image divided by the total number of pixels of the image produces the total average misclassification error. For the above image, this error for the iterative threshold selection, Papamarkos's, Kapur's, Khotanzad's, and GTASOM methods is 0.0067, 0.0479, 0.6078, 0.0000, and 0.0000, respectively. Clearly, Khotanzad's method and GTASOM produce the best performance based on the errors and the segmented images. The histogram accuracy of GTASOM is controlled by the parameter $\theta_w$, which is chosen equal to 5. With this value, the histogram is recognized to contain five major peaks, which is correct.

Another synthetic image is shown in Fig. 2a. This image is composed of 32 regions, where each region has been synthesized to have a Gaussian distribution, as shown in the image histogram depicted in Fig. 2b. The iterative threshold selection, Papamarkos's, and Khotanzad's methods, along with GTASOM with $\theta_w = 5$, are used for segmentation of Fig. 2a. The histograms of the segmented images are shown in Figs. 2c, 2d, 2e, and 2f, respectively. The segmented image produced by GTASOM is also shown in Fig. 2g. Since the same variance is used for the Gaussian distributions of the regions, the histogram of the segmented image should ideally be flat. The histogram of the GTASOM method is the closest to a flat distribution.
Fig. 1. (a) The original image composed of five squares, where each square is represented by one gray level. (b) The histogram of Fig. 1a. (c), (d), (e), (f), and (g) The segmented images obtained by the iterative threshold selection, Papamarkos's, Kapur's, Khotanzad's, and GTASOM methods, respectively.
Fig. 2. (a) The original image composed of 32 regions, where each region is represented by a Gaussian distribution. (b) The histogram of Fig. 2a. (c), (d), (e), and (f) The histograms of the segmented images of Fig. 2a using the iterative threshold selection, Papamarkos's, Khotanzad's, and GTASOM methods, respectively. (g) The segmented image obtained by GTASOM.
The number of misclassified pixels of a segmented image divided by the number of regions produces the average misclassification error for each region. This error is 20.6690 for the iterative threshold selection, 27.6034 for Papamarkos's, 1.9769 for Khotanzad's, and 0.7474 for the GTASOM method. Again, the lowest error belongs to GTASOM. The original test image occupies the gray-level space uniformly, and the histogram bins of the segmented image produced by GTASOM are more uniformly distributed in the gray-level space than those of the other implemented methods. It should be noted that, when the number of thresholds is high, Kapur's method is extremely time-consuming; as a result, it was not used in this experiment.
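For concreteness, the two error measures used in this section can be computed as follows (a sketch assuming ground-truth and segmented label images of the same shape; the function names are ours):

```python
import numpy as np

def total_average_error(truth, segmented):
    """Misclassified pixels divided by the total pixel count (Fig. 1 test)."""
    return float(np.mean(truth != segmented))

def per_region_error(truth, segmented, num_regions):
    """Misclassified pixels divided by the number of regions (Fig. 2 test)."""
    return float(np.sum(truth != segmented)) / num_regions
```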
3.2 Experiments with Real Images
Four test images are used in this section, as shown in the first row of Fig. 3. The first two are the Lena image and the tree image, each with 128 × 128 pixels. The last two are the goldhill image and the fruits image [27], each with 512 × 512 pixels. The segmented images produced by the iterative threshold selection, Papamarkos's, Kapur's, Khotanzad's, and GTASOM methods are shown in the second through last rows of Fig. 3, respectively. The GTASOM method with $\theta_w = 15$ recognizes that the Lena image and the tree image are composed of five and three regions, respectively, while the goldhill image with $\theta_w = 25$ and the fruits image with $\theta_w = 20$ are both recognized by GTASOM to have three regions. The other implemented thresholding methods use these numbers of regions for thresholding the images. For the Lena image, the results are almost similar across methods. For the tree image, the GTASOM and Khotanzad's methods perform better than the other methods. For the goldhill and fruits images, the best performance belongs to GTASOM, followed by the iterative threshold selection and Papamarkos's methods; Khotanzad's method does not produce satisfactory results for these two images.
3.3 The Key Parameter in Histogram Approximation by GTASOM
The optimum number of thresholds is known only for some synthetic images. In most images, where there are many small and large histogram peaks, this number is unknown. The distinction between an important and an unimportant peak is vague, and the optimum number of thresholds usually depends on the application.
Fig. 3. The first row shows the original test images: the Lena image, which is segmented into five regions, and the tree, goldhill, and fruits images, each of which is segmented into three regions. The second through last rows show the segmented images obtained by the iterative threshold selection, Papamarkos's, Kapur's, Khotanzad's, and GTASOM methods, respectively.
For example, we observed that the number of thresholds obtained by the method suggested in [10] depends on the maximum allowable threshold, and the performance of the plane curve approach [6] depends on how much the histogram is smoothed. In GTASOM, the number of thresholds is implicitly determined by the key parameter $\theta_w$, which sets the degree of accuracy in approximating the image histogram. Owing to the properties of SOM-based networks, large histogram peaks attract more GTASOM neurons than small peaks and, thus, are always included in the approximated histogram. Choosing small values for $\theta_w$ enforces an accurate and detailed approximation of the histogram, while large values of $\theta_w$ produce rough approximations. Any small peak next to a large one in the histogram is considered part of the large peak if the distance between the two peaks is less than $\theta_w$; in other words, small and narrow histogram peaks near large ones are filtered out and treated as noise. Similarly, small adjacent peaks may be grouped as one segment in the histogram if the distance between them is less than $\theta_w$.

The effect of the parameter $\theta_w$ on the number of gray levels in the segmented image is shown in Fig. 4. When $\theta_w$ is 15 for the Lena image, the number of quantized gray levels is 5, and the segmented image is shown in the last row of Fig. 3. When the parameter $\theta_w$ decreases, the number of gray levels increases. Figs. 4a and 4b show the segmented Lena image with 56 recognized regions when $\theta_w = 2$ and 20 regions when $\theta_w = 5$. The histogram of the original Lena image and the weight-winning-frequency pairs for $\theta_w = 2, 5, 15$ are shown in Figs. 4c, 4d, 4e, and 4f, respectively. These figures clearly show the effect of the parameter $\theta_w$ on the accuracy of GTASOM in the histogram approximation: as $\theta_w$ increases, the GTASOM algorithm ignores the smaller peaks of the histogram.

The size of the image affects the time required for each complete pass of GTASOM over the image. In other words, the number of pixels of the test image used for training GTASOM linearly controls the time complexity of GTASOM. This number can be reduced by using a subsampling technique [20]. The number of complete passes required for the GTASOM algorithm to converge depends on the complexity of the test image and on the parameter $\theta_w$: the more accuracy we need, the more complete passes are required. Assume that the Lena image is segmented into 5, 20, and 56 regions without using any subsampling technique. The time required for such segmentation by GTASOM is 15, 34, and 167 seconds, respectively; Papamarkos's method needs 18, 56, and 440 seconds, respectively. The iterative threshold selection and Khotanzad's methods are fast and segment the Lena image in less than one second. Kapur's method needs hours to segment the Lena image, since it uses an exhaustive search to find the optimum thresholds, and it is impractical when the number of thresholds is high.
Fig. 4. The effect of the parameter $\theta_w$ of the GTASOM algorithm on the number of gray levels of the segmented image. (a) and (b) The segmented images of the original Lena image obtained by GTASOM for $\theta_w = 2$ and $\theta_w = 5$, which contain 56 and 20 gray levels, respectively. For $\theta_w = 15$, the segmented image is shown in the last row of Fig. 3 and contains five gray levels. (c) The histogram of the original Lena image. (d), (e), and (f) The approximated histograms of the original Lena image using GTASOM for $\theta_w = 2$, $\theta_w = 5$, and $\theta_w = 15$, respectively.
4 CONCLUSIONS
In this paper, a growing TASOM network called "GTASOM" was introduced. The converged weights of the neurons of the proposed network are used by a peak finding process for image segmentation. The GTASOM was tested for image segmentation with synthetic and real images. Experiments revealed that GTASOM, along with the peak finding process, performs better than or as well as other thresholding methods such as the iterative threshold selection, Papamarkos's, Kapur's, and Khotanzad's methods.
REFERENCES
[1] J.N. Kapur, P.K. Sahoo, and A.K. Wong, "A New Method for Gray-Level Picture Thresholding Using the Entropy of the Histogram," Computer Vision, Graphics, and Image Processing, vol. 29, pp. 273-285, 1985.
[2] J. Kittler and J. Illingworth, "Minimum Error Thresholding," Pattern Recognition, vol. 19, pp. 41-47, 1986.
[3] N. Otsu, "A Threshold Selection Method from Grey-Level Histograms," IEEE Trans. Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 377-393, 1979.
[4] M.I. Sezan, "A Peak Detection Algorithm and Its Applications to Histogram-Based Image Data Reduction," Computer Vision, Graphics, and Image Processing, vol. 49, pp. 36-51, 1990.
[5] A.S. Abutaleb, "Automatic Thresholding of Gray-Level Pictures Using Two-Dimensional Entropy," Computer Vision, Graphics, and Image Processing, vol. 47, pp. 22-32, 1989.
[6] S. Boukharouba, J.M. Rebordao, and P.L. Wendel, "An Amplitude Segmentation Method Based on the Distribution Function of an Image," Computer Vision, Graphics, and Image Processing, vol. 29, pp. 47-59, 1985.
[7] S. Wang and R.M. Haralick, "Automatic Multithreshold Selection," Computer Vision, Graphics, and Image Processing, vol. 25, pp. 46-67, 1984.
[8] N. Papamarkos and B. Gatos, "A New Approach for Multithreshold Selection," CVGIP: Graphical Models and Image Processing, vol. 56, no. 5, pp. 357-370, 1994.
[9] J.-S. Chang et al., "New Automatic Multi-Level Thresholding Technique for Segmentation of Thermal Images," Image and Vision Computing, vol. 15, pp. 23-34, 1997.
[10] N. Papamarkos, C. Strouthopoulos, and I. Andreadis, "Multithresholding of Color and Gray-Level Images Through a Neural Network Technique," Image and Vision Computing, vol. 18, pp. 213-222, 2000.
[11] V.-P. Zhang and M.M. Desai, "Wavelet Based Automatic Thresholding for Image Segmentation," Proc. Int'l Conf. Image Processing, pp. 224-227, 1997.
[12] A. Khotanzad and A. Bouarfa, "Image Segmentation by a Parallel Non-Parametric Histogram Based Clustering Algorithm," Pattern Recognition, vol. 23, no. 9, pp. 961-973, 1990.
[13] S.S. Reddi, S.F. Rudin, and H.R. Keshavan, "An Optimal Multiple Threshold Scheme for Image Segmentation," IEEE Trans. Systems, Man, and Cybernetics, vol. 14, no. 4, pp. 661-665, 1984.
[14] T.W. Ridler and S. Calvard, "Picture Thresholding Using an Iterative Selection Method," IEEE Trans. Systems, Man, and Cybernetics, vol. 8, no. 8, pp. 630-632, 1978.
[15] R.J.K. Whatmough, "Automatic Threshold Selection from a Histogram Using the Exponential Hull," Computer Vision, Graphics, and Image Processing, vol. 53, pp. 592-600, 1991.
[16] C.C. Chang, C.H. Chang, and S.Y. Hwang, "A Connectionist Approach for Thresholding," Proc. 11th Int'l Conf. Pattern Recognition, pp. 522-525, 1992.
[17] A.H. Dekker, "Kohonen Neural Networks for Optimal Color Quantization," Network: Computation in Neural Systems, vol. 5, pp. 351-367, 1994.
[18] H. Atmaca, M. Bullut, and D. Demir, "Histogram Based Fuzzy Kohonen Clustering Network for Image Segmentation," Proc. IEEE Int'l Conf. Image Processing, pp. 951-954, 1996.
[19] N. Papamarkos, "Color Reduction Using Local Features and a SOFM Neural Network," Int'l J. Imaging Systems and Technology, vol. 10, no. 5, pp. 404-409, 1999.
[20] N. Papamarkos and A. Atsalakis, "Gray-Level Reduction Using Local Spatial Features," Computer Vision and Image Understanding, vol. 78, pp. 336-350, 2000.
[21] H. Shah-Hosseini and R. Safabakhsh, "Automatic Adjustment of Learning Rates of the Self-Organizing Feature Map," Scientia Iranica, vol. 8, no. 4, pp. 277-286, 2001.
[22] H. Shah-Hosseini and R. Safabakhsh, "TASOM: The Time Adaptive Self-Organizing Map," Proc. Int'l Conf. Information Technology: Coding and Computing, pp. 422-427, 2000.
[23] H. Shah-Hosseini and R. Safabakhsh, "The Time Adaptive Self-Organizing Map with Neighborhood Functions for Bilevel Thresholding," Proc. AI, Simulation, and Planning in High Autonomy Systems Conf., pp. 123-128, 2000.
[24] H. Shah-Hosseini and R. Safabakhsh, "Pattern Classification by the Time Adaptive Self-Organizing Map," Proc. Int'l Conf. Electronics, Circuits, and Systems, pp. 495-498, 2000.
[25] H. Shah-Hosseini and R. Safabakhsh, "TAPCA: Time Adaptive Self-Organizing Maps for Adaptive Principal Components Analysis," Proc. IEEE Int'l Conf. Image Processing, pp. 509-512, 2001.
[26] H. Shah-Hosseini and R. Safabakhsh, "The Time Adaptive Self-Organizing Map for Distribution Estimation," Int'l J. Eng., vol. 15, no. 1, pp. 23-34, 2002.
[27] The standard test images, http://jiu.sourceforge.net/testimages/index.html, 2002.