Texture Segmentation by Frequency-Sensitive Elliptical Competitive Learning

Steve De Backer, Paul Scheunders

Vision Lab, Department of Physics, University of Antwerp, Groenenborgerlaan 171, 2020 Antwerpen, Belgium
Abstract

In this paper, a new learning algorithm is proposed with the purpose of texture segmentation. The algorithm is a competitive clustering scheme with two specific features: elliptical clustering is accomplished by incorporating the Mahalanobis distance measure into the learning rules, and underutilization of smaller clusters is avoided by incorporating a frequency-sensitive term. In the paper, an efficient learning rule that incorporates these features is elaborated. In the experimental section, several experiments demonstrate the usefulness of the proposed technique for the segmentation of textured images. On compositions of textured images, Gabor filters were applied to generate texture features. The segmentation performance is compared to k-means clustering with and without the use of the Mahalanobis distance and to the ordinary competitive learning scheme. It is demonstrated that the proposed algorithm outperforms the others. A fuzzy version of the technique is introduced, and experimentally compared with fuzzy versions of the k-means and competitive clustering algorithms. The same conclusions as for the hard clustering case hold.

Key words: unsupervised classification, competitive learning, mixture modelling, texture segmentation
1 Introduction
Clustering is used for partitioning a collection of data into subgroups. Over the years, a large number of clustering methods have been proposed [1–4]. A frequently used approach to clustering is the classification maximum likelihood technique [5–8].
[email protected] (Steve De Backer). URL: http://www.ruca.ua.ac.be/visielab (Steve De Backer).
Preprint submitted to Elsevier Preprint
18 September 2001
Here, a mixture of distributions is assumed to describe the cluster shapes. In general, the well-known expectation-maximization (EM) technique [9,10] is used to maximize the likelihood function. A simple model assumes spherical clusters of equal size, which limits the set of parameters to the cluster positions; the applied distance measure is then the Euclidean one. In most practical applications, this simplification is (wittingly or unwittingly) applied and leads to the k-means clustering algorithm [11]. A sequential approach is known by the name of Competitive Learning [12].

In practical classification problems, however, the assumption of spherical clusters is not justified. To mention one example, clusters appearing in the feature space of textured images can be far from spherical. In these cases, a more general cluster model should be applied, such as a multivariate normal distribution. This, however, leads to several practical problems. First of all, most practical situations involve high-dimensional feature spaces, which implies that a large number of parameters needs to be estimated and hence that a large number of data points needs to be available; consequently, the algorithm becomes very time-consuming. Moreover, the covariance matrices involved can become near-singular, leading to numerical problems when inverting them. The EM algorithm updates the parameters globally (i.e. after presentation of all data points during one iteration), which makes the algorithm slow. Speedup can be achieved by using incremental variants, which result in faster convergence [13,14]. To our knowledge, few studies have investigated the problem of the singularities. Some have tried to solve it by appropriate initialization [15]. Recently, a renormalization approach was studied, in which the singularities were removed by combining elliptical and spherical distance measures [16].

In this paper, a different approach is proposed, making use of Competitive Learning. In this approach, the parameters are updated each time a data point is presented. A direct update rule for the inverse of the covariance matrices is introduced, which makes the algorithm much faster than EM. Initially, the algorithm starts with spherical clusters, but during training the cluster shapes are adapted iteratively to become ellipsoidal. In this way, no singularities can occur. An extra difficulty introduced by competitive learning is the initial overassignment of the larger clusters. This problem is solved by incorporating a frequency-sensitive term. Hard as well as fuzzy clustering versions exist of the k-means [17–19] and competitive learning [20] algorithms. By simple extension, a fuzzy version of the elliptical competitive learning technique will be introduced.
The problem of texture segmentation has been studied for many years [21]. We choose to apply the proposed algorithm to texture segmentation for the following reasons: on the one hand, many features are needed for a proper texture characterization; on the other hand, many data points need to be processed. It is also recognized that the clusters obtained in feature space can have highly non-spherical shapes and can differ widely in size. A frequently used class of texture features comes from multiresolution decompositions of texture [22]. In particular, Gabor filters [23] and the wavelet transform [24] are regularly employed to perform texture segmentation. In this paper we choose to employ Gabor filters for texture feature extraction.

The paper is organized as follows. In the next section, the frequency-sensitive elliptical learning algorithm is elaborated. In section 3 the fuzzy version of the technique is introduced. In section 4, the problem of texture segmentation is addressed and the issues of feature extraction and dimensionality reduction are elaborated. In section 5, the experimental results are presented and discussed.
2 Frequency-Sensitive Elliptical Competitive Learning
2.1 Clustering
One way of clustering a d-dimensional data space into C clusters is to model the space as a mixture of normal distributions. The negative logarithm of the likelihood function is then used as a cost function; minimizing this function with respect to the model parameters leads to a partitioning of the space. A very common approximation is to assume spherically shaped clusters. Then, the cost function becomes:

E = \sum_{k=1}^{C} \sum_{i \in S_k} D_E(\vec{x}_i, \vec{v}_k)    (1)

where C is the total number of clusters, \vec{v}_k are the cluster representatives and

S_k = \{ i \,;\; k = \arg\min_j D_E(\vec{x}_i, \vec{v}_j) \}    (2)

D_E stands for the Euclidean distance measure: D_E(\vec{x}_i, \vec{x}_j) = \|\vec{x}_i - \vec{x}_j\|^2. Here it is implicitly assumed that all clusters are equally sized (having the same volume) and equally sampled (having the same number of data points).
Minimization of E is performed in two steps. Optimization with respect to the partitioning is done by:

i \in S_k \iff k = \arg\min_j D_E(\vec{x}_i, \vec{v}_j)    (3)

Minimization with respect to the cluster positions is obtained by:

\frac{\partial E}{\partial \vec{v}_k} = \frac{\partial}{\partial \vec{v}_k} \sum_{i \in S_k} D_E(\vec{x}_i, \vec{v}_k) = 0    (4)

leading to:

\vec{v}_k = \frac{\sum_{i \in S_k} \vec{x}_i}{|S_k|}    (5)

where |S_k| is the total number of data points that belong to cluster k.

A practical way of clustering using the Euclidean distance measure is the k-means clustering algorithm (KM). Initially, the C cluster representatives are positioned randomly in the data space. Then, cluster assignment (3) is performed globally by assigning each data point to one cluster, and cluster updating is performed globally by updating each cluster position using (5). Cluster assignment and cluster updating are alternated over several iterations until convergence (i.e. until the difference in E for two successive iterations is sufficiently small).

A sequential way of performing clustering is found by an on-line gradient descent optimization of (1). At each iteration one data point is presented. The data point is assigned to a cluster by (3). The representative of that cluster is then updated (at iteration t + 1) as:

\vec{v}_k(t+1) = \vec{v}_k(t) + \alpha(t)\,(\vec{x}_i - \vec{v}_k(t))    (6)

where \alpha(t) is a learning parameter that decreases with t. This procedure is known as Competitive Learning (CL) [25].

Several reasons exist to prefer the sequential approach over the batch one. One iteration step of KM takes CPU time proportional to the total number of data points; applying CL during one 'epoch' (i.e. as many presentations as there are data points) takes about the same time. Due to the smaller steps taken in the adaptation process, CL quickly reaches a nearby optimum, while KM has more possibilities to get stuck in well-separated local optima [26]. Moreover, when a stream of continuously changing data is involved, CL is useful as an adaptive approach. A possible disadvantage is that the learning parameter should be adapted properly to guarantee efficient learning.
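To make the difference between the batch update (5) and the on-line update (6) concrete, the following Python/NumPy sketch runs one KM iteration and one CL epoch on toy data. The data, the number of clusters and the learning-rate schedule are illustrative assumptions, not settings used in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))          # toy data (not the Gabor features of this paper)
C = 3                                   # number of clusters (illustrative)
V_km = X[rng.choice(len(X), C, replace=False)].copy()
V_cl = V_km.copy()

# One batch k-means (KM) iteration: assignment (3) followed by update (5)
labels = np.argmin(((X[:, None, :] - V_km[None, :, :]) ** 2).sum(-1), axis=1)
for k in range(C):
    if np.any(labels == k):
        V_km[k] = X[labels == k].mean(axis=0)             # eq. (5)

# One competitive learning (CL) 'epoch': sequential updates, eq. (6)
alpha0 = 0.05                           # initial learning rate (assumption)
for t, i in enumerate(rng.permutation(len(X))):
    alpha = alpha0 / (1.0 + t / len(X))                   # decreasing alpha(t), one possible schedule
    k = np.argmin(((X[i] - V_cl) ** 2).sum(-1))           # assignment, eq. (3)
    V_cl[k] += alpha * (X[i] - V_cl[k])                   # eq. (6)
```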
2.2 Frequency-Sensitive Clustering
When using CL, instabilities can appear during the first iterations due to the favouring of larger clusters over the others (an effect that obviously does not appear in the case of KM, since all data points are updated simultaneously). This effect, which is also known in the field of vector quantization, can be removed by using a frequency-sensitive approach [27,28]. Here, the distance measure is multiplied by n_k(t), the number of times that cluster k has been modified up to the t-th iteration using (6). In this way distances become larger for regularly updated clusters, so that clusters containing smaller numbers of distant points get the opportunity to be updated as well. Recently, this technique has been demonstrated to be very efficient for clustering in cases where differently sized clusters appear [29]. The technique will be referred to as FSCL.
2.3 Elliptical Clustering
For normally distributed cluster densities in general, the Mahalanobis distance needs to be used:

D_M(\vec{x}_i, \vec{v}_k) = (\vec{x}_i - \vec{v}_k)^T \Sigma_k^{-1} (\vec{x}_i - \vec{v}_k)    (7)

where \Sigma_k is the covariance matrix of cluster k, containing the complete shape description of the cluster. As a consequence, cluster updating requires, along with the calculation of the cluster positions (which can easily be shown to be done by (5)), the calculation and the inversion of the covariance matrices, which then need to be used in the cluster assignment step.

A k-means clustering algorithm using the Mahalanobis distance can then be constructed in the following way. Initially, the C cluster representatives are positioned randomly in the data space and the covariance matrices are chosen to be the identity matrix (this implies the use of the Euclidean distance measure). Then, cluster assignment is performed globally by using (3), with D given by (7). Cluster updating is performed globally by updating every cluster position using (5). The cluster covariance matrices are then calculated by:

\Sigma_k = \frac{1}{|S_k|} \sum_{i \in S_k} (\vec{x}_i - \vec{v}_k)(\vec{x}_i - \vec{v}_k)^T    (8)

These can then be used in the next iteration for the assignment step. This technique will be referred to as elliptical k-means clustering (EKM).

For the sequential approach using the Mahalanobis distance, an update rule for the covariance elements, similar to (6), is needed. We suggest the following updating rule:

\Sigma_k(t+1) = \Sigma_k(t) + \alpha(t)\left[(\vec{x}_i - \vec{v}_k)(\vec{x}_i - \vec{v}_k)^T - \Sigma_k(t)\right] = (1 - \alpha(t))\,\Sigma_k(t) + \alpha(t)\,(\vec{x}_i - \vec{v}_k)(\vec{x}_i - \vec{v}_k)^T    (9)
where \vec{v}_k stands for \vec{v}_k(t+1). This rule is based on the following idea: in the same way as in (6), where the position of the cluster moves towards the data point presented, the covariance 'moves' towards the covariance represented by that data point. This rule, however, is impractical, since using this information in the assignment step would require an inversion of \Sigma after each presentation of a data point. Therefore we need to find an updating rule for \Sigma^{-1} directly. Writing the difference vector (\vec{x}_i - \vec{v}_k) as \vec{m}, we obtain:

\Sigma_k^{-1}(t+1) = \left[(1-\alpha)\Sigma_k(t) + \alpha\,\vec{m}\vec{m}^T\right]^{-1} = \left[I + \frac{\alpha}{1-\alpha}\,\Sigma_k^{-1}(t)\,\vec{m}\vec{m}^T\right]^{-1} \frac{\Sigma_k^{-1}(t)}{1-\alpha}

Writing \beta = \frac{\alpha}{1-\alpha} and expanding the inverse as a series:

\Sigma_k^{-1}(t+1) = \left[I - \beta\,\Sigma_k^{-1}(t)\vec{m}\vec{m}^T + \beta^2\,\Sigma_k^{-1}(t)\vec{m}\vec{m}^T\,\Sigma_k^{-1}(t)\vec{m}\vec{m}^T - \ldots\right] \frac{\Sigma_k^{-1}(t)}{1-\alpha}

Rearranging and writing \vec{m}^T \Sigma_k^{-1}(t)\,\vec{m} as \lambda:

\Sigma_k^{-1}(t+1) = \frac{\Sigma_k^{-1}(t)}{1-\alpha} - \frac{\beta}{1-\alpha}\,\Sigma_k^{-1}(t)\vec{m}\vec{m}^T\Sigma_k^{-1}(t)\left(1 - \beta\lambda + \beta^2\lambda^2 - \ldots\right) = \frac{\Sigma_k^{-1}(t)}{1-\alpha} - \frac{\beta}{1-\alpha}\cdot\frac{\Sigma_k^{-1}(t)\vec{m}\vec{m}^T\Sigma_k^{-1}(t)}{1 + \beta\lambda}

After filling in the expressions for \vec{m}, \beta and \lambda, one obtains:

\Sigma_k^{-1}(t+1) = \frac{\Sigma_k^{-1}(t)}{1-\alpha} - \frac{\alpha}{1-\alpha}\cdot\frac{\Sigma_k^{-1}(t)(\vec{x}_i - \vec{v}_k)(\vec{x}_i - \vec{v}_k)^T\Sigma_k^{-1}(t)}{1 + \alpha\left[(\vec{x}_i - \vec{v}_k)^T \Sigma_k^{-1}(t)(\vec{x}_i - \vec{v}_k) - 1\right]}    (10)

In this way, a direct update of the inverse of the covariance matrix is obtained, which can be used in the assignment step. The elliptical competitive learning technique will be referred to as ECL.
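As a numerical sanity check on the derivation above, the sketch below (NumPy, with arbitrary illustrative values for the dimension, the learning rate and the covariance) applies update (9) followed by an explicit matrix inversion and compares the result with the closed-form inverse update (10); the two agree up to floating-point error.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4
alpha = 0.1                                   # learning rate (illustrative)
A = rng.normal(size=(d, d))
Sigma = A @ A.T + np.eye(d)                   # some positive-definite covariance
Sigma_inv = np.linalg.inv(Sigma)
m = rng.normal(size=d)                        # difference vector (x_i - v_k)

# Update (9) followed by an explicit inversion
Sigma_new = (1 - alpha) * Sigma + alpha * np.outer(m, m)
direct_inv = np.linalg.inv(Sigma_new)

# Direct inverse update, eq. (10)
Sm = Sigma_inv @ m                            # Sigma^{-1}(t) (x_i - v_k)
lam = m @ Sm                                  # the Mahalanobis term lambda
rank1 = np.outer(Sm, Sm)                      # Sigma^{-1} m m^T Sigma^{-1}
updated_inv = Sigma_inv / (1 - alpha) \
    - (alpha / (1 - alpha)) * rank1 / (1 + alpha * (lam - 1))

print(np.allclose(direct_inv, updated_inv))   # expected: True
```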
2.4 The FSECL Algorithm
The complete clustering algorithm, including the frequency-sensitivity and the elliptical clustering, will be referred to as FSECL. To summarize, the algorithm consists of the following steps (a code sketch follows the list):

(1) Initialize by randomly positioning C cluster centers and setting all covariance matrices equal to the identity matrix.
(2) Randomly pick a data point.
(3) Assign the point to the nearest cluster k by using (3), with D coming from (7). The frequency-sensitive approach is applied, i.e. (7) is multiplied by n_k(t), the number of times that the cluster has been updated in the past.
(4) Adapt the position of cluster k by using (6) and increment n_k by one.
(5) Adapt the inverse covariance matrix \Sigma_k^{-1} by using (10).
(6) Go to step 2.
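A minimal NumPy sketch of this loop is given below. The learning-rate schedule, the stopping criterion (a fixed number of epochs) and the initialization of the counters n_k at one are illustrative implementation choices; only the assignment, position update and inverse-covariance update follow equations (3), (6), (7) and (10).

```python
import numpy as np

def fsecl(X, C, epochs=1, alpha0=0.05, seed=0):
    """Frequency-sensitive elliptical competitive learning (sketch)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    V = X[rng.choice(n, C, replace=False)].copy()     # step 1: random cluster centers
    Sinv = np.stack([np.eye(d)] * C)                  # step 1: inverse covariances = identity
    counts = np.ones(C)                               # frequency terms n_k (started at one)
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):                  # step 2: pick a data point
            alpha = alpha0 / (1.0 + t / n)            # decreasing learning rate (assumption)
            diff = X[i] - V
            maha = np.einsum('kd,kde,ke->k', diff, Sinv, diff)   # eq. (7) for all clusters
            k = int(np.argmin(counts * maha))         # step 3: frequency-sensitive assignment
            V[k] += alpha * (X[i] - V[k])             # step 4: eq. (6)
            counts[k] += 1
            m = X[i] - V[k]                           # uses v_k(t+1), as in the derivation of (10)
            Sm = Sinv[k] @ m
            lam = m @ Sm
            Sinv[k] = Sinv[k] / (1 - alpha) \
                - (alpha / (1 - alpha)) * np.outer(Sm, Sm) / (1 + alpha * (lam - 1))  # step 5: eq. (10)
            t += 1
    return V, Sinv

# Illustrative usage on random data (not the texture features of this paper)
X = np.random.default_rng(2).normal(size=(2000, 3))
centers, inv_covs = fsecl(X, C=5)
```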
3 Fuzzy Approach
The previously described clustering algorithms were all hard clustering techniques, i.e. a data point belongs to one and only one cluster. A class of fuzzy clustering techniques has been designed [17] that allows a data point to belong to several clusters, using a membership function [30]. Such a function u_{ik} describes the membership of data point i to cluster k:

u_{ik} \in [0, 1], \qquad \sum_k u_{ik} = 1    (11)
Incorporating this function into the minimization process allows for the data points to delay the decision to which cluster to belong until the end of the iteration process. This can be advantageous when clusters overlap or touch, or when clusters are far apart. In the latter case, the far away clusters have a chance to move to the exemplar regions [19]. In the following section we demonstrate that the fuzzy concept can easily be incorporated in the proposed algorithms. 7
3.1 Fuzzy Clustering
The cost function of (1) extends to:

E_F = \sum_{k=1}^{C} \sum_i D_E(\vec{x}_i, \vec{v}_k)\, u_{ik}^q    (12)

The exponent q controls the degree of fuzziness and is usually taken slightly higher than 1. The sum over i now runs over all data points. Instead of hard partitioning, (3) now extends to an optimization of the membership values [17]:

u_{ik} = \left[ \sum_{l=1}^{C} \left( \frac{D_E(\vec{x}_i, \vec{v}_k)}{D_E(\vec{x}_i, \vec{v}_l)} \right)^{\frac{1}{q-1}} \right]^{-1}    (13)

By minimizing (12) with respect to the cluster positions, (5) is extended towards:

\vec{v}_k = \frac{\sum_i \vec{x}_i\, u_{ik}^q}{\sum_i u_{ik}^q}    (14)

A fuzzy k-means clustering algorithm (FKM, known in the literature as the fuzzy c-means algorithm) is obtained by iteratively updating (13) and (14) until convergence. A gradient descent approach leads to the updating rule (the extension of (6)):

\vec{v}_k(t+1) = \vec{v}_k(t) + \alpha(t)\, u_{ik}^q\, (\vec{x}_i - \vec{v}_k(t))    (15)

With this rule, a Fuzzy Competitive Learning algorithm (FCL) can be obtained: a data point i is presented, u_{ik} is calculated using (13), and all cluster centers having a non-zero membership are then updated using (15) [20].
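The following sketch shows how the membership computation (13) and the fuzzy update (15) can be implemented for a single presented data point; the fuzziness exponent q = 1.5 and the small guard against zero distances are illustrative assumptions.

```python
import numpy as np

def fuzzy_memberships(x, V, q=1.5):
    """Memberships u_{ik} of one data point x to all clusters V, eq. (13)."""
    dist = np.maximum(((x - V) ** 2).sum(axis=1), 1e-12)    # squared Euclidean distances
    ratios = (dist[:, None] / dist[None, :]) ** (1.0 / (q - 1.0))
    return 1.0 / ratios.sum(axis=1)                         # memberships, summing to 1 over k

def fcl_step(x, V, alpha, q=1.5):
    """One Fuzzy Competitive Learning update: all centers with non-zero
    membership move towards x according to eq. (15)."""
    u = fuzzy_memberships(x, V, q)
    return V + alpha * (u[:, None] ** q) * (x - V)

# Illustrative usage
V = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 0.0]])
x = np.array([0.4, 0.2])
V_new = fcl_step(x, V, alpha=0.05)
```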
3.2 Fuzzy Frequency Sensitive Competitive Learning
For the same reasons as in section 2.2, a frequency-sensitive approach can be useful. The distance measure is then multiplied by:

\mu_k(t+1) = \mu_k(t) + u_{ik}^q    (16)

which is the sum of the membership values of cluster k over all previous iterations. It may seem unnecessary to use frequency-sensitive clustering here, since fuzzy clustering already takes care of far-away clusters, but it has been shown [20] that the combination is more powerful.
3.3 Fuzzy Elliptical Clustering
When the Mahalanobis distance of (7) needs to be applied, the cluster covariance matrices are calculated from (12) by (the extension of (8)) [31]:

\Sigma_k = \frac{1}{\sum_i u_{ik}^q} \sum_i u_{ik}^q\, (\vec{x}_i - \vec{v}_k)(\vec{x}_i - \vec{v}_k)^T    (17)

where u_{ik} is calculated using (13). A global clustering algorithm analogous to EKM is then obtained, which will be referred to as the fuzzy elliptical k-means algorithm (FEKM). For the sequential approach, we propose to extend (9) to:

\Sigma_k(t+1) = \Sigma_k(t) + \alpha(t)\, u_{ik}^q \left[(\vec{x}_i - \vec{v}_k)(\vec{x}_i - \vec{v}_k)^T - \Sigma_k(t)\right]    (18)

A direct updating rule for \Sigma_k^{-1} can then be found, similar to (10), with the substitution:

\alpha \rightarrow \alpha\, u_{ik}^q    (19)
3.4 Fuzzy Frequency Sensitive Elliptical Competitive Learning
By combining the techniques of sections 3.2 and 3.3, one obtains a full fuzzy frequency-sensitive elliptical competitive learning algorithm (FFSECL); a code sketch follows the list below.

(1) Initialize by randomly positioning C cluster centers and setting all covariance matrices equal to the identity matrix.
(2) Randomly pick a data point.
(3) Calculate the distances from the point to all cluster centers using (7), multiply them by the frequency terms \mu_k(t) from (16), and from these calculate the memberships using (13).
(4) Adapt the positions of the clusters by using (15).
(5) Adapt the inverse covariance matrices \Sigma_k^{-1} by using (10) with (19).
(6) Update the frequency terms using (16).
(7) Go to step 2.
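A single FFSECL update step can be sketched as follows; the fuzziness exponent, the guard against zero distances and the use of a per-cluster effective learning rate α u_{ik}^q for the inverse-covariance update are assumptions consistent with (13), (15), (16) and (19), not code taken from the paper.

```python
import numpy as np

def ffsecl_step(x, V, Sinv, mu, alpha, q=1.5):
    """One FFSECL update for a single point x (sketch).
    V: (C, d) cluster centers, Sinv: (C, d, d) inverse covariances,
    mu: (C,) accumulated fuzzy frequency terms of eq. (16)."""
    diff = x - V
    maha = np.einsum('kd,kde,ke->k', diff, Sinv, diff)      # Mahalanobis distances, eq. (7)
    dist = np.maximum(mu * maha, 1e-12)                     # frequency-weighted distances
    ratios = (dist[:, None] / dist[None, :]) ** (1.0 / (q - 1.0))
    u = 1.0 / ratios.sum(axis=1)                            # memberships, eq. (13)
    V = V + alpha * (u[:, None] ** q) * (x - V)             # eq. (15)
    for k in range(len(V)):                                 # eq. (10) with alpha -> alpha * u^q, eq. (19)
        a = alpha * u[k] ** q
        m = x - V[k]
        Sm = Sinv[k] @ m
        lam = m @ Sm
        Sinv[k] = Sinv[k] / (1 - a) \
            - (a / (1 - a)) * np.outer(Sm, Sm) / (1 + a * (lam - 1))
    mu = mu + u ** q                                        # eq. (16)
    return V, Sinv, mu
```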
4 Texture Segmentation
In this section, the problem of texture segmentation and the applied methodology are elaborated. The problem is to segment an image into different textural regions; ideally, each pixel is classified as one of a (predefined) number of textures. We describe the problem as a clustering problem in a feature space: each pixel is represented by a feature vector, and by clustering, similar feature vectors are grouped and a unique texture is assigned to each cluster. Three steps are required. First, the features should be calculated; we choose to apply multiresolution-based features, since these are known to give a good texture characterisation. Second, the feature space should be reduced to optimize the segmentation procedure, by selecting or extracting a reduced set of features from the complete set. Finally, the clustering has to be performed.
4.1 Texture Features
From the textures, features were calculated using non-separable Gabor filters, implemented in the Fourier space. From the filtered images, energies were calculated in 8 different orientations, equally distributed between -90 and 90 degrees, and for 8 different frequencies, equally distributed on an octave scale, leading to 64 features.
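The paper does not specify the exact filter parametrization; the sketch below is one possible NumPy implementation of frequency-domain Gabor-like band-pass filtering with 8 orientations and 8 octave-spaced frequencies, in which the maximum frequency f_max and the bandwidth factor are assumptions.

```python
import numpy as np

def gabor_energy_features(img, n_orient=8, n_freq=8, f_max=0.25, bandwidth=0.5):
    """Per-pixel Gabor energy features computed in the Fourier domain (sketch).
    Frequencies are octave-spaced below f_max (cycles/pixel) and orientations
    span 180 degrees; f_max and bandwidth are illustrative assumptions."""
    rows, cols = img.shape
    fy = np.fft.fftfreq(rows)[:, None]
    fx = np.fft.fftfreq(cols)[None, :]
    F = np.fft.fft2(img)
    freqs = f_max * 2.0 ** (-np.arange(n_freq))           # octave scale
    thetas = np.linspace(-np.pi / 2, np.pi / 2, n_orient, endpoint=False)
    feats = []
    for f0 in freqs:
        sigma = bandwidth * f0                            # radial bandwidth proportional to f0
        for th in thetas:
            u0, v0 = f0 * np.cos(th), f0 * np.sin(th)
            # Gaussian band-pass filter centered at (u0, v0) in the frequency plane
            H = np.exp(-((fx - u0) ** 2 + (fy - v0) ** 2) / (2 * sigma ** 2))
            resp = np.fft.ifft2(F * H)
            feats.append(np.abs(resp) ** 2)               # local energy
    return np.stack(feats, axis=-1)                       # (rows, cols, n_orient * n_freq)
```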
4.2 Dimensionality Reduction
In typical texture problems, the number of obtained features is very high. For the classification of textured images, features are calculated over a complete (sub)image; this yields very stable features, but only a limited number of data points. For the problem of segmentation, the features are calculated for each pixel separately and therefore tend to be noisy, but many data points are available. In both cases, a serious reduction of the number of features is required; for segmentation this is mainly because of the noisiness of the features and to reduce the complexity. Because of the curse-of-dimensionality phenomenon, the segmentation performance is expected to decrease when more features are involved. However, the number of applied features should at least be higher than the intrinsic dimensionality of the data set. Therefore, a trade-off between the two will lead to optimal segmentation results.

An unsupervised reduction corresponds to a projection of the feature space to a lower dimensionality. The standard technique for this is Principal Component Analysis (PCA), where the correlation matrix is diagonalized, leading to a rotation of the feature space, which can then be projected onto the principal axes corresponding to the largest eigenvalues. Other, non-linear techniques, such as Multidimensional Scaling, Sammon's Mapping or Self-Organizing Maps, are possible [32]. Since the main issue of this paper is the clustering itself, PCA is applied.
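A compact sketch of the PCA projection step is given below. The paper mentions diagonalizing the correlation matrix; the sketch uses the covariance matrix of the centered features, which is one common reading, and whether the features are additionally variance-normalized is left as an assumption.

```python
import numpy as np

def pca_project(features, d):
    """Project per-pixel feature vectors onto the d principal axes (sketch).
    features: (n_pixels, n_features) array; returns (n_pixels, d)."""
    X = features - features.mean(axis=0)                  # center the features
    cov = X.T @ X / len(X)                                # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)                # eigenvalues in ascending order
    axes = eigvecs[:, np.argsort(eigvals)[::-1][:d]]      # keep the d largest eigenvalues
    return X @ axes
```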
4.3 Clustering
Finally, the projected feature space needs to be clustered. The issue of cluster validity (i.e. what is the optimal number of clusters) is not discussed in this paper; in the conducted experiments we assume that the optimal number of clusters is known in advance. The proposed technique can then be applied, and its performance can easily be estimated from the segmentation results. As a performance measure we use the percentage of correctly segmented pixels (in the experiments a ground-truth image is assumed to be available); a possible implementation of this measure is sketched below.
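The paper does not state how the (arbitrary) cluster labels are matched to the ground-truth texture labels before counting correct pixels; the sketch below assumes the most favourable one-to-one mapping, which is feasible here since only five textures are involved.

```python
import numpy as np
from itertools import permutations

def segmentation_accuracy(pred, truth, n_classes):
    """Percentage of correctly segmented pixels, maximized over all one-to-one
    mappings from cluster labels to ground-truth labels (an assumption; the
    paper does not specify how the matching is done)."""
    pred, truth = pred.ravel(), truth.ravel()
    best = 0.0
    for perm in permutations(range(n_classes)):
        mapped = np.take(perm, pred)                  # relabel the clusters
        best = max(best, np.mean(mapped == truth))
    return 100.0 * best
```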
5 Experiments and Discussion
For the experiments, compositions of textures were generated. All textures were taken from the Brodatz album [33]. The compositions contained five regions with different textures. In figure 1, an example of such a composition is shown, displaying from left to right the textures D29, D28, D21 and D2, and D24 in the center.

From these compositions, 64 Gabor features were calculated for each pixel. The obtained feature space is then reduced to d dimensions using PCA, with d ranging from 1 to 7. On the reduced feature space, clustering is applied. Six different algorithms were applied: spherical k-means clustering (KM), Spherical Competitive Learning (CL), Frequency-Sensitive Spherical Competitive Learning (FSCL), elliptical k-means clustering (EKM), Elliptical Competitive Learning (ECL) and Frequency-Sensitive Elliptical Competitive Learning (FSECL). For all techniques, the segmentation performance is studied as a function of the dimension d. The effects of introducing the frequency-sensitive term and the elliptical term are studied separately. In all cases no post-processing steps were applied.

In figures 3(a) to 3(f), segmentation results are shown for d = 3. In figures 3(a) and 3(b), the resulting segmented images are shown for k-means clustering with the Euclidean and Mahalanobis distance, respectively. In figures 3(c) and 3(d), results using ordinary CL with and without frequency-sensitivity are shown. Finally, in figures 3(e) and 3(f), results using ECL with and without frequency-sensitivity are shown. In figure 2, the segmentation results are plotted as a function of d. Since all algorithms depend on the initial configuration of the parameters, and the competitive approaches also depend on the order in which data points are presented, results differ from one run to another. Therefore, we repeated all experiments over 100 runs, and average and standard deviation are given. From these results the following can be concluded:

• The results for CL, FSCL and KM were comparable. In all three cases textures D28 and D2 were not distinguishable. Apparently, in the case of spherical clustering, the sequential version led to similar results as the batch version. The use of the frequency-sensitive term did not improve the performance.

• Extending the model to ellipsoidal cluster shapes using EKM did not give any improvement. It was expected that this more general model could improve the results, but it seems that the algorithm is not able to estimate the large number of parameters needed to model the ellipsoidal shapes.

• Using ECL, a decrease in the segmentation performance was observed when the frequency-sensitive term was not included. Without the frequency-sensitive term, a cluster can grow almost without limit, as can be seen in figure 3(e). Including the term, however, resulted in an improvement of the performance.

From these observations we can conclude that including frequency-sensitivity or elliptical clustering separately does not improve the results, and can even deteriorate the clustering performance, whereas including both does improve the results. These observations were generally found in all compositions we tested. Obviously, the results differed depending on the choice of textures; the case used in this paper was found to be particularly difficult.

Besides the gain in performance, the competitive approach also entails a gain in calculation speed. KM and EKM update all data points simultaneously during one iteration, and several iteration steps were needed to obtain useful results. The competitive approaches update the data points one at a time; typically about one 'epoch' (i.e. all data points presented once) sufficed to obtain useful results. In table 1, some typical calculation times are given as an indication of the speed-up.

All experiments were repeated for the fuzzy versions of the algorithms: the fuzzy k-means algorithm (FKM), the fuzzy competitive learning algorithm (FCL), the fuzzy frequency-sensitive competitive learning algorithm (FFSCL), the fuzzy ellipsoidal competitive learning algorithm (FECL) and the proposed fuzzy frequency-sensitive ellipsoidal competitive learning algorithm (FFSECL). In figures 4(a) to 4(f), the segmentation results are displayed for the textures of figure 1. In general, the same conclusions can be drawn as for the hard clustering results. When incorporating the Mahalanobis distance into the batch-mode algorithm, the performance deteriorates, even more than in the hard clustering case. Incorporating frequency-sensitivity slightly improves the results; incorporating elliptical clustering deteriorates them, again more than in the hard clustering case; but incorporating both improves the results. With respect to the introduction of fuzzy clustering, we can conclude from the figures that it slightly improves the results compared to hard clustering for almost all algorithms, except in the case of ECL, where it deteriorates the results. Using the fuzzy approaches, the calculation times increase compared to the hard clustering approaches, because the memberships need to be calculated and all cluster centres need to be updated at each iteration. In table 1, typical processing times are displayed.
6 Conclusions
In this paper, texture segmentation using competitive learning techniques was addressed. A new technique was proposed that incorporates elliptical clustering and frequency-sensitivity into the learning rule; for this, an efficient learning rule was elaborated. Furthermore, a fuzzy version of the algorithm was proposed. Experiments on texture segmentation, applying Gabor texture features, were performed. We compared the proposed technique to the spherical and elliptical k-means clustering algorithms, both hard and fuzzy. The results showed that the proposed technique outperforms the others in performance as well as efficiency.
References
[1] J. Hartigan, Clustering Algorithms, Wiley, 1975.
[2] R. O. Duda, P. E. Hart, Pattern Classification and Scene Analysis, John Wiley, New York, 1973.
[3] A. K. Jain, R. C. Dubes, Algorithms for Clustering Data, Prentice-Hall, Englewood Cliffs, NJ, 1988.
[4] L. Kaufman, P. J. Rousseeuw, Finding Groups in Data, John Wiley & Sons, 1990.
[5] A. J. Scott, M. J. Symons, Clustering methods based on likelihood ratio criteria, Biometrics 27 (1971) 387–397.
[6] D. M. Titterington, A. F. M. Smith, U. E. Makov, Statistical Analysis of Finite Mixture Distributions, Wiley, Chichester, 1985.
[7] G. J. McLachlan, K. E. Basford, Mixture Models: Inference and Applications to Clustering, Marcel Dekker, New York, 1988.
[8] G. Celeux, G. Govaert, Comparison of the mixture and the classification maximum likelihood in cluster analysis, J. Statist. Comput. Simul. 47 (1993) 127–146.
[9] A. P. Dempster, N. M. Laird, D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society B 39 (1977) 1–38.
[10] G. J. McLachlan, T. Krishnan, The EM Algorithm and Extensions, Wiley, 1997.
[11] J. MacQueen, Some methods for classification and analysis of multivariate observations, in: L. M. Le Cam, J. Neyman (Eds.), Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, University of California Press, Berkeley, CA, 1967, pp. 281–297.
[12] S. Grossberg, Adaptive pattern recognition and universal recoding, I: Parallel development and coding of neural feature detectors, Biological Cybernetics 23 (1976) 121–134.
[13] R. M. Neal, G. E. Hinton, A view of the EM algorithm that justifies incremental, sparse, and other variants, in: Learning in Graphical Models, Kluwer Academic Publishers, 1998, pp. 355–368.
[14] P. Baldi, Y. Chauvin, Smooth on-line learning algorithms for hidden Markov models, Neural Computation 6 (2) (1994) 307–318.
[15] J.-M. Jolion, P. Meer, S. Bataouche, Robust clustering with applications in computer vision, IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-13 (8) (1991) 791–802.
[16] J. Mao, A. K. Jain, A self-organizing network for hyperellipsoidal clustering (HEC), IEEE Transactions on Neural Networks (1996) 16–29.
[17] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, New York, 1981.
[18] I. Gath, A. B. Geva, Unsupervised optimal fuzzy clustering, IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-11 (7) (1989) 773–781.
[19] S. Z. Selim, M. S. Kamel, On the mathematical and numerical properties of the fuzzy c-means algorithm, Fuzzy Sets and Systems 49 (1992) 181–191.
[20] F.-L. Chung, T. Lee, Fuzzy competitive learning, Neural Networks 7 (3) (1994) 539–551.
[21] T. Reed, J. du Buf, A review of recent texture segmentation and feature extraction techniques, CVGIP: Image Understanding 57 (3) (1993) 359–372.
[22] M. Unser, M. Eden, Multiresolution feature extraction and selection for texture segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence 11 (1989) 717–728.
[23] A. K. Jain, F. Farrokhnia, Unsupervised texture segmentation using Gabor filters, Pattern Recognition 24 (12) (1991) 1167–1186.
[24] M. Unser, Texture classification and segmentation using wavelet frames, IEEE Transactions on Image Processing 4 (11) (1995) 1549–1560.
[25] T. Kohonen, Self-Organizing Maps, Springer-Verlag, 1995.
[26] P. Scheunders, A comparison of clustering algorithms applied to color image quantization, Pattern Recognition Letters 18 (1997) 1379–1384.
[27] D. DeSieno, Adding a conscience to competitive learning, in: Proc. IEEE 1988 Int. Conf. on Neural Networks, 1988, pp. 1117–1124.
[28] S. C. Ahalt, A. K. Krishnamurthy, P. Chen, D. E. Melton, Competitive learning algorithms for vector quantization, Neural Networks 3 (1990) 277–290.
[29] P. Scheunders, S. De Backer, High-dimensional clustering using frequency sensitive competitive learning, Pattern Recognition 32 (2) (1999) 193–202.
[30] L. A. Zadeh, Fuzzy sets, Information and Control 8 (1965) 338–353.
[31] D. E. Gustafson, W. C. Kessel, Fuzzy clustering with a fuzzy covariance matrix, in: Proc. IEEE Conference on Decision and Control, San Diego, 1979, pp. 761–766.
[32] S. De Backer, A. Naud, P. Scheunders, Non-linear dimensionality reduction techniques for unsupervised feature extraction, Pattern Recognition Letters 19 (1998) 711–720.
[33] P. Brodatz, Textures: A Photographic Album for Artists and Designers, Dover Publications, New York, 1966.
Table 1
Average processing times for the different algorithms (calculations done on a Pentium II, 350 MHz).

Method          # iterations    Time (s)
KM                    24           1.5
EKM                   50          23.
CL/FSCL                1           0.1
ECL/FSECL              1           0.3
FKM                   54          15.
FEKM                  65          28.
FCL/FFSCL              1           1.2
FECL/FFSECL            1           1.3
Fig. 1. The composition of textures (regions D29, D28, D24, D21 and D2).
Fig. 2. Plot of segmentation performance as a function of d for the different algorithms; the vertical axis shows the classification result (%), the horizontal axis the dimension d. All experiments are repeated 100 times, and average and standard deviation are given.
Fig. 3. Segmentation results for (a) k-means clustering (KM), (b) elliptical k-means clustering (EKM), (c) spherical competitive learning (CL), (d) frequency-sensitive spherical competitive learning (FSCL), (e) elliptical competitive learning (ECL), (f) frequency-sensitive elliptical competitive learning (FSECL).
Fig. 4. Segmentation results for the fuzzy versions of (a) k-means clustering (FKM), (b) elliptical k-means clustering (FEKM), (c) spherical competitive learning (FCL), (d) frequency-sensitive spherical competitive learning (FFSCL), (e) elliptical competitive learning (FECL), (f) frequency-sensitive elliptical competitive learning (FFSECL).