Data Recovery Fuzzy Clustering: Proportional Membership and Additive Spectral Methods
Susana Nascimento Department of Computer Science and Centre for Artificial Intelligence (CENTRIA) Faculdade de Ciências e Tecnologia Universidade Nova de Lisboa PORTUGAL
International Workshop ``Clusters, orders, trees: Methods and applications'' in honor of Professor Boris Mirkin Moscow, December 12th-13th 2012
Background and Motivation
Current fuzzy clustering methods are useful for finding fuzzy structures in data, especially with respect to typologies.
[Figure: typological structure with two types (prototypes v1, v2) and data points y1, y2]
Yet they do not follow the conventional statistical approach: they give no feedback on the data.
Data Recovery Framework for Clustering
1. Data are assumed to have been generated according to a cluster structure:
   Observed_Data = Model_Data + Residual
2. The goal of clustering is to fit the Model_Data, minimising the Residual.
3. Square-error clustering criterion: ||Residual||^2 → min
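The decomposition Observed_Data = Model_Data + Residual can be illustrated with a minimal sketch. It assumes a crisp (non-fuzzy) cluster structure in which each entity is modelled by the mean of its cluster; the function name `recover`, the toy data and the labels are hypothetical and are not taken from the talk.

```python
import numpy as np

def recover(Y, labels):
    """Split Y into Model_Data (cluster means) and Residual (assumed crisp model)."""
    model = np.zeros_like(Y, dtype=float)
    for g in np.unique(labels):
        members = labels == g
        model[members] = Y[members].mean(axis=0)  # every entity -> its cluster mean
    residual = Y - model
    return model, residual

# Hypothetical toy data: four entities, two features, two clusters.
Y = np.array([[1.0, 2.0], [1.2, 1.8], [5.0, 5.5], [5.2, 5.1]])
labels = np.array([0, 0, 1, 1])

model, residual = recover(Y, labels)
square_error = float(np.sum(residual ** 2))  # ||Residual||^2, the quantity to minimise
print(square_error)
```

With cluster means as the model, the data scatter also splits as ||Y||^2 = ||Model_Data||^2 + ||Residual||^2, so minimising the residual is the same as maximising the part of the scatter explained by the model.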
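Generic Types of Data in Clustering
Entity-to-feature data matrix (rectangular, n entities by p features):
$$
X = \begin{pmatrix}
x_{11} & \cdots & x_{1f} & \cdots & x_{1p} \\
\vdots &        & \vdots &        & \vdots \\
x_{i1} & \cdots & x_{if} & \cdots & x_{ip} \\
\vdots &        & \vdots &        & \vdots \\
x_{n1} & \cdots & x_{nf} & \cdots & x_{np}
\end{pmatrix}
$$
(Dis)similarity matrix (square, n by n):
$$
D = \begin{pmatrix}
0 \\
d(2,1) & 0 \\
d(3,1) & d(3,2) & 0 \\
\vdots & \vdots & \vdots & \ddots \\
d(n,1) & d(n,2) & \cdots & \cdots & 0
\end{pmatrix}
$$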
Objectives
Develop a fuzzy clustering framework within the data recovery approach for both data formats:
A) Entity-to-feature data format: ‘proportional membership’ clustering
B) Square similarity data format: ‘fuzzy additive spectral’ clustering
Fuzzy Clustering Proportional Membership Model (FCPM)
FCPM model: Y – data, U – membership, V – prototypes; entity k = 1, ..., n; feature h = 1, ..., p; prototype i = 1, ..., c:
$$ y_{kh} = u_{ik} v_{ih} + e_{ikh} $$
Proportional (F. Roberts) membership: u_ik is the proportion of v_i.
[Figure: entity y_k in the plane of features h1 and h2, with prototypes v1 and v2, memberships u1k and u2k, and residuals e1k and e2k]
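The FCPM Family
FCPM-0 clustering criterion:
$$ E_0(U, V; Y) = \sum_{i=1}^{c} \sum_{k=1}^{n} \sum_{h=1}^{p} (y_{kh} - u_{ik} v_{ih})^2 $$
subject to the fuzzy constraints
$$ 0 \le u_{ik} \le 1 \quad \forall i, k; \qquad \sum_{i=1}^{c} u_{ik} = 1 \quad \forall k. $$
FCPM-m clustering criteria:
$$ E_m(U, V; Y) = \sum_{i=1}^{c} \sum_{k=1}^{n} \sum_{h=1}^{p} u_{ik}^{m} (y_{kh} - u_{ik} v_{ih})^2, $$
with m = 0, 1, 2, ..., satisfying the fuzzy constraints.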
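As a hedged illustration of the FCPM-m criterion, the sketch below evaluates E_m(U, V; Y) = sum over i, k, h of u_ik^m (y_kh - u_ik v_ih)^2 with NumPy and checks the fuzzy constraints. The function names `fcpm_criterion` and `is_fuzzy_partition`, the array layout (U is c by n, V is c by p, Y is n by p) and the toy numbers are assumptions made for this example.

```python
import numpy as np

def fcpm_criterion(Y, U, V, m=0):
    """E_m(U, V; Y) for Y of shape (n, p), U of shape (c, n), V of shape (c, p)."""
    Y, U, V = (np.asarray(a, dtype=float) for a in (Y, U, V))
    residual = Y[None, :, :] - U[:, :, None] * V[:, None, :]  # e_ikh, shape (c, n, p)
    sq_dist = np.sum(residual ** 2, axis=2)                    # shape (c, n)
    return float(np.sum(U ** m * sq_dist))

def is_fuzzy_partition(U, tol=1e-9):
    """True if 0 <= u_ik <= 1 and the memberships of every entity sum to 1."""
    U = np.asarray(U, dtype=float)
    in_range = np.all(U >= -tol) and np.all(U <= 1 + tol)
    return bool(in_range and np.allclose(U.sum(axis=0), 1.0, atol=tol))

# Hypothetical toy example: two entities, two features, two prototypes.
Y = np.array([[2.0, 0.0], [0.0, 3.0]])
V = np.array([[4.0, 0.0], [0.0, 6.0]])
U = np.array([[0.5, 0.5], [0.5, 0.5]])

for m in (0, 1, 2):
    print(m, fcpm_criterion(Y, U, V, m))
```

The weight u_ik^m discounts the residuals of entities with small membership, so larger m makes the fit depend mostly on entities that share a large proportion of a prototype.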
FCPM Algorithm: Alternating Optimisation (AO)
Criteria: Em(U,V;Y) / E(U,V;Y,)
AO architecture:
1. Initialize prototypes V(0) and partition U(0).
2. Repeat
   • new partition U(t), computed from (U(t-1), V(t-1), Y, ...) by the Goldstein-Levitin-Polyak gradient projection method:
     $$ u_k^{(t)} = P_Q\big(u_k^{(t-1)} - \alpha \nabla E(u_k^{(t-1)}, V)\big) $$
     where P_Q is a method for projecting a vector onto the simplex Q of admissible membership values and α > 0 is a step size;
   • new prototypes V(t), computed from (U(t), Y, ...):
     $$ V^{(t)} = \arg\min_{v \in V} E_m(v; U^{(t)}, Y) $$
   until t = tmax or |V(t) - V(t-1)| < err.
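The alternating scheme can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the sort-based simplex projection, the step-halving safeguard, the initialisation (uniform memberships, prototypes drawn from the data) and all function names are choices made for this example. The prototype update uses the closed form v_ih = sum_k u_ik^(m+1) y_kh / sum_k u_ik^(m+2), obtained by setting the derivative of E_m with respect to v_ih to zero.

```python
import numpy as np

def project_simplex(w):
    """Euclidean projection of w onto {u : u >= 0, sum(u) = 1} (sort-based)."""
    s = np.sort(w)[::-1]
    css = np.cumsum(s) - 1.0
    k = np.arange(1, w.size + 1)
    rho = np.nonzero(s - css / k > 0)[0][-1]
    return np.maximum(w - css[rho] / (rho + 1.0), 0.0)

def criterion(Y, U, V, m):
    R = Y[None, :, :] - U[:, :, None] * V[:, None, :]     # residuals, (c, n, p)
    return float(np.sum(U ** m * np.sum(R ** 2, axis=2)))

def gradient_U(Y, U, V, m):
    """Partial derivatives of E_m with respect to every u_ik."""
    R = Y[None, :, :] - U[:, :, None] * V[:, None, :]
    g = -2.0 * U ** m * np.einsum("ikh,ih->ik", R, V)
    if m > 0:
        g = g + m * U ** (m - 1) * np.sum(R ** 2, axis=2)
    return g

def update_prototypes(Y, U, V, m):
    """Exact minimiser of E_m over V for fixed U (unused prototypes are kept)."""
    num = (U ** (m + 1)) @ Y
    den = np.sum(U ** (m + 2), axis=1)
    V_new = V.copy()
    used = den > 1e-12
    V_new[used] = num[used] / den[used, None]
    return V_new

def fcpm_ao(Y, c, m=0, step=0.1, t_max=200, err=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    n = Y.shape[0]
    V = Y[rng.choice(n, size=c, replace=False)].astype(float)  # assumed initialisation
    U = np.full((c, n), 1.0 / c)
    history = [criterion(Y, U, V, m)]
    for t in range(t_max):
        G = gradient_U(Y, U, V, m)
        a = step
        for trial in range(30):            # halve the step until E_m does not grow
            U_try = np.apply_along_axis(project_simplex, 0, U - a * G)
            if criterion(Y, U_try, V, m) <= history[-1]:
                U = U_try
                break
            a *= 0.5
        V_new = update_prototypes(Y, U, V, m)
        shift = np.linalg.norm(V_new - V)
        V = V_new
        history.append(criterion(Y, U, V, m))
        if shift < err:
            break
    return U, V, history

# Hypothetical usage on centred random data.
rng = np.random.default_rng(1)
Y = rng.normal(size=(40, 3))
Y -= Y.mean(axis=0)            # origin placed at the feature means
U, V, history = fcpm_ao(Y, c=3, m=2)
print(len(history), history[0], history[-1])
```

The step-halving safeguard is only there to keep the criterion from increasing in this sketch; the step-size rule actually used with the gradient projection method is not stated on the slide.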
FCPM-0: Indicator of the Number of Clusters
Analysis of Cluster Structure Recovery
Data Generator
• Each “original” prototype oi is randomly generated within a prespecified small hyper-cube.
• The origin of the space, o, is defined by the means of the features.
• Each cluster direction is the segment o oi.
• Two p-sampling hyper-boxes are defined for each cluster: Ai = [0.9 oi, 1.1 oi] and Bi = [o, oi].
• 20% of the ni points are sampled within Ai and 80% of the ni points within Bi.
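A minimal sketch of this generator is given below. Several details are assumptions not stated on the slide: points are drawn uniformly within each box, the origin o is taken as the zero vector (features already centred), and the hyper-cube for the original prototypes is [-1, 1]^p. The names `sample_box` and `generate_cluster` are hypothetical.

```python
import numpy as np

def sample_box(lo, hi, size, rng):
    """Uniform sample from the axis-aligned box spanned by the corners lo and hi."""
    lo, hi = np.minimum(lo, hi), np.maximum(lo, hi)  # order corners per coordinate
    return rng.uniform(lo, hi, size=(size, lo.size))

def generate_cluster(o_i, n_i, rng):
    """20% of the points in A_i = [0.9 o_i, 1.1 o_i], 80% in B_i = [o, o_i], o = 0."""
    o_i = np.asarray(o_i, dtype=float)
    n_a = int(round(0.2 * n_i))
    A = sample_box(0.9 * o_i, 1.1 * o_i, n_a, rng)
    B = sample_box(np.zeros_like(o_i), o_i, n_i - n_a, rng)
    return np.vstack([A, B])

rng = np.random.default_rng(0)
c0, p, n_i = 3, 5, 50
originals = rng.uniform(-1.0, 1.0, size=(c0, p))   # assumed hyper-cube [-1, 1]^p
X = np.vstack([generate_cluster(o, n_i, rng) for o in originals])
print(X.shape)
```

Most points lie along the direction from the origin to the prototype and only a minority sit near the prototype itself, which is what makes the original prototypes extreme rather than central points of their clusters.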
Experimental Study
Main Goals
1. Analyse the ability of FCPM to recover the original prototypes.
2. Study FCPM-0 as an indicator of the number of clusters.
3. Compare FCPM with the well-known Fuzzy c-Means (FCM) [Bezdek, 1981].
Setting of Experiments
• Generated data: 150 data sets; c0 = 3, 4, 5, 6, …; p = 20, 30, 50, …, 180.
• Space dimension (low, intermediate, high), by the ratio p/c0: low when p/c0 ≤ 5; high when p/c0 ≥ 25.
• FCM and FCPM-m: equal initial setting, c = c0 = 3, 4, 5, ...
FCPM-m, FCM prototypes / Original prototypes
[Figure: low space dimension data set (c = 3, p = 2, n = 50); legend of prototypes v: Original, FCM, FCPM-0, FCPM-1, FCPM-2, FCPMb, FCPM-AE]
All FCPM-1,2 versions find c = 3 prototypes; FCPM-0 moves the prototype of cluster 2 far away to the left and shares a prototype with cluster 3.
FCPM-m, FCM prototypes / original prototypes
[Figure: high space dimension data set (c0 = 6, p = 180, n = 887) projected on the space of the three principal components; clusters labelled 1 to 6; legend: Original v’s, FCM, FCPM-0, FCPM-1, FCPM-2]
FCPM-1 and FCPM-0 tend to provide central prototypes, like FCM; FCPM-2 leads to extreme prototypes.
Assessment of Cluster Structure Recovery
Dissimilarity Coefficient for Recovery of Prototypes
$$ D(V, V') = \frac{\sum_{i=1}^{c} \sum_{h=1}^{p} (v'_{ih} - v_{ih})^2}{\sum_{i=1}^{c} \sum_{h=1}^{p} v_{ih}^2 + \sum_{i=1}^{c} \sum_{h=1}^{p} {v'_{ih}}^2} $$
The dissimilarity D can be used to compare cluster prototypes in different settings:
– dissimilarity to FCM prototypes;
– dissimilarity to original prototypes.
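A small sketch of the dissimilarity coefficient, read here as the squared difference between two prototype sets divided by the sum of their squared norms. The function name `prototype_dissimilarity` is hypothetical, and the sketch assumes the rows of the two sets are already matched (row i of V corresponds to row i of V') and that the two sets are not both zero.

```python
import numpy as np

def prototype_dissimilarity(V, V_prime):
    """D(V, V') = sum (v'_ih - v_ih)^2 / (sum v_ih^2 + sum v'_ih^2)."""
    V = np.asarray(V, dtype=float)
    V_prime = np.asarray(V_prime, dtype=float)
    numerator = np.sum((V_prime - V) ** 2)
    denominator = np.sum(V ** 2) + np.sum(V_prime ** 2)
    return float(numerator / denominator)

# Hypothetical prototypes: c = 2 clusters, p = 3 features.
V = np.array([[1.0, 0.0, 2.0], [0.0, 3.0, 1.0]])
print(prototype_dissimilarity(V, V))         # identical prototype sets
print(prototype_dissimilarity(V, 0.5 * V))   # same directions, half the length
```

Under this reading D is 0 for identical prototype sets and is normalised by the scale of both sets, so values obtained on data sets of different dimension can be compared.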
Average Dissimilarity: results
Number of Clusters: results
(C1) some of the initial prototypes converge to the same stationary point;
(C2) some of the initial prototypes have been removed by the algorithm from the data set.
Summary of Experimental Results
FCPM-2 is able to recover the original prototypes: extreme points.
The other versions of FCPM favour central prototypes.
FCPM-0 (in low/intermediate space dimensions) and FCPM-2 (in high space dimension) act as indicators of the “natural” number of clusters present in the data according to the typological model.
For high dimensional data, the FCM leads to degenerate partitions: all prototypes coinciding.
FCPM proportional membership leads to more clear-cut partitions than the FCM distance membership.
FCPM vs FCM
• FCM criteria
$$ J_m(U, V; Y) = \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^{m} \, d^2(y_k, v_i) $$
1. The origin of the space is irrelevant when minimizing the FCM criterion Jm.
2. FCM prototypes are average points in their clusters.
• FCPM criteria
$$ E_m(U, V; Y) = \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^{m} \, d^2(y_k, u_{ik} v_i) $$
1. Minimizing Em over uik for vi fixed minimizes d2(yk, uik vi) by projecting yk onto the axes 0vi (i = 1, ..., c); hence the origin of the space must be chosen carefully in FCPM.
2. FCPM prototypes tend to be extreme points of their clusters.
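The role of the origin can be seen in a small numerical sketch. It ignores the fuzzy constraints and simply takes the scalar u that minimises ||y - u v||^2, which is the projection coefficient u = (y·v)/(v·v); the function name and the numbers are hypothetical.

```python
import numpy as np

def projection_coefficient(y, v):
    """Unconstrained minimiser of ||y - u*v||^2 over the scalar u."""
    return float(np.dot(y, v) / np.dot(v, v))

y = np.array([2.0, 2.0])        # an entity
v = np.array([4.0, 0.0])        # a prototype
shift = np.array([1.0, 0.0])    # moving the origin of the space

u_before = projection_coefficient(y, v)
u_after = projection_coefficient(y - shift, v - shift)

# The Euclidean distance used by FCM does not depend on the origin.
d_before = float(np.linalg.norm(y - v))
d_after = float(np.linalg.norm((y - shift) - (v - shift)))

print(u_before, u_after)   # the proportional membership changes
print(d_before, d_after)   # the FCM distance does not
```

Shifting the origin leaves every entity-to-prototype distance intact but changes the axis 0v onto which the entity is projected, which is why FCPM needs a deliberate choice of origin while FCM does not.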
Capturing Ideal Types with FCPM: Example
Mental Disorders [Mezzich and Solomon, 1980]
• 44 patients;
• 17 psychosomatic features (h1-h17);
• severity rating scale: 0-6;
• four conditions: Depressed (D), Manic (M), Schizophrenic (Ss), Paranoid schizophrenic (Sp).
Ideal type modelling: each condition is characterized by a pattern of features that takes extreme values (0 or 6), defining a syndrome of mental conditions.
Feature-to-Cluster Contribution / Underlying Type
[Figure: feature-to-class weights w(h|D) for the class Depressed over features 1 to 17, with the level 0.059 marked; underlying type D: h5, h8, h9, h13, h17]
The features contributing most to each cluster revealed by FCPM-2 mostly coincide with those of the original classes.
[Figure: feature-to-cluster 2-weights w2(h|D) for the cluster Depressed over features 1 to 17, with the level 0.059 marked]
Contradicting a Cluster Tendency
Having a cluster structure revealed in a data set, we question how sensitive that structure is to augmenting the data with entities bearing more or less similarity to the cluster prototypes.
[Figure: the Disorder data (full-scale syndrome) are extended to the Augmented Disorder data; labels: full-scale, mild-scale and light-scale syndromes; 55%, 30% and 15% of patients]
Generation of an augmented entity g from a patient k: xgh = round(sf · xkh) + t

Syndrome      sf    t
mild-scale    0.6   0/1
light-scale   0.3   0/1

Six distinct augmented data sets.
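The augmentation rule xgh = round(sf · xkh) + t can be sketched as below. The reading of the table entry "0/1" as a value t drawn at random from {0, 1} with equal probability is an assumption, as are the function name `augment`, the toy ratings and the use of NumPy's rounding (which rounds exact halves to the nearest even integer).

```python
import numpy as np

def augment(X, sf, rng):
    """Scale ratings by sf, round, and add t in {0, 1} (t assumed random)."""
    X = np.asarray(X, dtype=float)
    t = rng.integers(0, 2, size=X.shape)        # 0 or 1 for every rating
    return (np.round(sf * X) + t).astype(int)

rng = np.random.default_rng(0)
# Hypothetical full-scale ratings on the 0-6 severity scale.
X_full = np.array([[6, 0, 3], [5, 1, 2]])

X_mild = augment(X_full, sf=0.6, rng=rng)    # mild-scale syndrome
X_light = augment(X_full, sf=0.3, rng=rng)   # light-scale syndrome
print(X_mild)
print(X_light)
```

Scaling by sf pulls extreme ratings such as 6 toward the middle of the scale, so the augmented entities keep the pattern of a syndrome while expressing it less severely.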
Underlying Types: when less “heavy patients” are added
FCM moves prototypes to ‘mild’ syndromes; FCPM-2 keeps prototypes in the ‘extreme’ syndrome (0 / 6).

Depressed (D)
Feature   DFCM h   DFCM 'h   DFCPM h   DFCPM 'h
h5        5        4         6         6
h8        0        0         0         0
h9        6        5         6         6
h13       5        4         6         6
h17       1        1         0         0

Manic (M)
Feature   MFCM h   MFCM 'h   MFCPM h   MFCPM 'h
h3        0        1         0         0
h8        6        5         6         6
h13       0        1         0         0
h16       0        1         0         0
h17       6        5         6         6

Schizophrenic (Ss)
Feature   SsFCM h  SsFCM 'h  SsFCPM h  SsFCPM 'h
h3        5        4         6         5
h8        1        1         0         0
h16       5        3         6         6
h17       0        1         0         0

Paranoid schizophrenic (Sp)
Feature   SpFCM h  SpFCM 'h  SpFCPM h  SpFCPM 'h
h8        5        4         6         6
h10       5        4         6         6
h11       5        5         6         6
h12       4        4         6         6
h13       1        1         0         0
h14       5        5         6         6
h15       5        5         6         6
Conclusion
The FCPM framework offers a family of clustering models based on the concept of ‘proportional membership’: the belongingness of entities to clusters is based on how much they share the features of the corresponding prototypes.
These kinds of methods are restrictive, each covering a specific type of cluster structure:
– FCPM extreme type structure
– FCPM average type structure
Ability of FCPM to reconstruct the data from the model.
The effectiveness of FCPM ‘fuzzy proportional membership’ and ‘ideal type’ is yet to be better explored in real-world applications.
Main References
S. Nascimento, B. Mirkin, and F. Moura Pires (2003). Modeling Proportional Membership in Fuzzy Clustering. IEEE Transactions on Fuzzy Systems (IEEE-TFS), 11(2), pp. 173-186.
S. Nascimento (2005). Fuzzy Clustering via Proportional Membership Model. Vol. 119 of Frontiers of Artificial Intelligence and Applications, IOS Press, 200 pp.