Data Clustering using the self-organizing properties of ...

Data Clustering using the self-organizing properties of magnetic systems

Marcello Blatt, Shai Wiseman and Eytan Domany
Department of Physics of Complex Systems, The Weizmann Institute of Science, Rehovot, Israel∗

October 21, 2003

∗ Presented by Alejandro Murua and Larissa Stanberry at the "Unsupervised Learning Workshop", Department of Statistics, University of Washington, Seattle.

Potts Model

Points $v_i$, $i = 1, \dots, N$, reside on some lattice. Each $v_i$ has a spin $s_i \in \{1, \dots, q\}$. $J_{ij}$ denotes the interaction between spins $i$ and $j$. The configuration of the system is $S = \{s_i\}_{i=1}^{N}$. The energy of the system $H$ (the "cost function") is

$$H = \sum_{\langle i,j \rangle} J_{ij} \left(1 - \delta_{s_i, s_j}\right)$$

• Aligned spins ($s_i = s_j$) do not contribute to the energy sum.
• Non-aligned spins ($s_i \neq s_j$) contribute a positive interaction $J_{ij} > 0$.

The Boltzmann distribution at fixed temperature $T$,

$$P(S) = \frac{1}{Z} \exp\left(-\frac{H}{T}\right),$$

gives the weight of configuration $S$, where $Z = \sum_S \exp\left(-\frac{H}{T}\right)$ is a normalizing constant.
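For a very small system the energy and the Boltzmann weights can be evaluated by brute force. The following is a minimal sketch, assuming spins are labeled 0, ..., q-1 instead of 1, ..., q and that the sum runs once over each interacting pair; the function names and the three-spin toy couplings are hypothetical and not part of the talk.

```python
import itertools
import math

def potts_energy(spins, couplings):
    """H(S): sum of J_ij over interacting pairs whose spins differ."""
    return sum(J for (i, j), J in couplings.items() if spins[i] != spins[j])

def boltzmann_distribution(n_spins, q, couplings, T):
    """Enumerate all q**n_spins configurations and return {S: P(S)}."""
    weights = {
        S: math.exp(-potts_energy(S, couplings) / T)
        for S in itertools.product(range(q), repeat=n_spins)
    }
    Z = sum(weights.values())  # normalizing constant
    return {S: w / Z for S, w in weights.items()}

# Toy system (hypothetical values): three spins on a chain, q = 3 states.
couplings = {(0, 1): 1.0, (1, 2): 0.5}
P = boltzmann_distribution(3, 3, couplings, T=0.5)
```

Enumeration costs $q^N$ evaluations, so it is only practical as a check on tiny systems.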

The ordering properties of the system

• Magnetization $\mathrm{E}\,m(S)$, with

$$m(S) = \frac{q N_{\max}(S) - N}{(q - 1) N},$$

where $N_{\max}(S) = \max_k \{N_k(S)\}$ and $N_k(S) = \sum_i \delta_{s_i, k}$ is the number of spins in state $k$ (∼ cluster size).

• Spin-spin correlation $G_{ij} = \mathrm{E}\,\delta_{s_i, s_j} = P(s_i = s_j)$.

• Susceptibility $\chi$ (a.k.a. variance):

$$\chi = \frac{N}{T}\left(\mathrm{E}\,m^2 - (\mathrm{E}\,m)^2\right) = \frac{N}{T} \operatorname{var}(m)$$
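As an illustration, the magnetization of one configuration and the susceptibility of a sample of magnetizations can be computed as follows. This is a sketch only: the function names are hypothetical, and the expectation is assumed to be replaced by a plain sample mean.

```python
from collections import Counter

def magnetization(spins, q):
    """m(S) = (q * N_max - N) / ((q - 1) * N)."""
    N = len(spins)
    n_max = max(Counter(spins).values())  # size of the most populated spin state
    return (q * n_max - N) / ((q - 1) * N)

def susceptibility(m_samples, N, T):
    """chi = (N / T) * (mean(m^2) - mean(m)^2), using sample means."""
    M = len(m_samples)
    mean = sum(m_samples) / M
    mean_sq = sum(m * m for m in m_samples) / M
    return N / T * (mean_sq - mean ** 2)

m_aligned = magnetization([2, 2, 2, 2], q=4)  # every spin in the same state
m_spread = magnetization([0, 1, 2, 3], q=4)   # every state equally populated
```

With these definitions a fully aligned configuration gives m = 1, and a configuration that populates all q states equally gives m = 0.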

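Homogeneous System

All spins have constant interactions $J_{ij} = J$.

Phase           Temperature   Magnetization          Correlation∗
Ferromagnetic   T low         $\mathrm{E}\,m = 1$    $G_{ij} \approx 1$
Paramagnetic    T high        $\mathrm{E}\,m = 0$    $G_{ij} \approx 1/q$

∗ Spin-spin correlation $G_{ij} = \mathrm{E}\,\delta_{s_i, s_j} = P(s_i = s_j)$.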

Inhomogeneous Potts Model

Spins form magnetic "grains" characterized by strong coupling within the grain and weak coupling between the grains.

Phase                Temperature      Spins                                                  Correlation∗
Ferromagnetic        T low            Aligned                                                $G_{ij} > 1 - \frac{2}{q} + O(1/q^2)$
Super-paramagnetic   T intermediate   Strongly coupled: aligned; weakly coupled: not aligned $G_{ij} \approx 1 - \frac{2}{q} + O(1/q^2)$
Paramagnetic         T high           Disordered                                             $G_{ij} = 1/q$ for all $i, j$ (independent spins)

∗ Spin-spin correlation $G_{ij} = \mathrm{E}\,\delta_{s_i, s_j} = P(s_i = s_j)$.

Monte-Carlo Simulations to compute thermal averages

The thermodynamic average of a quantity $A$ is calculated as

$$\mathrm{E}\,A = \sum_S A(S) P(S),$$

where $P(S) = \frac{1}{Z} \exp\left(-\frac{H}{T}\right)$ is the Boltzmann factor and $Z = \sum_S \exp\left(-\frac{H}{T}\right)$.

The number of possible system configurations is $q^N$. Solution:

• Generate $\{S_1, \dots, S_M\}$ ∼ Boltzmann distribution.
• Use it as a statistical sample.
• Approximate thermal averages by

$$\mathrm{E}\,A \approx \frac{1}{M} \sum_{i=1}^{M} A(S_i), \qquad M \ll N.$$
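The sampling idea can be sketched with any Markov chain whose stationary distribution is the Boltzmann distribution. The example below uses single-spin Metropolis updates purely as a simple stand-in; it is not the cluster algorithm used in the talk. The chain of six spins, the burn-in length, the seed and all function names are assumptions made for illustration.

```python
import math
import random
from collections import Counter

def magnetization(spins, q):
    """m(S) = (q * N_max - N) / ((q - 1) * N)."""
    N = len(spins)
    n_max = max(Counter(spins).values())
    return (q * n_max - N) / ((q - 1) * N)

def metropolis_sweep(spins, q, neighbors, T, rng):
    """One sweep of single-spin Metropolis updates, modifying spins in place."""
    for i in range(len(spins)):
        proposal = rng.randrange(q)
        # Energy change: a neighbor contributes J when its spin differs from s_i.
        dE = sum(J * ((proposal != spins[j]) - (spins[i] != spins[j]))
                 for j, J in neighbors[i])
        if dE <= 0 or rng.random() < math.exp(-dE / T):
            spins[i] = proposal

def estimate_mean_magnetization(n, q, neighbors, T, M, burn_in=200, seed=0):
    """Approximate E m by the average of m over M sampled configurations."""
    rng = random.Random(seed)
    spins = [rng.randrange(q) for _ in range(n)]
    for _ in range(burn_in):  # discard the initial transient
        metropolis_sweep(spins, q, neighbors, T, rng)
    total = 0.0
    for _ in range(M):
        metropolis_sweep(spins, q, neighbors, T, rng)
        total += magnetization(spins, q)
    return total / M

# Toy system (hypothetical): open chain of 6 spins with J = 1 between neighbors.
n = 6
neighbors = [[] for _ in range(n)]
for i in range(n - 1):
    neighbors[i].append((i + 1, 1.0))
    neighbors[i + 1].append((i, 1.0))

m_low = estimate_mean_magnetization(n, 3, neighbors, T=0.1, M=1000)
m_high = estimate_mean_magnetization(n, 3, neighbors, T=100.0, M=1000)
```

At low temperature the estimate should sit close to 1 (ordered spins), and at high temperature it should be markedly smaller, in line with the phase picture described earlier.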

Swendsen-Wang Algorithm

The algorithm changes the value of an entire cluster, rather than of a single spin.

• Assign the first configuration $s_1, \dots, s_N$ at random.
• Visit all pairs of spins with positive interactions $J_{ij} > 0$.
• The bond between the spins is frozen with probability $p_{ij} = 1 - \exp\left(-\frac{J_{ij}}{T} \delta_{s_i, s_j}\right)$.
• SW-cluster = {spins connected by frozen bonds}.
• Assign a spin value at random to each SW-cluster.
• Iterate.
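One iteration of the steps above can be sketched as follows. This is an illustrative version, not the authors' code: it assumes spins labeled 0, ..., q-1, couplings stored as a dictionary over pairs, and a union-find structure (a choice made here) to collect spins joined by frozen bonds.

```python
import math
import random

def find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def swendsen_wang_step(spins, q, couplings, T, rng):
    """One SW update. Returns (new_spins, cluster_label_of_each_spin)."""
    n = len(spins)
    parent = list(range(n))
    for (i, j), J in couplings.items():
        # A bond can freeze only between equal spins (delta = 1),
        # with probability 1 - exp(-J_ij / T).
        if spins[i] == spins[j] and rng.random() < 1.0 - math.exp(-J / T):
            parent[find(parent, i)] = find(parent, j)
    labels = [find(parent, i) for i in range(n)]
    # Every SW-cluster receives one new spin value drawn uniformly at random.
    new_value = {root: rng.randrange(q) for root in set(labels)}
    return [new_value[root] for root in labels], labels

# Toy usage (hypothetical values): four spins on a chain, q = 3.
rng = random.Random(0)
couplings = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0}
spins, labels = swendsen_wang_step([0, 0, 1, 1], 3, couplings, T=0.5, rng=rng)
```

The returned labels are what the indicator $c_{ij}$ in the clustering procedure needs: two points share an SW-cluster exactly when their labels are equal.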

Clustering Data

1. Consider data points $x_1, \dots, x_n \in \mathbb{R}^D$.

2. Define the number of spin states $q$.
• The number of spin states $q$ is not related to the number of clusters!

3. Define a neighborhood.
• Using all pairs $(i, j)$ gives $N^2$ interactions.
• For $D \le 3$, use the Delaunay triangulation.
• K-nearest neighbors: $x_i$ is a K-nn of $x_j$ ⇔ $x_j$ is a K-nn of $x_i$. The outcome is a connected graph, but as $D$ increases, $K$ must increase as well.
• For $D > 100$, use K-nn ◦ MST (minimal spanning tree).

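4. Define interactions $J_{ij} = f(d_{ij})$, e.g.

$$J_{ij} = \begin{cases} \frac{1}{\hat{K}} \exp\left(-\frac{d_{ij}^2}{2a^2}\right) & \text{if } v_i, v_j \text{ are K-nn neighbors} \\ 0 & \text{otherwise} \end{cases}$$

• $\hat{K}$: average number of neighbors.
• $a$: "local length scale" over which $J_{ij}$ decays
  • defined by the high-density regions
  • smaller than the average distance in low-density regions
• $a = \bar{d}_{ij}$, the average distance between neighbors $v_i, v_j$.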

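5. Generate $M$ configurations $S_1, \dots, S_M$.

6. For every $S$, calculate

$$m(S) = \frac{q N_{\max}(S) - N}{(q - 1) N},$$

where $N_{\max}(S) = \max_k \{N_k(S)\}$ and $N_k(S) = \sum_i \delta_{s_i, k}$ is the number of spins in state $k$ (∼ cluster size).

7. For each spin configuration, calculate the indicator function

$$c_{ij} = \begin{cases} 1 & \text{if } v_i, v_j \text{ belong to the same SW-cluster} \\ 0 & \text{otherwise} \end{cases}$$

8. Calculate the magnetization $\mathrm{E}\,m \approx \frac{1}{M} \sum_{k=1}^{M} m(S_k)$.

9. Calculate the variance $\operatorname{var}(m) = \mathrm{E}\,m^2 - (\mathrm{E}\,m)^2$.

10. The transition from the super-paramagnetic to the paramagnetic phase occurs at

$$T_c \approx \frac{1}{4 \log(1 + \sqrt{q})} \exp\left(-\frac{\langle\langle d_{ij}^2 \rangle\rangle}{2a^2}\right),$$

where $\langle\langle \cdot \rangle\rangle$ is the average over all neighbors.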

11. Identify super-paramagnetic phase.

12. Select one T for all subphases.

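13. $G_{ij} = \frac{(q - 1) C_{ij} + 1}{q}$, where $C_{ij} = \mathrm{E}\,c_{ij} = P(v_i, v_j \in SW_k)$.

14. Link $v_i, v_j$ if $G_{ij} > 0.5$, where the threshold satisfies $\frac{1}{q} < \text{threshold} < 1 - \frac{2}{q}$.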

15. Connected subgraphs are cluster cores.

16. Remaining points are linked to the neighbor with max Gij .
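The data-dependent steps of the procedure above (3, 4, 10 and 13 to 16) can be sketched in plain Python. Everything here is an illustrative reading of the slides, not the authors' implementation: Euclidean distances, mutual K-nearest neighbors without the MST, $\hat{K}$ taken as the mean number of neighbors per point, the average in step 10 taken over squared neighbor distances, and "remaining points" in step 16 taken to mean points left outside every core. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def mutual_knn_pairs(points, K):
    """Step 3: pairs (i, j), i < j, in which each point is a K-nn of the other."""
    n = len(points)
    knn = []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        knn.append(set(others[:K]))
    return {(i, j): math.dist(points[i], points[j])
            for i in range(n) for j in knn[i] if i < j and i in knn[j]}

def interactions(pair_dist, n_points):
    """Step 4: J_ij = exp(-d_ij^2 / (2 a^2)) / K_hat on neighbor pairs."""
    a = sum(pair_dist.values()) / len(pair_dist)  # a = mean neighbor distance
    k_hat = 2 * len(pair_dist) / n_points         # assumed: mean neighbors per point
    J = {pair: math.exp(-d * d / (2 * a * a)) / k_hat
         for pair, d in pair_dist.items()}
    return J, a

def transition_temperature(pair_dist, a, q):
    """Step 10: T_c ~ exp(-<<d^2>> / (2 a^2)) / (4 log(1 + sqrt(q)))."""
    mean_sq = sum(d * d for d in pair_dist.values()) / len(pair_dist)
    return math.exp(-mean_sq / (2 * a * a)) / (4 * math.log(1 + math.sqrt(q)))

def correlations(c_samples, q):
    """Step 13: G_ij = ((q - 1) C_ij + 1) / q, C_ij = sample mean of c_ij."""
    M = len(c_samples)
    return {pair: ((q - 1) * sum(c[pair] for c in c_samples) / M + 1) / q
            for pair in c_samples[0]}

def find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def clusters(n, G, threshold=0.5):
    """Steps 14-16: thresholded links form cores; leftover points join their best neighbor."""
    parent = list(range(n))
    for (i, j), g in G.items():
        if g > threshold:
            parent[find(parent, i)] = find(parent, j)
    core_size = Counter(find(parent, i) for i in range(n))
    singles = [i for i in range(n) if core_size[find(parent, i)] == 1]
    for i in singles:
        linked = [(g, b if a == i else a) for (a, b), g in G.items() if i in (a, b)]
        if linked:  # attach to the neighbor with maximal G_ij
            parent[find(parent, i)] = find(parent, max(linked)[1])
    return [find(parent, i) for i in range(n)]

# Toy data (hypothetical): a tight triple and a separate pair in the plane.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
pair_dist = mutual_knn_pairs(points, K=2)
J, a = interactions(pair_dist, len(points))
T_c = transition_temperature(pair_dist, a, q=20)
# Made-up indicator samples c_ij from M = 4 configurations (normally from SW runs).
c_samples = [
    {(0, 1): 1, (0, 2): 1, (1, 2): 1, (3, 4): 1},
    {(0, 1): 1, (0, 2): 1, (1, 2): 1, (3, 4): 0},
    {(0, 1): 1, (0, 2): 1, (1, 2): 0, (3, 4): 0},
    {(0, 1): 1, (0, 2): 0, (1, 2): 0, (3, 4): 0},
]
G = correlations(c_samples, q=20)
labels = clusters(len(points), G)
```

In this toy run the triple forms a core through the threshold, while the weakly correlated pair is joined only by the final attachment step.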

Complexity

• Neighborhood definition is the most time-consuming part.
• An SW-step requires O(N) ∼ 0.12 CPU time.
• M steps, M ≈ 1000.
• LANDSAT, ISOLET ∼ 1 hour of CPU time.
• ISOLET ∼ 1 week with the Projection Pursuit method.
• Weakly depends on the dimensionality D (only the K-nn part).

Comments

• Based on a dissimilarity measure. No need for metric conversion.
• K-nn ◦ MST.
• Different features of the data set are uncovered at different T ∼ multiresolution approach.
• The final outcome is just a sample.

Random-Cluster (RC) Problem

Consider data points $v_i$. Define a bond variable

$$n_{ij} = \begin{cases} 1 & \text{bond is 'occupied'} \\ 0 & \text{bond is 'vacant'} \end{cases}$$

The system configuration is given by $\mathcal{N} = \{n_{ij}\}$. Random clusters are defined as the vertices of connected components of the occupied bonds. The random-cluster model is defined by

$$W(\mathcal{N}) = \frac{q^{C(\mathcal{N})}}{Z} \prod_{\langle i,j \rangle} p_{ij}^{n_{ij}} (1 - p_{ij})^{(1 - n_{ij})},$$

where $C(\mathcal{N})$ is the number of clusters in $\mathcal{N}$, $0 \le p_{ij} \le 1$, and $Z$ = const.

For independent bonds, $q = 1$. RC model ⇔ Potts model.

Joint probability distribution of spin and bond variables in the Potts model:

$$P(S, \mathcal{N}) = \frac{1}{Z} \prod_{\langle i,j \rangle} \left[(1 - p_{ij})(1 - n_{ij}) + p_{ij} n_{ij} \delta_{s_i, s_j}\right]$$

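Summing over all $S$-configurations gives the random-cluster model:

$$\sum_S P(S, \mathcal{N}) = W(\mathcal{N})$$

Let $p_{ij} = 1 - \exp\left(-\frac{J_{ij}}{T}\right)$; then summing over all bond configurations gives the Potts model:

$$\sum_{\mathcal{N}} P(S, \mathcal{N}) = P(S),$$

where $P(S) = \frac{1}{Z} \exp\left(-\frac{H(S)}{T}\right)$ is the Boltzmann distribution.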
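Both marginals of the joint distribution can be checked numerically on a tiny graph. The sketch below works with unnormalized weights (the common factor $1/Z$ is dropped on both sides) and enumerates every spin and bond configuration; the triangle graph, its couplings and the function names are hypothetical.

```python
import itertools
import math

def joint_weight(S, bonds, p):
    """Unnormalized P(S, N): product over pairs of (1-p)(1-n) + p*n*delta."""
    w = 1.0
    for (i, j), n_ij in bonds.items():
        delta = 1.0 if S[i] == S[j] else 0.0
        w *= (1 - p[(i, j)]) * (1 - n_ij) + p[(i, j)] * n_ij * delta
    return w

def count_clusters(n, occupied):
    """Number of connected components (isolated vertices count as clusters)."""
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i, j in occupied:
        parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})

def marginal_errors(n, q, couplings, T):
    """Largest deviations of the two marginals from exp(-H/T) and from q^C * prod(...)."""
    edges = list(couplings)
    p = {e: 1 - math.exp(-couplings[e] / T) for e in edges}
    all_S = list(itertools.product(range(q), repeat=n))
    all_N = [dict(zip(edges, occ))
             for occ in itertools.product((0, 1), repeat=len(edges))]
    err_spin = 0.0  # sum over bond configurations versus the Boltzmann weight
    for S in all_S:
        H = sum(J for (i, j), J in couplings.items() if S[i] != S[j])
        total = sum(joint_weight(S, N, p) for N in all_N)
        err_spin = max(err_spin, abs(total - math.exp(-H / T)))
    err_bond = 0.0  # sum over spin configurations versus the random-cluster weight
    for N in all_N:
        C = count_clusters(n, [e for e in edges if N[e] == 1])
        W = q ** C * math.prod(p[e] if N[e] else 1 - p[e] for e in edges)
        total = sum(joint_weight(S, N, p) for S in all_S)
        err_bond = max(err_bond, abs(total - W))
    return err_spin, err_bond

# Toy graph (hypothetical values): a triangle with q = 3 states.
err_spin, err_bond = marginal_errors(3, 3, {(0, 1): 1.0, (1, 2): 0.5, (0, 2): 2.0}, T=0.7)
```

Both errors should vanish up to floating-point rounding, which is the identity that lets the Swendsen-Wang algorithm alternate between bond and spin updates.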
