Hierarchical clustering in large networks

4 downloads 156 Views 913KB Size Report
Etowah. Fayette. Franklin. Geneva. GreeneHale. Henry. Houston. Jackson. Jefferson. Lamar ..... Avery. Beaufort. Bertie. Bladen. Brunswick. Buncombe Burke. Cabarrus ... McKenzie. McLean. Mercer. Morton. Mountrail. Nelson. Oliver. Pembina.
$

'

Clustering Relational Data Anuˇska Ferligoj Faculty of Social Sciences University of Ljubljana

&

% version: January 20, 2009 / 08 : 35

A. Ferligoj: Clustering relational data

2

'

$

Outline 1 2

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Cluster Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1 2

3

Clustering Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

3

9

Benefits from the Optimizational Approach . . . . . . . . . . . . . . . . . . . .

9

10

Blockmodeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

10

41

Clustering with Relational Constraints . . . . . . . . . . . . . . . . . . . . . . .

41

47 48

Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

47 48

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

1

'

$

Introduction A large class of clustering problems can be formulated as an optimizational problem in which the best clustering is searched among all feasible clusterings according to a selected criterion function. This clustering approach can be applied to a variety of very interesting clustering problems, as it is possible to adapt it to a concrete clustering problem by an appropriate specification of the criterion function and/or by the definition of the set of feasible clusterings. Both, the blockmodeling problem (clustering of the relational data) and the clustering with relational constraint problem (clustering of the attribute and relational data) can be very successfully treated by this approach.

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

2

'

$

Cluster Analysis Grouping units into clusters so that those within a cluster are as similar to each other as possible, while units in different clusters as dissimilar as possible, is a very old problem. Although the clustering problem is intuitively simple and understandable, providing solution(s) remains a very exciting activity. The field of cluster analysis • has its society, the International Federation of Classification Societies, formed in 1985 from several national classification societies; • organizes every second year its conference; • publishes two journals: the Journal of Classification (from 1984) and the journal Advances in Data Analysis and Classification (from 2007).

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

3

'

$

Clustering Problem Cluster analysis (known also as classification and taxonomy) deals mainly with the following general problem: given a set of units, U, determine subsets, called clusters, C, which are homogeneous and/or well separated according to the measured variables. The set of clusters forms a clustering. This problem can be formulated as an optimization problem: Determine the clustering C∗ for which P (C∗ ) = min P (C) C∈Φ

where C is a clustering of a given set of units, U, Φ is the set of all feasible clusterings and P : Φ → R is a criterion function.

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

4

'

$

Clustering There are several types of clusterings, e.g., partition, hierarchy, pyramid, fuzzy clustering, clustering with overlaping clusters. The most frequently used clusterings are partitions and hierarchies. A clustering C = {C1 , C2 , ...Ck } is a partition of the set of units U if [ Ci = U i

i 6= j ⇒ Ci ∩ Cj = ∅ A clustering H = {C1 , C2 , ...Ck } is a hierarchy if for each pair of clusters Ci and Cj from H it holds Ci ∩ Cj ∈ {Ci , Cj , ∅}

s

s

s

s

s

s

,

and it is a complete hierarchy if for each unit x it holds {x} ∈ H, and U ∈ H. & % y l y * 6

A. Ferligoj: Clustering relational data

5

'

$

Clustering Criterion Function Clustering criterion functions can be constructed indirectly, e.g., as a function of a suitable (dis)similarity measure between pairs of units (e.g., euclidean distance) or directly. For partitions into k clusters, the Ward criterion function X X P (C) = d(x, tC ) C∈C x∈C

is usually used, where tC is the center of the cluster C and is defined as tC = (u1C , u2C , ..., umC ) where uiC is the average of the variable Ui , i = 1, ...m, for the units from the cluster C. d is the squared euclidean distance.

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

6

'

$

As the set of feasible clusterings is finite a solution of the clustering problem always exists. Since this set is usually very large it is not easy to find an optimal solution. In general, most of the clustering problems are NP-hard. For this reason, different efficient heuristic algorithms are used. Among these, the agglomerative (hierarchical) and the relocation approach are most often used.

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

7

'

$

Agglomerative Approach The agglomerative clustering approach usually assumes that all relevant information on the relationships between the n units from the set U is summarized by a symmetric pairwise dissimilarity matrix D = [dij ]. Each unit is a cluster: Ci = {xi }, xi ∈ U, i = 1, 2, . . . , n; repeat while there exist at least two clusters: determine the nearest pair of clusters Cp and Cq : d(Cp , Cq ) = minu,v d(Cu , Cv ) ; fuse the clusters Cp and Cq to form a new cluster Cr = Cp ∪ Cq ; replace Cp and Cq by the cluster Cr ; determine the dissimilarities between the cluster Cr and other clusters. The result is a hierarchy that is usually presented by a clustering tree – a dendrogram.

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

8

'

$

Relocation Approach This approach assumes that the user can specify the number of clusters in the partition. Determine the initial clustering C; while there exists C0 such that P (C0 ) ≤ P (C), where C0 is obtained by moving a unit xi from cluster Cp to cluster Cq , or by interchanging units xi and xj between two clusters in the clustering C; repeat: substitute C0 for C .

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

9

'

$

Benefits from the Optimizational Approach The optimizational approch to clustering problem offers two possibilities to adapt to a concrete clustering problem: the definition of the criterion function P and the specification of the set of feasible clusterings Φ. Blockmodeling is searching for a clustering according to the relational data only and the solution can be obtained by an appropriatelly defined criterion function. For clustering with relational constraint an appropriately defined set of feasible clusterings is used.

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

10

'

$

Blockmodeling The goal of blockmodeling is to reduce a large, potentially incoherent network to a smaller comprehensible structure that can be interpreted more readily. Blockmodeling, as an empirical procedure, is based on the idea that units in a network can be grouped according to the extent to which they are equivalent, according to some meaningful definition of equivalence.

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

11

'

$

Cluster, Clustering, Blocks One of the main procedural goals of blockmodeling is to identify, in a given network N = (U, R), R ⊆ U × U, clusters (classes) of units that share structural characteristics defined in terms of R. The units within a cluster have the same or similar connection patterns to other units. They form a clustering C = {C1 , C2 , . . . , Ck } which is a partition of the set U. Each partition determines an equivalence relation (and vice versa). Let us denote by ∼ the relation determined by partition C. A clustering C partitions also the relation R into blocks R(Ci , Cj ) = R ∩ Ci × Cj Each such block consists of units belonging to clusters Ci and Cj and all arcs leading from cluster Ci to cluster Cj . If i = j, a block R(Ci , Ci ) is called a diagonal block. s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

12

'

$

The Everett Network a b c d e f g h i j

a 0 1 1 1 0 0 0 0 0 0

b 1 0 1 0 1 0 0 0 0 0

c 1 1 0 1 0 0 0 0 0 0

d 1 0 1 0 1 0 0 0 0 0

e 0 1 0 1 0 1 0 0 0 0

f 0 0 0 0 1 0 1 0 1 0

g 0 0 0 0 0 1 0 1 0 1

h 0 0 0 0 0 0 1 0 1 1

i 0 0 0 0 0 1 0 1 0 1

a 0 1 0 0 1 1 0 0 0 0

c 1 0 0 0 1 1 0 0 0 0

j 0 0 0 0 0 0 1 1 1 0

a c h j b d g i e f

A

B

C

A

1

1

0

B

1

0

1

C

0

1

1

h 0 0 0 1 0 0 1 1 0 0

j 0 0 1 0 0 0 1 1 0 0

b 1 1 0 0 0 0 0 0 1 0

d 1 1 0 0 0 0 0 0 1 0

g 0 0 1 1 0 0 0 0 0 1

i 0 0 1 1 0 0 0 0 0 1

e 0 0 0 0 1 1 0 0 0 1

f 0 0 0 0 0 0 1 1 1 0

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

13

'

$

Equivalences Regardless of the definition of equivalence used, there are two basic approaches to the equivalence of units in a given network (compare Faust, 1988): • the equivalent units have the same connection pattern to the same neighbors; • the equivalent units have the same or similar connection pattern to (possibly) different neighbors. The first type of equivalence is formalized by the notion of structural equivalence and the second by the notion of regular equivalence with the latter a generalization of the former.

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

14

'

$

Structural Equivalence Units are equivalent if they are connected to the rest of the network in identical ways (Lorrain and White, 1971). Such units are said to be structurally equivalent. In other words, X and Y are structurally equivalent iff: s1. s2.

XRY ⇔ YRX XRX ⇔ YRY

s3. s4.

∀Z ∈ U \ {X, Y} : (XRZ ⇔ YRZ) ∀Z ∈ U \ {X, Y} : (ZRX ⇔ ZRY)

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

15

'

$

. . . Structural Equivalence The blocks for structural equivalence are null or complete with variations on diagonal in diagonal blocks.

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

16

'

$

Regular Equivalence Integral to all attempts to generalize structural equivalence is the idea that units are equivalent if they link in equivalent ways to other units that are also equivalent. White and Reitz (1983): The equivalence relation ≈ on U is a regular equivalence on network N = (U, R) if and only if for all X, Y, Z ∈ U, X ≈ Y implies both R1. R2.

XRZ ⇒ ∃W ∈ U : (YRW ∧ W ≈ Z) ZRX ⇒ ∃W ∈ U : (WRY ∧ W ≈ Z)

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

17

'

$

. . . Regular Equivalence Theorem 1 (Batagelj, Doreian, Ferligoj, 1992) Let C = {Ci } be a partition corresponding to a regular equivalence ≈ on the network N = (U, R). Then each block R(Cu , Cv ) is either null or it has the property that there is at least one 1 in each of its rows and in each of its columns. Conversely, if for a given clustering C, each block has this property then the corresponding equivalence relation is a regular equivalence. The blocks for regular equivalence are null or 1-covered blocks.

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

18

'

$

Establishing Blockmodels The problem of establishing a partition of units in a network in terms of a selected type of equivalence is a special case of clustering problem that can be formulated as an optimization problem (Φ, P ) as follows: Determine the clustering C? ∈ Φ for which P (C? ) = min P (C) C∈Φ

where Φ is the set of feasible clusterings and P is a criterion function.

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

19

'

$

Criterion Function Criterion functions can be constructed • indirectly as a function of a compatible (dis)similarity measure between pairs of units, or • directly as a function measuring the fit of a clustering to an ideal one with perfect relations within each cluster and between clusters according to the considered types of connections (equivalence).

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

20

'

$

Indirect Approach RELATION

DISSIMILARITY MATRIX

STANDARD CLUSTERING ALGORITHMS

− → −→ − − −−→ −→

DESCRIPTIONS OF UNITS

R Q

original relation path matrix triads orbits

D

hierarchical algorithms, relocation algorithm, leader algorithm, etc.

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

21

'

$

Dissimilarities The dissimilarity measure d is compatible with a considered equivalence ∼ if for each pair of units holds Xi ∼ Xj ⇔ d(Xi , Xj ) = 0 Not all dissimilarity measures typically used are compatible with structural equivalence. For example, the corrected Euclidean-like dissimilarity v u n X u 2 2 ((ris − rjs )2 + (rsi − rsj )2 ) d(Xi , Xj ) = u t(rii − rjj ) + (rij − rji ) + s=1 s6=i,j

is compatible with structural equivalence.

s

s

s

s

s

s

,

The indirect clustering approach does not seem suitable for establishing clusterings in terms of regular equivalence since there is no evident way how to construct a compatible (dis)similarity measure. & % y l y * 6

A. Ferligoj: Clustering relational data

22

'

$

Example: Support Network among Informatics Students The analyzed network consists of social support exchange relation among fifteen students of the Social Science Informatics fourth year class (2002/2003) at the Faculty of Social Sciences, University of Ljubljana. Interviews were conducted in October 2002. Support relation among students was identified by the following question: Introduction: You have done several exams since you are in the second class now. Students usually borrow studying material from their colleagues. Enumerate (list) the names of your colleagues that you have most often borrowed studying material from. (The number of listed persons is not limited.)

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

23

'

$

Class Network - Graph g12

g63

b96

class.net

b85 b02 g09 g42

g07 g22

g24

g28 g10 b03

Vertices represent students in the class; circles – girls, squares – boys. Reciprocated arcs are represented by edges.

b51

b89

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

24

'

$

Class Network – Matrix Pajek - shadow [0.00,1.00] b02 b03 g07 g09 g10 g12 g22 g24 g28 g42 b51 g63 b85 b89

b96

b89

b85

g63

b51

g42

g28

g24

g22

g12

g10

g09

g07

b03

b02

b96

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

25

'

$

Pajek - Ward [0.00,7.81]

Indirect Approach b51 b89 b02 b96 b03 b85 g10 g24 g09 g63 g12 g07 g28 g22

Using Corrected Euclidean-like dissimilarity and Ward clustering method we obtain the following dendrogram. From it we can determine the number of clusters: ‘Natural’ clusterings correspond to clear ‘jumps’ in the dendrogram. If we select 3 clusters we get the partition C.

g42

C = {{b51, b89, b02, b96, b03, b85, g10, g24}, {g09, g63, g12}, {g07, g28, g22, g42}} s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

26

'

$

Partition into Three Clusters (Indirect Approach) g12

g63

b96 b85 b02 g09 g42

g07 g22

g24

g28 g10

On the picture, vertices in the same cluster are of the same color.

b03

b51

b89

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

27

'

$

Matrix Pajek - shadow [0.00,1.00] b02 b03 g10 g24

C1

The partition can be used also to reorder rows and columns of the matrix representing the network. Clusters are divided using blue vertical and horizontal lines.

b51 b85 b89 b96 g07 g22

C2

g28 g42 g09 g12

C3

g63

C3 g12

g09

g42

g28

C2 g22

g07

b96

b89

b85

b51

g24

g10

b03

b02

C1

g63

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

28

'

$

Direct Approach The second possibility for solving the blockmodeling problem is to construct an appropriate criterion function directly and then use a local optimization algorithm to obtain a ‘good’ clustering solution. Criterion function P (C) has to be sensitive to considered equivalence: P (C) = 0 ⇔ C defines considered equivalence.

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

29

'

$

Criterion Function One of the possible ways of constructing a criterion function that directly reflects the considered equivalence is to measure the fit of a clustering to an ideal one with perfect relations within each cluster and between clusters according to the considered equivalence. Given a clustering C = {C1 , C2 , . . . , Ck }, let B(Cu , Cv ) denote the set of all ideal blocks corresponding to block R(Cu , Cv ). Then the global error of clustering C can be expressed as X P (C) = min d(R(Cu , Cv ), B) Cu ,Cv ∈C

B∈B(Cu ,Cv )

where the term d(R(Cu , Cv ), B) measures the difference (error) between the block R(Cu , Cv ) and the ideal block B. d is constructed on the basis of characterizations of types of blocks. The function d has to be compatible with the selected type of equivalence. s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

30

'

$

Empirical blocks

a b c d e f g

a 0 1 1 1 1 1 0

b 1 0 1 1 1 1 1

c 1 1 0 1 1 1 1

Number of inconsistencies for each block

A B

A 0 1

B 1 2

d 0 0 0 0 0 0 0

e 1 0 0 0 0 1 0

Ideal blocks

f 0 0 0 0 0 0 0

g 0 0 0 0 0 1 0

a b c d e f g

a 0 1 1 1 1 1 1

b 1 0 1 1 1 1 1

c 1 1 0 1 1 1 1

d 0 0 0 0 0 0 0

e 0 0 0 0 0 0 0

f 0 0 0 0 0 0 0

g 0 0 0 0 0 0 0

The value of the criterion function is the sum of all inconsistencies P = 4.

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

31

'

$

Local Optimization For solving the blockmodeling problem we use the relocation algorithm: Determine the initial clustering C; repeat: if in the neighborhood of the current clustering C there exists a clustering C 0 such that P (C 0 ) < P (C) then move to clustering C 0 . The neighborhood in this local optimization procedure is determined by the following two transformations: • moving a unit Xk from cluster Cp to cluster Cq (transition); • interchanging units Xu and Xv from different clusters Cp and Cq (transposition).

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

32

'

$

Partition into Three Clusters: Direct Solution (Unique) g12

g63

b96

b85

b02 g09 g42

g07

g22

g24

g28 g10

This is the same partition and has the number of inconsistencies.

b03

b51

b89

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

33

'

$

Generalized Blockmodeling 1

1

1

1

1

1

0

0

1

1

1

1

0

1

0

1

1

1

1

1

0

0

1

0

1

1

1

1

1

0

0

0

0

0

0

0

0

1

1

1

0

0

0

0

1

0

1

1

0

0

0

0

1

1

0

1

0

0

0

0

1

1

1

0

C1

C2

C1

complete

regular

C2

null

complete

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

34

'

$

Generalized Equivalence / Block Types Y 1 1 1 1 X 1 1 1 1 1 1 1 1 1 1 1 1 complete Y 0 1 0 X 1 0 1 0 0 1 1 1 0 regular

0 X 0 0 0

Y 0 0 0 0 0 0 0 0 null

0 1 0 0

0 0 0 0

1 1 1 1

Y 0 1 0 0 X 1 1 1 1 0 0 0 0 0 0 0 1 row-dominant

0 0 1 0

Y 0 1 0 0 X 0 1 1 0 1 0 1 0 0 1 0 0 row-regular

0 0 0 0

Y 0 0 0 1 X 0 0 1 0 1 0 0 0 0 0 0 1 row-functional

0 1 0 0

Y 0 0 1 0 X 0 0 1 1 1 1 1 0 0 0 1 0 col-dominant

0 0 0 1

0 0 0 1

Y 0 1 0 1 X 1 0 1 0 1 1 0 1 0 0 0 0 col-regular

0 0 1 0

0 0 0 0

Y 1 0 0 0 0 1 0 0 X 0 0 1 0 0 0 0 0 0 0 0 1 col-functional

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

35

'

$

Pre-specified Blockmodeling In the previous slides the inductive approaches for establishing blockmodels for a set of social relations defined over a set of units were discussed. Some form of equivalence is specified and clusterings are sought that are consistent with a specified equivalence. Another view of blockmodeling is deductive in the sense of starting with a blockmodel that is specified in terms of substance prior to an analysis. In this case given a network, set of types of ideal blocks, and a reduced model, a solution (a clustering) can be determined which minimizes the criterion function.

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

36

'

$

Pre-Specified Blockmodels The pre-specified blockmodeling starts with a blockmodel specified, in terms of substance, prior to an analysis. Given a network, a set of ideal blocks is selected, a family of reduced models is formulated, and partitions are established by minimizing the criterion function. The basic types of models are: *

*

*

*

0

0

*

0

0

*

0

0

*

*

0

0

*

0

*

0

0

?

*

*

0

0

*

center -

hierarchy

clustering

periphery

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

37

'

$

Prespecified Blockmodeling Example We expect that center-periphery model exists in the network: some students having good studying material, some not. Prespecified blockmodel: (com/complete, reg/regular, -/null block) 1

2

1

[com reg]

-

2

[com reg]

-

Using local optimization we get the partition: C = {{b02, b03, b51, b85, b89, b96, g09}, {g07, g10, g12, g22, g24, g28, g42, g63}}

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

38

'

$

2 Clusters Solution g12

g63

b96 b85 b02 g09 g42

g07 g22

g24

g28 g10 b03

b51

b89

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

39

'

$

Model Pajek - shadow [0.00,1.00] g07 g10 g12 g22

Image and Error Matrices:

g24 g28

1

2

1

reg

-

2

reg

-

g42 g63 b85 b02 b03 g09

1

2

1

0

3

2

0

2

b51

Total error = 5 center-periphery

b89

b96

b89

b51

g09

b03

b02

b85

g63

g42

g28

g24

g22

g12

g10

g07

b96

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

40

'

$

Blockmodeling of Multi-Way Network It is also possible to formulate a generalized blockmodeling problem where the network is defined by several sets of units and ties between them. Therefore, several partitions – for each set of units a partition has to be determined. The generalized blockmodeling approach was adapted for 2way networks, and only for structural equivalence and the indirect approach for 3-way networks.

Blockmodeling of Valued Networks Till now we were treating only binary networks. Another interesting problem is the development of generalized blockmodeling of valued ˇ networks. Ziberna (2007) proposed several approaches to generalized blockmodeling of valued networks, where values of the ties are assumed to be measured on at least interval scale.

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

41

'

$

Clustering with Relational Constraints To group given territorial units into regions such that units inside the region will be similar according to selected variables (attributes) and form contiguous part of the territory was the motivation to develop clustering with relational constraints approach (Ferligoj and Batagelj, 1982 and 1983). Departements and regions of France

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

42

'

$

. . . Clustering with Relational Constraint In the case of clustering with the relational constraint, the problem is to find clusterings as similar as possible according to attribute data and also considering the ties from a relation R. The constrained clustering problem can be expressed as clustering problem where the constraints are considered in the definition of the feasible clusterings. The clustering with constraints problem seeks to determine the clustering C∗ for which the criterion function P has the minimal value among all clusterings from the set of feasible clusterings C ∈ Φ, where Φ is determined by the constraints. Set of feasible clusterings for relational type of constraint can be defined as: Φ(R) = {C : C is a partition of U and each cluster C ∈ C is a subgraph (C , R ∩ C × C) in the graph (U, R) with the required type of connectedness} s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

43

'

$

. . . Clustering with Relational Constraints We can define different types of sets of feasible clusterings for the same relation R. Some examples of types of relational constraint Φ i (R) are type of clusterings

type of connectedness

Φ 1 (R)

weakly connected units

Φ 2 (R)

weakly connected units that contain at most one center

Φ 3 (R)

strongly connected units

Φ 4 (R)

clique

Φ 5 (R)

the existence of a trail containing all the units of the cluster

Trail – all arcs are distinct. A set of units L ⊆ C is a center of cluster C in the clustering of type Φ 2 (R) iff the subgraph induced by L is strongly connected and R(L)∩(C \L) = ∅. s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

44

'

$

Some Graphs of Different Types 3s 4s   A    A      A  A  As s  1 2 a clique 4s 5s A  A  A  A  A  A  A  A U A  Us A s - s 1 2 3 weakly connected units

4s 5s  A A   A  A   A A  A A  AU s U A s s 1 2 3 strongly connected units 4s 5s A   A  A  A AU s s s 1 2 3 weakly connected units with a center {1, 2, 4}

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

45

'

$

Solving Clustering with Relational Constraints Problem The agglomerative hierarchical and the relocation approach can be adapted for solving relational constrained clustering problems (Ferligoj and Batagelj, 1982 and 1983).

Clustering with Relational Constraints for Large Data Recently Batagelj, Ferligoj and Mrvar (2007, 2008) adapted the clustering with relational constraint approach for very large networks. It is available in program Pajek .

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

46

'

$

Example: US Counties US Census 2000: V1 – Area, V2 – Population, V47 – Percent of White, V125 – Educational attainment 1990, V126 – Household income; standardized Whatcom Toole

Liberty

Pend Oreille

Okanogan

Ferry

Burke

Sheridan

Glacier Skagit

Divide

Daniels

Boundary San Juan

Bottineau

Renville

Rolette

Cavalier

Pembina

Kittson

Roseau

Towner

Lake of the Woods

Hill

Lincoln

Stevens

Bonner

Blaine

Flathead

Valley

Pondera

Walsh

Ward

Williams

Roosevelt

Phillips

McHenry

Island

Clallam

Marshall Koochiching

Ramsey

Pierce

Mountrail

Benson

Pennington

Snohomish Nelson

Chelan Jefferson Douglas

Kitsap

Richland

Sanders

Kootenai

Spokane

Lincoln

Red Lake Polk

Lake

McKenzie

Cascade

Benewah

Lewis and Clark

Thurston

Petroleum

Golden Valley

Granite

Garfield Walla WallaColumbia Asotin

Benton

Wahkiakum

Custer

Broadwater

Multnomah Hood River

Washington

Park

Carter

Hyde

Lawrence

Boise

Teton

Brule

Aurora

Jackson

Custer

Mellette

Davison Hanson McCook

Fall River

Douglas Gregory Charles Mix

Todd

Bennett

Hutchinson Turner

Gooding

Converse

Sublette Power

Jerome

Boyd

Keya Paha

Caribou Sioux

Lincoln Cassia

Knox Cherry

Sheridan

Brown

Platte

Thomas

Carbon

Cache

Loup

Arthur

McPhersonLogan

Valley

Greeley

ShermanHoward

Keith Deuel

Platte

Polk Merrick

Shasta Salt Lake Tooele

Douglas Saunders Sarpy

Chase

Hayes

Dundy

Hitchcock Red WillowFurnas

Frontier Gosper Phelps KearneyAdams Clay

Pottawattamie Cass

Juab

Carbon

Churchill Storey

Sanpete

White Pine

Carson Lyon

Douglas

Sevier

Alpine

Yolo Napa

Sonoma

Emery

Millard

Grand

El Dorado

Sacramento

Mineral

Amador

Beaver Solano

Piute

Wayne

Calaveras Marin

Tuolumne

Mono

Contra Costa San Joaquin San Francisco Alameda

Garfield

Iron

Lincoln

Esmeralda Stanislaus

San Juan

Nye

Mariposa San Mateo

Washington

Madera Santa Clara

Kane

Merced

Santa Cruz

Fresno Inyo

San Benito

Monterey Tulare Kings Clark

Mohave

San Luis Obispo

Coconino

Kern

Navajo

Apache

San Bernardino Santa Barbara

Yavapai Ventura Los Angeles

Riverside

Gila La Paz

Orange

Maricopa Greenlee San Diego

Imperial

Graham

Pinal Yuma

Pima Cochise

Santa Cruz

Adair

Poweshiek Iowa

Barry

Eaton

Scott

Ontario Seneca Livingston Wyoming Yates

Clarke Lucas

Will Henry

Warren Elkhart St. Joseph La Porte Lake Porter

Bureau Putnam

La Salle Grundy Kankakee

Lagrange Steuben Fulton Lucas Williams

Ottawa

Noble De Kalb Wood SanduskyErie Henry Starke Marshall Defiance Kosciusko Whitley Seneca Huron Paulding Allen Putnam Hancock

Ashtabula

Knox Livingston

Merrimack Strafford

Bennington Rockingham Windham Cheshire Hillsborough

Rensselaer

Essex

Albany Schoharie Franklin

Chenango

Berkshire Hampshire Greene

Tioga Chemung

Broome

Wabash Van Wert Huntington Miami Allen Wells Adams Cass

Potter

Geauga Cuyahoga

Middlesex Worcester Suffolk

Columbia

Delaware

Tioga

Susquehanna

Bradford

Hampden

Forest

Sullivan Wyoming

Elk

Clarion

Sullivan Lycoming

Cameron

Mercer

Clinton Clearfield

Armstrong

Centre

Lackawanna Pike Luzerne

Jefferson

MahoningLawrence Butler Stark

Ulster

Wayne

Venango Trumbull

Portage MedinaSummit

Crawford Wyandot AshlandWayne Richland

Mc Kean

Crawford

Lorain

Jasper Pulaski Fulton Newton

Marshall DesHenderson Moines Warren

Saratoga

Montgomery Schenectady Otsego

Cortland

Steuben

St. Joseph Branch Hillsdale Lenawee Monroe Lake

Rock Island

Mercer

Monroe Wapello Jefferson Henry

Chautauqua Cattaraugus Allegany

WashtenawWayne

Madison

Norfolk

Erie Cass

Lee Kendall

Fulton Cayuga Onondaga

St. Clair

Oakland Macomb Ingham Livingston

Jackson

Cumberland

Belknap

Sullivan

Oneida

Wayne

Monroe

Tompkins Schuyler

Cook DuPage

Grafton

York Washington

Berrien Kane DeKalb

Whiteside

Johnson

MadisonWarren Marion MahaskaKeokuk Washington Louisa

Montgomery Adams Union

Fremont Page

Clear Creek

Yuma

Davis RinggoldDecatur Wayne Appanoose

Worth

Nodaway

Decatur

Norton

Phillips

Smith

Jewell

Republic

Brown Washington Marshall Nemaha

Van Buren

Putnam Schuyler ScotlandClark

Union

Columbia Montour Carbon

Orange

Sussex Monroe

Northumberland

Plymouth Providence Windham Tolland Bristol Dutchess LitchfieldHartford Kent Bristol Newport New London Washington Middlesex Putnam New Haven Dukes Fairfield

Barnstable

Nantucket

Rockland Westchester

Passaic Bergen

Bronx Warren Morris EssexNew York Nassau Northampton Hudson Schuylkill Queens Union Kings Lehigh Richmond Somerset Hunterdon

Suffolk

Peoria

Macon

Woodford

Tazewell

Schuyler

Lewis

Adams

Daviess

Linn Livingston

McDonough Fulton

Ford

McLean

Mason

Knox

Grundy

Doniphan

Hancock

Sullivan Adair

Richardson Holt

DeKalb

Cheyenne Rawlins

Mercer Harrison

Gentry

Pawnee

Andrew

Adams

Denver

Taylor

Lee

Atchison JohnsonNemaha

Washington

Yuba Nevada

Allegan

Ogle

Orange

Rutland Windsor

Warren

Herkimer

Orleans

Genesee Erie

Racine Walworth Kenosha

Rock

Calhoun Van BurenKalamazoo Carroll

Stark Mills

Fillmore Saline

Gage Harlan Franklin Webster NuckollsThayer Jefferson

Boulder

Gilpin

Sierra

Jasper

Otoe

Phillips

Weld

Grand

Garfield

Placer

Polk

Cass

Seward

Logan

Larimer

Morgan

Uintah

Hamilton York

Hall

Rio Blanco

Sutter

Lafayette Green

Knox Lincoln Sagadahoc

Hamilton

Lapeer Genesee Clinton Shiawassee

Jo Daviess Stephenson Winnebago Boone McHenryLake Jones

Benton Linn

Kennebec Androscoggin

Snyder Columbiana Iroquois White Beaver Hardin Indiana Benton Mifflin Carroll Carroll Marion Holmes Mercer Auglaize Morrow Grant Hancock Blair Cambria Juniata Blackford Howard Jay Tuscarawas Allegheny Middlesex Knox Huntingdon Perry Dauphin Tippecanoe Logan Jefferson LebanonBerks Westmoreland Shelby Bucks Union Delaware Warren Clinton Tipton Coshocton Harrison Brooke Mercer Monmouth Delaware Washington Montgomery VermilionFountain Madison Randolph Champaign Champaign Darke Cumberland Ohio Licking Boone Hamilton Montgomery Miami GuernseyBelmont Lancaster Chester Franklin Philadelphia Somerset Bedford York Franklin Muskingum Fulton Delaware Henry Fayette Clark Madison Adams Vermillion Marshall Greene Burlington Wayne Ocean Hancock Camden Parke Marion Noble Hendricks Fairfield Preble Montgomery Perry Monroe Morgan Greene Gloucester New Castle Edgar Buchanan Putnam Fayette Caldwell Scott Arapahoe Summit Pickaway Eagle Union Pike Rush Morgan Clinton Christian Moultrie Cecil Salem Allegany Monongalia Wetzel Fayette Morgan Washington Carroll Atchison Coles Jefferson Shelby Marion Hocking Harford Atlantic Monroe Ralls MorganJohnson Cloud Frederick Chariton Garrett Preston Baltimore Tyler Butler Mineral Berkeley Jackson Shelby Vigo Washington Randolph WarrenClinton Carroll Franklin Mitchell Clay Pleasants Cumberland Pike Athens Sherman Thomas Sheridan Graham Rooks Osborne Clay Douglas Pottawatomie Platte Taylor Ross Ray Clark Greene Owen Jefferson Hampshire Kit Carson Riley Decatur Baltimore Harrison Clay Cumberland Doddridge MacoupinMontgomery Frederick Howard Jefferson Lake Wood Ritchie Kent Elbert Vinton Audrain Highland BrownBartholomew Leavenworth Pitkin Monroe Saline Howard Hamilton Winchester Montgomery Cape May Calhoun Park Ottawa Barbour Tucker Dearborn Clarke Ripley Lafayette Wyandotte Kent Jersey Loudoun Sullivan Pike Grant Effingham Queen Anne Lincoln Lincoln Geary Shawnee Lincoln Wirt Jackson Greene Jackson Meigs Clermont Jasper Crawford Jennings Hardy Wabaunsee Fayette Anne Arundel OhioBoone CampbellBrown Boone Lewis Kenton MesaDelta Logan Wallace Gove Trego Ellis Russell Montgomery Gilmer Douglas District of Columbia Johnson Upshur Jackson Warren Teller Caroline Bond Falls Church Arlington Dickinson Cooper Shenandoah Fairfax Lawrence Fairfax Prince George Cheyenne Madison El Paso Adams Scioto Jackson Calhoun Alexandria Callaway Switzerland Gallia Saline Manassas Park Warren Talbot Clay Mason Randolph St. Charles Jefferson Gallatin Johnson Pettis Manassas Chaffee Roane Prince William Fauquier Richland Lawrence Daviess Braxton Sussex Gunnison Martin Ellsworth Morris Scott Rappahannock Bracken Pendleton Moniteau Carroll GrantPendleton Knox Osage St. Louis Marion Cass St. Louis Lawrence Page Mason Washington Trimble Clinton Franklin Miami Greenup Calvert Orange Cole Osage Owen Rockingham Rush Robertson Lewis Culpeper Webster Greeley Wichita Scott Lane Ness Fremont Barton Charles Putnam Henry Clay Dorchester Lyon Franklin Kiowa Morgan St. Clair Harrisonburg ClarkOldham Gasconade Wayne Edwards Harrison Madison Stafford Cabell Montrose Wabash Pike Wicomico McPherson Pocahontas Highland Henry Washington Dubois Gibson Marion Fleming Rice Carter Boyd Greene Nicholas Crowley Nicholas Chase Monroe Jefferson Scott Benton Kanawha Fredericksburg St. Mary King George Crawford Floyd Orange Bates Jefferson Coffey Anderson Franklin Miller Worcester Bourbon Linn Rowan HarrisonJeffersonShelby Spotsylvania Augusta Pawnee Staunton Maries Bath Elliott WayneLincoln Pueblo Westmoreland Ouray Somerset Saguache Hodgeman Perry Hamilton Custer White Lawrence Waynesboro Warrick Bath Perry Montgomery Stafford Harvey Spencer Woodford Fayette Charlottesville Camden St. Clair Caroline Randolph AndersonFayette HamiltonKearny Finney PoseyVanderburgh Meade Albemarle Boone Franklin Spencer Greenbrier Clark San Miguel Bullitt Richmond Bent Prowers Otero Reno CrawfordWashington Essex Louisa Hickory MenifeeMorgan Edwards Ste. Genevieve Greenwood Woodson Allen Bourbon Vernon Phelps Jessamine St. Francois Northumberland Johnson Fluvanna Logan Hancock Clifton Forge Mercer Powell Martin Pulaski Rockbridge Henderson San Juan Breckinridge Nelson Hinsdale Butler Covington AlleghanyLexington Washington Hanover Nelson Saline Gallatin Raleigh Daviess Mingo Wolfe Gray Cedar Accomack Perry Jackson Williamson Goochland Buena Vista Hardin Madison King William King andLancaster Queen Estill Sedgwick Ford Magoffin Dolores Union Mineral Dallas Laclede Huerfano Pratt Summers Middlesex Polk BoyleGarrard Rio Grande Dent Lee Wyoming Amherst Larue Marion Stanton Grant Haskell BuckinghamPowhatan Henrico Wilson Neosho Kiowa Kingman Alamosa Botetourt Monroe McLean Cumberland RichmondNew Kent Breathitt Floyd Crawford Barton Iron Madison Hardin Webster Craig Ohio Grayson Pike Union Johnson Elk Lincoln Dade Gloucester Mathews Jackson Mercer Las Animas Lynchburg Owsley Chesterfield Appomattox McDowell Cape Girardeau Charles’ Knott Rockcastle Amelia CrittendenHopkins Taylor Casey Bedford Northampton Pope Costilla Texas Reynolds MontezumaLa Plata Baca Bollinger Giles James’ Bedford Hart Green Buchanan Salem Hopewell Williamsburg Webster Roanoke Wright Roanoke Colonial HeightsYork Greene Sumner Meade Clark Cowley Barber Muhlenberg Campbell Prince Edward Perry Archuleta Petersburg Livingston PulaskiMassac Conejos Butler Edmonson Comanche Harper Montgomery Labette CherokeeJasper Morton Stevens Seward Montgomery Caldwell Shannon Prince George Poquoson Chautauqua Alexander Clay Nottoway Tazewell Bland Wayne Radford Adair Letcher Dickenson Pulaski Laurel Lawrence Leslie Surry Newport News McCracken Ballard Dinwiddie Pulaski Scott Hampton Lyon Franklin Wise Metcalfe Russell Warren CharlotteLunenburg Carter Christian Barren Douglas Norton Russell Floyd Wythe Sussex Marshall Newton Norfolk Christian Isle of Wight Stoddard Knox Carlisle Todd Logan Harlan Pittsylvania Ottawa Smyth Portsmouth Mississippi Trigg Halifax Nowata Wayne Harper Grant Kay Brunswick Southampton Allen Whitley Bell Howell Cumberland McCreary Washington Simpson Craig Cimarron Beaver Texas Carroll Lee Scott Virginia Beach Suffolk Chesapeake Alfalfa Monroe Clinton Graves Greensville Mecklenburg Emporia Barry Stone Washington South Boston Patrick Martinsville Henry Woods Oregon Ripley Butler Grayson Franklin Galax Taney Ozark Hickman McDonald Calloway Bristol Colfax New Madrid Danville Osage Fulton Macon Clay Pickett Taos Stewart Montgomery Sullivan San Juan Robertson Woodward Claiborne Hancock Alleghany Rio Arriba Sumner Johnson Gates Hawkins Delaware Ashe Surry Stokes Rockingham Hertford Trousdale Caswell Person Scott Campbell Garfield Noble Fulton Union Jackson Pawnee Camden Obion VanceWarren Northampton Clay OvertonFentress Rogers Major Pasquotank Dunklin Lake Carroll Boone Marion Currituck Randolph Carter Benton Union Mayes Weakley Henry Granville Dallam Sherman Hansford Ochiltree Lipscomb Houston Washington Halifax Smith Grainger Watauga Cheatham Baxter Perquimans Ellis Wilkes Pemiscot Hamblen Sharp DavidsonWilson Chowan Greene Yadkin Dickson Putnam Morgan Tulsa Forsyth Greene Unicoi Avery Anderson Payne Benton GuilfordAlamance Lawrence Izard Orange Durham Franklin Dyer Jefferson Madison Bertie Humphreys Mora Gibson Washington Mitchell Dewey Knox DeKalb Carroll Nash Wagoner Cumberland Cocke KingfisherLogan Caldwell White Newton Searcy Stone Edgecombe Alexander Davie Creek Cherokee Adair Yancey Williamson Madison Martin Washington Los Alamos Blaine Rutherford Craighead Roane Hutchinson Hartley Moore Roberts Hemphill Crockett Cannon Harding Hickman Sevier Wake Tyrrell Dare Iredell Burke Davidson Loudon Lauderdale McDowell Independence Wilson Sandoval Roger Mills Randolph Chatham Lincoln Mississippi Catawba Rowan WarrenVan Buren Blount Custer Perry Maury Jackson Buncombe Okmulgee Rhea Decatur HaywoodMadisonHenderson Pitt Van Buren Poinsett Bledsoe Beaufort Muskogee Crawford Johnson Haywood Johnston Cleburne Lewis Canadian Oklahoma Santa Fe Meigs Hyde Tipton BedfordCoffee Greene Swain Franklin Lincoln McKinley Marshall Lee Okfuskee Sequoyah San Miguel McMinnMonroe Pope Chester Oldham Carson Gray Potter Wheeler Rutherford Harnett Grundy Cleveland Wayne Sequatchie McIntosh Cabarrus Graham Henderson Stanly Montgomery Cross Polk GastonMecklenburg Washita Moore White Beckham Conway Moore Jackson Haskell Wayne Lawrence Logan Transylvania Giles Hamilton Lenoir Craven Caddo Shelby Fayette Hardeman Sebastian McNairyHardin Woodruff Lincoln Franklin Marion Quay Bradley Seminole ClevelandPottawatomie Pamlico Cherokee Macon Crittenden Polk Faulkner McClain Clay Cumberland Bernalillo Grady Yell Hughes Jones Cherokee Union AnsonRichmondHoke St. Francis York Deaf Smith Randall ArmstrongDonley Collingsworth Cibola Pittsburg Le Flore Perry SampsonDuplin Greer Towns Rabun Carteret Spartanburg Scott Alcorn Catoosa Latimer Lauderdale Kiowa Pickens DeSoto Greenville Scotland Union Dade Whitfield Tippah Fannin Benton Prairie Walker Murray Guadalupe Limestone Lee Harmon Lancaster Oconee Pulaski Lonoke Marshall Pontotoc Madison Jackson Tishomingo Colbert Onslow Gilmer Valencia Garvin Union Chester Monroe Tunica Tate White Torrance Saline Curry Comanche Habersham Robeson Bladen Prentiss Jackson Coal Chesterfield Garland Lumpkin Stephens Lawrence Marlboro Montgomery Parmer Castro Swisher Briscoe Hall Childress DeKalb Pender Anderson Laurens Gordon Morgan Stephens Murray Union Dawson Pickens Polk Franklin Chattooga Dillon Atoka Pushmataha DeBaca Phillips Tillman Franklin Fairfield Kershaw Panola Lafayette Hart Marshall Banks Floyd Hot Spring Hardeman Johnston Darlington Hall Newberry Quitman Lee Itawamba Carter Cotton Grant JeffersonArkansas Coahoma Abbeville Bartow Cherokee Cherokee Pontotoc New Hanover Columbus Forsyth Greenwood Lee Wilbarger Pike JacksonMadison Elbert Marion Winston Cullman Brunswick Roosevelt Bailey Lamb Motley Cottle Floyd Hale Jefferson McCurtain Clark Howard Florence Yalobusha Socorro Wichita Etowah Choctaw Blount Barrow Richland Polk Saluda Marshall Marion Foard Lincoln Dallas Sevier Gwinnett Clarke Calhoun Horry Tallahatchie Paulding Lexington Bryan Cobb Chickasaw Sumter Love Catron Cleveland Monroe Oglethorpe McCormick Fulton Oconee Bolivar Desha Clay WilkesLincoln Lamar DeKalb Walton Grenada Calhoun Haralson Walker Edgefield Hempstead Lincoln Little River Fayette Cleburne MontagueCooke Nevada St. Clair Douglas Grayson Calhoun Clarendon Lamar Clay Rockdale Morgan Red River Sunflower Carroll Knox Baylor Archer Fannin Cochran Hockley Lubbock Crosby Dickens King Webster Williamsburg Drew Ouachita Taliaferro Greene Newton Jefferson Calhoun Columbia Aiken Leflore Lowndes Clayton McDuffie Montgomery Bowie Henry Georgetown Carroll Orangeburg Bradley Warren Oktibbeha Fayette Talladega Richmond Delta Coweta Choctaw Miller ButtsJasperPutnam Tuscaloosa Shelby Chaves Randolph Clay Barnwell ChicotWashington Heard Spalding Pickens Hancock Lafayette Columbia Jack Bamberg Glascock Wise Denton Collin Ashley Union Titus Garza Kent StonewallHaskell Throckmorton Lynn Yoakum Terry Young Franklin Hopkins Berkeley Holmes Humphreys Hunt Morris Cass Winston Noxubee PikeLamar Attala Dorchester Baldwin Troup Meriwether Sierra Bibb Jefferson Burke Monroe Jones Washington Camp Allendale Coosa Chambers Rockwall Sharkey Chilton Rains Upson Colleton Greene Bibb Claiborne Union Tallapoosa Wilkinson WestEast Carroll Lea Marion Jenkins Parker Tarrant Dallas Wood Carroll Screven Hampton Yazoo Morehouse Charleston Leake NeshobaKemper Dawson Borden Scurry Fisher Hale Gaines Jones Shackelford Stephens Palo Pinto Harris Talbot Webster Issaquena Upshur Otero Crawford Twiggs Johnson Sumter Madison Bossier Caddo Lincoln Kaufman Van Zandt Beaufort Perry Elmore Lee Emanuel Harrison Taylor Peach Grant Gregg Autauga Ouachita Muscogee Houston Eddy Scott Newton Lauderdale BleckleyLaurens Dona Ana Macon Smith Richland Madison Treutlen Candler Bulloch Jasper Hood Dallas Chattahoochee Bienville Warren Marion Macon Effingham Johnson Jackson Hinds Rankin Martin Howard Mitchell Nolan Andrews Taylor Ellis Callahan Eastland Schley Russell Marengo Pulaski Montgomery Erath Somervell Luna Panola Lowndes Henderson Dodge Dooly MontgomeryEvans Franklin Rusk WheelerToombs Caldwell De Soto Bullock Stewart Webster Navarro Red River Hidalgo Sumter Tattnall Wilcox Chatham Claiborne Smith Jasper Clarke Choctaw Tensas Wilcox Telfair Hill Bryan Comanche Winn Crisp Simpson Bosque Coke BarbourQuitman Winkler Ector Midland Glasscock Copiah Liberty Pike Runnels Loving Sterling Jeff Davis AndersonCherokee TerrellLee Randolph Long Shelby Ben Hill Natchitoches Coleman Brown Butler Crenshaw Jefferson Hamilton Appling Freestone Turner El Paso Clay Jones Wayne Irwin Coffee Covington Catahoula Grant La Salle Clarke Jefferson Davis Worth Wayne Bacon McLennan Limestone Henry Nacogdoches Lincoln Lawrence CalhounDougherty Monroe McIntosh Sabine Mills Adams Franklin Conecuh Tift Ward Washington Coffee Dale Tom Green Coryell Reeves San Augustine Sabine Crane Upton Concordia Pierce Reagan Concho Culberson Hudspeth Leon Irion Atkinson Early Angelina Baker Covington Houston Mitchell Falls Berrien Glynn McCulloch San SabaLampasas MarionLamar Greene Rapides Brantley Wilkinson Amite Pike Walthall Cook Colquitt Forrest Perry Houston Miller Escambia Vernon Trinity Geneva Avoyelles Bell Robertson Ware Lanier Madison Camden Schleicher Menard Seminole DecaturGradyThomasBrooks Clinch George Holmes Washington Lowndes West Feliciana East Feliciana St. Helena Polk Tyler Milam Walker Jackson Stone Jeff Davis Pointe Coupee Evangeline Mobile Baldwin Pearl River Echols Burnet Mason Newton Charlton Pecos Llano Jasper Okaloosa Crockett Santa Rosa Allen Williamson Brazos Escambia Beauregard Tangipahoa Walton Washington San Jacinto St. Landry Nassau Gadsden Grimes Jackson Baton Rouge Burleson Sutton Kimble WestEast Baton Rouge Harrison Leon Madison Hamilton St. Tammany Livingston Hancock Calhoun Duval Jefferson Baker Montgomery Travis Gillespie Hardin Bay Lee Liberty Calcasieu St. Martin JeffersonAcadia Davis Iberville Blanco Washington Terrell Columbia Lafayette Ascension Liberty St. John the Baptist Suwannee Wakulla Orange Bastrop Orleans Hays Lafayette Kerr Union St. James Waller Taylor Edwards Presidio Vermilion Iberia BradfordClay St. Johns Kendall Gulf Fayette Jefferson Cameron Franklin Real Caldwell Assumption St. Charles Austin Harris Brewster Comal St. Bernard Val Verde Gilchrist St. Mary Bandera Jefferson Chambers Alachua Guadalupe Colorado Putnam Dixie Fort Bend Lafourche Plaquemines Flagler Gonzales Terrebonne Bexar Medina Galveston Uvalde Kinney Lavaca Wharton Levy Marion Brazoria Wilson DeWitt Volusia Jackson Atascosa Karnes Matagorda Zavala Frio Citrus Lake Victoria Seminole Goliad Maverick Sumter Hernando Orange Calhoun Dimmit Bee Live Oak La Salle McMullen Pasco Refugio Brevard Jackson

Routt

Duchesne

Wasatch Utah Eureka

Buffalo

Ionia

Ottawa

Clayton Grant

Bremer

Tama

Waldo

Caledonia

Lewis

Oswego

Sanilac

Saginaw Niagara

Clinton

Lancaster

Moffat

Washoe

Dawson

Perkins

Montcalm Gratiot

Kent

Black Hawk Buchanan Delaware Dubuque

Marshall

Colfax Dodge Washington

Butler

Lincoln Sedgwick

Daggett

Colusa

Franklin Butler

HamiltonHardin Grundy

Addison

Carroll

Huron

Tuscola Muskegon

Muscatine

Cheyenne

Summit

Glenn

Webster

Gladwin

Jefferson

Mecosta Isabella Midland Newaygo

Washington Ozaukee

JeffersonWaukesha Milwaukee

Jackson

Nance

Custer Kimball Morgan Davis

Pershing

Calhoun

Crawford Carroll Greene Boone Story

Harrison Shelby Audubon Guthrie Dallas

Laramie

Uinta

Weber Elko

Butte

Sac

Osceola Clare

Arenac

Bay Oceana

Dodge

Dane

Iowa

Cedar

Banner

Box Elder

Humboldt

Humboldt CherokeeBuena Vista Pocahontas Wright

Oxford

Chittenden

Marquette Green Lake Fond du Lac Sheboygan

Richland Sauk

Burt

Hancock

EssexCoos

Essex Mason Lake

Juneau Adams

Houston

Thurston

MadisonStantonCuming

Garfield Wheeler Boone

Garden

Rich

Siskiyou

Blaine

Morrill

Albany

Sweetwater

Modoc

St. Lawrence

Kewaunee

Manitowoc Winnebago Calumet

Winneshiek Allamakee Crawford Chickasaw Fayette

Plymouth

Dakota Woodbury Ida

Monona Hooker

Scotts Bluff Del Norte

Lake

Fillmore

Mitchell Howard

Hancock Cerro Gordo Floyd

Washington

Orleans

Lamoille

Franklin

Washington

Monroe La Crosse

Freeborn Mower

Winnebago Worth

Wayne

Goshen Grant

Mendocino

Faribault

Palo Alto

Dixon

Pierce Antelope

Box Butte

Franklin

OBrien’ Clay

Franklin Grand Isle Clinton

Alcona

Manistee Wexford Missaukee Roscommon Ogemaw Iosco

Cedar

Holt

Rock

Bear Lake

Oneida

Lander

Martin

Osceola DickinsonEmmet

Sioux

Kossuth

Union

Dawes

Bannock

Twin Falls

Plumas

Jackson

Lyon

Yankton Bon Homme Clay

Natrona Minidoka

Owyhee

Jackson

Tehama

Nobles

Waupaca Brown Outagamie

Waushara

Lincoln

Niobrara Lincoln

Josephine

Lassen

MinnehahaRock

Portage

Columbia Bingham Fremont

Trinity

Wood

Winona

Vernon Tripp

Shannon

Bonneville

Blaine

Lake

Pepin Jackson Buffalo Trempealeau Wabasha

Olmsted

Franklin

Alpena Otsego Montmorency

Grand Traverse KalkaskaCrawfordOscoda

Benzie

Goodhue Le SueurRice

Brown Nicollet

Penobscot

Presque Isle

LeelanauAntrim

Shawano

Blue Earth Cottonwood Waseca Steele Dodge Watonwan

Moody PipestoneMurray

Somerset Cheboygan

Charlevoix Door

Marathon Eau Claire Clark

Lyman

Weston

Washakie Hot Springs

Elmore

Malheur Harney

Humboldt

Lake

Emmet

Marinette

Oconto Menominee

Dunn

Pierce

Dakota

Redwood

Lincoln Lyon Brookings

Kingsbury

Sanborn Miner

Piscataquis

Langlade

Taylor Chippewa

St. Croix Washington Hennepin Ramsey

Scott

Camas Ada

Douglas Coos

Klamath

Jerauld

Delta

Rusk

Barron

Anoka

Wright

Mackinac

Florence Forest

Oneida

Lincoln

Jones

Jefferson Madison Teton

Butte

Buffalo

Sawyer

Menominee

Chisago Polk

Sherburne

Hand

Hughes

Washburn

McLeod Carver Renville Sibley

Beadle

Haakon

Pennington

Payette

Isanti

Stearns

Kandiyohi Meeker

Yellow Medicine

Campbell Johnson

Canyon

Pope

Lac qui Parle Chippewa

Deuel Hamlin

Stanley Park

Lane

Stevens

Grant Codington Clark

Gem

Burnett

Price

Swift

Fremont

Clark

Chippewa Schoolcraft Dickinson

Kanabec Mille Lacs

Big Stone

Day

Faulk Spink

Alger

Iron Vilas

Pine

Morrison

Douglas

Benton Roberts

Edmunds

Potter

Sully Meade

Aroostook

Luce

Marquette

Gogebic Ashland Iron

Todd

Brown Walworth

Dewey Ziebach

Crook

Washington

Deschutes

Otter Tail

Perkins

Butte

Custer

Baraga

Bayfield Douglas

Traverse

Marshall

Harding

Powder River

Sheridan Big Horn

Grant

Sargent

McPherson

Carbon

Yellowstone National

Curry

Campbell

Valley

Crook

Houghton Carlton

Aitkin

Wilkin Richland McIntosh

Beaverhead Lemhi

Linn

Dickey

Emmons

Grant

Big Horn Gallatin

Madison

Cass

Sioux

Corson

Adams Baker

Keweenaw

Becker

Clay

Crow Wing

Grant

Adams

Sweet Grass Stillwater

Union

Wheeler

Ransom

Rosebud

Yellowstone

Wallowa

Umatilla

Morrow

ShermanGilliam

Clackamas Marion

Jefferson

LaMoure

Silver Bow

Wasco

Yamhill

Polk

Benton

Hettinger

Fallon Bowman

Deer Lodge

Idaho

Tillamook

Cass

Ontonagon

Klickitat

Lincoln

Barnes

Wadena Logan

Treasure

Jefferson Ravalli Skamania

Clark

Stutsman

Stark

Slope

Musselshell

Nez Perce Lewis

Cowlitz Columbia

Kidder

Morton Wheatland

Lake

Mahnomen Hubbard

Burleigh

Meagher

Clearwater

St. Louis Itasca

Traill Norman

Billings Golden Valley

Prairie

Latah

Yakima Franklin

Clatsop

Wibaux

Powell Whitman

Lewis Pacific

Steele

Oliver

Judith Basin

Adams

Griggs

Mercer

Fergus

Missoula

Mineral Pierce

Clearwater Foster

Dunn Dawson

Kittitas

Grays Harbor

Wells

Sheridan McLean

Garfield Grant

Cook

Eddy

McCone

Shoshone King Mason

Beltrami

Grand Forks

Chouteau

Teton

Shelby

Marion

Brown Cass

Logan

De Witt

Piatt

Menard

Macon

Sangamon

Douglas

Aransas Polk

San Patricio

Osceola

PinellasHillsborough Nueces

Webb Duval

Indian River

Jim Wells Hardee Kleberg

Manatee

St. Lucie Highlands Okeechobee

Sarasota DeSoto Martin

Jim Hogg Brooks Zapata

Kenedy

Glades Charlotte

Hendry

Palm Beach

Lee

Starr Hidalgo

Willacy

Cameron

Collier

Broward

Dade Monroe

Pajek

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

47

'

$

Software Most of described clustering procedures are implemented in Pajek – program for analysis and visualization of large networks (Batagelj and Mrvar, 1998). It is freely available, for noncommercial use, at: http://pajek.imfm.si

s

s

l y

s

y

s

s

,

s

&

% * 6

A. Ferligoj: Clustering relational data

48

'

$

Conclusion The optimizational approach to the clustering problem can be applied to a variety of very interesting clustering problems, as it alows possible adaptations of a concrete clustering problem by an appropriate specification of the criterion function and by the definition of the set of feasible clusterings. Both the blockmodeling problem and the clustering with relational constraint problem are such cases. There are several possible further developments in blockmodeling, e.g., efficient direct approach for 3-way blockmodeling, blockmodeling for large networks, and dynamic blockmodels.

s

s

l y

s

y

s

s

,

s

&

% * 6

Suggest Documents