The financial support provided by the Graduate School of. Industrial Administration at Carnegie Mellon University and the International. Center for Management ...
0= 0
c
*.0G
*
0
06
0d
>~
ý0
on
r..
0d
j
00
x0..
0*.0 u0
000* 0 *O 0* *4 0
.0
*0
3
.
NOTICE
SCLI
THIS
DOCUMENT
IS
BEST
QUALITY AVAILABLE. THE COPY FURNISHED TO DTIC CONTAINED A SIGNIFICANT NUMBER OF
DO WHICH PAGES REPRODUCE LEGIBLY.
NOT
-Y
•
"
-
j
-
.2
I
met
Carnegie-Mellon University r*TSSBUlMJ
Ii
, PENNMYLVANIA 15213
GRADUATE SCHOOL OF INDUSTRIAL ADMINISTRATION WILLIAM LARIMER MS..ON, PIXNDER
S~CLEARINGHOUSE
Reproduced by the for Federal Scientific & Technical Information Springfield Va. 22151
s alaI t ':r: cmd P .r-ubi* LTbix dtxCX.. ý hta bou (
.
DE
'
m
W-
Management Sciences Research Report No.
183
CLUSTER ANALYSIS AND MATHEMATICAL PROGRAMMING
by
M.
R.
Rao*
October 1969
*Graduate School of Industrial Administration Carnegie-Mellon University and International Center for Management Sciences at the Center for Operations Research & Econometrics Universite Catholique de Louvain Now at University of Rochester
This report was prepared Sciences Research Group, NONR 760(24), NR 047-048 Reproduction in whole or U. S. Government.
as part of the activities of the Management Carnegie-Mellon University, under Contract with the U. S. Office of Naval Research. in part is permitted for any purpose of the
Management Sciences Research Group Graduate School of Industrial Administration Carnegie-Mellon University Pittsburgh, Pennsylvania 15213
ACKNOWLEDGEMENT
The author is discussions.
indebted to Professor W. W. Cooper for many helpful
The financial support provided by the Graduate School of
Industrial Administration at Carnegie Mellon University and the International Center for Management Sciences at the Center for Operations Research & Econometrics affiliated with Universite Catholique de Louvain is gratefully acknowledged.
ABSTRACT Cluster analysis involves the problem of optimal partitioning of a given set of entities into a pre-assigned number of mutually exclusive and exhaustive clusters.
Here the problem is
formulated
in two different ways with the distance function (a) of minimizing the within groups sums of squares and (b) minimizing the maximum distance within groups.
These lead to different kinds of linear and non-linear
(0-1) integer programming problems. discussed.
Computational difficulties are
.•i'
S..!
CLUSTER ANALYSIS AND MATHEMATICAL PROGRAMMING -4M.R.RAO
INTRODUCTION
Cluster analysis involves the problem of optimal partitioning of a given set of entities into a number (pre-assigned).of mutually The criterion for optimality exclusive and exhaustive clusters. depends heavily upon the application in It
is
which it
is
to be used.
not the purpose of this paper to discuss the relative merits A discussion of this can be found in
of the various criteria. "Sokal and Sneath [1]. our analysis here,
In
we confine ourselves to distance based
cluster analysis in which a distance measure between the various -enti.ties is available. This type of cluster analysis has received increasing attention recently. given by Majone
A detailed discussion of this is
[2] and Majone and Sandy [3].
Even
though a
distance ".measure between the various entities is known, the cridepends upon the intended terion for optimal partitioning still application.
Again,
it
is
beyond the scope of this paper to dis-
cuss the relative merits of the various possible criteria but references can be found in £lgorithm is
Jensen
[5] where a dynamic programming
given for minimizing the within - groups sums of
squares.
One of the purposes of this paper is
to consider the crite-
rion of minimizing the within groups sums of squares and show how the problem can be formulated as a mathematical programming problem.
In
the general case,
the formulation leads to a fractional
non-linear
See also [4]for an application of distance based cluster analysi-..
2.
0-1 programming problem with constraints and there does not appear to be any computationally problem.
Fortunately,
ly tractable in sed in
efficient procedure for solving such a
the problem does appear to be computational-
some variants and special cases which are discus-
detail. An alternate criterion,viz.,
within groups,
is
minimize the maximum distance
also considered.
leads to a linear integer (0-i)
The formulation in
this case
programming problem which can be
solved by any one of the known techniques
[6,7].
A simple but
efficient algorithm is given to solve this problem when the number of pre-assigned groups is restricted to be only two. FORMULATION OF THE PROBLEM. The following definitions are used in d4. is N is
the formulation
a metric distance between entities i and
.
j.
the number of entities
M id the number of pre-assigne*d groups. We give a formulation of the problem under different criteria. I)
Minimize (maximize) the within (between)-groups sums of squares Let Xik be equal to 1 or 0 depending upon whether the ith entity is
in
group k or not.
Let d.. be a Euclidean metric. o
The problem can now be written as
M N-1 N "Min k=E [( E * i=1" j=i+l
2 iX d i Xik
N kxV i=I
x -k]
(I)
M Subject to
E k=1
xik
1
Xik > 0 and integer valued
for i
=
for i
=
k
÷.
1,2,...N
1,2,...M
(2)
3.
a fractional non-linear 0-1 programming problem which However two special gases -is- very difficult to solve in general. of this appear to be less difficult and they are discussed next. This is
each group is
a) The number of eptitie& in
in
M I
specified
Sometimes the number of entities in each group is specified Let k be the number of entities in. group k such that advance. k= N. n.
k=J function (1) now becomes
The objective
Min
I -M 2(3) d. x.i Z ( E E k=J nk .i=I j=i+l -
kN xjk
and -we have an additional set of constraints, ".i=
N Z xik ='k (4)
k = 1,2,...M
a non-linear 0-1 programming problem with conThis is still However since all nk are specified in adstraint set (2) and (4). vance,
we do not have a fractional objective as in
(1) and there
are at least two possible approaches to attempt to solve this problec approach is to treat this as a constrained non-linear The first boolean programming problem and use the methods outlined by Hammer and Rudeanu [8].
to linearize the obje-tive
Another approach is
function at the expense of increasing the number of constraints and solve the resulting problem by any of the known techniques for linear integer programming.
The 0-1 programming problem to be
solved is as follows N N-1 M I E - ( £E Min E k=1
k 2 d.. y..) 13 1) -nk i=1 j=i+l k
Subject to Xik + xj
-
Yij
c 1
i
=
1,2,...N-1
j
i+1,i+2,...N
k=
1,2,...M
(6)
N i
I:Mx z x ik
k= 1,2k...M
n
.I ik
k
= 1
=1,2,....N
k=1
All
In
i
and
0 and integer valued.
this formulation,
the number of constraints increases rapidly
with N and M and hence this approach is
computationally useful only
for small values of N and M. *
b) The number of pre-assigned groups is
equal to two
A method for cluster analysis suggested by Edwards and CavalliSforza [9] involves dividing the entities into two most-compact clusters and repeating the process sequentially.
This,
in. our
notation implies that at each stage of the process we have M = 2. In vwith
this case,
a better formulation of the problem is
possible
the following notation
SLet x. = I or 0 depending upon whether the ith entity is e
in
group
1or 2.
The problem can be written as
Mini( N-1 E
NE
dý
x
x.)
r x.[
I-1 X E
NE
2 dý(O-x )(1-x'}
i=1 j=i+i N /(N-
i=1
xj)]
(7)
x.1 = 0 or I for all i. This
is
a fractional non-linear 0-1 programming problem with no
one approach to solve this [•].
also a boolean programming problem an
is
ccnstraints.This
additional
.appears to be promising;
It
so far no computational
1) In
the
first
and of these the
is
we rewrite
in
second one
should be pointed out however,
experience
approach,
to use the methods given
problem is
We outline two other methods
that
available.
the objective
function
*)
as follows
-
x.)( E E i=i j=i+1
r
d
referi'ed to as (8w).
i=i1
i=i j=i+1
N (1-xi)(l-x.))]/( E x.)(N 1 i=1i
function
(8)
N =E
(81
x.)
without the denominator be
We note that the denominator in
(8)
is
a
if
maximium
N X i=1
[•]
+
I
d. 13 Let the objective
x. x)
.
NY
N-1
N
N
N-1
N
Min [(N
x.
[I] 2
=
where
the least integer greater than or equal to N!2.
is
function
minimize the objective
We first
in
methods given
£ x. "i=1 the problem is
If
[8].
or
then solved.
[N] 2
by using the
obtain
we'are fortunate'to
[I 2
=
(8')
-
Otherwise,
N Z
let
x.
m < [].
i=1
Now,
we know that the only possible solutions that'may improve
the value of (8)
should necessarily
satisfy
the
following relation
N i= I
x
9
43.
There this
is
no loss
in
generality.in
were not true we
assuming
could use N -
N2N Y
x. i 1
that m < instead of
since E
i=i1
xi.
if
."6.
The next step is
to consider
added constraint
(9).
If
(8')
the current
"for the objective function (8) now, is
it'is
solution has a better
becomes
'either
so high that
Z x. ='[11 2 . i=1 the value
x. to be equal to V] or 2
I
The process.
or the value of the function for the objective function
would be higher than the current best value even if N N i=
value
than the best value found till
retained as the best value found thus far.
repeated until
(8')
and solve the problem with the
(8)
we assume
-2
The efficiency of this approach depends heavily upon the ability
to find the solution of (8')
present,
with the restriction
(9).
At
there does not appear to be efficient methods for doing
this, 2-) An alternate
approach which appears to be promising is
branch and bound methods to solve this problem. that the problem is
to use
First we note
very similar to the problem of "selecting
an optimum subset" described by Beale [10] where a tree-search (branch and bound) in in
procedure is
outlined.
Using the terminology
[10], we briefly describe the procedure first and then indicate some detail how the various steps can be performed for this
problem. The root of the tree corresponds to a partial solution in which all the elements2 are free to be assigned.
We select one
element and we now have two possibilities of assignment. We can assign the element to group I or group 2. These two possibilities correspond to the two branches emanating from the root.
For con-
venience,
we will always write the assignment of an element to group 1 as the branch to the right and always branch to the right 2
II
F For convience
we use
!,.
the word elenent
..
to refer to
-
an entity.
...
".first. Thus,
we have a partial solution
at any node of the tree,
with some elements assigned to group. 1, while others are yet to be assigned. in
which all
some assigned to group 2
A complete solution is
one
the elements have been assigned and this corresponds
to an end of the tree. Before we branch from a particular node, bound for the objective function.
If
we compute a lower
the lower bound is
greater
than or equal to the best complete solution found so far, we do not branch further from this node. In this case,'we back-track along the tree until we reach a node with an unexplored branch to If no such node exists,.the problem is solved and the best the left. complete solution found so far is other case,
the optimal solution.
where the lower bound is
In the
less than the value of the
current best solution, we need to branch further. variable not yet assigned and branch to the right.
We select a We continue
branching until we reach an end of the tree or the lower bound test indicates that we need not branch further. When an end of the tree is reached, it would represent an improvement in the objective function and hence it is retained as the current best solution.
From the end of the tree,
we then back-track along the
tree until we reach a node with an une'xplored branch to the left. SELECTION
OF AN ELEMENT AT A NODE.
At any particular.node,
we have a set (possibly empty) of-
elements in group I and a set (possibly empty) of elements in group 2. If we need to branch further, we select an element not yet assigned and branch to the right.
In
our procedure,
ponds to assigning this selected element to group 1.
this
corres-
So the se-
lection of an element is made such that the sum of the distances-between this element and all other elements already in group I is minimum.
vi 8. COMPUTATION OF LOWER BOUND AT A NODE.
first
-We
give some
notations
for convenience.
At a
given node
let
xi = 1 -for all
i assigned to group I
x . = 0 for all
i assigned to group 2
{i/xi = 1),
Let N,
=
Let.n1
and n 2 be1 21 the number of elements in
Let n 3
= N -
nI
n2
-
to any group and let Let K1
I i,j&N
d..
and N 2
{i/xi (
= 01.
be the number
N
and N2repciey respectively.
of elements
not yet assigned
N3 be a set consisting of these n 3 elements. and K2
1
=
i
Z jEN
d..
13
2
Let D be a N x N symmetric matrix giving the. distance between
Let F be a'n
entities.
between an element in
x n1
3
the
sub-matrix of D giving the distance
N3 and an element in
N1 .
Let ri be the sum
of the elements in row i of matrix Y. Let.1 be a n 2 x n submatrix of D giving the distance between an Let R i be the sum of the element in N22 and 1an element in N. elements in
row i of the matrix R.
We assume that matrices R.and F are arranged in
such a manner
that
j.
for i
E. .1 FY
for
Let C be a ni x n 3
3
3
i
symmetric