Algebraic Combinatorics in Mathematical Chemistry. Methods and Algorithms. III. Graph Invariants and Stabilization Methods. (Preliminary Version)
Gottfried Tinhofer, Institut für Mathematik, Technische Universität München, D-80290 München, Germany
Mikhail Klin, Department of Mathematics and Computer Science, Ben-Gurion University of the Negev, 84105 Beer-Sheva, Israel
February 18, 1999
Supported by the grant I-0333-263.06/93 from the G.I.F., the German-Israeli Foundation for Scientific Research and Development.
Abstract This paper deals with graph invariants and stabilization procedures. We consider colored graphs and their automorphisms and we discuss the isomorphism problem for such graphs. Various global and local isomorphism invariants are introduced. We study canonical numberings, invariant partitions, stable and equitable partitions and algorithms for stabilizing partitions.
Contents
1 Introduction
2 Isomorphisms and automorphisms of colored graphs
3 Canonical numberings
4 Graph invariants, vertex and arc invariants
5 Partitions and their stabilizations
6 The total degree partition
7 The Weisfeiler-Leman closure
8 k-invariants and multidimensional Weisfeiler-Leman stabilization
9 Concluding remarks
Acknowledgement
References
1 Introduction
1.1 Graphs are excellent models for a wide variety of structures: molecular structures, data structures, networks, production scheduling diagrams, logical designs, stochastic processes, and many more. In each case, the structure consists of a finite set of objects together with a binary relation between these objects. Objects are represented by the vertices of a graph. Relations are expressed using arcs or edges.
In chemistry, graphs are used in particular to describe molecular structures. Usually such graphs are called molecular graphs. Here the objects are the atoms of a molecule, the relations are the interactions between them, usually denoted as bonds. In general, there are different atoms in a molecule, which leads to the necessity of distinguishing different kinds of vertices in the graphical model. There may also be different kinds of bonds, leading to the necessity of distinguishing different kinds of edges in the model. We may use labels to express the distinction between different kinds of vertices and edges. Labels may be symbols or strings of symbols (natural numbers), or even more structured objects (ordered lists of strings). In a drawing of a graph we could also use colors to distinguish different kinds of elements by optical means. This is why labels are often called "colors", too. Of course, they are abstract colors. They may be numbered and represented by their numbers. Edges and non-edges may also be distinguished by different colors. In this way, we have to deal with complete colored graphs.
1.2 One of the main purposes of graphical models, at least in chemistry, is the cataloguing and the identification of equivalent structures. Two colored graphs are equivalent (as models) or isomorphic (as combinatorial structures) if they are colored using the same set of colors and can be mapped onto each other such that the colors of vertices and edges are preserved. We give an exact definition for isomorphism of colored graphs in Section 2. Let $\Gamma$ and $\Gamma'$ be two colored graphs, $x$ a vertex of $\Gamma$ and $x'$ a vertex of $\Gamma'$. Equality of the colors of the vertices is a necessary (by definition) but by far not sufficient condition for the existence of an isomorphism between the two graphs which maps $x$ onto $x'$. Suppose that $\Gamma$ and $\Gamma'$ are isomorphic. There may be several different mappings providing an isomorphism. In order to find all vertices $x$ of $\Gamma$ which are mapped onto $x'$ by some of these isomorphisms we have to study the automorphisms of $\Gamma$. Vice versa, to find all vertices $x'$ onto which $x$ is mapped by an isomorphism we have to study the automorphisms of $\Gamma'$. An automorphism of a colored graph $\Gamma$ is a mapping of the vertex set of $\Gamma$ onto itself which preserves all colors (of vertices and edges). Vertices which are mapped onto each other by an automorphism are equivalent in the sense that they have the same function in the model, i.e. they are related in the same way to their surroundings and their roles in the description of the structure are indistinguishable. The sets of equivalent vertices are called the orbits of $\Gamma$ with respect to its automorphism group. The same situation is found with the edges of $\Gamma$. Edges may be equivalent or non-equivalent according to whether they correspond to each other by an automorphism or not. Moreover, an edge of an undirected graph may be considered as a pair of arcs $(x, y)$ and $(y, x)$, which may be equivalent or not. Hence, to distinguish equivalent parts of a graph $\Gamma$ properly, we should rather speak of equivalent arcs or equivalent non-arcs. As a consequence we will also consider arc invariants rather than edge invariants. The maximal sets of equivalent arcs or non-arcs in a graph are called the 2-orbits (or the orbitals) with respect to the automorphism group.
1.3 A canonical representation of a colored graph $\Gamma$ is a representation which is unique up to isomorphism. One way to get such a representation is to use a canonical numbering of the vertices and to establish a canonical adjacency matrix of the graph according to this numbering. If the numbering is chosen such that some appropriate matrix function (for example the rank in a lexicographic ordering) reaches an extremal value, then the adjacency matrices of two graphs $\Gamma$ and $\Gamma'$ will be equal if and only if these graphs are isomorphic.
1.4 Finding a canonical numbering is supported by the use of local invariants. A local invariant or vertex invariant is a number $f(x)$ assigned to a vertex $x$ which is characteristic for this vertex such that $f(x) = f(y)$ whenever $x$ is equivalent to $y$, and which depends only on the neighbourhood of $x$. The valency of a vertex is the simplest example of a vertex invariant. Two vertices $x$ and $y$ of a graph $\Gamma$ can be equivalent only if they have the same set of vertex invariants. If by some investigation we learn that these sets are different, then we should express the non-equivalence of $x$ and $y$ in our representation of the graph by assigning different colors to these vertices. If in the original graph they have the same color, say blue, then we may for instance change their colors to light-blue and dark-blue, thereby keeping in mind the original color. If there are more than two different sets of values of the vertex invariants considered among the blue vertices, then we have to use more than two different shadings of blue in order to make non-equivalent vertices distinguishable. Assume that we have changed all original colors in this way using vertex invariants. After this process we may also find equally colored but non-equivalent arcs which should also be distinguished in our representation of the graph. Therefore, let us look for arc invariants. The simplest arc invariant is the ordered pair of colors of its end vertices. In addition, there is a whole list of obvious arc invariants, for instance the number of vertices of a fixed color which are adjacent to both ends of a given arc, the number of such vertices which are non-adjacent to both ends, or the number of those vertices which are adjacent to exactly one end of this arc. Using these invariants we can also recolor the arcs of $\Gamma$. Changing colors as long as new vertex or arc invariants are available, we finally get a richly colored graph which is a refined model of the structure we want to represent. In favorable cases - and these cases are the most frequent ones - two vertices (arcs) of the final colored graph have the same color if and only if they are equivalent, i.e. if they are mapped onto each other by some automorphism. In less favorable cases the "only if" part of the latter statement is not true; the "if" part, however, is always true.
1.5 In a colored graph where two vertices or two edges have equal color if and only if they are equivalent, the sets of vertices of equal color are the orbits and the sets of arcs of equal color are the 2-orbits of the graph with respect to its automorphism group. Clearly, there is a logically simple way to find the orbits and 2-orbits of any graph. Just try all possible permutations of the vertex set, check whether they fulfill the conditions of an automorphism or not, and then find out which vertices and edges are equivalent. However, this may take a terribly long time, too long to be a practicable method.
The crucial reason why the above procedure for canonically refining the coloring of a graph may not succeed in finding the orbits and 2-orbits is that we do not know a complete set of local invariants. Such a set is still unknown, at least for general graphs. What we know is just a large set of some local invariants. Moreover, the larger the set of invariants we use, the more time-consuming the coloring procedure will be. Therefore, in practice, one uses only a small set of invariants and applies the method iteratively, using the result of the current step as the starting point for the next step. The iteration stops as soon as the current coloring does not change any more. In this case we call the resulting coloring a stable coloring (with respect to the method used). The process leading to this coloring is called a stabilization procedure. In any case the result is an approximation to the partition of the vertices and edges into the orbits and 2-orbits of the graph under consideration.
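The iteration described in this paragraph can be illustrated by a minimal sketch. It assumes one particular local invariant, namely the color of a vertex together with the sorted colors of its neighbours; this choice, the function name `refine` and the 0-based vertex names are assumptions made for the illustration only, not necessarily the invariants used later in the paper.

```python
def refine(adj, colors):
    """Iteratively split color classes until the coloring is stable.

    adj[v]    -- list of neighbours of vertex v (vertices are 0..n-1)
    colors[v] -- initial color of v, given as an integer
    """
    colors = list(colors)
    while True:
        # Local invariant (an assumed choice): own color plus the sorted
        # colors of the neighbours.
        signatures = [
            (colors[v], tuple(sorted(colors[w] for w in adj[v])))
            for v in range(len(adj))
        ]
        # Renumber the distinct signatures 0, 1, 2, ... in sorted order.
        palette = {s: k for k, s in enumerate(sorted(set(signatures)))}
        refined = [palette[s] for s in signatures]
        if len(set(refined)) == len(set(colors)):
            return refined  # no class was split: the coloring is stable
        colors = refined


# Path 0 - 1 - 2 with all vertices initially of the same color.
path = [[1], [0, 2], [1]]
print(refine(path, [0, 0, 0]))  # [0, 1, 0]
```

Because each signature contains the old color, a step can only split classes, never merge them, so the loop stops after at most as many rounds as there are vertices.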
1.6 This paper is the third part of a series of publications dealing with the interrelations between algebraic combinatorics and mathematical chemistry. Formally it is self-contained; nevertheless, acquaintance with part I [KliRRT95] and part II [BabCKP97] will certainly help the reader to better understand the whole area of investigations, main concepts and motivations.
1.7 The paper consists of nine sections. All definitions related to isomorphisms and automorphisms of colored graphs, together with some striking examples and the formulations of some basic facts, are given in Section 2. In Section 3 we discuss canonical numberings of graphs and we explain how such numberings help to solve the graph isomorphism problem. In Section 4 we introduce numerous global and local invariants of graphs and select those which will be used in stabilization procedures further on. Section 5 introduces the general idea of stabilization and illustrates it on a number of examples of chemical interest. In Section 6 we consider the so-called total degree partition and explain how it is related to doubly stochastic automorphisms of graphs, a natural generalization of the notion of automorphisms. The classical stabilization procedure introduced in 1968 by B. Weisfeiler and A. Leman is briefly considered in Section 7. It returns the smallest cellular algebra generated by the adjacency matrix of a colored graph. More powerful stabilizing procedures are considered in Section 8. We also briefly mention the limitations of such approaches. In Section 9 we give some additional remarks and draw some conclusions from the material in the foregoing sections.
2 Isomorphisms and automorphisms of colored graphs
2.1 We adopt the notation used in [KliRRT95]. Here we repeat the most important definitions concerning graphs and add some additional ones which are used in this paper.
A directed graph $\Gamma = (V, R)$ consists of a finite set $V$ of vertices and a binary relation $R$, i.e. a subset $R \subseteq V \times V$. The size (or, alternatively, the order) of a graph $\Gamma$ is the number of its vertices. The elements of $R$ are called arcs. An arc $(v, w) \in R$ is considered to be oriented from $v$ to $w$. An undirected graph is a graph whose arc set $R$ is symmetric, which means that $(x, y) \in R$ implies $(y, x) \in R$. In an undirected graph the pair of antiparallel arcs $(x, y)$ and $(y, x)$ is called an edge. An edge is identified with the subset (or unordered pair) $\{x, y\} \subseteq V$ of its end vertices and is sometimes denoted by [symbol illegible in source]. If $(x, y) \notin R$ and $(y, x) \notin R$, then we call the unordered pair $\{x, y\}$ a non-edge. Let $R^t = \{(y, x) : (x, y) \in R\}$. The relation $R^t$ is called the transpose of $R$. The operation $R \to R^t$ is called transposition.

In our paper we most frequently use a numbering of the vertex set $V$ in order to deal with it conveniently. A numbering of $V$ is a one-to-one mapping $V \to \{1, 2, \ldots, |V|\}$ which identifies each element of $V$ with a certain number between 1 and $|V|$. These numbers are used as names for vertices. In addition, we often assign labels to vertices, too. Labels are used to express properties of vertices; they are not used as names. A labeling of $V$ is a mapping $l : V \to L$ from $V$ into a set $L$ of labels; it is not necessarily a one-to-one mapping. The reader is asked to distinguish clearly the notion of a numbering from that of a labeling. As soon as a numbering for $V$ is fixed we may assume that $V = \{1, 2, \ldots, |V|\}$.

The adjacency matrix of a graph $\Gamma$ (directed or undirected) with $n$ vertices is an $n \times n$ matrix $A$ with entries
\[ A_{ij} = \begin{cases} 1 & \text{if } (i, j) \in R, \\ 0 & \text{if } (i, j) \notin R. \end{cases} \]
Note that $A$ depends on the numbering of $V$. If we renumber $V$, then in general $A$ will change, too. For $v \in V$ and $R \subseteq V \times V$ define $R(v) = \{w : (v, w) \in R\}$. If $\Gamma = (V, R)$, then $R(v)$ is called the set of successors of $v$ and $R^t(v)$ is called the set of predecessors of $v$ in $\Gamma$. A vertex $w$ is called a neighbour of $v$ if $w \in R(v) \cup R^t(v)$. The set of neighbours of $v$ is denoted by $N(v)$. Clearly, $N(v) = R(v) \cup R^t(v)$. The number $|R(v)|$ (the number of successors of $v$) is called the out-degree of $v$ and is denoted by $\mathrm{outdeg}(v)$. Analogously, the number $|R^t(v)|$ is called the in-degree of $v$ and is denoted by $\mathrm{indeg}(v)$. The number of neighbours of $v$ is called the degree (or the valency) of $v$ and is denoted by $\deg(v)$. In the case of an undirected graph we have $\deg(v) = \mathrm{outdeg}(v) = \mathrm{indeg}(v)$. In situations when we consider several different graphs on the same vertex set $V$, say $\Gamma, \Gamma', \ldots$, we have to distinguish the degrees of $v$ according to the graphs under consideration. In such cases we shall write $\deg(\Gamma, v)$, $\mathrm{outdeg}(\Gamma, v)$, $\mathrm{indeg}(\Gamma, v)$ in order to refer to a specified graph $\Gamma$.
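As a small illustration of these definitions, the following sketch reads successors, predecessors and neighbours directly off a 0-1 adjacency matrix. The function names and the example graph are hypothetical, and vertices are named 0, 1, 2 rather than 1, 2, 3 for convenience.

```python
def successors(A, v):
    """R(v): all w with an arc (v, w)."""
    return {w for w in range(len(A)) if A[v][w] == 1}


def predecessors(A, v):
    """R^t(v): all u with an arc (u, v)."""
    return {u for u in range(len(A)) if A[u][v] == 1}


def neighbours(A, v):
    """N(v) = R(v) united with R^t(v)."""
    return successors(A, v) | predecessors(A, v)


# Directed graph with arcs (0,1), (0,2), (1,2), (2,0).
A = [
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 0],
]
outdeg = len(successors(A, 0))    # 2
indeg = len(predecessors(A, 0))   # 1
deg = len(neighbours(A, 0))       # 2
print(outdeg, indeg, deg)         # 2 1 2
```

For a directed graph the three numbers may differ, as here; for a symmetric matrix they coincide.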
2.2 A colored graph $\Gamma = (V; R_0, \ldots, R_d)$ is a system of a vertex set $V$ and a set of binary relations $R_i$, $0 \le i \le d$, with the properties
(i) $R_i \cap R_j = \emptyset$ for $i \ne j$ (i.e. the $R_i$'s are mutually disjoint),
(ii) $R_i \cap \mathrm{Diag}(V \times V) \ne \emptyset$ implies $R_i \subseteq \mathrm{Diag}(V \times V)$, where $\mathrm{Diag}(V \times V) = \{(v, v) : v \in V\}$ (the diagonal of $V \times V$),
(iii) $V \times V = \bigcup_{i=0}^{d} R_i$.
Relations $R_i \subseteq \mathrm{Diag}(V \times V)$ represent vertex colors, while all the other relations $R_i$ satisfy $R_i \cap \mathrm{Diag}(V \times V) = \emptyset$ and represent arc colors. Note that because of (iii) a colored graph is always a complete graph, i.e. if we identify all colors then $\Gamma$ becomes the graph $(V, V \times V)$.
The set $R_i$ is called the $i$-th color class of $\Gamma$. Each graph $\Gamma_i = (V, R_i)$ is a spanning subgraph of $\Gamma$. It is called the $i$-th color graph of $\Gamma$. Note that this notion is strictly different from the notion of a colored graph. In some sense, a colored graph consists of different color graphs. Note that each graph $\Gamma = (V, R)$ with $R \cap \mathrm{Diag}(V \times V) = \emptyset$ (i.e. without loops) can also be considered as a colored graph $(V; R_0, R_1, R_2)$ with $d = 2$, $R_0 = \mathrm{Diag}(V \times V)$, $R_1 = R$ and $R_2 = \bar{R}$, where $\bar{R}$ is the relation complementary to $R$, i.e. $\bar{R} = \{(v, w) : v \ne w \wedge (v, w) \notin R\}$.
A set of relations $\{R_0, R_1, \ldots, R_d\}$ which satisfies (i)-(iii) will be denoted by $\mathcal{R}$ for short. In the sequel, when we want to emphasize that we are dealing with a colored graph, we shall denote it by $\Gamma = (V; \mathcal{R})$. The adjacency matrix of a colored graph is defined by
\[ \mathrm{Adj}(\Gamma) = \sum_{i=0}^{d} i \cdot \mathrm{Adj}(\Gamma_i). \]
2.3 Example. Consider the graph in Figure 2.1a. It represents the molecule Cl$_4$C$_{12}$O$_2$ (dioxin) and contains atoms of oxygen, carbon and chlorine, represented by the labels O, C and Cl. Bonds are represented by lines, double bonds by double lines. In Figure 2.1b the corresponding colored graph is depicted. Some of the non-edges are indicated by broken lines. Since there are too many of them, not all have been included in order to keep the picture clear.

[Figure 2.1a: structural formula of dioxin with atom labels O, C and Cl. Figure 2.1b: the corresponding colored graph, with a legend for bonds, double bonds and non-bonds. Figure 2.1c: the dioxin graph $\Gamma$ with its vertices numbered 1 through 18.]

Below the adjacency matrix of this colored graph is given. We have used the vertex numbering shown in Figure 2.1c, where the dioxin graph is depicted in a fashion different from Figure 2.1a in order to show the spatial symmetry. Any other numbering from 1 through 18 would do as well; however, the resulting adjacency matrix would look different.
6
00 B 5 B B B 3 B B 5 B B B 3 B B 5 B B 5 B B B 5 B B 5 B Adj (?) = B B 5 B B 5 B B B 5 B B 5 B B 5 B B B 5 B B 5 B B @5 5
5 0 5 3 5 3 5 5 5 5 5 5 5 5 5 5 5 5
3 5 1 3 5 5 4 5 5 5 5 5 5 5 5 5 5 5
5 3 3 1 5 5 5 4 5 5 5 5 5 5 5 5 5 5
3 5 5 5 1 3 5 5 4 5 5 5 5 5 5 5 5 5
5 3 5 5 3 1 5 5 5 4 5 5 5 5 5 5 5 5
5 5 4 5 5 5 1 5 5 5 3 5 5 5 5 5 5 5
5 5 5 4 5 5 5 1 5 5 5 3 5 5 5 5 5 5
5 5 5 5 4 5 5 5 1 5 5 5 3 5 5 5 5 5
5 5 5 5 5 4 5 5 5 1 5 5 5 3 5 5 5 5
5 5 5 5 5 5 3 5 5 5 1 4 5 5 3 5 5 5
5 5 5 5 5 5 5 3 5 5 4 1 5 5 5 3 5 5
5 5 5 5 5 5 5 5 3 5 5 5 1 4 5 5 3 5
5 5 5 5 5 5 5 5 5 3 5 5 4 1 5 5 5 3
5 5 5 5 5 5 5 5 5 5 3 5 5 5 2 5 5 5
5 5 5 5 5 5 5 5 5 5 5 3 5 5 5 2 5 5
5 5 5 5 5 5 5 5 5 5 5 5 3 5 5 5 2 5
5 5 5 5 5 5 5 5 5 5 5 5 5 3 5 5 5 2
1 CC CC CC CC CC CC CC CC CC CC : CC CC CC CC CC CC CC CC CA
We have three classes of vertex colors, 0, 1 and 2, corresponding to O, C and Cl, and we also have three classes of edge colors, 3, 4 and 5, corresponding to bonds, double bonds and non-bonds. For any $i$, $0 \le i \le 5$, if we replace in $\mathrm{Adj}(\Gamma)$ all entries except the entry $i$ by 0 and replace the entry $i$ by 1, then we get the adjacency matrix of the color graph $(V, R_i)$.
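The passage from the colored adjacency matrix to the 0-1 matrices of the color graphs, and back via the sum $\sum_i i \cdot \mathrm{Adj}(\Gamma_i)$, can be sketched as follows. The three-vertex matrix below is a made-up toy example (vertex colors 0 and 1, arc colors 2 and 3), not one taken from the paper, and the function names are hypothetical.

```python
def color_graph(adj, i):
    """0-1 adjacency matrix of the i-th color graph (V, R_i)."""
    return [[1 if entry == i else 0 for entry in row] for row in adj]


def recombine(adj, d):
    """Rebuild the colored matrix as the sum of i * Adj(Gamma_i), 0 <= i <= d."""
    n = len(adj)
    total = [[0] * n for _ in range(n)]
    for i in range(d + 1):
        layer = color_graph(adj, i)
        for r in range(n):
            for c in range(n):
                total[r][c] += i * layer[r][c]
    return total


# Toy colored graph: vertex colors 0 and 1, arc colors 2 ("bond")
# and 3 ("non-bond").
toy = [
    [0, 2, 2],
    [2, 1, 3],
    [2, 3, 1],
]
print(color_graph(toy, 2))      # [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
print(recombine(toy, 3) == toy)  # True
```

Since the color classes are disjoint and cover $V \times V$, exactly one layer has a 1 at every position, which is why the weighted sum recovers each entry.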
2.4 A permutation $\pi : V \to V$ acting on the finite set $V$ is a one-to-one mapping from $V$ onto itself. The image of an element $x \in V$ with respect to the permutation $\pi$ is denoted by $x^\pi$. A permutation $\pi$ of $V$ can be represented by its permutation matrix $P_\pi$ defined by
\[ (P_\pi)_{v,w} = \begin{cases} 1 & \text{if } w = v^\pi, \\ 0 & \text{if } w \ne v^\pi. \end{cases} \]
Let $S(V)$ be the set of all permutations acting on $V$. Assume $\pi, \sigma \in S(V)$. The mapping $x \to (x^\pi)^\sigma$ (for $x \in V$) is again a permutation of $V$ and, hence, an element of $S(V)$. We call it the product of $\pi$ and $\sigma$ (in this order) and denote it by $\pi\sigma$. By definition, we have
\[ x^{\pi\sigma} = (x^\pi)^\sigma. \]
A trivial example for a permutation of $V$ is the identity mapping $\mathrm{id} : V \to V$ which leaves every element $v \in V$ fixed. For every $\pi \in S(V)$ there is an inverse element $\pi^{-1} \in S(V)$, defined by $v^{\pi^{-1}} = w \iff w^\pi = v$, such that $\pi\pi^{-1} = \pi^{-1}\pi = \mathrm{id}$.
A subset $G \subseteq S(V)$ is called a permutation group acting on $V$ if $G$ is closed with respect to multiplication of permutations, that is if
\[ \pi, \sigma \in G \implies \pi\sigma \in G. \]
Evidently, $S(V)$ itself is also a permutation group. It is called the symmetric group of the set $V$. It is easy to check that $P_{\pi\sigma} = P_\pi P_\sigma$. Hence, multiplication of permutations is equivalent to multiplication of the corresponding permutation matrices. Note that in particular
\[ P_\pi P_{\pi^{-1}} = P_{\pi\pi^{-1}} = P_{\mathrm{id}} = I. \]
Hence
\[ P_{\pi^{-1}} = (P_\pi)^{-1}. \]
Further,
\[ P_{\pi^{-1}} = (P_\pi)^t. \]
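The identities of this subsection are easy to check numerically. The sketch below stores a permutation as a list `image` with `image[v]` playing the role of $v^\pi$, and uses the vertex names 0, ..., n-1; the helper names and the two sample permutations are assumptions made for the illustration.

```python
def perm_matrix(image):
    """P with P[v][w] = 1 exactly when w is the image of v."""
    n = len(image)
    return [[1 if w == image[v] else 0 for w in range(n)] for v in range(n)]


def compose(pi, sigma):
    """Product pi*sigma in the order used above: apply pi first, then sigma."""
    return [sigma[pi[v]] for v in range(len(pi))]


def inverse(pi):
    inv = [0] * len(pi)
    for v, w in enumerate(pi):
        inv[w] = v
    return inv


def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def transpose(A):
    return [list(col) for col in zip(*A)]


pi = [1, 2, 0, 3]      # 0 -> 1 -> 2 -> 0, 3 fixed
sigma = [0, 3, 2, 1]   # swaps 1 and 3

# The product of permutations corresponds to the product of matrices.
print(perm_matrix(compose(pi, sigma)) == matmul(perm_matrix(pi), perm_matrix(sigma)))  # True
# The inverse permutation corresponds to the transposed matrix.
print(perm_matrix(inverse(pi)) == transpose(perm_matrix(pi)))  # True
```

Note that with the opposite composition order the first identity would read $P_{\pi\sigma} = P_\sigma P_\pi$, so the convention "apply $\pi$ first" matters here.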
2.5 Let $\Gamma = (V, R)$ be a graph. An automorphism of $\Gamma$ is a permutation $\pi : V \to V$ of the vertex set $V$ such that
\[ (v, w) \in R \iff (v^\pi, w^\pi) \in R \]
holds for every $(v, w) \in V \times V$. Using the notation $R^\pi = \{(v^\pi, w^\pi) : (v, w) \in R\}$, the latter condition reads $R^\pi = R$. We say that an automorphism preserves the arcs of $\Gamma$. Let $A = \mathrm{Adj}(\Gamma)$. Recall (see [KliRRT95]) that for $\pi \in S(V)$ the graph $\Gamma^\pi = (V, R^\pi)$ is obtained by renumbering the vertices according to the action of $\pi$. It is easy to check, see 5.16 in [KliRRT95], that $\mathrm{Adj}(\Gamma^\pi) = P_\pi^{-1} A P_\pi$. In particular, the condition that $\pi$ is an automorphism of $\Gamma$ reads in matrix language
\[ A = P_\pi^t A P_\pi \quad \text{or} \quad P_\pi A = A P_\pi. \]
Now, let $\Gamma = (V; \mathcal{R})$ be a colored graph with $\mathcal{R} = (R_0, \ldots, R_d)$. A permutation $\pi \in S(V)$ is called an automorphism of $\Gamma$ if $\pi$ is an automorphism of each of the color graphs $(V, R_i)$, $0 \le i \le d$. According to the definition of $\mathrm{Adj}(\Gamma)$ we get for $\pi \in S(V)$
\[ P_\pi^{-1} \mathrm{Adj}(\Gamma) P_\pi = \sum_{i=0}^{d} i \, P_\pi^{-1} \mathrm{Adj}(\Gamma_i) P_\pi = \sum_{i=0}^{d} i \, \mathrm{Adj}(\Gamma_i^\pi) = \mathrm{Adj}(\Gamma^\pi), \]
where $\Gamma^\pi = (V; R_0^\pi, R_1^\pi, \ldots, R_d^\pi) = (V; \mathcal{R}^\pi)$. This implies that a permutation $\pi$ is an automorphism of a colored graph $\Gamma$ if and only if
\[ P_\pi^{-1} \mathrm{Adj}(\Gamma) P_\pi = \mathrm{Adj}(\Gamma) \quad \text{or} \quad P_\pi \mathrm{Adj}(\Gamma) = \mathrm{Adj}(\Gamma) P_\pi. \]
In words: A permutation matrix $P_\pi$ represents an automorphism of a colored graph $\Gamma = (V; R_0, R_1, \ldots, R_d)$ if and only if $P_\pi$ commutes with the adjacency matrices of all color graphs $(V, R_i)$, $0 \le i \le d$, or equivalently, if and only if $P_\pi$ commutes with the adjacency matrix $\mathrm{Adj}(\Gamma)$ of $\Gamma$.
Now, it is easy to see that the set $G$ of all automorphisms of a colored graph $\Gamma$ is closed with respect to multiplication, i.e. $G$ is a permutation group. This group $G$ is called the automorphism group of the colored graph $\Gamma$ and is also denoted by $\mathrm{Aut}(\Gamma)$. Sometimes we will say that $G$ acts on $V$, and we will denote this fact using the notation $(G, V)$.
2.6 Let $\Gamma = (V, R)$ and $\Gamma' = (V, R')$ be two graphs. W.l.o.g. we assume that both graphs have the same vertex set $V = \{1, 2, \ldots, n\}$, where $n$ is the number of vertices (just number the vertices and use the numbers as names). A permutation $\pi : V \to V$ of $V$ is called an isomorphism of $\Gamma$ and $\Gamma'$ if
\[ (v, w) \in R \iff (v^\pi, w^\pi) \in R'. \]
Let $A = \mathrm{Adj}(\Gamma)$ and $A' = \mathrm{Adj}(\Gamma')$. With the conventions of 2.4 and 2.5 this means $A' = P_\pi^t A P_\pi$, so $\pi$ is an isomorphism if and only if $A P_\pi = P_\pi A'$. $\Gamma$ and $\Gamma'$ are called isomorphic, written $\Gamma \simeq \Gamma'$, if there is at least one isomorphism between them.

Let $\Gamma = (V; R_0, \ldots, R_d)$ and $\Gamma' = (V; R'_0, \ldots, R'_d)$ be two colored graphs. $\Gamma$ and $\Gamma'$ are called isomorphic if and only if there is a permutation $\pi : V \to V$ which is simultaneously an isomorphism of all pairs of color graphs $(V, R_i)$ and $(V, R'_i)$, $0 \le i \le d$. Evidently, this is the case if and only if $A_i P_\pi = P_\pi A'_i$, $0 \le i \le d$, where the $A_i$'s and $A'_i$'s are the adjacency matrices of the color graphs $\Gamma_i$ and $\Gamma'_i$, respectively. Note that the latter condition is in turn equivalent to $\mathrm{Adj}(\Gamma) P_\pi = P_\pi \mathrm{Adj}(\Gamma')$.
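A tiny directed example makes the matrix form of the isomorphism condition concrete. Under the conventions of 2.4 and 2.5 (where $\mathrm{Adj}(\Gamma^\pi) = P_\pi^t A P_\pi$), the check amounts to $A P_\pi = P_\pi A'$. The sketch below uses 0-based vertex names; the helper names and the three-vertex graph are assumptions made for the illustration.

```python
def perm_matrix(image):
    """P with P[v][w] = 1 exactly when w is the image of v."""
    n = len(image)
    return [[1 if w == image[v] else 0 for w in range(n)] for v in range(n)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def relabel(A, image):
    """Adjacency matrix of the renumbered graph: arc (v, w) becomes
    (image[v], image[w])."""
    n = len(A)
    B = [[0] * n for _ in range(n)]
    for v in range(n):
        for w in range(n):
            B[image[v]][image[w]] = A[v][w]
    return B


def is_isomorphism(A, A_prime, image):
    """Direct check of the definition: arcs of A map exactly onto arcs of A_prime."""
    n = len(A)
    return all(A[v][w] == A_prime[image[v]][image[w]]
               for v in range(n) for w in range(n))


A = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]   # directed path 0 -> 1 -> 2
pi = [1, 2, 0]                          # 0 -> 1, 1 -> 2, 2 -> 0
A_prime = relabel(A, pi)                # arcs (1, 2) and (2, 0)
P = perm_matrix(pi)

print(is_isomorphism(A, A_prime, pi))          # True
print(matmul(A, P) == matmul(P, A_prime))      # True
```

For an involution such as a single transposition the two sides of the condition can be swapped without harm; the 3-cycle used here is a case where the order matters.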
2.7 Let $\Gamma = (V; R_0, \ldots, R_d)$ be a colored graph and $G = \mathrm{Aut}(\Gamma)$ its automorphism group. We call two vertices $v, w \in V$ equivalent, written $v \sim w$, if there is a $g \in G$ with $v^g = w$. Clearly, $\sim$ is an equivalence relation on $V$. Its equivalence classes are called the orbits of $V$ under $\mathrm{Aut}(\Gamma)$ (see [KliRRT95], 4.14), or the orbits of the permutation group $(G, V)$. The orbit containing vertex $v$ is denoted by $\mathrm{Orb}_G(v)$. Let $\mathrm{Orb}_G(v_1), \mathrm{Orb}_G(v_2), \ldots, \mathrm{Orb}_G(v_s)$ be the different orbits. Then, obviously,
\[ \mathrm{Orb}_G(v_i) = \{v_i^g : g \in G\} = \{v : v \sim v_i\}, \quad 1 \le i \le s. \]
The orbits form a partition of $V$, i.e. they are non-empty and pairwise disjoint and their union is $V$. This partition is called the automorphism partition of $\Gamma$ and will be denoted by $V_{\mathrm{aut}}(\Gamma)$.

Example 2.3 (continued): Let $G = \mathrm{Aut}(\Gamma)$ be the automorphism group of the graph in Figure 2.1c. It is easy to see that $G = \{g_1, g_2, g_3, g_4\}$ where $g_1$ is the identity and
\[ g_2 = (1,2)(3,4)(5,6)(7,8)(9,10)(11,12)(13,14)(15,16)(17,18), \]
\[ g_3 = (1)(2)(3,5)(4,6)(7,9)(8,10)(11,13)(12,14)(15,17)(16,18), \]
\[ g_4 = (1,2)(3,6)(4,5)(7,10)(8,9)(11,14)(12,13)(15,18)(16,17). \]
We invite the reader to interpret each non-identical automorphism as a certain geometrical symmetry of Figure 2.1c. For example, $g_2$ is obtained by a reflection with respect to the axis between the vertices having even numbers and those having odd numbers. The automorphism partition in this example is
\[ \{1, 2\}, \; \{3, 4, 5, 6\}, \; \{7, 8, 9, 10\}, \; \{11, 12, 13, 14\}, \; \{15, 16, 17, 18\}. \]
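The claims of this example can be checked mechanically. The sketch below rebuilds the colored adjacency matrix of the dioxin graph from its bond lists as read off the matrix in 2.3 (colors 0, 1, 2 for O, C, Cl and 3, 4, 5 for bonds, double bonds, non-bonds), tests the three non-identical permutations, and computes the orbits they generate. The function names are hypothetical; the check confirms that the three permutations are automorphisms but does not prove that there are no others.

```python
N = 18
SINGLE = [(1, 3), (1, 5), (2, 4), (2, 6), (3, 4), (5, 6), (7, 11), (8, 12),
          (9, 13), (10, 14), (11, 15), (12, 16), (13, 17), (14, 18)]
DOUBLE = [(3, 7), (4, 8), (5, 9), (6, 10), (11, 12), (13, 14)]


def colored_adjacency():
    A = [[5] * N for _ in range(N)]              # default: non-bond
    for v in range(1, N + 1):                    # vertex colors on the diagonal
        A[v - 1][v - 1] = 0 if v <= 2 else (1 if v <= 14 else 2)
    for color, bonds in ((3, SINGLE), (4, DOUBLE)):
        for v, w in bonds:
            A[v - 1][w - 1] = A[w - 1][v - 1] = color
    return A


def perm_from_cycles(cycles):
    image = list(range(N + 1))                   # image[v] = v^g, index 0 unused
    for cyc in cycles:
        for k, v in enumerate(cyc):
            image[v] = cyc[(k + 1) % len(cyc)]
    return image


def is_automorphism(A, image):
    return all(A[v - 1][w - 1] == A[image[v] - 1][image[w] - 1]
               for v in range(1, N + 1) for w in range(1, N + 1))


def orbits(generators):
    seen, result = set(), []
    for v in range(1, N + 1):
        if v in seen:
            continue
        orbit, stack = {v}, [v]
        while stack:                             # close {v} under the generators
            x = stack.pop()
            for g in generators:
                if g[x] not in orbit:
                    orbit.add(g[x])
                    stack.append(g[x])
        seen |= orbit
        result.append(sorted(orbit))
    return result


A = colored_adjacency()
g2 = perm_from_cycles([(1, 2), (3, 4), (5, 6), (7, 8), (9, 10), (11, 12),
                       (13, 14), (15, 16), (17, 18)])
g3 = perm_from_cycles([(3, 5), (4, 6), (7, 9), (8, 10), (11, 13), (12, 14),
                       (15, 17), (16, 18)])
g4 = perm_from_cycles([(1, 2), (3, 6), (4, 5), (7, 10), (8, 9), (11, 14),
                       (12, 13), (15, 18), (16, 17)])
print(all(is_automorphism(A, g) for g in (g2, g3, g4)))  # True
print(orbits([g2, g3, g4]))
```

The printed orbits agree with the automorphism partition stated above.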
[Figure 3.1, panels a) to d): graphs on the vertex set $\{1, \ldots, 8\}$; panels a) and b) show the cuneane with two different numberings.]
3 Canonical numberings
3.1 Let $\Gamma = (V, R)$ be a graph. The adjacency matrix of $\Gamma$ was defined in 2.1. It depends on the numbering chosen for the vertices of $\Gamma$. Changing this numbering may change the adjacency matrix drastically. To have a small example consider Figure 3.1a and Figure 3.1b. They show the abstract picture of a molecule known as the "cuneane", numbered in two different ways. The corresponding adjacency matrices are
\[
A_1 = \begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 1 & 1 \\
1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 \\
0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 1 & 1 \\
1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 1 & 0 & 0
\end{pmatrix},
\qquad
A_2 = \begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 1 & 0 & 1 \\
1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 \\
0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 \\
1 & 0 & 0 & 0 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 \\
1 & 0 & 1 & 0 & 0 & 0 & 1 & 0
\end{pmatrix}.
\]
3.2 Now consider the graph in Figure 3.1c. Perhaps this graph is also the cuneane, but drawn in a different (and strange) way. Is this graph isomorphic to the cuneane? Let $\Gamma$ be the cuneane, regarded as the graph depicted in Figure 3.1a, and $\Gamma'$ the graph in Figure 3.1c. They have the same set of vertices $V = \{1, 2, \ldots, 8\}$. To prove that these graphs are isomorphic - or are not isomorphic - by brute force we have to check all possible permutations of $V$ and see whether they fulfill the isomorphism condition. There are $8! = 40320$ different mappings to check. To do this is a big task. We invite the reader to work out this example and to try to find his own method for reducing this big task to one which requires only moderate effort.

In order to reduce the effort needed for jobs like this we will exploit a different idea for checking isomorphism. Namely, let us associate to each class of isomorphic graphs a certain canonical representation. Then, to check whether two graphs $\Gamma$ and $\Gamma'$ are isomorphic, we just have to calculate the canonical representations of both graphs and check whether they are equal.

There are several ways of creating canonical representations. The way we are going here is, roughly speaking, the following. Consider the adjacency matrices of all graphs which belong to a particular prescribed isomorphism class. Read each matrix, row by row, as a binary number. Select the matrix which gives the smallest number. Call this matrix $\tilde{A}$. Each graph $G$ in the isomorphism class considered can be numbered in such a way that $\mathrm{Adj}(G) = \tilde{A}$. Such a numbering will be called a canonical numbering. The graph $G$ with this numbering will be considered as canonically numbered. Below this idea is presented in a more rigorous way. Clearly, instead of $\tilde{A}$ we could have chosen a matrix yielding the largest binary number. This would lead to a different canonical representation.
3.3 We recall that a graph $\Gamma$ in our definition is a pair $(V, R)$. In what follows let us identify the vertex set $V$ with the set $\{1, 2, \ldots, n\}$, i.e. let us number the vertices in $V$ and use the numbers as names for them (see 2.1). Then $\Gamma$ has a uniquely determined adjacency matrix $\mathrm{Adj}(\Gamma) = (A_{i,j})_{1 \le i,j \le n}$. This matrix can be encoded by a unique natural number. Namely, consider the following code of $A$,
\[ \mathrm{cd}(A) = \sum_{i=1}^{n} \sum_{j=1}^{n} A_{ij} \, 2^{\,n^2 - (i-1)n - j}. \]
Its value depends just on the adjacency matrix $A$ chosen to represent the isomorphism class of $\Gamma$. Moreover, it is a natural number, the binary representation of which is the string
\[ A_{11} A_{12} \ldots A_{1n} A_{21} \ldots A_{2n} \ldots A_{n1} \ldots A_{nn}, \]
which consists of the rows of $A$ written horizontally side by side in their natural order.
Note that from the knowledge of $\mathrm{cd}(A)$ we can reconstruct the matrix $\mathrm{Adj}(\Gamma)$, see Remark 3.9.1. For example, for the "cuneane" in Fig. 3.1 we get
\[
\begin{aligned}
\mathrm{cd}(A_1) = {} & 2^{62} + 2^{57} + 2^{56} + 2^{55} + 2^{53} + 2^{52} + 2^{46} + 2^{44} + 2^{40} + 2^{38} + 2^{37} + 2^{35} + 2^{28} \\
& + 2^{26} + 2^{25} + 2^{19} + 2^{17} + 2^{16} + 2^{15} + 2^{11} + 2^{10} + 2^{7} + 2^{5} + 2^{2} = 4877487903930551460, \\
\mathrm{cd}(A_2) = {} & 5021603092014697890,
\end{aligned}
\]
where $A_1$ and $A_2$ are the adjacency matrices of the graphs in Figure 3.1a and 3.1b.
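The code $\mathrm{cd}(A)$ and its inverse are straightforward to compute. The following sketch uses the matrices $A_1$ and $A_2$ of 3.1 and reproduces the two values given above; the function names `cd` and `decode` are chosen for the illustration, and indices are 0-based, so the exponent $n^2 - (i-1)n - j$ becomes `n*n - i*n - j - 1`.

```python
A1 = [
    [0, 1, 0, 0, 0, 0, 1, 1],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 1],
    [0, 1, 1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0, 1, 0, 0],
]
A2 = [
    [0, 1, 0, 0, 0, 1, 0, 1],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 1],
    [0, 1, 1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 1, 1, 0],
    [1, 0, 0, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 1, 1, 0, 1],
    [1, 0, 1, 0, 0, 0, 1, 0],
]


def cd(A):
    """Read the rows of A, written side by side, as one binary number."""
    n = len(A)
    return sum(A[i][j] << (n * n - i * n - j - 1)
               for i in range(n) for j in range(n))


def decode(code, n):
    """Recover the n x n matrix from its code."""
    bits = format(code, "0{}b".format(n * n))
    return [[int(bits[i * n + j]) for j in range(n)] for i in range(n)]


print(cd(A1))                    # 4877487903930551460
print(cd(A2))                    # 5021603092014697890
print(decode(cd(A1), 8) == A1)   # True
```

Python integers have arbitrary precision, so the 64-bit codes are exact; in a language with fixed-width integers an unsigned 64-bit type would be needed for $n = 8$.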
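3.4 Let $V = \{1, 2, \ldots, n\}$, let $\Gamma$ be a graph with vertex set $V$, and let $\tilde{\Gamma} = \mathrm{Iso}(\Gamma)$ be the set of all graphs with vertex set $V$ which are isomorphic to $\Gamma$, that is, $\tilde{\Gamma}$ is the isomorphism class of $\Gamma$ (restricted to the vertex set $V$). Frequently, $\tilde{\Gamma}$ is called an abstract graph. By definition,
\[ \tilde{\Gamma} = \{\Gamma^\pi : \pi \in S(V)\}, \]
and the set of all adjacency matrices belonging to graphs in $\tilde{\Gamma}$ is
\[ \mathcal{A}(\Gamma) = \{P_\pi^t A P_\pi : \pi \in S(V)\}. \]
Now, fix any permutation $\sigma \in S(V)$. Since $S(V)$ is a group, we have
\[ \{\pi : \pi \in S(V)\} = \{\sigma\pi : \pi \in S(V)\}. \]
Hence,
\[ \mathcal{A}(\Gamma) = \{P_\pi^t A P_\pi : \pi \in S(V)\} = \{P_{\sigma\pi}^t A P_{\sigma\pi} : \pi \in S(V)\}. \]
Since $P_{\sigma\pi} = P_\sigma P_\pi$ and $P_{\sigma\pi}^t = P_\pi^t P_\sigma^t$, we finally have
\[ \mathcal{A}(\Gamma) = \{P_\pi^t (P_\sigma^t A P_\sigma) P_\pi : \pi \in S(V)\}. \]
This means that we get the same set $\mathcal{A}$ if we start with the graph $\Gamma^\sigma$ instead of $\Gamma$. Therefore the set $\mathcal{A}(\Gamma)$ is independent of how we number the vertices of $\Gamma$ initially, i.e.
\[ \mathcal{A}(\Gamma) = \mathcal{A}(\Gamma^\sigma) \]
for arbitrary $\sigma \in S(V)$. Since each graph $\Gamma'$ which is isomorphic to $\Gamma$ is of the form $\Gamma^\sigma$ for some isomorphism $\sigma$, for isomorphic graphs $\Gamma$ and $\Gamma'$ we have
\[ \mathcal{A}(\Gamma) = \mathcal{A}(\Gamma'). \]
Let $\Gamma$ and $\Gamma'$ be two graphs with vertex set $V$ and adjacency matrices $A$ and $B$, respectively. Assume that $\mathcal{A}(\Gamma) \cap \mathcal{A}(\Gamma') \ne \emptyset$. Then there is some $\pi \in S(V)$ and some $\sigma \in S(V)$ such that $P_\pi^t A P_\pi = P_\sigma^t B P_\sigma$. It follows that
\[ B = P_\sigma P_\pi^t A P_\pi P_\sigma^t = (P_\pi P_\sigma^{-1})^t A \, P_\pi P_\sigma^{-1} = P_{\pi\sigma^{-1}}^t A \, P_{\pi\sigma^{-1}}. \]
Hence, $\Gamma$ and $\Gamma'$ are isomorphic, and the permutation $\pi\sigma^{-1}$ is an isomorphism.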
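Summarizing, we can now state the following facts: For any two graphs $\Gamma$ and $\Gamma'$ we have
\[ \mathcal{A}(\Gamma) \cap \mathcal{A}(\Gamma') \ne \emptyset \iff \mathcal{A}(\Gamma) = \mathcal{A}(\Gamma') \text{ and } \Gamma \simeq \Gamma', \]
\[ \mathcal{A}(\Gamma) \cap \mathcal{A}(\Gamma') = \emptyset \iff \Gamma \not\simeq \Gamma'. \]
For a graph $\Gamma$ with adjacency matrix $A = \mathrm{Adj}(\Gamma)$ let us define
\[ \mathrm{canon}(\Gamma) = \min_{\pi \in S(V)} \mathrm{cd}(P_\pi^t A P_\pi) = \min\{\mathrm{cd}(B) : B \in \mathcal{A}(\Gamma)\}. \]
The value of $\mathrm{canon}(\Gamma)$ we shall call the canonical number of $\Gamma$. The matrix $P_\pi^t A P_\pi$ which produces this canonical number is called the canonical matrix of $\Gamma$ (or of $\tilde{\Gamma}$). Any permutation $\pi \in S(V)$ such that
\[ \mathrm{cd}(P_\pi^t A P_\pi) = \mathrm{canon}(\Gamma) \]
is called a canonical numbering of $\Gamma$. Note that there exists a unique canonical number and a unique canonical matrix of $\Gamma$; however, this matrix may be obtained with the aid of more than one canonical numbering. (Let $\sigma \in S(V)$ be an automorphism of $\Gamma$. Then $A = P_\sigma^t A P_\sigma$, hence $\mathrm{canon}(\Gamma) = \mathrm{cd}(P_\pi^t P_\sigma^t A P_\sigma P_\pi) = \mathrm{cd}(P_{\sigma\pi}^t A P_{\sigma\pi})$.) It turns out that $\mathrm{canon}(\Gamma)$ is the kind of representation for a graph $\Gamma$ we are looking for. We can prove the following statement.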
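3.5. Proposition. Let $\Gamma = (V, R)$ and $\Gamma' = (V, R')$ be two graphs. We have $\mathrm{canon}(\Gamma) = \mathrm{canon}(\Gamma')$ if and only if $\Gamma \simeq \Gamma'$.

Proof. We have
\[ \Gamma \simeq \Gamma' \implies \mathcal{A}(\Gamma) = \mathcal{A}(\Gamma'). \]
Hence
\[ \mathrm{canon}(\Gamma) = \min\{\mathrm{cd}(B) : B \in \mathcal{A}(\Gamma)\} = \min\{\mathrm{cd}(B) : B \in \mathcal{A}(\Gamma')\} = \mathrm{canon}(\Gamma'). \]
Further,
\[ \Gamma \not\simeq \Gamma' \implies \mathcal{A}(\Gamma) \cap \mathcal{A}(\Gamma') = \emptyset. \]
Hence
\[ \mathrm{canon}(\Gamma) = \min\{\mathrm{cd}(B) : B \in \mathcal{A}(\Gamma)\} \ne \min\{\mathrm{cd}(B) : B \in \mathcal{A}(\Gamma')\} = \mathrm{canon}(\Gamma'). \]
The latter inequality holds since cd is an injective function on the set of all 0-1-matrices. $\Box$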
Let us emphasize once more that the method for defining a canonical representation for graphs just described is only one among many different possibilities. For instance, instead of taking the minimum in the definition of $\mathrm{canon}(\Gamma)$ above we could choose the maximum. The minimum has the advantage that the numbers we have to compute are smaller than when we use the maximum.
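For very small graphs the canonical number can be computed literally from its definition, by running through all renumberings. The sketch below does this for the two cuneane matrices $A_1$ and $A_2$ of 3.1 and, for contrast, for the graph of the cube, which is also 3-regular on 8 vertices but contains no triangle. The function names and the choice of the cube as a comparison graph are assumptions made for the illustration; the loop over all $8! = 40320$ permutations is only feasible because $n$ is tiny.

```python
from itertools import permutations

A1 = [
    [0, 1, 0, 0, 0, 0, 1, 1],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 1],
    [0, 1, 1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0, 1, 0, 0],
]
A2 = [
    [0, 1, 0, 0, 0, 1, 0, 1],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 1],
    [0, 1, 1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 1, 1, 0],
    [1, 0, 0, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 1, 1, 0, 1],
    [1, 0, 1, 0, 0, 0, 1, 0],
]
# Cube graph: vertices 0..7, adjacent when their binary names differ in one bit.
CUBE = [[1 if bin(i ^ j).count("1") == 1 else 0 for j in range(8)]
        for i in range(8)]


def canon(A):
    """Smallest row-by-row binary code over all renumberings of A."""
    n = len(A)
    best = None
    for p in permutations(range(n)):
        code = 0
        for i in range(n):
            row = A[p[i]]
            for j in range(n):
                code = (code << 1) | row[p[j]]   # entry (i, j) of the renumbered matrix
        if best is None or code < best:
            best = code
    return best


def decode(code, n):
    bits = format(code, "0{}b".format(n * n))
    return [[int(bits[i * n + j]) for j in range(n)] for i in range(n)]


c1, c2, c_cube = canon(A1), canon(A2), canon(CUBE)
print(c1 == c2)       # True: two numberings of the same abstract graph
print(c1 == c_cube)   # False: the cube is not isomorphic to the cuneane
for row in decode(c1, 8):
    print(row)
```

The last loop prints the canonical matrix itself, which can be compared with the result of the hand computation in 3.6.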
3.6 Now the question arises: how do we find the canonical number of a given graph $\Gamma$? The brute force approach again requires examining all possible permutations $\pi \in S(V)$. Therefore we are in need of a method with the aid of which we are able to compute $\mathrm{canon}(\Gamma)$ without constructing the whole set $\mathcal{A}(\Gamma)$. In many cases, some clever tricks allow us to reduce drastically the number of permutations which have to be examined. Below we give a very simplified outline of the main ideas used in computations of a canonical code. Before doing so, however, let us return to the example of the cuneane. Consider Figure 3.2a, which represents the starting situation where no numbering is done so far. In order not to get confused by different meanings of numbers we start by using the letters $a, b, \ldots, h$ as names for the vertices.
All vertices in the graph have valency three. Thus, the first row of the canonical matrix must be 00000111, which means that the neighbours of the (still unknown) vertex 1 have to get the numbers 6, 7 and 8, i.e. $R(1) = \{6, 7, 8\}$. Now, we want to assign the numbers 1, 6, 7, 8 to vertices in such a way that in addition the second row of the adjacency matrix starts with as many 0's as possible. Therefore, we have to look for two non-adjacent vertices which have a maximum number of common neighbours. This requirement reduces the possible locations of the first numbers 1 and 2 to only four pairs of vertices: $a$ and $e$, $b$ and $d$, $b$ and $f$, $c$ and $e$. Each selection among these four options allows the construction of an adjacency matrix the first two rows of which read
\[ \begin{matrix} 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 & 0 & 1 & 1 \end{matrix} \]
Evidently, the canonical matrix must start in this way. Therefore, with the information used so far, 1 and 2 can be assigned in 8 possible ways. Assume that we have selected one of them, say $a, e$, such that $a$ gets number 1 and $e$ gets number 2. Next we have to find the vertex which gets the number 3. Clearly, we have to look for a vertex which is adjacent neither to $a$ nor to $e$, because only then can the third row of the desired matrix start with two zeros. There are two options, namely $c$ and $h$. The next property we want to see is that our candidate has a minimum number of neighbours outside of $N(a) \cup N(e)$. In the case just considered this property does not distinguish $c$ from $h$. However, in other cases, it can be usefully exploited in order to reduce the number of cases which have to be considered. Starting also with each of the remaining alternatives instead of $a$ and $e$, we find that for the three numbers 1, 2 and 3 there are the following options:
\[ \{a,e\}c, \; \{a,e\}h, \; \{b,d\}f, \; \{b,d\}g, \; \{b,f\}d, \; \{b,f\}h, \; \{c,e\}a, \; \{c,e\}g. \]
Here the vertex which gets the number 3 is written outside the brackets. Note that for each of the eight pairs $\{x, y\}z$ above we have to examine two options, namely $(x, y)z$ and $(y, x)z$.

Next, let us exploit the obvious symmetry of the cuneane. Since there is an automorphism which fixes the vertices $g$ and $h$ and interchanges $a$ and $f$, $b$ and $e$, and $c$ and $d$, the option $\{c,e\}g$ is already covered by $\{b,d\}g$. Since there is another automorphism which fixes the vertices $b$ and $e$ and interchanges $a$ and $c$, $f$ and $d$, and $g$ and $h$, the option $\{b,d\}g$ is covered by $\{b,f\}h$. This latter is in turn covered by $\{a,e\}h$. Finally, $\{b,d\}f$ is covered by $\{c,e\}a$, which in turn is covered by $\{a,e\}c$. Thus, we are left with the following options for assigning the numbers 1, 2, 3 (in this order): $aec$, $eac$, $aeh$, $eah$. Each of these partial numberings leads to a unique total numbering of the cuneane. For example, take the first option $aec$. We know that $a$ gets number 1, $e$ number 2 and $c$ number 3. The unique neighbour of $e$ which is not a neighbour of $a$, namely $d$, gets number 5. The unique neighbour of $c$ which is not a neighbour of $a$ or of $e$, namely $h$, gets number 4. The numbers 6, 7 and 8 are to be assigned to $b$, $f$ and $g$; moreover, to get a second row of the canonical matrix as indicated above, we have to assign 7 and 8 to $b$ and $f$, hence $g$ gets number 6. Now the third row of the matrix we are going to construct will contribute less to the value of the code if we assign 8, and not 7, to $b$.
Thus, the numbering is complete. The result is shown in Figure 3.2b. In the same way, starting with one of the other options eac, aeh and eah and proceeding completely analogously, we get the uniquely determined numberings shown in Figures 3.2c, 3.2d and 3.2e, respectively. The corresponding adjacency matrices are different. The numbering in Figure 3.2c yields the smallest code number. The corresponding adjacency matrix is

\[
A = \begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 & 1 & 0 & 1 & 1 \\
0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 0 & 1 & 0 & 1 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 & 0 & 1 & 0 \\
1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 & 0 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}.
\]

Thus

\[ canon(\Gamma) = 507522663119374560. \]

A different canonical numbering is given in Figure 3.1d. It corresponds to the originally considered starting triple bdf which, being covered by aec, was eliminated from further consideration.
[Figure 3.2: the cuneane. a) the vertices named a, ..., h; b) to e) the numberings obtained from the starting triples aec, eac, aeh and eah.]
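The code number quoted in 3.6 can be checked mechanically. The following is a minimal sketch, assuming (as in Remark 3.9.1) that the code number is obtained by reading the adjacency matrix row by row as one binary number; the names `CUNEANE` and `code_number` are introduced here for illustration only.

```python
# Adjacency matrix of the cuneane in the numbering of Figure 3.2c, copied from the text.
CUNEANE = [
    [0, 0, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
]


def code_number(matrix):
    """Read a 0/1 matrix row by row as one binary number (first entry most significant)."""
    value = 0
    for row in matrix:
        for entry in row:
            value = 2 * value + entry
    return value


print(code_number(CUNEANE))  # 507522663119374560
```

Python integers have arbitrary precision, so the same function also works for matrices whose code exceeds 64 bits.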
3.7 The most efficient algorithms known today for finding canonical numberings of graphs are based on a backtracking procedure for organizing the search through the set of all possible permutations of the vertex set $V = \{1, 2, \dots, n\}$. A backtracking procedure is described most conveniently by a so-called search tree, which is a rooted directed tree. The nodes of this tree represent certain situations during the process of computation, while the branches of the tree indicate which situation will be reached next. The root of the tree represents the start, the leaves represent the different permutations in $S(V)$.
The backtracking procedure consists of a walk through the search tree. Once a node is reached, one records a trace back to its predecessor in order to be able to go back and try another search direction. For our goal, a search tree could be used whose nodes are characterized by partial permutations. A partial permutation is a sequence $(\sigma(1), \dots, \sigma(k))$ which can be completed to a full permutation $(\sigma(1), \dots, \sigma(k), \sigma(k+1), \dots, \sigma(n))$ of $V$ (which is a permutation in the usual sense). The root of the search tree corresponds to the empty permutation, i.e. at the beginning nothing is fixed. Interior nodes of the search tree correspond to strictly partial permutations, while its leaves correspond to full permutations. An interior node corresponding to $(\sigma(1), \dots, \sigma(k))$ will then have $n - k$ successors characterized by $(\sigma(1), \dots, \sigma(k), \sigma(k+1))$, where $\sigma(k+1)$ is one of the numbers in $V - \{\sigma(1), \dots, \sigma(k)\}$. Such a search tree has

\[ n! \sum_{k=0}^{n-1} \frac{1}{(n-k)!} \]

nodes, a huge number even if $n$ is small. Therefore, short-cuts are of high importance. For each $x = (\sigma(1), \dots, \sigma(k))$ we can work out some lower bound for the code values found at all possible extensions of $x$ to a full permutation. If this bound is larger than the smallest code value found so far, then we do not need to explore the successors of $x$, i.e. the whole subtree rooted in $x$ can remain unexplored. Sophisticated bounding strategies reduce the walk through the search tree to only a small part of it, so that canonical numberings for graphs of up to several thousand vertices can be computed in reasonable time.
A more elaborate but still simplified description of the backtracking method may be found in [KliPR88]. Descriptions of methods for canonically labeling graphs (which are also implemented on a high professional level) can be found in [ArlZUF74], [McK77], [Leo84]. In [McK90] the nowadays well-known computer package nauty is presented, which includes a program for finding canonical numberings of graphs. In the elaboration of the example in 3.6 backtracking was not used explicitly. Further, bounding and short-cutting were done implicitly by using symmetry arguments. In general, this kind of bounding is not available unless we have at least partial information about the automorphism group of the graph under consideration. Some more remarks concerning the computation of canonical numberings are given in Section 9.5.
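The search-tree walk with bounding can be sketched as follows. This is a deliberately naive illustration, not the method of nauty or of the cited papers: the lower bound used here simply sets every matrix entry that is not yet determined by the partial permutation to 0, and a subtree is also skipped when its bound merely equals the best code found so far (ties cannot improve the minimum). The names `canonical_code`, `weight` and `extend` are introduced for this sketch.

```python
CUNEANE = [
    [0, 0, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
]


def canonical_code(adj):
    """Return (smallest code number, numbering) found by backtracking with bounding."""
    n = len(adj)

    def weight(i, j):
        # Positional value of entry (i, j) when the matrix is read row by row in binary.
        return 1 << (n * (n - 1 - i) + (n - 1 - j))

    best_code, best_perm = None, None
    perm, used = [], [False] * n  # perm[i] is the vertex that receives number i + 1

    def extend(bound):
        nonlocal best_code, best_perm
        # 'bound' is the code of the entries fixed so far; unknown entries count as 0.
        if best_code is not None and bound >= best_code:
            return  # no extension of this partial permutation can improve the minimum
        k = len(perm)
        if k == n:
            best_code, best_perm = bound, perm.copy()
            return
        for v in range(n):
            if used[v]:
                continue
            gain = adj[v][v] * weight(k, k)
            for i, u in enumerate(perm):
                gain += adj[u][v] * weight(i, k) + adj[v][u] * weight(k, i)
            used[v] = True
            perm.append(v)
            extend(bound + gain)
            perm.pop()
            used[v] = False

    extend(0)
    return best_code, best_perm


reversed_copy = [row[::-1] for row in CUNEANE[::-1]]  # same graph, vertices renumbered
print(canonical_code(CUNEANE)[0])        # 507522663119374560
print(canonical_code(reversed_copy)[0])  # 507522663119374560
```

Both calls return the same value, as they must for isomorphic graphs. A bound this weak is only practical for very small graphs; the sharper bounds mentioned in the text rely on vertex invariants.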
3.8 Let $\Gamma = (V; R)$ be a colored graph and $\tilde{\Gamma}$ the corresponding abstract graph. Every adjacency matrix $A$ of $\tilde{\Gamma}$ contains entries from the set $\{0, 1, \dots, d\}$. Therefore the code number $cd(A)$ must be defined by

\[ cd(A) = \sum_{i=1}^{n} \sum_{j=1}^{n} A_{i,j}\,(d+1)^{\,n^2 - (i-1)n - j}. \]

Also here a permutation $\sigma$ of $V$ will exist such that

\[ cd(A_\sigma) = \min_{P} cd(P^t A P), \]

where the minimum is taken over all permutation matrices $P$. Again such a numbering is called a canonical numbering. Proposition 3.5 remains valid without change also for colored graphs.
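As a small worked illustration of the formula for $cd(A)$, the sketch below evaluates it literally, with 1-based indices $i, j$ as in the text. The $2 \times 2$ input matrices are hypothetical examples chosen only to keep the arithmetic checkable by hand; `colored_code` is a name introduced here.

```python
def colored_code(matrix, d):
    """Evaluate cd(A) = sum of A[i][j] * (d + 1) ** (n*n - (i - 1)*n - j), with 1-based i, j."""
    n = len(matrix)
    return sum(
        matrix[i - 1][j - 1] * (d + 1) ** (n * n - (i - 1) * n - j)
        for i in range(1, n + 1)
        for j in range(1, n + 1)
    )


# Three colors 0, 1, 2 (d = 2): the code is 2 * 3**2 + 1 * 3**1 = 21.
print(colored_code([[0, 2], [1, 0]], d=2))  # 21
# Two colors (d = 1): the formula reduces to reading the matrix as the binary number 0110.
print(colored_code([[0, 1], [1, 0]], d=1))  # 6
```

The exponent runs from $n^2 - 1$ for the entry $(1,1)$ down to 0 for the entry $(n,n)$, so the code is the matrix read row by row as a number in base $d + 1$.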
3.9 Remarks:

1. Given $n$ and $c = canon(\Gamma)$ we are able to reconstruct $\Gamma$ together with its canonical numbering $\sigma$. For this aim, find the representation of $c$ as a binary number. This is a certain string of 0's and 1's which starts with a 1. If necessary, we add 0's to the left of this string to get a string of length $n^2$. After this we cut the whole string into $n$ small pieces of length $n$ each and take them as rows to build up the adjacency matrix $A_\sigma$. Thus, the mapping $A_\sigma \mapsto cd(A_\sigma)$ is invertible.

2. Perhaps organic chemists were the first to realize the necessity of a suitable canonical representation for graphs. There are several different systems for coding chemical molecular graphs; perhaps the best known of them are the IUPAC nomenclature rules (see [IUPAC]). For graphs of small size these rules work very effectively, and coding and decoding can be done without the aid of a computer. Unfortunately, in general, the use of these rules may cause misunderstandings and even mistakes. We refer to [KliLPZ92], [Rea83], [Rea85] for a discussion of this phenomenon.

3. The notion of a canonical labeling was suggested almost at the same time and independently by experts in mathematical chemistry and graph theory, see [ArlZUF74], [Ran74], [Pro74]. The advantage of a purely mathematical approach is that it is based on a general computational scheme. Alternatively, chemists started using heuristic methods and spent some time discussing their disadvantages, see e.g. [Mac75], [Ran74], [Ran75], [Ran76], [Ran77]. Canonical numbering is still exploited only by a very restricted community of experts in mathematical chemistry; the major part of chemists seems not to be ready to use this (perhaps in their eyes too) mathematical tool.

4. Effective bounding as a tool for short-cutting the search tree walk can be done using graph invariants, in particular vertex invariants. We shall discuss the role of graph invariants in the subsequent sections.
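The reconstruction described in Remark 1 can be sketched in a few lines. The function name `decode` is introduced here; the input value is the canonical number of the cuneane from 3.6.

```python
def decode(code, n):
    """Rebuild the n x n adjacency matrix from its code number (binary case)."""
    bits = format(code, "b").zfill(n * n)  # pad with 0's on the left to length n^2
    return [[int(b) for b in bits[r * n:(r + 1) * n]] for r in range(n)]


matrix = decode(507522663119374560, 8)
for row in matrix:
    print(*row)
```

The first printed row is `0 0 0 0 0 1 1 1` and the last is `1 1 1 0 0 0 0 0`, in agreement with the matrix of 3.6.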
4 Graph invariants, vertex and arc invariants
4.1 Let $\Gamma = (V; R_0, \dots, R_d)$ be a colored graph. A colored graph $\Gamma' = (V'; R'_0, \dots, R'_{d'})$ is a subgraph of $\Gamma$ if $V' \subseteq V$ and for each $i$, $0 \le i \le d'$, there is a $j(i)$ such that $R'_i = R_{j(i)} \cap (V' \times V')$. Thus, a subgraph of $\Gamma$ consists of a subset $V'$ of the vertex set $V$ and all the colored arcs of $\Gamma$ with both ends in $V'$. Given $\Gamma$, subgraphs are also denoted by $\Gamma(V')$.
If we use $V = \{1, \dots, n\}$ as description for the vertex set of $\Gamma$ then $V'$ will still be a set of natural numbers, but in general it will not be an interval any more. Since we want to study graphs and their subgraphs, and perhaps also subgraphs of subgraphs, we assume in this context that every vertex set $V$ considered is a subset of $\mathbb{N}$, the set of natural numbers.
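4.2 Let $\mathcal{G}$ be the set of all graphs $\Gamma = (V, R)$ with $V \subseteq \mathbb{N}$. A function $f : \mathcal{G} \to \mathbb{C}$ is called a (?dimensional) graph invariant if

\[ \Gamma \cong \Gamma' \implies f(\Gamma) = f(\Gamma'). \]

If also

\[ f(\Gamma) = f(\Gamma') \implies \Gamma \cong \Gamma' \]

is true, then $f$ is called a complete graph invariant.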
In many cases graph invariants take only two values, namely 0 or 1, the 1 indicating that a certain graph property is present, the 0 indicating that this property is missing. Properties whose indicator function is a graph invariant are called invariant properties, i.e. either both of two isomorphic graphs have this property or neither of them has it. Among the better known examples of invariant properties are:

1. Undirected graphs: connectedness, k-connectedness, planarity, being a tree, being a bipartite or even a multipartite graph, chordality, being an interval graph, etc.

2. Directed graphs: strong connectedness, being an acyclic graph, being the graph of a partial order, etc.

In other cases, the values of graph invariants are restricted to natural numbers. Trivial examples are the vertex number and the edge number. A few less trivial examples are: the clique number $\omega(\Gamma)$ (the size of the largest subset $V' \subseteq V$ such that $\Gamma(V')$ is a complete subgraph), the chromatic number $\chi(\Gamma)$ (the smallest number of colors needed to color the vertices of $\Gamma$ such that $(v, w) \in R$ implies that the colors of $v$ and $w$ are different), the cyclomatic number (the minimum number of edges which must be deleted in order to destroy all cycles), the connectivity number $k(\Gamma)$ (the largest number $k$ such that $\Gamma$ is k-connected), the diameter $d(\Gamma)$ (the largest distance between two vertices of $\Gamma$), and the maximum degree $\Delta(\Gamma)$.

The notion of graph invariants (and vertex and edge invariants below) applies also to colored graphs. For colored graphs, graph invariants are functions of the invariants of their color graphs.
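A quick way to see what invariance means in practice is to compute a few of these numbers for a graph and for a renumbered copy of it. The sketch below does this for the cuneane, with the neighbour sets read off from the matrix in 3.6; the helper names and the particular renumbering `shift` are assumptions made for illustration, and the diameter computation assumes a connected graph.

```python
from collections import deque

# Neighbour sets of the cuneane in the numbering of Figure 3.2c.
CUNEANE = {1: {6, 7, 8}, 2: {5, 7, 8}, 3: {4, 6, 8}, 4: {3, 5, 6},
           5: {2, 4, 7}, 6: {1, 3, 4}, 7: {1, 2, 5}, 8: {1, 2, 3}}


def eccentricity(graph, source):
    """Largest distance from source, found by breadth-first search (connected graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return max(dist.values())


def invariants(graph):
    """Vertex number, edge number, maximum degree and diameter of an undirected graph."""
    degrees = [len(nbrs) for nbrs in graph.values()]
    diameter = max(eccentricity(graph, v) for v in graph)
    return len(graph), sum(degrees) // 2, max(degrees), diameter


def relabel(graph, numbering):
    """Apply a renumbering of the vertices, producing an isomorphic graph."""
    return {numbering[v]: {numbering[w] for w in nbrs} for v, nbrs in graph.items()}


shift = {v: v % 8 + 1 for v in CUNEANE}  # hypothetical renumbering 1->2, ..., 8->1
print(invariants(CUNEANE))                                         # (8, 12, 3, 3)
print(invariants(relabel(CUNEANE, shift)) == invariants(CUNEANE))  # True
```

Equal values do not prove isomorphism, of course: none of these four numbers is a complete invariant.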
There are also graph invariants which can be complex-valued vectors or even complex-valued matrices. For instance, let $\lambda^i_j$, $1 \le j \le n$, be the eigenvalues of the adjacency matrix $A_i$ of the $i$-th color graph $\Gamma_i$ of a colored graph $\Gamma$. For a complex number $z$ let [...]

[...]

\[ v' = v^{\varphi},\ w' = w^{\varphi} \implies f(\Gamma, v, w) = f(\Gamma', v', w') \]

for any isomorphism $\varphi$ of $\Gamma$ and $\Gamma'$. In particular, this implies

\[ v, w \in V(\Gamma) \implies f(\Gamma, v^{\varphi}, w^{\varphi}) = f(\Gamma, v, w) \]

for any automorphism $\varphi$ of $\Gamma$. Some examples for non-trivial arc invariants are: the distance between $v$ and $w$ in the color graph $(V, R_i)$, $0 \le i \le d$; the number of vertices $z$ with $(v, z) \in R_i$ and $(z, w) \in R_j$, $0 \le i, j \le d$; the number of cycles in $(V, R_i)$ which contain $(v, w)$, $0 \le i \le d$; the maximum number of non-crossing paths from $v$ to $w$ (in each of the color graphs); any pair $(f(\Gamma, v), f(\Gamma, w))$ where $f$ is a vertex invariant.
5 Partitions and their Stabilizations
5.1 Let $V$ be a finite set. A partition $\mathcal{V} = \{V_1, \dots, V_r\}$ of $V$ is a set of subsets of $V$ with the properties (i) $V_i \cap V_j \neq \emptyset$ if and only if $i = j$ and (ii) $V = \bigcup_{i=1}^{r} V_i$. The sets $V_i$ are called the cells (or the classes) of $\mathcal{V}$. The partition with $r = 1$ and $V_1 = V$ is called the trivial partition. The partition with $r = |V|$ and $|V_i| = 1$ for $1 \le i \le r$ is called the split partition.

A partition $\mathcal{V}'$ is finer than $\mathcal{V}$ (and $\mathcal{V}$ is coarser than $\mathcal{V}'$) if every cell of $\mathcal{V}'$ is contained in a cell of $\mathcal{V}$. Write $\mathcal{V}' \le \mathcal{V}$ if $\mathcal{V}'$ is finer than $\mathcal{V}$. The relation $\le$ is a partial ordering on the set of partitions of $V$. The maximum of this partially ordered set is the trivial partition, the minimum is the split partition.

Let $\Gamma = (V; R)$ be a colored graph and $Aut(\Gamma)$ its automorphism group. A partition $\mathcal{V}$ of the vertex set $V = \{1, 2, \dots, n\}$ is called an invariant partition if $V_i^{\varphi} = \{v^{\varphi} : v \in V_i\} = V_i$ for $1 \le i \le r$ and all automorphisms $\varphi \in Aut(\Gamma)$. The automorphism partition $\mathcal{V}_{aut}(\Gamma)$ of $\Gamma$ is an example of an invariant partition, but every partition which is coarser than the automorphism partition is also invariant. Let $\mathcal{V} = (V_1, \dots, V_r)$ and $\mathcal{V}' = (V'_1, \dots, V'_{r'})$ be two invariant partitions. Then

\[ \mathcal{V} \cap \mathcal{V}' = \{ V_l \cap V'_{l'} : 1 \le l \le r \wedge 1 \le l' \le r' \wedge V_l \cap V'_{l'} \neq \emptyset \} \]

is a new invariant partition of $V$, called the meet of $\mathcal{V}$ and $\mathcal{V}'$. The meet of two partitions consists of all non-empty intersections of cells from both partitions.

Every equivalence relation on $V$ partitions $V$ into its equivalence classes. Vice versa, every partition $\mathcal{V}$ of $V$ defines an equivalence relation on $V$, the equivalence classes of which are the cells of $\mathcal{V}$. Let $(H, V) \le (Aut(\Gamma), V)$ (here $\le$ means that $H$ is a subgroup of $Aut(\Gamma)$). Consider the equivalence relation

\[ v \sim_H w \iff \exists\, h \in H : w = v^{h}. \]

The equivalence classes of this relation are the orbits of $(H, V)$. Such a partition is called an orbit partition. The automorphism partition itself is also an orbit partition. It is the maximum among all orbit partitions.
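The order relation and the meet of 5.1 translate directly into code. The following minimal sketch represents a partition as a list of Python sets; the function names and the sample partitions `p` and `q` are assumptions made for illustration.

```python
def is_finer(fine, coarse):
    """True if every cell of 'fine' is contained in some cell of 'coarse'."""
    return all(any(cell <= big for big in coarse) for cell in fine)


def meet(p, q):
    """All non-empty intersections of a cell of p with a cell of q."""
    return [a & b for a in p for b in q if a & b]


V = set(range(1, 7))
trivial = [V]                          # r = 1
split = [{v} for v in sorted(V)]       # r = |V|
p = [{1, 2, 3}, {4, 5, 6}]
q = [{1, 2}, {3, 4}, {5, 6}]

m = meet(p, q)
print([sorted(cell) for cell in m])          # [[1, 2], [3], [4], [5, 6]]
print(is_finer(m, p), is_finer(m, q))        # True True
print(is_finer(p, q), is_finer(q, p))        # False False
print(is_finer(split, m), is_finer(m, trivial))  # True True
```

The example also shows that the ordering is only partial: neither of `p` and `q` is finer than the other.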
5.2 An algorithm for finding the automorphism partition of arbitrary graphs can also be used to check isomorphism of two connected graphs $\Gamma_1 = (V_1, R_1)$ and $\Gamma_2 = (V_2, R_2)$. Assume $V_1 \cap V_2 = \emptyset$ (which is no loss of generality) and let $\Gamma$ be the disjoint union of $\Gamma_1$ and $\Gamma_2$. An automorphism $\varphi$ of $\Gamma$ will either satisfy $V_1^{\varphi} = V_1$ and $V_2^{\varphi} = V_2$, or $V_1^{\varphi} = V_2$ and $V_2^{\varphi} = V_1$. The two graphs $\Gamma_1$ and $\Gamma_2$ are isomorphic if and only if there is an automorphism $\varphi$ of $\Gamma$ which satisfies $V_1^{\varphi} = V_2$ and $V_2^{\varphi} = V_1$. This is the case if and only if the automorphism partition of $\Gamma$ consists of cells $O_1, \dots, O_r$ with $O_l \cap V_1 \neq \emptyset$ and $O_l \cap V_2 \neq \emptyset$ for all $l$, $1 \le l \le r$.
5.3 Assume $\Gamma = (V, R)$. A partition $\mathcal{V} = \{V_1, \dots, V_r\}$ of $V$ is called equitable with respect to $\Gamma$ if for all $k, l \in \{1, \dots, r\}$ the numbers $|R(v) \cap V_l|$ and $|R^t(v) \cap V_l|$ are constant on each cell $V_k$, i.e. depend on the cell index $k$ only and not on the vertex $v \in V_k$. Thus in an equitable partition each element $v$ of a cell $V_k$ has the same number of successors and the same number of predecessors in each cell of $\mathcal{V}$. Now let $\Gamma = (V; R)$ be a colored graph. A partition $\mathcal{V}$ is called equitable with respect to $\Gamma$ if it is equitable with respect to all color graphs $\Gamma_i = (V, R_i)$, $0 \le i \le d$.
The importance of the notion of an equitable partition is expressed by the following proposition, which is folklore.
Proposition. For every partition $\mathcal{V}$ of the vertex set $V$ of a (colored) graph $\Gamma$ there is a unique coarsest equitable partition $\overline{\mathcal{V}}$ which is finer than $\mathcal{V}$. If $\mathcal{V}$ is invariant then also $\overline{\mathcal{V}}$ is invariant.

Proof. To prove the first part of the proposition we shall present an algorithm which, given $\mathcal{V}$, computes $\overline{\mathcal{V}}$. Some mathematical notions used in the description of the procedure below, with which the reader may not be familiar, are explained in 5.4.
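procedure Stabgraph;
Input: A partition $\mathcal{V} = \{V_1, \dots, V_r\}$ of $V$; a graph $\Gamma = (V, R)$;
begin
    $\rho$ = r;
L:  for 1 $\le$ l $\le$ $\rho$ do
        for v $\in$ $V_l$ do
        begin
            L(v, 0) = l;
            for 1 $\le$ k $\le$ $\rho$ do
            begin
                LT(v, k) = $|R(v) \cap V_k|$;
                LS(v, k) = $|R^t(v) \cap V_k|$;
            end;
            L(v) = (L(v, 0), LT(v, 1), ..., LT(v, $\rho$), LS(v, 1), ..., LS(v, $\rho$));
        end;
    sort the lists L(v), v $\in$ V, lexicographically and
        for v $\in$ V let rank(v) be the rank of L(v) in this ordering;
    $\rho'$ = 1 + $\max_{v \in V}$ rank(v);
    for 1 $\le$ l $\le$ $\rho'$ do $V_l$ = {v : rank(v) = l - 1};
    if $\rho'$ > $\rho$ then
    begin
        $\rho$ = $\rho'$;
        goto L;
    end;
    r = $\rho$;
end.

Let $\mathcal{V} = \mathcal{V}_0, \mathcal{V}_1, \dots, \mathcal{V}_t$ be the sequence of partitions produced by the algorithm Stabgraph, and let $\mathcal{W}$ be any equitable partition which is finer than $\mathcal{V}$. Since $\mathcal{W} \le \mathcal{V}_0$ and since $\mathcal{W}$ is equitable, we conclude that in the first iteration of the algorithm we have $L(v) = L(w)$ for every pair of vertices $v, w$ which belong to the same cell of $\mathcal{W}$. This shows that $\mathcal{W} \le \mathcal{V}_1$. By the same argument, replacing $\mathcal{V}_0$ by $\mathcal{V}_1$, we find that $\mathcal{W} \le \mathcal{V}_2$. Proceeding in this way step by step we finally get $\mathcal{W} \le \mathcal{V}_t$. However, when Stabgraph stops, then $\mathcal{V}_{t-1} = \mathcal{V}_t$, which means exactly that $\mathcal{V}_t$ is equitable. Hence $\mathcal{V}_t$ is an equitable partition finer than $\mathcal{V}$ which is coarser than every other such partition, i.e. $\overline{\mathcal{V}} = \mathcal{V}_t$.

If $\mathcal{V}$ is invariant, then $V_l^{\varphi} = V_l$, $1 \le l \le r$, for every automorphism $\varphi$ of $\Gamma$. Since for any automorphism $\varphi$ of $\Gamma$ the equality $w = v^{\varphi}$ implies $R(v)^{\varphi} = R(w)$, it follows that $(R(v) \cap V_l)^{\varphi} = R(w) \cap V_l$, and hence $|R(v) \cap V_l| = |R(w) \cap V_l|$, $1 \le l \le r$. Thus $w \sim v$ implies $L(v) = L(w)$ in each iteration of the algorithm Stabgraph. Therefore, in each of the partitions $\mathcal{V}_s$, and in particular in the final partition $\overline{\mathcal{V}}$, the vertices $v$ and $w$ belong to the same cell. In other words, each cell of $\overline{\mathcal{V}}$ is a union of orbits of the automorphism group of $\Gamma$. This means exactly that $\overline{\mathcal{V}}$ is invariant. $\Box$

5.4 Remarks: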
1. Algorithm Stabgraph uses lexicographical sorting of vectors. Let $x = (x_1, x_2, \dots, x_h)$ and $y = (y_1, y_2, \dots, y_h)$ be vectors with real components. Then $x$ is called lexicographically less than $y$, written as $x \prec y$, if for the smallest index $j$ such that $x_j \neq y_j$ the inequality $x_j < y_j$ holds. Note that if no such index exists, then $x = y$. Sorting a set of $h$-dimensional vectors $\{x^{(1)}, x^{(2)}, \dots, x^{(s)}\}$ lexicographically means to find a permutation $\pi$ of $\{1, 2, \dots, s\}$ with the property $x^{(\pi(1))} \preceq x^{(\pi(2))} \preceq \dots \preceq x^{(\pi(s))}$. The rank of a vector $x^{(l)}$ in this lexicographic ordering is the number of different vectors $x^{(k)}$ with $x^{(k)} \prec x^{(l)}$. Note that some of the vectors $x^{(l)}$ may be equal. Therefore, the rank of $x^{(\pi(l))}$ can be smaller than $l - 1$. Sorting vectors in this way is done exactly like sorting words of a language in a dictionary, for which reason this kind of sorting is called lexicographical.

2. Stabgraph performs a sequence of refinements of the initial partition $\mathcal{V}$. The refinement is achieved by qualifying vertices according to their lists $L(v)$. If $\mathcal{V}$ is invariant, then in each step $l$ the list $L(v)$ is a new vertex invariant derived from the current partition $\mathcal{V}_l$.

3. If $\Gamma$ is a colored graph with $d > 2$, then algorithm Stabgraph must be applied repeatedly with different color graphs $\Gamma_i = (V, R_i)$ in order to get the coarsest equitable partition with respect to $\Gamma$. Since a refinement achieved using $\Gamma_i$ may render the current partition non-equitable with respect to $\Gamma_j$, it can happen that the same color graph $\Gamma_j$ has to be used several times. Note that each application of Stabgraph with some color graph $\Gamma_i$ refines the current input partition.

4. Since for an undirected graph $\Gamma = (V, R)$ we have $R(v) = S(v) = N(v)$, in algorithm Stabgraph the lists $L(v)$ can be shortened to $L(v) = (L(v, 0), L(v, 1), \dots, L(v, \rho))$, where now $L(v, k) = |N(v) \cap V_k|$.
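The procedure Stabgraph of 5.3 can be sketched in Python as follows. This is one possible transcription, not a reference implementation: the graph is assumed to be given as a dictionary mapping each vertex to its set of successors $R(v)$, the input partition as a list of cells in their given order, and the `goto` loop is replaced by a `while` loop. Python tuples compare lexicographically, which supplies the ordering of Remark 1.

```python
def stabgraph(succ, partition):
    """Refine an ordered partition until it is equitable (sketch of procedure Stabgraph)."""
    vertices = sorted(succ)
    pred = {v: set() for v in vertices}  # predecessor sets R^t(v)
    for v in vertices:
        for w in succ[v]:
            pred[w].add(v)

    cells = [sorted(cell) for cell in partition]
    while True:
        rho = len(cells)
        cell_of = {v: l for l, cell in enumerate(cells, start=1) for v in cell}
        lists = {}
        for v in vertices:
            out_counts = [0] * rho  # LT(v, k) = |R(v) intersected with V_k|
            in_counts = [0] * rho   # LS(v, k) = |R^t(v) intersected with V_k|
            for w in succ[v]:
                out_counts[cell_of[w] - 1] += 1
            for w in pred[v]:
                in_counts[cell_of[w] - 1] += 1
            lists[v] = (cell_of[v], *out_counts, *in_counts)
        distinct = sorted(set(lists.values()))  # rank = position among distinct lists
        new_cells = [[v for v in vertices if lists[v] == x] for x in distinct]
        if len(new_cells) == rho:  # no cell was split: the partition is equitable
            return new_cells
        cells = new_cells


# Hypothetical test graphs: a path on five vertices and a cycle on four vertices.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(stabgraph(path, [[0, 1, 2, 3, 4]]))  # [[0, 4], [2], [1, 3]]
print(stabgraph(cycle, [[0, 1, 2, 3]]))    # [[0, 1, 2, 3]]
```

Because the current cell number is the first component of each list, a refinement never merges or reorders existing cells; this is why comparing the number of cells suffices as a stopping test. For the regular cycle nothing is refined, which anticipates the situation of Example 5.8.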
5.5 Let $f$ be any vertex invariant and assume $\{f(\Gamma, v) : v \in V\} = \{f_1, \dots, f_r\}$. Assume that $f_1 < f_2 < \dots < f_r$ and define $V_l = \{v : f(\Gamma, v) = f_l\}$. Then $\mathcal{V}_f = \{V_1, \dots, V_r\}$ is an invariant partition. Each cell $V_l$ is a union of orbits of $Aut(\Gamma)$. Moreover, $\mathcal{V}_f$ is canonically ordered in the following sense: Let $\Gamma$ and $\Gamma'$ be two isomorphic graphs and $\mathcal{V}_f(\Gamma) = \{V_1, \dots, V_r\}$ and $\mathcal{V}_f(\Gamma') = \{V'_1, \dots, V'_r\}$ the corresponding partitions with respect to the invariant $f$. Then $V_l^{\varphi} = V'_l$, $1 \le l \le r$, for any isomorphism $\varphi$ between $\Gamma$ and $\Gamma'$. To express the fact that a partition $\mathcal{V}$ is considered as an ordered partition we shall write $\mathcal{V} = (V_1, \dots, V_r)$ instead of $\mathcal{V} = \{V_1, \dots, V_r\}$.
Let $\Gamma = (V, R)$ be an undirected graph (i.e. $R = R^t$). Let $deg(\Gamma, v)$ be the degree of $v$ in $\Gamma$ and assume that $deg(\Gamma, V) = \{d_1, \dots, d_r\}$ with $d_1 < \dots < d_r$. The partition $\mathcal{V}_{deg} = (V_1, \dots, V_r)$ defined by $V_i = \{v \in V : deg(\Gamma, v) = d_i\}$, $1 \le i \le r$, is called the degree partition of $\Gamma$. The refined partition $\overline{\mathcal{V}}_{deg}$ is called the total degree partition of $\Gamma$ and will be denoted by $TDV(\Gamma) = (TDV_1, \dots, TDV_r)$.
Note that, since $\mathcal{V}_{deg}$ is canonically ordered and since algorithm Stabgraph produces a series of partitions each of which is canonically ordered as well, the total degree partition $TDV(\Gamma)$ also has this property. This means: if $TDV(\Gamma) = (W_1, \dots, W_t)$ and $TDV(\Gamma') = (W'_1, \dots, W'_{t'})$, then $\Gamma \cong \Gamma'$ implies $t = t'$ and $W_i^{\varphi} = W'_i$ for $1 \le i \le t$ and all isomorphisms $\varphi$ between $\Gamma$ and $\Gamma'$. The same observation holds for any input partition $\mathcal{V}$ for Stabgraph. If $\mathcal{V}$ is canonically ordered, then so is $\overline{\mathcal{V}}$.
Application of algorithm Stabgraph to any invariant partition $\mathcal{V}_f$ yields a possibly finer invariant partition $\overline{\mathcal{V}}_f$. The question is: for which vertex invariant $f$ is $\overline{\mathcal{V}}_f$ always the automorphism partition? Another question is: given $f$, for which graphs $\Gamma$ is $\overline{\mathcal{V}}_f$ the automorphism partition? We shall discuss these questions in the subsequent sections.
5.6 The process of deriving $\overline{\mathcal{V}}$ from $\mathcal{V}$ is called a stabilization procedure. The meaning of this is the following: given an invariant partition $\mathcal{V}$, we refine this partition using the derived invariants $L(v)$. If no refinement is possible any more in this way, then the current partition is said to be stable (with respect to the refining procedure used).

There are many different possible stabilization procedures, depending on which invariants are used for getting a refinement of $\mathcal{V}$. Instead of the lists $L(v)$ we could use any graph invariant $f$ and construct the list $L_f(v) = (f(\Gamma(N(v) \cap V_1)), \dots, f(\Gamma(N(v) \cap V_r)))$, which is a new vertex invariant analogous to $L(v)$. Using $L_f$ in algorithm Stabgraph would result in a partition $\mathcal{V}^{\#}$ which is stable with respect to $f$, i.e. no pair $v, w$ belonging to the same cell of $\mathcal{V}^{\#}$ could be distinguished by their lists $L_f(v)$ and $L_f(w)$, respectively.

Since a partition $\mathcal{V}$ of $V$ is often given by or considered as a coloring of the elements of $V$, an equitable partition is also called a stable vertex coloring.
5.7 Example: Consider the graph in Figure 5.1 (the graphical representation of the acenaphth[1,2-a]acenaphthylene molecule, see [Rou76]). The degree partition is $(V_1, V_2)$ with

$V_1 = \{3, 4, 5, 7, 8, 9, 13, 14, 15, 17, 18, 19\}$,
$V_2 = \{1, 2, 6, 10, 11, 12, 16, 20, 21, 22\}$.
Let us use Stabgraph in order to compute the total degree partition.

Iteration 1:

L(1) = L(11) = L(21) = L(22) = (2, 0, 3),
L(2) = L(10) = L(12) = L(20) = (2, 1, 2),
L(3) = L(5) = L(7) = L(9) = L(13) = L(15) = L(17) = L(19) = (1, 1, 1),
L(4) = L(8) = L(14) = L(18) = (1, 2, 0),
L(6) = L(16) = (2, 2, 1).

[Figure 5.1: the molecular graph of acenaphth[1,2-a]acenaphthylene with vertices numbered 1 to 22.]

Lexicographical sorting gives
L(3) = L(5) = L(7) = L(9) = L(13) = L(15) = L(17) = L(19)
$\prec$ L(4) = L(8) = L(14) = L(18)
$\prec$ L(1) = L(11) = L(21) = L(22)
$\prec$ L(2) = L(10) = L(12) = L(20)
$\prec$ L(6) = L(16).

The new partition is $\mathcal{V}^1 = \{V_1, V_2, V_3, V_4, V_5\}$ defined by

$V_1 = \{3, 5, 7, 9, 13, 15, 17, 19\}$, $V_2 = \{4, 8, 14, 18\}$, $V_3 = \{1, 11, 21, 22\}$, $V_4 = \{2, 10, 12, 20\}$, $V_5 = \{6, 16\}$.

Iteration 2: This step is performed with the help of Figure 5.2, which shows the partition of the vertex set of our graph after the first refinement step. We get

L(1) = L(11) = (3, 0, 0, 1, 2, 0),
L(2) = L(10) = L(12) = L(20) = (4, 1, 0, 2, 0, 0),
L(3) = L(9) = L(13) = L(19) = (1, 0, 1, 0, 1, 0),
L(4) = L(8) = L(14) = L(18) = (2, 2, 0, 0, 0, 0),
L(5) = L(7) = L(15) = L(17) = (1, 0, 1, 0, 0, 1),
L(6) = L(16) = (5, 2, 0, 1, 0, 0),
L(21) = L(22) = (3, 0, 0, 0, 2, 1).

[Figure 5.2: the graph of Figure 5.1 with each vertex labelled by its cell number in $\mathcal{V}^1$.]

Lexicographical sorting gives
L(5) = L(7) = L(15) = L(17)
$\prec$ L(3) = L(9) = L(13) = L(19)
$\prec$ L(4) = L(8) = L(14) = L(18)
$\prec$ L(21) = L(22)
$\prec$ L(1) = L(11)
$\prec$ L(2) = L(10) = L(12) = L(20)
$\prec$ L(6) = L(16).

According to this result the new partition is $\mathcal{V}^2 = (V_1, V_2, V_3, V_4, V_5, V_6, V_7)$ defined by

$V_1 = \{5, 7, 15, 17\}$, $V_2 = \{3, 9, 13, 19\}$, $V_3 = \{4, 8, 14, 18\}$, $V_4 = \{1, 11\}$, $V_5 = \{21, 22\}$, $V_6 = \{2, 10, 12, 20\}$, $V_7 = \{6, 16\}$.

This partition is indicated in Figure 5.3.

[Figure 5.3: the graph of Figure 5.1 with each vertex labelled by its cell number in $\mathcal{V}^2$.]
Looking at Figure 5.3 we see immediately that $\mathcal{V}^2$ is the automorphism partition of our graph. Therefore, $\mathcal{V}^2$ must be stable. We can check stability by trying a third iteration.

Iteration 3:

L(1) = L(11) = (4, 0, 0, 0, 1, 0, 2, 0),
L(2) = L(10) = L(12) = L(20) = (6, 0, 1, 0, 1, 1, 0, 0),
L(3) = L(9) = L(13) = L(19) = (2, 0, 0, 1, 0, 0, 1, 0),
L(4) = L(8) = L(14) = L(18) = (3, 1, 1, 0, 0, 0, 0, 0),
L(5) = L(7) = L(15) = L(17) = (1, 0, 0, 1, 0, 0, 0, 1),
L(6) = L(16) = (7, 2, 0, 0, 0, 1, 0, 0),
L(21) = L(22) = (5, 0, 0, 0, 0, 0, 2, 1).

This result proves $\mathcal{V}^3 = \mathcal{V}^2$. Hence $\mathcal{V}^2$ is stable.
However, the situation met in this example is a rather particular one. In general, we reach a partition which (doing some simple computations) we can easily prove to be stable but we do not know (and cannot see easily) whether it is the automorphism partition or not.
5.8 Example:

Let us return to the example in 3.1. The cuneane is a so-called cubic graph, i.e. it is regular of degree 3. For regular graphs the (absolutely) coarsest equitable partition is $\mathcal{V} = (V_1)$, the trivial partition. This partition is invariant of course, and it is also stable with respect to the stabilization procedure using the vertex invariant $deg(\Gamma, v)$. This means that application of Stabgraph to the cuneane has no effect at all and does not produce any refinement.
[Figure 5.4: the cuneane. Left: a numbering of the vertices by 1, ..., 8. Right: the partition $(V_1, V_2)$ obtained from the invariant dist.]

Every time we are faced with a graph $\Gamma$ and an invariant stable partition $\mathcal{V}$ we would like to know whether $\mathcal{V}$ is the automorphism partition or not. In many cases, the answer will be negative. So it is with the cuneane and the trivial partition, since in [KliRRT95], Section 4.26, it was already shown that this graph has more than one orbit with respect to its automorphism group. To get along with regular graphs we have to use a vertex invariant which is stronger than the vertex degree and which will help to distinguish at least some of the non-equivalent vertices. Let us try the vertex invariant

\[ dist(\Gamma, v) = \bigl( |\{ w : d(w, v) = i \}| \bigr)_{i = 1, 2, 3, \dots} \]

which is a vectorial invariant, the $i$-th component of which equals the number of vertices $w$ whose distance $d(w, v)$ to $v$ is equal to $i$. Note that the distance between two vertices $v$ and $w$ is defined to be the length of a shortest path connecting them. To find $dist(\Gamma, v)$, $v \in V$, we use the numbering indicated in the left part of Figure 5.4 and compute the distance matrix $D = (d(v, w))_{v, w \in V}$. It reads

\[
D = \begin{pmatrix}
0 & 1 & 2 & 2 & 2 & 2 & 1 & 1 \\
1 & 0 & 1 & 1 & 2 & 3 & 2 & 2 \\
2 & 1 & 0 & 1 & 2 & 2 & 3 & 1 \\
2 & 1 & 1 & 0 & 1 & 2 & 2 & 2 \\
2 & 2 & 2 & 1 & 0 & 1 & 1 & 2 \\
2 & 3 & 2 & 2 & 1 & 0 & 1 & 1 \\
1 & 2 & 3 & 2 & 1 & 1 & 0 & 2 \\
1 & 2 & 1 & 2 & 2 & 1 & 2 & 0
\end{pmatrix}.
\]

From this matrix we readily get the values of $dist(\Gamma, v)$:

dist($\Gamma$, 1) = (3, 4, 0), dist($\Gamma$, 2) = (3, 3, 1), dist($\Gamma$, 3) = (3, 3, 1), dist($\Gamma$, 4) = (3, 4, 0),
dist($\Gamma$, 5) = (3, 4, 0), dist($\Gamma$, 6) = (3, 3, 1), dist($\Gamma$, 7) = (3, 3, 1), dist($\Gamma$, 8) = (3, 4, 0).
From these values we derive the invariant partition $\mathcal{V} = (V_1, V_2)$ with

$V_1 = \{2, 3, 6, 7\}$, $V_2 = \{1, 4, 5, 8\}$,

which is shown in the right picture of Figure 5.4. Now let us apply algorithm Stabgraph to $\Gamma$ with initial partition $\mathcal{V}$. We get

L(1) = (2, 2, 1), L(5) = (2, 2, 1),
L(2) = (1, 1, 2), L(6) = (1, 1, 2),
L(3) = (1, 1, 2), L(7) = (1, 1, 2),
L(4) = (2, 2, 1), L(8) = (2, 2, 1)

and

L(2) = L(3) = L(6) = L(7) $\prec$ L(1) = L(4) = L(5) = L(8).

Hence $\overline{\mathcal{V}} = \mathcal{V}$: the initial partition is already stable. However, it is still not the automorphism partition (see also [KliRRT95], Section 4.26).
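The computation of $dist(\Gamma, v)$ and of the induced ordered partition in Example 5.8 can be reproduced from the distance matrix $D$ given there. The helper names below are introduced for this sketch, and the cut-off `max_distance=3` relies on the fact that no entry of $D$ exceeds 3.

```python
# Distance matrix of the cuneane in the numbering of Figure 5.4, copied from the text.
D = [
    [0, 1, 2, 2, 2, 2, 1, 1],
    [1, 0, 1, 1, 2, 3, 2, 2],
    [2, 1, 0, 1, 2, 2, 3, 1],
    [2, 1, 1, 0, 1, 2, 2, 2],
    [2, 2, 2, 1, 0, 1, 1, 2],
    [2, 3, 2, 2, 1, 0, 1, 1],
    [1, 2, 3, 2, 1, 1, 0, 2],
    [1, 2, 1, 2, 2, 1, 2, 0],
]


def dist_vector(row, max_distance=3):
    """Number of vertices at distance 1, 2, ..., max_distance (one row of D)."""
    return tuple(row.count(i) for i in range(1, max_distance + 1))


def partition_by_invariant(values):
    """Ordered partition whose cells collect the vertices with equal invariant value."""
    distinct = sorted(set(values.values()))
    return [[v for v in sorted(values) if values[v] == x] for x in distinct]


dist = {v: dist_vector(D[v - 1]) for v in range(1, 9)}
print(dist[1], dist[2])              # (3, 4, 0) (3, 3, 1)
print(partition_by_invariant(dist))  # [[2, 3, 6, 7], [1, 4, 5, 8]]
```

Sorting the distinct invariant values before forming the cells is what makes the resulting partition canonically ordered in the sense of 5.5.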
5.9 Example: Consider the graph in Figure 5.5. In the picture on the left we have chosen an arbitrary numbering of the vertices. The picture in the middle shows the degree partition of this graph, which we take as initial partition for Stabgraph.

[Figure 5.5: a graph on 14 vertices. Left: an arbitrary numbering of the vertices. Middle: the degree partition. Right: the partition $\mathcal{V}^1$.]

Let us compute the total degree partition.

Iteration 1:
L(1) = L(2) = L(3) = L(4) = L(5) = L(6) = L(7) = L(8) = (1, 1, 1),
L(9) = L(10) = L(11) = L(12) = (2, 2, 1),
L(13) = L(14) = (2, 0, 3).

Thus,

L(1) = L(2) = L(3) = L(4) = L(5) = L(6) = L(7) = L(8) $\prec$ L(13) = L(14) $\prec$ L(9) = L(10) = L(11) = L(12)

and $\mathcal{V}^1 = \{V_1, V_2, V_3\}$ with

$V_1 = \{1, 2, 3, 4, 5, 6, 7, 8\}$, $V_2 = \{13, 14\}$, $V_3 = \{9, 10, 11, 12\}$.
In Figure 5.5, $\mathcal{V}^1$ is shown in the picture on the right. A second iteration shows that $\mathcal{V}^1$ is already stable. Hence, it is the total degree partition. However, $\mathcal{V}^1$ is not the automorphism partition. We have for example $2 \not\sim 7$, since 2 belongs to a cycle of length five, but 7 does not.
The number of cycles of a given length, say $k$, to which a vertex $v$ belongs is an obvious vertex invariant. Denote this number by $cn_k(v)$. The length $k$ may vary between 1 (a loop) and $n$ (a Hamiltonian cycle). We get a vectorial vertex invariant if we collect all numbers $cn_k(v)$, $1 \le k \le n$, taking them as the components of a vector

\[ cn(v) = (cn_1(v), \dots, cn_n(v)). \]

For the graph in our example we get
cn(1) = cn(2) = cn(3) = cn(4) = (0, 0, 0, 0, 1, 0, 0, 0, 2, 0, 1, 3, 0, 0),
cn(5) = cn(6) = cn(7) = cn(8) = (0, 0, 0, 0, 0, 1, 0, 0, 2, 1, 2, 2, 0, 0),
cn(9) = cn(10) = cn(11) = cn(12) = (0, 0, 0, 0, 1, 1, 0, 0, 3, 1, 2, 2, 0, 0),
cn(13) = cn(14) = (0, 0, 0, 0, 1, 2, 0, 0, 4, 1, 1, 2, 0, 0).

The corresponding invariant partition is

\[ \mathcal{V}_{cn} = \{\{1, 2, 3, 4\}, \{5, 6, 7, 8\}, \{9, 10, 11, 12\}, \{13, 14\}\}, \]

which can be shown to be the automorphism partition. Note that we would find this partition also if instead of $cn$ we used its first nine components only. This means that in fact it is not necessary to find the numbers $cn_{10}, \dots, cn_{14}$. Unfortunately, we do not know this in advance.
Now, for the graph in this example $cn(v)$ determines the automorphism partition directly, even without the use of Stabgraph. For other graphs, however, this is not the case. The crucial point is that we never know in advance which vertex invariant will be helpful and which will not. Another crucial point is that $cn(v)$ is not efficiently computable. Computing $cn_n(v)$ means in particular computing the number of Hamiltonian cycles to which $v$ belongs. There is no known algorithm which would perform this task in reasonable time on arbitrary graphs. Therefore, $cn(v)$ is not a useful invariant for large graphs in general, although it may be useful in some special cases, for instance with graphs of small size as in this example.
5.10 From the foregoing examples we learn that Stabgraph is a systematic method for computing a partition of the vertex set of a graph $\Gamma$ which is in some sense "near" to the automorphism partition $\mathcal{V}_{aut}(\Gamma)$. In cases where we have to state that Stabgraph did not succeed in finding $\mathcal{V}_{aut}(\Gamma)$ we have to look for further refinements of the partition at hand. In Example 5.8 we used an additional vertex invariant in order to get a finer initial partition. In Example 5.9 we succeeded in finding the automorphism partition by using the invariant $cn(v)$. The question arises whether we can create an algorithm which works similarly to Stabgraph but systematically uses a set of efficiently computable vertex invariants instead of merely $deg(\Gamma, v)$, such that we do not get trapped in situations like those in Example 5.8 or 5.9.

Suppose we insist on the property "efficiently computable", meaning that the amount of time needed for one run of the algorithm remains bounded by the value of some universal polynomial $p(x)$ at the vertex number $n$ of the input graph, i.e. the time needed to process a graph $\Gamma$ with $n$ vertices is at most $p(n)$. Suppose we also insist on the condition that the algorithm ends up with the automorphism partition no matter to which graph it is applied. Then there is still no answer to this question: we just do not know whether such an algorithm exists or not.

However, there is an algorithm which runs in polynomial time and which takes into account a reasonable part of the known and efficiently computable vertex invariants and arc invariants. This algorithm, known as the Weisfeiler-Leman stabilization method (WL-algorithm, for short), to which reference has been made several times already in [KliRRT95], is nowadays judged to be the best stabilization method from the point of view of practical applications. Its time need in the worst case is of the same order as the time need for Stabgraph. It produces not only a good "approximation" to the automorphism partition but also partitions the set of arcs and the set of non-arcs into subsets which are unions of the 2-orbits of $Aut(\Gamma)$. There is a well founded mathematical theory around this algorithm, see [KliRRT95], [BabBLT97] and [BabCKP97], which will be considered to some extent in Section 7.
5.11 In each iteration of Stabgraph the invariants $L(v)$ are computed and compared lexicographically. Lexicographical sorting of $n$ vectors of length $h$ needs time $O(hn \ln n)$, see [AhoHU75]. This means that for large $n$ the time needed to do the sorting is bounded from above by $C h n \ln(n)$, where $C$ is some positive constant. For small $n$ this bound is perhaps unrealistically large; the actual time needed is much less, at least on average. Nevertheless, since we have to repeat the sorting in each iteration, this part of the algorithm is rather time consuming. We may therefore try to avoid computing and lexicographically sorting the lists $L(v)$ by applying some other method for distinguishing non-equivalent vertices.

Let us study the so-called sum algorithm. Assume that $\Gamma$ is undirected and that we know some initial invariant partition $V = (V_1, \ldots, V_r)$ (the trivial partition or the degree partition, for example). We start by labeling the vertices by their cell numbers. After that, for each vertex a new label is computed by taking the sum of the labels of its neighbours. In general this gives a finer invariant partition. Clearly, if two vertices get different labels, then they cannot be equivalent. Now the procedure can be repeated until no refinement of the current partition is achieved.
To see how the method is supposed to work, consider the example prepared in Figure 5.6. The uppermost picture shows the initial partition (the degree partition). After one iteration of the sum method the four end vertices of the tree have been partitioned into two cells. On the other hand, the neighbours of the endpoints, which are not all equivalent, have all got the same label. After the second run (see the third picture) all endpoints again have the same label, while all other non-equivalent vertices have different labels. After a third iteration we come back to the situation in the second picture. Thus we see that the sum method does not stabilize the vertex partition. On the contrary, this partition oscillates. Every second run yields the same partition as in the third picture (one can prove this), and each run after it yields the same partition as in the fourth picture.

[Figure 5.6: a tree shown four times, with the initial labels (degree partition) and the labels after one, two and three iterations of the sum method.]

The example in Figure 5.6 is a "bad example" for the sum method. On other examples the method will work quite well and will even give the automorphism partition. The reader may convince himself that this is the case, for instance, with the graph in Figure 5.3.
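The oscillation described above can be replayed with a few lines of Python. This is only a sketch: the edge list below is an assumption, reconstructed from the degree partition and total degree partition that the text states for this tree, and the names `neighbours`, `sum_step` and `partition` are illustrative.

```python
# Edge list of a 9-vertex tree; an assumption reconstructed from the
# partitions stated in the text, not copied from the figure.
EDGES = [(1, 2), (2, 3), (3, 4), (3, 5), (5, 7), (4, 6), (6, 8), (6, 9)]


def neighbours(edges):
    """Return a dict mapping each vertex to the set of its neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def sum_step(adj, labels):
    """One run of the sum method: new label = sum of the neighbours' labels."""
    return {v: sum(labels[w] for w in adj[v]) for v in adj}


def partition(labels):
    """Group vertices with equal labels; returns a set of frozensets."""
    cells = {}
    for v, lab in labels.items():
        cells.setdefault(lab, set()).add(v)
    return {frozenset(c) for c in cells.values()}


adj = neighbours(EDGES)
# Degree partition as start: for this tree the cell number equals the degree.
labels = {v: len(adj[v]) for v in adj}
history = [partition(labels)]
for _ in range(4):
    labels = sum_step(adj, labels)
    history.append(partition(labels))

for run, cells in enumerate(history):
    print(run, sorted(sorted(c) for c in cells))
```

On this tree the partitions after runs 1 and 3 coincide, and so do those after runs 2 and 4, while the two differ from each other.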
5.12 The sum method is the basic ingredient of an algorithm for generating a unique machine description of chemical structures published by H. L. Morgan in 1965. Morgan started with labels equal to the degrees of the vertices and called the sums of the labels of their neighbours extended connectivities. In chemical publications this method is frequently referred to as Morgan's procedure. However, Morgan did not invent it. It seems that this method had been folklore in chemistry for a long time before, see the corresponding comments in [MarM56] and [BalMB85a]. We do not know exactly what led Morgan to introduce his procedure. The authors of [BalMB85a] refer to [MarM56] as a textbook suitable for chemists in which similar iterative procedures are used for solving systems of linear equations.

Let $A$ be the adjacency matrix of the undirected graph $\Gamma$ and assume that $x = (x_1, \ldots, x_n)$ is the vector of the labels $x_i$ assigned to the vertices $i$ at the beginning of the sum algorithm. Then the new labels after one iteration give a new vector $y$ which satisfies $y = Ax$. Indeed, for every vertex $i$ we have

$y_i$ = new label of $i$ = sum of all labels $x_j$ of the neighbours of $i$.

In formal terms this reads

\[ y_i = \sum_{j \in N(i)} x_j = \sum_{j \in N(i)} A_{ij} x_j = \sum_{j=1}^{n} A_{ij} x_j . \]

Thus, the series of labels produced by the sum algorithm is the series $x, Ax, A^2 x, A^3 x, \ldots$ Oscillations of the corresponding partitions can be studied using this representation of the labels.
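As a small check of the matrix formulation, the following sketch computes the first terms of the series $x, Ax, A^2x, A^3x$ for a 9-vertex tree. The edge list is an assumption (a tree consistent with the partitions quoted for Figures 5.6 and 5.7), and the helper names are illustrative.

```python
N = 9
# Assumed edge list (vertices numbered 1..9); not taken verbatim from the paper.
EDGES = [(1, 2), (2, 3), (3, 4), (3, 5), (5, 7), (4, 6), (6, 8), (6, 9)]


def adjacency_matrix(n, edges):
    """Symmetric 0/1 adjacency matrix; vertex i is stored at index i - 1."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u - 1][v - 1] = A[v - 1][u - 1] = 1
    return A


def mat_vec(A, x):
    """Matrix-vector product y = A x, i.e. y_i = sum_j A_ij x_j."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


A = adjacency_matrix(N, EDGES)
x = [sum(row) for row in A]          # Morgan's start: labels = degrees
series = [x]
for _ in range(3):
    series.append(mat_vec(A, series[-1]))

for k, labels in enumerate(series):
    print(k, labels)
```

Each entry of `series[k]` is the label of the corresponding vertex after `k` runs of the sum algorithm.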
Oscillation is very frequent with the sum algorithm. Even with the graph Morgan used in [Mor65] for introducing his algorithm (also a tree, similar to the one in Figure 5.6) the method fails to produce a stable partition. However, this drawback of the sum method (no result, because the partition does not become stable) is easy to overcome. One just has to apply the method stepwise, cell by cell, keeping the membership in a cell as the first discriminating feature and the sum of the labels in the neighbourhood as the second. Proceeding in this way, oscillation will never appear, since each new partition is a refinement of the previous one. Thus, such a version of the sum algorithm is indeed a stabilization method.
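A minimal sketch of such a cell-by-cell variant is given below. It is not Morgan's or Unger's original formulation; the function name `refine` and the tree used for the demonstration (an edge list reconstructed from the partitions quoted for Figure 5.7) are assumptions. Each vertex gets the key (current cell, sum of neighbouring cell numbers), and cells are renumbered by the rank of their key, so every new partition refines the old one.

```python
# Assumed edge list of the 9-vertex example tree.
EDGES = [(1, 2), (2, 3), (3, 4), (3, 5), (5, 7), (4, 6), (6, 8), (6, 9)]


def neighbours(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def refine(adj, cell_of):
    """Repeat the (cell, neighbour-sum) refinement until no cell splits."""
    while True:
        keys = {v: (cell_of[v], sum(cell_of[w] for w in adj[v])) for v in adj}
        # Rank the distinct keys; the old cell number comes first, so the
        # new partition is always a refinement of the old one.
        rank = {k: i + 1 for i, k in enumerate(sorted(set(keys.values())))}
        new_cell_of = {v: rank[keys[v]] for v in adj}
        if len(rank) == len(set(cell_of.values())):
            return new_cell_of          # no cell was split: stable
        cell_of = new_cell_of


adj = neighbours(EDGES)
start = {v: len(adj[v]) for v in adj}   # degree partition
stable = refine(adj, start)

cells = {}
for v, c in stable.items():
    cells.setdefault(c, set()).add(v)
print(sorted(sorted(c) for c in cells.values()))
```

On this tree the procedure stops with six cells. Note that plain sums can coincide by accident, so in general the result may be coarser than what a comparison of full label lists would give.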
5.13 A version of the sum algorithm which has the features of a stabilization method was already on the market shortly before Morgan's paper appeared. In [Ung64] S. H. Unger presented a graph isomorphism algorithm in which an appropriate version of the sum method was used as a stabilizer for vertex partitions. Unger apparently is a computer scientist and not a chemist; thus his paper was completely neglected, or at least not recognized, in the chemical world. It has never been cited in chemical papers dealing with isomorphism of graphs, not even in such important and comprehensive papers as [BalMB85a], [MekBBB85] and [BalMB85b]. Unger also observed in [Ung64] that his algorithm works as well if, instead of the sum, any other symmetric function of the labels $x_j$, $j \in N(i)$, is taken. For instance, we could compute the new labels $y_i$ according to the formula

\[ y_i = \prod_{j \in N(i)} x_j . \]
Also the kind of arithmetic can be freely chosen. Instead of using the usual arithmetic of integers we could perform all computations modulo n, (n the vertex number), or modulo some other appropriate integer p: R. Karp has analyzed the sum algorithm on the base of the arithmetic modulo 2: In his paper [Kar78] he showed that this sum algorithm needs time O(n ln(n)) and determines the automorphism partition correctly for (n)% of all graphs with n vertices, where (n) = nn : 2 A report on some extended form of the sum algorithm is given in [Ber78]. 2
3 2
2
5.14 Perhaps the first graph isomorphism algorithm which used a subroutine similar to Stabgraph as a stabilizer for partitions was described in 1970 in [CorG70] by D. G. Corneil and C. C. Gotlieb. However, stabilization as a tool for recognizing graph isomorphism was considered earlier by B. Y. Weisfeiler and A. A. Leman, even in a much more general context. Their first publications were in Russian, so they did not become known in the western world. The elaborate English version [Wei76] of their results did not appear until 1976. The Weisfeiler-Leman stabilization method was the starting point for a nowadays well developed algebraic theory, the theory of cellular algebras, see [KliRRT95]. A detailed description of this stabilization method and its computer implementation is given in [BabCKP97] and [BabBLT97].
5.15 Since 1965, two lines of activity, one in chemistry and one in mathematics and computer science, have produced an enormous number of graph isomorphism algorithms. In chemistry, mainly the Morgan approach was developed further; for a review of this line of activity see [BalMB85a], [MekBBB85], [BalMB85b], [BonMB85]. Papers not mentioned in this review, or more recent papers, are [Ber87], [RueR90], [Uch80a], [Uch80b], [Uch81]. Another early paper on graph isomorphism is [Sus65]. The algorithm reported in it uses partitioning, but no refinement method is applied.
5.16 In mathematics, from about 1970 on, several backtracking algorithms for general graph isomorphism have been implemented and compared. A bibliography of the earlier attempts to attack graph isomorphism has been given in [ReaC77]. This bibliography has been extended in [Gat79]. Several computer packages supplying graph isomorphism algorithms are publicly available via the internet, the most important being nauty, see [McK90].

In the last decades the main interest of mathematicians and computer scientists in the graph isomorphism problem has concentrated on its complexity aspect. Despite considerable efforts of many researchers, up to now the complexity of the isomorphism problem could not be compared with the complexity of other standard combinatorial problems, as for instance the Hamiltonian path problem. It is known today that the complexity of the isomorphism problem is not reduced if one considers only graphs with some particular properties, for example regular graphs, chordal graphs, bipartite graphs, and many others. On the other hand, there are many graph classes known for the members of which there exists a polynomial graph isomorphism test (an algorithm which, for testing isomorphism of two graphs with $n$ vertices, needs at most $c n^a$ elementary computational steps, where $c$ and $a$ are positive constants). Examples are trees, planar graphs, interval graphs, graphs with bounded valencies, graphs with bounded multiplicities, and many others. A survey of the earlier complexity results is given in [BooC79].
6 The total degree partition
6.1 Let $\Gamma = (V, R)$ be an undirected graph. The total degree partition $\mathcal{TDV}(\Gamma)$ of $\Gamma$ has been defined in 5.5. It is the coarsest equitable partition of $V$. It is also the partition which results from an application of Stabgraph to the degree partition of $\Gamma$. Instead of the degree partition we could also use the trivial partition $(V)$ as the initial partition for Stabgraph. The result would be the same, since the degree partition is identical with the result of the first iteration when Stabgraph is started with $(V)$. The total degree partition of a graph with $n = |V|$ vertices can be computed in time O(n log(n)). This has been observed several times in the literature, see for instance [Col79], [CaiFI92]; implicitly it has been mentioned also in [AhoHU75].

Since $\mathcal{TDV}(\Gamma)$ is an invariant partition, its cells are unions of orbits. The best case appears when each cell is an orbit itself, i.e. when $\mathcal{TDV}(\Gamma)$ coincides with the automorphism partition of $\Gamma$. There is no simple criterion which would allow us to decide whether this coincidence happens or not. "Simple" means here that we could check the equality $\mathcal{TDV}(\Gamma) = V_{\mathrm{aut}}(\Gamma)$ with the help of an algorithm which needs at most time $O(n^a)$ to produce the answer, where $a$ is a small natural number. Indeed, there is no algorithm known for this job which has a polynomial time bound valid for arbitrary graphs. However, in special favorable cases we are even able to prove that $\mathcal{TDV}(\Gamma) = V_{\mathrm{aut}}(\Gamma)$ holds. The class of trees constitutes such a favorable case.
6.2 Proposition. Let $\Gamma$ be a tree. Then $\mathcal{TDV}(\Gamma) = V_{\mathrm{aut}}(\Gamma)$.

Proof. Several proofs of this fact can be found in the literature. Perhaps the first proof is given in [Cor68]. Subsequent independent proofs have been given in [JaR74], [Sza74], [Tin75] and [Col79]. The Proposition follows also from a certain theorem in [Tin86] which will be discussed in one of the following subsections. $\Box$

The class of trees is the simplest class of graphs with respect to problems concerning automorphisms, isomorphisms and encoding procedures. The total degree partition of trees may be found by a special algorithm in time $O(n)$. There is an $O(n)$ isomorphism algorithm for trees, see [AhoHU75]. Further, there exist $O(n)$ encoding and decoding algorithms for trees, see for example [Rea72] or [TiS84].
6.3 Let $\Gamma$ be an undirected graph and $\mathcal{TDV}(\Gamma) = (TDV_1, \ldots, TDV_r)$ its total degree partition. From $\mathcal{TDV}(\Gamma)$ we derive a new set of graph invariants, namely

1. the number of cells $r$,
2. the numbers $\alpha_i = |TDV_i|$, $1 \le i \le r$,
3. the numbers $\beta_{ij} = |N(v) \cap TDV_j|$, $1 \le j \le r$, $v \in TDV_i$, $1 \le i \le r$.

Note that $|N(v) \cap TDV_j|$ is independent of the choice of $v \in TDV_i$; hence $\beta_{ij}$ is well defined. These invariants are called the structure constants of $\mathcal{TDV}(\Gamma)$. We arrange them as a matrix

\[
\mathrm{sctdp}(\Gamma) =
\begin{pmatrix}
r & 0 & \cdots & 0 \\
\alpha_1 & \beta_{11} & \cdots & \beta_{1r} \\
\alpha_2 & \beta_{21} & \cdots & \beta_{2r} \\
\vdots & \vdots & & \vdots \\
\alpha_r & \beta_{r1} & \cdots & \beta_{rr}
\end{pmatrix}.
\]

Clearly, $\Gamma \simeq \Gamma'$ implies $\mathrm{sctdp}(\Gamma) = \mathrm{sctdp}(\Gamma')$. Note that the definition of sctdp is based on $\mathcal{TDV}(\Gamma)$, which is a canonically ordered partition. Therefore sctdp is a graph invariant.
Example: Take the tree $\Gamma$ in Figure 5.6 and use the numbering shown in Figure 5.7. Its degree partition is $(\{1, 7, 8, 9\}, \{2, 4, 5\}, \{3, 6\})$. Stabgraph applied to this partition yields

\[ \mathcal{TDV}(\Gamma) = (\{8, 9\}, \{1, 7\}, \{4\}, \{2, 5\}, \{3\}, \{6\}). \]

This is also the automorphism partition $V_{\mathrm{aut}}(\Gamma)$.

[Figure 5.7: the tree of Figure 5.6 with its vertices numbered 1 to 9.]

We find

\[
\mathrm{sctdp}(\Gamma) =
\begin{pmatrix}
6 & 0 & 0 & 0 & 0 & 0 & 0 \\
2 & 0 & 0 & 0 & 0 & 0 & 1 \\
2 & 0 & 0 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 1 & 1 \\
2 & 0 & 1 & 0 & 0 & 1 & 0 \\
1 & 0 & 0 & 1 & 2 & 0 & 0 \\
1 & 2 & 0 & 1 & 0 & 0 & 0
\end{pmatrix}.
\]

For regular graphs $\Gamma$ of degree $d$ the total degree partition has a single cell, so $\mathrm{sctdp}(\Gamma)$ consists only of $r = 1$, $\alpha_1 = n$ and $\beta_{11} = d$. Since for many pairs of numbers $n$ and $d$ there are many non-isomorphic regular graphs of degree $d$ and vertex number $n$, $\mathrm{sctdp}(\Gamma)$ is not a complete graph invariant.
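The structure constants of an equitable partition are easy to tabulate mechanically. The sketch below does this for the example tree. The edge list is an assumption reconstructed from the partitions and the matrix given in the example, and the function name `sctdp` is simply borrowed from the text for illustration.

```python
# Assumed edge list of the example tree (vertices 1..9).
EDGES = [(1, 2), (2, 3), (3, 4), (3, 5), (5, 7), (4, 6), (6, 8), (6, 9)]
# Total degree partition in the order stated in the example.
TDV = [{8, 9}, {1, 7}, {4}, {2, 5}, {3}, {6}]


def sctdp(edges, cells):
    """Matrix of structure constants: first row (r, 0, ..., 0), then one row
    (size of cell i, neighbour counts of cell i in cells 1..r) per cell."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    r = len(cells)
    rows = [[r] + [0] * r]
    for cell in cells:
        # Neighbour counts must not depend on the chosen vertex of the cell.
        counts = {tuple(len(adj[v] & other) for other in cells) for v in cell}
        if len(counts) != 1:
            raise ValueError("partition is not equitable")
        rows.append([len(cell)] + list(counts.pop()))
    return rows


matrix = sctdp(EDGES, TDV)
for row in matrix:
    print(row)
```

The `ValueError` branch doubles as a test of equitability: it is raised as soon as two vertices of one cell see different numbers of neighbours in some cell.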
6.4 Let us study the invariant $\mathrm{sctdp}(\Gamma)$ in some more detail. Suppose $\Gamma$ and $\Gamma'$ are two undirected graphs with adjacency matrices $A$ and $A'$, respectively. We have $\Gamma \simeq \Gamma'$ if and only if there is a permutation matrix $P$ satisfying the matrix equation

\[ XA = A'X \tag{$*$} \]

where $X$ is the unknown, here an $n \times n$ matrix. Condition $(*)$ is just a collection of $n^2$ linear equations in the variables $X_{ij}$, $1 \le i, j \le n$, the entries of $X$, namely

\[ (XA)_{ij} = (A'X)_{ij}, \quad 1 \le i, j \le n, \]

or

\[ \sum_{k=1}^{n} X_{ik} A_{kj} = \sum_{k=1}^{n} A'_{ik} X_{kj}, \quad 1 \le i, j \le n. \]
Hence, we see that the matrix equation $XA = A'X$ is equivalent to a system of linear equations in the variables $X_{ij}$. This system may have many solutions none of which defines a permutation matrix. Therefore, in order to describe isomorphism of $\Gamma$ and $\Gamma'$ via solvability of $(*)$, we have to add constraints which exclude solutions that are not permutation matrices. Some of these constraints can be formulated as linear relations, too. For example, we may exclude a large set of solutions of $(*)$ by adding the conditions

\[
\begin{cases}
\sum_{j=1}^{n} X_{ij} = 1, & 1 \le i \le n, \\
\sum_{i=1}^{n} X_{ij} = 1, & 1 \le j \le n, \\
X_{ij} \ge 0, & 1 \le i, j \le n.
\end{cases}
\tag{$**$}
\]

The first condition says that in every row of $X$ the sum of all entries equals 1; the second says that in every column the sum of all entries equals 1, too. Finally, the third condition says that $X$ has non-negative entries only. These are the conditions for a so-called doubly stochastic matrix. Since every permutation matrix is doubly stochastic, $(**)$ excludes only matrices in which we are not interested. However, in general, the solution set of $(*)$ and $(**)$ is still much too large. This can be seen by the following consideration.
Assume that $P$ and $Q$ are different solutions of $(*)$ and $(**)$. Take any real number $\lambda$ satisfying $0 < \lambda < 1$ and define $R(\lambda) = \lambda P + (1 - \lambda) Q$. Then $R(\lambda)$ satisfies $(*)$ and $(**)$. No $R(\lambda)$ is a permutation matrix.

Let $DS(A, A')$ be the set of matrices solving $(*)$ and $(**)$, i.e. the set of doubly stochastic matrices $X$ with the property $XA = A'X$. Given any $X^{(1)}, \ldots, X^{(t)} \in DS(A, A')$ and any non-negative numbers $\lambda_1, \ldots, \lambda_t$ with

\[ \sum_{l=1}^{t} \lambda_l = 1, \tag{1} \]

the convex sum

\[ X(\lambda_1, \ldots, \lambda_t) = \sum_{l=1}^{t} \lambda_l X^{(l)} \]

belongs to $DS(A, A')$, too. For this reason, $DS(A, A')$ is called a convex set. In fact, $DS(A, A')$ is a very special convex set, namely a so-called polytope (a bounded polyhedron). An element of a convex set is called an extremal point if it is not a proper convex sum of two different elements in it. The extremal points of a polyhedron are called its vertices.
It is easy to see that a permutation matrix cannot be a proper convex sum of doubly stochastic matrices. Conversely, there is a rather well-known theorem due to Birkhoff, see [Bir46], which states that every doubly stochastic matrix is a convex sum of permutation matrices, and moreover, each permutation matrix is a vertex of the polytope of all doubly stochastic matrices. From this it follows that all permutation matrices which solve $(*)$ are vertices of $DS(A, A')$. However, there can also be vertices of this polyhedron which are not permutation matrices. In fact, this is the case at least when $DS(A, A') \ne \emptyset$ but $\Gamma \not\simeq \Gamma'$. In this latter case there are solutions of $(*)$ and $(**)$, but none of them is a permutation matrix. The situation is characterized best using the total degree partitions of $\Gamma$ and $\Gamma'$.

Footnote 1 (on the term vertex): $DS(A, A')$ can be considered as a set of points in $\mathbb{R}^{n \times n}$. The extremal points, together with the line segments on the surface of $DS(A, A')$ which connect extremal points, form a graph, the skeleton of $DS(A, A')$. This justifies the use of the term vertex.
6.5 Let us say that $\Gamma$ is ds-isomorphic to $\Gamma'$ if $DS(A, A') \ne \emptyset$. We shall denote this relation between graphs by $\Gamma \simeq_{ds} \Gamma'$. Note that the shape of $DS(A, A')$ is independent of the numbering of $\Gamma$ and $\Gamma'$. Renumbering the graphs changes $A$ and $A'$ according to $A \to P^t A P$ and $A' \to Q^t A' Q$, where $P$ and $Q$ are appropriate permutation matrices. Suppose $Y \in DS(P^t A P, Q^t A' Q)$. Then

\[ Y P^t A P = Q^t A' Q Y \iff (Q Y P^t) A = A' (Q Y P^t). \]

Now, if $Y$ is doubly stochastic and $P$ and $Q$ are permutation matrices, then $Q Y P^t$ is doubly stochastic as well. Conversely, if $X \in DS(A, A')$, then $Q^t X P \in DS(P^t A P, Q^t A' Q)$. Hence, the mapping $X \to Q^t X P$ defines a one-to-one mapping between $DS(A, A')$ and $DS(P^t A P, Q^t A' Q)$. Since this mapping is performed just by renumbering the variables, both polytopes have exactly the same shape.

Ds-isomorphism is a weak notion of isomorphism. It differs from the usual notion in that we do not require integrality for the solution of $(*)$ and $(**)$ which describes the ds-isomorphism. Quite a few graph invariants are also invariants with respect to ds-isomorphism. This explains why they are not strong enough to describe isomorphism completely. To give an example we consider the invariant $\mathrm{sctdp}(\Gamma)$.
Footnote 2: If $X$ and $Y$ are both doubly stochastic, then $XY$ is again doubly stochastic. Hence, $DS(A, A)$ is closed with respect to matrix multiplication and, therefore, forms an infinite semigroup: a semigroup of "doubly stochastic automorphisms" of $\Gamma$.

Proposition. Let $\Gamma$ and $\Gamma'$ be two undirected graphs with vertex set $V = \{1, \ldots, n\}$ and adjacency matrices $A$ and $A'$, respectively. Then $\Gamma \simeq_{ds} \Gamma'$ if and only if $\mathrm{sctdp}(\Gamma) = \mathrm{sctdp}(\Gamma')$.

Proof. The "if"-part of the proof of this proposition is simple. Assume that $\mathrm{sctdp}(\Gamma) = \mathrm{sctdp}(\Gamma')$ and let

\[ \mathcal{TDV}(\Gamma) = (TDV_1, \ldots, TDV_r), \quad \mathcal{TDV}(\Gamma') = (TDV'_1, \ldots, TDV'_r). \]

Writing $\alpha_l = |TDV_l| = |TDV'_l|$ for the common cell sizes, define

\[
Y_{ij} =
\begin{cases}
\dfrac{1}{\alpha_l} & \text{if } j \in TDV_l \text{ and } i \in TDV'_l \text{ for some } l, \\
0 & \text{otherwise.}
\end{cases}
\]
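Evidently, $Y = (Y_{ij})$ is a doubly stochastic matrix with entries $Y_{ij} \ne 0$ if and only if $(i, j) \in TDV'_l \times TDV_l$ for some index $l$. Writing $\alpha_s$ for the size of the $s$-th cell and $\beta_{st}$ for the number of neighbours which a vertex of the $s$-th cell has in the $t$-th cell, the entries of the matrix $\mathrm{sctdp}(\Gamma)$ satisfy

\[ j \in TDV_t \implies \sum_{k \in TDV_s} A_{kj} = \beta_{ts}, \qquad i \in TDV'_s \implies \sum_{k \in TDV'_t} A'_{ik} = \beta_{st}. \]

Hence, for $i \in TDV'_s$, $j \in TDV_t$ we have

\[ (YA)_{ij} = \sum_{k=1}^{n} Y_{ik} A_{kj} = \sum_{k \in TDV_s} \frac{1}{\alpha_s} A_{kj} = \frac{1}{\alpha_s} \beta_{ts} \]

and

\[ (A'Y)_{ij} = \sum_{k=1}^{n} A'_{ik} Y_{kj} = \sum_{k \in TDV'_t} A'_{ik} \frac{1}{\alpha_t} = \frac{1}{\alpha_t} \beta_{st}. \]

Now, $\alpha_s \beta_{st}$ is the number of edges with one end in $TDV_s$ and the other end in $TDV_t$. The same meaning has $\alpha_t \beta_{ts}$. Hence $\alpha_s \beta_{st} = \alpha_t \beta_{ts}$, which finally shows that $(YA)_{ij} = (A'Y)_{ij}$. Since $i$ and $j$ were arbitrarily chosen, this equation holds for every pair $i, j$. It follows that $YA = A'Y$. Thus,

\[ \mathrm{sctdp}(\Gamma) = \mathrm{sctdp}(\Gamma') \implies DS(A, A') \ne \emptyset \implies \Gamma \simeq_{ds} \Gamma'. \]

The "only if"-part of the proof, however, involves quite a few technical details which are perhaps not interesting for the major part of our readers. Therefore we refer the interested reader to the original paper [Tin86], where the statement of the proposition is presented as Theorem 1. A somewhat more general statement is proved in [Tin91]. An alternative proof can be found in [SchU97]. $\Box$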
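The construction of $Y$ in the "if"-part can be tried out on a standard pair of non-isomorphic graphs with equal structure constants: the hexagon and the disjoint union of two triangles, both 2-regular on 6 vertices. The sketch below is illustrative only; vertices are numbered from 0, the helper names are assumptions, and exact rational arithmetic from the standard library is used.

```python
from fractions import Fraction


def adjacency(n, edges):
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1
    return A


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def ds_matrix(n, cells, cells_prime):
    """Y[i][j] = 1/|cell l| if j lies in cell l of the first graph and
    i lies in cell l of the second graph, and 0 otherwise."""
    Y = [[Fraction(0)] * n for _ in range(n)]
    for cell, cell_prime in zip(cells, cells_prime):
        for i in cell_prime:
            for j in cell:
                Y[i][j] = Fraction(1, len(cell))
    return Y


n = 6
A = adjacency(n, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)])        # hexagon
A_prime = adjacency(n, [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)])  # two triangles
cells = cells_prime = [set(range(n))]   # regular graphs: a single cell each

Y = ds_matrix(n, cells, cells_prime)
commutes = matmul(Y, A) == matmul(A_prime, Y)
row_sums = [sum(row) for row in Y]
col_sums = [sum(Y[i][j] for i in range(n)) for j in range(n)]
print(commutes, row_sums, col_sums)
```

Here `Y` is the matrix with all entries 1/6. It is doubly stochastic and satisfies $YA = A'Y$, although the two graphs are not isomorphic.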
7 The Weisfeiler-Leman Closure
7.1 Let $\Gamma = (V, R)$ be an undirected graph. Let us first consider a very simple edge invariant suggested by Figure 7.1. Let $e = \langle u, v \rangle$ be an edge and $W_e \subseteq V$ the set of all vertices in $V \setminus \{u, v\}$ which are adjacent to both ends $u$ and $v$ of $e$. Then, clearly, if $\varphi$ is any isomorphism between $\Gamma$ and a second graph $\Gamma'$, then $\langle u^{\varphi}, v^{\varphi} \rangle$ must be an edge of $\Gamma'$ and the image $w^{\varphi}$ of each vertex $w \in W_e$ must be adjacent to $u^{\varphi}$ and to $v^{\varphi}$.

[Figure 7.1: an edge $\langle u, v \rangle$ together with the vertex sets $W_e$, $W'_e$ and $W''_e$.]

In addition, let $W'_e$ be the set of vertices in $V \setminus \{u, v\}$ which are adjacent to exactly one vertex in $\{u, v\}$, and $W''_e$ the set of those which are adjacent neither to $u$ nor to $v$. Again, $w \in W'_e$ implies that $w^{\varphi}$ is adjacent to exactly one vertex in $\{u^{\varphi}, v^{\varphi}\}$, and $w \in W''_e$ implies that $w^{\varphi}$ is adjacent neither to $u^{\varphi}$ nor to $v^{\varphi}$. Therefore, $(|W_e|, |W'_e|, |W''_e|)$ is a 3-dimensional edge invariant.
7.2 Next let us turn to directed graphs. Assume that $\Gamma = (V, R)$ is directed (i.e. $R \ne R^t$) and $\varphi$ is an isomorphism of $\Gamma$ and $\Gamma' = (V, R')$. Let $(u, v)$ be an arc of $\Gamma$, i.e. $(u, v) \in R$.

If $w$ is a successor of $u$ and of $v$ in $\Gamma$, then $w^{\varphi}$ must be a successor of $u^{\varphi}$ and of $v^{\varphi}$ in $\Gamma'$. Analogously, if $w$ is a non-successor of $u$ and a successor of $v$, then $w^{\varphi}$ must be a non-successor of $u^{\varphi}$ and a successor of $v^{\varphi}$. Therefore, let us partition the set $V \setminus \{u, v\}$ into the following sets:

\[
\begin{aligned}
W_{11}(u, v) &= \{w : (u, w) \in R \wedge (v, w) \in R\}, \\
W_{10}(u, v) &= \{w : (u, w) \in R \wedge (v, w) \notin R\}, \\
W_{01}(u, v) &= \{w : (u, w) \notin R \wedge (v, w) \in R\}, \\
W_{00}(u, v) &= \{w : (u, w) \notin R \wedge (v, w) \notin R\}.
\end{aligned}
\]

In the same way we can partition the set $V \setminus \{u^{\varphi}, v^{\varphi}\}$ into analogously defined sets

\[ W'_{11}(u^{\varphi}, v^{\varphi}), \; W'_{10}(u^{\varphi}, v^{\varphi}), \; W'_{01}(u^{\varphi}, v^{\varphi}), \; W'_{00}(u^{\varphi}, v^{\varphi}). \]

Since $W_{xy}(u, v)^{\varphi} = W'_{xy}(u^{\varphi}, v^{\varphi})$ for all $x, y \in \{0, 1\}$, the vector

\[ (|W_{11}(u, v)|, |W_{10}(u, v)|, |W_{01}(u, v)|, |W_{00}(u, v)|) \]

is a 4-dimensional invariant. Since $(u, v) \in R$ is an arc rather than an edge as in the previous sections, we call it an arc invariant.
Taking $R^t$ into consideration as well, we can split up the invariant $|W_{11}(u, v)|$ into four invariant summands, using the partition (here $\overline{R^t}$ denotes the complement of $R^t$ in $V \times V$)

\[
\begin{aligned}
W_{11}(u, v) = {} & \{w : (u, w) \in R \cap R^t \wedge (v, w) \in R \cap R^t\} \\
\cup {} & \{w : (u, w) \in R \cap R^t \wedge (v, w) \in R \cap \overline{R^t}\} \\
\cup {} & \{w : (u, w) \in R \cap \overline{R^t} \wedge (v, w) \in R \cap R^t\} \\
\cup {} & \{w : (u, w) \in R \cap \overline{R^t} \wedge (v, w) \in R \cap \overline{R^t}\}.
\end{aligned}
\]

Each of these four sets in the union for $W_{11}(u, v)$ is mapped under $\varphi$ onto its analogously defined counterpart in $\Gamma'$, with $R$ at each appearance replaced by $R'$. Therefore, the cardinalities of these four sets are arc invariants.

[Figure 7.2: four of the possible ways in which a vertex $w$ can be related to the pair $u, v$.]

Partitioning each of the sets $W_{xy}(u, v)$, $x, y \in \{0, 1\}$, in this way we get 16 arc invariants or, equivalently, a single 16-dimensional arc invariant. Each invariant corresponds to one of the 16 different ways a vertex $w$ not in $\{u, v\}$ can be related to this set. Four of these possibilities are shown in Figure 7.2. They correspond to the partitioning of $W_{11}(u, v)$ described above.
In the above considerations, it is not important that $(u, v)$ is an arc of $\Gamma$, i.e. that $(u, v) \in R$. All conclusions remain true if $(u, v)$ is a non-arc, too. That is, the definition of all the invariants above can be extended to a definition of invariants of non-arcs. Thus, combining these invariants into a 16-dimensional vector we get a mapping

\[ f : V \times V \to \mathbb{Z}^{16} \]

with the property that for each isomorphism $\varphi$ of $\Gamma$ and $\Gamma'$

\[ f(u^{\varphi}, v^{\varphi}) = f(u, v) \]

for all ordered pairs of vertices $(u, v)$. Note that the definition of the sets $W_{xy}(u, v)$ and their partitions makes sense also when $u = v$. Mappings $f$ like the one just considered we no longer call arc invariants or non-arc invariants. Instead, to cover all cases (arcs, non-arcs and diagonal elements $(u, u)$), we call them 2-invariants.
Remark: The considerations in this subsection give an idea of how arc invariants can be found. In the next subsections we will refine this idea and incorporate it into a systematic procedure which, starting with a colored graph, iteratively uses appropriate arc invariants to refine the current coloring until some "stable" coloring is reached.
7.3 Finally, let us turn to the most general case where $\Gamma = (V; \mathbf{R})$ is a colored graph. Assume that $\Gamma$ is isomorphic to some colored graph $\Gamma' = (V; \mathbf{R}')$ and that $\mathbf{R} = (R_i)_{0 \le i \le d}$, $\mathbf{R}' = (R'_i)_{0 \le i \le d}$. Then for any isomorphism $\varphi$ of these graphs, for all $(u, v) \in V \times V$ and for all $i, j$, $0 \le i, j \le d$, we have

\[ |\{w : (u, w) \in R_i \wedge (w, v) \in R_j\}| = |\{w : (u^{\varphi}, w) \in R'_i \wedge (w, v^{\varphi}) \in R'_j\}|. \]

Using the notation

\[ W_{\mathbf{R}}^{i,j}(u, v) = \{w : (u, w) \in R_i \wedge (w, v) \in R_j\} \]

we get the 2-invariant

\[ \mathrm{tr}_{\mathbf{R}}(u, v) = \left( |W_{\mathbf{R}}^{i,j}(u, v)| \right)_{0 \le i, j \le d} \]

which is of dimension $(d + 1)^2$. The function symbol $\mathrm{tr}_{\mathbf{R}}$ for this invariant stands for "triangle". Each component of this invariant equals the number of triangles built up over the basic arc $(u, v)$ by arcs of color $i$ and $j$, respectively.

The numbers $|W_{\mathbf{R}}^{i,j}(u, v)|$ are also defined for $u = v$. Note that $|W_{\mathbf{R}}^{i,j}(u, u)| = |R_i(u) \cap R_j^t(u)|$. This is just the number of vertices which are both successors of $u$ in $\Gamma_i$ and predecessors of $u$ in $\Gamma_j$. (Note that

\[ \mathrm{tr}_{S}(u, v) = \left( |W_{S}^{i,j}(u, v)| \right)_{0 \le i, j \le d} \]

is defined for every partition $S = \{S_0, \ldots, S_d\}$ of $V \times V$, not only for colored graphs, and independently of the context in which this function $\mathrm{tr}_S$ on $V \times V$ is used.)
7.4 Example: Consider once more the cuneane depicted in Figure 3.1a. Using the vertex numbering shown in this picture, the adjacency matrix of the corresponding colored graph is

\[
A = \begin{pmatrix}
0 & 1 & 2 & 2 & 2 & 2 & 1 & 1 \\
1 & 0 & 1 & 1 & 2 & 2 & 2 & 2 \\
2 & 1 & 0 & 1 & 2 & 2 & 2 & 1 \\
2 & 1 & 1 & 0 & 1 & 2 & 2 & 2 \\
2 & 2 & 2 & 1 & 0 & 1 & 1 & 2 \\
2 & 2 & 2 & 2 & 1 & 0 & 1 & 1 \\
1 & 2 & 2 & 2 & 1 & 1 & 0 & 2 \\
1 & 2 & 1 & 2 & 2 & 1 & 2 & 0
\end{pmatrix}.
\]

Here,

\[ V = \{1, \ldots, 8\}, \quad d = 2, \quad R_0 = \mathrm{Diag}(V \times V), \quad R_1 = \{(u, v) : A_{u,v} = 1\}, \quad R_2 = \{(u, v) : A_{u,v} = 2\}. \]

For example, to find $W_{\mathbf{R}}^{1,2}(1, 8)$, we have to determine the vertices $k$ such that $A_{1,k} = 1$ and $A_{k,8} = 2$. Thus $W_{\mathbf{R}}^{1,2}(1, 8) = \{2, 7\}$. In the same way we find

\[ W_{\mathbf{R}}^{1,2}(2, 4) = \{1\}, \quad W_{\mathbf{R}}^{1,2}(4, 2) = \{5\}, \]

which proves immediately that the cuneane has no automorphism $\varphi$ with $1^{\varphi} = 2$ and $8^{\varphi} = 4$, or with $1^{\varphi} = 4$ and $8^{\varphi} = 2$. Therefore, the arcs $(1, 8)$ and $(2, 4)$ are not equivalent.
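The sets in this example can be recomputed directly from the matrix. The following sketch uses 1-based vertex numbers as in the text; the function names `W` and `tr` are illustrative.

```python
# Colored adjacency matrix of the cuneane from the example
# (0 = diagonal, 1 = bond, 2 = no bond); row/column k - 1 belongs to vertex k.
A = [
    [0, 1, 2, 2, 2, 2, 1, 1],
    [1, 0, 1, 1, 2, 2, 2, 2],
    [2, 1, 0, 1, 2, 2, 2, 1],
    [2, 1, 1, 0, 1, 2, 2, 2],
    [2, 2, 2, 1, 0, 1, 1, 2],
    [2, 2, 2, 2, 1, 0, 1, 1],
    [1, 2, 2, 2, 1, 1, 0, 2],
    [1, 2, 1, 2, 2, 1, 2, 0],
]


def W(A, i, j, u, v):
    """Vertices k with color(u, k) = i and color(k, v) = j."""
    n = len(A)
    return {k for k in range(1, n + 1)
            if A[u - 1][k - 1] == i and A[k - 1][v - 1] == j}


def tr(A, u, v, d=2):
    """The 2-invariant: sizes of all W^{i,j}(u, v), 0 <= i, j <= d."""
    return tuple(len(W(A, i, j, u, v))
                 for i in range(d + 1) for j in range(d + 1))


print(W(A, 1, 2, 1, 8), W(A, 1, 2, 2, 4), W(A, 1, 2, 4, 2))
print(tr(A, 1, 8), tr(A, 2, 4))
```

Since the two invariant vectors differ, no automorphism can map the arc $(1, 8)$ to the arc $(2, 4)$.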
7.5 Remember that for a graph $\Gamma$ with automorphism group $\mathrm{Aut}(\Gamma) = G$ the 2-orbit of $(u, v)$ is the set

\[ 2\text{-}\mathrm{orb}_G(u, v) = \{(u^g, v^g) : g \in G\}. \]

From the 2-orbits we can also retrieve the orbits of $\Gamma$, namely

\[ \mathrm{Orb}_G(u) = \{u^g : g \in G\} = \{w : (w, w) \in 2\text{-}\mathrm{orb}_G(u, u)\}. \]

This observation allows us to identify $\mathrm{Orb}_G(u)$ and $2\text{-}\mathrm{orb}_G(u, u)$.

Let $f$ be any function defined on $V \times V$ and let $K = \{k_1, \ldots, k_s\}$ be the set of different values of $f$. The sets

\[ S_i = \{(u, v) : f(u, v) = k_i\}, \quad 1 \le i \le s, \]

define a partition $S = \{S_1, \ldots, S_s\}$ of $V \times V$. If $f$ is a 2-invariant, then $S$ is an invariant partition, which means that each $S_i$ is an invariant set of pairs. Namely, in such a case we have $S_i^{\varphi} = S_i$, $1 \le i \le s$, for each automorphism $\varphi$ of $\Gamma$. The finest invariant partition is the one given by the set of 2-orbits. We denote it by $2\text{-}\mathrm{orb}(V, \Gamma)$ (see [KliRRT95], Section 5).
7.6 Just as in the case of partitions of the vertex set $V$ (see Section 5) we can define an equitable partition of $V \times V$ to be a partition $S$ on the cells $S_i$ of which $\mathrm{tr}_S$ has constant values, i.e. for which

\[ (u, v), (x, y) \in S_i \implies \mathrm{tr}_S(u, v) = \mathrm{tr}_S(x, y) \]

holds for $1 \le i \le s$. Given an initial partition $S$ there is a unique coarsest equitable partition $\overline{S}$ which refines $S$. Again, just as in the case of partitions of the vertex set $V$, we can design an algorithm which, applied to $S$, computes $\overline{S}$. In the following formulation of this algorithm, which works completely analogously to Stabgraph, we assume that the input is given as a colored graph. Thus, we assume that the initial partition of $V \times V$ is the system of relations $\mathbf{R}$ of a colored graph $\Gamma = (V; \mathbf{R})$.
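procedure Stabgraph 2;
Input: A colored graph $\Gamma = (V; R_0, \ldots, R_d)$;
begin
  $\delta = d$;
  L: for $0 \le l \le \delta$ do
       for $(u, v) \in R_l$ do
       begin
         $L(u, v, 0) = l$;
         for $0 \le j, k \le \delta$ do compute $|W_{\mathbf{R}}^{j,k}(u, v)|$;
         $L(u, v) = (L(u, v, 0), |W_{\mathbf{R}}^{0,0}(u, v)|, \ldots, |W_{\mathbf{R}}^{0,\delta}(u, v)|, |W_{\mathbf{R}}^{1,0}(u, v)|, \ldots, |W_{\mathbf{R}}^{1,\delta}(u, v)|, \ldots, |W_{\mathbf{R}}^{\delta,0}(u, v)|, \ldots, |W_{\mathbf{R}}^{\delta,\delta}(u, v)|)$;
       end;
     sort the lists $L(u, v)$, $(u, v) \in V \times V$, lexicographically and for $(u, v) \in V \times V$ let $r(u, v)$ be the rank of $L(u, v)$ in this ordering (equal lists get equal ranks, starting with rank 1);
     $\delta' = -1 + \max_{(u, v) \in V \times V} r(u, v)$;
     for $0 \le l \le \delta'$ do $R_l = \{(u, v) : r(u, v) = l + 1\}$;
     if $\delta' > \delta$ then begin $\delta = \delta'$; goto L; end;
end.

7.7 Algorithm Stabgraph 2 computes a series of partitions $\mathbf{R} = \mathbf{R}^{(0)}, \mathbf{R}^{(1)}, \ldots, \mathbf{R}^{(t)}$, ending when for the first time $\mathbf{R}^{(t+1)} = \mathbf{R}^{(t)}$. Denote this final partition by $\overline{\mathbf{R}}$. Each partition $\mathbf{R}^{(i)}$ in this series fulfills the conditions (i) - (iii) of 2.2 and, hence, defines a colored graph $\Gamma^{(i)} = (V; \mathbf{R}^{(i)})$. The process is controlled by the 2-invariant $\mathrm{tr}_{\mathbf{R}}$, i.e. each new partition $\mathbf{R}^{(i+1)}$ is determined by refining the cells of the current partition $\mathbf{R}^{(i)}$ according to the different values of $\mathrm{tr}_{\mathbf{R}^{(i)}}$.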
The partition $\overline{\mathbf{R}}$ is an equitable partition in the sense introduced in this section. On the other hand, it is a stable partition with respect to the refining method used by Algorithm Stabgraph 2. This refining method is called the Weisfeiler-Leman stabilization. The resulting colored graph $\overline{\Gamma} = (V; \overline{\mathbf{R}})$ is called the Weisfeiler-Leman closure of (the input graph) $\Gamma$. Algorithm Stabgraph 2 is equivalent to the algorithmic procedure introduced and studied by B. Weisfeiler and A. Leman in [Wei76]. Detailed descriptions of efficient implementations of some variants of Stabgraph 2 can be found in [BabCKP97] and [Bas98]. A demonstration of how these implementations work can be found at
http://www.statistik.tu-muenchen.de/~bastert/interactiv.html.
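For small graphs the stabilization can be sketched in a few lines of Python. This is not one of the efficient implementations cited above, only a naive illustration: vertices are numbered from 0, the names `initial_colouring` and `stabgraph2` are assumptions, and the triangle counts are stored sparsely, so the numbering of the final colors need not agree with the canonical order produced by the lexicographic sort of the lists $L(u, v)$.

```python
from itertools import product


def initial_colouring(n, edges):
    """Color 0 for the diagonal, 1 for edges, 2 for non-edges (undirected)."""
    edge_set = {(u, v) for u, v in edges} | {(v, u) for u, v in edges}
    return {(u, v): 0 if u == v else (1 if (u, v) in edge_set else 2)
            for u, v in product(range(n), repeat=2)}


def stabgraph2(n, colour):
    """Refine a coloring of V x V by triangle counts until it is stable."""
    pairs = list(product(range(n), repeat=2))
    while True:
        signature = {}
        for u, v in pairs:
            counts = {}
            for w in range(n):
                key = (colour[u, w], colour[w, v])
                counts[key] = counts.get(key, 0) + 1
            # Old color first, so the new partition refines the old one.
            signature[u, v] = (colour[u, v], tuple(sorted(counts.items())))
        ranks = {s: r for r, s in enumerate(sorted(set(signature.values())))}
        refined = {p: ranks[signature[p]] for p in pairs}
        if len(ranks) == len(set(colour.values())):
            return refined              # no color class was split
        colour = refined


path = stabgraph2(3, initial_colouring(3, [(0, 1), (1, 2)]))
cycle = stabgraph2(4, initial_colouring(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))
print(len(set(path.values())), len(set(cycle.values())))  # prints: 5 3
```

For the path on three vertices the closure has five color classes, which are exactly its 2-orbits; for the 4-cycle the initial three classes are already stable.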
7.8 Today it is well known that the result of the WL-stabilization, i.e. the partition $\overline{\mathbf{R}}$ produced by Stabgraph 2 when applied to an input $\mathbf{R}$, defines a linear basis of a so-called cellular algebra (also called coherent algebra). Let $\overline{\mathbf{R}} = (\overline{R}_0, \ldots, \overline{R}_{\delta})$ and consider the adjacency matrices $A_s = \mathrm{Adj}(\overline{R}_s)$, $0 \le s \le \delta$, of the relations $\overline{R}_s$. The fact that $\overline{\mathbf{R}}$ is stable is reflected by the fact that each matrix product $A_i A_j$ can be written as

\[ A_i A_j = \sum_{s=0}^{\delta} p_{i,j}^{s} A_s \]

with non-negative integers $p_{i,j}^{s}$, $0 \le s \le \delta$. The last equality expresses the fact that $|W_{\overline{\mathbf{R}}}^{i,j}(u, v)|$ is independent of the choice of $(u, v) \in \overline{R}_s$, namely $|W_{\overline{\mathbf{R}}}^{i,j}(u, v)| = p_{i,j}^{s}$ for all $(u, v) \in \overline{R}_s$. The same equality also shows that the set of matrices of the form

\[ A = \sum_{s=0}^{\delta} a_s A_s, \quad a_s \in \mathbb{C}, \; 0 \le s \le \delta, \]

is closed with respect to matrix multiplication. It is easy to see (and follows in fact from the conditions (i) - (iii) in 2.2) that it is also closed with respect to Schur-Hadamard multiplication (componentwise multiplication) and to transposition of matrices, and that it contains the matrices $I$ and $J$ (the unit matrix and the matrix of all 1's). Therefore, this set forms a cellular algebra; for more details consult [KliRRT95].
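The product rule can be verified numerically on a small stable coloring. The sketch below takes the 4-cycle with its three distance classes (diagonal, edges, antipodal pairs), which is assumed here as a known stable coloring, reads off the structure constants from triangle counts and checks the matrix identity. All names are illustrative.

```python
from itertools import product

n = 4  # the 4-cycle 0-1-2-3-0


def cyclic_distance(u, v):
    d = abs(u - v)
    return min(d, n - d)


# Color of a pair = distance of its vertices: 0 diagonal, 1 edge, 2 antipodal.
colour = {(u, v): cyclic_distance(u, v) for u, v in product(range(n), repeat=2)}
classes = max(colour.values()) + 1


def structure_constants(colour):
    """p[s][i][j] = number of w with color(u, w) = i and color(w, v) = j,
    for any pair (u, v) of color s; fails if this depends on the pair."""
    p = {}
    for (u, v), s in colour.items():
        counts = [[0] * classes for _ in range(classes)]
        for w in range(n):
            counts[colour[u, w]][colour[w, v]] += 1
        if p.setdefault(s, counts) != counts:
            raise ValueError("colouring is not stable")
    return p


def adjacency(s):
    return [[1 if colour[u, v] == s else 0 for v in range(n)] for u in range(n)]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


p = structure_constants(colour)
A = [adjacency(s) for s in range(classes)]
identity_holds = all(
    matmul(A[i], A[j]) == [[sum(p[s][i][j] * A[s][u][v] for s in range(classes))
                            for v in range(n)] for u in range(n)]
    for i, j in product(range(classes), repeat=2))
print(p[0][1][1], p[1][1][1], p[2][1][1], identity_holds)
```

For instance, `p[2][1][1]` counts the common neighbours of two antipodal vertices.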
7.9 Remarks:

1. The cellular algebra a linear basis of which is given by the matrices $A_0, \ldots, A_{\delta}$ defined above is the smallest such algebra which contains all adjacency matrices $\mathrm{Adj}(R_i)$, $0 \le i \le d$, of the colored input graph $\Gamma$. We call it the cellular algebra generated by $\Gamma$, or the WL-closure of $\Gamma$, and denote it by $W(\Gamma)$.

2. The numbers $p_{ij}^{s}$ are called the structure constants of the algebra $W(\Gamma)$. They define a tensor of dimension $(\delta + 1)^3$. We shall denote it by $p(\Gamma)$. Since Algorithm Stabgraph 2 produces a canonically ordered partition of $V \times V$, it is obvious that $p(\Gamma)$ is a graph invariant. If a graph $\Gamma = (V, R)$ is considered as a colored graph $(V; R_0, R_1, R_2)$ as described in Section 2.2, $p(\Gamma)$ is a graph invariant also for ordinary graphs.

3. The tensor $p(\Gamma)$ is a rather strong graph invariant. For two graphs $\Gamma$ and $\Gamma'$, for example, the equality $p(\Gamma) = p(\Gamma')$ implies that both graphs have the same spectrum of eigenvalues. However, $p(\Gamma)$ is not a complete graph invariant. There are non-isomorphic graphs $\Gamma$ and $\Gamma'$ with $p(\Gamma) = p(\Gamma')$. Note that, due to Remark 2 above, the equality $p(\Gamma) = p(\Gamma')$ means not only the numerical equality of the two tensors of structure constants but also an equality of the "positions" of the graphs "inside" the algebras which they generate. This means the following: assume $\Gamma = (V, R)$ and $\Gamma' = (V, R')$, and let $\overline{\mathbf{R}}$ and $\overline{\mathbf{R}}'$ be the partitions found by Algorithm Stabgraph 2. Then $R = \bigcup_{k=1}^{t} \overline{R}_{i_k}$ implies $R' = \bigcup_{k=1}^{t} \overline{R}'_{i_k}$, and vice versa.
8 k-invariants and multidimensional Weisfeiler-Leman stabilization

8.1 The notion of 2-invariants can easily be extended to $k$-invariants, $k > 2$. To shorten the notation we shall use the abbreviations $D = \{0, \ldots, \delta\}$, and $\vec{u} = (u_1, \ldots, u_k)$ and $\vec{i}$ for elements of $V^k$ and $\{0, \ldots, \delta\}^k$, respectively. For $g \in G$ we define $\vec{u}^{\,g} = (u_1^g, \ldots, u_k^g)$.

Assume that $\Gamma$ is a colored graph as before, let $G = \mathrm{Aut}(\Gamma)$ and let

\[ \vec{u} = (u_1, \ldots, u_k) \in V^k = \underbrace{V \times \cdots \times V}_{k \text{ factors}} \]

be an arbitrary ordered $k$-tuple of vertices. Consider the set

\[ \mathrm{Orb}_G(\vec{u}) = \{\vec{u}^{\,g} : g \in G\}. \]

It is called the $k$-orbit of $\vec{u}$ with respect to $(G, V)$. Two $k$-orbits $\mathrm{Orb}_G(\vec{u})$ and $\mathrm{Orb}_G(\vec{v})$ are either equal, if $\vec{v} \in \mathrm{Orb}_G(\vec{u})$, or they are disjoint. Let $k\text{-}\mathrm{orb}(G, V)$ be the set of different $k$-orbits of $(G, V)$. Thus $k\text{-}\mathrm{orb}(G, V)$ is a partition of $V^k$. Compare these notions with the notions discussed in [KliRRT95], Chapter 5, for the case $k = 2$. As in the case $k \in \{1, 2\}$, a partition $S = \{S_1, \ldots, S_t\}$ is called invariant (with respect to $G$) if $S_i^g = S_i$ for all $i$, $1 \le i \le t$, and all $g \in G$, or in other words, if $\vec{u} \in S_i$ implies $\vec{u}^{\,g} \in S_i$.
8.2 Suppose that we know the k-orbits of $(G, V)$ for some $k \in \mathbb{N}$. Then we know also the h-orbits for $1 \le h < k$. Indeed,

$$\mathrm{Orb}_G(u_1, \ldots, u_h) = \{(u'_1, \ldots, u'_h) \in V^h : (u'_1, \ldots, u'_h, \underbrace{u'_h, \ldots, u'_h}_{k-h \text{ times}}) \in \mathrm{Orb}_G(u_1, \ldots, u_h, \underbrace{u_h, \ldots, u_h}_{k-h \text{ times}})\}.$$

In particular, for $h = 1$, we get

$$\mathrm{Orb}_G(u) = \{u' : (\underbrace{u', \ldots, u'}_{k \text{ times}}) \in \mathrm{Orb}_G(\underbrace{u, \ldots, u}_{k \text{ times}})\}.$$

Thus, some subset of $k\text{-orb}(G, V)$ is in one-to-one correspondence with the automorphism partition of $\Gamma$.
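The notions of 8.1 and 8.2 can be tried out on very small graphs with the following Python sketch. It is only an illustration under stated assumptions: the automorphism group is found by brute force over all permutations (feasible only for a handful of vertices), the graph is undirected and uncolored, and the helper names `automorphisms`, `k_orbits` and `vertex_orbits_from_k_orbits` are hypothetical.

```python
from itertools import permutations, product

def automorphisms(n, edges):
    """All permutations of range(n) that map the edge set onto itself (brute force)."""
    e = {frozenset(x) for x in edges}
    return [g for g in permutations(range(n))
            if {frozenset((g[a], g[b])) for a, b in edges} == e]

def k_orbits(n, group, k):
    """Partition V^k into the orbits {u^g : g in group}."""
    seen, orbits = set(), []
    for u in product(range(n), repeat=k):
        if u not in seen:
            orbit = {tuple(g[x] for x in u) for g in group}
            seen |= orbit
            orbits.append(orbit)
    return orbits

def vertex_orbits_from_k_orbits(orbits):
    """Recover the 1-orbits from the k-orbits that consist of constant tuples (see 8.2)."""
    return [{u[0] for u in orb} for orb in orbits
            if all(len(set(u)) == 1 for u in orb)]

# Example: the path 0 - 1 - 2, whose only non-trivial automorphism swaps 0 and 2.
G = automorphisms(3, [(0, 1), (1, 2)])
orbs2 = k_orbits(3, G, 2)
```

Here `vertex_orbits_from_k_orbits(orbs2)` returns the automorphism partition of the path, read off from the 2-orbits of the diagonal pairs.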
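8.3 A function $f : V^k \to \mathbb{C}$ which is constant on the k-orbits of $(G, V)$, i.e. a function $f$ with the property

$$f(\vec{u}^{\,g}) = f(\vec{u}), \quad \vec{u} \in V^k,\ g \in G,$$

is called a k-invariant.

To have an example of a k-invariant, let $\mathcal{S} = \{S_0, \ldots, S_d\}$ be an invariant partition of $V^k$. For $\vec{u} \in V^k$ let $c(\vec{u})$ be the "color" of $\vec{u}$, i.e. $c(\vec{u}) = i$ if and only if $\vec{u} \in S_i$. Further, define

$$\vec{u}_{i,v} = (u_1, \ldots, u_{i-1}, v, u_{i+1}, \ldots, u_k), \quad 1 \le i \le k,\ v \in V.$$

Note that $\vec{u}_{i,v}$ is derived from $\vec{u}$ by replacing the i-th vertex $u_i$ by the vertex $v$. Now we consider the k-tuple of colors $c(\vec{u}_{i,v})$, $1 \le i \le k$. Let

$$\vec{c}_{\mathcal{S}}(\vec{u}, v) = (c(\vec{u}_{1,v}), \ldots, c(\vec{u}_{k,v}))^t$$

and sort the list $(\vec{c}_{\mathcal{S}}(\vec{u}, v) : v \in V)$ lexicographically. Assume that $\vec{c}_{\mathcal{S}}(\vec{u}, v_1), \ldots, \vec{c}_{\mathcal{S}}(\vec{u}, v_n)$ is a permutation of this list such that

$$\vec{c}_{\mathcal{S}}(\vec{u}, v_1) \le \vec{c}_{\mathcal{S}}(\vec{u}, v_2) \le \ldots \le \vec{c}_{\mathcal{S}}(\vec{u}, v_n).$$

Arrange these vectors as the columns of a matrix

$$C_{\mathcal{S}}(\vec{u}) = \bigl(\vec{c}_{\mathcal{S}}(\vec{u}, v_1)\ \ \vec{c}_{\mathcal{S}}(\vec{u}, v_2)\ \ \ldots\ \ \vec{c}_{\mathcal{S}}(\vec{u}, v_n)\bigr).$$

Since $c(\vec{u}_{i,v}) = c(\vec{u}^{\,g}_{i,v^g})$ for any $g \in G$ and all $i$, $1 \le i \le k$, we have

$$\{\vec{c}_{\mathcal{S}}(\vec{u}, v) : v \in V\} = \{\vec{c}_{\mathcal{S}}(\vec{u}^{\,g}, v^g) : v \in V\},$$

and each vector $\vec{c}_{\mathcal{S}}(\vec{u}^{\,g}, v^g)$ in this set appears with the same multiplicity as $\vec{c}_{\mathcal{S}}(\vec{u}, v)$ when $v$ varies over $V$. This is the case if and only if $C_{\mathcal{S}}(\vec{u}) = C_{\mathcal{S}}(\vec{u}^{\,g})$. Therefore, $C_{\mathcal{S}}$ is a k-invariant.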
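8.4 From $C_{\mathcal{S}}$ we can derive another k-invariant, which is sometimes easier to handle and which we get in the following way. Let $\vec{c}_{\mathcal{S}}^{\,(1)}, \vec{c}_{\mathcal{S}}^{\,(2)}, \ldots, \vec{c}_{\mathcal{S}}^{\,(K)}$, where $K = (d+1)^k$, be the set of all possible k-dimensional vectors the components of which have values in $\{0, \ldots, d\}$. Assume that these vectors are arranged in lexicographical order such that

$$\vec{c}_{\mathcal{S}}^{\,(1)} < \vec{c}_{\mathcal{S}}^{\,(2)} < \ldots < \vec{c}_{\mathcal{S}}^{\,(K)}.$$

Clearly, each $\vec{c}_{\mathcal{S}}(\vec{u}, v)$ is among these vectors. For $1 \le i \le K$ let $l_i(\vec{u})$ be the number of appearances of $\vec{c}_{\mathcal{S}}^{\,(i)}$ in $C_{\mathcal{S}}(\vec{u})$. Define

$$\mathrm{tetra}^k_{\mathcal{S}}(\vec{u}) = (l_1(\vec{u}), \ldots, l_K(\vec{u})).$$

Evidently, since

$$C_{\mathcal{S}}(\vec{u}) = C_{\mathcal{S}}(\vec{v}) \iff \mathrm{tetra}^k_{\mathcal{S}}(\vec{u}) = \mathrm{tetra}^k_{\mathcal{S}}(\vec{v}),$$

the list $\mathrm{tetra}^k_{\mathcal{S}}$ is a k-invariant which is equivalent to $C_{\mathcal{S}}$. Note that for $k = 2$ the invariants $\mathrm{tetra}^k_{\mathcal{S}}$ and $\mathrm{tr}_{\mathcal{S}}$ are identical.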
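The following Python sketch illustrates the invariant just defined. As an assumption made for brevity, it stores only the color vectors that actually occur together with their multiplicities, instead of the full list of length $K$; this carries the same information. The function name `tetra` and the dictionary representation of the coloring are illustrative choices.

```python
from collections import Counter
from itertools import product

def tetra(n, k, color, u):
    """Multiset form of the tetra invariant of the k-tuple u.

    `color` maps every k-tuple over range(n) to the index of its cell.
    The result is a sorted tuple of (color vector, multiplicity) pairs;
    vectors with multiplicity 0 are omitted.
    """
    vectors = Counter(
        # c(u_{i,v}) for i = 1..k: replace the i-th entry of u by v
        tuple(color[u[:i] + (v,) + u[i + 1:]] for i in range(k))
        for v in range(n)
    )
    return tuple(sorted(vectors.items()))

# Example with k = 2 on the path 0 - 1 - 2:
# color 0 = diagonal pairs, 1 = edges, 2 = non-adjacent pairs.
n, k = 3, 2
edges = {(0, 1), (1, 0), (1, 2), (2, 1)}
color = {(a, b): 0 if a == b else (1 if (a, b) in edges else 2)
         for a, b in product(range(n), repeat=2)}
```

In this example the diagonal pairs `(0, 0)` and `(1, 1)` have the same color but different values of `tetra`, so the initial partition is not equitable in the sense of the next paragraph.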
8.5 We call a partition $\mathcal{S}$ of $V^k$ equitable, if the k-invariant $\mathrm{tetra}^k_{\mathcal{S}}$ is constant on its cells.

Clearly, $k\text{-orb}(G, V)$ is an equitable partition.

As in the cases $k = 1$ and $k = 2$, for every partition $\mathcal{S}$ there is a coarsest equitable partition $\overline{\mathcal{S}}$ refining it, which we can determine by a stabilization method completely analogous to Stabgraph 2, the description of which is given below. This is again a stabilization algorithm which, starting with $\mathcal{S}$, computes a series of partitions $\mathcal{S} = \mathcal{S}^0, \mathcal{S}^1, \ldots, \mathcal{S}^t$. The algorithm terminates after step $t$ when for the first time the current partition does not change in the next iteration step. This happens when for the first time the current partition is equitable, i.e. $\mathcal{S}^t = \mathcal{S}^{t+1}$. Hence, $\overline{\mathcal{S}} = \mathcal{S}^t$.
procedure Stabgraph k;
Input: A partition $\mathcal{S} = (S_0, \ldots, S_d)$ of $V^k$;
begin
   $\rho = d$;
L: for $0 \le l \le \rho$ do
      for $\vec{u} \in S_l$ do
      begin
         $L(\vec{u}, 0) = l$;
         $L(\vec{u}) = (L(\vec{u}, 0), \mathrm{tetra}^k_{\mathcal{S}}(\vec{u}))$;
      end;
   sort the lists $L(\vec{u})$, $\vec{u} \in V^k$, lexicographically and
   for $\vec{u} \in V^k$ let $r(\vec{u})$ be the rank of $L(\vec{u})$ in this ordering, counted from 0;
   $\rho' = \max_{\vec{u} \in V^k} r(\vec{u})$;
   for $0 \le l \le \rho'$ do $S_l = \{\vec{u} : r(\vec{u}) = l\}$;
   if $\rho' > \rho$ then
   begin
      $\rho = \rho'$; goto L;
   end;
end.

8.6 Assume that the initial partition $\mathcal{S}$ to which Stabgraph k is applied satisfies the condition

(ii) There is a partition $(D_1, \ldots, D_k)$ of $\{0, \ldots, d\}$ such that

$$\bigcup_{j \in D_i} S_j = \{(u_1, \ldots, u_i, \underbrace{u_i, \ldots, u_i}_{k-i \text{ times}}) : (u_1, \ldots, u_i) \in V^i\}, \quad 1 \le i < k.$$
This condition is an obvious extension of the condition (ii) in Section 2, paragraph 2.2, to partitions of $V^k$, $k > 2$. If $\mathcal{S}$ satisfies (ii), then so does each of the partitions $\mathcal{S}^l$ produced by Stabgraph k. Hence, $\overline{\mathcal{S}}$ satisfies (ii). From each partition $\mathcal{S}$ of $V^k$ which satisfies (ii) we can derive partitions $\mathcal{S}^{(i)}$ of $V^i$, $1 \le i < k$. Let $l \in D_i$ and define

$$S_l^{(i)} = \{(u_1, \ldots, u_i) : (u_1, \ldots, u_i, \underbrace{u_i, \ldots, u_i}_{k-i \text{ times}}) \in S_l\}, \qquad \tilde{D}_i = \{l \in D_i : S_l^{(i)} \ne \emptyset\}.$$

Then $\mathcal{S}^{(i)} = (S_l^{(i)})_{l \in \tilde{D}_i}$. If $\mathcal{S}$ is an invariant and/or equitable partition of $V^k$, then so is $\mathcal{S}^{(i)}$ for $V^i$. To complete the notation let us put $\mathcal{S}^{(k)} = \mathcal{S}$.
8.7 Let $\Gamma = (V; \mathcal{R})$ be a colored graph and assume $k \ge 2$. For $\vec{u} \in V^k$ let $\Gamma(\vec{u})$ be the subgraph of $\Gamma$ consisting of the vertices $u_1, \ldots, u_k$ and the arcs between them having the same color as in $\Gamma$. The subgraph $\Gamma(\vec{u})$ is said to be induced by $\vec{u}$.

Call two elements $\vec{u}$ and $\vec{v}$ of $V^k$ similar, denoted by $\vec{u} \sim \vec{v}$, if and only if the equivalence

$$(u_i, u_j) \in R_l \iff (v_i, v_j) \in R_l$$

holds for every triple $i, j, l$ with $1 \le i, j \le k$, $0 \le l \le d$. This means that two k-tuples $\vec{u}$ and $\vec{v}$ are similar exactly if they induce isomorphic subgraphs of $\Gamma$, and the correspondence $u_i \mapsto v_i$, $1 \le i \le k$, is an isomorphism. By definition, the relation $\sim$ is an equivalence relation on $V^k$. Consequently, the corresponding equivalence classes form a partition of $V^k$. Denote this partition by $\mathcal{R}^{(k)}$. Algorithm Stabgraph k applied to the initial partition $\mathcal{S} = \mathcal{R}^{(k)}$ produces the equitable partition $\overline{\mathcal{R}}^{(k)}$. This process is called k-dimensional Weisfeiler-Leman stabilization.
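The whole process of 8.5 to 8.7 can be sketched in a few lines of Python. This is an unoptimized illustration under stated assumptions, not the implementation referred to in the text: colorings are stored as dictionaries over all of $V^k$, the tetra invariant is kept as a multiset of color vectors, and the names `similarity_partition`, `stabilize` and `vertex_partition` are hypothetical.

```python
from collections import Counter
from itertools import product

def similarity_partition(n, k, arc_color):
    """Initial coloring of V^k: tuples get equal colors iff they are similar.

    arc_color[a][b] is the color of the pair (a, b); diagonal pairs are
    assumed to carry a color of their own, as in a colored graph.
    """
    types = {u: tuple(arc_color[a][b] for a in u for b in u)
             for u in product(range(n), repeat=k)}
    rank = {t: r for r, t in enumerate(sorted(set(types.values())))}
    return {u: rank[t] for u, t in types.items()}

def stabilize(n, k, color):
    """Refine `color` with the tetra invariant until no cell is split any more."""
    while True:
        label = {}
        for u in color:
            vecs = Counter(tuple(color[u[:i] + (v,) + u[i + 1:]] for i in range(k))
                           for v in range(n))
            label[u] = (color[u], tuple(sorted(vecs.items())))
        rank = {l: r for r, l in enumerate(sorted(set(label.values())))}
        new = {u: rank[label[u]] for u in color}
        if len(rank) == len(set(color.values())):
            return new
        color = new

def vertex_partition(n, k, color):
    """Derived partition of V: group vertices x by the color of (x, ..., x)."""
    cells = {}
    for x in range(n):
        cells.setdefault(color[(x,) * k], []).append(x)
    return [cells[c] for c in sorted(cells)]

# Example: the path 0 - 1 - 2 - 3 as a colored graph (0 diagonal, 1 edge, 2 non-edge).
n = 4
adj = {(0, 1), (1, 2), (2, 3)}
arc = [[0 if a == b else (1 if (a, b) in adj or (b, a) in adj else 2)
        for b in range(n)] for a in range(n)]
stable = stabilize(n, 2, similarity_partition(n, 2, arc))
```

Because every new label contains the old color as its first component, each round refines the previous partition, and the loop stops as soon as the number of cells no longer grows. For the path, `vertex_partition(n, 2, stable)` separates the two end vertices from the two inner vertices.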
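Each $\mathcal{R}^{(k)}$ is an invariant partition. Consequently, each $\overline{\mathcal{R}}^{(k)}$ is invariant, too. According to 8.6, from each $\overline{\mathcal{R}}^{(k)}$ we can derive invariant partitions of $V^i$, $1 \le i < k$, by the procedure described in Section 8.6. To simplify the notation, let these partitions be denoted by $\mathcal{R}^{(k,i)}$. Note that $\mathcal{R}^{(1,1)} = \overline{\mathcal{R}}^{(1)} = \overline{\mathcal{V}}$ (where $\mathcal{V} = (V)$).

Now what we have is a large scheme of partitions which is arranged below:

$$
\begin{array}{llll}
\mathcal{R}^{(1,1)} & & & \\
\mathcal{R}^{(2,1)} & \mathcal{R}^{(2,2)} & & \\
\mathcal{R}^{(3,1)} & \mathcal{R}^{(3,2)} & \mathcal{R}^{(3,3)} & \\
\mathcal{R}^{(4,1)} & \mathcal{R}^{(4,2)} & \mathcal{R}^{(4,3)} & \mathcal{R}^{(4,4)} \\
\ldots & & &
\end{array}
$$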
[Figure 7.3: Fürer's graph $\Phi_4$. The letters a, b and c, d mark boundary vertices that are to be identified.]
Each line $k$ of this array can be computed by Algorithm Stabgraph k, which can be implemented to run in time $O(n^{k+1} \ln(n))$. Therefore, Stabgraph k is a polynomial-time algorithm. However, when $k$ increases by 1, the time needed for the Weisfeiler-Leman stabilization algorithm increases by the factor $n$. On the other hand, some simple mathematical considerations show that

$$\mathcal{R}^{(i,i)} \succeq \mathcal{R}^{(i+1,i)} \succeq \mathcal{R}^{(i+2,i)} \succeq \ldots$$

for every $i \ge 1$, where $\mathcal{A} \succeq \mathcal{B}$ means that $\mathcal{B}$ is a refinement of $\mathcal{A}$. In particular,

$$\mathcal{R}^{(1,1)} \succeq \mathcal{R}^{(2,1)} \succeq \mathcal{R}^{(3,1)} \succeq \ldots$$

That means that the larger $k$, the better $\mathcal{R}^{(k,1)}$ will approximate the desired automorphism partition $\mathcal{V}_{\mathrm{Aut}}(\Gamma)$.
8.8 Now inevitably the question arises: Is there a number $k$ such that for all graphs $\Gamma$ the partition $\mathcal{R}^{(k,1)}$ equals the automorphism partition of $\Gamma$?

Unfortunately, the answer is definitely "No". In [Fue87] Fürer has presented a series of graphs which serve as counterexamples to a positive answer to our question. One of these graphs, $\Phi_4$, is shown in Figure 7.3. It needs some explanation. The graph is composed of several identical units, one unit being shown in the shaded area in the lower right corner. There are three more such units in the lower half of the picture. They build up a horizontal block consisting of two rows which are connected by four pairs of vertical bridges. Another copy of this block is in the upper half of the picture. It is linked to its lower counterpart by another four pairs of parallel bridges. The vertices on the leftmost vertical line are to be identified with the vertices on the rightmost vertical line, in the order they appear downwards from the top. The first two of these identifications are indicated by the letters a and b. In the same manner, the vertices on the uppermost horizontal line are to be identified with the vertices on the lowermost horizontal line, in the order they appear from left to right. The first two identifications are indicated by the letters c and d. In this way a graph appears which can be considered as situated on a torus.

Now the next graph in the series is constructed by first adding another copy of a horizontal block on the top (or on the bottom) and connecting it to the current graph by vertical bridges, and then enlarging the graph to the right by adding two more units to each block. In this way, the next graph will get three blocks, each of them consisting of 6 units. Again corresponding vertices on the leftmost and rightmost vertical lines and on the uppermost and lowermost horizontal lines have to be identified. The graph appearing in this way is denoted by $\Phi_6$. Now by repeated analogous extension, i.e. adding a block of the current length and enlarging all blocks by adding two units, we get a series of graphs $\Phi_4, \Phi_6, \Phi_8, \ldots$.

Now, consider any pair of parallel bridges mentioned above; one such pair is marked by a shaded circle in Figure 7.3. If we twist the two bridges as indicated in the figure by the dotted lines, then a new graph $\Phi'_4$ appears which in all other parts is identical with $\Phi_4$. Let $\Phi'_4, \Phi'_6, \Phi'_8, \ldots$ be a series of graphs constructed from $\Phi_4, \Phi_6, \Phi_8, \ldots$ by twisting exactly one pair of bridges. By some tricky arguments it can be shown that $\Phi_{2m}$ and $\Phi'_{2m}$ are non-isomorphic for all $m \in \{2, 3, 4, \ldots\}$. Further, it can be shown that Stabgraph k will distinguish $\Phi_{2m}$ and $\Phi'_{2m}$ only for $k > 2m$. In particular, this means that, if we define

$$d = 2, \quad R_0 = R'_0 = \mathrm{Diag}(V \times V), \quad R_1 = R(\Phi_{2m}), \quad R'_1 = R(\Phi'_{2m}), \quad R_2 = R(\overline{\Phi}_{2m}), \quad R'_2 = R(\overline{\Phi}'_{2m}),$$

$$\mathcal{R} = \{R_0, R_1, R_2\}, \quad \mathcal{R}' = \{R'_0, R'_1, R'_2\},$$

then

$$\mathrm{tetra}^k_{\overline{\mathcal{R}}^{(k)}} = \mathrm{tetra}^k_{\overline{\mathcal{R}}'^{(k)}} \quad \text{for all } k \le 2m.$$

This shows that $\mathrm{tetra}^k_{\overline{\mathcal{R}}^{(k)}}$ is not a complete graph invariant.
9 Concluding Remarks
9.1 As already mentioned in Section 3 there is more than one way to introduce canonical numberings of graphs. For instance, we may choose between minimal and maximal canonical codes. Further, when dealing with undirected graphs only, we may also work just with the upper triangular part of their adjacency matrices, since these matrices are symmetric and their diagonals contain zeros only. In our exposition we dealt with the code numbers which correspond to the lexicographic minimum of all possible adjacency matrices. These numbers are smaller than those we would get using the lexicographic maximum. When cataloguing graphs this may be of some advantage. However, this is not a common tradition. For example, in [Far78] and [Wei76] the lexicographic maximum was used, and a few terms and even some simple computational tricks developed in these papers depend on precisely this selection of a canonical numbering.
9.2 The notion of a total degree partition is in our opinion folklore. Therefore we did not even try to trace this notion back to the publication where it appears for the first time. The longstanding tradition in algebraic combinatorics of using this term led us to use the term "degree" in place of the term "valency" of a vertex in a graph, which is more natural for a chemist. The characterization of the total degree partition and its structure constants as a complete set of invariants for the classes of ds-isomorphic graphs (see Section 6) is due to G. Tinhofer [Tin91]. This result is still not widely known, while in our eyes acquaintance with it serves as a natural background for the use of various stabilization procedures. This is the reason why we decided to include in this paper a short exposition of the ideas related to this result. Some more information about this topic can be found in [SchU97].
The use of equitable partitions of graphs is becoming wider and more productive from year to year, see e.g. [MuzK98], [Sta96]. We refer to [ChaG97] and [God94] for an introduction to the subject of equitable partitions.
9.3 The notion of the WL-closure plays a relatively modest role in this paper. Its theoretical and computational aspects are discussed in much more detail in the previous parts [KliRRT95], [BabCKP97] of this series. Traditionally it is treated by dealing with its matrix analogue, namely the minimal coherent (cellular) algebra which contains a given matrix, in particular, the adjacency matrix of a graph. This notion has been exploited extensively and successfully in algebraic combinatorics, in particular in [FarIK90], [FarKM94].

9.4 The algorithms Stabgraph and Stabgraph k, $k \ge 2$, may be considered as operators which, applied to a graph $\Gamma$, determine a certain partition $\mathcal{S}(\Gamma)$ of $V^k$. We say that such an operator works canonically if it produces a canonically ordered partition. This means that, if $\Gamma$ and $\Gamma'$ are two graphs and $\mathcal{S}(\Gamma) = (W_1, \ldots, W_t)$ and $\mathcal{S}(\Gamma') = (W'_1, \ldots, W'_{t'})$, then $\Gamma \cong \Gamma'$ implies $t = t'$ and $W_i^{\varphi} = W'_i$ for $1 \le i \le t$ and all isomorphisms $\varphi$ between $\Gamma$ and $\Gamma'$.

We have already mentioned in Section 5.5 that Stabgraph works canonically. Let $f : V^k \to \mathbb{C}$ be any k-invariant with $f(V^k) = \{f_1, \ldots, f_K\}$. Then the partition

$$\mathcal{S} = (S_1, \ldots, S_K), \quad S_i = \{\vec{u} : f(\vec{u}) = f_i\}, \quad 1 \le i \le K,$$

is not only invariant but also canonically ordered. Since Stabgraph k starts with a canonically ordered partition (either by definition, if $\Gamma$ is a colored graph, or by convention, if $\Gamma$ is a graph which has to be transformed into a colored graph, see Subsection 2.2), and since in each step of this algorithm partitions are refined using a new k-invariant, the resulting partitions $\mathcal{R}^{(k,i)}$, $1 \le i \le k$, evidently are canonically ordered.
9.5 Canonically ordered partitions $\mathcal{V}$ of $V$ are very useful for finding canonical numberings of graphs. For instance, suppose that $\mathcal{V}$ is the split partition, i.e. each cell $V_i$ of $\mathcal{V}$ contains exactly one vertex, say $V_i = \{v_i\}$. Then, clearly, $\nu(v_i) = i$, $1 \le i \le n$, represents a canonical numbering. If $\mathcal{V}$ is not the split partition, then we may restrict ourselves to those numberings which preserve the canonical ordering of $\mathcal{V}$. That means that we may restrict ourselves to numberings $\nu$ of $V$ with the property that

$$u \in V_i,\ v \in V_j,\ i < j \quad \text{implies} \quad \nu(u) < \nu(v).$$

Such numberings assign the smallest numbers to the vertices in $V_1$, the second smallest numbers to the vertices in $V_2$, and so on. Finding the minimum of $\mathrm{cd}(P_\nu^t A P_\nu)$, where $P_\nu$ denotes the permutation matrix of $\nu$, under the restriction that $\nu$ is a numbering of this kind may be much easier than finding the overall minimum.

Let $\mathcal{V}$ be a canonically ordered partition, for instance one of the partitions $\mathcal{R}^{(k,1)}$, $k \ge 1$, say $\mathcal{R}^{(s,1)}$. Thus, we may consider $\mathcal{V}$ as obtained from some initial colored graph by application of Algorithm Stabgraph s. In general, not all cells $V_i$ of $\mathcal{V}$ will consist of one vertex only, but some of them will contain more than one element.
[Figure 9.1: Schematic view of a partition $\mathcal{V}$ with cells $V_1 = \{a\}$, $V_2 = \{b\}$, $V_3 = \{c\}$, $V_4 = \{d, e, f, g\}$, $V_5 = \{h, i, j\}$, $V_6 = \{k\}$, $V_7 = \{l, m\}$.]

Consider Figure 9.1 which shows some partition $\mathcal{V}$ schematically. The first three cells and some cells with larger index are assumed to consist of a single vertex, while cell $V_4$ and some other cells contain more than one vertex. A canonical numbering which is based on $\mathcal{V}$ is already fixed on the vertices in the cells $V_1$, $V_2$, $V_3$ and $V_6$, namely we have to realize the assignments $a \to 1$, $b \to 2$, $c \to 3$, $k \to 11$. However, it is still unsettled how the remaining numbers 4 to 10 and 12, 13 have to be assigned. We only know that the vertices $d, e, f, g$ will get the numbers 4 to 7, the vertices $h, i, j$ the numbers 8 to 10, and the vertices $l, m$ the numbers 12, 13.
Let us guess that in a numbering which yields the smallest adjacency matrix the number 4 is assigned to the vertex $d$. Then we change from the situation in Figure 9.1 to the situation in Figure 9.2, i.e. $d$ is now the only vertex in cell $V_4$, the remaining vertices of the former cell $V_4$ build up the new cell $V_5$, and the old cells $V_5$, $V_6$ and $V_7$ are renamed by increasing their index by 1.
[Figure 9.2: The partition of Figure 9.1 after the guess, with cells $V_4 = \{d\}$, $V_5 = \{e, f, g\}$, $V_6 = \{h, i, j\}$, $V_7 = \{k\}$, $V_8 = \{l, m\}$.]

The fact that in the original partition a cell $V_i$ consists of a single vertex $x$ is caused by the fact that one of the relations in $\mathcal{R}^{(s,s)} = (R_0, \ldots, R_d)$, say $R_t$, is $\{(x, x, \ldots, x)\}$. Identifying s-tuples $(x, x, \ldots, x)$ with vertices $x \in V$ as usual, we may formulate this fact by saying that in the coloring of $V^s$ defined by $\mathcal{R}^{(s,s)}$ the s-tuple $(x, x, \ldots, x)$ is the only s-tuple which has color $t$. After our guess which leads to the new partition in Figure 9.2 we need an additional color which is still unused in order to distinguish the vertex $d$ from the remaining ones, in the same way as the vertices $a, b, c$ and $k$ have already been distinguished. We may use this new color in the following way: Let $\sigma$ be such that

$$R_\sigma = \{(d, d, \ldots, d), (e, e, \ldots, e), (f, f, \ldots, f), (g, g, \ldots, g)\}.$$
Define a new partition $\mathcal{R}'$ of $V^s$ by

$$R'_i = R_i, \quad 0 \le i \le \sigma - 1;$$
$$R'_\sigma = \{(d, d, \ldots, d)\}, \quad R'_{\sigma+1} = \{(e, e, \ldots, e), (f, f, \ldots, f), (g, g, \ldots, g)\};$$
$$R'_i = R_{i-1}, \quad \sigma + 1 < i \le d + 1.$$

This new partition $\mathcal{R}'$ is not equitable. After applying Stabgraph s to it we get an equitable refinement $\overline{\mathcal{R}}'$. Moreover, the partition $\mathcal{R}'^{(s,1)}$ of $V$, which is derived from $\overline{\mathcal{R}}'$ as $\mathcal{R}^{(s,1)}$ is derived from $\overline{\mathcal{R}}$ (see Section 8.7), is a refinement of our initial partition $\mathcal{V}$. It is canonically ordered under the condition that $d$ is the vertex which has to get the smallest number among all vertices in its cell. If $\mathcal{R}'^{(s,1)}$ is the split partition, then as at the beginning of this subsection we have found a numbering $\nu$ of the vertex set. We will note down the corresponding code number $\mathrm{cd}(P_\nu^t\, \mathrm{Adj}(\Gamma)\, P_\nu)$ (see Section 3.3) and try another guess for the vertex with the smallest number in $V_4$ instead of $d$. If $d$ was not our first guess, then we compare $\mathrm{cd}(P_\nu^t\, \mathrm{Adj}(\Gamma)\, P_\nu)$ with the smallest number found so far. We store only the currently smallest code number together with the corresponding numbering. In this way we finally will end up with a numbering which is compatible with the canonically ordered initial partition $\mathcal{V}$, and which among all such numberings yields the smallest value of $\mathrm{cd}(P_\nu^t\, \mathrm{Adj}(\Gamma)\, P_\nu)$.
If $\mathcal{R}'^{(s,1)}$ is not the split partition, then we have to make another guess, thereby always choosing a vertex out of the first cell in the current partition which has more than one element. After a finite number of successive guesses we always arrive at the split partition, i.e. at some numbering $\nu$ of $V$ which gives a new code number $\mathrm{cd}(P_\nu^t\, \mathrm{Adj}(\Gamma)\, P_\nu)$ to be compared with the previously smallest one.

Suppose that a sequence of guesses $d_1, d_2, \ldots, d_{\lambda-1}, d_\lambda$ has been made and that all options for the last position $d_\lambda$ have been explored already. Then we have to "move back" to the last but one position and, given $d_1, \ldots, d_{\lambda-2}$, try the next guess instead of $d_{\lambda-1}$. In this way, going back and forth, we systematically explore all possible numberings which are compatible with our initial partition $\mathcal{V}$. This procedure finally yields a canonical numbering of $\Gamma$.
The advantage of such backtracking procedures is that some elaborate versions of them have turned out to be very efficient in practice. Nevertheless, for none of these versions has a polynomial time bound been proved up to now. Among the best known computer software packages for solving isomorphism problems is the one described in [McK90]. It also contains a computer code for finding canonical numberings. It is based on a method similar to the one discussed here (with $s = 1$), see also [McK81].
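The guessing and backtracking scheme of this subsection can be sketched in Python for the case $s = 1$. This is a minimal illustration under several assumptions: only vertex refinement is used, the code number is simply the adjacency matrix read row by row, the whole search tree is explored without any pruning, and all function names are hypothetical. It is not the method of any particular software package.

```python
def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph on range(n)."""
    adj = [[0] * n for _ in range(n)]
    for a, b in edges:
        adj[a][b] = adj[b][a] = 1
    return adj

def refine(adj, color):
    """Vertex refinement: split cells by the sorted colors of the neighbors."""
    n = len(adj)
    while True:
        label = [(color[v], tuple(sorted(color[w] for w in range(n) if adj[v][w])))
                 for v in range(n)]
        rank = {l: r for r, l in enumerate(sorted(set(label)))}
        new = [rank[l] for l in label]
        if len(rank) == len(set(color)):
            return new
        color = new

def code(adj, order):
    """Adjacency matrix renumbered along `order`, read row by row."""
    return tuple(adj[a][b] for a in order for b in order)

def canonical_code(adj, color=None):
    """Smallest code over all numberings compatible with the refined partitions."""
    n = len(adj)
    color = refine(adj, color if color is not None else [0] * n)
    cells = {}
    for v in range(n):
        cells.setdefault(color[v], []).append(v)
    # first cell in the canonical order that has more than one vertex
    target = next((c for c in sorted(cells) if len(cells[c]) > 1), None)
    if target is None:  # split partition: the numbering is determined
        return code(adj, sorted(range(n), key=lambda v: color[v]))
    best = None
    for d in cells[target]:  # guess: d gets the smallest number in its cell
        # doubling keeps the cell order and frees one new color for the rest of the cell
        guess = [2 * c + (1 if c == target and v != d else 0)
                 for v, c in enumerate(color)]
        cand = canonical_code(adj, guess)
        if best is None or cand < best:
            best = cand
    return best

# Example: a 6-cycle, a relabelled 6-cycle, and two disjoint triangles.
c6 = adjacency(6, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)])
c6_relabelled = adjacency(6, [(0, 3), (3, 1), (1, 5), (5, 2), (2, 4), (4, 0)])
two_triangles = adjacency(6, [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)])
```

Both example graphs are 2-regular, so refinement alone leaves all six vertices in one cell; only the guesses separate them. Since every leaf of the search yields the adjacency matrix under some numbering, equal codes imply isomorphism, and since the search tree does not depend on the vertex names, isomorphic graphs receive equal codes.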
9.6 High-dimensional stabilization has some history within the investigations of coherent (cellular) algebras. In particular, similar procedures were suggested quite a few years ago for the investigation of strongly regular graphs. One of these procedures is related to the notion of "graphs with 4-conditions". It was introduced by M. D. Hestenes and D. G. Higman in [HesH71], who attribute the idea to Ch. Sims. Various generalizations of this notion and interesting examples are considered in [Iva87], [Iva89], [BroIK89], [FarKM94], [Pas92]. B. Weisfeiler used the term "deep stabilization" for such procedures; however, he never defined exactly which stabilization procedures should be covered by this term. Unfortunately, in this area there is still a mixture of different languages, which is certainly a serious obstacle to the exchange of ideas. We hope to review all the above mentioned links more rigorously in the future.
9.7 In Section 8 we have discussed the possibility of finding a sequence

$$\mathcal{R}^{(1,1)}, \mathcal{R}^{(2,1)}, \ldots, \mathcal{R}^{(k,1)} = \mathcal{V}_{\mathrm{Aut}}(\Gamma)$$

of partitions of the vertex set $V$ of a graph $\Gamma$ which finally ends up with the automorphism partition. However, we do not know much about the number $k$ for which this happens for the first time. Clearly, $k = k(\Gamma)$, i.e. $k$ depends on $\Gamma$. We know that certainly $\mathcal{R}^{(n,1)} = \mathcal{V}_{\mathrm{Aut}}(\Gamma)$. In most cases $k(\Gamma)$ is much smaller than $n$. However, the main result in [CaiFI92] is that there is a series of graphs $\Gamma_s$ with $n(s)$ vertices such that $k(\Gamma_s) = \Omega(n(s))$. That means that to find $\mathcal{V}_{\mathrm{Aut}}(\Gamma_s)$ we would have to compute $\mathcal{R}^{(\varepsilon n(s),1)}$, where $\varepsilon$ satisfies $0 < \varepsilon < 1$. Hence, the result in [CaiFI92] demonstrates that the k-dimensional Weisfeiler-Leman method is of restricted practical interest.

Fortunately for chemists, no artificial graphs like the Fürer graphs or similar constructions appear in the real chemical world. For dealing with chemical graphs, in most cases Stabgraph 2 and even Stabgraph is a sufficient computational tool.
9.8 In the literature one frequently finds various approaches to the isomorphism problem where spectral properties of graphs are used in order to determine appropriate canonically ordered partitions. We have already mentioned at the end of Section 7 that the invariants appearing in the description of the Weisfeiler-Leman closure, namely the canonically ordered structure constants $p^s_{ij}$ (see Section 7.8), are substantially stronger than any spectral invariants. That means, if the Weisfeiler-Leman closures $\overline{\Gamma}$ and $\overline{\Gamma}'$ of two graphs $\Gamma$ and $\Gamma'$ have the same tensor of canonically ordered structure constants, then these graphs are cospectral.

For a colored graph $\Gamma = (V; \mathcal{R})$ with $\mathcal{R} = (R_0, \ldots, R_d)$ the number of colors $d + 1$ can be as large as $n^2$. Since in $p^s_{ij}$ the range of the indices $i$, $j$ and $s$ is $\{0, \ldots, d\}$, in principle there can be as many as $n^6$ such constants. However, most of them are zero. One can prove that every Weisfeiler-Leman closure has at most $n^3$ non-zero structure constants. Algorithm Stabgraph 2 needs time $O(n^3 \log n)$. Hence, by means of Algorithm Stabgraph 2, within the same time bound one can implicitly take into consideration all spectral properties of graphs.
9.9 Let $W$ be a cellular algebra (this notion was discussed in [KliRRT95]). A doubly stochastic matrix $X$ is called a ds-automorphism of $W$ if $XA = AX$ for all matrices $A \in W$. (In fact, it suffices to require only that $X$ commutes with all basic matrices of $W$.) Let $DS(W)$ be the set of all ds-automorphisms of $W$. According to [EvdKP96] a cellular algebra $W$ is called compact if $DS(W)$ coincides with the convex hull of $\mathrm{Aut}(W)$. A graph $\Gamma$ is called weakly compact if its WL-closure consists of the basic relations of a compact algebra. These notions are in fact generalizations of the notion of a compact graph introduced in [Tin86] and developed further in [Bru88], [SchT88] and [Tin89]. Again generalizing a result in [Tin91], it was proved in [EvdKP96] that, if $\Gamma$ is a weakly compact graph, then testing whether any other graph $\Gamma'$ is isomorphic to $\Gamma$ or not can be done in polynomial time (using WL-stabilization as a subroutine). Thus, in a sense, weakly compact graphs are "good" objects for isomorphism testing. For a wide class of graphs it is possible to prove that they are weakly compact (or even compact) in advance, i.e. on a purely theoretical level. In particular, this is true for directed or undirected cycles, for trees, cographs, and for a wide superclass of these graph classes, see [EvdPT97].
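The definition of a ds-automorphism is easy to check numerically. The Python sketch below does so with exact rational arithmetic for the undirected 4-cycle; it assumes that the basic matrices of the relevant algebra are the identity, the adjacency matrix and the matrix of opposite vertices, and the function names are hypothetical. The matrix $X$ is built as a convex combination of automorphisms (the four rotations), so by construction it lies in the convex hull mentioned above.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_doubly_stochastic(X):
    """Non-negative entries, all row sums and all column sums equal to 1."""
    n = len(X)
    return (all(x >= 0 for row in X for x in row)
            and all(sum(row) == 1 for row in X)
            and all(sum(X[i][j] for i in range(n)) == 1 for j in range(n)))

def is_ds_automorphism(X, basis):
    """X is doubly stochastic and commutes with every matrix in `basis`."""
    return is_doubly_stochastic(X) and all(matmul(X, A) == matmul(A, X) for A in basis)

# The 4-cycle 0 - 1 - 2 - 3 - 0 with matrices I, A (edges) and B (opposite vertices).
n = 4
I = [[int(i == j) for j in range(n)] for i in range(n)]
A = [[int((i - j) % n in (1, 3)) for j in range(n)] for i in range(n)]
B = [[int((i - j) % n == 2) for j in range(n)] for i in range(n)]

# Average of the four rotations x -> x + r (mod 4); each rotation is an automorphism.
rotations = [[[int((i + r) % n == j) for j in range(n)] for i in range(n)]
             for r in range(n)]
X = [[sum(Fraction(P[i][j], n) for P in rotations) for j in range(n)]
     for i in range(n)]

# A permutation matrix that is not an automorphism: it swaps 0 and 1 only.
Y = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
```

Here `is_ds_automorphism(X, [I, A, B])` holds, while the transposition `Y` is doubly stochastic but fails to commute with `A`.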
9.10 In mathematical chemistry, interest in the graph isomorphism problem originates from the problem of testing isomorphism of molecular graphs. The notion of a molecular graph, however, is still not strictly defined. Which graphs are realised as chemical molecules depends on the current level of knowledge in chemistry. Today a few million graphs are known to represent the constitutional formulas of chemical compounds. One has to expect that this class of graphs will grow much larger in the future. The discovery of fullerenes (see issue 33 of MATCH (1996), which is completely devoted to this class of compounds) essentially extends the class of molecular graphs. Nevertheless, some features of molecular graphs (according to our current understanding of them) may be estimated rather efficiently based on the experience of chemists, computer search, statistical experiments, and so on. From this point of view it seems permissible to claim that for almost all (if not for all) known molecular graphs the WL-stabilization method will work adequately, i.e. this method applied to a molecular graph will return a correct description of its automorphism partition. For each concrete instance of a molecular graph this claim should be confirmed theoretically, using for example our knowledge of weakly compact graphs. Thus, further investigations of weakly compact graphs become important from a new applied point of view. We believe that further progress in this direction will help chemists to justify theoretically the use of their favorite stabilization procedures for the perception of symmetry properties of molecular graphs.
10 Acknowledgement

We gratefully acknowledge stimulating and helpful discussions with G. M. Adel'son-Vel'ski.
References

[AhoHU75] Aho A.V., Hopcroft J.E., Ullman J.D.: The design and analysis of computer algorithms. Addison-Wesley, Reading, 1975.
[ArlZUF74] Arlazarov V.L., Zuev I.I., Uskov A.V., Faradzev I.A.: An algorithm of reduction of finite undirected graphs to the canonical form. Zhurn. Vychisl. Mat. i Mat. Fis. 14, 1974, 737-743 (Russian).
[BabBLT97] Babel L., Baumann S., Lüdecke M., Tinhofer G.: STABCOL: Graph isomorphism testing based on the Weisfeiler-Leman algorithm. Preprint TUM-M9702, Munich, 1997, 33 pp.
[BabCKP97] Babel L., Chuvaeva I.V., Klin M., Pasechnik D.V.: Algebraic Combinatorics in Mathematical Chemistry. Methods and Algorithms. II. Program implementation of the Weisfeiler-Leman algorithm. (A preliminary version). Preprint TUM-M9701, Munich, 1997, 45 pp.
[BalMB85a] Balaban A.T., Mekenyan O., Bonchev D.: Unique Description of Chemical Structures Based on Hierarchically Ordered Extended Connectivities (HOC Procedures). I. Algorithms for Finding Graph Orbits and Canonical Numbering of Atoms. Journal of Computational Chemistry 6, 1985, 538-551.
[BalMB85b] Balaban A.T., Mekenyan O., Bonchev D.: Unique Description of Chemical Structures Based on Hierarchically Ordered Extended Connectivities (HOC Procedures). III. Topological, Chemical, Stereochemical Coding of Molecular Structures. Journal of Computational Chemistry 6, 1985, 562-569.
[Bas98] Bastert O.: New ideas for canonically computing graph algebras. Preprint TUM-M9803, Technische Universität München, June 1998.
[Ber78] Bersohn M.: A sum algorithm for numbering the atoms of a molecule. Computers & Chemistry 2, 1978, 113-116.
[Ber87] Bersohn M.: A matrix method for partitioning the atoms of a molecule into equivalence classes. Comput. Chem. 11, 1987, 67-72.
[Bha80] Bhat K.V.S.: Refined Vertex Codes and Vertex Partitioning Methodology for Graph Isomorphism Testing. IEEE Transactions on Systems, Man and Cybernetics, Vol. SMC-10, No. 10, 1980, 610-615.
[Bir46] Birkhoff G.: Tres observaciones sobre el algebra lineal. Tucumán Rev. Nac. Ser. A 5, 1946, 147-150.
[BonMB85] Bonchev D., Mekenyan O., Balaban A.T.: Unique Description of Chemical Structures Based on Hierarchically Ordered Extended Connectivities (HOC Procedures). IV. Recognition of Graph Isomorphism and Graph Symmetries. MATCH 18, 1985, 83-99.
[BooC79] Booth S., Colbourn Ch.J.: Problems polynomially equivalent to graph isomorphism. Report CS-77-04, University of Toronto, 1979.
[BroIK89] Brouwer A.E., Ivanov A.A., Klin M.H.: Some new strongly regular graphs. Combinatorica 9, 1989, 339-344.
[Bru88] Brualdi R.A.: Some applications of doubly stochastic matrices. Linear Algebra Appl. 107, 1988, 77-100.
[CaiFI92] Cai J.Y., Fürer M., Immerman N.: An optimal lower bound on the number of variables for graph identification. Combinatorica 12, 1992, 389-410.
[ChaG97] Chan A., Godsil Ch.D.: Symmetry and Eigenvectors. Preprint, Combinatorics and Optimization, University of Waterloo, Canada, 1997.
[Col78] Colbourn Ch.J.: A Bibliography of the Graph Isomorphism Problem. Technical Report No. 123/78, University of Toronto, 1978.
[Col79] Colbourn Ch.J.: Refinement Techniques For Graph Isomorphism. Proc. 10th S-E Conf. Combinatorics, Graph Theory and Computing, 1979, 261-288.
[Cor68] Corneil D.G.: Graph isomorphism. Ph.D. Thesis, University of Toronto, 1968.
[CorG70] Corneil D.G., Gotlieb C.C.: An Efficient Algorithm for Graph Isomorphism. Journal of ACM 17, 1970, 51-64.
[CorK80] Corneil D.G., Kirkpatrick D.G.: A Theoretical Analysis of Various Heuristics for the Graph Isomorphism Problem. SIAM J. Comput. 9, 1980, 281-297.
[EvdKP96] Evdokimov S., Karpinski M., Ponomarenko I.: Compact Cellular Algebras and Permutation Groups. Report No. 85154-CS, Department of Computer Science, University of Bonn, Germany.
[EvdPT97] Evdokimov S., Ponomarenko I., Tinhofer G.: On a new class of weakly compact graphs. Preprint TUM-M9715, Technische Universität München.
[Far78] Faradzev I.A. (ed.): Algorithmic investigations in combinatorics. Nauka, Moscow, 1978 (Russian).
[FarIK90] Faradzev I.A., Ivanov A.A., Klin M.H.: Galois correspondences between permutation groups and cellular rings (association schemes). Graphs and Combinatorics 6, 1990, 303-332.
[FarKM94] Faradzev I.A., Klin M.H., Muzichuk M.E.: Cellular rings and groups of automorphisms of graphs. In: Faradzev I.A. et al. (eds.): Investigations in algebraic theory of combinatorial objects. Kluwer Acad. Publ., Dordrecht, 1994, 1-152.
[Fue87] Fürer M.: A Counterexample in Graph Isomorphism Testing. Preprint, Computer Science Department, The Pennsylvania State University, 1987.
[Gat79] Gati G.: Further annotated bibliography on the isomorphism disease. Journal of Graph Theory 3, 1979, 95-109.
[God94] Godsil C.D.: Algebraic Combinatorics. Chapman & Hall, 1993.
[HesH71] Hestenes M.D., Higman D.G.: Rank 3 groups and strongly regular graphs. SIAM-AMS Proc. 4, 1971, 141-159.
[IUPAC] IUPAC: Nomenclature of Organic Chemistry. Pergamon Press, Oxford, 1979.
[Iva87] Ivanov A.V.: On rank 3 graphs with 5-vertex condition. Math. Forschungsinstitut Oberwolfach, Tagungsbericht 24, 1987, 8-9.
[Iva89] Ivanov A.V.: Non rank 3 strongly regular graphs with the 5-vertex condition. Combinatorica 9, 1989, 255-260.
[JaR74] James K.R., Riha W.: Algorithm for description and ordering of trees. Technical report 54, University of Leeds, 1974.
[Kar78] Karp R.M.: Probabilistic analysis of a canonical numbering algorithm for graphs. Proceedings of the AMS Symposium on Relations Between Combinatorics and Other Branches of Mathematics, 1978.
[KliLPZ92] Klin M.H., Lebedev O.V., Pivina T.S., Zefirov N.S.: Nonisomorphic cycles of maximum length in a series of chemical graphs and the problem of application of IUPAC nomenclature rules. MATCH 27, 1992, 133-151.
[KliPR88] Klin M.H., Pöschel R., Rosenbaum K.: Angewandte Algebra. Vieweg, Braunschweig, 1988.
[KliRRT95] Klin M., Rücker C., Rücker G., Tinhofer G.: Algebraic combinatorics in mathematical chemistry. Methods and algorithms. I. Permutation Groups and coherent (cellular) algebras. Technical Report TUM-M9510, Technische Universität München, 1995.
[Leo84] Leon J.S.: Computing automorphism groups of combinatorial objects. In: Atkinson M.D. (ed.): Computational group theory, Durham 1982. Academic Press, London, 1984, 321-335.
[Mac75] Mackay A.L.: On rearranging the connectivity matrix of a graph. J. Chem. Phys. 62, 1975, 307-308.
[MarM56] Margenau H., Murphy G.: The Mathematics of Physics and Chemistry. Van Nostrand, Princeton, 1956.
[McK77] McKay B.D.: Computing automorphisms and canonical labelings of graphs. In: Combinatorial Mathematics, Springer Lecture Notes in Mathematics 686, 1977, 223-232.
[McK81] McKay B.D.: Practical graph isomorphism. Congressus Numerantium 30, 1981, 45-87.
[McK90] McKay B.D.: nauty User's Guide. Computer Science Department, Australian National University.
[MekBBB85] Mekenyan O., Bonchev D., Balaban A.T.: Unique Description of Chemical Structures Based on Hierarchically Ordered Extended Connectivities (HOC Procedures). II. Mathematical Proofs for the HOC Algorithm. Journal of Computational Chemistry 6, 1985, 552-561.
[Mor65] Morgan H.L.: The generation of a unique machine description for chemical structures. Journal of Chemical Documentation 5, 1965, 107-112.
[MuzK98] Muzychuk M., Klin M.: On graphs with three eigenvalues. Discrete Mathematics 189, 1998, 191-207.
[Pas92] Pasechnik D.V.: Skew-symmetric association schemes with two classes and strongly regular graphs of type $L_{2n-1}(4n-1)$. Acta Appl. Math. 29, 1992, 129-138.
[Pro74] Proskurowski A.: Search for a unique incidence matrix of a graph. BIT 14, 1974, 209-226.
[Ran74] Randić M.: On recognition of identical graphs representing molecular topology. J. Chem. Phys. 60, 1974, 3920-3928.
[Ran75] Randić M.: On rearrangement of the connectivity matrix of a graph. J. Chem. Phys. 62, 1975, 308-309.
[Ran76] Randić M.: On discerning symmetry properties of graphs. Chem. Phys. Lett. 42, 1976, 283-287.
[Ran77] Randić M.: On canonical numbering of atoms in a molecule and graph isomorphism. J. Chem. Inf. Comput. Sci. 17, 1977, 171-180.
[Rea72] Read R.C.: The Coding of Various Kinds of Unlabeled Trees. In: Read R.C. (ed.): Graph Theory and Computing. Academic Press, New York, 1972.
[Rea83] Read R.C.: A new system for the design of chemical compounds. 1. Theoretical preliminaries and the coding of acyclic compounds. Journal of Chem. Inf. and Comp. Sci. 23, 1983, 135-149.
[Rea85] Read R.C.: A new system for the design of chemical compounds. 2. Coding of cyclic compounds. Journal of Chem. Inf. and Comp. Sci. 25, 1985, 116-128.
[ReaC77] Read R.C., Corneil D.G.: The graph isomorphism disease. Journal of Graph Theory 1, 1977, 339-363.
[Rou76] Rouvray D.H.: The Topological Matrix in Quantum Chemistry. In: Balaban A.T. (ed.): Chemical Applications of Graph Theory. Academic Press, New York, 1976.
[RueR90] Rücker G., Rücker Ch.: Computer Perception of Constitutional (Topological) Symmetry: TOPSYM, a Fast Algorithm for Partitioning Atoms and Pairwise Relations among Atoms into Equivalence Classes. J. Chem. Inf. Comput. Sci. 30, 1990, 187-191.
[ShaDC74] Shah Y.J., David G.J., McCarthy M.K.: Optimum Features and Graph Isomorphism. IEEE Transactions on Systems, Man and Cybernetics 4, 1974, 313-319.
[SchT88] Schreck H., Tinhofer G.: A Note on Certain Subpolytopes Associated With Circulant Graphs. Linear Algebra Appl. 111, 1988, 125-134.
[SchU97] Scheinerman E.R., Ullman D.H.: Fractional Graph Theory. Wiley, New York, 1997.
[Sta96] Stadler P.F.: Landscapes and their correlation functions. J. Math. Chemistry 20, 1996, 1-45.
[Sus65] Sussenguth E.H.: A Graph-Theoretic Algorithm for Matching Chemical Structures. Journal of Chemical Doc. 5/1, 1965, 36-43.
[Sza74] Szamkolowicz L.: Remarks on the theory of graph characteristics. Ann. Mat. Pura Appl. 98, 1974, 257-261.
[Tin75] Tinhofer G.: Zur Bestimmung der Automorphismen eines endlichen Graphen. Computing 15, 1975, 147-156.
[TiS84] Tinhofer G., Schreck H.: Linear Time Tree Codes. Computing 33, 1984, 211-225.
[Tin86] Tinhofer G.: Graph isomorphism and theorems of Birkhoff type. Computing 36, 1986, 285-300.
[Tin89] Tinhofer G.: Strong Tree-Cographs Are Birkhoff Graphs. Discrete Applied Mathematics 22, 1989, 275-288.
[Tin91] Tinhofer G.: A note on compact graphs. Discrete Appl. Math. 30, 1991, 253-264.
[Uch80a] Uchino M.: Algorithms for Unique and Unambiguous Coding and Symmetry Perception of Molecular Structure Diagrams. I. Vector Functions for Automorphism Partitioning. J. Chem. Inf. Comput. Sci. 20, 1980, 116-120.
[Uch80b] Uchino M.: Algorithms for Unique and Unambiguous Coding and Symmetry Perception of Molecular Structure Diagrams. II. Basic Algorithm for Unique Coding and Computation of Symmetry Group. J. Chem. Inf. Comput. Sci. 20, 1980, 121-124.
[Uch81] Uchino M.: Algorithms for Unique and Unambiguous Coding and Symmetry Perception of Molecular Structure Diagrams. 5. Unique Coding by the Method of "Orbit Graphs". J. Chem. Inf. Comput. Sci. 20, 1980, 121-124.
[Ung64] Unger S.H.: GIT - a heuristic program for testing pairs of directed line graphs for isomorphism. Comm. ACM 7, 1964, 26-34.
[WeiL68] Weisfeiler B.Y., Leman A.A.: Reduction of a graph to a canonical form and an algebra arising during this reduction. Nauchno-Technicheskaya Informatsia 9, Seria 2, 1968, 12-16 (Russian).
[Wei76] Weisfeiler B.Y. (ed.): On construction and identification of graphs. Lecture Notes in Math. 558, Springer, Berlin, 1976.