Algorithms for Group Actions Applied to Graph

0 downloads 0 Views 199KB Size Report
testing program nauty McKa], or the present MOLGEN generator GKLM]). The restriction to simple ..... phieprinzip, Diplomarbeit (1995), Universit at Bayreuth. Br].
AMS Proceedings Style Volume 00, 0000

Algorithms for Group Actions Applied to Graph Generation Thomas Gruner, Reinhard Laue, and Markus Meringer Abstract. The generation of discrete structures up to isomorphism is inter-

esting for theoretical as well as for practical purposes. Mathematicians want to look at and analyse structures and, for example, the chemical industry uses mathematical generators of isomers for structure elucidation. The example chosen in this paper for explaining general generation methods for representatives from group orbits, is a relatively far reaching and fast graph generator. It is intended to serve as a basis for the next more powerful version of a generator of chemical isomers.

1. Introduction The problem of generating graphs has been considered by many authors. We mention only those approaches which are closely related to our work. Mostly the approaches use some kind of orderly generation of which we will present our subsetoriented version below. The basic ideas are from R. Read [R] and I. Faradzhev [F]. An important step forward was made by L.A. Goldberg [Go], who showed that generating graphs by iteratively adding vertices of maximal degree to smaller graphs can achieve a polynomial delay, i.e. a polynomial upper bound for the waiting time, before outputting the rst graph and for either outputting the next graph or halting. This was a successful attempt to use structure information in the orderly generation approach. A di erent strategy had been chosen in the famous DENDRAL project which was an early version of a generator of chemical isomers [LBFL], [GKL]. In that application a "brutto formula" is given. This details the number of atoms of the di erent kinds that have to appear in the molecule. The di erent molecules that correspond to the given brutto formula are called isomers. All isomers that correspond to this input have to be constructed. Since all atoms of the same kind are supposed to have the same valence, we can represent them as vertices of a multigraph regarding the valence as the degree. Thus, in graph theoretical terms, all multigraphs have to be constructed up to isomorphism, where the number of vertices of each degree is prescribed. We call (a0 ; a1; ; am ) the 

1991 Mathematics Subject Classi cation. Primary 05A05, 05C25, 05C85; Secondary 05C90, 68Q20, 68R10.

c 0000 American Mathematical Society 0000-0000/00 $1.00 + $.25 per page

1

2

THOMAS GRU NER, REINHARD LAUE, AND MARKUS MERINGER

degree partition of a graph G; if G has ai vertices of degree i for each i: Often in practical applications, some parts of the molecule are known. Then one can replace a subgraph by a vertex of a sometimes high degree. Therefore, though valences of atoms are usually very low, we do consider also vertices of higher degree. Later the vertex has to be expanded to the subgraph. This expansion process is a generator problem of its own and it may again cause isomorphism problems. We sketch the strategy of the DENDRAL system. The given structure information is analysed to deduce properties of subgraphs which are used as building blocks for the nal structures. This process can be explained by considering the construction process in reverse order. Starting from a hypothetical solution, a series of simpli cation steps are carried through until only small parts remain. So, in a "planning step" the vertices of degree 1 and 2 are removed and cyclic components are formed by removing bridges. From the given brutto formula, the possible number of cyclic components, the possible edge degree series for each cyclic component, and the number of interconnections of these components are deduced. The isomers are then built, up to isomorphism, from a catalogue of cyclic structures. Each construction step requires the solvution of isomorphism problems using some computational group theory. Thus, a structural analysis of the problem makes it possible to break down the problem into smaller pieces which are handled more easily. The approach presented here makes use of both kinds of ideas. On one hand, the mathematical idea of homomorphism is exploited to use structural information algorithmically. This leads to a recursive construction of (simple) graphs having a given degree partition from regular graphs. Compared to DENDRAL, this allows an unbounded number of simpli cation steps. On the other hand, orderly generation is used in the remaining homomorphically irreducible cases. The result is a generator which is extremely fast in many situations but which, due to the mathematical overhead, is slow in smaller cases compared to existing generators (such as, for example, B.D. McKay's makeg [McKb] based on the well known isomorphism testing program nauty [McKa], or the present MOLGEN generator [GKLM]). The restriction to simple graphs in this paper is not essential. We only intend to give a mathematical version of the new recursive strategy before adapting it more closely to the needs of computational chemistry. A complexity analysis is still missing and should evolve out of a search for the best recursion strategy among di erent variants of the general approach. We intend to look for properties bounding the sizes of the orbit problems that have to be solved in the recursion steps. Generally the use of a series of homomorphic simpli cations will lead to a remarkable reduction of the complexity of the task of nding orbit representatives,[L]. In that way, at least in practically important cases, a surprisingly good bound should be possible.

2. Basic Principles

We start with some basic principles and then show how they are used in generating graphs up to isomorphism. Generally the problem of generating objects up to isomorphism can be interpreted as the problem of nding orbit representatives for a group action. Since algorithms usually also need the stabilizers of the chosen representatives, we understand by a solution of the orbit representative problem, a determination of a set of representatives together with their stabilizers.

ALGORITHMS FOR GROUP ACTIONS APPLIED TO GRAPH GENERATION

3

Our approach allows us to give a solution to the generating problem that di ers from that of the conventional approaches. Mathematically, one can determine the number of orbits by some variant of the Cauchy-Frobenius Lemma,see [K], which is non-constructive. This is an elegant way to get the size of the solution set. Already this method has been criticized because of its exponentional complexity. We point out however, that, as early as 1982, H. S. Wilf [Wi] gave a discussion of that problem that apparently was not recognized by complexity theory. Wilf considers the quotient of the time complexity of counting, by the time complexity of listing representatives. A counting formula is e ective if this quotient tends to 0 with increasing size of the counted structures. If one is only interested in the number of orbits, his analysis shows that the Cauchy-Frobenius Lemma gives an e ective solution. Since practical applications really need the solutions themselves and not just the number of solutions, there has been much e ort to list orbit representatives completely. Of course, such an approach cannot lead very far when there is an exponential growth of the number of orbits. This is a motivation to generate representatives from the orbits uniformly at random. In mathematics one can nd other solutions to problems of this kind. For example, the Hauptsatz for nite abelian groups describes all nite abelian groups up to isomorphism without giving their number or listing them. Instead, there is an implicit description by lists of invariants. Though it seems intractable to nd such a Hauptsatz for our problem, we make a step in that direction. So instead of listing representatives in any case, we mix listing with an implicit handling of large sections of orbits. If there is no non-trivial group action, all the objects belong to di erent orbits. Thus, knowing an algorithm which computes the size of the set of objects and which allows the construction of each object on demand will suce. We say that a set O of orbits is determined if we have a rank function at hand, ranking the set O, and allowing the computation of a representative for the i-th orbit for any given rank number i: In terms of [SW] we thus require an unrank function. To avoid a misunderstanding, we point out that we do not demand the corresponding rank function, which, given any element from any orbit, would give the rank of the orbit. Such a function would solve the general isomorphism problem for graphs, whereas nding an unrank function would be an easier problem. We represent simple graphs as subsets of the set of all 2-element subsets of a vertex set V: Two such simple graphs are isomorphic i they lie in the same orbit of SV : Thus, we have to consider induced group actions. In such situations, there is often enough structure to apply algebraic decomposition techniques. The basis of this approach is to make algorithmic use of homomorphisms. 2.1 Definition: Homomorphism of group actions.

Let G1 be a group acting on a set 1 and G2 be a group acting on a set 2 : A pair  = ( ; G) of mappings, where  maps 1 into 2 and G : G1 G2 is a group homomorphism, is a homomorphism of group actions if  is compatible with both actions, i.e. for all g G1 and all ! 1  (!g ) = ! g G : If both components of  are surjective, then  is called an epimorphism, and if both components are bijective then  is called an isomorphism. !

2

2

THOMAS GRU NER, REINHARD LAUE, AND MARKUS MERINGER

4

If two group actions ( 1 ; G); ( 2; G) are isomorphic with an isomorphism  then  carries orbit representatives and their stabilizers onto corresponding representatives and stabilizers. Thus, if isomorphisms of group actions can be found, whole sets of solutions of the orbit problem can be used many times. This allows the description of large sets of solutions implicitly in our generator, as described above. We x some notation we use which is partially standard. We extend the group theory notions of a normalizer and a centralizer, to the case of actions on sets of more general objects. Thus, if ( ; G) is a group action and  then for each g G let g =  g   and NG () = g G g =  ; the normalizer of  in G: Similarly, CG() = g G  g =    is the centralizer of  in G: By abuse of notation, for a single point ! we denote by NG (!) = NG ( ! ) the stabilizer of !: Suppose an epimorphism  : ( 1; G1) ( 2 ; G2) is given. Then we may use  in two ways. Firstly, if a solution of the orbit problem is known in ( 2 ; G2) then we only have to look at the preimage sets  1(!2 ) for representatives !2 and let the full preimage groups G1 (NG2 (!2 )) act on these sets respectively. We call this usage of  a split. Secondly, if a solution of the orbit problem is known in ( 1 ; G1) then we may use the homomorphic images of the representatives and their stabilizers to nd a solution in the image domain. There is some complication, since di erent orbits in the preimage set may be mapped onto the same orbit. So if the image !2 = !1 of one representative !1 is used, one has to avoid the other image representatives for the elements of  =  1 (!2): It should be noted that those points in  that are mapped onto !1 by some g( ) G1 correspond bijectively to the cosets of NG1 (!1) in the full preimage group of NG2 (!2 ): The elements g( ) 1 are just a set of coset representatives. Thus, one can also obtain the stabilizer of !2 in this way. We call this usage of  a fusion. To point out the importance of these two interpretations, we brie y mention a typical application in construction problems. Suppose a set of representatives for the isomorphism types of a certain class of graphs is given. Then an extension step adds new subgraphs into each of these graphs. This step can be broken into two parts by the homomorphism principle. In the rst part the new subgraph is added as a colored object in all possible ways up to equivalence under the automorphism group of the old graph. This is the split part of the extension. It is related to the simpli cation which forgets the new added subgraph. In the next half of the extension step we forget the special coloring of the subgraph. This is a simpli cation  where we use fusion. For another simple example using some aspects of homomorphisms, consider the generation of multigraphs. A multigraph may be obtained by gradually increasing the highest edge multiplicity by 1. In each step only the automorphism group of the chosen representative needs to be considered with its action on the set of mappings from the set of all edges of maximal degree m to the set m; m + 1 : 

2

f

j j

2

g

f

f

2

2

j j

j j

g

8

2

g

2

f

g

!

2

f

g

ALGORITHMS FOR GROUP ACTIONS APPLIED TO GRAPH GENERATION

5

This is of course a rst application of a split. The percentage of simple graphs with non-trivial automorphism group tends to 0 with increasing number of vertices, [O]. Thus, in most cases we don't have any group action at all in these steps ( a recent implementation [Bg] demonstrates that already for a small number of vertices the automorphism groups of the multigraphs tend to become trivial very soon). This observation is trivial, but it is practically important, since in all these cases we can switch over to implicit handling instead of listing. Moreover, when graphs with trivial automorphism group combine to a bigger graph, these implicit listings may be combined yielding an implicit handling for a combinatorially exploding problem. Nevertheless, the cases with non-trivial group action have to be considered. We demonstrate the use of homomorphism here with a simple example. Suppose a multigraph G has k edges of highest multiplicity m forming the edge set Ek : Then AutG acts on the set of mappings from Ek to m; m +1 : The orbits correspond to the multigraphs with highest multiplicity m + 1 which can be deduced from G: To solve this orbit problem we may map Ek onto 1; ; k , m; m + 1 onto 0; 1 ; and AutG onto some corresponding subgroup U of Sk : This is an isomorphism of group actions which maps onto an orbit problem which is independent of k: If we solve these orbit problems as in the case of simple graphs once and for all, we only have to note the isomorphism instead of all solutions. In particular, after a nite number of steps, no new orbit problems have to be solved for an unbounded edge multiplicity. One can break up this orbit problem even further by reducing the action of U from 0; 1 f1; ;kg to 0; 1 B where B is an orbit of U on 1; ; k : This may be repeated until the stabilizer of a representative mapping is trivial or the mappings are fully determined. As usual in algebra, simpli cations by homomorphisms stop at some stage when no more non-trivial homomorphisms exist. These situations are called irreducible cases. For such cases we need another tool, i.e. orderly generation[R],[F]. We now suppose that a group G acts on a nite set X: We impose on X an ordering < such that also the set 2X of all subsets of X is lexicographically ordered. This ordering will not be compatible with the action of G; in general. It is therefore quite astonishing that such an ordering can be used to solve orbit problems. Each orbit S G for some S 2X contains a lexicographically minimal element S0 which we denote as the canonical representative with respect to xt then if y is not minimal in its orbit under NG (T); the set T y is not in canon< (2X ; G): f



g



2

[f g

[f g

THOMAS GRU NER, REINHARD LAUE, AND MARKUS MERINGER

6

The candidates obtained after removing the cases of the preceding lemma are often called semicanonical in the special case of graph generation [KP], [Gd], [Mo]. A test for minimalityfor each remaining candidate S; now has to decide whether there exists some g G such that S g < S: The obvious strategy is to run through all g G until either S g < S; or all elements of G have been tested. Of course, there must be chosen some ordering in which the elements of G have to be considered. We take a Sims chain, also called a strong generating set, with respect to the set X ordered ascendingly, as a base B = (b1; ; bn): This chain consists of transversals of the left cosets of Gi = CG( b1 ; ; bi ) in Gi 1 = CG( b1 ; ; bi 1 ) for i = n; ; 1: We order these representatives r by bri : Then we can run through all cosets rGi under this order of r0 s and, for xed r; in ascending order through Gi for i = n; ; 1: There is a case where some elements of G need not be considered in this procedure [Gd]. 2.4 Lemma. Suppose S < S U for some subset U of G and S g = S for some g G: Then also S < S gU ; since S gU = S U : Thus, for a subgroup U which has already been tested, the whole left coset gU can be omitted if S g = S is detected. Sometimes the elements of X play a di erent role in a bigger context. Then one has the additional condition that each xg has to belong to a certain class of elements of X; tending to further restrictions for the choice of group elements. This is used in the graph generation algorithms of [McKb] and [Go], where only additions of vertices of maximal degree are considered. Often the required solutions have to ful ll some constraints such as forbidding certain substructures. Then a check whether these constraints are ful lled is usually much faster than a canonicity check and should be done rst. It even pays to neglect the canonicity test in several consecutive extension steps, and to only check the constraints there. Then a canonicity check may reveal at a later point, that a candidate S is not minimal in its orbit, In the light of lemma 2.4 above, it is useful in such a case to determine the rst extension step where this non-canonicity could have been detected. Then all further extensions of this earlier candidate must also be rejected. Depending on the selectivity of the additional constraints. a delicate balance between steps with constraint checking only, and steps with canonicity check combined with tracing back to the earliest detection point, is needed for the fastest strategy. This has been followed by MOLGEN [GKLM]. Several di erent strategies for solving the isomer generation problem had been pursued in di erent stages of the MOLGEN project. The rst versions followed the DENDRAL strategy [LBFL]. There, in a nite number of steps, the isomers are constructed out of cyclic graphs. The present version uses orderly generation in lling classes of rows of the adjacency matrix [Gd] which are known to be invariant with respect to isomorphisms. For a future version we have implemented a proposal from [GKL] as a preliminary step. This generator is presently only available for simple graphs[Gr]. The generation strategy of this version makes sophisticated use of homomorphisms, and combines them with the orderly generation approach as discussed above in the irreducible cases. This new strategy is explained in the next section. 2

2



f





2



g

f



g

ALGORITHMS FOR GROUP ACTIONS APPLIED TO GRAPH GENERATION

7

3. A graph generator

The new generator relies on a strategy of determining rst how all graphs with a given degree partition can be built up recursively from regular graphs. The basic result for this approach is the following. 3.1 Theorem. Let a = (a0 ; a1; ; am ) be a degree partition of a graph, i. e. there exists a graph G having exactly ai vertices of degree i for all ai; and let aj = 0 for some j: If G is any graph with this degree partition then the aj vertices of degree j span a subgraph T and the remaining vertices span a subgraph H; such that the degree partitions b = (b0 ; ; bj ) of T and c = (c0 ; ; cm ) of H ful ll the following 

6





conditions. For each l 2 f1;    ; mg; with l 6= j; there exists a partition

al = such that for all i

X

i+k=l;cij 0

ci =

cik

Xc k

ik :

There exists a matrix P I with jH j columns and jT j rows such that all entries are 0 or 1 and I has i cik rows of sum k and bj l columns of sum l for all k and l: If, on the other hand, these conditions are ful lled for degree partitions a; b; c then, for all graphs T with degree partition b and H with degree partition c; there exists a graph G with degree partition a; having T and H as subgraphs.

There are well known criteria for a degree partition to be the degree partition of a graph [H]. Also, the existence condition for a 0=1-matrix with the required row and column sums can be expressed numerically without any explicit construction by the Gale-Ryser theorem, see[vLW] pp. 148-149. Thus, one can decide in advance whether a splitting of a given degree sequence a into two degree sequences b and c of graphs, will permit the construction of a graph G with the required degree sequence a from two corresponding graphs T and H: It is also clear that the subgraphs T and H in such a case are uniquely determined in any resulting graph G: Also the incidence structure matrix I having a row for each vertex x H and a column for each vertex y T; in which an edge connecting x to y is denoted by the entry 1 in the corresponding place of I; is unique. We use this decomposition of G to de ne homomorphisms of group actions. We consider V = 1;P ; v as the set of vertices of the graphs with degree partition a; such that v = i i ai : The symmetric group Sv acts on the set of all labelled graphs with vertex set V and degree partition a; such that the isomorphism types of graphs are just the orbits of Sv on this set. Mapping each graph G onto the set partition (V0 ; ; Vm ); where Vi is the set of vertices of degree i of G; gives a rst simplifying homomorphism of the action of Sv : Here, of course, all graphs with degree partition a and vertex set V are mapped onto a set partition with Vi = ai ; for each i: Therefore Sv is transitive on all these set partitions. The split version of the usePof homomorphisms P then tells us that we only need to look for graphs with Vi = j