Embedding graphs onto the Supercube

Vincenzo Auletta†, Adele Anna Rescigno‡, Vittorio Scarano§
Abstract

In this paper we consider the Supercube, a new interconnection network derived from the hypercube and introduced by Sen in [10]. The Supercube has the same diameter and connectivity as a hypercube but can be realized for any number of nodes, not only for powers of 2. We study the capability of the Supercube to execute parallel programs using graph-embedding techniques. We show that complete binary trees and bidimensional meshes (with one side of length a power of 2) are spanning subgraphs of the Supercube. We then prove that the Supercube is Hamiltonian and that, when the number of nodes is not a power of 2, it contains all cycles of length greater than 3 as subgraphs.
Keywords: Cycles, Graph Embedding, Hamiltonian cycle, Parallel Architectures, Supercube.
Part of the results of this paper have been presented at the IV Italian Conf. on Theoretical Computer Science, L'Aquila, Italy, Oct. 1992 [1]. Research supported by "Progetto Finalizzato Sistemi Informatici e Calcolo Parallelo" of C.N.R. under grant n. 91.00939.PF69.
† Dipartimento di Informatica ed Applicazioni, Università di Salerno, 84081 Baronissi (SA), Italy.
‡ § Department of Computer Science, University of Massachusetts at Amherst, Amherst, MA 01003, U.S.A.

1 Introduction

The hypercube network is widely used as an architecture for parallel machines (Connection Machine, NCube and Intel iPSC). Its great popularity is essentially due to its modularity, regularity and low diameter. These characteristics make it easy to design efficient parallel programs and to share the machine among users. Moreover, the logarithmic degree of the nodes makes the architecture technologically feasible with respect to an "ideal" network connecting each processor to all the others, where the degree of a node is linear in the number of nodes. However, the number of nodes of this network is restricted to be a power of 2, which can be, in some situations, a significant drawback. In fact, it is necessary to double the number of processing elements to upgrade a hypercube, which can be unfeasible because of budget limitations or technical reasons.

Several architectures have been proposed having any number of nodes and hypercube-like characteristics. In [3] a network called Generalized Hypercube is presented, but it becomes a complete network when the number of nodes is prime. In [6] the Incomplete Hypercube is proposed, but the minimum degree of the nodes can be 1. This is a strong limitation to the communication capabilities of such a node and to the fault-tolerant properties of the architecture.

In [10] the Supercube, a generalization of the hypercube, has been introduced. It can be realized for any number of nodes $N$ and contains as a subgraph the hypercube of dimension $\lfloor \log N \rfloor$. It is exactly a hypercube when the number of nodes is a power of 2. The addition of a node is easy and has low cost (few edges disappear). Moreover, the Supercube has the same good characteristics of high connectivity and small diameter as the hypercube network [10], and the degree of a node is at least $\lfloor \log_2 N \rfloor - 1$ and at most $2\lfloor \log_2 N \rfloor - 2$. Its topological properties are studied in [13], while its good fault-tolerant characteristics are explored in [11, 1, 2].

In this paper we study the Supercube as an alternative to the hypercube as a basis for an interconnection network and, in particular, we study the capability of the Supercube to efficiently execute typical parallel algorithms. To execute a parallel program, its tasks have to be mapped onto the processors of the parallel machine. This kind of problem can be modelled in graph-theoretical terms as a graph embedding. We model both the parallel algorithm and the parallel machine as graphs, and an efficient execution is obtained by mapping the first graph onto the second one in such a way that adjacent tasks are as close as possible. It is known that the optimal mapping problem is NP-complete in most of its formulations, but for simpler cases many results have been obtained in the past years.
We first prove that complete binary trees and bidimensional meshes (with one side of length a power of 2) are spanning subgraphs of the Supercube. Then we prove that the Supercube is a Hamiltonian graph and, when the number of nodes is not a power of 2, it contains all cycles of length greater than 3 as subgraphs, i.e. it is pancyclic. Only a few graphs, such as the De Bruijn graph [12], the X-tree [7] and the Product-Shuffle network [8], are known to have this characteristic. These results are also used to prove that the Butterfly can be embedded onto the Supercube with dilation and congestion 2.
2 Definitions

In the sequel the (Hamming) distance $d(x, y)$ of two binary strings $x$ and $y$ is defined as the number of positions in which $x$ and $y$ differ. The length of a string $x$ is denoted by $|x|$. A Hypercube network $H_d$ is an undirected graph with $2^d$ nodes. Each node is labelled with a distinct $d$-bit binary string representing an integer between 0 and $2^d - 1$, and there is an edge between two vertices $x$ and $y$ exactly when $d(x, y) = 1$.
Definition 1 A Supercube $S_N(V(S_N), E(S_N))$, with $N = 2^s + h$ and $0 \le h \le 2^s - 1$, is a graph having each node labelled with an $(s+1)$-bit binary string representing a number between 0 and $N - 1$. Let $u$ and $v$ be the labels of two nodes and let $u = b_u u_1$ and $v = b_v v_1$, where $|u_1| = |v_1| = s$ and $|b_u| = |b_v| = 1$. There is an edge $(u, v)$ in $S_N$ iff one of the following holds:
1) $b_u \ne b_v$ and $d(u_1, v_1) = 0$ (horizontal edge);
2) $b_u = b_v$ and $d(u_1, v_1) = 1$ (hypercube edge);
3) $b_u \ne b_v$ (say $b_u = 1$), $d(u_1, v_1) = 1$ and $b_u v_1 \notin V(S_N)$ (oblique edge).

In [10] it is proved that the Supercube is $(s+1)$-regular whenever $h = 2^{s-1}$. The maximum degree $maxd$ is such that $s \le maxd \le 2s - 2$, while the minimum degree $mind$ is such that $s - 1 \le mind \le s$. Besides, it is also proved that its connectivity is equal to the minimum degree of the network and that the diameter is $\lfloor \log_2 N \rfloor$.

We can consider $V(S_N)$ partitioned in three sets $V_1$, $V_2$ and $V_3$ where, for any sequence $v$ of $s$ bits:
$V_1$ is the set of nodes having label of the form $0v$ for which $1v \in V(S_N)$;
$V_2$ is the set of nodes having label of the form $0v$ for which $1v \notin V(S_N)$;
$V_3$ is the set of nodes having label of the form $1v$.

The oblique edges connect a vertex of $V_3$ with a vertex of $V_2$ corresponding to an "absent node" of the Supercube: when a hypercube edge should join a node in $V_3$ with a node not belonging to $V(S_N)$, it is stretched to the corresponding node in $V_2$. Adding a new node to an existing Supercube means, in a certain way, splitting each oblique edge of the first node in $V_2$ into two edges, a horizontal one and a hypercube one. Notice that, from the definition, it follows that the subgraph induced by $V_1 \cup V_2$ (with hypercube edges) is a hypercube $H_s$.
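The three edge types of Definition 1 can be enumerated directly. The following is a minimal sketch, assuming nodes are represented by the integers 0 to $N-1$ whose $(s+1)$-bit binary expansions are the labels; the function name `supercube_edges` and this integer encoding are our own illustrative choices, not part of [10].

```python
from itertools import combinations

def supercube_edges(n):
    """Map each edge (u, v), u < v, of S_n to its type, following Definition 1."""
    s = n.bit_length() - 1            # n = 2^s + h with 0 <= h <= 2^s - 1
    mask = (1 << s) - 1               # selects the s low-order bits (u_1, v_1)
    edges = {}
    for u, v in combinations(range(n), 2):
        bu, bv = u >> s, v >> s       # leading bits b_u, b_v
        dist = bin((u & mask) ^ (v & mask)).count("1")   # d(u_1, v_1)
        if bu != bv and dist == 0:
            edges[(u, v)] = "horizontal"
        elif bu == bv and dist == 1:
            edges[(u, v)] = "hypercube"
        elif bu != bv and dist == 1 and ((1 << s) | (u & mask)) >= n:
            # u < v forces b_u = 0 and b_v = 1; the edge is oblique when the
            # node 1u_1 that v would reach by a hypercube edge is absent
            edges[(u, v)] = "oblique"
    return edges

# S_6 (s = 2, h = 2^(s-1)) is 3-regular; S_8 is the hypercube H_3
degree = [0] * 6
for u, v in supercube_edges(6):
    degree[u] += 1
    degree[v] += 1
assert degree == [3] * 6
assert set(supercube_edges(8).values()) == {"hypercube"}
```

When $N$ is a power of 2 no node has leading bit 1 in this encoding, so only hypercube edges appear, in agreement with the remark that the Supercube is then exactly a hypercube.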
3 Graph Embedding

We will consider the capability of the Supercube to efficiently execute parallel programs.
Figure 1: The Supercube $S_{11}$, with nodes 0000 to 1010 partitioned into the sets $V_1$, $V_2$ and $V_3$.

We define the computation graph of a parallel algorithm to be the graph in which nodes are processes and edges connect processes exchanging data; similarly, the host graph of a parallel machine is a graph in which nodes are processors and edges are physical interconnection links. Our goal is to minimize the number of steps needed in the host network to emulate a communication between adjacent processes in the computation graph and, at the same time, to minimize the number of processors of the host network.

More formally, we define the embedding $\langle \phi, \rho \rangle$ of a graph $G = (V(G), E(G))$ into a graph $H = (V(H), E(H))$ as a function $\phi$ from $V(G)$ to $V(H)$, together with a function $\rho$ that maps each $(u, v) \in E(G)$ into a path $\rho((u, v))$ in $H$ connecting $\phi(u)$ and $\phi(v)$. The dilation of the edge $(u, v) \in E(G)$ under $\langle \phi, \rho \rangle$ is the length of the path $\rho((u, v))$ in $H$, while the dilation of an embedding $\langle \phi, \rho \rangle$ is the maximum dilation, over the edges of $G$, under $\langle \phi, \rho \rangle$. The expansion of an embedding $\langle \phi, \rho \rangle$ of $G$ into $H$ is defined as the ratio of the size of $V(H)$ to the size of $V(G)$. We notice that the expansion measures the utilization of the processors. The congestion of the edge $(u, v) \in E(H)$ under $\langle \phi, \rho \rangle$ is the number of paths, images under $\rho$, passing through $(u, v)$, while the congestion of an embedding $\langle \phi, \rho \rangle$ is the maximum congestion, over the edges of $H$, under $\langle \phi, \rho \rangle$. The load of a node $v \in V(H)$ under $\langle \phi, \rho \rangle$ is the number of nodes of $G$ mapped onto $v$ by $\phi$, and the load of an embedding $\langle \phi, \rho \rangle$ is the maximum load over the vertices of $H$.

We assume that a processor can communicate with each of its neighbors in one step, so that edges serve as bidirectional links. In this way each step of $G$ is simulated by a series of steps of $H$. Given an embedding $\langle \phi, \rho \rangle$ of $G$ into $H$, each communication across an edge $(u, v)$ of $G$ is effected by transmitting the message along the path $\rho((u, v))$ in $H$. In such a case, a message from $u \in G$ to $v \in G$ has to wait while processor $\phi(u)$ executes other tasks (at most as many as the load), has to travel a number of edges of $H$ at most equal to the dilation, and on each edge can be delayed by other messages (at most as many as the congestion) and by the other tasks (again bounded by the load) that the intermediate processor has to execute. It follows that the number of steps needed to simulate a step of $G$ on the host architecture $H$ is bounded in terms of the load, dilation and congestion of the embedding [4].

It can be easily verified that $S_N$ can be embedded into $H_{\lfloor \log N \rfloor}$ with dilation, congestion and load $O(1)$. Since $H_{\lfloor \log N \rfloor}$ is a subgraph of $S_N$, we can say that the Supercube and the Hypercube are computationally equivalent, that is, $T$ steps on one of the architectures can be simulated in $O(T)$ steps on the other one. When $G$ is a subgraph of $H$ there is an embedding such that the dilation, congestion and load are 1. In the following we will consider only embeddings with load 1.
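To make the four cost measures concrete, the sketch below computes them for an embedding given explicitly. It assumes the node map and the path map are stored as Python dictionaries (named `phi` and `rho` here purely for illustration), with each path written as the list of host nodes it visits.

```python
from collections import Counter

def embedding_costs(guest_nodes, guest_edges, host_nodes, phi, rho):
    """Return (dilation, congestion, load, expansion) of an embedding.

    phi maps guest nodes to host nodes; rho maps each guest edge (u, v)
    to a host path from phi[u] to phi[v], given as a list of host nodes.
    """
    usage = Counter()                       # paths crossing each host edge
    dilation = 0
    for (u, v) in guest_edges:
        path = rho[(u, v)]
        assert path[0] == phi[u] and path[-1] == phi[v]
        dilation = max(dilation, len(path) - 1)
        for a, b in zip(path, path[1:]):
            usage[frozenset((a, b))] += 1   # links are bidirectional
    congestion = max(usage.values())
    load = max(Counter(phi[u] for u in guest_nodes).values())
    expansion = len(host_nodes) / len(guest_nodes)
    return dilation, congestion, load, expansion

# Toy example: a triangle a, b, c embedded into the path 0 - 1 - 2
phi = {"a": 0, "b": 1, "c": 2}
rho = {("a", "b"): [0, 1], ("b", "c"): [1, 2], ("a", "c"): [0, 1, 2]}
costs = embedding_costs("abc", list(rho), [0, 1, 2], phi, rho)
assert costs == (2, 2, 1, 1.0)
```

In the toy example the edge (a, c) must be routed through host node 1, which is what raises both the dilation and the congestion to 2.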
3.1 Complete Binary Trees

It is possible to embed $T_d$ (a complete binary tree of height $d$) into its optimal hypercube $H_d$ with dilation 2, using the inorder enumeration of the tree nodes and taking the binary representation of this number as the hypercube label [4]. Since $T_d$ is not a subgraph of $H_d$ when $d \ge 3$, this embedding is optimal [9]. We will prove that the complete binary tree is a spanning subgraph of the Supercube. In the sequel we identify the nodes of $T_d$ in the usual way: the root is identified by $e$ (the null string), and the children of a node $w$ are identified by $w0$ (left child) and $w1$ (right child). A (binary) labeling $\ell$ of a complete binary tree $T_d$ assigns to each node $w$ of $T_d$ a distinct $d$-bit label $\ell(w)$.
Definition 2 Given a labeling $\ell$ of a complete binary tree $T_d$, we call the mirror labeling for any node $w$ the label given by

$M(\ell(w)) = [\ell(w)]^R$

where $\bar{z}$ is the bit complement of the string $z$ and $z^R$ is the reverse of the string $z$.

For example, given a node $w = w_m \ldots w_1$, if $\ell(w_m \ldots w_1) = \ell_{s+1} \ell_s \ldots \ell_1$ then $M(\ell(w)) = \ell_1 \ldots \ell_s \ell_{s+1}$.
Theorem 1 The complete binary tree $T_h$ can be embedded in $S_{2^h - 1}$ with dilation 1.

Proof: Let us call canonical every labeling of $T_h$ such that

$\ell(e) = 01^{h-1}$
$\ell(0) = 001^{h-2}$
$\ell(1) = 1^{h-1}0$
$\ell(00) = 0001^{h-3}$
$\ell(01) = 101^{h-2}$
$\ell(10) = 01^{h-2}0$
$\ell(11) = 1^{h-2}00$

By induction on the height of the tree, we prove that a canonical labeling describes an embedding of $T_h$ into $S_{2^h - 1}$ with dilation 1. For $h = 3$, the assertion can be verified by inspection, and we note that the edge between the root and its right child is an oblique one. Let us suppose that a canonical labeling $\ell$ gives an embedding of $T_{h-1}$ in $S_{2^{h-1} - 1}$; then the labeling $m$ for $T_h$ is obtained according to the following scheme:

$m(e) = 01^{h-1}$
$m(0) = 0\ell(e)$
$m(1) = 1M(\ell(e))$
$m(00w) = 0\ell(0w)$
$m(01w) = 1M(\ell(0w))$
$m(10w) = 0\ell(1w)$
$m(11w) = 1M(\ell(1w))$

for $w \in \{0, 1\}^*$ such that $0 \le |w| < h - 2$. It is easy to verify that $m$ gives us an embedding of $T_h$ into $S_{2^h - 1}$ with dilation 1 and that $m$ is canonical.

We point out that $T_h$ can be embedded in $S_N$ for $N > 2^h - 1$ with dilation 2, using the embedding algorithm into the hypercube [4].
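The base case $h = 3$ of the induction can be checked mechanically. The sketch below instantiates the seven canonical labels for $h = 3$ and tests every tree edge against the adjacency rules of Definition 1 in $S_7$; the helper names `edge_kind`, `tree_edges` and `is_spanning_dilation_one` are ours, and labels are compared as integers.

```python
def edge_kind(u, v, n):
    """Type of the edge between labels u and v in S_n, or None if not adjacent."""
    s = n.bit_length() - 1                  # n = 2^s + h
    mask = (1 << s) - 1
    lo, hi = min(u, v), max(u, v)
    if lo == hi or lo < 0 or hi >= n:
        return None
    dist = bin((lo & mask) ^ (hi & mask)).count("1")
    if (lo >> s) == (hi >> s):
        return "hypercube" if dist == 1 else None
    if dist == 0:
        return "horizontal"
    # hi has leading bit 1; the edge is oblique when 1 followed by lo's tail is absent
    if dist == 1 and ((1 << s) | (lo & mask)) >= n:
        return "oblique"
    return None

def tree_edges(height):
    """(parent, child) pairs of the complete binary tree with `height` levels."""
    nodes = [""]                            # "" plays the role of the root e
    for depth in range(height - 1):
        nodes += [w + c for w in nodes if len(w) == depth for c in "01"]
    return [(w[:-1], w) for w in nodes if w]

def is_spanning_dilation_one(labels, height):
    n = 2 ** height - 1
    values = sorted(int(b, 2) for b in labels.values())
    if values != list(range(n)):            # every node of S_n used exactly once
        return False
    return all(edge_kind(int(labels[p], 2), int(labels[c], 2), n)
               for p, c in tree_edges(height))

# Canonical labeling for h = 3 (1^{h-3} is the empty string here)
canonical_3 = {"": "011", "0": "001", "1": "110",
               "00": "000", "01": "101", "10": "010", "11": "100"}
assert is_spanning_dilation_one(canonical_3, 3)
assert edge_kind(0b011, 0b110, 7) == "oblique"   # root to right child
```

The last assertion reproduces the observation made in the proof that the edge between the root and its right child is an oblique one.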
3.2 Meshes and Cycles

Given a sequence of binary labels $B = [w_1, w_2, \ldots, w_k]$, we define $0B = [0w_i \mid 1 \le i \le k]$ and $1B = [1w_i \mid 1 \le i \le k]$. If $B_1 = [w'_1, \ldots, w'_k]$ and $B_2 = [w''_1, \ldots, w''_r]$ then the sequence $[B_1, B_2] = [w'_1, \ldots, w'_k, w''_1, \ldots, w''_r]$.
Definition 3 A Gray code on $d$ bits $Gr(d)$ is defined in the following way: $Gr(1) = [0, 1]$ and $Gr(d) = [0\,Gr(d-1), 1\,Gr^R(d-1)]$, where $Gr^R(d-1)$ is the sequence $Gr(d-1)$ taken in reverse order.

The Gray code is such that two adjacent labels have distance 1. Using the Gray code, any mesh $2^r \times 2^s$ can be embedded in $H_{r+s}$ with dilation 1 [5]. In the sequel, if not differently specified, labels of the Gray code are on $s + 1$ bits. Given a Gray code $Gr(d)$, we call the first $2^{d-1}$ labels the Left part of the code, $LG(d)$, and the last $2^{d-1}$ labels the Right part, $RG(d)$. We notice that the last node of the Left part is always a neighbor in the Supercube of the first one.

Every label in the Right (Left) part corresponds to a label in the Left (Right) part differing only in the leading bit. More formally, the corresponding node of $G(2^s + i) \in RG$ is the node $G(2^s - 1 - i) \in LG$, and the corresponding node of $G(j - 1) \in LG$ is $G(2^{s+1} - j) \in RG$. We may think of folding $Gr(s+1)$ after the Left part in such a way that the corresponding nodes are in the same "column" and horizontal edges join corresponding nodes (see Fig. 2).
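The Gray code and its folding can be illustrated with a few lines of code. This is a sketch assuming the reflected construction, in which the second half of $Gr(d)$ is $Gr(d-1)$ in reverse order prefixed by 1 (this is what makes adjacent labels differ in a single bit); the function names are illustrative only.

```python
def gray_code(d):
    """Labels of the reflected Gray code Gr(d), as d-bit strings."""
    if d == 1:
        return ["0", "1"]
    prev = gray_code(d - 1)
    return ["0" + w for w in prev] + ["1" + w for w in reversed(prev)]

def hamming(x, y):
    """Number of positions in which the equal-length strings x and y differ."""
    return sum(a != b for a, b in zip(x, y))

s = 3
G = gray_code(s + 1)

# Two adjacent labels have distance 1
assert all(hamming(G[j], G[j + 1]) == 1 for j in range(len(G) - 1))

# The last label of the Left part is a neighbor of the first one
assert hamming(G[2 ** s - 1], G[0]) == 1

# Folding: G(2^s + i) and G(2^s - 1 - i) differ only in the leading bit
for i in range(2 ** s):
    left, right = G[2 ** s - 1 - i], G[2 ** s + i]
    assert left[0] == "0" and right[0] == "1" and left[1:] == right[1:]
```

The folding check is the reason why, in the Supercube, a label of the Right part (when it exists) is joined to its corresponding label of the Left part by a horizontal edge.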
Figure 2: Folding the Gray code $Gr(s+1)$. The Left part runs from $G(0)$ to $G(2^s - 1)$ and the Right part from $G(2^s)$ to $G(2^{s+1} - 1)$; $G(j-1)$ is in the same column as $G(2^{s+1} - j)$, and $G(2^s - 1 - i)$ is in the same column as $G(2^s + i)$.

In the following, we call a label existing if a node of $S_N$ with that label exists.
Lemma 2 The mesh $1 \times N$ can be embedded in $S_N$ with dilation 1.

Proof: We map mesh nodes on nodes whose labels are taken alternately in the Left part and in the Right one (if possible), using horizontal edges to skip from one part to the other. Horizontal edges join the first nodes of the Left part to the last nodes of the Right part, so we advance in the Left part and go back in the Right one. Oblique edges are also used to skip when the label we would use in the Right part does not exist. Let us call $L_N(i)$ the node where the $i$-th node of the mesh $1 \times N$ is mapped, with $0 \le i \le N - 1$. We fix $L_N(0) = G(0)$ and assume that $L_N(-1) \in LG$; the position of the $i$-th node $L_N(i)$ depends on where the previous two nodes have been mapped (see Fig. 3):
1) if $L_N(i-1) \in LG$ and $L_N(i-2) \in RG$, then $L_N(i)$ is the successor of $L_N(i-1)$;
2) if $L_N(i-1) \in LG$ and $L_N(i-2) \in LG$, then $L_N(i)$ is the node of the Right part corresponding to $L_N(i-1)$, if it exists; otherwise it is the label following $L_N(i-1)$;
3) if $L_N(i-1) \in RG$ and $L_N(i-2) \in LG$, then $L_N(i)$ is the label preceding $L_N(i-1)$ in the Right part, if it exists; otherwise we use the oblique edge to the node following $L_N(i-2)$;
4) if $L_N(i-1) \in RG$ and $L_N(i-2) \in RG$, then $L_N(i)$ is the node corresponding to $L_N(i-1)$.
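The four rules translate almost literally into code. The following sketch works on Gray-code positions and assumes the standard closed form `i ^ (i >> 1)` for the reflected Gray code, with labels represented as integers; `gray`, `is_edge` and `linear_embedding` are names introduced here for illustration.

```python
def gray(i):
    """i-th label of the reflected Gray code, as an integer."""
    return i ^ (i >> 1)

def is_edge(u, v, n):
    """True if labels u and v are adjacent in S_n (Definition 1)."""
    s = n.bit_length() - 1
    mask = (1 << s) - 1
    lo, hi = min(u, v), max(u, v)
    if lo == hi or lo < 0 or hi >= n:
        return False
    dist = bin((lo & mask) ^ (hi & mask)).count("1")
    if (lo >> s) == (hi >> s):
        return dist == 1                                  # hypercube edge
    # horizontal edge, or oblique edge towards an absent node
    return dist == 0 or (dist == 1 and ((1 << s) | (lo & mask)) >= n)

def linear_embedding(n):
    """Labels L_n(0), ..., L_n(n-1) chosen by rules 1)-4) of Lemma 2."""
    s = n.bit_length() - 1
    size = 1 << (s + 1)
    in_left = lambda j: j < (1 << s)
    exists = lambda j: gray(j) < n
    partner = lambda j: size - 1 - j        # corresponding position after folding
    pos = [0]                               # Gray positions; L_n(0) = G(0)
    while len(pos) < n:
        cur = pos[-1]
        prev_left = in_left(pos[-2]) if len(pos) > 1 else True   # L_n(-1) in LG
        if in_left(cur):
            if not prev_left:                                    # rule 1
                nxt = cur + 1
            else:                                                # rule 2
                nxt = partner(cur) if exists(partner(cur)) else cur + 1
        elif prev_left:                                          # rule 3
            back = cur - 1
            nxt = back if (not in_left(back) and exists(back)) else pos[-2] + 1
        else:                                                    # rule 4
            nxt = partner(cur)
        pos.append(nxt)
    return [gray(j) for j in pos]

path = linear_embedding(11)
assert sorted(path) == list(range(11))
assert all(is_edge(a, b, 11) for a, b in zip(path, path[1:]))
```

For $S_{11}$ the sequence starts 0000, 1000, 1001, 0001 and uses the oblique edge between 1010 and 0110 to skip the non-existing label 1110.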
Figure 3: The four cases of Lemma 2.

Theorem 3 A mesh $2^r \times k$ can be embedded with dilation 1 in $S_{2^r k}$.

Proof: Mesh nodes are labelled with a $(\lceil \log_2 k \rceil + r)$-bit string. The node in row $i$ and column $j$ (where $0 \le i \le 2^r - 1$ and $0 \le j \le k - 1$) is labelled by $L_k(j)\,G_r(i)$, where $G_r(i)$ is the $i$-th label of $Gr(r)$. The result follows from the previous Lemma, since each column of $2^r$ nodes can be thought of as collapsed into a "supernode", so that the problem is to embed the resulting $1 \times k$ mesh in $S_k$.
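The labeling used in the proof can be checked numerically for small meshes. The sketch below concatenates $L_k(j)$ (computed by the rules of Lemma 2) with the $r$-bit Gray label of the row and verifies that every mesh edge becomes a Supercube edge; all function names are ours, and integers stand for binary labels.

```python
def gray(i):
    """i-th label of the reflected Gray code, as an integer."""
    return i ^ (i >> 1)

def is_edge(u, v, n):
    """True if labels u and v are adjacent in S_n (Definition 1)."""
    s = n.bit_length() - 1
    mask = (1 << s) - 1
    lo, hi = min(u, v), max(u, v)
    if lo == hi or lo < 0 or hi >= n:
        return False
    dist = bin((lo & mask) ^ (hi & mask)).count("1")
    if (lo >> s) == (hi >> s):
        return dist == 1
    return dist == 0 or (dist == 1 and ((1 << s) | (lo & mask)) >= n)

def linear_embedding(k):
    """L_k(0..k-1) following rules 1)-4) of Lemma 2 (Gray positions inside)."""
    s = k.bit_length() - 1
    size, half = 1 << (s + 1), 1 << s
    pos = [0]
    while len(pos) < k:
        cur = pos[-1]
        prev_left = pos[-2] < half if len(pos) > 1 else True
        mate = size - 1 - cur
        if cur < half:
            nxt = cur + 1 if (not prev_left or gray(mate) >= k) else mate
        elif prev_left:
            nxt = cur - 1 if (cur - 1 >= half and gray(cur - 1) < k) else pos[-2] + 1
        else:
            nxt = mate
        pos.append(nxt)
    return [gray(j) for j in pos]

def mesh_labels(r, k):
    """Label of mesh node (row i, column j): L_k(j) followed by G_r(i)."""
    column = linear_embedding(k)
    return {(i, j): (column[j] << r) | gray(i)
            for i in range(2 ** r) for j in range(k)}

def mesh_has_dilation_one(r, k):
    n = (2 ** r) * k
    lab = mesh_labels(r, k)
    if sorted(lab.values()) != list(range(n)):
        return False
    rows_ok = all(is_edge(lab[i, j], lab[i + 1, j], n)
                  for i in range(2 ** r - 1) for j in range(k))
    cols_ok = all(is_edge(lab[i, j], lab[i, j + 1], n)
                  for i in range(2 ** r) for j in range(k - 1))
    return rows_ok and cols_ok

assert mesh_has_dilation_one(2, 5)      # a 4 x 5 mesh inside S_20
```

The check also confirms that the labeling is a bijection onto the node set, i.e. that the mesh is a spanning subgraph of $S_{2^r k}$ for the tested sizes.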
Definition 4 In a cycle $C_N$, nodes are labelled with $0 \le i \le N - 1$, and any node $j$ is joined to the node $(j + 1) \bmod N$.

Lemma 4 The Supercube $S_N$ is Hamiltonian.

Proof: We embed $C_N$ in $S_N$ with dilation 1 using the same strategy as in Lemma 2 for monodimensional meshes, but we also have to show that $L_N(N-1)$ is adjacent to $L_N(0)$ in the Supercube. By the properties of the Gray code, it is sufficient that $L_N(N-1) = G(2^s - 1)$. We embed the cycle according to the strategy of Lemma 2 and consider the last two labels of $L_N$. Four cases are in order:
1) $L_N(N-2) = G(2^s) \in RG$ and $L_N(N-1) = G(2^s - 1) \in LG$;
2) $L_N(N-2) = G(2^s + 1) \in RG$ and $L_N(N-1) = G(2^s - 1) \in LG$;
3) $L_N(N-2) = G(2^s - 2) \in LG$ and $L_N(N-1) = G(2^s - 1) \in LG$;
4) $L_N(N-2) = G(2^s - 1) \in LG$ and $L_N(N-1) = G(2^s) \in RG$.

In the first three cases, the Lemma is proved. In the last case, we invert the skipping in such a way that $L_N(N-1) = G(2^s - 1)$. When the label $G(2^s + 1)$ does not exist, this can be easily obtained using the oblique edge between $G(2^s - 2)$ and $G(2^s)$ (see Fig. 4a). In general, if $G(2^s + k)$ is the first non-existing label in the Right part, we use the oblique edge between $G(2^s - 1 - k)$ and $G(2^s + k - 1)$ to invert the direction of the skipping (see Fig. 4b).
Figure 4: Inversions of the direction of the skipping in Lemma 4 (cases a and b).

We notice that, if $N = 2^s$, the Supercube is a Hypercube and we are always in case 1. It is well known that in $H_d$ any even-length cycle $C_{2t}$ can be embedded with dilation 1 using the first $t$ labels and the last $t$ labels of $Gr(d)$. Now we can prove that the Supercube is pancyclic when $N$ is not a power of two.
Theorem 5 Any cycle $C_i$, where $3 \le i \le N$, can be embedded with dilation 1 in $S_N$, where $N$ is not a power of 2.
Proof: We distinguish four cases.

$3 \le i < 2^s$: Any even-length cycle can be embedded into $H_s$, which is a subgraph of $S_N$. When $i = 2t + 1$ ($1 \le t \le 2^{s-1} - 1$) we embed $C_{2t}$ in the Left part (obtained from $Gr(s)$ starting from $G_s(0)$) and include in it a node of the Right part (say $G(2^s + r)$) followed by a non-existing one. It can be easily seen that at least one such node is present in the Right part, since $N$ is not a power of 2. If the node corresponding to $G(2^s + r)$ belongs to $C_{2t}$, which is decided by comparing $2^s + r$ with $2^{s+1} - 2t$, then $G(2^s + r)$ can be included using the oblique edge between itself and $G(2^s - r - 2)$. Otherwise, we have to shift the cycle by XORing all the nodes with the label $G(2^s - r)$. In this way, the first node of the cycle is $G(2^s - r)$, and $G(2^s + r)$ is included in the cycle by substituting the edge $(G(2^s - r), G(2^s - r + 1))$ with the path $(G(2^s - r), G(2^s + r), G(2^s - r + 1))$.

$i = 2^s$: Using the Left part of the Gray code.

$i = 2^s + k$ (where $1 \le k < h$): The cycle is embedded using the same strategy as in Lemma 2 and Lemma 4, but using just $k$ nodes of the Right part. The usual strategy is followed until the $(k-1)$-th node of the Right part is included in the cycle. Then four cases may occur for the choice of the $k$-th Right node, and they depend on the position of the $(k-1)$-th node (see Fig. 5). After the $k$-th node in $RG$ has been selected, all the remaining nodes of the Left part are included. We call group a set of adjacent existing labels in the Right part. If the $(k-1)$-th node is the last one of its group, we include just the first one of the next group¹ using an oblique edge (Fig. 5a-b). When the $(k-1)$-th node is not the last one of its group, if the node preceding the $(k-1)$-th Right node in the cycle is in the Right part, we include just the last one of its group using an oblique edge (Fig. 5c); otherwise we simply include the $k$-th node following the usual rule (Fig. 5d).

$i = 2^s + h$: Using Lemma 4.
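For small $N$ the statement can be confirmed by exhaustive search. The sketch below does not follow the constructive strategy of the proof: it simply enumerates all simple cycles of $S_N$ by depth-first search and records their lengths, which is feasible only for very small graphs. The names `is_edge` and `cycle_lengths` are ours.

```python
def is_edge(u, v, n):
    """True if labels u and v are adjacent in S_n (Definition 1)."""
    s = n.bit_length() - 1
    mask = (1 << s) - 1
    lo, hi = min(u, v), max(u, v)
    if lo == hi or lo < 0 or hi >= n:
        return False
    dist = bin((lo & mask) ^ (hi & mask)).count("1")
    if (lo >> s) == (hi >> s):
        return dist == 1
    return dist == 0 or (dist == 1 and ((1 << s) | (lo & mask)) >= n)

def cycle_lengths(n):
    """Set of lengths of the simple cycles of S_n (brute force, small n only)."""
    adj = {u: [v for v in range(n) if is_edge(u, v, n)] for u in range(n)}
    found = set()

    def extend(start, node, visited, length):
        for nxt in adj[node]:
            if nxt == start and length >= 3:
                found.add(length)           # closed a cycle through `start`
            elif nxt > start and nxt not in visited:
                visited.add(nxt)            # `start` is the smallest node of the cycle
                extend(start, nxt, visited, length + 1)
                visited.remove(nxt)

    for start in range(n):
        extend(start, start, {start}, 1)
    return found

# Pancyclic when N is not a power of 2 ...
for n in (3, 5, 6, 7):
    assert cycle_lengths(n) == set(range(3, n + 1))
# ... while the hypercube H_3 (N = 8) is bipartite and has only even cycles
assert cycle_lengths(8) == {4, 6, 8}
```

The contrast with $N = 8$ shows why the hypothesis that $N$ is not a power of 2 is needed: a hypercube contains no odd cycle.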
Figure 5: Four cases (a-d) of Theorem 5, when $i = 2^s + k$, with $1 \le k < h$.

The embedding strategy for the cycles can also be used to embed the Butterfly network efficiently onto the Supercube.
¹ Note that it exists since $k < h$.

Definition 5 A Butterfly $B_h$ is a graph with $h 2^h$ nodes where any node is labelled with $\langle i, w \rangle$, for $0 \le i \le h - 1$ and $w \in \{0, 1\}^h$. Edges are divided into straight edges $(\langle i, w \rangle, \langle (i+1) \bmod h, w \rangle)$ and oblique edges $(\langle i, w \rangle, \langle (i+1) \bmod h, w \oplus e_i \rangle)$, where $e_i$ is the binary vector of length $h$ whose only nonzero component is the $i$-th and $\oplus$ is the bit-wise XOR operation.

Using a well-known standard technique, it is easy to show the following.

Lemma 6 The butterfly $B_h$ can be embedded in $S_{h 2^h}$ with dilation 2 and congestion 2.
Proof: We define an embedding $\langle \phi, \rho \rangle$ in the following way. We map a node $\langle i, w \rangle$ of the butterfly on the node $\phi(\langle i, w \rangle) = L_h(i)\,w$. Notice that the labels assigned to $\langle i, w \rangle$ and $\langle (i+1) \bmod h, w \rangle$ are at distance 1 by Lemma 4 (straight edges), while to an oblique edge $(\langle i, w \rangle, \langle (i+1) \bmod h, w \oplus e_i \rangle)$ we assign the path $(\phi(\langle i, w \rangle), \phi(\langle (i+1) \bmod h, w \rangle), \phi(\langle (i+1) \bmod h, w \oplus e_i \rangle))$. Then the result follows from Lemma 4 and from the definition of the Butterfly.
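The embedding of the proof can be evaluated on small instances. In the sketch below the Hamiltonian cycle $L_h$ of $S_h$ is supplied by the caller rather than computed by the construction of Lemma 4, and $e_i$ is taken to be the vector with a 1 in bit position $i$ counted from the least significant bit; both choices, like the function names, are assumptions made for illustration.

```python
from collections import Counter

def is_edge(u, v, n):
    """True if labels u and v are adjacent in S_n (Definition 1)."""
    s = n.bit_length() - 1
    mask = (1 << s) - 1
    lo, hi = min(u, v), max(u, v)
    if lo == hi or lo < 0 or hi >= n:
        return False
    dist = bin((lo & mask) ^ (hi & mask)).count("1")
    if (lo >> s) == (hi >> s):
        return dist == 1
    return dist == 0 or (dist == 1 and ((1 << s) | (lo & mask)) >= n)

def butterfly_costs(h, cycle):
    """(dilation, congestion) of the embedding of B_h into S_{h 2^h}.

    `cycle` lists the labels of a Hamiltonian cycle of S_h in order,
    playing the role of L_h(0), ..., L_h(h-1).
    """
    n = h << h                                   # h * 2^h host nodes
    assert sorted(cycle) == list(range(h))
    phi = lambda i, w: (cycle[i] << h) | w       # label L_h(i) followed by w
    usage = Counter()
    dilation = 0
    for i in range(h):
        j = (i + 1) % h
        for w in range(1 << h):
            straight = [phi(i, w), phi(j, w)]
            oblique = [phi(i, w), phi(j, w), phi(j, w ^ (1 << i))]
            for path in (straight, oblique):
                for a, b in zip(path, path[1:]):
                    assert is_edge(a, b, n)      # every hop is a Supercube edge
                    usage[frozenset((a, b))] += 1
                dilation = max(dilation, len(path) - 1)
    return dilation, max(usage.values())

# S_3 is a triangle, so 0, 1, 2 is a Hamiltonian cycle of it
assert butterfly_costs(3, [0, 1, 2]) == (2, 2)
```

Each host edge of the form $(L_h(i)\,w, L_h(i+1)\,w)$ carries one straight edge and one oblique edge of the butterfly, which is where the congestion 2 comes from in this instance.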
References

[1] V. Auletta, A.A. Rescigno, V. Scarano, "On the fault tolerance and computational capabilities of the Supercube", Proc. of IV Italian Conference on Theoretical Computer Science, L'Aquila (Italy), Oct. 1992.
[2] V. Auletta, A.A. Rescigno, V. Scarano, "The diameter of the surviving route graph of the Supercube", Technical Report Università di Salerno 3-92, June 1992.
[3] L. Bhuyan, D.P. Agrawal, "Generalized Hypercubes and Hyperbus structure for a computer network", IEEE Trans. on Comp. C-33, 1984, pp. 323-333.
[4] S. Bhatt, F. Chung, F.T. Leighton, A. Rosenberg, "Optimal simulation of tree machines", Proc. of 27th IEEE Symp. on Found. of Comp. Science, 1986, pp. 274-282.
[5] L. Johnsonn, "Communication Efficient Basic Linear Algebra Computations on Hypercube Architectures", J. Parallel and Distributed Computing 4, 1987, pp. 133-172.
[6] H.P. Katseff, "Incomplete Hypercubes", IEEE Trans. on Comp. C-37, 1988, pp. 604-607.
[7] A. Rosenberg, "Cycles in Networks", Technical Report 91-20 of Comp. and Inf. Science Dept. of Univ. of Massachusetts at Amherst, 1991.
[8] A. Rosenberg, "Product-Shuffle networks: towards reconciling Shuffles and Butterflies", Discr. Appl. Math. 37/38, 1992, pp. 465-488.
[9] Y. Saad, M. Schultz, "Topological properties of hypercube", IEEE Transactions on Computers 37-7, July 1988, pp. 867-871.
[10] A. Sen, "Supercube: An Optimally Fault Tolerant Network Architecture", Acta Informatica 26, 1989, pp. 741-748.
[11] A. Sen, A. Sengupta, S. Bandyopadhyay, "On the routing problem in faulty supercubes", Information Processing Letters 42, 1992, pp. 39-46.
[12] M. Yoeli, "Binary ring sequences", Amer. Math. Monthly 69, 1962, pp. 852-855.
[13] S.-M. Yuan, "Topological properties of supercube", Information Processing Letters 37, 1991, pp. 241-245.