A Class of Scalable Optical Interconnection Networks through Discrete Broadcast-select Multi-domain WDM
Khaled A. Aly
Patrick W. Dowd
Dept. Electrical & Computer Engineering University of Central Florida Orlando, FL 32816-2450
[email protected]
Dept. Electrical & Computer Engineering State University of New York at Buffalo Buffalo, NY 14260-2050
[email protected]
Abstract – Passive star-coupled optical interconnects with wavelength-division multiplexing hold strong potential for realizing flexible large-scale multicomputer networks. Virtual point-to-point regular connectivities can be defined and modified in a network with broadcast-select routing via distributed wavelength assignment to the nodes. The system size in this approach is bounded by the number of separable wavelength channels. This paper proposes a network class that achieves scalability by grouping the nodes into clusters and employing a separate pair of broadcast and select couplers with each cluster. Interconnecting the clusters according to a regular topology results in a modular architecture with reconfigurable partitions and significantly reduced fiber link density. This approach efficiently combines wavelength-division with direct space interconnection, taking advantage of the properties of each. For the topologies of most significance to parallel computing, the paper studies conflict-free wavelength assignment that maximizes spatial reuse, identifies the valid network partitions, and evaluates the fiber link density and the improvement in both space and wavelength channel throughput. The results show that cube and shuffle networks benefit most from the proposed approach in terms of maximizing wavelength reuse.
1 Introduction
Passive optical interconnects can potentially contribute to the performance, scalability, cost-effectiveness, fault-tolerance, and adaptability of parallel computer systems. Direct processor interconnection via optical fiber mainly takes advantage of the fiber's large bandwidth-distance product and high signal quality [1, 2]. Wavelength-division multiplexing (WDM) makes a larger amount of the bandwidth usable, which relaxes the speed mismatch with I/O electronics and significantly reduces access latency, especially in a distributed shared memory environment [3, 4]. Wavelength selectivity can further be employed to support distributed bandwidth and/or spatial reconfiguration and partitioning [5–9]. The virtual point-to-point (VPP) approach was considered to realize the shuffle [10] and other regular connectivities [11] using a single star coupler, by wavelength assignment to the nodes. In contrast to a multi-access approach [3–5], there is no need for fast-tunable devices, but communication is multi-hop and performance depends on the average distance of the emulated connectivity. Synchronous parallel applications with regular interprocessor communication exhibiting high locality can take full advantage of the VPP approach [8, 9, 12].

Scalability of VPP networks is constrained by the number of wavelengths that can be coupled and separated while maintaining acceptable crosstalk and power budget levels. Large networks can be constructed through a combination of space and wavelength division. Spatial wavelength re-use by exploiting fiber attenuation was studied in [13]. Multi-star configurations were considered, as a means of wavelength re-use, for generating permutations by fiber assignment [7], for realizing spanning multi-access channel hypercubes [4], and for topological configuration in a cluster-based chordal ring [8, 9]. These approaches are based on passive broadcast-select routing that avoids incorporating nonlinear optical devices such as wavelength converters or wavelength spatial switches [14].

This paper proposes a class of discrete broadcast-select multi-domain WDM networks, referred to as M-WDM, where the broadcast and select functions are realized via two separate sets of star couplers for transmit and receive. This modular structure is developed to achieve scalability and enable distributed reconfigurable partitioning with conflict-free communication. An objective is to minimize the impact of the physical interconnection network on both functionality and performance. Scalability and low link complexity are due to mixing wavelength-division and passive space-division. The cluster-based structure provides modularity, yet maintains independence among nodes within the same cluster. Reconfigurable partitioning is based on the principle of quotient networks [15], realized through wavelength selectivity at the receive side and cluster self links. Conflict-free communication is due to proper assignment of channel sets to clusters, and wavelength re-use is possible because of the sparsity of most practical regular interconnection graphs.

Section 2 of this paper presents the topological definition of the network, the notation used, and the network properties. Conflict-free wavelength assignment is the subject of Section 3, with results obtained for cube, grid, and tree cluster interconnection topologies. Scalability evaluation results are presented in Section 4 in terms of network partitionability, link density, and both space and wavelength channel throughput. Numerical comparisons to all-space and all-wavelength realizations are made to demonstrate the advantages of the proposed configuration. Conclusions are summarized in Section 5.

1 The work described in this paper was done while Khaled Aly was a Ph.D. candidate at the State University of New York at Buffalo.
2 This research was supported by the National Science Foundation under Grants CCR-9010774 and ECS-9112435.
2 Discrete Broadcast-select M-WDM
The network configuration is formally defined in Section 2.1. Notation and network properties are presented in Section 2.2.
2.1 Topological Definition
The network consists of M1 clusters, where each cluster is a set of M0 nodes, for a total network size of M = M1 M0 nodes. Each node possesses a single fixed-wavelength transmitter. The receiver needs to be capable of simultaneously monitoring a subset of the set of separable wavelength channels, whose number is denoted as C. It can be realized using a multichannel acousto-optic tunable filter [14] or a detector array and a passive (grating-based) wavelength-division demultiplexer [16]. A cluster configuration based on shared source/detector arrays is shown in Fig. 1. Each cluster possesses its own broadcast and select domains, realized by an output and an input optical star coupler, respectively. The cluster interconnection network (CIN) refers to the fiber connection pattern from output to input couplers. Clusters are provided with self links to enable wavelength connectivity among nodes in the same cluster. Nodes sharing an output coupler transmit over an ordered set of distinct wavelength channels. At the input coupler side, several distinct channel sets are monitored, depending on the CIN topology. Transmit channel sets are assigned to output couplers such that no conflicts may occur at any input coupler. That is, the channel sets that can be listened to through any input coupler are disjoint, which provides a collisionless environment. Fig. 2 illustrates an M-WDM shuffle network together with its wavelength assignment. In this example, clusters are interconnected according to a (2,3) shuffle permutation, which forms a 3-dimensional de Bruijn graph whose vertices are the clusters.
[Figure 2 graphic: clusters on the TX side feed the output couplers (broadcast); wavelength-multiplexed fiber links connect the output couplers to the input couplers (select), which feed the clusters on the RX side. Each node is labeled with its transmit channel and each link with the channel set it carries.]
Figure 2: An 8-cluster M-WDM shuffle with self links (4 nodes per cluster). Clusters 0, 2 and 3 transmit over channel set Λ0 = (λ00, λ01, λ02, λ03), clusters 1 and 6 transmit over Λ1 = (λ10, λ11, λ12, λ13), and clusters 4, 5 and 7 transmit over Λ2 = (λ20, λ21, λ22, λ23). Every fiber link is shown as 4 lines, each representing a channel. Input couplers receive, conflict-free, the set of all channels Λ = Λ0 ∪ Λ1 ∪ Λ2. Passive individual channel selection can be done at the cluster or node level.
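The assignment in Figure 2 can be checked mechanically. The following is a minimal sketch, not the authors' procedure: it assumes that cluster i of the (2,3) shuffle feeds the input couplers of clusters 2i mod 8 and 2i+1 mod 8 (the de Bruijn direction) plus its own input coupler through the self link. The function names are hypothetical, and each channel is represented as a (set index, node index) pair.

```python
def shuffle_links(n, k):
    """Out-links of an (n, k) shuffle CIN, self links included.

    Assumption: cluster i feeds clusters (n*i + j) mod n**k for j in 0..n-1.
    """
    m1 = n ** k
    return {i: {i} | {(n * i + j) % m1 for j in range(n)} for i in range(m1)}


# Channel-set index per cluster, as listed in the Figure 2 caption.
FIG2_SETS = {0: 0, 2: 0, 3: 0, 1: 1, 6: 1, 4: 2, 5: 2, 7: 2}


def received_channels(links, sets, m0):
    """For every input coupler, list the (set, node) channels arriving at it."""
    arriving = {dst: [] for dst in links}
    for src, dsts in links.items():
        for dst in dsts:
            arriving[dst].extend((sets[src], node) for node in range(m0))
    return arriving


def conflict_free(arriving):
    """True if no channel reaches the same input coupler twice."""
    return all(len(ch) == len(set(ch)) for ch in arriving.values())


if __name__ == "__main__":
    arriving = received_channels(shuffle_links(2, 3), FIG2_SETS, m0=4)
    print(conflict_free(arriving))
```

Under the stated link direction, the check passes for the caption's assignment and fails as soon as, for example, cluster 1 is moved to Λ0, because clusters 1 and 2 both reach the input coupler of cluster 2.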
[Figure 1 graphic: (a) nodes 0 to M0−1 drive a light source array whose outputs enter the output star coupler, which feeds F output fiber links; (b) F input fiber links enter the input star coupler, followed by a wavelength-division demultiplexer and a detector array that drives C electronic buses monitored by nodes 0 to M0−1.]
Figure 1: Cluster organization: (a) transmit side; nodes transmit over an assigned ordered set of channels Λi = (λ_{i,0}, ..., λ_{i,M0−1}); (b) receive side; nodes receive conflict-free over any channel of the set of all available channels Λ = {λ0, ..., λ_{C−1}} by monitoring the appropriate buses.

The set of clusters is defined as P = {P0, P1, ..., P_{M1−1}}. Each node is represented by a 2-tuple (i, j), where i ∈ [0, M1 − 1] represents the cluster index and j ∈ [0, M0 − 1] represents the node index within the cluster. The set of nodes in cluster i, 0 ≤ i ≤ M1 − 1, is Pi = {(i, 0), (i, 1), ..., (i, M0 − 1)}. The wavelength band is partitioned into W disjoint ordered channel sets, each consisting of M0 channels. The set of disjoint channel sets is defined as Λ = {Λ0, Λ1, ..., Λ_{W−1}}, and the ordered channel set i, 0 ≤ i ≤ W − 1, is represented by Λi = (λ_{i,0}, λ_{i,1}, ..., λ_{i,M0−1}). If channel set Λl is assigned to cluster Pk, then node (k, i) ∈ Pk transmits over channel λ_{l,i}, for all 0 ≤ i ≤ M0 − 1. The number of channel sets, W, required for conflict-free reception is determined by the CIN topology.
The cluster degree, denoted as F, represents the number of fiber links per input (or output) coupler. The dimension of an output coupler is therefore M0 × F and that of an input coupler is F × M0. Clusters are interconnected according to a regular topology T, whose graph representation is as follows: the set of vertices represents the set of clusters (i.e., each cluster of M0 nodes is represented by a vertex in T) and the set of edges represents the CIN fiber links. The number of channel sets necessary to achieve conflict-free communication is related to the number of channels by M0 W ≤ C, where W represents the number of sufficient and/or necessary disjoint channel sets that satisfy the conflict-free communication condition for a given CIN topology. An objective is to minimize W and therefore reduce the number of required channels. Consequently, larger networks can be built with lower space complexity.
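The constraint M0 W ≤ C fixes how large a cluster, and hence the whole network, can be for a given channel budget. The sketch below is illustrative only and uses hypothetical function names. The shuffle inputs (W = n, with 36 and 54 channels) are the figures quoted in Section 4.4; W = 8 for the 7-dimensional cube is an assumption based on the condition that (n + 1) divides 2^n.

```python
def max_cluster_size(num_channels, num_channel_sets):
    """Largest M0 that satisfies M0 * W <= C."""
    if num_channels <= 0 or num_channel_sets <= 0:
        raise ValueError("C and W must be positive")
    return num_channels // num_channel_sets


def network_size(num_clusters, num_channels, num_channel_sets):
    """Total node count M = M1 * M0 at the largest admissible M0."""
    return num_clusters * max_cluster_size(num_channels, num_channel_sets)


if __name__ == "__main__":
    # (label, M1, C, W); the W values are inputs, not derived here.
    cases = [
        ("7-dim binary cube", 2 ** 7, 64, 8),
        ("(2,8) shuffle", 2 ** 8, 36, 2),
        ("(3,5) shuffle", 3 ** 5, 54, 3),
    ]
    for label, m1, c, w in cases:
        print(label, max_cluster_size(c, w), network_size(m1, c, w))
```

With these inputs the cube reaches 1024 nodes (M0 = 8) and the two shuffles reach 4608 and 4374 nodes (M0 = 18 in both), which agrees with the 1 K-node and 4 K-node sizes discussed in Section 4.4.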
2.2 Notation and Network Properties
The CIN is a regular interconnection topology T with cluster self links (Pi, Pi), for all i ∈ [0, M1 − 1], which enable arbitrary wavelength connectivity among the nodes of a cluster. The cluster size provides an additional dimension to the network that is used to enable scalability and reconfigurable partitions, as will be discussed in Section 4. Cube, grid, and tree-based topologies are considered for the CIN. The main objective is to evaluate and compare the efficiency of the considered CIN topologies for the proposed M-WDM configuration, rather than to discuss their well-known topological properties and parallel applications [17]. The notation used is defined below.

An n-dimensional binary cube CIN is denoted as BC_n and consists of M1 = 2^n clusters. An n-dimensional cube-connected cycles CIN is denoted as CCC_n and has a total of M1 = 3 · 2^n clusters. An (n, k) shuffle stage CIN with M1 = n^k clusters is denoted as SH_{n,k} (equivalent to a generalized de Bruijn graph). A symmetric n-dimensional grid CIN with k clusters per dimensional axis is denoted as GR_{n,k}, and the corresponding torus (grid with wrap-around connections) carries a superscript mark, written here as GR*_{n,k}. The total number of interconnected clusters is M1 = k^n. A complete n-ary tree CIN of depth k levels, denoted as TR_{n,k}, interconnects a total of M1 = (n^k − 1)/(n − 1) clusters, and the k-level X-tree is denoted as XT_{2,k}.

The broadcast-select multi-domain approach achieves easily reconfigurable partitioning and modularity. This paper compares M-WDM structures to all-space (with no WDM) and all-wavelength (using a single star coupler) realizations. Large space networks are constrained by high link complexity and, in some cases, by the difficulty of spatial partitioning [18]. Single-star virtual point-to-point WDM networks are limited in size to a few tens of nodes. M-WDM network structures provide a topological design tradeoff that combines the flexibility and low complexity of WDM networks with the scalability of space networks. The examined generalization reduces to an all-space network when M0 = 1 node per cluster and to an all-wavelength realization when M1 = 1 cluster with M0 nodes. The properties of the proposed M-WDM configuration are summarized as follows:
(1) Scalability is achieved via spatial wavelength re-use. The network design attempts to use the maximum number of channels on a single fiber without interfering with conflict-free communication, maximizing both the fiber and the wavelength channel utilization. It has better expandability than direct space-division networks, since expansion can take place by adding nodes to clusters. The total network size does not need to be a topologically valid size, since the network is intended to be partitioned.
(2) Reconfigurable partitioning into independent or cooperating sub-networks is supported, depending on the mode of parallelism. Network partitions are established by simply monitoring the proper channels. Processors belonging to the same cluster may participate in different sub-networks.
(3) Low complexity results from the significantly reduced fiber link density, which is shown to be much lower than in a similar modular structure with space (non-WDM) interconnects.
(4) Wavelength selectivity is needed at the receive side to reconfigure the network partitions, but the partitions are modified infrequently, on a task-by-task basis. If tunable filters are employed, only slow tunability is necessary. The receiver configuration shown in Fig. 1, based on a detector array, enables transceiver sharing and completely passive channel monitoring/selection.

Table 1: Number of channel sets for CIN topologies
[Table body not recoverable. Columns: CIN; number of channel sets W without cluster self-links; W with cluster self-links; remarks. Rows: BC_n, CCC_n, SH_{n,k}, GR*_{2,3m}, GR_{n,3m}, GR*_{n,4m}, TR_{n,k}, XT_{2,k}. Surviving remarks: "upper bound" (two rows), "m is a positive integer", "n = 2, 3" (two rows).]

3 Conflict-free Channel Set Assignment
To guarantee conflict-free communication, distinct channel sets are assigned to the transmitters of any two clusters with a communication distance less than 3 in the CIN graph. This guarantees the uniqueness of the channel sets used by clusters that have a common adjacent cluster. Two clusters separated by a distance of 1 have a common neighbor due to the cluster self links, and two clusters separated by a distance of 2 have a third cluster as a common neighbor. Any two clusters with a common neighbor are assigned unique channel sets to avoid collisions at input couplers. Formally: find W disjoint sets of clusters Q0, Q1, ..., Q_{W−1}, where Qi = {Pj, Pk, for j, k ∈ [0, M1 − 1] | d_jk > 2} and d_jk denotes the distance between clusters Pj and Pk, such that Q0 ∪ Q1 ∪ ... ∪ Q_{W−1} = P. Channel set Λi is then assigned to {Pj | Pj ∈ Qi, j ∈ [0, M1 − 1]}, for all 0 ≤ i ≤ W − 1. This problem is distinct from the standard graph-theoretic vertex coloring and perfect matching problems. The importance of minimizing W stems from the need to limit the restriction on the cluster size M0, since M0 W ≤ C. The conflict-free communication principle is governed by W ≥ F, since F clusters always have a single common neighbor and therefore need to be assigned unique channel sets.
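The distance rule can be explored on small CINs with a simple heuristic. The sketch below is not the derivation used for Table 1 and gives no guarantee of a minimal W: it marks two clusters as conflicting when their transmissions reach a common input coupler (self links included) and then assigns channel sets greedily in cluster order. All function names are hypothetical.

```python
from itertools import combinations


def cube_links(n):
    """Out-links of a binary n-cube CIN, self links included."""
    return {v: {v} | {v ^ (1 << b) for b in range(n)} for v in range(2 ** n)}


def conflicts(out_links):
    """Pairs of clusters whose transmissions reach a common input coupler."""
    listeners = {}
    for src, dsts in out_links.items():
        for dst in dsts:
            listeners.setdefault(dst, set()).add(src)
    pairs = set()
    for srcs in listeners.values():
        pairs.update(combinations(sorted(srcs), 2))
    return pairs


def greedy_channel_sets(out_links):
    """Give each cluster the lowest channel-set index unused by its conflicts."""
    pairs = conflicts(out_links)
    assignment = {}
    for v in sorted(out_links):
        used = {assignment[u] for u in assignment
                if (min(u, v), max(u, v)) in pairs}
        assignment[v] = next(c for c in range(len(out_links)) if c not in used)
    return assignment


def is_conflict_free(out_links, assignment):
    """True if no two conflicting clusters share a channel set."""
    return all(assignment[a] != assignment[b] for a, b in conflicts(out_links))


if __name__ == "__main__":
    links = cube_links(3)
    sets = greedy_channel_sets(links)
    print(sets, is_conflict_free(links, sets))
```

For the binary 3-cube the greedy pass uses four channel sets and pairs each cluster with its antipode, so clusters sharing a set are at distance 3. On other topologies the greedy order may use more sets than necessary.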
The necessary and sufficient number of channel sets W has been derived for the considered CINs, and the results are summarized in Table 1. For grids and tori, this number depends on the number of clusters per axis and is found for some specific cases. For the binary cube and cube-connected cycles, an upper bound is obtained using the analogy with the theory of single-error-correcting codes (W is minimal in BC_n when n = 2^i − 1 for some integer i, since only then (n + 1) evenly divides 2^n). The exact number has been derived for the shuffle and the tree networks. Channel assignment maps have been obtained, but are not included in this paper. Clusters in each disjoint set Qi are assigned the same channel set without conflicts, for all i ∈ [0, W − 1]. Individual wavelength assignment is a one-to-one map between each ordered channel set and the set of nodes in each of the clusters it is assigned to. CINs with a smaller relative W are more efficient because they allow a higher degree of wavelength concurrency. An example channel assignment for a BC_3 M-WDM network is given in Fig. 3.
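One construction consistent with the single-error-correcting code analogy, for n = 2^i − 1, is to label each cluster with its Hamming syndrome. This is a sketch under that assumption and is not taken from the paper's (unpublished) assignment maps. Clusters with equal syndromes differ by a Hamming codeword, whose weight is at least 3, so they are at distance 3 or more. The function names are hypothetical.

```python
def syndrome(cluster, n):
    """XOR of the 1-based positions of the set bits of an n-bit cluster label."""
    s = 0
    for bit in range(n):
        if cluster >> bit & 1:
            s ^= bit + 1
    return s


def hamming_channel_sets(n):
    """Channel-set index for every cluster of a binary n-cube, n = 2**i - 1."""
    if n < 1 or (n + 1) & n:
        raise ValueError("n must be of the form 2**i - 1")
    return {v: syndrome(v, n) for v in range(2 ** n)}


if __name__ == "__main__":
    sets = hamming_channel_sets(3)
    for v in sorted(sets):
        print(format(v, "03b"), sets[v])
```

For n = 3 this yields the four sets {000, 111}, {001, 110}, {010, 101} and {100, 011}, and in general it uses n + 1 channel sets.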
[Figure 3 graphic: binary 3-cube with each vertex labeled by its channel set: Q0 = {000, 111} uses Λ0, Q1 = {001, 110} uses Λ1, Q2 = {010, 101} uses Λ2, Q3 = {100, 011} uses Λ3.]
Figure 3: Channel set assignment for an 8-cluster binary 3-cube M-WDM network (self links not shown). This assignment is minimal since (n + 1) | 2^n.

4 Scalability Analysis
A scalable multiprocessor network needs to be capable of expanding at linear complexity while preserving its performance and connectivity characteristics. This section analyzes the scalability of the proposed class of M-WDM networks and compares it to both all-space and all-wavelength realizations. Section 4.1 studies the partitioning of the network into smaller independent sub-networks with the same topology as the CIN, which implies that connectivity is preserved among various system sizes. Section 4.2 evaluates the complexity in terms of the fiber link density. It is shown that the inter-cluster link density is either a constant or a decreasing function of the network size for most of the CIN topologies. Effective throughput expressions for both a space channel and a wavelength channel in an M-WDM network are given in Section 4.3. The results are compared to both physical point-to-point (all-space) and single-domain virtual point-to-point (all-wavelength) realizations in Section 4.4. It is shown that M-WDM structures with cube and shuffle CIN topologies are the most efficient in terms of flexible partitioning and spatial wavelength concurrency.
4.1 Partitioning
An M-WDM structure with M1 clusters, M0 nodes per cluster, and CIN topology T is logically equivalent to M0 levels, each of size M1 nodes interconnected according to T, as shown in Figure 4. Each network level includes exactly one node from each cluster. For example, level i consists of the set of nodes {(0, i), ..., (M1 − 1, i)}, 0 ≤ i ≤ M0 − 1. A network partition is a set of l-level combinations, where l levels are combined by establishing wavelength connections between l nodes from each cluster in conformance with the definition of T. Levels can be combined in any order since inter-level wavelength connections are established through cluster self links. The rules for obtaining valid level combinations, and hence valid partitions, are mostly similar to those used in network emulation by a smaller identical network (the principle is known as quotient networks [15] or graph coverings [19]). It is presented here within the context of wavelength-domain VPP network emulation. Every cluster in the M-WDM network is assigned a number of nodes from the VPP (emulated) network such that the virtual network connectivity is preserved. The emulation is called uniform if each cluster is assigned an equal number of nodes of the VPP network [15]. Cluster self-links are essential in most cases to support communication among nodes covered by the same cluster. This corresponds to the local communication that, in [15], would take place within a node among the processors it emulates.

Figure 4: Logical equivalent of the M-WDM network structure: level topology is that of the CIN and levels are combined through cluster self links.

The emulation rules are applicable, in a reverse sense, to combining levels to form larger networks with the same topology as each level. The central idea is that nodes at the same relative position at all levels are fully connected in the wavelength domain since they belong to the same cluster. Therefore, any valid number of levels can be combined, and in any order (not necessarily in the order of node labeling within each cluster). This is a particularly useful feature when sub-networks (blocks of nodes) are deallocated, because no fragmentation will result. The number of nodes in the VPP sub-network for a given uniform emulation is denoted as N0 and the number of sub-networks in a partition is denoted as N1, with N1 N0 = M. A maximal emulation refers to an emulation that involves all network nodes (N1 = 1 and N0 = M). Table 2 provides a summary of the emulation rules for the examined CIN topologies. Valid maximal emulations for the considered CINs are given in Tables 5, 6, 7, 8, and 9, which list the parameters of both the physical M-WDM network (n, k) and the virtual network (nv, kv). Figure 5 shows an example of a two-level quad M-WDM tree emulating a 5-level virtual binary tree.
Figure 5: Five-level virtual binary tree embedded in a two-level quad-tree interconnected M-WDM structure; i = 2 and j = 2.
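The level view of Section 4.1 can be made concrete with a short sketch. It only enumerates node sets: level i is {(0, i), ..., (M1 − 1, i)}, and a partition groups l levels per sub-network, so that N0 = l M1 and N1 = M0 / l. Grouping consecutive levels is an arbitrary choice made here for simplicity, and the topology-specific validity rules of Table 2 are not checked. Function names are hypothetical.

```python
def levels(m1, m0):
    """Level i holds node i of every cluster: [(0, i), ..., (m1 - 1, i)]."""
    return [[(cluster, lvl) for cluster in range(m1)] for lvl in range(m0)]


def partition(m1, m0, levels_per_subnet):
    """Split the network into sub-networks of `levels_per_subnet` levels each."""
    if levels_per_subnet <= 0 or m0 % levels_per_subnet:
        raise ValueError("levels_per_subnet must divide M0")
    lv = levels(m1, m0)
    return [
        [node for level in lv[start:start + levels_per_subnet] for node in level]
        for start in range(0, m0, levels_per_subnet)
    ]


if __name__ == "__main__":
    # 8 clusters of 4 nodes, combined two levels at a time: N1 = 2, N0 = 16.
    for subnet in partition(8, 4, 2):
        print(len(subnet), subnet[:3])
```

Setting the number of levels per sub-network equal to M0 gives the maximal emulation (N1 = 1, N0 = M), while a value of 1 leaves M0 independent copies of the CIN topology.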
4.2 Link Density
Wiring complexity is a primary concern when implementing large processor networks due to cost, signal degradation, and interconnection complexity factors. Optical communication primarily alleviates the signal degradation concerns, and if WDM is employed, the link density can be significantly reduced as well. An all-space approach, where metal interconnects are replaced by fiber, takes advantage only of the fiber's superior signal characteristics. An all-wavelength approach, using WDM over a single star coupler to emulate a connectivity via wavelength assignment, suffers a size limitation, but the wiring density is very small (2 fibers per transceiver per node). The space-wavelength approach of M-WDM networks combines the scalability of space realization and the low complexity/reconfigurability of wavelength realization. This section compares the link density of the M-WDM structures considered in the paper to their physical (all-space) point-to-point analogs. It is assumed throughout that in both cases the architecture is modular. That is, M1 modules, each of size M0 nodes, are interconnected. The M-WDM structures lend themselves to this modular approach, where a module is represented by a cluster. The main difference is that nodes in the same cluster connect to the input and output couplers, and global fiber links exist only between couplers. This gives M-WDM structures an advantage over a modular point-to-point network: the local links are regular and simple, and they are independent of the global CIN topology. Figure 6 compares the physical cluster configuration of a large point-to-point grid network realization to an M-WDM one.

[Figure 6 graphic: two panels. Modular point-to-point (2-D grid module): processors with sources and detectors, L1 inter-module links. Modular M-WDM (general cluster): processors with sources and detectors, filters/demultiplexers, input coupler, output coupler, 2F inter-cluster links.]
Figure 6: Comparison of cluster physical structure in modular space and M-WDM realizations
The cost of local interconnects is generally lower than that of global interconnects, so it is more desirable to minimize global link density. The link density is defined as the normalized number of unidirectional links used by every node. Let the number of links internal to a module (or a cluster) be denoted as L0, and the number of inter-cluster links per cluster be denoted as L1. The local and global link densities in the M-WDM structures are L0/M0 and L1/M0, respectively. Link densities of modular point-to-point networks depend on the network topology. Expressions for local and global link densities are derived and summarized in Table 3 for the set of CIN topologies under consideration. The cluster (module) size is denoted as M0 in both cases, with local parameters n0 and k0 when applicable. For the physical point-to-point realization, the following assumptions are made: the binary cube CIN is decomposed by factorizing the dimension; the cube-connected cycles is decomposed such that each module consists of a multiple of cycles; the decomposition of [20] is used with the shuffle; an arbitrary decomposition is assumed for grids, with identical module and inter-module grid dimensions; and for the tree CIN it is assumed that the M0 nodes of each module are interconnected as a non-rooted n0-ary tree of depth k0. Note that in M-WDM networks, M = M1 M0 generally does not have to be a valid network size, since partitioning takes place through dividing M0.
4.3 Channel Throughput
Channel throughput evaluation in this section assumes a uniform reference pattern among all nodes. This metric is used as a means of efficiency comparison and is not intended to accurately model the performance of an actual application. The throughput of a wavelength channel in a single-domain WDM network, or of a space channel in a physical point-to-point network, is given by the inverse of the average distance between any pair of nodes. In M-WDM structures, throughput is found for both a space channel (fiber link) and a wavelength channel. The effective channel throughput, denoted as γ, is defined as the channel throughput of the emulated (maximal) VPP network, without regard to the underlying physical structure. The total network throughput Γ is obtained as γ times the effective number of virtual channels (those that would be employed by the VPP network). The fiber link and wavelength channel throughputs are then expressed as γ_f = Γ/L1 and γ_w = Γ/C, respectively. Table 4 summarizes these results for the considered CIN topologies.
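Because the total throughput is the effective channel throughput times the number of virtual channels, the ratios of fiber-link and wavelength-channel throughput to the effective channel throughput depend only on channel and link counts. The sketch below illustrates this for a maximal shuffle emulation under two assumptions that are not stated in the text: the virtual shuffle uses n channels per node, and the fiber-link count is the total number of inter-cluster fibers, n M1. The function name is hypothetical.

```python
def shuffle_throughput_ratios(n, k, m0, channels):
    """Return (fiber-link ratio, wavelength-channel ratio) for an (n, k) shuffle CIN.

    Each ratio is the corresponding throughput divided by the effective
    channel throughput of the emulated virtual point-to-point network.
    """
    m1 = n ** k                     # clusters
    nodes = m1 * m0                 # M = M1 * M0, all nodes in one emulation
    virtual_channels = n * nodes    # assumption: n outgoing channels per node
    fiber_links = n * m1            # assumption: total inter-cluster fibers
    return virtual_channels / fiber_links, virtual_channels / channels


if __name__ == "__main__":
    print(shuffle_throughput_ratios(2, 8, m0=18, channels=36))
    print(shuffle_throughput_ratios(3, 5, m0=18, channels=54))
```

With M0 = 18, these assumptions give ratios of 18 and 256 for the (2,8) shuffle with 36 channels, and 18 and 243 for the (3,5) shuffle with 54 channels, which matches the shuffle figures quoted in Section 4.4.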
4.4 Comparison
This section compares the M-WDM structures to both physical point-to-point (all-space) and single-domain virtual point-to-point (all-wavelength) realizations, in terms of channel throughput and link density, for the set of considered CIN topologies. Numerical results are obtained for a range of valid maximal emulations following the rules of Table 2, keeping the number of wavelength channels used on the order of tens. The following are common characteristics:
(1) Fiber link throughput γ_f is a decreasing function of the network size, since expansion beyond a certain size takes place by spatial wavelength re-use employing more fibers. The improvement of γ_f over the effective channel throughput γ, thanks to WDM, is more significant when more wavelength channels are used (fewer channel sets are required). It is about an order of magnitude in most cases.
(2) Wavelength channel throughput γ_w is an increasing function of the network size, since very large sizes employ more space-division (an unlimited resource) than wavelength-division (a constrained resource). The improvement of γ_w over γ is several orders of magnitude in most cases, thanks to the spatial wavelength re-use. CIN topologies that allow higher wavelength concurrency are more efficient for this space-wavelength realization.
(3) Local link density is approximately of the same order in all-space and M-WDM realizations. The advantage of M-WDM is the regularity of local cluster connections and their independence of the physical CIN topology (a standard module can be used with any topology).
(4) Global link density improves very significantly in the M-WDM realization, where it is either a decreasing function of the network size or a constant. This characteristic implies excellent scalability with regard to global wiring complexity.

Individual CINs are discussed below. The M-WDM binary cube realization exhibits excellent scalability characteristics in terms of wavelength concurrency and reduced link density (Table 5).
The binary cube can be expanded only by increasing its dimension, which makes M-WDM approach attractive for large-scale implementation. On the other hand, larger dimensions imposes stricter limitation on M0 , using a fixed number of 64 channels. This implies relatively higher efficiency of smaller sizes. The improvement over physical point-to-point realization can be shown by examining two sizes of 1 and 4 K-nodes, realized via 7 and 10-dim. M-WDM networks, respectively. For the 1 K-node network, global link complexity is 8 times higher in a modular all-space realization, and the throughput ratios are f = = 10, and w = = 160. For a network of 4 K-nodes, 1 is only 4 times higher in an all-space network, f = = 4:36, and w = = 3352:5. The cube-connected cycles takes the least advantage of the MWDM structure in partitioning and throughput improvement, Table 6. This is due to its topological structure that does not result in flexible emulations and the requirement of a large number of channels for relatively small sizes. The virtual network size is constrained by the relation nv = 2n. Physical networks with more than 2 K-nodes,
n > 4, would require a prohibitively large number of channels.
References
The shuffle M-WDM structure exhibits the best scalability characteristics among all considered CIN topologies, as shown in Table 7. It is easy to partition, and the number of required channel sets is minimal, W = n. So the improvement in wavelength channel utilization is significant. Global link density decreases as the network size increases. Using up to 64 channels, it is noted that the binary shuffle M-WDM is more scalable than those with larger n. However, throughput and link density improvements are approximately preserved among comparable sizes with different parameters. For example comparing 2;8 to 3;5 , both with sizes in the order of 4 K-nodes: f = = 18 in both, w = = 256 in the first and 243 in the second, the reduction ratio in global link density when compared to a point-to-point network is 0.055 in both, the first has an average distance of 12 hops employing 36 channels, while the second has an average distance of 8 employing 54 channels. Virtual point-to-point shuffleNet has been subject of several studies lately due to its regularity and logarithmic diameter [10]. The proposed M-WDM structure provides a very efficient means of realizing large partitionable shuffleNets at a fraction of complexity.
Torus emulations with larger and smaller virtual dimension are considered in Table 8. Networks of up to 36 K-nodes can be realized using no more than 72 channels. The variation in channel throughput and link density improvement among different realizations/emulations is very small and depends on the difference between n and nv, the ratio of kv to k, and the number of channels. Global link density is constant for each maximal emulation (the comparison is made only for identical physical and virtual dimensions).
Examples of emulating trees of smaller or equal degree are shown in Table 9. As can be expected, emulating an equal virtual degree results in a larger wavelength throughput improvement due to the increased spatial wavelength reuse. For example, compare the maximal emulation of the 2 K-node TR(2,11) virtual binary tree by TR(2,10) to that by TR(4,5). In the first, the fiber link (f) throughput improvement is 1.6 and the wavelength channel (w) throughput improvement is 640, while in the second they are 2.7 and 227, respectively. The global link density advantage is noticeable especially for larger degrees. The reduction ratio is 0.5 when emulating binary trees by binary trees and 0.25 when emulating quad-trees by quad-trees.
5 Conclusions
This paper proposed and studied a new cluster-based space-wavelength realization of processor networks with regular topology. The goals were to take advantage of the reconfigurability of WDM and to enable scalability, in the space domain, beyond the number of available wavelengths. The structure is intended to be partitioned into multiple virtual point-to-point sub-networks. Results on conflict-free wavelength assignment were provided, and the scalability characteristics were evaluated in terms of partitioning, link density, and space/wavelength channel throughput. The advantages over physical modular point-to-point networks were demonstrated. It was also shown that this approach is most efficient for realizing large binary cube and shuffle networks, both very important in parallel processing and distributed switching, due to the relatively high wavelength concurrency exhibited by these two topologies when applied to cluster interconnection.
References
[1] A. Guha, J. Bristow, C. Sullivan, and A. Husain, “Optical Interconnects for Massively Parallel Architectures,” Applied Optics, vol. 29, pp. 1077–1093, Mar. 1990.
[2] T. Lane et al., “Gigabit Optical Interconnects for The Connection Machine,” in Proc. SPIE (Optical Interconnects in the Computer Environment), pp. 24–35, 1989.
[3] P. W. Dowd, “Random access protocols for high speed interprocessor communication based on a passive star topology,” IEEE Journal on Lightwave Technology, vol. 9, pp. 799–808, June 1991.
[4] P. W. Dowd, “Wavelength division multiple access channel hypercube processor interconnection,” IEEE Transactions on Computers, vol. 41, pp. 1223–1241, Oct. 1992.
[5] P. W. Dowd, K. Bogineni, K. A. Aly, and J. Perreault, “Hierarchical scalable photonic architectures for high-performance processor interconnection,” IEEE Transactions on Computers, vol. 42, pp. 1105–1120, Sept. 1993.
[6] K. A. Aly and P. W. Dowd, “Time-space-wavelength networks for low-complexity processor interconnection,” in IPPS’94 Workshop on Massively Parallel Processing Through Optical Interconnects, (Cancun, Mexico), 1994.
[7] A. Ganz, B. Li, and L. Zenou, “Reconfigurability of multi-star based lightwave LAN’s,” in Proc. GLOBECOM’92, Dec. 1992.
[8] K. A. Aly and P. W. Dowd, “Parallel computer reconfigurability through optical interconnects,” in Proc. 21st International Conference on Parallel Processing, pp. I105–I108, August 1992.
[9] K. A. Aly and P. W. Dowd, “WDM cluster ring: A low-complexity partitionable reconfigurable processor interconnection structure,” in Proc. 22nd International Conference on Parallel Processing, pp. I150–I153, August 1993.
[10] M. G. Hluchyj and M. J. Karol, “ShuffleNet: An application of generalized perfect shuffles to multihop lightwave networks,” IEEE Journal on Lightwave Technology, vol. 9, pp. 1386–1397, Oct. 1991.
[11] B. Li and A. Ganz, “Virtual topologies for WDM star LANs: The regular structures approach,” in Proc. IEEE INFOCOM’92, 1992.
[12] P. Lalwaney, L. Zenad, A. Ganz, and I. Koren, “Optical interconnects for multiprocessors: Cost-performance tradeoffs,” in Proc. IEEE Supercomputing’92, pp. 278–285, Nov. 1992.
[13] M. Karol, “Exploiting the attenuation of fiber-optic passive taps to create large high-capacity LAN’s and MAN’s,” IEEE Journal on Lightwave Technology, vol. 9, pp. 400–408, Mar. 1991.
[14] C. A. Brackett, “Dense wavelength division multiplexing networks: Principles and applications,” IEEE Journal on Selected Areas of Communications, vol. 8, pp. 948–964, Aug. 1990.
[15] J. P. Fishburn and R. A. Finkel, “Quotient networks,” IEEE Transactions on Computers, vol. C-31, pp. 288–295, Apr. 1982.
[16] P. Kirkby, “Multichannel wavelength-switched transmitters and receivers - new component concepts for broadband networks and distributed switching systems,” IEEE Journal on Lightwave Technology, vol. 8, pp. 202–211, Feb. 1990.
[17] F. T. Leighton, Introduction to Parallel Algorithms and Architectures. San Mateo, California: Morgan Kaufmann Publishers, 1992.
[18] H. J. Siegel, Interconnection Networks for Large-Scale Parallel Processing: Theory and Case Studies. New York, NY: McGraw Hill, 2 ed., 1990.
[19] H. Bodlaender, “The Classification of Coverings of Processor Networks,” Journal of Parallel and Distributed Computing, vol. 6, pp. 166–182, 1989.
[20] K. E. Batcher, “Decomposition of perfect shuffle networks,” in Proc. International Conference on Parallel Processing, pp. I.255–I.261, August 1991.
Table 2: Valid M-WDM network partitions
Physical M-WDM (n,k)   Virtual network (nv,kv)   Level size, M1      Number of levels, l = N0/M1   Virtual network size, N0      Limit on virtual network size due to l ≤ M0
BC_n                   BC_{n+i}                  2^n                 2^i                           2^{n+i}                       i ≤ log_2 M0
CCC_n                  CCC_{2^i n}               n 2^n               2^{n(2^i - 1) + i}            2^i n 2^{2^i n}               n 2^i + i - n ≤ log_2 M0
SH_{n,k}               SH_{n,k+i}                n^k                 n^i                           n^{i+k}                       i ≤ log_n M0
SH_{n,k} (k-stage shuffleNet)   SH_{n,k+i}       n^k                 (i + k) n^i                   (i + k) n^{i+k}               (i + k) n^i ≤ M0
GR_{n,k}               GR_{n,ik}                 k^n                 i^n                           (ik)^n                        i ≤ M0^{1/n}
GR_{n,k}               GR_{n+i,k}                k^n                 k^i                           k^{n+i}                       i ≤ log_k M0
TR_{n,k}               TR [subscripts garbled in source]   (n^k - 1)/(n - 1)   (nv^j - nv)/(nv - 1)   (nv^{kv} - 1)/(nv - 1)   j ≤ 1 + log_{nv}(((nv - 1)/nv) M0) [remaining condition garbled in source]
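For the binary cube row of Table 2, the partition limit can be evaluated directly. This is a minimal sketch, assuming the reading that a virtual BC_{n+i} uses 2^i levels and that the number of levels may not exceed the cluster size M0 (i ≤ log_2 M0). The function name is hypothetical, and whether i = 0 counts as a partition is an assumption of this sketch.

```python
def cube_partitions(n, m0):
    """List (virtual dimension, number of levels) pairs for BC_n with m0 nodes per cluster.

    Assumes the Table 2 reading: virtual BC_(n+i) needs 2**i levels, with 2**i <= m0.
    """
    max_i = m0.bit_length() - 1          # floor(log2(m0)) for m0 >= 1
    return [(n + i, 2 ** i) for i in range(max_i + 1)]


parts = cube_partitions(2, 16)
print(parts)        # dimensions 2..6 with 1, 2, 4, 8, 16 levels
print(parts[-1])    # (6, 16)
```

The largest entry matches the maximal virtual network listed in Table 5 for n = 2 and M0 = 16 (nv = 6); likewise cube_partitions(15, 4) ends at dimension 17.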
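Table 3: Link density
The point-to-point columns refer to a physical point-to-point (all-space) configuration of size corresponding to the virtual network size. Subscripts 0 and 1 are as in the source.
CIN        Pt.-to-pt., 0                                  Pt.-to-pt., 1                                       M-WDM, 0   M-WDM, 1
BC_n       [garbled in source]                            [garbled in source]                                 2          2n/M0
CCC_n      [garbled in source]                            [garbled in source]                                 2          6/M0
SH_{n,k}   0                                              2n                                                  2          2n/M0
GR_{n,k}   2 nv (M0^{1/nv} - 1)/M0^{1/nv}                 4n/M0^{1/n}                                         2          2n/M0
TR_{n,k}   2 n0 (n0^{k0 - 1} - 1)/(n0^{k0} - 1)           2 (n0 - 1)(n0^{k0} + 1)/(n0^{k0} - 1)               2          2(n + 1)/M0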
Table 4: Channel throughput
Mv denotes the virtual network size and M the M-WDM network size. The fiber link throughput f is given as a multiple of the effective channel throughput of the virtual network.
CIN        Effective channel throughput (of the virtual network)                      M-WDM network: fiber link throughput, f
BC_n       (2^{nv} - 1)/(nv 2^{nv - 1})                                               nv M0/(n + 1)
CCC_n      [expression garbled in source]                                             3 M0/4
SH_{n,k}   2(n - 1)(Mv - 1)/[Mv (n - 1)(3kv - 1) - 2(Mv - kv)]                        M0
GR_{n,k}   4/(nv kv) if kv is even; 4kv/(nv (kv^2 - 1)) if kv is odd                  2 nv M0/(2n + 1)
TR_{n,k}   [expression garbled in source]                                             2 nv (M - 1)/((2n + 1)(M1 - 1))

Table 5: Binary cube CIN, BC_n
Columns: n; M1; M0; C = # of ch.s; M = # of nodes; nv = max. virtual network; channel throughput for pt.-to-pt. and for M-WDM (f, w); link density for pt.-to-pt. (0, 1) and for M-WDM (0, 1).
n    M1    M0   C    M      nv   pt.-to-pt.   f        w          pt.-to-pt. 0   pt.-to-pt. 1   M-WDM 0   M-WDM 1
2    4     16   64   64     6    0.328        10.496   1.968      4              8              2         0.250
3    8     16   64   128    7    0.283        7.924    3.962      4              8              2         0.375
4    16    8    64   128    7    0.283        3.170    3.962      3              6              2         1.000
5    32    8    64   256    8    0.249        2.656    7.968      3              6              2         1.250
6    64    8    64   512    9    0.222        2.283    15.984     3              6              2         1.500
7    128   8    64   1K     10   0.200        2.000    32.000     3              6              2         1.750
8    256   4    64   1K     10   0.200        0.888    32.000     2              4              2         4.000
9    512   4    64   2K     11   0.182        0.800    64.064     2              4              2         4.500
10   1K    4    64   4K     12   0.167        0.729    559.872    2              4              2         5.000
15   32K   4    64   128K   17   0.133        0.565    4630.528   2              4              2         7.500
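The point-to-point channel throughput column of Table 5 can be cross-checked numerically. This is a minimal sketch, assuming that the tabulated value is the reciprocal of the mean hop distance between distinct nodes of a binary nv-cube, which is nv·2^(nv-1)/(2^nv - 1); this reading is an inference from the table values rather than a formula quoted from the paper, and the function name is hypothetical.

```python
def cube_channel_throughput(nv):
    """Reciprocal of the mean hop distance between distinct nodes of a binary nv-cube."""
    mean_distance = nv * 2 ** (nv - 1) / (2 ** nv - 1)
    return 1.0 / mean_distance


# Compare against the pt.-to-pt. channel throughput entries for nv = 6 .. 12.
for nv in (6, 7, 8, 9, 10, 11, 12):
    print(nv, round(cube_channel_throughput(nv), 3))
```

Under this assumption the rounded values agree with the Table 5 entries for nv = 6 through 12 (for example 0.328 at nv = 6 and 0.167 at nv = 12).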
Table 6: Cube-connected cycles CIN, CCC_n
Columns: n; M1; M0; C = # of ch.s; M = # of nodes; max. virtual network nv = 2n; channel throughput for pt.-to-pt. and for M-WDM (f, w); link density for pt.-to-pt. (0, 1) and for M-WDM (0, 1).
n   M1   M0   C    M     nv   pt.-to-pt.   f       w       pt.-to-pt. 0   pt.-to-pt. 1   M-WDM 0   M-WDM 1
2   8    8    80   64    4    0.216        1.296   0.518   2              2              2         0.375
3   24   16   64   384   6    0.130        1.560   2.340   2              2              2         0.375

Table 7: Shuffle CIN, SH(n,k)
Columns: physical (n,k); M1; M0; C = # of ch.s; M = # of nodes; max. virtual network (nv,kv); channel throughput for pt.-to-pt. and for M-WDM (f, w); link density for pt.-to-pt. (0, 1) and for M-WDM (0, 1).
(n,k)    M1       M0   C    M           (nv,kv)   pt.-to-pt.   f       w          pt.-to-pt. 0   pt.-to-pt. 1   M-WDM 0   M-WDM 1
(2,3)    8        20   40   160         (2,5)     0.165        3.300   1.320      0              4              2         0.200
(2,4)    16       24   48   384         (2,6)     0.132        3.168   2.112      0              4              2         0.167
(2,5)    32       28   56   896         (2,7)     0.111        3.108   3.552      0              4              2         0.143
(2,6)    64       32   64   2,048       (2,8)     0.095        3.040   6.080      0              4              2         0.125
(2,7)    128      16   32   2,048       (2,8)     0.095        1.520   12.160     0              4              2         0.250
(2,8)    256      18   36   4,608       (2,9)     0.083        1.494   21.248     0              4              2         0.222
(2,9)    512      20   40   10,240      (2,10)    0.074        1.480   37.888     0              4              2         0.200
(2,10)   1,024    22   44   22,528      (2,11)    0.067        1.474   68.608     0              4              2         0.182
(2,11)   2,048    24   48   49,152      (2,12)    0.061        1.464   124.928    0              4              2         0.167
(2,12)   4,096    26   52   106,496     (2,13)    0.056        1.456   229.376    0              4              2         0.154
(2,13)   8,192    28   56   229,376     (2,14)    0.051        1.428   417.792    0              4              2         0.143
(2,14)   16,384   30   60   491,520     (2,15)    0.048        1.440   786.432    0              4              2         0.133
(2,15)   32,768   32   64   1,048,576   (2,16)    0.044        1.408   1441.792   0              4              2         0.125
(3,2)    9        9    27   81          (3,3)     0.280        2.529   2.529      0              6              2         0.667
(3,3)    27       12   36   324         (3,4)     0.199        2.388   5.373      0              6              2         0.500
(3,4)    81       15   45   1,215       (3,5)     0.154        2.310   12.474     0              6              2         0.400
(3,5)    243      18   54   4,374       (3,6)     0.125        2.250   30.375     0              6              2         0.333
(3,6)    729      21   63   15,309      (3,7)     0.105        2.205   76.545     0              6              2         0.386
(4,2)    16       12   48   192         (4,3)     0.271        3.252   4.336      0              8              2         0.667
(4,3)    64       16   64   1,024       (4,4)     0.193        3.088   12.352     0              8              2         0.500

Table 8: Grid CIN, GR(n,k)
Columns as in Table 7. The point-to-point link density entries for the (1,k) rows do not appear in the source text and are shown as "-".
(n,k)    M1      M0   C    M        (nv,kv)   pt.-to-pt.   f       w        pt.-to-pt. 0   pt.-to-pt. 1   M-WDM 0   M-WDM 1
(1,12)   12      12   36   144      (2,12)    0.168        2.688   2.688    -              -              2         0.167
(1,16)   16      16   48   256      (2,16)    0.124        2.645   2.645    -              -              2         0.125
(1,20)   20      20   60   400      (2,20)    0.100        2.667   2.667    -              -              2         0.100
(1,24)   24      24   72   576      (2,24)    0.083        2.656   2.656    -              -              2         0.083
(2,4)    16      4    32   64       (2,8)     0.250        0.800   2.000    2.000          4.000          2         1.000
(2,8)    64      4    32   256      (2,16)    0.124        0.397   3.968    2.000          4.000          2         1.000
(2,16)   256     4    32   1,024    (2,32)    0.064        0.204   8.192    2.000          4.000          2         1.000
(2,24)   576     4    32   2,304    (2,48)    0.040        0.128   11.520   2.000          4.000          2         1.000
(2,32)   1,024   4    32   4,096    (2,64)    0.032        0.102   16.384   2.000          4.000          2         1.000
(2,48)   2,304   4    32   9,216    (2,96)    0.020        0.064   23.040   2.000          4.000          2         1.000
(2,4)    16      9    72   144      (2,12)    0.168        1.210   1.344    2.667          2.667          2         0.444
(2,8)    64      9    72   576      (2,24)    0.084        0.605   2.688    2.667          2.667          2         0.444
(2,16)   256     9    72   2,304    (2,48)    0.040        0.288   5.120    2.667          2.667          2         0.444
(2,24)   576     9    72   5,184    (2,72)    0.028        0.202   8.064    2.667          2.667          2         0.444
(2,32)   1,024   9    72   9,216    (2,96)    0.020        0.144   10.240   2.667          2.667          2         0.444
(2,48)   2,304   9    72   20,736   (2,144)   0.012        0.086   13.824   2.667          2.667          2         0.444

Table 9: Tree CIN, TR(n,k)
Columns as in Table 7.
(n,k)    M1      M0   C    M       (nv,kv)   pt.-to-pt.   f       w         pt.-to-pt. 0   pt.-to-pt. 1   M-WDM 0   M-WDM 1
(2,8)    255     2    8    511     (2,9)     0.083        0.133   21.165    0.000          6.000          2         3.000
(2,9)    511     2    8    1,023   (2,10)    0.071        0.114   36.281    0.000          6.000          2         3.000
(2,10)   1,023   2    8    2,047   (2,11)    0.062        0.099   63.426    0.000          6.000          2         3.000
(2,11)   2,047   2    8    4,095   (2,12)    0.055        0.088   112.585   0.000          6.000          2         3.000
(4,3)    21      6    36   127     (2,7)     0.120        0.336   1.680     1.333          3.333          2         1.667
(4,4)    85      6    36   511     (2,9)     0.083        0.224   4.703     1.333          3.333          2         1.667
(4,5)    341     6    36   2,047   (2,11)    0.062        0.166   14.095    1.333          3.333          2         1.667
(4,6)    1,365   6    36   8,191   (2,13)    0.050        0.133   45.500    1.333          3.333          2         1.667
(4,3)    21      4    24   85      (4,4)     0.208        0.777   5.824     0.000          10.000         2         2.500
(4,4)    85      4    24   341     (4,5)     0.149        0.536   16.887    0.000          10.000         2         2.500
(4,5)    341     4    24   1,365   (4,6)     0.115        0.410   62.287    0.000          10.000         2         2.500
(4,6)    1,365   4    24   5,461   (4,7)     0.094        0.334   171.080   0.000          10.000         2         2.500
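The tree entries of Table 9 can be checked with a few lines of arithmetic. This is a minimal sketch, assuming (from the table rows, not from a stated formula) that a tree M-WDM network with M1 clusters of M0 nodes has M1·M0 + 1 nodes, where M1 is the size of a complete n-ary tree with k levels. The function names are hypothetical.

```python
def tree_size(n, k):
    """Number of nodes in a complete n-ary tree with k levels (n >= 2)."""
    return (n ** k - 1) // (n - 1)


def tree_mwdm_nodes(n, k, m0):
    """Node count of a tree M-WDM network as listed in Table 9 (assumption: M1*M0 + 1)."""
    return tree_size(n, k) * m0 + 1


# Two realizations of the 2,047-node TR(2,11) virtual binary tree.
by_binary = tree_mwdm_nodes(2, 10, 2)   # TR(2,10) clusters with M0 = 2
by_quad = tree_mwdm_nodes(4, 5, 6)      # TR(4,5) clusters with M0 = 6
print(by_binary, by_quad, tree_size(2, 11))

# Global link density ratio, M-WDM over point-to-point, from the Table 9 values.
print(3.000 / 6.000, 2.500 / 10.000)
```

Both realizations give 2,047 nodes, and the two printed ratios are the 0.5 (binary by binary) and 0.25 (quad by quad) reductions quoted in the text.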