Performance-driven Design and Redesign of High-speed Local Area Networks

C.P. Ravikumar
Electrical Engineering, Indian Institute of Technology, Hauz Khas, New Delhi 110016
[email protected]

Dilip R. Pandit Anubhav Mishra * Tata Elxsi (India) Ltd. Hughes Software Systems Ltd. Plot 31, Sector 18 26, 28/2 Hoody Gurgaon, Whitefield Road Haryana 122015 Mahadevapura Post Bangalore 560048 [email protected] [email protected]

Abstract

Although distributed computing over a network of computers has become a reality, its success depends mainly on the performance of the underlying network. In this paper, we consider the problem of designing a local area network with specified cost and performance constraints. The cost and performance of a local area network (LAN) are directly related to its topology. Using a priori knowledge of the approximate number of users of the network and the kind of communication traffic that must be supported, the designer can optimize the design of a LAN for superior performance. Design decisions include the number of LAN segments, the number of bridges, the assignment of users to segments, and the method of interconnecting the segments through bridges. In the case of ATM networks, the decisions concern the number of ATM switches, the assignment of hosts to switches, and the way the switches are connected through cross-connects. While assigning too many users to the same segment may cause large delays due to the sharing of network bandwidth, splitting the LAN into too many segments will increase the cost of the LAN. We report a greedy heuristic algorithm for local area network design. We propose a method to construct good initial solutions to the topology design problem using a heuristic based on the three-opt technique for solving the travelling salesperson problem. Our experimental results indicate that the heuristic algorithm finds good solutions.

Unlike a multiprocessor, a distributed computing environment grows or shrinks dynamically. The number of users may grow far beyond the number for which the network was originally designed, resulting in a performance degradation. In such a situation, a complete redesign of the network may not be feasible without incurring a prolonged disruption in service and a significant rewiring cost. What is desirable is an incremental redesign, which preserves most features of the existing network and requires as little rewiring as possible. We treat network redesign as an optimisation problem and present a heuristic algorithm for it. The algorithm works in four phases which correspond to four network redesign options that are progressively more expensive. Thus, while the first phase tries to achieve better network performance by user reassignment, the fourth phase calls for the creation of an additional LAN segment using an additional bridge. Our heuristic algorithm is greedy in that it chooses a more expensive redesign technique only when the more economical ones fail to provide the desired performance. We present experimental results of an implementation of our redesign heuristic.

* Dilip and Anubhav were M.Tech students of Computer Technology in the Department of Electrical Engineering when this work was carried out. An equipment grant from Sun Microsystems is gratefully acknowledged.

1 Introduction

Distributed computing on a network of computers has become increasingly popular due to its affordability. In an experiment performed by NASA in 1997, PETAFLOPS performance ($10^{15}$ floating point operations per second) was obtained on a network of a large number of personal computers. The fact that individual computers have become faster and that high-performance network components such as ATM switches have become available at affordable prices has increased the interest in distributed computing. Yet, the success of parallelizing an application over a network of computers depends crucially on the delay performance of the underlying network. This is even more so in the modern scenario where the speed of uniprocessors has reached unprecedented levels and personal computers that work at GHz speeds are expected to emerge in the next few years. In this paper, we focus on the design of local area networks that can support high-performance distributed computing.

The network topology influences its cost, performance, and reliability. The problem of designing the network topology to minimize the cost of the network while satisfying user-specified reliability and performance constraints is an important optimization problem in the design of computer networks. Unfortunately, this problem is intractable [2]. The design of the network involves the prediction of its performance, such as the overall network delay, which in turn depends on the traffic flow within the network. However, determining the traffic flow in the network requires knowledge of the routing algorithm employed, which itself depends on the network topology. This cyclic dependency makes the network design problem a computationally difficult one. Subnet topology design has been considered by several authors in the past [2, 4, 5, 6, 7, 11]. More recently, Elbaum and Sidi [5] have considered the problem of designing the topology of local area networks (LANs).

LANs today come in many flavors, such as Ethernet, token ring, fibre-optic LANs, ATM LANs, and so on. Refer to Figure 1, which shows an ATM LAN as well as an Ethernet-based LAN. To keep the discussion general, we shall consider the following model of a local area network. A LAN consists of several segments which are interconnected using bridges. When the number of users, n, supported by the LAN is large, or when the geographical distance between the users is large, there can be a degradation in the performance of the network. For example, Ethernet limits the length of one LAN segment to 185 meters [14]. Given n, the geographical locations of the users, the expected traffic patterns, and user constraints on the performance and reliability of the network, we consider the two problems TDP and NRP described below.
The Topology Design Problem (TDP) is to determine the topology of the LAN which meets the above criteria and minimizes the network cost.

The Network Redesign Problem (NRP) is to alter the network topology to meet the user constraints while minimizing the redesign cost. For example, an existing LAN structure may fail to support an increased number of users and may require a reassignment of users to segments to improve its performance. Another situation where a network may have to be redesigned is when the users notice a degradation in the speed of network applications due to unforeseen traffic, such as the data traffic resulting from distributed computing.

The treatment of LAN topologies given above is sufficiently general to accommodate Asynchronous Transfer Mode networks (Figure 1(a)). In that case, a LAN segment is a set of hosts connected to a switching node, and the interconnection of these "LAN segments" is through the links that connect the ATM switching nodes. Although there are no "bridges" in the ATM LAN, ATM switching nodes over a wider area are interconnected through ATM cross-connect switches. Cross-connects do not participate in call admission functions; their main function is to perform fast switching of the inter-switch virtual channels/virtual paths [3].

Figure 1: Local Area Network Topologies. (a) Asynchronous Transfer Mode LAN (b) Ethernet LAN

The TDP considered in this paper is similar to the LAN topology design problem described by Elbaum and Sidi in [5], with the following enhancements. The authors of [5] do not consider network reliability constraints or multicast traffic. Our traffic model is more general and can accommodate both point-to-point communication and multicast communication. Finally, the cost of network construction has not been considered in [5].

This paper is organized as follows. Section 2 describes the problems TDP and NRP. Heuristic algorithms for the TDP and NRP are discussed in Section 3. We have implemented these heuristics on a Sun SPARCstation in the C programming language. Experimental results based on this implementation are presented in Section 4. Conclusions and scope for future work are provided in Section 5.

2 Problem Formulation

In this paper, we shall use the term segment to mean an ATM switch or a segment of an Ethernet LAN. Similarly, the term bridge is used to mean a cross-connect for ATM networks or a conventional bridge device for Ethernet LANs. When we talk of assigning users to segments, we mean the assignment of host nodes to ATM switches or the assignment of hosts to Ethernet LAN segments.

Let $T = [t_{ij}]$ be an $n \times n$ matrix which describes the expected traffic in a LAN which supports n users. Thus $t_{ij}$ indicates the traffic flow from user i to user j. Let $(x_i, y_i, z_i)$ denote the geographical location of user i. A distance metric can be defined given the geographical locations of two users i and j. For example, the Manhattan distance metric $d_{ij}$ is defined as $|x_i - x_j| + |y_i - y_j| + |z_i - z_j|$. Alternate distance metrics may also be used, such as the Euclidean measure, or a user-specific measure which considers the obstacles in cable routing. The matrix $D = [d_{ij}]$ defines the distances among every pair of users.

Let m be the number of LAN segments. Let $u_{ij}$ be a 0/1 variable which denotes the assignment of user i to segment j. Let k be the number of bridges. Let $b_{ij}$ be a 0/1 variable which is set to 1 if and only if segments i and j are directly connected by a bridge. A bridge which connects two LAN segments i and j carries the traffic from a user assigned to segment i to a user assigned to segment j. Although m - 1 bridges suffice to achieve connectivity of m LAN segments (e.g. a tree topology), higher performance and reliability can be achieved by using a larger number of bridges.

The performance of the LAN is given by the average delay experienced by a packet in reaching the destination node. In a tree topology, there is only one possible route that a packet can take from a node i to a node j. Thus it is possible to predict the traffic through every bridge in the network with the knowledge of the network topology and the traffic matrix T.
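As a small illustration of the notation, the distance matrix D can be built from the user coordinates as follows. This is only a sketch: the function names and the sample coordinates are hypothetical and do not come from the implementation described in this paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x_i, y_i, z_i) of one user


def manhattan(a: Point, b: Point) -> float:
    """d_ij = |x_i - x_j| + |y_i - y_j| + |z_i - z_j|."""
    return sum(abs(p - q) for p, q in zip(a, b))


def distance_matrix(locations: Sequence[Point]) -> List[List[float]]:
    """Build D = [d_ij] for every pair of users."""
    return [[manhattan(a, b) for b in locations] for a in locations]


# Hypothetical coordinates (in meters) for three users.
locations = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 0.0, 2.0)]
D = distance_matrix(locations)
```

Another metric (Euclidean, or one that accounts for cabling obstacles) could be substituted by replacing `manhattan` alone.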

procedure AverageDelay()
begin
    TotalDelay := 0.0; TotalTraffic := 0.0;
    for every LAN segment s do
        Compute the traffic flow F_s through segment s;
    for every pair of users i, j do
    begin
        if i and j belong to the same segment s then
            TotalDelay := TotalDelay + t_ij * 1/(C - F_s)
        else
        begin
            Find a path p from i to j;
            for every LAN segment s on the path p do
            begin
                TotalDelay := TotalDelay + t_ij * 1/(C - F_s)
            end
            for every bridge b on the path p do
                TotalDelay := TotalDelay + LookupDelay(b);
        end
        TotalTraffic := TotalTraffic + t_ij;
    end
    return TotalDelay / TotalTraffic
end

Figure 2: Computing the Average Network Delay

2.1 Delay Model

The average network delay can be computed as follows. Consider two nodes i and j. When i and j are on the same LAN segment s, the transmission delay from i to j depends on queuing and collisions, and its prediction depends on the type of LAN. For Ethernet LANs, a simple M/M/1 queuing model can be used to predict the delay from i to j [2]. If $F_s$ is the flow in segment s and C is the capacity of the LAN segment, then the expected queuing delay is $1/(C - F_s)$. When i and j belong to two different LAN segments, packets originating at i may have to pass through one or more bridges before reaching the destination node j. In this case, the look-up delay in the bridges must also be taken into account. Given the capacity of a bridge and the traffic through the bridge, the look-up delay can be computed as the ratio of the traffic to the capacity. The procedure is sufficiently general to handle unicast as well as multicast traffic. The initial computation of traffic flow within the network can account for multicast traffic by presuming a routing algorithm, e.g. a minimum spanning tree algorithm [4].

For ATM networks, the delay from i to j consists of three components, namely, propagation delay, queueing delay, and switching delay [3]. The propagation delay is proportional to the distance between i and j, and typical values are 4 to 5 µs/km on CSMA/CD coaxial-based networks. The queuing delay depends on the traffic in the network and may be computed using the M/M/1 model. The switching delay for a given switch type is a constant, and a typical value is 100 µs.
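The delay computation for a tree topology can be sketched in Python as follows. This is a minimal illustration, not the C implementation used in our experiments. All names are hypothetical, a single capacity is assumed for all segments and another for all bridges, and the bridge look-up term is weighted by $t_{ij}$ so that the result is a traffic-weighted mean. That weighting is an assumption of this sketch; Figure 2 as printed adds LookupDelay(b) without a weight.

```python
from collections import deque
from typing import Dict, List


def segment_path(adj: Dict[int, List[int]], src: int, dst: int) -> List[int]:
    """Segments visited from src to dst (BFS; the path is unique in a tree).

    Assumes the segment graph is connected.
    """
    parent = {src: src}
    queue = deque([src])
    while queue:
        s = queue.popleft()
        for nxt in adj.get(s, []):
            if nxt not in parent:
                parent[nxt] = s
                queue.append(nxt)
    path = [dst]
    while path[-1] != src:
        path.append(parent[path[-1]])
    return path[::-1]


def average_delay(T, seg_of, bridges, seg_capacity, bridge_capacity) -> float:
    """Traffic-weighted average delay; assumes at least one nonzero t_ij."""
    adj: Dict[int, List[int]] = {}
    for a, b in bridges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)

    # Pass 1: route every nonzero demand and accumulate the flows.
    paths, seg_flow, bridge_flow = {}, {}, {}
    n = len(T)
    for i in range(n):
        for j in range(n):
            if i == j or T[i][j] == 0:
                continue
            p = segment_path(adj, seg_of[i], seg_of[j])
            paths[(i, j)] = p
            for s in p:
                seg_flow[s] = seg_flow.get(s, 0.0) + T[i][j]
            for hop in zip(p, p[1:]):
                b = frozenset(hop)
                bridge_flow[b] = bridge_flow.get(b, 0.0) + T[i][j]

    if any(f >= seg_capacity for f in seg_flow.values()):
        raise ValueError("segment saturated: M/M/1 needs flow below capacity")

    # Pass 2: M/M/1 delay per segment, traffic/capacity ratio per bridge.
    total_delay = 0.0
    total_traffic = 0.0
    for (i, j), p in paths.items():
        t = T[i][j]
        for s in p:
            total_delay += t / (seg_capacity - seg_flow[s])
        for hop in zip(p, p[1:]):
            total_delay += t * bridge_flow[frozenset(hop)] / bridge_capacity
        total_traffic += t
    return total_delay / total_traffic


# Hypothetical example: four users, two segments joined by one bridge.
T = [[0, 2, 1, 0], [0, 0, 0, 0], [0, 0, 0, 2], [0, 0, 0, 0]]
avg = average_delay(T, seg_of=[0, 0, 1, 1], bridges=[(0, 1)],
                    seg_capacity=10.0, bridge_capacity=5.0)
```

In this example each segment carries a flow of 3 and the bridge carries a flow of 1, so only the demand from user 0 to user 2 pays the look-up delay.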

2.2 Cost Model

The cost of a network has two components, a non-recurring engineering cost (NRE) and a recurring maintenance cost (RM). The NRE cost includes the initial cost incurred on the LAN equipment such as bridges and cabling. The RM cost includes the monthly cost of maintaining the equipment. In this work, we only consider the NRE cost component. While the cost of the bridges can be calculated in a straightforward manner, an estimate of the cabling cost is more difficult. Given the matrix $U = [u_{ij}]$, the minimum length of the cable required to wire every LAN segment is first estimated. Using the distance matrix D, we compute the length $L_s$ of the minimum Steiner tree [9] which connects all the users assigned to a LAN segment s. This length is multiplied by the cost per unit length of cable. The NRE cost is then given by

$$\mathrm{Cost}_{NRE} = k \cdot \mathrm{Cost}_{Bridge} + \sum_{s=1}^{m} \mathrm{Cost}_{cable} \times L_s \qquad (1)$$
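Equation 1 can be evaluated as in the sketch below. Computing a minimum Steiner tree is itself a hard problem [9], so this sketch substitutes a minimum spanning tree over the users of each segment (Prim's algorithm), whose length is an upper bound on the Steiner tree length. That substitution, and all function and parameter names, are assumptions made for illustration only.

```python
from typing import List, Sequence


def mst_length(nodes: Sequence[int], D: List[List[float]]) -> float:
    """Length of a minimum spanning tree over `nodes` (Prim's algorithm).

    Used here as a stand-in for the Steiner tree length L_s.
    """
    nodes = list(nodes)
    if len(nodes) < 2:
        return 0.0
    # best[v] = cheapest link from v to the tree built so far
    best = {v: D[nodes[0]][v] for v in nodes[1:]}
    total = 0.0
    while best:
        v = min(best, key=best.get)
        total += best.pop(v)
        for w in best:
            best[w] = min(best[w], D[v][w])
    return total


def nre_cost(segments, D, num_bridges, cost_bridge, cost_cable) -> float:
    """Cost_NRE = k * Cost_Bridge + sum over segments of Cost_cable * L_s."""
    cabling = sum(cost_cable * mst_length(s, D) for s in segments)
    return num_bridges * cost_bridge + cabling


# Hypothetical example: four users on a line at positions 0, 1, 3 and 6.
pos = [0.0, 1.0, 3.0, 6.0]
D = [[abs(a - b) for b in pos] for a in pos]
cost = nre_cost([[0, 1], [2, 3]], D, num_bridges=1,
                cost_bridge=100.0, cost_cable=2.0)
```

With two segments of cable length 1 and 3, the cabling term is 2 × 4 = 8, and one bridge adds 100.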

The cost of network redesign depends on the nature of changes being made to the network. If a set of new user nodes has been added to the network, the incremental cabling cost must be computed. When the topology has been modified through the addition of a bridge, the cost of the bridge must be included.

2.3 Reliability Model

The reliability of a LAN depends on the amount of redundancy built into the network. In a reliable network, two users can continue to communicate in the presence of component failures. A simple measure of reliability is the network's node connectivity. A network is r-connected if there exist at least r node-disjoint paths between every pair of users assigned to two different segments. Kleitman's algorithm can be used to compute the node connectivity of a network [2].
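For the small segment graphs considered here, r-connectivity can also be checked by brute force, as in the sketch below. This is not Kleitman's algorithm. It relies on the standard equivalent characterization that a graph with more than r nodes is r-connected exactly when it stays connected after any r - 1 nodes are removed. The function names are hypothetical, and the cost grows combinatorially with the number of segments.

```python
from itertools import combinations
from typing import Iterable, List, Tuple


def is_connected(nodes: Iterable[int], edges: List[Tuple[int, int]]) -> bool:
    """Depth-first search restricted to the surviving nodes."""
    nodes = set(nodes)
    if not nodes:
        return True
    adj = {v: set() for v in nodes}
    for a, b in edges:
        if a in nodes and b in nodes:
            adj[a].add(b)
            adj[b].add(a)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adj[v] - seen:
            seen.add(w)
            stack.append(w)
    return seen == nodes


def is_r_connected(num_segments: int, bridges: List[Tuple[int, int]], r: int) -> bool:
    """True if the segment graph survives the failure of any r - 1 segments."""
    if num_segments <= r:
        return False  # by convention, r-connectivity needs more than r nodes
    all_nodes = set(range(num_segments))
    for removed in combinations(range(num_segments), r - 1):
        if not is_connected(all_nodes - set(removed), bridges):
            return False
    return True


# Hypothetical topologies: a chain of three segments and a ring of four.
chain = [(0, 1), (1, 2)]
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
```

The chain is 1-connected but not 2-connected, since losing the middle segment separates the other two. The ring is 2-connected, which is why adding bridges beyond the m - 1 needed for a tree improves reliability.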

2.4 Design Issues

The inputs to the topology design problem are the number of users n, the traffic description T, the distance matrix D, an upper bound on the permissible average delay, and a specification of network reliability. The outputs of the TDP describe the LAN topology, which is characterized by the number of segments m, the segment connectivity information $B = [b_{ij}]$, and the assignment U of users to LAN segments. For simplicity, we have assumed that all the bridges are identical in nature. This assumption can be relaxed with some straightforward extensions. The objective of the TDP is to arrive at a topology that minimizes the network cost (Equation 1).

The inputs to the network redesign problem include the description of the existing LAN topology and the new network requirements. The possible changes include (i) addition of one or more users, (ii) geographical relocation of one or more users, (iii) change in the traffic pattern, and (iv) change in the network delay constraint. The objective of the NRP is to arrive at a modified network topology such that the cost of network redesign (Section 2.2) is minimized.

3 Algorithms for Topological Design

3.1 Topology Design

Fetterolf and Anandalingam [7] have observed the phenomenon of "locality of traffic", which means that nodes which are geographically close to one another share more intense traffic. We can exploit this phenomenon to design low-cost network topologies. Let S be the set of user nodes. Our heuristic algorithm constructs a minimal-distance Hamiltonian path P which connects all the user nodes. A Hamiltonian path has the property that each node in S is visited exactly once by the path. If $p_1, p_2, \ldots, p_n$ denotes the order in which P visits the nodes, the length of the Hamiltonian path P is given by

$$\mathrm{Length}(P) = \sum_{i=1}^{n-1} d_{p_i p_{i+1}} \qquad (2)$$

The Hamiltonian path represents the "initial" LAN topology, i.e. a topology in which all the nodes are on a single segment. A minimum-length Hamiltonian path is clearly the least-cost topology when no bridges are permitted, since the cabling cost is directly proportional to the cable length. However, the delay constraint may be violated by the initial topology. If the average delay, as computed by procedure AverageDelay() of Section 2, is higher than the specified bound on acceptable delay, we modify the initial topology by splitting the LAN segment into two. A greedy approach is followed: the edge (i, j) in the path P whose length is largest is deleted, and the removed link is replaced by a bridge. We again check whether the average delay is within the specified limit and repeat the process of removing one link at a time until the delay constraint is met.

The problem of finding the initial topology is closely related to the well-known Travelling Salesperson Problem (TSP), which is to find the minimum-length Hamiltonian cycle in a graph. In [10], it has been proved that any heuristic for the TSP may be translated into a heuristic for optimal network cost routing. The TSP is itself an NP-complete problem. We use the three-opt technique [13] to find a near-optimal solution to the TSP. Three-opt is a greedy heuristic which requires $O(n^3)$ time to find a near-optimal solution to an n-city TSP. From the resulting tour of the nodes, we remove the edge with the largest length in order to transform the tour into a Hamiltonian path. For comparison purposes, we also implemented the $O(n^2)$ farthest-insertion heuristic [13] for the TSP. On all the problems for which we tested our algorithm, both heuristics resulted in tours of comparable length. Note that the assignment of users to LAN segments comes as a byproduct of the above procedure.

To improve the run-time of the greedy algorithm, we can begin with an initial topology which has $1 + \lfloor n / n_{nominal} \rfloor$ segments, where $n_{nominal}$ is a "rule of thumb" guide to the number of nodes per segment which is likely to satisfy the performance constraint. In our experiments, we used $n_{nominal} = 10$. The greedy algorithm is modified by removing the $\lfloor n / n_{nominal} \rfloor$ largest-length links from the initial Hamiltonian path.
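The greedy splitting step can be sketched as follows, assuming a Hamiltonian path has already been obtained (for example by three-opt, which is not shown). The names `split_path`, `design_topology` and the `delay_of` callback are hypothetical. In practice `delay_of` would evaluate the average network delay of a candidate assignment, and each removed link would correspond to one bridge.

```python
from typing import Callable, List, Sequence


def split_path(path: Sequence[int], D: List[List[float]],
               num_cuts: int) -> List[List[int]]:
    """Remove the num_cuts longest links of the path; return the segments."""
    links = sorted(range(len(path) - 1),
                   key=lambda i: D[path[i]][path[i + 1]], reverse=True)
    cut_after = set(links[:num_cuts])  # positions whose outgoing link is cut
    segments, current = [], [path[0]]
    for i in range(1, len(path)):
        if i - 1 in cut_after:
            segments.append(current)
            current = []
        current.append(path[i])
    segments.append(current)
    return segments


def design_topology(path: Sequence[int], D: List[List[float]],
                    delay_of: Callable[[List[List[int]]], float],
                    delay_bound: float, n_nominal: int = 10) -> List[List[int]]:
    """Cut one more link at a time until the delay bound is met."""
    n = len(path)
    # Start with 1 + floor(n / n_nominal) segments, i.e. that many cuts.
    cuts = min(n // n_nominal, n - 1)
    while True:
        segments = split_path(path, D, cuts)
        if delay_of(segments) <= delay_bound or cuts >= n - 1:
            return segments
        cuts += 1


# Hypothetical example: five users on a line with one long gap.
pos = [0.0, 1.0, 2.0, 10.0, 11.0]
D = [[abs(a - b) for b in pos] for a in pos]
path = [0, 1, 2, 3, 4]
one_cut = split_path(path, D, 1)

# Toy delay model for illustration only: delay grows with segment size.
result = design_topology(path, D, lambda segs: max(len(s) for s in segs),
                         delay_bound=2)
```

Removing the longest remaining link at each step yields the same segments as removing the corresponding number of longest links at once, so the sketch simply recomputes the split for each cut count.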

3.2 Redesign Algorithm

The incremental redesign algorithm works in four phases. Phase I of the algorithm uses the "locality of traffic" principle [7] to assign new users to LAN segments. The new user nodes are considered one at a time. A given user node i is assigned to the segment j for which the product NewDelay · RedesignCost is minimum, where NewDelay is the network delay after user i has been assigned to segment j and RedesignCost is the incremental cost of the network. In Phase I, the incremental cost is limited to the recabling cost.

A major share of the average network delay comes from the delay introduced by the bridges. If we can assign users to segments in such a way that most of the traffic generated by the nodes assigned to a cluster remains local to the cluster, we can minimize the traffic through the bridges and improve the LAN performance significantly. Phase II and Phase III of our algorithm utilize this concept.

During Phase II, we attempt to minimize the product of average delay and redesign cost by reassigning users to clusters. This is carried out by a greedy iterative algorithm which, in each iteration, performs a pairwise swap of two users i and j assigned to two different clusters. If the swap results in an improvement of the delay-cost product, the change is retained; otherwise the network is reinstated to its previous state. Decisions, once made, are not reversed. The iterative algorithm stops when none of the swaps can result in an improvement.

Figure 3: Motivation for Phase III (diagram of Segments 1, 2, 3, and 5 interconnected by bridges b1 and b2)

Phase III attempts to alter the network topology, i.e. the interconnections among LAN segments. Refer to Figure 3. Suppose that, due to changes in the traffic patterns, the users in Segment 5 communicate mostly with users in Segment 2. Every packet originating in Segment 5 and destined to Segment 2 will then encounter a delay due to bridge b2 and Segment 3, and reassignment of users (Phase II) alone cannot yield performance improvements in such a situation. Phase III is also an iterative improvement algorithm, which searches for a better network topology by making incremental modifications in each iteration. An iteration consists of deleting an existing bridge and adding a new bridge such that connectivity is retained. A series of delete-and-add transformations is attempted, and those transformations which lead to an improvement of the delay-cost product are retained.

Note that none of the three phases mentioned above alters the number of segments. Increasing the number of segments amounts to splitting an existing segment using a bridge, which contributes to an increase in network cost. Phase IV attempts such expensive transformations. Our algorithm enters Phase IV only when Phases I, II, and III have failed to meet the specified performance constraint. The segment with the largest number of nodes is first selected as the candidate for a split transformation. An empty segment is added and control passes to Phase II so as to perform user reassignment and topological transformations.

During each phase, we keep an upper bound on the number of iterations that do not yield any improvement. The algorithm proceeds to the next phase when the rejection count exceeds the limit. The rejection count is reset to 0 when an improvement is found.
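The pairwise-swap step of Phase II, together with the rejection counter, can be sketched as follows. The `objective` callback is a hypothetical stand-in for the delay-cost product, and the function name and the default rejection limit are assumptions made for illustration, not values taken from our implementation.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple


def phase_two(seg_of: Sequence[int],
              objective: Callable[[List[int]], float],
              max_rejections: int = 50) -> Tuple[List[int], float]:
    """Greedy pairwise swaps of users that sit in different segments.

    A swap is kept only if it lowers the objective; otherwise it is undone.
    """
    seg_of = list(seg_of)
    best = objective(seg_of)
    improved = True
    while improved:
        improved = False
        rejections = 0
        for i, j in combinations(range(len(seg_of)), 2):
            if seg_of[i] == seg_of[j]:
                continue  # swapping within one segment changes nothing
            seg_of[i], seg_of[j] = seg_of[j], seg_of[i]
            value = objective(seg_of)
            if value < best:
                best = value
                improved = True
                rejections = 0  # reset the count on every improvement
            else:
                seg_of[i], seg_of[j] = seg_of[j], seg_of[i]  # undo the swap
                rejections += 1
                if rejections > max_rejections:
                    return seg_of, best
    return seg_of, best


# Toy objective for illustration: number of heavy-traffic pairs that are
# split across two segments (a proxy for traffic through the bridges).
heavy_pairs = [(0, 1), (2, 3)]


def split_pairs(assignment: List[int]) -> float:
    return sum(assignment[a] != assignment[b] for a, b in heavy_pairs)


assignment, value = phase_two([0, 1, 0, 1], split_pairs)
```

Because every accepted swap strictly lowers the objective and the number of assignments is finite, the loop terminates. Phase III could reuse the same accept-or-undo pattern with bridge delete-and-add moves in place of user swaps.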

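4 Results

Table 1: Results of Topology Designer. Delays are indicated in milliseconds and costs are indicated in $10^5$ Indian Rupees.

Example        n    k    m    Avg. Delay    Delay Bound    Network Cost
Elbaum-Sidi    8    2    1    12.550        8.330          0.270
Grid30         30   3    2    0.079         0.025          0.795
Grid50         50   7    6    0.950         0.211          1.557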

Our heuristic algorithms were implemented on a Sun SPARCstation in the C programming language. The complete software comprises about 3000 lines of code. We used the example from [5] to compare the result from our topology design algorithm with that reported in [5]. The authors of [5] considered a small LAN with 8 users. For this example, our topology designer arrived at a two-segment, 1-bridge solution. Interestingly, this topology is identical to that reported in [5]. We also experimented with much larger problems. Grid30 corresponds to a site of 30 users whose geographic locations lie in the vicinity of the corners of a grid. Grid50 is a similar problem with 50 users. The run-time of our algorithm on a Sun SPARCstation 10, even for the 50-user problem, is about 3 minutes. The results of the topology design algorithm are summarized in Table 1. In the table, k denotes the number of segments in the network found by our algorithm, and m denotes the number of bridges; note that these two symbols are used in the opposite sense in Section 2. The delay bound shown in Column 6 is a trivial lower bound on the average network delay, found by assuming that all the segments are fully connected through bridges.

In order to verify the incremental redesign option of our software, we considered an example network with four segments and 20 users. The initial topology is linear, with bridge 1 connecting segments 1 and 3, bridge 2 connecting segments 2 and 3, and bridge 3 connecting segments 1 and 4. The user assignment to segments was performed randomly. We performed the following two experiments. In the first experiment, we did not specify any new users and performed a redesign. Note that this mode of operation allows us to (possibly) improve the network obtained using our topology design algorithm. The results of this experiment are shown in Table 2(a). The second and third columns show the cost and the average delay of the new topology relative to the initial topology. Results obtained after the algorithm has progressed through different phases are indicated. Phase II was able to obtain an improvement in cost and delay through user reassignment. The application of Phase IV resulted in the addition of a bridge, increasing the cost of the network with no improvement in delay. Thus, in this example, the results of Phase II must be used as the final results.

Table 2: Results of Topology Redesign Experiments

(a) No changes in traffic patterns or users

Case    Relative Cost    Relative Delay    Comments
1       1.0000           1.0000            Initial Topology
2       0.9254           0.5588            Result after Phase II
3       0.9254           0.5588            Result after Phase III
4       1.1164           0.5588            Result after Phases IV and II
5       1.1164           0.5588            Result after Phases IV, II, III

(b) Addition of 5 new users

Case    Relative Cost    Relative Delay    Comments
1       1.0000           1.0000            Initial Topology
2       1.0000           1.0000            Result after Phase II
3       0.7055           0.5801            Result after Phase III
4       0.7055           0.5391            Result after Phases IV and II
5       0.8096           0.5651            Result after Phases IV, II, III

Similar experiments were conducted for a second case where five new users were added (see Table 2(b)). Since the number of users and the traffic patterns have changed, it is not possible to compare the network cost and delay of the resulting topology with that of the initial topology. We therefore use the results of Phase I as the basis of comparison in columns 2 and 3. We can see from the table that Phases I, II, III result in improvements in terms of cost and delay. The best results were obtained after Phase III in this example. The run-time of our redesign heuristic for all the examples which we attempted was within 1 minute.

5 Conclusions

We presented heuristic algorithms for two important problems concerning the design of local area networks. The topology design problem is to construct economical LAN topologies which meet the speed requirements of applications such as multimedia conferencing. The second problem is that of incremental redesign of existing LAN topologies to handle increased traffic. In both problems, the objective is to minimize the cabling cost and the cost of LAN equipment such as bridges or cross-connect switches. The initial results of our heuristic algorithms are encouraging. Extensions of this work call for more accurate delay models to handle multimedia traffic. Similarly, one may consider better cost models that account for the recurring cost of LAN maintenance.

References

[1] F. Backes. Transparent Bridges for Interconnection of IEEE 802 LANs. IEEE Network, Jan. 1988.
[2] D. Bertsekas and R. Gallager. Data Networks, Second Edition. PHI, 1992.
[3] U. Black. ATM Foundation for Broadband Networks. Prentice Hall, Englewood Cliffs, New Jersey, 1995.
[4] A. Dash. Subnet Topology Design for Multicasting Applications. M.Tech Dissertation, Dept. of Mathematics, IIT Delhi, December 1996.
[5] R. Elbaum and M. Sidi. Topological Design of LANs using Genetic Algorithms. IEEE/ACM Transactions on Networking, Vol. 4, No. 5, Oct. 1996, Pages 766-779.
[6] C. Ersoy and S.S. Panwar. Topological Design of Interconnected LAN/MAN Networks. IEEE JSAC, Vol. 11, No. 8, Oct. 1993.
[7] P.C. Fetterolf and G. Anandalingam. Optimization Interconnection of LANs: An Approach using Simulated Annealing. ORSA Journal on Computing, Vol. 3, No. 4, Fall 1991.
[8] H. Frank and W. Chou. Network Properties of the ARPA Computer Network. IEEE Network, Vol. 4, 1974.
[9] M.R. Garey and D.S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, San Francisco, CA, 1979.
[10] B.K. Kadaba and J.M. Jaffe. Routing to Multiple Destinations in Computer Networks. IEEE Transactions on Commun., Vol. COM-31, May 1983, Pages 343-357.
[11] A. Kershenbaum. Telecommunications Network Design Algorithms. McGraw Hill Inc., Singapore, 1983.
[12] W. Stallings. Data and Computer Communications, Second Ed. Macmillan, NY, 1988.
[13] N. Syslo, N. Deo and J.S. Kowalik. Discrete Optimization Algorithms. Prentice Hall Inc., New Jersey, 1983.
[14] Wipro Infotech Ltd. Network Products Service Guide. Vol. 1, Release 1.0, June 1994.
