a novel approach for assessing network efficiency

IADIS INTERNATIONAL CONFERENCES

INFORMATICS 2010 WIRELESS APPLICATIONS AND COMPUTING 2010 TELECOMMUNICATIONS, NETWORKS AND SYSTEMS 2010

part of the IADIS MULTI CONFERENCE ON COMPUTER SCIENCE AND INFORMATION SYSTEMS 2010

IADIS International Conferences Informatics 2010, Wireless Applications and Computing 2010 and Telecommunications, Networks and Systems 2010

A NOVEL APPROACH FOR ASSESSING NETWORK EFFICIENCY
Panayotis Fouliras and Emmanouil Stiakakis
Dept of Applied Informatics, University of Macedonia
Egnatias 156, 540 06 Thessaloniki, Greece

ABSTRACT The novelty of our approach is that we apply a well-known efficiency measurement method, used in economic settings, to the assessment of network efficiency. This method is Data Envelopment Analysis (DEA), which is used to measure and compare the relative efficiency of a number of units with multiple input and output items. A sample of 25 universities and technological educational institutes is examined, resulting not only in a ranking according to their relative efficiency, but also in a set of possible quantitative improvement suggestions. The results could function as a guide for network designers and analysts, so that they are able to focus on specific input and output items.

KEYWORDS Efficiency, Performance, Network, Data Envelopment Analysis

1. INTRODUCTION The Internet has evolved rapidly over the last two decades, effectively becoming an essential medium for communication and other information-related services. Nevertheless, it remains a communication system whose design and management require appropriate methods for assessing important factors such as reliability and efficiency. Given such an assessment, the network manager or designer is provided with a concise (and sometimes detailed) picture of the current state of the system, allowing the necessary actions to be planned. Hence, assessing network efficiency is of paramount importance.

Network efficiency, however, is a very general and rather vague term that can represent different things to different parties. For example, a client may consider a network efficient in terms of the throughput achieved for the network services requested. On the other hand, an Internet Service Provider (ISP) may consider its network efficient when the volume of traffic required by its clients is slightly smaller than the traffic offered by the network itself. It is also possible to consider network efficiency in terms of a multitude of characteristics, such as minimizing the number of rejected client requests, the amount of traffic dropped due to congestion, the traffic dropped due to non-conformity to certain Quality of Service (QoS) constraints, etc. In most cases, however, efficiency metrics consist of traffic (in and out of a network) expressed in either Mbps or pps (packets per second). Additional metrics may involve packets lost per second or delay manifested as round-trip time (RTT) (Mnisi et al, 2008).

The remainder of this paper is organized as follows: related work and other approaches are discussed in Section 2. A brief description of DEA is presented in Section 3. Section 4 presents our proposal in detail, using a specific group of networks as our testbed, analyses the results produced by DEA and discusses the merits of this method for computer networks. Finally, the conclusions are presented in Section 5.

ISBN: 978-972-8939-19-9 © 2010 IADIS

2. RELATED WORK In general, there are two typical approaches used to measure network efficiency: analytical and simulation-based. Under the analytical approach, there is detailed knowledge of the network topology, the nature of the traffic present in the network and the capacity of all nodes involved. The traffic demand itself (generated by the users) is random, but it can be modelled if it exhibits appropriate statistical properties. In the case of the Internet, empirical studies have shown that traffic can be described as self-similar, possibly with a heavy-tailed distribution (Willinger and Paxson, 1998). This is in contrast to the Poisson distribution typically used for telephone networks. Armed with this information, the network manager can employ probabilistic modeling techniques in order to evaluate the performance of a specific system.

Analytical models can be broadly classified into state space models, which are represented today by Markov chains, and non-state space models, such as product-form queuing networks (Bolch et al, 2006). A Markov chain consists of a set of states and a set of labelled transitions between the states. The main advantage of such an approach is that it can be used to predict the behaviour of a system, apart from evaluating its performance. For large and complex systems, such a model is more difficult both to create and to solve, although recent research efforts have led to the development of various software packages that assist in this direction. In spite of these advances, there is a continuing need to deal with larger Markov chains, and much research is being devoted to this topic (Bolch et al, 2006).

Under the simulation-based approach, an abstract model of the actual system is created and run with the desired parameters and restrictions. The main drawback is the time taken to run such models for large, realistic systems. Moreover, the accuracy of the results depends on the accuracy of the models constructed.

Yet another approach for system performance evaluation is based on actual measurement of the system under study. If performed properly, this approach has the advantage of collecting realistic data from the actual system, rather than working with an abstract model of it. It is difficult, however, to apply it to large systems, and a network may appear as a “black box”, making efficiency measurements impossible at first.

In this paper, we propose a novel hybrid way of assessing the relative efficiency of a collection of networks. Our proposal is based on the observation that a group of networks can be seen as a collection of systems which exhibit certain measurable inputs and outputs. For this case there is an appropriate method which has been used with considerable success in the field of Economics, namely DEA. We argue that DEA can be used to assess the relative efficiency of a collection of networks simultaneously, provided certain simple conditions are met.
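As a small illustration of the state-space models mentioned in this section, the following Python sketch computes the long-run (stationary) behaviour of a two-state Markov chain, once in closed form and once by repeatedly applying the transitions. The chain, its state names (idle, busy) and the transition probabilities are hypothetical and are not taken from the networks studied in this paper.

```python
def stationary_two_state(p_idle_to_busy, p_busy_to_idle):
    """Closed-form stationary distribution (idle, busy) of a two-state chain."""
    total = p_idle_to_busy + p_busy_to_idle
    return p_busy_to_idle / total, p_idle_to_busy / total


def iterate_chain(p_idle_to_busy, p_busy_to_idle, steps=200):
    """Approximate the same distribution by repeatedly applying the transitions."""
    idle, busy = 1.0, 0.0  # start in the (hypothetical) idle state
    for _ in range(steps):
        idle, busy = (
            idle * (1.0 - p_idle_to_busy) + busy * p_busy_to_idle,
            idle * p_idle_to_busy + busy * (1.0 - p_busy_to_idle),
        )
    return idle, busy


if __name__ == "__main__":
    # Hypothetical per-step transition probabilities, chosen only for illustration.
    print(stationary_two_state(0.2, 0.3))
    print(iterate_chain(0.2, 0.3))
```

With only two states the closed form is trivial; the difficulty noted above arises because the number of states, and hence the size of the system to be solved, grows quickly for realistic networks.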

3. A BRIEF DESCRIPTION OF DEA DEA is a nonparametric approach used to measure the performance of a population of units, named Decision Making Units (DMUs). Parametric approaches require the imposition of a specific functional form (e.g., a regression equation) relating inputs to outputs. In contrast, DEA does not require any assumption about the functional form. DEA constructs a discrete piecewise frontier over the data and then calculates a maximal performance measure for each DMU in relation to all the other DMUs. The relative efficiency of any DMU is calculated by the ratio of a weighted sum of m outputs over a weighted sum of n inputs. The weights are selected so that the efficiency measure of each DMU (DMU 1 in the following mathematical problem) is maximized, subject to the constraint that no DMU can have a relative efficiency score greater than unity (Cooper et al, 2007):

$$
\max_{u,v} \; \frac{u^{T} y_{1}}{v^{T} x_{1}}
\quad \text{subject to} \quad
\frac{u^{T} y_{i}}{v^{T} x_{i}} \le 1 \;\; \forall\, i = 1, 2, \ldots, k, \qquad u, v \ge 0
$$

where u is an m x 1 vector of output weights, v is an n x 1 vector of input weights, y is an m x 1 vector of output values, x is an n x 1 vector of input values and k is the number of DMUs. After transforming the above fractional problem into a linear programming problem (multiplier form) by equating the denominator of the efficiency ratio of the DMU under study to unity and using the equivalent dual model, the DEA problem takes the following form (envelopment form):

$$
\min_{\theta,\lambda} \; \theta
\quad \text{subject to} \quad
-y_{1} + Y \lambda \ge 0, \qquad
\theta x_{1} - X \lambda \ge 0, \qquad
\lambda \ge 0
$$

where θ is the efficiency score of DMU 1 (0 < θ ≤ 1), λ is a k x 1 vector of constants, Y is the m x k output matrix and X is the n x k input matrix. Usually the number of DMUs is considerably larger than the sum of input and output items; hence much less computational effort is needed when the envelopment form is solved. The choice of orientation (input or output) of the model is not critical. Generally, the orientation should be selected according to which items the managers have most control over. The input- and output-orientated DEA models identify the same set of efficient DMUs. It is only the efficiency measures associated with the inefficient DMUs that may differ between the two perspectives (Coelli et al, 2005). The following features could be considered the main strengths of the DEA methodology, making it of great interest to operations analysts, management scientists and industrial engineers (Charnes et al, 1994; Fuchs, 2004):
• DEA can handle multiple input and output items.
• Inputs and outputs can be stated in very different measurement units.
• DEA focuses on individual observations in contrast to population averages.
• Each DMU is characterized by a single relative efficiency score.
• DEA does not require specification or knowledge of a priori weights for inputs and outputs, as in the usual index number approaches.
• Based on projections of the inefficient DMUs onto the efficient frontier, specific estimates for improvements in inputs and/or outputs are produced.
• DMUs are directly compared against a peer or a combination of peers.
• DEA is very much in line with the concept of benchmarking, since the unit under study can identify benchmarks by making comparisons with a group of comparable units (perhaps the most important feature of the method).
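To make the ratio definition concrete, the following Python sketch evaluates the weighted output-to-input ratio of a DMU for given weights and, for the special case of a single input and a single output, computes the relative efficiency scores directly. In that special case the optimal score of each DMU reduces to its output/input ratio divided by the best ratio in the sample, so no linear programming solver is needed. The function names and the toy data are hypothetical; problems with several inputs and outputs, such as the one studied in Section 4, require an LP solver instead.

```python
def ratio_efficiency(u, v, y, x):
    """Weighted sum of outputs divided by weighted sum of inputs for one DMU."""
    weighted_output = sum(ui * yi for ui, yi in zip(u, y))
    weighted_input = sum(vi * xi for vi, xi in zip(v, x))
    return weighted_output / weighted_input


def single_io_scores(inputs, outputs):
    """Relative efficiency scores when every DMU has one input and one output.

    Each score is the DMU's output/input ratio divided by the best ratio
    observed, so the best DMU scores 1 and no score exceeds 1.
    """
    ratios = [out / inp for inp, out in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]


if __name__ == "__main__":
    # Hypothetical toy data: three DMUs, one input and one output each.
    inputs = [2.0, 4.0, 5.0]
    outputs = [4.0, 4.0, 10.0]
    print(single_io_scores(inputs, outputs))
    # The same score for the second DMU, written with explicit weights
    # u = [1.0] and v = [2.0] (the weights that make the best ratio equal 1).
    print(ratio_efficiency([1.0], [2.0], [4.0], [4.0]))
```

In the toy data the first and third DMUs share the best ratio and are therefore efficient, while the second obtains half their ratio and a score of 0.5.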

4. NETWORK EFFICIENCY METRICS, TESTBED AND RESULTS A network creates traffic as a result of the transformation of data mostly generated by computers at its ends (we assume that control traffic generated by intermediate nodes, such as routers, is negligible in an efficient network). Hence, some of the end-nodes in a network are input points in the sense that they generate traffic and some are output points in the sense that they consume the generated traffic. These end-points are border nodes (switches or routers) connecting a network with another or are hosts attached to the particular network. Network users, such as students in a university network, typically interact with the system, most often requiring a particular piece of software or other form of data. In this sense, they cause the generation of traffic entering the network, obviously directed towards the computers they use. As stressed earlier, there is no unique definition of network efficiency. The simplest view of network activity is the set of input and output traffic over a certain period of time. Furthermore, depending on the traffic demanded relative to the traffic conveyed over a network, it is possible to estimate relative network efficiency when comparing such results over a large number of similar networks (e.g., they have similar link capacity and users who cause traffic demands). We, therefore, propose the use of five metrics for calculating relative network efficiency. These are input and output traffic averaged over a one-year period, maximum input and output traffic, and the number of registered students for each of the networks under study. 
The first four metrics were selected because they represent network traffic both in a normalized way (input and output traffic are averaged over periods long enough to absorb atypical intervals, such as holidays or weekends) and at its peaks (maximum input and output traffic). Peak traffic demands may consist of both time-elastic data and time-inelastic data, such as real-time video, which may fail due to excessive delays. Indeed, these metrics are adopted in the popular, free traffic load measurement application MRTG (MRTG, 2010), which uses SNMP for data collection, with each measurement typically taken over a five-minute period. Measurements span periods of a day, a week, a month and a year; in the latter case MRTG averages data over a day. MRTG can also measure traffic in packets per second, but given that IP networks allow packets of varying size, we do not consider such data accurate enough for our purposes compared to the metrics proposed above. We chose a one-year period because it is long enough to span a complete cycle of activity of a human community (the typical clients of a computer network), yet short enough to assume that the networks under study have not undergone dramatic modifications. The last (output) metric we propose, the number of registered students for each of the networks examined, was selected because students represent by far the largest homogeneous user group of these networks. Hence, they present similar network traffic demand behavior.
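The distinction between average and maximum traffic can be sketched in a few lines of Python. The example below summarizes a series of fixed-interval traffic samples into an average and a peak, and also collapses the samples into daily means, similar in spirit to the daily averaging described above. The function names and sample values are hypothetical, and the code does not reproduce MRTG's actual implementation.

```python
from statistics import mean


def summarize(samples_mbps):
    """Return (average, peak) of a series of traffic samples in Mbps."""
    return mean(samples_mbps), max(samples_mbps)


def daily_averages(samples_mbps, per_day=288):
    """Collapse fixed-interval samples into one mean value per day.

    288 is the number of five-minute intervals in 24 hours.
    """
    return [
        mean(samples_mbps[i:i + per_day])
        for i in range(0, len(samples_mbps), per_day)
    ]


if __name__ == "__main__":
    # Hypothetical samples; two samples per "day" to keep the example short.
    samples = [10.0, 20.0, 30.0, 50.0]
    print(summarize(samples))                   # average and peak of raw samples
    print(daily_averages(samples, per_day=2))   # one mean per day
```

Note that the largest daily mean is lower than the largest raw sample, which is one reason for keeping the average and the maximum as separate metrics.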

4.1 Network Testbed In order to test our proposal, we selected a sizeable group of interconnected networks for which the relevant measurements are publicly available. GRNet (Greek Research & technology Network) is a backbone network that connects the networks of all Greek universities, Technological Educational Institutes (TEI) and various other state organizations. The NOC (Network Operations Center) of each university or TEI does not necessarily provide the complete topology or even traffic statistics regarding its internal network. However, GRNet does provide diagrams with traffic measurements regarding its nodes via MRTG (GRNet, 2010). Some of the measurements refer to the main links connecting GRNet with the respective university and TEI networks. Hence, it is possible to collect real-world measurements for average and maximum input and output traffic, allowing us to apply DEA in order to calculate the relative efficiency of these networks.

Table 1. Networks and respective measurement data

| University/TEI (DMU) | Traffic In (Mbps) | Max Traffic In (Mbps) | Traffic Out (Mbps) | Max Traffic Out (Mbps) | No. of Students |
|---|---|---|---|---|---|
| NTUA (Nat. Tech. U. of Athens) | 175.20 | 1018.70 | 920.90 | 2218.20 | 8,858 |
| UoAthens | 115.60 | 579.80 | 344.90 | 960.00 | 32,335 |
| AUTH (Arist. U. of Thessaloniki) | 89.20 | 2453.50 | 206.90 | 650.00 | 34,148 |
| AUEB (Athens U. of Econ. & Buss.) | 26.10 | 534.40 | 68.20 | 345.10 | 6,857 |
| UoPatras | 669.70 | 924.20 | 267.90 | 279.60 | 12,993 |
| UoCrete | 91.80 | 775.50 | 436.70 | 1150.00 | 8,421 |
| UoIoannina | 60.00 | 552.20 | 105.80 | 802.00 | 10,451 |
| DUTh (Xanthi) – (Demokr. U. of Thrace) | 28.30 | 410.00 | 88.30 | 686.30 | 11,732 |
| UoPeloponnese | 0.54 | 93.10 | 48.90 | 292.70 | 2,519 |
| UoMacedonia | 19.70 | 284.80 | 48.70 | 415.20 | 5,889 |
| UoWesternMacedonia | 3.80 | 327.80 | 17.20 | 149.80 | 2,340 |
| Tech. UoCrete | 41.60 | 215.30 | 146.60 | 709.40 | 1,973 |
| TEI Athens | 69.10 | 1460.00 | 24.20 | 248.09 | 15,838 |
| TEI Patras | 10.70 | 117.00 | 39.20 | 199.50 | 10,218 |
| TEI Larisa | 11.00 | 101.80 | 7.07 | 914.30 | 9,928 |
| TEI Lamia | 3.22 | 61.30 | 6.31 | 40.60 | 3,989 |
| TEI Halkis | 1.00 | 101.00 | 0.39 | 11.60 | 5,460 |
| TEI Kalamata | 0.87 | 48.00 | 3.70 | 52.30 | 2,772 |
| TEI Serres | 5.25 | 99.90 | 9.44 | 99.80 | 5,600 |
| TEI Thessaloniki | 10.70 | 941.20 | 5.56 | 413.70 | 12,334 |
| TEI Kozani | 12.60 | 99.70 | 16.80 | 99.50 | 9,931 |
| TEI Ionian Islands | 0.68 | 3.58 | 0.66 | 7.37 | 2,023 |
| TEI Messolongi | 2.09 | 92.50 | 0.54 | 733.20 | 4,049 |
| TEI Crete | 38.20 | 983.20 | 132.60 | 348.10 | 9,322 |
| TEI Kavala | 14.20 | 133.70 | 25.80 | 188.60 | 5,444 |

Traffic In and Max Traffic In are listed as inputs in the source table; Traffic Out and Max Traffic Out are listed as outputs.

Table 1 summarizes the data collected according to our selected metrics for each network. Networks and backup links carrying more than an order of magnitude less traffic than those in the table were excluded. Most networks have a 1 Gbps main link to GRNet, while the rest have 2.5 Gbps or 10 Gbps links. In this sense, the main link capacities (as well as the number of students) fall within an order of magnitude. Furthermore, these networks belong to educational institutes from across the entire country, and each network may consist of a single site or of multiple sites spanning different cities, aggregated into a single network.


4.2 Results The following results have been computed using DEA-Solver v6.0, a commercial software package by SAITECH, Inc. In the context of DEA, a DMU with an efficiency score equal to unity is an efficient unit. Of the 25 universities and TEIs, 11 are efficient DMUs. The average efficiency score is 0.777, the minimum 0.161 and the standard deviation 0.264. Table 2 shows the differences between the actual and the expected values for the 14 inefficient DMUs only. These differences represent the required reductions for each of the input items (traffic in, max traffic in), as well as the required increases for the output items (traffic out, max traffic out). The “No. of students” has been omitted, since it cannot be decreased. Consequently, this table should be considered as a guide for the sampled DMUs, suggesting exactly which items need to be improved, and to what extent, to achieve operational efficiency. For instance, “Max Traffic Out” for UoPatras should be increased from 279.60 to 690.06 Mbps (an increase of 146.80%) for the respective network’s relative efficiency to be improved. We stress, however, that this guide determines only what should be done, not the actual means to achieve it.

Table 2. Projection of inefficient DMUs onto the efficient frontier

| DMU | Score | I/O | Data | Projection | Difference | % |
|---|---|---|---|---|---|---|
| AUEB | 0.436 | Traffic In | 26.10 | 11.37 | -14.73 | -56.45% |
| | | Max Traffic In | 534.40 | 133.25 | -401.15 | -75.07% |
| | | Traffic Out | 68.20 | 68.20 | 0.00 | 0.00% |
| | | Max Traffic Out | 345.10 | 345.10 | 0.00 | 0.00% |
| UoPatras | 0.392 | Traffic In | 669.70 | 67.43 | -602.27 | -89.93% |
| | | Max Traffic In | 924.20 | 361.98 | -562.22 | -60.83% |
| | | Traffic Out | 267.90 | 267.90 | 0.00 | 0.00% |
| | | Max Traffic Out | 279.60 | 690.06 | 410.46 | 146.80% |
| UoCrete | 0.907 | Traffic In | 91.80 | 83.27 | -8.53 | -9.29% |
| | | Max Traffic In | 775.50 | 519.02 | -256.48 | -33.07% |
| | | Traffic Out | 436.70 | 436.70 | 0.00 | 0.00% |
| | | Max Traffic Out | 1150.00 | 1150.00 | 0.00 | 0.00% |
| UoIoannina | 0.462 | Traffic In | 60.00 | 27.73 | -32.27 | -53.78% |
| | | Max Traffic In | 552.20 | 255.25 | -296.95 | -53.78% |
| | | Traffic Out | 105.80 | 105.80 | 0.00 | 0.00% |
| | | Max Traffic Out | 802.00 | 802.00 | 0.00 | 0.00% |
| DUTh (Xanthi) | 0.926 | Traffic In | 28.30 | 26.21 | -2.09 | -7.40% |
| | | Max Traffic In | 410.00 | 379.65 | -30.35 | -7.40% |
| | | Traffic Out | 88.30 | 88.30 | 0.00 | 0.00% |
| | | Max Traffic Out | 686.30 | 686.30 | 0.00 | 0.00% |
| UoMacedonia | 0.386 | Traffic In | 19.70 | 7.61 | -12.09 | -61.38% |
| | | Max Traffic In | 284.80 | 110.00 | -174.80 | -61.38% |
| | | Traffic Out | 48.70 | 48.70 | 0.00 | 0.00% |
| | | Max Traffic Out | 415.20 | 415.20 | 0.00 | 0.00% |
| UoWesternMacedonia | 0.161 | Traffic In | 3.80 | 0.61 | -3.19 | -83.93% |
| | | Max Traffic In | 327.80 | 52.67 | -275.13 | -83.93% |
| | | Traffic Out | 17.20 | 26.29 | 9.09 | 52.86% |
| | | Max Traffic Out | 149.80 | 159.07 | 9.27 | 6.19% |
| Tech. UoCrete | 0.931 | Traffic In | 41.60 | 31.95 | -9.65 | -23.21% |
| | | Max Traffic In | 215.30 | 200.50 | -14.80 | -6.88% |
| | | Traffic Out | 146.60 | 146.60 | 0.00 | 0.00% |
| | | Max Traffic Out | 709.40 | 709.40 | 0.00 | 0.00% |
| TEI Athens | 0.432 | Traffic In | 69.10 | 29.82 | -39.28 | -56.85% |
| | | Max Traffic In | 1460.00 | 630.00 | -830.00 | -56.85% |
| | | Traffic Out | 24.20 | 81.76 | 57.56 | 237.84% |
| | | Max Traffic Out | 248.09 | 312.55 | 64.46 | 25.98% |
| TEI Lamia | 0.689 | Traffic In | 3.22 | 2.22 | -1.00 | -31.13% |
| | | Max Traffic In | 61.30 | 42.22 | -19.08 | -31.13% |
| | | Traffic Out | 6.31 | 6.31 | 0.00 | 0.00% |
| | | Max Traffic Out | 40.60 | 40.60 | 0.00 | 0.00% |
| TEI Kalamata | 0.820 | Traffic In | 0.87 | 0.71 | -0.16 | -17.96% |
| | | Max Traffic In | 48.00 | 39.38 | -8.62 | -17.96% |
| | | Traffic Out | 3.70 | 9.92 | 6.22 | 168.19% |
| | | Max Traffic Out | 52.30 | 63.27 | 10.97 | 20.97% |
| TEI Serres | 0.704 | Traffic In | 5.25 | 3.70 | -1.55 | -29.60% |
| | | Max Traffic In | 99.90 | 70.33 | -29.57 | -29.60% |
| | | Traffic Out | 9.44 | 9.44 | 0.00 | 0.00% |
| | | Max Traffic Out | 99.80 | 113.04 | 13.24 | 13.27% |
| TEI Crete | 0.706 | Traffic In | 38.20 | 26.95 | -11.25 | -29.44% |
| | | Max Traffic In | 983.20 | 209.21 | -773.99 | -78.72% |
| | | Traffic Out | 132.60 | 132.60 | 0.00 | 0.00% |
| | | Max Traffic Out | 348.10 | 420.29 | 72.19 | 20.74% |
| TEI Kavala | 0.476 | Traffic In | 14.20 | 6.76 | -7.44 | -52.38% |
| | | Max Traffic In | 133.70 | 63.67 | -70.03 | -52.38% |
| | | Traffic Out | 25.80 | 25.80 | 0.00 | 0.00% |
| | | Max Traffic Out | 188.60 | 188.60 | 0.00 | 0.00% |
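The arithmetic behind Table 2 and the summary statistics can be checked with a short Python sketch. It recomputes the Difference and % columns for the UoPatras rows (values rounded to two decimals), derives the score implied by the input projections, and recomputes the mean and minimum of the 25 scores. The function names are hypothetical. The relation between the score and the input projections assumes an input-oriented model, and the use of the population standard deviation is also an assumption, since the formula used for the reported 0.264 is not stated.

```python
from statistics import mean, pstdev


def projection_gap(data, projection):
    """Return (difference, percentage change) of the projection versus the data."""
    diff = projection - data
    return round(diff, 2), round(100.0 * diff / data, 2)


def implied_input_score(input_pairs):
    """Largest projection/data ratio over the inputs of one DMU.

    Assumes an input-oriented model, where the input with no extra
    slack is contracted exactly by the efficiency score.
    """
    return max(projection / data for data, projection in input_pairs)


# 11 efficient DMUs (score 1) plus the 14 scores listed in Table 2.
SCORES = [1.0] * 11 + [
    0.436, 0.392, 0.907, 0.462, 0.926, 0.386, 0.161,
    0.931, 0.432, 0.689, 0.820, 0.704, 0.706, 0.476,
]

if __name__ == "__main__":
    # UoPatras rows of Table 2.
    print(projection_gap(279.60, 690.06))   # Max Traffic Out
    print(projection_gap(669.70, 67.43))    # Traffic In
    print(round(implied_input_score([(669.70, 67.43), (924.20, 361.98)]), 3))
    # Summary statistics over all 25 DMUs.
    print(round(mean(SCORES), 3), min(SCORES), round(pstdev(SCORES), 3))
```

For UoPatras, the implied score agrees with the reported 0.392, and the recomputed mean and minimum agree with the reported 0.777 and 0.161. Because the listed scores are rounded, the recomputed standard deviation can only be expected to be close to the reported value.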

5. CONCLUSION In this paper we have applied DEA, which is extensively used in economics-related contexts, to a collection of networks in order to assess their relative efficiency. This is a novel approach, since we treat networks as economic systems whose input items should be minimized and whose output items should be maximized, so that their efficiency is maximized. This approach offers the advantage of treating a group of networks as “black boxes”, without requiring any detail regarding their topologies or internal characteristics. Instead, we use only certain input and output items, such as traffic measurements and the approximate number of users of each network. The results obtained not only yield the present relative efficiency of these networks, but also provide the network manager with a set of quantitative suggestions for improving the relative efficiency of the networks determined to be inefficient. Although our analysis was based on a small number of input and output items, we believe that these are the most important metrics characterizing overall network efficiency. Additional research should be directed towards the examination and comparison of other efficiency measurement methods with DEA regarding computer networks.

REFERENCES
Bolch, G. et al, 2006. Queueing Networks and Markov Chains: Modeling and Performance Evaluation with Computer Science Applications. 2nd edition, Wiley-Interscience, Hoboken, New Jersey.
Charnes, A. et al, 1978. Measuring the efficiency of Decision Making Units. In European Journal of Operational Research, Vol. 2, pp. 429-444.
Coelli, T.J. et al, 2005. An Introduction to Efficiency and Productivity Analysis. 2nd edition, Springer, New York, NY.
Cooper, W.W. et al, 2007. Data Envelopment Analysis: A Comprehensive Text with Models, Applications, References and DEA-Solver Software. 2nd edition, Springer, New York, NY.
Fuchs, M., 2004. Strategy development in tourism destinations: A DEA approach. In Poznan Economics Review, Vol. 4, No. 1, pp. 52-73.
GRNet, Greek Research & technology Network, 2010. Available at: http://netmon.grnet.gr/traffic/ (accessed January 8, 2010).
Mnisi, N.V. et al, 2008. Active Throughput Estimation using RTT of Differing ICMP packet sizes. Proceedings of the Third IEEE International Conference on Broadband Communications, Information Technology & Biomedical Applications (Broadcom). Pretoria, Gauteng, South Africa, pp. 480-485.
MRTG, Multi Router Traffic Grapher, 2010. Available at: http://oss.oetiker.ch/mrtg/ (accessed January 8, 2010).
Willinger, W. and Paxson, V., 1998. Where Mathematics Meets the Internet. In Notices of The American Mathematical Society, Vol. 45, No. 8, pp. 961-970.
