2015 IEEE International Conference on Advanced Networks and Telecommunications Systems (ANTS)

Real-Time Monitoring of Network Latency in Software Defined Networks

Debanshu Sinha, K Haribabu, Sundar Balasubramaniam
Computer Science & Information Systems
Birla Institute of Technology & Science, Pilani, Rajasthan, 333031, INDIA
{f2010558, khari, sundarb}@pilani.bits-pilani.ac.in

Abstract—Latency in a network is an important parameter that can be utilized by service providers and end users alike. Delay on a network path is often measured using end-to-end probing packets. When multiple end systems measure end-to-end latency, there are overlaps in their paths. Since end systems do not have this knowledge, the result is redundant work and network overhead. In this paper, we propose a method to measure end-to-end path latency in Software Defined Networks (SDN). The method avoids redundant work and measures latency in real time. Our proposal is an improvement over the looping technique: we simplify the looping technique by using the IP TTL field as a counter. To avoid duplicate work, latency is measured per link and stored in the controller. End systems may register their flow labels with the SDN controller to receive latency information. For each registered flow, the controller composes the individual link latencies on that path to compute end-to-end latency. We also propose another approach that estimates latency from queue lengths at network switches; this technique removes network overhead. In our simulations, the improved looping technique gives better results with reduced computational and network overhead, while the proposed queue-length technique shows comparable results.

Keywords—SDN, Latency Measurement, OpenFlow, OpenVSwitch

I. INTRODUCTION

Software Defined Networking (SDN) is defined by the Open Networking Foundation as an architecture in which the control and data planes are decoupled while network intelligence and state are logically centralized [1]. In traditional networks, both the control plane and the data plane are present in the network device. In SDN, network devices take instructions from an SDN controller to process incoming data packets. The controller is a central entity that keeps in continuous contact with all network devices and gathers statistics from them; it uses these statistics for routing, load balancing, etc. In the SDN paradigm, network devices forward packets using the intelligence given by the controller. This is in contrast to the existing design, wherein switches learn MAC addresses and use STP to avoid loops, and routers use OSPF/IS-IS to learn routes to destinations; in an SDN, all of this is done by the controller. For a particular flow, the controller computes the end-to-end path and adds flow labels and corresponding actions to the FIB (Forwarding Information Base) of the devices situated on the path. Each network device keeps in touch with the controller over a secure channel. The controller can add flows reactively or proactively. In the reactive mode, when a switch receives a flow for which it does not find a matching label, it forwards the packets to the controller, which then has to compute the path and update the switch FIB. OpenFlow [2] and ForCES [3] are well-known southbound interfaces (i.e., protocols used by the controller to communicate with network elements). OpenFlow is being widely adopted and supported by vendors such as HP, IBM, Cisco, BigSwitch Networks, and Juniper in their devices.

Latency is one of the widely used performance metrics in computer networks. It is used in network diagnostics as well as by end-host applications. Network administrators use network delay to detect abnormalities in the network. Content Distribution Networks use latency information to redirect their clients to appropriate servers [4]. Applications sensitive to delay or jitter, such as voice and video applications, monitor delay to adapt their traffic rate or use alternate paths. TCP uses delay to detect packet loss, which impacts congestion control. Latency is measured in various ways: i) using end-to-end utilities such as ping and traceroute; ii) using dedicated infrastructure such as tracers, landmark hosts, or DNS servers; iii) using geographical coordinate systems, etc. Latency can either be measured in real time or extrapolated from precomputed values, possibly between a set of edge routers in the Internet. The centralized control and global visibility of SDN architectures enable new ways of measuring latency that add accuracy and a real-time basis. One challenge in measuring latency in the distributed architecture of the Internet is that measurement is done by end-hosts while network devices are passive participants. Considering the scale of the Internet in terms of the number of end hosts, measuring latency between all pairs of end-hosts is always a challenge [5]. End-to-end measurements also result in duplicate work.


If network devices can actively contribute to network measurements, duplicate work can be eliminated. Moreover, the number of network devices is limited, on the order of 557k [6]. If there is a way of computing latencies between adjacent network devices, such latencies can be composed into end-to-end path latency. Such a method greatly reduces the computational effort on end-hosts as well as the overhead of extra messages on network bandwidth. SDN-based network architectures have provisions for such a method. In this paper we review existing methods for measuring link latencies and propose two new methods. The first is an improvement over an existing method [7], obtained by simplifying its implementation; we call this the TTL-based looping method. The second method is based on measuring the queue length at a switch, mapping it to the latency of the link, and correspondingly composing the end-to-end path latency. We compare the accuracy and the overheads, both computational and communication, of these methods with those of existing methods. The rest of the paper is organized into Related Work, Proposed Approaches, Implementation, Experiments and Results, and Conclusion.
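To make the composition step concrete, the following is a minimal, hypothetical sketch of the controller-side bookkeeping: per-link latencies are stored as they are measured and summed along the path of a registered flow. The data structures and function names are illustrative only, not part of any particular controller framework.

```python
# Hypothetical controller-side bookkeeping: per-link latencies measured in the
# network are summed along a registered flow's path on demand.
link_latency = {}   # (switch_a, switch_b) -> latest measured latency in seconds
flow_path = {}      # flow_id -> ordered list of (switch_a, switch_b) links

def register_flow(flow_id, path_links):
    """An end system registers its flow label; the controller records its path."""
    flow_path[flow_id] = list(path_links)

def update_link(link, latency_s):
    """Called whenever a new per-link measurement arrives (looping or queue-based)."""
    link_latency[link] = latency_s

def path_latency(flow_id):
    """Compose end-to-end latency for a registered flow from per-link values."""
    return sum(link_latency[link] for link in flow_path[flow_id])

# Example: a 2-link path s1-s2-s3 for flow 42
register_flow(42, [("s1", "s2"), ("s2", "s3")])
update_link(("s1", "s2"), 0.004)
update_link(("s2", "s3"), 0.009)
print(path_latency(42))   # 0.013 s
```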

II. RELATED WORK

The simplest way of measuring latency is using the ping utility. It requires that ICMP packets be forwarded to the end host, which may not always be feasible due to firewall restrictions. Probe packets themselves may get dropped due to congestion, which limits the use of ping in applications where real-time measurement of latency is required.

Traditional methods either require additional specialized infrastructure or measure approximate latencies in a distributed way, and the latency measured does not represent the current state of the network. SDN makes a network programmable and provides statistics that can be readily used for real-time monitoring. The latency of a link between two switches can be measured by sending a probe packet from the controller to switch1, which sends it to switch2, which sends it back to the controller [8]. Latency is computed by subtracting the up and down delays between the controller and switch1 and switch2. OpenNetMon [9] measures delay using the same method as in [8], except that all switches and the controller are placed in a separate VLAN and the probing frequency is determined by the throughput rate between the switches. In [7], a looping technique for real-time monitoring of latency is proposed. A probe packet is injected into the network on a chosen path and is made to traverse the path a number of times before it is sent to the controller. The controller computes latency as an average over all loops the packet has made. This method measures latency to a precision of tens of microseconds.

III. PROPOSED APPROACHES

A. System Model

We define the base model of our system as consisting of two OpenFlow-compliant switches s1 and s2, connected by a link of bandwidth Bl. There is a direct logical link from each switch to the controller Cn. For the link we assume a propagation delay Ps1s2; for each switch, a queue length Ql, a queuing delay Qd, a transmission delay Td, and a processing delay Pd.

Figure 1. System Model

B. TTL-based Looping Method

A simple method to calculate the latency would be to loop a special service packet through the Cn-s1-s2-Cn circuit and record the time delay in the controller when the packet returns. After subtracting the Cn-s1 and s2-Cn delays, we would get the desired value, as described in [9].

An improvement suggested to the aforementioned method in [9] is to keep the packet in a continuous loop over some consecutive group of links, reporting an average latency for that link group over time. However, the method described in [7] achieves this by installing multiple rules at each switch, which match separate parts of the packet header to take appropriate forwarding actions. The number of rules required is related to the number of iterations to be made by the service packet before reporting back to the controller. We simplify the method from [7] to use just three rules. Our method continuously loops a service packet in each link instead of over a group, and we leverage the TTL (time to live) field to keep track of the number of iterations of our service packet. This requires only three rules each in s1 and s2: (i) match and decrement the TTL; (ii) forward the packet back to the other switch; and (iii) forward the packet to the controller if the TTL is 0.
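As an illustration, a minimal controller-side sketch of these three rules and the associated bookkeeping is given below. It is written against a hypothetical API (install_rule, send_packet, delay_to and the packet-in handler are placeholder names rather than a specific controller framework); the averaging step simply divides the measured round trip, minus the controller-to-switch delays, by the number of TTL-counted traversals.

```python
import time

INITIAL_TTL = 10  # number of link traversals before the packet is reported back

class TTLLoopMonitor:
    """Per-link latency via a looping service packet (illustrative sketch).

    `ctrl` is assumed to expose install_rule(), send_packet() and delay_to();
    these are placeholder names, not a specific controller API.
    """

    def __init__(self, ctrl, s1, s2):
        self.ctrl, self.s1, self.s2 = ctrl, s1, s2
        self.sent_at = None
        self.link_latency = None  # most recent estimate, in seconds

        for sw, peer in ((s1, s2), (s2, s1)):
            # Rules (i) and (ii): match the service packet, decrement its TTL,
            # and forward it back over the link to the other switch.
            ctrl.install_rule(sw,
                              match={"service_pkt": True, "ttl_gt": 0},
                              actions=["dec_ttl", ("output", peer)])
            # Rule (iii): once the TTL reaches 0, send the packet to the controller.
            ctrl.install_rule(sw,
                              match={"service_pkt": True, "ttl": 0},
                              actions=[("output", "controller")])

    def start(self):
        # Inject the service packet at s1 with the chosen TTL.
        self.sent_at = time.monotonic()
        self.ctrl.send_packet(self.s1, service_pkt=True, ttl=INITIAL_TTL)

    def on_packet_in(self, switch):
        # Invoked when the TTL-expired service packet returns to the controller.
        rtt = time.monotonic() - self.sent_at
        # Subtract the controller<->switch delays (measured separately), then
        # average over the TTL-counted traversals of the s1-s2 link.
        ctrl_delay = self.ctrl.delay_to(self.s1) + self.ctrl.delay_to(switch)
        self.link_latency = max(rtt - ctrl_delay, 0.0) / INITIAL_TTL
        self.start()  # re-inject so the link is measured continuously
```

In an OpenFlow deployment, rule (i) would map onto the standard decrement-TTL action and rule (iii) onto an output-to-controller action; the sketch leaves those specifics to the southbound interface.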

C. Queue Length Method

Latency on a network is made up of four components: processing delay, propagation delay, transmission delay, and queuing delay. It has been observed that the main variable among these is the queuing delay, i.e., the other components remain more or less constant [10]. We leverage this idea to build a model for calculating link latencies from the queue length at each port of a switch, since a single link can be uniquely identified as a [(s1, port1), (s2, port2)] pair. Our model is therefore:

Latency = Cd + Qd + Td, where Qd = (Ql x MTU) / Bl and Td = MTU / Bl.

Here Qd is calculated from the number of packets queued (the queue length, Ql), the MTU (maximum transmission unit) of the link, which we take as 1500 bytes, and the bandwidth of the link, Bl. Note that the Bl value used is the rate of data transmission on the link (10 Mbps or 100 Mbps). Td is calculated from the MTU and Bl. Cd is the sum of Pd and Ps1s2, which is assumed to be constant. Since the processing delay (Pd) and the propagation delay (Ps1s2) remain constant irrespective of traffic variation, Cd is measured on an idle network by running ping. We assume there is no queuing delay during this calibration because the network is idle, and the transmission delay is small because a ping packet is only 8 + 20 = 28 bytes long and therefore insignificant compared to the MTU of 1500 bytes.
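The model above translates directly into a few lines of arithmetic. The sketch below assumes the 1500-byte MTU stated in the text, a bandwidth given in bits per second, and a Cd value calibrated beforehand with ping on an idle network; how the queue length Ql is actually read from the switch is left out, since that depends on the switch implementation.

```python
MTU_BYTES = 1500  # assumed maximum transmission unit for the link

def link_latency(queue_len_pkts, bandwidth_bps, const_delay_s):
    """Estimate one-link latency from queue length (model sketch).

    queue_len_pkts : Ql, packets currently queued on the egress port
    bandwidth_bps  : Bl, link data rate in bits per second
    const_delay_s  : Cd = Pd + Ps1s2, calibrated once with ping on an idle network
    """
    bytes_per_sec = bandwidth_bps / 8.0
    queuing_delay = queue_len_pkts * MTU_BYTES / bytes_per_sec   # Qd
    transmission_delay = MTU_BYTES / bytes_per_sec               # Td
    return const_delay_s + queuing_delay + transmission_delay

# Example: 12 packets queued on a 10 Mbps link with Cd of 0.2 ms gives
# roughly 0.2 ms + 14.4 ms + 1.2 ms of estimated link latency.
print(link_latency(12, 10_000_000, 0.0002))
```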

D. Comparison

Table 1 provides a comparison of various latency measurement techniques:

Metric                          | Ping     | OpenNetMon                      | TTL-based Looping | Queue Length
Network overhead per flow query | O(L)     | O(P*L/Q)                        | O(N*P*T/Q)        | O(P*N/Q)
Network overhead, total         | O(F*Q*L) | O(F*P*L)                        | O(N*P*T)          | O(N*P)
Precision                       | Low      | High                            | Very high         | High
Real-time?                      | No       | Yes (for large burst intervals) | Yes               | Yes (for small N)

Table 1. Comparison of latency measurement techniques. Q: number of queries (i.e., requests); N: number of switches; L: path length; P: rate of probing; F: number of flows in the network; T: the TTL, i.e., the number of times a service packet loops over a link.

In Table 1, the network overhead per query per flow is estimated by amortization over all flows for all methods except Ping. It can be observed that Ping has the least network overhead per query, while the TTL-based looping method has the maximum. The total network overhead (for all queries) is the least for the queue-length method. When F > N, the TTL-based looping method and the queue-length method have lower network overheads than OpenNetMon and Ping. In a typical operational network, the number of flows (F) will be larger than the number of switches (N). For a given probing rate P, the TTL-based looping method has better precision because it continually measures latency per link. A method can report real-time latency if its probing interval matches the traffic burst interval in the network. OpenNetMon can report in real time if the traversal time of the probe packet over the end-to-end path is less than or equal to the traffic burst time. The TTL-based looping method yields real-time results because it continually measures link-wise latency. The queue-length method is affected by the polling time a controller would take to poll all switches in the network; for small N, it will give real-time results.
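As a purely illustrative check of these observations, the Table 1 total-overhead expressions can be evaluated for assumed parameter values; the numbers below are examples chosen for the sketch, not measurements.

```python
# Illustrative only: plug example values into the Table 1 total-overhead expressions.
F = 500   # flows (assumed)
N = 30    # switches (assumed)
L = 4     # path length in links (as in the simulated topology)
P = 10    # probes per second (assumed)
Q = 100   # latency queries (assumed)
T = 10    # TTL, loops per link (as used in the simulations)

total_overhead = {
    "Ping":              F * Q * L,   # O(F*Q*L)
    "OpenNetMon":        F * P * L,   # O(F*P*L)
    "TTL-based Looping": N * P * T,   # O(N*P*T)
    "Queue Length":      N * P,       # O(N*P)
}
for method, cost in sorted(total_overhead.items(), key=lambda kv: kv[1]):
    print(f"{method:18s} {cost}")
# With F > N, the two per-link methods come out well below Ping and OpenNetMon.
```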

IV. SIMULATIONS

We used Mininet to simulate our networks. Once the topology was set up, we used D-ITG [11] to generate network traffic, and at times iperf to flood a random path with traffic and simulate high demand. We compared the Queue Length and TTL-based Looping methods with ping and OpenNetMon in Mininet using a simple tree topology with a depth of 3 and a fan-out of 3, resulting in 27 hosts. The tested path had 4 links. We set alternate links to varying bandwidths (10 Mbps or 100 Mbps) to further allow packet queuing. In Fig. 2, we show the performance of all four methods together in a Mininet-simulated network. Ping closely coincides with the TTL-based Looping method, and both closely coincide with OpenNetMon as well.

Figure 2. Ping vs OpenNetMon vs Queue Length vs Looping – Mininet Setup

TTL selection also plays an important role in managing the load on the controller. As network size increases, the number of latency updates received by the controller increases linearly with the number of links. A small TTL would lead to a huge load on the controller. We propose that the TTL be adjusted to a manageable value depending on the network size and the maximum load capability of the controller. For our simulations, we used a fixed TTL of 10.
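A minimal sketch of a comparable Mininet setup is shown below: a depth-3, fan-out-3 tree with TCLink-shaped bandwidths and an external OpenFlow controller assumed to be listening locally. The alternating 10/100 Mbps assignment and the controller address are illustrative assumptions, not necessarily the exact configuration used.

```python
#!/usr/bin/env python
"""Sketch of the simulation topology: depth-3, fan-out-3 tree (27 hosts)."""
from mininet.topo import Topo
from mininet.net import Mininet
from mininet.link import TCLink
from mininet.node import RemoteController

class AltBwTree(Topo):
    """Tree topology whose link bandwidths alternate between 10 and 100 Mbps."""
    def build(self, depth=3, fanout=3):
        self.host_count = 0
        self.switch_count = 0
        self.link_count = 0
        self._add_subtree(depth, fanout)

    def _add_subtree(self, depth, fanout):
        # Recursively build the tree; return the root node of this subtree.
        if depth == 0:
            self.host_count += 1
            return self.addHost('h%d' % self.host_count)
        self.switch_count += 1
        node = self.addSwitch('s%d' % self.switch_count)
        for _ in range(fanout):
            child = self._add_subtree(depth - 1, fanout)
            bw = 10 if self.link_count % 2 == 0 else 100  # alternate 10/100 Mbps
            self.link_count += 1
            self.addLink(node, child, bw=bw)
        return node

if __name__ == '__main__':
    net = Mininet(topo=AltBwTree(), link=TCLink,
                  controller=lambda name: RemoteController(name, ip='127.0.0.1'))
    net.start()
    net.pingAll()   # sanity check before generating D-ITG / iperf traffic
    net.stop()
```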

An aspect of the TTL-based Looping method worth noting is its network overhead. The looping packet is 24 bytes in size (a 20-byte IP header and 4 bytes of hash payload), while a ping packet is typically 32 bytes (and can be varied). The Queue Length method effectively eliminates this network overhead by directly querying the switches. However, this method is far from perfect.

VI. CONCLUSION

We introduced two methods for latency calculation in this paper, each with its own advantages and drawbacks. The TTL-based looping method is a simplified yet precise implementation of a previous method, while the Queue Length method addresses most drawbacks of the TTL-based looping method but requires more fine-tuning of operational constants and an implementation at the switch level. However, both are promising approaches for real-time reporting of latency, considering that they can be queried directly from the controller.

REFERENCES

[1] “White Papers - Software-Defined Networking: The New Norm for Networks.” [Online]. Available: https://www.opennetworking.org/sdn-resources/sdn-library/whitepapers. [Accessed: 05-Oct-2014].
[2] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, and J. Turner, “OpenFlow: Enabling Innovation in Campus Networks,” SIGCOMM Comput. Commun. Rev., vol. 38, no. 2, pp. 69–74, Mar. 2008.
[3] “draft-wang-forces-compare-openflow-forces-01 - Analysis of Comparisons between OpenFlow and ForCES.” [Online]. Available: https://tools.ietf.org/html/draft-wang-forces-compare-openflow-forces-01. [Accessed: 06-Oct-2014].
[4] P. Francis, S. Jamin, C. Jin, Y. Jin, D. Raz, Y. Shavitt, and L. Zhang, “IDMaps: A Global Internet Host Distance Estimation Service,” IEEE/ACM Trans. Netw., vol. 9, no. 5, pp. 525–540, Oct. 2001.
[5] “Number of Internet Users (2015) - Internet Live Stats.” [Online]. Available: http://www.internetlivestats.com/internet-users/. [Accessed: 24-Jul-2015].
[6] “CIDR Report.” [Online]. Available: http://www.cidr-report.org/as2.0/#General_Status. [Accessed: 09-Jun-2015].
[7] V. Altukhov and E. Chemeritskiy, “On real-time delay monitoring in software-defined networks,” in Science and Technology Conference (Modern Networking Technologies) (MoNeTeC), 2014 First International, 2014, pp. 1–6.
[8] M. Feridun, International Federation for Information Processing, and Computer Society, Eds., Monitoring Latency with OpenFlow. Piscataway, NJ: IEEE, 2013.
[9] N. L. van Adrichem, C. Doerr, F. Kuipers, and others, “OpenNetMon: Network monitoring in OpenFlow software-defined networks,” in Network Operations and Management Symposium (NOMS), 2014 IEEE, 2014, pp. 1–8.
[10] J. Kurose and K. Ross, Computer Networking: A Top-Down Approach, 4th ed., vol. 1, 2012.
[11] S. Avallone, S. Guadagno, D. Emma, A. Pescapè, and G. Ventre, “D-ITG: Distributed Internet Traffic Generator,” 2004, pp. 316–317.