2013 Second GENI Research and Educational Experiment Workshop

Reliability and Scalability Issues in Software Defined Network Frameworks

Xinjie Guan and Baek-Young Choi
Department of Computer Science & Electrical Engineering, University of Missouri - Kansas City, Kansas City, USA
Email: {xinjieguan, choiby}@umkc.edu

Sejun Song
The Dwight Look College of Engineering, Texas A&M University, College Station, USA
Email: [email protected]

Abstract—The Software Defined Network (SDN) architecture has been proposed for its flexibility in deployment and management. As an implementation of the SDN architecture, the OpenFlow protocol decouples the data plane and the control plane, allowing flexible and programmable installation and management of forwarding rules. On the other hand, the decoupled structure consumes additional computational and network resources, which can even lead to fatal failures. In this study, we examine the reliability and scalability of SDN under disaster scenarios on a GENI testbed. Observations from our experiments show that more attention should be paid to improving the reliability and scalability of SDN and its frameworks.

I. INTRODUCTION

User mobility and server virtualization have become trends in networking that demand networks respond rapidly to dynamic, application-level requirements in order to build large-scale networks with high flexibility and scalability. The Software Defined Network (SDN) [8] architecture has been proposed to meet such dynamic computing requirements. In SDN, the data and control planes are decoupled, and a physical network and its network states are logically separated. As a result, networks can be customized so that dynamic computing can be accommodated in a fine-grained manner.

OpenFlow [8] is an embodiment of the SDN architecture. It standardizes the communication between the decoupled control and forwarding (data) planes, and defines direct access to network devices such as switches and routers. The control of network devices is then logically centralized in a control plane via the OpenFlow protocol. OpenFlow-based SDN leads a new direction in network architecture and has been embraced by large service providers such as Google and Facebook. However, due to the separation of the data plane and the control plane, we find that additional network and computational resources are consumed for each OpenFlow event, which may limit the scalability of SDNs and make them vulnerable to network disasters.

Many tools and applications developed on the SDN architecture focus on traffic engineering and network virtualization, such as FlowVisor [6] and HyperFlow [9]. Reliability and scalability of SDNs, however, have not been explored much.

We have conducted experiments on the protoGENI testbed [5] to investigate the reliability and scalability issues using a large number of flows. According to our experiments, both OpenFlow and FlowVisor are significantly impacted by a large number of newly inserted flows. This observation alerts us to the importance of reliability and scalability issues in SDN and its frameworks.
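To make the per-flow cost of the decoupled design concrete, the minimal sketch below simulates what happens when a packet misses a switch's flow table: the switch must hand the packet to a remote controller, wait for a forwarding decision, and only then forward. The names here (FlowKey, Switch, Controller) are illustrative assumptions, not the OpenFlow wire protocol; the point is that every new flow costs an extra controller round trip and extra controller CPU.

```python
# Minimal sketch (not the OpenFlow protocol): a switch whose flow-table miss
# triggers a controller round trip. Every *new* flow pays this extra cost.

from dataclasses import dataclass, field

@dataclass(frozen=True)
class FlowKey:                       # illustrative five-tuple match
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str

@dataclass
class Controller:
    rules_installed: int = 0
    def handle_packet_in(self, key: FlowKey) -> str:
        # The controller computes a route and returns a forwarding action.
        self.rules_installed += 1            # extra CPU/network cost per new flow
        return f"output:{hash(key) % 4}"     # pretend output-port choice

@dataclass
class Switch:
    controller: Controller
    flow_table: dict = field(default_factory=dict)
    packet_ins: int = 0
    def forward(self, key: FlowKey) -> str:
        action = self.flow_table.get(key)
        if action is None:                   # table miss -> report to controller
            self.packet_ins += 1
            action = self.controller.handle_packet_in(key)
            self.flow_table[key] = action    # rule installed for later packets
        return action

if __name__ == "__main__":
    sw = Switch(Controller())
    # 2000 distinct single-packet flows: every one of them misses the table.
    for p in range(2000):
        sw.forward(FlowKey("10.0.0.2", "10.0.0.4", 1024 + p, 80, "udp"))
    print(f"packet-ins sent to controller: {sw.packet_ins}")  # -> 2000
```

Because each distinct flow triggers one such event, a burst of new flows translates directly into controller load, which is the effect measured in Section II.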

Fig. 1. Experiment network topology

II. RELIABILITY AND SCALABILITY EXPERIMENTS

In order to verify the detrimental impact of excessively many new flows, we configured a dedicated network with GENI resources and tested scenarios in which varied numbers of new flows are injected into an OpenFlow switch. We used OpenFlow switches, NOX controllers [3], and the FlowVisor [6] framework. We point out that when an excessively large number of new flows are injected into an OpenFlow switch, the switch, as well as its controller and the concurrent applications, would be significantly impacted and even saturated. In practice, a large number of new flows may come from adversaries, legitimate burst events, or system misconfigurations.

We depict our experiment network topology in Figure 1. Four hosts and two controllers from three sites were employed, in which one host kept generating single-packet (46-byte) flows at various rates ranging from 250 flows per second to 2000 flows per second. Those new flows do not match any entry in the OpenFlow switches, and thus are reported to the controller. As illustrated in Figure 1, Host 2 generated and sent many new flows to Host 4 residing in a different site using a packet generator, packETH [4].
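For illustration, the following sketch approximates this traffic pattern with Scapy rather than packETH (whose exact configuration is not given in the paper); the interface name, addresses, and the choice of rotating UDP source ports to make each packet a distinct flow are assumptions. Each packet differs in its header five-tuple, so it misses the switch's flow table and triggers a report to the controller.

```python
#!/usr/bin/env python3
# Sketch: emit many distinct single-packet "new flows" at a fixed rate,
# approximating the packETH-based generator used in the experiment.
# Requires root privileges and scapy (pip install scapy); the interface
# and addresses below are placeholders for Host 2 -> Host 4.

import time
from scapy.all import Ether, IP, UDP, Raw, sendp

IFACE = "eth0"            # assumed experiment interface
SRC_IP = "10.10.1.2"      # assumed address of Host 2
DST_IP = "10.10.1.4"      # assumed address of Host 4

def inject_new_flows(rate_fps: int, duration_s: int = 10) -> None:
    """Send `rate_fps` single-packet flows per second for `duration_s` seconds.
    Rotating the UDP source port makes every packet a distinct flow, so each
    one misses the OpenFlow table and is reported to the controller."""
    interval = 1.0 / rate_fps
    port = 1024
    for _ in range(rate_fps * duration_s):
        pkt = (Ether() / IP(src=SRC_IP, dst=DST_IP) /
               UDP(sport=port, dport=9) / Raw(b"x"))    # tiny payload
        sendp(pkt, iface=IFACE, verbose=False)
        port = 1024 + (port - 1023) % 60000             # rotate source ports
        time.sleep(interval)                            # best-effort pacing

if __name__ == "__main__":
    for rate in (250, 500, 750, 1000, 1250, 1500, 1750, 2000):
        inject_new_flows(rate, duration_s=5)
```

The pacing here is best-effort; at the higher rates used in the experiments a purpose-built generator such as packETH gives more precise timing.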

Fig. 2. CPU utilization on the controller with varied numbers of new flows (x-axis: Flow Insertion Rate (flows/second), 250-2000; y-axis: CPU Utilization (%), user vs. system CPU)

Fig. 3. Packet loss rate between probe hosts with varied numbers of new flows (x-axis: Flow Insertion Rate (flows/second), 250-2000; y-axis: Loss Rate (%))

Meanwhile, ping probes were sent periodically from Host 1 to Host 3 at a rate of one packet per second. Both the generated new flows and the ping probes passed through the same switches in the two sites and the backbone network. Controller 1 and Controller 2, installed on protoGENI nodes [5], made the routing decisions when packets did not match any forwarding entry in the switches. Additionally, all these messages were processed through a FlowVisor placed between the switches and the controllers.

We first investigate the controller's CPU utilization, as shown in Figure 2. As the number of injected new flows increases, the controller's CPU utilization rises significantly, reaching as high as 80%. We also notice that when 2000 new flows were injected per second, the CPU utilization is slightly lower than in the case of 1750 flows per second. This might be due to the high packet loss rates that we shall see in Figure 3.

Next, we examine the impact on network resources by observing the packet loss rate of the ping probes in Figure 3. We find that the ping probes were substantially disrupted by the many new flows. Specifically, when 250 new flows per second are injected, the packet loss rate is already about 20%. The packet loss rate rises further as the new flow injection rate increases, and from around 1500 flows per second it becomes nearly impossible to receive any probe packet.

In addition to overloading controllers and exhausting network resources, we find that the many new flows also seriously burden the FlowVisor layer used for virtualization and isolation. When many new flows were injected, an excessive number of errors were thrown from the switches to a controller, and the FlowVisor recorded all of these errors for future debugging purposes. This can be fatal during a network disaster such as a flash crowd: hundreds of errors are thrown and recorded every second, eventually filling the disk and causing a system failure.
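The paper does not name the tool used to obtain the user/system CPU split in Figure 2, so the sketch below shows one standard way to sample it on a Linux controller node from /proc/stat; it is an assumed measurement approach, not the authors' script.

```python
# Sketch: sample user vs. system CPU utilization on a Linux node, as separated
# in Figure 2. Assumed /proc/stat-based approach, not the authors' tooling.

import time

def read_cpu_ticks():
    """Return (user, system, total) jiffies from the aggregate 'cpu' line."""
    with open("/proc/stat") as f:
        fields = f.readline().split()        # ['cpu', user, nice, system, idle, ...]
    values = list(map(int, fields[1:]))
    user = values[0] + values[1]             # user + nice
    system = values[2]
    return user, system, sum(values)

def sample_utilization(interval_s: float = 1.0):
    """Yield (user%, system%) once per sampling interval."""
    prev = read_cpu_ticks()
    while True:
        time.sleep(interval_s)
        cur = read_cpu_ticks()
        d_total = (cur[2] - prev[2]) or 1    # avoid division by zero
        yield (100.0 * (cur[0] - prev[0]) / d_total,
               100.0 * (cur[1] - prev[1]) / d_total)
        prev = cur

if __name__ == "__main__":
    for user_pct, sys_pct in sample_utilization():
        print(f"user {user_pct:5.1f}%  system {sys_pct:5.1f}%")
```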

III. CONCLUDING REMARKS AND FUTURE WORK

Some research has been done to offload controllers, including [1], in which a hierarchical structure of multiple controllers is used so that the CPU consumption on a single controller can be reduced. However, the traffic between switches and controllers is not lowered, and the approach may cause even more traffic than a single-layer structure because of the additional control traffic needed to organize the hierarchy. A safety constraint is stated in [2] as a programming paradigm for network virtualization, according to which end nodes and edge switches can only generate packets with their own IP addresses and process packets within their own subnets. This constraint may filter some malicious traffic but cannot prevent new-flow attacks. In [7], we proposed a Network Embedded Online Disaster (NEOD) management system. By placing an embedded module on OpenFlow switches, closer to the source of failure, unusual network activities are expected to be monitored and captured early without incurring the extensive computational and network resource consumption of involving the controllers. It improves the reliability and scalability of the controllers under disaster events.

In this study, we extended the observation of the reliability and scalability issues to an SDN framework beyond the controllers. This calls for attention to such issues in frameworks that are built upon OpenFlow switches. The next steps are to develop methods and metrics to evaluate SDN-related frameworks, and to seek and establish general guidelines for designing and implementing such frameworks without jeopardizing the reliability and scalability of SDN.

REFERENCES

[1] S. Hassas Yeganeh and Y. Ganjali. Kandoo: A framework for efficient and scalable offloading of control applications. In Proc. ACM First Workshop on Hot Topics in Software Defined Networks, pages 19-24, 2012.
[2] A. Milanova, S. Fahmy, D. Musser, and B. Yener. A secure programming paradigm for network virtualization. In Proc. 3rd International Conf. on Broadband Communications, Networks and Systems, pages 1-10, 2006.
[3] NOX. http://www.noxrepo.org/.
[4] packETH. http://packeth.sourceforge.net/packeth/Home.html.
[5] ProtoGENI. http://www.protogeni.net/.
[6] R. Sherwood, G. Gibb, K. Yap, G. Appenzeller, M. Casado, N. McKeown, and G. Parulkar. FlowVisor: A network virtualization layer. OpenFlow Switch Consortium, Tech. Rep., 2009.
[7] S. Song, S. Hong, X. Guan, B.-Y. Choi, and C. Choi. NEOD: Network embedded on-line disaster management framework for software defined networking. In Proc. IFIP/IEEE International Symp. on Integrated Network Management, Mini-Conference, 2013, accepted.
[8] The OpenFlow Switch Consortium. http://www.openflowswitch.org.
[9] A. Tootoonchian and Y. Ganjali. HyperFlow: A distributed control plane for OpenFlow. In Proc. USENIX Internet Network Management Conf. on Research on Enterprise Networking, 2010.

