Cost Efficient Rule Management and Traffic Engineering for Software Defined Networks

Huawei Huang

A DISSERTATION SUBMITTED IN FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY IN COMPUTER SCIENCE AND ENGINEERING

Graduate Department of Computer and Information Systems
The University of Aizu
2016

Copyright by Huawei Huang All Rights Reserved

Acknowledgements

I would like to thank all who helped me during my Ph.D. career. In particular, Prof. Song Guo taught me many useful research skills and gave me plenty of advice on the appropriate manner in which to communicate, cooperate, and get along with others. All those words are a treasure to me; they will always shine in my mind like a navigation light when I am struggling in the darkness, and give me great power to conquer the challenges that appear in both my future research career and my life. I also gratefully acknowledge the detailed comments and constructive suggestions made by the other review committee members: Prof. Shuxue Ding, Prof. Miyazaki, and Prof. Incheon Paik. I have revised this dissertation carefully, taking all the comments and suggestions from the reviewers into consideration; they have much improved the quality of this dissertation. I also thank all the current and former members of my lab: I have had a happy life in this lab thanks to your companionship and cooperation. Finally, I would like to express my deepest gratitude to my family: my parents, my wife, my younger sister, and my younger brother. Without your great support and love, I could not have achieved what I have. Thank you all, truly.

Contents

1 Introduction
  1.1 Background
    1.1.1 SDN Rules
    1.1.2 Critical Resource in SDN Networks
    1.1.3 Rule Installation and Caching
    1.1.4 Rule Update for Link Failure
  1.2 Motivation and Consistency
  1.3 Contributions of this Dissertation
  1.4 Organization of Dissertation

2 Fundamentals and Related Work
  2.1 Preliminary of SDN
    2.1.1 Architecture of SDN Networks
    2.1.2 Benefits of SDN
  2.2 State-of-the-Art Cost-Efficient Rule Management and Traffic Engineering
    2.2.1 Development of SDN
    2.2.2 Rationale of TCAM
    2.2.3 Cost-Efficient TCAM Usage
    2.2.4 Rule Installation and Caching
    2.2.5 Traffic Engineering with Rule Compression
    2.2.6 Failure Recovery for SDN Networks

3 Joint Optimization of Rule Placement and Traffic Engineering for QoS Provisioning in Software Defined Network [1]
  3.1 Motivation and Problem Statement
  3.2 System Model and Assumptions
    3.2.1 Problem Complexity Analysis
  3.3 Optimization with Candidate Paths
  3.4 Optimization without Candidate Paths
  3.5 Heuristic Algorithms
  3.6 Case Study
    3.6.1 Simulation Settings
    3.6.2 Solutions under Given Candidate Paths
    3.6.3 Solutions without Candidate Paths
  3.7 Performance Evaluation
    3.7.1 Performance of the nonRM-CP and RM-CP
    3.7.2 Performance of the nonRM-nonCP and RM-nonCP
  3.8 Summary

4 Cost Minimization for Rule Caching in Software Defined Networking [2]
  4.1 Motivation and Problem Statement
  4.2 System Model and Assumptions
  4.3 Formulation
  4.4 Offline Algorithm
  4.5 Online Algorithms
    4.5.1 Typical Actions in Optimal Solutions
    4.5.2 Online Exactly Match the Flow Algorithm
    4.5.3 Online Extra η Time-slot Caching Algorithm
  4.6 Evaluation
    4.6.1 Simulation Settings
    4.6.2 Evaluation of Offline Algorithm
    4.6.3 Evaluation of Online EMF and ECA
    4.6.4 Evaluation of Special Case of ECA
  4.7 Summary

5 Near-Optimal Routing Protection for In-Band Software-Defined Networks [3]
  5.1 Motivation and Problem Statement
    5.1.1 Motivation
    5.1.2 Our Goal
  5.2 System Model and Formulation
    5.2.1 Preliminary
    5.2.2 System Model and Assumptions
    5.2.3 Problem Formulation
  5.3 Near-Optimal Path Selection Algorithm
    5.3.1 Log-Sum-Exp Approximation Approach
    5.3.2 Markov Chain Design
    5.3.3 Implementation of MC Guided Algorithm
  5.4 Online Handling and Theoretical Analysis under Single-Link Failure
    5.4.1 Operations When a Link Fails
    5.4.2 Theoretical Performance Fluctuation of Single-Link Failure
    5.4.3 Case Study under '1+1' Protection Scheme
  5.5 Performance Evaluation
    5.5.1 Methodology and Simulation Settings
    5.5.2 Representative Execution Case of Algorithms
    5.5.3 Case Study of Single Link Failure
    5.5.4 Performance of Alg. 8 in the Initial Stage
    5.5.5 Performance of Alg. 8 under Single-Link Failure
  5.6 Proof of Theorem 8
  5.7 Proof of Lemma 5
  5.8 Proof of Theorem 9
  5.9 Summary

6 Conclusion

List of Figures

1.1 Structure and content of this dissertation.
2.1 SDN architecture.
2.2 The rationale of CAM based lookup operation.
3.1 The motivation case: traffic engineering and duplicated rules placement in traditional SDN enabled networks.
3.2 Constructed instance of rule placement problem.
3.3 An example for path searching.
3.4 The internal architecture of simulation and the relations between components.
3.5 Case study of four schemes with a 10-node scaled network; the rules are placed into the data plane according to the solutions obtained by solving the four optimizations.
3.6 The optimal rule space occupation cost of nonRM-CP and RM-CP, comparing the rule space occupation cost of the nonRM and RM schemes when candidate paths are provided.
3.7 QoS satisfaction ratio of nonRM-CP and RM-CP, comparing the QoS satisfaction degree of the nonRM and RM schemes when candidate paths are provided.
3.8 Rule space occupation of fast heuristic algorithms under the nonRM-CP and RM-CP schemes in randomly generated large-scale networks.
3.9 Rule space occupation of nonRM-nonCP and RM-nonCP under a partial ITALYNET network with 10 nodes, comparing the performance of Alg. 1 against optimal solutions.
3.10 Rule space occupation of nonRM-nonCP and RM-nonCP under randomly generated networks with 30 nodes, comparing the nonRM and RM schemes under the CP and nonCP cases, respectively.
4.1 The sniffed TCP traffic flow [4].
4.2 Illustration of typical actions in optimal solution.
4.3 Rules are cached in Ei because of CiE action.
4.4 An example of ECA solution.
4.5 Performance of offline algorithms while varying γ.
4.6 Performance of offline algorithms while varying l.
4.7 Performance of EMF and ECA over OPT-A.
4.8 Performance of EMF and ECA over OPT-B.
4.9 Performance of EMF and ECA over OPT-C while varying γ and η.
4.10 Performance of EMF and ECA over OPT-C while varying l.
4.11 Performance of ECA over OPT-A, OPT-B and OPT-C under a special case.
5.1 An illustrative link failure in an in-band SDN.
5.2 The protection of control-plane traffic for controller-switch sessions; the number of controllers can be more than one, but only a single controller is illustrated here.
5.3 State machine for each session in the proposed algorithm.
5.4 An example of operations when a single-link failure occurs.
5.5 The 26-node Fat-tree topology; the controller connects to the gateway node.
5.6 Representative execution of algorithms under the Fat-tree topology (initial |Js|=5, |Ds|=2). Alg. 8 converges both in the initial stage and after the link failure; the joint system cost includes the largest link overhead, measured by the traffic rate (Mb/s), and the average node configuration overhead, measured by the average number of configurations per switch node.
5.7 Link overhead distribution in the core links under the Fat-tree topology before/after the critical link (0,3) fails. Alg. 8 always shows near-optimal performance in terms of the aggregated traffic rates in the core links, while Alg. LR exhibits a sharp increase in aggregated traffic rate on the links neighboring the failed one.
5.8 Convergence property of algorithms under the Fat-tree topology; Alg. 8 clearly outperforms the benchmark algorithms.
5.9 Performance of Alg. 8 when varying the number of initial candidate paths for each session in the initial stage (Init.) and after the link failure (a.l.f.) under the Fat-tree topology. A larger candidate path set increases the cost fluctuation but reduces the convergence time after the link failure.

List of Tables

2.1 Comparisons on the energy-efficient TCAM Usage
3.1 Notations and Symbols for the First Topic
3.2 20 SDN rules used in case study
4.1 Notations and Symbols for the Second Topic
5.1 Notations and Symbols for the Third Topic

Nomenclature

• API: Application Programming Interface
• ASIC: Application-Specific Integrated Circuit
• BCAM: Binary Content Addressable Memory
• CAM: Content Addressable Memory
• CP: Candidate Path
• CPRP: Control-Plane Routing Protection
• ECA: Extra Cache Algorithm
• EMF: Exactly Match the Flow
• MA: Markov Approximation
• MC: Markov Chain
• MWFP: Minimum Weighted Flow Provisioning
• nonCP: non Candidate Path
• nonRM: non Rule Multiplexing
• QoS: Quality of Service
• RM: Rule Multiplexing
• SDN: Software-Defined Networking
• SRAM: Static Random-Access Memory
• TCAM: Ternary Content Addressable Memory
• TCP: Transmission Control Protocol
• TE: Traffic Engineering
• TLS: Transport Layer Security


Abstract

Software-Defined Networking (SDN) is a promising network paradigm that separates the control plane and the data plane of a network. It has shown great advantages in simplifying network management, since new functions can be easily supported without physical access to routers or switches. In SDN networks, Ternary Content Addressable Memory (TCAM) is a critical piece of hardware used to store forwarding rules for high-speed packet processing in SDN-enabled devices. However, because it is expensive and energy-consuming, each device can be supplied with only a very limited amount of it. Therefore, this dissertation studies three primary issues of cost-efficient rule management for SDN networks.

First, because rules can be deployed into network switches in a static SDN environment, we study the rule placement problem with the objective of minimizing rule consumption for multiple unicast sessions under QoS constraints. To this end, we propose a rule multiplexing scheme, in which the same set of forwarding rules deployed on each node applies to the whole flow of a session going through that node, even when the flow is split over different paths. Based on this scheme, we formulate the rule placement problem jointly considering link bandwidth and rule space constraints under both the existing scheme and our rule multiplexing scheme. Based on an extensive review of the state-of-the-art work, we are, to the best of our knowledge, the first to study the rule multiplexing problem. Extensive simulations show that our proposed approaches significantly outperform existing solutions.

Secondly, in an online SDN environment, each traffic flow is shaped by a set of associated forwarding rules that are maintained by switches in their local TCAM-based flow tables. Since rules must be deployed or removed depending on the varying traffic pattern, it is worthwhile to study the rule caching problem in an online environment. As mentioned, since TCAM is expensive, each switch has only limited TCAM space, and it is inefficient and even infeasible to maintain all rules at local switches. On the other hand, if we eliminate TCAM occupation by forwarding all packets to the centralized controller for processing, the result is long delays and a heavy processing burden on the controller. Therefore, in the second topic, we are motivated to study the trade-off between local packet processing and remote packet processing. To this end, we formulate a Minimum Weighted Flow Provisioning (MWFP) problem with the objective of minimizing the total cost in terms of TCAM occupation and remote packet processing. We propose an efficient offline algorithm for the case where the network traffic is given, and two online algorithms with guaranteed competitive ratios otherwise. Finally, we conduct extensive trace-driven experiments using real network traffic traces. The evaluation results demonstrate that our proposed algorithms can significantly reduce the total cost, and that the solutions obtained are nearly optimal.

Thirdly, SDN brings a number of advantages along with many challenges; one particular concern is the resilience of the in-band control channels. Existing approaches mainly rely on a local rerouting policy when providing routing protection for the target sessions in SDN networks. However, such a policy can cause congestion on the links neighboring the failed one. Therefore, in the third topic, we observe that rules must be updated in response to link failures in an SDN network. Aiming to provide robust routing protection for the control plane of SDN networks, we strive to find cost-efficient rule update solutions by studying a weighted cost minimization problem, in which the control-plane traffic load balancing and the control-channel setup cost are jointly taken into consideration. Since this problem is NP-hard, we propose a Markov Approximation (MA) based near-optimal approach to solve it. We then extend our solution to an online case that handles single-link failures one at a time. The performance fluctuation incurred by a single-link failure is also analyzed with theoretical derivation. Extensive numerical results show that the proposed MA based algorithm exhibits fast convergence and highly efficient resource consumption in terms of rule deployment cost and link bandwidth utilization.

Chapter 1

Introduction

In this chapter, we first present the background of this dissertation. Then, the motivation and consistency of the three topics are given. Finally, the structure of the dissertation is presented.

1.1 Background

Software Defined Networking (SDN) has been viewed as the next-generation network paradigm [5–7]. It simplifies network management by decoupling the control and data planes, so that complicated control logic no longer needs to be installed on packet-forwarding devices such as switches or routers, but instead resides at a logically centralized network operating system called the controller. SDN has shown great advantages in simplifying network management: network operators can implement their own protocols, rules, and policies with common programming languages. As a result, operators can achieve flexible control over network services such as traffic engineering [8–10], Quality of Service (QoS) [11, 12], and security [13–15]. This dissertation studies three primary issues for SDN networks. We argue that the critical resource in SDN networks is the rule-table space in switches, so the proposed solutions all address the problem of cost-efficient rule management for SDN networks. For the first topic, since rules can be deployed into switches in a static manner, the proposed approach aims to decrease the total rule consumption. For the second topic, the proposed solution manages rules in an online environment; that is, rules can be installed and removed depending on the traffic patterns, such that the overall rule consumption is minimized. The third topic focuses on the robustness of network links in SDN networks: when a link fails, how to update the rules in a cost-efficient manner is an important problem that is worth studying.


In the following, we introduce the background of the three primary topics on which we focus.

1.1.1 SDN Rules

In SDN networks, each SDN switch in the data plane forwards data according to the flow-table entries (also called SDN rules) installed by the controller. Each forwarding rule can be expressed in the form ⟨Match, Action⟩, in which the Match field is matched against the packet header. If a rule is matched, the switch applies the actions specified in the Action field to the packet. For example, the rule ⟨Match: {ip, nw_src=100.0.0.1, nw_dst=100.0.0.2}, Action: output:3⟩ indicates that packets with source IP address 100.0.0.1 and destination IP address 100.0.0.2 will be forwarded to output port 3 of the switch. According to the OpenFlow specification [16], a flow table entry consists of multiple matching and action fields. Once all conditions specified in the matching fields are satisfied, the corresponding actions specified in the action field are executed by the host switch. Some representative examples of matching fields are given as follows.

• dl_src: source data-link-layer (MAC) address of the packet
• nw_dst: destination network-layer address of the packet
• dl_type: protocol type of the packet
• in_port: incoming port number of the packet

In the action field, the fundamental function is routing, denoted by the keyword Output. Other actions, e.g., Set-Queue, Drop, Push/Pop VLAN or MPLS Tag, and Set-Field, are applied to provide QoS support, secure access control, network management, and modification of packet header fields, respectively. These non-routing actions greatly improve the usefulness of OpenFlow implementations, e.g., the network management, access control, and VLAN examples reviewed in [5].
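As a toy illustration of the ⟨Match, Action⟩ abstraction, the following minimal Python sketch matches a packet header against a rule; the dictionary-based representation is an assumption made purely for illustration and is not the OpenFlow wire format.

    # A minimal sketch of <Match, Action> rule matching; the dict-based rule
    # representation below is illustrative, not the OpenFlow wire format.
    def matches(rule_match, packet_header):
        # A packet matches a rule if every field specified in the Match part
        # equals the corresponding header field; unspecified fields act as
        # wildcards.
        return all(packet_header.get(field) == value
                   for field, value in rule_match.items())

    # The example rule from the text: forward 100.0.0.1 -> 100.0.0.2 to port 3.
    rule = {"match": {"nw_src": "100.0.0.1", "nw_dst": "100.0.0.2"},
            "action": ("output", 3)}

    packet = {"in_port": 1, "dl_src": "aa:bb:cc:dd:ee:01",
              "nw_src": "100.0.0.1", "nw_dst": "100.0.0.2"}

    if matches(rule["match"], packet):
        print("apply action:", rule["action"])   # -> apply action: ('output', 3)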

1.1.2 Critical Resource in SDN Networks

Over the last decade, Ternary Content Addressable Memory (TCAM) has become the dominant hardware for providing very high-speed forwarding operations in packet-switching networks. For example, a commercial TCAM chip named R8A20410BG supports a 20 Mbit density working at 360 MHz per table, which means it can perform up to 360 million searches per second per table.


While TCAMs offer line-rate lookup speed, they also come with disadvantages such as a high cost-to-density ratio (about 350 US dollars for a 1 Mbit chip) and high power consumption (15-30 Watts per Mbit). For these reasons, TCAM has been limited to storing wildcard rules in packet-switching devices and its use must be planned carefully. Consequently, a number of cost-efficient TCAM usage approaches have been proposed in recent work on packet-switching networks. Surveying the state-of-the-art literature, we find that these approaches can be classified into three categories: reducing TCAM usage opportunities [17, 18], always utilizing only part of the TCAM [19–21], and compacting the size of forwarding rules [1, 22–25]. We also find that cost-efficient TCAM usage has not yet been well studied in the context of traffic engineering for SDN networks.
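A back-of-the-envelope calculation, using only the per-Mbit figures quoted above and the 20 Mbit chip size mentioned in the previous paragraph, makes the scarcity concrete; it is a rough illustration, not a vendor quotation.

    # Rough figures from the numbers quoted above:
    # about 350 USD and 15-30 W per Mbit of TCAM.
    price_per_mbit_usd = 350
    power_per_mbit_w = (15, 30)
    chip_mbit = 20                       # the 20 Mbit chip cited earlier
    print("chip cost:", price_per_mbit_usd * chip_mbit, "USD")   # 7000 USD
    print("power per Mbit:", power_per_mbit_w, "W")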

1.1.3 Rule Installation and Caching

In SDN, there are two main ways of installing rules into switches: the reactive way [6, 13] and the proactive way [26–28].

Reactive Rule-Installation

A typical reactive rule-installation procedure [6, 29] between a pair of users (say users A and B) contains three steps: 1) user A sends out packets after connection initialization; once a packet arrives at a switch without matching flow-table entries, the packet is forwarded to the controller; 2) upon receiving the packet, the controller decides whether to allow or deny this flow according to network management policies; 3) if the flow is allowed, the controller installs the corresponding rules on all switches along the path, so that subsequent packets can be processed locally at the switches by the installed rules. Note that, for caching the installed rules, the controller usually sets an expiration time, which defines the maximum time a rule is maintained when no packet of the associated flow arrives at the switch. Reactive rule installation and caching has been widely adopted by existing work [6, 13] because of its on-demand use of TCAM space.

Proactive Rule-Installation

On the other hand, other studies [30, 31] argue that reactive rule-installation is time-consuming because of remote rule fetching, leading to heavy overhead in packet processing [8, 27, 29]. To reduce the response time for packets that arrive at switches with no matching rules, proactive rule-installation has been proposed to install rules into switches before the corresponding packets arrive. This approach has been shown to impose minimal overhead on the network when driven by traffic prediction [8], and to reduce network recovery time when switches cache pre-computed backup rules in case of network failures [28].

Which Way of Rule-Installation is the Best Choice?

Naturally, two questions arise: 1) which way of rule installation is the best choice at a given time, and 2) how long should an installed rule be cached in a flow table? These questions motivate us to study the rule caching problem with the objective of minimizing the sum of the remote processing cost and the local forwarding-table occupation cost, jointly considering various traffic patterns in networks.
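The following self-contained Python sketch illustrates this trade-off in miniature: cache misses incur a remote-processing cost at the controller, while cached rules incur a per-time-slot TCAM occupation cost. The costs, the idle timeout, and the packet trace are illustrative assumptions, not values from this dissertation.

    # Toy simulation of the caching trade-off: misses cost remote processing,
    # cached rules cost TCAM occupation per time slot. All numbers are
    # illustrative assumptions.
    REMOTE_COST = 5      # cost of diverting a packet to the controller
    TCAM_COST = 1        # cost per cached rule per time slot
    IDLE_TIMEOUT = 3     # evict a rule after this many idle time slots

    def simulate(arrivals):
        """arrivals[t] is the set of flow ids seen in time slot t."""
        last_hit = {}                 # flow id -> last slot with a match
        total = 0
        for t, flows in enumerate(arrivals):
            # evict rules idle for longer than the timeout
            last_hit = {f: s for f, s in last_hit.items()
                        if t - s <= IDLE_TIMEOUT}
            total += TCAM_COST * len(last_hit)       # occupation cost
            for f in flows:
                if f not in last_hit:
                    total += REMOTE_COST             # miss: reactive install
                last_hit[f] = t                      # rule (re)cached
        return total

    # Bursty flow "a" and sporadic flow "b": a longer timeout saves remote
    # processing for "a" but wastes TCAM slots on "b".
    trace = [{"a"}, {"a"}, set(), {"a", "b"}, set(), set(), set(), {"b"}]
    print(simulate(trace))   # total cost for this trace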

1.1.4 Rule Update for Link Failure

In an SDN network, the connection between a switch and a controller is used to exchange control-plane traffic, e.g., OpenFlow messages and the collected global network statistics [26]. The global network information is critical for control policies to make centralized decisions. Network statistics, such as the traffic rate on each link and the available flow-table space in each forwarding device, should be collected as extensively as possible and reported to the controller via secure channels. The controller may respond with new instructions to each device. Such bidirectional communication contributes a non-negligible fraction of the control traffic [32]. A controller usually interacts with SDN switches via out-of-band control [33] connections in a dedicated network [34, 35]. The advantages of such an out-of-band network are twofold: (1) high security is provided for control signals because a separate network is used for communication; and (2) the controller-to-switch connection is still available through the separate network even if failures occur in the data plane. However, an out-of-band dedicated network is expensive to build. Furthermore, building such a separate network may not be feasible in some scenarios, such as widely distributed access networks. Therefore, in a large-scale network with hundreds or even thousands of switches, an alternative, less expensive way is to use so-called in-band connections [36] for the control plane. In this fashion, a controller establishes communication with a target switch through a multihop routing path consisting of multiple intermediate switches. By using Transport Layer Security (TLS) or Transmission Control Protocol (TCP) connections, the control-plane traffic can be relayed over the in-band controller-switch channels. The use of in-band connections can be found in both wired networks [37–40] and wireless networks [41–44]. Although the in-band connection is a practical approach, it also comes with many challenges. One particular challenge is how to provide resilient communication between switch and controller in case of link failures.


In practice, link failures occur in a network randomly and dynamically, forcing the controller to re-install new rules for all the affected routing paths. Even when only a single link fails, the expense of refreshing the current solution can be significant or even intractable. Therefore, designing cost-efficient strategies that can handle dynamic link failures in an online fashion is a critical challenge. To this end, we study a weighted cost minimization problem, in which the control-plane traffic load balancing and the control-channel setup cost are jointly considered when selecting protection paths for control channels.
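As a rough illustration of what "jointly considered" means, the sketch below scores a candidate assignment of protection paths by a weighted sum of the largest link load and the total setup cost. The weight, the per-hop setup cost, and the example data are assumptions for illustration, not the formulation developed later in this dissertation.

    # Toy weighted objective: alpha * (largest link load) +
    # (1 - alpha) * (total control-channel setup cost).
    from collections import defaultdict

    ALPHA = 0.5   # assumed weight between the two cost terms

    def joint_cost(paths, demand, setup_cost_per_hop=1.0):
        """paths: {session: [link, ...]}; demand: {session: rate in Mb/s}."""
        load = defaultdict(float)
        setup = 0.0
        for session, links in paths.items():
            setup += setup_cost_per_hop * len(links)
            for link in links:
                load[link] += demand[session]
        return ALPHA * max(load.values()) + (1 - ALPHA) * setup

    paths = {"s1": [("ctrl", "a"), ("a", "sw1")],
             "s2": [("ctrl", "a"), ("a", "sw2")]}
    demand = {"s1": 10.0, "s2": 4.0}
    print(joint_cost(paths, demand))   # 0.5 * 14 + 0.5 * 4 = 9.0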

1.2 Motivation and Consistency

From the background discussed above, we can see that the major resource in SDN networks is the TCAM. The common thread of the three topics of this dissertation is therefore that the proposed solutions all address cost-efficient rule management. In detail, for the first topic, since rules can be deployed into switches in a static manner, the objective is to decrease TCAM consumption. For the second topic, the proposed solution manages rules in an online environment: rules can be installed and later removed so that the overall rule consumption is minimized. The third topic is related to the robustness of network links: if a link is disconnected, the question is how to update the rules so as to achieve a trade-off between TCAM consumption and link-recovery efficiency. In summary, the motivation of this dissertation is how to perform cost-efficient rule management under three different scenarios of SDN networks.

1.3 Contributions of this Dissertation

Aiming to achieve cost-efficient rule management in the three aforementioned scenarios of SDN networks, the contributions of this dissertation are summarized as follows:

• First, because rules can be deployed into network switches in a static SDN environment, we study the rule placement problem with the objective of minimizing rule consumption for multiple unicast sessions under QoS constraints. We prove this optimization problem to be NP-hard. When a set of possible candidate paths for each session is given, we formulate the optimization problems under both the existing scheme and our rule multiplexing scheme, i.e., the CP-based RM and nonRM optimizations. We then study a more challenging scenario, where no candidate paths are provided, via a joint optimization of routing and rule placement.


• Next, since rules should be deployed or removed depending on the varying traffic pattern, we study the rule caching problem in an online environment, with the objective of minimizing the total cost of remote processing and local forwarding-table occupation. We propose an offline algorithm adopting a greedy strategy for the case where the network traffic is given in advance. We also devise two online algorithms with guaranteed competitive ratios.

• Finally, rules should be updated in response to link failures in an SDN network. Aiming to provide a robust routing-protection mechanism for the control plane of SDN networks, we strive to find cost-efficient rule-update solutions; our approach can also be extended to routing protection in the data plane. To solve the proposed weighted cost minimization problem, a near-optimal algorithm is proposed using the Markov approximation technique. In particular, we design a Markov chain with a state space representing all feasible protection solutions and a well-devised transition-rate matrix, such that the theoretical performance of the proposed algorithm can be guaranteed. Furthermore, we extend our solution to an online case that can handle dynamic single-link failures one at a time. The incurred performance fluctuation is also analyzed with theoretical derivation.

Figure 1.1: Structure and content of this dissertation.

1.4 Organization of Dissertation

The structure of this dissertation is illustrated in Fig. 1.1, and the dissertation is organized as follows. Chapter 2 discusses various fundamental concepts and reviews recent studies on cost-efficient rule allocation and traffic engineering in the literature. Chapter 3 studies how to exploit traffic engineering to lower the consumption of forwarding rules in the data plane of SDN networks, and proposes a novel rule multiplexing mechanism. Chapter 4 further studies the rule caching problem with the objective of minimizing the sum of the remote processing cost and the local forwarding-table occupation cost. Chapter 5 focuses on routing protection for control-plane traffic in SDN networks, presenting a near-optimal routing-protection scheme for the in-band control plane. Finally, Chapter 6 concludes this dissertation.


Chapter 2

Fundamentals and Related Work

2.1 Preliminary of SDN

2.1.1 Architecture of SDN Networks

Software Defined Networking has been envisioned as the next-generation network infrastructure [5, 6, 28], which promises to simplify network management by decoupling the control plane and the data plane [31, 45]. By shifting the control plane to a logically centralized controller, SDN offers programmable functions to dynamically control and manage packet forwarding and processing in switches, making it easy to deploy a wide range of network management policies and new network technologies, such as traffic engineering [8–10], Quality of Service [11, 12], access control management [14, 15], failure diagnosis [46], and failover mechanisms [39, 47–49]. Fig. 2.1 depicts a logical view of the SDN architecture. Network intelligence is logically centralized in software-based controllers, which maintain a global view of the network. As a result, the network appears to the business applications as a single, logical switch. With SDN, enterprises and carriers gain vendor-independent control over the entire network from a single logical point, which greatly simplifies network design and operation. SDN also greatly simplifies the network devices themselves, since they no longer need to understand and process thousands of protocol standards but merely accept instructions from the SDN controllers.


Figure 2.1: SDN architecture (application layer: business applications; control layer: SDN controller software and network services, exposed through APIs; infrastructure layer: network devices, reached via a control-data-plane interface such as OpenFlow).

2.1.2 Benefits of SDN

As an open-source implementation of the SDN paradigm, OpenFlow [26] has attracted much attention from both industry and academia. A group of large companies, including Google, Microsoft, Facebook, Cisco and AT&T, have shown great interest in OpenFlow and formed the ONF (Open Networking Foundation) [16] to standardize the OpenFlow protocols. In an OpenFlow-enabled network, the controller periodically communicates with all the switches it manages via secure channels to obtain network information. With a global view of the network, the controller dynamically installs and updates forwarding rules at switches to implement management policies. For enterprise and carrier networks, SDN makes it possible for the network to be a competitive differentiator, not just an unavoidable cost center. OpenFlow-based SDN technologies enable network operators to address the high-bandwidth, dynamic nature of modern applications, adapt the network to ever-changing business needs, and significantly reduce operations and management complexity. The benefits [50] that enterprises and carriers can achieve through an OpenFlow-based SDN architecture include:

• Centralized control of multi-vendor environments: the SDN controller can control any OpenFlow-enabled network device from any vendor, including switches, routers, and virtual switches. Rather than having to manage groups of devices from individual vendors, operators can use SDN-based orchestration and management tools to quickly deploy, configure, and update devices across the entire network.

• Reduced complexity through automation: OpenFlow-based SDN offers a flexible network automation and management framework, which makes it possible to develop tools that automate many management tasks that are done manually today.

• Higher rate of innovation: SDN adoption accelerates business innovation by allowing network operators to literally program and reprogram the network in real time to meet specific business needs and user requirements as they arise. By virtualizing the network infrastructure and abstracting it from individual network services, SDN and OpenFlow enable introducing new services and network capabilities in a matter of hours.

• Increased network reliability and security: SDN makes it possible for IT to define high-level configuration and policy statements, which are then translated down to the infrastructure via OpenFlow. An OpenFlow-based SDN architecture eliminates the need to individually configure network devices each time an end point, service, or application is added or moved, or a policy changes, which reduces the possibility of network failures due to configuration or policy inconsistencies. Because SDN controllers provide complete visibility and control over the network, they can ensure that access control, traffic engineering, quality of service, security, and other policies are enforced consistently across the wired and wireless network infrastructures, including branch offices, campuses, and data centers. Enterprises and carriers benefit from reduced operational expenses, more dynamic configuration capabilities, fewer errors, and consistent configuration and policy enforcement.

• More granular network control: OpenFlow's flow-based control model allows IT to apply policies at a very granular level, including the session, user, device, and application levels, in a highly abstracted, automated fashion. This control enables cloud operators to support multi-tenancy while maintaining traffic isolation, security, and elastic resource management when customers share the same infrastructure.

• Better user experience: By centralizing network control and making state information available to higher-level applications, an SDN infrastructure can better adapt to dynamic user needs. For instance, a carrier could introduce a video service that offers premium subscribers the highest possible resolution in an automated and transparent manner. Today, users must explicitly select a resolution setting, which the network may or may not be able to support, resulting in delays and interruptions that degrade the user experience. With OpenFlow-based SDN, the video application would be able to detect the bandwidth available in the network in real time, and automatically adjust the video resolution accordingly.

2.2 State-of-the-Art Cost-Efficient Rule Management and Traffic Engineering

2.2.1 Development of SDN

As the first attempt at building a network operating system at a large scale, NOX [13] achieves a simple programming model for control functions based on OpenFlow. Later, Maestro [51] exploits parallelism with additional throughput optimization techniques while keeping the simple programming model for programmers. FlowVisor [52] is the first testbed for SDN, which slices the network hardware by placing a layer between the control plane and the data plane. Its basic idea is that if unmodified hardware supports some basic primitives, then a worldwide testbed can ride on the coat-tails of deployments without extra expense. Recently, SDN-enabled switches and routers have been deployed in real large-scale networks, such as Google's G-scale network [53]. Ethane [6] has been proposed as a new network architecture for the enterprise, which allows managers to define a single network-wide fine-grained policy and then enforces it directly.

2.2.2 Rationale of TCAM

In Ethernet networks, switches and routers must deliver bandwidth-hungry services such as Voice over Internet Protocol (VoIP), IP Television (IPTV), Video on Demand (VoD), and wireless 3G/4G with the appropriate Quality of Service levels. In order to build the platforms necessary to manage large amounts of network traffic quickly and effectively, system designers increasingly rely on advanced Content Addressable Memory (CAM), especially TCAM devices, to perform ultra-fast packet searches. A CAM compares an input search word, such as the match fields in a packet header, against a table of stored forwarding rules and returns the address of the matched data. A CAM can finish a complete lookup operation over all stored rules in a single clock cycle; it is therefore popular in high-throughput systems. Fig. 2.2(a) illustrates an example of the lookup operation. When a packet with source IP address "100.0.0.1" arrives at a switch, the packet header is compared against the rule prefixes stored in the CAM-based table. The matched prefix, such as the shaded one, activates the corresponding matchline, which generates an encoded signal. After this signal is decoded by the decoder, the predefined action, such as "100.0.0.1: Output 3", is copied to the action-execution module. Finally, the processed packet leaves the current switch. In general, there are two types of CAM: Binary CAM (BCAM) and TCAM. The former stores full entries and performs an exact 0/1 lookup against


each bit of the search data, while the latter can store wildcard entries and goes beyond binary comparison. In a wildcard entry, an "X" value, called a 'don't care' bit, can also be represented, indicating that the corresponding bit of the search data is not taken into consideration when comparing against the stored rule. This feature is very useful in many applications, such as prefix matching in IP lookup and range queries for packet classification. In order to support three states for each bit in a rule, i.e., match 0, match 1, and don't care, each TCAM cell is encoded using two physical bits. For example, Fig. 2.2(b) illustrates a NOR-type TCAM cell, which contains two Static Random Access Memory (SRAM) cells representing two physical bits D0 and D1. Since each physical bit can represent two binary states, the combination of D0 and D1 can denote four possible states, of which only three are required for ternary storage. Fig. 2.2(c) shows the ternary encoding table for the NOR-type TCAM cell, where D0=0, D1=1 and D0=1, D1=0 store the logical ternary symbols "0" and "1", respectively. Additionally, the cell allows searching for an "X" symbol by setting both search lines SL0 and SL1 to logic "0"; this is an external 'don't care' that forces a match of a bit regardless of the stored value. Therefore, using TCAM, a packet-forwarding device can perform wildcard lookup operations. In the early days of Internet protocol routers, lookup speed was unable to keep up with the growth of link bandwidth, and TCAMs were adopted to build high-throughput forwarding engines for routers and switches [22]. Due to the realization of the logical ternary symbol, TCAMs are more expensive and consume much more circuit-board space than SRAMs; during fast lookup operations, TCAM chips also generate a large amount of heat. A TCAM cell is thus far more complicated and power-hungry than an SRAM cell: for example, 1 Mbit of TCAM consumes 15-30 Watts of power, about 50 times more than SRAM.
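To make the ternary semantics concrete, the following sketch emulates wildcard matching in software using rule strings over {'0', '1', 'X'}; it is a functional illustration only and does not model the SRAM-cell encoding or the parallel hardware lookup.

    # Software emulation of ternary (wildcard) matching: 'X' bits in a stored
    # rule match either 0 or 1. Rules are checked here in priority order,
    # which a real TCAM does for all rules in parallel in one clock cycle.
    def ternary_match(rule_bits, key_bits):
        return all(r == "X" or r == k for r, k in zip(rule_bits, key_bits))

    def lookup(table, key_bits):
        """table: list of (rule_bits, action) in decreasing priority."""
        for rule_bits, action in table:
            if ternary_match(rule_bits, key_bits):
                return action
        return "send to controller"          # table miss

    table = [
        ("0110XXXX", "Output 3"),            # e.g. an 8-bit prefix rule
        ("01101100", "Drop"),                # shadowed by the rule above
        ("XXXXXXXX", "Output 1"),            # default, lowest priority
    ]
    print(lookup(table, "01101010"))          # -> Output 3
    print(lookup(table, "11110000"))          # -> Output 1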

2.2.3 Cost-Efficient TCAM Usage

When applying energy-efficient lookup operations in physical packet-switching networks, the recent related works can generally be classified into the three aforementioned categories. The notable properties of these existing proposals are summarized in Table 2.1.

Category-A: TCAM Usage Reduction

To offload TCAM usage, Yamanaka et al. [17] built a "matching field translator" architecture, in which a list of exact-match rules is first generated for a corresponding wildcard rule, and the controller then translates the exact matching fields into source MAC addresses based on the correspondence


between them. As a result, only the shorter rules containing MAC addresses are needed, and they can be stored in the BCAM of a switch. Similarly, Congdon et al. [18] created a signature-CAM and RAM based packet parser that works as prediction circuitry. Depending on the prediction result, i.e., prediction hit, incorrect prediction, or prediction miss, the TCAM is used in a no-TCAM, master-TCAM-only, or full-TCAM manner, respectively.

Figure 2.2: The rationale of CAM based lookup operation: (a) packet lookup operation in a switch/router; (b) a NOR-type TCAM cell; (c) ternary encoding for the NOR cell.

Category-B: TCAM Partial Utilization

Panigrahy et al. [19] first partitioned TCAMs into several groups and then used an Application-Specific Integrated Circuit (ASIC) based hash table to perform

13

Cat.

Table 2.1: Comparisons on the energy-efficient TCAM Usage Power Comp. Dynamic Literature Critical component(s) aware ratio update [17]

A

[18] [19]

B

[20] [21] [22]

C

[23] [24] [25] [1]

match field translator signature CAM, prediction circuitry ASIC based prefix indexer bit-selection logic CAM based pre-classifier prefix aggregation and expansion techniques TCAM Razor approach Tree representation shorter tags Rule-multiplexing scheme

no

n/a

good

yes

n/a

poor

yes

n/a

poor

yes yes

n/a n/a

n/a fair

yes

fair

fair

no no yes no

high high fair high

poor poor fair fair

lookup in only one TCAM chip, with the other chips remaining inactive. Similarly, Zane et al. [20] proposed bit-selection logic to reduce power consumption: in the proposed architecture, TCAMs are partitioned into different blocks, and hashing bits are selected to point to specific TCAM subtables. Recently, Ma et al. [21] introduced a smart pre-classifier that classifies a packet in advance, such that only a small portion of the TCAM is activated and searched for a given packet.

Category-C: Forwarding Rule Compression

Ravikumar et al. [22] introduced prefix aggregation and expansion techniques, aiming to activate only a limited number of TCAM arrays during an IP lookup; in this way, the effective TCAM size in a router can be compacted. To address the range expansion problem of TCAM installation, Meiners et al. [23] considered how to generate a semantically equivalent packet classifier that requires the minimum number of rules for a given set of original TCAM entries. Using a tree representation of rules, Sun et al. [24] proposed a redundancy removal algorithm, which removes redundant rules and combines overlapping rules to build an equivalent and smaller rule set for a given packet classifier. Kannan et al. [25] used tags shorter than the original flow entries to identify flows; as a result, the size of forwarding rules can be reduced. In order to use TCAM space efficiently, a rule multiplexing scheme was proposed in [1] with a joint optimization of traffic engineering in SDN networks. Using this scheme, the identical sets of rules that would otherwise be deployed on each node for the different paths of a session's flow are compacted at particular overlapping switch nodes, such that the occupied TCAM space is reduced.

2.2.4 Rule Installation and Caching

Since the usage of TCAM space is a crucial issue, many efforts have been devoted to rule installation and caching strategies in SDN. The related existing studies can be classified into two categories: the reactive way [6, 13] and the proactive way [26–28].

Reactive Rule-Installation

Reactive rule caching has been widely adopted by existing work [6, 13] because of its efficient usage of TCAM space. The first packet of each "microflow" is forwarded to the controller, which reactively installs flow entries in switches. For instance, the Ethane [6] controller reactively installs flow-table entries based on the first packet of each TCP/UDP flow. Recently, Bari et al. [29] used the on-demand approach to respond to flow setup requests.

Proactive Rule-Installation

On the other hand, other studies [30, 31] argue that the reactive approach is time-consuming because of remote rule fetching, leading to heavy overhead in packet processing [8, 27, 29]. To reduce the response time for packets arriving at switches without matching rules, the proactive approach installs rules in switches before the corresponding packets arrive. For example, Benson et al. [8] developed MicroTE, a system that adapts to traffic fluctuations and dynamically updates rules in switches, imposing minimal overhead on the network based on traffic prediction. Kang et al. [28] proposed to pre-compute backup rules for possible failures and cache them in switches in advance to reduce network recovery time. In addition, other related studies [10, 54–56] focus on rule scheduling considering forwarding-table utilization. For instance, Katta et al. [54] proposed an abstraction of an infinite switch based on an architecture that leverages both hardware and software, in which the rule caching space appears infinite, so rules can be cached in the forwarding table without limit. This abstraction saves TCAM space, but the packet-processing speed of the switch becomes a bottleneck. To use TCAM space efficiently, Kanizo et al. [10], Nguyen et al. [55], and Cohen et al. [56] propose rule-placement schemes that jointly consider traffic routing in the network; however, rule updating is ignored in their optimizations. In contrast, we study both aspects in our optimization. The work most related to our second primary proposal is DIFANE [27], a compromise architecture that leverages a set of authority switches serving as a


middle layer between the controller in the control plane and the switches in the data plane. The endpoint rules are pre-computed and cached in the authority switches. Once the first packet of a new microflow arrives at a switch, the desired rules are reactively installed from the authority switches rather than from the controller; in this way, the flow setup time can be significantly reduced. Unfortunately, caching all the pre-computed rules in authority switches consumes a large amount of TCAM space. In our second primary proposal, we still load flow rules into switches in a reactive way; however, the rule caching period is controlled by our proposed algorithm, which takes both the remote processing cost and the TCAM occupation cost into consideration.

2.2.5 Traffic Engineering with Rule Compression

Rule Compression

Much existing work on SDN focuses on rule-space compression, rule splitting, and distribution. Abbreviating Rule Multiplexing as RM, these works can be classified into two categories: 1) nonRM-based and 2) RM-based. A number of existing works [10, 27, 28, 57] belong to the first category. DIFANE [27] and vCRIB [57] have been proposed to leverage all switches in a network to realize endpoint policies. Specifically, DIFANE uses a "rule split and caching" approach at the cost of increasing the path length for the first packet of a flow. Later, the Palette framework [10] was proposed for decomposing large SDN tables into small ones and then distributing them across the network, while preserving the overall SDN policy semantics. Kang et al. [28] proposed heuristic rule placement algorithms that distribute forwarding policies across general SDN networks while managing rule space constraints. Their solutions are obtained under a given routing scheme, and the effect of routing on rule placement is ignored. Different from the references in the first category, we study the joint routing and rule multiplexing problem, i.e., RM-based rule placement, which has never been investigated before.

Multi-path Routing

The multi-session multi-path QoS routing problem can also be generally classified into two categories: nonCP-based [8, 9, 58–60] and CP-based [61–64]. For example, Zhang et al. [58] proposed routing optimization schemes to find a set of routes that minimizes cost. In [59], a fundamental traffic engineering problem is studied to find the minimum number of paths achieving the maximum throughput. The effect of data center traffic characteristics on data center traffic engineering is investigated in [8], where a system called MicroTE is developed to adapt to traffic variations by leveraging the short-term and partial predictability of the traffic matrix. Nakibly et al. [60] studied the problem of splitting a traffic flow over multiple efficient paths to improve network bandwidth utilization. However, using multiple paths for a traffic flow increases the consumption of expensive forwarding resources, such as TCAM entries of switches and wavelengths of optical switches; they therefore formulate and solve several problems of splitting a traffic flow over multiple paths while minimizing the overhead of forwarding resources. Agarwal et al. [9] considered a scenario where SDN-enabled nodes are incrementally introduced into an existing network; they formulate an optimization problem with the objective of maximizing network utilization and propose fast algorithms to solve it on large-scale network instances. CP-based multi-path traffic engineering has also been extensively investigated. Wang et al. [61] developed flow control algorithms for networks with multiple paths between each source-destination pair. Han et al. [62] investigated the problem of congestion-aware multi-path routing in the Internet by exploiting path diversity. Key et al. [63] studied the benefits of using multiple paths for a session with a joint consideration of rate control over paths and congestion control.

2.2.6 Failure Recovery for SDN Networks

Resilience is another significant topic in SDN networks. This section reviews existing studies that investigate failure recovery schemes for SDN networks. In the literature, there are mainly three categories of failure recovery schemes: restoration, cold-backup protection [30, 39], and hot-backup protection [65]. Each of them is described as follows.

Restoration

In [30, 39], Sharma et al. presented restoration mechanisms in OpenFlow networks. In case of a link failure, the controller reacts according to the following steps in sequence: (a) it removes the affected forwarding rules, (b) it computes backup paths, and (c) it installs the newly required rules.

Cold-backup Protection

In this protection scheme, only the forwarding rules are allocated from the beginning, and traffic is not redirected to the backup paths until a failure occurs. For example, in the same work [30, 39], the authors also implemented the group-table based fast-failover mechanism [26] in an OpenFlow network: backup paths are pre-computed and installed into the group table. Their experimental findings show that path protection meets the sub-50 ms fast failure recovery requirement [66] of carrier-grade networks better than restoration does. Moreover, Borokhovich et al. [48] introduced classic graph search algorithms, e.g., depth-first search and breadth-first search, into OpenFlow networks


based on the same group-table fast-failover functionality. The controller invokes one of these algorithms to compute backup paths, along which routing rules are pre-installed into the group tables of switches.

Hot-backup Protection

In this scheme, the bandwidth of the backup paths is fully allocated from the beginning, such that the backup paths carry the same traffic as the primary working path to avoid disruption of the connection. For example, the authors of [65] apply '1+1' protection to the data plane of an OpenFlow network, where backup paths are pre-configured and carry traffic duplicated from the working path. Thus, the destination switch can still receive packets when a link failure occurs.

Comparison with Our Third Proposal

Compared with the conventional routing-resilience approaches, we make the following observations: 1) the existing approaches rarely address control-plane routing protection specifically; and 2) none of the existing approaches can provide an optimal (in terms of resource-utilization efficiency) fast recovery solution when a single-link failure occurs in a large-scale network. To fill this gap, we propose a Markov approximation based routing protection for the in-band control plane of SDN. In particular, our approach yields a near-optimal global rerouting solution, which is well suited to the flow-swapping scenarios [67, 68] of a dynamic environment, where traffic flows are frequently refreshed in certain groups of links simultaneously.

18

Chapter 3

Joint Optimization of Rule Placement and Traffic Engineering for QoS Provisioning in Software Defined Network [1] 3.1

Motivation and Problem Statement

In SDN networks, a centralized controller translates network management policies into packet forwarding rules, and deploys them to network devices, such as switches and routers. Each network device stores forwarding rules in its local Ternary Content Addressable Memory (TCAM) [24, 54, 69] that supports high-speed parallel lookup on wildcard patterns. While TCAM excels in packet processing, it is an expensive hardware with high energy consumption. For example, TCAM ternary match is 6 times expensive than Hash-based binary match in Static-RAM [70]. Further, it is reported that TCAMs are 400 times more expensive [54, 71] and 100 times more powerconsuming [72] per Mbit than RAM-based storage. As a result, each network device can only be equipped with limited TCAM. Today’s commodity switches typically support rule size from 2K to 20K only [10,28,73]. Additionally, the rule updating procedure in TCAM is quite slow, and about 40 to 50 rule updates per second [74, 75]. However, the increasing demands would generate a large number of forwarding rules. The shortage of TCAM motivates us to investigate efficient rule placement in SDN such that traffic demands can be accommodated

19

Connection between controller and switch

Routing path 1 of a session Routing path 2 of a session Rules working on path 1

d1

0.5

0.8 Gb/s 0.8

0.1

1.5

0.5

s1

Rules working on path 2

1.0

0.5

0.1

0.2

0.2

0.1

0.1

d1

0.5

0.8 Gb/s 0.8

0.1

1.5

1.0

0.2 Gb/s 0.5

0.5

s1

0.1

0.1 0.2

0.2

0.1

Figure 3.1: The motivation case: traffic engineering and duplicated rules placement in traditional SDN enabled networks. as many as possible. In this chapter, we consider a set of unicast sessions, each of which is associated with some endpoint policies between a source and a destination. These endpoint policies are translated into a set of forwarding rules that work as packet filters and should be applied to every packet from source to destination. Each session specifies a throughput threshold to guarantee a certain level of QoS. Single-path routing has been widely used for unicast sessions because of its simplicity. However, it would fail to satisfy the QoS requirement. For example, we consider a unicast session with 1Gb/s throughput requirement from source s1 to destination d1 in the network shown in Fig. 3.1, in which even the best path 1 → 2 → 5 → 7 can achieve a throughput at most 0.8Gb/s. To achieve an imposed throughput, multiple paths can be employed for packet delivery. 20

In the bottom case of Fig. 3.1, using another path 1 → 3 → 5 → 7 with 0.2 Gb/s transmission rate simultaneously will achieve total throughput of 1 Gb/s. However, when multipath routing is applied in SDN, existing solutions [10,27,28] enforce endpoint policies by duplicating the same set of rules on each path of the session, leading to high TCAM consumption. To deal with the TCAM-efficient rule placement in QoS-guaranteed multipath routing, we propose a rule multiplexing scheme implemented in Controllers. As the example illustrated in Fig. 3.1, traditionally, the same set of rules should be deployed to both path 1 and path 2, denoted by the solid and dotted links from s1 to d1 , respectively. In our proposed multiplexing scheme, only one copy of rules to be deployed onto the common nodes of multipath will be enough to manage entire going-through traffic belonging to different paths. For example, switches 1, 5, and 7, can jointly accommodate all rules in their TCAMs. We consider an example rule “dl src=s1 , dl dst=d1 actions=mod vlan vid:0x0001”, which modifies the VLAN IDs to 0x0001 for packets from source s1 to d1 . Since packets along both paths share the same source and destination addresses, a single rule at switch 5 is enough to complete the VLAN ID modification. In the rest of this chapter, we let the abbreviation of RM/nonRM denote the scheme where rule multiplexing is applied or not, CP/nonCP indicate the scheme where candidate paths are provided or not.

3.2

System Model and Assumptions

We model the SDN as a graph G=(N , E), where node set N consists of SDNenabled network devices, and edge set E represents the communication links among devices. Each device u ∈ N maintains a TCAM-based flow table that can accommodate at most Cu rules. The bandwidth of each link (u, v) ∈ E is constrained by B(u,v) . We consider a set of K unicast sessions, and each session k ∈ K imposes a QoS requirement with throughput Dk from a source sk to destination dk . Furthermore, each session k is associated with a collection of rules (e.g., for access control, or network measurements). Usually, these rules cannot be accommodated by a single node due to limited TCAM capacity. To deploy these rules across the network, we use the algorithm proposed in [10] to decompose them into multiple subsets, which are maintained in Ik . Let f (i) denote the session which a rule subset i belongs to. Each subset i ⊆ If (i) is an atomic unit with a number of ci well-ordered non-routing-oriented rules that cannot be scattered over multiple nodes for the sake of semantic integrity. As a result, these rule sets can be placed along the routing paths in an arbitrary order. All important symbols and variables used in this chapter are summarized in Table 3.1. In the traditional implementation, duplicated rules will be placed onto mul-

21

l r(u,v)

Table 3.1: Notations and Symbols for the First Topic Description a set of network devices a set of links among devices TCAM capacity of node u bandwidth of link (u, v) max(u,v)∈E {B(u,v) } a set of unicast sessions throughput required by session k a set of atomic rule subsets for session k the mapping function of subset i to session k the number of rules in subset i, ∀i ⊆ Ik , k = f (i) a set of paths for session k a binary variable indicating whether rule set i is placed on node u a binary variable indicating whether rule set i is on node u located in path l a binary variable indicating whether path l is selected a real variable representing the transmission rate on path l a real variable representing the transmission rate on

λl(u,v)

link (u, v) along path l a binary variable indicating whether link (u, v) is on

Notation N E Cu B(u,v) B¯ K Dk Ik f (i) ci Lk xiu xil u yl rl

path l

22

tiple buckets belonging to different paths such that the same set of endpoint policies will be executed along any path in the multi-path routing. This motivates us to reduce the rule space occupation by combining common rules among multiple buckets on each node.

3.2.1

Problem Complexity Analysis

All network and traffic demand information is maintained at the centralized controller that has a global view of the SDN. With the given K unicast sessions and their traffic bandwidth requirements Dk , a set of candidate paths Lk can be selected for session k, a set of atomic rule subsets Ik for session k, we consider a rule placement problem with the objective to minimize the total rule space occupation for all sessions under their QoS constraints. Theorem 1. Given a set of candidate paths, the rule placement problem mentioned above is NP-hard. Proof. To prove an optimization problem to be NP-hard, we need to show the NP-completeness of its decision form, i.e., we attempt to find a rule placement such that the QoS of all sessions is satisfied, and total rule space occupation is no greater than X. It is easy to see that such a problem is in NP class as the objective associated with a given solution can be evaluated in a polynomial time. The remaining proof is done by reducing the well-known 2-partition problem, i.e., given a set of numbers A = {a1 , a2 , ..., an }, we attempt to divide them into P P two sets such that j∈J1 aj = j∈J2 aj = A, where J1 and J2 are index sets without overlapping. We now describe the reduction from the 2-partition problem to an instance of our rule placement problem. We create two unicast sessions whose throughput should be no less than A. The rule set of each session contains two rules. As shown in Fig. 3.2, for each number aj ∈ A, we create two paths l and l0 for both sessions, respectively, which share a bottleneck link of capacity aj . Moreover, all nodes along these paths has no available rule space except the nodes associated with the bottleneck link, each of which can accommodate at most one rule. In the following, we only need to show that the 2-partition problem has a solution if and only if the resulting instance of our rule placement problem has a solution that satisfies both QoS and rule space constraints. First, we suppose a solution of the 2-partition problem that the numbers can be divided into two sets with identical sum. The corresponding solution of our problem is to assign the paths of capacity aj , j ∈ J1 to one session, and the ones of capacity aj , j ∈ J2 to the other. It is easy to verify that the throughput of both session is A, and the number of occupied rule space is X. 23

1

1 1

2

n

2

2

Figure 3.2: Constructed instance of rule placement problem. Then, we suppose that our rule placement problem has a solution with a total rule space X and throughput A for both sessions. Since only one rule can be accommodated at the nodes on the bottleneck link of each path, the two paths associated with a common bottleneck cannot be used by two sessions simultaneously. In order to achieve the throughput A, the paths assigned to P P two sessions satisfy j∈J1 aj = j∈J2 aj = A, which forms a solution of the 2-partition problem.

3.3

Optimization with candidate paths

In this section, we consider to optimize the rule space usage when a set Lk of candidate paths is given for each session k ∈ K. This scenario is practical in reality. For example, these candidate paths are pre-selected according to delay requirements. To solve the rule placement problem, we define a binary variable xiu as follows: ( 1, if rule set i is placed on node u, i xu = 0, otherwise. In addition, we define a for each path:    1, xil = u   0,

binary variable xil u to describe the rule placement

if rule set i is placed on node u along path l, otherwise.

Due to the rule multiplexing, each rule set placed at node u can be used by

24

all paths going through it, leading to: xiu = max{xil u }, ∀i ⊆ Ik=f (i) , ∀k ∈ K, ∀u ∈ N. l∈Lk

(3.1)

Note that only the rule sets belonging to the same session k can be multiplexed among paths in Lk . Since not all candidate paths need to be used for packet delivery, we define a binary variable y l for path selection as follows: ( 1, if path l is selected for packet delivery, yl = 0, otherwise. If a path l ∈ Lk is selected, i.e., y l = 1, each rule set i ⊆ Ik=f (i) should be P deployed on at least one node along this path, i.e., u∈l xil u ≥ 1. This constraint can be formulated as: X l xil (3.2) u ≥ y , ∀i ⊆ Ik=f (i) , ∀l ∈ Lk , ∀k ∈ K. u∈l

Otherwise, i.e., y l = 0, we do not constrain rule placement on this path, P i.e., u∈l xil u ≥ 0 that is always satisfied. The max operation in (3.1) can be replaced by the following equation: xiu ≥ xil u , ∀l ∈ Lk , ∀i ⊆ Ik=f (i) , ∀k ∈ K, ∀u ∈ N.

(3.3)

The number of rules placed at node u ∈ N cannot exceed its rule space capacity as represented by: X X xiu ci ≤ Cu , ∀u ∈ N. (3.4) k∈K i⊆If (i) l On the other hand, by defining rl and r(u,v) as the transmission rate on path l and link (u, v) on this path, respectively, the QoS of each session k ∈ K shall be guaranteed by letting the total transmission rate of all selected paths be greater than Dk : X rl ≥ Dk , ∀k ∈ K. (3.5) l∈Lk

Furthermore, the transmission rate of a path is determined by the link with the minimum rate, which is represented by: l 0 ≤ rl ≤ r(u,v) , ∀(u, v) ∈ l, ∀l ∈ Lk , ∀k ∈ K.

(3.6)

The characteristics of the association between routing paths and transmission rate should be specified. First, multiple paths associated with a common link should share the bandwidth of this link: X X l r(u,v) ≤ B(u,v) , ∀(u, v) ∈ E. (3.7) k∈K l∈Lk

25

Then, the transmission rate on the link (u, v) in the selected path l shall between 0 and the maximum bandwidth of this link B(u,v) : l 0 ≤ r(u,v) ≤ y l B(u,v) , ∀(u, v) ∈ E, ∀l ∈ Lk , ∀k ∈ K.

(3.8)

Finally, the multiplexing-considered rule placement problem with the objective minimizing the total allocated rule subsets under the candidate paths can be formulated as: XXX min xiu ci , (3.9) k∈K i∈Ik u∈N

s.t. : (3.2) − (3.8); l xiu , xil u, y

l ∈ {1, 0}, rl > 0, r(u,v) > 0.

Although the above formulation (3.9) is a mixed integer linear programming (MILP), there exist highly efficient algorithms, e.g., branch-and-bound, and fast off-shelf solvers, e.g., CPLEX. Since our focus is to develop new schemes for rule placement and the corresponding optimization problems, we omit the details of solving MILP in this paper . To better understand the benefits of our proposed rule multiplexing scheme, the same optimization problem under the traditional rule placement scheme is also formulated as follows. X X XX min xil (3.10) u ci k∈K i⊆Ik=f (i) l∈Lk u∈l

X

X

X

xil u ci ≤ Cu , ∀u ∈ N ;

(3.11)

k∈K i⊆Ik=f (i) l∈Lk

s.t. : (3.2), (3.5) − (3.8); l xil u, y

l ∈ {0, 1}, rl > 0, r(u,v) > 0.

Recall that the traditional scheme duplicates the same set of rules on each path of a session, resulting in that TCAM capacity constraint (3.4) is replaced by (3.11). Accordingly, its associated constraint (3.3) is also eliminated in above formulation (3.10).

3.4

Optimization without candidate paths

For many flow requests in practice, their candidate paths may not be specified by users, or constrained by any performance requirements (e.g., delay). When no candidate path is provided, the rule placement problem becomes more challenging but beneficial for TCAM-efficient QoS provisioning. To jointly consider traffic engineering and rule placement will raise the opportunities of both rule

26

Figure 3.3: An example for path searching. multiplexing and QoS guarantees. In this section, we investigate the rule placement problem without candidate path by developing a formulation that makes a good tradeoff between rule multiplexing and bandwidth utilization. We define a binary variable λl(u,v) to indicate whether link (u, v) is selected by path l: ( 1, if link (u, v) is on the path l, λl(u,v) = 0, otherwise. The path searching process is represented by constraints (3.12) - (3.14). X X λl(sk ,v) − λl(v,sk ≤ 1, (sk ,v)∈E (v,sk ∈E (3.12) ∀l ∈ Lk , ∀k ∈ K; X

λl(v,dk −

(v,dk ∈E

X

λl(dk ,v) ≤ 1, (3.13)

(dk ,v,)∈E

∀l ∈ Lk , ∀k ∈ K; X X λl(u,v) − λl(v,w) = 0, (u,v)∈E

(3.14)

(v,w)∈E k

∀v ∈ N \ {sk , dk }, ∀l ∈ L , ∀k ∈ K. We use the example shown in Fig. 3.3 to explain these constraints, where solid arrows indicate a path from source s to destination d. At the source node, the number of outgoing links minus that of incoming links should be equal to 1 if this path is selected. Otherwise, their differences should be zero. A similar constraint (3.13) is imposed for the destination. At each intermediate node, for example, node v in Fig. 3.3, the number of incoming link should be equal to the number of outgoing link, which is represented by constraint (3.14). l In order to avoid cyclic paths, we particularly define an integer variable z(u,v) l to denote the sequence number of the link along path l, i.e., (u, v) is the z(u,v) -th link along the path l ∈ Lk from the source sk to the destination dk . If link (u, v)

27

l is not on path l, i.e., λl(u,v) = 0, its value of z(u,v) should be zero. Otherwise, the difference between the sequence numbers of two consecutive links on the path should be 1. Therefore, z(u,v) is between 0 and |N |-1 if link (u, v) is in path l: l 0 ≤ z(u,v) ≤ λl(u,v) (|N | − 1),

(3.15)

∀(u, v) ∈ l, ∀l ∈ Lk , ∀k ∈ K. X

l z(v,w) −

(v,w)∈E

X

l z(u,v) =

(u,v)∈E

X

λl(v,w) ,

∀(v,w)∈E

(3.16)

∀v ∈ N − {dk }, ∀l ∈ Lk , ∀k ∈ K. With respect to rules placement, it is always shall be guaranteed that rules should be placed on and only on the nodes along the selected paths and shown as constraints (3.17) - (3.18): X X xil λl(u,v) + λl(v,dk , u ≤ (u,v) (v,dk (3.17) ∀i ⊆ Ik=f (i) , ∀l ∈ Lk , ∀k ∈ K, ∀u ∈ N ; X X xil λl(sk ,v) , ∀i ⊆ Ik=f (i) , u ≥ u∈N

(3.18)

(sk ,v)

∀l ∈ Lk , ∀k ∈ K. The maximum transmission rate of each link (u, v) belonging to path l is constrained by B(u,v) if this link is selected by l, i.e., λl(u,v) = 1. Otherwise, l r(u,v) = 0. This can be described as: l 0 ≤ r(u,v) ≤ λl(u,v) B(u,v) , ∀(u, v) ∈ E,

(3.19)

∀l ∈ Lk , ∀k ∈ K. Constraint (3.20) indicates that transmission rate of a path is determined by the bottleneck link. If a path (u, v) is on the path l, i.e., λl(u,v) = 1, we get l 0 ≤ rl ≤ r(u,v) . l ¯ 0 ≤ rl ≤ r(u,v) + (1 − λl(u,v) B, (3.20) ∀(u, v) ∈ E, ∀l ∈ Lk , ∀k ∈ K. Otherwise, constraint (3.20) becomes 0 ≤ rl ≤ B¯ = max(u,v)∈E {B(u,v) }, which is always satisfied. Finally, the relation between rl and λl(u,v) can be specified as: rl ≤

X

λl(u,v) B¯ ≤ rl · M, (3.21)

(u,v)∈E

∀(u, v) ∈ E, ∀l ∈ Lk , ∀k ∈ K. where M is an arbitrarily large number, such that all λl(u,v) = 0, ∀(u, v) ∈ l, ∀l ∈ Lk if rl = 0. 28

Algorithm 1 Fast Heuristic Algorithm l Require: Problem formulations with integer variables xiu , xil u , λ(u,v) ∈ {0, 1} el ,x eil , x ei of the original problem Ensure: Solutions λ (u,v)

1:

2: 3: 4: 5:

u

u

bl obtain the solutions, i.e., x biu , x bil u , λ(u,v) , of optimization problems by relaxing all integer variables for all k ∈ K do for all l ∈ Lk do el bl λ (u,v) ← PathSearch(λ(u,v) , k, l) el x eil ← PathRulePlacement(b xil , λ , k, l) u

u

(u,v)

end for 7: x eiu ← SessionRulePlacement(b xiu , x eil u , k) 8: end for 6:

l l Following the same definitions of xiu , xil u , r and r(u,v) in last section, the rule placement problem without candidate paths can be formulated as: X X X min xiu ci , (3.22) k∈K i⊆Ik=f (i) u∈N

s.t. : (3.3) − (3.5), and (3.7), (3.12) − (3.21); l l l xiu , xil u , λ(u,v) ∈ {1, 0}, r > 0, r(u,v) > 0.

The corresponding problem without rule multiplexing can be formulated as follows in a similar manner as given in last section. X X XX min xil (3.23) u ci k∈K i⊆Ik=f (i) l∈Lk u∈l

s.t. : (3.5), (3.7), (3.11), (3.12) − (3.21); l l l xil u , λ(u,v) ∈ {0, 1}, r > 0, r(u,v) > 0.

3.5

Heuristic Algorithms

Due to the NP-hardness of the rule placement problem, we propose a fast heuristic algorithm using relaxation and rounding techniques. As shown in Algorithm 1, we first solve the optimization problems by relaxing all integer variables, and then obtain feasible solutions by invoking PathSearch, PathRulePlacement, and SessionRulePlacement algorithms. Note that, with line 7, Algorithm1 is RM-nonCP heuristic; otherwise, it becomes the nonRM-nonCP heuristic. The pseudo codes of PathSearch algorithm is shown in Algorithm 2. All bl (u, v) tuples are sorted in a decreasing order according to values of λ (u,v) and are maintained in set Q. Then, we find feasible solutions satisfying constraints (3.12), (3.13) and (3.14) in the for loop from line 4 to 10. 29

Algorithm 2 PathSearch bl Require: LP solution λ

(∀(u, v) ∈ E), session index k, path index l el Ensure: The rounded solutions λ (u,v) el 1: λ ← 0, ∀(u, v) ∈ E (u,v)

(u,v)

2: 3: 4: 5: 6: 7:

bl bl bl Sort Q = {(u, v)|λ (u,v) > 0} as π1 , · · · , π|Q| such that λπ1 ≥ · · · ≥ λπ|Q| P ←∅ for j = 1; j ≤ |Q|; j + + do P ← P ∪ {πj } bl if dλ (u,v) e, ∀(u, v) ∈ P satisfy (3.12), (3.13) and (3.14) then l e λ ← 1, ∀(u, v) ∈ P (u,v)

break 9: end if 10: end for 8:

Similarly, we find feasible solutions of xil u by first sorting them in a decreasing order in PathRulePlacement algorithm. All nodes belonging to the routes obtained from Algorithm 2 are maintained in set V. Each element πj = (u0 , i0 ) from Q is then checked sequentially, and is included in P if it satisfies the following conditions. 1) Node u0 is in V, 2) rule subset i0 does not show in a (u, i)-tuple in P , and 3) the remaining space on node u0 can accommodate rule subset i0 . Finally, the SessionRulePlacement algorithm is invoked to find feasible solubiu in a decreasing order as shown in line 2, and then tions of xil u . We first sort x find feasible integer solutions in the following for loop. Theorem 2. The time and space complexity of Algorithm 1 is O(|E| + αβN + PK α2 βN 2 ) and O(|N |α(1 + β) + |E|β), respectively, where α = k=1 |Ik | and PK β = k=1 |Lk |. bl Proof. The worst computational complexity of Algorithms 2 and 3 is O(|λ (u,v) |) il = O(|E|) and O(|b xu ||V| ) = O(N |Ik ||Lk |), respectively. In addition, Algorithm 2 2 4 in line 7 has time complexity of O( |b xiu ||e xil u | ) = O(N · |Ik | · |Lk |). Therefore, the overall time complexity can be derived as follows. O(Alg.p1) = O(

K X

|Lk | × (O(Alg.2) + O(Alg.3))) +

k=1

= O(|E| + N ·

K X

O(Alg.4)

k=1 K X

|Ik ||Lk |) + O(N 2 (

k=1

K X

k=1

|Ik |)2

K X

|Lk |)

k=1

= O(|E| + αβN + α2 βN 2 ). The space complexity is determined by the number of variables xui , xul i , and 30

Algorithm 3 PathRulePlacement el Require: LP solution of x bil u (∀u ∈ N, ∀i, f (i) = k), λ(u,v) (∀(u, v) ∈ E) from paths finding algorithm, k and l Ensure: The rounded solutions x eil u il 1: x eu ← 0 , ∀u ∈ N, ∀i, f (i) = k 2: Sort Q = {(u, i)|b xil bil u > 0} as π1 , · · · , π|Q| in a decreasing order of x u l e 3: V ← {u, v|λ = 1, ∀(u, v) ∈ E} (u,v) 4: P ← ∅ 5: for j = 1; j ≤ |Q|; j + + do 6: (u0 , i0 ) ← πj 7: if u0 ∈ V and ∀(u, i) ∈ P, i0 6= i and P 0 0 ∀(u0 ,i)∈P ci + ci ≤ Cu then 8: P ← P ∪ {πj } S 9: if Ik ⊆ ∀(u,i)∈P {i} then 10: x eil u ← 1 , ∀(u, i) ∈ P 11: break 12: end if 13: end if 14: end for λl(u,v) , which can be calculated as |N |α, |N |αβ, and |E|β, respectively. By summing up the number of all these variables and the corresponding rounded ones, the total space complexity is O(|N |α(1 + β) + |E|β). Note that, Algorithm 1 can be adopted when candidate paths are provided as a special case, in which variables λl(u,v) are fixed to one if (u, v) ∈ l or to zero otherwise.

3.6 3.6.1

Case Study Simulation settings

In this section, we demonstrate the rationale and advantages of the proposed mechanism by a case study on a partial ITALYNET topology [76, 77] as shown in Fig. 3.5, where link capacity is fixed to 100 and TCAM capacity of each switch is 15. Suppose we have a traffic demand from host h1 (MAC address 00:00:00:00:00:01, ip address 10.0.0.1) to host h2 (MAC address 00:00:00:00:00:02, ip address 10.0.0.2) with a bandwidth QoS request of 120. A set of provisioned non-routing-oriented rules, as illustrated in Table 3.2, need to be deployed on switches along each path. In our experimental results, we use the combinations of RM/nonRM and CP/nonCP to indicate the resulting schemes.

31

Algorithm 4 SessionRulePlacement Require: LP solution of x biu (∀u ∈ N, ∀i, f (i) = k), x eil u (∀u ∈ N, ∀i, f (i) = k, ∀l, l ∈ Lk ,) from Alg. 3, and k Ensure: The rounded solutions x eiu 1: x eiu ← 0, ∀u ∈ N, ∀i, f (i) = k 2: Sort Q = {(u, i)|b xiu > 0} as π1 , ..., π|Q| in a decreasing order of x biu 3: P ← ∅ 4: for j = 1; j ≤ |Q|; j + + do 5: (u0 , i0 ) ← {πj } 6: if (u0 , i0 ) ∈ {(u, i)|e xil u = 1} and P 0 c + c ≤ Cu0 then i ∀(u0 ,i)∈P i 7: P ← P ∪ {πj } S 8: if Ik ⊆ ∀(u,i)∈P {i} then 9: x eiu ← 1 , ∀(u, i) ∈ P 10: break 11: end if 12: end if 13: end for

The internal architecture and relations between components of simulation are illustrated as Fig. 3.4. After solving optimization or heuristic algorithm, the obtained optimal or suboptimal solutions can be leveraged to place OpenFlow rules in data plane. We use Mininet as the emulator of data plane and Ryu as the OpenFlow controller. In rest of the paper, all mathematical programming formulations are solved using commercial solver Gurobi optimizer [78], which is embedded in the optimization program in terms of Python interfaces.

3.6.2

Solutions under given candidate paths

We first consider the problems when four available paths are provided, i.e., l1 = {1 → 2 → 3 → 4}, l2 = {1 → 5 → 6 → 7 → 8 → 9 → 4}, l3 = {1 → 0 → 4} and l4 = {1 → 5 → 6 → 0 → 8 → 9 → 4}. After solving the formulations of nonRM-CP and RM-CP, we display the solutions in Fig. 3.5(a) and 3.5(b). We observe that without rule multiplexing, paths l1 and l2 are selected for packet delivery as shown in Fig. 3.5(a). Since the rule space at source and destination is not enough to accommodate all rules, they are distributed on multiple nodes along the paths. For example, each path installs 5 rules at the source node, and the rest are placed at node 2 and node 9 on different paths, respectively. The total rules take 40 TCAM entries. As shown in Fig. 3.5(b), when rule multiplexing is enabled, paths l3 and l4 are selected to share the rules placed at source, destination and the common

32

ID 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

Table 3.2: 20 SDN rules used in case study Rules add-flow sw id dl src=00:00:00:00:00:01, dl dst=00:00:00:00:00:02,actions=mod vlan vid:0x0001 add-flow sw id dl src=00:00:00:00:00:01, dl dst=00:00:00:00:00:02,actions=mod nw tos:0x10 add-flow sw id nw src=10.0.0.1,nw dst=10.0.0.2, actions=drop add-flow sw id nw src=100.0.0.3,actions=drop add-flow sw id dl src=00:00:00:00:00:05,actions=drop add-flow sw id nw ttl=53,actions=drop add-flow sw id dl type=ipv6,actions=drop add-flow sw id dl type=tcp6,actions=drop add-flow sw id dl type=udp6,actions=drop add-flow sw id dl type=icmp6,actions=drop add-flow sw id dl type=ip,in port=2,nw src=10.0.0.4, actions=CONTROLLER:2024 add-flow sw id dl type=arp,in port=3,arp spa=10.0.0.4, actions=CONTROLLER:2025 add-flow sw id dl type=ip,in port=1,nw src=10.0.0.5, actions=CONTROLLER:2026 add-flow sw id dl type=ip,in port=2,nw src=10.0.0.6, actions=CONTROLLER:2027 add-flow sw id dl type=ip,actions=normal add-flow sw id dl type=icmp,actions=normal add-flow sw id dl type=tcp,actions=normal add-flow sw id dl type=udp,actions=normal add-flow sw id dl type=arp,actions=normal add-flow sw id dl type=rarp,actions=normal

33

Controller nonRM-CP

Control schemes RM-CP

nonRM-nonCP

RM-nonCP

Optimizations Optimi Op timizati timiza ti tions

Heuristicc algorithms

Gurobi Solver, Python thon IFs

Optimal solutions

Suboptimal solutions

OpenFlow Protocol

OpenFlow Messages and Rules

Switch Packet in

Flow table …

Secure channel

Actions set execution, e.g., access control, rate control, VLAN setting.

Packet out

Figure 3.4: The internal architecture of simulation and the relations between components. node 0. As a result, the total rule space occupation is only 20.

3.6.3

Solutions without candidate paths

If the candidate paths are not provided, the optimal paths will be provided by solving the formulations of nonRM-nonCP and RM-nonCP. As a result, paths l1 and l3 are found in the solution of nonRM-nonCP. The resulting traffic distribution and rule placement along each path are illustrated in Fig. 3.5(c) with a total rule space occupation of 40. Finally, Fig. 3.5(d) displays the solution from RM-nonCP. Three paths are calculated as {1 → 5 → 6 → 0 → 4}, {1 → 0 → 6 → 7 → 8 → 9 → 4} and {1 → 5 → 6 → 0 → 8 → 9 → 4} with traffic rates of 20, 20 and 80, respectively. In particular, two rules (rule id: 12, 13) are placed on node 1, eight rules (rule id: 0, 3, 7, 10, 15, 16, 18, 19) on node 6, and ten rules (rule id: 1, 2, 4, 5, 6, 8, 11, 14, 17) on node 0. Such a placement guarantees that each path covers the whole rule sets.

34

l1

l2

l3

l4

l2

l1

l3

l4

X\ 



Z

X\ \ \

Y









Y



X\





X



 



(a) nonRM-CP

(b) RM-CP

l1

l3 l1

l5

l3 X\



 \

l6

l7

\





Y 

X





X\ XW



Y 

Y 

X

X

(c) nonRM-nonCP

Y

 _

(d) RM-nonCP

Figure 3.5: Case study of four schemes with a 10-node scaled network, the rules are placed into the data plane according to the solutions obtained by solving four optimizations.

3.7

Performance Evaluation

We have developed a simulation framework using C++ and realized the proposed heuristics using python. As shown in Fig. 3.4, the 4 rule placement schemes are embedded into our simulator, with which we conduct simulations to evaluate the performance of the proposed algorithms under various of network topologies. The demonstrated simulation result is averaged over 100 instances for each network setting.

35

7000 nonRM−CP RM−CP

8000

Rule space occupation

Rule space occupation

10000

6000 4000 2000 0

1

2 3 4 Number of sessions

5000 4000 3000 2000 1000

5

nonRM−CP RM−CP

6000

50

70 Number of rule subsets

90

(a) Rules cost v.s. number of sessions, (b) Rules cost v.s. number of rule subi.e., K sets, i.e., |Ik |

5000

5000 nonRM−CP RM−CP

4000 3000 2000 1000

Rule space occupation

Rule space occupation

6000

nonRM−CP RM−CP 4000

3000

2000

100 200 300 400 500 Maximum required traffic rate

1

2 3 4 5 6 Number of given paths

7

(c) Rules cost v.s. maximum traffic (d) Rules cost v.s. number of given rate requirement, i.e., the upper bound paths, i.e., |Lk | of Dk Figure 3.6: The optimal rule space occupation cost of nonRM-CP and RM-CP. This suite of simulations emphasize on comparing the performance of rule space occupation cost between nonRM and RM schemes, while providing the candidate paths.

3.7.1

Performance of the nonRM-CP and RM-CP

At the first, RM-CP and nonRM-CP schemes are evaluated with optimal solutions in a relative small scale of network, i.e., the ITALYNET network topology with 20 datacenter nodes, which has been widely used in literatures [76, 77]. The default settings of system parameters are as follows: K = 3, |Ik | = 20, |Lk | = 2, B(u,v) ∈ [150, 200], Cu ∈ [1500, 2000], ci ∈ [10, 100], and Dk ∈ [100, 200], for ∀k ∈ K, ∀(u, v) ∈ E, u ∈ N, i ∈ If (i) . Rule space occupation We first investigate the rule space occupation performance of our proposed rule placement schemes. By varying the number of sessions K from 1 to 5, Fig. 3.6(a) shows the occupied rule space increases linearly as the number of sessions grows because more paths are employed to achieve the throughput requirement, leading to more rule space occupation. The rule multiplexing scheme can save 36

1 QoS satisfaction ratio

QoS satisfaction ratio

1 0.8 0.6 0.4 0.2 0 1

nonRM−CP RM−CP 2 3 4 Number of sessions

0.6 0.4 0.2 0 50

5

nonRM−CP RM−CP

0.8

70

90 110 130 150 170 190 Number of rule subsets

(a) QoS satisfaction ratio v.s. number (b) QoS satisfaction ratio v.s. number of sessions, i.e., K of rule subsets, i.e., |Ik | 1 QoS satisfaction ratio

QoS satisfaction ratio

1 0.8 0.6 0.4 0.2 0 200

nonRM−CP RM−CP 300 400 500 Maximum required traffic rate

0.8 0.6 0.4 0.2 0 10

600

nonRM−CP RM−CP 20 30 40 50 Rules capacity of switch

60

(c) QoS satisfaction ratio v.s. maxi- (d) QoS satisfaction ratio v.s. rules camum traffic rate requirement Dk pacity of switch, Cu Figure 3.7: QoS satisfaction ratio of nonRM-CP and RM-CP. This suite of simulations emphasize on comparing the performance of QoS satisfaction degree between nonRM and RM schemes, while providing the candidate paths. TCAM space significantly. For example, when K=5, the rule space needed by RM-CP is less than 35% of nonRM-CP. Then, we study the effect of the size of Ik for each session by changing its upper bound value from 10 to 50 and fixing the lower bound as 10 when we generate its value randomly. As shown in Fig. 3.6(b), although the amounts of occupied rule space of both algorithms show as linear functions of the endpoint policy size, RM-CP outperforms nonRM-CP by saving about 30% of rule space utilization. The performance under different maximum traffic rate requirements is shown in Fig. 3.6(c) by varying the upper bound of Dk from 100 to 500. Particularly, the upper range of B(u,v) is reassigned to 500. As illustrated in this figure, the performance of nonRM-CP grows over the maximum traffic rate at a line rate, while the rules cost of RM-CP always holds around 3300. That is because rules can be always multiplexed at the nodes shared by multiple path with RM-CP scheme, no matter what the traffic rate requirement is. However, it is worth noting that the performance of RM-CP is only guaranteed within the range of

37

bandwidth resource capacity. Once traffic rate requirement exceeds the link bandwidth, the QoS can not be satisfied. Later, we investigate how the size of available path set affects the performance. Note that, we use a modified weighted Dijkstra algorithm to find available paths over the network graph for each session. In this weighted Dijkstra algorithm, we label each link with the current available bandwidth and update them after finding the current shortest path, then try to find the next one. By setting the number of given paths for each session ( i.e., |Lk |, ∀k ∈ K ) from 1 to 7, and link bandwidth B(u,v) in the range [150, 300], we show the rule space occupation cost of nonRM-CP scheme in Fig. 3.6(d). We notice that the two schemes performs the same when |Lk |=1, because single path cannot provide any opportunity for rule multiplexing. As |Lk | grows, more rule space saving can be achieved by RM-CP over nonRM-CP, up to 30%. Such saving saturates soon when more candidate paths are given, e.g., |Lk | >4. This can be attributed to the fact that RM-CP can always find some common nodes for rule multiplexing in a small number of candidate paths. 6

5

x 10

10 nonRM−CP.Alg RM−CP.Alg

2

Rule space occupation

Rule space occupation

2.5

1.5 1 0.5 0

1

2

3

4 5 6 7 8 Number of sessions

9

nonRM−CP.Alg RM−CP.Alg

8 6 4 2 0

10

x 10

20

40 60 80 Number of rule subsets

100

(a) Rule space cost v.s. number of ses- (b) Rule space cost v.s. number of rule sions, i.e., K subsets 7

5

x 10

12 nonRM−CP.Alg RM−CP.Alg

6

Rule space occupation

Rule space occupation

8

4 2 0

1M

nonRM−CP.Alg RM−CP.Alg

10 8 6 4 2 0

2M 4M 8M Number of rules for each session

x 10

100

200 500 1000 Number of switches

(c) Rule space cost v.s. number of rules (d) Rule space cost v.s. for each session switches, i.e., N

2000

number of

Figure 3.8: Rule space occupation of fast heuristic algorithms under nonRM-CP and RM-CP schemes in randomly generated large-scale networks.

38

QoS satisfaction We also show the QoS satisfaction under link bandwidth of 500 and single-path routing. This metric is defined as the portion of simulation instances whose QoS requirements are satisfied. In each simulation case, since the traffic requirements from hosts are generated randomly, all the requirements possibly can not be guaranteed, i.e., it fails to find a feasible solution using the optimization of nonRM-CP or RM-CP with the given link bandwidth and routing path. Therefore, the simulation in section 8.1.2 is essentially the robustness comparison of nonRM-based and RM-based schemes under the given candidate paths. We first investigate the influences of number of sessions on the QoS satisfaction. As shown in Fig. 3.7(a), the performance of both algorithms degrades as the number of sessions increases because larger number of traffic demands would quickly exhaust the bandwidth resources. Then, we show the QoS satisfaction under different scale of rule subsets of each session in Fig. 3.7(b), where RM-CP outperforms nonRM-CP. It clearly show the QoS satisfaction as a decreasing function over scale of rule subsets. The effect of throughput requirements on QoS satisfaction is evaluated. As shown in Fig. 3.7(c), by randomly generating traffic rates within ranges [100,200], [200,300], [300,400], [400,500], and [500,600], the QoS satisfaction shows a decreasing function of both schemes. The reason can be attributed to the fact that it becomes harder to guarantee all the requirements with the available resources under higher QoS requirement. At last, we see the QoS satisfaction under different scale of switch capacity in Fig. 3.7(d) showing as increasing functions, when switch capacity varys within 10 and 60. After converging at switch capacity of 50, their performance is not affected by larger switch capacity. Overall, from all the figures above we can always observe that the RM-CP significantly outperforms nonRM-CP of QoS satisfaction. Performance in Large-Scale Networks Then, we also evaluate the fast heuristics under nonRM-CP and RM-CP schemes in large-scale networks. The topology is randomly generated in each running case. The default settings of system parameters are as follows: N = 500, K = 3, |Ik | = 100, |Lk | ∈ [2, 10], B(u,v) = 100, Cu = 800K (K=103 ), ci = 1K, and Dk ∈ [80, 300], for ∀k ∈ K, ∀(u, v) ∈ E, u ∈ N, i ∈ If (i) . As shown in Fig. 3.8(a), we first evaluate the performance under various numbers of sessions by varying K from 1 to 10. The cost of rule space occupation linearly increases over K for both schemes. By extending |Ik | from 20 to 100, Fig. 3.8(b) shows the rule space occupation cost is an increasing function of the number of rule subsets for each session. It can be also observed that RM-CP 39

150 nonRM−nonCP.Opt nonRM−nonCP.Alg RM−nonCP.Opt RM−nonCP.Alg

100

Rule space occupation

Rule space occupation

150

50

0

1

2 3 4 Number of sessions

100

50

0

5

nonRM−nonCP.Opt nonRM−nonCP.Alg RM−nonCP.Opt RM−nonCP.Alg

10

15 20 25 Number of rule subsets

30

(a) Rules cost v.s. number of sessions, (b) Rules cost v.s. number of rule subi.e., K sets, i.e., |Ik | Figure 3.9: Rule space occupation of nonRM-nonCP and RM-nonCP under a partial ITALYNET networks with 10 nodes. This suite of simulations emphasize on comparing the performance between Alg. 1 and optimal solutions. outperforms nonRM-CP by around 60% of the total rule space occupation. Then, we study the performance under various numbers of required rules for each session by varying |Ik | × ci ∈ {1M, 2M, 4M, 8M} (M=106 ) and reassigning Cu = 4M. Fig. 3.8(c) shows the rule space cost increases over the number of rules. Finally, we evaluate the performance under various scales of networks by varying the number of switches, i.e., N ∈ {100, 200, 500, 1000, 2000}. The rules occupation cost is shown in Fig. 3.8(d). It can be observed that the cost is little affected by the size of networks for both the nonRM-CP and RM-CP schemes. This is because although the provided candidate short paths are found a little different in various scales of networks, the total required number of paths is always the same when the required traffic rate of each session is fixed. As a result, the induced rule space occupation cost maintains similar in different running cases. Further, we see that the RM-CP shows much more efficient than nonRM-CP again.

3.7.2

Performance of the nonRM-nonCP and RM-nonCP

We firstly evaluate the performance of the proposed heuristic Alg. 1 under schems nonRM-nonCP and RM-nonCP with the corresponding optimal solutions in small-scale network, e.g., the same network topology adopted in Section 3.6, in which parameters are set as: N = 10, |Ik | = 20, ci = 1, B(u,v) ∈ [80, 200], Cu ∈ [15, 20] and Dk ∈ [100, 200]. By varying K from 1 to 5, Fig. 3.9(a) shows the comparison between the optimal rule occupation and the result obtained by applying Alg. 1. Their performance results as a function of |Ik | in the range from 10 to 30 are also compared in Fig. 3.9(b). As we observe from both figures, the proposed Alg. 1 incurs only a little extra rule occupation, i.e., within 10% and 5% compared to

40

4

2

nonRM−CP.Opt RM−CP.Opt nonRM−nonCP.Alg RM−nonCP.Alg

8000

Rule space occupation

Rule space occupation

10000

6000 4000 2000 0

1

2 3 4 Number of sessions

1.5

nonRM−CP.Opt RM−CP.Opt nonRM−nonCP.Alg RM−nonCP.Alg

1 0.5 0

5

x 10

15

20 25 30 Number of rule subsets

35

(a) Rules cost v.s. number of sessions, (b) Rules cost v.s. number of rule subi.e., K sets, i.e., |Ik | Figure 3.10: Rule space occupation of nonRM-nonCP and RM-nonCP under randomly generated networks with 30 nodes. This suite of simulations emphasize on comparing the performance between nonRM and RM schemes, under the cases of CP and nonCP, respectively. the optimal solutions of nonRM-nonCP and RM-nonCP, respectively. Then, extensive simulation experiments are conducted to show the performance of the heuristic algorithm under schems nonRM-nonCP and RM-nonCP in 30-node networks that are randomly generated by linking any two nodes with probability of 0.2. Since the corresponding optimal solutions can be obtained in a timely manner, we show the optimal solution of former two schemes, i.e., nonRM-CP and RM-CP, instead for the purpose of comparison. The default settings of simulation parameters are as follows: |Ik | = 20, B(u,v) ∈ [100, 200], Cu ∈ [1500, 2000], ci ∈ [10, 100], and Dk ∈ [100, 200]. Fig. 3.10(a) shows the performance results of all four models under various number of sessions from 1 to 5. Both nonRM-nonCP and RM-nonCP models search their best routes for each session if available. For the models nonRM-CP and nonRM-nonCP that both apply the traditional rule placement mechanism, the latter can always obtain some improved performance. We attribute it to the gain achieved by jointly optimizing multi-path routing and rule placement. When our proposed rule multiplexing scheme is applied, the corresponding models RM-CP and RM-nonCP achieve significant performance improvement over nonRM-CP and nonRM-nonCP, respectively. However, when comparing RM-nonCP to RM-CP, we notice only slight improvement achieved. This shows that the rule multiplexing scheme is sometimes more efficient to improve performance, especially when the given candidate paths are good enough already. Finally, we show the experimental results in Fig. 3.10(b) when varying the number of rule subsets from 15 to 35 and fixing the number of unicast sessions to 5. The total rule space occupation of all models shows as an increasing function of rule subset sizes as explained in Section 3.7.1. Some other findings similar to the ones shown in Fig. 3.10(a) are also made. In summary, the advantage of 41

our rule multiplexing mechanism can be always observed that the RM scheme achieves 30% less rules cost than nonRM scheme under CP case, while this quantitative improvements are approximately 10%∼20% under nonCP case.

3.8

Summary

In this chapter, we propose a rule multiplexing scheme for rule placement with the objective of minimizing rule space occupation for multiple unicast sessions under QoS constraints. We formulate an optimization problem by jointly considering routing engineering (i.e., with or without the given candidate paths) and rule placement under both the existing nonRM-based and our proposed RM-based rule placement schemes. Due to the NP-hardness, we propose heuristic algorithms for minimization problems of RM-nonCP and nonRM-nonCP using the relaxation and rounding techniques. Two phases are included in the major heuristic algorithm. In the first phase, we select multiple paths for each session, while rules placement solution is found at the selected routing paths in the second phase. The computational complexity is also analyzed. Finally, extensive simulations are conducted to show that our proposals and heuristic algorithms save TCAM resources significantly.

42

Chapter 4

Cost Minimization for Rule Caching in Software Defined Networking [2] 4.1

Motivation and Problem Statement

Each network flow in SDN networks, is associated with a set of rules, such as packet forwarding, dropping and modifying, that should be installed at switches in terms of flow table entries along the flow path. SDN-enabled switches maintain flow rules in their local TCAMs [24, 54, 69], which support high-speed parallel lookup on wildcard patterns. In practice, network flow shows various traffic patterns. For example, we show real-time traffic of four network flow patterns [4] in Fig. 4.1, where some are burst transmission while the others have consecutive packet transmissions for a long time. For consecutive transmission, only the first packet experiences the delay of remote processing at the controller, and the rest will be processed by local rules at switches. However, for burst transmission, the corresponding rules cached in switches will be removed between two batches of packets if their interval is greater than the rule expiration time. As a result, remote packet processing would be incurred by the first packet of each batch, leading to a long delay and high processing burden on the controller. A simple method to reduce the overhead of remote processing is to cache rules at switches within the lifetime of network flow, ignoring the rule expiration time. Unfortunately, network devices are equipped with limited-space TCAMs because they are expensive hardware and extremely power-hungry. For instance, it is reported that TCAMs are 400 times more expensive [54] and 100 times more power-consuming [72] per Mbit than RAM-based storage. Since TCAM space is shared by multiple flow in net-

43

Packets (Bytes)

2000 1000 0 0 2000 1000 0 0 2000 1000 0 0 2000 1000 0

20

40

60

80

100

10

20

30

40

50

5

10

15

20

25

30

1 2 3 4 5 6 7 8 9 10 11 12 13 14 Time slot Figure 4.1: The sniffed TCP traffic flow [4].

works, it is inefficient and even infeasible to maintain all rules at local switches. This dilemma motivates us to investigate efficient rule caching schemes for SDN to strive for a fine balance between network performance and TCAM usage.

4.2

System model and Assumptions

We consider a discrete time model, where the time horizon is divided into T time slots of equal length l. Note that we have an significant assumption here for the length of each time slot. We assume that the length of each time-slot cannot be too short to make sure the rule-installation can be accomplished in a time slot. Normally, the rule installation procedure takes an interval varying from several to several hundreds of milliseconds in an industrial environment. Due to different length of time slot indicates different traffic patterns, we will show the effect of such the factor in the section of performance evaluation. Then, a network flow travels along a set of SDN-enabled switches, each of which is assigned a set of rules to implement routing or network management functions. For simplicity, in this paper, we study rule caching at a switch with a set of associated rules whose maintenance cost is α per second, and our results can be directly extended to all switches along the flow path. Note that the set of rules associated with the network flow will be cached or removed as a whole at the switch. When no matched flow table entries are available at the switch for the arriving packets, these packets will be sent to the controller for processing. We model the remote processing cost at controller as β. The ratio

44

Table 4.1: Notations and Symbols for the Second Topic Notations Description T maximum time-slot range of consideration l length of each time slot (seconds) at a binary indicator that denotes the positive flow rate in time slot t, t=1,2,..T δ sum of time slots where ai > 0 D amount of valid periods during [1, T ] Vi the ith valid period during [1, T ] Ei the ith empty period during [1, T ] α occupation cost of caching table entry β remote processing cost γ γ=α β xt yt CC CF

a binary variable indicating whether cache action happens in time slot t, t=1,2,..T a binary variable indicating whether fetch action happens in time slot t, t=1,2,..T total cost of cache action during [1, T ] total cost of fetch action during [1, T ]

between these two kinds of cost is denoted by γ, i.e., γ = α β . We consider arbitrary traffic pattern of the network flow. The set of time slots with (at > 0) and without (at = 0) packet transmission are referred to as valid period and empty period, respectively. All symbols and variables used in this chapter are summarized in Table 4.1.

4.3

Formulation

We define a binary variable xt to denote whether a flow entry is cached at the t-th time slot: ( 1, if rules are cached at the t-th time slot, xt = 0, otherwise. The occupation cost can be calculated as follows. CC = αl ·

T X

xt .

(4.1)

t=1

We also define a binary variable yt to indicate whether remote packet pro-

45

cessing is conducted.    1, if rules are remotely fetched in the t-th yt = time slot,   0, otherwise. and the corresponding cost can be calculated by: CF = βl ·

T X

yt .

(4.2)

t=1

With the global information of the given traffic flow, the Minimum Weighted Flow Provisioning (MWFP) problem can be formulated as follows: MWFP : min CT otal = CC + CF xt ,yt

s.t.

xt − xt−1 ≤ yt , t = 2, 3, .., T. t X

(4.3a)

yj ≥ xt , t = 1, 2, .., T.

(4.3b)

xt + yt ≥ at , t = 1, 2, .., T.

(4.3c)

xt , yt ∈ {1, 0}, t = 1, 2, .., T.

(4.3d)

j=1

The trigger of remote processing is represented by constraint (4.3a): the fetch must be made (yt = 1) if the required rules are not available at the (t − 1)th time slot (xt−1 = 0) and will be in the catch on the t-th time slot (xt = 1). Constraint (4.3b) indicates that the switch needs to fetch rules at least one time before caching. Constraint (4.3c) claims that arriving packets must be processed by local cached rules or remote controller. Note that the input of this problem are at , α and β, and the output is the scheduling solution, i.e., xt and yt , t=1,2,..T .

4.4

Offline algorithm

In this section, we propose a heuristic algorithm to solve the offline version of the problem. As shown in Algorithm Offline Greedy, we first generate a collection of time slot sets that represents possible rule cache periods as S ={ {1},{2},.., {T }, {1,2},{2,3},.., {T -1,T }, {1,2,3},{2,3,4},.., {T -2,T -1,T }, ..., {1,2,..,T -1}, {2,3,..,T }, {1,2,..,T }}. Each set is assigned a weight according to (4.4) in line 2. ( βl + αl · |Sj |, if |Sj | > 1; w(Sj ) = (4.4) βl · at , if |Sj | = 1, t ∈ Sj , ∀Sj ∈ S. Then, we select time slot sets for rule caching in an iterative manner from line 4 to 8. In each iteration, we choose the set X that minimizes the value of weight w(X) divided by number of elements not yet covered. 46

Algorithm 5 Offline Greedy Algorithm to solve MWFP Require: flow indicator set F = {a(t), t∈[1,T ]}, and T ={1,2,...,T } Ensure: The collection C of subsets of T S 1: generate sample collection S with S∈S S = T 2: generate the weight function w: S → R+ by invoking (4.4) 3: C ← ∅, and R ← T 4: while R 6= ∅ do w(X) 5: X ← arg minX∈S |X∩R| 6: C ←C ∪X 7: R←R\X 8: end while

Remark 1. The computing complexity of Alg.5 is O(T logT ), if we use binary search tree in line 5 of Alg.5. Proof. In the first, the main loop iterates for O(T ) time. Then, X in line 5 can be found in O(logm) time while using binary search tree, where m is the number of sets in instance S. Because |S| = 21 (T 2 + T ), we obtain the total computational time O(T )× O(log( 21 (T 2 + T ))) = O(T )O(log(T 2 )) = O(T )O(2logT ) = O(T logT ). Theorem 3. Alg.5 is (ln T +1)-approximation to the optimal solution of MWFP, where T is the maximum time slot. Proof. For each element (time slot) tj ∈ T , let Sj be the first picked set that covers it while applying Alg.5, and θ(tj ) denote the amortized cost of each element in Sj , w(Sj ) θ(tj ) = . |Sj ∩ R| P Obviously, the cost of Alg.5 can be written as tj ∈T θ(tj ). Then, let Tb = {t1 , t2 , ..., tT } denote the ordered set of elements in [1,T ] that each is covered. Note that, when tj is to be covered, apparently we have R ⊇ {tj , tj+1 , ..., tT }. We can see that R contains at least (T − j + 1) elements. Therefore, the amortized cost in Sj is at most the average cost of the optimum solution (denoted by OPT), i.e., θ(tj ) =

w(Sj ) OPT ≤ . |Sj ∩ R| T −j+1

By summing the θ(tj ) in all time slots, we get: X tj ∈T

θ(tj ) ≤ OPT(

1 1 1 + + · · · + + 1). T T −1 2

47

That is Greedy ≤ OPT · H(T ) ≤ OPT · (1 + ln T ), where H(T ) is called the harmonic number of T .

4.5

Online Algorithms

In this section, we consider the MWFP problem assuming that the packet traffic information is not given in advance. We first present several important observations in the optimal solution, followed by two proposed online algorithms with low computational complexity to approximate the optimal solution.

4.5.1

Typical actions in optimal solutions

By carefully examining the optimal solutions of several problem instances, we find that there exists several typical actions as follows. • FnC (Fetch and Cache): for valid periods with at least 2 time slots, flow rules are first fetched from the remote controller, and then they are cached at local switches. • SuF (Successive Fetch): for a valid period with at least 2 time slots, all packets are forwarded to the remote controller for processing. • CiE (Cache in Empty period): rules are cached in switches during the empty period between two valid periods. • NF (Only Forward): Packets are processed by the controller in the empty periods of one time slot. Above typical actions are illustrated in Fig. 4.2. The optimal solutions can be categorized into following three cases. • OPT-A: there are only SuF actions in the optimal solution. • OPT-B: there are only CiE actions in the optimal solution. • OPT-C: both SuF and CiE actions exist in the optimal solution.

4.5.2

Online Exactly Match the Flow Algorithm

Our first online algorithm, which is referred to as EMF (Exactly Match the Flow Algorithm), is shown in Alg. 6. In each time slot, each switch makes a decision, caching or fetching, according to observed network traffic. When there are packets arriving in the current time slot, i.e., at = 1, if no matched rules are cached, i.e., xt−1 = 0, switch fetches rules from the controller. Otherwise, we keep caching them in switches.

48

FNC

FNC

i

i+1

t t

i

i

i+1

FNC SUF CIE

t t

i+2

i+4

i+2 i+2

i+4

NF

i+5

|Vi+5|=1

i+3

Figure 4.2: Illustration of typical actions in optimal solution. Algorithm 6 online Exactly Match the Flow (EMF) Algorithm 1: for each time slot t ∈ [1, T ] do 2: if at =1 and xt−1 =0 then 3: fetch flow rules from the controller 4: yt ← 1, xt ← 1 5: else if at =1 and xt−1 =1 then 6: keep caching entries in switches 7: else if at =0 and xt−1 =1 then 8: remove the corresponding flow table entries 9: xt ← 0 10: end if 11: end for Lemma 1. Suppose there are D valid periods (denoted by Vi , i = 1,2,...,D) including δ valid time slots within [1, T ], the total cost of EMF is CEMF = βlD + αlδ.

(4.5)

Proof. In Alg. 6, flow rules are maintained in switches only when there are network traffic passing through, such that TCAM occupation cost can be easily calculated by αlδ. Since rule fetching action happens in the beginning of each valid period, we have fetching cost of βlD. By summing them up, the total cost of EMF can be calculated by (4.5). Lemma 2. When the optimal solution belongs to OPT-A, SuF is adopted by i |−1 any valid period Vi , we have γ > |V|V , ∀i = 1, 2, ..., D. i| Proof. Since SuF is adopted in the optimal solution, its cost must be less than FnC, i.e., |Vi |βl < βl + |Vi |αl ⇒β
, ∀i = 1, 2, ..., D. |Vi | − 1 |Vi | 49

Theorem 4. When the optimal solution belongs to OPT-A, the EMF algorithm is (γ + D δ )-competitive. Proof. Since the optimal solution belongs to OPT-A, i.e., all packets are sent to the controller for processing, it is easy to see that the total cost of optimal solution is βlδ. Thus, the competitive ratio λA is: bA = CEMF = αδ + βD = γ + D . λ βlδ βδ δ

Lemma 3. When the optimal solution belongs to OPT-B, rules are always maintained by switches once downloaded, i.e., CiE is adopted by the period Ei between any two valid periods Vi and Vi+1 , and we have γ < |E1i | , ∀i = 1, 2, ..., D − 1. Proof. If switches continue to cache rules once they’re downloaded, the TCAM occupation cost during Ei must be less than the fetching cost at the beginning of Vi+1 , i.e., αl(|Ei | + |Vi+1 |) < βl + |Vi+1 | · αl 1 ⇒ β > α|Ei | ⇒ γ < , ∀i = 1, 2, ..., D − 1. |Ei |

Theorem 5. When the optimal solution belongs to OPT-B, the EMF algorithm  D+γδ is 1+γ(δ+D−1) -competitive. Proof. As shown in Fig. 4.3, since rules are maintained in switches once downloaded under OPT-B, its total cost can be easily calculated by βl + αl(δ + PD−1 i=1 |Ei |), where the first term is the cost of the only fetching that happens in the beginning of the first valid period, and the second term is TCAM occupation cost. Combined with Lemma 1, the competitive ratio is: CEMF PD−1 βl + αl(δ + i=1 |Ei |) βD + αδ D + γδ ≤ = . β + α(δ + D − 1) 1 + γ(δ + D − 1)

bB = λ

Lemma 4. When the optimal solution belongs to OPT-C, there exists at least one valid period Vi , i ∈ [1, D] and one empty period Ej , j ∈ [1, D − 1], such that |Vi |−1 1 |Vi | < γ < |Ej | . 50

FNC

FNC

i

i+1

t t

i

i+1

i

FNC

i

i

CIE

t t

i

i+1

i i+1

i

Figure 4.3: Rules are cached in Ei because of CiE action. Extended FNC yt : xt :

Extended FNC

V3

V2

V1

Cover partly

FN C

Cover fully

VD

V4

Cover partly

Cover fully

Time slot

ECA solution: less FNC patterns, longer cache duration. Figure 4.4: An example of ECA solution. Proof. This lemma can be easily proved following similar analysis in Lemmas 2 and 3. Theorem 6. When the optimal solution belongs to OPT-C, the EMF algorithm  D+γδ is D+γ(δ−D+2) -competitive. Proof. Without loss of generality, we suppose CiE is adopted in empty periods {E1 , E2 , ..., Ex }, and SuF is adopted in valid periods {V1 , V2 , ..., Vy }. There are z NF actions in the optimal solution. The total cost of optimal solution belongs to OPT-C can be calculated by:

COPT−C = βl[D − x + ∑_{i=1}^{y}(|Vi| − 1)] + αl(δ + ∑_{j=1}^{x}|Ej| − ∑_{i=1}^{y}|Vi|) − αlz
       = βlD + αlδ + h(x, y, z),

where

h(x, y, z) = αl ∑_{j=1}^{x}|Ej| − βlx + [βl ∑_{i=1}^{y}(|Vi| − 1) − αl ∑_{i=1}^{y}|Vi|] − αlz.

Referring to Lemma 4, γ < 1/|Ej| ⇒ α − β < 0, and (|Vi| − 1)/|Vi| < γ ⇒ βl ∑_{i=1}^{y}(|Vi| − 1) − αl ∑_{i=1}^{y}|Vi| < 0. Therefore, h(x, y, z) < 0 and the ratio λC > 1.

Obviously, we have x ≥ 1, y ≥ 1 and z ≥ 0. We then consider two extreme cases. In the first case, there are multiple CiE actions but only one SuF action. Since there must be one empty period between these two types of patterns, CiE actions cover at most D − 2 empty periods, i.e., x ≤ D − 2. In the second case, there are only one CiE action and multiple SuF actions. The single CiE occupies at least two valid periods, so SuF actions use at most D − 2 valid periods, i.e., y ≤ D − 2. Furthermore, the y SuF actions occupy y valid periods, and the other D − y ones are left to CiE and NF actions, which cover x empty periods and z tiny valid periods, respectively. If there is only one CiE pattern, in which all the D − y valid periods are crossed with x empty periods, we have xmax + 1 + z = D − y, i.e., x + y + z ≤ D − 1. Finally, we obtain a feasible region of h(x, y, z), denoted by Λ = {(x, y, z) | 1 ≤ x ≤ D − 2, 1 ≤ y ≤ D − 2, z ≥ 0, x + y + z ≤ D − 1}, where x, y and z are all integers. The lower bound of h(x, y, z) is derived as follows:

h(x, y, z) = αl ∑_{j=1}^{x}|Ej| − βlx + (β − α)l ∑_{i=1}^{y}|Vi| − βly − αlz
           ≥ (α − β)lx + (β − α)l · 2y − βly − αlz
           = (α − β)lx + (β − 2α)ly − αlz
           ≥ (α − β)l + (β − 2α)l − αl(D − 3)
           = αl(2 − D).

Therefore, the competitive ratio can be expressed by:

λC ≤ (βD + αδ)/(βD + αδ + α(2 − D)) = (D + γδ)/(D + γ(δ − D + 2)).

4.5.3 Online Extra η time-slot Caching Algorithm

Figure 4.5: Performance of offline algorithms while varying γ (l=0.25 s, α=10, Trace 1).

Our proposed EMF algorithm attempts to minimize the TCAM occupation cost by caching flow rules only when there is network traffic passing through switches. However, it would incur frequent remote processing at the controller under burst packet transmissions. In this subsection, we further reduce the total cost by proposing the ECA (Extra Cache Algorithm), which specifies an expiration time for cached rules. As shown in Alg. 7, we specify a parameter η as input. In each time slot, if we decide to conduct a fetching action, the expiration time of the fetched rules, denoted by idle_timeout, is set to ηl. We show how to set the value of η to achieve the closest performance to the optimal solution by empirical analysis in the next section.

General cases of ECA

We let N0 denote the number of empty periods whose length is less than η. By representing the total number of empty time slots covered by η with L, we have the following theorem.

Theorem 7. The competitive ratios of ECA over OPT-A, OPT-B and OPT-C are ζA = (D − N0 + γ(δ + L))/δ, ζB = (D − N0 + γ(δ + L))/(1 + γ(δ + D − 1)) and ζC = (D − N0 + γ(δ + L))/(D + γ(δ − D + 2)), respectively.


Algorithm 7 Online Extra Cache Algorithm (ECA)
Require: η
1: for each time slot t ∈ [1, T] do
2:   if a fetch action happens in slot t then
3:     idle_timeout ← ηl for all entries to be installed
4:   end if
5:   t++
6: end for

Proof. As shown in Fig. 4.4, compared with an original optimal solution where no extended FnC pattern is included, in the ECA solution, if the length of an empty period is longer than η, the cached rules will be removed after the expiration time and refetched at the beginning of the next valid period. The total TCAM occupation cost can therefore be calculated by αl(δ + L). Otherwise, rules remain cached at the switch during the (short) empty period, so rules only need to be fetched D − N0 times, giving a fetching cost of βl(D − N0). Therefore, the total cost of ECA with η is

CECA(η) = βl(D − N0) + αl(δ + L).   (4.6)

Following an analysis similar to Theorems 4, 5, and 6, we can easily obtain the competitive ratios ζA = (D − N0 + γ(δ + L))/δ, ζB = (D − N0 + γ(δ + L))/(1 + γ(δ + D − 1)) and ζC = (D − N0 + γ(δ + L))/(D + γ(δ − D + 2)) over OPT-A, OPT-B, and OPT-C, respectively.
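To make the bookkeeping behind (4.6) concrete, the sketch below (an illustrative example, not part of this dissertation's simulator) applies the idle-timeout rule of Alg. 7 to a binary activity sequence: an inner empty period shorter than η is bridged by the timeout (no refetch), a longer one lets the rules expire. Idle time after the last valid period is ignored, matching the D − 1 inner empty periods used in the analysis.

# Sketch: cost of ECA (Alg. 7) on a binary activity sequence a[t].
def eca_cost(a, alpha, beta, l, eta):
    # Split the horizon into alternating valid/empty periods.
    periods, t, T = [], 0, len(a)
    while t < T:
        kind, start = ("valid" if a[t] == 1 else "empty"), t
        while t < T and (a[t] == 1) == (kind == "valid"):
            t += 1
        periods.append((kind, t - start))
    valid = [n for k, n in periods if k == "valid"]
    # Empty periods lying strictly between two valid periods (D - 1 of them at most).
    inner_empty = [n for i, (k, n) in enumerate(periods)
                   if k == "empty" and 0 < i < len(periods) - 1]
    D, delta = len(valid), sum(valid)
    N0 = sum(1 for n in inner_empty if n < eta)      # empty periods fully bridged
    L = sum(min(n, eta) for n in inner_empty)        # extra slots rules stay in TCAM
    return beta * l * (D - N0) + alpha * l * (delta + L)

if __name__ == "__main__":
    a = [0, 1, 1, 0, 1, 0, 0, 0, 1, 1]               # D=3, delta=5
    print("C_ECA(eta=2) =", eca_cost(a, alpha=10.0, beta=2.0, l=0.25, eta=2))

Setting eta=0 reduces the cost to the EMF cost of Lemma 1, since no empty slot is covered and every valid period triggers a fetch.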

Special case of ECA

Suppose the length (denoted by the random variable X) of all the empty periods Ej (j = 1, 2, ..., D − 1) is exponentially distributed with mean value µe, i.e., X ∼ exp(1/µe). The value of N0 can be calculated by:

N0 = (D − 1) · Pr(X ≤ ηl) = (D − 1)(1 − e^{−ηl/µe}).   (4.7)

And the total length of the empty periods completely covered by η can be written as

L1 = (D − 1) · E(X ≤ ηl) = (D − 1) ∫_0^{ηl} X · (1/µe) e^{−X/µe} dX = µe(D − 1)[1 − (ηl/µe + 1)e^{−ηl/µe}].   (4.8)

On the other hand, the total length in the empty periods that are only partially cut by η can be simply calculated as

L2 = (D − 1 − N0) · (ηl) = ηl(D − 1)e^{−ηl/µe}.   (4.9)

Figure 4.6: Performance of offline algorithms while varying l: (a) γ=0.2, α=10, Trace 1; (b) γ=0.2, α=10, Trace 1; (c) γ=0.2, α=10, three traces.

Therefore, the cost of ECA with parameters η, µe and l is

CECA(η, µe, l) = αl(δ + L1 + L2) + βl(D − N0)
             = α(δl + (D − 1)µe(1 − e^{−ηl/µe})) + βl(1 + (1 − D)e^{−ηl/µe}).   (4.10)

Similarly, the competitive ratio of ECA under this special case can also be derived following the approach in the proof of Theorem 7. In the performance evaluation, we conduct extensive simulations under various network settings to find the best value of η, which leads to the closest performance to the optimal solution.
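As a quick numerical sanity check of (4.7)–(4.9) (a sketch with arbitrary example parameters, not results from this dissertation), one can sample exponential empty-period lengths and compare the empirical N0 and covered length with the closed forms:

# Monte-Carlo check of (4.7)-(4.9): empty-period lengths X ~ exp(1/mu_e).
import math
import random

def empirical_vs_closed_form(D=1000, mu_e=1.0, eta=4, l=0.25, trials=20000):
    random.seed(0)
    n0_emp, cover_emp = 0.0, 0.0
    for _ in range(trials):
        xs = [random.expovariate(1.0 / mu_e) for _ in range(D - 1)]
        n0_emp += sum(1 for x in xs if x <= eta * l)
        # covered length: the whole period if shorter than eta*l, else the first eta*l
        cover_emp += sum(min(x, eta * l) for x in xs)
    n0_emp, cover_emp = n0_emp / trials, cover_emp / trials
    n0_th = (D - 1) * (1 - math.exp(-eta * l / mu_e))
    L1 = mu_e * (D - 1) * (1 - (eta * l / mu_e + 1) * math.exp(-eta * l / mu_e))
    L2 = eta * l * (D - 1) * math.exp(-eta * l / mu_e)
    print("N0:    empirical %.1f  vs  (4.7)        %.1f" % (n0_emp, n0_th))
    print("L1+L2: empirical %.1f  vs  (4.8)+(4.9)  %.1f" % (cover_emp, L1 + L2))

if __name__ == "__main__":
    empirical_vs_closed_form()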

4.6 Evaluation

We conduct extensive simulations in this section to evaluate the performance of our proposed algorithms and the derived competitive ratios.

4.6.1 Simulation Settings

We adopt network traces [4], a collection of TCP packets and Ethernet frames captured at a wired hub using the Wireshark tool. Each trace file is collected within around 50 seconds.

Figure 4.7: Performance of EMF and ECA over OPT-A: (a) γ=2, η=5, trace 3; (b) γ=2, η=5, trace 3; (c) η=5, l=0.5 s, trace 3; (d) γ=1, l=0.2 s, trace 3.

We first process these trace files by accumulating the traffic volume of each time slot of length l. Both the proposed algorithms (Greedy, EMF and ECA) and the legacy algorithms (Proactive and Reactive) are implemented in our simulation. Note that in the adopted legacy Proactive algorithm, rules are fetched only in the first time slot and cached for all the remaining duration, resulting in a cost of βl + Tαl. In contrast, the Reactive algorithm triggers remote processing in each time slot that carries traffic, with a total cost of (β + α)δl. To obtain the offline optimal solutions (denoted by Offline OPT), the commercial solver Gurobi optimizer [78] is used. In each suite of simulations, we always fix α=10, and the settings of the other parameters are labeled below the figures.

4.6.2 Evaluation of Offline Algorithm

As shown in Fig. 4.5, when α = 10 and l = 0.25 s, the total cost of all algorithms decreases as γ grows from 0.01 to 10. This is because, under a larger γ, remote fetching is preferred as it leads to a lower cost. It also can be seen from Fig. 4.5(b) that the Reactive strategy produces a much higher cost compared with the other algorithms, because remote fetching operations are always triggered even if γ is small. On the other hand, the Proactive way is competitive to Greedy only when γ is very small, as shown in Fig. 4.5(b), but it generates an even higher cost than the Reactive strategy, as shown in Fig. 4.5(c), once γ grows bigger than 0.4.

Figure 4.8: Performance of EMF and ECA over OPT-B: (a) γ=0.2, η=2, trace 2; (b) γ=0.2, η=2, trace 2; (c) η=2, l=0.2 s, trace 2; (d) γ=0.2, l=0.2 s, trace 2.

In contrast, our proposed Greedy algorithm always performs close to the offline optimal cost. Specifically, from Fig. 4.5(c) we observe that the costs of all algorithms converge when γ is greater than 2, because they generate only fetch operations. We then investigate the influence of the time slot length on the total cost by changing the value of l from 0.1 s to 5 s. As shown in Fig. 4.6(a), the total cost increases with the growth of l under the optimal solutions. We observe some fluctuation in the performance of our greedy algorithm. That is because in each iteration of the while loop in Alg. 5, we prefer tiny valid periods, which may have some time slots already contained. From Fig. 4.6(b), it also can be seen that Greedy outperforms the legacy Proactive and Reactive strategies. The performance ratio of our algorithm to the optimal solution is shown in Fig. 4.6(c), where the ratio is very close to 1, much better than the analytical upper bound.

4.6.3 Evaluation of Online EMF and ECA

Then, the online EMF and ECA algorithms are evaluated with real network traffic traces under the three cases, respectively. Note that the legacy Proactive and Reactive strategies are only shown in the group of simulations where l varies; similar results are obtained by varying the parameters γ and η and are thus omitted.


Figure 4.9: Performance of EMF and ECA over OPT-C while varying γ and η: (a) η=2, l=0.2 s, trace 2; (b) γ=0.625, l=0.2 s, trace 2; (c) η=2, l=0.4 s, trace 2; (d) γ=0.625, l=0.4 s, trace 2.

Over OPT-A

The total cost under different values of l is shown in Fig. 4.7(a), where the results of all algorithms are increasing functions of l. On the other hand, the performance ratio of EMF and ECA over the optimal solutions decreases with the growth of l, as shown in Fig. 4.7(b). Furthermore, we observe that the performance ratios are always equal to the derived upper bound. That is because only SuF and NF patterns exist in the optimal solutions under this case, i.e., packets are always processed by the remote controller. Finally, we show the performance ratio under different values of η in Fig. 4.7(d). We observe that the ratio of ECA is the same as that of EMF when η=0, and the ratios of EMF never change because of the fixed γ and l. In Figs. 4.7(b), 4.7(c), and 4.7(d), EMF outperforms ECA with performance closer to the optimal solution under most settings. We attribute this phenomenon to the fact that rules are cached at switches for a longer time under ECA.

Over OPT-B

In this case, packets tend to be processed at the local switch because γ becomes small. In Figs. 4.8(a) and 4.8(b), we have observations similar to those under the OPT-A case, except that ECA outperforms EMF.

Figure 4.10: Performance of EMF and ECA over OPT-C while varying l: (a) γ=0.625, η=2, trace 2; (b) γ=0.625, η=2, trace 2.

This can be attributed to the fact that there are mainly CiE patterns and few NF patterns under the OPT-B case, and the extra caching durations of ECA cover many empty periods. Accordingly, the cost of ECA is smaller than that of EMF, particularly when γ and η become large. Finally, both algorithms converge to the optimal solution under larger values of l. In respect of the performance ratio, Fig. 4.8(c) shows that the ratio is a decreasing function of γ for both ECA and EMF, because a larger γ leads to more short FnC patterns in the optimal solution, which makes ECA and EMF closer to the optimal solution. Therefore, their ratios decrease and gradually approach 1. As mentioned above, Fig. 4.8(d) shows the benefit of a larger η in ECA, because more empty periods are covered and much fetching cost can be saved under the OPT-B case.

Over OPT-C

The performance of ECA and EMF is investigated in Figs. 4.9 and 4.10. We observe that ECA and EMF show distinct performance under different settings. For example, in Fig. 4.10(a), the cost of ECA is larger than that of EMF when l=0.2, but the opposite holds when l is greater than 0.3. We have similar observations in the other figures. Interestingly, in Fig. 4.9(b), we observe that the curve of ECA first increases, then decreases when η is greater than 3, finally converging to 1.28 after η = 4. This is because when η is small, it only covers a few empty periods of short length, and the advantage of extra caching is not obvious. As η becomes larger, the number of covered empty periods grows, leading to a reduced total cost. When all empty periods are covered by a large η, the performance of ECA becomes stable. In Fig. 4.9(d), the ratios of ECA keep decreasing and then converge, because short empty periods become fewer when l increases to 0.4. Additionally, in Figs. 4.7(a), 4.8(a) and 4.10(a), we can always observe that the Reactive approach creates an extremely high total cost and that the proposed EMF and ECA perform better than both the Proactive and Reactive strategies. This also validates the efficiency of the proposed online algorithms.


Figure 4.11: Performance of ECA over OPT-A, OPT-B and OPT-C under the special case: (a) ECA over OPT-A, µe=1.0, γ=1, l=0.25; (b) ECA over OPT-B, µe=1.0, γ=0.2, l=0.25; (c) ECA over OPT-C, µe=1.0, l=0.5.

4.6.4 Evaluation of Special Case of ECA

Finally, we study the performance of ECA when the lengths of the empty periods are exponentially distributed. We consider randomly generated network traffic with µe=1.0 and T=100. As shown in Fig. 4.11(a), the ratio of the ECA algorithm increases as η grows from 0 to 10, which shows the same trend as Fig. 4.7(d). In Fig. 4.11(b), we have observations similar to Fig. 4.9(b), for the same reasons. In Fig. 4.11(c), we set l=0.5, and the performance ratios are always below the theoretical bound as γ and η change within [0.5, 0.8] and [0, 10], respectively. The simulation results also suggest that η should be set to a small value no matter how γ changes.


4.7 Summary

In this chapter, we study the traffic flow provisioning problem by formulating it as a minimum weighted flow provisioning problem with the objective of minimizing the total cost of TCAM occupation and remote packet processing. An efficient heuristic algorithm is proposed to solve this problem when the network traffic is given. We further propose two online algorithms to approximate the optimal solution when the network traffic information is unknown in advance. Finally, extensive simulations are conducted with real traffic traces to validate the theoretical analysis of the proposed algorithms.


Chapter 5

Near-Optimal Routing Protection for In-Band Software-Defined Networks [3]

5.1 Motivation and Problem Statement

In this section, we first give the reasons that motivate us to study routing protection for in-band SDN networks. Then, the goal of this study is presented.

5.1.1 Motivation

As we have discussed in the first chapter, although the in-band connection is a practical approach, it comes with many challenges. One particular challenge is how to provide resilient communications between the switch and the controller in case of link failures. In recent studies, Google has reported high delays and failures in configuring switches, with a failure rate between 0.1% and 1% [35]. In a large network, failures on the data plane occur even more frequently, and a disconnected link can last 30 minutes on average [79]. In the in-band fashion of SDN networks, where control-plane traffic shares the medium with data plane traffic, even a single link failure may disconnect a large number of switches from their controllers, resulting in much worse damage than in the out-of-band fashion [49]. To better understand this, we illustrate an example in Fig. 5.1, where switches connect to a remote controller via in-band multi-hop relay connections. If the link (0,2) fails, switches 2, 4 and 5 will lose their connections with the remote controller.


Figure 5.1: An illustrative link failure occurring in an in-band SDN.

Consequently, packets may not be processed correctly in the control-lost switches, thus leading to performance degradation, such as packet loss, loop routing, and suboptimal or infeasible routing actions [26, 46]. In the following, we discuss why we study routing protection for the in-band control plane from two perspectives. On one hand, we notice that the emphasized control-plane oriented routing protection problem looks very similar to data plane routing protection, which has received much attention in the literature, primarily utilizing the policy of local rerouting [34, 38, 40, 80–83], such as detour, forward local rerouting and backward local rerouting. However, we argue that this policy potentially brings congestion to the links near the failed one, resulting in higher control latency for a fraction of the controller-to-switch channels. In contrast, we strive to find a more robust end-to-end global rerouting solution corresponding to the flow swapping scenario [67, 68], especially in a dynamic environment, where flows are frequently added or removed in certain groups of links simultaneously. On the other hand, with respect to the methodology to address the routing recovery problem, there are two major categories: restoration [30, 39] and protection [40, 84]. In the former scheme, the recovery paths can be either preplanned or calculated on demand, but network resources (such as forwarding rules and link bandwidth) will not be allocated until a failure is detected. We can see that such an approach inherently results in long recovery time and high packet loss. In contrast, in the protection scheme, backup resources are always pre-planned and reserved such that once a failure is detected, recovery can be made immediately. As a result, when fast recovery is a major concern, protection is preferred. In addition, experimental studies in the literature [30, 39] reveal that path protection is better qualified than restoration with respect to the sub-50 ms fast failure recovery requirement [66] of carrier-grade networks. Therefore, we adopt the protection scheme in our third emphasized issue.


5.1.2 Our Goal

When applying routing protection to the in-band control plane of SDN networks, two concerns should be carefully addressed. Since control-plane traffic shares the bandwidth resource with data plane traffic, the first issue is how to provide guaranteed bandwidth to control traffic. The network operator should carefully schedule the bandwidth utilization of control traffic, such that the network performance (e.g., throughput) can be optimized. One way to achieve this goal is to balance the control traffic over all links. Another concern is the setup cost of the control channels, because the establishment and maintenance of TCP/TLS connections require a number of routing table entries [85] and message exchanges. Generally, the setup cost is positively proportional to the length of the control-channel path. Consequently, the length of each selected path should also be taken into consideration when launching the control channels. To this end, we study a weighted cost minimization problem, in which control-plane traffic load balancing and control-channel setup cost are jointly considered when selecting protection paths for control channels. Since multi-resource constrained routing is known to be NP-complete [86, 87], we propose a near-optimal algorithm using the Markov approximation technique [88]. Extensive simulations are conducted to show that our proposed algorithm converges quickly and is more efficient in resource utilization than the existing benchmark approaches.

5.2 System Model and Formulation

5.2.1 Preliminary

Control Connection Establishment and Maintenance

In our emphasized network, one controller or multiple controllers coordinately manage an SDN switch remotely over an intermediate network. The connecting fashion of this network can be either a separate dedicated network (out-of-band controller connection), or a hop-by-hop relaying network managed by switches (in-band controller connection) [26]. The only requirement is that the intermediate network should provide TCP/IP connectivity. Typically, every controller-to-switch channel (also called a controller-switch session) can be established as a single network connection between the switch and the controller, using TLS or plain TCP protocols [26]. Once a control connection is built, it must be maintained by the underlying TLS or TCP connection mechanisms, until the connection is terminated by TCP timeouts or TLS session timeouts [26].


Figure 5.2: The protection of control-plane traffic for controller-switch sessions: the configuration of the working path and backup paths for session s = (C, Sm). Note that the number of controllers can be more than one; here we only illustrate an example with one controller.

Protection for Control Connection

We consider the Dedicated Backup Path Protection (DBPP) scheme [89, 90], which belongs to the hot-backup protection category. For example, one popular DBPP is the ‘1+1’ protection [39, 84, 91], where a working path is protected by one dedicated backup path and traffic is duplicated on both the working path and the backup path. Note that our proposed approach is a general framework that can be applied to other recovery mechanisms as well.

5.2.2 System Model and Assumptions

Given a set of controller-switch sessions S in an SDN G = (V, E) with switch set V and link set E, each controller-switch session s ∈ S is equipped with a set of required in-use paths, denoted as Ds, which includes one working path and |Ds| − 1 backup paths. For example, in Fig. 5.2, all in-use paths for session s between controller C and switch Sm at time t are illustrated as (s,1), (s,2), ..., (s,|Ds|). The adopted DBPP scheme belongs to shared-link protection [89], and therefore the provided candidate paths in Js for each session s should be as disjoint as possible. This is because the primary working path and the backup paths are unlikely to fail at the same time under a single link failure when the provided candidate paths are highly disjoint. However, we do not focus on how to find a sufficiently disjoint candidate path set for each session in this chapter. Let cl and dl (l ∈ E) denote respectively the link bandwidth capability and the aggregated data plane traffic load on link l. The currently available link bandwidth for control-plane traffic is given by (cl − dl, l ∈ E). The major notations used in this chapter are summarized in Table 5.1.


Table 5.1: Notations and Symbols for the Third Topic

(V, E): an SDN with switch set V and link set E
S: a set of controller-switch sessions
Js: a set of candidate paths for session s ∈ S
Ds: a set of required in-use paths for session s ∈ S, including one working path and |Ds| − 1 (|Ds| ≥ 2) backup paths
Rs: demanding traffic rate of session s ∈ S
rl: aggregated control-plane traffic load on link l ∈ E
dl: aggregated data plane traffic load on link l ∈ E
cl: bandwidth capability on link l ∈ E
φv: currently available rule space capacity on node v ∈ V
|·|: size of a set or length of a path
zsp: a binary variable indicating whether session s ∈ S selects path p ∈ Js as one of its required in-use paths
f: a feasible routing-protection configuration for all sessions
F: the set of all feasible configurations for the whole system
us(f): system cost of a session s ∈ S under configuration f ∈ F
uf: overall system cost under a given configuration f ∈ F, i.e., uf = ∑_{s∈S} us(f)

5.2.3 Problem Formulation

The Control-Plane Routing Protection (CPRP, for short) problem is stated and formulated as follows.

Path Selection

The target of the CPRP problem is to find the optimal path set for all controller-switch sessions. To denote whether candidate path p ∈ Js is selected by session s ∈ S as one of its required in-use paths, we define a binary variable zsp as:

zsp = 1, if candidate path p ∈ Js is selected by s as one of its required in-use paths; 0, otherwise.

With this definition, the entire decision space of all possible path selections can be expressed as:

F = [{zsp} | zsp ∈ {0, 1}, ∑_{p∈Js} zsp = |Ds|, s ∈ S].

Minimization of Joint Weighted System Cost

SDN network operators perform traffic engineering to improve resource utilization, which is normally measured in terms of two objectives:
• To ensure load balance by decreasing the aggregated traffic load on the most severely congested links.
• To reduce the node cost, which is measured by the average number of configuring operations in each switch.

Consequently, the overall weighted system cost described in our objective function includes two terms: 1) the largest control flow rate over all links, and 2) the average connection-setup cost on each switch node. As a result, the CPRP problem is formulated as the following integer programming optimization:

CPRP: Minimize  max_{l∈E}(rl) + (ω/|V|) ∑_{s∈S} ∑_{p∈Js} zsp · |p|   (5.1a)
s.t.:
∑_{p∈Js} zsp = |Ds|, ∀s ∈ S   (5.1b)
rl = ∑_{s∈S} ∑_{p∈Js, l∈p} zsp · Rs, ∀l ∈ E   (5.1c)
rl ≤ cl − dl, ∀l ∈ E   (5.1d)
∑_{s∈S} ∑_{p∈Js, v∈p} zsp ≤ φv, ∀v ∈ V   (5.1e)
Variables: zsp ∈ {0, 1}, ∀p ∈ Js, ∀s ∈ S

Objective function (5.1a) is the proposed overall weighted system cost that captures both objectives discussed above. In the objective function, the first term max_{l∈E}(rl) denotes the largest control traffic rate over all links, while the second term (1/|V|) ∑_{s∈S} ∑_{p∈Js} zsp · |p| indicates the average connection-setup cost on each switch node. Furthermore, the tradeoff between these two cost terms can be freely tuned by the weighting factor ω. Constraint (5.1b) states that for each session s, exactly |Ds| in-use paths must be selected from its candidate path set Js. Then, (5.1c) calculates the aggregated traffic rate on each link as the sum of the traffic demands of all passing-through sessions. The capacity constraints on links and nodes are specified by constraints (5.1d) and (5.1e), respectively.
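Since the Gurobi Python interface is used later for the optimal baseline, a compact sketch of how (5.1) could be encoded with it is given below. The instance encoding (paths given as node tuples, links derived from consecutive nodes and keyed by sorted node pairs, and |p| approximated by the number of nodes on the path) is an illustrative assumption rather than the exact model of this dissertation; the max term is linearized with an auxiliary variable rmax.

# Sketch of the CPRP integer program (5.1) in the Gurobi Python API.
import gurobipy as gp
from gurobipy import GRB

def solve_cprp(sessions, J, D, R, cap, dload, phi, omega=1.0):
    # sessions: session ids; J[s]: candidate paths as node tuples; D[s]: |Ds|;
    # R[s]: control traffic demand; cap/dload: per-link capacity and data-plane
    # load keyed by sorted node pairs; phi[v]: remaining rule space on node v.
    links, nodes = list(cap.keys()), list(phi.keys())
    path_links = {(s, i): {tuple(sorted((p[k], p[k + 1]))) for k in range(len(p) - 1)}
                  for s in sessions for i, p in enumerate(J[s])}
    m = gp.Model("CPRP")
    z = m.addVars(list(path_links.keys()), vtype=GRB.BINARY, name="z")
    r = m.addVars(links, lb=0.0, name="r")
    rmax = m.addVar(lb=0.0, name="rmax")
    m.addConstrs((z.sum(s, "*") == D[s] for s in sessions), name="c51b")
    m.addConstrs((r[l] == gp.quicksum(R[s] * z[s, i]
                  for (s, i), ls in path_links.items() if l in ls)
                  for l in links), name="c51c")
    m.addConstrs((r[l] <= cap[l] - dload[l] for l in links), name="c51d")
    m.addConstrs((gp.quicksum(z[s, i] for (s, i) in path_links if v in J[s][i]) <= phi[v]
                  for v in nodes), name="c51e")
    m.addConstrs((rmax >= r[l] for l in links), name="maxlink")   # linearizes max_l r_l
    setup = gp.quicksum(len(J[s][i]) * z[s, i] for (s, i) in path_links)
    m.setObjective(rmax + (omega / len(nodes)) * setup, GRB.MINIMIZE)
    m.optimize()
    return [(s, i) for (s, i) in path_links if z[s, i].X > 0.5]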

5.3 Near-Optimal Path Selection Algorithm

Path selection under constrained resources is known to be an NP-complete problem [86, 87]. The CPRP problem is a combinatorial optimization in which the global optimal solution consists of local path-selection decisions. Since there is no computationally efficient centralized solution, we design a distributed algorithm that solves the problem following the framework of the Markov approximation technique [88]. In the following, we specify the two steps in designing our algorithm under the Markov approximation framework: the log-sum-exp approximation and the implementation of Markov chains.

5.3.1 Log-Sum-Exp Approximation Approach

Let f = {zsp, ∀p ∈ Js, ∀s ∈ S} denote a configuration for the CPRP problem, and F the set of all feasible configurations that are already known. For convenience of presentation, we denote by uf the system objective function (5.1a) corresponding to a given configuration f. To better understand the log-sum-exp approximation, we also let each configuration f ∈ F be associated with a probability pf, indicating the percentage of time that configuration f is in use. Then, CPRP can be approximated by the following optimization problem via the approximation technique in [88]:

CPRP(β): min ∑_{f∈F} pf uf + (1/β) ∑_{f∈F} pf log pf   (5.2a)
s.t.: ∑_{f∈F} pf = 1   (5.2b)

where β is a large positive constant related to the performance of this approximation approach. The motivation behind this approximation is that it potentially leads to distributed solutions. Let p*_{f∈F} be the optimal solution of the CPRP(β) problem, and λ the Lagrangian multiplier associated with the equality constraint in (5.2) under p*f. Then, by solving the following Karush-Kuhn-Tucker (KKT) conditions [92] of the problem in (5.2):

uf + (1/β) log p*f + 1/β + λ = 0, ∀f ∈ F
∑_{f∈F} p*f − 1 = 0   (5.3)
λ ≥ 0

we obtain the optimal solution:

p*f = exp(−βuf) / ∑_{f′∈F} exp(−βuf′), ∀f ∈ F.   (5.4)

Remark 2. With the log-sum-exp approximation approach described above, we obtain an approximate version of the CPRP problem with the assistance of an entropy term (1/β) ∑_{f∈F} pf log pf. If we can time-share among different configurations according to the optimal solution p*f in (5.4), then CPRP can be solved approximately within a bound (1/β) log |F|, which can be made small by choosing a large β.
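A quick numerical illustration of (5.4) and of the (1/β) log |F| bound in Remark 2 (the four example costs below are arbitrary toy values) shows how a larger β concentrates the optimal time-sharing distribution on the minimum-cost configuration:

# Numeric illustration of (5.4): p*_f is a softmax over configuration costs.
import math

def optimal_distribution(costs, beta):
    m = min(costs)
    w = [math.exp(-beta * (u - m)) for u in costs]   # shifted for numerical stability
    s = sum(w)
    return [x / s for x in w]

if __name__ == "__main__":
    u = [93.0, 101.0, 117.0, 141.0]                  # u_f for four example configurations
    for beta in (0.1, 1.0, 10.0):
        p = optimal_distribution(u, beta)
        gap = sum(pf * uf for pf, uf in zip(p, u)) - min(u)   # time-shared cost vs optimum
        bound = math.log(len(u)) / beta                        # (1/beta) * log|F|
        print("beta=%.1f  p*=%s  gap=%.3f  bound=%.3f"
              % (beta, [round(x, 3) for x in p], gap, bound))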

5.3.2 Markov Chain Design

Here we design a Markov Chain (MC) with a state space being the set of all feasible configurations F and a stationary distribution given by p*f in (5.4). Since the system is operated under different configurations, a transition between two states in the designed MC indicates swapping the in-use paths of some session. Therefore, in the implemented MC, if the transitions among states can be trained to converge to the desired stationary distribution p*f, the system can achieve near-optimal performance. To construct a time-reversible MC [88] with stationary distribution p*f, we let f, f′ ∈ F denote two states of the MC, and use qf,f′ as the nonnegative transition rate from state f to f′. Furthermore, we have to ensure that: (a) in the resulting MC, any two states are reachable from each other, and (b) the detailed balance equation p*f qf,f′ = p*f′ qf′,f, ∀f, f′ ∈ F, is satisfied. Our design is as follows.

State-Space Structure

Recall that a configuration f ∈ F represents a set of in-use paths for all sessions. Initially, we set the transition rate between two configurations f and f′ to be 0, unless they satisfy the following two conditions:
• C1: |f ∪ f′| − |f ∩ f′| = 2
• C2: f ∪ f′ − f ∩ f′ ∈ Js̄

where s̄ is the session which causes the state transition f → f′. That is, if session s̄ makes a single path swap, state f transits to f′.

where s¯ is the session which causes the state transition f → f . That is, if 0 session s¯ makes a single path swapping, the state f transits to f . Transition Rate Matrix Training Both traffic statistics on links and the number of forwarding rules installed in switches can be easily pulled by controllers. For example, in OpenFlow networks [26], traffic rate in each path can be measured via meter table entries and inquired through Controller-to-Switch Echo messages. On the other hand, the current consumption of the flow table in a switch can be obtained by controller via sending Read-State messages. Therefore, the cost of current network configuration can be acquired by controller at any time. Particularly, we let the transition rate qf,f 0 positively correlated to the difference of system performance 0 under two adjacent configurations f and f in the state matrix. In detail, the transition rate qf,f 0 is designed as:  0 1 X 1  exp( β (us (f ) − us (f ))) qf,f 0 =    exp(τ ) 2 s∈S (5.5) X 0 1 1    qf 0 ,f = exp( β (us (f ) − us (f )))  exp(τ ) 2 s∈S


Algorithm 8 Sojourn-and-Transit Algorithm to Solve CPRP
1: for each s ∈ S do
2:   execute Procedure Initialization
3:   execute Procedure Set-timer for s
4: end for
5: while system is still running do
6:   /*Procedure Transit*/
7:   if Ts expires then
8:     zs^{pold} ← 0
9:     zs^{pnew} ← 1
10:    execute Procedure Set-timer for s
11:    broadcast RESET(uf′, {s}) signal for other sessions
12:  end if
13:  /*Procedure RESET*/
14:  if session s receives a RESET(uf′, S̄) message then
15:    uf ← uf′
16:    refresh and start timer Ts (s ∈ S − S̄) invoking (5.6)
17:  end if
18: end while

Algorithm 9 Startup for a session (Procedure Initialization)
Require: a session s ∈ S, Js
Ensure: Ds
1: initialize a dedicated processing-thread for s
2: Ds ← randomly select |Ds| feasible paths from Js

where τ is a conditional positive constant that avoids overflow of exp(·) in computation. The design of qf,f′ in (5.5) is, in practice, likely to make the system switch to a configuration with better performance. This is because when ∑_{s∈S}(us(f) − us(f′)) > 0 and the performance gap between f and f′ is larger, the transition rate qf,f′ will be bigger, and vice versa.
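A quick way to see why (5.5) yields the target stationary distribution is to check the detailed balance condition p*f qf,f′ = p*f′ qf′,f numerically (the per-session costs below are toy values used only for illustration):

# Check detailed balance for the transition rates in (5.5).
import math

def q(us_f, us_fp, beta, tau):
    # us_f, us_fp: per-session costs under configurations f and f'
    return math.exp(0.5 * beta * (sum(us_f) - sum(us_fp))) / math.exp(tau)

if __name__ == "__main__":
    beta, tau = 0.5, 1.0
    us_f, us_fp = [3.0, 2.0, 4.0], [3.0, 1.5, 4.0]   # only one session's cost changes
    # Unnormalized stationary weights from (5.4); the common normalizer cancels.
    pf, pfp = math.exp(-beta * sum(us_f)), math.exp(-beta * sum(us_fp))
    lhs = pf * q(us_f, us_fp, beta, tau)
    rhs = pfp * q(us_fp, us_f, beta, tau)
    print(lhs, rhs)          # equal up to floating-point error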

5.3.3 Implementation of MC Guided Algorithm

The implementation based on our designed Markov chain is shown in Algorithm 8, in which controller creates a dedicated processing-thread for each of its holding sessions. Therefore, this algorithm can execute on single controller or multiple parallel controllers, which can apportion all the processing-threads. We will consider how to assign threads to multiple controllers in our future work. Typically, each dedicated thread follows a general state machine shown in Fig. 5.3, using which we explain the procedures of this algorithm for the single-controller case as follows. 70

Algorithm 10 Set timer for a session (Procedure Set-timer)
Require: session s ∈ S
Ensure: Ts, pold, pnew
1: pold ← one in-use path randomly selected from Ds
2: pnew ← one feasible not-in-use path randomly selected from Js\Ds
3: measure the current system cost uf
4: estimate the system cost uf′ of the target configuration if pold were replaced with pnew
5: generate a random exponentially distributed timer Ts for thread s with mean equal to

exp(τ − (1/2) β ∑_{s̄∈S} (us̄(f) − us̄(f′))) / (|Ds| · (|Js| − |Ds|)),   (5.6)

and begin counting down

Figure 5.3: State machine for each session in the proposed algorithm (Initialization → set and start a timer → timer counts down to 0 → Transition and trigger RESET; the thread jumps back to reset its timer when it receives a RESET signal).

• Procedure Initialization: For each session s ∈ S, the controller creates an associated thread, then randomly selects |Ds| feasible not-in-use paths that satisfy the resource requirements from the candidate path set Js.

• Procedure Set-timer: Let f and f′ denote the current and the next target configuration, respectively. For each session s ∈ S, the controller first randomly selects one feasible path from the not-in-use path set (i.e., Js\Ds), and one in-use path from Ds. The system cost of the current configuration uf can be measured by the controller, which then estimates the performance of the target configuration, i.e., uf′, if these two paths pold and pnew were swapped. Meanwhile, the controller triggers an exponentially distributed timer Ts for session s with a mean value of exp(τ − (1/2)β(uf − uf′)) · (|Ds|·(|Js|−|Ds|))^{−1}. In the last step, the controller broadcasts a RESET(uf′, s) message to notify the other sessions of the updated system cost uf′.
• Procedure Transit: When a timer Ts, s ∈ S, expires, the controller swaps the chosen pair of paths pold and pnew for s, and then the execution thread repeats Procedure Set-timer for session s.


• Procedure RESET: When a session s ∈ S receives a RESET(uf′, S̄) message, the controller refreshes all the other timers Ts (∀s ∈ S − S̄) according to (5.6) with the updated system cost uf′.

Now, we have the following theorem.

Theorem 8. Algorithm 8 realizes a time-reversible Markov chain with the stationary distribution given in (5.4).

The proof is given in Appendix-5.6. Furthermore, we make some notable remarks:

Remark 3. Our proposed algorithm can be extended to other traffic engineering problems in SDN systems, e.g., finding resilient routing paths for data plane traffic between any pair of switches.

Remark 4. Because this algorithm is executed in a distributed manner for each session, it can be applied to more practical scenarios where multiple controllers are deployed over large-scale networks. This requires the system to know the performance under the target configuration f′ via a probing phase, which can be achieved with minimal information exchange among the relevant controllers. For a transition from f to f′, where only one session changes a single path, its holding controller only has to notify this event to the other “invariant” controllers. Then, the “next” target system performance ∑_{s∈S} us(f′) can be known immediately in each controller.
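To make the interplay of the procedures concrete, the following simplified single-controller Python sketch emulates the sojourn-and-transit dynamics: in every round each session draws a tentative swap and an exponential timer whose mean follows (5.6), and the session whose timer would expire first performs its swap; all timers are then redrawn, which is equivalent to the RESET step thanks to the memoryless property. The toy data structures, the use of the number of links as the path length |p|, and the synchronous rounds are assumptions of this sketch, not the dissertation's implementation.

# Simplified sketch of Algorithms 8-10 with a single controller.
import math
import random

def system_cost(in_use, candidates, demand, n_nodes, omega):
    # in_use[s]: indices into candidates[s]; each candidate path is a tuple of links.
    load, setup = {}, 0
    for s, chosen in in_use.items():
        for i in chosen:
            path = candidates[s][i]
            setup += len(path)                     # |p| approximated by number of links
            for link in path:
                load[link] = load.get(link, 0.0) + demand[s]
    return max(load.values()) + omega * setup / n_nodes

def sojourn_and_transit(candidates, demand, n_nodes, n_inuse=2,
                        omega=1.0, beta=10.0, tau=1.0, rounds=200, seed=0):
    rng = random.Random(seed)
    in_use = {s: set(rng.sample(range(len(candidates[s])), n_inuse)) for s in candidates}
    cost = system_cost(in_use, candidates, demand, n_nodes, omega)
    for _ in range(rounds):
        proposals = {}
        for s in candidates:                       # Procedure Set-timer for every session
            p_old = rng.choice(sorted(in_use[s]))
            p_new = rng.choice([p for p in range(len(candidates[s])) if p not in in_use[s]])
            trial = {k: set(v) for k, v in in_use.items()}
            trial[s].discard(p_old)
            trial[s].add(p_new)
            new_cost = system_cost(trial, candidates, demand, n_nodes, omega)
            pairs = n_inuse * (len(candidates[s]) - n_inuse)
            mean = math.exp(tau - 0.5 * beta * (cost - new_cost)) / pairs   # (5.6)
            proposals[s] = (rng.expovariate(1.0 / mean), trial, new_cost)
        winner = min(proposals, key=lambda s: proposals[s][0])   # first timer to expire
        _, in_use, cost = proposals[winner]                      # Procedure Transit
    return in_use, cost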

5.4 Online Handling and Theoretical Analysis under Single-link Failure

In this section, we extend our proposed Algorithm 8 to the online case that handles dynamic single-link failures [49]. Then, the theoretical performance fluctuation induced by such a link failure is presented.

5.4.1 Operations When A Link Fails

When a single link fails, any candidate paths and in-use paths which include the failed link become invalid and should be removed for each session. Since we assume that only a single link can fail at a time, there is always at least one working in-use path for each session. Therefore, the connection between the controller and any switch will not be disturbed. We present the additional operations with respect to a single-link failure in Algorithm 11. After removing all invalid paths in Step 1, Step 2 fills up the vacancy of desired in-use paths and ensures control traffic protection. Finally, the other sessions are notified of the updated overall system cost via RESET messages so that they refresh their timers (Step 3).

Algorithm 11 Online Dynamic Handling of Single-Link Failure
1: Step 1: Remove all the candidate paths and in-use paths which involve the failed link for each session.
2: Step 2:
3: S̄ ← ∅
4: for s ∈ S do
5:   if s lost any in-use path then
6:     Ds ← controller randomly picks feasible not-in-use paths from the updated Js\Ds
7:     S̄ ← S̄ ∪ {s}
8:   end if
9: end for
10: Step 3: Controller broadcasts RESET(uf, S̄) signals.

Then, the controller continues listening to Procedure Transit and Procedure RESET of Algorithm 8. In addition, it is worth noting that the controller can deploy the required protection paths according to a suboptimal solution after the link failure and before convergence is achieved.
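The failure handling of Algorithm 11 reduces to filtering the path sets and refilling the in-use sets; a compact sketch with illustrative data structures (not the dissertation's controller code) is given below.

# Sketch of Algorithm 11: drop every path that traverses the failed link,
# then refill each session's in-use set from the surviving candidates.
import random

def handle_link_failure(failed_link, J, D, n_inuse, rng=None):
    # J[s]: list of candidate paths (each a tuple of links); D[s]: in-use paths.
    rng = rng or random.Random(0)
    affected = []
    for s in J:
        J[s] = [p for p in J[s] if failed_link not in p]    # Step 1: remove invalid paths
        D[s] = [p for p in D[s] if failed_link not in p]
    for s in J:                                             # Step 2: refill in-use sets
        if len(D[s]) < n_inuse:
            affected.append(s)
        while len(D[s]) < n_inuse:
            spare = [p for p in J[s] if p not in D[s]]      # assumes (|Js|-1)-|Ds| >= 1
            D[s].append(rng.choice(spare))
    return affected     # Step 3: the session set carried by the broadcast RESET message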

5.4.2 Theoretical Performance Fluctuation of Single-Link Failure

When the invalid paths are removed, the configurations involving those paths should be deleted from the original configuration-hopping Markov chain M. Let M̂ denote the new Markov chain after removing all invalid configurations based on M, and G the surviving configuration space in M̂. Accordingly, the disappeared configuration space is F\G. For example, as shown in Fig. 5.4, when link (4,5) fails, any configuration including this link, e.g., f3, will be moved into F\G. The remaining configurations f1, f2, ... are moved to G. Note that it can be proved that M̂ is still irreducible. We are particularly interested in the robustness of the proposed Algorithm 8 in the online case. This motivates us to further study the performance fluctuation from the state when the link failure just occurs to the converged performance in M̂. At first, the stationary distribution of the configurations in M̂ is denoted by q*: [q*g(u), g ∈ G]. Furthermore, we define another vector q̂: [q̂g(u), g ∈ G] to indicate the distribution of configurations g ∈ G in M when the link failure just occurs and before Step 3 of Algorithm 11. We use the total variation distance [93] dTV(q*, q̂) to quantify the distribution difference of all configurations g ∈ G between M and M̂. Then, we have the following lemmas.


Figure 5.4: An example of operations when a single-link failure occurs (configurations containing the failed link (4,5), e.g., f3, are moved from F, the configuration space of M, into F\G; the remaining configurations form G, the configuration space of M̂).

Lemma 5. (a) The total variation distance between q* and q̂ is bounded by:

dTV(q*, q̂) ≜ (1/2) ∑_{g∈G} |q*g − q̂g| ≤ |F\G| / |F|.   (5.7)

(b) By denoting S1 ⊆ S as the set of sessions which lost a candidate path due to the link failure, and Sim ⊆ S1 as an imaginary set of sessions which would select the disappeared path if it still existed, we have

|F\G| / |F| = [ ∑_{∅≠Sim⊆S1} ( ∏_{s∈Sim} C(|Js|−1, |Ds|−1) × ∏_{s∈S⊥} C(|Js|−1, |Ds|) ) ] / ∏_{s∈S1} C(|Js|, |Ds|),   (5.8)

where C(n, k) denotes the binomial coefficient, Js and Ds are the path sets for session s ∈ S before the link failure, and S⊥ = S1\Sim.

The proof is given in Appendix-5.7. Based on Lemma 5, we then study a special case in the next subsection.

5.4.3 Case Study under ‘1+1’ Protection Scheme

Now, we proceed to study a special case, which we call ‘1+1’ protection with an equal number of available candidate paths. We adopt the ‘1+1’ protection mechanism [39, 84] and, in particular, provide each session with the same number of initial candidate paths, i.e., |Js| is the same for all s ∈ S in the initial stage of the controller connection setup. In addition, when a single-link failure occurs, in order to ensure that at least one candidate path can be chosen for each session, we assume that (|Js| − 1) − |Ds| ≥ 1, ∀s ∈ S. Consequently, equation


(5.8) can be rewritten as:

|F\G| / |F| = [ ∑_{∅≠Sim⊆S1} ( ∏_{s∈Sim} C(|Js|−1, 1) × ∏_{s∈S⊥} C(|Js|−1, 2) ) ] / ∏_{s∈S1} C(|Js|, 2).   (5.9)

Then, letting umax = max_{g∈G} ug, we obtain the following theoretical bound on the performance perturbation under this special case when a single-link failure occurs. Note that umax is the theoretical performance of the solution obtained by solving the maximization version of the CPRP problem.

Theorem 9. The performance perturbation of a single-link failure under the special case of ‘1+1’ protection with an equal number of available candidate paths is bounded by

‖q*uᵀ − q̂uᵀ‖ ≤ min(umax, 2umax[1 − (1 − 2/|Js|)^{|S|}]).   (5.10)

The proof is given in Appendix-5.8. Then, we have the following remark.

Remark 5. Suppose that the number of sessions in a given network is large and fixed, i.e., |S| is a large constant. When |Js| → ∞, [1 − (1 − 2/|Js|)^{|S|}] → 0, making the fluctuation bound approach 0. In general, a larger |Js| makes the term 1 − (1 − 2/|Js|)^{|S|} smaller, leading to a smaller fluctuation bound.
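The behavior described in Remark 5 can be evaluated directly from (5.10); the helper below uses placeholder values for umax and the number of sessions, chosen only to illustrate the trend:

# Evaluate the fluctuation bound of Theorem 9, equation (5.10).
def fluctuation_bound(u_max, n_sessions, n_candidates):
    shrink = 1.0 - (1.0 - 2.0 / n_candidates) ** n_sessions
    return min(u_max, 2.0 * u_max * shrink)

if __name__ == "__main__":
    for J in (4, 8, 16, 64, 256):                 # |Js| grows -> bound shrinks
        print(J, round(fluctuation_bound(u_max=150.0, n_sessions=10, n_candidates=J), 1))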

5.5 Performance Evaluation

5.5.1 Methodology and Simulation Settings

It is worth noting that our proposed algorithm is designed to be embedded into the controller module of SDN networks. However, to the best of our knowledge, the existing SDN emulator Mininet does not yet support the evaluation of control-plane traffic in in-band SDN networks. Therefore, we have implemented a simulator in Python to emulate an SDN with in-band control and to evaluate routing protection for control-plane traffic.

Benchmarks: Four benchmark algorithms are used to compare against our proposed Alg. 8. First, the K-shortest path finding algorithm [87, 94, 95] is a classical static heuristic, in which each session is provisioned with the first K shortest paths from the given candidate path set; in our simulation, K = |Ds|. The second one, called Alg. Iterative [96], is also designed using the Markov approximation technique. In Alg. Iterative, although the system is similarly allowed to transit from one configuration to another by swapping only one pair of paths, the transition rate is designed as q_{f→f′} ∝ exp^{−1}(−βuf′). Furthermore, it keeps tracing the best configuration observed so far, which is used as the final solution.

The third one is Local Rerouting (Alg. LR), which is widely adopted in the literature [34, 38, 40, 80–83]. When applying Alg. LR in our simulation, once a link fails, the affected traffic is rerouted to neighbouring links in a detour fashion [82, 83]. Finally, the Optimal solutions are obtained using the state-of-the-art optimizer Gurobi [78], which provides classical solvers such as a linear programming (LP) solver, a quadratically constrained programming (QCP) solver, and a mixed-integer linear programming (MILP) solver. The solvers in the Gurobi Optimizer are designed to exploit modern architectures and multi-core processors, and the optimizer supports a variety of programming and modeling languages, including object-oriented interfaces for C++, Java, .NET, and Python. For comparison with our proposed Alg. 8, we solve the integer programming CPRP problem through the Python application programming interface, both before and after the link failure.

Other settings and metrics: Simulations are conducted under the online dynamic case with the occurrence of a single-link failure. The traffic demand of each controller-switch session and the link bandwidth capacity are randomly generated within a given range. Before executing the algorithms, by invoking a simple depth-first path finding algorithm, we try to provide each controller-switch session s ∈ S with a number |Js| of highly disjoint candidate paths under the Fat-tree topology. Under fixed ω = 1, τ = 1 and β = 10, we select |Ds| = 2 paths as in-use paths by executing our proposed algorithm and the other benchmark algorithms. The weighted Joint system cost is defined as the sum of the Largest link overhead, measured in traffic rate (Mb/s), and the Average (Avg) node configure overhead, measured as the average number of configuring operations at each switch node. Simulations are conducted under a Fat-tree topology with 26 nodes and 50 bidirectional links (shown in Fig. 5.5). In the Fat-tree topology, the controller directly connects to gateway node 0, and indirectly connects to the other switch nodes via in-band connections. The control-plane traffic demand of each session is randomly generated within the range [1, 15] Mb/s, and the link bandwidth capability for both control-plane and data-plane traffic on each link is set to 1000 Mb/s.
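A depth-first candidate-path generator of the kind mentioned above can be sketched as follows; the greedy preference for links not yet used by earlier paths is this sketch's own disjointness heuristic, assumed for illustration.

# Sketch: enumerate up to k candidate paths per controller-switch session with
# a depth-first search, preferring "fresh" links so candidates stay disjoint.
def candidate_paths(adj, src, dst, k):
    # adj: dict node -> list of neighbour nodes (undirected graph)
    found, used_links = [], set()

    def dfs(node, visited, path_links):
        if len(found) >= k:
            return
        if node == dst:
            found.append(list(path_links))
            used_links.update(path_links)
            return
        # try links not used by earlier candidate paths first
        order = sorted(adj[node],
                       key=lambda v: (tuple(sorted((node, v))) in used_links, v))
        for v in order:
            if v not in visited:
                link = tuple(sorted((node, v)))
                dfs(v, visited | {v}, path_links + [link])
                if len(found) >= k:
                    return

    dfs(src, {src}, [])
    return found

For instance, with adj encoding the topology of Fig. 5.5, candidate_paths(adj, 0, 23, k=5) would generate candidates for the session between the gateway node 0 and switch 23.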

5.5.2 Representative Execution Case of Algorithms

By fixing |Js|=5 and |Ds|=2 for each session, the numerical result shown in Fig. 5.6 illustrates a representative execution of the algorithms over a logical period [0, 15 seconds]. A single-link failure occurs on link (0,3) at logical time 10.

Figure 5.5: The 26-node Fat-tree topology. The controller connects to the gateway node 0.

As link (0,3) is one of the most critical links under the Fat-tree topology, its failure causes severe damage: 20% of both the total candidate paths and the in-use paths can no longer be used. Before the link failure, we observe that our proposed algorithm converges to the optimal performance with a cost of 93 within the initial 0.5 seconds. In contrast, the K-shortest algorithm keeps a high system cost of 141. Although the best solution of Alg. Iterative is traced at around logical time 7.5, this algorithm shows a fluctuating performance all the time; thus it is ambiguous when its best solution can be obtained. On the contrary, the near-optimal solution can be traced quickly using our proposed Alg. 8 because of its fast convergence: as shown in Fig. 5.6, at logical time 0.3 second we obtain the converged near-optimal solution. Then, when the single-link failure occurs, a cost fluctuation is observed under all algorithms. The failed link shrinks the candidate path set and the in-use path set of each session. As a result, the updated in-use paths share more common links, making the total system cost grow. Comparing the fluctuation gaps of the algorithms, we find that our proposed algorithm has a smaller gap than the other two. Moreover, its performance converges to the optimal solution quickly, at logical time 11.5. However, the K-shortest algorithm still holds the highest cost of 163, and Alg. Iterative shows a more severe perturbation. On the other hand, right after the link failure occurs, the cost of our Sojourn-and-Transit algorithm is 148, and the converged cost is 117, leading to a fluctuation gap of 31. According to Theorem 9, the theoretical fluctuation bound computed by (5.10) is 191.7, so the observed gap is within the theoretical bound under the failure of link (0,3). Furthermore, in the same group of simulations shown in Fig. 5.6, the middle and bottom sub-figures demonstrate the largest link overhead (in terms of the aggregated control-plane traffic rate in Mb/s) and the average node configure overhead (in terms of the number of configuring operations), respectively.

Figure 5.6: Representative execution of algorithms under the Fat-tree topology (initial |Js|=5, |Ds|=2). It can be seen that Alg. 8 converges both in the initial stage and after the link failure. Note that the numerical Joint system cost includes both the Largest link overhead, measured by the traffic rate (Mb/s), and the Average (Avg) node configure overhead, measured by the average number of configuring operations in each switch node.

Note that the joint system cost is calculated by summing the two overhead items shown in the bottom two sub-figures. Overall, both cost terms show behavior similar to that of the joint system cost. However, in the bottom figure, with respect to the average node configure overhead, we observe that the K-shortest algorithm interestingly maintains a quite low level. This is because the K-shortest algorithm always selects the first |Ds| shortest candidate paths for each session; consequently, the total length of the selected paths is very short. After the link failure, due to the shrunken candidate path space, some longer paths have to be chosen, making the average node overhead higher than before. For example, we observe that the average node overhead of K-shortest increases from 11 to 12 when the single-link failure occurs.


5.5.3 Case Study of Single Link Failure

With the same single-link failure on the critical core link (0,3) at logical time 10, we compare the link overhead in terms of the aggregated control-plane traffic rate on other core links, such as (0,1), (0,2), (0,4) and (0,5) of the Fat-tree topology. Note that Alg. LR can be evaluated only after the link failure happens. For a fair comparison with our Alg. 8, we apply Alg. LR based on the converged routing solution yielded by Alg. 8 just before the link failure. In particular, when the core link (0,3) fails, the affected traffic is rerouted via the neighboring links (0,2) and (0,4), respectively. In the following, we explain the numerical results of both situations based on the average performance obtained over 100 instances. First, as shown in Fig. 5.7(a), before the link failure the link overhead of the converged solution obtained by our Alg. 8 is very close to the optimal solution. When link (0,3) fails, we can observe from Fig. 5.7(b) that, under Alg. LR, the aggregated control-plane traffic rate on the core link (0,2) increases sharply from 80 Mb/s to around 165 Mb/s if the affected traffic is rerouted only via (0,2). In contrast, the average performance of Alg. 8 is still very close to the Optimal, because Alg. 8 and the Optimal evenly amortize the affected traffic over the other four core links. Similar results can be found in Fig. 5.7(c) when the affected traffic is locally rerouted over the core link (0,4).

5.5.4 Performance of Alg. 8 in the Initial Stage

Now, we study the performance of the algorithms in the initial connection-setup stage of control-plane traffic. In this group of simulations, each parameter setting is evaluated with 100 instances. By fixing |Js| = 5, |Ds| = 2 and varying the solution-tracing time from 0.1 to 2 seconds, we record the cost of the best solution for all algorithms. From Fig. 5.8(a), we observe that the performance of all algorithms is similar when the solution-tracing time is 0.1 second. However, the proposed Alg. 8 shows a significant advantage over the two benchmarks, Alg. Iterative and Alg. K-Shortest, when the solution-tracing time grows, and it converges to the optimal solution when the tracing time exceeds 1 second. Since convergence only occurs when using Alg. 8, we further study the cumulative distribution function (CDF) of its convergence time by varying the candidate path scale |Js| within {3, 4, 5, 6, 7}. The results are shown in Fig. 5.8(b), in which it can be observed that approximately 45% of the convergence times fall within the first 1 second, and over 90% of the convergence times are shorter than 5 seconds when |Js| ≥ 5. On the other hand, the near-optimal solution yielded by Alg. 8 becomes more difficult to find when |Js| < 5, so the convergence time increases accordingly.


5.5.5 Performance of Alg. 8 under Single-Link Failure

Next, we study the influence introduced by the single-link failure when applying Alg. 8. Based on the same settings as the former group of simulations, we compare the performance of Alg. 8 between the initial convergence stage (Init.) and the stage after the link failure (a.l.f.). The average system costs of Alg. 8 in both stages are compared in Fig. 5.9(a). Here |Js| is the number of candidate paths before the single-link failure, and it varies from 4 to 7 due to the adopted ‘1+1’ protection. We see that there is an increment of approximately 20 units of system cost for each |Js| under the a.l.f. case. Fig. 5.9(b) shows the CDF of the system cost fluctuation a.l.f. of Alg. 8. We are surprised to find that the fluctuation is positively proportional to |Js|. For example, 90% of the cost fluctuations are less than 30 when |Js| = 4, but this percentage reduces to 75%, 48%, and 25% when |Js| = 5, 6, 7, respectively. The reason is that the probability of having very bad configurations at the time slot when the link failure occurs becomes higher when more candidate paths are provided. However, all the observed cost fluctuations are within the theoretical bound described in Theorem 9. Then, in Fig. 5.9(c), the convergence time a.l.f. is observed to be larger than that in the initial stage. This is because the reduced candidate path set makes the best configuration take longer to find. We also see that the average convergence time is less than 5 seconds when |Js| ≥ 5. Further, Fig. 5.9(d) illustrates the CDFs of the convergence time a.l.f.. We see that the convergence times are almost the same when |Js| = 5, 6, 7, and better than those under |Js| = 4. It can be seen that although a larger candidate path set increases the cost fluctuation, it benefits the convergence time after the link failure.

In summary, from all the numerical results, we have the following observations: 1) the proposed Sojourn-and-Transit algorithm (Alg. 8) has near-optimal performance; 2) Alg. 8 has higher resource utilization efficiency than the conventional Local Rerouting algorithm; 3) Alg. 8 achieves 10%–20% lower overhead in terms of link bandwidth consumption and connection setup cost, compared with the two benchmark algorithms, Alg. Iterative and Alg. K-Shortest.

5.6 Proof of Theorem 8

Proof. By the two conditions imposed on the state space when constructing the designed Markov chain, all configurations can reach each other within a finite number of transitions, where a single in-use path is replaced in each transition. Therefore, the constructed Markov chain is irreducible, i.e., an ergodic Markov chain. We now show that the stationary distribution of the constructed Markov chain is exactly (5.4).

In the proposed Sojourn-and-Transit algorithm, the sojourn time of each configuration is exponentially distributed and the transition probability between different configurations is independent of time, so the state process constitutes a homogeneous continuous-time Markov chain. Let Pr_{f→f′} denote the probability that the system enters state f′ when it leaves state f due to the expiration of a countdown timer. We also introduce N_f to denote the set of states directly connected to state f. In the Sojourn-and-Transit algorithm, the next state of f is equally likely to be any state f′ ∈ N_f. It is not hard to see that the size of this set is |N_f| = |S| · |Ds|(|Js| − |Ds|), where s is the critical session that induces the transition from f to f′. Then, we have
\[
Pr_{f \to f'} = \frac{1}{|N_f|} = \frac{1}{|S|\,|D_s|\,(|J_s| - |D_s|)}, \quad \forall f \in F, \ \forall f' \in N_f.
\]

In the following, we show that the state transition rate from f to f′ in the implemented Sojourn-and-Transit algorithm satisfies (5.5).

• Firstly, all the transition rates of the path selection process are finite.

• Secondly, it is straightforward to see that all path configurations can reach each other via a finite number of transitions. Therefore, the constructed Markov chain is irreducible.

• Finally, the detailed balance equation holds between any two adjacent states. According to (5.6), given a current state f, each session counts down with rate
\[
\lambda = \frac{|D_s|(|J_s| - |D_s|)}{\exp(\tau)} \cdot \frac{1}{\exp\!\big(\tfrac{1}{2}\beta \sum_{s \in S} (u_s(f') - u_s(f))\big)},
\]
so the system leaves state f with rate |S|λ. With probability Pr_{f→f′}, the system jumps to the adjacent state f′ when leaving the current state f. Therefore, the transition rate from f to f′ is
\[
q_{f,f'} = |S|\lambda \times Pr_{f \to f'}
= |S| \times \frac{|D_s|(|J_s| - |D_s|)}{\exp(\tau)} \cdot \frac{1}{\exp\!\big(\tfrac{1}{2}\beta \sum_{s \in S} (u_s(f') - u_s(f))\big)} \times \frac{1}{|S|\,|D_s|\,(|J_s| - |D_s|)}
= \frac{1}{\exp(\tau)} \cdot \exp\!\Big(\tfrac{1}{2}\beta \sum_{s \in S} (u_s(f) - u_s(f'))\Big). \tag{5.11}
\]

Finally, using (5.11) and (5.4), we can verify that p*_f · q_{f,f′} = p*_{f′} · q_{f′,f} for all f, f′ ∈ F, i.e., the detailed balance equations hold. According to Theorem 1.3 and Theorem 1.14 in [97], the constructed Markov chain is time-reversible and its stationary distribution is exactly (5.4).
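
To make the detailed balance argument concrete, the following Python sketch numerically checks it on a toy configuration space. The per-session costs, β, and τ below are hypothetical illustration values (not taken from the dissertation), the system cost of a configuration is assumed here to be the sum of its per-session costs, and, for simplicity, every pair of toy configurations is treated as adjacent.

    import math
    import itertools

    # Toy example (hypothetical values): check detailed balance
    # p*_f * q_{f,f'} == p*_{f'} * q_{f',f} for the rates in (5.11).
    beta, tau = 0.5, 1.0
    # Per-session costs u_s(f) for 4 toy configurations and 2 sessions.
    u = {0: [3.0, 2.0], 1: [1.5, 2.5], 2: [4.0, 1.0], 3: [2.0, 2.0]}

    def total_cost(f):
        return sum(u[f])                      # assumed: u_f = sum_s u_s(f)

    def rate(f, f2):
        # q_{f,f'} = (1/exp(tau)) * exp(0.5*beta*sum_s (u_s(f) - u_s(f')))
        diff = sum(a - b for a, b in zip(u[f], u[f2]))
        return math.exp(0.5 * beta * diff) / math.exp(tau)

    # Gibbs distribution p*_f of the form (5.4), normalized over the toy space.
    Z = sum(math.exp(-beta * total_cost(f)) for f in u)
    p = {f: math.exp(-beta * total_cost(f)) / Z for f in u}

    for f, f2 in itertools.permutations(u, 2):
        lhs, rhs = p[f] * rate(f, f2), p[f2] * rate(f2, f)
        assert abs(lhs - rhs) < 1e-12, (f, f2, lhs, rhs)
    print("detailed balance holds for all state pairs")

Because p*_f · q_{f,f′} = exp(−½β(u_f + u_{f′})) / (Z exp(τ)) is symmetric in f and f′, the assertion holds exactly, which mirrors the argument above.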

5.7 Proof of Lemma 5

Proof. By referring to the derivation of (5.4), the stationary distribution of the configurations in M̂ is given by
\[
q^*_g = \frac{\exp(-\beta u_g)}{\sum_{g' \in G} \exp(-\beta u_{g'})}, \quad \forall g \in G. \tag{5.12}
\]

Next, we analyze the distribution of the configurations g ∈ G in M when the link failure just occurs. As shown in Step 2 of Alg. 11, when the link failure occurs, if no session loses an in-use path, then before Step 3 the distribution of g ∈ G in M is already known as
\[
p^*_g = \frac{\exp(-\beta u_g)}{\sum_{f' \in F} \exp(-\beta u_{f'})}, \quad \forall g \in G; \tag{5.13}
\]
otherwise, when any in-use path is replaced, the distribution of g ∈ G in M becomes larger than p*_g. That is, we have
\[
\hat{q}_g \ge p^*_g, \quad \forall g \in G. \tag{5.14}
\]
By (5.12) and (5.14), we know
\[
q^*_g - \hat{q}_g \le \frac{\exp(-\beta u_g)}{\sum_{g' \in G} \exp(-\beta u_{g'})} - \frac{\exp(-\beta u_g)}{\sum_{f' \in F} \exp(-\beta u_{f'})}
= \frac{\exp(-\beta u_g)}{\sum_{g' \in G} \exp(-\beta u_{g'})} - \frac{\exp(-\beta u_g)}{\sum_{f' \in G} \exp(-\beta u_{f'}) + \sigma}, \quad \forall g \in G, \tag{5.15}
\]
where σ = Σ_{f∈F\G} exp(−βu_f). Then, we calculate d_TV(q*, q̂) as follows:
\[
d_{TV}(\mathbf{q}^*, \hat{\mathbf{q}}) = \frac{1}{2}\sum_{g \in G} |q^*_g - \hat{q}_g| = \sum_{g \in A^o} (q^*_g - \hat{q}_g), \tag{5.16}
\]
where A^o ≜ {g ∈ G : q*_g ≥ q̂_g}, and A^o ⊂ G.

On the other hand, the system costs u_f, f ∈ F, are independent of each other and follow a normal distribution. That is, the u_f are independent and identically distributed (i.i.d.) discrete random values, and the expectation of the system cost exists within the finite configuration space F. Denoting this expectation by µ and applying the law of large numbers [98], we have Σ_{f∈F\G} exp(−βu_f) = |F\G| exp(−βµ) and Σ_{f∈F} exp(−βu_f) = |F| exp(−βµ). Thus, this yields

\[
d_{TV}(\mathbf{q}^*, \hat{\mathbf{q}}) = \sum_{g \in A^o} (q^*_g - \hat{q}_g)
\le \sum_{g \in A^o} \left[ \frac{\exp(-\beta u_g)}{\sum_{g' \in G} \exp(-\beta u_{g'})} - \frac{\exp(-\beta u_g)}{\sum_{f' \in G} \exp(-\beta u_{f'}) + \sigma} \right]
\]
\[
\le \frac{\sum_{g \in G} \exp(-\beta u_g)}{\sum_{g' \in G} \exp(-\beta u_{g'})} - \frac{\sum_{g \in G} \exp(-\beta u_g)}{\sum_{f' \in G} \exp(-\beta u_{f'}) + \sigma}
= 1 - \frac{\sum_{g \in G} \exp(-\beta u_g)}{\sum_{f \in G} \exp(-\beta u_f) + \sum_{f \in F \setminus G} \exp(-\beta u_f)}
\]
\[
= \frac{\sum_{f \in F \setminus G} \exp(-\beta u_f)}{\sum_{f' \in F} \exp(-\beta u_{f'})}
= \frac{|F \setminus G| \exp(-\beta\mu)}{|F| \exp(-\beta\mu)}
= \frac{|F \setminus G|}{|F|}.
\]
This concludes Lemma 5(a).

Now we prove Lemma 5(b). Since we only consider a single-link failure and each session is provided with disjoint candidate paths, the number of disappeared candidate paths per session is at most one. That is, for each session s ∈ S1, the size of its candidate path space shrinks from |Js| to |Js| − 1. Now, we compute the total number of configurations that disappear due to the single-link failure. By the definition of Sim, in the disappeared configurations each session in Sim selects the disappeared candidate path together with |Ds| − 1 other candidate paths as its in-use paths. Consequently, the number of path selection conditions for the sessions in Sim is
\[
c_1 = \prod_{s \in S_{im}} \binom{|J_s| - 1}{|D_s| - 1}.
\]
On the other hand, it is not hard to see that the number of path selection conditions for the sessions in S1\Sim is
\[
c_2 = \prod_{s \in S_{\perp}} \binom{|J_s| - 1}{|D_s|}.
\]
Recall that the sessions in S\S1 are not influenced, because their candidate paths are not affected by the failed link. Similarly, the number of path selection conditions of all sessions in S\S1 is
\[
c_3 = \prod_{s \in S \setminus S_1} \binom{|J_s|}{|D_s|}.
\]
In addition, it should be noticed that any session s ∈ S1 can be a member of Sim. Therefore, Sim should be varied over Sim ⊆ S1 in the calculation to include all possible solutions. Finally, we compute |F\G|/|F| as follows.


\[
\frac{|F \setminus G|}{|F|}
= \frac{c_3 \times \sum_{\forall S_{im} \subseteq S_1} \{ c_1 \times c_2 \}}{\prod_{s \in S} \binom{|J_s|}{|D_s|}}
= \frac{c_3 \times \sum_{\forall S_{im} \subseteq S_1} \{ c_1 \times c_2 \}}{\prod_{s \in S \setminus S_1} \binom{|J_s|}{|D_s|} \times \prod_{s \in S_1} \binom{|J_s|}{|D_s|}}
\]
\[
= \frac{\prod_{s \in S \setminus S_1} \binom{|J_s|}{|D_s|} \times \sum_{\forall S_{im} \subseteq S_1} \Big\{ \prod_{s \in S_{im}} \binom{|J_s|-1}{|D_s|-1} \times \prod_{s \in S_{\perp}} \binom{|J_s|-1}{|D_s|} \Big\}}{\prod_{s \in S \setminus S_1} \binom{|J_s|}{|D_s|} \times \prod_{s \in S_1} \binom{|J_s|}{|D_s|}}
= \frac{\sum_{\forall S_{im} \subseteq S_1} \Big\{ \prod_{s \in S_{im}} \binom{|J_s|-1}{|D_s|-1} \times \prod_{s \in S_{\perp}} \binom{|J_s|-1}{|D_s|} \Big\}}{\prod_{s \in S_1} \binom{|J_s|}{|D_s|}}.
\]

This concludes the proof.
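
As an illustration of the counting argument, the sketch below evaluates |F\G|/|F| for a small hypothetical instance (the session set, |Js|, |Ds|, and S1 are made-up values, not taken from the dissertation). It sums c1·c2 over the non-empty subsets Sim ⊆ S1, consistent with the k ≥ 1 summation used in the proof of Theorem 9, multiplies by c3, and divides by Π_{s∈S} C(|Js|, |Ds|).

    from itertools import combinations
    from math import comb, prod

    # Hypothetical instance: sessions 0..4; sessions in S1 have a candidate
    # path traversing the failed link, the remaining sessions are unaffected.
    J = {0: 5, 1: 5, 2: 6, 3: 5, 4: 6}   # |J_s|: candidate paths per session
    D = {0: 2, 1: 2, 2: 2, 3: 2, 4: 2}   # |D_s|: in-use paths per session
    S1 = [0, 1, 2]                        # sessions affected by the failed link

    def disappeared_ratio(J, D, S1):
        """|F\\G| / |F| from the counting argument in the proof of Lemma 5(b)."""
        rest = [s for s in J if s not in S1]
        c3 = prod(comb(J[s], D[s]) for s in rest)
        total = 0
        # Sum over non-empty Sim (sessions that were using the failed path).
        for k in range(1, len(S1) + 1):
            for Sim in combinations(S1, k):
                c1 = prod(comb(J[s] - 1, D[s] - 1) for s in Sim)
                c2 = prod(comb(J[s] - 1, D[s]) for s in S1 if s not in Sim)
                total += c1 * c2
        F = prod(comb(J[s], D[s]) for s in J)
        return c3 * total / F

    print(disappeared_ratio(J, D, S1))

For a single affected session with |Js| = 5 and |Ds| = 2, the routine returns C(4,1)/C(5,2) = 0.4, matching the direct count of configurations that contain the failed path.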

5.8 Proof of Theorem 9

Proof. In the context of this special case, referring to Lemma 5 and (5.9), |F\G|/|F| can be expressed as
\[
\frac{|F \setminus G|}{|F|}
= \frac{\sum_{k=1}^{|S_1|} \Big\{ \binom{|S_1|}{k} \binom{|J_s|-1}{1}^{k} \binom{|J_s|-1}{2}^{|S_1|-k} \Big\}}{\binom{|J_s|}{2}^{|S_1|}}
= \frac{\Big[ \binom{|J_s|-1}{1} + \binom{|J_s|-1}{2} \Big]^{|S_1|} - \binom{|J_s|-1}{2}^{|S_1|}}{\binom{|J_s|}{2}^{|S_1|}}
\]
\[
= \frac{\big[ \tfrac{|J_s|(|J_s|-1)}{2} \big]^{|S_1|} - \big[ \tfrac{(|J_s|-1)(|J_s|-2)}{2} \big]^{|S_1|}}{\big[ \tfrac{|J_s|(|J_s|-1)}{2} \big]^{|S_1|}}
= \frac{|J_s|^{|S_1|} - (|J_s|-2)^{|S_1|}}{|J_s|^{|S_1|}}
= 1 - \Big( \frac{|J_s|-2}{|J_s|} \Big)^{|S_1|}
= 1 - \Big( 1 - \frac{2}{|J_s|} \Big)^{|S_1|}
\le 1 - \Big( 1 - \frac{2}{|J_s|} \Big)^{|S|},
\]
where |Js| ≥ 4 and is the same for all s ∈ S.


Now, we calculate the fluctuation of the system cost:
\[
\| \mathbf{q}^* \mathbf{u}^T - \hat{\mathbf{q}} \mathbf{u}^T \| = \Big\| \sum_{g \in G} (q^*_g - \hat{q}_g) \cdot u_g \Big\|
\le u_{\max} \cdot \sum_{g \in G} |q^*_g - \hat{q}_g|
= u_{\max} \cdot 2\, d_{TV}(\mathbf{q}^*, \hat{\mathbf{q}})
\le 2 u_{\max} \Big[ 1 - \Big( 1 - \frac{2}{|J_s|} \Big)^{|S|} \Big].
\]
On the other hand, it is not hard to see that the performance perturbation should be no greater than the cost of the worst configuration in G; thus the upper bound never exceeds u_max. Finally, we obtain the result shown in (5.10). This concludes the proof.
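
For intuition about the magnitude of this bound, the following sketch evaluates 2·u_max·[1 − (1 − 2/|Js|)^|S|] for the candidate-path sizes used in the simulations. The values of u_max and |S| here are hypothetical placeholders rather than the simulation's actual parameters, and the printed bound is additionally capped at u_max, following the remark above that the perturbation cannot exceed the cost of the worst configuration.

    # Hypothetical illustration values (not the simulation's parameters).
    u_max = 160.0        # cost of the worst configuration, in cost units
    num_sessions = 8     # |S|

    def tv_ratio_bound(J_s, S=num_sessions):
        """Bound on |F\\G|/|F| (and hence on d_TV) from the derivation above."""
        return 1 - (1 - 2 / J_s) ** S

    for J_s in (4, 5, 6, 7):
        ratio = tv_ratio_bound(J_s)
        fluct = 2 * u_max * ratio      # 2*u_max*[1 - (1 - 2/|Js|)^|S|]
        # The proof also remarks that the perturbation cannot exceed u_max,
        # so report the smaller of the two quantities.
        print(f"|Js| = {J_s}: ratio bound = {ratio:.3f}, "
              f"cost fluctuation bound = {min(fluct, u_max):.1f}")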

5.9 Summary

In this chapter, in order to ensure fast recovery of the control-plane traffic in in-band SDN networks, we study a joint weighted cost minimization problem, in which traffic load balancing and control-channel setup cost are jointly considered when performing routing protection for control-plane traffic in an in-band SDN. To solve this multi-resource constrained routing problem efficiently, we propose a near-optimal algorithm using the distributed Markov approximation technique. In particular, we extend our approach to the online case so that single-link failures can be handled promptly. The induced performance fluctuation is also studied through theoretical derivation. Finally, extensive simulations are conducted to evaluate the performance of the proposed approach. Compared with the existing benchmark approaches, our algorithm delivers much better performance in terms of both resource utilization and robustness in handling single-link failures.


[Figure 5.7 appears here: Link overhead distribution in the core links under the Fat-tree topology before/after the critical link (0,3) fails. Alg. 8 always shows near-optimal performance in terms of the aggregated traffic rates in the core links, while Alg. LR exhibits a sharply increasing aggregated traffic rate in the links neighbouring the failed one. Panels: (a) before link failure (b.l.f.), the converged link overhead obtained by Alg. 8 is very close to the optimal solution found by Gurobi [78]; (b) in Alg. LR, when link (0,3) fails, the affected traffic flows are rerouted via the neighbouring link (0,2), with only the control-plane traffic a.l.f. shown; (c) in Alg. LR, when link (0,3) fails, the affected traffic flows are rerouted via the neighbouring link (0,4), with only the control-plane traffic a.l.f. shown. Axes: core links in the topology vs. link overhead (Mb/s).]


[Figure 5.8 appears here: Convergence property of the algorithms under the Fat-tree topology; Alg. 8 clearly outperforms the benchmark algorithms. Panels: (a) system cost of the best solution found by Optimal, Alg. 8, Alg. Iterative, and Alg. K-Shortest while varying the best-solution tracing time (0.1 to 2 seconds); (b) CDF of the convergence time (seconds) of Alg. 8 for |Js| = 3, 4, 5, 6, 7.]

[Figure 5.9 appears here: Performance of Alg. 8 when varying the number of initial candidate paths per session, in the initial stage (Init.) and after the link failure (a.l.f.), under the Fat-tree topology. Although a larger candidate path set increases the cost fluctuation, it reduces the convergence time after the link failure. Panels: (a) system cost (Optimal and Alg. 8, Init. and a.l.f.) vs. initial |Js|; (b) CDF of the cost fluctuation of Alg. 8 a.l.f. for |Js| = 4, 5, 6, 7; (c) convergence time (seconds) of Alg. 8 (Init. and a.l.f.) vs. initial |Js|; (d) CDF of the convergence time of Alg. 8 a.l.f. for |Js| = 4, 5, 6, 7.]


Chapter 6

Conclusion

In summary, this dissertation investigates three primary issues concerning rule management while performing traffic engineering for SDN networks.

In the first issue, we study a rule multiplexing scheme for rule placement with the objective of minimizing rule space occupation for multiple unicast sessions under QoS constraints. We formulate an optimization problem by jointly considering routing engineering (i.e., with or without given candidate paths) and rule placement under both the existing nonRM-based and our proposed RM-based rule placement schemes. Due to the NP-hardness, we propose heuristic algorithms for the minimization problems of RM-nonCP and nonRM-nonCP using relaxation and rounding techniques. The main heuristic algorithm consists of two phases: in the first phase, we select multiple paths for each session, and in the second phase a rule placement solution is found on the selected routing paths. The computational complexity is also analyzed. Extensive simulations show that our proposed algorithms save TCAM resources significantly.

Next, we study the rule caching problem with the objective of minimizing the total cost of remote processing and local forwarding table occupation. We propose an offline algorithm adopting a greedy strategy for the case where the network traffic is given in advance, and we devise two online algorithms with guaranteed competitive ratios. We then conduct extensive simulations using real network traffic traces to evaluate the performance of our proposed algorithms. The simulation results demonstrate that our algorithms remarkably reduce the total cost and that the obtained solutions are near optimal.

Thirdly, we strive to provide optimal routing protection for the control-plane traffic in in-band SDN networks; note that our approach can also be extended to routing protection in the data plane. To solve the aforementioned weighted cost minimization problem, a near-optimal algorithm has been proposed using the


Markov approximation technique. In particular, we design a Markov chain whose state space consists of all feasible protection solutions, together with a well-devised transition rate matrix, such that the theoretical performance of the proposed algorithm can be guaranteed. Compared with the existing benchmark approaches, our proposed algorithm provides higher robustness and efficiency by handling the single-link failure with fast convergence. The theoretical performance fluctuation of our algorithm due to a single-link failure is also thoroughly studied with a closed-form expression. As part of future work, we shall apply our framework to the data plane of SDN networks. Other recovery mechanisms, such as restoration and cold-backup protection, will also be studied.


Bibliography

[1] H. Huang, S. Guo, P. Li, B. Ye, and I. Stojmenovic, “Joint optimization of rule placement and traffic engineering for qos provisioning in software defined network,” IEEE Transactions on Computers, vol. 64, no. 12, pp. 3488–3499, 2015. [2] H. Huang, S. Guo, P. Li, W. Liang, and A. Zomaya, “Cost minimization for rule caching in software defined networking,” IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 4, pp. 1007–1016, April 2016. [3] H. Huang, S. Guo, W. Liang, K. Li, B. Ye, and W. Zhuang, “Near-optimal routing protection for in-band software-defined heterogeneous networks,” submitted to IEEE Journal on Selected Areas in Communications, 2016. [4] S. Yang, J. Kurose, and B. N. Levine, “Disambiguation of residential wired and wireless access in a forensic setting,” in Proc. IEEE International Conference on Computer Communications (INFOCOM), 2013, pp. 360–364. [5] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, and J. Turner, “Openflow: enabling innovation in campus networks,” ACM SIGCOMM Computer Communication Review, vol. 38, no. 2, pp. 69–74, 2008. [6] M. Casado, M. Freedman, J. Pettit, J. Luo, N. Gude, N. McKeown, and S. Shenker, “Rethinking enterprise network control,” IEEE/ACM Transactions on Networking (TON), vol. 17, no. 4, pp. 1270–1283, Aug 2009. [7] A. R. Curtis, J. C. Mogul, J. Tourrilhes, P. Yalagandula, P. Sharma, and S. Banerjee, “Devoflow: scaling flow management for high-performance networks,” in ACM SIGCOMM Computer Communication Review, vol. 41, no. 4. ACM, 2011, pp. 254–265. [8] T. Benson, A. Anand, A. Akella, and M. Zhang, “Microte: Fine grained traffic engineering for data centers,” in Proc. Seventh Conference on Emerging Networking EXperiments and Technologies, ser. CoNEXT ’11, 2011, pp. 1–12.

[9] S. Agarwal, M. Kodialam, and T. Lakshman, “Traffic engineering in software defined networks,” in Proc. IEEE 32th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM), April 2013, pp. 2211–2219. [10] Y. Kanizo, D. Hay, and I. Keslassy, “Palette: Distributing tables in software-defined networks,” in IEEE INFOCOM, 2013, pp. 545–549. [11] A. Ishimori, F. Farias, E. Cerqueira, and A. Abel´em, “Control of multiple packet schedulers for improving qos on openflow/sdn networking,” in 2013 Second European Workshop on Software Defined Networks (EWSDN), 2013, pp. 81–86. [12] H. Huang, P. Li, S. Guo, and B. Ye, “The joint optimization of rules allocation and traffic engineering in software defined network,” in 2014 IEEE 22nd International Symposium of Quality of Service (IWQoS), May 2014, pp. 141–146. [13] N. Gude, T. Koponen, J. Pettit, B. Pfaff, M. Casado, N. McKeown, and S. Shenker, “Nox: towards an operating system for networks,” SIGCOMM Comput. Commun. Rev., vol. 38, no. 3, pp. 105–110, Jul. 2008. [14] S. Shin and G. Gu, “Attacking software-defined networks: A first feasibility study,” in Proc. second ACM SIGCOMM workshop on Hot topics in software defined networking, 2013, pp. 165–166. [15] K. Giotis, C. Argyropoulos, G. Androulidakis, D. Kalogeras, and V. Maglaris, “Combining openflow and sflow for an effective and scalable anomaly detection and mitigation mechanism on sdn environments,” Computer Networks, vol. 62, pp. 122–136, 2014. [16] “Open networking foundation,” https://www.opennetworking.org/. [17] H. Yamanaka, E. Kawai, S. Ishii, and S. Shimojo, “Openflow networks with limited l2 functionality,” ICN 2014, p. 232, 2014. [18] P. T. Congdon, P. Mohapatra, M. Farrens, and V. Akella, “Simultaneously reducing latency and power consumption in openflow switches,” IEEE/ACM Transactions on Networking (TON), vol. 22, no. 3, pp. 1007– 1020, 2014. [19] R. Panigrahy and S. Sharma, “Reducing tcam power consumption and increasing throughput,” in Proc. 10th Symposium on High Performance Interconnects, 2002, pp. 107–112.

91

[20] F. Zane, G. Narlikar, and A. Basu, “Coolcams: Power-efficient tcams for forwarding engines,” in IEEE Societies Twenty-Second Annual Joint Conference of the IEEE Computer and Communications, vol. 1, 2003, pp. 42– 52. [21] Y. Ma and S. Banerjee, “A smart pre-classifier to reduce power consumption of tcams for multi-dimensional packet classification,” in Proc. ACM SIGCOMM 2012 conference on Applications, technologies, architectures, and protocols for computer communication, 2012, pp. 335–346. [22] V. Ravikumar, R. N. Mahapatra, and L. N. Bhuyan, “Easecam: An energy and storage efficient tcam-based router architecture for ip lookup,” IEEE Transactions on Computers, vol. 54, no. 5, pp. 521–533, 2005. [23] C. R. Meiners, A. X. Liu, and E. Torng, “Tcam razor: A systematic approach towards minimizing packet classifiers in tcams,” in IEEE International Conference on Network Protocols (ICNP), 2007, pp. 266–275. [24] Y. Sun and M. S. Kim, “Tree-based minimization of tcam entries for packet classification,” in Proc. 7th IEEE Consumer Communications and Networking Conference (CCNC), 2010, pp. 1–5. [25] K. Kannan and S. Banerjee, “Compact tcam: Flow entry compaction in tcam for power aware sdn,” in Distributed Computing and Networking. Springer, 2013, pp. 439–444. [26] “Openflow switch specification,” https://www.opennetworking.org/sdnresources/onf-specifications/openflow. [27] M. Yu, J. Rexford, M. J. Freedman, and J. Wang, “Scalable flow-based networking with difane,” ACM SIGCOMM Computer Communication Review, vol. 40, no. 4, pp. 351–362, 2010. [28] N. Kang, Z. Liu, J. Rexford, and D. Walker, “Optimizing the one big switch abstraction in software-defined networks,” Proc. ACM CoNEXT, 2013. [29] M. F. Bari, A. R. Roy, S. R. Chowdhury, Q. Zhang, M. F. Zhani, R. Ahmed, and R. Boutaba, “Dynamic controller provisioning in software defined networks.” in CNSM, 2013, pp. 18–25. [30] S. Sharma, D. Staessens, D. Colle, M. Pickavet, and P. Demeester, “Openflow: Meeting carrier-grade recovery requirements,” Computer Communications, vol. 36, no. 6, pp. 656–665, 2013. [31] F. D¨ urr, “Towards cloud-assisted software-defined networking,” Technical Report 2012/04, Institute of Parallel and Distributed Systems, Universit¨at Stuttgart, Tech. Rep., 2012. 92

[32] A. Bianco, P. Giaccone, A. Mahmood, M. Ullio, and V. Vercellone, “Evaluating the sdn control traffic in large isp networks,” in Proc. IEEE International Conference on Communications (ICC), June 2015, pp. 5248–5253. [33] R. Ahmed and R. Boutaba, “Design considerations for managing wide area software defined networks,” IEEE Communications Magazine, vol. 52, no. 7, pp. 116–123, 2014. [34] A. Sgambelluri, A. Giorgetti, F. Cugini, F. Paolucci, and P. Castoldi, “Openflow-based segment protection in ethernet networks,” Journal of Optical Communications and Networking, vol. 5, no. 9, pp. 1066–1075, 2013. [35] S. Jain, A. Kumar, S. Mandal, J. Ong, L. Poutievski, A. Singh, S. Venkata, J. Wanderer, J. Zhou, M. Zhu et al., “B4: Experience with a globallydeployed software defined wan,” in ACM SIGCOMM Computer Communication Review, vol. 43, no. 4, 2013, pp. 3–14. [36] S. S. Lee, K.-Y. Li, K. Y. Chan, J.-H. YwiChi, T.-W. Lee, W.-K. Liu, and Y.-J. Lin, “Design of sdn based large multi-tenant data center networks,” in Proc. IEEE 4th International Conference on Cloud Networking (CloudNet), 2015, pp. 44–50. [37] A. R. Curtis, W. Kim, and P. Yalagandula, “Mahout: Low-overhead datacenter traffic management using end-host-based elephant detection,” in Proc. IEEE International Conference on Computer Communications (INFOCOM), 2011, pp. 1629–1637. [38] N. Beheshti and Y. Zhang, “Fast failover for control traffic in softwaredefined networks,” in Proc. IEEE Global Communications Conference (GLOBECOM), 2012, pp. 2665–2670. [39] S. Sharma, D. Staessens, D. Colle, M. Pickavet, and P. Demeester, “Fast failure recovery for in-band openflow networks,” in Proc. 9th international conference on the Design of reliable communication networks (DRCN), 2013, pp. 52–59. [40] Y. Hu, W. Wendong, G. Xiangyang, C. H. Liu, X. Que, and S. Cheng, “Control traffic protection in software-defined networks,” in Proc. IEEE Global Communications Conference (GLOBECOM), 2014, pp. 1878–1883. [41] A. Detti, C. Pisa, S. Salsano, and N. Blefari-Melazzi, “Wireless mesh software defined networks (wmsdn),” in Proc. IEEE 9th international conference on wireless and mobile computing, networking and communications (WiMob), 2013, pp. 89–95.

93

[42] S. Salsano, G. Siracusano, A. Detti, C. Pisa, P. L. Ventre, and N. BlefariMelazzi, “Controller selection in a wireless mesh sdn under network partitioning and merging scenarios,” arXiv preprint arXiv:1406.2470, 2014. [43] H. Huang, P. Li, S. Guo, and W. Zhuang, “Software-defined wireless mesh networks: architecture and traffic orchestration,” IEEE Network, vol. 29, no. 4, pp. 24–30, 2015. [44] K. Sepp¨ anen, J. Kilpi, and T. Suihko, “Integrating wmn based mobile backhaul with sdn control,” Mobile Networks and Applications, vol. 20, no. 1, pp. 32–39, 2015. [45] M. Mendonca, B. N. Astuto, X. N. Nguyen, K. Obraczka, T. Turletti et al., “A survey of software-defined networking: Past, present, and future of programmable networks,” 2013. [46] U. C. Kozat, G. Liang, and K. Kokten, “On diagnosis of forwarding plane via static forwarding rules in software defined networks,” in Proc. IEEE International Conference on Computer Communications (INFOCOM), 2014. [47] N. Handigol, B. Heller, V. Jeyakumar, D. Mazieres, and N. McKeown, “I know what your packet did last hop: Using packet histories to troubleshoot networks,” in 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2014. [48] M. Borokhovich, L. Schiff, and S. Schmid, “Provable data plane connectivity with local fast failover: Introducing openflow graph algorithms,” in ACM SIGCOMM Workshop on Hot Topics in Software Defined Networking (HotSDN 2014), 2014, pp. 121–126. [49] M. L. Cing-Yu Chu, Kang Xi and H. J. Chao, “Congestion-aware single link failure recovery in hybrid sdn networks,” in Proc. IEEE International Conference on Computer Communications (INFOCOM), April 2015. [50] O. N. Fundation, “Software-defined networking: The new norm for networks,” ONF White Paper, 2012. [51] Z. Cai, A. L. Cox, and T. E. N. Maestro, “A system for scalable openflow control,” Technical Report TR10-08, Rice University, Tech. Rep., 2010. [52] R. Sherwood, G. Gibb, K.-K. Yap, G. Appenzeller, M. Casado, N. McKeown, and G. Parulkar, “Can the production network be the testbed?” in Proc. 9th USENIX conference on Operating systems design and implementation, ser. OSDI’10. Berkeley, CA, USA: USENIX Association, 2010, pp. 1–6.

94

[53] “Google g-sacle network,” https://www.eetimes.com/electronicsnews/4371179/Google-describesits-OpenFlow-network. [54] N. Katta, J. Rexford, and D. Walker, “Infinite cacheflow in softwaredefined networks,” Tech. Rep. TR-966-13, Department of Computer Science, Princeton University, Tech. Rep., 2013. [55] X. N. Nguyen, D. Saucez, C. Barakat, T. Thierry et al., “Optimizing rules placement in openflow networks: trading routing for better efficiency,” in ACM SIGCOMM Workshop on Hot Topics in Software Defined Networking (HotSDN 2014), 2014. [56] R. Cohen, L. Lewin-Eytan, J. S. Naor, and D. Raz, “On the effect of forwarding table size on sdn network utilization,” in Proc. IEEE International Conference on Computer Communications (INFOCOM), 2014, pp. 1734– 1742. [57] M. Moshref, M. Yu, A. Sharma, and R. Govindan, “Vcrib: Virtualized rule management in the cloud,” in Proc. 4th USENIX conference on Hot Topics in Cloud Ccomputing, (HotCloud’12), 2012. [58] C. Zhang, Y. Liu, W. Gong, J. Kurose, R. Moll, and D. Towsley, “On optimal routing with multiple traffic matrices,” in Proc. IEEE 24th Annual Joint Conference of the IEEE Computer and Communications Societies, vol. 1. IEEE, 2005, pp. 607–618. [59] H. Wang, J. Lou, Y. Chen, Y. Sun, and X. Shen, “Achieving maximum throughput with a minimum number of label switched paths in mpls networks,” in Proc. 14th International Conference onComputer Communications and Networks, (ICCCN). IEEE, 2005, pp. 187–192. [60] G. Nakibly, R. Cohen, and L. Katzir, “Optimizing data plane resources for multipath flows,” IEEE/ACM Transactions on Networking (TON), vol. PP, no. 99, 12 2013. [61] W.-H. Wang, M. Palaniswami, and S. H. Low, “Optimal flow control and routing in multi-path networks,” Performance Evaluation, vol. 52, no. 2, pp. 119–132, 2003. [62] H. Han, S. Shakkottai, C. Hollot, R. Srikant, and D. Towsley, “Multi-path tcp: a joint congestion control and routing scheme to exploit path diversity in the internet,” IEEE/ACM Transactions on Networking (TON), vol. 14, no. 6, pp. 1260–1271, 2006. [63] P. Key, L. Massouli´e, and D. Towsley, “Path selection and multipath congestion control,” in 26th Proc. IEEE International Conference on Computer Communications, (INFOCOM), 2007, pp. 143–151. 95

[64] R. Khalili, N. Gast, M. Popovic, and J.-Y. Le Boudec, “Mptcp is not pareto-optimal: performance issues and a possible solution,” IEEE/ACM Transactions on Networking (TON), vol. 21, no. 5, pp. 1651–1665, 2013. [65] D. Adami, S. Giordano, M. Pagano, and N. Santinelli, “Class-based traffic recovery with load balancing in software-defined networks,” in Proc. IEEE Global Communications Conference Workshops, 2014, pp. 161–165. [66] B. Niven-Jenkins, D. Brungard, M. Betts, N. Sprecher, and S. Ueno, “Requirements of an mpls transport profile,” Tech. Rep., 2009. [67] T. Mizrahi and Y. Moses, “On the necessity of time-based updates in sdn,” in Open Networking Summit (ONS), 2014. [68] ——, “Software defined networks: It is about time,” in Proc. IEEE International Conference on Computer Communications (INFOCOM), 2016. [69] K. Pagiamtzis and A. Sheikholeslami, “Content-addressable memory (cam) circuits and architectures: A tutorial and survey,” IEEE Journal of SolidState Circuits, vol. 41, no. 3, pp. 712–727, 2006. [70] P. Bosshart, G. Gibb, H.-S. Kim, G. Varghese, N. McKeown, M. Izzard, F. Mujica, and M. Horowitz, “Forwarding metamorphosis: Fast programmable match-action processing in hardware for sdn,” in Proc. ACM SIGCOMM, 2013, pp. 99–110. [71] “Sdn system performance,” http://pica8.org/blogs/?p=201, 2012. [72] E. Spitznagel, D. Taylor, and J. Turner, “Packet classification using extended tcams,” in Proc. 11th IEEE International Conference on Network Protocols (ICNP). IEEE, 2003, pp. 120–131. [73] B. Stephens, A. Cox, W. Felter, C. Dixon, and J. Carter, “Past: Scalable ethernet for data centers,” in Proc. 8th international conference on Emerging networking experiments and technologies (CoNEXT). ACM, 2012, pp. 49–60. [74] X. Jin, H. H. Liu, R. Gandhi, S. Kandula, R. Mahajan, M. Zhang, J. Rexford, and R. Wattenhofer, “Dynamic scheduling of network updates,” in Proc. 2014 ACM conference on SIGCOMM, 2014, pp. 539–550. [75] D. Y. Huang, K. Yocum, and A. C. Snoeren, “High-fidelity switch models for software-defined network emulation,” in Proc. second ACM SIGCOMM workshop on Hot topics in software defined networking (HotSDN), 2013, pp. 43–48.

96

[76] G. Sun, V. Anand, H.-F. Yu, D. Liao, and L. Li, “Optimal provisioning for elastic service oriented virtual network request in cloud computing,” in 2012 IEEE International Conference on Global Communications Conference (GLOBECOM), 2012, pp. 2517–2522. [77] H. Huang, D. Zeng, S. Guo, and H. Yao, “Joint optimization of task mapping and routing for service provisioning in distributed datacenters,” in Proc. IEEE International Conference on Communications (ICC). IEEE, 2014, pp. 1–6. [78] G. Optimization, “Gurobi optimizer reference manual,” URL: http://www.gurobi.com.

[79] D. Turner, K. Levchenko, A. C. Snoeren, and S. Savage, “California fault lines: understanding the causes and impact of network failures,” ACM SIGCOMM Computer Communication Review, vol. 41, no. 4, pp. 315–326, 2011. [80] S. S. Lee, P.-K. Tseng, A. Chen, and C.-S. Wu, “Non-weighted interface specific routing for load-balanced fast local protection in ip networks,” in Proc. IEEE International Conference on Communications (ICC), 2011, pp. 1–6. [81] C. Cascone, L. Pollini, D. Sanvito, and A. Capone, “Traffic management applications for stateful sdn data plane,” in 2015 Fourth European Workshop on Software Defined Networks (EWSDN), 2015, pp. 85–90. [82] A. Capone, C. Cascone, A. Q. Nguyen, and B. Sanso, “Detour planning for fast and reliable failure recovery in sdn with openstate,” in Proc. 11th International Conference on the Design of Reliable Communication Networks (DRCN), 2015, pp. 25–32. [83] P. M. Mohan, T. Truong-Huu, and M. Gurusamy, “Tcam-aware local rerouting for fast and efficient failure recovery in software defined networks,” in Proc. IEEE Global Communications Conference (GLOBECOM), 2015. [84] J.-P. Vasseur, M. Pickavet, and P. Demeester, Network recovery: Protection and Restoration of Optical, SONET-SDH, IP, and MPLS. Elsevier, 2004. [85] R. J. Edell, N. McKeown, and P. P. Varaiya, “Billing users and pricing for tcp,” IEEE Journal on Selected Areas in Communications, vol. 13, no. 7, pp. 1162–1175, 1995. [86] Z. Wang and J. Crowcroft, “Qos routing for supporting resource reservation,” IEEE Journal on Selected Areas in Communications, September, 1996. 97

[87] X. Yuan, “Heuristic algorithms for multi-constrained quality-of-service routing,” IEEE/ACM Transactions on Networking, vol. 10, no. 2, pp. 244–256, 2002. [88] M. Chen, S. C. Liew, Z. Shao, and C. Kai, “Markov approximation for combinatorial network optimization,” IEEE Transactions on Information Theory, vol. 59, no. 10, pp. 6301–6327, 2013. [89] S. Ramamurthy and B. Mukherjee, “Survivable wdm mesh networks. part i-protection,” in Proc. Eighteenth IEEE International Conference on Computer Communications (INFOCOM), vol. 2, 1999, pp. 744–751. [90] K. Walkowiak, M. Klinkowski, B. Rabiega, and R. Goścień, “Routing and spectrum allocation algorithms for elastic optical networks with dedicated path protection,” Optical Switching and Networking, vol. 13, pp. 63–75, 2014. [91] C. Huang, V. Sharma, K. Owens, and S. Makam, “Building reliable mpls networks using a path protection mechanism,” IEEE Communications Magazine, vol. 40, no. 3, pp. 156–162, 2002. [92] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004. [93] P. Diaconis and D. Stroock, “Geometric bounds for eigenvalues of markov chains,” The Annals of Applied Probability, pp. 36–61, 1991. [94] D. M. Topkis, “A k shortest path algorithm for adaptive routing in communications networks,” IEEE Transactions on Communications, vol. 36, no. 7, pp. 855–859, 1988. [95] M. R. Rahman and R. Boutaba, “Svne: Survivable virtual network embedding algorithms for network virtualization,” IEEE Transactions on Network and Service Management, vol. 10, no. 2, pp. 105–118, 2013. [96] J. W. Jiang, T. Lan, S. Ha, M. Chen, and M. Chiang, “Joint vm placement and routing for data center traffic engineering,” in Proc. IEEE International Conference on Computer Communications (INFOCOM), 2012, pp. 2876–2880. [97] F. P. Kelly, Reversibility and Stochastic Networks. Cambridge University Press, 2011.

[98] K. L. Judd, “The law of large numbers with a continuum of iid random variables,” Journal of Economic theory, vol. 35, no. 1, pp. 19–25, 1985.

