Flow Aggregation for Traffic Engineering

Globecom 2014 - Next Generation Networking Symposium

Noriaki Kamiyama (1,2), Yousuke Takahashi (2), Keisuke Ishibashi (2), Kohei Shiomoto (2), Tatsuya Otoshi (1), Yuichi Ohsita (1), and Masayuki Murata (1)

(1) Osaka University
(2) NTT Network Technology Laboratories
E-mail: [email protected]

Abstract—Although the use of software-defined networking (SDN) enables routes of packets to be controlled with finer granularity (down to the individual flow level) by using traffic engineering (TE) and thereby enables better balancing of the link loads, the corresponding increase in the number of states that need to be managed at routers and the controller is problematic in large-scale networks. Aggregating flows into macro flows and assigning routes by macro flow should be an effective approach to solving this problem. However, when macro flows are constructed as TE targets, variations of traffic rates in each macro flow should be minimized to improve route stability. We propose two methods for generating macro flows: one is based on a greedy algorithm that minimizes the variation in rates, and the other clusters micro flows with similar traffic variation patterns into groups and optimizes the ratio of traffic extracted from each cluster to aggregate into each macro flow. Evaluation using traffic demand matrices for 48 hours of Internet2 traffic demonstrated that the proposed methods can reduce the number of TE targets to about 1/50 to 1/400 without degrading the link-load balancing effect of TE.

I. INTRODUCTION

Internet routers switch packets in accordance with the destination IP address without keeping flow states, and packets are transmitted on the route with the minimum total value for the static weights of links. The sending hosts autonomously determine the flow rates without call admission control (CAC), so link congestion is inevitable. Therefore, to maintain adequate quality, Internet service providers (ISPs) need to design and manage their networks so that the loads on all links are kept below a certain level. However, the link loads change due to variations in the traffic demand matrix and to route fluctuations caused by failures. This led to the development of various traffic engineering (TE) methods that balance the link loads by dynamically configuring the packet routes [3][4][5][9][10][12]. Many of these TE methods use the Open Shortest Path First (OSPF) routing protocol and configure the routes by updating the link weights accordingly. That is, the packet routes are set in units of destination network addresses, which is a coarse granularity, so effectively balancing the link load with these TE methods is difficult.

Software-defined networking (SDN) has attracted much attention recently [11]. In an SDN/OpenFlow network, packet flows can be defined as any combination of fields in the packet header, e.g., source and destination IP addresses, source and destination port numbers, and protocol type, and the packet routes can be configured in units of flows, which enables fine-grained route control. In the OpenFlow network, the routing policy can be flexibly configured by using the OpenFlow controller, which centrally manages all the network nodes, and the routes can be dynamically controlled on a short time scale. For example, the TE methods proposed by Hong et al. and Jain et al. use SDN to configure routes in fine-grained units within a short time frame [6][8].

Although SDN can be used to control routes with finer granularity, and we can expect a higher link-load balancing effect of TE, the routers and controller need to keep the state of each unit of route setting. Therefore, if we set flows defined by the five fields in the packet header¹ as the unit of route control, the number of states to be kept by the routers and controller is huge, and applying this approach to large-scale networks is difficult. Moreover, the variance in the traffic rates of the micro flows is larger than that of the total traffic aggregated by destination network address. In TE, routes are controlled on the basis of predicted traffic loads. Hence, the effect of TE using micro flows as the unit will be degraded because of the error in predicting traffic loads.

On the other hand, if routes are controlled in coarse-grained units, such as the destination network address, as in conventional TE methods using the OSPF link weights, a large amount of traffic might be shifted at each route modification, which would also degrade the load-balancing effect of TE. Moreover, the link loads on the detour routes would be high, so routes would again be shifted to another candidate route. This could lead to the routes becoming unstable.

Therefore, routes should be controlled with a granularity coarser than micro flows and finer than the mixture of flows aggregated by destination network address. In other words, it is effective to control the routes by using macro flow units defined in terms of groups of micro flows with the same origin and destination network addresses. We thus propose two methods for aggregating micro flows into macro flows to be used effectively as TE targets. The first method is based on a greedy algorithm that optimizes a single criterion, and the second method clusters micro flows with similar traffic variation patterns into groups and optimizes the ratio of traffic extracted from each cluster to aggregate into each macro flow. Evaluation using traffic demand matrices for 48 hours of Internet2 traffic demonstrated that the proposed methods can reduce the number of TE targets to about 1/50 to 1/400 without degrading the link-load balancing effect of TE.

In Section II, we summarize the requirements for generating macro flows and describe the two proposed methods for generating macro flows. We show the numerical results for each property of the generated macro flows in Section III and the numerical results for the load-balancing effect of TE using the generated set of macro flows in Section IV. Finally, we summarize the key points in Section V.

¹ Hereafter, we call flows obtained by this definition micro flows.

978-1-4799-3512-3/14/$31.00 ©2014 IEEE


II. GENERATION OF MACRO FLOWS USED AS TE TARGETS

We group $F$ micro flows between a pair of origin-destination (OD) nodes into $M$ macro flows and control the routes for each macro flow. We divide time into time slots of fixed length $T$ and construct the set of macro flows using $R_{f,t}$, the amount of traffic carried by each micro flow $f$ in each time slot $t$ during the evaluation period, $1 \le t \le L$. We define $\bar{R}_f$ as the average of $R_{f,t}$ during the evaluation period, i.e., $\bar{R}_f = \sum_{t=1}^{L} R_{f,t} / L$. Hereafter, we use the general term TE target to mean the target or unit of TE, such as a micro flow or a macro flow.

A. Requirements

We identified three requirements when generating macro flows.

(1) Number of macro flows, $M$: TE methods set routes for each TE target on the basis of the amount of traffic of each TE target, so routers need to measure traffic for each TE target during each measurement time interval and predict the traffic load as necessary. Since the amount of computation required to derive the optimum routes increases drastically with the number of TE targets, the number of macro flows $M$ should be as small as possible.

(2) Variation in macro-flow traffic, $V_x$: Because the amount of traffic of each micro flow fluctuates, the amount of traffic of each macro flow, which is generated by aggregating multiple micro flows, also varies. TE determines the route of each TE target on the basis of the rounded value, i.e., the amount of traffic of each TE target at a given time instant or the predicted traffic load. Therefore, the variation in macro-flow traffic $V_x$ should be as small as possible to improve the effect of TE.

(3) Maximum amount of macro-flow traffic, $Q$: When a macro-flow route is changed by TE, the link load on the new route increases, and the increase is large if the amount of macro-flow traffic is large. This can create congestion on the links in the new route, which could lead to the macro-flow route being changed again. The route can thus become unstable. The maximum amount of macro-flow traffic, $Q$, should thus be as small as possible to keep the TE target routes stable.

Several methods have been proposed for generating macro flows. For example, Hu et al. proposed a method that efficiently finds the optimal pattern of flow aggregation for a large amount of traffic by hierarchically searching the information in the IP headers [7]. Zhang et al. proposed a flow-aggregation method that finds flows with a large amount of traffic [13]. However, since these methods do not generate macro flows as TE targets, they cannot satisfy the three requirements described above.

Aggregating more micro flows into each macro flow by decreasing $M$ will increase the $Q$ of the obtained macro flows, although $V_x$ will decrease thanks to the statistical multiplexing effect. Hence, there is a negative correlation between $Q$ and $M$ (or $V_x$), and optimizing these three requirements simultaneously is difficult. We thus propose two methods for generating macro flows. The first method minimizes a single criterion, i.e., either $Q$ or $V_x$, by using a greedy algorithm. Moreover, to consider both $Q$ and $V_x$ simultaneously as the optimization target, we also propose a second method, which clusters micro flows into groups so that those in each group have similar patterns of traffic variation and then optimizes the ratio of traffic extracted from each cluster to aggregate into each macro flow.

B. Generating Macro Flows Using Greedy Algorithm

Let us optimize either $V_x$ or $Q$ when $M$ is given. For an OD node pair with $F$ micro flows, there are $M^F$ combinations for grouping these $F$ micro flows into $M$ macro flows, and the computation time for checking all possible combinations increases dramatically as $F$ or $M$ increases. Therefore, we approximately derive the optimum solution by using the greedy algorithm shown below. We define $\mathcal{F}^m$ as the set of micro flows grouped into macro flow $m$.

(i) Initialize $\mathcal{F}^m$ of each $m$ ($1 \le m \le M$) to the empty set $\phi$.
(ii) Select one micro flow $f$ in descending order of micro-flow property $Y_f$.
(iii) Derive property $P$ of the macro-flow set when grouping the selected micro flow $f$ into each macro flow $m$.
(iv) Assign micro flow $f$ to the macro flow $m^*$ giving the minimum $P$; i.e., update $\mathcal{F}^{m^*}$ to $\mathcal{F}^{m^*} \cup \{f\}$.
(v) Repeat steps (ii)-(iv) until all $F$ micro flows are grouped into a macro flow.

The optimization function $P$ of the macro-flow set and the micro-flow property $Y_f$ determine the sequence of macro-flow assignments, and we consider the following five cases.

(1) Minimize total variation in macro-flow traffic

With the aim of decreasing $V_x$, we set $\Phi$, the sum of the variation in macro-flow traffic during the evaluation period, to $P$. In other words, we set

$$P = \Phi = \sum_{m=1}^{M} \frac{\sum_{t=1}^{L} (X_{m,t} - \bar{X}_m)^2}{L}, \tag{1}$$

where $X_{m,t}$ is the sum of the traffic of the micro flows that have already been grouped into macro flow $m$ in time slot $t$, and $\bar{X}_m$ is the average of $X_{m,t}$ in the evaluation period; i.e., $X_{m,t} = \sum_{f \in \mathcal{F}^m} R_{f,t}$ and $\bar{X}_m = \sum_{t=1}^{L} X_{m,t} / L$. Moreover, we set $Y_f$ to the coefficient of variation of micro flow $f$; i.e., $Y_f = \sqrt{\sum_{t=1}^{L} (R_{f,t} - \bar{R}_f)^2} \,/\, (\sqrt{L}\, \bar{R}_f)$. We describe this method as G/var.
(2) Minimize maximum peak traffic of macro flows

With the aim of decreasing $Q$, we set $\Psi$, the maximum peak traffic of all the macro flows, to $P$. In other words, we set

$$P = \Psi = \max_{1 \le m \le M} \max_{1 \le t \le L} X_{m,t}, \tag{2}$$

and we set $Y_f$ to the peak traffic of each flow $f$; i.e., $Y_f = \max_{1 \le t \le L} R_{f,t}$. We describe this method as G/peak.

(3) Minimize maximum total traffic of macro flows

With the aim of decreasing $Q$, we set $\Omega$, the maximum total traffic of the macro flows, to $P$; i.e., we set

$$P = \Omega = \max_{1 \le m \le M} \sum_{t=1}^{L} X_{m,t}, \tag{3}$$

and we set $Y_f$ to the total traffic of each flow $f$; i.e., $Y_f = \sum_{t=1}^{L} R_{f,t}$. We describe this method as G/sum.
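The greedy steps (i)-(v) with the three criteria defined so far can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`greedy_aggregate`, `total_variance`, `max_peak`, `max_total`, `coeff_of_variation`) are hypothetical, traffic is held in plain nested lists, and ties in $Y_f$ or $P$ are assumed to be broken in favor of the lower index, which the text does not specify.

```python
from typing import Callable, List, Sequence, Tuple

Matrix = List[List[float]]  # rows: flows or macro flows, columns: time slots


def total_variance(macro: Matrix) -> float:
    """Phi in (1): variance over time of each macro flow, summed over macro flows."""
    total = 0.0
    for row in macro:
        mean = sum(row) / len(row)
        total += sum((x - mean) ** 2 for x in row) / len(row)
    return total


def max_peak(macro: Matrix) -> float:
    """Psi in (2): largest per-slot traffic over all macro flows."""
    return max(max(row) for row in macro)


def max_total(macro: Matrix) -> float:
    """Omega in (3): largest total traffic over all macro flows."""
    return max(sum(row) for row in macro)


def coeff_of_variation(row: Sequence[float]) -> float:
    """Y_f for G/var: standard deviation divided by mean (0 for an all-zero flow)."""
    mean = sum(row) / len(row)
    if mean == 0:
        return 0.0
    var = sum((x - mean) ** 2 for x in row) / len(row)
    return var ** 0.5 / mean


def greedy_aggregate(
    rates: Matrix,
    num_macro: int,
    objective: Callable[[Matrix], float],
    flow_key: Callable[[Sequence[float]], float],
) -> Tuple[List[List[int]], Matrix]:
    """Steps (i)-(v): returns the member lists and the macro-flow traffic X[m][t]."""
    num_slots = len(rates[0])
    macro = [[0.0] * num_slots for _ in range(num_macro)]  # (i) empty macro flows
    members: List[List[int]] = [[] for _ in range(num_macro)]
    # (ii) visit micro flows in descending order of Y_f
    order = sorted(range(len(rates)), key=lambda f: flow_key(rates[f]), reverse=True)
    for f in order:
        best_m, best_p, best_row = 0, None, None
        for m in range(num_macro):
            # (iii) evaluate P if flow f were added to macro flow m
            trial_row = [x + r for x, r in zip(macro[m], rates[f])]
            trial = macro[:m] + [trial_row] + macro[m + 1:]
            p = objective(trial)
            if best_p is None or p < best_p:
                best_m, best_p, best_row = m, p, trial_row
        # (iv) commit the assignment that minimizes P
        macro[best_m] = best_row
        members[best_m].append(f)
    return members, macro


# Toy input: four micro flows observed over two time slots.
rates = [[10, 0], [0, 10], [5, 5], [1, 1]]
g_peak_members, g_peak_traffic = greedy_aggregate(rates, 2, max_peak, max)
g_var_members, g_var_traffic = greedy_aggregate(rates, 2, total_variance, coeff_of_variation)
```

Passing `max_total` together with `sum` as `flow_key` gives the G/sum variant. Each candidate assignment re-evaluates the objective over all macro flows, which keeps the sketch short at the cost of speed; an incremental update of $P$ would be preferable for thousands of flows.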



(4) Minimize maximum peak traffic ratio of macro flows

To improve route stability, we can minimize the maximum ratio of the peak traffic of each macro flow to the total traffic in each time slot instead of minimizing the maximum traffic of macro flows. Therefore, with the aim of decreasing $Q$, we set $\Pi$, the maximum peak traffic ratio of each macro flow to the total traffic in each time slot, to $P$. In other words, we set

$$P = \Pi = \max_{1 \le m \le M} \max_{1 \le t \le L} \frac{X_{m,t}}{\hat{X}_t}, \tag{4}$$

where $\mathcal{F}$ is the set of all flows, and $\hat{X}_t$ is the total traffic in time slot $t$; i.e., $\hat{X}_t = \sum_{f \in \mathcal{F}} R_{f,t}$. Moreover, we set $Y_f$ to the maximum peak traffic ratio; i.e., $Y_f = \max_{1 \le t \le L} R_{f,t} / \hat{X}_t$. We describe this method as G/pr.

(5) Minimize total variation in macro-flow traffic with constraint of peak traffic ratio

In addition to decreasing $V_x$, we also try to avoid dramatically increasing $Q$ by imposing an upper bound on $Q$. Optimization function $P$ is again given by (1). However, when grouping each flow into a macro flow in steps (iii) and (iv), we consider as candidates only the macro flows whose peak traffic ratio, $\max_{1 \le t \le L} X_{m,t} / \hat{X}_t$, is below the given upper bound. We also set $Y_f$ to the coefficient of variation of micro flow $f$. We describe this method as G/var+pb.

C. Generating Macro Flows Using Clustering

If we try to obtain the macro-flow set in which both $V_x$ and $Q$ are strictly optimized in the Pareto-optimal sense, we need to check all $M^F$ possible combinations of grouping $F$ micro flows into $M$ macro flows, so this approach is difficult for large-scale networks. However, to approximately generate macro flows with a small $V_x$ and a small $Q$, it should be effective to improve the statistical multiplexing effect by aggregating flows with different patterns of traffic variation. Based on this assumption, we propose a method for generating macro flows using clustering in this section.
The proposed method, denoted as Clustering, consists of three steps: 1) cluster flows on the basis of traffic pattern, 2) optimize the ratio of traffic extracted from each flow cluster into each macro flow, and 3) assign flows to macro flows on the basis of the traffic ratio derived in step 2. We describe each of these steps below.

(1) Flow clustering

By clustering each flow $f$ on the basis of the $L$-dimensional vector $\mathbf{R}_f$ whose entries are the amount of traffic of flow $f$ in each time slot $t$, we group flows with similar traffic patterns into the same cluster. As the clustering method, we use the k-means method, which classifies members into $K$ clusters, where $K$ is a given parameter. The k-means method generates the initial clusters randomly and repeatedly reassigns each member $x$ to the cluster whose centroid is closest to $x$ until the reassignment converges. Because $R_{f,t}$ differs greatly among flows, we use the spherical k-means method to appropriately group flows into clusters [2]. With this method, flows are grouped into clusters on the basis of the normalized vector of $\mathbf{R}_f$, which is obtained by dividing each element of $\mathbf{R}_f$ by $\|\mathbf{R}_f\| = \sqrt{\sum_{t=1}^{L} R_{f,t}^2}$. Moreover, as the centroid of each cluster $c$, the normalized centroid $\hat{\mathbf{r}}_c = \{\hat{r}_{c,1}, \hat{r}_{c,2}, \cdots, \hat{r}_{c,L}\}$ is used. The spherical k-means method comprises five steps.

(i) For each flow $f$, derive the $L$-dimensional normalized vector $\hat{\mathbf{R}}_f = \mathbf{R}_f / \|\mathbf{R}_f\|$.
(ii) Randomly select the initial value of the normalized centroid $\hat{\mathbf{r}}_c$ of each cluster $c$ from the $L$-dimensional normalized vector space.
(iii) Assign each flow $f$ to the cluster $c$ giving the minimum distance $D(\hat{\mathbf{R}}_f, \hat{\mathbf{r}}_c) = \sum_{t=1}^{L} (\hat{R}_{f,t} - \hat{r}_{c,t})^2$ between the normalized vector of flow $f$, $\hat{\mathbf{R}}_f$, and the normalized centroid of cluster $c$, $\hat{\mathbf{r}}_c$.
(iv) Recalculate the normalized centroid of each cluster $c$, $\hat{\mathbf{r}}_c$.
(v) Repeat steps (iii) and (iv) until the cluster structure converges.

(2) Optimizing ratio of traffic extracted from each cluster

Let $G_c$ denote the set of flows grouped into each cluster $c$ of the $K$ clusters obtained in the previous step. We try to extract a fixed ratio $\alpha_{m,c}$ ($0 \le \alpha_{m,c} \le 1$) of traffic from each cluster $c$ and aggregate the extracted traffic into each macro flow $m$ ($1 \le m \le M$). In other words, as shown in Fig. 1, we allocate $\alpha_{m,c} Z_{c,t}$ of the traffic of cluster $c$ to macro flow $m$ in time slot $t$, where $Z_{c,t}$ is the total traffic of cluster $c$ in time slot $t$; i.e., $Z_{c,t} = \sum_{f \in G_c} R_{f,t}$. Therefore, we have

$$M_{m,t} = \sum_{c=1}^{K} \alpha_{m,c} Z_{c,t} = \sum_{c=1}^{K} \alpha_{m,c} \sum_{f \in G_c} R_{f,t},$$

where $M_{m,t}$ is the amount of traffic of macro flow $m$ in time slot $t$. By solving the following linear programming problem, which minimizes the maximum traffic of each macro flow in each time slot $t$, we obtain the optimum extraction ratio $\alpha_{m,c}$.

$$\text{minimize} \quad \max_{m,t} M_{m,t}, \tag{5}$$

$$\text{s.t.} \quad \sum_{m=1}^{M} \alpha_{m,c} = 1, \quad \forall c,\ 1 \le c \le K, \tag{6}$$

$$0 \le \alpha_{m,c} \le 1. \tag{7}$$
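The objective in (5) is a maximum and therefore not linear as written. One standard way to hand it to an LP solver, which we assume here (the text does not say how the problem was encoded), is to introduce an auxiliary scalar $z$ that upper-bounds every $M_{m,t}$:

```latex
\begin{align*}
\text{minimize}\quad & z \\
\text{s.t.}\quad & \sum_{c=1}^{K} \alpha_{m,c} Z_{c,t} \le z, && \forall m,\ \forall t, \\
& \sum_{m=1}^{M} \alpha_{m,c} = 1, && \forall c, \\
& 0 \le \alpha_{m,c} \le 1, && \forall m,\ \forall c .
\end{align*}
```

Summing the first constraint over $m$ and applying the second gives $z \ge \frac{1}{M} \max_{t} \sum_{c=1}^{K} Z_{c,t}$, a lower bound against which any solver output can be checked.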

(3) Aggregating flows into macro flows

If we could divide the traffic of each cluster with any granularity, we could obtain the optimum set of macro flows minimizing the maximum traffic of the macro flows in step (2). However, in practice, we can aggregate traffic into macro flows only with the granularity of flows; we cannot divide the traffic of each cluster with any granularity. Therefore, we obtain the macro-flow set by solving an integer program that minimizes the deviation from the optimum value $\alpha_{m,c}$. Let $x_{f,m}$ denote the binary variable taking unity when flow $f$ is assigned to macro flow $m$ and zero otherwise. Moreover, we define $d_{m,c,t}$ as the difference between the traffic extracted from each cluster $c$ to macro flow $m$ in time slot $t$ obtained by $x_{f,m}$ and that obtained by $\alpha_{m,c}$; i.e., $d_{m,c,t} = \left| \sum_{f \in G_c} R_{f,t} x_{f,m} - \alpha_{m,c} Z_{c,t} \right|$. We also define $s_{m,c}$ as the sum of $d_{m,c,t}$ over all time slots in the evaluation period. We try to minimize $S$, the maximum value of $s_{m,c}$ over all macro flows. However, it is difficult to make the ratio of traffic extracted from each cluster close to $\alpha_{m,c}$ in all time slots because the traffic of each flow $f$ varies. Therefore, we also try to minimize $Q$, the maximum traffic of macro flows, in addition to $S$. We define the following integer program for $x_{f,m}$ minimizing the weighted sum of $S$ and $Q$.

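$$\text{min} \quad (1 - \beta) Q + \beta S \tag{8}$$

$$\text{s.t.} \quad Q = \max_{m,t} \sum_{f} R_{f,t} x_{f,m}, \tag{9}$$

$$S = \max_{m,c} \sum_{t=1}^{L} \left| \sum_{f \in G_c} R_{f,t} x_{f,m} - \alpha_{m,c} Z_{c,t} \right|, \tag{10}$$

$$\sum_{m=1}^{M} x_{f,m} = 1, \quad \forall f,\ 1 \le f \le F, \tag{11}$$

$$x_{f,m} \in \{0, 1\}, \tag{12}$$

where $\beta$ is a real parameter in the range of $0 \le \beta \le 1$.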
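Step (1) of the Clustering method, spherical k-means, can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the names `normalize` and `spherical_kmeans` are hypothetical, the initial centroids are drawn from the normalized flow vectors themselves (the text only says they are chosen randomly from the normalized vector space), and a cluster that becomes empty simply keeps its previous centroid.

```python
import math
import random
from typing import List, Sequence


def normalize(vec: Sequence[float]) -> List[float]:
    """Divide a vector by its Euclidean norm; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """D in step (iii): squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def spherical_kmeans(
    rates: Sequence[Sequence[float]], k: int, seed: int = 0, max_iter: int = 100
) -> List[int]:
    """Return a cluster label in [0, k) for each flow in `rates`."""
    unit = [normalize(r) for r in rates]                      # step (i)
    rng = random.Random(seed)
    # step (ii): assumed initialization, k distinct flows used as starting centroids
    centroids = [list(unit[i]) for i in rng.sample(range(len(unit)), k)]
    labels: List[int] = []
    for _ in range(max_iter):
        # step (iii): assign every flow to its nearest normalized centroid
        new_labels = [
            min(range(k), key=lambda c: squared_distance(u, centroids[c]))
            for u in unit
        ]
        if new_labels == labels:                              # step (v): converged
            break
        labels = new_labels
        # step (iv): recompute each centroid as the normalized mean of its members
        for c in range(k):
            group = [unit[i] for i, lab in enumerate(labels) if lab == c]
            if group:
                mean = [sum(col) / len(group) for col in zip(*group)]
                centroids[c] = normalize(mean)
    return labels


# Flows 0 and 1 share one temporal pattern, flows 2 and 3 another, at very
# different volumes; normalization makes the volume irrelevant.
example_labels = spherical_kmeans([[10, 0], [200, 0], [0, 3], [0, 70]], k=2)
```

Because only the direction of each traffic vector is used, a small flow and a large flow with the same temporal shape always receive the same label, which is the property the text relies on when $R_{f,t}$ differs greatly among flows. The result for other inputs depends on the random initialization.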

Fig. 1. Extracting traffic from each cluster and assigning it to macro flows

III. PROPERTIES OF GENERATED MACRO FLOWS

To evaluate the properties of macro flows generated by each of the proposed methods described in Secs. II-B and II-C, we used Internet2 traffic data captured over four hours from 15:00 to 19:00 on 19 May, 2012, as well as the topology of Internet2 [1].

A. Evaluation Conditions

Internet2 consisted of nine nodes, and $R_{f,t}$ was set to the amount of traffic between each of the 72 OD node pairs within each five-minute time slot. We used the first 24 time slots as the evaluation period. Due to space limitations, we present the results for only two OD node pairs: Kansas City to Los Angeles (KCLA) and Chicago to Washington (CHWA). Because the effect of TE for short-length flows is negligible, we considered only flows generating traffic in three or more time slots as the target of macro flow aggregation. The number of flows targeted for macro flow aggregation was 1,101 for KCLA and 8,488 for CHWA. In G/var+pb, we set the peak traffic of each flow multiplied by 1.2 as the upper limit on peak macro flow traffic.

In the clustering described in Sec. II-C, we can set $K$ to any integer. We generated macro flows for various settings of $K$ and found that the effect of $K$ on the obtained set of macro flows was small except when $K$ was extremely large, in which case the number of flows aggregated into each macro flow was close to unity, or extremely small, i.e., two or three. Therefore, we set $K$ to 5, 10, 20, and 30 when $N_f$, the number of target flows for macro flow aggregation, was in the range of $N_f < 500$, $500 \le N_f < 1{,}000$, $1{,}000 \le N_f < 5{,}000$, and $N_f \ge 5{,}000$, respectively, for each OD node pair. In accordance with this policy, we set K = 20 for KCLA and K = 50 for CHWA. We used the CBC MILP Solver 2.7.5 to obtain the solutions in steps (2) and (3) of the clustering-based method with $\beta = 1/3$.

B. Numerical Results for Macro Flow Properties

We define $\Phi$ as the average coefficient of variation (CoV) for the macro flow traffic; i.e.,

$$\Phi = \frac{1}{M} \sum_{m=1}^{M} \frac{\left( \sum_{t=1}^{24} X_{m,t}^2 / 24 - \bar{X}_m^2 \right)^{0.5}}{\bar{X}_m}.$$

Figure 2 plots $\Phi$ against $M$, the number of macro flows, for KCLA. Figure 3 shows the results for CHWA.

The results when flows were individually treated and not aggregated into macro flows, which is denoted as Individual, are shown as dotted lines. As $M$ increased, the number of flows comprising each macro flow decreased, and the statistical multiplexing effect was degraded, so $\Phi$ increased as $M$ increased. The $\Phi$ of the macro flows obtained with all six methods was less than about half the $\Phi$ of Individual, and we confirmed that the stability of TE targets was improved by using macro flows as the target of TE. Among the proposed methods, the $\Phi$ of G/var and G/var+pb, which minimize the traffic variation of each macro flow, was small, whereas the $\Phi$ of G/peak and G/pr, which do not consider the traffic variation, was large. In the entire range of $M$ for KCLA and in the range of $M \le 15$ for CHWA, the $\Phi$ of Clustering was the largest among these six methods. We can aggregate traffic into macro flows only with the granularity of flows, and we cannot divide the traffic of each cluster with any granularity. Therefore, the optimality of Clustering was degraded from the optimum value $\alpha_{m,c}$ obtained in step (2) due to the deviation of the traffic ratio of each cluster aggregated into each macro flow obtained in step (3).
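The average CoV $\Phi$ used in Figs. 2 and 3 can be computed from a matrix of macro-flow traffic as in the following minimal sketch. The function name `average_cov` is hypothetical, the number of slots is taken from the data rather than fixed at 24, and macro flows with zero mean traffic are assumed not to occur (the sketch raises an error if one does).

```python
import math
from typing import List, Sequence


def average_cov(macro: Sequence[Sequence[float]]) -> float:
    """Mean over macro flows of sqrt(E[X^2] - E[X]^2) / E[X]."""
    covs: List[float] = []
    for row in macro:
        n = len(row)
        mean = sum(row) / n
        if mean == 0:
            raise ValueError("CoV is undefined for a macro flow with zero mean traffic")
        # clamp tiny negative values caused by floating-point rounding
        var = max(sum(x * x for x in row) / n - mean * mean, 0.0)
        covs.append(math.sqrt(var) / mean)
    return sum(covs) / len(covs)


# Two macro flows over two slots: CoV 0.5 and CoV 0, so the average is 0.25.
phi = average_cov([[1, 3], [2, 2]])
```

The variance is written as $E[X^2] - E[X]^2$ to mirror the definition of $\Phi$ in the text; for long series with large values, the two-pass form $\sum_t (X_{m,t} - \bar{X}_m)^2 / L$ is numerically safer.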

Fig. 2. Average coefficient of macro flow traffic variation for KCLA

Fig. 3. Average coefficient of macro flow traffic variation for CHWA

Figure 4 shows the cumulative distribution of the CoV of macro flow traffic produced by each method when M = 20 for KCLA. As shown in Fig. 2, G/var and G/var+pb achieved the smallest Φ, and the CoV of the traffic was smaller than those of the other methods in all macro flows generated.

Figure 5 plots Ψ, the maximum peak traffic of macro flows given by (2), against M for KCLA, and Fig. 6 shows the results for CHWA. The unit of Ψ is the amount of traffic generated in each time slot (Mbytes). Aggregating flows into macro flows never decreased Ψ relative to Individual. As M increased, the number of flows comprising each macro flow decreased, so Ψ decreased. We also observed that the Ψ of Clustering tended to be larger than those of the greedy-based methods. When M was greater than about 15, the Ψ of G/peak and G/pr closely agreed with the Ψ of Individual, so we can neglect the increase in Ψ when using these aggregation methods. However, when M was larger than about 20, the effect of increasing M on reducing Ψ diminished because the traffic of a few flows with an extremely large amount of traffic accounted for a large share of Ψ. The Ψ of G/peak and G/pr, which minimize the peak traffic load, was the smallest among the proposed methods over a wide range of M. Moreover, the Ψ of G/var+pb, which uses the peak traffic ratio as a constraint, was close to those of these two methods. The Ψ of G/var and G/sum, which do not consider the peak traffic, was large.

Fig. 4. Cumulative distribution of CoV of traffic for each macro flow

Fig. 5. Maximum peak traffic of macro flows for KCLA

Fig. 6. Maximum peak traffic of macro flows for CHWA

In sum, the effect of Clustering on minimizing the variations and peaks of macro flow traffic was small compared with the methods based on greedy algorithms, so the greedy-based methods are better for flow aggregation. Among the greedy-based methods, G/var+pb gave good results for both Q and Vx. While the variance in macro flow traffic increased with the number of macro flows M, the peak macro flow traffic converged at about M = 20, so the improvement in Q and Vx with increasing M was negligible for M greater than about 20. Therefore, M should be set to 20 when using Internet2 traffic data. For example, when G/var+pb is used with M = 20, the variation in the TE-target traffic is reduced to about 1/5 while suppressing the increase of the maximum traffic of the TE target, compared with Individual. Hence, the effect of TE on balancing the link load should be improved.

IV. LOAD-BALANCING EFFECT OF TE

Although we can dramatically reduce the number of TE targets by aggregating flows into macro flows, the effect of TE on balancing the link load would likely be degraded because the route setting granularity in TE becomes coarse. We investigated this by numerically comparing the load-balancing effect of TE between using units of flow and using units of macro flow. We used four metrics: 1) the average link load in each time slot (ave.ave), 2) the average maximum link load in each time slot (ave.max), 3) the average minimum link load in each time slot (ave.min), and 4) the average standard deviation of the link load in each time slot (ave.sd).

A. Evaluation Conditions

Using a random network with 100 nodes, each with degree greater than unity (average = 4), we evaluated the effect of TE. For each of the 9,900 OD pairs (100 × 99), we generated the traffic volume Rf,t of each flow f over four hours (48 time slots) from the traffic of an OD pair randomly selected from among the 72 Internet2 OD pairs. Only flows with traffic in four or more time slots were used. In a practical application, we would need to generate macro flows in accordance with the predicted traffic load. Since we did not address traffic prediction, we assumed that the traffic load was accurately predicted and constructed the macro flow set on the basis of the traffic load in all 48 time slots. In other words, we generated macro flows using each of the proposed methods described in Sec. II on the basis of the amount of traffic Rf,t in all 48 time slots.

The number of macro flows generated for each OD pair, M, was set to 20. The candidate routes to be set by TE were the 30 K-shortest paths, i.e., the 30 routes with the shortest hop length, for each OD pair. The route for each macro flow was set, in descending order of the maximum traffic in each time slot, to the candidate route that would give the minimum maximum link load in all 48 time slots. For comparison, we also evaluated the case without TE (NonTE), in which the shortest-hop routes were assigned between any OD pair.

B. Effect of TE on Link-Load Balancing

To investigate the effect of using macro flows on link-load balancing with and without TE, we calculated the four metrics (ave.ave, ave.max, ave.min, and ave.sd) for NonTE, for TE by flow (MicroTE), and for TE by macro flow for three of the proposed methods, Clustering, G/var, and G/var+pb. The results are summarized in Table I. With NonTE, since the shortest-hop routes were always used, ave.ave was the lowest. With MicroTE, although ave.ave was about 25% higher, ave.max was lower by about 30% and ave.sd was lower by about 55%, compared with NonTE. Figure 7 plots the maximum link load in each time slot, and Fig. 8 plots the standard deviation of the link load. These figures also confirm this tendency. The difference in the load-balancing effect among the three methods using macro flows was small.

It was anticipated that the load-balancing effect would be degraded with TE using units of macro flow because the granularity of route settings would be coarse. However, although TE with macro flow granularity increased ave.ave by about 10%, it decreased ave.max by about 5% and ave.sd by about 50%. As mentioned in Sec. IV-A, the route for each macro flow was set in descending order of the maximum traffic in each time slot without considering the routes of macro flows for which the routes were assigned later.

Consider, for example, the case in which there are two candidate routes (A and B) between an OD node pair and in which there are three flows (a, b, and c) with traffic loads of (100, 20), (0, 80), and (70, 50), respectively, in two time slots. When the three flows are not aggregated into macro flows, routes are assigned to the three flows in the order of flow a, flow b, and flow c, so route A is assigned to flow a and route B to flow b. If route A is then assigned to flow c, the maximum link load is 170, whereas it is 130 if route B is assigned to flow c, so route B is assigned to flow c. As a result, the maximum link load is 130. On the other hand, if flow a and flow b are aggregated into one macro flow, route A is first assigned to macro flow a+b, then route B is assigned to flow c. As a result, the maximum link load is 100. This simple example shows that the maximum link load can be increased by allocating routes for TE targets without considering the routes for TE targets to be assigned later. This possibility increases with the variation in the traffic load of each TE target. As shown in Figs. 2 and 3, the variation in the traffic load of each macro flow was smaller than that of individual flows, so the maximum and variation of the link load were smaller with macro-flow TE than with individual-flow TE. However, the link-load balancing effect greatly depends on the TE method used, so macro-flow TE does not always improve the load-balancing effect compared with individual-flow TE.
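The two-route example can be reproduced with a small sketch of the sequential route assignment described in Sec. IV-A. The sketch rests on assumptions that the text leaves open: the function name `assign_routes` is hypothetical, each candidate route is modeled as a tuple of link names, the "maximum link load" of a candidate is taken over the links on that route and all time slots, and ties are broken in favor of the first candidate.

```python
from typing import Dict, List, Sequence, Tuple

Route = Tuple[str, ...]  # a route is a tuple of link identifiers


def assign_routes(
    targets: Dict[str, Sequence[float]],
    candidates: Sequence[Route],
    num_slots: int,
) -> Tuple[Dict[str, Route], float]:
    """Assign each TE target to a route; return the choices and the resulting peak link load."""
    link_load: Dict[str, List[float]] = {}
    chosen: Dict[str, Route] = {}
    # process TE targets in descending order of their maximum per-slot traffic
    order = sorted(targets, key=lambda name: max(targets[name]), reverse=True)
    for name in order:
        traffic = targets[name]
        best_route, best_peak = None, None
        for route in candidates:
            # peak load on this route's links if the target were placed on it
            peak = max(
                link_load.get(link, [0.0] * num_slots)[t] + traffic[t]
                for link in route
                for t in range(num_slots)
            )
            if best_peak is None or peak < best_peak:
                best_route, best_peak = route, peak
        # commit the target's traffic to every link of the chosen route
        for link in best_route:
            loads = link_load.setdefault(link, [0.0] * num_slots)
            for t in range(num_slots):
                loads[t] += traffic[t]
        chosen[name] = best_route
    network_peak = max(max(loads) for loads in link_load.values())
    return chosen, network_peak


routes = [("A",), ("B",)]  # two single-link candidate routes
individual = {"a": [100, 20], "b": [0, 80], "c": [70, 50]}
aggregated = {"a+b": [100, 100], "c": [70, 50]}

_, peak_individual = assign_routes(individual, routes, num_slots=2)  # 130
_, peak_aggregated = assign_routes(aggregated, routes, num_slots=2)  # 100
```

Under these assumptions, flow-by-flow assignment ends with a maximum link load of 130 and the aggregated case with 100, matching the example. The gap arises because flow b is placed before anything is known about flow c.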

Fig. 7. Time series for maximum link load

TABLE I
EFFECT OF BALANCING LINK LOAD (GBYTES/SLOT)

Method       ave.ave   ave.max   ave.min   ave.sd
NonTE        2.617     5.778     0.540     1.009
MicroTE      3.248     4.216     1.880     0.449
Clustering   3.586     4.065     2.283     0.297
G/var        3.575     3.992     2.219     0.281
G/var+pb     3.574     3.999     2.264     0.284

Fig. 8. Time series for standard deviation of link load

In sum, by using the proposed methods to generate macro flows and by controlling the packet routes using units of macro flow with the number of macro flows set to 20, we can reduce the number of TE targets to about 1/50 to 1/400 without degrading the link-load balancing effect of TE.

V. CONCLUSION

Using traffic engineering (TE) to control packet routes increases the number of node states that need to be managed, which is a problem in large-scale networks. One solution to this problem is to aggregate multiple flows into macro flows and assign a route to each macro flow. When macro flows are the target of TE, the variations among the traffic rates in each macro flow should be minimized to improve route stability. We proposed two methods for generating macro flows: one uses a greedy algorithm to minimize the variation or peak of the traffic load of each macro flow, and the other clusters micro flows with similar traffic variation patterns into groups and optimizes the ratio of traffic extracted from each cluster to aggregate into each macro flow. Evaluation using traffic demand matrices for 48 hours of Internet2 traffic demonstrated that the proposed methods can reduce the number of TE targets to about 1/50 to 1/400 without degrading the link-load balancing effect of TE.

ACKNOWLEDGMENTS

This work was supported by the Ministry of Internal Affairs and Communications of Japan.

REFERENCES

[1] Internet2-Abilene Network, http://www.networknebraska.net/internet2.
[2] I. S. Dhillon and D. M. Modha, "Concept decompositions for large sparse text data using clustering," Machine Learning, 42 (1), pp. 143-175, 2001.
[3] A. Elwalid, C. Jin, S. Low, and I. Widjaja, "MATE: MPLS Adaptive Traffic Engineering," IEEE INFOCOM 2001.
[4] B. Fortz and M. Thorup, "Internet Traffic Engineering by Optimizing OSPF Weights," IEEE INFOCOM 2000.
[5] B. Fortz and M. Thorup, "Robust optimization of OSPF/IS-IS weights," INOC 2003.
[6] C. Y. Hong, et al., "Achieving high utilization with software-driven WAN," ACM SIGCOMM 2013.
[7] Y. Hu, D. M. Chiu, and J. C. S. Lui, "Entropy based adaptive flow aggregation," IEEE/ACM Transactions on Networking, 17 (3), pp. 698-711, 2009.
[8] S. Jain, et al., "B4: Experience with a globally deployed software defined WAN," ACM SIGCOMM 2013.
[9] S. Kandula, D. Katabi, B. Davie, and A. Charny, "Walking the Tightrope: Responsive Yet Stable Traffic Engineering," ACM SIGCOMM 2005.
[10] A. Kvalbein, A. F. Hansen, T. Cicic, S. Gjessing, and O. Lysne, "Fast IP Network Recovery using Multiple Routing Configurations," IEEE INFOCOM 2006.
[11] N. McKeown, et al., "OpenFlow: enabling innovation in campus networks," ACM SIGCOMM Computer Communication Review, Apr. 2008.
[12] D. Mitra and K. Ramakrishna, "A Case Study of Multiservice Multipriority Traffic Engineering Design," IEEE GLOBECOM 1999.
[13] Y. Zhang, S. Singh, S. Sen, N. Duffield, and C. Lund, "Online identification of hierarchical heavy hitters: algorithms, evaluation, and applications," ACM IMC 2004.
