On Global Modeling of Backbone Network Traffic

Stilian Stoev, George Michailidis, Joel Vaughan
Department of Statistics, University of Michigan, Ann Arbor, 439 W. Hall, 1085 S. University
Abstract: We develop a probabilistic framework for global modeling of the traffic over a computer network. The model integrates existing single–link (–flow) traffic models with the routing over the network to capture the global traffic behavior. It arises from a limit approximation of the traffic fluctuations as the time scale and the number of users sharing the network grow. The resulting probability model consists of a Gaussian component and/or a stable, infinite variance component. These components can be succinctly described and handled by certain 'space–time' random fields. The model is validated against real data and applied to predict traffic fluctuations over unobserved links from a limited set of observed links.
I. INTRODUCTION

Understanding the statistical behavior of computer network traffic has been an important and challenging problem for the past 15 years, because of its impact on network performance and provisioning [13], [16] and on the potential for development of more suitable protocols [15], [16]. Since the early 1990s it has been well established that the traffic over a single link exhibits intricate temporal dependence, known as burstiness, which could not be explained by traffic models developed for telephone networks [12]. This phenomenon can be understood and described by using the notions of long–range dependence and self–similarity [7], which in turn are related to the presence of heavy tails in the distribution of file sizes [3], [15]. A bottom-up mechanistic model for single link network traffic that is in agreement with the empirical features observed in real network traces was presented in [24]. A competing model based on queuing ideas was studied in [14]. For many further developments see, e.g., [16].

Advances in technology that allowed the acquisition of direct (through sampling [6]) and indirect [11] measurements have allowed researchers to examine the characteristics of traffic in entire networks [10], [19], based on statistical modeling and analysis. On the other hand, no analogue of the mechanistic models for single link traffic is available at the network-wide level. Such a model would allow for better understanding of network performance [8], [13] and detection of anomalous behavior [17]. Ideally, it would also capture and explain statistical relationships between flows traversing the network at all time scales (time) and across all links (space); the latter is a fairly tall requirement, which may also prove rather impractical given the underlying complexity (protocols, applications) and heterogeneity (physical infrastructure, diverse users) of modern networks.
Our objective in this paper is to propose a mechanistic model that captures several fundamental characteristics of network-wide traffic and thus constitutes a partial solution to this challenging problem. We first model the user behavior on source–destination paths across the network and then aggregate over users and over time, thus developing a joint 'space–time' probability model for the traffic fluctuations over all links in the network. This asymptotic approximation is relevant only in the case of fast backbone networks, where the routing is constant and multiple users share fairly unconstrained resources. The success of our modeling strategy is demonstrated by using real data from the Internet2 network in the context of network traffic prediction, a problem with important implications for network performance, provisioning, and management [16].

II. PRELIMINARIES

Consider a computer network of $L$ links and $N$ nodes. Traffic flows (groups of packets) from any node (source) to any other node (destination) are transmitted over predetermined sets of links (routes). The routes are formally described by the routing matrix $A = (a_{\ell j})_{L\times J}$:
$$ a_{\ell j} = \begin{cases} 1, & \text{route } j \text{ involves link } \ell, \\ 0, & \text{otherwise,} \end{cases} $$
where $J$ is typically of the order $O(N^2)$. Here, we suppose that $A$ is given and relatively constant, which is the case for many networks including the backbone.

A. Problem formulation

We assume, for simplicity, that the traffic is fluid. That is, the amount of data (bytes) transmitted over link $\ell$ during the time interval $(a, b)$ is $\int_a^b Y_\ell(t)\,dt$, where $Y_\ell(t)$ is the traffic intensity (bytes per unit time) over link $\ell$. Let also $X_j(t)$ denote the traffic intensity at time $t$ over route $j$, $1 \le j \le J$. Then, assuming that traffic propagates instantaneously over the network, we obtain the following routing equation:
$$ \vec{Y}(t) = A\vec{X}(t), \tag{1} $$
where $\vec{X}(t) = (X_j(t))_{1\le j\le J}$ and $\vec{Y}(t) = (Y_\ell(t))_{1\le \ell\le L}$. Notice that this relationship holds exactly only if traffic propagates instantaneously along the routes. Thus, (1) cannot be adopted over the finest, high–frequency time scales where packet delay plays a central role. On the other hand,
for all practical purposes, the routing equation holds over a wide range of time scales greater than the round trip time (RTT) for packets in the network [19]. Further, it captures the fundamental relationship between the traffic intensity over different routes in the network and the resulting load incurred on the links.

From a physical perspective, the computer network is merely used to "transport" information from source to destination nodes. In the normal (uncongested) operating regime, the traffic is carried seamlessly and the traffic intensities $X_j(t)$ are driven solely by the demand along the routes. Thus, as a first approximation, one may view the $X_j(t)$'s as statistically independent in $j$, $1 \le j \le J$. Therefore, in view of (1), the statistical dependence between $Y_{\ell_1}(t)$ and $Y_{\ell_2}(t)$ for two links $\ell_1$ and $\ell_2$ is governed by the set of routes $j$ traversing both links.

This discussion suggests an approach to building a global, network–wide model of the traffic intensities $Y_\ell(t)$, $1 \le \ell \le L$, $t \ge 0$. The temporal dependence of the flow–level traffic $X_j(t)$ can be described well by the existing mechanistic models exhibiting long–range dependence and heavy tails (see Section II-B). Thus, the routing equation (1) yields a model for the temporal dependence of the link–level traffic. The dependence across links is also naturally induced by the routing structure of the network. Although the independence assumption for the $X_j(t)$'s over $j$ is rather strong, it is nevertheless supported by empirical evidence, especially over backbone network links (see [19] and Fig. 2 below). This approach will allow us to gain insights and offer a first attempt at modeling the joint behavior of traffic on all network links.
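To make the routing equation (1) concrete, the following minimal sketch builds the routing matrix of a small network and computes the link loads from the route-level intensities. The three-link, three-route topology and all numerical values are assumptions chosen for illustration; they are not taken from the Internet2 data.

```python
import numpy as np

# Hypothetical toy network: 3 links (rows of A) and 3 routes (columns of A).
# routes[j] lists the links traversed by route j (illustrative only).
routes = [
    [0],      # route 0 uses link 0
    [0, 1],   # route 1 uses links 0 and 1
    [1, 2],   # route 2 uses links 1 and 2
]
n_links = 3


def routing_matrix(routes, n_links):
    """Return the L x J 0/1 matrix A with A[l, j] = 1 iff route j uses link l."""
    A = np.zeros((n_links, len(routes)), dtype=int)
    for j, links in enumerate(routes):
        A[links, j] = 1
    return A


A = routing_matrix(routes, n_links)

# Route-level traffic intensities X(t) at one time instant (bytes per unit time).
X = np.array([5.0, 2.0, 3.0])

# Routing equation (1): link loads Y(t) = A X(t).
Y = A @ X

# Entry (l1, l2) of A A^T counts the routes shared by links l1 and l2.
# If the X_j are independent with a common variance, the covariance of the
# link loads is proportional to this matrix.
shared = A @ A.T
```

In this toy example links 0 and 2 share no route, so the corresponding off-diagonal entry of `shared` is zero and, under the independence assumption, their loads are uncorrelated.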
B. Single flow traffic models

The statistical behavior of single–link and single–flow traffic has been well studied and is largely understood (see e.g. [16]). Successful mechanistic models have been developed that explain the observed traffic burstiness in terms of heavy tails and long–range dependence. Such models are built on the paradigm of multiple users sharing a link. Depending on the regimes prevalent in the network, one obtains two qualitatively different asymptotic models for the cumulative traffic fluctuations. One regime leads to finite–variance, Gaussian models that exhibit long–range dependence and self–similarity. The other regime yields infinite variance processes with independent increments.

Let $\{X^{(i)}(t)\}$, $1 \le i \le M$, be independent and identically distributed stationary processes modeling the traffic intensities of $M$ users sharing a given route. Then, the cumulative traffic over the route is:
$$ X^*(T, M) := \int_0^T \sum_{i=1}^{M} X^{(i)}(t)\,dt. $$
We are interested in the asymptotic behavior of the cumulative traffic fluctuations about the mean: $X_0^*(T, M) := X^*(T, M) - \mathbb{E}X^*(T, M)$.

As shown in [14], if the number of users $M = M(T)$ grows to infinity, as a function of the time scale $T$, we have:
• (fast growth) If $(M(T)T)^{1/\alpha}/T \to \infty$, as $T \to \infty$, then
$$ \mathcal{L}\lim_{T\to\infty} \Big\{ \frac{1}{T^{H}\sqrt{M(T)}}\, X_0^*(Tt, M) \Big\}_{t\ge 0} = \{B_H(t)\}_{t\ge 0}. $$
• (slow growth) If $(M(T)T)^{1/\alpha}/T \to 0$, as $T \to \infty$, then
$$ \mathcal{L}\lim_{T\to\infty} \Big\{ \frac{1}{(T M(T))^{1/\alpha}}\, X_0^*(Tt, M) \Big\}_{t\ge 0} = \{\Lambda_\alpha(t)\}_{t\ge 0}, $$
where '$\mathcal{L}\lim$' denotes limit of the finite–dimensional distributions.

In the fast growth scenario the number of users $M(T)$ grows relatively fast (compared to the time scale $T$), and the limit $B_H = \{B_H(t)\}_{t\ge 0}$ is a Gaussian $H$–self–similar process with stationary increments, called fractional Brownian motion (fBm). Namely, for all $c > 0$, we have $\{B_H(ct)\}_{t\ge 0} \stackrel{d}{=} \{c^H B_H(t)\}_{t\ge 0}$ with $0 < H \le 1$. One can show that
$$ \mathrm{Cov}(B_H(t), B_H(s)) = \frac{\sigma^2}{2}\big( |t|^{2H} + |s|^{2H} - |t-s|^{2H} \big), \quad t, s \ge 0, $$
for some $\sigma^2 = \mathrm{Var}(B_H(1)) > 0$. The slow growth regime, on the other hand, yields the infinite variance stable Lévy motion $\Lambda_\alpha = \{\Lambda_\alpha(t)\}_{t\ge 0}$ (see e.g. [18]).

The parameters $1 < \alpha < 2$ and $H = (3 - \alpha)/2$ are related and they stem from the underlying assumptions on the user–level traffic $X^{(i)}(t)$. Here the individual user traffic intensity is modeled by an On/Off process with heavy–tailed, infinite variance 'On' and 'Off' durations. Namely,
$$ X^{(i)}(t) = \sum_{j\in\mathbb{Z}} I(T_{j,\mathrm{on}} \le t < T_{j,\mathrm{off}}), \tag{2} $$
where $U_{j,\mathrm{on}} := T_{j,\mathrm{off}} - T_{j,\mathrm{on}}$ and $U_{j,\mathrm{off}} := T_{j+1,\mathrm{on}} - T_{j,\mathrm{off}}$ are the durations of the $j$-th 'On' and 'Off' periods, respectively. The durations of the user activity are assumed independent, with cumulative distributions such that, as $x \to \infty$:
$$ \overline{F}_{\mathrm{on}}(x) := 1 - F_{\mathrm{on}}(x) \sim c_{\mathrm{on}}\, x^{-\alpha_{\mathrm{on}}} \quad\text{and}\quad \overline{F}_{\mathrm{off}}(x) \sim c_{\mathrm{off}}\, x^{-\alpha_{\mathrm{off}}}, \tag{3} $$
where $1 < \alpha := \min\{\alpha_{\mathrm{on}}, \alpha_{\mathrm{off}}\} < 2$. This probabilistic model of the durations reflects the ubiquitous presence of heavy tails in computer networks (file sizes, web pages, etc. [3], [4], [15]). The heavy–tailed nature of the durations implies that the processes $X^{(i)}(t)$ of user activity are long–range dependent (LRD). The intimate connection between long–range dependence and self–similarity provides an appealing mechanistic (physical) explanation of the cause of burstiness in network traffic (see e.g. [7], [12], [16] and the references therein).

III. NETWORK–WIDE TRAFFIC MODELING

A. Asymptotic Approximations

As in Sec. II-B, we model the traffic intensity $X_j(t)$ over route $j$ as a composition of $M_j$ independent users generating On/Off traffic:
$$ X_j(t) = \sum_{i=1}^{M_j} X_j^{(i)}(t), \quad 1 \le j \le J. \tag{4} $$
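The On/Off superposition in (2) and (4) can be simulated directly. The sketch below is a minimal illustration under stated assumptions: durations are drawn from a classical Pareto law with minimum 1 (one convenient distribution satisfying the tail condition (3)), time is discretized on an integer grid, and every source starts at the beginning of an 'On' period, so the paths are not the stationary versions used in the limit theorems. The function names `on_off_path` and `route_traffic` and all parameter values are hypothetical.

```python
import numpy as np


def on_off_path(rng, n_steps, alpha_on=1.5, alpha_off=1.5):
    """Sample one On/Off indicator path X^(i)(t) on the grid t = 0, ..., n_steps-1.

    'On' and 'Off' durations are Pareto with tail x^(-alpha) for x >= 1, hence
    heavy tailed with infinite variance when 1 < alpha < 2. The path starts at
    the beginning of an 'On' period (a simplifying assumption).
    """
    x = np.zeros(n_steps)
    t = 0.0
    on = True
    while t < n_steps:
        alpha = alpha_on if on else alpha_off
        # Generator.pareto samples a Lomax law; adding 1 gives a classical
        # Pareto variable with minimum value 1.
        duration = rng.pareto(alpha) + 1.0
        if on:
            lo = int(np.ceil(t))
            hi = int(np.ceil(min(t + duration, n_steps)))
            x[lo:hi] = 1.0  # grid points k with t <= k < t + duration
        t += duration
        on = not on
    return x


def route_traffic(rng, n_users, n_steps, **kwargs):
    """Equation (4): X_j(t) as the sum of n_users independent On/Off sources."""
    return sum(on_off_path(rng, n_steps, **kwargs) for _ in range(n_users))


rng = np.random.default_rng(0)
alpha = 1.5
X_j = route_traffic(rng, n_users=50, n_steps=1000, alpha_on=alpha, alpha_off=alpha)

# Hurst parameter of the Gaussian (fast growth) limit, H = (3 - alpha) / 2.
H = (3 - alpha) / 2
```

With `alpha = 1.5` the Hurst parameter of the fast growth limit is 0.75. Increasing `n_users` relative to `n_steps` moves the simulation toward the Gaussian regime, while few users over long horizons is closer to the stable regime.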
We suppose, in addition, that the $X_j(t)$'s are independent in $j$ and have common parameter $\alpha = \min\{\alpha_{\mathrm{on}}, \alpha_{\mathrm{off}}\} \in (1, 2)$ as in (3). We then obtain the following results:

Theorem 1: Let $M_j \sim r(j)M$, $M = M(T) \to \infty$, for some constants $r(j) > 0$ and let $\vec{Y}(t)$ be as in (1). If $(M(T)T)^{1/\alpha}/T \to \infty$, as $T \to \infty$, then
$$ \mathcal{L}\lim_{T\to\infty} \frac{1}{T^{H}\sqrt{M(T)}} \int_0^{Tt} \big(\vec{Y}(\tau) - \mathbb{E}\vec{Y}(0)\big)\,d\tau = \{A\vec{B}_H(t)\}_{t\ge 0}, \tag{5} $$
where $\vec{B}_H(t) = (r(j)B_H^{(j)}(t))_{1\le j\le J}$ and the $B_H^{(j)}(t)$'s are i.i.d. fBm's with parameter $H = (3 - \alpha)/2 \in (1/2, 1)$.

Theorem 2: Assume the conditions of Theorem 1. If $(M(T)T)^{1/\alpha}/T \to 0$, as $T \to \infty$, then
$$ \mathcal{L}\lim_{T\to\infty} \frac{1}{T^{1/\alpha} M^{1/\alpha}} \int_0^{Tt} \big(\vec{Y}(\tau) - \mathbb{E}\vec{Y}(0)\big)\,d\tau = \{A\vec{\Lambda}_\alpha(t)\}_{t\ge 0}, \tag{6} $$
where $\vec{\Lambda}_\alpha(t) = (r(j)\Lambda_\alpha^{(j)}(t))_{1\le j\le J}$ and the $\Lambda_\alpha^{(j)}(t)$'s are i.i.d. Lévy $\alpha$–stable motions.

Theorems 1 and 2 correspond, respectively, to the fast and slow regime asymptotics in the single–flow case. Similar and, in fact, more general random field limits were obtained in a different context by [5]. The proofs of Theorems 1 and 2 follow readily from the well–known single–flow results with an application of the continuous mapping theorem.

B. Functional fBm

We shall discuss next a class of stochastic processes, indexed by functions, which can be used to succinctly represent the limit processes in Theorem 1. Consider a measure space $(E, \mu)$ and the space $L^{2H}(\mu) = \{f : E \to \mathbb{R},\ \|f\|_{2H}^{2H} := \int_E |f|^{2H}\,d\mu < \infty\}$, where $H \in (0, 1)$. Introduce the functional
$$ \phi_{2H}(f, g) := \frac{\sigma^2}{2}\big( \|f\|_{2H}^{2H} + \|g\|_{2H}^{2H} - \|f - g\|_{2H}^{2H} \big), \tag{7} $$
for $f, g \in L^{2H}(\mu)$ and $\sigma > 0$; $\phi_{2H}(f, g)$ resembles the auto–covariance function of an fBm and it can be shown to be positive semi–definite (see [23]). One can thus construct a Gaussian process with covariance $\phi_{2H}$:

Definition 1: Let $H \in (0, 1]$. A zero mean Gaussian process $B = \{B(f)\}_{f\in L^{2H}(\mu)}$ indexed by the functions $f \in L^{2H}(\mu)$ is said to be a functional fractional Brownian motion (f–fBm), if:
$$ \mathrm{Cov}(B(f), B(g)) = \mathbb{E}B(f)B(g) = \phi_{2H}(f, g), \quad f, g \in L^{2H}(\mu). $$

The next result shows some basic properties of the f–fBm's.

Proposition 1: Let $H \in (0, 1]$ and $B = \{B(f)\}_{f\in L^{2H}(\mu)}$ be f–fBm.
(i) The process $B$ is $H$–self–similar:
$$ \{B(cf)\}_{f\in L^{2H}(\mu)} \stackrel{d}{=} \{c^H B(f)\}_{f\in L^{2H}(\mu)}, \quad (\forall c > 0). \tag{8} $$
(ii) The process $B$ has stationary increments:
$$ \{B(f + h) - B(h)\}_{f\in L^{2H}(\mu)} \stackrel{d}{=} \{B(f)\}_{f\in L^{2H}(\mu)}, \tag{9} $$
for all $h \in L^{2H}(\mu)$.
(iii) If $fg = 0$ $\mu$–a.e., then $B(f)$ and $B(g)$ are independent.
(iv) $B_f(t) := B(tf)$, $t \in \mathbb{R}$, is an ordinary fBm process.
The proof is given in [23].

We now show how the f–fBm's can be used to represent the limit in Theorem 1. Let $E = \{1, \cdots, J\}$ and let $\mu$ be the counting measure on $E$. Consider the f–fBm $B = \{B(f)\}_{f\in L^{2H}(\mu)}$.

Proposition 2: For the limit process in (5), we have
$$ \{A\vec{B}_H(t)\}_{t\ge 0} \stackrel{d}{=} \big\{ (B(tf_\ell))_{1\le \ell\le L} \big\}_{t\ge 0}. $$
Here $f_\ell(u) = r(u)^{1/H} 1_{A_\ell}(u)$, where $A_\ell \subset \{1, \cdots, J\}$ denotes the set of routes that use link $\ell$, $1 \le \ell \le L$. The proof is given in [23].

To illustrate the result let $r(j) = 1$, i.e. all routes involve the same number of users $M_j = M$. Then, the random variables $B(tf_{\ell_1})$ and $B(sf_{\ell_2})$ represent the asymptotic cumulative fluctuations of traffic over links $\ell_1$ and $\ell_2$, respectively. Since $f_\ell = 1_{A_\ell}$ is merely an indicator function, we have:
$$ \begin{aligned} \mathbb{E}B(tf_{\ell_1})B(sf_{\ell_2}) &= \frac{\sigma^2}{2}\Big( |t|^{2H}\mu(A_{\ell_1}) + |s|^{2H}\mu(A_{\ell_2}) - |t-s|^{2H}\mu(A_{\ell_1}\cap A_{\ell_2}) \\ &\qquad\quad - |t|^{2H}\mu(A_{\ell_1}\setminus A_{\ell_2}) - |s|^{2H}\mu(A_{\ell_2}\setminus A_{\ell_1}) \Big) \\ &= \mu(A_{\ell_1}\cap A_{\ell_2})\,\frac{\sigma^2}{2}\big( |t|^{2H} + |s|^{2H} - |t-s|^{2H} \big). \end{aligned} \tag{10} $$
Recall that $A_\ell \subset \{1, \cdots, J\}$ is the set of all routes that involve link $\ell$. Thus, the last relation has the following natural interpretation. The spatial dependence between the links $\ell_1$ and $\ell_2$ is governed solely by the routes they have in common, i.e. the set $A_{\ell_1}\cap A_{\ell_2}$. On the other hand, the temporal dependence follows the fBm model. In particular, $B(tf_{\ell_1})$ and $B(tf_{\ell_2})$ are independent if and only if links $\ell_1$ and $\ell_2$ have no common routes, i.e. $\mu(A_{\ell_1}\cap A_{\ell_2}) = 0$.

The space–time random field arising in the slow regime (Theorem 2) can be represented analogously by using a functional Lévy stable motion (f–Lsm). Both the f–fBm and the f–Lsm have convenient stochastic integral representations. For more details, see [23].

IV. AN APPLICATION TO NETWORK KRIGING

We focus only on the fast regime (Theorem 1), which proves to be most robust and prevalent in practice. We model the joint distribution of the traffic traces $Y_\ell(t)$, $1 \le \ell \le L$, as increments of functional fBm:
$$ Y_\ell(t) := \mu_Y(\ell) + B(tf_\ell) - B((t-1)f_\ell), \quad t = 1, 2, \cdots, $$
where $f_\ell(u) = r(u)^{1/H} 1_{A_\ell}(u)$, $u \in E \equiv \{1, \cdots, J\}$, and $A_\ell$ is the set of all routes using link $\ell$. Here $\mu_Y(\ell)$ is the mean traffic over link $\ell$, per unit time.
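The covariance structure implied by this model can be checked numerically. With the counting measure on a finite set of routes, each function $f_\ell$ is simply a vector, so the functional (7) reduces to a sum. The sketch below evaluates (7), verifies the product form (10) for $r(j) = 1$, and computes the covariance of the increments $Y_\ell(t)$. The helper names `phi` and `increment_cov`, the toy routing matrix, and the value of $H$ are assumptions made for the illustration.

```python
import numpy as np


def phi(f, g, H, sigma2=1.0):
    """Functional (7) for the counting measure on a finite set of routes."""
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    p = 2.0 * H

    def norm(h):
        # ||h||_{2H}^{2H} under the counting measure
        return np.sum(np.abs(h) ** p)

    return 0.5 * sigma2 * (norm(f) + norm(g) - norm(f - g))


def increment_cov(f1, f2, t, s, H, sigma2=1.0):
    """Cov(Y_l1(t), Y_l2(s)) for Y_l(t) = mu_Y(l) + B(t f_l) - B((t-1) f_l)."""
    return (phi(t * f1, s * f2, H, sigma2)
            - phi(t * f1, (s - 1) * f2, H, sigma2)
            - phi((t - 1) * f1, s * f2, H, sigma2)
            + phi((t - 1) * f1, (s - 1) * f2, H, sigma2))


# Hypothetical toy routing matrix: 3 links x 3 routes, with r(j) = 1 so that
# row l of A is the indicator function f_l = 1_{A_l}.
A = np.array([[1, 1, 0],
              [0, 1, 1],
              [0, 0, 1]], dtype=float)
H = 0.8
f = A

# Left- and right-hand sides of (10) for links 0 and 1 at times t and s.
t, s = 3.0, 5.0
lhs = phi(t * f[0], s * f[1], H)
common = np.sum(f[0] * f[1])  # mu(A_l1 ∩ A_l2), number of shared routes
rhs = common * 0.5 * (t ** (2 * H) + s ** (2 * H) - abs(t - s) ** (2 * H))

# Same-time covariance matrix of the link loads Y_l(t).
cov0 = np.array([[increment_cov(f[a], f[b], 4, 4, H) for b in range(3)]
                 for a in range(3)])
```

In this setting `lhs` and `rhs` agree, the covariance between links 0 and 2 vanishes because they share no route, and `cov0` equals `A @ A.T` (times $\sigma^2 = 1$), which is the spatial covariance that enters the Kriging predictor.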
Assuming that the mean structure $\vec{\mu}_Y = (\mu_Y(\ell))_{1\le \ell\le L}$ and the parameters $H$ and $r(u)$ of the limit f–fBm model are known, one recovers the joint distribution of the traffic load on the network across all links $\ell$ and time slots $t$. This allows one to address a number of fundamental statistical problems.

PROBLEM I: (Instantaneous prediction (network Kriging)) Given are the traffic loads
$$ D := \{Y_\ell(t),\ 1 \le t \le t_0,\ \ell \in O\}, \tag{11} $$
over the set of links $\ell \in O \subset \{1, \cdots, L\}$. Predict the traffic load $Y_{\ell_0}(t_0)$ on an unobserved link $\ell_0 \notin O$, in terms of the data $D$.

PROBLEM II: (Spatio–temporal prediction) Given $D$ as in (11), predict the traffic load $Y_\ell(t_0 + h)$ on an observed or unobserved link $\ell$, at some future time $t_0 + h > t_0$.

Remarks:
1) The estimation of the Hurst parameter $H$ is a well–studied problem (see e.g. [20], [21]). We advocate the use of robustified wavelet methods to obtain $H$ in practice (see e.g. [21], [22]). On the other hand, the estimation of the mean structure $\vec{\mu}_Y = (\mu_Y(\ell))_{1\le \ell\le L}$ is an important and challenging problem in practice. We address it in a general statistical framework with the help of latent models and auxiliary NetFlow data sets in our forthcoming work [25].
2) In the interest of space, we focus only on the first, instantaneous prediction problem. The $h$–step prediction problem can be addressed similarly. We refer to the instantaneous prediction as network Kriging because of its resemblance to geostatistical prediction problems. The term network Kriging was also used in [1], but in a rather different setting; namely, predicting delays along routes from active network measurements of flows in the network. Here, the focus is on link rather than flow measurements.

Partition the vector $\vec{Y}(t) = (Y_\ell(t))_{1\le \ell\le L}$ and the rows of the routing matrix $A$ into two components, corresponding to the indices of the unobserved ('u') and observed ('o') sets of links:
$$ \vec{Y}(t) = \begin{pmatrix} Y_u(t) \\ Y_o(t) \end{pmatrix} \quad\text{and}\quad A = \begin{pmatrix} A_u \\ A_o \end{pmatrix}. $$

Proposition 3: Let $\vec{Y}(t) = A\vec{X}(t)$, where $\mathbb{E}\vec{X}(0) = \mu_X$, and $\Sigma_X := \mathbb{E}(\vec{X}(0) - \mu_X)(\vec{X}(0) - \mu_X)^t$. Suppose that the matrix $A_o\Sigma_X A_o^t$ is invertible. Then:
(i) The statistic
$$ \widehat{Y}_u(t_0) = A_u\mu_X + A_u\Sigma_X A_o^t (A_o\Sigma_X A_o^t)^{-1}(Y_o(t_0) - A_o\mu_X) \tag{12} $$
is an unbiased predictor for $Y_u(t_0)$ in terms of the data $D$ in (11). The mean–squared error (m.s.e.) matrix of $\widehat{Y}_u(t_0)$ is:
$$ \begin{aligned} \mathrm{m.s.e.}(\widehat{Y}_u(t_0)\,|\,D) &:= \mathbb{E}\big[ (\widehat{Y}_u(t_0) - Y_u(t_0))(\widehat{Y}_u(t_0) - Y_u(t_0))^t \,\big|\, D \big] \\ &= A_u\Sigma_X A_u^t - A_u\Sigma_X A_o^t (A_o\Sigma_X A_o^t)^{-1} A_o\Sigma_X A_u^t, \end{aligned} \tag{13} $$
where the last expectation is conditional, given the data $D$.
(ii) The statistic $\widehat{Y}_u(t_0)$ in (12) is the best unbiased m.s.e. predictor of $Y_u(t_0)$ in terms of the data $D$ in (11). That is, for any other unbiased predictor $Y_u^*(t_0)$, we have that
$$ \mathrm{m.s.e.}(\widehat{Y}_u(t_0)\,|\,D) \le \mathrm{m.s.e.}(Y_u^*(t_0)\,|\,D), \tag{14} $$
where the last inequality means that the difference between the matrices in the right– and the left–hand sides is positive semidefinite.
For the proof, see [23].

Remarks:
1) If $\vec{Y}(t)$ is non–Gaussian, then the estimator in (12) remains the best linear unbiased predictor (b.l.u.p.) of $Y_u(t_0)$ in terms of the data $D$. Relations (13) and (14) continue to hold, where now $Y_u^*(t_0)$ is an arbitrary unbiased predictor of $Y_u(t_0)$ that is linear in $D$.
2) Note that only the observations $Y_o(t_0)$ at the present time $t_0$ are involved in (12). This is due to the product form of the space–time covariance structure of the functional fBm (10) and Proposition 9 in [23]. The fact that the b.l.u.p. $\widehat{Y}_u(t_0)$ in Proposition 3 does not depend on the past data $Y_o(t)$, $t < t_0$, shows that $\widehat{Y}_u(t_0)$ is in fact the standard Kriging predictor, which is well–studied in spatial statistics (see e.g. [2]). We shall therefore refer to $\widehat{Y}_u(t_0)$ as the standard network Kriging predictor.

V. ANALYSIS OF INTERNET2 DATA
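Before turning to the data, the standard network Kriging predictor (12) and its m.s.e. matrix (13) can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the implementation used for the Internet2 experiments: the function name `network_kriging`, the toy routing matrix, the identity route covariance, and all numerical values are hypothetical.

```python
import numpy as np


def network_kriging(A, mu_X, Sigma_X, obs_idx, y_obs):
    """Standard network Kriging predictor (12) and its m.s.e. matrix (13).

    A       : (L, J) routing matrix
    mu_X    : (J,) mean of the route-level traffic
    Sigma_X : (J, J) covariance of the route-level traffic
    obs_idx : indices of the observed links
    y_obs   : observed link loads Y_o(t0), in the order of obs_idx
    Returns (indices of unobserved links, predictions, m.s.e. matrix).
    """
    A = np.asarray(A, dtype=float)
    obs_idx = np.asarray(obs_idx)
    un_idx = np.setdiff1d(np.arange(A.shape[0]), obs_idx)
    A_o, A_u = A[obs_idx], A[un_idx]

    S_oo = A_o @ Sigma_X @ A_o.T  # Cov(Y_o, Y_o), assumed invertible
    S_uo = A_u @ Sigma_X @ A_o.T  # Cov(Y_u, Y_o)
    S_uu = A_u @ Sigma_X @ A_u.T  # Cov(Y_u, Y_u)

    # W = S_uo S_oo^{-1}, obtained by solving a linear system rather than
    # forming the inverse explicitly (S_oo is symmetric).
    W = np.linalg.solve(S_oo, S_uo.T).T

    y_hat = A_u @ mu_X + W @ (np.asarray(y_obs, dtype=float) - A_o @ mu_X)
    mse = S_uu - W @ S_uo.T
    return un_idx, y_hat, mse


# Hypothetical example: 3 links, 3 routes, independent routes with unit variance.
A = np.array([[1, 1, 0],
              [0, 1, 1],
              [0, 0, 1]])
mu_X = np.array([5.0, 2.0, 3.0])
Sigma_X = np.eye(3)

# Observe links 0 and 2 and predict the load on link 1.
un_idx, y_hat, mse = network_kriging(A, mu_X, Sigma_X, obs_idx=[0, 2], y_obs=[9.0, 4.0])
```

In this example the mean load on link 1 is 5, the observed links exceed their means by 2 and 1, and the Kriging weights are 1/2 and 1, which gives a prediction of 7 with m.s.e. 0.5. In practice `Sigma_X` would come from the fitted f–fBm model, for instance a diagonal matrix determined by the constants $r(j)$.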
We first illustrate the validity of our probabilistic models in the context of real data, derived from NetFlow measurements of the Internet2 backbone network (for technical details, see [23]). Fig. 2 (left) shows that the $X_j(t)$'s are nearly uncorrelated in $j$, which supports the simplifying independence assumption. On the other hand, the wavelet spectrum of a typical flow indicates that $X_j(t)$ is well modeled by a fractional Gaussian noise time series for a wide range of time scales (Fig. 2, right). The Hurst exponents along most routes were found to be approximately equal (within statistical errors). These observations (and NS2 simulation experiments, not shown here due to lack of space) support the validity of the global functional fBm model for the cumulative traffic fluctuations.

Fig. 3 shows the performance of the standard Kriging predictor in practice. By monitoring just a few links, one can track relatively well the traffic load on other links. Table I shows further that a given link can be relatively well predicted from measurements of as few as two other links. For more detailed experiments, the legend describing the link IDs in Table I, and a discussion of the remaining challenges, see [23].

REFERENCES

[1] D.B. Chua, E.D. Kolaczyk, and M. Crovella. Network kriging. IEEE Journal on Selected Areas in Communications, 24(12):2263–2272, Dec. 2006.
[2] N. Cressie. Statistics for Spatial Data: revised ed. John Wiley, New York, 1993.
Fig. 1. Internet2 backbone: 9 nodes and 26 one–directional links. Most links have a capacity of 10 Gb/s, with the exception of Chicago–Kansas, Kansas–Salt Lake City, New York–Washington, and Washington–Atlanta, which have a capacity of 20 Gb/s in each direction (as of early 2009).
Fig. 2. Left: Correlation matrix (absolute values) of the flow–level traffic derived from 1 hour of NetFlow measurements of Internet2 on Feb 19, 2009. Brighter shades indicate numbers close to 1. Right: Daily link– and flow–level traces and their wavelet spectra, indicating Hurst parameters $\widehat{H} \approx 0.98$ and $\widehat{H} \approx 0.99$, resp.
[3] M. E. Crovella and A. Bestavros. Self-similarity in World Wide Web traffic: evidence and possible causes. In Proceedings of the 1996 ACM SIGMETRICS. International Conference on Measurement and Modeling of Complex Systems, pages 160–169, May 1996.
[4] M. E. Crovella, M. S. Taqqu, and A. Bestavros. Heavy-tailed probability distributions in the World Wide Web. In R. Adler, R. Feldman, and M. S. Taqqu, editors, A Practical Guide to Heavy Tails: Statistical Techniques and Applications, pages 3–25, Boston, 1998. Birkhäuser.
[5] B. D'Auria and G. Samorodnitsky. Limit behavior of fluid queues and networks. Oper. Res., 53(6):933–945, 2005.
[6] N.G. Duffield. Sampling for passive Internet measurement: a review. Statistical Science, 19:472–498, 2004.
[7] A. Erramilli, P. Pruthi, and W. Willinger. Self-similarity in high-speed network traffic measurements: Fact or artifact? In Proceedings of the 12th Nordic Teletraffic Seminar NTS12, Espoo, Finland, pages 299–310, 1995.
Fig. 3. Left: Prediction for the Internet2 backbone link Houston to Atlanta (HOUS->ATLA) based on the links SEAT->SALT, SEAT->LOSA, LOSA->HOUS, ATLA->WASH, and CHIC->NEWY. The traces reflect an entire day of activity (February 19, 2009). Observe the diurnal patterns and the utilization (see the caption of Fig. 1). The dotted lines indicate 95% prediction bounds. Right: A zoomed–in portion of the left plot.
Number of links   Link labels                   Relative m.s.e.
2                 3,7                           0.07
2                 7,9                           0.12
2                 9,12                          0.08
2                 12,17                         0.41
2                 17,21                         3.06
2                 3,21                          0.05
3                 3,7,9                         0.12
4                 3,7,9,12                      0.08
5                 3,7,9,12,17                   0.07
6                 3,7,9,12,17,21                0.06
8                 3,5,7,9,11,12,17,21           0.06
10                3,5,7,9,11,12,17,21,23,25     0.06

TABLE I
Empirical relative mean squared error (r.m.s.e.) for the standard Kriging estimator: $\mathrm{r.m.s.e.} = \big(\sum_{t=1}^{T} (\widehat{Y}_u(t) - Y_u(t))^2\big) / \big(\sum_{t=1}^{T} Y_u(t)^2\big)$. The Internet2 backbone link 13 (Kansas City to Chicago) was predicted from various sets of other backbone links. The data spans the entire day of February 19, 2009.
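The relative m.s.e. in the caption of Table I is straightforward to compute from a predicted and an observed trace. The sketch below is a minimal illustration; the function name `relative_mse` and the short traces are assumptions made for the example and are not Internet2 measurements.

```python
import numpy as np


def relative_mse(y_pred, y_true):
    """Relative m.s.e.: sum_t (y_pred(t) - y_true(t))^2 / sum_t y_true(t)^2."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return np.sum((y_pred - y_true) ** 2) / np.sum(y_true ** 2)


# Hypothetical short traces for one unobserved link.
y_true = np.array([10.0, 12.0, 9.0, 11.0])
y_pred = np.array([9.0, 12.0, 10.0, 11.0])

r = relative_mse(y_pred, y_true)                        # equals 2 / 446
r_zero = relative_mse(np.zeros_like(y_true), y_true)    # trivial predictor
```

Since the trivial predictor that always outputs zero has relative m.s.e. equal to 1, values above 1 (such as the entry 3.06 in Table I) indicate a set of observed links that predicts the target worse than this baseline.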
[8] Network traffic behaviour in switched Ethernet systems. Performance Evaluation, 58: 243–360.
[9] Internet2: http://www.internet2.edu/observatory/
[10] A. Lakhina, K. Papagiannaki, M. Crovella, C. Diot, E.D. Kolaczyk, and N. Taft. Structural analysis of network traffic flows. Proceedings of Sigmetrics, 2004.
[11] E. Lawrence, G. Michailidis, V.N. Nair, and B. Xi. Network tomography: a review and recent developments. In Frontiers in Statistics, J. Fan and H. Koul (eds), 345–364, 2006.
[12] On the self-similar nature of Ethernet traffic. Computer Communications Review, 23:183–193, 1993. Proceedings of the ACM/SIGCOMM'93, San Francisco, September 1993.
[13] A. Lombardo, G. Morabito, and G. Schembra. A novel analytical framework compounding statistical traffic modeling and aggregate-level service curve disciplines: network performance and efficiency implications. IEEE/ACM Trans. on Networking, 12:443–456, 2004.
[14] T. Mikosch, S. Resnick, H. Rootzén, and A. Stegeman. Is network traffic approximated by stable Lévy motion or fractional Brownian motion? The Annals of Applied Probability, 12(1):23–68, 2002.
[15] K. Park, G. Kim, and M. E. Crovella. On the relationship between file sizes, transport protocols, and self-similar network traffic. In Proceedings of the Fourth International Conference on Networks Protocols (ICNP'96), October 1996.
[16] K. Park and W. Willinger, editors. Self-Similar Network Traffic and Performance Evaluation. J. Wiley & Sons, Inc., New York, 2000.
[17] Y. Paschalidis and G. Smaragdakis. Spatio-temporal network anomaly detection by assessing deviations of empirical measures. IEEE/ACM Trans. on Networking, 17:685–697, 2009.
[18] G. Samorodnitsky and M. S. Taqqu. Stable Non-Gaussian Processes: Stochastic Models with Infinite Variance. Chapman and Hall, New York, London, 1994.
[19] H. Singhal and G. Michailidis. Identifiability of flow distributions from link measurements with applications to computer networks. Inverse Problems, 23:1821–1850, 2007.
[20] S. Stoev, M. Taqqu, C. Park, G. Michailidis, and J. S. Marron. LASS: a tool for the local analysis of self-similarity. Computational Statistics and Data Analysis, 50:2447–2471, 2006.
[21] S. Stoev, M. S. Taqqu, C. Park, and J. S. Marron. On the wavelet spectrum diagnostic for Hurst parameter estimation in the analysis of Internet traffic. Computer Networks, 48:423–445, 2005.
[22] S. Stoev and M. S. Taqqu. Asymptotic self-similarity and wavelet estimation for long-range dependent fractional autoregressive integrated moving average time series with stable innovations. J. Time Ser. Anal., 26(2):211–249, 2005.
[23] S. Stoev, G. Michailidis, and J. Vaughan. On Global Modeling of Network Traffic. http://www.stat.lsa.umich.edu/~sstoev/global_tr.pdf
[24] M. S. Taqqu, W. Willinger, and R. Sherman. Proof of a fundamental result in self-similar traffic modeling. Computer Communications Review, 27(2):5–23, 1997.
[25] J. Vaughan, S. Stoev, and G. Michailidis. Network–wide statistical modeling and prediction of computer traffic. Working paper, 2009.