A State-Space Approach for Optimal Traffic Monitoring via Network Flow Sampling Michael Kallitsis
Stilian Stoev
George Michailidis
Merit Network, Inc. Ann Arbor, MI
Department of Statistics University of Michigan Ann Arbor, MI
Department of Statistics University of Michigan Ann Arbor, MI
[email protected]
[email protected]
arXiv:1306.5793v1 [cs.SY] 24 Jun 2013
[email protected] ABSTRACT
The latter approach simplifies the monitoring task significantly. The idea is to sample packets from flows of interests at specific router interfaces, henceforth called observation points. For each packet sampled, several header information can be extracted and recorded for further analysis. Each packet from a flow (a flow can be an aggregate flow, i.e., flows originating from a particular subnet or an autonomous system) is sampled independently with a particular sampling probability (also known as sampling rate). Typical sampling rates are between 0.01 (i.e., only 1 out of 100 packets is selected for sampling) and 0.20. Higher sampling rates can also be chosen, but they amount to valuable resource consumption at each router (cache memory, CPU cycles, storage, network bandwidth and power). Thus, judicious choice of the sampling rates greatly affects the efficient operation of the network. Regardless of the measurement technique, network monitoring aims to several objectives: a) identification of the traffic volume for network flows (known as traffic matrix ) [26, 21, 23, 32, 27, 13], i.e., traffic for each origin-destination pair given the link counts and the topology of the network, b) identification of flow characteristics such as end-to-end network delay [4, 20, 6], flow length, flow distribution or other flow statistics [31, 10, 12]. To accomplish these, several interesting problems arise, such as the subset selection problem for choosing the locations of the monitoring stations [4, 6] and the sampling design problem [26]. This paper aims to address the problem of flow estimation through an optimal sampling strategy under resource constraints (see also [11]). We assume that network monitoring is performed via Netflow-alike measurements. Our framework takes into account both temporal correlations of the flows (see also [13]) as well as their spatial correlation [14]. We adapt Bernoulli sampling for our measurements at each observation site (that is, the information of a packet at each network link is recorded according to the sampling rate/probability described above), but alternative sampling techniques can aslo be considered [8, 7, 9]. The toy-example of Figure 1 provides more insights
The robustness and integrity of IP networks require efficient tools for traffic monitoring and analysis, which scale well with traffic volume and network size. We address the problem of optimal large-scale flow monitoring of computer networks under resource constraints. We propose a stochastic optimization framework where traffic measurements are done by exploiting the spatial (across network links) and temporal relationship of traffic flows. Specifically, given the network topology, the state-space characterization of network flows and sampling constraints at each monitoring station, we seek an optimal packet sampling strategy that yields the “best" traffic volume estimation for all flows of the network. The optimal sampling design is the result of a concave minimization problem; then, Kalman filtering is employed to yield a sequence of traffic estimates for each network flow. We evaluate our algorithm using real-world Internet2 data.
1.
INTRODUCTION
Advances in networking technologies and high performance computing have led to an unprecedented growth of a vast array of applications such as cloud computing, social networking, video on demand, cloud storage, and voice over IP, to name a few. At the same time, malicious network activity remains a big concern since network attacks become more sophisticated. Therefore, it is extremely important for network operators to have an accurate global-view of their network for diagnosing anomalous activity [19], for optimal network capacity planning and quality of service considerations [5]. These can be achieved through network monitoring. However, monitoring everywhere and constantly is expensive, energy inefficient and computationally challenging. Thus, one should employ statistical tools for traffic estimation through limited collection of measurements. Network monitoring has traditionally been done with SNMP measurements [5, 23, 32]. SNMP measurements provide link counts which give the aggregate traffic volume at the observation point of interest. Recently, more granularity can be achieved by performing flowlevel measurements using tools such as Cisco’s NetFlow. 1
2.
Internet2 IP Network Topology Internet2 IP router core topology and the backbone links that interconnect them T640
Seattle
Chicago
MX960 T1600
T640
Salt Lake City
Kansas City T640
20G
New York 10G
MX960
Washington DC
NETWORK PA R T N E R S T640
Los Angeles MX960
10 Gbps 20 Gbps
Atlanta
Houston T640
CONNECTORS
on the proposed method; assume we have flows between each network node. Further, assume flow-monitoring tools can sample with rate 20 out of every 100 packets. How should the network operator assign the sampling rates to each flow subject to the sampling capacity of each link? By considering the network topology, one would expect that “long-flows” do not require many samples at every single link they traverse. For example, the flow from ’Houston’ to ’NY’ needs not be sampled at every link on its path. Sampling on link ’Houston’ ’Kansas’ may be sufficient; this will leave the resources of the intermediate link to be utilized for monitoring the “short” flow that traverses only the link ’Kansas’ to ’Chicago’. Similarly, a stochastic characterization of the “evolution” of each flow over time provides valuable information for choosing the “’best” sampling strategy. Section 2 unifies these ideas in a stochastic optimization framework. Similar approaches for flow estimation via state-space models have also appeared in [26, 27]. However, in [26] the proposed method addresses flow estimation on a single network link and the spatial correlation between flows are not examined. In [27], the suggested statespace model considers link counts at every network link without relying on flow sampling (in particular, SNMP counts are considered). In large-scale networks this approach is not practical nor feasible. Further, compared to [27], we study a framework better-tailored to the ongoing measurement process. The contributions of this paper are twofold: a) We present a stochastic optimization framework for finding the optimal sampling design that would yield the best traffic estimates for each flow (section 2). The model views each flow as a stochastic process. The state of the system at a particular time instance is the volume of traffic that each flow carries at that instance. Through sampling, we get a partially observed system; this observation uncertainty is captured through the measurement equations we define next. The goal is to find the “best” sampling strategy that minimizes the estimation error over the (finite) horizon of interest. This is the first attempt to model the flow estimation problem under a stochastic control framework; b) We study an approximation scheme for the solution of the abovementioned stochastic optimization problem (section 3). The problem of obtaining the optimal sampling rates is then reduced to a deterministic optimization problem that can be solved a priori. Based on the calculated sampling rates, traffic estimation for each time-step is then performed via the Kalman filter. As illustrated in Figure 2 the proposed approach poses significant gains over existing techniques. We evaluate our approach using real-world data obtained from Internet2 (section 4).
Figure 1: Backbone network of Internet2. Consider a communication network of N nodes and L links. The total number of traffic flows, i.e. source and destination (S, D) pairs, is denoted by J. We denote the set of flows as J := {1, 2, . . . , J} and the set of all network links with L := {1, 2, . . . , L}. Traffic is routed over the network along predefined paths described by a routing matrix R = (r`,j )L×J , with r`,j = 1, when route j uses link ` and 0 otherwise. Let xt = (xt (1), xt (2), . . . , xt (J))T , t = 1, 2, . . . and yt = (yt (1), yt (2), . . . , yt (L))T , t = 1, 2, . . . be the vector time series1 of traffic traversing all J routes and L links, respectively. We shall ignore network delays and adopt the assumption of instantaneous propagation. This is reasonable when traffic is monitored at a time-scale coarser than the round-trip time of the network, which is the case in our setting. We thus obtain that the link and route level traffic are related through the fundamental routing equation2 yt = Rxt .
(1)
The spatial correlation between flows encoded in the routing matrix R, will play an important role in determining the optimal sampling design. Before discussing the solution of our optimization problem though, we first consider in detail all the components of our stochastic control formulation.
2.1
A State-Space Model
In this paper, we model the evolution over time of the volume of each flow as a stochastic process. In particular, we model the dynamics of each flow j, j = 1, 2, . . . , J, as the following Markov process: 1 Here, time is discrete and traffic loads are measured in bytes or packets per unit time, over a time scale greater than the round-trip time of the network. 2 Note that an in backbone IP networks the routing matrix R does not change often.
PROBLEM DESCRIPTION 2
3ROX CENIC CIC OmniPoP Drexel University GPN Indiana GigaPoP KyRON LEARN LONI MAGPI MAX MCNC Merit Network MREN NOX NYSERNet Oregon Gigapop Pacific Northwest GigaPoP SoX University of Memphis University of New Mexico USF/FLR University of Utah/UEN
Root Mean Squared Error per time slot 2400
2.2.1
2200
As mentioned above, at time t the volume of flow j is denoted by the state variable xt (j). We adopt a Bernoulli sampling scheme [12]. This says that each packet of flow j passing through the observation point ` at time t, is sampled with probability ut (`, j). In other words, the variable ut (`, j) specifies the sampling rate at link ` for flow j at time t. The number of packets captured at observation point ` for flow j is given by the random variable y˜t (`, j). Given xt (j), y˜t (`, j) follows a binomial distribution, i.e.,
2000
Packets/slot
1800 1600 1400 1200 1000 800 600
Optimal sampling. Naive sampling.
Training/Learing ends; Estimation begins.
03:00
06:00
09:00
12:00
15:00
18:00
21:00
Bernoulli flow sampling
23:59
Time (UTC)
y˜t (`, j) ∼ Binomial(xt (j), ut (`, j)).
Based on the observations y˜t (`, j), the unbiased estimator for xt (`, j) – the traffic volume at link ` for flow j – is given by:
Figure 2: Estimation error per time interval: comparison of optimal versus na¨ıve sampling for the 72 Internet2 flows on 2009-03-17.
z˜t (`, j) = xt+1 (j) = ρj xt (j) + wt (j), t = 0, 1, 2, . . .
(2)
(7)
vt (`, j) :=E[(˜ zt (`, j) − E˜ zt (`, j))2 |xt (j)] =E[(˜ zt (`, j) − xt (j))2 |xt (j)] =xt (j)
2.2.2
1 − ut (`, j) . ut (`, j)
(8)
Spatial combination of estimators
We seek a combined estimator for the volume of flow j that uses measurements from several observation points [14]. Such an estimator can be expressed as, X zt (j) = w`,j z˜t (`, j), (9) `∈`(j)
where `(j) ⊆ L is the set of links that flow j traverses. Conditioned on the state xt (j), the observations at different links y˜t (`, j) are independent so one can calculate the variance of the combined estimator to be X 2 Var(zt (j)|xt (j)) = w`,j vt (`, j), (10)
(3)
with xt being the vector representing the “state” of each flow at time t, and F a diagonal matrix of the coefficients ρj . Moreover, we have the following probability density functions for the primitive random variables described above, i.e., 1 p(x0 ) = c1 exp{− [(x0 − x ¯0 )T (Pˆ0|−1 )−1 (x0 − x ¯0 )]} 2 (4) 1 ˆ −1 wt ]}. p(wt ) = c2 exp{− [wtT Q (5) 2 ˆ can be determined The parameters x0 , Pˆ0|−1 , F and Q through a short calibration phase using techniques for fitting autoregressive models (see [3], Chapter 8).
2.2
y˜t (`, j) . ut (`, j)
The variance of the estimator at link ` for flow j equals
where xt (j) represents the state of flow j at time t, namely the numbers of packets (or bytes) carried at time interval t. For the purposes of this paper we assume that each time interval has a duration of 10 minutes. The sequence of random variables wt (j), t = 1, 2, . . . represent random noise. They belong to the set of primitive random variables, meaning that they are mutually independent. They are also independent from the state random variables. We assume noise to be Gaussian with zero mean and variance equal to σ2 . To fully characterize the system evolution for flow j of Eq. (2) we need initial state x0 (j), which is also assumed to be Gaussian. Its mean and variance can be calculated during a calibration phase. To summarize, for t = 0, 1, 2, . . ., the system dynamics are described by xt+1 = F xt + wt ,
(6)
`∈`(j) t (`,j) where vt (`, j) = xt (j) 1−u ut (`,j) (see Eq. (8)) and `(j) is the set of links that flow j is traversing and can be acquired from the routing matrix R. To obtain the best linear unbiased estimator for all j ∈ J , we want to find the optimal weights P w`,j that minimize the above variance subject to `∈`(j) w`,j = 1. Taking the Lagrangian and using the first-order optimality conditions we arrive to
vt (`, j)−1 . −1 k∈`(j) vt (k, j)
w`,j = P
Traffic Measurement 3
(11)
Continuing from (10), 1
Var(zt (j)|xt (j)) = P
1 k∈`(j) vt (k,j)
2.2.3
.
and non-decreasing function [2], one can easily verify that our objective function is concave in ut as a composition of a concave and non-decreasing function with a convex function.
(12)
The Measurement Equation
3.
From Eqs. (6), (7), (9) we see that we have a partially observable system. In other words, the state xt of the system – the traffic volume for each flow at t – is not directly available, but can be inferred through the observations zt . Using the normal approximation to the binomial distribution we get the following relation between the state and observations for flow j, for t = 1, 2, . . . zt (j) = xt (j) + rt (j),
The problem at hand belongs to the category of measurement adaptive problems [30]. In the general case, the problem of optimal measurement control (see also sequential design of experiments [25]) can be formulated as the following discrete-time, finite-horizon, partiallyobservable, perfect-recall stochastic control problem (see [30, 18]). We are given, the system evolution equation, written as
(13)
where rt (j) is a Gaussian random variable, i.e. (see Eq. (12)) xt (j) (14) rt (j) ∼ N 0, P 1 k∈`(j)
OPTIMAL SAMPLING
xt+1 = ft (xt , wt ), t = 0, 1, 2, . . . , T,
(19)
the measurement equation zt = ht (xt , ut , rt ), t = 0, 1, 2, . . . , T,
1 −1 ut (k,j)
(20)
The measurement equation, for all flows becomes
and the probability densities for the random variables x0 , wt and rt
zt = x t + r t ,
p(x0 ), p(wt ), p(rt ), t = 0, 1, 2, . . . , T.
(15)
with the probability density function for rt being 1 ˆ t (xt , ut )−1 rt ]}, p(rt ) = c3 exp{− [rTt R 2
The performance criterion is the expected cost over the horizon of interest
(16)
V = E{
ˆ t (xt , ut ) is a covariance matrix. Using the prowhere R posed measurement scheme, the covariance matrix is just a diagonal matrix with elements the variances shown in Eq. (14).
2.3
1
j=1
1 k∈`(j) vt (k,j)
.
(22)
g := (g1 (Z1 ), . . . , gt (Zt ), . . . , gT (ZT )) that minimizes the expected cost (22) over the horizon of interest subject to “budgetary” sampling constraints. The symbol Zt := (z1 , z2 , . . . , zt ),
(23)
represents the history of observations up to time t. Similarly, the history of sampling rates up to time t will be denoted as ut . As mentioned above, we assume a system with perfect-recall which means that all this information is available. Having calculated an optimal sampling strategy, the optimal sampling action at time instance t would be
where zt := zt (ut ) is the vector of volume estimation for each flow j = 1, 2, . . . J at time t (see Eq. (9) and (11)). The instantaneous cost of (17) can then be written as: P
ct (xt , ut ) + cT (xT , uT )},
where ct (xt , ut ) is the instantaneous cost function and the expectation is taken with respect to the random variables xt . The problem is to find the optimal sampling strategy
Let ut ∈ (0, 1)L×J be the sampling matrix arranged in a vector form; the variable ut (`, j) specifies the sampling rate at link ` for flow j at time t. We define the instantaneous cost to be the estimation error at time t as follows: (17) ct (xt , ut ) := trace E[(zt − xt )(zt − xt )T ] ,
J X
T −1 X t=0
The Instantaneous Cost
ct (xt , ut ) =
(21)
(18)
ut = gt (Zt ).
Proposition 1. The instantaneous cost function shown in (18) is concave in ut . Proof. Using the expanded form of Eq. (18) with 1 vt (k, j) = xt (j)( ut (k,j) − 1) we observe that this resembles the harmonic average of the terms vt (k, j). Using the fact that the harmonic average function is a concave 4
(24)
In other words, given the history of observations, the optimal action would at time t will be given by the precalculated optimal policy. In the general case, applying dynamic programming is hindered by the curse of dimensionality [24]. Therefore, some sort of approximation techniques need to
be involved [24, 1]. Indeed, under the following conditions, the stochastic control problem can be solved efficiently [22] by exploiting the two-way separation between estimation and control [30, 18]. The conditions3 are: a) The system evolution equation (see Eq. (19)) is linear; b) The measurement equation (see Eq. (20)) is linear in the state and measurement noise; c) The primitive random variables are Gaussian; and d)The instantaneous cost (in our case given by Eq. (17)) is independent of the state xt . In the special case we have a state equation of the following form xt+1 = Ft xt + wt ,
The computation of the optimal sampling rates can be determined a priori by calculating the solution of the following nonlinear, deterministic control problem: V ∗ = min ut
(25)
u∗t ∈ arg min ct (ˆ xt|t , ut )
(35)
s.t. But ≤ d,
(26)
where ut is the vector of sampling rates, But ≤ d represent linear “budgetary” constraints per link, and B is a matrix of appropriate dimensions (deduced from the routing matrix R). The concavity of our objective function, along with the linearity of our constraints lead us to a minimization of a concave function. This is an NP-complete problem, known as global concave minimization. The solution of the concave program always lies on the vertices of the convex hull defined by the convex polyhedron of our linear “budgetary” inequalities shown in Eq. (35). The proof can be found in [15]. The above proposition suggests that one way to solve our concave program – but certainly not the most efficient one – is to enumerate all the vertices of the induced convex hull, and pick the one that yields the lowest error. More sophisticated methods for solving concave programs can be found in [15, 29]. The complete traffic estimation algorithm is presented next.
1 ¯0 )T (Pˆ0|−1 )−1 (x0 − x ¯0 )]} p(x0 ) = c1 exp{− [(x0 − x 2 (27) 1 ˆ −1 p(wt ) = c2 exp{− [wtT Q (28) t wt ]}. 2 1 ˆ t (ut )−1 rt ]}, p(rt ) = c3 exp{− [rTt R (29) 2 ˆ t (ut ) gives the relationship between measurewhere R ment noise and sampling rate. The performance criterion is T T X X V = E{ ct (ut )} = ct (ut ) (30) t=0
subject to constraints on ut . Given the above conditions and the sampling rates ut , traffic volume estimation can be performed with ˆ t|t , the optimal estimate conthe Kalman filter [17]. x ditioned on Zt , is given by
Algorithm 1
ˆ t [zt − Ht Ft−1 x ˆ t|t = Ft−1 x ˆ t−1|t−1 + K ˆ t−1|t−1 ], (31) x
(Optimal Sampling).
1. For t = 1, . . . , t0 collect traffic data to calibrate the ˆ model; i.e., find x0 , Pˆ0|−1 , F and Q.
ˆ t , the Kalman gain, is where K ˆ t = Pˆt|t−1 HtT (Ht Pˆt|t−1 HtT + R ˆ t )−1 K
(34)
ut (`,j)
where Ht (ut ) relates the measurement matrix with the measurement control. The probability density functions of the primitive random variables are
t=0
ct (ˆ xt|t , ut ),
t=0
ˆ t|t subject to the “budgetary” sampling constraints. x is the “best” state estimator available at the time the optimization problem is solved. The optimization problem (34) can be decomposed into a sequence of problems. For time t, and given the ˆ t|t we have: state estimation x
a measurement equation of this form zt = Ht (ut )xt + rt ,
T X
(32)
ˆ t|t = x0 , and solve (34). We have 2. For t = t0 , set x now obtained ut , for t = t0 , . . . , T .
and Pˆt|t , the conditional covariance of the error in the estimate of xt given Zt can be calculated recursively for t = 0, 1, . . . , T by
3. Using the optimal sampling rates of Step 2) and Eqs. (32) and (33) calculate the Kalman gain.
ˆ t )−1 Ht Pˆt|t−1 Pˆt|t = Pˆt|t−1 − Pˆt|t−1 HtT (Ht Pˆt|t−1 HtT + R T ˆ t−1 + Ft−1 Pˆt−1|t−1 Ft−1 Pˆt|t−1 = Q . (33)
4. Using Eq. (9) get the combined observation for each flow j.
3
The models presented in [22, 30, 18] cover more general cases than the one presented here. Specifically, in the general model the state of the system needs to be controlled as well, and a quadratic cost is associated with the system state. Further, a quadratic cost in the decision variables may also exist.
5. With the observations acquired from (9), use the Kalman filter (31) to obtain the traffic volume estimation for time t, given the past of observations. 6. Set t = t+1 and go to Step 3). Repeat until t = T . 5
Estimate for the largest volume flow
4
4
x 10
3.5
Estimate for the second largest flow
4
2.2 optimal est. Naive sampling True traffic
x 10
Estimate for the third largest flow
4
2.5
x 10
Optimal sampling Naive sampling True traffic
2
optimal est. Naive sampling True traffic
1.8
2
3
2
packets/slot
packets/slot
2.5
1.4 1.2
1.5
1 1.5
1
0.8 1
0.6
03:00
06:00
09:00
12:00 15:00 Time (UTC)
18:00
21:00
0.4
23:59
03:00
(a) Largest flow.
06:00
09:00
12:00 15:00 Time (UTC)
18:00
21:00
23:59
0.5
03:00
(b) Second largest flow.
06:00
09:00
12:00 15:00 Time (UTC)
18:00
21:00
23:59
(c) Third largest flow.
Figure 3: Traffic estimation for different flows. Estimation of a short (1 hop) and small volume flow
Estimate for a long (4 hops) and small volume flow 3000
3000
2500
Optimal sampling Naive sampling True traffic
2500
1500
1500
1000
1000
500
500
0 0
03:00
06:00
09:00
12:00 15:00 Time (UTC)
18:00
21:00
03:00
06:00
09:00
23:59
12:00 15:00 Time (UTC)
18:00
21:00
23:59
Figure 5: Short flow (1 hop), low volume.
Figure 4: Long flow (4 hops), low volume.
4.
Optimal sampling Naive sampling True traffic
2000
2000 packets/slot
0.5
packets/slot
packets/slot
1.6
Figures 3(a) – 3(c) present the cases for the three largest flows in terms of average traffic volume size, namely flows 19, 66 and 30. Moreover, Figure 4 illustrates the estimation outcome for flow 14, which is a “long” flow traversing 4 links, but with relatively low traffic volume. Similarly, Figure 5, depicts the results for flow 68, a “short” flow with low traffic volume. Clearly, the proposed approached is advantageous over the simplistic sampling scheme. The results indicate the performance gains of our sampling scheme, being a result of considering both temporal and spatial correlation between flows. A necessary requirement, though, is the stationarity of traffic volumes. This does not always hold for Internet traffic. Ongoing work includes investigation of “richer” stochastic models, something that would allow sampling designs with even stricter sampling constraints (e.g., 1 : 100 or even 1 : 1000). Furthermore, one can additionally re-calibrate the model and “learn” its new parameters by increasing the frequency of the training periods (step 1 of Algorithm 1).
PERFORMANCE EVALUATION
We use a real-world network, namely Internet2, to evaluate our algorithm. We juxtapose our method against a na¨ıve sampling scheme (i.e., sampling rates not chosen optimally; Kalman filtering is still used though). Internet2 involves L = 26 links, N = 9 nodes and J = 72 routes (see [28, 16]). In particular, we employ a dataset for traffic captured on March 17, 2009. The dataset includes the traffic volume of the 72 flows, and the routing matrix R (see Eq. (1)) which gives the path that each flow traverses in the network4 . In all examples that follow, a training window of 500 time slots was applied to calibrate our model (see Step 1 of Algorithm 1). In the na¨ıve sampling scheme we evenly split the available sampling capacity among the competing flows of a link. We assume that the sampling capacity for each of the 26 network links is 0.20. Figure 2 shows the empirical root mean squared error (RMSE) for the whole network on the day of interest. RMSE is defined as, qP RM SE(t) = xj (t|t) − xj (t))2 /|J|. The RMSE j∈J (ˆ time average for the optimal sampling scheme is 1548 packets per time slot, and 1773 packets per time slot for the na¨ıve one. This corresponds to a 13% error reduction. We also examine traffic estimation for individual flows.
5.
REFERENCES
[1] D. P. Bertsekas and J. N. Tsitsiklis. Neuro Dynamic Programming. Athena Scientific, 1996. [2] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004. [3] P. J. Brockwell and R. A. Davis. Time series: theory and methods. Springer-Verlag New York,
4
All datasets used can be provided by the authors upon request.
6
Inc., New York, NY, USA, 1986. [4] D. Chua, E. Kolaczyk, and M. Crovella. Network kriging. Selected Areas in Communications, IEEE Journal on, 24(12):2263 –2272, dec. 2006. [5] K. C. Claffy, G. C. Polyzos, and H.-W. Braun. Application of sampling methodologies to network traffic characterization. SIGCOMM Comput. Commun. Rev., 23(4):194–203, Oct. 1993. [6] M. Coates, Y. Pointurier, and M. Rabbat. Compressed network monitoring for IP and all-optical networks. In Proc. 7th ACM SIGCOMM, pages 241–252. ACM, 2007. [7] E. Cohen, G. Cormode, and N. Duffield. Don’t let the negatives bring you down: sampling from streams of signed updates. In SIGMETRICS ’12, pages 343–354, New York, NY, USA, 2012. ACM. [8] E. Cohen, N. Duffield, H. Kaplan, C. Lund, and M. Thorup. Algorithms and estimators for accurate summarization of internet traffic. In IMC 2007, pages 265–278, NY, 2007. ACM. [9] N. Duffield. Fair sampling across network flow measurements. SIGMETRICS Perform. Eval. Rev., 40(1):367–378, June 2012. [10] N. Duffield, C. Lund, and M. Thorup. Properties and prediction of flow statistics from sampled packet streams. In In Proc. ACM SIGCOMM Internet Measurement, pages 159–171, 2002. [11] N. Duffield, C. Lund, and M. Thorup. Flow sampling under hard resource constraints. In SIGMETRICS 2004, pages 85–96, NY, 2004. [12] N. Duffield, C. Lund, and M. Thorup. Estimating flow distributions from sampled flow statistics. IEEE/ACM Trans. Net., 13(5):933–946, Oct. ’05. [13] N. Duffield, C. Lund, and M. Thorup. Learn more, sample less: control of volume and variance in network measurement. Information Theory, IEEE Trans, on, 51(5):1756 – 1775, may 2005. [14] N. Duffield, C. Lund, and M. Thorup. Optimal combination of sampled network measurements. In IMC 2005, pages 8–8, Berkeley, CA, USA, 2005. USENIX Association. [15] R. Horst, P. M. Pardalos, and N. V. Thoai. Introduction to Global Optimization. Kluwer Academic Publishers, The Netherlands, 1995. [16] Internet2. http://www.internet2.edu/observatory/. [17] R. E. Kalman. A New Approach to Linear Filtering and Prediction Problems. Transactions of the ASME, Journal of Basic Engineering, (82 (Series D)):35–45, 1960. [18] P. R. Kumar and P. Varaiya. Stochastic systems: estimation, identification and adaptive control. Prentice-Hall, Inc., NJ, USA, 1986. [19] A. Lakhina, M. Crovella, and C. Diot. Diagnosing network-wide traffic anomalies. SIGCOMM
Comp. Comm. Rev., 34:219–230, Aug. 2004. [20] M. Lee, N. Duffield, and R. Kompella. Opportunistic flow-level latency estimation using consistent netflow. Networking, IEEE/ACM Transactions on, 20(1):139 –152, feb. 2012. [21] A. Medina, N. Taft, K. Salamatian, S. Bhattacharyya, and C. Diot. Traffic matrix estimation: existing techniques and new directions. In SIGCOMM ’02, pages 161–174, NY, 2002. [22] I. Meier, L., J. Peschon, and R. Dressler. Optimal control of measurement subsystems. Automatic Control, IEEE Trans. on, 12(5):528 –536, Oct ’67. [23] K. Papagiannaki, N. Taft, and A. Lakhina. A distributed approach to measure ip traffic matrices. In IMC 2004, pages 161–174, NY, 2004. [24] W. B. Powell. Approximate Dynamic Programming. John Wiley and Sons, Inc., 2007. [25] H. Robbins. Some Aspects of the Sequential Design of Experiments. Bulletin of the American Mathematical Society, 58(5):527–535, Sept. 1952. [26] H. Singhal and G. Michailidis. Optimal sampling in state space models with applications to network monitoring. In Proceedings of the 2008 ACM SIGMETRICS, pages 145–156, 2008. [27] A. Soule, A. Lakhina, N. Taft, K. Papagiannaki, K. Salamatian, A. Nucci, M. Crovella, and C. Diot. Traffic matrices: balancing measurements, inference and modeling. In 2005 ACM SIGMETRICS, pages 362–373, New York, NY, USA, 2005. ACM. [28] S. Stoev, G. Michailidis, and J. Vaughan. On global modeling of network traffic. Technical report, University of Michigan, 2010. http://www.stat.lsa.umich.edu/∼sstoev/global tr.pdf. [29] N. V. Thoai and H. Tuy. Convergent algorithms for minimizing a concave function. Mathematics of Operations Research, 5(4):pp. 556–566, 1980. [30] H. Witsenhausen. Separation of estimation and control for discrete time systems. Proceedings of the IEEE, 59(11):1557 – 1566, nov. 1971. [31] L. Yang and G. Michailidis. Sampled based estimation of network traffic flow characteristics. In INFOCOM ’07., pages 1775 –1783, May 2007. [32] Y. Zhang, M. Roughan, C. Lund, and D. Donoho. An information-theoretic approach to traffic matrix estimation. In SIGCOMM 2003, pages 301–312, New York, NY, USA, 2003. ACM.
7