Practical Issues of Statistical Path Monitoring in Overlay Networks with Large, Rank-Deficient Routing Matrices
Sameer Qazi
Tim Moors
University of New South Wales

Abstract-Overlay networks can be used to find working paths when direct underlay paths are anomalously slow, e.g. because of a network fault. Overlay paths should not use links that are involved in a fault, so choosing which overlay path to use often requires path monitoring, which introduces an overhead. By using a routing matrix ‘M’ to define which links are used in each path, and sorting the matrix according to the degree of independence of paths, we can choose a subset of paths to monitor, and so reduce overheads. The performance metrics of the unmonitored paths are then predicted from information inferred from the monitored paths. Previous work has shown how statistical prediction errors can occur, even in small networks (11 nodes, 110 paths), when monitoring fewer paths than the rank of the matrix. This paper extends that work by showing that, in larger routing matrices (networks with tens of nodes and 1,000-2,000 paths), collinear relationships between path variables are a large source of errors. We show that mitigation of such errors leads to improved path metric prediction and anomaly detection.

I. INTRODUCTION
End-to-end paths in the Internet sometimes fail to deliver the Quality of Service (QoS) required by some applications. For many user-perceived performance failures/faults there exists an alternate path which can be used to prevent or “mask” the fault from the end user by using quick switch-over mechanisms. Recent work [1] shows that when the direct path between two Internet hosts fails, an alternate path between them can be established through an overlay host whose direct paths to the source and destination hosts have not failed, owing to the spatial diversity of paths (Figure 1).

Figure 1. Establishing alternate paths via an overlay host when the path between two Internet hosts fails. (Labels: direct path fails; indirect path through an overlay host, i.e. a one-hop overlay path.)
The Internet seems to work most of the time, but recovery from failures is sometimes painfully slow, even though a redundant path is often available: one study [2] shows that for almost 80% of the paths used in the Internet there is an alternate route with a lower probability of packet loss. The Internet finds alternate paths using the Border Gateway Protocol version 4 (BGP-4), in which each sub-network learns about global reachability to different hosts in the network by exchanging route advertisements with only its immediate neighbors (sub-networks). Despite being highly scalable, BGP has three fundamental shortcomings in the way it explores alternate paths. (1) On detection of a failure on the primary route, path exploration proceeds by ‘trial and error’, investigating each alternate path in turn. (2) The rate at which new routes are learned is artificially dampened to avoid flooding the network with frequent path update messages that cause routing table changes, a phenomenon known as “route flapping”. (3) BGP only addresses reachability and does not adequately address more subtle performance metrics such as latency, loss rate and throughput on paths.

Several protocol extensions [3-5] have been proposed for BGP-4 to overcome the delayed routing convergence. [3] proposes the use of expedited route-withdrawal messages to rid the network of stale routing information, and [4] proposes using explicit cause-of-failure tags to simplify path exploration after failure. Xu and Rexford [5] propose a hybrid scheme where parties interested in improving the QoS of their traffic can broker their path-metric demands with BGP speakers, a method referred to as pull-based route retrieval. However, all of these proposals, and others, require community-wide deployment to deliver their maximum benefit. In contrast, overlay networks have no such requirement: they were developed as a systemic approach to counter these major shortcomings of BGP while working outside its framework.
An overlay is basically a group of participating peers who agree to route traffic amongst themselves on behalf of other participants (peers) to bypass faults observed in the underlay path. However, the ability of an overlay to find good detours in the Internet depends primarily on knowledge of the underlay topology. This is because an overlay link is a logical abstraction of several physical links. Two overlay links may seem disjoint at the application layer yet share a link in the underlying IP layer, and a failure of the shared IP link renders both useless. For example, consider Figure 2(a), and assume that each link has unit weight and that shortest path routing is used. If link ‘l’ fails, disconnecting source S from destination D, then overlay peers R1 and R2 are both useless for S to reach D using a single overlay hop, because S needs l to reach R2 and R1 needs l to reach D. In this case S can only reach D through R3 or through the two-hop overlay route S->R1->R3->D. Overlay nodes must therefore constantly monitor individual overlay peers and overlay links to detour traffic successfully via an overlay node when the underlay network fails.

To establish such alternate paths quickly in overlay networks it is vital to monitor all possible indirect paths through probing. However, when the overlay network is large, probing generates excessive overhead [1]. Maintaining complete state about all overlay links requires, in the ideal case, that all N peers be connected as a logical mesh or clique (Figure 2(b)). Probing to measure end-to-end path metrics, and disseminating the results via a link state protocol, then incurs maintenance overheads of O(N^2). This causes scalability issues that limit the size of deployed overlay networks. Moreover, maintaining complete overlay state without knowledge of the topological diversity of individual relay nodes may be counterproductive, since the locations of path and performance failures are not known a priori, are often correlated, and vary on very small time scales. Current systems, e.g. RON [1], aim to bypass path failures using application-specific metrics (e.g. throughput, loss rate, latency) and by routing through all possible indirect overlay nodes, which are probed aggressively and so incur large overheads. Such path exploration techniques do not scale beyond modest network sizes. Recent works [6, 7] show that, because paths share physical links to a large degree, one can monitor only a carefully selected subset of the paths and derive path metrics on the remaining paths using statistical prediction.

II. RELATED WORK
Andersen et al. designed RON [1] to be a resilient routing tool for the Internet by implementing a small link state overlay (50 nodes). The overlay tries to find the best alternate path to the destination, which may be the default Internet path or an alternate overlay path. The design posed scalability problems with more than 50 nodes because of the extreme bandwidth required to actively probe all virtual overlay links and to disseminate the learned parameters over the logical mesh architecture at intervals of the order of a few seconds. [8] shows that in most cases alternate paths can be found using at most one overlay hop, but does not address the path selection problem. Several other studies, including [9], have addressed the scalability issue in unstructured overlays by arguing in favor of a hybrid architecture that combines overlay routing with multi-homing techniques. However, multi-homing is only effective in improving path diversity near the edge of the network and thus has limited benefits for failures other than
last hop failures. Multi-path routing on overlay networks is also proposed in [10]. Topology-aware approaches have been extensively studied to counter the scalability issue. [11] discusses a proposal for ‘pruning’ the overlay topology by removing from the path monitoring exercise redundant physical links that are unlikely to be selected by the overlay routing algorithm. Several other studies present interesting heuristics for finding disjoint overlay paths without explicit knowledge of the underlay topology or aggressive path monitoring. Akamai-driven one-hop source routing [12] is a detour selection process in which a host selects, as its detour, an overlay node in physical proximity to one of the ‘preferred’ Akamai servers (mirrors) for serving content. [13] presents a scheme where a source uses routing tags, each of which specifies the path a packet takes through the network from its present location to the destination by selecting one of the possible routing options. [14, 15] find disjoint alternate paths by making use of traceroute information.

Figure 2. (a) (left) How overlay resilience depends on the topology of the underlay network (nodes S, D, R1, R2, R3 and link l). (b) Inferring maximum information about all virtual overlay links.
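The dependence of overlay resilience on the underlay, as described for Figure 2(a), can be sketched in a few lines of Python. The topology below is hypothetical: the routers A, B, X, Y and the exact link layout are assumptions chosen only so that the link l = (A, B) is shared in the way the text describes, and they are not taken from the paper's figure. The sketch routes each overlay segment over a unit-weight shortest path and reports whether a one-hop or two-hop overlay route avoids the failed link.

```python
from collections import deque

# Hypothetical underlay (NOT the paper's Figure 2(a) topology): A, B, X, Y are
# plain routers; S, D, R1, R2, R3 are overlay nodes. Every link has unit weight.
EDGES = [("S", "A"), ("A", "B"), ("B", "D"), ("A", "R1"), ("B", "R2"),
         ("S", "X"), ("X", "R3"), ("R3", "Y"), ("Y", "D")]
FAILED = frozenset(("A", "B"))  # plays the role of the shared link 'l'


def neighbours(edges):
    """Build an undirected adjacency list from a list of (u, v) links."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def shortest_path_links(adj, src, dst):
    """Return the set of links on a BFS (unit-weight) shortest path src -> dst."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    links = set()
    node = dst
    while parent[node] is not None:  # walk back from dst to src
        links.add(frozenset((node, parent[node])))
        node = parent[node]
    return links


def overlay_path_survives(adj, hops, failed):
    """True if no underlay segment of the overlay route crosses the failed link."""
    return all(failed not in shortest_path_links(adj, a, b)
               for a, b in zip(hops, hops[1:]))


ADJ = neighbours(EDGES)
for hops in (["S", "D"], ["S", "R1", "D"], ["S", "R2", "D"],
             ["S", "R3", "D"], ["S", "R1", "R3", "D"]):
    print("->".join(hops), overlay_path_survives(ADJ, hops, FAILED))
```

On this assumed topology the direct path and the detours via R1 and R2 all cross the failed link, while S->R3->D and S->R1->R3->D do not, which mirrors the behaviour described for Figure 2(a).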
III. ALGEBRAIC PROBLEM FORMULATION
We begin by establishing some relevant notation and definitions. Let $G=(\nu,\varepsilon)$ be a strongly connected directed graph, where the nodes in $\nu$ represent network devices (routers and end-hosts) and the edges in $\varepsilon$ represent links between those devices. Additionally, let $\rho$ be the set of all paths between end-hosts in the network (pre-determined by commercial Internet routing policies), and let $n_v = |\nu|$, $n_e = |\varepsilon|$, $n_p = |\rho|$ denote, respectively, the number of devices, links, and paths. If we use the vector $b \in \mathbb{R}^{n_e}$ to denote measurements of a metric on each edge $j \in \varepsilon$ of the graph, then the vector $y \in \mathbb{R}^{n_p}$ of path measurements is given by $y = Mb$ [7], where $M \in \{0,1\}^{n_p \times n_e}$ is a routing matrix in which:

$$
M_{i,j} =
\begin{cases}
1 & \text{if path } i \text{ traverses link } j \\
0 & \text{otherwise}
\end{cases}
\tag{1}
$$
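As a minimal sketch of the relation y = Mb, the following Python snippet builds a small routing matrix and computes path metrics from link metrics. The three-path, three-link matrix and the per-link delay values are illustrative assumptions, not measurements from the paper, and the helper name `path_metrics` is hypothetical. The sketch assumes an additive metric such as delay.

```python
# Illustrative routing matrix (an assumption for this sketch):
# rows are paths, columns are links l1, l2, l3, entries follow equation (1).
M = [
    [1, 0, 1],  # path 1 traverses links l1 and l3
    [1, 1, 0],  # path 2 traverses links l1 and l2
    [0, 1, 1],  # path 3 traverses links l2 and l3
]

# Hypothetical per-link delays in milliseconds (the vector b).
b = [10.0, 20.0, 5.0]


def path_metrics(routing_matrix, link_metrics):
    """Compute y = M b: each path metric is the sum of its links' metrics."""
    return [sum(m_ij * b_j for m_ij, b_j in zip(row, link_metrics))
            for row in routing_matrix]


y = path_metrics(M, b)
print(y)  # [15.0, 30.0, 25.0]
```

Each entry of y is the sum of the delays of the links that the corresponding path traverses, so paths that share a link (here paths 1 and 2 share l1) have correlated metrics.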
Figure 3 gives an example of a network and corresponding routing matrix and measurement vectors. The measurements could be of delays, or loss rates.

[Figure 3: example network with links l1, l2, l3, with routing matrix and measurement vectors]

$$
M = \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}, \qquad
Y = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}, \qquad
b = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix}
$$

Figure 4. Eigen Spectra of AMP and RIPE Networks. (Plot of the normalized eigenvalues of M'M against fraction of rank on a log scale, for AMP25, AMP30, AMP40, AMP50, RIPE30 and RIPE40.)

The column (or row) rank of a matrix, such as M, is the number of linearly independent columns (or rows) in that matrix. If one measures r=Rank(M) paths, then the path metrics of the entire network can be determined exactly. Section VI will show that the routing matrices for large Internet overlay networks are ‘rank deficient’, in the sense that their rank is smaller than either dimension of their matrices, i.e. r