
Real-Time Security Exercises on a Realistic Interdomain Routing Experiment Platform ∗

Yue Li1, Michael Liljenstam2, and Jason Liu1

1 School of Computing and Information Sciences, Florida International University, {yueli,liux}@cis.fiu.edu
2 Ericsson Research, Isafjordsgatan 14E, 164 80 Stockholm, Sweden, [email protected]

Abstract

We use a realistic interdomain routing experiment platform to conduct real-time attack and defense exercises for training purposes. Our interdomain routing experiment platform integrates open-source router software, real-time network simulation, and light-weight machine virtualization technologies, and is capable of supporting realistic large-scale routing experiments. The network model used consists of major autonomous systems connecting Swedish Internet users with realistic routing configurations derived from the routing registry. We conduct a series of real-time security exercises on this routing system to study the consequence of intentionally propagating false routing information on interdomain routing and the effectiveness of corresponding defensive measures. We describe three kinds of simplistic BGP attacks in the context of security exercises designed specifically for training purposes. While an attacker can launch attacks from a compromised router by changing its routing policies, administrators will be able to observe the adverse effect of these attacks and subsequently apply appropriate defensive measures to mitigate their impact, such as installing filtering rules. These exercises, all carried out in real time, demonstrate the feasibility of large-scale realistic routing experiments using the real-time routing experiment platform.

∗ This research is supported in part by National Science Foundation grants CNS-0836408 and HRD-0833093.

1 Introduction

Interdomain routing plays an important role in providing the crucial connectivity for the Internet. As the dominant interdomain routing protocol, the Border Gateway Protocol (BGP) has been deployed on the Internet for more than a decade. However, BGP historically lacks sufficient security features and has well-known vulnerabilities [6, 8]. Further, BGP operates based on routing policies that control whether a route can be chosen or advertised to the peers. The policy configuration in a real interdomain routing system can be fairly complex, which may lead to routing configuration errors resulting in large-scale catastrophic failures in network connectivity. A well-known incident in 1997 was caused by a misconfigured router announcing Autonomous System (AS) 7007 as the origin of the best path to most parts of the Internet, which disrupted Internet connectivity for several hours [22]. A more recent case involved Pakistan Telecom inadvertently hijacking prefixes belonging to YouTube, resulting in large-scale loss of connectivity [28]. Although we have not seen any public report of malicious attacks against the interdomain routing system besides anecdotal evidence (e.g., see [33] for the media coverage of a BGP hijacking demonstration in August 2008), an attacker who manages to compromise one or more BGP routers can clearly exert a similar impact on the routing infrastructure.

Simulation is a common technique that has been widely used to study security issues of routing systems (e.g., [14, 16, 21]). Many network simulators also come with the capability of supporting very large routing experiments, examples of which include SSFNet [12], GTNetS [9], and ROSSNet [3]. There are also simulators tailored specially for interdomain routing, such as C-BGP [26]. Although simulation is capable of rendering algorithmic routing operations and network dynamics in great detail, simulation fidelity is always in question. Modeling decisions must be made to facilitate large-scale routing experiments. For example, C-BGP only simulates the BGP decision process based on router configurations and network topology, but ignores timers and packet exchange between peers.
These tradeoffs are not necessarily for the benefit of providing a faithful representation of the routing protocols. As such, most of the existing simulators do not provide the operational realism of real routers. Simulation model development is also labor-intensive and error-prone, and therefore requires extensive effort and training in modeling and simulation. As a result, we have not observed active involvement from the network research community in developing and maintaining routing protocol simulators.

There have been recent efforts in conducting routing experiments on emulation testbeds. For example, VINI is a software facility that allows researchers to conduct elaborate routing experiments on PlanetLab using virtualization techniques [4]. Progress has been made to support network virtualization on shared commodity hardware [5]. Although these emulation testbeds allow realistic network experiments, emulated network conditions are inherently limited by the available resources and physical setup (such as the nodal processing speed, link bandwidths and latencies). Emulation-based solutions also have difficulties in representing and controlling arbitrary network conditions (such as the constitution and utilization of background traffic). These factors, to a certain degree, limit the flexibility, scalability, and thereby the utility of the emulation testbeds.

We take a different approach that combines simulation and emulation. We have developed a real-time routing experiment platform for realistic, scalable and flexible routing experiments (see our preliminary work in [15]). The platform can seamlessly incorporate open-source router software (in particular, XORP [13]) using real-time network simulation and machine virtualization technologies. XORP software routers are run inside separate virtual machine containers. To scale up the routing experiments, we adopt a light-weight virtualization technology to multiplex hundreds of XORP instances on each physical machine.
These virtual machines are connected by a real-time network simulator. That is, traffic generated by these virtual machines is captured and diverted to the network simulator, which calculates the delays and losses of the packets as they traverse the virtual network. The forwarding tables maintained at the routers on the virtual network are synchronized with the corresponding XORP instances run inside virtual machine containers, so that the packet forwarding functionality can be completely offloaded to the underlying simulation engine to reduce the cost of I/O operations. In addition, sophisticated traffic models (such as the fluid traffic model [18]) can run in the network simulator and be combined with the emulated traffic, which together determine the traffic conditions of the virtual network.

While XORP provides an extensible routing protocol development and experimentation solution, our platform provides the necessary framework for conducting tests under various controlled network conditions. XORP is widely used open-source routing software, developed and maintained continuously by network researchers. Therefore, we bypassed the time-consuming and error-prone protocol simulation model development stage. Further, since network simulation can be made truly scalable using parallel and distributed simulation technologies, this platform can easily accommodate large-scale network models with thousands of emulated router instances.

One obvious advantage of this real-time routing platform is that researchers are able to "play" with real routing protocol implementations in real time. To demonstrate this capability, we conducted a series of interdomain routing security exercises, where attackers can launch attacks against specific networks through a compromised router, and defenders use common network tools to detect and analyze such attacks, and subsequently install countermeasures in order to block them or to mitigate their impact on the networks.

In order to obtain meaningful results from the routing experiments, we constructed a network model to reflect a real interdomain routing system consisting of major autonomous systems connecting Swedish Internet users. A detailed AS-level model has been built using data primarily from public sources, with realistic routing configurations derived from the routing registry [16]. Although similar models have been developed, they are mostly for US-centric interdomain systems (e.g., Cyclops [7]). Further, we developed an automated process for configuring and deploying routing experiments with the network model. The network exercise thus mirrors the real routing system to a large extent in AS connectivity and policy configuration. We focus on three specific kinds of BGP attacks in this paper (denial of service, deaggregation, and traffic engineering subversion), which are common choices among potential attackers.
These security exercises show that the real-time routing experiment platform can support realistic network security analysis and training, and provide valuable insight into the security issues of the interdomain routing system.

The major contributions of this paper are summarized as follows:

1. We extend our previous work [15] in supporting realistic and relatively sophisticated interdomain routing exercises, especially for training purposes.
2. We develop an automated process for deploying and configuring large-scale network experiments using realistic network models obtained from network measurement data.
3. We use the platform to study the impact of three simplistic yet common interdomain routing attacks on a realistic model of Swedish autonomous systems.

The remainder of this paper is organized as follows. Section 2 describes the experiment platform. In Section 3 we present the construction of the Swedish network model. The process for automatically deploying and configuring network experiments is described in Section 4, followed by Section 5, where we discuss the security exercises we conducted on the realistic routing experiment platform, together with some quantitative evaluation of their impact on the network. Finally, in Section 6, we conclude the paper and discuss future work.

2 Experiment Platform

2.1 The Components

The real-time routing experiment platform consists of three components:

The XORP Router. XORP (eXtensible Open Router Platform) is an open-source routing platform supported by a large developer community. XORP includes both IPv4 and IPv6 routing protocol implementations, covering the most common protocols, such as BGP, OSPF, and RIP. XORP features an extensible multi-process architecture design, using one process for each routing protocol, in addition to the administrative processes designated for management, configuration, and coordination [13]. Inter-process communication is supported through a unified mechanism, called XORP Resource Locator (XRL). For example, the routing processes (BGP, OSPF, etc.) must use XRL to communicate with the process that manages the routing information base (RIB), which receives and consolidates routes computed by all running routing processes. XORP designates a separate process to manage packet forwarding via a uniform interface called Forwarding Engine Abstraction (FEA), which turns out to be extremely important for our purposes (as described in the next section). FEA provides a platform-independent solution for managing the network resources required to support packet forwarding on the router platform. These network resources include network interfaces, the forwarding table, and network sockets (including TCP sockets, UDP sockets, as well as raw sockets). The network sockets are used particularly by the routing protocols to communicate routing information with their peers.

The OpenVZ Virtual Machine. We run XORP on light-weight virtual machines in order to support large-scale routing experiments. Virtualization provides the resource isolation necessary for running multiple router instances on the same physical machine simultaneously. We choose OpenVZ, which is an open-source OS-level virtualization solution for Linux [24].
OS-level virtualization systems are designed to provide multiple instances of the same OS (also called containers, domains, or simply virtual machines) on the same physical machine, and have been shown to provide better performance and scalability than other virtualization approaches. We found that, in addition to providing full-scale virtualization of the network resources that are important to run XORP, OpenVZ can support far more domains than other popular virtual machine solutions (e.g., Xen [2] and VMware [31]).

The PRIME Simulator. PRIME (Parallel Real-time Immersive network Modeling Environment) is a parallel simulation engine that supports real-time network simulation. With real-time simulation, unmodified implementations of real applications can be executed together with the network simulator that operates in real time [19]. Traffic between the real applications "flows" on the virtual network with calculated packet delays and losses in accordance with the topology and congestion level of the simulated network. To guarantee real-time performance, the network simulator must process events at a rate no slower than wall-clock time. PRIME adopts parallel discrete-event simulation and hybrid traffic modeling techniques to accelerate simulation execution. PRIME also employs a priority-based scheduling algorithm that allows processing real-time events in a timely manner.

To communicate with the real applications running on client machines, PRIME provides a real-time communication interface, which has been built on OpenVPN [20]. Each machine that participates in the network emulation (which we call a client machine) establishes a separate VPN connection to a designated VPN server, which has been customized to function as a gateway between the client machines and the real-time simulator. Traffic generated by the real applications on the client machines is forwarded to the network simulator and then carried on the virtual network; conversely, traffic targeting the real applications is exported by the simulator and sent via the VPN connection to the corresponding client machines. This emulation infrastructure is transparent to the applications running on the client machines.
Note that the simulation gateway can consist of multiple VPN servers to alleviate and balance the traffic load between the client machines and the real-time simulator.
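
The real-time requirement stated above for PRIME (events must be processed no slower than wall-clock time) can be illustrated with a minimal sketch. This is not PRIME's scheduler: the function name `run_real_time`, the injectable `clock` and `sleep` callables, and the lateness bookkeeping are assumptions made purely for illustration.

```python
import heapq
import time


def run_real_time(events, clock=time.monotonic, sleep=time.sleep):
    """Process (virtual_time_seconds, name) events paced against a wall clock.

    Returns a list of (name, lateness_seconds). A lateness above zero means
    the event was handled after its real-time deadline.
    """
    queue = list(events)
    heapq.heapify(queue)  # earliest virtual time first
    start = clock()
    report = []
    while queue:
        vtime, name = heapq.heappop(queue)
        wait = vtime - (clock() - start)
        if wait > 0:
            sleep(wait)  # the simulator is ahead, so wait for the wall clock
        lateness = max(0.0, (clock() - start) - vtime)
        report.append((name, lateness))
    return report
```

Passing the clock and sleep functions as parameters is a design choice of this sketch: it lets the pacing logic be exercised with a fake clock, while the defaults use the real monotonic clock.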

2.2 The Architecture

The real-time routing experiment platform requires running XORP software routers inside separate OpenVZ containers and connecting them using the real-time simulation infrastructure in PRIME. Fig. 1 illustrates the overall architecture of the experiment platform. The XORP instances running within OpenVZ containers correspond to the virtual routers simulated by PRIME. Not all simulated routers need to be emulated. For example, Fig. 1 shows only a selection of routers at the core of the network being emulated externally with XORP. Other routers can be simulated by choice. We are currently investigating issues of interoperability between simulated and emulated routers.

Figure 1. Routing experiment architecture.

The communication between the XORP instances and their corresponding simulated entities in PRIME is carried out through the simulation gateway. Fig. 2 shows an example of emulating a router that has two network interfaces with designated IP addresses 10.20.8.11 and 10.2.100.5, respectively. Correspondingly, the OpenVZ container running the XORP instance installs two OpenVPN clients that are connected with the OpenVPN server running at the simulation gateway. The two OpenVPN clients create two separate virtual network interfaces (i.e., tunnel devices), each matching the IP address of the corresponding network interface at the simulated router. OpenVPN uses IP over UDP to transport packets between the virtual network interfaces and the simulation gateway [23].

Figure 2. Example of an emulated router.

XORP, running as user-level processes, manages the network interfaces, updates the forwarding table, and uses the transport-layer socket interface to communicate the routing information (such as OSPF link state update messages) with the peering XORP instances. Packets originated by XORP are sent from the tunnel devices over the OpenVPN connection to the simulation gateway, which forwards them to PRIME over dedicated TCP connections. The I/O threads at the simulation side translate the packets into simulation events representing the packets being pushed down from the protocol stack at the corresponding router in simulation and then sent out onto the appropriate links. Once inside the simulation, packets are forwarded according to the simulated network conditions. When a packet arrives at a simulated router, if the packet is destined for another host in the simulated network, the IP protocol session consults the router's forwarding table and subsequently delivers the packet to the output queue of the outgoing network interface. If the packet reaches its destination, the packet is sent up to the emulation protocol session, which exports the packet to the XORP instance via the I/O threads through the simulation gateway and over the OpenVPN connection. Eventually the packet emerges from the tunnel device that corresponds to the router's network interface that originally received the packet in simulation.

To minimize the overhead caused by packets traveling through the simulation gateway (both in terms of throughput and latency), it is important that packet forwarding operations are performed entirely within simulation. Consequently, the forwarding table at the simulated router must be synchronized with the actual forwarding table, which is updated by the routing protocols running within the corresponding XORP instance. This is achieved by adding a XORP FEA plug-in, through which a command channel is established between the XORP instance and the virtual router node within PRIME. XORP uses the command channel to transfer forwarding table updates (such as adding or deleting forwarding table entries) and interface configuration requests (such as activating or disabling a network interface) to the corresponding router node in simulation. A detailed description of this forwarding plane offloading approach can be found in our previous paper [15].
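
The forwarding table synchronization described above can be sketched as follows. The sketch assumes a simple tuple format for the commands sent over the command channel and a longest-prefix-match lookup at the simulated router. The class name `SimulatedForwardingTable`, the command format, and the interface names are hypothetical and do not reflect the actual FEA plug-in or the PRIME interfaces.

```python
import ipaddress


class SimulatedForwardingTable:
    """Forwarding table of a simulated router, updated via a command channel."""

    def __init__(self):
        self._routes = {}  # ip_network -> outgoing interface name

    def apply(self, command):
        """Apply one update: ('add', prefix, interface) or ('delete', prefix)."""
        op, prefix = command[0], ipaddress.ip_network(command[1])
        if op == "add":
            self._routes[prefix] = command[2]
        elif op == "delete":
            self._routes.pop(prefix, None)
        else:
            raise ValueError(f"unknown command: {op}")

    def lookup(self, destination):
        """Longest-prefix match; returns the interface name or None."""
        addr = ipaddress.ip_address(destination)
        matches = [net for net in self._routes if addr in net]
        if not matches:
            return None
        return self._routes[max(matches, key=lambda net: net.prefixlen)]
```

With the table kept current in this way, a lookup for every forwarded packet can be answered inside the simulation, which is the point of offloading the forwarding plane.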

2.3 Timing and Scalability

Since PRIME needs to interact with the XORP instances and simulate the virtual network in real time, timing accuracy is critical for obtaining trustworthy experimental results. This means that the platform must keep up with wall-clock time, receive real-time events promptly, and export them without significantly missing their real-time deadlines. We conducted preliminary studies to evaluate the timing behavior of our platform. The experimental results suggest that the forwarding plane offloading approach adopted by our routing experiment platform can significantly reduce the traffic load across the emulation infrastructure [15]. The studies also show that our platform is capable of supporting large-scale routing experiments. For example, in one case study our platform supported as many as 240 XORP instances running on a single physical machine equipped with an Intel Quad-Core Xeon 2.0 GHz CPU and 8 GB memory.

3 Network Model

The network model for our routing attack scenarios stems from a study conducted previously by Liljenstam on the potential impact of routing attacks targeting the Swedish part of the Internet infrastructure [16]. In this section, we describe how we built the realistic network model that we used for conducting our security exercises.

Coming up with a routing model realistic enough to produce meaningful results involves substantial challenges, and we must carefully consider what simplifications are possible. To study the impact of such attacks, we are primarily interested in the exchange of routing information between ASes. Thus, it is sufficient to model the network at the AS level; and, mainly due to the lack of proprietary information and measurement data within these ASes, we also make a further simplification in that each AS is modeled through a single BGP-speaking router. This is presumably not far from the truth for small stub ASes, but is clearly a gross simplification for ASes belonging to large ISPs.

Even with these simplifications, the global AS topology is unwieldy to deal with. However, since the goal was to study the parts related to Swedish interests, the topology was trimmed down to mainly include ASes belonging to Swedish ISPs or customers. Such a reduction requires care though, since connectivity will be reduced and potentially important route propagation paths may be lost. To avoid such distortions, certain important ASes belonging to international tier-1 ISPs are retained as the model is reduced.

A critical part of the model is to have the routers implement realistic routing policies. Unfortunately, the detailed policies are generally regarded as sensitive business information, and thus ISPs are reluctant to reveal them. The only publicly available information on routing policy is what has been registered into the Internet Routing Registry for coordination of policies between different ASes. "Mid-sized" ISPs (i.e., larger ISPs, but not the largest tier-1 ISPs) tend to use this information to generate BGP filters automatically from the policies stated in the registry. In the model we essentially try to achieve the same thing.
However, since information is entered manually, on a voluntary basis, into the routing registries, there are known problems with missing and/or stale information. The result is that we can only achieve an approximation of the real routing policies. Several things work to our advantage, though. First, since our focus is in Europe, we rely on the routing registry maintained by RIPE NCC, which has been found to contain the best maintained information out of the different registries [30]. Second, by combining information from different sources, we can improve somewhat on the policies found in the routing registry. Finally, where information is missing or connectivity has been abstracted away, we assume that filters block illegitimate routes from propagating. The intention is thus to find a lower bound on the consequences rather than to overestimate them.

We built our AS-level model of the routing system using data from several sources:

• Public BGP data, including BGP table dumps (RIBs) collected by Route Views [29] and RIPE NCC [27], were used to derive AS adjacencies.
• A list of registered Swedish ISPs kept by the Swedish National Post and Telecom Agency (PTS) was used to select ASes for the model to create a focus on Swedish organizations and users.
• ISP market share data collected by PTS was used in the original study to assign fractions of users to different ASes in the model.
• Routing policies from the Internet Routing Registry, the only public source on routing policy, were used to supply routing policies for the model.

The resulting network model consists of 167 ASes, as illustrated in Fig. 3, where Swedish ISPs are shown as dark blue diamonds (or dark gray if printed in black and white), stub customer ASes are shown as cyan (light gray) diamonds, and major international neighboring ISPs are shown as green (light gray) circles close to the center. A more detailed account of the derivation of the AS topology and interdomain routing policies is provided next.

Figure 3. AS topology.

3.1 AS Topology

AS path information from BGP RIBs was used to derive AS adjacencies, which were used to build the AS topology. The AS paths were also used to infer peering relationships using Gao's heuristic [11], which provides some indication of routing policies. In order to get good coverage of the Swedish part of the routing system, the Route Views RIBs, containing 8.6 million routes collected globally, were supplemented with 1.86 million routes collected by the RIPE route collector at the Stockholm Internet exchange point (run by Netnod) and the London Internet exchange. From these data sets, a global AS topology was created, annotated with peering relationships (provider/customer, peer/peer, or sibling/sibling).

Starting with the resulting global AS graph, a simple iterative heuristic was used to trim it down to a subgraph focused on Sweden. The focus topology was grown starting from the two ASes NETNOD and TELIANET-SE by iteratively adding neighboring ASes found in the list of registered Swedish ISPs, or their customers. In each iteration a manual inspection was also made to check all considered neighbors (indicated for inclusion or exclusion) to make adjustments and to ensure that large international ISPs were included for realistic connectivity. Since private peering and unused backup connections are generally absent from collected routing tables, we also add adjacencies indicated by policy information from the routing registry.¹
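
The adjacency derivation and the iterative growth of the focus topology can be sketched as follows. This is a simplified illustration under stated assumptions, not the scripts used in the study: the function names are hypothetical, `eligible` stands in for the set of registered Swedish ISPs and their customers, the manual inspection step and the inference of peering relationships are omitted, and the example AS numbers used with it should come from the range reserved for documentation.

```python
def adjacencies_from_paths(as_paths):
    """Collect undirected AS adjacencies from AS paths (lists of AS numbers)."""
    edges = set()
    for path in as_paths:
        for left, right in zip(path, path[1:]):
            if left != right:  # skip AS-path prepending (repeated AS numbers)
                edges.add(frozenset((left, right)))
    return edges


def mutual_policy_adjacencies(policy_neighbors):
    """Keep registry adjacencies only if both end-point ASes list each other."""
    return {
        frozenset((a, b))
        for a, neighbors in policy_neighbors.items()
        for b in neighbors
        if a != b and a in policy_neighbors.get(b, ())
    }


def grow_focus(edges, seeds, eligible):
    """Iteratively add neighbors of the focus set that appear in `eligible`."""
    neighbors = {}
    for edge in edges:
        a, b = tuple(edge)
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    focus = set(seeds)
    frontier = set(seeds)
    while frontier:
        candidates = set()
        for asn in frontier:
            candidates |= neighbors.get(asn, set())
        frontier = (candidates & set(eligible)) - focus
        focus |= frontier
    return focus
```

The growth loop stops when an iteration adds no new AS, so an AS outside `eligible` is never pulled in automatically; in the study, such cases (for example large international ISPs) were handled by manual inspection.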

3.2 Routing Policies

Routing policies in the RIPE Internet Routing Registry are expressed in the Routing Policy Specification Language (RPSL). We used parser code from the RIPE RtConfig tool to parse the RPSL objects in the registry and generate an intermediate representation. Other scripts are then used to construct prefix filters for the model (and augment the AS adjacencies) from the intermediate representation. Routing data from RIPE's collection point in Stockholm (rrc07) and from a Looking Glass server in AS1299 were used to validate the model, by comparing with the routes observed at the same points in the model. Initial validation results pointed to problems with the poor quality of the routing policy information from the routing registry, as a large fraction of the routes that were expected to be observed at one of the vantage points in the network were blocked in the model. To improve the model, minor modifications were made to the policies to account for the observed routes. As expected, this improved the agreement between the model and the observations at the relevant point.
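
The validation step described above amounts to comparing two sets of routes seen at the same vantage point. A minimal sketch follows, assuming that routes are compared by prefix only; the function name `validation_report` and the agreement measure are hypothetical and were not taken from the original study.

```python
def validation_report(observed, modeled):
    """Compare prefixes seen at a real vantage point with the model's view.

    `observed` and `modeled` are iterables of prefix strings.
    """
    observed, modeled = set(observed), set(modeled)
    missing = observed - modeled  # observed in reality, blocked in the model
    extra = modeled - observed    # present in the model only
    agreement = len(observed & modeled) / len(observed) if observed else 1.0
    return {"missing": missing, "extra": extra, "agreement": agreement}
```

A large `missing` set corresponds to the symptom reported above, where routes expected at a vantage point were blocked in the model by overly restrictive registry-derived filters.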

4 Experiment Setup

We wrote a set of scripts to generate the necessary configuration files for both XORP instances and the PRIME simulator given a network model, and to automatically deploy experiment scenarios. We start with the network model described in the previous section, which consists of the AS topology and routing policies, embedded in XML. The xml2dml script translates XML into DML, a format describing the network model that can be interpreted by PRIME. The script also extracts the routing information (e.g., prefix filtering rules, export policies) into a simpler format for later processing. The dmlevn script performs automatic IP address assignment (if necessary) and flattens the hierarchical network model (which resembles the real network architecture with hosts and routers within subnetworks belonging to larger subnetworks) so that it can be easily instantiated by PRIME when the experiment starts. The script also generates VPN configuration information, which is a list of emulated routers and hosts that will be used for setting up the emulation infrastructure. In particular, the list includes the detailed configuration of their network interfaces. The dmlpart script partitions the network model to prepare for parallel simulation (if more than one processor is needed to run a large-scale simulation).

¹ Only adjacencies where the policies occur at both end-point ASes were included to try to avoid including stale information.

The VPN configuration information is used by vpnscript to generate configuration files for both clients and the simulation gateway. Recall that there will be one VPN client corresponding to each network interface on an emulated host (see Fig. 2). The gateway configuration file is used by the VPN server to start the simulation gateway. The simulator starts with the model DML file, the flattened model configuration file, and the partitioning information. The latter is only needed in case parallel simulation is desired. The routing information, together with the (flattened) network model configuration and the VPN configuration information about the emulated routers and hosts, is then used by the xorpify script to construct configuration files for the XORP instances. These configuration files are in fact only templates, before they are distributed to the virtual machines. On each corresponding virtual machine, the cfgxorp script combines the template with the proper runtime information about the VPN clients and converts the template to a configuration file to run XORP.

The environment for the security exercises, which we describe in the next section, consisted of two DELL PowerEdge 1900 servers connected by a normal 100 Mb/s Ethernet connection. Each server has an Intel Quad-Core Xeon 2.0 GHz CPU and 8 GB memory. We set up an OpenVZ environment on one machine, which hosted all 167 XORP instances as well as the simulation gateway. PRIME ran on the other machine. While we were sure that the machine running PRIME could support real-time simulation of this network model without a problem, we still needed to make sure that the machine that ran the XORP instances and the simulation gateway was not under-provisioned.
We measured its memory consumption and CPU utilization during the exercises.² The total memory consumed by all virtual machines was about 5.8 GB (still below the 8 GB physical memory limit). Fig. 4 shows the CPU utilization during the security exercise that involves reconfiguring BGP policies at a router as a result of the deaggregation attack. The CPU utilization was higher (but well under control) in the first 20 minutes during the initialization of the test model (with 167 ASes) and at around 2,200 seconds into the experiment when the BGP reconfiguration happened. Fig. 4 also shows the number of packets that traverse the simulation gateway, which was recorded at 10-second intervals. The result is similar to the CPU utilization. We calculated a peak throughput of 1.4 Mb/s (and an average of 60 Kb/s), far below the available physical bandwidth and what can be supported by the VPN infrastructure (approximately 90 Mb/s, according to our previous studies). The above measurement results indicate that the experiment platform had sufficient resources to support the interdomain routing exercises.

² The current XORP implementation (version 1.4) does not support BGP setup with more than 4,000 policy terms [32]. We had to shrink the number of policies by selecting only a subset of the prefixes from the model. We chose 905 prefixes specifically for our experiments; this resulted in as many as 3,995 policy terms on some routers. Although this reduction undoubtedly weakens the realism of our experiments, it does not significantly undermine the results shown here.

Figure 4. CPU and traffic load for running OpenVZ and the simulation gateway. [Plot: CPU utilization (%) and number of packets versus time (seconds), from 0 to 3,000 seconds.]
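
The resource figures reported above can be turned into headroom estimates with simple arithmetic. The sketch below only restates the numbers given in this section; the function name `resource_headroom` is hypothetical, and the per-instance memory figure assumes that the 5.8 GB is spread evenly over the 167 virtual machines and that 1 GB equals 1,024 MB.

```python
def resource_headroom(mem_used_gb, mem_total_gb, instances,
                      peak_mbps, avg_mbps, capacity_mbps):
    """Summarize how much of the measured capacity was actually used."""
    return {
        # Average memory per virtual machine, in MB (assumes an even split).
        "memory_per_instance_mb": mem_used_gb * 1024 / instances,
        # Fraction of physical memory in use.
        "memory_fraction": mem_used_gb / mem_total_gb,
        # Fraction of the VPN infrastructure capacity used at peak and on average.
        "peak_link_fraction": peak_mbps / capacity_mbps,
        "avg_link_fraction": avg_mbps / capacity_mbps,
    }


# Values taken from the measurements reported in this section.
summary = resource_headroom(
    mem_used_gb=5.8, mem_total_gb=8.0, instances=167,
    peak_mbps=1.4, avg_mbps=0.060, capacity_mbps=90.0,
)
```

Under these assumptions, memory use is roughly 72.5% of the physical memory (about 36 MB per virtual machine), while the peak traffic uses under 2% of what the VPN infrastructure can carry.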

5 Security Exercises and Results

In this section we describe three security exercises we conducted on the real-time routing experiment platform with the Swedish network model described previously. The human attackers, referred to as the red team, perform interdomain routing attacks, while the network administrators, referred to as the blue team, are engaged in defending the routing infrastructure against such attacks. We assume that the red team is able to compromise a router in the interdomain routing system and launch attacks against a known target, be it a government agency or a financial institution. The main responsibility of the blue team is to detect the attacks, e.g., by inspecting the routes at different vantage points of the network, and react promptly to the attacks by adjusting the policies (not necessarily at the same compromised router). Our purpose here is to demonstrate the capabilities of the real-time routing experiment platform. The security exercises described in this section were conducted in a closed controlled environment in our lab, although it is quite possible to scale these types of exercises to include multiple institutions.

5.1 Denial-of-Service Attack

Since a BGP speaker uses TCP to communicate with its peers, it is exposed to TCP-specific attacks, such as SYN flooding and other denial-of-service attacks. Although the causes of a denial-of-service (DOS) attack are manifold, one prominent effect of such an attack is the temporary unavailability of the target system. For interdomain routing, this means route withdrawals, which affect the reachability of certain networks. Repeated attacks on a router may cause churn in the routing system (i.e., routing information in the network changing rapidly). This may lead to more serious instability in interdomain routing.

Directly simulating a denial-of-service attack would not necessarily tell us much about the system's vulnerability, since the outcome depends on the specific exploit and on the particulars of the system configuration. During the exercise, we simply reproduce the result of a denial-of-service attack by having the red team shut down and then reboot a router. The blue team would detect such an attack by observing, at another router, changes to the routes to certain destinations. We acknowledge that our model contains only the AS-level topology and therefore cannot truly assess the real impact of a denial-of-service attack. This is a limitation of our network model, not a limitation of the real-time routing experiment platform.

Figure 5 shows in more detail the part of the Swedish network on which the attack was carried out. Figure 6 (left plot) shows the effect on the round-trip time (measured by ping) between a host with the hypothetical IP address 194.103.154.1, which belongs to AS 21200, and the host 192.175.48.1, which is multi-homed in AS 112 and AS 16150. When the router at AS 8434 was brought down 260 seconds after the initialization phase, BGP recomputed the routes, diverting the traffic onto the longer path via AS 3301. As a result, the round-trip delay increased from around 120 to around 150 milliseconds. When the router at AS 8434 was rebooted at around 550 seconds into the experiment, BGP re-established the original path after reconfiguration, and the round-trip time returned to around 120 milliseconds.

Figure 5. A detailed view of the part of the network for the DOS attack.
To show the effect of such routing changes on traffic, we initiated a TCP transfer in the same experiment from the host 194.103.154.1 to the host 192.175.48.1 and analyzed the packet trace from tcpdump. Figure 6 (right plot) shows the number of bytes transmitted at the sender side. The TCP transfer slowed down when the router failure occurred at 260 seconds and the traffic moved to the alternative route, which was found immediately. When the router came back, the original path was re-established and a better throughput was achieved. The changing throughput is visible as a slight difference in the slope of the curve. At 620 seconds, the TCP download was completed. As we see in the figure, at this time the round-trip time became more stable, settling at 120 milliseconds.

Figure 6. Round-trip time and transfer rate. [Left plot: round-trip time (ms) versus time (seconds).]
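One way to read the change in slope: if the transfer is assumed to be limited by a fixed TCP window (an assumption; the window size is not reported here), the achievable rate is at most one window per round trip, so the detour from about 120 ms to about 150 ms would scale the rate by roughly 120/150. The sketch below works through that arithmetic only; the function name and the 64 KiB window are hypothetical and are not taken from the experiment.

```python
def window_limited_rate_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on TCP throughput when one window is sent per round trip."""
    return window_bytes * 8 / rtt_seconds


# Hypothetical window size, chosen only for illustration.
WINDOW_BYTES = 64 * 1024

rate_normal = window_limited_rate_bps(WINDOW_BYTES, 0.120)  # original path
rate_detour = window_limited_rate_bps(WINDOW_BYTES, 0.150)  # path via AS 3301

# The ratio does not depend on the assumed window size.
slowdown = rate_detour / rate_normal
print(f"rate on the detour relative to the original path: {slowdown:.2f}")
```

Under this assumption the rate drops by about a fifth, which would be consistent with a visible but slight change in the slope of the byte-count curve.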

5.2 Deaggregation Attack

The deaggregation attack is also commonly referred to as prefix hijacking. Since routers forward packets by longest prefix match, a prefix can be divided into longer (more specific) prefixes that are advertised separately, and routers will prefer the more specific routes. This allows attackers who control a router to choose specific target prefixes and advertise false routes for them to the router's peers. Once the forged routes are accepted by the peers, traffic destined to the target prefixes will be forwarded to an unintended router. The deaggregation attack can be used either to hijack the victim's address block (so that the traffic is routed to unintended parties) or to create a black hole (so that no one is able to reach the target prefixes). A more systematic treatment of prefix hijacking schemes can be found in [1, 34]. Here we applied the deaggregation attack in the context of a security exercise, both as an example of the capabilities of the real-time routing experiment platform and to analyze the attack's potential impact.

During the exercise, the red team was able to select, among the 167 BGP speakers in the Swedish network model, a router from which to launch the attack. The red team logged onto the compromised router using xorpsh, the XORP command shell that allows one to inspect and change the router's configuration at run time. The red team then carried out the deaggregation attack by tampering with the routing information. Specifically, the attacker selected one prefix announced by the router as the target and divided it into two prefixes, each one bit longer than the original. These two fake prefixes were then announced to all peers with routes pointing to a nonexistent router, with the result that all traffic to the target prefix would be dropped. The blue team detected the unreachable prefixes through the use of ping. Of course, in reality, the detection would most likely come from customer complaints. The blue team then used traceroute to locate the problem, and the offending routes were removed from the router.
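The prefix split and its effect on forwarding can be illustrated with a minimal sketch using Python's standard ipaddress module. The prefix 194.103.154.0/24 is borrowed from elsewhere in this section purely for illustration (the exercise does not state which prefix was targeted), and the helper names and next-hop labels are hypothetical.

```python
import ipaddress


def deaggregate(prefix: str):
    """Split a prefix into its two halves, each one bit longer."""
    return list(ipaddress.ip_network(prefix).subnets(prefixlen_diff=1))


def longest_prefix_match(table, address: str):
    """Return the (network, next_hop) entry with the longest matching prefix."""
    addr = ipaddress.ip_address(address)
    matches = [(net, hop) for net, hop in table.items() if addr in net]
    return max(matches, key=lambda entry: entry[0].prefixlen) if matches else None


target = "194.103.154.0/24"  # illustrative target prefix
halves = deaggregate(target)

# Forwarding table with only the legitimate route.
table = {ipaddress.ip_network(target): "legitimate-origin"}
before = longest_prefix_match(table, "194.103.154.1")

# The forged, more specific routes point at a router that does not exist.
for half in halves:
    table[half] = "nonexistent-router"
after = longest_prefix_match(table, "194.103.154.1")

print("forged prefixes:", [str(h) for h in halves])
print("before:", before[1], "| after:", after[1])
```

Because both forged prefixes are one bit longer than the original, together they cover the whole original address block, so every address in it matches a forged route first.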

The effectiveness of the deaggregation attack is determined by the network topology, the routing policies, and the exact place in the network where the attack is launched. We conducted measurements to assess the potential impact of deaggregation attacks on the Swedish network model. We chose an online bank as the target. In this case the deaggregation attack would leave the bank's customers unable to reach their accounts or, more seriously, would allow their requests to be intercepted, putting their private information at risk of being compromised. We used each of the 60 Swedish Internet service provider ASes in turn as the point of attack.3 Figure 3 shows the impact of one particular attack: the yellow (or light gray) circle to the left of the center denotes the target bank; the larger red (medium dark gray) circle below the center denotes the AS that initiated the attack; and the smaller red (medium dark gray) circles denote all the ASes that were affected by the attack. The result of this experiment is summarized in the following table:

# affected ASes    0      ≤ 10    ≤ 20    ≤ 50
Frequency          55%    85%     95%     98.3%

In general, attacks with a larger impact (i.e., with more affected ASes) occur less frequently. Our experiments show that in 55% of the 60 attacks, no AS is affected other than the attack point itself. The impact is no larger than 20 ASes in 95% of the cases: in 30% of the cases 1 to 10 other ASes are affected by the attack, and in 10% of the cases 11 to 20 ASes are affected. In extreme cases, however, we observed that as many as 85 ASes could be affected by an attack. We obtained similar results when targeting other organizations (with different prefixes), which indicates that the selection of the attack point primarily determines the scope of the impact. We also found that four ASes are most vulnerable to such deaggregation attacks: each of them was affected in more than 14 of the 60 attack cases. The policies on these routers need to be revisited. Here the BGP input filters are of particular importance; the deaggregation attack would have a lesser effect if more specific prefix filtering rules were in place at the peers.
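The cumulative percentages in the table can be converted back into attack counts, which makes the per-range figures quoted above easy to check. The following is a small sketch of that arithmetic; the variable and function names are illustrative, and the only inputs are the table values and the total of 60 attacks.

```python
TOTAL_ATTACKS = 60

# (upper bound on affected ASes, cumulative frequency in percent), from the table.
CUMULATIVE_PERCENT = [(0, 55.0), (10, 85.0), (20, 95.0), (50, 98.3)]


def cumulative_counts(rows, total):
    """Convert cumulative percentages into cumulative numbers of attacks."""
    return [(bound, round(pct * total / 100)) for bound, pct in rows]


def per_range_counts(cumulative, total):
    """Number of attacks in each range: 0, 1-10, 11-20, 21-50, and above 50."""
    counts, previous = [], 0
    for _, count in cumulative:
        counts.append(count - previous)
        previous = count
    counts.append(total - previous)  # attacks above the last bound
    return counts


cumulative = cumulative_counts(CUMULATIVE_PERCENT, TOTAL_ATTACKS)
ranges = per_range_counts(cumulative, TOTAL_ATTACKS)

for label, count in zip(["0", "1-10", "11-20", "21-50", ">50"], ranges):
    print(f"{label:>6} affected ASes: {count:2d} attacks ({100 * count / TOTAL_ATTACKS:.1f}%)")
```

The 1-10 and 11-20 ranges come out at 18 and 6 attacks, matching the 30% and 10% stated in the text, and a single attack falls above 50 affected ASes.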

5.3 Traffic Engineering Subversion Attack

Normal traffic engineering can be achieved through BGP configuration. A network administrator can install several types of policies to direct traffic onto a certain path. However, these legitimate techniques may also be used by attackers to subvert interdomain routing. Such is the traffic engineering subversion attack, in which a malicious router redirects traffic to degrade the performance of the target AS. For example, in Figure 7, normal traffic destined to the victim AS (AS 1) is redirected onto a suboptimal backup path (through AS n) as opposed to the optimal path (through AS 2); this reduces the capacity available to the victim AS if the backup path offers lower network performance. Note that since traffic is still forwarded to the correct destination, only with reduced capacity, this kind of attack is more difficult to detect.

Figure 7. An example for traffic engineering subversion attack. [Diagram labels: AS 1 (victim), AS 2, AS 3, AS n, attacker, normal traffic.]

The specific traffic engineering subversion scenario we applied in the security exercise used AS-path padding. First, the red team selected a router and logged onto the presumably compromised router using xorpsh. By inspecting its routing table, the red team identified a victim AS and selected a prefix originated by the victim AS as the target. The attack was carried out against the target prefix by simply setting the as-path-expand variable in the BGP configuration. The AS-path padding option caused XORP to prepend several copies of the current AS number to the AS-path attribute in its BGP update messages. Since routes with a shorter AS path are preferred in the BGP route selection process, the traffic destined to the victim would no longer take the original path; a suboptimal path would be selected instead. The blue team would detect this attack by observing the abnormal AS-path padding in the routing table of an AS. The blue team would then log onto the presumably compromised router and correct the false BGP configuration.

Figure 8 shows an example in which the red team launched an attack at the router of AS 3301 by inflating the path to the prefix 194.103.154.0/24, which belongs to AS 21200. The blue team observed the attack at the neighboring router of AS 2603. Before the attack, the BGP routing table shows that the preferred path was through AS 3301, as illustrated in Fig. 8(a). After the attack, the routing table indicated that the alternate path through AS 1880 and AS 8434 was now the preferred choice. This was the consequence of the original AS path having been inflated: AS 3301 appears six times in the routing update from AS 3301 to AS 2603, as can be seen in the routing table shown in Fig. 8(b). AS 3301 could therefore be immediately identified as the malicious router, and its erroneous routing policies could be corrected accordingly.

# show bgp routes
Status Codes: * valid route, > best route
Origin Codes: i IGP, e EGP, ? incomplete

     Prefix              Nexthop    Peer       AS Path
     ------              -------    ----       -------
*>   194.103.154.0/24    *.*.*.*    *.*.*.*    3301 21200 i
*    194.103.154.0/24    *.*.*.*    *.*.*.*    1299 3301 21200 i
*    194.103.154.0/24    *.*.*.*    *.*.*.*    1880 8434 21200 i
*    194.103.154.0/24    *.*.*.*    *.*.*.*    12552 3303 1299 3301 21200 i

(a) Routing table before the attack.

# show bgp routes
Status Codes: * valid route, > best route
Origin Codes: i IGP, e EGP, ? incomplete

     Prefix              Nexthop    Peer       AS Path
     ------              -------    ----       -------
*    194.103.154.0/24    *.*.*.*    *.*.*.*    3301 3301 3301 3301 3301 3301 21200 i
*    194.103.154.0/24    *.*.*.*    *.*.*.*    1299 3301 21200 i
*>   194.103.154.0/24    *.*.*.*    *.*.*.*    1880 8434 21200 i
*    194.103.154.0/24    *.*.*.*    *.*.*.*    12552 3303 1299 3301 21200 i

(b) Routing table after the attack.

Figure 8. An example for traffic engineering subversion attack.

Note that using duplicate AS numbers to inflate an AS path is but one simple method of traffic engineering subversion. Other attacks may involve more sophisticated techniques to inflate AS paths, such as withholding reachability information in route announcements. The effectiveness of traffic engineering subversion attacks depends on the network topology and the policy configuration. Such attacks would be quite difficult to detect, and their scale would be even harder to quantify.

3 The remaining 107 ASes are either customer ASes or foreign ASes that are immediate neighbors of the Swedish ISPs.
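The route change in Figure 8 and the blue team's check for abnormal padding can be sketched as follows. This is a simplified illustration, not the XORP decision process: it compares AS-path lengths only, whereas BGP applies further tie-breaking steps that are not modeled here. All function names are hypothetical; the AS paths are those listed in Figure 8.

```python
from itertools import groupby


def parse_as_path(text):
    """Turn '3301 21200 i' into [3301, 21200], dropping the origin code."""
    return [int(tok) for tok in text.split() if tok.isdigit()]


def shortest_paths(paths):
    """Return every path with the minimum AS-path length (ties are kept)."""
    shortest = min(len(p) for p in paths)
    return [p for p in paths if len(p) == shortest]


def longest_prepend_run(path):
    """Return (asn, count) for the longest run of one repeated AS number."""
    runs = [(asn, len(list(group))) for asn, group in groupby(path)]
    return max(runs, key=lambda run: run[1])


BEFORE = [
    "3301 21200 i",
    "1299 3301 21200 i",
    "1880 8434 21200 i",
    "12552 3303 1299 3301 21200 i",
]
# After the attack only the first route changes: AS 3301 appears six times.
AFTER = ["3301 3301 3301 3301 3301 3301 21200 i"] + BEFORE[1:]

before_paths = [parse_as_path(p) for p in BEFORE]
after_paths = [parse_as_path(p) for p in AFTER]

best_before = shortest_paths(before_paths)
best_after = shortest_paths(after_paths)

# Blue-team check: which AS number is repeated most often in a row?
suspect = max((longest_prepend_run(p) for p in after_paths), key=lambda run: run[1])

print("shortest before:", best_before)
print("shortest after: ", best_after)
print("suspect AS and run length:", suspect)
```

After the padding, two routes tie at three AS hops (via AS 1299 and via AS 1880 and AS 8434); Figure 8(b) marks the latter as best, which this length-only comparison cannot decide by itself.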

6 Conclusions and Future Work

There is an increasing demand for realistic, scalable, and flexible routing experiments, especially in studies of the security aspects of large-scale interdomain routing systems. Our group has developed a real-time routing experiment platform to meet this need. The platform enables researchers to conduct real-time routing experiments and exercises with real routing protocol implementations in large-scale network scenarios. We conducted real-time security exercises involving realistic interdomain routing systems and analyzed their potential impact. The network model used in these exercises was constructed using realistic routing configurations from the routing registry and consists of major autonomous systems connecting Swedish users. We devised three simplistic network security exercises that involve a red team of attackers and a blue team of network administrators to demonstrate the consequences of the attacks and the effectiveness of defensive measures and practices. Our purpose is to establish a case study that demonstrates the potential of the real-time routing experimentation platform.

Immediate future work includes improving the scalability of these routing experiments. Although using lightweight virtual machines allows us to conduct experiments with many real router implementations, the physical resources required to construct an experiment with tens or hundreds of thousands of XORP routers would not be easy to come by. One potential solution to this problem is to provide interoperability between real router instances and simulated ones. Not all routers in an experiment may require detailed representation; we could selectively simulate some of the routers.

The ability to perform live experiments is a major attraction of systems like PlanetLab [25], and is unparalleled by other methods. Conducting tests under real network traffic can reveal both design and operational problems before the ultimate deployment. However, live testing sacrifices controllability. After all, one cannot control live traffic to reveal every potential problem that the target system would encounter under all kinds of traffic conditions. We use simulation to model network transactions and can therefore easily incorporate synthetic traffic models. Synthetic traffic cannot replace live traffic, but it provides a controllable diversity. Our future work includes further investigation of various synthetic (but realistic) traffic models that can be used to represent different network congestion levels for our experiments. It is possible to run our routing testbed on distributed platforms (such as an overlay network) so that it can incorporate live traffic situations. The possibility of running our platform in PlanetLab and EmuLab [10] deserves further investigation.

RINSE is a real-time simulation framework especially designed for network security exercises [17]. Our experiment platform inherits some of its capabilities, but focuses more on the routing protocols (especially real routing protocol implementations). We would like to develop other monitoring and diagnosis capabilities, such as interoperability using standard SNMP queries, and the ability to inject faults for both simulated and emulated entities.

References

[1] H. Ballani, P. Francis, and X. Zhang. A study of prefix hijacking and interception in the Internet. SIGCOMM'07.
[2] P. Barham et al. Xen and the art of virtualization. SOSP'03.
[3] D. Bauer et al. A case study in understanding OSPF and BGP interactions using efficient experiment design. PADS'06.
[4] A. Bavier et al. In VINI veritas: Realistic and controlled network experimentation. SIGCOMM'06.
[5] S. Bhatia et al. Hosting virtual networks on commodity hardware. Technical Report GT-CS-07-10, Georgia Tech Computer Science, 2008.
[6] K. Butler et al. A survey of BGP security issues and solutions. Technical report, AT&T Labs, 2005.
[7] Y.-J. Chi, R. Oliveira, and L. Zhang. Cyclops: The Internet AS-level observatory. CCR 2008.
[8] S. Convery and M. Franz. BGP vulnerability testing: Separating fact from FUD v1.1. NANOG 28, 2003.
[9] X. A. Dimitropoulos and G. F. Riley. Large-scale simulation models of BGP. MASCOTS'04.
[10] EmuLab. http://www.emulab.net/.
[11] L. Gao. On inferring autonomous system relationships in the Internet. TON, 9(6):733–745, 2001.
[12] T. G. Griffin and B. J. Premore. An experimental analysis of BGP convergence time. ICNP'01.
[13] M. Handley et al. Designing extensible IP router software. NSDI'05.
[14] J. Kim et al. A BGP attack against traffic engineering. WSC'04.
[15] Y. Li, J. Liu, and R. Rangaswami. Toward scalable routing experiments with real-time network simulation. PADS'08.
[16] M. Liljenstam. Simulating the national-level impact of routing attacks in Sweden. SNCNW'06. Available at http://liljenstam.net/publication_docs/sncnw2006.pdf.
[17] M. Liljenstam et al. RINSE: the real-time interactive network simulation environment for network security exercises. PADS'05.
[18] J. Liu. Packet-level integration of fluid TCP models in real-time network simulation. WSC'06.
[19] J. Liu. A primer for real-time simulation of large-scale networks. ANSS'08.
[20] J. Liu et al. An open and scalable emulation infrastructure for large-scale real-time network simulations. INFOCOM'07.
[21] P. McDaniel. Iseb: Trace driven modeling of Internet scale BGP attacks and countermeasures. DETER/EMIST Workshop 2005.
[22] S. A. Misel. Wow, AS7007! NANOG mail archives, http://www.merit.edu/mail.archives/nanog/1997-04/msg00340.html, 1997.
[23] OpenVPN. http://www.openvpn.net/.
[24] OpenVZ. http://openvz.org/.
[25] PlanetLab. http://www.planet-lab.org/.
[26] B. Quoitin and S. Uhlig. Modeling the routing of an autonomous system with C-BGP. IEEE Network, 19(6), 2005.
[27] RIPE NCC. http://www.ripe.net/.
[28] RIPE NCC. YouTube hijacking: A RIPE NCC RIS case study. http://www.ripe.net/news/study-youtube-hijacking.html, 2008.
[29] Routeviews. http://www.routeviews.org.
[30] G. Siganos and M. Faloutsos. Analyzing BGP policies: methodology and tool. INFOCOM'04.
[31] VMWare Workstation. http://www.vmware.com/products/ws/.
[32] XORP users mailing list. http://mailman.icsi.berkeley.edu/pipermail/xorp-users/2008-April/002515%.html.
[33] K. Zetter. Revealed: The Internet's biggest security hole. http://blog.wired.com/27bstroke6/2008/08/revealed-the-in.html.
[34] C. Zheng et al. A light-weight distributed scheme for detecting IP prefix hijacks in real-time. SIGCOMM'07.
