Towards an Efficient Implementation of Traceback Mechanisms in Autonomous Systems¹
K. Boudaoud, F. LeBorgne
University of Nice Sophia Antipolis - I3S Laboratory - CNRS
[email protected]
Abstract—One of the major problems raised by denial of service attacks is the identification of fault packets. To resolve this problem, several IP traceback methods have been defined. In this paper, we propose a traceback solution that is efficient and applicable in the real context of the Internet, i.e. one that takes into account the fact that the Internet is made up of a set of autonomous systems (ASs) operated by different administrative authorities.
Index Terms—Autonomous Systems, IP Traceback, Logging, Security.
I. INTRODUCTION
The frequency of denial of service (DoS) attacks that target the Internet is continuously increasing. An example is the distributed DoS attacks against major Estonian organizations in April 2007 [1]. One of the major problems with this kind of attack is the identification of the attack source, more precisely the real source address of fault packets (i.e. the packets responsible for the attack). To resolve this problem, several IP traceback methods that aim to trace packets back to their origin have been defined. However, implementing these methods in the real context of the Internet is not an easy task. Most of the proposed solutions assume that they are applied in a network managed by a single administrative authority, which is clearly not the case of the Internet, made up of a set of autonomous systems (ASs) under the control of different administrative entities.
In the context of this work, we focus on the implementation of traceback mechanisms between several ASs to identify the source AS of fault packets, i.e. the AS from which the fault packets originate. The efficiency of a traceback mechanism between ASs depends on the kind of collaboration established between ASs, which can be strong, weak or non-existent.
• Strong collaboration implies a total collaboration where each AS knows the topology of the other ASs. This kind of collaboration avoids possible modification of packets, due for example to marking mechanisms, which could have an impact on the efficiency of the chosen traceback method.
• Weak collaboration is probably the most frequent case and consists of a minimal collaboration between ASs. The ASs exchange some information without giving away their internal topology. For example, if a traceback method is applied with this kind of collaboration, the ASs collaborate as follows. One AS can ask another AS whether a packet has passed through it. The requested AS can then either return the source AS of the packet (i.e. the AS the packet came from) or forward the request to that source AS.
• Non-collaboration is the worst case, where the ASs do not communicate any information except what is necessary for BGP (Border Gateway Protocol) and can modify the packets, for example the fields of the IP header, by using another marking mechanism.
¹ This work has been done in the context of a research project called MetroSec (http://www.laas.fr/METROSEC), granted and funded by the French ministry of research, CNRS, INRIA and DGA.
Assuming that we have either strong or weak collaboration between ASs, three cases arise: (1) implementation of the same traceback method in all ASs with a strong collaboration, (2) implementation of the same traceback method in all ASs with a weak collaboration and (3) implementation of a different method in each AS with a weak collaboration. The aim of this paper is to identify the most efficient method in the third case, which seems to us the most interesting one.
Our paper is organized as follows. We first give an overview of existing traceback mechanisms. Then, we present our approach in the context of a traceback between ASs. After that, we discuss the evaluation of our approach. Finally, we conclude with some remarks.
II. RELATED WORK
To deal with the problem of identifying the source of IP packets, several mechanisms have been proposed, based on three main approaches: link testing, logging and packet marking.
• The aim of the link testing approach is to search for the source address of the attack hop by hop, by testing network links between routers, starting from the router closest to the victim and continuing until the real source of the attack is reached. Input Debugging and Controlled Flooding are two implementations of this approach. Input Debugging [2] uses the attack signature to determine the source of the attack traffic, while Controlled Flooding [3] determines the attack path by flooding the links of each router one by one and then watching variations in the attack rates. The most significant shortcoming of this approach is that the attack must still be active for its source to be traced back. The implementation of this solution is particularly difficult, as it requires knowledge of the network topology. Moreover, in the case of multiple attack sources, it is not easy to identify these sources.
• The principle of the logging approach is to store in routers some information (such as the source link and the IP source and destination addresses) about the packets crossing them, and then to use the logged information to trace back the origin of fault packets, router by router. Unlike link testing, the logging method is well suited to the identification of multiple sources. However, this method requires significant processing and storage resources to save packet logs. Snoeren [4] proposes a realistic implementation of this approach for a large network under a single administrative authority.
• In the packet marking approach, path information (such as the IP address of the crossed router) is added to the header of packets when they traverse a router, in order to rebuild the path taken by fault packets. Three main techniques have been proposed for packet marking: Probabilistic Packet Marking (PPM), Deterministic Packet Marking (DPM) and ICMP Traceback. In PPM [5][6], packets are marked randomly when they cross a router. In DPM [7][8], only incoming packets are marked, when they enter a network. ICMP Traceback [9] is similar to PPM, except that the path information is not inserted in the packet itself but in an ICMP packet generated for each marked packet and sent to the final destination of that packet. The main drawbacks of the packet marking approach are: 1) in the case of PPM and DPM, the modification of packets, which increases the size of marked packets, and 2) in the case of ICMP Traceback, the generation of additional traffic. Moreover, to be efficient, the same marking method must be used in all transit ASs.
When implemented within a single AS, these methods (except link testing) are generally able to trace back the attack origin. In the context of a traceback between ASs, the efficiency of these methods depends, as said previously, on the kind of collaboration and on whether or not the same method is implemented in all ASs. If the same method is implemented in all ASs:
• In the case of a strong collaboration, all solutions are efficient except PPM, which induces a loss of precision due to the size of the network.
• In the case of a weak collaboration, DPM is less efficient than the other solutions because of the loss of information concerning the network topology.
When different methods are applied in the context of a weak collaboration, packet marking is the worst solution because, when a packet enters an AS, the marking done by the AS it comes from is lost. Logging seems to us the only method that can be applied between ASs in a realistic and efficient manner: the only information required by this method is the AS source of a packet, and this information can be transmitted in the case of a weak collaboration. In the next sections, we will present our AS-traceback method based on the logging approach to trace back the AS source in the case of a weak collaboration between ASs.
III. APPLICATION OF LOGGING FOR AN AS-TRACEBACK
Based on the example given in Fig. 1, we explain the AS-traceback method that we propose. In our approach, only incoming packets are logged. For each packet, the AS logs the AS source (i.e. the AS the packet comes from) and a packet identifier obtained with a transformation function that is different for each AS: F for AS1, G for AS2 and H for AS3 in Fig. 1. In this example, let us consider a packet x1 sent by a client of AS4 to a client of AS1, passing through AS3 and AS2. The traceback process starts after the victim in AS1 has identified the fault packet x1. The victim then sends a traceback request to its own AS (i.e. AS1). AS1 determines, from the logged information, the AS from which packet x1 came, which is AS2, and forwards the traceback request to that AS. This process is repeated by AS2 and AS3 until AS4 is reached or until a transit AS does not collaborate. It is worth noting that even without collaboration between AS1 and AS2, the source AS of packet x1 can still be determined if AS3, the last transit AS, collaborates.
Fig. 1. AS-Traceback of one packet with the Logging method
To implement this method efficiently, it is important to reduce the size and number of the packet identifiers that are logged. To address the problem of storage space, we propose to use the hash function introduced by Snoeren [4]. To log packet identifiers, we propose to use full logging, probabilistic logging, or both. In the full logging approach, all packet digests are logged. This solution is very effective because, if it is applied by all ASs, 100% of packets can be traced. In probabilistic logging, packet digests are logged randomly.
In the example given in Fig. 2, where full logging is implemented in all ASs, AS1 receives a traceback request for packets x1, x2, x3 and x4, which have been identified as fault packets by the victim. AS1 then sends a traceback request to AS2 for packets x1 and x4, and to AS3 for packets x2 and x3, according to the information logged in its database. AS3 in turn transmits the traceback request to AS4 for packet x2. This process is repeated recursively for packet x2 until its source AS is found. However, the traceback stops at AS2 for packets x1 and x4, and at AS3 for packet x3, which means that packets x1 and x4 come from AS2 and x3 from AS3.
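The full logging traceback described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class and function names are hypothetical, a salted SHA-256 digest stands in for the per-AS transformation functions (F, G, H) and for the hash function of [4], and AS4 is assumed to be the source AS of x2.

```python
import hashlib
from dataclasses import dataclass, field


def digest(packet: bytes, as_key: str) -> str:
    """Per-AS packet identifier (assumption: salted SHA-256 stands in for F, G, H)."""
    return hashlib.sha256(as_key.encode() + packet).hexdigest()[:16]


@dataclass
class AutonomousSystem:
    name: str
    neighbours: dict = field(default_factory=dict)  # AS name -> AutonomousSystem
    log: dict = field(default_factory=dict)         # packet digest -> previous AS name

    def receive(self, packet: bytes, previous_as: str) -> None:
        """Full logging: record every packet that enters from another AS."""
        self.log[digest(packet, self.name)] = previous_as

    def traceback(self, packet: bytes) -> str:
        """Forward the request upstream; an AS with no entry is taken as the source AS."""
        previous = self.log.get(digest(packet, self.name))
        if previous is None:
            return self.name
        return self.neighbours[previous].traceback(packet)


def send(packet: bytes, path: list) -> None:
    """Deliver a packet along path[0] -> ... -> path[-1], logging at each entry point."""
    for previous, current in zip(path, path[1:]):
        current.receive(packet, previous.name)


# Topology and traffic assumed from the description of Fig. 2.
as1, as2, as3, as4 = (AutonomousSystem(f"AS{i}") for i in range(1, 5))
as1.neighbours = {"AS2": as2, "AS3": as3}
as3.neighbours = {"AS4": as4}

send(b"x1", [as2, as1])
send(b"x2", [as4, as3, as1])
send(b"x3", [as3, as1])
send(b"x4", [as2, as1])

# The victim's AS (AS1) handles the traceback request for all four fault packets.
sources = {p.decode(): as1.traceback(p) for p in (b"x1", b"x2", b"x3", b"x4")}
print(sources)
```

Each AS only reveals which neighbour a packet came from, which is the information exchanged under a weak collaboration; no internal topology is shared.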
Fig. 2. AS-traceback using the full logging approach
In the case of probabilistic logging (see Fig. 3), the traceback process is repeated until the source AS of the fault packet is found or until no information is logged about the packet being traced. However, because some packets will not be logged, two alternatives are possible:
• Send the traceback request to all possible source ASs, which can overload the bandwidth.
• Send the traceback request only to the ASs from which at least one packet has come.
We have chosen the second solution, which can be optimized as follows. Only the packets for which the source AS has been identified are indicated in the request sent to that AS (see Fig. 3). As shown in Fig. 3, AS1 sends a traceback request to AS2 for x1 and to AS3 for x3, as these packets are logged in its database. However, as there is no information about x2 and x4, a traceback request concerning these packets is sent to both AS2 and AS3.
Fig. 3. AS-traceback using the probabilistic logging approach
Finally, one of the most important aspects of our method is that it allows both kinds of logging to coexist (see Fig. 4). As the example in Fig. 4 shows, the use of full logging in AS C1 makes it possible to identify the source AS of packets x1 and x3 even if probabilistic logging has been used in AS P1.
Fig. 4. AS-traceback using a probabilistic and full logging approach
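The request-forwarding rule chosen for probabilistic logging can be sketched as follows. This is only an illustration under stated assumptions: `maybe_log` and `plan_requests` are hypothetical names, packets are represented by plain string identifiers rather than digests, and the log contents are those described for AS1 in Fig. 3.

```python
import random
from collections import defaultdict


def maybe_log(log: dict, packet_id: str, previous_as: str,
              probability: float, rng: random.Random) -> None:
    """Probabilistic logging: keep the entry only with the given probability."""
    if rng.random() < probability:
        log[packet_id] = previous_as


def plan_requests(log: dict, packet_ids: list) -> dict:
    """Decide which upstream AS is asked about which packets.

    Logged packets go to the AS they came from. Unlogged packets are
    added to the request of every AS that sent at least one logged packet.
    An empty result means the traceback stops at this AS.
    """
    requests = defaultdict(set)
    unknown = set()
    for packet_id in packet_ids:
        if packet_id in log:
            requests[log[packet_id]].add(packet_id)
        else:
            unknown.add(packet_id)
    for upstream in requests:
        requests[upstream] |= unknown
    return dict(requests)


# State of AS1 as described for Fig. 3: only x1 and x3 were logged.
as1_log = {"x1": "AS2", "x3": "AS3"}
plan = plan_requests(as1_log, ["x1", "x2", "x3", "x4"])
for upstream, packets in sorted(plan.items()):
    print(upstream, sorted(packets))
```

Restricting the fan-out to ASs that have already sent at least one logged packet is what keeps the number of requests below that of the first alternative, at the cost of missing a source whose packets were never logged.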
IV. EVALUATION
To evaluate our traceback approach, we ran simulations using NS-2 (Network Simulator 2). Simulating a network of ASs (i.e. the Internet) that is close to reality is not easy, particularly because of the complex commercial agreements between ASs. First, we generated the topology of the AS network. Then, we deployed our agents (senders and receivers) as follows: 100 UDP sender agents placed randomly, and one receiver agent (the victim) that receives all incriminated packets. The simulations were run on different topologies to check the consistency of the results. We ran several simulations in which we varied the percentage of packets logged.
Figures 5, 6, 7 and 8 show the success rate of the traceback as a function of the number of transit ASs crossed by the fault packets. The results obtained for a traceback with one fault packet (see Fig. 5) are poor, because the traceback stops as soon as the packet has not been logged at the next hop. Fig. 6 shows the results obtained for a traceback with two fault packets coming from the same source. In this figure, we observe an improvement in the efficiency of the traceback, particularly with a logging probability of 80%. Fig. 7 shows that, for four packets coming from the same source, probabilistic logging at 80% is as efficient as full logging, with which the source AS of the incriminated packets is systematically obtained. Moreover, with a logging probability of 60%, we obtain results similar to those of the previous case (see Fig. 6), i.e. a logging probability of 80% used to trace back two fault packets.
In Fig. 8, we see clearly that a logging probability of 60% is sufficient to trace back six fault packets coming from the same AS.
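A rough way to see why several packets from the same source compensate for a lower logging probability is sketched below. It is not the NS-2 simulation and does not reproduce the figures. It assumes, hypothetically, that each AS logs each packet independently with probability p, and that the traceback can pass an AS as soon as at least one of the fault packets has been logged there. The function name and its parameters are illustrative.

```python
def success_probability(p: float, packets: int, logging_ases: int) -> float:
    """Chance that every AS on the path logged at least one of the packets.

    Assumes independent logging decisions with probability p per packet
    and per AS (a simplification, not the simulated topology).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    logged_at_one_as = 1.0 - (1.0 - p) ** packets
    return logged_at_one_as ** logging_ases


if __name__ == "__main__":
    # Three logging ASs on the path is an arbitrary choice for illustration.
    for p in (0.6, 0.8, 1.0):
        row = [round(success_probability(p, n, 3), 3) for n in (1, 2, 4, 6)]
        print(f"p={p:.1f}", row)
```

Under this assumption, the probability of losing the trail at a given AS shrinks geometrically with the number of packets, which is the effect the paragraphs above describe qualitatively.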
Fig. 5. AS-Traceback result with one packet
Fig. 6. AS-Traceback result with two packets
Fig. 7. AS-Traceback result with four packets
Fig. 8. AS-Traceback result with six packets
V. CONCLUSION
In this paper, we have first presented existing traceback techniques. Then, we have proposed an AS-traceback method that is more realistic in the context of the Internet, i.e. the real network of ASs. Through simulation, we have shown that the probabilistic logging approach is efficient provided that a sufficient number of packets come from the same real source. Our simulations have shown that with a logging probability of 80%, only 4 packets were necessary, and with 60%, 6 packets were sufficient. The next step will be to apply our approach in a network where ASs use different probabilities to log packets.
REFERENCES
[1] Cyberattacks on Estonia 2007. http://en.wikipedia.org/wiki/Estonian_Cyberattack.
[2] S. Savage, D. Wetherall, A. Karlin, and T. Anderson, "Practical network support for IP traceback", in Proc. of the 2000 ACM SIGCOMM Conference, August 2000.
[3] H. Burch and B. Cheswick, "Tracing anonymous packets to their approximate source", in Proc. of the 2000 USENIX LISA Conference, New Orleans, USA, December 2000.
[4] A. C. Snoeren, C. Partridge, L. A. Sanchez, C. E. Jones, F. Tchakountio, S. T. Kent, and W. T. Strayer, "Hash-based IP traceback", in Proc. of the 2001 ACM SIGCOMM Conference, August 2001.
[5] S. Savage, D. Wetherall, A. Karlin, and T. Anderson, "Network support for IP traceback", IEEE/ACM Transactions on Networking, vol. 9, no. 3, pp. 226-237, June 2001.
[6] A. Yaar, A. Perrig, and D. Song, "FIT: Fast Internet traceback", in Proc. of IEEE INFOCOM, Miami, USA, March 2005.
[7] A. Belenky and N. Ansari, "IP traceback with deterministic packet marking", IEEE Communications Letters, vol. 7, no. 4, pp. 162-164, April 2003.
[8] S. Chen and Q. Song, "Perimeter-based defense against high bandwidth DDoS attacks", IEEE Transactions on Parallel & Distributed Systems, July 2005.
[9] S. M. Bellovin, "ICMP traceback messages", Internet Draft, IETF, March 2000.