On the effectiveness of Martian address filtering and its extensions
Hyogon Kim, Korea University
Inhye Kang, University of Seoul
Abstract—Martian address filtering refers to a technique that discards IP packets that have an invalid source or destination address. This paper evaluates its effectiveness (or lack thereof) under denial-of-service (DoS) attacks and host scans, in terms of packet-level and flow-level filtering performance. To overcome the shortcomings of Martian address filtering, we consider two extensions: unallocated address checking and blacklisting. We demonstrate through trace-based simulation that these techniques can indeed boost filtering performance. We also analyze the performance and the possible side effects of the extensions.
Keywords—packet filtering, stateful inspection, denial-of-service attack, host scan, Martian addresses
I. INTRODUCTION
Martian address filtering [1] refers to a technique that discards IP packets with an invalid source or destination address. For instance, network 0 (i.e., 0.0.0.0/8) appearing in the destination address field, or a broadcast address appearing in the source address field, is considered illegal, and the Router Requirements RFC [1] strictly prohibits Internet routers from forwarding such packets. Table I summarizes the Martian addresses for the source and destination address fields of an IP datagram [1,2].

TABLE I. MARTIAN ADDRESS BLOCKS

Category | Source | Destination
Not unicast | Class D/E | Class E (except 255.255.255.255)
Default | Network 0 | Network 0
Loopback | 127.0.0.0/8 | 127.0.0.0/8
Default / broadcast | 0 (see footnote 1) or -1 in the host ID | 0 in the host ID
RFC1918 private | 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16 | 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
Link-local | 169.254.0.0/16 | 169.254.0.0/16
Test network | 192.0.2.0/24, 198.18.0.0/15 | 192.0.2.0/24, 198.18.0.0/15

Footnote 1: 0.0.0.0/8 can be used as part of an initialization procedure in a limited number of protocols such as BOOTP and ICMP (Mask Request). If the router is in normal operation, however, these must be discarded. We assume that routers have finished the initialization procedure and consider their behavior in the normal operation phase.
GLOBECOM 2003

Martian-addressed packets in transit are discarded because they cannot result in any meaningful communication: either the sender or the receiver is not allowed to communicate via the global Internet, is unidentifiable, or does not exist. The first source of these packets is broken implementations. The second, and major, source is network attacks. In particular, denial-of-service (DoS) attacks usually spoof the source address at random to hinder tracking, and global-scale host scans typically probe randomly generated IP addresses for vulnerable hosts [9]. Martian addresses are therefore bound to appear in the sequence of addresses they randomly forge. Consequently, Martian address filtering can also be used to detect and filter attack packets, although it was originally designed against malformed packets originating from broken implementations.

In this paper, we investigate the effectiveness of the Martian address filtering technique (or lack thereof) in detecting and filtering network attacks. In addition to the Martian addresses, we also consider the use of the “tarpit” (unallocated) IP address blocks to enhance the filtering effect. Note that these are packet-filtering schemes, and we additionally explore the impact of a flow-based extension called blacklisting. The rest of this paper is organized as follows. Section II introduces the three filtering schemes, and Section III analyzes the expected performance of blacklisting, which is the most complex of the three. Section IV evaluates the performance of the three schemes and their combinations through trace-based simulations. With a brief discussion of related work, Section V concludes the paper.

II. MARTIAN ADDRESS FILTERING AND EXTENSIONS
A. Martian address filtering
As mentioned above, Martian address filtering was originally designed to filter malformed packets rather than deliberately forged attack packets that draw on all or most of the IPv4 address space. Its shortcoming when used against network attacks, therefore, is that Martian addresses occupy only a fraction of the entire IP address space. Specifically, the addresses covered by Table I amount to less than 15% of the address space for the source field and less than 9% for the destination field. Assuming that DoS attack packets spoof their source addresses uniformly at random, vanilla Martian filtering removes at most 15% of the attack traffic. Likewise, at most 9% of random-sweep host scan packets are filtered. Worse yet, if large Martian blocks such as the Class D/E space, network 0, and the loopback network are avoided in the random address generation, which a reasonably informed attacker might well do, the filtering rate would approach zero.
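The coverage bounds quoted above can be checked with a short calculation. The following is a minimal sketch, not the authors' procedure: it assumes that the network 0, loopback, private, link-local, and test-network blocks of Table I apply to both address fields, and it leaves out the "0 or -1 in the host ID" entries because they depend on per-network prefix lengths. The names `COMMON`, `SOURCE_ONLY`, `DESTINATION_ONLY`, and `coverage` are illustrative.

```python
import ipaddress

# Table I blocks that can be written as CIDR prefixes (assumed to apply to
# both the source and the destination field).
COMMON = ["0.0.0.0/8", "127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12",
          "192.168.0.0/16", "169.254.0.0/16", "192.0.2.0/24", "198.18.0.0/15"]
SOURCE_ONLY = ["224.0.0.0/3"]        # Class D and E: not unicast
DESTINATION_ONLY = ["240.0.0.0/4"]   # Class E (255.255.255.255 not excluded here)


def coverage(prefixes):
    """Fraction of the 2**32 IPv4 addresses covered by the given prefixes."""
    nets = ipaddress.collapse_addresses(ipaddress.ip_network(p) for p in prefixes)
    return sum(net.num_addresses for net in nets) / 2**32


source_fraction = coverage(COMMON + SOURCE_ONLY)
destination_fraction = coverage(COMMON + DESTINATION_ONLY)
print(f"source: {source_fraction:.1%}, destination: {destination_fraction:.1%}")
```

Under these assumptions the two fractions stay below the 15% and 9% bounds stated in the text, and both are dominated by the Class D/E space, which is why avoiding the large blocks drives the filtering rate toward zero.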
0-7803-7974-8/03/$17.00 © 2003 IEEE
B. Filtering unallocated blocks
In addition to the Martian addresses defined in [1,2], there are unallocated blocks of addresses that are currently reserved by IANA. Although these blocks are subject to future allocation, the allocation procedure is slow enough [4,12] for the corresponding address filter configuration changes to keep up. From 1998 to 2001, IP address space utilization grew by 7% annually. Considering that about half of the entire IPv4 address space is currently allocated [7], this is roughly ten /8 blocks a year. However, the growth rate is slowing down (i.e., negative second derivative) [7]. As of December 2002, the unallocated Class A blocks are 1, 2, 5, 7, 23, 27, 31, 36, 37, 41, 42, 49, 50, 58-60, 70-79, 83-127, 197, 222, 223 [4,5]. This is 35% of the entire address space above and beyond the Martian addresses (see footnote 2). If packets originating from or destined to these addresses can be filtered, the filtering effect increases accordingly. Specifically, up to 50% of randomly spoofed DoS attacks and 44% of wide-sweeping host scans can be caught. Again, however, if the unallocated blocks are also avoided in random number generation, the effect would be diminished, although avoiding them is rather cumbersome.

Time | Source address | Source port | Destination address | Destination port
…… | …… | …… | …… | ……
09:35:23.955222 | x.x.x.x | 64218 | 72.142.101.184 | 111
09:35:23.958716 | x.x.x.x | 64232 | 72.142.101.197 | 111
09:35:23.965132 | x.x.x.x | 64310 | 197.14.58.120 | 111
09:35:23.965443 | x.x.x.x | 64311 | 197.14.58.121 | 111
09:35:23.965945 | x.x.x.x | 64314 | 197.14.58.124 | 111
09:35:23.974520 | x.x.x.x | 64322 | 197.14.58.132 | 111
09:35:23.976617 | x.x.x.x | 64331 | 197.14.58.141 | 111
09:35:24.091332 | x.x.x.x | 64424 | 19.231.216.127 | 111
09:35:24.093271 | x.x.x.x | 64423 | 19.231.216.126 | 111
09:35:24.093317 | x.x.x.x | 64422 | 19.231.216.125 | 111
…… | …… | …… | …… | ……
09:35:24.104956 | x.x.x.x | 64438 | 19.231.216.141 | 111
09:35:24.106191 | x.x.x.x | 64433 | 19.231.216.136 | 111
09:35:24.107471 | x.x.x.x | 64429 | 19.231.216.132 | 111
09:35:24.125654 | x.x.x.x | 64466 | 85.114.173.117 | 111
09:35:24.126519 | x.x.x.x | 64464 | 85.114.173.115 | 111
…… | …… | …… | …… | ……

Figure 1. Hostscan “pivots” destination IP address.

C. Blacklisting
The Router Requirements RFC [1] states that if a router discards a packet because of Martian address filtering, it should log at least the source and destination IP address. We can build on this rule to further enhance the filtering performance. With blacklisting, we “book” an IP address if it pairs with an invalid address (i.e., Martian or unallocated) more than a tolerable number of times in a given observation interval. The rationale is that with only Martian and unallocated address checking, the attack packets that fall on legitimate address blocks get away.

We implement blacklisting as follows. If invalid addresses appear in the destination field, on a particular destination port, more than a few times in the packets sent by a particular source, it is highly likely that the source is randomly probing (i.e., scanning) for vulnerable hosts that run the service represented by the destination port number. Figure 1 shows a real-life example of a hostscan looking for the vulnerable RPC service, where the source address and destination port are fixed and the destination address pivots on them. We notice that 72.x.x.x, 197.x.x.x, and 85.x.x.x are unallocated, hence illegal destination addresses. Under blacklisting, in addition to discarding the invalid packets at hand, we begin to filter all future packets from x.x.x.x (see footnote 3) for a preset amount of time, as punishment (for scanning) and as deterrence (against a possible worm epidemic [9]). Likewise, for DoS that employs source spoofing, the victim’s address is fixed but the source address is varied [6]. If the number of invalid addresses used as the source address for a destination exceeds a prescribed limit, the destination IP address is declared a DoS victim and the packets to it can be filtered as necessary.

Powerful as it may be, blacklisting must be used with care. Filtering all packets destined to the DoS victim can result in denial of service in a shifted form: the victim cannot receive any traffic. Although a somewhat involved scheme is already available to reduce such collateral damage [10], we explore a simple scheme, trading accuracy for speed. When we determine a destination IP address to be a DoS target, we associate a signature with it, composed of the packet size, protocol, and flags (for TCP only), extracted from the offending packets (see footnote 4). Among the packets destined to the suspected DoS victim, we filter only those exactly matching the signature. For host scans, on the other hand, another form of denial-of-service attack is possible. A well-informed attacker can put the victim’s IP address in the source field and transmit to an illegal address more than a few times. This can trick a blacklisting filter into identifying the victim as a scanner and erroneously blocking the traffic from it, which is a denial of service for the victim.
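The scan side of the blacklisting rule described above can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in the paper: the class name `ScanBlacklist`, the predicate `is_invalid`, the sliding-window bookkeeping, and the parameter defaults are all hypothetical, and the DoS side (keyed on the destination address, with a packet signature) is omitted.

```python
from collections import defaultdict, deque


class ScanBlacklist:
    """Books a source that hits invalid destinations `threshold` times
    on one destination port within `interval` seconds (hypothetical sketch)."""

    def __init__(self, is_invalid, threshold=3, interval=1.0, quarantine=0.1):
        self.is_invalid = is_invalid        # predicate: Martian or unallocated?
        self.threshold = threshold          # tolerated number of offenses
        self.interval = interval            # observation interval, seconds
        self.quarantine = quarantine        # filtering time once booked, seconds
        self.offenses = defaultdict(deque)  # (src, dst_port) -> offense times
        self.booked_until = {}              # src -> end of quarantine

    def should_drop(self, now, src, dst, dst_port):
        until = self.booked_until.get(src)
        if until is not None:
            if now < until:
                return True                 # source is still quarantined
            del self.booked_until[src]      # quarantine has expired
        if not self.is_invalid(dst):
            return False                    # valid destination, unbooked source
        times = self.offenses[(src, dst_port)]
        times.append(now)
        while times[0] <= now - self.interval:
            times.popleft()                 # forget offenses outside the window
        if len(times) >= self.threshold:
            self.booked_until[src] = now + self.quarantine
            del self.offenses[(src, dst_port)]
        return True                         # invalid packets are always discarded


# Usage with destinations taken from Figure 1; only the first octets needed
# for this example are listed as unallocated.
UNALLOCATED_FIRST_OCTETS = {72, 85, 197}


def is_invalid(addr):
    return int(addr.split(".")[0]) in UNALLOCATED_FIRST_OCTETS


bl = ScanBlacklist(is_invalid, threshold=3, interval=1.0, quarantine=10.0)
probes = ["72.142.101.184", "72.142.101.197", "197.14.58.120", "19.231.216.127"]
verdicts = [bl.should_drop(0.01 * i, "x.x.x.x", dst, 111)
            for i, dst in enumerate(probes)]
print(verdicts)
```

The fourth probe goes to an allocated address but is still dropped, because the third invalid probe booked the source. That is the intended gain over pure packet filtering, and it is also the mechanism behind the spoofed-source side effect noted above.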
Despite the open issues, in this paper we explore the potential of blacklisting as an experimental concept.

Footnote 2: Although there will be unallocated sub-blocks in those allocated to Regional Internet Registries (RIRs), we ignore them for simplicity and for conservative evaluation of the techniques.
Footnote 3: The scanner’s IP address is masked for privacy.
Footnote 4: Here we assume that offending packets have the same signature. If not, we could employ multiple signatures for a single attack.

III. ANALYSIS OF BLACKLISTING
In this section, we analyze the performance of blacklisting. (The packet-based filtering schemes are too simple to require analysis.) We first define the attack detection threshold T as the smallest number of per-address perpetrations to be considered an attack in a given observation interval I. We then analyze the time the blacklisting filter takes to detect an attack, and the probability of detection failure.

A. Time to detection
Given the IPv4 space allocation ratio L and the attack intensity R, the number of address selections (i.e., attack packet arrivals) required to reach T follows the Pascal distribution with mean T/(1-L) and variance TL/(1-L)^2. Assuming the attack packets arrive at a constant pace, it takes T/((1-L)R) on average to detect an attack exceeding the threshold. Not surprisingly, the detection time is shorter for a lower threshold, for a more intense attack, and for a smaller fraction of legitimate address space (i.e., more illegal addresses). Assuming L≈0.5 [7], the average detection time is 2T/R.

B. Probability of detection failure
Here, we analyze the probability that we miss an attack flow. During I, a total of N = RI attack packets arrive. The “impunity” probability pn of the flow not being blacklisted is given by:

$$p_n(N, T, L) = \sum_{i=0}^{T-1} \binom{N}{i} (1-L)^{i} L^{N-i}$$

Again for L≈0.5, this becomes

$$p_n(N, T) = 2^{-N} \sum_{i=0}^{T-1} \binom{N}{i} \qquad (1)$$
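The two results of this section are easy to evaluate numerically. The sketch below is an illustration, not code from the paper: the function names are hypothetical, and `smallest_n` inverts the impunity probability by linear search.

```python
from math import comb


def mean_detection_time(T, R, L=0.5):
    """Average time to accumulate T invalid-address hits at attack rate R."""
    return T / ((1 - L) * R)


def impunity(N, T, L=0.5):
    """Probability that fewer than T of N randomly chosen addresses are invalid."""
    return sum(comb(N, i) * (1 - L) ** i * L ** (N - i) for i in range(T))


def smallest_n(target, T, L=0.5):
    """Smallest N for which the impunity probability is at most `target`."""
    N = T
    while impunity(N, T, L) > target:
        N += 1
    return N


print(mean_detection_time(T=1, R=10))   # seconds, for L = 0.5
print(impunity(N=10, T=1))              # equals 2**-10 for L = 0.5
print(smallest_n(target=0.005, T=1))
```

For T = 1 and L = 0.5 the sum collapses to 2^-N, so eight attack packets already push the impunity probability below 0.5%.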
Figure 2(a) plots pn as a function of N for some T values. It turns out that pn is a relatively fast decreasing function of N. Note that the ratio N/T required to achieve a low pn decreases with T.

[Figure 2(a): plot of pn (vertical axis, 0 to 1) versus N (horizontal axis, 0 to 35) for T=1, T=5, and T=10]
Figure 2(a). pn as a function of N for various T.

[Figure 2(b): plot of N (vertical axis, 0 to 100) versus T (horizontal axis, 0 to 35) for pn=0.005, pn=0.01, pn=0.05, pn=0.1, and pn=0.5]
Figure 2(b). Smallest N required to achieve a target pn.

Figure 2(b) shows the smallest possible N for a target pn as a function of T for some pn values. These curves can be linearly approximated for the given T values, and give us a guideline to set T. Namely, if R = N/I is the threshold attack rate, we can set

$$T = \alpha (IR - \beta) \qquad (2)$$

For pn=0.5% as an example, α=0.4 and β=10.

IV. TRACE-BASED SIMULATION
In this section, we evaluate the filtering performance of four methods: vanilla Martian address filtering (M), Martian address filtering with unallocated address checking (M+U), Martian address filtering with blacklisting (M+B), and Martian address filtering with unallocated address checking and blacklisting (M+U+B). We use two metrics to compare the filtering performance of these methods: the packet-level filtering rate and the flow-level filtering rate. In addition to the threshold T, we define one more parameter for blacklisting: the quarantine Q is the amount of time during which the packets using the blacklisted address are filtered. When the quarantine is over, the address is removed from the blacklist.

For the evaluation, we use a real-life trace collected from one of the four Internet exchanges in Korea in December 2001. This trace is known to contain a few source-spoofed denial-of-service (DoS) attacks and numerous host scans and port scans [6]. The trace was captured from two trans-pacific T-3 links. The total traffic rate was over 90 Mbps (bidirectional), and the number of packets in the trace is over 625 million.

A. Packet-level filtering effect
Table II shows the fraction of packets let go (including legitimate ones) and the fraction filtered. In parentheses, we show the T and Q values used for each method: for M and U, the number in parentheses is the T value, and for B, it is the Q value. The first thing we notice is that vanilla Martian filtering is fairly ineffective compared with filtering that adds unallocated address checking. Note that M+U gives the lower bound of the filtering rate without false positives (i.e., an illegitimate address has been used after all). Relative to M, M+U raises the filtering rate to about 300% for host-scan-suspected traffic and about 670% for DoS-suspected traffic.
Assuming most attacks utilize the full IPv4 space (whether by source spoofing or by random destination probing), M and M+U give us a hint about the attack intensity in the trace. According to M, the host scan intensity is 0.88%/0.09 ≈ 9.78% and the DoS intensity is 0.47%/0.15 ≈ 3.13%, making the total 12.91%. On the other hand, according to M+U, the estimates are 2.64%/0.44 ≈ 6% for host scan and 3.15%/0.5 ≈ 6.3% for DoS, making the total 12.3%. The discrepancy between the compositions seems to imply that the particular DoS attacks captured in the trace avoid Martian blocks. However, the total intensities coincide at 12-13%. Thus, although M+U reports 94.1% of the traffic as unfiltered, the truly legitimate packets are only 87-88%. This gives us a baseline to evaluate the performance of blacklisting.

We observe that blacklisting is indeed powerful. With M+U+B, more than half of the packets are classified as attack packets. But the calculation above tells us that blacklisting with T=1 and Q=∞ is probably too harsh. An investigation into the cause reveals the surprising fact that supposedly unallocated addresses are being used in “normal” communications. Under M(1)+U(1), such packets account for 6.1% of host scan packets and as much as 29.1% of DoS packets. This is blatant piracy on the part of the end network that uses the addresses, and negligence on the part of the service provider for the network: the provider should have filtered the BGP advertisement. In any case, under blacklisting M(1)+U(1)+B(∞), an innocent host that receives such a packet is permanently classified as a DoS victim, with all subsequent legitimate packets to it filtered. A similar explanation applies to host scans. We believe address piracy is the single most influential reason that blacklisting overfilters. We explore the impact of the threshold and the quarantine values in subsections C and D.
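The intensity extrapolation used earlier in this subsection divides each filtered share by the fraction of the address space that the corresponding filter covers. A minimal sketch of that arithmetic follows, with the Table II shares and the Section II coverage figures entered by hand; the dictionary and function names are illustrative.

```python
# Filtered shares from Table II, in percent of all packets.
filtered = {
    "M":   {"host scan": 0.88, "DoS": 0.47},
    "M+U": {"host scan": 2.64, "DoS": 3.15},
}
# Fraction of the IPv4 space each filter covers (Section II): host scans are
# judged on the destination field, spoofed DoS packets on the source field.
coverage = {
    "M":   {"host scan": 0.09, "DoS": 0.15},
    "M+U": {"host scan": 0.44, "DoS": 0.50},
}


def estimated_intensity(method):
    """Extrapolated attack share (percent) per attack type for one method."""
    return {kind: filtered[method][kind] / coverage[method][kind]
            for kind in filtered[method]}


for method in filtered:
    estimate = estimated_intensity(method)
    total = sum(estimate.values())
    print(method, {k: round(v, 2) for k, v in estimate.items()}, round(total, 2))
```

Both methods land at a total of 12 to 13 percent even though their host scan and DoS components differ, which is the observation the text relies on.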
TABLE II. PACKETS FILTERED AND PASSED

Filtering method | Not filtered | Filtered as DoS | Filtered as host scan
M(1) | 98.54% | 0.47% | 0.88%
M(1) + U(1) | 94.10% | 3.15% | 2.64%
M(1) + B(∞) | 42.02% | 26.04% | 31.89%
M(1) + U(1) + B(∞) | 38.53% | 29.20% | 32.22%

B. Flow-level filtering effect
The flow-level filtering rate is a more important metric for flow-level or stateful inspection devices (e.g., stateful firewalls, flow-level monitors) [11]. This is because each attack packet constitutes a flow of its own, since an address field is varied per packet, whereas in normal flows all 5 fields in the flow definition remain the same within a flow. In Table III, we show the total number of unfiltered flows during the 8-hour duration of the trace. We use a flow inactivity timeout of 30 seconds and infinite quarantine time for the blacklisting methods.

TABLE III. NUMBER OF FLOWS BEFORE AND AFTER FILTERING

Filtering method | Unfiltered flows
None | 3946269
M(1) | 3677298 (93.18%)
M(1) + U(1) | 3468156 (87.88%)
M(1) + B(∞) | 1116949 (28.30%)
M(1) + U(1) + B(∞) | 978768 (24.80%)

Compared with the packet-filtering rate in Table II, the flow-filtering rate is higher for all methods. This is expected since, as mentioned above, each individual attack packet constitutes a flow. Applying the same extrapolation as in subsection A, we can estimate the lower bound of the attack flows at more than 24% (i.e., M(1)+U(1) filters roughly 12%, and more than half of the IPv4 space is allocated). Conversely, the lower bound of truly legitimate flows is 24.80% (no extrapolation is necessary).

C. Effect of quarantine in blacklisting
Permanent quarantine is impractical (i.e., memory usage keeps increasing) and could be unfair (e.g., a source address can be quarantined forever for erroneously sending a packet to an unused destination address). To keep an attacker with attack intensity R in the quarantine, we need the following condition.

$$R > \frac{T}{Q} \qquad (3)$$

Below, we use this condition as the guideline to set Q. For instance, if we want to quarantine attacks with a rate of over 10 packets/s at T=1, we set Q to 0.1 s.
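Condition (3) can be read in either direction: as the shortest quarantine that holds an attacker of a given rate, or as the lowest attack rate that a given quarantine still holds. The helper names below are hypothetical and the sketch only restates the boundary case of the inequality.

```python
def min_quarantine(T, R):
    """Boundary of condition (3): Q must exceed T / R seconds."""
    return T / R


def target_rate(T, Q):
    """Boundary of condition (3): attacks faster than T / Q packets/s stay quarantined."""
    return T / Q


print(min_quarantine(T=1, R=10))    # seconds, for a 10 packets/s attack
print(target_rate(T=1, Q=0.1))      # packets per second, for Q = 0.1 s
```

With T = 1, a quarantine of 0.1 s corresponds to a target rate of 10 packets/s, matching the example in the text; halving Q doubles the rate an attacker needs in order to remain blacklisted.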
Figure 3. Packet filtering for M(1)+U(1)+Q(q): percentage of packets not filtered, filtered as host scan, and filtered as DoS, as a function of q.

In Figure 3, we evaluate the impact of the quarantine value for T=1. The horizontal axis represents Q in seconds. According to (3), the target attack rate we want to detect varies from 100 packets/s to 5 packets/s. Larger values of Q are not very meaningful, since they target even lower attack rates, whereas DoS attacks and global-scale host scans have much higher rates. For instance, the SQL Slammer worm forced an infected machine to transmit 4,000 scan packets per second on average [9]. In the figure, as the threshold attack rate decreases, more packets are discarded, as we expect. The fraction of unfiltered packets ranges from 81.1% to 93.6%. Provided our estimate of attack traffic intensity (IV-A) is in the right range, this result implies that the average attack rate is between 5 and 10 packets per second. This in turn means that relatively low-speed host scanning is more dominant than DoS. Increasing T leads to a phenomenon similar to the one we observe in Figure 3, but the difference is not significant, so we omit the results for lack of space.

D. Effect of threshold in blacklisting
Figure 4 shows the packet-filtering effect of T for M(T)+U(T)+Q(0.1) in blacklisting. We set Q=0.1 based on the analysis in the previous subsection, which indicates that the average attack intensity is in that range. The figure shows that the effect of T is minimal. This implies that there are not many legitimate flows that are erroneously classified as attacks with Q=0.1. In other words, the number of offenses committed by the DoS flows and host scan flows (N) is mostly much larger than the threshold (T), to the extent that guideline (2) does not really matter. This is understandable, since the number of packets mobilized in a DoS attack or a hostscan is very high. Even for a very high threshold, they end up violating it and getting caught. On the other hand,
unintentional offenders that unknowingly send a handful of packets to Martian or unallocated addresses are so few that raising T does not visibly increase the number of unfiltered packets. So if we set the threshold higher than suggested in (2), we can make pn virtually zero for the attack packets, while safely letting unintentional offenders pass.

[Figure 4: plot of the percentage of packets (vertical axis, 0 to 90) not filtered, filtered as host scan, and filtered as DoS, versus t (horizontal axis, 0 to 30)]
Figure 4. M(t)+U(t)+Q(0.1).

In terms of the number of flows to be maintained in the filter, a high T also incurs no visible overhead. For T = 30, for instance, the additional overhead is less than 1%. (Due to space constraints, we do not show the graph.) This again confirms that attack flows are identified as such for almost any value of T, whereas there are not many “innocent” offenders.

E. Effect of signatures in DoS blacklisting
In Section II-C, we discussed the possibility of using signature-based blacklisting for DoS packet filtering. Here we turn on the DoS signature checking and examine its impact on the packet filtering rate. We use Q = 0.1 and various T values for this experiment. The result is that the DoS filtering ratio, relative to Figure 4, is 0.6, roughly independent of T (the graph is uninteresting and not shown here). This means that at least 40% of the packets associated with a victim’s address can be unrelated to the DoS attack. It shows that the signature method can be quite effective in salvaging packets that would otherwise be discarded by blind blacklisting. In future work, we will analyze the false positive and false negative rates of the signature method.

Figure 5 shows the number of outstanding flows with and without filtering. The horizontal axis is time in seconds. The starting point is 9:35 a.m., Dec. 14, 2001. The upper curve shows the number of flows before any filtering is done. The lower curve results from applying M(1)+U(1)+B(0.1) with the DoS signature. It is shown in [6] that the sharp upward spikes are the result of DoS attacks, especially for t = [36000, 50000]. We notice that the filtering eliminates them and prevents state explosion: the filtered curve stays flat during the onset of the attacks.

[Figure 5: plot of the number of flows (vertical axis, 60000 to 220000) versus time in seconds (horizontal axis, 35000 to 60000), with filtering and with no filtering]
Figure 5. Number of outstanding flows before and after filtering.

V. CONCLUSION
We evaluate two approaches to enhancing the performance of Martian address filtering against deliberately forged attack packets. Unallocated address block checking increases the filtering rate 3 to 5 times, depending on the attack type. In our estimate, unallocated address block checking alone could eliminate about 36% of attack packets. Blacklisting exhibits powerful filtering performance, but the collateral damage it incurs calls for significant refinement before it is considered for deployment in real networks. This is a subject of our future work.

The only related work we are aware of at the time of writing is Bellovin et al. [3]. It utilizes Martian filtering and general address allocation policies to show that current backbone routing table sizes can be significantly reduced. The idea of punishing misbehaving flows is not new. Although in the context of congestion control, Floyd and Fall [8] suggest actively punishing the flows that are persistently unresponsive to congestion signals.

REFERENCES
[1] F. Baker, Requirements for IPv4 routers, RFC 1812.
[2] IANA, Special-use IPv4 addresses, RFC 3330.
[3] S. Bellovin, R. Bush, T. G. Griffin, and J. Rexford, “Slowing Routing Table Growth by Filtering Based on Address Allocation Policies,” preprint, available from http://www.research.att.com/~jrex, June 2001.
[4] IANA, Internet Protocol Version 4 Address Space, http://www.iana.org/assignments/ipv4-address-space.
[5] CAIDA, IPv4 Address Space Utilization, http://www.caida.org/outreach/resources/learn/ipv4space/.
[6] I. Kang and H. Kim, “Determining Embryonic Connection Timeout for Stateful Inspection,” IEEE International Conference on Communications (ICC), May 2003.
[7] A. Broido, E. Nemeth, and K. C. Claffy, “Internet Expansion, Refinement, and Churn,” a NANOG presentation, Feb. 2002. http://www.caida.org/~broido/nanog200202.egr.pdf.
[8] S. Floyd and K. Fall, “Promoting the Use of End-to-End Congestion Control in the Internet,” IEEE/ACM Transactions on Networking, August 1999.
[9] CAIDA, “Analysis of the Sapphire Worm,” http://www.caida.org/analysis/security/sapphire/, Jan. 30, 2003.
[10] M. Poletto, “Practical Approaches to Dealing with DDoS Attacks,” http://www.nanog.org/mtg-0105/ppt/poletto.ppt, a NANOG presentation, May 2001.
[11] G. Iannaccone et al., “Monitoring Very High Speed Links,” Internet Measurement Workshop, 2001.
[12] S. Marcus, IPv4 Address Space Allocation and Usage Trends, NANOG presentation, 2002.