ALPi: A DDoS Defense System for High-Speed Networks


IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 24, NO. 10, OCTOBER 2006

Paulo E. Ayres, Huizhong Sun, H. Jonathan Chao, Fellow, IEEE, and Wing Cheong Lau

Abstract—Distributed denial-of-service (DDoS) attacks pose a significant threat to the Internet. Most solutions proposed to date face scalability problems as the size and speed of the network increase, and no DDoS solution has been widely deployed in the industry. PacketScore has been proposed as a proactive DDoS defense scheme that detects DDoS attacks, differentiates attack packets from legitimate ones by packet scoring (where the score of a packet is calculated from the attribute values it carries), and discards packets whose scores are lower than a dynamic threshold. In this paper, we propose ALPi, a new scheme that extends the packet scoring concept with reduced implementation complexity and enhanced performance. More specifically, a leaky-bucket overflow control scheme simplifies the score computation and facilitates high-speed implementation. An attribute-value-variation scoring scheme analyzes the deviations of the current traffic attribute values and increases the accuracy of detecting and differentiating attacks. An enhanced control-theoretic packet discarding method allows both schemes to adapt better to challenging attacks, such as those with ever-changing signatures and intensities. Combined, the proposed extensions not only greatly reduce the memory requirement and implementation complexity but also substantially improve the accuracy of attack detection and packet differentiation. This makes ALPi an attractive DDoS defense system amenable to high-speed hardware implementation.

Index Terms—Denial-of-service (DoS) attack, network security, overload control, packet differentiation.

I. INTRODUCTION

Manuscript received September 20, 2005; revised April 11, 2006. The work of H. J. Chao was supported in part by New York State, NYSTAR. The work of W. C. Lau was supported in part by the Chinese University of Hong Kong under RGC/CUHK Direct Grant 2050368. P. E. Ayres, H. Sun, and H. J. Chao are with the Department of Electrical and Computer Engineering, Polytechnic University, Brooklyn, NY 11201 USA (e-mail: [email protected]; [email protected]; [email protected]). W. C. Lau is with the Chinese University of Hong Kong, Shatin, Hong Kong (e-mail: [email protected]). Digital Object Identifier 10.1109/JSAC.2006.877136

DISTRIBUTED DENIAL-OF-SERVICE (DDoS) attacks aim to interrupt localized Internet services by flooding the victim with a high volume of malicious packets originating from many different sources. Methods based on packet marking, traceback protocols [1]–[4], and pushback mechanisms [5], [6] have been used to tackle DDoS attacks. Intrusion pattern recognition approaches have also been proposed by the data mining community to automate the extraction of hidden attack signatures, using offline machine-learning techniques as in [7], [8] and online ones as in the D-WARD approach [9]. A combination of static and dynamic statistical filters has also been proposed [10].

There are also many signature-based commercial DDoS solutions that can detect and mitigate specific types of known DDoS attacks, especially those generated by common DDoS attack tools. However, their signature-based approach makes them vulnerable to never-seen-before DDoS attacks. For instance, while the Arbor Networks product [11] mitigates DDoS attacks with the traceback approach, it requires a precise characterization of the attacking packets as part of its input. Mazu, Cisco (who acquired Riverhead Networks), and Cyberoperations all have products [12]–[14] that employ statistics-based adaptive filtering techniques and anomaly detection by comparing the current traffic against baseline models. However, most of these solutions do not fully automate the packet differentiation and selective discarding processes. Instead, they only recommend a set of binary filter rules to the network administrator. The Webscreen [15] and Stealthwatch [16] network appliances both use packet scoring and selective discarding based on dynamic score thresholds.¹

An ideal DDoS defense system should be flexible enough to cope with new and more sophisticated attacks in the future, and should offer online automated approaches that scale in terms of network operating speed and the number of potential targets to be protected.

PacketScore, described in [17], proposes a statistics-based overload control approach that efficiently addresses key scalability issues in a backbone implementation, allowing a large number of targets to be protected at high speed. It is a proactive defense system by nature, able to detect and block never-seen-before attacks. Essentially, it detects and filters DDoS attacks based on a packet-scoring approach.
Arriving packets are given scores based on their transmission control protocol/Internet protocol (TCP/IP) attribute values as compared with nominal traffic profiles, and are selectively discarded if their scores fall below a dynamic threshold.

In this paper, we present key enhancements to PacketScore [17], which reduce its implementation complexity while boosting accuracy in attack detection and packet differentiation. Our enhancements also make the system more adaptive against increasingly sophisticated DDoS attacks.

The rest of this paper is organized as follows: Section II provides an overview of the previously proposed PacketScore scheme. Section III presents the motivations and advantages of ALPi.² Sections IV and V describe the implementation of the proposed leaky-bucket (LB) and attribute-value-variation (AV)-based scoring schemes, respectively. Section VI describes the use of a new proportion-integration (P/I)-based overload control system. Section VII evaluates the performance of the new schemes. Section VIII gives the conclusion of this paper.

¹This packet scoring approach is introduced at the end of this section.
²Pronounced “El pie.” “A” stands for attribute-value-variation, “L” for leaky-bucket, and “Pi” for proportion integration.

0733-8716/$20.00 © 2006 IEEE

II. OVERVIEW OF CLP-BASED PACKETSCORE SCHEME

Here, we review the previously proposed PacketScore scheme [17]. Fig. 1 depicts the support of distributed detection and overload control by multiple detecting-differentiating-discarding routers (3D-Rs) on a defense perimeter and DDoS control servers (DCSs). Let $n$ be the total number of 3D-Rs along the defense perimeter. The use of a DCS reduces the peer communications among the 3D-Rs to $O(n)$, and spares the 3D-Rs from the burden of managing a large number of per-end-point-target nominal traffic profiles. Since a DCS exchanges only control messages with the 3D-Rs, it can be safely kept away from the normal data path, i.e., out of the reach of potential DDoS attack traffic. To facilitate load balancing and improve scalability, the set of potential end-point targets within a domain can be partitioned among multiple DCSs.

Fig. 1. Deployment of 3D-Rs and DCSs to tackle end-point DDoS attacks.

Fig. 2. Illustration of the PacketScore scheme.

The PacketScore scheme [17] uses a statistics-based Bayesian method called conditional legitimate probability (CLP) to calculate packet scores, and is hereinafter referred to as the CLP-based scheme. It consists of the following three phases.
• Attack detection and victim identification by monitoring four key traffic statistics of each protected target (packets-per-second, bits-per-second, number of active flows, and new arriving flow rate), while keeping minimum per-target state. The key traffic parameters are compared with the nominal traffic profile parameters. A DCS aggregates the reports from multiple 3D-Rs on a defense perimeter to confirm whether there is actually an ongoing attack.
• Differentiate attacking packets from legitimate ones by giving a score to every packet destined to the identified victim. Scores are determined by comparing the current traffic profile against the nominal traffic profile. More specifically, they are computed by CLP and stored in the form of scorebooks.
By this method, an attribute value shared by attacking (legitimate) packets is assigned a lower (higher) score, because its relative frequency in the current traffic profile increases (decreases) with respect to the nominal one. As a result, PacketScore can efficiently differentiate legitimate packets among suspicious traffic.
• Discard packets selectively by comparing each packet's score with a dynamic threshold, which is adjusted according to: 1) the score distribution of all suspicious packets and 2) the congestion level of the victim.

Fig. 2 summarizes the PacketScore scheme. Each arriving packet obtains a set of partial scores from the scorebook via a lookup operation, according to the attribute values it carries. The packet score, which is the sum of the packet's partial scores, is then compared with a dynamic threshold in the overload control unit. Packets whose scores are less than the threshold (like packet #3 in Fig. 2) are discarded.

A nominal profile is a set of baselines collected during a period in which the protected network was presumed free of attacks. It characterizes the traffic within a certain period of time by measuring the average throughput in packets or bytes per second. The profile is kept in the form of a set of normalized histograms, one for each packet attribute of interest. During system operation, the same set of histograms is constructed via online real-time traffic measurements to characterize the current traffic profile. By comparing the current profile against the nominal one, PacketScore is able to distinguish legitimate packets from DDoS attacking packets. The following attributes are currently used for traffic profiling: IP protocol-type, packet size, time-to-live (TTL) value, TCP server port number, 16-bit source/destination IP address prefixes, TCP/IP header length, and TCP flag patterns.

Iceberg-style histograms, defined in [18], are used so that the nominal profile includes only the non-null attribute values (icebergs) that appear more frequently than a preset threshold, say x%. This keeps the profile to a manageable size and reduces the lookup time.
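The iceberg-style histogram described above can be illustrated with a minimal sketch. It is not the hardware algorithm of [18] or [19]: it counts every value exactly in memory and then filters by relative frequency. The function name `iceberg_histogram`, the 1% default, and the sample flag values are assumptions made only for this illustration.

```python
from collections import Counter


def iceberg_histogram(values, threshold=0.01):
    """Return {attribute value: relative frequency} for values above threshold.

    Assumption: exact counting over the whole period; a hardware design
    would use an approximate, memory-bounded method instead.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {v: c / total for v, c in counts.items() if c / total > threshold}


# Hypothetical TCP flag values seen in one measurement period (100 packets).
flags = [2] * 30 + [16] * 50 + [24] * 19 + [17] * 1
profile = iceberg_histogram(flags, threshold=0.01)
```

In this hypothetical sample, flag value 17 accounts for exactly 1% of the packets, which does not exceed the threshold, so it is left out of the profile while values 2, 16, and 24 are kept.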
Iceberg-style histograms require two passes over the input data to collect nominal profile data. A one-pass iceberg-style histogram maintenance/update is implemented efficiently in hardware by applying a two-stage pipelined approximation similar to the one proposed in [19]. In this method, the data scan is divided into periods, where period $k-1$ scans for the icebergs to be accounted in period $k$, which in turn scans for the icebergs to be used in period $k+1$, and so on, as in Fig. 3. This figure contains real attribute values and frequencies from the flag nominal profile, using a 1% threshold. Arriving packets in period $k-1$ possessed flag attribute values 2, 16, 17, 18, 20, and 24. These values (or icebergs) are accounted in period $k$, with the number of occurrences being 2235, 3850, 154, 88, 101, and 991, respectively. At the same time, in period $k$, arriving packets have flag attribute values 2, 16, 17, 19, 22, and 24, composing the icebergs to be accounted in period $k+1$.

Fig. 3. Iceberg detection in real-time.

Scoring is obtained as a direct comparison of nominal and measured profiles using CLP as a metric.³ After the scores are computed, it is necessary to calculate a score threshold that will distinguish legitimate packets from attacking ones. The threshold score is determined based on target/maximum-system throughput requirements, which regulate the output throughput. This overload control process is achieved by creating and maintaining a cumulative distribution function (CDF) of the scores of all incoming packets using one-pass quantile computation techniques, as in [20]–[22]. Next, the discarding threshold is calculated (and dynamically adjusted) using the load-shedding algorithm, as in [23]. According to this algorithm, the congestion level of the victim is measured, allowing the victim system to opportunistically accept more potentially legitimate traffic as its capacity permits. The resulting threshold is simply a CDF entry. Incoming packets whose CDF values are below this threshold are discarded, as shown in Fig. 4. The key idea here is to prioritize, forward, and drop packets based on their score values.

Fig. 4. Outgoing traffic maintenance using discarding threshold updates.

³For further details on CLP, please refer to [17, eqs. (1)–(3)].

III. IMPROVEMENT OVER THE CLP-BASED SCHEME

We propose new methods to replace the CLP-based scheme and achieve high-speed operation, e.g., 10 Gb/s. In the CLP-based scheme, a scorebook, i.e., a collection of the scores of all attribute values, is first generated based on Bayesian CLP. The score associated with each attribute value is obtained from two histograms: the currently measured profile and the nominal profile. The implementation complexity arises from the calculation of these two histograms for each packet attribute. In contrast, the LB-based scheme does not need to calculate a measured profile histogram, nor does it need to calculate any kind of histogram in real time. Instead, it assigns an LB to each attribute value and determines a score for each attribute value based on the number of overflows of the associated LB. The scorebook can be readily constructed by keeping track of the overflow count of each bucket, which, in turn, requires only standard LB memory access and update operations.

Another proposed scoring method, called AV, improves the accuracy of packet discarding⁴ under all circumstances, as compared with the CLP- and LB-based schemes. This is achieved by using an attribute value variance instead of simple attribute values as an LB threshold. It is less complex than the CLP-based scheme but more complex than the LB-based scheme. The complexity comes from the need to calculate the variance for each attribute value during the nominal profile.

It is very challenging to provide effective overload control when a system is under fast-changing DDoS attacks. The previously proposed PacketScore scheme uses a CDF and a load-shedding algorithm to generate the discarding threshold. Packets with scores lower than the threshold are discarded. However, if an attacker changes its attack type and intensity, the threshold, which was valid for a certain range of scores, would very likely become invalid, compromising the differentiation capacity until a more adequate threshold is dynamically set. This situation tends to worsen because the scores of one measurement period are used in the next period, while the attacks continue to change. We have observed that, at the moment the attacks change, spikes of admitted traffic appear (due to the threshold invalidation explained above), sometimes lasting for a relatively long period of time. Even with frequent threshold updates over a short period of time (the only way to revalidate the threshold), the CLP scheme still suffers from this problem.

To address this problem, we apply the classical proportion integration (P/I) scheme from control theory to determine the discarding threshold dynamically. This control-theoretic approach not only makes the system simpler, but also helps to reduce the computational and memory requirements of the system. The P/I scheme also provides a higher degree of independence from the scores generated in the previous period, and adapts faster to new attacks than the CDF-based load-shedding scheme used in the original PacketScore scheme. Its superior performance over the original PacketScore scheme will be demonstrated through a series of simulations in Section VII.

⁴The capacity to distinguish legitimate packets from attacking packets, and discard the attacking ones with as much accuracy as possible.
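To make the control-theoretic idea more concrete, the following is a minimal sketch of a discrete proportional-integral update of a score threshold. It is not the design of Section VI: the class name `PIThresholdController`, the gains `kp` and `ki`, the normalization of the error, and the convention that packets scoring above the threshold are discarded (as would suit an overflow-count score) are all assumptions made for illustration.

```python
class PIThresholdController:
    """Discrete proportional-integral controller for a score threshold.

    Assumption: packets whose score exceeds the threshold are discarded,
    so lowering the threshold sheds more traffic.
    """

    def __init__(self, kp: float, ki: float, initial_threshold: float):
        self.kp = kp                      # proportional gain (hypothetical value)
        self.ki = ki                      # integral gain (hypothetical value)
        self.base = initial_threshold
        self.integral = 0.0
        self.threshold = initial_threshold

    def update(self, admitted_rate: float, target_rate: float) -> float:
        # Normalized error: negative when more traffic is admitted than targeted.
        error = (target_rate - admitted_rate) / target_rate
        self.integral += error
        raw = self.base + self.kp * error + self.ki * self.integral
        self.threshold = max(0.0, raw)    # a score threshold cannot be negative
        return self.threshold


controller = PIThresholdController(kp=10.0, ki=5.0, initial_threshold=100.0)
first = controller.update(admitted_rate=1500.0, target_rate=1000.0)   # overloaded
second = controller.update(admitted_rate=1000.0, target_rate=1000.0)  # on target
```

The threshold here depends only on the measured admitted rate and the accumulated error, not on a score distribution from the previous period, which is the property the text attributes to the P/I approach.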



Fig. 5. Nominal and measured profile related to packet scoring on the LB-based scheme.

IV. LB-BASED PACKETSCORE SCHEME

LB is a well-known traffic enforcement/shaping algorithm and is usually implemented at the network edges to ensure that a user's traffic complies with the negotiated traffic parameters. Conceptually, an LB consists of a bucket with a given size and a given drain rate. In the context of traffic control, arriving packets are considered nonconforming if they would cause the bucket to overflow. They can be either discarded (for enforcement) or delayed (for shaping).

Let us introduce the notation $A_{i,j}$ to represent a particular TCP/IP attribute value. In this case, $i$ is an index that uniquely identifies a TCP/IP attribute, while $j$ represents the value of this TCP/IP attribute $i$. An attribute value $A_{i,j}$ could represent, for example, a specific protocol type, a server port (e.g., HTTP), a TTL value, etc.

Here, an LB is maintained for each attribute value $A_{i,j}$, with a given size and a drain rate that are derived from the histogram of attribute $i$ in the nominal profile.⁵ By measuring the LB overflow frequency (more precisely, the overflow count in a measurement period), we can determine how discrepant the measured-traffic and nominal profiles are. This overflow frequency is regarded as a partial score for the associated attribute value. The total score of an arriving packet destined to the identified victim is the sum of all its partial scores. The bucket size, also called threshold $TH_{i,j}$, is determined as follows:

$$TH_{i,j} = T \cdot R \cdot P_{i,j} \tag{1}$$

where $T$ is the nominal profile measurement interval in seconds, $R$ is the number of packets-per-second measured during $T$, and $P_{i,j}$ is the percentage of packets in the nominal profile having attribute value $A_{i,j}$. Both $R$ and $P_{i,j}$ are obtained from the nominal profile, and their product constitutes the drain rate $R \cdot P_{i,j}$.

⁵Attribute values can be a single entry or a group. This allows values that do not show up frequently, or do not show up at all in the nominal profile, to be accounted for. Refer to Section IV-B for further details.

A. LB-Based Scoring

In this section, we formalize the notion of LB-based packet scoring. Consider all the packets destined for an identified victim. A packet $p$ carries a set of discrete-valued attribute values $A_{1,j_1}, A_{2,j_2}, \ldots, A_{N,j_N}$, where attribute 1 could be the TTL value, attribute 2 the server port number, attribute 3 the packet size in bytes, and so forth up to attribute $N$, where $N$ is the number of attributes. Let $O_{i,j}$ be the number of overflows of the LB associated with $A_{i,j}$. The packet score is then the sum of the overflows of the attribute values in the packet:

$$S(p) = \sum_{i=1}^{N} O_{i,j_i} \tag{2}$$

Fig. 5 shows an example to illustrate how the scores of packet #1 (on the bottom left of the figure) and packet #2 (on the bottom right) are obtained. There are two sets of histograms; the top three belong to the nominal traffic profile and the bottom three belong to the current traffic profile. The three histograms are associated with three attributes: TTL, destination port number, and packet size. Each histogram indicates the relative frequencies of the attribute values that exceed a given threshold (e.g., 1%). For instance, the relative frequencies of TTL values 23, 26, 54, 61, and 251 are 10%, 5%, 13%, 7%, and 3%, respectively. Based on (1), the threshold of each LB (i.e., each attribute value), $TH_{i,j}$, is calculated and listed below the top three histograms. For example, the $TH_{i,j}$ of TTL value 23 is $10^6$ packets, where the relative frequency is 10%, $R$ is $10^6$ packets/s, and $T$ is 10 s. The discrepancy between the histograms of the nominal traffic profile and those of the current traffic profile is reflected by the number of LB overflows, as shown below the bottom set of histograms. For instance, one of the LBs overflows 20 times in the measurement period. In an actual system, the histograms of the current traffic profile are not required. They are shown here to facilitate the explanation.

A packet score is the sum of the partial scores of the packet attributes of interest. The partial score is the number of overflows of the associated LB. For instance, packet #1's score, 49, is the sum of 20, 14, and 15, while packet #2's score is 2. The higher the score, the more the attribute values of the packet deviate from the nominal traffic, and thus the higher the probability that the packet is an attack packet.

Fig. 6 illustrates how packet scores are calculated. Here, we assume $R$ to be $10^6$ packets/s and the measurement interval $T$ to be 10 s, yielding $T \cdot R = 10^7$ packets, which is multiplied by the relative frequency of each attribute value. Fig. 6 shows an incoming packet and its attribute values (TTL, server port, and packet size in bytes). Take the TTL attribute as an example. The LB associated with TTL value 26 has its threshold set to 500 000, as calculated in Fig. 6 based on (1), given $T = 10$ s, $R = 10^6$ packets/s, and $P_{i,j} = 5\%$ (found in the TTL nominal traffic profile in Fig. 5). When the packet arrives, its TTL value is identified as 26, and the level of the corresponding LB is increased by one. If the new level is higher than $TH_{i,j}$, the number of LB overflows is also increased by one. The LB thresholds for the TTL and server port attributes are shown in Fig. 5. The number of overflows of the second TTL LB, corresponding to TTL value 26 (with the value of 5% in Fig. 5), is represented by $O_{1,2}$ in Fig. 6. This value is the partial score for $A_{1,2}$. Partial scores for the other attribute values are calculated similarly.

Fig. 6. Profile measurement and scorebook generation on the LB-based scheme.

One should observe that some LBs are shared among different attribute values. This is the case, for example, for the packet-size attribute (as in Fig. 5): all packet-size values from 47 to 100 in Fig. 5 are associated with the same LB.⁶

One of the most notable differences between the LB-based approach and [17] is that the construction of histograms for different attributes is no longer necessary when performing online measurement of the current traffic. Rather, histograms are only used when building the nominal profile,⁷ which is in turn used to set the LB's fixed parameters, $TH_{i,j}$ and the drain rate $R \cdot P_{i,j}$.

Fig. 7 illustrates how to detect an SQL-Worm attack using LBs. The unusual flow of attacking packets rapidly increases the levels of the UDP protocol, server port 1434, and packet-size LBs, eventually causing them to overflow, which leads to packet differentiation by score.

Fig. 7. SQL-worm attack scoring and differentiation by LB.

⁶That is, the number of overflows of the second LB of the third iceberg for the “packet-size” TCP/IP attribute.
⁷An offline calculation that causes no impact on the real-time traffic collection operations.

B. LB Nominal Profile

Nominal profiles are maintained by the 3D-R (as shown in Fig. 1), such that every target under protection has its own set of nominal profiles. They consist of a series of LB sizes obtained from the histograms of nominal profiles of packet attribute values, and throughput information (in terms of number of packets per second), which is used to set the drain rate of the corresponding LB. The profiles are collected during a period in which the network presumably operated free from attacks, relying on the observation that relative distribution samples of real-life Internet traffic attributes do not vary significantly over a short period of time unless there is an attack (a claim corroborated in [17] and in our simulations).⁸

As a direct application of the iceberg-style histograms, CLP nominal profiles do not include attribute values with frequencies below the preset threshold during the measurement interval. Overall DDoS attack detection sensitivity would benefit from finer granularity for these less-frequent attribute values; therefore, we extend the iceberg-style histogram concept in the LB approach. In this scheme, all attribute values that do not appear frequently during the measurement interval are grouped into a single entry in the nominal profile histogram once the sum of their frequencies exceeds a preset fixed threshold.

Fig. 8 represents the nominal profile histogram for the flag attribute. Its distribution frequencies represent the traffic profile contained in the trace obtained from the Internet trace archive of the MAWI project [24] of May 31, 2004, from 2:00 pm to 2:10 pm. Each attribute value has a distribution frequency associated with it. Values 2, 16, and 24, for instance, have distribution frequencies higher than the fixed threshold (set to 1% in the example), while all other attribute values from 0 to 64 (except the three values just mentioned) do not. As a result, these attribute values are grouped, sharing the same distribution frequency (like the three single-entry values).⁹ This profile was obtained using profilers: special programs that are part of our simulations and generate the profiles by reading the data in tcpdump format.

Fig. 8. Flag nominal profile on the LB-based scheme.

⁸Many different real traffic traces were used in [17], and in our simulations.

Our observation shows great similarity in iceberg attribute values across adjacent measurement periods. Based on this observation, we suggest that the profile update take place every 10 min, with the last updated profile being used for score generation (as long as there are no attacks during these periods). When under attack, the profile is not used by any subsequent period and is kept only for postattack analysis. In that situation, the next profile to be used should be the one from the same period of the day on the day before, or a week before. This use of daily/weekly profiles can also be applied when there is an expected discrepancy between adjacent periods, so as to provide a more realistic profile comparison. To illustrate this, say that in a given network, and on a weekly basis, the legitimate throughput for the 8:00 am to 8:10 am period is expected to be ten times higher than that of its previous adjacent period. The nominal profile to be used in this situation is that of the same period of the day before.

One inherent problem of nominal profiles, in general, is the inability to detect unexpected hikes of legitimate nominal traffic throughput within the nominal period. For this situation, we set a target throughput higher than the throughput read from the nominal profile.¹⁰ This way, we opportunistically accept more legitimate packets (at the cost of potentially forwarding more attacking traffic).
We propose that the target throughput be dynamic, always higher than the nominal throughput by x% (as long as the final value does not oversubscribe the line or a previously set committed packet rate).

C. LB Real-Time Implementation

As seen in (2), the packet score is obtained by summing the number of LB overflows of the packet's attribute values. Two processes need to occur in parallel for this to happen: traffic profiling and score computation. Profiling controls the LB levels and overflows, as in Fig. 6. Scoring is divided into two parts: scorebook generation and packet scoring. The scorebook is a set of associations between scores and attribute values, containing the latest snapshot of LB overflows. It needs to be periodically updated at the beginning of each traffic profile period, which is a time-scale much longer than the packet arrival time-scale. After the scorebook is built, it is used as a static reference for obtaining the partial scores of an incoming packet and for its subsequent score calculation. After obtaining the current traffic profile, the CLP method requires a complex offline calculation in software to generate the scorebook, which takes additional processing time. Unlike [17], which requires CLP computation, the scorebook in the LB approach is ready to be used for the next period with no extra computation.

⁹In the example, all attribute values below the threshold coincidentally form a single group. However, there can be many groups within a histogram, as long as their joint attribute value frequency is higher than the threshold.
¹⁰This technique is also used in [17].
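The profiling and scoring steps of the LB-based scheme can be sketched in software as follows. This is only an illustrative model, not the hardware design: the names `LeakyBucket`, `make_bucket`, `freeze_scorebook`, and `packet_score` are hypothetical, the bucket is sized as interval times packet rate times share in the spirit of (1), and two behaviors are assumptions not stated in the text (an overflowing packet is counted but not added to the bucket, and overflow counters are reset when the scorebook is frozen). Grouping of rare attribute values into a shared bucket is omitted.

```python
from dataclasses import dataclass


@dataclass
class LeakyBucket:
    """One bucket per attribute value."""
    threshold: float          # bucket size: interval * packet rate * share
    drain_rate: float         # packet rate * share, in packets per second
    level: float = 0.0
    overflows: int = 0
    last_arrival: float = 0.0

    def on_packet(self, now: float) -> None:
        # Drain for the time elapsed since the previous packet, never below empty.
        elapsed = now - self.last_arrival
        self.level = max(0.0, self.level - self.drain_rate * elapsed)
        self.last_arrival = now
        if self.level + 1 > self.threshold:
            # Assumption: an overflowing packet is counted, not stored.
            self.overflows += 1
        else:
            self.level += 1


def make_bucket(interval_s: float, packets_per_s: float, share: float) -> LeakyBucket:
    return LeakyBucket(threshold=interval_s * packets_per_s * share,
                       drain_rate=packets_per_s * share)


def freeze_scorebook(buckets: dict) -> dict:
    """Snapshot overflow counts at the end of a period, then reset them."""
    scorebook = {key: bucket.overflows for key, bucket in buckets.items()}
    for bucket in buckets.values():
        bucket.overflows = 0
    return scorebook


def packet_score(packet: dict, scorebook: dict) -> int:
    """Sum of the partial scores of the packet's attribute values."""
    return sum(scorebook.get(item, 0) for item in packet.items())


# Hypothetical nominal profile: 40 packets/s over 1 s, 25% of them with TTL 26.
buckets = {("ttl", 26): make_bucket(1.0, 40.0, 0.25)}
for _ in range(50):                       # burst of 50 packets at t = 0
    buckets[("ttl", 26)].on_packet(now=0.0)
scorebook = freeze_scorebook(buckets)
score = packet_score({"ttl": 26, "port": 80}, scorebook)
```

In this hypothetical burst, the bucket holds 10 packets, so the remaining 40 arrivals each count as an overflow, and a packet carrying TTL 26 receives those overflows as its partial score. Building the scorebook is a plain copy of counters, which reflects the point made above that no extra computation is needed before the next period.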

The following processes must occur in parallel (in the 3D-R) at the time of packet arrival.
• Traffic measurement, LB level and overflow control, and histogram update for the future generation of the next nominal profile.
• Scorebook generation at the end of each period.
• Score computation based on the frozen (static) scorebook and the current packet attribute values.
• Selective packet discard (overload control), and dynamic threshold adjustments by the P/I control system.

To properly implement and integrate these processes in the LB scheme, a pipelined implementation is used, as Fig. 9 shows. The decision to allow or drop packets does not start until the third period starts. Fig. 9 also demonstrates the parallelization of the processes and the interdependency between periods. Although packet scoring is performed for each arriving packet, selective discarding only happens if the system is operating beyond its safe (target) utilization level.

Fig. 9. Pipelined implementation timeline for the LB-based scheme.

The LB scheme is faster and simpler than the CLP scheme, making it more suitable for real-time implementation. In particular, for the CLP-based scheme, many interdependent tasks need to be performed sequentially before the packets can enter the selective discarding stage. With the LB scheme, these tasks can all be performed within a period of 100 ms or less, so that packet discarding and overload control can start immediately, resulting in faster responses to DDoS attacks.

Efficient hardware-based algorithms for iceberg-style histogram maintenance and update have been designed to profile current traffic and periodically measure the nominal traffic for each protected end-point at 10 Gb/s line speed, using techniques similar to those described in [19]. The original algorithm requires a two-pass operation on the input data stream. Here, we approximate the two-pass mechanism by pipelining the first and second passes over consecutive observation intervals.

The LB-based approach simplifies the table lookups necessary to obtain packet scores, and thereby enables the DDoS engine to distinguish attacking packets from legitimate ones at 10 Gb/s line speeds. To generate a scorebook, the attribute values of incoming packets within an interval are mapped to their corresponding LBs by simple table lookup, which can be implemented with on-chip memory, thus eliminating the bottleneck of accessing external memory. The LB array structure is amenable to parallel and pipelined hardware implementation, even when the number of potential end-points or stub networks to be protected within a security perimeter is large.

V. AV SCHEME

In this section, we introduce the attribute value variance as another new metric in the packet scoring process. The AV scheme compares the attribute-value distributions of incoming packets with the nominal profile and scores packets according to the resulting difference. It approximates the measured profile distributions and detects attribute values on arriving packets that deviate significantly from the nominal profile. Scoring is based on the probability that a packet's attribute-value distributions differ significantly from the nominal profile. This probability results from comparing the average means and variances of the iceberg attribute values computed in the nominal profile with the current mean distributions of the incoming packet's attribute values. The more the incoming packet's measured profile deviates from the nominal profile, the higher its likelihood of being an attacking packet, and vice versa.

A. The AV Operation

We devised AV-based packet scoring from the model described in [25], a work on anomaly detection of web-server requests from the intrusion detection community. That method successfully approximates the actual but unknown distribution of the query attribute lengths of a request, detecting instances that deviate significantly from the observed normal behavior. In our scheme, the average mean $\mu$ and the variance $\sigma^2$ of the attribute value distribution are calculated during the nominal profile calculation. In the differentiation phase, the probability of an arriving packet attribute value $x$ with average $\mu$ can be bounded using the Chebyshev inequality, as shown below

$$P(|x - \mu| > t) < \frac{\sigma^2}{t^2} \qquad (3)$$

This allows the scheme to obtain an upper bound on the probability that $|x - \mu|$ exceeds a threshold $t$,11 and thereby to bound the probability that an incoming packet attribute value deviates more from its $\mu$ in the nominal profile than its current value does.

B. The AV Nominal Profile

The AV nominal profile histograms contain the average mean distribution $\mu$ and the variance $\sigma^2$ of the attribute values. The overall structure (except for the added variance field) and the profile update scheme are the same as in the LB-based scheme, but the profile is calculated differently. Here, we calculate the average mean for each attribute value by dividing the profile period into $n$ samples,12 with an attribute value mean calculated for each sample. The mean of these samples becomes the average mean, as per (4), followed by the variance calculation as per (5).

Fig. 10. Flag nominal profile on the attribute-value-variation-based scheme.

11Please note this threshold is completely unrelated to TH.

12A sample size of 60 gives a reasonable mean approximation, without adding too much overhead.
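The profile statistics and the Chebyshev bound described above can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that the per-sample means are relative frequencies in [0, 1], that the variance is the population variance of the per-sample means (the divisor is an assumption), and that a value whose current mean does not exceed the nominal average is assigned probability 1, following the AV scoring rule of this section. The function names `nominal_profile` and `chebyshev_bound` are hypothetical.

```python
from statistics import fmean, pvariance


def nominal_profile(sample_means):
    """Return (average mean, variance) for one attribute value.

    sample_means holds the mean relative frequency of the attribute
    value in each sample of the profile period (e.g. 60 samples).
    The population variance is an assumption made for this sketch.
    """
    mu = fmean(sample_means)
    var = pvariance(sample_means, mu)
    return mu, var


def chebyshev_bound(current, mu, var):
    """Upper bound on P(|x - mu| > |current - mu|), capped at 1.

    A current mean at or below the nominal average is treated as
    legitimate and receives probability 1.
    """
    if current <= mu:
        return 1.0
    return min(1.0, var / (current - mu) ** 2)
```

As a usage note, the TTL figures quoted for Fig. 11 (average 10%, variance 0.00006%, partial score 0.15%) are consistent with a current relative frequency of 12%. That 12% is inferred here from the quoted result and is not stated in the text. Under that assumption, `chebyshev_bound(0.12, 0.10, 6e-7)` is about 0.0015, and a scaling constant of 10^6 (also an assumption) would give the quoted integer score of 1500.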


$$\mu = \frac{1}{n}\sum_{i=1}^{n}\mu_i \qquad (4)$$

$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(\mu_i - \mu)^2 \qquad (5)$$

where $\mu_i$ is the attribute value mean calculated for sample $i$ and $n$ is the number of samples. The resulting profiles determine the acceptable distribution deviations from the average mean distribution $\mu$, while these distributions stay close to $\mu$. The scoring process then measures these deviations, with larger deviations meaning a higher likelihood of attacking packets. Fig. 10 shows the nominal profile used for the flag attribute, with the standard deviations13 of the attribute values on top of each average mean histogram bar. Like the flag histogram in Fig. 8, Fig. 10 represents the same MAWI [24] trace. A profiler program has also been created for the AV-based scheme.

C. The AV Scoring

The packet score is composed of the sum of the probabilities of its attribute value distributions deviating from their respective $\mu$ and $\sigma^2$ in the nominal profiles. Each probability can be viewed as the probability bound of (3), evaluated for an attribute value against the nominal profile, as

$$P(|x - \mu| > |m - \mu|) < \frac{\sigma^2}{(m - \mu)^2} \qquad (6)$$

where, for an incoming packet and one of its measured attributes, $m$ is the current mean distribution of the value that the attribute takes; $\mu$, the average mean distribution of the same attribute value measured in the nominal profile; and $\sigma^2$ is the variance of this attribute value, also measured in the profile. If the incoming packet's mean is less than the average, the packet is automatically considered legitimate and is assigned a probability of “1” and a score of “0.” Each such probability is a partial score, and the collection of all attribute value probabilities composes the scorebook.

Fig. 11 illustrates how a packet's score is generated by the AV-based scheme. The histograms of the nominal and current traffic profiles in Fig. 11 are similar to those of the LB approach in Fig. 5. The difference is that each attribute value in Fig. 11 has both the average and the variance of relative frequency in the nominal traffic profile, while it has only the average in Fig. 5. For example, the average and the variance of relative frequency for TTL value 23 are 10% and 0.00006%, respectively, which can be calculated from (4) and (5). On the other hand, the histogram of the current traffic profile has only the average of relative frequency. From (6), a partial score for each attribute is generated and stored in a scorebook, as shown in Fig. 11. For instance, applying (6) to TTL value 23, with the nominal-profile average and variance above and its relative frequency in the current traffic profile, gives a partial score of 0.15%, which we normalize to 1500 by multiplying by a constant value.14

13Standard deviations are shown instead of variances, which are very low values, insignificant in the plot.

VI. SELECTIVE PACKET DISCARDING AND OVERLOAD CONTROL

As Section III shows, overload control is a key component in the PacketScore scheme. It is implemented in the 3D-R, which continually tries to maintain a preset target throughput. This control is achieved by forwarding or discarding packets according to the discarding threshold TH. The P/I control performs overload control on both the LB and AV schemes by providing and dynamically updating TH according to (7), in which the previous period's threshold is adjusted by a threshold variation. This variation is the sum of two parts: oscillation control and error control. Two static values are used in (7). These are critical values that should be carefully selected, or the overload control will be ineffective. In our simulations, these values were fixed, with one of them set differently for LB and AV, for an index data structure of up to 150 000 score entries. The AV scheme uses the smaller value because its P/I function is invoked more frequently (every 10 ms) than that of the LB scheme (every 100 ms), which therefore requires larger threshold variations per threshold update. All of these values were obtained in our simulations during a learning period, in which many values were tested until the overload control produced good results. Because they are derived from the scorebook data structure and the scores produced, one does not need to experiment with new values as long as the same data structure size is used. If a different size is chosen, they must be changed proportionally. Continuing with (7), the oscillation term and the error rate are functions of the current period (and of the preceding one), with the error rate being equal to the actual output throughput minus the target throughput.

Fig. 12 depicts the integrated overload control operation involved in the TH generation. First, the error rate is calculated in the error control portion, based on the difference between the actual output throughput15 and the target throughput from the nominal profile. The threshold calculation then obtains both the oscillation and the error terms, calculating the discarding threshold variation with the use of the two static values. This variation is added to the previous period's threshold, composing the new TH.

Fig. 11. Nominal and measured profiles related to packet scoring on the AV-based scheme.

Fig. 12. The proportion integration (P/I) control system.

14This is to ensure that the scorebook will carry only integral numbers.

15The fraction F of arriving packets + .
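The threshold update described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' controller. It assumes a velocity-form proportional/integral update in which the threshold variation is a gain times the change in error rate (the oscillation part) plus a gain times the error rate itself (the error-control part). The gain names `k_osc` and `k_err`, the class `PIThreshold`, and the helper `forward` are hypothetical, and the gain values used in the paper's simulations are not reproduced here. Forwarding packets whose score equals TH is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class PIThreshold:
    k_osc: float            # hypothetical gain on the change in error (oscillation part)
    k_err: float            # hypothetical gain on the error itself (error-control part)
    target: float           # target throughput, e.g. in packets per second
    th: float = 0.0         # current discarding threshold TH
    prev_err: float = 0.0   # error rate of the previous period

    def update(self, output_rate: float) -> float:
        """Advance one control period and return the new threshold."""
        err = output_rate - self.target  # actual output minus target
        delta = self.k_osc * (err - self.prev_err) + self.k_err * err
        self.th += delta                 # new TH = previous TH + variation
        self.prev_err = err
        return self.th


def forward(score: float, th: float) -> bool:
    """Forward a packet scoring at or above TH; drop it otherwise."""
    return score >= th
```

In this sketch, `update` would be called once per control period (every 100 ms for LB or every 10 ms for AV, per the text), while `forward` runs per packet against the most recent threshold.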

The threshold TH is simply a score: arriving packets are forwarded if their scores are above it and dropped if their scores are below it. This comparison of packet scores with the current discarding threshold is done at wire speed, with the threshold adjusted concurrently from time to time. These periodic adjustments occur on the same time-scale as the scorebook generation interval (100 ms, in the LB scheme), which is larger than the packet arrival time-scale and shorter than the 5 s interval period. In the AV scheme, they occur more frequently (every 10 ms) within the period in which the scorebook is generated (100 ms). The shorter interval proved to be more efficient against fast-changing attacks, providing better overload control. Although the LB scheme also benefits from these faster updates in terms of overload control, we kept this value higher because of a better score separation, as the next section shows. P/I simplifies the overload control substantially when compared with the CDF/load-shedding scheme used in the CLP approach. Because the use of the CDF requires maintaining a large data structure and online histograms, it imposes an overhead on the system.

VII. PERFORMANCE EVALUATION

In this section, we use simulations to evaluate and compare the performance of the three PacketScore schemes in a standalone environment consisting of the 3D-R components, the respective offline nominal profile, and the same stored nominal traffic from the WIDE project [24]. To establish a fair comparison, all simulations have the same common internal settings (unless stated otherwise) and the same input traffic (obtained from the WIDE project), from which they generate attacking packets in exactly the same way. Both the LB and AV schemes use the P/I approach to provide overload control, whereas CLP uses the CDF/load-shedding algorithm. The simulations share the same traces, attack intensity and type (10 times, generic attack), scorebook update interval (every 5 s), and target maximum load, set to 15 000 PPS, a value opportunistically higher than the maximum incoming load of 8711 PPS measured in the nominal profile, which allows more potentially legitimate traffic to be accepted as the system capacity permits. We used the Internet trace archive of the WIDE project [24] to obtain 10-min traces for the period between 2:00 pm and 2:10 pm, from May 31, 2004 to June 3, 2004, for a total of four nonoverlapping periods. We also used the same set of traces used in [17], although we do not compare those results here. We felt that the newer traces better reflect today's average Internet traffic (they contained six times more traffic, on average).

TABLE I CLP, LB, AND AV SCHEME RESULTS FOR DIFFERENT ATTACK TYPES

A. Performance Criteria

We adopted the following four performance criteria for the simulations: false positive ratio, false negative ratio, score separation power, and effectiveness of the overload control. False positives represent the percentage of legitimate packets that are mistakenly discarded, while false negatives represent the percentage of attacking packets admitted. The score separation power gives the degree of differentiation between the legitimate packet score distribution and the attack packet score distribution. Establishing a tail probability of 5%, we observed an intersection zone between these two distributions.
We call the percentage of legitimate packet scores outside this zone $R_L$, and the percentage of attack packet scores outside the same zone $R_A$, as in [26, Fig. 2]. Let $Min_L$ ($Max_A$) be the lowest (highest) score observed for the incoming legitimate (attacking) packets. Define $R_A$ ($R_L$) to be the fraction of attacking (legitimate) packets that have a score below $Min_L$ (above $Max_A$).

These are the two metrics used to measure the score separation. The results should be interpreted as follows: the closer the two metrics are to 100% and to each other, the better. To check the effectiveness of both overload controls, we compare the actual output utilization against the target maximum utilization set by the schemes. Ideally, the ratio between the two should be “1” or very close to “1” (either above or below). It is also important to note, when evaluating the performance of the three schemes, that the different criteria are all correlated; therefore, one should not interpret any of them independently. For example, low false positive rates cannot be obtained with poor score separation. Similarly, a very low false positive rate does not indicate good performance if the utilization ratio is not good. A false positive rate is meaningful only when it comes with good score separation and strict overload control.

B. Different Attack Types

Evaluations of the three schemes demonstrate their accuracy in forwarding legitimate packets under different attacks. For the most impacting attack, the false negative rates were 2.82% (LB), 1% (CLP), and 0.90% (AV), when analyzing each attack individually. The following attacks are used, as in [26].
• Generic attack: all attribute values of the attacking packets are uniformly randomized over their corresponding allowable ranges.
• TCP-SYN flood attack.
• SQL slammer worm attack.
• Nominal attack: all attacking packets resemble the most dominant type of legitimate packets observed in practice, i.e., 1500-byte TCP packets with server port 80 and TCP-flag set to ACK, with uniformly random source IP addresses.
• Mixed attack: equally combines the above four types of attacks while keeping the overall attack rate at 10 times the nominal packet rate.
• Changing attack: similar to the mixed attack except that the different types of attacks take turns.
In the changing attack, an attack type is randomly selected and continues for an exponentially distributed period. The corresponding results are depicted in Table I, which presents the performance of all schemes under different types of attacks. The attack intensity on the AV-based scheme is a multiple of the target maximum load, rather than of the nominal packet rate.16 By doing this, we emphasize the better results of this scheme over the other two, as the former is almost twice the latter (yielding more stressful attacks). The overload control provided by P/I on the LB and AV schemes (measured in the last two columns of Tables I–III) gives a much better result than that of the CLP scheme (third rightmost column), with a throughput ratio of exactly 1 at all times. The CLP and LB schemes presented very similar score separation, while AV was able to separate the scores completely virtually all the time. The false negatives are more constant on LB and AV (due to better overload control), and AV presented the best false positive results at all times, followed by the CLP and LB schemes (the attack changing every 10 s on average has a lower false positive rate on CLP, but with a very high throughput ratio, a situation in which AV demonstrated a much lower rate than CLP). The last three rows in the table represent the results against different attacks taking turns within random average intervals of 10, 30, 60, and 300 s. The “10 s attack” is the most disruptive one, since it very often invalidates the dynamic threshold, hence requiring quick and precise threshold adjustments.

16This explains why in the AV-based scheme the false negative rates are about 1/2 the value of the other schemes' rates. Were the same ratio applied to the other two schemes, their false negatives would also drop.

C. Increasing Attack Intensity

Changes in attack type and intensity constitute a big challenge to any defense scheme. For PacketScore, they affect the discarding phase by invalidating TH, resulting in higher false positive rates, as Section III explains. Table II shows the results for the schemes under different attack intensities configured either statically or dynamically (changing sequentially every 10 or 50 s on average) for an unchanging generic attack. (As stated in the previous subsection, for the AV-based scheme the intensity is a multiple of the target maximum load.) Table II also demonstrates an almost identical overload control performance among the three schemes, indicating that attack intensity fluctuations do not significantly influence the overload control effectiveness of either P/I or CDF/load-shedding. However, when comparing this throughput ratio with those in the other two tables, the advantage of using P/I becomes clear. This is because the overload control provided by the P/I technique is much more effective when there are attack type variations.17 Table III shows the performance of the schemes when challenged by attacks of increasing complexity, generated by changing the attack types while also changing their intensity at random periods. The LB and AV schemes provided better results than CLP, showing higher resiliency under extreme attack conditions.

17Due to its higher resiliency to TH variations.

TABLE II INCREASING ATTACK INTENSITY

TABLE III CHANGING ATTACK TYPE AND INTENSITY

D. Score Separation Power

Fig. 13 provides a graphical comparison of the score separation power provided by the three schemes. Fig. 13 covers a period of 5 s (from 100 to 105 s), with the same simulation sets as the ones used in the first row of Table I. Although the schemes produce completely different distributions, they provide an overall good separation, with the attack packet scores to the left and the legitimate packet scores to the right, and little overlap between the two curves.18 The AV score separation contains very little overlap and is clearly the best among the three, followed by those of the LB and CLP schemes, in this order. One should note that the results Fig. 13 presents directly influence the false positive/negative rates in the previous subsections.

Fig. 13. Score distribution of attacking and legitimate packets in the three schemes.
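The score separation and error-rate criteria used in this evaluation can be illustrated with a short Python sketch. It is a simplified reading of the definitions in the Performance Criteria subsection, not the evaluation code behind the tables: it uses the plain lowest legitimate score and highest attack score and does not model the 5% tail probability. The function names `separation` and `error_rates` are hypothetical, and treating a score equal to the threshold as forwarded is an assumption.

```python
def separation(legit_scores, attack_scores):
    """Return (attack-side, legitimate-side) separation fractions in [0, 1].

    The first value is the fraction of attack scores below the lowest
    legitimate score; the second is the fraction of legitimate scores
    above the highest attack score.
    """
    min_legit = min(legit_scores)
    max_attack = max(attack_scores)
    r_attack = sum(s < min_legit for s in attack_scores) / len(attack_scores)
    r_legit = sum(s > max_attack for s in legit_scores) / len(legit_scores)
    return r_attack, r_legit


def error_rates(legit_scores, attack_scores, th):
    """Return (false positive ratio, false negative ratio) for threshold th."""
    # False positive: a legitimate packet dropped because its score is below th.
    fp = sum(s < th for s in legit_scores) / len(legit_scores)
    # False negative: an attack packet admitted because its score is not below th.
    fn = sum(s >= th for s in attack_scores) / len(attack_scores)
    return fp, fn
```

Both separation fractions approach 1 (100%) as the two score distributions stop overlapping, which is the situation in which a single threshold can keep both error ratios low.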

VIII. CONCLUSION

In this paper, we discussed three DDoS defense systems based on packet scoring, introducing the LB and the AV schemes. From the implementation point of view, the LB scheme is the simplest of the three because it does not require constructing histograms for the measured profile, or any kind of real-time histogram such as the CDF in the CLP scheme. The complexity of AV lies between the other two, since it requires building the histograms for the measured profile. Another advantage of LB is that, unlike the other two, it does not require any manipulation of the scorebook during the transition between periods. All three schemes need to build histograms to generate nominal profiles, a task that can be performed either offline or with a lower priority than the scoring-related processes.

Accuracy in differentiating legitimate from attacking packets is an important performance index. Our study shows that the AV scheme has better results in every situation. LB with P/I can be better than CLP with load-shedding for fast-changing complex attacks, since the LB implementation is simpler and P/I makes the scheme more adaptive, which can be better than employing a more accurate likelihood function. When the attacks are analyzed individually, CLP generally presented better results than LB, but still worse than AV. The P/I control system has higher accuracy, resiliency, and simplicity than the CDF. The faster threshold updates, which quickly correct deviations from the target throughput, make it difficult for an attacker to coordinate a timely change in its attack so as to exploit periods of unrealistic thresholds maintained by the overload control, which would cause the number of false positives and negatives to increase.

One should observe the limitations of the three schemes presented. These are DDoS defense schemes primarily activated by an increase in traffic volume. Therefore, unlike signature-based intrusion detection systems (IDSs), they are not able to defend victims against “teardrop” or “ping-of-death” types of attacks. The schemes also do not provide attribute value analysis semantics (e.g., distinguishing the drop of a TCP control packet, like a SYN or ACK, as being more disruptive than dropping a TCP packet with a sequence number assigned).

18Please note that the Y axis has a logarithmic scale, so the plots may not show clearly that one R range is much higher than the other.


The prototyping effort of ALPi is currently under way. By incorporating the aforementioned LB-based, AV-based, and P/I mechanisms, ALPi can leverage the advances of data-stream processing techniques while remaining amenable to optimized pipeline hardware implementation.

REFERENCES

[1] S. Bellovin, M. Leech, and T. Taylor, “ICMP traceback messages,” Internet draft, draft-ietf-itrace-01.txt, Oct. 2001.
[2] K. Park and H. Lee, “On the effectiveness of probabilistic packet marking for IP traceback under denial of service attack,” in Proc. INFOCOM, 2001, pp. 338–347.
[3] A. Snoeren, “Hash-based IP traceback,” in Proc. SIGCOMM, Aug. 2001, pp. 146–152.
[4] A. Yaar, A. Perrig, and D. Song, “FIT: Fast Internet traceback,” in Proc. IEEE INFOCOM, Mar. 2005, pp. 1395–1406.
[5] J. Ioannidis and S. M. Bellovin, “Implementing pushback: Router-based defense against DDoS attacks,” in Proc. Netw. Distrib. Syst. Security Symp., Feb. 2002, pp. 79–86.
[6] D. K. Y. Yau, J. C. S. Lui, and F. Liang, “Defending against distributed denial-of-service attacks with max-min fair server-centric router throttles,” in Proc. IWQoS, 2002, pp. 35–44.
[7] W. Lee and S. J. Stolfo, “Data mining approaches for intrusion detection,” in Proc. 7th USENIX Security Symp., Jan. 1998, pp. 79–93.
[8] D. Marchette, “A statistical method for profiling network traffic,” in Proc. 1st USENIX Workshop on Intrusion Detection and Network Monitoring, Apr. 1999, pp. 119–128.
[9] J. Mirkovic, G. Prier, and P. Reiher, “Attacking DDoS at the source,” in Proc. ICNP, Nov. 2002, pp. 312–321.
[10] Q. Li, E. C. Chang, and M. C. Chan, “On the effectiveness of DDoS attacks on statistical filtering,” in Proc. IEEE INFOCOM, Mar. 2005, pp. 1373–1383.
[11] Arbornetworks Com. [Online]. Available: http://www.arbornetworks.com
[12] Mazu Networks Inc. [Online]. Available: http://www.mazunetworks.com
[13] Thread Defense System. [Online]. Available: http://www.cisco.com/en/US/netsol/ns340/ns394/ns171/ns441/net_value_proposition0900aecd8013402e.html
[14] Cyber-Operation Com. [Online]. Available: http://www.cyberoperations.com
[15] Webscreen Technology. [Online]. Available: http://www.webscreentechnology.com
[16] Lancope. [Online]. Available: http://www.lancope.com
[17] Y. Kim, W. C. Lau, M. C. Chuah, and H. J. Chao, “PacketScore: Statistics-based overload control against distributed denial of service attacks,” in Proc. IEEE INFOCOM, Apr. 2004, pp. 2594–2604.
[18] B. Babcock, “Models and issues in datastream systems,” in Proc. ACM Symp. Principles of Database Syst., Jun. 2002, pp. 1–16.
[19] R. M. Karp, C. H. Papadimitriou, and S. Shenker, “A simple algorithm for finding frequent elements in streams and bags,” ACM Trans. Database Syst., vol. 28, no. 1, pp. 51–55, Mar. 2003.
[20] F. Chen, D. Lambert, and J. C. Pinheiro, “Incremental quantile estimation for massive tracking,” in Proc. 6th Int. Conf. Knowl. Discovery and Data Mining, Aug. 2000, pp. 516–522.
[21] A. C. Gilbert, Y. Kotidis, S. Muthukrishnan, and M. J. Strauss, “How to summarize the universe: Dynamic maintenance of quantiles,” in Proc. 28th VLDB Conf., Aug. 2002, pp. 454–465.
[22] M. Greenwald and S. Khanna, “Space-efficient online computation of quantile summaries,” in Proc. ACM SIGMOD Int. Conf. Manage. Data, May 2001, pp. 58–66.
[23] S. Kasera, J. Pinheiro, C. Loader, M. Karaul, A. Hari, and T. LaPorta, “Fast and robust signaling overload control,” in Proc. 9th Int. Conf. Netw. Protocols, Nov. 2001, pp. 323–331.
[24] MAWI Traffic Archive. [Online]. Available: http://tracer.csl.sony.co.jp/mawi/
[25] C. Kruegel and G. Vigna, “Anomaly detection of web-based attacks,” in Proc. 10th ACM/CCS, Oct. 2003, pp. 251–261.
[26] M. C. Chuah, W. C. Lau, Y. Kim, and H. J. Chao, “Transient performance of packetscore for blocking DDoS attack,” in Proc. IEEE ICC, Jun. 2004, pp. 1892–1896.



Paulo E. Ayres received the B.Sc. degree in computer science from University Center of Brasilia (UniCEUB), Brasilia, Brazil, in 1996, and the M.Sc. degree in electrical and computer engineering from Polytechnic University, New York, in 2006. Since 2001, he has been an Associate with the Securities Industry Automation Corporation (SIAC), working on the implementation of critical networks for the financial industry. His current interests include network protocols and security.

Huizhong Sun received the B.E. and M.E. degrees in control engineering from Harbin Institute of Technology, Harbin, China, in 1996, and 1998, respectively. He is currently working towards the Ph.D. degree at the Department of Electrical and Computer Engineering, Polytechnic University, New York. His research interests include network security and measurement.

H. Jonathan Chao (F’01) received the B.S. and M.S. degrees in electrical engineering from National Chiao Tung University, Hsinchu, Taiwan, and the Ph.D. degree in electrical engineering from The Ohio State University, Columbus. He is Department Head and Professor of Electrical and Computer Engineering at Polytechnic University, New York, where he joined in January 1992. During 2000–2001, he was Co-Founder and CTO of Coree Networks, NJ, where he led a team to implement a multiterabit multiprotocol label switching (MPLS) switch router with carrier-class reliability. From 1985 to 1992, he was a Member of Technical Staff at Telcordia, where he was involved in transport and switching system architecture designs and ASIC implementations, such as the world’s first SONET-like framer chip, ATM layer chip, sequencer chip (the first chip handling packet scheduling), and ATM switch chip. From 1977 to 1981, he was a Senior Engineer at Telecommunication Laboratories of Taiwan performing circuit designs for a digital telephone switching system. He coauthored two networking books, Broadband Packet Switching Technologies—A Practical Guide to ATM Switches and IP Routers (New York: Wiley, 2001) and Quality of Service Control in High-Speed Networks (New York: Wiley, 2001). He has been doing research in the areas of network security, terabit switches/routers, quality-of-service control, and optical networking/switching. He holds more than 20 patents and has published over 150 journal and conference papers in the above areas. He has also served as a consultant for various companies, such as Lucent, NEC, Telcordia, and Huawei. Prof. Chao is a Fellow of the IEEE for his contributions to the architecture and application of VLSI circuits in high-speed packet networks. He received the Telcordia Excellence Award in 1987. He is a corecipient of the 2001 Best Paper Award from the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY. 
He has served as a Guest Editor for the IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS (JSAC) special topics on “High-Speed Network Security” (3Q 2006), “Advances in ATM switching systems for B-ISDN” (June 1997), “Next-Generation IP Switches and Routers” (June 1999), and two Special Issues on “High-Performance Optical/Electronic Switches/Routers for High-Speed Internet” (May and September 2003). He also served as an Editor for the IEEE/ACM TRANSACTIONS ON NETWORKING from 1997 to 2000.

Wing Cheong Lau received the B.S.Eng. degree from the University of Hong Kong, Shatin, and the M.S. and Ph.D. degrees in electrical and computer engineering from the University of Texas at Austin. He is an Associate Professor with the Department of Information Engineering, Chinese University of Hong Kong (CUHK), Shatin, where he also serves as the Director of the Mobile Technologies Center (MobiTeC). From 1997 to 2004, he was with the Performance Analysis Department, Bell Laboratories, Lucent Technologies, Holmdel, NJ, where he served as a Performance Consultant and System Architect. Prior to joining CUHK, he was with Qualcomm, San Diego, CA, actively contributing to the design and standardization of IETF and 3G Mobility Management protocols and architecture. His research interest includes networking protocol design and performance analysis, traffic characterization, system modeling, and network security for high-speed wired and wireless networks.