Botnets Detection Based on IRC-Community

Wei Lu and Ali A. Ghorbani
Network Security Laboratory, Faculty of Computer Science
University of New Brunswick, Fredericton, NB E3B 5A3, Canada
{wlu, ghorbani}@unb.ca

Botnets are networks of compromised computers controlled under a common command and control (C&C) channel. Recognized as one of the most serious security threats to the current Internet infrastructure, botnets are often hidden in existing applications, e.g. IRC, HTTP, or Peer-to-Peer, which makes botnet detection a challenging problem. Previous attempts at detecting botnets examine traffic content for IRC commands on selected network links or set up honeypots. In this paper, we propose a new approach for detecting and characterizing botnets on a large-scale WiFi ISP network. We first classify the network traffic into different applications by using payload signatures and a novel clustering algorithm, and then analyze the specific IRC application community based on the temporal-frequent characteristics of flows, which allows malicious IRC channels created by bots to be differentiated from normal IRC traffic generated by human beings. We evaluate our approach with over 160 million flows collected over five consecutive days on a large-scale network, and the results show that the proposed approach detects the botnet flows with a high detection rate and an acceptably low false alarm rate.

I. INTRODUCTION

One of the biggest threats to the current Internet infrastructure is botnets, which usually comprise large pools of compromised computers under the control of a botmaster. Botnets can be centralized, distributed or peer-to-peer (P2P), depending on the command and control (C&C) model and the communication protocol used (e.g. HTTP, IRC or P2P). The attacks conducted by botnets are diverse, ranging from Distributed Denial-of-Service (DDoS) attacks to e-mail spamming, keylogging, click fraud, and the spreading of new malware. Figure 1 illustrates a typical life-cycle of a botnet and its attacking behaviours.

[Figure 1: diagram linking the botmaster, a vulnerable host, the DNS server, the IRC server, the botnet and the victim server. Labeled steps: 1. exploit, 2. bot download, 3. DNS query, 4. join, 5. pass authen., 6. pass, 7. command, 8. DDoS.]

Fig. 1. Typical life-cycle of an IRC based botnet and its attacking behaviors

The botmaster usually finds a new bot by remotely exploiting a vulnerability on the host. Once infected, the bot downloads and installs the binary code by itself. After that, each bot on the botnet attempts to find the IRC server address through a DNS query, which is illustrated in Step 3 of Figure 1. Next comes the communication step between the bots and the IRC server. In the IRC based communication mechanism, a bot first sends a PASS message to the IRC server to start a session, and the server then authenticates the bot by checking its password. In many cases, the botmaster also needs to authenticate itself to the IRC server. Once these authentications are complete, the command and control channels among the botmaster, the bots, and the IRC server are established. To start a DDoS attack, the botmaster only needs to send a simple command like ".ddos.start victim_ip"; all bots receive this command and start to attack the victim server. This is shown in Step 8 of Figure 1. More information about the botmaster command library can be found in [1].

Detecting botnet traffic is a very challenging problem, for two reasons: (1) botnets use existing application protocols, so their traffic volume is not large and closely resembles normal traffic behaviour; (2) classifying traffic into applications has become more difficult because of traffic content encryption and the unreliability of labelling by destination port.

Previous attempts at detecting botnets are mainly based on honeypots [2,3,4,5,6], passive anomaly analysis [7,8,9] and traffic application classification [10,11,12]. Setting up honeypots on the Internet is very helpful for capturing malware and understanding the basic behaviours of botnets. Passive anomaly analysis for detecting botnets in network traffic is usually independent of the traffic content and has the potential to find different types of botnets (e.g. HTTP based, IRC based or P2P based botnets). Botnet detection based on traffic application classification focuses on classifying traffic into IRC traffic and non-IRC traffic, and thus it can only detect IRC based botnets, which is its biggest limitation compared with anomaly based botnet detection.

In this paper, we focus on traffic classification based botnet detection. Instead of labeling and filtering traffic into non-IRC and IRC, we propose a generic approach that classifies traffic into different application communities (e.g. P2P, Chat, Web, etc.).
Then, within each specific application community, we investigate and apply the temporal-frequent characteristics of network flows to differentiate malicious botnet behaviors from normal application traffic. The major contributions of this paper include: (1) a novel application discovery approach for classifying network applications in a large-scale WiFi ISP network, (2) a new algorithm to discriminate botnet IRC traffic from normal IRC traffic, which is based on the n-gram (frequent characteristics) of flow payload over a time period (temporal characteristics), and (3) a botnet detection framework that can be extended to other types of botnets.

The rest of the paper is organized as follows. Section II presents our application classification approach for network flows. Section III describes the botnet detection algorithm based on the temporal-frequent characteristics of botnets. Section IV presents the experimental evaluation of our detection model with over 160 million flows collected on a large-scale WiFi ISP network. Finally, some concluding remarks and future work are given in Section V.

978-1-4244-2324-8/08/$25.00 © 2008 IEEE.

This full text paper was peer reviewed at the direction of IEEE Communications Society subject matter experts for publication in the IEEE "GLOBECOM" 2008 proceedings. Authorized licensed use limited to: University of New Brunswick. Downloaded on May 18, 2009 at 11:21 from IEEE Xplore. Restrictions apply.

II. TRAFFIC APPLICATION CLASSIFICATION

Classifying network traffic into different applications is very challenging and remains an open problem. In practice, traffic application classification relies to a large extent on transport layer port numbers, which was an effective method in the early days of the Internet. Port numbers, however, provide very limited information nowadays. An alternative is to examine the payload of network flows and create signatures for each application. This, however, has two major limitations: (1) legal issues related to privacy, and (2) the impossibility of identifying encrypted traffic. By observing traffic on a large-scale WiFi ISP network, we found that even with flow content examination, about 40% of network flows cannot be classified into specific applications (i.e. they are labeled as unknown applications). Investigating such a large volume of unknown traffic is unavoidable, since it may represent abnormalities in the traffic, malicious behaviors, or simply novel applications.
Next, we first discuss the payload signatures based classification approach and then present the cross association clustering algorithm for classifying the unknown traffic into different known application communities.

A. Payload Signatures Based Classification

The payload signatures based classifier investigates the characteristics of bit strings in the packet payload. For most applications, the initial protocol handshake steps differ and can therefore be used for classification. Moreover, the protocol signatures can be modeled either from public documents such as RFCs or through empirical analysis that derives the distinct bit strings in both TCP and UDP traffic. The classifier is deployed on a large-scale free wireless fidelity (WiFi) network, and the classification results show that about 40% of flows cannot be classified by the current payload signatures based classification method. Next, we present a fuzzy cross association clustering algorithm to address this issue.

B. Unknown Traffic Classification

The traditional port-based classification method has proven to be misleading due to the increase of applications tunneled through HTTP, the constant emergence of new protocols and the domination of P2P networking [13]. Examining the payload signatures of applications improves the classification accuracy, but a large amount of traffic still cannot be identified. Recent studies on application classification include applying machine learning algorithms for clustering and classifying traffic flows [14], statistical fingerprint based classification [15] and identifying traffic on the fly [16]. Unlike these approaches, our method is hybrid, combining payload signatures with a novel cross association clustering algorithm [17]. The payload signatures classify traffic into predefined known application communities. The unknown traffic is then assigned to different application communities with a set of probabilities by a clustering algorithm. Unknown traffic that cannot be classified into any known application community is considered to belong to new or unknown applications.

The basic idea of applying the cross association algorithm is to study the association relationship between known traffic and unknown traffic. In numerous data mining applications, a large and sparse binary matrix is used to represent the association between two types of objects (corresponding to rows and columns). Cross associations are then defined as a set of rectangular regions with different densities. The clustering goal is to summarize the underlying structure of object associations by decomposing the binary matrix into disjoint row and column groups such that the rectangular intersections of groups are homogeneous, with either high or low density. Previous association clustering algorithms need the number of clusters (i.e. rectangles) to be predefined. This, however, is not realistic in our unknown traffic classification because the actual number of applications is unknown.
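The notion of homogeneous rectangles can be illustrated with a small sketch. It assumes, for illustration only, that rows are source hosts and columns are destination hosts, and that the row and column groups are already given by hand; the cross association algorithm of [17] searches for these groups and their number automatically, which this sketch does not attempt. The function name and the toy host names are hypothetical.

```python
from collections import defaultdict

def block_densities(edges, row_group, col_group):
    """Density of ones in each (row group, column group) rectangle.

    edges:     iterable of (source, destination) pairs, i.e. the ones of the
               sparse binary association matrix
    row_group: dict mapping each source to its row group
    col_group: dict mapping each destination to its column group
    """
    ones = defaultdict(int)
    for src, dst in set(edges):  # a binary matrix counts each pair once
        ones[(row_group[src], col_group[dst])] += 1

    row_sizes = defaultdict(int)
    for group in row_group.values():
        row_sizes[group] += 1
    col_sizes = defaultdict(int)
    for group in col_group.values():
        col_sizes[group] += 1

    # Density = number of ones / area of the rectangle.
    return {
        (r, c): ones[(r, c)] / (row_sizes[r] * col_sizes[c])
        for r in row_sizes
        for c in col_sizes
    }

# Toy example: hosts a and b both talk to x and y; host c only talks to z.
edges = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y"), ("c", "z")]
row_group = {"a": 0, "b": 0, "c": 1}
col_group = {"x": 0, "y": 0, "z": 1}
densities = block_densities(edges, row_group, col_group)
print(densities)
```

With this grouping every rectangle is either completely full or completely empty, which is the homogeneous structure the clustering goal describes. A poor grouping would produce rectangles of intermediate density.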
The basis of our unknown traffic classification methodology is a novel cross association clustering algorithm that estimates the number of row and column groups fully automatically [17]. During classification, the traffic, consisting of unknown and known flows, is first clustered in terms of source IP and destination IP. This stage generates a set of rectangles, which we define as communities; each community either contains a set of flows or is empty. The flows in each community are then clustered in terms of destination IP and destination port. In this way, one community is decomposed into several sub-communities, each of which represents an application community.

After all flows are classified into different application communities, we have to label each application community. A simple and effective way is to label each application community based on its content. In particular, we count the flows of each known application in the community and normalize the counts into a set of probabilities ranging from 0 to 1. The unknown flows in each application community are then assigned to a specific application according to this set of probabilities. This idea is similar to the membership function in fuzzy clustering algorithms, and the experimental evaluation supports its accuracy and efficiency. The exception to this labeling method is that, if the dominant flow type in the community is unknown, the whole community is labeled as "unknown", which has the potential of discovering new or unknown applications.

III. BOTNET DETECTION BASED ON IRC COMMUNITY

A general aim of intrusion detection is to find various attack types by modeling signatures of known intrusions (misuse detection) or profiles of normal behaviors (anomaly detection). Botnet detection, however, is more specific because it targets a given application domain. The n-gram byte distribution has proven effective for detecting network anomalies. In [18], Wang et al. examined the 1-gram byte distribution of the packet payload, represented each packet as a 256-dimensional vector describing the occurrence frequency of each of the 256 ASCII characters in the payload, and then constructed the normal packet profile by calculating the statistical average and deviation of normal packets for a specific application service (e.g. HTTP). An anomaly is alerted once the Mahalanobis distance between the testing data and the normal profile exceeds a predefined threshold. Gu et al. improve this approach and apply it to detecting malware infection in their recent work [19].

Unlike previous n-gram based detection approaches, our method extends the n-gram frequency into the temporal domain and generates a set of 256-dimensional vectors representing the temporal-frequent characteristics of the 256 ASCII bytes in the payload over a predefined time interval. The temporal feature is important in botnet detection because of two empirical observations of botnet behavior: (1) the response of bots is usually immediate and accurate once they receive commands from the botmaster, while a human may perform any of several actions after a reasonable thinking time, and (2) bots basically have preprogrammed activities based on the botmaster's commands, and thus all bots might be synchronized with each other.

After obtaining the n-gram (n = 1 in this case) features for flows over a time window, we apply the K-means algorithm to cluster the data objects with 256-dimensional features. We do not construct normal profiles, because normal traffic is sensitive to the practical networking environment and a high false positive rate might result when a trained model is deployed in a new environment. In contrast, K-means clustering is unsupervised and does not define a threshold that needs to be tuned for different cases. In our approach, the number of initial clusters for K-means is 2.

We denote the 256-dimensional n-gram byte distribution as a vector $\langle f_1^{t_i}, f_2^{t_i}, \ldots, f_{256}^{t_i} \rangle$, where $f_j^{t_i}$ stands for the frequency of the $j$th ASCII character in the payload over a time window $t_i$ ($j = 1, 2, \ldots, 256$ and $i = 0, 1, \ldots$). Given a set of $N$ data objects $F = \{F_i \mid i = 1, 2, \ldots, N\}$, where $F_i = \langle f_1^{t_i}, f_2^{t_i}, \ldots, f_{256}^{t_i} \rangle$, the detection approach is described in Algorithm I.

In practice, labeling the clusters is always a challenging problem when applying unsupervised algorithms to intrusion detection. By observing the normal IRC traffic over a long period on a large-scale WiFi ISP network and the IRC botnet traffic collected on a honeypot, we derive a new metric, the standard deviation $\sigma_m$ for each cluster $m$, to differentiate the botnet IRC cluster from normal IRC clusters. The higher the average $\sigma_m$ over the 256 ASCII characters for the flows in a cluster $m$, the more normal the cluster $m$ is. This is reasonable because, in normal IRC traffic, human behavior is more diverse, with various possibilities, than the malicious IRC traffic generated by bots. Given the frequency vectors for $n$ flows as follows:

$$\{ \langle f_1^1, f_2^1, \ldots, f_{256}^1 \rangle, \langle f_1^2, f_2^2, \ldots, f_{256}^2 \rangle, \ldots, \langle f_1^n, f_2^n, \ldots, f_{256}^n \rangle \}$$

suppose $\sigma_j$ is the standard deviation of the $j$th ASCII character over the $n$ flows. The average standard deviation $\sigma$ over the 256 ASCII characters for the flows can then be calculated by the following formula:

$$\sigma = \frac{\sum_{j=1}^{256} \sigma_j}{256}$$

ALGORITHM I
BOTNET DETECTION

Function BotDel(F) returns botnet cluster
  Inputs: collection of data objects $F_i = \langle f_1^{t_i}, f_2^{t_i}, \ldots, f_{256}^{t_i} \rangle$, $i = 1, 2, \ldots, N$
  Initialization: initialize the number of clusters $k$ (e.g. $k = 2$), the cluster centers $c_m$, $1 \le m \le k$, and the iteration counter $q \leftarrow 0$
  Repeat:
    $q \leftarrow q + 1$
    Assign data objects to clusters by determining the closest cluster center points.
    Calculate the new center point $c_{m-new}$ for each cluster $m$.
  Until: $\| c_{m-new} - c_m \| < th_1$ or $q > th_2$
  Calculate the average standard deviation for each cluster: $\sigma_1, \sigma_2, \ldots, \sigma_k$
  If $\sigma_b = \min(\sigma_1, \sigma_2, \ldots, \sigma_k)$ then cluster $b$ is labeled as the botnet cluster
  Return the botnet cluster $b$.

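A minimal Python sketch of Algorithm I follows, under stated assumptions: the frequency vectors are already computed, the first k vectors seed the cluster centers (the source does not say how the centers are initialized), Euclidean distance is used, and the population standard deviation is taken. In line with the text and with Table IV, the cluster with the smallest average standard deviation is taken as the botnet cluster. The function names and the four-dimensional toy vectors are illustrative; real inputs would be 256-dimensional.

```python
import math
from statistics import pstdev

def kmeans(points, k=2, th1=1e-9, th2=100):
    """Plain k-means; th1 bounds the center shift, th2 the iteration count."""
    centers = [list(p) for p in points[:k]]  # assumed seeding: first k objects
    q = 0
    while True:
        q += 1
        # Assign every object to its closest center (Euclidean distance).
        assign = [min(range(k), key=lambda m: math.dist(p, centers[m]))
                  for p in points]
        # Recompute each center as the mean of its members.
        new_centers = []
        for m in range(k):
            members = [p for p, a in zip(points, assign) if a == m]
            if members:
                new_centers.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centers.append(centers[m])  # empty cluster keeps its center
        shift = max(math.dist(c, n) for c, n in zip(centers, new_centers))
        centers = new_centers
        if shift < th1 or q > th2:
            return assign

def average_sigma(vectors):
    """Average over all dimensions of the per-dimension standard deviation."""
    return sum(pstdev(col) for col in zip(*vectors)) / len(vectors[0])

def bot_det(points, k=2):
    """Cluster the vectors and label the least diverse cluster as botnet."""
    assign = kmeans(points, k)
    sigma = {m: average_sigma([p for p, a in zip(points, assign) if a == m])
             for m in set(assign)}
    botnet = min(sigma, key=sigma.get)
    return botnet, assign, sigma

# Toy data: bot-like vectors are nearly identical, human-like ones vary.
bots = [[0.50, 0.50, 0.00, 0.00], [0.50, 0.49, 0.01, 0.00], [0.49, 0.50, 0.00, 0.01]]
humans = [[0.10, 0.10, 0.40, 0.40], [0.00, 0.20, 0.30, 0.50], [0.20, 0.00, 0.50, 0.30]]
points = [bots[0], humans[0], bots[1], humans[1], bots[2], humans[2]]
botnet, assign, sigma = bot_det(points)
print(botnet, assign, sigma)
```

Because the labeling step only compares clusters with each other, it needs no trained threshold. The trade-off is that one cluster is always labeled as botnet, even when no bot traffic is present.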
[Figure 2: plot of average bytes frequency over normal IRC flows (y-axis) against index of ASCII characters (x-axis, 0 to 300).]

Fig. 2. Average bytes frequency over 256 ASCIIs for normal IRC flows

[Figure 3: plot of average bytes frequency over IRC botnet flows (y-axis) against index of ASCII characters (x-axis, 0 to 300).]

Fig. 3. Average bytes frequency over 256 ASCIIs for botnet IRC flows

As an example, Figures 2 and 3 illustrate the average bytes frequency over the normal IRC flows and the IRC botnet flows, respectively. For normal IRC traffic, the average standard deviation of bytes frequency over the 256 ASCII characters is 0.002 and the maximal standard deviation is 0.05. For IRC botnet traffic, the average is 0.0009 and the maximum is 0.01, both much smaller than for normal IRC traffic. This observation confirms that normal human IRC traffic is more diverse than the malicious IRC traffic generated by bots.
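The byte-frequency vectors and the average standard deviation compared above can be computed along the following lines. This is a sketch under assumptions not stated in the source: relative frequencies are taken over all payload bytes in one time window, the population standard deviation is used, and the IRC payloads are invented for illustration.

```python
from collections import Counter
from statistics import pstdev

def byte_frequencies(payloads):
    """Relative frequency of each byte value 0..255 over one time window."""
    counts = Counter()
    for payload in payloads:
        counts.update(payload)  # iterating over bytes yields integers 0..255
    total = sum(counts.values())
    return [counts[b] / total if total else 0.0 for b in range(256)]

def average_sigma(vectors):
    """(1/256) times the sum over j of the standard deviation of dimension j."""
    return sum(pstdev(col) for col in zip(*vectors)) / 256

# Hypothetical payloads, one list of payloads per time window.
bot_windows = [
    [b"PRIVMSG #cc :.scan.start 10.0.0.1\r\n"],
    [b"PRIVMSG #cc :.scan.start 10.0.0.2\r\n"],
    [b"PRIVMSG #cc :.scan.start 10.0.0.3\r\n"],
]
human_windows = [
    [b"PRIVMSG #chat :hello everyone\r\n"],
    [b"PRIVMSG #chat :did anybody watch the game last night?\r\n"],
    [b"QUIT :bye\r\n"],
]
bot_vectors = [byte_frequencies(w) for w in bot_windows]
human_vectors = [byte_frequencies(w) for w in human_windows]
print(average_sigma(bot_vectors), average_sigma(human_vectors))
```

On these toy payloads the bot windows differ in a single byte, so their average standard deviation is smaller than that of the human windows, which mirrors the direction of the difference reported above.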

IV. EXPERIMENTAL EVALUATION

We implement a prototype system for the approach and evaluate it on a large-scale WiFi ISP network over five consecutive business days. The botnet IRC traffic is collected on a honeypot deployed on a real network and is then aggregated into 243 flows. The time interval for flow aggregation is 1 minute. When evaluating the prototype system, we randomly insert and replay the botnet traffic flows into the normal daily traffic. Since our approach is a two-stage process (i.e. unknown traffic classification first and botnet detection on the IRC application community next), the evaluation is divided into two parts accordingly: (1) performance testing of the unknown traffic classification, which covers not only the capability of our approach to classify unknown IRC traffic but also its classification accuracy for other unknown applications (e.g. new P2P), since we expect that the algorithm could be extended to detect other types of botnets, such as Web based and P2P based botnets; and (2) performance evaluation of the system in discriminating malicious IRC botnet traffic from normal human IRC traffic.

A. Evaluation on Unknown Traffic Classification

Evaluating the unknown traffic classification capability is not an easy task in practice, since novel or recently appeared applications are not known in advance and the evaluation always needs the intervention of network experts. In our experiment, we randomly choose part of the known traffic and forcibly label it as unknown. The amount of traffic relabeled in this way is decided according to the 40% rule (cf. the roughly 40% of unclassified flows reported in Section II). The final unknown traffic set is composed of the forcibly relabeled known traffic and the 243 botnet IRC flows. Over the five days of evaluation, we found that all the botnet flows are accurately classified into the IRC application community (i.e. a 100% classification rate for IRC traffic). However, the general classification accuracy over all applications is about 85%, which is lower than for the IRC application specifically. The general classification accuracy is an average over all applications, since the approach has a different classification rate for each application community. Table I describes the known application set and the unknown application set over one hour, e.g. how many known applications the flows belong to.

TABLE I
DESCRIPTION ON KNOWN AND UNKNOWN SET OVER ONE HOUR

                   Number of Flows    Number of Applications
Known Flowset      176484             38
Unknown Flowset    39408              11

B. Evaluation on Discriminating Botnet from Normal IRC

The proposed approach is evaluated with the traffic of five full consecutive days. Table II shows the flow distribution for the IRC application community and the total flow community for each day after the traffic classification step. Two metrics are used to evaluate the performance of discriminating botnet traffic from normal IRC traffic, namely Detection Rate (DR) and False Alarm Rate (FAR). DR is the ratio of the number of botnet flows detected to the total number of botnet flows, and FAR is the ratio of the number of false botnet alarms to the total number of alarms.
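The two metrics can be written down directly from their definitions. The sketch below assumes one boolean ground-truth label and one boolean alarm per IRC flow; the function name and the toy labels are illustrative. Note that FAR as defined here divides by the total number of alarms, not by the number of normal flows as the more common false positive rate does.

```python
def detection_metrics(is_botnet, is_alarm):
    """Return (DR, FAR) for equal-length lists of booleans, one per flow."""
    pairs = list(zip(is_botnet, is_alarm))
    detected = sum(1 for bot, alarm in pairs if bot and alarm)
    false_alarms = sum(1 for bot, alarm in pairs if alarm and not bot)
    total_botnet = sum(1 for bot, _ in pairs if bot)
    total_alarms = sum(1 for _, alarm in pairs if alarm)
    # DR: detected botnet flows over all botnet flows.
    dr = detected / total_botnet if total_botnet else 0.0
    # FAR: false botnet alarms over all alarms raised.
    far = false_alarms / total_alarms if total_alarms else 0.0
    return dr, far

# Toy example: 4 botnet flows (3 detected) and 4 normal flows (1 false alarm).
is_botnet = [True, True, True, True, False, False, False, False]
is_alarm = [True, True, True, False, True, False, False, False]
dr, far = detection_metrics(is_botnet, is_alarm)
print(dr, far)
```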

TABLE II
DESCRIPTION ON IRC COMMUNITIES OVER FIVE DAYS

Day    Total Flows    Known Flows    Total IRC Flows    Known IRC Flows
1      35409K         23724K         606                363
2      29538K         18313K         569                326
3      35272K         22574K         253                10
4      32693K         20596K         264                21
5      33751K         20926K         287                44
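As a quick consistency check on Table II, the number of IRC flows not matched by payload signatures can be obtained by subtracting the known IRC flows from the total IRC flows. The short computation below uses only the values in the table and reads "K" as thousands; the variable names are illustrative.

```python
total_flows = [35409e3, 29538e3, 35272e3, 32693e3, 33751e3]
total_irc = [606, 569, 253, 264, 287]
known_irc = [363, 326, 10, 21, 44]

# IRC flows per day that the payload signatures did not label.
unknown_irc = [t - k for t, k in zip(total_irc, known_irc)]

# Share of IRC flows in the total traffic of each day.
irc_share = [irc / total for irc, total in zip(total_irc, total_flows)]

print(unknown_irc)
print([f"{share:.5%}" for share in irc_share])
```

The difference is 243 on every day, the same as the number of botnet IRC flows replayed into the traffic, and IRC accounts for less than 0.002% of all flows on each day.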

Table III lists the DR and FAR for each of the five days, and Table IV lists the average standard deviation over the 256 characters of the payload collected on the network for each cluster.

TABLE III
DETECTION PERFORMANCE OVER FIVE DAYS

Day    DR (%)    FAR (%)
1      100.0     8.9
2      100.0     6.8
3      77.8      3.1
4      100.0     1.6
5      100.0     5.0

TABLE IV
STANDARD DEVIATION OF BYTES FREQUENCY OVER 256 ASCIIS FOR NORMAL AND BOTNET CLUSTERS

       Average Standard Deviation
Day    Normal Clusters    Botnet Clusters
1      0.0015             0.0005
2      0.0029             0.0017
3      0.0015             0.0006
4      0.0013             0.0005
5      0.0015             0.0006

From Table II, we see that the total number of flows collected per day ranges from about 29.5M to 35.4M, and the number of known flows that can be labeled by the payload signatures ranges from about 18.3M to 23.7M. The number of IRC flows per day over the five consecutive days ranges from 253 to 606, which is a very small part of the total flows. Our traffic classification approach classifies the unknown IRC flows into the IRC application community with a 100% classification rate over the five days of evaluation. The detection rate for differentiating bot IRC traffic from normal human IRC traffic is 100% on four of the five days; the exception is the third day, on which our approach obtained a 77.8% detection rate with a 3.1% false alarm rate. The best result over the five days is a 100% detection rate with only a 1.6% false alarm rate (day 4). Moreover, the results in Table IV indicate that the average standard deviation of bytes frequency over the 256 ASCIIs in the flow payload is an important metric for distinguishing normal human IRC clusters from malicious IRC traffic generated by bots.

V. CONCLUSION

In this paper we attempt a taxonomy of existing botnet detection approaches and classify them into three categories, namely honeypot based, passive anomaly analysis based and traffic application classification based. As claimed by Gu et al., anomaly based botnet detection approaches have the potential to find different types of botnets, while existing traffic classification approaches only focus on differentiating malicious IRC traffic from normal IRC traffic, which is considered their biggest limitation. We address this limitation by presenting a novel generic application classification approach. With it, unknown applications on the current network are classified into different application communities, such as the Chat (or more specifically IRC) community, the P2P community, the Web community, etc. Since botnets exploit existing application protocols, detection can be conducted within each specific community. As a result, our approach can be extended to find different types of botnets.

In particular, we evaluate our framework on the IRC community in this paper, and the evaluation results show that our approach obtains a high detection rate (100% on four of the five days) with a low false alarm rate (1.6% to 8.9%) when detecting IRC botnet traffic. We formalize the botnet behaviours by using the average standard deviation of bytes frequency over the 256 ASCIIs in the traffic payload, and derive an important bot identification strategy: the higher the value of the average deviation, the more human-like the IRC traffic. This strategy is important when using unsupervised clustering algorithms for botnet detection in later research.

In the near future, we will evaluate our approach on the Web specific community and test its performance on Web based botnets. Some novel P2P botnet construction methods have been proposed and investigated in [21], and we will therefore also evaluate our approach on the newly appeared P2P botnets.

ACKNOWLEDGMENT

The authors gratefully acknowledge the funding from the Atlantic Canada Opportunities Agency (ACOA) through the Atlantic Innovation Fund (AIF) to Dr. Ghorbani.

REFERENCES

[1] P. Barford and V. Yegneswaran, "An inside look at Botnets," Special Workshop on Malware Detection, Advances in Information Security, Springer Verlag, ISBN: 0-387-32720-7, 2006.
[2] The Honeynet Project & Research Alliance, "Know your enemy: Tracking botnets," http://www.honeynet.org, March 2005.
[3] M.A. Rajab, J. Zarfoss, F. Monrose, and A. Terzis, "A multifaceted approach to understanding the botnet phenomenon," Proceedings of the 6th ACM SIGCOMM Conference on Internet measurement, pp. 41-52, October 2006.
[4] P. Baecher, M. Koetter, T. Holz, M. Dornseif, and F. Freiling, "The nepenthes platform: an efficient approach to collect malware," Proceedings of Recent Advances in Intrusion Detection, LNCS 4219, Springer-Verlag, pp. 165-184, Hamburg, September 2006.
[5] V. Yegneswaran, P. Barford, and V. Paxson, "Using honeynets for internet situational awareness," Proceedings of the 4th Workshop on Hot Topics in Networks, College Park, MD, November 2005.
[6] Z.H. Li, A. Goyal, and Y. Chen, "Honeynet-based botnet scan traffic analysis," Botnet Detection: Countering the Largest Security Threat, in Series: Advances in Information Security, Vol. 36, W.K. Lee, C. Wang, D. Dagon (Eds.), Springer, ISBN: 978-0-387-68766-7, 2008.
[7] G.F. Gu, J.J. Zhang, and W.K. Lee, "BotSniffer: detecting botnet command and control channels in network traffic," Proceedings of the 15th Annual Network and Distributed System Security Symposium, San Diego, CA, February 2008.
[8] A. Karasaridis, B. Rexroad, and D. Hoeflin, "Wide-scale botnet detection and characterization," Proceedings of the 1st Conference on 1st Workshop on Hot Topics in Understanding Botnets, Cambridge, MA, 2007.
[9] J. R. Binkley and S. Singh, "An algorithm for anomaly-based botnet detection," USENIX SRUTI: 2nd Workshop on Steps to Reducing Unwanted Traffic on the Internet, July 2006.
[10] W. T. Strayer, R. Walsh, C. Livadas, and D. Lapsley, "Detecting botnets with tight command and control," Proceedings 2006 31st IEEE Conference on Local Computer Networks, pp. 195-202, Nov. 2006.
[11] W. T. Strayer, D. Lapsley, R. Walsh, and C. Livadas, "Botnet Detection Based on Network Behavior," Botnet Detection: Countering the Largest Security Threat, in Series: Advances in Information Security, Vol. 36, W.K. Lee, C. Wang, D. Dagon (Eds.), Springer, ISBN: 978-0-387-68766-7, 2008.
[12] C. Livadas, R. Walsh, D. Lapsley, and W.T. Strayer, "Using machine learning techniques to identify botnet traffic," Proceedings 2006 31st IEEE Conference on Local Computer Networks, pp. 967-974, Nov. 2006.
[13] A. W. Moore and K. Papagiannaki, "Toward the accurate identification of network applications," Proceedings of 6th International Workshop on Passive and Active Network Measurement, pp. 41-54, Boston, MA, March 2005.
[14] N. Williams, S. Zander and G. Armitage, "A preliminary performance comparison of five machine learning algorithms for practical IP traffic flow classification," ACM SIGCOMM Computer Communication Review, Vol. 36, Issue 5, pp. 5-16, 2006.
[15] M. Crotti, M. Dusi, F. Gringoli and L. Salgarelli, "Traffic classification through simple statistical fingerprinting," ACM SIGCOMM Computer Communication Review, Vol. 37, Issue 1, pp. 5-16, 2007.
[16] L. Bernaille, R. Teixeira, I. Akodkenou, A. Soule, and K. Salamatian, "Traffic classification on the fly," ACM SIGCOMM Computer Communication Review, Vol. 36, Issue 2, pp. 23-26, 2006.
[17] D. Chakrabarti, S. Papadimitriou, D. Modha, and C. Faloutsos, "Fully Automatic Cross-Associations," Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 79-88, Seattle, Washington, August 22-25, 2004.
[18] K. Wang and S. Stolfo, "Anomalous payload-based worm detection and signature generation," Proceedings of the 8th International Symposium on Recent Advances in Intrusion Detection (RAID), Seattle, WA, 2005.
[19] G. F. Gu, P. Porras, V. Yegneswaran, M. Fong, and W.K. Lee, "BotHunter: detecting malware infection through IDS-Driven dialog correlation," Proceedings of the 16th USENIX Security Symposium, Boston, MA, August 2007.
[20] P. Wang, S. Sparks, and C. Zou, "An advanced hybrid peer-to-peer botnet," Proceedings of the 1st conference on 1st Workshop on Hot Topics in Understanding Botnets, Cambridge, MA, 2007.
[21] C. Zou and R. Cunningham, "Honeypot-aware advanced botnet construction and maintenance," Proceedings of International Conference on Dependable Systems and Networks, June 2006.
