2009 Seventh Annual Communication Networks and Services Research Conference
BotCop: An Online Botnet Traffic Classifier

Wei Lu, Mahbod Tavallaee, Goaletsa Rammidi and Ali A. Ghorbani
Faculty of Computer Science, University of New Brunswick, Fredericton, NB E3B 5A3, Canada
{wlu, m.tavallaee, g.rammidi, ghorbani}@unb.ca

Abstract
A botnet is a network of compromised computers infected with malicious code that can be controlled remotely under a common command and control (C&C) channel. As one of the most serious security threats to the Internet, a botnet can not only be implemented with existing network applications (e.g. IRC, HTTP, or Peer-to-Peer) but can also be constructed with unknown or creative applications, which makes botnet detection a challenging problem. In this paper, we propose a new online botnet traffic classification system, called BotCop, in which network traffic is fully classified into different application communities by using payload signatures and a novel decision tree model. Then, on each obtained application community, the temporal-frequent characteristic of flows is studied and analyzed to differentiate the malicious communication traffic created by bots from normal traffic generated by human beings. We evaluate our approach with about 30 million flows collected over one day on a large-scale WiFi ISP network, and the results show that the proposed approach successfully detects an IRC botnet with a high detection rate and a low false alarm rate.

1. Introduction
Over the past few years botnets have distinguished themselves as the main source of malicious activities such as distributed-denial-of-service (DDoS) attacks, phishing, spamming, keylogging, click fraud, identity theft and information exfiltration. Like other malicious software, botnets use a self-propagating application to infect vulnerable hosts. They, however, take advantage of a command and control (C&C) channel through which they can be updated and directed. According to their C&C models, botnets are divided into two groups: centralized (e.g., IRC and HTTP) and distributed (e.g., P2P). Centralized botnets employ two mechanisms to receive commands from the server, namely push and pull. In the push mechanism, bots are connected to the C&C server (e.g., IRC server) and wait for commands from the botmaster. In contrast, in the pull mechanism, the botmaster sets the commands in a file at the C&C server (e.g., HTTP server), and the bots frequently connect to the server to read the latest commands. While in the centralized structure all bots receive the commands from a specific server, in the distributed structure the command files are shared over P2P networks by the botmaster, and bots can use specific search keys to find the published command files.

In reality, detecting and blocking an IRC botnet is not a difficult task, since the whole botnet can be taken down by blacklisting the IRC server. To overcome this weakness, botnets have evolved by allowing more flexibility in the applied protocols, and they are now even moving from the centralized structure to an advanced distributed strategy that removes the single point of failure. Compared to the traditional centralized C&C model, the distributed (Peer-to-Peer) botnet is much harder to detect and destroy because the bots' communication does not heavily depend on a few selected servers, and thus shutting down a single bot, or even a couple of bots, does not necessarily lead to the complete destruction of the whole botnet.

Early research on botnet detection is mainly based on honeypots [1,2,3]. Setting up and installing honeypots on the Internet is very helpful to capture malware and understand the basic behavior of botnets and, as a result, makes it possible to create bot binaries or botnet signatures. However, this analysis is always based on existing botnets and provides no solution for new botnets. To overcome this issue, new methods have been proposed to automatically detect botnets. These approaches can be categorized into two major groups: (1) passive anomaly analysis [e.g. 4,5]; and (2) traffic classification [e.g. 6]. Botnet detection based on passive anomaly analysis is usually independent of the traffic content and has the potential to find different types of botnets (e.g., HTTP, IRC and P2P). This approach is, however, limited to a specific botnet structure (e.g. centralized only). In contrast, traffic classification focuses on classifying network traffic into the corresponding applications, and then distinguishing between normal and malicious activities. The biggest challenge of this approach is the classification of traffic into appropriate application groups.

978-0-7695-3649-1/09 $25.00 © 2009 IEEE. DOI 10.1109/CNSR.2009.21
Authorized licensed use limited to: National Taiwan University. Downloaded on December 30, 2009 at 08:41 from IEEE Xplore. Restrictions apply.
Addressing the aforementioned challenges, we propose a hierarchical framework for next generation botnet detection, which consists of two levels: (1) in the higher level, all unknown network traffic is labeled and classified into different network application communities, such as the P2P community, HTTP Web community, Chat community, DataTransfer community, Online Games community, Mail Communication community, Multimedia (streaming and VoIP) community and Remote Access community; (2) in the lower level, focusing on each application community, we investigate and apply the temporal-frequent characteristics of network flows to differentiate malicious botnet behavior from normal application traffic.

The major contributions of this paper include: (1) we propose a novel application discovery approach for automatically classifying network applications on a large-scale WiFi ISP network; and (2) we develop a generic algorithm to discriminate general botnet behavior from normal network traffic on a specific application community, which is based on the n-gram (frequent characteristics) of flow payload over a time period (temporal characteristics).

The rest of the paper is organized as follows. Section 2 introduces related work, in which we discuss representative literature on current botnet detection. The proposed online traffic classification method is discussed in Section 3. Section 4 presents the temporal-frequent characteristic and then explains our botnet detection approach. Section 5 presents the experimental evaluation of our detection model with a mixture of around 30 million flows collected on a large-scale WiFi ISP network and a botnet traffic trace collected on a honeynet deployed on the public Internet. Finally, in Section 6 we make some concluding remarks and discuss future work.
2. Related work
Previous attempts to detect botnets are mainly based on honeypots, passive anomaly analysis and traffic classification. In order to get a full understanding of botnet behavior, honeypots are widely installed and set up on the Internet to capture malware and consequently track and analyze the bots [1,2,3]. A typical example is the Nepenthes honeypot, which is commonly used to collect shell code or bot binaries by mimicking a reply that can be generated by a vulnerable service. Rajab et al. in [1] deployed Nepenthes to collect malware in their unused IP address space. A honeynet consisting of VMware virtual machines running Windows XP is used to capture any exploits that may be missed by Nepenthes. Once all binaries are collected, they use greybox testing that runs the collected binary on a clean image of a Windows XP virtual machine while logging all traffic, in order to learn how a compromised host joins that particular botnet in the wild. During this testing, network fingerprints are created to capture network information such as DNS requests, destination IP addresses, contacted ports and the presence of default scanning behavior. IRC-related features are also extracted by running an IRC server on the testing hosts; any attempted connections are logged and an IRC fingerprint consisting of PASS, NICK, USER, MODE and JOIN values is created. Botnets are then tracked by joining a modified IRC tracker to the actual IRC server and observing it, and also by DNS cache probing.

Although the honeypot based approach is quite helpful in creating bot binaries and bot signatures, it is always limited to existing botnets and provides no solution for new bots. To overcome this shortcoming, two botnet detection approaches have been proposed recently, namely traffic classification and passive anomaly analysis. A typical work on traffic classification based botnet detection using machine learning algorithms is presented in [6], in which Strayer et al. propose an approach for detecting botnets by examining flow characteristics such as bandwidth, duration, and packet timing in order to look for evidence of botnet command and control activities. They propose an architecture that first eliminates traffic that is unlikely to be part of a botnet, then classifies the remaining traffic into a group that is likely to be part of a botnet, and finally correlates the likely traffic to find common communication patterns that would suggest the activity of a botnet.

Typical approaches to passive anomaly based botnet detection are discussed in [4,5]. In [4], Karasaridis et al. study network flows and detect IRC botnet controllers in four steps, of which the most important is to identify hosts with suspicious behavior and isolate flow records to and from those hosts. In [5], Gu et al. investigate the spatial-temporal correlation and similarity in network traffic and implement a prototype system, BotSniffer, to detect botnets. All of the above botnet detection techniques are limited either to specific C&C protocols or to specific botnet structures.
3. Traffic classification
Early common techniques for identifying network applications rely on the association of a particular port with a particular protocol. Such port number based traffic classification has proven to be ineffective due to: (1) the constant emergence of new peer-to-peer networking applications for which IANA does not define corresponding port numbers [7], (2) the dynamic port number assignment of some applications (e.g. FTP for data transfer), and (3) the encapsulation of different services into the same application (e.g. chat or streaming can be encapsulated into the same HTTP protocol). Recent studies on network traffic application classification include "applying machine learning algorithm for clustering and classifying traffic flows based on a set of statistical features" [8,9], "modeling payload content signatures for traffic application classification" [10,11] and "identifying traffic based on heuristics derived from analysis of communication patterns of hosts" [12,13].

Although existing traffic classification mechanisms generate a number of good ideas, they are far from complete due to the limited number of applications they can identify and their coarse application scopes (e.g. BLINC in [13] attempts to identify general P2P traffic instead of the specific underlying P2P applications like eDonkey or BitTorrent). Moreover, comparing all of the above methods is difficult because of the lack of sharable datasets and appropriate metrics [14]. Addressing these limitations, we propose in this paper a hybrid mechanism for classifying flow applications on the fly. We first model and generate signatures for more than 470 applications according to the port numbers and protocol specifications of these applications. Then, concentrating on unknown flows that cannot be identified by signatures, we investigate their temporal-frequent characteristics in order to assign them to the already labeled applications, based on a decision tree trained on the corresponding temporal-frequent characteristics of known flows. Next we discuss the online traffic classification system in more detail.
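The two-stage flow of the hybrid mechanism described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the system's implementation: the three payload prefixes are well-known protocol openings chosen for illustration (they are not the paper's signature base of more than 470 applications), and `toy_fallback` is a hypothetical stand-in for the trained decision tree, with a made-up split.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional

# Illustrative payload prefixes only; NOT the paper's signature base.
SIGNATURES: Dict[str, bytes] = {
    "BitTorrent": b"\x13BitTorrent protocol",  # BitTorrent handshake header
    "SSH": b"SSH-",                             # SSH version banner
    "HTTPWeb": b"GET ",                         # start of an HTTP GET request
}


def match_signature(payload: bytes) -> Optional[str]:
    """Return the application whose signature prefixes the payload, if any."""
    for app, prefix in SIGNATURES.items():
        if payload.startswith(prefix):
            return app
    return None


def byte_frequencies(payload: bytes) -> List[float]:
    """Relative frequency of each of the 256 byte values in the payload."""
    if not payload:
        return [0.0] * 256
    counts = Counter(payload)
    return [counts.get(b, 0) / len(payload) for b in range(256)]


def classify_flow(payload: bytes,
                  fallback: Callable[[List[float]], str]) -> str:
    """Stage 1: signature match. Stage 2: fallback model on byte frequencies."""
    app = match_signature(payload)
    if app is not None:
        return app
    return fallback(byte_frequencies(payload))


def toy_fallback(freq: List[float]) -> str:
    """Hypothetical stand-in for the trained decision tree (one made-up split)."""
    return "Chat" if freq[ord(" ")] > 0.1 else "P2P"
```

Passing the second stage in as a callable keeps the signature engine independent of whichever model handles the flows that no signature matches.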
3.1. Signatures based classifier
The payload signature based classifier investigates the characteristics of bit strings in the packet payload. For most applications, the initial protocol handshake steps are usually different and thus can be used for classification. Moreover, the protocol signatures can be modeled either from public documents such as RFCs or by empirical analysis that derives the distinct bit strings on both TCP and UDP traffic. The signatures based classifier is deployed on Fred-eZone, a free wireless fidelity (WiFi) network service provider operated by the City of Fredericton [15]. Table 1 lists the general workload dimensions of the Fred-eZone network. From Table 1, we see, for example, that the number of unique source IP addresses (SrcIP) appearing over one day is about 1,055 thousand and the total number of packets is about 944 million. All the flows are bi-directional, and we remove all unidirectional flows before applying the classifier. Table 2 lists the classification results over one hour of traffic collected on Fred-eZone. From Table 2, we see that about 249,000 flows can be identified by the application payload signatures and about 215,000 flows cannot be identified. A general result is that about 40% of flows cannot be classified by the current payload signatures based classification method. In the next section we build a module that works in parallel with the signatures based application detection engine. The new module focuses only on those applications that the signature-based detector could not identify and that appear to the signatures-based classifier as unknown.

Table 1. Workload of Fred-eZone WiFi network over 1 day

SrcIP    DstIP    Flows     Packets   Bytes
1055K    1228K    30783K    994M      500G

Table 2. Classification results with one hour traffic on Fred-eZone

                        App.   Flows   SrcIPs   DstIPs
Known Applications             249K    102K     202K
Unknown Applications    82     215K    1001K    1055K

3.2. Decision tree based classifier
The n-gram byte distribution has proven its efficiency in detecting network anomalies. Wang et al. examine the 1-gram byte distribution of the packet payload, represent each packet as a 256-dimensional vector describing the occurrence frequency of each of the 256 ASCII characters in the payload, and then construct the normal packet profile by calculating the statistical average and deviation of normal packets for a specific application service (e.g. HTTP) [16]. An anomaly is alerted once the Mahalanobis distance of the testing data from the normal profile exceeds a predefined threshold. Gu et al. improve this approach and apply it to detecting malware infection in their recent work [17]. Different from previous n-gram based approaches for network intrusion detection, in this paper we extend the n-gram frequency into the temporal domain and generate a set of 256-dimensional vectors representing the temporal-frequent characteristics of the 256 ASCII binary bytes on the payload over a predefined time interval. By observing and analyzing the known network traffic applications, labeled by the signatures based classifier, over a long period on a large-scale WiFi ISP network, we found that the n-gram (n = 1 in particular) over a one second time interval, for both source flow payload and destination flow payload, is a strong enough feature to differentiate traffic applications. As an example, Figures 1 to 5 illustrate this novel temporal-frequent metric for the applications BitTorrent (P2P), Gnutella (P2P), LimeWire (P2P), HTTPWeb (WEB) and SecureWeb (WEB), respectively. Axis X in all five figures is the ASCII characters from 0 to 255 on the source flow payload. Axis Y stands for the frequent value of each ASCII character appearing over a predefined time interval (i.e. 1 second).
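As a concrete illustration of the temporal-frequent metric, the sketch below groups the payload bytes of one flow direction into fixed time windows and computes a 256-dimensional 1-gram vector per window. It is a sketch under assumptions: the function name and the `(timestamp, payload)` input format are hypothetical, and frequencies are normalized to relative frequencies per window, which the text does not state explicitly.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def temporal_frequent_vectors(
    packets: Iterable[Tuple[float, bytes]], window: float = 1.0
) -> Dict[int, List[float]]:
    """Map each time-window index to a 256-dimensional byte-frequency vector.

    `packets` holds (timestamp_in_seconds, payload) pairs for one flow direction.
    """
    counts: Dict[int, Counter] = defaultdict(Counter)
    for timestamp, payload in packets:
        # Iterating over bytes yields integers 0..255, one per payload byte.
        counts[int(timestamp // window)].update(payload)

    vectors: Dict[int, List[float]] = {}
    for index, counter in counts.items():
        total = sum(counter.values())
        # Assumption: normalize counts to relative frequencies within the window.
        vectors[index] = [
            counter.get(b, 0) / total if total else 0.0 for b in range(256)
        ]
    return vectors
```

For example, three packets at 0.2 s, 0.7 s and 1.5 s fall into two one-second windows and therefore produce two vectors.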
Figure 1. Temporal-frequent metric for source flow payload of BitTorrent application.
Figure 2. Temporal-frequent metric for source flow payload of Gnutella application.

By comparing Figures 1 to 3 with Figures 4 and 5, we see that the temporal-frequent metric of flow payload is very different for P2P and WEB applications. At a more fine-grained level, by comparing Figures 1 to 3 we see that the temporal-frequent metric of flow payload also differs among the applications BitTorrent, Gnutella and LimeWire. Similar results apply when differentiating the two applications (i.e. HTTPWeb and SecureWeb) in the same application group (i.e. WEB). We denote the 256-dimensional n-gram byte distribution as a vector

\[ \langle f_1^{t_i}, f_2^{t_i}, \ldots, f_{256}^{t_i} \rangle , \]

where $f_j^{t_i}$ stands for the frequency of the $j$-th ASCII character on the flow payload over a time window $t_i$ ($j = 1, 2, \ldots, 256$; $i = 0, 1, 2, \ldots$), i.e. the temporal-frequent metric of the flow payload. Given $n$ historical known flows for each specific application, we define an $n \times 256$ matrix, $p^{app}$, for profiling applications, as follows:

\[
p^{app}_{n \times 256} =
\begin{bmatrix}
f_1^{t_1} & f_2^{t_1} & \cdots & f_{256}^{t_1} \\
f_1^{t_2} & f_2^{t_2} & \cdots & f_{256}^{t_2} \\
\vdots    & \vdots    & \ddots & \vdots        \\
f_1^{t_n} & f_2^{t_n} & \cdots & f_{256}^{t_n}
\end{bmatrix}
\]
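The profiling matrix is simply a stack of per-window vectors, one row per time window. A minimal sketch, assuming the vectors have already been computed and using hypothetical helper names (`build_profile`, `training_set`), shows how per-application matrices could be flattened into labeled rows for training a classifier:

```python
from typing import Dict, List, Tuple

Vector = List[float]


def build_profile(vectors: List[Vector]) -> List[Vector]:
    """Stack n per-window vectors into an n x 256 profile matrix (list of rows)."""
    for row in vectors:
        if len(row) != 256:
            raise ValueError("each row must have 256 entries")
    return [list(row) for row in vectors]


def training_set(profiles: Dict[str, List[Vector]]) -> Tuple[List[Vector], List[str]]:
    """Flatten per-application profiles into rows X and matching labels y."""
    X: List[Vector] = []
    y: List[str] = []
    for app, matrix in profiles.items():
        X.extend(matrix)
        y.extend([app] * len(matrix))
    return X, y
```

Each row of `X` keeps the label of the application profile it came from, so the total number of rows is the sum of the profile sizes.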
Figure 3. Temporal-frequent metric for source flow payload of LimeWire application.
Figure 4. Temporal-frequent metric for source flow payload of HTTPWeb application.
Figure 5. Temporal-frequent metric for source flow payload of SecureWeb application.

We create over 470 application profiling matrices, one for each application in the signature base. Unknown flows that cannot be identified by the signatures based classifier can therefore be labeled using these profiling matrices: even when no signature in the signature base matches a flow, the temporal-frequent characteristics of its payload can always be modeled and thus used for unknown traffic classification. The decision tree technique is a good candidate for this task due to its low computational complexity and its ability to be trained on large datasets. A typical decision tree is represented as a tree structure (e.g. Figure 6), in which each node is either a leaf node or a decision node. A leaf node indicates the value of the target class, such as Application = Gnutella in Figure 6. A decision node specifies a test to be carried out on a single attribute value, with one branch and sub-tree for each possible outcome of the test, for instance the decision node $f_5$ with the branch test $f_5 \le 0.3$ in Figure 6. A decision tree classifies an example by starting at the root of the tree and moving through it until a leaf node is reached, which provides the classification of the instance. Suppose Figure 6 is the decision tree for application classification trained on the 256-dimensional attribute $\langle f_1, f_2, \ldots, f_{256} \rangle$. An unknown flow with a new 256-dimensional vector is first tested at the root node $f_1$ against the threshold 0.1. If $f_1 \le 0.1$, then $f_5$ is tested against the threshold 0.3, and if $f_5 > 0.3$, the unknown flow is labeled as the Gnutella application. The training of the decision tree is based on the 470 historical application profiling matrices, each of which includes at least 1,000 instances (i.e. the size of the matrix is $1000 \times 256$). The decision tree algorithm we apply is C4.5, proposed by Quinlan [18], since it is well known and has been frequently used over the years.

Figure 6. A typical decision tree for traffic classification. [Decision nodes: $f_1$ (threshold 0.1), $f_5$ (threshold 0.3), $f_{20}$ (threshold 0.45), $f_{64}$ (threshold 0.05); leaves: App=Gnutella, App=BitTorrent, App=LimeWire, App=Secureweb, App=Httpweb.]

4. Botnet detection
The temporal-frequent characteristic based on the n-gram over a time period can not only be applied to train the decision tree model for traffic classification, but can also discriminate the malicious traffic generated by bots from the normal traffic created by human beings. The temporal feature is important in botnet detection due to two empirical observations of botnet behavior: (1) the response time of bots is usually immediate and accurate once they receive commands from the botmaster, while a human might perform an action with various possibilities after a reasonable thinking time, and (2) bots basically have preprogrammed activities based on the botmaster's commands, and thus all bots might be synchronized with each other. These two observations have been confirmed by a preliminary experiment conducted in [19]. As an example, Figures 7 and 8 illustrate the average byte frequency over normal IRC flows and IRC botnet flows, respectively. By comparing Figures 7 and 8, we see that the average byte frequency over a specific time period for normal IRC traffic is much smaller than the average byte frequency over a specific time period for botnet IRC traffic.

Figure 7. Average byte frequency over 256 ASCIIs for normal IRC flows.
Figure 8. Average byte frequency over 256 ASCIIs for botnet IRC flows.

After obtaining the n-gram (n = 1 in this case) features for flows over a time window, we apply an agglomerative hierarchical clustering algorithm to cluster the data objects with 256 features. We do not construct normal profiles because normal traffic is sensitive to the practical networking environment, and a high false positive rate might be generated when deploying the trained model in a new environment. In contrast, agglomerative hierarchical clustering is unsupervised and does not define a threshold that needs to be tuned in different cases. In our approach, the final number of clusters is set to 2. Given a set of $N$ data objects $F = \{F_i \mid i = 1, 2, \ldots, N\}$, where $F_i = \langle f_1^{t_i}, f_2^{t_i}, \ldots, f_{256}^{t_i} \rangle$, the detection approach is described in Algorithm 1.

In practice, labeling clusters is always a challenging problem when applying unsupervised algorithms to intrusion detection. Previous intrusive cluster labeling methods are based on two assumptions: (1) there are only two clusters, one normal and the other intrusive, and (2) the number of instances in the normal cluster is much bigger than the number of instances in the intrusive cluster [20], and thus the cluster with the smaller number of instances is usually labeled as the intrusive cluster. We apply the same labeling strategy in this paper.
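The clustering and labeling procedure of Algorithm 1 can be sketched in a few lines of Python. This is an illustration under assumptions rather than the authors' implementation: the paper does not name a distance measure or a linkage criterion, so Euclidean distance and average linkage are assumed here, and the function names are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1: List[int], c2: List[int],
                    data: Sequence[Sequence[float]]) -> float:
    """Mean pairwise distance between members of two clusters (assumed linkage)."""
    total = sum(euclidean(data[i], data[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def detect_botnet_cluster(data: Sequence[Sequence[float]]) -> List[int]:
    """Merge clusters until two remain; return indices in the smaller one."""
    if len(data) < 2:
        raise ValueError("need at least two data objects")
    # Initialization: every data object starts in its own cluster (k = N).
    clusters: List[List[int]] = [[i] for i in range(len(data))]
    while len(clusters) > 2:
        # Find the closest pair of clusters and merge them.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(
                clusters[pair[0]], clusters[pair[1]], data),
        )
        clusters[a].extend(clusters[b])
        del clusters[b]  # b > a, so index a remains valid
    # Labeling: the cluster with fewer instances is taken as the botnet cluster.
    return sorted(min(clusters, key=len))
```

The naive pairwise search is cubic in the number of data objects, which is acceptable for a single application community of a few hundred flows but would need a dedicated clustering library for larger inputs.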
Algorithm 1. Implementation of botnet detection approach

Function BotDel(F) returns botnet cluster
  Inputs: collection of data objects $F_i = \langle f_1^{t_i}, f_2^{t_i}, \ldots, f_{256}^{t_i} \rangle$, $i = 1, 2, \ldots, N$
  Initialization: initialize the number of clusters k (i.e. k = N) by assigning each data instance to a cluster, so that each cluster contains only one data instance
  Repeat:
    k ← k − 1
    find the closest pair of clusters and merge them into a single cluster
    compute the distance between the new cluster and each of the old clusters
  Until: k = 2
  Calculate the number of instances in each cluster, $g_1, \ldots, g_m$, $1 \le m \le k$
  If $g_b = \min(g_1, g_2, \ldots, g_m)$ then cluster b is labeled as the botnet cluster
  Return the botnet cluster b with $g_b$

5. Experimental evaluation
We implement a prototype system for the approach and then evaluate it on a large-scale WiFi ISP network over one day. The botnet traffic is collected on a honeypot deployed on a real network and aggregated into 243 flows. The time interval for flow aggregation is 1 second. When evaluating the prototype system, we randomly insert and replay botnet traffic flows on the normal daily traffic. Since our approach is a two-stage process (i.e. unknown traffic classification first and botnet detection on application communities next), the evaluation is accordingly divided into two parts: (1) the performance testing of unknown traffic classification, which focuses not only on the capability of our approach to classify unknown IRC traffic but also on the classification accuracy for other unknown applications (e.g. new P2P), since we expect that the algorithm could be extended to detect any newly appearing decentralized botnet; (2) the performance evaluation of the system in discriminating malicious IRC botnet traffic from normal human IRC traffic.

5.1. Evaluation on traffic classification
The traffic trace used in the experimental evaluation is collected over three consecutive days on a large-scale WiFi ISP network, on which we achieve a 60% classification rate over 100 million flows. The workload of the Fred-eZone network is illustrated in Table 1. In order to create the training dataset for learning the decision tree based classifier, 11 typical applications belonging to 8 typical application groups are modeled from known labeled flows, as illustrated in Table 3. The size of the input data for training the decision tree is $11000 \times 256$. In order to validate the decision tree model, we conduct a real-time classification evaluation in which the traffic trace collected over 2 days is used for training and the real-time traffic flows collected on the 3rd day are used for testing.

Table 3. Applications in training dataset

Application ID   Application Name     Application Group   Size of Matrix
2006             BitTorrent           P2P                 1000 × 256
2000             Gnutella             P2P                 1000 × 256
2008             LimeWire             P2P                 1000 × 256
1010             HTTPWeb              WEB                 1000 × 256
1011             SecureWeb            WEB                 1000 × 256
1008             POP                  MAIL                1000 × 256
1004             SMTP                 MAIL                1000 × 256
1002             FTP                  DataTransfer        1000 × 256
5672             MSN                  CHAT                1000 × 256
1005             SSH                  RemoteAccess        1000 × 256
5005             WindowsMediaPlayer   Streaming           1000 × 256

During the online evaluation, the decision tree based classifier is deployed on a large-scale WiFi ISP network and works in parallel with the signature based classifier. More than 90,000 flows are collected over the testing day on the network and are forced to be treated as unknown; their real labels are given in Table 4. Tables 5 and 6 describe the detailed classification accuracy for each specific application using the source flow based classifier and the destination flow based classifier, respectively. The general classification accuracy of both classifiers is given in Table 7. The online evaluation results show that the decision tree classifier based on destination flows achieves a 92.6% classification accuracy, which is higher than the 89.4% accuracy obtained by the source flow based classifier. All unknown flows are assigned to specific applications and no flow remains unclassified, due to the deterministic mechanism of the decision tree structure.

Table 4. Distribution of "unknown" application flows

Applications         Number of Unknown Flows
BitTorrent           29739
FTP                  224
Gnutella             15109
HTTPWeb              16216
LimeWire             141
MSN                  4049
POP                  26
SecureWeb            12886
SMTP                 11522
SSH                  2197
WindowsMediaPlayer   722

Table 5. Classification results with source flow based decision tree classifier

Applications         Number of Unknown Flows   Number of Flows Correctly Labeled
BitTorrent           29739                     27777
FTP                  224                       193
Gnutella             15109                     11929
HTTPWeb              16216                     12635
LimeWire             141                       131
MSN                  4049                      4021
POP                  26                        26
SecureWeb            12886                     12097
SMTP                 11522                     11512
SSH                  2197                      2181
WindowsMediaPlayer   722                       481

Table 6. Classification results with destination flow based decision tree classifier

Applications         Number of Flows   Number of Flows Correctly Labeled
BitTorrent           29739             27796
FTP                  224               181
Gnutella             15109             13992
HTTPWeb              16216             13996
LimeWire             141               108
MSN                  4049              4012
POP                  26                26
SecureWeb            12886             11809
SMTP                 11522             11424
SSH                  2197              2170
WindowsMediaPlayer   722               81

Table 7. General classification accuracy for both classifiers

Classifier                                              Total Number of Flows Correctly Identified   Classification Accuracy (%)
Decision Tree Classifier Based on Source Flows          82983                                        89.4
Decision Tree Classifier Based on Destination Flows     85995                                        92.6

5.2. Evaluation on botnet detection
During the evaluation of botnet detection, the proposed approach is evaluated with one day of traffic. Table 8 shows the flow distribution for the application community with bot flows and the total number of flows after the traffic classification step. As illustrated in Table 8, the total number of flows is 32,693K and the number of flows labeled by the payload signature based classifier is 20,596K. The remaining unknown flows number 12,097K, of which 243 unknown flows are classified into the known IRC community (i.e. they actually represent the IRC C&C bot flows). Since we know that all these unknown flows actually belong to IRC, our approach obtains 100% accuracy in classifying these malicious bot C&C flows into their own application community. Next, we evaluate the capability of our approach to discriminate bot generated traffic from normal traffic in the same application community. Table 9 shows the detection results in terms of the number of correctly detected bot C&C flows and the number of falsely detected bot flows, against the actual number of bot flows and normal flows in the specific community. From Table 8, we see that the total number of flows we collect for one day is over 30 million and the total number of known flows that can be labeled by the payload signatures is over 20 million. The IRC C&C flows are a very small part of the total flows. Our traffic classification approach classifies the unknown (malicious) IRC flows into the IRC application community with a 100% classification rate in the evaluation. All the IRC C&C flows are differentiated from the normal traffic with a low false alarm rate, i.e. only 4 false alarms in the evaluation.

Table 8. Description of application community

Total Flows   Known Flows   Flows in Botnet Communities
32693K        20596K        264 IRC {21 normal}

Table 9. Detection performance

Normal IRC Flows   Bot C&C Flows   Correctly Detected Bot C&C Flows   Number of Falsely Identified Bot C&C Flows
21                 243             243                                4

6. Conclusions
In this paper, we present a novel generic botnet traffic classification framework, in which unknown applications on the current network are first classified into different application communities, such as the Chat (or more specifically IRC) community, the P2P community and the Web community, to name a few. Then, focusing on each application community, a novel temporal-frequent characteristic is applied to discriminate network traffic generated by bots from normal network traffic generated by human beings. Since botnets usually exploit existing application protocols, our approach can be extended to find different types of botnets and has the potential to find new botnets when examining specifically the traffic in the "unknown" community. In particular, we evaluate our framework on the IRC chat community, and the evaluation results show that our approach obtains a very high detection rate (approaching 100% for IRC bots) with a low false alarm rate when detecting IRC botnet traffic. In the immediate future, we will evaluate our approach on the P2P community and measure its performance on P2P based botnets.
Acknowledgement The authors graciously acknowledge the funding from the Atlantic Canada Opportunity Agency (ACOA) through the Atlantic Innovation Fund (AIF) to Dr. Ali Ghorbani.