2013 IEEE Symposium on Computers & Informatics

An Efficient False Alarm Reduction Approach in HTTP-based Botnet Detection

Meisam Eslahi
Computer System and Technology Dept., University of Malaya, Malaysia. Email: [email protected]

H. Hashim
Faculty of Electrical Engineering, Universiti Teknologi MARA, Malaysia. Email: [email protected]

N.M. Tahir
Faculty of Electrical Engineering, Universiti Teknologi MARA, Malaysia. Email: [email protected]

Abstract— In recent years, bots and botnets have become one of the most dangerous infrastructures for carrying out nearly every type of cyber-attack. Their dynamic and flexible nature, along with sophisticated mechanisms, makes them difficult to detect. One of the latest generations of botnets, called HTTP-based, uses the standard HTTP protocol to impersonate normal web traffic and bypass current network security systems (e.g. firewalls). Moreover, the HTTP protocol is commonly used by normal applications and services on the Internet, so detecting HTTP botnets with a low rate of false alarms (i.e. false negatives and false positives) has become a notable challenge. In this paper, we review the current studies on HTTP-based botnet detection along with their shortcomings. We also propose a detection approach that improves HTTP-based botnet detection with respect to the rate of false alarms and the detection of HTTP bots with random patterns. The test results show that the proposed method is able to reduce the false alarm rates in HTTP-based botnet detection successfully.

Keywords— Network Security, Botnet Detection, Command and Control Mechanism, HTTP Botnet, False Alarm Rate.

I. INTRODUCTION

A botnet threat comes from three main elements: the bots, the command and control (C&C) servers, and the botmasters. A bot is a small application designed to infect computers and use them as part of a botnet without their owners' knowledge. The infected computers, or zombies, are controlled by skilful remote attackers called botmasters, who use C&C servers as an interface to send orders to all the bots and control the entire botnet [1]. In general, botnet command and control models differ in communication style (i.e. PUSH or PULL), architecture (e.g. centralised or decentralised) and protocol (e.g. IRC, HTTP and P2P) [2].

The Internet Relay Chat (IRC) protocol was used in the first generation of botnets, where IRC servers and the relevant channels are employed to establish a central C&C server that distributes the botmaster's commands [3]. IRC bots follow the PUSH approach, as they connect to selected channels and remain in connected mode [4]. Although IRC botnets are easy to use, control and manage, they suffer from a central point of failure [5]. To overcome this issue, in the P2P model, instead of having a central C&C server, the botmaster sends a command to one or more bots, which deliver it to their neighbours. Since the botmaster's commands are distributed by other bots, the botmaster cannot monitor the delivery status of the commands [2]. Moreover, implementing a P2P botnet is difficult and complex. Therefore, botmasters have begun to use the central C&C model again, where the HTTP protocol is used to publish commands on certain web servers [6, 7]. Instead of remaining in connected mode, HTTP bots periodically visit certain web servers to get updates or new commands. This model is called the PULL style and is repeated at a regular interval defined by the botmaster [8]. Because a wide range of legitimate HTTP services is in use, this traffic is not easy to block [7, 9]. Moreover, the HTTP protocol is used by a wide range of normal applications and services on the Internet, so detecting HTTP botnets with a low rate of false detection (i.e. false negatives and false positives) has become a challenge for botnet detection studies [6, 7]. Therefore, this paper proposes a method to detect HTTP botnets with low false positive and false negative rates. The major contributions can be summarised as follows:


•	A botnet detection technique based on a behavioural analysis approach to detect malicious activities in a given network.

•	Two filter algorithms, High Access Rate (HAR) and Low Access Rate (LAR), to remove a wide range of unwanted traffic without using any white or black lists, in order to reduce the rate of false alarms (i.e. false positives and false negatives).

•	A periodic pattern detector called Periodic Access Analysis (PAA) to detect PULL-style HTTP botnet traffic with both fixed and random intervals.

The remainder of this paper is organised as follows. Section II presents current studies on HTTP-based botnet detection along with their weaknesses. Section III proposes a data reduction and analysis approach to overcome current challenges regarding false alarm rates. The experiment and resulting analysis are considered in Section IV, followed by discussion and future works in Section V. Finally, Section VI gives the overall conclusions of this paper.



II. RELATED WORKS

A considerable number of studies on botnet detection have adopted passive analysis, collecting network traffic for a specific period and analysing it in order to identify any evidence of bot and botnet activities [7]. Table I summarises several previous studies on HTTP-based botnet detection, showing their false negative and false positive rates and their efficiency in detecting bots with random patterns (e.g. random intervals or packet sizes).

Jae-Seo et al. [3] and Tung-Ming et al. [9] introduced a parameter based on one of the pre-defined characteristics of HTTP-based botnets. They suggested a Degree of Periodic Repeatability (DPR) to capture the pattern of regular connections (i.e. PULL style) from HTTP-based bots to certain servers. In their method, an activity is considered a bot if its DPR is low; however, the DPR becomes low only if a bot uses fixed connection intervals. By randomising the connection intervals, botmasters can evade this technique and cause false negatives [10]. Moreover, the authors observed that with this technique normal automatic software, such as updaters, can be detected as bots and generate false positives.

To reduce the false alarm rates, Gu et al. proposed BotSniffer [8] and its extension BotMiner [4], based on analysing similarities in the abnormal or malicious activities generated by a group of bots from the same botnet. Although these systems can detect bots with random intervals, the authors observed that some services, such as Gmail sessions that periodically check for new email, can generate high false positive rates. Moreover, these methods are designed around cooperative behaviour analysis, which requires an adequate number of members (bots) in one botnet for detection to succeed. Therefore, their proposed group correlation analysis is less efficient (i.e. produces a high false negative rate) in the detection of small-scale botnets and single bots. To overcome this shortcoming, BotSniffer includes a sub-system for single-bot detection, but, as noted by the authors, it is not as robust as their group analysis technique.

Lu et al. [11] categorised services and application flows using payload signatures, examining the bit strings in packet payloads. These signatures were used to separate known traffic from unknown traffic in order to decrease the false alarm rates. Like traditional signature-based techniques, the proposed classifier is less effective because it is unable to identify new or encrypted patterns, which may increase the false negative rate. To overcome this issue, they also propose a fuzzy cross-association classifier that uses synchronisation activity as a metric, based on the fact that bots may perform abnormal activities in synchronisation with other bots in the same botnet. This method again requires a large number of bots in one botnet and may generate false alarms for small-scale botnets.

Finally, in order to detect small-scale botnets with fewer false alarms, Binbin et al. [6] used the request bytes, response bytes and number of packets as common features of an HTTP connection to cluster similar connections generated by a single bot. Their method can detect small-scale botnets, but techniques such as random request delays or random packet counts can evade it and produce high false negative rates. In addition, as with the other HTTP-based botnet detection approaches, normal programs that generate periodic connections (e.g. auto-refresh web pages) can be detected as bots and increase the number of false positives.

Each of the aforementioned methods comes with different trade-offs between false alarm rates and efficiency in detecting HTTP-based botnets with random patterns. Therefore, this paper proposes new data filtering approaches to reduce the false positive and false negative rates in the detection results.

III. DATA REDUCTION AND ANALYSIS

This paper employs a passive behaviour analysis approach, collecting information about particular network traffic and analysing it in order to identify any signs of bot and botnet activities. Its main objective, however, is to propose an efficient filtering approach that reduces the false negative and false positive rates and thereby improves current detection solutions.

A. Data Preparation: Before applying the proposed data reduction approaches, two simple filtering models, called the HTTP Traffic Separator (HTS) and the Get and Post Separator (GPS), are applied to the collected traffic to select only HTTP traffic that uses the GET and POST methods. Such filtering is used by almost every HTTP-based botnet detection study, since HTTP-based bots use these methods to contact their C&C servers [8, 12].

B. Grouping and Sorting: The Grouping and Sorting process sorts the collected traffic packets and divides them into groups based on the source IP address, destination IP address, URL, User-Agent string and timestamp. While other studies mostly use the source IP, destination IP and domain name to group the collected packets [3, 4, 9, 13], this paper uses one of the HTTP header fields, known as the User-Agent, as an additional parameter to make the grouping and classification of the collected network packets more accurate.
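The paper describes the preparation and grouping steps in prose only, so the following Python sketch is an illustrative reading of them. It assumes each captured HTTP request has already been parsed into a dict with the fields 'method', 'src_ip', 'dst_ip', 'url', 'user_agent' and 'timestamp' (seconds); this record layout and the function name are assumptions for illustration, not part of the paper.

```python
from collections import defaultdict

def prepare_and_group(http_requests):
    """Illustrative GPS and Grouping/Sorting steps (not the paper's code)."""
    # GPS step: keep only GET and POST requests, since HTTP-based bots
    # use these methods to contact their C&C servers.
    get_post = [r for r in http_requests if r['method'] in ('GET', 'POST')]

    # Grouping step: group by source IP, destination IP, URL and
    # User-Agent string, then sort each group by timestamp.
    groups = defaultdict(list)
    for r in get_post:
        key = (r['src_ip'], r['dst_ip'], r['url'], r['user_agent'])
        groups[key].append(r)
    for key in groups:
        groups[key].sort(key=lambda r: r['timestamp'])
    return dict(groups)
```

In a real deployment the HTS step (keeping only HTTP traffic from the raw capture) would run before this, for example as a capture filter on TCP port 80.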

TABLE I. EXISTING HTTP-BASED BOTNET DETECTION

Proposed   False Alarm Rate                     Efficiency in Random
Method     False Negative    False Positive     Pattern Detection
[3]        High              High               No
[4]        Low               Medium             Yes
[6]        High              High               No
[8]        Low               Medium             Yes
[9]        High              High               No
[11]       High              Medium             Yes

C. High Access Rate Filter: The High Access Rate (HAR) filter detects and eliminates groups of similar HTTP connections or requests that have been generated within a short period of time, for example a group with more than one request per second. This is important because automatic software (e.g. updaters and downloaders) transmits similarly periodic traffic patterns that can be falsely identified as HTTP bot activity and increase the false alarm rates [4, 7, 9]. However, the number of requests generated by such applications is extremely high when compared to HTTP bots (e.g. more than one request per minute). Moreover, Strayer et al. [14] observed that a bot does not generate bulk data transfers. Therefore, this filter removes any traffic group with a high request rate and labels it as automatic software rather than bot activity.


D. Low Access Rate Filter: The Low Access Rate (LAR) filter acts on the output of the HAR filter and removes traffic groups with a low access rate, for example groups containing only a few packets over a long period of time (e.g. the entire data collection period). This filter is designed based on the observation made by Strayer et al. [14] that bots are built to perform bigger tasks much faster than humans, and hence do not generate such brief traffic.
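The paper specifies HAR and LAR only in terms of thresholds (more than one request per second for HAR, and only a few packets over the whole collection period for LAR). The sketch below is a hedged illustration of both filters applied to the groups produced by the earlier sketch; the function names and the threshold values max_rate and min_requests are assumptions chosen for illustration, not values prescribed by the paper.

```python
def har_filter(groups, max_rate=1.0):
    """Drop groups whose average request rate exceeds max_rate requests/second."""
    kept = {}
    for key, reqs in groups.items():
        duration = reqs[-1]['timestamp'] - reqs[0]['timestamp']
        # Groups with a very high request rate look like updaters or
        # downloaders rather than HTTP bots, so they are removed.
        rate = (len(reqs) - 1) / duration if duration > 0 else 0.0
        if rate <= max_rate:
            kept[key] = reqs
    return kept

def lar_filter(groups, min_requests=5):
    """Drop groups with only a handful of requests over the collection period."""
    # Bots keep contacting their C&C server, so very sparse groups are
    # labelled as normal (brief) traffic and removed.
    return {key: reqs for key, reqs in groups.items() if len(reqs) >= min_requests}
```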


E. Periodic Access Analysis: The Periodic Access Analysis (PAA) process selects the HTTP connection groups that are generated with a periodic pattern. This filter is designed around the nature of HTTP-based botnets, which follow the PULL style and connect to their command and control server periodically in order to get commands and updates. It is an improved model of the concept used by the existing HTTP-based botnet detection methods in [4, 9] to select only suspicious activities that initiate periodic communication with specific servers. The process calculates the total data collection time and divides it into equal partitions (i.e. P1 to Pn), as shown in Figure 1. Based on the experimental observations of this paper, a length of one hour is used for each partition. To illustrate the difference between normal groups and botnet groups, Figure 1 depicts an example of their distributions: the circles represent botnet activities and the triangles represent normal groups. The circles recur at variable intervals, modelling a PULL-style command and control mechanism with random intervals. Moreover, they appear in all partitions, which can be considered a periodic pattern, so they are identified as botnet activities. The remaining groups, represented by triangles, are considered non-periodic and are removed by the filter.
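The paper gives no pseudocode for PAA, so the sketch below is one possible reading of the partition test described above: the collection window is split into equal partitions and a group is flagged as suspicious only if it appears in every partition. The one-hour partition length follows the paper's observation; the function name and record layout carry over from the earlier sketches and remain assumptions.

```python
def paa_filter(groups, start_time, stop_time, partition_len=3600):
    """Keep only groups that appear in every partition of the collection window."""
    n_parts = max(1, int((stop_time - start_time) // partition_len))
    suspicious = {}
    for key, reqs in groups.items():
        # Index of the partition (P1..Pn) that each request falls into.
        seen = {int((r['timestamp'] - start_time) // partition_len)
                for r in reqs if start_time <= r['timestamp'] < stop_time}
        # A group present in all partitions shows the periodic PULL-style
        # contact pattern (fixed or random intervals) and is kept.
        if set(range(n_parts)) <= seen:
            suspicious[key] = reqs
    return suspicious
```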

[Figure 1: Normal and Suspicious Activities' Distribution. The data collection window from the start time to the stop time is divided into equal partitions P1 to Pn; suspicious (botnet) groups appear in every partition, while normal groups appear only in some of them.]

IV. EXPERIMENT AND RESULTS

In order to evaluate the proposed architecture, several experiments were conducted. Figure 3 illustrates the experimental schema, which is designed based on the topology proposed by Lu et al. [11]. The experiment requires several infected computers in different networks (i.e. different VLANs), an analyser, and command and control servers. Two HTTP-based botnets, BlackEnergy [15] and Bobax [16], are employed in this research, as they are used in most of the previous studies discussed in the literature. The evaluation used four different bot configurations to generate botnet activities, as shown in Table II. PC1 was infected by the real BlackEnergy bot and the others were infected by modified bots modelled after BlackEnergy and Bobax. PC2 was infected by HBot1, which is modelled on the BlackEnergy description but modified to be stealthier: it contacts the command and control server periodically at random intervals (i.e. three to ten minutes). HBot2 was built based on the description of Bobax and connects to the command and control server at a fixed four-minute interval. Finally, HBot3 adopts a similar structure to HBot2 but contacts the command and control server at random intervals. The bots on PC1 and PC2 connect periodically to command and control server 1, and the bots on PC3 and PC4 connect to command and control server 2. In addition to the bots, a set of small software sensors was designed and placed on the experimental clients to collect the traffic and send it to the analyser engine, which is located in VLAN 5. Moreover, the Tcpreplay [17] application was used to replay, during the experiment, normal traffic previously captured from the university campus.

[Figure 3: The Testbed Overview. Infected clients PC1 to PC4 (VLANs 1 to 4), the analyser (VLAN 5) and two command and control servers, connected through a local area network.]

TABLE II. BOT CONFIGURATIONS USED IN THE EXPERIMENT



Infected Computer    Name of Bot     Method    C&C Connection Interval
PC1                  BlackEnergy     GET       Fixed
PC2                  HBot1           GET       Random
PC3                  HBot2           POST      Fixed
PC4                  HBot3           POST      Random


TABLE III. DATA REDUCTION RESULTS OF THE PROPOSED FILTERS

Client (PC)          Infected by     Collected    Data Preparation       HAR       LAR       PAA
                                     Packets      HTS          GPS       Result    Result    Result
PC1                  BlackEnergy     74,594       27,264       3,837     192       48        48
PC2                  HBot1           87,495       21,099       2,645     255       30        22
PC3                  HBot2           70,943       23,702       3,523     201       100       34
PC4                  HBot3           72,829       30,558       2,456     357       48        21
All Collected Data                   305,861      102,623      12,461    1,005     226       125

As discussed above, a number of experiments were conducted to evaluate how well the proposed approaches reduce false alarms. Table III shows the results of one experiment, in which 305,861 packets were collected over three hours. Five main algorithms are used to filter the traffic. The first data preparation filter, HTS, removed about 66.45% of the data, reducing it from 305,861 to 102,623 packets. The second filter, GPS, removed another 87.86% of the data, leaving only 12,461 packets. The data reduction and analysis process then continued with the remaining filter algorithms, HAR and LAR, followed by Periodic Access Analysis (PAA). The purpose of these last three filters is to separate the HTTP-based botnet command and control traffic flows from the normal flows. The HAR filter proved very effective, removing 91.93% of the unwanted traffic and reducing it from 12,461 to 1,005 packets. The LAR filter reduced the traffic from 1,005 to only 226 packets, meaning 77.51% of the traffic was removed by this filter. Finally, the PAA filtered out further unwanted traffic and reduced it from 226 to only 125 packets (i.e. 44.69% of the traffic was removed). In total, across all five steps, the traffic was reduced from 305,861 to 125 packets, meaning 99.96% of the traffic data packets were removed. As can be seen in Table III, this drastic reduction is significant because it is achieved without using any white or black lists. The experimental results demonstrate that the proposed method is able to reduce the amount of unwanted data and detect all the HTTP-based bots used in the aforementioned experiment.
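As a small sanity check on Table III, the following sketch recomputes the per-stage reduction percentages from the packet counts reported for the full data set; each percentage is taken relative to the output of the previous stage, as in the text above.

```python
# Packet counts for the "All Collected Data" row of Table III.
stages = [
    ("Collected", 305_861),
    ("HTS",       102_623),
    ("GPS",        12_461),
    ("HAR",         1_005),
    ("LAR",           226),
    ("PAA",           125),
]

for (prev_name, prev_count), (name, count) in zip(stages, stages[1:]):
    removed = 100 * (prev_count - count) / prev_count
    print(f"{name}: {count} packets kept, {removed:.2f}% of {prev_name} output removed")

overall = 100 * (stages[0][1] - stages[-1][1]) / stages[0][1]
print(f"Overall reduction: {overall:.2f}%")
```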

V. DISCUSSION AND FUTURE WORK

The current HTTP-based botnet detection methods are mostly based on the fact that bots periodically connect to their command and control servers to update themselves or to get new commands from their botmaster. Jae-Seo et al. [3] observed that some normal automatic applications, such as download managers, updaters and auto-refresh web pages, exhibit similar activities and generate false alarms. Likewise, Gu et al. [8] found that some services, such as Gmail sessions that periodically check for new email, also have the potential to generate false alarms. In order to reduce the false alarm rates, the aforementioned studies proposed white list techniques. In contrast, this paper proposed two filters, HAR and LAR, which removed a wide range of unwanted traffic (about 96% of the non-related data) without using any white or black lists. Proper data reduction and filtering plays an important role in the analysis process, as it reduces the processing load at the analyser engine and on the monitored clients and end devices.

Although the experiment results show that our method significantly reduces the false alarm rate, in some circumstances normal applications can still be flagged as suspicious. For instance, if users keep auto-refreshing web sites open constantly over a long period of time (e.g. a whole day), they may generate the same pattern as HTTP-based botnets. To overcome this issue, our future work, called botsAnalytics, focuses on the User-Agent string to evaluate the originating environment (e.g. standard web browsers) of the collected requests. The User-Agent is the part of the HTTP request header that indicates the application or browser generating a request. However, as observed in [18], botnets can use fake User-Agents in order to impersonate normal applications. Therefore, future efforts should focus on designing a pattern recognition method to distinguish original User-Agents from fake ones, and on using this method to improve botnet detection.

We are also working on two algorithms to prioritise the detected suspicious activities and rank them qualitatively (i.e. low and high). This will help network security experts make better decisions about detected suspicious activities. Moreover, in future work more study should be dedicated to the detection of HTTP-based botnets with multiple C&C servers rather than only those that communicate with a single C&C server.

Finally, based on our investigation, mobile devices and networks are being targeted by a new generation of botnets called MoBots [19]. Mobile networks are now well integrated with the Internet (e.g. through 3G, 4G and LTE technologies) and provide efficient environments that attract botmasters [20]. Moreover, Knysz et al. [21] found that mobile botnet activities over WiFi connections (i.e. mobile HTTP-based botnets) are more difficult to monitor and detect. On the other hand, the techniques and filtering approaches discussed in this paper are designed mainly around computer and computer network characteristics and may not be fully applicable to mobile HTTP botnets. Therefore, we are working on extending the current approach into a central security management solution (e.g. cloud-based), called mobAnalytics, to detect, mitigate and respond to this new generation of botnets on mobile devices (smartphones in particular).

VI. CONCLUSION

This paper proposed several approaches to reduce the false alarm rate in HTTP-based botnet detection. The proposed methods were evaluated based on their false positive and false negative rates and their efficiency in detecting botnets with random intervals. The test results show that the proposed method achieves higher efficiency in detecting HTTP-based botnets. The very low false positive ratio obtained through the use of the newly proposed HAR and LAR filters shows that the proposed method is able to reduce false alarm rates and improve on current HTTP-based botnet detection studies.

ACKNOWLEDGMENTS

This work was supported in part by the Research Management Institute, Universiti Teknologi MARA, Malaysia.

REFERENCES

[1] L. Chao, J. Wei and Z. Xin, "Botnet: Survey and Case Study," in Proceedings of the Fourth International Conference on Innovative Computing, Information and Control (ICICIC), 2009, pp. 1184-1187.
[2] M. Bailey, E. Cooke, F. Jahanian, X. Yunjing and M. Karir, "A Survey of Botnet Technology and Defenses," in Proceedings of the Cybersecurity Applications & Technology Conference for Homeland Security (CATCH), 2009, pp. 299-304.
[3] L. Jae-Seo, J. HyunCheol, P. Jun-Hyung, K. Minsoo and N. Bong-Nam, "The Activity Analysis of Malicious HTTP-Based Botnets Using Degree of Periodic Repeatability," in Proceedings of the International Conference on Security Technology (SECTECH), 2008, pp. 83-86.
[4] G. Gu, R. Perdisci, J. Zhang and W. Lee, "BotMiner: Clustering Analysis of Network Traffic for Protocol and Structure Independent Botnet Detection," in Proceedings of the 17th Conference on Security Symposium, San Jose, USA, 2008, pp. 139-154.
[5] G. Fedynyshyn, M. C. Chuah and G. Tan, "Detection and Classification of Different Botnet C&C Channels," in Proceedings of the 8th International Conference on Autonomic and Trusted Computing, 2011, pp. 228-242.
[6] W. Binbin, L. Zhitang, L. Dong, L. Feng and C. Hao, "Modeling Connections Behavior for Web-Based Bots Detection," in Proceedings of the 2nd International Conference on e-Business and Information System Security (EBISS), 2010, pp. 1-4.
[7] M. Eslahi, R. Salleh and N. B. Anuar, "Bots and Botnets: An Overview of Characteristics, Detection and Challenges," presented at the IEEE International Conference on Control System, Computing and Engineering, Penang, Malaysia, 2012.
[8] G. Gu, J. Zhang and W. Lee, "BotSniffer: Detecting Botnet Command and Control Channels in Network Traffic," in Proceedings of the 15th Annual Network and Distributed System Security Symposium (NDSS), 2008.
[9] K. Tung-Ming, C. Hung-Chang and W. Guo-Quan, "Construction P2P firewall HTTP-Botnet defense mechanism," in Proceedings of the IEEE International Conference on Computer Science and Automation Engineering (CSAE), 2011, pp. 33-39.
[10] J. Dae-il, C. Kang-yu, K. Minsoo, J. Hyun-chul and N. Bong-Nam, "Evasion technique and detection of malicious botnet," in Proceedings of the International Conference for Internet Technology and Secured Transactions (ICITST), 2010, pp. 1-5.
[11] W. Lu, M. Tavallaee and A. A. Ghorbani, "Automatic discovery of botnet communities on large-scale communication networks," presented at the 4th International Symposium on Information, Computer, and Communications Security, Sydney, Australia, 2009.
[12] H. Binsalleeh, T. Ormerod, A. Boukhtouta, P. Sinha, A. Youssef, M. Debbabi and L. Wang, "On the Analysis of the Zeus Botnet Crimeware Toolkit," in Proceedings of the Eighth Annual International Conference on Privacy, Security and Trust (PST), 2010, pp. 31-38.
[13] W. Binbin, L. Zhitang, L. Dong, L. Feng and C. Hao, "Modeling Connections Behavior for Web-Based Bots Detection," in Proceedings of the 2nd International Conference on e-Business and Information System Security (EBISS), 2010, pp. 1-4.
[14] W. T. Strayer, R. Walsh, C. Livadas and D. Lapsley, "Detecting Botnets with Tight Command and Control," in Proceedings of the 31st IEEE Conference on Local Computer Networks, 2006, pp. 195-202.
[15] J. Nazario. (September 24, 2012). BlackEnergy DDoS Bot Analysis. Available: http://atlas-public.ec2.arbor.net/docs/BlackEnergy+DDoS+Bot+Analysis.pdf
[16] P. Royal. (September 24, 2012). On the Kraken and Bobax botnets [PDF]. Available: www.damballa.com/downloads/press/Kraken_Response.pdf
[17] A. Turner. (2012). Tcpreplay. Available: http://tcpreplay.synfin.net/
[18] C. Rossow, C. J. Dietrich, H. Bos, L. Cavallaro, M. v. Steen, F. C. Freiling and N. Pohlmann, "Sandnet: Network Traffic Analysis of Malicious Software," in Proceedings of the First Workshop on Building Analysis Datasets and Gathering Experience Returns for Security, Salzburg, Austria, 2011, pp. 78-88.
[19] M. Eslahi, R. Salleh and N. B. Anuar, "MoBots: A New Generation of Botnets on Mobile Devices and Networks," presented at the IEEE International Symposium on Computer Applications and Industrial Electronics (ISCAIE), Kota Kinabalu, Malaysia, 2012.
[20] J. Kok and B. Kurz, "Analysis of the BotNet Ecosystem," in Proceedings of the 10th Conference of Telecommunication, Media and Internet Techno-Economics (CTTE), 2011, pp. 1-10.
[21] M. Knysz, X. Hu, Y. Zeng and K. G. Shin, "Open WiFi networks: Lethal weapons for botnets?," in Proceedings of the IEEE International Conference on Computer Communications (INFOCOM), 2012, pp. 2631-2635.
