Fast-flux Attack Network Identification Based on Agent

Fast-flux Attack Network Identification Based on Agent Lifespan

Sheng Yu, Shijie Zhou, and Sha Wang
School of Computer Science and Engineering
University of Electronic Science and Technology of China
Chengdu, China
{yusheng123, sjzhou16, rainfairy12}@gmail.com

Abstract: Fast-flux refers to rapidly changing the mapping between IP address and domain name. Although the technique has some known benign uses, it has become a favorite tool for cyber criminals launching collaborative attacks such as phishing, pharming, and malware spreading. Because legitimate fast-flux networks and malicious ones share features such as short TTLs and large IP pools, they are hard to tell apart. In this paper we propose a novel way to deal with the fast-flux attack identification issue: we measure the service availability of the agents in a fast-flux network in order to identify malicious fast-flux. As far as we know, this is the first time that fast-flux networks have been observed in terms of service availability. We develop metrics on service availability, and our observation results show that the metrics are useful.

Keywords: fast-flux service networks; fast-flux attack; fast-flux attack network; network security

I. INTRODUCTION

The availability of services has always been a pressing issue on the Internet. Because electronic components have a limited and short lifetime, servers and networks are liable to break down. On the host side, techniques such as RAID, failover systems, and cluster systems have been introduced to address this problem. The network faces more challenges: it is vulnerable not only to hardware failure but also to attacks. The Distributed Denial-of-Service (DDoS) attack is a classic and serious threat to service availability. To enhance network availability, round-robin DNS and Content Distribution Networks (CDNs) were proposed, and in recent years a novel and more advanced method, the fast-flux service network (FFSN), has been introduced. All three methods rely on specific uses of the DNS, so they are all vulnerable to DNS attacks such as DNS cache poisoning and DNS ID spoofing. Even so, they significantly enhance the longevity and robustness of the network.

Unfortunately, cyber criminals are also aware of this new technique. They typically use fast-flux to build illegal websites and to hide the real control component.

The remainder of this article is organized as follows. Section 2 presents the technical background of fast-flux. Section 3 introduces related work on fast-flux attack identification. Section 4 provides two metrics that measure the agents’ lifespan. Section 5 discusses the results of observing fast-flux attacks in real life. Finally, section 6 gives the conclusion.

II. TECHNICAL BACKGROUND

2.1 Domain name system

DNS is a global database system whose main job is to translate domain names that are meaningful to humans into IP addresses. Most commonly, before a user accesses a website, the browser automatically queries the IP address of the domain name. The DNS server usually returns exactly the same reply every time, so all users access the same IP address all the time. The traditional process of using the HTTP service is shown in figure 1.

Figure 1. Traditional content retrieval process

A DNS query may be either non-recursive or recursive [1], but this process is transparent to clients. However, some HTTP content retrieval processes do not fit the pattern shown in figure 1. Round-robin DNS, CDNs, and fast-flux networks are three examples on the real-life Internet.

2.2 Fast-flux network

Fast-flux refers to rapidly changing the mapping between IP address and domain name [2]. A network that uses the fast-flux technique is a fast-flux network, sometimes called a fast-flux service network (FFSN). An FFSN is not limited to HTTP: any application that uses the DNS can use an FFSN. In current practice, however, almost all FFSNs are used for HTTP service, and the concept and the detection method are the same whatever service the FFSN is used for. In the following sections we therefore assume that all FFSNs are used for HTTP service.

An FFSN differs from round-robin DNS [3] and CDNs [4]. With round-robin DNS, the DNS server returns the same set of IP addresses, but in a different order at different times. In a CDN, the DNS server finds the server nearest to the requesting client and returns the IP address of that server, so the same user always gets the same IP address for the domain. In an FFSN, by contrast, even when the same user accesses the same domain name many times, the DNS server returns a different set of IP addresses.

Using an FFSN for malicious or illegal purposes, such as phishing, pharming, and malware spreading, is called a fast-flux attack. Such an FFSN is also called a fast-flux attack network (FFAN).

2.3 The classification of fast-flux

According to what is in flux, FFSNs can be classified into two categories, single-flux and double-flux [2]. Single-flux puts only the IP address of the domain name in flux. The content retrieval process in an FFSN using single-flux is shown in figure 2. In figure 2 and the rest of this paper, the website www.ffan.com is used as an imaginary site that uses an FFSN to do illegal business.

[Figure 2 diagram: the users, the DNS servers, two flux agents (1.1.1.1 and 2.2.2.2), and the mothership. Steps 1 to 6: DNS query for www.ffan.com, DNS reply 1.1.1.1, HTTP request, redirected request, reply, HTTP reply. Steps 7 to 12: the same sequence with DNS reply 2.2.2.2.]

Figure 2. The content retrieval process in FFSN
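The two-lookup scenario of figure 2 can be imitated with a toy model. The sketch below is only an illustration under stated assumptions: the class names `FluxDNS` and `CachingClient`, the 5-second TTL, and the simple rotation through the agent pool are all hypothetical and are not taken from any real FFSN.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class FluxDNS:
    """Toy authoritative server: each query gets the next agent IP and a short TTL."""
    agents: list
    ttl: int = 5  # seconds; illustrative value only

    def __post_init__(self):
        self._rotation = cycle(self.agents)

    def resolve(self, name):
        return next(self._rotation), self.ttl


@dataclass
class CachingClient:
    """Toy client that caches one record per name until its TTL expires."""
    dns: FluxDNS
    cache: dict = field(default_factory=dict)  # name -> (ip, expiry time)

    def lookup(self, name, now):
        ip, expires = self.cache.get(name, (None, -1))
        if now >= expires:  # no record yet, or the record has expired
            ip, ttl = self.dns.resolve(name)
            self.cache[name] = (ip, now + ttl)
        return ip


if __name__ == "__main__":
    client = CachingClient(FluxDNS(["1.1.1.1", "2.2.2.2"]))
    for now in (0, 3, 10):
        print(now, client.lookup("www.ffan.com", now))
```

A lookup made while the cached record is still valid reuses the first agent; once the short TTL has expired, the next lookup is answered with a different agent.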

The process in figure 2 illustrates the case in which one user accesses one website twice within a short period. First, the user asks the DNS server for the IP address of the website. The DNS server returns the IP 1.1.1.1 with a very short Time-To-Live (TTL), and the user connects to 1.1.1.1 to get the content. A short time later, the user accesses the same website again. Because of the short TTL, the IP record 1.1.1.1 has expired, so the user has to query again. This time the user gets a different reply, 2.2.2.2, and consequently goes to 2.2.2.2 for service.

In this FFSN, both 1.1.1.1 and 2.2.2.2 are flux agents. A flux agent is a front node of the FFSN; in other words, it is the node that users contact directly. The flux agents are controlled by the background nodes, the motherships. In figure 2, the flux agents do not host any content but act as proxies and redirectors; it is the mothership that really provides the service.

Double-flux refers to dynamically and repeatedly changing the IP addresses of both the flux agents and their authoritative domain name servers. Thus, if double-flux were used in figure 2, the DNS servers would also change dynamically: a user who queries the same domain name twice might in fact contact two different authoritative DNS servers.

In this paper, we propose metrics that measure the flux agents, so the kind of fast-flux in use does not influence our discussion.

III. RELATED WORK

In an FFAN, almost all the flux agents are compromised computers [5][6]. From time to time some agents are removed and newly compromised victims are added to the agent pool, so blocking some agents cannot stop an FFAN from working. The architecture of an FFAN, an abuse of FFSN, makes it difficult to detect and block the real control component, the motherships. How to detect and mitigate FFANs is therefore still an open issue.

Researchers have proposed a number of metrics. $TTL_A$ and $TTL_{NS}$ measure the TTL of the A resource record and the NS resource record, respectively, in a DNS reply [2][7][8]. $n_A$ is the number of unique IP addresses of one domain name [9]. $n_{NS}$ is the number of authoritative domain name servers [9]. $n_{ASN}$, $n_O$, $n_{NA}$ and $n_N$ indicate the diversity of the IP addresses of a domain [9][10]. $n_{DN}$ measures how many domain names the compromised host is resolved to at different times [10]. The domain registrar ($q_{DR}$) and the domain age ($q_{TD}$) are also introduced [10]. The flux-score combines several metrics [9]:

$$f(x) = 1.32 \cdot n_A + 18.54 \cdot n_{ASN} + 0 \cdot n_{NS}$$

These metrics try to measure FFSNs from different aspects. But as pointed out in [10], some of these features might be missing in a practical FFSN, so not every metric is applicable to every FFSN. Because an FFAN is an abuse of FFSN, FFANs and benign FFSNs share the same or similar features. Thus, these metrics

have proved useful for detecting FFSNs, but they are either unable to identify FFANs or hard to implement.

IV. SOME METRICS ON SERVICE AVAILABILITY

A distinct and important difference between FFANs and benign FFSNs is the origin of the flux agents. In a benign FFSN, the agents are machines completely controlled by the organization running the FFSN, and they stay alive 24/7. In an FFAN, by contrast, almost all the agents are compromised hosts. The owner of an FFAN often cooperates with the originator of a botnet, who controls a collection of compromised computers, also called zombies or bots. That collaboration supplies the FFAN with a large number of usable agents. But none of the bots can be physically controlled by the attackers, so the online time of these agents cannot be determined in advance.

Based on this difference in agent lifespan between benign FFSNs and FFANs, we propose two metrics and develop a system. These metrics cannot be used to identify FFSNs; they aim at distinguishing FFANs from benign FFSNs.

4.1 Two metrics of FFAN

To measure and detect FFANs, we propose two metrics on service availability. Once a new agent in the FFSN is found, we monitor it immediately for one day. Every hour we send an HTTP request and check the returned status code. That is, we check the agent 24 times and get a status sequence $S$. $S_{ij}$ is the status of agent $i$ at time $j$:

$$S_{ij} = \begin{cases} 1 & \text{if the service of agent } i \text{ is available at time } j \\ 0 & \text{if the service of agent } i \text{ is unavailable at time } j \end{cases}$$

The first metric is the Average Online Rate (AOR). If there are $n$ agents and we monitor them for $t$ hours, the AOR is calculated as:

$$AOR = \frac{\sum_{i=1}^{n} \frac{\sum_{j=1}^{t} S_{ij}}{t}}{n} = \frac{\sum_{i=1}^{n} \sum_{j=1}^{t} S_{ij}}{t \times n}$$

Thus the AOR ranges between 0% and 100%. In the following, we set $t$ to 24. In a benign FFSN, the flux agents are completely controlled and stay alive 24/7. Force majeure might occasionally make the service of one agent inaccessible, but overall the service is stable and available, so the AOR should be close to 100%. In contrast, the flux agents in an FFAN are usually controlled remotely and cannot be managed physically, so how long they stay alive is out of the attackers’ control. Consequently, the AOR of most FFANs should be clearly lower than that of benign FFSNs.

The second metric is the Minimum Availability Rate (MAR). The Availability Rate (AR) at time $j$ is defined as:

$$AR_j = \frac{\sum_{i=1}^{n} S_{ij}}{n}$$

The MAR is the minimum of $AR_j$, and it also ranges between 0% and 100%. For the same reason as above, the MAR of a benign FFSN should be high, while that of most FFANs should be low. Both the AOR and the MAR measure the quality of the agents’ service.

4.2 The flux agents monitoring system

To monitor the agents in an FFSN, we developed the Flux Agents Monitoring System (FAMS). The architecture of FAMS is shown in figure 3.

[Figure 3 diagram components: DNS servers, dig, IP records DB, Suspicious Flux Domain DB, Agents Monitor, IP lifespan records DB, FFAN Detector, FFAN Domain DB.]

Figure 3. The architecture of FAMS

In FAMS, the dig tool is used to gather information about the monitored domains. The system issues discrete queries; the interval between two queries is the TTL given in the previous DNS response. Whenever a new IP record is found, it is added to the IP records database. The Agents Monitor (AM) monitors the status of all the IPs in the IP records database: it sends an HTTP request to each one and records the response. A segment of the IP lifespan records is shown in figure 4.

;; IPLife Section
77.67.126.66: 111111111111111111111111
63.216.248.16: 111111111111111111111111
...
63.116.244.88: 111111111111111111111111
;; IPLife End

Figure 4. A segment of the IP lifespan records
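The collection and probing steps described for FAMS might look roughly like the following. This is a minimal sketch, not the authors' implementation: the function names are hypothetical, the parser assumes the usual `name TTL class type rdata` layout of dig answer lines, taking the smallest TTL as the re-query delay is an assumption, and so is treating any 2xx or 3xx status as a "correct" HTTP response.

```python
import http.client
import urllib.request


def parse_dig_answer(text):
    """Extract (ip, ttl) pairs from A records in dig answer-section output."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        # Assumed layout: name TTL class type rdata
        if len(fields) == 5 and fields[2] == "IN" and fields[3] == "A":
            records.append((fields[4], int(fields[1])))
    return records


def next_query_delay(records, default=60):
    """Seconds to wait before the next DNS query: the smallest TTL just seen."""
    return min((ttl for _, ttl in records), default=default)


def probe(ip, host, timeout=10, opener=urllib.request.urlopen):
    """Return 1 if the agent at `ip` serves HTTP for `host`, else 0."""
    request = urllib.request.Request(f"http://{ip}/", headers={"Host": host})
    try:
        with opener(request, timeout=timeout) as response:
            return 1 if response.status in range(200, 400) else 0
    except (OSError, http.client.HTTPException):
        # Covers unreachable hosts, closed ports, timeouts, and HTTP errors.
        return 0
```

The `opener` parameter exists only so that the probe can be exercised without network access; calling `probe` once per hour for each known IP would yield status sequences like those in figure 4.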

In the IP lifespan records database, 1 means the service is available, and 0 means the agent did not return a correct HTTP response, for example because the host was inaccessible, the port was unavailable, or the HTTP service had stopped.
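Given records in the format of figure 4, the AOR, the per-hour AR, and the MAR follow directly from their definitions. The sketch below assumes that every agent has the same number of samples and that the sequences are aligned on relative time; the function names are illustrative.

```python
def parse_iplife(text):
    """Map each agent IP to its list of 0/1 statuses from an IPLife section."""
    records = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";;") or line == "...":
            continue  # skip section markers and elisions
        ip, _, bits = line.partition(":")
        records[ip.strip()] = [int(b) for b in bits.strip()]
    return records


def aor(sequences):
    """Average Online Rate: the mean of S_ij over all agents and samples."""
    total = sum(len(s) for s in sequences)
    return sum(sum(s) for s in sequences) / total


def availability_rates(sequences):
    """AR_j for every sample index j (truncated to the shortest sequence)."""
    n = len(sequences)
    return [sum(column) / n for column in zip(*sequences)]


def mar(sequences):
    """Minimum Availability Rate: the smallest AR_j."""
    return min(availability_rates(sequences))
```

For instance, `aor(list(parse_iplife(text).values()))` gives the AOR of one monitored domain whose records are held in `text`.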

The detector judges whether a suspicious domain is an FFAN from the IP lifespan records, using the two metrics AOR and MAR.

V. PRIMARY OBSERVATION RESULTS

5.1 Observation results

During our one-month observation, we monitored 157 FFANs, identified by ATLAS [11] and FluXOR [12], and 7 benign FFSNs, all of which are among the top 500 global sites according to Alexa’s rank [13]. The AOR of these domains is shown in figure 5.

[Figure 5 plot: AOR (0.0 to 1.0) against time (0 to 24), with one series for benign FFSN and one for FFAN.]

Figure 5. The AOR of monitored domains

The AR of these domains is shown in figure 6. Some statistics of AOR and MAR are shown in table 1.

TABLE I. THE STATISTIC INFORMATION OF AOR AND MAR

Metric   Statistic   Benign FFSN   FFAN
AOR      minimum     0.939         0.0
AOR      average     0.978         0.356
AOR      maximum     1.0           1.0
MAR      minimum     0.882         0.0
MAR      average     0.934         0.213
MAR      maximum     1.0           1.0

According to our observation, all benign FFSNs have a high AOR and MAR. This is understandable: an FFSN is usually used for a large-scale network, and the owner of a large network commonly pays a great deal of attention to quality of service.

However, the local network might have a fault, so a failure to access all agents at once might occur occasionally. We have tried to minimize this influence by having the timeline of each status sequence use relative time. For example, suppose that in an FFSN agent m is discovered at time t and agent n at time t+1, so our system starts to monitor m at t and n at t+1. Assume that at time t+2 a local fault prevents us from accessing these two agents, and that the service is available at all other times. The status sequences are then 110111111111111111111111 and 101111111111111111111111. The AOR is 0.958 and the MAR is 50%. Another way to reduce the influence of a local fault is to lengthen the observation time. In the previous situation, if the monitoring time is 48 hours, the AOR is 0.979 and the MAR is unchanged. These methods lower the influence but cannot eliminate it.

In contrast, most FFANs have a low AOR and MAR, but there are also some FFANs with a high AOR and MAR. At first, the AOR and AR of an FFAN are as high as those of a benign FFSN, but later they decrease dramatically.
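The figures in the local-fault example of section 5.1 can be checked against the definitions of AOR and AR, assuming each of the two agents contributes $t = 24$ hourly samples and the single failure falls at relative position 3 for agent m and position 2 for agent n:

$$AOR = \frac{2 \times 24 - 2}{24 \times 2} = \frac{46}{48} \approx 0.958, \qquad AR_2 = AR_3 = \frac{1}{2} \;\Rightarrow\; MAR = 50\%$$

With 48 hours of monitoring the same two failures give

$$AOR = \frac{2 \times 48 - 2}{48 \times 2} = \frac{94}{96} \approx 0.979$$

while the MAR stays at 50%, because the minimum over $j$ is still reached at the two affected positions.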

5.2 Analysis of the observation

Overall, the differences in AOR and MAR between benign FFSNs and FFANs are easy to identify. The AOR of every benign FFSN is over 0.9, while in 85.99% of the FFANs (135 out of 157) the AOR is below 0.9. The MAR of every benign FFSN is over 0.85; in contrast, in 93.63% of the FFANs (147 out of 157) the MAR is below 0.85. There are also ten FFAN domains with an AOR higher than 0.9 and a MAR higher than 0.85. These ten domains can be divided into four groups.
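The two cut-offs reported in this analysis suggest a simple decision rule. The sketch below is an assumption about how such a rule could be written, not the detector used in FAMS: the paper does not state how the two metrics are combined, so flagging a domain when either metric falls below its cut-off is a choice made here for illustration, and the names are hypothetical.

```python
AOR_THRESHOLD = 0.9   # every benign FFSN in the observation exceeded this AOR
MAR_THRESHOLD = 0.85  # every benign FFSN in the observation exceeded this MAR


def classify(aor, mar, aor_threshold=AOR_THRESHOLD, mar_threshold=MAR_THRESHOLD):
    """Label a suspicious fast-flux domain from its two availability metrics."""
    if aor >= aor_threshold and mar >= mar_threshold:
        # High availability: a benign FFSN, or an FFAN with well-kept agents.
        return "undetermined"
    return "FFAN"
```

Under this rule, the ten high-availability FFAN domains discussed here would come back as "undetermined" rather than being flagged.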

[Figure 6 plot: AR (0.0 to 1.0) against time (0 to 24), with one series for benign FFSN and one for FFAN.]

Figure 6. The MAR of monitored domains

The first group has five domains: adultplaceonline.com, docpharmsite.com, gurucoolonline.com, lookpornworld.com and medmedicines.com. Together they form one FFAN with multiple domain names: the agents of each domain are exactly the same. All agents are of high quality, and the service is always available.

The second group has three domains: aeomailer029.com, aeomailer037.com and aeomailer040.com. Their similar domain names are enough to indicate a strong relationship among them. All of their DNS responses have a long TTL, about 86,400 seconds, and their agent sets are nearly the same. The IP addresses of their agents are also regular: the agents occupy ranges such as 69.163.46.102 to 69.163.46.109 and 66.207.170.70 to 66.207.170.78. Given these sequential IP addresses, the agents should not be randomly chosen zombies.

The third group is blitzneuigkeiten.info. This domain uses a long TTL of 28,800 seconds, and its agent pool consists of only three agents.

The fourth group is weblessnet.ru, which uses a TTL of 432 seconds. During our observation, nine agents were found. All nine stayed alive 24/7, which makes our metrics invalid. We checked all these agents with the WHOIS service. The result shows that they all belong to web hosting service providers such as DirectSpace Networks, ColoGuys, NOC4Hosts and Internap Network Services Corporation, so they are surely not randomly chosen zombies.

In conclusion, although not every FFAN can be singled out, our metrics are useful for identifying FFANs.

5.3 The limitation of the AOR and MAR

In our primary observation, we also found some limitations of the metrics.

To measure the metrics, we must first collect the agents. If the IP pool of the FFAN is too small, or only a few agents are collected, the metrics can be markedly inaccurate. As an extreme example, if we collect only one agent for a monitored FFSN and a network failure once prevents us from accessing the service, the MAR is 0%.

In our observation, some FFANs have dormant periods. During a dormant period, the FFAN returns a few agents, or only one agent, with a long TTL. In that situation our metrics are invalid.

Furthermore, the metrics depend on the quality of the agents’ HTTP service. If a firewall, a security scheme, or a poor network environment stops us from accessing the service, the metrics are biased. For example, to access the website www.facebook.com we must first make a DNS query. But as a result of DNS poisoning attacks here [14], we always get an incorrect IP, and we fail to access the service because of the wrong IP. Thus the AOR and MAR of www.facebook.com would both be 0%, which is evidently misleading.

The metrics would also be inaccurate when an FFAN uses agents that are completely under its control and stay alive 24/7, which raises the AOR and MAR. But in that case, once the network is identified as an FFAN, the attackers are easier to track down.

VI. CONCLUSION

As far as we know, this is the first time that researchers have tried to distinguish FFANs from benign FFSNs based on the agents themselves. The experiments show that our metrics are useful for identifying FFANs. The method is easy to implement and deploy, and the metrics are time-saving. According to our observation, most FFANs have a low AOR and MAR.

REFERENCES

[1] Wikipedia, Domain Name System, http://en.wikipedia.org/wiki/Domain_Name_System, 2010.
[2] Honeynet project, Know Your Enemy: Fast-Flux Service Networks, http://www.honeynet.org/papers/ff/, 2007.
[3] Wikipedia, Round Robin DNS, http://en.wikipedia.org/wiki/Round_robin_DNS, 2009.
[4] Wikipedia, Content Delivery Network, http://en.wikipedia.org/wiki/Content_delivery_network, 2010.
[5] T Moore and R Clayton, An empirical analysis of the current state of phishing attack and defence, in proceedings of the 2007 Workshop on the Economics of Information Security, 2007.
[6] CV Zhou, C Leckie, and S Karunasekera, Collaborative Detection of Fast Flux Phishing Domains, Journal of Networks, VOL. 4, NO. 1, February 2009.
[7] ICANN Security and Stability Advisory Committee, SSAC Advisory on Fast Flux Hosting and DNS, http://www.icann.org/en/committees/security/sac025.pdf, 2008.
[8] ICANN Technique Report, Initial Report of the GNSO Fast Flux Hosting Working Group, http://gnso.icann.org/issues/fast-fluxhosting/fast-flux-initial-report-26jan09.pdf, 2009.
[9] T Holz, C Gorecki, K Rieck, and FC Freiling, Measuring and detecting fast-flux service networks, in proceedings of the 15th Network & Distributed System Security Symposium, 2008.
[10] Emanuele Passerini, Roberto Paleari, Lorenzo Martignoni and Danilo Bruschi, FluXOR: Detecting and Monitoring Fast-Flux Service Networks, Lecture Notes in Computer Science, Springer, July 2008, PP. 186-206.
[11] Active Threat Level Analysis System, http://atlas.arbor.net/, 2009.
[12] FluXOR, http://fluxor.laser.dico.unimi.it/, 2009.
[13] Alexa, http://www.alexa.com/topsites/global, 2009.
[14] Wikipedia, DNS cache poisoning, http://en.wikipedia.org/wiki/DNS_cache_poisoning, 2009.
