International Journal of Innovations & Advancement in Computer Science IJIACS ISSN 2347 – 8616 Volume 5, Issue 5 May 2016

A Survey on HTTP Flooding Attack Detection and Mitigating Methodologies

Apurv Verma
Computer Sci. and Engg.
MATS University
Raipur, India

ABSTRACT
A Distributed Denial of Service (DDoS) attack prevents a web server from providing its resources to clients. One such attack is the HTTP flooding attack, in which attackers target bottleneck resources such as network bandwidth, CPU processing power and database bandwidth by sending a large number of HTTP GET requests, often generated with the help of searching bots. In this paper, we survey the techniques used to identify attacking users and to improve the detection and mitigation of HTTP flooding attacks for better server performance.

Keywords: DDoS; HTTP flooding attack; server web log; clustering algorithm

I. INTRODUCTION
In today's world, the amount of data exchanged over computer networks has increased tremendously, making the detection and prevention of malicious use of the network a primary concern for network users and administrators. Among the various attacks that occur on networks, the Distributed Denial of Service (DDoS) attack is one of the most serious threats to web services. It affects a web server by exhausting its resources, such as bandwidth, memory and processing power. The HTTP flooding attack is a typical type of DDoS attack that exhausts a web server's resources by sending a large number of HTTP GET requests. These requests are very similar to those sent by human users, so it is very difficult for the server to differentiate between the two. Researchers have tried to detect HTTP flooding attacks using anomaly-based approaches, in which a reference surfing profile is built from raw web logs. However, these web logs are contaminated with web-crawling traces from searching bots, which are very hard to detect and therefore degrade the performance of the anomaly detection mechanism.


Apurv Verma, Mr. Deepak Kumar Xaxa

Mr. Deepak Kumar Xaxa
Computer Sci. and Engg.
MATS University
Raipur, India

These searching bots become hard to trace because they do not follow a semantic surfing order the way a normal user does: the bots fetch web pages one by one by following the built-in hyperlinks. Their web-crawling traces get mixed into the raw web log, disturbing the baseline profile. Secondly, the burst crawling behaviour of these searching bots can also affect the baseline profile; for example, web crawlers make multiple requests to the server within seconds, while normal users leave a gap of a few minutes between requests.

II. LITERATURE SURVEY
Existing HTTP flooding attack detection schemes fall into two categories: (i) specification based and (ii) anomaly based. Specification based schemes generally depend on genuine system behaviours captured by manually developed specifications, such as the Admission Control based scheme [10] and the Group Testing based schemes [2]. However, such schemes cannot detect unknown characteristics of HTTP flooding attacks, whereas anomaly based schemes can detect both known and unknown characteristics. Detection in anomaly based schemes works in two phases: (i) a normal-behaviour training phase and (ii) an abnormal-behaviour detection phase. In the training phase, a reference surfing profile of normal surfing behaviour is created by observing the system in the absence of any attack. In the detection phase, each user's surfing profile is compared with the reference profile built in the training phase, and users whose profiles differ greatly are declared malicious users or attackers.

III. PROBLEM DEFINITION
HTTP flooding is a type of distributed denial of service (DDoS) attack in which the attacker exploits


seemingly legitimate HTTP GET or POST requests to attack a web server or application. HTTP flooding attacks are volumetric attacks, often mounted using a botnet 'zombie army': a group of internet-connected computers, each of which has been maliciously taken over. As a sophisticated layer-7 attack, an HTTP flood uses no malformed packets, spoofing or reflection techniques, relies only on standard URL requests, and requires less bandwidth than other attacks to bring down the targeted site or server. Even traditional rate-based detection is ineffective, since the traffic volume of an HTTP flood often stays under detection thresholds. This makes HTTP flood attacks significantly harder to detect. Moreover, web-crawling traces from unknown searching bots get mixed into the browsing logs of the web server. These traces bias the baseline/reference profile of anomaly based schemes during the training phase itself, rendering anomaly based HTTP flooding attack detection schemes ineffective.

IV. METHODOLOGY
(a) HTTP sCAN
The detection approach is based on density-based clustering. Multiple features of normal web-surfing behaviour are modelled jointly in the presence of web crawlers, and attackers are identified by comparing each user's behavioural profile with the system profile. Detection of an HTTP flooding attack proceeds as follows. The characteristic point for a web user Y is (sl, H(ε_n^Y ‖ P)), where sl is the browsing session length and H is the relative entropy describing the consistency between the user's surfing profile and the web-page popularity distribution P. The set of normal web-surfing patterns is given by C. If ∃ cp ∈ C such that (sl, H(ε_n^Y ‖ P)) ∈ N_Eps(cp), user Y is flagged as a normal web user; otherwise, Y is flagged as malicious.

(b) Group Testing Based Approach
A novel Group Testing (GT) based approach is proposed that functions on back-end servers. First, the classic GT model is extended beyond its size-constrained form.
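The HTTP-sCAN decision rule above can be sketched in Python. This is an illustrative sketch only: the Euclidean distance metric, function names and example values are assumptions, not taken from [1].

```python
import math

def relative_entropy(user_dist, page_popularity):
    # H(e_n^Y || P): divergence between user Y's page-request
    # distribution and the site-wide page popularity P.
    return sum(p * math.log(p / page_popularity[page])
               for page, p in user_dist.items() if p > 0)

def is_normal_user(sl, h, normal_points, eps):
    # Flag Y as normal if the characteristic point (sl, H) lies in the
    # Eps-neighbourhood N_Eps(cp) of some normal pattern cp in C.
    return any(math.hypot(sl - cp_sl, h - cp_h) <= eps
               for cp_sl, cp_h in normal_points)
```

A user whose session length and entropy fall near a known normal pattern passes the check; an outlier, such as a bot with a very long session and page choices that diverge from the popularity profile, is flagged as malicious.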
Then, in accordance with specific testing matrices, client service requests are redistributed to various virtual servers embedded within each back-end server. Based on this framework, a


Apurv Verma, Mr. Deepak Kumar Xaxa

two-mode detection mechanism is proposed that efficiently identifies the attackers. t pools and n items are represented by a t×n binary matrix M, where rows represent the pools and columns represent the items. The t-dimensional binary column vector V denotes the test outcomes of the t pools, where a 1-entry represents a positive outcome and a 0-entry a negative one. The attackers can be captured by decoding the test outcome vector V against the matrix M.

In the Group Testing based detection model, it is assumed that there are t virtual servers and n clients, of which d clients are attackers. In this context, the t×n matrix M maps virtual servers to rows and clients to columns: M[i,j] = 1 only if client j's request is distributed to virtual server i.
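Under this mapping, the standard non-adaptive group-testing decode is straightforward: any client that appears in a pool with a negative outcome must be legitimate, and the remaining clients are the attacker candidates (exactly the attackers when M is d-disjunct). A minimal sketch of the decode step only, not the paper's full two-mode mechanism:

```python
def decode_group_test(M, V):
    # M: t x n binary matrix (virtual servers x clients),
    # V: length-t outcome vector (1 = server received a malicious request).
    t, n = len(M), len(M[0])
    cleared = set()
    for i in range(t):
        if V[i] == 0:  # negative pool: every client routed here is clean
            cleared.update(j for j in range(n) if M[i][j] == 1)
    return [j for j in range(n) if j not in cleared]
```

For example, with three virtual servers, four clients and client 2 the only attacker, M = [[1,0,1,0],[0,1,1,0],[1,1,0,1]] yields V = [1,1,0], and decoding returns [2].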

Fig 1: Detection Model based on Group Testing

Correspondingly, V[i] = 1 only if virtual server i has received a malicious request from at least one attacker; if V[i] = 0, all the client requests it received are legitimate. The d malicious users can be identified by decoding the outcome vector V against the matrix M. Three detection algorithms are proposed, and a system is built on the basis of these detection schemes.

(c) Hidden Semi-Markov Model
The focus here is on detecting application-layer DDoS attacks. A new model is presented based on a large-scale hidden semi-Markov model (HsMM), which describes the browsing behaviour of web users, together with a new on-line algorithm for anomaly detection.


The browsing behaviour of a web user depends on two factors: the structure of the website and the way the user accesses its pages. The structure of a website consists of web documents, hyperlinks, etc. When a user clicks a hyperlink pointing to a web page, the browser sends requests for the page and its objects; this phase is called HTTP ON. The user's viewing time after the HTTP ON phase is called the HTTP OFF period.

Fig 2: Browsing phases

The browser's requests for a web page and its in-line objects are received by the server within a short time interval, except for requests handled and answered by web proxies or caches. The number of requests generated by a clicked page is not deterministic, because the in-line objects of different pages overlap, so the server side cannot directly observe the actual number of pages clicked by the user; it can only estimate this from the received request sequence. The estimation is done by data clustering: since the HTTP OFF period is much longer than the HTTP ON period, the requested objects in the observation sequence can be grouped into clusters according to their characteristics. Each cluster represents a web page, and the user's request sequence is transformed into the corresponding group sequence; the transitions between these groups capture the user's click behaviour.
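The ON/OFF structure suggests a simple way to group a user's request timestamps into page clicks: a gap longer than some threshold starts a new page. This is an illustrative sketch with an assumed threshold, not the clustering procedure used in [3]:

```python
def group_requests_into_pages(timestamps, off_gap=2.0):
    # Requests within `off_gap` seconds of each other are treated as one
    # HTTP ON burst (a page plus its in-line objects); a longer gap is an
    # HTTP OFF viewing period that starts a new page group.
    pages, current = [], [timestamps[0]]
    for prev, t in zip(timestamps, timestamps[1:]):
        if t - prev > off_gap:
            pages.append(current)
            current = [t]
        else:
            current.append(t)
    pages.append(current)
    return pages
```

The resulting group sequence is what the HsMM models; the transitions between groups correspond to the user's click behaviour.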

(d) CALD
CALD combines three functionalities: (i) abnormal traffic detection, (ii) DDoS attack detection and (iii) filtering.
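CALD's DDoS detection component computes the entropy of incoming source IP addresses (denoted A) and of requested web pages (denoted B) and thresholds their ratio R. A minimal sketch of that computation; the function names and example values are illustrative assumptions:

```python
import math
from collections import Counter

def entropy(counts):
    # Shannon entropy (in bits) of an empirical frequency distribution.
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def entropy_ratio(src_ips, pages):
    # R = A / B: per CALD, R is smaller for flash crowds and larger for
    # attacks, so a threshold on R separates the two cases.
    a = entropy(Counter(src_ips))
    b = entropy(Counter(pages))
    return a / b if b else float('inf')
```

For instance, four distinct IPs requesting two pages evenly gives A = 2 bits, B = 1 bit, and R = 2.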

Fig 4: CALD Process

(i) Abnormal traffic detection: CALD first detects abnormal traffic using front-end sensors. Any abrupt change in HTTP GET request traffic can be due either to a flash crowd or to a DDoS attack. Once abnormal traffic is detected, the front-end sensor sends an ATTENTION signal to the DDoS attack detection component.

(ii) DDoS attack detection: This component is activated only when it receives an ATTENTION signal from the abnormal traffic detection module. It analyses the traffic anomaly further by tracing each incoming source IP address and each requested web page and recording their average frequencies in a vector. From this vector, entropies are calculated: the dispersion of source IPs is denoted A and that of web pages B, and their ratio is R. R has a smaller value for flash crowds and a larger value for attacks, so a threshold on R can differentiate a flash crowd from attackers.

(iii) Filter: Once a DDoS attack is detected, the previous component sends the offending source IP addresses to the filter component, which lets legitimate requests through but stops the attack traffic. As a result, server performance is maintained without harmful effects from the DDoS attack.

V. CONCLUSION & FUTURE WORK
Detection and mitigation of HTTP flooding attacks is a necessary process, as such attacks directly affect the consistency of a web server's behaviour. In this paper we have summarized various existing methodologies for the detection and prevention of this distributed denial of service attack. For future work, we will explore further techniques and try to combine them with the existing ones to derive a new method that would

Fig 3: HsMM States




make the whole process of mitigating HTTP flooding attacks more accurate and robust.

REFERENCES
[1] Wang Jin, Zhang Min, Yang Xiaolong, Long Keping, Xu Jie. "HTTP-sCAN: Detecting HTTP-Flooding Attack by Modeling Multi-Features of Web Browsing Behavior from Noisy Web-Logs". IEEE China Communications, 2015.
[2] Xuan Ying, Shin I., et al. "Detecting Application Denial-of-Service Attacks: A Group-Test-Based Approach". IEEE Transactions on Parallel and Distributed Systems, Aug. 2010, vol. 21(8), pp. 1203-1216.
[3] Xie Yi, Yu Shunzheng. "A Large-Scale Hidden Semi-Markov Model for Anomaly Detection on User Browsing Behaviors". IEEE/ACM Transactions on Networking, 2009, vol. 17(1), pp. 54-65.
[4] Wen Sheng, Jia Weijia, et al. "CALD: Surviving Various Application-Layer DDoS Attacks That Mimic Flash Crowd". Proceedings of the 4th International Conference on Network and System Security, IEEE, Piscataway, N.J., pp. 247-254.
[5] Xie Yi, Yu Shunzheng. "Monitoring the Application-Layer DDoS Attacks for Popular Websites".



IEEE/ACM Transactions on Networking, 2009, vol. 17(1), pp. 15-25.
[6] Oikonomou G., Mirkovic J. "Modeling Human Behavior for Defense Against Flash-Crowd Attacks". Proc. IEEE ICC, 2009, Dresden, Germany, pp. 1-7.
[7] Yatagai T., Isohara T., Sasase I. "Detection of HTTP-GET Flood Attack Based on Analysis of Page Access Behavior". Proceedings of the IEEE Pacific Rim Conference on Communications, Computers, and Signal Processing, 2007, Victoria, BC, pp. 232-235.
[8] Yu J., Fang C., Lu L., et al. "Mitigating Application Layer Distributed Denial of Service Attacks via Effective Trust Management". IET Communications, 2010, vol. 4(16), pp. 1952-1962.
[9] Jung J., Krishnamurthy B., Rabinovich M. "Flash Crowds and Denial of Service Attacks: Characterization and Implications for CDNs and Web Sites". Proc. WWW, May 2002, Honolulu, Hawaii, USA, pp. 252-262.
[10] Srivatsa M., Iyengar A., et al. "Mitigating Application-Level Denial of Service Attacks on Web Servers: A Client-Transparent Approach". ACM Transactions on the Web, vol. 2(3), July 2008.
