Towards Improving Web Attack Detection: Highlighting the Significant Factors
Majda A. Wazzan
Mohammad H. Awadh
Security Techniques Unit, Deanship of Information Technology King Abdul Aziz University, Jeddah, KSA
[email protected]
Department of Electrical and Computer Engineering, Faculty of Engineering King Abdul Aziz University, Jeddah, KSA
[email protected]
Abstract: With the rapid development of the Internet, use of the Web is increasing, and Web applications have become a substantial part of people's daily lives (e.g. E-Government, E-Health and E-Learning), as they allow information to be accessed and managed seamlessly. The main security concern for e-business is Web application security. Web applications have many vulnerabilities, such as Injection, Broken Authentication and Session Management, and Cross-Site Scripting (XSS). Consequently, Web applications have become targets for hackers, and many cyber attacks have emerged that aim to block the services of these applications (Denial of Service attacks). Developers are not aware of these vulnerabilities and do not have enough time to secure their applications. Therefore, there is a significant need to study and improve attack detection for Web applications by determining the most significant factors for detection. To the best of our knowledge, no existing research summarizes the factors that influence the detection of Web attacks. In this paper, the author studies state-of-the-art techniques and research related to Web attack detection: the author analyzes and compares different methods of Web attack detection and summarizes the most important factors for Web attack detection, independent of the type of vulnerability. Finally, the author gives recommendations for building a framework for Web application protection.
Keywords: web application; web application vulnerabilities; web application intrusion detection; big data; cloud
I. INTRODUCTION
Web applications are becoming increasingly popular and complex. With the emergence of Web 2.0, there is significant growth in the need for web applications that facilitate transactions in areas such as finance, E-Banking, E-Health, E-Learning, energy, defense, and real-time communication with users and within organizations. This wide variety of web applications creates a need for reliable and robust web applications that guarantee the integrity and security of the data exchanged through them. A web application is an application program that is installed and runs on a remote web server. It is delivered over the Internet and responds to requests via HTTP. Most web applications are vulnerable to web attacks. Research indicates that 92% of web-based applications are vulnerable to some kind of attack [15]. According to Gartner, about 75% of all attacks on information security target the web application layer [16].
The Open Web Application Security Project (OWASP) [12] presents the top ten most critical web application vulnerabilities as follows:
1. Injection
2. Broken Authentication and Session Management
3. Cross-Site Scripting (XSS)
4. Insecure Direct Object References
5. Security Misconfiguration
6. Sensitive Data Exposure
7. Missing Function Level Access Control
8. Cross-Site Request Forgery (CSRF)
9. Using Components with Known Vulnerabilities
10. Unvalidated Redirects and Forwards
A web application vulnerability is a weakness in the application that leads to exposure of data or allows the application to be accessed and controlled. OWASP [12] defines attacks as follows: "Attacks are the techniques that attackers use to exploit vulnerabilities." In other words, a web attack is the exploitation of a weakness in the web application in order to gain more privileges or to infiltrate the system and leak sensitive data. Therefore, web attack detection is one of the most important challenges facing organizations. Web attack detection can be done by monitoring the HTTP traffic and inspecting the components of the incoming request. For an effective investigation, we need to inspect several components of the request (i.e. source IP, request timestamp, HTTP method, requested URI, the full HTTP data sent, and the response). The attack data can be carried in the URI, the HTTP headers from the client, cookies, etc.
Attack detection can be classified into static detection techniques and dynamic detection techniques. Static detection techniques work after an event has occurred: they parse log files using standard tools and are aimed at forensic investigation. Dynamic detection techniques detect an attack as it happens, trigger an alarm while the attack is under way, and aim to detect or prevent attacks in real time. Dynamic detection can be signature-based or anomaly-based.
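The request components listed above can be gathered into one record per request before any inspection takes place. The following is a minimal sketch under stated assumptions: the class name `RequestRecord`, its field names, and the `attack_surface` helper are hypothetical and are not taken from any tool discussed in this paper.

```python
from dataclasses import dataclass, field
from datetime import datetime
from http.cookies import SimpleCookie
from urllib.parse import parse_qsl, urlsplit


@dataclass
class RequestRecord:
    """One HTTP request/response pair (hypothetical structure for illustration)."""
    source_ip: str
    timestamp: datetime
    method: str
    uri: str
    headers: dict = field(default_factory=dict)
    body: bytes = b""
    response_status: int = 0
    response_body: bytes = b""

    def attack_surface(self) -> dict:
        """Return the parts of the request that can carry attack data."""
        parts = urlsplit(self.uri)
        cookies = SimpleCookie()
        # Assumes the header key is spelled exactly "Cookie".
        cookies.load(self.headers.get("Cookie", ""))
        return {
            "path": parts.path,
            "parameters": parse_qsl(parts.query, keep_blank_values=True),
            "headers": dict(self.headers),
            "cookies": {name: morsel.value for name, morsel in cookies.items()},
        }


record = RequestRecord(
    source_ip="203.0.113.7",
    timestamp=datetime(2015, 1, 1, 12, 0, 0),
    method="GET",
    uri="/search?q=shoes&page=2",
    headers={"Host": "example.com", "Cookie": "sid=abc123"},
)
surface = record.attack_surface()
```

A static (forensic) detector could build such records from log files, while a dynamic detector could build them from live traffic; the inspection logic applied afterwards would be the same.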
Signature-based detection is good enough to detect the large majority of initial web application attacks, but it fails to detect certain unique attacks such as forceful browsing and malicious redirecting. It may also lead to false positives. Anomaly-based detection, on the other hand, is based on training the intrusion detection system (IDS) to learn normal web traffic. In this paper, the author sheds light on web application attack detection, surveys the state of the art of research on web attack detection and compares the attack detection methods, then summarizes the important factors in web attack detection and provides some recommendations for a new framework to protect web applications. This paper is structured as follows: Section II reviews the related works. Section III provides the analysis and discusses the results. Section IV gives recommendations for a proposed framework. Finally, Section V concludes the paper.

978-1-4673-6537-6/15/$31.00 ©2015 IEEE

II. RELATED WORKS
In [1], the authors propose an approach for preventing SQL injection attacks on web applications based on the Boyer-Moore string matching algorithm. The proposed model has four stages: the Crawl stage, the Parameter Testing stage, the Exploit stage and the Report stage. In the Crawl stage, the model crawls the input URL and captures its parameters. Next, the model tests the parameters; if the testing phase shows that the page is vulnerable to SQL injection, the model exploits the page, retrieves information from the database, and then reports the results of the detection process. The results show that using the model increases the efficiency and accuracy of attack detection.
In [2], the authors develop a technique to detect application layer distributed denial-of-service (DDoS) attacks. They develop a new clustering algorithm to discover the recent interests of visiting users. The technique relies on entropy minimization to categorize data points and to cluster the sequences of requests observed at a server. It uses clusters with small entropy to group elements with similar behavior. The technique calculates the highest likelihood of connections to determine the best cluster and, after assigning the cluster, applies a reallocation process to reduce the number of unpopular clusters. The attack detection of this technique depends on the number of attacking hosts and on the strategy chosen by the attackers. The results show that most of the request sequences are correctly classified.
In [3], the authors propose a model for application layer DDoS attack detection. The model monitors the network traffic that composes the service request traffic profile of normal users and attackers. Attack detection in this model is based on key features such as the total number of services requested during a monitoring period, the maximum number of continuous services requested during the monitoring time, and the maximum number of continuous services not requested during the monitoring time.
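The three key features named above for [3] can be illustrated with a short sketch. It assumes that the monitoring period is divided into equal time slots and that "continuous" refers to consecutive slots; this reading, the function name `timeslot_features`, and the input layout are assumptions made here for illustration and may differ from the definitions used in [3].

```python
from itertools import groupby


def timeslot_features(slot_counts):
    """Compute three per-client features from request counts per time slot.

    slot_counts: list of integers, one per time slot in the monitoring period
    (hypothetical input layout). Returns a tuple of
    (total requests, longest run of active slots, longest run of idle slots).
    """
    total = sum(slot_counts)
    longest_active = 0
    longest_idle = 0
    # Group consecutive slots by whether any request was seen in them.
    for active, run in groupby(slot_counts, key=lambda count: count > 0):
        length = sum(1 for _ in run)
        if active:
            longest_active = max(longest_active, length)
        else:
            longest_idle = max(longest_idle, length)
    return total, longest_active, longest_idle


features = timeslot_features([3, 2, 0, 0, 0, 1, 4, 5, 0])
```

A feature vector of this kind, computed per source address, is the sort of input a pattern classifier could be trained on.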
The model utilizes a support vector machine of the C-support vector classification type (C-SVC) as a pattern classification algorithm to make a decision about the application's traffic. The experiments were done using two DDoS tools to gather the attack traffic (BlackEnergy and the malware of the 7.7 DDoS attack). The results show a high detection rate (99.4%) for the attackers' IP addresses.
In [4], the proposed model extracts information, such as the time delay between the request and response of a client/server connection, and then analyzes it. It performs a statistical analysis of events for forecasting or anomaly detection. It considers a continuous function of the connections and performs behavior analysis on the traffic; then, according to the probability distribution, it determines whether the event is similar (regular traffic) or different (intrusion). The test of the proposed algorithm was divided into two phases: profiling the legitimate traffic, and detecting anomalous traffic using the Slowloris attack. The results show excellent reliability and effective detection with a low false positive rate.
In [5], the authors propose a detection method for flooding attacks on the application layer based on semantic rules. They analyze the components and modules of PHP dynamic pages to obtain information about the system resources (bandwidth, memory, CPU) used by these modules for some specific websites. The authors then group the obtained rules into three classes: computation, communications, and security. Based on these three classes, rules are formulated to identify malicious browsing behaviors. The experimental results help in identifying malicious browsing behaviors, which in turn helps in improving the performance of web services and in splitting the cost of the resources.
The authors of [6] present a web application intrusion detection system (WAIDS) based on an anomaly model for detecting input validation attacks. This approach is based on web application parameters that have identical structure and values. The system detects and prevents input validation attacks and operates on the application layer. It starts by analyzing HTTP requests to collect the header data and the GET and POST parameters, then transforms the collected data into alphabetic characters. It uses a keyword replacement matrix to filter the data. The system builds a normal web request profile and performs optimal sequence detection to find the most appropriate normal request sequences. The system can detect malicious code and report the violating profiles. It can also check HTTP requests at runtime. Further, the results show that the system has a higher detection rate than some other systems.
In [7], the authors introduce a new tool to detect and prevent web application attacks using pattern recognition. Based on input validation, the tool tries to detect web attacks through pattern recognition. Patterns can be detected in both HTTP requests and responses using over 200 known attack patterns in an extensible and manageable way. The proposed tool acts as a proxy server. The results are satisfactory for every attack category examined, with a high percentage of success.
In [8], the authors present an anomaly detection system that detects web-based attacks using a number of different techniques. The analysis techniques used by the system take advantage of the particular structure of HTTP queries that contain parameters. The system analyzes HTTP requests, specifically GET requests, and ignores the header data and POST requests. It extracts the URI query parameters associated with a program and uses them as input to produce an anomaly score for each web request. The parameters of the queries are compared with established profiles that are specific to the program or active document being referenced. The system learns the length and the structure of the parameters from the input data. The system supports analysis with respect to generic anomaly detection methods that do not take into account the specific program being invoked.
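To illustrate how a learned parameter length can feed an anomaly score as described for [8], the sketch below trains a simple length profile for one parameter and bounds the probability of an observed length with the Chebyshev inequality, $p \le \sigma^2 / (l - \mu)^2$. The use of this particular bound, the class name `LengthModel`, and the sample values are assumptions made here for illustration; the text above states only that the length and structure are learned from the input data.

```python
from statistics import mean, pvariance


class LengthModel:
    """Learns the typical length of one parameter's values (illustrative only)."""

    def __init__(self):
        self.mu = 0.0
        self.var = 0.0

    def train(self, values):
        """Estimate mean and variance of value lengths from attack-free samples."""
        lengths = [len(v) for v in values]
        self.mu = mean(lengths)
        self.var = pvariance(lengths, self.mu)

    def probability(self, value):
        """Chebyshev upper bound on seeing a length this far above the mean.

        Values no longer than the learned mean are treated as normal (1.0).
        A result close to 0 marks the value as anomalous.
        """
        length = len(value)
        if length <= self.mu:
            return 1.0
        return min(1.0, self.var / (length - self.mu) ** 2)


model = LengthModel()
model.train(["alice", "bob", "carol", "dave"])
normal_score = model.probability("eve")
attack_score = model.probability("x" * 60)
```

In a full system, one such model per parameter and per program would be combined with other models (for example, of character structure) into a single anomaly score for the request.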
In [9], the authors propose a tool that analyzes big data and processes a user's historical surfing information to build a profile of the user's typical surfing patterns. It calculates the deviations from these patterns and couples them to an event of interest, so that the deviation can be used as an indicator of involvement in that event. The tool is based on five parameters: intensity of surfing, frequency of revisiting or refreshing pages, irregular hours of activity, interaction level (passive/active), and diversity of topics of interest. Based on the browsing, a database of browsing data can be produced, which makes it easy to monitor and analyze user behavior.
The author in [10] develops an active defense model to detect web attacks. He calculates the risk to a web service by using the concentration of antibodies. The proposed mechanism uses clonal selection and hyper-mutation in order to learn and detect web attacks. HTTP requests are preprocessed as antigens before they are recognized and learned by immunocytes. The mechanism simulates the cells of the biological immune system (BIS) to perceive the environment. The concentration of antibodies indicates the type of web attack. The results show the effectiveness of the mechanism.
In [11], the authors propose an agglomerative clustering method for detecting web attacks. They process the log files and label them manually to improve the accuracy of their detection method. The detection method produces one or more clusters from the collected dataset and processes them according to a proposed algorithm. The method is compared to two other methods: N-gram and 'length'/'character distribution' [18]. The authors' method shows a high detection rate and a zero false positive rate.
In [17], the authors propose an approach for penetration testing. They provide a tool named "viper". The approach is based on pattern matching for error messages. The results show that viper achieves higher performance than similar tools such as SQLMap.
From the previous studies, we can conclude that most of them concentrate on only one type of vulnerability, while there is a need for a comprehensive framework that considers all vulnerabilities and can detect all kinds of attacks, because a single HTTP request may contain more than one type of attack at the same time.

III. ANALYSIS
In this section, the author analyzes the research above and compares the attack detection methods. To detect web attacks, it is important to analyze the HTTP requests. The author summarizes the significant factors that help in improving detection performance. Table 1 shows the most important parts of the requests that are to be monitored and inspected.

Table 1. Significant factors of web attack detection.

| Aspect of the Request | Attack Type | Event |
|---|---|---|
| URL | Forceful browsing | The request references a URL that is not defined as an entry point (illegal URL). |
| URL | Access violation | The URL contains disallowed meta characters, or the valid access time has expired. |
| URL | Access violation (suspicious) | Illegal flow, a URL that does not exist in the system whitelist, or a request trying to bypass the login URL. |
| URL | Session hijacking | The URL contains a session ID value that equals the session ID that the server set for this session. |
| URL | Buffer overflow | Unacceptable length. |
| Method | Information leakage | The request uses an illegal HTTP method or a disallowed SOAP method. |
| Method | RFC violation | The GET or HEAD request contains a body. |
| Method | Buffer overflow | The request contains an illegal POST data length. |
| Method | Input violation | The request contains a query string or disallowed POST data. |
| Method | HTTP request smuggling attack | The content length of the POST data equals 0. |
| Host | Non-browsing client | No Host header exists in an HTTP 1.1 request, or the Host header contains an IP address. |
| Parameters | Parameter tampering | The request includes an illegal meta character in the parameter. The request contains a dynamic parameter whose value was illegally changed on the client side. The request contains an illegal parameter data type, a numeric parameter value that is not in the allowed range, an illegal static parameter value, or an alphanumeric parameter value that does not comply with the regular expression of the field. |
| Parameters | Input violation | The request contains a dynamic parameter whose value was changed to empty or whose value contains an illegal meta character, contains an illegal number of mandatory parameters, or contains a disallowed parameter. The parameter value length is not allowed, or the multipart request has a parameter with a NULL value. |
| Parameters | Detection evasion, HTTP parameter pollution attack | The request contains the same name for multiple parameters, or contains multiple decoding for the URI or a parameter value. |
| Parameters | HTTP parser attack | The request contains bad multipart parameter parsing. |
| Cookies | HTTP parser attack | The cookie header is not RFC compliant. |
| Cookies | Access violation | A CSRF session cookie is injected into the response. |
| Cookies | Buffer overflow | The request includes an illegal cookie length. |
| Cookies | Cookie violation | The request has a disallowed modified cookie domain. |
| Referrer URL | Automated attack | The referrer header contains an unidentified referrer URL. |
| Sensitive data | Information leakage | The response contains sensitive data such as payment card information. |
| Suspicious file | Virus detected | The request includes a file that contains a worm or a virus. |
| Suspicious pattern | Attack signature | The request or the response contains a pattern that matches an attack signature. |
| IP | DoS attack | The request contains a non-trusted or blacklisted IP. |
The Technical Security Unit in the Deanship of Information Technology at King Abdulaziz University [13] is interested in investigating the best methods of protecting its web applications. These web applications serve more than 170 thousand students and more than 20 thousand employees. In its daily work, the security team uses tools to monitor, inspect and analyze web application traffic to detect any presence of attacks. This analysis covers the requests and responses in the logs and their aspects. As shown in the previous section, there are many detection methods in the literature, but unfortunately most of them concentrate on only one type of vulnerability, while a single request may contain more than one type of attack at the same time. It is also clear that the most important vulnerability facing web application security is injection. Injection is mainly carried out through the parameters used in web page URLs. Therefore, in this paper, the author gives more attention to inspecting URL parameters and input violations. Many attacks can happen through exploiting the parameters, for example parameter tampering, input violations, detection evasion, and HTTP parameter pollution. Therefore, security policies should force application developers to strongly validate the input parameters.
The second factor of web attack detection is URL inspection. A request that references a URL not defined as an entry point may indicate a forceful browsing attack. An unacceptable URL length may indicate a buffer overflow attack, and a session ID value exposed in the request that equals the server session value is an indicator of session hijacking. The next factor is the method: using a disallowed method may indicate information leakage, and illegal data in a POST request may indicate a buffer overflow, or an HTTP request smuggling attack if the data length equals 0. Cookies are also one of the significant factors: illegal cookies may indicate a buffer overflow or an HTTP parser attack, while cookies injected into the response or illegally modified cookies may indicate access and cookie violations. If an HTTP request contains a suspicious pattern or a suspicious file, a negative-security violation is triggered and a virus or an attack signature is detected. Any appearance of sensitive data in the response may indicate information leakage. A Host header that contains an IP address, or an empty Host header in the request, may indicate a non-browsing client. An unidentified referrer URL and the user agent are further request components that should be inspected to avoid automated attacks.
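The parameter checks discussed above (data type, numeric range, regular expression, mandatory and disallowed parameters, value length) can be expressed as a small whitelist policy. The following is a sketch only: the names `ParamRule` and `validate`, the default limits, and the example policy are hypothetical and are not part of any tool described in this paper.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParamRule:
    """Whitelist rule for one parameter (hypothetical structure)."""
    kind: str = "text"                 # "int" or "text"
    required: bool = False
    min_value: Optional[int] = None
    max_value: Optional[int] = None
    pattern: Optional[str] = None      # regular expression for text values
    max_length: int = 64


def validate(params, policy):
    """Return a list of violations for a dict of parameter name -> value."""
    problems = []
    for name in params:
        if name not in policy:
            problems.append(f"{name}: parameter not allowed")
    for name, rule in policy.items():
        if name not in params:
            if rule.required:
                problems.append(f"{name}: mandatory parameter missing")
            continue
        value = params[name]
        if len(value) > rule.max_length:
            problems.append(f"{name}: value length not allowed")
        if rule.kind == "int":
            if not re.fullmatch(r"-?\d+", value):
                problems.append(f"{name}: illegal data type")
                continue
            number = int(value)
            too_small = rule.min_value is not None and number < rule.min_value
            too_large = rule.max_value is not None and number > rule.max_value
            if too_small or too_large:
                problems.append(f"{name}: numeric value out of range")
        elif rule.pattern and not re.fullmatch(rule.pattern, value):
            problems.append(f"{name}: value does not match the expected pattern")
    return problems


policy = {
    "id": ParamRule(kind="int", required=True, min_value=1, max_value=9999),
    "name": ParamRule(pattern=r"[A-Za-z0-9_]{1,32}"),
}
ok = validate({"id": "42", "name": "alice"}, policy)
bad = validate({"id": "1 OR 1=1", "debug": "true"}, policy)
```

Because the policy lists what is allowed rather than what is forbidden, an injection attempt such as the one in `bad` is rejected without needing a signature for it.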
Finally, inspecting and analyzing the requests requires a good security policy that identifies a whitelist and a blacklist of URLs, parameters, file types, and headers. Rules are then added for each aspect of the request and response in order to detect attacks.

IV. RECOMMENDATION FOR PROPOSED FRAMEWORK
Based on the previous sections, the author gives recommendations for building a comprehensive framework for detecting web attacks at the web application layer. The framework should target the following objectives to ensure that most potential web attacks (known and zero-day attacks) are detected per request. These recommendations will improve the attack detection process:
• The framework should be capable of detecting both known attacks and unknown (zero-day) attacks. The framework should therefore have a monitoring model to monitor and inspect the aspects of the request and response.
• A filtering model should be added that filters the traffic and constructs records that help in analyzing the payload and headers of the request and response (analyzing the specified fields and protocols), and that reduces the data to improve detection performance.
• The framework should have a signature-based detection model for HTTP traffic that compares and matches any suspicious pattern in the request against the known attack signatures.
• The framework should have an anomaly-based detection model that can monitor the behavior of the requests and build a normal behavior profile.
• It should be able to report incidents and correlate events.
• The framework should be able to report Denial of Service (DoS) attacks.
• The framework should decrease the false positive rate and increase the detection rate.

V. CONCLUSION
The main security concern for e-business is Web application security. Web application services present many challenges, for instance security, availability and scalability, and these challenges should be addressed. Web attack detection is a promising branch of information security. It has changed the trends of attack detection, and its popularity is growing day by day. In this paper, the author surveys some of the state-of-the-art research in web attack detection, then summarizes and demonstrates the most important factors that need to be investigated to improve the attack detection process. For future work, the authors will build a robust, reliable and accurate detection framework for web attacks based on the factors summarized in this paper. The proposed framework should be integrated and should cover all the mentioned objectives. Most future web pages will be built on cloud platforms; therefore, the proposed framework should take cloud-based applications into account. Also, logs are considered the main pillar of web attack detection. With the tools employed by our university, the logs reach millions of records per day. This big log data needs to be correlated so that the correlation can be exploited in forensics.
References
[1] Buja, G.; Bin Abd Jalil, K.; Bt Hj Mohd Ali, F.; Rahman, T.F.A., "Detection model for SQL injection attack: An approach for preventing a web application from the SQL injection attack," Computer Applications and Industrial Electronics (ISCAIE), 2014 IEEE Symposium on, pp. 60-64, 7-8 April 2014.
[2] Chwalinski, P.; Belavkin, R.; Cheng, X., "Detection of Application Layer DDoS Attacks with Clustering and Bayes Factors," in Proceedings of the 2013 IEEE International Conference on Systems, Man, and Cybernetics (SMC '13), IEEE Computer Society, Washington, DC, USA, 2013.
[3] Choi, Y.S.; Oh, J.T.; Jang, J.S.; Kim, I.K., "Timeslot Monitoring Model for Application Layer DDoS Attack Detection," 2011.
[4] Aiello, M.; Cambiaso, E.; Scaglione, S.; Papaleo, G., "A similarity based approach for application DoS attacks detection," Computers and Communications (ISCC), 2013 IEEE Symposium on, pp. 000430-000435, 7-10 July 2013.
[5] Lin, Chu-Hsing; Lee, Chen-Yu; Liu, Jung-Chun; Chen, Ching-Ru; Huang, Shin-Yang, "A detection scheme for flooding attack on application layer based on semantic concept," Computer Symposium (ICS), 2010 International, pp. 385-389, 16-18 Dec. 2010.
[6] Park, Yong Joon; Park, Jaechul, "Web Application Intrusion Detection System for Input Validation Attack," Convergence and Hybrid Information Technology, 2008. ICCIT '08. Third International Conference on, vol. 2, pp. 498-504, 11-13 Nov. 2008.
[7] Kapodistria, H.; Mitropoulos, S.; Douligeris, C., "An advanced web attack detection and prevention tool," Information Management & Computer Security, vol. 19, iss. 5, pp. 280-299, 2011.
[8] Kruegel, C.; Vigna, G., "Anomaly Detection of Web-based Attacks," Proceedings of the 10th ACM Conference on Computer and Communications Security (CCS 2003), Washington, DC, USA, October 27-30, 2003.
[9] Kedma, G.; Guri, M.; Sela, T.; Elovici, Y., "Analyzing users' web surfing patterns to trace terrorists and criminals," Intelligence and Security Informatics (ISI), 2013 IEEE International Conference on, pp. 143-145, 4-7 June 2013.
[10] Liang Guangmin, "Modeling Unknown Web Attacks in Network Anomaly Detection," Convergence and Hybrid Information Technology, 2008. ICCIT '08. Third International Conference on, vol. 2, pp. 112-116, 11-13 Nov. 2008.
[11] Yang, Xiaofeng; Li, Wei; Sun, Mingming; Hu, Xuelei; Li, Shuqin; Li, Yongzhi, "Clustering toward detecting cyber attacks," Computer Application and System Modeling (ICCASM), 2010 International Conference on, vol. 12, pp. V12-243-V12-247, 22-24 Oct. 2010.
[12] https://www.owasp.org/index.php/Main_Page
[13] http://www.kau.edu.sa/home.aspx
[14] Khairkar, A.D.; Kshirsagar, D.D.; Kumar, S., "Ontology for Detection of Web Attacks," Communication Systems and Network Technologies (CSNT), 2013 International Conference on, pp. 612-615, 6-8 April 2013.
[15] Khairkar, A.D., "Intrusion Detection System based on Ontology for Web Applications," 2013, available online: http://www.coep.org.in/page_assets/341/Intrusion_Detection_System_based_on_Ontology_for_Web_Applications.pdf
[16] https://www.gartner.com
[17] Ciampa, A.; Visaggio, C.A.; Di Penta, M., "A heuristic-based approach for detecting SQL-injection vulnerabilities in web applications," Proceedings of the 2010 ICSE Workshop on Software Engineering for Secure Systems, pp. 43-49, May 2, 2010, Cape Town, South Africa.
[18] Ingham, K.L.; Inoue, H., "Comparing anomaly detection techniques for HTTP," in Recent Advances in Intrusion Detection (RAID), Springer, Sep. 2007, pp. 42-62.