Host Based Intrusion Detection and Prevention Model Against DDoS Attack in Cloud Computing
Aws Naser Jaber11, Mohamad Fadli Zolkipli 21, Hasan Awni Shakir22, Mohammed R. Jassim 23, 1
Faculty of Computer Systems and Software Engineering,Address
[email protected],
[email protected] 2 National University of Malaysia
[email protected] 3University of Technology, Iraq
[email protected]
Abstract. Cloud computing has become an innovative technology. Recent advances in hardware and software have put tremendous pressure on the administrators who manage these resources to provide an uninterrupted service. System administrators should be familiar with cloud-server monitoring and network tools. The main focus of the present research is the design of a model that prevents distributed denial-of-service (DDoS) attacks, based on a host-based intrusion detection and prevention system (IDPS) in hypervisor environments. The prevention model uses principal component analysis and linear discriminant analysis with a hybrid, nature-inspired metaheuristic algorithm called Ant Lion optimisation for feature selection, and artificial neural networks to classify traffic and configure the cloud server. The current results indicate that an intrusion detection and prevention framework for DDoS attacks on cloud computing systems, based on statistical and predictive techniques, is feasible.
1 Introduction
Cloud computing facilities have grown sharply in recent years due to the rise in demand for cloud-based services [1]. This has put tremendous pressure on developers to provide sufficient hardware and software resources. In addition, system administrators need to ensure efficient energy and hardware utilisation in cloud computing facilities [2] while also addressing the security of cloud-based services, such as data storage and transfer and on-demand applications. Therefore, it is important to monitor cloud computing resources and their applications. This study provides a thorough review of the cloud computing tools used for monitoring the performance of cloud infrastructure at the consumer and provider ends. Revolutionary advances in hardware, networking, middleware and virtual machine technologies have led to the emergence of new, globally distributed computing platforms, namely cloud computing, which provide computation facilities and storage as services accessible from anywhere via the Internet without significant investments in new infrastructure, training or software licensing. Infograph reports that 63% of financial services, 62% of manufacturing, 59% of healthcare and 51% of transportation industries are now using cloud computing services [3].
As cloud computing services become more practical and popular due to their convenience and economic advantages, security vulnerability has become a continuous threat for both cloud service providers and clients. A distributed denial-of-service (DDoS) attack, which degrades or brings down service availability, constitutes the major security concern; it is a cyber-security threat in which multiple systems across the Internet are used to flood a target device or network with packets [3]. A typical DDoS architecture arises when many distributed devices across a network are infected with DDoS zombies (also called 'agents' or 'daemons') and commanded by an attacker to launch attacks on a target. Such structured attacks are designed to damage Internet applications, generate heavy traffic, reduce network performance and disable services; they are not, however, designed to compromise usernames and passwords or to steal data. To fight these armies of zombies, cloud computing needs an intelligent army of solutions developed specifically against sophisticated DDoS attacks. A particular concern is that stateful devices, such as firewalls, intrusion detection systems (IDSs), intrusion prevention systems (IPSs) and load balancers, can become points of failure when a network is under attack. In particular, traditional firewalls fail during a DDoS attack because the attack consumes the central processing unit (CPU) to the point that enabling synchronise (SYN) flood protection on the firewall becomes useless [4]. The major drawback of current host-based intrusion detection and prevention systems (IDPSs) is that they are difficult to manage, because information has to be configured and managed for every host. Moreover, some defence methods cannot detect DDoS attacks that occur at random [5].
Furthermore, an IDPS cannot detect any profile other than those predefined by the network administrator. Although these drawbacks are partly overcome by misuse detection, that method cannot detect anything outside its predefined rules, and the system rules must be updated constantly. Thus, it is necessary to build an IDS that overcomes these drawbacks and is much more efficient than the existing systems. The present conceptual study proposes a model for preventing DDoS attacks in the hypervisor through a host-based IDPS. It was designed around two main aspects: data analysis, which is practical for the large datasets associated with DDoS attacks, and swarm intelligence, which is used for training and classification based on the selected intent of the attack and blacklisted Internet Protocol (IP) addresses. The common goal of our approach and other, related approaches is to detect DDoS attacks.
2 Literature Review
The changing and aggressive nature of the attacks makes them a severe threat that is difficult to counter. As DDoS attacks grow larger and longer, single appliances and even some hosted solutions are unable to withstand them. Recent heavy attack loads have been breaking records in size. According to Akamai Technologies [6], the number of DDoS attacks doubled in 2016. A close look at the types of attack reveals that network time protocol (NTP) reflection attacks almost quadrupled, increasing by 276% over the same period. Companies in the gaming and software industries are frequent targets of hackers that use DDoS as an attack vector (Fig. 1).
Fig. 1. In 2016, 12 attacks exceeded 100 Gbps and two separate DDoS campaigns exceeded 300 Gbps.
Cloud computing encounters both traditional and contemporary security threats. It is exposed to core technology vulnerabilities, such as those in web applications and services, virtualisation and cryptography; essential cloud characteristic vulnerabilities, like unauthorised entry, Internet protocol vulnerabilities and access to management interfaces; and flaws in known security controls, together with common vulnerabilities, such as injection vulnerabilities and weak authentication schemes. Assailants discover vulnerabilities and utilise them for attacks. There have been many attacks against virtual machines on cloud computing platforms, such as attacks on virtualisation and on hypervisors, and various port scanning, backdoor channel, user-to-root, flooding and insider attacks (e.g. internal denial-of-service attacks via zombies in the cloud).
Virtualisation is a core technology in cloud computing, and virtual machines (VMs) are key components of cloud computing infrastructure. Virtualisation technology enables the execution of several operating system environments, or VMs, on a single hardware system. A VM contains applications as well as an operating system, and it runs programs the way a real machine would. Virtualisation creates blind spots: networks or network traffic that are invisible within the same server infrastructure. Cloud services are usually made available to customers via the Internet, and standard Internet protocols and mechanisms are used for communication between the customers and the cloud [7]. The communication process involves the transmission of either data/information or applications between the customer and the cloud [8]. The challenges include denial-of-service (DoS), eavesdropping, IP-spoofing-based flooding, man-in-the-middle (MITM) attacks and masquerading. Conventional solutions to these challenges are also employed, such as Internet Protocol Security (IPsec) and intrusion detection and prevention systems.
DDoS is a very strong and relatively new form of attack on the availability of Internet services and resources [9]. A DDoS attack is any act intended to cause a service to become unavailable or unusable. There is no inherent limit on the number of machines that can be used to launch a DDoS attack, which exploits the dispersed nature of the Internet through hosts owned by different entities around the world. As shown in Fig. 2, the DDoS attacker first covertly distributes various types of DDoS attack tools to the target networks. Next, the attacker creates thousands of zombies, which act as active and passive attackers. The victims are thus exposed to a DDoS attack without their knowledge. This attack mechanism applies to all types of computer networks.
Fig. 2. DDoS attack.
The vectors for DDoS attacks may vary, but the end goal remains the same: to overpower firewalls, servers or other perimeter defence devices by sending request packets at very high rates. The network becomes overwhelmed with traffic, so that people are no longer able to access the website. In cloud computing, hypervisor security has drawbacks: the hypervisor has a non-zero attack surface [10]. Thus, the real security question for the cloud starts with feasible security on the cloud hypervisor. Hypervisors can detect such attacks, which leads us to rethink the integrity of hypervisor security. More potent attacks attempt to take control of the hypervisor itself; such attacks include malicious VM hyperjacking and traditional network security threats, such as traffic snooping (intercepting network traffic) and address spoofing (forging VM, MAC or IP addresses) [11]. One major attack type targets the hypervisor with a transmission control protocol (TCP) SYN flood, a DDoS attack that overloads the cloud resources. Another limitation of an IDPS is that it has to be evaluated to determine whether it fulfils the security requirements of the cloud computing environment [12]. The disadvantages of the prevention mechanism are related to false positives, which occur mainly when the user does not have a practical understanding of computers. If a procedure that the user is trying to perform appears to the IPS as malicious activity and the user's connection is cut, the IT department has to spend a significant amount of time checking every computer that encounters a false positive [13]. Not all types of attacks are known, and new ones appear constantly. Thus, attackers can always find security loopholes to exploit so that they can gain access to the sensor network. Such intrusions will go unnoticed and will likely lead to failures in the normal operation of the network.
3 Methodology
The proposed conceptual model, shown in Figure 3, places a host-based IDPS in the hypervisor that analyses the IPs from the dataset and blocks TCP and user datagram protocol (UDP) DDoS attacks. The host-based IDPS has two theoretical parts inside the hypervisor: the IDS model and the IPS model. The IDS model has five phases: feature extraction, traffic aggregation, feature selection using principal component analysis (PCA), feature selection using linear discriminant analysis (LDA) and artificial neural network (ANN)-based classification. The IPS model starts automatically once the IDS model raises an alert. The IPS model has three phases: reconstructing the table of IPs and Ant Lion fitness, ANN classification, and filtering the attacking source IPs, the last being the most important. The monitoring, aggregation and correlation of alerts are performed in a distributed manner for the host-based IDPS inside the hypervisor, to maximise the possible advantages of hybrid deployment. The hypervisor analyses the correlated alerts locally. Both normal and anomalous IPs, the latter carrying DDoS attack traffic, are present at the cloud server, and the IDPS generates alerts in both cases. Thus, the IDS allows good packets to pass, and the IPS updates and blocks suspicious IPs. In addition, a sensor in the IDS phases primarily matches the IP and one other feature in the dataset. The details of the processes are explained in the following subsections.
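To make the traffic aggregation phase more concrete, the following is a minimal sketch that counts packets per source IP and protocol within fixed time windows. The tuple layout `(timestamp, src_ip, protocol)`, the one-second window and the example addresses (taken from documentation ranges) are assumptions made for illustration; the paper does not specify how aggregation is implemented.

```python
from collections import Counter, defaultdict

def aggregate(packets, window=1.0):
    """Count packets per (source IP, protocol) inside each time window.

    `packets` is an iterable of (timestamp_seconds, src_ip, protocol) tuples;
    this layout is a hypothetical choice for the sketch.
    """
    counts = defaultdict(Counter)
    for ts, src_ip, proto in packets:
        window_index = int(ts // window)      # which window the packet falls in
        counts[window_index][(src_ip, proto)] += 1
    return counts

# Hypothetical captured packets.
packets = [
    (0.10, "198.51.100.7", "TCP"),
    (0.20, "198.51.100.7", "TCP"),
    (0.35, "192.0.2.10", "UDP"),
    (1.40, "198.51.100.7", "TCP"),
]
per_window = aggregate(packets)
```

Per-window counts of this kind could then serve as input rows for the feature selection phases, although which aggregated features the model actually uses is not stated in the text.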
A. Feature selection using PCA and LDA with optimised ALO
Many dimensionality reduction algorithms, such as PCA and LDA, are used to select the best features. Here, three algorithms were used to reduce the dimensionality of the data and select the optimal features (Fig. 4). In the first phase, PCA is used to analyse the covariance of each feature. In the second phase, LDA, as a supervised classification technique, provides more class separability and further reduces the dimensionality of the feature set. The final phase of the IDS involves optimising the outcome of PCA and LDA with Ant Lion optimisation (ALO). It is worth mentioning that several variants of LDA have been investigated to address the vanishing of within-class scatter under projection to a low-dimensional subspace. However, some of these proposals are ad hoc, while others do not address the generalisation problem for new data. Although LDA is preferred in many applications of dimension reduction, it does not always outperform PCA. A hybrid dimension reduction model that combines PCA and LDA is thus proposed to optimise the discrimination performance in a generative manner. The main goal is to enhance data discrimination beyond what is achieved with subspaces learned with either PCA or LDA alone.
[Figure 3: block diagram. Host-based IDPS in hypervisor: Packet Capturing Module; Intrusion Detection Module (Feature Extraction phase, Traffic Aggregation phase, Feature Selection using PCA + LDA, Optimised Feature Selection using ALO with ANN-based Detection Classifier); Intrusion Prevention System (Reconstruct Table IPs and Ant Lion fitness with ANN Classifier, Filter attacking Source IPs, Allow Good packets to Pass, Update Suspicious IPs to Anomaly Knowledge Profile at Hypervisor); Cloud Server.]
Fig. 3. The architecture of the host-based IDPS on a hypervisor cloud computing server.
[Figure 4: Feature Selection for Datasets: Principal Component Analysis, Linear Discriminant Analysis, Ant Lion Optimizer.]
Fig. 4. The three-phase hybrid model of dimensionality reduction algorithms.
This learning mechanism differs from existing proposals in that it is guided by a hybrid model and thus directly addresses the generalisation problem for new data. In addition, computational strategies have been developed to estimate optimal subspaces. The problem addressed by this model can be stated simply: given a set of labelled training data from different classes and another set of unlabelled testing data from the same group of classes, each testing sample is identified by relying on the new model. Both sets consist of feature vectors. The details of the three feature selection algorithms are as follows.

LDA Algorithm
1) Compute the mean vectors for the input features dataset:
   Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$   (1)
2) Calculate the scatter matrices, within class ($S_w$) and between class ($S_B$):
   a. $S_w = \sum_{i=1}^{n} (Px_i - \overline{Px})(Px_i - \overline{Px})^T$   (2)
   b. $S_B = \sum_{j=1}^{c} N_j (\overline{Px}_j - m)(\overline{Px}_j - m)^T$   (3)
   where $c$ is the number of classes, $N_j$ is the number of samples in class $j$, $\overline{Px}_j$ is the mean of class $j$ and $m$ is the overall mean.
3) Find the linear discriminants by computing the eigenvalues of $S_w^{-1} S_B$. Select the linear discriminants for the new feature set by sorting the eigenvectors and choosing those, $W$, with the highest eigenvalues.
4) The new feature set obtained by the linear discriminants is then used to obtain the transformed input dataset by the following equation:
   a. $Y = X \cdot W$   (4)
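The LDA steps above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: it uses the textbook per-class form of the within-class scatter, a pseudo-inverse for $S_w^{-1}$, and a synthetic two-class dataset (labelled "normal" and "attack") that is invented purely for the example.

```python
import numpy as np

def lda_fit(X, y, k):
    """Return the k leading linear discriminants W (one per column)."""
    m = X.mean(axis=0)                          # overall mean
    d = X.shape[1]
    S_w = np.zeros((d, d))                      # within-class scatter
    S_b = np.zeros((d, d))                      # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)                    # class mean
        S_w += (Xc - mc).T @ (Xc - mc)
        diff = (mc - m).reshape(-1, 1)
        S_b += len(Xc) * (diff @ diff.T)
    # Eigen-decomposition of S_w^{-1} S_B; pinv guards against a singular S_w.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(eigvals.real)[::-1][:k]  # highest eigenvalues first
    return eigvecs[:, order].real

# Synthetic, hypothetical data: two classes of three-dimensional feature vectors.
rng = np.random.default_rng(1)
normal = rng.normal(loc=[0.0, 0.0, 0.0], scale=1.0, size=(100, 3))
attack = rng.normal(loc=[4.0, 4.0, 0.0], scale=1.0, size=(100, 3))
X = np.vstack([normal, attack])
y = np.array([0] * 100 + [1] * 100)

W = lda_fit(X, y, k=1)
Y = X @ W                                       # Eq. (4): Y = X . W
```

With two classes, $S_B$ has rank one, so a single discriminant carries all the class separation; projecting onto it reduces the three features to one.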
PCA algorithm
1) Compute mean vectors for the input features dataset ($x_i$):
   Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$   (5)
2) Calculate the scatter matrix (covariance matrix):
   a. $S_w = \sum_{i=1}^{n} N_i (Px - m)(Px - m)^T$   (6)
3) Compute eigenvectors and eigenvalues.
4) Sort the eigenvectors in descending order of eigenvalue, $W$.
5) Project the principal components onto the input features dataset by using the equation
   a. $Y = W^T x$   (7)
6) Compute mean vectors for the input features dataset ($x_i$):
   Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$   (8)
7) Calculate the scatter matrix (covariance matrix):
   a. $S_w = \sum_{i=1}^{n} (Px_i - \overline{Px})(Px_i - \overline{Px})^T$   (9)
8) Compute mean vectors for the principal components ($Px_i$):
   Mean: $\overline{Px} = \frac{1}{n}\sum_{i=1}^{n} Px_i$   (10)
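As a rough illustration of steps 1 to 5, the sketch below centres the data, forms the sample covariance matrix, sorts the eigenvectors by eigenvalue and projects the data. It is a sketch under stated assumptions: samples are stored as rows, so the projection $Y = W^T x$ is written as `centred @ W`; the covariance uses the usual $1/(n-1)$ normalisation; and the four-feature dataset is synthetic.

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X onto the k leading principal components."""
    mean = X.mean(axis=0)                       # feature-wise mean
    centred = X - mean
    cov = centred.T @ centred / (len(X) - 1)    # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1][:k]       # descending eigenvalue order
    W = eigvecs[:, order]
    return centred @ W, W, eigvals[order]

# Synthetic, hypothetical feature matrix: 200 samples, 4 features
# with deliberately different variances.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4)) * np.array([5.0, 2.0, 0.5, 0.1])

Y, W, top_eigvals = pca_project(X, k=2)
```

The variance of each projected column equals the corresponding eigenvalue, which is why keeping the eigenvectors with the largest eigenvalues retains most of the variability in the data.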
ALO Algorithm
1. Initialise the population of ants and antlions randomly and compute the fitness of both antlions and ants. The positions of the ants follow the random walk
   $X(t) = [0, \mathrm{cumsum}(2r(t_1) - 1), \mathrm{cumsum}(2r(t_2) - 1), \ldots, \mathrm{cumsum}(2r(t_n) - 1)]$
2. Determine the best-fit antlions and label them as elite.
3. While the end criterion is not satisfied:
      For each ant:
         Choose an Ant Lion by roulette wheel
         Compute random walks
         Normalise
         Update ant position
      End for
      Calculate ant fitness
      Replace antlions by their fittest counterparts
      Update elite antlions
   End while
4. Return Ant Lion fitness
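Three building blocks of the loop above (the random walk, its normalisation and the roulette-wheel choice) can be sketched with the Python standard library. This is an illustrative sketch rather than a full ALO implementation: the function names are hypothetical, $r(t)$ is taken to be 1 when a uniform random number exceeds 0.5 and 0 otherwise, and fitness values are assumed to be non-negative with higher meaning fitter.

```python
import random
from itertools import accumulate

def random_walk(n_steps, rng):
    """X(t) = [0, cumsum(2r(t_1) - 1), ..., cumsum(2r(t_n) - 1)]."""
    steps = [2 * (1 if rng.random() > 0.5 else 0) - 1 for _ in range(n_steps)]
    return [0] + list(accumulate(steps))

def normalise(walk, lower, upper):
    """Min-max scale a walk into the interval [lower, upper]."""
    lo, hi = min(walk), max(walk)
    if lo == hi:                          # degenerate walk: no spread to scale
        return [lower for _ in walk]
    return [(x - lo) * (upper - lower) / (hi - lo) + lower for x in walk]

def roulette_wheel(fitness, rng):
    """Pick an index with probability proportional to its fitness."""
    pick = rng.uniform(0, sum(fitness))
    running = 0.0
    for i, f in enumerate(fitness):
        running += f
        if pick <= running:
            return i
    return len(fitness) - 1               # guard against rounding at the end

rng = random.Random(0)
walk = random_walk(50, rng)
scaled = normalise(walk, 0.0, 1.0)
chosen = roulette_wheel([0.2, 0.5, 0.3], rng)
```

Normalisation keeps each walk inside the search bounds, and the roulette wheel makes fitter antlions more likely to be chosen, which is what biases the ants towards promising regions.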
As stated, the IPS model contains three phases. It starts automatically once the IDS has examined all features of the dataset and forwarded any suspect DDoS packets to the IPS. At the end of the procedure, any anomalous IP is blocked by the IPS; together, the two models operate as an integrated host-based IDPS in the hypervisor.
Fig. 4. Reconstructed feature table with IPs and ALO fitness.
Detection classifier – ANN
The Ant Lion fitness produced by the ALO (its output) is used to separate normal traffic from attack traffic. The best-fit Ant Lion obtains the highest fitness values. The ANN, a feedforward neural network, uses the Ant Lion fitness as shown in Figure 5.

Filter attacking source suspicious IPs: An initial profile is generated over a period (typically days, sometimes weeks) that is sometimes called a training period. Profiles for anomaly-based detection can be either static or dynamic. Once generated, a static profile is constant unless the IDPS is specifically directed to generate a new profile. A dynamic profile is adjusted constantly as additional events are observed. Because systems and networks change over time, the corresponding measures of normal behaviour also change; a static profile will therefore eventually become inaccurate and should be regenerated periodically. Dynamic profiles avoid this problem but are susceptible to evasion attempts. For example, an attacker can perform small amounts of malicious activity occasionally and then slowly increase the frequency and quantity of that activity. If the rate of change is sufficiently slow, the IDPS might treat the malicious activity as normal behaviour and include it in its profile. Malicious activity might also be observed by an IDPS while it builds its initial profiles.

• Blacklists and whitelists: Once the proposed host-based integrated IDPS on the hypervisor flags any suspicious IPs, the host-based intrusion detection and prevention system (HIDPS) takes immediate action against any anomalous DDoS packet. The hypervisor will utilise a list of discrete entities associated with malicious activity, such as hosts, TCP or UDP port numbers, internet control message protocol (ICMP) types and codes, applications, usernames, URLs and filenames or file extensions. Blacklists, also known as hot lists, are typically used to allow HIDPSs to recognise and block activity that is highly likely to be malicious; they may also be used to assign a higher priority to alerts that match blacklist entries. Some IDPSs generate dynamic blacklists that are used to temporarily block recently detected threats (e.g., activity from the IP address of an attacker). A whitelist is a list of discrete entities that are known to be benign. Whitelists are typically used on a granular basis, such as protocol-by-protocol, to reduce or ignore false positives involving known benign activity from trusted hosts. Whitelists and blacklists are commonly used in signature-based detection and stateful protocol analysis.
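The interplay of the classifier verdict with the blacklist and whitelist can be sketched as a small filter. The class name, the `handle` method and the precedence (whitelist first, then blacklist, then the classifier verdict) are assumptions made for illustration, and the addresses come from documentation ranges; the paper does not prescribe this exact logic.

```python
from dataclasses import dataclass, field

@dataclass
class PacketFilter:
    """Hypothetical source-IP filter backed by a whitelist and a dynamic blacklist."""
    whitelist: set = field(default_factory=set)
    blacklist: set = field(default_factory=set)

    def handle(self, src_ip, is_attack):
        """Return 'pass' or 'block' for one packet already scored by the classifier."""
        if src_ip in self.whitelist:      # trusted host: ignore a possible false positive
            return "pass"
        if src_ip in self.blacklist:      # previously flagged source
            return "block"
        if is_attack:                     # new suspicious source: remember and block it
            self.blacklist.add(src_ip)
            return "block"
        return "pass"

flt = PacketFilter(whitelist={"192.0.2.10"})
decisions = [
    flt.handle("198.51.100.7", is_attack=True),    # flagged, gets blacklisted
    flt.handle("198.51.100.7", is_attack=False),   # still blocked via the blacklist
    flt.handle("192.0.2.10", is_attack=True),      # whitelisted, allowed through
    flt.handle("203.0.113.5", is_attack=False),    # unknown and benign
]
```

Checking the whitelist first is one way to suppress false positives from trusted hosts; a deployment that trusts the classifier more than the whitelist would reverse that order.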
Fig. 5. ANN-based detection classifier.
Experimental Results
All results to date are promising and feasible. We measured both the IDS and the IPS in terms of detection rate for three selected attack types: TCP SYN, TCP push (PSH-ACK) and ICMP DDoS attacks. These attacks were taken from the CIDA DDoS 2009 attack and UCLA datasets, which were selected because they were captured in real time. TCP SYN shows the highest detection rate: as shown in Fig. 6, it reached 100% detection, which means zero classification errors. TCP push and ICMP show lower classification and detection rates. The IPS also shows a high percentage of detection and prevention, which we attribute to the normalisation enhanced by ALO. Again, TCP SYN shows the full detection and prevention rate in comparison with TCP push and ICMP. We also calculated which IPs hit the server most often and found that they came from the CIDA datasets. The IP 172.162.222.64 was the most frequently repeated source of TCP attacks, which means that it sent many flooding messages to the cloud server. Overall, our work achieves good results in terms of detection and prevention.
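The text does not define the detection rate formally. A common reading, assumed here, is the percentage of true attack samples that the classifier flags as attacks. The sketch below computes it on invented labels that are purely illustrative and are not results from this study.

```python
def detection_rate(y_true, y_pred, attack_label=1):
    """Percentage of true attack samples that were predicted as attacks."""
    predictions_for_attacks = [p for t, p in zip(y_true, y_pred) if t == attack_label]
    if not predictions_for_attacks:
        raise ValueError("no attack samples in y_true")
    detected = sum(1 for p in predictions_for_attacks if p == attack_label)
    return 100.0 * detected / len(predictions_for_attacks)

# Hypothetical labels: 1 = attack, 0 = normal.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
rate = detection_rate(y_true, y_pred)   # 3 of the 4 attacks are flagged
```

Under this definition a 100% detection rate means that no attack sample was missed; it says nothing about normal traffic wrongly flagged as an attack, which would need a separate false-positive measure.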
[Bar charts: IDS and IPS panels; vertical axis from 0.00% to 100.00%; bars for TCP Sync, TCP PSH Ack and ICMP Attack % detection.]
Fig 6: the IDS detection rate
Fig 7: IPS detection and prevention rate
Fig 8: TCP IPs attack from CIDA dataset
Conclusion
This conceptual model design is based on two important families of methods: statistical procedures and swarm intelligence algorithms. PCA with LDA is used as the statistical procedure, and ALO with ANN is proposed as the swarm intelligence component. The proposed HIDPS model is built by modelling the IDS on its own and then integrating it with the IPS in the hypervisor environment. The IDS is developed from PCA and LDA, whose output dataset is optimised by ALO; the ANN classifier then identifies any anomalous IPs, which are sent to the IPS, while good packets are allowed to pass. After receiving the decision from the IDS, the IPS uses ALO and the ANN classifier to finally block the suspicious IP and add it to the blacklist held in the hypervisor of the cloud server. This is work in progress; it will be developed into a full model in the near future as part of a PhD thesis.
References
1. Subashini, S., Kavitha, V.: A survey on security issues in service delivery models of cloud computing. Journal of Network and Computer Applications 34, 1-11 (2011)
2. Beloglazov, A., Abawajy, J., Buyya, R.: Energy-aware resource allocation heuristics for efficient management of data centers for cloud computing. Future Generation Computer Systems 28, 755-768 (2012)
3. Dorbala, S.Y., Kishore, R., Hubballi, N.: An experience report on scalable implementation of DDoS attack detection. In: International Conference on Advanced Information Systems Engineering, pp. 518-529. Springer, (Year)
4. Gérard, K., Hamaide, M., Sadre, R., Van Roy, P., Bilal, M.: Collaborative intrusion detection system for small computers.
5. Anusha, K., Sathiyamoorthy, E.: OMAMIDS: Ontology Based Multi-Agent Model Intrusion Detection System for Detecting Web Service Attacks. Journal of Applied Security Research 11, 489-508 (2016)
6. Akamai: (2017)
7. Ullrich, J., Zseby, T., Fabini, J., Weippl, E.: Network-Based Secret Communication in Clouds: A Survey. IEEE Communications Surveys & Tutorials (2017)
8. Suryawanshi, M.S.M., Kolhe, M.V.L.: Big Data Analytics: Requirements and Characteristics. International Journal 4, 46-51 (2017)
9. Zargar, S.T., Joshi, J., Tipper, D.: A survey of defense mechanisms against distributed denial of service (DDoS) flooding attacks. IEEE Communications Surveys & Tutorials 15, 2046-2069 (2013)
10. Sun, D., Zhang, J., Fan, W., Wang, T., Liu, C., Huang, W.: SPLM: Security Protection of Live Virtual Machine Migration in Cloud Computing. In: Proceedings of the 4th ACM International Workshop on Security in Cloud Computing, pp. 2-9. ACM, (Year)
11. Gupta, B., Badve, O.P.: Taxonomy of DoS and DDoS attacks and desirable defense mechanism in a Cloud computing environment. Neural Computing and Applications 1-28 (2016)
12. Naik, N., Jenkins, P.: An Analysis of Open Standard Identity Protocols in Cloud Computing Security Paradigm. In: Dependable, Autonomic and Secure Computing, 14th Intl Conf on Pervasive Intelligence and Computing, 2nd Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress (DASC/PiCom/DataCom/CyberSciTech), 2016 IEEE 14th Intl C, pp. 428-431. IEEE, (Year)
13. Patel, A., Taghavi, M., Bakhtiyari, K., Júnior, J.C.: An intrusion detection and prevention system in cloud computing: A systematic review. Journal of Network and Computer Applications 36, 25-41 (2013)