Anomaly Detection in Datacenters for Cloud Computing Digital Forensics

ALECSANDRU PĂTRAȘCU (1), VICTOR VALERIU PATRICIU (2)

(1, 2) Military Technical Academy, 39-49 George Cosbuc Street, District 5, 050141, Bucharest, Romania
(1) Advanced Technologies Institute, 10 Dinu Vintila, District 2, 021102, Bucharest, Romania

E-mails: [email protected], [email protected], [email protected]
Abstract: Cloud computing technologies have an important place in today's digital environment, as they offer users attractive benefits such as information backup, file storage and the renting of virtual machines. In this context we need to know exactly where, when and how a piece of data is processed and, even more, we need to know what is happening inside a datacenter at the virtual machine level. This means that a system able to detect anomalies based on the usage patterns of virtual machines must be installed at the datacenter level. In this paper we present a novel way of monitoring virtual machine activity in datacenters and show how this information is used to train our automated anomaly detection machine learning modules.

Key-Words: cloud computing, data forensics, anomaly detection framework, distributed computing.
1 Introduction

Since its creation, cloud computing has presented itself to users as a way to rent various amounts of computing power in the form of virtual machines, intermediate platforms targeted at developers, or ready-to-use applications for mass usage. The technologies surrounding it have evolved at a great pace, yet we can find a concern common to all of them: cloud computing security. Alongside these security issues, the need to know how information is delivered from and to the clients, and under what conditions it is processed, is also emerging. In this context, cloud computing has become in recent years a paradigm that attracts more and more researchers. One of the main research areas in this field is the way in which common data and processing power can be shared and distributed across
single or multiple datacenters that are spread across a specific geographical area or even the entire globe. A new demand on IT experts is also increasing: the need to know exactly how, where and under which conditions data from the cloud is stored, processed and delivered to the clients. We can say with great confidence that cloud computing forensics has become more and more necessary in today's distributed digital world. In classic computer forensics, the purpose is to search, preserve and analyze information on computer systems in order to find potential evidence for a trial. In cloud environments the entire paradigm changes, because we do not have access to a physical computer, and even if we do, there is a great chance that the data stored on it is encrypted or split across multiple other computer systems. Closely related to knowing where and how information is processed, we find another threat: anomalies. Network
anomalies are, and will continue to be, an important concern for any modern distributed system. Virtualized IT infrastructures are designed to be scalable and highly available. At the same time, the dynamic nature of web deployment means that today's datacenter suffers from chronic instability, leading to inconsistent results, performance issues and downtime. System anomalies are a common cause of this datacenter instability. Traditional application performance management solutions are insufficient when it comes to addressing dynamic web environments, while automated provisioning addresses only part of the issue. In an ideal world, every server in a cluster would run a uniform configuration: identical resources, OS versions, software and data. In the real world, however, that is rarely the case. Instead, systems are plagued by a number of inconsistencies that grow over time. This divergence of servers in the datacenter causes a large number of problems in IT operations, from minor disruptions to large-scale outages.

The rest of the paper is structured as follows. In Section 2 we discuss related work in the field of anomaly detection in cloud computing environments, and in Section 3 we present our own architecture for a combined solution for automated cloud anomaly detection. In Section 4 we conclude our work.

2 Related Work

In the field of classic digital forensics there is a great deal of active research and many books, guides and papers. Nevertheless, in the field of cloud computing forensics and incident response the papers are mostly theoretical and present only an ideal model. In the direction of classic incident response, one of the most interesting guides is the one from NIST [1]. In it we can find a summary containing a short description and
recommendations for the field of computer forensics, along with the basic steps that must be followed when conducting a forensic analysis: collection, examination, analysis and reporting. A great deal of attention is paid to the problem of incident response and how an incident should be detected, isolated and analyzed. Bernd Grobauer and Thomas Schreck discuss in [2] the challenges imposed by cloud computing incident handling and response. This problem is also analyzed in [3], where the authors argue that incident handling should be treated as a well-defined part of the security process. They also present a description of the current processes and methods used in incident handling and of the changes that can be made when migrating towards a cloud environment, from the point of view of a customer or a security manager. Furthermore, the integration of cloud incident handling and cybersecurity is presented in two papers, one written by Takahashi et al. [4] and the other by Simmons et al. [5]. They describe how the development of the Internet leads to a widespread deployment of various IT technologies and security advances. They also propose an ontological approach for the integration of cybersecurity in the context of cloud computing and present how information should be used and analyzed in such environments. The field of cloud logging, as a support for forensics, is starting to emerge alongside the directions presented above. Here we find works such as the one of Zawoad et al., which presents an architecture for a secure cloud logging service in [6]. They discuss the need to gather logs from various sources around the datacenter or from the hypervisors in order to create a permanent image of the operations performed within a datacenter, and they present an architecture that can be used to help in this direction.
3 Architecture

In this section we present the top-level architecture of an anomaly detection system for cloud computing systems.

3.1 The General Context

To better understand the problem, we briefly present the entire environment and the issues we face. Our forensic-enabled cloud computing framework is composed of two layers and a series of modules, as shown in Figure 1.
Figure 1. Forensic enabled cloud computing architecture

In the virtualization layer we find the actual platforms/servers that host the virtual machines and have virtualization-enabled hardware. In the management layer we find the modules responsible for enabling all the operations specific to the cloud. These modules are, in order: Security (responsible for all security concerns related to the cloud system, acting as an intrusion detection and alarming module), Validation engine (receives requests to add new jobs to be processed), Virtual jobs (creates an abstraction between the data requested by the user and the payload that must be delivered to the cloud system), Scheduler (schedules the jobs onto the virtualization layer), Hypervisor interface (acts as a translation layer specific to a virtualization software vendor), Load distribution (responsible for horizontal and vertical scaling of the requests received from the scheduler), Internal cloud API (intended as a link between the virtualization layer and the cloud system) and External cloud API (offers the user a way of interacting with the system). The essential modules are the Cloud Forensic Module and the Cloud Forensic Interface pair. Their main goal is to gather all forensic and log data from the virtual machines that are running inside the virtualization layer, and they represent the interface between the legal forensic investigator and the monitored virtual machines. The investigator has the possibility to monitor one or more virtual machines belonging to a targeted user for a specific amount of time.
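To make this chain of modules more tangible, the following Python sketch mimics the flow of a user request through the management layer. It is purely illustrative: the class and function names are our own assumptions and do not correspond to the framework's actual code.

```python
# Purely illustrative sketch of the management-layer flow; all names are
# hypothetical and do not come from the framework described in the paper.
from dataclasses import dataclass
from typing import List


@dataclass
class VirtualJob:
    """Abstraction between the user's request and the payload sent to the cloud."""
    user: str
    payload: dict


class ValidationEngine:
    """Receives requests to add new jobs; the Security module would be consulted here."""
    def accept(self, job: VirtualJob) -> bool:
        return bool(job.user and job.payload)


class Scheduler:
    """Schedules accepted jobs onto the virtualization layer."""
    def __init__(self) -> None:
        self.queue: List[VirtualJob] = []

    def submit(self, job: VirtualJob) -> None:
        # The Load distribution module would later scale these requests
        # horizontally or vertically across the datacenter.
        self.queue.append(job)


class HypervisorInterface:
    """Translation layer towards a concrete virtualization vendor (e.g. LXC)."""
    def dispatch(self, job: VirtualJob) -> str:
        return f"container started for {job.user}"


def external_cloud_api(user: str, payload: dict) -> str:
    """External cloud API: the single entry point offered to the user."""
    job = VirtualJob(user, payload)
    if not ValidationEngine().accept(job):
        return "request rejected"
    scheduler = Scheduler()
    scheduler.submit(job)
    return HypervisorInterface().dispatch(scheduler.queue.pop(0))


print(external_cloud_api("alice", {"image": "ubuntu-14.04"}))
```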
3.2 The Anomaly Detection System
In our anomaly detection system we started from the assumption that the users are running operating-system-level virtualization software; we made this decision because our architecture's main building blocks are the same regardless of which virtualization technology is used. More precisely, we modified the LXC (LinuX Containers) software, which can be installed on top of a regular Linux distribution. The main idea behind OS-level virtualization is to have a certain place in the OS, a directory for example, that represents the virtual machine's hard disk. The rest of the needed resources (memory, networking, etc.) are split by the Linux kernel between the virtual instances. We can simply say that it is an enhanced chroot environment.

3.2.1 General Architecture

We start by explaining graphically how LXC runs. In Figure 2 we can see an operating system with the proper kernel in place. LXC acts
like a virtual layer over the Linux kernel that sits between the virtual instances, called containers, and the kernel. When we create a new container, a directory is created on the disk and the user is locked inside it. The new container's internal structure is initially a full clone taken from the main operating system. Only when a modification must be made inside the container is an actual file created inside the chroot-ed directory, and the same applies to all the files existing in the operating system. In this way it is even possible to install additional software inside the container without affecting the outside operating system.

Figure 2. General LXC architecture
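To make the container concept more concrete, the minimal sketch below drives the stock LXC command-line tools from Python; the container name and template are illustrative, and the commands assume a Linux host with the LXC userspace tools installed and root privileges.

```python
# Minimal sketch: create, start, inspect and remove an LXC container using
# the standard command-line tools. Requires root and the lxc package.
import subprocess

NAME = "forensic-demo"  # illustrative container name


def run(*cmd: str) -> None:
    """Run one LXC command and fail loudly if it does not succeed."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)


# Create the container from a template; its root filesystem becomes a
# directory on the host (by default under /var/lib/lxc/<name>/rootfs),
# i.e. the "enhanced chroot" described above.
run("lxc-create", "-n", NAME, "-t", "busybox")

# Start it in the background and execute a command inside it; the kernel
# splits CPU, memory and networking between the host and the container.
run("lxc-start", "-n", NAME, "-d")
run("lxc-attach", "-n", NAME, "--", "uname", "-a")

# Stop and remove the container when done.
run("lxc-stop", "-n", NAME)
run("lxc-destroy", "-n", NAME)
```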
We propose the following modification: since the LXC layer acts like a hypervisor, we can integrate with it and load an additional module, called the "arbiter". Its purpose is to monitor all calls made by the containers and to build a database with all requests and responses. In Figure 3 we can see the conceptual modifications that are made to the initial system by adding the "Arbiter" and the "Arbiter external interface".

Figure 3. Altered forensic LXC architecture
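As a first approximation, the arbiter can be viewed as a recorder that persists every intercepted request/response pair together with the container that produced it. The sketch below is only our own illustration of that idea; the SQLite schema, table and function names are assumptions, not the actual implementation.

```python
# Hypothetical arbiter skeleton: record intercepted container requests and
# responses in a local SQLite database for later forensic analysis.
import sqlite3
import time

db = sqlite3.connect("arbiter.db")
db.execute("""CREATE TABLE IF NOT EXISTS events (
                  ts        REAL,
                  container TEXT,
                  request   TEXT,
                  response  TEXT)""")


def record(container: str, request: str, response: str) -> None:
    """Store one intercepted request/response pair with a timestamp."""
    db.execute("INSERT INTO events VALUES (?, ?, ?, ?)",
               (time.time(), container, request, response))
    db.commit()


# Example: a hypothetical hook called by the LXC layer for every container call.
record("container-42", "open(/etc/passwd)", "allowed")
for row in db.execute("SELECT * FROM events"):
    print(row)
```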
The last module is needed in the final version of our cloud computing forensic framework, in which there will be just one virtual machine per user. Each time our framework layer decides to increase or decrease the number of virtual machines, we just have to start or stop containers inside the virtual machine. To be more explicit, we give in Figure 4 a sketch of our final implementation.
Figure 4. Sketch for our final forensic cloud architecture

3.2.2 The Arbiter Module

The "Arbiter" module is implemented in the LXC context. We have chosen this approach because we need to integrate with every part of LXC. Another great benefit is the possibility to intercept all network traffic. The underlying architecture is based on libpcap, the de facto library for capturing network traffic. Furthermore, it is possible to inject network traffic stored in external files and replay it inside a container. This will be the main method of training and testing our framework: use known network dumps of various attacks and then check the validity of our implementation.
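A possible shape for that replay-driven training pipeline is sketched below: a stored capture is read back with scapy and reduced to a handful of per-packet features. The library choice, the feature set and the file name are our assumptions, not the paper's implementation.

```python
# Sketch: turn a stored network dump into a numeric feature matrix that a
# learning module can consume. Requires scapy (pip install scapy).
from scapy.all import rdpcap, IP, TCP, UDP


def pcap_to_features(path: str):
    """Return one small feature vector per IP packet in the capture."""
    features = []
    for pkt in rdpcap(path):
        if IP not in pkt:
            continue
        proto = 6 if TCP in pkt else (17 if UDP in pkt else 0)
        sport = pkt[TCP].sport if TCP in pkt else (pkt[UDP].sport if UDP in pkt else 0)
        dport = pkt[TCP].dport if TCP in pkt else (pkt[UDP].dport if UDP in pkt else 0)
        # Feature vector: packet length, protocol number, source/destination port.
        features.append([len(pkt), proto, sport, dport])
    return features


if __name__ == "__main__":
    # "attack.pcap" is a placeholder for one of the known attack dumps.
    for vec in pcap_to_features("attack.pcap")[:5]:
        print(vec)
```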
3.2.3 The Learning Module

For the learning module we want to use kernel-based methods, more exactly Support Vector Machines (SVMs). This approach is better than using a Bayesian variant because an SVM needs no starting knowledge in order to train, whereas Bayesian classifiers require special attention during initialization (they must be trained evenly on bad and good patterns). In Figure 5 we can see an example explaining where our module will read its data from.

Figure 5. SVM learning module and its input
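As a minimal sketch of such a learning module, the example below trains scikit-learn's one-class SVM on feature vectors considered normal and flags outliers. The library, parameters and synthetic data are our assumptions; in practice the vectors would be the features extracted from the arbiter's traffic captures.

```python
# Sketch: train a one-class SVM on feature vectors considered "normal" and
# flag everything that falls outside the learned region as an anomaly.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Stand-in training data; in the framework these would be the per-packet
# features extracted from benign traffic captured by the arbiter.
normal = rng.normal(loc=0.0, scale=1.0, size=(500, 4))

model = OneClassSVM(kernel="rbf", gamma="auto", nu=0.05)
model.fit(normal)

# New observations: some near the training distribution, some far away.
candidates = np.vstack([rng.normal(0.0, 1.0, (3, 4)),
                        rng.normal(8.0, 1.0, (3, 4))])
for vec, label in zip(candidates, model.predict(candidates)):
    # predict() returns +1 for inliers and -1 for anomalies.
    print("anomaly" if label == -1 else "normal ", vec)
```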
3.2.4 The Visualization Module

Our system will produce large quantities of logs and data. This is why we need a dedicated visualization approach, and to solve this problem we have studied "visual analytics" methods. This approach is rather new and it can help administrators and/or forensic investigators by giving graphical statistics over what is happening in the datacenter. Figure 6 shows an example. We have considered each running container inside a datacenter as a pixel in a photo. A pixel can have RGB values ranging from a minimum of 0x000000 to a maximum of 0xFFFFFF. We consider that a container having the minimum value is "clean" and a container having values closer to the maximum value is "altered". The more "altered" a container is, the more alerts our cloud forensic system triggers.

Figure 6. Datacenter visual analysis
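Under this pixel analogy, each container's anomaly score only has to be mapped onto a colour and placed into a grid. The Pillow-based sketch below, with made-up scores, illustrates one way such an image could be produced; the scores and the file name are illustrative only.

```python
# Sketch: render per-container anomaly scores (0.0 = clean, 1.0 = altered)
# as pixels of an image, one pixel per running container. Requires Pillow.
import math
from PIL import Image

# Made-up scores; in the framework they would come from the learning module.
scores = [0.0, 0.1, 0.05, 0.9, 0.3, 0.0, 0.7, 1.0, 0.2]

side = math.ceil(math.sqrt(len(scores)))          # square-ish grid
img = Image.new("RGB", (side, side), (0, 0, 0))   # start fully "clean"

for i, score in enumerate(scores):
    value = int(round(score * 255))
    # Greyscale ramp from 0x000000 ("clean") towards 0xFFFFFF ("altered").
    img.putpixel((i % side, i // side), (value, value, value))

# Scale up with nearest-neighbour so each container stays a crisp square.
img.resize((side * 50, side * 50), Image.NEAREST).save("datacenter.png")
```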
4 Conclusion
As we have seen, the field of cloud computing forensics and incident response is a new research field that attracts more and more scientists. It poses many challenges due to the distributed nature of the cloud, but steps are starting to be made in this direction. In this paper we have presented a novel way in which user actions can be monitored and reproduced inside a cloud environment, even one that spreads over multiple datacenters, and a solution that provides digital forensic investigators with a reliable and secure method for monitoring user activity over a cloud infrastructure. As future work we intend to continue along this research path and detail, module by module, what we are going to implement. Of course, further testing using more complex scenarios and a lightweight integration with other existing cloud infrastructures would also help us to further improve our solution.

References:

[1] NIST SP800-86 Notes, "Guide to Integrating Forensic Techniques into Incident Response", http://cybersd.com/sec2/80086Summary.pdf
[2] B. Grobauer and T. Schreck, "Towards incident handling in the cloud: challenges and approaches", in Proceedings of the 2010 ACM workshop on Cloud computing security workshop, New York, 2010
[3] G. Chen, "Suggestions to digital forensics in Cloud computing ERA", in Third IEEE International Conference on Network Infrastructure and Digital Content (ICNIDC), 2012
[4] T. Takahashi, Y. Kadobayashi and H. Fujiwara, "Ontological Approach toward Cybersecurity in Cloud Computing", 2010
[5] M. Simmons and H. Chi, "Designing and implementing cloud-based digital forensics", in Proceedings of the 2012 Information Security Curriculum Development Conference, pages 69-74, 2012
[6] S. Zawoad, A.K. Dutta and R. Hasan, "SecLaaS: Secure Logging-as-a-Service for Cloud Forensics", in 8th ACM Symposium on Information, Computer and Communications Security (ASIACCS), 2013