VMGuard: An Integrity Monitoring System for Management Virtual Machines

Haifeng Fang∗†, Yiqiang Zhao†, Hongyong Zang∗†, H. Howie Huang‡, Ying Song†, Yuzhong Sun† and Zhiyong Liu†
∗ Graduate University of Chinese Academy of Sciences, Beijing, China
† Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences
email: {fanghaifeng,zanghongyong,songying}@ncic.ac.cn, {zhaoyiqiang,yuzhongsun,zyliu}@ict.ac.cn
‡ George Washington University, Washington DC, USA. email: [email protected]

Abstract—A cloud computing provider can dynamically allocate virtual machines (VMs) according to customer demand, while retaining privileged access to the Management Virtual Machine that directly manages the hardware and supports the guest VMs. Customers must therefore trust the cloud provider to protect the confidentiality and integrity of their applications and data. However, because VMs from different customers run on the same host, an attack on the management virtual machine can easily lead to the compromise of the guest VMs. It is therefore critical for a cloud computing system to ensure the trustworthiness of its management VMs. To this end, we propose VMGuard, an integrity monitoring and detection system for management virtual machines in a distributed environment. VMGuard utilizes a special VM, GuardDomain, which runs on each physical node to monitor the co-resident management VM. The integrity measurements collected by the GuardDomains are sent to the VMGuard server for safe storage and independent analysis. The experimental evaluation of a Xen-based prototype shows that VMGuard can quickly detect rootkit attacks while incurring low performance overhead.

Keywords-Virtual Machine; Integrity Monitoring; Remote I/O
I. INTRODUCTION

Cloud computing has become a popular choice for large-scale, high-throughput computing. Using virtualization technologies (e.g., Xen [1]), cloud providers can dynamically create a large number of virtual machines (VMs) to meet the requirements of their customers. For example, through Infrastructure as a Service (IaaS) [2] (e.g., Amazon's EC2 [3]), customers can purchase guest VMs to run their applications in an on-demand fashion. The cloud providers retain privileged access to the Management Virtual Machine (e.g., Domain0 in Xen) [4], which directly manages the hardware and supports the guest VMs. Currently, customers must trust the cloud providers to protect the confidentiality and integrity of their applications and data. However, as the VMs from different customers run on the same hardware, an attack on the management virtual machine can easily lead to the compromise of the guest VMs. Therefore, it is critical for a cloud computing system to ensure the trustworthiness of its management VMs.

In Xen, a guest virtual machine is called a DomainU, and the Management Virtual Machine is Domain0, which owns all privileges and runs the management tasks, e.g., domain management, CPU scheduling, and I/O processing. For example, after creating a DomainU, Domain0 can transparently read and write the memory content of the DomainU through the management interface (e.g., xc_map_foreign_range). Should Domain0 be compromised, attackers could use this management interface to steal valuable information from any DomainU. Prior work [5] [6] [10] [11] [12] [13] [25] assumes that Domain0 is trustworthy and develops "out-of-the-box" monitoring tools that run in Domain0. Unfortunately, there exist several (external or internal) vulnerabilities in Domain0 that can be exploited during an attack, e.g., wrong configurations and software bugs.

A typical attack on Domain0 usually starts from a DomainU instance. One example is the recent exploit (CVE-2007-4993) [14], where a "grub.conf" in a DomainU is used to execute privileged commands in Domain0. In addition, a management console (e.g., XenCenter, HyperVM) opens new holes - the XenAPI HTTP interface has a cross-site scripting vulnerability [15], which runs a script in a user's browser session in the context of an affected site. The compromise of a management console allows an attacker to control all the VMs managed by it. For example, on June 8th, 2009, 100,000 hosted websites were affected by a zero-day SQL injection hole in HyperVM 2.09 [16], through which the intruders gained root privileges in Domain0.

Clearly, there is a need to maintain the integrity of Domain0. Current techniques can be categorized as: 1) static integrity measurement, where many approaches [4] [7] [17] are based on the TCG's TPM technology. However, TPM-based measurement can only ensure the integrity of Domain0 at startup, and it does not protect Domain0 from malicious software that runs in Domain0. 2) dynamic integrity measurement, where many approaches [5] [11] [13] focus on the integrity of the user VM.

In this paper, we design and implement VMGuard, an integrity monitoring and detecting system for management VMs in a distributed environment. VMGuard utilizes a special VM, GuardDomain, which runs on each physical node to monitor the co-resident management VM. The integrity measurements collected by the GuardDomains are sent to the VMGuard server for safe storage and independent analysis. The GuardDomain has two key features:
1) it is easy to deploy and maintain, as there is no need to update the baseline database, and 2) it is able to survive an attack even if the adversary gains root access to the co-resident Domain0.

In summary, this paper makes the following contributions:
• We design and implement a monitoring system to maintain the integrity of management VMs. To the best of our knowledge, it is the first attempt in a Xen-based virtual computing environment. Our experimental evaluation shows that the prototype can detect kernel-level rootkit attacks against Domain0s in a timely manner, while the performance overhead remains reasonable.
• We utilize the split device driver model in Xen to bind VMs to remote storage images. Our experiments show that its performance is 19% better than NFS.

VMGuard has its limitations, one of which is that the GuardDomainU must be bootstrapped from the untrusted Domain0. Further, the integrity of the GuardDomainU must be verified by the server Domain0. As part of future work, we plan to investigate new techniques in VMGuard to address these problems. One possible extension is that, at the startup of GuardDomainU, its integrity is verified by the administrator via a TPM-based technique, while at runtime it is monitored by GuardDomain.

The rest of this paper is organized as follows. Section II describes the background, and Section III presents the architecture of VMGuard. Section IV introduces the implementation of the prototype system; Section V presents the evaluation, and Section VI discusses the related work. We conclude in Section VII.

II. BACKGROUND

A. Split Device Driver Model

The split device driver model is a core feature of Xen. The applications in a DomainU access the physical I/O devices through the front-end device driver, which appears as a virtual I/O device in the DomainU. When the applications perform I/O operations, the front-end device driver forwards these requests to the back-end device driver in Domain0 via the communication mechanisms (I/O ring, event channel, grant table). The back-end device driver checks whether the requests are legitimate and asks the local native device drivers to perform the real I/O operations. When the I/O operations are completed, the back-end device driver asynchronously notifies the front-end device driver, which reports to the applications in the DomainU. In short, the front-end and back-end device drivers act as agents for each other, forming the virtual I/O device driver of the DomainU.
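To make the request path concrete, the following is a minimal conceptual sketch of how a front-end driver might hand an I/O request to the back-end over a shared ring. The structure and function names are simplified illustrations of the mechanism described above, not Xen's actual ring macros or blkif interfaces, and memory barriers and grant-table handling are omitted.

```c
/* Conceptual sketch of a split-driver I/O ring (simplified; not Xen's real API). */
#include <stdint.h>

#define RING_SIZE 32

struct io_request {                /* what the front-end asks for             */
    uint64_t sector;               /* starting sector of the operation        */
    uint32_t nr_sectors;           /* number of sectors to read or write      */
    uint8_t  write;                /* 0 = read, 1 = write                     */
    uint32_t gref;                 /* grant reference to the shared data page */
};

struct io_ring {                   /* one shared memory page between domains  */
    volatile uint32_t req_prod;    /* producer index, advanced by front-end   */
    volatile uint32_t req_cons;    /* consumer index, advanced by back-end    */
    struct io_request ring[RING_SIZE];
};

/* Front-end (DomainU): queue a request and notify the back-end. */
static void frontend_submit(struct io_ring *r, const struct io_request *req,
                            void (*notify_backend)(void))
{
    r->ring[r->req_prod % RING_SIZE] = *req;   /* copy the request into the ring */
    r->req_prod++;                             /* publish it                     */
    notify_backend();                          /* e.g., kick the event channel   */
}

/* Back-end (Domain0): drain pending requests and perform the real I/O. */
static void backend_poll(struct io_ring *r,
                         void (*do_real_io)(const struct io_request *))
{
    while (r->req_cons != r->req_prod) {
        struct io_request req = r->ring[r->req_cons % RING_SIZE];
        r->req_cons++;
        do_real_io(&req);          /* validate and forward to the native driver */
    }
}
```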
B. Kernel-level Rootkits

A large fraction of security breaches are caused by kernel-level attacks, e.g., rootkits that violate some form of security in the entire system [25]. In this paper, we mainly focus on external attacks against the Xen system. Once attackers gain root privilege in the Xen system, they will likely install various rootkits, especially kernel-level ones, in Domain0. Therefore, these rootkits are among the most significant threats to XenoLinux (the modified version of Linux running on Xen) in Domain0 [18]. A Linux rootkit enters the kernel space either by accessing "/dev/kmem" (or "/dev/mem") or by being loaded as a kernel module (LKM). Once successful, it will likely modify the kernel text, e.g., the virtual file system and the device drivers, or change the kernel's critical data structures, e.g., the system call table and the interrupt table.

III. SYSTEM DESIGN

A. Approach

In this paper, we assume that Xen is trustworthy because its code base is small and has few bugs, and a domain in the running state has almost no power to subvert Xen [20]. Note that this assumption is consistent with prior research efforts [5] [6] [7] [11] [13]. Our objective is to use VMGuard to ensure that intrusions that violate the kernel integrity of Domain0s are effectively detected. To this end, VMGuard must address the following challenges:

• Real-time monitoring
VMGuard should be able to detect a violation in real time. We propose a special domain, GuardDomain, that is implemented in the Xen kernel space. The GuardDomain can 1) directly call the internal memory management functions to access all physical memory belonging to Domain0, 2) use Xen's clock functions to activate the monitoring and integrity measurement tasks in a timely manner, and 3) run alongside Domain0. Since the GuardDomain is hidden inside the Xen space, we assume that it is hard for Domain0 to gain control of the GuardDomain. Note that there is a semantic gap between Xen and the guest VM kernel, because the GuardDomain has no information on the layout of the physical memory managed by Domain0. Here we introduce a new DomainU, GuardDomainU, which is created by the administrator and runs a special Linux kernel to support the semantic interpretation tools. Once GuardDomainU starts, the administrator can switch it into a trusted state for protection. Together, the GuardDomain and GuardDomainU accomplish the monitoring and integrity measurement of the co-resident Domain0. Clearly, both domains shall be securely isolated from the Domain0 and DomainUs, and remain in a trusted state even when the Domain0 or DomainUs are under attack.

• Tampering-resistant
To keep GuardDomain and GuardDomainU free from attacks, we shall minimize the interactions between them and other domains. Currently, when a DomainU starts, other domains can interact with it through the network interface. Additionally, Domain0 can control a DomainU through the privileged interfaces. To close these "loopholes", we introduce a new trusted running mode for GuardDomainU.
We also modify the privileged management interfaces to prevent Domain0 from controlling GuardDomainU and to allow GuardDomainU to map the physical memory area of Domain0. Although GuardDomain is located inside the Xen space and hidden from other domains, it needs to communicate with GuardDomainU, so its interfaces shall be accessible only to GuardDomainU.

• Integrity verification
VMGuard utilizes a central verification server, GuardServer, which stores the integrity measurements from the GuardDomains. The server maintains the baseline database and verifies the integrity of all Domain0s. For security purposes, the centralized management node does not deploy DomainUs. As a client, the GuardDomainU retrieves the management policies from the GuardServer and transfers the integrity measurements to the server in a timely manner. Obviously, we need to establish a secure network transport channel between the server and each GuardDomainU. A channel through the GuardDomainU's network interface would open GuardDomainU up to attacks. Thus, we propose to establish the secure channel in the form of remote images. This way, the server can maintain the information related to integrity measurement in an image file, and a GuardDomainU can remotely bind the image file as its storage. Although we could mount the remote storage onto Domain0 via NFS, this solution is not secure because the untrusted Domain0 could directly access the content inside the mounted image file. Thus, we design a new secure remote I/O binding mechanism for GuardDomainU, by which we only reveal the location of the remote image file (its contents cannot be accessed by Domain0).

B. Architecture

As illustrated in Figure 1, VMGuard adopts a client/server architecture. The client side (on the left of the figure) provides a trusted monitoring environment for each client Domain0 (C-Domain0), and the server side (on the right) is a centralized verification environment in the server Domain0 (S-Domain0). At a lower level, the VMGuard system includes the following three subsystems:

• The mode-control subsystem consists of the mode-control manager, the integrity measurement manager (policy semantic interpreter), and the privileged interface for GuardDomainU. After GuardDomainU starts up, the administrator can use the mode-control manager to switch GuardDomainU into the trusted running mode. The measurement manager reads the policy file that determines which parts of kernel memory in C-Domain0 shall be measured, and sends this information to its policy interpreter. The policy interpreter is responsible for translating the policy.
• The dynamic monitoring subsystem consists of GuardDomain, the memory hash-metric engine, the domain page-table manager, etc. GuardDomain has a VCPU that is scheduled by Xen to periodically activate the memory hash-metric engine. With the help of the domain page-table manager, the hash engine can access any memory area in C-Domain0. According to the measurement policy table, the engine calculates the hash value of the measured memory area of C-Domain0 (a conceptual sketch of this measurement loop appears after this list).
• The remote I/O binding subsystem consists of the front-end remote storage device driver and the back-end remote storage device driver. The front-end device driver is responsible for transferring the I/O requests produced by applications in GuardDomainU to the back-end device driver through the network. To enhance the security of the remote I/O channel, we could encrypt the data in the GuardDomainU and decrypt it on the verification server; our current prototype has not yet implemented this mechanism.
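As referenced in the dynamic monitoring item above, the following sketch illustrates the kind of periodic measurement loop the GuardDomain VCPU could run inside Xen. The structure and function names (guard_policy_entry, map_domain_region, md5_digest, and so on) are hypothetical placeholders rather than Xen's or the prototype's actual interfaces.

```c
/* Hypothetical sketch of the GuardDomain measurement loop (names are illustrative). */
#include <stdint.h>

#define DIGEST_LEN 16                     /* MD5 digest size in bytes */

struct guard_policy_entry {
    uint64_t start_mfn;                   /* first machine frame of the measured area */
    uint64_t length;                      /* length of the measured area in bytes     */
};

/* Provided by the domain page-table manager, hash engine, and logging buffer. */
extern void *map_domain_region(uint64_t start_mfn, uint64_t length);
extern void  unmap_domain_region(void *va, uint64_t length);
extern void  md5_digest(const void *buf, uint64_t len, uint8_t out[DIGEST_LEN]);
extern void  record_measurement(int entry_idx, const uint8_t digest[DIGEST_LEN]);
extern int   measurement_interval_elapsed(void);

/* Called whenever the GuardDomain VCPU is scheduled by Xen. */
void guarddomain_tick(struct guard_policy_entry *policy, int n_entries)
{
    if (!measurement_interval_elapsed())
        return;                                     /* not time to measure yet */

    for (int i = 0; i < n_entries; i++) {
        void *va = map_domain_region(policy[i].start_mfn, policy[i].length);
        if (!va)
            continue;                               /* region not mappable right now */

        uint8_t digest[DIGEST_LEN];
        md5_digest(va, policy[i].length, digest);   /* hash C-Domain0's kernel memory */
        record_measurement(i, digest);              /* buffer the value for the logger */

        unmap_domain_region(va, policy[i].length);
    }
}
```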
C. Deployment and Workflow of VMGuard

The deployment of the VMGuard system is accomplished in the following steps:

1) On the server side, the administrator prepares a policy for each service node that defines the associated kernel symbol table, the measurement policy, and the encryption keys, and stores it as an image file that will be bound to the GuardDomainU of that service node.

2) On the client side, after the Domain0 is started, the administrator writes the location of the image file into the GuardDomainU's resource configuration file and starts the GuardDomainU. During start-up, the GuardDomainU binds to the pre-defined image file through the remote I/O binding subsystem.

3) By running the mode-control manager, the administrator switches GuardDomainU into the trusted mode, in which the GuardDomainU's network interface is turned off, the interface for mapping Domain0's memory is opened, and the interface for mapping GuardDomainU's memory is closed.

4) The administrator starts the policy semantic interpreter and the integrity measurement manager. The former reads the measurement policy and the kernel symbol table from the image, and translates them into memory measurement-related settings. Through the privileged interface for GuardDomainU, these settings are registered into the measurement policy table. Using the integrity measurement manager, the administrator activates the GuardDomain. When GuardDomain is scheduled, it checks whether the measurement interval has elapsed. If so, it activates the memory hash-metric engine, which reads the items from the measurement policy table. Each item contains measurement information such as the start address and the length of the measured memory area.
Figure 1. Architecture of VMGuard (1: mode-control subsystem; 2: dynamic monitoring subsystem; 3: remote I/O subsystem)
With the help of the domain page-table manager, the engine maps the memory area, calculates its hash value, and then saves the measurement value into the measurement information buffer.

5) The administrator starts the integrity logger, which continually fetches and transports the integrity measurement records. The integrity logger reads the measurement records from the buffer through the privileged interface for GuardDomainU and appends these records to the log file in the GuardDomainU's image.

6) On the verification server side, the administrator starts the integrity metric-value verifier. The verifier opens the GuardDomainU's corresponding image file, reads the logging records, and compares them with the values in the metric-baseline database. If any hash value does not match, an alarm is issued.

IV. IMPLEMENTATION

We have implemented the VMGuard prototype based on Xen 3.1 and Linux 2.6.18. In this section, we elaborate on the implementation of the approaches described in Section III.

A. The Privileged Interface for GuardDomainU

We add a new hypercall interface that can only be called by the mode-control manager and the integrity measurement manager through the privcmd driver. By passing different parameters, the administrator can perform various privileged operations, e.g., switching GuardDomainU into the trusted mode, obtaining the hash values, etc.
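As a rough illustration, the sketch below shows how the mode-control manager might issue this new hypercall from user space in GuardDomainU through the privcmd driver's hypercall ioctl. The hypercall number (__HYPERVISOR_guardvm_op) and the command codes are hypothetical stand-ins for the paper's new interface, and the privcmd request structure is reproduced locally since header locations and details vary across Xen versions.

```c
/* Sketch: invoking the (hypothetical) GuardVM hypercall via the privcmd driver. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Mirrors Xen's privcmd hypercall request; defined locally for illustration. */
struct privcmd_hypercall {
    uint64_t op;                   /* hypercall number               */
    uint64_t arg[5];               /* up to five hypercall arguments */
};
#define IOCTL_PRIVCMD_HYPERCALL \
    _IOC(_IOC_NONE, 'P', 0, sizeof(struct privcmd_hypercall))

/* Hypothetical numbers for the new GuardVM hypercall and its commands. */
#define __HYPERVISOR_guardvm_op   48
#define GUARDVM_set_trusted_mode   1
#define GUARDVM_register_policy    2
#define GUARDVM_get_measurement    3

static long guardvm_op(int privcmd_fd, uint64_t cmd, uint64_t arg1, uint64_t arg2)
{
    struct privcmd_hypercall call = {
        .op  = __HYPERVISOR_guardvm_op,
        .arg = { cmd, arg1, arg2, 0, 0 },
    };
    return ioctl(privcmd_fd, IOCTL_PRIVCMD_HYPERCALL, &call);
}

int main(void)
{
    int fd = open("/proc/xen/privcmd", O_RDWR);
    if (fd < 0) {
        perror("open /proc/xen/privcmd");
        return 1;
    }
    /* Example: switch GuardDomainU (illustrative domain ID) into the trusted mode. */
    if (guardvm_op(fd, GUARDVM_set_trusted_mode, /* domid = */ 1, 0) < 0)
        perror("GUARDVM_set_trusted_mode");
    close(fd);
    return 0;
}
```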
B. Privcmd Driver

We modify the driver's privcmd_ioctl subroutine (IOCTL_PRIVCMD_MMAP, IOCTL_PRIVCMD_MMAPBATCH) and add privilege-checking logic to Xen. As a result, a domain can use the interface (xc_map_foreign_range) to map another domain's memory only if it meets one of the following conditions: 1) the mapping domain is the initial domain (e.g., Domain0) and the mapped domain is running in the normal mode; or 2) the mapping domain is not the initial domain (e.g., GuardDomainU) but is running in the trusted mode. As mentioned earlier, through the xc_map_foreign_range interface, Domain0 can map a physical memory area of a DomainU into its own virtual address space so that it can read and write that region. A lot of work [5] [13] is based on this interface, including XenAccess [11], but XenAccess can only run in Domain0. We utilize the modified interface and a small patch to Xen so that we can run XenAccess in the GuardDomainU.

C. Policy Semantic Interpreter

In VMGuard, we refer to the modified XenAccess as the policy interpreter. The policy interpreter obtains the physical-to-machine (P2M) table of C-Domain0 by accessing the modified xc_map_foreign_range interface, and calculates the starting machine address of the kernel page table in C-Domain0 by referring to the P2M table. Based on the measurement policy, the interpreter reads the "System.map" file to look up the starting virtual addresses of the measured memory areas in C-Domain0. With the P2M table and the kernel page table, it can look up the corresponding page table entry (PTE),
from which it obtains the starting machine page frame number (start_mfn) of the memory area in C-Domain0; it then registers this information into the policy table by calling the new hypercall with the "GUARDVM_register_policy" command.
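The following sketch illustrates the first step of this translation, resolving a measured symbol's kernel virtual address from C-Domain0's "System.map"; the file path, symbol, and region length are illustrative, and register_policy_entry() is a stub standing in for the walk through the P2M table and kernel page table followed by the GUARDVM_register_policy hypercall.

```c
/* Sketch: resolving a measured symbol's virtual address from System.map. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stub: the real interpreter converts the virtual address to start_mfn via the
 * P2M table and kernel page table, then issues GUARDVM_register_policy. */
static int register_policy_entry(uint64_t kernel_vaddr, uint64_t length)
{
    printf("register region: vaddr=0x%llx len=%llu\n",
           (unsigned long long)kernel_vaddr, (unsigned long long)length);
    return 0;
}

/* Returns the virtual address of `symbol` in the System.map file, or 0 if absent. */
static uint64_t lookup_symbol(const char *system_map_path, const char *symbol)
{
    FILE *fp = fopen(system_map_path, "r");
    if (!fp)
        return 0;

    char line[256];
    uint64_t addr = 0;
    while (fgets(line, sizeof(line), fp)) {
        unsigned long a;
        char type, name[128];
        /* System.map lines look like: "c0158f20 D sys_call_table" */
        if (sscanf(line, "%lx %c %127s", &a, &type, name) == 3 &&
            strcmp(name, symbol) == 0) {
            addr = a;
            break;
        }
    }
    fclose(fp);
    return addr;
}

int main(void)
{
    /* Example: measure the system call table (path and length are illustrative). */
    uint64_t vaddr = lookup_symbol("/etc/vmguard/C-Domain0.System.map", "sys_call_table");
    if (!vaddr) {
        fprintf(stderr, "symbol not found\n");
        return 1;
    }
    return register_policy_entry(vaddr, 4096);
}
```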
D. GuardDomain

Currently, Xen does not support a concept like kernel threads in Linux. However, there is an idle domain within the Xen space, which manages all of the idle VCPUs that are scheduled by the Xen scheduler. When an idle VCPU is scheduled, functions in the Xen space can run on that VCPU. Based on this idea, we implement GuardDomain in the Xen space with three features. First, its data structure is not linked into the global domain management list, so it is hidden from the outside. Second, it shares the page table (idle_pg_table) with Xen so that it can call all Xen functions. Last, it creates only one VCPU, which is similar to an idle VCPU, but with the same priority as a general VCPU so that it can activate the memory hash-metric engine in a timely manner.

E. Memory Hash-metric Engine

The memory hash-metric engine is implemented as a set of hash functions that calculate the digest of a machine memory area. Currently, we choose the MD5 hash algorithm to calculate the digest. When the engine is called by GuardDomain, it checks the real-time clock to determine whether to do the hash work.

F. Domain Page-table Manager

The measurement policy table contains items describing memory regions for which the engine needs to calculate hash values. These memory regions belong to C-Domain0, but the engine cannot access them directly from the Xen space. On the 32-bit x86 platform, the virtual address space of Xen in non-PAE mode covers only 64MB, which means that Xen can access at most 64MB of machine memory at one time. To access the machine memory belonging to the domains, we add a simple domain memory mapping module, the domain page-table manager.

We now illustrate how to access the kernel machine memory of C-Domain0. First, as mentioned in Section IV.C, we can obtain from the measurement policy table the PTE, or the starting address (start_mfn), of the kernel memory region in C-Domain0 together with its length. Second, we allocate one page of memory from the Xen heap space by calling the alloc_xenheap_page function. The purpose of allocating this page is to temporarily reuse the virtual address region occupied by it. Using the virt_to_mfn macro, we can translate the page's starting virtual address into its starting machine page frame number (mfn).
Figure 2. Architecture of the remote I/O binding technology
Meanwhile, by traversing the kernel page table (idle_pg_table) of Xen, we obtain the corresponding PTE of that page. Finally, the domain page-table manager temporarily replaces the PTE for the Xen heap page with the PTE for the kernel memory in C-Domain0 and flushes the TLB. The memory remapping is then complete, and from this point the engine can directly access the kernel memory in C-Domain0. If the memory area to be measured is larger than one page, we repeat the process. Due to space limitations, we do not describe how the original mapping is recovered.

G. Transparent Remote I/O Binding Channel

In Xen, the front-end virtual block device driver in a DomainU is the "blkfront" module and the back-end virtual block device driver is the "blktap" module. To establish the transparent I/O binding channel, we first extend the blktap module and add two new modules, remote-blkfront and remote-blkback, so that block I/O requests can be transferred over the network to the remote storage on the server machine. As shown in Figure 2, in C-Domain0 the user-space part (tapdisk) of blktap maps the I/O ring memory into user space through the mmap interface. In the new remote-blkfront module, blk_read calls the get_io_request function to obtain the I/O requests from the user-space I/O ring. On the S-Domain0 side, the new remote-blkback module, which maintains a socket connection with the remote-blkfront module, receives the I/O requests. The remote-blkback module reads or writes the image in O_DIRECT and O_LARGEFILE mode. According to the blkif_request fields in the I/O ring, remote-blkback can quickly locate the block data in the image to ensure I/O processing efficiency.
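To illustrate the forwarding step, the sketch below shows one way remote-blkfront could ship a request to remote-blkback over a TCP socket and how the server side could serve it against the image file; the wire format, helper names, and 512-byte sector size are illustrative assumptions rather than the prototype's actual code, and byte-order and error handling are simplified.

```c
/* Sketch: forwarding a block request from remote-blkfront (C-Domain0 user space)
 * to remote-blkback (S-Domain0) over TCP. Names and wire format are illustrative. */
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

struct remote_blk_req {              /* simplified wire format for one request */
    uint64_t sector;                 /* starting sector within the image       */
    uint32_t nr_bytes;               /* payload length in bytes                */
    uint8_t  write;                  /* 0 = read, 1 = write                    */
} __attribute__((packed));

/* remote-blkfront side: send one request (and its data, for writes). */
static int forward_request(int sock, const struct remote_blk_req *req, const void *data)
{
    if (send(sock, req, sizeof(*req), 0) != (ssize_t)sizeof(*req))
        return -1;
    if (req->write && send(sock, data, req->nr_bytes, 0) != (ssize_t)req->nr_bytes)
        return -1;
    return 0;
}

/* remote-blkback side: serve one request against the image file descriptor,
 * which would be opened with O_DIRECT | O_LARGEFILE on the real server. */
static int serve_request(int sock, int image_fd, void *buf, size_t buf_len)
{
    struct remote_blk_req req;
    if (recv(sock, &req, sizeof(req), MSG_WAITALL) != (ssize_t)sizeof(req))
        return -1;
    if (req.nr_bytes > buf_len)
        return -1;                                   /* larger than our buffer */

    off_t off = (off_t)req.sector * 512;             /* locate the block in the image */
    if (req.write) {
        if (recv(sock, buf, req.nr_bytes, MSG_WAITALL) != (ssize_t)req.nr_bytes)
            return -1;
        return pwrite(image_fd, buf, req.nr_bytes, off) == (ssize_t)req.nr_bytes ? 0 : -1;
    }
    if (pread(image_fd, buf, req.nr_bytes, off) != (ssize_t)req.nr_bytes)
        return -1;
    return send(sock, buf, req.nr_bytes, 0) == (ssize_t)req.nr_bytes ? 0 : -1;
}
```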
V. EVALUATION

Currently, we have implemented a virtual computing platform that provides an IaaS service, and VMGuard is one of its key components. In this test, the client runs on a machine with an Intel Xeon E5410 2.33GHz 8-core processor, and the server uses an AMD Athlon 2200 1.8GHz dual-core processor. The two machines are connected via a Gigabit Ethernet network.

Table I
REPRESENTATIVE KERNEL ROOTKITS FOR LINUX 2.6 AND DETECTION RESULTS
(the integrity verification report records whether each rootkit modifies the system call table and/or the kernel text)

rootkit name            kernel version    loading mode
adore-ng-0.56 (ported)  2.6.16            LKM
lvtes (ported)          2.6.3             LKM
mood-nt                 2.6.16            kmem
override (ported)       2.6.14            LKM
phalanx-b6 (ported)     2.6.14            mem
suckit2priv             2.6.x             kmem
hack_open               2.6.18            LKM
A. Functional Verification
Figure 3. The execution time (in seconds) of building the Linux kernel in DomainU under three different binding modes (Local: DomainU bound to a local image file; NFS: GuardDomainU bound to a remote image file via NFS; VMGuard: GuardDomainU bound to a remote image file through our new remote I/O binding technology), measured with GuardDomain not running, with a 1 s monitoring interval, and with a 1 min monitoring interval
The goal of the functional test is to verify whether VMGuard can detect, in a timely fashion, that the kernel in C-Domain0 has been compromised by an attacker. Currently, our system focuses on kernel-level rootkit attacks against the XenoLinux in C-Domain0. During the experiments, we found that many well-known rootkits (e.g., adore-ng 0.56, lvtes, override, phalanx-b6) cannot be compiled or installed on XenoLinux 2.6.18 without modifications. First, the "/dev/mem" driver's mmap interface is re-implemented in Xen; as a result, traditional mem-type rootkits (like phalanx-b6) cannot easily locate the memory area of the kernel. Second, most LKM-type rootkits rely on symbols exported by the kernel, but since Linux 2.6 some critical kernel symbols are no longer exported. Third, Xen is the only code running in ring 0, while the kernel of Domain0 runs in a less privileged ring (ring 1 in the case of x86_32); so, in XenoLinux's kernel space, traditional LKM-type rootkits cannot directly execute privileged operations such as setting control registers to bypass the memory protection. Therefore, we ported these rootkits to XenoLinux 2.6.18. For mem-type rootkits, we resort to C-Domain0's "System.map" to find the locations of the attacked memory areas [21]. For LKM-type rootkits, we need to bypass the memory protection: traditional kernel-level rootkits can directly clear the WP bit of CR0, whereas XenoLinux's kernel prevents this; however, there is a hypercall (HYPERVISOR_update_va_mapping) by which rootkits can make a read-only memory area writable. We also implement a typical kernel-level rootkit (called hack_open) for XenoLinux 2.6.18, which can modify the system call table and kernel memory areas.

Figure 5 shows the process of an attack against C-Domain0. First, the administrator selects the "active guarddomain" command to switch GuardDomainU into the trusted mode (Window B). Then, the management virtual machine (Domain0) is attacked and the attacker inserts the "lvtes.ko" module into the kernel of C-Domain0 (Window A). Immediately, VMGuard raises an alarm that the received hash value does not match (Window C). The representative kernel rootkits that can be detected by VMGuard are listed in Table I.
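For reference, the following is a minimal sketch of the kind of baseline comparison the integrity metric-value verifier performs on the server before raising such an alarm; the record layout and helper names are illustrative, not the prototype's actual code.

```c
/* Sketch: server-side verification of a measurement record against the baseline. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DIGEST_LEN 16                       /* MD5 digest size in bytes */

struct measurement_record {                 /* one entry read from the GuardDomainU image log */
    uint32_t policy_id;                     /* which measured region this digest covers */
    uint8_t  digest[DIGEST_LEN];
};

/* Stub standing in for a lookup in the metric-baseline database. */
static const uint8_t *baseline_digest(uint32_t policy_id)
{
    static const uint8_t known_good[DIGEST_LEN] = { 0 };   /* placeholder baseline */
    return policy_id == 0 ? known_good : NULL;
}

/* Returns 0 if the record matches the baseline; otherwise reports an alarm. */
static int verify_record(const struct measurement_record *rec)
{
    const uint8_t *expected = baseline_digest(rec->policy_id);
    if (!expected || memcmp(rec->digest, expected, DIGEST_LEN) != 0) {
        fprintf(stderr, "ALARM: integrity violation in measured region %u\n",
                (unsigned)rec->policy_id);
        return -1;
    }
    return 0;
}
```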
Figure 4. The execution time (in seconds) of three workloads (kernel-build, emacs, bzip2) running in GuardDomainU when GuardDomain is not running
B. Performance Evaluation

To assess the overhead of VMGuard, we compile the Linux 2.6.18 kernel in three different DomainUs and measure the total execution time using the "time" command. Kernel building is a typical I/O- and computation-intensive workload. We run this workload three times under two different modes, that is, with GuardDomain running and not running. The average results are shown in Figure 3. Intuitively, the performance of VMGuard is affected by the interval at which GuardDomain runs. When the interval is 1 second, the overhead is very high, which would negatively affect other DomainUs. However, when the interval is 1 minute or more, the performance overhead becomes reasonable, with a 7.6% performance degradation for other DomainUs.

To compare our new I/O binding technique in VMGuard with NFS, we run two additional typical application workloads in three different DomainUs when GuardDomain is not running. Opening a large file with Emacs and then immediately closing it is cache-sensitive, and decompressing the compressed Linux kernel source package with bzip2 is a computation-intensive workload.
Figure 5. Screenshot when the kernel in C-Domain0 is attacked (1: the administrator starts the monitoring work by activating GuardDomain; 2: the attacker installs the kernel rootkit (lvtes.ko) in C-Domain0; 3: the verification server detects the integrity violation and reports an error message)
Again, we run these workloads three times, and the average results are shown in Figure 4. For the I/O-intensive workload (i.e., kernel building), VMGuard improves performance by about 19% over NFS. We attribute this to the fact that our implementation takes advantage of the split device driver model in Xen, removing redundant operations in the software stacks.

VI. RELATED WORK

A. VM Monitoring

Garfinkel and Rosenblum [6] propose virtual machine introspection (VMI), in which an intrusion detection system (IDS) co-located on the host machine leverages a virtual machine to isolate the IDS. In VMwatcher [13], the IDS running in the Domain0 monitors the memory of a DomainU through the xc_map_foreign_range interface; it deduces process information in the DomainU from the task data structures and sends this information to anti-virus software. By observing hardware behaviors (e.g., CR3 changes, TLB flushes), Antfarm [12] can transparently obtain information about the processes running in a virtual machine. Lares [5] provides an active monitoring framework in which hooks are inserted into critical paths of the VM kernel. In contrast, VMGuard does not need to modify the kernel of the monitored virtual machine and, unlike the above works, makes a first attempt to monitor the Management VM.
B. Dynamic Measurement

To enforce the trustworthiness of the virtual computing environment, researchers have introduced a number of trust-enhancing mechanisms, many of which are based on the TCG's TPM technology [4] [7] [8] [9] [17]. However, TPM-based measurement can only ensure a VM's integrity at startup, and it does not protect applications within the VM from exploitation. In dynamic measurement and integrity verification, the most closely related project is Copilot [19], which provides a dynamic memory measurement mechanism by installing a memory monitoring co-processor on the motherboard. In Copilot, the hardware hashes the memory areas that contain the kernel text and other key components through direct memory access (DMA). VMGuard provides a similar dynamic memory measurement mechanism for VMs, but entirely in software.

C. Remote I/O Binding

Extensive research has been done on remote I/O in data centers [22] [23] [24]. Collective [22] provides a remote I/O binding technology by which a laptop can access the virtual storage resources in a cloud computing platform. Netchannel [23] provides a remote back-end I/O device driver for DomainU to facilitate VM migration. In VMGuard, an administrator can bind the monitoring VM (GuardDomainU) to the virtual storage resources located on the servers. Although traditional remote I/O technologies (e.g., NFS, NBD, iSCSI) are widely used in data centers, they are implemented at the virtual file system or block device driver layer of the operating system. In comparison, VMGuard utilizes the split device driver model in Xen.
VII. CONCLUSION

This paper proposes the design and implementation of VMGuard to monitor the integrity of the management virtual machines in a distributed environment. VMGuard introduces several unique features: real-time monitoring, centralized policy, ease of deployment, and high tamper resistance. To the best of our knowledge, it is the first attempt to monitor the integrity of Domain0 in a Xen-based virtual computing environment. In VMGuard, we introduce a new secure remote I/O binding technique that takes advantage of the split device driver model in Xen. As Xen has been widely adopted in cloud computing platforms, we believe that these trust-enhancing techniques will benefit both cloud providers and users. The evaluation shows that VMGuard is effective and that its performance overhead is acceptable.

ACKNOWLEDGMENT

This work was supported in part by the National High-Tech Research and Development Program (863) of China under grants 2009AA01Z141 and 2009AA01Z151, the National Science Foundation of China (NSFC) under grant 90718040, and the National Grand Fundamental Research Program (973) of China under grant No. 2007CB310805.

REFERENCES

[1] Barham, P., Dragovic, B., Fraser, K., et al.: Xen and the Art of Virtualization. In: Proc. of the 19th ACM Symp. on Operating Systems Principles, 2003, pp. 164-177
[2] Huizenga, G.: Cloud Computing: Coming out of the fog. In: Proc. of the Linux Symposium, 2008, Volume 1, pp. 197-210
[3] http://aws.amazon.com/ec2/
[4] Berger, S., Cáceres, R., et al.: vTPM: Virtualizing the Trusted Platform Module. In: Proc. of the 15th USENIX Security Symposium, 2006
[5] Payne, B.D., Carbone, M., Sharif, M., Lee, W.: Lares: An Architecture for Secure Active Monitoring Using Virtualization. In: Proc. of the IEEE Symposium on Security and Privacy, 2008, pp. 233-247
[6] Garfinkel, T., Rosenblum, M.: A Virtual Machine Introspection Based Architecture for Intrusion Detection. In: Proc. of the Network and Distributed Systems Security Symposium, 2003, pp. 191-206
[7] Garfinkel, T., Pfaff, B., et al.: Terra: A Virtual Machine-Based Platform for Trusted Computing. In: Proc. of the 19th ACM Symp. on Operating Systems Principles, 2003, pp. 193-206
[8] Sailer, R., Valdez, E., Jaeger, T., et al.: sHype: Secure Hypervisor Approach to Trusted Virtualized Systems. Techn. Rep. RC23511, Feb. 2005, IBM Research Division
[9] Berger, S., Cáceres, R., et al.: TVDc: Managing Security in the Trusted Virtual Datacenter. In: ACM SIGOPS Operating Systems Review, Volume 42, Issue 1 (2008), pp. 40-47
[10] Hay, B., Nance, K.: Forensics Examination of Volatile System Data Using Virtual Introspection. In: ACM SIGOPS Operating Systems Review, 2008, Vol. 42, Number 3, pp. 75-83
[11] Payne, B., Carbone, M., Lee, W.: Secure and Flexible Monitoring of Virtual Machines. In: Computer Security Applications Conference, 2007, pp. 385-397
[12] Jones, S.T., Arpaci-Dusseau, A.C., Arpaci-Dusseau, R.H.: Antfarm: Tracking Processes in a Virtual Machine Environment. In: Proc. of the USENIX Annual Technical Conference, 2006
[13] Jiang, X., Wang, X., Xu, D.: Stealthy Malware Detection Through VMM-Based Out-of-the-Box Semantic View Reconstruction. In: Proc. of the 14th ACM Conference on Computer and Communications Security, 2007, pp. 128-138
[14] http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2007-4993
[15] http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2008-3253
[16] Dan Goodin: Webhost hack wipes out data for 100,000 sites. http://www.theregister.co.uk/2009/06/08/webhost_attack
[17] Sailer, R., Zhang, X., Jaeger, T., van Doorn, L.: Design and Implementation of a TCG-based Integrity Measurement Architecture. In: Proc. of the 13th USENIX Security Symposium, 2004
[18] http://www.sans.org/reading_room/whitepapers/honors/linux-kernel-rootkits-protecting-systems_1500
[19] Petroni, N.L., Jr., Fraser, T., Molina, J., et al.: Copilot: a Coprocessor-based Kernel Runtime Integrity Monitor. In: Proc. of the 13th USENIX Security Symposium, 2004
[20] Wojtczuk, R.: Subverting the Xen Hypervisor. Black Hat USA, 2008
[21] Lineberry, A.: Malicious Code Injection via /dev/mem. Black Hat Europe, 2009
[22] Chandra, R., Zeldovich, N., Sapuntzakis, C., Lam, M.S.: The Collective: A Cache-Based System Management Architecture. In: Proc. of the 2nd Symposium on Networked Systems Design and Implementation, 2005, Vol. 2, pp. 259-272
[23] Kumar, S., Schwan, K.: Netchannel: A VMM-level Mechanism for Continuous, Transparent Device Access During VM Migration. In: Proc. of the Fourth ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, 2008
[24] Meyer, D.T., Aggarwal, G., et al.: Parallax: Virtual Disks for Virtual Machines. In: Proc. of the 3rd ACM SIGOPS/EuroSys European Conference on Computer Systems, 2008, pp. 41-54
[25] Sharif, M., Lee, W., Cui, W., et al.: Secure In-VM Monitoring Using Hardware Virtualization. In: Proc. of the 16th ACM Conference on Computer and Communications Security, 2009, pp. 477-487