
Temporal and Spatial Isolation in a Virtualization Layer for Multi-core Processor based Information Appliances

Tatsuo Nakajima, Yuki Kinebuchi, Hiromasa Shimada, Alexandre Courbot, Tsung-Han Lin
Department of Computer Science and Engineering, Waseda University
[email protected]

Abstract: A virtualization layer makes it possible to compose multiple functionalities on a multi-core processor with minimal modifications of OS kernels and applications. A multi-core processor is a good candidate for composing various software, independently developed for dedicated processors, onto one multi-core processor, reducing both hardware and development cost. In this paper, we present SPUMONE, a virtualization layer suitable for developing multi-core processor-based information appliances.

I. Introduction

Multi-core processors are being increasingly adopted for embedded systems because they improve performance, reduce power consumption and lower development cost. Composing multiple operating systems on a multi-core processor enhances the reusability of software when developing rich functional information appliances. Multiple OS environments also enable a product to use two versions of an operating system at the same time. In order to build multiple OS environments, a virtualization layer specialized for embedded systems is necessary¹, since most processors for embedded systems support only two protection levels and provide no hardware support for virtualization. In traditional approaches, an OS kernel runs at the user level to isolate the respective OS kernels and increase reliability, but this approach requires heavy modifications to the guest OSes when proper hardware virtualization support, which rarely exists in embedded systems, is missing. Therefore existing virtualization solutions are not preferred by the embedded system industry. In [1], Armand and Gien present several requirements for a virtualization layer to be suitable for embedded systems:

i. It should run an existing operating system and its supported applications in a virtualized environment, such that modifications required to the operating system are minimized (ideally none) and performance overhead is as low as possible.
ii. It should be straightforward to move from one version of an operating system to another; this is especially important to keep up with frequent Linux evolutions.
iii. It should reuse native device drivers from their existing execution environments with no modification.
iv. It should support existing legacy, often real-time, operating systems and their applications while guaranteeing their deterministic real-time behavior.

Our project is developing SPUMONE, a virtualization layer for multi-core processor-based embedded systems. SPUMONE assumes an SMP (Symmetric Multiprocessing) multi-core processor in which each core also has a core-local memory. Code and data can therefore be shared easily by all cores, while each core can keep some code and data private in its core-local memory. SPUMONE uses this characteristic to implement spatial isolation. SPUMONE satisfies the above requirements and currently focuses on achieving the following four goals:

i. Mapping virtual cores onto physical cores dynamically to balance the tradeoff among real-time constraints, performance and energy consumption.
ii. Reducing interrupt latency without degrading real-time performance on single-core and multi-core processors.
iii. Isolating the RTOS from the GPOS without executing guest OSes in user space.
iv. Detecting integrity violations in OS kernels, and repairing them by rebooting the kernels independently.

The rest of the paper is structured as follows. In Section II, we present why we need a virtualization layer. Section III shows the basic architecture of SPUMONE. Section IV describes how SPUMONE supports temporal isolation, and Section V how it supports spatial isolation. In Section VI, we present a mechanism to enhance the security and reliability of the guest OS. Finally, Section VII concludes the paper.

II. Why Virtualization

Fig. 1: Co-existing Multiple Operating Systems

¹ Our virtualization layer can be used for various embedded systems, but our current main focus is information appliances.


This section presents three advantages of using a virtualization layer in embedded systems. Fig. 1 shows the first advantage. Embedded systems usually include control processing such as mechanical motor control, wireless communication control or chemical control. Using software enables us to adopt more flexible control strategies, so recent advanced embedded systems contain microprocessors for implementing them. On the other hand, recent embedded systems also need to process various kinds of information to support better human decision making. Therefore, recent embedded systems contain both control and information processing functionalities. In traditional embedded systems, dedicated processors are assigned to the respective kinds of processing. A multi-core processor offers the possibility of combining these multiple kinds of processing on a single processor. Fig. 1 shows both control and information processing running on a virtualization layer that is executed on a single processor. This approach requires fewer processors and reduces the cost of embedded systems.

Fig. 2: Reusing Existing Software

Fig. 2 shows the second advantage: a virtualization layer makes it possible to reuse existing software. In the left figure, three pieces of independently developed software are integrated into a single system. The virtualization layer allows the software to be integrated with minimal modification. In the right figure, OS-independent services can continue to be used even when the OS personality that implements the user interface is changed, e.g. from Symbian to Android. The OS personality may need to be changed for various business reasons. If additional software developed by an embedded system company is implemented as OS-independent services, that software does not need to be ported to a new OS personality. Introducing a virtualization layer in embedded systems also offers further advantages. For example, proprietary device drivers can be mixed with GPL code without license violation, which solves various business issues when adopting Linux in embedded systems.

Fig. 3: Hiding Heterogeneous Hardware Transparently

The third advantage, shown in Fig. 3, is that a virtualization layer can hide heterogeneity in a processor architecture. In the left figure, the number of cores in a multi-core processor can be changed dynamically, but the guest OS is not aware of the change. In the right figure, a processor has several cores that offer the same instruction set, but the cores differ in power consumption and processing power. [4, 8] show that this approach is very effective in reducing the power consumption of the entire system. However, the approach requires modifying the OS kernel to manage heterogeneous resources. The virtualization layer makes it possible to hide the heterogeneity from the OS. Thus, an unmodified OS can be used on a heterogeneous multi-core processor.

III. Basic Architecture

A. User-Level Guest OS vs. Kernel-Level Guest OS

There are several traditional approaches to executing multiple operating systems on a single processor in order to compose multiple functionalities. Microkernels and virtual machine monitors execute guest OS kernels at the user level. When using microkernels, various privileged instructions, traps and interrupts in the OS kernel need to be virtualized by replacing their code. Also, since OS kernels are executed as user-level tasks, application tasks need to communicate with the operating system kernel via inter-process communication. Therefore, a significant amount of the operating system needs to be modified. Virtual machine monitors are another approach to executing multiple operating systems. If a processor offers hardware virtualization support, all instructions that need to be virtualized trigger traps to the virtual machine monitor, which makes it possible to use any OS without modification. But if the hardware virtualization support is incomplete, some instructions still need to be complemented by replacing some code to virtualize them. Most processors used for embedded systems have only two protection levels, and the MMU usually cannot be used to protect code running at the privileged level. So, when OS kernels are located at the privileged level, they are hard to isolate. On the other hand, if the OS kernels are located at the user level, the kernels need to be modified significantly. Most embedded system companies prefer not to modify a large amount of OS code, so it is desirable to put the kernels at the privileged level. Also, virtualizing the MMU incurs significant overhead if the virtualization is implemented in software. Therefore, we need alternative mechanisms to reduce the engineering cost, to ensure the reliability of the kernels and to exploit some advanced characteristics of multi-core processors. We believe that the following three issues are serious problems when a guest OS is implemented at the user level:

i. The user-level OS implementation requires heavy modification of the OS kernel.
ii. Emulating an interrupt disabling instruction is very expensive if the instruction cannot be replaced.
iii. Emulating a device access instruction is very expensive if the instruction cannot be replaced.

In a typical RTOS, both the kernel and the application code are executed in the same address space. Embedded systems have dramatically increased their functionalities with every new product. To reduce development cost, old versions of application code are reused and extended in an ad-hoc way, and the limitation of hardware resources is always the most important issue in reducing product cost. Therefore, the application code sometimes uses very ad-hoc programming styles. For example, application code running on an RTOS usually contains many privileged instructions, like interrupt disable/enable instructions, to minimize hardware resource usage.

Also, device drivers may be highly integrated into the application code. Thus, it is very hard to modify such application code to execute at the user level without changing a significant amount of it, even when its source code is available. Therefore, it is hard to execute the application code and the RTOS at the user level without violating the requirements described in Section I, and executing an RTOS in this way is very hard if the processor does not implement hardware virtualization support. Even with proper hardware virtualization support, we expect that the performance of the RTOS and its application code may be significantly degraded. Our approach therefore executes both the OS kernels and the virtualization layer at the same privileged level. This decision keeps the modification of the OS kernels minimal, and introducing the virtualization layer causes no performance degradation. However, the following two issues are serious in this approach:

i. Interrupt disable instructions have a serious impact on the interrupt latency of the RTOS.
ii. There is no spatial isolation mechanism among the OS kernels.

The first issue is serious because replacing interrupt disable instructions is very hard for an RTOS and its application code, as described above. The second issue is also a big problem because executing OS kernels in separate virtual address spaces requires significant modification of the kernels. SPUMONE offers two techniques, presented in Section IV and Section V, to overcome these problems.

B. SPUMONE: A Multi-core Processor based Virtualization Layer for Embedded Systems

Fig. 4: SPUMONE Basic Architecture

SPUMONE (Software Processing Unit, Multiplexing ONE into two or more) is a thin software layer for multiplexing a single physical CPU core into multiple virtual cores [7, 9]. The current target processor of SPUMONE is the SH4a architecture, which is very similar to the MIPS architecture and is adopted in various Japanese embedded system products. Standard Linux and various RTOSes currently support this processor. The current version of SPUMONE runs on single-core and multi-core SH4a chips. Currently, SMP Linux, Toppers (an RTOS that is widely used in Japan and is available as open-source software), and L4 run on SPUMONE as guest OSes.

The basic abstraction of SPUMONE is the virtual core (vcore). Virtualizing other physical hardware resources such as memory and devices is optional, and each embedded system can choose the most suitable virtualization configuration according to its own tradeoffs. Unlike typical microkernels or virtual machine monitors, SPUMONE itself and the OS kernels are executed at the privileged level, as mentioned in the previous section. Since SPUMONE provides an interface slightly different from that of the underlying processor, we slightly modify the source code of the OS kernels, a method known as para-virtualization. This means that some privileged instructions must be replaced with function calls that invoke the SPUMONE API, but the number of replacements is very small. Thus, it is very easy to port a new guest OS, or to upgrade the version of a guest OS, on SPUMONE.

For spatially isolating multiple operating systems, where necessary, SPUMONE can assume that the underlying processor supports mechanisms to protect the physical memory used by the respective operating systems, as in VIRTUS [5]. That approach may be suitable for enhancing the reliability of guest OSes on SPUMONE without significantly increasing overhead. In this paper, we instead propose a novel approach that uses a capability of the multi-core processor itself to realize isolation among guest OSes, without assuming additional hardware support for spatially isolating them. SPUMONE does not virtualize I/O devices, because traditional approaches incur significant overhead that most embedded systems cannot tolerate. In SPUMONE, since device drivers are implemented at the kernel level, they do not need to be modified as long as a device is not shared by multiple operating systems.
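To make the para-virtualization step concrete, the following sketch shows how a single privileged operation in a guest kernel, registering its vector table, might be replaced by a call into the virtualization layer. The function name spumone_register_vbr(), the configuration symbol and the inline assembly are illustrative assumptions, not code taken from the paper.

```c
/* Hypothetical sketch of a para-virtualized privileged operation.
 * spumone_register_vbr() and CONFIG_SPUMONE_GUEST are assumed names. */
#include <stdint.h>

extern void guest_vector_table(void);

/* Native guest: writes the vector base register directly (privileged). */
static inline void set_vbr_native(void *vbr)
{
    __asm__ volatile("ldc %0, vbr" : : "r"(vbr));
}

/* SPUMONE API stub: in a real build this would hand the guest's entry
 * point to the virtualization layer's interrupt manager. */
void spumone_register_vbr(void *vbr);

/* Para-virtualized guest: the privileged instruction becomes a call. */
static inline void set_vbr_virtualized(void *vbr)
{
    spumone_register_vbr(vbr);
}

void guest_trap_init(void)
{
#ifdef CONFIG_SPUMONE_GUEST
    set_vbr_virtualized((void *)guest_vector_table);
#else
    set_vbr_native((void *)guest_vector_table);
#endif
}
```

Only the small wrapper changes; the rest of the guest kernel is untouched, which is why the number of modified lines can stay small.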

B.1 Interrupt/Trap Delivery

Interrupt virtualization is a key feature of SPUMONE. Interrupts are intercepted by SPUMONE before they are delivered to each guest OS. When SPUMONE receives an interrupt, it looks up the interrupt destination table to decide to which OS the interrupt should be delivered. The destination virtual core is statically defined for each interrupt source when the OS kernels are built. Traps are also delivered to SPUMONE first, and are then directly forwarded to the currently executing OS. To let SPUMONE intercept interrupts, we modified the interrupt entry point of the OS kernels to point to SPUMONE's vector table; the entry point of each OS is made known to SPUMONE via a virtual instruction for registering its vector table. An interrupt is first examined by SPUMONE's interrupt handler, where the destination virtual core is determined and the corresponding scheduler is invoked.

When the interrupt triggers an OS switch, all the registers of the current OS are saved onto the stack, the registers previously saved for the destination OS are restored, and execution is switched to the entry point of the destination OS. The processor state is set up just as if the real interrupt had occurred, so the source code of the OS entry points does not need to be changed. Interrupt delivery on a multi-core platform is basically the same as on a single-core platform: each SPUMONE instance delivers interrupts to their destinations. On a multi-core system, however, virtual cores may migrate among physical cores. In order to deliver interrupts to a virtual core running on a different core, the assignment of interrupts to physical cores is switched along with the virtual core migration.
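The following is a minimal sketch, under assumed data structures and helper names, of the interrupt-delivery path just described: SPUMONE consults a per-source destination table, switches register state if the destination virtual core has a higher priority, and then re-delivers the interrupt to the guest's registered entry point.

```c
/* Illustrative sketch (not the actual SPUMONE source) of interrupt delivery. */
#include <stdint.h>
#include <stddef.h>

#define NR_IRQ_SOURCES 128

struct cpu_regs { uint32_t gpr[16]; uint32_t sr, pc; /* ... */ };

struct vcore {
    struct cpu_regs regs;            /* saved registers, incl. privileged ones */
    void (*entry)(uint32_t irq);     /* guest's registered vector entry point  */
    int priority;                    /* fixed priority: RTOS > GPOS            */
};

/* Destination per interrupt source, filled in when the guests are built. */
static struct vcore *irq_destination[NR_IRQ_SOURCES];
static struct vcore *current_vcore;

extern void save_regs(struct cpu_regs *r);              /* arch-specific */
extern void restore_regs(const struct cpu_regs *r);
extern void mark_irq_pending(struct vcore *v, uint32_t irq);

void spumone_handle_interrupt(uint32_t irq)
{
    struct vcore *dest = irq_destination[irq];
    if (dest == NULL)
        return;                                   /* unassigned source: drop */

    if (dest == current_vcore) {
        dest->entry(irq);                         /* re-deliver to the guest */
        return;
    }
    if (dest->priority > current_vcore->priority) {
        save_regs(&current_vcore->regs);          /* OS switch               */
        restore_regs(&dest->regs);
        current_vcore = dest;
        dest->entry(irq);                         /* as if vectored directly */
    } else {
        mark_irq_pending(dest, irq);              /* deliver when scheduled  */
    }
}
```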

B.2 Virtual Core Scheduling

Multiple OSes run by multiplexing a physical core. The execution states of the OSes are managed by data structures that we call virtual cores. When switching the execution of virtual cores, all the hardware registers are stored into the outgoing virtual core's register table and then restored from the table of the next virtual core to execute. The mechanism is similar to the process switching of a typical OS, but a virtual core saves the entire processor state, including the privileged control registers. Virtual cores are scheduled by fixed-priority preemptive scheduling. When the RTOS and the GPOS share the same physical core, the virtual core bound to the RTOS is given a higher priority than the virtual core bound to the GPOS in order to maintain the real-time responsiveness of the RTOS. This means that the GPOS is executed only when the virtual core of the RTOS is idle and has no real-time task to execute. Process scheduling is left to the OSes, so the scheduling model of each OS need not be changed. An idle RTOS resumes its execution when it receives an interrupt, and an interrupt for the RTOS preempts the GPOS immediately, even if the GPOS has disabled its interrupts. When virtual cores assigned to the GPOS are migrated to a shared core, those virtual cores are scheduled with a time-sharing scheduler.
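As a rough illustration of this fixed-priority policy, the sketch below picks the next virtual core to run on a physical core. The field names are assumptions, and the real scheduler also saves and restores privileged control registers as described above.

```c
/* Sketch of the fixed-priority virtual-core selection described in B.2. */
#include <stddef.h>

struct vcore {
    int priority;        /* RTOS vcore gets a higher value than GPOS vcore */
    int runnable;        /* cleared when the guest yields via its idle hint */
    struct vcore *next;
};

static struct vcore *vcore_list;   /* vcores bound to this physical core */

/* Pick the highest-priority runnable vcore; the GPOS vcore is chosen only
 * when the RTOS vcore is idle (not runnable). */
struct vcore *vcore_pick_next(void)
{
    struct vcore *best = NULL;
    for (struct vcore *v = vcore_list; v != NULL; v = v->next) {
        if (v->runnable && (best == NULL || v->priority > best->priority))
            best = v;
    }
    return best;   /* NULL means every guest is idle: halt the core */
}
```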

B.3 Inter-core Communication

Communication among the SPUMONE instances running on different physical cores is implemented with a shared memory area and the inter-core interrupt (ICI) mechanism. A sender first stores data in a specific memory area and then sends an interrupt to the receiver, which copies the data out of the shared memory.
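A possible shape of this mechanism is sketched below, assuming a single-slot mailbox per core in shared memory and a send_ici() primitive; the actual layout and primitives used by SPUMONE are not given in the paper.

```c
/* Sketch of shared-memory + inter-core-interrupt messaging (B.3). */
#include <stdint.h>
#include <string.h>

#define MAILBOX_SIZE 256

struct mailbox {
    volatile uint32_t len;           /* 0 means empty                       */
    uint8_t data[MAILBOX_SIZE];
};

/* One mailbox per destination core, placed in shared memory at boot. */
extern struct mailbox *mailbox_for_core(int core);
extern void send_ici(int core);      /* raise an inter-core interrupt       */

int icc_send(int dest_core, const void *buf, uint32_t len)
{
    struct mailbox *mb = mailbox_for_core(dest_core);
    if (len > MAILBOX_SIZE || mb->len != 0)
        return -1;                   /* busy: previous message not consumed */
    memcpy(mb->data, buf, len);
    mb->len = len;                   /* publish, then notify the receiver   */
    send_ici(dest_core);
    return 0;
}

/* Called from the receiving core's ICI handler. */
uint32_t icc_receive(int my_core, void *buf)
{
    struct mailbox *mb = mailbox_for_core(my_core);
    uint32_t len = mb->len;
    memcpy(buf, mb->data, len);
    mb->len = 0;                     /* mark the slot as consumed           */
    return len;
}
```

On a real SMP system a memory barrier would be needed between writing the payload and publishing the length; it is omitted here for brevity.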

B.4 Modifying OS Kernels

Each guest OS is modified to be aware of the existence of the other guest OSes, because hardware resources other than the processor are not multiplexed by SPUMONE; they are assigned exclusively to each OS by reconfiguring or modifying the OS kernels, as described below.

Interrupt Vector Table Register Instruction: The instruction that registers the address of the vector table is replaced so that the address is passed to SPUMONE's interrupt manager. Typically this instruction is invoked once during OS initialization.

Bootstrap: In addition to the features supported by the single-core SPUMONE, the multi-core version provides a virtual reset vector device, which is responsible for resetting the program counter of a virtual core that resides on a different physical core.

Physical Memory: A fixed-size physical memory area is assigned to each guest OS. The physical addresses used by the OSes can be changed simply by modifying their configuration files or source code. Virtualizing physical memory would increase the size of the virtualization layer and add substantial performance overhead. In addition, unlike a virtualization layer for enterprise systems, an embedded system only needs to support a fixed number of guest OSes. For these reasons we simply assign a fixed amount of physical memory to each guest OS.

Idle Instruction: On a real processor, the idle instruction suspends the processor until it receives an interrupt. In a virtualized environment, it is instead used to yield the physical core to another OS. We prevent the execution of this instruction by replacing it with a SPUMONE API call. Typically this instruction is located in a specific part of the kernel, which is fairly easy to find.

Peripheral Devices: Peripheral devices are assigned by SPUMONE to each OS exclusively. This is done by modifying the configuration of each OS so that the OSes do not share the same peripherals. We assume that most devices can be assigned exclusively to one OS. This assumption is reasonable because, in embedded systems, the guest OSes are usually assigned different functionalities and use different physical devices: the system typically consists of an RTOS and a GPOS, where the RTOS controls special-purpose peripherals such as a radio transmitter and digital signal processors, and the GPOS controls generic devices such as human interaction devices and storage devices. However, some devices cannot be assigned exclusively to one OS because both systems need them. For instance, the processor we used offers only one interrupt controller. A guest OS usually clears some of the controller's registers during its initialization; when running on SPUMONE, a guest OS that boots after the first one must be careful not to clear or overwrite the settings made by the OS that booted first. For example, we modified the Linux initialization code to preserve the settings made by Toppers.
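As an example of how small these modifications are, the sketch below shows how a guest's idle loop might be changed so that the processor's sleep instruction is replaced by a yield into the virtualization layer; spumone_vcore_yield() and the configuration symbol are assumed names.

```c
/* Sketch of the "Idle Instruction" modification from B.4. */
extern void spumone_vcore_yield(void);   /* give the physical core away      */

static inline void cpu_sleep_native(void)
{
    __asm__ volatile("sleep");           /* SH-4: wait for an interrupt      */
}

void guest_idle_loop(void)
{
    for (;;) {
#ifdef CONFIG_SPUMONE_GUEST
        spumone_vcore_yield();           /* lets a lower-priority guest run  */
#else
        cpu_sleep_native();
#endif
    }
}
```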

B.5 Dynamic Multi-core Management

SPUMONE for multi-core processors is designed in a distributed model similar to the multikernel approach [2]. A dedicated instance of SPUMONE is assigned to each physical core. This design is chosen in order to eliminate the unpredictable overhead of synchronization among multiple physical cores. In addition, the basic locking mechanism can be shared between the single-core and multi-core versions, which may simplify the design of SPUMONE. SPUMONE multiplexes multiple virtual cores on the physical cores, and the mapping between physical cores and virtual cores is changed dynamically to balance the tradeoffs among real-time constraints, performance and energy consumption. In SPUMONE, a virtual core can be migrated to another core according to the current situation. This approach offers several advantages, as we explain below.


Fig. 5: Dynamic Multi-core Processor Management

The first advantage is that the mapping between virtual cores and physical cores can be changed to reduce energy consumption. As shown in Fig. 5, assume that a processor offers two physical cores, Linux uses two virtual cores, and the RTOS uses one virtual core. When the utilization of the RTOS is high, the two virtual cores of Linux are mapped onto one physical core (top left). When the RTOS is stopped, each virtual core of Linux uses a different physical core (top right). When the utilization of the RTOS is low, one physical core is used by a virtual core of Linux and the other physical core is shared by Linux and the RTOS (bottom right). Finally, when it is necessary to reduce energy consumption, or when one of the physical cores has failed, all virtual cores run on the same physical core (bottom left). This approach enables us to use very aggressive policies to balance real-time constraints, performance and energy consumption.
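A policy of this kind could be expressed as a simple mapping decision, as in the hedged sketch below; the threshold and the function names are illustrative assumptions rather than values used by SPUMONE.

```c
/* Illustrative mapping policy for the Fig. 5 scenarios. */
enum mapping {
    LINUX_SHARES_ONE_CORE,     /* both Linux vcores packed on one core     */
    LINUX_USES_TWO_CORES,      /* RTOS stopped or idle: spread Linux out   */
    LINUX_AND_RTOS_SHARE,      /* low RTOS load: one core shared           */
    ALL_ON_ONE_CORE            /* power saving or a core has failed        */
};

enum mapping choose_mapping(int rtos_utilization_pct, int rtos_running,
                            int power_save, int alive_cores)
{
    if (power_save || alive_cores < 2)
        return ALL_ON_ONE_CORE;
    if (!rtos_running)
        return LINUX_USES_TWO_CORES;
    if (rtos_utilization_pct >= 50)            /* assumed threshold */
        return LINUX_SHARES_ONE_CORE;
    return LINUX_AND_RTOS_SHARE;
}
```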

C. Performance and Engineering Cost

Table 1 shows the time required to build the Linux kernel on native Linux and on the modified Linux executed on top of SPUMONE together with Toppers. Toppers only receives a timer interrupt every 1 ms and executes no other task. The result shows that SPUMONE and Toppers impose an overhead of 1.4% on Linux's performance. Note that this overhead includes the cycles consumed by Toppers. The result shows that the overhead of SPUMONE's virtualization on system throughput is sufficiently small.

Table 1. Linux kernel build time

Configuration        Time       Overhead
Linux only           68m5.9s    -
Linux and Toppers    69m3.1s    1.4%

We also evaluated the engineering cost of reusing an RTOS and a GPOS by comparing the number of modified lines of code (LoC) in each OS kernel. Table 2 shows the LoC added to and removed from the original Linux kernels. We did not count the LoC of the utility device drivers provided for communication between Linux and the RTOS, or between Linux and server processes, because their size depends on how many protocols they support and how complex those protocols are. The table also shows the modified LoC for RTLinux, RTAI and OK Linux, which are previous approaches to supporting multiple OS environments. Since we could not find RTLinux, RTAI or OK Linux for the SH4a processor architecture, we evaluated versions developed for the Intel architecture. OK Linux is a Linux kernel virtualized to run on the L4 microkernel; for OK Linux, we only counted the code added to the architecture-dependent directories arch/l4 and include/asm-l4. The results make it clear that our approach requires significantly smaller modifications to the Linux kernel, and show that SPUMONE's strategy of virtualizing only the processor succeeds in reducing the amount of modification of guest OSes and in satisfying the requirements described in Section I.

Table 2. Total number of modified LoC in *.c, *.S, *.h and Makefiles

OS (Linux version)          Added LoC    Removed LoC
Linux/SPUMONE (2.6.24.3)    161          8
RTLinux 3.2 (2.6.9)         2798         1131
RTAI 3.6.2 (2.6.19)         5920         163
OK Linux (2.6.24)           28149        -

IV. Temporal Isolation

As described in Section III, interrupt disable instructions are a serious problem if each guest OS kernel and its application code execute these instructions independently. In this section, we propose a novel technique to overcome the problem and show its effectiveness.

Fig. 6: Virtual Core Migration

The technique is based on virtual core migration. When we ported Toppers and Linux to SPUMONE, we found that some paths of the Linux kernel unexpectedly raised the interrupt priority level (IPL) to its highest value (e.g. during bootstrap and in the idle thread). This made us aware that badly written device drivers or kernel modules could likewise raise the IPL and interfere with the activity of Toppers. We therefore modified SPUMONE to proactively migrate a virtual core that is assigned to Linux and shares a physical core with Toppers to another physical core whenever it traps into the kernel or an interrupt is triggered, as shown in Fig. 6. In this way, only the user-level code of Linux is executed concurrently on the shared physical core, and user-level code never changes the priority levels. Therefore, Toppers can preempt Linux immediately without emulating or replacing the interrupt enabling/disabling instructions.
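The following sketch outlines this migrate-on-kernel-entry rule under assumed helper names: whenever a GPOS virtual core that shares a physical core with the RTOS takes a trap or interrupt, it is moved to a core that does not host the RTOS.

```c
/* Sketch of the Section IV technique; all names are illustrative. */
struct vcore;
extern int  vcore_is_gpos(struct vcore *v);
extern int  shares_core_with_rtos(struct vcore *v);
extern int  pick_core_without_rtos(void);        /* -1 if none available   */
extern void migrate_vcore(struct vcore *v, int dest_core);

/* Called by the virtualization layer on every trap or interrupt taken
 * while virtual core `v` is running. */
void on_guest_kernel_entry(struct vcore *v)
{
    if (vcore_is_gpos(v) && shares_core_with_rtos(v)) {
        int dest = pick_core_without_rtos();
        if (dest >= 0)
            migrate_vcore(v, dest);   /* Linux kernel code now runs elsewhere,
                                         so it can never raise the IPL on the
                                         core reserved for the RTOS */
    }
}
```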


We measured the effect of load on Linux on the dispatch latency of a periodic task in Toppers. The periodic task runs every 1 ms and is sampled 100,000 times during the measurement. The dispatch latency is the time from the interrupt trigger until the periodic task starts its execution. Only the periodic task is executed on Toppers, so no other real-time task on Toppers can delay it. Fig. 7 and Fig. 8 compare the distribution of the dispatch latency without and with the virtual core migration technique while Linux continuously invokes write() on an NFS-mounted file system. Without the virtual core migration technique, the maximum latency is 96 μs; with the technique enabled, the maximum latency is reduced to 39 μs.

Fig. 7: Dispatch latency on multi-core (NFS stress on Linux without the virtual core migration technique)

Fig. 8: Dispatch latency on multi-core (NFS stress on Linux with the virtual core migration technique)
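For reference, a measurement of this kind can be structured as in the sketch below; the counter-access functions are placeholders and this is not the Toppers code used in the experiment.

```c
/* Generic sketch of the dispatch-latency measurement: a 1 ms periodic task
 * records the time between the timer interrupt firing and the moment the
 * task actually starts running, and keeps the worst case. */
#include <stdint.h>

extern uint32_t read_cycle_counter(void);          /* free-running counter */
extern uint32_t cycles_to_us(uint32_t cycles);

static volatile uint32_t irq_timestamp;            /* written by the ISR   */
static uint32_t max_latency_us;

/* Timer interrupt handler: remember when the interrupt fired. */
void timer_isr(void)
{
    irq_timestamp = read_cycle_counter();
}

/* Body of the periodic task released by the timer ISR every 1 ms. */
void periodic_task_body(void)
{
    uint32_t latency = cycles_to_us(read_cycle_counter() - irq_timestamp);
    if (latency > max_latency_us)
        max_latency_us = latency;                  /* worst-case dispatch latency */
}
```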

We also measured the effect of the processor utilization of Toppers on Linux. We compared the score of the Dhrystone benchmark with Linux running on four dedicated cores (indicated as "4 cores" in Fig. 9), Linux running on three dedicated cores plus one core shared with Toppers under various workloads (the "xx%" bars), and Linux running on three dedicated cores (indicated as "3 cores"). The real-time task on Toppers is executed with a period of 10 ms, and the percentage is the ratio of the execution time of the periodic task (30% means that the real-time task executes for 3 ms continuously in each period).

Fig. 9: The effect of load on Toppers on Linux's DMIPS score (y-axis in DMIPS, larger is better)

Fig. 9 shows the total score of the Dhrystone benchmark. The bar at the left end shows the score with Linux executed on top of SPUMONE with three physical cores. As the workload of the periodic task grows, the Dhrystone score degrades; at a load of 90%, the result approaches or falls below the score of the three-dedicated-core configuration. The results show that the overhead of the virtual core migration technique is not significant for this benchmark.

V. Spatial Isolation

To isolate the RTOS from the GPOS in the same privileged address space, a naive solution is MMU-based memory isolation. However, this approach offers no mechanism to protect the page table and the page fault handler from a malicious OS kernel. We propose an alternative technique that uses the core-local memory. The approach is practical because future multi-core processors are expected to support core-local memories, and we do not assume extra hardware support for implementing the virtualization layer. A core-local memory is a programmable memory attached to each core of a multi-core processor; accessing it is faster than accessing the shared main memory. The original purpose of the core-local memory is to exploit the memory access locality of each thread to improve the scalability of parallel applications. Since the core-local memory is invisible to and inaccessible from other cores, we exploit this characteristic to provide a novel technique for protecting OS kernels without MMU-based address space separation.

A. Core-Local Memory

Let us assume that two OS kernels run on top of a dual-core processor in which each core has an independent core-local memory, as shown in Fig. 10. If the following assumptions are satisfied, an OS kernel is protected from the others:

i. The size of an OS kernel is small enough to fit in the core-local memory.
ii. Each core should be restricted from resetting another core, where such a reset would clean up the contents of the core-local memory.
iii. The boot image of an OS kernel should not be infected, and a secure boot loader can load the kernel image into the shared memory correctly.
iv. Each core should be restricted in accessing I/O devices: I/O devices that are managed by one core should not be accessed from other cores.

Fig. 10: Isolated core-local memory

B. Hash-based Integrity Management

The problem with the solution presented in the previous section is the size of the core-local memory, which is currently a few hundred KB, too small to hold a modern RTOS. In order to virtually extend the size of the core-local memory, we propose a hash-based integrity management mechanism assisted by the core-local memory protection. The original kernel image is stored in the shared main memory, and a subset of the kernel image is copied into the core-local memory before being executed by the core. Every time a part of the kernel image is loaded into the core-local memory, it is verified to make sure that it has not been corrupted or infected.

We present how the hash-based integrity management works in Fig. 11. The page allocation in the core-local memory and the calculation of cryptographic hash values are managed by the local memory (LMEM) manager, which resides permanently in the core-local memory. An OS kernel image that can be protected from other OS kernels is called a protected OS (pOS); an OS kernel that may be infected by malicious activities is called a vulnerable OS (vOS). The pOS and the vOS run on different cores.

i. The boot loader loads the LMEM manager into the core-local memory. The OS kernel images of the pOS and the vOS are loaded into the main memory at the same time. LMEM calculates the hash value of each page of the pOS and stores it in a hash table, also located in the core-local memory. The manager then loads the memory page that contains the entry point of the pOS into the core-local memory, after which the other core may start to execute the vOS.
ii. The pages of the pOS are mapped into a virtual address space, and the page table for managing the virtual address space should be in the core-local memory. When the size of the page table is larger than the core-local memory, LMEM can swap out unused page tables to the shared memory; LMEM also maintains a hash table for preserving the integrity of the swapped-out page tables. LMEM handles page faults when the page table does not contain a corresponding page table entry.
iii. When LMEM handles a page fault, the corresponding page is copied from the shared memory into the core-local memory. LMEM calculates the hash value of the page and compares it with the pre-calculated value stored in the core-local memory. A mismatch of the hash values means that the image of the pOS in the shared memory has been corrupted. If there is no mismatch, the page fault completes and the execution of the pOS is resumed.
iv. When there is no space available in the core-local memory, LMEM swaps out some pages to the shared memory. LMEM checks whether each such page has been updated and, if it has, recalculates the hash value of the page and updates the hash table entries. The freed pages are then used for loading other pages.

Fig. 11: Hash-based memory protection
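Steps iii and iv can be summarized in a short page-fault handler, sketched below with assumed helper functions and page size; the hash function itself is left abstract.

```c
/* Sketch of the LMEM page-fault path (steps iii and iv). */
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096

extern uint32_t page_hash(const void *page);         /* hash left abstract           */
extern void    *local_page_alloc(void);              /* NULL if local memory is full */
extern void     local_page_evict_one(void);          /* step iv: write back + rehash */
extern void     map_page(uintptr_t vaddr, void *local_page);
extern void     report_corruption(uintptr_t vaddr);  /* pOS image damaged by vOS     */

struct page_record {
    const void *shared_copy;    /* location of the page in shared main memory */
    uint32_t    expected_hash;  /* computed when the boot loader loaded pOS    */
};

extern struct page_record *lookup_page(uintptr_t fault_vaddr);

void lmem_handle_page_fault(uintptr_t fault_vaddr)
{
    struct page_record *rec = lookup_page(fault_vaddr);
    if (rec == NULL)
        return;                          /* not a pOS page: ignore            */

    void *dst = local_page_alloc();
    if (dst == NULL) {                   /* no space: swap a page out first   */
        local_page_evict_one();
        dst = local_page_alloc();
    }
    memcpy(dst, rec->shared_copy, PAGE_SIZE);
    if (page_hash(dst) != rec->expected_hash) {
        report_corruption(fault_vaddr);  /* restart pOS from a clean image    */
        return;
    }
    map_page(fault_vaddr, dst);          /* resume pOS at the faulting access */
}
```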


In this approach, the image of the pOS in the shared memory may still be corrupted by the vOS. Our current policy is to restart the pOS by having a secure loader reload a clean kernel image. We are also considering protecting the kernel image with memory error correction and encryption techniques.

VI. Monitoring Service and Recovery Issues

Spatial isolation by itself does not increase the security and reliability of each guest OS. Malicious stealth code can be injected into the Linux kernel and behave maliciously while hiding itself from virus detectors. The basic approach to this problem in SPUMONE is a monitoring service. The monitoring service periodically checks the integrity of several data structures in the OS kernel. The integrity is specified as constraints on each data structure. If the monitoring service detects a violation of the constraints, it invokes a recovery function, defined for each data structure, to restore its integrity. The repair procedure may not repair the system completely: some garbage may remain in the kernel space, or the repair procedure may even fail. In that case, the guest OS kernel is rebooted proactively. When the Linux kernel causes an error while executing, the error can be translated into a system call error or an application signal. Of course, this optimistic approach may leak some resources in the kernel; in that case, Linux is rebooted when the kernel becomes idle. This approach is very effective in some embedded systems. For example, while the user is using a mobile phone, the Linux kernel does not need to be rebooted immediately if some errors occur in the kernel; the kernel can instead be rebooted when the user puts the phone into a pocket.

We are also considering an alternative approach. When the monitoring service detects an anomaly, it saves the states of the application processes. Linux is then rebooted, and the states of the processes are reconstructed. The applications can continue to run even though the kernel has been restarted, which is similar to the checkpoint/recovery technique. Toppers and its applications can simply be rebooted when an anomaly is detected. The rebooting time can be improved by storing some important state in persistent memory, using a technique similar to the one presented in [6]. Usually, most applications on Toppers contain only a small amount of state, so rebooting the applications and Toppers is very fast, and the reboot does not affect the functionality of the embedded system. This approach improves user satisfaction dramatically because the user is not aware of the reboot.

In our approach, if the Linux kernel is attacked, the attacker could invade the other OS kernels if spatial isolation were not used. In order to attack other kernels, an attacker needs to insert code into the Linux kernel address space, and various traditional approaches can detect such modifications of the kernel easily. Recently, however, attacks tend to use kernel rootkits. Kernel rootkits try to conceal themselves so that security tools cannot find them; for example, some rootkits modify the kernel data structures that are used to manage processes. In our approach, the monitoring service checks the data structures that rootkits try to modify and repairs them, allowing security tools to detect the rootkits. Currently, the monitoring service checks some typical data structures that various rootkits are known to modify, and our experience shows that this approach can remove many well-known rootkits. We are also working on the synchronization mechanism between the monitoring service and the Linux kernel. Our approach uses optimistic synchronization, because we cannot modify the Linux kernel to avoid sharing data structures between the Linux kernel and the monitoring service [10].

Of course, the monitoring service itself should be protected from the Linux kernel. There are various mechanisms to protect it; for example, a co-processor or a special device can execute the monitoring service. The multi-core processor that we use for building embedded systems (SH4a) contains a core-local memory in each CPU core, and this core-local memory can only be accessed by its own CPU core. In our approach, a CPU core is dedicated to executing the monitoring service. Thus, the Linux kernel cannot access that core's local memory, but the CPU core executing the monitoring service can access all the memory used by the Linux kernel.
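A monitoring loop of this kind might look like the following sketch; the specific constraint and recovery functions are placeholders, since the actual checks target kernel data structures that known rootkits modify.

```c
/* Sketch of the Section VI monitoring service: a dedicated core periodically
 * reads guest-kernel data structures from shared memory, checks constraints
 * on them, and invokes a per-structure recovery function on violation. */
#include <stddef.h>

struct monitored {
    const char *name;
    int  (*check)(void);        /* returns 0 if the invariant holds */
    void (*repair)(void);       /* per-structure recovery function  */
};

extern int  tasklist_consistent(void);   /* e.g. a hidden-process check      */
extern void repair_tasklist(void);
extern void reboot_guest(void);          /* last resort: proactive reboot    */

static const struct monitored targets[] = {
    { "task list", tasklist_consistent, repair_tasklist },
};

void monitor_tick(void)
{
    for (size_t i = 0; i < sizeof(targets) / sizeof(targets[0]); i++) {
        if (targets[i].check() != 0) {
            targets[i].repair();
            if (targets[i].check() != 0)   /* repair failed or left garbage */
                reboot_guest();
        }
    }
}
```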

VII. Conclusion

SPUMONE can execute multiple operating systems without incurring a large amount of overhead or engineering cost. Most processors for embedded systems are not suitable for implementing a virtualization layer that offers complete isolation between guest OSes, because doing so incurs a large amount of overhead in the absence of hardware virtualization support. Since SPUMONE and the OS kernels run in the same privileged space, the possibility of kernel corruption increases, but our approach offers novel isolation mechanisms among guest OSes at low overhead and engineering cost. We are currently enhancing our implementation to support various policies that consider the tradeoffs among power consumption, performance and timing constraints. The monitoring service will also be enhanced in the near future; in particular, we are interested in using Daikon [3] to detect invariants inside the kernel automatically.

References

[1] F. Armand and M. Gien, "A Practical Look at Micro-Kernels and Virtual Machine Monitors", in Proceedings of the IEEE 6th Consumer Communications and Networking Conference, 2009.
[2] A. Baumann et al., "The Multikernel: A New OS Architecture for Scalable Multicore Systems", in Proceedings of the 22nd ACM Symposium on Operating Systems Principles, 2009.
[3] M. D. Ernst, J. H. Perkins, P. J. Guo, S. McCamant, C. Pacheco, M. S. Tschantz, and C. Xiao, "The Daikon System for Dynamic Detection of Likely Invariants", Science of Computer Programming, Vol. 69, No. 1-3, Dec. 2007, pp. 35-45.
[4] A. Fedorova, J. C. Saez, D. Shelepov, and M. Prieto, "Maximizing Power Efficiency with Asymmetric Multicore Systems", Communications of the ACM, Vol. 52, No. 12, pp. 48-57, 2009.
[5] H. Inoue, J. Sakai, and M. Edahiro, "Processor Virtualization for Secure Mobile Terminals", ACM Transactions on Design Automation of Electronic Systems, Vol. 13, No. 3, 2008.
[6] H. Ishikawa, A. Courbot, and T. Nakajima, "A Framework for Self-Healing Device Drivers", in Proceedings of the Second IEEE International Conference on Self-Adaptive and Self-Organizing Systems, pp. 277-286, 2008.
[7] Y. Kinebuchi, T. Morita, K. Makijima, M. Sugaya, and T. Nakajima, "Constructing a Multi-OS Platform with Minimal Engineering Cost", in Analysis, Architectures and Modeling of Embedded Systems, Springer, 2009.
[8] R. Kumar et al., "Single-ISA Heterogeneous Multi-Core Architectures for Multithreaded Workload Performance", in Proceedings of the 31st International Symposium on Computer Architecture, 2004.
[9] T. Nakajima, Y. Kinebuchi, A. Courbot, H. Shimada, T-H. Lin, and H. Mitake, "Composition Kernel: A Multi-core Processor Virtualization Layer for Rich Functional Smart Products", in Proceedings of the 8th IFIP Workshop on Software Technologies for Future Embedded and Ubiquitous Systems, 2010.
[10] H. Shimada, A. Courbot, Y. Kinebuchi, and T. Nakajima, "A Lightweight Monitoring Service for Multi-Core Embedded Systems", in Proceedings of the 13th Symposium on Object-Oriented Real-Time Distributed Computing, 2010.
