Fine-grained, Dynamic User Customization of Operating Systems

Willy S. Liao, See-Mong Tan, Roy H. Campbell
Department of Computer Science
University of Illinois at Urbana-Champaign
Digital Computer Laboratory
1304 W. Springfield
Urbana, IL 61801
fliao,stan,[email protected]

Abstract

Application performance can be improved by customizing the operating system kernel at run time. Inserting application code directly into the kernel avoids the costly protection-domain switches required in traditional interprocess communications. Our design for a customizable operating system structures the kernel as a set of object-oriented frameworks. The user can then perform fine-grained customization by subclassing kernel classes and inserting objects into the kernel. User code is written in a safe, object-oriented language (Sun’s Java), which is interpreted or dynamically compiled in the kernel. Objects in the kernel, regardless of their origin, interact with each other seamlessly through ordinary object invocation. This extension technique has the advantage that a user can build directly on top of kernel frameworks using object invocation just as if the user were a system implementor, without compromising system safety.

1. Introduction

Operating systems are generally designed to make the common case fast, offering good performance for a “typical” range of application behavior. However, applications that do not exhibit common patterns of behavior often suffer. A classic example is the choice of file system caching policies. The policy of discarding the least recently used pages from the cache (the Least Recently Used or LRU policy) is appropriate for the file-access patterns shown by most applications, such as text editors. However, studies have shown that database applications consistently access files in a manner that LRU caches very poorly [3, 9]. There has thus been a recent trend towards dynamically customizable operating systems, in which system services can be customized by the user in an application-specific manner at runtime. In such a system, the database application could install a customized file system caching policy and improve performance.

The traditional mechanism for dynamic kernel customization is kernel-user interprocess communications (IPC). An example is user-level virtual-memory pagers in Mach [8]. The disadvantage of this approach is the cost of cross-domain interaction, either in synchronization of multiple processes or in process movement across domain boundaries.

An alternate approach that offers better performance is direct insertion of user code into the kernel domain, but safety then becomes a concern. Sandboxing, introduced by the Bridge OS [12], is a mechanism where user-supplied binary code is modified at load time to restrict memory references by that code to certain regions. Another method is to require the user to write all kernel extensions in a safe language whose compiler will guarantee correctness and safety. The SPIN [1] operating system allows user code written in a safe language (Modula-3 [7]) to be inserted into the kernel. Since kernel-user interaction is not needed on every event of interest, performance is improved. For example, SPIN’s performance with user-supplied kernel extensions is superior to Mach kernel-user IPC for customization.

A problem with this approach is that the kernel requires a trusted, safe compiler to generate the object code. Commercial vendors either must use a globally trusted compiler to build applications that extend the kernel, or they must provide source code to users so that the user can use a local, trusted compiler to build applications. The first idea leads to a key distribution problem, while the second is commercially infeasible.

Our solution to the problem of safe, dynamic kernel customization is to write user extensions in a safe, object-oriented language, letting a kernel-resident interpreter execute the extensions. The security problem is solved since the extensions need not be verified by an external trusted entity; the kernel interpreter is responsible for guarding against illegal activities. Dynamic code generation (“just-in-time compilation”) is used in the interpreter to let extensions achieve the speed of compiled code.

We then build on top of this mechanism with an object-oriented kernel whose internal classes and frameworks are visible to and extensible by the user. A key feature of our architecture is that user code does not need to be specially adapted to the extension mechanism, since the user’s extensions are structured as ordinary classes. From the user’s point of view, the user is simply customizing a class library that happens to encapsulate system resources and behavior.
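To make the caching example from the start of this section concrete, the following is a minimal sketch of a replacement policy expressed as an overridable method. The names PageCache, chooseVictim, and MruPageCache are hypothetical and are not taken from Choices; the sketch only illustrates why a repeated sequential scan slightly larger than the cache defeats LRU, while an application-chosen policy can still obtain hits.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical base class: a fixed-size page cache whose eviction
// choice is a method that subclasses may override.
class PageCache {
    // Front of the deque is the least recently used page.
    protected final Deque<Integer> pages = new ArrayDeque<>();
    private final int capacity;
    private int hits;

    PageCache(int capacity) { this.capacity = capacity; }

    // Default policy: evict the least recently used page.
    protected int chooseVictim() { return pages.peekFirst(); }

    final void access(int page) {
        if (pages.remove(Integer.valueOf(page))) {
            hits++;                                   // page was cached
        } else if (pages.size() == capacity) {
            pages.remove(Integer.valueOf(chooseVictim()));
        }
        pages.addLast(page);                          // now most recently used
    }

    int hits() { return hits; }
}

// Application-specific policy: evict the most recently used page,
// which suits repeated sequential scans larger than the cache.
class MruPageCache extends PageCache {
    MruPageCache(int capacity) { super(capacity); }
    @Override protected int chooseVictim() { return pages.peekLast(); }
}

class CachePolicyDemo {
    // Scan pages 0..pageCount-1 repeatedly and return the number of hits.
    static int scan(PageCache cache, int pageCount, int rounds) {
        for (int r = 0; r < rounds; r++)
            for (int p = 0; p < pageCount; p++) cache.access(p);
        return cache.hits();
    }

    public static void main(String[] args) {
        System.out.println("LRU hits: " + scan(new PageCache(3), 4, 5));
        System.out.println("MRU hits: " + scan(new MruPageCache(3), 4, 5));
    }
}
```

With a three-page cache and a four-page cyclic scan, the LRU policy always evicts exactly the page that will be needed next, so every access misses. The subclass changes only the victim selection and reuses everything else.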

2. The kernel as a set of user-extensible frameworks

Our design lets users extend kernel frameworks by safely inserting code directly into the kernel. A kernel-resident interpreter executes the user code, and this code accesses system resources in a safe fashion due to both language-level and system-provided safeguards. Architecturally, user code represents additional object components for use within the object-oriented frameworks that make up the kernel. Our operating system platform is the object-oriented microkernel operating system Choices [2].

2.1. The Java language

Sun’s Java language [5] is the kernel extension language in this design, for two reasons: it is object-oriented and it is safe. Java is an object-oriented language that strongly resembles C++ [10], and it possesses the necessary language attributes for building object-oriented frameworks. Java is also designed as a safe language, unlike C++, since it was envisioned that hosts would download Java code from untrusted sites for local execution. There are no user-visible pointers in Java, and memory is garbage-collected automatically instead of being explicitly freed by the user. No heavyweight mechanisms such as memory management hardware are used, which is very important because manipulating page tables and page table caches is quite expensive.

Java is compiled from its source to a machine-independent bytecode, which can then be interpreted by a kernel-resident interpreter. In our system, bytecode is the actual code that is passed into the kernel. There now exist just-in-time compilers for Java that convert bytecode into native code as the bytecode is interpreted, so Java’s performance can be comparable to that of compiled code.

There is one “implementation” detail of the Java language worth noting here. Unlike many other languages, the Java language is implemented on top of a precisely specified virtual machine [11]. The machine-independent bytecode described above is actually a sequence of instructions for this virtual machine. The virtual machine possesses primitive instructions for object manipulation and invocation. It was also designed with security considerations in mind so that incorrect or malicious bytecode can be detected and prevented from doing harm. For example, certain invariants must hold for each instruction of a bytecode sequence, and static analyses to verify these invariants can be performed before running the bytecode. The Sun interpreter used in our design verifies the bytecode to protect against malicious or incorrect bytecode [13]. It must also be pointed out that one can treat the Java language and the virtual machine as two separate layers; it is possible to change the Java language somewhat without even having to add instructions to the virtual machine.

2.2. Building an OS with frameworks

The object-oriented framework-based design of Choices is vital to our design. A framework is the specification of the interactions that are permitted between components and their relationships to each other [4]. The Choices OS is implemented as a collection of C++ frameworks, one for each major subsystem, such as process management, virtual memory and so forth. Customization proceeds by subclassing a framework class to produce a new specialized class that can be substituted in the framework wherever its parent may be used. Such fine-grained customization is a major advantage of object-oriented framework design. Rather than reimplement all the functionality necessary for a customized component, the designer can modify an existing component by reusing most of the old code and adding a small amount of new code.

Normally operating system frameworks are designed and implemented completely by the system designer. These frameworks may or may not be visible to the user; there may be a separate application interface. In our design the kernel class hierarchies are user-visible and can be extended at runtime. At boot time the kernel consists only of those classes in the system-provided core frameworks. Certain classes can be extended by the user, in that the user can create subclasses from them dynamically in Java. No object in the kernel knows whether an object it is invoking belongs to a system-supplied or a user-supplied class. This seamless integration of user-supplied and system-supplied classes provides an operating system kernel whose composition is the sum of basic objects and specialized, application-dependent objects. There is no requirement for a separate application interface, since the interface can be the frameworks themselves.

An extensible subsystem built on this principle is the Choices network protocol subsystem, which is based on the x-Kernel [6]. NetworkProtocol objects are stacked together to form protocol stacks. NetworkSession objects encapsulate communication endpoints or open connections. Applications use NetworkSessions to send and receive NetworkMessage objects. Users can insert new subclasses of these three classes into the kernel at runtime. These subclasses can then be used as if they were built into the kernel, and users can dynamically compose their own protocol stacks out of a mixture of system- and user-supplied protocols.
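The composition of system- and user-supplied protocols described above might be sketched as follows. Only the class names NetworkProtocol and NetworkMessage come from the text; the methods stackOn, encapsulate, and push, and the classes LoopbackDriver and LengthPrefixProtocol, are hypothetical and chosen purely for illustration. They do not reflect the actual Choices interfaces.

```java
import java.util.ArrayList;
import java.util.List;

// Message passed down the stack; headers are modeled as string prefixes.
class NetworkMessage {
    final String data;
    NetworkMessage(String data) { this.data = data; }
}

// Framework base class: each protocol transforms a message and hands
// the result to the protocol stacked beneath it (hypothetical interface).
abstract class NetworkProtocol {
    private NetworkProtocol below;

    final NetworkProtocol stackOn(NetworkProtocol lower) {
        below = lower;
        return this;
    }

    // Subclasses add their own header or transformation here.
    protected abstract NetworkMessage encapsulate(NetworkMessage m);

    final void push(NetworkMessage m) {
        NetworkMessage out = encapsulate(m);
        if (below != null) below.push(out);
    }
}

// Stands in for a system-supplied protocol at the bottom of the stack.
class LoopbackDriver extends NetworkProtocol {
    final List<String> wire = new ArrayList<>();

    @Override protected NetworkMessage encapsulate(NetworkMessage m) {
        wire.add(m.data);   // record what would be transmitted
        return m;
    }
}

// Stands in for a user-supplied protocol inserted at run time.
class LengthPrefixProtocol extends NetworkProtocol {
    @Override protected NetworkMessage encapsulate(NetworkMessage m) {
        return new NetworkMessage(m.data.length() + ":" + m.data);
    }
}

class ProtocolStackDemo {
    public static void main(String[] args) {
        LoopbackDriver driver = new LoopbackDriver();
        NetworkProtocol top = new LengthPrefixProtocol().stackOn(driver);
        top.push(new NetworkMessage("hello"));
        System.out.println(driver.wire);
    }
}
```

The framework code in push never inspects which subclass it is calling, which is the property the text relies on: a protocol's origin is invisible to the rest of the stack.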

3. Mechanisms for user-system integration

This section first discusses a class framework for letting Java and C++ objects interoperate. The discussion then turns to issues of protecting classes in the kernel from unauthorized use. Native Java methods (which are called from Java, but implemented in the native code of the platform) are used to implement these features. Finally, some implementation issues with the interpreter are discussed.

3.1. Java-C++ integration framework

Objects written in Java can interoperate easily with objects written in C++. Our design uses stub classes in both languages that are instantiated to act as proxy objects. For example, for a user-supplied Java object to subclass from or invoke a system-supplied C++ class A, there must exist a stub class for A in the Java language. The Java stub class for A holds a reference to a real object of class A. Any call on the Java stub class is converted into a C++ method invocation on the real object reference. A Java stub object is created and associated with each C++ object that needs to be accessible by the Java environment. The converse arrangement (C++ stub objects for real Java objects) lets the system-supplied C++ frameworks invoke objects that are inserted by the user at runtime. Figure 1 depicts the stub class concept.
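The forwarding behavior of a Java stub can be sketched as below. This is an illustration under strong assumptions: a real stub would forward through native methods to a C++ object, whereas here the native side is simulated by a Java table keyed by an opaque handle so that the example runs on its own. The names NativeSide, append, length, and UpperCasingClass are hypothetical; SystemClassStub and realObjectPointer follow the labels in Figure 1.

```java
import java.util.HashMap;
import java.util.Map;

// Simulated native side: a table of "real" objects keyed by opaque handle.
// In a real system these would be C++ objects reached via native methods.
final class NativeSide {
    private static final Map<Long, StringBuilder> objects = new HashMap<>();
    private static long nextHandle = 1;

    static long create() {
        objects.put(nextHandle, new StringBuilder());
        return nextHandle++;
    }

    static void append(long handle, String s) { objects.get(handle).append(s); }
    static int length(long handle) { return objects.get(handle).length(); }
}

// Java stub: holds only a reference to the real object and forwards
// every call to it; the stub keeps no state of its own.
class SystemClassStub {
    private final long realObjectPointer;

    SystemClassStub(long handle) { realObjectPointer = handle; }

    void append(String s) { NativeSide.append(realObjectPointer, s); }
    int length() { return NativeSide.length(realObjectPointer); }
}

// User-supplied class: specializes the system class by subclassing its stub.
class UpperCasingClass extends SystemClassStub {
    UpperCasingClass(long handle) { super(handle); }

    @Override void append(String s) { super.append(s.toUpperCase()); }
}

class StubDemo {
    public static void main(String[] args) {
        long handle = NativeSide.create();
        SystemClassStub plain = new SystemClassStub(handle);
        SystemClassStub custom = new UpperCasingClass(handle);
        plain.append("ab");
        custom.append("cd");
        System.out.println(plain.length());
    }
}
```

Both stubs refer to the same underlying object, so a change made through one is visible through the other; the user subclass alters behavior only at the point where it overrides a method.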

[Figure 1. Inter-language invocation through stub classes. C++ side: SystemClass (Method1 to Method4, stubObjectPointer) and UserClassStub (Method1 to Method4, realObjectPointer). Java side: SystemClassStub (Method1 to Method4, realObjectPointer) and UserClass (Method1 to Method4, stubObjectPointer). Objects attempting to access SystemClass from Java go through SystemClassStub; objects attempting to access UserClass from C++ go through UserClassStub.]

The overhead of the stub class mechanism is low. Preliminary measurements with the Sun Java interpreter 1.0 in the Choices kernel show that it takes 6 microseconds to invoke and return from a null Java method, starting out from native Choices kernel code on a SPARCStation 600MP. On the same machine, a UNIX null system call (getpid) takes 3 microseconds to complete. An upcall from the kernel to user space via the UNIX signal mechanism is far more expensive, at around 150 microseconds. By contrast, invocation of a native method by the Java interpreter has no additional overhead compared to the above cost of invoking an interpreted Java method. Thus these preliminary measurements show that two-way interaction between the Java and the native code environments is certainly not expensive, and is far cheaper than in UNIX due to Java’s low-cost upcalls.

Accompanying the stub classes are memory-access classes for converting pointer types and memory references in method calls, since Java has no pointers. The PointerTo and AddressRange Java classes encapsulate pointers and memory regions. The PointerTo class has pointers to different primitive data types such as integers and bytes. The AddressRange class has ReadOnly and ReadWrite variants; the former can only be read from and not written to. A stub class converts between these Java objects and their C++ equivalents whenever it forwards a method call. The user cannot alter the pointers to memory held by these objects, nor can the user forge these objects to gain access to arbitrary regions of memory. These memory-access classes effectively give the user’s Java code the same expressiveness as traditional C++ in manipulating raw memory, while preventing the user from corrupting arbitrary objects and regions of memory. We next discuss how this protection is accomplished.

3.2. Protection

Lightweight, language-level mechanisms are used to protect the system from malicious user-supplied code. The Java language provides a namespace mechanism for organizing classes via packages. Packages are organized on hierarchical lines, e.g. java.lang and java.util. The complete symbolic name of a class consists of the package name plus the class name. Certain methods and instance variables can only be accessed by other classes in the same package. Instances of the stub classes and of the PointerTo and AddressRange classes cannot be forged by users since their constructors are private to a system package. The non-public instance variables of these classes likewise cannot be manipulated. The Java interpreter prevents user code from circumventing class member access restrictions. User code is therefore limited by the system to using objects that the system has provided. In general, protected resources such as memory regions, I/O ports, and so on are encapsulated in restricted objects that can be used safely but not created or “damaged” by their clients.
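The idea of a restricted object that clients can use but not create might look like the following sketch of a memory-access class in the spirit of AddressRange from Section 3.1. Several details are assumptions made so the example fits in one file: the constructor is private to an enclosing Kernel class rather than to a system package, the read-only and read-write variants are modeled with a flag rather than separate classes, and a byte array stands in for raw memory. The names Kernel, grant, read, and write are hypothetical.

```java
// Hypothetical stand-in for trusted system code that owns memory.
final class Kernel {
    private final byte[] memory = new byte[64];   // stands in for raw memory

    // Restricted object: clients may use it, but only Kernel can create one
    // because the constructor is private to the enclosing class.
    static final class AddressRange {
        private final byte[] memory;
        private final int base;
        private final int length;
        private final boolean writable;

        private AddressRange(byte[] memory, int base, int length, boolean writable) {
            this.memory = memory;
            this.base = base;
            this.length = length;
            this.writable = writable;
        }

        byte read(int offset) { return memory[index(offset)]; }

        void write(int offset, byte value) {
            if (!writable) throw new SecurityException("read-only range");
            memory[index(offset)] = value;
        }

        // Every access is bounds-checked against the granted region.
        private int index(int offset) {
            if (offset < 0 || offset >= length)
                throw new IndexOutOfBoundsException("offset " + offset);
            return base + offset;
        }
    }

    // The system decides which regions to hand out and with what rights.
    AddressRange grant(int base, int length, boolean writable) {
        if (base < 0 || length < 0 || length > memory.length - base)
            throw new IllegalArgumentException("range outside memory");
        return new AddressRange(memory, base, length, writable);
    }
}

class AddressRangeDemo {
    public static void main(String[] args) {
        Kernel kernel = new Kernel();
        Kernel.AddressRange rw = kernel.grant(8, 4, true);
        Kernel.AddressRange ro = kernel.grant(8, 4, false);
        rw.write(0, (byte) 7);
        System.out.println(ro.read(0));
    }
}
```

Because the fields are private and final, a client holding a range cannot widen it or retarget it, and because the constructor is inaccessible, a client cannot fabricate a range over memory it was never granted.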

However, package-based protection has one hole: there is no control over who may add classes to a package in Java, which allows anyone to subvert security by naming their classes so that they belong to vital system packages. Our design extends the package system to provide full security in the face of dynamic extension by multiple users. Packages in our design can be protected in a hierarchical fashion by access lists, so that only certain principals can add classes to a package. For example, only the system may create new sub-packages or add classes to existing packages in the java.* hierarchy, but everyone is permitted to use the classes in any existing package in java.*. The system also handles naming conflicts (if two different users try to use the same name).

Finally, we wish to stress that these naming extensions do not require any modification to the Java interpreter or compiler. We use the Java ClassLoader mechanism, which allows the programmer to supply a ClassLoader object that controls the loading of other Java classes. Every class instantiated in the Java runtime has an associated ClassLoader object. All references from that class to any other class that has not been loaded yet are directed to the associated ClassLoader. Any subsequent class loaded by this ClassLoader will use the same ClassLoader to resolve its references recursively. Therefore, our kernel supplies a SecureClassLoader subclass that is used to load all user classes. This SecureClassLoader is written using only ordinary Java and the integration framework classes mentioned earlier. It implements access control for operations such as loading a class into a particular package. It also tags classes with user IDs so that there can exist more than one instance of a class with a given symbolic name (such as the ever-popular Foo).
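A hierarchical package access list of the kind described above could be sketched as follows. This is not the paper's SecureClassLoader; the names PackageAccessList, protect, mayAddClass, GuardedClassLoader, and defineUserClass are hypothetical, principals are modeled as plain strings, and the rule that packages without any entry are open to everyone is an assumption. The only library call relied on is the standard ClassLoader.defineClass(String, byte[], int, int).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical hierarchical access list: a rule on "java" also covers
// "java.lang", "java.util", and any sub-package without its own rule.
class PackageAccessList {
    private final Map<String, Set<String>> writers = new HashMap<>();

    void protect(String packagePrefix, Set<String> allowedPrincipals) {
        writers.put(packagePrefix, allowedPrincipals);
    }

    // Walk from the class's package up toward the root and apply the
    // nearest rule; packages with no rule are open (an assumption).
    boolean mayAddClass(String principal, String className) {
        int dot = className.lastIndexOf('.');
        String pkg = dot < 0 ? "" : className.substring(0, dot);
        while (!pkg.isEmpty()) {
            Set<String> allowed = writers.get(pkg);
            if (allowed != null) return allowed.contains(principal);
            dot = pkg.lastIndexOf('.');
            pkg = dot < 0 ? "" : pkg.substring(0, dot);
        }
        return true;
    }
}

// Loader that consults the access list before defining a class.
// One loader per principal keeps same-named classes of different
// users apart, since the JVM identifies a class by name and loader.
class GuardedClassLoader extends ClassLoader {
    private final PackageAccessList acl;
    private final String principal;

    GuardedClassLoader(PackageAccessList acl, String principal) {
        this.acl = acl;
        this.principal = principal;
    }

    Class<?> defineUserClass(String name, byte[] bytecode) {
        if (!acl.mayAddClass(principal, name))
            throw new SecurityException(principal + " may not add " + name);
        return defineClass(name, bytecode, 0, bytecode.length);
    }
}

class LoaderDemo {
    public static void main(String[] args) {
        PackageAccessList acl = new PackageAccessList();
        acl.protect("java", Collections.singleton("system"));
        System.out.println(acl.mayAddClass("alice", "java.util.Foo"));
        System.out.println(acl.mayAddClass("alice", "alice.net.Foo"));
    }
}
```

The check runs before any bytecode is examined, so a rejected request never reaches the class-definition step.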

3.3. Interpreter issues

One implementation issue is the space overhead required by this design. Java bytecode is quite small and compact, as it is geared towards space-efficiency. The framework classes and required stub classes do not consume much space in the runtime environment. The virtual machine implementation itself is not unduly large; the standalone SPARC Solaris interpreter 1.0.2 is less than 350 KBytes in size, including routines that are unnecessary in the embedded environment of an operating system kernel. We have no reason to believe at this point that runtime space requirements for user Java extensions will be significantly higher than for running native programs; garbage collection may even allow more space-efficient operation. Therefore we do not believe embedding a Java interpreter into an OS kernel will pose a large resource burden on the host machine.

Another issue concerns changes to the Java language and their effect on compatibility of user extensions with various versions of the OS. As noted earlier, the virtual machine and the Java language are two related, but separate, entities. The virtual machine is likely to change much more slowly, since it is a publicly documented specification intended for use by multiple parties. Any changes that do occur will likely be instruction set extensions rather than changes that break backward compatibility. Existing user extensions that are in the form of bytecode will thus continue to work. The Java language, and its class libraries, can be expected to change more rapidly, but this only affects the compilers that convert Java source to bytecode. The kernel’s virtual machine is indifferent to such changes. Therefore, from the user’s point of view, the only issue is (if Java source is available) whether to rewrite the extension code in a more efficient manner with new language features and recompile it, or whether to keep using the old compiled bytecode.

4. Conclusion

In our design the user becomes an auxiliary system implementor, specializing existing components and adding new ones to the kernel. The system can be customized at a fine grain with a high degree of code reuse through subclassing of kernel objects. Java and C++ can be integrated easily through the use of stub classes for inter-language invocation and special memory-access classes. Protection is provided by interpreter-enforced language-level features, augmented by mechanisms which enforce access restrictions to Java packages and handle name conflicts among classes. Our approach combines the safety of an interpreted language with the speed of a compiled language, and it avoids costly protection domain switches. It also does not require an interface between system and user code that is any different from that of ordinary object invocation. No complicated dispatching mechanisms are needed since stub classes transparently handle inter-language interaction.

A kernel architecture with user-extensible frameworks encourages a minimal kernel design. The system designer does not need to insert complicated frameworks with many specialized subclasses for dealing with rarely encountered and unusual situations, since the user can make subclasses as needed. The system designer can instead focus on the overall framework design, since this influences the nature of the customizations users can apply. It is up to the users to “fill in the blanks” with any specialized components they require.

Furthermore, users can also insert entirely new frameworks into the kernel if new subsystems are needed to support their applications. These new frameworks can in turn be used and customized by other applications, just as ordinary kernel-supplied frameworks can be. An example use for this ability is a user-level server that installs into the kernel a multimedia real-time file system that needs low-overhead interaction with kernel buffers and timers.

We are currently working on implementing our operating system design. A preliminary version with the dynamically extensible network protocol subsystem has been built. We plan to validate our ideas by extending the kernel dynamically with network protocols and testing their performance against traditional user-space implementations.

References

[1] B. N. Bershad, S. Savage, P. Pardyak, E. G. Sirer, M. E. Fiuczynski, D. Becker, C. Chambers, and S. Eggers. Extensibility, Safety and Performance in the SPIN Operating System. In Proceedings of the 15th Symposium on Operating System Principles, December 1995.
[2] R. H. Campbell and S.-M. Tan. Choices: An object-oriented multimedia operating system. In Fifth Workshop on Hot Topics in Operating Systems, Orcas Island, Washington, May 1995. IEEE Computer Society.
[3] H. Chou and D. Dewitt. An evaluation of buffer management strategies for relational database systems. In Proceedings of VLDB 85, pages 127–141, 1985.
[4] L. P. Deutsch. Design Reuse and Frameworks in the Smalltalk-80 Programming System. In T. J. Biggerstaff and A. J. Perlis, editors, Software Reusability, volume II, pages 55–71. ACM Press, 1989.
[5] J. Gosling and H. McGilton. The Java language environment: A white paper. Sun Microsystems, Mountain View, CA. http://www.sun.com, May 1995.
[6] N. Hutchinson and L. Peterson. The x-kernel: An architecture for implementing network protocols. IEEE Transactions on Software Engineering, 17(1):64–75, Jan. 1991.
[7] G. Nelson. System Programming in Modula-3. Prentice Hall, 1991.
[8] R. Rashid. Threads of a New System. UNIX Review, 1986.
[9] M. Stonebraker. Operating system support for database management. Communications of the ACM, 24(7):412–418, July 1981.
[10] B. Stroustrup. The C++ Programming Language. Addison-Wesley, Reading, Massachusetts, 1986.
[11] Sun Microsystems, Inc. The Java virtual machine specification. http://java.sun.com/doc/vmspec/html/vmspec-1.html, 1995.
[12] R. Wahbe, S. Lucco, T. Anderson, and S. Graham. Efficient software-based fault isolation. In Proceedings of the 14th Symposium on Operating Systems Principles, pages 203–216, Asheville, NC, 1993.
[13] F. Yellin. Low level security in Java. http://java.sun.com/sfaq/verifier.html, 1995.