
Evolving Operating Systems and Architectures: How Do Kernel Implementors Catch Up?

Calton Pu
Department of Computer Science
Columbia University
New York, NY 10027
[email protected]

1 Introduction

Computer architectures undergo evolutionary changes as hardware technology advances. For example, the Intel 80x86 family of microprocessors has evolved from x = 1 to x = 4 and onward. Although each of these microprocessors is "upward-compatible" with its predecessor, a naive port of the operating system (OS) kernel implemented for the predecessor does not run well on the successor. Even though many of the differences are "transparent" to the programmer, e.g., the different bus widths and on-chip cache sizes, they have a profound impact on the performance of different kernel routines. Typically, the OS kernel is rewritten for the new member of the family, despite the instruction set compatibility.

In addition, computer architectures sometimes have intentionally very distinct external interfaces and internal designs. For example, there are several competing architectures based on RISC technology, such as SPARC, HP-PA, and MIPS. These architectures have different instruction sets. Designing and implementing a portable OS kernel that runs on different families of RISC microprocessors is a serious challenge. A relatively successful approach is the micro-kernel, for example Mach and Chorus. However, attempts to port micro-kernels have shown the problems caused by hidden assumptions in the micro-kernel design [8].

The evolution of operating systems is similar to that of computer architectures. Both the OS interface (kernel calls) and the OS implementation evolve with advances in either hardware or software technology. One possible solution to the OS interface change is the emulation of either the new or the old interface [1]. However, this approach usually introduces significant performance penalties. More importantly, changes in the OS implementation are very difficult to propagate to the kernels running on other architectures.

As the architectural evolution accelerates, we will see faster changes in both architecture and OS kernel implementation. Consequently, every new generation of OS kernel implementation will have a shorter useful lifespan than before. In fact, it is plausible that we will not have enough time to refine and optimize an OS kernel before a particular processor becomes obsolete. The trade-off between code portability and efficiency is made ever more pressing by the fast-paced RISC architectural evolution.

An important question is, therefore: are the OS kernel implementors locked into the catch-up game? Can we ever refine an OS kernel implementation quickly enough to make it useful? Are the new architectures always doomed to running kernels with performance bugs?


2 Architecture Interface Specification

The canonical interface specification of a computer architecture is the instruction set. For application programmers, maintaining instruction set compatibility is enough, since the same application will run on "the same" architecture. However, for OS kernel implementors, many other features are important. For example, versions of MS-DOS written for the 80286 could run on new machines using the 80386. However, failing to take advantage of new features makes the new systems run almost as slowly as those based on the 80286. In most cases, only recoding the kernel allows these new features to improve system performance as intended.

Another example is the on-chip cache size. Although the cache size introduces very important optimization considerations, it is usually considered an encapsulated performance feature, something that programmers "should not have to be concerned with". As any OS kernel implementor knows, careful management of the on-chip cache (as well as other silicon resources such as registers) is crucial to good system performance. However, unlike the registers, the programmer typically has only indirect control over the cache content. Making sure that "the right thing" happens requires non-trivial programming tricks. Since most of these tricks are specific to a cache size and architecture, when something changes, the code's performance degrades considerably (the sketch at the end of this section illustrates the problem).

A third example is the performance of real-time executives. Besides the requirement of low overhead, real-time executives must also carry predictable overhead for each system service. This is important for real-time applications, since they depend on predictable system services to produce predictable job schedules. However, unless all the system code consists of straight-line programs, the system behavior depends on many parameters, including the instruction execution times, cache size, and bus width. The verification and debugging of a real-time system is still considered a black art, primarily based on exhaustive testing. A paradoxical result is that once a real-time system "works", all the system components must be frozen. It is considered unwise, for example, to run the same program on a faster, compatible CPU, due to the uncertainty introduced by the many system parameter changes.

Some real-time programmers try to minimize the uncertainty by minimizing the system resources they use. For example, a normal practice in real-time computing is to turn off caching completely. This way, the system behavior is immune to subtle run-time differences due to cache management. Unfortunately, this does not solve the problem of architectural evolution, since the other system parameter changes remain unmanageable. For example, a faster CPU is typically not faster at the same rate across the board, i.e., some instructions of the new CPU speed up more over the old (say, 2 times) than others (say, 1.5 times). Small differences like these may introduce enough uncertainty into the system to require a completely new verification and debugging process.

These examples show that if we relied on traditional programming techniques alone, an OS kernel implementor would fall behind quickly in the race against architecture evolution. Large computer companies can afford to invest in teams of programmers to keep refining the OS implementation. Research prototypes gradually fall behind unless the projects are continuously and generously funded.
The V system [2], for example, has been ported to some new architectures but no longer claims to contain the fastest RPC implementation.
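
To make the cache discussion concrete, the following is a minimal sketch (our illustration, not code from any of the systems cited here) of a cache-size-specific programming trick in C. The cache size and blocking factor are assumptions baked into the code; on a successor chip with a different on-chip cache, the constant is simply wrong and performance degrades, even though the program remains instruction-set compatible.

    #include <stddef.h>

    /* Assumed size of the on-chip data cache: a hypothetical value
     * that is correct for exactly one member of a processor family. */
    #define ON_CHIP_CACHE 8192
    #define BLOCK (ON_CHIP_CACHE / 2)   /* leave room for other data */

    /* Two passes over the same data.  Blocking keeps each chunk
     * resident in the cache between the passes; with the wrong BLOCK
     * the second pass misses in the cache on every line. */
    void transform(unsigned char *buf, size_t len)
    {
        for (size_t base = 0; base < len; base += BLOCK) {
            size_t end = base + BLOCK < len ? base + BLOCK : len;
            for (size_t i = base; i < end; i++)   /* pass 1 */
                buf[i] ^= 0x55;
            for (size_t i = base; i < end; i++)   /* pass 2, still cached */
                buf[i] = (unsigned char)(buf[i] + 1);
        }
    }

Note that nothing in the instruction set records the dependence on ON_CHIP_CACHE; the assumption lives only in the programmer's head and in the constant, which is precisely why such code ports poorly.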


3 Some Techniques That May Be Helpful

Although the trade-off between portability and efficiency appears inescapable, it is so only in the context of traditional programming techniques. During the development of the Synthesis operating system kernel [7, 3, 5], we have explored some ideas that may lead to a way out of this terrible trade-off. These ideas are by no means sufficient, but they offer a glimmer of hope.

First, Synthesis uses run-time code generation in the kernel for performance. The idea is to optimize the code executed by kernel calls using the information available at run time. For example, we generate specific code for a thread to read a particular file. This specialization and state caching allows very short critical paths in kernel calls that are not achievable without run-time information (the first sketch at the end of this section illustrates the idea). Besides performance, the kernel code generation also adds flexibility to the kernel implementation. For example, the code generator can be made simple and portable, as long as it generates very efficient code for each situation.

Second, the Synthesis kernel is designed and implemented in a modular fashion, with components called quajects [6]. Each quaject encapsulates a resource and contains the operations on that resource. This application of object orientation in operating systems keeps the Synthesis kernel organization manageable as more facilities have been added to it. The important observation is that the quaject organization did not introduce large overhead due to the boundaries between quajects. When a costly execution path is found in the kernel, we always find a way to generate a new, optimized quaject (thus without boundaries) that improves on that execution path.

Third, Synthesis uses fine-grain scheduling based on software feedback [4]. This is an adaptive and decentralized scheduling technique that dynamically adjusts the thread priorities (and time slices) to achieve scheduling goals such as real-time constraints (the second sketch at the end of this section shows the skeleton of such a feedback element). If the Synthesis code is moved to a faster machine, for example, fine-grain scheduling will make the necessary adjustments to ensure the deadlines are met. This is also very useful in a heterogeneous environment, where components from different architectures that use different operating systems form a federated computer system.

The ideas described above have made the current Synthesis code efficient and somewhat portable. But several serious problems remain. For example, the run-time code generation in Synthesis is written in assembly language by hand. We need a higher-level programming language to capture the abstraction of incremental code generation. The kernel programmer needs a more powerful way to express the several steps in the incremental code generation process. Another example is the hidden architecture parameters such as cache size. Because they have not been made explicit, assumptions about the cache size permeate the hand-written assembly code. Again, we need high-level programming language support to specify the quantity and quality of these critical system resources. The kernel programmer would then be able to write more abstract code that is less dependent on particular values of these resources.
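
The real Synthesis code generator emits machine code at run time and is written in assembly; the C sketch below is only a portable analogue of the specialization and state-caching idea, with the "code generation" reduced to precomputing per-open state and installing the cheapest routine that is correct for the situation. All names and the quaject-like structure layout are invented for the illustration.

    #include <stddef.h>
    #include <string.h>

    /* Hypothetical per-open state: a quaject-like object bundling a
     * resource with the operations on it. */
    struct open_file {
        char  *data;    /* file contents, already resident in memory */
        size_t size;
        size_t offset;  /* state cached across read() calls          */
        size_t (*read)(struct open_file *, char *, size_t);
    };

    /* Specialized fast path for a fully resident file: no device
     * code, no layout checks on the critical path. */
    static size_t read_resident(struct open_file *f, char *buf, size_t n)
    {
        size_t left = f->size - f->offset;
        if (n > left)
            n = left;
        memcpy(buf, f->data + f->offset, n);
        f->offset += n;
        return n;
    }

    /* At open time, examine the run-time situation and install the
     * cheapest correct routine.  Synthesis goes further and emits
     * custom machine code here; this sketch only selects a routine. */
    void specialize_open(struct open_file *f, int fully_resident)
    {
        if (fully_resident)
            f->read = read_resident;
        /* else: install a general path handling devices, holes, ... */
    }

The second sketch shows the skeleton of a software feedback element for fine-grain scheduling. The proportional adjustment, the gain, and the bounds are invented for the illustration; the filter elements actually used in Synthesis [4] are more elaborate.

    /* Compare a thread's measured progress rate against the rate its
     * deadline requires, and adjust its time slice accordingly. */
    struct feedback {
        double target_rate;  /* work units per second needed        */
        double quantum_us;   /* current time slice, in microseconds */
    };

    void fb_adjust(struct feedback *fb, double measured_rate)
    {
        double gain = 0.5;   /* invented gain for the sketch */
        double err  = fb->target_rate - measured_rate;

        fb->quantum_us += gain * err;   /* behind schedule: run longer */
        if (fb->quantum_us < 100.0)
            fb->quantum_us = 100.0;     /* invented bounds */
        if (fb->quantum_us > 10000.0)
            fb->quantum_us = 10000.0;
    }

Moving to a faster machine raises measured_rate, so the loop shrinks the quantum on its own; this is the mechanism behind the claim above that deadlines continue to be met after a port.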

4 Object Orientation In Operating Systems

Object orientation can help relieve some of the tensions in the trade-off between portability and specialization. For example, explicit interface specifications of both the architecture and the OS help reveal the performance and portability problems due to hidden assumptions. From the architecture side, we have seen in Section 2 that the instruction set is not sufficient as the architectural interface when considered from the OS kernel perspective. From the OS side, Walpole et al. [8] have shown that even well-designed micro-kernel interfaces such as Chorus carry serious hidden assumptions about the underlying architecture. A sketch of what one small piece of an explicit architecture interface might look like follows.
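
The following is a minimal sketch (our invention, not a proposal from Synthesis or the cited work) of such an explicit interface: an architecture descriptor that exports parameters the instruction set hides but kernel code actually depends on. Kernel routines query the descriptor instead of hard-coding the values, so porting to a successor chip means filling in a new descriptor rather than hunting down buried assumptions.

    /* Hypothetical explicit architecture descriptor. */
    struct arch_desc {
        unsigned icache_bytes;      /* on-chip instruction cache size */
        unsigned dcache_bytes;      /* on-chip data cache size        */
        unsigned cache_line_bytes;  /* line size, for alignment       */
        unsigned bus_width_bits;    /* memory bus width               */
    };

    /* Illustrative values for one (invented) family member. */
    static const struct arch_desc arch_example = {
        8192, 8192, 16, 32
    };

    /* Example use: derive the blocking factor of the Section 2 sketch
     * from the descriptor instead of a compile-time constant. */
    static inline unsigned block_bytes(const struct arch_desc *a)
    {
        return a->dcache_bytes / 2;
    }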

The hiding of implementation details, an important part of object orientation, may also become part of the problem as the architecture evolves. Consider the instruction set of the IBM /360: the different architectures of the /360 family are supposed to run the same instruction set at different performance levels. In other words, the software programs are defined by the instruction set they use, and the programs are assumed to be indifferent to other system parameters. As microprocessors evolved, OS kernels became quite sensitive to many other architectural parameters, such as the bus width and cache size. These are system components that were not on chip at the time the instruction set was proposed as the encapsulating interface to architectures. Whether these components are implementation details that should be hidden or important system functional characteristics that should be revealed is a debate beyond this position paper. In either case, the problem resides in the judicious selection of what to hide and what to reveal, a decision that depends on the particular architecture.

New system resources that were previously unavailable provide a clear challenge to OS kernel implementors. In order to achieve higher performance, the kernel must take these new resources into account. Similarly, system resources that have been logically hidden in the past may need to be reconsidered when new architectures augment their size and availability (e.g., large on-chip caches) and increase their impact on OS kernel design and implementation.

In summary, this position paper does not present a ready-made solution to a specific problem. Rather, we describe the problem of maintaining the OS implementation through the many inevitable changes in the architecture and the OS interfaces. This problem is aggravated by the current fast changes in both the architecture and the OS interfaces. On the positive side, we outline a possible approach based on the object-oriented methodology used in the Synthesis project. The development of a more mature methodology is the topic of active research.

References

[1] Usenix Association, editor. Proceedings of the Usenix Workshop on Micro-Kernels and Other Kernel Architectures, Seattle, April 1992.

[2] D. Cheriton. The V distributed system. Communications of the ACM, 31(3):314-333, March 1988.

[3] H. Massalin and C. Pu. Threads and input/output in the Synthesis kernel. In Proceedings of the Twelfth Symposium on Operating Systems Principles, pages 191-201, Arizona, December 1989.

[4] H. Massalin and C. Pu. Fine-grain adaptive scheduling using feedback. Computing Systems, 3(1):139-173, Winter 1990. Special issue on selected papers from the Workshop on Experiences in Building Distributed Systems, Florida, October 1989.

[5] H. Massalin and C. Pu. Reimplementing the Synthesis kernel. In Proceedings of the Usenix Workshop on Micro-Kernels and Other Kernel Architectures, Seattle, April 1992.

[6] C. Pu and H. Massalin. Quaject composition in the Synthesis kernel. In Proceedings of the International Workshop on Object Orientation in Operating Systems, Palo Alto, October 1991. IEEE Computer Society.

[7] C. Pu, H. Massalin, and J. Ioannidis. The Synthesis kernel. Computing Systems, 1(1):11-32, Winter 1988.

[8] J. Walpole, J. Inouye, and R. Konuru. A case study of Chorus on the HP-PA RISC. In Proceedings of the Usenix Workshop on Micro-Kernels and Other Kernel Architectures, Seattle, April 1992.
