Sony Computer Science Laboratory Technical Memo SCSL-TM-95-005

SCONE: Using Concurrent Objects for Low-level Operating System Programming*

Jun-ichiro Itoh†

Yasuhiko Yokote‡

Mario Tokoro§

March 1, 1995

Abstract

This paper proposes a methodology for making the low-level system code of operating systems completely replaceable at runtime. Our approach uses specialized concurrent objects and dedicated system service layers for low-level system objects to provide a basic programming model for low-level system programming. System programs can be implemented as concurrent objects regardless of the functionality they implement. Each concurrent object is mapped to an appropriate actual implementation by the system service layer specific to each type of system code. Under our new concurrent object-oriented programming model, which we call SCONE¹, it is possible to program low-level system code without hazardous operations such as explicit synchronization or direct scheduler manipulation. We present the implementation of our methodology on the Apertos operating system and demonstrate its efficiency with a preliminary performance evaluation.

1 Introduction

1.1 Motivation

There is an increasing need for operating systems to adapt to application and hardware requirements by allowing operating system functionality to be changed or replaced. Here we present three examples of such needs.

The first example is the dynamic change of hardware devices. PDAs and mobile computers now have hot-pluggable hardware interfaces, such as the PC-Card interface². One can plug in or remove network interface cards, secondary storage cards, modem cards, GPS receivers, etc., while the system is running. The operating system needs to be able to adapt itself to new hardware devices automatically.

The second example is the heterogeneity of the network infrastructure. PDAs and mobile computers are carried along with the user while maintaining network connections to other computers. However, the network infrastructure may range from dialup serial lines (through ground or mobile phones) to wireless WANs of various providers. The system software of the mobile computer should be able to adapt to the available physical network layer and protocol suites in order to maintain network connectivity.

The last example is Set-Top-Boxes (STBs) for Video On Demand services. The STB is a consumer device, which should always be available, and the users cannot be expected to customize or debug their system software. Therefore, there should be a way to replace the system software of STBs at runtime, while the STBs are in service [DAVIC94]. In addition, STBs will need to execute various applications such as games, on-line shopping, banking, and forthcoming applications within their rather strict memory limitations. As a result, even low-level system software should be dynamically replaceable to make room.

The above examples show that the ability to adapt to dynamic and unpredictable changes affects low-level system code such as device drivers, network protocol handlers, timers, paging policies, etc.
Therefore we need a way of constructing dynamically replaceable low-level system code, so that we can (1) download new system code as needed, either from a server connected via a network or from secondary storage, and (2) update the system software without a kernel rebuild or cold start.

1.2 The design issues of replaceable low-level system code

We believe that the important design issues for making low-level system code replaceable are as follows.

Guarantee of safe execution is important because changes in low-level system code affect the whole system; without such a guarantee the whole system becomes unstable. We cannot assume that all system code is "correct"; rather, we must prevent system programmers from making errors, and guarantee safe execution of the system even if they do. To be more specific, we claim that the following should be guaranteed by the programming model:

• Prevention of synchronization errors and deadlocks.
• Timing error detection.
• Error detection against hardware device removal.

Programmability is important for realizing reconfigurable system code because programmers have to cope with more complex situations than in implementations for static systems. It is therefore important that they have the help of good abstractions for their programs.

Execution performance is important too, as is usual in system programming.

There have been micro-kernel based approaches, which implement low-level system code as user processes or tasks outside of the kernel. Such approaches are indeed better than traditional monolithic kernel implementations, where no dynamic replacement is possible, but they have failed to realize safe execution of low-level system code.

To realize safe execution and good programmability, we propose a new programming model for implementing the low-level system code of an operating system. Our model ensures that such code is programmed uniformly as a set of single-threaded concurrent objects, and executed under the supervision of a dedicated system service layer for each kind of low-level system code. Each layer maps the primitives of concurrent object execution onto the specific functionality of low-level system services, such as network protocol handlers, device drivers or virtual memory facilities.

The paper is organized as follows. In Section 2, we review the low-level system code implementation in monolithic kernel operating systems, and examine what is needed for dynamically replaceable system code. Section 3 describes our proposal in detail. In Section 4, we design the system service layer for device drivers, as an example of our design principle. Section 5 outlines our current implementation on the Apertos object-oriented operating system, and its execution performance. Section 6 discusses and compares our work with previous work. Section 7 concludes the paper and outlines our future plans.

* Submitted to ACM OOPSLA'95.
† Keio University, Tokoro Laboratory
‡ Sony Computer Science Laboratory
§ Keio University/Sony Computer Science Laboratory
¹ System programming by Concurrent Objects for a Non-static Environment
² Formerly called PCMCIA/JEIDA interface [JEIDA and PCMCIA 93]

2 Low-level system code in monolithic kernel implementations

In traditional monolithic kernel implementations of UNIX, all types of low-level system code, such as network protocol handlers, device drivers, and virtual memory managers, are implemented as parts of a huge monolithic kernel, and are statically linked together at compilation time. In such systems, it is naturally difficult to dynamically install or remove low-level system code at run-time. In addition, system programming is complicated because of:

Namespace management. The namespaces of functions and variables of the programming units are not independent. As a result, programmers themselves have to manage naming conflicts. Traditionally, the ad-hoc approach has been to prefix function names with the name of the module, for example, "diskintr()" for the interrupt handler of a disk device driver.

Mutual exclusion and scheduling. While implementing part of the UNIX kernel, a programmer has to cope with multiple threads of control in the kernel. Mutual exclusion between kernel code fragments is achieved by hard-coded calls to the spl functions. Each kernel thread yields its execution explicitly by calling sleep(). It is extremely easy to introduce synchronization errors with such low-level abstractions; moreover, such bugs will be fatal for the entire system.

As an example of low-level system programming, suppose we are to design a disk device driver. The disk read operation in a monolithic kernel OS is executed as follows (Figures 1 and 2)³:

1. In diskread(), after establishing mutual exclusion by using splhigh() and splx()⁴, a disk read request is issued to the hardware.
2. diskread() yields its execution by calling sleep(). Execution is switched to some other kernel thread in the ready state.
3. The disk read operation is completed, and a hardware interrupt is issued by the disk interface.
4. The running kernel thread (or user process) is suspended by the hardware interrupt, and the interrupt handler, diskintr(), is invoked. The interrupt will have been automatically masked by an assembler stub routine before the invocation of the interrupt handler, and therefore mutual exclusion is already established.
5. diskintr() transfers data from the disk to main memory, then re-activates the kernel thread that sent the read request via diskread() by calling the wakeup() routine.
6. diskread() is woken up and the disk driver issues a reply to the caller in the upper layer.

³ Notice for UNIX gurus: diskread() in the sample source code combines the functionalities of bread() and diskstrategy(), for simplicity.
⁴ splhigh() masks all hardware interrupts and therefore prevents diskintr() from being invoked. splx() removes the interrupt masks.

[Figure 1 (timeline diagram): the thread for the read request runs diskread(), issues the read request, and later issues the reply to the caller; at the interrupt, the thread for processing the interrupt runs diskintr() and transfers the data.]

Figure 1: Synchronization with interrupt in disk read operation

    /*
     * disk driver sample code.
     */
    diskread(buf)
    {
        int s;

        s = splhigh();
        /* critical section here */
        /* issue a read request */
        splx(s);

        /* wait for interrupt */
        while (! (flagof(buf) & DONE))
            sleep(buf);
        /* finished */
    }

    diskintr()
    {
        /* the function is critical section (automatic) */
        if (read completed) {
            /* transfer data into buf */
            flagof(buf) |= DONE;
            /* wake the read request thread up */
            wakeup(buf);
        }
    }

Figure 2: Disk read operation in BSD UNIX kernel

The example here shows the usage of the mutual exclusion functions splhigh() and splx(), and of the voluntary scheduling functions sleep() and wakeup(), for synchronization. In situations where low-level system code is dynamically replaced at run-time, it would be extremely difficult to ensure the correctness of programs that use these low-level concurrency primitives. It is therefore highly desirable to employ concurrency models with a much higher level of abstraction, even at the level of systems programming.
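To illustrate how fragile such primitives are, here is a toy, single-threaded model of a sleep()/wakeup() channel. It is only a sketch: ToyChannel and all of its members are hypothetical names, not kernel code, and the model assumes (as in classic UNIX kernels) that a wakeup() issued while nobody is asleep is simply forgotten. It compares the flag-checking wait of Figure 2 with a variant that sleeps unconditionally, for the case where the interrupt arrives before the requester reaches its wait.

```cpp
#include <cassert>

// Toy, single-threaded model of a sleep()/wakeup() channel.
// All names here are hypothetical and exist only for this illustration.
struct ToyChannel {
    bool done = false;      // stands in for the DONE flag on buf
    bool sleeping = false;  // true while the requesting thread is asleep

    // Models wakeup(buf): it only affects a thread that is asleep right now,
    // so a wakeup that arrives too early leaves no trace.
    void wakeup() { sleeping = false; }

    // Models diskintr(): mark completion, then wake any sleeper.
    void interrupt() {
        done = true;
        wakeup();
    }

    // Models the wait in Figure 2: go to sleep only if DONE is not yet set.
    void wait_checked() {
        if (!done) sleeping = true;
    }

    // Buggy variant: goes to sleep without looking at the flag.
    void wait_unchecked() { sleeping = true; }
};

// Returns true if the requesting thread is still asleep at the end (it hangs)
// when the interrupt fires before the thread reaches its wait.
bool hangs_when_interrupt_comes_first(bool check_flag) {
    ToyChannel ch;
    ch.interrupt();  // the device completes before the requester waits
    if (check_flag) {
        ch.wait_checked();
    } else {
        ch.wait_unchecked();
    }
    return ch.sleeping;
}
```

In this model the flag-checking variant proceeds, while the unchecked variant sleeps forever, although the two differ by a single test. The sketch does not model every interleaving; it only shows how little it takes to lose a wakeup.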

3 SCONE: A programming model for low-level system code

3.1 Principles of SCONE

We need a programming model that provides separate execution mechanisms for user applications and low-level system code, without using ad-hoc mechanisms or defining different abstractions for each type of low-level system code. Defining different abstractions for each type of code would lead to an unbounded number of abstractions, which would be very difficult for programmers to understand.

Here we propose SCONE, a new concurrent object-oriented programming model for implementing low-level system code as a dynamically installable/removable entity, without sacrificing safe execution of the system. The principal ideas of SCONE for implementing low-level system code are as follows:

1. Low-level system code is programmed with a uniform concurrency abstraction, i.e., as a set of independent concurrent objects, each with a single thread of execution. This is irrespective of the level of system functionality.
2. We provide a dedicated system service layer for each type of low-level system code, so that the primitives of concurrent objects are mapped onto the specific functionality of low-level system services.

The first item gives programmers a uniform programming model that is applicable to low-level system programs as well as to user applications. Unlike in monolithic kernel implementations, programmers are able to model low-level system code as concurrent objects. A concurrent object [Yonezawa and Tokoro 87] has an encapsulated data region to hold internal state, program code, and a single virtual processor. We can regard each object as a unit of atomic execution, thereby avoiding the hard-coded mutual exclusion operations used in both monolithic and micro-kernel implementations.

Although concurrent objects are a good programming model for concurrent programming, they are a high-level abstraction, and need to be mapped onto an actual implementation. As we have seen, the appropriate implementation for system code is totally different from that for user applications. Thus, we must have a mechanism for providing an appropriate mapping from concurrent objects onto concrete implementations of low-level system code.

In SCONE we provide a separate system service layer for each type of low-level system code, as well as for applications. By doing so we can implement the most appropriate runtime behavior for low-level system code, including scheduling, mutual exclusion control and communication primitives, without forcing programmers to manage the differences between system code and user applications.

3.2 Low-level system code in SCONE programming model

In our programming model SCONE, the disk driver example of Section 2 is programmed with concurrent objects as in Figure 3. The code in a device driver object is implemented as a set of methods, and device driver objects communicate via synchronous and asynchronous messages, just like objects in ConcurrentSmalltalk [Yokote 90]. Sending a synchronous message causes the caller to wait until the reply is received. A message contains a method selector, and reception of a message causes invocation of the specified method. Because there is only a single virtual processor per object, there is no concurrency inside an object. If a message is sent to a running object, the message is queued for later execution.

1. The first method, Disk::Read, issues the read request to the disk interface.
2. Disk::Read creates a continuation by specifying Disk::ReadCont as the target method, and the information to be preserved across Disk::Read and Disk::ReadCont as the message.
3. Disk::Read finishes without replying to the caller. Some other object will be scheduled for execution.
4. The disk read operation is completed, and a hardware interrupt is issued by the disk interface.
5. The hardware interrupt suspends the running object, and the interrupt handler Disk::Interrupt is invoked.
6. Disk::Interrupt transfers data from the disk to main memory, then re-activates the continuation generated at step 2. This causes a message to be sent to the method Disk::ReadCont.
7. In Disk::ReadCont, a reply to the caller is issued.

A continuation is a pseudo-object needed for implementing multiple execution flows in a single-threaded concurrent object, and will be discussed in detail in Section 4.3. NewContinuation, DeleteContinuation, Send, Exit, and Reply are system service requests from the device driver object to the system service layer. The actual code for these system services is implemented in the system service layer for device driver objects.

Observe that there is neither a hard-coded mutual exclusion operation nor direct scheduler manipulation in the device driver code. The code will be executed appropriately with the help of the system service layer for device driver objects.


[Figure 3 (timeline diagram): Disk::Read() issues the read request to the hardware and creates the continuation; at the interrupt, Disk::Interrupt() transfers the data; Disk::ReadCont() then issues the reply to the caller.]

    /* cont is an instance variable */
    Disk::Read()
    {
        /* issue read request */
        cont = NewContinuation(ReadCont, preservedVars);
        Exit();
    }

    Disk::Interrupt()
    {
        /* transfer data from hardware */
        Send(cont, REACTIVATE);
    }

    Disk::ReadCont(vars)    /* local method */
    {
        /* compose a reply */
        DeleteContinuation(cont);
        Reply(reply);
    }

Figure 3: Disk device driver code in our programming model
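The control flow of Figure 3 can be tried out with a small user-space sketch. Everything below is an assumption made for illustration: ToyLayer is a hypothetical stand-in for the system service layer (a single message queue with run-to-completion delivery), the interrupt is simply posted as a message, and the data value 42 is made up. It is not the Apertos implementation.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy service layer: one message queue whose messages run to
// completion one at a time, so methods of an object never overlap.
class ToyLayer {
public:
    using Method = std::function<void()>;

    // NewContinuation: remember the target method and hand back a handle.
    int NewContinuation(Method target) {
        continuations_.push_back(std::move(target));
        return static_cast<int>(continuations_.size()) - 1;
    }

    // DeleteContinuation: forget the target so later Sends are ignored.
    void DeleteContinuation(int id) { continuations_.at(id) = nullptr; }

    // Send to a continuation: queue its target as an asynchronous message.
    void Send(int id) {
        if (continuations_.at(id)) queue_.push(continuations_.at(id));
    }

    // Post: queue an ordinary message (a request or a converted interrupt).
    void Post(Method m) { queue_.push(std::move(m)); }

    // Run: deliver queued messages in arrival order until none are left.
    void Run() {
        while (!queue_.empty()) {
            Method m = std::move(queue_.front());
            queue_.pop();
            m();
        }
    }

private:
    std::vector<Method> continuations_;
    std::queue<Method> queue_;
};

// Toy version of the Disk object of Figure 3.
class Disk {
public:
    explicit Disk(ToyLayer& layer) : layer_(layer) {}

    void Read() {
        trace += "Read;";  // issue read request
        cont_ = layer_.NewContinuation([this] { ReadCont(); });
        // Returning without a reply plays the role of Exit().
    }

    void Interrupt() {
        trace += "Interrupt;";
        data_ = 42;          // made-up data transferred from the device
        layer_.Send(cont_);  // re-activate the continuation
    }

    std::string trace;  // order in which the methods ran
    int reply = -1;     // value handed back to the caller

private:
    void ReadCont() {
        trace += "ReadCont;";
        layer_.DeleteContinuation(cont_);
        reply = data_;  // stands in for Reply(reply)
    }

    ToyLayer& layer_;
    int cont_ = -1;
    int data_ = 0;
};

// Drives one read request followed by one interrupt through the toy layer.
std::string run_disk_demo() {
    ToyLayer layer;
    Disk disk(layer);
    layer.Post([&disk] { disk.Read(); });
    layer.Post([&disk] { disk.Interrupt(); });
    layer.Run();
    return disk.trace + "reply=" + std::to_string(disk.reply);
}
```

Calling run_disk_demo() yields the string "Read;Interrupt;ReadCont;reply=42": the three methods run strictly one after another, and the driver code itself contains no locking or scheduler calls.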

4 Designing the service layers in SCONE: a device driver example

Here, we demonstrate how device driver objects and their service layer are designed in SCONE, using the device driver example from the previous sections. The requirements of device drivers are drastically different from those of applications. The requirements are:

• timing requirements
• communication primitives and interrupt notification
• synchronization between interrupt handler and service request
• mechanisms for safe execution

Since in SCONE the device driver objects are programmed using concurrent object-oriented abstractions, such as message passing and object creation, we design the system service layer for device driver objects so that these primitives are implemented in a way that satisfies the above requirements. The following sections describe the requirements in detail and how the implementation provided by the service layer satisfies them.

4.1 Timing requirements

A device driver can be separated into two parts with different timing requirements. One is the interrupt handler and related code, which implements hardware accesses; the other is the remaining part of the device driver, which implements data transfer to and from other parts of the system. This second part sometimes implements part of the upper-layer services (say, a network packet filter).

The interrupt handler part is truly time-critical; it should be executed immediately when a CPU interrupt is raised, and should finish its execution as soon as possible. For this purpose, the interrupt handlers are pinned down in physical memory⁵. On the other hand, the rest of the device driver code is performance-conscious but not time-critical. Thus, it can be scheduled together with other system code, with higher priority if necessary.

Considering the above factors, we implement a device driver as two (or more) separate device driver objects serviced by different service layers. The low-level device driver object implements the interrupt handler and related hardware accesses, and the remaining objects implement data transfers and a part of the upper-layer services. We provide two different system service layers, one for interrupt handlers, and another for upper layers (see Figure 4).

⁵ If an object that implements interrupt handling for a paging device is allocated in a paged memory region, the object itself may be swapped out to a paging device, which may lead to a system failure.

[Figure 4 (diagram): a UNIX-style device driver implemented as two kinds of device driver objects. SCSIinterface (interrupt handler, device read, device write): needs interrupt mask control, time-critical. SCSIdevice[0-7] (high-level device control, buffering/caching): no need for interrupt mask control, not time-critical. A device driver object is the unit of atomic execution and has a single CPU context associated with it.]

Figure 4: Implementing a device driver as a set of concurrent objects

An example where we benefit from this separation is a device driver for a SCSI interface. A SCSI interface supports 7 cascaded storage devices, and a system programmer has to manage the internal state of each storage device as well as the state of the interface itself. Using our methodology, a system programmer implements an interrupt handler object for the SCSI interface chip, and 8 upper-layer objects, one for each SCSI storage device and one for the controller itself. The separation into objects enables reuse of low-level system code and allows the state of devices to be managed in a natural way.

4.2 Communication primitives and interrupt notification

The discussion here, as well as that in Section 4.3, concentrates on the system service layer for interrupt handler objects, not upper-layer objects, because its treatment is a good example of the effectiveness of our proposal to have a separate system service layer for each kind of low-level system code.

Message-based communication between objects is tightly related to scheduling and synchronization between objects. For example, after sending a synchronous message, the sender object has to wait for the execution of the receiver object to finish. If the sender object is a time-critical task and the receiver object consumes a significant amount of time (for example, the receiver object is located at a remote host or is a user application object), then the sender object may fail to satisfy its timing constraints. Since device driver objects are time-critical, we must avoid such a situation. The system service layer for device driver objects prevents device driver objects from sending synchronous messages to remote hosts or non time-critical objects, by checking the recipient of the message.

Because I/O events are notified via hardware interrupts, we need a way to route such interrupts to device driver objects.
The system service layer converts interrupts into asynchronous messages to device driver objects. By doing so, a programmer need not distinguish between hardware interrupts and message communications.

4.3 Synchronization between interrupt handler and service request using continuations

There is a slight problem in implementing the disk I/O operation of Section 2 using concurrent objects. As shown in Figure 1, the process consists of two execution flows: one is the issuer of the read request, which executes items 1 through 6; the other is the interrupt handler, which executes items 4 through 5. These execution flows are interdependent because the following execution order should always be preserved: the read request is issued first, then the interrupt handler is executed, and lastly a reply is sent to the caller. In addition, the operations for the read request and the reply should not be interrupted by a hardware interrupt from the disk interface.

Because there is only a single virtual processor per object and hard-coded mutual exclusion operations are not available, we must implement both execution flows within one object. Therefore, we need a way to suspend the requesting execution thread and resume it after the execution of the interrupt handler has finished.

To solve this problem, the system service layer for device driver objects provides a pseudo-object called a continuation, as noted in Section 3.2. A continuation is generated by the system service layer at the request of a device driver object. If a message is sent to a continuation object, it sends an asynchronous message to a predefined target. The message and the target are specified at the time the continuation is generated. By using continuations, a programmer can divide a method into multiple methods and control its progress in a fine-grained manner. Furthermore, hard-coded direct scheduler operations can be avoided by using continuations, which helps to prevent system hang-ups caused by mistakes in programs.⁶

4.4 Mechanisms for safe execution

We argued that the guarantee of safe execution is important for implementing dynamically reconfigurable system code. As already stated, our model allocates a CPU context per object and avoids hard-coded mutual exclusion operations. The separation of CPU contexts among objects helps us avoid system hang-ups due to mistakes in programs. Some issues are as follows.

Detecting timing errors. If some interrupts have been lost or not received by a device driver due to a timing error, the issuer of the request might wait forever for completion of the service. We use stray continuations to detect such an infinite wait. First, the system service layer for device driver objects registers the lifetime of each continuation and the name of the caller. The lifetime of a continuation should be sufficiently long to avoid uncertainty in error detection. If the lifetime of a continuation has expired, the continuation can be regarded as a stray continuation, and the system service layer for device driver objects can perform a recovery action or send an error status message to the caller.

Guaranteeing safety during installation/removal of device drivers. Installing a device driver object can be done simply, if there are no hardware resource conflicts with existing device drivers. However, removing a device driver should be handled with care, since the following hazardous situations could occur:

• A device driver object is removed while some requests are pending.
• The hardware interface is removed by the user, while the device driver is still intact.

The first case can be detected by the timeout of continuations. If the condition is detected before the removal of the device driver object takes effect, we can reject the removal request. Otherwise, the caller will receive an error status message sent by the system service layer after the timeout of the continuation. The latter case can be detected by the device driver object itself, allowing it to perform some finalization procedure, including its self-removal.

4.5 Summary of the functionalities provided by the service layer for device drivers

The mapping of concurrent object-oriented primitives onto the system service layer for device driver objects is summarized in Table 1. The table shows the differences between the system services provided by the system service layers mSystem and mDrive (see Section 5). Readers might note that New and Delete are unavailable in mDrive; this is because they are system services for creating and deleting objects, and they cannot be executed in bounded time since they may involve storage device accesses. As we have seen, Exit, NewContinuation and DeleteContinuation are special to mDrive, and are needed for synchronization with interrupts. Bind and Unbind are special to mDrive too, and are needed for registration of interrupt handlers.
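The stray-continuation check of Section 4.4 can be sketched as a small bookkeeping table. The names below (StrayDetector, Register, Complete, Sweep, HasPending) are hypothetical, and the sketch assumes a logical integer clock supplied by the caller rather than a real timer; it only illustrates the registration of lifetimes and caller names described above.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One registered continuation (hypothetical record layout).
struct ContinuationRecord {
    std::string caller;  // name of the object waiting for the reply
    long deadline;       // logical time after which the continuation is stray
};

// Hypothetical bookkeeping a service layer might keep for continuations.
class StrayDetector {
public:
    // Register a new continuation with its caller and lifetime.
    int Register(const std::string& caller, long now, long lifetime) {
        int id = next_id_++;
        live_[id] = ContinuationRecord{caller, now + lifetime};
        return id;
    }

    // The continuation was re-activated normally: stop tracking it.
    void Complete(int id) { live_.erase(id); }

    // Collect the callers of all expired continuations and forget them.
    // The service layer could then send each one an error status message.
    std::vector<std::string> Sweep(long now) {
        std::vector<std::string> to_notify;
        for (auto it = live_.begin(); it != live_.end();) {
            if (now >= it->second.deadline) {
                to_notify.push_back(it->second.caller);
                it = live_.erase(it);
            } else {
                ++it;
            }
        }
        return to_notify;
    }

    // True while requests are pending; a removal request could be rejected.
    bool HasPending() const { return !live_.empty(); }

private:
    std::map<int, ContinuationRecord> live_;
    int next_id_ = 0;
};

// Two requests are registered; only the first one completes in time.
std::vector<std::string> stray_demo() {
    StrayDetector detector;
    int first = detector.Register("fs", 0, 100);
    detector.Register("pager", 0, 100);
    detector.Complete(first);    // the interrupt for "fs" arrived
    return detector.Sweep(150);  // "pager" has waited past its lifetime
}
```

In stray_demo() only the caller named "pager" is reported, since its continuation was never re-activated before the deadline. How long a lifetime should be, and what recovery action follows, is left open here as it is in the text.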

5 Implementation

Based on the design outlined in the previous section, we have implemented device driver objects and their system service layers on the Apertos [Yokote et al. 91b, Yokote et al. 91a, Yokote 92] operating system running on i486-based PC-AT compatible machines (the CPU is a 486DX2-66) (Figure 5). Each circle above the ovals denotes a device driver object, which has a virtual processor allocated to it. The two ovals denote the system service interfaces for device driver objects. mDrive is the system service interface dedicated to device driver objects with hardware accesses and interrupt handlers, providing very restricted services in order to satisfy the timing constraints of the interrupt handlers. mSystem is for device driver objects without hardware accesses, and provides richer services.

[Figure 5 (diagram): device driver objects drawn as circles above two ovals. mDrive: execution environment for the lower half of device drivers (interrupt handlers). mSystem: execution environment for the upper half of device drivers (packet filtering, caching, etc.).]

Figure 5: Two system service layers, mDrive and mSystem

Table 1: Implementation of system services for device driver objects ("-" means the service is not available)

Send
  on mSystem: Sends a specified message to the specified target object. The sender continues its execution.
  on mDrive: Sends a specified message to the specified target object. The sender continues its execution.
Receive
  on mSystem: After invoking Send, waits for the reply to be received.
  on mDrive: -
Call
  on mSystem: Sends a specified message to the specified target object, and waits for a reply.
  on mDrive: Sends a specified message to the specified target object, and waits for a reply. Call is rejected if the target object is a non time-critical object such as a user application.
Reply
  on mSystem: Finishes the execution of the method. Resumes the caller if necessary.
  on mDrive: Finishes the execution of the method. Resumes the caller if necessary.
Exit
  on mSystem: -
  on mDrive: Finishes the currently executing method, without replying to the caller.
NewContinuation
  on mSystem: -
  on mDrive: Creates a continuation object.
DeleteContinuation
  on mSystem: -
  on mDrive: Removes the specified continuation object.
New
  on mSystem: Creates a new device driver object from a class system.
  on mDrive: -
Delete
  on mSystem: Removes a device driver object.
  on mDrive: -
Grow
  on mSystem: Allocates memory to a virtual memory region.
  on mDrive: Allocates memory to a physical memory region.
Shrink
  on mSystem: Releases the memory region allocated by Grow.
  on mDrive: Releases the memory region allocated by Grow.
Bind
  on mSystem: -
  on mDrive: Registers the specified method as the interrupt handler for the specified interrupt signal.
Unbind
  on mSystem: -
  on mDrive: Clears the association created by Bind.
Migrate
  on mSystem: Moves a specified object onto another system service layer.
  on mDrive: Moves a specified object onto another system service layer.

⁶ Using continuations to program complex synchronization patterns has traditionally been done at the language level, for example in actors [Agha 86].

Tables 2 and 3 list the costs of the services provided by mDrive and mSystem, respectively. The evaluation of mDrive includes a comparison with a UNIX variant, BSDI BSD/386 1.1, running on the same machine.

Table 2: Costs of mDrive services, and interrupt operations (in μsec)

metaoperation                       Apertos   BSD/386
Interrupt message delivery             25.0      11.5
Null interrupt handler execution       44.2      16.2
Send metacall overhead on mDrive      108.6         -
Call-Reply roundtrip on mDrive        207.8         -

Table 3: Costs of mSystem services (in μsec)

metaoperation                       Apertos
Send metacall overhead on mSystem      88.2
Call-Reply roundtrip on mSystem       772.8

The figures in Tables 2 and 3 show noticeable overhead, especially for message-based communication between objects. The sources of the overhead are CPU context switches between objects and scheduler operations. In executing a Call-Reply roundtrip, there are 14 context switches between the caller object, the system service layer mSystem, the scheduler, and the callee object.

To decrease this overhead, we introduce a context switch avoidance mechanism, which determines dynamically, at runtime, whether the optimization is possible. The optimization is used only if the safe execution of low-level system code can be guaranteed. A detailed discussion of the optimization is omitted from this paper; please refer to [Itoh 95] for details. With context switch avoidance, message-based communication between objects on mSystem becomes about 100 times faster, as shown in Table 4.

Table 4: Costs of mSystem services, after optimization (in μsec)

metaoperation                                          Apertos
Call-Reply roundtrip on mSystem (context hand-off)        7.8
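As a quick check of the "100 times faster" figure, the ratio can be worked out from the two Call-Reply roundtrip costs reported in Tables 3 and 4 (772.8 before and 7.8 after the optimization, in the same unit):

```latex
\[
\text{speed-up} \;=\; \frac{772.8}{7.8} \;\approx\; 99.1
\]
```

The rounded factor of 100 is therefore consistent with the measurements.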

For measuring macroscopic performance figures, we have implemented IP network protocol handler objects on the system service layer mSystem. Based on our design principle, the network protocol handlers are implemented on an appropriate system service layer for protocol handlers [Murata and Yokote 93]. We measured the execution time from the reception of an ICMP echo packet through the transmission of the ICMP echoreply packet. There is one concurrent object per network protocol, five in total. Table 5 shows the results, indicating that optimized execution is about 10 times faster than non-optimized execution.⁷

Table 5: Costs of network protocol handlers on mSystem (in μsec)

metaoperation                                 Apertos
ICMP echo-echoreply roundtrip                  3466.0
ICMP echo-echoreply roundtrip (optimized)       362.3

6 Discussions

Recent UNIX variants implement the Loadable Kernel Module (LKM) mechanism [Sun Microsystems91, NetBSD project94], which allows fragments of low-level system code to be installed into an executing UNIX kernel. Although this approach makes dynamic changes possible, it is an ad-hoc solution in that it does not guarantee the safe execution of a running system. For example, it is possible to remove a LKM while a kernel thread is waiting for interrupt in the LKM code, causing the kernel thread to wait forever. Traditional micro-kernel operating systems, such as Chorus [Rozier et al. 88] and Mach [Tevanian and Rashid 87], have been designed to implement most of the operating system functionalities as out-of-kernel processes or tasks. Network protocol handlers, paging device managers, and even device drivers [Armand 91, Forin et al. 91a, Forin et al. 91b] can be implemented as tasks outside the kernel. Such implementation solves some of the problems of traditional UNIX implementation described in Section 2. However like LKM, the problem of guaranteeing safe execution has not been solved. Also, traditional micro-kernels provide the same system service interface to user application tasks, as well as tasks which implement system code. This causes the following problems: intervention between tasks. Usually a task is implemented as an independent entity, and there will be no interventions over a task’s execution by other tasks, other than explicit message passing or synchronization operations. However, some tasks which implement low-level system code do not preserve such ideal runtime characteristics. For example, tasks that implement device drivers use explicit hardware interrupt mask operation inside them, and the mask can prevent other tasks from 7 We

do not provide performance figure of UNIX variant, because the implementations are completely different.

9

m

running. In SCONE , since the system service layer for objects (such as DriveIn Section 5) manages synchronization and scheduling operations, objects can always preserve its independency. timing requirements. The timing requirements for some system code is totally different from those in user applications. For example, a micro-kernel usually provides location transparent communication mechanism for tasks. If we implement low-level system code using location transparent task-to-task communication primitives, there will be a chance of a remote message communication while executing time-critical part (there’s no way to check that), and it is difficult for the task to satisfy its timing requirement. Even if we exclude the worst-case senario above, the use of system service layer supporting unnecessary functionalities is a source of overhead. In SCONE , we provide the most appropriate system service layer for each type of system code, and there are no such overheads nor timing errors. infinite system service usage. Some micro-kernel services cannot be made available to all tasks. Consider an example where we have implemented a network protocol handler task (call it net), a resource manager task (res), and user application task (app), on the top of a micro-kernel supporting location-transparent message communication services. If a remote communication request arrives at the micro-kernel, it will forward the request to net. Then, net handles the request, and asks the micro-kernel to send a network packet through network device driver. Now, suppose net invokes a local resource allocation request to res while processing a packet from application app, and res issues a remote communication request. The situation creates an infinite service request loop between res and net. In SCONE , we can prevent such hazardous situations by the system service layer rejecting such inter-object communication. 
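The res/net scenario above can be illustrated with a small simulation. This is only a sketch under stated assumptions: `ServiceLayer`, `ServiceRejected`, `run`, and the service names `local_alloc` and `remote_send` are hypothetical names chosen for illustration and are not part of Apertos or SCONE. It contrasts a uniform interface, where res may issue remote communication requests, with a restricted layer that offers res only local allocation.

```python
class ServiceRejected(Exception):
    """Raised when a service layer refuses a service it does not offer."""


class ServiceLayer:
    """Hypothetical service layer exposing only a fixed set of services."""

    def __init__(self, name, services):
        self.name = name
        self.services = frozenset(services)

    def request(self, caller, service, handler, *args):
        # Refuse anything outside the interface this layer was built to offer.
        if service not in self.services:
            raise ServiceRejected(f"{caller}: {service!r} not offered by {self.name}")
        return handler(*args)


def run(res_layer, max_entries=50):
    """Return (outcome, number of times net was entered)."""
    net_layer = ServiceLayer("protocol-layer", {"local_alloc", "remote_send"})
    entries = 0

    def net_send(packet):
        nonlocal entries
        entries += 1
        if entries >= max_entries:
            # Stand-in for the unbounded loop; a real system would not stop here.
            raise RecursionError("service request loop between res and net")
        # net asks res for a local resource allocation while handling the packet.
        return net_layer.request("net", "local_alloc", res_alloc, packet)

    def res_alloc(packet):
        # res issues a remote communication request, which is served by net again.
        return res_layer.request("res", "remote_send", net_send, packet)

    try:
        net_send("packet from app")
    except ServiceRejected:
        return ("rejected", entries)
    except RecursionError:
        return ("loop", entries)
    return ("done", entries)  # not reached in this sketch


uniform = ServiceLayer("uniform-interface", {"local_alloc", "remote_send"})
restricted = ServiceLayer("resource-layer", {"local_alloc"})
print(run(uniform))     # ('loop', 50)
print(run(restricted))  # ('rejected', 1)
```

With the uniform interface the request chain only stops at the artificial bound, whereas the restricted layer rejects the first remote request from res, so net is entered exactly once.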
Chorus and Mach claimed to support multiple personalities (system interfaces) for their tasks/actors; however, such "personalities" were actually implemented as server tasks/actors on top of the same system service interface provided by the micro-kernel. In other words, multiple personalities do not mean multiple system service interfaces for tasks, and our approach cannot be implemented on Chorus or Mach in a natural way. The comparison with previous work is summarized in Table 6.

Table 6: Comparison with previous work

                                              Trad UNIX  UNIX LKM  Chorus/Mach  Our approach
Dynamic reconfiguration                       NG         OK        OK           OK
Independent namespace                         NG         NG        OK           OK
Avoidance of hard-coded mutex
  between modules                             NG         NG        OK           OK
  inside module                               NG         NG        NG           OK
Safety guarantee while insertion/removal
  takes effect                                NG         NG        NG           OK
Uniform programming model                     —          —         NG           OK
Kernel interface is appropriate               —          —         NG           OK
Avoidance of infinite system service usage                         NG           OK

NG: No Good    OK: Good

Very few operating systems other than Apertos implement mechanisms for providing different system service interfaces to system programs and user programs. The Spin operating system [Bershad et al. 94, Bershad 94] is a dialect of the Mach operating system, and implements mechanisms for extending the system services provided by the Spin micro-kernel. Such extension is achieved by installing a program module called a Spindle into the Spin micro-kernel at runtime. Spindles can differ from task to task, so the system service interface of the micro-kernel can be extended to suit the requirements of a specific task. We could implement our approach on Spin if the basic Spin micro-kernel interface were sufficiently restricted, or if Spin implemented restriction of system services based on Spindles. As far as we know, however, Spin only supports extension of the system service interface, and its basic micro-kernel interface is already extensive, being similar to that of Mach. Therefore, it is likely to be difficult to apply our approach to the Spin operating system.

Various operating systems for micro-computers, such as MS-DOS and OS-9, support dynamic installation and removal of device drivers. Their implementations give no consideration to safety or programmability. As far as we understand from the literature, even Windows NT [Cutler 93] is implemented in the same way as the UNIX LKM implementation.

Our use of continuations for dividing an execution thread is similar to the use of continuations in the Mach kernel. In Mach, continuations are used to decrease overhead and to reduce the memory consumed by kernel stacks, and these benefits cannot be achieved without the kernel stack sharing technique. In our implementation of the system service layer for device drivers, continuations are used to implement synchronization between the hardware interrupt handler and I/O-requesting execution flows within a single-threaded concurrent object.
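The continuation-based synchronization described above can be sketched as follows. This is an illustrative model only, not the Apertos implementation: `DriverObject`, its `mailbox`, and the `read` and `interrupt` methods are hypothetical names, and the upper-casing of the data merely stands in for whatever post-processing the driver performs. The object handles one message at a time; an I/O request saves the rest of its work as a continuation instead of blocking, and the interrupt message resumes it.

```python
from collections import deque


class DriverObject:
    """Hypothetical single-threaded concurrent object for a device driver."""

    def __init__(self):
        self.mailbox = deque()   # pending messages, processed one at a time
        self.pending = deque()   # continuations waiting for an interrupt
        self.log = []

    def send(self, method, *args):
        self.mailbox.append((method, args))

    def run(self):
        # One message at a time: no locks or interrupt masks in driver code.
        while self.mailbox:
            method, args = self.mailbox.popleft()
            getattr(self, method)(*args)

    def read(self, reply):
        self.log.append("read: start device, suspend")
        # Save the remainder of the request as a continuation instead of blocking.
        self.pending.append(lambda data: reply(data.upper()))

    def interrupt(self, data):
        self.log.append("interrupt: resume")
        if self.pending:
            self.pending.popleft()(data)


results = []
drv = DriverObject()
drv.send("read", results.append)
drv.run()                        # request is suspended; no reply yet
before = list(results)
drv.send("interrupt", "abc")     # in this sketch, delivered by the service layer
drv.run()
print(before, results)           # [] ['ABC']
```

Because both the request and the interrupt arrive as ordinary messages, the driver code itself contains no explicit synchronization; the ordering is decided entirely by whoever feeds the mailbox.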

7

Conclusion

In this paper, we argued that the runtime replacement of low-level system code in operating systems is becoming a critical task for operating system designers. Operating systems should be designed to be fully reconfigurable from their low-level structure up, in order to adapt to changes in hardware and in the external environment. Our task is to devise a design methodology that does not sacrifice safe execution of the system, while maintaining good programmability and practical performance.

We have proposed the SCONE programming model for designing low-level system code to be dynamically installable/removable:

• System code is implemented with a uniform concurrency abstraction, as a set of single-threaded concurrent objects.

• The divergence in the requirements of various system layers is accommodated by different system service layers.

We took the device driver as an example, and showed how the service layer realizes the behavior of device driver objects. The design focused on satisfying the requirements specific to device drivers, such as timing constraints and interrupt handling. We have implemented our design on top of the Apertos operating system, and provided preliminary execution performance figures. We have successfully satisfied the timing constraints for executing device drivers, while preserving high programmability and good performance.

As future work, we plan to define a standardized protocol for communication between device driver objects, which is needed for reusing device driver code. We also plan to implement a storage system and a class system for dynamic creation of low-level system objects, for dynamic installation and removal of low-level system code.

References

[Agha 86] Gul A. Agha. Actors: A Model of Concurrent Computation in Distributed Systems. The MIT Press, 1986.

[Armand 91] François Armand. Give a Process to your Drivers! In Proc. of the EurOpen Autumn 1991 Conference, Sep. 1991.

[Bershad 94] Stefan Savage and Brian N. Bershad. Issues in the Design of an Extensible Operating System. Technical report, University of Washington, Department of Computer Science and Engineering, June 1994.

[Bershad et al. 94] Brian N. Bershad, Craig Chambers, Susan Eggers, Chris Maeda, Dylan McNamee, Przemyslaw Pardyak, Stefan Savage, and Emin Gün Sirer. SPIN—An Extensible Microkernel for Application-specific Operating System Services. Technical Report 94-03-03, University of Washington, Department of Computer Science and Engineering, 1994.

[Cutler 93] David N. Cutler. Inside Windows NT. Microsoft Press, 1993.

[DAVIC94] DAVIC. DAVIC (The Digital Audio Visual Council)'s First Call for Proposal, October 1994. DAVIC/100.

[Draves et al. 91] Richard P. Draves, Brian N. Bershad, Richard F. Rashid, and Randall W. Dean. Using Continuations to Implement Thread Management and Communication in Operating Systems. In Proceedings of the 13th Symposium on Operating Systems Principles, pp. 122–137. ACM SIGOPS, 1991.

[Forin et al. 91a] Alessandro Forin, David Golub, and Brian Bershad. An I/O System for Mach. In Proceedings of the Usenix Mach Symposium. USENIX, Nov. 1991.

[Forin et al. 91b] Alessandro Forin, David Golub, and Brian Bershad. An I/O System for Mach 3.0. Technical Report CMU-CS-91-191, School of Computer Science, Carnegie Mellon University, 1991.

[Itoh 95] Jun-ichiro Itoh. Design and Implementation of Runtime-replaceable Device Drivers. Master's thesis, Dept. of Computer Science, Graduate School of Science and Technology, Keio University, March 1995. (in Japanese).

[JEIDA and PCMCIA 93] JEIDA and PCMCIA. PC card guidelines. JEIDA, March 1993. (in Japanese).

[Kiczales and Lamping 93] Gregor Kiczales and John Lamping. Operating Systems: Why Object-Oriented? In Proceedings Third International Workshop on Object Orientation in Operating Systems (IWOOOS'93), pp. 25–30, Dec. 1993.

[Kiczales et al. 92] Gregor Kiczales, Marvin Theimer, and Brent Welch. A New Model of Abstraction for Operating System Design. In Proceedings Second International Workshop on Object Orientation in Operating Systems (IWOOOS'92), pp. 346–349, 1992.

[Murata and Yokote 93] Kenichi Murata and Yasuhiko Yokote. A Reflective Network System Using Concurrent Objects and Meta Architectures. Technical Report SCSL-TM-93-010, Sony Computer Science Laboratory Inc., July 1993.

[NetBSD project94] NetBSD project. NetBSD-current source codes, 1994.

[Rozier et al. 88] M. Rozier, V. Abrossimov, F. Armand, I. Boule, M. Gien, M. Guillemont, F. Herrmann, C. Kaiser, S. Langlois, P. Léonard, and W. Neuhauser. CHORUS Distributed Operating Systems. Computing Systems, Volume 1, No. 4, pp. 305–370, Fall 1988.

[Sun Microsystems91] Sun Microsystems. SunOS 4.1.3 Reference Manual, 1991.

[Tevanian and Rashid 87] Avadis Tevanian, Jr. and Richard F. Rashid. MACH: A Basis for Future UNIX Development. Technical Report CMU-CS-87-139, Computer Science Department, Carnegie Mellon University, June 1987.

[Yokote 90] Yasuhiko Yokote. The Design and Implementation of ConcurrentSmalltalk. World Scientific Publishing, 1990.

[Yokote 92] Yasuhiko Yokote. The Apertos Reflective Operating System: The Concept and Its Implementation. In OOPSLA '92 Proceedings, pp. 414–434. ACM, Oct. 1992. Also appeared in Sony Computer Science Laboratory Inc., Technical Report SCSL-TR-92-014.

[Yokote et al. 91a] Yasuhiko Yokote, Atsushi Mitsuzawa, Nobuhisa Fujinami, and Mario Tokoro. Reflective Object Management in the Muse Operating System. In Proceedings International Workshop on Object Orientation in Operating Systems (IWOOOS'91). IEEE, Oct. 1991. Also appeared in Sony Computer Science Laboratory Inc., Technical Report SCSL-TR-91-009.

[Yokote et al. 91b] Yasuhiko Yokote, Fumio Teraoka, Atsushi Mitsuzawa, Nobuhisa Fujinami, and Mario Tokoro. The Muse Object Architecture: A New Operating System Structuring Concept. Operating System Review, Volume 25, No. 2, Apr. 1991. Also appeared in Sony Computer Science Laboratory Inc., Technical Report SCSL-TR-91-002.

[Yonezawa and Tokoro 87] Akinori Yonezawa and Mario Tokoro, editors. Object Oriented Concurrent Programming. The MIT Press, 1987.
