Design and Implementation of Real-Time Scheduler in Real-Time Mach

Tatsuo Nakajima and Hideyuki Tokuda
School of Computer Science
Carnegie Mellon University
Pittsburgh, Pennsylvania 15213

Abstract

A microkernel-based operating system architecture is becoming common for advanced distributed computing systems. However, current microkernels lack support for real-time facilities such as real-time scheduling and synchronization. These facilities are very important for future operating systems that must support audio and video. Real-Time Mach provides real-time facilities that make it easy to build real-time applications. In particular, processor scheduling plays a key role in managing system resources in a timely fashion. In this paper, we describe the real-time scheduling facilities in Real-Time Mach, their implementation, and their performance evaluation.

1 Introduction

Microkernel-based distributed operating systems are becoming common for advanced distributed computing systems. A microkernel provides basic resource management functions such as processor scheduling, memory object management, IPC facilities, and low-level I/O support [3, 13, 2, 11]. Traditional operating system functions such as the file system and network services are all provided in server tasks which run as user-level application programs. The advantages of using such a microkernel for real-time applications are that the preemptability of the kernel is better, the size of the kernel becomes much smaller, and the addition of new services is easier. Support for real-time facilities becomes more important in future distributed computing environments. New advanced workstations are equipped with a microphone, speakers, and audio/visual coprocessors, and are beginning to handle various media such as real-time video images, animation, high-quality sound, and digitized voice as routine work.

This research was supported in part by the U.S. Naval Ocean Systems Center under contract number N66001-87-C-0155, by the Office of Naval Research under contract number N00014-84-K0734, by the Defense Advanced Research Projects Agency, ARPA Order No. 7330 under contract number MDA72-90-C-0035, by the Federal Systems Division of IBM Corporation under University Agreement YA-278067, and by the SONY Corporation. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing official policies, either expressed or implied, of NOSC, ONR, DARPA, IBM, SONY, or the U.S. Government.


Although the quality of such data totally depends upon the completion time of the data manipulation functions, many non-real-time operating systems do not provide the real-time thread management and scheduling support needed for predictable computing. A challenge in such real-time systems is to develop a real-time kernel which can provide users with a predictable and reliable distributed real-time computing environment. In particular, processor scheduling plays a key role in managing the system resources in a timely fashion. We introduce the integrated time-driven scheduler (ITDS) as the Real-Time Mach scheduler. The ITDS model is based on the integration of hard real-time and soft real-time scheduling policies. Each thread has not only timing information but also a semantic importance, so the system can control transient overload. In this paper, we describe the real-time scheduling facilities in Real-Time Mach, their implementation, and their performance evaluation. In Section 2, we first describe basic issues in real-time scheduling. Section 3 presents the implementation of the processor scheduler for Real-Time Mach. Section 4 discusses the performance evaluation. In Section 5, we describe related work. Section 6 summarizes the development status and considers future work.

2 Scheduling Paradigm and Issues

Our scheduling paradigm is based on the integrated time-driven scheduling (ITDS) model. The ITDS model enables us to predict and analyze computation easily. These properties are suitable for handling time-critical objects such as continuous media. In this section, we first describe the problems of traditional fixed priority scheduling and introduce the ITDS model. We also discuss several issues concerning the implementation of a processor scheduler.

2.1 Traditional Fixed Priority Scheduling

Fixed priority scheduling is a widely adopted scheduling policy for real-time systems. The policy allows an analytical guarantee of scheduling feasibility for a set of tasks considered critical, even in situations of transient overload. Many real-time systems support a simple priority-based preemptive scheduling policy. Each task is assigned an integer number as a priority, and a task which has a low priority is always preempted whenever a higher priority task becomes runnable. This strategy simplifies the implementation of the scheduler, but often causes a serious problem when the system becomes transiently overloaded. A simple priority-based scheduler cannot decide which task is important, that is, which task should be completed and which task should be aborted, when the system becomes overloaded. Under this scheduling policy, a higher priority task is treated as more important than a lower priority task. In the real world, however, the importance of a task and its priority are different concepts, because the priority assignment is meaningful only while the system is not overloaded. We should separate these concepts. Moreover, simple fixed priority scheduling decreases modularity and extensibility, because the priority of a task may need to be changed whenever a new task is added to the system or a mode change occurs. Real-time synchronization protocols such as the priority ceiling protocol may confuse the assignment of priorities when a task set is changed. The priority of a task should not change during its lifetime. In Real-Time Mach, the period of a task is directly used as its priority, and each task has an importance.


2.1.1 Scheduling Aperiodic Tasks

Real-time applications such as continuous media applications require responsive scheduling for both periodic and aperiodic tasks. The management of aperiodic tasks becomes more important in microkernel-based operating systems because system servers such as file servers and network servers are executed as aperiodic tasks. When user-level servers such as network servers and operating system emulation servers, implemented as aperiodic tasks, are called by the periodic tasks of real-time applications, the deadlines of the requests should still be satisfied. In order to achieve responsive and predictable aperiodic task scheduling, we have introduced aperiodic servers which reserve CPU capacity. Aperiodic server algorithms can improve the response time of aperiodic tasks; several such algorithms, including the deferrable server and the sporadic server [8, 16], have been proposed. Aperiodic servers preserve capacity for aperiodic tasks. There are two advantages of the aperiodic server over traditional approaches. The first advantage is enhanced responsiveness of aperiodic tasks. In a microkernel-based operating system, interrupt processing that is implemented inside the kernel in a traditional monolithic kernel may be executed as an aperiodic task at user level. The interrupt processing may handle data belonging to a time-critical periodic task, so it must be processed as soon as possible, but it is difficult to execute it as a periodic task because we do not know whether the interrupt belongs to a periodic task or an aperiodic task before interpreting the data associated with the interrupt. Aperiodic servers solve this problem because the task processing the interrupt can respond quickly. The second advantage is the management of system servers such as network servers. Network servers need enough capacity to handle packets, but are not allowed to disturb the execution of periodic tasks. The capacity which can be used by network servers must be determined by the tradeoff between response time and schedulability. Operating systems should therefore support several aperiodic server algorithms according to the resource requirements of applications.

2.2 Integrated Time-Driven Scheduling Model

The objective of the integrated time-driven scheduling model is to provide predictability, flexibility, and modifiability for managing both hard and soft real-time activities in various real-time applications. For the hard real-time activities, the ITDS model allows a system designer to predict whether the given task set can meet its deadlines or not. For the soft real-time activities, the designer may predict whether the worst case response times meet the requirements or not. Under a transient overload condition, the ITDS model uses tasks' importance values to decide which task should complete its computation and which should be aborted or canceled. In the ITDS model, we adopted a capacity preservation scheme to cope with both hard and soft real-time activities. By capacity preservation we mean that we divide the necessary processor cycles between the two types. We first analyze the necessary processor cycles by accumulating the total utilization of the hard periodic and sporadic activities (i.e., sum_{i=1}^{k} C_i / T_i) and then apply rate monotonic scheduling policies [10]. It should be noted that the reason for our choice of a rate monotonic or deadline monotonic paradigm instead of a dynamic scheduling policy, such as earliest deadline, is that it is extremely difficult to utilize such deadline information for scheduling a message at the media access level or a bus transaction on a system internal bus. At least, given a set of real-time activities, we can easily map the period into a fixed integer priority domain. For instance, given a set of periodic, independent tasks in a single processor environment with the rate monotonic scheduling algorithm, the worst case schedulable bound is 69% [10], the average case is 88% [9], and the best case, where threads have harmonic periods, is up to 100% of the CPU utilization. After analyzing the hard real-time activities, we assign the remaining schedulable amount of the unused processor cycles to the soft real-time tasks. In order to guarantee that no soft task will consume more than the allocated processor cycles, we create a special periodic server, an aperiodic server [16]. (Even though we call it a server, it is not implemented as an ordinary task.) The aperiodic server preserves processor cycles during its period, and if there is an aperiodic request, the server assigns the remaining processor cycles to the requesting task. If the requesting task cannot complete its execution within the aperiodic server's capacity, it must block until the beginning of the next cycle of the server. Otherwise, there is a possibility that other periodic tasks may not meet their deadlines.

Capacity-based scheduling algorithms are used in different types of schedulers, such as in the Fair Share Scheduler [5] and in FDDI's capacity allocation scheme [12]. However, our motivation was not fairness, but rather to reduce the response time of aperiodic activities while guaranteeing that the periodic tasks' deadlines are met.
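To make the hard real-time analysis step concrete, the following is a minimal sketch in C of the rate monotonic admission test implied above, using the worst case utilization bound n(2^(1/n) - 1) from [10]. The function name and task structure are our own illustration, not part of the Real-Time Mach interface.

    #include <math.h>

    struct periodic_task {
        double wcet;    /* worst case execution time C_i */
        double period;  /* period T_i */
    };

    /* Return 1 if the task set passes the rate monotonic worst case bound,
     * 0 if the test is inconclusive (the set may still be schedulable,
     * e.g. when the periods are harmonic). */
    int rm_admission_test(const struct periodic_task *tasks, int n)
    {
        double utilization = 0.0;
        int i;

        for (i = 0; i < n; i++)
            utilization += tasks[i].wcet / tasks[i].period;

        /* Liu and Layland bound: U <= n * (2^(1/n) - 1), about 69% for large n */
        return utilization <= n * (pow(2.0, 1.0 / (double)n) - 1.0);
    }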

2.3 Integration of Scheduling and Synchronization

Synchronization between tasks may cause priority inversion problems. Several techniques have been developed to avoid these problems. The techniques require comparing the priorities of tasks, and the comparison requires coordination with the scheduling policy because the semantics of a priority differ in each policy. In Real-Time Mach, each scheduling policy therefore exports functions to compare the priorities of tasks. As a result, the synchronization policy is independent of any particular scheduling policy and can be used with all scheduling policies.
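As an illustration of this separation, the sketch below shows how a synchronization primitive might consult the active policy's comparison function (cf. the itds_comp_priority operation in Section 3.4) instead of comparing raw integer priorities. The thread descriptor, the policy structure, and the wait-queue insertion routine are hypothetical and only indicate the idea.

    #include <stddef.h>

    typedef int boolean_t;

    /* Hypothetical thread descriptor with an intrusive wait-queue link. */
    struct thread {
        struct thread *wait_next;
        /* policy-specific timing/priority fields would live here */
    };

    /* Hypothetical policy object: the mutex never inspects integer
     * priorities itself; it asks the active policy which of two
     * threads is more urgent. */
    struct sched_policy {
        boolean_t (*comp_priority)(struct thread *t1, struct thread *t2);
    };

    extern struct sched_policy *active_policy;

    /* Insert a waiter into a mutex wait queue in policy-defined order,
     * so that the most urgent waiter is woken first. */
    void mutex_enqueue_waiter(struct thread **head, struct thread *waiter)
    {
        struct thread **p = head;

        /* advance past every waiter the policy ranks ahead of the newcomer */
        while (*p != NULL && active_policy->comp_priority(*p, waiter))
            p = &(*p)->wait_next;

        waiter->wait_next = *p;
        *p = waiter;
    }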

2.4 Implementation Issues

The implementation of the ITDS scheduler must satisfy several requirements. The first requirement is that the ITDS scheduler should support several scheduling policies, because there is no single scheduling policy suitable for all applications; the scheduler interface should therefore allow scheduling policies to be changed dynamically. The second requirement is that the ITDS scheduler in Real-Time Mach should support multiprocessors; the scheduler should support both single processors and multiprocessors. The third requirement is to find a practical way to implement complex scheduling policies.

2.4.1 Policy/Mechanism Separation

Policy/mechanism separation is a useful tool for structuring systems, and it enables a system to exploit the characteristics of its applications: users can adapt the system to suit their applications. We describe why policy/mechanism separation is useful in real-time scheduling. For real-time scheduling there are many scheduling algorithms, for example, integer fixed priority, the rate monotonic algorithm, the deadline monotonic algorithm, and the rate monotonic algorithm with aperiodic servers. Complex algorithms cost more, but may offer better response times, so applications should be able to choose an appropriate policy for themselves. The important issues in policy/mechanism separation are:

Separation of Policies: How the policy module should be separated from the mechanism.

Placement of Policies: Where the policy module should be placed in the system.

Communication between Policy and Mechanism: How they should communicate (interact) with each other.

Selection of Policy: How the system or user can select the current (or active) policy at run time.

In Real-Time Mach, all scheduling policies are implemented as objects. Each object encapsulates the implementation of its policy. The communication between a policy object and the mechanism is done through function calls, not message passing.

2.4.2 Implementation of Scheduling Mechanism

To support several types of architectures, we divide the functions of a scheduler into an assignment module and a dispatch module. The assignment module assigns threads to processors, and the dispatch module dispatches threads on each processor. The assignment module of a scheduler on a single processor machine is a null function. We can consider two approaches for the assignment module. In the first approach, when a thread becomes runnable, the assignment module assigns a processor to the thread and enqueues it on the run queue of that processor. Each processor has a dispatch module which selects the highest priority thread in its run queue. In the second approach, the system is divided into several address spaces and each address space includes one or more threads. First, the assignment module assigns a processor to an address space; next, each address space dispatches the threads it contains. The second approach is suitable for parallel applications [1]. The dispatch module can be implemented in the user address space, so creating threads and switching threads within the same address space can be made faster. However, the context switching cost between different address spaces becomes more expensive than in the first approach. In a real-time system a cheaper preemption cost is better, so we selected the first approach for Real-Time Mach. A sketch of the resulting structure follows.
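The following is a minimal sketch, in C, of the first approach as we understand it from the description above: a per-processor run queue filled by the assignment module and drained by the dispatch module. All names and types here are illustrative assumptions, not the Real-Time Mach source.

    #define MAX_CPUS 4

    struct thread;                           /* opaque thread descriptor */

    /* A very small per-processor run queue; a real implementation would
     * keep it ordered by the active policy's notion of priority. */
    struct run_queue {
        struct thread *threads[64];
        int            count;
    };

    static struct run_queue run_queues[MAX_CPUS];

    /* Assignment module: pick a processor for a newly runnable thread and
     * enqueue it on that processor's run queue.  On a single processor
     * this degenerates to always choosing CPU 0 (a "null" assignment). */
    void assign_thread(struct thread *t)
    {
        int cpu = 0;                         /* trivial placement decision */
        struct run_queue *rq = &run_queues[cpu];

        if (rq->count < 64)
            rq->threads[rq->count++] = t;    /* priority ordering elided */
    }

    /* Dispatch module: each processor takes the next thread from its own
     * run queue (here simply the most recently enqueued one). */
    struct thread *dispatch_next(int cpu)
    {
        struct run_queue *rq = &run_queues[cpu];

        if (rq->count == 0)
            return (struct thread *)0;       /* nothing runnable: idle */
        return rq->threads[--rq->count];
    }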

2.4.3 Implementation of Scheduling Policies

Real-time scheduling algorithms become complex once we need to consider aperiodic servers. We cannot wire in only one scheduling algorithm because no algorithm is better than all others in every situation. The operating system therefore requires two facilities: a primitive to change scheduling policies dynamically, and a mechanism to construct new scheduling algorithms. The first facility allows applications to select the best scheduling policy. The second facility allows us to implement a new scheduling algorithm easily; we can then analyze the characteristics of algorithms and compare them with other scheduling algorithms easily. Complex scheduling algorithms can be separated into several simple scheduling algorithms, so we need to prepare several typical scheduling algorithms and provide a mechanism to compose them into one scheduling algorithm. Scheduling policies which include an aperiodic server become very complex, and creating such a scheduling algorithm from scratch is difficult.

However, some scheduling algorithms are very similar to each other. For example, the rate monotonic and the deadline monotonic algorithms are very similar; the difference is only the parameter used to determine priorities. In the rate monotonic scheduling algorithm, the period determines the priority, while in the deadline monotonic scheduling algorithm, the deadline determines the priority. We can therefore decompose a scheduling algorithm into small modules. This approach reduces the code size of the kernel and the time needed to create a new scheduling algorithm, as the sketch below illustrates.
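The following is a small illustrative sketch in C, not taken from Real-Time Mach, showing how rate monotonic and deadline monotonic can share one priority-queue module that is parameterized only by the key used for ordering (period versus relative deadline).

    /* Assumed per-thread timing attributes for this illustration. */
    struct rt_thread {
        long period;     /* period in microseconds            */
        long deadline;   /* relative deadline in microseconds */
    };

    /* A priority key function: a smaller key means a higher priority. */
    typedef long (*priority_key_fn)(const struct rt_thread *t);

    static long rm_key(const struct rt_thread *t) { return t->period;   }
    static long dm_key(const struct rt_thread *t) { return t->deadline; }

    /* The shared comparison used by the common run-queue module.
     * Only the key function differs between the two policies. */
    int higher_priority(const struct rt_thread *a, const struct rt_thread *b,
                        priority_key_fn key)
    {
        return key(a) < key(b);
    }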

3 Implementation of Scheduler

To design the ITDS scheduler for Real-Time Mach, we carefully considered the structure and the interface of the scheduler. In designing the structure of the scheduler, we focused on two issues: first, the scheduler should support several scheduling policies; second, it should support both single processor and multiprocessor machines.

3.1 Real-Time Mach Overview

Real-Time Mach has been developed at CMU to provide a common distributed real-time computing environment. Real-Time Mach is an extension of the Mach kernel and has the following four characteristics over the original Mach kernel:

- Real-Time Thread Model.
- Real-Time Scheduling.
- Real-Time Synchronization.
- Real-Time IPC.

The major feature of Real-Time Mach is predictable resource management, which enables us to analyze computation before executing applications. The resources for time-critical threads are evaluated eagerly. Every object provided by the kernel, such as a thread, a memory object, and a port, has attributes which reflect the requirements of applications. In Real-Time Mach, a thread is defined as a real-time or non-real-time activity. For a real-time thread, additional timing attributes must be defined by a timing attribute descriptor. A real-time thread is classified as a periodic or aperiodic thread, and each class of threads is defined as soft or hard real-time. The real-time scheduler in Real-Time Mach allows a system designer to predict whether the given task set can meet its deadlines or not. For soft real-time activities, the designer may predict whether the worst case response times meet the timing requirements or not. We adopted a capacity preservation scheme to cope with both hard and soft real-time activities; by capacity preservation we mean that we divide the necessary processor cycles between the two types. Under a transient overload condition, the scheduler uses tasks' importance values to decide which task should complete its computation and which should be aborted or canceled. Traditional synchronization primitives use FIFO ordering among threads waiting to enter a critical section, since FIFO ordering can avoid starvation. In a real-time computing environment, however, FIFO ordering often creates a priority inversion problem [15]: a higher priority thread may have to wait for the completion of all lower priority threads in the waiting queue. Real-time operating systems should support various synchronization policies based on the queueing order among waiting threads and the preemptability of the running thread in the critical section.
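As a concrete illustration of the timing attribute descriptor mentioned above, the sketch below shows how a periodic hard real-time thread might be described before creation. The structure layout, field names, and the rt_thread_create call are hypothetical stand-ins for illustration only; the actual Real-Time Mach interface may differ.

    /* Hypothetical timing attribute descriptor for a periodic thread. */
    struct timing_attr {
        int  is_periodic;        /* periodic or aperiodic               */
        int  is_hard;            /* hard or soft real-time              */
        long period_usec;        /* period                              */
        long deadline_usec;      /* deadline within each period         */
        long wcet_usec;          /* worst case execution time           */
    };

    /* Describe a 33 ms hard periodic activity (e.g. a video frame handler). */
    static const struct timing_attr video_attr = {
        1,        /* periodic            */
        1,        /* hard real-time      */
        33000,    /* 33 ms period        */
        33000,    /* deadline = period   */
        5000      /* 5 ms WCET           */
    };

    /* A thread would then be created with these attributes, e.g.
     *   rt_thread_create(task, &thread, video_entry, &video_attr);
     * (a hypothetical call shown only to indicate the idea). */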

IPC is heavily used in microkernel environments, and predictable IPC is important for building modular and manageable systems. Our IPC facility can control the receiver of a message and the priority among clients. When creating a new port, we may specify a port attribute to select the IPC policy.

3.2 System Interface

The system uses the fixed priority policy as the default policy; however, we can change scheduling policies using the following functions. rt_sched_get_policy gets the current scheduling policy, and rt_sched_set_policy sets and activates a new scheduling policy. rt_aperiodic_server_create creates a new aperiodic server, and rt_aperiodic_server_bind binds a thread to the aperiodic server indicated by its argument. The scheduling policy attribute is used to pass policy-specific arguments such as an aperiodic server's period and the capacity of the server. The following is a summary of the interface and a brief description of the policy attribute.

    kval_t = rt_sched_get_policy( policy_attr )
    kval_t = rt_sched_set_policy( policy_attr )
    kval_t = rt_aperiodic_server_create( policy_attr )
    kval_t = rt_aperiodic_server_bind( policy_attr )

    typedef struct policy_attr {
        processor_set_t  rt_pset;
        apserver_t       rt_apserver;
        policy_t         rt_policy;       /* scheduling policy */
        policy_t         rt_server_type;  /* type of aperiodic server */
        time_value_t     rt_capacity;     /* capacity of aperiodic server */
        time_value_t     rt_period;       /* period of aperiodic server */
        time_value_t     rt_deadline;     /* for deadline monotonic */
        ...
    } policy_attr_t;

rt_policy specifies the base scheduling policy, such as the rate monotonic policy; the default policy is used if the user specifies NULL as the scheduling policy. rt_server_type specifies the type of aperiodic server, rt_capacity is the capacity of the aperiodic server, and rt_period is the period of the aperiodic server.
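A minimal usage sketch follows, based only on the calls and attribute fields listed above. The policy and server-type constants, the choice to pass the attribute by pointer, and the convention that a nonzero return value indicates failure are assumptions made for illustration.

    /* Sketch: switch to the rate monotonic policy with a deferrable server,
     * then create the server and bind the calling thread's attribute to it.
     * POLICY_RM and APSERVER_DEFERRABLE are hypothetical constant names. */
    kval_t setup_rm_with_deferrable_server(processor_set_t pset)
    {
        policy_attr_t attr;

        attr.rt_pset        = pset;
        attr.rt_policy      = POLICY_RM;            /* base policy            */
        attr.rt_server_type = APSERVER_DEFERRABLE;  /* aperiodic server type  */
        attr.rt_capacity.seconds      = 0;
        attr.rt_capacity.microseconds = 50000;      /* 50 ms capacity         */
        attr.rt_period.seconds        = 0;
        attr.rt_period.microseconds   = 200000;     /* 200 ms server period   */

        if (rt_sched_set_policy(&attr) != 0)        /* assumed: 0 == success  */
            return -1;                              /* keep the default policy */
        if (rt_aperiodic_server_create(&attr) != 0)
            return -1;
        /* attr.rt_apserver is assumed to be filled in by the create call */
        return rt_aperiodic_server_bind(&attr);
    }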

3.3 Structure of ITDS Scheduler

The ITDS scheduler provides an interface between the scheduling policies and the rest of the operating system. An object-oriented approach is used to implement the scheduler, with the policies embedded in policy objects. Each instantiation of the scheduler may have a different scheduling policy governing the behavior of the object. Since Mach supports a multiprocessor environment, ITDS allows a user to select a scheduling policy for a processor set. A processor set is a user-defined object in Mach which binds threads to a set of physical processors. A special processor set called the default processor set exists; before any new processor set is created, all processors belong to the default processor set.

[Figure 1: ITDS Scheduler. The block diagram shows the Mach scheduler interface (thread_block, choose_thread, thread_setrun), the interface object, the dispatch object, and the ITDS object with its ITDS and policy interfaces, connected to policy objects such as RR, FP, RM, RMPOLL, and RMDS.]

Figure 1 shows a block diagram of the ITDS scheduler, indicating the relationship between the scheduler and the policy objects. The Mach scheduler uses three functions for controlling the run queue: thread_block, thread_choose, and thread_setrun. thread_choose selects the next runnable thread from the run queue; thread_block calls thread_choose internally to select a new thread and switch to it; thread_setrun makes a thread runnable. We changed these three primitives for the ITDS scheduler. The Mach scheduler interface provides functions related to all scheduling events, and this interface layer uses the three primitives. The interface object translates these functions into the interface of the dispatch object. The dispatch object can be replaced for each target architecture; it forwards each operation to the ITDS object attached to each processor set, and it controls preemption, for example which processor is preempted. On single processor machines, only one processor set, the default processor set, is supported, so the system has only one ITDS object. The functions in the policy object interface manipulate the run queue according to the scheduling policy, and the three primitives above finally call the corresponding policy object interface. For example, if the current policy is the rate monotonic scheduling policy, thread_choose calls the policy object interface, which returns a pointer to the runnable periodic thread with the shortest period.

3.4 Interface for Policy Object

The ITDS object and the policy objects have the same interface. We need five categories of functions: managing the binding of policy and mechanism, processing the run queue, priority checks, periodic processing by the kernel, and control of aperiodic servers. The operations related to scheduling events cover most of this; however, checking the necessity of preemption and the priority control in the synchronization module also depend on the scheduling policy, so we need to export an interface that makes preemption management and synchronization independent of the scheduling policy. We list the interface below. The operations are called whenever the system encounters a scheduling event associated with a thread.

    kval_t    = itds_startup( sobj, inv_type )
    kval_t    = itds_shutdown( sobj, inv_type )
    kval_t    = itds_run( sobj, thread, inv_type )
    kval_t    = itds_block( sobj, thread, inv_type )
    thread_t  = itds_choose( sobj, inv_type )
    kval_t    = itds_kill( sobj, thread, inv_type )
    boolean_t = itds_csw_check( sobj, thread, inv_type )
    boolean_t = itds_comp_priority( sobj, thread1, thread2, inv_type )
    int       = itds_map_priority( sobj, thread, max_priority, inv_type )
    preempt_t = itds_timer( sobj, inv_type )
    preempt_t = itds_aperiodic_server( sobj, inv_type )
    server_t  = itds_create_aperiodic_server( sobj, policy_attr, inv_type )
    kval_t    = itds_bind_aperiodic_server( sobj, server, thread, inv_type )

As described later, policy modules can be constructed using multiple inheritance; inv_type indicates whether the call is directed to the current class or to a super class. A scheduling policy directs not only the management of a run queue but also determines the priority ordering. Since ITDS encapsulates the priority management, the rest of the thread and synchronization management can easily be reused for other policies which use a different priority ordering. The above operations fall into five categories. The operations in the first category manage the binding of policy and mechanism: itds_startup binds a policy to the mechanism, and itds_shutdown unbinds it. The second category processes the run queue: itds_run enqueues a thread on the run queue, itds_block notifies the policy that a thread is blocked, itds_choose chooses the next runnable thread, and itds_kill removes a runnable thread from the run queue. The third category checks which threads have higher priorities: itds_csw_check checks whether the current thread needs to be preempted, itds_comp_priority compares the priorities of two threads, and itds_map_priority maps a priority to an integer value, where max_priority indicates the highest possible integer priority; if the mapping is impossible, the caller is notified. This operation is used to build a priority queue. The fourth category is called by the kernel periodically: itds_timer processes the quantum of threads, and itds_aperiodic_server resets the aperiodic servers at every period. The fifth category is called to control aperiodic servers: itds_create_aperiodic_server creates a new aperiodic server, and itds_bind_aperiodic_server binds a thread to an aperiodic server.
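The sketch below illustrates one plausible way to represent such a policy object in C, as a structure of function pointers matching the operations listed above. This is our own illustration of the idea, not the Real-Time Mach source; the placeholder typedefs and opaque structures stand in for the kernel's real types.

    typedef int kval_t;           /* placeholder typedefs for this sketch */
    typedef int boolean_t;
    typedef int preempt_t;
    typedef int inv_type_t;
    struct thread;                /* opaque in this sketch */
    struct apserver;
    struct policy_attr;

    /* A scheduling policy object: the ITDS mechanism calls these operations
     * on every scheduling event and never touches the run queue or the
     * priority representation directly. */
    struct itds_policy {
        kval_t    (*startup)(struct itds_policy *sobj, inv_type_t inv);
        kval_t    (*shutdown)(struct itds_policy *sobj, inv_type_t inv);
        kval_t    (*run)(struct itds_policy *sobj, struct thread *t, inv_type_t inv);
        kval_t    (*block)(struct itds_policy *sobj, struct thread *t, inv_type_t inv);
        struct thread *(*choose)(struct itds_policy *sobj, inv_type_t inv);
        kval_t    (*kill)(struct itds_policy *sobj, struct thread *t, inv_type_t inv);
        boolean_t (*csw_check)(struct itds_policy *sobj, struct thread *t, inv_type_t inv);
        boolean_t (*comp_priority)(struct itds_policy *sobj, struct thread *t1,
                                   struct thread *t2, inv_type_t inv);
        int       (*map_priority)(struct itds_policy *sobj, struct thread *t,
                                  int max_priority, inv_type_t inv);
        preempt_t (*timer)(struct itds_policy *sobj, inv_type_t inv);
        preempt_t (*aperiodic_server)(struct itds_policy *sobj, inv_type_t inv);
        struct apserver *(*create_aperiodic_server)(struct itds_policy *sobj,
                                                    struct policy_attr *attr,
                                                    inv_type_t inv);
        kval_t    (*bind_aperiodic_server)(struct itds_policy *sobj,
                                           struct apserver *srv,
                                           struct thread *t, inv_type_t inv);
    };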

3.5 Scheduling Policy Objects

Scheduling policies including background servers or aperiodic servers [16] are becoming very complex, and it is a difficult task to implement and add a new scheduling policy to a system. A policy may be very similar to other policies, so that they may share a portion of their code. For example, background threads under the rate monotonic scheduling policy can be scheduled in a fixed priority, FIFO, or round-robin manner. If we created three different objects for these scheduling policies, most parts of the policies would be redundant, and a policy designer who wants to create a new scheduling policy would have to create it from scratch. To solve this problem, we propose the notion of a micro scheduling policy object. A micro scheduling policy object contains only a simple scheduling policy such as fixed priority, rate monotonic, or deferrable server. It is an incomplete object which is meaningless by itself.

We use multiple inheritance for the composition of policies. A designer can create a new scheduling policy through the composition of micro scheduling policy objects by specifying its super classes. Consider the case where a designer wants to create a scheduling policy C which is a composite of a real-time scheduling policy A and a non-real-time scheduling policy B. We can create class C as a new scheduling policy by inheriting from the classes of policy A and policy B. We decompose a scheduling policy into a base scheduling policy, an aperiodic server policy, and a background scheduling policy. The classes of scheduling policy objects have three links which are named explicitly to control multiple inheritance: an aps link indicates an aperiodic server policy, a back link indicates a background policy, and a super link represents a scheduling policy inherited by the current scheduling policy. A base scheduling policy controls the scheduling of all threads, and it may call an aperiodic server policy object, a background policy object, and other policy objects through an aps link, a back link, or a super link. The inheritance in our framework specifies the names of the links to super classes explicitly; this is an example showing that a single uniform inheritance mechanism is not appropriate for all compositions of objects. The problem of our approach is the cost of the composition: composing micro scheduling policy objects requires several method calls to execute the methods in the policy interface, and the scheduler is a critical component which affects system performance. In our implementation, we use a customized run-time for realizing multiple inheritance, together with method caches to make method calls faster. The scheduler need not change the configuration of the composition of policies unless the current scheduling policy is changed, and the method cache is invalidated only at that time. This strategy reduces the overhead and the implementation cost, and it is not provided by traditional object-oriented languages. In the rate monotonic policy, we need only one procedure call for periodic threads and two procedure calls for aperiodic threads (in our implementation, the rate monotonic scheduling module needs to call a background scheduling module).

3.6 Micro Scheduling Policy Objects

The current version of Real-Time Mach supports 9 micro scheduling policy objects, which we present here.

Round Robin[RR]: Threads are executed in a round-robin fashion.

Fifo[FIFO]: Threads are executed in first-in first-out order.

Fixed Priority[FP]: The thread priority is assigned at creation time and is fixed. The highest priority thread is selected first.

Earliest Deadline First[EDF]: The thread with the earliest deadline is selected first.

Rate Monotonic with Background Server[RM]: The periodic threads are selected based on the rate monotonic scheduling algorithm [10], which assigns priority according to the frequency of a periodic thread: threads with higher frequency (i.e., shorter period) are given higher priority. Aperiodic threads are executed in background mode.

Deadline Monotonic with Background Server[DM]: The periodic threads are selected based on the deadline monotonic algorithm. In this algorithm, each periodic thread has a fixed deadline in every period, and threads with smaller deadlines are given higher priority. Aperiodic threads are also managed in background mode.

Polling Server[POLL]: The polling server becomes runnable periodically and executes aperiodic tasks whenever it finds a runnable aperiodic task.

Deferrable Server[DS]: Unlike a polling server, the deferrable server allows an aperiodic task to be executed at any given time as long as the server's reserved execution time lasts. The deferrable server enhances the functionality of the rate monotonic policy by reserving some execution time for aperiodic processes. The server is conceptually a periodic process which divides the processor time allocated to it among the aperiodic tasks in the system. If no aperiodic tasks are ready to run when the server's period begins, the server can save its execution time and use it to service aperiodic tasks later in the same server period (see the sketch after this list).

Sporadic Server[SS]: The sporadic server also allows an aperiodic task to be executed at any given time as long as the server's execution time lasts, like the deferrable server. The difference is the replenishment algorithm. A sporadic server makes the schedulability analysis easier than a deferrable server.
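To make the deferrable server behavior above concrete, here is a minimal sketch in C of the capacity bookkeeping: consume budget while an aperiodic thread runs, refuse service when the budget is exhausted, and refill the budget to full at the start of each server period. The structure and function names are our own illustration, and details such as the sporadic server's piecemeal replenishment are deliberately omitted.

    /* Deferrable server state; all times are in microseconds. */
    struct deferrable_server {
        long capacity;      /* budget granted each period             */
        long remaining;     /* budget left in the current period      */
        long period;        /* server period                          */
        long next_refill;   /* absolute time of the next period start */
    };

    /* Called from the periodic timer at every server period boundary
     * (cf. itds_aperiodic_server): restore the full budget. */
    void ds_replenish(struct deferrable_server *s, long now)
    {
        if (now >= s->next_refill) {
            s->remaining    = s->capacity;
            s->next_refill += s->period;
        }
    }

    /* Charge 'used' microseconds of aperiodic execution against the budget
     * and return the time actually granted (0 means the aperiodic thread
     * must wait for the next replenishment). */
    long ds_consume(struct deferrable_server *s, long used)
    {
        long granted = (used < s->remaining) ? used : s->remaining;

        s->remaining -= granted;
        return granted;
    }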

3.7 Composition of Scheduling Policy Objects

From the 9 micro scheduling policy objects described above, we support 28 scheduling policy objects in total. We now explain how to compose micro scheduling policy objects, using the RMRRSS policy as an example. RMRRSS is a scheduling policy object in which periodic threads are executed using rate monotonic scheduling and aperiodic threads are executed using sporadic servers; the aperiodic threads bound to the same sporadic server are scheduled in a round-robin manner. Figure 2 shows the class hierarchy of the RMRRSS scheduling policy in our scheduler. RMAPS is a scheduling policy which manages the rate monotonic scheduling policy as a base scheduling policy together with arbitrary aperiodic server policies. The periodic threads are scheduled by the RM object through a super link, the aperiodic threads bound to aperiodic servers are scheduled by the SS object through an aps link, and the aperiodic threads executed as background threads are scheduled by the RR object through a back link. A sketch of this composition follows.
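The following sketch in C shows one way the explicitly named links could be represented, reusing struct itds_policy and inv_type_t from the earlier policy-interface sketch: the composed policy delegates periodic scheduling through its super link, server-bound aperiodic threads through its aps link, and background threads through its back link. This is an illustration of the structure described above, not the actual implementation.

    /* Reuses struct itds_policy and inv_type_t from the earlier sketch. */
    struct composed_policy {
        struct itds_policy  ops;     /* this policy's own operation table          */
        struct itds_policy *super;   /* base policy (e.g. RM) for periodic threads */
        struct itds_policy *aps;     /* aperiodic server policy (e.g. SS)          */
        struct itds_policy *back;    /* background policy (e.g. RR)                */
    };

    /* Example delegation for choosing the next thread.  The real policy
     * interleaves the aperiodic server at its own rate monotonic priority;
     * the fixed ordering here is only meant to illustrate the three links. */
    struct thread *rmrrss_choose(struct composed_policy *p, inv_type_t inv)
    {
        struct thread *t;

        if ((t = p->super->choose(p->super, inv)) != 0)
            return t;                       /* periodic thread via super link */
        if ((t = p->aps->choose(p->aps, inv)) != 0)
            return t;                       /* server-bound aperiodic via aps link */
        return p->back->choose(p->back, inv); /* background thread via back link */
    }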

4 Evaluation

The basic costs of the ITDS scheduler were measured using a Sony NEWS-1720 workstation (25 MHz MC68030) and a FORCE CPU-30 board (20 MHz MC68030). We evaluated only the single processor environment of the Sony machine. We used an accurate clock on the FORCE board for timing measurements on the NEWS-1720 through the VME-bus backplane; this clock enabled us to measure the overheads with a resolution of 4 µs.

4.1 Preemption Cost Analysis

Before we start analyzing the preemptability of the system, let us first define the basic cost factors. Figure 3 defines the basic cost factors when a higher priority thread preempts a lower priority thread which is executing a system call. C_opr specifies the execution cost of the primitive opr. C_non_int is the worst case execution time of a non-interrupt region where all interrupts are masked; a critical interrupt may be delayed until this non-interrupt region is completed.

[Figure 2: Structure of Policy Module — class hierarchy of the RMRRSS scheduling policy, showing the micro policy classes SchedObj, FIFO, RR, RM, DS, SS, and RMAPS connected by super links, with RMRRSS composed through an aps link, a super link, and a back link.]

[Figure 3: Preemption during a System Call — timeline of thread A, thread B, the interrupt handler, and the scheduler, showing the cost factors C_non_int, C_sys_left, C_int_hdr, C_wakeup, C_int_left, C_sched_call, C_block, C_choose, and C_dispatch.]


    Basic Operation      Cost (µs)
    C_wakeup             72  (†1)
    C_sched_call         72
    C_int_left(clock)    36
    C_block_reincar      672 (†1)
    C_block              84  (†1)
    C_choose             40  (†1)
    C_dispatch           48
    C_null_trap          48
    C_clockint           108 (†2)

    (†1) C_block, C_wakeup, and C_choose are measured under the fixed priority
         scheduling policy (the default) and are policy-specific numbers.
    (†2) This includes the cost of calling the scheduling policy routines,
         but no thread wakeup cost.

Table 1: The Basic Overhead

C_int_hdr is the worst case execution time of the interrupt handlers; an interrupt handler can itself be interrupted by a higher priority interrupt. C_wakeup is the time to wake up a blocked thread. C_int_left is the remaining time after the wakeup until the interrupt is completed. C_sys_left is the remaining execution time of the system call. C_sched is the total scheduling delay time, the sum of C_sched_call, C_block, C_choose, and C_dispatch: C_sched_call is the delay time to switch to the scheduler, C_block is the blocking time for giving up the CPU, C_choose is the selection time for the next thread, and C_dispatch is the context switching time. The results of the measurements are summarized in Table 1. Now, let us consider the preemption cost in Real-Time Mach. The total worst case preemption cost can be defined as

    C_preempt = C_non_int + C_int_hdr + C_wakeup + C_int_left + C_sys_left + C_sched
    C_sched   = C_sched_call + C_block + C_choose + C_dispatch

Under the fixed priority scheduling policy, C_preempt becomes C_non_int + C_int_hdr + C_int_left + C_sys_left + 196 µs. On our target machine, a clock interrupt handler alone requires at least an additional C_clockint = C_int_hdr + C_int_left = 108 µs. In a real-time application, the cost of C_int_hdr + C_int_left can be precomputed based on the system configuration; however, the costs of C_non_int and C_sys_left are operating system specific. In many monolithic kernel-based systems, C_non_int and C_sys_left become relatively high. However, in a microkernel-based system, an ordinary system call becomes preemptible, since its function is implemented in a user-level task. For real-time programs which have shorter deadlines than C_preempt, we need to reduce each cost factor further. To reduce C_non_int and C_sys_left, further kernelization of the current microkernel is required.

4.2 Scheduler Cost Analysis

Achieving higher schedulability and responsiveness requires additional cost compared with simple algorithms such as round-robin scheduling. Each system should select the policy most suitable for its applications.

    Scheduling Operation   Mach Scheduler (µs)   Real-Time Mach Scheduler (µs)
    thread_block                  126                       124
    thread_setrun                  28                        52
    thread_choose                  20                        40
    switch_thread (†)              84                        84

    (†) Both schedulers use the fixed priority policy.

Table 2: The Basic Scheduling Cost

For example, background aperiodic threads can be executed in FCFS, RR, or FP order using an aperiodic server with the rate monotonic policy. Selecting a suitable scheduling policy supported by the ITDS scheduler makes the system flexible and open, because a system designer can build his own optimized scheduler for his application. We measured the additional scheduling overhead caused by the policy/mechanism separation of the ITDS scheduler. Table 2 compares the scheduling overhead of the default fixed priority scheduler in the original Mach kernel and of our modified Real-Time Mach kernel. The results indicate that both versions have the same thread blocking cost; this is due to our Mach scheduling interface layer being optimized for a single processor architecture. However, thread_setrun costs about 1.7 times and thread_choose costs 4 times more than in the original Mach scheduler. Most of the additional cost is due to the policy/mechanism separation. The actual overhead is dominated by thread_setrun, since thread_choose is always called from thread_block in Real-Time Mach; the difference between the two systems is proportional to the number of thread_setrun calls. Table 3 compares the implementation costs among the fixed priority, rate monotonic, and rate monotonic with deferrable server policies. The scheduling overhead alone cannot determine the effectiveness of a scheduling policy, but these results give an estimate of the worst case execution time of each policy. Because of the 32-level priority domain, the fixed priority policy can execute all internal scheduling functions in constant time: each priority is mapped into an index into an array of run queues. In contrast, the rate monotonic policy uses a thread's period as its priority, and the run queue is implemented as a priority queue using a linked list; thus, the cost of making a thread runnable and of choosing the next thread is not constant, but proportional to the number of runnable threads. The rate monotonic with deferrable server policy needs additional selection cost beyond the pure rate monotonic policy, since it must manage the quantum of the aperiodic server. The table shows that the fixed priority policy is least expensive and the rate monotonic policy with sporadic server is most expensive. We conclude that if the task set is not changed dynamically, the fixed priority policy is best, and if it is changed dynamically, the rate monotonic policy is best. Also, if the response time of aperiodic activities is important and the task set has only soft real-time constraints, the rate monotonic policy with deferrable server is a good choice; if hard real-time constraints are required, the rate monotonic policy with sporadic server is a good choice.


    Cost (µs)            FP    RM (p/ap)    RM with DS (p/ap/aps)    RM with SS (p/ap/aps)
    itds_block           24    24           36/84                    36/160
    itds_run             24    24/48        36/60/48                 36/60/48
    itds_choose          24    36/48        48/84/96                 48/84/120
    itds_comp_priority   24    36/60        60/84/84                 60/84/84
    itds_csw_check       24    36/48        48/60/84                 48/60/84

    (†1) The difference is due to whether the current thread is bound to the
         aperiodic server or not.
    (†2) n is the length of a run queue.
    (†3) m is the number of aperiodic servers.
    (†4) The difference is due to a periodic or aperiodic thread.
    (†5) "p", "ap", and "aps" indicate the cost of executing the function for a
         periodic thread, an aperiodic thread, and an aperiodic thread bound to
         an aperiodic server, respectively.

Table 3: A Comparison of Scheduling Policies in ITDS Scheduler

4.3 Using Handoff Scheduling for Optimizing Context Switching

In a context switch, thread_setrun is called first, then the AST is set in that routine, and finally thread_block is called. In this case the cost of a context switch becomes C_setrun + C_sched_call + C_choose + C_block. We provide a new function, switch_thread, which bypasses the scheduling module in order to optimize scheduling performance: switch_thread directly switches from one thread to another without calling the scheduling module. We apply this technique in two cases in Real-Time Mach. The first case is the preemption of threads: when a preemption occurs, the new thread can be determined without calling choose_thread, because the system knows which thread preempts the current thread. In this case, we can eliminate the cost of choose_thread and thread_setrun. The second case is the implementation of a critical section when releasing a lock. The optimization is applied when a lower priority thread knows that a higher priority thread is waiting for the lock to be released. Under the basic priority inheritance protocol, the system can determine the next thread when the lower priority thread calls the unlock primitive, since the higher priority thread's priority was inherited by the lock holder during the critical section. The cost saved by this optimization is C_setrun + C_sched_call + C_choose; moreover, we do not need C_sys_left of rt_mutex_unlock. The modular structure provided by the policy/mechanism separation requires extra cost to support multiple scheduling policies, so bypassing the scheduling module can recover the cost of the policy/mechanism separation whenever the system can determine the next thread without calling the scheduler.
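A minimal sketch in C of the handoff path described above follows. Apart from switch_thread, thread_setrun, and choose_thread, which are named in the text, the helper and its behavior are hypothetical; the sketch only indicates which steps the optimization removes.

    /* Ordinary preemption path (illustrative):
     *   thread_setrun(preemptor);  set AST;  thread_block();
     * which pays C_setrun + C_sched_call + C_choose + C_block.
     *
     * Handoff path: when the preemptor is already known, hand the CPU
     * over directly and skip thread_setrun and choose_thread. */

    struct thread;

    extern void requeue_preempted(struct thread *t);   /* hypothetical helper */
    extern void switch_thread(struct thread *from, struct thread *to);

    void preempt_with_handoff(struct thread *current, struct thread *preemptor)
    {
        /* keep the preempted thread runnable so it can resume later,
         * without going through the full thread_setrun path */
        requeue_preempted(current);

        /* direct switch to the known preemptor: no thread_setrun for the
         * preemptor, no AST, no choose_thread */
        switch_thread(current, preemptor);
    }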

4.4 Response Time Analysis for Aperiodic Threads

The benchmark consists of two threads: the first thread is a periodic thread, and the second is an aperiodic thread. We vary the quantum of the aperiodic server and measure the response time of the aperiodic thread. Thread B becomes runnable 30 ms after the experiment is started. Table 4 shows the thread set of the benchmark. The rate monotonic algorithm can schedule aperiodic threads using aperiodic servers. The benchmark considers the response time of thread B when each respective aperiodic server is used. The following formulas give the predicted response time under each aperiodic server, where AT_i is the arrival time of thread i and REM_i(e) is the remaining execution time of thread i at the time event e occurs.

[Figure 4: Scheduling Path — the normal path from thread A to thread B through call_sched, choose, and block, compared with direct handoff between the two threads via switch_thread.]

    Thread     Period (ms)   Worst Case Exec (ms)   Start Time (ms)   Utilization (%)
    Thread A       500              300                    0               60
    Thread B       100                1                   30               3.3
    Thread A       500              300                    0               60
    Thread B       200                1                   30               3.3

Table 4: Parameters of Threads in Benchmark 1

[Figure 5: Response Time Analysis for Aperiodic Threads — measured worst case response time (WCRT, msec) of thread B versus the quantum of the aperiodic server (100-400 msec) for the background, polling server, and deferrable server cases.]

    WCRT_background = REM_A(AT_B) + WCET_B = 271 ms
    WCRT_polling    = round(AT_B, T_polling)
    WCRT_deferrable = WCET_B = 1 ms

The background server delays the execution of thread B until thread A is completed. The polling server allows thread B to execute only at each polling interval. The deferrable server allows thread B to execute as soon as the thread becomes runnable. The result is shown in Figure 5: we measured the response time while varying the quantum of the aperiodic server. The predicted results and the measured results are very close.

5 Related Work

The advantages of using a microkernel instead of a standard monolithic kernel are its high preemptability, small size, and extensibility. However, only a few microkernels were designed to support a predictable distributed real-time computing environment. For instance, Chorus's microkernel [13] was designed for real-time applications, but the emphasis was placed on rather low-level kernel functions such as providing user-defined interrupt handlers and a preemptive kernel. The kernel uses a wired-in fixed priority preemptive scheduling policy, and no additional features to avoid priority inversion problems have been reported. The V kernel was also intended to support high speed real-time applications [2]. V's optimized message passing mechanism and VMTP protocol can provide basic functions for building distributed real-time applications. However, its wired-in scheduling policy and locking protocol may cause potential priority inversion problems. Amoeba's advantage is its high performance RPC, and it has been used for remote video image transmission over Ethernet.

Like Real-Time Mach, Amoeba can support a set of single board computers without local disks; however, it does not provide any safe mechanisms for creating periodic threads or avoiding priority inversion problems. Several operating systems use object-oriented techniques to enhance the flexibility of the system. In Choices [14], C++ is used to make the structure of the kernel clear, and scheduling policies are also implemented as C++ objects so that a policy can be changed easily. However, Choices does not support the integration of scheduling and synchronization for avoiding priority inversion. The x-kernel [6] and Ficus [4] also use object-oriented techniques; in these systems, objects have syntactically identical interfaces and enhancing the functionality is very easy. The notion of policy/mechanism separation was first introduced in the development of the Hydra operating system at CMU [7]; Hydra implemented a scheduler which consults user-level policy modules.

6 Conclusion and Further Work

We have reported how, using the new real-time thread, synchronization, and scheduling facilities in Real-Time Mach, a user can eliminate unbounded priority inversion problems and can perform schedulability analysis for real-time programs in a single CPU environment. The explicit use of timing constraints in defining a real-time thread was very effective for performing schedulability analysis, and the system interface for creating periodic and aperiodic threads was natural for defining various real-time activities. Real-Time Mach has been operating satisfactorily on a network of SONY NEWS workstations, IBM AT-compatible machines, and DECstations. So far, our focus has been on eliminating various types of unbounded priority inversion and unpredictable runtime behavior among real-time programs in a single CPU environment; we must go on to extend our facilities to support distributed real-time applications. User-level implementation of threads provides a flexible environment because users can define their own model; however, context switching between address spaces is expensive and maintaining the consistency of thread priorities is difficult. In the future, we need to investigate a more efficient and more flexible approach by merging the two approaches.

References

[1] T. E. Anderson, B. N. Bershad, E. D. Lazowska, and H. M. Levy, "Scheduler Activations: Effective Kernel Support for the User-Level Management of Parallelism", In Proceedings of the 13th Symposium on Operating Systems Principles, 1991.

[2] D. R. Cheriton, G. R. Whitehead, and E. D. Sznyter, "Binary Emulation of UNIX Using the V Kernel", In Proceedings of the Summer Usenix Conference, June 1990.

[3] D. Golub, R. Dean, A. Forin, and R. Rashid, "Unix as an Application Program", In Proceedings of the Summer Usenix Conference, June 1990.

[4] J. S. Heidemann and G. P. Popek, "A Layered Approach to File System Development", Technical Report CSD-910007, University of California, Los Angeles, 1991.

[5] G. J. Henry, "The Fair Share Scheduler", AT&T Bell Laboratories Technical Journal, Vol. 63, No. 8, October 1984.

[6] N. Hutchinson and L. Peterson, "The x-kernel: An Architecture for Implementing Network Protocols", IEEE Transactions on Software Engineering, Vol. 17, No. 1, 1991.

[7] R. Levin, E. Cohen, W. Corwin, F. Pollack, and W. Wulf, "Policy/Mechanism Separation in HYDRA", In Proceedings of the 5th Symposium on Operating Systems Principles, November 1975.

[8] J. P. Lehoczky, L. Sha, and J. K. Strosnider, "Enhanced Aperiodic Responsiveness in a Hard Real-Time Environment", In Proceedings of the 8th IEEE Real-Time Systems Symposium, 1987.

[9] J. P. Lehoczky, L. Sha, and Y. Ding, "The Rate-Monotonic Scheduling Algorithm: Exact Characterization and Average Case Behavior", In Proceedings of the 11th IEEE Real-Time Systems Symposium, 1989.

[10] C. L. Liu and J. W. Layland, "Scheduling Algorithms for Multiprogramming in a Hard Real-Time Environment", Journal of the ACM, Vol. 20, No. 1, 1973.

[11] S. J. Mullender, G. V. Rossum, A. S. Tanenbaum, R. Renesse, and H. Staveren, "Amoeba: A Distributed Operating System for the 1990s", IEEE Computer, Vol. 23, No. 5, May 1990.

[12] F. Ross, "FDDI - A Tutorial", IEEE Communications Magazine, May 1986.

[13] M. Rozier, V. Abrossimov, F. Armand, I. Boule, M. Gien, M. Guillemount, F. Herrmann, C. Kaiser, S. Langlois, P. Leonard, and W. Neuhauser, "Chorus Distributed Operating System", Computing Systems Journal, The Usenix Association, December 1988.

[14] V. F. Russo, "An Object-Oriented Operating System", Ph.D. Thesis, University of Illinois at Urbana-Champaign, 1990.

[15] L. Sha, R. Rajkumar, and J. P. Lehoczky, "Priority Inheritance Protocols: An Approach to Real-Time Synchronization", IEEE Transactions on Computers, Vol. 39, No. 9, 1990.

[16] B. Sprunt, L. Sha, and J. P. Lehoczky, "Aperiodic Task Scheduling for Hard-Real-Time Systems", The Journal of Real-Time Systems, Vol. 1, No. 1, 1989.

[17] H. Tokuda, M. Kotera, and C. W. Mercer, "An Integrated Time-Driven Scheduler for the ARTS Kernel", In Proceedings of the 8th IEEE Phoenix Conference on Computers and Communications, March 1989.

[18] H. Tokuda, T. Nakajima, and P. Rao, "Real-Time Mach: Towards a Predictable Real-Time System", In Proceedings of the USENIX Mach Workshop, October 1990.

[19] H. Tokuda and T. Nakajima, "Evaluation of Real-Time Synchronization in Real-Time Mach", In Proceedings of the Second USENIX Mach Symposium, 1991.
