New Algorithms for Admission Control and Scheduling to support Multimedia Feedback Remote Control Applications

Klara Nahrstedt, Computer Science Department, University of Illinois
Jonathan Smith, Computer Science Department, University of Pennsylvania
Abstract
Multimedia feedback remote control applications create a multimedia control loop between devices such as robot hands and cameras and a human user. Such applications are characterized by continuous update of multimedia data passed to and from the remote device(s), e.g., tactile, force and visual information. The supporting multimedia system must provide application-to-application quality of service guarantees between the human user and the remote devices. To provide such guarantees, networked applications need to control, either directly or indirectly, network resources (e.g., in switches) and end-point resources such as memory, CPU and I/O device access [And93]. Resource control must be integrated in a robust admission and scheduling process. We assume that such admission and scheduling exist for network resources such as bandwidth and delay, and focus on an admission and scheduling process for the end-points. In integrating the end-point resources, we discovered several problems, for which we have designed novel admission and scheduling algorithms, namely a multi-level admission service and joint scheduling. We have designed and implemented the algorithms within our OMEGA architecture, which controls the availability of end-point resources needed in full-feedback remote control multimedia applications such as telerobotics. The algorithms are discussed in sufficient detail that they can be easily implemented and adopted by others. The algorithms are validated through experimentation with a telerobotics application executing on an ATM network.
1 Problem Description

New applications enabled by multimedia devices involve the use of sensory data. These new applications are more interesting if distributed, but there are corresponding new research challenges. In particular, computer networks have traditionally been designed with resource-sharing goals in mind, e.g., the common Ethernet shared bus LAN. Emerging switched technologies such as Asynchronous Transfer Mode (ATM) offer the possibility of much greater control of the network subsystem's characteristics, expressed as Quality of Service (QoS) measures. The possibility of such control has inspired a rethinking of the architectures for application-to-application (end-to-end) communications in a distributed multimedia environment.

This work was supported by the National Science Foundation and the Advanced Research Projects Agency under Cooperative Agreement NCR-8919038 with the Corporation for National Research Initiatives. Additional support was provided by Bell Communications Research under Project DAWN, by an IBM Faculty Development Award, and by Hewlett-Packard.

There is an expanding class of applications which, for a variety of reasons (such as insulating remote operators from hazardous materials, or the local unavailability of a complex scientific instrument), require sensory feedback remote control. This class of applications shares many characteristics with teleoperation, which has long been known to require hard real-time characteristics in order to preserve fundamental properties such as stability of the control system (it was first studied in the 1960s for space exploration tasks). We have abstracted the properties of such systems into what we call "Remote Control Multimedia Applications" (Figure 1) and use this abstraction to construct an architecture capable of supporting such applications. Among the most important algorithmic decisions to be made in such an architecture are those associated with the control of the time-sensitive system, such as admission and scheduling. We have developed new joint admission schemes and equivalent joint scheduling for the type of complex systems under study, which are organized logically as multi-level systems.

[Figure 1: A Remote Control Multimedia Application - Telerobotics. The operator side (master arm, display, control and communication software and hardware) is connected through a network to the robot side (slave arm, camera, control and communication software and hardware).]

The remainder of the paper is organized into six sections. Section 2 describes related work in this research area. Section 3 provides a brief logical overview of the OMEGA communication system in which our admission service and scheduling are embedded. Sections 4 and 5 describe the design of the multi-level admission service and joint scheduling, respectively. Section 6 describes our experimental platform and results from telerobotics experiments conducted over an ATM network, and Section 7 summarizes the contributions and suggests promising directions for future exploration.
2 Related Work

Most research on admission and scheduling in distributed multimedia communication systems has focused on network resources, such as bandwidth, network delay and buffer space for queues. Several admission mechanisms, scheduling policies and tests, and buffer allocation schemes to support real-time communication and guaranteed services are presented in [HLG93, MSST91, FV90] and other work. The primary CPU resource, processor occupancy, is analyzed in the real-time systems discipline, which also focuses on scheduling. We further discuss the current state of the art in scheduling in Section 5.1.

To provide application-to-application guarantees, real-time services in the transport subsystems are necessary, but not sufficient. For example, the fact that data has arrived at a network adapter "on time" does not imply that an application receives and processes the data in a timely manner. Real-time communication must be integrated with real-time computing support and applications at network end-points [And93]. Anderson provided a uniform theoretical framework and model, called Metascheduler, for resource reservation to achieve application-to-application guarantees [And93]. In practice, however, the applications and real-time computing are still not fully integrated with the real-time communication. Some orchestration between operating system (OS) and network resources has been tried (e.g., [TTCM92]), but there has been relatively little orchestration among all three types of resources (multimedia devices, OS resources, and network resources) [And93]. Unfortunately, the availability of these resources is interdependent when guarantees are required. These interdependencies expose some serious limitations in the canonical real-time scheduling algorithms, which we address in Section 5.
3 OMEGA Architecture

To provide application-to-application guarantees, we need guaranteed services in the network and at the end-points. As many results illustrating methods to provide such services are now mature, we assume that our communication architecture resides on top of guaranteed network services. We concentrate our research efforts on providing end-point guaranteed services. For resource admission service and scheduling, two issues need to be addressed: (1) the communication system model, and (2) the model for end-point resources.
3.1 Communication Model
The communication system is modeled as a two-layer system, illustrated in Figure 2. We call this system architecture the OMEGA Architecture, alluding to its location at system end-points.

[Figure 2: The OMEGA Architecture Communication Model. Components shown: the application subsystem with the Real-Time Application Protocol (RTAP) and call management, the transport subsystem with the Real-Time Network Protocol (RTNP) and connection management, the QoS Broker, and the network hardware.]

The transport subsystem layer in our model combines the roles of the network and transport layers using Integrated Layer Processing [CT90]. Functions such as connection management, forward error correction, timing failure detection and timely data movement form the core of the Real-Time Network Protocol (RTNP). The application subsystem layer contains the functions of the application and session layers, such as call management, rate control of multimedia devices, input/output functions (e.g., display of video), fragmentation of application protocol data units (APDUs), and integration/disintegration of APDUs. These functions are the core of the Real-Time Application Protocol (RTAP).

Both subsystems must provide guarantees useful to a scheduler for the calls/connections they service end-to-end. Therefore, they require guarantees on the resources needed for the communication. Resource guarantees are negotiated during the call establishment phase by the QoS Broker protocol [NS95] (Figure 3), which is an addition to the communication architecture. The broker orchestrates both local and global end-point resource availability. Local resource availability is achieved by using services such as translation (between application QoS and network QoS) and admission. For global resource availability, the broker uses a negotiation service between the end-points and relies on network resource guarantees provided by the network subsystem, e.g., by B-ISDN switches. The goal of the broker is to negotiate a resource contract among all the system components (application, OS, network). The broker assumes different roles (seller and buyer) to distinguish between the participating partners.

[Figure 3: The QoS Broker Concept. A buyer-side and a seller-side QoS Broker. Local availability of resources (application subsystem, local operating system and transport subsystem resources) is established through QoS translation and admission; global availability of resources (remote resources, network resources) is established through negotiation.]
3.2 The Resource Model and its representation

At the end-point, three logical groups of resources must be managed: multimedia devices; CPU scheduling and memory allocation; and network resources. We parameterize all end-point resources through Quality of Service (QoS) parameters maintained in small databases, which represent the requirements for the resources [NS95]. The resources in each domain (application, OS, network) maintain domain-specific representations. This, of course, gives rise to multiple views of QoS. Application requirements for multimedia devices are specified through application QoS parameters. For example, video quality is described by frame rate (30 frames/s), frame size (height * width in pixels), color (bits/pixel), etc. The network QoS parameters describe the requirements for the network resources, such as packet rate, packet loss, jitter, and end-to-end delay. The system QoS parameters describe the requirements on CPU scheduling and buffer allocation, such as task start times, durations, and deadlines.

These multiple QoS views must map to a set of resources to be coordinated for management, so they are translated into one another. This is done by services included in the QoS Broker. For example, the translation between application and network QoS is done by the QoS Translator [NS94a]. These different QoS representations are also used by the multi-level admission service, described in the next section.
4 Admission Control

Admission control, where an activity is either admitted to the pool of managed entities or excluded from this pool, is essential to the resource allocations required for guaranteed services. For distributed multimedia communication systems, the availability of each resource along the path(s) between source(s) and sink(s) must be known. A formalized general state machine for resource admission is shown in Figure 4.

[Figure 4: Admission service - state machine. From START: receive 'admit request' with QoS parameters; map QoS parameters into required resources; check resource availability. A sender or intermediate node answers 'reserve', reserves resources, and waits for a response; on receiving 'accept' it allocates the resources, on 'reject' it frees them. A receiver answers 'accept' and allocates resources, or answers 'reject'; 'reject' is also the answer whenever there are not enough resources. The service then sends 'admit ack' with the response and reaches STOP.]

In our system, the admission service has two main functions: (1) map between application/network QoS parameters and the system parameters; and (2) check the availability of resources and, based on these tests, reply (reserve or reject) with feasible QoS parameters to the negotiation process on the buyer side. At the seller side, the admission service provides the answer accept or reject and sends the response back to the buyer with possible QoS parameters.
The QoS Broker performs admission control at both layers of the OMEGA system illustrated in Figure 2. For ease of implementation (see Section 6, below), we assumed networked multimedia applications with periodic media streams (e.g., uncompressed video, sensory data). The admission tests of our prototype are therefore limited to providing guarantees for this type of traffic. Aperiodic requests (tasks) may occur (e.g., a QoS renegotiation/resource adaptation request), but our scheduler polls periodically for these requests and treats them as deadline-driven requests (tasks).

We assume that all tasks (application and network) processing messages are non-preemptive basic tasks (e.g., read a sensory sample, read a video frame from a video device). Many multimedia communication systems (e.g., HeiTS, METS), when testing for schedulability, assume preemptive scheduling algorithms. These algorithms assume that any message can be suspended at any time, with a small overhead, in order to transmit a higher priority message. However, in real communication systems this is rarely practical. We adopt a non-preemptive scheduling algorithm. Non-preemptive algorithms are relatively easy to implement, but a high priority message can be blocked by a long low priority message. This is called priority inversion [R.L94].

The application/network QoS parameters (sample rate, packet rate, importance, sample size, packet size) are mapped onto the system parameters: (1) task priorities, (2) task periods, and (3) buffer space requirements. Task priorities are inherited from the importance of the sample and are equivalent to priorities assigned according to task deadlines. For network tasks, the task priorities are inherited from the application tasks; this priority inheritance is what allows guarantees to hold across both subsystems. Task durations ($e$), as well as the specification of tasks (names of the real-time protocol functions) in communication protocols, are pre-computed and stored a priori in the system database. The task duration depends on the sample/packet size. The task period ($P$) is computed as the inverse of the sample rate/packet rate. The sample/packet size, fragmentation/reassembly, mixing/splitting, and error correction mechanisms determine the space requirements in the system QoS (both subsystems).

In our communication protocols, we allocated at least $2 M_A$ of space for each unidirectional channel for ring buffers, so that one sample can be loaded from/to a multimedia device while another sample is sent/received to/from the transport subsystem.
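The mapping from application/network QoS parameters to system parameters can be sketched as follows. This is a minimal illustration, not the OMEGA implementation: the class and field names (`StreamQoS`, `SystemParams`, `to_system_params`) are hypothetical, and we assume that $M_A$ denotes the sample size in bytes.

```python
from dataclasses import dataclass


@dataclass
class StreamQoS:
    """Application/network QoS of one periodic stream (hypothetical field names)."""
    rate_hz: float     # sample rate or packet rate, in units per second
    importance: int    # importance of the sample; larger means more important
    unit_bytes: int    # sample or packet size in bytes (assumed to be M_A)


@dataclass
class SystemParams:
    """System QoS parameters derived for the scheduler and buffer manager."""
    period_s: float    # task period P, the inverse of the rate
    priority: int      # inherited from the importance of the sample
    buffer_bytes: int  # ring-buffer space for one unidirectional channel


def to_system_params(qos: StreamQoS) -> SystemParams:
    """Map one stream's QoS parameters onto period, priority and buffer space."""
    if qos.rate_hz <= 0:
        raise ValueError("rate must be positive")
    return SystemParams(
        period_s=1.0 / qos.rate_hz,        # P = 1 / rate
        priority=qos.importance,           # priority inheritance
        buffer_bytes=2 * qos.unit_bytes,   # at least 2 * M_A per channel
    )


# Illustrative values only: a 50 Hz sensory stream with 64-byte samples.
sensor = to_system_params(StreamQoS(rate_hz=50.0, importance=2, unit_bytes=64))
print(sensor)
```

Task durations are deliberately left out of the sketch, since the text states that they are measured beforehand and looked up in the system database rather than computed from the QoS parameters.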
4.1 Admission Service in the Application Subsystem

The admission service performs four tests at the application subsystem level: a device quality test, a local schedulability test, an end-to-end delay test, and a buffer allocation test. These tests check the availability of multimedia devices and system resources for the RTAP. Using the naming convention presented in Tables 1 and 2, the tests are summarized in Table 3 and discussed below:
The device quality test compares the configuration parameters of the multimedia devices with the specified application QoS requirements. For example, if a video device can provide a maximal frame rate of 15 frames/second and the user specifies the application QoS sample rate as 30 frames/s, then the admission service either rejects the QoS requirement and waits for correct user input, or "falls back" to the possible QoS and informs the user of the change.

Table 1: Basic notation

Application Subsystem
Symbol | Meaning
$i$ | number of sent/received streams
$i_l$ | $l$-th sample on stream $i$
$d$ | direction of stream (in, out)
$r$ | number of RTAP tasks
$r(i)$ | number of RTAP tasks per stream $i$
$r(d,i)$ | number of RTAP tasks per stream $i$ in direction $d$
$cs_A$ | RTAP tasks context switch time
$j$ | number of cs among RTAP tasks
$s$ | number of schedulable intervals
$e_A$ | processing time of a RTAP task
[?]$_A$ | RTAP task
$t^B_{A_r}$ | time begin of RTAP task $r$
$t^E_{A_r}$ | time deadline of RTAP task $r$
$i \rightarrow i'$ | relation: $i$ precedes $i'$

Transport Subsystem
Symbol | Meaning
$k$ | number of connections
$d$ | direction of connection
$m$ | number of RTNP tasks
$m(k)$ | number of RTNP tasks per connection $k$
$cs_N$ | RTNP tasks context switch time
$n$ | number of cs among RTNP tasks
$k(i)$ | number of connections per stream $i$
$k(d)$ | number of connections in direction $d$
$e_N$ | processing time of a RTNP task
[?]$_N$ | RTNP task
$t^B_{N_m}$ | time begin of RTNP task $m$
$t^E_{N_m}$ | time deadline of RTNP task $m$
The local schedulability test takes the system QoS parameters which specify the application tasks for processing of multimedia streams and checks if the tasks are schedulable. The behavior of the considered application tasks allows us to test the tasks as if they were scheduled using the Earliest Deadline First (EDF) policy. For this kind of scheduling, Liu and Layland provide a schedulability test in [LL73]. However, because our tasks are non-preemptive, the schedulability test must be altered (footnote 1). The altered test in the application subsystem is test #(1) in Table 3. For each stream $i$ in direction $d$, the deadline test #(2) in Table 3 must hold. If the schedulability test #(1) cannot be met, the stream with the later deadline (lower rate) will be rejected. If the schedulability test is satisfied, the task precedences are assigned according to their deadlines (highest priority is assigned to the earliest deadline). If there are input and output tasks with the same period, the input tasks take precedence over the output tasks.

Footnote 1: The schedulability test is tighter for non-preemptive tasks:

$$\sum_{i=1}^{n} \frac{e_i}{P_i} \;\le\; \frac{1}{\min_i(P_i)} \sum_{i=1}^{n} e_i \;\le\; 1$$

Table 2: Naming convention for admission services.

Variable Name | Relation | Meaning
$SI$ | $SI = \frac{\mathrm{lcm}(P_A^1, \ldots, P_A^n)}{\min(P_A^1, \ldots, P_A^n)}$ | max. number of schedulable intervals
$T_A^i$ | $T_A^i = \sum_{r(i)} e_A^{r(i)}$ | time of set of RTAP tasks per medium (sample) $i$
$T_A^{S(i_l)}$ | $T_A^{S(i_l)} = \sum_{r(in,i_l)} e_A^{r(in,i_l)}$ | time to process sending sample $i_l$
$T_A^{R(i_l)}$ | $T_A^{R(i_l)} = \sum_{r(out,i_l)} e_A^{r(out,i_l)}$ | time to process receiving sample $i_l$
$T_A^{1,\ldots,i-1}$ | $T_A^{1,\ldots,i-1} = \sum_d \sum_i \sum_r e_A^{d,i,r} + \sum_j cs_A^j$ | guaranteed time to process $(i-1)$ streams in $\min_{d,i}(P_A^{d,i})$
$T_N^k$ | $T_N^k = \sum_{m(k)} e_N^{m(k)}$ | time of set of RTNP tasks per connection (packet) $k$
$T_N^{1,\ldots,k-1}$ | $T_N^{1,\ldots,k-1} = \sum_d \sum_k \sum_m e_N^{d,k,m} + \sum_n cs_N^n$ | guaranteed time to process $(k-1)$ connections in $\min_{d,i}(P_A^{d,i})$
$WFF$ | $WFF = 2 \cdot HHD + \left(\sum_{k(i)} T_N^{out,k} + T_A^{R(i)}\right) + \left(\sum_{k(i')} T_N^{in,k} + T_A^{S(i')}\right)$ | wait for feedback time ($i \rightarrow i'$)
The end-to-end delay (EED) test consists of two steps. At the buyer side, test #(3) takes the durations of the local buyer application tasks and checks them against the specified QoS EED bound ($C_A$). Here we make sure that the tasks, although schedulable, do not violate the EED requirement. This is especially important in cases where $P_A > C_A$. Sensory data in telerobotics may exhibit such behavior (e.g., the task period is $P_A = 20$ ms and the EED bound is $C_A = 10$ ms). At the seller side, all processing times of application tasks $r(i)$ and network tasks $m(k)$ over connections $k(i)$ which carry the medium $i$, and the actual network latency HHD (Host-to-Host Delay), are taken into account. Test #(3') must hold.

The buffer allocation test checks whether there is enough memory space for the ring buffers assigned to multimedia devices to lock them in real memory and smooth the traffic jitter.
Smoothing traffic is required when measured EED < requested EED. Real-time networked applications want the right data at the right time (requested EED), not sooner or later (although sooner is still better than later). The ring buffers are pinned into real memory, hence test #(4) holds in our system. The size $2 M_A^i$ is then locked in memory. 32 MBytes is the upper bound that can be allocated as a pinned region for user processes in the AIX 3.1 system; this value is implementation-specific and will vary with the OS environment.

Table 3: Admission tests in the application subsystem for stream $i$.

Admitted Resources | Admission Tests | Test #
CPU for $i$ in direction $d$ | $T_A^{1,\ldots,i-1} + \sum_{r(d,i)} e_A^{r(d,i)} + \sum_{j(i)} cs_A^{j(i)} \le \min_{d,i}(P_A^{d,i})$ | (1)
Deadline for $i$ in direction $d$ | $\sum_{r(d,i)} e_A^{r(d,i)} \le P_A^{d,i}$ | (2)
EED for $i$ at buyer side | $\sum_{r(d,i)} e_A^{r(d,i)} < C_A^i$ | (3)
EED for $i$ at seller side | $T_A^{R(i)} + T_A^{S(i)} + \left(\sum_d \sum_{k(i)} T_N^{d,k} + HHD\right) \le C_A^i$ | (3')
Buffer for $i$ | $2 M_A^i < 32$ MBytes | (4)
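The buyer-side tests of this subsection can be sketched as a handful of predicates. This is a hedged illustration under our reading of tests #(1), #(2), #(3) and #(4); the function names and the example durations are hypothetical, and all times are integers in microseconds to keep the comparisons exact.

```python
from typing import List, Sequence

# Upper bound on pinned memory quoted in the text for AIX 3.1
# (implementation-specific; other systems will differ).
PINNED_LIMIT_BYTES = 32 * 1024 * 1024


def cpu_test(t_admitted: int, task_times: Sequence[int],
             cs_times: Sequence[int], periods: Sequence[int]) -> bool:
    """Test (1): time already guaranteed to earlier streams, plus the new
    stream's task and context-switch times, must fit in the shortest period."""
    return t_admitted + sum(task_times) + sum(cs_times) <= min(periods)


def deadline_test(task_times: Sequence[int], period: int) -> bool:
    """Test (2): the stream's own tasks must finish within its period."""
    return sum(task_times) <= period


def eed_buyer_test(task_times: Sequence[int], eed_bound: int) -> bool:
    """Test (3): local task durations must stay below the EED bound C_A."""
    return sum(task_times) < eed_bound


def buffer_test(sample_bytes: int, limit: int = PINNED_LIMIT_BYTES) -> bool:
    """Test (4): two samples' worth of ring buffer must fit in pinned memory."""
    return 2 * sample_bytes < limit


def failed_tests(t_admitted: int, task_times: Sequence[int],
                 cs_times: Sequence[int], period: int,
                 all_periods: Sequence[int], eed_bound: int,
                 sample_bytes: int) -> List[str]:
    """Return the names of the failed tests; an empty list means 'admit'."""
    checks = {
        "cpu": cpu_test(t_admitted, task_times, cs_times, all_periods),
        "deadline": deadline_test(task_times, period),
        "eed": eed_buyer_test(task_times, eed_bound),
        "buffer": buffer_test(sample_bytes),
    }
    return [name for name, passed in checks.items() if not passed]


# A sensory stream with P_A = 20 ms and C_A = 10 ms, the case singled out
# in the text (period longer than the EED bound). Durations are made up.
fast = failed_tests(0, [2000, 1000], [500], 20000, [20000], 10000, 64)
# Same stream with slower tasks: still schedulable, but the EED bound fails.
slow = failed_tests(0, [8000, 4000], [500], 20000, [20000], 10000, 64)
print(fast, slow)
```

The second call shows why the EED test is needed in addition to schedulability: 12 ms of processing fits comfortably in a 20 ms period but cannot meet a 10 ms end-to-end bound.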
4.2 Admission Service in the Transport Subsystem

The admission service at the transport subsystem level tests network resources (a throughput test, a rate control test, and a network EED test) and system resources (a CPU schedulability test for RTAP/RTNP). Table 4 summarizes the admission tests in the transport subsystem.
The throughput test controls the assignment of bandwidth to individual connections, so that no more bandwidth is allocated than the transport subsystem can support. Currently, the test does not specify to the underlying services how to control the shared bandwidth. The upper bound of available aggregate throughput at the end-point is determined by the network host interface and its device driver. For example, in our experimental system the ATM host interface (hardware) provides a transmission rate of 155 Mbps; however, the ATM transport subsystem after overhead provides 135 Mbps [ST93]. So any throughput requested for the sending or receiving connections is checked against the 135 Mbps bound (Test #(6)). (RTNP uses raw ATM; this bound will vary with the protocol - e.g., a protocol with more copies and checksums might only get 70 Mbps.)

Table 4: Admission tests in transport subsystem

Admitted Resources | Admission Tests | Test #
Bandwidth | $\sum_{k(d)} B^{k(d)} \le 135$ Mbps | (6)
Rate Control | $\sum_d \sum_k R_N^{d,k} \le 1000$ | (7)
CPU for $k(i)$ in direction $d$ | $T_A^{1,\ldots,i} + T_N^{1,\ldots,k-1} + \sum_{m(d,k(i))} e_N^{m(d,k(i))} \le \min_{d,i}(P_A^{d,i})$ | (8)
Deadline for $k(i)$ in direction $d$ | $\sum_{m(d,k(i))} e_N^{m(d,k(i))} \le P_N^{d,k}$ | (9)
CPU for $k(i')$ if $(T_A^{R(i_l)} \rightarrow T_A^{S(i_l)})$ | $T_A^{S'(i_l)} + T_A^{R(i_l)} + T_N^{in,k(i_l)} + T_N^{out,k(i_l)} + WFF \le P_A^i$ | (10)
 | $T_A^{1,\ldots,i} + T_N^{1,\ldots,k-1} + WFF + \sum_{m(d,k(i))} e_N^{m(d,k(i))} \le \min_{d,i}(P_A^{d,i})$ | (10')
EED for $k$ at buyer side | $\sum_{m(k)} e_N^{d,m(k)} \le C_N^{d,k}$ | (11)
EED for $k$ at seller side | $\sum_d \sum_{m(k)} e_N^{d,m(k)} + HHD \le C_N^{d,k}$ | (11')
The rate control test checks the number of network packets per second, moved from/to user
space to/from the network host interface, against a certain bound (in our implementation, 1000). This bound results from the OS cost (due to overhead) of moving data between the user and kernel space (Test #(7)).
The end-to-end delay test checks the duration of network tasks at the end-points against the
required end-to-end delay bound. The same approach as in the application subsystem with respect to buyer #(11) and seller sides #(11') must be considered here.
The schedulability test checks the schedulability of all tasks (application and network tasks). Scheduling at the transport subsystem level, where we test schedulability of application and network tasks sharing a single processor, must consider the following time dependencies:

1. Time dependencies between application and network tasks
EDF and the priority assignment discussed for the application subsystem cannot be used directly in the transport subsystem. The application and network tasks share a single processor and are mutually time dependent, and transport tasks are typically not strongly periodic, unlike application tasks. The dependency (precedence $\rightarrow$ [NS94b]) relation is, for example, read_sample($i_l$) $\rightarrow$ send_packet($k(i_l)$) $\rightarrow$ send_packet($k(i_l)$) if fragmentation of sample $i_l$ is required. A further implicit precedence between application and network tasks is receive_packet($k(i_l)$) $\rightarrow$ write_sample($i_l$). The priority is assigned by the application subsystem to the application tasks (according to the deadline). Network tasks must inherit these priorities in order to enforce joint scheduling. The schedulability tests in the transport subsystem for this type of dependency are #(8) and #(9). The network tasks $T_N^k$, added to $T_A^{1,\ldots,i}$ in test #(8), might violate the schedulability test, so some tasks might be rescheduled to the next interval(s). In the case of sending tasks, sending network tasks are rescheduled to the next interval(s) if they satisfy the network EED tests #(11,11') (footnote 2). In the case of receiving tasks, the application task might be rescheduled. Again, the EED tests #(11,11') need to be checked.

2. Time dependencies between input/output streams
When testing for schedulability of tasks at the end-points, other types of time dependencies might occur and must be considered. For example, Figure 5 shows sensory data dependency relations in a telerobotics application, where the operator sends position data $i_l$, and the slave receives the data and returns the force feedback data $f(i_l)$. The application would like to receive $f(i_l)$ so that the computation of sample $i_{l+1}$ can be based on $f(i_l)$ (write($f(i_l)$) $\rightarrow$ read($i_{l+1}$)). If this kind of dependency occurs, a wait for feedback (WFF) time interval must be included in the schedulability test, because the input and output stream information are interdependent. The schedulability tests for these types of dependencies at an end-point (e.g., the operator side in telerobotics) are #(10,10'). The knowledge of the WFF time can be utilized for scheduling another task that serves a different medium. At the slave side the schedulability test #(8) can be used.

Footnote 2: The number of possible intervals to schedule a task is SI (Table 2).

[Figure 5: Distributed scheduling - precedence graph (example). Operator and slave timelines, each with application (App.) and network (Net.) tasks, plotted against progressing time: the operator sends sample $i_l$ across the network (host-host delay), the slave processes it and returns $f(i_l)$ (host-host delay), and the operator waits for feedback in between.]

The QoS Broker gets the application precedence relations from the user (through application QoS parameters) and, together with the implicit application/network precedence relations, creates a precedence graph. This graph is combined with other data to generate a schedule; see Section 5.2.
The buffer allocation test is needed if the network tasks queue the incoming/outgoing packets. Buffers are allocated wherever queues are built.
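The simpler transport-level checks can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the limits are the values quoted for the authors' experimental system (135 Mbps and 1000 packets per second), and the wait-for-feedback computation assumes that WFF is twice the host-to-host delay plus the network and application processing times spent on the feedback path, which is our reading of Table 2.

```python
from typing import Sequence


def throughput_test(bandwidths_mbps: Sequence[float],
                    limit_mbps: float = 135.0) -> bool:
    """Test (6): aggregate bandwidth of all connections in one direction
    must not exceed what the transport subsystem can deliver."""
    return sum(bandwidths_mbps) <= limit_mbps


def rate_control_test(packet_rates: Sequence[int], limit: int = 1000) -> bool:
    """Test (7): total packets per second crossing the user/kernel boundary
    must stay within the OS-imposed bound."""
    return sum(packet_rates) <= limit


def wait_for_feedback(hhd: int,
                      net_recv_times: Sequence[int], app_recv_time: int,
                      net_send_times: Sequence[int], app_send_time: int) -> int:
    """Assumed WFF: two host-to-host delays plus the processing spent on
    receiving the sample and sending the feedback (same time unit throughout)."""
    return (2 * hhd
            + sum(net_recv_times) + app_recv_time
            + sum(net_send_times) + app_send_time)


# Illustrative values: one video and one sensory connection.
print(throughput_test([100.0, 30.0]))      # within the 135 Mbps bound
print(rate_control_test([900, 150]))       # exceeds 1000 packets/s
# Times in microseconds (made up): HHD of 2 ms, short protocol tasks.
print(wait_for_feedback(2000, [300], 500, [300], 700))
```

Because the WFF interval is idle time on the waiting end-point, a scheduler that knows its length can place tasks of another medium inside it, as the text notes.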
5 Joint Scheduling

An important goal of providing end-to-end guarantees in distributed multimedia systems is a feasible and semantically correct schedule of application and transport tasks. Real-time multimedia distributed applications use complex programs to handle their multimedia tasks, as well as the sending and receiving of multimedia streams, within a single process. Such programs either use processes/threads for each stream and rely on the OS scheduler, or use stream procedures in one complex control program where the scheduling of procedures relies on a local customized scheduler. We provide automatic generation of such application-dependent local customized schedules through the QoS Broker. The user provides the application QoS specification of the multimedia streams and the broker derives a feasible and semantically correct schedule. This automation is achieved under the assumption that we have a fully schedulable protocol stack, as the OMEGA architecture does. The broker retrieves the required protocol tasks (functions) at the application and transport subsystem level from the system QoS profile. As discussed in Section 4, we assume that all tasks (application and network) in RTAP/RTNP are non-preemptive and periodic (e.g., read a sensory sample, read a video frame from a video device).
5.1 Current State of the Art Many real-time multimedia applications use currently Rate-Monotonic (RM) or Earliest Deadline First (EDF) policies to schedule periodic streams. Both algorithms, for preemptive and nonpreemptive tasks, have been studied [LSD89, SLR86, SKG91], weakening the assumptions of Liu and Layland [LL73]. One important assumption remains and that is that the tasks are independent. Applying this to OMEGA, and assuming that the multimedia streams in the same directions are independent, the behavior of application tasks allows one to schedule tasks according to the Earliest Deadline First (EDF) policy (i.e., synchronization relation in Application QoS is FALSE). However, when considering application and network tasks together, they are not independent. If they were, the schedule shown in Figure 6 would be feasible according to EDF, but it is semantically incorrect. Similar examples can be found for the dependencies between input and output streams (described for the full-feedback remote control application in Section 4.2) and other dependencies (e.g., stream synchronization). 7
Application Task (Read a Sample) Network Task (Send Packet)
4
4
t 30
15
30 t
15
30
Resulting Schedule
Figure 6: Feasible, but incorrect schedule. Therefore, a new approach must be applied. An important area of real-time scheduling research is devoted to problems such as the eect of blocking caused by the need for synchronization of jobs that share logical or physical resources, creating a critical region. While a synchronization relation is a dependency relation, it is also an equivalence relation, i.e., the fact that task A must be synchronized with task B is equivalent to B's requirements for A (there is no semantic context of precedence). Mok [Mok83] developed a procedure to generate feasible schedules with a kernelized 14
monitor for a set of non-preemptive periodic tasks with synchronization relations, meaning that the procedure doesn't allow preemption of a job within the critical section. The synchronization problem was further investigated in the context of priority-driven preemptive scheduling in [SRL90]. What can happen is that due to synchronization, one task with lower priority working in critical region can block tasks with higher priority. One approach to solve the problem is to use the priority inheritance protocols in existing synchronization primitives. For OMEGA, there is a synchronization relation between application and network tasks processing the same packet. This is a special case of a synchronization relation, a precedence relation, which adds ordering constraints and thus loses equivalence. Precedence-constrained scheduling at the transport system level is discussed in [HS92, SRC85]. In both, the graph of precedence constrained modules is obtained from the semantics of the communication. A time slicing approach for application tasks with precedence relations is taken in [NS94b]. Natale and Stankovic take a task precedence graph of robotics tasks and applying breadth- rstsearch assign time slices to sets of tasks at dierent processors using laxity metric. Another approach to precedence relations with respect to reservation is shown in [And93]. Anderson examines resources which handle streams of continuous media, such as devices, CPU, and network. He parameterizes them uniformly using the Continuous Media resource model by message size, message rate and workahead limit. A client makes a reservation, called a session, prior to using it, as follows: During the establishment phase, the reservation protocol creates a pipeline of required resources for one stream (sends a request message from the sender to the receiver) and gets a sequence of sessions, called compound session. Then the protocol reserves resources traversing the compound session. 
However, when it comes to scheduling, Anderson does not generate a schedule and does not use the sequence of sessions for scheduling decisions; he relies on the OS scheduling policy. He assumes that per session, over one non-CPU resource, there exists one process which does the work for the session, and that all processes are scheduled according to a deadline-workahead policy, which can be implemented by split-level scheduling [GA91]. In the deadline-workahead policy the critical processes (real-time processes) are scheduled according to EDF. This means that if an application creates one process (representing work over a multimedia device) and the transport system another (e.g., for a network adapter), at the scheduling level they differ only in their deadlines, and that can cause problems, as Figure 6 shows. Precedence graphs and the time-slicing approach are best suited to our problem of scheduling multimedia streams with precedence requirements on single processors. We developed a new algorithm for generation of feasible and semantically correct schedules.
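One way a deadline-only policy can go wrong is sketched below in C. This is an illustration under stated assumptions, not Anderson's scheduler or the OMEGA code: the `task_t` record, its fields, and both function names are hypothetical. The first function orders tasks purely by deadline, as plain EDF does; the second additionally refuses to pick a task whose predecessor has not finished.

```c
#include <stddef.h>

/* Hypothetical task record for illustration; not taken from OMEGA. */
typedef struct {
    long deadline;  /* absolute deadline in time units */
    int  pred;      /* index of the task that must finish first, or -1 */
    int  done;      /* nonzero once the task has completed */
} task_t;

/* Plain EDF: earliest deadline among unfinished tasks, precedence ignored. */
int edf_pick(const task_t *t, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (t[i].done) continue;
        if (best < 0 || t[i].deadline < t[best].deadline) best = (int)i;
    }
    return best;
}

/* EDF restricted to tasks whose predecessor (if any) has completed. */
int edf_pick_with_precedence(const task_t *t, size_t n)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (t[i].done) continue;
        if (t[i].pred >= 0 && !t[t[i].pred].done) continue;
        if (best < 0 || t[i].deadline < t[best].deadline) best = (int)i;
    }
    return best;
}
```

With an application task (deadline 20) that must precede a network task (deadline 15), `edf_pick` selects the network task first, whereas `edf_pick_with_precedence` selects the application task.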
5.2 Algorithm for Generation of a Distributed Schedule for Multimedia Streams
The scheduling is divided into two phases, a pre-processing phase and a run-time phase. The pre-processing phase is itself done in two steps. In the first step, the duration of the protocol tasks is measured off-line and stored in a system parameter profile. The second step is performed by the QoS Broker (admission service) where, based on (1) the QoS specification, (2) the protocol stack, and (3) the precedence relations between different streams, the tasks are mapped into time slices and a resulting feasible and correct schedule is suggested to the distributed application. That is, the configurable application and network protocol functions are ordered together into a precedence graph and, using time slicing, mapped into a sequence of time slots that are executed by the scheduler in the specified order. Each broker entity (buyer and seller) computes its own feasible schedule. During the run-time phase, the transmission system serving the application retrieves the suggested schedule at each end-point and schedules the protocol tasks accordingly. The local schedulers run in parallel. Appendix A details the computation of the precedence graph and the slicing.
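The step of turning a precedence graph into an execution order can be sketched generically as follows. This is a minimal illustration using a standard topological ordering (Kahn's algorithm), not the algorithm of Appendix A; the adjacency-matrix representation, the `MAX_TASKS` bound, and the function name are assumptions made for the example.

```c
#include <stddef.h>

#define MAX_TASKS 16  /* illustrative bound, not from the paper */

/*
 * edge[i][j] != 0 means task i must run before task j.
 * Writes a precedence-respecting order into order[] and returns the
 * number of tasks placed; a result smaller than n indicates a cycle.
 * The caller must ensure n <= MAX_TASKS.
 */
size_t order_by_precedence(size_t n, int edge[MAX_TASKS][MAX_TASKS],
                           int order[MAX_TASKS])
{
    int indeg[MAX_TASKS] = {0};
    int placed[MAX_TASKS] = {0};
    size_t count = 0;

    /* Count incoming precedence edges for every task. */
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            if (edge[i][j]) indeg[j]++;

    while (count < n) {
        int next = -1;
        for (size_t j = 0; j < n; j++)      /* lowest-index ready task */
            if (!placed[j] && indeg[j] == 0) { next = (int)j; break; }
        if (next < 0) break;                /* cycle: no task is ready */
        placed[next] = 1;
        order[count++] = next;
        for (size_t j = 0; j < n; j++)      /* release its successors */
            if (edge[next][j]) indeg[j]--;
    }
    return count;
}
```

The resulting order could then be cut into time slots of fixed length; how tasks are assigned to slots in OMEGA is defined by Appendix A, not by this sketch.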
5.3 Example
Figure 7 gives an example application of Appendix A's algorithm in graphical form. We consider, at the buyer side, registration of (1) one sensory stream in direction in (application task period 20 time units; one-to-one translation), (2) one sensory stream in direction out (application task period 20 time units; one-to-one translation), and (3) one video stream in direction out (application task period 60 time units; one-to-one translation). The lcm is 60 time units, and the number of intervals, each scheduled differently, is $SI = 3$. The intervals are labeled $s_1$, $s_2$, $s_3$. The tasks are labeled according to Tables 1 and 2.

First, the user registers the sending sensory stream. The broker-buyer at the application layer gets the corresponding task $T_A^{S(sen)}$, checks the schedulability and end-to-end delay and, if they are satisfied, appends the task to the scheduler list in interval $s_1$ (as "reserved"). Going through the brokerage process at the transport level, the broker gets the corresponding network task $T_N^1$ and runs the tests on both tasks $T_A^{S(sen)}$ and $T_N^1$. If the tests are satisfied, it appends the network task to the task queue in $s_1$. After getting an accept from the broker-seller, the broker-buyer changes the state of the scheduled task to "allocate".

In the second step, the receiving sensory stream and the video stream are registered with the broker-buyer. The broker first takes the stream with higher importance (a parameter in the media quality application QoS) and computes gcm from the periods of both sensory streams (the one already accepted and the one to be admitted). Next, the broker gets the corresponding application task $T_A^{R(f-sen)}$ and, according to EDF, orders it in the task list with respect to the previous application tasks. (In our implementation we keep the application precedence list separately from the global precedence list.) At the transport level, the network task $T_N^2$, including the application precedence relation (force feedback data should be received before the next position is sent out), is tested together with the corresponding application task $T_A^{R(f-sen)}$ and all previously accepted tasks. If the tests are positive, the tasks are appended to the global scheduler according to their precedence relations.

In the third step, the corresponding application task $T_A^{R(v)}$ for video and the network tasks $T_N^{3(v_1)}$, $T_N^{3(v_2)}$, which transport video fragments, are appended to the precedence graphs on both levels. Note that when a video stream is included the number of intervals increases to 3. Hence, the previous tasks must be copied before the corresponding video tasks are appended. The reason is that the copied tasks are already accepted.

Figure 7: Precedence graph creation and mapping to time slicing. [Figure content: registration of the streams (sensory data send, feedback sensory data receive, video receive), the application precedence graph, the global precedence graph, and the time slicing. Interval $s_1$: $T_A^{S(sen)}$, $T_N^1$, WFF, $T_N^2$, $T_A^{R(f-sen)}$, $T_N^{3(v_1)}$. Interval $s_2$: $T_A^{S(sen)}$, $T_N^1$, WFF, $T_N^2$, $T_A^{R(f-sen)}$, $T_N^{3(v_2)}$. Interval $s_3$: $T_A^{S(sen)}$, $T_N^1$, WFF, $T_N^2$, $T_A^{R(f-sen)}$, $T_A^{R(v)}$.]
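The interval count in this example can be reproduced with a few lines of C. The sketch assumes that the number of intervals is the least common multiple of the task periods divided by the shortest period, which is inferred from the example (lcm 60, shortest period 20, three intervals) rather than quoted from the paper; the function names are hypothetical.

```c
#include <stddef.h>

/* Greatest common divisor by Euclid's algorithm; inputs must be positive. */
static long gcd_long(long a, long b)
{
    while (b != 0) { long t = a % b; a = b; b = t; }
    return a;
}

/* Least common multiple; divides first to limit overflow. */
static long lcm_long(long a, long b)
{
    return a / gcd_long(a, b) * b;
}

/*
 * Assumed relation (inferred from the example, not stated as a formula
 * in the text): SI = lcm(periods) / min(periods).
 * Returns SI, or 0 for an empty period list. The lcm and the minimum
 * period are written to lcm_out and min_out when those are non-NULL.
 */
long slice_count(const long *period, size_t n, long *lcm_out, long *min_out)
{
    if (n == 0) return 0;
    long l = period[0], m = period[0];
    for (size_t i = 1; i < n; i++) {
        l = lcm_long(l, period[i]);
        if (period[i] < m) m = period[i];
    }
    if (lcm_out) *lcm_out = l;
    if (min_out) *min_out = m;
    return l / m;
}
```

For the periods 20, 20 and 60 this yields an lcm of 60 and three intervals; for the two sensory streams alone (20, 20) it yields a single interval, which matches the remark that adding the video stream raises the number of intervals to 3.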
Figure 8: Mapping of the Scheduling - Split Scheduling. [Figure labels: RTAP/RTNP protocol tasks under joint scheduling; QoS Broker, X, and other tasks; RT priorities with fixed-priority scheduling; non-RT priorities with priority-based scheduling; priority axis marked 0, 16 and 40.]
6 Implementation, Experiments and Results
A prototype of the admission service and joint scheduling (within the OMEGA architecture) was implemented. We used the IBM RS/6000 workstation, applying the real-time (RT) extension support of the AIX OS (RT priorities with fixed-priority scheduling, and a page locking mechanism) as follows: the networked multimedia application and network tasks (RTAP/RTNP) run as a separate process in which the individual tasks are scheduled according to the joint scheduler (a local customized scheduler), and the single process uses fixed RT priority scheduling (Figure 8). We assigned RT priorities higher than that of the AIX scheduler to the process(es) which needed RT guarantees. This guarantees, albeit somewhat crudely, that the process is not preempted by the scheduler.

The evaluation of the admission and scheduling within the OMEGA architecture depends on the application. We validated the OMEGA architecture with a telerobotics application over an ATM LAN (Figure 1). The evaluation concentrated on scenarios which might be useful for the telerobotics application: (1) send sensory data (coordinates) from the operator to the slave side and sensory data (force feedback) from the slave to the operator; (2) send a video stream (visual feedback) from the slave to the operator; (3) send sensory data from the operator to the slave and video from the slave to the operator; and (4) send sensory data in both directions and video from the slave to the operator.
Scenario 1: The RTAP/RTNP tasks performed very well under joint scheduling as implemented. The required sensory data QoS was 64 bytes per sample, 50 samples/second, a maximum EED (end-to-end delay) of 10 ms, and a maximum round-trip delay of 20 ms because of the full-feedback requirement. The measured end-to-end delays of the sensory data for our telerobotics application were approximately 3 ms (average value). This value was compared (Figure 9) with the performance of an architecture consisting of an application subsystem (having the same tasks as RTAP) together with TCP/IP over Ethernet. The application subsystem tasks and TCP/IP tasks were scheduled by the UNIX scheduler (application tasks in user non-real-time space, TCP/IP tasks in kernel space). The average end-to-end delay was approximately 175 ms with lightly loaded Ethernet (measured at 8:00 pm). Hence, OMEGA with its admission service and joint customized scheduling contributes an improvement of roughly a factor of 50.

Figure 9: Comparison of OMEGA/ATM with Application Subsystem/TCP/IP/Ethernet. [Figure labels: operator side (master) and robot side (slave); SUN station and JIFFE station; Puma 250, Puma560, robot control and bus; display and camera; on each side an application subsystem over a TCP/IP Ethernet adapter and OMEGA over an ATM card, both connected to the network.]
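As a quick check of the figures quoted for Scenario 1, the following worked arithmetic uses only the numbers stated above. The payload rate ignores protocol headers, which is an assumption of this sketch.

```latex
\begin{align*}
\text{sample period} &= \frac{1}{50\ \text{samples/s}} = 20\ \text{ms} \\
\text{payload rate} &= 64\ \text{bytes} \times 8\ \tfrac{\text{bits}}{\text{byte}} \times 50\ \text{s}^{-1} = 25{,}600\ \text{bit/s} \\
\text{delay ratio} &= \frac{175\ \text{ms}}{3\ \text{ms}} \approx 58
\end{align*}
```

The 20 ms sample period coincides with the 20 ms round-trip bound, and a ratio of about 58 is consistent with the reported factor of 50.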
Scenario 2: RTAP/RTNP tasks performed well under joint scheduling. The required video data QoS was a frame size of 240x160 pixels, 8 bits/pixel, uncompressed frames, a maximum EED of 1000 ms, and a frame rate between 5 frames/second and 20 frames/second. We displayed approximately 20 frames/second with an EED of 52 ms in our experimental setup.
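The raw bandwidth implied by these video parameters can be worked out directly. The calculation below counts uncompressed pixel data only and ignores any fragmentation or protocol overhead, which is an assumption of this sketch.

```latex
\begin{align*}
\text{frame size} &= 240 \times 160 \times 8\ \text{bits} = 307{,}200\ \text{bits} = 38{,}400\ \text{bytes} \\
\text{rate at } 20\ \text{frames/s} &= 307{,}200 \times 20 = 6{,}144{,}000\ \text{bit/s} \approx 6.1\ \text{Mbit/s} \\
\text{rate at } 5\ \text{frames/s} &= 307{,}200 \times 5 = 1{,}536{,}000\ \text{bit/s} \approx 1.5\ \text{Mbit/s}
\end{align*}
```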
Scenarios 3 and 4: Both scenarios were rejected by the admission service because of several bottlenecks in our experimental platform and implementation [Nah95]. One unexpected bottleneck was the display of video using X Windows. We could not display frames rapidly enough without violating the tight bound (20 ms) on sending/receiving sensory data. However, we believe that with shared X (which was not available on our platform) and faster processors this bottleneck would disappear.
Another problem, left unaddressed at this stage of our experimental work, arises when several applications share the processor and some of them do not register with the QoS Broker. Those applications are then not under admission control for guaranteed CPU scheduling, and in this case the AIX RT extension cannot provide guarantees. Once another process is scheduled (even a non-real-time process), priority inversion may occur, and the real-time task under joint scheduling misses its deadline. The implicit assumption in the telerobotics world is that the general-purpose machine is lightly loaded with other applications and the CPU bandwidth is mainly used by the real-time multimedia application. Solving this problem would require major changes in the OS kernel to force registration with the QoS Broker entity. This is future work, and we will have to examine the implications of interoperating guaranteed and non-guaranteed entities under one system.
7 Conclusion
Admission service and scheduling are important elements of resource management when end-to-end guarantees are required. Our application class requires integrated control of network resources, multimedia devices, and system resources at the end-points to provide stability of control for the whole system. Our multi-level admission service and joint scheduling provide one solution. Examining admission for end-point resources revealed several time dependencies which must be included in schedulability tests to provide correct customized CPU scheduling. Our new algorithms are given in sufficient detail that they can be easily implemented by others. We validated our admission service and scheduling by using them in the implementation of a nontrivial telerobotics application. We successfully provided hard real-time guarantees for the sensory data (Scenario 1) and soft real-time guarantees for the video traffic (Scenario 2), and identified system limitations for future experimental platforms in this class of applications (Scenarios 3 and 4). A major area for future work is operating system support for scheduling tasks and activities traditionally managed by the OS, such as execution of network protocol functions.
References
[And93] D. P. Anderson. Meta-Scheduling for Distributed Continuous Media. ACM Transactions on Computer Systems, 11(3), August 1993.
[CT90] D. D. Clark and D. L. Tennenhouse. Architectural Considerations for a New Generation of Protocols. In ACM SIGCOMM '90, pages 200-208, Philadelphia, PA, September 2 1990.
[FV90] D. Ferrari and D. C. Verma. A Scheme for Real-Time Channel Establishment in Wide-Area Networks. IEEE JSAC, 8(3):368-379, April 1990.
[GA91] R. Govindan and D. P. Anderson. Scheduling and IPC Mechanisms for Continuous Media. Technical Report UCB/CSD 91/622, University of California, Computer Science Division, Berkeley, March 1991.
[HLG93] J. M. Hyman, A. A. Lazar, and G. Pacifici. A Separation Principle Between Scheduling and Admission Control for Broadband Switching. IEEE JSAC, 11(4):605-616, May 1993.
[HS92] C. J. Hou and K. Shin. Allocation of Periodic Task Modules with Precedence and Deadline Constraints in Distributed Real-Time Systems. In IEEE Real-Time Systems Symposium, pages 147-156, Phoenix, September 1992.
[LL73] C. L. Liu and J. W. Layland. Scheduling Algorithms for Multiprogramming in a Hard Real-Time Environment. Journal of the ACM, 20(1):46-61, January 1973.
[LSD89] J. Lehoczky, L. Sha, and Y. Ding. The Rate-Monotonic Scheduling Algorithm: Exact Characterization and Average Case Behavior. In Proceedings of IEEE Real-Time Systems Symposium, pages 166-171, Santa Monica, California, December 1989. IEEE Computer Society Press.
[Mok83] A. K. Mok. Fundamental Design Problems of Distributed Systems for the Hard Real-Time Environment. PhD thesis, MIT, 1983.
[MSST91] T. Murase, H. Suzuki, S. Sato, and T. Takeuchi. A Call Admission Control Scheme for ATM Networks Using a Simple Quality Estimate. IEEE JSAC, 9(9):1452-1460, December 1991.
[Nah95] K. Nahrstedt. An Architecture for End-to-End Quality of Service Provision and its Experimental Validation. PhD thesis, University of Pennsylvania, August 1995.
[NS94a] K. Nahrstedt and J. Smith. A Service Kernel for Multimedia Endstations. In IWACA '94 Multimedia: Advanced Teleservices and High-Speed Communication Architectures, pages 8-22, Heidelberg, Germany, September 1994.
[NS94b] M. Di Natale and J. A. Stankovic. Dynamic End-to-End Guarantees in Distributed Real-Time Systems. In Proceedings of Real-Time Systems Symposium, pages 216-227, December 1994.
[NS95] K. Nahrstedt and J. M. Smith. The QoS Broker. IEEE Multimedia, 2(1):53-67, Spring 1995.
[R.L94] R. L. R. Carmo et al. Real-Time Communication Services in a DQDB Network. In Proceedings of Real-Time Systems Symposium, pages 249-258, San Juan, Puerto Rico, December 1994.
[SKG91] L. Sha, M. H. Klein, and J. B. Goodenough. Rate Monotonic Analysis for Real-Time Systems. In Foundations of Real-Time Computing, Scheduling and Resource Management, pages 129-156. Kluwer Academic Publisher, Norwell, 1991.
[SLR86] L. Sha, J. P. Lehoczky, and R. Rajkumar. Solution of some Practical Problems in Prioritized Preemptive Scheduling. In IEEE Real-Time Systems Symposium, pages 181-191, New Orleans, 1986.
[SRC85] J. A. Stankovic, K. Ramamritham, and S. Cheng. Evaluation of a Flexible Task Scheduling Algorithm for Distributed Hard Real-Time Systems. IEEE Transactions on Computers, 34(12):1130-1143, December 1985.
[SRL90] L. Sha, R. Rajkumar, and J. P. Lehoczky. Priority Inheritance Protocols: An Approach to Real-Time Synchronization. IEEE Transactions on Computers, 39(9):1175-1185, September 1990.
[ST93] J. M. Smith and C. Brendan S. Traw. Giving Applications Access to Gbit/s Networking. IEEE Network, pages 44-52, July 1993.
[TTCM92] H. Tokuda, Y. Tobe, S. T. C. Chou, and J. M. F. Moura. Continuous Media Communication with Dynamic QOS Control Using ARTS with an FDDI Network. In ACM SIGCOMM '92, pages 88-98, Baltimore, MD, 1992.
A Computation of Precedence Graph and Time Slicing
The algorithm for computation of the precedence graph is as follows:
1. register streams $1, \ldots, i$ in direction $d$ through application QoS
2. compute $\min_{i,d}(P_A^{i,d})$ $\forall i$ 'registered' and $d \in (in, out)$
3. compute $SI$ $\forall i$ 'registered' and $d \in (in, out)$
4. order application tasks $\forall i$ 'registered' according to deadlines (EDF) preserving the RTAP precedence graph
5. check CPU schedulability, deadline, EED test in application subsystem and compute APG (Application Precedence Graph) as follows:
   $\forall i$ 'registered'
      check #(3) if buyer; check #(3') if seller; /* EED tests */
         If tests are positive then continue else rejection.
      check #(2) /* deadline test */
         If test positive then continue else rejection.
      check #(1) /* CPU schedulability */
         If test positive then continue else rejection.
         /* if there exists a higher priority of a stream in other direction, then user makes the decision which stream to remove */
      stream $i$ admitted; append to APG $T_A^i$;
      compute $T_A^{1,\ldots,i} = T_A^{1,\ldots,i-1} + \sum_{r}^{(i)} e_{A_{d,r}}^{(i)} + \sum_{j}^{q} cs_{j_A}^{(i)}$;
6. $s = 1$; for tasks in APG compute GPG (Global Precedence Graph):
   $\forall i$ 'registered' and admitted in APG
   (a) if ($d = in$) get $T_A^{S(i)}$ from APG;
   (b) if ($d = out$) get $T_A^{R(i)}$ from APG;
   (c) $\forall k(i)$ /* $k(i)$ - number of fragments/connections per stream (sample) $i$ */
       i. get $T_N^k$;
       ii. check #(11) at buyer; check #(11') at seller; /* EED tests */
           If test positive then continue else rejection.
       iii. check #(9); /* deadline tests */
       iv. check #(8) /* check schedulability */
           If test positive continue (v.) else goto 'reschedule' (viii.)
       v. check ($i' \rightarrow i$)
          If test positive then (compute WFF; check #(10,10');) else continue (vi.)
          A. check (#(10,10'))
             If tests positive then 'precede' = TRUE else rejection.
       vi. check ($d = in$ && $k = first$)
           If test positive then (append to GPG $\langle T_A^{S(i)}, T_N^k \rangle$;) else (append to GPG $T_N^k$;).
       vii. check ($d = out$ && $k.last$)
            If test positive then check (vii.A.) else continue (vii.C.)
            A. check 'precede' variable
               If test positive then (append to GPG $\langle WFF, T_N^k \rangle$; $k := k + 1$; goto (i.)) else continue (vii.B.)
            B. check $(\exists WFF) \wedge (T_N^k < WFF)$
               If tests positive then (insert $T_N^k$ in GPG instead of WFF;) else (append $T_N^k$ to GPG; $k := k + 1$; goto (i.)).
            C. if ($k = last$) then append to GPG $\langle T_N^k, T_A^{R(i)} \rangle$
       viii. 'reschedule': /* application and network tasks can't be scheduled in one interval */
             check ($s < SI$)
                If test positive then (copy $T_A^{1,\ldots,i-1}$, $T_N^{1,\ldots,k-1}$ from interval $s$ to $s + 1$; continue;) else rejection.
             switch $d$
                case in: (leave in interval $s$ the task $T_A^{S(i)}$; move $T_N^k$ to interval $s + 1$;)
                case out: (leave in interval $s$ the task $T_N^k$; move $T_A^{R(i)}$ to interval $s + 1$;)
             check #(11/12,9,8) in interval $s + 1$;
                if all tests positive then continue else rejection.
             append tasks in GPG as follows:
             switch $d$
                case in: append $T_N^k$;
                case out: append $T_A^{R(i)}$;
             $s := s + 1$; $k := k + 1$; goto (i.)
The joint schedule for the application and network tasks is implemented as a circular list of intervals (the length of an interval is $\min(P_i)$). Each interval includes a list of possible tasks scheduled in this interval. The number of intervals is specified by the computation of $SI$. The following structure captures the scheduler:

    /* Limits of the current implementation, as stated in the text. */
    #define NUMBER_OF_TICKS            10
    #define NUMBER_OF_TASKS_PER_MEDIUM 10
    #define MEDIA_NUMBER               5

    /* One task entry in an interval. */
    typedef struct sched_element_spec {
        int  state;            /* reserve, allocate */
        int  task_name;
        int  task_duration;
        long time_begin;
        long time_deadline;
    } SCHED_ELEMENT;

    /* Task queue of one interval of length min_period. */
    typedef struct sched_per_minperiod {
        SCHED_ELEMENT sched_queue[MEDIA_NUMBER*NUMBER_OF_TASKS_PER_MEDIUM];
    } SCHEDULER_PERIOD;

    /* The whole schedule: a circular list of intervals. */
    typedef struct sched_spec {
        int  number_of_ticks;  /* SI - number of intervals */
        long min_period;
        SCHEDULER_PERIOD sched[NUMBER_OF_TICKS];
    } SCHED_SPEC;  /* typedef name absent in the source; SCHED_SPEC is assumed */

The current implementation of the scheduler structure is not optimal: it is restricted to NUMBER_OF_TICKS = 10, NUMBER_OF_TASKS_PER_MEDIUM = 10, and MEDIA_NUMBER = 5 media per multimedia stream. If several intervals contain the same tasks, each copy is stored explicitly in the list, which could be optimized.
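How such a structure might be used can be sketched as follows. This is a minimal illustration under stated assumptions, not the OMEGA implementation: the type name `SCHED_SPEC`, the state encoding (`SLOT_FREE`, `SLOT_RESERVE`, `SLOT_ALLOCATE`), the helper functions, and the choice to store `time_begin` and `time_deadline` as offsets within the interval are all hypothetical. It shows a task being appended to an interval as "reserved", promoted to "allocate", and the wrap-around of the circular list.

```c
#include <string.h>

#define NUMBER_OF_TICKS            10
#define NUMBER_OF_TASKS_PER_MEDIUM 10
#define MEDIA_NUMBER               5
#define QUEUE_LEN (MEDIA_NUMBER * NUMBER_OF_TASKS_PER_MEDIUM)

/* Assumed encoding of the `state` field; 0 marks an unused slot. */
enum { SLOT_FREE = 0, SLOT_RESERVE, SLOT_ALLOCATE };

typedef struct {
    int  state;
    int  task_name;
    int  task_duration;
    long time_begin;     /* assumed: offset from the start of the interval */
    long time_deadline;  /* assumed: offset at which the task must be done */
} SCHED_ELEMENT;

typedef struct { SCHED_ELEMENT sched_queue[QUEUE_LEN]; } SCHEDULER_PERIOD;

typedef struct {
    int  number_of_ticks;  /* SI - number of intervals */
    long min_period;       /* length of one interval */
    SCHEDULER_PERIOD sched[NUMBER_OF_TICKS];
} SCHED_SPEC;              /* hypothetical type name */

/* Clears the schedule; returns -1 if si or min_period is out of range. */
int sched_init(SCHED_SPEC *s, int si, long min_period)
{
    if (si < 1 || si > NUMBER_OF_TICKS || min_period <= 0) return -1;
    memset(s, 0, sizeof *s);
    s->number_of_ticks = si;
    s->min_period = min_period;
    return 0;
}

/* Appends a task as "reserved" to the end of an interval's queue.
 * Returns the slot index, or -1 if the interval would overrun or is full. */
int sched_reserve(SCHED_SPEC *s, int interval, int task_name, int duration)
{
    if (interval < 0 || interval >= s->number_of_ticks || duration < 0)
        return -1;
    SCHED_ELEMENT *q = s->sched[interval].sched_queue;
    long used = 0;
    for (int i = 0; i < QUEUE_LEN; i++) {
        if (q[i].state == SLOT_FREE) {
            if (used + duration > s->min_period) return -1;
            q[i].state = SLOT_RESERVE;
            q[i].task_name = task_name;
            q[i].task_duration = duration;
            q[i].time_begin = used;
            q[i].time_deadline = used + duration;
            return i;
        }
        used += q[i].task_duration;
    }
    return -1;
}

/* Promotes a reserved slot to "allocate", e.g. after the peer accepts. */
int sched_allocate(SCHED_SPEC *s, int interval, int slot)
{
    if (interval < 0 || interval >= s->number_of_ticks) return -1;
    if (slot < 0 || slot >= QUEUE_LEN) return -1;
    SCHED_ELEMENT *e = &s->sched[interval].sched_queue[slot];
    if (e->state != SLOT_RESERVE) return -1;
    e->state = SLOT_ALLOCATE;
    return 0;
}

/* Circular list: the interval after the last one is the first. */
int next_interval(const SCHED_SPEC *s, int current)
{
    return (current + 1) % s->number_of_ticks;
}
```

A run-time loop would call `next_interval` once per interval and execute the allocated entries of the current queue in order. Rejecting a reservation that would overrun the interval is only a stand-in for the schedulability tests of the admission service.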