End-to-End QoS Guarantees: Lessons Learned from OMEGA

Klara Nahrstedt
Computer Science Department
University of Illinois

Jonathan M. Smith
CIS Department
University of Pennsylvania

Abstract

With audio-visual and other sensory information in distributed multimedia applications, end-to-end quality of service (QoS) guarantees are a major acceptance factor for these applications. We designed and implemented an end-point architecture, called OMEGA, for provision of end-to-end QoS guarantees. This architecture relies on a distributed QoS management entity, called the QoS Broker, to translate, negotiate/renegotiate, and admit end-to-end QoS as a contract. It uses end-to-end real-time protocols for transport. We tested our architecture on a telerobotics application. This paper presents various lessons learned from the implementation of the OMEGA architecture. Performance figures show that the design of OMEGA was correct; we can provide end-to-end QoS guarantees. However, the results also show that shortcomings of the implementation platform, its system software support, and our own implementation decisions, coupled with the inherent conservatism of resource reservations, severely limited the applications. Some of these lessons are obvious, some are tied directly to the specific platform, but some are of a general nature and need thorough consideration and new algorithms, QoS management and QoS system support schemes.

1 Introduction

Provision of end-to-end quality of service (QoS) guarantees for multimedia streams is a major requirement for the design and implementation of complex distributed multimedia systems. Over the last three years we worked on this requirement: we specified, designed and implemented an end-point communication architecture, OMEGA, for provision of end-to-end QoS guarantees. OMEGA relies on network real-time services and network resource management, and uses selected real-time OS services.

[This work was supported by the National Science Foundation and the Advanced Research Projects Agency under Cooperative Agreement NCR-8919038 with the Corporation for National Research Initiatives. Additional support was provided by Bell Communications Research under Project DAWN, by an IBM Faculty Development Award, and by Hewlett-Packard.]


OMEGA takes the application-specific requirements for a multimedia call, reserves the resources needed by the application, transport and OS coherently (using its own reservation and relying on the reservation mechanisms of underlying services), and schedules the call according to the QoS reservation during the transmission. The QoS management is performed by the QoS Broker [NS95]. The broker utilizes negotiation, admission and other management services to provide resources for a connection with end-to-end resource guarantees. We tested the architecture on a telerobotics application, and the measured performance of the system revealed many valuable lessons about QoS architecture. Measurements showed that the basic design of the QoS Broker/OMEGA architecture was correct and that its implementation can provide end-to-end QoS guarantees for a limited number of continuous streams, together with flexible, reconfigurable management of QoS. However, the measurements also show how the hardware platform, software choices and design decisions limit the implementation of end-to-end QoS provision. Section 2 gives a brief overview of the QoS Broker concept and its inclusion in the OMEGA architecture. Section 3 discusses lessons learned from the design of OMEGA/QoS Broker. Section 4 describes the experimental set-up, the implementation of OMEGA and the performance results from our telerobotics experiment. Section 5 focuses on implementation issues, and Section 6 concludes the paper.

2 Brief Overview of QoS Broker/OMEGA Architecture

To provide application-to-application guarantees, we need guaranteed services both in the network and at the end-points [And93]. We assume that our communication architecture resides on top of guaranteed network services, as many results illustrating methods to provide such services are now mature. We concentrate our research efforts on providing end-to-end QoS guarantees which rely on end-to-end resource management. For end-to-end resource management, two issues need to be addressed: (1) the end-point communication system model, and (2) the model for end-point resources.

2.1 Communication Model

The communication system is modeled as two interoperating subsystems: the application subsystem, which uses a Real-Time Application Protocol (RTAP), and a transport subsystem, which uses a Real-Time Network Protocol (RTNP). We call this system architecture the OMEGA Architecture, alluding to its location at system end-points.

Both subsystems must provide guarantees useful to a scheduler for the calls/connections they service end-to-end. Hence, an important part of the new architecture is the resource management protocol, as represented by the QoS Broker [NS95]. The QoS Broker is a new end-point design for resource orchestration, drawing on successful models for human negotiations. The design provides a specialized manager to establish resource guarantees, using detailed databases and negotiation among the managers of required resources. Furthermore, the QoS Broker provides a method for coordinating several layers of the system to provide end-to-end service guarantees. We have used the model of striking a deal, as it reflects the notions of negotiation and renegotiation of QoS central to adaptive applications. When guarantees are made in the deal, the broker ensures that the necessary resources are guaranteed to be available at the relevant points in the end-to-end communication path, incorporating both local and global resources. For global resource availability, the broker uses a negotiation service between the end-points and relies on network resource guarantees provided by the network subsystem, e.g., by B-ISDN switches. The renegotiation service allows the user to change QoS on demand. The goal of the broker is to negotiate a resource contract among all the system components (application, OS, network). The broker assumes different roles (seller and buyer) to distinguish between the participating communication partners. The role assignment allows the distributed system to support sender-initiated negotiation (e.g., with an underlying RCAP protocol [BM91]) as well as receiver-oriented negotiation (e.g., with an underlying RSVP protocol [ZBE+93]).

To ensure that the application subsystem and transport subsystem functions were under scheduler control (and hence included in the QoS Broker's set of guaranteed services), we designed and implemented prototypes of the Real-Time Application Protocol (RTAP) and Real-Time Network Protocol (RTNP) functions. Functions such as call management, rate control of multimedia devices, input/output functions (e.g., display of video), fragmentation of application protocol data units (APDUs), and stream extraction/melding of APDUs form the core of the RTAP. Functions such as connection management, forward error correction, timing failure detection and timely data movement form the core of the simple RTNP. While not full-featured, the schedulable protocol functions were necessary to study and implement a scheduled multiplexing algorithm for multimedia streams, which we required for end-to-end guarantees.
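The buyer/seller interaction can be pictured with a short sketch. The message shape, the function names and the accept/modify/reject outcomes below are illustrative assumptions, not the actual QoS Broker interfaces of [NS95]; the real negotiation also involves translation and OS admission, which are elided here.

```c
/* Sketch of a buyer/seller QoS negotiation round (hypothetical types and names). */
#include <stdio.h>

typedef struct {            /* application-level QoS offer */
    double frame_rate;      /* frames per second           */
    int    frame_bytes;     /* bytes per frame             */
    double delay_ms;        /* end-to-end delay bound      */
} AppQoS;

typedef enum { ACCEPT, MODIFY, REJECT } Verdict;

/* Seller side: admit against local resources; may counter-offer. */
static Verdict seller_admit(AppQoS *offer) {
    const double max_rate = 10.0;             /* assumed local capacity        */
    if (offer->frame_rate <= max_rate) return ACCEPT;
    offer->frame_rate = max_rate;             /* counter-offer (relaxed QoS)   */
    return MODIFY;
}

/* Buyer side: local admission would run first, then one round trip to the seller. */
static Verdict buyer_negotiate(AppQoS *want) {
    /* ... local translation and admission elided ... */
    return seller_admit(want);                /* stands in for the network round trip */
}

int main(void) {
    AppQoS q = { 30.0, 240 * 160, 200.0 };
    Verdict v = buyer_negotiate(&q);
    printf("verdict=%d, granted rate=%.1f fps\n", v, q.frame_rate);
    return 0;
}
```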

2.2 The Resource Model and its Representation

At the end-point, three logical groups of resources must be managed: multimedia devices, CPU scheduling and memory allocation, and network resources. We parameterize the requirements on end-point resources through deterministic Quality of Service (QoS) parameters maintained in small databases [NS95]. The resources in each domain (application, OS, network) maintain domain-specific representations. This, of course, gives rise to multiple views of QoS. Application requirements for multimedia devices are specified through application QoS parameters. For example, video quality is described by frame rate (30 frames/s), frame size (height * width in pixels), color (bits/pixel), etc. The network QoS parameters describe the requirements on the network resources, such as packet rate, packet loss, jitter and end-to-end delay. The system QoS parameters describe the requirements on CPU scheduling and buffer allocation, such as task start times, durations and deadlines. These multiple QoS views must map to a common set of resources to be coordinated for management, so they are translated among each other. This is done by services included in the QoS Broker. For example, the translation between application and network QoS is done by the QoS Translator [NS95].
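As an illustration of such a translation, the following sketch derives rough network QoS parameters from application-level video QoS. The formulas and the 44-byte packet payload are assumptions chosen for exposition; the actual QoS Translator [NS95] performs a richer mapping.

```c
/* Sketch of an application-to-network QoS translation for a video stream. */
#include <stdio.h>

typedef struct { double frame_rate; int width, height, bits_per_pixel; } VideoQoS;
typedef struct { double bytes_per_sec, packet_rate, interarrival_ms; } NetQoS;

static NetQoS translate(VideoQoS v, int packet_payload_bytes) {
    NetQoS n;
    double frame_bytes = (double)v.width * v.height * v.bits_per_pixel / 8.0;
    n.bytes_per_sec   = frame_bytes * v.frame_rate;
    n.packet_rate     = n.bytes_per_sec / packet_payload_bytes;  /* packets per second */
    n.interarrival_ms = 1000.0 / v.frame_rate;                   /* spacing per frame  */
    return n;
}

int main(void) {
    VideoQoS video = { 5.0, 240, 160, 8 };   /* Scenario 2 parameters             */
    NetQoS net = translate(video, 44);       /* 44-byte cell payload is an assumption */
    printf("%.0f B/s, %.0f packets/s, %.0f ms between frames\n",
           net.bytes_per_sec, net.packet_rate, net.interarrival_ms);
    return 0;
}
```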

3 Lessons Learned from Design

3.1 QoS Broker Concept

Treating the system components (e.g., the end-stations and the network infrastructure) as peers has implications which may prove useful in many future systems. While we used telerobotics as a driving application in our work, the QoS brokerage idea seems to have wide application. One such application is a computer modem, where the line quality may be dynamic, requiring signaling of bandwidth capacity to an application as the line improves or degrades. Another example is using one or more idle workstations to support an application; when the workstation owner begins typing at the system console, the allocation of system resources may have to change dramatically. Finally, consider ATM over wireless media. It is advantageous to use the same software model for both wired and wireless media above the ATM layer. In wireless media, the link quality may change as the mobile application element moves from an indoor wireless LAN to wireless microcells, and on to ATM over cellular. In each of these three examples, the resource management concept of the broker allows the application and the system components to cooperate in support of the user.

The QoS Broker concept is general enough to be useful across many implementation technologies. It can incorporate, for example, the integrated layering approach [CSZ92] in the control-management plane. Furthermore, the broker includes interaction mechanisms to make `contracts' with an OS as well as with network resource management. When operating systems and network subsystems for which contract protocols exist are available, the broker uses them. We expect this availability to become more common with multimedia manipulation and other advanced uses of machines, which will stress the best-effort resource-sharing models of common operating systems to failure.

3.2 Revision of QoS at the Application/Network Interface

Real-time distributed multimedia applications need system support for the provision of application-to-application guarantees. Before this project was undertaken, QoS was frequently treated as a purely network phenomenon, deliverable via proper configuration of switches and other network sharing mechanisms. When an application perspective is applied, many of the QoS measures do not make sense, mainly because the network QoS is necessary but is only part of the picture. To fill out the picture and provide these guarantees according to application QoS requirements, application-to-application resource management must be in place. In general, the concept of different QoS views, which enables each major system component (application, OS, network) to specify its requirements and constraints in its domain-specific representation, proved very useful. As the variety of media increases, and with it the variety of application QoS parameters, a flexible interface between the application and the transport subsystem is needed. New device specifications and translation functions can be plugged in and made available to the QoS Broker through well-known configuration databases. This idea supports the goal of having reconfigurable distributed multimedia systems that do not require rewriting the software when new media streams need to be supported in the application.

3.3 QoS Broker Services and Protocols

The design of the QoS Broker protocols worked. The bidirectional translation property at the API allows propagation of feedback information either during the negotiation process or during later renegotiation. Our decision to support sequential multi-layer negotiation protocols involves certain trade-offs.

We allow an application to exchange application-specific information before any network resource management is involved. The distributed multimedia applications can first find out whether application resources for their specified media are available and whether it makes sense to request the equivalent networking resources (which might be expensive to hold when the receiver(s) cannot support the desired media quality). The disadvantage of the multi-level negotiation is that we need two or more round trips to set up an application-to-application multimedia call. Our architecture targets rather long-lived applications which need real-time QoS guarantees and can therefore tolerate a longer connection setup time in return for the guarantees and stability provided during the transmission.

The design of our two-layer communication system is simple and should be extended, although a small number of layers is preferable for real-time multimedia communication. The layering provides two things. First, given that there is a good understanding of the application (which may be encapsulated in application profiles), translation can be performed between specifications of QoS. Second, the layer structure can be used to hide transparent adaptation, e.g., some of the automatic reconfiguration done by the QoS Broker.

The design of roles (e.g., buyer, seller) in a reservation protocol at higher layers should be present so that the reservation protocol can support both sender-initiated reservation and receiver-initiated reservation for individual connections. This would allow asymmetric applications to use different underlying reservation schemes. An excellent example of such an application is teleoperation, where the operator is the sender and wants to initiate the sending connection to the remote robot, but is also a receiver and wants to initiate a connection to receive data from the slave (robot).

The design of a constructive algorithm which creates a feasible and correct schedule for multimedia streams based on QoS was needed. The algorithm takes QoS specifications of individual streams and task specifications for processing the streams and outputs a feasible schedule. The current scheduling solution is static; this worked for our sample applications but is unacceptable in general. This algorithm must be slightly changed when we want to integrate real-time and non-real-time services at the end-points (application, OS and network). We left the problem of `several applications sharing the processor and together having guaranteed and best-effort requirements' unaddressed. The implicit assumption in the telerobotics world is that the general-purpose machine is lightly loaded with other applications and the CPU capacity is mainly used by the real-time multimedia application. Ultimately, best-effort applications will have to register with the QoS Broker as well. Time for the best-effort applications, so that they do not starve, will then be considered when running schedulability tests.
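For illustration only, the sketch below shows a utilization-style feasibility check over a periodic task set, including a reserve for best-effort work. The real OMEGA algorithm constructs a static schedule of intervals rather than testing utilization, and the task names and numbers here are assumed, not measured.

```c
/* Sketch of a utilization-based feasibility test over a per-second task set. */
#include <stdio.h>

typedef struct { double period_ms, duration_ms; const char *name; } Task;

static int feasible(const Task *t, int n, double best_effort_share) {
    double u = 0.0;
    for (int i = 0; i < n; i++)
        u += t[i].duration_ms / t[i].period_ms;   /* utilization of task i          */
    return u <= 1.0 - best_effort_share;          /* leave time for best-effort work */
}

int main(void) {
    Task set[] = {
        { 20.0,  2.0, "ReadRoboticsData+SendCell" },         /* 50 samples/s (assumed cost) */
        { 200.0, 40.0, "ReceiveDatagram+WriteVideoFrame" },   /* 5 frames/s   (assumed cost) */
    };
    printf("feasible: %s\n",
           feasible(set, 2, 0.2) ? "yes" : "no");  /* 20% best-effort reserve assumed */
    return 0;
}
```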

4 Implementation, Experiments, and Results

We validated the OMEGA architecture with a telerobotics application. The telerobotics/teleoperation application is nontrivial and has challenges distinct from teleconferencing. A system configuration for a possible telerobotics environment is shown in Figure 1.

[Figure 1: Telerobotics environment. Operator (master) side and robot (slave) side, each with a Puma robot arm (Puma 250, Puma 560), robot control software and hardware (JIFFE station, SUN station), an application subsystem with OMEGA, a display (master side) or camera (slave side), and TCP/IP Ethernet adapter and ATM card attachments to the network.]

The environment allows a remote operator to exert force or to impart motion to a slave manipulator. The operator experiences the force and resulting motion of the slave manipulator, known as "kinesthetic feedback". An operator may be provided with visual feedback and possibly audio feedback as well.

4.1 Telerobotics Requirements

The telerobotics application has the following specific properties and requirements:

1. Telerobotics includes end-points (robots) without a human operator as well as end-points with a human operator. Here, the setup of the remote (slave) side must occur remotely, without the help of a human operator. This setup process must be done in a robust manner.

2. The media used in our environment are sensory data and video. The sensory data specify the positions of the robot arm and are transmitted to the slave. The slave sends force feedback sensory data indicating the forces at the robot arm. Visual information supplements the feedback information for the operator. Based on the feedback information the operator decides on the next move of the master arm, which then translates into position coordinates transmitted to the slave. A closed loop exists between the master arm and the slave arm. Visual information is supporting information for the operator to have visual control over the working space of the remote robot, and to allow proper decisions in case of a robot failure.

3. The telerobotics requirements on the sensor data transmission are: (1) very high reliability, i.e., loss of at most one position datum per minute is allowed, and no two consecutive positions can be lost; (2) the position is encoded as a vector of 12 floating point values whose elements have varying importance for the robotics application; (3) the end-to-end delay of position information must be guaranteed, with an upper bound of 20 ms and a desirable bound of 10 ms; (4) the positions (samples) should arrive with approximately the same interarrival time (20 ms), i.e., the sample rate of the positions is 50 samples/s; (5) sensory data are transmitted in both directions with the same quality; and (6) the precedence relation between the sensory streams must be maintained.

4. The requirements on video data transmission are: (1) loss of at most one frame per second, (2) an end-to-end delay of 200 ms, and (3) a frame rate of 5 frames/s.
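As a rough, back-of-the-envelope reading of these requirements (assuming 4-byte floating point values and the 240x160 pixel, 8 bits/pixel frames of Scenario 2 below), the raw payload rates are modest:

```latex
\[
R_{\mathrm{sensor}} = 50\,\tfrac{\mathrm{samples}}{\mathrm{s}} \times 12 \times 4\,\mathrm{B}
                    = 2400\,\tfrac{\mathrm{B}}{\mathrm{s}} \approx 19.2\,\mathrm{kbit/s},
\qquad
R_{\mathrm{video}}  = 5\,\tfrac{\mathrm{frames}}{\mathrm{s}} \times 240 \times 160 \times 1\,\mathrm{B}
                    = 192\,000\,\tfrac{\mathrm{B}}{\mathrm{s}} \approx 1.5\,\mathrm{Mbit/s}.
\]
```

The difficulty is therefore not raw bandwidth but the 20 ms delay bound and the loss constraints on the sensory loop.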

4.2 Implementation Choices and Restrictions

Choices in Hardware/Software Implementation

An OMEGA prototype runs on IBM RISC System/6000 workstations using AIX. The master side uses an IBM RISC System/6000 Model 530; the slave side uses an IBM RISC System/6000 Model 360. The robot control software and hardware reside in two other machines: the JIFFE real-time processor on the master side and a SUN 4 workstation with real-time OS support for UNIX. The two RISC System/6000 workstations are connected with ATM host interfaces [TS93] using a 155 Mbps SONET OC-3c equivalent G-LINK physical interface. The RISC System/6000 workstations are connected to the individual robot control stations via cards from BIT3 Corporation which provide an S-Bus-to-MCA connection. OMEGA treats access to the BIT3 cards as a robot device access. The RISC System/6000 Model 360 includes an IBM Ultimedia video card [IBM94], which can capture images at the rate of 30 frames/second.

In using the broker for setting up an end-to-end guaranteed multimedia call, several issues are application-specific:

1. We specify additional information in the application negotiation; in addition to the exchange of application QoS parameters, the operator sends a request to the robot for a video image to view the working environment before real-time transmission starts.

2. Currently no group brokerage is supported for the telerobotics application (although controlling a group of robots would be interesting).

3. Robotics traces with sensory data are used at the master and the slave sides. There is no connection to real robotics devices, for safety reasons.

4. We grouped the RTAP/RTNP functions into a very simple task set consisting of Read Robotics Data, Write Robotics Data, Read Video Frame, Write Video Frame, Send Cell, Receive Cell, Send Datagram, Receive Datagram and Renegotiate. These tasks are scheduled by the joint scheduler. Such simple tasks are more easily scheduled with respect to QoS guarantees. Further subtasking could be used if finer-grained scheduling or more complex protocol functions were needed.

The OMEGA software implementation utilizes the real-time services in AIX Version 3.2. These include [Cor91] real-time (RT) priorities, fast context switching, fixed-priority scheduling for RT processes, fine-granularity timer services (the RS/6000 Model 530 has a clock speed of 25 MHz and a timer resolution of 400 ns; the Model 360 is slightly faster), and code and data pinning. OMEGA uses the real-time services as follows: the networked multimedia application and network tasks (RTAP/RTNP) run as a separate process in which the individual tasks are scheduled by the joint scheduler. This single process uses fixed RT priority scheduling (Figure 2). We assigned RT priorities higher than those of the AIX scheduler to the process(es) which needed RT guarantees. This guarantees that the process is not preempted by the scheduler, albeit somewhat crudely.

[Figure 2: Mapping of the Scheduling - Split Scheduling. RTAP/RTNP, the QoS Broker and the X protocol used by RTAP/RTNP run under joint scheduling at fixed RT priorities, while other tasks run at non-RT priorities under ordinary priority-based scheduling.]
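For readers who want a concrete picture, the calls below sketch how a protocol process can be given a fixed real-time priority and pinned memory using POSIX-style interfaces. AIX 3.2 exposed these services through its own interfaces [Cor91], so the calls shown are illustrative rather than the ones OMEGA actually used.

```c
/* Sketch: fixed-priority scheduling and memory pinning for the protocol process. */
#include <sched.h>
#include <sys/mman.h>
#include <stdio.h>

int main(void) {
    struct sched_param sp;
    sp.sched_priority = sched_get_priority_max(SCHED_FIFO);   /* highest fixed priority */

    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)          /* 0 = this process */
        perror("sched_setscheduler");

    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)              /* pin code and data */
        perror("mlockall");

    /* ... run the jointly scheduled RTAP/RTNP task loop here ... */
    return 0;
}
```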

Implementation Restrictions Induced by our ATM LAN Network

First, the broker does not yet rely on network resource management in the ATM layer, as this mechanism is not implemented in the host interface or in the ATM switch. For the lightly loaded ATM LAN in our experimental environment, the network resources are always available and successfully allocated; therefore the response from the ATM LAN to the broker request is assumed to be accept. This trivializes the network management, but lets us test the broker's distributed end-to-end entities (buyer/seller) to (1) orchestrate the local buyer resources, (2) orchestrate the remote seller resources, and (3) coordinate between them. The admission service in the transport subsystem provides partial control of network resources, for example available bandwidth in terms of transport packets rather than ATM cells, end-to-end delay, and buffer space for the queues used to schedule packets over the ATM host interface VCIs.

A second limitation is the absence of a practical way to experiment with the broker using other LANs. With other LANs such as a Token Ring, the end-to-end links are shared links. In this case, the MAC layer needs to be included in resource reservation and allocation, which is difficult.

The third specific property of OMEGA is its utilization of the two transmission modes provided by our experimental ATM host interface. The ATM host interface transmits datagrams of up to 64 KB using ATM Adaptation Layer (AAL) 3/4. We call this mode in our implementation the DATAGRAM-MODE. A null AAL layer can be used to transmit PDUs of 44 bytes or less, which fit into a single ATM cell. We call this mode the CELL-MODE. The maximal number of active connections is 256, due to a size limitation on a content-addressable memory.
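A minimal sketch of how an implementation might pick between the two modes from the PDU size, using the limits just stated (the type and function names are ours, not OMEGA's):

```c
/* Sketch: choosing the ATM host interface transmission mode from the PDU size. */
#include <stddef.h>
#include <stdio.h>

typedef enum { CELL_MODE, DATAGRAM_MODE, TOO_LARGE } TxMode;

static TxMode choose_mode(size_t pdu_bytes) {
    if (pdu_bytes <= 44)        return CELL_MODE;      /* fits one ATM cell (null AAL) */
    if (pdu_bytes <= 64 * 1024) return DATAGRAM_MODE;  /* AAL 3/4 datagram             */
    return TOO_LARGE;                                  /* must be fragmented first     */
}

int main(void) {
    printf("%d %d %d\n", choose_mode(40), choose_mode(19200), choose_mode(100000));
    return 0;
}
```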

4.3 Experimental Results

The evaluation of OMEGA performance concentrated on scenarios which might be useful for the telerobotics application:

Scenario 1 - Robotics Streams in Both Directions: For transmission of the robotics streams we tested two cases: first, where each stream is mapped to a single simplex network connection using AAL 3/4 CS-PDUs (the robotics stream is registered without its intraframe specification), and second, where each robotics sample is split into four packets according to the intraframe specification and transmitted over four simplex network connections (a sketch of this split appears after the scenario descriptions). The individual components are small enough to fit into an ATM cell; therefore we use the null AAL (raw ATM) for this kind of stream transmission.

Scenario 2 - Video Stream in One Direction: In the second scenario we send only a single uncompressed video stream from the slave to the master using a frame rate of 5 frames/second. The transmission of the frames (240x160 pixels, 8 bits/pixel) is performed over a single simplex connection using AAL 3/4 CS-PDUs.

Scenario 3 - One Robotics Stream in One Direction, One Video Stream in the Opposite Direction: Scenario 3 represents sending one robotics stream from the master to the slave and an uncompressed video stream from the slave to the master side. We tested the transmission of the robotics stream using both cases described in Scenario 1. The video stream is as in Scenario 2.

Scenario 4 - Robotics Streams in Both Directions and Video Stream in One Direction: Scenario 4 represents the most complex scenario. It includes robotics streams from master to slave and vice versa, as well as one video stream from the slave side to the master side.
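The intraframe split of Scenario 1 can be sketched as follows. The grouping of three values per packet is an assumption made for illustration; the real intraframe specification distributes the 12 position values according to their importance.

```c
/* Sketch: splitting a 12-float position sample into four cell-sized packets
 * (the 1-4 mapping of Scenario 1). */
#include <string.h>
#include <stdio.h>

#define VALUES_PER_SAMPLE  12
#define PACKETS_PER_SAMPLE  4
#define VALUES_PER_PACKET  (VALUES_PER_SAMPLE / PACKETS_PER_SAMPLE)

typedef struct {
    unsigned char seq;                   /* packet index within the sample  */
    float         v[VALUES_PER_PACKET];  /* 3 x 4 bytes = 12 bytes payload  */
} CellPacket;                            /* well under the 44-byte limit    */

static void split_sample(const float sample[VALUES_PER_SAMPLE],
                         CellPacket out[PACKETS_PER_SAMPLE]) {
    for (int p = 0; p < PACKETS_PER_SAMPLE; p++) {
        out[p].seq = (unsigned char)p;
        memcpy(out[p].v, &sample[p * VALUES_PER_PACKET],
               VALUES_PER_PACKET * sizeof(float));
    }
}

int main(void) {
    float pos[VALUES_PER_SAMPLE] = {0};
    CellPacket pkts[PACKETS_PER_SAMPLE];
    split_sample(pos, pkts);
    printf("packet payload: %zu bytes\n", sizeof(CellPacket));
    return 0;
}
```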

The main emphasis is on the measured end-to-end (processing) delays between the sources and sinks at the master and slave sides. Complete measurements, i.e., the mean values (mean) and mean deviation values (m.d.), as well as the ratios between mean deviation and mean in percentages (m.d./mean), are in [Nah95]. We show the results for some selected cases in box-plots (described in the note below) to illustrate the relative end-to-end delays and the mean deviation in these delays. The statistical evaluation is based on 100 experiments per scenario in each mode.
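For reference, the reported statistics can be read as the usual sample mean and mean absolute deviation (our reading of the notation in [Nah95]):

```latex
\[
\overline{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
\mathrm{m.d.} = \frac{1}{n}\sum_{i=1}^{n} \bigl|\,x_i - \overline{x}\,\bigr|,
\qquad
\frac{\mathrm{m.d.}}{\overline{x}} \cdot 100\%.
\]
```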

4.3.1 QoS Broker Performance

The performance of the broker is measured on lightly loaded workstations, i.e., no additional application software is running on either workstation.

[Note on box-plots: The box-plot provides much more information than an X-Y plot for this type of data. For a given x value, the box defines the middle 50% of the data, the horizontal line inside the box is the median, and the bar at the end of the dashed line marks the nearest value beyond a standard range (in this case, 1.5 * (inter-quartile range)) from the quartiles. Points outside of these ranges are shown individually. Details of box-plot presentation can be found in [BCW88]. We use box-plots because of their ability to visualize jitter through the height of the box.]


The establishment of a resource contract for a unidirectional QoS call/connection takes on average 60 milliseconds. The measurement was taken at the broker-buyer side when no previous connection was established. The measurement includes negotiation between buyer and seller over a DATAGRAM-MODE connection, and the application negotiation was performed without any request for additional information. Much of this time is consumed in the analysis of schedule feasibility. If the QoS Translator splits the data across VCIs (e.g., the 1-4 mapping for sensory data discussed as the second case of Scenario 1), the resource deal takes an average of 67 milliseconds. This is a direct consequence of (1) the more complex computation in the QoS Translator, because the translator must loop several (four in the 1-4 mapping) times to assign network QoS to the transport connections, and (2) the admission service at the transport level, where again the service must perform the schedulability test and append the transport tasks several (four in our case) times instead of once. Figure 3 shows the run-time of the QoS Broker during the brokerage process. These times are long, as we used extremely simple algorithms to obtain a proof of concept. Unless frequent renegotiation is required, even these times should not present a problem.

[Figure 3: Run-time of the QoS Broker with different mappings (1-1 mapping vs. 1-4 mapping); duration in microseconds (Scenario 1 - no previous call was established).]

4.3.2 RTAP/RTNP Performance

We needed to characterize the performance of the new schedulable protocol stack. We expected some performance gain from the method of implementation, as a multithreaded task set embedded in a single AIX process. The obvious gain comes from the sharing of data in a single address space, which reduces or eliminates the need for copying data between layers. Thus, the implementation exploits many of the features of the "Integrated Layer Processing" ideas of Clark and Tennenhouse [CT90], and some performance gain from this would not be surprising. The outstanding question was the ability of the OMEGA implementation to meet the guarantees required by the telerobotics application. The experimental results for Scenario 1 are compared to a similar telerobotics experiment [DSP93] performed on a conventional architecture, using TCP/IP and no real-time support. Both environments are shown in Figure 1.

Telerobotics Result with TCP/IP/Ethernet for Scenario 1

The end-to-end delay of 1.22 seconds reported in [DSP93] is based on the following experimental setup. The application subsystem and TCP/IP/Ethernet at the master side reside on an SGI IRIS workstation; the application subsystem and TCP/IP/Ethernet at the slave side use a SUN 4 station. The robot control at the master side is performed by the real-time JIFFE processor; at the slave side it is handled by the Unimate controller, which works under the RCI/RCCL environment providing real-time control capability, with a SUN station as the controlling processor. The real-time processes are running at a rate of 50 samples/second. The application subsystems on the IRIS and SUN 4 (tasks such as "Read from position ring buffer", "Write to position ring buffer", "Read from force ring buffer", "Write into force ring buffer", "Write position into socket", "Read force from socket") reside in non-privileged user space. The application subsystem has access to a timer with 1-second granularity through the UNIX sleep and alarm functions. This non-real-time environment affects the program and communication structure as follows: the robotics data are not sent every 20 ms, but gathered in a ring buffer at the master side, and then a set of data is sent together to the slave in one datagram. At the slave, the application subsystem gets the data and writes it to the ring buffer. Furthermore, there is considerable context switching between the application subsystem and TCP/IP. The robotics tasks are also included in the delay. The size of the robotics packet was 100 bytes.

We recreated this setup. We sent 64-byte packets (between the IRIS and the SUN, using sensory traces) as fast as possible without any scheduling control. The measured round-trip delay (average) is 351 ms with a lightly loaded Ethernet (measured at 8:00 pm); the end-to-end delay is 175 ms. The timer resolution is a crude 10 ms on these workstations.

For the telerobotics application, the problem with any delay larger than 20 ms (if one position datum is sent per datagram rather than a set of data) is that the slave robot becomes unstable: if the position information does not arrive on time, the slave has no information to act on. This can result in severe damage.

Telerobotics System Performance with OMEGA

The measured end-to-end delays (EEDs) of the sensory data in Scenario 1 are 3.066 ms (average value) using the CELL-MODE, which is better by a factor of approximately 60 than the 175 ms delay with TCP/IP over Ethernet. The RS/6000 Model 530 gives a timer resolution of 400 ns; the Model 360 is slightly better. The improvement we see is not due as much to ATM versus TCP/IP/Ethernet, because the measured ping time averaged 3 ms. The improvement in performance is due to (1) OMEGA, with its resource reservation/allocation, its feasible schedule utilizing real-time OS services, and its capability to control the timing guarantees at a very fine granularity, and (2) faster context switching between user space and kernel; context switches in AIX take about 1.7 microseconds.

The EED data are measured when the robotics data between operator and slave are in sync. This means that the virtual clocks in the end-to-end scheduler loop are synchronized on both machines. The actual clocks of the machines are not synchronized, but because we have periodic streams, we achieve synchronous behavior by setting the virtual clock properly for the first arriving packet. This can be done either by sending a time stamp in the first packet or by utilizing the estimate of the ReceiveDatagram/ReceiveCell task durations (from the system QoS profile database) for the first packet. This is not the best solution, especially when data get out of sync. Within a certain time limit one can compensate locally to achieve resynchronization, but for longer delays a resynchronization with time stamps is necessary. The synchronization of the first packet must be enforced in the protocol stack; here, the broker cannot help. Figure 4 shows the EED times for CELL-MODE when the first packet is in sync or out of sync.

[Figure 4: Comparison of end-to-end delays in Scenario 1 (CELL-MODE), first packet in sync vs. out of sync; duration in microseconds.]

In Scenario 2 we displayed approximately 20 frames/second with an EED of 52 ms. Scenarios 3 and 4 were both rejected by the admission service because of several bottlenecks in our experimental platform and implementation [Nah95]. When we examined Scenario 3, we decided to study this scenario more carefully.
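The first-packet synchronization just described might look like the following sketch; the structure and names are illustrative, and the profile-based estimate stands in for the system QoS profile lookup.

```c
/* Sketch: initializing the receiver's virtual clock from the first packet. */
#include <stdio.h>

typedef struct { double virtual_clock_ms; double period_ms; } VClock;

static void sync_on_first_packet(VClock *vc, double arrival_ms,
                                 double sender_timestamp_ms,   /* < 0 if absent */
                                 double est_receive_task_ms) {
    if (sender_timestamp_ms >= 0.0)
        vc->virtual_clock_ms = sender_timestamp_ms;              /* explicit timestamp */
    else
        vc->virtual_clock_ms = arrival_ms - est_receive_task_ms; /* profile estimate   */
}

static void on_period(VClock *vc) {
    vc->virtual_clock_ms += vc->period_ms;  /* periodic streams keep the clocks aligned */
}

int main(void) {
    VClock vc = { 0.0, 20.0 };              /* 50 samples/s robotics stream */
    sync_on_first_packet(&vc, 103.0, -1.0, 3.0);
    on_period(&vc);
    printf("virtual clock = %.1f ms\n", vc.virtual_clock_ms);
    return 0;
}
```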

Therefore, we lowered the image display and receiving datagram times in the system QoS profile in order to manipulate the broker call establishment, and interesting results occurred. The application tasks and the sending networking tasks sustained their processing times as in Scenarios 1 and 2. However, the receiving network tasks for datagram mode (robotics and video data) introduced unpredictable, large processing times. Figure 5 shows this behavior as a comparison of the processing times for receiving video between Scenarios 2 and 3. As a consequence, none of the robotics data in Scenario 3, when set up in DATAGRAM-MODE, arrived on time; they were all lost due to missed deadlines. Neither could the video frame rate of 5 frames/second be sustained. In CELL-MODE, some of the robotics data got through, but the loss rate is unacceptably high for the robotics data; the video frame rate of 5 frames/second was achieved.

[Figure 5: Comparison of `Receive Video Frame' processing time in Scenarios 2 and 3; duration in microseconds.]

There are several approaches to this problem, such as (1) introducing priority scheduling and multiplexing into the ATM host interface, (2) minimizing the delay due to serialization (decreasing the video frame/fragment size) for the same frame rate, or (3) explicitly decreasing the frame rates and fragment sizes. Note that fragmentation of video frames might force a decrease in the achieved frame rate and increase the request rate on the ATM device. However, the decreased frame/fragment size solves only bottleneck (1). Bottleneck (2) can be solved by using shared X and a faster display, which decrease the display time per frame and are necessary when sensory data are to be multiplexed with the video traffic. Bottleneck (3) can be solved by allowing only a very low video frame rate (1 frame/second or less) and very small fragment sizes, if the user does not care about the video frequency.

5 Lessons Learned from Implementation

The prototype OMEGA implementation proved the design to be correct, but also showed limitations in the provision of QoS guarantees, varying between the trivially remedied and deep research questions. An example of the former is the pacing required by the video service due to some bugs in the ATM interface device driver. Among the deepest research questions is that of mapping perceptual QoS to concrete algorithms and mechanisms; we have only touched on that topic.

5.1 Lessons Learned about Platform Characteristics

There are several limitations due to the chosen platform:

1. The head-of-line blocking induced by the single-threaded DMA was an unexpected source of difficulty. Yet such DMA is common in high-performance network adapters because of the potential for high performance through concurrent access to memory by the adapter and the processor. Unless the adapter is capable of multiplexing several DMAs at a time, this problem is fundamental. One means of addressing it is to perform DMA in smaller units, such as an ATM cell size, as was done by Davie [Dav93]. Even in this case, the algorithms employed to manage the manipulation of packets or cells must be architected to reflect the timing requirements of the end-to-end system if timing criteria are not to be violated.

2. The specification of priorities provided only an indirect means of controlling AIX fixed-priority scheduling. Our measurements and analysis with the AIX OS, using video and robotics data, showed that using the so-called `real-time' priorities and other real-time services is necessary, but not sufficient, to control protocol task behavior when used for implementation of QoS deadline-based scheduling, unless severe restrictions apply. These include (1) having only one user, (2) running only one multimedia application on the RS/6000, (3) implementing the application/transport protocols in a single user process with real-time priority, and (4) joint scheduling by the protocol stack. Only with these restrictions satisfied can the joint scheduling provide (approximate) predictability for guaranteed services. Future systems should provide better access to scheduler features, data structures, etc.

3. The CPU speed and the load on the various machines differed, which resulted in pessimistic CPU resource allocation (slowing down) of the faster machine (RS/6000 Model 360) in order to comply with the processing capabilities of the slower machine (RS/6000 Model 530) and achieve end-to-end guarantees.

5.2 Lessons Learned from OMEGA Implementation

There are several limitations due to our implementation choices.

The major limitation is the scheduler structure [Nah95]. The problem is that when we support media streams with very different QoS, such as a video stream and a robotics stream, the number of intervals (intervals hold different task schedules within a second) in the circular buffer of our scheduler dramatically increases. Currently we can support only 10 intervals. An increase of the interval number causes pinned-memory space problems and the program crashes. Another data structure must therefore be found, so that very different QoS rates and their corresponding tasks in the intervals can be scheduled properly. This is future work.
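To make the limitation concrete, the following sketch shows a per-second circular buffer of intervals of the kind described above, with 10 intervals of 100 ms each; the sizes, task names and placement rule are illustrative assumptions, not the actual OMEGA data structure.

```c
/* Sketch: per-second circular buffer of scheduling intervals. */
#include <stdio.h>

#define NUM_INTERVALS      10    /* intervals per second (current limit) */
#define MAX_TASKS_PER_SLOT  8

typedef void (*TaskFn)(void);

typedef struct {
    TaskFn tasks[MAX_TASKS_PER_SLOT];
    int    count;
} Interval;

typedef struct {
    Interval slot[NUM_INTERVALS];   /* circular buffer, one task schedule per interval */
} Schedule;

static void dispatch(Schedule *s, int tick) {       /* called every 100 ms */
    Interval *iv = &s->slot[tick % NUM_INTERVALS];
    for (int i = 0; i < iv->count; i++)
        iv->tasks[i]();                              /* run this interval's tasks */
}

static void read_robotics(void) { /* ... */ }
static void recv_video(void)    { /* ... */ }

int main(void) {
    Schedule s = {0};
    for (int i = 0; i < NUM_INTERVALS; i++) {
        s.slot[i].tasks[s.slot[i].count++] = read_robotics;  /* robotics: every interval   */
        if (i % 2 == 0)
            s.slot[i].tasks[s.slot[i].count++] = recv_video; /* 5 fps: every other interval */
    }
    for (int tick = 0; tick < 20; tick++) dispatch(&s, tick);
    printf("ran two seconds of the static schedule\n");
    return 0;
}
```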



There is a problem with automatically finding the right datagram size for transmitting video data when the initial datagram size and the receive task cannot pass the admission service. Currently, the upper bound of the datagram size is set to 64 KB, and the processing times for the receiving datagram task (in the system QoS profile) are set to the maximal duration for the largest datagram in the application. If fragmentation of the video should occur (e.g., the broker rejected a video call for a datagram size of 64 KB), then we have to lower the upper bound of the DATAGRAM SIZE constant manually and pre-compute the task duration for this datagram size, which is not an optimal solution. It should be done automatically.
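One way to automate this would be a simple search over candidate datagram sizes, re-running admission with a pre-computed cost per size. The cost model and the admission bound below are assumptions for illustration only:

```c
/* Sketch: shrinking the datagram size until the receive task passes admission. */
#include <stdio.h>

static double recv_task_ms(int datagram_bytes) {
    return 5.0 + datagram_bytes / 8000.0;     /* assumed per-datagram cost model */
}

static int admission_ok(double task_ms, double budget_ms) {
    return task_ms <= budget_ms;              /* stand-in for the real admission test */
}

static int pick_datagram_size(int upper_bound, double budget_ms) {
    for (int size = upper_bound; size >= 1024; size /= 2)
        if (admission_ok(recv_task_ms(size), budget_ms))
            return size;                      /* largest admissible size found */
    return -1;                                /* no feasible size */
}

int main(void) {
    int size = pick_datagram_size(64 * 1024, 9.0);
    printf("admitted datagram size: %d bytes\n", size);
    return 0;
}
```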

User requests for renegotiation are limited. There are several issues which this service at the buyer side needs to consider: (1) the user signals the change to the broker through the Graphical User Interface (GUI); (2) the broker makes changes in the shared profile; (3) the "Renegotiate" task in the RTAP/RTNP schedule reads the profile and adjusts the running parameter; and (4) the task propagates the change to the other side. On the seller side, the renegotiation task reads from a renegotiation connection whether any change is required and adjusts the running parameter. OMEGA supports only renegotiation of the video rate in Scenario 2. Because we wanted a fast renegotiation of the video frame rate (below 60 ms, the run-time of the QoS Broker), we support only lowering the video frame rate (relaxing the QoS). The reason is that the broker does not have to go through the whole admission, negotiation and translation phase, because we are relaxing the resource demand, not tightening it. If the rate is increased, the broker must go through the entire admission process. Hence, the other limitation of this implementation is the limit on renegotiation parameters. The overhead (run-time) of the renegotiation task on both sides to accomplish steps (3) and (4) is shown in Figure 6. An a priori estimate of this overhead must be available to the broker when making the admission decision for scheduling tasks, and is hence included in the system QoS profile.

[Figure 6: Renegotiation overhead at the buyer and seller sides; duration in microseconds.]
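The four renegotiation steps can be sketched as follows; the shared-profile structure and function names are illustrative, not OMEGA's own code:

```c
/* Sketch of the frame-rate relaxation path: GUI writes the new rate into the shared
 * profile, the Renegotiate task applies it locally and forwards it to the peer. */
#include <stdio.h>

typedef struct { double video_frame_rate; int changed; } SharedProfile;

static SharedProfile profile = { 5.0, 0 };

static void gui_request_rate(double fps) {           /* steps (1)-(2): user change via broker */
    if (fps < profile.video_frame_rate) {             /* only relaxation is fast-pathed        */
        profile.video_frame_rate = fps;
        profile.changed = 1;
    }
}

static void send_to_peer(double fps) {                /* step (4): renegotiation connection    */
    printf("propagate new rate %.1f fps to seller\n", fps);
}

static void renegotiate_task(double *running_rate) {  /* step (3): runs in the RTAP/RTNP set   */
    if (profile.changed) {
        *running_rate = profile.video_frame_rate;
        profile.changed = 0;
        send_to_peer(*running_rate);
    }
}

int main(void) {
    double rate = 5.0;
    gui_request_rate(2.0);
    renegotiate_task(&rate);
    printf("running rate now %.1f fps\n", rate);
    return 0;
}
```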

6 Conclusion

Many systems exist which provide acceptable perceptual performance for video and audio, yet a variety of errors and failures were observed in our experimental measurements. One might ask why this is so. The explanation is fundamental and refers back to our choice of a demanding application to test the QoS Broker/OMEGA architecture. In particular, the telerobotics application has QoS demands far beyond those needed for a teleconference, where perceptual limitations provide considerable latitude to the system designer. Here, we experienced a direct and measurable conflict between the video portion of the application and the real-time sensory feedback loop for the kinesthetic data.


These measurements make the case quite strongly that QoS guarantees are non-trivial to provide, and one can imagine many applications and configurations far more complex than what we have described and measured. The brokerage scheme proved useful for call establishment of different media types and for getting responses (provision of guarantees or rejection) based on the QoS specifications of multimedia application calls. The broker analyzes the QoS requirements for the various media in an integrated fashion, which is more complex than handling each medium separately. The integration is crucial, as our experiments showed. Future systems must include integrated and dynamic resource management, which allows better predictions and provision of guarantees in multimedia communication systems based on QoS specifications. Our experimental platform was able to provide real-time guarantees for Scenarios 1 and 2. Measurements for Scenario 1 showed that real-time feedback and closely-coupled control loops are possible with a proper communication system. In the application domain, this is one step ahead of where telerobotics/teleoperation applications were before.

References

[And93] D. P. Anderson. Meta-Scheduling for Distributed Continuous Media. ACM Transactions on Computer Systems, 11(3), August 1993.

[BCW88] R. A. Becker, J. M. Chambers, and A. R. Wilks. The New S Language. Wadsworth & Brooks/Cole, California, 1988.

[BM91] A. Banerjea and B. Mah. The Real-Time Channel Administration Protocol. In Proceedings of the 2nd International Workshop on NOSSDAV, Heidelberg, Germany, November 1991.

[Cor91] IBM Corporation. AIX Version 3.1: RISC System/6000 as a Real-Time System. IBM International Technical Support Center, Austin, March 1991.

[CSZ92] D. D. Clark, S. Shenker, and L. Zhang. Supporting Real-Time Applications in an Integrated Services Packet Network: Architecture and Mechanism. In SIGCOMM'92, pages 14-22, Baltimore, MD, August 1992.

[CT90] D. D. Clark and D. L. Tennenhouse. Architectural Considerations for a New Generation of Protocols. In ACM SIGCOMM'90, pages 200-208, Philadelphia, PA, September 1990.

[Dav93] B. S. Davie. The Architecture and Implementation of a High-Speed Host Interface. IEEE JSAC, 11(2):228-239, February 1993.

[DSP93] V. Desikachar, M. Stein, and R. Paul. Wide Bandwidth, Distributed Digital Teleoperation. Technical Report MS-CIS-93-65, University of Pennsylvania, Philadelphia, PA, 1993.

[IBM94] IBM Corporation. Ultimedia Services 2.1 for AIX, Guide and Reference, 1994.

[Nah95] K. Nahrstedt. An Architecture for End-to-End Quality of Service Provision and its Experimental Validation. PhD thesis, University of Pennsylvania, August 1995.

[NS95] K. Nahrstedt and J. M. Smith. The QoS Broker. IEEE Multimedia, 2(1):53-67, Spring 1995.

[TS93] C. B. S. Traw and J. M. Smith. Hardware/Software Organization of a High-Performance ATM Host Interface. IEEE JSAC, 11(2):240-253, February 1993.

[ZBE+93] L. Zhang, B. Braden, D. Estrin, S. Herzog, and S. Jamin. RSVP: A New Resource ReSerVation Protocol. IEEE Network, pages 8-18, September 1993.
