A Probe-based Algorithm for QoS Specification and Adaptation
Klara Nahrstedt, Ashfaq Hossain and Sung-Mo Kang
e-mail: [email protected], [email protected], [email protected]
University of Illinois at Urbana-Champaign
Abstract
Multimedia services such as Video-On-Demand (VOD) will require a certain level of Quality of Service (QoS). However, the VOD service will be required to run on general-purpose machines such as PCs or workstations and on shared networks such as Ethernet and ATM (Asynchronous Transfer Mode) LANs due to the prohibitive cost of home networking. In this environment, one problem is the specification of application QoS [7], such as the video display frame rate, prior to negotiation for QoS guarantees. This paper proposes an on-line probe-based algorithm to specify the initial application QoS for continuous media and to determine the critical degradation point at which performance degrades and adaptive mechanisms should be applied. This probe-based algorithm should be part of a negotiation phase. We will show the applicability of the probe-based algorithm on several VOD service experiments. The results show that it is worthwhile to perform probes in advance if the probe time is small with respect to the duration of the movie/video clip. The advantage of using on-line probes in QoS specification is that the negotiated QoS value represents a realistic application QoS with respect to the current load of the VOD system.
Keywords: Application Quality of Service, Video-on-Demand, Probe-based Algorithm, Feedback Mechanism, Adaptive Behavior
1 Introduction

Multimedia distributed services such as Video-On-Demand (VOD) are at the center of our study. VOD services use the client/server model. The server is a local (neighborhood) multimedia storage server and includes a database of movies and other video clips [2]. The client provides front-end operations such as reception, decompression, display and control of the multimedia data. The network provides the communication between clients and the server. Such a VOD architecture is shown in Figure 1. In this architecture we assume: (1) the underlying local area network provides guarantees^1 according to the Quality of Service (QoS) specification, (2) the client/server system layer (OS and transport/network protocol) does not provide guarantees^2, (3) the client/server application layer has a negotiation/adaptation mechanism to control the dynamics of the application resources and, implicitly, the underlying system resources, and (4) the client/server application layers use the application QoS, such as display frame rate or frame size, as their negotiation parameters.

^1 This condition is achieved by use of a lightly loaded LAN (e.g., Ethernet or ATM), so that resources are always available.
^2 General purpose OS (e.g., UNIX) and transport/network protocols (e.g., UDP/IP) currently used for multimedia processing and communication in multimedia workstations and PCs do not yet provide guarantees.
Figure 1: VOD topology and system architecture.
1.1 Problem Description
Current audio/video applications have services and protocols with QoS negotiation and renegotiation capabilities which assume that the user knows the actual QoS parameter. The question is: how do we determine the possible QoS for the client/server VOD service? Hence, we consider (1) the determination of the application QoS [7], such as the actual "play" frame rate of a continuous media stream at the client side, before the user begins negotiation of application QoS, and (2) the adaptation of application QoS utilizing information from the determination phase. In particular, we are interested in a QoS specification which mirrors the client/server system (CPU load) performance dynamics and not the network performance dynamics. We propose an on-line algorithm for QoS specification and its subsequent utilization for adaptation. This algorithm is based on probes done at the beginning of the application negotiation phase and (1) determines the application QoS of a continuous medium as a statistical guarantee, (2) determines the degradation point at the client side when system performance starts to degrade severely due to buffer problems and mismatched rates between the server and the client, and (3) provides suggestions for the client adaptation phase, which must be undertaken to avoid severe degradation during transmission. For the adaptation of application QoS we use a feedback mechanism as known and used in networks for congestion control. We employ predictive feedback to slow down the traffic when the system degradation point is approached, which can be seen as a traffic shaping mechanism for the network. We will show the applicability of our algorithm through a set of experiments using different architectures.
1.2 Related Work
In most VOD services, the application QoS specification is performed in two ways:
Device Specification
A video card and its accompanying software provide a description of the possible frame rates and frame sizes which the card and the driver can support (e.g., XVideo 700 Parallax Video provides 640x482-pixel frames at 30 frames/second). The VOD application may take these parameters as the QoS specification. However, as many experiments show, these parameters might not be realistic QoS parameters that the VOD service can sustain, because the end-to-end QoS depends on many different factors: (1) the application software running on the client/server sides, (2) the transport protocol stack used by the VOD service, (3) CPU utilization by other applications during the runtime of the actual application, and (4) the underlying network.
Off-line Testing
Another approach to QoS specification is to run extensive off-line tests to pre-determine the QoS parameters [6]. The QoS parameters are then stored in configuration files and retrieved when the negotiation phase begins. The problem with this approach is that it does not take into account the actual load when a VOD service runs. The load of the system layer might change dynamically; hence, off-line testing may not provide a realistic estimation of the measured QoS parameter.

At the network level, QoS adaptation is mostly considered as a congestion control mechanism. The source starts to send traffic and the network monitors it. If congestion is observed at a switch, feedback information is sent back via different protocols to the source to slow down the traffic [3, 4]. Congestion at the network level, and hence QoS adaptation, can be avoided if sources negotiate a rate contract with the network and obey the negotiated contract using traffic shaping mechanisms [8]. An example of a network protocol which negotiates network resource availability and adapts according to the traffic feedback is the Resource ReSerVation Protocol (RSVP) [9]. Our QoS specification service, using the probe-based algorithm, and equivalent QoS negotiation/adaptation protocols reside within the application subsystem layer on top of network protocols such as RSVP. This means that if a protocol such as RSVP exists in the system layer (Figure 1), then the integration of our approach with the underlying network protocols can provide the desired application-to-application QoS guarantees. Furthermore, our VOD application QoS services can utilize the underlying multicasting support (multiple sources to multiple targets) if it exists; however, the application QoS specification and adaptation service itself is meant to serve only one source and one target. The multiple-source, multiple-target distribution is considered and resolved at the underlying system layer and/or in the LAN.
2 Probe-based Algorithm

Let X_1, X_2, ..., X_{k-1}, X_k be measured QoS parameters. Let δ be the difference between the new measured value X_k and a past measured value. Δ is the maximum allowable difference of degradation in quality of service (QoS). δ_a is the accumulated difference between the first and the last measured values. Note that depending on the measured QoS parameter, δ, δ_a and Δ can be positive or negative. For example, given the measured display frame rate X_k, if the frame rate decreases (X_k < X_{k-1}), then there is performance degradation, (δ, Δ, δ_a) have negative values and |δ| < |Δ|. On the other hand, if X_k represents the measured task processing time, then X_k > X_{k-1} represents a degradation in performance and δ < Δ. Let c be the counter of occurrences of δ = Δ. Let T be the upper bound of the probe time interval, i.e., the probe runs in the interval I = (0, T). Let i be the counter of previous measurements, where X_k - X_{k-i} = Δ.

We will store each measured value in the background until the degradation point is found or the probe interval I expires. In the algorithms described below we show only the storage of measured values which contribute to finding the degradation point. Note that our algorithm aims to bound the difference between measured values, which is applicable to time series [1]. For example, when a video stream shows a frame rate of 10-12 frames/second, we want to bound the frame-rate difference (degradation) to 5 frames/second, because a degradation from 12 frames/second to 6 frames/second is noticeable and may disturb the viewer (Figure 2).
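For reference, the quantities above can be written compactly as follows. This is only a restatement of the definitions for the common case in which δ is taken against the immediately preceding sample, using the convention of the algorithm below that growing values indicate degradation:

```latex
% step-to-step and accumulated differences (restated from the definitions above)
\delta = X_k - X_{k-1}, \qquad
\delta_a = \sum_{j=1}^{k} (X_j - X_{j-1}) = X_k - X_0 ,
% no severe degradation as long as \delta < \Delta and \delta_a < \Delta;
% the degradation point is reached once \delta > \Delta or \delta_a > \Delta.
```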
Figure 2: Example of the probe-based algorithm.

A simpler algorithm, which is often used in network monitoring and congestion control algorithms, would use a minimum allowable value of a QoS parameter (X_min), and the service would compare each measured value X_k against X_min. If X_k degrades below X_min, feedback is triggered to slow down the source. In the probe-based algorithm described below, we assume that X_k > X_{k-1} means degradation and X_k < X_{k-1} means improvement. Our probe-based algorithm is as follows:
Initialization step
  measure X_0; set k = 1; set a timer which runs until the current time t = T; c = 0; i = 1; δ = 0; δ_a = 0; specify Δ;

Execution step - within t ∈ I do
  1. measure X_k; compute δ = X_k - X_{k-1}; δ_a := δ_a + δ;
  2. if {(δ < Δ) ∧ (δ_a < Δ)} then store X_k; k := k + 1; { if (δ ≥ 0) then i := i + 1; else i := 1 }; goto step 1. /* no QoS degradation or only a small degradation */
  3. if {(δ == Δ) ∨ (δ_a == Δ)} then
     (a) if (c == 0) then c := c + 1; store (X_k, X_{k-i}); k := k + 1; goto step 1;
     (b) if {(c > 0) ∧ (δ == Δ)} then i := i + 1; c := c + 1; compute δ = X_k - X_{k-i};
         i.   if (δ < Δ) then store (X_k, X_{k-1}); X_{k-i} is already in the list; k := k + 1; goto step 1;
         ii.  if (δ == Δ) then X_{k-1} = X_{k-i}; store X_k; k := k + 1; goto step 1;
         iii. if (δ > Δ) then X_{k-i} is the degradation point; return (X_{k-i}, k, i, t).
  4. if {(δ > Δ) ∨ (δ_a > Δ)} then X_{k-i} is the degradation point; return (X_{k-i}, k, i, t).

If the timer expires after interval I and the loop did not end with a degradation point, then return (X_{k-1}, k, i = 1); in this case we do not have a degradation point.

The runtime of the algorithm is O(k), where k is the finite number of measurements in the interval I. The QoS specification is computed after the algorithm ends. The specified QoS parameter is the expected value X̄ = (1/(k-i+1)) Σ_{j=0}^{k-i} X_j. The system level can sustain this value as a QoS, and the application can use it for the negotiation of network guarantees. The output of the degradation point, together with k and t, is needed for the QoS adaptation. Note that the degradation point can be translated internally into other control parameters to control the QoS. There are two possible reactions: (1) the application lets the degradation point pass and then requests adaptation (e.g., sends a slow-down request to the video source), or (2) the application sends an adaptation request before the degradation point is approached.
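The following is a minimal Python sketch of the probe, under the convention above that growing values mean degradation. It collapses the bookkeeping of steps 3(a)-3(b) into a single boundary case, and the names (measure, probe, qos_estimate) are ours for illustration, not part of the paper's implementation.

```python
import time

def probe(measure, delta_max, T):
    """Run the probe for at most T seconds; `measure()` is assumed to return
    one QoS sample per call (e.g., the time needed to display the next frame)."""
    samples = [measure()]          # X_0
    k, i, c = 1, 1, 0              # k: sample index, i: distance to reference, c: boundary hits
    delta_a = 0.0                  # accumulated difference X_k - X_0
    start = time.time()
    while time.time() - start < T:
        x_k = measure()
        delta = x_k - samples[-1]  # step 1: delta = X_k - X_{k-1}
        delta_a += delta
        if delta > delta_max or delta_a > delta_max:
            # steps 3(b)iii / 4: X_{k-i} is the degradation point
            return samples[k - i], k, i, time.time() - start, qos_estimate(samples, k, i)
        if delta == delta_max or delta_a == delta_max:
            c += 1                 # step 3: the bound is exactly reached
            i += 1
        elif delta >= 0:
            i += 1                 # step 2: small degradation, keep the old reference
        else:
            i = 1                  # step 2: improvement, reset the reference
        samples.append(x_k)
        k += 1
    # timer expired without finding a degradation point
    return None, k, 1, T, qos_estimate(samples, k, 1)

def qos_estimate(samples, k, i):
    """Expected QoS value over X_0 .. X_{k-i}, used as the specified QoS parameter."""
    kept = samples[:k - i + 1]
    return sum(kept) / len(kept)
```

For a QoS parameter such as the display frame rate, where degradation means a decreasing value, one would either negate the samples or measure the inter-frame display time instead, so that growth again corresponds to degradation.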
3 Experimental Setup and Methodology

3.1 Hardware Setup
We performed our experiments with the video server and clients over 10 Mbps Ethernet and IP-over-ATM environments. The Ethernet setup is standard IEEE 802.3, using CSMA/CD for packet transmission. The IP-over-ATM setup involves our iPOINT ATM switch [5], which was designed and implemented at the University of Illinois at Urbana-Champaign. The general setup of the video server and clients for IP-over-ATM is shown in Figure 3. The iPOINT (Illinois Pulsar Optical Interconnect) switch provides five input/output ports. Four of the I/O ports operate at 100 Mbps and the fifth port at 400 Mbps. As shown in Figure 3, two SPARCstation 20s are connected to two 100 Mbps ports; one SPARCstation 10 is connected to another 100 Mbps port. The fourth 100 Mbps port is connected via a commercial Fore ASX-100 switch to the Blanca/XUNET Gigabit testbed. We used one of the SPARCstation 20s as our video server. The other machines were used as video clients.
Figure 3: iPOINT ATM testbed.

To establish IP-over-ATM connections between the workstations themselves and the XUNET testbed, we equipped each workstation with Fore SBA200E ATM SBus adapter cards. These cards are Solaris compatible and have hardware support for AAL5 frames. The arriving ATM cells are FIFO queued in the input module of the input ports and eventually transmitted by the switching fabric to the outgoing destination port. The switch is non-blocking. To generate compressed video from the server, we used a SunVideo compression card, which uses a C-Cube CL4000 processor to create motion JPEG (MJPEG) and MPEG-1 streams from NTSC or PAL inputs. The test streams used for our results are our own custom-compressed streams with moderate zooming and panning. The frame size of both the MJPEG and MPEG-1 streams is 320 x 240 pixels. We selected 24 bits per pixel for compression in each case.
3.2 Software Setup
We used Solaris XIL imaging library routines for our video server and client code development. The APIs provided by the SunVideo SDK can perform some useful video functions, including recognizing frame boundaries for MJPEG and MPEG-1 streams, testing the integrity of a compressed frame, etc. At this time, unfortunately, decompression of frames in these streams is software-based. Therefore, even though real-time compression (30 fps for MJPEG) is achieved by the dedicated SunVideo card, software-only decompression has a maximum frame rate of about 17 fps for MJPEG and much less for different MPEG-1 patterns (varying with combinations like I-only, IP, IPP, IPB, etc.).
Video Server Processing/Transmission Code
The server code draws heavily on IP socket programming and the Solaris XIL library. The server creates a Compressed Image Sequence (CIS) for a specific type of stream (MJPEG or MPEG-1), mmaps the video file to user memory and gradually reads an integral number of frames into the CIS. These frames are then transferred as IP packets to the client through UDP sockets. After sending each packet, the server waits for any feedback response from the client(s). The server can vary its output rate depending on the feedback, if any.
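A sketch of the server's send loop under these assumptions is shown below; the CIS creation, mmap and packetization details are omitted, and the message names and helper parameters (packets, base_interval, slow_factor) are ours, not taken from the actual XIL-based server code.

```python
import select
import socket
import time

SLOWDOWN, NORMAL = b"SLOWDOWN", b"NORMAL"

def serve(packets, client_addr, base_interval, slow_factor=2.0):
    """Send pre-packetized compressed frames over UDP and adapt the send
    rate to client feedback (a sketch, not the paper's server code)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    interval = base_interval                          # seconds between packets
    for pkt in packets:                               # each packet: an integral number of frames
        sock.sendto(pkt, client_addr)
        ready, _, _ = select.select([sock], [], [], 0)  # poll for feedback, non-blocking
        if ready:
            msg, _ = sock.recvfrom(64)
            if msg == SLOWDOWN:
                interval = base_interval * slow_factor  # reduce the output rate
            elif msg == NORMAL:
                interval = base_interval                # resume the normal pace
        time.sleep(interval)                          # pace the transmission
    sock.sendto(b"END", client_addr)                  # end-of-movie signal to the client
    sock.close()
```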
Video Client Processing/Transmission Code
The client coding is more involved than that of the server. In addition to creating a CIS for manipulating compressed frames using the available APIs, the client is responsible for decompressing and displaying the frames. We used the multithreading features available on Solaris 2.3 and higher. Packet reception and decompression/display operations were implemented as independent threads. XIL APIs for decompression and display were used for code development. Regarding application-space buffering, each client implements a ring buffer of n elements. Each element in the ring buffer can hold up to an integral number of compressed frames. For our implementation, we considered that each received packet will contain an integral number of compressed frames and can be accommodated in a single ring element. The receive() and decompress() threads chase each other in the ring. The application buffer space is full when the receive() thread is just behind the decompress() thread. A schematic representation of the client and server code structures is shown in Figure 4.
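A compact sketch of the client's ring buffer is given below, assuming one received packet per ring element; the class and method names are ours, and the XIL decompression/display calls are not shown.

```python
import threading

class RingBuffer:
    """Ring of n elements in which the receive() and decompress() threads
    chase each other (a sketch of the client-side buffering described above)."""
    def __init__(self, n):
        self.slots = [None] * n
        self.head = self.tail = self.count = 0
        self.cond = threading.Condition()

    def put(self, packet):                  # called by the receive thread
        with self.cond:
            while self.count == len(self.slots):   # ring full: receiver waits
                self.cond.wait()
            self.slots[self.head] = packet
            self.head = (self.head + 1) % len(self.slots)
            self.count += 1
            self.cond.notify_all()

    def get(self):                          # called by the decompress/display thread
        with self.cond:
            while self.count == 0:                 # ring empty: consumer waits
                self.cond.wait()
            packet = self.slots[self.tail]
            self.tail = (self.tail + 1) % len(self.slots)
            self.count -= 1
            self.cond.notify_all()
            return packet

    def occupancy(self):                    # fraction of ring elements in use
        with self.cond:
            return self.count / len(self.slots)
```

In this sketch the receive thread would loop over ring.put() with the data it reads from the socket, while the display thread decompresses and displays each ring.get() result. Note that the sketch blocks the receiver when the ring is full, whereas the actual client keeps reading and lets the kernel buffer the excess, which is precisely the situation the feedback mechanism in Section 4 tries to prevent.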
3.3 Methodology
The variable of interest is the video display frame rate at the client side. This variable represents the measured application QoS parameter X in our algorithm (Section 2). The client/server software shown in Figure 4 runs over the UDP/IP protocol stack with short video clips as probes. The duration of the video clips is at most 5 minutes (the parameter I in the probe-based algorithm). This software is used by the application negotiation protocol [6]. The UDP/IP stack uses two networks, namely Ethernet and ATM. We call the experiments over Ethernet the Ethernet Setup and the experiments over ATM the ATM Setup. In each setup we tested three scenarios: (1) no feedback (i.e., no utilization of the degradation point), (2) feedback after the degradation point (i.e., the degradation point was utilized after it was passed), and (3) feedback before the degradation point (i.e., the degradation point was utilized for preventive QoS adaptation).
4 Results

For all the experimental results presented below, the video server listens on a well-known UDP port for clients' requests. It is assumed that the server does not have any previous or negotiated knowledge about the client's frame-rate decompression/display capability. The server sends frames at the same rate (fps) at which the movie was compressed.
4.1 UDP/IP over Ethernet

No Feedback
Figure 4: Client/Server program structure for our VOD service.

In this experiment, the client receives datagram packets containing MJPEG and MPEG-1 compressed video frames. After receiving the frames, software decompression and display of each frame occurs. Figure 5 shows the measured display frame rate at the client and server sides. The reason for the client-side degradation is that the client is receiving data from the network at a rate higher than it can decompress and display the frames. The application ring buffers suffer overflow for this reason too. As packets continue to arrive at the network interface, the kernel starts buffering them in its internal memory, from which they will eventually be copied into the application space as previously buffered data are consumed. This internal buffering increases the CPU load, in addition to the load of decompressing and displaying previously received frames, and results in a decreased frame rate at the client. It is apparent from the results that, without reducing the packet arrival rate from the server, the client never recovers from this degraded performance. Moreover, the kernel's internal memory for buffering incoming packets is also finite. Eventually the kernel has no choice but to drop incoming packets as it runs out of memory, resulting in the loss of sequential frames of the movie (in the case of MJPEG). For MPEG-1, it may result in dropping entire Groups of Pictures (GOPs) if the reference I-frame is dropped. Figure 5 further shows that the degradation point is around the 4000th frame (150 seconds) for MJPEG and around the 37th second for MPEG-1.
Figure 5: Performance of client/server with no feedback (left side: MJPEG probe; right side: MPEG-1 probe).
From the above discussion it is also clear that the degradation point for the display frame rate can be translated into the application ring buffer occupancy, meaning that when the application's ring buffers become full (the receive and display threads meet), degradation begins (the frame rate decreases). Hence, although our probe-based algorithm measures the variation of frame rates, internally this translates into control of the application ring buffer occupancy.
Feedback after Degradation Point

In this scenario, when the client observes degradation, i.e., the transmission protocol passes the degradation point (the application ring buffers are full), the client sends a feedback message SLOWDOWN to the server. The server slows its sending rate in response to this message. The server continues at this reduced rate until the client sends a NORMAL feedback message, whereupon the server resumes its previous speed. After the server slows down, the client's performance improves (reduced kernel buffering). After the server resumes its normal speed, the client's performance degrades again (increased kernel buffering). Figure 6 shows the results. There is some improvement in the display frame rate, but the improvement is very bursty in the case of MJPEG, which may be visually annoying to the VOD user.
Feedback before Degradation Point

In the third scenario, the transmission protocol sends a feedback message shortly before the degradation point is approached. Internally, this translates into sending a feedback message from the client to the server when the application ring buffers are 75% full. Figure 7 shows the results. As we can see, the client display frame rate is much smoother than in Figure 6 and the frame rate can be maintained. This is the desirable result for a VOD service.
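A sketch of the client-side trigger for this predictive feedback is shown below, reusing the occupancy() helper from the ring-buffer sketch in Section 3.2. The 75% threshold and the SLOWDOWN/NORMAL messages follow the text; the resume threshold (25% here) and the helper names are our assumptions.

```python
SLOWDOWN, NORMAL = b"SLOWDOWN", b"NORMAL"

def feedback_check(ring, sock, server_addr, state, high=0.75, low=0.25):
    """Called by the receive thread after each packet: send SLOWDOWN shortly
    before the ring buffer fills, and NORMAL once it has drained again."""
    occ = ring.occupancy()
    if occ >= high and state.get("last") != SLOWDOWN:
        sock.sendto(SLOWDOWN, server_addr)     # preventive adaptation request
        state["last"] = SLOWDOWN
    elif occ <= low and state.get("last") == SLOWDOWN:
        sock.sendto(NORMAL, server_addr)       # server may resume its normal rate
        state["last"] = NORMAL
```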
Figure 6: Performance of client/server with feedback after degradation point.
4.2 UDP/IP over ATM
Since the bottleneck in the previous set of Ethernet experiments was not the network load, IP-over-ATM performance was essentially the same. Under an increased network load, ATM performance would obviously be better. For our experiments, both the ATM network and Ethernet loads were minimal. Figure 8 shows the performance when feedback is triggered before the degradation point.
4.3 Evaluation
The results presented in Figure 5 show that the application QoS of a VOD service can change if no control is exercised at the application and system levels. Hence, it is worthwhile to start a probe (a short video clip, which might be the beginning of a movie) to determine the degradation point so that the transmission phase can utilize this knowledge and react in advance. Figures 6, 7 and 8 show the improvements which come from utilizing knowledge of the degradation point. A comparison of the MJPEG streams in Figures 7 and 8 shows that SLOWDOWN feedbacks occur much earlier in the Ethernet setup than in the IP-over-ATM setup. The reason is that data can be buffered in the ATM network much longer than in Ethernet. Furthermore, Figure 5 shows that a time interval I of only a couple of minutes (150 seconds, i.e., 2.5 minutes, for MJPEG) is enough to find out where the degradation point is. Clearly, the larger the ring buffers are, the longer it can take before degradation occurs. Hence, the choice of I is important. We believe that the length of I will always be much smaller than the actual length of a movie because of the client's buffer capacity limitation. Cost and security reasons will limit the VOD buffer sizes at the client side. The clients will have small buffers, and the negotiation protocols will need to determine the degradation points for realistic QoS specification and adaptation.
Figure 7: Performance of client/server with feedback before degradation point.
Figure 8: Performance of client/server with feedback before degradation point (ATM).

5 Conclusion

We presented an on-line probe-based algorithm to specify a realistic QoS parameter, such as the display frame rate in a VOD service, and to identify a degradation point over a time interval I. Our experimental results show that it is useful to use a probe-based algorithm for QoS specification and to utilize the knowledge of the degradation point if one is found. Utilization of the degradation point, using feedback to regulate the source rate, improves the client's application QoS performance. In particular, feedback issued before the degradation point is reached helps to provide a smooth frame rate; the feedback mechanism does not affect the QoS metrics and the change is not visible to the user. If the feedback is issued after the degradation point, the change in the QoS metrics is visible to the user. The concept of probe-based techniques suggests that the specification of QoS at user/application interfaces can be automated. Utilizing QoS configuration profiles and probe-based QoS specification services, the QoS negotiation can start with more accurate QoS values instead of pessimistic, worst-case QoS values to obtain guarantees. In our future work, we intend to investigate and compare different on-line probe-based algorithms at the client, server and switches as part of the negotiation and monitoring phases. We would like to provide information about various degradation points to VOD services under various loads, depending on what kind of quality is required.
References

[1] C. Chatfield. The Analysis of Time Series - An Introduction. Chapman and Hall, fourth edition, 1989.
[2] B. Furht, D. Kalra, F. Kitson, A. Rodriguez, and W. Wall. Design Issues for Interactive Television Systems. IEEE Computer, pages 25-39, May 1995.
[3] H. Kanakia, P. P. Mishra, and A. Reibman. An Adaptive Congestion Control Scheme for Real-Time Packet Video Transport. In ACM SIGCOMM, Baltimore, MD, August 1993.
[4] H. T. Kung, T. Blackwell, and A. Chapman. Credit Update Protocol for Flow-Controlled ATM Networks: Statistical Multiplexing and Adaptive Credit Allocation. In ACM SIGCOMM, pages 101-115, London, UK, 1995.
[5] J. W. Lockwood, H. Duan, J. J. Morikuni, S. M. Kang, S. Akkineni, and R. H. Campbell. Scalable Optoelectronic ATM Network: The iPOINT Fully Functional Testbed. IEEE Journal of Lightwave Technology, pages 1093-1103, June 1995.
[6] K. Nahrstedt. An Architecture for End-to-End Quality of Service Provision and its Experimental Validation. PhD thesis, Department of Computer and Information Science, University of Pennsylvania, August 1995.
[7] K. Nahrstedt and J. M. Smith. The QoS Broker. IEEE Multimedia, 2(1):53-67, Spring 1995.
[8] R. Sharma and S. Keshav. Signalling and Operating System Support for Native-Mode ATM Applications. In ACM SIGCOMM, pages 149-157, London, UK, September 1994.
[9] L. Zhang, B. Braden, D. Estrin, S. Herzog, and S. Jamin. RSVP: A New Resource ReSerVation Protocol. IEEE Network, pages 8-18, September 1993.