Optimization of Resilient Packet Ring Networks Scheduling for MPEG ...

2 downloads 0 Views 201KB Size Report
Scheduling for MPEG-4 Video Streaming. Jian Zhu Ashraf Matrawy Ioannis Lambadaris. Department of Systems and Computer Engineering. Carleton University.
Optimization of Resilient Packet Ring Networks Scheduling for MPEG-4 Video Streaming Jian Zhu

Ashraf Matrawy

Mohsen Ashourian

Ioannis Lambadaris

Department of Systems and Computer Engineering Carleton University Ottawa, Ontario, Canada {jianzhu, amatrawy,ioannis}@sce.carleton.ca

Islamic Azad University of Iran, Majlesi Branch Isfahan, Iran [email protected]

Abstract – Resilient Packet Ring (RPR) is an emerging standard for the construction of local and metropolitan area networks. The Priority Queue (PQ) algorithm, which is recommended as the scheduling scheme for RPR always gives priority to the transit buffer. Using this scheduling scheme in single transit buffer RPR, the high priority traffic, such as video packets waiting to access the ring at congested node in the transmit buffer will suffer large delays and unsteady delay jitters. In this paper, we propose a new scheduling scheme for RPR to improve the quality of service for video traffic transmission. The proposed scheduling scheme alternately selects packets from the single transit buffer and the high priority transmit buffer using Deficit Round-Robin (DRR) algorithm. If there is no packet in the above two buffers, low priority transmit buffer is then served. We investigate the system performance for transmission of MPEG-4 encoded bitstream as high priority traffic in an RPR network in scenario that all traffic is forwarded to a common node. We report end-to-end delay and delay jitter for I, P and B frames of several encoded video streams. Simulation results show certain improvement on overall delay and delay jitter performance for all types of video frames, especially for I frames which are the most important frames for video reconstruction.

time, provided the packets use different segments of the ring. Each node of RPR is connected to two ringlets. The direction of one ringlet is clockwise and the other is counter clockwise. In RPR, packets traveling on the ring are stored in each node’s transit buffer, while packets accessing the ring from local node are stored in the transmit buffer. To provide Quality of Service (QoS) guarantees, traffic scheduling algorithm is needed in RPR. These guarantees include end-to-end delay, delay jitter, packet loss rate, or a combination of these parameters. The function of a scheduling algorithm is to select the packet to be transmitted in the next cycle from the available packets belonging to the flows sharing the output link. IEEE 802.17 standard organization proposed two ring architectures for RPR: Single Transit Buffer (STB), and Dual Transit Buffer (DTB) [1][2]. For both architectures, a high priority and low priority transmit buffer exist in each node. In the STB architecture, a single transit buffer mixes Low Priority (LP) and High Priority (HP) traffic from the transmit buffer. Traffic in the single transit buffer has the highest priority. STB has lower implementation complexity. But the weakness of this architecture is that high priority traffic trying to access the ring will be delayed by any low priority traffic already transiting on the ring. At a congested node, high priority traffic waiting to access the ring will suffer large delay and unsteady delay jitter. Video data, classified as high priority traffic in RPR, is delay and delay jitter sensitive. Therefore the original scheduling scheme of RPR is not appropriate for the video traffic. To improve the QoS of video traffic transmission, we propose a new scheduling scheme for RPR networks with STB. We implement Deficit Round-Robin (DRR) algorithm [3] to alternately select packets from the single transit buffer and high priority transmit buffer (Fig.2). Low priority transmit buffer can be served by the scheduler only when there is no packet in the above two buffers. By using our scheme, video traffic in high priority transmit buffer can regularly access the ring according to its weight assignment. Using the proposed scheduling scheme, we investigate the transport of real-time video traffic generated by MPEG-4 applications in a RPR network. Simulation results show certain improvement on overall delay and delay jitter performance for all types of video frames. The rest of the paper is organized as follows: An overview

Keywords: Resilient Packet Ring, Scheduling, Deficit RoundRobin, MPEG-4. I. INTRODUCTION Resilient Packet Ring (RPR) [1][2] technology known as IEEE 802.17 standard is designed to meet the requirements of packet-based local and metropolitan area networks. RPR networks have multiple performance advantages such as spatial reuse, resilience, scalability, bandwidth efficiency and ease of management. RPR has a bidirectional full duplex ring topology. Bidirectional rings provide resilience since a packet can reach its destination even in a situation of one link failure. By letting the destinations strip the packets, RPR spatially reuses the bandwidth. Since several packets can travel on the ring at the same This work is supported by CITO (Communications and Information Technology Ontario) and NSERC (Natural Sciences and Engineering Research Council of Canada)

0-7803-8938-7/05/$20.00 (C) 2005 IEEE

271

on DRR is provided in Section II. Section III contains our proposed scheduling scheme. Simulation results and analysis are presented in section IV. Conclusions are drawn in Section V.

served and emptied. The MAC scheduler sends traffic in the following order: 1) Packets sent from the mixed transit buffer. 2) Packets sent from the high priority transmit buffer. 3) Packets sent from the low priority transmit buffer.

II. DEFICIT ROUND-ROBIN (DRR) ALGORITHM The round-robin schedulers divide the time into allocation cycles and allocate each active flow a fraction of the available bandwidth in each cycle. The main attraction of this class of schedulers is their simplicity. Of all the scheduling algorithms, DRR [3] is the most efficient for high speed network for its low operational cost and O(1) complexity. The DRR scheduler maintains a linked list of the active flows, called the Active List [4]. A flow is added to the tail of the Active List when this flow is activated by the new arriving packets. A flow is deemed as active during a certain time interval if it always has packets awaiting service. A round is defined as one round-robin iteration during which the DRR scheduler serves all the flows that are present in the Active List. DRR scheduler assigns quantum to each active flow. Suppose that transmission rate of an output link is r. Let n be the total number of flows and let pi be the reserved rate for flow i. Let pmin be the lowest of these reserved rates. In order that each flow receives service proportional to its guaranteed service rate, the DRR scheduler assigns a weight to each flow. The weight is given by [4]

Low Priority

High Priority

Add –in Traffic

Drop Traffic

Receive

Client MAC Leaky Bucket

Pass through Traffic

Priority Scheduler

Transit Buffer

From Ring

To Ring

Fig.1. The MAC Architecture in RPR with STB

Low Priority

High Priority

Transmit Buffers Add –in Traffic

Note that wi ≥1. Let Qi represent the quantum assigned to flow i and let Qmin be the quantum assigned to the flow with the lowest reserved rate. The quantum assigned to flow i, Qi is given by wi*Qmin. In this way, the quantum assigned to each flow is in proportion of its reserved rate. In order that the work complexity of the DRR scheduler is O(1) (the complexity doesn’t increase with the number of flows or packets), it is important that Qmin should be equal to or greater than the size of the largest packet that may potentially arrive at scheduler. Each flow has a deficit count, which records the difference between the amount of data actually sent thus far, and the amount that should have been sent. This deficit is added to the value of the quantum in the next round, as the amount of data the scheduler should try to schedule in the next round. Therefore, a flow that received very little service in a certain round is given an opportunity to receive more service in the next round [3][4].

Drop Traffic

(1)

This approach guarantees low delay for higher priority traffic. However, it often leads to the starvation of lower priority flows. In the single transit buffer architecture, traffic on the ring, which is a mixture of HP and LP transit traffic, has higher priority over the transmit HP traffic. This could cause the LP traffic on the ring block the transmit HP traffic from accessing the ring. During congestion interval, low priority transit traffic can block the transmission of high priority traffic waiting to access the ring and degrade the delay and delay jitter performance of packets in the high priority transmit buffer. As we know, real time data such as video and voice is classified as high priority traffic in RPR, which suffers more from packet delay and delay jitter.

Receive

wi=pi/pmin

Transmit Buffers

Client MAC

III. THE PROPOSED SCHEDULING SCHEME IN RPR

Leaky Bucket

Each node of RPR has a high priority and low priority transmit buffer at the client side. In the single transit buffer architecture, a transit buffer combines mixed low priority and high priority traffic. Fig. 1 shows the Single Transit Buffer configuration. Priority Queue (PQ) algorithm is used as the scheduling scheme for RPR. The scheduler handles a specific queue on the basis that all the higher priority flows have been

Pass through Traffic

From Ring

Transit Buffer

DRR Scheduler

Priority Scheduler

To Ring

Fig.2. The MAC Architecture in the Proposed Scheme.

272

To optimize the video transmission performance of the RPR networks, we proposed a new scheduling scheme [5]. In our scheme, we substitute part of the original priority queuescheduling algorithm with deficit round-robin algorithm. We use deficit round-robin scheduling between the transit buffer and transmit high priority buffer. The transmit low priority buffer is served by the scheduler only when there is no packet in both the transit buffer and transmit high priority buffer. Fig. 2 shows the MAC architecture of our scheme. Using DRR in our scheme can guarantee the regularly access of the high priority transmit traffic to the ring and it only requires O(1) work to process a packet. Therefore, it does not add extra complexity to the original architecture. IV. SIMULATION RESULTS AND ANALYSIS We study the performance of our proposed scheduling scheme on delivering MPEG-4 video streams. MPEG-4 [6] is the emerging standard for audio and video compression. It is capable of exploiting both spatial and temporal redundancies. MPEG-4 has three frame types, namely I-, P-, and B-frames. Iframes are intra-coded frames, P-frames are predicted frames depending on the previous I-frame, and B-frames are bidirectional predicted frames (depending on the previous and following I- or P-frame). Frames are arranged in Group of Pictures (GOP). A GOP consists of exactly one I-frame and some related P-frames and optionally some B-frames between these Iand P-frames. In MPEG-4, losing an I-frame would cause distortion of all following frames in a GOP. A P-frame loss would only influence the B-frames and the loss of a B-frame would not influence any other frame. We simulate a 10-node ring network in OPNET modeler. Fig. 3 shows the RPR network topology with Hub scenario in our simulation (a scenario that all traffic is forwarded to a common node). Four nodes (6, 7, 8, 9) send packets on the outer ring to the terminal node 0, the Hub. Observations are made in the Hub. RPR recommended fairness mechanism for single transit buffer [1,2] is used in simulations. Table 1 shows the simulation parameters of background traffic. The three packet sizes emulate real traffic profile. The generated traffic rate from each node is 33 Mega bits/s, that has 10 Mega bits/s high priority traffic and 23 Mega bits/s low priority traffic. This results in congestion at node 9. In addition to the traffic mentioned in Table 1 for each node, at node 9 we add an extra MPEG-4 stream traffic source which is also destined to the Hub. In this way, we can study the transmission performance of MPEG-4 streams at the congested node. In this paper, we used two different video sources. “News” with average motion and “Akiyo” with small motion are encoded at 384 Kbps [7]. Both videos are in Common Intermediate Format (CIF), which means they are 352 × 288 pixel and 30 frames per second. They are encoded according to MPEG-4 with a fixed 12-GOP with two consecutive B-frames (IBBPBBPBBPBB). The UDP protocol is used to segment and packet the MPEG-4 frames. If a frame is larger than 1024 bytes, it is segmented to smaller packets before transmission.

Fig.3. RPR Network Topology Used in the Simulation Table 1. Simulation Parameters Parameters

Value

Link Capacity

100 Mega bit/s

Fairness Sample Period

100 µs

Transit Queues Size

1 Megabits

Packet Sizes

Traffic Generated in Each Node

60%: 64 byte 20%: 512 byte 20%: 1518 byte HP:10 Mega bits/s LP: 23 Mega bits/s

In our simulation, the DRR weight assignment of transmit high priority flow is proportional to its guaranteed service rate. The remaining portion is assigned to the transit buffer. In this paper, we choose guaranteed high priority transmits traffic rate for each node as 10M bits/s. Since the total link capacity is 100M bits/s, we assign weight 1 to transmit high priority traffic buffer, and weight 9 to its transit buffer. We compared the proposed scheduling scheme performance in terms of delay and delay jitter with the original RPR scheduling scheme. Fig. 4(a), Fig. 4(b) and Fig. 4(c) show the endto-end delay of I, P and B frames of video sequence “News” for both scheduling schemes. Fig 5 shows the same comparison for video sequence “Akiyo”. For both videos, we observed better end-to-end delay performance by using our proposed scheme. Table 2 shows the average end-to-end delay comparison of two scheduling schemes. From Table 2, we can see that the average delay of I frames by using original RPR scheduling scheme is almost 212 msec, a large delay in real time applications such as teleconferencing [6]. Using our proposed scheme, the delay for all types of frames is reduced approximately 10 times. The result is more obvious for I frames because I frames are larger in size and are segmented to more UDP packets than other frames.

273

(a)

(a)

(b)

(b)

(c)

(c)

Fig.4. Comparison of End-to End Delay of Video “News” by Using Different Scheduling Schemes

Fig.5. Comparison of End-to End Delay of Video “Akiyo” by Using Different Scheduling Schemes

Fig. 6 shows the delay jitter comparison of I, P, B frames of average motion video “News”. Fig 7 shows the same comparison but based on the video “Akiyo”. Based on our proposed scheme, the delay jitter of video transmission outperforms the original scheme for both videos. When our proposed scheduler handle the HP transmit traffic from node 9, the ring traffic from other node is slightly delayed. But from our observation of simulation, the overall delay and delay jitter performance for HP traffic from all the nodes was improved [5].

Table 2. Average End-to-End Delay Comparison Video Sequence

News

Akiyo

274

I P B I P B

Average End-to-End Delay (msec) Original Proposed Scheme Scheme 212 22 29 3 24 2 230 24 30 3 22 2

Fig.6. Comparison of Delay Jitter of Video “News” by Using Different Scheduling Schemes

Fig.7. Comparison of Delay Jitter of Video “Akiyo” by Using Different Scheduling Schemes

V. CONCLUSIONS In this paper we proposed a new scheduling scheme for RPR network. We have simulated streaming of MPEG-4 encoded bit-streams from the congested node in an RPR network in Hub scenario. We investigate the system performance in terms of end-to-end delay and delay jitter of video frames for the original RPR scheduling scheme and our proposed scheme. The proposed scheme shows much lower end-to-end delay, and lower jitter delay for all types of I, P and B frames of video sequences. The improvement was more significant for I frames which are the key frames for video reconstruction at the receiver. In practice at the receiver we have limited play-out buffer for video. Therefore, we need to discard video frames with delay higher than a fixed threshold [8]. Lower delay and delay jitter of frames will result in lower delay in playing video and smaller play-out buffer requirement. REFERENCES [1]M.Takefman, et al. “IEEE Draft P802.17”, Draft 3.3, Apr. 2004. [2]D. Tsiang and G. Suwala, “CISCO SRP MAC layer Protocol”, Internet Engineering Task Force (IETF) Comments 2892, Aug. 2000.

275

[3]M. Shreedhar and G. Varghese, “Efficient fair queuing using deficit round-robin,” IEEE/ACM Trans. Networking, Vol. 4, pp. 375–385, June 1996. [4] S. S. Kanhere and H. Sethu, “ On the Latency Bound of Deficit Round Robin”, Proc. of 11th International Conference on Computer Communications and Networks, PP.548-553, Oct. 2002 [5]J.Zhu, A. Matrawy, I. Lambadaris and M.Ashourian, “A New Scheduling Scheme for Resilient Packet Ring Networks with Single Transit Buffer”, Proc. of IEEE GLOBECOM 2004 Workshop (CAMAD’04), pp.276-280, Nov. 2004. [6]ISO-IEC/JTC1/SC29/WG11.ISO/IEC 14496: Information technology - Coding of audio-visual objects, 2001. [7]http://www.tkn.tu-berlin.de/~jklaue/cif.html, Retrieved Nov. 2004. [8]J.Klaue, B.Rathke and A.Wolisz, “EvalVid--A Framework for Video Transmission and Quality Evaluation”, Proc. of the 13th International. Conference on Modelling Techniques and Tools for Computer Performance Evaluation, Sep. 2003.

Suggest Documents