Integration of Error Recovery and Adaptive Playout ... - Semantic Scholar

1 downloads 0 Views 802KB Size Report
In [1], we propose a synchronized multicast media streaming framework ... variations, streaming media applications are deploying error controls as well as rate ...
Integration of Error Recovery and Adaptive Playout for Enhanced Multicast Media Streaming Jinyong Jo, Yoonyoung Kim and JongWon Kim Networked Media Lab., Department of Information and Communications, Kwang-Ju Institute of Science and Technology (K-JIST), Gwangju, 500-712, Korea E-mail: {jiny92,yykim,jongwon}@kjist.ac.kr ABSTRACT Multicast streaming has been extensively investigated to tackle the network bandwidth challenge. However, the ubiquitous deployment of IP multicast on the Internet is expected only after several years to come due to the manageability and scalability problems. We study a pragmatic solution to enhance the feasibility of multicast media streaming by focusing on an one-to-many streaming over the source specific multicast (SSM) service. In [1], we propose a synchronized multicast media streaming framework employing server-client coordinated adaptive playout. The adaptive playout mechanism controls the playout speed of audio and video by adopting the time-scale modification of audio. Based on the overall synchronization status as well as the buffer occupancy level, the playout speed of each client is manipulated within a perceptually tolerable range. In this work, we are extending the framework to incorporate the error recovery. Each client performs an interactive error recovery with the assistance of adaptive playout. RTCP-compatible signaling between the server and clients is performed, where the cumulative feedback for retransmission (to address the bandwidth restriction on the control messages) is assisted by the adaptive playout. The simulation results demonstrate several enhancements in terms of playout discontinuity and error recovery. Keywords: Multicast media streaming, source specific multicast, inter-client synchronization, cumulative NACK-based error recovery, and adaptive playout control.

1. INTRODUCTION With the enormous growth of the Internet, more and more applications are distributing media contents to worldwide users. The average size of media contents transferred over the Internet are exponentially increasing everyday. It can be partly supported by the upcoming broadband network infrastructure. However, to provide broadband media contents reliably and scalably, an efficient utilization and proper management of limited network bandwidth become increasingly important. As a viable option to save the precious bandwidth, multicast provides immense merits in designing multimedia applications. Especially for continuous streaming media (e.g., audio and video), it allow us to support a large set of clients simultaneously. The major problem with multicast is that, due to the manageability and scalability problems, the ubiquitous deployment of IP multicast on the Internet is expected only after several years to come [2]. Realizing an effective media streaming over the IP multicast faces lots of challenges. It requires feasible signaling/transport protocols, stable multicast-enabled networks, and network-adaptive applications to distribute the broadband media in an acceptable quality [3]. For the network side, source specific multicast (SSM) model [4] alleviates several complications (e.g., by restricting application scenario) of conventional any source multicast (ASM), especially for one-to-many media streaming. For the protocol side, real-time transport protocol pair (RTP/RTCP [5]) can provide interoperable/monitored real-time transport channel over the IP networks. RTP/RTCP in itself does not guarantee quality of service (QoS) to streaming media applications but acts as a helper. Streaming media applications at the server and clients are responsible to adaptively take charge of network congestion/error/delay variations and system resource limitations. To overcome the network variations, streaming media applications are deploying error controls as well as rate (e.g., congestion and flow) controls. Reliable transmission of RTP/UDP packets needs error controls based on automatic repeat request (ARQ), forward error correction (FEC), or their hybrid. In the context of reliable multicast, building blocks are

proposed based on the NACK, tree-structured ACK, and FEC schemes [6]. It is also important to deploy an effective congestion control scheme to cope with the network congestion while achieving the TCP-friendliness [7]. The error control directly affects media quality. Since the Internet does not guarantee the QoS, a reliable transmission needs error controls including the ARQ, the FEC, and their hybrid. Each of them has its own merit and demerit, and works well in the unicast environment while remaining too challenging in the multicast environment. Especially, the error recovery by retransmission is appealing due to its simplicity and efficiency. However, the required feedback bandwidth and repair latency make the problem difficult. In the unicast environment, works in the IETF (Internet Engineering Task Force) are standardizing selective retransmission mechanisms as a part of RTP/RTCP [8,9]. They focus on modifying the retransmission-based error recovery. For example, [9] introduces a desired RTCP format for the selective retransmission over a wireless link. Cumulative NACK approach in [10] is utilizing multiple multicast channels to cope with the feedback implosion, where a TCP’s window-like scheme is used to merge the retransmission request. The playout of streaming media needs to be synchronized according to the associated timing information (e.g., timestamp per each packet). The goal of media synchronization is to reconstruct the timing relation of media contents at the clients. The noticeable disruption (due to buffer underflow or system overload) and/or discrepancy (due to playout timing skews between different media objects) in the playout significantly degrades the playback quality. These intra-client media synchronization including the lip synchronization has been considered as an important issue. To keep the synchronization, streaming media applications have to deal with the network delay jitter and loss that can cause the playout disruption. They also need to keep the smooth playout regardless of the exceptional events at the client system. In addition, we can expect to see different clients are playing different portion of media. This situation is happening since each client is connected to the streaming server through paths of different bandwidth, loss, and delay. The difference in the capability of client system is another reason. Thus, we need to address the playback synchronization issue in both intra- and inter-client aspects while properly controlling the buffers at the clients. That is, an adaptive playout is required to maintain the playout synchronized despite of the network fluctuations and system limitations [11, 12]. Note that, compared to the multicast rate and error control issues, multicast playback synchronization has been rarely discussed. To cope with the above challenges and provide high-quality media streaming, an adaptive multicast streaming framework has to be established. In order to reduce the playback discontinuity and mitigate the heterogeneity, we propose a synchronized multicast streaming framework in [1], where the synchronized playout of all clients is adaptively managed. In the proposed framework, each client keeps synchronization despite of the exceptional system events as well as the network fluctuations. To assist each client for the challenge, we propose to adaptively control the playback speed. By extending the audio/speech adaptive playout with time-scale modification in [13, 14] to audio/video, the playback speed of the client can be varied. That is, based on the combined buffer (i.e., network and application buffers) occupancy level, the player at the client is adaptively expanding or contracting the playout within a range that it does not hurt the viewers perceptually. We can proactively conduct the adaptive playback not only to reduce the playback discontinuity but also to guarantee high-quality playback with flexible error controls∗ . Thus, in this paper, we are focusing the impact of adaptive playout to the cumulative NACK-based error recovery. The cumulative NACK-based error recovery is assisted by the adaptive playback to secure enough time for request/reply of retransmission. Furthermore, to reduce unnecessary feedbacks, whether to request retransmission or not is selectively determined. In this paper, we are interested in verifying the effectiveness of adaptive playout to the multicast error recovery. We choose to evaluate the proposed framework by conducting simple yet extensive network simulations. Results show that the proposed framework can reduce the playback discontinuity without degrading the media quality while enhancing the inter-client synchronization by mitigating the client heterogeneity. At the same time, it can assist the retransmission-based error recovery with the adaptive playout and reduce bandwidth through the selective retransmission and cumulative feedbacks. ∗

We are restricting our attention only to the adaptive playout and error control, setting aside the integration of congestion control to future works.

The rest of this paper is organized as follows. The proposed multicast media streaming framework is introduced in Section 2. The core components of the framework are explained subsequently in Section 3. After reviewing the simulation results in Section 4, we conclude the paper in Section 5.

2. SYNCHRONIZED MULTICAST MEDIA STREAMING FRAMEWORK Multicast has been extensively investigated to tackle the network bandwidth challenge. However, the global deployment of multicast is hardly accomplished at present because of the involved complexities and limited QoS capabilities. In this paper, we focus on a pragmatic solution to enhance feasibility of multicast media streaming service. That is, we attempt to propose the synchronized multicast media streaming framework that can mitigate the deployment complexity and enhance the service quality. The proposed framework will adopt the SSM model [15, 16], because it offers an off-the-shelf solution to overcome the deployment complexities of IP multicast. Furthermore, the RTP/RTCP over the SSM environment is adopted to adjust the protocols to the SSM [17]. Multicast RTP Multicast RTCP Unicast Feedback

SSM-enabled IP network

Frames played

Client 1

Server audio video

ARQ Network loss Client 2

Retransmission buffer

. . . Client i ARQ Client n

Client n compositor

Application loss

. . .

decoder (AD) demux

VD monitor scheduler

timer controller

Figure 1. The proposed synchronized multicast media streaming framework.

Fig. 1 illustrates the overall framework of the proposed synchronized multicast media streaming, where the adaptive playout control for audio/video streams (e.g., MPEG-2 and MPEG-4) is deployed with the media delivery through the SSM-enabled IP network. Multiplexed audio/video stream (e.g., MPEG-2 program stream (PS) [18]) is transmitted to all the clients joined through a PIM-SM (protocol independent multicast - sparse mode) routing. It is assumed that a single-layer-encoded stream is delivered to moderate number of clients (e.g., around 100) with receiver buffers. Under this kind of situation, one can expect clients to lose synchronization easily and play slightly different portion of the stream compared to the other clients. Note that we are assuming the lightly coupled synchronization, where the server and clients need to be synchronized within an allowed range. Note also that this is in comparison to the tightly synchronized playout employing the clock synchronization by the network time protocol (NTP) [19]† . The desired inter-client synchronization is accomplished in this work by performing the playout adaptation with the audio time-scale modification [13, 20]. The synchronized playout is assisted with the feedback-based streaming control, which also can encompass the rate and error controls. The simple retransmission-based error recovery can be enhanced with the explicit help of playout control, which tries to expand the time available for retransmission by slowing down the playback temporarily. Server can also help the error recovery implicitly by †

With the time synchronization and given presentation time, the inconsistency of the playout can be controlled. However, it may result in the increase of playback discontinuity, which becomes more evident when the performance of clients are not homogeneous and subject to diverse network fluctuations.

relaxing the target presentation time to that of the slowest client to secure the time for retransmission. The client can not do a multicast feedback, since the neighboring routers for each client are configured by IGMPv3 (Internet Group Management Protocol, version 3) [21] protocol to filter a source in the SSM. Instead, each client uses a customized unicast RTCP receiver report (RR) while the server multicasts a RTCP sender report (SR) to deliver a control message [17]. The most important component of the proposed framework is the adaptive playout control. It helps us to minimize the playout discontinuity and to cope with the network fluctuations and system limitations. We proposed to deploy the adaptive playout mechanism in multicast environment and exploit its flexibility advantage. Inter-client synchronization is necessary to provide high-quality media streaming service with the fairness among session participants. Synchronized playout of participating clients makes it easy for the server to deploy its network adaptation mechanisms. That is, the necessary server-client interaction for the rate, error, and synchronization can be better accomplished with the adaptive playout of each client. Also, feedback-based scheme with RTCP-compatible signaling is proposed to achieve the synchronization with simplicity. An effort on accommodating the RTP/RTCP to the SSM environment has been made in [17], which generally deals the RTCP extensions for the SSM. We extend it to RTCP-compatible signaling for the inter-client synchronization. We also propose to recover losses based on cumulative NACK-based retransmission request. Although the retransmission-based error recovery has shortcomings like long repair latency and limited multicast scalability, it still remains as an attractive candidate for the error recovery because of its simplicity and efficiency. In the multicast environment, retransmission request/reply may increase network traffics drastically unless multicast transmission is combined with the right feedback implosion control. Issues even become worse if the exchange of controlling message is restricted as in the SSM environment. To solve the scalability problems of retransmission-based error recovery for the SSM with unicast feedback [17], we propose to adopt selective retransmission request similar to [8, 9, 22] with cumulative NACKs to save feedback bandwidth. Furthermore, server-client coordinated playout control helps to salvage the time for retransmission.

3. ADAPTIVE PLAYOUT AND ERROR CONTROL 3.1. Adaptive Playout with Audio Time-scale Modification To provide high-quality video streaming and ease the client adaptation to the network fluctuations and the system limitations, the client-based adaptive playout is adopted. As shown in the magnified part of Fig. 1, the multiplexed stream is stacked at the receiver buffer of each client to wait the decoding and playback. The stream is de-multiplexed into the respective audio and video decoding buffers, where the presentation timestamp (PTS) located at the header of MPEG-2 PS is recorded into a scheduler. It also gets the current buffer occupancy level from a buffer monitor. Based on the timing and buffering status, the scheduler controls the adaptive playout by expanding/contracting the playout. The buffer monitor checks the sequence number of incoming packets and determines the loss from the network - it performs gap-based loss detection. The associated timer of the controller provides the required timing information and controls the retransmission. The main role of the adaptive playout control is to reduce the discontinuity incurred by packet over/underflows and momentary CPU overload. An effective playout adaptation allows us to avoid excessive packet droppings at the application, conceal the network fluctuations and thus minimize the degradation of perceptual media quality. With the audio time-scale modification [13, 20], we can significantly enhance the adaptation capability of the client at the cost of increased computation. To perform the adaptive playout with the timescale modification, we need to know the allowed ratio within which a player can manipulate the playback speed without being detected by the user’s perception. Even though the allowed playout variation differs based on the type of audio (including the silence), we assume that playout variation up to 25% is usually unnoticeable‡ . In this paper, we define the playback factor (∆p ) to specify this ratio. With the time-scale modification based on the Waveform Similarity OverLap-Add (WSOLA) algorithm [20], a player can vary the playout speed. In Fig. 2, the packet version of modified WSOLA [13] for the time-scale expansion and contraction is illustrated. For the time-scale expansion, a template segment which is longer ‡

The playback speed should be changed with caution to keep the playout consistent and smooth.

Original

P1

P2

P1

P2 Waveform similarity

P1

Waveform similarity

P0 P1

Realignment P1

P2

P2 Time scale contraction

Time scale expansion Time scale modified

P2

Realignment

P1

P2

Figure 2. Packet based SOLA in time-scale modification.

than one pitch period from the original waveform is sampled and then similar segment is searched based on the waveform similarity. The found segment followed by the rest of the samples in the original packet is overlapped with the template segment and added to the time-scale modified packet. The time-scale contraction is performed in a similar way except that it narrows searching region within a packet distance to compress the output waveform.

3.2. Local Playout Adaptation pci Ps + ∆ p Ps Ps − ∆ p bS bH bR t

bL bi

Figure 3. Desired operation of the proposed adaptive playout control.

Under the proposed framework, each client is performing the playout adaptation locally. The local playout is heavily tied with the buffer control. It adapts to control the combined buffer occupancy§ guided by the playback factor. In this paper, we fix the buffer size of all clients as small as possible but large enough to service the worst case client with the largest round trip time (RTT) and loss. Then, we denote the current buffer occupancy level of client i at time t as bi (t) and specify two fixed thresholds for the receiver buffer as shown in Fig. 3. They are lower limit (bL ) and higher limit (bH ) thresholds. When bi (t) falls under bL or exceeds bH , there is high risk of buffer underflow or overflow, respectively. We introduce another implicit threshold, namely retransmission limit (bR ), to guide the retransmission-based error recovery. With this, the controller determines whether it request the retransmission of lost packet or not. We will cover this threshold later in the error recovery part. As discussed above, the local playout adaptation attempts to manage bi (t) between bL and bH . By helping the buffer-level control within the range, the adaptive playout reduces the risk of buffer underflow/overflows. By keeping the playback speed of client i at t, pic (t) within acceptable range Ps · (1 − ∆p ) ∼ Ps · (1 + ∆p ) (i.e., 1 ± ∆p of its normal playback speed Ps ), we can prevent exceptional events that may result in unnecessary § The size of receiver network buffer is assumed to be much larger than the summation of decoder and composition buffers. If this assumption becomes invalid, we need to consider the virtual combined buffer that comprises all three buffers together.

packet discard. Thus, with the adaptive playout based on the time-scale modification, we can manage the buffer level more flexibly.

3.3. Adaptive Playout for Intra-/Inter-client Synchronization Another key role of local playout adaptation at each client is towards the synchronized playout (intra-client synchronization). If there exists noticeable discrepancy in the audio and its corresponding video (or temporary disruption of playout), it disturbs the audience a lot. In this paper, we assume that globally synchronized time reference is not available (i.e., lightly coupled synchronization). Thus, each client is responsible to manage the synchronization based on its own time reference, namely local virtual time. The time reference is guided by the PTS delivered with media packets. We also focus on the intra-media aspect of intra-client synchronization. Note that the intra-client synchronization issue is closely tied with the inter-client synchronization to be explained later. Tci , Vci

SR( n + 1)

SR( n) Ts ( n)

Tci (n + 1)

Tci Vci t

εt ε ci

Figure 4. Server-aided playout adaptation for both types of synchronization.

Fig. 4 depicts the proposed playout operation to maintain the playback close to a target presentation time. Initially, the virtual presentation time vci (t) starts concurrently with the start of each playout at tstart and the timer records the current presentation time of client i, Tci (tstart ). Note that each media packet contains the PTS. Using Tci (tstart ) as the reference, we can estimate the virtual presentation time at tstart + ∆t . Each client then compares its current presentation time Tci (t) to the virtual one vci (t). The target range is controlled by ²t , which is centered around the vci (t). Here, we denote by ²ic the amount of playout time discrepancy with respect to the vci (t). That is, ²ic means the time difference between the virtual (vci (t)) and actual (Tci (t)) presentation time. When the ²ic falls in the target range, i.e., |²ic | < 0.5 · ²t , the playout adaptation is not necessary. When ²ic becomes greater than ²t /2 or less than −²t /2, each client attempts to adjust the current presentation time and consequently move the buffer level. Each player either fastens or loosens its playback speed to compensate the time discrepancy (|²ic − 0.5 · ²t |) until the Tci (t) goes into the target range. To accomplish the required server-client synchronization, we adopt a feedback-based synchronization. RTCPcompatible signaling between the server and clients is performed to exchange controlling messages. The RTCP RR interval is determined based on the bandwidth of standard RTP and RTCP application-defined (APP) packet, which is customized to deliver the necessary timing information for the synchronization. When the server receives the compound RTCP RR packets, it gets the current presentation time of each client (Tci (t)) and estimates the RT T i . To determine the target Ts from the aggregated feedbacks from all the clients, the server needs to set an arbitration policy. It is important to adjust the fairness or performance of all clients in terms of synchronization¶ . For example, we can choose a simple averaging for the arbitration and thus Ts is calculated by averaging all Tci ’s. Or we can use other policies such as median, minimum, or maximum. Refer to [1] for the detailed calculation for Ts . With the calculated Ts , the server multicasts the compound RTCP packets containing Ts to all clients and each client adapts its playback speed. ¶ Note that this issue is tightly connected to the fairness of multicast congestion control. We also assume that clients with badly broken links are compelled to leave the multicast group so that it does not bias the arbitration.

The previous Fig. 4 also illustrates the detailed behavior of the server-aided client adaptation with respect to the presentation time. Note that the proposed server-aided client adaptation is tightly coupled with the local playout adaptation. At this stage, we are using a simple approach to coordinate local and server-aided adaptations. Simply speaking, we are just replacing local virtual time with the guide from the server. Note however that it is very important to make the switch among the local and server-aided adaptations smooth so that the playout at each client is consistent. The detailed procedure is as follows. While each client tries to adapt the local presentation time Tci (t) to its virtual presentation time vci (t), it receives the target presentation time Ts (n) from the server. Upon the reception, each client goes through transient adaptive playout period by simply resetting its reference presentation time of virtual timer to the server-reported Ts (n). After the reset, the player goes into the adaptation state to re-adjust the presentation time within the allowed target range.

3.4. Adaptive Playout Control for Error Recovery We can extend the role of local playout adaptation in order to secure the time required for retransmission request/reply. It enhances the chance of error recovery even in the circumstances that RTT is relatively small. Proposed cumulative NACK-based error recovery scheme is illustrated in Fig. 5. It is no use to request the retransmission if the buffer level bi (t) is less than the retransmission limit bR . Precise estimation of one-way delay from each client to the server is very important in setting the bR effectively. However, with the SSM and RTP/RTCP environment, it is rather complex to measure the RTT at the clients without the clock synchronization. By slowing down the playout speed, the adaptive playout can secure sufficient time for retransmission. Thus, the threshold for successful retransmission-based error recovery - the effective bR - can be lowered to bR bR i (1+∆p ) . It thus enlarges the possible range of retransmission to (1+∆p ) ≤ b (t) ≤ bH . bi

Up to n feedback cumulation is possible

bR + Acn (t ) (1 + ∆ p )

Retransmission request is impossible

b bR R (1 + ∆ p )

t

effective bR

∆ p ⋅ bi (t )

Ac

bR

bR (1 + ∆ p )

bi (t ) t

Figure 5. Error recovery with the help of adaptive playout control.

Furthermore, the cumulative NACK can save the feedback bandwidth. When clients detect a packet loss based on a sequence gap, it checks the buffer occupancy level bi (t) to determine whether it needs to request retransmission right away or it can hold the request for a while to process multiple requests together. From Fig. 5, it is natural that bi (t) must be greater than bR to merge multiple requests together when there is no adaptive playout. To merge feedbacks of n consecutive packets in the bitmask field of packet header, we need the buffer level margin of Anc (t). That is, bi (t) has to be maintained over bR + Anc (t). When bi (t) lies between bR n i n (1+∆p ) ∼ bR + Ac (t), we can hold the feedbacks by Ac = min[(1 + ∆p ) · b (t) − bR , Ac (t)] by slowing down the playback speed. Based on the buffer occupancy, the selective decision for retransmission is made. After making the decision, each client feeds the cumulative NACK back to the server using the unicast feedback channel. When the server receives cumulative NACKs from the clients, it checks the requests to suppress duplicate retransmission requests. For example, if there are more than two NACKs requesting the same packet within a RTT, late requests are suppressed at the server. The requested packet which is unsuppressed by the server is then multicasted back to all clients.

Inter-client synchronization SYNC Receive

Ts

Local adaptation b(t ) ≤ bL

OR

b(t ) > bL

AND

Set Ts as a reference point

b(t ) ≥ bH

RISK b(t ) < bH

NORMAL Loss AND bR ≤ bi (t ) (1 + ∆ p )

No Loss OR bR > b i (t ) } {Loss AND (1 + ∆ p )

ARQ

ARQ SYNC NORMAL

RISK 0

RISK

bL

bH

Error recovery bS

Figure 6. Proposed coordination among the adaptive playout states.

3.5. Coordination of Adaptive Playout Controls Fig. 6 describes the proposed operation of adaptive playout control for the local adaptation, the server-aided adaptation, and the cumulative NACK-based error recovery. The coordination is performed among the coordination states: NORMAL, RISK, SYNC, and ARQ. A coordination state transits to the other states based on the buffer occupancy, the losses from network and system, and/or the server-reported Ts . When the coordination state is under the NORMAL, it tries to adjust the current presentation time Tci (t) to the virtual presentation time vci (t). The coordination state goes into the RISK when the application is at the high risk of buffer under/overflows, i.e., bi (t) ≤ bL or bi (t) ≥ bH . Since the RISK has the highest priority for attention to prevent the playback discontinuity, the player cannot escape from the RISK while the buffer occupancy level bi (t) is below bL or above bH . The ARQ state defines the assistance of adaptive playout control for the error recovery. Thus, bR the player does not go into the state while bi (t) is less than (1+∆ or greater than bR + Anc (t) even though p) packets are lost. The inter-client synchronization is tightly coupled with the adaptive playout control which is performed on the NORMAL state. That is, the player goes into SYNC state when it receives the server-reported Ts . In the SYNC state, the virtual presentation time vci (t) is re-adjusted to the sender-reported Ts in the RTCP SR packet. The actual adjustment of Tci (t) to the sender-reported Ts is achieved when the player goes into the NORMAL state after escaping from the SYNC state.

4. SIMULATION RESULTS 4.1. Simulation Setup Fig. 7 provides the detailed simulation topology by the network simulator (NS2) [23]. The SSM is provided by the PIM-SM multicast routing and each client feeds back its status through an unicast RTCP RR. There exists one server and 16 clients will join to the group from arbitrary locations. Table 1. Video traffic model. I frame(kbytes)

B frame(kbytes)

P frame(kbytes)

Average frame size

50

21.6

10

Standard deviation

8

2

1

We simulate the MPEG-2 audio/video streaming scenario at average 5 Mbps with 30 f/s frame rate. This frame rate leads us to the temporal granularity of 33 ms. The adopted group of picture (GOP) structure is for

Streaming Server Session Member loss(%)/RTT(ms) 52

59

58

51

48

60

49

17

45

16

56 7%/350ms 53 54

13%/400ms

20

55

50

18

68

9%/500ms 8%/500ms

67

19

22

64

23

63

62

7

21

6%/350ms

46

65

66

57

10%/350ms

47

10%/450ms

15

38

5

8

12 3%/250ms

40

39 5%/120ms

42

43 44

13

6

4

61 71

26

14

4%/200ms

25

74

9 30

70

32

2

1

3

27

78

28

76

77

79

80

10 0

33

72

4%/250ms 5%/200ms

73

75 31

69

24

41

34

11

29

37

6%/50ms 5%/50ms

35

36 0%/50ms

82

81

1%/150ms

Figure 7. Simulation topology.

15 frames consisting of one I-frame and 3 P-frames (i.e., IBBPBBPBBPBBPBB) and the bit rates of each frame is tabulated in Table 1. At this stage, the decoding and composition delay is ignored for the sake of simplicity. The timing accuracy of playing a frame is less than 10 ms in the simulations. All clients joining the multicast group have the same receiver buffer size (default size: 1 sec, unless specified else) and perform the same playout adaptation. Following thresholds are utilized for the receiver buffers. Lower bR limit bL is set to (1+∆ from the bottom, where the retransmission limit bR is based on the estimated RTT. p) Higher limit bH has margin equal to 100 ms from the top. We set the default playback factor (∆p ) to 0.25 (25%), which is somewhat an aggressive choice for the time-scale modification. For the Ts arbitration policy, selection of minimum value is employed as a default policy. Target synchronization discrepancy ²t with respect to the virtual presentation time is set to 100 ms. Note also that the playout adaptation needs to be done in terms of video frame. The presentation time discrepancy is thus converted to the number of video frames Nfi (based on the video frame rate Rf ). Thus, we are getting the number of frames to be adaptively played out with the time-scale modification, Npi , i.e., Npi = Nfi /∆p . For example, if we want to compensate 2 frames, then we needs to do the playout for 20 frames with ∆p = 0.1. In summary, the player of each client has to perform the playout adaptation up to Npi /Rf sec as long as the buffer level is properly controlled. The simulation initiates just after 0 sec by starting the multicast streaming from the server. Each client joins to the group with a gap of 11 sec (fixed). After joining and starting the playback, all the clients are subject to a randomized, momentary CPU overload. Currently the CPU overload is happening for less than 50 ms at full strength (i.e., no other computation is possible) and it is randomly occurring with a 10 sec exponential distribution. For the network, various combination of TCP and CBR traffics are traversing the whole network topology and causing network fluctuations between the server and clients. We select independent loss only on the receiver links as a loss scenario. The individual losses are occurred with an uniform distribution and several clients experience burst losses with 50 to 100 ms exponentially distributed burst length. The server has one second retransmission buffer. The average loss fraction is 6 %, some clients suffer from maximum loss at 13 % and some clients even do not experience any loss (0%). For the cumulative NACK-based error recovery, up to 16 NACKs are merged. With all these setups, we are using several measures for the synchronized streaming performance and two for the performance of cumulative NACK-based error recovery. They are 1) accumulated playout discontinuity per each client during streaming, 2) the maximum difference of playout among clients, 3) the consumed feedback bandwidth and 4) the success probability of loss recovery with retransmission.

4.2. Verification Results To verify the performance of the proposed approach, three different playout cases are compared. ‘No playout adaptation’ means the case where the adaptive playout control for both local adaptation and error recovery is not performed and adaptive playout cases are divided into ‘Local adaptation’ and ‘Server-aided adaptation’. In

1.0

Discontinuity(s)

Time difference(s)

both local and server-aided adaptation cases, the player request retransmission with the help of playout control. The request is performed individually (‘individual request’) or cumulatively (‘cumulative request’).

No playout adaptation 0.9 0.8 0.7

Local adaptation with individual request

0.5

0.6

0.3

0.3

0.2

0.2

0.1

0.1

0.0 50

100

150

Local adaptation with individual request

0.4

Server-aided adaptation with individual request

0.4

No playout adaptation

0.7

0.5

Local adaptation with cumulative request Server-aided adaptation with cumulative request

0.6

0.8

200

250

300

Local adaptation with cumulative request Server-aided adaptation with cumulative request Server-aided adaptation with individual request 0

50

100

150

Time(s)

200

250

300

Time(s)

(a)

(b)

Figure 8. (a) Maximum playout time differences between the leading and trailing clients, (b) Accumulated playout discontinuity averaged per each client.

First, the maximum playout time differences between the leading and trailing clients at each time t is shown in Fig. 8(a). Without the adaptive playout, the time difference increases up to maximum buffer occupancy level, i.e., bs . Packet overflows or intentional flushing is useful in reducing the gap, but it pays the playout discontinuity. Check the accumulated discontinuity per each case in Fig. 8(b). It illustrates 0.8 sec discontinuity is occurring to each client on the average without the playout adaptation. In comparison, the proposed playout adaptation can bound the gap more effectively. If the player adopts local adaptation only, there is no way to restore the inter-client synchronization since there is no guidance. Eventually it is subject to flushing of packets in the buffer. But with the help of server coordination, we can maintain and the gap throughout the whole playout. The accumulated playout discontinuity in Fig. 8(b) shows that the proposed playout adaptation with the server-aided synchronization efficiently reduces the playout discontinuity to half of ‘No playout adaptation’. Table 2. Impact of Ts arbitration policy on the performance.

Ts selection

Minimum

Median

Maximum

Average

Discontinuity(s)

0.4812

0.5562

0.5812

0.4870

The appropriate selection on Ts , arbitration policy, impacts the overall performance as shown in Table 2. The result shows that Ts derived from minimum arbitration policy (i.e., min(Tc1 + d1s + RT T 1 , · · · , · · · , TcNc + Nc c dN ) is preferable to the others with respect to playback discontinuity. That is, if we choose Ts s + RT T according to the slowest client in the group, it increases the time available for retransmission for all the other clients. This in turn contributes to the enhanced error recovery and hence results in the reduced discontinuity compared to the other arbitration policies. The averaged Ts selection also reduces the playout discontinuity compared to the other policies. Fig. 9(a) illustrates the consumed bandwidth of RTCP RR and retransmission requests used for feedbacks to the server. Note that we assume the compounded RTCP RR with 120 bytes and the retransmission request packet (regardless of feedback cumulation) of with 12 bytes are used to feedback QoS information and retransmission request. Furthermore, three cases compared here are exposed to the same network losses but

Probability

Bandwidth(b/s)

25000

No playout adaptation Local adaptation with individual request

20000

Server-aided adaptation with individual request

15000

Local adaptation with cumulative request

Server-aided adaptation with individual request 0.8

0.7 No playout adaptation Server-aided adaptation with cumulative request

0.6

10000

Local adaptation with individual request

0.5

Local adaptation with cumulative request

Server-aided adaptation with cumulative request

5000

0.4

0

50

100

150

200

250

300

0

50

100

150

(a)

200

250

300

Time(s)

Time(s)

(b)

Figure 9. (a) Required feedback bandwidth for retransmission request and receiver report, (b) Average repair probability at time t in a group.

may experience different application losses based on the scenario. We can observe the bandwidth efficiency of cumulative NACK for retransmission request in Fig. 9(a). The cumulative NACK-based requests show the saving of the feedback bandwidth to a half compared with the individual requests. Fig. 9(b) illustrates the averaged repair probability of lost packets in each client. It is confirmed that the coordination of the server and clients, i.e., server-aided adaptation, increases the repair probability. Without the adaptive playout, the repair probability is highly fluctuating over the time. However, we note that the repair performance of the adaptive playout with the cumulative requests is worse than that of ‘No playout adaptation’. This result partly implies that the choice on the cumulation (i.e., n = 16) is rather aggressive since it makes the timely recovery to fail from time to time. The lack of the precision in converting buffer level to time quantity and the feedback packet loss (since it can affect up to n retransmission) are partly responsible for this, too.

5. CONCLUSIONS In this paper, we propose a framework for the synchronized multicast media streaming with the adaptive playout and error control, which is suitable for one-to-many media streaming with an improved flexibility. The adaptive playout controls the playout speed of audio and video by adopting the time-scale modification of audio based on the buffer occupancy. The main role of the adaptive playout control is to reduce the discontinuity incurred by packet overflow/underflows and momentary CPU overload. By employing the server-client coordinated adaptive playout control with feedback, it varies the playback speed within the perceptual limit and adapts the presentation time of each client to help the inter-client synchronization. In addition, each client performs retransmission-based error recovery with the assistance of playout control to compensate for long repair latency. The retransmission request with NACK-based cumulation can save the allocated bandwidth and prevent the feedback implosion. The network-simulator based simulations show that the proposed framework 1) reduces the playout discontinuity caused by the network fluctuations and the system limitations without degrading the media quality, 2) saves the feedback bandwidth induced by retransmission request, and 3) mitigates the client heterogeneity by employing the sever-client coordinated adaptive playout and error control. Thus, the potential of the proposed multicast streaming framework to enhance the multicast streaming quality is well verified.

ACKNOWLEDGMENTS This work is supported by the University Research Program supported by Ministry of Information & Communication in Republic of Korea.

REFERENCES 1. J. Jo and J. Kim, “Synchronized one-to-many media streaming with adaptive playout control,” in Proc. SPIE ITCOM: Multimedia Systems and Applications V, July 2002. 2. K.C. Almeroth, “The evolution of multicast: from the MBone to interdomain multicast to Internet2 deployment,” IEEE Network, vol. 14, pp. 10–20, Jan. 2000. 3. C. Diot, B.N. Levine, B. Lyles, H. Kassem, and D. Balensiefe, “Deployment issues for the IP multicast service and architecture,” IEEE Network, vol. 14, pp. 88–98, Jan. 2000. 4. H. Holbrook and B. Cain, “Source-specific multicast for IP,” Internet Draft draft-ietf-ssm-arch-00.txt, IETF, Nov. 2001. 5. A. Schulzrinne, S. Casner, Frederderick, and V. Jacobson, “RTP: A transport protocol for real-time applications,” IETF RFC 1889, Jan. 1996. 6. M. Handley and et. al., “The reliable multicast design space for bulk data transfer,” IETF RFC 2887, Aug. 2000. 7. J. Widmer and M. Handley, “Extending equation-based congestion control to multicast applications,” in Proc. ACM SIGCOMM, Aug. 2001. 8. C. Burmeister and et. al., “Extended RTP profile for RTCP-based feedback - results of the timing rule simulation,” Internet Draft draft-burmeister-avt-rtcp-feedback-sim-00.txt, IETF, Nov. 2001. 9. A. Miyazaki and et. al., “RTP payload formats to enable multiple selective retransmission,” Internet Draft draft-ietf-avt-rtp-selret-04.txt, IETF, Nov. 2001. 10. G. Poo, “A cumulative negative acknowledgment (CNAK) approach for scalable reliable multicast,” in Proc. Computer Communications and Networks, Oct. 2001, pp. 268–273. 11. J. Walpole and et. al., “A player for adaptive MPEG video streaming over the Internet,” in Proc. of the SPIE 26th Applied Imagery Pattern Recognition Workshop (AIPR), Oct. 1997. 12. B. Moon, J. Kurose, and D. Towsley, “Packet audio playout delay adjustment: performance bounds and algorithms,” ACM/Springer Multimedia Systems, vol. 5, no. 1, pp. 17–28, Jan. 1998. 13. F. Liu, J. Kim, and C.-C. J. Kuo, “Adaptive delay concealment for Internet voice applications with packet-based time-scale modification,” in Proc. IEEE ICASSP, May 2001. 14. E. Steinbach, N. F¨arber, and B. Girod, “Adaptive playout for low-latency video streaming,” in Proc. International Conference on Image Processing (ICIP), Oct. 2001, pp. 962–965. 15. S. Bhattacharyya and et. al., “An overview of source-specific multicast (SSM) deployment,” Internet Draft draft-ietf-ssm-overview-03.txt, IETF, Aug. 2001. 16. H. Holbrook and D. Cheriton, “IP multicast channels: EXPRESS support for large-scale single-source applications,” in Proc. ACM SIGCOMM, 1999, pp. 65–78. 17. J. Chesterfield and et. al., “RTCP extensions for single-source multicast sessions with unicast feedback,” Internet Draft draft-ietf-avt-rtcpssm-00.txt, IETF, Feb. 2002. 18. D. Hoffman and et. al., “RTP payload format for MPEG1/MPEG2 video,” IETF RFC 2250, Jan. 1998. 19. D. L. Mills, “Internet time synchronization: The network protocol,” IEEE Trans. on Communications, vol. 39, no. 10, pp. 1482–1493, Oct. 1991. 20. W. Verhelst and M. Roelands, “An overlap-add technique based on waveform similarity (WSOLA) for high quality time-scale modification of speech,” in Proc. IEEE ICASSP, Apr. 1993, pp. 554–557. 21. B. Cain, S. Deering, and A. Thyagarajan, “Internet group management protocol, version3,” Internet Draft draft-ietf-idmr-igmp-v3-01.txt, IETF, Feb. 1999. 22. J. Ott and et. al., “Extended RTP profile for RTCP-based feedback,” Internet Draft draft-ietf-avt-rtcpfeedback-02.txt, IETF, July 2001. 23. K. Fall and editors K. Varadhan, “ns notes and documentation,” The VINT Project at LBL, Xerox PARC, UCB, and USC/ISI, Aug. 2001, http://www.isi.edu/nsnam/ns/.

Suggest Documents