Smooth and Fast Rate Adaptation and Network-Aware Error Control for TCP-friendly Internet Video Transmission
Young-Gook Kim, JongWon Kim and C.-C. Jay Kuo Integrated Media Systems Center and Department of Electrical Engineering-Systems University of Southern California, Los Angeles, CA 90089-2564 E-mail: {younggoo,jongwon,cckuo}@sipi.usc.edu
Abstract – A new rate adaptation mechanism called SFRAM is developed based on the TCP-throughput equation in this research to facilitate end-to-end Internet video transmission. The proposed scheme not only achieves TCP-friendliness but also provides a smooth and fast rate adaptation for Internet video delivery. By adaptively averaging measurements such as the round trip time (RTT) and the packet loss rate over a suitable window, SFRAM mitigates unnecessary fluctuation that is undesirable for video transmission. The adopted weighting scheme enables the response in a fast manner only for distinct network variations so that the overall network utilization is improved and end-to-end video quality is sustained. When integrated with active routing support such as RED (random early detection) and ECN (explicit congestion notification), SFRAM provides the best possible performance. Additionally, this property is utilized to select an interactive error control scheme so that the video encoder can choose the most effective error recovery option based on the network status. Extensive experiments by using the ns-2 network simulator and the real Internet are performed to test the dynamic behavior of SFRAM. The seamless integration of an interactively linked ns-2 network simulator and the ITU-T H.263+ video encoder is demonstrated for network-aware error control as well as rate adaptation of Internet video.
Keywords – TCP-friendly Internet video, end-to-end congestion control, rate adaptation, SFRAM, network-aware error control, and RED-ECN.
Final Version: May 28, 2000
Submitted to IEEE CAS-VT Special Issue on Streaming Video 2000
I. INTRODUCTION
Internet video, encompassing streaming video playback and video conferencing, has emerged as one of the essential applications today. Cutting-edge technologies in broadband networks will make it possible to deliver multimedia content to the household in the very near future. It is, however, a challenging problem to provide the required quality of service (QoS) for multimedia traffic since the best-effort Internet does not guarantee the available bandwidth. Furthermore, the delay-jitter (i.e., the delay variance) and the packet loss rate, which are critical to the selection of a proper transmission rate, are not predictable. The QoS problem of Internet video has been approached via resource allocation such as Integrated Services with RSVP (resource reservation protocol) [1][2], in which each flow attempts to reserve the resource so that the packet loss rate and the delay are bounded. Although it can provide guarantees, the required admission control is so complicated that it is still difficult and premature to deploy. In the current Internet, video applications are deployed in the end-to-end sense rather than coordinated by networks, which gives rise to the following two main issues in video transmission over the best-effort Internet.
The first issue is about the amount of Internet resources (i.e. the bandwidth) that an end-to-end video application should utilize for its multimedia content. TCP traffic is dominant in today's Internet applications such as FTP, HTTP and TELNET. The congestion control mechanism of TCP (transmission control protocol) [7], motivated by intelligent end-to-end fair sharing of a dumb network core, has contributed to the robustness of the current Internet for more than two decades. However, TCP is not effective enough for real-time applications because its window-based congestion control does not provide instant rate adaptation. The trend is therefore to use UDP (user datagram protocol) with application-level rate adaptation for Internet video.
Several end-to-end rate adaptation mechanisms [3][4][5][6] have been proposed, especially for video applications, in which the end user adjusts the transmission rate based on the observed link status (e.g. packet loss and delay/delay-jitter). Unlike the congestion control of TCP, which emphasizes reliability, a video rate adaptation mechanism demands additional functionalities to enable low-latency high-quality video. Smooth transmission is important for a video source since it sustains quality with only a small buffer, which keeps latency low. From the network viewpoint, fast adaptation to network variations while behaving in a TCP-friendly manner is essential since it can reduce the packet loss rate and improve the overall network utilization. Reduction of the packet loss rate is also beneficial in maintaining video quality since there exists temporal dependency among frames [15][16]. Unfortunately, most Internet video applications have not yet implemented the active
fairness mechanism, which may cause congestion collapse [8]. Thus, it is very important that a rate adaptation mechanism should be designed in favor of the overall network performance rather than its own merit. The other issue is how to cope with the packet loss as well as the fluctuating available bandwidth, which are inevitable in the shared Internet. Video applications are subject to temporal error propagation due to the extensive exploitation of frame dependency. It is necessary to make the underlying video stream robust to errors and mitigate the effect of lost packets. If the application is not delay sensitive in the sense that the round trip time (RTT) level latency is permitted, retransmission coupled with a large de-jittering buffer is the most effective remedy for packet loss. However, real-time interactive video is delay stringent, and it requires both error resilient encoding and proactive protection via forward error correction (FEC). Under the bandwidth fairness constraint, it is inevitable to employ the error recovery scheme while sacrificing the encoding performance. The so-called network aware error control (NAEC) will be implemented with the help of the proposed SFRAM, by which the network status is provided.
[Figure 1 block diagram. Blocks: Encoder (Target Rate), Network Aware Error Control, Buffer, Packet Scheduling, Internet Cloud (Available Bandwidth), Buffer, Decoder, Smooth and Fast Rate Adaptation Mechanism. Links: Video (UDP); Feedback (TCP) carrying Network Status (packet loss, RTT) and Decoding Status (latest decoded frame, error degree).]
Figure 1: Illustration of the proposed Internet video transmission system.
Figure 1 shows the functional blocks of the proposed Internet video system, which is depicted for one-way transmission for simplicity. UDP with the proposed rate adaptation mechanism, SFRAM, is used for the data channel, and another external channel (using TCP for reliability) is needed for feedback¹. In the two-way case, data packets may piggyback the feedback information. On the receiver side, there is a network status estimator. Basically, it records the packet loss history and the round trip time (RTT). In the case of TCP, all these functions are implemented on the sender side with feedback acknowledgements. However, the sender side might be too overloaded to handle the situation since real-time encoding consumes a lot of computational power. The receiver may estimate network status parameters more precisely because it observes actual packet dynamics rather than processed feedback. Additionally, the same concept can be easily extended to multicast or multi-user environments because the sender need not handle feedback from all receivers [23][24][25].
¹ Currently SFRAM utilizes a somewhat modified and simplified form of the RTP/UDP combination for the data channel, and the feedback channel uses a variant form of RTCP reports. However, it is convertible to a form compliant with RTP/RTCP, if required.
SFRAM will calculate the next available bandwidth (ABW), which keeps network utilization high while limiting the overload and improving inter-protocol fairness. ABW is translated at the sender into the target rate for the video encoder and used for packet scheduling to comply with the ABW. The receiver not only feeds back ABW through SFRAM but also conveys the network status such as the packet loss and RTT. Together with the feedback from the video decoder (e.g. the latest decoded frame and the error degree), the network status is utilized by SFRAM to coordinate network-aware error control.
In this paper, we propose a feasible rate adaptation mechanism that is especially designed for Internet video transmission and also handles error control. The main novelties of this work include the following:
1. Establishing a TCP-friendly smooth and fast rate adaptation mechanism (SFRAM) for Internet video. By adaptively averaging measurements such as RTT and the packet loss rate within a suitable window, SFRAM mitigates unnecessary fluctuation that is undesirable for video transmission. The adopted weighting scheme enables the response in a fast manner only for distinct network variations so that the overall network utilization is improved and end-to-end video quality is sustained. When integrated with active routing support such as RED (random early detection) and ECN (explicit congestion notification), SFRAM provides the best possible performance.
2. Coordinating network aware error control by SFRAM to achieve the best video quality. The network status aware capability of SFRAM is utilized to select an interactive error control scheme so that the video encoder can choose the most effective error recovery option based on the network status.
3. Integrating an interactively linked ns-2 network simulator with the ITU-T H.263+ video encoder to demonstrate network aware error control as well as rate adaptation for Internet video. This dynamic simulation setup enables us to evaluate the performance of feedback dependent error recovery options in an interactive but reproducible manner so that the comparative evaluation can guide us in possible scenarios to exploit SFRAM for the best end-to-end video experience.
In Section II, background and motivation are described in detail with reference to related work. The detailed implementation of SFRAM is described in Section III, which focuses on the weighting method and the determination of the sampling window size. In Section IV, basic error recovery schemes are examined to derive the dynamic NAEC scheme. In Section V, results from both the ns-2 network simulator and the actual Internet are provided to demonstrate the smooth and fast adaptation capability of SFRAM in various network scenarios. Additionally, the effect of SFRAM on video quality and the performance of NAEC, in either a fixed or an adaptive method, are provided. Concluding remarks are finally given in Section VI.
II. BACKGROUND AND MOTIVATION
A. TCP-friendly Rate Adaptation
The main purpose of TCP-friendly end-to-end rate adaptation mechanisms is to achieve fairness between TCP and non-TCP flows. Besides, they attempt to reduce the packet loss that causes retransmission in TCP and quality degradation of video traffic. Another benefit is that the overall network utilization will be improved. Special attention has to be paid to congestion detection. Currently, the packet drop is used as a signal of congestion, and the sender backs off the transmission rate as a response. It is not difficult to find shortcomings of this approach for Internet video transmission. If the end user relies on the packet loss for backing off, a lost packet is not only useless but also has to be retransmitted in reliable TCP. In the case of real-time video, most frames have temporal dependency on others. Loss resilience and recovery mechanisms have to be added, and they require extra bandwidth and introduce additional delay. A desirable end-to-end rate adaptation mechanism for Internet video should be able to cope with all these problems.
The TCP-friendly rate adaptation mechanisms [3][4][5][6] proposed to date can be divided into two categories. One is to mimic the TCP congestion control mechanism directly by adopting AIMD (additive increase multiplicative decrease) [4]. It increases the transmission rate by sending one more packet in an RTT interval when there is no packet loss. If there is a packet loss, it decreases its rate to one half. The main characteristic of this approach is the saw-tooth behavior of the transmission rate. Even though it can achieve TCP-friendliness, it is not suitable for video transmission. For pre-encoded and stored video, it demands a larger buffer to cope with rate fluctuation. Additionally, it might cause heavy quality fluctuation. In summary, AIMD is too dynamic and unpredictable, and an alternative for more predictable rate adaptation is highly desired.
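The saw-tooth behavior of AIMD can be illustrated with a minimal sketch. The function name, the packet size, the starting rate and the loss pattern below are illustrative assumptions, not values taken from [4].

```python
def aimd_rate_trace(loss_events, rtt=0.04, packet_bits=8000, start_pkts=2.0):
    """Return the per-RTT sending rate (bps) of a simple AIMD source.

    loss_events: iterable of booleans, one per RTT interval
                 (True means at least one packet was lost in that interval).
    rtt, packet_bits, start_pkts: assumed values, for illustration only.
    """
    pkts_per_rtt = start_pkts
    trace = []
    for lost in loss_events:
        if lost:
            # multiplicative decrease: halve the rate (floor of one packet/RTT)
            pkts_per_rtt = max(1.0, pkts_per_rtt / 2.0)
        else:
            # additive increase: one more packet per RTT interval
            pkts_per_rtt += 1.0
        trace.append(pkts_per_rtt * packet_bits / rtt)
    return trace


if __name__ == "__main__":
    # hypothetical pattern: a loss in every 8th RTT interval
    pattern = [(i % 8 == 7) for i in range(24)]
    for i, rate in enumerate(aimd_rate_trace(pattern)):
        print(f"RTT {i:2d}: {rate / 1000:.0f} kbps")
```

Printing the trace shows the rate climbing linearly and dropping by half at each loss, which is the fluctuation a video source would have to absorb with extra buffering.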
An equation-based approach [5], which uses the TCP throughput model [9] to adjust the transmission rate, was proposed to meet the demands discussed above. The TCP throughput equation can be written as
$$\text{TCP throughput} = \frac{MTU}{RTT\sqrt{\dfrac{2p}{3}} + T_0\sqrt{\dfrac{27p}{8}}\;p\,(1 + 32p^2)}, \qquad (1)$$
where MTU is the maximum transmission unit, RTT is the round trip time, T0 is the retransmission timeout, and p is the loss rate (notice that p is not the actual packet loss rate). To use Eq. (1), one has to measure the parameters RTT, T0 and p in the same way as TCP does. RTT and T0 may be measured whenever the acknowledgement of a received packet arrives. For p, we count the packet loss only once per RTT interval regardless of the actual number of lost packets, because TCP responds only to the first packet loss in an RTT interval. However, packet loss does not happen in every RTT interval. In fact, the dynamics of AIMD come from applying the packet loss event directly to determine the congestion status. Thus, it is necessary to average p over a suitable number of RTT intervals so that Eq. (1) predicts the average transmission capability of a TCP connection. We use the term "sampling window", denoted by w, to indicate the number of RTT intervals used to average p. RTT and T0 can also be averaged over multiple RTT intervals for the same purpose instead of using an instantaneous value. Notably, the TCP-equation based approach can predict a very stable and smooth rate change compared to AIMD.
To design a successful sampling window-based scheme, the selection of w to avoid excessive smoothing is a key issue. Let us take a look at the effect of two different sampling windows as shown in Figure 2. We will defer the detailed setup until Section V. Here, please pay attention to the transient response of the rate trace, which is the output of the TCP throughput model for two different sampling windows (w = 6 and 50, respectively). By changing the total number of flows sharing the same link, the share of each flow alternates every 20 seconds between 32 and 125 kbps (the simulated ABW)². For w = 6, there are many intervals in which p is zero, causing Eq. (1) to diverge. To circumvent this problem, we applied the additive increase (AI) scheme, which restricts the sender to sending one more packet in one RTT interval. However, the packet loss rate still varies too much and, as a result, the transmission rate fluctuates significantly, which is not desirable for video.
The receiver will need more initial buffering to cope with this kind of rate fluctuation, implying a longer end-to-end delay. On the other hand, a larger value of w enables the rate change in a smooth manner sacrificing the responsive reaction to recent observations. Basically, the results show that, by using TCP-equation adequately, one should adapt to the ever-changing Internet with flexibility. In this work, we propose SFRAM, which aims a smooth but fast response based on the TCP equation as a solution to coordinate the conflicting demand of video applications and networks.
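As a rough illustration of how Eq. (1) and the sampling window interact, the sketch below evaluates the throughput equation and averages p over the last w RTT intervals, counting at most one loss per interval and dividing by the packets received in the window. The function names are hypothetical, and the choice T0 = 4 × RTT in the usage lines is an assumption made only for this example.

```python
import math


def tcp_throughput(mtu, rtt, t0, p):
    """Evaluate Eq. (1); the result is in units of `mtu` per second."""
    if p <= 0.0:
        # Eq. (1) diverges when no loss is observed in the window
        raise ValueError("Eq. (1) is undefined for p = 0")
    denom = (rtt * math.sqrt(2.0 * p / 3.0)
             + t0 * math.sqrt(27.0 * p / 8.0) * p * (1.0 + 32.0 * p * p))
    return mtu / denom


def windowed_loss_rate(loss_flags, received, w):
    """Average p over the last w RTT intervals.

    loss_flags[i] is 1 if any packet was lost in RTT interval i
    (counted once, however many packets were lost), else 0.
    received[i] is the number of packets received in that interval.
    """
    flags = loss_flags[-w:]
    total = sum(received[-w:])
    return sum(flags) / total if total else 0.0


if __name__ == "__main__":
    rtt = 0.04            # 40 msec
    t0 = 4 * rtt          # assumed timeout, for illustration only
    rate = tcp_throughput(1.0, rtt, t0, 0.01)
    print(f"{rate:.1f} MTU/sec, {rate * rtt:.1f} MTU/RTT")
```

With a short window, `windowed_loss_rate` frequently returns zero, which is exactly the case where the equation cannot be evaluated and an additive-increase fallback is needed.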
² The ABW change in a pulse train shape represents an unrealistic and extreme situation in the real Internet. It is utilized only to capture and compare the response of different rate adaptation schemes.
[Figure 2 plots, panels (a) w = 6 and (b) w = 50: RTT (sec) and p (scale 0 to 0.6) and rate and ABW (bps, scale 0 to 300000) versus time (0 to 140 sec).]
Figure 2: Rate adaptation with different sampling window sizes: (a) w = 6 and (b) w = 50.
The pure end-to-end mechanisms discussed above rely on an individual observation of the loss and delay feedback and thus have a limitation in identifying the actual cause of loss and delay. To handle this problem, some active congestion avoidance mechanisms have been proposed. Among them, the random early detection (RED) gateway [10] with the explicit congestion notification (ECN) capability [11] is the most relevant to end-to-end congestion control (i.e. rate adaptation). A RED gateway detects incipient congestion based on the average queue level. If the average queue is shorter than a lower threshold, no action is taken. If the average queue goes higher than an upper threshold, a packet is always dropped. If the average queue lies between the thresholds, the newly arriving packet is dropped with a probability that grows with the average queue level up to MaxP. In an extension of RED, instead of dropping the packet, the gateway marks the ECN bit in the packet header. The receiver should treat this ECN bit in the same way as a packet loss and feed it back to the sender. As analyzed for TCP in [12], we implement the RED-ECN extension in SFRAM and analyze its benefit for end-to-end rate adaptation.
B. Network Aware Error Control
With the end-to-end rate adaptation mechanism in place, the video application at the end is dictated by the dynamically changing rate budget. Also, the variation of video itself, denoted by the variable bit rate (VBR) nature, requires a smooth but occasionally peaky rate budget. Switching among layered (i.e. scalable) or aggregated video streams is the first-hand option to tackle coarse bandwidth fluctuations in general. Remaining fine-granularity adjustment depends on the available rate control options within layers and the capacity of de-jittering buffers. Thus, the resulting quality adaptation, which trades off the spatial and temporal quality based on both the rate budget and the user preference, is constrained by the adjustment capability (i.e. speed) of the video encoder/decoder. That is, the demand on the smoothness of the rate budget
varies across video applications, and the proposed SFRAM can be designed to adaptively absorb a part of the rate fluctuation while not hurting the underlying network.
Packet loss affects not only the lost packet itself but also subsequent packets due to the complicated dependency links of video packets. To prevent error propagation, provisions such as media-aware packetization, error resilient encoding (e.g. intra-macroblock refresh and data partitioning), protection by source (e.g. the parity motion vector) and/or channel (e.g. FEC) redundancy, and timely coordinated automatic repeat request (ARQ) have been proposed. Each of the above schemes has its own error recovery capability and applicability depending on the target environment. For example, a source-oriented technique such as I-MB refresh is widely applicable regardless of network awareness. As a trade-off, it costs a longer recovery time and a mismatched performance. There is also the reference picture selection (RPS) scheme, which takes advantage of the feedback channel. In contrast, channel (or packet) level schemes such as FEC and ARQ target more timely recovery, although they also experience quality degradation for a short transient period. Among them, proactive FEC is usually applied in the form of unequal error protection and has a wider applicability than reactive ARQ, which is more efficient but highly delay-dependent (because of the feedback). This is, however, another trade-off between coding efficiency and error recovery capability, which has to be resolved based on the application context. Thus, it is best if error control can be coordinated with the proposed SFRAM by using the monitored RTT, loss rates, and packet acknowledgments. The desired coordination should catch distinct network variations as well as the loss of specific packets.
Before going into the detailed description of SFRAM in Section III, let us review existing research on the integration of rate adaptation and error-resilient Internet video [17][18][19][20]. First, the feedback-based Internet video scheme summarized in [17] addresses the possible rate and error control combination in addition to network fairness. In [18], the Internet video system employing error resilient scalable compression (i.e., wavelet-based) is proposed with the cooperative transport protocol in order to combine a novel compression method, which is error resilient and bandwidth scalable with a low-delay TCP-friendly transport protocol. Rate adaptation, however, is based on the simplified TCP throughput model and the packet trace in the real Internet environment simulation is utilized for its performance demonstration. For ITU-T H.263+, the video-aware congestion control scheme named the receiver-based congestion control mechanism (RCCM) is integrated with the variable frame-rate encoding and fast motion-compensated frame interpolation (FMCI) components. Here, quality adaptation based on adaptive temporal frame-rate change has been demonstrated to meet the modified AIMD-based ABW demand [19] in the Internet modem setup. Also, in [20], video-friendly unicast rate adaptation is proposed to transmit ISO/IEC MPEG-4 video in an error resilient mode. Its rate adaptation is also based on the TCP throughput model while the error resilient mode decision is conducted by using the two-state Markov model. However, due to the difficulty in building a dynamic but reproducible
evaluation setup, evaluation on interactive network aware error control is lacking. Thus, by forming an interactively linked ns-2 network simulator and the H.263+ video encoder, special effort is made to evaluate the performance of the feedback dependent error recovery scheme in this work.
III. SMOOTH AND FAST RATE ADAPTATION MECHANISM (SFRAM)
To use the TCP-throughput model in Eq. (1), we have to measure RTT, calculate T0, and count the number of lost packets. In SFRAM, we use TCP's rules for all three parameters, while switching the role of the sender and the receiver. At first, the receiver sends the feedback to the sender as a signal for starting. The sender puts the feedback number and the elapsed time after receiving the feedback in every packet header. After receiving each packet, the receiver calculates RTTsample with
$$RTT_{sample} = T_R - T_S - T_E, \qquad (2)$$
where TR, TS, TE are the packet received time at the receiver, the feedback sent time at the receiver, and the elapsed time in the sender, respectively. The receiver uses the exponential filter to update RTT and the TCP algorithm for T0. We put the ECN capability in the mechanism so that the receiver can receive the marked packet with the ECN bit [10] [11][12]. To calculate the packet loss rate p, the receiver counts only once regardless of the number of lost or marked packets in every RTT as discussed earlier. After the receiver calculates RTT and counts p, it calculates TCP-throughput. Table 1 shows the sample records of RTT and the lost or marked (L/M) pattern. Let us assume that 10 RTT is used for the sampling window, i.e., w=10. The highest sample index means the latest record, and vice versa. We also assume that the packet sizes are fixed for simplicity. After measuring parameters, the receiver predicts the TCP-throughput with Eq. (1) and sends the feedback back to the sender as the next rate. Even with a large sampling window, p can be zero to cause infinity in Eq. (1). In such a case we increase the rate so that the sender can send one more packet in one RTT interval. This is similar to the AI mechanism of TCP and RAP [4]. The main purpose of SFRAM is to predict the overall average available bandwidth so that it avoids the unnecessary fluctuation of rate prediction. Thus, a large enough sampling window is necessary, and the AI mechanism can be applied if there is no packet loss for this period. The desirable value of the sampling window is explained in the next section. Note that the proposed scheme is receiver-based so that the receiver has more direct access to measurements. With this scheme, the receiver can easily compare the current receiving rate and the next rate that is restricted not to exceed two times of the current receiving rate.
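The receiver-side steps described above can be sketched as follows. This is an illustrative sketch only: the class and function names are hypothetical, the filter gain of 1/8 is an assumed value borrowed from common TCP practice, and taking the current receiving rate as the base for the additive increase is also an assumption.

```python
class ReceiverRttEstimator:
    """Smooth RTT samples computed at the receiver with Eq. (2)."""

    def __init__(self, gain=0.125):
        self.gain = gain   # assumed exponential filter gain
        self.rtt = None    # smoothed RTT in seconds

    def update(self, t_received, t_feedback_sent, t_elapsed_at_sender):
        # Eq. (2): RTTsample = TR - TS - TE
        sample = t_received - t_feedback_sent - t_elapsed_at_sender
        if self.rtt is None:
            self.rtt = sample
        else:
            self.rtt = (1.0 - self.gain) * self.rtt + self.gain * sample
        return self.rtt


def next_rate(predicted_rate, receiving_rate, rtt, packet_bits, loss_free):
    """Choose the rate (bps) fed back to the sender.

    predicted_rate: output of Eq. (1) in bps; ignored when loss_free is True.
    loss_free:      True when p = 0 over the whole sampling window.
    """
    if loss_free:
        # Eq. (1) diverges, so allow one more packet per RTT instead
        candidate = receiving_rate + packet_bits / rtt
    else:
        candidate = predicted_rate
    # the next rate may not exceed twice the current receiving rate
    return min(candidate, 2.0 * receiving_rate)
```

The cap in `next_rate` reflects the restriction stated in the text; keeping it in the receiver is natural because the receiver is the side that measures the receiving rate directly.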
[Figure 3 plot: occurrence of p = 0 (scale 0 to 0.6) versus sampling window w (6, 10, 20, 30, 40) for four curves: 125 kbps drop-tail, 125 kbps RED-ECN, 16 kbps drop-tail, and 16 kbps RED-ECN.]
Figure 3: Sampling window size w vs. the occurrence rate of p=0.
Since our goal is to capture the overall network variation effectively, we should average values of RTT, T0 and p in a suitable time scale. As explained in Section II, we use the sampling window w as the averaging period. To determine the proper choice of w, let us compute Eq. (1) by assuming p = 0.01 and RTT = 40 msec. Then, the rate is around 250 MTU (maximum transfer unit)/sec, which is equivalent to 10 MTU/RTT. The value p = 0.01 means that there is one packet loss every 100 packets in the probability sense. Since it takes 10 RTT to send 100 packets, it implies that the averaging interval should be at least 10 RTT to prevent frequent occurrences of p=0. The probability of p=0 is empirically drawn for varying w in Figure 3. It shows that it is desirable to set the window size to be at least w=20 to avoid the case of p=0. Thus, in order to safely avoid the zero loss instance and achieve smooth rate prediction, a window size of 50 (w=50) is adopted in our approach. Even though a longer averaging helps to achieve a smoother rate, it does not respond in a fast manner to network variations. To demonstrate it, we show three cases of sampling as tabulated in Table 1.

Sample      Received   Case 1          Case 2          Case 3          Weight
number (i)  packets    (L/M)i  RTTi    (L/M)i  RTTi    (L/M)i  RTTi    Wi for p
1           10         1       0.05    1       0.05    0       0.04    -5
2           9          0       0.04    1       0.05    0       0.04    -4
3           8          1       0.05    1       0.05    0       0.04    -3
4           5          0       0.04    1       0.05    0       0.04    -2
5           7          1       0.05    1       0.05    0       0.04    -1
6           8          0       0.04    0       0.04    1       0.05    1
7           6          1       0.05    0       0.04    1       0.05    2
8           4          0       0.04    0       0.04    1       0.05    3
9           6          1       0.05    0       0.04    1       0.05    4
10          8          0       0.04    0       0.04    1       0.05    5
Table 1: Illustration of RTT and the lost or marked (L/M) packet pattern.
There is no distinct network variation in Case 1. The network is getting less congested in Case 2 and more congested in Case 3. By taking the average loss rate p and RTT, we get the same TCP-throughput value since all three cases have the same loss rate p and RTT. This non-weighted approach is therefore not able to adapt to the network variation in a fast manner. We need to increase the next rate for Case 2 and decrease it for Case 3, which we achieve by proposing a weighting method. First of all, the latest RTT is weighted the most. This is also applied to T0 as given below:
$$RTT = \frac{\sum_{i=1}^{w} i^{n} \times RTT_i}{\sum_{i=1}^{w} i^{n}}, \qquad T_0 = \frac{\sum_{i=1}^{w} i^{n} \times (T_0)_i}{\sum_{i=1}^{w} i^{n}} \qquad (3)$$
$$p_a = \frac{\sum_{i=1}^{w} (L/M)_i}{\sum_{i=1}^{w} received_i}, \qquad p_w = \frac{\sum_{i=1}^{w} (L/M)_i \times W_i}{\sum_{i=w/2+1}^{w} W_i} \qquad (4)$$
For the loss rate p, we introduce the weighting factor W as illustrated in Table 1. The average loss rate pa is calculated with Eq. (4) in the same manner for all three cases. We add the weighting effect with pw, which is defined in Eq. (4) and has a negative value for Case 2 and a positive value for Case 3. Finally, p is determined by
$$p = \begin{cases} p_a\,(1 + m \times p_w) & \text{if } p_w > 0 \\ p_a\left(1 + \dfrac{m}{1+m} \times p_w\right) & \text{otherwise} \end{cases} \qquad (5)$$
Eq. (5) implies that p can be increased up to (1+m) times pa if the network is getting more congested, and decreased down to 1/(1+m) times pa if it is getting less congested. By applying these weighted RTT, T0 and p to Eq. (1), we can achieve fast adaptation to network variations.
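The weighting of Eqs. (3) to (5) can be checked numerically against the three cases of Table 1 with a short sketch. The function names are hypothetical, the general form of Wi for an arbitrary even w is an extrapolation from the w = 10 column of Table 1, and the values n = 1 and m = 1 are assumed purely for illustration.

```python
def loss_weights(w):
    """W_i for i = 1..w (w even): -w/2, ..., -1, 1, ..., w/2 as in Table 1."""
    half = w // 2
    return [i - half - 1 if i <= half else i - half for i in range(1, w + 1)]


def weighted_average(samples, n=1):
    """Eq. (3): weight sample i (1 = oldest, w = latest) by i**n."""
    weights = [i ** n for i in range(1, len(samples) + 1)]
    return sum(wt * s for wt, s in zip(weights, samples)) / sum(weights)


def weighted_loss_rate(loss_flags, received, m=1.0):
    """Eqs. (4) and (5): average loss rate adjusted by the recent trend."""
    w = len(loss_flags)
    weights = loss_weights(w)
    p_a = sum(loss_flags) / sum(received)
    p_w = (sum(f * wt for f, wt in zip(loss_flags, weights))
           / sum(weights[w // 2:]))
    if p_w > 0:
        return p_a * (1.0 + m * p_w)           # getting more congested
    return p_a * (1.0 + m / (1.0 + m) * p_w)   # steady or less congested


if __name__ == "__main__":
    received = [10, 9, 8, 5, 7, 8, 6, 4, 6, 8]
    cases = {
        "Case 1": [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
        "Case 2": [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
        "Case 3": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
    }
    for name, flags in cases.items():
        print(name, round(weighted_loss_rate(flags, received), 4))
```

All three cases share the same pa, so any difference in the printed values comes only from the trend term pw, which is the point of the weighting scheme.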
IV. NETWORK AWARE ERROR CONTROL
In Section III, we proposed SFRAM as a TCP-friendly rate adaptation mechanism whose main role is to predict the next available bandwidth. In this section, we utilize SFRAM to coordinate network aware error control (NAEC), which selects the most effective error recovery scheme based upon the network status. We first review several error recovery candidates for H.263+ and their characteristics, and then propose a general rule of thumb to coordinate them based on SFRAM³.
³ To demonstrate the dynamic interaction of SFRAM with on-line H.263+ encoding, only source-oriented error resiliency options are investigated, excluding the packet-level interaction with FEC, ARQ, and hybrid.
A. Review of Error Recovery Schemes
A.1. Intra Macroblock (I-MB) Refresh Mode
The simple way to mitigate error propagation is to regularly refresh some part of P-frames with intra coding. The ratio and location of I-MBs inside P-frames need to be coordinated according to the packet loss rate. The scheme also benefits from a careful consideration of the visual refreshing effect, not only error resiliency. This source-oriented technique is widely applicable regardless of network awareness (i.e. feedback). The performance of this mode is closely related to the ratio of I-MBs and the actual packet loss rate, while the delay of feedback, expressed in RTT, does not affect it much. The wide applicability of this error resilient mode has attracted recent research work towards the rate-distortion optimized I-MB refreshing technique as a part of the optimized coding mode decision approach [20][21]. However, it usually costs a longer recovery time and a mismatched performance, especially when it is operated without feedback. When the feedback is available as in our case, it can be enhanced into a better scheme by adaptively linking the ratio of I-MBs with the observed network status. In addition, it can be further integrated with error tracking, which tracks the propagation of negatively acknowledged packets, to conduct error propagation aware refreshing [15]. In this case, the corresponding I-MB refresh becomes dependent upon the feedback delay, which restricts its usability to latency-relaxed video transmission with an on-line encoder.
A.2. Reference Picture Selection (RPS)
RPS is regarded as the most effective error protection scheme when a reliable and low-delay feedback is available. It has recently been extended to cover multiple-picture referencing under the name of enhanced RPS. For single-frame based RPS, two basic modes, ACK and NACK (positive/negative acknowledgement), are defined in [15]. In the ACK mode, referencing is changed only after successful decoding of a video picture to be referenced, leading to some loss of coding efficiency. In contrast, in the NACK mode, the normal referencing transition is conducted until the failure of a specific frame is notified. From the point of notification, referencing is frozen until it is re-synchronized. Thus, the NACK mode takes advantage of coding efficiency at the cost of transient quality loss. The two modes of RPS are depicted in Figure 4. The enlarged gap between the encoding and reference frames, which depends on RTT, reflects the lower coding efficiency of the ACK mode as shown in Figure 4(a). Under the NACK mode, the decoder can pause or conceal errors until it gets re-synchronized. If the application is not delay stringent, large buffering at the decoder, at the cost of increased latency, will absorb the transitional quality loss. If not, it may be better to use the latest former frame as reference until re-synchronization. This is shown in Figure 4(b), where frame 4 uses frame 2 as its reference, which causes a transient error drift until the re-referenced or intra frame is received. Thus, both RTT and packet loss play an important role in determining the efficiency of the RPS modes.
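To make the difference between the two modes concrete, the following toy sketch selects the reference frame at the encoder. It is not the H.263+ procedure: the class name and its methods are hypothetical, and in the NACK mode it simply assumes that the frame just before the damaged one is intact.

```python
class RpsEncoder:
    """Toy encoder-side reference selector for single-frame RPS."""

    def __init__(self, mode):
        if mode not in ("ACK", "NACK"):
            raise ValueError("mode must be 'ACK' or 'NACK'")
        self.mode = mode
        self.acked = set()        # frames confirmed as decoded
        self.pending_nack = None  # earliest damaged frame not yet handled
        self.last_encoded = 0

    def on_ack(self, frame):
        self.acked.add(frame)

    def on_nack(self, frame):
        if self.pending_nack is None or frame < self.pending_nack:
            self.pending_nack = frame

    def next_reference(self):
        """Return (frame, reference); a reference of None means intra coding."""
        frame = self.last_encoded + 1
        if self.mode == "ACK":
            # reference only frames the decoder has acknowledged
            ref = max(self.acked, default=None)
        elif self.pending_nack is not None:
            # assumption: the frame before the damaged one is intact
            ref = self.pending_nack - 1 if self.pending_nack > 1 else None
            self.pending_nack = None
        else:
            # normal prediction from the previous frame
            ref = frame - 1 if frame > 1 else None
        self.last_encoded = frame
        return frame, ref
```

In the ACK mode the gap between a frame and its reference grows with the feedback delay, while in the NACK mode the gap stays at one frame until a failure is reported, which mirrors the coding efficiency trade-off described above.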
Figure 4: Illustration of the ACK/NACK modes in reference picture selection (RPS): (a) the ACK mode; (b) the NACK mode (with a NACK for frame 3). Horizontal axis: time.
A. 3. Video Redundancy Coding Mode
A multi-threading version of video redundancy coding (VRC), as implemented in H.263+ RPS, is depicted in Figure 5. As a form of the multiple description (MD) technique, each thread keeps its own temporal dependency chain for packet loss isolation. For overlapping frames in threads, the decoder can choose the best thread based on the reception order and the error history of each thread. The multi-threading version is controlled by the number of threads Nth and the number of frames per thread Nfr, which determine the synchronization frequency $f_{sync} = N_{th} \times (N_{fr} - 1) + 1$. For example, Figures 5(a) and (b) illustrate the cases Nth / Nfr = 2/2 and 3/3, respectively. At the end of each cycle, redundant synchronization frames, 4s and 8s1/8s2, respectively, are encoded.
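As a quick check of the synchronization frequency formula, the sketch below evaluates it for the two configurations of Figure 5. Only the formula is taken from the text; the function name is hypothetical.

```python
def sync_frequency(n_th, n_fr):
    """Evaluate f_sync = N_th * (N_fr - 1) + 1 for a VRC configuration."""
    if n_th < 1 or n_fr < 1:
        raise ValueError("n_th and n_fr must be positive integers")
    return n_th * (n_fr - 1) + 1


if __name__ == "__main__":
    for n_th, n_fr in ((2, 2), (3, 3)):
        print(f"Nth/Nfr = {n_th}/{n_fr}: f_sync = {sync_frequency(n_th, n_fr)}")
```

The resulting values, 3 and 7, are consistent with Figure 5 if frame 1 is taken as the start of the cycle: the synchronization frames are then frame 4 (= 1 + 3) and frame 8 (= 1 + 7).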
Figure 5: Illustration of the video redundancy coding (VRC) mode: (a) Nth = 2, Nfr = 2; (b) Nth = 3, Nfr = 3. The legend distinguishes regularly encoded frames from frames encoded for synchronization; one cycle is shown along the time axis.
Coding efficiency is reduced by the overhead factor of fsync/(fsync+1), where only redundant frames are counted. Also, with a larger Nth, the temporal prediction loss increases. At the cost of this overhead, the error recovery characteristic of VRC is similar to that of RPS. However, in a fixed synchronization frequency form, it is applicable without feedback and is not sensitive to RTT, as in the case of I-MB refresh. Also, if a separate channel is provided for each thread and the loss is isolated, it may outperform RPS since at least one thread can be kept error free. However, as packet loss increases, all threads can be affected simultaneously and the error will propagate beyond one cycle.
B. Network Aware Error Control via SFRAM
As observed from the above review, the success of network-aware error control relies on how the network status is interpreted and how it is linked with the error resiliency options. First, feedback-based RPS is inherently delay sensitive, and RTT is the key factor for its deployment. The efficiency of the RPS ACK mode, even though it is guaranteed to prevent error propagation, is affected by RTT. The RPS NACK mode, while more efficient until a packet is lost, is subject to transient quality loss and requires complicated processing. Thus, switching between the RPS modes is coupled with packet loss monitoring under a reasonably small RTT. On the other hand, the VRC mode with multi-threading is capable of preventing error propagation depending on the ratio Nth / Nfr. The selection of these parameters, which trades off error resiliency against coding efficiency, has to be coordinated based on the application context. In contrast with these two modes, the I-MB refresh mode recovers from errors gradually. In the case of higher packet loss, the I-MB refresh mode can outperform the VRC mode. Thus, in terms of RTT, the reactive RPS modes and the feedback-free I-MB refresh/VRC modes are expected to show differentiated error resiliency. This also implies that they are complementary and can be used in a hybrid manner. Within each mode group, packet loss leads to performance degradation. These intuitions are summarized in Figure 6, which depicts the overall NAEC mechanism and its recommended operation ranges. The proposed SFRAM attempts to smooth the abrupt response to network changes. When integrated with NAEC, SFRAM can prevent excessive switching among the optional modes, since it provides RTT and the loss rate in a smooth manner. Also, SFRAM can be used to adjust adaptation parameters to observed network variations. For example, the ratio of I-MBs can be changed based on the expected packet loss.
Figure 6: The proposed coordination of NAEC via SFRAM. (a) Overview of the NAEC mechanism (blocks: user's preference, RTT history, loss history, weighting, decoding status, NAEC; optional modes: RPS NACK, RPS ACK, VRC, I-MB refresh). (b) The optional mode choice:
RTT | Loss rate | Optional mode
Short | High | ACK
Short | Low | NACK
Long | High | I-MB refresh
Long | Low | VRC
V. EXPERIMENT RESULTS
A. Simulation Setup
The discrete-event network simulator ns-2 [13] has been used extensively in our experiments to evaluate SFRAM and its interaction with Internet video transmission. First, basic properties of SFRAM such as TCP-friendliness and smooth and fast rate adaptation are evaluated. Then, SFRAM is combined with the error recovery options of H.263+ video [15] to examine video quality. More specifically, to evaluate NAEC dictated by SFRAM, the ns-2 simulation tool is integrated with an on-line H.263+ encoder as shown in Figure 7. This integration enables us to examine the interaction between the network and the video codec, which is difficult to capture with pre-encoded video or network trace evaluations. The overall simulation environment is explained below, from the topology to the packet type.
For the network simulation topology, we use a simple butterfly type where all end users pass through the same bottleneck link of 2.5 Mbps bandwidth, while all other links of 10 Mbps are congestion free. From 20 to 160 flows are injected into this bottleneck link during simulation, resulting in an average fair share (denoted by ABW) of 125 to 16 kbps. The propagation delay of the bottleneck link, denoted by dp, is set to 50 or 100 ms, which is a typical value for the current Internet. For the router, we simulate droptail or RED queuing with various queue sizes. The sizes are defined as $Q_{max} = Q_k \times d_p \times \text{bandwidth} / \text{packet size}$, where the queuing delay factor Qk is 2, 4 or 8. Under this topology, delay and delay jitter are mainly determined by the queuing delay (note the short delay of the side links). The maximum delay is then given by $d_{max} = (1 + Q_k) \times d_p$. The delay factor Qk not only determines RTT but also affects the packet loss rate, since a larger queue can absorb bursty traffic. In the case of RED routers, we set minthresh = Qmax/3 and maxthresh = Qmax. For ECN coupling, the marking probability is set to 0.1 for all RED queues, which corresponds to marking the ECN bit on one packet in every 10 on average.
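The topology parameters above can be worked through numerically. The following is a minimal sketch, assuming that the bandwidth is given in bits per second and that the packet size is converted from bytes to bits so that Qmax is expressed in packets; the 500-byte packet size is the one used for the TCP and SFRAM flows in this setup, and all function names are hypothetical.

```python
def fair_share_bps(bottleneck_bps, num_flows):
    """Average fair share (ABW) of each flow on the bottleneck link."""
    return bottleneck_bps / num_flows


def queue_limit_packets(qk, dp_sec, bottleneck_bps, packet_bytes):
    """Qmax = Qk * dp * bandwidth / packet size, expressed in packets."""
    return qk * dp_sec * bottleneck_bps / (packet_bytes * 8)


def max_delay_sec(qk, dp_sec):
    """dmax = (1 + Qk) * dp: propagation delay plus a full queue."""
    return (1 + qk) * dp_sec


def red_thresholds(qmax):
    """RED thresholds as set in the text: (minthresh, maxthresh)."""
    return qmax / 3, qmax


if __name__ == "__main__":
    bw = 2.5e6  # bottleneck bandwidth in bits per second
    for flows in (20, 160):
        print(f"{flows} flows: ABW = {fair_share_bps(bw, flows) / 1e3:.1f} kbps")
    for qk in (2, 4, 8):
        qmax = queue_limit_packets(qk, 0.05, bw, 500)
        min_th, max_th = red_thresholds(qmax)
        print(f"Qk={qk}: Qmax={qmax:.1f} packets, "
              f"dmax={max_delay_sec(qk, 0.05) * 1e3:.0f} ms, "
              f"RED thresholds=({min_th:.1f}, {max_th:.1f})")
```

For 160 flows the exact share is 15.625 kbps, which the text rounds to 16 kbps.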
For the traffic type, two types of flows, i.e., TCP and SFRAM (weighted and non-weighted), are injected side by side for comparative evaluation. For TCP flows, we use the SACK1 version implemented in ns-2, and the packet size is set to 500 bytes. Note that, although only the TCP-SACK1 and 50/100 ms propagation delay cases are reported, other TCP implementations and delays were also checked. Non-weighted SFRAM means simple averaging in calculating Eq. (1), while the actual SFRAM employs weighting. For SFRAM, the rate adaptation mechanism is implemented on top of UDP traffic, and the UDP packet size is also set to 500 bytes (if the video encoder is not linked to the SFRAM flow). When the flow is linked to the on-line H.263+ encoder, each encoded frame has a variable size and is carried in a single packet as long as it does not exceed the MTU.
Figure 7: Illustration of the overall ns-2 simulation topology and traffic types along with the on-line H.263+ video encoder. Senders inject TCP flows (500 bytes) and UDP (SFRAM) flows (500 bytes, or simulated on-line H.263+ video) through 10 Mbps side links (delays of td and (3-td) msec) into the bottleneck link (dp = 50 or 100 msec, BW = 2.5 Mbps) between two routers with DropTail or RED queues. The receivers are a TCP sink and a UDP (SFRAM) sink. {TCP, SFRAM} data travel forward; {TCP ACK} and {SFRAM feedback} travel backward.
B. Evaluation of SFRAM
B. 1. Smoothing effect
By comparing SFRAM with and without weighting, the smoothing effect of SFRAM is evaluated for the sampling window w set to 6 and 50, respectively. For this scenario, a fixed number (10 in the experiment) of SFRAM flows (with and without weighting) and TCP flows are sent over the simulated network (Qk=8 and dp=50 ms), each sharing a bandwidth of 125 kbps. For SFRAM, n=4 and m=1 are used as weighting parameters. As shown in Figure 8, a larger window width leads to smoother transmission behavior, and the employed weighting does not hurt the smoothing for this steady network. Comparing (c)-(d) with (g)-(h), SFRAM provides comparable performance in the smoothing effect.
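The effect of the window width w on smoothness can be illustrated with a plain moving average, which corresponds to the non-weighted case. This is a minimal sketch on synthetic RTT samples, not a reproduction of the experiment: the sample distribution and the function name are assumptions, and the SFRAM weighting controlled by n and m is not modelled.

```python
import random
import statistics
from collections import deque


def windowed_average(samples, w):
    """Plain (non-weighted) average over the most recent w samples."""
    window = deque(maxlen=w)  # old samples drop out automatically
    averages = []
    for sample in samples:
        window.append(sample)
        averages.append(sum(window) / len(window))
    return averages


# Synthetic, noisy RTT measurements around 0.3 sec (illustrative only).
random.seed(0)
rtt_samples = [0.3 + random.uniform(-0.1, 0.1) for _ in range(500)]

# Spread of the smoothed estimate once both windows are full.
spread_w6 = statistics.pstdev(windowed_average(rtt_samples, 6)[50:])
spread_w50 = statistics.pstdev(windowed_average(rtt_samples, 50)[50:])

if __name__ == "__main__":
    print(f"w = 6:  spread of averaged RTT = {spread_w6:.4f} sec")
    print(f"w = 50: spread of averaged RTT = {spread_w50:.4f} sec")
```

The larger window yields a visibly smaller spread, at the price of a slower reaction to genuine changes, which is the motivation for the weighting examined in the next subsection.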
Figure 8: Performance comparison of the smoothing effect. Each panel plots RTT (sec) and p on the left axis and the rate (bps) on the right axis against time (sec): (a) w = 6, Droptail, non-weighted; (b) w = 6, RED-ECN, non-weighted; (c) w = 50, Droptail, non-weighted; (d) w = 50, RED-ECN, non-weighted; (e) w = 6, Droptail, SFRAM; (f) w = 6, RED-ECN, SFRAM; (g) w = 50, Droptail, SFRAM; (h) w = 50, RED-ECN, SFRAM.
B. 2. Smooth and Fast adaptation to network variations
We examine the smooth and fast adaptation capability of SFRAM to network variations. Fast adaptation is important because it can reduce packet loss (i.e., prevent congestion). We compare SFRAM with and without weighting by simulating a pulse train in the fair share, denoted by ABW. By alternating the number of flows between 10 SFRAM and 10 TCP flows and 10 SFRAM and 70 TCP flows every 20 sec, the share of each individual flow alternates between 125 kbps and 32 kbps under the Qk=8 and dp=50 ms setting. For SFRAM, n=4 and m=1 are used. In Figure 9(a), the non-weighted scheme with a droptail queue adapts to network variations with some delay. In Figure 9(b), SFRAM adapts to network variations quickly, especially when used with RED-ECN. With the same network variation, Figure 10 shows the actual packet loss rate P for SFRAM and TCP with various weighting factors. As the weighting becomes heavier (larger n and m), P decreases. In every case, SFRAM results in a lower P than the non-weighted scheme. SFRAM is also beneficial to TCP since it reduces the packet loss of TCP. Additionally, RED-ECN is very effective in decreasing the packet loss rate.
Figure 9: Comparison of the fast adaptation effect. Each panel plots RTT (sec) and p on the left axis and the rate and ABW (bps) on the right axis against time (sec): (a) Droptail, non-weighted; (b) RED-ECN, SFRAM.
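The ABW pulse train used in this experiment follows directly from the flow counts. The sketch below assumes that the run starts in the low-load phase (10 SFRAM and 10 TCP flows) and that the 2.5 Mbps bottleneck is shared equally; the starting phase and all names are illustrative assumptions.

```python
BOTTLENECK_BPS = 2_500_000
PERIOD_SEC = 20
LOW_LOAD_FLOWS = 10 + 10   # 10 SFRAM + 10 TCP
HIGH_LOAD_FLOWS = 10 + 70  # 10 SFRAM + 70 TCP


def abw_bps(t_sec):
    """Fair share per flow at time t for the alternating load pattern."""
    in_low_load_phase = int(t_sec // PERIOD_SEC) % 2 == 0
    flows = LOW_LOAD_FLOWS if in_low_load_phase else HIGH_LOAD_FLOWS
    return BOTTLENECK_BPS / flows


if __name__ == "__main__":
    for t in range(0, 140, 20):
        print(f"t = {t:3d} sec: ABW = {abw_bps(t) / 1e3:.2f} kbps")
```

The high-load share is exactly 31.25 kbps, which the text rounds to 32 kbps.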
Figure 10: The actual packet loss rate P for SFRAM and TCP with respect to network variations. Panels (a) P, SFRAM and (b) P, TCP plot P against m (the weighting degree of p) for the non-weighted scheme and for n = 2, 3 and 4, each with drop and ecn. Panels (c) P, non-weighted; (d) P, SFRAM; (e) P, TCP with non-weighted; and (f) P, TCP with SFRAM plot P against the number of flows (20, 40, 80, 160), with legend entries 2, 4 and 8 for drop and ecn.
B. 3. TCP-friendliness and actual packet loss P
TCP-friendliness is verified by sending the same number of SFRAM and TCP flows. A broad range of flow numbers and Qk values (2, 4 and 8) is simulated with two queuing mechanisms (i.e., droptail and RED-ECN). Results for dp=50 ms are shown in Figure 11, since the results differ little for other values between 10 and 100 ms. SFRAM flows share the bandwidth fairly with TCP flows, and the weighting does not hurt TCP-friendliness either. The measured P shows that the ECN scheme contributes significantly to the reduction of the packet loss rate. Unlike the complicated queuing algorithm in [14], the RED-ECN scheme demands only low complexity. Thus, one can reduce the packet loss rate effectively by implementing ECN.
Figure 11: Comparison of the TCP-friendliness behavior and the actual packet loss rates. Panels (a) TCP-friendliness, non-weighted and (b) TCP-friendliness, SFRAM plot TCP-friendliness against the number of flows (20, 40, 80, 160), with legend entries 2, 4 and 8 for drop and ecn.
B. 4. Internet Experiments
We have performed real Internet experiments to evaluate SFRAM. An Internet connection was made from almaak.usc.edu in Los Angeles to cross.unomaha.edu in Nebraska. At first, one SFRAM flow was sent, where w, n and m were set to 50, 4 and 2, respectively. After 35 seconds, 30 TCP flows were sent for 5 Mbytes. Figure 12 shows the change of the transmission rate for the SFRAM flow and one TCP flow, measured on March 24, 2000. As shown in the figure, SFRAM achieves a smoother rate change yet adapts quickly to distinct network variations.
Figure 12: SFRAM vs. TCP in the Internet experiment in terms of transmission rate and RTT: (a) transmit rate (bps) of SFRAM and TCP against time (sec); (b) estimated RTT and SFRAM RTT against time (sec).
C. Video Quality Adaptation
We show here the effect of the smooth rate prediction from SFRAM on the resulting video quality. We send 20 TCP flows and 20 SFRAM flows for 100 sec, where one SFRAM flow is linked to the on-line H.263+ encoder. Each flow shares 64 kbps on average, which is the available bandwidth (ABW). The Foreman video clip in the QCIF format is encoded with a target rate based on the SFRAM prediction with two different window sizes, w = 6 and 50. For H.263+, the TMN 8 rate control scheme with the 2-frame skip mode is used. A video clip of 256 frames is encoded repeatedly for 100 sec, and the first frame of the image sequence is encoded in the Intra frame mode. Since the original video clip is captured at 30 frames/sec and two of every three frames are skipped, approximately 1000 frames are examined during 100 sec. A benefit of this setup is that the behavior of the video encoder is examined in a reproducible manner against realistic TCP background traffic. Figure 13 shows the target rate (i.e., the rate prediction via SFRAM) and SNR_Y for each window size. Note that the SNR_Y value is measured at the encoder side without considering the effect of packet loss, since we want to examine the effect of SFRAM only. Even though both cases use the bandwidth fairly on average, a small window size results in video quality fluctuation since the target rate is highly dynamic. This implies that a sensitive rate adjustment such as TCP congestion control is not desirable for video transmission, especially for delay-stringent applications.
Figure 13: Comparison of SNR_Y for two different window sizes: (a) w = 6; (b) w = 50. Each panel plots the target rate (Bps) and SNR_Y against time (sec).
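The frame count and the average per-frame budget implied by this setup can be checked with a short calculation. The sketch assumes that the 2-frame skip mode encodes one of every three captured frames; the function names are hypothetical.

```python
def encoded_fps(capture_fps, frames_skipped):
    """Encoded frame rate when `frames_skipped` frames are dropped after
    every encoded frame."""
    return capture_fps / (frames_skipped + 1)


def encoded_frames(duration_sec, capture_fps, frames_skipped):
    """Number of frames encoded during the whole run."""
    return round(duration_sec * encoded_fps(capture_fps, frames_skipped))


def bytes_per_frame(target_bps, capture_fps, frames_skipped):
    """Average byte budget per encoded frame at a given target bit rate."""
    return target_bps / 8 / encoded_fps(capture_fps, frames_skipped)


if __name__ == "__main__":
    print("encoded frames in 100 sec:", encoded_frames(100, 30, 2))
    print("average bytes per frame at 64 kbps:", bytes_per_frame(64_000, 30, 2))
```

At the 64 kbps fair share, i.e. 8000 Bps on the target rate axis of Figure 13, the encoder has about 800 bytes per encoded frame on average.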
D. Error recovery with feedback channel
The performance of various error recovery schemes is examined under a realistic network environment with a feedback channel, where TCP traffic exists as the background flow. Since our interest is in the effect of network constraints on error recovery schemes, we simply adopt frame-unit packetization. These results can be affected by GOB-level packetization; however, we leave this issue for future consideration. The same simulation setup for ns-2 and H.263+ as given in the above section is adopted. All video quality results are calculated at the decoder side with the measured packet loss rate and RTT.
D. 1. I-macro block refresh (I-MB refresh)
Basically, the I-MB refresh method recovers from errors progressively, as shown in Figure 14. One I macroblock is inserted in every 10 and 5 macroblocks for (a) and (b), respectively. Even though (a) has a higher PSNR value for error-free frames, it takes more time to recover from errors. The average SNR_Y is commonly reported to evaluate the performance of an error recovery scheme. In some applications, however, users may prefer a large number of error-free frames. In that case, the I-MB refresh method is not an appropriate method to use.
Figure 14: The effect of I-MB refresh: (a) 10% I-MB; (b) 20% I-MB. Each panel plots SNR_Y against time (sec).
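The trade-off between the I-MB ratio and the recovery time can be made concrete with a minimal sketch. It assumes, purely for illustration, a cyclic refresh pattern in which every encoded frame intra-codes a fixed number of not-yet-refreshed macroblocks, and a QCIF picture of 99 macroblocks (11 × 9). The refresh pattern actually used in the experiment is not specified in the text, and the function name is hypothetical.

```python
import math

QCIF_MACROBLOCKS = 99  # 176x144 luminance samples in 16x16 macroblocks


def refresh_cycle_frames(imb_ratio, macroblocks=QCIF_MACROBLOCKS):
    """Encoded frames needed to intra-refresh every macroblock once,
    assuming an idealized cyclic refresh (an assumption of this sketch)."""
    if not 0.0 < imb_ratio <= 1.0:
        raise ValueError("imb_ratio must be in (0, 1]")
    per_frame = max(1, round(imb_ratio * macroblocks))
    return math.ceil(macroblocks / per_frame)


if __name__ == "__main__":
    for ratio in (0.10, 0.20):
        print(f"{ratio:.0%} I-MB: full refresh in "
              f"{refresh_cycle_frames(ratio)} encoded frames")
```

Under these assumptions a 10% ratio needs about twice as many frames as a 20% ratio to clean the whole picture, which is consistent with the slower recovery observed for (a).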
D. 2. Reference Picture Selection (RPS)
Figures 15(a) and (b) show the SNR_Y value of decoded pictures for the ACK and NACK modes of RPS, respectively. For the ACK mode, there is no difference between the quality of encoded and decoded pictures except for those pictures containing a lost packet. In contrast, NACK experiences severe quality degradation until it receives a new Intra frame. Thus, both RTT and the loss rate are critical to NACK, while ACK depends only on RTT.
Figure 15: The performance of RPS with RTT = 400 msec and 3% loss: (a) ACK mode; (b) NACK mode. Each panel plots SNR_Y against time (sec).
D. 3. Video Redundancy Coding (Nth = 2, Nfr = 3)
We simulate the VRC scheme with Nth = 2, Nfr = 3 by using H.263+ video under different packet loss rates. Figures 16(a) and (b) show error patterns under 3% and 8% loss rates, respectively. The performance of VRC is similar to that of NACK in RPS because it can recover from errors instantly. It is better than NACK in the sense that the decoder can display the uncorrupted chain within one cycle, while NACK suffers from the erroneous packet for the duration of one RTT. However, the error may propagate beyond one cycle as described in Section IV A.3, depending on the overall packet loss rate. Since an intra frame must be requested in this case, a hybrid type of error recovery can be considered. From the above discussion, we conclude that VRC is more efficient at a lower packet loss rate, while RTT does not affect its performance.
Figure 16: The performance of VRC under different packet loss rates: (a) 3% loss; (b) 8% loss. Each panel plots SNR_Y against time (sec).
E. Network Aware Error Control
The experiments on various error recovery schemes under realistic packet loss show that their performance is strongly related to the network status. The characteristics of the underlying video applications are also important to the performance evaluation. Here, the effect of NAEC based on the information obtained from SFRAM is studied. We investigate two cases, under short and long RTT scenarios. For short RTT, we apply the ACK and NACK modes of RPS adaptively based on the packet loss rate. The I-MB refresh method is used for long RTT, where the ratio of I-MBs is adjusted adaptively according to the packet loss rate. To this end, we set Qk = 2 and dp = 50 msec or 100 msec, which yields a maximum RTT of 200 msec or 400 msec, respectively. Ten TCP and ten SFRAM flows are sent for 120 sec, where one of the SFRAM flows carries coded packets. During the intervals of 30-60 sec and 90-120 sec, 60 TCP flows are inserted to create congestion, which mainly affects the loss rate. The rule to switch between error recovery schemes is summarized in Table 2.
RTT (msec) | Loss rate (%) | Optional mode
< 200 | > 5.0 | ACK
< 200 | < 5.0 | NACK
> 200 | N/A | Adaptive I-MB (I-MB ratio = 3 * p)
Table 2: The rule of error recovery scheme selection based on RTT and the packet loss rate.
Figure 17 shows the result of applying the RPS ACK and NACK modes adaptively for short RTT. When there are no erroneous frames, the SNR_Y value of the ACK mode is lower than that of the NACK mode; that is, NACK performs better at a low packet loss rate because its coding efficiency is higher than that of the ACK mode. However, NACK frequently experiences severe quality degradation under a high packet loss rate, even though it recovers after one RTT. Compared with the fixed modes, adaptive NAEC via the use of SFRAM achieves a much better performance, as shown in (c). During a low packet loss interval, video quality is almost the same as that of the NACK mode. When the loss rate exceeds the threshold given in Table 2, the encoder switches to the ACK mode to prevent error propagation.
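The selection rule of Table 2 can be sketched as a small function. The thresholds (200 msec, 5.0%) and the factor 3 are taken from Table 2; the treatment of values exactly on a threshold, the cap of the I-MB ratio at 100%, and all names are assumptions of this sketch, since the table does not specify them.

```python
RTT_THRESHOLD_MSEC = 200.0     # from Table 2
LOSS_THRESHOLD_PERCENT = 5.0   # from Table 2
IMB_RATIO_FACTOR = 3.0         # I-MB ratio = 3 * p, from Table 2


def select_error_control(rtt_msec, loss_percent):
    """Return (mode, imb_ratio_percent) for smoothed RTT and loss inputs.

    imb_ratio_percent is None for the RPS modes. Boundary values are
    mapped to the NACK / adaptive I-MB branches (an assumption).
    """
    if rtt_msec < RTT_THRESHOLD_MSEC:
        if loss_percent > LOSS_THRESHOLD_PERCENT:
            return "RPS-ACK", None
        return "RPS-NACK", None
    # Long RTT: feedback is too slow for RPS, so adapt the I-MB ratio.
    imb_ratio = min(100.0, IMB_RATIO_FACTOR * loss_percent)
    return "Adaptive I-MB", imb_ratio


if __name__ == "__main__":
    for rtt, loss in ((150.0, 8.0), (150.0, 1.0), (400.0, 4.0)):
        print(f"RTT={rtt} msec, loss={loss}% -> {select_error_control(rtt, loss)}")
```

Feeding the function with the smoothed RTT and loss rate provided by SFRAM, rather than with raw per-packet measurements, is what limits excessive switching between modes.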
Figure 17: NAEC coordination with adaptive RPS: (a) ACK mode only; (b) NACK mode only; (c) adaptive NAEC. Each panel plots SNR_Y and the loss rate p against time (sec).
With a longer RTT, the adaptive I-MB refresh method is investigated. The encoder sets the ratio of I-MBs to three times the loss rate determined by SFRAM. Figure 18 shows the result of applying the I-MB refresh method adaptively. In (a), it achieves better video quality for error-free frames than in (b). However, it takes more time to recover from errors, which is undesirable especially at a high packet loss rate. In (c), the adaptive scheme is applied, and both the coding efficiency and the error recovery pattern improve overall in comparison with the schemes using a fixed ratio of I-MBs.
Figure 18: NAEC coordination with adaptive I-MB refresh: (a) 10% I-MB refresh; (b) 33% I-MB refresh; (c) adaptive I-MB refresh. Each panel plots SNR_Y and the loss rate p against time (sec).
VI. CONCLUSION
We investigated a new rate adaptation scheme, called the smooth and fast rate adaptation mechanism (SFRAM), suitable for Internet video transmission. Based on extensive experiments with the ns-2 simulation tool and in the real Internet environment, we demonstrated that SFRAM can adjust its transmission rate very smoothly by using a large sampling window. Moreover, it adapts well to network variations through a weighting method, so that the packet loss rate is significantly reduced and the overall network utilization is improved. In addition, the integration of RED (random early detection) and ECN (explicit congestion notification) with SFRAM provides even better performance. SFRAM is useful not only in sustaining stable video quality but also in preventing quality degradation due to packet loss. Along this direction, an effective and efficient network-aware error control (NAEC) scheme was proposed so that SFRAM can dictate the optimal error recovery scheme available to the encoder. Experiments based on a seamless integration of the ns-2 tool and the H.263+ encoder showed that NAEC dictated by SFRAM uses the bit budget efficiently while reducing severe degradation of video quality. Since our major interest is in the rate adaptation mechanism, we showed the feasibility of NAEC with a general rule for optional mode selection and with frame-level packetization. In the future, we plan to examine the exact relationship among RTT, the loss rate and the optional mode, especially for a fine-tuned hybrid type of NAEC. Also, the effect of GOB-level packetization will be considered in terms of network performance as well as from the error control point of view.
VII. REFERENCES
[1] R. Braden, D. Clark, and S. Shenker, "Integrated services in the Internet architecture: an overview," IETF RFC 1633, June 1994.
[2] R. Braden, Ed., L. Zhang, S. Berson, S. Herzog, and S. Jamin, "Resource reservation protocol (RSVP) - version 1 functional specification," IETF RFC 2205, Sept. 1997.
[3] S. Jacobs and A. Eleftheriadis, "Streaming video using TCP flow control and dynamic rate shaping," Journal of Visual Commun. and Image Representation, Sept. 1998.
[4] R. Rejaie, M. Handley, and D. Estrin, "RAP: An end-to-end rate-based congestion control mechanism for realtime streams in the Internet," in Proc. IEEE INFOCOM '99, Mar. 1999.
[5] J. Padhye, J. Kurose, and D. Towsley, "A TCP-friendly rate adjustment protocol for continuous media flows over best effort networks," UMass-CMPSCI Technical Report TR 98-04, Oct. 1998.
[6] D. Sisalem and H. Schulzrinne, "The loss-delay based adjustment algorithm: A TCP-friendly adaptation scheme," in Proc. NOSSDAV '98, July 1998.
[7] W. R. Stevens, TCP/IP Illustrated, Volume 1 - The Protocols. Addison-Wesley, 1994.
[8] S. Floyd and K. Fall, "Promoting the use of end-to-end congestion control in the Internet," IEEE/ACM Trans. on Networking, Aug. 1999.
[9] J. Padhye, V. Firoiu, D. Towsley, and J. Kurose, "Modeling TCP throughput: a simple model and its empirical validation," UMASS CMPSCI Tech Report TR98-008, Feb. 1998.
[10] S. Floyd and V. Jacobson, "Random early detection gateways for congestion avoidance," IEEE/ACM Trans. on Networking, Aug. 1993.
[11] K. Ramakrishnan and S. Floyd, "A proposal to add explicit congestion notification (ECN) to IP," IETF RFC 2481, Jan. 1999.
[12] H. Krishnan, "Analyzing explicit congestion notification (ECN) benefits for TCP," Master's thesis, UCLA, 1998.
[13] UCB/LBNL/VINT, Network Simulator - NS (version 2). http://www-mash.cs.berkeley.edu/ns, 1998.
[14] S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, and W. Weiss, "An architecture for differentiated services," IETF RFC 2475, Dec. 1998.
[15] ITU-T, Recommendation H.263 Version 2 - Video Coding for Low Bitrate Communication. Jan. 1998.
[16] Moving Picture Expert Group, MPEG-4 video verification model version 10.0. ISO/IEC JTC1/SC29/WG11, Feb. 1998.
[17] J. Bolot and T. Turletti, "Experience with rate control mechanisms for packet video in the Internet," ACM SIGCOMM Computer Communication Review, vol. 28, no. 1, Jan. 1998.
[18] W. Tan and A. Zakhor, "Real-time Internet video using error resilient scalable compression and TCP-friendly transport protocol," IEEE Trans. on Multimedia, vol. 1, no. 2, June 1999.
[19] J. Kim, Y.-G. Kim, H. Song, T.-Y. Kuo, Y. J. Chung, and C.-C. J. Kuo, "TCP-friendly Internet video streaming employing variable frame-rate encoding and interpolation," IEEE Trans. on Circuits and Systems for Video Technology (Special Issue on the Picture Coding Symposium '99), to be published.
[20] F. Le Leannec, F. Toutain, and C. Guillemot, "Packet loss resilient MPEG-4 compliant video coding for the Internet," Signal Processing: Image Communication, vol. 15, 1999.
[21] G. Cote, S. Shirani, and F. Kossentini, "Optimal mode selection and synchronization for robust video communications over error prone networks," submitted to IEEE JSAC, May 1999.
[22] J.-Y. Lee, T.-H. Kim, and S.-J. Ko, "Motion prediction based on temporal layering for layered video coding," in Proc. ITC-CSCC, Jan. 1998.
[23] S. McCanne, Scalable compression and transmission of Internet multicast video. Ph.D. thesis, University of California, Berkeley, 1996.
[24] L. Wu, R. Sharma, and B. Smith, "Thin streams: An architecture for multicasting layered video," Workshop on Network and Operating System Support for Digital Audio and Video, May 1997.
[25] X. Li, M. Ammar, and S. Paul, "Layered video multicast with retransmission (LVMR): evaluation of hierarchical rate control," in Proc. IEEE INFOCOM, March 1998.
[26] G. Conklin and S. Hemami, "A comparison of temporal scalability techniques," IEEE Trans. on Circuits and Systems for Video Technology, Sept. 1999.
[27] M. Vishwanath and P. Chou, "An efficient algorithm for hierarchical compression of video," in Proc. IEEE Intl. Conf. Image Processing, 1994.