Internet Video Packet Categorization with Enhanced End-to-End QoS Performance
Jin-Gyeong Kim, JongWon Kim*, and C.-C. Jay Kuo
Integrated Media Systems Center and Department of Electrical Engineering, University of Southern California, Los Angeles, California 90089-2564
* Department of Information & Communication, K-JIST (Kwang-Ju Institute of Science & Technology), KwangJu, KwangJu, 500-712, KOREA
ABSTRACT
In this research, a video delivery system is proposed by integrating the layered codec and the proposed corruption model. In the system, it is assumed that the base layer packets are delivered without loss since they are assigned the most reliable service level. Then, packets in the enhancement layer are analyzed with the corruption model and categorized for differentiated delivery services. The corruption model for generating the relative priority index (RPI) is extended to layered bitstreams. In addition to taking into account the initial error and the propagation error due to packet loss, the dependency between the same and different layers is reflected in the modified macroblock corruption model. In order to facilitate the prioritized delivery of the enhancement layer, the RPI-based corruption model for each packet as well as coordinated delivery of packetized video over QoS networks are investigated. Finally, a per-packet optimization framework with unequal error protection (UEP) is realized to consider both end-to-end video performance and pricing. The performance of the proposed solution is verified with extensive simulations.
Keywords: Packet video, layered coding, corruption model, error propagation, unequal error protection, network service level, and QoS.
1. INTRODUCTION
The ever-increasing demand for multimedia communications over the wired/wireless Internet faces the challenges of packet loss and bandwidth fluctuation. The dependency between image frames makes a compressed video stream vulnerable to even a small number of lost packets. To address error resilience, the latest versions of ITU-T H.263+ [1] and ISO MPEG-4 have adopted several options to alleviate the corruption of compressed video over error-prone channels. Examples include layered representation (e.g. data partitioning), re-synchronization, error tracking, and error recovery options. Robustness against packet loss is particularly relevant to the Internet scenario, which is the main concern of this research.
Internet multimedia (especially packet video) applications have very diverse requirements on the network service. Recently, network architectures have been geared towards providing different levels of quality of service (QoS) in terms of network parameters such as loss, delay, and bandwidth. Thus, the coordination between packets of different priority and network service levels has become an important issue. The priority assigned to a video packet is most useful if it accurately represents the effect of the packet's loss, including error propagation, on the received video quality. A systematic solution is proposed in this work to assign a relative priority index (RPI) to each packet so that it can be used under a packet-level unequal error protection (UEP) framework, where a different level of protection is applied to each packet. This can be realized at the transport end with different levels of forward error correction (FEC) and/or automatic repeat request (ARQ) for each packet [2]. Alternatively, a proper priority in terms of RPI may be conveyed to a differentiated services (DiffServ) network so that each packet is treated differently by the differentiated forwarding mechanism [3].
In order to assign RPI to each video packet according to its loss propagation property, a corruption model was proposed in our previous work [4] [5]. In the work, the initial error strength of a lost MB was estimated based on the adopted error concealment scheme at the decoder. Then, to capture the expected impact of the distortion on future frames (i.e. motion-based dependency structure), a trace-back calculation for each MB was used. Also, loop-filtering and intra-MB refreshing effects were considered through the propagation path. By combining the above tools, the proposed corruption model can reasonably estimate the loss propagation effect for each packet, and provide the expected distortion due to its loss.
With the derived packet-level RPI-based corruption model, the coordinated delivery of packetized video over QoS networks was also investigated in our previous research. A per-packet optimization framework with unequal error protection (UEP) was realized to take both end-to-end video performance and pricing into account. Despite the improved results of UEP with the corruption model, reconstructed frames may still be badly damaged when packets with a large loss impact are dropped. To guarantee a lower bound on the end-to-end quality, we propose in this work a new packet delivery system that combines the RPI-based corruption model with layered coding techniques. With layered coding, the minimum quality can be preserved with a relatively low bit rate in the base layer, which makes the system more cost efficient. To achieve higher visual quality, the enhancement bitstream can be transmitted through different channels. It is known from the rate-distortion characteristics that a relatively large number of bits is required to obtain additional quality improvement on top of the base layer. Hence, the cost of transmitting the enhancement layer bitstream is higher than that of the base layer bitstream. A differentiated services network is a suitable choice for transmitting the enhancement layer bitstream, provided that prioritization is applied to its packets.
In this research, we extend our corruption model to video packets generated with layered coding techniques, and the video delivery system is modified accordingly. The layered video codec generates the base layer and enhancement layer bitstreams. The corruption model is applied to the enhancement layer bitstream to estimate the packet loss impact. The prioritized enhancement layer packets are delivered through differentiated services networks, while the base layer packets are transmitted through the most reliable transmission channel, in which no packet loss is assumed.
The rest of this paper is organized as follows. A video delivery system with the RPI-based corruption model is described in Section 2. An MB-level corruption model for layered bitstreams is proposed in Section 3. Then, the optimization formulation utilizing RPI under the UEP scenario is introduced in Section 4. Experiments that verify the proposed corruption model for error-resilient ITU-T H.263+ video are reported in Section 5.
2. PACKET VIDEO DELIVERY SYSTEM WITH RPI-BASED CORRUPTION MODEL
A packetized video delivery system with the proposed RPI-based corruption model is shown in Figure 1. The system consists of the video encoder, the packetizer, the RPI-based corruption model association module, the network adaptation module, the underlying delivery network, the de-packetizer, and the video decoder. In this system, a layered video encoding scheme is used to generate two layers of compressed video bitstream. The base layer bitstream can be decoded independently, while the enhancement layer bitstream exploits the decoded base layer data to provide better video quality. Many video compression standards, such as H.263/+, MPEG-2, and MPEG-4, provide a layered video encoding mechanism. We also assume the existence of a network that supports prioritized variable-rate delivery and an associated pricing mechanism.
The DiffServ network has been introduced as an advanced network architecture that provides quality of service (QoS). It achieves QoS with routers that forward packets with different per-hop behaviors (PHBs) [12]. The PHB applied to a packet is selected by the DiffServ (DS) level carried in its Internet Protocol (IP) header. The DS level is marked when the packet enters the DiffServ network. In our system, the Network Adaptation Layer (NAL) is placed at the entrance to the DiffServ network and dynamically maps the Relative Priority Index (RPI), which indicates the sensitivity of a packet to loss and delay, to a DS level. For each packet, the Relative Loss Index (RLI) and the Relative Delay Index (RDI) are calculated as sensitivity measures for loss and delay, respectively. In the NAL, a packet with value k is mapped dynamically to DS level q according to the distribution of k and the traffic condition. The packet is then marked with q and passed to the DiffServ network.
In the DiffServ network, each packet is forwarded by routers with a different PHB according to its DS level. The network traffic condition can be fed back to the NAL so that the mapping parameters adapt to it. As a result, each packet is delivered with a delay and loss rate that match its sensitivity. To extract accurate RLI and RDI values, the encoded bitstream should be analyzed in terms of loss sensitivity and delay sensitivity. For RLI, the importance of a packet should be evaluated in terms of end-to-end quality. The corruption model estimates the loss impact of each enhancement layer packet in terms of MSE. In this paper, a corruption model is developed specifically for the SNR scalable bitstream. It takes into account both initial errors and the errors that propagate through the decoded enhancement layer frames.
[Figure 1: block diagram. Blocks: Video Source, Scalable Encoder, Packetizer (two instances), Corruption Model (RPI), Network or Wireless Channel (Packet Loss, Delay), Depacketizer, Base Layer Decoder, Enhancement Layer Decoder, Error Concealment, Reconstructed Video. Other labels: k, q, BE, AF0, AF1, AFn, AS, RL.]
Figure 1. The packet video delivery system employing the RPI-based corruption model.
Compressed video from the video encoder is packetized (and possibly multiplexed with other media) at the packetizer. The resulting video packets are then associated with an RPI so that their impact on video quality can be relayed to the network adaptation module and the underlying delivery network. Only the enhancement layer video packets are evaluated with RPI, because we assume that the base layer video packets are delivered without packet loss by using the most reliable channel (or the highest priority). In the DiffServ case, each packet may be mapped to a different DS level, which is in turn characterized by its degree of loss, delay, and bandwidth. Successfully delivered packets are de-packetized, de-multiplexed, and decoded for rendering. Using the decoded frame data of the base layer, the enhancement layer decoder reconstructs video frames of enhanced quality. In general, any transmission scheme can be assumed as long as it supports RPI-associated differentiation. For example, the approach can be applied to robust video transmission with adaptive FEC-based protection [2] or to DiffServ packet forwarding [3]. Furthermore, other layered encoding methods, such as spatial and temporal scalable coding, can be used with an extension of the current work.
3. MB-LEVEL CORRUPTION MODEL
3.1. Review of Previous Work
Most state-of-the-art video compression techniques, including H.263+ and MPEG-1/2/4, are based on motion-compensated prediction (MCP). The video codec employs inter-frame prediction to remove temporal redundancy and transform coding to reduce spatial redundancy. In the Internet environment, packets may be discarded due to buffer overflow at intermediate nodes of the network, or considered lost due to long queuing delays. When a packet is lost, an error recovery action (also called error concealment) is performed at the decoder, which attempts to find the best substitute for the lost portion. Most video compression standards do not define a normative error concealment method, and various schemes have been proposed by researchers, e.g. [10], [11]. The temporal concealment scheme exploits the temporal correlation in video signals by replacing a damaged MB with the spatially corresponding MB in the previous frame. This straightforward scheme, however, produces adverse visual artifacts in the presence of large motion. Therefore, motion-compensated temporal concealment is usually employed, with motion vectors estimated from the surrounding MBs.
Some residual error remains even after a sophisticated error concealment scheme is applied, and this error propagates due to the recursive prediction structure. This type of temporal error propagation is typical of hybrid video coding, which relies on MCP for inter-frame coding. The number of times a lost MB is referenced in the future depends on the coding modes and motion vectors of MBs in subsequent frames. This number indicates the importance of each MB from the viewpoint of error propagation.
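[Figure 2: a lost MB with initial error energy $\sigma_u^2$ in frame n and the impaired MBs it reaches, with energies $\sigma_v^2(1,1)$ and $\sigma_v^2(1,2)$ in frame n+1 and $\sigma_v^2(2,1)$, $\sigma_v^2(2,2)$ and $\sigma_v^2(2,3)$ in frame n+2.]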
Figure 2. Error propagation of a lost MB.
While propagating temporally and spatially, the residual error after error concealment decays over time due to leakage in the prediction loop. Leaky prediction is a well-known technique to increase the robustness of DPCM by attenuating the energy of the prediction signal. In hybrid video coding, leakage is introduced by spatial filtering operations that are performed during encoding. Spatial filtering can be introduced either by an explicit loop filter or implicitly as a side effect of half-pixel motion compensation with bilinear interpolation. This spatial filtering effect in the decoder was analyzed by Fäber et al. [6]. In their work, the loop filter was approximated by a Gaussian-shaped filter in the spatial frequency domain, and the error signal was assumed to be a zero-mean stationary random process. With this approximation, the propagating error energy can be formulated as
$$\sigma_v^2[t] = \frac{\sigma_u^2}{1 + \gamma \cdot t} = \sigma_u^2 \cdot \alpha[t], \tag{1}$$
where $\gamma = \sigma_f^2 / \sigma_g^2$ is a parameter describing the efficiency of the loop filter in reducing the introduced error, and $\sigma_f$ and $\sigma_g$ are, respectively, the loop filter strength and a parameter that characterizes the shape of the power spectral density (PSD) of the error signal. The parameter $\alpha[t]$ is called the power transfer factor after t time steps. The analytical model in (1) has been verified by experimental results [6, 11]. While the statistical propagation behavior of an error can be analyzed with this model, it is in general difficult to estimate the loss effect for a packet composed of several MBs (i.e. the GOB unit). Thus, in the following section, we extend the propagation behavior analysis by incorporating additional factors, such as the error concealment scheme and the encoding mode, so that the error propagation effect can be tracked better with moderate computational complexity.
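The decay described by Eq. (1) can be illustrated with a few lines of Python. This is only a minimal sketch: the function names and the numerical values of the initial energy and of γ are illustrative assumptions, not values from the paper.

```python
def power_transfer_factor(gamma: float, t: int) -> float:
    """alpha[t] = 1 / (1 + gamma * t), following Eq. (1)."""
    if gamma < 0 or t < 0:
        raise ValueError("gamma and t must be non-negative")
    return 1.0 / (1.0 + gamma * t)


def propagated_energy(sigma_u2: float, gamma: float, t: int) -> float:
    """Error energy sigma_v^2[t] that remains t prediction steps after the loss."""
    return sigma_u2 * power_transfer_factor(gamma, t)


# Illustrative values only (not taken from the experiments).
energies = [propagated_energy(100.0, 0.25, t) for t in range(5)]
```

A larger γ (a stronger loop filter relative to the spectral width of the error) makes the energy fall off faster, which is the leakage effect discussed above.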
3.2. Derivation of MB-level Corruption Model for Layered Codec
The purpose of the corruption model is to estimate the total impact of packet loss. When one or multiple packets are lost, errors are introduced and propagated. The impact of errors is defined as the difference between the frames reconstructed with and without packet loss, measured in terms of mean square error (MSE). Fig. 2 shows the error propagation of a lost MB, in which the initial error of the corrupted MB is denoted by $u(x,y)$ and its energy is measured in terms of the error variance $\sigma_u^2$. The propagation error $v(x,y)$ in consecutive frames has energy $\sigma_v^2(m, j)$ in impaired MB j of frame n + m.
The initial error due to packet loss depends on the error concealment scheme adopted by the decoder. It can be calculated at the encoder if the error concealment scheme used in the decoder is known a priori. For a lost enhancement layer packet, the initial error is calculated according to the error concealment algorithm of the enhancement layer. One possible algorithm is motion-compensated error concealment predicted from the previous frame of the same enhancement layer, as shown in Fig. 3(a). In this case, the motion vector for the lost macroblock can be borrowed from the macroblock above it. This algorithm provides good results when the packet loss rate is low. If the packet loss rate is high, the expected initial error may lose accuracy due to the effect of multiple packet losses. To isolate the packet dependency when error concealment is performed, another algorithm can be considered, as shown in Fig. 3(b): the lost macroblock is replaced with the macroblock at the same position in the corresponding frame of the base layer. Due to the higher reliability of the base layer, this concealment can reduce error propagation even for a high packet loss rate.
[Figure 3: diagram with panels (a) and (b); labels: MV, MVc, MV=0, Lost MB, Enhancement Layer, Base Layer.]
Figure 3. Error concealment of a lost MB in the enhancement layer.
[Figure 4: diagram with panels (a), (b) and (c); labels: $\sigma_f^2$, $\sigma_v^2$.]
Figure 4. Prediction modes for the SNR enhancement layer: (a) forward prediction, (b) upward prediction and (c) bi-directional prediction.
The introduced initial error propagates through successive frames of the same or higher enhancement layers via motion compensation. However, the propagated error energy generally decays over time due to the filtering effect of the prediction loop of the decoder. Let us analyze the energy transition along an error propagation trajectory. Typical propagation trajectories are illustrated in Fig. 5: the first is a parallel trajectory and the second is a cascaded trajectory. Before the analysis, some assumptions are made to reduce the computational complexity. The decoder with the DPCM loop and the spatial filter is assumed to be a linear system. Also, for each frame to be predicted, the time difference is set to 1, i.e. t = 1 in (1). Furthermore, in addition to the filtering effect of the prediction loop, we need to consider the inter-layer prediction between different layers. In SNR scalability, the prediction can be performed in three different ways, as shown in Fig. 4. The error propagated from the reference macroblock to the predicted macroblock depends on the prediction mode. If forward prediction is used, the propagated error energy follows the transition function given in (1). If upward prediction is used, the propagated error would equal the error of the reference macroblock because no prediction filter is applied. However, because the base layer does not suffer packet loss and we are only interested in the loss impact of enhancement layer packets, no propagation error is counted in this case.
For bi-directional prediction, the error propagated from the reference macroblock in the enhancement layer decays due to both the prediction filtering effect and the averaging effect. As a result, the transition function for the enhancement layer packet is modified to
$$\sigma_v^2 = k \cdot \frac{\sigma_u^2}{1 + \gamma}, \tag{2}$$
where k depends on the prediction mode: k = 1 for forward prediction, k = 0 for upward prediction, and k = 1/4 for bi-directional prediction.
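As a minimal sketch of Eq. (2), the mode factor k can be looked up per prediction mode, with k = 1 for forward, 0 for upward and 1/4 for bi-directional prediction (averaging with the error-free base layer halves the error amplitude and hence quarters its energy). The mode names used as dictionary keys and the numerical inputs are illustrative assumptions.

```python
# Mode factor k of Eq. (2); the string keys are hypothetical labels.
MODE_FACTOR = {"forward": 1.0, "upward": 0.0, "bidirectional": 0.25}


def transition_energy(sigma_u2: float, gamma: float, mode: str) -> float:
    """One-step propagated error energy for an enhancement layer MB, Eq. (2)."""
    k = MODE_FACTOR[mode]
    return k * sigma_u2 / (1.0 + gamma)


# Illustrative values only.
forward = transition_energy(100.0, 0.25, "forward")
upward = transition_energy(100.0, 0.25, "upward")
bidirectional = transition_energy(100.0, 0.25, "bidirectional")
```

With these inputs the forward path keeps most of the error energy, the bi-directional path keeps a quarter of that, and the upward path contributes nothing to the enhancement layer loss impact.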
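[Figure 5: panel (a) shows MB U of frame n, with error parameters $(\sigma_u^2, \sigma_g^2)$, referenced by MBs A and B of frame n+1 with resulting energies $\sigma_a^2$ and $\sigma_b^2$; panel (b) shows U in frame n, A in frame n+1 and B in frame n+2 in cascade; each transition passes a loop filter of strength $\sigma_f^2$.]
Figure 5. Two typical propagation trajectories: (a) parallel and (b) cascade.
[Figure 6: frames n, n+1 and n+2; labels: MV1, MV2, w(2,j), MBs i and j, "Macroblock to be evaluated".]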
Figure 6. Recursive weight calculation.
Now, the overall distortion due to a macroblock loss can be estimated based on (2). Fig. 5(a) shows the parallel propagation trajectory. The error in reference frame n is characterized by $(\sigma_u^2, \sigma_g^2)$, which determines the PSD of the error. In the parallel trajectory, an error can propagate to two or more different areas in subsequent frames. For each path, a different motion vector and spatial filtering can be applied. After calculating the PSD of the error frame, the integration of the PSD over the spatial frequency domain gives the propagation error energy as
$$\sigma_v^2 = \sigma_a^2 + \sigma_b^2 = \sigma_u^2 \left( \frac{k_a}{1 + \gamma_a} + \frac{k_b}{1 + \gamma_b} \right), \tag{3}$$
where $\gamma_a = \sigma_{f_a}^2 / \sigma_g^2$ and $\gamma_b = \sigma_{f_b}^2 / \sigma_g^2$. Therefore, the MSE can be estimated individually for each path and accumulated for the parallel error propagation trajectory. In the cascaded propagation trajectory shown in Fig. 5(b), the MB U of frame n, with initial error energy $\sigma_u^2$, is referenced by A of frame n + 1. The loop filter function of each transition is characterized by $\sigma_{f_a}^2$ and $\sigma_{f_b}^2$, respectively. Then, the PSD of the propagation error in frame n + 2 can be given in terms of the product of the two filter functions. The equivalent error energy is derived as
$$\sigma_b^2 = k_a \cdot k_b \cdot \frac{\sigma_u^2}{1 + \gamma}, \quad \text{where} \quad \gamma = \frac{\sigma_{f_a}^2 + \sigma_{f_b}^2}{\sigma_g^2}. \tag{4}$$
The propagation error energy from U to B is given in (4). It has the same form as (2) except for the loop filter efficiency $\gamma$: for cascaded propagation, the equivalent filter strength $\sigma_f^2$ is the sum of the filter strengths $\sigma_{f_a}^2$ and $\sigma_{f_b}^2$ along the propagation path. Furthermore, let us consider the portion of an MB that contributes to the predicted frames that follow, which is called the dependency weight. The dependency weight can be calculated recursively from stored motion vectors and MB types, as shown in Fig. 6. The transferred error energy can then be calculated from the loop filtering effect and the temporal dependency weight. Finally, to evaluate the total impact of the loss of $MB_{n,i}$, the weighted error variances of the MBs in subsequent frames should be summed. Since the initial error can persist over a number of frames without converging to zero, the number of frames to be evaluated has to be limited to keep the computational complexity acceptable. As a result, the total error energy due to an MB loss in a sequence can be written as
$$\sigma^2 = \sigma_u^2 + \sum_{m=1}^{M} \sum_{j=1}^{N} W_{n,j}(m, j) \cdot \sigma_v^2(m, j), \tag{5}$$
where M is the size of the estimation window and N is the total number of MBs in a frame. Also, we have
$$\sigma_v^2(m, j) = \frac{\sigma_u^2}{1 + \gamma_{m,j}} \cdot \prod_{n=1}^{m} k_n, \quad \text{and} \quad \gamma_{m,j} = \frac{\sum_{n=1}^{m} \sigma_{f_{n,j}}^2}{\sigma_g^2}. \tag{6}$$
Based on Eq. (5), the macroblock loss error in the enhancement layer can be expected to decay faster than that in the base layer. In particular, the more upward-predicted macroblocks are used, the harder it is for the propagation error to persist. Thus, in general, a pre-defined number of frames is sufficient to estimate the total impact of an MB loss. This defines an estimation window for the corruption model. The appropriate estimation window may be determined from the characteristics of the underlying image sequence.
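The accumulation over the estimation window can be sketched as follows: along each propagation path the mode factors multiply and the filter strengths add, and the weighted energies of all impaired MBs inside the window are summed on top of the initial error. The `ImpairedMB` container, the function names and all numerical values are hypothetical; a real implementation would obtain the dependency weights and filter strengths from the stored motion vectors and MB types.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ImpairedMB:
    """An MB j in frame n+m that depends on the lost MB (hypothetical container)."""
    weight: float                     # dependency weight W(m, j)
    hops: List[Tuple[float, float]]   # one (k, sigma_f^2) pair per prediction step


def path_energy(sigma_u2: float, sigma_g2: float,
                hops: List[Tuple[float, float]]) -> float:
    """Mode factors multiply and filter strengths add along the path."""
    k_product = 1.0
    filter_sum = 0.0
    for k, sigma_f2 in hops:
        k_product *= k
        filter_sum += sigma_f2
    gamma = filter_sum / sigma_g2
    return k_product * sigma_u2 / (1.0 + gamma)


def total_loss_energy(sigma_u2: float, sigma_g2: float,
                      impaired: List[ImpairedMB], window: int) -> float:
    """Initial error plus weighted propagated errors inside the window."""
    total = sigma_u2
    for mb in impaired:
        m = len(mb.hops)  # frame offset equals the number of prediction steps
        if 1 <= m <= window:
            total += mb.weight * path_energy(sigma_u2, sigma_g2, mb.hops)
    return total


# Illustrative numbers only: one MB in frame n+1 and one in frame n+2.
mbs = [
    ImpairedMB(weight=0.5, hops=[(1.0, 1.0)]),
    ImpairedMB(weight=1.0, hops=[(1.0, 1.0), (0.25, 1.0)]),
]
full = total_loss_energy(100.0, 4.0, mbs, window=2)
short = total_loss_energy(100.0, 4.0, mbs, window=1)
```

Shrinking the window from two frames to one drops the contribution of the second MB, which shows how the window size trades estimation accuracy against computation.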
4. CORRUPTION MODEL BASED RPI AND NETWORK ADAPTATION
4.1. Corruption Model Based RPI for Layered Coding
By using the equations derived for layered coding, the loss impact of each macroblock can be evaluated in terms of expected MSE. Because the transmission unit is a packet, which is much larger than one macroblock, the packet-level loss effect should be estimated from the macroblock loss effects. In addition to packet-level estimation, an appropriate categorization is necessary to make the scheme feasible in the network adaptation layer. RPI assignment depends to some extent on the employed video coding scheme. Here, we consider MCP-based video compression schemes with the proposed MB-level corruption model for layered coding. Under this MB-level corruption model, the total loss impact of an MB is calculated recursively under the assumption that only one MB is lost (called the MB-independent assumption). However, a packet usually contains more than one MB. It may contain a number of GOBs (groups of blocks) or slices, or even frames. Moreover, multiple packets may be lost in practice. The initial error of an MB is not affected by the other MBs in the same packet, because they occupy disjoint areas and are lost at the same time. The propagation path of one MB, however, can be affected by the loss of other MBs. Thus, the packet loss estimate obtained under the MB-independent assumption may deviate from the real situation; this deviation is assumed to be negligible here. Under the assumption that only one packet loss occurs in the whole sequence (or in a reasonable number of frames surrounding the packet), the error propagation path and its loop filter strength are not affected by other packet losses. Then, an independent RPI can be assigned to each packet by summing the MB-level corruption effects under the MB-independent assumption with a proper normalization.
In fact, because the propagation error decays over time quickly in the enhancement layer, the independent RPI can be a reasonable choice.
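A possible way to turn MB-level loss energies into a categorized packet-level RPI is sketched below. The paper only calls for "a proper normalization", so two choices here are assumptions: the packet impact is taken as the average of its MB energies, and the categories are obtained by linear min-max quantization into K levels. The function names are hypothetical.

```python
from typing import List


def packet_loss_impact(mb_energies: List[float]) -> float:
    """Average of the MB-level loss energies of the MBs carried in one packet."""
    return sum(mb_energies) / len(mb_energies)


def categorize_rpi(impacts: List[float], num_categories: int) -> List[int]:
    """Linearly quantize loss impacts into categories 1..K (K = largest impact)."""
    lo, hi = min(impacts), max(impacts)
    if hi == lo:
        return [1] * len(impacts)
    step = (hi - lo) / num_categories
    return [min(num_categories, int((x - lo) / step) + 1) for x in impacts]


# Illustrative values only.
impact = packet_loss_impact([2.0, 4.0, 6.0])
categories = categorize_rpi([0.0, 2.5, 5.0, 10.0], num_categories=4)
```

Other quantizers (for example, one that equalizes the number of packets per category) would fit the same interface; the choice depends on the distribution of the loss impacts and on the network service.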
4.2. RPI-based Coordination for Network Adaptation
With the RPI-based corruption model for each packet, a coordinated effort to deliver packetized video over QoS networks is investigated in this section. In particular, a packet-based protection framework with unequal error protection (UEP) is realized. This framework incorporates both end-to-end video performance and pricing. The end-to-end video performance is measured by either an objective quantity (e.g. the PSNR value) or subjective quality. In our approach, the impact on visual quality due to the loss of a packet is represented by its RPI (independent or dependent RPI).
Given RPI-assigned packets, the network adaptation task can be formulated as follows. The loss impact of each packet, expressed in terms of RPI, is first categorized (i.e., normalized and quantized) into one of K categories according to the significance of the packet from the perspective of quality degradation. The actual categorization process may vary according to the employed network adaptation module and the underlying network service. Under the network delivery scenario, the resulting quality degradation depends on both the categorized RPI and the delivery mechanism. The delivery mechanism for each packet is in turn categorized into level q among a total of Q levels, with a price $p_q$ that depends on q. For example, in the DiffServ network, the DS level indicates a forwarding treatment with which a certain level of QoS is assured. Thus, the total quality degradation of video with N packets can be expressed as
$$QD = \sum_{i=1}^{N} QD(k(i), q(i)). \tag{7}$$
Since the total cost P for N packets is limited and each packet i costs $p_{q(i)}$, the optimal assignment $\mathbf{q} = (q(1), q(2), \ldots, q(N))$ can be found by minimizing the total quality degradation. That is,
$$\min_{\mathbf{q}} QD = \min_{\mathbf{q}} \sum_{i=1}^{N} QD(k(i), q(i)), \tag{8}$$
subject to
$$\sum_{i=1}^{N} p_{q(i)} \le P. \tag{9}$$
We can solve the problem by finding the service level mapping $\mathbf{q}$ that minimizes the Lagrangian formula
$$J_i(\lambda) = QD(k(i), q(i)) + \lambda \cdot p_{q(i)}. \tag{10}$$
The solution depends on $QD(k(i), q(i))$ and $p_{q(i)}$. The Lagrangian formulation of this problem is illustrated in [18]. The mapping function from the categorized RPI to the quality delivery mechanism and the pricing strategy both affect the mapping solution. With these two determined, the cost versus quality degradation function can be derived for each packet. The solution is then to set each $p_{q(i)}$ equal to the price at which a line of slope $-\lambda$ touches the quality degradation curve. Since the total cost is the sum of all packet costs, $\lambda$ is adjusted until the cost constraint (9) is met.
In particular, we consider the special situation where the quality degradation is affected by the loss effect only. The expected quality degradation $QD(k(i), q(i))$ can then be factored into the product of the loss impact $QD_{k(i)}$ and the packet loss rate $L_{q(i)}$ of the packet. If the quality degradation $QD_k$ is linearly proportional to the RPI category k (i.e. $QD_k = k \cdot Q$, where Q is the normalization factor), the price $p_q$ of service level q is the reciprocal of the packet loss rate $L_q$, and the packet loss rate $L_q$ is inversely proportional to the service level q (i.e. $L_q = L/q$ for a constant L), the quality degradation can be expressed as
$$QD(k(i), q(i)) = QD_{k(i)} \cdot L_{q(i)} = Q \cdot k(i) \cdot \frac{L}{q(i)}. \tag{11}$$
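Then the Lagrangian formula of (10) becomes
$$J_i(\lambda) = Q \cdot k(i) \cdot \frac{L}{q(i)} + \lambda \cdot \frac{q(i)}{L}. \tag{12}$$
After solving (12) for q(i), the parameter $\lambda$ can be calculated from the constraint equation (9):
$$\lambda = \frac{Q \cdot \left( \sum_{i=1}^{N} \sqrt{k(i)} \right)^2}{P^2}, \tag{13}$$
$$q(i) = L \cdot P \cdot \frac{\sqrt{k(i)}}{\sum_{j=1}^{N} \sqrt{k(j)}}. \tag{14}$$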
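The step from the Lagrangian to the closed-form mapping can be spelled out as follows. This is a sketch under two assumptions that the text leaves implicit: q(i) is treated as a continuous variable, and the cost constraint is met with equality.

```latex
\begin{align*}
\frac{\partial J_i}{\partial q(i)}
  &= -\frac{Q\,k(i)\,L}{q(i)^2} + \frac{\lambda}{L} = 0
  &&\Rightarrow\quad q(i) = L\sqrt{\frac{Q\,k(i)}{\lambda}},\\
\sum_{i=1}^{N} p_{q(i)}
  &= \sum_{i=1}^{N} \frac{q(i)}{L}
   = \sqrt{\frac{Q}{\lambda}}\,\sum_{i=1}^{N}\sqrt{k(i)} = P
  &&\Rightarrow\quad \lambda = \frac{Q\left(\sum_{i=1}^{N}\sqrt{k(i)}\right)^{2}}{P^{2}},\\
q(i)
  &= L\sqrt{\frac{Q\,k(i)}{\lambda}}
   = L \cdot P \cdot \frac{\sqrt{k(i)}}{\sum_{j=1}^{N}\sqrt{k(j)}}.
\end{align*}
```

Under these assumptions each packet receives a share of the budget proportional to the square root of its RPI category, so that packets with a larger loss impact are sent at a higher service level.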
In practice, DS levels take discrete values, in which case the mapping above may not give the optimal assignment. One possible mapping method is to plug the RPI category and the cost of each candidate level into (12) and to select, for each packet, the level with the minimal J.
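The discrete mapping can be sketched as a per-packet minimization of J for a fixed λ, combined with a search over λ until the cost constraint is met. The bisection on λ is an assumption of this sketch (the text does not say how λ is adjusted), the normalization factor Q is set to 1, and the loss rates, prices and function names are illustrative.

```python
from typing import List


def assign_levels(k: List[float], loss: List[float], price: List[float],
                  lam: float) -> List[int]:
    """For a fixed lambda, pick per packet the level index with minimal
    J = k * L_q + lambda * p_q (normalization factor Q taken as 1)."""
    levels = range(len(loss))
    return [min(levels, key=lambda q: ki * loss[q] + lam * price[q]) for ki in k]


def map_under_budget(k: List[float], loss: List[float], price: List[float],
                     budget: float, iters: int = 60) -> List[int]:
    """Bisection on lambda; the returned assignment always fits the budget."""
    def cost(assignment: List[int]) -> float:
        return sum(price[q] for q in assignment)

    if len(k) * min(price) > budget:
        raise ValueError("budget cannot cover even the cheapest level")
    lo, hi = 0.0, 1.0
    while cost(assign_levels(k, loss, price, hi)) > budget:
        hi *= 2.0  # a larger lambda penalizes price more strongly
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if cost(assign_levels(k, loss, price, mid)) > budget:
            lo = mid
        else:
            hi = mid
    return assign_levels(k, loss, price, hi)


# Illustrative setup: loss rate inversely proportional to the level,
# price proportional to the level, three packets with different RPI categories.
loss_rates = [0.30, 0.15, 0.10, 0.075]
prices = [1.0, 2.0, 3.0, 4.0]
rpi = [1.0, 5.0, 20.0]
mapping = map_under_budget(rpi, loss_rates, prices, budget=7.0)
total_price = sum(prices[q] for q in mapping)
```

Because the search only lowers λ while the assignment remains affordable, the result never exceeds the budget, and packets with a higher RPI category end up at the same or a more reliable level than packets with a lower one.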
5. EXPERIMENTAL RESULTS
The proposed corruption model was first verified with simulations. Then, QoS mapping was performed based on the RPI calculated with the proposed corruption model. Simulations were carried out with the QCIF Foreman sequence, encoded by an H.263+ SNR scalable encoder at 15 fps, with 64 kbps for the base layer and 192 kbps for the enhancement layer. To increase robustness, a synchronization code was inserted at the beginning of every GOB, leading to GOB-based packets. To evaluate the impact of packet loss for enhancement layer packets, the error energy was estimated for each MB by using the proposed corruption model for the layered bitstream. Then, the error energies of all MBs that belong to a packet were summed and averaged. Also, to simplify the acquisition of the error PSD parameters for each MB, the decaying factor $\gamma$ was calculated from the initial error energy and the resulting error energy after the prediction loop filter was applied, which reduces the computational complexity. The bitstream was decoded with the corresponding SNR scalable decoder, where error concealment for lost GOBs was performed with upward prediction from the same spatial position in the base layer. In all simulations, it was assumed that the base layer was transmitted without loss, while enhancement layer packets were transmitted with different reliability in terms of the packet loss rate.
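The shortcut for the decaying factor can be read as inverting the one-step transition of the model with t = 1, i.e. resulting energy = initial energy / (1 + γ). The sketch below makes this reading explicit; it is an interpretation of the description above rather than the authors' exact procedure, and the function name and values are illustrative.

```python
def estimate_gamma(initial_energy: float, filtered_energy: float) -> float:
    """Solve filtered = initial / (1 + gamma) for gamma (one prediction step)."""
    if initial_energy <= 0 or filtered_energy <= 0:
        raise ValueError("error energies must be positive")
    return initial_energy / filtered_energy - 1.0


# Illustrative values only: the filter removes 20% of the error energy.
gamma = estimate_gamma(100.0, 80.0)
```

This avoids estimating the filter strength and the PSD shape parameter separately, since only their ratio enters the model.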
5.1. Verification of Proposed Corruption Model
The single packet loss model for the independent packet RPI was verified by comparing the measured distortion due to a single packet loss with the distortion estimated by the proposed corruption model. Two different simulations were performed. The first examined how well the corruption model represents the prediction filtering effect, while the second examined how well the estimated end-to-end quality degradation matches the actual value. For the first simulation, the effect of a single GOB loss was estimated with the proposed corruption model. An enhancement layer packet was intentionally discarded and the sequence was reconstructed from the corrupted bitstream. Then, the initial and propagation errors estimated with the corruption model were compared to the actual errors calculated from the reconstructed sequence. The simulation program was modified to store the MSE of the propagation error for each frame. Fig. 7 shows the simulation results for two different lost packets, which exhibit different error propagation characteristics. The graphs show that the estimates of the proposed model closely follow the measured errors. Note that the propagated error in the enhancement layer decays very rapidly over time even though no intra MB refresh is applied. This result is consistent with Eq. (2).
[Figure 7: panels (a) GOB number = 4 and (b) GOB number = 5.]
Figure 7. A packet loss effect estimation with the proposed corruption model.
The second verification simulation estimated the packet loss impact in terms of end-to-end quality degradation. The initial error energy and the propagation error energy were estimated with the corruption model and then compared with actual values. The actual values were obtained as follows. To calculate the loss effect of each packet, one packet was discarded from the bitstream, and the bitstream was then decoded. Finally, the difference between the sequence reconstructed without loss and the decoded sequence was computed. For this simulation, the estimation window was set to 20 frames. Fig. 8 shows the correlation between the actual values and those estimated with the corruption model. This simulation verifies that the proposed corruption model can represent the actual packet loss effect for enhancement layer packets.
Figure 8. Correlation between the model-based estimation and the actual values (axes: estimated MSE vs. actual MSE).
Figure 9. The histogram of MSE estimated with the corruption model (axes: MSE vs. number of packets).
5.2. RPI-based Coordination of Network Adaptation
The RPI-based coordination of network adaptation was evaluated under the proposed video delivery framework given in Fig. 1. The test sequence and encoding parameters were the same as those given in the previous section. The RPI generation was performed only for the enhancement layer bitstream. Fig. 9 shows the distribution of the estimated MSE. The categorization method can be varied according to the distribution of the MSE and the network condition. The network delivered prioritized packets with different reliability properties corresponding to the service
level, which was rated according to the packet loss rate. For coordination, the Lagrangian-based mapping method was implemented to satisfy the given cost constraint. The base layer packets were transmitted without loss through an error-free channel. With the proposed corruption model, the RPI was calculated for all 1710 GOB packets of the enhancement layer of the Foreman sequence. The RPI was then categorized into 20 linearly proportional levels. The packet loss rate was set to be inversely proportional to the service level (6 levels in total), with values ranging from 1.5% to 30%. The unit price was proportional to the service level, so the resulting packet loss rate was the reciprocal of the unit price. The coordinated mapping from the RPI category to the service level was then performed by minimizing the Lagrangian formula of Eq. (14). The optimization was performed over the discrete DS levels. To simulate the error pattern of the underlying network, the Gilbert model with a transition parameter of 0.9 was used [19]. The loss rate for each level was kept constant throughout the evaluation to allow a fair comparison.
The resulting PSNR was compared since it provides a measure of end-to-end visual quality. Fig. 10 shows several PSNR curves for the case of 20% average packet loss, which illustrate the effect of RPI differentiation. As expected, the UEP coordination based on the proposed RPI improves performance over the scheme that does not differentiate packets (equal error protection, EEP). To better illustrate the gain of the proposed scheme, the average PSNR performance was compared while varying the cost constraint. For each cost constraint, the simulation was performed 50 times and the average PSNRs were compared. Even though the average packet loss rate of the UEP case is lower than that of the EEP case due to the cost function, a higher PSNR can be achieved with UEP. The resulting average PSNR curves for different cost constraints are shown in Fig. 11.
The advantage of the RPI assignment is evident.
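The coordination step (categorize the RPI, then map each category to a service level under a cost constraint) can be sketched as follows. This is a hedged illustration, not the authors' implementation: Eq. (14) is not reproduced in this section, so the per-category cost is assumed to be (category importance) x (loss rate) + lambda x (unit price). The price table, the bisection search on lambda, and all function names are hypothetical.

```python
from collections import Counter
from typing import List

# Hypothetical unit prices for 6 service levels; loss rate is the reciprocal
# of the price up to a scale factor, spanning 30% down to 1.5%.
UNIT_PRICE = [1.0, 2.0, 4.0, 6.0, 10.0, 20.0]
LOSS_RATE = [0.30 / p for p in UNIT_PRICE]


def categorize(rpi: List[float], n_cat: int = 20) -> List[int]:
    """Assign each RPI value to one of n_cat uniform-width categories."""
    lo, hi = min(rpi), max(rpi)
    width = (hi - lo) / n_cat
    if width == 0:
        return [0] * len(rpi)
    return [min(int((v - lo) / width), n_cat - 1) for v in rpi]


def mapping_for(lam: float, n_cat: int = 20) -> List[int]:
    """Per category, pick the level minimizing importance*loss + lam*price."""
    levels = range(len(UNIT_PRICE))
    return [min(levels, key=lambda q, w=k + 1: w * LOSS_RATE[q] + lam * UNIT_PRICE[q])
            for k in range(n_cat)]


def total_cost(counts: Counter, mapping: List[int]) -> float:
    """Total price of sending every packet at its category's service level."""
    return sum(n * UNIT_PRICE[mapping[k]] for k, n in counts.items())


def coordinate(rpi: List[float], budget: float,
               n_cat: int = 20, iters: int = 60) -> List[int]:
    """Bisection on lambda: keep the smallest lambda found whose mapping fits the budget."""
    counts = Counter(categorize(rpi, n_cat))
    if budget < len(rpi) * UNIT_PRICE[0]:
        raise ValueError("budget below the cost of the cheapest service level")
    best = mapping_for(0.0, n_cat)
    if total_cost(counts, best) <= budget:
        return best
    # At lam = n_cat every category already selects the cheapest level.
    lo, hi = 0.0, float(n_cat)
    best = mapping_for(hi, n_cat)
    for _ in range(iters):
        mid = (lo + hi) / 2
        cand = mapping_for(mid, n_cat)
        if total_cost(counts, cand) <= budget:
            best, hi = cand, mid
        else:
            lo = mid
    return best


# Toy RPI values (assumed): 200 packets with importance 0..49.
rpi = [float(i % 50) for i in range(200)]
mapping = coordinate(rpi, budget=800.0)
```

With a budget equal to the cheapest possible cost, every category falls back to the lowest service level. As the budget is relaxed, categories with a higher index are never assigned a less reliable level than categories with a lower index.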
6. CONCLUSION AND FUTURE WORK
We proposed a corruption model that takes into account the error propagation behavior of the layered video codec. The proposed video delivery system takes advantage of the layered video coding technique and the differentiated network service to enhance the end-to-end quality. The resulting estimate approximates the real loss impact within a narrow margin while requiring only a small computational overhead. When applied to the RPI association, the RPI based on the proposed corruption model satisfies the requirement of the proposed coordinated packetized video delivery and provides a reasonable performance improvement. In the future, we would like to extend the proposed corruption model to other layered coding techniques and data partitioning techniques.
Figure 10. The PSNR comparison under 20% PLR in enhancement layer.
Figure 11. The performance comparison for schemes with and without RPI (axes: total cost (x1E3) vs. PSNR; curves: UEP and EEP).
REFERENCES
1. ITU-T Recommendation H.263 Version 2 (H.263+), Video coding for low bit rate communication, Jan. 1998.
2. W. Kumwilaisak, J. Kim and C.-C. J. Kuo, “Reliable Wireless Video Transmission via Fading Channel Estimation and Adaptation,” in Proc. WCNC 2000, Sept. 2000.
3. J. Shin, J. Kim and C.-C. J. Kuo, “Content-based packet video forwarding mechanism in differentiated service networks,” in Proc. Packet Video Workshop 2000, May 2000.
4. J.-G. Kim, J. Kim, J. Shin, and C.-C. J. Kuo, “Coordinated Packet-Level Protection with a Corruption Model for Robust Video Transmission,” in Visual Communication Image Processing 2001, January 2001.
5. J.-G. Kim, J. Kim and C.-C. J. Kuo, “On the corruption model of loss propagation for relative prioritized packet video,” in SPIE Proc. Applications of Digital Image Processing XXIII, July 2000.
6. N. Farber, K. Stuhlmuller, and B. Girod, “Analysis of error propagation in hybrid video coding with application to error resilience,” in Proc. IEEE ICIP ’99, Oct. 1999.
7. G. Reyes, A. R. Reibman, and S.-F. Chang, “A corruption model for motion compensated video subject to bit errors,” in Proc. Packet Video Workshop ’99, Apr. 1999.
8. R. Zhang, S. L. Regunathan, and K. Rose, “Video coding with optimal inter/intra mode switching for packet loss resilience,” IEEE J. Select. Areas Communications, vol. 18, no. 6, June 2000.
9. M. H. Willebeek-LeMair, Z.-Y. Shae, and Y.-C. Chang, “Robust H.263 video coding for transmission over the Internet,” in Proc. INFOCOM ’98, March 1998.
10. Y. Wang and Q.-F. Zhu, “Error control and concealment for video communication: A review,” Proceedings of the IEEE, vol. 86, no. 5, May 1998.
11. B. Girod, “Feedback-based error control for mobile video transmission,” Proceedings of the IEEE, vol. 87, no. 10, Oct. 1999.
12. IETF, “Differentiated Services Working Group,” http://www.ietf.org/html.charters/diffserv-charter.html.
13. J. Kim, W. Kumwilaisak, and C.-C. J. Kuo, “Cross-validation of proposed data partitioning annex for enhanced error resilience,” ITU-T Standardization Sector Q.15/SG16, Document Q15-G-23, Feb. 1999.
14. ITU-T, Video codec test model near-term version 10 (TMN10), Q.15/SG16, Document Q15-D-65, Apr. 1998.
15. C.-S. Kim, R.-C. Kim, and S.-U. Lee, “An error detection and recovery algorithm for compressed video signal using source level redundancy,” in IEEE Trans. on Image Processing, vol. 9, no. 2, Feb. 2000.
16. U. Horn, K. Stuhlmuller, M. Link, and B. Girod, “Robust Internet video transmission based on scalable coding and unequal error protection,” in Signal Processing: Image Communication, vol. 15, no. 1-2, Sept. 1999.
17. D. Wu, Y. T. Hou, and Y.-Q. Zhang, “Transporting real-time video over the Internet: Challenges and approaches,” in Proceedings of the IEEE, vol. 88, no. 12, Dec. 2000.
18. A. Ortega and K. Ramchandran, “Rate-distortion methods for image and video compression,” in IEEE Signal Processing Magazine, Nov. 1998.
19. L. Zhang, D. Chow and C. H. Ng, “Cell loss effect on QoS for MPEG video transmission in ATM networks,” in IEEE ICC ’99, 1999.