
Joint Source Coding and Packet Marking for Video Transmission over DiffServ Networks

C. E. Luna, Y. Eisenberg, R. Berry, T. N. Pappas, and A. K. Katsaggelos
Department of Electrical and Computer Engineering
Northwestern University, 2145 Sheridan Rd., Evanston, IL 60208 USA
Email: {carlos,yeisenbe,rberry,pappas,aggk}@ece.northwestern.edu

ABSTRACT

We consider transmission of video for real-time display over a network offering differentiated services. In such a network, the probability of loss for each video packet is determined by the service class assigned to it. Using information about the error concealment strategy at the decoder and the QoS status of the network, we jointly consider the selection of source coding parameters and packet classification. A minimum cost approach is presented in which the goal is to minimize the cost to transmit a video sequence, while achieving a desired level of video quality, as measured by the expected distortion of the received video sequence. An approach for minimizing the distortion subject to a cost constraint is also discussed. A solution approach based on Lagrangian relaxation and deterministic dynamic programming is presented. Experimental results demonstrate the advantage of simultaneously adapting the source coding and priority assignment.

1

INTRODUCTION

In the current Internet, packets are forwarded using a best-effort service provided by the Internet Protocol (IP). This results in a network service where packets may be lost, duplicated, or experience considerable delay. While this service works well for elastic applications such as web browsing, applications with strict delay and loss constraints, such as video streaming, can suffer significantly. In video streaming applications, the receiver begins displaying the received video sequence before the complete sequence has been downloaded. For this reason, strict delay constraints must be met if the video quality is to be preserved. Packets that arrive past their scheduled presentation time are treated as lost. Strict delay constraints also limit the use of retransmissions.

For these reasons, considerable effort has been devoted to the reliable transmission of video over unreliable networks. The approaches to this task range from rate control and mode selection to forward error correction [1, 2, 3, 4, 5]. Also of interest is the development of adaptive applications that can respond to variations in network conditions over time [6, 7, 8].

Another approach to reliable transmission of video that has been studied in the networking community is to add quality of service (QoS) support to IP. One architecture that has been proposed to achieve this is DiffServ [9]. The goal of DiffServ is to introduce mechanisms and protocols that try to guarantee certain levels of service. In this type of environment, the users may then be charged for sending traffic over the network according to the level of service they request.

In this paper, we consider video transmission over a network offering a type of DiffServ. In this situation, the sender needs to choose the encoding parameters for each video packet as well as the proper service class for each packet. Each service class has a different cost associated with it. For example, the cost may be in dollars per packet or euros per byte. One goal may be to select the class of service for each packet resulting in the minimum total cost while meeting constraints on the video quality and delay. An alternative aim may be to minimize the distortion subject to a cost constraint. To accomplish these goals, we consider jointly the selection of encoding parameters and service class. We also take into account the effects of the error concealment mechanism at the decoder. In this situation, if a packet is easily concealed, then we may be able to transmit the packet using a lower class of service, or not transmit it at all. We refer to the latter option as the generalized skip mode [5].

Regarding related work, in [10] a source-driven packet marking approach was presented. The goal was to mark packets containing speech based on their perceptual importance. Packets that were perceptually critical were marked for premium service and sent over a virtual wire. Other packets were marked as best-effort. However, in this approach the selection of coding parameters was not considered. A similar approach was taken in [11] for video transmission.
Here, perceptually important macroblocks were grouped into premium packets. In [12], cost-distortion optimized streaming of media over DiffServ networks was considered. The goal was to select the appropriate service class for each video packet to achieve an optimal cost-distortion trade-off. In that work, the authors considered the delivery of pre-encoded video and thus did not consider the selection of encoding parameters. In addition, the effect of the error concealment at the receiver was not considered.

Figure 1: System Block Diagram

The rest of this paper is organized as follows: In Section 2, we present the model of differentiated services used in this paper. Section 3 contains the formulation in detail. In Section 4 we introduce the minimum cost approach. Section 5 presents the proposed solution based on Lagrangian relaxation and deterministic dynamic programming. In Section 6 we discuss the minimum distortion approach. Section 7 presents experimental results illustrating the advantage of the proposed approach. Section 8 contains extensions and concluding remarks.

2

DIFFERENTIATED SERVICES

The DiffServ approach to providing QoS in the Internet relies on the use of several service classes. Each of these classes assigns a different priority to packets. When the packets are received by a router, they are queued according to their class. Packets that are assigned to high priority classes receive better QoS than packets that are assigned to low priority classes. The goal is to achieve (statistically) different levels of end-to-end QoS for each of the service classes.

As an incentive to achieve efficient network resource utilization, the network may use pricing. In this setting, there is a cost associated with each service class. By adjusting the prices for each service class, the network can influence which class a user selects. Typically, transmitting a packet in a high priority service class results in a larger cost than transmitting the packet in a lower priority class. The sender has to classify each packet according to its importance in order to better utilize the available network resources. For example, in [13], a packet marking algorithm was presented to maintain end-to-end throughput in a DiffServ network. The authors in [10, 11] considered a source-driven classification approach where they try to limit the effect of distortion at the receiver.

In this paper, we consider jointly the selection of source coding parameters and packet classification. Our goal is then to either (1) minimize the cost required to transmit a video sequence while meeting constraints on the delay and on the received video quality, or (2) minimize the distortion subject to a cost budget.

Let the set of available service classes be denoted by $\mathcal{K}$. For a service class $k \in \mathcal{K}$, there is an associated cost per packet, $c_k$, and a packet loss probability, $\epsilon_k$. Thus, a packet sent on class $k$ has probability of loss $\epsilon_k$ and the user will have to pay $c_k$ for transmitting the packet. Note that $c_k$ increases with decreasing $\epsilon_k$. We consider the case where the individual user's traffic is small compared to the overall traffic in the network. Therefore, we assume that the effect of an individual user's traffic on the probability of loss associated with each service class is negligible. We further assume that the price and probability of loss for each class vary slowly and are fixed over the time it takes to transmit a video frame. In [12], a similar model for DiffServ is used in which each service class has a fixed cost and probability of loss associated with it.

In our work, we assume that the probability of loss for each class is known at the encoder. This knowledge can be gained through network feedback, such as via RTP [14]. The probability of loss for each class may also be explicitly conveyed to the user by the service provider.

3

FORMULATION

Figure 1 shows a block diagram of the system under consideration. Video frames are fed into the video encoder, which generates a stream of video packets. The packet classifier assigns a priority class to each video packet. The video packets are transmitted through a network offering differentiated services. The video packets are received and buffered at the decoder buffer. The video decoder reads video packets from the decoder buffer and displays the resulting video frames in real time. By real-time display we mean that the video is displayed continuously without interruption at the decoder. We seek to design a controller that will determine the source coding parameters and priorities for each video packet. These decisions take into account the decoder concealment strategy and the QoS information associated with each service class.
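The controller's per-packet decision can be illustrated with a minimal sketch. It rests on assumptions that are not the paper's method: each packet is treated independently (the paper uses dynamic programming to handle dependencies), the delay constraint is ignored, and the expected distortion of a packet is modelled as (1 - p) times the distortion if received plus p times the distortion after concealment. The names `ServiceClass`, `expected_distortion`, and `choose`, and all numbers, are hypothetical. For a fixed Lagrange multiplier `lam`, the sketch picks the (quantizer, class) pair that minimizes cost + `lam` x expected distortion, with the option of not transmitting the packet at all.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ServiceClass:
    name: str         # hypothetical label
    cost: float       # cost per packet, e.g. in microcents
    loss_prob: float  # packet loss probability, assumed known at the encoder


def expected_distortion(d_received: float, d_concealed: float, loss_prob: float) -> float:
    """Assumed model: decoded distortion if the packet arrives, concealment distortion if lost."""
    return (1.0 - loss_prob) * d_received + loss_prob * d_concealed


def choose(quantizers: Dict[str, float], classes: List[ServiceClass],
           d_concealed: float, lam: float) -> Tuple[Optional[str], Optional[str], float, float]:
    """Return (quantizer, class name, cost, expected distortion) minimizing cost + lam * E[D].

    quantizers maps a quantizer label to the distortion obtained if the packet is received.
    A result starting with (None, None) means the packet is not transmitted.
    """
    # Not transmitting: zero cost, the packet is always concealed.
    best = (None, None, 0.0, d_concealed)
    best_j = lam * d_concealed
    for q, d_received in quantizers.items():
        for sc in classes:
            e_d = expected_distortion(d_received, d_concealed, sc.loss_prob)
            j = sc.cost + lam * e_d
            if j < best_j:
                best, best_j = (q, sc.name, sc.cost, e_d), j
    return best


# Illustrative numbers only.
classes = [ServiceClass("low", 50.0, 0.2), ServiceClass("high", 200.0, 0.01)]
quantizers = {"coarse": 40.0, "fine": 10.0}
for lam in (0.0, 1.0, 10.0):
    print(lam, choose(quantizers, classes, d_concealed=100.0, lam=lam))
```

With these numbers, increasing `lam` (more weight on distortion) moves the choice from not transmitting the packet, to the cheaper class, to the more expensive class.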

3.1

Expected Distortion

Consider a system where video is encoded using a block-based motion-compensated video-coding technique such as MPEG-4 or H.263. In such a system, each video frame is divided into 8 slices or video packets. Each video packet is made up of a sequence of consecutive macroblocks (MBs) and can be processed independently by the encoder and decoder. For each video packet, we must select source coding parameters, such as the coding mode (i.e., inter/intra/skip) and the quantization parameter per MB. The available choices of source coding parameters are represented by a finite set of quantizers $\mathcal{Q}$. Note that the term quantizer is used in a generic sense to include different coding modes (inter, intra, skip) or quantization parameters for each MB in the packet. Choosing quantizer [...]

[...] 8000 bits, which corresponds to a transmission delay of 40 ms at 200 kb/s. Consider the reference system with service class [...], i.e., packets are transmitted with loss probability 0.2 and the cost per packet is fixed at 50 microcents. The resulting total cost for transmitting a video frame, [...] microcents, is used as the cost constraint in Eq. (5). Figure 5 shows the PSNR plot per frame obtained with the reference system and the proposed system. The proposed system yields an advantage of 0.5 dB PSNR on average. Clearly, the advantage of the proposed system depends on the parameters of the available service classes.

Figure 5 (legend: Reference System, Minimum Distortion Approach)
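The bit budget and delay quoted in the experimental setup are related through the channel rate, and the reference cost is the per-packet price times the number of packets in a frame. The short sketch below checks that arithmetic. The function names are hypothetical, and the packet count passed to `reference_frame_cost` is a placeholder chosen for illustration, not a value taken from the paper.

```python
def budget_bits(delay_ms: int, rate_kbps: int) -> int:
    """Bits that can be sent in delay_ms milliseconds at rate_kbps kb/s (ms * kb/s = bits)."""
    return delay_ms * rate_kbps


def reference_frame_cost(num_packets: int, cost_per_packet: int) -> int:
    """Cost of one frame when every packet is sent on the same service class."""
    return num_packets * cost_per_packet


# 40 ms at 200 kb/s, as stated in the text.
print(budget_bits(40, 200))

# 50 microcents per packet as in the text; the count of 9 packets is an assumption.
print(reference_frame_cost(9, 50))
```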

The minimum cost problem can also be rewritten as,

[...]