IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 4, NO. 6, NOVEMBER 2005


Transmission of Adaptive MPEG Video Over Time-Varying Wireless Channels: Modeling and Performance Evaluation Laura Galluccio, Giacomo Morabito, Member, IEEE, and Giovanni Schembra

Abstract—Wireless channels are characterized by high time-varying bit-error rates (BERs). To cope with this problem, several adaptive forward-error-correction (AFEC) schemes have been proposed in the literature. They work locally at the wireless link, adding a variable amount of redundancy to the transmitted data in order to maintain the packet error rate below an acceptable level. However, when such schemes are utilized, the bandwidth offered to the applications changes when channel conditions change. In this paper, the effects of these bandwidth variations are investigated in the case of real-time Motion Picture Experts Group (MPEG) video transmission. The MPEG encoder is controlled in order to adapt its emission rate to the current bandwidth offered by the wireless link. To this end, the encoding quality is diminished by the source rate controller when the transmission rate has to be decreased due to an increase in the channel BER, whereas it is improved when the transmission rate can be increased due to a decrease in the channel BER. A Markov-based model, denoted as SBBP/SBBP/1/K, has been introduced to model the scenario being considered. The analytical framework allows evaluation of the performance of the system and can be used to optimize the design of a video transmission system for wireless channels, providing the instruments to derive the tradeoff between information corruption in the wireless channel and MPEG video encoding quality.

Index Terms—Forward error correction (FEC), Motion Picture Experts Group (MPEG), quality of service (QoS), switched batch Bernoulli process (SBBP), wireless channels.

I. INTRODUCTION

Manuscript received August 15, 2003; revised September 3, 2004; accepted September 13, 2004. The editor coordinating the review of this paper and approving it for publication is V. K. Bhargava. The work of L. Galluccio and G. Morabito was supported by Ministero dell’Istruzione, dell’Università e della Ricerca (MIUR) under contract VICOM. The work of G. Schembra was supported by MIUR under contract TANGO. The authors are with the Dipartimento di Ingegneria Informatica e delle Telecomunicazioni (DIIT), University of Catania, 95124 Catania, Italy (e-mail: [email protected]; [email protected]; schembra@diit.unict.it). Digital Object Identifier 10.1109/TWC.2005.858028

THE NEED for supporting multimedia applications in dynamic environments where users are equipped with wireless terminals is one of the most challenging research topics today. In fact, it is known that wireless channels are characterized by bit-error rates (BERs) that are several orders of magnitude higher than the corresponding values for terrestrial networks. Accordingly, data packets may arrive at their destination corrupted, thus becoming useless. To overcome this problem, one of the solutions most widely adopted today is using forward error correction (FEC). FEC algorithms introduce a chosen amount of redundancy: the higher the BER, the higher the amount of redundancy introduced. However, in wireless channels, the BER is characterized by high time variability: there are periods when channel conditions are good, that is, the BER is low, and periods when channel conditions are bad, that is, the BER is high. In order to maintain a high level of resource efficiency while guaranteeing the information accuracy required by applications, several adaptive FEC (AFEC) schemes have been introduced in the recent past [1], [2], [6], [7]. According to these schemes, the amount of redundancy at any time depends on the channel conditions, being low if channel conditions are good and high if channel conditions are bad. One consequence is that AFEC schemes cause variations in the bandwidth offered to user applications, which therefore have to adapt their output rate accordingly. This paper focuses on video applications that are destined to become very common in wireless-communication scenarios. More specifically, the target of the paper is the definition of an analytical framework for the design of a real-time Motion Picture Experts Group (MPEG) video transmission system over a wireless link that applies AFEC to keep the packet corruption probability acceptable, i.e., below a given threshold. The MPEG encoder uses a rate controller that adapts the output rate by appropriately setting the quantizer scale parameter (QSP) [8], [12], [29] to follow the bandwidth variations, while maximizing encoding quality and stability. In order to achieve this target, the rate controller monitors the activity of the frame that is being encoded, its encoding mode, and the number of bytes used to encode the previous frames. Then, it chooses the appropriate QSP in such a way that the transmission buffer at the sender site never saturates, even during periods with low available bandwidth. The whole system can be modeled by an emission process that feeds the transmission buffer.
The server of this buffer behaves according to the channel conditions estimated by the adaptive error controller: the serving rate is higher when channel conditions are good and lower when channel conditions are bad. Switched batch Bernoulli processes (SBBPs) are used to model both the MPEG source [4], [15], [17] and the server process of the transmission buffer, which coincides with the time-varying bandwidth available in the wireless channel [20], [24]–[28]. Accordingly, an SBBP/SBBP/1/K model is introduced to describe the whole system.

The analytical framework proposed in the paper is used to evaluate the performance in terms of the distortion introduced by the quantization mechanism in the encoding process, and of the loss and mean delay in the transmission buffer, at different target packet error probabilities (PEPs) achieved using AFEC. Results obtained in the paper can be used to obtain the best tradeoff between encoding quality, which requires a high available bandwidth, and information correctness at the destination, which requires a high level of redundancy, thus causing bandwidth reduction.

The rest of the paper is organized as follows. Section II describes the wireless MPEG transmission system considered in this paper. Section III proposes an analytical framework of the whole video transmission system, accounting for both the video source and the transmission channel. Section IV provides a derivation of the performance parameters. Section V applies the analytical framework to a case study in order to demonstrate the model’s capability of providing performance insights for the system design. Finally, Section VI concludes the paper.

1536-1276/$20.00 © 2005 IEEE

II. DESCRIPTION OF THE SYSTEM

The architecture of the video transmission system in the mobile terminal considered in this paper is shown in Fig. 1. The adaptive-rate source is an adaptive-rate MPEG video source over a User Datagram Protocol (UDP)/IP protocol suite. The video stream generated by the video source is encoded by the MPEG encoder according to the MPEG video standard [30], [31].

Fig. 1. Mobile terminal system architecture.

In the MPEG encoding standard, the frame, which corresponds to a single picture in a video sequence, is the basic displaying unit. Three encoding modes are available for each frame: intraframes (I), predictive frames (P), and interpolative frames (B). The basic idea behind MPEG video compression is to remove spatial redundancy within a video frame and temporal redundancy between successive video frames. The encoder output is a deterministic periodic sequence in which the period is a group of pictures (GoP) composed of three types of encoded frames.
1) I frames are coded using only information present in the picture itself in order to provide potential random access points in the compressed video sequence. The coding is based on the discrete-cosine transform according to the joint photographic experts group (JPEG) coding technique.
2) P frames are coded using a coding algorithm similar to the one used for I frames, but with the addition of motion compensation with respect to the previous I or P frame (forward prediction).
3) B frames are coded with motion compensation with respect to the previous I or P frame, and the next I or P frame, or an interpolation between them (bidirectional prediction).
Typically, I frames require more bits than P frames, while B frames have the lowest bandwidth requirement. In encoding each frame, it is possible to tune the number of bits needed to represent the frame and, thus, its quality, by appropriately choosing the so-called QSP. Its value can range within the set [1, 31]: 1 being the value giving the best encoding quality but requiring the maximum number of bits to encode the frame, and 31 being the value giving the worst encoding quality, but requiring the minimum number of bits. The QSP can be dynamically changed according to the feedback law implemented by the rate controller in order to achieve a given target.

The MPEG encoder emits one frame every ∆ seconds, and its output is packetized in the packetizer according to the UDP/IP protocol suite: the packetizer fragments the information flow into blocks of UP bytes;¹ these blocks constitute the payloads for the UDP, which adds a header of 8 bytes; each UDP packet is then put in the payload field of an IP packet. The IP packets are then sent to a transmission buffer whose service rate is time varying and depends on the channel condition estimated by the adaptive error controller, as will be explained below.

The main target of the rate controller is to avoid buffer saturation, which causes losses and long delays, while maximizing the encoding quality and stability. To this end, it chooses the QSP according to a feedback law monitoring the activity of the frame being encoded, its encoding mode (I, P, or B), and the current number of packets in the transmission buffer. The model introduced in the paper is general enough to be applied whatever the feedback law. The feedback law used in the paper was introduced in [4] and [17] and, for the sake of completeness, will be reported in Section V-A. It has been defined in such a way that a controlled number of packets are present in the transmission buffer at the end of each GoP, while pursuing a constant distortion level within the GoP.

Packets leaving the transmission buffer enter the adaptive error controller. Its main target is to use FEC to partially solve the problem of wireless-link unreliability. The FEC block creator divides packets into sets of k blocks. These blocks are given as input to the AFEC encoder and encoded in sets of m blocks, with m ≥ k.
If any set of k or more blocks belonging to the same packet is received correctly, then the original packet can be reconstructed properly. Obviously, the larger the value of m, the higher the probability that the information can be reconstructed at the receiver station, but the lower the wireless-link bandwidth available at the video source. The value of m is chosen by the FEC controller in such a way that the PEP, i.e., the probability that a packet cannot be reconstructed at the receiver station, is no higher than a target value $\hat{P}^{(C)}_{\mathrm{PEP}}$. Given that wireless channel conditions change dynamically, AFEC encoding is applied, as proposed in [1], [2], [6], and [7]. This encoding technique requires knowledge of the current BER on the link. This estimation is performed by the wireless channel estimator. The estimated BER value is given as input to the FEC controller, which evaluates m so that the requirement on the PEP is satisfied. The value of m therefore changes in time and, as a consequence, the available link capacity $\tilde{c}(t)$ also changes in time as

$$\tilde{c}(t) = \frac{k}{m(t)} \cdot c \tag{1}$$

1 If Real-Time Protocol (RTP)/Real-Time Control Protocol (RTCP) protocols are also used over the UDP/IP protocol suite, the related overhead should be considered.

where c is the capacity (in packets/s) when FEC is not used. At any time, the service rate of the transmission buffer is set equal to $\tilde{c}(t)$. Accordingly, both the MPEG encoder output process and the transmission-buffer service process are stochastic processes, the first depending on the behavior of the source and the rate controller, and the second on the BER behavior of the wireless channel. These processes will be modeled with two discrete-time SBBP processes $\tilde{Y}(n)$ and $\tilde{N}(n)$, respectively, as described in detail in Section III.

III. SYSTEM MODEL

In this section, we derive a discrete-time analytical model of the system described in the previous section. We will set the slot duration ∆ equal to the video-frame interval. As a first step, Sections III-B and III-C will describe the models of the noncontrolled MPEG encoder output and the available capacity of the channel as SBBPs [9]. Then, the whole system will be modeled as an SBBP/SBBP/1/K queueing system in Section III-D, where K is the maximum number of packets the transmission buffer can contain. For the sake of completeness, Section III-A provides a brief outline of SBBP processes.

A. Switched Batch Bernoulli Processes (SBBPs)

An SBBP $Y(n)$ is a discrete-time emission process modulated by an underlying Markov chain [9], and represents a special case of the family of the hidden Markov model processes [19]. Each state of the Markov chain is characterized by an emission probability density function (pdf): the SBBP emits data units according to the pdf of the current state of the underlying Markov chain. Therefore, the SBBP $Y(n)$ is fully described by the state space $\Im^{(Y)}$ of the underlying Markov chain, the maximum number of data units the SBBP can emit in one slot $r^{(Y)}_{\mathrm{MAX}}$, and the matrix set $(Q^{(Y)}, B^{(Y)})$, where $Q^{(Y)}$ is the transition probability matrix of the underlying Markov chain, while $B^{(Y)}$ is the emission probability matrix whose rows contain the emission pdfs for each state of the underlying Markov chain. If we indicate the state of the underlying Markov chain in the generic slot n as $S^{(Y)}(n)$, the generic elements of the matrices $Q^{(Y)}$ and $B^{(Y)}$ are defined as follows:

$$Q^{(Y)}_{[s'_Y, s''_Y]} = \mathrm{Prob}\left\{ S^{(Y)}(n+1) = s''_Y \mid S^{(Y)}(n) = s'_Y \right\} \quad \forall s'_Y, s''_Y \in \Im^{(Y)} \tag{2}$$

$$B^{(Y)}_{[s_Y, r]} = \mathrm{Prob}\left\{ Y(n) = r \mid S^{(Y)}(n) = s_Y \right\} \quad \forall s_Y \in \Im^{(Y)},\ \forall r \in \left[0, r^{(Y)}_{\mathrm{MAX}}\right]. \tag{3}$$
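To make the definitions in (2) and (3) concrete, the following minimal Python sketch simulates an SBBP from its two matrices. It is only an illustration under stated assumptions: the function name `simulate_sbbp`, the nested-list representation of Q and B, and the toy two-state matrices are hypothetical and are not taken from the paper.

```python
import random

def simulate_sbbp(Q, B, n_slots, s0=0, seed=0):
    """Simulate an SBBP for n_slots slots.

    Q[s][t]: probability of moving from state s to state t (eq. (2)).
    B[s][r]: probability of emitting r data units in state s (eq. (3)).
    Returns the list of emitted data units, one value per slot.
    """
    rng = random.Random(seed)
    state = s0
    emissions = []
    for _ in range(n_slots):
        # Emit according to the pdf of the current state.
        r = rng.choices(range(len(B[state])), weights=B[state])[0]
        emissions.append(r)
        # Move the underlying Markov chain to its next state.
        state = rng.choices(range(len(Q[state])), weights=Q[state])[0]
    return emissions

# Toy example (hypothetical values): a "quiet" state and a "busy" state.
Q_toy = [[0.9, 0.1],
         [0.3, 0.7]]
B_toy = [[0.8, 0.2, 0.0],   # quiet state: mostly 0 units per slot
         [0.1, 0.3, 0.6]]   # busy state: mostly 2 units per slot
trace = simulate_sbbp(Q_toy, B_toy, n_slots=20)
```

Here the number of columns of B plays the role of $r^{(Y)}_{\mathrm{MAX}} + 1$, since emissions range over 0, 1, ..., $r^{(Y)}_{\mathrm{MAX}}$.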

We will introduce an extension to the meaning of the SBBP to model not only a source emission process, but also a video-sequence activity process, and an available wireless-channel-capacity process. In the latter cases, we will indicate them as an activity SBBP and a transmission-channel SBBP, respectively, and their matrices $B^{(Y)}$ as the activity probability matrix and the channel-transmission probability matrix, respectively.

B. Noncontrolled MPEG Source Model

The noncontrolled MPEG video source is part of the adaptive-rate source shown in Fig. 1 comprising the video source, the MPEG encoder, and the packetizer. We denote it as noncontrolled because we are assuming it works with a constant QSP q not controlled by the rate controller. The first step in modeling the whole video transmission system shown in Fig. 1 is the derivation of the SBBP process $\tilde{Y}_q(n)$, modeling the emission of the noncontrolled MPEG video source at the packetizer output for each QSP q. This model was calculated by the authors in [4] and [17]. Here, for the sake of brevity, we will refer to those works in order to define the notation. The model captures two different components: the activity-process behavior and the activity/emission relationships. As input, it takes the first- and second-order statistics of the activity process, and the three functions, one for each encoding mode (I, P, or B), characterizing the activity/emission relationships. The state of the underlying Markov process of $\tilde{Y}_q(n)$ is a double variable, $S^{(\tilde{Y})}(n) = (S^{(G)}(n), S^{(F)}(n))$, where $S^{(G)}(n) \in \Im^{(G)}$ is the state of the underlying Markov chain of the activity process $G(n)$, and $S^{(F)}(n) \in J$ is the frame to be encoded in the GoP at the slot n. The state set $\Im^{(G)}$ represents the set of activity levels to be captured. For example, according to [5], we have $\Im^{(G)}$ = {Very Low, Low, High, Very High}. Set J, on the other hand, represents the set of frames in the GoP and depends on the GoP structure. For example, if the movie is encoded with the GoP structure IBBPBB, set J is defined as J = {I, B, B, P, B, B}. As demonstrated in [4] and [17], the underlying Markov chain of $\tilde{Y}_q(n)$ is independent of q. Therefore, we will indicate its transition probability matrix as $Q^{(\tilde{Y})}$ instead of $Q^{(\tilde{Y}_q)}$, and the set $(Q^{(\tilde{Y})}, B^{(\tilde{Y}_q)})$, for each q ∈ [1, 31], defines the SBBP emission process modeling the output flow of the noncontrolled MPEG encoder when it uses a constant QSP value q.

C. Service SBBP Model

The target of this section is to derive the SBBP model of the process $\tilde{N}(n)$, which represents the service process of the transmission buffer when AFEC is employed. As said so far, it closely depends on the amount of redundancy the AFEC encoder introduces to achieve the target maximum PEP $\hat{P}^{(C)}_{\mathrm{PEP}}$ due to the wireless channel. As usual (e.g., [14], [24], and [26]), we assume that the channel behavior can be described by means of an M-state Markov process. Accordingly, channel statistical behavior can be described by an M × M transition probability matrix $Q^{(C)}$ and by $\mathrm{BER}_i$, the BERs for each state i ∈ [1, M] of the process. Thus, the service SBBP model is represented by the following parameters:
1) the maximum number of packets that can be transmitted in a time slot $r^{(\tilde{N})}_{\mathrm{MAX}}$;
2) the state space $\Im^{(\tilde{N})}$;
3) the matrix set $(Q^{(\tilde{N})}, B^{(\tilde{N})})$ containing the transition probability matrix and the channel-emission probability matrix.
Obviously, the transition probability matrix $Q^{(\tilde{N})}$ of the underlying Markov chain of the process $\tilde{N}(n)$ coincides with the channel-transition probability matrix $Q^{(C)}$, as calculated in [26]. The state space $\Im^{(\tilde{N})}$ coincides with the channel state space, i.e., $\Im^{(\tilde{N})} = [1, M]$. Instead, in order to derive $B^{(\tilde{N})}$, we have to calculate the bandwidth reduction due to the AFEC redundancy for each state i of the channel SBBP. This depends on the BER characterizing the state, $\mathrm{BER}_i$. The FEC redundancy to be introduced to achieve the target value for the maximum PEP $\hat{P}^{(C)}_{\mathrm{PEP}}$ should be such that the resulting PEP for any state i of the channel, $P^{(C)}_{\mathrm{PEP},i}$, is lower than or equal to the target one, i.e.,

$$P^{(C)}_{\mathrm{PEP},i} \le \hat{P}^{(C)}_{\mathrm{PEP}}. \tag{4}$$

According to the notation introduced in Section II, indicating the size of each block expressed in bits as R, and assuming that losses introduced by the wireless channel are independent and uniformly distributed within a block,² the PEP, when the channel is in the generic state i, can be calculated as follows:

$$P^{(C)}_{\mathrm{PEP},i} = \sum_{l=m-k+1}^{m} \binom{m}{l} \cdot (1 - P_{\mathrm{BEP},i})^{m-l} \cdot (P_{\mathrm{BEP},i})^{l} \tag{5}$$

where $P_{\mathrm{BEP},i}$ represents the probability that a block is corrupted when the channel is in state i, and can be evaluated as follows:

$$P_{\mathrm{BEP},i} = 1 - (1 - \mathrm{BER}_i)^{R}. \tag{6}$$

Now, substituting (5) in (4), we can numerically find the minimum value of m verifying the inequality in (4) for each value i of the channel state. Let us indicate this value as $m_i$. Accordingly, the capacity $\tilde{N}_i$ (in packets/s), which is actually available for the transmission of data to obtain a PEP lower than $\hat{P}^{(C)}_{\mathrm{PEP}}$ in the wireless channel when its state is i, can be calculated as follows:

$$\tilde{N}_i = \frac{k}{m_i} \cdot c \tag{7}$$

where c is the channel capacity when no FEC encoding is applied (in packets/s).

In general, from (7), we obtain a noninteger value for $\tilde{N}_i$. However, we can assume that, when the channel state is i, in each slot, the channel is able to transmit either $D_i = \lfloor \tilde{N}_i \rfloor$ packets with a probability of $p_{D_i} = 1 - (\tilde{N}_i - D_i)$, or $(D_i + 1)$ packets with a probability of $p_{D_i+1} = 1 - p_{D_i}$, where we have indicated the largest integer no greater than x as $\lfloor x \rfloor$.

2 This assumption is accurate if interleaving is utilized, which is usual in wireless communications [28].
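The numerical procedure described by (4)-(7) can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation: the function names, the search cap `m_max`, and the choice of expressing `c` in packets per slot (so that the split into $D_i$ and $D_i + 1$ applies directly) are assumptions made here.

```python
from math import comb, floor

def block_error_prob(ber, block_bits):
    """Eq. (6): probability that a block of block_bits bits is corrupted."""
    return 1.0 - (1.0 - ber) ** block_bits

def packet_error_prob(m, k, p_block):
    """Eq. (5): probability that more than m - k of the m blocks are corrupted."""
    return sum(comb(m, l) * (1.0 - p_block) ** (m - l) * p_block ** l
               for l in range(m - k + 1, m + 1))

def min_redundancy(k, p_block, pep_target, m_max=4096):
    """Smallest m >= k satisfying eq. (4); m_max is a hypothetical search cap."""
    for m in range(k, m_max + 1):
        if packet_error_prob(m, k, p_block) <= pep_target:
            return m
    raise ValueError("target PEP not reachable with m <= m_max")

def slot_capacity_split(k, m, c):
    """Eq. (7) plus the integer split: returns (D_i, p_Di, p_Di_plus_1).

    c is assumed to be expressed in packets per slot.
    """
    n_tilde = k * c / m
    d = floor(n_tilde)
    p_d = 1.0 - (n_tilde - d)
    return d, p_d, 1.0 - p_d

# Example with hypothetical channel values: 320-bit blocks, k = 16.
p_blk = block_error_prob(ber=1e-4, block_bits=320)
m_i = min_redundancy(k=16, p_block=p_blk, pep_target=1e-3)
d_i, p_low, p_high = slot_capacity_split(k=16, m=m_i, c=10)
```

By construction, the mean number of packets served per slot, `d_i * p_low + (d_i + 1) * p_high`, equals the noninteger capacity given by (7).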

In summary, the emission probability matrix of the SBBP modeling the channel is $B^{(\tilde{N})} \in [M \times r^{(\tilde{N})}_{\mathrm{MAX}}]$, and its generic element can be calculated as follows:

$$B^{(\tilde{N})}_{[i,d]} = \begin{cases} p_{D_i}, & \text{if } d = D_i \\ p_{D_i+1}, & \text{if } d = D_i + 1 \\ 0, & \text{otherwise} \end{cases} \tag{8}$$

where $r^{(\tilde{N})}_{\mathrm{MAX}}$ is the maximum number of packets that can be transmitted in one slot, i.e.,

$$r^{(\tilde{N})}_{\mathrm{MAX}} = \max_i \{D_i + 1\}. \tag{9}$$

The transition probability matrix and the state space, together with the channel-emission probability matrix and the maximum number of packets that can be transmitted in one slot defined in (8) and (9), completely characterize the channel SBBP model.

D. Video-Transmission-System Model

The adaptive-rate source pursues a given target by implementing a feedback law in the rate controller, which calculates the value q of the QSP to be used by the MPEG encoder for each frame. The target of this section is to model the video transmission system as a whole, indicated here as Σ. To this aim, we use a discrete-time queueing system model. Let K represent the maximum number of packets that can be contained in the queue of the transmission buffer and its server. The server capacity of this queueing system, that is, the number of packets that can leave the queue at each time slot, is a stochastic process that has been modeled with the channel SBBP process $\tilde{N}(n)$. The input of the queue system is the emission process of the adaptive-rate source, indicated here as $\tilde{Y}(n)$. Therefore, at slot n, the transmission-buffer queue size is incremented by $\tilde{Y}(n)$, and decremented by $\tilde{N}(n)$. Both the input and the output processes can be modeled by means of two SBBP processes, as discussed above, where the slot duration is the frame duration ∆.

To model the queueing system, we assume a late-arrival-system-with-immediate-access time diagram [3], [11]: packets arrive in batches, and can enter the service facility if it is free, with the possibility of them being ejected almost instantaneously. Note that in this model, a packet service time is counted as the number of slot boundaries from the point of entry to the service facility up to the packet departure time. Therefore, even though we allow the arriving packet to be ejected almost instantaneously, its service time is counted as 1, not 0.

A complete description of Σ at the nth slot requires a three-dimensional Markov process, whose state is defined as $S^{(\Sigma)}(n) = (S^{(Q)}(n), S^{(\tilde{N})}(n), S^{(\tilde{Y})}(n))$, where:
1) $S^{(Q)}(n) \in [0, K]$ is the transmission-buffer queue state in the nth slot, i.e., the number of packets in the queue and in the service facility at the observation instant;
2) $S^{(\tilde{N})}(n)$ is the state of the underlying Markov chain of the channel SBBP $\tilde{N}(n)$;
3) $S^{(\tilde{Y})}(n)$ is the state of the underlying Markov chain of $\tilde{Y}(n)$, which coincides with that of $\tilde{Y}_q(n)$, for any q ∈ [1, 31].

According to the late-arrival-system-with-immediate-access time diagram, the transmission-buffer state in slot (n + 1) can be obtained through the Lindley equation [13]

$$s''_Q = \max\left\{ \min\left\{ s'_Q + r, K \right\} - d,\ 0 \right\} \tag{10}$$

where $s'_Q$ is the transmission-buffer state in the generic slot n, while r and d are the number of arrivals and the server capacity at slot n + 1, respectively.

The channel SBBP $\tilde{N}(n)$, modeled in Section III-C, can be equivalently characterized through the set of transition probability matrices $M^{(\tilde{N})}(d)$, which are transition probability matrices including the probability that the server capacity is d (in packets/slot). These matrices can be obtained from the parameter set $(Q^{(\tilde{N})}, B^{(\tilde{N})})$ as follows:

$$\left[ M^{(\tilde{N})}(d) \right]_{[s'_{\tilde{N}}, s''_{\tilde{N}}]} \equiv \mathrm{Prob}\left\{ \tilde{N}(n+1) = d,\ S^{(\tilde{N})}(n+1) = s''_{\tilde{N}} \mid S^{(\tilde{N})}(n) = s'_{\tilde{N}} \right\} = Q^{(\tilde{N})}_{[s'_{\tilde{N}}, s''_{\tilde{N}}]} \cdot B^{(\tilde{N})}_{[s''_{\tilde{N}}, d]} \quad \forall d \in \left[0, r^{(\tilde{N})}_{\mathrm{MAX}}\right]. \tag{11}$$
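The constructions in (8)-(11) can be sketched in a few lines of Python. The sketch below is illustrative only: the function names and list-based matrix layout are hypothetical, the per-state capacities are assumed to be given in packets per slot, and the emission probability in (11) is assumed to be indexed by the arrival state of the channel chain.

```python
from math import floor

def channel_emission_matrix(capacities):
    """Build B[i][d] as in eq. (8) from the per-state capacities N_i.

    The matrix has one column for each d in 0 .. r_max, with r_max from eq. (9).
    """
    floors = [floor(n) for n in capacities]
    r_max = max(floors) + 1
    B = []
    for n, d_i in zip(capacities, floors):
        p_low = 1.0 - (n - d_i)          # probability of serving D_i packets
        row = [0.0] * (r_max + 1)
        row[d_i] = p_low
        row[d_i + 1] = 1.0 - p_low       # probability of serving D_i + 1 packets
        B.append(row)
    return B

def service_matrices(Q, B):
    """Eq. (11): M[d][s][t] = Q[s][t] * B[t][d] (arrival-state indexing assumed)."""
    n_states = len(Q)
    return [[[Q[s][t] * B[t][d] for t in range(n_states)]
             for s in range(n_states)]
            for d in range(len(B[0]))]

def lindley_step(s_q, r, d, K):
    """Eq. (10): next buffer state given r arrivals and server capacity d."""
    return max(min(s_q + r, K) - d, 0)

# Toy two-state channel (hypothetical values).
Q_ch = [[0.9, 0.1],
        [0.2, 0.8]]
B_ch = channel_emission_matrix([2.5, 4.0])
M_ch = service_matrices(Q_ch, B_ch)
```

A useful sanity check is that summing the matrices `M_ch[d]` over all d gives back `Q_ch`, because each row of `B_ch` sums to 1.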

The adaptive-rate source emission process is modeled by an SBBP whose emission probability matrix depends on the transmission-buffer state. In order to model this process, we use the SBBP models of the noncontrolled MPEG video source described in Section III-B, $\tilde{Y}_q(n)$, for each q ∈ [1, 31]. So, we have a parameter set $(Q^{(\tilde{Y})}, B^{(\tilde{Y}_1)}, B^{(\tilde{Y}_2)}, \ldots, B^{(\tilde{Y}_{31})})$, which represents an SBBP whose transition matrix is $Q^{(\tilde{Y})}$, and whose emission process is characterized by a set of emission matrices $\{B^{(\tilde{Y}_q)}\}_{q=1,2,\ldots,31}$. Consequently, at each time slot, the emission of the MPEG video source is characterized by an emission probability matrix chosen according to the QSP value defined by the feedback law q = φ(s_Q, a, j).

More concisely, as in (11) for the channel SBBP, we characterize the emission process of the adaptive-rate source through the set of matrices $\{M^{(\tilde{Y})}_{s_Q}(r)\}$, $\forall r \in [0, r^{(\tilde{Y})}_{\mathrm{MAX}}]$, each matrix representing the transition probability matrix including the probability of r packets being emitted when the buffer state is $s_Q$. Accordingly, the generic element of the matrix $M^{(\tilde{Y})}_{s_Q}(r)$ can be obtained from the above parameter set $(Q^{(\tilde{Y})}, B^{(\tilde{Y}_1)}, B^{(\tilde{Y}_2)}, \ldots, B^{(\tilde{Y}_{31})})$ as follows:

$$\left[ M^{(\tilde{Y})}_{s_Q}(r) \right]_{[(i',j'),(i'',j'')]} = Q^{(\tilde{Y})}_{[(i',j'),(i'',j'')]} \cdot \sum_{a'' \in \Im^{(\mathrm{Act})}} B^{(\tilde{Y}_{q''})}_{[(i'',j''),r]} \cdot f_{\mathrm{Act}}(a'' \mid i'', j'') \tag{12}$$


where the following hold:
1) $q''$ is the QSP chosen when the frame to be encoded is the $j''$th in the GoP, the activity is $a''$, and the transmission-buffer state before encoding this frame is $s_Q$. The value of $q''$ is determined by the feedback law φ(·), i.e.,

$$q'' = \varphi(s_Q, a'', j''). \tag{13}$$

2) $f_{\mathrm{Act}}(a'' \mid i'', j'')$ is the probability that the generic frame $j''$ in the GoP has an activity $a''$ when its activity level is $i''$. This function, as demonstrated in [15], [16], and [21], is a Gamma pdf, whose mean value and variance characterize the video trace.
3) $\Im^{(\mathrm{Act})}$ is the set of all the possible activities.

Finally, we can model the video transmission system as a whole. If we indicate two generic states of the system as $s'_\Sigma = (s'_Q, s'_{\tilde{N}}, s'_{\tilde{Y}})$ and $s''_\Sigma = (s''_Q, s''_{\tilde{N}}, s''_{\tilde{Y}})$, the generic element of the transition matrix of the video transmission system as a whole, $Q^{(\Sigma)}$, can be calculated, due to (11) and (12), as follows:

$$Q^{(\Sigma)}_{[(s'_Q, s'_{\tilde{N}}, s'_{\tilde{Y}}),\,(s''_Q, s''_{\tilde{N}}, s''_{\tilde{Y}})]} \equiv \mathrm{Prob}\left\{ S^{(Q)}(n+1) = s''_Q,\ S^{(\tilde{N})}(n+1) = s''_{\tilde{N}},\ S^{(\tilde{Y})}(n+1) = s''_{\tilde{Y}} \mid S^{(Q)}(n) = s'_Q,\ S^{(\tilde{N})}(n) = s'_{\tilde{N}},\ S^{(\tilde{Y})}(n) = s'_{\tilde{Y}} \right\} = \sum_{d=0}^{d_{\mathrm{MAX}}} \sum_{r=0}^{r_{\mathrm{MAX}}} \left[ M^{(\tilde{N})}(d) \right]_{[s'_{\tilde{N}}, s''_{\tilde{N}}]} \cdot \left[ M^{(\tilde{Y})}_{s'_Q}(r) \right]_{[s'_{\tilde{Y}}, s''_{\tilde{Y}}]} \cdot \psi(s'_Q, s''_Q, K, r, d) \tag{14}$$

where $\psi(s'_Q, s''_Q, K, r, d)$ is a Boolean condition for the queue state behavior, and is defined as follows:

$$\psi(s'_Q, s''_Q, K, r, d) = \begin{cases} 1, & \text{if } \max\left\{ \min\left\{ s'_Q + r, K \right\} - d,\ 0 \right\} = s''_Q \\ 0, & \text{otherwise.} \end{cases} \tag{15}$$

Once the matrix $Q^{(\Sigma)}$ is known, we can calculate the steady-state probability array of the system Σ as the solution of the following linear system

$$\begin{cases} \pi^{(\Sigma)} \cdot Q^{(\Sigma)} = \pi^{(\Sigma)} \\ \pi^{(\Sigma)} \cdot \mathbf{1} = 1 \end{cases} \tag{16}$$

where $\mathbf{1}$ is a column array whose elements are equal to 1, and $\pi^{(\Sigma)}$ is the steady-state probability array, whose generic element is

$$\pi^{(\Sigma)}_{[(s_Q, s_{\tilde{N}}, s_{\tilde{Y}})]} = \mathrm{Prob}\left\{ S^{(Q)}(n) = s_Q,\ S^{(\tilde{N})}(n) = s_{\tilde{N}},\ S^{(\tilde{Y})}(n) = s_{\tilde{Y}} \right\}. \tag{17}$$

A direct solution of the system in (16) may be difficult since the number of states grows explosively as the maximum transmission buffer size K increases. Nevertheless, many algorithms, e.g., [10], [18], and [23], enable us to calculate the array $\pi^{(\Sigma)}$, while maintaining a linear dependence on K.

IV. QUANTIZATION-DISTORTION ANALYSIS

In this section, we evaluate both the static and time-varying statistics of the quantization distortion, represented by the process PSNR(n). More specifically, we will quantize the PSNR process with a set of L different levels of distortion, $\{\mu_1, \mu_2, \ldots, \mu_L\}$, each representing an interval of distortion values where the quality perceived by the users can be considered constant. As an example, for the movie Evita, from a subjective analysis obtained with 300 tests, the following L = 5 levels of distortion were envisaged: $\mu_1$ = [31.2, 34.2] dB, $\mu_2$ = [34.2, 35.0] dB, $\mu_3$ = [35.0, 36.2] dB, $\mu_4$ = [36.2, 38.4] dB, and $\mu_5$ = [38.4, 52.1] dB.

The pdf $f_{\mathrm{PSNR}}(p)$ can be easily calculated from the transition probability matrix and the steady-state probability array of the whole system, which have been derived in (14) and (16), respectively:

$$f_{\mathrm{PSNR}}(p) \equiv \mathrm{Prob}\{\mathrm{PSNR}(n) = p\} = \sum_{s'_Q=0}^{K} \sum_{s'_{\tilde{N}} \in \Im^{(\tilde{N})}} \sum_{i' \in \Im^{(G)}} \sum_{j' \in J} \sum_{s''_Q=0}^{K} \sum_{s''_{\tilde{N}} \in \Im^{(\tilde{N})}} \sum_{i'' \in \Im^{(G)}} \sum_{a'' \in \Im^{(\mathrm{Act})}} f_{\mathrm{Act}}(a'' \mid i'', j'') \cdot Q^{(\Sigma)}_{[(s'_Q, s'_{\tilde{N}}, (i',j')),\,(s''_Q, s''_{\tilde{N}}, (i'',j''))]} \cdot \pi^{(\Sigma)}_{[(s'_Q, s'_{\tilde{N}}, (i',j'))]} \cdot \psi_{[s'_Q, a'', j'']}(p) \tag{18}$$

where the following hold:
1) $\psi_{[s'_Q, a'', j'']}(p)$ is a Boolean condition defined as follows

$$\psi_{[s'_Q, a'', j'']}(p) = \begin{cases} 1, & \text{if } F^{(j'')}\left( \varphi(s'_Q, a'', j'') \right) = p \\ 0, & \text{otherwise.} \end{cases} \tag{19}$$

2) $F^{(j)}(q)$ in (19) is the so-called distortion curve [5], [17], [22] for the generic frame j, which is the curve linking the average PSNR to the QSP value, q, used to encode the frame.

Now, in order to calculate the statistics of the quantized PSNR process, let us define the array $\gamma^{(\zeta)}$ in which the generic element $\gamma^{(\zeta)}_l$, for each l ∈ [1, L], is the QSP range giving a distortion belonging to the lth level for a frame encoded with encoding mode ζ ∈ {I, P, B}. Of course, by so doing, we assume that a variation of q within the interval $\gamma^{(\zeta)}_l$ does not cause any appreciable distortion. From the distortion curves for the movie Evita, we have calculated the following QSP ranges corresponding to the above distortion levels $\mu_l$, for each l ∈ [1, 5].
1) For I frames: $\gamma^{(I)}$ = [[16, 31], [13, 15], [10, 12], [6, 9], [1, 5]].
2) For P frames: $\gamma^{(P)}$ = [[15, 31], [13, 14], [10, 12], [6, 9], [1, 5]].
3) For B frames: $\gamma^{(B)}$ = [[17, 31], [14, 16], [11, 13], [7, 10], [1, 6]].

Let $q = \varphi(s'_Q, a'', j'')$ be the feedback law, linking the transmission-buffer state at the beginning of a generic slot n, $s'_Q \in [0, K]$, the activity of the frame in the same slot, $a'' \in \Im^{(G)}$, and the position in the GoP of the frame to be encoded, $j'' \in J$, to the QSP to be used to encode the current frame. Moreover, for each $a''$ and $j''$, let $\theta^{(a'',j'')}_l = \{\forall s'_Q \text{ such that } \varphi(s'_Q, a'', j'') \in \gamma^{(\zeta)}_l\}$ be the range of values of the transmission-buffer state for which the rate controller chooses QSP values belonging to the level $\mu_l$, according to the adopted feedback law. By definition, it follows that a variation of the transmission-buffer state within $\theta^{(a'',j'')}_l$ does not cause any appreciable distortion variation.

Let us now calculate the probability that the value of the process PSNR(n) is in the generic interval $\mu_l$, $\pi^{(\mathrm{PSNR})}_{[l]}$, and the pdf $f_{\delta_l}(m)$ of the stochastic variable $\delta_l$, representing the duration of the time the process PSNR(n) remains in the generic interval $\mu_l$ without interruption. They are defined as

$$\pi^{(\mathrm{PSNR})}_{[l]} = \mathrm{Prob}\{\mathrm{PSNR}(n) \in \mu_l\} \tag{20}$$

and

$$f_{\delta_l}(m) = \mathrm{Prob}\left\{ \mathrm{PSNR}(n+1) \in \mu_l, \ldots, \mathrm{PSNR}(n+m-1) \in \mu_l,\ \mathrm{PSNR}(n+m) \notin \mu_l \mid \mathrm{PSNR}(n-1) \notin \mu_l,\ \mathrm{PSNR}(n) \in \mu_l \right\}. \tag{21}$$

The term $\pi^{(\mathrm{PSNR})}_{[l]}$ in (20) can be calculated from the pdf $f_{\mathrm{PSNR}}(p)$ obtained in (18) as follows:

$$\pi^{(\mathrm{PSNR})}_{[l]} = \sum_{p \in \mu_l} f_{\mathrm{PSNR}}(p). \tag{22}$$

In order to calculate the pdf $f_{\delta_l}(m)$ in (21), let us indicate the matrix containing the one-slot probabilities of transition towards system states in which the distortion level is $\mu_l$ as $Q^{(\Sigma)}_{\to\mu_l}$. It can be obtained from the transition probability matrix of the system $Q^{(\Sigma)}$, as in (23), which defines its generic element $\left[ Q^{(\Sigma)}_{\to\mu_l} \right]_{[(s'_Q, s'_{\tilde{N}}, (i',j')),\,(s''_Q, s''_{\tilde{N}}, (i'',j''))]}$. Therefore, the pdf $f_{\delta_l}(m)$ can be calculated as the probability that the system Σ, starting from a distortion level $\mu_l$, remains in the same level for (m − 1) consecutive slots, and leaves this level at the mth slot, that is

$$f_{\delta_l}(m) = \pi^{(\Sigma 1, \mu_l)} \cdot \left[ Q^{(\Sigma)}_{\to\mu_l} \right]^{m-1} \cdot Q^{(\Sigma)}_{\to(\neq\mu_l)} \cdot \mathbf{1}^{T} \tag{24}$$

where:

$$\pi^{(\Sigma 1, \mu_l)} = \frac{\pi^{(\Sigma, \neq\mu_l)} \cdot Q^{(\Sigma)}_{\to\mu_l}}{\pi^{(\Sigma, \neq\mu_l)} \cdot Q^{(\Sigma)}_{\to\mu_l} \cdot \mathbf{1}^{T}}.$$

The array $\pi^{(\Sigma 1, \mu_l)}$ in (24) is the steady-state probability array in the first slot of a period in which the distortion level is $\mu_l$. The array $\pi^{(\Sigma, \neq\mu_l)}$, on the other hand, is the steady-state probability array in a generic slot in which the distortion level is other than $\mu_l$, and is defined as

$$\pi^{(\Sigma, \neq\mu_l)} = \frac{\pi^{(\Sigma)} \cdot Q^{(\Sigma)}_{\to(\neq\mu_l)}}{\pi^{(\Sigma)} \cdot Q^{(\Sigma)}_{\to(\neq\mu_l)} \cdot \mathbf{1}^{T}}. \tag{25}$$

V. CASE STUDY

A. System Characterization

We analyzed the statistical characteristics of 1 hour of MPEG video sequences of the movie Evita. To encode this movie, we used a frame rate of F = 25 frames/s, and a frame size of 180 macroblocks. The GoP structure IBBPBB was used, selecting a ratio of total frames to intraframes of $G_I$ = 6, and the distance between two successive P frames or between the last P frame in the GoP and the I frame in the next GoP as $G_P$ = 3. The size of the transmission buffer has been set to K = 60 packets. The gross link capacity assigned to the video application is 2 Mb/s. The IP packets at the wireless terminal are divided into 40-byte blocks, as usual in the universal mobile telecommunications system (UMTS) environment. The AFEC module encodes sets of k = 16 blocks into sets of m. In this case study, we use the eight-state finite-state Markov channel (FSMC) model introduced in [26] for the wireless channel and consider two different cases.
1) Pedestrian: The mobile user’s velocity is 5 km/h.
2) Driver: The mobile user’s velocity is 55 km/h.
Assuming that wireless transmission is performed in the 2-GHz band, which is the value used in UMTS, the maximum Doppler frequency is $f_m$ = 10 Hz in the first case and $f_m$ = 100 Hz in the second. The values that characterize $Q^{(C)}$ are given in Table I for the pedestrian and driver cases. The above matrices were calculated

2783

a ∈(Act)

 = 0,

(Σ) Q 

sQ ,s˜ ,(i ,j  ) N

,

s ,s˜ ,(i ,j  ) Q N

 fAct (a |i , j  ),

(21)

(a ,j  )

if sQ ∈ θl otherwise

(23)

2784


TABLE I Q(C) PARAMETERS IN THE PEDESTRIAN CASE (fm = 10 Hz) AND DRIVER CASE (fm = 100 Hz)

TABLE II REDUNDANCY BLOCKS AND NET LINK CAPACITY OFFERED TO THE APPLICATION FOR DIFFERENT CHANNEL STATES AND TARGET ERROR PROBABILITIES $\hat{P}^{(C)}_{PEP}$ IN THE DRIVER CASE, WHEN THE GROSS LINK CAPACITY IS c = 2 Mb/s

assuming that the video-frame rate is 25 frames/s and, therefore, the slot duration is $\Delta = 40$ ms.

The target error-probability values considered are $\hat{P}^{(C)}_{PEP} = 10^{-5}$, $10^{-4}$, $10^{-3}$, and $10^{-2}$. Table II lists, for each state $i$ of the server SBBP model, the values of $m_i$ and the resulting available link capacities $\tilde{c}_i$ for these $\hat{P}^{(C)}_{PEP}$ values in the driver case, taken as an example.

In this case study, we consider a feedback law obtained from the statistics of the movie Evita, expressed in terms of rate and distortion curves [5], [17], [23]. The rate curves $R_{a,j}(q)$ give the expected number of packets emitted when the $j$th frame in the GoP, with activity value $a$, is encoded with a QSP value $q$. The distortion curves $F^{(j)}(q)$ give the expected encoding PSNR, and have been defined in Section IV. The rate and distortion curves for the movie Evita are shown in Fig. 2.

The considered feedback law aims to keep the number of packets in the transmission-buffer queue below a given threshold $K_\theta$ at the end of each GoP interval, while keeping the PSNR stable during the whole GoP. To this end, both the rate curves $R_{a,j}(q)$ and the distortion curves $F^{(j)}(q)$ are used.

More specifically, let $s_Q$ and $s_{\tilde N}$ be the transmission-buffer queue length and the available channel capacity, respectively, when the $j$th frame in the GoP has to be encoded, and let $a$ be the activity of this frame. The QSP is chosen assuming the following.
1) The activity will remain constant during the rest of the GoP, that is, $Act(n) = a$, for each frame $h \in [j + 1, G_I]$.
2) The channel behavior, and therefore the available network bandwidth $\tilde{N}(n)$, remains constant during the rest of the GoP, that is, for each frame $h \in [j + 1, G_I]$.

Under these assumptions, the QSP is chosen as the minimum QSP $\bar{q}$ such that it is possible to find a set of QSP values for the next frames of the GoP, $[q_{j+1}, \ldots, q_{G_I}]$, so that the following hold.
1) The PSNR of those frames is constant, and equal to the value that should be achieved for frame $j$.
2) The number of packets expected to be emitted for the next frames of the GoP with these QSP values, added to the current queue length, minus the number of packets that will leave the queue by the end of the GoP, is lower than the given threshold $K_\theta$.

Fig. 2. Rate-distortion curves for I, P, and B frames. Rate curves for (a) frame I, (b) frame B, and (c) frame P. (d) Distortion curves.

In other words, the feedback law works by choosing the QSP as in (26), shown at the bottom of the page.
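The selection rule described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the functions `rate` and `psnr` are hypothetical toy stand-ins for the measured curves $R_{a,j}(q)$ and $F^{(j)}(q)$ of Fig. 2, and the helper `matching_qsp` relaxes the exact PSNR equality to a closest-match rule, since equality rarely holds on an integer QSP grid. The names `feedback_law`, `n_tilde`, and `k_theta` are introduced here for illustration.

```python
import math

GOP = "IBBPBB"             # GoP structure of the case study (G_I = 6)
G_I = len(GOP)
QSP_RANGE = range(1, 32)   # admissible QSP values, 1..31

# Hypothetical toy curves, NOT the measured Evita curves of Fig. 2.
WEIGHT = {"I": 3.0, "P": 1.5, "B": 1.0}
OFFSET = {"I": 1.0, "P": 0.5, "B": 0.0}

def rate(a, k, q):
    """Expected packets for frame k (1-based GoP position), activity a, QSP q."""
    return math.ceil(a * WEIGHT[GOP[k - 1]] / q)

def psnr(k, q):
    """Expected PSNR (dB) of frame k encoded with QSP q; decreasing in q."""
    return 45.0 + OFFSET[GOP[k - 1]] - 0.5 * q

def matching_qsp(k, target):
    """QSP for frame k whose PSNR is closest to target (assumed relaxation)."""
    return min(QSP_RANGE, key=lambda q: abs(psnr(k, q) - target))

def feedback_law(s_q, a, j, n_tilde, k_theta):
    """Smallest QSP for frame j keeping the projected end-of-GoP queue <= k_theta."""
    for q_bar in QSP_RANGE:
        target = psnr(j, q_bar)
        # Packets emitted by frame j plus the remaining frames at matched quality.
        emitted = rate(a, j, q_bar) + sum(
            rate(a, k, matching_qsp(k, target)) for k in range(j + 1, G_I + 1)
        )
        # Packets served from slot j to the end of the GoP at constant capacity.
        served = (G_I - j + 1) * n_tilde
        if s_q + emitted - served <= k_theta:
            return q_bar
    return QSP_RANGE[-1]   # no QSP meets the bound: fall back to the coarsest one
```

Because the loop scans QSP values in increasing order, the first admissible value is also the one giving the best quality; a fuller buffer can only push the chosen QSP upwards.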

B. Numerical Results

Fig. 3 shows the pdfs of the transmission-buffer queue size for the two values of the Doppler frequency $f_m$ and for a given value of the target error probability among those being considered. The values shown have been calculated as follows:

$$\mathrm{Prob}\left\{S^{(Q)}(n) = s_Q\right\} = \sum_{s_{\tilde N} \in \Im^{(\tilde N)}} \ \sum_{s_{\tilde Y} \in \Im^{(\tilde Y)}} \pi^{(\Sigma)}_{[s_Q, s_{\tilde N}, s_{\tilde Y}]}. \tag{27}$$
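The marginalization in (27) amounts to summing the joint steady-state probabilities over the channel and source states. A minimal sketch follows, assuming the joint array is stored as a dictionary keyed by the triple (queue state, channel state, source state); the function name `queue_pmf` and the toy numbers are hypothetical and are not results from the paper.

```python
def queue_pmf(pi_sigma, K):
    """Marginalize the joint steady-state array over channel and source states."""
    pmf = [0.0] * (K + 1)
    for (s_q, _s_n, _s_y), prob in pi_sigma.items():
        pmf[s_q] += prob
    return pmf

# Toy joint distribution with K = 2, two channel states and two source states.
PI_SIGMA = {
    (0, 0, 0): 0.10, (0, 0, 1): 0.05, (0, 1, 0): 0.15, (0, 1, 1): 0.10,
    (1, 0, 0): 0.20, (1, 1, 1): 0.10,
    (2, 0, 1): 0.25, (2, 1, 0): 0.05,
}

QUEUE_PMF = queue_pmf(PI_SIGMA, K=2)
```

States missing from the dictionary are treated as having zero probability, which keeps the representation sparse for large state spaces.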

We can observe that the curves closely resemble Gamma distributions and are very similar to each other, independently of the $\hat{P}^{(C)}_{PEP}$ value. This is evidence that the feedback law works properly. This is further demonstrated in Fig. 4, where we show the average queue size as well as the mean delay in the transmission buffer. The value of the average queue size does not change significantly when $\hat{P}^{(C)}_{PEP}$ changes, and is higher in the driver case. This can be explained by the fact that, in the driver case, the wireless medium quality is lower and, therefore, the transmission-buffer service rate is lower. Similar considerations apply to Fig. 5, which shows the loss probability in the transmission buffer, calculated as in [4].
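For illustration, the average queue size plotted in Fig. 4 is the mean of the queue-size pdf. The sketch below also derives a mean delay through Little's law, which is an assumption made here and not necessarily the method used by the authors; `offered_pkts_per_slot` and `loss_prob` are hypothetical inputs.

```python
SLOT_S = 0.040  # slot duration in seconds (25 frames/s)

def mean_queue(pmf):
    """Average number of packets in the buffer, given the queue-size pmf."""
    return sum(s * p for s, p in enumerate(pmf))

def mean_delay_s(pmf, offered_pkts_per_slot, loss_prob):
    """Mean delay in seconds via Little's law (assumed): queue / accepted rate."""
    accepted = offered_pkts_per_slot * (1.0 - loss_prob)
    return mean_queue(pmf) / accepted * SLOT_S

# Toy example: three queue states, 10 packets offered per slot, 10% loss.
TOY_PMF = [0.4, 0.3, 0.3]
TOY_MEAN = mean_queue(TOY_PMF)
TOY_DELAY = mean_delay_s(TOY_PMF, offered_pkts_per_slot=10.0, loss_prob=0.1)
```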

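$$q = \phi(s_Q, a, j) = \min_{\bar{q} \in [1,31]} \left\{ \bar{q} \ \text{such that}\ \exists\, [q_{j+1}, \ldots, q_{G_I}] \ \text{for which:} \begin{array}{l} F^{(k)}(q_k) = F^{(j)}(\bar{q}) \quad \forall k \in [j+1, \ldots, G_I] \\ s_Q + R_{a,j}(\bar{q}) + \sum_{k=j+1}^{G_I} R_{a,k}(q_k) - (G_I - j + 1) \cdot \tilde{N}(n) \le K_\theta \end{array} \right\} \tag{26}$$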


Fig. 3. Transmission-buffer size pdf for $\hat{P}^{(C)}_{PEP} = 10^{-2}$ (a) in the pedestrian case and (b) in the driver case.

Fig. 4. Average transmission-buffer size and mean delay versus the target error probability $\hat{P}^{(C)}_{PEP}$.

Fig. 6 shows the performance related to the encoding quality. In particular, it can be observed that, for high values of $\hat{P}^{(C)}_{PEP}$, due to the high amount of available bandwidth, the most likely PSNR level is the highest. On the contrary, for low values of $\hat{P}^{(C)}_{PEP}$, as a result of the large amount of redundancy introduced by AFEC, the available bandwidth is low and, therefore, the video source reduces the encoding quality. For this reason, the lower the value of the target PEP in the wireless link, $\hat{P}^{(C)}_{PEP}$, the greater the probability of poorer PSNR levels. In order to better quantify the influence of the choice of the target value $\hat{P}^{(C)}_{PEP}$ on the encoding performance, in Fig. 6, the average PSNR level is shown. As expected, the worst case for the average PSNR level occurs when the AFEC has a very stringent target for the maximum PEP, $\hat{P}^{(C)}_{PEP}$. When a less stringent target value for the PEP is required, the encoding quality increases.
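As a small illustration of how the PSNR level probabilities of (22) lead to an average PSNR level, the sketch below sums a PSNR pdf over each level and then averages the level index. The level boundaries in `LEVELS` and the toy pdf are hypothetical placeholders (the actual intervals $\mu_l$ are defined in Section IV), and taking the average level as the probability-weighted level index is an assumption.

```python
# Hypothetical PSNR intervals [low, high) in dB for levels mu_1 (worst) .. mu_5 (best).
LEVELS = [(0, 30), (30, 33), (33, 36), (36, 39), (39, float("inf"))]

def level_probabilities(f_psnr, levels):
    """pi[l] = sum of f_PSNR(p) over the PSNR values p that fall in level mu_l."""
    probs = [0.0] * len(levels)
    for p, prob in f_psnr.items():
        for idx, (low, high) in enumerate(levels):
            if low <= p < high:
                probs[idx] += prob
                break
    return probs

def average_level(probs):
    """Probability-weighted mean of the level index l = 1..5."""
    return sum(l * p for l, p in enumerate(probs, start=1))

# Toy PSNR pdf (dB -> probability), not taken from the paper.
F_PSNR = {28: 0.05, 31: 0.10, 34: 0.15, 35: 0.05, 37: 0.40, 41: 0.25}
PI_LEVELS = level_probabilities(F_PSNR, LEVELS)
AVG_LEVEL = average_level(PI_LEVELS)
```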

As expected, the average PSNR value, and thus the encoding quality, is higher in the pedestrian case.

Fig. 5. Packet loss probability in the transmission buffer versus the target error probability $\hat{P}^{(C)}_{PEP}$.

Fig. 6. Average PSNR level versus the target error probability $\hat{P}^{(C)}_{PEP}$.

VI. CONCLUSION

In this paper, we have defined an analytical framework for evaluating the performance of real-time MPEG video transmission over a wireless link that applies AFEC to keep the PEP below a given threshold. The MPEG encoder uses a rate controller that adapts the output rate by appropriately setting the QSP, so as to follow the bandwidth variations while maximizing encoding quality and stability. The whole system has been modeled by an emission process that feeds the transmission buffer; the server of this buffer behaves according to the channel conditions, i.e., the service rate is higher when channel conditions are good and lower when channel conditions are bad. SBBPs have been used to model both the MPEG video source [4], [15], [17] and the server process of the transmission buffer, which coincides with the time-varying available bandwidth in the network. Accordingly, the whole system has been modeled as an SBBP/SBBP/1/K process.

The analytical framework proposed in the paper has been used to evaluate the performance in terms of the distortion introduced by the quantization mechanism in the encoding process, and of the loss and mean delay in the transmission buffer. Numerical results show that the system is robust, owing to the implemented feedback law, which keeps the mean delay and the loss probability in the output buffer almost constant. Moreover, the corruption probability in the wireless channel is also limited, in spite of possible time variations in the wireless-channel BER. The proposed model allows the designer to evaluate the resulting variation in encoding quality, which represents the cost of using this approach. The results obtained in the paper can be used to obtain the best tradeoff between encoding quality and information correctness.


REFERENCES

[1] I. F. Akyildiz, I. Joe, H. Driver, and Y. L. Ho, "A new adaptive FEC scheme for wireless ATM networks," in Proc. IEEE Military Communications Conf. (MILCOM), Boston, MA, Oct. 1998, pp. 277–281.
[2] E. Altman, C. Barakat, and V. M. Ramos, "Queueing analysis of simple FEC schemes for IP telephony," in Proc. IEEE Information Communications (INFOCOM), Anchorage, AK, Apr. 2001, pp. 796–804.
[3] J. J. Bae, T. Suda, and R. Simha, "Analysis of individual packet loss in a finite buffer queue with heterogeneous Markov modulated arrival processes: A study of traffic burstiness and a priority packet discarding," in Proc. IEEE Information Communications (INFOCOM), Florence, Italy, Apr. 1992, pp. 219–230.
[4] A. Cernuto, F. Cocimano, A. Lombardo, and G. Schembra, "A queueing system model for the design of feedback laws in rate-controlled MPEG video encoders," IEEE Trans. Circuits Syst. Video Technol., vol. 12, no. 4, pp. 238–255, Apr. 2002.
[5] C. F. Chang and J. S. Wang, "A stable buffer control strategy for MPEG coding," IEEE Trans. Circuits Syst. Video Technol., vol. 7, no. 6, pp. 920–924, Dec. 1997.
[6] S. R. Cho, "Adaptive error control scheme for multimedia applications in integrated terrestrial-satellite wireless networks," in Proc. IEEE Wireless Communications and Networking Conf. (WCNC), Chicago, IL, Sep. 2000, pp. 629–633.
[7] A. Chockalingam and M. Zorzi, "Wireless TCP performance with link layer FEC/ARQ," in Proc. IEEE Int. Conf. Communications (ICC), Vancouver, BC, Canada, Jun. 1999, pp. 1212–1216.
[8] W. Ding and B. Liu, "Rate control of MPEG video coding and recording by rate-quantization modeling," IEEE Trans. Circuits Syst. Video Technol., vol. 6, no. 1, pp. 12–20, Feb. 1996.
[9] O. Hashida et al., "Switched batch Bernoulli process (SBBP) and the discrete-time SBBP/G/1 queue with application to statistical multiplexer," IEEE J. Sel. Areas Commun., vol. 9, no. 3, pp. 394–401, Apr. 1991.
[10] A. E. Kamal, "Efficient solution of multiple server queues with application to the modeling of ATM concentrators," in Proc. IEEE Information Communications (INFOCOM), San Francisco, CA, 1996, pp. 248–254.
[11] A. La Corte, A. Lombardo, and G. Schembra, "An analytical paradigm to calculate multiplexer performance in an ATM multimedia environment," Comput. Netw. ISDN Syst., vol. 29, no. 16, pp. 1881–1900, Dec. 1997.
[12] L. J. Lin and A. Ortega, "Bit-rate control using piecewise approximated rate-distortion characteristics," IEEE Trans. Circuits Syst. Video Technol., vol. 8, no. 4, pp. 446–459, Aug. 1998.
[13] D. V. Lindley, "The theory of queues with a single server," Proc. Cambridge Philos. Soc., vol. 48, pp. 277–289, 1952.
[14] H. Liu and M. El Zarki, "Performance of H.263 video transmission over wireless channels using hybrid ARQ," IEEE J. Sel. Areas Commun., vol. 15, no. 9, pp. 1775–1786, Dec. 1997.
[15] A. Lombardo, G. Morabito, and G. Schembra, "An accurate and treatable Markov model of MPEG-video traffic," in Proc. IEEE Information Communications (INFOCOM), San Francisco, CA, Mar./Apr. 1998, pp. 217–224.
[16] A. Lombardo, G. Morabito, S. Palazzo, and G. Schembra, "A Markov-based algorithm for the generation of MPEG sequences matching intra- and inter-GoP correlation," Eur. Trans. Telecommun. J., vol. 12, no. 2, pp. 127–142, Mar./Apr. 2001.
[17] A. Lombardo and G. Schembra, "Performance evaluation of an adaptive-rate MPEG encoder matching IntServ traffic constraints," IEEE/ACM Trans. Netw., vol. 11, no. 1, pp. 47–65, Feb. 2003.
[18] M. F. Neuts, Matrix-Geometric Solutions in Stochastic Models: An Algorithmic Approach. Baltimore, MD: The Johns Hopkins Univ. Press, 1981.
[19] L. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proc. IEEE, vol. 77, no. 2, pp. 257–286, Feb. 1989.
[20] A. Ramesh, A. Chockalingam, and L. B. Milstein, "A first-order Markov model for correlated Nakagami-m fading channels," in Proc. IEEE Int. Conf. Communications (ICC), New York, Apr. 2002, pp. 3413–3417.
[21] O. Rose, "Statistical properties of MPEG video traffic and their impact on traffic modeling in ATM systems," Univ. Würzburg, Inst. Comput. Sci., Würzburg, Germany, Tech. Rep. 101, Feb. 1995.
[22] G. M. Schuster and A. K. Katsaggelos, Rate-Distortion Based Video Compression, Optimal Video Frame Compression and Object Boundary Encoding. Norwell, MA: Kluwer, 1997.
[23] T. Takine, T. Suda, and T. Hasegawa, "Cell loss and output process analyses of a finite-buffer discrete-time ATM queueing system with correlated arrivals," in Proc. IEEE Information Communications (INFOCOM), San Francisco, CA, Mar. 1993, pp. 1259–1269.
[24] C. C. Tan and N. C. Beaulieu, "On first-order Markov modeling for the Rayleigh fading channels," IEEE Trans. Commun., vol. 48, no. 12, pp. 2032–2040, Dec. 2000.
[25] B. Vucetic, "An adaptive coding scheme for time-varying channels," IEEE Trans. Commun., vol. 39, no. 5, pp. 653–663, May 1991.
[26] H. S. Wang and N. Moayeri, "Finite-state Markov channel - A useful model for radio communication channels," IEEE Trans. Veh. Technol., vol. 44, no. 1, pp. 163–171, Feb. 1995.
[27] M. Zorzi and R. R. Rao, "On the statistics of block errors in bursty channels," IEEE Trans. Commun., vol. 45, no. 6, pp. 660–667, Jun. 1997.
[28] M. Zorzi, R. R. Rao, and L. B. Milstein, "Error statistics in data transmission over fading channels," IEEE Trans. Commun., vol. 46, no. 11, pp. 1468–1477, Nov. 1998.
[29] Coded Representation of Picture and Audio Information, MPEG Test Model 5. ISO-IEC/JTC1/SC29/WG11, Apr. 1993.
[30] Coded Representation of Picture and Audio Information, MPEG Test Model 2. International Standard ISO-IEC/JTC1/SC29/WG11, Jul. 1992.
[31] Coding of Moving Pictures and Associated Audio for Digital Storage Media up to 1.5 Mb/s Part 2, Video, International Standard ISO-IEC/JTC1/SC29/WG11, DIS11172-1, Mar. 1992.

Laura Galluccio received the Laurea degree in electrical engineering and the Ph.D. degree in electrical, computer and telecommunications engineering, both from the University of Catania, Catania, Italy, in 2001 and 2005, respectively. Since 2002, she has been with the Italian National Consortium of Telecommunications (CNIT), where she is working as a Research Fellow within the Virtual Immersive Communications (VICOM) Project. From May to July 2005, she was a Visiting Scholar at the COMET Group, Columbia University, New York, NY. Her research interests include ad hoc and sensor networks, protocols and algorithms for wireless networks, and network performance analysis. Dr. Galluccio served and will serve in the Program Committee of the 4th Academic Network for Wireless Internet Research in Europe (ANWIRE) International Workshop on Wireless Internet and Reconfigurability, the 20th International Symposium on Computer and Information Sciences (ISCIS 05), and Networking 2006.

Giacomo Morabito (M’02) received the Laurea degree in electrical engineering and the Ph.D. degree in electrical, computer, and telecommunications engineering from the University of Catania, Catania, Italy, in 1996 and 2000, respectively. From November 1999 to April 2001, he was with the Broadband and Wireless Networking Laboratory of the Georgia Institute of Technology as a Research Engineer. Since May 2001, he has been with the School of Engineering at Enna of the University of Catania, where he is currently an Assistant Professor. He is serving as a Guest Editor on the editorial board of Computer Networks and Mobile Networks and Applications (MONET). He is also a Member of the technical program committee of several conferences. Moreover, he has been the Technical Program Co-Chair of Med-Hoc-Net 2004. His research interests include mobile and satellite networks, self-organizing networks, quality of service (QoS), and traffic management. Dr. Morabito is serving on the Editorial Board of IEEE Wireless Communications Magazine.

Giovanni Schembra received the degree in electrical engineering from the University of Catania, Catania, Italy, in 1991. Working in the telecommunications area, he received the Master’s degree from CEFRIEL, Milan, Italy, in 1992, with his thesis focusing on the analytical performance evaluation in an ATM network. He received the Ph.D. degree in electronics, computer science, and telecommunications engineering with a dissertation on multimedia traffic modeling in a broadband network. He is currently an Assistant Professor in Telecommunications at the University of Catania.