QoS for Wireless Interactive Multimedia Streaming

Kostas E. Psannis
Dept. of Technology Management, University of Macedonia, Greece

Yutaka Ishibashi
Dept. of Computer Science & Eng., Nagoya Institute of Technology, Japan

Marios G. Hadjinicolaou
Dept. of Electronic & Computer Eng., Brunel University, UK

[email protected] [email protected]

ABSTRACT
This paper presents an efficient approach for supporting wireless interactive multimedia streaming services. One of the main goals of wireless interactive video services is to provide prioritized delivery, including dedicated bandwidth and controlled jitter (required by some real-time and interactive traffic), while placing minimal additional load on the server and keeping decoder complexity low. The proposed method is based on storing multiple, differently encoded versions of each interactive video stream at the server. The corresponding video streams are obtained by encoding the original uncompressed video file as a sequence of I-P(M) frames using different GOP patterns. Mechanisms for controlling interactive requests are also presented, and their effectiveness is assessed through extensive simulations. Wireless interactive video services are supported with minimal additional server load, network bandwidth and decoder complexity, and with acceptable visual quality at the wireless client end.

Categories and Subject Descriptors
D.2.3 [Coding Tools and Techniques].

General Terms
Algorithms, Experimentation, Verification.

Keywords
Quality of Service, Wireless Interactive, Multimedia Streaming, Scalable Coding, Joint Source Coding, Wireless Multimedia Coding

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Q2SWinet'07, October 22, 2007, Chania, Crete Island, Greece. Copyright 2007 ACM 978-1-59593-806-0/07/0010...$5.00.

1. INTRODUCTION
Multimedia applications and services over wireless networks are challenging due to constraints and heterogeneities such as limited battery power, limited bandwidth, random time-varying fading effects, different protocols and standards, and stringent quality of service (QoS) requirements. Wireless communications and networking protocols will ultimately bring video to our lives anytime, anywhere, and on any device. Until this goal is achieved, wireless video delivery faces numerous challenges, among them highly dynamic network topology, high error rates, limited and unpredictably varying bit rates, and scarcity of battery power [1]-[2]. Most emerging and future mobile client devices will differ significantly from those used for speech communications only; handheld devices will be equipped with a color display and a camera, and will have sufficient processing power to allow presentation, recording, and encoding/decoding of video sequences. In addition, emerging and future wireless systems will provide sufficient bit rates to support video communication applications. Nevertheless, bit rates will always be scarce in wireless transmission environments due to physical bandwidth and power limitations; thus, efficient video compression is required. In the last decade video compression technologies have evolved through the series MPEG-1, MPEG-2, MPEG-4 and H.264 [3]-[6]. Given a bandwidth of several hundred kilobits per second, recent codecs such as MPEG-4 can efficiently transmit quality video.

In recent years several techniques for supporting interactivity in MPEG coded video streaming applications have been devised [7]-[16]. In [7], [8] interactive functions are supported by dropping parts of the original MPEG-2 video stream. Typically this is performed once the video sequence has been compressed and aims to reduce the transport and decoding requirements. These approaches introduce visual discontinuities during the interactive mode due to the missing video information. Alternatively, interactive functions can be supported using separate copies of the movie that are encoded at lower quality than the normal playback copy [9], [10]. In these cases there is no significant degradation in visual quality; however, the number of prestored copies of the movie limits the speed-up granularity. Other conventional schemes that support interactive functions require that frames are displayed at a rate much higher than normal playback (for example 90 fps [11]), or involve downloading parts of the video data to a player device located at the customer premises so that the customer can view it without further intervention from the network [12]; in the latter case the downloading can be done prior to viewing. Other approaches [13], [14] address only the reverse functions of MPEG video streams; specifically, they describe methods of transforming an MPEG I-B-P compressed bitstream into I-B and I-P(M) bitstreams at the client, respectively. However, such approaches require much higher decoder complexity to perform the transformations and higher storage at the client. Some researchers suggested a model in which the server partitions each video stream into two sub-streams (a low-resolution and a residual component stream) in order to support interactivity [15]. During the interactive mode, only the low-resolution stream is transmitted to the client. This reduces the amount of data that needs to be retrieved and sent to the client, although it still introduces extra seek and latency overhead on the server. Moreover, [16] proposed a dual-bitstream least-cost scheme to efficiently provide VCR-like functionality for MPEG video streaming; this method requires low decoding complexity at the expense of significant extra storage and network requirements. Note that none of the methods mentioned above fully addresses the problem of supporting interactive operations with minimum additional resources in terms of server load, network bandwidth and decoder complexity, and they support only limited interactive functions. Moreover, none of them applies error resilience techniques at the video coding level to support wireless interactive multimedia services.

Error resilient (re-)encoding is a technique that enables robust streaming of stored video content over noisy channels. It is particularly useful when content has been produced independently of the transmission network conditions or under dynamically changing network conditions. Developing an error resilience technique that provides a high quality of experience to the end mobile user is a challenging issue [17]-[19]. In this paper we propose an efficient error resilience technique for wireless multimedia streaming. By encoding separate copies of the video, wireless interactive video streams are supported with minimum additional resources. The corresponding versions are obtained by encoding every (i.e., uncompressed) frame of the original movie as a sequence of I-P(M) frames using different GOP patterns.

The paper is organized as follows. In Section 2 the problem of supporting wireless interactive multimedia services is formulated. In Section 3 the preprocessing steps required to support efficient interactive multimedia streaming over wireless networks are detailed. Section 4 presents extensive simulation results. Finally, conclusions are discussed in Section 5.

2. PROBLEM DESCRIPTION
Supporting full interactive functions with MPEG coded video presents a number of considerable challenges associated with video data storage and playout [6]-[8]. An MPEG video stream comprises intra-coded frames (I), predicted frames (P), and interpolated frames (B). According to the MPEG coding standards, I-frames are coded independently of any other frame in the sequence; P-frames are coded using motion estimation, each one depending on the preceding I- or P-frame; finally, the coding of a B-frame depends on two "anchor" frames, the preceding I/P frame and the following I/P frame. An MPEG coded video sequence is typically partitioned into small intervals called GOPs (Groups of Pictures). This structure allows a simple realization of forward (normal) playback but imposes several additional constraints on the other interactive functions.

Figure 1. Fast Forward mode for F_S = 6: (a) normal play (I0 B1 B2 P3 B4 B5 I6 B7 B8 P9 ...), (b) redundant frames (B1 B2 B4 B5 B7 B8 B10 B11 ...), (c) transmitted frames (P3 I6 P9 I12 P15 I18 P21 I24 P27).

Assume that Ii is the starting point of the fast playback mode and F_S (Frames_Skipped) is the number of skipped frames. Consider the case in which the current frame is P3 and P9 is the following frame to be displayed during fast playback with an F_S factor equal to 6. There is no need to send any of the B-frames, because for the given F_S factor the displayed P-frames can be decoded from the preceding I- or P-frames alone. Fig. 1(c) depicts the frames that must be transmitted in order to display the P-frames correctly. It is logical to assume that the start point of the Fast Forward mode is an I-frame; since I-frames are decoded independently, starting there does not cause an unpleasant effect in viewing.
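To make this selection rule concrete, the following Python sketch computes which frames of the fixed I-B-B-P GOP of Figure 1 must be transmitted so that every F_S-th frame can be displayed and decoded during Fast Forward. It is a simplified illustration under assumed helper names, not the authors' implementation; B-frames are never displayed in this mode and are therefore not modelled.

def frame_type(i, gop_size=6, p_distance=3):
    # Frame type of index i in the fixed GOP pattern of Figure 1 (N = 6, M = 3).
    if i % gop_size == 0:
        return "I"
    if i % p_distance == 0:
        return "P"
    return "B"

def reference_chain(i, gop_size=6, p_distance=3):
    # Frames needed, back to the previous I-frame, to decode the displayed frame i.
    chain = []
    while frame_type(i, gop_size, p_distance) == "P":
        i -= p_distance                      # a P-frame references the previous I/P frame
        chain.append(i)
    return chain

def fast_forward_frames(first_shown, count, f_s=6, already_decoded=(0,)):
    # Indices that must be transmitted so that every F_S-th frame can be shown.
    have = set(already_decoded)              # the starting I-frame is already at the decoder
    transmitted = []
    for k in range(count):
        shown = first_shown + k * f_s        # frame to be displayed next
        for idx in sorted(reference_chain(shown)) + [shown]:
            if idx not in have:
                transmitted.append(idx)
                have.add(idx)
    return transmitted

# Starting after I0 with F_S = 6 this reproduces Figure 1(c):
# fast_forward_frames(3, 5) -> [3, 6, 9, 12, 15, 18, 21, 24, 27]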

3. PROPOSED TECHNIQUE
In a typical video distribution scenario, video content is captured, then immediately compressed and stored on a local network. At this stage, compression efficiency of the video signal is most important, as the content is usually encoded with relatively high quality and independently of any actual channel characteristics.

3.1 Interactive Mode
The problem addressed is that of transmitting a sequence of frames of stored video using the minimum amount of energy, subject to the video quality and bandwidth constraints imposed by the wireless network. To support interactive functions, the server maintains multiple, differently encoded versions of each movie. One version, referred to as the normal version, is used for normal-speed playback; the other versions are referred to as interactive versions. Each interactive version is used to support Fast/Jump Forward/Backward, Slow Down and Reverse at a variable speed-up. The server switches between the various versions depending on the requested interactive function. Assume that an I-frame is always the start point of the interactive mode; since I-frames are decoded independently, switching from normal play to interactive mode and vice versa can be done very efficiently. Note that only one version is transmitted at any given instant.

The corresponding interactive version is obtained by encoding every N-th (i.e., uncompressed) frame of the original movie as a sequence of I-P(M) frames (N_interactive = variable, M_interactive = 1), where the M ("Marionette") frame carries information only about the previously encoded frame. Effectively this results in repeating the previous I-frame at the decoder, which enhances the visual quality during the interactive mode: it freezes the content of the I-frame and thereby reduces the visual discontinuities caused by dropped B- and P-frames. Moreover, P(M) frames are produced between successive I-frames in order to maintain the same frame rate as normal play and to achieve full interactive operations at variable speeds.
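As an illustration of this switching behaviour, the sketch below keeps one normal version and several interactive versions and changes the transmitted version only at the next I-frame boundary, so that exactly one version is streamed at any instant. The class and method names are hypothetical stand-ins for the server logic described above, not the paper's actual implementation.

class StreamVersion:
    # One pre-encoded version of a movie (normal play or an interactive speed-up).
    def __init__(self, name, gop_size):
        self.name = name
        self.gop_size = gop_size              # distance between I-frames in this version

    def is_i_frame(self, index):
        return index % self.gop_size == 0

class InteractiveStreamer:
    # Streams exactly one version at a time; switches only at I-frame boundaries.
    def __init__(self, versions):
        self.versions = versions
        self.current = versions["normal"]
        self.pending = None                   # version requested but not yet active

    def request(self, name):
        # Client interactive request (fast forward, reverse, back to normal, ...).
        self.pending = self.versions[name]

    def next_frame(self, index):
        # Which version and frame index to transmit for playback position `index`.
        if self.pending is not None and self.current.is_i_frame(index):
            self.current = self.pending       # switch at an independently decodable frame
            self.pending = None
        return self.current.name, index

# Example: a fast-forward request takes effect at the next I-frame of the normal version.
streamer = InteractiveStreamer({
    "normal": StreamVersion("normal", gop_size=6),
    "ff": StreamVersion("ff", gop_size=1),    # interactive version: I-P(M), M = 1
})
streamer.request("ff")
print([streamer.next_frame(i)[0] for i in range(4, 9)])  # switches at frame 6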

3.2 Rate Control Schemes
We consider a system in which source coding decisions are made so as to use the minimum amount of energy, min_{q(i)} E{I, P(M)}, subject to a minimum distortion D_min at the mobile client and the channel rate C_Rate made available by the wireless network. Hence

min_{q(i)} E{I, P(M)} ≤ C_Rate
min_{q(i)} E{I, P(M)} ≤ D_min

Our goal is to properly select a quantizer q(i) in order to minimize the energy required to transmit the sequence of I-P(M) frames subject to both the distortion and the channel constraints. A common approach to controlling the size of an MPEG frame is to vary the quantization factor on a per-frame basis [21]; varying the amount of quantization is the mechanism that provides constant-quality rate control. The quantized coefficients QF[u,v] are computed from the DCT coefficients F[u,v], the quantization scale MQUANT and the quantization matrix W[u,v] according to the following equation:

QF[u,v] = (16 × F[u,v]) / (MQUANT × W[u,v])        (1)
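For concreteness, the following Python sketch applies Eq. (1) to an 8×8 block of DCT coefficients. It is an illustrative reading of the formula only; the integer truncation, the function name and the example matrix are assumptions, not details given in the paper.

import numpy as np

def quantize_block(F, mquant, W):
    # Quantize an 8x8 block of DCT coefficients F with scale MQUANT and matrix W,
    # following Eq. (1): QF[u,v] = 16 * F[u,v] / (MQUANT * W[u,v]).
    F = np.asarray(F, dtype=np.float64)
    W = np.asarray(W, dtype=np.float64)
    QF = (16.0 * F) / (mquant * W)
    return np.trunc(QF).astype(np.int32)    # integer truncation assumed here

# A larger MQUANT gives coarser quantization and hence a smaller encoded frame.
W = np.full((8, 8), 16.0)                   # flat illustrative quantization matrix
dct_block = np.full((8, 8), 100.0)
print(quantize_block(dct_block, mquant=8, W=W)[0, 0])   # 12 (coarse)
print(quantize_block(dct_block, mquant=2, W=W)[0, 0])   # 50 (finer)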

The quantization factor may be varied in two ways [20]-[21]:
• by varying the quantization scale (MQUANT), or
• by varying the quantization matrix (W[u,v]).

To bound the size of predicted frames, a P(M)-frame is encoded such that its size fulfils the following constraints:

BitBudget{I, P(M)} ≤ C_Rate
BitBudget{I, P(M)} ≤ D_min        (2)

In the first encoding attempt, the encoding algorithm starts with the nominal quantization value that was used to encode the preceding I-frame. If the resulting frame size fulfils the constraints in (2), the encoder proceeds to the next frame; otherwise, the quantization factor (the quantization matrix W[u,v]) is varied and the same frame is re-encoded. The quantization matrix is modified by keeping the same values at the near-DC coefficients but using a steeper slope towards the higher-frequency coefficients. This procedure is repeated until the size of the compressed frame satisfies (2). The advantage of this scheme is that it minimizes the fluctuation in video quality while satisfying the channel conditions.
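The iterative re-encoding loop just described might be sketched as follows. This is a simplified model under assumed names: encode_frame_size stands in for a real encoder call and only captures that coarser quantization yields smaller frames, and the matrix adjustment is one plausible realization of the "steeper slope towards high frequencies" rule.

import numpy as np

def steepen_matrix(W, step=1.0):
    # Keep the near-DC entries, increase the slope towards high-frequency entries.
    u, v = np.indices(W.shape)
    return W + step * (u + v)

def encode_frame_size(mquant, W):
    # Placeholder for the real encoder: returns a compressed frame size in bits.
    # It only models that coarser quantization produces a smaller frame.
    return int(2_000_000 / (mquant * W.mean()))

def rate_controlled_encode(bit_budget, mquant, W, max_attempts=16):
    # Re-encode a P(M)-frame until its size meets the bit budget of constraint (2).
    size = encode_frame_size(mquant, W)
    for _ in range(max_attempts):
        if size <= bit_budget:
            return size, W                   # constraints satisfied; move to the next frame
        W = steepen_matrix(W)                # otherwise adjust the matrix and retry
        size = encode_frame_size(mquant, W)
    return size, W                           # best effort after max_attempts

W0 = np.full((8, 8), 16.0)                   # nominal matrix used for the preceding I-frame
size, W_final = rate_controlled_encode(bit_budget=9_000, mquant=8, W=W0)
print(size, W_final.mean())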

4. EXPERIMENTS
There are two types of criteria that can be used for the evaluation of video quality: subjective and objective. Subjective rating is difficult to use because it is not mathematically repeatable. For this reason we measure the visual quality of the interactive mode using the Peak Signal-to-Noise Ratio (PSNR) of the Y-component of each decoded frame. The MPEG-4 bitstreams used for the simulations are the "Complex Motion", "High Motion" and "Normal Motion" video sequences with a frame rate of 30 fps. The GOP format was N = 5, M = 1. We consider a set of allowable channel rates, C_Rate = {300 Kbps, 200 Kbps, 100 Kbps}.

Figure 2 depicts the PSNR plot per frame obtained with the proposed algorithm for the different video traces with an allowable channel rate of C_Rate = 300 Kbps. All the video traces achieve acceptable visual quality because the PSNR is sufficiently large. The average PSNR values over the 60 frames of the video traces are 33.08 dB for the complex motion, 32.85 dB for the high motion and 32.75 dB for the normal motion trace, respectively. Note that the average quality is slightly better for the complex motion video trace. Figure 3 depicts the PSNR plot per frame obtained with the proposed algorithm for the mobile trace (complex motion) for different allowable channel rates. The average PSNR values over the 60 frames of the video trace are 33.08 dB, 32.65 dB and 32.35 dB for the allowable channel rates of 300 Kbps, 200 Kbps and 100 Kbps, respectively. As expected, the average quality is best at the 300 Kbps allowable channel rate.

Figure 2. PSNR for encoded frames in the interactive mode for different video traces (Complex Motion, High Motion, Normal Motion; C_Rate = 300 Kbps; PSNR in dB over frames 0-60).

Figure 3. PSNR for encoded frames in the interactive mode for different allowable channel rates (mobile trace; 300 Kbps, 200 Kbps, 100 Kbps; PSNR in dB over frames 0-60).
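For reference, the PSNR measure used in these experiments can be computed as in the sketch below (the standard definition for 8-bit luminance samples; the function name and the synthetic example data are illustrative, not part of the authors' test setup).

import numpy as np

def psnr_y(original_y, decoded_y, peak=255.0):
    # PSNR of the Y (luminance) component of a decoded frame, in dB.
    original_y = np.asarray(original_y, dtype=np.float64)
    decoded_y = np.asarray(decoded_y, dtype=np.float64)
    mse = np.mean((original_y - decoded_y) ** 2)
    if mse == 0:
        return float("inf")                  # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)

# Synthetic example: a CIF-sized Y plane corrupted by mild noise.
rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(288, 352)).astype(np.float64)
noisy = np.clip(frame + rng.normal(0, 4, frame.shape), 0, 255)
print(f"PSNR = {psnr_y(frame, noisy):.2f} dB")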

5. CONCLUSIONS
In this paper, we investigated the costs of supporting wireless interactive multimedia services in terms of server load, network bandwidth and decoder complexity. In order to avoid these additional resource requirements, we proposed the use of differently encoded versions of each video sequence. The server responds to an interactive request by switching from the currently transmitted version to another version. By properly encoding versions of the original video sequence, interactive functionality can be supported with considerably reduced additional storage, network bandwidth and decoder complexity, and with acceptable visual quality at the wireless client end.

6. REFERENCES
[1] M. Ethoh and T. Yoshimura, "Advances in Wireless Video Delivery", Proceedings of the IEEE, vol. 93, no. 1, 2005, pp. 111-122.
[2] K. E. Psannis and M. Hadjinicolaou, "MPEG based Interactive Video Streaming: A Review", WSEAS Transactions on Communications, vol. 2, April 2004, pp. 113-120.
[3] Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to About 1.5 Mbit/s (MPEG-1), Oct. 1993.
[4] Generic Coding of Moving Pictures and Associated Audio, ISO/IEC 13818-2 (MPEG-2), Nov. 1993.
[5] Coding of Moving Pictures and Associated Audio, MPEG98/W21994 (MPEG-4), Mar. 1998.
[6] T. Stockhammer and M. M. Hannuksela, "H.264/AVC Video for Wireless Transmission", IEEE Wireless Communications, Aug. 2005, pp. 6-72.
[7] B. Ozden, A. Biliris, R. Rastogi and A. Silberschatz, "A Low-Cost Storage Server for Movie on Demand Databases", in Proc. 20th VLDB Conference, Santiago, 1994, pp. 594-605.
[8] H. J. Chen, A. Krishnamurthy, T. D. C. Little and D. Venkatesh, "A Scalable Video-on-Demand Service for the Provision of VCR-Like Functions", in Proc. IEEE Multimedia, 1995, pp. 65-71.
[9] M. Krunz and G. Apostolopoulos, "Efficient Support for Interactive Scanning Operations in MPEG-based Video-on-Demand", Multimedia Systems, vol. 8, no. 1, Jan. 2000, pp. 20-36.
[10] M. Vernick, C. Venkatramani and T. Chiueh, "Adventures in Building the Stony Brook Video Server", in Proc. ACM Multimedia, Boston, Nov. 1996, pp. 287-295.
[11] J. K. Dey-Sircar, J. D. Salehi, J. F. Kurose and D. Towsley, "Providing VCR Capabilities in Large-Scale Video Servers", in Proc. ACM Multimedia Conference, San Francisco, CA, Oct. 1994, pp. 25-32.
[12] K. Psannis and M. Hadjinicolaou, "Transmitting Additional Data of MPEG-2 Compressed Video to Support Interactive Operations", in Proc. IEEE ISIMP 2001, Hong Kong, May 2001, pp. 308-311.
[13] M.-S. Chen and D. D. Kandlur, "Downloading and Stream Conversion: Supporting Interactive Playout of Video in a Client Station", in Proc. IEEE Multimedia Conference, 1995, pp. 73-80.
[14] K. Psannis and M. Hadjinicolaou, "Support Backward Interactive Functions of MPEG-2 Stream", in Proc. WSEAS Intelligent Systems Conference, Switzerland, Feb. 2002, pp. 282-287.
[15] P. J. Shenoy and H. M. Vin, "Efficient Support for Scan Operations in Video Servers", in Proc. ACM Multimedia Conference, San Francisco, CA, Nov. 1995, pp. 131-140.
[16] C.-W. Lin, J. Youn, J. Zhou and M.-T. Sun, "MPEG Video Streaming with VCR Functionality", IEEE Transactions on Circuits and Systems for Video Technology, vol. 11, no. 1, Feb. 2001, pp. 415-425.
[17] K. Psannis, M. Hadjinicolaou and A. Krikelis, "MPEG-2 Streaming of Full Interactive Content", IEEE Transactions on Circuits and Systems for Video Technology, vol. 16, no. 2, Feb. 2006, pp. 280-285.
[18] K. Psannis and Y. Ishibashi, "Impact of Video Coding on Delay and Jitter in 3G Wireless Video Multicast Services", EURASIP Journal on Wireless Communications and Networking, vol. 2006, Article ID 24616, June 2006, pp. 1-7.
[19] G. Carle and E. Biersack, "Survey of Error Recovery Techniques for IP-based Audio-Visual Multicast Applications", IEEE Network, Dec. 1997, pp. 38-45.
[20] K. E. Psannis, Y. Ishibashi and M. Hadjinicolaou, "A Novel Method for Supporting Full Interactive Media Stream over IP Network", International Journal on Graphics, Vision and Image Processing, vol. 4, 2005, pp. 25-31.
[21] K. E. Psannis, M. Hadjinicolaou and A. Krikelis, "Full Interactive Functions in MPEG-based Video on Demand Systems", in N. E. Mastorakis and G. Antoniou (eds.), Recent Advances in Circuits, Systems and Signal Processing, WSEAS Press, 2002, pp. 419-424.