Proceedings of 2001 International Symposium on Intelligent Multimedia, video and Speech Processing
May 2-4 2001 Hong Kong
Transmitting Additional Data of MPEG-2 Compressed Video To Support Interactive Operations
Kostas Psannis
Dept of Electronic & Computer Engineering
Brunel University, Uxbridge, Middlesex, UB8 3PH, UK
[email protected]
Marios Hadjinicolaou
Dept of Electronic & Computer Engineering
Brunel University, Uxbridge, Middlesex, UB8 3PH, UK
Marious.Hadj [email protected] .uk
ABSTRACT
In this paper we address the problem of supporting interactive playout, both forward and backward, of an MPEG-2 encoded video stream. Typically, the forward and backward operations are used mainly in the form of Fast Forward (FF) and Fast Backward or Fast Rewind (FR). The proposed approach is based on transmitting additional data of the same movie from the server to the Digital Storage Device (DSD) in a client station. The server responds to an FF/FR request by switching, via the digital device, from the normal version to the appropriate FF/FR version. The additional data consists solely of I frames: the P and B frames are removed and only the I frames are shown. This alternative "interactive" data is generated from the storage at the server by taking every n-th frame of the original MPEG-2 compressed video. Interactive operations are supported using network bandwidth in addition to that already allocated for normal playback. Specifically, the server can decide the bit rate of the additional data required for transmission to the digital device by varying the frame rate (number of I frames/sec), depending on the availability of network bandwidth.
Keywords: MPEG, Fast Forward/Rewind operations, Interactive Video on Demand, Scanning operations
I INTRODUCTION
Significant progress has been made during the last few years on the development and standardization of core high-speed networking and digital video techniques. The forthcoming multimedia telecommunication services are expected to use a great deal of video material in compressed format, for storage and transmission. At present, it is generally accepted by both the telecommunication industry and the academic community that MPEG-2 will be the main standard for video coding. A Video On Demand server is expected not only to serve many clients concurrently, but also to provide interactive features such as Fast Forward and Fast Rewind play. The inter-frame dependencies [1] of MPEG make it prohibitively expensive to provide certain interactive
features over the network [2]. Interactive functions can be supported by dropping parts of the original MPEG-2 video stream [3], [4]. Typically, dropping is performed after compression and aims to reduce the transport and decoding requirements without causing significant degradation in video quality. Alternatively, interactive functions can be supported using separate copies of the movie that are encoded at a lower quality than the normal playback copy [5], [6]. Other conventional schemes that support interactive functions either display frames at a rate much higher than that of normal playback, for example 90 fps [7], or involve downloading the video data to a player device (not real-time playout) located at the customer premises, so that the customer can view it without further intervention from the network [8].
In this work, we introduce an efficient approach for supporting Fast Forward (FF) and Fast Rewind (FR) in a Video On Demand (VOD) system. Our approach is based on transferring additional data from the server to the digital storage in the client premises. The additional data is generated at the server from every n-th frame of the original movie, which is compressed using MPEG-2 video coding. We refer to the additional data as the FF/FR data and to the data used for normal playback as the MPEG-2 data.
The remainder of this paper is organized as follows. In Section II we briefly describe the MPEG data organization. Section III provides a detailed description of the proposed methodology. In Section IV we compare the additional storage at the client with the storage at the server and with the extra storage required by another method. Finally, Section V concludes the paper and points to open research.
II MPEG Organization
Assume that the MPEG-2 file has a playback rate of 30 frames per second. Note that there are no hardware or software MPEG decoders that can run faster than 30 frames per second. Let TF be the Total (T) number of Frames (F) in the normal version. We define the number of I, P and B frames as follows:
$$ I = \frac{TF}{N} \tag{1} $$
$$ P = TF \times \left(\frac{1}{M} - \frac{1}{N}\right) = TF \times \frac{N - M}{M \times N} \tag{2} $$
$$ B = TF \times \left(1 - \frac{1}{M}\right) = TF \times \frac{M - 1}{M} \tag{3} $$
where N is the distance between two successive I frames, defining a "group of pictures" (GoP), and M is the distance between consecutive I or P frames, usually set to 3. Note that N must be a multiple of M. M = 1 means no B frames in the sequence. =0 implies I frames only. TF must be a multiple of N. Let β = Total Number of I frames (TNI); then,
$$ N = \alpha \times M \tag{4} $$
$$ TF = \beta \times N \tag{5} $$
From (4) and (5) we have
$$ TF = \beta \times \alpha \times M \tag{6} $$
So, according to (1), (2), (3), (4), (5), (6) we have
$$ I = \frac{TF}{N} \;\Leftrightarrow\; I = \beta \tag{7} $$
$$ P = TF \times \frac{N - M}{M \times N} \;\Leftrightarrow\; P = \beta \times (\alpha - 1) \tag{8} $$
$$ B = TF \times \frac{M - 1}{M} \;\Leftrightarrow\; B = \alpha \times \beta \times (M - 1) \tag{9} $$
III Detailed Description of the Model
Interactive operations such as FF/FR are implemented both at the server and at the Digital Storage Device (DSD). Good signaling between the client, the server and the digital storage device is necessary to switch from normal playback to Fast Forward/Fast Rewind and vice versa. Switching from one version to another is performed online, in response to a client request. To maintain the GoP periodicity, switching from the normal version to an interactive version must take place at an I frame. When the FF/FR request arrives at the server, the server continues to send frames from the normal version up to and excluding the previous frame that follows the common I frame. This introduces latency: the receiver continues normal playback for some frame periods after the FF/FR request is issued. It is worth mentioning that a client request for FF/FR mode implies that the movie is to be advanced in a fast mode. Thus, extending normal playback by a few seconds prior to initiating FF/FR mode is acceptable.
Figure 2: Proposed model (streams shown: MPEG-2 data; Fast Forward/Rewind data I1, I16, I31, I46, ...)
The block diagram of the proposed model is depicted in Figure 2. In our proposed work we create, in real time, an alternative data file from the storage of the server. In this case we consider that the server can support the creation of two data streams from the same storage. The additional data originates from the compressed MPEG-2 video and is transferred to the digital device over the same channel if the bandwidth permits, or over a communication channel different from the one used for normal playback. The FF/FR data consists of an Elementary Stream (ES) made up solely of I frames. Assume that the typical characteristics of the MPEG-2 video are those shown in Table I.
Table I: Characteristics of the MPEG-2 video (sizes in bytes). Surviving entries: I-frame size; frames 30392; average 20663, 11856; 11910.
Table II
Frames in a GoP (N): 15
Ratio of I:P:B: 1:4:10
Recording ratio: 30
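As a quick numerical check of the frame-count equations (1)-(3) and their (7)-(9) forms, the following Python sketch splits a stream into I, P and B frame counts. The function name `frame_counts` and the two-hour, 30 fps movie are illustrative assumptions, as is M = 3 (the value the text calls usual, and the one consistent with the 1:4:10 ratio for N = 15).

```python
def frame_counts(total_frames: int, n: int, m: int) -> dict:
    """Return the number of I, P and B frames in a stream.

    total_frames is TF, n is the GoP length N and m is the I/P distance M.
    """
    if m < 1 or n % m != 0:
        raise ValueError("N must be a multiple of M")
    if total_frames % n != 0:
        raise ValueError("TF must be a multiple of N")
    beta = total_frames // n   # total number of I frames (TNI), TF = beta * N
    alpha = n // m             # N = alpha * M
    return {
        "I": beta,                    # I = TF / N
        "P": beta * (alpha - 1),      # P = TF * (N - M) / (M * N)
        "B": alpha * beta * (m - 1),  # B = TF * (M - 1) / M
    }


if __name__ == "__main__":
    # Hypothetical two-hour movie at 30 frames per second.
    tf = 2 * 60 * 60 * 30
    counts = frame_counts(tf, n=15, m=3)
    print(counts)
    # The three frame types must account for every frame in the stream.
    assert sum(counts.values()) == tf
```

For a single GoP (TF = N = 15, M = 3) the function returns one I frame, four P frames and ten B frames, which is the 1:4:10 ratio listed for a 15-frame GoP.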
Table II provides a typical GoP length (N), frame ratio and recording ratio. According to Table I and Table II, the bandwidth for normal play is given by
$$ 30\,\text{fps} \times \text{Average(IPB)Size} \times 8\,\text{bits/byte} = 1.99\,\text{Mbps} $$
where
$$ \text{Average(IPB)Size} = \frac{I_{average}}{N} + P_{average} \times \left(\frac{1}{M} - \frac{1}{N}\right) + B_{average} \times \left(1 - \frac{1}{M}\right) $$
One of the distinguishing features of our model is that the server can decide the bit rate of the additional data by varying the frame rate (number of I frames/sec) (Table III).
Table III: Computation of the additional bit rate
Table IV: Comparison of our additional storage
For example, if the server transmits additional data at 6 I frames/sec, the additional bit rate is 0.99 Mbps and the total bit rate is 2.98 Mbps (Table III).
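The arithmetic behind this example can be sketched in a few lines of Python. The sketch assumes that the 20663-byte figure in Table I is the average I-frame size (an assumption, although it reproduces the 0.99 Mbps quoted above) and takes the 1.99 Mbps normal-play rate as given. The function and constant names are hypothetical.

```python
AVG_I_FRAME_BYTES = 20663   # assumed: average I-frame size taken from Table I
NORMAL_PLAY_MBPS = 1.99     # normal-playback bit rate quoted in the text


def additional_bit_rate_mbps(i_frames_per_sec: float,
                             avg_i_frame_bytes: int = AVG_I_FRAME_BYTES) -> float:
    """Bit rate of the I-frame-only FF/FR stream in Mbps (10**6 bits/s)."""
    return i_frames_per_sec * avg_i_frame_bytes * 8 / 1e6


def max_i_frame_rate(spare_mbps: float,
                     avg_i_frame_bytes: int = AVG_I_FRAME_BYTES) -> int:
    """Largest whole number of I frames/sec that fits in the spare bandwidth."""
    return int(spare_mbps * 1e6 // (avg_i_frame_bytes * 8))


if __name__ == "__main__":
    # Additional and total bit rate for a range of I-frame rates.
    for rate in range(1, 7):
        extra = additional_bit_rate_mbps(rate)
        total = NORMAL_PLAY_MBPS + extra
        print(f"{rate} I/sec: extra {extra:.2f} Mbps, total {total:.2f} Mbps")
```

The second helper shows the inverse decision the server has to make: given the bandwidth left over after normal playback, how many I frames per second can be sent.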
IV The additional storage in the client premises
The extra storage required by the Digital Storage Device (DSD), the storage at the server and the additional storage required by the stream conversion method [8] are defined as follows.
$$ W_{client} = TNI \times I_{average} \tag{10} $$
$$ W_{server} = TF \times \text{Average(IPB)Size} \tag{11} $$
$$ W_{client(P \to I)} = TF \times \text{Average(I,B)Size} \tag{12} $$
where
$$ \text{Average(I,B)Size} = I_{average} \times \frac{1}{M} + B_{average} \times \frac{M - 1}{M} $$
From (10), (11) and (12) we have
$$ \frac{W_{server}}{W_{client}} = \varphi + \left[\frac{B_{average}}{I_{average}} \times (M - 1) + \frac{P_{average}}{I_{average}}\right] \times \alpha, \qquad \frac{W_{client(P \to I)}}{W_{client}} = \gamma \times \alpha $$
where
$$ \varphi = \left(1 - \frac{P_{average}}{I_{average}}\right), \qquad \gamma = 1 + \frac{B_{average}}{I_{average}} \times (M - 1) $$
From Table I and Table II we get
$$ \frac{W_{server}}{W_{client}} = 0.43 + 1.11 \times \alpha, \qquad \frac{W_{client(P \to I)}}{W_{client}} = 1.54 \times \alpha $$
Table IV shows that by reducing the GoP length (N) we increase the storage requirement for both the server and the client, while on the other hand we provide better visual quality for the FF/FR mode, since more I frames are generated. Tuning this parameter requires careful consideration because there is a trade-off between the picture quality in the interactive mode and the extra storage required at both the server and the client. This trade-off is shown in Figure 3 and Figure 4. In the same table we compare the extra storage at the client station using our methodology with the extra storage required at the customer premises due to P→I conversion [8]. In Figure 4 we can see that our extra storage is less than the additional storage required by the P→I conversion method. This is because our method stores only the I frames of each GoP, whereas the P→I conversion method [8] converts the P frames into I frames and then stores all the frames of each GoP.
Figure 3: Relative increase of the storage as a function of time (x-axis: Time (sec); curves: Wserver, Wclient and Wclient(P→I), each for N=6 and N=15)
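The storage comparison of this section can be explored numerically with the sketch below. It uses the ratio forms W_server/W_client = φ + [(B/I)(M-1) + P/I]·α and W_client(P→I)/W_client = γ·α, with φ = 1 - P/I and γ = 1 + (B/I)(M-1), as reconstructed from equations (10)-(12). The function name is hypothetical, and the frame-size ratios P/I = 0.57 and B/I = 0.27 are assumptions back-solved from the coefficients 0.43, 1.11 and 1.54 quoted in the text with M = 3; they are not read directly from Table I.

```python
def storage_ratios(n: int, m: int, p_over_i: float, b_over_i: float):
    """Return (W_server / W_client, W_client(P->I) / W_client).

    p_over_i and b_over_i are P_average / I_average and
    B_average / I_average; n is the GoP length N, m the I/P distance M.
    """
    if m < 1 or n % m != 0:
        raise ValueError("N must be a multiple of M")
    alpha = n // m                       # N = alpha * M
    phi = 1.0 - p_over_i                 # constant term of the server ratio
    gamma = 1.0 + b_over_i * (m - 1)     # slope of the P->I conversion ratio
    server_ratio = phi + (b_over_i * (m - 1) + p_over_i) * alpha
    conversion_ratio = gamma * alpha
    return server_ratio, conversion_ratio


if __name__ == "__main__":
    # Assumed ratios, back-solved from the coefficients quoted in the text.
    P_OVER_I, B_OVER_I = 0.57, 0.27
    for n in (6, 15):
        server, conversion = storage_ratios(n, 3, P_OVER_I, B_OVER_I)
        print(f"N={n}: W_server/W_client={server:.2f}, "
              f"W_client(P->I)/W_client={conversion:.2f}")
```

Under these assumptions both ratios grow with α = N/M, so a longer GoP makes the I-frame-only client copy smaller relative to both the server copy and the P→I converted copy.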
Figure 4: Relative increase of Wclient as a function of Wserver or Wclient(P→I) (x-axis: Wclient (Mbytes); curves: Wserver-Wclient and Wclient(P→I)-Wclient)
V Conclusions
In this paper we have proposed a new methodology for supporting interactive operations such as Fast Forward and Fast Rewind in a Video On Demand (VoD) system. Interactive operations are supported by generating additional data for each movie at the server and transmitting it to the digital device in the client station. The advantage of our methodology is that only a small amount of additional storage is needed at the client, compared with the storage at the server and with the extra storage used by another method [8]. Specifically, there is a trade-off: by reducing the GoP length (N) we increase the storage at both the server and the client, while we provide better visual quality for the FF/FR mode, since more I frames are generated. In addition, interactive operations can be supported with more or less extra bandwidth, depending on the availability of the network.
In our future work we will analyse the storage of a server that supports two output data streams; predict the limitations on the interactive functions (e.g. number of supported speedups, visual quality during FF/FR mode, maximum duration of the interactive mode); support other forms of interactivity, including switching between Fast Forward and Fast Rewind without going through normal playback; and propose bit rate reduction methods for both the original MPEG-2 and the FF/FR streams, in order to accommodate the additional bit rate in the same channel. Another point for open research is to integrate the Digital Storage Device (DSD) into the next generation of integrated receiver decoders (IRD) or Set Top Boxes, in order to achieve greater cost effectiveness for the client.
REFERENCES
[1] International Standard 13818-2. Information Technology - Generic Coding of Moving Pictures and Associated Audio Information: Video.
[2] M-S. Chen, D.D. Kandlur, and P.S. Yu. "Support for Fully Interactive Playout in a Disk-Array-Based Video Server". In: Proc. Second International Conference on Multimedia, October 1994, San Francisco, CA, ACM, pp 391-398.
[3] Banu Ozden, Alexandros Biliris, Rajeev Rastogi, Avi Silberschatz. "A Low-Cost Storage Server for Movie on Demand Databases". In: Proc. of the 20th VLDB Conference, Santiago, Chile, 1994, pp 594-605.
[4] Michael Vernick, Chitra Venkatramani, Tzi-cker Chiueh. "Adventures in Building the Stony Brook Video Server". In: Proc. ACM Multimedia, November 1996, Boston, MA, pp 287-295.
[5] Marwan Krunz, George Apostolopoulos. "Efficient Support for Interactive Scanning Operations in MPEG-based Video on Demand". Multimedia Systems, vol. 8, no. 1, Jan. 2000, pp 20-36.
[6] Prashant J. Shenoy, Harrick M. Vin. "Efficient Support For Scan Operations In Video Servers". In: Proc. ACM Multimedia Conference, November 1995, ACM Press, San Francisco, CA, pp 131-140.
[7] Jayanta K. Dey-Sircar, James D. Salehi, James F. Kurose, Don Towsley. "Providing VCR Capabilities in Large-scale Video Servers". In: Proc. ACM Multimedia Conference, Oct 1994, San Francisco, CA, pp 25-32.
[8] M-S. Chen, D.D. Kandlur. "Downloading and Stream Conversion: Supporting Interactive Playout of Video in a Client Station". In: Proc. IEEE Multimedia Conference (1995).
Acknowledgement
We would like to thank Dr A Krikelis, the Chief Technology Officer of Aspex Technology, for his valuable comments on this work.