Video Transcoding and Streaming for Mobile Applications

Giovanni Gualdi, Rita Cucchiara
Dipartimento di Ingegneria dell’Informazione
University of Modena and Reggio Emilia
Via Vignolese, 905/b - 41100 Modena, Italy
{gualdi.giovanni, cucchiara.rita}@unimore.it

Andrea Prati
Dip.to di Scienze e Metodi dell’Ingegneria
University of Modena and Reggio Emilia
Via Amendola, 2 - 42100 Reggio Emilia, Italy
[email protected]

Abstract

This work presents a system for compressing and streaming live video over low-bandwidth networks (mobile radio networks), with the goal of providing an effective solution for mobile video access. We present a ready-to-use mobile streaming system that encodes video with the H.264 codec (which offers good quality and frame rate at very low bit rates) and streams it over the network using UDP. A dynamic frame-rate control has been implemented to obtain the best trade-off between playback fluency and latency.

1. Introduction

Mobile video browsing has been a very hot topic in the multimedia community in recent years. The increase in computational power has also made it possible to perform the other end of the video stream in real time, that is, grabbing, encoding and network streaming. On top of this it is possible to build mobile live video encoding and streaming; this is a challenging target that raises many technological issues, which can be summarized in the following points:

1. Ubiquitous networks. Given the high degree of mobility required by the application, a network with (almost) ubiquitous territorial coverage is necessary. WiFi and UMTS cannot currently be used because of their limited coverage, so the GPRS network has been selected as the underlying network; GPRS is based on the GSM infrastructure, which covers a very high percentage of the territory. In particular, the GPRS-EDGE version (Enhanced Data rates for GSM Evolution), also known as EGPRS, has been used, since it has the same coverage as GPRS but a higher available bandwidth. Note that when the encoding side is mobile, the video is transmitted over the wireless uplink. This is a further constraint, since GPRS/EGPRS and other wireless links are often asymmetric and favor the downlink.

2. Live video. The system requires live video encoding and decoding. The encoding must be efficient, so that video can be effectively transmitted over low bandwidth. It is important to note that in on-line video compression, offline (multi-pass) encoding is not possible.

3. Good perceived video quality. We want the system to offer a good understanding of the scene, which requires the streamed live video to provide good single-image quality, fluency and no image skipping.

4. Low latency. This is a very important point, and it makes the difference between live video streaming for entertainment and for interaction purposes. In entertainment scenarios, latencies of several seconds might not be considered a problem; interaction, on the contrary, requires quick end-to-end reaction and therefore low latency.

Commercial or off-the-shelf streaming solutions are not sufficient to meet our requirements. We considered Windows Media Suite [1], Darwin Streaming Server [2] and Helix Streaming Server [3]. These tools offer excellent solutions for entertainment video streaming and intensive broadcasts, but they are not optimized for unicast low-latency video streaming. Moreover, there are constraints on the choice of the codec (often only proprietary codecs are allowed) or on the network management (poor flexibility in the protocols or the communication ports). VideoLan (VLC) [4] is a very good open source video streaming solution, flexible and effective. It implements many codecs, including H.264 [5] (for both encoding and decoding), but it still shows a few blocking limitations:

• Constraints on protocols: for live video streaming over UDP, the video must be encapsulated in MPEG-TS, which [6] shows to waste bandwidth.
• Latency is not satisfactory, as shown in the experimental results.
• High packet loss rate: to obtain the lowest latency, all buffers (on both the encoding and the decoding side) were minimized, but this makes the system quite sensitive to any network instability.

For all the above reasons, we decided to develop our own live video streaming system, based on H.264/AVC and on UDP as the transport layer protocol. Regarding the codec, our choice of H.264 over other codecs is straightforward from a video quality point of view; see the PSNR comparison between H.264 and MPEG-4 in Figure 1-A. The drawback of H.264 is its encoding and decoding complexity (as shown in Figure 1-B). Since the bandwidth is limited, the frame size of the encoded video is small (CIF at most), which makes it possible to perform real-time encoding on regular laptop PCs. The decoding can be performed on modest x86 PCs but also on high-performance PDAs (tested on a 520 MHz XScale). The decision to use UDP rather than TCP is straightforward, since we are streaming real-time data. UDP has a prioritized dispatch over TCP, and this makes it even more suitable for our target application. Moreover, the experimental results will show that it is also very reliable.

2. Related works

We could not find specific works addressing all four of the aforementioned requirements for our mobile application. A very nice application for live video streaming between moving vehicles has been proposed by Guo et al. in [7]. However, their system is based on 802.11 WiFi networks and is thus not suitable for our case. Some previous works have proposed systems for video streaming over low-capacity networks, such as GPRS. For instance, Lim et al. in [8] introduced a PDA-based live video streaming system on a GPRS network. The system is based on MPEG-4 compression / decompression on the PDA. Their system works at 2-3 fps when transmitting over GPRS, which does not meet our requirements. This limitation is mainly due to the video codec. Moreover, no information on the latency of the system is provided.

H.264/AVC [5] opens up new possibilities in video streaming. In fact, the primary goals of this standard are improved video coding and improved network adaptation. Argyriou and Madisetti in [9] use H.264 together with a new transport layer protocol called Stream Control Transmission Protocol (SCTP), suitable for handling multiple streams and multi-client access to videos. This is not our case, where a single video (with video data only and no audio) has to be transmitted to a single receiver, hence data multiplexing is not required.

3. System Description

Our system can be divided into two parts, the encoder and the decoder (shown in Figures 2 and 3). Both applications are multi-threaded in order to increase performance; each block of the diagrams represents a thread. The programming language is C#.NET where possible, for ease of implementation. The computationally intensive blocks (encoding and decoding) have been imported as native C++ modules.

The video encoder is built upon the free open source x264 [10]. The original x264 source code has been modified so that it can load video not only from the file system but also from a generic circular buffer. This allows higher flexibility, since the buffer can easily be fed by any kind of source (file system, video device, video server, etc.). In our case the video source is a video file or a USB camera: a dedicated video grabbing thread provides the YUV frames to the circular buffer, and the encoder thread asynchronously extracts them from the buffer. If the grabbing rate is higher than the encoding rate for a short time, no video data is lost. As a drawback, the buffer introduces some latency; for this reason it is important to keep the buffer at low occupancy. The raw H.264 encoded stream is then sent over the network, split into UDP datagrams of fixed byte size. The decoder has been built upon the FFMPEG H.264 engine [11].

Depending on the processing load on the encoder (which may vary for many reasons, such as the different motion conditions of the video to encode), on the decoder or on the network, the packet generation rate P.G.R. (encoder side) and the packet extraction rate P.E.R. (decoder side) might differ from time to time. Therefore, the UDP network buffer at the receiver side plays an essential role in reducing the effects of these discrepancies. If the P.G.R. remains higher than the P.E.R., the buffer might fill up and, once it is completely full, new incoming datagrams are lost. For this reason we implemented a simple adaptive algorithm that increases or decreases the buffer size dynamically: in practice, the algorithm doubles the buffer size every time it is filled beyond 80%, and halves it when its level drops below 20%. These threshold values were determined empirically and depend on the network conditions.

Since the latency is directly related to the occupancy of the buffer, it is important to keep the occupancy as low as possible; this could be done by setting a playback frame rate higher than the encoding frame rate. However, this would generate intermittent playback, since fluency would be interrupted every time the buffer is emptied and the decoder has to wait for the next incoming datagram. For this reason we implemented a dynamic adaptation of the frame rate. As shown in Figure 4, we define an optimal occupancy range of the buffer (empirically set to around 5 to 30%), and we modify the playback frame rate according to some multipliers (α, β, γ), which are defined as functions of Δ(occ%), the derivative of the buffer occupancy. The graphs of α, β, γ are shown in Figure 4.
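The two receiver-side mechanisms described in this section can be sketched as follows. This is only a minimal illustration, not the system's actual code: the 80% / 20% resize thresholds and the 5 to 30% occupancy range come from the text, while the initial capacity, the capacity bounds, the mapping of α, β, γ to "below", "inside" and "above" the optimal range, and the piecewise-linear shapes of the three multipliers are all hypothetical (the real curves are given only graphically in Figure 4).

```python
from dataclasses import dataclass

# Thresholds taken from the text; every other constant is an assumption.
GROW_ABOVE = 0.80                  # double the buffer above this occupancy
SHRINK_BELOW = 0.20                # halve the buffer below this occupancy
BAND_LOW, BAND_HIGH = 0.05, 0.30   # optimal occupancy range


@dataclass
class AdaptiveBuffer:
    """Receiver-side datagram buffer whose capacity follows its fill level."""
    capacity: int = 64        # in datagrams (assumed initial size)
    level: int = 0            # datagrams currently stored
    min_capacity: int = 16    # assumed lower bound
    max_capacity: int = 4096  # assumed upper bound

    def occupancy(self) -> float:
        return self.level / self.capacity

    def resize(self) -> None:
        """Double the capacity above 80% occupancy, halve it below 20%."""
        occ = self.occupancy()
        if occ > GROW_ABOVE and self.capacity * 2 <= self.max_capacity:
            self.capacity *= 2
        elif occ < SHRINK_BELOW and self.capacity // 2 >= self.min_capacity:
            self.capacity //= 2


def _clamp(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))


# Hypothetical multiplier shapes; d_occ is the change in occupancy (0..1 scale)
# between two consecutive observations, standing in for Δ(occ%).
def alpha(d_occ: float) -> float:   # below the range: slow playback down
    return _clamp(0.9 + 2.0 * d_occ, 0.7, 1.0)


def beta(d_occ: float) -> float:    # inside the range: small corrections only
    return _clamp(1.0 + 1.0 * d_occ, 0.9, 1.1)


def gamma(d_occ: float) -> float:   # above the range: speed playback up
    return _clamp(1.1 + 2.0 * d_occ, 1.0, 1.3)


def playback_fps(encoding_fps: float, occ: float, prev_occ: float) -> float:
    """Scale the playback rate by the multiplier selected by the occupancy."""
    d_occ = occ - prev_occ
    if occ < BAND_LOW:
        multiplier = alpha(d_occ)
    elif occ <= BAND_HIGH:
        multiplier = beta(d_occ)
    else:
        multiplier = gamma(d_occ)
    return encoding_fps * multiplier
```

With these placeholder curves, a nearly empty buffer plays slightly slower than the encoding rate so that it can refill, while a buffer above the optimal range plays faster so that the accumulated latency is drained.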

4. Experimental results

We tested the system with the encoder running on a laptop connected through EGPRS or GPRS and mounted on a car moving at variable speed (up to 110 km/h), for more than 100 minutes of transmission. The decoder ran on a standard x86 platform or on an XScale PDA device running Windows Mobile 5. Less than 0.1% of the datagrams were lost, and none of them arrived out of order. The measured latency is reported in Table 1.
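The paper does not say how lost and out-of-order datagrams were counted. One common way to obtain such figures is to prefix each fixed-size datagram with a sequence number, as in the sketch below. The 4-byte header, the 1024-byte payload size and both function names are assumptions made for illustration only; in a real sender each datagram would then be passed to a UDP socket.

```python
import struct
from typing import Iterable, List, Tuple

HEADER = struct.Struct("!I")   # 4-byte big-endian sequence number (assumed)
PAYLOAD_SIZE = 1024            # fixed payload size in bytes (assumed value)


def packetize(stream: bytes, payload_size: int = PAYLOAD_SIZE) -> List[bytes]:
    """Split an encoded stream into datagrams of fixed payload size.

    Only the last datagram may carry a shorter payload.
    """
    return [
        HEADER.pack(seq) + stream[offset:offset + payload_size]
        for seq, offset in enumerate(range(0, len(stream), payload_size))
    ]


def loss_and_reorder(received: Iterable[bytes], sent_count: int) -> Tuple[int, int]:
    """Return (lost, out_of_order) for datagrams in their arrival order."""
    seen = set()
    out_of_order = 0
    highest = -1
    for datagram in received:
        (seq,) = HEADER.unpack_from(datagram)
        if seq < highest:          # arrived after a later datagram
            out_of_order += 1
        highest = max(highest, seq)
        seen.add(seq)
    return sent_count - len(seen), out_of_order
```

The loss percentage is then the first returned value divided by the number of datagrams sent.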

With the adaptation of the playback frame rate enabled, the system removes the playback interruptions within a short time (less than 60 seconds, depending on the functions α, β, γ). PSNR measurements gave an average of 34.49 dB on a CIF sequence of 2500 frames at 10 fps and 120 kbps.
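For reference, PSNR for 8-bit samples is defined as 10·log10(255² / MSE), where MSE is the mean squared error between the original and the decoded frame. The sketch below computes it on flat sequences of sample values. Averaging the per-frame values is an assumption, since the paper does not state how its average was obtained, and the function names are illustrative.

```python
import math
from typing import List, Sequence, Tuple


def psnr(reference: Sequence[int], decoded: Sequence[int], peak: int = 255) -> float:
    """PSNR in dB between two equally sized frames given as flat sample lists."""
    if len(reference) != len(decoded) or len(reference) == 0:
        raise ValueError("frames must be non-empty and of equal size")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf            # identical frames
    return 10 * math.log10(peak ** 2 / mse)


def mean_psnr(frame_pairs: List[Tuple[Sequence[int], Sequence[int]]]) -> float:
    """Average of the per-frame PSNR values (assumed averaging method)."""
    values = [psnr(ref, dec) for ref, dec in frame_pairs]
    return sum(values) / len(values)
```

As a side note on the reported operating point, 120 kbps at 10 fps leaves on average 12 kbit (1500 bytes) per encoded frame, taking 1 kbit as 1000 bits.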

5. References

[1] http://www.microsoft.com/windows/windowsmedia/
[2] http://developer.apple.com/opensource/server/streaming/index.html
[3] http://www.realnetworks.com/products/mediadelivery.html
[4] http://www.videolan.org/vlc/
[5] Advanced video coding for generic audiovisual services. Technical report, ITU Rec. H.264 / ISO/IEC 14496-10 AVC, 2003.
[6] A. MacAulay, B. Felts, and Y. Fisher. Whitepaper: IP streaming of MPEG-4: Native RTP vs MPEG-2 transport stream. Technical report, Envivio, Inc., Oct. 2005.
[7] M. Guo, M. Ammar, and E. Zegura. V3: a vehicle-to-vehicle live video streaming architecture. In Proc. of IEEE Intl Conf on Pervasive Computing and Communications, pages 171–180, 2005.
[8] K. Lim, D. Wu, S. Wu, R. Susanto, X. Lin, L. Jiang, R. Yu, F. Pan, Z. Li, S. Yao, G. Feng, and C. Ko. Video streaming on embedded devices through GPRS network. In Proc. of IEEE Intl Conference on Multimedia and Expo, volume 2, pages 169–172, 2003.
[9] A. Argyriou and V. Madisetti. Streaming H.264/AVC video over the internet. In Proc. of 1st IEEE Consumer Communications and Networking Conference, pages 169–174, 2004.
[10] http://developers.videolan.org/x264.html
[11] http://sourceforge.net/projects/ffmpeg

Figure 1-A and 1-B: H264 vs MPEG4, PSNR and computational load

Figure 2: the encoder side (blocks: Grabber, Video Encoder, Net Streamer, Main(); data labels: video, YUV, H264, UDP)

Figure 3: the decoder side (blocks: UDP buffer, Video Decoder, Main(); data labels: UDP, H264, RGB, to the screen)

Figure 4: dynamic frame rate adaptation

System / Setup                                            Mean (s)   Variance (s)
Windows Media, lowest encoding / playback buffering       4.15       0.026
VideoLan, lowest encoding / playback buffering            4.07       2.19
Our System (same x264 encoding parameters used in VLC)    1.73       0.042

Table 1: latency
