Peer-to-Peer Adaptive Video Streaming System
Paulius Tumas∗, Artūras Serackis†
∗ Department of Electronic Systems, Vilnius Gediminas Technical University, Naugarduko g. 41-426, Vilnius LT-03227, Lithuania. Email: [email protected]
Abstract—The paper presents a peer-to-peer (P2P) video streaming system that adapts the encoded video resolution to changes in data throughput. Low-latency live video streaming based on the WebRTC specification was used in the investigation. The receiver predicts the current performance of the data connection and reacts by sending a request to the sender to increase or decrease the video resolution. An experimental investigation was performed using a network emulator with 10% packet loss emulation enabled. The proposed system was able to decrease the video resolution when the packet loss rate increased. In addition, the system was able to detect an increase in data throughput and to restore the high video resolution of the stream.
Index Terms—WebRTC, live video stream, adaptive streaming.
† Department of Electronic Systems, Vilnius Gediminas Technical University, Naugarduko g. 41-413, Vilnius LT-03227, Lithuania. Email: [email protected]

978-1-5090-1201-5/15/$31.00 ©2015 IEEE

I. Introduction

Adaptive video streaming systems usually work on a video-on-demand (VOD) basis [1]. Each video record has several copies, encoded at different resolutions (e.g. 720p, 480p, 320p) [2], [3]. The video player in the web browser analyzes the playback buffer and switches between the available video streams in order to avoid playback buffer overflow [4], [5]. The analysis of the playback buffer is performed in order to detect increased latency of the media stream data packets and packet loss. Switching to a lower-resolution video stream decreases the stream bitrate and adapts it to the decreased throughput of the data connection.

Adaptive streaming in VOD applications divides the video stream into segments of fixed duration (e.g. 3 seconds of video), which adds a playback delay equal to the duration of one segment. Such a streaming strategy is not acceptable for live peer-to-peer video streaming applications. Therefore, low-latency video streaming solutions, such as those based on the WebRTC specification, are preferable [6], [7].

Live video streaming from a mobile device over a data connection with dynamically changing throughput is a challenging task [8], [9], because there is no possibility to create several concurrent video streams on the mobile device before streaming. Even if the hardware can encode four or five concurrent video streams in parallel, the energy consumption of the mobile device would be unacceptable for practical applications.

This paper presents a peer-to-peer video streaming system that adapts the encoded video resolution to changes in data throughput. The proposed system analyzes the WebRTC data connection statistics and reacts by sending a request to the video source to decrease or increase the resolution of the video. An experimental investigation was performed in order to test the system's behavior when the packet loss rate suddenly increases and, after a period of time, returns to the normal state. The proposed adaptive video streaming system is able to temporarily decrease the resolution of the streamed video and restore it when the throughput of the data connection increases.

II. Materials and Methods

The video streaming system proposed in this paper is based on the WebRTC specification with adaptive selection of the video resolution. The system consists of a mobile device, which acts as the video streaming source (publisher), a signaling server used for peer-to-peer connection establishment, and an HTML5-based video player (subscriber).

The data throughput available on the current data connection is estimated by analyzing the statistics available for the WebRTC stream. WebRTC video streaming sends RTP data packets over UDP and additionally sends RTCP packets that carry information about media streaming events (e.g. the timestamp and the accumulated number of packets lost since the beginning of the video stream).

The video streaming mobile application used in the proposed system was upgraded with the ability to switch between three different video resolutions: 1280 × 720, 720 × 480 and 320 × 240.
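As a side note on the stream statistics mentioned in this section: counters such as the accumulated number of lost packets are cumulative, so a receiver has to difference consecutive reports to see what happened in the last interval. The following is a minimal sketch of that step only. The report layout (timestamp, cumulative packets received, cumulative packets lost) and the name `interval_stats` are assumptions made for illustration and do not reproduce the WebRTC statistics schema or the authors' implementation.

```python
from typing import Dict, List, Tuple

# One report: (timestamp in seconds, cumulative packets received, cumulative packets lost).
# This layout is a hypothetical simplification, not the WebRTC statistics schema.
Report = Tuple[float, int, int]


def interval_stats(reports: List[Report]) -> List[Dict[str, float]]:
    """Turn cumulative counters into per-interval packet rate and loss rate."""
    stats = []
    for (t0, rx0, lost0), (t1, rx1, lost1) in zip(reports, reports[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicated or out-of-order reports
        received = rx1 - rx0
        lost = lost1 - lost0
        expected = received + lost
        stats.append({
            "time_s": t1,
            "packets_per_s": received / dt,
            "loss_rate": lost / expected if expected > 0 else 0.0,
        })
    return stats


if __name__ == "__main__":
    # Synthetic reports: 10 of 100 packets lost in the first second, none in the second.
    reports = [(0.0, 0, 0), (1.0, 90, 10), (2.0, 190, 10)]
    for row in interval_stats(reports):
        print(row)
```

Per-interval values of this kind, rather than the raw cumulative counters, are what a decision system would normally compare against thresholds.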
The switching is performed in real time using FFmpeg library tools without interrupting the video stream. The commands to change the video resolution are sent from the web browser (with the video player) and are received by the mobile application over a WebSocket-based data connection.

An adaptive filter based decision system was proposed for initiating the video resolution change requests. The adaptive filter was used to predict changes in the received packet rate. A decrease in the number of received packets due to packet loss, jitter and latency was a signal to decrease the video resolution, whereas low variation in the number of received packets and decreased jitter was a signal to increase (restore) the video resolution. Three adaptive filters were selected for investigation:
• Predictor based on a Fast transversal Least Mean Square (LMS) adaptive filter [10].
• Predictor based on a FIR adaptive filter that uses sliding window fast transversal least squares [11], [12].
• Predictor based on a Least squares lattice adaptive filter [13], [14].

P2P video streaming over mobile data networks is subject to variable jitter and latency of the data packets. The streaming conditions change especially when the streamer moves from place to place in an open area or inside a building and the mobile device switches between EDGE, 3G, HSPA and LTE networks. Portions of the data packets are delayed and the frame rate of the video stream in the receiver node drops dramatically, from 20–30 fps to 2–5 fps.

WebRTC-based video streaming systems, such as the OpenTok video streaming platform, use their own intelligent methods to decrease the encoded video bitrate when the data connection throughput decreases. However, it is sometimes hard to restore the previous video stream bitrate, even if the data connection returns to its previous state.

In this paper we propose a simple adaptive video streaming system with the possibility to remotely switch the video resolution in the encoder. The resolution of the encoded video can be increased or decreased. The decision is made by analyzing the current received frame rate.

Two main thresholds, h_up and h_lo, are used in the decision system. These thresholds divide the range of possible video frame rates (in the receiver) into three parts. For example, if the upper threshold is set to h_up = 20 fps and the lower threshold is set to h_lo = 10 fps, a received frame rate of 16 fps will initiate a control command for the encoder to decrease the video resolution once. If the frame rate continues to decrease and drops below h_lo, an additional control command to decrease the video resolution is initiated. The system also works for increasing the video resolution, if the received frame rate starts to increase and passes the h_lo or the h_up threshold. This gives the system the ability to restore the quality of the video stream after accidental delays or jitter fluctuations during the transmission of the data packets.

The changes of the video resolution in the encoder are made using an FFmpeg-based implementation. The video resolution can be changed smoothly before the video frame is passed to the video encoder. Such an approach does not require encoding the same video into several parallel video streams with different resolutions. The resolution is changed in real time, adaptively, according to the command received from the receiver (another peer of the P2P video streaming session).

III. Results

The experimental investigation was performed using a WAN emulator to increase the packet loss rate to 10%. The packet loss was applied in a non-periodic manner at the same time to three video streams with resolutions 320 × 240, 640 × 480 and 1280 × 720. A comparison of the distribution of the number of lost packets between the three video streams is given in Figure 1. For the higher-resolution video streams (640 × 480 and 1280 × 720), the peak of the packet loss is observed at the beginning and then decreases monotonically. The number of lost packets for the video stream with a resolution of 320 × 240 has no clear peak at the beginning of the distortion application.

[Figure 1: number of lost packets versus time (s) for the 240p, 480p and 720p streams.]
Fig. 1. Comparison of the 10% packet loss rate application for the video with different resolution.
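The two-threshold decision rule from Section II (thresholds at 20 fps and 10 fps splitting the received frame rate into three zones) can be sketched as a small state machine. This is only an illustration under stated assumptions: the class name `ResolutionController`, the command strings, and the treatment of a frame rate lying exactly on a threshold are hypothetical choices, not details taken from the paper.

```python
from typing import List, Tuple

# Resolutions as listed in Section II, ordered from highest to lowest.
RESOLUTIONS: List[Tuple[int, int]] = [(1280, 720), (720, 480), (320, 240)]


class ResolutionController:
    """Two-threshold decision rule driven by the received frame rate."""

    def __init__(self, h_up: float = 20.0, h_lo: float = 10.0) -> None:
        if h_lo >= h_up:
            raise ValueError("h_lo must be below h_up")
        self.h_up = h_up
        self.h_lo = h_lo
        self.level = 0  # index into RESOLUTIONS; 0 is the highest resolution

    def _zone(self, fps: float) -> int:
        # Assumption: a frame rate exactly on a threshold belongs to the upper zone.
        if fps >= self.h_up:
            return 0
        if fps >= self.h_lo:
            return 1
        return 2

    def update(self, fps: float) -> List[str]:
        """Return the control commands to send for the latest frame rate."""
        target = self._zone(fps)
        step = target - self.level
        self.level = target
        if step > 0:
            return ["decrease"] * step  # one command per threshold crossed downwards
        return ["increase"] * (-step)   # empty list when the zone did not change

    @property
    def resolution(self) -> Tuple[int, int]:
        return RESOLUTIONS[self.level]


if __name__ == "__main__":
    controller = ResolutionController()
    for fps in (25, 16, 8, 22):
        print(fps, controller.update(fps), controller.resolution)
```

With the example values from the text, a received frame rate of 16 fps yields a single decrease command and a further drop below 10 fps yields one more. The sketch has no hysteresis, so a frame rate hovering around a threshold would trigger repeated switching; a practical system would likely need some smoothing.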
Figure 2 shows that continuous packet loss dramatically decreases the video stream bitrate to the same level for all three tested resolutions.

Several practical experiments were performed by streaming video at a fixed resolution over a mobile data connection while driving in the city. The performance of the video stream varied dramatically, especially when the car started to move. However, almost no packet loss was observed: in four recordings, packet loss was observed 7 times, and each time only a few packets were lost. Therefore, packet loss in mobile data networks is not the main reason for the decreasing video streaming performance and could be ignored in the design of the adaptive video resolution switching algorithm.
[Figure 2: video stream bitrate (Mbits/s) versus time (s) for the 240p, 480p and 720p streams.]
Fig. 2. Comparison of the received bitrate changes after applying the 10% packet loss rate.
The prediction of the video stream performance using different adaptive filters was tested on the data stream statistics recorded during the practical experiment with video streaming while driving in the city. The received video stream bitrate (number of bits received per second) was used as the input to the adaptive filter, and the received frame rate was used as the desired signal. The performance of the adaptive filter predictor was evaluated by registering the adaptive filter prediction error. The results of the frame rate prediction using the adaptive fast transversal filter (FTF) are presented in Table I. Two additional filters were experimentally tested on the recorded WebRTC session statistics: a FIR adaptive filter that uses sliding window fast transversal least squares (SWFTF) and a least squares lattice adaptive filter (LSL). During the experimental tests the parameters of the adaptive filter were changed over a wide range in order to estimate the best prediction performance available for the given adaptive filter based predictor. The results are shown in Table II and Table III.

TABLE I
Adaptive FTF filter based frame rate prediction results
(N: filter order; f.f.: forgetting factor)

        f.f.=0.99      f.f.=0.97      f.f.=0.95
  N     MSE    STD     MSE    STD     MSE    STD
  6     17.8   4.2     18.6   4.2     19.7   4.4
  8     18.1   4.2     19.1   4.3     20.5   4.5
 10     18.3   4.2     19.7   4.4     21.4   4.6
 12     18.4   4.2     20.1   4.4     22.1   4.6
 16     19.4   4.4     21.8   4.6     24.6   4.9
 24     21.2   4.6     25.8   5.0     31.0   5.5
 32     22.7   4.7     29.8   5.4     38.1   6.1
 64     34.1   5.8     50.4   7.1     74.1   8.6

TABLE II
Adaptive SWFTF filter based frame rate prediction results
(N: filter order)

        µ = 0.08       µ = 0.04       µ = 0.02
  N     MSE    STD     MSE    STD     MSE    STD
  6     19.2   4.3     19.2   4.3     19.2   4.3
  8     20.0   4.4     20.0   4.4     20.0   4.4
 10     21.9   4.6     21.9   4.6     21.9   4.6
 12     22.6   4.7     22.6   4.7     22.6   4.7
 16     24.0   4.9     24.0   4.9     24.0   4.9
 24     28.4   5.3     28.4   5.3     28.4   5.3
 32     31.1   5.5     31.1   5.5     31.1   5.5
 64     60.5   7.7     60.5   7.7     60.5   7.7

TABLE III
Adaptive LSL filter based frame rate prediction results
(N: filter order; f.f.: forgetting factor)

        f.f.=0.99      f.f.=0.97      f.f.=0.95
  N     MSE    STD     MSE    STD     MSE    STD
  6     17.9   4.2     18.7   4.3     19.7   4.4
  8     18.1   4.2     19.2   4.4     20.6   4.5
 10     18.3   4.3     19.8   4.4     21.5   4.6
 12     18.4   4.3     20.1   4.5     22.2   4.7
 16     19.3   4.4     21.8   4.7     24.7   5.0
 24     20.8   4.6     25.5   5.1     30.7   5.5
 32     22.3   4.7     29.6   5.4     37.8   6.2
 64     30.8   5.5     47.8   6.9     65.0   8.1

[Figure 3: number of frames versus time (s); curves: actual, predicted, error.]
Fig. 3. Prediction of the frame rate using adaptive filter based method.
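The prediction setup described in this section (bitrate as the filter input, frame rate as the desired signal, MSE and STD of the prediction error as the score) can be illustrated with a minimal sketch. Note that it uses a conventional exponentially weighted recursive least squares (RLS) filter as a stand-in, not the FTF, SWFTF or LSL variants evaluated in the paper. The names `rls_predict`, `order`, `forgetting` and `p0` are illustrative assumptions, and the data in the usage example are synthetic rather than the recorded WebRTC statistics.

```python
import random
import statistics
from typing import List, Sequence, Tuple


def rls_predict(u: Sequence[float], d: Sequence[float], order: int = 6,
                forgetting: float = 0.99, p0: float = 100.0
                ) -> Tuple[List[float], List[float]]:
    """Predict d[n] from the last `order` samples of u with a standard RLS filter.

    Returns the a priori predictions and the prediction errors.
    """
    w = [0.0] * order                                    # filter weights
    P = [[p0 if i == j else 0.0 for j in range(order)]   # inverse correlation estimate
         for i in range(order)]
    x = [0.0] * order                                    # tapped delay line, newest first
    preds, errors = [], []
    for un, dn in zip(u, d):
        x = [un] + x[:-1]
        Px = [sum(P[i][j] * x[j] for j in range(order)) for i in range(order)]
        denom = forgetting + sum(x[i] * Px[i] for i in range(order))
        k = [v / denom for v in Px]                      # gain vector
        y = sum(w[i] * x[i] for i in range(order))       # prediction before the update
        e = dn - y
        w = [w[i] + k[i] * e for i in range(order)]
        # P is symmetric, so the row vector x^T P equals Px transposed.
        P = [[(P[i][j] - k[i] * Px[j]) / forgetting for j in range(order)]
             for i in range(order)]
        preds.append(y)
        errors.append(e)
    return preds, errors


def mse(errors: Sequence[float]) -> float:
    return sum(e * e for e in errors) / len(errors)


if __name__ == "__main__":
    rng = random.Random(0)
    bitrate = [rng.uniform(0.5, 3.0) for _ in range(200)]  # synthetic input, Mbit/s
    frame_rate = [10.0 * b for b in bitrate]                # synthetic desired signal
    preds, errors = rls_predict(bitrate, frame_rate, order=6, forgetting=0.99)
    print("MSE:", mse(errors[-50:]), "STD:", statistics.pstdev(errors[-50:]))
```

On this noise-free synthetic relation the error becomes very small once the filter has converged; real session statistics are far less regular, which is consistent with the much larger errors reported in Tables I to III.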
A short example of the video frame rate prediction is illustrated in Figure 3. The prediction was performed by the adaptive filter with the current bitrate used as the filter input. The actual frame rate takes integer values, whereas the output of the adaptive filter is usually non-integer. Therefore, the adaptive filter output was additionally rounded to obtain an integer frame rate value, which is needed by the decision system. A comparison between the actual received frame rate and the predicted frame rate obtained by rounding the adaptive filter output is given in Figure 4.

[Figure 4: number of frames versus time (s); curves: actual, predicted, error.]
Fig. 4. Prediction of the frame rate using adaptive filter based method and applying the rounding of the filter output.

The best prediction performance was achieved by the least squares lattice adaptive filter. Increasing the filter order from 6 to 64 always decreased the prediction performance. Therefore, an adaptive filter of 6th order was used for the adaptive video streaming system.

IV. Conclusion

The video streaming system proposed in this paper is designed for low-latency live P2P video streaming based on the WebRTC specification. The system requires a data channel to send commands from the receiver node to the transmitter node in the P2P session. The control commands are used to change the video resolution before the encoding is performed. The control commands are initiated by a decision system based on the analysis of the received frame rate.

The adaptive filter based prediction of the receiver frame rate is able to initiate changes of the video resolution before the actual frame rate decreases in the receiver. However, the performance of the predictors investigated in this paper is low, with a mean square error above 17 and a standard deviation above 4 frames. The obtained mean square error shows that the error of the frame rate prediction is close to 5–6 frames; therefore, the results of the prediction should be treated with caution.

References
[1] S. Paulikas, “Estimation of degraded video quality of mobile H.264/AVC video streaming,” Elektronika ir Elektrotechnika, vol. 98, no. 2, pp. 49–52, 2015.
[2] M. Khouderchah, C. Krishnamurthy, J. Ellis, and J. Medved, “Method and apparatus for providing video on demand,” 2015, US Patent App. 14/678,649.
[3] C. Tian, J. Sun, W. Wu, and Y. Luo, “Optimal bandwidth allocation for hybrid video-on-demand streaming with a distributed max flow algorithm,” Computer Networks, vol. 91, pp. 483–494, 2015.
[4] H. Riiser, H. S. Bergsaker, P. Vigmostad, P. Halvorsen, and C. Griwodz, “A comparison of quality scheduling in commercial adaptive HTTP streaming solutions on a 3G network,” in Proceedings of the 4th Workshop on Mobile Video. ACM, 2012, pp. 25–30.
[5] L. De Cicco, V. Caldaralo, V. Palmisano, and S. Mascolo, “Elastic: a client-side controller for dynamic adaptive streaming over HTTP (DASH),” in Packet Video Workshop (PV), 2013 20th International. IEEE, 2013, pp. 1–8.
[6] B. Li, Z. Wang, J. Liu, and W. Zhu, “Two decades of internet video streaming: A retrospective view,” ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), vol. 9, no. 1s, p. 33, 2013.
[7] J. Nurminen, A. Meyn, E. Jalonen, Y. Raivio, and R. Garcia Marrero, “P2P media streaming with HTML5 and WebRTC,” in Computer Communications Workshops (INFOCOM WKSHPS), 2013 IEEE Conference on, April 2013, pp. 63–64.
[8] S. Paulikas, P. Sargautis, and V. Banevicius, “Impact of wireless channel parameters on quality of video streaming,” Elektronika ir Elektrotechnika, vol. 108, no. 2, pp. 27–30, 2011.
[9] V. Jaseviciute, D. Plonis, and A. Serackis, “Dynamic adaptation of the jitter buffer for video streaming applications,” in Information, Electronic and Electrical Engineering (AIEEE), 2014 IEEE 2nd Workshop on Advances in, Nov 2014, pp. 1–4.
[10] D. Slock and T. Kailath, “Numerically stable fast transversal filters for recursive least squares adaptive filtering,” Signal Processing, IEEE Transactions on, vol. 39, no. 1, pp. 92–114, 1991.
[11] D. T. Slock and T. Kailath, “A modular prewindowing framework for covariance FTF RLS algorithms,” Signal Processing, vol. 28, no. 1, pp. 47–61, 1992.
[12] ——, “A modular multichannel multiexperiment fast transversal filter RLS algorithm,” Signal Processing, vol. 28, no. 1, pp. 25–45, 1992.
[13] D. Navakauskas, “A reduced size lattice-ladder neural network,” in Neural Networks for Signal Processing VIII, 1998. Proceedings of the 1998 IEEE Signal Processing Society Workshop. IEEE, 1998, pp. 313–322.
[14] S. S. Haykin, Adaptive filter theory. Pearson Education India, 2008.