Multithreaded Distributed MPEG1-Video Delivery in the Internet Environment

Gwang S. Jung
Dept. of Math and Computer Science, Lehman College of The City University of New York, Bronx, NY 10468-1589, (718) 960-8785, [email protected]

Kyung W. Kang
Dept. of Computer Science, Jackson State University, Jackson, MS 39217, (601) 968-2105, [email protected]

Qutaibah Malluhi
Dept. of Computer Science, Jackson State University, Jackson, MS 39217, (601) 968-2225, [email protected]

ABSTRACT
Recently, there has been increasing demand for delivering video data over the Internet. In this paper, we propose a novel method of delivering a video stream to a client by utilizing multithreaded parallel connections from the client to multiple geographically distributed servers. The proposed approach delivers high-quality, best-effort MPEG-1 video over the current TCP/IP, which does not guarantee QoS. The experimental results show that the video quality delivered by the proposed multithreaded streaming scheme can be significantly improved over conventional single-stream delivery methods.

Keywords
Multithreaded Video Delivery, Video Storage, Multimedia Communication, MPEG

1. INTRODUCTION
With the advancement of the World Wide Web, there is an increasing demand for delivering video data over the Internet [4]. The predominant method of delivering video over the current Internet is video streaming, exemplified by RealNetworks' RealPlayer [7], VivoActive of Vivo Software [10], StreamWorks of Xing [9], and NetShow of Microsoft [6]. These technologies provide best-effort movie delivery over TCP/IP based on proprietary movie formats. The current implementation of TCP/IP inherently provides no guaranteed Quality of Service (QoS) for delivering such video streams [2, 8]. Since each method provides the client with only one data stream from one server, it often suffers from poor picture quality when a network link becomes congested. There is therefore a strong demand for methods of efficiently delivering a high-quality movie stream that is less delay-sensitive in the presence of network congestion. In this paper, we propose a novel method of delivering a video stream to a client by utilizing multithreaded parallel connections from the client to multiple geographically distributed servers.

The proposed approach has been developed to deliver a high-quality, less delay-sensitive video stream over TCP/IP, and has been applied to MPEG-1 (MPEG: Moving Picture Experts Group) data delivery over TCP/IP. The typical resolution of an MPEG-1 stream is 352 by 240 pixels at 30 frames per second (in the case of NTSC) [5]; the corresponding bandwidth requirement is 1.5 Mbits per second. In the proposed approach, the video is partitioned into streams of GOPs (Groups of Pictures) and stored over multiple distributed servers. Each server stores and delivers one stream consisting of GOPs of an MPEG-1 video. The parallelism of these servers and the multiple network paths from the servers to the client are employed to fully utilize the client's bandwidth. Since the client receives multiple video streams from the servers in parallel, it can tolerate network link congestion to a greater extent than a client using a conventional single-streaming method. Experiments were conducted to measure the performance of the proposed method in various Internet environments characterized by the self-similar network traffic model [3]. The experimental results show that the video quality delivered by the proposed multithreaded streaming scheme can be significantly improved, in terms of delay sensitivity, over traditional best-effort single-streaming methods. Section 2 describes the structure of the MPEG-1 stream. Section 3 shows how the MPEG-1 stream is restructured to provide multithreaded parallel data streams to the client. Section 4 presents the experiments conducted, and Section 5 concludes the paper.

2. THE STRUCTURE OF THE MPEG-1 STREAM
MPEG stands for Moving Picture Experts Group, and MPEG-1 was finalized by the ISO (International Organization for Standardization) and the IEC (International Electrotechnical Commission) in 1991. MPEG-1 was originally optimized for video resolutions of 352 by 240 pixels at 30 frames per second (NTSC based), or 352 by 288 pixels at 25 frames per second (PAL based), often referred to as SIF (Source Input Format) video. The MPEG-1 bit rate is optimized for applications of around 1.5 Mbits per second, but it can be higher if required. An MPEG-1 stream consists of video sequences. Each sequence consists of GOPs (Groups of Pictures), each of which contains one or more frames. A frame (or picture) is the primary coding unit of a video sequence.

The MPEG-1 video consists of I, P, and B frames for efficient coding and to enable fast random access. An I frame (intra-coded frame) is encoded as a still image and can be decoded without any reference to other frames. It is therefore the basis for coding other frames and is directly accessible in random fashion; its compression rate is consequently lower than that of the other frame types. P frames (predictive-coded frames) are encoded and decoded based on the information of the previous I or P frame; temporal redundancy between frames is exploited to encode P frames more efficiently. B frames (bidirectional predictive-coded frames) are encoded and decoded based on the previous and following I and P frames, and have the highest compression rate. Figure 1 shows the reference associations among frames. For example, to encode or decode a P frame, the previous I (or P) frame is referenced. B frames are encoded based on motion vectors obtained from bidirectional prediction, so each B frame is decoded based on the previous I (or P) frame and the next I (or P) frame. At encoding time, the frequency and location of I frames can be chosen based on the application's need for random accessibility and the location of scene cuts in the video sequence. The number of B frames between any pair of reference (I or P) frames can be chosen based on factors such as the amount of memory in the encoder and the characteristics of the pictures being encoded.

[Figure 1. Frame Types: forward prediction from I/P frames to P frames, and bidirectional prediction from surrounding I/P frames to B frames]

The encoding sequence of the frames is determined according to the reference associations among frames described above. The video stream carries the frames in the sequence determined at encoding time, namely the stream sequence. The decoding process is the reverse of the encoding process, based on the same reference associations, and the frames are decoded in the order of the stream sequence. The decoder, however, displays the frames in a different sequence, namely the display sequence, to play the movie frames in the proper order. Figure 2 shows the difference between the stream sequence and the display sequence.

[Figure 2. Stream Sequence versus Display Sequence: stream sequence is I(1) P(2) B(3) B(4) P(5) B(6) B(7) P(8) B(9) B(10) ...; display sequence reorders it as I(1) B(3) B(4) P(2) B(6) B(7) P(5) B(9) B(10) P(8) ...]

A GOP is a sequence of one or more frames, beginning with an I frame and ending at the P (or B) frame before the next I frame. The number of frames in a GOP depends on the characteristics of the target movie and the encoding parameters. A typical GOP of an MPEG-1 file consists of 15 frames in the sequence I,B,B,P,B,B,P,B,B,P,B,B,P,B,B. In general, two GOPs are required to display one second of MPEG-1 video.
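Because each GOP begins with a GOP header identified by a unique start code, the stream can be sliced at GOP boundaries with a simple byte scan, which is what the striping scheme in Section 3 relies on. The following minimal Java sketch is ours, not the paper's (the helper class name is hypothetical), but the start-code value 0x000001B8 for a GOP header is defined by the MPEG-1 standard.

    import java.io.IOException;
    import java.io.InputStream;
    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical helper (not from the paper): records the byte offset of every
    // GOP header in an MPEG-1 video stream, so the file can be sliced at GOP
    // boundaries for striping.
    public class GopBoundaryScanner {
        private static final int GOP_START_CODE = 0x000001B8;   // group_start_code (MPEG-1)

        public static List<Long> findGopOffsets(InputStream in) throws IOException {
            List<Long> offsets = new ArrayList<>();
            int window = -1;     // rolling 32-bit window over the last four bytes read
            long pos = 0;
            int b;
            while ((b = in.read()) != -1) {
                window = (window << 8) | (b & 0xFF);
                pos++;
                if (window == GOP_START_CODE) {
                    offsets.add(pos - 4);   // the start code began four bytes back
                }
            }
            return offsets;
        }
    }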

3. DATA LAYOUT AND VIDEO STREAMING OF THE PROPOSED SCHEME
This section explains how the MPEG-1 video stream is partitioned, stored over the data servers, and delivered to the client. We then explain how the client manages buffers for the multithreaded input streams.

3.1 Striping the MPEG-1 Stream
In an MPEG-1 stream, the video frames vary in size and possess data dependencies. Frames that depend on each other are grouped together into a GOP, so the boundary between GOPs is a natural boundary for partitioning the MPEG-1 video data. The MPEG-1 video is first sliced into m GOPs and then stored over n servers in round-robin fashion. The sequence of GOPs stored on one data server represents a single GOP stream. As Figure 3 shows, with four data servers the MPEG-1 data is striped into GOPs (#1, #2, #3, ...) and stored over the four servers to provide four GOP streams.

[Figure 3. Striping MPEG-1 Data into GOP Streams: GOPs #1, #2, #3, ... are assigned round-robin, so Server 1 holds GOP Stream 1 (#1, #5, ...), Server 2 holds GOP Stream 2 (#2, #6, ...), Server 3 holds GOP Stream 3 (#3, #7, ...), and Server 4 holds GOP Stream 4 (#4, #8, ...)]
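The round-robin layout can be sketched as follows. This is a minimal illustration under our own assumptions (the paper does not list this code): the GOPs are given as byte arrays, hypothetically obtained with a scanner like the one in Section 2, and each server's stream is collected in memory before being shipped to that server.

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical striping routine (not from the paper): distributes m GOPs
    // over n servers in round-robin fashion, producing one GOP stream per server.
    public class GopStriper {
        public static List<List<byte[]>> stripe(List<byte[]> gops, int numServers) {
            List<List<byte[]>> streams = new ArrayList<>();
            for (int s = 0; s < numServers; s++) {
                streams.add(new ArrayList<>());
            }
            for (int i = 0; i < gops.size(); i++) {
                streams.get(i % numServers).add(gops.get(i));  // GOP #i goes to server i mod n
            }
            return streams;
        }
    }

With four servers this reproduces Figure 3: GOP #1 lands on server 1, GOP #5 on server 1 again, and so on.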

After the GOPs are stored, a stream header is constructed for each GOP stream; it records the number of GOPs in the stream and the size of each GOP. Figure 4 shows the format of a GOP stream.

[Figure 4. The Format of a GOP Stream: a header consisting of the number of GOPs and the size of GOP1, GOP2, ..., GOPm, followed by the real GOP data of GOP1, GOP2, ..., GOPm]
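The header layout of Figure 4 can be serialized in a straightforward way. Below is a minimal sketch, again under our own assumptions (the paper specifies the fields but not their widths): we hypothetically encode the GOP count and each GOP size as 4-byte big-endian integers using java.io.DataOutputStream.

    import java.io.DataOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.util.List;

    // Hypothetical serializer for the GOP stream format of Figure 4:
    // [number of GOPs][size of GOP1]...[size of GOPm][data of GOP1]...[data of GOPm]
    public class GopStreamWriter {
        public static void write(OutputStream raw, List<byte[]> gops) throws IOException {
            DataOutputStream out = new DataOutputStream(raw);
            out.writeInt(gops.size());          // number of GOPs in this stream
            for (byte[] gop : gops) {
                out.writeInt(gop.length);       // size of each GOP, in header order
            }
            for (byte[] gop : gops) {
                out.write(gop);                 // the real GOP data follows the header
            }
            out.flush();
        }
    }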

3.2 Multithreaded GOP Input Streams
Each server stores and delivers one stream consisting of GOPs. The parallelism of these servers and the multiple network paths from the servers to the client are employed to achieve high data rates. The client invokes multiple threads, each of which creates a GOP input stream connected to one GOP stream server, so the client receives multiple GOP streams in parallel and the client and network bandwidth can be fully utilized. Each GOP input stream thread has its own buffer, which is filled by the GOP stream delivered from its server. Network congestion between one of the client's threads and a server therefore does not block the client: the client can still obtain GOPs through the other available input streams unless all the stream connections are congested, and in reality the simultaneous congestion of multiple connections is less likely. The client with the proposed multithreaded GOP input streams is thus less Internet delay sensitive than one with a single video input stream.
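A minimal sketch of the client side of this design is shown below, with details the paper leaves open filled in by assumption: the server hosts and port are hypothetical, and a bounded java.util.concurrent.BlockingQueue per connection stands in for the paper's wait/notify buffer synchronization of Figure 6. One thread per server fills its buffer; the Reader drains the buffers in round-robin GOP order, matching the striping layout.

    import java.io.DataInputStream;
    import java.net.Socket;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Hypothetical client skeleton (not the paper's code): one thread per GOP
    // stream server fills a bounded per-connection buffer; the Reader consumes
    // the buffers in round-robin order so GOPs come out in the original order.
    public class MultithreadedClient {
        public static void main(String[] args) throws Exception {
            String[] servers = {"server1", "server2", "server3", "server4"}; // assumed hosts
            int port = 9000;                                                 // assumed port
            int n = servers.length;

            List<BlockingQueue<byte[]>> buffers = new ArrayList<>();
            for (int s = 0; s < n; s++) {
                BlockingQueue<byte[]> buf = new ArrayBlockingQueue<>(2);     // per-stream buffer
                buffers.add(buf);
                final String host = servers[s];
                new Thread(() -> fillFromServer(host, port, buf)).start();   // GOPInputStream thread
            }

            // Reader: take one GOP from each stream in turn and feed the decoder.
            // Loops until the streams are exhausted (termination handling omitted).
            for (int g = 0; ; g++) {
                byte[] gop = buffers.get(g % n).take();   // blocks only on this one stream
                decode(gop);                              // hand the GOP to the MPEG-1 decoder
            }
        }

        // Reads the stream header of Figure 4, then pushes each GOP into the buffer.
        static void fillFromServer(String host, int port, BlockingQueue<byte[]> buf) {
            try (Socket sock = new Socket(host, port);
                 DataInputStream in = new DataInputStream(sock.getInputStream())) {
                int m = in.readInt();                     // number of GOPs in this stream
                int[] sizes = new int[m];
                for (int i = 0; i < m; i++) sizes[i] = in.readInt();
                for (int i = 0; i < m; i++) {
                    byte[] gop = new byte[sizes[i]];
                    in.readFully(gop);
                    buf.put(gop);                         // blocks while the buffer is full
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        static void decode(byte[] gop) { /* deliver the bytes to the MPEG-1 decoder */ }
    }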

Each video input stream has a buffer whose size is dynamically adjusted to the size of the target GOP to be accessed, according to the information in the stream header. The client's Reader gets a GOP from each input stream in sequence and delivers the bytes in the GOP to the MPEG-1 decoder, as shown in Figure 5. Each buffer is shared by, and synchronized between, its GOPInputStream thread and the Reader. While the Reader reads bytes from one stream buffer, the other stream buffers are being filled with GOPs delivered by the other GOPInputStream threads.

[Figure 5. The Client with 4-threaded GOP Input Streams: GOP Streams 1 through 4 fill Buffers 1 through 4 in the receiver; the Reader, coordinating with the buffers through control signals, sends the bytes to the MPEG decoder]

The synchronization between the Reader and each GOPInputStream thread is described in Figure 6.

    synchronized public void GOPInputStream() {
        while (full == true)
            wait();                          // sleep until the Reader empties the buffer
        buffer.size = number of bytes in the GOP;  // taken from the stream header
        readGOP(buffer);                     // read one GOP from the server into the buffer
        write_point = buffer.size;
        full = true;                         // buffer is now full
        notify();                            // wake up the Reader waiting for a full buffer
    }

    synchronized public byte Reader() {
        while (full == false)
            wait();                          // sleep until the buffer is full
        readPtr++;
        if (readPtr < buffer.size)
            return buffer[readPtr];          // deliver the next byte of the GOP
        else {
            full = false;                    // reading of this GOP has finished
            readPtr = -1;                    // reset the read pointer for the next GOP
            nGOP++;                          // update the id of the next GOP to be read
            notify();                        // wake up the GOPInputStream waiting for
        }                                    // the buffer to be emptied
    }

Figure 6. Synchronization between the Reader and each GOPInputStream Thread

The synchronization is based on the flag full. When full is true, the GOPInputStream thread must wait until the buffer is emptied by the Reader thread. The Reader performs the reading process one byte at a time: the bit-wise nature of the MPEG-1 stream requires the Reader to read data from the buffer one byte at a time and store it temporarily in the shift register of the MPEG-1 decoder for decoding. After the Reader finishes consuming the buffer, it sets the flag full to false, and the GOPInputStream then resumes access to the buffer and initiates downloading of the next GOP from the stream server.

4. EXPERIMENTS
Experiments were conducted to measure the performance of the proposed method in various Internet environments. We are particularly interested in the effect of Internet delays, due to traffic and/or network congestion, on the MPEG-1 player.

4.1 Experimental Setup
The experiments were conducted in a Windows NT 4.0 environment with 256 Mbytes of main memory and dual 350 MHz Pentium II CPUs. We conducted the experiments with an MPEG-1 player implemented in Java [1]; the proposed multithreaded GOP input streaming scheme was also implemented in Java. Figure 7 shows the GOP processing times of the Java MPEG player. The average processing time for decoding a GOP was 760 milliseconds, and the GOP processing time was proportional to the size of the GOP.

[Figure 7. GOP Processing Time Distribution by the Java MPEG-1 Player: x-axis, GOPs; y-axis, processing time (s), from 0 to 1]

The MPEG-1 movie file was striped into GOPs and stored over the servers. The servers were implemented in Java, each structured as a Java process in the same address space. Each server process is equipped with an Internet delay generator, developed as described in Section 4.2.

4.2 Delay Characterization
Experiments in the real Internet environment are often unpredictable and difficult to control. To conduct controllable experiments, we decided to simulate Internet delay patterns based on the α-stable self-similar stochastic traffic model [3]. Figure 8 shows World Wide Web traffic characterized by the α-stable self-similar traffic model (stability parameter α = 1.28; Hurst parameter H = 0.8333 for self-similarity). The x-axis represents time slots, and the y-axis shows the throughput in each time slot.

[Figure 8. Internet Traffic Characterization by the α-stable Self-similar Stochastic Traffic Model: x-axis, time slots; y-axis, packets per unit time]

We first generate an α-stable self-similar distribution and then map this distribution to a traffic distribution. The mapping is done as follows. Let n be the number of data points in the distribution, let m_i be the magnitude of the i-th datum in the distribution, and let k be the size of the MPEG-1 file striped into GOPs. We need a scale factor such that the sum of the scaled magnitudes of all n data points equals the amount of data to be delivered. The scale factor s is therefore obtained by

    s = k / (m_1 + m_2 + ... + m_n) = k / Σ_{i=1}^{n} m_i.

Since the distribution is scale invariant, we consider each scaled bar as the throughput in a unit time interval t. The time interval t is determined by the desired data throughput: if h (bytes/second) is the desired throughput to obtain from the distribution, then t = k/h. For example, for smooth MPEG-1 play, the required data throughput is about 200 Kbytes/sec.
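To make the mapping concrete, here is a small worked example with assumed numbers (these values are illustrative, not measurements from the paper). Suppose the generated α-stable distribution has n = 1,000 bars whose magnitudes sum to Σ m_i = 5,000, and the striped MPEG-1 file has k = 20,000,000 bytes. The scale factor is then s = 20,000,000 / 5,000 = 4,000, so a bar of magnitude 0.05 becomes a time slot carrying 0.05 × 4,000 = 200 bytes. For a desired throughput of h = 200,000 bytes/sec, the time interval is t = k/h = 20,000,000 / 200,000 = 100 seconds.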

This figure of 200 Kbytes/sec is calculated from the ratio between the size of each GOP and its processing time by the Java MPEG-1 player used for the experiment, as shown in Figure 7. However, the actual throughput delivered from the server to the client can be reduced by the internal execution delay of the experimental server; about 10% of the total throughput is lost to this internal delay.

The scaled throughput traffic model is then converted into a delay traffic model that assigns a delay to each packet of 1,500 bytes:

    // Let P[i] be the i-th throughput in the throughput traffic model
    // Let T[j] be the j-th delay in the delay traffic model
    j = 0; X = 0;
    for (i = 0; i < number of throughputs in the model; i++) {
        X += P[i];
        if (X >= 1500) {                        // accumulated data covers a packet of 1,500 bytes
            // Get the delay of the first packet of 1,500 bytes
            lastX = X - P[i];                   // bytes carried over from earlier time slots
            part1 = 1500 - lastX;               // bytes of this packet delivered in slot i
            T[j++] = time_unit * part1 / P[i];

            // Get the delays of the next packets delivered within slot i
            part2 = P[i] - part1;
            while (part2 >= 1500) {
                T[j++] = time_unit * 1500 / P[i];
                part2 -= 1500;
            }
            X = part2;                          // leftover bytes carry into the next slot
        }
    }
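Each server process applies the generated delay model when sending data. A minimal sketch of such a delay injector in Java is shown below; the class and method names are our own, and the paper does not specify how the delays are injected, so we assume the server simply sleeps for the modeled delay T[j] before writing each 1,500-byte packet.

    import java.io.IOException;
    import java.io.OutputStream;

    // Hypothetical delay injector (not the paper's code): sends a GOP stream in
    // 1,500-byte packets, sleeping for the modeled delay before each packet to
    // emulate the α-stable self-similar Internet delay pattern.
    public class DelayedSender {
        public static void send(OutputStream out, byte[] data, double[] delaysMs)
                throws IOException, InterruptedException {
            int packet = 1500;
            int j = 0;
            for (int off = 0; off < data.length; off += packet) {
                // Delay of the j-th packet, taken from the delay traffic model T[].
                Thread.sleep((long) delaysMs[j % delaysMs.length]);
                int len = Math.min(packet, data.length - off);
                out.write(data, off, len);
                j++;
            }
            out.flush();
        }
    }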
