Asynchronous Prefetching Streaming for Quick-Scene Access in Mobile Video Delivery (Preprint)

Naofumi Uchihara, Hiroyuki Kasai, Member, IEEE, Yoshihiro Suzuki, and Yoshihisa Nishigori

Abstract —This paper proposes a new video delivery system that offers lightweight, smooth, and quick-response access to video scenes without interference from unstable mobile network conditions and without consuming mobile device storage. It avoids the scene-access delay caused by mobile-network-specific latency and instability. We propose an innovative streaming technology that consists of a view-context-aware asynchronous prefetching and streaming scheme and a seamless media assembly and synchronization scheme. As a powerful extension of this system, we present a hierarchical asynchronous prefetching streaming technology that considers the video-scene structure. It provides access to a general view of the whole content as well as to a detailed view around the playback position by taking into account the video-scene structure and the user's viewing position. Implementation details on mobile phones and preliminary evaluation results are described, and the system's feasibility is shown.1

Index Terms — Mobile Video Streaming, Asynchronous Prefetching, Quick-Scene Access

I. INTRODUCTION

In recent years, high-bit-rate mobile network video services, such as 3.5G HSPA (High-Speed Packet Access) [1] and WiMAX [2], and high-capacity storage technologies in mobile devices, such as HDDs (hard-disk drives) and flash memory, have rapidly been commercialized. They enable us to access a great many video contents anytime and anywhere. Conventional research and development for mobile video systems has so far focused on extensions or applications of existing video delivery technologies specified for wired networks. However, considering how contents are actually viewed, we find that lightweight, smooth, and quick-response content and scene access is truly needed for mobile usage during moments of free time: waits for elevators, trains,

1 This work was supported in part by KAKENHI 21760277, a Grant-in-Aid for Young Scientists (B). This work was also supported in part by the joint research on a digital home appliance called the "FUN-X project," supported by Funai Electric Co., Ltd., and The University of Electro-Communications. This paper was presented in part at the 28th IEEE International Conference on Consumer Electronics (ICCE 2010), Las Vegas, USA, January 2010. Naofumi Uchihara and Hiroyuki Kasai are with the Graduate School of Information Systems, The University of Electro-Communications, Tokyo, 182-8585, Japan (e-mail: {uchihara, kasai}@appnet.is.uec.ac.jp). Yoshihiro Suzuki and Yoshihisa Nishigori are with the R&D Engineering Department at Funai Electric Co., Ltd., Tokyo, 101-0021, Japan (e-mail: [email protected], [email protected]).

or unpunctual friends. We are therefore researching and developing a new, innovative mobile video system. Its core technology is an asynchronous prefetching streaming technology that increases the lightweight, smooth, and quick responsiveness and interactivity of mobile video contents. The client prefetches and temporarily stores discontinuous partial video portions located in the future timeline in the background while receiving the streaming media that the viewer is watching. This allows viewers to jump to and restart any desired scene very quickly. It also provides continuous and seamless video provisioning by switching between network-streamed media and prefetched media. These combined access schemes are automatically synchronized based on the viewer's dynamic view context. As an extension, we newly propose efficient access with a hierarchical asynchronous prefetching streaming that considers the video-scene structure. It provides access to a general view of the whole content and also to a detailed view around the playback position by considering the video-scene structure and the user's viewing position. The user can easily browse video scenes within the currently played content and quickly reach the scenes that he or she truly wants to view. In the rest of this paper, we summarize the requirements for practical mobile video services in Section II and related works in Section III. We then propose an innovative mobile-video-streaming scheme based on an asynchronous prefetching technology in Section IV. This section describes the basic concept, system diagram, and operational procedure. Section V describes our elemental technologies: a scenario metafile description, a context-aware asynchronous prefetching scheme, a seamless media assembly and synchronization scheme, and a hierarchical prefetching streaming scheme.
Section VI describes our implementation details, and Section VII shows the feasibility of our proposal by evaluating the cost of streaming-session switching and the overall CPU occupation on a commercial mobile device. An application implementation on a commercialized PDR (personal disk recorder) is also introduced in Section VIII. Lastly, Section IX concludes our research and mentions future study issues.

II. REQUIREMENTS FOR MOBILE VIDEO SERVICE

Scenarios of video viewing in a mobile environment differ from conventional TV viewing or on-demand video viewing in static PC environments. According to a questionnaire on mobile video usage, most viewers watch merely to kill time [3]. This type

of viewing involves no strong prior intention to view specific contents or programs. After starting to view, viewers tend to skip frequently over things they dislike. By skimming within or among contents, they finally reach a desired content and then, if they like it, watch the rest of it during the remaining time. For this viewing style, a video-streaming mechanism can provide access to a wide variety of video contents for searching and browsing via mobile wireless networks. However, it may be quite unsatisfactory for users, because wireless networks can prevent smooth browsing and rendering of rich media contents. Frequent skips toward desired contents or scenes, as in TV channel zapping, would also cause large additional delays for those operations and for media buffering. This situation derives from the dynamically changing bandwidth and high delay of mobile networks. From this scenario study, predownloading video contents as files into mobile devices, through WLAN or a USB cable, for instance, prior to the actual viewing phase, might seem preferable for fast video access. It is, however, unrealistic to store many contents, because the capacity of mobile-device storage is quite small. Consequently, a new mechanism must be developed that allows viewers smooth and quick access to desired contents/scenes in a short time and the ability to skip or change contents easily, and to view an entire content if they want, without consuming device storage.

III. RELATED WORK

Smooth and fast streaming technologies have been developed up to now, especially in the proxy caching and P2P technology domains. A user-aware prefetching mechanism has been proposed [4] that prefetches, into a proxy server, the video segments that a user is quite likely to skip into. Its focus on a skipping capability matches our purpose; however, the skipping point is limited to only the next candidate segment.
An active prefetching technique that proactively prefetches uncached segments has also been proposed [5], where a segment is selected by the probability that clients will access it in the future. Moreover, the basic mechanisms in [4] and [5] are proxy-based, and the skipping position is limited to one position. A work closely related to ours is a novel hybrid video-downloading and streaming scheme that integrates traditional client-server-based video streaming and P2P-based media distribution [6]. Although that mechanism shares the same idea as our proposal in Section IV, its core mechanism differs from the viewpoint of device storage limitation. Furthermore, a quickly accessible skipping capability is not considered in that paper. As a whole, to our knowledge, most works focus on seamless playback of a stream by prefetching or predownloading media data. Our proposal, on the other hand, tackles quick and multipositional accessibility within contents. As for commercial video systems, an internet TV service [7] allows viewers to access not only transcripts of the audio track and background information within the video, but also specific scenes, by showing short video thumbnails. The

mobile broadcasting service system in Japan also shows still-image thumbnails with short text information extracted from subtitles [8]. The viewer can select any scene and start video streaming from that selected position. Although these services share the same vision as ours, their thumbnail videos or images are completely independent of the video stream to be viewed. Our thumbnail stream is extracted partially from the original video stream for users' real viewing and is seamlessly combined with the rest of the original video stream.

IV. VIDEO ASYNCHRONOUS PREFETCHING STREAMING

A. Basic Mechanism

Lightweight and fast content-access capabilities are a must for mobile video services in which a user wants to jump very quickly to any desired scene. However, it can be difficult to provide such functionality stably, since mobile network conditions change dynamically and network delay can be very high. This imposes a big burden on the user's skipping and jumping. Displaying a list of accessible candidate scenes at a glance could help the user in such a situation. It is, however, still difficult to restart streaming quickly from a new position, for the same reasons. Therefore this paper proposes an asynchronous prefetching technology. More specifically, a mobile client asynchronously prefetches multipositional thumbnail streams in the background while receiving the streaming media that the user is watching. An asynchronously prefetched thumbnail stream is a short anchor video that helps the user grasp an overview of its playback position. Each thumbnail stream has a certain length, e.g., 5 seconds, and is extracted from the original stream automatically or manually. By prefetching these thumbnail streams asynchronously, the mobile client enables the user to select any desired scene from among the randomly accessible ones by referring to all of the prefetched thumbnails on the screen.
Further, since the thumbnail stream is already prefetched, the mobile client can restart playback very quickly without media-buffering delay. If the user continues to watch the video scenes, the successive stream media following the thumbnail stream is streamed via an ordinary network streaming method and assembled seamlessly with the thumbnail stream. Moreover, the mobile client can also prestore some thumbnail streams prior to the actual viewing moment. For instance, while the user is at home, thumbnail streams can be transferred into the mobile device via Wi-Fi, a USB cable, or a broadcasting system. In a mobile environment, the user can first access the prefetched (predownloaded) thumbnail streams quickly without touching the mobile network and continue watching as they are seamlessly assembled with networked media. Here we call the ordinary streaming session a Synchronous Streaming Session and the prefetching thumbnail streaming session an Asynchronous Prefetching Streaming Session. The following descriptions show how the proposed system works step by step, as shown in Fig. 1. The multiple positions of thumbnail streams to be prefetched inside one video stream must be determined in Step 1. These positions correspond to

accessible positions to which the user can jump during the actual content-viewing phase. The positions can be determined by periodic extraction (e.g., every 5 minutes), scene-detection methods [9][10][11], media summarization technologies [12][13][14], or handmade metadata provided by a content creator/provider, but this is out of the scope of this paper. In Step 2, this multipositional information is stored in a scenario metafile, and this file can be transferred into a mobile device prior to the actual content-playing phase. In Step 3, the user starts content playback on his or her mobile device by using an ordinary network-streaming mechanism after a content-selecting phase. If the beginning portion of the selected content is already stored inside the mobile device, as mentioned earlier, the user can start playback very quickly without a streaming-buffering delay. Step 4 is one of the featured phases of this proposal. Once a viewing phase starts, an asynchronous prefetch procedure for thumbnail streams soon starts. During this phase, the mobile client asynchronously prefetches multipositional thumbnail streams by considering the user's viewing context. This means that the thumbnails to be prefetched can change dynamically based on the user's VCR operations and network conditions. Step 5 is a quick and lightweight skipping and jumping phase. If the user dislikes the currently displayed scene and wants to skip it, the mechanism allows him or her to jump straight to the top of the next prefetched thumbnail stream. If the user wants to search for other desired scenes, he or she can select a "Selectable Jump Mode" in Step 6. Here all the prefetched thumbnails appear at a glance, and the user can directly select any scene. Lastly, if the user continues to watch the stream, the successive stream after this thumbnail is streamed via the ordinary network streaming session and assembled with the current thumbnail stream in Step 7, as mentioned earlier.
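Steps 5 and 7 hinge on one decision: given the current playback position, is there an already prefetched thumbnail that local playback can switch to? A minimal sketch of that check follows; the struct and function names are ours, not the paper's.

```cpp
#include <optional>
#include <vector>

// Illustrative sketch only: each prefetched thumbnail stream covers
// [start, start + length) seconds of the original content timeline.
struct Thumbnail {
    double start;    // absolute position in the original stream [sec]
    double length;   // thumbnail duration, e.g., 5 sec
    bool prefetched; // already stored on the device?
};

// If playback has reached an already-prefetched thumbnail region, return it
// (the client then plays from local storage); otherwise return nothing and
// stay on the synchronous streaming session.
std::optional<Thumbnail> thumbnailAt(const std::vector<Thumbnail>& thumbs,
                                     double playbackPos) {
    for (const auto& t : thumbs) {
        if (t.prefetched && playbackPos >= t.start &&
            playbackPos < t.start + t.length) {
            return t;
        }
    }
    return std::nullopt;  // keep using the synchronous streaming session
}
```

In the real client this decision is made by the Switching Controller described in Section IV.B; the sketch only captures the position test itself.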
[Fig. 1 depicts the content server, the scenario metafile, and the mobile client, annotated with the steps above: Step 1, determine thumbnail stream positions; Step 2, create the scenario file and transfer it; Step 3, play content; Step 4, start prefetching; Step 5, skip to the next thumbnail; Step 6, jump to any thumbnail; Step 7, switch to network streaming.]

Fig. 1. Mobile video asynchronous streaming basic mechanism. (The original stream is held on the content server and delivered to the mobile client over the mobile network.)

B. Basic Diagram and Procedure

Fig. 2 shows a basic diagram of the proposed mobile client. Its modules are the Meta File Synchronizer (A), Switching Controller (B), Streaming Controller (C), Streaming Packet Receiver (D), Media Decoder (E), Prefetching Controller (F), Prefetching Packet Receiver (G), Thumbnail Stream Storage (H), and File Reader (I), together with the Coded Media Buffer, Decoded Media Buffer, Media Renderer, and UI; start/stop/suspend signaling controls the streaming and file-reading paths.

Fig. 2. Basic mobile client diagram.

The Meta File Synchronizer (A) first retrieves a scenario metafile from an HTTP server. After the file is parsed in the Switching Controller (B), the client starts a network streaming process by instructing the Streaming Controller (C) according to the parsed scenario metafile. The data received by the Streaming Packet Receiver (D) are passed into the Coded Media Buffer. After decoding in the Media Decoder (E), the decoded media data are rendered by the Media Renderer via the Decoded Media Buffer.

Later, after a predetermined length of time, the Prefetching Controller (F) selects the thumbnail stream to be prefetched first from the candidate streams described in the scenario metafile. The controller then asynchronously starts the prefetching procedure for the selected stream. The data received by the Prefetching Packet Receiver (G) are temporarily stored in the Thumbnail Stream Storage (H) as a file. In the same way, the successive thumbnail streams are prefetched one by one. Once the playback position reaches the beginning of an already prefetched thumbnail stream, the Switching Controller instructs the File Reader (I) to pass the corresponding data from the storage into the Coded Media Buffer, and asks the Streaming Controller (C) to suspend the current synchronous streaming session and its related processes. While the prefetched thumbnail stream file is being read, the asynchronous prefetching process keeps working at the maximum transmission rate. When the remaining time of the playing thumbnail stream falls below a threshold, the Switching Controller instructs the File Reader to suspend its reading and orders the synchronous Streaming Controller to resume streaming from the corresponding stream position.

V. ELEMENTAL CORE TECHNOLOGIES
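The resume rule of Section IV.B, in which the synchronous session is resumed once the remaining thumbnail playtime falls below a threshold, can be sketched as follows. All names are illustrative, not the project's actual API.

```cpp
// Minimal sketch of the Switching Controller's resume decision: while a
// prefetched thumbnail file is playing, network streaming is resumed once
// the remaining thumbnail playtime falls below a threshold that covers the
// expected streaming (re)start delay.
struct SwitchDecision {
    bool resumeStreaming;   // ask the Streaming Controller (C) to resume
    double resumePosition;  // content position to resume streaming from [sec]
};

SwitchDecision checkSwitch(double thumbEnd, double playbackPos,
                           double thresholdSec) {
    double remaining = thumbEnd - playbackPos;
    if (remaining <= thresholdSec) {
        // Resume network streaming at the thumbnail's end position so the
        // two sources assemble seamlessly (Section V.C).
        return {true, thumbEnd};
    }
    return {false, 0.0};
}
```

The threshold itself is network-dependent; Section V.C describes how the client estimates it from monitoring packets.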

A. Scenario Metafile Description

The scenario metafile describes the session information of the synchronous streaming sessions and the asynchronous prefetching streaming sessions. The mobile client decides the starting timing of the asynchronous session behavior based on this description. More specifically, the scenario metafile includes the synchronous streaming session information, namely address information, file path information, port numbers, and time-position information within each stream. The file also describes the asynchronous prefetching session information and its playback timing information. Table I shows an example of the scenario metafile. A program tag nested inside the top-level scenario tag represents the order and playback timing of each video program; it corresponds to the synchronous streaming session, and a further nested tag indicates the asynchronous prefetching sessions within the synchronous streaming session. The offset attribute represents the offset playback timing of each prefetching session, and the layer attribute describes the scene structure of each content segment, which is used for the hierarchical prefetching streaming in Section V.D.

TABLE I
SCENARIO FILE EXAMPLE

Skim@ Scenario File
  123 NEWS 24
    http://192.168.10.5:80/sub01.mp4
    http://192.168.10.5:80/sub03.mp4
    http://192.168.10.5:80/sub05.mp4
  other item
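The markup of Table I was stripped in this preprint, so the literal tag names are lost. The following sketch shows how the surviving values (program id 123, title "NEWS 24", and the three thumbnail URLs) could plausibly be arranged; the tag names and offset values here are our own illustration, not the paper's schema.

```xml
<!-- Illustrative sketch only: <scenario>, <program>, and <thumbnail> are
     hypothetical tag names; the offsets are invented for illustration. -->
<scenario name="Skim@ Scenario File">
  <program id="123" title="NEWS 24">
    <!-- asynchronous prefetching sessions within this program -->
    <thumbnail offset="300" layer="1"
               src="http://192.168.10.5:80/sub01.mp4"/>
    <thumbnail offset="600" layer="1"
               src="http://192.168.10.5:80/sub03.mp4"/>
    <thumbnail offset="900" layer="1"
               src="http://192.168.10.5:80/sub05.mp4"/>
  </program>
  <!-- other item -->
</scenario>
```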

B. Context-Aware Asynchronous Prefetching Streaming

The most significant challenge is to determine which asynchronous stream should be prefetched preferentially in a dynamically changing situation caused by the viewer's VCR operations or network conditions. We call this Context-Aware Asynchronous Prefetching Streaming. Since skipping and jumping occur more frequently as a result of our proposed functionality, the viewing position might overtake the position of the stream being prefetched at that time, or a currently prefetching stream might miss its playback timing because of a sudden change in network conditions. Therefore the mobile client periodically calculates and determines a target asynchronous thumbnail stream to be prefetched by considering the current playback position, the playback timing of the candidate asynchronous streams, and the available bandwidth. Even if the prefetching procedure for a certain asynchronous stream has already started, the mobile client suspends it and starts a new prefetching procedure according to the condition changes. An even more challenging requirement is that the transmission rate control of the prefetching streaming should utilize the unoccupied bandwidth of the mobile network as much as possible while not disturbing the synchronous streaming. The latest networks, such as HSDPA and WiMAX, have a feature called adaptive modulation and coding (AMC) [15], by which the transmission rate is adjusted adaptively by adopting the most appropriate modulation and coding scheme. In Mobile WiMAX with a 1.5 MHz channel bandwidth, the bandwidth can theoretically vary between 1 Mbps and 4.7 Mbps. Therefore we introduce a simple control mechanism in which the client detects packet loss and packet jitter for either of the two sessions by exchanging monitoring packets with the server. If packet loss occurs or the packet jitter exceeds a predefined threshold, the mobile client adjusts the sending packet rate of the asynchronous prefetching session. For an RTP/UDP-based asynchronous session, the mobile client sends a pause or stop message to the asynchronous streaming server, since most streaming servers do not support dynamic rate transmission. For a TCP- or HTTP-based session, the mobile client reduces the receiving rate, like the AIMD (Additive Increase Multiplicative Decrease) control in TCP [16]. It is necessary, however, to investigate an innovative control scheme, such as low-priority transmission control [17][18][19], which can completely avoid interference with the synchronous streaming session while utilizing the unoccupied bandwidth as much as possible. Note that the user may jump past successive contents, so some prefetched data might go to waste. This additional functionality gives lightweight and fast content-access capabilities in a mobile environment without interference caused by unstable mobile network conditions.

C. Seamless Media Assembly and Synchronization

Two independent media sources, the network-streamed (synchronous session) media and the prefetched asynchronous media files, must be assembled seamlessly without disturbing user viewing.
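Returning briefly to Section V.B, the periodic target selection described there can be sketched as follows. The struct fields and the simple "can it finish in time?" viability rule are our reading of the paper, not its exact algorithm.

```cpp
#include <vector>

// Illustrative sketch of context-aware target selection: candidates already
// behind the playback position are skipped, and a candidate is viable only
// if its download can finish before its playback timing at the currently
// available bandwidth.
struct Candidate {
    double playbackTime; // when this thumbnail would be played [sec]
    double sizeBits;     // thumbnail stream size [bits]
    bool prefetched;     // already stored on the device?
};

// Returns the index of the next stream to prefetch, or -1 if none is viable.
int pickNextPrefetch(const std::vector<Candidate>& cands,
                     double playbackPos, double bandwidthBps) {
    for (int i = 0; i < (int)cands.size(); ++i) {
        const Candidate& c = cands[i];
        if (c.prefetched || c.playbackTime <= playbackPos) continue;
        double downloadTime = c.sizeBits / bandwidthBps;
        double timeUntilPlay = c.playbackTime - playbackPos;
        if (downloadTime <= timeUntilPlay) return i;  // can finish in time
    }
    return -1;  // no candidate can meet its playback timing
}
```

Re-running this selection periodically is what lets the client abandon an in-progress prefetch when the user jumps or the bandwidth drops, as the section describes.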
The retrieval speed of files is much faster than that of streaming media, and the latter can fluctuate with wireless network conditions. Therefore, especially when the media source switches from a prefetched asynchronous file to a synchronous streaming server, the synchronous streaming start timing must be adjusted according to the network condition. It is worth noting that the threshold time must be adjusted to prevent not only overflow of the buffer in the Streaming Packet Receiver but also underflow of the Decoded Media Buffer. The Streaming Controller calculates the threshold by periodically exchanging monitoring packets with the streaming server and monitoring the network delay. As for media synchronization, the point is that the synchronous (network) streamed media and the asynchronously prefetched stream media can have completely independent time stamps. When TCP or HTTP is used for the asynchronous prefetching streams, the prefetched streams are independent files, e.g., MP4 files, and thus have separate time stamps. On the other hand, when RTSP/RTP is used, the time stamp in a streamed packet cannot represent its position within the whole stream, because the absolute playback position can be lost through VCR operations such as PAUSE. Consequently, the time stamp of

each prefetched stream must be maintained by using the absolute playback position within the original stream and the internal position of each packet inside the stream. The former is described in the scenario metafile. The latter is calculated as follows: the time stamp of the first received media packet is memorized, and each successive time stamp is recalculated from the first one and its own value. Finally, a unified time stamp is newly assigned to each media unit prior to the media decoding process and is used in the following processes in the mobile client.

D. Hierarchical Prefetching Streaming

The proposed system enables viewers to skip and jump between scenes by using prefetched streams as anchor scenes in the background of an ordinary streaming session. The more streams are prefetched into a mobile device, the higher the usability, since the number of randomly accessible scene positions increases. However, because mobile network bandwidth is relatively narrow, it is hard to prefetch all asynchronous streams at once. An additional mechanism is therefore required to prefetch streams on a priority basis according to the user's viewing situation. The proposed mechanism, a hierarchical prefetching streaming technology, gives the user access to a general view of the whole content as well as to a detailed view around the playback position, considering the video-scene structure and the user's viewing position. Specifically, while the asynchronous streams giving a panoramic view of the content are preferentially prefetched at the beginning of the streaming session, the streams close to the current playback position are prefetched according to the user's viewing behavior. To achieve this functionality, we adopt the concept of a layer. For instance, we assign layer 1 to the asynchronous streams located at relatively long time intervals, which give a general view of the whole video.
Meanwhile, layer 2 and above are assigned to the streams located within short time ranges, in order to access the scenes near the position where the user is watching. Using this layer information described in the scenario metafile, the client first prefetches all of the layer 1 streams sequentially. Then the layer 2 streams located between the already prefetched layer 1 streams can be prefetched according to the position the mobile client reaches using the skip or jump functionalities.
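The layer-based ordering above can be sketched as follows; the data structure and function are our illustration of the priority rule, under the assumption that layer 2 candidates are bounded by the next layer 1 anchor after the playback position.

```cpp
#include <algorithm>
#include <vector>

// Illustrative sketch of the hierarchical prefetching order: all layer-1
// streams first (panoramic view), then the layer-2 streams lying between
// the current playback position and the next layer-1 anchor (detail view).
struct Stream {
    double position;  // start position in the content [sec]
    int layer;        // 1 = coarse anchors, 2 = fine detail
};

std::vector<Stream> prefetchOrder(const std::vector<Stream>& streams,
                                  double playbackPos) {
    // The next layer-1 anchor after the playback position bounds the
    // layer-2 candidate range.
    double nextAnchor = 1e18;
    for (const auto& s : streams)
        if (s.layer == 1 && s.position > playbackPos)
            nextAnchor = std::min(nextAnchor, s.position);

    std::vector<Stream> order;
    for (const auto& s : streams)            // all layer-1 first, in order
        if (s.layer == 1) order.push_back(s);
    for (const auto& s : streams)            // then nearby layer-2 streams
        if (s.layer == 2 && s.position >= playbackPos &&
            s.position < nextAnchor)
            order.push_back(s);
    return order;
}
```

When the user skips to a new layer 1 anchor (t = T3 in Fig. 3), re-evaluating this function with the new playback position yields the new layer 2 candidates, matching the adaptive behavior the section describes.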

Fig. 3. Hierarchical prefetching procedure.

Fig. 3 conceptually shows the prefetching steps of the synchronous stream (layer 0) and the asynchronous streams (layers 1 and 2). First, after an ordinary synchronous streaming session starts at t = T0, the mobile client starts to prefetch all of the layer 1 asynchronous streams at t = T1 to provide a wide range of video scenes across the video content. Once the prefetching of the layer 1 streams is finished, the client starts to prefetch layer 2 asynchronous streams at t = T2. Here the candidate streams to be prefetched are located between the playback position and the following layer 1 stream. At t = T3, when the user jumps to the next layer 1 asynchronous stream by using the forward-skip functionality, the client stops the active prefetching session for layer 2 streams and restarts an asynchronous session for the layer 2 streams that follow the newly accessed layer 1 stream. Since the playback position changes dynamically because of operations such as the forward/backward skip, the proposal can adaptively change the asynchronous streams to be prefetched.

VI. SYSTEM IMPLEMENTATION

A. Protocol Stack and Implementation Architecture

Fig. 4 shows the protocol stack of this system. The MPEG-4 Visual SP [20], H.261 [21], H.263/263+ [22], and H.264/AVC [23] video codecs and the MPEG-4 AAC audio codec [24] are adopted. As the synchronous streaming and packetization protocols, RTSP (Real Time Streaming Protocol) and RTP (Real-time Transport Protocol)/RTCP (RTP Control Protocol) are used. The file format for the prefetched thumbnail streams is the MP4 (MPEG-4 Part 14) [25] container format, suited to mobile environments. For the asynchronous thumbnail prefetching protocol, HTTP is adopted in consideration of TCP's error-recovery capability, since the prefetched thumbnail streams are not accessed by users in a real-time manner.

Fig. 4. Mobile client implementation architecture.

B. Three Types of Mobile Client Implementation Details

An open-source project is adopted for the implementation of the RTSP server-side program. On the client side, cross-platform-oriented implementation is achieved not only by implementing the core modules in platform-independent C/C++, but also by implementing platform-independent wrapper modules for network control, thread control (including such exclusive controls as mutexes and signals), and string (character) processing. This portable implementation of the core modules has accelerated our implementations so far. The modules drawn with thick lines in Fig. 2 are independent thread modules; they are connected by FIFO-based circular buffers and synchronized by the exclusive control of Mutex, Event, Signal, and Critical Section primitives. It is quite easy to plug the core modules into each target platform by adding platform-dependent modules such as the GUI. We have implemented our system on two different types of operating systems, an embedded Windows system and an embedded Unix system. Fig. 5 shows screen shots of the two implementations.
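The FIFO-based circular buffer connecting the thread modules can be sketched as below. This is a generic single-producer/single-consumer ring with mutex and condition-variable exclusive control, not the project's actual implementation; the payload type is simplified to `int`.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <vector>

// Sketch of a FIFO-based circular buffer between two thread modules
// (e.g., Streaming Packet Receiver -> Media Decoder in Fig. 2).
class PacketFifo {
public:
    explicit PacketFifo(std::size_t capacity) : buf_(capacity) {}

    void push(int packet) {  // blocks while the ring is full
        std::unique_lock<std::mutex> lk(m_);
        notFull_.wait(lk, [this] { return count_ < buf_.size(); });
        buf_[(head_ + count_) % buf_.size()] = packet;
        ++count_;
        notEmpty_.notify_one();
    }

    int pop() {  // blocks while the ring is empty
        std::unique_lock<std::mutex> lk(m_);
        notEmpty_.wait(lk, [this] { return count_ > 0; });
        int packet = buf_[head_];
        head_ = (head_ + 1) % buf_.size();
        --count_;
        notFull_.notify_one();
        return packet;
    }

private:
    std::vector<int> buf_;
    std::size_t head_ = 0, count_ = 0;
    std::mutex m_;
    std::condition_variable notFull_, notEmpty_;
};
```

Blocking on both full and empty conditions is what lets each thread module run at its own pace while bounding the memory used between stages.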

Fig. 5. Mobile client implementation screen shots: (A) Windows-based implementation; (B) Unix-based implementation.

C. Hierarchical Prefetching Streaming Implementation

Fig. 6 shows screen shots of the hierarchical prefetching streaming described in Section V.D. Fig. 6 (A) shows the preferentially prefetched layer 1 asynchronous thumbnails in an overlaid dialog window. The playtime bar below the thumbnails represents the time-based relation between the currently playing scene (white triangle) and the focused scene (green triangle) on the dialog. Thus the user can get an overview of the whole stream at an early stage of viewing and can jump to desired scenes from there. The special buttons between the thumbnails, the vertical gray lines, indicate that one or more layer 2 thumbnails exist after the layer 1 thumbnail to the left of the button. By pressing such a button, the user starts the prefetching of those layer 2 thumbnail streams, as shown in Fig. 6 (B). In this way, as the many short divided time bars in Fig. 6 (B) show, the user can access the detailed scenes around the preferentially prefetched layer 1 asynchronous thumbnails.

Fig. 6. Hierarchical prefetching streaming screen shots. (A) shows only the layer 1 thumbnail streams; (B) shows the completed prefetching.

VII. PERFORMANCE EVALUATIONS

This section presents preliminary performance evaluations that show the feasibility of our proposed streaming mechanism.

A. Switching Performance between Prefetched Stream and Synchronous Streaming

We evaluate the performance of our prototype implementation. As a preliminary evaluation, we measure the CPU load when local access to the asynchronously prefetched stream switches to synchronous stream access, i.e., normal streaming access. The mobile device has a 400 MHz CPU, 128 Mbytes of SDRAM, a 2.8-in QVGA (320 x 240) LCD display, and an IEEE 802.11b/g interface on an embedded Windows operating system. The bit streams used in this experiment are in MP4 format with a 160-sec duration: MPEG-4 video at QCIF (176 x 144) image size, 256 kbps, and 15 fps, and MPEG-4 AAC LC (Low Complexity) profile audio at 64 kbps. The test scenario repeatedly alternates between access to the prefetched thumbnail file and access to network streaming every 10 seconds; namely, the periods 0-10, 20-30, ..., 120-130, and 140-150 sec correspond to file-access phases.

Fig. 7. CPU occupation change. (CPU occupation [%], 35-60, versus time [sec]; the Asynchronous Stream Local Access and Synchronous Access periods and points (A) and (B) are marked.)

Fig. 7 shows the change in CPU occupation, normalized every 20 seconds. As the result shows, the CPU load during synchronous streaming (the Synchronous Access periods in Fig. 7) is 5-8% higher than that of the prefetched-stream local access (the Asynchronous Stream Local Access periods). In particular, the 3-sec period prior to the transition from file access to streaming access has a higher CPU load, because the packet-receiving thread and the copy thread start 3 sec before the transition (Fig. 7, point A). At this time, all threads are executing, and the CPU load is highest. On the other hand, the end of the network-streaming period (Fig. 7, point B) has the lowest values: because the streaming thread starts much earlier than real streaming viewing, the receiving-and-copying process has already finished by that period. Table II shows the CPU occupation of each thread, revealing that the Video Renderer has a much higher value, about the same as the AAC Decoder. This is because the transformation of YUV signals to RGB signals requires many calculations. As a result, the proposed system achieves seamless assembly and quick accessibility of contents with a client-side-only mechanism, without assistance from additional proxy or server technologies.

TABLE II
CPU OCCUPATION IN MOBILE CLIENT

Thread               CPU [%]   Thread                 CPU [%]
Main                 7.6       Switch Controller      3.4
MPEG-4 Decoder       19.9      Video Renderer         15.3
AAC Decoder          14.5      Audio Output           4.8
Video Receiver       2.6       Audio Receiver         2.5
Video File Reader    3.0       Audio File Reader      3.8
Video RTCP IF        0.2       Audio RTCP IF          0.2
Packet Copier        1.6       Signaling Controller   0.2
Else                 2.3       Total                  81.9

The next evaluation measures the CPU overhead of an asynchronous thumbnail prefetching session while a synchronous streaming session exists. The bit rate of the background traffic, i.e., the synchronous streaming session, is 256 kbps. Fig. 8 shows CPU occupation against the transmission rate of the asynchronous thumbnail prefetching session. Even if the rate is nearly the same as that of the synchronous streaming session, i.e., 256 kbps, the CPU increase is less than 2.5%, which is reasonably low and leaves room for other processes. In comparison with the results in Table II, this result indicates the feasibility of asynchronous thumbnail prefetching in practical mobile use.

50

100

150

200

250

300

Transmission Rate [bps]

Fig. 8. CPU occupation change for asynchronous streaming.
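As a consistency check on Table II, the reported Total equals the sum of the individual thread loads. The snippet below is our own illustration of that check, not part of the prototype code:

```python
# Per-thread CPU occupation [%] as reported in Table II.
cpu_load = {
    "Main": 7.6, "MPEG-4 Decoder": 19.9, "AAC Decoder": 14.5,
    "Video Receiver": 2.6, "Video File Reader": 3.0, "Video RTCP IF": 0.2,
    "Packet Copier": 1.6, "Else": 2.3,
    "Switch Controller": 3.4, "Video Renderer": 15.3, "Audio Output": 4.8,
    "Audio Receiver": 2.5, "Audio File Reader": 3.8, "Audio RTCP IF": 0.2,
    "Signaling Controller": 0.2,
}
total = round(sum(cpu_load.values()), 1)   # 81.9, matching the Total row
headroom = round(100.0 - total, 1)         # roughly 18% left for other processes
```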

Lastly, we investigate how the display delay and the required buffer size change with the starting time of the synchronous streaming process. The results are shown in Figs. 9 and 10, respectively. In both figures, the starting time is controlled by the "Streaming Start Time Threshold." For example, a streaming start time threshold of 1000 msec means that a synchronous stream whose beginning corresponds to, say, 60 sec on the content play-time basis starts its streaming process at 59 sec; namely, the synchronous stream starts 1000 msec before its actual display time in the client. A positive display delay in Fig. 9 indicates that the media decoder has to wait for media data to be received on the client side, which is abnormal behavior. Although the evaluations were performed on a 3.5G HSDPA network, the results for IEEE 802.11g are shown together for comparison. Fig. 9 shows that increasing the streaming start time threshold decreases the playing-time delay, since the media decoder can obtain the synchronous stream and start the decoding process earlier. On the other hand, because the required buffer grows with the threshold, the optimal threshold is the smallest one that still yields a negative display delay. Moreover, since 3.5G HSDPA has a higher delay than IEEE 802.11g, it requires higher start-time thresholds. As a result, the optimal threshold can be estimated at around 2,500 msec for the 3.5G network and 700 msec for the IEEE 802.11g network in this experimental environment. However, because network conditions can change dramatically, a method for calculating the optimal starting time must be investigated and evaluated for practical application.
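The selection rule described above — the smallest threshold whose display delay is negative — can be sketched as follows. This is a hypothetical helper of our own, and the measurement pairs in the example are illustrative, not the measured values of Figs. 9 and 10:

```python
def optimal_start_threshold(measurements):
    """Given (threshold_msec, display_delay_msec) pairs, return the
    smallest threshold whose display delay is negative, i.e., for which
    the decoder never waits for data; None if no threshold qualifies."""
    feasible = [t for t, delay in measurements if delay < 0]
    return min(feasible) if feasible else None

# Illustrative (made-up) measurements: delay shrinks as the threshold grows.
samples = [(500, 350), (1000, 120), (2000, -40), (3000, -180)]
```

For this illustrative data, optimal_start_threshold(samples) returns 2000.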

Fig. 9. Starting time of the synchronous streaming and playing time delay on mobile client.

Fig. 10. Starting time of the synchronous streaming and required buffer size on mobile client (maximum memory [Bytes] versus streaming start time threshold [msec], for IEEE 802.11g and HSDPA).

The overall CPU load of the fully from-scratch implementation on a commercial mobile terminal was confirmed. The access-switching capability, one of the core mechanisms, between asynchronously prefetched thumbnail-stream local access and synchronous stream network access was then confirmed. Further, the CPU load of the asynchronous streaming, which could impose a large additional load on top of the ordinary synchronous streaming, was shown to be relatively low. Consequently, these preliminary results indicate the feasibility of the proposed mechanism. However, the synchronization timing control remains an open issue for a practical implementation of the proposed system, and further studies are required.

VIII. SYSTEM APPLICATION

We are now developing an application system that applies the proposed asynchronous prefetching streaming scheme to a commercialized PDR system at home. The PDR system can automatically record and index all terrestrial broadcast video programs available in Japan, for up to eight TV channels over one week. This allows us to watch any program at any time by simply selecting the program to be viewed from a TV program guide list. Moreover, if the user searches for a specific word, for example a program title, an actor name, or a product name, the system provides exactly the video scenes or TV commercials where the word appears, avoiding time-consuming video search operations. This system uses human-made video metadata created by professionals in a nearly real-time manner, within 24 hours. These metadata include not only program information, such as program titles, actor and actress names, video-scene titles, and news events, but also TV commercial information, such as product and store names and URLs.

The mobile access system we developed for the PDR enables us to access the recorded video contents and to sort all of those videos by broadcasting channel, by category, or by specific words that the user can configure in advance. It also allows us to search video scenes by entering free search words on the mobile terminal. The implemented system is composed of a home-gateway server, which we are developing and which connects to the PDR via NFS (Network File System), and a mobile client system. The diagram of the server, running on Linux, is shown in Fig. 11. A program on the server side periodically, for example every 15 min, monitors each program's recording status, which is stored in an SQL database in the PDR. Once a new video program has been recorded, a script extracts the corresponding MPEG video segment from one large video file, which contains all of the one-week videos of the target channel, by using a system-call command. The extracted video segment is then converted to an MP4 file with MPEG-4 video and MPEG-4 AAC audio. A thumbnail image for the video is also created with the convert command from the still-image data stored in the PDR. Once the file is converted and installed into a predefined path, a completion flag is registered in another SQL database for mobile access, and the user can then access PHP-based front-end Web pages or RSS-based menu lists and enjoy the mobile streaming service.

Fig. 11. Mobile access application system for PDR system.
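One polling pass of the server-side monitor described above can be sketched as pure selection logic. The function and data shapes are our own illustration; the real implementation reads the PDR's SQL databases and invokes external extraction and conversion commands:

```python
def find_programs_to_convert(recording_status, completion_flags):
    """One monitoring pass (run periodically, e.g., every 15 min):
    return IDs of programs whose recording has finished but whose
    MP4 conversion has not yet been flagged as complete."""
    return sorted(pid for pid, recorded in recording_status.items()
                  if recorded and pid not in completion_flags)

# Illustrative state: "news_0900" just finished recording and awaits conversion.
status = {"news_0900": True, "drama_2100": False, "movie_1300": True}
flags = {"movie_1300"}   # already converted and registered
```

Here find_programs_to_convert(status, flags) returns ["news_0900"]; the conversion step would then run and add "news_0900" to the completion flags.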

In this system, since the video scenes can be segmented according to the human-made metadata, the thumbnail streams to be asynchronously prefetched can be created automatically. The proposed asynchronous prefetching streaming technology can then provide lightweight and fast content-access capabilities for practical use.

IX. CONCLUSIONS

This paper proposes a new lightweight video delivery system with smooth and quick-response accessibility to video scenes, without interference from mobile network conditions and without consuming mobile device storage. The mobile client asynchronously prefetches multipositional thumbnail streams in the background while receiving the streaming media that the user is watching. Each thumbnail stream is a short-length anchor video that helps the user grasp an overview at that playback position. Moreover, the hierarchical prefetching streaming, which follows a prioritized scene structure based on a layer concept, can provide the user with a general view of the entire content as well as a detailed view around the playback position, considering the viewer's position. The preliminary evaluation experiments showed the feasibility of asynchronous thumbnail prefetching in practical mobile use. We are currently developing an application system connected to a commercialized PDR system, and we are also conducting usability tests in a 3.5G mobile network. However, we must still investigate a network transmission control scheme that keeps the asynchronous prefetching streaming from interfering with the synchronous streaming. Some conventional research results [17][18][19] will be useful for this further study.

REFERENCES

[1] 3GPP High Speed Packet Access (HSPA), http://www.3gpp.org/HSPA.
[2] WiMAX Forum, http://www.wimaxforum.org/
[3] Klab Inc., "Questionnaire for video viewing on mobile phones," 2007 (in Japanese).
[4] Chung-Ming Huang and Tz-Heng Hsu, "A User-Aware Prefetching Mechanism for Video Streaming," World Wide Web, vol. 6, no. 4, pp. 353-374, 2003.
[5] Songqing Chen, Haining Wang, Bo Shen and Susie Wee, "Segment-based Proxy Caching for Internet Streaming Media Delivery," IEEE Multimedia Magazine, vol. 12, no. 3, pp. 59-67, 2005.
[6] Yufeng Shan and S. Kalyanaraman, "Hybrid video downloading/streaming over peer-to-peer networks," IEEE International Conference on Multimedia and Expo (ICME), vol. 2, pp. 665-668, 2003.
[7] "Trends and developments in communications and media technology, applications and use," ACMA, 2009.
[8] Hiroshi Tanaka, "Oneseg and Mojie: NHK Mobile TV and Japan mobile web project," EBU Meeting with Mobile content: anytime, anywhere, in any format?, 2008.
[9] Hongliang Li, Guizhong Liu, Zhongwei Zhang and Yongli Li, "Adaptive scene-detection algorithm for VBR video stream," IEEE Transactions on Multimedia, vol. 6, no. 4, pp. 624-633, Aug. 2004.
[10] Jung-Rim Kim, Sungjoo Suh and Sanghoon Sull, "Fast Scene Change Detection for Personal Video Recorder," IEEE Transactions on Consumer Electronics, vol. 49, no. 3, 2003.
[11] Anastasios Dimou, Olivia Nemethova and Markus Rupp, "Scene Change Detection for H.264 Using Dynamic Threshold Techniques," Proceedings of the 5th EURASIP Conference on Speech and Image Processing, Multimedia Communications and Services, Smolenice, Slovak Republic, 2005.
[12] Jek Charlson So Yu, Mohan S. Kankanhalli and Philippe Mulhem, "Semantic Video Summarization in Compressed Domain MPEG Video," IEEE International Conference on Multimedia and Expo (ICME), 2003.
[13] Padmavathi Mundur, Yong Rao and Yelena Yesha, "Keyframe-based video summarization using Delaunay clustering," International Journal on Digital Libraries, vol. 6, no. 2, pp. 219-232, 2006.

[14] D. Besiris, A. Makedonas, G. Economou and S. Fotopoulos, "Combining graph connectivity & dominant set clustering for video summarization," Multimedia Tools and Applications, vol. 44, no. 2, pp. 161-186, 2009.
[15] "Physical Layer Aspects of UTRA High Speed Downlink Packet Access (Release 2000)," TR 25.848, V0.2.1, 3GPP.
[16] Jianbo Gao and Nageswara S. V. Rao, "TCP AIMD Dynamics Over Internet Connections," IEEE Communications Letters, vol. 9, no. 1, 2005.
[17] Aleksandar Kuzmanovic and Edward W. Knightly, "TCP-LP: A Distributed Algorithm for Low Priority Data Transfer," IEEE INFOCOM, 2003.
[18] Aleksandar Kuzmanovic, Edward W. Knightly and R. Les Cottrell, "A protocol for low-priority bulk data transfer in high-speed high-RTT networks," Second International Workshop on Protocols for Fast Long-Distance Networks, 2004.
[19] Vidhyashankar Venkataraman, Paul Francis, Murali S. Kodialam and T. V. Lakshman, "A priority-layered approach to transport for high bandwidth-delay product networks," International Conference on Emerging Networking Experiments and Technologies (CoNEXT), 2008.
[20] ISO/IEC 14496-2, "Information technology -- Coding of audio-visual objects -- Part 2: Visual," 1999.
[21] ITU-T Recommendation H.261, "Video codec for audiovisual services at p x 64 kbit/s," 1993.
[22] ITU-T Recommendation H.263, "Video coding for low bit rate communication" (Infrastructure of audiovisual services -- Coding of moving video).
[23] ISO/IEC 14496-10, "Information technology -- Coding of audio-visual objects -- Part 10: Advanced Video Coding" (H.264/AVC), 2003.
[24] ISO/IEC 14496-3, "Information technology -- Coding of audio-visual objects -- Part 3: Audio" (MPEG-4 AAC), 2005.
[25] ISO/IEC 14496-14, "Information technology -- Coding of audio-visual objects -- Part 14: MP4 file format," 2003.

BIOGRAPHIES

Naofumi Uchihara received a B.Eng. degree from Gunma University, Gunma, Japan, in 2007 and an M.Eng. degree from The University of Electro-Communications, Tokyo, Japan, in 2009. He is now studying for a Dr.Eng. degree at The University of Electro-Communications. His research interests include video coding and video transmission.

Hiroyuki Kasai received B.Eng., M.Eng., and Dr.Eng. degrees in Electronics, Information, and Communication Engineering from Waseda University, Tokyo, Japan, in 1996, 1998, and 2000, respectively. Dr. Kasai was a visiting researcher at British Telecommunications BTexact Technologies, U.K., from 2000 to 2001. He joined Network Laboratories, NTT DoCoMo, Japan, in 2002, and since 2007 has been an associate professor at The University of Electro-Communications, Tokyo. His research interests include video coding, video transmission, mobile service platform technology, and ubiquitous service technology.

Yoshihiro Suzuki received B.Eng. and M.Eng. degrees in Electronic Engineering from Tohoku University, Sendai, Japan, in 1983 and 1985, respectively. Mr. Suzuki joined Matsushita Communication Industrial Co., Ltd. in 1985 and joined Funai Electric Co., Ltd. in 2008. His main fields of interest include embedded operating systems, wired and wireless access networks, ubiquitous network systems, and mobile terminals.

Yoshihisa Nishigori received B.Eng. and M.Eng. degrees in Electric Engineering from Osaka University, Osaka, Japan, in 1981 and 1983, respectively. Mr. Nishigori joined Matsushita Electric Industrial Co., Ltd. in 1983. He joined Funai Electric Co., Ltd. in 200. His main fields of interest include video signal processing, application software, and network technology products.
