
Video Everywhere Through a Scalable IP-Streaming Service Framework

Hsin-Ta Chiao1, Fang-Chu Chen1, Kuo-Shu Hsu1, Shyan-Ming Yuan2
1Digital Video and Optical Communications Technologies Division, Information and Communications Research Laboratories, Industrial Technology Research Institute, Chutung, Hsinchu, Taiwan
{JosephChiao, fcchen, kshsu}@itri.org.tw
2Dept. of Computer Science, National Chiao Tung University, Taiwan
smyuan@cs.nctu.edu.tw

Abstract: In this paper, we propose a scalable IP-streaming service framework that provides the time-shifting and place-shifting capability needed to achieve the vision of video everywhere. A traditional PVR (Personal Video Recorder) provides the time-shifting viewing experience. However, now that mobile TVs and PMPs (Portable Media Players) are available, the traditional PVR has to be enhanced with a place-shifting viewing experience. Several proprietary IPTV streaming techniques, such as Orb or SONY LocationFree, can already deliver a video stream to various kinds of end-user terminals through the Internet or a wireless LAN. In this paper, we propose a distributed, scalable IP-streaming service framework to enhance the current IPTV streaming techniques. Three important issues are addressed. First, we propose a suitable metadata format for use in the service framework. Second, we discuss how to produce scalable IDTV (Interactive Digital Television) contents that can adapt to the display capabilities of diverse terminal devices. Third, we evaluate how to apply MPEG-4 SVC (Scalable Video Coding) in the service framework.

I. Introduction

A traditional PVR (Personal Video Recorder) provides the time-shifting viewing experience: it allows you to watch recorded TV programs on a TV near the PVR whenever you have spare time. However, once portable devices such as mobile TVs or PMPs (Portable Media Players) are available, a person can find a usable video player almost everywhere. Consequently, it is reasonable to enhance the time-shifting viewing experience of a traditional PVR with a new place-shifting capability. This is the vision of video everywhere: you can access your own media (video) everywhere through whatever terminal device is available. As shown in Figure 1, a person can program his PVR remotely during his working hours and watch the recorded TV program later when he has spare time, perhaps on the way home through a PDA, or in the living room through a traditional TV set.

Figure 1: The vision of video everywhere (S. Kramer, 2005)


To achieve the vision of video everywhere, several proprietary IPTV streaming techniques are currently available. They can deliver a video stream to various kinds of end-user terminal devices. For example, Orb [1] is free PC software for Internet streaming. It can stream out the video stored on PC hard disks or on DVD discs. In addition, it can also stream out live TV programs if a TV capture card is installed on the PC running the Orb server. Orb provides a portal through which users instruct their Orb server to stream out an available video. Through a web browser on a terminal device, a user can access the portal and start video streaming to his or her terminal device, such as a notebook or a PDA. Automatic trans-coding is supported by the Orb server to adapt to the transmission network and the capability of the terminal device. SONY LocationFree [2] is a hardware box that can accept TV programs broadcast over the air or via cable, and then stream the received TV programs to a SONY VAIO notebook through a broadband network or to a SONY PSP (PlayStation Portable) through a wireless LAN.

However, the above-mentioned IPTV streaming techniques have the following shortcomings:

* Their service frameworks are not general enough to accommodate other video-based multimedia contents, such as the podcast aggregators (Juice [3], Doppler [4]) used for accessing Internet videoblogs, or the video captured by the cameras of a home security system.
* Their architecture is based on a centralized, real-time streaming server, which is not flexible enough to exploit opportunities to reduce the video transmission cost or to optimize the network conditions for video streaming.
* They operate upon proprietary metadata that is not publicly available for extension or for inter-operation.
* They are not IDTV-aware and support only traditional video content. The interactive content that accompanies a streaming video is out of their scope.

In this paper, we propose a scalable IP-streaming service framework that provides a solution to the above-mentioned problems. Section 2 presents the system architecture of the IP-streaming service framework. In section 3, we discuss the metadata issues of the service framework. In section 4, we discuss how to attach IDTV contents to a video stream delivered by the service framework. In section 5, we briefly describe the application of MPEG-4 SVC (Scalable Video Coding) [5] as the target video format in the service framework and discuss the possibility of using SVC to replace trans-coding. Finally, section 6 concludes this paper.

II. The Scalable IP-Streaming Service Framework

Figure 2 shows the scalable IP-streaming service framework proposed in this paper. It is designed to accommodate the various video sources described below:

* Traditional analog and digital TV: including scheduled streaming and file-based video clips delivered through datacasting.
* Mobile TV: including the above-mentioned scheduled streaming and file-based video clips through the broadcast network, as well as on-demand streaming through the mobile network.
* Podcast aggregators that download updated video files from videoblog web servers according to the metadata in RSS [6] or ATOM [7] format.
* Video streams stored in a private media server or captured by a home security system.

Figure 2: The IP-streaming service framework providing the time-shifting and place-shifting capability (various kinds of video sources feed various kinds of video repositories, which stream to terminal devices over various kinds of access networks and the Internet)

The proposed scalable IP-streaming service framework has a distributed architecture that contains various source nodes and consumption nodes. A source node is a video repository that stores video streams to be delivered, such as a mobile PVR, a podcast aggregator, a video server for home security, or a private media server. A consumption node is a terminal device with video playback capabilities, such as a PMP, a PDA, a laptop PC, or a traditional TV set. In addition, a source node and a consumption node may be situated in the same device. The service framework embeds the service intelligence for managing the computing, storage, and transmission resources inside the framework. It is responsible for scheduling user requests, such as recording or playback, and for dispatching the system resources cooperatively to support these requests. Since there are multiple source nodes and consumption nodes, it is possible to exploit pre-caching and other techniques, according to the user's preferences, to seamlessly provide the time-shifting and place-shifting video streaming service. Consequently, the cost of network transmission can be minimized (because some transmission networks charge by the total size of transmitted packets) and the power consumption of mobile devices can be reduced (by reducing RF activity when they are powered by batteries).
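As an illustration of the service intelligence described above, the following sketch shows one way a scheduler might decide whether (and over which network) to pre-cache a recorded program before the user's expected viewing time. The cost model, network names, and numbers are hypothetical; they are not part of the framework's specification.

```python
from dataclasses import dataclass

@dataclass
class Network:
    name: str
    cost_per_mb: float      # hypothetical tariff (e.g., mobile vs. home WLAN)
    energy_per_mb: float    # hypothetical RF energy cost on the mobile device

@dataclass
class Request:
    program_size_mb: float
    seconds_until_viewing: float
    transfer_rate_mbps: dict    # network name -> achievable rate

def plan_precache(request: Request, networks: list[Network]) -> Network | None:
    """Pick the cheapest network that can finish the transfer before viewing time."""
    candidates = []
    for net in networks:
        rate = request.transfer_rate_mbps.get(net.name, 0.0)
        if rate <= 0:
            continue
        transfer_time = request.program_size_mb * 8 / rate
        if transfer_time <= request.seconds_until_viewing:
            money = request.program_size_mb * net.cost_per_mb
            energy = request.program_size_mb * net.energy_per_mb
            candidates.append((money + energy, net))
    # Return the lowest combined cost, or None if no network can make it in time.
    return min(candidates, key=lambda c: c[0])[1] if candidates else None

if __name__ == "__main__":
    nets = [Network("mobile", cost_per_mb=0.05, energy_per_mb=0.02),
            Network("home_wlan", cost_per_mb=0.0, energy_per_mb=0.005)]
    req = Request(program_size_mb=700, seconds_until_viewing=3600,
                  transfer_rate_mbps={"mobile": 1.0, "home_wlan": 20.0})
    print(plan_precache(req, nets))   # expected: home_wlan
```

A real scheduler would of course also weigh the user's preferences and the predicted availability of each network, but the shape of the decision is the same.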

III. Metadata

In addition to the system architecture, the metadata for the service framework is another important issue addressed in this paper. As shown in Figure 2, the metadata formats exchanged between the various video sources and the various video repositories are private formats that depend on the types of video sources. For the metadata formats used outside the IP-streaming framework, we choose to follow the original metadata formats of the various kinds of video sources. For example, in the European DVB-H/IPDC mobile TV system [8], the metadata format exchanged among mobile PVRs, telco operators, and mobile TV broadcasters is the DVB-IPDC ESG metadata format [9], which is derived from the TV-Anytime phase 1 metadata [10]. (The DVB-IPDC ESG metadata standard is an extension to the TV-Anytime phase 1 standard that accommodates the new requirements of DVB-CBMS services and networks.) In the European DVB-T, DVB-C, and DVB-S TV systems [11], the metadata formats exchanged between traditional TV broadcasters and home PVRs may be DVB-SI [12] or the TV-Anytime metadata carried inside the DVB transport streams. The metadata exchanged between podcast aggregators and video or audio blogs is RSS or ATOM.
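As a concrete illustration of the last point, a podcast aggregator inside the framework only needs to poll RSS or ATOM feeds and collect the video enclosures they announce. The minimal sketch below uses the third-party feedparser library; the feed URL is a placeholder and the actual download step into the video repository is omitted.

```python
import feedparser  # third-party RSS/ATOM parser (pip install feedparser)

def list_video_enclosures(feed_url: str):
    """Return (title, media_url, mime_type) for each video enclosure in a feed."""
    feed = feedparser.parse(feed_url)  # handles both RSS 2.0 and ATOM
    videos = []
    for entry in feed.entries:
        for enclosure in entry.get("enclosures", []):
            mime = enclosure.get("type", "")
            if mime.startswith("video/"):
                videos.append((entry.get("title", ""), enclosure.get("href"), mime))
    return videos

if __name__ == "__main__":
    # Hypothetical videoblog feed; an aggregator would download the enclosures
    # it has not yet stored into its video repository.
    for title, url, mime in list_video_enclosures("https://example.org/videoblog.rss"):
        print(title, url, mime)
```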

For each video repository, we use an XML document that contains an instance of the OnDemandServiceType of the TV-Anytime phase 1 metadata to describe the on-demand video streaming service provided by the video repository, such as a mobile PVR or a home PVR. A video repository may contain several video programs that can be delivered on demand. We use an instance of the OnDemandProgramType of the TV-Anytime phase 1 metadata to describe each available video program inside a video repository. In the IP-streaming framework, the major extension to the TV-Anytime phase 1 metadata is support for the IDTV content attached to the streaming video. Originally, the TV-Anytime phase 1 metadata can only describe the audio and video components of a video program (through the AVAttributes element in Figure 3). Hence, a new element, IDTVAttributes, is added to the ProgramInformationType element to accommodate the attributes of the IDTV contents associated with a video program. Possible types of IDTV contents that can be streamed to a terminal device include WTVML [15], MPEG-4 Part 20 LASeR [16], and JSR-272 [17] (Mobile Broadcast Service API for Handheld Terminals).
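The sketch below illustrates the shape of such an extended program description. Apart from ProgramInformation, AVAttributes, and the proposed IDTVAttributes, the element and attribute names are simplified placeholders; real TV-Anytime instances carry namespaces and many more fields.

```python
import xml.etree.ElementTree as ET

def build_program_information(crid: str, title: str, idtv_format: str) -> ET.Element:
    """Build a simplified, TV-Anytime-like program description carrying the
    proposed IDTVAttributes extension (names are illustrative only)."""
    prog = ET.Element("ProgramInformation", {"programId": crid})
    ET.SubElement(prog, "Title").text = title

    av = ET.SubElement(prog, "AVAttributes")          # audio/video components
    ET.SubElement(av, "VideoAttributes", {"horizontalSize": "320",
                                          "verticalSize": "240"})

    idtv = ET.SubElement(prog, "IDTVAttributes")      # proposed extension
    ET.SubElement(idtv, "ContentFormat").text = idtv_format   # e.g. WTVML, LASeR
    return prog

if __name__ == "__main__":
    elem = build_program_information("crid://example.org/news-2006-05-01",
                                     "Evening News", "LASeR")
    print(ET.tostring(elem, encoding="unicode"))
```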

Figure 3: The extension to the TV-Anytime phase 1 metadata (a new IDTVAttributes element is added to ProgramInformationType alongside the existing AVAttributes element)

Figure 4: The control interfaces of a video repository (a mobile PVR exposes a standard video-streaming control module through a web-service protocol and a private control module through the AIAP-URC protocol; the terminal device renders the latter through its AIAP-URC interface generator and GUI module)

For the metadata that is used inside the IP-streaming service framework (exchanged between the various video repositories and the terminal devices), we propose several new XML metadata formats [13] with enough expressive power to support the service model of the above-mentioned service framework. The first XML metadata format describes the video programs stored in the video repositories and available for on-demand streaming; it is derived from the skeleton of the TV-Anytime phase 1 standard. The second XML metadata format describes the capabilities of the various kinds of terminal devices. The third XML metadata format describes the capabilities of the various kinds of video repositories and the properties of the interconnection networks. The second and third metadata formats are derived from the W3C Composite Capabilities/Preference Profiles (CC/PP) [14].
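The following sketch illustrates how a CC/PP-style terminal profile might be matched against a program's attributes when a place-shift occurs. The attribute vocabulary (screen size, supported formats) is a hypothetical simplification, not the actual schema of the proposed CC/PP-derived formats.

```python
from dataclasses import dataclass, field

@dataclass
class TerminalProfile:
    """Simplified stand-in for the proposed CC/PP-derived device description."""
    screen_width: int
    screen_height: int
    video_formats: set = field(default_factory=set)
    idtv_formats: set = field(default_factory=set)

def fits_terminal(profile: TerminalProfile, video_width: int, video_height: int,
                  video_format: str, idtv_format: str | None) -> bool:
    """Check whether a stored program can be played back on the terminal as-is."""
    if video_format not in profile.video_formats:
        return False
    if idtv_format and idtv_format not in profile.idtv_formats:
        return False
    return video_width <= profile.screen_width and video_height <= profile.screen_height

if __name__ == "__main__":
    pda = TerminalProfile(320, 240, {"H.264", "SVC"}, {"LASeR"})
    print(fits_terminal(pda, 320, 240, "SVC", "LASeR"))   # True
    print(fits_terminal(pda, 720, 576, "MPEG-2", None))   # False
```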

Inside the service framework, we provide a metadata repository that stores the XML metadata for the video repositories. A terminal device can access the metadata repository to find the available video repositories provided by the service framework and to discover the available on-demand video programs inside each video repository. Each video repository offers two control interfaces. The first is the web-service control interface. This is a standard control interface that every video repository has to support. Its major function is controlling video streaming: it accepts commands for starting, stopping, fast-forwarding, rewinding, pre-caching, or pausing a video stream. In addition, it can also send the latest metadata of the video repository to a terminal device. Since the web-service control interface is purely for traditional client-server, request-reply interaction, a video repository does not provide any user-interface information for video streaming through this control interface. Consequently, a terminal device has full control of the video-streaming user interface.
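A minimal sketch of a terminal-side client for such a web-service control interface is shown below. The endpoint URL, command names, and the use of plain HTTP GET requests are assumptions made for illustration; the paper does not fix a concrete protocol binding.

```python
import urllib.parse
import urllib.request

class RepositoryControlClient:
    """Hypothetical terminal-side client for a video repository's
    standard web-service control interface."""

    def __init__(self, endpoint: str):
        self.endpoint = endpoint        # e.g. "http://pvr.example.org/control"

    def _command(self, action: str, **params) -> str:
        query = urllib.parse.urlencode({"action": action, **params})
        with urllib.request.urlopen(f"{self.endpoint}?{query}") as resp:
            return resp.read().decode("utf-8")

    # Commands named in the paper: start, stop, pause, fast-forward,
    # rewind, pre-cache, and fetching the latest repository metadata.
    def start(self, program_id: str):    return self._command("start", program=program_id)
    def stop(self):                       return self._command("stop")
    def pause(self):                      return self._command("pause")
    def fast_forward(self, seconds: int): return self._command("ff", seconds=seconds)
    def rewind(self, seconds: int):       return self._command("rw", seconds=seconds)
    def precache(self, program_id: str):  return self._command("precache", program=program_id)
    def metadata(self) -> str:            return self._command("metadata")

# Usage against a hypothetical repository:
#   client = RepositoryControlClient("http://pvr.example.org/control")
#   print(client.metadata())
#   client.start("crid://example.org/news-2006-05-01")
```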

The second control interface provided by a video repository is the AIAP-URC control interface. AIAP-URC (Alternate Interface Access Protocol - Universal Remote Console) is a US standard for modality-independent user interfaces [18][19][20]. The core concept of the AIAP-URC standard is the Universal Remote Console (URC). All devices that support the AIAP-URC standard can be controlled by a URC. The user interface for controlling a type of device is given as an abstract description. The interface generator on the URC reads the abstract description and is responsible for rendering the real user interface for controlling the device on the URC. For each video repository, the AIAP-URC control interface is responsible for controlling the private functions not covered by the standard web-service control interface. Since the private functions provided by the various kinds of video repositories differ, it is difficult for a terminal device to present the user interfaces of these private functions without any hints. The AIAP-URC standard solves this problem by passing an abstract description of the structure and the presentation hints of a private function's user interface to the terminal device. Each terminal device has an AIAP-URC user interface generator that parses the abstract user interface description and renders the user interface of the private function on the terminal device. For example, the AIAP-URC control interface of a mobile TV PVR can check the program schedule of mobile TV broadcasting and assign the schedule for recording mobile TV programs. The AIAP-URC control interface of a podcast aggregator can refresh the RSS or ATOM metadata of video and audio blogs; it can also search the metadata and initiate downloading of video and audio clips.
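To make the idea of an interface generator concrete, the sketch below parses a toy abstract UI description and turns it into a flat list of widgets that a terminal could render with whatever GUI toolkit it has. The XML vocabulary used here (interface/input/command) is invented for illustration and is not the actual AIAP-URC description language.

```python
import xml.etree.ElementTree as ET

# A toy abstract description of a mobile PVR's private "schedule recording"
# function; the element names are illustrative, not the AIAP-URC vocabulary.
ABSTRACT_UI = """
<interface name="schedule-recording">
  <input id="channel"  label="Channel"/>
  <input id="start"    label="Start time"/>
  <input id="duration" label="Duration (min)"/>
  <command id="record" label="Schedule recording"/>
</interface>
"""

def generate_widgets(description: str):
    """Map the abstract description to (widget_kind, id, label) tuples that the
    terminal's GUI module could render."""
    root = ET.fromstring(description)
    widgets = []
    for child in root:
        kind = "text_field" if child.tag == "input" else "button"
        widgets.append((kind, child.get("id"), child.get("label")))
    return widgets

if __name__ == "__main__":
    for widget in generate_widgets(ABSTRACT_UI):
        print(widget)
```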

IV. IDTV Contents

The second issue addressed in this paper is the IDTV contents. The IDTV contents used in this service framework are of two types: the source IDTV contents (from the content sources to the service framework) and the target IDTV contents (delivered from the service framework to the terminal devices). The target IDTV contents should be lightweight enough and scalable enough to adapt to the display capabilities of the various terminal devices. As shown in Table 1, we evaluate three IDTV content formats for this purpose: WTVML, LASeR, and JSR-272. These three content formats are designed for resource-constrained mobile TV systems, which have small screen sizes, low transmission bandwidth, and less powerful hardware on handheld devices. In addition, these content formats can also be used in the traditional large-screen TV environment. Hence, they are suitable for adapting to the environment change when place-shifting happens to a user of the service framework.

WTVML [15] is an ETSI-approved standard. It is an XML-based IDTV content format that can separate the presentation model and the data model of IDTV contents. LASeR is the draft standard of MPEG-4 part 20: Lightweight Application Scene Representation [16]. LASeR is based on the Tiny profile of the W3C SVG 1.1 and 1.2 standards and therefore supports 2D scenes with scalable vector graphics. JSR-272 [17] is the API extension to the Java platform on mobile phones (J2ME-MIDP) for accommodating the features of mobile TV and data broadcasting. Both WTVML and LASeR are XML-based IDTV content representations. In contrast, JSR-272 uses Java programs as the format of IDTV content. In the IP-streaming service framework, due to the requirement of place-shifting, it is important for an IDTV content to adapt to changes in the display capability of the terminal devices, such as switching the viewing terminal from a mobile phone to a traditional TV set. Since WTVML supports model-based presentation, the major kinds of presentation models (e.g., for mobile phones, for SDTV, for PCs) can be assigned to an IDTV content in advance. When a place-shift happens, in most cases the same IDTV content can be used on the new terminal device without further translation. LASeR content is also very flexible, since it supports 2D vector graphics; no IDTV content translation is required when place-shifting happens. On the J2ME-and-MIDP platform, the GUI API for JSR-272 is LCDUI. Since LCDUI is not a scalable GUI API that can adapt to the context change of terminal devices, we suggest using the JSR 226 Scalable 2D Vector Graphics API [21] for J2ME to replace LCDUI inside the IP-streaming service framework.
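As an illustration of the WTVML model-based approach described above, the following sketch picks a pre-authored presentation model for a content item according to the terminal's device class when a place-shift occurs. The device classes and model names are hypothetical.

```python
# Hypothetical mapping from a terminal's device class to the presentation
# model that was authored in advance for a WTVML content item.
PRESENTATION_MODELS = {
    "mobile_phone": "model-qvga",   # small screen, simplified layout
    "sdtv":         "model-sdtv",   # 10-foot UI, large fonts
    "pc":           "model-vga",    # pointer-driven layout
}

def select_presentation_model(device_class: str) -> str:
    """Return the pre-assigned presentation model for the new terminal,
    falling back to the SDTV model when the class is unknown."""
    return PRESENTATION_MODELS.get(device_class, PRESENTATION_MODELS["sdtv"])

if __name__ == "__main__":
    # Place-shift from a mobile phone to the living-room TV set:
    print(select_presentation_model("mobile_phone"))  # model-qvga
    print(select_presentation_model("sdtv"))          # model-sdtv
```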

* Type of IDTV middleware - WTVML: interpreter-based software (a presentation middleware engine); LASeR: interpreter-based software (a presentation middleware engine); JSR-272: Java-based (an execution engine).
* Content representation - WTVML: XML-based, derived from WML; LASeR: binary representation derived from the XML-based SVG; JSR-272: J2ME-and-MIDP-based Java program.
* Structure of the IDTV contents - WTVML: deck-and-card-based presentation; LASeR: scene-tree-based presentation; JSR-272: not strictly defined.
* Can the GUI adapt to the terminal device - WTVML: yes, through the model set for card presentation; LASeR: yes, it supports vector-based graphics; JSR-272: no, the MIDP LCDUI GUI API is not scalable.

Table 1: The comparison between WTVML, LASeR and JSR-272

How to generate the target IDTV contents from the source IDTV contents is a troublesome problem; however, we consider that a transformation from an IDTV PCF (Portable Content Format) document to a target IDTV content format may be a feasible way. For example, the DVB PCF standard (not yet finalized) is designed for interoperability among various kinds of IDTV content formats and IDTV middleware. An IDTV content is first authored in the DVB-PCF format and can then be transformed to a destination IDTV content format before the content is broadcast. Hence, it is a reasonable assumption that in the future most IDTV contents will have original copies authored in a PCF format.

However, since the technical details of DVB PCF are currently not revealed in public, we plan to elaborate our idea after the DVB-PCF group reveals enough technical details of the PCF standard.

V. Application of ISO MPEG SVC

One issue in realizing video everywhere is the ability to accommodate different playback device capabilities and the different bandwidths of heterogeneous transmission networks. The conventional, non-scalable method is to perform compression each time a different request is made. This repeated compression of the same source not only puts a heavy workload on content preparation but also results in complex communication. In the proposed framework shown in Figure 2, where the content provider is far separated from the users and the network is complex, conventional non-scalable compression is not appropriate. Scalable coding, which provides "compress once for all" convenience, is a better fit for the video-everywhere framework. The state-of-the-art scalable coding technology, SVC, is the newest coding technology being standardized in the ISO MPEG organization. SVC provides multi-dimensional scalability, including temporal, spatial, and SNR scalability. This multi-dimensional scalability is well suited to playback devices with different screen sizes, battery lifetimes, and so on, and the scalable bit-stream size is good for adapting to different network bandwidths. According to the technology under standardization, the SVC bit stream is packed into NAL (network abstraction layer) units; the scalability of each dimension is packed into different NAL units to be transmitted over the network. An extractor is then used to extract the appropriate NAL units according to the available bandwidth or the requirements of the playback devices. We propose the following application of SVC to the framework depicted in Figure 2: the content provider compresses the content using SVC with as many dimensions of scalability as possible, and then sends the bit stream to the video repositories as NAL units. An extractor is situated in each repository or streaming server. When a user request or a traffic-flow report arrives, the extractor extracts the appropriate NAL units stored in the repository and streams them out to the final terminal device. In this way, one avoids the complexity of performing re-compression or trans-coding, and at the same time spares the communication back to the original content provider.
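A minimal sketch of such an extractor is shown below. It assumes the repository has already demultiplexed the SVC bit stream into NAL units labeled with their spatial (dependency), SNR (quality), and temporal layer identifiers; the data structure and the selection policy are illustrative, not the normative SVC extraction process.

```python
from dataclasses import dataclass

@dataclass
class NalUnit:
    dependency_id: int   # spatial layer
    quality_id: int      # SNR layer
    temporal_id: int     # temporal layer
    payload: bytes

def extract_substream(nal_units, max_dependency: int, max_quality: int,
                      max_temporal: int):
    """Keep only the NAL units whose layer identifiers are within the limits
    requested by the terminal device or imposed by the available bandwidth."""
    return [n for n in nal_units
            if n.dependency_id <= max_dependency
            and n.quality_id <= max_quality
            and n.temporal_id <= max_temporal]

if __name__ == "__main__":
    stream = [NalUnit(d, q, t, b"...")            # a toy 2x2x2 layer cube
              for d in range(2) for q in range(2) for t in range(2)]
    # A QVGA handheld on a slow link: base spatial/SNR layer, full frame rate.
    substream = extract_substream(stream, max_dependency=0, max_quality=0,
                                  max_temporal=1)
    print(len(substream), "of", len(stream), "NAL units selected")
```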

VI. Conclusions

In this paper, we proposed a concept for implementing an IP-based service framework for time-shifting and place-shifting video streaming. We showed the system architecture and discussed the metadata formats interchanged within the service framework. In addition, we also proposed a possible approach for streaming IDTV contents inside the service framework by employing both the PCF and the mobile TV middleware. Three major mobile TV middleware standards were evaluated: WTVML, LASeR, and JSR-272. Finally, we showed how to apply MPEG SVC in the service framework. We hope that the IP-based service framework may become a new use case for the above-mentioned DVB-PCF, the mobile TV middleware standards, and the MPEG SVC standard.

References

[1] Official Homepage of Orb Networks.
[2] Official Homepage of SONY LocationFree.
[3] Official Homepage of Juice.
[4] Official Homepage of Doppler.
[5] J. Ohm, Introduction to SVC Extension of Advanced Video Coding, ISO/IEC JTC1/SC29/WG11, July 2005.
[6] Wikipedia, RSS (file format), April 2006.
[7] M. Nottingham and R. Sayre, The Atom Syndication Format, IETF RFC 4287, Dec. 2005.
[8] DVB, IP Datacast over DVB-H - Set of Specifications for Phase 1, DVB Document A096, 2005.
[9] DVB, IP Datacast over DVB-H - Electronic Service Guide (ESG), DVB Document A099, 2005.
[10] The TV-Anytime Forum, Specification Series: S-3 On Metadata - Part A: Metadata Schemas, Aug. 2003.
[11] ETSI, DVB - Implementation Guidelines for the Use of MPEG-2 Systems, Video and Audio in Satellite, Cable and Terrestrial Broadcasting Applications, ETSI TR 101 154 V1.4.1, July 2000.
[12] ETSI, DVB - Specification for Service Information (SI) in DVB Systems, ETSI EN 300 468 V1.5.1, May 2003.
[13] T. Bray, J. Paoli, C. Sperberg-McQueen, et al., Extensible Markup Language (XML) 1.0, 2nd ed., W3C Recommendation, Oct. 2000.
[14] W3C, Composite Capabilities/Preference Profiles (CC/PP), July 2004.
[15] ETSI, Specification for a Lightweight Microbrowser for Interactive TV Applications, Based on and Compatible with WML, ETSI TS 102 322 V1.1.1, May 2004.
[16] J. Dufourd and Y. Lim, LASeR and SAF Editor's Study (Draft Standard of ISO/IEC 14496-20), July 2005.
[17] A. Rantalahti and I. Wong, JSR 272 - Mobile Broadcast Service API for Handheld Terminals, April 2005.
[18] G. Vanderheiden, G. Zimmermann, and S. Trewin, Interface Sockets, Remote Consoles, and Natural Language Agents - A V2 URC Standards Whitepaper, Aug. 2004.
[19] B. LaPlant, S. Trewin, G. Zimmermann, et al., "The Universal Remote Console: A Universal Access Bus for Pervasive Computing," IEEE Pervasive Computing, Jan.-March 2004.
[20] J. Nichols and B. Myers, Report on the INCITS/V2 AIAP-URC Standard, Feb. 2004.
[21] S. Chitturi, JSR 226 - Scalable 2D Vector Graphics API for J2ME, March 2005.
