Multimedia Content Repurposing in Ambient Intelligent Environments
M. Shamim Hossain, M. Anwar Hossain, Abdulmotaleb El Saddik
SITE, University of Ottawa, 800 King Edward, Ottawa, ON K1N 6N5, Canada
context, and predictive to assist humans [8, 7]. In such a domain, rich multimedia content will be accessible to any user from anywhere at any time, and can be rendered on any device over heterogeneous networks. Therefore, tailoring content to match individual context through repurposing is an important task in the AmI paradigm. Several research efforts have been initiated to realize the AmI vision, including the development of intelligent computing devices, ubiquitous communication infrastructures, and adaptive and natural user interfaces. Consequently, a multitude of AmI applications have been developed. The IST OZONE [9] project investigated a generic framework for enabling consumer-oriented AmI applications that also included research on content adaptation and interface technologies. The IST BETSY [2] project also focused on multimedia content adaptation in order to deliver content to various hand-held wireless devices under varying network conditions. The Philips HomeLab [14] is a test bed for studying a real-life AmI environment in which electronic devices are embedded in the surroundings so that they can recognize the user’s presence and respond to the user’s needs. There are several other major AmI-related projects at universities and in industry [15]-[19]. In the following sections, we first present a motivating scenario of multimedia content repurposing and discuss several content repurposing issues in the context of AmI. We then elaborate on the design and architecture of a prototype repurposing system. This is followed by our concluding remarks.
Abstract Multimedia content repurposing is a challenging task that aims to adapt and deliver media content to client devices by considering the heterogeneity of users, terminals, and communication channels, so that multimedia access and delivery appear homogeneous, seamless and ubiquitous. This ubiquity of multimedia access and delivery is one of the requirements of an ambient intelligent environment, where people interact with intelligent computing devices through adaptive and intuitive interfaces in a natural and context-sensitive manner. This paper addresses several issues of multimedia content repurposing in ambient intelligent environments and presents a prototype system, including its design and architecture.
1. Introduction The presence of heterogeneous technological surroundings forces new ways of dealing with multimedia content distribution and delivery. The same content cannot be rendered and perceived in the same way everywhere, because of differences in user preferences, the natural environment of the users, the end terminal capabilities, and the varying network characteristics. Content repurposing may tackle these issues by automatically converting one media content into another while preserving a copy of the original content [3]. This conversion may be performed in different ways, including conversion between different modes, video coding formats, frame rates, bit rates, spatial resolutions and so on [11, 1]. The goal of multimedia content repurposing is to enable universal multimedia access [10], which considers the delivery of content under different usage environments. Such a goal also drives the vision of Ambient Intelligence (AmI) that will enable technology to become unobtrusive, embedded in our natural surroundings, adaptive to the response of people and their
1-4244-0832-6/07/$20.00 ©2007 IEEE.
2. Motivating Scenario Figure 1 shows a motivating scenario where distributed users participate in a live video conferencing session. Each of the communicating parties has their own surroundings, which may differ in terms of devices, networks and user preferences. A live video stream is captured in one user environment and sent to another user environment. However, because of the heterogeneity of
Figure 1. A scenario of multimedia content repurposing in an ambient intelligent collaborative (video conferencing) environment
day, nearby people, and devices. According to Chen and Kotz [6], context includes:
a) Computing context such as a network profile (e.g. network connectivity, bandwidth, communication cost and nearby resources),
b) User context such as the user’s profile, user’s location and nearby users and people,
c) Physical context such as illumination characteristics and temperature,
d) Temporal context such as the time of day, week or year that refers to when the context information of the user is captured.
These ambient environment profiles fall into the following categories as described in the natural environment characteristics tool of MPEG-21 [4, 12].
o User profile: The user profile describes the personal properties and preferences of a user, which include audio-video qualities such as frame rate and resolution. It also describes the user’s content preferences, presentation preferences, accessibility, mobility and destination. User characteristic tools of MPEG-21 [12] describe such profiles in detail.
o Terminal profile: This profile describes the codec capabilities, device capabilities and available input/output characteristics. The terminals could be a server, client, or proxies along the repurposing path. In addition to general capabilities, the server also has the information about the neighboring proxies connected to it. Proxies also have the information about neighboring connected proxies and the available repurposing services that are running on each proxy. Based on these profiles, a server can identify whether the client is capable of decoding the content or whether a need exists for the content to be
the usage environment, the captured video stream cannot be rendered to the satisfaction of the receiving user. The solution to such an anomaly lies in considering the individual’s context (ambient usage environment) when delivering the content. To do so, the captured video stream is repurposed through some intermediary steps before the content is delivered to the receiving terminal.
3. Ambient Environment Description In order to develop an AmI-aware multimedia content repurposing system, it is necessary to capture information from the surrounding environment of the user. Such information may be captured and stored in different profiles. These profiles include, but are not limited to, ambient environment profiles (e.g. user, terminal, and network profiles), content profiles, and repurposing service profiles. These profiles play an important role in determining the context of the usage environment and, hence, assist in repurposing content to ensure ubiquitous and universal multimedia distribution and delivery. In the following, we elaborate on these profiles.
3.1. Ambient Environment Profile The ambient environment profile describes the surrounding context of the user. Context information can be characterized by the situation of an entity where an entity can be a person, a place, or a computational or media object that is considered relevant to the interaction between a user and an application [5]. This information may include the user’s location, user’s activity, time of
processing power and memory. Servers are usually limited by the computational load and resource consumption of complex repurposing when the dynamic context of a user and the surrounding natural usage environment (referred to as ambient information) is considered. Thus, in our system, repurposing is performed in distributed resourceful proxies, in multiple steps, in order to reduce the computational load on the server [1]. However, distributing repurposing tasks among the proxies requires special considerations such as selecting and composing the best repurposing services. To this end, we used a repurposing service selection algorithm. This algorithm is one of the decision makers that finds the best repurposing services for repurposing the content so that it is acceptable to the user. It uses a QoS function as the metric to determine the best repurposing path. In our case, we refer to this metric as the ambient QoS metric, which has an aggregate value between 0 and 1, where 1 denotes the highest satisfaction of the user. The ambient QoS (AmI QoS) metric is a function of an m-tuple that includes the user’s preferences (audio quality, frame rate and resolution of video), network preferences (required bandwidth), and device characteristics (memory requirements). Each ambient QoS parameter is mapped to an application-level variable. The ambient QoS, i.e. the satisfaction of a user with each application-level variable, is expressed as a component ambient QoS function qi(yi), where yi is a multimedia application-level variable. For more than one multimedia application parameter, or variable, the overall satisfaction [20] of a user is determined by a combination function of the individual component satisfactions qi, presented in Eq. (1) below:
further converted in order to be rendered by the requesting client. The terminal capability tools of MPEG-21 [12] describe the above terminal capabilities in detail.
o Network profile: The network profile is important for dynamically repurposing the multimedia content based on varying network capabilities and conditions. This profile describes maximum network bandwidth, available bandwidth, delay, and error. The network characteristic tools of MPEG-21 [12] also describe these characteristics.
Some of the attributes in the above profiles (ambient environment information) can be keyed in manually, while others are captured through sensing devices and/or user devices. These sensing devices may be standalone devices (e.g. motion sensor, environment sensor) or be embedded in pervasive devices (a PDA with GPS, a cell phone with camera etc.).
3.2. Repurposing Service Profile The repurposing service profile contains the list of inputs (such as codecs and formats) that the repurposing service can accept. It also contains a list of outputs that the repurposing service can create by repurposing one of the inputs. In addition, this profile may contain a link back to the description of the proxy on which the repurposing service is running.
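As an illustration of how such a profile might be represented and queried, the following Python sketch models a service profile with its accepted inputs, producible outputs and an optional proxy link. The class and field names (MediaFormat, RepurposingServiceProfile, accepts, produces, proxy_uri) and the example URI are hypothetical; the two formats reuse the values listed in Appendix A, and treating the first as an input and the second as an output is our assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MediaFormat:
    """One accepted or produced format (hypothetical structure)."""
    mime: str        # e.g. "video/h263"
    resolution: str  # e.g. "352x288"


@dataclass
class RepurposingServiceProfile:
    """Inputs a service accepts, outputs it can create, and its host proxy."""
    accepts: List[MediaFormat]
    produces: List[MediaFormat]
    proxy_uri: Optional[str] = None  # link back to the proxy description

    def can_accept(self, fmt: MediaFormat) -> bool:
        return fmt in self.accepts


def can_chain(upstream: RepurposingServiceProfile,
              downstream: RepurposingServiceProfile) -> bool:
    """True if some output of `upstream` is a valid input of `downstream`."""
    return any(downstream.can_accept(f) for f in upstream.produces)


# Values taken from Appendix A; the input/output roles are assumed.
MJPEG2K_QVGA = MediaFormat("video/Mjpeg2k", "320x240")
H263_CIF = MediaFormat("video/h263", "352x288")

to_h263 = RepurposingServiceProfile(
    accepts=[MJPEG2K_QVGA],
    produces=[H263_CIF],
    proxy_uri="proxy-1.example",  # placeholder, not from the paper
)
h263_consumer = RepurposingServiceProfile(accepts=[H263_CIF], produces=[])
```

A server holding such profiles could call can_chain(to_h263, h263_consumer) to decide whether two services may be composed along a repurposing path.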
3.3. Content Profile The content profile describes the content that a sender can deliver. It includes the metadata or description of different media types (such as audio and video) and their variations. These metadata include information about the transport protocol, coding standards (such as H.263 video, MPEG-4 video etc.) and other identifying parameters. The metadata description follows the MPEG-7 standard [13]. Our repurposing system uses the content profile along with other profiles to adapt the multimedia content to client devices.
\[ Q_{comb} = f_{comb}(q_i) = f_{comb}(q_1, q_2, \ldots, q_m) = \frac{m}{\sum_{i=1}^{m} \frac{1}{q_i}}, \qquad 0 \le q_i \le 1 \qquad (1) \]
where each ambient QoS component q1, q2, …, qm is based on frame rate, resolution, SNR quality or other parameters. As Eq. (1) implies, the overall satisfaction is low if any individual satisfaction component is low. For instance, if a multimedia stream has a very high resolution but plays back at 3 fps, then the combined satisfaction is low. In the algorithm described below, RN is the set of repurposing services to be considered by the system, i.e. the candidate set, and RV is the set of repurposing services that have already been considered. The candidate
4. System and Architecture Due to the different codec capabilities, input/output capabilities and memory requirements of user devices, and the underlying network conditions, multimedia content created for one user device must be repurposed before other devices can use it. This repurposing may take place on the client side, on the server side or on a proxy. Some client devices (PDA, cell phone etc.) have some constraints regarding
Heterogeneous clients consist of devices such as laptops, PDAs, cell phones, wall mounted displays, touchscreens, DTVs, webcams and other pervasive devices. These devices are used by different users and are dispersed in the ambient environment, connected through ubiquitous networks. Some of the client devices have smart sensors, which capture natural environment information such as temperature, time, and video illumination characteristics. In this case, those clients act as capturing agents. In general, the clients may be subscribers to multimedia content or consumers of the repurposed content.
The server collects and maintains the ambient profile and content profile, which are necessary to repurpose the requested content. It also collects all the information about the proxies and the repurposing services running on them. It is responsible for generating the repurposing path in order to perform optimized repurposing tasks.
Repurposing proxies contain one or more repurposing services. Each repurposing service consists of a decoder, a size converter and an encoder. It lists all possible input parameters that the service can accept, including the media standard (e.g. H.263, MPEG-4), frame format (e.g. CIF, QCIF) and frame rate (e.g. 30 fps, 15 fps). It also lists all possible output parameters based on which the repurposed output is delivered.
The high-level message flows, shown in Figure 2, are described in the following steps:
(1) After a session is established with the individual client, the ambient profiles, including the user preferences, the device profile, and the network profile, are transmitted to the server.
(2) The server then requests information from the connected neighboring proxies and continues to request information from the rest of the proxies connected to each of the neighboring proxies.
(3) After receiving all information from the proxies and the repurposing services, the server generates an optimized repurposing path by using the ambient QoS metric.
(4) The server then sends the path information and the stream to the first connected proxy. The proxy, after receiving the path and stream from the server, performs the necessary repurposing tasks and, if needed, sends the path information along with the repurposed stream to the next available proxy. This process continues until the desired repurposed content is generated.
(5) The client receives the repurposed content from the last proxy in the chain.
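Steps (4) and (5) of the message flow can be sketched as a stream that is handed from proxy to proxy along a precomputed path. This is a minimal, in-process illustration rather than the prototype's implementation: the Stream type, the two service functions and the proxy names are hypothetical, and only the MJPEG at 30 fps to H.264 at 15 fps conversion is borrowed from Section 5.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Stream:
    """Descriptor of a media stream as it travels along the path."""
    codec: str
    fps: int
    trace: List[str]  # proxies that have processed the stream so far


# A repurposing service maps an input stream to a repurposed stream.
Service = Callable[[Stream], Stream]


def transcode_to_h264(s: Stream) -> Stream:
    return Stream("H.264", s.fps, list(s.trace))


def reduce_to_15fps(s: Stream) -> Stream:
    return Stream(s.codec, min(s.fps, 15), list(s.trace))


def deliver(stream: Stream, path: List[str],
            proxies: Dict[str, Service]) -> Stream:
    """Apply each proxy's service in path order; the last output
    is what the client receives (step 5)."""
    for name in path:
        stream = proxies[name](stream)  # proxy performs its repurposing task
        stream.trace.append(name)       # then forwards path and stream
    return stream


PROXIES: Dict[str, Service] = {
    "proxy_a": transcode_to_h264,
    "proxy_b": reduce_to_15fps,
}
RESULT = deliver(Stream("MJPEG", 30, []), ["proxy_a", "proxy_b"], PROXIES)
```

In a deployed system each hop would be a network transfer between proxies; the loop above only mirrors the order in which the path information and the stream are passed on.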
repurposing services set contains the repurposing services that have input edges coming from any repurposing service in the set RV. At the beginning of the algorithm, the set RV contains only the Content_Sender node, which holds the original media format, and RN contains all the other repurposing services in the graph that are connected to the Content_Sender, and the Receiver. In each iteration, the algorithm selects the repurposing service Ri that generates the maximum user satisfaction or ambient QoS (AmI QoS). The ambient QoS is computed as an optimization function of the frame rate, the frame size or the quality of the output format of Ri, subject to the constraint of the available bandwidth between Ri and its ancestor repurposing services. Ri is then added to RV. The RN set is then updated with all the connected neighbor repurposing services of Ri. The algorithm stops when the RN set is empty, or when the Receiver node is added to RV. The steps of the selection algorithm are given below:
1. Let RV = {Content_Sender}. Let RN = downstream neighbors of {Content_Sender}. Let {Ri} be the set of repurposing services.
2. If RN = ∅ (i.e. there are no more repurposing services to consider and the receiver cannot be reached through the repurposing path), then TERMINATE (FAILURE).
3. For each Ri ∈ RN, compute the AmI QoS.
4. Select the repurposing service Ri that has the maximum AmI QoS. Set RN = RN – {Ri}, RV = RV ∪ {Ri}.
5. If Ri = Receiver, then GOTO Step 8.
6. For each Rj ∈ downstream neighbors of {Ri}, set RN = RN ∪ {Rj}.
7. GOTO Step 2.
8. Print the repurposing path from the Content_Sender to Ri.
We now analyze the run-time complexity of our algorithm. We assume that Vr and Er are the nodes and the edges (links), respectively, of a directed acyclic repurposing graph G(Vr, Er). The satisfaction function is called once for each edge in the repurposing graph (steps 2-6), for a total of |Er| calls. Steps 2-6 are repeated at most |Vr| times, because each vertex (repurposing node) vr ∈ Vr is added to the set RN at most once. Selecting the repurposing service with the maximum satisfaction takes O(|Vr|) time. Thus, the total run-time complexity of our algorithm is O(|Vr|^2 + |Er|). Figure 2 shows the system architecture of our prototype multimedia content repurposing system. It consists of heterogeneous clients, a server and a handful of repurposing proxies.
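The combination function of Eq. (1) and the selection steps listed above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the prototype's code: the AmI QoS of each service is supplied as a precomputed score rather than optimized under the bandwidth constraint, a zero component is mapped to a combined value of 0, ties are broken by name, and the parent bookkeeping used to print the path is our own addition. The graph, the scores and the function names are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence, Set


def combined_qos(components: Sequence[float]) -> float:
    """Eq. (1): m divided by the sum of 1/q_i (harmonic mean)."""
    if not components:
        raise ValueError("at least one component is required")
    if any(q < 0.0 or q > 1.0 for q in components):
        raise ValueError("each component must lie in [0, 1]")
    if any(q == 0.0 for q in components):
        return 0.0  # assumption: limiting value as a component tends to 0
    return len(components) / sum(1.0 / q for q in components)


def select_path(graph: Dict[str, List[str]],
                qos: Callable[[str], float],
                sender: str = "Content_Sender",
                receiver: str = "Receiver") -> Optional[List[str]]:
    """Greedy selection following steps 1-8; returns None on FAILURE."""
    visited: Set[str] = {sender}                  # RV
    parent: Dict[str, str] = {}                   # assumed bookkeeping
    candidates: Set[str] = set()                  # RN
    for node in graph.get(sender, []):            # step 1
        candidates.add(node)
        parent[node] = sender
    while candidates:                             # step 2
        best = max(sorted(candidates), key=qos)   # steps 3-4
        candidates.remove(best)
        visited.add(best)
        if best == receiver:                      # step 5, then step 8
            path = [best]
            while path[-1] != sender:
                path.append(parent[path[-1]])
            return path[::-1]
        for nxt in graph.get(best, []):           # step 6
            if nxt not in visited and nxt not in candidates:
                candidates.add(nxt)
                parent[nxt] = best
    return None


# Hypothetical graph and scores, for illustration only.
GRAPH = {"Content_Sender": ["R1", "R2"], "R1": ["Receiver"], "R2": ["Receiver"]}
SCORES = {
    "R1": combined_qos([0.9, 0.2]),   # one weak component dominates
    "R2": combined_qos([0.6, 0.6]),
    "Receiver": 1.0,
}
PATH = select_path(GRAPH, SCORES.__getitem__)
```

With these scores, R1 is penalized by its single low component even though its other component is high, which is the behavior of Eq. (1) described in the text, so the search proceeds through R2.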
Figure 2. Multimedia content repurposing system architecture
visual contents and the encoded contents. As shown in Figure 3, the visual content is repurposed from MJPEG (at 30 fps) to H.264 (at 15 fps) for target channels of 64 kbps and 32 kbps. The target channel bandwidth is dynamically identified by the proxies and sent to the server. Both graphs in Figure 3 show a sudden drop in the encoded data series, which is due to scene changes. However, even when the scene changes at the 35th and the 121st frames, the visual quality of the repurposed stream remains higher than that of the encoded stream. Repurposing the visual content into H.264 yields a quality gain of 1 dB, because the bandwidth is controlled during repurposing. Initial usability tests showed user satisfaction, albeit with some initial setup delay. This delay was caused by the repurposing path finding time, the proxy waiting time (streaming from proxy to proxy) and the repurposing time. The average path finding time, proxy initializing time and repurposing time were 20 ms, 2000 ms and 3000 ms, respectively. We are currently working on minimizing these delays.
5. Implementation and Results A prototype system was implemented using J2SE 1.5 and JMF 2.1.1a. We used RTCP in combination with RTP over UDP at the transport level. To establish initial communication between servers and clients, SIP over TCP was used. RTSP and the Session Description Protocol (SDP) were used at the session level in order to manage the streaming session. The profiles were created using XML and XML Schema. An example of the repurposing service profile is shown in Appendix A. The prototype application periodically scans for available multimedia repurposing services and, with the help of the proposed service selection algorithm, determines the best available service in order to render the multimedia stream to different clients based on the user’s satisfaction. After running the application, we measured the quality of the repurposed content. For the measurement, we used a widely accepted objective metric of visual quality, the Peak Signal-to-Noise Ratio (PSNR), defined in Eq. (2), where MSE is the mean square error between the original content and the reconstructed visual content.
\[ \mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}} \ \mathrm{dB} \qquad (2) \]
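As a worked illustration of Eq. (2), the following Python sketch computes the PSNR between two short sequences of 8-bit luminance samples. The flat-list representation of a frame and the sample values are assumptions made for brevity; the prototype's actual measurement code is not shown in this paper.

```python
import math
from typing import Sequence


def mse(original: Sequence[int], reconstructed: Sequence[int]) -> float:
    """Mean square error between two equally sized sample sequences."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((a - b) ** 2 for a, b in zip(original, reconstructed))
    return total / len(original)


def psnr(original: Sequence[int], reconstructed: Sequence[int],
         peak: int = 255) -> float:
    """Eq. (2): 10 * log10(peak^2 / MSE), in dB."""
    err = mse(original, reconstructed)
    if err == 0:
        return math.inf  # identical content: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / err)


# Hypothetical luminance samples of an original and a reconstructed frame.
ORIGINAL = [52, 55, 61, 59]
RECONSTRUCTED = [54, 55, 60, 57]
QUALITY_DB = psnr(ORIGINAL, RECONSTRUCTED)
```

For these samples the squared differences are 4, 0, 1 and 4, so the MSE is 2.25 and the PSNR is roughly 44.6 dB.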
6. Concluding Remarks In this paper, we presented a prototype multimedia content repurposing system that is able to repurpose multimedia content to adapt to heterogeneous client devices. The repurposing system contributes to the ambient intelligence vision, which aims to provide universal multimedia access and delivery in a seamless fashion. Our prototype, therefore, considers the surrounding environment information (ambient
We report a significant subset of the results of the experiments conducted on our system. One of the repurposing services that we developed was able to repurpose visual content from MJPEG to H.264 at different channel bandwidths. We measured visual quality by calculating and comparing the PSNR of repurposed
environment context) of users while performing repurposing tasks. The repurposing services are distributed among different proxies, which can be used in a chain whenever a complex repurposing task requires multiple repurposing services to be invoked. One limitation of our prototype is that we developed only a few repurposing services and did not consider the vastly diverse networks that will, in general, be found in an ambient intelligent environment. In our future work, we will test more heterogeneous devices in different networks and explore the context of their usage. Furthermore, we will investigate the suitability of the W3C’s Composite Capability/Preference Profiles (CC/PP) for expressing the device capabilities and user preferences that need to be maintained in the AmI environment.
Figure 3. Quality comparison of the repurposed stream and the encoded stream at different bandwidths: (a) target channel 32 kbps, (b) target channel 64 kbps. Each panel plots Y-PSNR (axis range 0 to 50) against frame number (0 to 150) for the repurposed stream and the encoded stream.
7. References
[1] El Saddik, A. and Hossain, M. S. Multimedia content repurposing. In Encyclopedia of Multimedia, B. Furht, Ed. Springer, Feb. 2006.
[2] BETSY - BEing on Time Saves energy. Available at: http://www.hitech-projects.com/euprojects/betsy/index.htm
[3] Singh, G. Content repurposing. IEEE Multimedia, 11, 1 (Jan.-Mar. 2004), 20-21.
[4] ISO/IEC 21000-7:200x AMD/1, “Information technology - Multimedia framework (MPEG-21) - Part 7: Digital Item Adaptation, AMENDMENT 1: DIA Conversions and Permissions”, Final Proposed Draft Amendment 1 (FPDAM/1), 2005.
[5] Dey, A. K. Understanding and Using Context. Personal and Ubiquitous Computing, 5, 1 (2001), 4-7, Springer-Verlag.
[6] Chen, G. and Kotz, D. A Survey of context-aware mobile computing research. Technical Report TR2000-381, Dartmouth College, Dartmouth, 2000.
[7] Aarts, E. Ambient Intelligence: A Multimedia Perspective. IEEE Multimedia, 11, 1 (Jan.-Mar. 2004), 12-19.
[8] Ducatel, K., Bogdanowicz, M., Scapolo, F., Leijten, J., and Burgelman, J-C. “Scenarios for Ambient Intelligence in 2010,” IST Advisory Group Final Report, Seville, 2001. [Online] Available at: ftp://ftp.cordis.lu/pub/ist/docs/istagscenarios2010.pdf
[9] IST OZONE project. Available at: http://www.hitech-projects.com/euprojects/ozone/
[10] Vetro, A., Christopoulos, C., and Ebrahimi, T., eds. IEEE Signal Processing Magazine, Special issue on Universal Multimedia Access, 20, 2 (March 2003).
[11] Xin, J., Lin, C-W., and Sun, M-T. Digital video transcoding. Proceedings of the IEEE, 93, 1 (Jan. 2005), 84-97.
[12] Vetro, A. and Timmerer, C. Digital Item Adaptation: Overview of Standardization and Research Activities. IEEE Trans. Multimedia, 7, 3 (June 2005), 418-426.
[13] Martínez, J. M. MPEG-7 Overview. Technical Report, ISO/IEC JTC1/SC29/WG11 N6828, 2004. Available at: http://www.chiariglione.org/MPEG/standards/mpeg-7/mpeg-7.htm
[14] Philips HomeLab. Available at: http://www.research.philips.com/technologies/misc/homelab/
[15] S. Kalasapur, M. Kumar, B. Shirazi, “Personalized service composition for ubiquitous multimedia delivery,” Sixth IEEE International Symposium on a World of Wireless Mobile and Multimedia Networks (WoWMoM 2005), pp. 258-263, 13-16 June 2005.
[16] K. Nahrstedt, B. Yu, J. Liang and Y. Cui, “Hourglass Multimedia Content and Service Composition Framework for Smart Room Environments”, Elsevier Journal on Pervasive and Mobile Computing, 2005.
[17] PICO tech report and website: http://www.cse.uta.edu/pico@cse/
[18] Ambient Computing. Available at: http://www.ambientcomputing.com/company.htm
[19] Ambient Multimedia Intelligent Systems (AMIS). http://www.discover.uottawa.ca/
[20] A. Richards et al. “Mapping User Level QoS from a Single Parameter”, in Proc. MMNS’1998, Versailles, Nov. 1998.
Appendix A
video/Mjpeg2k, resolution 320x240
video/h263, resolution 352x288