Multimedia Content Adaptation Within the CAIN Framework Via Constraints Satisfaction and Optimization

Fernando López, José M. Martínez, and Víctor Valdés

Grupo de Tratamiento de Imágenes, Escuela Politécnica Superior
Universidad Autónoma de Madrid, E-28049 Madrid, Spain
{f.lopez, josem.martinez, victor.valdes}@uam.es

Abstract. This paper presents a constraint-programming-based approach to deciding which of a set of available content adaptation tools, and which parameters, should be selected to perform the best adaptation of a media asset, with the aim of enhancing the final user’s experience in a particular usage scenario. The work falls within the scope of the Universal Multimedia Access (UMA) framework and makes use of MPEG standards for content and usage environment description. The proposed technique has been evaluated within the CAIN framework, a content adaptation engine that integrates different content adaptation tools and uses media and usage environment metadata to identify the best adaptation tool among those available. First, mandatory constraints are imposed. If more than one adaptation tool can adapt the content while fulfilling every mandatory constraint, a second group of desirable constraints is applied to reduce the solution space. If several adaptation tools or parameter values still satisfy both the mandatory and the desirable constraints at this step, a final optimization step chooses the best adaptation tool and parameters.

S. Marchand-Maillet et al. (Eds.): AMR 2006, LNCS 4398, pp. 150-164, 2007. © Springer-Verlag Berlin Heidelberg 2007

1 Introduction

The development of both new access networks providing multimedia capabilities and a wide and growing range of terminals makes the adaptation of content an important issue in future multimedia services. Content adaptation is the main objective of a set of technologies that can be grouped under the umbrella of Universal Multimedia Access (UMA) [1], that is, the capability of accessing rich multimedia content through any client terminal and network. In this way, content adaptation bridges content authors and content consumers in a world of increasing multimedia diversity. In order to perform content adaptation, it is necessary to have a description of the content and a description of the terminal and network conditions. To enhance the user’s experience [2], not only terminal and network parameters but also user personalization and environmental conditions should be taken into account when adapting. This information imposes constraints on the coding parameters of the content to be delivered (and even on other characteristics, such as semantic content or duration).


These constraints are imposed according to terminal capabilities, network conditions, user preferences and handicaps, environmental conditions, etc. Content adaptation may thus be performed by a content adaptation engine to give the user the best experience of the requested content within the available usage environment. The available content adaptation tools may differ in adaptation approach (e.g., transcoding, transmoding), range of parameter values, supported input and output formats, performance (in terms of processing requirements, quality, etc.), and so on.

Several approaches have been proposed to perform content adaptation [3][4][5][6]. In [3] the authors propose a method in which adaptation tools are described by inputs, outputs, preconditions and effects. In this paper we propose to describe adaptation tools using a capabilities description tool inspired by MPEG-7 MDS [7]. Constraint programming [8] is used to select the most suitable content adaptation tool and parameters from the available content adaptation tools. It should be noted that [3] uses a planning algorithm to find a chain of elementary adaptation operations that transform the media accordingly, whilst our framework considers adaptation tools that perform combined adaptations, not elementary ones. The adaptation engine selects only one adaptation tool from the available ones (we are evaluating an extension of our solution that would allow the concatenation of adaptation tools in the future).

The paper is structured as follows. Section 2 presents an overview of CAIN, the content adaptation framework within which the work presented in this paper is developed. Section 3 presents the architecture of the Decision Module (DM), the module in charge of deciding which adaptation tool to use and with which parameters. Section 4 presents our proposal for content adaptation based on solving a Constraint Satisfaction Problem (CSP) [8]. Section 5 deals with methods to fulfil every mandatory constraint, whereas Section 6 deals with the same problem when seeking to impose as many desirable constraints as possible. Section 7 presents the proposed solution to the problem of selecting the optimum adaptation tool and parameters when several configurations satisfy the mandatory and desirable constraints. Section 8 concludes the paper and gives an overview of current and future work.
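The selection procedure summarized in the abstract (mandatory constraints, then desirable constraints, then optimization) can be illustrated with a small Java sketch. It is not CAIN code: the class and method names, the greedy handling of desirable constraints, and the "highest bitrate wins" objective are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: every name and the stage-3 criterion are assumptions.
final class ThreeStageSelection {

    /** One candidate solution: a tool together with one output configuration. */
    static final class Candidate {
        final String tool;
        final String format;
        final int bitrate; // bits/s
        final int width;   // pixels

        Candidate(String tool, String format, int bitrate, int width) {
            this.tool = tool;
            this.format = format;
            this.bitrate = bitrate;
            this.width = width;
        }
    }

    /** A constraint is a yes/no test on a candidate. */
    interface Constraint {
        boolean accepts(Candidate c);
    }

    static List<Candidate> filter(List<Candidate> in, Constraint k) {
        List<Candidate> out = new ArrayList<Candidate>();
        for (Candidate c : in) {
            if (k.accepts(c)) {
                out.add(c);
            }
        }
        return out;
    }

    /** Returns the chosen candidate, or null when no candidate is feasible. */
    static Candidate select(List<Candidate> all, List<Constraint> mandatory,
                            List<Constraint> desirable) {
        // Stage 1: keep only candidates that satisfy every mandatory constraint.
        List<Candidate> feasible = all;
        for (Constraint k : mandatory) {
            feasible = filter(feasible, k);
        }
        if (feasible.isEmpty()) {
            return null;
        }
        // Stage 2 (assumed greedy policy): apply a desirable constraint only
        // if at least one candidate survives it.
        for (Constraint k : desirable) {
            List<Candidate> narrowed = filter(feasible, k);
            if (!narrowed.isEmpty()) {
                feasible = narrowed;
            }
        }
        // Stage 3 (assumed objective): prefer the highest bitrate.
        Candidate best = feasible.get(0);
        for (Candidate c : feasible) {
            if (c.bitrate > best.bitrate) {
                best = c;
            }
        }
        return best;
    }

    static Candidate demo() {
        List<Candidate> all = new ArrayList<Candidate>();
        all.add(new Candidate("catA", "MPEG-1", 20000, 320));
        all.add(new Candidate("catA", "MPEG-1", 28000, 640));
        all.add(new Candidate("catB", "MPEG-1", 24000, 320));
        all.add(new Candidate("catB", "MPEG-2", 24000, 320));

        List<Constraint> mandatory = new ArrayList<Constraint>();
        mandatory.add(c -> c.format.equals("MPEG-1")); // terminal decodes MPEG-1 only
        mandatory.add(c -> c.bitrate <= 25000);        // network capacity

        List<Constraint> desirable = new ArrayList<Constraint>();
        desirable.add(c -> c.width == 640);            // cannot be met here, so it is skipped
        desirable.add(c -> c.width == 320);

        return select(all, mandatory, desirable);
    }

    public static void main(String[] args) {
        Candidate best = demo();
        System.out.println(best.tool + " at " + best.bitrate + " bits/s");
    }
}
```

In this toy run the two mandatory constraints leave two candidates, the first desirable constraint is dropped because no survivor satisfies it, and the assumed objective then picks the survivor with the higher bitrate.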

2 Overview of CAIN

In this section we summarize CAIN [9] (Content Adaptation INtegrator), the framework within which the work described in this paper is developed. CAIN is a content adaptation manager designed to provide metadata-driven content adaptation [10]. Different Content Adaptation Tools (CATs) allow the integration of different content adaptation approaches [11]: transcoding, transmoding, scalable content, and temporal summarization, which may be purely signal driven or may include semantic-driven adaptation [12].

Fig. 1 summarizes the CAIN adaptation process. When CAIN is invoked, it receives the media content, an MPEG-7 MDS [7] and MPEG-21 BSD [13] compliant content description, and an MPEG-21 DIA [13] usage environment description (user characteristics, terminal capabilities, and network characteristics). All these inputs are parsed and the extracted information is sent to the Decision Module (DM). The DM decides which of the available CATs must be launched to produce the adapted content and metadata. The output of the system is the adapted content and the adapted media description (according to MPEG-7 and MPEG-21 (g)BSD).

Fig. 1. CAIN adaptation process. (Diagram: the media, the MPEG-7/MPEG-21 content description, and the MPEG-21 usage environment description enter CAIN, where the Decision Module uses the CAT capabilities description to choose among CAT 1, CAT 2, ..., CAT n; the outputs are the adapted media and its MPEG-7/MPEG-21 adapted media description.)

2.1 Extensibility in CAIN and the DM

CAIN was proposed as an extensible content adaptation engine. Besides the CATs currently integrated in CAIN [14], new CATs and codecs will need to be integrated in the future, and the CAIN architecture has been designed to allow this. To that end, we have defined an API specification and a CAT Capabilities Description File with information about both the input and output formats accepted by a CAT and its adaptation capabilities. Each new CAT added to CAIN should therefore implement the defined API to communicate with the DM and should provide this information in a CAT Capabilities Description File. The CAT Capabilities Description Scheme [15] defines the adaptation capabilities, specifying in each case which kinds of adaptation the CAT is able to perform and the list of parameters that define each adaptation, such as input format, output format, and further features that depend on the kind of adaptation being defined (e.g., accepted input/output, frame rate, resolution, channels, bitrate).
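To make the extensibility contract more concrete, the following Java sketch shows one possible shape for a CAT together with an in-memory stand-in for its capabilities description. The interface, the class names, the string-valued parameters, and the adapt(sourcePath, targetPath, params) signature are assumptions for illustration; the actual CAIN API and the schema of the CAT Capabilities Description File are not reproduced here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: the real CAIN API and file schema are not given here.
final class CatApiSketch {

    /** In-memory stand-in for a parsed CAT Capabilities Description File. */
    static final class Capabilities {
        final String name;
        final Set<String> inputFormats;
        final Set<String> outputFormats;

        Capabilities(String name, String[] inputs, String[] outputs) {
            this.name = name;
            this.inputFormats = new HashSet<String>(Arrays.asList(inputs));
            this.outputFormats = new HashSet<String>(Arrays.asList(outputs));
        }

        /** True when the tool reads the source format and writes the target format. */
        boolean canConvert(String from, String to) {
            return inputFormats.contains(from) && outputFormats.contains(to);
        }
    }

    /** Assumed shape of the API that each CAT implements. */
    interface ContentAdaptationTool {
        Capabilities capabilities();

        String adapt(String sourcePath, String targetPath, String params);
    }

    /** A toy CAT that only reports what it would do. */
    static final class ToyTranscoder implements ContentAdaptationTool {
        public Capabilities capabilities() {
            return new Capabilities("toy-transcoder",
                    new String[] { "MPEG-2", "MPEG-4 SP" },
                    new String[] { "MPEG-1" });
        }

        public String adapt(String sourcePath, String targetPath, String params) {
            return "would transcode " + sourcePath + " to " + targetPath
                    + " with " + params;
        }
    }

    public static void main(String[] args) {
        ContentAdaptationTool cat = new ToyTranscoder();
        System.out.println(cat.capabilities().canConvert("MPEG-2", "MPEG-1"));
        System.out.println(cat.adapt("in.mpg", "out.mpg", "bitrate=20000"));
    }
}
```

Keeping the capabilities separate from the adaptation call means a decision component can reason about a tool without running it, which is the property the description file is meant to provide.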


The CAT Capabilities Description File is parsed to register the CAT in the CAIN Registry, so that the DM knows that a new CAT is available and what its characteristics are.

2.2 CAIN Architecture

Fig. 2 shows the current modular CAIN architecture. When an adaptation request arrives, the Execution Module (EM) is in charge of coordinating the different tasks assigned to the other modules.

Fig. 2. The CAIN architecture. (Diagram labels: CAIN, Execution Module, Decision Module, Communication Module, Content Parser, Usage Environment Parser, CAT Capabilities Parser, CATs, Capabilities, Media Repository. Labelled operations: adapt(content, usage_env), decide(content, usage_env, cat_capabilities) returning (CAT, parameters), download(), upload(), and adapt(source_path, target_path, params).)

First, the EM receives through the adapt() operation the media content identifier and a usage environment description (according to MPEG-21 DIA [13]). Using the media content identifier, the EM requests the Communication Module (CM), through the download() operation, to retrieve from the Media Repository the media content and its corresponding content description (according to MPEG-7 MDS [7] and MPEG-21 DIA BSD [13]). CAIN is currently implemented in Java, so these XML documents are parsed (using the Content Parser module and the Usage Environment Parser module) and represented as Java objects. The EM is also in charge of parsing the CAT Capabilities Description File (using the CAT Capabilities Parser module). All of this parsed information is delivered to the DM through the decide() operation, which has to look for the CAT that best fulfils the adaptation requirements. The selected CAT and execution parameters are sent back to the EM, which, using the CM, gets the content from the Media Repository and executes the selected CAT through the adapt() operation, passing the retrieved media content and the parameters given by the DM. The CAT returns the adapted content and the adapted media description to the EM. Finally, the adapted media content and its description (in the form of standard MPEG-7 and MPEG-21 description files) are stored in the Repository using the upload() operation of the CM. Every access to the Media Repository, for reading and writing media and media descriptions, is performed via the CM.

2.3 Supported Media

With regard to media resources, the current implementation of CATs in CAIN [14] mainly supports images and videos, as can be seen in Table 1, which shows the mapping between media formats and CAT categories. For images, JPEG 2000 has been selected due to its scalability features. For video, MPEG video formats and the scalable video coding (SVC) format introduced in [16] have been selected.

Table 1. Relationship between media types and CAT categories

Media type                           CAT category
MPEG-1/2/4 SP video, MPEG-1 audio    Transcoding
JPEG 2000, SVC                       Scalable content
MPEG-1/2                             Semantic driven
MPEG-1/2/4 SP video                  Transmoding
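The request flow of Section 2.2 can be traced with a minimal Java sketch. Only the operation names adapt(), download(), decide(), and upload() come from the text; the class, the string-based data, and the stubbed module behaviour are assumptions, and parsing of the XML descriptions is omitted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the Section 2.2 request flow. Every module is
// reduced to a stub method that records when it is called.
final class ExecutionFlowSketch {
    final List<String> trace = new ArrayList<String>();

    /** Stand-in for the Communication Module reading from the repository. */
    String download(String mediaId) {
        trace.add("download");
        return "content and description of " + mediaId;
    }

    /** Stand-in for the Communication Module writing to the repository. */
    void upload(String adapted) {
        trace.add("upload");
    }

    /** Stand-in for the Decision Module: returns a (CAT, parameters) pair. */
    String[] decide(String content, String usageEnv, String catCapabilities) {
        trace.add("decide");
        return new String[] { "someCAT", "someParameters" };
    }

    /** Stand-in for running the selected CAT. */
    String runCat(String cat, String content, String params) {
        trace.add("cat.adapt");
        return "adapted(" + content + ")";
    }

    /** The Execution Module's adapt() operation, coordinating the other modules. */
    String adapt(String mediaId, String usageEnv) {
        String content = download(mediaId);
        String[] choice = decide(content, usageEnv, "parsed CAT capabilities");
        String adapted = runCat(choice[0], content, choice[1]);
        upload(adapted);
        return adapted;
    }

    public static void main(String[] args) {
        ExecutionFlowSketch em = new ExecutionFlowSketch();
        em.adapt("video-1", "usage environment description");
        System.out.println(em.trace);
    }
}
```

The recorded trace makes the coordination role of the Execution Module explicit: it is the only component that calls the others, and all repository access goes through the download and upload stubs.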

2.4 Description Tools for Metadata-Driven Adaptation

The following three subsections describe the description tools used within CAIN to support metadata-driven adaptation, which are grouped into content, usage environment, and CAT capabilities description tools. A more in-depth description of these metadata can be found in [9] and [15].

Kind of adaptation      Content description
Media adaptation        MPEG-7 Media Description (MediaInformation, Transcoding Hints); MPEG-7 Variations and Summaries
Semantic adaptation     Regions of interest with MPEG-7 descriptors annotated by users or automatically: visual, semantic, classification, ...
Transmoding to text     MPEG-7 textual tools: keywords, textual annotations, Spoken Content
Bitstream adaptation    MPEG-21 BSD/gBSD, preferably gBSD to allow semantic bitstream adaptation

Fig. 3. Content description tools for adaptation


2.4.1 Content Description Tools

The content description tools are based on MPEG-7 MDS and MPEG-21 DIA BSD. Fig. 3 summarizes the adaptation tools supported by CAIN and the content description used for each adaptation. The media description metadata should provide support for the following content adaptation modalities:

• Media format adaptation: Supported by the MPEG-7 media description (Media Information, transcoding hints, ...).
• Bitstream adaptation (truncation): Supported by the MPEG-21 Bitstream Syntax Description (DIA BSD or gBSD). If the number of formats is small, BSD may be the best option, although, unlike gBSD, it cannot associate semantic labels with the bitstream (labels that may provide some “semantic” transcoding capabilities).
• Media adaptation based on predefined variations and summaries: Supported by MPEG-7 variations and summaries descriptions.
• Semantic and knowledge-based adaptation: Supported by MPEG-7 and JPEG 2000 regions of interest with importance, MPEG-21 gBSD markers, etc., annotated by users or labelled in an automatic or supervised way via analysis algorithms.
• Transmoding to text: Supported by MPEG-7 keywords, textual annotations, ...

2.4.2 Usage Environment

Usage environment description tools cover the description of terminal and network resources, as well as user preferences and characteristics of the natural environment. The context description is based on a subset (in the sense of an MPEG Profile [17]) of the MPEG-21 DIA Usage Environment Description Tools, as shown in Fig. 4. The usage environment description tools (MPEG-21 DIA) include:

• User characteristics: User interactions (imported from MPEG-7 MDS), presentation preferences, accessibility characteristics, and location characteristics.
• Terminal description: Currently a static terminal description is used, leaving the use of dynamic characteristics (CPU load, available free storage space, free RAM, ...) for future versions. In any case, the client will be required to send the current information about the terminal being used (either the complete description or a pointer to a static description).
• Network description: Currently a static network description is used too, leaving the use of dynamic characteristics (current congestion, error rate, delay time, ...) for future versions. In any case, the client will be required to send the current information about the network being used (either the complete description or a pointer to a static description).

2.4.3 CAT Capabilities Description Tools

Not every CAT can perform every adaptation operation (bitrate reduction, transcoding, transmoding, audio/video summarization, ...). In order to achieve CAIN extensibility, it is necessary to annotate the capabilities of each CAT. The selected CAT Adaptation Capabilities Description Scheme [14] (see Fig. 5) is based on the MediaFormatD Description Tool (from MPEG-7 Multimedia Description Schemes [7]), which describes the information related to the file format and coding parameters of the media.


Usage Environment description tools (MPEG-21 DIA)

  User Description Tools
    Usage Preferences
      Media Format: content, bit rate, visual coding (format, frame height, frame width, frame aspect ratio and frame rate), audio coding (format, audio channels, sample rate, bits per sample).
    Presentation Preferences
      o AudioPresentationPreferences: Volume, output device, balance.
      o DisplayPresentationPreferences: Colour temperature, brightness, saturation, contrast.
      o ConversionPreferences: Media type conversion preferences and priorities.
      o PresentationPriorityPreferences: Modality (audio, video, ...) priorities.

  Terminal Capabilities Tools
    Codec Capabilities: Audio, video and image coding/decoding supported formats.
    Display Capabilities: Supported display modes (resolution, refresh rate), screen size, colour bit depth.
    Audio Output Capabilities: Supported audio modes (sampling frequency, bits per sample), low frequency, high frequency, number of channels, ...
    Storage Characteristics: Input transfer rate, output transfer rate, size, writable.

  Network Characteristics Tools
    Network Capability: Maximum capacity and minimum guaranteed.

Fig. 4. Usage environment description tools for adaptation

The main elements of the adaptation capabilities description tools used to describe CAT capabilities are:

1. Header. The header identifies the described CAT and includes a name and an optional textual description.
2. Adaptation modality. This element defines each adaptation modality together with the media formats that the modality is able to receive and to produce. It is composed of an adaptation mode (defined as an MPEG-7 Classification Scheme that allows the modality to be described in detail) and a reference to one or more media systems the CAT is able to handle. For example, a CAT performing video summarization can offer different modalities, such as keyframe replication (which does not reduce the timeline, to allow easy audio synchronization), video skimming, and image storyboard.
3. Media systems. Each instance of this element defines media formats at system level by indicating the file format name, the file format extension, references to zero or more visual elementary streams, references to zero or more audio elementary streams, and optionally a scene coding format. These elements describe the media system formats that each CAT adaptation modality is able to read (input), write (output), or both (common, to avoid redundancies).
4. Elementary streams. These elements describe video and audio coding parameters. Besides the type of the stream (video, audio, image), the parameters are grouped into input, output and common parameters (the latter in order to reduce redundancies). The set of parameters is based on the MPEG-7 MDS MediaFormatD element, with some simplifications and extensions intended to allow the definition of adaptation capabilities. When defining a coding format, each feasible parameter will be considered by the DM as a restriction. If no restriction is imposed on a particular property of the codec, the codec must be considered able to deal with any value of this property.
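The rule stated in item 4, that a property without a declared restriction accepts any value, can be sketched in Java as follows. The class name, the property names, and the use of plain strings for values are assumptions made for illustration and do not reflect the actual description scheme.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: property names and values are illustrative only.
final class StreamRestrictions {
    // Property name -> set of values the codec declares it can handle.
    private final Map<String, Set<String>> allowed = new HashMap<String, Set<String>>();

    /** Declares the feasible values of one property; returns this for chaining. */
    StreamRestrictions restrict(String property, String... values) {
        allowed.put(property, new HashSet<String>(Arrays.asList(values)));
        return this;
    }

    /** A property with no declared restriction accepts any value. */
    boolean accepts(String property, String value) {
        Set<String> values = allowed.get(property);
        return values == null || values.contains(value);
    }

    /** True when every requested property value is acceptable. */
    boolean acceptsAll(Map<String, String> requested) {
        for (Map.Entry<String, String> e : requested.entrySet()) {
            if (!accepts(e.getKey(), e.getValue())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        StreamRestrictions video = new StreamRestrictions()
                .restrict("format", "MPEG-1", "MPEG-2")
                .restrict("resolution", "320x240", "640x480");
        System.out.println(video.accepts("format", "MPEG-1"));    // declared value
        System.out.println(video.accepts("format", "JPEG 2000")); // not declared
        System.out.println(video.accepts("frameRate", "25"));     // unrestricted property
    }
}
```

Treating a missing entry as "unrestricted" keeps capability descriptions short, since a tool only needs to list the properties on which it actually imposes limits.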

Fig. 5. CAT Capabilities Description Scheme. (Diagram: a cat element, identified by an id, containing n ElementaryStream, n MediaSystem, and n AdaptationModality elements. Other element names appearing in the diagram: Bitrate, TargetChannelBitRate, ColorDomain, Resolution, VisualCoding, GraphicsCodingFormat, AudioCoding, AudioCodingFormat, SceneCodingFormat, OtherCodingFormat, FileFormat, Extension, System, mode, MediaSystemRef.)

3 The Decision Module

The Decision Module (DM) is a software module that receives as input a content description, a mandatory usage environment description, and a desirable usage environment description. This module searches for the CAT that produces the best content adaptation, defined as the adaptation that satisfies the most constraints and therefore yields the best experience for the user. Terminal capabilities and network characteristics have been included in the mandatory usage environment description, whereas user preferences have been included in the desirable usage environment description. Some user preferences (the user’s handicaps) have been included in the mandatory usage environment description.

Fig. 6 illustrates a hypothetical adaptation that the DM has to find. As input we have a content description for a video that is available in a specific format, bitrate, and colour depth (this information must be provided by the Media Repository; if this is not the case, CAIN includes a media description generation module in charge

(Fig. 6 fragment. Panels: Usage Environment, Content Description, CAT Capabilities.)
Content description: Format: MPEG-2; Bitrate: 28000 bits/s; Colour depth: 65536 colours
Mandatory output: Format: MPEG-1; Bitrate:
