A Framework for Automatizing and Optimizing the Selection of Indexing Algorithms

Mihaela Brut, Sébastien Laborie, Ana-Maria Manzat, and Florence Sèdes

Institut de Recherche en Informatique de Toulouse (IRIT), 118 Route de Narbonne, 31062 Toulouse Cedex 9, France
{brut,laborie,manzat,sedes}@irit.fr
Abstract. Inside an information system, the indexation process facilitates the retrieval of specific contents. However, this process is known to be time- and resource-consuming. At the same time, the diversity of multimedia indexing algorithms is growing steeply, which makes it harder to select the best ones for particular user needs. In this article, we propose a generic framework which determines the most suitable indexing algorithms according to user queries, hence optimizing the indexation process. In this framework, multimedia features are used to define multimedia metadata, user queries and indexing algorithm descriptions. The main idea is that, apart from retrieving contents, user queries can also be used to identify a relevant set of algorithms which detect the requested features. The application of the proposed framework is illustrated through the case of an RDF-based information system. In this case, our approach could be further optimized by a broader integration of Semantic Web technologies.
1 Introduction
Various domains (such as news, TV, resource collections for commercial or consumer applications, collaborative work, video surveillance, etc.) develop information systems for managing and retrieving multimedia contents. In order to retrieve these contents, a multimedia indexing phase is required beforehand. This phase mainly extracts multimedia features in order to produce multimedia metadata, and stores these descriptions in a metadata collection against which user queries are evaluated. Currently, information system developers select by themselves the indexing algorithms to be included in the indexation engine. However, the diversity of these indexing algorithms is constantly increasing. Furthermore, as information systems should handle extensive multimedia collections, it is not possible to execute all algorithms, because the indexation process is time- and resource-consuming (e.g., in CPU). Consequently, a solution for automatizing and optimizing the selection of indexing algorithms must be developed.

Our article provides such a solution by selecting the indexing algorithms which are the most suitable for particular user needs. For that purpose, indexing algorithms are described by the features they extract, and user queries are used to determine a relevant set of algorithms which extract all requested features.

F. Sartori, M.Á. Sicilia, and N. Manouselis (Eds.): MTSR 2009, CCIS 46, pp. 48–59, 2009. © Springer-Verlag Berlin Heidelberg 2009
Besides saving time and computing resources, our solution has the following advantages:
– It determines a relevant set of indexing algorithms which can extract as many as possible of the multimedia features requested by user queries. Thus, developers are informed about the most suitable set of algorithms for a specific query set.
– It is based on a general, uniform modeling of user queries and indexing algorithm descriptions. This modeling uses multimedia features to describe and relate the queries and the algorithms. Hence, our approach can be applied to different representation and query languages.
– It considers low-level multimedia features as well as high-level ones, acquired through indexing algorithms that make use of Semantic Web technologies.
– It can be adopted in the development phase of an information system as well as during the concrete usage of the system.

The remainder of the paper is structured as follows. Section 2 illustrates the diversity and heterogeneity of existing indexing algorithms, and presents some systems which integrate such algorithms in order to manage and retrieve various multimedia contents. Because existing systems do not provide a solution for a selective indexing process according to user queries, we propose such a solution in Section 3. The proposed solution is generic and can be applied to multiple data representations (unstructured or structured descriptions). Section 4 illustrates the application of our approach to RDF descriptions, as well as the supplementary benefits brought in this case by the Semantic Web approach. Section 5 presents a brief conclusion and some perspectives.
2 Related Work
In general, an information system in charge of managing and retrieving multimedia contents is composed of [15,6]:
– A multimedia collection which contains several multimedia contents. These contents refer to media items, such as texts, images, videos or audio recordings.
– A metadata collection which contains information about the media characteristics (e.g., size, name, file type) and their contents. Some specific metadata for multimedia contents are presented in [8]. These metadata may be encoded in several standards, such as EXIF [12], Dublin Core [19], MXF [9], etc.
– An indexation engine which includes several indexing algorithms to be applied to the multimedia collection in order to enrich the metadata collection.

A fair amount of research has been conducted on developing indexing algorithms. In the following, we illustrate their diversity and heterogeneity for each media type. For textual documents, some indexing techniques, e.g., [14], are inspired by classic Information Retrieval [21], or by Web Information Retrieval, exploiting hypertext features such as page hyperlinks [5] and general HTML tags [1].
The progress from term-based to concept-based document indexation was made possible by the latent semantic indexing technique [3] and by knowledge representation models and methods that are typical of the artificial intelligence domain (such as neural networks, semantic networks and Bayesian networks) [18].

Concerning images, semantic content indexation processes usually analyze object-related information (e.g., how many objects are in this image? is object X present? which objects are in the image?). They exploit various kinds of a priori knowledge about the observed scenes and a general model of the world. To achieve this, current methods are based on feature extraction, clustering/segmentation, object descriptor extraction and object recognition [7]. Pattern recognition techniques [23], such as boosting or the execution of a cascade of classifiers, have also been applied to semantic image indexation.

Audio analysis follows several main directions [11]: segmentation (splitting an audio signal into intervals in order to determine the semantics or composition of the sound), classification (categorizing audio segments according to predefined semantic classes, such as speech, music, silence or background noise) and retrieval by content (using similarity measures in order to retrieve audio items that syntactically or semantically match the query).

In the area of content-based video indexing and retrieval, many research efforts have been devoted to automatic techniques for extracting metadata which describe the content of large video data archives, such as [10]. Some of these metadata have been adopted in video standards, e.g., in MPEG-7 [17]. These systems typically use metadata elements such as content type, shot boundaries, audio keywords, textual keywords or caption text keywords.
Besides, there is also some research, e.g., [22], on determining the class of video scene objects (e.g., human, vehicle, type of vehicle, animal) and detecting activities (e.g., carrying a bag, raising arms in the air, the manner of running or walking).

A single indexation engine cannot include all these algorithms, because this would overload the indexation process with analyses that are useless for the concrete user needs. Thus, the development phase of an indexation engine requires the selection of specific indexing algorithms. Many projects integrate such specific algorithms in their indexation engines in order to manage multimedia contents. The K-Space project¹ is focused on semantic inferences for semi-automatic annotation and retrieval of multimedia contents. It integrates three research clusters: content-based multimedia analysis, knowledge extraction, and semantic representation and management of multimedia. The VITALAS project² (Video & image Indexing and reTrievAl in the LArge Scale) develops solutions for cross-media indexing and retrieval by developing intelligent access techniques to professional multimedia archives. The CANDELA project³ (Content Analysis and Network DELivery Architectures) performed video content analysis in combination with networked delivery and storage functionalities. The ISERE project⁴ (Inter-media Semantic Extraction and Reasoning) aimed to study a unified multimedia model which enhances the identification of semantic contents. The MUSCLE network of excellence⁵ (Multimedia Understanding through Semantics, Computation and LEarning) deals with multimedia data mining and machine learning technologies in order to provide a set of showcases for object recognition, content analysis, automatic character indexing, content-based copy detection, unusual behavior detection, movie summarization, human detection, speech recognition, etc.

However, none of these projects proposes a solution for selecting the suitable indexing algorithms according to user needs. Moreover, their indexation engines consider a fixed set of algorithms. In the following section, we propose a generic framework which determines, according to user queries, a relevant set of indexing algorithms to be included in a multimedia retrieval system.

¹ http://kspace.qmul.net:8080/kspace/
² http://vitalas.ercim.org
³ http://www.hitech-projects.com/euprojects/candela
⁴ http://www.mica.edu.vn/Isere/
3 A Generic Framework for Selecting Indexing Algorithms
When a user query asks for some multimedia contents, the system builds the results by searching the metadata collection. Our proposal considers that the same query can also be used to identify a set of indexing algorithms which detect the requested multimedia features. In this section, we present a method for determining such a list of algorithms (§3.1). Moreover, we show that our approach can be used in the development phase of an information system as well as during the concrete usage of the system (§3.2).

3.1 Determining Indexing Algorithms According to a Query
In order to be as generic as possible, our proposal is based on a uniform modeling of the query, the multimedia metadata and the indexing algorithm descriptions. This uniformity is obtained mainly by considering multimedia features in all models. More precisely, a query is viewed as a list of features to be retrieved, a multimedia metadata description contains a list of features present in a multimedia content, and an indexing algorithm identifies a list of features. In this context, answering a query consists in finding multimedia contents by locating the requested features in the metadata collection. Similarly, we propose to determine the set of indexing algorithms necessary for answering a query, or more precisely a relevant set of algorithms, in order to optimize the indexation process.

Consider a query Q, a list of indexing algorithms LA and a feature f such that f ∈ Q. Algorithm 1 selects in LA a relevant set of indexing algorithms which extract the feature f and as many as possible of the other features specified by Q. Initially, the list LA contains all the algorithms available in the system. The feature f is selected in Q according to the maximum number of indexing algorithms in LA which extract it. During the recursive calls of Algorithm 1, the list LA is refined into a new list L′A.

⁵ http://www.muscle-noe.org

Algorithm 1. indexingAlgorithmsSelection
Input: A user query Q, a list of indexing algorithms LA and a feature f such that f ∈ Q.
Output: A list of indexing algorithms.
Data: A list of indexing algorithms L which gathers all results returned by the recursive calls of indexingAlgorithmsSelection.
    L′A ← the indexing algorithms of LA which extract f;
    if L′A ≠ ∅ then
        mark f in Q;
        L ← ∅;
        foreach fi unmarked in Q do
            L ← L ∪ indexingAlgorithmsSelection(Q, L′A, fi);
        if L = ∅ then
            e ← selectOneAlgorithm(L′A);
            return {e};
        else
            return L;
    else
        return ∅;

During the execution of Algorithm 1, we mark the features that can be identified by some indexing algorithms included in L′A. Hence, at any time, only the features which have not already been identified by indexing algorithms are considered. This branch-and-bound technique prunes the backtracking search tree of Algorithm 1. Moreover, when several indexing algorithms can be applied for determining a set of features, the selectOneAlgorithm method is used to select only one of them, thus avoiding the application of multiple indexing algorithms for determining the same set of features. For instance, this selection could be based on the execution time of the algorithms.

Finally, when Algorithm 1 stops, it returns a relevant list of indexing algorithms that retrieve features mentioned in the query. Nevertheless, this algorithm is not complete, because the resulting list is related only to the given feature f. Hence, when some features remain unmarked, other indexing algorithms may still identify them, especially algorithms that do not extract f. To ensure that Algorithm 1 is applied to all requested features, we propose Algorithm 2, which calls Algorithm 1 repeatedly so as to cover as many as possible of the unmarked features of the query. When Algorithm 2 stops, one can check the marked features in order to verify whether all requested features can be identified by the indexing algorithms in the resulting list.
Algorithm 2. getAllIndexingAlgorithms
Input: A user query Q and a list of indexing algorithms LA.
Output: A list of indexing algorithms.
Data: A list of indexing algorithms L which gathers all results returned by the calls of indexingAlgorithmsSelection.
    L ← ∅;
    foreach fi unmarked in Q do
        L ← L ∪ indexingAlgorithmsSelection(Q, LA, fi);
    return L;
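As an illustration, Algorithms 1 and 2 can be sketched in Python under a few assumptions that are not part of the paper: features are plain string labels, each indexing algorithm is described by the set of labels it extracts, marking is kept in a shared set, features are visited in query order, and `select_one_algorithm` simply picks the alphabetically first candidate instead of, for instance, the fastest one. The catalogue mirrors the descriptions D1 to D4 and the features f1 to f4 used in the example of Section 4.1; the labels `hair_color`, `vehicle` and `creator` are hypothetical names for the remaining extracted features.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical catalogue: algorithm name -> set of feature labels it extracts.
ALGORITHMS: Dict[str, FrozenSet[str]] = {
    "D1": frozenset({"f1", "f2", "f3", "hair_color"}),
    "D2": frozenset({"f1", "f4"}),
    "D3": frozenset({"f1", "f2", "vehicle"}),
    "D4": frozenset({"f1", "f4", "creator"}),
}
QUERY: List[str] = ["f1", "f2", "f3", "f4"]


def select_one_algorithm(candidates: List[str]) -> str:
    """Placeholder policy (an assumption): keep the alphabetically first name."""
    return min(candidates)


def indexing_algorithms_selection(
    query: List[str],
    algorithms: Dict[str, FrozenSet[str]],
    candidates: List[str],
    f: str,
    marked: Set[str],
) -> Set[str]:
    """Algorithm 1: `candidates` plays the role of LA, `refined` of L'A."""
    refined = [name for name in candidates if f in algorithms[name]]
    if not refined:
        return set()
    marked.add(f)
    selected: Set[str] = set()
    for fi in query:
        # The mark is re-checked at every turn because recursive calls
        # may have marked further features in the meantime.
        if fi not in marked:
            selected |= indexing_algorithms_selection(
                query, algorithms, refined, fi, marked
            )
    if not selected:
        return {select_one_algorithm(refined)}
    return selected


def get_all_indexing_algorithms(
    query: List[str], algorithms: Dict[str, FrozenSet[str]]
) -> Tuple[Set[str], Set[str]]:
    """Algorithm 2: returns the selected algorithms and the marked features."""
    marked: Set[str] = set()
    selected: Set[str] = set()
    for fi in query:
        if fi not in marked:
            selected |= indexing_algorithms_selection(
                query, algorithms, list(algorithms), fi, marked
            )
    return selected, marked


selected, marked = get_all_indexing_algorithms(QUERY, ALGORITHMS)
print(sorted(selected), sorted(marked))
# prints: ['D1', 'D2'] ['f1', 'f2', 'f3', 'f4']
```

Returning the marked set alongside the selection makes the final check explicit: any query feature absent from `marked` is one that no available algorithm can extract.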
Two outcomes are possible:
– If all query features are marked, the resulting set of indexing algorithms may identify all query features, and their execution thus produces metadata that may provide solutions to the query.
– If some query features remain unmarked, the resulting set of indexing algorithms cannot identify all query features.

3.2 Concrete Application Domains
As announced in the Introduction, our solution can be applied during the development phase of a multimedia information retrieval system as well as during the concrete usage of the system.

The Case of an Information System Development. When an information system is designed, one important task consists in establishing an exhaustive set of possible queries that will be submitted by users. This enables developers to ensure that the system will meet the concrete user needs. According to this query set, our proposal provides a solution for automatically determining a list of relevant indexing algorithms to be used during the indexation process. This list corresponds to the implicit indexation process, i.e., executing indexing algorithms over each multimedia content during the acquisition phase (i.e., when the multimedia content is included in the system). The indexation results constitute the initial multimedia metadata collection. For this purpose, our technique consists in applying Algorithm 2 to each query and in unifying the obtained lists of indexing algorithms into a single list. For example, suppose n queries and m indexing algorithms. For each query Qi, with 1 ≤ i ≤ n, Algorithm 2 produces a list Li of indexing algorithms. The unified list is $L = \bigcup_{i=1}^{n} L_i$, which contains k indexing algorithms with k ≤ m. It is important to note that, if some query features remain unmarked, the system may advise developers to collect new indexing algorithms that identify these unmarked features. This kind of dialog assists developers during the development of the information system.

The Case of Querying an Information System. When a user query is submitted to the system, the system retrieves a set of solutions from the metadata collection. If no solution is retrieved, this probably means that other indexing algorithms should be executed in order to find some results. According to a user query, our proposal makes it possible to determine the list of indexing algorithms which could provide the supplementary metadata to be used for retrieving some results. This list corresponds to the explicit indexation process, i.e., executing new indexing algorithms over each multimedia content. During the explicit indexation process, the user is informed that no results are available for the moment and that some may be received later.

Thanks to our feature-based modeling, one may know before the evaluation of a query whether it is possible to retrieve some results. Indeed, let Lf be the list of features covered by the implicit indexation process. If all requested features specified by Q are included in Lf, no supplementary indexing algorithms are required for answering the query Q. Otherwise, one may execute Algorithm 2 to establish the explicit indexation process.

As can be noticed, the solution proposed in this section considers queries, multimedia metadata and indexing algorithm descriptions as lists of features. However, in many applications and metadata standards, multimedia features (the low-level ones and the semantic ones) are most often organized into a structure, such as XML [4], RDF [16], etc. In the following section, we show that our proposal is still effective for such structured descriptions.
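The two usage modes of Section 3.2 can be sketched as set operations. The snippet below is a minimal sketch under assumptions that are not in the paper: the per-query lists Li are taken as already computed by Algorithm 2, features are string labels, and the catalogue and the helper names `unify`, `implicit_features` and `missing_features` are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Set

# Hypothetical catalogue: algorithm name -> set of feature labels it extracts.
ALGORITHMS: Dict[str, FrozenSet[str]] = {
    "D1": frozenset({"f1", "f2", "f3", "hair_color"}),
    "D2": frozenset({"f1", "f4"}),
    "D3": frozenset({"f1", "f2", "vehicle"}),
    "D4": frozenset({"f1", "f4", "creator"}),
}


def unify(per_query_lists: Iterable[Set[str]]) -> Set[str]:
    """Development phase: L is the union of the lists Li obtained per query."""
    unified: Set[str] = set()
    for li in per_query_lists:
        unified |= li
    return unified


def implicit_features(
    selected: Set[str], algorithms: Dict[str, FrozenSet[str]]
) -> Set[str]:
    """Lf: every feature covered by the implicit indexation process."""
    covered: Set[str] = set()
    for name in selected:
        covered |= algorithms[name]
    return covered


def missing_features(query: Iterable[str], covered: Set[str]) -> Set[str]:
    """Query phase: requested features that are not included in Lf."""
    return set(query) - covered


# Assumed outputs of Algorithm 2 for two design-time queries.
L = unify([{"D1", "D2"}, {"D2", "D3"}])
assert len(L) <= len(ALGORITHMS)  # k <= m
Lf = implicit_features(L, ALGORITHMS)

print(sorted(missing_features(["f1", "f4"], Lf)))       # prints: []
print(sorted(missing_features(["f1", "creator"], Lf)))  # prints: ['creator']
```

An empty result means the query can be answered from the existing metadata; a non-empty one indicates that an explicit indexation process would be needed for the listed features.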
4 Application to Structured Metadata
Previously, we have shown that our framework uses a unified modeling of the query and of the list of indexing algorithms, both based on multimedia features. Features may be specified through keywords or through specific structures (e.g., RDF, XML). Currently, most of the standardized multimedia metadata vocabularies are XML-based. Many of them have already been translated into RDF in order to be available to Semantic Web technologies, such as [2] for MPEG-7 and [13] for EXIF. Consequently, we illustrate our proposal on RDF-based descriptions (§4.1). Moreover, we show that our framework can be further optimized thanks to Semantic Web technologies (§4.2).

4.1 Application to RDF-Based Descriptions
Consider an information system which adopts RDF-based data descriptions. In such a system, the outputs of indexing algorithms are RDF descriptions, or they are translated into RDF descriptions. In Figure 1, we consider an example set of indexing algorithms described in terms of their outputs through RDF schemes. For instance, the indexing algorithm described in Figure 1(c) performs vehicle recognition inside a video content. Its description contains blank nodes (e.g., _:url, _:x) which are instantiated when the indexing algorithm is applied to a particular video content.

(a) RDF description D1 of an indexing algorithm:
    _:url rdf:type Video .
    _:url foaf:depicts _:x .
    _:x rdf:type foaf:Person .
    _:x ex:hairColor _:color .
(b) RDF description D2 of an indexing algorithm:
    _:url rdf:type Video .
    _:url dc:author _:x .
(c) RDF description D3 of an indexing algorithm:
    _:url rdf:type Video .
    _:url foaf:depicts _:x .
    _:x rdf:type ex:Vehicle .
(d) RDF description D4 of an indexing algorithm:
    _:url rdf:type Video .
    _:url dc:author _:x .
    _:url dc:creator _:y .

Fig. 1. Examples of RDF-based indexing algorithm descriptions

After the execution of indexing algorithms, the obtained metadata are stored in a metadata collection which is queried using SPARQL [20]. For instance, the following SPARQL query Q, evaluated against the metadata collection, retrieves the authors of videos which depict persons:

    PREFIX rdf: <…>
    PREFIX foaf: <…>
    PREFIX dc: <…>
    SELECT ?author
    FROM <…>
    WHERE {
        ?video rdf:type Video .
        ?video foaf:depicts ?person .
        ?person rdf:type foaf:Person .
        ?video dc:author ?author .
    }

Our framework can be used to determine a relevant set of indexing algorithms which provide information for answering this query. The features of the user query and of the entire list of algorithms must first be identified. For the former, each triple pattern specified in the SPARQL query (e.g., ?video rdf:type Video) can be considered as a requested feature. For the latter, each RDF triple specified in the indexing algorithm descriptions (e.g., _:url rdf:type Video) can be considered as an extracted feature. Figure 2 presents the backtracking search tree of the execution of Algorithm 2 over the query Q and the list of indexing algorithms illustrated in Figure 1. Each
node represents a requested feature and a set of indexing algorithms which detect not only this feature but also the features represented by the node's ancestors. In this figure, D1 is a good candidate because it extracts three requested features, namely f1, f2 and f3. For f4, two candidates are found: D2 and D4. The selectOneAlgorithm method selects only one of them (cf. Section 3). Moreover, all query features are marked, which means that the application of the selected indexing algorithms (described by D1 and D2) may produce solutions for the given query.

Features requested by query Q:
    f1 = ?video rdf:type Video
    f2 = ?video foaf:depicts ?person
    f3 = ?person rdf:type foaf:Person
    f4 = ?video dc:author ?author
Search tree (each node: requested feature, candidate algorithms):
    f1 {D1, D2, D3, D4}
        f2 {D1, D3}
            f3 {D1}
                f4 {∅}
            f4 {∅}
        f4 {D2, D4}
Marked features: f1, f2, f3, f4
Results: {D1, D2}

Fig. 2. An execution of Algorithm 2 over the SPARQL query Q and the indexing algorithm descriptions illustrated in Figure 1

In the following, we propose some optimizations of our framework based on grouping features and using the semantic meaning of features.

4.2 Optimizations and Discussions
In the case of RDF-based descriptions and SPARQL queries, we have shown that features are expressed as triples and triple patterns, respectively. Consequently, for queries involving many triple patterns, Algorithm 2 will test many feature combinations. Indeed, the complexity of Algorithm 2 is determined by the number of features. Moreover, since each triple pattern is viewed as a feature, certain general features (e.g., ?x ?y) may correspond to many indexing algorithms. Consequently, certain tree nodes will include quite large lists of indexing algorithms.

In order to improve the execution of Algorithm 2, we propose to group triple patterns which are related to each other. For example, in Figure 2, the features f2 and f3 could be grouped into one single feature, because we are looking for something which depicts a person. This can be achieved by processing the query: since each triple pattern is composed of {subject, predicate, object}, two triple patterns can be grouped if the object of the former is the same as the subject of the latter (as is the case for f2 and f3).
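The grouping rule can be illustrated with a short Python sketch. It assumes that triple patterns are represented as (subject, predicate, object) tuples of strings; the prefixed names in the example are modelled on Figure 1 and are assumptions about the exact terms of Q. The sketch also merges chains of more than two patterns through a small union-find structure, which goes slightly beyond the pairwise rule stated above.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def group_triple_patterns(patterns: List[Triple]) -> List[List[Triple]]:
    """Group patterns whenever the object of one is the subject of another."""
    parent = list(range(len(patterns)))

    def find(i: int) -> int:
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (_, _, obj) in enumerate(patterns):
        for j, (subj, _, _) in enumerate(patterns):
            if i != j and obj == subj:
                parent[find(j)] = find(i)  # attach the latter to the former

    groups: Dict[int, List[Triple]] = {}
    for i, pattern in enumerate(patterns):
        groups.setdefault(find(i), []).append(pattern)
    return list(groups.values())


# Assumed reading of the four triple patterns f1..f4 of query Q.
Q = [
    ("?video", "rdf:type", "Video"),         # f1
    ("?video", "foaf:depicts", "?person"),   # f2
    ("?person", "rdf:type", "foaf:Person"),  # f3
    ("?video", "dc:author", "?author"),      # f4
]

for group in group_triple_patterns(Q):
    print(group)
```

With this input, f2 and f3 end up in the same group while f1 and f4 stay alone, so Algorithm 2 would explore three grouped features instead of four.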
Another improvement consists in using the semantic meaning of features. Indeed, in many situations a requested feature does not correspond exactly to the ones in an indexing algorithm description, as shown in the following examples.

Example 1. If a query searches for information about a person, and the available indexing algorithms identify only information about men, Algorithm 2 will not produce any result. However, using the fact that a man is a person, one is able to select some indexing algorithms, and consequently to produce some results. Such reasoning facilities can be achieved by using ontologies, which can also be used to determine synonyms and related features.

Example 2. Suppose we add to the query Q the following statement f5: ?person ex:hairColor "blond". Our approach will not retrieve any indexing algorithm for this feature (i.e., f5 will remain unmarked), because no indexing algorithm determines exactly this feature. However, the indexing algorithm associated with the description D1 extracts information about hair colors. In order to select this algorithm, it is possible to relax the query constraint by replacing "blond" with a variable ?color. Ontologies may also be used to find out that blond is a specialization of hair color. Adopting an ontology could be a solution for optimizing our framework, since it specifies a network of concepts containing general concepts as well as specific ones.
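Both examples can be mimicked with a small Python sketch. Everything in it is an assumption made for illustration: the ontology is reduced to a hypothetical single-parent subclass map (`ex:Man` is an invented class name), a requested term matches an extracted term if it is a variable, an identical term, or a superclass of it, and relaxation only replaces a literal object with a variable.

```python
from typing import Dict, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical ontology fragment: class -> direct superclass.
SUBCLASS_OF: Dict[str, str] = {"ex:Man": "foaf:Person"}


def is_subclass(cls: str, ancestor: str, subclass_of: Dict[str, str]) -> bool:
    """True if cls equals ancestor or reaches it through subclass links."""
    seen = set()
    while cls not in seen:
        if cls == ancestor:
            return True
        seen.add(cls)
        if cls not in subclass_of:
            return False
        cls = subclass_of[cls]
    return False  # a cycle was met without reaching the ancestor


def term_matches(requested: str, extracted: str, subclass_of: Dict[str, str]) -> bool:
    if requested.startswith("?"):  # a query variable matches any term
        return True
    return is_subclass(extracted, requested, subclass_of)


def pattern_matches(requested: Triple, extracted: Triple,
                    subclass_of: Dict[str, str]) -> bool:
    return all(term_matches(r, e, subclass_of)
               for r, e in zip(requested, extracted))


def relax_literal(pattern: Triple, variable: str = "?color") -> Triple:
    """Replace a literal object (written in double quotes) with a variable."""
    subj, pred, obj = pattern
    return (subj, pred, variable) if obj.startswith('"') else pattern


# Example 1: a person is requested, an algorithm extracts men.
wanted = ("?p", "rdf:type", "foaf:Person")
extracted = ("_:x", "rdf:type", "ex:Man")
print(pattern_matches(wanted, extracted, {}))           # prints: False
print(pattern_matches(wanted, extracted, SUBCLASS_OF))  # prints: True

# Example 2: f5 asks for "blond", D1 extracts some hair color.
f5 = ("?person", "ex:hairColor", '"blond"')
d1_hair = ("_:x", "ex:hairColor", "_:color")
print(pattern_matches(f5, d1_hair, SUBCLASS_OF))                 # prints: False
print(pattern_matches(relax_literal(f5), d1_hair, SUBCLASS_OF))  # prints: True
```

A relaxed match only indicates that the algorithm is worth running; whether the extracted hair color is actually blond can be decided only once the metadata have been produced.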
5 Conclusion
Our approach is situated in the context of multimedia information systems, where an intensive indexation process is required in order to facilitate the retrieval of multimedia contents. We propose a generic framework for selecting a relevant set of indexing algorithms according to user queries. Moreover, we have shown that our proposal can be applied to concrete multimedia metadata vocabularies, such as RDF-based descriptions. The application to such a language allows the use of Semantic Web technologies, which could improve our framework.

Furthermore, our solution is general enough to function in the context of a local information system as well as inside a distributed information system, or even inside a Web-based information system. For the last case, implementing the indexing algorithms as Web services is to be considered.

Our solution is going to be integrated and tested in the context of the LINDO project⁶. This project is focused on managing the multimedia indexation process inside a distributed environment, where all the indexing algorithms are centralized on a central server and deployed on demand on the remote servers, according to the user queries. Our solution optimizes the deployment of indexing algorithms and the indexation processes by selecting the most suitable algorithms according to the user queries. Furthermore, in the distributed environment of the LINDO project, queries might be different for each remote server. Consequently, we will observe whether some indexing algorithms are frequently requested by most of the remote servers.

⁶ http://www.lindo-itea.eu
Acknowledgement

This work has been supported by the EUREKA Project LINDO (ITEA2 – 06011).
References

1. Agosti, M.: Information Retrieval and HyperText. Kluwer Academic Publishers, Norwell (1996)
2. Arndt, R., Troncy, R., Staab, S., Hardman, L., Vacura, M.: COMM: Designing a well-founded multimedia ontology for the web. In: Aberer, K., Choi, K.-S., Noy, N., Allemang, D., Lee, K.-I., Nixon, L.J.B., Golbeck, J., Mika, P., Maynard, D., Mizoguchi, R., Schreiber, G., Cudré-Mauroux, P. (eds.) ASWC 2007 and ISWC 2007. LNCS, vol. 4825, pp. 30–43. Springer, Heidelberg (2007)
3. Sarwar, B., Karypis, G., Konstan, J.A., Riedl, J.: Incremental SVD-based algorithms for highly scalable recommender systems. In: Proceedings of the Fifth International Conference on Computer and Information Technology (2002)
4. Bray, T., Paoli, J., Sperberg-McQueen, C.M., Maler, E., Yergeau, F.: Extensible markup language (XML) 1.0, 5th edn. Recommendation, W3C (2008)
5. Brin, S., Page, L.: The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems 30(1-7), 107–117 (1998)
6. Buckland, M.K., Plaunt, C.: On the construction of selection systems. Library Hi Tech 12, 15–28 (1994)
7. Chen, S.-C., Ghafoor, A., Kashyap, R.L.: Semantic Models for Multimedia Database Searching and Browsing. Kluwer Academic Publishers, Norwell (2000)
8. Chrisment, C., Sèdes, F.: Media annotation. In: Multimedia Mining: A Highway to Intelligent Multimedia Documents (Multimedia Systems and Applications Series), pp. 197–211. Kluwer Academic Publishers, Dordrecht (2002)
9. Devlin, B.: MXF – the Material eXchange Format. EBU Technical Review, Snell & Wilcox (July 2002)
10. Dönderler, M.E., Şaykol, E., Arslan, U., Ulusoy, Ö., Güdükbay, U.: BilVideo: Design and implementation of a video database management system. Multimedia Tools and Applications 27(1), 79–104 (2005)
11. Foote, J.: An overview of audio information retrieval. Multimedia Systems 7(1), 2–10 (1999)
12. Japan Electronics and Information Technology Industries Association: Exchangeable image file format for digital still cameras: Exif Version 2.2 (April 2002)
13. Kanzaki, M.: EXIF vocabulary workspace – RDF Schema. W3C (2003), http://www.w3.org/2003/12/exif/
14. Lambolez, P.Y., Queille, J.P., Chrisment, C.: EXREP: A generic rewriting tool for textual information extraction. Ingénierie des Systèmes d'Information 3, 471–487 (1995)
15. Lancaster, F.W.: Information Retrieval Systems. Wiley, New York (1979)
16. Manola, F., Miller, E.: RDF primer. Recommendation, W3C (2004)
17. Martínez, J.M.: MPEG-7 Overview v.10. ISO/IEC JTC1/SC29/WG11/N6828 (2004), http://www.chiariglione.org/mpeg/standards/mpeg-7/mpeg-7.htm
18. Micarelli, A., Sciarrone, F., Marinilli, M.: Web document modeling. In: Brusilovsky, P., Kobsa, A., Nejdl, W. (eds.) Adaptive Web 2007. LNCS, vol. 4321, pp. 155–192. Springer, Heidelberg (2007)
19. National Information Standards Organization: The Dublin Core Metadata Element Set. ANSI/NISO Z39.85 (May 2007)
20. Prud'hommeaux, E., Seaborne, A.: SPARQL Query Language for RDF. Recommendation, W3C (January 2008)
21. Salton, G., McGill, M.J.: Introduction to Modern Information Retrieval. McGraw-Hill, Inc., New York (1986)
22. Viola, P., Jones, M.: Robust real-time object detection. International Journal of Computer Vision (2001)
23. Yang, M.-H., Kriegman, D.J., Ahuja, N.: Detecting faces in images: a survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(1), 34–58 (2002)