Content-Based Fuzzy Search in a Multimedia Web Database

Marina Teresa Pires Vieira, Mauro Biajiz, Sérgio Ricardo Borges Júnior, Eduardo Cotrin Teixeira, Fernando Genta dos Santos, Josiel Maimoni Figueiredo

{marina, mauro, borges, eduardo, genta, josiel}@dc.ufscar.br
Department of Computer Science, Federal University of São Carlos, Caixa Postal 676, São Carlos, SP, Brazil

Abstract. This paper presents the mechanisms employed to carry out content-based fuzzy searches in a multimedia applications database. These searches can be carried out through the World Wide Web, allowing for the search for media whose content has a certain degree of similarity with that defined in the query predicate. The imprecision involved in the semantic information that defines the content of the media is treated by means of proximity relations to compare terms established in the query with those found in the database. A description is given of the formulas used to calculate the similarity degree between these terms to allow for classification of the media in the response set, as well as the algorithms used to search through the database to retrieve the media.

Keywords: semantic information, fuzzy logic, multimedia database, information retrieval
1 Introduction

Today it is very common to search for information in a large amount of data, which generally involves video, audio and images, among others. These types of data are manipulated by multimedia applications, which are becoming increasingly popular. In a multimedia database it is useful to maintain not only the media's raw data but also information about its content. This information provides greater flexibility for the user to compose his queries.

This approach is used in the AMMO environment (Authoring and Manipulation of Multimedia Objects), which has been developed to allow for the creation, storage and manipulation of multimedia applications [1-4]. In this environment, a user can query a multimedia applications database using exact or fuzzy content-based searches on the World Wide Web. The applications are structured using the SMIL standard [5], an XML application [6], which allows them to be executed on the Web with the help of a presentation tool. The user can also manipulate information separately, for instance, by retrieving a particular scene or even a specific media of a scene. Although the database used in the AMMO environment is based on the SMIL standard, the semantic information involved can be fitted to any multimedia database.

This paper discusses the main aspects relating to the conception of the environment. Section 2 describes the set of metadata that represent the multimedia applications and the semantic information stored in the multimedia database. Section 3 explains how queries can be set up in the environment, while section 4 gives details of the query processing, including the set of formulas that allows for classification of the media to be presented to the user. Section 5 lists some of the related work, and section 6 presents our conclusions.
2 Multimedia Database of SMIL Applications

2.1 The SMIL Standard

SMIL [5] is a proposal of the W3C for the treatment of multimedia applications. The SMIL standard, whose definition is based on XML, utilizes a set of tags that serve to organize multimedia information for presentation on the Web. The mechanisms supplied by SMIL allow for the composition of presentations combining a variety of media, synchronizing them temporally and spatially.

The main tags that comprise a document based on the SMIL 1.0 standard are <smil>, <head> and <body>. The <smil> tag defines a SMIL document and all the other tags on the document are its dependents. The <head> tag defines the spatial arrangement of the document through the <layout> and <region> tags. In addition, <head> defines metainformation about the document, using the <meta> tag. The <body> tag contains the tags that, in some way, influence the document's temporal behavior, i.e., the media tags (such as <img>, <audio>, <video> and <text>), the synchronization tags (<par> and <seq>) and the linking tags (<a> and <anchor>). Shown below is an example of a simplified SMIL document that displays a video simultaneously to the execution of an audio file.
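As a minimal sketch of such a document, the following Python snippet assembles a simplified SMIL 1.0 structure with the standard xml.etree.ElementTree module: a <head> with a <meta> entry and one <region>, and a <body> whose <par> tag plays a video and an audio file in parallel. The file names, the region identifier and the dimensions are hypothetical values chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def build_smil(video_src: str, audio_src: str) -> ET.Element:
    """Build a simplified SMIL 1.0 tree: one region, video and audio in parallel."""
    smil = ET.Element("smil")

    # <head>: metainformation and spatial layout.
    head = ET.SubElement(smil, "head")
    ET.SubElement(head, "meta", name="title", content="Simplified example")
    layout = ET.SubElement(head, "layout")
    # Hypothetical region id and size.
    ET.SubElement(layout, "region", id="main", width="320", height="240")

    # <body>: temporal behavior; <par> presents its children simultaneously.
    body = ET.SubElement(smil, "body")
    par = ET.SubElement(body, "par")
    ET.SubElement(par, "video", src=video_src, region="main")
    ET.SubElement(par, "audio", src=audio_src)
    return smil


# Hypothetical media file names.
doc = build_smil("clip.mpg", "narration.wav")
smil_text = ET.tostring(doc, encoding="unicode")
print(smil_text)
```

Replacing <par> with <seq> in this sketch would present the two media one after the other instead of simultaneously.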
Fig. 6. SMIL document generated from the results of the query

RESULT
Source: ../images/img109.jpg
    Scene: rio-tourism.smi    Application: applicationA
    Scene: rio-points.smi     Application: applicationA
Source: ../images/img118.jpg
    Scene: rio-points.smi     Application: applicationA

Fig. 7. View in text mode
The format used to view the scene is obtained by the application of a style sheet (based on XSL) [10], which transforms the document into HTML. Other forms of viewing are available, such as that shown in Figure 7, which is a textual representation of the result and is obtained by the application of another style sheet. The SMIL document can also be kept in its original state, with presentation formatting, for use as a model document for interchanges between applications, or even as a source of data for other applications [11].
4 Evaluation of Queries

A query is evaluated in three stages. First, a preselection is made of the media, based on the subjects involved in the query expression. The µo (µmediaObject) value of each preselected media, which represents the degree of pertinence (or similarity degree) in the response set, is then calculated. Finally, the media are classified in decreasing order of the µo values. The µo function represents a proximity relation defined as µo : P × M → [0,1], where P represents the set of all the possible media that satisfy the query predicate and M represents the set of media stored in the database at a given instant. These stages are described in detail below.
4.1 Preselection of Media

A preselection is made of the media that contain the subjects requested in the query expression, or similar subjects (satisfying the minimum similarity established for the requested subjects), which are combined in the same form as that defined in the query expression. Only the media of the types indicated by the user are preselected (image, video, text or a combination of these types). This stage has been divided into the steps discussed below, and the following query predicate serves as an example to illustrate the discussion.

(Building AND Lake) OR (Lake With Bird)
Group 1: (Building AND Lake); Group 2: (Lake With Bird)
Step 1: Obtain subjects that are similar to each subject of each group.
The first step consists of making a preselection of subjects similar to the subjects indicated in the query expression. The subjects that are similar to a requested subject are those whose similarity degree is higher than or equal to the minimum similarity required for that subject. Consider Figure 8, where SQ1 (subject 1 of the query) represents Building and SQ2 represents Lake. The similar preselected subjects are shown in level 1 of the figure (S1, S4 and S9 are similar to SQ1 while S2 and S5 are similar to SQ2).

Step 2: Retrieve the media that contain similar subjects.
This step consists of retrieving the media that contain subjects found in the previous step and that are of the same type as that requested by the user. Each subject may be present in several media. In the example of Figure 8, the retrieved media are indicated in level 2.

Step 3: Combine the collections of media for each group.
The sets of media relating to the subjects of each group are combined according to the group's operator (level 3 of Figure 8). Thus, if the subjects of the group are combined by the connector AND, an intersection of the sets of media is made, obtaining the media that contain all the subjects of the group. If the connector OR is used, these sets are joined, resulting in the media that contain at least one of the subjects of the group. When the group consists of an association of two subjects (such as group 2 of the example), the sets are combined by the connector AND, since the two subjects must be present in the media.

(SQ1 AND SQ2)
Level 1 (Step 1): SQ1 → {S1, S4, S9}; SQ2 → {S2, S5}
Level 2 (Step 2): SQ1 → {M1}, {M2, M3}, {M5}, giving {M1, M2, M3, M5}; SQ2 → {M1, M6}, {M3}, giving {M1, M6, M3}
Level 3 (Step 3): {M1, M3}
Fig. 8. Preselection of media based on a group

Step 4: Combine the collections of media of all the groups of the query predicate.
The purpose of the last step is to combine the sets of media that satisfy each group according to the connector and the precedence of the groups in the query predicate. In the example given here, where {M1, M3} is the set of media that satisfies group 1 and assuming that {M3, M7, M9} is the set that satisfies group 2, the result of the search is {M1, M3} ∪ {M3, M7, M9} = {M1, M3, M7, M9}.

4.2 Calculation of the Similarity of the Media

After the preselection is made, the degree of similarity (µo) of each preselected media is calculated. The degree of similarity establishes "to what extent" the media belongs to the response set. To calculate the value of µo, the groups Gj involved in the query expression are considered, together with the way they are related to each other (AND/OR connectors). The value µGj (similarity degree of the group Gj) is calculated for each group according to its composition. The calculation for groups of subjects related by the connector AND differs from the calculation for groups of subjects related by the connector OR, which, in turn, differs from the calculation for subject composition groups. The similarity degree of subjects with qualifiers is calculated differently from that of subjects without qualifiers. The formulas for these calculations and examples of their use are given below.
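Before turning to the formulas, the preselection stage of Section 4.1 (Steps 1 to 4) can be summarized in a short Python sketch. It is only an illustration under stated assumptions: the proximity table, its similarity values, the extra subject S7 and the assignment of media to individual subjects are hypothetical (only the sets themselves come from Figure 8), the database is replaced by in-memory dictionaries, and the filtering by media type is omitted.

```python
from typing import Dict, Iterable, Set

# Hypothetical proximity table: requested subject -> {stored subject: similarity}.
PROXIMITY: Dict[str, Dict[str, float]] = {
    "SQ1": {"S1": 1.0, "S4": 0.9, "S9": 0.8, "S7": 0.4},
    "SQ2": {"S2": 1.0, "S5": 0.85},
}

# Hypothetical index: stored subject -> media that contain it.
MEDIA_OF: Dict[str, Set[str]] = {
    "S1": {"M1"}, "S4": {"M2", "M3"}, "S9": {"M5"}, "S7": {"M8"},
    "S2": {"M1", "M6"}, "S5": {"M3"},
}


def similar_subjects(subject: str, min_sim: float) -> Set[str]:
    """Step 1: stored subjects whose similarity reaches the required minimum."""
    return {s for s, sim in PROXIMITY[subject].items() if sim >= min_sim}


def media_for(subject: str, min_sim: float) -> Set[str]:
    """Step 2: media containing at least one subject similar to the requested one."""
    media: Set[str] = set()
    for s in similar_subjects(subject, min_sim):
        media |= MEDIA_OF.get(s, set())
    return media


def combine(connector: str, media_sets: Iterable[Set[str]]) -> Set[str]:
    """Steps 3 and 4: intersection for AND, union for OR."""
    sets = list(media_sets)
    if connector == "AND":
        return set.intersection(*sets)
    if connector == "OR":
        return set.union(*sets)
    raise ValueError(f"unknown connector: {connector}")


# Group 1 of Figure 8: (SQ1 AND SQ2), with an assumed minimum similarity of 0.8.
group1 = combine("AND", [media_for("SQ1", 0.8), media_for("SQ2", 0.8)])
# Group 2 is taken as given in the text.
group2 = {"M3", "M7", "M9"}
result = combine("OR", [group1, group2])
print(sorted(group1), sorted(result))
```

With these assumed values, S7 is discarded in Step 1 because its similarity is below the minimum, group 1 yields {M1, M3}, and the final union is {M1, M3, M7, M9}, as in the example of Step 4.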
Calculations for subject groups with the AND connector, for subject groups with the OR connector, and for subject composition are carried out through formulas 1, 2 and 3, respectively. Examples of their use are given following the presentation of the formulas, while Section 4.3 contains a discussion of these formulas.
where: µSi, i = 1, ..., n, are the similarity values of the subjects Si of Gj; rlvSi is the relevance degree of the subject Si; Si and Si+1 are the two subjects of the composition group; simJ is the similarity between the association requested for group Gj and the one found in the media; rlvSi and rlvSi+1 are the relevance degrees requested.

The similarity of subjects without qualifiers is calculated through formula 4, while that of subjects with qualifiers is calculated using formula 5.

$$\mu_{S_i} = \begin{cases} sim_{S_i}, & \text{if } sim_{S_i} \ge tlr \\ 0, & \text{if } sim_{S_i} < tlr \end{cases} \quad \text{where } tlr = sim_{required} \cdot rlv_{S_i} \qquad \text{(subjects without qualifiers)} \tag{4}$$

$$\mu_{S_i} = \frac{\sum_{j=1}^{n} \left( sim_{Q_j} \cdot rlv_{Q_j} \right)}{\sum_{j=1}^{n} rlv_{Q_j}} \qquad \text{(subjects with qualifiers)} \tag{5}$$

where: Qj, j = 1, ..., n, is the set of qualifiers of the subject Si; simQj is the similarity degree of the qualifier Qj; rlvQj is the relevance degree of the qualifier Qj supplied by the user; simSi is the similarity found for subject Si; tlr is the tolerance value; simrequired and rlvSi are the values corresponding to the similarity and relevance supplied by the user for a subject Si.

The tolerance value, tlr, which is applied to allow subjects with similarity close to that of the desired subject to also be selected in the search, reduces the required degree of subject similarity.

The µo of the media is calculated recursively, using the intermediary values (µinterm) obtained through expression (6) for groups connected with OR and expression (7) for groups connected by AND.
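$$\mu_{interm} = \max\{\mu_{G_j}\}, \quad j = 1, \ldots, n, \text{ where } n \text{ is the number of groups connected with OR} \tag{6}$$

$$\mu_{interm} = \frac{\sum_{j=1}^{n} \mu_{G_j}}{n}, \quad j = 1, \ldots, n, \text{ where } n \text{ is the number of groups connected with AND} \tag{7}$$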
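Formulas (4) and (5) can be illustrated with a small Python sketch. The function and parameter names are hypothetical, and the sketch assumes that simQj in formula (5) is the similarity found in the media for each qualifier, weighted by the relevance supplied by the user.

```python
from typing import List, Tuple


def mu_without_qualifiers(sim_found: float, sim_required: float, rlv: float) -> float:
    """Formula (4): keep the found similarity only if it reaches the tolerance."""
    tlr = sim_required * rlv  # tolerance value
    return sim_found if sim_found >= tlr else 0.0


def mu_with_qualifiers(qualifiers: List[Tuple[float, float]]) -> float:
    """Formula (5): relevance-weighted average of the qualifier similarities.

    Each entry of `qualifiers` is a pair (sim_Qj, rlv_Qj).
    """
    total_rlv = sum(rlv for _, rlv in qualifiers)
    if total_rlv == 0:
        raise ValueError("at least one qualifier must have a non-zero relevance")
    return sum(sim * rlv for sim, rlv in qualifiers) / total_rlv


# A subject requested with similarity 0.8 and relevance 0.8 gives tlr = 0.64.
mu_accepted = mu_without_qualifiers(sim_found=0.9, sim_required=0.8, rlv=0.8)
mu_rejected = mu_without_qualifiers(sim_found=0.6, sim_required=0.8, rlv=0.8)

# Two qualifiers found with similarities 0.8 and 0.9, relevances 1.0 and 0.8.
mu_qualified = mu_with_qualifiers([(0.8, 1.0), (0.9, 0.8)])
print(mu_accepted, mu_rejected, round(mu_qualified, 4))
```

In the first call the found similarity 0.9 exceeds the tolerance 0.64 and is kept; in the second, 0.6 falls below it and the subject contributes 0. The qualifier average is (0.8 · 1.0 + 0.9 · 0.8) / (1.0 + 0.8) = 1.52 / 1.8.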
Example

Let us assume that the user wishes to retrieve media that satisfy the query expression (Tall Green Building AND Lake) OR (Lake With White Bird), with the following values of similarity and relevance: Building (1.0, 1.0), Lake (0.8, 0.8), Bird (1.0, 0.9), Tall (0.8, 1.0), Green (0.7, 0.8) and White (1.0, 1.0). Let us, further, assume that a media was found containing the following degrees of similarity with those established in the query: Building (1.0), Lake (0.9), Bird (1.0), Tall (0.8), Green (0.9) and White (1.0). Based on these values, the following similarity degree of the groups is obtained.

Calculation of µG1: (Tall Green Building AND Lake)
Calculation of the similarity degree of the media (µo): (Tall Green Building AND Lake) OR (Lake With White Bird).

$$\mu_o = \max\{\mu_{G_1}, \mu_{G_2}\} = \max\{0.92, 0.788\} = 0.92$$
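The recursive use of expressions (6) and (7) can be sketched in Python as follows. The nested-tuple representation of the query and the function name are assumptions made for illustration; the leaves are group similarities µGj that have already been calculated.

```python
from typing import List, Tuple, Union

# A node is either a group similarity (leaf) or a pair (connector, children).
Node = Union[float, Tuple[str, List["Node"]]]


def mu_interm(node: Node) -> float:
    """Evaluate expressions (6) and (7) recursively over a query tree."""
    if isinstance(node, (int, float)):
        return float(node)  # leaf: an already computed group similarity
    connector, children = node
    values = [mu_interm(child) for child in children]
    if connector == "OR":
        return max(values)  # expression (6)
    if connector == "AND":
        return sum(values) / len(values)  # expression (7)
    raise ValueError(f"unknown connector: {connector}")


# The example above: two groups connected with OR.
mu_o = mu_interm(("OR", [0.92, 0.788]))

# A hypothetical nested predicate: (G1 OR G2) AND G3, with an assumed value 0.6 for G3.
mu_nested = mu_interm(("AND", [("OR", [0.92, 0.788]), 0.6]))
print(mu_o, round(mu_nested, 2))
```

For the example, the OR node returns 0.92, in agreement with the value of µo obtained above. In the hypothetical nested case the AND node averages 0.92 and 0.6, which gives 0.76.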
The calculation of the similarity of a media involves various aspects, which are discussed in the next section.

4.3 Considerations

1 – The value of µSi of expression (5) is calculated through the weighted average of the similarity values of the qualifiers, using relevance as a weight. The weighted average was chosen in order to compensate for the influence of the terms in the calculation of similarity. Thus, the greater the relevance of a qualifier for a subject, the stronger its influence on the value of µSi. In the theory of fuzzy sets, the intersection operation (AND operator) between the elements of two fuzzy sets is done by obtaining the minimum values of the elements involved. However, for the nature of the information manipulated here (semantic information in multimedia data), it was found that the weighted average of the similarity values of the subjects is more adequate. This can be verified through the following example. Suppose that media were requested containing the information Tall Building AND Blue Lake AND White Bird, and that two media were found with the following similarity degrees for the three terms: (0.9 AND 0.9 AND 0.3) and (0.35 AND 0.35 AND 0.35), with relevance 1 for all of them. By applying the minimum operator, the value calculated for the first media would be 0.3, while for the second it would be 0.35, which classifies the second media as the closer one to the request. With the weighted average, the values are 0.7 and 0.35, respectively. Intuitively, one perceives that the first media is closer to satisfying the user than the second. The same reasoning applies to expression (1).

2 – In the composition of subjects, it is assumed that the degree of relevance of the composition is 1.0 (represented by the value 1 in the denominator of expression 3). When defined in a query expression, these compositions are considered obligatory and, therefore, the retrieved media must contain them.
3 – The similarity of subjects with qualifiers is calculated by means of expression (5). However, the calculation is only made if the similarity between the subject that was found and the one requested is 1.0, because otherwise the results may be distorted. Suppose the user wishes to retrieve High Hill with the following similarity and relevance values: Hill (0.8, 1.0) and High (0.9, 1.0). Suppose also that a media has been found that contains High Mountain, and that the similarity between Mountain and Hill is 0.9. Even though the subjects are very similar, the similarity between High Hill and High Mountain may not correspond to the real semantics intended by the user. For the user, it may be that a High Hill is significantly more similar to a Low Mountain. Hence, when qualifiers are defined for subjects with similarity degrees different from 1.0, the user is required to choose from among the possible combinations involving similar subjects and values of qualifiers from the same domain in question (High Mountain, Medium Mountain, Low Mountain, etc.).
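The comparison made in consideration 1 between the minimum operator and the relevance-weighted average can be checked numerically with a few lines of Python. The function name is hypothetical; the similarity and relevance values are those of the Tall Building AND Blue Lake AND White Bird example.

```python
from typing import Sequence


def weighted_average(sims: Sequence[float], rlvs: Sequence[float]) -> float:
    """Relevance-weighted average of similarity values."""
    return sum(s * r for s, r in zip(sims, rlvs)) / sum(rlvs)


relevances = [1.0, 1.0, 1.0]
first_media = [0.9, 0.9, 0.3]
second_media = [0.35, 0.35, 0.35]

# Minimum operator (classical fuzzy intersection): ranks the second media higher.
min_first, min_second = min(first_media), min(second_media)

# Weighted average: ranks the first media higher.
avg_first = weighted_average(first_media, relevances)
avg_second = weighted_average(second_media, relevances)

print(min_first, min_second)
print(round(avg_first, 2), round(avg_second, 2))
```

The minimum operator lets the single low value 0.3 decide the ranking, whereas the weighted average reflects that two of the three requested terms are almost fully satisfied in the first media.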
5 Related Work

Several studies have focused on the use of fuzzy logic to represent and manipulate imprecise information in databases. These approaches are employed both in systems based on the relational model [7,12-14] and in systems based on the object-oriented model [15-22]. Lee [12] proposes an extension to the relational database model that represents the imprecision of the data with the use of probability distribution. Medina et al. [13] propose a generalized model of a fuzzy relational database that integrates several models in a single framework, such as similarity relations, proximity relations and possibility distributions. George et al. [15,16] present an extension of the object-oriented data model to handle different types of imprecise data. Buckles and Petry [14] propose an approach to introduce imprecision in a relational database, in which the attributes of a tuple can have, as values, subsets of a set of domains, and a similarity relation is defined for each set of equivalent domains. Shenoi and Melton [7] extended the approach proposed by Buckles and Petry, replacing similarity relations with proximity relations by eliminating the transitive property. This change was introduced to allow the users greater freedom to insert similarity values among the elements of the domains of an application.

The system presented here uses proximity relations applied to an object-oriented database of multimedia applications to retrieve media similar to those requested in a query. The fuzzy information that is manipulated refers to the content of these media and must be treated differently from the approaches used in the above-mentioned studies.
6 Conclusions

This paper discussed the approach that is being used in the AMMO project for the retrieval of multimedia information based on the semantic content of the stored media. The use of fuzzy logic, as presented herein, has proved to be appropriate for the retrieval and classification of media. Tests carried out in the environment have confirmed that the formulas defined to calculate the similarity of the media are appropriate for the nature of the semantic information. This system was developed using Java and the Jasmine ii Object-Oriented Management System [23]. XML and SMIL 1.0 standard language resources are used to present the media and the scenes of the multimedia applications via the Web. Ongoing studies are directed at meeting the requirements defined by version 2.0 of the SMIL standard [24].
References

1 Santos M.T.P., Vieira M.T.P., Borges S.R., Figueiredo J.M., Fornazari F.P., Biajiz M., (2000). Semantic Information Search Facilities for MHEG-5 and SMIL Applications. FQAS 2000: Fourth International Conference on Flexible Query Answering Systems, Warsaw, Poland. Springer-Verlag, "Advances in Soft Computing" series, 315-325.
2 Vieira M.T.P., Biajiz M., Santos M.T.P., Miradaya L.R., Fornazari F.P., (1999). Metadata for Content-Based Search on an MHEG-5 Multimedia Objects Server. Proc. of the Third IEEE Meta-Data'99, IEEE Computer Society, NIH Campus, Bethesda, Maryland, USA. URL: http://computer.org/conferen/proceed/meta/1999/papers/32/MVieira.html.
3 Vieira M.T.P., Santos M.T.P., (1997). Content-based Search on a MHEG-5 Standard-based Multimedia Database. Proceedings of the QPMIDS DEXA 97, IEEE Computer Society, Toulouse, FR, 154-159.
4 Fornazari F.P., (1999). A System for the Treatment of Fuzzy Searches in Multimedia Applications for a Multimedia Objects Server. MPhil. Dissertation, Department of Computer Science, UFSCar, São Carlos, São Paulo, Brazil. (In Portuguese).
5 SMIL W3C, (1998). Synchronized Multimedia Working Group of the World Wide Web Consortium. Synchronized Multimedia Integration Language (SMIL) 1.0 Specification. W3C Recommendation. URL: http://www.w3.org/TR/REC-smil. ON-LINE: June/2001.
6 XML-W3C Recommendation, (2000). Extensible Markup Language (XML) 1.0. October. URL: http://www.w3.org/TR/REC-xml. ON-LINE: June/2001.
7 Shenoi S., Melton A., (1999). Proximity Relations in Fuzzy Relational Database Model. Fuzzy Sets and Systems 100, Supplement, 51-62.
8 Zadeh L.A., (1997). Similarity Relations and Fuzzy Orderings. In: Fuzzy Sets and Applications: Selected Papers by L.A. Zadeh, Yager R.R., et al., eds. Wiley-Interscience Publication, 81-104.
9 Figueiredo J., (2000). An Environment for Authoring and Manipulation of Multimedia Applications on the World Wide Web. MPhil. Dissertation, Department of Computer Science, UFSCar, São Carlos, São Paulo, Brazil. (In Portuguese).
10 XSL-W3C Candidate Recommendation, (2000). Extensible Stylesheet Language (XSL) 1.0. November. URL: http://www.w3.org/TR/xsl. ON-LINE: March/2001.
11 Sall K., (1998). XML: Structuring Data for the Web: An Introduction. URL: http://www.stars.com/Authoring/Languages/XML/Intro/. ON-LINE: March/1998.
12 Lee S.K., (1992). An Extended Relational Database Model for Uncertain and Imprecise Information. 18th VLDB Conference, Vancouver, British Columbia, Canada.
13 Medina J.M., Pons O., Vila M.A., (1994). GEFRED. A Generalized Model of Fuzzy Relational Data Bases. Information Sciences.
14 Buckles B.P., Petry F.E., (1982). A Fuzzy Representation of Data for Relational Databases. Fuzzy Sets and Systems 7, 213-226.
15 George R., Srikanth R., Buckles B.P., Petry F.E., (1997). An Approach to Modeling Impreciseness and Uncertainty in the Object-Oriented Data Model. In: Fuzzy Information Engineering – A Guided Tour of Applications, Dubois D., Prade H., Yager R.R., eds. John Wiley & Sons, Inc., 325-337.
16 George R., Yazici A., Buckles B.P., Petry F.E., (1997). Modeling Impreciseness and Uncertainty in the Object-Oriented Data Model – A Similarity-Based Approach. In: Fuzzy and Uncertain Object-Oriented Databases – Concepts and Models, De Caluwe R., ed. World Scientific, 63-95.
17 Yazici A., George R., Aksoy D., (1998). Design and Implementation Issues in the Fuzzy Object-Oriented Data Model. Journal of Information Sciences 108, 241-260.
18 Gyseghem N.V., De Caluwe R., (1997). The UFO Database Model: Dealing with Imperfect Information. In: Fuzzy and Uncertain Object-Oriented Databases – Concepts and Models, De Caluwe R., ed. World Scientific, 123-185.
19 Bordogna G., Leporati A., Lucarella D., Pasi G., (2001). The Fuzzy Object-Oriented Database Management System. In: Recent Issues on Fuzzy Databases, Bordogna G., Pasi G., eds. Physica-Verlag, 209-236.
20 Bordogna G., Lucarella D., Pasi G., (1997). An Extension of a Graph Based Data Model to Manage Fuzzy Information. In: Fuzzy and Uncertain Object-Oriented Databases – Concepts and Models, World Scientific, 97-122.
21 Tré G., Caluwe R., Cruyssen B.V., (2001). A Generalised Object-Oriented Database Model. In: Recent Issues on Fuzzy Databases, Bordogna G., Pasi G., eds. Physica-Verlag, 155-182.
22 Koyuncu M., Yazici A., George R., (2000). Flexible Querying in an Intelligent Object-Oriented Database Environment. FQAS 2000: Fourth International Conference on Flexible Query Answering Systems, Warsaw, Poland. Springer-Verlag, "Advances in Soft Computing" series, 75-84.
23 Computer Associates, (2000). Jasmine ii – on-line documentation.
24 W3C Recommendation, (2001). Synchronized Multimedia Integration Language (SMIL 2.0). URL: http://www.w3.org/TR/2001/REC-smil20-20010807.