A Fuzzy Object Query Language (FOQL) for Image Databases
S. Nepal, M.V. Ramakrishna and J.A. Thom
Department of Computer Science, RMIT University, GPO Box 2476V, Melbourne VIC 3001
{nepal,rama,
[email protected]
Abstract
Image data, as well as the nature of querying image data, is fuzzy. Users like to retrieve images based on similarity rather than exact matching, and they are often able to provide only partial information while posing queries. As a consequence, the query languages of traditional databases, which deal with precise data, are inappropriate for querying image data. Content based retrieval systems have been developed for querying image data, in which users can pose queries based on visual properties such as color and texture. These systems, which have advanced the state of the art in image databases remarkably, lack formal query languages. The traditional query languages are unable to capture the inherent fuzzy nature of image data and content based querying. The fuzzy object query language (FOQL) presented in this paper addresses the need to support the fuzzy values and fuzzy collections required for image databases. It can be used for defining schemas and high level concepts, and for querying image databases. FOQL provides the following advantages over other query languages. It allows users to define high level concepts such as mountain and sunset in terms of low level features such as color and texture. It captures the inherent fuzzy nature of content based retrieval by keeping the query results fuzzy, unlike other query languages. Users can interactively refine their queries and high level concept definitions using the recursive and named query definition constructs in FOQL. Being an extension of ODMG-OQL, FOQL can be easily mapped to ODMG-compliant visual query languages.
Keywords: Multimedia database, query language, image database, content based image retrieval, fuzzy object query language.
1 Introduction
Multimedia data, especially image data, has become prevalent in recent years. Large collections of image data are growing rapidly due to the advent of cheaper storage and processing power, and are becoming widely available via the Internet. Content based retrieval from such data is essential for these systems to be useful. Modeling and efficiently retrieving data from such large collections has become an important issue [8, 18]. Due to the inherent complexity of handling image data, this area is still in its infancy and is an active research area. One of the major characteristics of image database systems is that they deal with inherently fuzzy data. Comparisons are based on similarity rather than exact matching [17]. The result of a query in such systems needs to be a set of elements (images) and their corresponding ranks (of how similar they are). The query result is a fuzzy set where each element's membership value determines its rank. This is unlike traditional database systems, where the data is exact, users pose exact match queries and the result obtained is a set of objects. Content Based Image Retrieval (CBIR) systems, mostly developed in recent times, have used different approaches to deal with such databases. The architecture of each system is different and they use some form of visual query language. We identify the following three types of CBIR systems.
- Systems that use visual features (color, texture, etc.) and visual query languages (using a graphical user interface) [8].
- Systems that use semantic features (nose, eye, mouth, etc.) and textual query languages [2].
- Systems that use a combination of the above two [15].
With the current state of the art (in image processing and computer vision), extracting "things" from images is a very difficult problem. Thus an important issue in CBIR systems is to map low level features such as color and texture ("stuffs") to high level concepts such as sunset and mountain ("things"). Towards this aim, we have proposed and are building a system with a four level schema architecture to model image data [12]. The first (lowest) layer is the image representation layer, which stores the raw image data. After processing the images, the necessary components and features are extracted and stored at the image feature layer. The third layer is the semantic layer, which includes the system provided semantics such as similarity and spatial functions. The last layer, the user semantic layer, provides a framework for the user to interact with the lower layers and define semantics. Thus our proposed modeling schema allows the user to map "stuffs" to "things" interactively. We need a suitable data definition and query language to implement a CBIR system based on our architecture. It should be capable of incorporating future advances in image processing technology into the CBIR system. Image databases need to store fuzzy data and process queries (which may not be precisely defined). There has been prior research on, and prototype systems for, incorporating fuzzy concepts into relational and object oriented databases. Umano proposed FREEDOM-O, a fuzzy database system which is an extension of Codd's relational model of data [19]. This system supports a fuzzy data model and fuzzy querying. The user can define and store fuzzy sets, and the result of a query can be fuzzy. The value "young" can be stored for age, as well as numbers for the same attribute. The system can retrieve all "young" people with the help of the user defined predicate "$young". Other researchers have proposed similar fuzzy extensions to the relational model [16, 21, 20, 10].
There are fuzzy extensions to the object oriented data model, such as that of Bordogna, who also defines fuzzy inheritance, fuzzy attributes, etc. [1]. To deal with the absence of a standard object oriented query language, ODMG has proposed OQL as a standard query language. Consequently, there have been attempts to introduce fuzzy concepts into ODMG OQL [11, 5]. The MOQL system enables fuzzy query specification, but the query results are not fuzzy [11]. Connan has introduced fuzzy concepts for querying multimedia databases [5]. None of the query languages developed or proposed for existing CBIR systems deals directly with the inherent fuzzy nature of the data. Towards this end, we propose a fuzzy object query language, FOQL, in this paper. FOQL is being developed for querying CHITRA, the content based image retrieval system we are developing, which incorporates our four level data model. We plan to use Fagin's results in [6, 7] to process fuzzy boolean queries and fuzzy weighted boolean queries. Our plan is to develop heuristics to deal with the proven complexity of query processing [6]. The data definition constructs of FOQL enable us to define the schemas for the bottom three layers. Its query constructs are capable of querying and defining high level concepts of the top layer. It is a "pure" fuzzy query language, since both the query specification and the result are fuzzy. Thus FOQL avoids the impedance mismatch between the query language and the data. To the best of our knowledge, this is the first attempt at defining a formal fuzzy object query language. We chose an object oriented approach, since object oriented models are capable of modeling complex image data. Our approach is to extend the current de facto standard query language, OQL of ODMG [14]. The following are the salient features of FOQL.
- FOQL constructs enable users to formulate queries in a more natural way and handle the underlying problems in CBIR systems.
- The named query definition construct provides a framework to map low level features to high level concepts. Most of the existing CBIR systems, such as QBIC [8], do not allow users to save their queries and invoke them later. Typically, a user has to repeat a tedious procedure to pose the same query to the database again. Users would instead like to save their queries and invoke them later when necessary. FOQL supports this functionality using the named query definition construct. It also allows users to define concept predicates (such as defining "mountains" as a predicate in terms of low level features). The concept predicates provide greater power to the query language.
- The recursive and named query definitions in FOQL enable users to refine their queries and high level concept definitions interactively.
- FOQL allows users to provide weights to the predicates using a weighted predicate construct. This enables users to express their preferences for the feature values explicitly.
- FOQL provides an explicit mechanism for controlling the number of images output or displayed, using the threshold and index collection constructs. The threshold construct limits the number of images using the membership values, whereas the index collection expression uses the rank of the images.
- FOQL is a superset of OQL, and hence users do not need to learn a completely new language.
- Since a visual query language interface has to be provided in CBIR systems, it should be possible to easily map the query language into a visual query language. Since FOQL is based on the ODMG standard, it can be readily mapped into ODMG-compliant visual query languages [4].
The rest of the paper is organised as follows. The next section presents our Fuzzy Object Model, which is a fuzzy extension to the ODMG Object Model. We then describe the Fuzzy Object Definition Language, which enables definition of the object database schema. This language is essentially the same as the ODMG Object Definition Language, but uses our Fuzzy Object Model. In Section 4, we describe FOQL using an example image database. We describe only the important constructs of the language, and demonstrate their necessity and power in the context of image databases. A summary is contained in the last section.
2 Fuzzy Object Model
We incorporate fuzzy constructs to handle fuzzy data into the ODMG Object Data Model [3]. This choice allows us to use OQL as a basis for defining FOQL. Our idea is not to arrive at a full-fledged generic fuzzy object oriented data model [1]. The aim is to provide only the necessary fuzzy extensions to the object data model, which are enough to handle the fuzzy information encountered in image databases. For example, we do not deal with fuzzy integers (and fuzzy arithmetic), but we support fuzzy sets of objects (the objects could also be integers) and the corresponding set operations. This enables us to keep the data model as well as the query language simple, yet powerful enough to represent and manipulate fuzzy image data. We next describe each construct of the ODMG Object Data Model, followed by the corresponding fuzzy extension. The ODMG Object Model supports two kinds of types: built-in and user defined. The ODMG provides a set of built-in types, and the user can use them later to define the types necessary for a particular application. The Fuzzy Object Model extends the built-in types to support fuzzy
data.
Figure 1: Graphical Representation of Schema for the Example Image Database (object types Img-col, Image and Img-comp; relationships has and isin between Img-col and Image, contains and belongsto between Image and Img-comp; the legend distinguishes one-to-one, one-to-many and many-to-many relationships)
The built-in types in general are classified into two groups: literals (with no oids) and objects (with oids). The ODMG Object Model supports three kinds of literal type: atomic literal, collection literal and structured literal. In addition to these, our Fuzzy Object Model supports the following fuzzy literal types.
- Fuzzyboolean atomic literal: a literal of this type can take a value in the range 0.0 to 1.0.
- Fuzzy collection literal: a fuzzy set of objects is a (usual) set of objects, with each object having an associated membership value. This value, in the range 0.0 to 1.0, represents the degree to which the object belongs to the fuzzy set. Thus, a fuzzy set of objects is essentially a set of (object, value) pairs. A Fuzzyset is a set of literals along with a membership value for each literal. For example, fuzzyset((1, 0.3), (2, 0.7)) has two elements. Fuzzybag and fuzzylist are defined similarly.
The ODMG model supports two types of objects: atomic objects and collection objects (set, bag, list and array). The Fuzzy Object Model does not have any built-in atomic object types. Atomic object types are defined by users, as in the ODMG Object Model. We extend the collection object types in our Fuzzy Object Model by supporting the notion of fuzzy collection objects: Fuzzyset, Fuzzybag and Fuzzylist. The definition of these collection objects is similar to the above definition of collection literals, except that the collection objects have identity. Hence, we do not elaborate further in this regard. Moreover, the ODMG Object Model supports the concept of extent to provide a framework for user defined collection object types. Our Fuzzy Object Model supports the notion of fuzzy extent to support the fuzzy data sets in the system. A fuzzy extent of a type is the set of all (fuzzy object) instances of the type within a particular database.
In this paper, we use collection(t) to refer to any of set(t), list(t), array(t) or bag(t) where t is a type. Similarly fuzzycollection(t) is used.
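As an illustration only, a fuzzy set of the kind described above can be sketched in Python as a mapping from element to membership value. The class name FuzzySet and its methods are hypothetical and are not part of FOQL. The use of maximum for union and minimum for intersection is the common fuzzy set convention and is an assumption here, since this section does not fix the set operations.

```python
class FuzzySet:
    """Hypothetical sketch: a set of (element, membership) pairs."""

    def __init__(self, pairs=()):
        self.members = {}
        for element, value in pairs:
            if not 0.0 <= value <= 1.0:
                raise ValueError("membership must lie in [0.0, 1.0]")
            # Keep the highest membership if an element is repeated.
            self.members[element] = max(value, self.members.get(element, 0.0))

    def membership(self, element):
        # Elements that are absent have membership 0.0.
        return self.members.get(element, 0.0)

    def union(self, other):
        # Assumed convention: membership in the union is the maximum.
        keys = set(self.members) | set(other.members)
        return FuzzySet((k, max(self.membership(k), other.membership(k))) for k in keys)

    def intersection(self, other):
        # Assumed convention: membership in the intersection is the minimum.
        keys = set(self.members) & set(other.members)
        return FuzzySet((k, min(self.membership(k), other.membership(k))) for k in keys)


a = FuzzySet([(1, 0.3), (2, 0.7)])  # the fuzzyset((1, 0.3), (2, 0.7)) example
b = FuzzySet([(2, 0.4), (3, 0.9)])
print(a.union(b).members)
print(a.intersection(b).members)
```

A fuzzy bag or fuzzy list could be sketched the same way with a list of pairs in place of the dictionary, so that duplicates and order are preserved.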
3 Fuzzy Object Definition Language (FODL)
The Object Definition Language (ODL) is a specification language with which object types (their attributes, relationships and methods) can be defined. This is similar to the Data Definition Language in a traditional database management system. Our FODL is essentially the same as the ODMG ODL, but uses the Fuzzy Object Data Model described above. In the following, we provide an example usage of FODL by defining a small image database. We do not elaborate all the details of the FODL, and refer the reader to the ODMG literature [3].
3.1 Definition of an Example Image Database in FODL
We illustrate the FODL by defining the image database schema shown in Figure 1. It consists of three object types: Img-col, Image and Img-comp. The Img-comp type is defined for image component objects, the Image type for image objects and the Img-col type for collections of images. In the figure, the relationship types are shown by lines. The schema in the figure shows that an Img-col object is a collection of Image objects. Each Image object contains several Img-comp objects. This schema can be defined as follows in our FODL.
     1. interface Img-col
     2. ( extent Img-cols
     3.   key name)
     4. {
     5.   attribute String name;
     6.   relationship Set<Image> has
     7.     inverse Image::isin;
     8. };
     9. interface Image
    10. ( extent Images
    11.   key imagename)
    12. {
    13.   attribute String imagename;
    14.   attribute Blob rawimage;
    15.   attribute Matrix color;
    16.   attribute Matrix texture;
    17.   attribute String phototaken;
    18.   attribute String placetaken;
    19.   relationship Img-col isin
    20.     inverse Img-col::has;
    21.   relationship Set<Img-comp> contains
    22.     inverse Img-comp::belongsto;
    23.   Fuzzyboolean colormatch(in Image img);
    24.   Fuzzyboolean texturematch(in Image img);
    25.   Fuzzyboolean similarto(in Image img);
    26. };
    27. interface Img-comp
    28. ( extent Img-comps
    29.   key imgobjno)
    30. {
    31.   attribute String imgobjno;
    32.   attribute Matrix color;
    33.   attribute Matrix texture;
    34.   attribute Matrix shape;
    35.   relationship Image belongsto
    36.     inverse Image::contains;
    37.   Fuzzyboolean colormatch(in Img-comp imgcomp);
    38.   Fuzzyboolean texturematch(in Img-comp imgcomp);
    39.   Fuzzyboolean shapematch(in Img-comp imgcomp);
    40.   Fuzzyboolean similarto(in Img-comp imgcomp);
    41. };
Line 1 defines Img-col as an object type. Line 2 defines the extent of the object type Img-col, and line 3 its key. In our example image database, the intuition behind the definition of Img-col is that a collection of images can be classified into various groups, such as "Flower" and "Animal". Each such collection has its own properties, which are defined by the type Img-col. Each collection is related to a large number of images through the relationship has (line 6).
Similarly, line 7 defines the inverse relationship of the images with their respective collection. Important properties of an object are its methods. In line 23, the method colormatch() is defined to compare images based on color similarity. The methods texturematch() and similarto() implement texture matching and overall similarity matching. These methods return a value of type Fuzzyboolean. In our example database, we assume that an image can be processed and its interesting components (such as homogeneous regions of color and texture) can be extracted. These image components are called image objects in image databases, but we refer to them as image components in order to distinguish them from the instances of the type Image. The object type Img-comp is defined to store information about image components. Each image may have many image components. With each image component, we store color, texture and shape features.
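The schema only fixes the signature of the matching methods, not how they compute their result. As a rough sketch of what a method returning a Fuzzyboolean might look like, the Python below models a cut-down Image type whose colormatch uses histogram intersection. The histogram representation and the intersection measure are assumptions made for illustration; the schema above stores color as a Matrix and does not say how similarity is computed.

```python
from dataclasses import dataclass, field


@dataclass
class Image:
    """Hypothetical, simplified counterpart of the Image interface."""
    imagename: str
    # Assumption: color is a normalised histogram whose bins sum to 1.0.
    color: list = field(default_factory=list)

    def colormatch(self, other):
        """Return a fuzzy boolean in [0.0, 1.0] via histogram intersection."""
        overlap = sum(min(a, b) for a, b in zip(self.color, other.color))
        return min(1.0, overlap)


rose = Image("rose.gif", [0.5, 0.25, 0.25])
sky = Image("sky.gif", [0.0, 0.25, 0.75])
print(rose.colormatch(rose))
print(rose.colormatch(sky))
```

Any other similarity measure with values in the range 0.0 to 1.0 could be substituted without changing how the result is used in queries.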
4 Fuzzy Object Query Language
FOQL is an extension of OQL that can deal with the fuzzy object model and fuzzy data (which is an inherent characteristic of image data). All OQL queries are supported; we have extended the semantics of many OQL constructs and introduced new constructs. FOQL is a typed expression language similar to OQL (an expression returns a result which can be an object or a literal, and each expression in FOQL has a type). In the following, we introduce a few important constructs supported by FOQL, and show how they are used for querying image databases. We refer the reader to the technical report [13] (accessible on the WWW) for the detailed syntax and semantics of the language. We use the above example image database to illustrate the use, power and semantics of the constructs. Though the main motivation for designing FOQL is the nature of multimedia data and querying, it can be used in many other fuzzy applications. In a later section, we illustrate the power of FOQL for other applications using an employee database example.
4.1 Select-From-Where
Select-from-where is the main construct for querying, and is capable of specifying any user requirement. We first give the formal syntax and semantics of the select-from-where query expression. Various parts of the query expression, such as result construction, thresholding and predicate computation, are then explained with examples. We also describe the evaluation of simple predicates and weighted predicates. The general form of a select statement is as follows.
    select [distinct] [[thold]] f(x_1, x_2, ..., x_n, x_{n+1}, x_{n+2}, ..., x_{n+p})
    from x_1 in e_1(x_{n+1}, x_{n+2}, ..., x_{n+p})
         x_2 in e_2(x_1, x_{n+1}, x_{n+2}, ..., x_{n+p})
         x_3 in e_3(x_1, x_2, x_{n+1}, x_{n+2}, ..., x_{n+p})
         ...
         x_n in e_n(x_1, x_2, ..., x_{n-1}, x_{n+1}, ..., x_{n+p})
    [where p(x_1, x_2, ..., x_n, x_{n+1}, x_{n+2}, ..., x_{n+p})]
    [order by f_1(x_1, x_2, ..., x_{n+p}), f_2(x_1, x_2, ..., x_{n+p}), ..., f_q(x_1, x_2, ..., x_{n+p})]
Here $x_{n+1}, \ldots, x_{n+p}$ are free variables (that have to be bound to evaluate the query). The $e_i$ have to be of type collection or fuzzycollection, $p$ has to be of type boolean or fuzzyboolean, and the $f_i$ have to be of a sortable type, i.e., an atomic type. The result of the query is a collection(t) or fuzzycollection(t). The from clause can take one of three forms: from $x_1$ in $e_1(x_{n+1}, \ldots, x_{n+p})$, from $e_1(x_{n+1}, \ldots, x_{n+p})$ as $x_1$, or from $e_1(x_{n+1}, \ldots, x_{n+p})$ $x_1$. Assuming $x_{n+1}, x_{n+2}, \ldots, x_{n+p}$ are bound to $X_{n+1}, X_{n+2}, \ldots, X_{n+p}$, the query is evaluated as follows.
1. The result of the from clause is a bag or fuzzybag of elements of type struct($x_1: X_1, x_2: X_2, \ldots, x_n: X_n$), where $X_1$ ranges over bagof or fuzzybagof($e_1(X_{n+1}, X_{n+2}, \ldots, X_{n+p})$), $X_2$ ranges over bagof or fuzzybagof($e_2(X_1, X_{n+1}, X_{n+2}, \ldots, X_{n+p})$), ..., and $X_n$ ranges over bagof or fuzzybagof($e_n(X_1, X_2, \ldots, X_{n-1}, X_{n+1}, \ldots, X_{n+p})$). The parameter of bagof() or fuzzybagof() should be of the corresponding collection type. These functions convert collection types to bag and fuzzybag, respectively.
2. The result of the from clause is filtered, retaining only the elements that satisfy the where predicate $p(X_1, X_2, \ldots, X_{n-1}, X_n, X_{n+1}, X_{n+2}, \ldots, X_{n+p})$, i.e., those for which it returns true if boolean, or a non-zero value if fuzzyboolean. The rules for the evaluation of weighted predicates (of type fuzzyboolean) are given in a later section.
3. If a threshold [thold] has been specified, the elements with membership value less than "thold" are filtered out. The default value of "thold" is 0.0.
4. The result is transformed into a corresponding list after sorting it according to the function $f_1$ specified in order by. The elements having the same $f_1$ value are then sorted according to their $f_2$ value, and so on.
5. The function $f(X_1, X_2, \ldots, X_{n-1}, X_n, X_{n+1}, \ldots, X_{n+p})$ is applied to each element of the result. If the specified function $f$ is *, this step does nothing. If the result elements are not atomic, this function creates a new type.
6. If "distinct" has been specified, duplicates are eliminated and a corresponding set or list is obtained. Elimination of duplicates from a fuzzycollection(t) is described later.
The result of the "select-from-where" construct is a collection or a fuzzycollection literal. If the from clause specifies a fuzzycollection, or any expression in the where clause is fuzzyboolean, then the result is of type fuzzycollection(t). Otherwise it is a collection(t). The type t is determined by the result construction function $f$. "order by" results in a list (or fuzzylist), and "distinct" without "order by" results in a set (or fuzzyset); if neither is specified, we get a bag (or fuzzybag). The select-from-where expression in FOQL works in basically the same way as that of OQL. The special circumstances of dealing with fuzzy data types are discussed below in terms of an example query.
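The evaluation steps can be sketched in Python as a small pipeline over a fuzzy bag. This is a simplified sketch under stated assumptions, not the FOQL implementation: the function name select and its parameters are hypothetical, the from clause is assumed to have been expanded already into (binding, membership) pairs, the membership of a binding is combined with the where predicate by minimum, and duplicate elimination (step 6) is left out.

```python
def select(from_bag, where, project, thold=0.0, order_by=None):
    """Evaluate steps 1 to 5 over a list of (binding, membership) pairs.

    where(binding) returns a boolean or a fuzzy value in [0.0, 1.0];
    project(binding) plays the role of the result function f.
    """
    result = []
    for binding, mu in from_bag:
        # Step 2: a boolean result is coerced to 1.0 or 0.0 by float().
        value = min(mu, float(where(binding)))
        # Step 2 keeps non-zero values; step 3 applies the threshold.
        if value > 0.0 and value >= thold:
            result.append((binding, value))
    # Step 4: optional ordering, given the binding and its membership value.
    if order_by is not None:
        result.sort(key=lambda pair: order_by(pair[0], pair[1]))
    # Step 5: apply the result construction function.
    return [(project(binding), value) for binding, value in result]


rows = [
    ({"name": "a.gif", "score": 0.9}, 1.0),
    ({"name": "a.gif", "score": 0.85}, 1.0),
    ({"name": "b.gif", "score": 0.4}, 1.0),
]
print(select(rows, lambda b: b["score"], lambda b: b["name"], thold=0.8))
```

In the usage above, the hypothetical score field stands in for the value a similarity method would return for each binding.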
Query 1 Find all the images from the image collection "Flower" that have color similar to the example image "flower1.gif" and contain an image component similar to the example image component rose (whose imgobjno is "01"). The result must be distinct and contain only those images that are ranked (have membership value) greater than 0.8.
    select distinct [0.8] I.imagename
    from Img-cols R, I in R.has, O in I.contains, Image J, Img-comps K
    where R.name = "Flower" and I.colormatch(J) and O.similarto(K)
          and J.imagename = "flower1.gif" and K.imgobjno = "01"
Type Coercion
We may encounter expressions involving incompatible types, such as boolean and fuzzyboolean. We introduce type coercion (conversion of one type to another) into FOQL to make such expressions valid. There are two kinds of type coercion: boolean and collection. In boolean type coercion, the boolean value "true" is converted to the fuzzy boolean value 1.0, and "false" to 0.0. Similarly, in collection type coercion, a collection(t) is converted to its respective fuzzycollection(t), with the membership value of each element set to 1.0. In Query 1, the boolean value returned by the predicate R.name = "Flower" is converted into a fuzzy boolean value to make it compatible.
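The two coercion rules are simple enough to state as code. The following Python is a minimal sketch; the function names coerce_boolean and coerce_collection are hypothetical, and a fuzzy collection is represented here as a list of (element, membership) pairs.

```python
def coerce_boolean(value):
    """Map true to 1.0 and false to 0.0; pass fuzzy values through."""
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    return float(value)


def coerce_collection(items):
    """Convert a collection into a fuzzy collection with membership 1.0."""
    return [(item, 1.0) for item in items]


name_matches = ("Flower" == "Flower")  # an ordinary boolean predicate
# After coercion it can be combined with a fuzzy predicate value such as 0.7.
print(min(coerce_boolean(name_matches), 0.7))
print(coerce_collection(["img1", "img2"]))
```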
Constructing Results
The result of a FOQL query is always a collection (a bag, set or list) or a fuzzycollection (a fuzzybag, fuzzyset or fuzzylist). The type of each element of this (fuzzy)collection depends on the projection attributes (the specified function f) in the select clause. For example, if the select clause has an atomic value of type t, the query returns a fuzzycollection(t) or a collection(t). Query 1, which specifies distinct, returns a fuzzyset of image names; without distinct it would return a fuzzybag. We refer the reader to [13] for further details.
Thresholding
By default all elements with membership value greater than 0.0 are included in the result fuzzy collection. In CBIR systems, often users want to treat membership values less than a certain threshold as 0.0. FOQL enables this by threshold queries. For example, the Query 1 result included only those elements that have membership value greater than 0.8.
Computation of the Membership Value
The membership value of an element in the fuzzycollection result depends on the constraints specified in the where clause. The computation of the membership value is done in two phases: predicate computation and predicate combination.
Predicate Computation: The operands of predicates may involve literals, path expressions
and functions. If the predicate involves literal comparisons using relational operators, the fuzzy boolean value is readily computed. If it involves a function, the corresponding function returns the fuzzy boolean value. If the predicate involves a path expression the evaluation is more complicated. We propose an algorithm to evaluate the fuzzy value for path expressions. The algorithm is illustrated using Query 1 as follows.
In Query 1, R, I, O, J and K are the collection (set) variables. These are converted into fuzzy sets (setting the membership value to 1.0 for each element), as the type of Query 1 is fuzzycollection. The following algorithm evaluates the predicate "O.colormatch(K)".
1. Extend the path expression as much as possible.
    O.colormatch(K) -> I.contains.colormatch(K)
    I.contains.colormatch(K) -> R.has.contains.colormatch(K)
    R.has.contains.colormatch(K) -> Img-cols.has.contains.colormatch(K)
2. Find the fuzzy value returned by the colormatch(K) function. Assume this value to be 0.8 in this example.
3. For each collection in the path expression, find the membership value for the instance. In our example, the collections, their iteration variables and the membership values are as follows.
    Collection              Variable   Membership value
    Img-cols                R          1.0
    Img-cols.has            I          1.0
    Img-cols.has.contains   O          1.0
4. Return the fuzzy AND (minimum) of the fuzzy values obtained above. In our example, the final fuzzy value returned is 0.8. The computation is shown in Figure 2.
We consider a general path expression $P_n. \cdots .P_2.P_1.P_0$, where $P_0$ is an elementary (function or literal) component and $P_1, P_2, \ldots$ are fuzzy sets. (The number of fuzzy sets (collections) involved in the path expression may not be equal to the number of identifiers or variables that appear in it.) Let the fuzzy value returned by the elementary component $P_0$ be $\mu_0(elem)$, and let $\mu_i(o)$ denote the membership value of an element $o$ in the $i$th fuzzy set, for $i = 1, 2, \ldots, n$. The fuzzy value of the path expression is given by
    $pathvalue = f(\mu_0(elem), \mu_1(O_1), \ldots, \mu_n(O_n))$.
Here $f$ is a "fuzzy logical AND" function, and we use the commonly used minimum function [6]. Thus the pathvalue is the minimum of the membership values of the objects appearing in the path expression and the value returned by the elementary predicate.
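The path value computation reduces to a minimum over the elementary predicate value and the membership values along the path. A minimal Python sketch follows; the function name path_value is hypothetical.

```python
def path_value(elementary_value, memberships):
    """Fuzzy AND (minimum) of the elementary predicate value and the
    membership values of the objects along the path expression."""
    return min([elementary_value, *memberships])


# Query 1 example: colormatch(K) returns 0.8, and the instances bound to
# R, I and O each have membership value 1.0 in their collections.
print(path_value(0.8, [1.0, 1.0, 1.0]))  # 0.8
```

If any collection along the path were itself fuzzy, for example because it is the result of an earlier fuzzy query, a membership value below 1.0 would lower the path value accordingly.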
Figure 2: Atomic Predicate Computation (the figure lists the (fuzzy)collections Im-col, Im-col.has and Im-col.has.contains with iteration variables R, I and O, each with membership value 1, and the predicate Im-col.has.contains.colormatch() with value 0.8)
Figure 3: Predicate Combination using AND-OR-NOT Graph (AND nodes over the leaf predicates R.name = "Flower" (1), I.colormatch(J) (0.7), O.colormatch(K) (0.8), K.imgobjno = "01" (1) and J.imagename = "sky.gif" (1); the legend distinguishes logical operators (AND, OR, NOT) from predicates)
Predicate Combination: Computation of the fuzzy value associated with each atomic predicate was described above. The fuzzy binary boolean operators and and or can be used to combine atomic predicates in FOQL, resulting in complex predicates. Such predicates can be represented using an AND-OR-NOT graph, as shown in Figure 3. The min-max-complement functions are used on the AND-OR-NOT graph to evaluate the membership value of each element of the result set [6]. Figure 3 shows the AND-OR-NOT graph for our example query. The final membership value in our example is min(1.0, 0.7, ...) = 0.7. Note that the AND-OR-NOT graph has two kinds of leaf nodes: exact matching nodes (predicates), which return 0.0 or 1.0, and similarity matching nodes, which return a fuzzy value in the range 0.0 to 1.0. Processing the exact matching predicates first may reduce the search space.
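Evaluation of an AND-OR-NOT graph with the min-max-complement functions can be sketched as a short recursive function. In this Python sketch the graph is written as nested tuples, which is a representation chosen for illustration only; the function name evaluate is hypothetical.

```python
def evaluate(node):
    """Evaluate an AND-OR-NOT graph given as nested tuples.

    A leaf is a fuzzy value in [0.0, 1.0] (or a boolean); an inner node is
    ("and", child, ...), ("or", child, ...) or ("not", child).
    """
    if isinstance(node, (bool, int, float)):
        return float(node)
    op, *children = node
    values = [evaluate(child) for child in children]
    if op == "and":
        return min(values)       # fuzzy AND
    if op == "or":
        return max(values)       # fuzzy OR
    if op == "not":
        return 1.0 - values[0]   # fuzzy complement
    raise ValueError(f"unknown operator: {op}")


# Leaf values for the example query: exact predicates give 1.0,
# the two similarity predicates give 0.7 and 0.8.
query1 = ("and", 1.0, 0.7, 0.8, 1.0, 1.0)
print(evaluate(query1))  # 0.7
```

An evaluator that first checks the exact matching leaves could stop early whenever one of them returns 0.0 under an AND node, which is one way to exploit the observation about reducing the search space.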
Distinct
In general, the query result is a fuzzy bag (fuzzy multiset). The result of Query 1 without "distinct" may contain the same image name several times with different membership values (an image may contain more than one image component, and each image component may satisfy the predicate "O.colormatch(K)" with a different value). If a user wants a fuzzy set returned rather than a fuzzy bag, "distinct" can be specified as in Query 1. The membership values of the multiple appearances of each element are then fuzzy-ORed to arrive at the membership value of the unique appearance of the element.
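Duplicate elimination by fuzzy OR can be sketched as follows in Python. The sketch assumes that fuzzy OR is the maximum, in line with the min-max-complement functions used for predicate combination; the function name distinct and the sample image names are hypothetical.

```python
def distinct(fuzzy_bag):
    """Collapse a fuzzy bag of (element, membership) pairs into a fuzzy
    set, taking the fuzzy OR (maximum) over repeated elements."""
    merged = {}
    for element, value in fuzzy_bag:
        merged[element] = max(value, merged.get(element, 0.0))
    return merged


# The same image name appears once per matching image component.
bag = [("rose.gif", 0.6), ("rose.gif", 0.9), ("lily.gif", 0.85)]
print(distinct(bag))  # {'rose.gif': 0.9, 'lily.gif': 0.85}
```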
User Preferences Using Weighted Predicates
Existing content based retrieval systems enable users to specify preferences for the features to be used in comparison. This is accomplished by the user interface obtaining weights (of features) from the user [9]. If a user wants to retrieve images based on color and texture similarity, but wants 80% of the weight assigned to the color feature and 20% to the texture feature in comparisons, the FOQL query is as follows.
Query 2 Find all the images from the database similar to the example image "sunset.gif" with preference 0.8 to color and 0.2 to texture.
    select I.imagename
    from Img-cols R, I in R.has, Image J
    where [0.8] I.colormatch(J) and [0.2] I.texturematch(J) and J.imagename = "sunset.gif"
Note that the sum of the weights assigned to the atomic predicates connected by and must be equal to 1. So, we do not need to explicitly specify the weight [0.0] for the predicate J.imagename = "sunset.gif" in Query 2. If we simply multiplied the weight by the fuzzy value returned by the atomic predicate, the predicate J.imagename = "sunset.gif" with weight 0.0 would always give the fuzzy value 0.0. The query result would then be an empty fuzzy set, which is not what one would expect. FOQL thus evaluates the predicates first, without considering the weights associated with them. A weighted predicate P having m atomic predicates is evaluated as follows, using the algorithm proposed by Fagin [7]. Let $\Theta = (\theta_1, \theta_2, \ldots, \theta_m)$ represent the set of weights given to the atomic predicates. Each atomic predicate is evaluated as above, and the predicates are sorted according to their weights ($\theta_1 \geq \theta_2 \geq \cdots \geq \theta_m$). Let $X = (x_1, x_2, \ldots, x_m)$ represent the set of fuzzy values returned by the sorted atomic predicates.
Case 1: The atomic predicates are connected by the or operator. The fuzzy value of the predicate P is the maximum of the m values, $f(X) = \mathrm{Max}(X)$. For example, if $X = \{0.2, 0.5, 1.0\}$, then $f(X) = 1.0$.
Case 2: The atomic predicates are connected by the and operator, as in Query 2. The fuzzy value of the predicate P is $f(\Theta, X)$, given by
    $f(\Theta, X) = m \cdot \theta_m \cdot \mathrm{Min}(X) + \sum_{i=1}^{m-1} i \cdot (\theta_i - \theta_{i+1}) \cdot \mathrm{Min}(x_1, \ldots, x_i)$.
Thus in our case, $\Theta = (0.8, 0.2, 0.0)$ and $X = (0.5, 0.3, 1.0)$. The fuzzy value of the complex predicate is computed as
    $f(\theta_1, \theta_2, \theta_3, x_1, x_2, x_3) = (\theta_1 - \theta_2) \cdot \mathrm{Min}(x_1) + 2 \cdot (\theta_2 - \theta_3) \cdot \mathrm{Min}(x_1, x_2) + 3 \cdot \theta_3 \cdot \mathrm{Min}(x_1, x_2, x_3)$
    $= (0.8 - 0.2) \cdot \mathrm{Min}(0.5) + 2 \cdot (0.2 - 0.0) \cdot \mathrm{Min}(0.5, 0.3) + 3 \cdot 0.0 \cdot \mathrm{Min}(0.5, 0.3, 1.0) = 0.42$.
Note that we need to process this only if $\mathrm{Min}(X) \neq 0.0$.
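The weighted and computation for Case 2 can be checked with a few lines of Python. This is a sketch of the formula as stated in the text, not of any particular system; the function name weighted_and is hypothetical, and the sorting step pairs each weight with its predicate value before ordering by non-increasing weight.

```python
def weighted_and(weights, values):
    """Weighted fuzzy AND: m*theta_m*Min(X) plus the sum over i < m of
    i*(theta_i - theta_{i+1})*Min(x_1..x_i), with weights summing to 1."""
    # Sort the (weight, value) pairs by non-increasing weight.
    pairs = sorted(zip(weights, values), key=lambda p: p[0], reverse=True)
    thetas = [w for w, _ in pairs]
    xs = [x for _, x in pairs]
    m = len(xs)
    total = m * thetas[-1] * min(xs)
    for i in range(1, m):
        total += i * (thetas[i - 1] - thetas[i]) * min(xs[:i])
    return total


# Query 2 example: weights (0.8, 0.2, 0.0) and values (0.5, 0.3, 1.0).
print(round(weighted_and([0.8, 0.2, 0.0], [0.5, 0.3, 1.0]), 2))  # 0.42
```

With equal weights the function reduces to the plain minimum, and a predicate with weight 0.0 no longer forces the result to 0.0, which is the behaviour the text asks for.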
Order By Operator
The user may want to display the query results in an ordered way, such as with the most relevant images appearing first. This requires the result (a fuzzy bag or a fuzzy set) to be converted into an ordered list (fuzzy list). For example, the result of Query 2 can be ordered by rank as follows.
    select *
    from ( select I.imagename
           from Img-cols R, I in R.has, Image J
           where ([0.8] I.colormatch(J) and [0.2] I.texturematch(J) and J.imagename = "sunset.gif")) as G
    order by value(G)
The result of the query is an ordered list, with elements having higher membership value appearing first. We have introduced the reserved word "value", which refers to the membership value of the element. In general, if e1 is a fuzzy collection expression and x is an iteration variable over e1, then value(x) is a valid expression which returns the fuzzy value associated with x.
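Ordering by membership value amounts to a sort on the second component of each (element, membership) pair. The Python sketch below assumes that representation; the function name order_by_value and the sample data are hypothetical.

```python
def order_by_value(fuzzy_collection):
    """Return a fuzzy list ordered by membership value, highest first."""
    return sorted(fuzzy_collection, key=lambda pair: pair[1], reverse=True)


result = [("a.gif", 0.42), ("b.gif", 0.9), ("c.gif", 0.65)]
print(order_by_value(result))
```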
4.2 Named Query Definition
The second construct we choose to describe is the named query definition, which enables the definition of functions. The syntax for the named query definition is
    define id(x_1, x_2, ..., x_n) as e(x_1, x_2, ..., x_n)
where id is an identifier, e is an FOQL expression and $x_1, x_2, \ldots, x_n$ are free variables in the expression e. This query defines a function id as the expression e, with the specified parameters. This construct supports a very useful query mechanism that is not supported in other languages designed for CBIR systems. It helps us to map low level features such as color and texture into high level concepts. This is considered to be one of the most important issues to be addressed in image databases. Most of the existing content based retrieval systems do not allow users to save their queries and invoke them later. To the best of our knowledge, the first system that has such functionality is "Chabot" [15]. Typically, a user has to repeat a tedious procedure to pose the same query to the database again. Users would like the capability to save queries and invoke them later when necessary. FOQL supports such functionality using the query definition construct. In the existing CBIR systems, image retrieval is accomplished based on low level features such as color and texture. Instead, the user would prefer to pose queries in terms of high level concepts such as mountain and sunset. The high level concepts can be mapped into low level features by naming a query definition identifier after the appropriate concept.
Query 3 Suppose a user has arrived at the conclusion that retrieving images having image components similar to the example components "01" and "02" results in images of sunset. The user can then define the concept "sunset" as follows.
    define sunset as
    select distinct I.imagename
    from Img-cols R, I in R.has, O in I.contains, Img-comps J, Img-comps K
    where (O.colormatch(J) and J.imgobjno = "02")
       or (O.colormatch(K) and K.imgobjno = "01")

The two image components "01" and "02" may be supplied by the user through a graphical user interface. Once the query is defined, a user can pose the one-word query sunset to specify this requirement. To accomplish the same in the QBIC system [8], the user needs to repeat the same tedious procedure (query by example images). Similarly, the user can define other high level concepts such as mountain. These concept definitions, which are stored in the database, are essentially mappings from low level features to high level concepts. Because they are persistent, they can be invoked later. The user can then pose complex queries using (fuzzy) set expressions [13]. For example, the query expression sunset union mountain yields a fuzzy set of images that show a sunset or a mountain, whereas sunset intersect mountain yields a fuzzy set of images that show both.

Predicates in FOQL are the valid FOQL expressions of type boolean and fuzzyboolean. Users need predicates to define high level concepts, so that natural language can be easily translated into queries. The named query definition allows us to define predicates. As in the examples above, concept predicates map high level concepts to low level features. Predicates can be used in queries, enabling the user to pose more complex queries and thus giving the query language greater power. Concept predicates can be defined using example images, example image components and already defined predicates. The following examples illustrate this.

Query 4 The following defines a concept predicate "sun" based on the example image components "01" and "02".

    define sun(X Img-comp) as (X.similarto(&O1) or X.similarto(&O2))
Here &O1 and &O2 represent the object identities of the image components "01" and "02", respectively. These object identities are captured through user interfaces. Similarly, a concept predicate sea can be defined. In FOQL, new concept predicates can be defined using already defined predicates. We use the predicates "sun" and "sea" to define a new concept predicate "sea_with_sun" and use it in a query.

    define sea_with_sun(X Img-comp, Y Img-comp) as sea(X) and sun(Y)

    select I.imagename
    from Img-cols R, I in R.has, X in I.contains, Y in I.contains
    where sea_with_sun(X, Y) and X != Y
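The composition of concept predicates can be sketched in Python as follows. This is an illustration under stated assumptions rather than FOQL's evaluation strategy: fuzzy "or" and "and" are taken to be max and min (the standard Zadeh operators), duplicate image names are collapsed by keeping the highest membership, and the keys sim_O1, sim_O2 and sim_sea are hypothetical precomputed similarity scores.

```python
from typing import Dict, List

# Hypothetical representation of an image component: similarity scores
# to the example components "01" and "02" and to a "sea" example.
Component = Dict[str, float]


def fuzzy_or(*values: float) -> float:
    """Assumed semantics: fuzzy disjunction as maximum."""
    return max(values)


def fuzzy_and(*values: float) -> float:
    """Assumed semantics: fuzzy conjunction as minimum."""
    return min(values)


def sun(x: Component) -> float:
    """Counterpart of: define sun(X) as X.similarto(&O1) or X.similarto(&O2)."""
    return fuzzy_or(x["sim_O1"], x["sim_O2"])


def sea(x: Component) -> float:
    """Hypothetical 'sea' predicate backed by one similarity score."""
    return x["sim_sea"]


def sea_with_sun(x: Component, y: Component) -> float:
    """Counterpart of: define sea_with_sun(X, Y) as sea(X) and sun(Y)."""
    return fuzzy_and(sea(x), sun(y))


def query(images: Dict[str, List[Component]]) -> Dict[str, float]:
    """Evaluate sea_with_sun over all pairs of distinct components per image."""
    result: Dict[str, float] = {}
    for name, components in images.items():
        best = 0.0
        for i, x in enumerate(components):
            for j, y in enumerate(components):
                if i != j:  # counterpart of X != Y
                    best = max(best, sea_with_sun(x, y))
        if best > 0.0:
            result[name] = best
    return result


images = {
    "coast.gif": [
        {"sim_O1": 0.1, "sim_O2": 0.2, "sim_sea": 0.9},
        {"sim_O1": 0.8, "sim_O2": 0.6, "sim_sea": 0.1},
    ],
    "field.gif": [
        {"sim_O1": 0.3, "sim_O2": 0.1, "sim_sea": 0.2},
        {"sim_O1": 0.0, "sim_O2": 0.1, "sim_sea": 0.0},
    ],
}
result = query(images)
```

Because each predicate returns a membership value rather than a boolean, the new predicate is built from the existing ones without any thresholding, and the final result remains a fuzzy collection.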
4.3 Indexed Collection Expressions
The third construct we explain in the context of image querying is the indexed collection expression. The number of elements in a result may be large, and it may not be desirable to display all of them (especially in interactive mode). Many existing CBIR systems present a limited number of result elements (10 to 20 images) at a time. The user can then click to view more of the results. In FOQL we provide an explicit mechanism for controlling the number of elements output or displayed. This feature enables easy mapping of the corresponding graphical user interface construct into FOQL. The following example illustrates the syntax.

Query 5 Retrieve all the images from the image collection "Animal" that are similar to the given example image "tiger1.gif" based on the color feature. Display 10 images at a time.

We first define the required named query "tiger" and order the images by their ranks.

    define tiger as
    select distinct *
    from ( select I.imagename
           from Img-cols R, I in R.has, Image J
           where R.name = "Animal" and I.colormatch(J)
                 and J.imagename = "tiger1.gif" ) as G
    order by value(G)

The first 10 images can be retrieved with the simple query tiger[1:10]. If the user wants to view the next 10 images, he can click the "NEXT" button provided by the user interface, which can be readily translated into tiger[11:20].
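The paging behaviour of tiger[1:10] and tiger[11:20] can be sketched in Python. The sketch assumes that the index range is 1-based and inclusive at both ends, which is what the two example expressions suggest; the helper name index_range and the ranked list contents are hypothetical.

```python
from typing import List, Tuple

RankedList = List[Tuple[str, float]]


def index_range(ranked: RankedList, first: int, last: int) -> RankedList:
    """Return elements first..last of a ranked list.

    Assumption: positions are 1-based and inclusive, as in tiger[1:10].
    A range that runs past the end returns only the available elements.
    """
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    return ranked[first - 1:last]


# Hypothetical ordered result of the named query "tiger": 25 images
# with decreasing membership values.
tiger: RankedList = [
    (f"img{i:02d}.gif", round(1.0 - 0.03 * i, 2)) for i in range(25)
]

page1 = index_range(tiger, 1, 10)   # tiger[1:10]
page2 = index_range(tiger, 11, 20)  # tiger[11:20], the "NEXT" button
page3 = index_range(tiger, 21, 30)  # the last, partial page
```

A user interface only needs to remember the current range to translate each "NEXT" click into the following index expression.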
4.4 Other FOQL Expressions
In addition, FOQL supports many other useful expressions, including elementary expressions, construction expressions, atomic type expressions, object expressions, collection expressions, fuzzy set expressions and conversion expressions. Describing them would not shed further light on the basic ideas we are trying to convey, nor is it feasible within the space available. We refer the reader to the report [13], accessible on the WWW, for further details.
4.5 Another Example
Though FOQL was motivated by CBIR systems, it may be useful in other applications. Consider an employee database with attributes name, id, annual salary and age, where id is the key and employees is the extent. We classify employees into three fuzzy subsets based on age: young, middle and old. The membership functions of these fuzzy subsets are implemented by the methods young(), middle() and old(). Employees are also classified into three fuzzy subsets based on salary: low, middle and high. We use the following example queries on the employee database to demonstrate the power of FOQL for other applications. The following named queries define the predicates promote-able(X) (those who earn a high salary and are young) and nearly-retired(X).

    define promote-able(X) as X.high() and X.young()
    define nearly-retired(X) as X.age > 55

We can now use these definitions to retrieve all the employees who are about to retire, together with their chances of promotion, as follows.

    select *
    from ( select distinct name
           from Employee X
           where nearly-retired(X) and promote-able(X) ) as G
    order by value(G)

This shows that users can pose fuzzy queries in a more natural way, which is not supported by other existing query languages.
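The employee example can be traced with a short Python sketch. The piecewise-linear membership functions for young and high, the min rule for the fuzzy "and", and the employee records are all assumptions made for illustration; the text above does not specify them.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Employee:
    name: str
    age: int
    salary: float


def young(e: Employee) -> float:
    """Hypothetical membership: 1 up to age 25, falling linearly to 0 at 65."""
    if e.age <= 25:
        return 1.0
    if e.age >= 65:
        return 0.0
    return (65 - e.age) / 40


def high(e: Employee) -> float:
    """Hypothetical membership: 0 up to 60000, rising linearly to 1 at 100000."""
    if e.salary <= 60000:
        return 0.0
    if e.salary >= 100000:
        return 1.0
    return (e.salary - 60000) / 40000


def promote_able(e: Employee) -> float:
    """Counterpart of promote-able(X); fuzzy 'and' assumed to be min."""
    return min(high(e), young(e))


def nearly_retired(e: Employee) -> float:
    """Counterpart of nearly-retired(X): a crisp predicate, so 0 or 1."""
    return 1.0 if e.age > 55 else 0.0


def retiring_with_promotion_chance(staff: List[Employee]) -> List[Tuple[str, float]]:
    """Evaluate the where clause per employee and order by membership value."""
    scored = [(e.name, min(nearly_retired(e), promote_able(e))) for e in staff]
    kept = [(name, value) for name, value in scored if value > 0.0]
    return sorted(kept, key=lambda item: item[1], reverse=True)


staff = [
    Employee("Ann", 57, 95000),
    Employee("Bob", 60, 120000),
    Employee("Cy", 40, 110000),
]
ranked = retiring_with_promotion_chance(staff)
```

The crisp condition on age acts as a filter, while the fuzzy predicate supplies the membership value used for ranking, so one query mixes exact and fuzzy criteria.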
5 Summary
In this paper, we proposed a Fuzzy Object Query Language (FOQL) based on the ODMG Object Query Language (OQL). Our approach has been to extend the current de facto standard, OQL, to facilitate easy implementation and acceptance of FOQL. We first extended the Object Model of ODMG to include fuzzy data types, and proposed the Fuzzy Object Model. The Object Definition Language (ODL) is extended to the Fuzzy Object Definition Language (FODL) to enable users to define fuzzy data in the schema specification. The syntax and semantics of a few FOQL constructs were described through example queries on an example image database. We refer the reader to the report [13] for further details.

We have also described how the FOQL constructs help us to formulate queries and handle underlying problems in CBIR systems, such as mapping low level features to high level concepts. Query languages designed for image databases should enable easy mapping into a visual query language. Since FOQL is based on the ODMG standard, it can be readily mapped into ODMG-compliant visual query languages. Example queries were given to show the strength of FOQL in comparison with other SQL-like languages.

The inherent nature of the queries and data in image databases has been the main motivation in the design of this language. It can also be used in other fuzzy applications, such as the example employee database discussed. Many query languages have been proposed that deal with fuzzy data while keeping the query results non-fuzzy. In FOQL, however, both the query specification and the query result are fuzzy. Thus, FOQL is a "pure" fuzzy query language. To the best of our knowledge, this is the first attempt at defining a formal fuzzy object query language. Our future work includes the implementation of FOQL in our CBIR system. We will also address the problem of query processing and optimization in the image database context.
References
[1] G. Bordogna, D. Lucarella and G. Pasi. A fuzzy object oriented data model. In Proceedings of the Third IEEE Conference on Fuzzy Systems, IEEE World Congress on Computational Intelligence, pages 313-318. IEEE, June 26-29, 1994.
[2] Alfonso F. Cardenas, Ion Tim Ieong, Ricky K. Taira, Roger Barker and Claudine M. Breant. The knowledge-based object-oriented PICQUERY+ language. IEEE Transactions on Knowledge and Data Engineering, Volume 5, Number 4, pages 644-657, August 1993.
[3] R.G.G. Cattell. The Object Database Standard: ODMG-93, Release 1.2. Morgan Kaufmann, 1996.
[4] Manoj Chavda and Peter T. Wood. Towards an ODMG-compliant visual object query language. In Proceedings of the 23rd Very Large Data Base (VLDB) Conference, Athens, Greece, 1997.
[5] Connan F. and Rocacher D. Flexible querying and multimedia databases. Proceedings of the International Symposium on Soft Computing for Industry (ISSCI'96), Montpellier, France, 1996.
[6] Ronald Fagin. Combining fuzzy information from multiple systems. Proc. Fifteenth ACM Symp. on Principles of Database Systems, pages 216-226, Montreal, 1996.
[7] Ronald Fagin and Edward L. Wimmers. Incorporating user preferences in multimedia queries. Proc. 1997 International Conference on Database Theory, 1997.
[8] Myron Flickner, Harpreet Sawhney, Wayne Niblack, Jonathan Ashley, Qian Huang, Byron Dom, Monika Gorkani, Jim Hafner, Denis Lee, Dragutin Petkovic, David Steele and Peter Yanker. Query by image and video content: The QBIC system. Computer, Volume 28, Number 9, pages 23-32, September 1995.
[9] Amarnath Gupta. Visual information retrieval: A Virage perspective. Technical Report Revision 4, Virage Inc., 9605 Scranton Road, Suite 240, San Diego, CA 92121, 1997.
[10] Janusz Kacprzyk, Slawomir Zadrozny and Andrzej Ziolkowski. Fquery III+: A "human-consistent" database querying system based on fuzzy logic with linguistic quantifier. Information Systems, Volume 14, Number 6, pages 443-453, 1989.
[11] John Z. Li, M. Tamer Ozsu, Duane Szafron and Vincent Oria. MOQL: a multimedia object query language. In The Third International Workshop on Multimedia Information Systems, pages 19-28, Como, Italy, 1997.
[12] Surya Nepal, M.V. Ramakrishna and J.A. Thom. Four layer schema for image data modelling. In Chris McDonald (editor), Australian Computer Science Communications, Vol 20, No 2, Proceedings of the 9th Australasian Database Conference, ADC'98, pages 189-200, 2-3 February, Perth, Australia, 1998.
[13] Surya Nepal, M.V. Ramakrishna and James A. Thom. FOQL: fuzzy object query language. Technical Report TR-98-17, Department of Computer Science, RMIT, PO Box 2476V, Melbourne 3001, Australia, 1998. URL: http://www.cs.rmit.edu.au/~rama/TR-98-17.ps.
[14] ODMG-2.0. Draft: The Object Database Standard: ODMG-93, Release 2.0. http://wwwdb.informatik.uni-rostock.de/~jo/.
[15] Virginia E. Ogle and Michael Stonebraker. Chabot: Retrieval from a relational database of images. IEEE Computer, Volume 28, Number 9, pages 40-48, September 1995.
[16] Elie Sanchez. Importance in knowledge systems. Information Systems, Volume 14, Number 6, pages 455-464, 1989.
[17] Simone Santini and Ramesh Jain. Similarity queries in image databases. Proceedings of CVPR96, IEEE International Conference on Computer Vision and Pattern Recognition, San Francisco, June 1996.
[18] Uri Shaft and Raghu Ramakrishnan. Data modeling and feature extraction management in image databases. In C.-C.J. Kuo (editor), Multimedia Storage and Archiving Systems, Volume 2916 of SPIE, Boston, MA, November 1996.
[19] Motohide Umano. FREEDOM-0: A fuzzy database system. In M.M. Gupta and E. Sanchez (editors), Fuzzy Information and Decision Processes, pages 339-347. North-Holland, 1982.
[20] R. Vandenberghe, A. Van Schooten, R. De Caluwe and E.E. Kerre. Some practical aspects of fuzzy database techniques: An example. Information Systems, Volume 14, Number 6, pages 465-472, 1989.
[21] M.H. Wong and K.S. Leung. A fuzzy database-query language. Information Systems, Volume 15, Number 5, pages 583-590, 1990.