A Framework for Integration of Large-Scale Distributed Visual Databases

NetView: A Framework for Integration of Large-Scale Distributed Visual Databases

Aidong Zhang, Wendy Chang, and Gholamhosein Sheikholeslami
Department of Computer Science, State University of New York at Buffalo, Buffalo, NY 14260 USA

Tanveer F. Syeda-Mahmood
Xerox Research Center, Webster, NY 14580 USA

Abstract

Many visual databases (including image and video databases) are being designed in various locations. The integration of such databases enables users to access data across the world in a transparent manner. In this article, we present a system framework, termed NetView, which supports global content-based query access to various visual databases over the Internet. An integrated metaserver is designed to facilitate such access. The metaserver contains three main components: the metadatabase, the metasearch agent, and the query manager. The metadatabase is organized to include the metadata about individual visual databases which reflect the semantics of each visual database. The query manager extracts heterogeneous features such as text, texture, and color in the query for suitable matching of the metadata. The metasearch agent derives a list of relevant database sites to the given query by matching their feature content with the metadata. Such a capability can significantly reduce the amount of time and effort that the user spends in finding the information of interest. We discuss the design strategies for the metadatabase, the metasearch agent, and the query manager. The performance of the metaserver is refined based on user's feedback.

Current Address: Department of Computer Engineering, Rochester Institute of Technology, Rochester, NY 14623.

Introduction

With the explosive growth and increasing popularity of the Internet and World Wide Web (WWW), it is now possible to access large image and video repositories distributed throughout the world. Access to these repositories plays an increasingly important role in numerous applications such as geographic information systems, medical information systems, surveillance, and distributed publishing. Today, the National Library of Medicine (NLM) is engaged in the development of an electronic archive of digitized photographs, x-rays, scanned articles, and digitized video with the goal of providing wide access to these collections via client/server systems. NASA also has terabytes of space data for exploration of space and atmospheric sciences. In such applications, the search and retrieval of visual databases become an essential part of the scientific research. In the commercial world, several products are emerging which provide networked access to image databases, such as Artview™, where high quality images of paintings and other art work in various galleries can be displayed to remote prospective buyers.

The enormous growth in the amount of image and video information has also created new issues for users. Internet users now face a problem of resource selection. That is, given a query, where should a user start a search? One example could be a graphic designer who looks for images with specific patterns to be used as a background for his/her design. Experienced designers may already have a list of favorite image web sites and will start the search from one of those sites. Less experienced designers will probably visit a well-known web search engine such as Yahoo to get a list of stock photograph web sites to search. In either case, the designers may need to search all the sites on the list to find the desired images, which can be very time consuming. Furthermore, if the desired images cannot be found, there are currently no specific approaches for the designer to determine an alternate list of web sites except possibly by using a different search engine with an associated loss of context.

To deal with the explosive growth of the visual data and the inherent complexity in visual data querying, it is crucial to perform a careful selection of database sites in order to support efficient queries. One way to deal with this issue is to design a metaserver on top of various visual databases. Given a query, the metaserver first produces a ranking of the database sites and then distributes the queries to the selected databases. In recent research, database selection has focused on directing text queries to databases.
For example, web search engines such as Lycos and AltaVista currently create web indices in their search engines by periodically scanning potential web sites and using the text information in their resident HTML pages. But most implementations of text-based distributed systems do not perform site selection, often posing a query to all sites in parallel as done in the CLASS (College Library Access and Storage System) and the NCSTRL (Networked Computer Science Technical Report Library) systems at Cornell University [14]. More recently, techniques from information retrieval [15, 18] are used for intelligent resource site selection. Examples of such systems include GLOSS (from Stanford) [10, 9], WHOIS++ (from Bunyip Corporation) and HARVEST (from University of Colorado) [4]. These systems employ statistical approaches to record the frequency of occurrence of text keywords from known sites to construct an index of relevant sites for directing a query. For example, the GLOSS (Glossary Of Servers Server) [10] keeps statistics on the available databases to estimate which databases are potentially most useful for a given query using Boolean and vector-space retrieval models of document retrieval. WAIS (Wide Area Information Servers) [13] divides its indices among the databases into multiple levels with the top-level index containing a "directory of servers". Given a query, the "directory of servers" is searched and the query is then forwarded to the selected databases. Similar to the WAIS approach, several WWW-based image search facilities, such as ImageRover [19], WebSeek [23], and AltaVista, maintain a centralized index on images in individual WWW sites as specified through the URL addresses, image file names, or manually generated text annotations. These search engines allow a combination of text and image content queries and return a list of archival sites which contain the images whose descriptions match the pattern.

The work that comes closest to selection of relevant image databases (rather than images directly) is the work in MetaSEEk [5]. MetaSEEk is a meta-image search engine designed to query large distributed online visual information sources. MetaSEEk's target search engines include VisualSEEk [22], QBIC [8], and Virage [2]. For each query, MetaSEEk selects the target search engines that support the specified method of the query and may have the desired results based on the performance metrics calculated from the past queries. The performance metrics of the query are given by the user after viewing the responses returned by the search engines. While MetaSEEk provides a novel approach to select among search engines, the concept of selecting databases based on visual content has not, to our knowledge, been explored.

This article will present an approach to designing an integrated metaserver which supports intelligent resource selection based on the feature content of image queries. Given a visual query, the system returns a ranked list of potentially relevant database sites. The user can then follow the path recommended by the system to perform the search by starting with the highest ranked sites. Such a framework would allow users to quickly locate the resources and dramatically cut down the overall time spent in retrieving the images of interest. The metaserver contains three main components: the metadatabase, the metasearch agent, and the query manager.
To support intelligent site selection, the metadata for inclusion within the metadatabase is formulated on the basis of visual content of the images housed at each remote visual database. The visual content of the images in each database is summarized through image templates and statistical features characterizing the similarity distributions of the images. The metadatabase is organized into a hierarchical structure. The query manager extracts the information in the query for suitable matching of the metadata. The query manager is also responsible for the refinement of the visual queries based on user's feedback. Various selection approaches for the metasearch agent which make use of the metadata can be designed to derive a ranking of relevant database sites with respect to a query. A prototype system is implemented using Java in a Web-based environment and experimental analysis is conducted.

System Architecture

Figure 1 illustrates the overall architecture of the system. This system contains three main parts: visual databases at remote sites, a metaserver, and a set of visual display applications at the client machines. We will focus on the design of the metaserver. The role of the metaserver is to accept user queries, extract the information in the query for suitable matching of the metadata, produce a ranking of the database sites, and distribute the queries to selected databases. Figure 2 shows the metaserver components, including the metadatabase, the metasearch agent, and the query manager.

[Figure 1 diagram; labeled components: image database, text database, video database, image+text database, metaserver (metasearch agent, metadatabase, query manager), client browsers.]

Figure 1: A distributed visual data retrieval system.

[Figure 2 diagram; labeled components: visual database, metasearch agent, metadatabase, query manager, users; labeled flows: query, metadata, relevance feedback.]

Figure 2: The metaserver architecture.

The metadatabase houses both template and statistical metadata. The templates are feature vectors which are representatives of the feature vectors of component database images, and the statistical metadata record the relationships between the templates and the database images (see the detailed discussion on metadata in the following section). Three additional modules, the metadata collector, the template builder, and the metadata refinement module, are designed to support the interoperability and integrity of the metadata. The metadatabase, together with the metadata collector, template builder, and metadata refinement module, constitute the metadatabase management system [7], as illustrated in Figure 3. The collector gathers the metadata from the visual database at the time when the database registers with the metaserver or at the time when the database resubmits its metadata. The template builder organizes and creates templates. The metadata refinement module periodically initiates the metadata update process by asking the database to resubmit its metadata.

[Figure 3 diagram; labeled components: visual database, collector, template builder, metadata refinement, metadatabase; labeled flows: registration/refinement, template, metadata, new templates.]

Figure 3: The metadatabase management system.

The query manager consists of two modules: the query processing module and the query refinement module. The query processing module accepts the user's query, extracts various feature vectors from the query, and submits the feature vectors to the metasearch agent. During the feedback step, the query processing module accepts images that are judged as relevant or irrelevant to the original query by the users. It extracts various feature vectors from the images, and forwards the feature vectors to the query refinement module. The query refinement module constructs a modified query from the image feature vectors sent by the query processing module. Figure 4 shows the functionalities of the query manager. The metasearch agent invokes the site selection algorithm which matches the query features to templates with the corresponding metadata of the databases to select the potentially relevant databases for the query. The query is then forwarded to the selected visual databases in an acceptable form. The searching mechanism of the local database searches its repository for possible answers to the posed query. The answer is then fed back to the client.

[Figure 4 diagram; labeled components: metasearch agent, query processing, query refinement, users; labeled flows: query, refined query, feedback image features, relevance feedback.]

Figure 4: The query manager.

A prototype system, termed NetView, is developed based on the above framework. NetView is implemented using Java as the programming language. It consists of three major components: a central server, a remote database interface, and a user interface, as shown in Figure 5. NetView implements the metaserver functionalities in its central server. The central server interacts with the user interface through Java's socket interface to receive the user's query. The central server processes the query and returns a list of relevant database sites to the user. The user may choose the databases to search. The central server then forwards the query to the remote database interface for distributing the query to the chosen databases in an acceptable form. NetView currently supports image databases stored in file systems, O2 databases, and ObjectStore databases via the remote database interface.

[Figure 5 diagram; labeled components: central server (metasearch agent, metadatabase, query manager), remote database interface, visual database, user interface; labeled flows: query, list of relevant DBs, query/relevance feedback, retrieved images.]

Figure 5: NetView: the web-based visual information system.

NetView's user interface is an HTML form and can be invoked within a standard WWW browser. The user interface is responsible for the interactions between the users and the central server. Through running a Java applet, the interface allows the user to submit a query, select the desired databases, initiate a search, mark the relevance of retrieved images, and initiate a subsequent feedback search. Figure 6 shows the NetView user interface. The user can enter a query by typing a file name in the Open box. The query image is then displayed in the upper window. Any image that is accessible over the Internet can be used as a query. The query can be either the complete image or a portion of the image (as marked by the user). In Figure 6, queries of different sizes are shown in the upper window. The rightmost image is the query for which the retrieved images are shown in the window at the bottom of the page. Next, the user sends the query to the server by clicking on the Send Image button. Upon receiving the list of relevant databases from the metasearch agent, the user interface displays the list. The user can then choose to search a database by highlighting the database name and clicking the Send button. The query image is forwarded to the chosen database. The local database searches its repository, and retrieved images are fed back to the user interface for viewing. The remaining part of the user interface is used for the relevance feedback mechanism. The user is asked to mark the retrieved images judged as relevant or irrelevant to the query by highlighting the image name and clicking the corresponding buttons (Relv or NRel). The relevance indications are submitted to the system for processing when the user clicks on the feedback (Fback) button, and a refined list of relevant databases is then shown.

Figure 6: The NetView user interface.

Metadata

We will now outline the rationale of the metadata formulation that is used to select relevant databases for a given visual query. It may appear that such a selection can be done using methods in text database discovery by maintaining an index of relevant sites using text information associated with the database. Also, information such as the monetary cost and latency of database sites can provide additional metadata to enable early pruning of costly sites. However, for visual querying, text information can only perform a coarse pruning of the database sites. Further pruning must use the visual information in the query. But how can database relevance be determined for visual queries without a detailed examination of all images of the databases for possible matches? Clearly, it is not desirable to move the complete machinery of image content-based search used in the database engines to the distribution site to determine database relevance. Also, it is not possible to create, off-line, an indexed set of database sites containing relevant images to queries, as that requires an anticipation of all possible visual queries.

To address these problems, we propose an intermediate approach wherein the information in the database images is summarized and represented in a suitably abstracted form in the metadatabase. Following current research achievements, the information in a visual object may be represented by a set of features such as texture, color and shape. Computationally, the features of an image are typically represented by a set of numerical values, termed a feature vector. Visual queries can then be supported by matching the features of the query with the feature vectors in the database using similarity metrics. Many approaches have been developed to extract various features from visual data, including texture, color, and shape [12, 1].
Figure 7 shows two examples of visual queries and their matched images retrieved using the mechanisms presented in [16, 20, 24]. Texture and color queries as well as the matched segments within images are shown. The similarity value attached to each image shows that the similarity between the query and the image increases as their content becomes more similar.

[Figure 7 images; similarity values shown: (a) 0.92, 0.89, 0.99, 0.81; (b) 0.88, 0.88, 0.78, 0.76.]

Figure 7: Visual queries and matched database images: (a) texture, (b) color.

Templates. To support efficient retrieval, approaches have been proposed to categorize the feature vectors in a database into clusters on the basis of their features [25, 21]. Each cluster can then be represented by a feature vector, termed a template, which is generally the centroid of the cluster. The cluster can be further classified into subclusters, which can then be represented by their centroids. A template at a higher level represents the coarse features that contain all the features represented by its child templates. We observe that such templates can be used to adequately represent the content of the database. Thus, we collect templates from the component databases as part of the metadata in the metadatabase. To find the templates, we first select sample images from all local databases. Then, using a hierarchical clustering method, we build a tree-like structure called a dendrogram [11, 21]. We can cut the dendrogram at different levels, resulting in different sets of clusters. We then use the centroids of the resulting clusters as the templates. This process is applied for each feature class to find the corresponding templates. Thus these templates can represent the images in all the local databases concisely. Moreover, the hierarchical template structure can be used to support efficient search of the matched templates for a given query. Other appropriate methods may also be used to find a good set of templates that represents the content of the database well. The metaserver can relate the content of the databases to the templates by calculating statistical metadata of each database with respect to the templates. This statistical metadata will be used by the site selection mechanism.

Statistical Metadata. Based on the templates collected from individual visual databases, we can measure the similarity of visual data in the databases to the templates. Using these similarity measurements, statistical data are computed that capture the likelihood of a database containing data that are relevant to a template. Let DB denote the set of all databases and sim(i, t) denote the similarity between image i and template t, where $0.0 \le sim(i, t) \le 1.0$. The similarity between a database image and an image template can be measured using the methods described in [12]. Figure 8 shows some similarity measure values between images and templates. As we can see, similar images tend to have a closer range of similarity measurement with the same template. Our use of image templates for visual abstraction is based on this observation. Assume that the feature extraction algorithms can generate feature vectors which are close to each other in the feature space for similar images.
Given a template t, which is the center of an image cluster, images similar to t usually have a high degree of similarity to t. Images not similar to t normally have a low degree of similarity to t. Consider an example of a texture template and its matched sample database images, shown in Figure 8. Cluster (4), with similarity measures in the range of [0.02, 0.13], is the least similar set with respect to the template, while cluster (1), with similarity measures in the range of [0.79, 0.85], is the most similar set with respect to the template. Similarly, in the example of a color template and its matched sample database images, cluster (4) is the least similar set with similarity measures of 0.2, while cluster (1) is the most similar set with similarity measures in the range of [0.91, 0.99]. Thus, for an image q with a high similarity to template t, images similar to q will normally have a degree of similarity to t close to that of q. Note that, in general, if two images have the same degree of similarity to a template, they are not necessarily similar to each other.

[Figure 8 images; similarity values to the template by cluster.
(a) texture: (1) 0.85, 0.80, 0.80, 0.79; (2) 0.71, 0.69, 0.67, 0.66; (3) 0.30, 0.30, 0.29, 0.27; (4) 0.13, 0.10, 0.06, 0.02.
(b) color: (1) 0.99, 0.92, 0.91, 0.91; (2) 0.72, 0.72, 0.72, 0.71; (3) 0.52, 0.52, 0.51, 0.51; (4) 0.20, 0.20, 0.20, 0.20.]

Figure 8: Templates and database images: (a) texture features, (b) color features.

Various statistical data can be computed from the distributions of the similarities between database images and the templates, and stored in the metadatabase. The statistical data are used to represent the visual relationships between the databases and templates. The relevant databases for a given query can then be selected by determining the similarity of the query with metadatabase templates and then ranking the database sites based on the visual relationships recorded between the databases and templates. Various database selection approaches can be developed corresponding to the statistical data.

Hierarchy of Metadatabase. The metadatabase is organized into a hierarchical structure. Figure 9 presents a conceptual view of the metadatabase. At the higher level, templates $T_i$ ($1 \le i \le m$) are grouped according to features including texture patterns, color patterns, and shape patterns, with top-level nodes representing the most general categories of texture, color and shape. To support efficient query retrieval, the templates in each feature class can be indexed hierarchically.
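The template construction described earlier in this section (cluster sample feature vectors hierarchically, cut the hierarchy at some level, and take each cluster centroid as a template) can be sketched as follows. This is a minimal illustration under stated assumptions, not the prototype's implementation: the names `centroid` and `build_templates`, the use of Euclidean distance with centroid linkage, and stopping at a fixed number of clusters as a stand-in for cutting the dendrogram are all hypothetical choices.

```python
from math import dist


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(column) / n for column in zip(*vectors))


def build_templates(samples, num_clusters):
    """Agglomerative clustering of sample feature vectors.

    Repeatedly merges the two clusters whose centroids are closest
    (an assumed linkage rule) until `num_clusters` clusters remain,
    then returns one template (the centroid) per cluster.
    """
    clusters = [[tuple(v)] for v in samples]
    while len(clusters) > num_clusters:
        best = None  # (distance, index_a, index_b) of the closest pair
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist(centroid(clusters[a]), centroid(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [centroid(c) for c in clusters]


# Hypothetical two-dimensional feature vectors forming two obvious groups.
samples = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
templates = sorted(build_templates(samples, 2))
```

Stopping at a smaller `num_clusters` corresponds to cutting the hierarchy nearer its root and yields coarser templates; a larger value yields finer ones. The quadratic pair search is kept only for readability.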
When the metadatabase is first initialized, we build the initial template hierarchy by clustering a set of sample images representing various application domains such as geographical images, medical images, industrial images, nature scenery images, and different types of video frames. At the lowest level, database sites are grouped under each template $t_i$ ($1 \le i \le k$), and the groupings are based on the similarity of the database images to the template. We define an image to be similar to a template if the similarity measurement between the two is greater than a threshold. A database site $db_i$ ($1 \le i \le n$) can be grouped under one or more templates. The time it takes to initialize the metadatabase hierarchy depends on the number of sample images used in the process: the larger the number of sample images used in configuring the initial template hierarchy, the longer the initialization process. It took approximately 6 hours to build the initial metadatabase hierarchy with 3,000 sample images running on a dedicated SUN Ultra Enterprise 4000 with 1 Gigabyte of memory and a 168 MHz UltraSPARC CPU. The initial metadatabase hierarchy in our prototype consists of both texture and color features. Once initialized, the metadatabase can be expanded to include new metadata.

[Figure 9 diagram; three levels: feature classes (texture, color, shape, ...), templates ($T_1, T_2, \ldots, T_m$ and $t_1, t_2, \ldots, t_k$), and database sites ($db_1, db_2, \ldots, db_n$).]

Figure 9: A conceptual view of the metadatabase.

Metadata Gathering. New templates may be added by the template builder when a new visual database registers with the metaserver or when new images are added to the database. When a database registers with the metaserver, a set of representative image samples is sent to the metaserver. If a visual database has already been clustered, the image samples will consist of images chosen from each cluster. If a visual database has not yet been clustered, the image samples are chosen randomly based on a uniform distribution. Based on our experiments, about 10% of the total number of images in the visual database is sufficient to be used as the image samples. The sampling algorithm is made available to the visual database by the metaserver. The metaserver then clusters the sample images to generate a set of image templates representing the visual database. The clustering is based on the strategy outlined previously. Image templates representing the database are then merged with the existing image templates housed at the metaserver.

Adding new images to a database may not always require new templates. New templates are necessary only if the features of the new images cannot be adequately represented by existing templates. This can happen when the new images are semantically different from the existing images in the database. In such cases, the metadata collector collects the new sample images and configures the new templates. The template builder then merges the new templates into the metadatabase hierarchy. Once a template is added to the metadatabase, it will not be removed. Whenever a new template is added to the metadatabase, all remote databases will be asked to provide the statistical metadata pertaining to the new template. There is no need to update the previously computed statistical metadata, since it is collected with respect to the existing templates.

Each visual database system may use different visual features, models, and similarity metrics. To maintain heterogeneity of the visual databases, the metadata collector provides the feature extraction and similarity measurement algorithms to gather the statistical metadata from the visual databases. When a database registers with the metaserver, a registration form, together with the image templates housed at the metaserver, is sent to the database by the metadata collector. The registration form contains the algorithms for calculating the metadata and also requests information from the individual databases such as the type of data housed, the expected query form, the monetary cost, and the latency of the database site. The database then computes the similarities of its images to all templates using the supplied algorithms and returns the statistics associated with each template such as the number of samples, mean, variance, and histogram. The metadata collector then stores the metadata in the metadatabase. Note that although the visual database uses the algorithms supplied by the metaserver to compute the statistical data, it will use its own search function and similarity metrics when retrieving images. In addition, the clustering and indexing structures of the visual database remain unchanged and are not affected by the integration.

Databases registered with the metaserver may be updated, rendering the statistical metadata recorded earlier inaccurate. The metadata must therefore be dynamically updated. The metadata refinement module periodically initiates the metadata updates by asking databases to resubmit metadata to the collector. The size of the metadatabase largely depends on the number of templates present in the metadatabase and the number of databases registered with the metaserver.
Since a set of statistical metadata is stored for each database and template pair, increasing the number of templates or the number of databases will require more storage at the metadatabase. However, the performance of the system will be improved since the content of the visual databases can be more accurately represented by the templates. In addition, storage is needed to store the feature vectors of the templates and the database information such as the type of data housed, the expected query form, the specialized algorithms supported, the monetary cost, and the latency of the database sites. Much research has been directed to resource discovery over the Internet [18]. Since this aspect is not the focus of the research, we assume that the discovery of new database resources will be performed manually. Once discovered, the new database will be asked to register with the metaserver, following the steps outlined above.
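The per-template statistics that a database returns at registration (number of samples, mean, variance, and histogram of its images' similarities to a template) can be sketched as below. This is an illustrative sketch only: the function name `template_statistics`, the dictionary layout, the choice of population variance, and the ten equal-width bins are assumptions, since the text does not fix a binning or a record format.

```python
def template_statistics(similarities, num_bins=10):
    """Summarize a database's similarities to one template.

    `similarities` holds sim(i, t) in [0.0, 1.0] for each sampled image i.
    Returns the count, mean, (population) variance, and an equal-width
    histogram over [0.0, 1.0]; a similarity of exactly 1.0 falls in the
    last bin.
    """
    n = len(similarities)
    mean = sum(similarities) / n
    variance = sum((s - mean) ** 2 for s in similarities) / n
    histogram = [0] * num_bins
    for s in similarities:
        index = min(int(s * num_bins), num_bins - 1)
        histogram[index] += 1
    return {"count": n, "mean": mean, "variance": variance,
            "histogram": histogram}


# Similarity values of the most and least similar texture clusters
# reported for Figure 8, used here as sample input.
stats = template_statistics([0.85, 0.80, 0.80, 0.79, 0.13, 0.10, 0.06, 0.02])
```

One such record would be stored for each database and template pair, which is why storage grows with both the number of templates and the number of registered databases.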

Directing A Visual Query

For a given query, the relevant databases are first determined by a two-step process using the metadatabase: (1) select the templates which match the query based on the similarity of the query with templates, and (2) determine the ranks of the database sites based on the visual relationships recorded between the databases and selected templates. The user can then choose to send the query to the databases with the highest ranks.

In our system, a query is a still image. Upon receiving a query $q$, the query manager first extracts a set of subqueries $\{q_1, \ldots, q_n\}$ from the query, with each subquery representing a feature class (i.e., texture patterns, color patterns, and shape patterns). The similarities of each subquery $q_i$ to the metadatabase templates are computed. Let $G$ denote the set of all the templates existing in the metadatabase. A set of matched templates $T_{q_i} = \{t_1, \ldots, t_m\}$ may be selected based on the following criterion:

$$T_{q_i} = \{\, t \mid \forall t \in G,\ sim(q_i, t) \ge \tau \ \wedge\ (\max_{t' \in T_{q_i}}(sim(q_i, t')) - sim(q_i, t)) \le \varepsilon_{q_i} \,\}, \qquad (1)$$

where $sim(q_i, t)$ is the similarity between subquery $q_i$ and template $t$, and $\tau$ and $\varepsilon_{q_i}$ are given thresholds. The similarities between subquery $q_i$ and the templates in $T_{q_i}$ must be greater than or equal to the given threshold $\tau$. The templates in $T_{q_i}$ are considered to have the highest similarities to $q_i$. If $T_{q_i}$ is empty, the metaserver will ask the user to either submit a new query or distribute the query to all visual databases.

If a set of matched templates can be found for subquery $q_i$, the metasearch agent then invokes the selection approach which uses the subquery similarity to matched templates and corresponding statistical metadata of databases $db_i$, $1 \le i \le n$, to return a list of relevant database sites for that specific subquery. The rankings of the databases for all the subqueries are then merged to yield a final set of databases for the given query. Let $q$ contain a set of subqueries $q_1, \ldots, q_n$ and each $q_i$ match with multiple independent templates $t_{q_i,1}, \ldots, t_{q_i,m}$. Let $D_1, D_2, \ldots, D_n$ be the sets of relevant databases for $q_1, \ldots, q_n$, respectively. Since we look for databases which contain images similar to all subqueries, the set of relevant databases for $q$, denoted $DBs$, is calculated as the intersection of $D_i$, $i = 1, \ldots, n$:

$$DBs = D_1 \cap D_2 \cap \cdots \cap D_n. \qquad (2)$$
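The template-matching criterion of Formula (1) can be sketched as follows. This is a minimal illustration: the function name `select_templates` and the parameter names `tau` (the similarity threshold) and `epsilon` (the allowed gap to the best match) are hypothetical, and the maximum is taken over the templates that pass the threshold, which coincides with the maximum over the matched set whenever that set is non-empty.

```python
def select_templates(similarities, tau, epsilon):
    """Return the ids of the templates matched to one subquery.

    `similarities` maps a template id to sim(q_i, t). A template is kept
    when its similarity reaches `tau` and lies within `epsilon` of the
    best similarity among the templates that reach `tau`.
    """
    candidates = {t: s for t, s in similarities.items() if s >= tau}
    if not candidates:
        # Empty result: the caller would ask for a new query or fall
        # back to sending the query to all databases.
        return set()
    best = max(candidates.values())
    return {t for t, s in candidates.items() if best - s <= epsilon}


# Hypothetical similarities of one texture subquery to four templates.
sims = {"t1": 0.90, "t2": 0.85, "t3": 0.60, "t4": 0.30}
matched = select_templates(sims, tau=0.5, epsilon=0.1)
unmatched = select_templates(sims, tau=0.95, epsilon=0.1)
```

Here `t3` passes the threshold but is dropped because it trails the best match by more than `epsilon`; raising `tau` above every similarity produces the empty set that triggers the fallback described above.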

Various selection approaches can be developed to determine $D_i$ ($1 \le i \le n$). For example, a selection approach, termed histogram-based, was developed based on the histogram of the similarity distribution [6]. Let $[a_{q_i,t}, b_{q_i,t}]$, where $a_{q_i,t} = \mathrm{sim}(q_i, t) - \delta_{q_i}$ and $b_{q_i,t} = \mathrm{sim}(q_i, t) + \delta_{q_i}$, be the similarity interval for a subquery $q_i$ with respect to a template $t$, where $\delta_{q_i}$ is a predefined offset value for the given $q_i$. The interval $[a_{q_i,t}, b_{q_i,t}]$ specifies a similarity range within which we search for images similar to subquery $q_i$. Given the similarity distribution of a database with respect to a template $t$, let $h_{db,t} : [0.0, 1.0] \to I$ represent the histogram of the similarities of images in a database $db$ to template $t$, where $h_{db,t}(x)$ is the number of images that have a similarity $x$. Let $\mathrm{num}(db, t, s_1, s_2)$ be the number of images within database $db$ with a similarity in the range $[s_1, s_2]$ with respect to template $t$. For each subquery $q_i$ that matches a set of templates $\{t_1, \ldots, t_m\}$, we calculate $\mathrm{num}(db, t_j, a_{q_i,t_j}, b_{q_i,t_j})$ for each matched template $t_j$ as:

$$\mathrm{num}(db, t_j, a_{q_i,t_j}, b_{q_i,t_j}) = \sum_{x \in [a_{q_i,t_j},\, b_{q_i,t_j}]} h_{db,t_j}(x). \tag{3}$$
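
The count in Formula (3) can be sketched in Python as follows. This is a minimal illustration, assuming the histogram is stored as a dictionary from similarity value to image count; the function names and the sample histogram are hypothetical.

```python
def similarity_interval(sim, delta):
    """Return (a, b) = (sim - delta, sim + delta) for one subquery/template pair."""
    return sim - delta, sim + delta


def num_in_range(histogram, s1, s2):
    """Formula (3): number of images whose similarity x satisfies s1 <= x <= s2.

    histogram maps a similarity value x in [0.0, 1.0] to the number of
    images in the database that have that similarity to the template.
    """
    return sum(count for x, count in histogram.items() if s1 <= x <= s2)


# Hypothetical histogram h_{db,t} with four populated similarity values.
h_db_t = {0.65: 12, 0.75: 30, 0.85: 45, 0.95: 8}
a, b = similarity_interval(sim=0.80, delta=0.08)  # roughly [0.72, 0.88]
print(num_in_range(h_db_t, a, b))  # 75
```

Only the histogram has to be stored in the metadatabase for each database and template; the database images themselves are not consulted at selection time.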

The value $\mathrm{num}(db, t_j, a_{q_i,t_j}, b_{q_i,t_j})$ is the total number of database images falling within the similarity interval with respect to template $t_j$. We then sum this value over all matched templates $\{t_1, \ldots, t_m\}$ and rank the databases in decreasing order of the summed value:

$$D_i = \Big\{ db \;\Big|\; db \in DB \;\wedge\; \sum_{j=1}^{m} \mathrm{num}(db, t_j, a_{q_i,t_j}, b_{q_i,t_j}) > 0 \;\wedge\; \Big( \max_{db' \in DB} \sum_{j=1}^{m} \mathrm{num}(db', t_j, a_{q_i,t_j}, b_{q_i,t_j}) - \sum_{j=1}^{m} \mathrm{num}(db, t_j, a_{q_i,t_j}, b_{q_i,t_j}) \Big) \le \theta \Big\}, \tag{4}$$
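
The selection rule of Formula (4) can be illustrated with a short Python sketch. It assumes that the comparison against the threshold is "less than or equal", and it takes the per-database sums over the matched templates as already computed; the function name, the threshold value, and the sample sums are hypothetical.

```python
def select_databases(summed_counts, threshold):
    """Formula (4): select and rank databases for one subquery.

    summed_counts maps each registered database to its number of similar
    images, summed over all matched templates. A database is kept when its
    sum is positive and lies within `threshold` of the best sum.
    """
    if not summed_counts:
        return []
    best = max(summed_counts.values())
    kept = [db for db, total in summed_counts.items()
            if total > 0 and best - total <= threshold]
    # Rank the kept databases in decreasing order of the summed count.
    return sorted(kept, key=lambda db: summed_counts[db], reverse=True)


# Hypothetical sums for four registered databases.
sums = {"db3": 0, "db5": 120, "db7": 95, "db8": 40}
print(select_databases(sums, threshold=50))  # ['db5', 'db7']
```

In this example db3 is dropped because it has no image in the similarity interval, and db8 is dropped because its sum is more than 50 below the best sum.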

where $DB$ is the set of databases registered with the metaserver and $\theta$ is a given threshold. In determining the rank of each selected database $db$ for subquery $q_i$, we use $\sum_{j=1}^{m} \mathrm{num}(db, t_j, a_{q_i,t_j}, b_{q_i,t_j})$.

Figure 10 shows the relationships between a query, templates, and databases. In this example, a subquery representing the texture feature and a subquery representing the color feature are extracted from the query image. The texture subquery is matched with template $t_1$. The color subquery is matched with templates $p_1$ and $p_2$. The databases grouped under template $t_1$ are $db_1, \ldots, db_k$, the databases grouped under template $p_1$ are $db'_1, \ldots, db'_n$, and the databases grouped under template $p_2$ are $db''_1, \ldots, db''_m$. The metaserver then invokes the selection approach to determine which of the indexed databases are relevant to each specific subquery. For example, let $\{db_5, db_7, db_8\}$ be the top 3 ranked databases chosen for the texture subquery, and $\{db_3, db_5, db_7\}$ be the top 3 ranked databases chosen for the color subquery. The selection approach returns the set of relevant databases for the query image as the intersection of the two sets, that is, $DBs = \{db_5, db_7\}$.

Various models may be used to merge the selections of multiple subqueries. For example, given a query $q = \{q_1, \ldots, q_n\}$, let $T_{q_i} = \{t_1, \ldots, t_m\}$ be the set of templates that are matched with subquery $q_i$ using Formula (1), and let $\tilde{a}$ be the average similarity between $q_i$ and the templates in $T_{q_i}$. Let $DBs = \{db_1, \ldots, db_h\}$ be the final set of chosen databases for query $q$ using Formula (2), and let $S^{q_i} = \{s^{q_i}_{db_1}, \ldots, s^{q_i}_{db_h}\}$ be the corresponding numbers of similar images used by the selection approach to rank databases for subquery $q_i$, where $s^{q_i}_{db} = \sum_{j=1}^{m} \mathrm{num}(db, t_j, a_{q_i,t_j}, b_{q_i,t_j})$ for the histogram-based approach. For each database $db_j$ in $DBs$, we compute the estimated score for the subquery $q_i$ as:

$$Sc_{db_j,q_i} = H_{q_i} \times \frac{s^{q_i}_{db_j}}{m} \times \frac{\tilde{a}}{\sum_{l=1}^{h} s^{q_i}_{db_l}}, \tag{5}$$

where $H_{q_i}$ defaults to 1.0 and can be adjusted by users to reflect their preferences among visual features.
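
A minimal Python sketch of the per-subquery score follows. It reads Formula (5) as the product of the weight $H_{q_i}$, the average template similarity divided by the number of matched templates, and the share of similar images held by each database; the function and parameter names are hypothetical, and returning zero scores when there is nothing to normalize by is an assumption of this sketch.

```python
def subquery_scores(similar_counts, avg_similarity, num_templates, weight=1.0):
    """Estimated score of each chosen database for one subquery (Formula (5)).

    similar_counts maps each database in DBs to its number of similar images,
    avg_similarity is the average similarity between the subquery and its
    matched templates, num_templates is m, and weight is H (default 1.0).
    """
    total = sum(similar_counts.values())
    if total == 0 or num_templates == 0:
        # Assumption: with nothing to normalize by, every database scores zero.
        return {db: 0.0 for db in similar_counts}
    factor = weight * avg_similarity / num_templates
    return {db: factor * count / total for db, count in similar_counts.items()}


# Hypothetical counts for the two databases that survived the intersection.
scores = subquery_scores({"db5": 120, "db7": 80}, avg_similarity=0.8, num_templates=2)
for db, score in scores.items():
    print(db, round(score, 3))
```

With these inputs db5 receives a score of about 0.24 and db7 about 0.16, so the database holding the larger share of similar images scores higher for this subquery.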

Figure 10: Query, templates and databases. (Diagram stages: query image; class identification into feature classes such as texture, shape, color, and text; template matching against image templates $t_1$, $p_1$, $p_2$; metadata matching against the databases grouped under each template; selection approach for each subquery; final retrieved images from $\{db_5, db_7\}$.)

The estimated score of database $db_j$ for the query $q$ may then be computed as $\sum_{i=1}^{n} Sc_{db_j,q_i}$. The database sites in $DBs$ are then further ranked in decreasing order of $\sum_{i=1}^{n} Sc_{db_j,q_i}$. As shown in Figure 10, the chosen databases in $DBs$ may be further ranked as $\{db_5, db_7\}$ using the merging algorithm.

We now estimate the time required to derive a ranked list of relevant databases for a given query. Let $G$ denote the set of templates in the metadatabase and $DB$ denote the set of databases registered with the metaserver. The computational complexity of choosing the matching templates for a given query is $O(|G|)$. Since the maximum number of databases grouped under each template is $|DB|$, the worst-case computational complexity of applying the selection approach to the databases grouped under the matched templates is $O(|DB|)$. Thus, the worst-case time required to derive a ranked list of relevant databases for a given query is $O(|G|) + O(|DB|)$. To improve the efficiency of finding the templates matched to the query image, we can replace the serial search with an existing indexing method such as the R*-tree or its variants [3] to index the templates. The computational complexity of choosing the matching templates for a given query can then be $O(\log |G|)$. In addition, the number of databases grouped under each template is generally much less than $|DB|$, which makes the average time complexity of finding the ranked list of databases less than $O(|G|) + O(|DB|)$. However, the average time needed to find similar images for a given query largely depends on the effectiveness and efficiency of the selected local search engines. Network latency may also affect the response time.
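
The merging step described above, which sums each database's estimated scores over all subqueries and sorts by the total, can be sketched as follows. The function name and the score values are hypothetical; the sketch assumes that every per-subquery dictionary holds scores for the same databases in $DBs$.

```python
def rank_by_total_score(per_subquery_scores):
    """Rank databases by the sum of their per-subquery estimated scores.

    per_subquery_scores: list of dictionaries, one per subquery, each mapping
    a database in DBs to its estimated score for that subquery.
    Returns the databases in decreasing order of the summed score.
    """
    totals = {}
    for scores in per_subquery_scores:
        for db, score in scores.items():
            totals[db] = totals.get(db, 0.0) + score
    return sorted(totals, key=lambda db: totals[db], reverse=True)


# Hypothetical scores for a texture subquery and a color subquery.
texture_scores = {"db5": 0.24, "db7": 0.16}
color_scores = {"db5": 0.10, "db7": 0.35}
print(rank_by_total_score([texture_scores, color_scores]))  # ['db7', 'db5']
```

The loop touches each (subquery, database) pair once, so for a fixed number of subqueries the merge is linear in the number of chosen databases, plus the cost of the final sort.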

Query Refinement Based on Relevance Feedback

Relevance feedback was first introduced in the mid-1960s and has been used extensively by the information retrieval community to improve query performance [8, 2]. This process modifies the query formulation based on the user's judgment of the initially retrieved documents [17]. The main idea consists of adding or subtracting terms that the user has identified as relevant or irrelevant, and altering term weights in a new query formulation. Terms included in previously retrieved relevant documents are weighted higher, whereas terms included in previously retrieved irrelevant documents are de-emphasized. The purpose of such a query alteration process is to derive an optimal query in the expectation of retrieving more relevant documents and fewer irrelevant documents in a later search.

Our query refinement is implemented using an interactive graphical display technique to establish communication between the system and its users. After the initial search, a list of initially retrieved images is displayed to the user, who can use the mouse pointer to designate images as relevant or irrelevant to the user's needs. The query process module extracts various features from each relevant or irrelevant image and forwards the features to the query refinement module.

Let $F = \{f_1, \ldots, f_n\}$ be a set of features, where $f_i \in \{texture, color, shape, etc.\}$. Let $q = \{q_1, \ldots, q_n\}$ be the original query, with $q_i \in F$. Let $I_r$ be the set of relevant images retrieved by the initial search and $I_{ir}$ be the set of irrelevant images retrieved by the initial search, as determined by the user. Let $g = \{g_1, \ldots, g_n\}$ be an image, where $g \in I_r \cup I_{ir}$ and $g_i \in f_i$. Let $q_i^0 = q_i$. The query refinement module constructs a refined subquery $q_i^k$ ($k \ge 1$) by adding features of the relevant images and subtracting features of the irrelevant images:

$$q_i^k = \alpha\, q_i^{k-1} + \beta\, \frac{1}{N_r} \sum_{g \in I_r} g_i - \gamma\, \frac{1}{N_{ir}} \sum_{g \in I_{ir}} g_i, \tag{6}$$

where $N_r$ is the number of known relevant images in $I_r$ and $N_{ir}$ is the number of known irrelevant images in $I_{ir}$. The weights $\alpha$, $\beta$, and $\gamma$ must be determined experimentally. The metasearch agent then takes the modified subqueries $q_1^k, \ldots, q_n^k$ to produce a set of relevant database sites, $D_i$, for each subquery. The results are then merged to generate a final list of relevant databases using Formula (2).

Figure 11: Sample test images.
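
One refinement step in the style of Formula (6) can be sketched in Python as follows. The sketch assumes that each subquery and each image feature is a plain list of floats of equal length, and that the three weights multiply the previous subquery, the mean relevant feature vector, and the mean irrelevant feature vector, respectively. The function name, the parameter names `alpha`, `beta`, and `gamma`, and the sample vectors are hypothetical; treating an empty feedback set as a zero vector is also an assumption.

```python
def refine_subquery(query, relevant, irrelevant, alpha, beta, gamma):
    """Return the refined subquery after one relevance-feedback step.

    query: feature vector of the previous subquery.
    relevant / irrelevant: lists of feature vectors taken from the images
    the user marked as relevant or irrelevant.
    """
    def mean_vector(vectors):
        # Assumption: an empty feedback set contributes a zero vector.
        if not vectors:
            return [0.0] * len(query)
        return [sum(component) / len(vectors) for component in zip(*vectors)]

    rel = mean_vector(relevant)
    irr = mean_vector(irrelevant)
    return [alpha * q + beta * r - gamma * s
            for q, r, s in zip(query, rel, irr)]


# Hypothetical two-dimensional feature vectors.
previous = [0.2, 0.4]
marked_relevant = [[0.4, 0.6], [0.6, 0.8]]   # mean is about [0.5, 0.7]
marked_irrelevant = [[0.2, 0.0]]             # mean is [0.2, 0.0]
refined = refine_subquery(previous, marked_relevant, marked_irrelevant,
                          alpha=0.5, beta=1.0, gamma=0.5)
print([round(v, 3) for v in refined])
```

With these illustrative weights the refined vector is approximately [0.5, 0.9]: it moves toward the mean of the relevant images and away from the irrelevant one.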

Performance Evaluation

We now evaluate the effectiveness of our design using the histogram-based selection approach and the merging algorithm introduced previously. We downloaded 8,722 flower, scenery, materials, and background images from the Internet and randomly grouped them into 9 databases. The sizes of the databases ranged from 500 to 1400 images. Figure 11 shows some sample test images. The texture and color features of the queries and database images were extracted using the mechanisms given in [20, 24]. In our experiments, we posed 35 queries. Each query is decomposed into a texture subquery and a color subquery. The similarity values between all texture subqueries and texture templates ranged from 0.65 to 0.90. The similarity values between all color subqueries and color templates ranged from 0.50 to 0.95. The number of queries with respect to the number of matched templates and different ranges of similarity is summarized in Table 1. The offset $\delta$ used to determine the similarity interval defaults to 0.08. The threshold $\varepsilon$ used to select templates is set at 0.05. By default, the perceptive weight $H_{q_i}$ is set to 1.0 for both color and texture subqueries. All tests were conducted on a dedicated SUN Ultra Enterprise 4000 with 1 Gigabyte of memory and a 168 MHz UltraSPARC CPU. The average time the system takes to return a list of relevant databases for a query is 0.48 second.

Table 1: Distribution of the number of queries based on the number of matched templates and similarity ranges.

No. of templates        Similarity range                              Total
Texture    Color        [0.80,1.00]    [0.70,0.79]    [0.60,0.69]     queries
1          1            9              3              4               16
1          2 or 3       2              1              1
2 or 3     1            3              6              2               19
2 or 3     2 or 3       1              2              1
Total queries           15             12             8               35

The total of 19 covers the three rows in which at least one feature matched 2 or 3 templates.

Selection Effectiveness. Selection precision measures the accuracy of database selection against manually verified relevant databases. Given a query, let $selected\_sites$ be the set of top $n$ database sites selected by the selection approach in response to the query, and $relevant\_sites$ be the set of top $m$ databases that have been manually judged to contain images relevant to the query. The selection precision is calculated as:

$$P = \begin{cases} \dfrac{|selected\_sites \cap relevant\_sites|}{|selected\_sites|} & \text{if } |selected\_sites| > 0, \\ 1 & \text{otherwise.} \end{cases} \tag{7}$$

A higher $P$ value indicates a better selection approach. The $P$ values of all queries were calculated by comparing the top 4 manually ranked databases with the top 4 databases ranked by the selection approach (i.e., $n = m = 4$). We assume the manual ranking to be the ideal ranking that can be achieved. The manual rank of each database site was determined by measuring the similarity between each given query and all database images. For the subqueries, databases with at least 10 images that have a similarity $\ge 0.85$ with respect to the given subquery are considered relevant to the subquery. For the query, databases with at least 10 images that have a similarity $\ge 0.80$ with respect to both the texture and color subqueries of the given query are considered relevant to the query. All relevant databases are then ranked based on the number of highly similar images. The top 4 ranked databases were used in the experiments as the top 4 relevant databases. All 35 queries conducted in the experiments have at least one relevant database. Users were not asked to determine the rank of databases, since user judgments are often subjective and illustrative experimental results often cannot be generated from such individual judgments. The performance for each query is measured at 3 levels of precision: 0.75, 0.50, and 0.25, representing high precision, medium precision, and low precision, respectively.

The experimental results, without query refinement, are reported in Table 2, which shows the average $P$ values as a function of the number of matched templates. The relatively high selection precision (0.72) demonstrates the effectiveness of the system in selecting the relevant databases with respect to a query. Among the 4 selected databases, approximately 3 were relevant to the given query.
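
The selection precision of Formula (7) can be computed with a few lines of Python. This is a minimal sketch; the function name and the database labels are hypothetical.

```python
def selection_precision(selected_sites, relevant_sites):
    """Formula (7): fraction of the selected sites that are also relevant.

    Returns 1 when no site was selected, as the formula specifies.
    """
    selected = set(selected_sites)
    if not selected:
        return 1.0
    return len(selected & set(relevant_sites)) / len(selected)


# Hypothetical top-4 lists (n = m = 4), sharing three databases.
selected = ["db1", "db3", "db5", "db7"]
relevant = ["db3", "db5", "db7", "db9"]
print(selection_precision(selected, relevant))  # 0.75
```

With $n = m = 4$, the only possible values are 0, 0.25, 0.50, 0.75, and 1.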
The system offers similar performance for queries matched with a single texture template and a single color template and for queries matched with multiple texture and color templates. We observed that when the similarity between the query and the templates becomes low (generally below 0.6), the templates no longer provide a sound basis for the estimation. A more detailed report on the experimental results can be found in [6].

Table 2: Selection precision as a function of the number of matched templates.

No. of templates        Similarity range                              Average
Texture    Color        [0.80,1.00]    [0.70,0.79]    [0.60,0.69]     precision
1          1            0.75           0.83           0.58            0.73
1          2 or 3       0.75           0.75           0.5
2 or 3     1            0.83           0.75           0.63            0.72
2 or 3     2 or 3       1.00           0.5            0.75
Average precision       0.78           0.73           0.60            0.72

The average of 0.72 in the right-hand column covers the three rows in which at least one feature matched 2 or 3 templates.

Search Efficiency. The search ratio $S$ is defined to measure the query search efficiency resulting from the database selection. Given a query, let $|DBs|_{image}$ be the total number of images in the database sites selected by the selection approach in response to the query, and $|DB|_{image}$ be the total number of images in all registered databases. The database search ratio is calculated as:

$$S = \frac{|DBs|_{image}}{|DB|_{image}}. \tag{8}$$

The $S$ values of all queries were calculated by comparing the total number of images in the top 4 selected databases with the total number of images in all 9 databases. A lower $S$ value indicates that fewer images need to be searched for the query, and thus a higher efficiency of image retrieval. In our experiments, the average $S$ value of all 35 queries is 0.39. Thus, the number of images searched for a given query drops from all 8,722 images to an average of 3,401 images. A low search ratio combined with a high selection precision shows the overall good performance of our system.

Refinement. We chose the 9 queries that have the lowest selection precision and applied the refinement process. The average selection precision of the 9 queries before refinement was 0.47. The selected databases returned by the refinement process were compared with the results of the initial selection performed using the original query. Note that both the initial and the refinement selections were conducted against all databases. The changes in precision are used to evaluate the effect of the refinement process. For each query, the top 5 retrieved images returned by each of the top 4 selected databases (20 images in total) were inspected and designated as relevant or irrelevant. In determining the relevancy of an image, similarity measures instead of user judgments were used to achieve consistent results. Images that had a similarity greater than or equal to 0.80 with respect to both the texture and color subqueries of the given query were considered relevant to the query; the rest of the images were considered irrelevant. The weights $\alpha$, $\beta$, and $\gamma$ were set to 0.5, 1.0, and 0.5, respectively. After the refinement, the average selection precision of the 9 queries reached 0.58, which is a significant improvement over the selection precision of 0.47 before the refinement. The result demonstrates that relevance feedback is a powerful process for improving the output of the retrieval system.

Discussion. Following the two-step process to handle a given query, the robustness of the retrieval system (whether or not all and only the relevant images for the query are retrieved) is determined by the accuracy of the extracted features, the similarity measures, and the selected templates. Specifically, the effectiveness of the site selection approach relies on the following factors:

• The feature extraction algorithms can generate feature vectors which are close to each other in the feature space for similar images.

• The closeness of the feature vectors can be represented by a high degree of similarity between the two vectors. In addition, a good similarity measurement should be consistent with human perception.

• The content of the databases can be adequately represented by the templates. Thus, the sample images submitted by the visual databases for template configuration must adequately represent the underlying images included in the database.

The proposed system is able to generate reasonable texture and color feature vectors and cluster the feature vectors using a hierarchical approach. The effectiveness of the similarity measure in capturing visual similarity to templates was demonstrated by showing the existence of different similarity ranges for distinct clusters, as shown in Figure 8. In our experiments, more than 85% of the 8,722 database images can be matched to a single template with a high degree of similarity (i.e., $\ge 0.80$) to the matched template. This demonstrates that the templates adequately represent the majority of the database images.

Conclusion

We have developed the system framework NetView, which supports global visual query access to various visual databases over the Internet. This framework includes the creation of a metaserver and its major components: the metadatabase, the metasearch agent, and the query manager. The formulation of the metadata is achieved by abstracting the visual information in visual databases through templates and capturing the statistics of the similarity distributions of visual data as indices to databases. The selection of databases for distributing a given query is achieved by matching the content of the query with the templates stored in the metadatabase and then generating a list of the most relevant databases. The performance of the system is refined based on user feedback. The experimental results have demonstrated that such an integrated system can significantly improve the overall efficiency of retrieving images stored in distributed visual databases, since fewer databases need to be searched. The proposed framework is currently being deployed and tested at the Computer Science Department at the State University of New York at Buffalo. Further research is being conducted to improve the system performance and functionality by exploring new clustering approaches and supporting additional visual features such as shape.

Acknowledgments

We would like to thank Deepak Murthy for his participation in designing the prototype of the NetView system.

References

[1] Special Issue on Content-Based Image Retrieval Systems, Editors V. N. Gudivada and V. V. Raghavan. IEEE Computer, 28(9), 1995.
[2] J.R. Bach, C. Fuller, A. Gupta, A. Hampapur, B. Horowitz, R. Jain, and C.F. Shu. The Virage image search engine: An open framework for image management. In Proceedings of SPIE, Storage and Retrieval for Still Image and Video Databases IV, pages 76–87, San Jose, CA, USA, February 1996.
[3] N. Beckmann, H.P. Kriegel, R. Schneider, and B. Seeger. The R*-tree: an Efficient and Robust Access Method for Points and Rectangles. In Proceedings of ACM-SIGMOD International Conference on Management of Data, pages 322–331, Atlantic City, NJ, May 1990.
[4] M. Bowman, P. Danzig, D. Hardy, U. Manber, and M. Schwartz. Harvest: A scalable, customizable discovery and access system. Technical Report CU-CS732-94, Department of Computer Science, University of Colorado-Boulder, 1994.
[5] S. Chang, J. Smith, M. Beigi, and A. Benitez. Visual Information Retrieval from Large Distributed Online Repositories. Communications of the ACM, 40(12):63–71, December 1997.
[6] W. Chang, G. Sheikholeslami, A. Zhang, and T. Syeda-Mahmood. Efficient resource selection in distributed visual information systems. In the ACM Multimedia'97, pages 203–213, Seattle, WA, November 1997.
[7] W. Chang and A. Zhang. Metadata For Distributed Visual Database Access. In Second IEEE Metadata Conference, Silver Spring, MD, September 1997.
[8] M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang, and B. Dom et al. Query by Image and Video Content: The QBIC System. IEEE Computer, 28(9):23–32, 1995.
[9] L. Gravano and H. Garcia-Molina. Generalizing Gloss to Vector-Space Databases and Broker Hierarchies. In Proceedings of the 21st International Conference on Very Large Data Bases, pages 78–89, 1995.
[10] L. Gravano, H. Garcia-Molina, and A. Tomasic. The Effectiveness of Gloss for the Text Database Discovery Problems. In Proceedings of the ACM SIGMOD'94, pages 126–137, Minneapolis, May 1994.
[11] A. D. Gordon. Classification Methods for the Exploratory Analysis of Multivariate Data. Chapman and Hall, 1981.
[12] R. Jain and S.N.J. Murthy. Similarity Measures for Image Databases. In Proceedings of the SPIE Conference on Storage and Retrieval of Image and Video Databases III, pages 58–67, 1995.
[13] B. Kahle and A. Medlar. An Information System for Corporate Users: Wide Area Information Servers. ConneXions - The Interoperability Report, 5(11):2–9, November 1991. WAIS is accessible at http://www.wais.com/newhomepages/techtalk.html.
[14] C. Lagoze and J. Davis. Dienst: An architecture for distributed document libraries. Communications of ACM, 38(4):47, April 1995.
[15] K. Obraczka, P. Danzig, and S-H Li. Internet Resource Discovery Services. IEEE Computer Magazine, 26(9):8–22, 1993.
[16] E. Remias, G. Sheikholeslami, A. Zhang, and T. F. Syeda-Mahmood. Supporting Content-Based Retrieval in Large Image Database Systems. The International Journal on Multimedia Tools and Applications, 4(2):153–170, March 1997.
[17] J. Rocchio. Relevance Feedback in Information Retrieval. In The Smart System - experiments in automatic document processing, pages 313–323. Prentice Hall, Englewood Cliffs, NJ, 1971.
[18] M. Schwartz, A. Emtage, B. Kahle, and C. Neuman. A Comparison of Internet Resource Discovery Approaches. Computing Systems, 5(4):461–493, 1992.
[19] S. Sclaroff, L. Taycher, and M. La Cascia. ImageRover: A Content-based Image Browser for the World Wide Web. In IEEE International Workshop on Content-based Access of Image and Video Libraries, pages 2–9, 1997.
[20] G. Sheikholeslami and A. Zhang. An Approach to Clustering Large Visual Databases Using Wavelet Transform. In Proceedings of the SPIE Conference on Visual Data Exploration and Analysis IV, pages 322–333, San Jose, February 1997.
[21] G. Sheikholeslami, A. Zhang, and L. Bian. Geographical Data Classification and Retrieval. In Proceedings of the 5th ACM International Workshop on Geographic Information Systems, pages 58–61, Las Vegas, Nevada, November 1997.
[22] John R. Smith and Shih-Fu Chang. VisualSeek: a fully automated content-based image query system. In Proceedings of ACM Multimedia 96, pages 87–98, Boston MA USA, 1996.
[23] John R. Smith and Shih-Fu Chang. Visually Searching the Web for Content. IEEE Multimedia, 4(3):12–20, 1997.
[24] J. Wang, W. Yang, and R. Acharya. Color Clustering Techniques for Color-Content-Based Image Retrieval. In the Fourth IEEE International Conference on Multimedia Computing and Systems (ICMCS'97), pages 442–449, Ottawa, Canada, June 1997.
[25] Wei Wang, Jiong Yang, and Richard Muntz. STING: A Statistical Information Grid Approach to Spatial Data Mining. In Proceedings of the 23rd VLDB Conference, pages 186–195, Athens, Greece, 1997.
