2000 Springer-Verlag
Int J Digit Libr (2000) 3: 85–99
Advanced techniques for digital libraries

SPIRE: a digital library for scientific information

Lawrence D. Bergman, Vittorio Castelli, Chung-Sheng Li, John R. Smith

IBM T.J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598, USA; E-mail: [email protected]

Received: 19 December 1998 / Revised: 9 June 1999
Abstract. In this paper we describe the architecture and implementation of a digital library framework for scientific data, particularly imagery, with a focus on support for content-based search. Content is specified by the user at one or more of the following abstraction levels: pixel, feature, and semantic. An object-definition mechanism has been developed that supports example-based and constraint-based specification of both simple and complex query targets. This framework incorporates a methodology yielding a computationally efficient implementation of image processing algorithms, thus allowing the interactive, real-time extraction and manipulation of user-specified features and content during the execution of queries. The framework is well-suited for searching scientific databases, including satellite imagery, and medical and seismic data repositories, where the richness of the information does not allow the a priori generation of exhaustive indexes.

Key words: Digital library – Scientific data – Acquisition of content – Abstraction pyramid – Object-based model
1 Introduction

We describe the architecture and the current implementation of a digital library for scientific data. Scientific data is quite unlike other multimedia types, in that it is acquired with instruments designed to produce digital information of a nature and in a format that facilitates its processing. In other words, unlike photographic images and video, scientific data is meant to be analyzed with quantitative methods that almost always involve the use of digital computers, and thus is ideally suited for storage in a digital library.

A great deal of scientific data can be considered to be imagery. Satellite images and medical images (including MRI and CT) are clearly in this category. Other regular, gridded datasets, such as seismic data, oceanographic sonar, output from 3D simulations of meteorological or hydrological systems, and many others, can often be treated as imagery. For this reason, much of our attention in the SPIRE system, and in this paper, has focused on how to identify, locate, and deliver content from scientific imagery. There is a significant body of literature on digital libraries, which we briefly review in this introduction. To highlight the problems encountered in managing scientific data (in contrast to "traditional" digital libraries, which typically manage video, photographic images, and text), we present a simple application scenario. We then describe the state of the art of several key technologies supporting digital libraries, discuss their limitations when used for scientific data, list our contributions, and conclude the introduction with an overview of the paper.

1.1 Digital library overview

The fundamental technological issues pertaining to digital libraries are the acquisition of content (while content acquisition is important for web-related archives, such as WebSEEk [1], where the digital library searches data belonging to multiple, non-federated owners, in this paper we assume that the data sources are known and well defined, and thus do not concern ourselves with this particular aspect), the search of a large collection of data, and the representation of the returned results. To support efficient search, a digital library architecture must face the following challenges: data storage, content categorization and indexing, and result delivery in digital format. To be practically useful, a digital library must provide the user with a simple
86
L.D. Bergman et al.: SPIRE: a digital library for scientific information
yet powerful interface, present the results of a search in a meaningful format, and allow the user to reformulate and iteratively refine queries. Fundamental work in the general area of digital libraries was conducted under the sponsorship of the Digital Libraries Initiative (DLI), at Carnegie Mellon [2], U.C. Berkeley [3], U.C. Santa Barbara [4], University of Illinois at Urbana-Champaign [5], University of Michigan [6], and Stanford University [7], and numerous projects and initiatives have since contributed to advancements of the field. Today, digital libraries organize heterogeneous and diverse media types, ranging from video [2], to photographic images, textual documents [5], audio [8], medical data [9, 10], geographically referenced data [4], and satellite images [11]. Given their ability to effectively manage this diversity of datatypes, we believe that digital libraries are ideally suited to manage scientific datasets. Satellite images of the earth, astronomical and planetary images, medical data, and geological measurements used in the petroleum industry, for example, all share the following characteristics: the volume of acquired data is extremely large, the density of information is high, the analysis (and retrieval) is usually performed using low-level descriptors that can be captured by a computer (rather than high-level semantics that are very hard to characterize algorithmically), the acquisition process is expensive, the analysis (by a human expert) is complex, and the data itself is valuable. Consider, for instance, satellite images of the earth's surface, which are being produced at an exponentially growing rate. The acquisition of this type of data is expensive, since it involves launching an earth-observing satellite into orbit, deploying a number of ground stations, and warehousing the tens of gigabytes produced daily by the orbiting instruments.
The large datasets produced are difficult to manage: each individual image is too large to be easily transmitted over the existing Internet infrastructure, and finding images for particular purposes out of many thousands of candidates can be daunting. Yet environmental protection agencies, developers, real estate agents, the Forest Service, farmers, paper companies, and educational institutions, to mention just a few, can significantly benefit from using remotely sensed data. What is needed are mechanisms allowing the user to effectively identify, among the terabytes of available data, the "interesting" portions of the few images relevant to the task at hand. Thus, digital libraries are the natural candidate tool for managing scientific and technical data. In the next section we present an application scenario, developed for oil production and exploration, intended to more clearly delineate the types of search that we consider typical of scientific digital library applications, and to indicate some of the solutions that we have explored. Other applications that we have implemented, including solar flare detection and uses of remotely sensed earth imagery, are provided as additional examples throughout the paper.
1.2 An application scenario

Digital libraries for scientific applications must support access to multiple datatypes, sophisticated mechanisms for specifying query semantics, and the ability to manage and process large amounts of information. We have addressed these issues in a recently constructed application for content-based retrieval from archives of petroleum well-bore imagery. We will describe this application and the use of the retrieval tool as an introduction to the capabilities of our system. An important component of oil exploration and oilfield management is obtaining information on the geology of the subsurface strata using data collected from already-drilled wells. Data is collected by lowering a package of instruments ("logging tools") to the bottom of the well, and slowly pulling them back to the surface. While the package is being pulled to the surface, the instruments measure various physical properties of the rocks surrounding the borehole, including electrical resistivity, sonic velocity, and natural and induced radioactivity. Most measurements are "single channel"; typically, every 6 inches a single measurement is made of the surrounding rock by a given instrument. Other instruments are more sophisticated. For example, the Formation Micro-scanner Imager (FMI) has four arms, each with two "pads" which press firmly against the surrounding borehole walls. The pads on the FMI have a very high density of electrodes which can detect subtle variations in electrical resistivity. With this tool, instead of one measurement being made every 6 inches, 192 measurements around the circumference of the bore are made every 0.1 inches. The result of measuring 1000 feet of a borehole with this instrument is therefore an image 192 pixels wide and 120 000 pixels high.
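As a back-of-the-envelope check on these image dimensions (an illustrative calculation, not part of the SPIRE system):

```python
# Dimensions of the FMI image produced by logging 1000 feet of borehole.
interval_ft = 1000
samples_per_inch = 10         # one row of measurements every 0.1 inches
electrodes_around_bore = 192  # measurements per row (image width)

rows = interval_ft * 12 * samples_per_inch  # 1000 ft = 12,000 in -> 120,000 rows
print(electrodes_around_bore, rows)         # 192 120000
```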
Fig. 1. Section of FMI and 6 single channel data from the Boonesville dataset

Figure 1 shows an interval in a borehole from the Boonesville, Texas, area which has been "logged" with 6 single channel instruments sampling every 6 inches (on the right-hand side of the image), and an FMI image sampled every 0.1 inches (left-hand side of the image). The section of the borehole shown consists of alternating intervals of sandstone, siltstone, and shale. Geologists studying well data are interested in identifying strata of particular types and/or particular characteristics. For example, it might be interesting to find all the coarsely bedded sandstone, or all the finely laminated shale intervals. Bulk lithologies (i.e., rock types) can generally be distinguished using the single channel measurements. For example, sandstone is characterized by low gamma ray values, while shale is characterized by high gamma ray values. Electrical resistivity can be used to distinguish sandstone whose pores are filled with oil versus water. Once these gross lithologies have been identified, the higher resolution FMI images can be used to identify fine-scale features within the rock. These fine-scale features can give the geologist clues as to the environment in which the strata were originally deposited: e.g., river, beach, desert.

Fig. 2. Definition of a new search object type ("thin-bedded-shale") through a positive texture example and a constraint on a log measurement, using the query specification interface

Our application allows the geologist to identify stratigraphic intervals based on example strata extracted from FMI images, as well as constraining the search by specifying restrictions on the accompanying single channel data. Figure 2 shows an example of this strata definition, which was constructed with our drag-and-drop query builder. A section of the FMI image has been selected that contains a sample of the desired strata. This sample image is used to define a texture to be matched in any target images. The strata definition also contains constraints on the single channel measurements. In this example, we have specified that our definition of shale will be restricted to intervals that have gamma ray values greater than 50. Figure 3 shows the results of using this strata definition to identify similar intervals in a single 1000-foot set of well data. It can be seen that a number of similar stratigraphic intervals have been identified that match the search criteria. Often a geologist wishes to identify a set of lithologies which, when they occur together and in a particular order, characterize a geologic feature. For instance, a river delta is often characterized by a sand sequence coarsening in the upward direction, abruptly capped by shale. Using our system, the geologist can define a composite object by specifying a set of constraints between multiple component objects (potentially of different types). For example, the user might construct a query of the form
Fig. 3. Results of searching for “thin-bedded-shale” (as defined in Fig. 2) over a 1000-foot section of the Boonesville dataset
Fig. 4. Definition of river delta lobe as a compound object
“Define compound object called ‘delta lobe’, consisting of thinly laminated siltstone, overlain within 10 feet by coarsely grained sandstone, which in turn is abruptly overlain by shale”. Figure 4 diagrams this compound object. Such a query would be composed using our drag-and-drop query builder (described in Sect. 2.4), with results returned as in the previous example.
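A compound-object definition of this kind can be thought of as a small constraint graph over component objects. The sketch below illustrates one possible representation of the "delta lobe" example; the class names, relation names, and parameters are illustrative, not the actual SPIRE API.

```python
from dataclasses import dataclass, field

@dataclass
class SimpleObject:
    """A component object: a semantic label plus optional attribute constraints."""
    label: str
    constraints: dict = field(default_factory=dict)

@dataclass
class Relationship:
    """A (mostly binary) relationship between two named components."""
    subject: str
    relation: str   # e.g., "overlain_within" or "abruptly_overlain_by"
    obj: str
    params: dict = field(default_factory=dict)

@dataclass
class CompoundObject:
    name: str
    components: dict         # name -> SimpleObject (or nested CompoundObject)
    relationships: list      # list of Relationship

# The "delta lobe" query from the text: siltstone overlain within 10 feet
# by coarsely grained sandstone, which is abruptly overlain by shale.
delta_lobe = CompoundObject(
    name="delta lobe",
    components={
        "A": SimpleObject("thinly laminated siltstone"),
        "B": SimpleObject("coarsely grained sandstone"),
        "C": SimpleObject("shale"),
    },
    relationships=[
        Relationship("A", "overlain_within", "B", {"max_distance_ft": 10}),
        Relationship("B", "abruptly_overlain_by", "C"),
    ],
)
print(delta_lobe.name, len(delta_lobe.relationships))  # delta lobe 2
```

Because a `CompoundObject` can itself appear as a component, definitions of this form compose incrementally, which matches the object-library behavior described in Sect. 2.3.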
1.3 Preliminaries and previous work

1.3.1 Searching multimedia libraries by content

Classical database queries, expressed via SQL, assume that the data is stored in tables, i.e., that the information is structured. Relational and other "traditional" database types do not extend well, however, to repositories of non-traditional, unstructured information. Information retrieval from multimedia databases and libraries is the subject of active investigation, and no universal approach to the problem has yet been found. Content-based retrieval [12] is a term denoting a large class of diverse methodologies for searching multimedia repositories; essentially every solution that does not rely simply on traditional metadata falls under this category. Content extraction for multimedia databases has numerous similarities with computer vision, and has the same limitations. In particular, the problem of automatically deriving a reasonably accurate semantic description of, say, a photographic image has not yet been solved. To overcome this impasse, the content-based community has almost universally taken the approach of extracting low-level descriptors (features) from the data and casting content-based retrieval into a similarity search problem in the feature space. This approach assumes that multimedia objects that are perceptually dissimilar most likely have different content, and that only a few objects that have different content are perceptually similar. If these assumptions are satisfied, then content-based retrieval faces three challenges: how to map a multimedia object to a set of numeric quantities that can be stored and indexed, how to measure similarity, and how to communicate the desired content to a computer. For the sake of discussion, we shall analyze the three points in the context of a library of photographic images.
Most existing systems in this domain capture content through low-level image descriptors that are inspired by properties of the human visual system. In particular, one can characterize images in terms of color, texture and shape. Starting from this consideration, researchers have developed mappings, such as color, texture, and shape features, from color spaces and 2D grayscale fields to numeric spaces. With the exception of some shape features, such mappings are not meant to mimic the way humans process and understand color, texture and shape, but are designed to allow the computer to discriminate between regions that look different to humans. Thus, for instance, a computer can segment an image into regions of homogeneous texture in much the same way as a human would do by hand, but to do so, it relies on quantities that in general do not correspond to human texture concepts such as “regularity”, “contrast”, or “granularity”. In what follows, we will use color, texture and shape to denote computer-generated features, and care should be taken not to confuse them with their perceptual equivalent. Almost universally, color is characterized by means of the empirical probability distribution (called color histogram [13]) of the pixel values in an appropriately quantized color space. Research in the field has concentrated on what color space is the most appropriate, how to quantize the color space, how to index the color histograms for fast retrieval and on alternative color features. A typical color histogram is an array (vector) of 64 to 256 entries, describing an entire image. Less frequently, images are segmented into homogeneous regions and color histograms are acquired from each region. Texture [14] is a visual quantity that pertains to local variations of the image gray levels (color texture can also be defined). 
Texture presents more difficulties than color for content-based search, since there is no known way of consistently mapping "visual" textures to numerical quantities. There are, however, quantities, such as gray level variations in local neighborhoods, which are quite easy to compute. Even though these texture features do not correspond directly to quantities that humans find descriptive, they are widely used. Such computationally tractable features include: transform-based features (where the features are coefficients or statistics of coefficients of linear transformations), gray-level-difference texture features (i.e., statistics of the histogram of the gray level differences), and co-occurrence matrix features (i.e., statistics of the co-occurrence matrices of the gray levels or of the quantized color space). Several texture features are almost always extracted at the same time and organized in a feature vector. Shape has also been used as a content-based retrieval feature, though more sparingly than texture. Usually images are analyzed and reduced to a set of simple shapes (for instance blobs [15]). Sometimes more complex shape descriptors (such as shocks [16]) are extracted and indexed.
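As concrete illustrations of the two feature families discussed above, the following sketch computes a 64-bin color histogram over a uniformly quantized RGB cube and two gray-level-difference texture statistics. The bin count and the single horizontal offset are illustrative choices; production systems use richer variants.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Empirical color distribution over a uniformly quantized RGB cube.

    pixels: iterable of (r, g, b) tuples with values in 0..255.
    Returns a normalized list of length bins_per_channel**3 (64 here).
    """
    step = 256 // bins_per_channel
    hist = [0] * bins_per_channel ** 3
    pixels = list(pixels)
    for r, g, b in pixels:
        idx = (r // step * bins_per_channel + g // step) * bins_per_channel + b // step
        hist[idx] += 1
    return [count / len(pixels) for count in hist]

def gray_level_difference_features(image, dx=1):
    """Two statistics of the horizontal gray-level-difference histogram:
    mean absolute difference and contrast (second moment).

    image: 2D list of gray levels.
    """
    diffs = [abs(row[x + dx] - row[x])
             for row in image for x in range(len(row) - dx)]
    mean_abs = sum(diffs) / len(diffs)
    contrast = sum(d * d for d in diffs) / len(diffs)
    return mean_abs, contrast

red_image = [(255, 0, 0)] * 64            # a uniformly red 8x8 "image"
print(max(color_histogram(red_image)))    # 1.0 -- all mass falls in one bin

flat = [[5, 5, 5, 5]] * 4                 # uniform region: no texture
stripes = [[0, 9, 0, 9]] * 4              # high-frequency texture
print(gray_level_difference_features(flat))     # (0.0, 0.0)
print(gray_level_difference_features(stripes))  # (9.0, 81.0)
```

In practice several such statistics, computed at several offsets and orientations, are concatenated into the feature vector mentioned above.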
In the context of the scientific data we have encountered, color histograms have little practical applicability, and simple, "universal" shape descriptors are also of limited use. On the other hand, texture has proved very useful in analyzing satellite images (where it can be used to discriminate between semantic classes that are not easily distinguished using spectral information), data for the oil industry, as described in Sect. 1.2, and some types of medical images (such as mammograms and MRIs of certain types of tumors). The next challenge is how to map similarity, as defined by the user, to similarity as computed by the computer using low-level features. It is important to stress that similarity here is usually perceptual similarity and not semantic similarity. Histograms and texture vectors can be represented as points in a high-dimensional Euclidean space. Roughly speaking, similar images map to points that are close in this space; thus similarity can be captured via a distance function. The main challenge is to make sure that the relative similarity ranking induced by the distance function is consistent with the notion of similarity as required by the user. Solutions range from selecting different metrics (for a discussion of similarity metrics see, for instance, [17]), to using relevance feedback to modify the similarity metric using positive and negative examples provided by the user [18–21]. The final challenge is how to communicate the desired content to the computer. Since the features used to process the query are non-intuitive, the user typically does not specify content by providing feature values directly. Instead, query-by-example is used almost universally. Examples are either in the form of images, provided by the user or selected from a "dictionary", or are sketches, drawn using facilities provided in the query interface.
When multiple feature types are used simultaneously, such as in the QBIC system [12] which retrieves based on texture, color and shape, the user can often control the relative importance of the feature classes in determining similarity. More rarely, as in WebSEEk [1], some facilities exist to directly modify feature values through visual interfaces.
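User-controlled weighting of feature classes amounts, in its simplest form, to a weighted sum of per-feature-class distances. The sketch below illustrates the idea; the weight semantics are illustrative and QBIC's actual combination scheme may differ.

```python
import math

def feature_distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def combined_distance(query, candidate, weights):
    """Weighted combination of per-feature-class distances.

    query/candidate: dicts mapping feature-class name -> feature vector.
    weights: dict mapping feature-class name -> relative importance.
    """
    total_weight = sum(weights.values())
    return sum(w * feature_distance(query[k], candidate[k])
               for k, w in weights.items()) / total_weight

q = {"color": [1.0, 0.0], "texture": [0.5, 0.5]}
c = {"color": [0.0, 0.0], "texture": [0.5, 0.5]}
# Emphasizing texture (which matches exactly) shrinks the combined distance.
print(combined_distance(q, c, {"color": 1.0, "texture": 1.0}))  # 0.5
print(combined_distance(q, c, {"color": 1.0, "texture": 3.0}))  # 0.25
```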
1.3.2 Searching multimedia databases

When multimedia objects are mapped into feature spaces, similarity retrieval is reduced to nearest neighbor queries in high-dimensional metric spaces. Unfortunately, this is a very hard problem. If the metric is known and fixed a priori, one can construct similarity indexes, as described, for instance, in [22]. Due to the "curse of dimensionality" phenomenon [23], however, as the length of the feature vectors grows, the search becomes exponentially harder, and the indexes quickly become useless. To mitigate the problem, one often relies on dimensionality reduction techniques and approximate searches (see [24] and references therein).
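In the absence of a usable index, similarity retrieval falls back to a linear scan of the feature vectors, which is also what high-dimensional index structures effectively degenerate to under the curse of dimensionality. A minimal sketch:

```python
import heapq
import math

def k_nearest(query, vectors, k=3):
    """Brute-force k-nearest-neighbor search: O(n * d) per query.

    vectors: list of (id, feature_vector) pairs.
    Returns the k (distance, id) pairs closest to the query vector.
    """
    def dist(v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(query, v)))
    return heapq.nsmallest(k, ((dist(v), vid) for vid, v in vectors))

db = [("a", [0.0, 0.0]), ("b", [1.0, 1.0]), ("c", [0.1, 0.0]), ("d", [5.0, 5.0])]
print([vid for _, vid in k_nearest([0.0, 0.0], db, k=2)])  # ['a', 'c']
```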
When the metric is not known a priori, for instance when it is learned at query time from positive and negative examples, the use of pre-constructed indexes becomes difficult, and is usually avoided.

1.4 Contributions

There are several limitations of the existing technologies that our system is designed to overcome, often by relying on properties of scientific data. First of all, existing systems tend to support a fairly limited range of query specification styles. A system may provide query-by-example facilities, or may support semantic searches based on labels, but typically does not provide both. Related to this is an inability of most systems to freely combine query elements at different abstraction levels. We consider it quite important to be able to ask queries of the form, "Find me flat, grassy areas that are to the north (upwind) of and within 5 kilometers of a given fire location". This query (of use to wildfire managers for landing helicopters), as processed by our system, would combine previously extracted information (fire location) with textural information from an image (grassy) and associated data sets (flat, from a digital elevation map). Second, existing systems essentially rely on pre-extracted and pre-indexed information, thus limiting the type of content that can be searched to what is already stored in the library. Our system has facilities for defining new features and semantics, extracting them from the data, and using them in the construction of queries. Third, combining representations (pixel, feature, semantic) in order to accelerate query processing is a relatively unexplored area. Although there has been a fair amount of work on how to index particular types of information (as previously cited), little attention has been paid to how to utilize multiple types, including how to capitalize on multiresolution representations. Fourth, multimedia query systems typically provide little support for building a library of user-defined object types.
This is particularly true of types that are defined in terms of subcomponents and constraints. Our system allows the user to incrementally build a dictionary of objects and queries, by composing previously defined objects via spatial and temporal operators. Finally, support for query refinement tends to be weak in systems of this sort. Although the use of relevance feedback has been incorporated into several systems in recent years, the use of negative examples in conjunction with multiple positive examples is relatively unexplored. We have developed a methodology for efficiently combining iterative refinement based on relevance feedback with indexing.

1.5 Paper overview

In the rest of the paper we explore the design criteria and architecture of the SPIRE digital library for scientific information. Query formulation, including our model of query construction, the query refinement process, and the user interface, is addressed in Sect. 2. Underlying technology for processing searches is described in Sect. 3. We describe the architecture of the system in Sect. 4. Section 5 contains discussion, conclusions, and future directions for research.
2 Query formulation
2.1 The progressive framework

The heart of the search and retrieval paradigm implemented in our system is a progressive framework, combining data representation with search operations. The main motivation for the development of the progressive framework is the richness of non-structured data (such as images and video), and especially of scientific data. This richness requires more advanced methodologies for content specification than those found in most current image-retrieval systems, and typically prevents the automatic extraction of content at the time of ingestion into the database. One would be tempted to state that the correct definition of objects is in terms of semantics. This definition, nevertheless, is not sufficient to describe scientific data. Numerous examples can be found in different disciplines. For example, a radiologist's report that simply stated a diagnosis, with no description of shapes, textures, sizes and distances, or other observable features, would contain information inadequate for its intended use. Sometimes, as with remotely sensed images, even a feature-level description is inadequate: for instance, when the scientist is interested in locating ground control points in different images in order to superimpose or composite them into a mosaic, the search is performed by comparing different images pixel-by-pixel. Based on these considerations, we have organized data into an abstraction pyramid. This pyramid is diagrammed in Fig. 5. The abstraction pyramid, which we call the "InfoPyramid", permits query specification at a variety of abstraction levels, including pixel (raw data) level, feature level, semantic level, or using metadata information. Note that there are two kinds of information that a user could potentially provide as part of a query specification. The first is a set of parameters used to conduct a search at a particular level (or at several levels) of the pyramid.
When several levels are employed, this specification will include information on how to combine them, for instance through simple Boolean operators, or fuzzy logic operators. The second type of information describes how to transform data between one level of the pyramid and another. These two types of information will be discussed in Sect. 2.3 below.
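Combining match scores from several levels of the pyramid with fuzzy logic operators can be sketched as follows, using min and max as the standard fuzzy "and"/"or". The level names and scores are hypothetical, chosen to echo the limestone example discussed in Sect. 2.2.

```python
def fuzzy_and(*scores):
    """Fuzzy conjunction: the weakest requirement dominates."""
    return min(scores)

def fuzzy_or(*scores):
    """Fuzzy disjunction: the strongest evidence dominates."""
    return max(scores)

# Match scores in [0, 1] produced independently at different abstraction
# levels for one candidate region (hypothetical values).
scores = {"semantic": 1.0,   # label matched exactly
          "feature": 0.7,    # texture similarity to the user's example
          "pixel":   0.4}    # pixel-level correlation with a template

# "matching label AND similar texture" -- a conjunctive two-level query.
print(fuzzy_and(scores["semantic"], scores["feature"]))  # 0.7
# "similar texture OR similar pixels" -- a disjunctive query.
print(fuzzy_or(scores["feature"], scores["pixel"]))      # 0.7
```

With simple Boolean operators, the scores would first be thresholded to {0, 1} and then combined in the same way.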
Fig. 5. The abstraction pyramid: raw data, features (texture, shape, ...), semantics (e.g., lake, city), and metadata
2.2 Object definition

To effectively search multimedia databases, the SPIRE system relies on an object-based representation of content. We define simple objects as connected regions that are homogeneous with respect to a set of characteristics defined at one or more abstraction levels. Figure 6 diagrams a commonly used information flow for creating simple objects. In this pipeline, features are computed from pixels, the features are segmented based on some homogeneity measure, and then labels are applied to the resulting regions. Also note that some of these stages may be processed offline; for example, we pre-extract a number of features at image ingest.

Fig. 6a–d. Extracting descriptors at multiple levels of abstraction: from the raw data (a), features are computed (b); the image is then partitioned into homogeneous regions (c); each region is classified to produce a class label, coordinates are extracted, and attributes are computed (d)

Simple objects can be defined using multiple attributes defined at different abstraction levels. A geologist analyzing resistivity data acquired from a well bore (data that are usually displayed as images) could potentially ask the repository to return all portions of images containing limestone with texture similar to a specified example. Here, the object is defined using both semantic-level information (limestone) and feature-level information (texture similarity). Simple objects, however, are not sufficient to support an extensible search framework. For example, consider the detection of "bright points" on the solar corona using images acquired with the SOHO (Solar and Heliospheric Observatory) instruments. Bright points are small bright areas, easily visible in the images acquired with the Extreme ultraviolet Imaging Telescope (EIT), and differ from other similar phenomena by their association with a very orderly, persistent magnetic field, which confines the hot plasma. These magnetic fields are easily detected using photospheric magnetograms, acquired with a different instrument. The magnetogram represents data from a deeper layer within the sun than the one imaged by the EIT instrument. Thus, bright points are defined in terms of two simple objects specified at the feature level (bright areas in the EIT data and magnetic fields in the photospheric magnetogram data) which are related by a positional constraint (i.e., they are at the same location, but at different "depths"). More complex objects can require more complex definitions: consider again a case involving solar data, where the scientist is now looking for Coronal Mass Ejections (CMEs), a type of violent eruptive phenomenon. A CME appears as a sudden brightening of a bright spot, immediately followed by the appearance of one or two dark spots near the brightening spot (the hot plasma gets concentrated from the darkening areas to the brightening spot); finally, a "shock wave" can be observed that extends to the entire surface of the sun within a few hours.
The definition of a CME involves a set of simple objects (a bright spot, a brightening spot, and one or two darkening areas) related by spatial and temporal constraints (the bright spot is followed by a brightening spot in the same position, the darkening areas are adjacent to the brightening spot, darkening occurs slightly after the brightening, and the darkening areas are always larger in size than the brightening spot). We call these more complex definitions "compound objects". Compound objects consist of simpler objects (either simple objects or previously defined compound objects) and a set of relationships. Relationships are specified as binary relationships between pairs of component objects, although in some instances (such as "between") they involve more than two components. Note that compound object relationships may involve sharp constraints (e.g., "A is within 5 kilometers of B") or fuzzy relationships (e.g., "A is near B").
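The distinction between sharp and fuzzy constraints can be made concrete by scoring each as a function of the distance between two components. The linear membership function below is illustrative; it is not the scoring actually used by SPIRE.

```python
def within(dist_km, limit_km):
    """Sharp constraint "A is within limit_km of B": satisfied or not."""
    return 1.0 if dist_km <= limit_km else 0.0

def near(dist_km, scale_km=5.0):
    """Fuzzy relationship "A is near B": full membership at distance 0,
    declining linearly to 0 at twice the characteristic scale
    (an illustrative membership function)."""
    return max(0.0, 1.0 - dist_km / (2 * scale_km))

print(within(4.0, 5.0), within(6.0, 5.0))  # 1.0 0.0
print(near(0.0), near(5.0), near(12.0))    # 1.0 0.5 0.0
```

A compound-object match score can then be obtained by combining the per-relationship scores, e.g., conjunctively with min.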
2.3 Query specification

Our system allows several ways of querying the abstraction levels of the InfoPyramid, as well as of specifying how information is to be transformed within the pyramid.
The simplest way of specifying a query is through a semantic label ("find me a forest"). Our system supports only exact label matching; semantic matching is a well-developed field that we have not attempted to extend. A very common way of specifying queries is by providing examples. Texture matching is a case in point of this specification mode. A texture match is specified by selecting a region of an image that represents the desired feature. For example, a forester who is looking for 10–20 year old hardwood forests might pick (by rubber-banding, for example) a region from a Landsat image that is known to represent the desired forest type. An example-based specification can include several such examples of the type desired, known as positive examples, as well as several examples of types that are explicitly not desired, known as negative examples. Positive examples are combined using the logical operator "or" (find something that looks like A or that looks like B), whereas the negative examples are combined using the operator "and" (find something that does not look like C and does not look like D). Example-based specifications are used to define simple objects using features and feature-matching operators, or pixels and pixel-matching operators. We provide an extensive facility for the user to define new features (this is a transformation from pixels to features) by specifying sequences of data algebra and processing operators to be applied to the raw data. Features are defined using a scripting language via the drag-and-drop interface described in the next section. We also allow a user to categorize features and attach labels to individual categories (this is a transformation from features to semantics). This is accomplished through an interactive interface that allows the output from a classifier or clustering routine to be labeled by the user. Figure 7 shows an example of specifying such labels.
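The combination rule described above — "or" over positive examples, "and" over the negated negative examples — can be written directly in terms of per-example similarity scores. This is a sketch using fuzzy operators; SPIRE's actual scoring may differ in detail.

```python
def match_score(sims_positive, sims_negative):
    """Combine per-example similarities in [0, 1] for one candidate region.

    Positive examples combine with fuzzy OR (max): look like A or like B.
    Negative examples combine with fuzzy AND (min) over their complements:
    do not look like C and do not look like D.
    """
    pos = max(sims_positive)  # "or" over positive examples
    neg = min((1.0 - s for s in sims_negative), default=1.0)
    return min(pos, neg)      # the candidate must satisfy both parts

# Looks a lot like positive example A (0.9), only mildly like negative C (0.3).
print(match_score([0.9, 0.2], [0.3]))          # 0.7
# Looks strongly like a negative example (0.95): rejected despite a strong positive.
print(round(match_score([0.9], [0.95]), 2))    # 0.05
```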
Simple objects, specified by the user, are added to a library of available object types within our system. Compound objects are constructed from this library and a set of predefined object relationships, using the drag-and-drop query builder. Compound object definitions are also added to the object type library, and become indistinguishable from objects defined using other mechanisms.

Fig. 7. Interactive assignment of labels to image regions

2.4 User interface

Formulation of content-based queries at the multiple levels of abstraction supported by our system requires a number of facilities. First of all, the query interface must support inclusion of sample images, or portions of images. Image portions are used as input to example-based search procedures, either as positive or negative examples (see Sects. 2.5 and 3.2.2). Second, the interface must support accumulation of a library of entities. These entities include object definitions, both simple and compound, and feature definitions. As the user constructs new semantic definitions, the query interface needs to include them in the list of entities that are available for constructing queries, or for building new definitions. In addition to these absolute requirements for the query interface, we had a number of desiderata. These include English-language-style query specification, image examples embedded in-line in object specifications, automatic syntax-checking or syntax-enforcement, and good support for both query refinement and query reuse. Beyond the requirements of any particular query interface, we also wanted a means of readily customizing query interfaces for particular application domains. One of the characteristics of digital libraries for scientific information is that they require a good deal of domain-specific configuration. Datatypes, relationships, and semantic entities all differ between applications. This goes far beyond simply changing database information; the kinds of questions, and the way that they are asked, can differ substantially between fields of inquiry. Our user-interface model, DanDE (short for drag-and-drop English), is an attempt to provide a highly flexible general-purpose environment for query construction that combines both structured query language and example-based constructions in a seamless fashion [25]. We have adopted an English-like query pattern, with phrases representing actions and relationships, and with objects which are distinguishable elements in the data repository. These objects can be textual fields (for instance, “bright” or “CME”); spatial features such as fires, roads, or land cover types; or images. The DanDE interface provides draggable entities representing objects and phrases which can be dropped
into placeholders within other phrases. These phrases represent query elements, and are presented as English-language phrases with embedded multimedia objects. Phrases can be combined with each other using drag-and-drop operations to build up query sentences. This interface style enables the construction of arbitrarily long and complex query sentences and provides an explicit view of all their portions. The interface also imposes constraints on the construction process, disallowing drops that would result in syntactically incorrect combinations of objects and relationships. Figure 8 shows a sample DanDE interface. Query phrases are selected from a menu at the left and dropped onto a phrase palette. Placeholders for subphrases are represented as “blanks” in the phrase and are filled from other menu items or by dragging subclauses on the palette into them, with automatic syntax-enforcement. Since subphrases can be dragged out of a phrase and retained on the palette, we have a rich facility for reusing and refining query components. As object and feature definitions are constructed, they can be added to menus, allowing us to build up sets of semantic entities from lower-level definitions.

Fig. 8. Sample from a DanDE interface

DanDE interfaces are specified through a high-level configuration file. The configuration file contains a BNF specification of the query syntax, information on the types of widgets to be displayed, and the behavior of widgets and subphrases. The most important behavior to be specified is the communication with application objects. This is specified through Java method calls to these objects, embedded in the configuration file. DanDE configuration files are read at run-time by a Java-based interpreter, and used to build the interface.

Each individual application built using the SPIRE framework has two user-interface components. The first is a DanDE query builder. The second is an image navigation/display browser. Figure 1 shows an example of the latter for the petroleum exploration scenario. The image browser for each application is constructed using a Java library, including classes for image management, client-server navigation, session history, geometric transformations, and overlay management.
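The syntax-enforcement behavior described above — rejecting drops that would produce an ungrammatical phrase — can be sketched with a toy rule table. The grammar below is invented for illustration; the actual system derives the legal combinations from the BNF specification in its configuration file.

```python
# Toy illustration of DanDE-style syntax enforcement: each placeholder in a
# phrase template accepts only certain entity types, and a drop is rejected
# unless the dragged entity's type matches.  The templates and type names
# here are invented; the real system reads a BNF grammar from a config file.
GRAMMAR = {
    # phrase template                      -> allowed type for each placeholder
    "find <object> near <object>":         ["object", "object"],
    "<object> that looks like <example>":  ["object", "example"],
}

def can_drop(template, slot_index, entity_type):
    """Return True if an entity of entity_type may fill the given slot."""
    allowed = GRAMMAR[template]
    return 0 <= slot_index < len(allowed) and allowed[slot_index] == entity_type

print(can_drop("find <object> near <object>", 0, "object"))         # True
print(can_drop("<object> that looks like <example>", 1, "object"))  # False
```

Driving the check from a declarative table is what makes the interface customizable per application domain: a new domain supplies a new grammar, not new interface code.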
2.5 Iterative refinement

A facility for refining searches based on previous results is an important component of the query process. We have focused primarily on the problem of refining example-based queries by providing additional positive and negative examples to the query engine. Section 3.2.2 will discuss the computational mechanisms for processing multiple examples. Here we will focus on the user interface features designed to facilitate this. In addition to the drag-and-drop interface, which readily supports in-line inclusion of multiple image examples in a query, we have developed visualization tools to help evaluate the efficacy of similarity searches. A special type of image, which we call the match-score image [26], shows the search scores for each region of an image (see Sect. 3 for a discussion of region-based feature extraction), while providing contextual cues that facilitate interpretation of the scores. The image is produced by creating a color-coded transparent overlay that indicates score value (e.g., green for a good score, blue for a medium score, magenta for a poor score) and compositing it with a gray-level version of the original image. This composite image readily allows the user to visually locate areas that produced a high score, but which the user judges to be inappropriate matches, or areas which produce low scores but which the user judges to be good matches. Both of these cases are good candidates for additional examples during refinement.

3 Query processing
Our approach to a scalable and extensible search framework is based on combining data representation and data manipulation. As mentioned in Sect. 2, we distinguish between different levels of content abstraction, namely, metadata, semantic level, feature level, and raw data level. We believe that one can derive some semantic and feature level information from each multimedia object that has “universal” applicability, i.e., that is useful to a very large user community, and which can be extracted and indexed at ingestion time. At the same time, we are aware that specialized user groups will want to define new concepts and, perhaps, new features, that must be extracted and processed at query time. Our system allows the manipulation of predefined and preextracted data at the feature and semantic level, and at the same time can extract new content at any level of the InfoPyramid during the execution of a query.
In this section we will discuss support for managing both pre-extracted content and content defined at query-construction time. We will provide details on data ingestion, including compression, feature selection, feature extraction, and semantic object extraction; and on search processing, including operating on compressed data, learning through iterative refinement, and processing compound objects.
3.1 Preprocessing and ingestion

Raw data storage. Raw data in lattice format (time series, images, volumetric data, etc.) is compressed using a wavelet-based algorithm, optimized for fast decompression of arbitrary hyper-rectangular subsets at arbitrary scales. This algorithm produces a lossily compressed version of the data and a set of residuals, which, together with the lossily compressed data, yield lossless compression. The lossy version of the data can be reduced to 10% (or even to 5%, depending on the application) of the original file size without appreciable reduction of usefulness during the search and display operations; the residuals compress to between 30% and 50% for the typical datasets we have encountered. Thus, in a large digital library, the lossy version of the data can be stored on fast, random-access media, such as hard disks, and used during the search phase, while the residuals can be stored on slower and cheaper tertiary storage for retrieval only when the user has identified the desired data. The compression/decompression algorithm relies on the superscalar architecture and floating-point capabilities of modern processors, and the optimization process takes into account both the memory hierarchy (to appropriately stage memory accesses) and the characteristics of disks (to minimize I/O overhead). Numerous families of wavelet filters are contained in an easily extensible library of filter banks, from which the system administrator can choose depending on the type of data and the application domain. The wavelet transform produces a sequence of coarser and coarser approximations of the data. Thus, information at multiple scales (especially features, but also semantic content) can be extracted during the compression phase.

Feature selection. Feature sets that are useful for distinguishing image content are generally domain-dependent.
Although texture features have a great deal of descriptive power for the application domains we have explored to date (EOS data, astronomical images, petroleum data, and MRI medical images), texture is a broad category and may be represented in many ways. One should select the texture feature sets that best represent the information contained within the data, based on the application. Since we wish to pre-extract feature sets that are most likely to be useful for run-time object extraction, but minimize the
expense (computational and storage) of pre-extracting information, we want to select these feature sets carefully. Within SPIRE, different texture feature sets are selected for each application domain using an automatic mechanism which can analyze a large library of feature sets and identify the one that allows the best discrimination between different types of image regions. This algorithm takes as input a collection of image region sets, where each set contains homogeneous examples of a particular type of texture. For instance, we have generated, with the help of an expert, a training set of 37 texture examples from EOS images, which must be distinguishable by the software. The algorithm identified a set of 21 texture features that best distinguishes between the examples [27], and we have applied a similar methodology to well-bore images [28].

Feature extraction. As part of preprocessing, texture features are extracted from a sliding window of fixed size, dependent on the application and the data, and appropriately indexed. For the application scenarios currently supported, typical window sizes range from 24 × 24 to 64 × 64 pixels, and the step size is usually between 2 and 16 pixels. Within each window, a single vector of feature values (each value representing a separate quantity, such as entropy or roughness) is pre-extracted and stored. The number of texture features computed from each window typically ranges from 9 to 42. Indexing of the feature vectors for an image can include region-based indexing, in order to more readily locate the features associated with image sub-regions, and value-based indexing, to allow us to quickly locate features that have values similar to a search target. For satellite data, we index the coordinates of the extraction window using an R-Tree [29], and the feature values using a method that allows similarity search [30] and iterative refinement.

Semantic object extraction.
Often the metadata of scientific data contains semantic information (for instance, the diagnosis of the condition of a patient is frequently part of the dossier, and in this case the library ingestion mechanism does not have to automatically extract this information). Semantic content not described in the metadata that has general applicability, and that can be reliably inferred from the data, is extracted and indexed at ingestion time. For instance, multispectral satellite images of the earth can be easily partitioned into regions of similar land cover, using a suite of automatic classification algorithms. During ingestion, satellite images for which appropriate trained classifiers exist are labeled pixel-by-pixel. Connected regions of pixels with the same label are then merged into objects, attributes of each object are extracted, and bounding boxes are computed and indexed using R-Trees to facilitate spatial retrieval. Individual attributes of objects (such as size) can be indexed with appropriate binary trees to facilitate subsequent queries. Note that semantic content can be extracted from the raw
data, computed features, metadata, other semantic information, or any combination of these. Our system also allows definition of fuzzy semantic-level objects, by using a fuzzy rather than a traditional classifier. Classifiers, like other processing operators, are part of the extensible library of operators used by the search and ingestion engines. We have implemented numerous types of statistical classifiers, ranging from simple parametric methods to complex nonparametric algorithms. The structure of the system allows the system administrator to train selected classifiers from the library with the appropriate training data, or to use pre-trained classifiers during the object extraction phase. Since studies show that the relative performance of different algorithms is domain- and application-dependent [31–33], we also provide for ready addition of new classification algorithms to the library.

3.2 Searching

As mentioned in Sect. 2, it is impractical and often impossible to extract all conceivable content during the ingestion phase. The problem is further complicated, in the digital library domain, by the fact that “exact” queries, typical of traditional databases, are here replaced by similarity queries, where the concept of similarity is ill-defined and often subjective. Finally, when content is specified in terms of composite objects, that is, of sets of simple objects related by spatial or temporal constraints, the search space grows exponentially in the number of components. Thus, besides the usual problems encountered in retrieval based on content, the described system must address how to effectively extract new content specified by the user, how to capture the user’s idea of similarity, and how to efficiently process complex spatio-temporal queries involving multiple objects.

3.2.1 Operating on compressed data

As described in Sect.
2, a fundamental functionality of the system is to allow the definition of new features or semantic content to be extracted from the raw data at run-time. Although highly desirable, the resulting processing is computationally expensive, and a naive approach is impractical. One of the challenges we have addressed is how to process this type of dynamic specification at interactive speeds for a large archive. The SPIRE system provides a solution by combining data representation and manipulation. In particular, the search engine can access a library of signal/image processing operators that rely on properties of the compression algorithm to yield significant increases in execution speed over their classical counterparts. Such algorithms extract features and semantic information from the raw data [34, 35], and semantic information from pre-extracted features.
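The core reason the wavelet representation helps can be sketched in a few lines: one level of a 2-D Haar analysis yields a coarse approximation with a quarter of the pixels, on which a cheap first analysis pass can run before any full-resolution data is touched. This is only the underlying idea; the actual compressed-domain operators [34, 35] are far more elaborate.

```python
import numpy as np

# Illustration of processing on wavelet-compressed data: one level of a 2-D
# Haar transform gives a coarse approximation (2x2 block averages), so a
# first, cheap analysis pass inspects 4x fewer pixels than the original.
# This sketches only the core idea behind SPIRE's progressive operators.
def haar_approx(img):
    """One Haar analysis level: average each 2x2 block (coarse approximation)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

img = np.arange(16, dtype=float).reshape(4, 4)
coarse = haar_approx(img)
print(coarse.shape)        # (2, 2)
# A cheap first pass, e.g., thresholding, now runs on the coarse version:
print(int((coarse > 7).sum()))  # 2
```

Only regions that look ambiguous at the coarse level need to be revisited at full resolution, which is exactly the strategy of the progressive classifier described next.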
A user searching a database of multispectral satellite images, for example, can define a new classifier for a particular taxonomy of land-cover classes by specifying a set of example regions on one or more images. These regions are used as training data in a supervised learning algorithm, and the new semantic classes defined by the user now become part of the query dictionary. Classifying an entire satellite image is an expensive task, even when simple classifiers are used. The progressive classifier algorithm [35] operates on the wavelet-compressed version of the data. It first analyzes a low-resolution version, divides pixels into those corresponding to homogeneous regions and those corresponding to heterogeneous regions at full resolution, labels the former set, and analyzes the latter set at finer resolution. When the algorithm is used to classify entire images, it is four to six times faster than regular classification. When used during search, to identify objects of a particular class, the observed speedups are significantly larger, from 20 to 30 times. This reduces the time to analyze a 4-band, 4000 × 4000 pixel image to two to three seconds on a mid-range workstation, which is acceptable for an interactive query system.

3.2.2 Learning similarity measures through iterative refinements

A known problem of similarity retrieval in digital libraries is mapping the user’s notion of similarity into a computable function. Early approaches, such as that of QBIC, allowed the user to explicitly weight the various feature types used in the matching process (texture, color histogram, and shape). A significant body of research has since been devoted to recasting the problem in a classification framework, where the metric used for retrieval is trained using a set of labeled examples [20, 36, 37]. Usually, images are divided into “good matches” and “poor matches”, and a simple parametric classifier is trained using the examples.
Sometimes complex metrics are learned either online, using features extracted from the entire images [37], or offline, using features extracted from sliding windows [20]. Complex metrics are more flexible, but also more computationally expensive during search. In the SPIRE system, similarity search is performed differently at different levels of the abstraction pyramid. At the metadata level, searches are usually sharp, i.e., matches are not ranked according to similarity (for instance, if the user requires an image of Indiana, returning an image of Kentucky is as unacceptable as returning an image of Australia, despite the fact that Kentucky is closer to Indiana than Australia is). At the semantic level, objects are extracted using either classical or fuzzy classifiers. Similarity between sharp pre-extracted objects is defined using a zero-one metric (objects of the same class have a similarity of one, objects of different classes have a similarity of zero). Similarity between fuzzy labels is defined in terms of the membership function, i.e., the similarity
of an object to a target class is the value of its membership in that class. While the rules governing similarity for metadata and semantic content lead to simple construction of unambiguous queries, specifying content at the feature level is more complex, due to the fact that it is hard for the user to map visual cues into features and vice versa. Features that are visually meaningful to a human being (such as texture) are represented in the computer using nonintuitive representations (such as fractal dimension). For this reason, query-by-example offers the only viable specification option. As described in Sect. 2.4, the query-specification interface has widgets that accept multimedia objects as part of a query specification. For instance, the user can define a new type of object as areas that “look like this with respect to texture”, where “this” is a portion of an image. The semantics of such a definition is the following: the system interprets the example as an area which is homogeneous with respect to one or more texture features. Feature vectors for all extraction windows (as described in Sect. 3.1) that overlap the example region are retrieved, yielding a “cloud” of points in the feature space. From this cloud of points the search engine extracts the mean $\mu_i$ and variance $\sigma_i^2$ of each texture feature, and constructs a modified Euclidean distance. If $x = [x_1, \ldots, x_n]$ is an $n$-dimensional texture feature vector in the database, its distance from the query example is defined as

$$D = \sum_{i=1}^{n} (x_i - \mu_i)^2 / \sigma_i^2 .$$
Distances are then mapped into the interval $[0, 1]$ through a simple trapezoidal function, to produce a score $s$. When multiple (positive and negative) examples are provided by the user, for instance as part of an iterative refinement process, the system adjusts the parameters of the trapezoidal functions associated with each example, and assigns scores to the stored feature vectors according to the following (conceptual) algorithm. Given a feature vector in the database, the system computes its similarity scores with respect to the positive examples, $s_i^{(p)}$, and its scores with respect to the negative examples, $s_i^{(n)}$, and combines them with the following formula:

$$s = \min\left( \max_i s_i^{(p)},\; \min_i \left(1 - s_i^{(n)}\right) \right). \tag{1}$$

If we interpret individual scores as membership functions of the classes of feature vectors defined by the examples, and we use the typical definitions of fuzzy AND (min), OR (max), and NOT ($1 - x$), Eq. (1) says that the score is the membership in the class of objects that belong to any of the positive example classes (fuzzy OR) and not to any of the negative example classes. By using a simple weighted Euclidean distance, the system can perform the search in a scalable fashion, by relying on appropriately modified tree-based indexes.
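The feature-level scoring pipeline described in this subsection can be sketched end to end: a variance-weighted Euclidean distance to an example's feature cloud, a mapping of distances into [0, 1], and the fuzzy combination of Eq. (1). For brevity, the sketch uses a one-sided linear ramp in place of the full trapezoidal function, and the ramp parameters are illustrative choices, not SPIRE's.

```python
# End-to-end sketch of feature-level similarity scoring: variance-weighted
# distance, distance-to-[0,1] mapping (a simplified ramp instead of the
# trapezoid), and the fuzzy min/max combination of Eq. (1).
# The ramp parameters lo and hi are illustrative, not from the paper.
def distance(x, mu, sigma2):
    """Variance-weighted Euclidean distance to an example's mean vector."""
    return sum((xi - mi) ** 2 / s2 for xi, mi, s2 in zip(x, mu, sigma2))

def score(d, lo=1.0, hi=9.0):
    """Map a distance into [0, 1]: 1 below lo, 0 above hi, linear between."""
    if d <= lo:
        return 1.0
    if d >= hi:
        return 0.0
    return (hi - d) / (hi - lo)

def combine(pos_scores, neg_scores):
    """Eq. (1): fuzzy OR over positives, fuzzy AND over negated negatives."""
    return min(max(pos_scores), min((1.0 - s for s in neg_scores), default=1.0))

x = [2.0, 4.0]
s_pos = score(distance(x, mu=[2.0, 4.0], sigma2=[1.0, 1.0]))    # distance 0
s_neg = score(distance(x, mu=[10.0, 10.0], sigma2=[1.0, 1.0]))  # far away
print(combine([s_pos], [s_neg]))  # 1.0
```

A vector close to a positive example and far from all negative examples thus scores near one, and adding examples during iterative refinement simply adds terms to the max and min.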
3.2.3 Processing compound objects

A compound object is a pair (O, R), where O is a set of objects and R is a set of spatial or temporal, fuzzy or sharp, relations between the elements of O. Unfortunately, the search space grows exponentially in the number of elements of O. Queries in the SPIRE system return the best n results, where n is a user-selectable parameter, usually much smaller than the total number of objects satisfying the query with a score greater than zero. Thus, scheduling and controlling the execution are essential steps in guaranteeing a timely completion of the search. Compound object definitions produced with the DanDE interface are sent to the server, which parses them into a relation graph. A scheduling algorithm produces a minimum spanning tree of the graph, by assigning costs to edges, and stages the execution accordingly. A dynamic programming algorithm then controls the execution of the search in a depth-first, greedy fashion, and relies on backtracking to ensure that the correct search results are returned. Details can be found in Li et al. [38].

4 System architecture

The architecture of the SPIRE system is diagrammed in Fig. 9. In order to provide an effective digital library solution for scientific data, the system design is based on several criteria:

– the system must run over the Internet
– the system must provide for ingest and management of a variety of datatypes and formats
– the system must provide a variety of search techniques and search operators
– the system must be readily extensible

SPIRE has an Internet-based client-server architecture. The client, which is described in more detail in Sect. 2.4, is written in Java, and packaged as an applet. Client-server communication is via HTTP. The server employs an HTTP daemon which runs the query engine via CGI. The query engine has two major subsystems: a data management subsystem and a mining library subsystem.
The data management subsystem is responsible for storing and retrieving information at a variety of abstraction levels (as described in Sect. 2.1). Data is stored in a variety of formats using multiple mechanisms, including files for raw data, features, and objects (and associated indices), and database tables for metadata. It is important to note that this subsystem is designed to provide a plugboard approach to data storage and retrieval — new types of indices, storage formats, or databases can be installed without modifying the system structure. The ingestion engine is responsible for inserting information into the data management subsystem, including: transforming raw data into one of the multiresolution formats employed by our system, adding metadata descriptions to the database, and creating information abstracts from the data, including appropriate indexing.

Fig. 9. Architecture of the SPIRE system (the diagram shows a client for query formulation connected over the Internet to the query engine, which draws on an ingestion engine; a mining library providing feature extraction, segmentation, clustering, classification, filtering, template matching, and object recognition; and a data repository holding raw data, features + index, objects + index, and metadata in DB2/ODBC)

The mining library subsystem is designed to provide a plugboard for image processing operators and other search facilities. Currently available modules provide for clustering, classification, filtering, image matching, linear transformations, object extraction, and object composition. New modules are readily installed, allowing for expansion of system facilities. The query engine is responsible for parsing queries from the client, invoking appropriate functions in the data management and mining library subsystems, and packaging return results for the client. A simple C-like scripting language is used to specify how query processing will be sequenced.
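The "plugboard" idea behind the mining library can be sketched as a registry: modules register themselves by name, and the query engine dispatches to them without knowing their implementations. The module names and signatures below are illustrative, not SPIRE's actual modules.

```python
# Sketch of a plugboard-style mining library: operators register themselves
# by name, and the query engine invokes them by name with parameters taken
# from the parsed query.  Module names and signatures are illustrative only.
MINING_MODULES = {}

def register(name):
    def wrap(fn):
        MINING_MODULES[name] = fn
        return fn
    return wrap

@register("threshold_filter")
def threshold_filter(data, cutoff):
    """Keep only values at or above the cutoff."""
    return [v for v in data if v >= cutoff]

@register("minmax_normalize")
def minmax_normalize(data):
    """Rescale values linearly into [0, 1]."""
    lo, hi = min(data), max(data)
    return [(v - lo) / (hi - lo) for v in data]

def run(name, data, **params):
    """Query-engine entry point: dispatch to a registered module."""
    return MINING_MODULES[name](data, **params)

print(run("threshold_filter", [1, 5, 3, 9], cutoff=4))  # [5, 9]
```

Installing a new operator is then just another `@register` declaration, which is the extensibility property the architecture criteria above call for.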
5 Discussion and conclusions

5.1 Conclusions

There are many important issues associated with providing an infrastructure for scientific digital libraries. In our system we have addressed a number of the most important concerns involved in query specification and processing. An important consideration in making a scientific digital library useful is a flexible and expressive query facility. Central to this is an appropriate paradigm for formulating queries — a paradigm that readily supports the different definitions of content that are implied by different domains, or that capture different users’ intents. We have developed what we believe to be a natural approach to query formulation by combining an abstraction pyramid with an object-based model. This approach permits easy combination of a variety of specifications, including example-based and semantic-label-based. A library of definitions, both user-defined and system-defined, is developed through time, with powerful facilities for combining existing object definitions in creating new object types. In addition, the system readily supports both the novice and the expert user: the former, by providing a set of pre-extracted types; the latter, by providing a rich facility for composing new types of arbitrary complexity.

Processing efficiency for large data archives is a fundamental concern. Since scientific questions often require specifications beyond those that can be precomputed, efficient indexing schemes for pre-extracted information are insufficient. We have developed technology in several areas designed to address these issues. In order to support run-time image processing operations, we have developed a progressive framework that combines image representation (in particular, image compression) with image processing. Progressive implementations of image processing operators rely on the properties of the compression scheme to significantly reduce the amount of data analyzed during the feature extraction and manipulation phases. The progressive operators are capable of extracting user-specified features at query time and using these features to search for new, non-predefined content. Similarity queries are effectively specified through selection of image examples. We have developed techniques for processing multiple positive and negative examples in an efficient and scalable manner. Complex query formulations, involving multiple objects with a number of constraints, are expensive to evaluate. We have devised a dynamic programming algorithm that makes processing of such formulations feasible, even for large digital libraries.

5.2 Future work
We are currently involved in several extensions of this search technology. The first is to combine modeling with our current information retrieval framework. In processing a query, we would like to not only search imagery and other scientific datatypes, but also incorporate results from statistical and/or simulation models. Our ultimate goal here is to be able to access models in a “smart” fashion using the progressive approach. We would like to run models at query time, with compute-intensive operations performed only in regions that are likely to meet the search criteria. We are currently working on extending our query framework to incorporate temporal phenomena. Scientific datasets often represent time-varying phenomena, acquired as multiple samples over a specified time period. We are seeking to provide a facility for specifying objects in terms of time-varying attributes. We have developed a progressive approach to querying time series [39] that supports this form of object specification. In addition to time-varying data, an ability to handle higher-dimensionality datasets, in particular 3D, would be invaluable. Our framework readily extends to higher-dimensional datasets and associated features (such as 3D texture, for example). We are currently seeking a motivating scenario for exploring such extensions, with medical datasets (MRI, CT) or seismic data being likely candidates. Once higher-dimensional support and time-series support are available, the combination of the two will allow content-based query of 3D time-varying datasets such as hydrological or meteorological simulations. Interoperability between digital libraries has become a topic of increasing interest over the last few years. Scientific data users wish to locate and combine data acquired from multiple sources. Content-based retrieval from distributed, heterogeneous archives poses particular challenges, given not only incompatibilities between metadata descriptions and data formats (many of these dealt with by existing or developing standards), but also incompatibilities in specification and search capabilities between search engines. We are currently working on developing an XML-based schema for describing the structure and services provided by content-based search engines. Such a schema could be queried, and content-based searches formulated in ways that are appropriate for each individual search engine.

Acknowledgements. The authors would like to thank Col. Chacko, Dr. Ian Bryant, Dr. Peter Tilke, Dr. Barbara Thompson, Dr. Loey Knapp, and Dr. Nand Lal for their comments and suggestions, and for defining application scenarios for our technology. This work was supported in part by NASA/CAN contract no. NCC5-101.
References

1. Smith, J.R., Chang, S.-F.: Visually searching the web for content. IEEE Multimedia Magazine 4:12–20, Summer 1997. Demo at http://www.ctr.columbia.edu/webseek
2. Wactlar, H.D., et al.: Intelligent access to digital video: Informedia project. IEEE Computer Magazine 29:46–52, May 1996
3. Wilensky, R.: Towards work-centered digital information services. IEEE Computer Magazine 29:37–43, May 1996
4. Smith, T.R.: A digital library for geographically referenced materials. IEEE Computer Magazine 29:54–60, May 1996
5. Schatz, B., et al.: Federating diverse collections of scientific literature. IEEE Computer Magazine 29:28–36, May 1996
6. Atkins, D.E., et al.: Toward inquiry-based education through interacting software agents. IEEE Computer Magazine 29:69–76, May 1996
7. Paepcke, A., et al.: Using distributed objects for digital library interoperability. IEEE Computer Magazine 29:61–68, May 1996
8. Li, V., Wanjiun, L.: Distributed multimedia systems. Proc. IEEE 85:1063–1108, July 1997
9. D’Alessandro, M.P., et al.: The Iowa Health Book: creating, organizing and distributing a digital medical library of multimedia consumer health information on the internet to improve rural health care by increasing rural patient access to information. In: Proc. 3rd Forum on Research and Technology Advances in Digital Libraries, ADL ’96, 1996, pp. 28–34
10. Lowe, H.J., et al.: The image engine HPCC project, a medical digital library system using agent-based technology to create an integrated view of the electronic medical record. In: Proc. 3rd Forum on Research and Technology Advances in Digital Libraries, ADL ’96, Washington, DC, USA, May 1996, pp. 45–56
11. Castelli, V., et al.: Progressive search and retrieval in large image archives. IBM Journal of Research and Development 42:253–268, Mar. 1998
12. Niblack, W., et al.: The QBIC project: querying images by content using color, texture, and shape. In: Proc. SPIE – Int. Soc. Opt. Eng., Vol. 1908, Storage and Retrieval for Image and Video Databases, 1993, pp. 173–187
13. Hafner, J., et al.: Efficient color histogram indexing for quadratic form distance functions. IEEE Trans. Pattern Anal. Mach. Intell. 17:729–736, July 1995
14. Rao, A.R.: A Taxonomy for Texture Description and Identification. New York: Springer-Verlag, 1990
15. Carson, C., et al.: Region-based image query. In: Proc. IEEE CVPR ’97 Workshop on Content-Based Access of Image and Video Libraries, Santa Barbara, CA, USA, 1997
16. Kimia, B.B., et al.: Shock-based approach for indexing of image databases using shape. In: Proc. SPIE Photonics East – Int. Soc. Opt. Eng., Vol. 3229, Boston, MA, USA, 3–4 November 1997, pp. 288–302
17. Santini, S., Jain, R.: Similarity query in image databases. In: Proc. IEEE Conf. Comp. Vis. and Pattern Rec., CVPR ’96, San Francisco, CA, USA, 18–20 June 1996, pp. 646–651
18. Rui, Y., et al.: A relevance feedback architecture for content-based multimedia information retrieval systems. In: Proc. IEEE Workshop on Content-Based Access of Image and Video Libraries, San Juan, Puerto Rico, 20 June 1997, pp. 82–89
19. Cox, I.J., et al.: PicHunter: Bayesian relevance feedback for image retrieval. In: Int. Conf. on Pattern Recognition, Vienna, Austria, 1996
20. Wan, X., Yang, Z., Kuo, C.C.J.: Efficient interactive image retrieval with multiple seed images. In: Proc. SPIE Photonics West – Int. Soc. Opt. Eng., Vol. 3527, Multimedia Storage and Archiving Systems III, Boston, MA, USA, 2–4 November 1998, pp. 13–24
21. Li, C.-S., Smith, J.R., Castelli, V.: SSTIR: similarity search through iterative refinement. In: Proc. SPIE Photonics West – Int. Soc. Opt. Eng., San Jose, CA, USA, 24–30 January 1998
22. Kim, B.S., Park, S.B.: A fast k nearest neighbor finding algorithm based on the ordered partition. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-8:761–766, Nov. 1986
23. Cherkassky, V.S., Friedman, J.H., Wechsler, H.: From Statistics to Neural Networks: Theory and Pattern Recognition Applications. Springer-Verlag, 1993
24.
Thomasian, A., Castelli, V., Li, C.-S.: Clustering and singular value decomposition for approximate indexing in high dimensional spaces. In: Proc. of Seventh International Conference on Information and Knowledge Management, Bethesda, MD, USA, 3–7 November 1998. CIKM ’98, pp. 201–207 25. Bergman, L., Schoudt, J., Castelli, V., Knapp, L., Li, C.-S.: Asimm: A framework for automated synthesis of query interfaces for multimedia databases. In: Proc. SPIE – Int. Soc. Opt. Eng. 3229:264–275, 1997 26. Bergman, L.D., Castelli, V.: The match score image: A visualization tool for image query refinement. In: Proc. SPIE Photonic West – Int. Soc. Opt. Eng. 3298:172–183, 1999 27. Li, C.-S., Castelli, V.: Deriving texture feature set for contentbased retrieval of satellite image database. In: Proc. IEEE Int. Conf. on Image Proc., Santa Barbara, CA, Oct. 26–29 1997, pp. 567–579 28. Li, C.-S., Smith, J.R., Castelli, V., Bergman, L.D.: Comparing texture feature sets for retrieving core images in petroleum applications. In: Proc. SPIE Photonic West – Int. Soc. Opt. Eng. 3656 Storage Retrieval Image Video Datab. VII :2–11, 1999 29. Guttman, A.: R-trees: a dynamic index structure for spatial searching. SIGMOD Record 14:47–57, June 1984 30. Li, C.-S., Smith, J.R., Castelli, V., Bergman, L.D.: Combining indexing and learning in iterative refinements. In: Proc. SPIE Photonic West – Int. Soc. Opt. Eng. 3656 Storage Retrieval Image Video Datab. VII :390–400, 1999 31. Pettit, E., Bailey, R., Bowden, R., Ashley, R.: Performance evaluation of statistical and neural network classifiers for automatic land use/cover classification. In: Proc. SPIE – Int. Soc. Opt. Eng. 1838:138–53, 1993 32. Salu, Y., Tilton, J.: Classification of multispectral image data by the binary diamond neural network and by nonparametric, pixel-by-pixel methods. IEEE Transactions on Geoscience and Remote Sensing 31:606–616, May 1993 33. 
Paola, J.D., Schowengerdt, R.A.: A detailed comparison of backpropagation neural network and maximum-likelihood classifiers for urban land-use classification. IEEE Transaction on Geoscience and Remote Sensing 33(4):981–998, 1995 34. Li, C.-S., Chen, M.-S.: Progressive texture matching for earch
L.D. Bergman et al.: SPIRE: a digital library for scientific information observing satellite image databases. In: Proc. SPIE Photonic East – Int. Soc. Opt. Eng., Vol. 2916, Boston, MA,USA, 18–19 Nov. 1996, pp. 150–61 35. Castelli, V., Kontoyiannis, I., Li, C.-S., Turek, J.J.: Progressive classification: A multiresolution approach. Research Report RC 20 475, IBM, 06/10/1996 36. Ma, W., Manjunath, B.: Texture features and learning similarity. In: Proc. IEEE Conf. Comp. Vis. and Pattern Rec., San Francisco, CA, USA, 18–20 June 1996. CVPR ’96, pp. 425–430 37. Squire, D.M.: Learning a similarity-based distance measure for image database organization from human partitioning of an image set. In: Proc. SPIE Photonic West – Int. Soc. Opt.
99
Eng., Vol. 3527. Multimedia Storage and Archiving Systems III, Boston, MA, USA, 2–4 November 1998, pp. 80–88 38. Li, C.-S., Castelli, V., Bergman, L.D., Smith, J.R.: Sproc: Fast algorithm for sequential processing of composite objects retrieval from large image/video archives. In: Proc. SPIE Photonic West – Int. Soc. Opt. Eng., San Jose, CA, Jan 24–30 1998 39. Li, C.-S., Yu, P.S., Castelli, V.: Malm, a framework for mining sequence databases at multiple abstraction levels. In: Proc. 7th International Conference on Information and Knowledge Management, Bethesda, MD, USA, 3–7 November 1998. CIKM ’98, pp. 267–272