IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 49, NO. 5, MAY 2011
Entropy-Balanced Bitmap Tree for Shape-Based Object Retrieval From Large-Scale Satellite Imagery Databases

Grant J. Scott, Member, IEEE, Matthew N. Klaric, Student Member, IEEE, Curt H. Davis, Fellow, IEEE, and Chi-Ren Shyu, Senior Member, IEEE
Abstract—In this paper, we present a novel indexing structure that was developed to efficiently and accurately perform content-based shape retrieval of objects from a large-scale satellite imagery database. Our geospatial information retrieval and indexing system, GeoIRIS, contains 45 GB of high-resolution satellite imagery. Objects of multiple scales are automatically extracted from satellite imagery and then encoded into a bitmap shape representation. This shape encoding compresses the total size of the shape descriptors to approximately 0.34% of the imagery database size. We have developed the entropy-balanced bitmap (EBB) tree, which exploits the probabilistic nature of bit values in automatically derived shape classes. The efficiency of the shape representation coupled with the EBB tree allows us to index approximately 1.3 million objects for fast content-based retrieval of objects by shape. Index Terms—Content-based retrieval, image databases, knowledge-based indexing, object indexing, remote sensing.
I. INTRODUCTION

AS THE volume of remote-sensing earth imagery continues to increase, automated processes must be developed and refined to eliminate the requirement of a human in the loop when creating large-scale searchable image repositories. Content-based image retrieval (CBIR) is an increasingly popular retrieval method for large-scale image databases. CBIR queries are not performed in a traditional relational database management system (RDBMS) of image metadata, e.g., sensor, location, or time, but instead use features extracted from image content to search. Traditionally, descriptive features are extracted to represent various discriminating properties of the image content. These features may represent global properties, e.g., color and texture, or collective localized features, e.g.,
Manuscript received July 7, 2009; revised November 30, 2009 and May 18, 2010; accepted August 22, 2010. Date of publication December 17, 2010; date of current version April 22, 2011. This work was supported in part by the National Geospatial-Intelligence Agency University Research Initiatives (NURI) under Grant HM1582-04-1-2028 and by the U.S. National Science Foundation under Grant IIS-0812515. G. J. Scott, M. N. Klaric, and C. H. Davis are with the Center for Geospatial Intelligence, University of Missouri, Columbia, MO 65211-0001 USA. C.-R. Shyu is with the Informatics Institute, University of Missouri, Columbia, MO 65211 USA. Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TGRS.2010.2088404
the shape and color of segmented objects or the texture of partitioned image regions. Numerous CBIR systems have been reported in the literature, e.g., Query by Image Content (QBIC) [1], VisualSeek [2], Photobook [3], and PicToSeek [4]. In [5], Gevers and Smeulders offer a comprehensive overview of CBIR. In [6], Lew et al. provide a review of the state of the art in CBIR. In the remote-sensing domain, there are relevant contributions that focus on content-based retrieval and, oftentimes, image information mining. A notable contribution that has explored content-based retrieval of satellite imagery is the knowledge-based information mining (KIM) system by Datcu et al. [7]. With regard to CBIR, KIM exploits Landsat Thematic Mapper (TM), as reported in [8]. Li and Narayanan, in [9], used Land Cover and Land Use thematic maps as supervised training of support vector machines over the spectral information of an image. They also exploit Gabor wavelets for textural feature extraction to capture spatial information from an image. A necessity for developing a successful CBIR system is the extraction of discriminant features to describe the images in the database. As such, the development of feature extraction algorithms has dominated the literature in the field. These fundamental features are often assembled to model higher level human visual perception for CBIR, where the ultimate goal is to retrieve visually similar images. In addition, there exists a solid foundation of literature on the extraction and modeling of spatial components of imagery, e.g., objects or natural divisions (e.g., the horizon of a landscape photo or a foreground person). Shape analysis and retrieval have emerged as particularly important topics in CBIR, because visual knowledge is often related to shape characteristics of objects. In this paper, we are primarily concerned with object shape retrieval from large-scale remote-sensing imagery databases. 
Hence, we will focus on a shape feature set generated using objects that were automatically extracted from a large collection of satellite imagery. Two promising approaches for automatic shape extraction from large-scale satellite imagery databases include techniques based on transforms (e.g., Fourier [10] or wavelet [11]) and morphology [12]. For the reported research herein, we employ the latter approach, as described in Section II. There exists a plethora of research with regard to object shape feature extraction. Traditionally, shapes are conceptualized in the literature using a few broad categories as follows:
0196-2892/$26.00 © 2010 IEEE
1) contours; 2) regions; or 3) skeletons. Contour representations of shapes are typically outlines found with edge detectors and other similar image processing techniques. Recently, scale-space methods [13] have become very popular as shape descriptors. Using this approach, an object contour is continually smoothed by increasing Gaussian filters, building a hierarchy of salient inflection points [14]. In [15], Avrithis et al. utilize Fourier transforms and curve moments to build invariant curve representations for further feature extraction. Kunttu et al. present a method for encoding both color and shape information using intensity Fourier [10]. Other methods have been developed for nonrigid shapes, e.g., multiscale convexity concavity by Adamek and O’Connor [16]. Comparative literature, such as [17], provides a retrieval performance review of various contour-based descriptors using a standard data set, including curvature scale space, wavelet encoded contours, and visual correspondence. Other comparative literature includes Zhang and Lu’s [18] extensive review of Fourier, scale space, Zernike moments, and grid descriptors.

Object skeleton-based features are somewhat rare in the literature. Skeletons can be derived using morphological image processing techniques or variations such as medial axis [19] or shock graphs [20]. In the CBIR of skeleton-encoded shape features, searching is equivalent to graph matching or computing transformation steps to achieve the second graph from the first graph. The complexity of ranking skeletons with these methods limits the efficiency of retrieval performance, because the similarity must be computed between the query and numerous candidates. Despite the computational cost, skeletal methods have been shown to be particularly robust with regard to object occlusion.

Other shape descriptors include edge histograms combined with Fourier transforms as in [21], which exploit statistical information of the shape.
Minimum bounding circle [22] and convex hull approaches rely on finding a circular or convex region to encompass the shape prior to feature extraction. The feature extraction then processes the object shape by also examining the regions not populated by the intersection of shape and bounding object. Along with the breadth of object shape feature spaces, there exists a healthy quantity of the literature focused on measuring similarity of shapes in the aforementioned feature spaces. Popular approaches depend, to a degree, on the feature space. In scale-space methods, the predominant methods involve finding inflection point correspondence between objects. Some approaches measure the similarity through deformation/ transformation steps to achieve the second shape from the first (e.g., [23]), or the second skeleton from the first [20]. Other approaches combine local and global invariants for computing similarity [24]. Utilizing local invariants is key to maintaining adequate retrieval of objects that are subject to occlusion. In CBIR, it is desirable to provide the results of the query in a similarity-ordered set. For this reason, CBIR is often cast as a problem of finding nearest neighbors in the feature space defined by the chosen object descriptor. Although there exist methods such as pruning to eliminate segments of the database, the most efficient approaches use indexing schemes to access the feature space [25]. Various indexing schemes have
been reported in the literature, including the containment tree for topological image structure [26], EBS k-D tree for high-dimensional feature spaces [27], and sparse distributed memory structures for properties generated from principal component analysis (PCA) in [28]. In [29], Liu et al. construct a separate 1-D index for each feature in the feature set. Through their search algorithm, this method has the benefit of quickly returning empty sets if no objects are within the desired similarity radius. However, the algorithm generates a candidate point set in each dimension, requiring results to be merged into a final result set. In high-dimensional feature spaces, the number of candidate point sets may inhibit performance. In addition, dense feature spaces will compound this problem, because the candidate point lists will substantially increase.

Note that the CBIR literature rarely has feature descriptors tightly coupled with indexing structures. To create a truly scalable system for CBIR, one should substantially increase the database size without equivalent retrieval performance decreases. This paper directly addresses this problem. We have designed an indexing structure, the entropy-balanced bitmap (EBB) tree, which is particularly suited to our chosen shape descriptor. Existing RDBMS indexing mechanisms are not suitable for our shape encoding data. In addition, common space/data partitioning indexing extensions for RDBMS are ill suited for this high-dimensional data. We explored the suitability of metric index approaches and found them inadequate for our data collection. By using a shape descriptor that provides a small fixed encoding size and developing a tightly coupled indexing and retrieval structure, we have developed a scalable approach for content-based retrieval of objects using shape. Our object shape database consists of 1.3 million objects, yet we can return thousands of the most similar ranked shapes in a few seconds.
The remainder of this paper is organized as follows. In Section II, we explain our automatic object extraction, shape encoding, and data clustering as the index preprocessing steps. Section III describes the theoretical basis of the EBB tree, along with relevant algorithm details. Our experimental methods and results are detailed in Section IV. We conclude with discussion in Section V.

II. OBJECT EXTRACTION AND PREPROCESSING

We have developed an extensive geospatial imagery retrieval system, GeoIRIS [30], which employs numerous retrieval techniques. Currently, our image database contains 45 GB of high-resolution orthorectified, georeferenced commercial satellite imagery. This imagery is five banded (0.6–1.0 m panchromatic and 2.4–4.0 m multispectral), with each band having an 11-b effective range. One of the latest extensions to GeoIRIS employs a scale-invariant shape descriptor to retrieve objects from the database. This paper is focused on retrieval applications with geospatial awareness; as such, a typical interaction for our system might be to submit an object as the query, along with geospatial constraints. For example, “Given a query image containing a baseball diamond, find all similar baseball diamonds in the database that are within 2 km of a radio broadcast tower.” With this goal in mind, we must efficiently
perform object-based retrievals from our database as the first step to incorporate other geospatial knowledge. The extraction and encoding of object shapes are described in the following sections. Note that we also store object spectral information and principal axis length for use in complex object queries.

A. Multiscale Object Extraction and Shape Representation in Bitmaps

Our automatic object extraction algorithms for high-resolution satellite imagery [31] exploit the differential morphological profile (DMP) [12] to facilitate the processing of large quantities of imagery and efficiently discover objects. One of the current challenges in any automatic object indexing process is to extract the relevant objects from the imagery. In small image databases, edge detectors and segmentation are viable object location strategies. Our database of high-resolution satellite imagery contains numerous large scenes, with a total coverage of 3994 km². For a collection of satellite imagery of substantial scale, traditional object extraction methods are inefficient. Manual extraction of objects from an imagery collection of this scale is infeasible; as such, automated processes are necessary.

To accomplish this difficult task, we process the scenes using the DMP on the panchromatic channel of the imagery. The DMP is a multiscale segmentation algorithm, which exploits contrast edges in imagery. Using geodesic morphology by reconstruction, objects that are lighter or darker than their surrounding image content generate a response in the DMP. The intensity of the DMP is correlated to the difference in contrast between the object and its surroundings. The resulting extractions are homogeneous regions, each representing an object. The interested reader should refer to [12] for a detailed presentation of the DMP. The DMP produces a set of scaled contrast responses, referred to as DMP levels.
Level m represents the possible objects detected using a geodesic disk of size $r_m$, which were not detected with radius $r_{m-1}$. Each level in the DMP represents objects extracted after the transition from one geodesic scale to the next. During the processing of the DMP, we utilized a normalized difference vegetation index (NDVI) to filter out nonanthropogenic objects.

Because the resulting objects are anthropogenic structures extracted from imagery using DMP responses, we use region-based (solid) objects instead of applying additional processing to generate contours or skeletons. Therefore, we focused our approach on the region-based subset of all available shape descriptors. Seeking the most efficient method to represent region-based shapes and still have adequate descriptive power to identify general shapes, we chose grid descriptors [32]. Grid descriptors are effectively a sampling of an object shape into a matrix of fixed size. These grids provide natural scale invariance. In addition, immediately prior to sampling the object into the grid, we align the principal axis of the extracted object to the middle horizontal grid axis.

Given a fixed-size grid that represents a shape, it is natural to represent the grid as a simple bitmap. In GeoIRIS, we used 32² b, representing a 1024-D bitmap space, for an encoding size of 128 B per shape. Early empirical analysis revealed that this size of bitmap
Fig. 1. Grid descriptor of extracted objects. The object exists in the original image and can be extracted using the DMP. (a) and (b) Two regions of original imagery. (c) and (d) DMP-extracted objects from (a) and (b), respectively. (e) and (f) Encodings of the lighter extracted objects in (c) and (d), respectively; the ones represent bits set on.
provided good balance of retrieval performance and shape discrimination. Fig. 1 shows two example chips from our database imagery, followed by a relevant level of the DMP, and finally the resulting bitmap encoded shape. This object-encoding scheme has some substantial benefits that we exploit. First, for our large-scale image database with 1.3 million extracted objects, all objects are encoded in less than 160 MB or 0.34% of the original 45 GB. Therefore, continuing to scale our database will not be limited by an increase in the number of encoded shapes. Second, an indexing scheme has been developed, which allows efficient ranked retrievals and exploits bit operations, instead of floating-point operations, during search and ranking. This indexing structure is developed in Section III, including algorithms for induction and search. Finally, revisiting Fig. 1(e) and (f), we can see that the encoding is very intuitive. In our GeoIRIS database, we have a mixture of scenes from urban, suburban, and rural areas of the world. On the average, we have 325 encoded objects per square kilometer. Because the balance of land cover type varies in the imagery, the number of extracted objects will necessarily vary. For
example, as the proportion of urban versus rural area increases, the number of objects can be expected to increase per square kilometer. This condition will have an effect on the portion of the original database size that is needed for object shape representation.
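The grid encoding and the storage figures quoted above can be illustrated with a minimal sketch. It assumes the object is given as a binary mask whose principal axis is already horizontal, scales the mask isotropically so that its width fills the grid, and centers it vertically. The function name `encode_grid`, the nearest-neighbor sampling, and the bit order (bit k = row × 32 + column) are illustrative choices, not details taken from GeoIRIS. The storage arithmetic reproduces the quoted "less than 160 MB" and 0.34% only if binary units (MiB, GiB) are assumed.

```python
import math

SIZE = 32  # grid edge; 32 x 32 = 1024 bits = 128 bytes per shape


def encode_grid(mask, size=SIZE):
    """Sample a binary object mask (list of 0/1 rows) into a size x size bitmap.

    Assumption: the principal axis is already horizontal. The result is a
    Python int in which bit k = row * size + col is on when the cell is filled.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        return 0
    r0, r1, c0, c1 = rows[0], rows[-1], cols[0], cols[-1]
    step = (c1 - c0 + 1) / size   # mask pixels per grid cell (same in x and y)
    cy = (r0 + r1 + 1) / 2        # vertical center of the bounding box
    bits = 0
    for gr in range(size):
        r = math.floor(cy + (gr + 0.5 - size / 2) * step)
        if not r0 <= r <= r1:
            continue              # grid row falls above or below the object
        for gc in range(size):
            c = math.floor(c0 + (gc + 0.5) * step)
            if mask[r][c]:
                bits |= 1 << (gr * size + gc)
    return bits


# Scale invariance: a 4 x 2 rectangle and an 8 x 4 rectangle encode identically.
small = [[1] * 4 for _ in range(2)]
large = [[1] * 8 for _ in range(4)]
same_encoding = encode_grid(small) == encode_grid(large)

# Storage arithmetic for 1.3 million shapes against 45 GB of imagery.
shape_bytes = 1_300_000 * SIZE * SIZE // 8
shape_mib = shape_bytes / 2**20
fraction_of_imagery = shape_bytes / (45 * 2**30)
```

Because every shape occupies the same 128 B regardless of the object's size on the ground, the descriptor store grows linearly with the object count only.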
B. Bitmap Dissimilarity Measure

One important factor of the indexing and retrieval scheme is the choice of an appropriate dissimilarity metric. We utilize one dissimilarity metric for initially clustering data, ranking the bitmap results, and building the priority queues for leaf traversal. We experimented with numerous dissimilarity measures, which rely on measuring the number of bits that vary between two bitmaps. We have

$$ d(B_1, B_2) = |B_1 \ \mathrm{XOR}\ B_2| \tag{1} $$

with XOR representing the bitwise exclusive OR operation. In (1) and the following discussions, we borrow from a mathematical set notation and use |B| to represent the count of bits on in a bitmap for equations (as well as the more general count of elements in a set) and refer to this value as the cardinality of a bitmap in the text.

When measuring the dissimilarity of two shape-representation bitmaps, not all bits need to be treated equally. As expected, various bits in different objects can have significantly different relevance to the object shape. We therefore evaluated weighted bit dissimilarities using

$$ d_{wt}(B_1, B_2) = \sum_{k=1}^{K} (B_1[k] \ \mathrm{XOR}\ B_2[k]) \ast wt[k] \tag{2} $$

where wt[k] is the weight assigned to bit k. In these dissimilarity measures, each bit that differs contributes to the dissimilarity by the amount of its weight. In (2), the significant step is assigning weights that accentuate shape differences.

As described in Section II, all objects are aligned to the x-axis and centered at the y-dimension of the bitmap. This alignment, coupled with the scaling of all objects into a fixed bitmap size, implies that every object will horizontally span the center of the bitmap. Therefore, the bits along this center axis contribute less to the shape information than the top and bottom regions. With this condition in mind, we assume that bits farther from the center horizontal are more important in describing the object shape when they are set on. Our initial experimental bit-weighting approach used the square root of the absolute y-offset from the center x-axis as

$$ y\_off[k] = \sqrt{|y\_offset|}. \tag{3} $$

Because we use an even number of bits for the edges, the two rows that form the center horizontal axis are counted as an offset of one. Therefore, with our chosen bitmap size of 32 × 32, (3) has a range of [1, 4].

Empirical analysis revealed that bit-weighting schemes based on the y-offset alone were not sufficient. Therefore, we borrowed ideas from established document retrieval techniques, i.e., the concept of inverse document frequency [33]. We defined the inverse bit frequency, ibf, of bit[k] to be

$$ ibf[k] = \log \frac{\text{DB Population Size}}{\text{DB objects with bit}[k] = 1}. \tag{4} $$

However, ibf alone proved insufficient for some key objects in our satellite imagery domain. This measure of bit relevance drove the weights near the center horizontal axis bits to effectively zero. Our final bit-weighting scheme is to use the following combination of (3) and (4):

$$ wt[k] = \max\{ibf[k],\ y\_off[k]\}. \tag{5} $$
During our analysis, to determine an appropriate bit weighting for dissimilarity measures, we examined the effects on different types of shapes. Fig. 6 has a sample of the variety of shapes that may be found in our database. Airplanes are representative of complex shapes, L-shaped buildings are representative of shapes with a few significant concavities, and baseball fields are representative of shapes that combine linear and curvature to form shapes that are difficult to distinguish from purely curve-based shapes without bit weighting.
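The weighting scheme of (1)–(5) can be sketched in a few lines. This is an illustration under stated assumptions rather than the GeoIRIS implementation: bitmaps are Python integers with bit k = row × 32 + column, the logarithm in (4) is taken as the natural logarithm (the base is not stated in the text), and a bit that is never on in the database is given ibf = 0 so that its weight falls back to y_off. The function names are hypothetical.

```python
import math

SIZE = 32
K = SIZE * SIZE


def hamming(b1, b2):
    """Eq. (1): number of bits that differ between two bitmaps."""
    return bin(b1 ^ b2).count("1")


def y_off(k, size=SIZE):
    """Eq. (3): square root of the row offset from the center horizontal axis.

    The two center rows both count as offset one, so the range is [1, 4]
    for a 32 x 32 bitmap.
    """
    row, half = k // size, size // 2
    offset = half - row if row < half else row - half + 1
    return math.sqrt(offset)


def ibf(db, k):
    """Eq. (4): inverse bit frequency of bit k over a list of bitmaps."""
    on = sum((b >> k) & 1 for b in db)
    # Assumption: bits never set in the database get ibf = 0.
    return math.log(len(db) / on) if on else 0.0


def bit_weights(db, n_bits=K, size=SIZE):
    """Eq. (5): per-bit weight as the larger of ibf and y_off."""
    return [max(ibf(db, k), y_off(k, size)) for k in range(n_bits)]


def weighted_dissimilarity(b1, b2, wt):
    """Eq. (2): sum of the weights of all differing bits."""
    diff = b1 ^ b2
    return sum(w for k, w in enumerate(wt) if (diff >> k) & 1)
```

Note that only the XOR and the per-bit lookups touch the bitmaps themselves; the weights are computed once per database and reused for every comparison.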
C. Clustering Multiscale Objects in Bitmap Space

In CBIR applications, it is generally expected that similar objects are close together in the feature space. In addition, we do not expect the feature space to be uniformly saturated with objects. Instead, we expect that similar database objects will tend to form high-dimensional clouds in the feature space. In effect, this is a necessary requirement of successful feature extraction algorithms: low intraclass variance and high interclass spread. With this requirement in mind, it is beneficial to apply clustering algorithms to automatically discover and label these dense regions.

Clustering techniques are well established in pattern recognition and data mining. The complexity of clustering algorithms is heavily influenced by the size of the database, both in the dimensionality and number of objects. In-depth discussion of feature extraction philosophies and clustering techniques can be found in [34] and [35]. Once clusters are discovered in the database, the statistical properties of these clusters can be exploited to create efficient and accurate indexing of the feature space. The EBB induction algorithms rely on these clusters.

To prepare our database for indexing, we adapted the density-based spatial clustering of applications with noise (DBSCAN) [36] clustering algorithm for use with a large collection of bitmaps. We developed a sampling-based clustering approach that typically uses two passes through the database. This property makes the approach particularly attractive for large-scale feature sets, e.g., the objects extracted from our satellite imagery. To generate clusters, we measure dissimilarity between any two bitmaps using (2). After clustering, each object belongs to a cluster of objects that are similar in bitmap space.
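For orientation, the sketch below shows textbook DBSCAN applied to bitmaps with a pluggable dissimilarity function. It is not the sampling-based, two-pass adaptation described above, whose details are not given in this section; it scans all pairs and is therefore only practical for small collections. The parameter names `eps` and `min_pts` follow common DBSCAN usage, and the unweighted bit count of (1) stands in for (2) to keep the example short.

```python
NOISE = -1


def dbscan(points, eps, min_pts, dist):
    """Textbook DBSCAN; returns one cluster label per point (NOISE = -1)."""
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j in range(len(points))
                if dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbours(i)
        if len(nbrs) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbours(j)
            if len(j_nbrs) >= min_pts:
                queue.extend(j_nbrs)   # core point: keep growing the cluster
        cluster += 1
    return labels


def hamming(b1, b2):
    """Unweighted bit dissimilarity, as in (1)."""
    return bin(b1 ^ b2).count("1")


# Two tight groups of 8-bit patterns plus one outlier.
bitmaps = [0b00000000, 0b00000001, 0b00000011,
           0b11110000, 0b11110001, 0b11110011,
           0b1010101010101010]
labels = dbscan(bitmaps, eps=2, min_pts=2, dist=hamming)
```

The cluster labels produced here play the role of the classes used later during index induction.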
Fig. 2. Example EBB tree. Circular nodes are decisions in the search path, determined by maximizing (6), and square nodes are the leaves that contain the bitmap population, which exists in the nodes at the various stages of induction, concluding with the leaves. The leaf nodes are labeled with their class population for clarity.
III. EBB INDEXING

Bitmap indexing has many uses in retrieval and databases. In traditional RDBMS, bitmap indexes are utilized to partition relations into a relatively small number of disjoint sets using single attributes (e.g., gender). Another common usage is the bitmap index for term–document correlation in information retrieval (IR) schemes [33]. In IR, the bitmaps are typically documents of a collection, and individual bits represent the presence of a term in the document. In our current context, we are dealing with bitmaps that represent object shapes as a binary grid, using 32 × 32 b. If we attempt to grow a full bitmap index that covers all of the bitmap space, we would need $2^{1024} - 1$ internal nodes to accommodate $2^{1024}$ leaves. What is required is a significantly smaller index.

In the current discussion, we are dealing with a large collection of bitmaps, clustered in sets of naturally occurring groups of similar bitmaps. Our approach, the EBB tree, exploits these groupings found in the bitmap space to efficiently index the object shapes with a much smaller tree than would be necessary to cover the entire space. For retrieval efficiency, these clusters are further divided into a large number of leaves that contain a small group of very similar bitmaps. Furthermore, to accommodate large result sets, the leaves are linked together in priority queues for leaf navigation. For GeoIRIS, our bitmap index has 27 005 leaves, with an average leaf population of 47.49 and an average search depth of 14.72.

A. EBB Tree Induction

In previous work, the entropy-balanced statistical k-D tree [27], [37] was used to exploit knowledge about classes or groupings in a feature space when indexing continuous multidimensional feature sets. The motivation is to increase retrieval precision by lowering the entropy while simultaneously reducing the imbalance of the tree. Using statistical analysis of clustered or ground-truth labeled data, we exploit the statistical properties of clusters to induce an entropy-balanced tree that decreases the entropy from parent to child nodes. Statistical entropy, as defined by Shannon, is a measure of the randomness or variability of data [38]. Therefore, induction should seek to minimize leaf entropy, ensuring that leaf contents have a high degree of similarity. One desirable trait is to not greedily sacrifice the entropy of one node to lower the entropy of its sibling. The result is an efficient indexing structure, where searches reach leaves of low entropy, implying more certainty that the leaf contents are similar to the query.

Fig. 2 illustrates the general concepts of EBB induction. Note that the bitmaps represented are 16 b in size, and in the following discussion, the bit positions start at zero in the top left and count across the bitmap rows. A traditional bitmap index that covers this bitmap space would require 65 536 leaf nodes and 65 535 internal nodes, a total tree size of $2^{17} - 1$. At the root level, there exists a large collection of bitmaps organized into five classes, represented by the five grids. If the induction algorithm determines the decision bit as k = 12, then the three bitmaps with that bit on will be in the right child (A, C, D). The two bitmaps with that bit off (B, E) will be pushed into the left child. At the second level, the initial left child may determine the best decision bit as k = 8, thereby splitting the two classes into the left (E) and right (B) child nodes. The root node’s right child determines the next split to be k = 1, creating a single class (D) leaf as its right child and a two-class node as its left child (A, C). Finally, the internal node at the third level will use bit k = 0 to separate its two classes into the final leaves. This
resulting EBB tree will allow navigation to the various classes using two to three bit comparisons.

If we used a greedy maximum entropy reduction approach, without the balancing effect, we would have a tendency to make decisions that position leaves at higher levels in the tree. For example, again considering the root node in Fig. 2, a greedy decision may use bit k = 0 as the root node decision, creating a class-A leaf at level 1. This step would be followed by a series of similar decisions, resulting in a tree with leaves at levels 1, 2, 3, and two leaves at level 4. The resulting searches would take from one to four comparisons. Projecting this behavior out to much larger databases, with larger bitmaps, more classes, and less crisp class-bit probabilities accentuates the variability in search efficiencies. Another issue may arise when examining why a greedy decision will split off leaves higher in the tree, i.e., it sacrifices the entropy of one subtree for the gain in the other subtree. Another downside to this type of split decision is the effect of creating numerous low-probability high-entropy leaves. These leaves would be sufficient in a classifier, but in an indexing system that expects retrievals to require leaf traversals, this approach can result in traversing through numerous high-entropy leaves.

Fig. 2 represents a simplification of the data for illustration, where the classes have only 0 and 1 as bit probabilities. Actual data are significantly more complex, including more classes, more bits, and bit probabilities in classes that range between 0.0 and 1.0. The a priori class-based bit probabilities are a key part of the exploitation of database knowledge during index induction. In any particular leaf, during the induction of an index, we must estimate the probability of some $\mathrm{Class}_i$ within the portion of the bitmap space that the leaf occupies. These classes represent our discovered clusters of bitmaps.
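The Fig. 2 walk-through can be replayed with a small sketch. The decision bits (12 at the root, 8 and 1 at the second level, 0 at the third) come from the text; the nested-dictionary layout, the mapping of bit position k to `1 << k`, and the assumption that class A is the one with bit 0 on are illustrative only, since the figure's actual bitmaps are not reproduced here.

```python
def full_index_size(n_bits):
    """Leaves, internal nodes, and total nodes of a full binary bitmap index."""
    leaves = 2 ** n_bits
    return leaves, leaves - 1, 2 * leaves - 1


# Decision bits as described for Fig. 2; "off" is the left child, "on" the right.
FIG2_TREE = {
    "bit": 12,
    "off": {"bit": 8, "off": "E", "on": "B"},
    "on": {
        "bit": 1,
        "on": "D",
        # Assumption: class A has bit 0 on and class C has it off.
        "off": {"bit": 0, "off": "C", "on": "A"},
    },
}


def search(tree, bitmap):
    """Follow decision bits to a leaf; return (leaf label, comparisons made)."""
    node, comparisons = tree, 0
    while isinstance(node, dict):
        branch = "on" if (bitmap >> node["bit"]) & 1 else "off"
        node = node[branch]
        comparisons += 1
    return node, comparisons


# Hypothetical 16-b queries that carry only the decision-relevant bits.
queries = {
    "A": (1 << 12) | 1,
    "C": 1 << 12,
    "D": (1 << 12) | (1 << 1),
    "B": 1 << 8,
    "E": 0,
}
depths = {label: search(FIG2_TREE, q)[1] for label, q in queries.items()}
```

Every class is reached in two or three comparisons, against the one to four of the greedy alternative, while a full index over the same 16-b space would need $2^{17} - 1$ nodes.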
The clustered bitmaps could be represented as prototype vectors of floating-point numbers, but in reality, statistics developed from these vectors have little significance in the binary space. Therefore, we use the probabilities of bits being on or off for a given $\mathrm{Class}_i$ in some bitmap space covered by some $\mathrm{Leaf}_j$. We approximate this value by examining the members of each $\mathrm{Class}_i$ in the leaf and tracking the occurrence of on and off bits for each bit position. The use of these approximations is discussed as follows in the context of computing the conditional probability of $\mathrm{Leaf}_j$, given $\mathrm{Class}_i$.

The EBB tree is designed for very large collections of bitmaps, which lend themselves to exploiting the probabilistic tendencies of data. One critical design issue is the development of the split decision objective function, which can properly exploit these class-based probabilities. Our desire is to induce an index with a collection of low-entropy leaves. The result is then a collection of leaves of objects in the bitmap space, where each leaf represents a group of similar bitmaps. Therefore, we desire a decision criterion that allows the recursive induction algorithm to balance the reduction of entropy between each set of sibling subtrees whenever a split decision is made. This condition ensures that the entropy of one subtree is not sacrificed for the sake of the other subtree. The decision criterion is the bit k that maximizes

$$ \gamma = H_{\mathrm{parent}} - \sigma H_R - \sigma H_L - \operatorname{ABS}(\sigma H_R - \sigma H_L). \tag{6} $$
Hparent is the entropy of the parent node, and σHR and σHL are the weighted sum components of the right and left children, respectively, where σH = P (Leafj )H(Leafj ). The first three terms on the right-hand side of (6) represent the reduction of entropy. The terms in the absolute value represent the balancing factor. We also constrain the node splitting to require at least a minimal entropy reduction from parent to children, such as a percentile decrease. We calculate the entropy of any Leafj as H(Leafj ) = −
L
P (Classi |Leafj ) log P (Classi |Leafj )
i=1
(7) where L is the number of classes that exist in Leafj . To calculate the entropy of a leaf in a high-dimensional bitmap space, we define basic probabilities over a bitmap database as a foundation. Given a database D composed of a set of disjoint classes, we define the a priori probability of Classi as P (Classi ) =
|Classi | |D|
(8)
where $|\mathrm{Class}_i|$ and $|D|$ are the sizes of $\mathrm{Class}_i$ and the database, respectively. To capture the aforementioned point-based probabilities, we calculate the a priori probabilities of each bit k in each class i as
$$P(\mathrm{Class}_{i,k=0}) = \frac{\#\,\text{off bits}}{|\mathrm{Class}_i|} \tag{9}$$
$$P(\mathrm{Class}_{i,k=1}) = \frac{\#\,\text{on bits}}{|\mathrm{Class}_i|} \tag{10}$$
which represent the probability of $\mathrm{Class}_i$ with bit k set off or on, respectively. To maintain an approximation of the class bit probabilities during tree induction, we calculate the probability of $\mathrm{Class}_i$'s bit k in $\mathrm{Node}_j$ using
$$P(\mathrm{Class}_{i,k,j}) = \begin{cases} 1.0, & \text{if on and off bits} \\ P(\mathrm{Class}_{i,k=0}), & \text{if off bits} \\ P(\mathrm{Class}_{i,k=1}), & \text{if on bits.} \end{cases} \tag{11}$$
Equation (11) examines $\mathrm{Class}_i$'s bit variations in some particular $\mathrm{Leaf}_j$. If the class has bitmaps with bit k both off and on, then the probability of reaching $\mathrm{Leaf}_j$ by a search with a $\mathrm{Class}_i$ bitmap is approximated as 1.0. If $\mathrm{Class}_i$ has only off or only on bits, this approximation is taken from (9) or (10), respectively. The probability of $\mathrm{Leaf}_j$, given $\mathrm{Class}_i$ in a bitmap of size K, is calculated as
$$P(\mathrm{Leaf}_j \mid \mathrm{Class}_i) = \prod_{k=1}^{K} P(\mathrm{Class}_{i,k,j}). \tag{12}$$
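As a small worked illustration of (9) through (12), consider a hypothetical class of four bitmaps with K = 3 bits. The counts below are invented for the example, and the per-bit factors are multiplied on the assumption that (12) is a product over bit positions.

```latex
% Hypothetical Class_i: 4 bitmaps, K = 3 bits.
% On-bit counts over the whole class: bit 1 -> 4, bit 2 -> 2, bit 3 -> 0.
P(\mathrm{Class}_{i,1=1}) = \tfrac{4}{4} = 1.0, \qquad
P(\mathrm{Class}_{i,2=1}) = \tfrac{2}{4} = 0.5, \qquad
P(\mathrm{Class}_{i,3=0}) = \tfrac{4}{4} = 1.0

% Leaf_j holds the two class members that have bit 2 on. Within the leaf,
% the class shows only on bits at k = 1 and k = 2 and only off bits at k = 3,
% so (11) selects the values above and (12) gives
P(\mathrm{Leaf}_j \mid \mathrm{Class}_i)
  = \underbrace{1.0}_{k=1} \cdot \underbrace{0.5}_{k=2} \cdot \underbrace{1.0}_{k=3}
  = 0.5
```

The result agrees with the intuition that half of this class reaches the leaf.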
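This approach allows us to calculate the probability of $\mathrm{Class}_i$, given $\mathrm{Leaf}_j$, using Bayes' theorem as
$$P(\mathrm{Class}_i \mid \mathrm{Leaf}_j) = \frac{P(\mathrm{Leaf}_j \mid \mathrm{Class}_i)\, P(\mathrm{Class}_i)}{P(\mathrm{Leaf}_j)} \tag{13}$$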
where the probability of $\mathrm{Leaf}_j$ is the number of database objects in $\mathrm{Leaf}_j$ over the size of the database. Using (13), we can
Fig. 3. Two split decisions based on a three class example, with the result tree shown on the right. The grids shown are populated with the probability of on bits in each class, as defined by (10). Bit numbers start at 0 and fill rows to the right, ending at 8 in the bottom right corner. L and R are the left and right children of a possible split, respectively.
calculate the entropy (7) and thereby make the desired induction decisions to maximize (6).
The EBB has a notable property related to the maximum height of the index. In particular, the maximum search depth of any path to an EBB leaf is K + 1, where K is the size of the bitmaps. As can be observed from Algorithm 1, a specific bit k can only be used a single time in one path. This condition limits the number of decision nodes to K and, therefore, the maximum search depth.
The EBB tree is built with a recursive decision tree induction algorithm as detailed in Algorithm 1, SplitNode. Initially, the entire database D is evaluated as a root node R, and a decision bitmap, dcsn, is created with all bits set off. SplitNode is then called with the root node and the blank decision bitmap. For each off bit in the dcsn bitmap, the current node is divided into two candidate child nodes. Each database object with the current bit off is assigned to the left child; if the bit is set on, the object is assigned to the right child. Then, the split objective function (6) is evaluated, possibly updating the current best decision $\gamma_{\max}$ and setting the decision bit $k_{\max}$. For a node to split, the entropy must be reduced from the parent to its children by some threshold. When a split bit is determined, that bit is set on in dcsn. After storing $k_{\max}$ in the current node as the decision bit, SplitNode is called for both the left and right children, each using the new dcsn bitmap. The dcsn bitmap is passed to the recursive calls of the children to allow the induction to accelerate, because those bits must never be evaluated again. If some $\mathrm{Node}_j$ uses bit k as the decision bit, then all bitmaps in the left subtree will have bit k off, and all bitmaps in the right subtree will have the bit set on. Therefore, the maximum height of the EBB is the number of bits in the bitmap, as is the case in traditional bitmap index schemes.
Fig. 3 provides a three-class example to illustrate the behavior of Algorithm 1 for making split decisions. On the left side are class representations of the probabilities of bit k being on, i.e., (10). In the middle table are two decision levels: the root node, Decision 1, and the root node's right child, Decision 2. The bit index is the first column, followed by the probability and entropy of each child for a possible split at that bit. Finally, the decision value of (6) is shown in the last table column. The last part in Fig. 3 is the resulting tree produced from the table.
For the computations in Fig. 3, $P(\mathrm{Class}_1) = P(\mathrm{Class}_2) = P(\mathrm{Class}_3) = 0.333$. Note that bits 0, 2, 6, and 8 never need evaluation, because the entire population has those bits set off. The root decision has four bits that need to be evaluated: bits 1, 4, 5, and 7. Bits 5 and 7 are equivalent but have different permutations of the class-to-child distribution
generated by bit 1. Bit 1 effectively partitions off class 3 into the left child, leaving classes 1 and 2 in the right. Bit 4 partitions class 1 between the left and right nodes, raising the entropy of the right node. The bits that must be evaluated for the decision at the root node's right child are 4 and 5. Bit 4 will split class 1 between the left and right children, increasing the entropy of the right, because it will also have all of class 2. Bit 5, as a decision bit, partitions the node into class-homogeneous leaves, which is the optimal split.
Algorithm 1. SplitNode(N, dcsn): Recursive node-splitting algorithm for inducing the EBB tree. Parameters include the node N and previous decisions dcsn.
1: Calculate and store (7) for N;
2: Initialize decision parameter $\gamma_{\max}$;
3: for all bit k such that (k AND dcsn) ≠ k do
4:   Partition N into leftN and rightN, using bit k;
5:   Calculate $\sigma H_R$ and $\sigma H_L$;
6:   Calculate $\gamma_k$ using (6);
7:   if $\gamma_k > \gamma_{\max}$ then
8:     $\gamma_{\max} \leftarrow \gamma_k$;
9:     $k_{\max} \leftarrow k$;
10:   end if
11: end for
12: if Suitable Decision Found then
13:   Create LChild and RChild with appropriate data;
14:   dcsn ← dcsn OR $k_{\max}$;
15:   Store $k_{\max}$ as the decision bit of this node, N.k;
16:   SplitNode(LChild, dcsn);
17:   SplitNode(RChild, dcsn);
18: end if
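Algorithm 1 could be sketched in Python as follows. This is an illustrative sketch rather than our implementation, and it rests on several assumptions that the text does not specify: bitmaps are stored as Python integers paired with a class label, the tree is a nested dictionary, the class posteriors are normalized within a node instead of being divided by P(Leaf) as in (13), natural logarithms are used, ties are broken by the lowest bit index, and `min_reduction` is a hypothetical value for the split threshold.

```python
import math
from collections import defaultdict


def group_by_class(objects):
    """Map each class label to the list of its bitmaps (stored as ints)."""
    groups = defaultdict(list)
    for bits, label in objects:
        groups[label].append(bits)
    return groups


def class_statistics(db, K):
    """Priors (8) and per-class on-bit probabilities (10) over the database."""
    groups = group_by_class(db)
    prior = {c: len(m) / len(db) for c, m in groups.items()}
    p_on = {c: [sum((b >> k) & 1 for b in m) / len(m) for k in range(K)]
            for c, m in groups.items()}
    return prior, p_on


def entropy(node, prior, p_on, K):
    """H(Leaf_j) from (7), with class posteriors built from (11)-(13)."""
    joint = {}
    for c, members in group_by_class(node).items():
        p = 1.0
        for k in range(K):
            on = sum((b >> k) & 1 for b in members)
            if on == 0:                   # only off bits: use (9)
                p *= 1.0 - p_on[c][k]
            elif on == len(members):      # only on bits: use (10)
                p *= p_on[c][k]
            # mixed on/off bits contribute a factor of 1.0, as in (11)
        joint[c] = p * prior[c]           # numerator of (13)
    total = sum(joint.values())           # assumption: normalize posteriors
    return -sum(j / total * math.log(j / total) for j in joint.values())


def split_node(node, dcsn, stats, n_db, K, min_reduction=1e-9):
    """Recursive induction following Algorithm 1; returns a nested dict."""
    prior, p_on = stats
    h_parent = entropy(node, prior, p_on, K)
    best_gamma, best = -math.inf, None
    for k in range(K):
        if (dcsn >> k) & 1:               # bit already decided on this path
            continue
        left = [o for o in node if not (o[0] >> k) & 1]
        right = [o for o in node if (o[0] >> k) & 1]
        if not left or not right:         # bit is constant here: no split
            continue
        s_l = len(left) / n_db * entropy(left, prior, p_on, K)
        s_r = len(right) / n_db * entropy(right, prior, p_on, K)
        gamma = h_parent - s_r - s_l - abs(s_r - s_l)   # objective (6)
        if h_parent - s_r - s_l >= min_reduction and gamma > best_gamma:
            best_gamma, best = gamma, (k, left, right)
    if best is None:
        return {"leaf": node}
    k, left, right = best
    dcsn |= 1 << k                        # mark bit k as used
    return {"bit": k,
            "left": split_node(left, dcsn, stats, n_db, K, min_reduction),
            "right": split_node(right, dcsn, stats, n_db, K, min_reduction)}


# Toy database: three classes, each identified by a single on bit (K = 3).
db = [(0b001, "A"), (0b001, "A"), (0b010, "B"),
      (0b010, "B"), (0b100, "C"), (0b100, "C")]
tree = split_node(db, 0, class_statistics(db, 3), len(db), 3)
```

On this toy database, the root splits on bit 0, which isolates class A in the right child, and the left child splits on bit 1 to separate classes B and C, giving three class-homogeneous leaves.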
B. EBB Tree Search and Retrieval
Searching the EBB tree is performed in the following two steps: 1) search into the bitmap space index and 2) generate ranked results of bitmaps in the leaves using a chosen metric. During the induction of the index, each node stores the decision bit k. Once the tree is induced, the leaves are analyzed to provide efficient nonlinear leaf traversal during searches. To accommodate the need to traverse the leaves, we build a neighbor priority queue for each leaf by calculating the probabilistic prototypes of the leaves (i.e., groups of bitmaps). The probabilistic prototype is calculated from the probability
that each bit k is on in the current leaf. When the probability of bit k being on is greater than or equal to 0.5, the prototype bit is set on. These prototypes are then used to compute neighbor priority queues based on leaf prototype similarity on a per-leaf basis. The generation of the leaf priority queues is $O(m^2 q)$, where m is the number of leaves generated in the tree, and q is the desired priority queue size.
Given a query bitmap B, the navigation down the index is a series of simple bitwise operations. A search into the feature space for S results is performed following the recursive Algorithm 2. The search with bitmap B starts in the root node, specifying the desired result size S. In each internal node N, steps 1–6 facilitate the recursive tree navigation. At each internal node, a decision bit k was stored during induction. If the query bitmap has that bit set on, the search continues in the right subtree of the current node N; otherwise, it continues in the left subtree.
Algorithm 2. Search(N, B, S): EBB tree searching in node N for S results from population D partitioned into leaves L using query bitmap B.
1: if N is not a leaf then
2:   if (B AND k) = k then
3:     Return Search(N.RightChild, B, S);
4:   else
5:     Return Search(N.LeftChild, B, S);
6:   end if
7: else
8:   Rank destination leaf's bitmaps in order of similarity into result R;
9:   if |R| < S then
10:     for L ∈ PriorityQueue do
11:       Rank L bitmaps into R;
12:       if |R| ≥ S then
13:         Break;
14:       end if
15:     end for
16:   end if
17: end if
18: Return R;
When a leaf of the tree is reached, the leaf population is added to the ranked result set R. Oftentimes, a leaf may not have an adequate amount of data to satisfy the desired result set size, in which case the search must continue collecting results from additional leaves. At this point, searches must continue outward in the bitmap space from the initial leaf.
The traversal of the leaves is codified starting at step 9 in Algorithm 2 when the current result set size is less than the desired size S. We first check that more results are needed, which is the expected case. We then examine the first neighbor in the priority queue, which is the most similar leaf in the tree, as measured by the chosen similarity metric. The bitmaps of this neighbor leaf L are added into the result set R. In an iterative fashion, the size of R is again checked, and the next neighbor leaf of the priority queue is processed, if needed. We currently build priority queues for each leaf that cover a portion of the database population.
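The prototype construction, the neighbor priority queues, and the leaf traversal of Algorithm 2 could be sketched as follows. This is a hedged illustration, not our implementation: the Hamming distance stands in for the weighted dissimilarity of (2), the tree is a hypothetical nested dictionary whose leaves hold an index into a `leaves` list, the tree descent is written iteratively rather than recursively, and each neighbor leaf is merged by re-ranking the whole result set.

```python
def hamming(a, b):
    """Stand-in dissimilarity: number of differing bits (not the metric in (2))."""
    return bin(a ^ b).count("1")


def prototype(leaf, K):
    """Set prototype bit k on when at least half of the leaf's bitmaps have it on."""
    proto = 0
    for k in range(K):
        on = sum((b >> k) & 1 for b in leaf)
        if 2 * on >= len(leaf):           # P(bit k on) >= 0.5
            proto |= 1 << k
    return proto


def build_queues(leaves, K, q):
    """For each leaf, list the q other leaves with the most similar prototypes."""
    protos = [prototype(leaf, K) for leaf in leaves]
    queues = []
    for i, p in enumerate(protos):
        others = [j for j in range(len(leaves)) if j != i]
        others.sort(key=lambda j: hamming(p, protos[j]))
        queues.append(others[:q])
    return queues


def search(node, query, size, leaves, queues):
    """Descend by decision bits, then widen through the leaf's priority queue."""
    while "leaf" not in node:             # steps 1-6 of Algorithm 2
        node = node["right"] if (query >> node["bit"]) & 1 else node["left"]
    start = node["leaf"]
    result = sorted(leaves[start], key=lambda b: hamming(query, b))
    for j in queues[start]:               # steps 9-16 of Algorithm 2
        if len(result) >= size:
            break
        result = sorted(result + leaves[j], key=lambda b: hamming(query, b))
    return result


# Hypothetical 4-bit index with three leaves.
leaves = [[0b0000, 0b0100], [0b0010, 0b0110], [0b0001, 0b0011, 0b0111]]
tree = {"bit": 0,
        "left": {"bit": 1, "left": {"leaf": 0}, "right": {"leaf": 1}},
        "right": {"leaf": 2}}
queues = build_queues(leaves, 4, 2)
hits = search(tree, 0b0110, 3, leaves, queues)
```

In this example, the query lands in the second leaf, which holds only two bitmaps, so the search pulls in the nearest neighbor leaf to reach the requested three results. As in Algorithm 2, the returned set may exceed S, and the caller can truncate it.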
Retrievals are conducted using the EBB tree, which partitions the feature space into relatively small groups of similar bitmaps. The extracted bitmap data resides in a dedicated data store and is used during similarity ranking, whereas the index resides in search agents. The search agents have a small memory footprint and can be distributed (i.e., replicated) across a network. The search agents hold just the navigation portion of the EBB, accessing the priority queues and data inside the data store. When traversal through the leaf population is required, a leaf's priority queue is used for navigation through the data store. From a practical standpoint, we build priority queues to accommodate search result sizes that are a portion of the database size |D|.
IV. RESULTS
The evaluation of a large-scale content-based retrieval system is often subjective. Different users may consider the various visual–perceptual characteristics to be of different importance. This subjectivity is a driving reason for concepts such as relevance feedback and customizable queries. In large-scale databases of satellite imagery with automatically extracted objects, it is infeasible to have the database ground-truth labeled. Due to the subjective nature of content-based retrieval and the large scale of our database, we provide some example retrievals from the system, experiments using deformed shapes, and efficiency evaluations in the remainder of this section.
A. Shape Deformation Effects
We performed experiments to evaluate the effects of increasing levels of bit differences on the retrieval of objects. For these experiments, we eroded the encoded bitmap shapes and then used the eroded bitmap as a query into the database. Fig. 4 shows three example imagery objects, followed by the bitmap encoded shape, and the bitmap eroded by 2% and 5% of the pixels. The bitmaps were eroded simultaneously from the top and bottom.
As noted in Section III, bits farther along the y-offset from the center x-axis are more heavily weighted than bits near the center. For this reason, we chose to erode the most heavily weighted bits first in this experiment. This approach accentuates the effect of the shape change in the rankings. The first row in Fig. 4 shows our first airplane shape, and the effect of the erosion is the loss of the wings on the plane shape. The second row is the L-shaped building, where the erosion appears to effectively round off the corners while still preserving the general appearance of the L shape. The last row is the water treatment pool, which we basically begin to flatten. Note that the water treatment pools, followed by the baseball diamonds, had the largest bitmap cardinality. Therefore, the percentage-based erosion has the most drastic impact on these shapes. Given that all our objects are automatically extracted from the imagery, these experiments help demonstrate to what degree our retrieval ability is affected by bit differences that result from imperfect extraction algorithms. We evaluated the ranking position of the original encoded shape in the query results, as well as the dissimilarity trend with increasing erosion. Fig. 6 represents a subset of our test data, which includes 10 airplanes, 20 baseball diamonds,
Fig. 4. Shape erosion examples: imagery object, extracted and aligned shape, and then erosion at 2% and 5% for the test of shape change on retrieval ranking.
10 L-shaped buildings, 10 water treatment pools, and numerous other shapes. We used 500 test shapes to erode for this experiment, but retrievals were conducted against the full database of 1.3 million objects. Fig. 5(a) provides a summary of the erosion of our test data with regard to bit cardinality, related to percentage. The baseball diamond and airplane averages bound the bit erosion, having the highest and lowest bit cardinalities, respectively. The average of all test objects is shown by the middle trend (All). Fig. 5(b) plots the average dissimilarity of the eroded shape versus the original encoded shape as the erosion increases. Note that this trend lacks the linearity of the bits versus percent erosion in Fig. 5(a) due to the bit-weighting scheme employed. As expected, the higher cardinality baseball diamond average increases significantly faster under percentage erosion due to both the increased number of bits that have changed and the propensity of those eroded bits to be weighted higher. Fig. 5(c) shows the average rank of an evaluation object when the system is queried with an eroded version of the object. It is observed that the airplanes retain the top rank through 5% erosion due to the uniqueness of the shape and, as
can be observed in Fig. 4, the airplane shape is still observable in the 5% eroded shapes. Baseball diamonds, on the other hand, quickly drop in average rank by 5% erosion. This result can be attributed to the fact that a given percentage erosion in baseball diamonds removes more than three times the bits eroded from an airplane. In addition, as previously emphasized, the bits eroded are the highest weighted bits, thereby more significantly affecting the rank of the original object. We also expect that our database has more objects that loosely resemble baseball diamonds, e.g., buildings and water treatment pools.
B. Content-Based Object Retrievals
Currently, we support customizable object searches in our system by separately weighting the effect of the object's shape and spectral features, as well as offering size constraints on the objects. Fig. 6 shows six example objects (top row) used as queries into GeoIRIS, followed in the next eight rows with ordered search results. In the clips of the satellite imagery shown, the relevant objects are bounded by yellow rectangles.
Fig. 5. Shape erosion effects. (a) Trends of the average quantity of bits eroded from shapes relative to the percentage of erosion. (b) Trend of average dissimilarity between test objects and their eroded counterpart. (c) Trend in ranking of the original object when queried with its eroded version.
The first column is a query using an airplane object. The search was customized to designate a shape-only search, without using object spectral characteristics and constraining search result objects to a size of 25–115 m. These results demonstrate rotational insensitivity. In this example, one of the challenging issues of using the DMP for automatic object extraction emerges. Results 3 and 4 are the same plane, extracted from neighboring levels of the DMP, but with shape encodings that vary enough to not be detected as duplicates by our existing algorithms (due to variations in the extracted shape at different levels). This case happens also with results 6 and 7. In each of these cases, it is shown that the extracted shape has variations just enough to change the dimensions and position of the bounding box. The second column is a similar query with a different plane. Note that the query plane from column 1 appears as the second to the last of the results for column 2. Column 3 is a search with a single baseball diamond among a complex of diamonds.
In this search, we equally weight the shape with the spectral characteristics of the object and, furthermore, constrain the size to a range of 30–80 m. We want to ensure that we retrieve large areas of dirt that are neither too large nor too small to be a baseball diamond. Note that, in these results, we have found four distinct baseball diamond complexes. In addition, in lower ranked results, grass infield baseball diamonds are discovered once we exhaust the database of dirt infield diamonds. Column 4 is an L-shaped building that was queried with no spectral characteristics and a size constraint of larger than 30 m. By not utilizing spectral characteristics, we can find objects of various colors. Column 5 represents a water treatment pool. These dark objects are not perfectly round but, instead, have a deep and narrow concavity from the light-colored catwalk (see Fig. 4, row 3). This query is performed with the shape weighted at 75% and the spectral characteristics weighted at 25%, with the size constrained to 45–60 m. In this query, some of our results are not our conceptual water pool yet closely match in terms of shape and spectral characteristics. The final column results from querying the system with the dark circle on top of a cooling tower, with 60/40 shape/spectral weighting and the size constrained to 30–150 m. Our imagery collection contains three cooling towers, two of which are shown as the top two results. The third cooling tower, which is not returned, has large amounts of steam occluding the top opening of the tower in our imagery. The remaining results are similar in extracted shape and spectral characteristics.
We conducted an additional experiment using a collection of ground-truth labeled baseball diamonds, L-shaped buildings, and water treatment pools. Our nonlabeled data consisted of 31 000 objects within a 2-km spatial proximity of 20 baseball diamonds, 10 L-shaped buildings, and 10 water treatment pools.
For each test, we withheld 10% of the ground-truth objects for tenfold cross-validation. After building the EBB for each test collection, we queried the remaining data to measure the recall in the top 50 of the desired test class. Our baseball diamonds had an average recall of 77.87% in the top 50 results. In addition, four new baseball diamonds, which were not part of the ground truth, were discovered in the results. Our L-shaped buildings exhibited an average recall of 80.48%, and the water treatment pools exhibited an average recall of 70.04%. In all of these tests, neither object size nor spectral characteristics were used to restrict the result set, as they would be in typical queries in our GeoIRIS system. The use of object size would allow for improved recall in a smaller result set, because the unlabeled data averaged 36.25 m, whereas the minimum size of the ground-truth objects was 37 m.
C. Efficiency
With regard to query efficiency, the search time of shape-based queries is primarily dependent on the number of results desired and the associated cost of computing the bitmap dissimilarities. Recall from Section III that we build leaf priority queues to enable the nonlinear navigation in the high-dimensional bitmap space. Our system utilizes priority queues that have a target object coverage for retrievals of approximately 20 000. We typically retrieve 6000–12 000 results from
Fig. 6. Shape retrieval with EBB. Top row images are query objects (in bounding box), and top ranked results follow.
Fig. 8. Content-based retrieval interface using an airplane (left-hand side) that was automatically extracted from the image database. The large center region is the fourth ranked object retrieved from the database, shown in context and marked with the bounding box. The top of the retrieval results are shown on the right.
Fig. 7. Retrieval efficiency. (a) Average number of bitmap dissimilarities calculated using (2) to retrieve thousands of the results. (b) Average seconds to retrieve a result set of increasing result set size (thousands). (c) Timing comparison for 10-nearest neighbor searches of increasing database size for database versus EBB.
the search agent for the client interface. We performed retrieval efficiency experiments using 800 randomly selected objects in our database. Each object was used to retrieve result sets of sizes from 1000 to 25 000, in steps of 1000. Fig. 7 summarizes the results of these experiments. For each retrieval, we recorded the time (in seconds) and the number of dissimilarity calculations used. For example, the average number of bitmap dissimilarities computed, with (2), to retrieve the top 1000 results was less than 1053. This result is 0.081% of the comparisons that a brute-force search would require in our database of 1.3 million objects. Fig. 7(a) shows that the number of dissimilarity computations (solid line) closely follows the desired result size (plus marked trend). The number of comparisons needed for brute force is 1.3 million for any number of desired results. As mentioned earlier, our priority queues' target
coverage of objects is 20 000, which causes the dissimilarity computations to begin to level off after 20 000 desired results as we exhaust the navigation priority queues. Fig. 7(b) shows the timing trend (in seconds) for the experiment. The average retrieval time for our typical query of 6000 shapes was 1.87 s, and for 12 000, it was 4.57 s. As previously discussed, the priority queue sizes that we use begin to limit the number of retrievals possible, and therefore, the timing begins to level off after 20 000. Note that this priority queue size is simply a system parameter that can be adjusted to balance the expected search result sizes against the resources dedicated to managing priority queues. This level of efficiency for shape retrievals facilitates integration into complex systems, such as GeoIRIS.
In addition, Fig. 7(c) provides a comparison in timing between 10-nearest neighbor searches against increasing database sizes for both a traditional RDBMS and the EBB. These results are average times for 300 test queries as the database size increases. The notable trend is that search times without the EBB increase linearly with the size of the database. In contrast, searches with the EBB are significantly faster and increase logarithmically, as expected. When considering the two types of efficiency experiments together, we see that query times are linear with respect to the desired size of a result set S and are logarithmic with respect to the number of leaves L. Overall, the efficiency of the retrieval is $O(S \log_2 L)$. For all these experiments, we used a dual quad-core (1.6-GHz) server with 8 GB of RAM, which was running the back-end PostgreSQL database, as well as the EBB index agents and search clients.
V. CONCLUSION
In our GeoIRIS retrievals, we have not expected to retrieve objects using shape alone; instead, we have combined the results of shape retrieval with object spectral signatures and, possibly, size constraints.
In the future, we expect to include object textural analysis or other relevant algorithms, with shape-based object retrieval being a component of a larger solution. Fig. 8
shows the results from our GeoIRIS system when searching with an automatically extracted airplane. The upper left corner is a region with the query shape noted by a bounding box. The larger center region shows the currently selected result in its larger context, with the result object in a bounding box. This example shows the fourth ranked result to the query object. Note that the airplanes are facing in two directions. Given the alignment steps, we can expect objects, e.g., airplanes, to be aligned pointing either left or right. To facilitate retrievals with rotational insensitivity, we augment the retrieval by repeating the search with a mirrored bitmap of the query. Given our retrieval efficiency, the search time for 6000 shapes increases to less than 6 s for a second result set and the subsequent merge of the result sets.
Our future work will continue toward increasing the scalability of the EBB. Research is needed to determine the practical limits on the scalability of the EBB for today's hardware platforms and to develop algorithms that overcome these limitations. We are also developing algorithms to provide dynamic manipulation of the EBB with data inserts, deletes, and updates. One of the challenges of dynamic manipulation is maintaining the efficiency of the tree after periods of manipulation. As the number of changes to the tree increases, relative to the original database size, the statistics of the data need to be reevaluated and the tree possibly entirely rebuilt. In addition, we will explore applications of the EBB to other domains, possibly domain-specific text retrieval with a limited index term set. Another possible extension of this paper is the generalization of the EBB from bitmap space to arbitrary discrete feature spaces.
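The mirrored-query augmentation mentioned in this section could be sketched as follows. The sketch assumes that a bitmap is a row-major grid of 0/1 values, that each search returns a ranked list of (object identifier, dissimilarity) pairs, and that the two result sets are merged by keeping the smaller dissimilarity for each object. The function names and the `search_fn` callback are hypothetical and are not part of GeoIRIS.

```python
def mirror(bitmap):
    """Flip a row-major grid of 0/1 values left to right."""
    return [row[::-1] for row in bitmap]


def merge_results(first, second):
    """Merge two ranked lists of (object_id, dissimilarity) pairs,
    keeping the smaller dissimilarity when an object appears in both."""
    best = {}
    for obj, d in first + second:
        if obj not in best or d < best[obj]:
            best[obj] = d
    return sorted(best.items(), key=lambda item: item[1])


def search_with_mirror(query, search_fn, size):
    """Run the search for the query and its mirror image, then merge."""
    merged = merge_results(search_fn(query, size),
                           search_fn(mirror(query), size))
    return merged[:size]


# Hypothetical 2 x 3 query bitmap and two hypothetical result sets.
flipped = mirror([[1, 0, 0], [1, 1, 0]])
merged = merge_results([("a", 0.1), ("b", 0.5)], [("b", 0.2), ("c", 0.3)])
```

Because the second search is independent of the first, the two retrievals could also be issued in parallel before the merge.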
ACKNOWLEDGMENT
The authors would like to thank DigitalGlobe for providing QuickBird imagery from the RADII development data set for use in this paper and the reviewers for their constructive comments, which have significantly helped improve this manuscript.
1615
[8] H. Daschiel and M. Datcu, “Information mining in remote sensing image archives: System evaluation,” IEEE Trans. Geosci. Remote Sens., vol. 43, no. 1, pp. 188–199, Jan. 2005. [9] J. Li and R. M. Narayanan, “Integrated spectral and spatial information mining in remote sensing imagery,” IEEE Trans. Geosci. Remote Sens., vol. 42, no. 3, pp. 673–685, Mar. 2004. [10] I. Kunttu, L. Lepisto, and J. Rauhamaa, “Fourier-based object description in defect image retrieval,” Mach. Vis. Appl., vol. 17, no. 4, pp. 211–218, Sep. 2006. [11] V. P. Shah, N. H. Younan, S. S. Durbha, and R. L. King, “A systematic approach to wavelet-decomposition-level selection for image information mining from geospatial data archives,” IEEE Trans. Geosci. Remote Sens., vol. 45, no. 4, pp. 875–878, Apr. 2007. [12] M. Pesaresi and J. A. Benediktsson, “A new approach for the morphological segmentation of high-resolution satellite imagery,” IEEE Trans. Geosci. Remote Sens., vol. 39, no. 2, pp. 309–320, Feb. 2001. [13] J. Weickert, S. Ishikawa, and A. Imiya, “Linear scale-space has first been proposed in Japan,” J. Math. Imaging Vis., vol. 10, no. 3, pp. 237–252, May 1999. [14] F. Mokhtarian and A. K. Mackworth, “A theory of multiscale, curvaturebased shape representation for planar curves,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 14, no. 8, pp. 789–805, Aug. 1992. [15] Y. Avrithis, Y. Xirouhakis, and S. Kollias, “Affine-invariant curve normalization for object shape representation,” Mach. Vis. Appl., vol. 13, no. 2, pp. 80–94, Nov. 2001. [16] T. Adamek and N. E. O’Connor, “A multiscale representation method for nonrigid shapes with a single closed contour,” IEEE Trans. Circuits Syst. Video Technol., vol. 14, no. 5, pp. 742–753, May 2004. [17] L. J. Latecki, R. Lakamper, and U. Eckhardt, “Shape descriptors for nonrigid shapes with a single closed contour,” in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recog., 2000, vol. 1, pp. 1424–1429. [18] D. Zhang and G. 
Lu, “Review of shape representation and description techniques,” Pattern Recognit., vol. 37, no. 1, pp. 1–19, Jan. 2004. [19] H. Blum, “A transformation for extracting new descriptors for shape,” in Models for the Perception of Speech and Visual Forms, W. Whaten-Dunn, Ed. Cambridge, MA: MIT Press, 1967, pp. 362–380. [20] T. B. Sebastian, P. N. Klein, and B. B. Kimia, “Recognition of shapes by editing shock graphs,” in Proc. ICCV, 2001, pp. 755–762. [21] S. Brandt, J. Laaksonen, and E. Oja, “Statistical shape features for content-based image retrieval,” J. Math. Imaging Vis., vol. 17, no. 2, pp. 187–198, Sep. 2002. [22] M. Safar and C. Shahabi, “MBC-based shape retrieval: Basics, optimizations and open problems,” Multimedia Tools Appl., vol. 29, no. 2, pp. 189– 206, Jun. 2006. [23] S. Belongie, J. Malik, and J. Puzicha, “Shape matching and object recognition using shape contexts,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 4, pp. 509–522, Apr. 2002. [24] E. Rivlin and I. Weiss, “Local invariants for recognition,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 17, no. 3, pp. 226–238, Mar. 1995. [25] R. Mehrotra and J. Gary, “Similar-shape retrieval in shape data management,” Computer, vol. 28, no. 9, pp. 57–62, Sep. 1995. [26] M. Kliot and E. Rivlin, “Invariant-based shape retrieval in pictorial databases,” Comput. Vis. Image Understanding, vol. 71, no. 2, pp. 182– 197, Aug. 1998. [27] G. Scott and C.-R. Shyu, “Knowledge-driven multidimensional indexing structure for biomedical media database retrieval,” IEEE Trans. Inf. Technol. Biomed., vol. 11, no. 3, pp. 320–331, May 2007. [28] R. Rao and D. Ballard, “Object indexing using an iconic sparse distributed memory,” in Proc. IEEE Int. Conf. Comput. Vis., 1995, pp. 24–31. [29] C.-C. Liu, J.-L. Hsu, and A. L. Chen, “Efficient near neighbor searching using multi-indexes for content-based multimedia data retrieval,” Multimedia Tools Appl., vol. 13, no. 3, pp. 235–254, Mar. 2001. [30] C.-R. Shyu, M. Klaric, G. J. 
Scott, A. S. Barb, C. H. Davis, and K. Palaniappan, “GeoIRIS: Geospatial information retrieval and indexing system—Content mining, semantics modeling, and complex queries,” IEEE Trans. Geosci. Remote Sens., vol. 45, no. 4, pp. 839–852, Apr. 2007. [31] M. Klaric, G. Scott, C.-R. Shyu, and C. Davis, “Automated object extraction through simplification of the differential morphological profile for high-resolution satellite imagery,” in Proc. IGARSS, 2005, pp. 1265– 1268. [32] G. Lu and A. Sajjanhar, “Region-based shape representation and similarity measure suitable for content-based image retrieval,” Multimedia Syst., vol. 7, no. 2, pp. 165–174, Mar. 1999. [33] R. A. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval. Reading, MA: Addison-Wesley, 1999. [34] K. Fukunaga, Introduction to Statistical Pattern Recognition. New York: Academic, 1990.
1616
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 49, NO. 5, MAY 2011
[35] S. Theodoridis and K. Koutroumbas, Pattern Recognition. New York: Academic, 1999.
[36] M. Ester, H. Kriegel, J. Sander, and X. Xu, “A density-based algorithm for discovering clusters in large spatial databases with noise,” in Proc. Int. Conf. Knowl. Discov. Data Mining, 1996, pp. 226–231.
[37] G. Scott and C.-R. Shyu, “EBS k-d tree: An entropy-balanced statistical k-d tree for image databases with ground-truth labels,” in Proc. Int. Conf. Image Video Retrieval, vol. 2728, Lecture Notes in Computer Science, 2003, pp. 467–476.
[38] C. E. Shannon, “A mathematical theory of communication,” Bell Syst. Tech. J., vol. 27, pp. 379–423, Jul.–Oct. 1948.
Grant J. Scott (S’02–M’09) received the B.S. and M.S. degrees in computer science and the Ph.D. degree in computer engineering and computer science from the University of Missouri, Columbia, in 2001, 2003, and 2008, respectively. He currently serves as an Assistant Research Professor with the Department of Electrical and Computer Engineering, University of Missouri. He conducts research as part of the Satellite and Remote Sensing Group, Center for Geospatial Intelligence (CGI). During his Ph.D. studies, he was a member of the Medical and Biological Digital Library Research Laboratory and the Center for Geospatial Intelligence, University of Missouri, conducting research in high-performance multimedia retrieval systems (databases), hybrid retrieval systems and protein structural retrieval/comparison engines, and high-resolution satellite image processing. During the course of his M.S. degree, he was a member of the Computational Intelligence Research Laboratory, with research emphasis on computational intelligence, pattern recognition, neural networks, fuzzy systems, image processing/machine vision, and biomedical image databases. His current research is focused on the automated exploitation of high-resolution satellite imagery, in particular geospatial database development, imagery feature-extraction algorithm development, and distributed automatic imagery processing orchestration architectures. His research interests also include high-dimensional indexing and content-based retrieval in biomedical and geospatial databases, as well as computer vision, pattern recognition, computational intelligence, databases, parallel/distributed systems, and information theory in support of media database systems.
Matthew N. Klaric (S’06) received the B.S. (summa cum laude) degree in computer science from Saint Louis University, St. Louis, MO, in 2003. He is currently working toward the Ph.D. degree in computer science at the University of Missouri, Columbia. In 2004, he was a Research Assistant with the Medical and Biological Digital Library Research Laboratory and the Center for Geospatial Intelligence, University of Missouri. In addition, he has served as an Instructor for several undergraduate computer science classes. His research interests include geospatial content-based information retrieval, data mining, computer vision, and pattern recognition. Mr. Klaric annually reviews papers for the IEEE International Geoscience and Remote Sensing Symposium (IGARSS).
Curt H. Davis (S’90–M’92–SM’98–F’08) was born in Kansas City, MO, on October 16, 1964. He received the B.S. and Ph.D. degrees in electrical engineering from the University of Kansas, Lawrence, in 1988 and 1992, respectively. He is currently the Naka Endowed Professor of electrical and computer engineering with the University of Missouri, Columbia (MU) and the Director of the Center for Geospatial Intelligence. His primary research involves the use of satellite microwave and optical remote sensing systems for applications in the areas of earth observation and science, ice sheet mapping and change detection, and urban area geospatial information processing. His ice sheet mapping and change detection research has been funded by the National Aeronautics and Space Administration (NASA) for more than a decade, and he is an internationally recognized expert in the measurement of polar ice sheet change using precision satellite altimeters, the influence of climate on these changes, and the impact of these changes on global sea levels. His urban-area research focuses on the automated processing and development of high-resolution geospatial information products. Examples include high-resolution digital elevation models, urban land cover maps, automated feature extraction of anthropogenic features, and automated change detection. His research results have been documented in more than 45 refereed journal publications and 70 symposia presentations and proceedings. His most significant scientific results have been published in top scientific journals such as Science, Nature, and the Journal of Geophysical Research. Dr. Davis has recently been named an IEEE Fellow for his “contributions to satellite remote sensing.” He has received numerous awards throughout his career, including the National Science Foundation (NSF) Antarctica Service Medal (1988 and 1989), the International Union of Radio Science (URSI) Young Scientist Award (1996), and the NASA New Investigator Program (1996–1999). He served as the Technical Program Cochair of the 2004 IEEE Geoscience and Remote Sensing Symposium held in Anchorage, AK. He is currently an Associate Editor for the IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, in which the majority of his technical contributions to remote sensing have been published.
Chi-Ren Shyu (S’89–M’99–SM’07) received the M.S.E.E. and Ph.D. degrees in electrical and computer engineering from Purdue University, West Lafayette, IN, in 1994 and 1999, respectively. Upon completing one year of postdoctoral training with Purdue, he joined the Department of Computer Engineering and Computer Science, University of Missouri (MU), Columbia, in October 2000. He is currently the Paul K. and Diane Shumaker Endowed Professor of engineering and heads the MU Informatics Institute. His research interests include geospatial image information mining, visual knowledge understanding and retrievals, and biomedical informatics. Dr. Shyu is the recipient of the National Science Foundation Faculty Early Career Development (NSF CAREER) Award, MU College of Engineering Faculty Research Award, and various teaching awards. He is a member of the American Association for the Advancement of Science (AAAS) and the American Medical Informatics Association (AMIA).