Cartographic Indexing into a Database of Remotely Sensed Images

Benoit Huet and Edwin R. Hancock
Department of Computer Science, University of York, York, Y01 5DD, UK

Abstract

This paper aims to develop simple statistical methods for indexing line patterns. The application vehicle used in this study involves indexing into an aerial image database using a cartographic model. The images contained in the database are of urban and semi-urban areas. The cartographic model represents a road network known to appear in a subset of the images contained within the database. There are known to be severe imaging distortions present and the data cannot be recovered by applying a simple Euclidean transform to the model. We effect the cartographic indexing into the database using pairwise histograms of the angle differences and the cross ratios of the lengths of line segments extracted from the raw aerial images. We investigate several alternative ways of performing histogram comparison. Our conclusion is that the Matusita and Bhattacharyya distances provide significant performance advantages over the L2 norm employed by Swain and Ballard. Moreover, a sensitivity analysis reveals that the angle-difference histogram provides the most discriminating index of line-structure; it is robust both to image distortion and to the variable quality of input line-segmentation.

1 Introduction

Pivotal to the effectiveness of multi-media or hypertext systems is the ability to accurately retrieve images from a large database on the basis of contents [1, 2, 3, 4, 5]. The crucial ingredient of the underlying image database management system is an accurate way of indexing according not only to the frequency of salient image features but also to meaningful relational structure between the features. Ideally, such content-based indices should be capable of unambiguously delivering answers to queries with a minimum computational overhead. One of the most popular techniques is to construct a histogram of image attributes [6, 7]. Retrieval is effected by comparing the histograms for the query image and those residing in the database. Images for which the integrated difference between normalised histogram bin-contents is small are deemed to have satisfied the query. This basic histogram-based retrieval mechanism was first used by Swain and Ballard [6] in the indexing of a database of colour images. It has recently been shown to operate effectively for indexing range-images, where the histogram is computed from surface normal orientation [8].

Although effective, the basic histogram-based retrieval mechanism has a number of shortcomings. In the first instance, both the query image and the database must be represented in terms of the same raw attributes. This means that the database can only be queried at low-level using template images. To render the low-level indexing effective, considerable effort must be directed towards image normalisation. In other words, the issues of photometric normalisation and colour constancy are of critical importance. Moreover, in many applications it may be more desirable to index into the database using a relational query [9]. In this case, the salient properties of the image to be retrieved may be represented more effectively using the relational structure existing between the raw image features rather than by their context-free occurrence. Such relational descriptions offer the additional attraction of being amenable to hierarchical representation [10]. This not only allows a rich conceptual structure to be imposed on the pictorial information, it can also lead to a considerable reduction in the computational overheads associated with search.

The second shortcoming of existing histogram-based image retrieval systems resides in the way in which histogram comparison is effected. This is normally achieved using the L1 or L2 norms to compare the normalised histogram bin-contents. The shortcomings of retrieving the image that minimises the L2 norm largely relate to its sensitivity to noise. The measure is dominated by high-error bins. This means that it does not return a good measure of closeness for histograms which are largely congruent but contain a few noise-dominated bins. There are a number of potentially more powerful statistical methods [11] that are less prone to ambiguity and outlier structure. In particular, if we regard the normalised histogram as representing a class-conditional probability density function, then there are a number of interclass probabilistic distance measures that can be used to gauge histogram difference. These include the divergence, the Bhattacharyya distance and the Matusita distance.

In this paper we consider the application of indexing into a database of aerial images (figure 2) using a digital map (figure 1(a)). Here the relational query is posed at high-level using cartographic structures to index against images containing corresponding feature groupings. In particular, we are interested in retrieving grey-scale images that contain certain patterns of road structure represented in the digital map. In other words, the relational query is posed in terms of cartographic structure rather than raw image attributes.

This application vehicle allows us to study two aspects of the indexing process. In the first instance we aim to show how histogram-based methods can be used to index a relational model against image data. In order to realize this process we investigate a number of alternative scale and rotation invariant image representations derived from line-segments. The most straightforward of these is the histogram of pairwise angle differences. A more complex relational attribute is the cross-ratio of line-length. The second aspect of our study is to compare the effectiveness of the different probabilistic distance measures. Here we illustrate that the best results are achieved if the Bhattacharyya or Matusita distances are used.

Our evaluation is based on a database of 22 aerial infra-red images (figure 2). We aim to index into the images using a relational histogram for straight-line-segments forming a road network in the cartographic map. The corresponding image features are intensity ridges which are segmented using a relaxation operator. This is a challenging application since there is both sampling noise and significant imaging distortion.

The structure of this paper is as follows. Section 2 describes our application domain which involves the cartographic indexing of aerial infra-red images. Section 3 describes the construction of pairwise geometric histograms. In section 4 we detail the probabilistic distance measures used for statistical indexing. Experimental evaluation is presented in section 5. This takes the form of a comparative study aimed at establishing the best geometric representation together with the most effective histogram distance measure. Finally, section 6 presents some conclusions and suggests directions for future investigation.

Figure 1: Typical images used by the system. (a) Digital map. (b) Infra-red line-scan image.

2 The Application Domain

The problem of accurately and rapidly performing content-based retrieval of images is very complex and is currently an active area of research [1, 2, 3, 4, 5]. The pictorial information used in these studies has invariably represented quite diverse scenes in which the different images are relatively salient with respect to one another. In this paper we consider the problem of indexing into a database of images having relatively uniform appearance. The application vehicle for our study is the problem of indexing into a database of remotely sensed images using a digital map. The majority of the images contain road structure. However, they can be distinguished neither upon the density nor orientation of the road structure. It is the underlying shape of the road networks that distinguishes the images. Moreover, since the digital map represents a cartographic abstraction of the aerial images, we cannot rely on the constancy or photometric invariance of a statistical summary of the raw image data as an indexing device. Instead, we aim to extract line segments from the raw image data and to effect the indexing via a histogram of pairwise geometric attributes. Since the coordinate systems of map and image are not in registration, the attributes used to construct the histogram are invariant to scale and rotation. Because the imaging process is subject to geometric distortion, the attribute histograms must not be oversensitive to imaging systematics.

The image database consists of 22 aerial infra-red line-scan images (see figure 2). These images are of both rural and urban areas. The images are formed by a line-scan process in the horizontal direction and by aircraft motion in the vertical direction. The main features are man-made road structures which radiate strongly in the infra-red band. These features present themselves as intensity ridges in the infra-red images. The ridges are extracted using a relaxational line-finder [12] which encourages contour connectivity using a dictionary of local line structures. Straight-line segments are extracted from the labelled intensity ridges delivered by the relaxation operator. The extracted line-segments are used to compute histograms of pairwise geometric attributes (see figure 4). We aim to compare the data histograms with those extracted from the cartographic information for road networks in the digital map. Because of the rotating mirror optics underlying the line-scan process used to sense the infra-red images, there is a significant barrel distortion in the horizontal direction. In other words, there is distortion of the data histograms with respect to the model. Some images in the database cover the same area as the digital map; for example, the two images img60 (figure 2(f)) and img170 (figure 2(q)) are taken at different aircraft altitudes.

Figure 2: The infra-red image database; the database contains 22 images of urban, semi-urban and rural areas. Panels: (a) img10, (b) img20, (c) img30, (d) img40, (e) img50, (f) img60, (g) img70, (h) img80, (i) img90, (j) img100, (k) img110, (l) img120, (m) img130, (n) img140, (o) img150, (p) img160, (q) img170, (r) img180, (s) img190, (t) img200, (u) img210.

3 Pairwise Geometric Histograms

We have used three different geometric attributes to construct pairwise histograms from the line segments. These are the relative angle, the line-segment length ratio and the line-segment projection cross ratio. The raw information available for each line-segment is its orientation (slope) and length. To illustrate how the pairwise attributes are computed, suppose that the two raw attributes for the line indexed (ab) are denoted by the vector $x_{ab} = (l_{ab}, \theta_{ab})^T$ where $l_{ab}$ is the line-segment length and $\theta_{ab}$ is the line-segment's orientation (see figure 3). The relative orientation between the lines indexed (ab) and (cd) is equal to

$$\theta_{ab,cd} = \min\left[\,|(\theta_{ab} - \theta_{cd})|,\; |(\theta_{cd} - \theta_{ab})|\,\right]$$

The ratio of line-segment length is given by

$$r_{ab,cd} = \frac{\min[l_{ab}, l_{cd}]}{\max[l_{ab}, l_{cd}]}$$

The line-segment projection cross-ratio is computed as follows:

$$xr_{ab,cd} = \frac{\min[l_{ad}, l_{bc}]}{\max[l_{ad}, l_{bc}]}$$

The methodology used to compute the histogram attributes allows us to know the range of the attributes. The relative orientation attribute will range between 0 and $\pi/2$, the line-segment length ratio between 0 and 1 and the cross-ratio between 0 and 1. Therefore we are able to choose sensible bin-sizes for each histogram. For the experiments described hereafter, the relative orientation histogram (angle histogram) is composed of 18 bins (each bin spanning an angle of $\pi/36$ radians). The histogram based on the ratio of line-segment length (length ratio histogram) is also composed of 18 bins, and the line-segment projection cross-ratio histogram (cross-ratio histogram) is divided into 36 bins. All our histograms are normalised so that the sum of the bin contents is equal to unity.

To provide some illustrative examples of our methodology, Figure 4 shows the sequence of processing steps from an infra-red radar map image leading to the extraction of the histogram. Figure 4(a) is the raw image. Figure 4(d) is the result of applying straight-line detection to the output of a probabilistic relaxation line detector. Finally, Figure 4(g), (j) and (m) show respectively the computed histograms based on line-segment pair relative angles, ratio of line-segment length and line-segment projection cross-ratio.
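The construction described in this section can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: segments are assumed to be given as endpoint pairs with non-zero length, orientations are treated as undirected, and the angle difference is folded into the range 0 to π/2 so that it matches the stated attribute range. The function names are hypothetical. The cross-ratio is left out because the lengths l_ad and l_bc depend on the construction in figure 3, which the text does not spell out.

```python
import itertools
import math


def segment_attributes(seg):
    """Return (length, orientation) for a segment ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = seg
    length = math.hypot(x2 - x1, y2 - y1)
    # Assumption: orientation is undirected, so reduce it to [0, pi).
    theta = math.atan2(y2 - y1, x2 - x1) % math.pi
    return length, theta


def relative_angle(theta_ab, theta_cd):
    """Angle between two undirected lines, folded into [0, pi/2] (assumed)."""
    diff = abs(theta_ab - theta_cd)
    return min(diff, math.pi - diff)


def length_ratio(l_ab, l_cd):
    """Ratio of the shorter to the longer length, in (0, 1]."""
    return min(l_ab, l_cd) / max(l_ab, l_cd)


def normalised_histogram(values, n_bins, upper):
    """Bin values from [0, upper] into n_bins and normalise to unit sum."""
    counts = [0] * n_bins
    for v in values:
        # Clamp so that v == upper falls into the last bin.
        index = min(int(v / upper * n_bins), n_bins - 1)
        counts[index] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def pairwise_histograms(segments, n_bins=18):
    """Angle and length-ratio histograms over all unordered segment pairs."""
    attrs = [segment_attributes(s) for s in segments]
    angles, ratios = [], []
    for (l1, t1), (l2, t2) in itertools.combinations(attrs, 2):
        angles.append(relative_angle(t1, t2))
        ratios.append(length_ratio(l1, l2))
    return (normalised_histogram(angles, n_bins, math.pi / 2),
            normalised_histogram(ratios, n_bins, 1.0))
```

With 18 bins over 0 to π/2, each angle bin spans π/36 radians, which agrees with the bin width quoted above. Both attributes are unchanged by rotating or uniformly scaling all segments, which is the property the indexing relies on.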

4 The Statistical Distance Measures

Our second aim is to study the relative effectiveness of a number of alternative distance measures as measures of histogram similarity. These histograms represent the frequency of occurrence of the various pairwise attributes defined in the previous section of this paper. Suppose that the normalised histogram contents for the bin indexed i in the model and data histograms are respectively denoted by $P_M(i)$ and $P_D(i)$. Each histogram contains n bins. The distance measures under consideration are the following:

L1 Norm:

$$L_1(P_D, P_M) = \sum_{i=1}^{n} |P_D(i) - P_M(i)|$$

L2 Norm:

$$L_2(P_D, P_M) = \sqrt{\sum_{i=1}^{n} \left(P_D(i) - P_M(i)\right)^2}$$

Bhattacharyya Distance:

$$B(P_D, P_M) = -\ln \sum_{i=1}^{n} \sqrt{P_D(i) \times P_M(i)}$$

Matusita Distance:

$$M(P_D, P_M) = \sqrt{\sum_{i=1}^{n} \left(\sqrt{P_D(i)} - \sqrt{P_M(i)}\right)^2}$$

Divergence:

$$D(P_D, P_M) = \sum_{i=1}^{n} \left[\left(P_D(i) - P_M(i)\right) \ln \frac{P_D(i)}{P_M(i)}\right]$$

Figure 3: Computing the histogram attributes from line segments (ab) and (cd)
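The five measures of this section can be written down directly in Python. The following is a sketch under assumptions rather than the authors' code: histograms are plain lists of equal length that sum to one, the function names are hypothetical, and the small constant `eps` in the divergence is an assumed way of handling empty bins, which the paper does not discuss.

```python
import math


def l1_norm(p_d, p_m):
    """Sum of absolute bin differences."""
    return sum(abs(d - m) for d, m in zip(p_d, p_m))


def l2_norm(p_d, p_m):
    """Square root of the summed squared bin differences."""
    return math.sqrt(sum((d - m) ** 2 for d, m in zip(p_d, p_m)))


def bhattacharyya(p_d, p_m):
    """Negative log of the summed geometric means of corresponding bins."""
    coefficient = sum(math.sqrt(d * m) for d, m in zip(p_d, p_m))
    # With no overlapping bins the logarithm is undefined; treat as infinite.
    return math.inf if coefficient == 0 else -math.log(coefficient)


def matusita(p_d, p_m):
    """Euclidean distance between the square-rooted histograms."""
    return math.sqrt(sum((math.sqrt(d) - math.sqrt(m)) ** 2
                         for d, m in zip(p_d, p_m)))


def divergence(p_d, p_m, eps=1e-12):
    """Symmetric divergence; eps is an assumed guard against empty bins."""
    return sum((d - m) * math.log((d + eps) / (m + eps))
               for d, m in zip(p_d, p_m))
```

All five functions return zero (up to rounding) for identical histograms and grow as the histograms differ, so the database entry that minimises the chosen measure is taken as the best match.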

Before we proceed to experiment with the different distance measures, it is important to understand the way in which they gauge differences in histogram structure. The Bhattacharyya distance is effectively a correlation measure. In the case of Gaussian probability density functions, the Bhattacharyya distance is proportional to the Mahalanobis distance between the class means. The divergence has a more subtle structure. In the case of Gaussian mixtures, it not only depends upon the between-class Mahalanobis distance, it also gauges the difference in class covariance. The Matusita distance is determined by the negative exponential of the Bhattacharyya distance.
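The link between the Matusita and Bhattacharyya distances noted in the last sentence can be made explicit. The short derivation below is a sketch that assumes both histograms are normalised to unit sum, as stated in section 3; it is standard algebra rather than a result taken from the paper.

```latex
\begin{align*}
M(P_D, P_M)^2
  &= \sum_{i=1}^{n} \left(\sqrt{P_D(i)} - \sqrt{P_M(i)}\right)^2 \\
  &= \sum_{i=1}^{n} P_D(i) + \sum_{i=1}^{n} P_M(i)
     - 2 \sum_{i=1}^{n} \sqrt{P_D(i) \times P_M(i)} \\
  &= 2 - 2\, e^{-B(P_D, P_M)}
\end{align*}
```

Under this assumption the Matusita distance is a monotonically increasing function of the Bhattacharyya distance, so the two measures order the database entries identically.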

5 Experiments

The aims of our experimental evaluation are threefold. In the first instance we are interested in determining which of the different distance measures provides the most effective index against the digital map. Since our database contains both low-altitude and high-altitude images of the mapped area, we will be interested in the relative ranking given to these two images. The second aspect of our study concerns the most effective pairwise attributes for use in constructing the image histograms and subsequently indexing. Here we investigate the use of relative angle, line-segment length ratio and line-segment projection cross ratio. Finally, we investigate the sensitivity of the method to over- and under-segmentation of the raw image features.

5.1 Distance measures

First we computed the angle histogram of the digital map (figure 1(a)). Then, we performed the comparison between each database entry histogram and the digital map histogram. For each of the five distance measures we ordered the images in the database according to their closeness to the map histogram. All the techniques identified img60 as having the closest histogram similarity to the cartographic model. However, when the ranked matches are considered, the L1 and L2 norms do not perform as well as the statistical distance measures. In particular, the L1 norm is the only measure providing an incorrect second best match (img130, against img170 for the other measures). There was no noticeable difference in the performances of the Bhattacharyya distance, the Matusita distance and the divergence.

We have also compared the histogram distances between the individual infra-red images. In an experiment involving the infra-red image img90, only the L1 norm failed to return image img190 as the second best match. This image is a second view of the same location as img90 taken from a different aircraft altitude.

An interesting observation may be made about the distribution of the response of the various distance measures. None of the statistical distances tested here provides a clear gap between correct and incorrect matches. This means that we cannot rely solely on the statistical histogram comparison to automatically select the n best matches for further processing.
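The ranking step used in this experiment can be sketched as follows. This is an illustrative outline only: the function name `rank_database`, the dictionary layout and the toy three-bin histograms are assumptions made for the example and are not data from the paper.

```python
import math


def bhattacharyya(p_d, p_m):
    """Bhattacharyya distance between two normalised histograms."""
    coefficient = sum(math.sqrt(d * m) for d, m in zip(p_d, p_m))
    return math.inf if coefficient == 0 else -math.log(coefficient)


def rank_database(query_hist, database, distance=bhattacharyya):
    """Return (name, distance) pairs ordered from best to worst match."""
    scored = [(name, distance(hist, query_hist))
              for name, hist in database.items()]
    return sorted(scored, key=lambda item: item[1])


# Toy histograms, made up for illustration (not taken from the paper).
database = {
    "img_a": [0.5, 0.3, 0.2],
    "img_b": [0.1, 0.2, 0.7],
    "img_c": [0.4, 0.4, 0.2],
}
map_hist = [0.5, 0.3, 0.2]

ranking = rank_database(map_hist, database)
for name, score in ranking:
    print(name, round(score, 4))
```

Passing a different function as `distance` reorders the same database under another measure, which is how the measures can be compared on equal terms. Because the distances form no clear gap between correct and incorrect matches, a fixed threshold on the score is unlikely to separate them.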

5.2 Pairwise attributes

Having identified the most discriminating distance measures, we can now investigate the use of alternative attributes, such as the ratio of segment lengths and the line-segment projection cross ratio. The digital map length-ratio histogram is compared against each length-ratio histogram corresponding to database entries, using the same distance measures as in the previous experiment. None of the distance measures delivered acceptable results using length-ratio histograms composed of 16 bins. The acceptable answers, img60 and img170, never appeared among the best matches. We experimented with various sizes of histogram and finally concluded that the ratio of line-segment length does not provide any discriminating power.

Our last experiment with single pairwise attribute histograms involves the histogram based on the line-segment projection cross ratio. This time the digital map cross-ratio histogram is compared against each database entry. The results show that the cross-ratio of line-segment pairs is an effective means of indexing into a line-segmented image database. All the distance measures, except the L1 and L2 norms, identified image img60 as most similar to the query image orig60. However, its discriminating power is not as great as that of the relative line-segment angle attribute. In particular, the cross-ratio histogram does not deal very well with changes in the segmented image. This was illustrated by the misclassification of img170, since none of the techniques managed to rank this image as second best match.

Figure 4: Typical infra-red images going through the processing steps leading to the histogram representations. Columns: the digital map (orig60), image img60 and image img170. Panels: (a) digital map, (b) original image img60, (c) original image img170; (d)-(f) line images; (g)-(i) angle histograms; (j)-(l) length-ratio histograms; (m)-(o) cross-ratio histograms.

5.3 Sensitivity Study

The final experimental goal is to investigate the sensitivity of the retrieval to the quality of segmentation. We address this issue by systematically varying the noise rejection parameter σ of the relaxation operator used to extract intensity ridges from the raw image data. Formally, σ is related to the standard deviation of the raw image noise. Varying σ has the effect of controlling the density of line-segments extracted from the raw image. Figure 5 shows a series of image segmentations resulting from various values of the noise rejection parameter. When the noise rejection parameter is set to a low value, the cartographic structure becomes swamped by clutter. When the rejection parameter is large, significant structure is lost. For each of the resulting segmentations we created a new set of attribute histograms.

Figure 5: Line extraction using various σ. Panels: (a) Original image, (b) σ = 7, (c) σ = 10, (d) σ = 15, (e) σ = 20, (f) σ = 25.

Our aim here is to investigate the effects of segmental clutter and drop-out on the retrieval performance. We confine our attention to the relative angle histogram and the cross-ratio histogram. In order to assess the algorithm's performance we augmented the database with the four new segmentations contained in figure 5. The infra-red image segmented with σ = 7 is called img60_07 and the same convention applies for images img60_10, img60_20 and img60_25. The original database image img60 has been segmented with σ = 15. The database was again queried using the digital map shown in figure 1(a).

The main experimental conclusion is that the retrieval method is robust to variable segmentation quality. For the relative angle histogram (see results in table 1), the standard distance measures (L1 and L2 norms) do not perform as well as the statistical measures (Bhattacharyya, Matusita and Divergence). The latter responded to the query with the various segmentations of img60 as giving the best 5 matches, while the high-altitude image img170 scored sixth. The results obtained with the cross-ratio are less encouraging (see table 2). The results indicate that the amount of salient structural information contained in the cross-ratio histogram decreases as the number of extracted line-segments decreases.

Table 1: Effect of segmental clutter on relative angle histograms

Match  L1 Norm          L2 Norm          Bhattacharyya    Matusita         Divergence
10th   img50 0.49       img50 0.15       img150 0.04      img150 0.30      img150 0.37
9th    img90 0.48       img90 0.15       img90 0.04       img90 0.30       img90 0.37
8th    img180 0.47      img180 0.15      img130 0.04      img130 0.29      img130 0.37
7th    img60_10 0.468   img130 0.15      img180 0.04      img180 0.29      img180 0.34
6th    img170 0.46      img150 0.14      img170 0.04      img170 0.28      img170 0.33
5th    img130 0.44      img60_25 0.13    img60_25 0.04    img60_25 0.28    img60_25 0.32
4th    img60_25 0.41    img60_10 0.13    img60_10 0.03    img60_10 0.27    img60_10 0.30
3rd    img60_07 0.37    img60_07 0.11    img60_07 0.03    img60_07 0.24    img60_07 0.24
2nd    img60_20 0.34    img60_20 0.10    img60_20 0.02    img60_20 0.21    img60_20 0.19
Best   img60 0.31       img60 0.09       img60 0.01       img60 0.19       img60 0.14

Table 2: Effect of segmental clutter on cross-ratio histograms

Match  L1 Norm          L2 Norm          Bhattacharyya    Matusita         Divergence
10th   img70 0.43       img50 0.09       img90 0.04       img90 0.28       img50 0.32
9th    img60_25 0.42    img60 0.09       img60_20 0.04    img60_20 0.28    img100 0.32
8th    img50 0.42       img60_20 0.09    img100 0.04      img100 0.28      img60_25 0.32
7th    img60_20 0.42    img70 0.09       img50 0.04       img50 0.28       img60_20 0.31
6th    img60 0.41       img170 0.09      img170 0.03      img170 0.26      img170 0.28
5th    img170 0.40      img60_25 0.09    img120 0.03      img120 0.26      img120 0.26
4th    img120 0.40      img120 0.08      img60 0.03       img60 0.25       img60 0.26
3rd    img80 0.39       img60_10 0.08    img80 0.03       img80 0.25       img80 0.25
2nd    img60_10 0.38    img80 0.08       img60_10 0.03    img60_10 0.24    img60_10 0.24
Best   img60_07 0.31    img60_07 0.07    img60_07 0.02    img60_07 0.20    img60_07 0.15

6 Conclusions

The main contribution of this paper has been to show how a cartographic model can be used to index into a database of aerial images. The indexing is effected by comparing pairwise attribute histograms using a number of statistical distance measures. A comparative study shows that the most effective and least noise-sensitive attribute is the relative angle. Our main conclusion concerning histogram discrimination is that the Bhattacharyya distance, the Matusita distance and the divergence all outperform the L1 and L2 norms. This study can be regarded as providing the foundations for a more ambitious programme of work aimed at developing a hierarchical relational indexing system [10]. Our next goal is to refine the coarse-grained putative image matches using a fine-grained relational graph matching technique [13]. This is similar in concept to the structural hashing idea of Costa and Shapiro [9] but draws on a statistical model of the clutter processes which contaminate the image database.

References

[1] W. Niblack, R. Barber, W. Equitz, M. Flickner, E. Glasman, D. Petkovic, P. Yanker, C. Faloutsos, and G. Taubin, "The QBIC project: Querying images by content using color, texture and shape," Image and Vision Storage and Retrieval, pp. 173-187, 1993.
[2] A. P. Pentland, R. W. Picard, and S. Sclaroff, "Photobook: tools for content-based manipulation of image databases," Storage and Retrieval for Image and Video Database II, pp. 34-47, February 1994. San Jose, California.
[3] T. Gevers and A. Smeulders, "Σnigma: An image retrieval system," International Conference on Pattern Recognition (ICPR) 1992, pp. 697-700, 1992.
[4] M. Swain, "Interactive indexing into image databases," Image and Vision Storage and Retrieval, pp. 95-103, 1993.
[5] R. W. Picard, "Light-years from Lena: Video and image libraries of the future," Int. Conf. on Image Proc., vol. 1, pp. 310-313, 1995.
[6] M. Swain and D. Ballard, "Indexing via colour histograms," Third International Conference on Computer Vision, pp. 390-393, 1990.
[7] A. Jain and A. Vailaya, "Image retrieval using color and shape," Pattern Recognition, vol. 29, no. 8, pp. 1233-1244, 1996.
[8] C. Dorai and A. Jain, "View organisation and matching of free-form objects," IEEE Computer Society International Symposium on Computer Vision, pp. 25-30, 1995.
[9] M. Costa and L. Shapiro, "Scene analysis using appearance-based models and relational indexing," IEEE Computer Society International Symposium on Computer Vision, pp. 103-108, 1995.
[10] K. Sengupta and K. Boyer, "Organising large structural databases," IEEE Transaction on Pattern Analysis and Machine Intelligence, vol. 17, no. 4, pp. 321-332, 1995.
[11] P. Devijver and J. Kittler, Pattern Recognition - A Statistical Approach. Prentice-Hall, 1982.
[12] E. R. Hancock, "Resolving edge-line ambiguities using probabilistic relaxation," IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'93), pp. 300-306, 1993.
[13] R. Wilson and E. Hancock, "Relational matching with active graph structures," Fifth International Conference on Computer Vision, pp. 450-456, 1995.
