Improving reconstruction of man-made objects from sensor images by machine learning

R. Englert and A. B. Cremers
Rheinische Friedrich-Wilhelms-Universität Bonn, Institute of Computer Science III
Römerstrasse 164, D-53117 Bonn, Germany

ABSTRACT

In this paper we present a new approach for the acquisition and analysis of background knowledge which is used for the 3D reconstruction of man-made objects, in this case buildings. Buildings can easily be represented as parameterized graphs, from which p-subisomorphic graphs will be computed. P-graphs will be defined and an upper bound complexity estimation for the computation of p-subisomorphisms will be given. In order to reduce the search space we will discuss several pruning mechanisms. Background knowledge requires a classification in order to obtain a probability distribution which serves as a priori knowledge for 3D building reconstruction. Therefore, we apply an alternative view of nearest-neighbor classification to measured knowledge in order to learn a distribution of this knowledge based on a complete seed and a noise model. An application to an extensive scene consisting of 1846 building clusters, which are represented as p-graphs in order to estimate a probability distribution of corner nodes, demonstrates the effectiveness of our approach. An evaluation using information coding theory determines the information gain provided by the estimated distribution in comparison with having no a priori knowledge available.

Keywords: Object Reconstruction, Machine Learning, Graph Theory, Generic Scene Knowledge, Semantic Modeling, Statistical Analysis, Information Coding Theory, Upper Bound Complexity Estimation

1. INTRODUCTION

This paper presents a new approach for the acquisition of 3D generic scene knowledge, which is required as background knowledge for the computer-aided reconstruction (CAR) of buildings. 3D reconstruction of man-made objects, particularly buildings, is necessary for a large number of tasks related to planning and construction. Due to occlusions, ambiguities, and image noise, a model-based reconstruction (MBR) system requires generic scene knowledge in order to fill these deficiencies (cf. Braun et al. [1]). Background knowledge is a derived and generalized representation of a large variety of given example objects. Our approach is aimed at the learning of 3D generic scene knowledge from objects which are represented as parameterized graphs. In this context, learning can be seen as "Extraction + Classification" of examples. Man-made objects are easily representable as parameterized graphs, since these enable one to model a combination of topological and geometrical information. Further reasons for and applications of parameterized graphs are described in Englert [2]. In Section 2 the basic notion of parameterized graphs and the computation of p-subisomorphisms will be described and illustrated by an example. Since the computation of subisomorphisms is known to be NP-hard, we will provide several pruning mechanisms in order to reduce the search space. An upper bound complexity estimation of our approach, which does not rely on assumptions such as planarity, will be given. The computation of p-subisomorphisms is a powerful technique for the generation of building model knowledge (cf. Englert et al. [3]). In Section 3 a nearest-neighbor based approach for the classification of building model knowledge will be described. This approach makes intensive use of p-subisomorphism computation for the generation of instances which need to be classified.



Further author information: Email: [email protected]. This work is supported in part by BMBF/DLR under Grant 01 M 3018 F/6.


The model knowledge generation and classification approach will be evaluated using an extensive three-dimensionally modeled suburban scene consisting of 1846 building clusters which are represented as p-graphs. From these p-graphs all possible 3-nodes will be extracted and classified according to a noise model (Section 4.2). Why are 3-nodes important for 3D building reconstruction? 3-nodes comprise more information than the Waltz labeling (cf. Waltz [4]). Furthermore, the Waltz labeling is based on assumptions which do not hold for most applications: only trihedral objects, resp. only boxes with shadows and illumination, can be processed automatically; the approach needs a large number of junction types even for these simple cases (approx. 505); the acquisition of junction types requires tremendous effort; and, finally, the models and the used background knowledge (here: the types of junctions) must be complete and perfect. In contrast to this approach, 3-nodes can be extracted, classified, and analyzed automatically. The evaluation uses information coding theory in order to statistically analyze the obtained distribution PM of the 3-nodes (Section 4.3). To this end we discuss the computed distribution of 3-nodes; then PM will be compared with a uniform distribution of the 3-nodes using the Kullback Leibler distance, resulting in the information gain due to PM. Furthermore, the information provided by the observation of two edges about the remaining edge will be estimated and discussed. Finally, we conclude with a discussion of future work.

2. PARAMETERIZED GRAPHS AND SUBISOMORPHISM COMPUTATION

In this section parameterized graphs and the computation of p-subisomorphic graphs will be described (Sections 2.1 and 2.2). Since the computation of p-subisomorphisms is known to be very costly, we will give an upper bound complexity estimation (Section 2.4, cf. Englert and Seelmann-Eggebert [5]).

2.1. Basic notion

In this section parameterized graphs and p-subisomorphic graphs will be defined. Basic concepts of graph theory can be found in, e.g., Aigner [6] and Harary [7].


Definition 2.1 (parameter vector). A parameter vector $A = (a_1, \ldots, a_n)$ of length $n$ is a list consisting of $n$ real-valued constants or real-valued variables; $a_i$ is called a component of $A$.

Definition 2.2 (parameterized graph, p-graph). A parameterized graph $G_P = (V, \mu, E, \nu)$ (short: p-graph) is an undirected graph where each node $v \in V$ has a parameter vector $\mu(v)$ and each edge $e \in E$ has a parameter vector $\nu(e)$. The neighborhood $N(V')$ of a node set $V' \subseteq V$ is $N(V') := \{v \in V \setminus V' \mid \exists w \in V' : \{v, w\} \in E\}$, and the degree $\delta(V')$ is $|N(V')|$.
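To make these notions concrete, the following is a minimal Python sketch of a p-graph data structure; the class and method names are our own illustration, as the paper prescribes no implementation:

```python
from dataclasses import dataclass, field

@dataclass
class PGraph:
    """Minimal p-graph sketch: an undirected graph whose nodes and edges
    carry parameter vectors (tuples of constants or variables)."""
    nodes: dict = field(default_factory=dict)   # node -> parameter vector
    edges: dict = field(default_factory=dict)   # frozenset({u, v}) -> parameter vector

    def add_node(self, v, params=()):
        self.nodes[v] = tuple(params)

    def add_edge(self, u, v, params=()):
        # p-graphs have neither self-loops nor parallel edges (see Section 2.1)
        assert u != v and u in self.nodes and v in self.nodes
        self.edges[frozenset((u, v))] = tuple(params)

    def neighborhood(self, vs):
        """N(V'): all nodes outside vs that are adjacent to a node of vs."""
        vs = set(vs)
        return {w for e in self.edges if e & vs for w in e if w not in vs}

    def degree(self, vs):
        """delta(V') = |N(V')|."""
        return len(self.neighborhood(vs))
```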

Note, the parameter vectors of a p-graph may be of different lengths. On the other hand, parameter vectors do not permit the discrimination of nodes, resp. edges, from each other. The following extensions of graph concepts are based on parameter vectors ($|\cdot|$ denotes the cardinality of a set):

Definition 2.3 ($p_n$-connected). A p-graph $G_P = (V, \mu, E, \nu)$ is called $p_n$-connected iff for all pairs of nodes $\{v, w\} \in V \times V$ there exists a path $P \in E^k$, $k \leq |E|$, from $v$ to $w$, and each edge $e \in E$ has a parameter vector $A = \nu(e)$ of length $n$. Two edges $e_1 = \{v, w\}$ and $e_2 = \{p, q\}$ are adjacent iff $v = p$ and $w \neq q$.

Definition 2.4 (cut node). A cut-node is a node whose elimination destroys the $p_n$-connection of the resulting p-graph.

Since nodes of a p-graph carry a parameter vector, we assume that p-graphs do not entail self-loops. Analogously, p-graphs must not have parallel edges. The notion of an induced p-subgraph enables one to extract a subgraph from a p-graph while maintaining the information given by its parameter vectors.

Definition 2.5 (is subsumed by, $\preceq$). Given two parameter vectors $A = (a_1, \ldots, a_n)$ and $B = (b_1, \ldots, b_n)$ of length $n$. $A$ is subsumed by $B$, written $A \preceq B$, iff for $i = 1, \ldots, n$ holds: $a_i = b_i$, or $a_i$ is a variable and $b_i$ a constant.

Definition 2.6 (is equivalent to, $\doteq$). Two parameter vectors $A$ and $B$ are called equivalent, written $A \doteq B$, iff $A \preceq B$ and $B \preceq A$.

The trivial case in which each parameter vector is subsumed by itself will not be considered. As an example take the parameter vectors $A = (a, X)$*, $B = (X, b)$, $C = (Y, X)$, and $D = (a, c)$; then $A \preceq D$, $C \preceq A$, $C \preceq B$, and $C \preceq D$ hold, where $P \preceq P'$ means $P \prec P'$ or $P \doteq P'$. Subgraphs of p-graphs have to take the relations of the parameter vectors of nodes and edges into consideration. Note, the binomial coefficient $\binom{S}{k}$ denotes the sets consisting of $k$ objects without repetition out of a set $S$, and the mapping $f|_M$ denotes the restriction of $f$ to $M$, where $f|_M : M \to Y$ with $M \subseteq X$, assuming a mapping $f : X \to Y$ is given.

Definition 2.7 (p-subgraph). A p-subgraph $G'_P = (V', \mu', E', \nu')$ of a parameterized graph $G_P = (V, \mu, E, \nu)$ is a $p_n$-connected p-graph with $V' \subseteq V$, $E' \subseteq E \cap \binom{V'}{2}$, $\mu' = \mu|_{V'}$, and $\nu' = \nu|_{E'}$. $G'_P$ is called an induced p-subgraph of $G_P$ iff $E' = \binom{V'}{2} \cap E$.
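The subsumption test of Definition 2.5 translates directly into code. Below is a sketch using the convention of the example that upper-case strings denote variables; note that we let a variable component subsume variables as well as constants, which is the reading under which all four claims of the example above hold:

```python
def is_variable(c):
    """Convention of the example: upper-case strings are variables."""
    return isinstance(c, str) and c.isupper()

def subsumes(a, b):
    """A is subsumed by B: componentwise equal, or a_i is a variable
    (broad reading: a variable subsumes any component)."""
    if len(a) != len(b):
        return False
    return all(ai == bi or is_variable(ai) for ai, bi in zip(a, b))

def equivalent(a, b):
    """A and B are equivalent: mutual subsumption (Definition 2.6)."""
    return subsumes(a, b) and subsumes(b, a)

# The example from the text: A <= D, C <= A, C <= B, C <= D
A, B, C, D = ("a", "X"), ("X", "b"), ("Y", "X"), ("a", "c")
assert subsumes(A, D) and subsumes(C, A) and subsumes(C, B) and subsumes(C, D)
```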

Definition 2.8 (union, $\dot\cup$). Given two induced p-subgraphs $G'_P = (V', \mu', E', \nu')$ and $G''_P = (V'', \mu'', E'', \nu'')$ of a p-graph $G_P = (V, \mu, E, \nu)$ with $V' \cap V'' = \emptyset$. Then the union of $G'_P$ and $G''_P$, written $G'_P \mathbin{\dot\cup} G''_P$, denotes the p-graph $G_{\dot\cup} = (V_{\dot\cup}, \mu_{\dot\cup}, E_{\dot\cup}, \nu_{\dot\cup})$ with $V_{\dot\cup} = V' \cup V''$, $\mu_{\dot\cup} = \mu|_{V_{\dot\cup}}$, $E_{\dot\cup} = \binom{V' \cup V''}{2} \cap E$, and $\nu_{\dot\cup} = \nu|_{E_{\dot\cup}}$.

Definition 2.9 (p-isomorphic, $\simeq_P$). A p-graph $G_P = (V, \mu, E, \nu)$ is p-isomorphic to a p-graph $G'_P = (V', \mu', E', \nu')$, written $G_P \simeq_P G'_P$, iff there exist bijections $\Phi_V : V \to V'$ and $\Phi_E : E \to E'$ with $\Phi_E(\{v, w\}) = \{\Phi_V(v), \Phi_V(w)\}$ for all edges $\{v, w\} \in E$, with $\mu(v) \preceq \mu'(\Phi_V(v))$ for all nodes $v \in V$, and with $\nu(e) \doteq \nu'(\Phi_E(e))$ for all edges $e \in E$. As a consequence, a node $v$ is p-isomorphic to a node $v'$, written $v \simeq_P v'$, if for their embedding p-graphs $G_P = (\{v\}, \mu|_{\{v\}}, \emptyset, \varepsilon_f)$† and $G'_P = (\{v'\}, \mu'|_{\{v'\}}, \emptyset, \varepsilon_f)$ holds $G_P \simeq_P G'_P$. For practical reasons we will abbreviate $G_P \simeq_P G'_P$ by the quadruple $[G_P, G'_P, \Phi_V, \Phi_E]$ and $v \simeq_P v'$ by the quadruple $[v, v', \Phi_V, \Phi_E]$. Analogously, $G_P \mathbin{\dot\cup} v'$ denotes $G_P \mathbin{\dot\cup} G'_P$. The concepts of an admissible extension of induced p-subgraphs and of p-subisomorphic graphs

are at the core of the approach presented.

Definition 2.10 (admissible extension).

Given two p-graphs $G_P = (V, \mu, E, \nu)$ and $G'_P = (V', \mu', E', \nu')$ and two nodes $v$ and $v'$. The node pair $(v, v')$ is an admissible extension of the pair $(G_P, G'_P)$ iff $v \notin V$, $v' \notin V'$, and $G_P \mathbin{\dot\cup} v$ is p-isomorphic to $G'_P \mathbin{\dot\cup} v'$.

Definition 2.11 (p-subisomorphic, $\preceq_P$). A p-graph $G_P = (V, \mu, E, \nu)$ is p-subisomorphic to a p-graph $G'_P = (V', \mu', E', \nu')$, written $G_P \preceq_P G'_P$, iff there exists an induced p-subgraph $G^i_P$ of $G'_P$ which is p-isomorphic to $G_P$, where the upper index $i$ denotes the number of nodes of $G^i_P$. $G_P$ is called the template graph and $G'_P$ the search graph.

2.2. P-subgraph isomorphism computation

In this section a breadth-first search algorithm for the computation of induced p-subgraphs will be described. More precisely: given two p-graphs $S$ and $T$ with $T \preceq_P S$, compute all induced p-subgraphs $T'$ of $S$ which are p-isomorphic to $T$. Hence, $T$ is a template graph and $S$ a search graph, which represents the so-called search space. The enumeration of all solutions requires tremendous effort; in order to reduce the computational cost we will provide pruning mechanisms which minimize the search space. The algorithm is depicted in Table 1. It takes as input a template graph $T = (V_T, \mu_T, E_T, \nu_T)$ and a search graph $S = (V_S, \mu_S, E_S, \nu_S)$. The main idea is to perform a breadth-first search traversing the p-graphs along all incident edges given a start node: take a node $n_T \in T$ as start node, look for all nodes $n_S \in S$ with $n_T \simeq_P n_S$, and expand $n_T$ by matchings with its neighboring nodes until all nodes of $T$ have been processed or no further matching is possible. More formally, this idea can be expressed by Lemma 2.12 below.

* Lower-case letters denote constants and upper-case letters denote variables.
† $\varepsilon_f$ denotes the so-called empty function $f : \emptyset \to \emptyset$.


1. Initialization:
   TP(1) = ∅
   for all vT ∈ VT do
     for all vS ∈ VS do
       if vT ≃P vS, witnessed by [vT, vS, ΦV, ΦE], then TP(1) = TP(1) ∪ {[vT, vS, ΦV, ΦE]}
     end for
   end for

2. Main loop:
   for i = 2, ..., |VT| do
     TP(i) = ∅
     for all [GT = (V'T, μ'T, E'T, ν'T), GS, ΦV, ΦE] ∈ TP(i-1) do
       for all (vT, vS) admissible extension of (GT, GS) do
   (*)   TP(i) = TP(i) ∪ {[GT ∪̇ vT, GS ∪̇ vS, Φ'V, Φ'E]}
       end for
     end for
   end for
   for i = 1, ..., |VT| do store TP(i) end for

Table 1. Breadth-first search algorithm for the enumeration of induced p-subgraphs. A = B denotes the assignment of B to A.
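For illustration, here is a compact Python sketch of the enumeration of Table 1, unpruned and with simplified bookkeeping: a matching is kept as a plain dict instead of the quadruple notation, and `node_match`/`edge_match` stand in for the parameter-vector conditions of Definition 2.9:

```python
from itertools import product

def enumerate_p_subgraphs(T, S, node_match, edge_match):
    """Enumerate all matchings of a template graph T onto induced p-subgraphs
    of a search graph S (breadth-first, following Table 1, without pruning).

    T, S: {'nodes': {node: params}, 'edges': {frozenset({u, v}): params}}
    node_match/edge_match: predicates on pairs of parameter vectors, e.g.
    subsumption (Def. 2.5) for nodes and equivalence (Def. 2.6) for edges.
    """
    def admissible(phi, vT, vS):
        # Def. 2.10, simplified: the extended matching must stay an induced,
        # parameter-respecting correspondence.
        for wT, wS in phi.items():
            eT, eS = frozenset((vT, wT)), frozenset((vS, wS))
            if (eT in T['edges']) != (eS in S['edges']):
                return False      # edge present on only one side: not induced
            if eT in T['edges'] and not edge_match(T['edges'][eT], S['edges'][eS]):
                return False      # edge parameter vectors do not match
        return True

    # 1. Initialization: TP(1) holds all p-isomorphic node pairs (Def. 2.9).
    level = [{vT: vS} for vT, vS in product(T['nodes'], S['nodes'])
             if node_match(T['nodes'][vT], S['nodes'][vS])]

    # 2. Main loop: grow every partial matching by one admissible node pair.
    for _ in range(len(T['nodes']) - 1):
        level = [{**phi, vT: vS}
                 for phi in level
                 for vT, vS in product(T['nodes'], S['nodes'])
                 if vT not in phi and vS not in phi.values()
                 and node_match(T['nodes'][vT], S['nodes'][vS])
                 and admissible(phi, vT, vS)]

    # Each matching is produced once per insertion order; deduplicate.
    return [dict(m) for m in {tuple(sorted(phi.items())) for phi in level}]
```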

Lemma 2.12. Given induced p-subgraphs $T_1 = (V_{T_1}, \mu_{T_1}, E_{T_1}, \nu_{T_1})$ and $T_2 = (V_{T_2}, \mu_{T_2}, E_{T_2}, \nu_{T_2})$ of $T = (V_T, \mu_T, E_T, \nu_T)$, resp. $S_1 = (V_{S_1}, \mu_{S_1}, E_{S_1}, \nu_{S_1})$ and $S_2 = (V_{S_2}, \mu_{S_2}, E_{S_2}, \nu_{S_2})$ of $S = (V_S, \mu_S, E_S, \nu_S)$, with $[T_1, S_1, \Phi_{V_1}, \Phi_{E_1}]$ and $[T_2, S_2, \Phi_{V_2}, \Phi_{E_2}]$, and assume that $V_{T_1} \cap V_{T_2} = \emptyset$, $V_{S_1} \cap V_{S_2} = \emptyset$, and that for all $\{v_1, v_2\} \in E_T$ with $v_1 \in V_{T_1}$, $v_2 \in V_{T_2}$ both $\{\Phi_{V_1}(v_1), \Phi_{V_2}(v_2)\} \in E_S$ and $\nu_T(\{v_1, v_2\}) \doteq \nu_S(\{\Phi_{V_1}(v_1), \Phi_{V_2}(v_2)\})$ hold. Then $T_1 \mathbin{\dot\cup} T_2 \simeq_P S_1 \mathbin{\dot\cup} S_2$ via the quadruple $[T_1 \mathbin{\dot\cup} T_2, S_1 \mathbin{\dot\cup} S_2, \Phi_{V_{\dot\cup}}, \Phi_{E_{\dot\cup}}]$, with

$$\Phi_{V_{\dot\cup}}(v) := \begin{cases} \Phi_{V_1}(v) & , v \in V_{T_1} \\ \Phi_{V_2}(v) & , v \in V_{T_2} \end{cases} \qquad (1)$$

and

$$\Phi_{E_{\dot\cup}}(\{v_1, v_2\}) := \begin{cases} \Phi_{E_1}(\{v_1, v_2\}) & , \{v_1, v_2\} \in E_{T_1} \\ \Phi_{E_2}(\{v_1, v_2\}) & , \{v_1, v_2\} \in E_{T_2} \\ \{\Phi_{V_1}(v_1), \Phi_{V_2}(v_2)\} & , v_1 \in V_{T_1} \wedge v_2 \in V_{T_2} \end{cases} \qquad (2)$$

Proof: The p-isomorphism $T_1 \mathbin{\dot\cup} T_2 \simeq_P S_1 \mathbin{\dot\cup} S_2$ can be derived straightforwardly by applying the described assumptions to the definition of p-isomorphism (Definition 2.9). Note, the mappings $\Phi_{V_{\dot\cup}}$ and $\Phi_{E_{\dot\cup}}$ obviously fulfill the conditions of p-subisomorphism (cf. Definitions 2.9 and 2.11).

During initialization the algorithm (Table 1) computes all start nodes of $T$ according to their matchings with nodes of $S$. The sets $T_P(k)$ contain quadruples of p-subisomorphic graphs to $S$, their corresponding p-subgraphs of $S$, both consisting of $k$ nodes, and the p-isomorphic mappings of the parameter vectors for the nodes and edges. In the main loop the p-graphs $T' \in T_P(1)$ are expanded by all admissible extensions of $T'$ and their corresponding p-subgraphs of $S$. Since an admissible extension corresponds uniquely to an element of $T_P(1)$, the mappings $\Phi_V$ and $\Phi_E$ can be constructed using Lemma 2.12. Finally, the sets $T_P(k)$, $k \leq |V_T|$, will be stored. All p-subgraphs $T'$ of $T$ which are p-subisomorphic to $S$ will be enumerated. There are several so-called pruning mechanisms to reduce the tremendous effort of this exhaustive search.

Figure 1. An example of a search graph S and a template graph T. The parameter vectors are given by numbers, variables (upper-case letters), and symbolic constants (lower-case letters).

Figure 2. Focussing on all neighbors of a node during one pass enables one to reduce computational effort, since the corresponding part of the search space (indicated by the circle) needs to be examined only once.

2.3. Pruning mechanisms

Pruning mechanism shrinking set: The expansion of the search space depends strongly on the size of the initial set $T_P(1)$. This set can be shrunken by taking all neighbors of the nodes to be matched into account. Let us highlight an example (Figure 1): initially, the node $v \in T$ (marked with a) can match two nodes in $S$, both also marked with a. Due to the unique matching of the remaining nodes of $T$, only one node of $S$ (the upper node marked with a) is a candidate for further processing. The following shrinking sets $\Gamma_i$ contain candidate nodes for the matching and, hence, they can be used in order to shrink the initial set $T_P(1)$. $\Gamma_i$ is based on the neighbors $N(V)$ of a node set $V$, with $N(V) = \bigcup_{v \in V} N(v)$, where $N(v)$ contains the neighbors of node $v$.

Definition 2.13 (shrinking set $\Gamma_i$). Given a template graph $T = (V_T, \mu_T, E_T, \nu_T)$ and a search graph $S = (V_S, \mu_S, E_S, \nu_S)$. The shrinking set $\Gamma_i : V_T \to 2^{V_S}$ is defined by

$$\Gamma_i(v) := \Gamma_{i-1}(v) \cap \bigcap_{w \in X} N(\Gamma_{i-1}(w))$$

with $v \in V_T \setminus X$ and $X \subseteq V_T$, where $\forall w \in X : |\Gamma_{i-1}(w)| = 1$, $\{v, w\} \in E_T$, and $|\Gamma_{i-1}(v)| > 1$.

The shrinking sets $\Gamma_0(v)$ of the nodes $v \in V_T$ initially ($i = 0$) contain the initial matchings of $T_P(1)$. The sets $\Gamma_i(v)$ will be computed repeatedly until they do not shrink any more, or more formally until $\Gamma_i(v) = \Gamma_{i-1}(v)$. Finally, the corresponding sets of $T_P(1)$ will be replaced by them.
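A sketch of the corresponding fixpoint iteration in Python, with the shrinking sets kept as a dict from template nodes to candidate sets (names are illustrative):

```python
def shrink_initial_matchings(T, S, candidates):
    """Fixpoint iteration of the shrinking sets (Definition 2.13), sketched.

    T, S: graphs as {'edges': {frozenset({u, v}): params}, ...}
    candidates: {template node: set of candidate search nodes}, i.e. the
    initial matchings behind TP(1); every template node must have an entry.
    """
    def neighbors(g, v):
        return {w for e in g['edges'] if v in e for w in e if w != v}

    changed = True
    while changed:                       # iterate until nothing shrinks
        changed = False
        for v in candidates:
            cand = candidates[v]
            if len(cand) <= 1:
                continue                 # already unique (or empty)
            for w in neighbors(T, v):
                if len(candidates[w]) == 1:      # uniquely matched neighbor
                    (ws,) = candidates[w]
                    cand = cand & neighbors(S, ws)   # stay adjacent in S
            if cand != candidates[v]:
                candidates[v] = cand
                changed = True
    return candidates
```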

Pruning mechanism look ahead: The size of the sets $T_P(i)$ can be reduced as follows: if the node $v'_T$ of the p-graph $G'_T$ has several neighbors, then focussing the search onto these neighbors reduces the search effort (cf. Figure 2), since the algorithm needs to examine this part of the search space only once. Thus, the quadruple (cf. Table 1, the line tagged (*)) should only be added to the set $T_P(i)$ after all neighbors of $v'_T$ have been checked. Otherwise, each successful matching of a neighbor node with a node of the search graph according to its previously matched neighbors should be added to the quadruple.

2.4. Upper bound complexity estimation

In this section an upper bound on the complexity of the presented algorithm (cf. Table 1) will be estimated. Given a template graph $T = (V_T, \mu_T, E_T, \nu_T)$ and a search graph $S = (V_S, \mu_S, E_S, \nu_S)$, the sets $T_P(i) = \{[T_i = (V_{T_i}, \mu_{T_i}, E_{T_i}, \nu_{T_i}), S_i = (V_{S_i}, \mu_{S_i}, E_{S_i}, \nu_{S_i}), \Phi_{V_i}, \Phi_{E_i}]\}$ contain induced p-subgraphs $T_i$ (resp. $S_i$) of $T$ (resp. $S$) with $|V_{T_i}| = i$ (resp. $|V_{S_i}| = i$); for brevity we write $P_i$ for $T_P(i)$. Initially, $P_1 = \{[v_T, v_S, \Phi_{V_1}, \Phi_{E_1}]\}$ with $v_T \in V_T$, $v_S \in V_S$, and $v_T \simeq_P v_S$ for all such pairs $(v_T, v_S)$. During the enumeration of all induced p-subgraphs the algorithm looks in step $i$ for all admissible extensions $(v_T, v_S) \in (V_T \setminus V_{T_i}) \times (V_S \setminus V_{S_i})$ of each $[T_i, S_i, \Phi_{V_i}, \Phi_{E_i}] \in P_i$. The following estimation holds:

$$|P_{i+1}| \leq \sum_{[T_i, S_i, \Phi_{V_i}, \Phi_{E_i}] \in P_i} \left( \sum_{v_T \in N(V_{T_i})} \sum_{v_S \in N(V_{S_i})} \chi_i(v_T, v_S) \right) \qquad (3)$$

$$\chi_i(v_T, v_S) = \begin{cases} 1 & , (v_T, v_S) \text{ is an admissible extension of } (T_i, S_i) \\ 0 & , \text{else} \end{cases} \qquad (4)$$

This estimation counts symmetrical solutions several times. Suppose $[T_i, S_i, \Phi_{V_i}, \Phi_{E_i}] \in P_i$; then $[T_i, S_i, \Phi_{V_i}, \Phi_{E_i}]$ could be generated from each induced p-subgraph $T' = (V', \mu', E', \nu')$ of $T_i$ with $i - 1$ nodes. Remember, an induced p-subgraph is $p_n$-connected; this property holds whenever $v \in V_{T_i} \setminus V'$ is not a cut-node (cf. Definition 2.4), and the existence of at least two non-cut-nodes in a $p_n$-connected p-graph is well known. In the worst case each p-subgraph $T'$ can be extended by each node of $V_T \setminus V_{T'}$. Consequently, we get the following estimation:

$$|P_{i+1}| \leq \sum_{[T_i, S_i, \Phi_{V_i}, \Phi_{E_i}] \in P_i} |N(V_{T_i})| \cdot |N(V_{S_i})| \qquad (5)$$

$$\leq \sum_{[T_i, S_i, \Phi_{V_i}, \Phi_{E_i}] \in P_i} (|V_T| - i) \cdot (|V_S| - i) \qquad (6)$$

$$= |P_i| \cdot (|V_T| - i) \cdot (|V_S| - i) \qquad (7)$$

Hence, iterating (7) for $i = 1, \ldots, |V_T| - 1$ results in

$$|P_{|V_T|}| \leq |P_1| \cdot (|V_T| - 1)! \cdot \frac{(|V_S| - 1)!}{(|V_S| - |V_T|)!}$$

as the maximal cost for the computation of all induced p-subgraphs given a search graph and a template graph. This result shows that the cost depends strongly on the size of the initial matching set $P_1$. Note, this estimation does not take prior knowledge into account, e.g. the distribution of the parameter vectors, or graph properties such as planarity. Fortunately, the pruning mechanism shrinking set can be used in order to reduce the size of $P_1$. The reduction of the sets $T_P(i)$ (pruning mechanism look ahead) has not been incorporated into the estimation either; however, embedding both into the algorithm reduces computational cost significantly.
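The closed-form bound is easy to evaluate numerically; a small helper with illustrative values:

```python
from math import factorial

def worst_case_matchings(p1, nT, nS):
    """|P_{|V_T|}| <= |P_1| * (|V_T|-1)! * (|V_S|-1)! / (|V_S|-|V_T|)!,
    obtained by iterating |P_{i+1}| <= |P_i| * (|V_T|-i) * (|V_S|-i)."""
    assert 1 <= nT <= nS
    return p1 * factorial(nT - 1) * factorial(nS - 1) // factorial(nS - nT)

# e.g. a 4-node template against a 20-node search graph, 10 initial matchings:
print(worst_case_matchings(p1=10, nT=4, nS=20))  # 10 * 3! * (19*18*17) = 348840
```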

3. NEAREST-NEIGHBOR CLASSIFICATION

In this section a simple machine learning approach for the classification of observed, resp. measured, data will be described. Most learning methods construct abstractions from observations and store them in memory. In our context we will consider problems which require a transition from numerical to symbolic representations; this transition will be called classification.‡ A growing interest can be observed in methods which store a selection of known instances in memory and apply this structured knowledge to new situations. In this sense learning can be seen as the construction of a concept, or a class of patterns. Note, a simple approach to defining a concept is to use the set of stored instances as the definition of the concept; this method requires no generalization. The more interesting case, however, includes a generalization, and a simple approach is based on the similarity of at least two instances: an instance I which is known to belong to a certain class can be generalized by assuming that instances similar to I also belong to the class. This approach requires no prior knowledge, such as the density of the data, but its demerit is that it may be difficult to formalize a similarity measure. Such so-called nonparametric classification rules therefore hold for different densities. A famous nonparametric classification rule which is used in a wide range of learning methods, e.g. instance-based learning, is the nearest-neighbor decision rule, or nn-rule for short (cf. Duda and Hart [9]).

‡ The names classification, recognition, and matching are common for methods achieving this transition (cf. Niemann [8]).

Figure 3. Classifying a new instance may change the center of gravity of a class (a); but due to the use of a complete seed the center of gravity remains invariant (b).

The paradigm of nearest-neighbor classification provides a useful frame in which to describe our learning approach. Suppose a so-called seed of instances $(x_i, \theta_i)$, $i = 1, \ldots, n$, is given, where $x_i$ is a measurement taking a value in a metric space $X$ upon which a metric $d$ is defined, and $\theta_i$ is the index of a class. For brevity, we will say "$x_i$ belongs to $\theta_i$" when we mean precisely that the measurement $x_i$ belongs to the class indexed by $\theta_i$. A new pair $(x, \theta)$ is given, and the task is to estimate $\theta$ by utilizing the information contained in the previously classified instances. A measurement $x' \in \{x_1, \ldots, x_n\}$ of the instance $(x', \theta')$ is a nearest neighbor to $x$ of $(x, \theta)$ iff

$$\min_{i = 1, \ldots, n} d(x_i, x) = d(x', x) \qquad (8)$$

Then the nn-rule decides that $x$ belongs to class $\theta'$. Note, a mistake is made if $\theta' \neq \theta$. Furthermore, the nn-rule uses only one nearest neighbor for its decision; the $n - 1$ remaining classifications are ignored. Intuitively, it can be reasoned why the nn-rule works well. Suppose the pair $(x, \theta)$ is a random variable, and the probability $P(\theta = \omega)$ is the a posteriori probability $P(\omega \mid x)$ with which $\theta$ will be selected. If the size of the sample is large enough, then $x'$ will be sufficiently close to $x$ that $P(\omega \mid x) \approx P(\omega \mid x')$. In this case the state $\omega$ will be taken with probability $P(\omega \mid x)$, which is what the nn-rule selects. Thus, for a large sample set a good classification accuracy can be expected. The power of the method also comes from the retrieval process, but there are demerits in storing large amounts of data and in defining the distance measure. Our approach differs slightly from the nearest-neighbor approach: we assume that (i) noise is modeled explicitly, and (ii) the seed is complete.

Definition 3.1 (complete).

A seed is called complete if for each measurement a class exists which fulfills (8) according to a given noise model.

Intuitively, a complete seed reduces the classification effort since the center of gravity of a class remains invariant. Figure 3 highlights this situation: in case (a) a new instance I has been classified and inserted into the predicted class, resulting in a change of the center of gravity of this class. The latter step does not occur in case (b) since the used seed is complete. These assumptions ensure that each new instance will be uniquely and correctly classified. If the classes form a decomposition into disjoint subsets whose union is the classified data set according to the noise model, then the resulting classification will be unique in the sense that the classes are non-overlapping. The nn-rule can thus be seen as an alternative view of our approach.

Figure 4. A visualized part of a modeled village close to Bonn showing the used test field from which the corner nodes are extracted.
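A minimal sketch of the nn-rule with a seed, in Python; the toy metric and seed values below are our own, chosen to echo the edge labels of Section 4:

```python
def nn_classify(x, seed, d):
    """nn-rule (8): assign x the class of its nearest seed measurement.

    seed: list of (measurement, class_index) pairs, assumed *complete*
    (Definition 3.1), so newly classified instances never shift the classes
    and the seed itself need not be updated.
    d: a metric on the measurement space.
    """
    _, theta = min(seed, key=lambda inst: d(inst[0], x))
    return theta

# Toy usage with scalar measurements and the absolute-difference metric:
seed = [(0.0, "h"), (90.0, "v+"), (-90.0, "v-"), (45.0, "o+"), (-45.0, "o-")]
print(nn_classify(88.7, seed, d=lambda a, b: abs(a - b)))   # -> v+
```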

4. APPLICATION AND EVALUATION

In this section an application of the above-described classification method to corner nodes of buildings (short: corner nodes) will be given. In Section 4.1 the test field is described. In the application and evaluation we focus on corner nodes of buildings: corner nodes are significant parts of buildings which are used in building models (cf. Braun et al. [1]) for MBR (cf. Englert [2]). A simple noise model and the used seed will be described (Section 4.2). Finally, the resulting classification will be evaluated using information coding theory, demonstrating that the knowledge obtained from the classification provides an information gain for the use of corner nodes in a CAR system for 3D building reconstruction (Section 4.3).

4.1. Description of the test field

As a test field we have chosen a village close to Bonn from which 1846 building clusters (short: buildings) have been modeled three-dimensionally with a low degree of generalization, which includes e.g. detailed roof structures, eaves, and canopies: 745 buildings have a complex roof structure (no flat or lean-to roof) and hence are of major interest (cf. Figure 4). The buildings can be grouped into three classes: (a) detached houses (489), (b) any combination of two basic building types§ (162), e.g. an L-shaped hip- and saddleback-roof building, and (c) buildings which consist of more than two basic building types (94). Note, in most cases a combination of basic building types means that some of the buildings are nested or closely interlocked, resulting in a complex roof structure and an irregular ground plan. From these buildings p-graphs $G_P = (V, \mu, E, \nu)$ have been computed (cf. Englert [10]): the parameter vectors of the nodes are $\mu(n \in V) = (\delta)$, where $\delta$ is the degree of node $n$, and the parameter vectors of the edges are $\nu(e \in E) = (\Delta)$, where $\Delta$ is the gradient of $e$.

§ Flat-, pent-, saddleback-, hip-, broach-, hipped-gable-, and mansard-roof buildings.

4.2. Extraction and classification of 3-nodes

Learning generic 3D building model knowledge, resp. corner nodes, is a challenging endeavor which we will carry out on the above-described test field. The learning of building model knowledge can be divided into "Extraction" + "Classification"; the former will be described in this section, based on corner nodes.

Definition 4.1 (3-node, corner node). A 3-node is a p-graph consisting of one node of degree three, the so-called corner node. Edges are labeled horizontal ($h$), vertical$^+$ ($v^+$), vertical$^-$ ($v^-$), oblique$^+$ ($o^+$), or oblique$^-$ ($o^-$); the signs '+' and '-' denote the sign of the gradient of the line segment as seen from the corner node.

Note, 3-nodes such as $(h, h, h)$ or $(v^+, v^+, v^+)$ do not have a three-dimensional nature and lack validity.

Figure 5. The 21 classes of 3-nodes and their measured distribution. [Figure: 21 corner-node sketches with their edge labels; the class frequencies are (1): 4, (2): 10069, (3): 2, (4): 25, (5): 0, (6): 2, (7): 5, (8): 21672, (9): 12710, (10): 807, (11): 55, (12): 0, (13): 14, (14): 63, (15): 3, (16): 0, (17): 0, (18): 4775, (19): 4, (20): 0, (21): 13.]

For the classification we will use the following noise model, owing to inaccuracies in the measurement of 3-nodes: $v^- = [-90° - \epsilon, -90° + \epsilon[$, $o^- = [-90° + \epsilon, 0° - \epsilon[$, $h = [0° - \epsilon, 0° + \epsilon[$, $o^+ = [0° + \epsilon, 90° - \epsilon[$, and $v^+ = [90° - \epsilon, 90° + \epsilon[$, with $\epsilon = 1.0°$. Figure 5 depicts the 21 classes of 3-nodes, and these 21 3-nodes have been used as the seed. The algorithm described in Section 2.2 (Table 1) takes as input the seed, represented as p-graphs, and a 3D building description, which is represented as a p-graph, too. It computes the corner nodes of the building search graph using the seed as template graphs. As output, the 3-nodes are classified using the above-described decision rule and the noise model. Note, the algorithm generates all permutations of the edges of a 3-node (at most six permutations per 3-node), and thus the learning result does not depend on the order of the edges of a 3-node. The results of the learning process over all building p-graphs are depicted in Figure 5. In the following section we will analyze these results in the context of 3D building reconstruction.
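A sketch of this labeling and classification step in Python, assuming edge gradients are given in degrees (function names are illustrative):

```python
EPS = 1.0  # noise-model tolerance in degrees (Section 4.2)

def label_edge(angle):
    """Map an edge gradient (degrees) to one of the five labels
    using the interval noise model of Section 4.2."""
    if -90 - EPS <= angle < -90 + EPS:
        return "v-"
    if -90 + EPS <= angle < -EPS:
        return "o-"
    if -EPS <= angle < EPS:
        return "h"
    if EPS <= angle < 90 - EPS:
        return "o+"
    if 90 - EPS <= angle < 90 + EPS:
        return "v+"
    raise ValueError(f"gradient {angle} outside modeled range")

def classify_3node(gradients):
    """Classify a corner node by the multiset of its three edge labels;
    sorting makes the result independent of edge order, mirroring the
    permutation handling described in the text."""
    assert len(gradients) == 3
    return tuple(sorted(label_edge(g) for g in gradients))

print(classify_3node([0.2, 89.5, 45.0]))  # -> ('h', 'o+', 'v+')
```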

4.3. Information gain estimation of the classified knowledge

In the context of 3D building reconstruction from images the following problem evolves (cf. Grün et al. [11]): during the computation of line segments (short: lines; the terms 'lines' and 'edges' will be used interchangeably) and their grouping into 3-nodes, often not all required lines can be observed or segmented. In order to build proper 3-nodes, the completion of partially measured 3-nodes has to be performed using background knowledge. Background knowledge is gained by estimating the frequency of occurrence of corner nodes using the distribution of 3-nodes (cf. Figure 5). In this section we will evaluate this background knowledge using information coding theory, which is based purely on statistics (as long as we do not consider the meaning of a piece of information). At the core of this theory is the concept of entropy (cf. McEliece [12]). Note, the term $E[X]$ denotes the mean value of a random variable $X$ according to a given probability mass function:

Definition 4.2 (Entropy). Suppose $X$ is a finite discrete random variable. Let $p(x) = P\{X = x\}$ denote the probabilities of the instances $x$ of $X$. The entropy of $X$ is defined by

$$H(X) = E\left[\log \frac{1}{p(x)}\right] \qquad (9)$$

$$= \sum_{x \in X} p(x) \log \frac{1}{p(x)} \qquad (10)$$

In the following we will use the number 2 as the base of the logarithm, since we assume that the knowledge is encoded as sequences of bits; in this case the entropy is called information $I(X)$. Note, if $p(x) = 0$, we consider $p(x) \log \frac{1}{p(x)}$ to be 0. For the evaluation, the measured distribution $P_M$ of 3-nodes will be used as a priori knowledge (cf. Figure 5). This distribution will be compared with a uniform distribution over all possible classes of 3-nodes, which corresponds to no a priori knowledge about the distribution of 3-nodes being available. The evaluation has three parts:

1. Discussion of the measured distribution $P_M$ (cf. Figure 5).
2. The Kullback Leibler distance between $P_M$ and a uniform distribution of 3-nodes will be computed (cf. Cover and Thomas [13]).
3. Suppose two edges of a corner node have been observed. The amount of information they provide about the remaining edge will be computed, in order to determine how strongly the edges depend on each other according to the distribution $P_M$ of the 3-nodes.

Definition 4.3 (Kullback Leibler distance). The Kullback Leibler distance between two probability mass functions $p$ and $q$ is defined as

$$D(p \,\|\, q) = E_p\left[\log \frac{p(x)}{q(x)}\right] \qquad (11)$$

$$= \sum_{x \in X} p(x) \log \frac{p(x)}{q(x)} \qquad (12)$$

The Kullback Leibler distance is also known as relative entropy. It is a measure of the inefficiency of assuming that the distribution is $q$ when the true distribution is known to be $p$. Let us highlight the classification result for corner nodes (cf. Figure 5): since each edge of a 3-node has one of five possible labels, there are in total 125 classes, with the following properties:

- 60 classes are described by three different labels, and each such corner node has 6 permutations; thus only 10 classes are different ignoring the order of the labels;
- 60 classes are described by two different labels, and each such corner node has 3 permutations; thus only 20 classes are different ignoring the order of the labels;
- 5 classes are described by one common label for their edges, and thus all 5 classes are different.

As a consequence, 35 of the 125 classes are different ignoring permutations, and only 21 of these classes occur in the used test field, where 7 classes have three different labels, 12 classes have two different labels, and 2 classes have a common label for their edges (ignoring permutations). Hence, in total we will consider $7 \cdot 6 + 12 \cdot 3 + 2 \cdot 1 = 80$ classes in our test field, where the remaining 45 classes do not occur. Note, for the sake of clarity Figure 5 depicts the classes without permutations. In the following we assume that a random variable $X$ represents an edge of a 3-node and that its instances are denoted by lower-case letters $x$.

Evaluation 1: The distribution $P_M$ of 3-nodes consists of 21 classes, where 5 classes contain a significant number of 3-nodes (21672, 12710, 10069, 4775, 807), 11 classes contain only a few 3-nodes (63, 55, ..., 2), and 5 classes are empty, which means that these types of corner nodes do not occur in the extensive test field (cf. Section 4.2). This distribution shows that the test field consists mostly of flat-roof and saddleback-roof buildings, which can be nested or interlocked.

Evaluation 2: Using the Kullback Leibler distance (11), the error between the measured classes of 3-nodes and the uniform distribution $P_U$ can be measured: $D((X, Y, Z)_{P_M} \| (X, Y, Z)_{P_U}) = 3.23$ bits. Since $H((X, Y, Z)_{P_M}) = 3.73$, we would need on average $H((X, Y, Z)_{P_M}) + D((X, Y, Z)_{P_M} \| (X, Y, Z)_{P_U}) = 3.73 + 3.23 = 6.96$ bits (i.e. approximately $\log_2 125$, the cost of describing a uniformly distributed 3-node) in order to describe the distribution using $P_U$. Thus the measured distribution $P_M$ provides approx. 86.60 percent (measured as the relative error $(a_{\mathrm{estimated}} - a_{\mathrm{optimal}})/a_{\mathrm{optimal}}$) more information than a distribution which is not based on a priori knowledge.

Evaluation 3: Suppose two edges of a 3-node have been observed. The amount of information they provide about the remaining edge will be estimated. Using the concept of information, this can be expressed by

$$I(X, Y; Z) = E_{p(x,y,z)}\left[\log \frac{p(z \mid x, y)}{p(z)}\right]$$

which denotes the information $X$ and $Y$ provide about $Z$. Note, this so-called mutual information of $X, Y$ and $Z$ can also be expressed using the Kullback Leibler distance (11): $I(X, Y; Z) = D(p(x, y, z) \,\|\, p(x, y)p(z))$. In our case a random variable $Z$ has five different instances, and thus $I_S(Z) \approx 1.81$ bits, which denotes the average amount of information provided by the observation of an edge. In contrast to this estimation, the information $I_U(X, Y; Z)$ is zero, assuming that all random variables are uniformly distributed over the space of the 125 classes. Using the distribution depicted in Figure 5, we get an information of $I_M(X, Y; Z) \approx 1.28$ bits. This result means that the observations of $X$ and $Y$ provide an information gain of 58.60 percent of the average amount of information about $Z$. As a consequence, the distribution $P_M$ provides a large amount of information which can be used in order to improve 3D building reconstruction, particularly the completion of partially observed 3-nodes.
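The permutation-free part of this computation is easy to reproduce. Below is a sketch over the 21 depicted classes; note that the paper's figures of 3.73 and 3.23 bits refer to the permutation-expanded space of 125 classes, so the numbers printed here are the analogous but smaller permutation-free quantities:

```python
from math import log2

# Measured class frequencies from Figure 5 (classes 1..21, permutations ignored)
counts = [4, 10069, 2, 25, 0, 2, 5, 21672, 12710, 807,
          55, 0, 14, 63, 3, 0, 0, 4775, 4, 0, 13]

def entropy(p):
    """H(X) = sum_x p(x) log2(1/p(x)); terms with p(x) = 0 contribute 0."""
    return sum(px * log2(1 / px) for px in p if px > 0)

def kl(p, q):
    """Kullback Leibler distance D(p||q), eq. (11)."""
    return sum(px * log2(px / qx) for px, qx in zip(p, q) if px > 0)

total = sum(counts)
p = [c / total for c in counts]
u = [1 / len(counts)] * len(counts)   # uniform over the same 21 classes

print(f"H(P_M)      = {entropy(p):.2f} bits")
print(f"D(P_M||P_U) = {kl(p, u):.2f} bits")
# Sanity check: for a uniform q over N classes, H(p) + D(p||q) = log2(N).
```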

5. CONCLUSION AND FUTURE WORK

We have presented an effective method for learning background knowledge for the reconstruction of man-made objects. As application, an extensive suburban scene consisting of 1846 three-dimensionally modeled building clusters has been used, from which all occurring corner nodes have been extracted using p-graph subisomorphism computation. The evaluation is based on the Kullback Leibler distance, showing that the measured distribution of 3-nodes provides an information gain of 86.60 percent compared to a uniform distribution of 3-nodes, which has to be used when no a priori knowledge is available. These promising results motivate the following future work: (a) often combinations of 3-nodes can be observed in images; these can also be analyzed and will provide deeper insights into the structure of buildings; (b) the nn-rule neither explicitly generalizes classified instances nor enables one to use a domain theory for the search. Therefore, we will investigate a more sophisticated learning approach which includes both features. Furthermore, the pruning mechanisms reduce the size of the search space and thus should be exploited further. Finally, the search effort of the approach depends strongly on the template graphs (cf. Englert et al. [3]), which is a reason to further analyze their structure.

ACKNOWLEDGEMENTS

The authors gratefully acknowledge E. Gülch and W. Förstner, Institute of Photogrammetry, Bonn, for their cooperation on building acquisition from aerial images. Furthermore, they would like to thank M. Beetz and T. Fröhlinghaus for their invaluable discussions.

REFERENCES

1. C. Braun, T. H. Kolbe, F. Lang, W. Schickler, V. Steinhage, A. B. Cremers, W. Förstner, and L. Plümer, "Models for photogrammetric building reconstruction," Computers & Graphics 19(1), pp. 109-118, 1995.
2. R. Englert, "Systematic acquisition of generic 3D building model knowledge," in Semantic Modeling for the Acquisition of Topographic Information from Images and Maps, W. Förstner, ed., Birkhäuser Verlag, Basel, Switzerland, 1997. To appear.
3. R. Englert, A. B. Cremers, and J. Seelmann-Eggebert, "Recognition of polymorphic patterns in parameterized graphs for 3D building reconstruction," in Proceedings of the 1st Workshop on Graph Based Representations, J.-M. Jolion and W. Kropatsch, eds., Advances in Computing, Springer Verlag, 1997. To appear.
4. D. Waltz, "Understanding line drawings of scenes with shadows," in The Psychology of Computer Vision, P. H. Winston, ed., Computer Science Series, pp. 19-92, McGraw-Hill, New York, U.S.A., 1975.
5. R. Englert and J. Seelmann-Eggebert, "P-subgraph isomorphism computation and upper bound complexity estimation," Tech. Rep. IAI-TR-97-2, University of Bonn, Institute of Computer Science III, Bonn, Germany, 1997.
6. M. Aigner, Combinatorial Search, John Wiley & Sons Ltd and B. G. Teubner, Stuttgart, Germany, 1988.
7. F. Harary, Graph Theory, Addison-Wesley, Massachusetts, U.S.A., 1972.
8. H. Niemann, Pattern Analysis and Image Understanding, Springer Verlag, Heidelberg, Germany, 2nd ed., 1989.
9. R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis, John Wiley & Sons Ltd, New York, U.S.A., 1973.
10. R. Englert and E. Gülch, "A one-eye stereo system for the acquisition of complex 3D building structures," GEO-INFORMATION-SYSTEMS, Journal for Spatial Information and Decision Making 9, pp. 16-21, Aug. 1996.
11. A. Grün, O. Kübler, and P. Agouris, eds., Automatic Extraction of Man-Made Objects from Aerial and Space Images, Birkhäuser Verlag, Basel, Switzerland, 1995.
12. R. J. McEliece, The Theory of Information and Coding, vol. 3 of Encyclopedia of Mathematics and its Applications, Addison-Wesley, Massachusetts, U.S.A., 1977.
13. T. Cover and J. Thomas, Elements of Information Theory, Wiley Series in Telecommunications, John Wiley & Sons, Inc., 1991.
