Computer Science Department of The University of Auckland CITR at Tamaki Campus (http://www.tcs.auckland.ac.nz)
CITR-TR-21
June 1998
Stereo Terrain Reconstruction by Dynamic Programming Georgy Gimel'farb 1
Abstract This TR is a review of the symmetric dynamic programming approach to stereo terrain reconstruction.
1 CITR, Tamaki Campus, University Of Auckland, Auckland, New Zealand
1 Stereo Terrain Reconstruction by Dynamic Programming Georgy Gimel'farb
Computer Science Department, The University of Auckland, New Zealand

1.1 Introduction
1.1.1 Intensity-based stereo: basic features
1.1.2 Global versus local optimization
1.2 Statistical decisions in terrain reconstruction
1.2.1 Symmetric stereo geometry
1.2.2 Simple and compound Bayesian decisions
1.3 Probability models of epipolar profiles
1.3.1 Prior geometric model
1.3.2 Prior photometric models
1.3.3 Posterior model of a profile
1.4 Dynamic programming reconstruction
1.4.1 Unified dynamic programming framework
1.4.2 Regularizing heuristics
1.4.3 Confidence of the DPM
1.5 Experimental results
1.1 Introduction
Photogrammetric terrain reconstruction from aerial and space stereopairs of images occupies a prominent place in cartography and remote sensing of the Earth's surface. Traditional analytical photogrammetry, based on human stereopsis, involves two main steps. First, several ground control points (GCP) are detected in the images so that the images can be placed in such a way as to visually perceive a stereomodel of the 3D surface and embed it into a reference 3D coordinate system. This step is called image orientation, or calibration [Haralick & Shapiro, 1993]. Then the perceived stereomodel is visually traced along x- or y-profiles (3D lines with constant y- or x-coordinates, respectively) or along horizontals (lines with constant z-coordinates).
A dense set of regularly or irregularly spaced 3D points of the profiles or horizontals forms a digital surface model (DSM) of the terrain. On-line or posterior visual exclusion of particular "non-characteristic" objects, such as forest canopy or buildings, converts a DSM into a digital terrain model (DTM), digital elevation model (DEM), or triangulated irregular network (TIN)¹.
Digital photogrammetry uses (interactive) computational methods both to orient the images by detecting the GCPs and to reconstruct the terrains [Ackermann, 1996; Schenk, 1996]. Computational binocular stereo models a "low-level" human stereopsis which is based on the one-to-one correspondence between 3D coordinates of the visible terrain points and 2D coordinates of the corresponding pixels in stereo images. Under a known optical geometry of binocular stereo viewing, found by calibration, the 3D coordinates of each spatial point are computed from differences between the 2D coordinates of the corresponding pixels. The difference of x-coordinates is called a horizontal disparity, or x-parallax. The difference of y-coordinates is a vertical disparity, or y-parallax. The disparities can be presented as a digital parallax map (DPM) that specifies all the corresponding pixels in a given stereo pair. A DSM is computed from a DPM using the calibration parameters.
Each binocularly visible terrain area is represented by (visually) similar corresponding regions in a given stereo pair, and the correspondences can be found by searching for the similar regions in the images. That human stereo vision finds these similarities so easily and, for the most part, reliably hides the principal ill-posedness of binocular stereo: there always exists a multiplicity of optical 3D surfaces giving just the same stereo pair [Kyreitov, 1983].
Therefore it is impossible to reconstruct the real terrain precisely from a single pair, and the computational reconstruction pursues the more limited goal of producing surfaces close enough to those perceived visually or restored by traditional photogrammetric techniques from the same stereo pair. Natural terrains possess a large variety of geometric shapes and photometric features, so the computational reconstruction assumes only very general prior knowledge about the optical 3D surfaces to be found (at most, specifications of allowable smoothness, discontinuities, curvature, and so forth). Long-standing investigations have resulted in numerous stereo methods (see, for instance, [Barnard & Fischler, 1982; Baker, 1984; Hannah, 1988; Dhond & Aggarwal, 1989; Gülch, 1991; Hsieh et al., 1992]). Some of them, mostly simple correlation or least-squares image matching [Helava & Chapelle, 1972; Scarano & Brumm, 1976; Helava, 1988], have already found practical use in modern photogrammetric devices. But there is still a need to develop more efficient and robust methods that can be implemented in practice.

¹ A DEM contains 3D points (x, y, z) supported by a regular (x, y)-lattice; a DTM either means just the same as the DEM or also incorporates some irregular characteristic topographic features, and a TIN approximates the surface by adjacent non-overlapping planar triangles with irregularly spaced vertices (x, y, z) [Maune, 1996].
1.1.1 Intensity-based stereo: basic features
To measure similarity between stereo images, both photometric and geometric distortions, including discontinuities due to partial occlusions, should be taken into account. Photometric distortions are due to non-uniform albedo of terrain points, non-uniform and noisy transfer factors over the field-of-view (FOV) of each image sensor, and so forth. Because of these distortions, the corresponding pixels in stereo images may have different signal values. Geometric distortions are due to projecting a 3D surface onto the two image planes and involve (i) spatially variant disparities of the corresponding pixels and (ii) partial occlusions of some terrain points. As a result, the corresponding regions may differ in position, scale, and orientation. Partial occlusions lead to only monocular visibility of certain terrain patches, so that some image regions have no stereo correspondence. If a terrain is continuous, then the geometric distortions preserve the natural x- and y-order of the binocularly visible points (BVP) in the images.
Due to occlusions, even without photometric distortions two or more terrain variants are in full agreement with the same stereo pair. Therefore terrain reconstruction, as an ill-posed problem, must involve a proper regularization [Kyreitov, 1983; Poggio et al., 1985; Marroquin et al., 1987]. Today's approaches to computational stereo differ in the following features: (i) which similarities are measured for matching the images, (ii) to what extent the image distortions are taken into account, (iii) which regularizing heuristics are involved, and (iv) how the stereo pair is matched as a whole.
All the approaches exploit the image signals, that is, gray values (intensities) or, generally, colors (signal triples in RGB or another color scale) or multi-band signatures (signals in several spectral bands, not only visible ones).
But with respect to image matching, they are usually classified as feature-based and intensity-based stereo. The first group relies on specific image features (say, edges, isolated small areas, or other easily detectable objects) to be found in both stereo images by pre-processing. Then only the features are tested for similarity. A natural terrain usually has a relatively small number of such characteristic features, and in most cases terrains are reconstructed by the intensity-based approaches, where image similarity is defined directly in terms of the signal values (gray levels, colors, or multi-band signatures).
The intensity-based approaches are based on mathematical models that relate optical signals of the surface points to the image signals in the corresponding pixels. Such a model makes it possible to deduce a particular measure of similarity between the corresponding pixels or regions in the images to be used for stereo matching. The similarity measure takes account of the admissible geometric and photometric image distortions.
The simplest model assumes (i) no local geometric distortions and (ii) either no photometric distortions or only spatially uniform contrast and offset differences between the corresponding image signals. More specifically, it is assumed that a horizontal planar patch of a desired terrain is viewed by (photometrically) ideal image sensors and produces two relatively small corresponding rectangular windows in both stereo images.
Then the similarity between the two windows is measured by summing squared differences between the signals or by computing the cross-correlation [Baker, 1984; Förstner, 1993; Hannah, 1988]. The above model is easily extended to take account of varying x-slopes of a surface patch by exhausting relative x-expansions and contractions of both windows and searching for the maximum similarity [Helava, 1988]. An alternative is to adapt the window size until the simplifying assumption of a horizontal surface patch is justified [Kanade & Okutomi, 1994].
More elaborate models compute similarity under various non-uniform photometric distortions of the images. In particular, to partially exclude the non-uniformity, either only the phases of the Fourier transforms of both windows are matched [Weng, 1993], or outputs of Gabor wavelet filters are used to isolate illumination perturbations of the images from variations of the surface reflectance attributed to orientations of the terrain patches [Chen et al., 1994], or cepstrums, that is, amplitudes of the Fourier transforms of the logarithmic Fourier transforms of the windows, are used to isolate the surface reflectance variations from the perturbations caused by the noisy linear optical stereo channels that transfer the signals [Ludwig et al., 1994], and so forth. Complex combinations of intensity- and feature-based techniques are frequently used to obtain robustness to typical image distortions [Cochran & Medioni, 1991].
Alternative and computationally less complex signal models in [Gimel'farb, 1979, 1991] make it possible to take into account both the varying surface geometry and the non-uniform signal distortions along a single terrain profile. The models admit arbitrary changes of the corresponding gray values provided that the ratios of the corresponding gray level differences remain in a given range. Section 1.3 presents these models in more detail.
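The two simplest window similarity measures named in this section, the sum of squared differences and the cross-correlation, can be illustrated with a minimal sketch. It assumes one-dimensional windows cut from a pair of corresponding epipolar lines and an exhaustive search over integer x-disparities; the function names `ssd`, `ncc`, and `best_disparity` are hypothetical and are not taken from any of the cited works.

```python
import math

def ssd(left, right):
    """Sum of squared differences of two equal-size windows (lower = more similar)."""
    return sum((a - b) ** 2 for a, b in zip(left, right))

def ncc(left, right):
    """Normalized cross-correlation in [-1, 1]; unchanged by uniform contrast/offset."""
    n = len(left)
    mean_l = sum(left) / n
    mean_r = sum(right) / n
    num = sum((a - mean_l) * (b - mean_r) for a, b in zip(left, right))
    den = math.sqrt(sum((a - mean_l) ** 2 for a in left) *
                    sum((b - mean_r) ** 2 for b in right))
    return num / den if den > 0 else 0.0

def best_disparity(left_line, right_line, x_left, half, d_min, d_max):
    """Exhaustive search for the x-disparity d = x_left - x_right maximizing NCC."""
    win_l = left_line[x_left - half: x_left + half + 1]
    best, best_score = None, -2.0
    for d in range(d_min, d_max + 1):
        lo = x_left - d - half          # window start in the right line
        hi = x_left - d + half + 1      # window end (exclusive)
        if lo < 0 or hi > len(right_line):
            continue                    # window falls outside the right line
        score = ncc(win_l, right_line[lo:hi])
        if score > best_score:
            best, best_score = d, score
    return best, best_score
```

Because the correlation is normalized, a right line that is a shifted copy of the left one with a uniform contrast and offset change still matches perfectly at the true disparity, whereas the plain sum of squared differences would not be zero there.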
1.1.2 Global versus local optimization
A terrain is reconstructed by searching for the maximum similarity between the corresponding regions or pixels in a given stereo pair. A similarity measure takes account of the admissible image distortions and includes some regularizing heuristics to deal with the partial occlusions. Generally, there are two possible scenarios for reconstructing a terrain: to exhaust all possible variants of a visible surface (a global optimization) or to search successively for each next small surface patch, given the previously found patches (a local optimization). For a continuous terrain, both variants are guided by the visibility constraints.
The local optimization is widely used in practice because in many cases it needs less computation and easily takes into account both x- and y-disparities of the corresponding pixels. But it has the following drawback. If each local decision is taken independently of the others, then the patches found may form an invalid 3D surface that violates the visibility constraints. But if each next search is guided by the previously found patches, then the local errors accumulate and, after a few steps, the "guidance" may result
in completely wrong matches. In both cases, the local optimization needs intensive interactive on-line editing or post-editing of a DPM or DSM to fix the resulting errors.
The global optimization is less subject to local errors. But it is feasible only if direct exhaustion of the surface variants can be avoided. In particular, this is possible when a terrain is reconstructed in a profile-by-profile mode and an additive similarity measure allows dynamic programming techniques to be used for the global optimization. Historically, this approach was first proposed in [Gimel'farb et al., 1972] (see also the comprehensive review [Baker, 1984]) and then extended in [Gimel'farb, 1979, 1991, 1994a, 1996]. Dynamic programming algorithms, mostly for feature-based stereo, were studied in many subsequent works, such as [Baker & Binford, 1981; Ohta & Kanade, 1985; Lloyd, 1986; Raju et al., 1987; Cox et al., 1996].
The symmetric dynamic programming stereo in [Gimel'farb, 1979, 1991; Gimel'farb et al., 1992] uses the maximum likelihood or Bayesian decision rule, derived from a particular probability model of the initial stereo images and desired surfaces, and takes into account the geometric symmetry of stereo channels (image sensors), basic regular and random non-uniform distortions of the images, and discontinuities in the images because of occlusions in each channel.
1.2 Statistical decisions in terrain reconstruction
In this section, we review the symmetric geometry of binocular stereo and discuss the influence of the visibility constraints on optimal statistical decisions for terrain reconstruction [Gimel'farb, 1994b]. We restrict our consideration to an ideal horizontal stereo pair. A DSM is considered as a bunch of epipolar profiles obtained by crossing a surface by a fan of epipolar planes. An epipolar plane contains the base-line connecting the optical centers of the stereo channels. Traces of an epipolar plane in the images, that is, two corresponding epipolar (scan)lines, represent any epipolar profile in this plane. Thus, to reconstruct a profile, only the signals along the epipolar lines have to be matched [Helava & Chapelle, 1972; Scarano & Brumm, 1976].
1.2.1 Symmetric stereo geometry
Both images of an ideal stereo pair are in the same plane. Let $L$ and $R$ denote two square lattices supporting a digital stereo pair in this plane. The pixels have integer x- and y-coordinates with steps of 1. The corresponding epipolar lines coincide with the x-lines having the same y-coordinate in the two images, and a DPM contains only the x-parallaxes of the corresponding pixels.
Figure 1.1 shows a cross-section of a digital surface by an epipolar plane. Lines $o_L x_L$ and $o_R x_R$ represent the corresponding x-lines in the images. The correspondence between the signals in the profile points and in the image pixels is given by the symbolic labels "a" to "k". Notice that both the solid and the dashed profiles give just the same labels in the images if the signals for these profiles have the shown labels.

[Figure 1.1: Digital terrain profile in the epipolar plane. The left epipolar line $x_L$ carries the labels a, b, c, d, e, f, h, k; the right epipolar line $x_R$ carries the labels a, d, f, g, h, i, j, k.]

Let $[X, y, Z]^T$ be the symmetric 3D coordinates of a point in the DSM. Here, $y = y_L = y_R$ denotes the y-coordinate of the epipolar lines that specify an epipolar plane containing the point, and $[X, Z]^T$ are the Cartesian 2D coordinates of the point in this plane. The X-axis coincides with the stereo base-line, which links the optical centers $O_L$ and $O_R$ of the channels and is the same for all the planes. The Z-axis lies in the plane $y$. The origin $O$ of the symmetric $(X, Z)$ coordinates is midway between the centers [Gimel'farb, 1991]. If the spatial positions of the origin $O$ and the plane $y$ are known, the symmetric coordinates $[X, y, Z]^T$ are easily converted into any given Cartesian 3D coordinate system.
Let pixels $[x_L, y]^T \in L$ and $[x_R, y]^T \in R$ correspond to a surface point $[X, y, Z]^T$. Then the x-parallax $p = x_L - x_R$ is inversely proportional to the depth (distance, or height) $Z$ of the point $[X, Z]^T$ from the base-line $OX$: $p = bf/Z$. Here, $b$ denotes the length of the base-line and $f$ is the focal length for the channels.
Each digital profile in the epipolar plane $y$ is a chain of isolated 3D points $[X, y, Z]^T$ that correspond to the pixels $[x_L, y]^T \in L$ and $[x_R, y]^T \in R$. The DSM points $[X, y, Z]^T$, projected onto the image plane through the origin $O$, form the auxiliary "central" lattice $C$. This lattice has x-steps of 0.5 and y-steps of 1. A symmetric DPM on the lattice $C$ is obtained by replacing the coordinates $X$ and $Z$ of a DSM in the epipolar plane $y$ with the corresponding $(x, y)$-coordinates in $C$ and the x-parallaxes, respectively. If the pixels $[x_L, y]^T \in L$ and $[x_R, y]^T \in R$ in a stereo pair correspond to a surface point with the planar coordinates $[x, y]^T \in C$, then the following simple relations hold:

\[ x_L = x + \frac{p}{2}, \qquad x_R = x - \frac{p}{2}. \tag{1.1} \]

Figure 1.2 shows the epipolar plane of Fig. 1.1 in the symmetric $(x, p)$ coordinates. Figure 1.3 presents two more profiles giving rise to the same distribution of the corresponding pixels along the epipolar lines under the shown labels of the points.

[Figure 1.2: Epipolar profiles in $(x, p)$ coordinates giving the same image signals.]

We will consider an extended DPM $(\mathbf{p}, \mathbf{g})$ that contains a digital surface $\mathbf{p}: C \to P$ and an orthoimage $\mathbf{g}: C \to Q$ of the surface in the symmetric 3D coordinates. Here, $P = [p_{\min}, p_{\min} + 1, \ldots, p_{\max}]$ is a finite set of the x-parallax values and $Q = [0, 1, \ldots, q_{\max}]$ is a finite set of the signal values (say, gray levels). The orthoimage $\mathbf{g}$ represents the optical signals $g(x, y) \in Q$ in the surface points $[x, y, p = p(x, y)]^T$, $(x, y) \in C$. The digital surface $\mathbf{p}$ consists of the epipolar profiles $\mathbf{p}_y$. Each profile has a fixed y-coordinate of the points and is represented by the two epipolar lines $\mathbf{g}_{L,y}$ and $\mathbf{g}_{R,y}$ with the same y-coordinate of the pixels in the images.
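The coordinate relations of this subsection, namely (1.1) and the parallax-depth relation $p = bf/Z$, can be written down as a short sketch. The function names are hypothetical, and the units are an assumption: the focal length is taken in pixels and the base-line in metric units, so that the depth comes out in the same metric units.

```python
def pixels_from_symmetric(x, p):
    """Relations (1.1): x-coordinates of the corresponding left and right pixels."""
    return x + p / 2.0, x - p / 2.0

def symmetric_from_pixels(x_left, x_right):
    """Inverse of (1.1): central-lattice coordinate x and x-parallax p."""
    return (x_left + x_right) / 2.0, x_left - x_right

def depth_from_parallax(p, baseline, focal):
    """Depth Z = b f / p of a point above the base-line (requires p > 0)."""
    if p <= 0:
        raise ValueError("x-parallax must be positive for a finite positive depth")
    return baseline * focal / p
```

An odd parallax gives a half-integer x, which is why the central lattice $C$ needs x-steps of 0.5: for example, the pixels $x_L = 12$ and $x_R = 9$ map to $x = 10.5$, $p = 3$.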
[Figure 1.3: Epipolar profiles in $(x, p)$ coordinates giving the same image signals.]
1.2.2 Simple and compound Bayesian decisions
Partial occlusions of the surface points impose the following strict visibility constraints on the x-parallaxes along an epipolar line in a symmetric DPM:

\[ p(x - 0.5, y) - 1 \le p(x, y) \le p(x - 0.5, y) + 1. \tag{1.2} \]

Let $\mathcal{P}$ be the parent population of all the DPMs $\mathbf{p}$ that satisfy the constraints (1.2). The constraints result in specific statistical decisions for reconstructing a DPM $\mathbf{p} \in \mathcal{P}$ from a given stereo pair.
Let an error be defined as any discrepancy between the true and the reconstructed surface. The conditional Bayesian MAP-decision that minimizes the error probability by choosing a surface $\mathbf{p} \in \mathcal{P}$ with the maximum a posteriori probability is as follows:

\[ \mathbf{p}_{\mathrm{opt}} = \arg\max_{\mathbf{p} \in \mathcal{P}} \Pr(\mathbf{p} \mid \mathbf{g}_L, \mathbf{g}_R). \tag{1.3} \]

Here, $\Pr(\mathbf{p} \mid \mathbf{g}_L, \mathbf{g}_R)$ is the a posteriori probability distribution (p.d.) of the surfaces for a given initial stereo pair. If the probability model of a terrain and stereo images yields only the prior p.d. $\Pr(\mathbf{g}_L, \mathbf{g}_R \mid \mathbf{p})$ of the images, then the conditional maximum likelihood decision can be used instead of the conditional MAP-decision:

\[ \mathbf{p}_{\mathrm{opt}} = \arg\max_{\mathbf{p} \in \mathcal{P}} \Pr(\mathbf{g}_L, \mathbf{g}_R \mid \mathbf{p}). \tag{1.4} \]
Let a pointwise error be a difference between the true and the reconstructed x-parallax value in a particular surface point. Compound Bayesian decisions are more adequate to low-level stereo than the simple rules in (1.3) and (1.4) because they minimize
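either the expected number of the pointwise errors:

\[ \mathbf{p}_{\mathrm{opt}} = \arg\max_{\mathbf{p} \in \mathcal{P}} \sum_{(x,y) \in C} \Pr\nolimits_{x,y}(p(x, y) \mid \mathbf{g}_L, \mathbf{g}_R) \tag{1.5} \]

or the mean magnitude of the pointwise errors, that is, the variance of the obtained DPM about the true one:

\[ \mathbf{p}_{\mathrm{opt}} = \arg\min_{\mathbf{p} \in \mathcal{P}} \sum_{(x,y) \in C} \bigl( p(x, y) - E\{p(x, y) \mid \mathbf{g}_L, \mathbf{g}_R\} \bigr)^2. \tag{1.6} \]

Here, $\Pr_{x,y}(p \mid \mathbf{g}_L, \mathbf{g}_R)$ denotes a posterior marginal probability of the surface point $[x, y, p = p(x, y)]^T$ for a given stereo pair and $E\{\ldots\}$ is a posterior expectation of the surface point, that is, the expected x-parallax value in the surface point with the planar coordinates $(x, y)$:

\[ E\{p(x, y) \mid \mathbf{g}_L, \mathbf{g}_R\} = \sum_{p \in P} p \, \Pr\nolimits_{x,y}(p \mid \mathbf{g}_L, \mathbf{g}_R). \tag{1.7} \]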
Due to the constraints (1.2), the rule in (1.5) with the maximum sum of the posterior marginal probabilities of the surface points (the MSPM-decision) does not reduce to the well-known pointwise MPM-decision with the maximal posterior marginal probability of each profile point [Marroquin et al., 1987]. Likewise, the rule in (1.6) with the minimal variance of these points about the posterior expectations (the MVPE-decision) does not reduce to the well-known PE-decision that chooses the posterior expectations as the reconstructed points [Yuille et al., 1990].
Both the conditional simple decisions in (1.3) and (1.4) and the conditional compound MSPM- and MVPE-decisions in (1.5) and (1.6) allow for the dynamic programming reconstruction of an epipolar profile (see Sect. 1.4).
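Why the MSPM-decision differs from the pointwise MPM-decision can be seen on a toy profile. The sketch below is a deliberate simplification and not the algorithm of Sect. 1.4: it assumes that the posterior marginals are already given as a table (one row per successive point of the lattice $C$, one column per parallax value counted from $p_{\min}$), and it ignores the visibility states and the half-step parity of the lattice. All function names are hypothetical.

```python
def mpm(marginals):
    """Pointwise MPM: pick the most probable parallax independently at each point."""
    return [max(range(len(row)), key=row.__getitem__) for row in marginals]

def satisfies_visibility(profile):
    """Constraint (1.2): parallaxes of successive points differ by at most 1."""
    return all(abs(b - a) <= 1 for a, b in zip(profile, profile[1:]))

def mspm(marginals):
    """Maximize the sum of marginals over profiles obeying (1.2) by dynamic programming."""
    n_p = len(marginals[0])
    score = list(marginals[0])          # best partial sums ending in each parallax
    back = []                           # back-pointers, one list per later point
    for row in marginals[1:]:
        new_score, pointers = [], []
        for p in range(n_p):
            # admissible predecessors: parallaxes p - 1, p, p + 1 inside the range
            prev = max(range(max(0, p - 1), min(n_p, p + 2)), key=score.__getitem__)
            new_score.append(score[prev] + row[p])
            pointers.append(prev)
        score = new_score
        back.append(pointers)
    p = max(range(n_p), key=score.__getitem__)
    profile = [p]
    for pointers in reversed(back):     # trace the optimal profile backwards
        p = pointers[p]
        profile.append(p)
    return profile[::-1]
```

For the marginals `[[0.7, 0.2, 0.1], [0.1, 0.3, 0.6], [0.7, 0.2, 0.1]]` the pointwise rule jumps by two parallax steps and violates (1.2), while the constrained maximization settles for the second-best value in the middle point.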
1.3 Probability models of epipolar profiles
This section presents geometric and photometric probability models describing the initial stereo images and the desired surfaces. Geometric models [Gimel'farb, 1994a, 1996] represent an epipolar profile by a Markov chain of admissible transitions between the vertices in a planar graph of profile variants (GPV). Figures 1.2 and 1.3 show examples of the GPV. Photometric models [Gimel'farb et al., 1972; Gimel'farb, 1979, 1991; Gimel'farb et al., 1992] describe basic regular and random photometric distortions of the stereo images $\mathbf{g}_L$ and $\mathbf{g}_R$ with respect to the optical signals, that is, to the orthoimage of a surface².

² Below, the abbreviations BVP and MVP denote the binocularly and the only monocularly visible surface points, respectively.
[Figure 1.4: Admissible transitions in the GPV. Axes: $x$ (with marks $x_o$, $x_o + 0.5$, $x_o + 1$) and $p$ (with marks $p_o - 1$, $p_o$, $p_o + 1$).]
1.3.1 Prior geometric model
Transitions between the GPV-vertices in the coordinates $x, p$ represent all profile variants with at least the MVPs [Gimel'farb, 1979, 1991]. Each GPV-vertex $v = [x, p, s]$ has one of three visibility states $s$ indicating the binocular ($s = B$) or only monocular ($s = M_L$ or $M_R$) observation by the stereo channel $L$ or $R$. Only the eight transitions shown in Fig. 1.4 are allowed in a GPV.
The Markov chain model with a stationary p.d. describes the expected shape and smoothness of a profile by transition and marginal probabilities of the visibility states. This allows for probabilistic ordering of the profile variants in Figs. 1.2 and 1.3 that have the same similarity with respect to image signals.
Let $\Pr(v \mid v')$ be a probability of transition from a preceding vertex $v'$ to a current vertex $v$. The differences $x - x'$ and $p - p'$ between the x-coordinates and x-parallaxes in these GPV-vertices are uniquely specified by the visibility states $s$ and $s'$. Therefore the transition probabilities can be denoted as $\Pr(s \mid s')$.
If the Markov chain has the stationary p.d. of the visibility states in the equilibrium, then only seven of the allowable transitions have non-zero probability and the transition $M_R \to M_L$ is absent:

\[ \Pr(M_L \mid M_R) \equiv \Pr(x, p, M_L \mid x - 1, p, M_R) = 0. \]

The GPV in Fig. 1.5 with a narrow "tube" of x-parallaxes demonstrates both the uniform internal transitions and the specific uppermost/lowermost transitions that preserve the equilibrium conditions. Let $M$ substitute for both $M_L$ and $M_R$ so that

\[ \Pr(M_L \mid B) = \Pr(M_R \mid B) \equiv \Pr(M \mid B); \qquad \Pr(B \mid M_L) = \Pr(B \mid M_R) \equiv \Pr(B \mid M); \qquad \Pr(M_L \mid M_L) = \Pr(M_R \mid M_R) \equiv \Pr(M \mid M), \]

while the probability $\Pr(M_L \mid M_R)$ of the cross transition $M_R \to M_L$ is kept as a separate quantity.
[Figure 1.5: GPV for the symmetric DP stereo. Axes: $x$ (with marks $x_o - 1$, $x_o$, $x_o + 1$, $x_o + 2$) and $p$ (with marks $p_o - 2$ to $p_o + 2$); the vertices carry the visibility states $M_R$, $B$, $M_L$.]
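The transition probabilities $\Pr(B \mid M)$ and $\Pr(M \mid B)$ and the resulting marginal probabilities of the visibility states $\Pr(B)$ and $\Pr(M)$ in the generated profiles are expressed in terms of the two transition probabilities $\Pr(B \mid B)$ and $\Pr(M \mid M)$ as follows:

\[
\begin{aligned}
\Pr(M \mid B) &= \frac{1 - \Pr(B \mid B)}{2}; \qquad \Pr(B \mid M) = 1 - \Pr(M \mid M); \qquad \Pr(M_L \mid M_R) = 0; \\
\Pr(B) &= \frac{1 - \Pr(M \mid M)}{2 - \Pr(B \mid B) - \Pr(M \mid M)}; \qquad
\Pr(M) = \frac{0.5\,(1 - \Pr(B \mid B))}{2 - \Pr(B \mid B) - \Pr(M \mid M)}.
\end{aligned}
\tag{1.8}
\]

To retain the equilibrium conditions at the boundaries of the GPV in Fig. 1.5 and the marginal probabilities $\Pr(B)$ and $\Pr(M)$ for the internal GPV-vertices, the transition probabilities for the extreme GPV-vertices at the uppermost boundary are as follows:

\[
\begin{aligned}
&\Pr\nolimits_{\mathrm{upp}}(B \mid B) = 1 - \Pr(M); \qquad \Pr\nolimits_{\mathrm{upp}}(M \mid B) = \Pr(M); \\
&\Pr\nolimits_{\mathrm{upp}}(B \mid M) = 1; \qquad \Pr\nolimits_{\mathrm{upp}}(M \mid M) = 0; \qquad \Pr\nolimits_{\mathrm{upp}}(M_L \mid M_R) = 0.
\end{aligned}
\tag{1.9}
\]

At the lowermost boundary there are the following transition probabilities:

\[
\begin{aligned}
&\Pr\nolimits_{\mathrm{low}}(B \mid B) = 1 - \frac{\Pr(M)}{\Pr(B)}; \qquad \Pr\nolimits_{\mathrm{low}}(M \mid B) = \frac{\Pr(M)}{\Pr(B)}; \qquad \Pr\nolimits_{\mathrm{low}}(B \mid M) = 1; \\
&\Pr\nolimits_{\mathrm{low}}(M \mid M) = 0; \qquad \Pr\nolimits_{\mathrm{low}}(M_L \mid M_R) = 0 \qquad \text{if } \Pr(M) \le \Pr(B)
\end{aligned}
\tag{1.10}
\]

and

\[
\begin{aligned}
&\Pr\nolimits_{\mathrm{low}}(B \mid B) = 0; \qquad \Pr\nolimits_{\mathrm{low}}(M \mid B) = 1; \qquad \Pr\nolimits_{\mathrm{low}}(B \mid M) = \frac{\Pr(B)}{\Pr(M)}; \\
&\Pr\nolimits_{\mathrm{low}}(M \mid M) = 0; \qquad \Pr\nolimits_{\mathrm{low}}(M_L \mid M_R) = 1 - \frac{\Pr(B)}{\Pr(M)} \qquad \text{if } \Pr(M) > \Pr(B).
\end{aligned}
\tag{1.11}
\]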
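The marginal probabilities in (1.8) can be checked numerically. The sketch below assumes that the internal GPV-vertices behave as a three-state Markov chain over the visibility states $B$, $M_L$, $M_R$ with no direct transitions between $M_L$ and $M_R$, and compares the closed-form marginals with the equilibrium reached by repeated application of the transition matrix. The function names are hypothetical.

```python
def chain_probabilities(pr_bb, pr_mm):
    """Quantities of (1.8) computed from Pr(B|B) and Pr(M|M)."""
    pr_mb = (1.0 - pr_bb) / 2.0          # Pr(M|B), for each of M_L and M_R
    pr_bm = 1.0 - pr_mm                  # Pr(B|M)
    denom = 2.0 - pr_bb - pr_mm
    pr_b = (1.0 - pr_mm) / denom         # marginal Pr(B)
    pr_m = 0.5 * (1.0 - pr_bb) / denom   # marginal Pr(M), for each of M_L and M_R
    return pr_mb, pr_bm, pr_b, pr_m

def stationary_by_iteration(pr_bb, pr_mm, steps=2000):
    """Iterate the chain over (B, M_L, M_R) until it settles in equilibrium."""
    pr_mb, pr_bm, _, _ = chain_probabilities(pr_bb, pr_mm)
    # Rows: current state; columns: next state; order B, M_L, M_R.
    # Assumption: M_R -> M_L has zero probability and M_L -> M_R is not admissible.
    t = [[pr_bb, pr_mb, pr_mb],
         [pr_bm, pr_mm, 0.0],
         [pr_bm, 0.0, pr_mm]]
    dist = [1.0, 0.0, 0.0]               # start in the binocular state
    for _ in range(steps):
        dist = [sum(dist[i] * t[i][j] for i in range(3)) for j in range(3)]
    return dist
```

With $\Pr(B \mid B) = 0.9$ and $\Pr(M \mid M) = 0.6$, both routes give $\Pr(B) = 0.8$ and $\Pr(M) = 0.1$ for each monocular state, so that $\Pr(B) + 2\Pr(M) = 1$.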
1.3.2 Prior photometric models
A symmetric photometric model specifies the basic distortions of the images $\mathbf{g}_L$ and $\mathbf{g}_R$ with respect to the orthoimage $\mathbf{g}$ of a DPM. The distortions are described by positive transfer factors, which vary over the FOV of each stereo channel, and by a random noise in the channels:

\[ q_L = a_L q + r_L; \qquad q_R = a_R q + r_R. \tag{1.12} \]

Here, $q = g(x, y)$ denotes the signal value $q \in Q$ in the point $[x, y, p = p(x, y)]^T$ of a surface represented by a DPM $(\mathbf{p}, \mathbf{g})$, and $q_L = g_L(x_L, y)$ and $q_R = g_R(x_R, y)$ are the signal values in the corresponding image pixels. The transfer factors $a_L = a_L(x_L, y)$, $a_R = a_R(x_R, y)$ and the random noise $r_L = r_L(x_L, y)$, $r_R = r_R(x_R, y)$ present the multiplicative and the additive parts of the orthoimage distortions, respectively. The transfer factors $a_L$ and $a_R$ vary in a given range $A = [a_{\min}, a_{\max}]$ such that $0 < a_{\min} \le (a_L, a_R) \le a_{\max}$.
The transfer factors represent the most regular part of the image distortions; to retain a visual resemblance between the stereo images, they cannot be independent for adjacent BVPs. To describe these interdependencies, the symmetric difference model [Gimel'farb, 1979, 1991] involves a direct proportion between each gray level difference in the adjacent BVPs and the two corresponding differences in the stereo images to within the additive random noise:

\[ a_L q - a'_L q' = e_L (q - q'); \qquad a_R q - a'_R q' = e_R (q - q'). \tag{1.13} \]

Here $q = g(x, y)$ and $q' = g(x', y')$ are the signal values in the neighboring BVPs along the same epipolar profile ($y = y'$) or in the two adjacent profiles ($|y - y'| = 1$)
in the DPM $\mathbf{p}$, and $e_L$, $e_R$ denote the positive "difference" transfer factors. These factors describe local interactions between the "amplitude" factors $a_j$, $j \in \{L, R\}$, over the FOVs and can vary within a given range $E = [e_{\min}, e_{\max}]$, where $0 < e_{\min} \le (e_L, e_R) \le e_{\max}$.
The difference model in (1.13) admits large deviations between the corresponding gray levels but retains the visual resemblance of the images by preserving the approximate direct proportions between the corresponding gray level differences.
For the independent p.d. of the orthoimage signals and the independent random noise $r_L$, $r_R$ in the stereo images, (1.13) results in a Markov chain of the image signals corresponding to the BVPs and in independent image signals corresponding to the MVPs along a given profile.
Under a given surface geometry $\mathbf{p}$, different statistical estimates of the surface orthoimage $\mathbf{g}$ and transfer factors $a_L$, $a_R$ can be deduced using particular assumptions about the p.d. of the random noise and the variations of the transfer factors. The estimates are based on the corresponding image signals for the BVPs of a surface. The match between the estimated orthoimage and the initial stereo images to within a given range of the transfer factors forms a theoretically justified part of a quantitative intensity-based measure of similarity between the stereo images. A heuristic part of the measure corresponds to the MVPs, because the model in (1.12) gives no way to estimate the parameters of the image distortions without prior assumptions about their links with the like parameters for the neighboring BVPs.
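The behaviour of the difference model (1.13) can be illustrated on a noise-free toy profile. The sketch assumes that the random noise in (1.12) is zero, so that each gray level difference between successive BVPs in an image equals the orthoimage difference scaled by a difference transfer factor; the function names and the sample values are hypothetical.

```python
def distort_profile(ortho, e_factors, first_value):
    """Image signals for successive BVPs under the noise-free model (1.13).

    ortho       -- orthoimage gray values q for successive BVPs
    e_factors   -- one difference transfer factor per pair of successive BVPs
    first_value -- image gray value of the first BVP (arbitrary amplitude)
    """
    signals = [first_value]
    for i in range(1, len(ortho)):
        step = e_factors[i - 1] * (ortho[i] - ortho[i - 1])
        signals.append(signals[-1] + step)
    return signals

def difference_ratios(image, ortho):
    """Ratios of image differences to orthoimage differences (where defined)."""
    return [(image[i] - image[i - 1]) / (ortho[i] - ortho[i - 1])
            for i in range(1, len(ortho)) if ortho[i] != ortho[i - 1]]
```

For the orthoimage values `[100, 110, 130, 120]`, the factors `[0.5, 1.5, 1.0]`, and the first image value 40, the image signals are `[40, 45.0, 75.0, 65.0]`: the gray levels deviate strongly from the orthoimage, yet every difference ratio stays in the range [0.5, 1.5].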
1.3.3 Posterior model of a profile
The geometric model in (1.8) and the photometric model in (1.13) describe a terrain profile $\mathbf{p}_y \in \mathbf{p}$ and the signals $\mathbf{g}_{L,y}$, $\mathbf{g}_{R,y}$ along the corresponding epipolar lines in the images by particular Markov chains. As a result, both the prior p.d. $\Pr(\mathbf{g}_{L,y}, \mathbf{g}_{R,y} \mid \mathbf{p}_y)$ of the signals for a given profile and the posterior p.d. $\Pr(\mathbf{p}_y \mid \mathbf{g}_{L,y}, \mathbf{g}_{R,y})$ of the profiles for given image signals can be assumed to expand into products of conditional transition probabilities. For brevity, the index $y$ is omitted below.
Let $\Pr(\mathbf{g}_L, \mathbf{g}_R \mid v_{i-1}, v_i)$ denote the transition probability of the image signals for two given successive GPV-vertices along a profile. Let $\Pr(v_i \mid v_{i-1}, \mathbf{g}_L, \mathbf{g}_R)$ be the transition probability of the two successive GPV-vertices along the profile for given corresponding image signals. Then the above prior and posterior p.d. are as follows:

\[ \Pr(\mathbf{g}_L, \mathbf{g}_R \mid \mathbf{p}) = \Pr\nolimits_0(\mathbf{g}_L, \mathbf{g}_R \mid v_0) \prod_{i=1}^{N-1} \Pr(\mathbf{g}_L, \mathbf{g}_R \mid v_{i-1}, v_i); \tag{1.14} \]

\[ \Pr(\mathbf{p} \mid \mathbf{g}_L, \mathbf{g}_R) = \Pr\nolimits_0(v_0 \mid \mathbf{g}_L, \mathbf{g}_R) \prod_{i=1}^{N-1} \Pr(v_i \mid v_{i-1}, \mathbf{g}_L, \mathbf{g}_R). \tag{1.15} \]
Here, $i$ denotes the serial numbers of the GPV-vertices along a profile, $N$ is the total number of these vertices, $\Pr_0(\mathbf{g}_L, \mathbf{g}_R \mid v_0)$ is the marginal probability of the image signals for a given starting GPV-vertex $v_0$, and $\Pr_0(v_0 \mid \mathbf{g}_L, \mathbf{g}_R)$ is the marginal probability of the starting GPV-vertex $v_0$ for given image signals. The transition probabilities in (1.14) and (1.15) are derived from the photometric model of (1.13), but the transition probabilities in (1.15) depend also on the geometric model of (1.8).
The marginal probabilities of the GPV-vertices are calculated in succession along the GPV by the obvious relations which follow directly from Figs. 1.4 and 1.5:

\[ \Pr(v \mid \mathbf{g}_L, \mathbf{g}_R) = \sum_{v' \in \omega(v)} \Pr(v' \mid \mathbf{g}_L, \mathbf{g}_R) \, \Pr(v \mid v', \mathbf{g}_L, \mathbf{g}_R). \tag{1.16} \]
Here $\omega(v)$ denotes the set of the nearest neighboring GPV-vertices $v'$ that precede the current vertex $v$. Generally, the set contains the following GPV-vertices shown in Fig. 1.4:

\[
\begin{aligned}
\omega(x, p, M_L) &= \{(x - 1, p, B);\ (x - 0.5, p - 1, M_L)\}; \\
\omega(x, p, B) &= \{(x - 1, p, B);\ (x - 1, p, M_R);\ (x - 0.5, p - 1, M_L)\}; \\
\omega(x, p, M_R) &= \{(x - 0.5, p + 1, B);\ (x - 0.5, p + 1, M_R)\}.
\end{aligned}
\tag{1.17}
\]

The transition probabilities $\Pr(v_i \mid v_{i-1}, \mathbf{g}_L, \mathbf{g}_R)$ in (1.15) are related by the photometric model of (1.13) to the maximum residual deviations of each stereo image $\mathbf{g}_L$, $\mathbf{g}_R$ from the estimated orthoimage $\mathbf{g}$. The deviations are obtained by transforming the estimated orthoimage in such a way as to find the best approximation of each stereo image to within a given range $E$ of the difference transfer factors.
Let $i$ be the ordinal number of a GPV-vertex along the profile. Let $q_{L,i} = g_L(x_{L,i}, y)$ and $q_{R,i} = g_R(x_{R,i}, y)$ be the gray values in the pixels $[x_{L,i}, y]^T \in L$ and $[x_{R,i}, y]^T \in R$ corresponding to the GPV-vertex $v_{i,B} = [x_i, p_i, s_i = B]$. Let $u_{L,i}$ and $u_{R,i}$ be the gray values which approximate $q_{L,i}$ and $q_{R,i}$, respectively, in the transformed orthoimages. The orthoimage is transformed to approximate the images $\mathbf{g}_L$ and $\mathbf{g}_R$ by minimizing the maximum square deviation from both images. Under a particular p.d. of the random noise, the transition probabilities for the neighboring BVPs are as follows [Gimel'farb, 1991; Gimel'farb et al., 1992]:

\[ \Pr(v_{i,B} \mid v_{i-1,B}, \mathbf{g}_L, \mathbf{g}_R) \propto \Pr(B \mid B) \exp\bigl( -\gamma \, d(q_{L,i}, q_{L,i-1}, q_{R,i}, q_{R,i-1}) \bigr), \tag{1.18} \]

where the factor $\gamma$ is inversely proportional to the expected variance of the residual minimax deviations $\delta_{L,i} = q_{L,i} - u_{L,i} \equiv \delta_{R,i} = u_{R,i} - q_{R,i}$ of the approximation.
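The predecessor sets in (1.17) and the recursion in (1.16) can be sketched in a few lines. The sketch assumes that vertices are stored as tuples `(x, p, s)` with the state `s` given as one of the strings `"B"`, `"ML"`, `"MR"`, that the marginals of the starting vertices are supplied by the caller, and that `transition(u, v)` is a user-provided stand-in for the transition probability $\Pr(v \mid v', \mathbf{g}_L, \mathbf{g}_R)$; all these names are hypothetical.

```python
def predecessors(x, p, s):
    """Predecessor sets omega(v) of (1.17) for a vertex v = (x, p, s)."""
    if s == "ML":
        return [(x - 1, p, "B"), (x - 0.5, p - 1, "ML")]
    if s == "B":
        return [(x - 1, p, "B"), (x - 1, p, "MR"), (x - 0.5, p - 1, "ML")]
    if s == "MR":
        return [(x - 0.5, p + 1, "B"), (x - 0.5, p + 1, "MR")]
    raise ValueError("unknown visibility state: %r" % (s,))

def propagate(marginal, transition, vertices):
    """Recursion (1.16): fill in marginals for `vertices` listed in increasing x.

    marginal   -- dict mapping already known vertices to their marginals
    transition -- function (previous_vertex, vertex) -> transition probability
    """
    for v in vertices:
        if v in marginal:
            continue                      # starting vertices are given
        marginal[v] = sum(marginal.get(u, 0.0) * transition(u, v)
                          for u in predecessors(*v))
    return marginal
```

Predecessors that lie outside the GPV simply contribute nothing, because their marginal is taken as zero.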
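The local dissimilarity $d(\ldots) = (\delta_{L,i})^2$ in the exponent depends on the corresponding gray level differences $\Delta_{L:i,i-1} = q_{L,i} - q_{L,i-1}$ and $\Delta_{R:i,i-1} = q_{R,i} - q_{R,i-1}$ in the images, on the current estimate $e_{\mathrm{opt}}$ of the difference transfer factors, and on the residual errors as follows:

$$\begin{aligned}
\delta_{L,i} &= \delta_{L,i-1} + \Delta_{L:i,i-1} - e_{\mathrm{opt}}\,(\Delta_{L:i,i-1} + \Delta_{R:i,i-1}), \\
e_{\mathrm{opt}} &= \arg\min_{e \in E} \Bigl\{ \bigl(\delta_{L,i-1} + \Delta_{L:i,i-1} - e\,(\Delta_{L:i,i-1} + \Delta_{R:i,i-1})\bigr)^2 \Bigr\}.
\end{aligned} \tag{1.19}$$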
Here, $e = e_L/(e_L + e_R)$ is the normed transfer factor and $E$ denotes its range: $E = [e_{\min}, e_{\max}]$ where $0 < e_{\min} \le 0.5 \le e_{\max} = 1 - e_{\min} < 1$. The transition probability $\Pr(v_{i,B} \mid v_{i-1,M}, g_L, g_R)$ for the transition $M \to B$ (that is, $s_{i-1} = M$ and $s_i = B$) has a similar form, except that the nearest preceding BVP along the profile is used instead of the adjacent MVP $v_{i-1}$ to get the dissimilarity value $d(\ldots)$ of (1.18) and (1.19). The relations in (1.19) are derived from the model in (1.13) by minimizing the maximum pixelwise error of adjusting the orthoimage to the stereo images. It is easily shown that the gray value $q_i$ for the BVP $v_i$ is estimated by averaging the corresponding image signals: $q_i = (q_{L,i} + q_{R,i})/2$. The orthoimage is then transformed to approximate each stereo image by changing the gray level differences for the successive BVPs within a given range $E$ of the transfer factors.

However, additional heuristics are essential to define the transition probabilities for the transitions $M \to M$ and $B \to M$, because the MVPs do not allow the transfer factors and orthoimage signals to be estimated. If the transfer factors and additive noise in (1.12) vary rather smoothly over the FOVs, there are several possible ways to introduce these heuristics: (i) a constant value of the residual square deviation $d(\ldots) = d_0 \equiv \delta_0^2 > 0$ for all the MVPs [Gimel'farb, 1979, 1991]; (ii) extension of the deviation computed for a current BVP to the subsequent MVPs [Gimel'farb, 1991]; (iii) extension of the relative deviation of the approximation $\lambda_{L,i} = \delta_{L,i}/q_{L,i}$ for a current BVP to the subsequent MVPs: $\delta_{L,k} = \max\{\delta_0,\ \lambda_{L,i}\, q_{L,k}\}$, $k = i+1, i+2, \ldots$, as long as $s_k = ML$ or $MR$; and so forth.
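The recursion in (1.19) can be illustrated by a minimal sketch for one pair of successive BVPs. Because the squared residual is a convex quadratic in $e$, the sketch takes the unconstrained minimizer and clips it to the range $E$; this closed form is our own reading of (1.19), not a procedure stated in the cited papers. The function names, the argument names, and the default `e_min` are hypothetical.

```python
def update_residual(delta_prev, d_left, d_right, e_min=0.2):
    """One step of recursion (1.19) for two successive BVPs (illustrative sketch).

    delta_prev      -- residual deviation delta_{L,i-1} at the previous BVP
    d_left, d_right -- gray level differences Delta_L and Delta_R between the BVPs
    e_min           -- lower bound of the range E = [e_min, 1 - e_min] (assumed value)
    Returns the pair (delta_{L,i}, e_opt).
    """
    if not 0.0 < e_min <= 0.5:
        raise ValueError("e_min must satisfy 0 < e_min <= 0.5")
    e_max = 1.0 - e_min
    a = delta_prev + d_left   # residual before the difference is redistributed
    s = d_left + d_right      # total gray level change over both images
    if s == 0:
        e_opt = 0.5           # every e gives the same residual; take the midpoint
    else:
        # (a - e*s)^2 is convex in e, so the constrained minimizer is the
        # unconstrained one, a/s, clipped to [e_min, e_max].
        e_opt = min(max(a / s, e_min), e_max)
    return a - e_opt * s, e_opt


def dissimilarity(delta):
    """Local dissimilarity d(...) = delta^2 used in the exponent of (1.18)."""
    return delta * delta
```

With equal differences in both images the residual stays at zero (`e_opt = 0.5`), whereas a difference present in one image only cannot be fully absorbed inside $E$ and leaves a non-zero residual.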
1.4 Dynamic programming reconstruction

This section reviews the dynamic programming terrain reconstruction based on the optimal statistical decisions (1.3)–(1.6). The additive similarity measures, obtained by taking the logarithm of the p.d. in (1.14) and (1.15), make it possible to implement the maximum likelihood rule of (1.4) or the Bayesian MAP decision of (1.3), respectively. The relations in (1.16) and (1.17) make it possible to compute the additive similarity measures of (1.5) and (1.6) for the compound Bayesian rules. Each similarity measure is maximized by dynamic programming to take account of the visibility constraints of (1.2) [Gimel'farb, 1979, 1991; Gimel'farb et al., 1992]. Some heuristics are embedded into the similarity measures to cope with discontinuities in the images due to occlusions and to reduce the resulting multiplicity of the surfaces that are consistent with the images. Also, confidences of the reconstructed terrain points have to be estimated for evaluating and validating the obtained results.
1.4.1 Unified dynamic programming framework

Equations (1.14)–(1.16) provide a unified dynamic programming framework for terrain reconstruction in a profile-by-profile mode. Each epipolar profile $p_y$ is reconstructed as a continuous path in the GPV maximizing an additive similarity measure:
$$p_{\mathrm{opt}} = \arg\max_{p} W(p \mid g_L, g_R). \tag{1.20}$$

The similarity measure $W(\ldots)$ is represented as follows:

$$W(p \mid g_L, g_R) = \sum_{i=1}^{N-1} w(v_{i-1}, v_i \mid g_L, g_R), \tag{1.21}$$
where the local similarity term $w(\ldots)$ describes the transition between the GPV-vertices $v_{i-1}$ and $v_i$. In particular, such a term can be obtained from the transition probabilities in (1.18) and (1.19). The dynamic programming search for the global maximum of the similarity measure in (1.21) involves a successive pass along the x-axis of the GPV. At any current x-coordinate $x_c$, all the possible GPV-vertices $v_c = (x_c, p_c, s_c)$ are examined to calculate and store, for each vertex, the local potentially optimal backward transition $v_c \to v_p = t(v_c)$ to one of the preceding vertices $\omega(v_c)$ listed in (1.17). Let $[x_b, x_e]$ be a given range of the x-coordinates of the GPV-vertices. Let $W_{x_c}(v_c)$ denote the similarity value accumulated along a potentially optimal part of the profile with x-coordinates between $x_b$ and $x_c$ that ends in the GPV-vertex $v_c$. Then the basic dynamic programming recurrent computation is as follows:
$$\begin{aligned}
t(v_c) &= \arg\max_{v_p \in \omega(v_c)} \bigl\{ W_{x_p}(v_p) + w(v_p, v_c \mid g_L, g_R) \bigr\}, \\
W_{x_c}(v_c) &= W_{x_p}(t(v_c)) + w(t(v_c), v_c \mid g_L, g_R).
\end{aligned} \tag{1.22}$$

After passing the given range of the x-coordinates, the desired profile $p_{\mathrm{opt}}$ of (1.20) is obtained by the backward pass using the stored potentially optimal transitions $t(v)$ for the GPV-vertices:

$$v^{\mathrm{opt}}_{N-1} = \arg\max_{v_e} \bigl\{ W_{x_e}(v_e) \bigr\}, \qquad v^{\mathrm{opt}}_{i-1} = t(v^{\mathrm{opt}}_{i}), \quad i = N-1, \ldots, 1. \tag{1.23}$$
The maximum accumulated similarity value in (1.23) corresponds to the desired global maximum of the similarity value in (1.20):

$$W(p_{\mathrm{opt}} \mid g_L, g_R) = W_{x_e}(v^{\mathrm{opt}}_{N-1}). \tag{1.24}$$
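The forward and backward passes of (1.22)–(1.24) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the experiments: the x-coordinate is stored doubled (`x2 = 2x`) so that the half-steps of (1.17) stay integer, every vertex of the first column is allowed to start a profile with zero accumulated similarity, and the local similarity `w` is a caller-supplied function. All function and variable names are hypothetical.

```python
import math

STATES = ("ML", "B", "MR")


def predecessors(v):
    """Predecessor sets omega(v) of (1.17); v = (x2, p, s) with x2 = 2*x."""
    x2, p, s = v
    if s == "ML":
        return [(x2 - 2, p, "B"), (x2 - 1, p - 1, "ML")]
    if s == "B":
        return [(x2 - 2, p, "B"), (x2 - 2, p, "MR"), (x2 - 1, p - 1, "ML")]
    if s == "MR":
        return [(x2 - 1, p + 1, "B"), (x2 - 1, p + 1, "MR")]
    raise ValueError("unknown state: %r" % (s,))


def reconstruct_profile(x2_range, p_range, w):
    """Return (path, total) maximizing the sum of w(v_prev, v) along the path."""
    x2_b, x2_e = x2_range
    p_min, p_max = p_range
    W = {}  # accumulated similarity W_x(v) of (1.22)
    t = {}  # stored backward transitions t(v) of (1.22)
    for x2 in range(x2_b, x2_e + 1):          # forward pass along the x-axis
        for p in range(p_min, p_max + 1):
            for s in STATES:
                v = (x2, p, s)
                if x2 == x2_b:
                    W[v] = 0.0                # assumed start condition
                    continue
                best, arg = -math.inf, None
                for vp in predecessors(v):
                    if vp in W:               # skip unreachable predecessors
                        cand = W[vp] + w(vp, v)
                        if cand > best:
                            best, arg = cand, vp
                if arg is not None:
                    W[v], t[v] = best, arg
    ends = [v for v in W if v[0] == x2_e]
    if not ends:
        raise ValueError("no vertex is reachable at the end of the range")
    v = max(ends, key=W.get)                  # first line of (1.23)
    total = W[v]                              # global maximum, as in (1.24)
    path = [v]
    while v in t:                             # backward pass of (1.23)
        v = t[v]
        path.append(v)
    path.reverse()
    return path, total
```

For instance, with a toy similarity that rewards only binocularly visible vertices at zero parallax, `reconstruct_profile((0, 8), (-2, 2), lambda vp, v: 1.0 if v[1] == 0 and v[2] == "B" else -1.0)` yields a flat profile of BVPs.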
By embedding the calculation of the corresponding marginals of (1.16) into this search, one can implement the compound decisions in (1.5) and (1.6) in a similar way.
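As a rough illustration of how the marginals of (1.16) can be accumulated during the same left-to-right pass, the sketch below sums, for every vertex, the marginals of its predecessors weighted by the transition probabilities. The doubled x-coordinate, the start distribution `start_prob`, and the transition function `trans_prob` are assumptions made for this example only.

```python
STATES = ("ML", "B", "MR")


def predecessors(v):
    """Predecessor sets omega(v) of (1.17); v = (x2, p, s) with x2 = 2*x."""
    x2, p, s = v
    if s == "ML":
        return [(x2 - 2, p, "B"), (x2 - 1, p - 1, "ML")]
    if s == "B":
        return [(x2 - 2, p, "B"), (x2 - 2, p, "MR"), (x2 - 1, p - 1, "ML")]
    if s == "MR":
        return [(x2 - 1, p + 1, "B"), (x2 - 1, p + 1, "MR")]
    raise ValueError("unknown state: %r" % (s,))


def propagate_marginals(x2_range, p_range, start_prob, trans_prob):
    """Marginal of every GPV-vertex by the successive summation of (1.16)."""
    x2_b, x2_e = x2_range
    p_min, p_max = p_range
    marg = {}
    for x2 in range(x2_b, x2_e + 1):
        for p in range(p_min, p_max + 1):
            for s in STATES:
                v = (x2, p, s)
                if x2 == x2_b:
                    marg[v] = start_prob(v)   # assumed: profiles start in the first column
                else:
                    # Sum over the predecessors that lie inside the search range.
                    marg[v] = sum(marg[vp] * trans_prob(vp, v)
                                  for vp in predecessors(v) if vp in marg)
    return marg
```

In a compound decision these marginals would be stored alongside the accumulated similarities, so the extra cost is one weighted sum per vertex.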
1.4.2 Regularizing heuristics

The similarity measure of (1.21), deduced from the probabilistic models of the surface and stereo images, does not overcome the inherent ambiguity of terrain reconstruction. Figs. 1.2 and 1.3 show that the visually least appropriate "pit-like" dotted profile in Fig. 1.3, even with no photometric distortions, competes successfully with other, much more natural profiles. This variant is excluded either by setting proper parameters of the geometric model of (1.8) or by using a simple heuristic based on the expected surface smoothness. The latter is rather straightforward: under equal signal similarity, the smoothest profile, that is, the one with the maximum number of BVPs, has to be chosen. This heuristic, easily implemented by dynamic programming, counts in favour of the solid profile in Fig. 1.2 with the five BVPs.

But there are visually less appropriate variants which cannot be excluded using only the numbers of the BVPs or the probability ordering of (1.8). Therefore, additional heuristics have to be introduced to obtain the terrains which agree best with the visual reconstruction. Such a heuristic can be based on the assumption that the more photometric transformations (even admissible ones) are required to convert one stereo image into the other in line with a given surface, the less appropriate is the surface. For example, the conversion of the epipolar line $g_L$ into the corresponding line $g_R$ in accord with the solid or the dashed profile in Fig. 1.2 results in two gaps, of size 2 and 1 or of size 4 and 1, respectively, to be interpolated. In the first case, the $g_L$-signals ("f", "h") and ("h", "k") have to be interpolated for fitting the $g_R$-signals ("f–g–h") and ("h–i–j–k"). In the second case, the $g_L$-signals ("a", "f") and ("f", "k") are compared to the $g_R$-signals ("a–d–f") and ("f–g–h–i–j–k"). The solid and the dashed profiles in Fig. 1.3 involve two gaps of size 2 or one gap of size 6, respectively.
The desired heuristic should take account of the signal matches after such a transformation. The latter heuristic is embedded in [Gimel'farb, 1996] into the MAP decision of (1.3) by summing two weighted similarity measures:

$$p_{\mathrm{opt}} = \arg\max_{p} \bigl\{ \alpha\, W(p \mid g_L, g_R) + (1 - \alpha)\, W_{\mathrm{reg}}(p \mid g_L, g_R) \bigr\}. \tag{1.25}$$

Here, $\alpha$ is a relative weight of the heuristic part ($0 \le \alpha \le 1$), $W(p \mid g_L, g_R)$ denotes the similarity measure of (1.21) between the epipolar lines $g_L$ and $g_R$ in the stereo images under a profile $p$, and $W_{\mathrm{reg}}(p \mid g_L, g_R)$ is the similarity measure between the two lines $g_L$ and $g_R$ matched in accord with the profile $p$. The second measure estimates the magnitude of deformations that transform one of the images into the other. The magnitude is given by a sum of square differences between the corresponding intensity changes along the profile $p$:
$$W_{\mathrm{reg}}(p \mid g_L, g_R) = \sum_{i=1}^{N-1} \bigl( \Delta_{L:i,b(i)} - \Delta_{R:i,b(i)} \bigr)^2. \tag{1.26}$$
Here, $b(i)$ is the closest BVP that precedes the GPV-vertex $v_i$ along the profile $p$, and $\Delta_{L:i,k} = g_L(x_{L,i}, y) - g_L(x_{L,k}, y)$ and $\Delta_{R:i,k} = g_R(x_{R,i}, y) - g_R(x_{R,k}, y)$ denote the gray level differences between the two pixels in each stereo image. The pixels correspond to the GPV-vertices $v_i$ and $v_k$, $k < i$, along the epipolar profile. The weighted similarity measure in (1.25) decides in favor of the profile giving the highest resemblance between the stereo images transformed one into the other, all other factors being equal. So, it tries to suppress the pit-like profile variants that may cause big differences between the corresponding parts of the transformed images.
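The sum in (1.26) can be evaluated for a given profile with a single pass that remembers the last BVP. In the sketch below a profile is a hypothetical list of triples `(x_left, x_right, is_bvp)` giving the pixel positions of each GPV-vertex in the two epipolar lines; vertices with no preceding BVP are skipped, which is an assumption of this example. The value returned is the deformation magnitude itself, so a smaller value means a closer resemblance; how its sign and scale enter the weighted sum of (1.25) is left to the caller.

```python
def regularizing_term(g_left, g_right, profile):
    """Sum of squared differences of intensity changes, as in (1.26).

    g_left, g_right -- gray values along the left and right epipolar lines
    profile         -- list of (x_left, x_right, is_bvp), ordered along the profile
    """
    total = 0.0
    last_bvp = None  # pixel positions (x_left, x_right) of the closest preceding BVP
    for x_l, x_r, is_bvp in profile:
        if last_bvp is not None:
            b_l, b_r = last_bvp
            d_left = g_left[x_l] - g_left[b_l]      # Delta_{L:i,b(i)}
            d_right = g_right[x_r] - g_right[b_r]   # Delta_{R:i,b(i)}
            total += (d_left - d_right) ** 2
        if is_bvp:
            last_bvp = (x_l, x_r)
    return total
```

Identical intensity changes in both lines give zero, so a profile that requires no photometric deformation is not penalized.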
1.4.3 Confidence of the DPM

Computational stereo does nothing more than match stereo images. Even if the geometric constraints and regularizing heuristics make it possible to eliminate some of the geometric ambiguities shown in Figs. 1.2 and 1.3, there still exist photometric ambiguities, that is, places with no signal variations to guide the matching. A BVP of a terrain is confident if it is represented by a specific visual pattern which gives no multiple good matches in a close vicinity of the corresponding points in the images. Only in this case can human visual reconstruction be used for validating the computed DPM or DSM. But if the detectable pattern is present only in a single image, due to partial occlusions, or is absent altogether because the signal values are almost equal, then the computed surface points have very low or even zero confidence. Such places can hardly be compared to the GCPs, even if the latter exist due to field surveys or have been found by photogrammetric visual reconstruction.

Let $S: C \to [0, S_{\max}]$ denote a confidence map for a DPM with a conditional confidence range $0 \le S(x, y) \le S_{\max}$. The confidence measure has to be derived from the image signals used for stereo matching: the more discriminative the image features, the more confident the reconstructed terrain point. Generally, the confidence measures reflect not only the image and terrain features but also the features of stereo matching. For example, in [Bolles et al., 1993] two confidence measures for a DPM obtained by a correlation-based local optimization are considered. The first measure is based on the difference $p(x, y) - p'(x, y)$ between the best-match disparity $p(x, y)$ in the DPM and the second-best-match disparity $p'(x, y)$ for the same planar position $(x, y) \in C$ in the DPM. The second measure exploits the ratio of the cross-correlation, giving the best match, to the autocorrelation threshold.
Experiments with these measures have shown that low-confidence terrain areas must be excluded both from stereo matching and from stereo evaluation. We restrict our consideration to a simple confidence measure which takes no account of the matching algorithm. This measure is more convenient for evaluating different stereo techniques because the confidence map is computed only from the initial stereo images and the obtained DPM. The basic photometric distortions in (1.13) are specified by a given range of relative changes of the gray level differences in both images: each gray level difference between two neighboring BVPs in the profile can be changed in each image to within this range. Therefore, in the simplest case, the gray level differences estimated for the BVPs in the reconstructed DPM may serve as the confidence measure: the higher the difference, the more definite the matching and the greater the confidence. Let $(b(x), y)$ denote the planar coordinates of the BVP that precedes a BVP with the planar coordinates $(x, y)$ in the reconstructed DPM $(p, g)$. Then $S(x, y) = |g(x, y) - g(b(x), y)|$. It should be noted that the "vertical" gray level differences take no part in the above confidences only because of the adopted independent line-by-line DPM reconstruction. The obtained confidence map makes it possible to separate the confident and non-confident parts of a DPM by simple thresholding.

A more elaborate approach involves a separate linear approximation of the gray levels in the stereo images that correspond to the BVPs before and after a current BVP. The obtained ∨- or ∧-shaped gray level approximations, and the variations of the image signals with respect to them, make it possible to roughly estimate a fiducial interval for the x-disparities in this point. The greater the interval, the smaller the confidence.
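The simple confidence measure $S(x, y) = |g(x, y) - g(b(x), y)|$ and its thresholding can be sketched per image row as follows. The data layout (one list of orthoimage gray values and one list of BVP flags per row) is assumed for this example, as is the convention that MVPs and the first BVP of a row receive zero confidence; the function names are hypothetical.

```python
def confidence_row(gray, is_bvp):
    """Confidence S(x, y) along one row of the reconstructed DPM.

    gray   -- estimated orthoimage gray values g(x, y) for the row
    is_bvp -- flags telling which positions of the row are BVPs
    """
    conf = [0.0] * len(gray)
    prev = None  # position b(x) of the preceding BVP in this row
    for x, (g, bvp) in enumerate(zip(gray, is_bvp)):
        if not bvp:
            continue              # MVPs keep zero confidence (assumed convention)
        if prev is not None:
            conf[x] = abs(g - gray[prev])
        prev = x
    return conf


def confident_mask(conf_map, threshold):
    """Mark the points whose confidence is not below the threshold."""
    return [[s >= threshold for s in row] for row in conf_map]
```

A row with gray values `[10, 10, 50, 50, 20]` whose third point is an MVP gives the confidences `[0, 0, 0, 40, 30]`, so a threshold of 35 keeps only the fourth point.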
1.5 Experimental results

Experiments with real aerial and ground stereo pairs (a few of them are presented in [Gimel'farb et al., 1992; Gimel'farb, 1996; Gimel'farb et al., 1996]) indicate that the symmetric dynamic programming approach gives dense DPMs which agree closely with the visually perceived surfaces. To exclude outliers caused by inexact epipolar geometry of the images or by low-contrast image regions, the reconstruction involves on-line median filtering over several adjacent profiles. In [Gimel'farb et al., 1992] the accuracy of a DPM of size 1124(x) × 450(y) points, reconstructed from a stereo pair of a highland region, was verified using 433 GCPs with x-parallaxes found by an analytic photogrammetric device. The profiles were reconstructed within the range $P = [p_{\min} = -50,\ p_{\max} = 50]$ in the profile-by-profile mode. The cumulative histogram of absolute deviations between the computed and manually perceived DPMs in the GCPs is shown in Table 1.1, the mean absolute deviation being 1.47 of the x-parallax unit.

Table 1.1: Absolute deviations of x-parallaxes for the 433 GCPs in the reconstructed DPM

Deviation              0    ≤1    ≤2    ≤3    ≤4    ≤8
Number of the GCPs    98   257   356   403   421   433
% of the GCPs         23    60    82    93    97   100

Because of difficulties in obtaining the GCPs for the available stereo pairs, we have used in most cases the simplified performance evaluation method of [Hsieh et al., 1992]. This method checks the computed x-parallaxes against the ones which are visually found for some arbitrarily chosen characteristic terrain points. In our experiments, generally about 60-70% of the computed x-parallaxes are within the ±1 range with respect to the chosen control values. The most significant errors occur in shaded, vastly occluded, or highly textured terrain regions.

For example, Figs. 1.6 and 1.7 show the initial stereo pairs "Pentagon" and "Mountain", range images of the reconstructed DPMs, and the thresholded confidence maps. The threshold is set so as to choose 20% of all the DPM points as the confident ones, that is, the points whose confidence $S(x, y)$ is not below the threshold. These latter are shown by black pixels in Fig. 1.6d and Fig. 1.7d. The range images in Fig. 1.6c and Fig. 1.7c represent the x-parallax map by grayscale coding (from the white nearest pixels to the most distant black pixels). The x-parallax map is reduced to the left stereo image, that is, to the lattice $L$.

Figure 1.6: (a), (b) the 512 × 512 stereo pair "Pentagon", (c) the reconstructed DPM, and (d) the thresholded confidence map.

Figure 1.7: (a), (b) the 700 × 1100 stereo pair "Mountain", (c) the reconstructed DPM, and (d) the thresholded confidence map.

The DPMs are obtained by using the similarity measure of (1.25) with the following basic parameters: the relative weight in (1.25) equal to 0.5 (variations of the weight in a broad range from 0.2 to 0.8 have almost no effect on the resulting DPMs), the range $E = [0.2, 0.8]$ of the normed transfer factor in (1.19), and the residual square deviation $d_0 = 100$ for the MVPs.

On the whole, the obtained range images correspond rather closely to the visual depth perception of these scenes, and the chosen confident points represent the main topographic landmarks of these terrains. But there are a few local depth errors: mainly along the upper right edge of the Pentagon building in Fig. 1.6 and in a small textured region at the upper left part of Fig. 1.7. The errors in Fig. 1.6 are caused by a strip of the wall along this edge which is seen in the right image but is occluded (and thus has no stereo correspondence) in the left image. This strip is grouped in the reconstructed profiles with the similar neighboring details of the roof which are observed in both images. These errors illustrate the main drawbacks of the simplified symmetric binocular stereo used in the experiments: (i) it takes no account of the possibly inexact epipolar geometry of the images; (ii) it takes no account of y-interactions between the signals because of the independent profile-by-profile reconstruction mode; (iii) the regularizing heuristics do not fully overcome all the errors caused by the MVPs that closely match the neighboring BVPs. Nonetheless, in spite of the local errors, the overall quality of the reconstruction of the observed surfaces is fairly good. Also, this dynamic programming approach is considerably faster than many other algorithms.
Table 1.2 shows the processing times for reconstructing the same scene "Pentagon" by our dynamic programming approach and by the algorithms of [Cochran & Medioni, 1991] and [Chen et al., 1994].

Table 1.2: Processing time for reconstructing the DPM "Pentagon"

Algorithm        Gimel'farb [1991, 1996]             Cochran & Medioni [1991]       Chen et al. [1994]
Computer (OS)    HiNote VP Series 500 (Windows-95)   Symbolics 3650 (Genera 7.2)    IBM RS/6000 (N/A)
Language         MS Visual C++                       Lisp                           N/A
DPM size         512 × 512                           512 × 512                      256 × 256
Time             38 s                                22 h 44 min 16 s               20 min
These and other experimental results show that efficient and fast dynamic programming terrain reconstruction can be obtained by integrating the theoretical image/surface models, the optimal statistical decision rules, and the regularizing heuristics which take account of the ill-posedness of the problem.
Bibliography

F. Ackermann. Techniques and strategies for DEM generation. In C. Greve, editor, Digital Photogrammetry: An Addendum to the Manual of Photogrammetry, pages 135-141, Bethesda, 1996. ASPRS.

H. H. Baker. Surfaces from mono and stereo images. Photogrammetria, 39: 217-237, 1984.

H. H. Baker and T. O. Binford. Depth from edge and intensity based stereo. In Proceedings of the 7th International Joint Conference on Artificial Intelligence, August 24-28, 1981, Vancouver, Canada, volume 2, pages 631-636.

S. T. Barnard and M. A. Fischler. Computational stereo. ACM Computing Surveys, 14: 553-572, 1982.

R. C. Bolles, H. H. Baker, and M. J. Hannah. The JISCT stereo evaluation. In Proceedings of the DARPA Image Understanding Workshop, April 18-21, 1993, Washington, D.C., pages 263-274. San Mateo, 1993. Morgan Kaufmann.

T.-Y. Chen, W. N. Klarquist, and A. C. Bovik. Stereo vision using Gabor wavelets. In Proceedings of the IEEE Southwest Symposium on Image Analysis and Interpretation, April 21-24, 1994, Dallas, Texas, pages 13-17. Los Alamitos, 1994. IEEE Computer Society Press.

S. T. Cochran and G. Medioni. 3D surface description from binocular stereo. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14: 981-994.

I. J. Cox, S. L. Hingorani, and S. B. Rao. A maximum likelihood stereo algorithm. Computer Vision and Image Understanding, 63: 542-567, 1996.

U. R. Dhond and J. K. Aggarwal. Structure from stereo - a review. IEEE Transactions on Systems, Man, and Cybernetics, 19: 1489-1510, 1989.

W. Förstner. Image matching. In [Haralick & Shapiro, 1993], Chapter 16, pages 289-378.

G. L. Gimel'farb. Symmetric approach to the problem of automating stereoscopic measurements in photogrammetry. Kibernetika, (2): 73-82, 1979 [In Russian; English translation: Cybernetics, 15(2): 235-247, 1979].

G. L. Gimel'farb. Intensity-based computer binocular stereo vision: signal models and algorithms. International Journal of Imaging Systems and Technology, 3: 189-200, 1991.

G. L. Gimel'farb. Regularization of low-level binocular stereo vision considering surface smoothness and dissimilarity of superimposed stereo images. In C. Arcelli, L. P. Cordella, and G. Sanniti di Baja, editors, Aspects of Visual Form Processing, pages 231-240, Singapore, 1994. World Scientific.

G. L. Gimel'farb. Intensity-based bi- and trinocular stereo vision: Bayesian decisions and regularizing assumptions. In Proceedings of the 12th IAPR International Conference on Pattern Recognition, October 9-13, 1994, Jerusalem, Israel, volume 1, pages 717-719. Los Alamitos, 1994. IEEE Computer Society Press.

G. L. Gimel'farb. Symmetric bi- and trinocular stereo: tradeoffs between theoretical foundations and heuristics. In W. Kropatsch, R. Klette, and F. Solina, editors, Theoretical Foundations of Computer Vision, Computing Supplement 11, pages 53-71. Wien et al., 1996. Springer.

G. L. Gimel'farb, V. M. Krot, and M. V. Grigorenko. Experiments with symmetrized intensity-based dynamic programming algorithms for reconstructing digital terrain model. International Journal of Imaging Systems and Technology, 4: 7-21, 1992.

G. L. Gimel'farb, V. I. Malov, V. B. Gayda, M. V. Grigorenko, B. O. Mikhalevich, and S. V. Oleynik. Digital photogrammetric station "Delta" and symmetric intensity-based stereo. In Proceedings of the 13th IAPR International Conference on Pattern Recognition, August 25-29, 1996, Vienna, Austria, volume III, pages 979-983. Los Alamitos, 1996. IEEE Computer Society Press.

G. L. Gimel'farb, V. B. Marchenko, and V. I. Rybak. An algorithm for automatic identification of identical sections on stereopair photographs. Kibernetika, (2): 118-129, 1972 [In Russian; English translation: Cybernetics, 8(2): 311-322, 1972].

E. Gülch. Results of test on image matching of ISPRS WG III/4. ISPRS Journal of Photogrammetry and Remote Sensing, 46: 1-18, 1991.

M. J. Hannah. Digital stereo image matching techniques. International Archives on Photogrammetry and Remote Sensing, 27: 280-293, 1988.

R. M. Haralick and L. G. Shapiro. Computer and Robot Vision. Vol. 2. Reading, MA: Addison-Wesley, 1993.

U. V. Helava. Object space least square correlation. Photogrammetric Engineering and Remote Sensing, 54: 711-714, 1988.

U. V. Helava and W. E. Chapelle. Epipolar scan correlation. Bendix Technical Journal, 5: 19-23, 1972.

Y. C. Hsieh, D. M. McKeown, and F. P. Perlant. Performance evaluation of scene registration and stereo matching for cartographic feature extraction. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14: 214-238, 1992.

T. Kanade and M. Okutomi. A stereo matching algorithm with an adaptive window: theory and experiment. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16: 920-932, 1994.

V. R. Kyreitov. Inverse Problems of Photometry. Novosibirsk: Computing Center of the Academy of Sciences of the USSR, Siberian Branch, 1983. [In Russian].

S. A. Lloyd. Stereo matching using intra- and inter-row dynamic programming. Pattern Recognition Letters, 4: 273-277, 1986.

K.-O. Ludwig, H. Neumann, and B. Neumann. Local stereoscopic depth estimation. Image and Vision Computing, 12: 16-35, 1994.

J. Marroquin, S. Mitter, and T. Poggio. Probabilistic solution of ill-posed problems in computational vision. Journal of the American Statistical Association, 82: 76-89, 1987.

D. F. Maune. Introduction to digital elevation models (DEM). In C. Greve, editor, Digital Photogrammetry: An Addendum to the Manual of Photogrammetry, pages 131-134, Bethesda, 1996. ASPRS.

Y. Ohta and T. Kanade. Stereo by intra- and inter-scan line search using dynamic programming. IEEE Transactions on Pattern Analysis and Machine Intelligence, 7: 139-154, 1985.

T. Poggio, V. Torre, and C. Koch. Computational vision and regularization theory. Nature, 317: 314-319, 1985.

G. V. S. Raju, T. Binford, and S. Shekar. Stereo matching using Viterbi algorithm. In Proceedings of the DARPA Image Understanding Workshop, February 23-25, 1987, Los Angeles, CA, volume 2, pages 766-776. San Mateo, 1987. Morgan Kaufmann.

F. A. Scarano and G. A. Brumm. A digital elevation data collection system. Photogrammetric Engineering and Remote Sensing, 42: 489-496, 1976.

A. F. Schenk. Automatic generation of DEMs. In C. Greve, editor, Digital Photogrammetry: An Addendum to the Manual of Photogrammetry, pages 145-150, Bethesda, 1996. ASPRS.

J. Weng. Image matching using the windowed Fourier phase. International Journal of Computer Vision, 11: 211-236, 1993.

A. L. Yuille, D. Geiger, and H. Bülthoff. Stereo integration, mean field theory and psychophysics. In O. Faugeras, editor, Computer Vision - ECCV 90 (Lecture Notes in Computer Science 427), pages 73-82, Berlin, 1990. INRIA, Springer-Verlag.