Disparity Estimation Based on Bayesian Maximum A Posteriori (MAP ...

IEICE TRANS. FUNDAMENTALS, VOL.E82–A, NO.7 JULY 1999


PAPER

Disparity Estimation Based on Bayesian Maximum A Posteriori (MAP) Algorithm∗

Sang Hwa LEE†, Jong-Il PARK††, Nonmembers, Seiki INOUE†††, and Choong Woong LEE†, Members

SUMMARY In this paper, a general formula for disparity estimation based on the Bayesian Maximum A Posteriori (MAP) algorithm is derived and implemented with simplified probabilistic models. The formula is a generalized probabilistic diffusion equation based on the Bayesian model, and it can be implemented in different forms corresponding to the probabilistic models assumed for the disparity neighborhood system, or configuration. The probabilistic models are independence and similarity among the neighboring disparities in the configuration. The independence model preserves the discontinuity at object boundaries, and the similarity model preserves the continuity, or high correlation, of the disparity distribution. In the experiments, the proposed algorithm showed good estimation performance. This result shows that the derived formula generalizes probabilistic diffusion based on the Bayesian MAP algorithm for disparity estimation. Also, the proposed probabilistic models are reasonable and approximate the pure joint probability distribution very well, while decreasing the computations from $O(n(D)^4)$ for the generalized formula to $O(n(D))$.
key words: disparity estimation, Bayesian maximum a posteriori (MAP) algorithm, Markov random field, plane configuration model, probabilistic diffusion

1. Introduction

Stereo matching estimates the disparities between stereo images taken from different viewpoints. It recovers the three dimensional scene structure, in the form of a disparity map, from a set of two dimensional stereo images. Figure 1 shows the concept of stereo images and disparity for two different viewpoints. An object point is projected onto the same rectified plane through two cameras. The two points, $P_L$ and $P_R$, are the projections of the object point on the rectified plane. The disparity is the difference between the distances of the two projected points from their respective focusing points, $F_L$ and $F_R$. In the epipolar geometry of Fig. 1, we need to consider only a one dimensional disparity search.

Fig. 1 Stereo image and disparity. This stereogram satisfies the one dimensional epipolar geometry.

Manuscript received June 29, 1998.
Manuscript revised November 18, 1998.
† The authors are with Institute of New Media and Commun., School of Electrical Eng., Seoul National University, in Seoul, Korea.
†† The author is with School of Electrical Eng., the Hanyang University in Seoul, Korea.
††† The author is with NHK Science and Technical Research Labs., Tokyo, 157–8510 Japan.
∗ This paper was supported in part by ’95 SPECIAL FUND for UNIVERSITY RESEARCH INSTITUTE, Korea Research Foundation.

In order to estimate dense disparities, various algorithms have been proposed. The sum of squared differences (SSD) algorithm searches for the disparity on a region basis. It calculates the squared intensity differences between the stereo images, aggregates them over a region (block), and selects the disparity that minimizes the summed differences. This algorithm is simple to implement and lends itself to various adaptive postprocessing steps that reduce erroneous disparities. However, it depends strongly on the block size: a large block blurs boundaries and discards detailed areas, whereas a small block gives noisy estimates. Moreover, it is difficult to find the optimal block size for given stereo images.

The gradient based algorithm [14], [15] is based on the idea that the same intensity or color in an image may look different depending on the viewing angle or position. Because of this mismatch between the human visual system and the camera imaging system, a measure based on intensity or color differences may be incorrect at some viewing angles or positions. These algorithms therefore measure the match with gradient quantities and choose the disparity whose intensity gradients are most similar in the stereo images. With this approach, it is difficult to find the correct disparity in regions whose intensity varies uniformly.

The diffusion based algorithm [11], [12] is based on the diffusion equation of an energy function. That is to say, the derivative of the energy function in the time domain is equal to the Laplacian of the energy function in the spatial domain. With this diffusion equation, the energy function is diffused iteratively, and then the disparity of minimum energy is selected. Since this algorithm is based on linear functions, some nonlinear adaptive variants have been proposed. There are also many newer algorithms that combine the above individual algorithms adaptively in order to compensate for the disadvantages of each. The strip-mesh based algorithm [16] uses triangular patches at the object boundaries to overcome the limitations of the SSD. The phase based algorithm [17] uses the phase difference between the stereo images. After Fourier transformation, an image signal has two components, magnitude and phase. In general, the phase component is more important in image processing than the magnitude [1]. This algorithm uses the phase difference between the stereo images to estimate the disparity.

In this paper, we consider the Bayesian Maximum A Posteriori (MAP) algorithm for disparity estimation [4]. This algorithm is based on Markov Random Field (MRF) theory and some probabilistic models. Given a stereo image set, it diffuses the energy space iteratively with probabilistic models and searches for the disparity with maximum probability, or minimum energy, in the converged energy space. It is similar to the diffusion based algorithm in that both diffuse the energy space and search for the disparity with minimum energy. The difference is that the diffusion based algorithm diffuses the energy space with the diffusion equation of mechanics, while the Bayesian model does so with Markov and probabilistic models.
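As a point of reference for the SSD baseline described earlier in this section, the following is a minimal sketch of a one dimensional block-matching search on a single scanline. The function name `ssd_disparity`, the border clamping, and the matching convention (pixel `j` of the reference against pixel `j + d` of the other image) are assumptions made for illustration; they are not taken from the paper's implementation.

```python
def ssd_disparity(ref, other, max_disp, half_window):
    """For each pixel j of `ref`, return the disparity d in [0, max_disp]
    that minimizes the windowed sum of squared differences."""
    n = len(ref)
    result = []
    for j in range(n):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disp + 1):
            cost = 0.0
            for k in range(-half_window, half_window + 1):
                a = min(max(j + k, 0), n - 1)      # clamp at the borders (assumption)
                b = min(max(j + k + d, 0), n - 1)
                cost += (ref[a] - other[b]) ** 2
            if cost < best_cost:                   # keep the first minimum found
                best_d, best_cost = d, cost
        result.append(best_d)
    return result


# Hypothetical scanlines: `other` is `ref` shifted right by two pixels.
ref = [0, 0, 5, 9, 3, 0, 0, 0]
other = [0, 0, 0, 0, 5, 9, 3, 0]
disparities = ssd_disparity(ref, other, max_disp=3, half_window=1)
```

The dependence on `half_window` is the block-size sensitivity discussed above: the window is fixed for every pixel, regardless of where the object boundaries lie.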
This paper rigorously derives the generalized formula using the joint probability distribution of the neighboring disparities in the MRF model, and proposes new probabilistic models that improve the performance with reduced computation. The paper consists of six sections. In Sect. 2, we survey the basic theory of Bayesian MAP estimation for stereo matching. The general formula is derived in Sect. 3, and the proposed algorithm based on the plane configuration model is described in Sect. 4. The experimental results are shown in Sect. 5, and finally, we summarize the proposed algorithm and the experimental results in Sect. 6.

2. Bayesian MAP Algorithm for Disparity Estimation

In stereo matching, the starting point for estimating the disparity map is to maximize the conditional probability of the disparity given the stereo images,

$$\text{maximize } p(D \mid \{I\}), \qquad \{I\} = \{I_R, I_K\}. \tag{1}$$

In Eq. (1), $\{I\}$ is the stereo image set, which is composed of two or more images taken from different viewpoints. In the stereo image set, $I_R$ is the reference image, and $I_K$ are the other images. The symbol $D$ is the disparity map of the stereo image set and describes the three dimensional structure. In maximizing the conditional probability, the main problem is how to calculate it given only the stereo image set. We need a way to transform an available measure into a probability measure. The Gibbs distribution is a probability distribution whose random variable is related to an energy [4], [8],

$$p(x) = C \exp\left\{-\frac{E(x)}{T}\right\}. \tag{2}$$

In the Gibbs distribution, $E(x)$ is the error energy of the random variable $x$ and can be obtained from some energy function of $x$; $C$ is a normalizing constant and $T$ is a temperature parameter. The Gibbs distribution thus acts as a function that transforms energy into probability. With it, we can relate the probability space to the energy space and construct a one-to-one mapping between the two spaces. By transforming Eq. (1) with the Gibbs distribution,

$$p(D \mid \{I\}) \propto \exp\{-E(D \mid \{I\})\}, \tag{3}$$

we can redefine the Bayesian model of stereo matching as the minimization of $E(D \mid \{I\})$. Since we can generally construct measures that are related to the error energy, we are able to maximize the probability by minimizing the error energy, which is transformed into probability by the Gibbs distribution. By applying Bayes' rule to the conditional probability $p(D \mid \{I\})$, we obtain an energy equation composed of two energy measures, an intensity based measure and a disparity based measure,

$$E(D \mid \{I\}) \propto E(\{I\} \mid D) + E(D). \tag{4}$$
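The energy-to-probability mapping of Eqs. (2) and (3) can be illustrated with a short sketch. The function name `gibbs_probabilities` and the choice to fix the constant $C$ by normalizing over a finite list of candidates are assumptions for illustration only.

```python
import math


def gibbs_probabilities(energies, temperature=1.0):
    """Map a list of energies to probabilities p ~ exp(-E / T), as in Eq. (2).
    The constant C is fixed by requiring the probabilities to sum to one."""
    e_min = min(energies)  # shifting by the minimum cancels in the ratio and avoids underflow
    weights = [math.exp(-(e - e_min) / temperature) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]


# Lower energy maps to higher probability, so minimizing E maximizes p.
probs = gibbs_probabilities([0.0, 1.0, 2.0])
```

Because the mapping is monotone, the candidate with minimum energy is also the candidate with maximum probability, which is why the MAP problem can be restated as energy minimization.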

In Eq. (4), the first term is the photometric error energy due to the intensity differences between the stereo images given the disparity map, and it can be calculated as follows,

$$E(\{I\} \mid D) = \sum_{I_K \in \{I\}} \rho_i(I_R - I_K(D)), \tag{5}$$

where $I_K(D)$ is $I_K$ translated by the disparity map $D$, and $\rho_i(\cdot)$ is the measure function used to construct the energy; $\rho_i(x) = x^2$ is the usual choice. This paper uses the contaminated Gaussian energy measure, which generalizes the energy measure,

$$\rho_i(x) = -\ln\left[(1 - \epsilon_i)\exp\left(-\frac{x^2}{2\sigma_i^2}\right) + \epsilon_i\right]. \tag{6}$$

By adjusting the parameters $\epsilon_i$ and $\sigma_i$, we can obtain various results with respect to the characteristics

LEE et al: DISPARITY ESTIMATION BASED ON BAYESIAN MAXIMUM A POSTERIORI (MAP) ALGORITHM


of stereo images. The main purpose of using the contaminated Gaussian energy measure instead of the square function is to limit the disparity range, so that the algorithm diffuses the disparity space in relation to the correlation of the neighboring disparities. A disparity that differs from the neighboring disparities is diffused with a uniform and minimal probabilistic weight. This diffusion makes the disparity distribution continuous and raises the correlation of neighboring disparities. Diffusion based on the contaminated Gaussian energy measure is much more stable and converges faster than diffusion based on the square function, and the above parameters are very important to the stability and convergence. As the probabilistic diffusion is iterated, the probabilistic weights of all disparities except one become uniform and minimal. This is the reason that the proposed diffusion is stable and rapid.

The second term in Eq. (4) is the energy measure of the disparity map. This measure is based on the assumption that the disparity is distributed continuously in the image plane. That is to say, the disparity at a point will be similar to those of its adjacent neighbors. The energy $E(D)$ increases as the difference between adjacent disparities increases. This energy term plays an important role in regularizing the disparity distribution in the disparity map. Its measure function is the same contaminated Gaussian energy measure as for the intensity based measure. However, in the case of the disparity based measure, the contaminated Gaussian has an important effect on the convergence of the energy space and on the boundary spreading problem in the estimated disparity map, through the parameters $\epsilon_d$ and $\sigma_d$. The parameter $\epsilon_d$ is related to the maximal saturated energy level. As it approaches zero, the saturated energy level of the disparity based energy measure increases, and this measure is weighted more heavily than the intensity based energy measure. In other words, the disparity distribution is treated as a more important factor in estimating the disparity than the intensity difference between the stereo images. This relation between the two energy measures makes exact estimation difficult, depending on the image characteristics. The parameter $\sigma_d$ is related to the range of the probabilistic diffusion. As it increases, many disparities over a wide range have high correlation, or probability. This means that there are many candidate disparities in the probabilistic diffusion, and the possibility of estimating the disparity incorrectly increases. On the other hand, as $\sigma_d$ decreases, a low probabilistic weight may be assigned even to similar disparities in the neighborhood. So, it is important to choose appropriate values in order to improve the estimation and the speed of convergence of the probabilistic diffusion process.
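The saturation behavior of the contaminated Gaussian measure of Eq. (6) can be checked numerically. The sketch below is illustrative; the function name `contaminated_gaussian` is an assumption, and the parameter values are the ones reported in Sect. 5 ($\sigma_d = 0.4$, $\epsilon_d = 0.01$, $\sigma_i = 5$, $\epsilon_i = 0.1$).

```python
import math


def contaminated_gaussian(x, sigma, eps):
    """Energy measure of Eq. (6): -ln[(1 - eps) * exp(-x^2 / (2 sigma^2)) + eps]."""
    return -math.log((1.0 - eps) * math.exp(-x * x / (2.0 * sigma * sigma)) + eps)


# Disparity based measure: zero at x = 0, saturating at -ln(eps_d) for large |x|.
rho_d_at_zero = contaminated_gaussian(0.0, sigma=0.4, eps=0.01)
rho_d_far = contaminated_gaussian(100.0, sigma=0.4, eps=0.01)
saturation_d = -math.log(0.01)

# Intensity based measure saturates at the lower level -ln(eps_i).
saturation_i = -math.log(0.1)
```

For large $|x|$ the Gaussian term vanishes and the measure levels off at $-\ln \epsilon$, so a smaller $\epsilon_d$ raises the saturated energy level, which matches the weighting effect described above.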

3. Formulation of MRF Model

Before deriving the general formula at a position $(i, j)$, we first need to introduce the concept of the energy space in stereo matching. As in the diffusion based algorithm, the pointwise intensity error is calculated at each image point and at each disparity over a certain disparity range. This error calculation step constructs a three dimensional energy space. At each point $(x, y, d)$, the photometric error is precalculated and then iteratively diffused with the probabilistic models. The intensity based error is diffused together with the disparity based error energy according to the probabilistic models, and it converges to a stable energy space. We then find the disparity of minimum energy over the disparity range in the converged energy space.

3.1 Formulation

Let the disparity at the position $(i, j)$ be $d_{ij}$. We consider the pointwise energy $E(d_{ij})$ in the energy space. According to Markov Random Field (MRF) theory [4], [8], it is possible to estimate the disparity at a position if all of the joint probability distributions of the neighborhood disparities are known in advance. We rely on MRF theory to calculate the disparity based energy between neighborhood disparities. Therefore, the probability to be maximized becomes the conditional probability given the stereo image set $\{I\}$ and the set of neighborhood disparities. Let the configuration of neighborhood disparities be $N$, and let the disparity vector $d_N$ consist of the disparity elements in the configuration $N$. The problem of stereo matching based on MRF theory is then

$$\text{minimize } E(d_{ij} \mid \{I\}, d_N), \quad \text{or,} \quad \text{maximize } p(d_{ij} \mid \{I\}, d_N). \tag{7}$$

As described in Eq. (4), the energy to be minimized is composed of two measures. One is the error energy measure in the image plane, and the other is the regularization measure in the disparity space. The former is the energy of the error signal given the disparity set and is characterized by the intensities of the stereo images and the disparity map. The latter is an energy function that accounts for the similarity between neighborhood disparities, based on the continuous distribution of disparity in the image plane. The intensity based error $E_0(d_{ij})$ is calculated from the stereo image set $\{I\}$ and the disparity $d_{ij}$,

$$E_0(d_{ij}) = \sum_{I_K \in \{I\}} \rho_i(I_R(i, j) - I_K(i, j + d_{ij})). \tag{8}$$
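The construction of the initial energy space from Eq. (8) can be sketched for a single scanline as follows. The names `rho_i` and `initial_energy` are hypothetical, the one dimensional layout is a simplification, and the wrap-around at the image border is an assumption (the paper only mentions that its implementation uses a modulo operation).

```python
import math


def rho_i(x, sigma=5.0, eps=0.1):
    """Contaminated Gaussian intensity measure of Eq. (6)."""
    return -math.log((1.0 - eps) * math.exp(-x * x / (2.0 * sigma * sigma)) + eps)


def initial_energy(ref, others, n_disp):
    """Return E0[j][d] = sum over the other scanlines of rho_i(ref[j] - other[j + d])."""
    n = len(ref)
    energy = []
    for j in range(n):
        row = []
        for d in range(n_disp):
            total = 0.0
            for other in others:
                total += rho_i(ref[j] - other[(j + d) % n])  # modulo wrap at the border (assumption)
            row.append(total)
        energy.append(row)
    return energy


# Hypothetical scanlines: `other` is `ref` circularly shifted by two pixels.
ref = [10, 10, 50, 90, 30, 10, 10, 10]
other = ref[-2:] + ref[:-2]
e0 = initial_energy(ref, [other], n_disp=4)
```

Each row `e0[j]` is one column of the energy space along the disparity axis; taking its argmin directly gives the non-diffused initial disparity estimate.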

In this paper and our experiments, we use stereo images that satisfy an epipolar geometry, so we have to consider only one coordinate component in each pair of images to be compared. With the two measures, we can construct the energy function at $d_{ij}$ in the energy space based on Eq. (4),

$$E(d_{ij} \mid \{I\}, d_N) = E_0(d_{ij}) + \sum_{d_n \in N} \rho_d(d_{ij} - d_n). \tag{9}$$

In Eq. (9), since the image set $\{I\}$ is given and fixed, we need to consider only the configuration condition in the probabilistic manipulations. In other words, there is no loss of generality in dealing with the conditional probability $p(d_{ij} \mid d_N)$. In order to obtain the probability distribution from the energy function, we apply the Gibbs distribution to the energy equation,

$$p(d_{ij} \mid d_N) \propto p_0(d_{ij}) \times \prod_{d_n \in N} \exp\{-\rho_d(d_{ij} - d_n)\}. \tag{10}$$

By probability theory, the probability $p(d_{ij})$ can be obtained by summing the above conditional probability over the joint probability distribution of all configurations in the neighborhood system,

$$p(d_{ij}) = \sum_{\text{all } d_N} p(d_{ij} \mid d_N)\, p(d_N). \tag{11}$$

In Eq. (11), $p(d_N)$ is the joint probability distribution of the disparities in the configuration. The symbol "all $d_N$" denotes all configurations of disparities available at $d_{ij}$. These configurations may change with the position of $d_{ij}$ in the energy space. Let $S$ denote the set of all $d_N$, that is, the superset consisting of all configurations of neighboring disparities with respect to $d_{ij}$. Also, let $n(S)$ be the number of elements in the superset $S$. Combining Eq. (11) with Eq. (10),

$$p(d_{ij}) \propto p_0(d_{ij}) \times \sum_{N \in S} \left[ p(d_N) \prod_{d_n \in N} \exp\{-\rho_d(d_{ij} - d_n)\} \right]. \tag{12}$$

Equation (12) can also be transformed into an energy function as before,

$$E(d_{ij}) = E_0(d_{ij}) - \ln \sum_{N \in S} p(d_N) \prod_{d_n \in N} \exp\{-\rho_d(d_{ij} - d_n)\}. \tag{13}$$

Equation (13) is the Bayesian formula based on the MRF model for disparity estimation. In the MRF model, the neighboring disparities are weighted by the joint probability distribution $p(d_N)$. The derived algorithm thus uses not only the differences between the neighboring disparities, but also the joint probability distribution as a weighting factor. The probabilistic weighting can be regarded as an adaptive-windowed SSD, and it gives the proposed algorithm its good performance. The energy space is iteratively diffused by the formula, and the most accurate disparity is found when the energy space has converged. The distributions of the configurations play a dominant role in the diffusion process, and estimating these distributions is an important problem in practical implementations.

3.2 Analysis of the Formulation

A bound is a criterion that guarantees stable convergence of the iteration process. If the iteration proceeds within some bounds, the iterated values do not increase without limit. The derived formula diffuses the energy space iteratively and should make the space stable as the diffusion is iterated. The purpose of the bounds analysis is to obtain bounds that guarantee the stability of the diffusion. In addition, the bounds make it easier to examine how the formula diffuses the energy space and what probabilistic conditions enhance the diffusion. We manipulate Eq. (13) using two inequalities [6] to obtain the bounds. The first is the inequality between the arithmetic mean and the geometric mean,

$$\sum_{k=1}^{K} a_k \ge K \left( \prod_{k=1}^{K} a_k \right)^{1/K}, \qquad \text{for all } a_k > 0. \tag{14}$$

Since each term of the summation in Eq. (13) is positive, it is possible to apply this inequality to the general formula. Manipulating Eq. (13) based on Eq. (14), we obtain the inequality

$$\ln \sum_{N \in S} p(d_N) \prod_{d_n \in N} \exp\{-\rho_d(d_{ij} - d_n)\} \ge \ln n(S) + \frac{1}{n(S)} \sum_{N \in S} \left[ \ln p(d_N) - \sum_{d_n \in N} \rho_d(d_{ij} - d_n) \right]. \tag{15}$$

In order to calculate $n(S)$, define the disparity range set, $D$, as the set of all possible disparities, and the geometry set of the configurations, $G$, which consists of the geometric structures of the neighborhood in the two dimensional image plane. Figure 2 shows various geometries of the neighborhood configurations. As mentioned above, it is possible to vary the geometry of the configuration as well as the disparities in the configuration. Then $n(S)$ is calculated as

$$n(S) = \sum_{k=1}^{n(G)} [n(D)]^{n(N_k)}, \tag{16}$$
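As a worked instance of Eq. (16), the following sketch counts configurations for a given disparity range and list of neighborhood sizes. The function name `count_configurations` is hypothetical; the disparity range of 8 values (0 to 7 pixels) and the four-neighbor geometry are the settings reported in the experiments of Sect. 5, while the second, eight-neighbor geometry is only an illustrative assumption.

```python
def count_configurations(n_disp, neighbors_per_geometry):
    """Eq. (16): n(S) = sum over geometries k of n(D) ** n(N_k)."""
    return sum(n_disp ** n_k for n_k in neighbors_per_geometry)


# One geometry with four neighbors and 8 candidate disparities.
single_geometry = count_configurations(8, [4])

# Two geometries (four and eight neighbors) add their counts.
two_geometries = count_configurations(8, [4, 8])
```

Even the smallest first-order neighborhood already yields $8^4 = 4096$ configurations per energy-space point, which is the $O(n(D)^4)$ cost that the later sections aim to reduce.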

where $N_k$ is the configuration with the $k$-th geometry in $G$. By combining Eq. (13) with Eq. (14), we obtain the general inequality of the Bayesian model for stereo matching as follows,

$$E(d_{ij}) \le E_0(d_{ij}) - \ln n(S) - \frac{1}{n(S)} \sum_{N \in S} \left[ \ln p(d_N) - \sum_{d_n \in N} \rho_d(d_{ij} - d_n) \right]. \tag{17}$$

Equation (17) shows the upper bound of the energy at each point in the energy space. In Eq. (17), as the difference of disparity in the term $\rho_d(d_{ij} - d_n)$ increases, the upper bound of the energy $E(d_{ij})$ increases, and the corresponding probability decreases. This means that the formula smooths the neighborhood disparities and makes an isolated disparity similar to its neighborhood disparities. And, if the term $p(d_N)$ has a large value, the upper bound of the energy decreases and the probability increases. Therefore, we should choose the configuration so as to make the joint probability distribution as large as possible. This paper will implement the general formula from the point of view of how to choose the configuration in order to maximize the estimated joint distribution.

Fig. 2 Examples of geometry of configurations. The empty circles are neighboring disparities for the centered disparity to be estimated.

Now, we apply the Hölder inequality to Eq. (13). For a finite sum, the inequality is

$$\sum_{i=1}^{n} |x_i y_i| \le \left( \sum_{i=1}^{n} |x_i|^{\alpha} \right)^{1/\alpha} \left( \sum_{i=1}^{n} |y_i|^{\beta} \right)^{1/\beta}, \qquad \text{for } \frac{1}{\alpha} + \frac{1}{\beta} = 1, \quad 1 < \alpha < \infty. \tag{18}$$

As in the analysis of the upper bound, we use the Hölder inequality to obtain the lower bound of the energy space,

$$\ln \sum_{N \in S} p(d_N) \prod_{d_n \in N} \exp\{-\rho_d(d_{ij} - d_n)\} \le \frac{1}{\alpha} \ln \sum_{N \in S} [p(d_N)]^{\alpha} + \frac{1}{\beta} \ln \sum_{N \in S} \left[ \prod_{d_n \in N} \exp\{-\rho_d(d_{ij} - d_n)\} \right]^{\beta}, \tag{19}$$

and, combining Eq. (19) with Eq. (13), we obtain a new inequality,

$$E(d_{ij}) \ge E_0(d_{ij}) - \frac{1}{\alpha} \ln \sum_{N \in S} [p(d_N)]^{\alpha} - \frac{1}{\beta} \ln \sum_{N \in S} \left[ \prod_{d_n \in N} \exp\{-\rho_d(d_{ij} - d_n)\} \right]^{\beta}. \tag{20}$$

The inequality of Eq. (20) describes the lower bound of the energy space and, together with Eq. (17), restricts the energy space to a finite range. As we can see in Eq. (20), we should choose the joint probability as large as possible and the disparity $d_{ij}$ similar to the neighboring disparities in order to make the energy bound lower.
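The bounds of Eqs. (17) and (20) can be checked numerically against the exact energy of Eq. (13) by brute-force enumeration on a very small problem. The sketch below is illustrative only: the function names, the dictionary `joint` mapping each configuration tuple to a strictly positive probability, and the choice $\alpha = 2$ are assumptions, not part of the paper.

```python
import itertools
import math


def rho_d(x, sigma=0.4, eps=0.01):
    """Contaminated Gaussian disparity measure of Eq. (6)."""
    return -math.log((1.0 - eps) * math.exp(-x * x / (2.0 * sigma * sigma)) + eps)


def energy_and_bounds(d_ij, e0, joint, alpha=2.0):
    """Return (lower bound Eq. (20), exact energy Eq. (13), upper bound Eq. (17)).
    `joint` maps every configuration tuple d_N in S to p(d_N) > 0."""
    beta = alpha / (alpha - 1.0)                 # enforces 1/alpha + 1/beta = 1
    configs = list(joint)
    n_s = len(configs)
    # Sum of rho_d(d_ij - d_n) over the neighbors of each configuration.
    smooth = {c: sum(rho_d(d_ij - d_n) for d_n in c) for c in configs}
    exact = e0 - math.log(sum(joint[c] * math.exp(-smooth[c]) for c in configs))
    upper = e0 - math.log(n_s) - sum(math.log(joint[c]) - smooth[c] for c in configs) / n_s
    lower = (e0
             - math.log(sum(joint[c] ** alpha for c in configs)) / alpha
             - math.log(sum(math.exp(-beta * smooth[c]) for c in configs)) / beta)
    return lower, exact, upper


# Toy setting: two neighbors, three candidate disparities, a non-uniform joint distribution.
configs = list(itertools.product(range(3), repeat=2))
norm = sum(1 + sum(c) for c in configs)
joint = {c: (1 + sum(c)) / norm for c in configs}
results = [energy_and_bounds(d, 0.5, joint) for d in range(3)]
```

The upper bound requires every $p(d_N)$ to be strictly positive, since it contains $\ln p(d_N)$; the toy distribution above is constructed to satisfy this.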

4. The Plane Configuration Model

4.1 Independence Probabilistic Model

In the general formula, Eq. (13), it is necessary to know the joint distribution of the disparities in the configuration, $p(d_N)$, and to specify the configuration at each $d_{ij}$ in advance. However, it is difficult to obtain the exact joint probability distribution, and tremendous computation is required in real cases. In addition, the disparity distribution has not only continuity but also discontinuity at object boundaries. A new probabilistic model is required to overcome the regularization imposed by the MRF model and to estimate the disparity more exactly in boundary regions with less computation. So, we assume a simple probabilistic model for the joint distribution of the configuration: each disparity in the configuration is independent of the others. The independence model appears to contradict the continuity of the disparity distribution. However, in boundary regions an abrupt discontinuity exists, and it can be preserved by the independence model. The joint probability $p(d_N)$ can be rewritten as

$$p(d_N) = \prod_{d_n \in N} p(d_n). \tag{21}$$


By inserting Eq. (21) into Eq. (13) and interchanging the summation and multiplication operators, a new and more practical formula is derived,

$$E(d_{ij}) = E_0(d_{ij}) - \ln \sum_{N \in S} \prod_{d_n \in N} \exp\{-\rho_d(d_{ij} - d_n)\}\, p(d_n). \tag{22}$$

Equation (22) is the new general formula of the Bayesian model for stereo matching, based on the assumption of independence between the disparities in the configuration. This formula is much more useful for implementation than Eq. (13), because the marginal probability $p(d_{ij})$ can be calculated easily in the energy space with the Gibbs distribution. In addition, since the energy space is diffused iteratively with this formula, the marginal probability can be estimated more accurately at each iteration step. The only remaining requirement is that the probability satisfy the usual properties of a probability. That is to say, the probability space transformed from the energy space should satisfy the axioms of probability theory. Otherwise, the incomplete probability space can cause the energy space to diverge as the diffusion is iterated. So, it is necessary to set the following constraint on the probability space,

$$\sum_{d_{ij} \in D} p(d_{ij}) = 1, \quad \text{and,} \quad p(d_{ij}) = \frac{\exp\{-E(d_{ij})\}}{\sum_{d_{ij} \in D} \exp\{-E(d_{ij})\}}. \tag{23}$$

As in the previous manipulation based on the inequality between the arithmetic mean and the geometric mean, we manipulate Eq. (22),

$$E(d_{ij}) \le E_0(d_{ij}) - \ln n(S) - \frac{1}{n(S)} \sum_{N \in S} \sum_{d_n \in N} \{\ln p(d_n) - \rho_d(d_{ij} - d_n)\}. \tag{24}$$

Also, the lower bound of Eq. (22) is obtained as

$$E(d_{ij}) \ge E_0(d_{ij}) - \frac{1}{\alpha} \ln \sum_{N \in S} \left[ \prod_{d_n \in N} p(d_n) \right]^{\alpha} - \frac{1}{\beta} \ln \sum_{N \in S} \left[ \prod_{d_n \in N} \exp\{-\rho_d(d_{ij} - d_n)\} \right]^{\beta}. \tag{25}$$

As in Eqs. (17) and (20), the disparity regularization and the dependence on the maximal joint probability can be seen in Eqs. (24) and (25). Equations (24) and (25) differ from Eqs. (17) and (20) in that the product of the marginal probability distributions, instead of the joint probability distribution, has to be considered in order to diffuse the energy space.

4.2 Similarity of Disparity

In Eq. (22), we should choose the configuration and its elements $d_n$ so as to maximize the marginal probability $p(d_n)$. We assume that the disparity is more likely to vary continuously than abruptly. As with the intensity distribution in an image, the disparity varies smoothly except at the boundaries of objects. Based on this assumption, we take all the disparities in a configuration to be equal. That is to say, a configuration consists of a disparity set with only one value. We call this similarity model the plane configuration model. This assumption is reasonable, since we have no prior information on the distribution of disparity and the disparities in a configuration are strongly correlated. In addition, this assumption is consistent with the disparity based regularization measure in the Bayesian model, which makes the distribution of disparity smooth and continuous. Figure 3 shows the concept of the plane configuration model and probabilistic diffusion in the energy space. In Fig. 3, we use the configuration of Fig. 2 (a). Four disparities with the same value construct a configuration, and each disparity is assumed to be independent of the others. In the probabilistic diffusion, the products of the four marginal probabilities are summed over the disparity range according to Eq. (22).

Fig. 3 The proposed plane configuration model and probabilistic diffusion. Four neighboring disparities with same value construct a configuration. Every energy value in three dimensional space is probabilistically diffused through all the possible configurations based on the plane configuration model.
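The difference between enumerating every configuration and restricting to plane configurations can be sketched for a single point of the energy space. The code below evaluates the second term of Eq. (22) over a supplied set of configurations. All names (`smoothness_term`, `all_configurations`, `plane_configurations`, `marginals`) and the example marginal distributions are hypothetical, chosen only for illustration.

```python
import itertools
import math


def rho_d(x, sigma=0.4, eps=0.01):
    """Contaminated Gaussian disparity measure of Eq. (6)."""
    return -math.log((1.0 - eps) * math.exp(-x * x / (2.0 * sigma * sigma)) + eps)


def smoothness_term(d_ij, marginals, configs):
    """-ln sum_{N in S} prod_{d_n in N} exp(-rho_d(d_ij - d_n)) p(d_n), from Eq. (22).
    `marginals[n][d]` is the marginal probability of disparity d at neighbor n."""
    total = 0.0
    for config in configs:
        product = 1.0
        for p_n, d_n in zip(marginals, config):
            product *= math.exp(-rho_d(d_ij - d_n)) * p_n[d_n]
        total += product
    return -math.log(total)


def all_configurations(n_disp, n_neighbors):
    """Every combination of neighbor disparities: n(D) ** n_neighbors tuples."""
    return list(itertools.product(range(n_disp), repeat=n_neighbors))


def plane_configurations(n_disp, n_neighbors):
    """Plane configuration model: all neighbors share one value, n(D) tuples."""
    return [(d,) * n_neighbors for d in range(n_disp)]


# Four neighbors whose marginals are all peaked at disparity 3 (illustrative values).
peaked = [0.02] * 8
peaked[3] = 0.86
marginals = [peaked] * 4
plane = plane_configurations(8, 4)
energies = [smoothness_term(d, marginals, plane) for d in range(8)]
```

With a four-neighbor geometry and 8 candidate disparities, the full set has 4096 configurations while the plane set has 8, and the smoothness term over the plane set is still smallest at the disparity on which the neighbors agree.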

With the plane configuration model, the number of possible configurations, $n(S)$, becomes equal to the size of the disparity range, $n(D)$, because all the disparities in a configuration are chosen to be the same.

5. Simulations and Results

The experiments used two sets of stereo images, each consisting of 5 images taken from 5 different viewpoints. The center image is the reference, and the others are located to its left, right, top, and bottom. The five cameras are on the same plane and are rectified. The center, left, and right cameras satisfy the one dimensional epipolar geometry in the horizontal direction, and the center, top, and bottom cameras do so in the vertical direction. Figures 4 and 5 are the test images, random dot and face. The initial energy space is generated from the pointwise intensity differences. The four pairwise differences with respect to the center image are calculated and averaged. The energy space is probabilistically diffused with Eq. (22) for $E_1(d_{ij})$ and with Eq. (26) for $k \ge 2$,

$$E_{k+1}(d_{ij}) = \frac{1}{2}\{E_0(d_{ij}) + E_k(d_{ij}) + \Delta E_k(d_{ij})\},$$
$$\Delta E_k(d_{ij}) = -\ln \sum_{N \in S} \prod_{d_n \in N} \exp\{-\rho_d(d_{ij}^{(k)} - d_n^{(k)})\}\, p(d_n^{(k)}), \tag{26}$$

until it converges to a stable state. In Eq. (26), $d_{ij}^{(k)}$ and $d_n^{(k)}$ are disparities in the space after $k$ iterations. The diffusion process is finished when the changes between the current and the previous disparity map are negligible.

Fig. 4 Test stereo image of random dot. (a) Reference image, (b) True disparity map.

Fig. 5 Test stereo image of face. (a) Reference image, (b) True disparity map.

5.1 Experimental Results

Before comparing the performance of the proposed algorithm with that of the other algorithms, Fig. 6 shows how Bayesian MAP estimation for stereo matching operates iteratively on the initial energy space. The disparity range is from 0 to 7 pixels. The first figure is the initial disparity map, and the following figures are the disparity maps after one to eleven diffusion iterations. Each disparity converges toward the neighboring disparities according to the probabilistic models. In order to evaluate the performance, we compare the disparity maps generated by the proposed algorithm with those of the SSD and Scharstein's algorithm [12]. We implemented the proposed algorithm with the same configuration and the same parameters as Scharstein's algorithm in order to compare the performance and the probability model. The main parameters $\sigma_d$, $\epsilon_d$, $\sigma_i$, and $\epsilon_i$ are set to 0.4, 0.01, 5, and 0.1, respectively. Figure 7 compares four disparity maps for the random dot image generated by four algorithms. Figure 7 (a) is generated by the SSD with block size 7 × 7 followed by iterative median filtering, (b) is generated by Scharstein's algorithm, and (c) is generated by Eq. (22). For (c), the plane configuration model is not used; only independence is assumed.
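The iteration of Eqs. (22), (23), and (26) described in Sect. 5 can be sketched on a single scanline. This is a simplified illustration, not the paper's implementation: it uses two neighbors (left and right) instead of the four-neighbor geometry of Fig. 2 (a), assumes the plane configuration model, wraps around at the border, runs a fixed number of iterations instead of testing convergence, and all function names and the toy energy values are hypothetical.

```python
import math


def rho_d(x, sigma=0.4, eps=0.01):
    """Contaminated Gaussian disparity measure of Eq. (6)."""
    return -math.log((1.0 - eps) * math.exp(-x * x / (2.0 * sigma * sigma)) + eps)


def gibbs(row):
    """Normalized probabilities over the disparity range, Eq. (23)."""
    low = min(row)
    weights = [math.exp(-(e - low)) for e in row]
    total = sum(weights)
    return [w / total for w in weights]


def diffuse(e0, iterations):
    """Diffuse the scanline energy space e0[j][d]; return the argmin disparity per pixel."""
    n, n_disp = len(e0), len(e0[0])
    energy = [row[:] for row in e0]
    for k in range(iterations):
        prob = [gibbs(row) for row in energy]
        updated = []
        for j in range(n):
            neighbors = (prob[(j - 1) % n], prob[(j + 1) % n])  # wrap-around (assumption)
            row = []
            for d_ij in range(n_disp):
                total = 0.0
                for d in range(n_disp):      # plane configuration: both neighbors take value d
                    term = 1.0
                    for p_n in neighbors:
                        term *= math.exp(-rho_d(d_ij - d)) * p_n[d]
                    total += term
                delta = -math.log(max(total, 1e-300))  # guard against log(0)
                if k == 0:
                    row.append(e0[j][d_ij] + delta)                            # Eq. (22)
                else:
                    row.append(0.5 * (e0[j][d_ij] + energy[j][d_ij] + delta))  # Eq. (26)
            updated.append(row)
        energy = updated
    return [row.index(min(row)) for row in energy]


# Toy energy space: true disparity 2 everywhere, one noisy pixel that initially prefers 0.
good = [2.0, 2.0, 0.0, 2.0]
noisy = [0.5, 2.0, 0.8, 2.0]
e0 = [good[:], good[:], good[:], noisy, good[:], good[:]]
initial_map = diffuse(e0, 0)
diffused_map = diffuse(e0, 5)
```

In this toy case the isolated disparity is pulled toward the value supported by its neighbors, which is the regularizing behavior that the bounds analysis of Sect. 3.2 describes.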


Since many computations are necessary to obtain the joint probability of the disparities, we could not realize the pure formula, Eq. (13), and applied the independence assumption, Eq. (22), instead. In this case, the number of combinations of disparities in the configuration is $n(D)^4$ instead of $n(D)$. As we can see, the disparity maps from the proposed algorithm are superior to those from the SSD and Scharstein's algorithm. The disparity boundaries of Figs. 7 (c) and (d) are closer to the true map, and the false disparities are fewer than in (a) and (b). In the four figures, the left and right areas result from the modulo operation in the implementation program, and they are meaningless regions in the disparity maps. Compared with (d), which was generated by the plane configuration model, the disparity map (c) is not better. This result can be explained by the independence probabilistic model of the disparities in the configuration. As mentioned above, the independence model was proposed to preserve the discontinuity of the disparity distribution in boundary regions. The disparity map from the proposed probabilistic model can be better than that from the generalized formula. Also, because it takes too much time and computing power to generate a disparity map with the generalized formula, the proposed model is preferred. Figure 8 shows the disparity maps for the face image. As with the random dot image, the proposed algorithm outperforms the others, with clearer and more accurate boundaries and fewer erroneous disparities.

Fig. 6 Convergence of the disparity map for random dot. (a) Non-diffused initial disparity map, (b) once diffused map, (c) twice diffused map, and finally (l) 11 times diffused map.

Fig. 7 Comparison of disparity maps for random dot. (a) SSD algorithm (7 × 7 block size), (b) [12] method, (c) Generalized formula, (d) Proposed Algorithm.

5.2 Comments on the Results

For both sets of test images, the proposed algorithm generates better disparity maps than the usual SSD and Scharstein's algorithm. These results show that the diffusion formula is correctly derived and that the probabilistic models of independence and similarity are reasonable and appropriate for estimating the joint probability distribution of the disparities in the configuration. The proposed algorithm uses only the first-order MRF window, (a) in Fig. 2, in order to generate a more detailed disparity map. When the MRF window is enlarged, the blurring of the disparity map becomes serious. However, unlike in the SSD, it is not important to optimize the size of the MRF window in the proposed algorithm. Even though the diffusion is performed with the smallest first-order MRF, the MRF support propagates to farther neighbors step by step as the number of iterations increases. Furthermore, since the neighboring disparities in the MRF window are weighted according to the probability distribution, the MRF window becomes deformable and space-adaptive through the iterative diffusion and the probabilistic weights. The deformable window is the main advantage and the most important feature of the proposed algorithm, and it accounts for the improved performance. In order to construct such adaptive window operations, there are no needs

LEE et al: DISPARITY ESTIMATION BASED ON BAYESIAN MAXIMUM A POSTERIORI (MAP) ALGORITHM

1375

disparity range and the geometry of the neighborhood configuration. In the case of Fig. 2 (d), the computation 24 4 becomes O(n(D) ) instead of O(n(D) ) of Fig. 2 (a) for the generalized formula. 6.

(a)

(b)

Conclusions

In this paper, we have derived the general formula of Bayesian MAP estimation for stereo matching and implemented the proposed algorithm based on the simplified probabilistic models such as independence and similarity between disparities in the configuration. The independence probabilistic model guaranteed the discontinuity at the object boundary region, and the similarity model does the continuity or the high correlation of the disparity distribution. According to the experimental results, the performance of the proposed algorithm outperformed the other algorithms, in that the boundaries of disparity were clearer and erroneous estimates were reduced very effectively. These results mean that the derived formula is adequate and the probabilistic models are reasonable and appropriate for the joint probability distribution of disparities in the configuration. Furthermore, the proposed probabilistic models 4 decrease the computations to O(n(D)) from O(n(D) ) of the generalized formula. We can conclude that the derived formula is the general form and can be changed into the some different forms based on the reasonable probabilistic models. The more accurate is the assumed probability model, the better disparity map can be generated with this formula. Currently, we are investigating on further improvement of the method by introducing various probabilistic models and configurations. Acknowledgement Part of this work was performed while Sang Hwa Lee was visiting ATR Media Integration and Communications Research Laboratories. The authors are grateful to Prof. Y. Ohta of University of Tsukuba for providing with the stereo images and true disparity map. References

(c) Fig. 8 Comparison of disparity maps for face. (a) SSD algorithm (7×7 block size), (b) [12] method, (c) Proposed Algorithm.

of any other complex analysis or auxiliary spatial information and optimization of window size except for the only probabilistic distribution from initial energy space. In addition, the plane configuration model reduces the calculations of random dot and face to 0.1% and 0.02% of the generalized formula, respectively. The amount of computational reduction is dependent on the

[1] J.S. Lim, “Two Dimensional Signal and Image Processing,” Prentice Hall, 1990.
[2] R.M. Haralick and L.G. Shapiro, “Computer and Robot Vision,” Addison Wesley, 1993.
[3] D.H. Ballard and C.M. Brown, “Computer Vision,” Prentice Hall, 1982.
[4] R. Chellappa and A. Jain, “Markov Random Fields,” Academic Press, 1993.
[5] G. Xu and Z. Zhang, “Epipolar Geometry in Stereo, Motion and Object Recognition,” Kluwer Academic Publishers, 1996.
[6] E. Kreyszig, “Introductory Functional Analysis with Applications,” WIE Wiley, 1978.
[7] U.R. Dhond and J.K. Aggarwal, “Structure from stereo—A review,” IEEE Trans. Syst., Man. & Cybern., vol.19, no.6, pp.1489–1510, Nov./Dec. 1989.
[8] S. Geman and D. Geman, “Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images,” IEEE Trans. Pattern Anal. & Mach. Intell., vol.PAMI-6, no.6, pp.721–741, Nov. 1984.
[9] T. Kanade and M. Okutomi, “A stereo matching algorithm with an adaptive window: Theory and experiment,” IEEE Trans. Pattern Anal. & Mach. Intell., vol.PAMI-16, no.9, pp.920–932, Sept. 1994.
[10] M. Okutomi and T. Kanade, “A multiple-baseline stereo,” IEEE Trans. Pattern Anal. & Mach. Intell., vol.PAMI-15, no.4, pp.353–363, 1993.
[11] R. Szeliski and G. Hinton, “Solving random-dot stereograms using the heat equation,” Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR ’94), pp.194–201, June 1994.
[12] D. Scharstein and R. Szeliski, “Stereo matching with nonlinear diffusion,” Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR ’96), pp.343–350, San Francisco, CA, June 1996.
[13] W.T. Woo and A. Ortega, “Stereo image compression with disparity compensation using the MRF model,” Proc. VCIP ’96, vol.I, pp.28–41, Orlando, US, Feb. 1996.
[14] D. Scharstein, “Matching images by comparing their gradient fields,” Proc. International Conference on Pattern Recognition (ICPR ’94), vol.1, pp.572–575, Oct. 1994.
[15] Z.-N. Li and G. Hu, “Analysis of disparity gradient based cooperative stereo,” IEEE Trans. Image Processing, vol.5, no.11, pp.1493–1506, Nov. 1996.
[16] F.M. Porikli, “Stripe mesh based disparity estimation by using 3-D Hough transform,” Proc. ICIP ’97, Santa Barbara, USA, vol.III, pp.240–243, Oct. 1997.
[17] A.E. Zaart, D. Ziou, and F. Dubeau, “Phase-based disparity estimation: A spatial approach,” Proc. ICIP ’97, Santa Barbara, USA, vol.III, pp.244–247, Oct. 1997.
[18] J.I. Park and S. Inoue, “Arbitrary view generation from multiple cameras,” Proc. ICIP ’97, Santa Barbara, USA, vol.I, pp.149–152, Oct. 1997.
[19] S.H. Lee, J.-I. Park, and C.W. Lee, “A new stereo matching algorithm based on Bayesian model,” Proc. ICASSP ’98, Seattle, USA, vol.5, pp.2769–2772, May 1998.

Sang Hwa Lee received the B.S. and M.S. degrees in electronics from Seoul National University, Korea, in 1994 and 1996, respectively. He is now a Ph.D. student at the School of Electrical Engineering of the same university. He worked at ATR, Kyoto, as a student intern in 1997. His interests include image restoration, motion estimation, image/video coding and stereo vision.

Jong-Il Park received the B.S., M.S., and Ph.D. degrees, all in electronics engineering, from Seoul National Univ., Korea, in 1987, 1989, and 1995, respectively. He was a research student at the Univ. of Tokyo and a visiting researcher at NHK Science and Tech. Research Labs. from 1992 to 1994. After working for Korean Broadcasting Institute in 1995, he joined ATR Media Integration and Communications Research Labs. in 1996, where he was involved in various projects on video analysis and processing, 3D video processing, and virtual reality communications. He is currently an assistant professor in the Division of Electrical and Computer Engineering at Hanyang Univ., Korea. His research interests include computer graphics, computer vision, image and video processing, and virtual reality.

Seiki Inoue received the B.S. degree in electrical engineering and the M.S. and Ph.D. degrees in electronics, all from the Univ. of Tokyo, in 1978, 1980, and 1992, respectively. He joined NHK in 1980. From 1995 to 1998, he was with ATR Media Integration and Communications Research Labs. He moved back to NHK Science and Technical Research Labs. in 1998. His research interests include image and video processing, multimedia databases and Kansei processing.

Choong Woong Lee received the B.E. and M.S. degrees in electronics engineering in 1958 and 1960, respectively, from Seoul National Univ., and he received the degree of Doctor of Engineering from the Univ. of Tokyo in 1972. From 1958 to 1961, he worked at the Scientific Research Institute, Ministry of National Defense, Korea. After that he joined the staff of Seoul National Univ., where he is a professor in the School of Electrical Engineering. He served as the chairman of the Korea section of IEEE from 1983 to 1986 and received the grade of IEEE fellow in 1989. And, he was the Korean Institute of Telematics and Electronics in 1989 and got the Order of Merit, Dong-Baek Medal endowed by the government in 1991. He worked as the Director of the Institute of New Media and Communications at Seoul National Univ. from 1991 to 1997. His special interests include communication systems, image and HDTV signal processing and medical imaging.