Maximum likelihood estimation of depth field for trinocular images

A. Neri, M. Carli and F. Battisti

The noise power spectral density is equal to $N_0/4$. The generalised likelihood functional of $f_L(x,y)$ and $f_V(x,y)$ given $z$, computed as the conditional joint probability of $f_L(x,y)$ and $f_V(x,y)$ divided by an arbitrary function independent of $z$, is

$$\Lambda[f_L, f_V; z(x,y)] = \exp\left\{ -\frac{2}{N_0}\int \left[ f_R(x,y) - f_L\left(x + \frac{g_H}{z(x,y)},\, y\right) \right]^2 dx \;-\; \frac{2}{N_0}\int \left[ f_R(x,y) - f_V\left(x,\, y + \frac{g_V}{z(x,y)}\right) \right]^2 dy \right\}$$
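As a brief check on the prefactor, the step below assumes the standard likelihood for white Gaussian noise. The symbol $\sigma^2$ is notation introduced here for illustration and does not appear in the Letter; $n_H$ and $n_V$ denote the residuals of the horizontal and vertical pairs.

```latex
% sigma^2 is illustrative notation for the noise power spectral density N_0/4
\ln \Lambda = -\frac{1}{2\sigma^2}\left( \|n_H\|^2 + \|n_V\|^2 \right),
\qquad
\sigma^2 = \frac{N_0}{4}
\;\Longrightarrow\;
-\frac{1}{2\sigma^2} = -\frac{2}{N_0}
```

Under this assumption, each squared residual in the exponent is weighted by $2/N_0$.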

In this reported work, the maximum likelihood estimation of the depth field from a trinocular imaging system is considered. Conventional stereo pairs are able to extract depth information from image changes only along the epipolar line, but the use of three cameras in an ‘L’ shape allows one to utilise any change, irrespective of its orientation. The estimate is obtained by maximising the sum of the cross-correlations computed along the orthogonal epipolar lines. Theoretical assessment of the achievable performance based on Fisher’s information is also included. Simulation results have validated the mathematical framework and the performance analysis.

Introduction: 3D multimedia content has recently regained popularity, as witnessed by the volume of 3D movie production. Combining digital film shooting with digital post-processing allows one to create 3D scenes with characters recorded in a real environment and then immersed in a computer-generated world, as in Avatar [1]. Effectively combining real and virtual components in a single shot requires highly accurate information on the depth of each element of the scene. Although a stereo imaging system can provide such information, occluded components and flat regions are a major challenge. In fact, a pair of horizontally aligned cameras can provide a reliable depth estimate only where the image changes significantly along the horizontal direction. In this Letter we reconsider the use of trinocular systems [2–5] to enhance the accuracy of depth field estimation. Although this approach has already been suggested in the past [6, 7], only heuristic solutions have been proposed. Here we propose a trinocular vision system aimed at the optimal extraction of the depth information from the recorded images, based on the maximum likelihood (ML) criterion. We then compute Fisher’s information matrix and derive the Cramér-Rao lower bound on the achievable performance. The mathematical framework is verified with MATLAB simulations.

As illustrated in Fig. 1, the imaging system consists of three cameras arranged in an ‘L’ shape, obtained by adding to a horizontal stereo pair a third camera that forms a vertical stereo pair. Although the reported solution can be easily generalised to noncollinear trinocular systems, the ‘L’ configuration drastically simplifies both the modelling and the estimation algorithm. The proposed estimator optimally combines horizontal and vertical disparities and allows an accurate depth evaluation.

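Fig. 1 ‘L’ shape configuration (figure labels: P = (xR, yR, z), OL, OR, OV, QL, QR, QH, b, z)

The ML estimate of the depth field for trinocular images is

$$\hat{z}_{ML}(x,y) = \arg\min_{z(x,y)} \left\{ \left\| f_R(x,y) - f_L\left(x + \frac{g_H}{z(x,y)},\, y\right) \right\|^2_{L^2} + \left\| f_R(x,y) - f_V\left(x,\, y + \frac{g_V}{z(x,y)}\right) \right\|^2_{L^2} \right\}$$

The ML estimate can also be expressed in terms of the cross-correlation functions

$$r_k(j,h) = \left\langle f_R(x,y),\, f_k(x+j,\, y+h) \right\rangle, \quad k = H, V$$

(where $f_H$ denotes the left camera image $f_L$) as follows:

$$\hat{z}_{ML}(x,y) = \arg\max_{z} \left\{ r_H\left(\frac{g_H}{z},\, 0\right) + r_V\left(0,\, \frac{g_V}{z}\right) - \frac{1}{2}\left[ \left\| f_L(x,y) \right\|^2_{L^2} + 2\left\| f_R(x,y) \right\|^2_{L^2} + \left\| f_V(x,y) \right\|^2_{L^2} \right] \right\}$$

The ML depth estimate is obtained by maximising the sum of the cross-correlations computed at corresponding pairs along the two epipolar lines. Concerning the achievable performance, we observe that, according to [8], Fisher’s information on z for a binocular system is proportional to the energy of the derivative of the pattern, which for horizontal stereo pairs is as follows:

$$J_Z = \frac{4}{N_0} \left\| \frac{\partial f_R(x,y)}{\partial x} \right\|^2_{L^2} \frac{g_H^2}{z^4}$$

Since for trinocular systems the measures extracted from the horizontal and vertical pairs are independent, the information is additive; therefore

$$J_Z = \frac{4}{N_0} \left[ g_H^2 \left\| \frac{\partial f_R(x,y)}{\partial x} \right\|^2_{L^2} + g_V^2 \left\| \frac{\partial f_R(x,y)}{\partial y} \right\|^2_{L^2} \right] \frac{1}{z^4}$$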
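The additive Fisher information for the trinocular system can be evaluated numerically. The following is a minimal sketch, not the authors' implementation: it assumes a unit pixel grid, approximates the L2 norms by sums of squared finite differences, and uses hypothetical values for z, g and N0. The function names `fisher_information_z` and `cramer_rao_bound` are introduced here for illustration.

```python
import numpy as np

def fisher_information_z(f_r, z, g_h, g_v, n0):
    """J_Z = (4/N0) * (g_H^2 ||df_R/dx||^2 + g_V^2 ||df_R/dy||^2) / z^4.

    f_r is a 2D array indexed [y, x] on a unit pixel grid (an assumption);
    the L2 norms are approximated by sums of squared finite differences.
    """
    df_dy, df_dx = np.gradient(f_r)          # axis 0 is y, axis 1 is x
    energy_x = np.sum(df_dx ** 2)
    energy_y = np.sum(df_dy ** 2)
    return 4.0 / n0 * (g_h ** 2 * energy_x + g_v ** 2 * energy_y) / z ** 4

def cramer_rao_bound(j_z):
    """Lower bound on the variance of an unbiased depth estimate."""
    return np.inf if j_z == 0 else 1.0 / j_z

# Hypothetical values, for illustration only.
y, x = np.mgrid[-32:32, -32:32].astype(float)
pattern = np.sinc(0.2 * y)                   # varies along y only
z, g, n0 = 2.0, 80.0, 0.01

# Setting g_v = 0 drops the vertical pair, leaving a horizontal stereo pair.
j_horizontal = fisher_information_z(pattern, z, g_h=g, g_v=0.0, n0=n0)
j_trinocular = fisher_information_z(pattern, z, g_h=g, g_v=g, n0=n0)
print(cramer_rao_bound(j_horizontal), cramer_rao_bound(j_trinocular))
```

For a pattern with no variation along x, the horizontal pair carries no information on z, so its bound is unbounded, while the trinocular bound stays finite.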

Mathematical framework: Let f be the depth of the focal plane and $b_H$ and $b_V$ the horizontal and vertical baselines, respectively. Given a point P at depth z, the disparity of the horizontal (H) and vertical (V) pair is

$$d_k = \frac{f\, b_k}{z} = \frac{g_k}{z}, \quad k = H, V$$

The right camera image $f_R$ can be expressed in terms of the left camera image $f_L$ and of the upper camera image $f_V$ as follows:

$$f_R(x,y) = f_L\left(x + \frac{g_H}{z(x,y)},\, y\right) + n_H(x,y)$$

$$f_R(x,y) = f_V\left(x,\, y + \frac{g_V}{z(x,y)}\right) + n_V(x,y)$$

where $n_H(x,y)$ and $n_V(x,y)$ are modelled as samples from independent, stationary, ergodic, white, zero-mean Gaussian random fields.
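The disparity relation d_k = f b_k / z can be illustrated with a short Python sketch. The function names and the numerical values (focal depth in pixels, baselines in metres) are hypothetical and chosen only for illustration.

```python
import numpy as np

def disparity_from_depth(z, focal_depth, baseline):
    """Disparity d_k = f * b_k / z = g_k / z for a pair with baseline b_k."""
    return focal_depth * baseline / np.asarray(z, dtype=float)

def depth_from_disparity(d, focal_depth, baseline):
    """Inverse relation z = f * b_k / d_k."""
    return focal_depth * baseline / np.asarray(d, dtype=float)

# Hypothetical values, for illustration only.
focal_depth = 800.0          # f, in pixels
b_h, b_v = 0.10, 0.10        # horizontal and vertical baselines, in metres
z = np.array([1.0, 2.0, 4.0])

d_h = disparity_from_depth(z, focal_depth, b_h)
d_v = disparity_from_depth(z, focal_depth, b_v)
print(d_h, d_v)
```

Because the disparity is inversely proportional to depth, a fixed disparity error translates into a depth error that grows as z squared, which is consistent with the 1/z^4 factor in the Fisher information.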

When the two baselines are equal, $J_Z$ is proportional to the local energy of the gradient magnitude, irrespective of edge orientation, and it is invariant with respect to rotations of the imaging system.

Experimental results: To assess the effectiveness of the ML estimator, the reference planar pattern of Fig. 2 has been employed. This pattern consists of a repeated texton comprising a 2D pattern given by sinc(Bx)sinc(By), a horizontal pattern with sinc(Bx) cross-section, a vertical pattern with sinc(By) cross-section, and a grey cross on a black background used for registration. The signal-to-noise ratio decreases from left to right, while the signal bandwidth B increases from top to bottom. This pattern, printed on photographic paper, was acquired by means of an ‘L’ shaped trinocular system.
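The three sinc-based components of the texton can be sketched in Python as follows. This is an illustrative reconstruction, not the authors' test pattern: the grid size and bandwidth are hypothetical, the registration cross is omitted, and `np.sinc` is the normalised sinc sin(πt)/(πt), which may differ from the convention used in the Letter.

```python
import numpy as np

def make_textons(size=64, bandwidth=0.2):
    """Return the 2D, horizontal and vertical sinc patterns as [y, x] arrays."""
    half = size // 2
    y, x = np.mgrid[-half:half, -half:half].astype(float)
    two_d = np.sinc(bandwidth * x) * np.sinc(bandwidth * y)   # sinc(Bx)sinc(By)
    horizontal = np.sinc(bandwidth * x)                       # sinc(Bx) cross-section
    vertical = np.sinc(bandwidth * y)                         # sinc(By) cross-section
    return two_d, horizontal, vertical

two_d, horizontal, vertical = make_textons(size=64, bandwidth=0.2)
print(two_d.shape, horizontal.shape, vertical.shape)
```

The horizontal pattern changes only along x and the vertical pattern only along y, so each of them is informative for just one of the two stereo pairs, whereas the 2D pattern is informative for both.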

ELECTRONICS LETTERS 14th March 2013 Vol. 49 No. 6

Conclusion: ‘L’ shaped trinocular systems constitute an effective yet simple extension of binocular systems for applications requiring accurate reconstruction of the depth field. Moreover, ML solutions show a rather low complexity because they require only the computation of two cross-correlations along the two epipolar lines.

© The Institution of Engineering and Technology 2013
04 September 2012
doi: 10.1049/el.2012.2978
One or more of the Figures in this Letter are available in colour online.

A. Neri, M. Carli and F. Battisti (University of Roma TRE, Applied Electronics Department, via della Vasca Navale, 84, Roma 00146, Italy)
E-mail: [email protected]

Fig. 2 Test pattern

Fig. 3 reports the log likelihood functional against the disparity error for the 2D pattern and the vertical pattern. As expected, while the trinocular system is able to estimate the depth of both, the horizontal stereo pair can reliably locate only the 2D pattern. Similar results were obtained for the horizontal pattern and the vertical pair.
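The qualitative behaviour described for Fig. 3 can be sketched in Python under simplifying assumptions that are not taken from the Letter: integer disparities, circular shifts via `np.roll`, equal baselines (so both pairs share the same disparity), a synthetic pattern that varies along y only, and a per-pixel noise variance equal to N0/4. All numerical values are hypothetical.

```python
import numpy as np

def log_likelihood(f_r, f_l, f_v, d_h, d_v, n0):
    """log Lambda = -(2/N0) * (||f_R - f_L(x+d_h, y)||^2 + ||f_R - f_V(x, y+d_v)||^2).

    Arrays are indexed [y, x]; pass f_l=None or f_v=None to drop that pair.
    """
    total = 0.0
    if f_l is not None:
        total += np.sum((f_r - np.roll(f_l, -d_h, axis=1)) ** 2)
    if f_v is not None:
        total += np.sum((f_r - np.roll(f_v, -d_v, axis=0)) ** 2)
    return -2.0 / n0 * total

rng = np.random.default_rng(0)
y, x = np.mgrid[-32:32, -32:32].astype(float)
f_r = np.sinc(0.2 * y)                 # "vertical pattern": varies along y only
sigma = 0.05                           # hypothetical noise standard deviation
n0 = 4.0 * sigma ** 2                  # so that the noise variance is N0/4
d_true = 5                             # hypothetical true disparity, in pixels

# Build f_L and f_V so that f_R(x, y) = f_L(x + d, y) + n_H = f_V(x, y + d) + n_V.
f_l = np.roll(f_r, d_true, axis=1) + rng.normal(0.0, sigma, f_r.shape)
f_v = np.roll(f_r, d_true, axis=0) + rng.normal(0.0, sigma, f_r.shape)

errors = np.arange(-10, 11)            # candidate disparity errors
ll_tri = np.array([log_likelihood(f_r, f_l, f_v, d_true + e, d_true + e, n0)
                   for e in errors])
ll_h = np.array([log_likelihood(f_r, f_l, None, d_true + e, 0, n0)
                 for e in errors])

best_tri = errors[np.argmax(ll_tri)]
print("trinocular peak at error", best_tri)
print("horizontal-pair spread", np.ptp(ll_h))
```

In this sketch the trinocular log likelihood peaks at zero disparity error, while the horizontal-pair log likelihood is flat because the pattern carries no variation along the horizontal epipolar line.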


References

1 Ayache, N., and Lustman, F.: ‘Trinocular stereo vision for robotics’, IEEE Trans. Pattern Anal. Mach. Intell., 1991, 13, (1), pp. 73–85
2 Hemayed, E.E., and Farag, A.A.: ‘A geometrical-based trinocular vision system for edges reconstruction’, Int. Conf. on Image Processing, Chicago, IL, USA, 1998, vol. 2, pp. 162–166
3 Wiegand, T., and Sullivan, G.J.: ‘The picturephone is here, really’, IEEE Spectr., 2011, 48, (9), pp. 50–54
4 Nedevschi, S., Oniga, F., Danescu, R., Graf, T., and Schmidt, R.: ‘Increased accuracy stereo approach for 3D lane detection’, Proc. IEEE Intelligent Vehicles Symp., Tokyo, Japan, 2006, pp. 42–49
5 Nedevschi, S., Bota, S., Marita, T., Oniga, F., and Pocol, C.: ‘Real-time 3D environment reconstruction using high precision trinocular stereovision’, Int. Conf. on Automation, Quality & Testing, Robotics, Cluj-Napoca, Romania, 2006, pp. 333–338
6 Hedenberg, K., and Astrand, B.: ‘A trinocular stereo system for detection of thin horizontal structures’, Engineering and Computer Science, Advances in Electrical and Electronics Engineering, 2008, pp. 211–218
7 Faugeras, O.: ‘Three-dimensional computer vision – a geometric viewpoint’ (MIT Press, Cambridge, MA, USA, 1993)
8 Neri, A., and Jacovitti, G.: ‘Maximum likelihood localization of 2-D patterns in the Gauss-Laguerre transform domain: theoretic framework and preliminary results’, IEEE Trans. Image Process., 2004, 13, (1), pp. 72–86

Fig. 3 Log likelihood functional against disparity error: 2D pattern (left), vertical pattern (right) (signal bandwidths: blue: B0, green: 2B0, red: 3B0, cyan: 4.5B0). Panels: trinocular, horizontal stereo pair, vertical stereo pair. Horizontal axis: disparity error εδ (–80 to 80); vertical axis: log Λ (×10^5).
