Homotopy-Based Estimation of Depth Cues in Spatial Domain
F. Deschênes and D. Ziou
Département de mathématiques et d'informatique, Université de Sherbrooke
{deschene, ziou}@dmi.usherb.ca
Abstract

This paper presents a homotopy-based algorithm for the cooperative and simultaneous estimation of defocus blur and spatial shifts (2D motion, stereo disparities and/or zooming disparities) in the spatial domain. These cues are estimated from two images of the same scene acquired by a camera evolving in time and/or space and for which the intrinsic parameters are known. We show that these depth cues can be computed directly by solving a system of equations that is embedded in a family of systems using a homotopy method. The results confirm that the use of homotopies reduces approximation errors and thus leads to a denser and more accurate estimation of depth cues.
1. Introduction

Depth information is usually retrieved by extracting relevant features (depth cues) from images, such as shadows, motion, blur, disparity, etc. Most existing techniques compute depth cues independently using simplistic assumptions (e.g., the brightness constancy assumption for spatial shift estimation, negligible spatial shifts between a pair of images for defocus blur estimation, etc.). All of these assumptions imply a perfect control of both the environment and the acquisition system, which is usually difficult and even insufficient in many practical cases [1]. In order to overcome such limitations, a few algorithms have recently been proposed [4, 6]. They take into account the mutual influence between depth cues during the extraction process. In this line of thought, we are interested in the simultaneous and cooperative estimation of defocus blur and spatial shifts (stereo disparities, 2D motion and/or zooming disparities). Let us consider I_1 and I_2, a pair of images of a real scene obtained by a camera evolving in time (t) and space, and for which the values of the extrinsic (Ψ: position and orientation) and intrinsic (Φ: aperture, focal length, lens radius, etc.) parameters are known. Based on image formation principles, we propose to use a generic constraint:

I_2(x + ξ_x, y + ξ_y, t_2, Ψ_2, Φ_2) = (I_1 * h_σ)(x, y, t_1, Ψ_1, Φ_1)    (1)
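As an illustration only (not the paper's algorithm), constraint (1) can be checked on synthetic data: a second image is produced from a first one by Gaussian blurring and shifting, and the three unknowns (σ, ξ_x, ξ_y) are then recovered by a coarse grid search over the constraint residual. The image, blur and shift values below are invented, and the grid search merely stands in for the homotopy solver developed in the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, shift as nd_shift

# Hypothetical sharp image I1 (a smoothed random field, so blur is well behaved).
rng = np.random.default_rng(0)
I1 = gaussian_filter(rng.standard_normal((64, 64)), 2.0)

# Invented ground truth: relative blur sigma and (vertical, horizontal) shift.
true_sigma, true_xi = 1.5, (0.5, -1.0)
I2 = nd_shift(gaussian_filter(I1, true_sigma), true_xi, order=3, mode='nearest')

def residual(sigma, xi):
    """Mean squared violation of constraint (1), borders excluded."""
    pred = nd_shift(gaussian_filter(I1, sigma), xi, order=3, mode='nearest')
    return np.mean((I2[8:-8, 8:-8] - pred[8:-8, 8:-8]) ** 2)

# Coarse grid search over the three unknowns of constraint (1).
grid = [(s, (dy, dx))
        for s in np.arange(0.5, 2.6, 0.5)
        for dy in np.arange(-1.5, 1.6, 0.5)
        for dx in np.arange(-1.5, 1.6, 0.5)]
best = min(grid, key=lambda p: residual(*p))
print(best)  # recovers (1.5, (0.5, -1.0))
```

Because the ground-truth parameters lie on the search grid, the residual vanishes there and the search recovers them exactly; on a real image pair the constraint holds only approximately, which motivates the least-squares systems solved in the paper.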
P. Fuchs
Centre de robotique, École des Mines de Paris
[email protected]
where * is the convolution operator, h the PSF of the camera and σ the blur parameter. Assuming a perspective projection, a passive image formation system, a Gaussian PSF and a locally constant blur, we show that the more blurred image (and its partial derivatives) may be expressed as a function of its own partial derivatives, the partial derivatives of the other image, the blur difference (σ), the horizontal and vertical shifts (ξ_x and ξ_y) and a continuation parameter (λ). Hence, σ, ξ_x and ξ_y can be computed by solving a system of equations using a homotopy method. The use of homotopy yields a denser and more accurate estimation of the previously mentioned depth cues. Furthermore, all computations required by our algorithm are local and are carried out in the spatial domain at a single scale, using a higher-order polynomial expansion than existing approaches (e.g., [4]).
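The continuation idea behind such a solver can be sketched on a toy problem: embed the target system F(z) = 0 in a family H(z, λ) = λ F(z) + (1 − λ)(z − z0), whose root at λ = 0 is trivially z0, then track the root as λ goes from 0 to 1. The two-equation system F below is an invented stand-in for the paper's system in (σ, ξ_x, ξ_y); this is a minimal Newton-corrected sketch, not the paper's implementation.

```python
import numpy as np

# Toy target system F(z) = 0 with z = (x, y); a stand-in for the
# paper's system in the unknowns (sigma, xi_x, xi_y).
def F(z):
    x, y = z
    return np.array([x**2 + y**2 - 4.0, x - y - 1.0])

def J(z):  # Jacobian of F
    x, y = z
    return np.array([[2.0 * x, 2.0 * y],
                     [1.0, -1.0]])

def homotopy_solve(z0, steps=20, corrections=10):
    """Track the root of H(z, lam) = lam*F(z) + (1-lam)*(z - z0).

    At lam = 0 the root is z0; each small increase of the continuation
    parameter lam is followed by Newton corrections, so the final
    iterate solves F(z) = 0.
    """
    z = np.array(z0, dtype=float)
    for lam in np.linspace(0.0, 1.0, steps + 1)[1:]:
        for _ in range(corrections):
            H = lam * F(z) + (1.0 - lam) * (z - z0)
            JH = lam * J(z) + (1.0 - lam) * np.eye(2)
            z = z - np.linalg.solve(JH, H)
    return z

z = homotopy_solve(np.array([2.0, 1.0]))
print(z, np.max(np.abs(F(z))))  # residual is ~0 at convergence
```

Starting from z0 = (2, 1), the tracked path ends at the root (x, y) = ((1 + √7)/2, (−1 + √7)/2) ≈ (1.823, 0.823). The benefit mirrors the claim above: the trivial system at λ = 0 supplies a reliable starting point, so the final Newton iterations converge where a direct solve from a poor initial guess might not.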
2. Homotopy-Based Approach

A family of systems of equations for the simultaneous estimation of defocus blur and spatial shifts can be generated with a homotopy method. In order to simplify notation, let us replace (t_i, Ψ_i, Φ_i) by Θ_i in what follows. I_1(x, y, Θ_1) and I_2(x, y, Θ_2) can thus be considered as a pair of images of the same scene obtained from a passive image formation system by varying one or more intrinsic and/or extrinsic camera parameters. Let us assume that I_2(x, y, Θ_2) is locally more blurred than I_1(x, y, Θ_1). This assumption has no consequence for this work, since it is possible to determine which image is more blurred [1]. When the PSF is Gaussian, I_1(x, y, Θ_1) = (I_0 * h_{σ_1})(x, y, Θ_1) and I_2(x, y, Θ_2) = (I_0 * h_{σ_2})(x, y, Θ_2), where I_0 is the focused image, * the convolution operator, and σ_1 < σ_2. The relation between I_2^(m,n)(x, y, Θ_2) and I_1^(m,n)(x, y, Θ_1), that is between the original images (m = n = 0) or their