A Focusing-by-Vergence system controlled by retinal motion disparity

A Focusing-by-Vergence system controlled by retinal motion disparity
Jorge Batista, Paulo Peixoto, J. P. Barreto, Helder Araujo
ISR - Institute of Systems and Robotics, Dep. of Electrical Engineering - University of Coimbra, 3000 COIMBRA - PORTUGAL

Abstract. In this paper we describe a focusing-by-vergence system combining a flow-based vergence process with an off-line pre-calibration of the focusing odometry. The vergence process is controlled by the retinal optical flow disparity, and the target depth velocity is directly obtained from this disparity. The relationship between the target depth (focused target distance) and the focusing odometry of the motorized lens is obtained by an off-line focusing calibration, using a bivariate polynomial to model the relationship between these parameters. The focusing system is velocity controlled, and the focusing velocity depends on the target focusing depth velocity and the focal distance velocity. The influence of the focal distance velocity that arises due to the focusing process itself is compensated by the adjustment of the zoom parameter based on a constant-magnification approach, or through the modeling of the focal distance as a function of the zoom and focus lens settings.

1 Introduction

The integration of accommodation in active vision is mostly oriented to approaches that use accommodation as a cue to vergence control, combining depth from accommodation with disparity vergence. In these approaches the main objective of accommodation is depth estimation. Extracting depth from accommodation can be done using image in-focus or defocus measurements. In-focus methods mostly control the focus setting [1, 8, 9, 13, 15, 5, 12], while defocus methods basically rely on controlling the aperture [7, 6, 10, 11]. In range-from-focus, the distance to a target is estimated by continuously monitoring some sharpness criterion function through a series of focus positions. When the function reaches a global maximum, the image is in focus and the target depth can be obtained using a calibrated lens model. Defocus algorithms are based on measuring the degree of blur within a single image or multiple images, computing the change of blur as a function of the controlled optical parameters (aperture or focus) to estimate the accommodation distance.

In the proposed focusing method [3, 4], the main purpose is to maintain the images in focus during the visual tracking of moving objects, using the estimated spatial target velocity obtained during the vergence control to control the focus motor velocity (focusing-by-vergence). The target depth velocity is expressed as a function of the angular velocity of the eye azimuth joints, and this angular velocity is controlled by the retinal image optical flow disparity. Computing the retinal motion disparity for a symmetric vergence geometry, we are able to obtain the target depth velocity. Using the estimated target depth velocity combined with a pre-calibration of the focus motor setting as a function of the accommodation depth, this approach is able to control the velocity of the focus motor of the lens to maintain the optical system in focus.
The focus motor velocity depends on the focal length of the lens, the retinal motion disparity and the present vergence geometry. The relationship between the focus motor velocity and the target depth velocity is obtained by differentiating the bivariate polynomial function that models the relationship between the focus motor setting and the accommodation depth.


2 The Vergence Process

Consider a point $P_c$ with coordinates $(X_c, Y_c, Z_c)$ in the cyclopean eye coordinate system (Fig. 1), moving with velocity $V_c = \Omega_c \times P_c + t_c$, where $\Omega_c = [\Omega_1, \Omega_2, \Omega_3]^T$ is the angular velocity and $t_c = [t_x, t_y, t_z]^T$ the translational velocity. This velocity $V_c$ can be expressed in each of the retinal coordinate systems, $V_{l/r}$, by

    V_{l/r} = R_{l/r} [\Omega_c \times P_c + t_c]    (1)

where $R_{l/r}$ represents the rotation matrix between the cyclopean and the left/right retinal coordinate systems.


Fig.1.: Symmetric fixation geometry.

The target velocity $V$ has two main components: $V_z$, the velocity along the cyclopean axis, and $V_x$, the velocity perpendicular to the cyclopean axis. Defining the retinal image velocity by

    v_{x,l/r} = \frac{f_x}{P_{l/r} \cdot \hat{z}} (V_{l/r} \cdot \hat{x}) - \frac{f_x}{(P_{l/r} \cdot \hat{z})^2} (P_{l/r} \cdot \hat{x}) (V_{l/r} \cdot \hat{z})
    v_{y,l/r} = \frac{f_y}{P_{l/r} \cdot \hat{z}} (V_{l/r} \cdot \hat{y}) - \frac{f_y}{(P_{l/r} \cdot \hat{z})^2} (P_{l/r} \cdot \hat{y}) (V_{l/r} \cdot \hat{z})    (2)

and representing $P_{l/r}$ by $P_{l/r} = R_{l/r} P_c + T_{l/r}$, the retinal image motion flow disparity is given by

    \Delta v = v_l - v_r = \frac{f}{P_l \cdot \hat{z}} \begin{bmatrix} V_l \cdot \hat{x} \\ V_l \cdot \hat{y} \end{bmatrix} - \frac{f (V_l \cdot \hat{z})}{(P_l \cdot \hat{z})^2} \begin{bmatrix} P_l \cdot \hat{x} \\ P_l \cdot \hat{y} \end{bmatrix} - \frac{f}{P_r \cdot \hat{z}} \begin{bmatrix} V_r \cdot \hat{x} \\ V_r \cdot \hat{y} \end{bmatrix} + \frac{f (V_r \cdot \hat{z})}{(P_r \cdot \hat{z})^2} \begin{bmatrix} P_r \cdot \hat{x} \\ P_r \cdot \hat{y} \end{bmatrix}    (3)

where $f$ is the focal length of both lenses. After some mathematical manipulation, and considering that the target is verged with equal vergence angles ($\theta_l = -\theta_r = \theta$, $\theta_v = 2\theta$), which means that its coordinates are $P_c = [0, 0, Z_c]$ and its image projections are $[0, 0]_{l/r}$, the retinal image motion flow disparity is given by

    \Delta v = \begin{bmatrix} \Delta v_x \\ \Delta v_y \end{bmatrix} = \begin{bmatrix} -\frac{f t_z \sin(2\theta)}{Z_c} \\ 0 \end{bmatrix}.    (4)

Since the binocular system is verged on the target with equal vergence angles, $Z_c = B (\tan\theta)^{-1}$, and the $Z$ component of the translational velocity is

    t_z = -\frac{B \, \Delta v_x}{2 f \sin^2\theta},

which is a function of two important parameters: the horizontal retinal optical flow disparity $\Delta v_x$ and the actual vergence geometry $\theta$.

Considering that

    \frac{\partial Z_c}{\partial t} = t_z = -\frac{B \, \Delta v_x}{2 f \sin^2\theta}    (5)

and combining this equation with the result of differentiating $Z_c$ with respect to time, $\frac{\partial Z_c}{\partial t} = -\frac{B}{\sin^2\theta} \frac{\partial \theta}{\partial t}$, we obtain

    \frac{\partial \theta}{\partial t} = \frac{\Delta v_x}{2 f}    (6)

which represents the angular velocity of the vergence joints required to maintain vergence on the moving target. For this particular vergence geometry, only the horizontal motion flow disparity is required to control the vergence velocity of the joints of both retinas.
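As a quick numerical sketch, equations (5) and (6) can be written directly in code; the function names and argument units below are illustrative assumptions, not from the paper.

```python
import math

def target_depth_velocity(dvx, B, f, theta):
    """Target depth velocity t_z from the horizontal retinal flow
    disparity dvx (Eq. 5). B is the half-baseline and f the focal
    length (same length units as B); theta is the symmetric vergence
    angle in radians."""
    return -B * dvx / (2.0 * f * math.sin(theta) ** 2)

def vergence_joint_velocity(dvx, f):
    """Angular velocity of the vergence joints that maintains vergence
    on the moving target (Eq. 6)."""
    return dvx / (2.0 * f)
```

By construction, the depth velocity equals $-B/\sin^2\theta$ times the joint velocity, which is exactly the differentiation of $Z_c = B(\tan\theta)^{-1}$ used above.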

3 Focus motor setting calibration

To model the non-linear relationship between the in-focus motor setting and the focusing distance, we calibrated off-line the focus motor settings across a range of zoom settings $m_z$ and focusing distances $D$. To get the image in focus we used as sharpness criterion function the modified Laplacian proposed by Nayar [13], monitoring its behavior as the focus motor changed its setting. A Fibonacci search was used to find the global maximum of the criterion function [9].

After having determined the focus motor settings for a range of zoom settings and focusing distances, we must characterize how the focus setting varies with these parameters, in particular with the focusing distance. Since only zoom and focusing distance are considered in this modeling process, the functional relationship used to model the lens behavior has two independent variables: the zoom motor setting $m_z$ and the focusing distance $D$. We used bivariate polynomials [18] to describe these functional relationships. The general formula for an $n$th-order bivariate polynomial is

    BP(D, m_z) = \sum_{i=0}^{n} \sum_{j=0}^{n-i} a_{ij} D^i m_z^j    (7)

and the number of coefficients required by the polynomial is $NC = (n + 1)(n + 2)/2$.
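The calibration step can be sketched in code: a sum-modified-Laplacian sharpness measure in the spirit of Nayar's criterion, and a Fibonacci search for its maximum over a focus interval. This is an illustrative sketch under assumptions (function names, window handling, and a continuous focus parameter), not the paper's implementation.

```python
import numpy as np

def sum_modified_laplacian(img, step=1):
    """Sum-modified-Laplacian sharpness criterion (after Nayar):
    |2I(x,y) - I(x-s,y) - I(x+s,y)| + |2I(x,y) - I(x,y-s) - I(x,y+s)|
    summed over the whole image."""
    i = np.asarray(img, dtype=float)
    mlx = np.abs(2 * i[:, step:-step] - i[:, :-2 * step] - i[:, 2 * step:])
    mly = np.abs(2 * i[step:-step, :] - i[:-2 * step, :] - i[2 * step:, :])
    return mlx.sum() + mly.sum()

def fibonacci_search_max(f, a, b, n=25):
    """Fibonacci search for the maximum of a unimodal function on [a, b].
    After n function-evaluation stages the bracket has shrunk by ~F(n+1)."""
    fib = [1.0, 1.0]
    for _ in range(n):
        fib.append(fib[-1] + fib[-2])
    x1 = a + (fib[n - 1] / fib[n + 1]) * (b - a)
    x2 = a + (fib[n] / fib[n + 1]) * (b - a)
    f1, f2 = f(x1), f(x2)
    for k in range(1, n):
        if f1 > f2:          # maximum lies in [a, x2]; reuse x1 as new x2
            b, x2, f2 = x2, x1, f1
            x1 = a + (fib[n - k - 1] / fib[n - k + 1]) * (b - a)
            f1 = f(x1)
        else:                # maximum lies in [x1, b]; reuse x2 as new x1
            a, x1, f1 = x1, x2, f2
            x2 = a + (fib[n - k] / fib[n - k + 1]) * (b - a)
            f2 = f(x2)
    return 0.5 * (a + b)
```

In practice the search would evaluate the sharpness criterion on images grabbed at (quantized) focus motor positions rather than a continuous parameter.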


Fig.2.: The focus motor setting as a function of the zoom setting and focusing distance. Left: calibrated data; right: modeled data.

Using this type of functional, three main relationships with direct influence on the implemented auto-focusing mechanism were modeled: the focus motor setting $m_f$, the effective focal length $f$ and the image magnification $M$:

    m_f = g(D, m_z), \quad f = h(m_f, m_z), \quad M = \Phi(m_f, m_z).    (8)

A third-order bivariate polynomial function $g$ was used to model the focus motor setting $m_f$:

    m_f = g(D, m_z) = a_{00} + a_{01} m_z + a_{02} m_z^2 + a_{03} m_z^3 + a_{10} D + a_{11} D m_z + a_{12} D m_z^2 + a_{20} D^2 + a_{21} D^2 m_z + a_{30} D^3,    (9)

with the calibrated data and the modeled data presented in figure 2. The focusing distance was measured using stereo triangulation from verged foveated images.
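Since $g$ is linear in the coefficients $a_{ij}$, the calibration data can be fitted to equation (9) by ordinary linear least squares. The following sketch (NumPy; the function names are assumptions) builds the design matrix in the same term order as equation (9):

```python
import numpy as np

def fit_bivariate_poly3(D, mz, mf):
    """Least-squares fit of the 3rd-order bivariate polynomial of Eq. (9).
    Returns the 10 coefficients in the order a00, a01, a02, a03,
    a10, a11, a12, a20, a21, a30 (NC = (3+1)(3+2)/2 = 10)."""
    D, mz, mf = map(np.asarray, (D, mz, mf))
    A = np.column_stack([
        np.ones_like(D), mz, mz**2, mz**3,   # a00, a01, a02, a03
        D, D * mz, D * mz**2,                # a10, a11, a12
        D**2, D**2 * mz,                     # a20, a21
        D**3,                                # a30
    ])
    coeffs, *_ = np.linalg.lstsq(A, mf, rcond=None)
    return coeffs

def eval_bivariate_poly3(c, D, mz):
    """Evaluate Eq. (9) with coefficients in the order returned above."""
    return (c[0] + c[1] * mz + c[2] * mz**2 + c[3] * mz**3
            + c[4] * D + c[5] * D * mz + c[6] * D * mz**2
            + c[7] * D**2 + c[8] * D**2 * mz + c[9] * D**3)
```

For good conditioning it helps to rescale $D$ and $m_z$ (e.g. to $[0, 1]$) before fitting, since raw motor counts span several orders of magnitude.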

4 Focusing by vergence

In a focusing system conducted by vergence, the angular velocity of the eye joints controls the fixation point (target) depth velocity, and this depth velocity determines the velocity of the focusing system. Representing the focusing distance $D$ as a function of the fixation geometry (symmetric geometry) by $D = B (\sin\theta)^{-1}$, the velocity of the focusing distance as a function of the angular velocity of the eye azimuth joint is given by

    \frac{\partial D}{\partial t} = \frac{\partial D}{\partial \theta} \frac{\partial \theta}{\partial t} = -\frac{B \cos\theta}{\sin^2\theta} \frac{\partial \theta}{\partial t}.    (10)

With the angular velocity of the eye azimuth joint defined by equation 6, the focusing distance velocity is given by

    \dot{D} = \frac{\partial D}{\partial t} = -\frac{B \cos\theta \, (v_x^d - v_x^e)}{2 f \sin^2\theta}    (11)

where $\Delta v_x = v_x^d - v_x^e$ is the retinal optical flow disparity. Representing the relationship between the focus lens setting and the focusing distance by $m_f = g(D, m_z)$ and differentiating $m_f$ with respect to time, we obtain the focus motor velocity $\dot{m}_f$ as a function of the focusing distance velocity $\dot{D}$ and the zoom motor velocity $\dot{m}_z$:

    \dot{m}_f = \frac{\partial g}{\partial m_z} \frac{\partial m_z}{\partial t} + \frac{\partial g}{\partial D} \frac{\partial D}{\partial t} = \frac{\partial g}{\partial m_z} \dot{m}_z + \frac{\partial g}{\partial D} \dot{D}.    (12)

Assuming a constant zoom value during the focusing process ($\dot{m}_z = 0$), the focus motor velocity is only a function of the focusing distance velocity, resulting in the relationship

    \dot{m}_f = \frac{\partial g}{\partial D} \cdot \left( -\frac{B \cos\theta \, (v_x^d - v_x^e)}{2 f \sin^2\theta} \right)    (13)

with

    \frac{\partial g}{\partial D} = b_1 + b_2 (\sin\theta)^{-1} + b_3 (\sin\theta)^{-2},

where the coefficients are

    b_1 = a_{10} + a_{11} m_z + a_{12} m_z^2
    b_2 = 2 B (a_{20} + a_{21} m_z)
    b_3 = 3 B^2 a_{30}.

Apart from the retinal optical flow disparity and the fixation geometry, the focus motor velocity (eq. 13) is also a function of the effective focal distance of the lens. Since the effective focal length of the lens is a function of the zoom and focus settings, the focusing process changes the effective value of the focal length. A solution to this problem can be obtained using two different strategies:

1. Maintaining the invariance of the focal length, compensating the variation of the focal length with the zoom motor.

2. Modeling the focal length $f$ as a function of the lens zoom and focus settings, $f = h(m_f, m_z)$, resulting in

    \dot{m}_f = \frac{\partial g}{\partial D} \cdot \left( -\frac{B \cos\theta \, (v_x^d - v_x^e)}{2 \, h(m_f, m_z) \sin^2\theta} \right).    (14)

Focusing magnification is one of the consequences of changing the effective focal length of the lens during the focusing process (see fig. 3). Focal length invariance can be obtained by modeling the image magnification $M$ as a function of the focus and zoom settings, $M = \Phi(m_f, m_z)$, using the isomagnification curves to estimate the zoom motor setting that maintains a constant image magnification, which means a constant focal length. The main advantage of this approach is the invariance of the image to the focusing process, allowing a more robust estimation of the optical flow. Since the effective focal length does not change with the focusing process, it can be estimated off-line. However, this strategy makes the focus motor velocity also dependent on the zoom motor velocity ($\dot{m}_z \neq 0$), increasing the complexity of the function that models the focus motor velocity.
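Under the constant-zoom assumption, equation (13) and the closed form of $\partial g / \partial D$ reduce to a few lines of code. A sketch follows; the coefficient names match equation (9), while the dictionary interface and function names are assumptions:

```python
import math

def dg_dD(theta, mz, a, B):
    """Derivative of the focus-setting polynomial g with respect to D,
    evaluated at D = B/sin(theta), using the b1, b2, b3 coefficients
    given after Eq. (13). `a` maps 'a10'..'a30' to calibrated values."""
    b1 = a['a10'] + a['a11'] * mz + a['a12'] * mz**2
    b2 = 2.0 * B * (a['a20'] + a['a21'] * mz)
    b3 = 3.0 * B**2 * a['a30']
    s = math.sin(theta)
    return b1 + b2 / s + b3 / s**2

def focus_motor_velocity(dvx, theta, B, f, mz, a):
    """Focus motor velocity of Eq. (13): constant zoom, retinal optical
    flow disparity dvx = v_x^d - v_x^e, symmetric vergence angle theta."""
    D_dot = -B * math.cos(theta) * dvx / (2.0 * f * math.sin(theta)**2)
    return dg_dD(theta, mz, a, B) * D_dot
```

Replacing the fixed `f` by a calibrated `h(mf, mz)` turns this into equation (14), the second strategy above.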


Fig.3.: Image magnification as a function of the focus and zoom settings, and isomagnification curves for a group of different zoom motor values.

Modeling the effective focal length as a function of the zoom and focus settings keeps the focus motor velocity a function of the focusing distance velocity $\dot{D}$ only. However, this solution does not maintain the invariance of the focal length, and the velocity induced on the images depends on the velocity of the focal length variation. Considering a focal length $f$ that changes its value with a velocity $\dot{f}$, the velocity $(\dot{x}, \dot{y})$ induced on the image by a point $P$ that moves with a spatial velocity $V = (\dot{X}, \dot{Y}, \dot{Z})$ is represented by

    \dot{x} = \frac{P \cdot \hat{x}}{P \cdot \hat{z}} \dot{f} + \frac{f}{P \cdot \hat{z}} \dot{X} - \frac{f}{(P \cdot \hat{z})^2} (P \cdot \hat{x}) \dot{Z}
    \dot{y} = \frac{P \cdot \hat{y}}{P \cdot \hat{z}} \dot{f} + \frac{f}{P \cdot \hat{z}} \dot{Y} - \frac{f}{(P \cdot \hat{z})^2} (P \cdot \hat{y}) \dot{Z}.

For a fixation point $P$ with camera coordinates $P = (0, 0, D)$, the velocity induced on the image is independent of the velocity of the focal length $\dot{f}$; the image velocity depends only on the velocity of the target $V$.

5 Focusing accommodation

The focusing accommodation is performed in parallel with the visual behaviors of fixation and smooth pursuit and is accomplished in two major steps:

1. The initial focusing is performed in parallel with the front-symmetric cyclopean fixation of the system. The focus motor is position controlled and the odometric position for focus is defined by

    m_f = g(B (\sin\theta)^{-1}, m_z)    (15)

with $\theta = |\theta_l| = |\theta_r|$.

2. Assuming that both lenses are in focus after the cyclopean fixation of the system, the focusing accommodation of the lenses is performed during the smooth pursuit of the system using the approach presented in this paper. During this period the focus motor is velocity controlled, with the focus motor velocity defined by equation 14. To take into account the position error that results from inaccurate retinal velocity estimation, motor inertia or inaccurate focus motor setting calibration, the focus motor velocity that maintains the optical system in focus is obtained by

    \dot{u} = \dot{m}_f + K \, \Delta m_f    (16)

where $\Delta m_f = g(D, m_z) - m_f$ is the position error measured between the actual focus motor position $m_f$ and the modeled focus position for the target depth $D$, and $K$ represents a proportional gain.
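Equation (16) is simply a feedforward velocity plus a proportional correction of the focus position error; as a minimal sketch (names are illustrative):

```python
def focus_velocity_command(mf_dot, mf_actual, mf_model, K):
    """Velocity command of Eq. (16): the model-based focus motor
    velocity mf_dot, corrected by the position error between the
    modeled focus setting g(D, mz) and the actual motor position,
    weighted by the proportional gain K."""
    delta_mf = mf_model - mf_actual   # Δm_f = g(D, m_z) − m_f
    return mf_dot + K * delta_mf
```

In the paper this command is additionally Kalman-filtered before being sent to the PID servo-controller.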


Fig.4.: Block diagram of the focusing-by-vergence system. $p_{m_f}$ and $v_{m_f}$ represent respectively the position and velocity of the focus motor.

The focus motor velocity obtained by equation 16 is filtered by a Kalman filter before it is sent to the PID focus motor controller. The Kalman filter is used for two main reasons: 1) to filter the noisy velocity components; 2) to maintain synchronous velocity information to the PID servo-controller. A velocity command is sent to the PID controller every 10 ms. A block diagram of the focusing-by-vergence system is presented in figure 4.

The behavior of the proposed focusing-by-vergence method was analyzed under real environment conditions. For that purpose, we used a highly textured pattern that was moved forward and backward along the cyclopean axis of the MDOF system. Two different tests were performed, considering different initial locations for the target (target depth): 2.0 and 2.5 meters away from the cyclopean eye. The focal length of both lenses was set to 15.0 mm, which corresponds to a zoom motor setting equal to $m_z = 80000$. In both tests the target was moved 1.0 m forward and backward from its initial position, and the behavior of the proposed focusing system is presented in figure 5. Three plots are present in both figures: estimated target depth $D$ (*), retinal motion flow disparity (**) and focus motor angular velocity (***). The estimated target depth was obtained during the vergence tracking process and is in agreement with the target movement performed. Good performance can be observed for the front-symmetric vergence tracking process. The retinal motion flow disparity used to control the angular vergence velocity and the focus motor velocity presents higher values for short target depths, and the focus motor velocity required to maintain the optical system in focus also increases when the target approaches the binocular system.
This behavior was observed in both tests, with the velocity required to maintain the system in focus being higher for the closest target setup.

6 Conclusions

In this paper we presented a focusing system controlled by vergence, fusing the information supplied by the vergence process with an off-line pre-calibration of the lens. The pre-calibration of the lens parameters was done using bivariate polynomials to model the relationship between the calibration parameters and the variable lens parameters. We also described a vergence control approach based on the concept of binocular retinal image motion disparity that allows the computation of the target depth velocity.


Fig.5.: Behavior of the proposed focusing-by-vergence method. The left figure corresponds to the closest target setup and the right figure to the farther target setup. Each figure presents three plots: (*) target depth $D$ estimated by the vergence process (m/sec), (**) filtered retinal motion flow disparity and (***) filtered focus motor angular velocity (deg/sec). The Y axis legend applies to all the plots.

References

1. Abbott, L., Ahuja, N.: Surface reconstruction by dynamic integration of focus, camera vergence and stereo. IEEE Int. Conf. on Computer Vision, Florida, December 1988.
2. Aloimonos, Y., Weiss, Y., Bandopadhay, A.: Active Vision. Intern. J. Comput. Vision 7 (1988) 333-356.
3. Batista, J.: Issues on Active Vision: Behaviors and Calibration. PhD Thesis, DEE-FCTUC, Coimbra, 1999.
4. Batista, J., Peixoto, P., Araujo, H.: Real-Time Visual Behaviors with a Binocular Active Vision System. ICRA97 - IEEE Int. Conf. on Robotics and Automation, Albuquerque, New Mexico, USA, April 1997.
5. Andersen, C.: A framework for control of a camera head. PhD Thesis, LIA, Aalborg, 1996.
6. Pahlavan, K.: Active Robot Vision and Primary Ocular Processes. PhD Thesis, CVAP, KTH, Sweden, 1993.
7. Horri, A.: Depth from defocusing. CVAP Technical Report TRITA-NA-P116, KTH, Sweden, 1992.
8. Horn, B.P.: Focusing. MIT Artificial Intelligence Laboratory, May 1968.
9. Krotkov, E.P.: Focusing. Int. Journal of Computer Vision 3(1), 1987.
10. Pentland, A.: A new sense for depth of field. IEEE PAMI 9(4), July 1987.
11. Surya, G., Subbarao, M.: Depth from defocus by changing camera aperture: A spatial domain approach. IEEE Int. Conf. on Computer Vision and Pattern Recognition, New York, June 1993.
12. Tennenbaum, J.: Accommodation in Computer Vision. PhD Thesis, Stanford University, 1970.
13. Nayar, S.: Shape from focus. Carnegie Mellon University, 1989.
14. Murray, D., Bradshaw, K., MacLauchlan, P., Reid, I., Sharkey, P.: Driving Saccade to Pursuit Using Image Motion. Intern. Journal of Computer Vision 16, No. 3, November 1995, 205-228.
15. Das, S., Ahuja, N.: A comparative study of stereo, vergence and focus as depth cues for active vision. IEEE Int. Conf. on Computer Vision and Pattern Recognition, New York, June 1993.
16. Brown, C., Coombs, D.: Real-Time Binocular Smooth Pursuit. Intern. Journal of Computer Vision 11, No. 2, October 1993, 147-165.
17. Burt, P., Bergen, J., Hingorani, R., Kolczynski, R., Lee, W., Leung, A., Lubin, J., Shvaytser, H.: Object tracking with a moving camera. Proc. IEEE Workshop on Visual Motion, Irvine, 1989.
18. Willson, R.: Modelling and Calibration of Automated Zoom Lenses. CMU-RI-TR-94-03, Carnegie Mellon University, 1994.
19. Carpenter, R. H. S.: Movements of the Eye. Pion, 1988.

This article was processed using the TEX macro package with SIRS99 style