Robust and Efficient Algorithms for Optical Flow Computation*

S. H. Lai
Department of Electrical Engineering, University of Florida, Gainesville, FL 32611
[email protected]

B. C. Vemuri
Department of Computer & Information Sciences, University of Florida, Gainesville, FL 32611
[email protected]
Abstract

In this paper, we present two new, very efficient and accurate algorithms for computing optical flow. The first is a modified gradient-based regularization method, and the other is an SSD-based regularization method. To amend the errors in the image flow constraint caused by the discontinuities in the brightness function, we propose to selectively combine the image flow constraint and the contour-based flow constraint into the data constraint in a regularization framework. The image flow constraint is disabled in the neighborhood of discontinuities, while the contour-based flow constraint is active at discontinuity locations. To solve the linear system resulting from the regularization formulation, the incomplete Cholesky preconditioned conjugate gradient algorithm is employed, leading to an efficient algorithm. Our SSD-based regularization method uses the SSD measure as the data constraint in a regularization framework. A preconditioned nonlinear conjugate gradient algorithm with a modified search direction scheme is developed to minimize the resulting energy function. Experimental results for these two algorithms are given to demonstrate their performance.

*This research was supported in part by NSF grant ECS-9210648.

1 Introduction

Optical flow computation is a fundamental problem in the motion analysis of image sequences. It provides very important information for estimating 3-D velocity fields, analyzing rigid and nonrigid motion, segmenting the image into regions based on their motion, or recovering the 3-D structure of objects in the image. Many techniques for computation of optical flow have been proposed in the literature. These can be classified into gradient-based, correlation-based, energy-based, and phase-based methods [2]. The gradient-based approach is dependent on the image flow constraint equation, which is derived from the brightness constancy assumption as well as the first-order Taylor series approximation [6]. Using the image flow constraint equation alone is not enough to compute the optical flow, since each equation involves two unknowns. Horn and Schunck [6] introduced the membrane smoothness measure to constrain the flow field. Since the smoothness constraint is invalid across motion boundaries, Nagel [2] proposed the “oriented smoothness” measure to suppress the smoothness constraint in the direction perpendicular to boundaries. However, the problem of smoothing across motion boundaries can be resolved by using regularization with discontinuities.

A major problem with the gradient-based approach is the unreliability of the image flow constraint equation in the areas of an image where the local brightness function is highly nonlinear. These areas occur in scenes containing highly textured regions, motion discontinuities, or depth discontinuities. The first-order Taylor series approximation used in the derivation of the image flow constraint equation leads to inaccuracies when the higher-order terms are significant. The higher-order terms are neglected in the derivation by assuming that the time step between consecutive frames is arbitrarily small, i.e. approaches 0, which is far from being practical. Usually, the time step between consecutive images is fixed and is not arbitrarily small. Therefore, the gradient constraint equation is reliable only when the higher-order derivatives of the brightness function are insignificant, i.e., the local brightness function is well approximated by a linear function. Most gradient-based methods do not account for the reliability of the gradient constraint equation. The gradient constraint is very unreliable in the neighborhood of discontinuities of the brightness function. However, the contour-based methods [5] can give robust estimates of the optical flow along the zero-crossing contours, which are the discontinuity locations of the image brightness function.

In this paper, we propose to use the image flow constraint in the areas away from the brightness discontinuities and use the contour-based flow constraint at the discontinuity locations. This yields an accurate estimate of the optical flow.

The correlation-based approach locally finds the displacement vector $(u, v)$ between two images $I_0$ and $I_1$ at the location $(x, y)$ by minimizing the sum of squared differences (SSD) function [1]. Most correlation-based methods do an extensive search of the displacement vector $(u, v)$ in a finite integer-pair
0-8186-7190-4/95 $4.00 © 1995 IEEE
obtain the optical flow estimates as reported in the literature on gradient-based methods [2]. In a regularization framework, the image flow constraint equation is regarded as the data constraint and an additional smoothness constraint is imposed on the optical flow. The image flow constraint equation is derived by taking the Taylor expansion of the image constancy equation $dI/dt = 0$ up to the first-order terms. The higher-order terms are neglected under the assumption that the time step between consecutive frames is arbitrarily small, an assumption that is most often violated in practice. In reality, the time step is usually fixed. Therefore, the lack of higher-order terms becomes the main source of errors in the data constraint. The reliability of the image flow constraint equation depends on the magnitudes of the higher-order derivatives of the image brightness function. If the image brightness function in the neighborhood of a point is well approximated by a linear function, i.e. its higher-order derivatives are small, then the flow constraint is said to be very reliable at this point. Conversely, the image flow constraint is very unreliable at the locations with significant higher-order derivatives, which are mainly in the neighborhood of brightness discontinuities. Since the use of an unreliable flow constraint will degrade the accuracy of the optical flow estimation, we propose to disable this unreliable image flow constraint in the neighborhood of discontinuities and enable a contour-based flow constraint at the discontinuity locations within a regularization framework.
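The dependence of the constraint's reliability on the higher-order derivatives can be made explicit with a short worked expansion. This is a sketch under the assumption that a point moves with constant velocity $(u, v)$ over a fixed time step; the step $\delta t$ and the second-order terms written out below are introduced here for illustration and are not part of the original derivation.

```latex
% Brightness constancy over a fixed time step \delta t
I(x + u\,\delta t,\; y + v\,\delta t,\; t + \delta t) = I(x, y, t)

% Taylor expansion of the left-hand side about (x, y, t)
I + \delta t\,(I_x u + I_y v + I_t)
  + \tfrac{\delta t^2}{2}\,\bigl(u^2 I_{xx} + 2uv\,I_{xy} + v^2 I_{yy}
  + 2u\,I_{xt} + 2v\,I_{yt} + I_{tt}\bigr) + O(\delta t^3) = I

% Dividing by \delta t gives the flow constraint plus its leading error term
I_x u + I_y v + I_t
  = -\tfrac{\delta t}{2}\,\bigl(u^2 I_{xx} + 2uv\,I_{xy} + v^2 I_{yy}
  + 2u\,I_{xt} + 2v\,I_{yt} + I_{tt}\bigr) + O(\delta t^2)
```

The right-hand side vanishes only as $\delta t \to 0$ or when the second derivatives of $I$ are small, which is the sense in which the constraint is reliable where the brightness function is locally close to linear.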
set and find the pair with the smallest SSD value to be the displacement at that location. Since the search region of the displacement vector is discretized for an extensive search, the accuracy of the computed optical flow is limited by this discretization. In addition, the correlation-based approach is incapable of producing reliable optical flow estimates in a homogeneous region. Anandan [1] treated the estimates provided by the matching process as the data constraints for optical flow with an appropriate confidence measure, and incorporated the smoothness constraint on optical flow to achieve better results. Recently, Szeliski and Coughlan [8] used the two-dimensional spline model to represent the flow field and minimized the following SSD function

$$E(\mathbf{u}) = \sum_{i=1}^{n} \sum_{j=1}^{n} \left[ I_1(x_i + u_{i,j},\, y_j + v_{i,j}) - I_0(x_i, y_j) \right]^2, \qquad (1)$$

where the vector $\mathbf{u}$ is the concatenation of the flow components $u_{i,j}$ and $v_{i,j}$. The Levenberg-Marquardt algorithm was then employed to solve this nonconvex optimization problem. They reported very accurate results using this method. The 2-D spline models for the optical flow field assume the flow field to be well approximated by the 2-D spline basis functions in the preset patches. However, this may not be the case in the presence of discontinuities, since incorporating discontinuities into these spline patches is a nontrivial task.

In this paper, we use the standard finite difference discretization on the flow field and take the SSD as the data constraint energy in a regularization framework. A preconditioned nonlinear conjugate gradient algorithm is employed to minimize the total energy function. The experimental results on the standard synthetic image sequence using our algorithm are better than or comparable to the best existing results reported in the literature. Unlike some other algorithms that only generate sparse optical flow estimates, our algorithms give dense optical flow estimates. In addition, motion discontinuities can be incorporated in our (regularization) framework.

The remainder of this paper is organized as follows. In the next section, we present our modified gradient-based method that selectively combines the image flow constraint and the contour-based flow constraint. In section 3, our SSD-based regularization technique for optical flow computation from two images of a time sequence is proposed. The experimental results for both algorithms on synthetic and real image sequences are presented in section 4. Finally, we conclude our paper in section 5.
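The integer search used by correlation-based methods, and the discretization limit it imposes, can be illustrated with a minimal sketch. The function names `ssd` and `ssd_displacement`, the window half-width `half`, the search radius `radius`, and the toy images are assumptions made for illustration; they are not taken from [1] or from the algorithms proposed in this paper.

```python
def ssd(img0, img1, x, y, u, v, half):
    """Sum of squared differences between the window of img0 centred at (x, y)
    and the window of img1 centred at (x + u, y + v). Images are lists of rows."""
    total = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            diff = img1[y + v + dy][x + u + dx] - img0[y + dy][x + dx]
            total += diff * diff
    return total


def ssd_displacement(img0, img1, x, y, half=1, radius=2):
    """Search every integer pair (u, v) with |u|, |v| <= radius and return the
    pair with the smallest SSD. Sub-pixel motion cannot be recovered this way."""
    best, best_uv = None, (0, 0)
    for v in range(-radius, radius + 1):
        for u in range(-radius, radius + 1):
            cost = ssd(img0, img1, x, y, u, v, half)
            if best is None or cost < best:
                best, best_uv = cost, (u, v)
    return best_uv


def make_image(width, height, shift_x=0, shift_y=0):
    """Toy textured image; the shift moves the pattern by whole pixels."""
    return [[((c - shift_x) * 7 + (r - shift_y) * 13) ** 2 % 31 for c in range(width)]
            for r in range(height)]


I0 = make_image(16, 16)
I1 = make_image(16, 16, shift_x=2, shift_y=-1)  # pattern moved by (u, v) = (2, -1)
flow = ssd_displacement(I0, I1, x=8, y=8, half=2, radius=2)
```

With these toy inputs the search returns `(2, -1)`, the exact integer shift. Any displacement that is not a pair of integers inside the search radius would be rounded to the nearest candidate, which is the accuracy limit noted above.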
2 Modified Gradient-based Method

At the foundation of the standard gradient-based approach is the image flow constraint equation,

$$I_x(x,y,t)\,u(x,y,t) + I_y(x,y,t)\,v(x,y,t) + I_t(x,y,t) = 0, \qquad (2)$$

where $I(x,y,t)$ is the image brightness function at location $(x,y)$ and time $t$, $I_x$, $I_y$ and $I_t$ are the partial derivatives of $I$ with respect to $x$, $y$ and $t$ respectively, and $(u,v)$ is the optical flow field. The image flow constraint equation has been used in different ways to

2.1 Combining the Flow Constraints

The contour-based method [5] computes the optical flow along the zero-crossing contours by using the following flow constraint equation

$$S_x(x,y,t)\,u(x,y,t) + S_y(x,y,t)\,v(x,y,t) = -S_t(x,y,t), \qquad (3)$$

where $S(x,y,t)$ is the convolution of $I(x,y,t)$ with the second derivative of the Gaussian function, and $S_x$, $S_y$ and $S_t$ are the partial derivatives of $S$ with respect to $x$, $y$ and $t$ respectively. In [5], the function $S$ was chosen to be the convolution of the Laplacian of Gaussian (LoG) filter with the image brightness function. Now, we incorporate the above two flow constraint equations into the regularization framework. The image flow constraint in equation (2) is disabled in the neighborhood of brightness discontinuities, while the contour-based flow constraint is active along the zero-crossings. The variational formulation of the optical flow problem using this selective scheme for the data constraint leads to minimizing the following functional

$$\int_{\Omega} \left[ \alpha\,(\nabla I \cdot \mathbf{u} + I_t)^2 + \lambda\,(\|\nabla u\|_2^2 + \|\nabla v\|_2^2) \right] d\mathbf{x} + \int_{\text{zero-crossings}} \beta\,(\nabla S \cdot \mathbf{u} + S_t)^2\, d\mathbf{x}, \qquad (4)$$

where $\mathbf{u}(\mathbf{x}, t) = (u(\mathbf{x}, t), v(\mathbf{x}, t))$ is the optical flow field to be estimated, $\mathbf{x} = (x, y)$ is a 2-D vector, $\Omega$ is the 2-D image domain, and $\alpha$, $\lambda$ and $\beta$ are the weighting functions associated with the image flow constraint, smoothness constraint and contour-based flow constraint respectively. In this paper, the weighting functions $\lambda$ and $\beta$
are chosen to be constants, and $\alpha(\mathbf{x})$ has value 0 when $\mathbf{x}$ is in the neighborhood of discontinuities and takes a constant value elsewhere. A discrete version of the above energy can be written as

$$\sum_i \left[ (E_{x,i} u_i + E_{y,i} v_i + E_{t,i})^2 + \lambda\,(u_{x,i}^2 + u_{y,i}^2 + v_{x,i}^2 + v_{y,i}^2) \right],$$

where the index $i$ denotes the $i$-th location, the subscripts $x$, $y$ and $t$ denote the partial derivatives along the corresponding directions, $(u, v)$ is the discretized flow vector, and the function $E$ is defined as follows

$$E_i = \begin{cases} \alpha I_i & \text{when } i \in D \setminus N, \\ \beta S_i & \text{when } i \in Z. \end{cases}$$

The set $D$ contains all the discretized locations in the image domain $\Omega$, $N$ is the set of locations in the neighborhood of discontinuities, and the set $Z$ contains the discretized locations along the zero-crossing contours.

2.2 Rejecting Unreliable Constraints

The image flow constraint or the contour-based flow constraint is unreliable in the regions where the underlying function ($I$ or $S$) is not well approximated by a linear function locally, which means the function contains significant nonlinear terms. In this paper, we define a function $\delta(i)$ as a measure of reliability for the flow constraint at the $i$-th location to be

2.3 Normalizing Data Constraints

$$\sum_i \left[ (F_{x,i} u_i + F_{y,i} v_i + F_{t,i})^2 + \lambda\,(u_{x,i}^2 + u_{y,i}^2 + v_{x,i}^2 + v_{y,i}^2) \right].$$

2.4

Minimizing this discrete energy leads to a linear system of equations $K\mathbf{u} = \mathbf{b}$, where $\mathbf{u} \in \mathbb{R}^{2n^2 \times 1}$ is the concatenation of all the components $u_i$ and $v_i$, and the stiffness matrix $K \in \mathbb{R}^{2n^2 \times 2n^2}$ is symmetric positive-definite and has the following $2 \times 2$ block structure,

$$K = \begin{bmatrix} \lambda K_s + E_{xx} & E_{xy} \\ E_{xy} & \lambda K_s + E_{yy} \end{bmatrix},$$

where $K_s \in \mathbb{R}^{n^2 \times n^2}$ is the discrete 2-D Laplacian matrix from the membrane smoothness constraint, and $E_{xx}$, $E_{xy}$ and $E_{yy}$ are all $n^2 \times n^2$ diagonal matrices with entries $E_{x,i}^2$, $E_{x,i}E_{y,i}$ and $E_{y,i}^2$, respectively. To solve this linear system for optical flow estimation, we use the preconditioned conjugate gradient algorithm [4] with an incomplete Cholesky preconditioner $P$ [4], given as follows.

1. Initialize $\mathbf{u}_0$; compute $\mathbf{r}_0 = \mathbf{b} - K\mathbf{u}_0$; $k = 0$.
2. Solve $P\mathbf{z}_k = \mathbf{r}_k$; $k = k + 1$.
3. If $k = 1$, $\mathbf{p}_1 = \mathbf{z}_0$; else compute $\beta_k = \mathbf{r}_{k-1}^T \mathbf{z}_{k-1} / \mathbf{r}_{k-2}^T \mathbf{z}_{k-2}$ and update $\mathbf{p}_k = \mathbf{z}_{k-1} + \beta_k \mathbf{p}_{k-1}$.
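The iteration above, combined with an incomplete Cholesky preconditioner, can be sketched in a few dozen lines. Only the first three steps survive in the listing, so the step length `alpha`, the solution and residual updates, and the stopping test below follow the textbook preconditioned conjugate gradient method and are assumptions. The function names, the dense list-of-lists storage, the zero-fill IC(0) variant, and the small test matrix are likewise illustrative; the sketch does not reproduce the block version used in this paper.

```python
import math


def incomplete_cholesky(A):
    """Zero-fill incomplete Cholesky, IC(0): L is lower triangular and is
    nonzero only where the lower triangle of A is nonzero."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = A[j][j] - sum(L[j][k] ** 2 for k in range(j))
        L[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            if A[i][j] != 0.0:  # keep the sparsity pattern of A
                s = sum(L[i][k] * L[j][k] for k in range(j))
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def solve_preconditioner(L, r):
    """Solve (L L^T) z = r by forward and then backward substitution."""
    n = len(r)
    y = [0.0] * n
    for i in range(n):
        y[i] = (r[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (y[i] - sum(L[k][i] * z[k] for k in range(i + 1, n))) / L[i][i]
    return z


def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def pcg(K, b, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradient for K u = b with P = L L^T."""
    L = incomplete_cholesky(K)
    u = [0.0] * len(b)
    r = list(b)                       # r_0 = b - K u_0 with u_0 = 0
    z = solve_preconditioner(L, r)
    p = list(z)
    rz = dot(r, z)
    for it in range(1, max_iter + 1):
        Kp = matvec(K, p)
        alpha = rz / dot(p, Kp)       # textbook step length (assumed)
        u = [ui + alpha * pi for ui, pi in zip(u, p)]
        r = [ri - alpha * kpi for ri, kpi in zip(r, Kp)]
        if math.sqrt(dot(r, r)) < tol:
            return u, it
        z = solve_preconditioner(L, r)
        rz_new = dot(r, z)
        beta = rz_new / rz            # beta_k = r_{k-1}^T z_{k-1} / r_{k-2}^T z_{k-2}
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return u, max_iter


def grid_laplacian_plus_identity(m):
    """SPD test matrix: 5-point Laplacian on an m x m grid plus the identity."""
    n = m * m
    K = [[0.0] * n for _ in range(n)]
    for r in range(m):
        for c in range(m):
            i = r * m + c
            K[i][i] = 5.0
            for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= rr < m and 0 <= cc < m:
                    K[i][rr * m + cc] = -1.0
    return K


K = grid_laplacian_plus_identity(4)
b = [float(i % 3) for i in range(16)]
u, iterations = pcg(K, b)
residual = max(abs(bi - ki) for bi, ki in zip(b, matvec(K, u)))
```

The test matrix stands in for the smoothness-plus-data structure of $K$ only loosely: a Laplacian term plus a positive diagonal. A practical implementation would store $K$ and $L$ in a sparse format so that each substitution costs time proportional to the number of nonzeros.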
The preconditioner $P$ is chosen to be an incomplete Cholesky factorization of the matrix $K$. The idea of incomplete Cholesky factorization is to find an approximate Cholesky factorization of the matrix $K$, i.e. $K \approx LL^T$, such that the lower triangular matrix $L$ has a similar sparsity structure to that of $K$. In addition, the product $LL^T$ at the locations with nonzero entries in $L$ or $L^T$ still has the same values as those in $K$. Therefore, the preconditioner $P = LL^T$ is a good approximation to $K$. Since $K$ is sparse and well-structured, the matrix $L$ is also sparse and well-structured. Thus, the solution to the auxiliary linear system $P\mathbf{z} = \mathbf{r}$ in the preconditioning step of the preconditioned conjugate gradient algorithm can be obtained very efficiently via forward and backward substitutions. Furthermore, we use the block version [4] of the incomplete Cholesky factorization to take advantage of the $2 \times 2$ block structure of the matrix $K$. Results of applying this algorithm to synthetic and real

$s_2$, $= s_2$, or $< s_2$), and $s_2'$-constraint ($Z_2 \ge s_2'$, $= s_2'$, or $\le s_2'$). Again, the type of solution for $Z_2$ can change only at points where at least one of $f_i$, $g_{ij}$ is zero.

Now we can summarize the results. Implicit curves $f_i(\mathbf{r}) = 0$ and $g_{ij}(\mathbf{r}) = 0$ separate the sphere into a number of areas. Each of the areas is either contradictory (and contains only contradictory points), or ambiguous (containing ambiguous points). Two different rigid motions can produce ambiguous directions of flow if the image contains only points from ambiguous areas. There are also two scene surfaces constraining depth $Z_1$ and two surfaces constraining depth $Z_2$. If the depths do not satisfy the constraints, the two flows are not ambiguous.
So it is possible to classify all image points on the sphere depending on the kind of solution interval for $Z_1$. Possible outcomes are: no solution at all, bounded solution interval, or unbounded solution interval. For the latter two cases, we can also check whether the interval has a lower bound greater than 0. If at a point there does not exist a positive solution for $Z_1$, this means that the two flows at this point cannot have the same direction and we say that we have a contradictory point. The existence of a solution for $Z_1$ depends on the signs of $f_i$ and $g_{ij}$, and also on the sign of $s_1 - s_1'$. Functions $f_i(\mathbf{r})$ and $g_{ij}(\mathbf{r})$ are polynomial functions of $\mathbf{r}$. To find out where they change sign, it is enough to find points where they are zero. The sign of $s_1 - s_1'$ is more complicated, since $s_1(\mathbf{r})$ and $s_1'(\mathbf{r})$ do not have to be continuous. However, their discontinuities occur at points where $f_i(\mathbf{r}) = 0$ or $g_{ij}(\mathbf{r}) = 0$. So $\operatorname{sgn}(s_1 - s_1')$ can change at those points, and at points where $s_1 - s_1' = 0$.
2.1 Contradictory points
In this section we describe combinations of signs of $f_i$ and $g_{ij}$ that yield a contradictory point. Since we are interested in the contradictory areas, we investigate points where $f_i \neq 0$ and $g_{ij} \neq 0$. There are two simple conditions yielding contradiction for $Z_1$, one for the $s_1$-constraint, one for the $s_1'$-constraint. There is no solution for $Z_1$ if < s1 and s1 0, either if for any L&. (t& < 0 (i.e., Gi and form an angle greater are such that than 90') or (Gi. 0 and Cji and [tun!]' > 4(w . n o ) ( t . no)(w t), which means that $\mathbf{w}_i$ and $\mathbf{t}_i$ must be close to the border. When $f_t = 0$ is perpendicular to $f_w = 0$, the projections of $\mathbf{w}_1$ and $\mathbf{w}_2$ on $f_t = 0$ and the projections of $\mathbf{t}_1$ and $\mathbf{t}_2$ on $f_w = 0$ coincide in one point $\mathbf{r}_1$, i.e.,
$\mathbf{r}_1 = \mathbf{r}_2 = \mathbf{r}_3 = \mathbf{r}_4$. Point $\mathbf{r}_1$ lies at the intersection of all six curves $f_i = 0$ and $g_{ij} = 0$. Any three curves $f_i = 0$, $g_{ij} = 0$ and $g_{kl} = 0$, with $k \neq l$, intersect only in $\mathbf{r}_1$ and one of the points $\mathbf{t}_1$, $\mathbf{t}_2$, $\mathbf{w}_1$, or $\mathbf{w}_2$. Furthermore, since all the zero motion contours have to be closed curves on the hemisphere, we conclude that if there exists a contradictory area, it also has to be in a neighborhood of $\mathbf{r}_1$. It thus suffices to consider all possible sign combinations of terms $f_i$ and $g_{ij}$ around $\mathbf{r}_1$. It can be verified that, for a hemisphere to contain a contradictory area, the two translations have to have the same sign, that is $\operatorname{sgn}(\mathbf{t}_1 \cdot \mathbf{n}_0) = \operatorname{sgn}(\mathbf{t}_2 \cdot \mathbf{n}_0)$. Also the two rotations have to have the same sign, i.e. $\operatorname{sgn}(\mathbf{w}_1 \cdot \mathbf{n}_0) = \operatorname{sgn}(\mathbf{w}_2 \cdot \mathbf{n}_0)$. Furthermore, the relative positions of $\mathbf{t}_1$, $\mathbf{t}_2$, $\mathbf{w}_1$ and $\mathbf{w}_2$ have to be such that

Intuitively this means that, when rotating $f_w = 0$ in the orientation given by the rotations in order to make $f_w = 0$ and $f_t = 0$ parallel, the order of points $\mathbf{t}_1$ and $\mathbf{t}_2$ on $f_t = 0$ is opposite to the order of points $\mathbf{w}_1$ and $\mathbf{w}_2$ on $f_w = 0$ (moving along the same direction along $f_w = 0$ and $f_t = 0$), if $\operatorname{sgn}(\mathbf{t}_1 \cdot \mathbf{n}_0) = 1$. Otherwise, if $\operatorname{sgn}(\mathbf{t}_1 \cdot \mathbf{n}_0) = -1$, the order of points $-\mathbf{t}_1$ and $-\mathbf{t}_2$ on $f_t = 0$ must be the same as the order of points $\mathbf{w}_1$ and $\mathbf{w}_2$ on $f_w = 0$. In summary, we have shown that two rigid motions could be ambiguous on one hemisphere if $(\mathbf{t}_1 \times \mathbf{t}_2)$ is perpendicular to $(\mathbf{w}_1 \times \mathbf{w}_2)$, but only if certain sign and certain distance conditions on $\mathbf{t}_1$, $\mathbf{t}_2$, $\mathbf{w}_1$ and $\mathbf{w}_2$ are met. In addition, as shown in Section 3, the two surfaces in view are constrained by a second and a third order surface. Figure 4 gives an example of such a configuration.

(a) (b)
Figure 4: Both halves of the sphere showing two rigid motions for which there do not exist contradictory areas in one hemisphere.

4 Conclusions

It has been shown that directions of motion vectors are “hardly ever ambiguous”. Ambiguities could result only if the surfaces in view satisfy certain inequality and equality constraints. Furthermore, for two 3-D motions to be compatible the two translation vectors must lie on a geodesic perpendicular to the geodesic through the two rotation vectors.

References

[1] N. Ancona and T. Poggio. Optical flow from 1D correlation: Application to a simple time-to-crash detector. International Journal of Computer Vision: Special issue on Qualitative Vision, Y. Aloimonos (Ed.), 14:132-146, 1995.
[2] T. Brodsky, C. Fermüller and Y. Aloimonos. Uniqueness results for the direction of image flow. Technical Report, Center for Automation Research, University of Maryland, 1995.
[3] C. Fermüller and Y. Aloimonos. On the geometry of visual correspondence. Technical Report CAR-TR-732, Center for Automation Research, University of Maryland, 1994.
[4] C. Fermüller. Passive navigation as a pattern recognition problem. International Journal of Computer Vision: Special issue on Qualitative Vision, Y. Aloimonos (Ed.), 14:147-158, 1995.
[5] B. Horn. Motion fields are hardly ever ambiguous. International Journal of Computer Vision, 1:259-274, 1987.
[6] H. Longuet-Higgins. A computer algorithm for reconstruction of a scene from two projections. Nature, 293:133-135, 1981.
[7] S. Maybank. Theory of Reconstruction from Image Motion. Springer, Berlin, Heidelberg, 1993.
[8] S. Negahdaripour. Critical surface pairs and triplets. International Journal of Computer Vision, 3:293-312, 1989.
[9] S. Negahdaripour and B. Horn. Direct passive navigation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 9:163-176, 1987.
[10] M. Spetsakis and J. Aloimonos. Structure from motion using line correspondences. International Journal of Computer Vision, 1:171-183, 1990.
[11] R. Tsai and T. Huang. Uniqueness and estimation of three-dimensional motion parameters of rigid objects with curved surfaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6:13-27, 1984.