Piecewise-Smooth Dense Optical Flow via Level Sets

T. Amiaz and N. Kiryati
School of Electrical Engineering
Tel Aviv University
Tel Aviv 69978, Israel

Abstract

We propose a new algorithm for dense optical flow computation. Dense optical flow schemes are challenged by the presence of motion discontinuities. In state-of-the-art optical flow methods, over-smoothing of flow discontinuities accounts for most of the error. A breakthrough in the performance of optical flow computation has recently been achieved by Brox et al. Our algorithm embeds their functional within a two-phase active contour segmentation framework. Piecewise-smooth flow fields are accommodated and flow boundaries are crisp. Experimental results show the superiority of our algorithm with respect to alternative techniques. We also study a special case of optical flow computation, in which the camera is static. In this case we utilize a known background image to separate the moving elements in the sequence from the static elements. Tests with challenging real-world sequences demonstrate the performance gains made possible by incorporating the static camera assumption in our algorithm.

1 Introduction

Optical flow is the apparent motion in an image sequence, observable through intensity variations [1]. Although optical flow is generally not equivalent to the true motion field, they are quite similar in typical cases. Optical flow computation is a key step in many image and video analysis applications, including tracking, surveillance, dynamic super-resolution and shape from motion. Various approaches to optical flow computation have been suggested [2]. These include differential techniques [3, 4], phase-based [5] and statistical methods [6]. Variational optical flow computation is a subclass of differential techniques that uses the calculus of variations to

minimize a functional embodying the constancy of features in the image and the smoothness of the resulting flow. Brox et al. [7] have significantly improved upon the original work of Horn and Schunck [3], by incorporating coarse-to-fine strategies [8, 9, 10], robust statistics [1, 9] and gradient constancy in a nonlinear objective functional.

Dense optical flow methods, i.e., algorithms that estimate the optical flow everywhere in the frame, tend to fail along strong discontinuities in the optical flow field. Even with the algorithm of Brox et al. [7], which employs a robust smoothness term, most of the error develops there. Coupling optical flow estimation with segmentation offers better support for flow discontinuities [11, 10]. In other approaches, the precise location of the boundary is considered to be of higher importance than the overall accuracy of the optical flow field. Wang and Adelson [12] decomposed the image sequence into layers of affine motion with successively decreasing fidelity. Cremers and Soatto [13] segmented the sequence into regions of affine motion bounded by evolving contours, including an explicit spline-based implementation, and two-phase and multi-phase level set implementations. However, these methods focus on the segmentation of the image, and their assumption of motion affinity limits their precision as optical flow computation methods.

In this paper, we embed the functional of [7] within an active contour segmentation framework. The goal is to profit from the excellent performance of that method, while producing crisp flow boundaries. Following [11, 10, 13] we restate the optical flow problem as that of determining piecewise-smooth flow fields bounded by contours, simultaneously minimizing the length of the contours and the optical flow functional of [7] inside the smooth regions. Due to the presence of unspecified discontinuities in the integration domain, minimization of Mumford-Shah [14] type functionals is difficult. Notable solution approaches include Ambrosio and Tortorelli's Γ-convergence technique [15], and Vese and Chan's level set method [16]. We follow the latter, because it explicitly limits the thickness of the boundary. Experimental testing demonstrates the accuracy of our piecewise-smooth optical flow algorithm.

A special case of the optical flow problem is computing optical flow for a sequence known to be taken by a static camera. This additional information can be used in several ways, such as explicitly forcing homogeneous areas to zero movement [1], or using a long sequence for background subtraction in a tracking context [17]. We propose using this information and a background image to aid in classifying the pixels as moving or being part of the static background.

The conceptual roots of our approach are described in section 2. The proposed piecewise-smooth flow algorithm is derived in section 3. The experimental results, presented in section 4, demonstrate significant performance improvement with respect to the best published methods. In section 5, the static camera case is formulated and the performance boost made possible by incorporating the static camera constraint is demonstrated. Our preliminary results were presented in [18].

2 Background

Our work relies on the superior performance of the algorithm of Brox et al. [7] in finding smooth optical flow fields, the Mumford-Shah [14] segmentation framework, and its numerical approximation by Vese and Chan [16].


2.1 The optical flow functional of Brox et al. [7]

Optical flow computation is an underconstrained problem. The introduction of assumptions or a priori knowledge is necessary for reliable estimation of optical flow. Brox et al.'s optical flow functional represents the following principles and constraints:

Grey level constancy:
$$I(\vec{x}) = I(\vec{x} + \vec{w}) \qquad (1)$$

Here $I$ is the image luminance as a function of spatial position and time, $\vec{x} = (x, y, t)$ is the spatio-temporal position in the image and $\vec{w} = (u, v, 1)$ is the optical flow displacement field between the two frames.

Gradient constancy: an extension of the grey level constancy assumption that accommodates illumination changes between the two images:
$$\nabla I(\vec{x}) = \nabla I(\vec{x} + \vec{w}) \qquad (2)$$

Smoothness: The minimization of an optical flow functional relying only on the image constancy assumptions is ill-posed. A constraint on the spatio-temporal smoothness of the flow field provides regularization. There are two versions of smoothness presented in [7]: the 3D version uses spatio-temporal smoothness, while the 2D version uses purely spatial smoothness. Note that in our implementation of the algorithm we used the 2D version.

Presence of outliers: Outliers appear in the flow because of image noise and sharp edges. They are handled in the functional by using robust fidelity and smoothness terms.

Incorporating the grey-level and gradient constancy assumptions into a modified $L^1$ norm
$$\Psi(x^2) = \sqrt{x^2 + \epsilon^2}\,,$$
the following fidelity (data) term is obtained:
$$f_{of}(u, v) = \Psi\left(|I(\vec{x} + \vec{w}) - I(\vec{x})|^2 + \gamma|\nabla I(\vec{x} + \vec{w}) - \nabla I(\vec{x})|^2\right) \qquad (3)$$

Here $\gamma$ is the weight of gradient constancy with respect to grey level constancy. Using the same robust kernel for the smoothness term
$$s_{of}(u, v) = \Psi(|\nabla u|^2 + |\nabla v|^2) \qquad (4)$$

results in the optical flow functional
$$F_{of}(u, v) = \int \left( f_{of}(u, v) + \alpha\, s_{of}(u, v) \right) d\vec{x}\,, \qquad (5)$$
where $\alpha$ is a weight coefficient.

Minimizing this functional is based on solving the Euler-Lagrange equations. The equations are highly non-linear and are linearized using nested fixed point iterations on a coarse-to-fine grid.
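For concreteness, the robust kernel and the data term of Eq. 3 can be sketched numerically as follows (a minimal NumPy sketch; the function names and the value of $\epsilon$ are our assumptions, and the warped image $I(\vec{x}+\vec{w})$ is taken as a precomputed input):

```python
import numpy as np

EPS = 1e-3  # small constant in the robust kernel (assumed value)

def psi(s2):
    """Robust kernel: Psi(s^2) = sqrt(s^2 + eps^2), a modified L1 norm."""
    return np.sqrt(s2 + EPS**2)

def data_term(I1, I2_warped, gamma=100.0):
    """Fidelity term of Eq. 3: grey-level and gradient constancy
    residuals, combined under the robust kernel Psi.
    I2_warped plays the role of I(x + w)."""
    gy1, gx1 = np.gradient(I1)
    gy2, gx2 = np.gradient(I2_warped)
    grey = (I2_warped - I1) ** 2
    grad = (gx2 - gx1) ** 2 + (gy2 - gy1) ** 2
    return psi(grey + gamma * grad)
```

Note how the robust kernel keeps the penalty approximately linear for large residuals, which limits the influence of outliers.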

2.2 Mumford-Shah segmentation

The Mumford-Shah formulation is based on the observation that natural images are piecewise smooth. Defining the fidelity term
$$f(I) = (I - I_0)^2$$
that compares the segmentation field $I$ with the original image $I_0$, and the smoothness term
$$s(I) = |\nabla I|^2$$

leads to the well known Mumford-Shah functional [14]:
$$F_{MS} = \int_{\Omega} f(I)\, d\vec{x} + \mu \int_{\Omega \setminus C} s(I)\, d\vec{x} + \nu|C| \qquad (6)$$

Here Ω is the domain of the image, C is the discontinuity (edge) set, |C| represents edge length, and µ, ν are the relative weights of the smoothness and edge-length terms. This corresponds to requiring good approximation of the original image by a piecewise smooth segmentation field with few discontinuities. Minimization of the Mumford-Shah functional is difficult because of the presence of the unknown discontinuity set in the integration domain. One possible approach is Ambrosio and Tortorelli’s method [15] of using an auxiliary continuous field to describe whether a pixel belongs to a smooth segment or to an edge. That method, however, does not explicitly require sharp boundaries, and may yield a segmentation field with edges that are too thick.

2.3 Vese and Chan's segmentation scheme [16]

The level set method [19] is an effective tool for computing evolving contours (an early version of this idea appeared in [20]). It accommodates topological changes and is calculated on a grid. In level set methods, regions are defined by contours that have zero width. This is the motivation for using the level-set segmentation approach in sharpening optical flow discontinuities.


In Vese and Chan’s algorithm [16], two smooth functions I + and I − approximate the original image, and the segmenting contour is the zero-level set of an evolving function φ. Using the same fidelity and smoothness terms as in (6), the following functional is obtained:

$$\begin{aligned}
F_{VC}(I^+, I^-, \phi) = {} & \int (I^+ - I_0)^2 H(\phi)\, d\vec{x} + \int (I^- - I_0)^2 H(-\phi)\, d\vec{x} \\
& + \mu \int |\nabla I^+|^2 H(\phi)\, d\vec{x} + \mu \int |\nabla I^-|^2 H(-\phi)\, d\vec{x} \\
& + \nu \int |\nabla H(\phi)|\, d\vec{x}
\end{aligned} \qquad (7)$$

$H(\phi)$ is the Heaviside (step) function of $\phi$, indicating whether the valid approximation of the image at any given point is $I^+$ or $I^-$ (in practice a smooth approximation of $H(\phi)$ is used). The contour length in this formulation is given by $\int |\nabla H(\phi)|\, d\vec{x}$. Vese and Chan suggest minimizing this functional by iteratively solving the Euler-Lagrange equations for $I^+$, $I^-$ and $\phi$.

2.4 Affine segmentation of optical flow

The final necessary background element is the affine optical flow segmentation approach [12, 21] that we shall use for the initialization of the level-set function $\phi$. Wang and Adelson [12] suggested decomposing an image sequence into layers based on the assumption that motion is affine. Their method relies on an estimate of the optical flow field. The flow field is iteratively approximated by affine motion layers of decreasing accuracy. In each iteration the optical flow field of unassigned pixels is divided into blocks, and an affine motion approximation of each block is computed. The quality of approximation of each block is tested against a selection threshold $T_r$ (increased in each iteration). Blocks that pass this test are merged using an adaptive k-means clustering algorithm. The affine field pertaining to the largest cluster of blocks is regarded as the result of the iteration; pixels whose optical flow field's distance from the resultant affine flow is within a threshold $T_a$ are assigned to this iteration's layer. The algorithm is terminated when sufficiently many pixels have been assigned. Borshukov et al. [21] proposed a variant of this algorithm in which merging by k-means clustering is replaced by using a merge threshold $T_m$, and selecting the affine motion field with minimal error.
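The per-block affine approximation at the heart of this layered segmentation can be sketched as a least-squares fit (a sketch under our own naming; the thresholding and clustering steps are omitted):

```python
import numpy as np

def fit_affine_flow(u, v):
    """Least-squares affine motion model for a block of flow:
    u = a0 + a1*x + a2*y,  v = b0 + b1*x + b2*y.
    Returns the six parameters and the mean residual."""
    h, w = u.shape
    ys, xs = np.mgrid[0:h, 0:w]
    A = np.stack([np.ones(h * w), xs.ravel(), ys.ravel()], axis=1)
    pu, *_ = np.linalg.lstsq(A, u.ravel(), rcond=None)
    pv, *_ = np.linalg.lstsq(A, v.ravel(), rcond=None)
    res = np.hypot(A @ pu - u.ravel(), A @ pv - v.ravel()).mean()
    return np.concatenate([pu, pv]), res
```

In the layered scheme, the residual returned here is what would be compared against the selection threshold $T_r$.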

3 The Piecewise Smooth Flow (PSF) algorithm

3.1 Objective functional and Euler-Lagrange equations

The functional for discontinuity preserving optical flow computation is obtained by substituting the expressions for (local) fidelity and smoothness of the optical flow field (Eqs. 3 and 4) in the segmentation functional of Vese and Chan (Eq. 7). Defining the two optical flow fields $\vec{w}^+ = (u^+, v^+, 1)$ and $\vec{w}^- = (u^-, v^-, 1)$, the proposed objective functional is obtained:

$$\begin{aligned}
F(u^+, v^+, u^-, v^-, \phi) = {} & \int \Psi\!\left( |I(\vec{x} + \vec{w}^+) - I(\vec{x})|^2 + \gamma|\nabla I(\vec{x} + \vec{w}^+) - \nabla I(\vec{x})|^2 \right) H(\kappa\phi)\, d\vec{x} \\
& + \int \Psi\!\left( |I(\vec{x} + \vec{w}^-) - I(\vec{x})|^2 + \gamma|\nabla I(\vec{x} + \vec{w}^-) - \nabla I(\vec{x})|^2 \right) H(-\kappa\phi)\, d\vec{x} \\
& + \mu \int \Psi(|\nabla u^+|^2 + |\nabla v^+|^2) H(\phi)\, d\vec{x} \\
& + \mu \int \Psi(|\nabla u^-|^2 + |\nabla v^-|^2) H(-\phi)\, d\vec{x} \\
& + \nu \int |\nabla H(\phi)|\, d\vec{x}
\end{aligned} \qquad (8)$$

Note the different scaling of the Heaviside function in the fidelity and smoothness terms, via the parameter $\kappa$ ($\kappa < 1$). It stresses flow variations near optical flow discontinuities and

leads to piecewise smoothness. See also Eq. 14. According to the calculus of variations, a minimizer of this functional must satisfy the Euler-Lagrange equations. Following the abbreviations of [7], we denote:

$$\begin{aligned}
I_x^{\pm} &\triangleq \partial_x I(\vec{x} + \vec{w}^{\pm}) \\
I_y^{\pm} &\triangleq \partial_y I(\vec{x} + \vec{w}^{\pm}) \\
I_z^{\pm} &\triangleq I(\vec{x} + \vec{w}^{\pm}) - I(\vec{x}) \\
I_{xx}^{\pm} &\triangleq \partial_{xx} I(\vec{x} + \vec{w}^{\pm}) \\
I_{xy}^{\pm} &\triangleq \partial_{xy} I(\vec{x} + \vec{w}^{\pm}) \\
I_{yy}^{\pm} &\triangleq \partial_{yy} I(\vec{x} + \vec{w}^{\pm}) \\
I_{xz}^{\pm} &\triangleq \partial_x I(\vec{x} + \vec{w}^{\pm}) - \partial_x I(\vec{x}) \\
I_{yz}^{\pm} &\triangleq \partial_y I(\vec{x} + \vec{w}^{\pm}) - \partial_y I(\vec{x})
\end{aligned} \qquad (9)$$

The equations for $\vec{w}^+$ at $\phi > 0$ and for $\vec{w}^-$ at $\phi < 0$ are
$$H(\pm\kappa\phi)\,\Psi'\!\left( (I_z^{\pm})^2 + \gamma\left((I_{xz}^{\pm})^2 + (I_{yz}^{\pm})^2\right) \right)\left( I_x^{\pm} I_z^{\pm} + \gamma(I_{xx}^{\pm} I_{xz}^{\pm} + I_{xy}^{\pm} I_{yz}^{\pm}) \right) - \mu\,\mathrm{div}\!\left( H(\pm\phi)\,\Psi'(|\nabla u^{\pm}|^2 + |\nabla v^{\pm}|^2)\,\nabla u^{\pm} \right) = 0 \qquad (10)$$
$$H(\pm\kappa\phi)\,\Psi'\!\left( (I_z^{\pm})^2 + \gamma\left((I_{xz}^{\pm})^2 + (I_{yz}^{\pm})^2\right) \right)\left( I_y^{\pm} I_z^{\pm} + \gamma(I_{yy}^{\pm} I_{yz}^{\pm} + I_{xy}^{\pm} I_{xz}^{\pm}) \right) - \mu\,\mathrm{div}\!\left( H(\pm\phi)\,\Psi'(|\nabla u^{\pm}|^2 + |\nabla v^{\pm}|^2)\,\nabla v^{\pm} \right) = 0 \qquad (11)$$

The equations for $\vec{w}^+$ at $\phi < 0$ and for $\vec{w}^-$ at $\phi > 0$ are the smoothness constraints:
$$\mathrm{div}\!\left( \Psi'(|\nabla u^{\pm}|^2 + |\nabla v^{\pm}|^2)\,\nabla u^{\pm} \right) = 0 \qquad (12)$$
$$\mathrm{div}\!\left( \Psi'(|\nabla u^{\pm}|^2 + |\nabla v^{\pm}|^2)\,\nabla v^{\pm} \right) = 0 \qquad (13)$$

Eqs. 10-13 are linearized using fixed point iterations (see Appendix, A.2). The resulting system of linear equations is sparse, and can be efficiently solved. The gradient-descent equation for $\phi$ is
$$\frac{\partial\phi}{\partial t} = \nu\,\delta(\phi)\,\nabla\cdot\left(\frac{\nabla\phi}{|\nabla\phi|}\right) - \mu\,\delta(\phi)\left[ s_{of}(u^+, v^+) - s_{of}(u^-, v^-) \right] - \kappa\,\delta(\kappa\phi)\left[ f_{of}(u^+, v^+) - f_{of}(u^-, v^-) \right] \qquad (14)$$

where $\delta(\phi) = H'(\phi)$. We solve this equation using standard discretization of the derivatives (see Appendix, A.3). The equations for $\vec{w}^+$, $\vec{w}^-$ and $\phi$ are iteratively solved in alternation. Once the solutions are determined, the final estimated optical flow field is:
$$\vec{w} = \begin{cases} \vec{w}^+ & \text{if } \phi > 0 \\ \vec{w}^- & \text{otherwise} \end{cases} \qquad (15)$$

3.2 Initialization

The proposed functional is non-convex, and therefore may be prone to local minima. The initialization scheme, based on affine flow segmentation, facilitates the convergence to the correct minimum. The initial values of the fields $\vec{w}^+$, $\vec{w}^-$ and $\phi$ are obtained by a single application of the original algorithm of [7], followed by detecting the dominant motion using the algorithm of [21]. After these two steps, $\vec{w}^-$ is initialized (throughout the image domain) to the first layer of the affine field, which corresponds to the dominant motion. $\vec{w}^+$ is initialized (throughout the image domain) to the optical flow computed in the first step (the algorithm of [7]), and $\phi$ is initialized to

$$\phi = \begin{cases} 1 & \text{pixel belonging to the dominant motion} \\ 2 & \text{otherwise} \end{cases} \qquad (16)$$

The rationale for these values is the following: pixels classified to the dominant motion, i.e., to the first affine layer, belong to $\vec{w}^-$ and their initial $\phi$ value is low. The $\phi$ value for pixels classified to other motion layers is high. Initially there is a bias towards the dominant motion, because it is smoother than the combination of other motions. To compensate for this, all initial $\phi$ values are positive.

3.3 Piecewise-Smooth Flow (PSF): algorithm summary

1. Calculate optical flow using [7].
2. Segment the optical flow field using [21].
3. Initialize $\vec{w}^+$, $\vec{w}^-$ and $\phi$.
4. Iteratively solve Eqs. 10-14.
5. Assign the optical flow according to Eq. 15.
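The final assignment (step 5 above, Eq. 15) is a per-pixel switch on the sign of $\phi$; a minimal NumPy sketch (the array shapes are our assumption):

```python
import numpy as np

def assign_flow(w_plus, w_minus, phi):
    """Eq. 15: take the flow from w+ where phi > 0 and from w-
    elsewhere. Flow fields are (H, W, 2) arrays holding (u, v)."""
    return np.where((phi > 0)[..., None], w_plus, w_minus)
```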


4 Experimental results

We demonstrate the performance of the piecewise smooth flow (PSF) algorithm using synthetic and real image sequences. The following parameters, taken from [7, 16], were used in all the experiments: $\mu = 80$, $\gamma = 100$, $\nu = 0.02 \cdot 255$. The numerical approximation of the Heaviside function was $H_\Delta(\phi) = \frac{1}{2}\left(1 + \frac{2}{\pi}\tan^{-1}\left(\frac{\phi}{\Delta}\right)\right)$, with $\Delta = 1$. The parameter $\kappa$ was set to $\kappa = 0.03$, and the number of iterations was 40. The images were pre-blurred using a Gaussian kernel with $\sigma = 0.8$. The parameters for the initialization segmentation [21] were: $T_r = 0.5$, $T_m = 1$, and $T_a = 0.1$ in the first iteration, and the block size was $5 \times 5$ pixels.

Figure 1 shows that the over-smoothing in [7], which forces a gradual change of the optical flow field at the sky-ground interface in the Yosemite sequence, is replaced in our PSF method by segmentation of the flow field that is crisp and close ($\leq 2$ pixels) to the ground truth boundaries. Table 1 compares our PSF algorithm with other dense optical flow algorithms on the Yosemite sequence (the angular error was computed as in [2]). As can be seen, our method is superior to other published methods, even though it utilizes only two frames. Primarily, our method is an optical flow estimation method; however, it also generates a segmentation result, as presented in Figure 3. It can be seen that the segmentation provided by our method is much closer to the horizon than that generated by the initialization step.

Similar performance improvement was achieved on the Street with car sequence [24]. The sequence was converted from color to grey-levels. Due to the noisy nature of this sequence, pre-blurring leads to improved results. In this case, the best performance of both our algorithm and our implementation of [7] was obtained using a Gaussian kernel with $\sigma = 1.0$. The results for the Street with car sequence are presented in Figure 2 and in Table 2. The angular error is again significantly lower than that of other published methods,



Figure 1: (a) Frame 8 of the Yosemite sequence. (b) The horizontal flow as a function of vertical position, at the sky-ground boundary. Solid line: ground truth. Dotted line: Brox et al.'s 2D result. Dashed line: our piecewise-smooth flow result. (c) Angular error in Brox et al.'s 2D results. Dark means no error. (d) Angular error using our piecewise-smooth flow algorithm.

Technique                     AAE     STD
Horn-Schunck, modified [2]    9.78°   16.19°
Mémin-Pérez [10]              4.69°   6.89°
Brox et al. (2D) [7]          2.46°   7.31°
Brox et al. (3D) [7]          1.94°   6.02°
Our PSF algorithm             1.64°   5.82°

Table 1: Comparison between several dense optical flow algorithms on the Yosemite sequence. (AAE: Average angular error. STD: Standard deviation of the angular error.)



Figure 2: (a) Frame 10 of the Street with car sequence. (b) Frame 11. (c) Angular error with Brox et al.'s 2D method (black: no error). (d) Angular error using our piecewise-smooth flow algorithm.

Technique                      AAE     STD
Proesmans et al. [24]          7.41°
Weickert-Schnörr (3D) [25]     4.85°
Brox et al. (2D)               3.03°   10.18°
Our PSF algorithm              2.01°   8.84°

Table 2: Comparison between several dense optical flow algorithms on the Street with car sequence.


Figure 3: Boundary after initialization and at the end state of our algorithm. (a) Sky segmentation after initialization in white; black is the segmentation at the end of our algorithm. (b) Detail of (a).

          Yosemite          Street with Car
κ         AAE     STD       AAE     STD
0.03      1.64°   5.82°     2.01°   8.84°
0.06      1.79°   6.49°     2.03°   8.66°
0.015     1.62°   6.02°     2.14°   9.72°
0.12      1.95°   7.38°     2.22°   9.43°
0.0075    1.71°   6.60°     2.43°   11.12°

Table 3: Robustness of the algorithm to variation in the parameter κ when tested on the Yosemite and Street with Car sequences.

and the error around the motion boundary is smaller.

Parameter robustness for [7] was demonstrated in their work. In this work we introduced an additional parameter: $\kappa$. We tested its sensitivity by varying it by a factor of up to 4 from its optimum setting on both the Yosemite and Street with Car sequences. Table 3 demonstrates that the algorithm shows high robustness with respect to this parameter on both sequences.

To test the PSF algorithm on real data, we used the NASA sequence [2]. Optical flow discontinuities were amplified by considering the flow between frames 10 and 15 of the sequence.


Figure 4: The optical flow between frames 10 and 15 of the NASA sequence, computed using the suggested Piecewise-Smooth Flow (PSF) algorithm. The computed flow is dense, but for clarity, vectors that represent negligible motion are not shown.

In this sequence the camera is moving toward the objects, and major flow discontinuities arise near the two vertical pencils. Figure 4 shows that the flow edges computed using our PSF algorithm are sharp. Comparison with the performance of previous techniques on this sequence [2] reveals the superior performance of the PSF algorithm.
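The two numerical ingredients of the evaluation above, the smoothed Heaviside function and the angular error measure of [2], can be sketched as follows (function names are our assumptions):

```python
import numpy as np

def heaviside_smooth(phi, delta=1.0):
    """Smoothed Heaviside approximation H_Delta used in Section 4."""
    return 0.5 * (1.0 + (2.0 / np.pi) * np.arctan(phi / delta))

def average_angular_error(u, v, u_gt, v_gt):
    """Average angular error (degrees) between estimated and ground
    truth flow, measured between the 3D vectors (u, v, 1)."""
    num = u * u_gt + v * v_gt + 1.0
    den = np.sqrt((u**2 + v**2 + 1.0) * (u_gt**2 + v_gt**2 + 1.0))
    return np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0))).mean()
```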

5 Special case: static camera

Homogeneous regions in an image sequence provide no cues for optical flow computation. Optical flow methods that are not committed to dense output, e.g., Lucas-Kanade [4], simply do not provide optical flow estimation in uniform regions. Dense optical flow algorithms, which emphasize the smoothness of the computed flow, extrapolate flow from informative segments

to adjacent homogeneous regions. This leads to large errors in static homogeneous areas. The optical flow computed in uniform regions can be forced to be static [1]. Obviously, errors will arise wherever the underlying assumption (that homogeneous regions are static) does not hold. The inherent ambiguity about the motion of uniform regions can be resolved by exploiting additional information. In this section, the case of a static camera is considered, and the additional constraint is used to reduce the optical flow estimation error.

5.1 The Static Camera Flow (SCF) algorithm

When the camera is known to be static, an image of the static background can improve the separation between the moving elements and the static parts of the frames. When the background image is not directly available, it can be recovered using temporal filtering [22]. A simple background estimation method is temporal weighted median filtering [23], with weights inversely proportional to the magnitude of movement detected by the algorithm of [7]. The weighted median filter converges to the background image much faster than a non-weighted median filter, because moving elements are mostly ignored.

In the piecewise-smooth flow (PSF) algorithm described in section 3, the estimated optical flow in each pixel is taken either from the field $\vec{w}^-$ (dominant motion) or from $\vec{w}^+$ (other motions), see Eq. 8. Here, given the static camera assumption, the optical flow assigned to each pixel is either zero (static region) or $\vec{w}$ (moving). The likelihood that a given point $\vec{x}$ in an image is moving is quantified by the dissimilarity of the image $I$ and the background image $I_{bg}$ at that point:
$$f_{BG} = \Psi\!\left( |I_{bg}(\vec{x}) - I(\vec{x})|^2 + \gamma|\nabla I_{bg}(\vec{x}) - \nabla I(\vec{x})|^2 \right)$$

In the static camera flow (SCF) method, the objective functional embeds the selection between the estimated optical flow in moving areas and zero flow in static regions within the Vese-Chan segmentation framework [16]:
$$\begin{aligned}
F(u, v, \phi) = {} & \int \Psi\!\left( |I(\vec{x} + \vec{w}) - I(\vec{x})|^2 + \gamma|\nabla I(\vec{x} + \vec{w}) - \nabla I(\vec{x})|^2 \right) H(\kappa\phi)\, d\vec{x} \\
& + \lambda \int \Psi\!\left( |I_{bg}(\vec{x}) - I(\vec{x})|^2 + \gamma|\nabla I_{bg}(\vec{x}) - \nabla I(\vec{x})|^2 \right) H(-\kappa\phi)\, d\vec{x} \\
& + \mu \int \Psi(|\nabla u|^2 + |\nabla v|^2) H(\phi)\, d\vec{x} \\
& + \nu \int |\nabla H(\phi)|\, d\vec{x}
\end{aligned} \qquad (17)$$

Compare this functional to that of the PSF method (Eq. 8). The term in the latter that represents the optical flow fidelity of the dominant motion is replaced by the image-background dissimilarity term. The dominant motion smoothness term that appears in Eq. 8, but is zero in the static case, is discarded. From this functional the Euler-Lagrange equations for $\vec{w}$ and $\phi$ are derived. The equations for $\vec{w}$ are (10-13), and the equation for $\phi$ is

$$\frac{\partial\phi}{\partial t} = \nu\,\delta(\phi)\,\nabla\cdot\left(\frac{\nabla\phi}{|\nabla\phi|}\right) - \mu\,\delta(\phi)\, s_{of}(u, v) - \kappa\,\delta(\kappa\phi)\left[ f_{of}(u, v) - \lambda f_{BG} \right] \qquad (18)$$

In practice, to account for noise, slight Gaussian blurring is applied to the images as a preprocessing step. The field $\vec{w}$ is initialized to the optical flow calculated using [7]. The level set function $\phi$ is initialized to $-1$ throughout the image. Once the solutions $\vec{w}$ and $\phi$ are determined, the

optical flow field is estimated as
$$\vec{w} = \begin{cases} \vec{w} & \text{if } \phi > 0 \\ 0 & \text{otherwise} \end{cases} \qquad (19)$$

5.2 Experiments with the Static Camera Flow (SCF) algorithm

To test the static camera flow algorithm with a known background image, we used a sequence from the TAU-Dance database [27]. Optical flow estimation in this sequence is difficult because of the homogeneous regions in the frames. Figure 5 demonstrates that the static camera flow algorithm classifies the background as static and the flow within the dancer contour is smooth, while the algorithm of [7] and the piecewise smooth flow (PSF) algorithm classify much of the static background as moving.

The Ettlinger Tor sequence [26] shows traffic in an intersection, viewed by a static camera; see Fig. 6a. The background image was extracted from the fifty frames available, and is shown in Figure 6b. Due to the limited number of frames and the motion of the large bus, the estimated background image is not ideal; note the artifacts near the light post. The optical flow computed using the algorithm of Brox et al. is shown in Figure 6c-d. It is generally adequate, but in certain areas, e.g., between the bus and the preceding car, over-smoothing leads to false flow. The output of the static camera flow algorithm is presented in Figure 6e-f. It provides reliable estimation of the optical flow near optical flow discontinuities, while maintaining the good performance of [7] in other parts of the flow field. Note that identical parameters were used for both the TAU-Dance and Ettlinger Tor sequences: $\gamma = 100$, $\mu = 80$, $\nu = 0.04 \cdot 255$, $\lambda = 1.5$, $\kappa = 0.4$, and the number of iterations was 50. The parameters for the initialization segmentation [21] were the same as in Section 4.
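The weighted temporal median used above for background extraction (Section 5.1) can be sketched as follows (a minimal sketch; the function name and the exact weighting convention are our assumptions):

```python
import numpy as np

def weighted_temporal_median(frames, weights):
    """Per-pixel weighted temporal median, a simple background
    estimator: frames with little detected motion get higher weight.
    frames: (T, H, W) array; weights: (T, H, W) positive array."""
    T, H, W = frames.shape
    bg = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            vals = frames[:, i, j]
            order = np.argsort(vals)
            csum = np.cumsum(weights[order, i, j])
            # first value at which cumulative weight passes half the total
            k = np.searchsorted(csum, 0.5 * csum[-1])
            bg[i, j] = vals[order[k]]
    return bg
```

With weights inversely proportional to detected motion magnitude, moving foreground samples contribute little to the per-pixel median, so the estimate converges to the background quickly.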



Figure 5: (a) An image from the TAU Dance database. (b) Optical flow calculated using the algorithm of [7]. (c) Optical flow calculated using the piecewise smooth flow (PSF) algorithm. (d) Optical flow calculated using the static camera flow (SCF) algorithm. In (b)-(d), the computed flow is dense, but for clarity, vectors that represent negligible motion are not shown.



Figure 6: (a) Frame 1 of the Ettlinger Tor sequence. (b) Background image derived from fifty images. (c) Optical flow calculated using Brox et al.'s algorithm. (d) Detail of (c): the bus and preceding car. (e) Optical flow calculated using our static camera flow (SCF) algorithm. (f) Detail of (e): the bus and preceding car. In (c)-(f), the computed flow is dense, but for clarity, vectors that represent negligible motion are not shown.


6 Discussion

The primary contribution of this work is the novel piecewise smooth flow algorithm, which embeds Brox et al.'s optical flow functional within Vese and Chan's segmentation scheme. The sharp discontinuities in the estimated optical flow field provide significant performance gains. Excellent results were shown for image sequences with two dominant motions. We developed an initialization scheme based on affine flow segmentation to facilitate convergence to the correct minimum of the proposed non-convex functional.

Further research should be conducted regarding sequences with more than two dominant motions. One possible approach is to use Vese and Chan's four-phase formulation [16], which theoretically is equivalent to the Mumford-Shah formulation. However, while their two-phase model has a piecewise smooth extension [16], their four-phase model is piecewise constant, which is not directly applicable to our needs. An initialization scheme for a four-phase piecewise smooth Vese-Chan model is a complex problem, and may warrant future research by itself. A multi-phase segmentation model of motion is supported by Cremers and Soatto [13]. They achieve this by restricting the motion to an affine space, and thus in each phase the motion has a constant parameter vector. This model supports complex motion topologies, but is not effective as an optical flow estimation method, as it does not support non-parametric motion.

Variational methods are often perceived as slow. However, it was recently demonstrated that variational optical flow calculation can be accelerated to real-time speeds using multigrid methods [29]. The computational cost of our method is related to that of Brox et al. [7], because it applies [7] iteratively, in conjunction with the level-set evolution.

Optical flow calculation has traditionally focused on grayscale sequences, with some research on color sequences [28]. The extension of the piecewise smooth flow functional to support color is straightforward, and the robust kernel in the fidelity term should protect it from outliers due to edge-related color-channel mismatch. The extension of optical flow to range sequences, also known as range flow [30], is a greater challenge, because the brightness constancy assumption does not hold in that context. To accommodate the brightness changes caused by objects moving perpendicular to the image plane, the flow calculated must be a 3D vector, the additional component representing movement towards or away from the camera.

An additional contribution of this work is employing an estimate of the background image when the sequence is known to be taken with a static camera. We used the level set method to separate the static background from the moving elements in the frames. The static camera flow algorithm performed better than state-of-the-art optical flow methods on challenging sequences that include large homogeneous regions.

Appendix A: Numerical scheme

A.1 Basics

This appendix provides details on the numerical approximation and discretization methods used in the implementation of our algorithm. We denote
$$h \triangleq \Delta x = \Delta y\,.$$

Discrete spatial indices are denoted by subscripts, and the temporal index (iteration count) by a superscript, as in
$$\phi_{i,j}^n \triangleq \phi(n\Delta t, i\Delta x, j\Delta y)\,.$$
The numerical approximations of the Heaviside function and its derivative are
$$H_\Delta(x) = \frac{1}{2}\left(1 + \frac{2}{\pi}\arctan\left(\frac{x}{\Delta}\right)\right)\,, \qquad \delta_\Delta(x) = H'_\Delta(x) = \frac{1}{\pi}\,\frac{\Delta}{\Delta^2 + x^2}$$

The $L^1$ norm and its derivative are approximated by
$$\Psi(x) = \sqrt{x + \epsilon^2}\,, \qquad \Psi'(x) = \frac{1}{2\sqrt{x + \epsilon^2}}$$

Given the warping of an image by an optical flow field, the image values on the resampling grid are obtained by bilinear interpolation.
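The bilinear resampling mentioned above can be sketched as follows (a minimal NumPy sketch; the function name and border clamping are our assumptions):

```python
import numpy as np

def warp_bilinear(I, u, v):
    """Warp image I by flow (u, v): sample I at (x + u, y + v)
    with bilinear interpolation, clamping at the borders."""
    H, W = I.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(float)
    x = np.clip(xs + u, 0, W - 1)
    y = np.clip(ys + v, 0, H - 1)
    x0 = np.floor(x).astype(int); x1 = np.minimum(x0 + 1, W - 1)
    y0 = np.floor(y).astype(int); y1 = np.minimum(y0 + 1, H - 1)
    wx, wy = x - x0, y - y0
    return ((1 - wy) * ((1 - wx) * I[y0, x0] + wx * I[y0, x1])
            + wy * ((1 - wx) * I[y1, x0] + wx * I[y1, x1]))
```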

A.2 Optical flow

Consider the linearization and discretization of Eq. 10 for $u^+$ ($\phi > 0$). The treatment of Eq. 11 at $\phi > 0$ for $v^+$, and of both equations at $\phi < 0$ (for $u^-$ and $v^-$), is similar.

$$H(\kappa\phi)\,\Psi'\!\left( (I_z^{+})^2 + \gamma\left((I_{xz}^{+})^2 + (I_{yz}^{+})^2\right) \right)\left( I_x^{+} I_z^{+} + \gamma(I_{xx}^{+} I_{xz}^{+} + I_{xy}^{+} I_{yz}^{+}) \right) - \mu\,\mathrm{div}\!\left( H(\phi)\,\Psi'(|\nabla u^{+}|^2 + |\nabla v^{+}|^2)\,\nabla u^{+} \right) = 0$$

This equation is highly non-linear. Using fixed point iterations over $(u^k, v^k)$, and denoting by $I_*^{k,+}$ the adaptation of the abbreviations (9) to the functions in the $k$-th iteration, we obtain
$$H_\Delta(\kappa\phi)\,\Psi'\!\left( (I_z^{k+1,+})^2 + \gamma\left((I_{xz}^{k+1,+})^2 + (I_{yz}^{k+1,+})^2\right) \right)\cdot\left( I_x^{k,+} I_z^{k+1,+} + \gamma(I_{xx}^{k,+} I_{xz}^{k+1,+} + I_{xy}^{k,+} I_{yz}^{k+1,+}) \right) - \mu\,\mathrm{div}\!\left( H_\Delta(\phi)\,\Psi'\!\left( |\nabla u^{k+1,+}|^2 + |\nabla v^{k+1,+}|^2 \right)\nabla u^{k+1,+} \right) = 0$$

To remove the non-linearity that still remains in this equation, we use the first order Taylor expansions
$$\begin{aligned}
I_z^{k+1,+} &\approx I_z^{k,+} + I_x^{k,+}\, du^{k,+} + I_y^{k,+}\, dv^{k,+} \\
I_{xz}^{k+1,+} &\approx I_{xz}^{k,+} + I_{xx}^{k,+}\, du^{k,+} + I_{xy}^{k,+}\, dv^{k,+} \\
I_{yz}^{k+1,+} &\approx I_{yz}^{k,+} + I_{xy}^{k,+}\, du^{k,+} + I_{yy}^{k,+}\, dv^{k,+}
\end{aligned}$$
where
$$du^{k,+} \triangleq u^{k+1,+} - u^{k,+}\,, \qquad dv^{k,+} \triangleq v^{k+1,+} - v^{k,+}$$

For conciseness, we introduce the abbreviations
$$(\Psi')_{\mathrm{Fidelity}}^{k,+} \triangleq \Psi'\!\Big( (I_z^{k,+} + I_x^{k,+} du^{k,+} + I_y^{k,+} dv^{k,+})^2 + \gamma\big( (I_{xz}^{k,+} + I_{xx}^{k,+} du^{k,+} + I_{xy}^{k,+} dv^{k,+})^2 + (I_{yz}^{k,+} + I_{xy}^{k,+} du^{k,+} + I_{yy}^{k,+} dv^{k,+})^2 \big) \Big)$$
$$(\Psi')_{\mathrm{Smooth}}^{k,+} \triangleq \Psi'\!\left( |\nabla(u^{k,+} + du^{k,+})|^2 + |\nabla(v^{k,+} + dv^{k,+})|^2 \right)$$

Substituting the Taylor expansions and applying the abbreviations, we obtain
$$\begin{aligned}
H(\kappa\phi)\,(\Psi')_{\mathrm{Fidelity}}^{k,+} \cdot \Big[ & I_x^{k,+}\,(I_z^{k,+} + I_x^{k,+} du^{k,+} + I_y^{k,+} dv^{k,+}) \\
& + \gamma I_{xx}^{k,+}\,(I_{xz}^{k,+} + I_{xx}^{k,+} du^{k,+} + I_{xy}^{k,+} dv^{k,+}) \\
& + \gamma I_{xy}^{k,+}\,(I_{yz}^{k,+} + I_{xy}^{k,+} du^{k,+} + I_{yy}^{k,+} dv^{k,+}) \Big] \\
& - \mu\,\mathrm{div}\!\left[ H(\phi)\,(\Psi')_{\mathrm{Smooth}}^{k,+}\,\nabla(u^{k,+} + du^{k,+}) \right] = 0
\end{aligned}$$

We finally obtain a linear equation by another, nested, fixed point iteration loop. The internal fixed point iteration count is denoted by an additional superscript (n or n+1).

$$H(\kappa\phi^n)\,(\Psi')_{\mathrm{Fidelity}}^{k,n,+}\Big(I_x^{k,+}\big(I_z^{k,+} + I_x^{k,+}\,du^{k,n+1,+} + I_y^{k,+}\,dv^{k,n+1,+}\big) + \gamma I_{xx}^{k,+}\big(I_{xz}^{k,+} + I_{xx}^{k,+}\,du^{k,n+1,+} + I_{xy}^{k,+}\,dv^{k,n+1,+}\big) + \gamma I_{xy}^{k,+}\big(I_{yz}^{k,+} + I_{xy}^{k,+}\,du^{k,n+1,+} + I_{yy}^{k,+}\,dv^{k,n+1,+}\big)\Big) - \mu\,\mathrm{div}\Big[H(\phi^n)\,(\Psi')_{\mathrm{Smooth}}^{k,n,+}\,\nabla\big(u^{k,+} + du^{k,n+1,+}\big)\Big] = 0$$

A similar linearized equation is obtained for v^+. Discretization of both equations leads to a sparse linear system that can be solved using successive over-relaxation or more sophisticated solvers. The solution for u^- and v^- is analogous.


A.3 Segmentation

We wish to numerically solve Eq. 14:

$$\frac{\partial\phi}{\partial t} = \nu\,\delta(\phi)\,\nabla\cdot\!\left(\frac{\nabla\phi}{|\nabla\phi|}\right) - \mu\,\delta(\phi)\big(s_{of}(u^+, v^+) - s_{of}(u^-, v^-)\big) - \kappa\,\delta(\kappa\phi)\big(f_{of}(u^+, v^+) - f_{of}(u^-, v^-)\big)$$

We adopt the notations of [16]:

$$C_1 = \frac{1}{\sqrt{\left(\frac{\phi^n_{i+1,j} - \phi^n_{i,j}}{h}\right)^2 + \left(\frac{\phi^n_{i,j+1} - \phi^n_{i,j-1}}{2h}\right)^2}} \qquad C_2 = \frac{1}{\sqrt{\left(\frac{\phi^n_{i,j} - \phi^n_{i-1,j}}{h}\right)^2 + \left(\frac{\phi^n_{i-1,j+1} - \phi^n_{i-1,j-1}}{2h}\right)^2}}$$

$$C_3 = \frac{1}{\sqrt{\left(\frac{\phi^n_{i+1,j} - \phi^n_{i-1,j}}{2h}\right)^2 + \left(\frac{\phi^n_{i,j+1} - \phi^n_{i,j}}{h}\right)^2}} \qquad C_4 = \frac{1}{\sqrt{\left(\frac{\phi^n_{i+1,j-1} - \phi^n_{i-1,j-1}}{2h}\right)^2 + \left(\frac{\phi^n_{i,j} - \phi^n_{i,j-1}}{h}\right)^2}}$$

and

$$m = \frac{\Delta t}{h^2}\,\delta_\Delta(\phi_{i,j})\,\nu, \qquad C = 1 + m\,(C_1 + C_2 + C_3 + C_4)$$
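These coefficients can be evaluated over the whole grid at once with shifted views of φ. A NumPy sketch (edge replication at the boundary and the small eta guard against a vanishing gradient are our additions):

```python
import numpy as np

def curvature_coeffs(phi, h=1.0, eta=1e-8):
    """C1..C4 of the Vese-Chan semi-implicit curvature discretization,
    computed from edge-replicated shifts of the level set function phi."""
    p = np.pad(phi, 1, mode='edge')
    c = p[1:-1, 1:-1]                        # phi_{i,j}
    ip, im = p[2:, 1:-1], p[:-2, 1:-1]       # phi_{i+1,j}, phi_{i-1,j}
    jp, jm = p[1:-1, 2:], p[1:-1, :-2]       # phi_{i,j+1}, phi_{i,j-1}
    imjp, imjm = p[:-2, 2:], p[:-2, :-2]     # phi_{i-1,j+1}, phi_{i-1,j-1}
    ipjm = p[2:, :-2]                        # phi_{i+1,j-1}
    C1 = 1.0 / np.sqrt(((ip - c) / h)**2 + ((jp - jm) / (2*h))**2 + eta)
    C2 = 1.0 / np.sqrt(((c - im) / h)**2 + ((imjp - imjm) / (2*h))**2 + eta)
    C3 = 1.0 / np.sqrt(((ip - im) / (2*h))**2 + ((jp - c) / h)**2 + eta)
    C4 = 1.0 / np.sqrt(((ipjm - imjm) / (2*h))**2 + ((c - jm) / h)**2 + eta)
    return C1, C2, C3, C4
```

For a φ that increases linearly with unit slope along one axis, all four coefficients equal one at interior pixels, which is a convenient correctness check.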

The discretized fidelity and smoothness terms are denoted by

$$\mathrm{fid}^{n,+}_{i,j} = \Psi\Big[\big(I_z^{n,+}\big)^2_{i,j} + \gamma\Big(\big(I_{xz}^{n,+}\big)^2_{i,j} + \big(I_{yz}^{n,+}\big)^2_{i,j}\Big)\Big] \qquad \mathrm{fid}^{n,-}_{i,j} = \Psi\Big[\big(I_z^{n,-}\big)^2_{i,j} + \gamma\Big(\big(I_{xz}^{n,-}\big)^2_{i,j} + \big(I_{yz}^{n,-}\big)^2_{i,j}\Big)\Big]$$

$$\mathrm{grad}^{n,+}_{i,j} = \Psi\left[\left(\frac{u^{n,+}_{i+1,j} - u^{n,+}_{i-1,j}}{2h}\right)^2 + \left(\frac{u^{n,+}_{i,j+1} - u^{n,+}_{i,j-1}}{2h}\right)^2 + \left(\frac{v^{n,+}_{i+1,j} - v^{n,+}_{i-1,j}}{2h}\right)^2 + \left(\frac{v^{n,+}_{i,j+1} - v^{n,+}_{i,j-1}}{2h}\right)^2\right]$$

$$\mathrm{grad}^{n,-}_{i,j} = \Psi\left[\left(\frac{u^{n,-}_{i+1,j} - u^{n,-}_{i-1,j}}{2h}\right)^2 + \left(\frac{u^{n,-}_{i,j+1} - u^{n,-}_{i,j-1}}{2h}\right)^2 + \left(\frac{v^{n,-}_{i+1,j} - v^{n,-}_{i-1,j}}{2h}\right)^2 + \left(\frac{v^{n,-}_{i,j+1} - v^{n,-}_{i,j-1}}{2h}\right)^2\right]$$
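With Ψ(s²) = √(s² + ε²), the smoothness term can be evaluated by central differences over the whole flow field. A minimal sketch (edge replication at the boundary and the ε value are our assumptions):

```python
import numpy as np

EPS = 1e-3  # regularization in Psi(s^2) = sqrt(s^2 + eps^2)

def smoothness_term(u, v, h=1.0):
    """grad_{i,j} = Psi(|grad u|^2 + |grad v|^2) via central differences,
    with edge replication at the image boundary."""
    def cgrad2(f):
        p = np.pad(f, 1, mode='edge')
        fi = (p[2:, 1:-1] - p[:-2, 1:-1]) / (2 * h)  # derivative along i
        fj = (p[1:-1, 2:] - p[1:-1, :-2]) / (2 * h)  # derivative along j
        return fi**2 + fj**2
    return np.sqrt(cgrad2(u) + cgrad2(v) + EPS**2)
```

A constant flow field yields the minimum value ε everywhere, so the competition between the + and - regions in the evolution equation is driven only by genuine flow variation.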

The discrete evolution equation for φ finally takes the form

$$\phi^{n+1}_{i,j} = \frac{1}{C}\Big[\phi^n_{i,j} + m\big(C_1\,\phi^n_{i+1,j} + C_2\,\phi^n_{i-1,j} + C_3\,\phi^n_{i,j+1} + C_4\,\phi^n_{i,j-1}\big) - \Delta t\,\mu\,\delta_\Delta(\phi^n_{i,j})\big(\mathrm{grad}^{n,+}_{i,j} - \mathrm{grad}^{n,-}_{i,j}\big) - \Delta t\,\kappa\,\delta_\Delta(\kappa\phi^n_{i,j})\big(\mathrm{fid}^{n,+}_{i,j} - \mathrm{fid}^{n,-}_{i,j}\big)\Big]$$



Acknowledgments This research was supported by MUSCLE: Multimedia Understanding through Semantics, Computation and Learning, a European Network of Excellence funded by the EC 6th Framework IST Programme.


References

[1] G. Aubert and P. Kornprobst, Mathematical Problems in Image Processing, Springer, 2002.
[2] J.L. Barron, D.J. Fleet and S.S. Beauchemin, "Performance of Optical Flow Techniques," Int. J. Computer Vision, Vol. 12, pp. 43-77, 1994.
[3] B. Horn and B. Schunck, "Determining Optical Flow," Artificial Intelligence, Vol. 17, pp. 185-203, 1981.
[4] B. Lucas and T. Kanade, "An Iterative Image Registration Technique with Application to Stereo Vision," Proc. DARPA Image Understanding Workshop, pp. 121-130, 1981.
[5] D.J. Fleet and A.D. Jepson, "Computation of Component Image Velocity from Local Phase Information," Int. J. Computer Vision, Vol. 5, pp. 77-104, 1990.
[6] M.J. Black and D.J. Fleet, "Probabilistic Detection and Tracking of Motion Boundaries," Int. J. Computer Vision, Vol. 38, pp. 231-245, 2000.
[7] T. Brox, A. Bruhn, N. Papenberg and J. Weickert, "High Accuracy Optical Flow Estimation Based on a Theory for Warping," Proc. 8th European Conf. Computer Vision, Part IV: Lecture Notes in Computer Science, Vol. 3024, pp. 25-36, 2004.
[8] P. Anandan, "A Computational Framework and an Algorithm for the Measurement of Visual Motion," Int. J. Computer Vision, Vol. 2, pp. 283-310, 1989.
[9] M.J. Black and P. Anandan, "The Robust Estimation of Multiple Motions: Parametric and Piecewise-Smooth Flow Fields," Computer Vision and Image Understanding, Vol. 63, pp. 75-104, 1996.
[10] E. Mémin and P. Pérez, "A Multigrid Approach for Hierarchical Motion Estimation," Proc. 6th Int. Conf. Computer Vision, pp. 933-938, Bombay, India, 1998.
[11] P. Nesi, "Variational Approach to Optical Flow Estimation Managing Discontinuities," Image and Vision Computing, Vol. 11, pp. 419-439, 1993.
[12] J.Y.A. Wang and E.H. Adelson, "Representing Moving Images with Layers," IEEE Trans. Image Processing, Vol. 3, pp. 625-638, 1994.
[13] D. Cremers and S. Soatto, "Motion Competition: A Variational Approach to Piecewise Parametric Motion Segmentation," Int. J. Computer Vision, Vol. 62, pp. 249-265, 2005.
[14] D. Mumford and J. Shah, "Optimal Approximation by Piecewise Smooth Functions and Associated Variational Problems," Comm. Pure and Applied Mathematics, Vol. 42, pp. 577-685, 1989.
[15] L. Ambrosio and V.M. Tortorelli, "Approximation of Functionals Depending on Jumps by Elliptic Functionals via Γ-Convergence," Comm. Pure and Applied Mathematics, Vol. 43, pp. 999-1036, 1990.
[16] L.A. Vese and T.F. Chan, "A Multiphase Level Set Framework for Image Segmentation Using the Mumford and Shah Model," Int. J. Computer Vision, Vol. 50, pp. 271-293, 2002.
[17] N. Paragios and R. Deriche, "Geodesic Active Regions and Level Set Methods for Motion Estimation and Tracking," Computer Vision and Image Understanding, Vol. 97, pp. 259-282, 2005.
[18] T. Amiaz and N. Kiryati, "Dense Discontinuous Optical Flow via Contour-Based Segmentation," Proc. IEEE Int. Conf. Image Processing (ICIP 2005), Vol. III, pp. 1264-1267, Genova, Italy, 2005.
[19] S. Osher and J.A. Sethian, "Fronts Propagating with Curvature Dependent Speed: Algorithms Based on Hamilton-Jacobi Formulation," J. Computational Physics, Vol. 79, pp. 12-49, 1988.
[20] A. Dervieux and F. Thomasset, "A Finite Element Method for the Simulation of Rayleigh-Taylor Instability," Lecture Notes in Mathematics, Vol. 771, pp. 145-158, 1979.
[21] G.D. Borshukov, G. Bozdagi, Y. Altunbasak and A.M. Tekalp, "Motion Segmentation by Multistage Affine Classification," IEEE Trans. Image Processing, Vol. 6, pp. 1591-1594, 1997.
[22] M. Irani and S. Peleg, "Motion Analysis for Image Enhancement: Resolution, Occlusion and Transparency," J. Visual Comm. Image Representation, Vol. 4, pp. 324-335, 1993.
[23] D.R.K. Brownrigg, "The Weighted Median Filter," Commun. ACM, Vol. 27, pp. 807-818, 1984.
[24] B. Galvin, B. McCane, K. Novins, D. Mason and S. Mills, "Recovering Motion Fields: An Evaluation of Eight Optical Flow Algorithms," Proc. British Machine Vision Conference, Southampton, UK, 1998.
[25] J. Weickert and C. Schnörr, "Variational Optic Flow Computation with a Spatiotemporal Smoothness Constraint," J. Math. Imaging and Vision, Vol. 14, pp. 245-255, 2001.
[26] H. Kollnig, H.-H. Nagel and M. Otte, "Association of Motion Verbs with Vehicle Movements Extracted from Dense Optical Flow Fields," Lecture Notes in Computer Science, Vol. 801, pp. 338-347, 1994.
[27] L. Bar, S. Rochel and N. Kiryati, "TAU-DANCE: Tel-Aviv University Multiview and Omnidirectional Video Dance Library," VIA - Vision and Image Analysis Laboratory, School of Electrical Engineering, Tel Aviv University, January 2005.
[28] R.J. Andrews and B.C. Lovell, "Color Optical Flow," Proc. Workshop Digital Image Computing, Brisbane, pp. 135-139, 2003.
[29] A. Bruhn, J. Weickert, T. Kohlberger and C. Schnörr, "Discontinuity-Preserving Computation of Variational Optic Flow in Real-Time," Proc. Scale-Space 2005: Lecture Notes in Computer Science, Vol. 3459, pp. 279-290, 2005.
[30] H. Spies, B. Jähne and J.L. Barron, "Range Flow Estimation," Computer Vision and Image Understanding, Vol. 85, pp. 209-231, 2002.