Pattern Recognition Letters 6 (1987) 169-177
Volume 6, Number 3, August 1987
North-Holland
Determination of the optic flow field using the spatiotemporal deformation of region properties

T.L. HUNTSBERGER* and S.N. JAYARAMAMURTHY
Intelligent Systems Laboratory, Department of Computer Science, University of South Carolina, Columbia, SC 29208, USA

Received 24 October 1986
Abstract: Local feature based dynamic scene analysis methods are usually se[...] the determination of the optic flow field for single or multiple moving regions [...] noise. This [...] a method for region based features.
Key words: Dynamic scene analysis, optic flow, motion detection.
1. Introduction

Depth and object characteristics are difficult to extract from a single static image. Imaging and geometrical constraints in a sequence of images allow the extraction of motion parameters such as optical flow and surface orientation. Computation of the optic flow field has been the subject of much recent work (e.g. Barnard and Thompson (1980), Horn and Schunck (1981), Ullman and Hildreth (1982), Davis (1983), Haralick (1983), Nagel (1983), Prager and Arbib (1983), Hildreth (1984), Wohn and Waxman (1984), Bandyopadhyay (1984)). Most of the techniques reviewed either relied on establishment of correspondence between points or discrete features such as corners in different frames in the sequence, or used spatiotemporal gradients of image intensities in a constraint equation. The first method will fail when the desired features such as corners are absent, as is the case with curves. Also, both methods are sensitive to noise. In addition, the gradient approach suffers from problems encountered at the boundary interface between moving regions and the background, as pointed out by Schunck (1986). Flow fields that were generated using either of these techniques tend to be sparse, which limits their usefulness for structure from motion computations.

These problems can be minimized if the feature space is chosen based on region characteristics such as surface area or bounding contours, as noted by Kanatani (1985) and Aloimonos (1986). Noise in pixel values and positions will have less of an effect, since region characteristics are based on a collection of points.

In this paper we present a technique for the determination of the optic flow field in a sequence of images. This technique exploits the link between contour and region deformations that is inherent in the behavior of moving objects as viewed by a monocular observer. Possible problems with camera registration and camera operation limit approaches which use multiple cameras. Each image in a sequence is broken into regions (segmentation phase), which are characterized by a homogeneous interior color. It is to be noted that color is being used here to establish correspondence between regions in a series of consecutive frames. Deformations of feature properties of these regions over time are used to derive the optic flow field.

The next section contains a brief review of the method of region and color edge determination. This is followed by a discussion of the technique for the extraction of the optic flow field from deformations in the principal moments derived from these regions. The final section gives details about some preliminary experimental results.

* Contact author.
0167-8655/87/$7.50 © 1987, Elsevier Science Publishers B.V. (North-Holland)
2. Segmentation phase

Identification of nearly uniform color regions in a given image is the purpose of this phase of our technique. Clustering is used to obtain the global color characteristics of an image. The clustering in color space is done with the fuzzy c-means (FCM) algorithm generalized by Bezdek (1981). The RGB components of an image are treated as a vector during the course of the clustering procedure. However, edge location associated with these regions is also needed for the computation of the contour integrals necessary for optic flow field determination. A color edge detector has been developed which analyzes information obtained from the clustering procedure and identifies the boundaries between nearly homogeneous color regions. Space limitations prevent us from providing further details here. Interested readers may refer to our earlier work in Jacobs (1983), Huntsberger (1985, 1986) and Jayaramamurthy (1985).
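The clustering step can be illustrated with a minimal sketch of the textbook fuzzy c-means iteration applied to RGB vectors. This is not the authors' implementation: the function name `fuzzy_c_means`, the fuzziness exponent `m = 2`, the random initialization, and the stopping rule are all assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_means(pixels, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Textbook FCM on an (N, 3) array of RGB vectors.

    Returns (centers, memberships); memberships has shape (N, n_clusters)
    and each row sums to 1. All parameter defaults are illustrative.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(pixels, dtype=float)
    # Random initial memberships, normalized so each pixel's row sums to 1.
    u = rng.random((x.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Cluster centers: means of the pixels weighted by membership^m.
        w = u ** m
        centers = (w.T @ x) / w.sum(axis=0)[:, None]
        # Euclidean distance from every pixel to every center.
        d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # guard against division by zero
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(u_new - u).max() < tol
        u = u_new
        if converged:
            break
    return centers, u

# Example: three reddish and three bluish pixels, two clusters.
pixels = np.array([[250, 10, 10], [245, 5, 12], [255, 0, 0],
                   [10, 10, 250], [0, 5, 245], [5, 0, 255]], dtype=float)
centers, memberships = fuzzy_c_means(pixels, n_clusters=2)
labels = memberships.argmax(axis=1)  # hard labels taken from the fuzzy memberships
```

Taking the largest membership per pixel is only one way to turn the fuzzy partition into regions; the color edge detector described above uses the clustering information in a way that is not detailed here.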
3. Optic flow field determination

Overview
The technique mentioned in the previous section is extended to deal with dynamic scenes. We consider a packet of k frames at a time for analysis, k normally being four or five. The clustering analysis is done for the first frame of this k frame sequence. The color cluster centers obtained are used as reference centers for the calculation of the region characteristics for subsequent frames. Our assumption here is that the global color characteristics of the scene do not change drastically over k frames. The color characteristics are only being used to establish correspondence between regions in the sequence of frames. These homogeneous color regions are treated as planar patches.
A method for the determination of principal moments of image regions at video rates using cascaded delayed adders has recently been developed by Budrikis and Hatamian (1984). The principal moments so determined can be used as features for the analysis of image sequences. Our set of features is defined as:

$$F_{ij} = \iint_S M_{ij}(x, y)\,dx\,dy, \qquad (1)$$

where $M_{ij}(x, y)$ is defined as $x^i y^j$, and $S$ is the surface of the color region over which the integral is defined. The time variation of the feature integral (1) can be derived in terms of the optic flow field as:

$$\frac{dF}{dt} = \iint_S \{uM_x + vM_y + [u_x + v_y]M\}\,dx\,dy, \qquad (2)$$

where $u = dx/dt$, $v = dy/dt$ are the optic flow velocity components in the image plane and the $i$ and $j$ indices have been suppressed for clarity. Spatial derivatives of the flow field are given by $u_x$ and $v_y$, and spatial derivatives of the moment operators are $M_x$ and $M_y$. This formulation was suggested earlier by Kanatani (1985) and Amari (1978). However, the optic flow field parameters are not known in advance, and spatial derivatives such as $u_x$ and $v_y$ behave very poorly in the presence of noise. Elimination of the need to calculate spatial derivatives of the flow field can be done using Green's theorem of integral calculus to redefine this surface integral as a contour integral,

$$\frac{dF}{dt} = \oint_C [uM\,dy - vM\,dx]. \qquad (3)$$
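The step from (2) to (3) can be written out explicitly. The following is a sketch using the standard statement of Green's theorem for a counterclockwise contour $C$ bounding $S$; the auxiliary symbols $P$ and $Q$ are introduced here for illustration and do not appear in the original.

```latex
\begin{align*}
\oint_C (P\,dx + Q\,dy)
  &= \iint_S \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx\,dy
  && \text{(Green's theorem)} \\
\oint_C [uM\,dy - vM\,dx]
  &= \iint_S \left[ \frac{\partial (uM)}{\partial x} + \frac{\partial (vM)}{\partial y} \right] dx\,dy
  && \text{(taking } Q = uM,\ P = -vM\text{)} \\
  &= \iint_S \{ uM_x + vM_y + [u_x + v_y]M \}\,dx\,dy
   = \frac{dF}{dt}.
\end{align*}
```

The derivatives of the flow field appear only after the product rule is applied inside the surface integral, which is why the contour form avoids estimating them.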
This means that the temporal variations of surface features are equivalent to an appropriately chosen contour variation of the same principal moment functional. Waxman (1983, 1984) has shown that a truncated Taylor series expansion form can be used for the optical flow field parameters $u$ and $v$ for planar regions. This form is:

$$u = u_{00} + u_{10}x + u_{01}y + u_{11}xy + u_{20}x^2, \qquad (4a)$$

$$v = v_{00} + v_{10}x + v_{01}y + v_{11}xy + v_{02}y^2. \qquad (4b)$$

This truncated series expansion has been shown to be globally valid for planar regions by Waxman (1984). In other words, a flow field can be derived at each point in the region, thus eliminating some of the problems associated with deriving structure from sparse flow fields. Equation of motion constraints give $v_{11} = u_{20}$ and $v_{02} = u_{11}$. The contour integral of (3) now becomes:

$$\begin{aligned}
\frac{dF}{dt} ={}& u_{00}[M\,dy] - v_{00}[M\,dx] + u_{10}[xM\,dy] - v_{10}[xM\,dx] + u_{01}[yM\,dy] - v_{01}[yM\,dx] \\
&+ u_{20}\{[x^2M\,dy] - [xyM\,dx]\} + u_{11}\{[xyM\,dy] - [y^2M\,dx]\},
\end{aligned} \qquad (5)$$

where $[G\,da] \equiv \oint_C G\,da$. The eight unknowns in (5) can be found by using eight different linearly independent forms for $M$, as noted by Kanatani (1985). This series of equations has the advantage that they are linear.

Implementation
There are a number of concerns that must be addressed before such an analysis can be applied to real-world scenes. Among these are: (i) the development of a discrete implementation of Green's integral theorem; (ii) determination of the principal moments most useful for flow field determination; and finally (iii) determination of the number of frames k the sequence must have for a reasonable numerical solution. These points were never really addressed in the previous works by Kanatani (1985). The suggestion that two frames are enough for temporal derivative estimations will be shown to be in theory erroneous in the discussion below. Also, the calculations of the optic flow field parameters in his works were for extremely small, possibly sub-pixel, motion if actually applied to real-world scenes. Larger interframe motion causes problems with the algorithms (Kanatani, 1986). We were led to make a number of modifications to these assumptions.

Due to the discrete nature of digital images, Green's integral theorem cannot be applied directly to images in a sequence of frames. Usually, the boundary of a region is encoded using something like the Freeman chain code, and, as such, is an approximation to the true boundary in relation to the interior of the region. This will become a problem if numerical comparisons between the values of surface integrals and contour integrals are needed. As can be seen in (5), the left and right sides of the equation contain exactly this kind of comparison. This problem was studied earlier by Tang (1982), who developed a discrete version of Green's theorem. There were floating point operations involved during intermediate steps of his algorithm, which tended to affect the accuracy of results for higher order moments of large regions. We decided to use a supergrid approach similar to the one described in Brice and Fennema (1970) for the evaluation of the contour and surface integrals. In this framework, the pixels are assumed to be centered on the discrete row and column coordinates of the digitized images. The contour falls between pixels, and a lookup table was used to estimate the true distances traveled along the contour. Two consecutive chain code values were used as a hash code into the lookup table.

Since there are eight unknowns in the truncated Taylor series expansion for the optic flow field components, the principal moments that would be suitable candidates for substitution into equation (5) would be $x$, $y$, $xy$, $x^2$, $y^2$, $x^2y$, $xy^2$, $x^3$ and $y^3$. However, examination of (5) shows that the left hand side of the equation involves the discrete estimation of the temporal change of the principal moment integrals involving these functions. This temporal derivative for two frames is given by

$$\frac{dF}{dt} = F(2) - F(1), \qquad (6)$$
where $F(1)$ and $F(2)$ represent the values of $F$ in the first and second frames. As the powers of $x$ and $y$ become higher, a two sequential frame estimation of this differential becomes increasingly unreliable. In fact, the two-frame result is only applicable for the first order principal moments, which are linear in $x$ and $y$. Shariat and Price (1985, 1986) have recently examined the sensitivity of results of motion analysis to temporal estimations and coherence between frames. For principal moments up to second order, quadratic in $x$ and $y$, at least three sequential frames are needed for an estimate of the temporal derivative. The form of the derivative now becomes

$$\frac{dF}{dt} = \frac{F(3) - F(1)}{2}, \qquad (7)$$

where $F(1)$ and $F(3)$ are the values of $F$ in the first and third frames. Equation (7) will give the temporal derivative estimate for the second frame in the sequence. This analysis can be extended to include any higher order principal moment. An increasingly larger number of frames is needed however: four frames for third order, and five frames for fourth order. Since we are required to compute principal moments up to third order to generate the eight equations, four frames would be the minimum needed for a reasonably accurate solution to the series of equations given in (5). It would be expected that third order principal moments would be increasingly sensitive to noise.

Instead of deriving all eight equations using four frames directly, two series of four equations can be generated using a four frame sequence. This technique allows the principal moments to vary up to second order with minimal inaccuracy in the evaluation of the temporal derivative. The first three frames in the sequence can be used for the first four equations using principal moments of $x$, $y$, $xy$, and $x^2$. The second four equations can be generated using the last three frames in the four frame sequence. In addition, this method allows redundancy for error checking in that $y^2$ should give the same flow field results if substituted into (5). The resulting flow field will be applicable to the middle of the four frame sequence.

The eight linear equations in eight unknowns that are generated from (5) can be solved using any of the various inversion methods. However, there is the possibility of instability in the solution space due to singularities in the inversion matrices. To minimize this possibility, we used the QR factorization method, based on Householder's transformation, which has been shown to be a much more stable solution technique for simultaneous equations (Stewart, 1973).
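To make the structure of the linear system in (5) concrete, the following sketch assembles one row per moment $M = x^i y^j$ and solves for the eight flow parameters with a QR factorization. Several things here are assumptions rather than the authors' method: the region boundary is taken to be a counterclockwise polygon, the contour integrals are evaluated by Gauss-Legendre quadrature along each edge (not the supergrid and lookup-table scheme described above), all eight rows use a single contour instead of two three-frame series, and the names `contour_integrals`, `system_row`, `solve_flow`, `MOMENTS` and the example polygon are hypothetical.

```python
import numpy as np

# 6 Gauss-Legendre nodes integrate polynomials up to degree 11 exactly,
# more than enough for the degree-5 monomials that appear below.
_NODES, _WEIGHTS = np.polynomial.legendre.leggauss(6)

def contour_integrals(vertices, p, q):
    """Return ([x^p y^q dx], [x^p y^q dy]) around a closed polygon."""
    v = np.asarray(vertices, dtype=float)
    t = 0.5 * (_NODES + 1.0)   # quadrature nodes mapped to [0, 1]
    w = 0.5 * _WEIGHTS
    int_dx = 0.0
    int_dy = 0.0
    for (x0, y0), (x1, y1) in zip(v, np.roll(v, -1, axis=0)):
        x = x0 + t * (x1 - x0)
        y = y0 + t * (y1 - y0)
        g = np.sum(w * x**p * y**q)   # integral of the monomial over t in [0, 1]
        int_dx += g * (x1 - x0)
        int_dy += g * (y1 - y0)
    return int_dx, int_dy

def system_row(vertices, i, j):
    """Coefficients of (u00, v00, u10, v10, u01, v01, u20, u11) in (5)."""
    def dx(a, b):
        return contour_integrals(vertices, i + a, j + b)[0]
    def dy(a, b):
        return contour_integrals(vertices, i + a, j + b)[1]
    return np.array([
        dy(0, 0), -dx(0, 0),
        dy(1, 0), -dx(1, 0),
        dy(0, 1), -dx(0, 1),
        dy(2, 0) - dx(1, 1),   # u20 term, using v11 = u20
        dy(1, 1) - dx(0, 2),   # u11 term, using v02 = u11
    ])

def solve_flow(vertices, moments, dF_dt):
    """Solve the 8x8 system A p = dF/dt by QR factorization."""
    A = np.array([system_row(vertices, i, j) for i, j in moments])
    Q, R = np.linalg.qr(A)
    return np.linalg.solve(R, Q.T @ np.asarray(dF_dt, dtype=float))

# Illustrative choice of eight moments: x, y, xy, x^2, y^2, x^2 y, x y^2, x^3.
MOMENTS = [(1, 0), (0, 1), (1, 1), (2, 0), (0, 2), (2, 1), (1, 2), (3, 0)]

# Synthetic check: pick flow parameters, generate dF/dt from (5), recover them.
polygon = [(0.0, 0.0), (2.0, 0.2), (2.5, 1.5), (1.0, 2.0), (-0.3, 1.0)]
true_params = np.array([0.5, -0.2, 0.1, 0.05, -0.3, 0.2, 0.02, -0.01])
A = np.array([system_row(polygon, i, j) for i, j in MOMENTS])
estimated = solve_flow(polygon, MOMENTS, A @ true_params)
```

In practice the right-hand side would come from measured moment differences such as (7), not from a known flow, and a symmetric region centered on the origin can make the matrix singular, which is the kind of instability the QR approach is meant to mitigate.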
Figure 1. Four frame sequence for rectangular region [...]. Intensity = (R+G+B)/3.
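The difference between the two-frame estimate (6) and the three-frame estimate (7) can be checked on a toy signal. The sketch below assumes unit frame spacing and a feature value that varies quadratically in time; the function `F` and its coefficients are made up for illustration, and the link between moment order and temporal order is taken from the argument in the text, not derived here.

```python
def forward_difference(f1, f2, dt=1.0):
    """Two-frame estimate, as in (6): (F(2) - F(1)) / dt."""
    return (f2 - f1) / dt

def central_difference(f1, f3, dt=1.0):
    """Three-frame estimate, as in (7): (F(3) - F(1)) / (2 dt), valid at frame 2."""
    return (f3 - f1) / (2.0 * dt)

def F(t):
    """Hypothetical feature value, quadratic in time."""
    return 2.0 * t**2 + 3.0 * t + 1.0

def dF(t):
    """Exact time derivative of F."""
    return 4.0 * t + 3.0

two_frame = forward_difference(F(1), F(2))    # 9.0; exact values are 7.0 at frame 1, 11.0 at frame 2
three_frame = central_difference(F(1), F(3))  # 11.0; matches dF(2) exactly
```

For a signal that is linear in time both estimates are exact, which is consistent with the statement that two frames suffice only for first order moments.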
4. Experimental studies

We are currently studying the application of this method to a wide variety of dynamic color scenes. Preliminary results of flow field determination for a number of image sequences are reported here. The first two studies were done with laboratory sequences, and the third was done with a natural color traffic sequence. Our present hardware configuration limited us to the acquisition of four frames in all of the studies. The resolution of each frame is 240x256, with a color resolution of 24 bits. It should be emphasized here that no attempt has been made to smooth the frame sequences used in these experiments.

Figure 1 shows the first laboratory sequence, a rectangular region constrained to be planar by its placement on a swinging door. This motion sequence is quite complicated, since it is not just motion in the image plane, but a rotation in depth. The flow field derived using the technique developed above is shown in Figure 2, where the tip of each flow vector is indicated by a bright dot. In all of the following figures illustrating flow fields, only a sampled representation is shown. In reality, the flow field can be computed at every point of the region. As can be seen in Figure 2, the flow field seems reasonable as far as the magnitude and direction of the flow vectors for the given configuration are concerned. Rotation of a plane about a fixed axis would give rise to a greater velocity as the distance from the axis of rotation increases.

Figure 2. Flow field vectors for rectangular region in Figure 1.

The second laboratory sequence is shown in Figure 3, where a rectangular and a contoured region are placed as in the first experiment. Since these two regions are on the same rotating plane, their flow fields should be equivalent. Our analysis indicates that this is the case, as shown in Figure 4, where the tips of the flow field vectors are as in Figure 2.

Figure 3. Four frame sequence for multiple regions. Intensity = (R+G+B)/3.

Figure 4. Flow field vectors for multiple regions in Figure 3.

The third experiment used a color traffic sequence taken from the window of our laboratory. This sequence, shown in Figure 5, was analyzed in terms of two different colored regions. These two regions correspond to the roof and trunk of the same car in the sequence. This car is the bottom one in the dynamic scene shown in Figure 5. The car is experiencing primarily translational motion throughout the sequence. The segmentation results for the trunk are shown in Figure 6, and for the roof in Figure 7. The trunk was red and the roof was white in the original sequence. Since these two regions correspond to the same rigid object, the expected result would be that the direction of the flow fields for the two regions should be equivalent, while the magnitudes will differ slightly due to the relative difference in depth. This is borne out in Figure 8, where the expected behavior is seen.

Figure 5. Four frame sequence for traffic scene. Intensity = (R+G+B)/3.

Figure 6. Four frame sequence of segmented trunk of car.

Figure 7. Four frame sequence of segmented roof of car.

Figure 8. Flow field vectors for trunk and roof of car in Figure 5.

5. Discussion

It is apparent from our preliminary results that there is a lot of information concerning motion parameters that can be derived from surface and contour deformations. These results were achieved in the presence of a very noisy digitizer and relatively low spatial resolution. The magnitudes and directions of the optic flow field velocity vectors match quite well with the known motion. During the course of this research a fast integer version of Green's theorem was also developed. Questions concerning the number of frames appropriate for velocity field determination were resolved on the basis of the order of the function being temporally analyzed. The feature method gives a reasonable flow field for both single and multiple moving regions in laboratory and natural image sequences. With multiple moving objects, correspondence between regions could be quite difficult with grayscale images. We feel color is a very strong aid for the solution of this problem in the analysis of image sequences. We are presently applying our analysis technique to a longer frame sequence. In the case of occlusion, the method will have to be modified. Portions of the contours and regions undergoing occlusion can be treated separately using the methods developed during this research. The preliminary results thus far have been very encouraging, and we are presently investigating a number of natural color sequences.

Acknowledgements

Both of the authors would like to thank the reviewers for their comments and close reading of this manuscript. We would like to express our appreciation to James Durig, Dean of the College of Science and Mathematics at the University of South Carolina, for his financial support of the Intelligent Systems Laboratory during the course of this investigation. Thanks also go to James Bezdek for his encouragement.

References
Aloimonos, J. and A. Basu (1986). Shape and 3-D motion from contour without point to point correspondence. Proc. CVPR, Miami Beach, FL, 518-527.
Amari, S. (1978). Feature spaces which admit and detect invariant visual transformations. Proc. 4th Int. Joint Conf. Patt. Recog., Tokyo, Japan, 452-456.
Bandyopadhyay, A. (1984). A multiple channel model for the perception of optical flow. Proc. IEEE Workshop on Comp. Vision: Represent. and Control, Annapolis, MD, 78-82.
Barnard, S.T. and W.B. Thompson (1980). Disparity analysis of images. IEEE Trans. PAMI 2, 333-340.
Bezdek, J.C. (1981). Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York.
Brice, C.R. and C.L. Fennema (1970). Scene analysis using regions. Artif. Intell. 1, 205-226.
Budrikis, Z.L. and M. Hatamian (1984). Moment calculations by digital filters. AT&T Bell Lab. Tech. J. 63, 217-230.
Davis, L.S., Z. Wu and H. Sun (1983). Contour-based motion estimation. CVGIP 23, 313-326.
Haralick, R. and J. Lee (1983). The facet approach to optical flow. Proc. Image Understanding Workshop, Arlington, VA.
Hildreth, E.C. (1984). The computation of the velocity field. Proc. Roy. Soc. London B221, 189-220.
Horn, B.K.P. and B.G. Schunck (1981). Determining optical flow. Artif. Intell. 17, 185-203.
Huntsberger, T.L., C.L. Jacobs and R.L. Cannon (1985). Iterative fuzzy image segmentation. Pattern Recognition 18, 131-138.
Huntsberger, T.L. and M.L. Descalzi (1985). Color edge detection. Pattern Recognition Letters 3, 205-209.
Huntsberger, T.L., C. Rangarajan and S.N. Jayaramamurthy (1986). Representation of uncertainty in computer vision using fuzzy sets. IEEE Trans. Comp. Spec. [...], C-35, 135-146.
Jacobs, C.L. (1983). Color image segmentation: texture and a fuzzy c-means clustering implementation. M.S. Thesis, University of South Carolina.
Jayaramamurthy, S.N. and T.L. Huntsberger (1985). Region and edge analysis using fuzzy sets. Proc. IEEE Conf. Languages for Automation, Mallorca, Spain, 71-75.
Kanatani, K. (1985). Tracing planar surface motion from a projection without knowing the correspondence. CVGIP 29, 1-12.
Kanatani, K. (1985). Detecting the motion of a planar surface by line and surface integrals. CVGIP 29, 13-22.
Kanatani, K. (1985). Structure from motion without correspondence: General principle. Proc. 1985 IJCAI, Los Angeles, CA, 886-888.
Kanatani, K. (1986). Private communication.
Nagel, H.-H. (1983). Displacement vectors derived from second order intensity variations in image sequences. CVGIP 21, 85-117.
Prager, J.M. and M.A. Arbib (1983). Computing the optic flow: The MATCH algorithm and prediction. CVGIP 21, 271-304.
Schunck, B.G. (1986). Image flow continuity equations for motion and density. Proc. IEEE Workshop on Motion: Representation and Analysis, Kiawah Island, SC, 89-94.
Schunck, B.G. (1986). The image flow constraint equation. CVGIP 35, 20-46.
Shariat, H. (1985). The motion problem: A decomposition based solution. Proc. IEEE Conf. CVPR, San Francisco, CA, 181-183.
Shariat, H. and K.E. Price (1986). How to use more than two frames to estimate motion. Proc. IEEE Workshop on Motion: Representation and Analysis, Kiawah Island, SC, 119-124.
Stewart, G.W. (1973). Introduction to Matrix Computations. Academic Press, New York.
Tang, G.Y. (1982). A discrete version of Green's theorem. IEEE Trans. PAMI 4, 242-249.
Ullman, S. and E.C. Hildreth (1982). The measurement of visual motion. In: O.J. Braddick and A.C. Sleigh, Eds., Physical and Biological Processing of Images. Springer, Berlin.
Waxman, A.M. and S. Ullman (1983). Surface structure and 3-D motion from image flow: A kinematic analysis. Center for Automation Res., Univ. Md., CAR-TR-24.
Waxman, A.M. and K. Wohn (1984). Contour evolution, neighborhood deformation and global image flow: Planar surfaces in motion. Center for Automation Res., Univ. Md., CAR-TR-58.
Wohn, K. and A.M. Waxman (1984). Contour evolution, neighborhood deformation and local image flow: Curved surfaces in motion. Center for Automation Res., Univ. Md., CAR-TR-134.
IEEE ITans . P,1 11I 4 . 242-249 . Ullmau, S . and L .C' . Ilildredr (1982) . The rucaauremenl of .), sisnal motion . In : 0 . .1 . Braddick tutd A . C . Sleigh (has Pln:v(eal rind Biolo„ienl 0. . .It n/ Inn!8es . Stirlnscr, Berlin . Wartime, A .ML and S . Ullman (1983) . Surface structure and 3-I) motion Iron image floss : .A kinematic aualvsis- Center for Automation Res ., Uns . Md ., CAR TI2-24 . Wamtan, A .M . and K . AVohn (1984)- Contour esolmton, neighborhood delormatiou and global hllu_uv Iluss : Planar surtatt's in rnotion . (ante fur Automation Res ., kin . Md ., CAR-TR-58 . Wohn, K . and A .M . Wasm .ul (1984), Cauuluur ecoluttou, neighborhood de)i,rmation and local intagc /lost : Cursed Luis Md ., witness in motion . Center tot Automation Re, .,. CAR-TR-1 34 .