Motion segmentation for tracking small floating targets in IR video

Alexander Borghgraef, Fabian D. Lapierre, Yves Dupont, Marc Acheroy

Abstract

In the domain of mine warfare, the detection of targets floating on the surface has remained difficult to automate. Nevertheless, experience in the Persian Gulf has proved that unmoored floating mines are a realistic threat to shipping traffic. An automated system capable of detecting these and other free-floating small objects, using readily available sensors, would prove to be a valuable mine-warfare asset, and could double as a collision avoidance mechanism, salvaging tool or search-and-rescue aid. We have obtained test footage of various practice targets, taken with both 3-5 µm and 8-12 µm IR cameras under various environmental conditions. Motion characteristics are extracted by applying the Proesmans optical flow algorithm to the IR video sequence, calculating and then segmenting the motion field of each subsequent pair of images. A time series of these motion fields allows us to classify different segments according to their motion characteristics and continuity, and thus to detect and track the floating mines.
1 Introduction

1.1 Floating mines, a realistic threat?
Sea mines are static interdiction weapons, designed to deter ships from passing through zones of strategic importance, or to destroy those that try. They are usually divided into two groups: bottom mines and tethered mines. Bottom mines have no buoyancy and are equipped with sophisticated electromagnetic or acoustic triggers, often capable of determining a passing ship's type and the optimal detonation range. Tethered mines float slightly below sea level, chained to the sea bottom, and are of simpler design, often using a contact detonation mechanism. These mines cover less surface than bottom mines, but on the other hand cause greater damage to a ship they hit.

International law requires floating mines to be anchored and to self-disable if the anchoring fails. Since free-floating mines in open waters are more likely to disrupt shipping outside the conflict zone than to hit an intended target, most nations obey these regulations. In more enclosed waters, on the other hand, drifting mines can be and have been used successfully. An example of this was the conflict in the Persian Gulf in the early nineties. Allied naval forces, among them two Belgian minehunters, were confronted with a large number of drifting mines in addition to a conventional minefield guarding the Iraqi coast. This proved a real threat to the heavy shipping traffic in the narrow Gulf, and to the allied ships as well, as was made all too clear when the US amphibious assault carrier Tripoli was hit by a floating contact mine, which rendered it unable to continue operations. After this incident, floating mines were considered a primary threat to US carriers. The small target they present to sensors above and below the surface, and the noisy sea background, meant that detection depended on a human lookout. Belgian minefighters reported that this slowed down operations tremendously, since every spotted floating object had to be identified visually. This is especially true on a small vessel such as a minehunter, for which a hit from a contact mine would mean instant destruction of the ship and its crew.
The experience in the Persian Gulf shows that drifting mines can be very effective at disrupting operations and traffic in narrow and crowded shipping lanes. This means that a reliable automatic detection method would be a useful addition to the minefighter’s arsenal.
1.2 Other Applications
A technology for the detection of floating mines would need to be capable of detecting a wide variety of small objects made of various materials in a rough sea. It is obvious that such a technology would be of use for a number of other tasks, both military and civilian. It could be used as a collision warning system for commercial ships, for salvaging containers and debris, for the detection of small refugee boats or life rafts, and for finding swimmers as a part of port protection or search and rescue operations.
2 Detection

2.1 Sensor position
Our research assumes a ship-mounted sensor, with an expected elevation of about 10 m above the sea surface. The requirement that the system must be capable of detecting mines at transit speeds with a sufficient response time dictates a detection range between 500 and 1000 m. This means a grazing observation angle much smaller than the Brewster angle, eliminating any possible use of polarisation filters in the detection method.
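As a rough check of this geometry (using the 10 m sensor height and the 500-1000 m range quoted above), the angle between the line of sight and the sea surface is

$$\theta \approx \arctan\frac{h}{R}, \qquad \arctan\frac{10}{500} \approx 1.1^{\circ}, \qquad \arctan\frac{10}{1000} \approx 0.6^{\circ},$$

i.e. the sensor views the surface at a grazing incidence of about one degree or less.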
2.2 Target
A mine is typically a sphere between 1 and 3 m in diameter, or a 2 by 1 m cylinder; a standard shipping container is a 6 by 2.5 by 2.5 m box. Half or more of the target is likely to be submerged, resulting in a solid angle between 3.1 and 30 µsr at 500 m, and between 0.8 and 7.5 µsr at 1000 m. This solid angle may be larger due to the object's shadow, or smaller because of occlusion by waves, depending on sea state. A sea state of 2.5 is enough to hide all the floating objects under consideration. We assume the floating object to be at thermal equilibrium with the surrounding sea water, causing it to emit blackbody radiation at Tsea. This thermal emission will dominate the view the IR camera has of the target, though specular reflection of the sky's radiation off the top of the object can occur.
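These figures follow from the small-angle estimate Ω ≈ A/R², with A the exposed cross-section; for instance, taking A ≈ 0.8 m² (roughly the projected disk of a 1 m sphere) at R = 500 m gives

$$\Omega \approx \frac{A}{R^2} = \frac{0.8\,\mathrm{m}^2}{(500\,\mathrm{m})^2} \approx 3.2\,\mu\mathrm{sr}.$$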
2.3 Background
A flat sea surface emits a blackbody spectrum at its surface temperature, but at the grazing observation angles considered here its emissivity is very low. It will however reflect the IR radiation downwelling from the sky, predominantly originating from the near-horizon air mass, so the target will contrast sharply against it. A rough sea, on the other hand, presents wave slopes to the camera at much larger angles, so the emission at Tsea will be detectable. These wave slopes are of similar scale to the target and emit the same blackbody radiation, resulting in a noisy background for detection purposes. Our main task will be to determine some characteristics distinguishing the mine from this background, and to avoid false positives.
2.4 Determinant characteristics
Evidently, contrast of the mine against the background is a minimum requirement for detection, but as we have shown in the previous section, a contrast-based segmentation won't be sufficient for reliable detection. Waves have signatures similar to that of the target, and will cause the detector to report false positives. This means we will have to find additional determinants for the floating target. Scale can be used to filter out objects such as ships and birds, but cannot distinguish between the target and the waves, as these are of similar size. Shape isn't a very useful criterion due to the partial submersion of the target and its occasional occlusion by waves. We are of the opinion that a robust detection method needs to look at a time sequence of images. Visual observation of test footage reveals a distinct up-and-down motion of the floating object that is not present in the background. We will proceed to explore this feature's potential as a determinant characteristic.
3 Motion segmentation

3.1 Optical flow
If we want to classify objects by motion characteristics in an IR image sequence, we need to calculate the optical flow of said sequence. Optical flow was first defined by Horn and Schunck [1] as a sequence of vector images connecting corresponding pixels in subsequent pairs of images in the original video sequence. We choose the algorithm by Proesmans et al. [2] for our study, following performance studies by Barron et al. [3] and by Galvin et al. [4]. Proesmans expands upon the differential technique pioneered by Horn and Schunck, which optimises an energy functional composed of the Brightness Constancy Constraint Equation (BCCE) and a smoothness constraint

$$E = \int_\Omega \left(\nabla I \cdot \mathbf{u} + \frac{\partial I}{\partial t}\right)^2 + \lambda\left(\|\nabla u\|^2 + \|\nabla v\|^2\right)\,d\mathbf{x} \qquad (1)$$

which leads to an iterative method with the following evolution equations:

$$\frac{\partial u}{\partial t} = \Delta u - \lambda \frac{\partial I}{\partial x}\left(\nabla I \cdot \mathbf{u} + \frac{\partial I}{\partial t}\right), \qquad
\frac{\partial v}{\partial t} = \Delta v - \lambda \frac{\partial I}{\partial y}\left(\nabla I \cdot \mathbf{u} + \frac{\partial I}{\partial t}\right) \qquad (2)$$
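To illustrate how equations (2) translate into a discrete scheme, the following is a minimal NumPy sketch of the classical Horn-Schunck iteration, with the Laplacian approximated by the difference between a neighbourhood average and the central value. The derivative kernels, iteration count and the reading of λ as the data-term weight of equation (2) are our own assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.ndimage import convolve

def horn_schunck(I1, I2, lam=30.0, n_iter=100):
    """Minimal Horn-Schunck flow between two grayscale frames (sketch of eq. 2)."""
    I1, I2 = I1.astype(float), I2.astype(float)
    # Small-mask spatial and temporal derivative estimates.
    kx = np.array([[-1, 1], [-1, 1]]) * 0.25
    ky = np.array([[-1, -1], [1, 1]]) * 0.25
    Ix = convolve(I1, kx) + convolve(I2, kx)
    Iy = convolve(I1, ky) + convolve(I2, ky)
    It = convolve(I2 - I1, np.full((2, 2), 0.25))
    # Neighbourhood average used to approximate the Laplacian: lap(u) ~ u_avg - u.
    avg_k = np.array([[1, 2, 1], [2, 0, 2], [1, 2, 1]]) / 12.0
    u = np.zeros_like(I1)
    v = np.zeros_like(I1)
    for _ in range(n_iter):
        u_avg = convolve(u, avg_k)
        v_avg = convolve(v, avg_k)
        # Fixed point of eq. (2) with Delta(u) replaced by (u_avg - u).
        resid = Ix * u_avg + Iy * v_avg + It
        den = 1.0 + lam * (Ix ** 2 + Iy ** 2)
        u = u_avg - lam * Ix * resid / den
        v = v_avg - lam * Iy * resid / den
    return u, v
```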
Horn and Schunck's method assumes global smoothness throughout, ignoring the discontinuities caused by image borders and occlusions. Proesmans et al. solve this by calculating a forward and a backward motion field, and by comparing these two time-opposite fields for inconsistencies. The time derivative of the image sequence used in equation 2 is calculated locally, meaning that the equations won't hold for larger velocities. To avoid the computational cost of having to use larger operator masks for the time derivative, Proesmans et al. use the knowledge that at each step of an iterative implementation of the evolution equations, some estimate of the velocity ũ is known. They replace the original time derivative with

$$I_t(\tilde{\mathbf{u}}) = -\nabla I \cdot \tilde{\mathbf{u}} + \frac{I(\mathbf{x}, t) - I(\mathbf{x} + \tilde{\mathbf{u}}\,dt,\ t + dt)}{dt} \qquad (3)$$
which centres the operator on the current frame and on its approximate corresponding location in the next frame, allowing for good accuracy with a small operator mask in the discrete implementation. The same can be done when calculating optical flow in the opposite time direction. Consistency can be calculated by comparing the forward and backward flows u_f and u_b at their corresponding image coordinates. Since these are expected to be each other's opposites, the magnitude of their sum can be considered an indicator of inconsistencies:

$$C_f = \mathbf{u}_f(\mathbf{x}) + \mathbf{u}_b(\mathbf{x} + \mathbf{u}_f\,dt), \qquad C_b = \mathbf{u}_b(\mathbf{x}) + \mathbf{u}_f(\mathbf{x} + \mathbf{u}_b\,dt) \qquad (4)$$
Another diffusion process is added to eliminate the dependency on noise and on the flow's magnitude,

$$\frac{\partial c}{\partial t} = \rho\,\Delta c - \frac{c}{\rho} + 2\alpha(1 - c)\,\|C\| \qquad (5)$$
and an edge-sharpening method is introduced to avoid contamination of consistent regions by inconsistent ones because of the smoothing in equations 2. This is done by replacing the Laplacian in the diffusion equation by $\nabla \cdot (\gamma(c_f)\nabla \mathbf{u}_f)$, with

$$\gamma(c) = \frac{\xi}{1 + \left(\frac{c}{K}\right)^2} \qquad (6)$$
Put together, this results in the following system of six coupled PDEs, which will converge as t → ∞ into an optical flow field and a map of inconsistencies:

$$\begin{aligned}
\frac{\partial u_f}{\partial t} &= \nabla \cdot (\gamma(c_f)\nabla u_f) - \lambda \frac{\partial I}{\partial x}\left(\nabla I \cdot \mathbf{u}_f + I_t(\tilde{\mathbf{u}}_f)\right),\\
\frac{\partial v_f}{\partial t} &= \nabla \cdot (\gamma(c_f)\nabla v_f) - \lambda \frac{\partial I}{\partial y}\left(\nabla I \cdot \mathbf{u}_f + I_t(\tilde{\mathbf{u}}_f)\right),\\
\frac{\partial u_b}{\partial t} &= \nabla \cdot (\gamma(c_b)\nabla u_b) - \lambda \frac{\partial I}{\partial x}\left(\nabla I \cdot \mathbf{u}_b + I_t(\tilde{\mathbf{u}}_b)\right),\\
\frac{\partial v_b}{\partial t} &= \nabla \cdot (\gamma(c_b)\nabla v_b) - \lambda \frac{\partial I}{\partial y}\left(\nabla I \cdot \mathbf{u}_b + I_t(\tilde{\mathbf{u}}_b)\right),\\
\frac{\partial c_f}{\partial t} &= \rho\,\Delta c_f - \frac{c_f}{\rho} + 2\alpha(1 - c_f)\,\|C_f(\mathbf{u}_f, \mathbf{u}_b)\|,\\
\frac{\partial c_b}{\partial t} &= \rho\,\Delta c_b - \frac{c_b}{\rho} + 2\alpha(1 - c_b)\,\|C_b(\mathbf{u}_f, \mathbf{u}_b)\|
\end{aligned} \qquad (7)$$
Galvin and McCane's [4] implementation dispenses with the noise-reducing diffusion process for c in favour of computational speed. Using a Gauss-Seidel approach, this results in the following iterative scheme (in vector notation):

$$\begin{aligned}
c_f^{n+1} &= \|C_f(\mathbf{u}_f^n, \mathbf{u}_b^n)\|, &
\mathbf{u}_f^{n+1} &= \nabla \cdot (\gamma(c_f^{n+1})\nabla \mathbf{u}_f^n) - \frac{\lambda \nabla I\left(\nabla I \cdot \mathbf{u}_f^n + I_t(\mathbf{u}_f^n)\right)}{1 + \lambda\|\nabla I\|},\\
c_b^{n+1} &= \|C_b(\mathbf{u}_f^{n+1}, \mathbf{u}_b^n)\|, &
\mathbf{u}_b^{n+1} &= \nabla \cdot (\gamma(c_b^{n+1})\nabla \mathbf{u}_b^n) - \frac{\lambda \nabla I\left(\nabla I \cdot \mathbf{u}_b^n + I_t(\mathbf{u}_b^n)\right)}{1 + \lambda\|\nabla I\|}
\end{aligned} \qquad (8)$$
In this updating scheme, at each step n the current optical flow field is an approximation of the flow at step n + 1, meaning $\mathbf{u}^n$ plays the role of $\tilde{\mathbf{u}}$ in equation 3. This means that for each pixel position $\mathbf{x} = (x, y)$ in the image sequence, the terms containing $I_t$ in equation 8 can be written as

$$\left(\nabla I \cdot \mathbf{u}_f^n + I_t(\mathbf{u}_f^n)\right)(\mathbf{x}) = I_2(\mathbf{x}) - I_1(\mathbf{x} + \mathbf{u}_f^n) \qquad (9)$$

and

$$\left(\nabla I \cdot \mathbf{u}_b^n + I_t(\mathbf{u}_b^n)\right)(\mathbf{x}) = I_1(\mathbf{x}) - I_2(\mathbf{x} + \mathbf{u}_b^n), \qquad (10)$$

simplifying the calculation. $I_1$ and $I_2$ represent the consecutive frames being considered in the optical flow calculation, and $dt$ is set to 1. For the constants used in equation 6, we have chosen ξ = 1 and K ≈ avg(C), and the Lagrange multiplier λ is set to 30.
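To make the scheme concrete, the following is a minimal NumPy/SciPy sketch of one possible implementation of equations (8)-(10): the divergence term ∇·(γ∇u) is approximated by a γ-weighted neighbourhood average, bilinear warping realises the I(x + u) terms, and K is tied to the mean consistency error as described above. The function names, derivative kernels and fixed iteration count are our own choices, not the reference implementation.

```python
import numpy as np
from scipy.ndimage import convolve, map_coordinates, sobel

AVG = np.array([[1, 2, 1], [2, 0, 2], [1, 2, 1]]) / 12.0  # neighbourhood average

def warp(field, flow):
    """Bilinearly sample `field` at x + flow; used for the I(x + u) terms."""
    h, w = field.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    return map_coordinates(field, [yy + flow[1], xx + flow[0]],
                           order=1, mode='nearest')

def gamma(c, K, xi=1.0):
    """Diffusivity of eq. (6): small where the inconsistency map c is large."""
    return xi / (1.0 + (c / K) ** 2)

def consistency(ua, ub):
    """Inconsistency magnitude of eq. (4): |ua(x) + ub(x + ua)|."""
    return np.hypot(ua[0] + warp(ub[0], ua), ua[1] + warp(ub[1], ua))

def half_step(u, c, I_src, I_dst, grad, lam):
    """One Gauss-Seidel update of eq. (8) for a single flow direction."""
    g = gamma(c, K=max(np.mean(c), 1e-6))          # K ~ avg(C), xi = 1
    # gamma-weighted neighbourhood average standing in for div(gamma grad u).
    u_bar = np.stack([convolve(g * ui, AVG) / np.maximum(convolve(g, AVG), 1e-6)
                      for ui in u])
    resid = I_dst - warp(I_src, u)                 # warped residual, eqs. (9)/(10)
    den = 1.0 + lam * (grad[0] ** 2 + grad[1] ** 2)
    return u_bar - lam * grad * resid / den

def proesmans_flow(I1, I2, lam=30.0, n_iter=200):
    """Bidirectional flow (u_f, u_b) and inconsistency maps between I1 and I2."""
    I1, I2 = I1.astype(float), I2.astype(float)
    grad1 = np.array([sobel(I1, 1), sobel(I1, 0)]) / 8.0   # (Ix, Iy) of frame 1
    grad2 = np.array([sobel(I2, 1), sobel(I2, 0)]) / 8.0
    uf = np.zeros((2,) + I1.shape)
    ub = np.zeros_like(uf)
    cf = np.zeros(I1.shape)
    cb = np.zeros(I1.shape)
    for _ in range(n_iter):
        cf = consistency(uf, ub)
        uf = half_step(uf, cf, I1, I2, grad1, lam)
        cb = consistency(ub, uf)
        ub = half_step(ub, cb, I2, I1, grad2, lam)
    return uf, ub, cf, cb
```

Note that `lam` here follows the convention of equation (8), i.e. the value λ = 30 quoted above.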
3.2 Experiments
When applying the optical flow algorithm to a number of MWIR and LWIR video sequences of floating targets, we notice that the background is very noisy in the motion domain. This is clearly visible in the separated motion field images of figures 1b and 1c, which show a target rising on top of a wave.
Figure 1: Rising mine in MWIR; (b) and (c) show the horizontal and vertical components of the motion field.
Figure 2: Vertical OF histogram for (a) rising mine, (b) stationary mine on top of a wave, (c) sinking mine.

The target zone within the horizontal motion image shows a more heterogeneous optical flow, whereas in the vertical direction the target shows a more uniform direction. This was to be expected because of the significant up-and-down bobbing of the target, as also seen in the plot (fig. 4) of the mean x and y components throughout the image sequence. In fig. 2 we show three v_y histograms taken from this sequence, illustrating three phases of the sinusoidal movement of the target: rising on a wave, briefly resting on top, and sinking again. Here it is clearly shown that during the moving phases, the target's movement has a uniform sign: negative in fig. 2a (due to image coordinates being inverted from the physical directions) and positive in fig. 2c. Only in the brief stationary phase does the target histogram spread out around 0, similar to the overall image histogram of fig. 3. From this we conclude that segmentation in the vertical direction will be possible when motion is large enough for the target peak to be carried outside of the main image peak around 0. When we applied a threshold at 2σ of the frame histogram, we obtained the results shown in fig. 5a for three consecutive frames of the rising mine. In each image false positives still remain, but these have little continuity through the sequence. The target blob, on the other hand, does remain visible, which can be used as an additional determinant characteristic. In fig. 5b we have multiplied the three binary images to obtain a simple continuity filter, resulting in a significant reduction of false positives.
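A minimal sketch of this thresholding and continuity filtering is given below; the two-sided threshold around the frame mean and the simple logical AND of the masks are our own assumptions about details the text leaves open.

```python
import numpy as np

def segment_vertical_motion(vy_frames, k=2.0):
    """Threshold v_y at k*sigma of each frame and AND three consecutive
    binary masks as a simple continuity filter (cf. fig. 5).

    vy_frames: sequence of vertical flow fields, one per frame pair.
    Returns a list of continuity-filtered detection masks.
    """
    masks = []
    for vy in vy_frames:
        sigma = np.std(vy)
        # Pixels whose vertical motion lies outside the central image peak.
        masks.append(np.abs(vy - np.mean(vy)) > k * sigma)
    filtered = []
    for i in range(2, len(masks)):
        # Product (logical AND) of three consecutive binary images.
        filtered.append(masks[i] & masks[i - 1] & masks[i - 2])
    return filtered
```

In practice the sign of the target's v_y alternates with the bobbing phase, so a one-sided threshold matched to the expected phase could be substituted for the two-sided one used here.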
Figure 3: Vertical OF histogram for the overall image
Figure 4: Evolution of the target's µ(v_x) (a) and µ(v_y) (b) with standard deviation.
Figure 5: (a) Segmented v_y images, (b) continuity-filtered image.
4 Conclusions and future research
A strong vertical motion component proves to be a valid determinant of small floating objects, and as such can be a valuable addition to the minefighter's algorithmic arsenal. While not necessarily viable as a standalone detection method, v_y-segmentation can be used to trigger a tracking system, which can be reinforced at each pass through the characteristic sinusoidal movement. In fact, we perceive this as quite similar to how a human observer finds the target. There are obvious improvements to be made to the naive method presented here. The computational expense of the optical flow algorithm can be reduced significantly by limiting calculations to edge pixels, or to objects obtained from previous segmentations. Reducing the framerate, taking into account expected wave frequencies, would also reduce computational cost and increase the target's v_y, possibly reducing the number of false positives. Finally, a frequency analysis of segments' vertical motion could also be an interesting approach. We will investigate these options in further research.
References

[1] B. K. P. Horn and B. G. Schunck, "Determining optical flow," Artificial Intelligence 17, pp. 185-203, 1981.

[2] M. Proesmans, L. Van Gool, E. Pauwels, and A. Oosterlinck, "Determination of optical flow and its discontinuities using non-linear diffusion," 3rd European Conference on Computer Vision 2, pp. 295-304, 1994.

[3] J. Barron, D. Fleet, S. Beauchemin, and T. Burkitt, "Performance of optical flow techniques," CVPR '92, pp. 236-242, 1992.

[4] B. Galvin, B. McCane, K. Novins, D. Mason, and S. Mills, "Recovering motion fields: An evaluation of eight optical flow algorithms," in Proceedings of the Ninth British Machine Vision Conference (BMVC '98), pp. 195-204, Southampton, Sept. 1998.
[5] A. Borghgraef and M. Acheroy, "Using optical flow for the detection of floating mines in IR image sequences," in Proceedings of SPIE Optics and Photonics in Security and Defence 2006, 6395, Stockholm, Sweden, Sept. 2006.