FEBRUARY 2010
CHIN AND MARIANO
371
A Particle Filter for Inverse Lagrangian Prediction Problems

T. MIKE CHIN AND ARTHUR J. MARIANO
RSMAS, University of Miami, Miami, Florida

(Manuscript received 15 October 2008, in final form 3 August 2009)

ABSTRACT

The authors present a numerical method for the inverse Lagrangian prediction problem, which addresses retrospective estimation of drifter trajectories through a turbulent flow, given their final positions and some knowledge of the flow field. Of particular interest is probabilistic estimation of the origin (or launch site) of drifters for practical applications in search and rescue operations, drifting sensor array design, and biochemical source location. A typical solution involves a Monte Carlo simulation of an ensemble of Lagrangian trajectories backward in time using the known final locations, a set of velocity estimates, and a stochastic model for the unresolved flow components. Because of the exponential dispersion of the trajectories, however, the distribution of the drifter locations tends to be too diffuse to be able to reliably locate the launch site. A particle filter that constrains the drifter ensemble according to the empirical dispersion characteristics of the flow field is examined. Using the filtering method, launch-site prediction cases with and without a dispersion constraint are compared in idealized as well as realistic scenarios. It is shown that the ensemble with the dispersion constraint can locate the launch site more specifically and accurately than the unconstrained ensemble.
1. Introduction

Quasi-Lagrangian trajectories observed in scientific and operational applications are time series of horizontal positions following either tagged water parcels or drifting objects, including people lost at sea, derelict ships, pollutants, overboard cargo, floating mines, and plankton. Water parcels are tagged by an assortment of tracked drifters, buoys, and floats whose position data are used to study oceanic and coastal circulation, dynamics, and dispersion statistics (Mariano and Ryan 2007). These water-parcel tags and drifting objects are inclusively referred to as "drifters" in this paper. Forward prediction of Lagrangian trajectories is difficult because of the inherently chaotic nature of small-scale ocean circulation and wind fields and our limited knowledge of them (Babiano et al. 1994; Piterbarg et al. 2007). Typical flow field analyses based on observation and simulation are incomplete and may contain significant errors with respect to small-scale motions. To remedy these deficiencies, Lagrangian stochastic models can be used to supplement the velocity analyses with
Corresponding author address: Mike Chin, RSMAS/MPO, 4600 Rickenbacker Causeway, Miami, FL 33149. E-mail:
[email protected] DOI: 10.1175/2009JTECHO675.1 © 2010 American Meteorological Society
numerical representations of unresolved, subgrid-scale dynamics and random uncertainty. In particular, a Lagrangian model based on the first-order autoregressive (AR-1; or Markovian) process, known as the random flight model (Thomson 1987), is used here to capture some key dispersion characteristics of drifter trajectories (Griffa et al. 1995). With such an approach, solving the trajectory prediction problem tends to be probabilistic in nature. An inverse Lagrangian prediction (ILP) problem addresses retrospective estimation of drifter trajectories through a turbulent flow, given their final positions. In particular, estimation of the origin or launch site of the trajectories finds applications in search and rescue operations, drifting sensor array design, and estimation of biochemical sources to determine, for example, where and when a body went overboard, a ship became derelict and started to drift, arrays of drifting surveillance sensors are to be launched, enemy mines were deployed, a pollutant was released, or a spawning event occurred from the distribution of fish larvae (Mariano and Ryan 2007; Hitchcock and Cowen 2007). In a typical ILP scenario, the launch location of a single or a set of drifters in the past is sought given the present location(s) of the drifter(s) and estimates of the ocean circulation velocity field (and possibly other forcing
JOURNAL OF ATMOSPHERIC AND OCEANIC TECHNOLOGY
functions, such as the wind in the case of surface drifters). Because of the chaotic nature of the forward Lagrangian problem and limitations in accuracy and resolution of the ocean current and wind velocity data, it is unrealistic to expect a unique and deterministic answer to an ILP problem. Instead, we wish to determine a launch region such that drifters deployed in this region have the largest probability of being found at the desired final location. The optimal launch region should be as small as possible while maintaining a large probability of success. One approach to solving the ILP problem is to simulate an ensemble of Lagrangian trajectories backward in time using the known final locations. For example, the ILP problem has been studied in the atmosphere in the context of the source–receptor problem, where the goal is to find the source region for an anomalous concentration of gases, such as pollutants (Wilson et al. 1982). The concentrations observed elsewhere (the receptor) are used in conjunction with an atmospheric general circulation model and models of tracer dynamics, including the source/sink, chemical reactions, and radioactive decays. A stochastic turbulence model for unresolved velocity scales (Seibert and Frank 2004) and random flight model for trajectory uncertainty (Lin et al. 2003) have also been included in the dynamics. Solution methods include running the circulation model backward in time using a backward advection scheme (Kim et al. 1998) or an adjoint model that includes the tracer dynamics (Pudykiewicz 1998; Seibert and Frank 2004). Stochastically simulated drifter trajectories, however, tend to diverge. The distribution of the simulated drifter positions usually becomes too diffuse to locate the launch site reliably. We present a method that constrains the distribution of the drifter position probabilistically. 
Specifically, the probability distribution of the drifter position simulated backward in time is updated according to the empirical dispersion characteristics observed in forward simulations (of the same or similar dynamics). A practical benefit of this is that a more compact distribution for the launch site can be obtained. Our method employs the particle filter to control the spread of the simulated drifter location. The particle filter algorithm, also known as the sequential importance sampling technique, is a Monte Carlo method for dynamical-state estimation problems (Doucet et al. 2001). Versions of the particle filter have been examined for application to data assimilation problems (van Leeuwen 2003). Only a special case, known as the ensemble Kalman filter (Evensen 1994), which restricts the probability distributions to be Gaussian, has so far been found to be practical, because the particle filter otherwise demands a prohibitive ensemble size to be effective in typical data assimilation problems. However, because simulation of
VOLUME 27
Lagrangian trajectories can be repeated with relatively little computational resources, a particle filter would be applicable to most operational ILP problems.
2. Method

Let $\mathbf{x} = (x, y) \in D$ denote a horizontal location in an ocean domain $D$. The ILP problem requires an estimate of the drifter trajectory $\mathbf{r}(t) \in D$, $t_L \le t \le t_F$, given its final position $\mathbf{r}(t_F)$. We seek to estimate the probability density function (PDF) $p_r(\mathbf{x}, t)$ of finding the drifter at a given location $\mathbf{x}$ and time $t$. The final position is given as a set of samples $\mathbf{X}_m \equiv \mathbf{x}_m(t_F)$, $m = 1, \ldots, M$, which are called the targets. The targets $\mathbf{X}_m$ are considered to be statistical samples from the PDF $p_r(\mathbf{x}, t_F)$ of the final position. Our general solution approach is to simulate an ensemble of $N_m$ drifter trajectories backward in time starting from each of the final positions $\mathbf{X}_m$. The resulting $N$ trajectories $\mathbf{x}_n(t)$, $n = 1, \ldots, N$, where $N = \sum_{m=1}^{M} N_m$, can be used (for a large $N$) to form a sample distribution to approximate the desired PDF $p_r(\mathbf{x}, t)$ at any $t$. Of particular interest is the ensemble of locations $\mathbf{x}_n(t_L)$ at the launch time $t_L$, from which the probability $p_r(\mathbf{x}, t_L)$ for the potential launch site can be estimated.
a. Random flight model

Drifter trajectories can be simulated using an Eulerian velocity field $\mathbf{u}(\mathbf{x}, t) = [u(\mathbf{x}, t), v(\mathbf{x}, t)]$ by integrating

$$d\mathbf{r} = \mathbf{u}(\mathbf{r}, t)\,dt \qquad (1)$$

forward or backward, given an initial or final drifter location, respectively. Available estimates of the Eulerian velocity field $\mathbf{u}(\mathbf{x}, t)$ include numerical outputs of an ocean circulation model or gridded analyses of measurement data. Such numerical products, however, tend to lack adequate spatial resolution and accuracy for simulation of observed drifter trajectories and need to be supplemented by statistical velocity models. In a random flight model, the velocity field is expanded, $\mathbf{u} = \bar{\mathbf{u}} + \mathbf{u}'$, into deterministic $\bar{\mathbf{u}}(\mathbf{x}, t)$ and stochastic $\mathbf{u}'(\mathbf{r}, t)$ components. The deterministic component $\bar{\mathbf{u}}(\mathbf{x}, t)$, which we call background flow, is obtained from the aforementioned numerical products such as model output and measurement analysis and is primarily responsible for advection by organized flow. The stochastic component $\mathbf{u}'(\mathbf{r}, t)$, which we call turbulent velocity, characterizes statistically the residual Lagrangian motion, including the subgrid-scale uncertainty in the background flow field. Following Griffa (1996), each vector element $u'$ of the turbulent velocity $\mathbf{u}'$ is modeled by an AR-1 process,
$$du' = -\frac{u'}{\tau_u}\,dt + \sqrt{\frac{2\sigma_u^2}{\tau_u}}\,d\eta, \qquad (2)$$

where $d\eta$ is a unit-variance white noise process; $\sigma_u^2$ is the variance of turbulent velocities, typically ranging over 10–100 cm$^2$ s$^{-2}$; and $\tau_u$ is the Lagrangian integral time scale, whose value is on the order of half a day for coastal surface flows, a few days for typical open oceanic surface flow, and one week for flow in strong currents.
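As an illustration of Eq. (2), the following is a minimal sketch of one turbulent velocity component integrated with a simple Euler-Maruyama step. The function name, the time step, and the treatment of $d\eta$ as a Wiener increment with variance $dt$ are assumptions made for this example, not details taken from the paper. The values $\sigma_u = 5$ cm s$^{-1}$ and $\tau_u = 3$ days are those quoted for the forward model in the numerical experiments.

```python
import math
import random

def simulate_turbulent_velocity(sigma_u, tau_u, dt, n_steps, rng):
    """Integrate du' = -(u'/tau_u) dt + sqrt(2 sigma_u^2 / tau_u) d(eta) for one component.

    Hypothetical helper: d(eta) is taken to be a Wiener increment with variance dt.
    """
    # Start from the stationary distribution so the variance is near sigma_u^2 from the start.
    u = rng.gauss(0.0, sigma_u)
    series = [u]
    noise_scale = math.sqrt(2.0 * sigma_u ** 2 / tau_u)
    for _ in range(n_steps):
        d_eta = rng.gauss(0.0, math.sqrt(dt))
        u += -(u / tau_u) * dt + noise_scale * d_eta
        series.append(u)
    return series

rng = random.Random(0)
# sigma_u in cm/s, tau_u and dt in days (so the run spans 1000 days).
series = simulate_turbulent_velocity(sigma_u=5.0, tau_u=3.0, dt=0.05, n_steps=20000, rng=rng)
sample_std = math.sqrt(sum(v * v for v in series) / len(series))
print(sample_std)  # expected to be close to sigma_u for a long run
```

Over a run that is long compared with $\tau_u$, the sample standard deviation should settle near $\sigma_u$, which is a quick sanity check on the noise scaling in Eq. (2).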
b. Ensemble spread and dispersion characteristics

Lagrangian dispersion can be measured by different metrics. Single-particle statistics, based on the average distance of drifters from the cluster centroid, and two-particle statistics, based on the average distance between pairs of drifters, are used to study absolute and relative dispersion, respectively, in fluids. Because the goal of the ILP is to produce an estimate of the initial location expressed by a single PDF, a single-particle metric is used here to quantify the dispersion in our simulations. We define the "ensemble spread" as the time-dependent standard deviation

$$D_r(t) = \sqrt{\frac{1}{N-1}\sum_{n=1}^{N} \|\mathbf{x}_n(t) - \mathbf{x}_r(t)\|^2}, \qquad (3)$$

where $\mathbf{x}_r(t) \equiv \sum_{n=1}^{N} \mathbf{x}_n(t)/N$ is the ensemble-mean trajectory. For a random flight model used in our numerical experiment (section 3b), multiple launch sites positioned uniformly over the model domain $D$ (and over different flow regimes; Fig. 1) were used to compute $D_r(t)$. The ensemble spread shows a geometrical growth during the first 30 days of simulation, $D_r(t) \propto t^c$ for $c = 1.23$ on average (Fig. 2). Under Taylor's assumption of homogeneous and isotropic turbulence, it can be shown that $c = 1$ in the ballistic regime (small time lags), where the absolute dispersion is dominated by energetic eddies, and that $c = 0.5$ when particle motion is uncorrelated (large time lags). Our value of $c = 1.23$ is superdiffusive, and the larger absolute dispersion rate is presumably due to shear-enhanced dispersion because of the strong circulation features in our numerical simulations. After 30 days, the combined effects of drifters starting to recirculate and their motion becoming more uncorrelated contribute to a slower absolute dispersion rate. Consequently, trajectories from the first 30 days are used to estimate $D_r(t)$ for this particular model. Given a random flight model, we assume that the corresponding ensemble spread function $D_r(t)$ is available through a Monte Carlo forward simulation, as described
FIG. 1. The five launch sites (stars) and the background flow at the launch time (dark contours are anticyclonic and light contours are cyclonic) for the source estimation experiment. The launch sites are named as follows (counterclockwise from top left): NW Saddle, Jet, Cold Core, Far East, and Center Saddle. The launch sites used for the computation of the ensemble spread function (Fig. 2) are distributed uniformly on a regular 100 × 100 grid over the entire flow domain.
earlier. The ensemble spread can be space dependent, Dr(x, t), because of varying Lagrangian behaviors in different flow regimes (e.g., coastal versus open ocean, western boundary current, eddy-trapped flows). Although we assume that the ensemble spread is location independent for simplicity in this paper, the filtering methodology presented here is just as applicable to spatially varying ensemble spread functions.
FIG. 2. Examples of ensemble spread functions (thin lines), derived empirically from the random flight model used in the source estimation experiment. The thick line is the average of the functions shown.
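The ensemble spread of Eq. (3) and a power-law fit of the kind summarized in Fig. 2 can be sketched as follows. This is an illustrative example only: the function names are hypothetical, and the fit is an ordinary least squares line in log-log space, which is one plausible way to obtain the exponent $c$. The synthetic data use the fitted form $2.5t^{1.23}$ reported in the numerical experiments.

```python
import math

def ensemble_spread(positions):
    """Eq. (3) at one time: sqrt(sum ||x_n - mean||^2 / (N - 1)); positions is a list of (x, y)."""
    n = len(positions)
    mean_x = sum(p[0] for p in positions) / n
    mean_y = sum(p[1] for p in positions) / n
    total = sum((p[0] - mean_x) ** 2 + (p[1] - mean_y) ** 2 for p in positions)
    return math.sqrt(total / (n - 1))

def fit_power_law(times, spreads):
    """Least squares fit of log D = log A + c log t; returns (A, c)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(d) for d in spreads]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    c = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return math.exp(my - c * mx), c

# Synthetic check: spreads generated from D = 2.5 t^1.23 should return the same A and c.
times = [1, 2, 5, 10, 20, 30]
spreads = [2.5 * t ** 1.23 for t in times]
amplitude, exponent = fit_power_law(times, spreads)
print(amplitude, exponent)
```

In practice the `spreads` list would come from applying `ensemble_spread` to forward-simulated drifter positions at each output time, averaged over launch sites.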
c. Constraining trajectory ensemble

The drifter trajectories simulated for the inverse Lagrangian prediction problem must be constrained with the knowledge that all trajectories originate from a single launch site. Equation (1) can be integrated backward in time from each target $\mathbf{X}_m$ to obtain a drifter trajectory. Given the information that the drifters have been launched at a single location, an ensemble of such trajectories is expected to converge toward the launch site according to the known ensemble spread function $D_r(t - t_L)$, where $t_L$ is the launch time. To formally express the information with which to constrain the trajectory ensemble, we let $\mathbf{s}(t)$ be a conceptual "noisy observation" of the unknown trajectory $\mathbf{r}(t)$,

$$\mathbf{s}(t) = \mathbf{r}(t) + \mathbf{e}(t), \qquad (4)$$

where the observation error $\mathbf{e}(t)$ is a vector of zero-mean white-noise processes, each with a known variance $E^2$. Assuming that $\mathbf{r}(t)$ and $\mathbf{e}(t)$ are uncorrelated, the variance of $\mathbf{s}(t)$ would become $D_r(t)^2 + E^2$, and an observation PDF $p_{s|r}$ can be formulated. Because this observation PDF is conditioned on the realization (simulated ensemble) of $\mathbf{r}$, it can be used to constrain the distribution of $\mathbf{r}$ through the Bayes rule. In particular, to constrain the location ensemble $p_r$ with $p_{s|r}$, the conditional PDF $p_{r|s}$ is determined. The particle filter (Doucet et al. 2001; Chin et al. 2007) can be applied to accomplish this sequentially in time, so that the constraint can be applied while the trajectory ensemble is being simulated. To formulate the constraining PDF $p_{s|r}$, we note that the mean of $\mathbf{s}(t)$, or an observation of the mean trajectory, is not available. We thus estimate it in a bootstrapping fashion using the ensemble mean $\mathbf{x}_r(t)$ of the ongoing simulation. Given such a mean and the variance $D_r(t)^2 + E^2$, we could define a Gaussian PDF for the observation $\mathbf{s}(t)$ at any time $t$ during the backward simulation. By experimentation, we have found that the following non-Gaussian PDF is more effective in imposing the constraint onto the ensemble of backward trajectories:

$$p_{s|r}(\mathbf{x}_n \mid \mathbf{x}_r, t) = \frac{1}{c}\exp\left[-\frac{1}{2}\left(\frac{\|\mathbf{x}_n - \mathbf{x}_r\|^2}{D_r^2 + E^2}\right)^{F}\right], \qquad (5)$$

where $c$ is a normalization constant and $F$ is a constant parameter to control "flatness" of the PDF. For $F = 1$, $p_{s|r}$ would become a Gaussian PDF. We use $F = 3$ so that the PDF has a relatively flat peak near its maximum. Choosing the larger value of $F$ has given more equal importance (probability) to the ensemble members within a certain distance from the maximum, rather than favoring those in the immediate vicinity of the maximum.

d. Resampled particle filter

At any given time $t$, the trajectory PDF $p_r$ can be optimally updated with the observation $\mathbf{s}$ by applying the Bayes rule $p_{r|s} = p_{s|r}\,p_r/p_s$. Specifically, we have

$$p_{r|s}(\mathbf{x}_n, t) \propto p_{s|r}(\mathbf{x}_n \mid \mathbf{x}_r, t)\,\frac{1}{N}, \qquad n = 1, \ldots, N, \qquad (6)$$

because $p_r$ is approximated by a uniform distribution of $N$ sample locations $\mathbf{x}_n$ and $p_s$ is a constant when $\mathbf{s}$ is given. The conditional PDF $p_{r|s}(\mathbf{x}_n, t)$ can be used to rank the $N$ drifter locations according to the agreement with the observation. From the right-hand side of (6), the probability (normalized weight),

$$w_n = \frac{p_{s|r}(\mathbf{x}_n \mid \mathbf{x}_r, t)}{\sum_{k=1}^{N} p_{s|r}(\mathbf{x}_k \mid \mathbf{x}_r, t)}, \qquad (7)$$

can be computed for this purpose. Such quantification of relative consistency between each ensemble member $\mathbf{x}_n$ and the observation is the main feature of the particle filter method. Before constraining the ensemble with the observation, the trajectory PDF is represented by the $N$ sample locations $\mathbf{x}_n$ with equal weight. After constraining, the PDF is still represented by the same samples, but now with nonuniform probability $w_n$. After several observations over time, some probability values $w_n$ can become too small for the corresponding sample location $\mathbf{x}_n$ to serve any use. When the ensemble carries too many of such "highly unlikely" samples, the effective ensemble size and hence the range of the observations that the ensemble can approximate will be diminished accordingly. This is a well-known issue for the particle filter method, and a common remedy is to resample the ensemble to prune the unlikely samples by replacing them with samples with larger weights. Resampling is usually accomplished by drawing $N$ samples from the probability distribution $w_n$, $n = 1, \ldots, N$, allowing duplicate samples. Samples with a larger $w_n$ are thus duplicated more often, and those with a very small $w_n$ may not be sampled at all. The new (resampled) set of $N$ samples is considered to be uniformly distributed, with an equal weight of $1/N$ for each sample. Resampling is a vital part of a practical particle filter, which we call the resampled particle filter (RPF).
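The weight computation of Eqs. (5) and (7) can be illustrated with a short sketch. The function names and the toy ensemble below are hypothetical. The effective-size diagnostic $1/\sum_n w_n^2$ is a commonly used measure of weight degeneracy and is added here as an assumption; it is not a quantity defined in the paper.

```python
import math

def constraint_weights(positions, spread, err, flatness=3.0):
    """Normalized weights w_n from the flat-top PDF, with the ensemble mean as the observation."""
    n = len(positions)
    mean_x = sum(p[0] for p in positions) / n
    mean_y = sum(p[1] for p in positions) / n
    variance = spread ** 2 + err ** 2
    raw = []
    for x, y in positions:
        dist2 = (x - mean_x) ** 2 + (y - mean_y) ** 2
        raw.append(math.exp(-0.5 * (dist2 / variance) ** flatness))
    total = sum(raw)  # the normalization constant c cancels here
    return [r / total for r in raw]

def effective_size(weights):
    """1 / sum(w^2): a common degeneracy diagnostic (an assumption, not from the paper)."""
    return 1.0 / sum(w * w for w in weights)

# Three nearby drifters and one outlier, positions in km.
positions = [(0.0, 0.0), (50.0, 0.0), (-50.0, 0.0), (0.0, 400.0)]
weights = constraint_weights(positions, spread=100.0, err=100.0)
print(weights, effective_size(weights))
```

With $F = 3$ the three clustered drifters receive nearly equal weight while the outlier's weight is negligible, which is the flat-peak behavior described for Eq. (5).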
e. Particle filter application to drifter simulation

The RPF procedure is a time-sequential algorithm. Each RPF iteration for the ILP problem can be summarized as follows:

Step 1: Predict the drifter-location ensemble by integrating

$$\frac{d\mathbf{x}_n}{dt} = \bar{\mathbf{u}}(\mathbf{x}_n, t) + \mathbf{u}'_n(t), \qquad n = 1, \ldots, N,$$

backward in time, where $\bar{\mathbf{u}}$ is the given Eulerian background flow field and $\mathbf{u}'_n$ is the Lagrangian perturbation velocity obtained by integrating the regression Eq. (2) for each drifter $n$.

Step 2: Compute the normalized weight $w_n$ using (7) for each predicted drifter location.

Step 3: Resampling: Choose (e.g., randomly sample) $N$ drifters based on the probability distribution $\{w_1, w_2, \ldots, w_N\}$, while allowing a single drifter to be chosen multiple times. The set of $N$ chosen drifters becomes the new, updated location ensemble.

Step 4: If $t > t_L$, where $t_L$ is the known launch time, go back to step 1 and repeat. Otherwise, the launch position PDF is estimated by the location ensemble as

$$p_r(\mathbf{x}, t_L) = \sum_{n=1}^{N} \frac{\delta(\mathbf{x} - \mathbf{x}_n)}{N},$$

where $\delta$ is the Dirac delta function (or its numerical approximation).

The time interval of the iterations is usually the sampling interval for the background flow $\bar{\mathbf{u}}$. Contribution of the background flow to the drifter advection can also be computed using an adjoint model (e.g., Pudykiewicz 1998; Seibert and Frank 2004); the filtering method presented here would be relevant as long as random perturbations are used in the trajectory simulation. The initial condition for the iterations is the set of target locations $\mathbf{X}_m$, and $N_m$ simulated drifters originate from each target $\mathbf{X}_m$. Although $N_m$ is constant for every target in the numerical experiments described later, $N_m$ can be made variable to reflect, for example, the concentration of the target at each location. This facilitates application of the presented method to ILP problems involving dispersion of biological agents and pollutants (e.g., Cowen et al. 2006; Wilson et al. 1982).
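The four steps can be sketched as a single loop. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function and argument names are hypothetical, positions are in km and time in days, the backward prediction uses a first-order Euler step, and the AR-1 velocity of Eq. (2) is simply advanced in the backward time variable. The uniform background flow in the usage example is also an assumption chosen for simplicity.

```python
import math
import random

def rpf_backward(targets, n_per_target, background, spread_fn, err,
                 t_final, t_launch, dt, sigma_u, tau_u, flatness=3.0, rng=None):
    """Return N = M * n_per_target sample launch positions at t_launch (hypothetical helper)."""
    rng = rng or random.Random()
    # Initial condition: n_per_target drifters at each target, each with its own u'.
    xs = [[x, y] for (x, y) in targets for _ in range(n_per_target)]
    us = [[rng.gauss(0.0, sigma_u), rng.gauss(0.0, sigma_u)] for _ in xs]
    noise = math.sqrt(2.0 * sigma_u ** 2 * dt / tau_u)
    n = len(xs)
    n_steps = round((t_final - t_launch) / dt)
    for step in range(1, n_steps + 1):
        t = t_final - step * dt
        # Step 1: predict each drifter backward over one interval.
        for x, u in zip(xs, us):
            bu, bv = background(x[0], x[1], t + dt)
            x[0] -= (bu + u[0]) * dt
            x[1] -= (bv + u[1]) * dt
            for k in (0, 1):
                u[k] += -(u[k] / tau_u) * dt + noise * rng.gauss(0.0, 1.0)
        # Step 2: unnormalized weights from Eq. (5); random.choices normalizes them as in Eq. (7).
        mx = sum(x[0] for x in xs) / n
        my = sum(x[1] for x in xs) / n
        var = spread_fn(max(t - t_launch, 0.0)) ** 2 + err ** 2
        w = [math.exp(-0.5 * (((x[0] - mx) ** 2 + (x[1] - my) ** 2) / var) ** flatness)
             for x in xs]
        # Step 3: resample with replacement; duplicates carry their velocity state along.
        picks = rng.choices(range(n), weights=w, k=n)
        xs = [list(xs[i]) for i in picks]
        us = [list(us[i]) for i in picks]
    # Step 4: the loop ends at t_launch; the samples represent p_r(x, t_L).
    return [tuple(x) for x in xs]

# Toy usage: uniform 5 km/day eastward flow, targets clustered near x = 150 km.
rng = random.Random(1)
targets = [(150.0 + dx, dy) for dx in (-20.0, 0.0, 20.0) for dy in (-20.0, 0.0, 20.0)]
launch = rpf_backward(targets, n_per_target=20, background=lambda x, y, t: (5.0, 0.0),
                      spread_fn=lambda age: 2.5 * age ** 1.23, err=100.0,
                      t_final=0.0, t_launch=-30.0, dt=1.0,
                      sigma_u=8.64, tau_u=3.0, rng=rng)
```

In this toy setting the returned samples should cluster around the origin, since 30 days of 5 km/day eastward advection carries drifters about 150 km. Resampling with `random.choices` is a plain multinomial draw; other resampling schemes could be substituted without changing the structure of the loop.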
3. Numerical experiments

a. Measures of performance

We evaluate the benefit of the RPF procedure by comparing two ensembles of trajectories. One is an ensemble produced with the RPF procedure as described in the previous section and denoted as $\mathbf{x}_n^{\mathrm{RPF}}$, $n = 1, \ldots, N$; the other is an ensemble of trajectories produced with only the random flight model without any constraint and denoted as $\mathbf{x}_n^{\mathrm{Ens}}$. To compare the two ensembles, the launch-site distribution, $p_r^{\mathrm{RPF}}$ or $p_r^{\mathrm{Ens}}$, estimated by each ensemble is used to initialize some test drifters for forward trajectory simulations. The target locations estimated by the test drifters can then be used to evaluate statistical accuracy in reproducing the known target locations $\mathbf{X}_m$, $m = 1, \ldots, M$. The trajectories of the test drifters are denoted as $\mathbf{z}_\ell(t)$, $t_L \le t \le t_F$ for $\ell = 1, \ldots, L$, which are initialized by samples from one of the distributions, $p_r^{\mathrm{RPF}}(\mathbf{x}, t_L)$ or $p_r^{\mathrm{Ens}}(\mathbf{x}, t_L)$. The corresponding ensemble of target locations $\mathbf{Z}_\ell^{\mathrm{RPF}} \equiv \mathbf{z}_\ell^{\mathrm{RPF}}(t_F)$ or $\mathbf{Z}_\ell^{\mathrm{Ens}} \equiv \mathbf{z}_\ell^{\mathrm{Ens}}(t_F)$ will each be compared against the actual target locations $\mathbf{X}_m$. To measure closeness of two locations $\mathbf{x}$ and $\mathbf{z}$, we use the closeness function defined as

$$C(\mathbf{x}, \mathbf{z}) = \exp\left[-\left(\frac{\|\mathbf{x} - \mathbf{z}\|}{a}\right)^2\right], \qquad (8)$$
where $a$ is an e-folding radius parameter that controls tolerance for colocation errors. Here, we set $a$ equal to the grid spacing of the given background circulation field $\bar{\mathbf{u}}(\mathbf{x}, t)$. To quantify whether the $m$th target has been reached by any of the test drifters, we define the skill function,

$$G(\mathbf{X}_m) = \min\left[1, \sum_{\ell=1}^{L} C(\mathbf{X}_m, \mathbf{Z}_\ell)\right] \times 100, \qquad (9)$$
with a maximum score of 100. We use the minimum value $\gamma \equiv \min_m G(\mathbf{X}_m)$ to score effectiveness of the ensemble to cover all the targets. We call $\gamma$ the coverage score. We also quantify whether the $\ell$th test drifter has reached near any of the targets using the skill function,

$$H(\mathbf{Z}_\ell) = \min\left[1, \sum_{m=1}^{M} C(\mathbf{X}_m, \mathbf{Z}_\ell)\right] \times 100, \qquad (10)$$
again with a maximum score of 100. We use the mean $\mu$ and standard deviation of $H(\mathbf{Z}_\ell)$ over the drifter ensemble $\ell = 1, \ldots, L$ to score efficiency of the ensemble in reaching a target. A higher $\mu$ value indicates that a drifter from the ensemble is less likely to miss a target and that the drifter destination is more likely to be focused near a target location. We hence call $\mu$ the resolution score. Efficiency of the forward drifter ensemble is also dependent on compactness of the potential launch area determined by the launch-site PDF $p_r(\mathbf{x}, t_L)$. To quantify the total area, we first define the launch potential probability to be

$$P_\rho = \int_{\mathbf{x}' \in \Theta_\rho} p_r(\mathbf{x}', t_L)\,d\mathbf{x}', \qquad (11)$$

where $\Theta_\rho = \{\mathbf{x}' : p_r(\mathbf{x}', t_L) \ge \rho\}$ is the domain of potential launch sites with a probability density value of at least $\rho$. For the probability density value $\rho_o$ such that $P_{\rho_o}$ is exactly 95% (or 0.95), we call the corresponding $\Theta_{\rho_o}$ the region of 95th-percentile launch potential and its boundary the 95th-percentile contour. We use the total area, denoted $A_{95}$, occupied by $\Theta_{\rho_o}$ to score the compactness of the potential drifter launch site.
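The closeness function and the two skill scores of Eqs. (8) to (10) can be written compactly. The sketch below is illustrative: the function names and the toy positions are hypothetical, and the standard deviation is computed as a population statistic, which is an assumption since the normalization is not stated in the text.

```python
import math

def closeness(x, z, a):
    """Eq. (8): exp(-(||x - z|| / a)^2) for 2D points x and z."""
    dist = math.hypot(x[0] - z[0], x[1] - z[1])
    return math.exp(-(dist / a) ** 2)

def coverage_score(targets, arrivals, a):
    """Eq. (9) per target, plus the coverage score (minimum over targets)."""
    per_target = [100.0 * min(1.0, sum(closeness(x, z, a) for z in arrivals))
                  for x in targets]
    return min(per_target), per_target

def resolution_score(targets, arrivals, a):
    """Eq. (10) per test drifter, summarized by its mean and (population) standard deviation."""
    per_drifter = [100.0 * min(1.0, sum(closeness(x, z, a) for x in targets))
                   for z in arrivals]
    mean = sum(per_drifter) / len(per_drifter)
    var = sum((h - mean) ** 2 for h in per_drifter) / len(per_drifter)
    return mean, math.sqrt(var)

# Toy case: two targets, three test drifters, a = 20 km (one grid spacing).
targets = [(0.0, 0.0), (100.0, 0.0)]
arrivals = [(0.0, 0.0), (0.0, 0.0), (500.0, 500.0)]
gamma, per_target = coverage_score(targets, arrivals, a=20.0)
mu, sigma_h = resolution_score(targets, arrivals, a=20.0)
print(gamma, per_target, mu, sigma_h)
```

In the toy case the first target is fully covered while the second is essentially missed, so the coverage score is near zero even though two of the three drifters land exactly on a target.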
b. Source estimation problem

To examine performance of the ILP solution methods, a controlled numerical experiment has been conducted where the target trajectories are known with certainty to originate from a single launch site. We hence call this scenario the source estimation problem. A potential application of this scenario is estimation of spawning site(s) based on observations of larval dispersal (Cowen et al. 2006). In this experiment, the target locations $\mathbf{X}_m$ are simulated using a given random flight model and a launch site. The inverse trajectory calculation is performed using the same random flight model and background flow, except for values of some stochastic parameters to enhance filter performance, as detailed below. The estimated launch-site PDF can then be compared to the known launch site.
1) PARAMETERS

The background flow $\bar{\mathbf{u}}(\mathbf{x}, t)$ is obtained from a numerical simulation of an idealized, classic double-gyre ocean circulation with a Rossby radius of roughly 40 km over a horizontal domain of 2000 km × 2000 km with a grid size of $\Delta s = 20$ km. Figure 1 shows the streamfunction of $\bar{\mathbf{u}}(\mathbf{x}, t_L)$ and the five launch sites used in this experiment. The five launch sites are chosen at or near circulation features such as strong jetlike currents, vortices, and saddle points. The turbulent velocity parameters were $\tau_u = 3$ days and $\sigma_u = 5$ cm s$^{-1}$ for both velocity components. The resulting random flight model was used to simulate trajectory ensembles from 100 launch sites distributed uniformly over the horizontal domain to obtain the ensemble spread functions $D_r(t)$ shown in Fig. 2. The first 30 days of the mean ensemble spread can be characterized by a geometric function (a straight line on the log–log plot), which is approximated by a least squares fit to be $D_r(t) = 2.5t^{1.23}$ in kilometers. The same random flight model was used to launch $M = 30$ drifters from each of the five launch sites of Fig. 1 at $t_L = -30$ days, and the final drifter locations at $t_F = 0$ days were recorded and used as the targets $\mathbf{X}_m$ for the ILP problem. Figure 3 (dots) shows the target locations corresponding to the launch location in the strong eastward current ("Jet" from Fig. 1). The distribution of the target sites exhibits two distinct groupings.
FIG. 3. The 30 target sites (dots) corresponding to the Jet launch site (star) of Fig. 1.
For the backward trajectory simulations, the turbulence magnitude in the random flight model was doubled to $\sigma_u = 10$ cm s$^{-1}$. This increase is aimed at facilitating a wider spatial coverage by the backward trajectories and hence enhancing their collective chance of finding the launch site. The new $\sigma_u$ value was chosen so that the unfiltered solution $\mathbf{x}_n^{\mathrm{Ens}}(t_L)$ would have a perfect coverage score, $\gamma^{\mathrm{Ens}} = 100$, in each of the five launch cases. We used $N_m = 300$ backward drifters for each target site for a total of $N = 9000$ backward trajectories. To obtain the filtered trajectories $\mathbf{x}_n^{\mathrm{RPF}}(t)$, the ambient localization uncertainty of $E = 5\Delta s = 100$ km was used along with the ensemble spread function $D_r(t)$ described previously to realize the particle filter constraint in Eq. (5); $L = 5000$ forward test drifters ($\mathbf{z}_\ell$) were deployed in the model to evaluate the backward simulation results.
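To make the role of these parameters concrete, the following sketch evaluates the standard deviation $\sqrt{D_r^2 + E^2}$ that enters Eq. (5) at a few times during the backward simulation. It assumes that $t$ in the fitted spread function is measured in days since launch and that the spread is evaluated at $t - t_L$; the constant and function names are hypothetical.

```python
import math

DELTA_S_KM = 20.0              # grid size of the background flow
E_KM = 5.0 * DELTA_S_KM        # ambient localization uncertainty (100 km)
T_LAUNCH_DAYS = -30.0          # launch time t_L
SIGMA_U_KM_PER_DAY = 10.0 * 0.864  # 10 cm/s converted to km/day (1 cm/s = 0.864 km/day)

def spread_km(age_days):
    """Fitted ensemble spread 2.5 t^1.23, assuming t is in days since launch."""
    return 2.5 * age_days ** 1.23

def constraint_std_km(t_days):
    """sqrt(D_r(t - t_L)^2 + E^2): the width used in the constraining PDF."""
    age = max(t_days - T_LAUNCH_DAYS, 0.0)
    return math.sqrt(spread_km(age) ** 2 + E_KM ** 2)

for t in (0.0, -10.0, -20.0, -30.0):
    print(t, constraint_std_km(t))
```

Under these assumptions the constraint is loose at the final time, when the forward spread has had 30 days to grow, and tightens to $E$ alone as the simulation approaches the launch time.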
2) RESULTS

Figure 4 shows time progressions of the unconstrained PDF $p_r^{\mathrm{Ens}}$ (left column) and its filtered counterpart $p_r^{\mathrm{RPF}}$ (right column) for the Jet case. As expected, the area bounded by the 95th-percentile contours (thick lines) of the unconstrained PDF expands as the simulation progresses, whereas the analogous area for the filtered PDF contracts in accordance with the enforced (forward) dispersion characteristics. At day $t_L = -30$, when the target had been launched, the filtered PDF has surrounded the launch site (star, lower-right panel) tightly with the 10th-percentile contour, whereas the unconstrained PDF could surround the launch site only loosely with the 50th-percentile contour (bottom-left panel). For the other four cases, the launch time PDF is also found to be significantly tighter for the filtered simulations than the unconstrained simulations, and in each
FIG. 4. Progression of the density of backward drifters initialized at each of the 30 target sites shown in Fig. 3. The 95th-percentile contours (thick lines) as well as 50th- and 10th-percentile contours (thin lines) of the drifter density are shown after 10, 20, and 30 days of simulation. The simulations are shown (left) without and (right) with the particle filter. The source (star) is located more accurately with the particle filter after the 30-day analysis period.
case the 95th-percentile contour of the filtered PDF is found to enclose the actual launch site (Fig. 5). The area $A_{95}^{\mathrm{RPF}}$ enclosed by the 95th-percentile contours of the filtered launch-time PDF was 19% to ~32% of the unconstrained counterpart $A_{95}^{\mathrm{Ens}}$ in the five cases examined (Table 1).
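One way to estimate the area $A_{95}$ from a sample ensemble is to bin the launch positions on a grid and accumulate the densest cells until they hold 95% of the samples. The sketch below follows that idea; the binning approach, the function name, and the toy ensembles are assumptions for illustration, since the text does not state how the contours were computed.

```python
import math
from collections import Counter

def area_95(samples, cell, level=0.95):
    """Area of the smallest set of grid cells that holds `level` of the samples."""
    counts = Counter((math.floor(x / cell), math.floor(y / cell)) for x, y in samples)
    needed = level * len(samples)
    covered = 0
    n_cells = 0
    # Visit cells from densest to sparsest, as in a density threshold lowered step by step.
    for count in sorted(counts.values(), reverse=True):
        if covered >= needed:
            break
        covered += count
        n_cells += 1
    return n_cells * cell * cell

# Toy ensembles: a compact one and a split one, positions in km, 20 km cells.
compact = [(5.0, 5.0)] * 96 + [(105.0, 5.0), (205.0, 5.0), (305.0, 5.0), (405.0, 5.0)]
split = [(5.0, 5.0)] * 50 + [(105.0, 5.0)] * 50
ratio = area_95(compact, cell=20.0) / area_95(split, cell=20.0)
print(area_95(compact, cell=20.0), area_95(split, cell=20.0), ratio)
```

The ratio of two such areas plays the role of $A_{95}^{\mathrm{RPF}}/A_{95}^{\mathrm{Ens}}$; the estimate depends on the chosen cell size, so the same grid should be used for both ensembles.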
As shown in Table 1, the coverage scores were perfect ($\gamma = 100$) for both unconstrained and filtered solutions in all five cases, implying that the filtered launch-time PDFs are effective in delivering drifters to all target sites. The resolution scores are 1.6–1.9 times higher for the filtered results than their unconstrained counterparts,
FIG. 5. The 95th-percentile contours of the drifter density after 30 days of inverse simulation, estimating four of five launch sites shown in Fig. 1 (see Fig. 4 for the one remaining launch site). Results obtained with (dark contour) and without (light contour) filter constraint are shown. Again, simulations with the particle filter can localize each launch site more tightly.
indicating that 60%–90% more drifters launched according to the filtered PDF would reach a target than those prescribed by the unconstrained PDF.
c. Array deployment problem

In this experiment, the target sites are chosen arbitrarily instead of simulated using a single launch site. This scenario, called the array deployment problem hereafter, is motivated by the operation of delivering drifting sensors to a desired set of sampling locations. Unlike the source estimation problem, there is no guarantee that a single launch site is sufficient to reliably deliver drifters to all targets.
The same random flight model as the source estimation problem is used. Four cases, each with a 5 × 5 square array of targets ($M = 25$), are considered, as depicted in Fig. 6. As before, $N_m = 300$ backward drifters are released from each target site, for a total of $N = 7500$ backward drifters. Analysis duration is 15 days ($t_L = -15$). Other experimental parameters are identical to those in the source estimation problem. Figure 7 shows the 95th-percentile contours for the unconstrained (light line) and filtered (dark line) launch-time PDFs in each of four deployment cases, depicting that the area for the most probable launch sites determined by the unconstrained method is significantly
TABLE 1. Skill scores from the source estimation experiment.

Launch point     $A_{95}^{\mathrm{RPF}}/A_{95}^{\mathrm{Ens}}$   $\gamma^{\mathrm{Ens}}$   $\gamma^{\mathrm{RPF}}$   $\mu^{\mathrm{Ens}} \pm E^{\mathrm{Ens}}$   $\mu^{\mathrm{RPF}} \pm E^{\mathrm{RPF}}$   $\mu^{\mathrm{RPF}}/\mu^{\mathrm{Ens}}$
NW Saddle        0.186   100   100   29.1 ± 38.9   48.4 ± 45.4   1.66
Jet              0.210   100   100   30.1 ± 38.8   52.8 ± 42.9   1.76
Cold Core        0.278   100   100   38.5 ± 40.6   66.4 ± 37.9   1.72
Far East         0.316   100   100   49.4 ± 44.5   80.9 ± 31.9   1.64
Center Saddle    0.240   100   100   33.1 ± 39.5   64.2 ± 39.4   1.94
FIG. 6. The four 5 × 5 arrays of target sites (circles) and the background flow at launch time (dark contours are anticyclonic and light contours are cyclonic) in the array deployment experiment. The target arrays are (top left) NW, (top right) NE, (bottom right) SC, and (bottom left) SW.
larger than the area determined by the filtered counterparts. Relative compactness quantified by the ratio $A_{95}^{\mathrm{RPF}}/A_{95}^{\mathrm{Ens}}$, which ranges from 11% to ~23% in the four cases (Table 2), indicates a larger difference between the two PDFs in this experiment than in the source estimation experiment despite the shorter simulation time (15 days) here. However, although the unconstrained PDF has perfect coverage scores, the filtered PDF has missed perfect coverage in all but one case ($\gamma^{\mathrm{RPF}}$; Table 2). In particular, there was one target site each in the "NW" and "SC" cases and two target sites in the "SW" case that had less than perfect [$G(\mathbf{X}_m) < 100$] coverage. To remedy the coverage issue in the filtered results, we have made extra simulations of backward drifters initialized only at the target site(s) $\mathbf{X}_m$ with imperfect coverage $G(\mathbf{X}_m) < 100$. The RPF constraint is also applied to these supplemental drifter ensembles. The final simulated positions at day $t_L$ are added to the original filtered positions $\mathbf{x}_n^{\mathrm{RPF}}$, and the combined ensemble of positions is used as a new estimate of the launch-time PDF. We call this procedure the supplemented RPF (SPF) method. Figure 8 shows the 95th-percentile contours for the SPF results (dark line) for the NW and SC cases. The supplemented position samples have added only slight additional area in each case. The additional area can be distant from the area estimated by RPF before combination (SC case; Fig. 8). These small additions lead to perfect coverage scores in all cases ($\gamma^{\mathrm{SPF}}$;
FIG. 7. The 95th-percentile contours of drifter density indicating the potential launch site to deploy the shown drifter array. The results obtained with (dark contour) and without (light contour) the particle filter are shown.
JOURNAL OF ATMOSPHERIC AND OCEANIC TECHNOLOGY
VOLUME 27
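TABLE 2. Skill scores from the array deployment experiment.

| Array | $A_{95}^{RPF}/A_{95}^{Ens}$ | $A_{95}^{SPF}/A_{95}^{Ens}$ | $\gamma^{Ens}$ | $\gamma^{RPF}$ | $\gamma^{SPF}$ | $\mu^{Ens}$ | $\mu^{RPF}$ | $\mu^{SPF}$ | $\mu^{RPF}/\mu^{Ens}$ | $\mu^{SPF}/\mu^{Ens}$ |
|---|---|---|---|---|---|---|---|---|---|---|
| NW | 0.232 | 0.279 | 100 | 39.5 | 100 | 57.4 | 79.3 | 76.4 | 1.38 | 1.31 |
| NE | 0.148 | 0.148 | 100 | 100.0 | 100 | 52.4 | 74.5 | 74.5 | 1.42 | 1.42 |
| SC | 0.111 | 0.135 | 100 | 21.4 | 100 | 55.2 | 73.5 | 73.2 | 1.33 | 1.33 |
| SW | 0.156 | 0.271 | 100 | 1.2 | 100 | 56.4 | 76.9 | 73.5 | 1.36 | 1.29 |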
Table 2). The resolution scores ($\mu$; Table 2) are generally high in all cases, likely because of the relatively short analysis time. The incorporation of supplemental position samples does not decrease the resolution scores significantly.
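As a rough illustration of how an area ratio of the kind reported in Table 2 and the SPF pooling step could be computed from position samples, the sketch below estimates the area of the 95th-percentile region by counting the densest histogram cells that together hold 95% of the samples. The function `area_95`, the 10-km grid, and the synthetic Gaussian ensembles are assumptions made for illustration only; they are not the density estimator or the data used in the experiment.

```python
import numpy as np

def area_95(xy, edges, mass=0.95):
    """Area (km^2) of the smallest set of grid cells holding `mass` of the samples."""
    counts, _, _ = np.histogram2d(xy[:, 0], xy[:, 1], bins=[edges, edges])
    cell_area = (edges[1] - edges[0]) ** 2
    sorted_counts = np.sort(counts.ravel())[::-1]        # densest cells first
    cum = np.cumsum(sorted_counts) / sorted_counts.sum()
    n_cells = int(np.searchsorted(cum, mass) + 1)         # cells needed to reach `mass`
    return n_cells * cell_area

rng = np.random.default_rng(0)
edges = np.linspace(-500.0, 500.0, 101)                   # 10-km cells, hypothetical domain

# Synthetic stand-ins for the three ensembles (positions in km).
ens = rng.normal(0.0, 100.0, size=(5000, 2))              # diffuse, unconstrained
rpf = rng.normal(0.0, 40.0, size=(5000, 2))               # compact, filtered
extra = rng.normal([150.0, 0.0], 15.0, size=(500, 2))     # supplemental drifters for missed targets
spf = np.vstack([rpf, extra])                             # SPF: pool filtered and supplemental samples

a_ens, a_rpf, a_spf = (area_95(s, edges) for s in (ens, rpf, spf))
print("RPF/Ens area ratio:", a_rpf / a_ens)
print("SPF/Ens area ratio:", a_spf / a_ens)
```

In this synthetic setup the pooled (SPF) region should be only modestly larger than the RPF region, because the supplemental cluster is small and compact even though it lies away from the main mode.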
d. Application: Derelict fishing vessel

Next, we consider an experiment based on an actual search and rescue scenario to determine the location and time of an accident on a fishing boat that subsequently started to drift. Historical ocean data and real-time wind observations are combined for the background flow. For the drifting boat under the action of both currents and winds, the crosswind leeway coefficient that relates wind velocity to boat motion can be either positive or negative, depending on how the boat is leaning at the onset of the drift (Allen and Plourde 1999). This leads to bimodality in the distribution for the drifter location, and this experiment can hence demonstrate the capability of the particle filter to handle non-Gaussian distributions.

Because the actual launch site for this case is unknown, we use the solution obtained by a brute-force approach to serve as the reference to compare the filtered estimate $p_r^{RPF}$ against the unconstrained counterpart $p_r^{Ens}$. The brute-force solution is obtained by computing 20 forward trajectories from every potential launch site (regularly sampled grid points over the entire ocean domain $D$) to obtain estimates $z_\ell^{BF}(t_F)$ of the target position. Each launch site is then scored in terms of the average closeness $C[z_\ell^{BF}(t_F), X_1]$ to the actual final position $X_1$ of the fishing boat. The ensemble of launch sites $z_\ell^{BF}(t_L)$ corresponding to the highest 50 scores is used as the reference solution. Note that high computational costs could make a brute-force solution impractical in operations where the potential launch sites cover both a large ocean domain $D$ and a long time window.
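The brute-force procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: `simulate_forward` is a hypothetical stand-in simulator (uniform mean flow plus a random walk), `closeness` is an assumed Gaussian kernel of the miss distance rather than the closeness measure $C$ defined in the paper, and the flow values, grid, and units are illustrative. Only the counts of 20 trajectories per site and 50 retained sites come from the text.

```python
import numpy as np

def simulate_forward(start, n_traj, n_steps, dt, mean_flow, sigma, rng):
    """Hypothetical simulator: final positions after uniform advection plus a random walk."""
    kicks = rng.normal(0.0, sigma * np.sqrt(dt), size=(n_traj, n_steps, 2))
    return start + mean_flow * n_steps * dt + kicks.sum(axis=1)

def closeness(z, target, scale):
    """Assumed closeness measure: Gaussian kernel of the distance to the target."""
    d2 = np.sum((z - target) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / scale**2)

rng = np.random.default_rng(1)
mean_flow = np.array([3.0, -4.0])      # km/day, illustrative only
target = np.array([0.0, 0.0])          # observed final position X_1 (km)

# Regular grid of candidate launch sites over a hypothetical domain (km).
gx, gy = np.meshgrid(np.arange(-60, 61, 5.0), np.arange(-60, 61, 5.0))
sites = np.column_stack([gx.ravel(), gy.ravel()])

# Score each site by the average closeness of its 20 forward trajectories.
scores = np.array([
    closeness(simulate_forward(s, 20, 5, 1.0, mean_flow, 2.0, rng), target, 10.0).mean()
    for s in sites
])

# Keep the 50 best-scoring launch sites as the reference solution.
top = sites[np.argsort(scores)[::-1][:50]]
print(top.mean(axis=0))
```

The cost grows with the number of grid points times the number of trajectories per point, which is why such a search becomes impractical over a large domain and a long time window.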
1) SCENARIO

The fishing vessel Poseidon was sighted at 1115 LT 11 April 1999 off the Gulf Coast at 29°22.7′N, 85°32.4′W by the U.S. Coast Guard (USCG). All three members of its crew were dead, presumably from inhaling poisonous fumes. The last known contact with the crew was made at 1800 LT 6 April. Although the exact time and location of the deaths are unknown, the time is assumed to be before early morning of 7 April 1999 because the routine morning radio call to the home base was not made on 7 April. The vessel location at the time of death is needed in this case to determine legal jurisdiction.
2) PARAMETERS

Pictures of F/V Poseidon were scanned and sent to Art Allen of USCG, a leading expert on leeway drag coefficients, who suggested using a downwind coefficient of 2.5% of the 10-m average wind speed and a crosswind coefficient of ±2.76% of the 10-m average wind speed. Sources for wind data are direct observations and numerical weather simulations that, for operational products, contain a blend of observations. Because there is a tendency for the numerical wind fields to be smooth (reduced energy at high frequency and wavenumber), it is desirable to use in situ observations. The closest operating National Data Buoy Center (NDBC) stations to the final position are station CSB at Cape St. Blas and offshore station 42039 located at 28°48′N, 86°3.6′W. Hourly data were available for our time period. From these station wind data, the random flight model (AR-1) parameters are estimated to be $\sigma_u^2 = 50$ cm$^2$ s$^{-2}$ and $\tau_u = 6$ h for each wind velocity component.

The mean ocean currents in this region are seasonal. Interpolation of the Mariano Global Surface Velocity Analysis (MGSVA) data yielded a mean flow on the order of 3 cm s$^{-1}$ toward the east and 4 cm s$^{-1}$ toward the south for this region in April. Numerical simulations by He and Weisberg (2002) for the mean circulation of the Florida Panhandle and west Florida shelves for April 1999 also showed a flow toward the south-southeast with speeds on the order of 5 cm s$^{-1}$. Sea surface height maps and trajectories of near-surface buoys revealed that the Loop Current was entraining the region from the outer shelf of the Florida Panhandle at this time. This also supports the south-southeast direction of the flow given by MGSVA. The AR-1 parameters of $\sigma_u^2 = 25$ cm$^2$ s$^{-2}$ and $\tau_u = 3$ days were used for each component of ocean turbulence velocity to complete the random flight model specification.

FIG. 8. As in Fig. 7 (top), but the particle filter estimates (dark contour) have been augmented with an additional simulation.

FIG. 9. Final position of F/V Poseidon (red cross) and launch-site estimates by the brute-force solution ("x" and "o," where symbols correspond to the two crosswind leeway coefficients used), with launch-site probability contours by the (top) unconstrained backward ensemble solution and (bottom) particle-filtered ensemble solution.

3) RESULTS

The background flow is given by the combined effects of mean ocean flow and hourly winds reduced by drag coefficients, whereas the turbulent velocity is given by the sum of the AR-1 processes for the winds and ocean flows. The sign of the crosswind coefficient (+ or −) leads to a bimodal distribution of simulated drifter positions. Each solution was obtained by first computing the drifter trajectories under each sign separately and then combining all trajectories to form a single ensemble. The brute-force solution shown in Fig. 9 indicates that the launch-site estimates are clustered according to the crosswind coefficient value (sign) used. The solution from the unconstrained backward ensemble simulation $p_r^{Ens}$ does not capture this clustering and has a relatively diffuse distribution (Fig. 9, top). The solution from the filtered ensemble $p_r^{RPF}$, on the other hand, has a bimodal distribution with each of the two probability maxima associated closely with the launch sites estimated by the brute-force approach.
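To make the origin of the bimodality concrete, the following sketch builds the drift velocity from a mean current, a leeway term with a signed crosswind component, and two AR-1 turbulent velocities, then pools the two signed ensembles. The leeway coefficients, AR-1 variances and time scales, and the mean current are the values quoted in the text; the steady 8 m s$^{-1}$ northward wind, the 108-h duration, the exact-discretization form of the AR-1 update, the convention that the positive crosswind direction is 90° clockwise from the wind, and all function names are assumptions for illustration. The sketch accumulates displacement forward in time and omits the particle filter constraint.

```python
import numpy as np

def ar1_series(n_steps, n_drifters, dt, sigma2, tau, rng):
    """Stationary AR-1 velocity (cm/s) per component; exact discretization (assumed form)."""
    a = np.exp(-dt / tau)
    u = np.empty((n_steps, n_drifters, 2))
    u[0] = rng.normal(0.0, np.sqrt(sigma2), size=(n_drifters, 2))
    for k in range(1, n_steps):
        noise = rng.normal(0.0, np.sqrt(sigma2 * (1.0 - a**2)), size=(n_drifters, 2))
        u[k] = a * u[k - 1] + noise
    return u

def leeway_velocity(wind, sign, downwind=0.025, crosswind=0.0276):
    """Downwind part along the wind; crosswind part along the wind rotated 90 deg clockwise."""
    perp = np.stack([wind[..., 1], -wind[..., 0]], axis=-1)
    return downwind * wind + sign * crosswind * perp

def drift(sign, wind, mean_current, n_drifters, dt, rng):
    """Net displacement (km) of each drifter for one sign of the crosswind coefficient."""
    n_steps = wind.shape[0]
    turb = (ar1_series(n_steps, n_drifters, dt, 50.0, 6 * 3600.0, rng)      # wind AR-1
            + ar1_series(n_steps, n_drifters, dt, 25.0, 3 * 86400.0, rng))  # ocean AR-1
    v = mean_current + leeway_velocity(wind, sign)[:, None, :] + turb       # cm/s
    return v.sum(axis=0) * dt / 1e5                                         # cm -> km

rng = np.random.default_rng(2)
dt = 3600.0                                   # hourly steps (s)
wind = np.tile([0.0, 800.0], (108, 1))        # hypothetical steady wind toward the north (cm/s)
mean_current = np.array([3.0, -4.0])          # cm/s: 3 east, 4 south

plus = drift(+1, wind, mean_current, 500, dt, rng)
minus = drift(-1, wind, mean_current, 500, dt, rng)
ensemble = np.vstack([plus, minus])           # pool both signs into a single ensemble
print(plus.mean(axis=0), minus.mean(axis=0))
```

With these illustrative numbers the two modes separate in the crosswind direction by far more than the turbulent spread, which is the kind of bimodality that the pooled ensemble is meant to represent.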
TABLE 3. Skill scores from the source estimation experiment repeated with a more energetic $\Delta s = 10$ km background flow (Fig. 10).

| Launch point | $A_{95}^{RPF}/A_{95}^{Ens}$ | $\gamma^{Ens}$ | $\gamma^{RPF}$ | $\mu^{Ens} \pm E^{Ens}$ | $\mu^{RPF} \pm E^{RPF}$ | $\mu^{RPF}/\mu^{Ens}$ |
|---|---|---|---|---|---|---|
| Jet Head | 0.121 | 100 | 100 | 6.0 ± 17.5 | 11.3 ± 22.8 | 1.89 |
| Jet and Saddle | 0.212 | 100 | 100 | 7.8 ± 20.8 | 11.6 ± 24.5 | 1.48 |
| Jet Tail | 0.175 | 100 | 100 | 7.3 ± 18.5 | 11.1 ± 22.6 | 1.53 |
| East Ring | 0.366 | 100 | 100 | 13.6 ± 26.9 | 18.2 ± 29.5 | 1.33 |
| Big Ring | 0.186 | 100 | 100 | 5.1 ± 16.8 | 7.0 ± 19.9 | 1.36 |
FIG. 10. Source estimation experiment repeated with a more energetic background flow. The 95th-percentile contours of launch-site probability obtained with (dark contour) and without (light contour) the particle filter are shown. (top left) Background flow (dark contours are anticyclonic and light contours are cyclonic) and launch sites (stars) named as follows (counterclockwise from left): "Jet Head," "Jet and Saddle," "Jet Tail," "East Ring," and "Big Ring."
4. Discussion and conclusions

We have explored a particle filter approach to solve the inverse Lagrangian prediction problem by an ensemble simulation of backward trajectories. The numerical experiments presented here demonstrate that ensemble spread can be controlled using a constraint derived empirically and that the constrained solution leads to a spatially more compact estimate of the launch site. The constrained solution is thus more efficient than the unconstrained counterpart, while not compromising much effectiveness in delivery to the intended target sites. Because of high demands for shipping resources in drifter deployment, adapting the technique to specific operational parameters and evaluating its benefits are potential topics of future investigation. The particle filter (more specifically, the resampled particle filter) is well suited for realization of the constrained
trajectory simulation because of the flexibility of the method. Although particle filter methods are generally known to require a large number of samples to perform well (e.g., Chin et al. 2007), this issue should not become a limiting factor in Lagrangian applications because of the relatively low computational cost of simulating a trajectory given a background flow field.

Details of the random flight model can be expected to alter the results presented here. Our investigations so far, however, have shown no evidence of this. For example, statistics of both North Atlantic floating buoys and high-resolution numerical circulation simulations show that larger velocity values are observed more frequently than predicted by a Gaussian distribution (Bracco et al. 2000). When a random number generator capable of fitting distributions with "fat tails" (Chin et al. 1998) was used to replace the Gaussian random number generator in the AR-1 turbulent velocity model (2), however, the resulting drifter trajectories changed only negligibly. Also, we subsampled the background flow to a daily interval instead of every 20 min during the backward trajectory simulations but again found no appreciable change in drifter trajectories in the source estimation experiment.

On the other hand, the accuracy of the background flow field is expected to affect the quality of the solution. For example, stronger flow tends to increase uncertainty in Lagrangian trajectories. When the "source estimation" experiment was repeated using a more energetic background flow [by halving the grid spacing to $\Delta s = 10$ km, with the ensemble spread $D_r(t) = 4.9 \times t^{1.30}$ doubled in magnitude], the filter constraint had to be loosened (by increasing the uncertainty parameter to $E = 20\Delta s = 200$ km) for the coverage to be perfect ($\gamma^{RPF}$; Table 3). The resolution scores became significantly lower ($\mu$; Table 3) as a consequence. Also, the most probable launch areas are generally larger (Fig. 10) because of greater dispersion. The relative performance of the filtered solution in reference to the unconstrained solution, however, remained unchanged both qualitatively (Fig. 10) and quantitatively ($A_{95}^{RPF}/A_{95}^{Ens}$, $\mu^{RPF}/\mu^{Ens}$; Table 3). The RPF method can hence be expected to maintain some level of relative performance in a highly energetic background flow.

A potentially more serious issue would be systematic errors in the background flow. The main issue is not that these errors can affect the mean trajectory of a drifter ensemble (and hence the estimate of the launch site) but that there is a general lack of viable models, statistical or otherwise, of such errors. Use of basis functions such as empirical orthogonal functions may be among the few available techniques to characterize systematic flow errors. Efforts are under way to enhance the presented
techniques to incorporate some assumed mathematical forms for systematic errors in the background flow field.

Acknowledgments. This work is supported by Office of Naval Research Grant N00014-07-1-0175.

REFERENCES

Allen, A., and J. V. Plourde, 1999: Review of leeway: Field experiments and implementation. U.S. Coast Guard Tech. Rep. CG-D-08-99, 352 pp.

Babiano, A., A. Provenzale, and A. Vulpiani, 1994: Chaotic Advection, Tracer Dynamics, and Turbulent Dispersion. Proc. NATO Advanced Research Workshop and EGS Tropical Workshop on Chaotic Advection, Sereno di Gavo, Italy, NATO/EGS, 329 pp.

Bracco, A., J. H. LaCasce, and A. Provenzale, 2000: Velocity probability density functions for oceanic floats. J. Phys. Oceanogr., 30, 461–474.

Chin, T. M., R. F. Milliff, and W. G. Large, 1998: Basin-scale, high-wavenumber sea surface wind fields from a multiresolution analysis of scatterometer data. J. Atmos. Oceanic Technol., 15, 741–763.

Chin, T. M., M. J. Turmon, J. B. Jewell, and M. Ghil, 2007: An ensemble-based smoother with retrospectively updated weights for highly nonlinear systems. Mon. Wea. Rev., 135, 186–202.

Cowen, R. K., C. B. Paris, and A. Srinivasan, 2006: Scaling of connectivity in marine populations. Science, 311, 522–527.

Doucet, A., N. de Freitas, and N. Gordon, Eds., 2001: Sequential Monte Carlo Methods in Practice. Springer-Verlag, 581 pp.

Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99, 10 143–10 162.

Griffa, A., 1996: Applications of stochastic particle models to oceanographic problems. Stochastic Modelling in Physical Oceanography, A. J. Adler, P. Müller, and B. Rozovskii, Eds., Birkhäuser, 114–140.

Griffa, A., K. Owens, L. Piterbarg, and B. Rozovskii, 1995: Estimates of turbulence parameters from Lagrangian data using a stochastic particle model. J. Mar. Res., 53, 212–234.

He, R., and R. H. Weisberg, 2002: West Florida shelf circulation and temperature budget for the 1999 spring transition. Cont. Shelf Res., 22, 719–748.

Hitchcock, G. L., and R. K. Cowen, 2007: Plankton: Lagrangian inhabitants of the sea. Lagrangian Analysis and Prediction of Coastal and Ocean Dynamics, A. Griffa et al., Eds., Cambridge University Press, 349–400.

Kim, Y. P., S. G. Shim, K. C. Moon, C. G. Hu, C. H. Kang, and K. Y. Park, 1998: Monitoring of air pollutants at Kosan, Cheju Island, Korea, during March–April 1994. J. Appl. Meteor., 37, 1117–1126.

Lin, J. C., C. Gerbig, S. C. Wofsy, A. E. Andrews, B. C. Daube, K. J. Davis, and C. A. Grainger, 2003: A near-field tool for simulating the upstream influence of atmospheric observations: The Stochastic Time-Inverted Lagrangian Transport (STILT) model. J. Geophys. Res., 108, 4493, doi:10.1029/2002JD003161.

Mariano, A. J., and E. Ryan, 2007: Lagrangian analysis and prediction of coastal and ocean dynamics (LAPCOD). Lagrangian Analysis and Prediction of Coastal and Ocean Dynamics, A. Griffa et al., Eds., Cambridge University Press, 423–479.

Piterbarg, L. I., T. M. Özgökmen, A. Griffa, and A. J. Mariano, 2007: Predictability of Lagrangian motion in the upper ocean. Lagrangian Analysis and Prediction of Coastal and Ocean Dynamics, A. Griffa et al., Eds., Cambridge University Press, 136–171.

Pudykiewicz, J. A., 1998: Application of adjoint tracer transport equations for evaluating source parameters. Atmos. Environ., 32, 3039–3050.

Seibert, P., and A. Frank, 2004: Source-receptor matrix calculation with a Lagrangian particle dispersion model in backward mode. Atmos. Chem. Phys., 4, 51–63.

Thomson, D. J., 1987: Criteria for the selection of stochastic models of particle trajectories in turbulent flows. J. Fluid Mech., 180, 529–556.

van Leeuwen, P. J., 2003: A variance-minimizing filter for large-scale applications. Mon. Wea. Rev., 131, 2071–2084.

Wilson, J. D., G. W. Thurtell, G. E. Kidd, and E. G. Beauchamp, 1982: Estimation of the rate of gaseous mass transfer from a surface source plot to the atmosphere. Atmos. Environ., 16, 1861–1868.