The Hybrid Local Ensemble Transform Kalman Filter

JUNE 2014

PENNY

2139

STEPHEN G. PENNY

Department of Atmospheric and Oceanic Science, University of Maryland, College Park, and NOAA/NWS/NCEP, College Park, Maryland

(Manuscript received 22 April 2013, in final form 1 November 2013)

ABSTRACT

Hybrid data assimilation methods combine elements of ensemble Kalman filters (EnKF) and variational methods. While most approaches have focused on augmenting an operational variational system with dynamic error covariance information from an ensemble, this study takes the opposite perspective of augmenting an operational EnKF with information from a simple 3D variational data assimilation (3D-Var) method. A class of hybrid methods is introduced that combines the gain matrices of the ensemble and variational methods, rather than linearly combining the respective background error covariances. A hybrid local ensemble transform Kalman filter (Hybrid-LETKF) is presented in two forms: 1) a traditionally motivated Hybrid/Covariance-LETKF that combines the background error covariance matrices of LETKF and 3D-Var, and 2) a simple-to-implement algorithm called the Hybrid/Mean-LETKF that falls into the new class of hybrid gain methods. Both forms improve analysis errors when using small ensemble sizes and low observation coverage versus either LETKF or 3D-Var used alone. The results imply that for small ensemble sizes, allowing a solution to be found outside of the space spanned by ensemble members provides robustness in both hybrid methods compared to LETKF alone. Finally, the simplicity of the Hybrid/Mean-LETKF design implies that this algorithm can be applied operationally while requiring only minor modifications to an existing operational 3D-Var system.

Corresponding author address: Stephen G. Penny, Dept. of Atmospheric and Oceanic Science, University of Maryland, College Park, 2431 Computer and Space Science Bldg., College Park, MD 20742. E-mail: [email protected]

DOI: 10.1175/MWR-D-13-00131.1

© 2014 American Meteorological Society

1. Introduction

Hybrid data assimilation systems combine two approaches traditionally used in operational weather forecasting: ensemble Kalman filters (EnKF) and the 3D variational data assimilation (3D-Var) and 4D-Var methods. For example, a hybrid system based on the developmental work of Barker (1999), Hamill and Snyder (2000), Lorenc (2003), Buehner (2005), Buehner et al. (2010a,b), and Wang et al. (2007a,b, 2008a,b, 2013) has recently been implemented at the National Centers for Environmental Prediction (NCEP) for use in operational forecasting (Kleist 2012; Wang et al. 2013). Most of the justification given for the improved performance of the traditional hybrids over the variational methods has been that the background error covariance is better defined with an ensemble, due to flow dependence and the corresponding improvement in multivariate covariance information. While such hybrid approaches have been shown to improve upon the existing operational variational systems, it is unclear how the static covariance matrix and the minimization procedure of the variational systems benefit the EnKF. We examine the impacts that a simple 3D-Var has on an EnKF in order to determine the source of these benefits.

In an operational environment, the choices of ensemble size and observation coverage are limited by costs of computational facilities and observing equipment. Thus, it is important to identify the preferred algorithmic approach when these parameters are prescribed. We introduce a new hybrid using an EnKF combined with a simple 3D-Var and demonstrate its effectiveness from this perspective. Traditional hybrids start with a variational approach and incorporate the ensemble information through the ensemble-derived covariance matrix. Here we instead start with an EnKF and use a variational approach to apply a correction within the model space to stabilize the EnKF.

2. Methodology

a. Model

For the forecast model, we use the Lorenz-96 model on $m = 40$ grid points (Lorenz 1996),

MONTHLY WEATHER REVIEW

$$ \frac{dx_j}{dt} = (x_{j+1} - x_{j-2})\,x_{j-1} - x_j + F, \tag{1} $$

with $F = 20$ (as used by Wilks 2005, 2006; Messner 2009) and $\Delta t = 0.01$, for $j = 1, \ldots, m$. Because this model's internal doubling time varies strongly with forcing (Orrell 2003), we note that our results were qualitatively similar using $F = 8$ with a forecast time step $\Delta t = 0.05$ (as originally used by Lorenz). Lorenz (2005) discusses further implications of varying $F$. In this model, the first term represents "advection" constructed to conserve kinetic energy, the second is damping, and the third is forcing. The boundaries are cyclic, such that $x_{m+1} = x_1$ and $x_0 = x_m$. The "truth" or "nature" run is performed with Runge–Kutta–Fehlberg orders 4 and 5, while forecast runs are performed with Runge–Kutta–Fehlberg orders 2 and 3 (Fehlberg 1970).
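As an illustration of Eq. (1), the following is a minimal sketch of the Lorenz-96 tendency with cyclic boundaries. The function names are hypothetical, and the classical fixed-step fourth-order Runge–Kutta integrator is an assumption made for brevity; the study itself uses adaptive Runge–Kutta–Fehlberg schemes.

```python
import numpy as np

def lorenz96_tendency(x, forcing=20.0):
    """Right-hand side of Eq. (1); np.roll supplies the cyclic boundaries."""
    x_plus1 = np.roll(x, -1)   # x_{j+1}
    x_minus2 = np.roll(x, 2)   # x_{j-2}
    x_minus1 = np.roll(x, 1)   # x_{j-1}
    return (x_plus1 - x_minus2) * x_minus1 - x + forcing

def rk4_step(x, dt=0.01, forcing=20.0):
    """One classical RK4 step (an assumed stand-in for Runge-Kutta-Fehlberg)."""
    k1 = lorenz96_tendency(x, forcing)
    k2 = lorenz96_tendency(x + 0.5 * dt * k1, forcing)
    k3 = lorenz96_tendency(x + 0.5 * dt * k2, forcing)
    k4 = lorenz96_tendency(x + dt * k3, forcing)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
```

The uniform state $x_j = F$ is a fixed point of Eq. (1), which gives a quick sanity check; in practice a small perturbation is added to that state before spinning the model up.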

b. Data assimilation methods

We solve the data assimilation problem by minimizing the traditional functional (Kalnay 2003)

$$ J(x) = (x - x^b)^T B^{-1} (x - x^b) + (y^o - Hx)^T R^{-1} (y^o - Hx). \tag{2} $$

We minimize $J$ over potential model states $x$, where $x^b$ is the background estimate, $y^o$ is the observation vector, and $H$ is an operator transforming $x$ from the model space of dimension $m$ to the observation space of dimension $l$. The matrices $B$ and $R$ are the background and observation error covariance matrices, respectively.

For 3D-Var we use the preconditioned conjugate gradient (PCG) minimization algorithm. In PCG, a preconditioner matrix $M$ is used to solve $M^{-1}Ax = M^{-1}b$. The matrix $B$ is used as the preconditioner, where

$$ A = I + B H^T R^{-1} H, \tag{3} $$

and

$$ b = x^b + B H^T R^{-1} y^o. \tag{4} $$

The $B$ matrix is constructed using an exponential decay function, similar to Derber and Rosati (1989), with maximum radius $r_B$,

$$ B_{i,j} = \sigma_b^2\, e^{-|i-j|}, \quad \text{for row } i, \text{ column } j, \text{ and } |i - j| \le r_B. \tag{5} $$

We implement the local ensemble transform Kalman filter (LETKF) of Hunt et al. (2007), inspired by Bishop et al. (2001) and Ott et al. (2004), as our EnKF method. In LETKF, an analysis is computed for each grid point based on local observations within radius $r$. Each analysis is formed within the linear space spanned by the ensemble members.

We implement two hybrid algorithms using LETKF as a basis. First, a hybrid inspired by traditional methods computes a linear combination of $B$ and the ensemble background error covariance matrix $P^b$ for use in a 3D-Var step. A similar approach was shown by Wang et al. (2007b) to be equivalent to the control-variable method of Lorenc (2003) and Buehner et al. (2010a,b). We refer to this method as the Hybrid/Covariance-LETKF.

Our new approach combines the gain matrices of the EnKF and 3D-Var methods, rather than their background covariances. The Kalman gain matrix $K$ is the weighting applied to the observational innovation, $(y^o - Hx^b)$. Brett et al. (2013) indicate that it is the Kalman gain matrix that determines stability of the filter. We construct the following general hybrid gain matrix:

VOLUME 142

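$$ \hat{K} = \beta_1 K + \beta_2 K^B + \beta_3 K^B H K, \tag{6} $$

where

$$ K = P^b H^T (H P^b H^T + R)^{-1}, \tag{7} $$

$$ K^B = B H^T (H B H^T + R)^{-1}. \tag{8} $$

We form our hybrid analysis by choosing $\beta_1 = 1$, $\beta_2 = \alpha$, and $\beta_3 = -\alpha$, so that

$$ \hat{K} = K + \alpha K^B (I - HK), \tag{9} $$

$$ \bar{x}^a_{\mathrm{Hybrid}} = \bar{x}^b + \hat{K}(y^o - H\bar{x}^b). \tag{10} $$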
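The hybrid gain of Eqs. (6)–(9) can be sketched numerically as follows. This is an illustrative, non-localized sketch that assumes dense matrices and a linear observation operator; the function names `kalman_gain` and `hybrid_gain` are hypothetical and do not come from the paper.

```python
import numpy as np

def kalman_gain(P, H, R):
    """Gain P H^T (H P H^T + R)^{-1}; with P = P^b this is Eq. (7), with P = B Eq. (8)."""
    S = H @ P @ H.T + R
    # Solve X S = P H^T for X without forming S^{-1} explicitly.
    return np.linalg.solve(S.T, (P @ H.T).T).T

def hybrid_gain(Pb, B, H, R, alpha):
    """Hybrid gain of Eq. (9): K + alpha * K^B (I - H K)."""
    K = kalman_gain(Pb, H, R)
    KB = kalman_gain(B, H, R)
    I_obs = np.eye(H.shape[0])
    return K + alpha * KB @ (I_obs - H @ K)
```

Setting `alpha = 0` recovers the ensemble gain $K$, so the parameter interpolates away from the pure ensemble solution as it grows.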

The modified gain matrix (9) contains an automatic reduction in the contribution of $K^B$ given $(I - HK)$ is a contraction. The parameter $\alpha$ allows a further manual scaling of $K^B$, which is equivalent under a monotonic mapping to applying a scalar inflation $\rho_B$ (deflation for $0 < \alpha < 1$) directly to the static $B$ matrix used to form $K^B$. The values for these $\beta$ coefficients are chosen specifically so that we can construct the following algorithm, which gives an algebraically equivalent result (see appendix A for proof). We refer to this algorithm as the Hybrid/Mean-LETKF, and it is first described in words: The standard LETKF is used first. The analysis mean from LETKF is then used as the "background" for 3D-Var, which is performed locally in model space after each grid point is analyzed by LETKF. An empirically chosen weighted average of the two analysis solutions is computed, and the LETKF analysis ensemble is recentered at the new solution. Kalnay and Toth (1994) performed a similar procedure using a single bred vector

and 3D-Var. Another choice of coefficients, $\beta_1 = (1 - \alpha)$, $\beta_2 = \alpha$, and $\beta_3 = 0$, can be implemented similarly by using the ensemble forecast mean as the background for 3D-Var (see appendix B). For reference in the results section, we will call this the Hybrid/Mean-LETKF(b).

The Hybrid/Mean-LETKF algorithm is detailed as follows: We calculate the LETKF analysis following Hunt et al. (2007), first computing the analysis error covariance in ensemble space,

$$ \tilde{P}^a = \left[ \frac{(k-1)}{\rho} I + (Y^b)^T R^{-1} Y^b \right]^{-1}, \tag{11} $$

where $Y^b = H(X^b)$, the columns of $X^b$ are ensemble perturbations from the mean state, and $\rho$ is the local inflation parameter. The symmetric square root of this matrix is computed to determine the weights for the analysis ensemble,

$$ W^a = [(k-1)\tilde{P}^a]^{1/2}. \tag{12} $$

To transform from ensemble space back to model space, we multiply these weights with each of the background ensemble member perturbations,

$$ X^a = X^b W^a. \tag{13} $$

The LETKF analysis perturbations computed in (13) are added to a different estimate of the ensemble mean in each of the following methods. For standard LETKF, the analysis mean is computed as

$$ \bar{w}^a = \tilde{P}^a (Y^b)^T R^{-1} (y^o - \bar{y}^b), \tag{14} $$

$$ \bar{x}^a = X^b \bar{w}^a + \bar{x}^b, \tag{15} $$

where $\bar{x}^b$ is the mean background state. At this point, the standard LETKF algorithm is complete.

Next we relocalize in model space. Greybush et al. (2011) discuss the impacts of localization in observation and model spaces, termed R-localization and B-localization, respectively. We define the local model dimension, $m_{\mathrm{loc}} = 2r + 1$, and select the appropriate rows and columns of the full $B$ and $P^b$ matrices to define $B_{\mathrm{loc}}$ and $P^b_{\mathrm{loc}}$, respectively. The observations within this local radius are used to form $R_{\mathrm{loc}}$.

For the Hybrid/Covariance-LETKF, a linear combination is formed with the static $B$ matrix and the ensemble-generated $P^b$ in the local model space with dimension $m_{\mathrm{loc}}$, similar to that performed by Hamill and Snyder (2000),

$$ J(x^a) = (x^a - \bar{x}^b)^T [\alpha B_{\mathrm{loc}} + (1 - \alpha) P^b_{\mathrm{loc}}]^{-1} (x^a - \bar{x}^b) + (y^o - Hx^a)^T R_{\mathrm{loc}}^{-1} (y^o - Hx^a). \tag{16} $$
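For a linear observation operator, the minimizer of a quadratic cost function such as (16) has a standard closed form in terms of a gain matrix built from the combined covariance. The sketch below uses that closed form in place of the PCG minimization used in the study, and it omits localization; the function name `hybrid_covariance_analysis` and its argument names are hypothetical.

```python
import numpy as np

def hybrid_covariance_analysis(xb_mean, Pb, B, H, R, y_o, alpha):
    """Closed-form minimizer of Eq. (16) for linear H (direct solve, not PCG)."""
    Pc = alpha * B + (1.0 - alpha) * Pb        # combined background covariance
    S = H @ Pc @ H.T + R                       # innovation covariance
    innovation = y_o - H @ xb_mean
    return xb_mean + Pc @ H.T @ np.linalg.solve(S, innovation)
```

With `alpha = 1.0` this reduces to a 3D-Var analysis using $B$ alone, and with `alpha = 0.0` it reduces to an analysis using only the ensemble covariance $P^b$.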

For the Hybrid/Mean-LETKF we minimize the cost function

$$ J(x^a) = (x^a - \bar{x}^a)^T \hat{B}^{-1} (x^a - \bar{x}^a) + (y^o - Hx^a)^T \hat{R}^{-1} (y^o - Hx^a), \tag{17} $$

in which the LETKF analysis mean $\bar{x}^a$ serves as the background. Here we use $\hat{B} = B_{\mathrm{loc}}$ and $\hat{R} = R_{\mathrm{loc}}$, but other choices are possible. The preferred choice for $\hat{B}$ is the true analysis error covariance, but this is unknown and approximated with $B_{\mathrm{loc}}$. Because this approximation overestimates the analysis error, we adjust the analysis mean back toward the LETKF analysis with a scaling parameter $\alpha$ and recenter the analysis ensemble to this mean,

$$ \bar{x}^a_{\mathrm{Hybrid}} = \alpha x^a + (1 - \alpha)\bar{x}^a, \tag{18} $$

$$ X^a_{\mathrm{Hybrid}} = X^a + \bar{x}^a_{\mathrm{Hybrid}} v^T. \tag{19} $$

The vector $v = (1\ 1\ \ldots\ 1\ 1)^T$ is a column of $k$ ones used to add the mean to each column of $X^a$, resulting in the final analysis ensemble having the hybrid-derived analysis as its mean. Finally, as with the standard LETKF, we update the single grid point at the center of the local region with the hybrid solution. For both hybrid methods, $\alpha$ is chosen empirically based on the ensemble size ($k$) and observation coverage ($l$).

We note that while the observations have been procedurally reused, the results using the Hybrid/Mean-LETKF algorithm are algebraically equivalent to the previous hybrid gain form given in (9) and (10), in which the observational increments are used only once. The form that defines the Hybrid/Mean-LETKF algorithm has the advantage that existing operational systems can be used without significant modification to generate the hybrid analysis.
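The steps of the Hybrid/Mean-LETKF can be sketched in a simplified, non-localized form: a global ensemble transform step following Eqs. (11)–(15), a 3D-Var step that uses the ensemble analysis mean as background, and the recentering of Eqs. (18) and (19). This sketch assumes a linear observation operator, solves the 3D-Var system directly rather than with PCG, and omits the per-grid-point localization of the actual algorithm; all function names are hypothetical.

```python
import numpy as np

def etkf_analysis(ens_b, H, R, y_o, rho=1.0):
    """Global ensemble transform step, Eqs. (11)-(15); ens_b has shape (m, k)."""
    k = ens_b.shape[1]
    xb_mean = ens_b.mean(axis=1)
    Xb = ens_b - xb_mean[:, None]                  # background perturbations
    Yb = H @ Xb                                    # perturbations in observation space
    C = Yb.T @ np.linalg.inv(R)                    # (Y^b)^T R^{-1}
    Pa_tilde = np.linalg.inv((k - 1) / rho * np.eye(k) + C @ Yb)    # Eq. (11)
    evals, evecs = np.linalg.eigh((k - 1) * Pa_tilde)
    Wa = evecs @ np.diag(np.sqrt(evals)) @ evecs.T  # symmetric square root, Eq. (12)
    Xa = Xb @ Wa                                   # Eq. (13)
    wa_mean = Pa_tilde @ C @ (y_o - H @ xb_mean)   # Eq. (14)
    xa_mean = xb_mean + Xb @ wa_mean               # Eq. (15)
    return xa_mean, Xa

def three_dvar(x_bg, B, H, R, y_o):
    """Minimizer of Eq. (17) via (I + B H^T R^-1 H) x = x_bg + B H^T R^-1 y_o."""
    BHtRinv = B @ H.T @ np.linalg.inv(R)
    A = np.eye(len(x_bg)) + BHtRinv @ H
    return np.linalg.solve(A, x_bg + BHtRinv @ y_o)

def hybrid_mean_analysis(ens_b, B, H, R, y_o, alpha=0.5, rho=1.0):
    """Ensemble step, then 3D-Var on the analysis mean, then recentering."""
    xa_mean, Xa = etkf_analysis(ens_b, H, R, y_o, rho)
    xa_var = three_dvar(xa_mean, B, H, R, y_o)
    x_hyb = alpha * xa_var + (1.0 - alpha) * xa_mean   # Eq. (18)
    return x_hyb, Xa + x_hyb[:, None]                  # Eq. (19)
```

In this simplified setting, the returned mean can be compared against $\bar{x}^b + \hat{K}(y^o - H\bar{x}^b)$ with $\hat{K}$ from (9) and $P^b = \rho X^b (X^b)^T/(k-1)$, which offers a numerical check of the equivalence stated in the text.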

3. Experiment design

We first examine special case scenarios using limited observations and a small ensemble size: $l = 4$ observations per time step and ensemble size $k = 5$. We show the nature run, and compare free-run forecast error and data assimilation analysis error for LETKF, 3D-Var, and the Hybrid/Mean-LETKF. We then generalize the results across the full range of ensemble sizes (2–40) and observation coverage (1–40) for each method, and examine variations in the localization radius.

Observations are generated randomly in space from a uniform distribution on the interval [0, 40] with errors from a normal distribution using a prescribed standard deviation of $\sigma_r = 0.5$. We assume these observation statistics are known. A linear interpolation scheme is used to construct the observation operator $H$. The $B$

FIG. 1. (top left) Nature run for Lorenz-96 over 600 time steps with $\Delta t = 0.01$ and (top middle) free-run error. The following analyses were performed with $l = 4$ observations per time step. Analysis error for (top right) 3D-Var; (bottom left),(bottom middle) LETKF for $k = 20$ and 5, respectively; and (bottom right) Hybrid/Mean-LETKF for $k = 5$.

matrix used for all methods is constructed as a double exponential distribution function with maximum $\sigma_b^2 = 1.0$ centered on the diagonals with a local radius of $r_B = 5$. The Lorenz-96 model is spun up over 14 400 time steps, as per Lorenz (1996), to ensure convergence to the attractor. An additional 600 time steps are run with $\Delta t = 0.01$ to form a nature run $x^t(t)$. The experiment initial conditions are sampled from a Gaussian distribution, $N[x^t(0), 0.1]$. Unless otherwise noted, we use a constant multiplicative background covariance inflation of $\rho = 1.1$ (10%) and a local radius of $r = 5$ for LETKF and the associated hybrid methods.
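The static covariance of Eq. (5) and the observation setup described in this section can be sketched as follows. Several details are assumptions made for illustration rather than statements from the paper: grid point $i$ (zero-based) is placed at position $i$ on a periodic domain of length $m$, the distance $|i - j|$ is taken literally as in Eq. (5) without wrap-around, and the function names are hypothetical.

```python
import numpy as np

def static_B(m=40, sigma_b2=1.0, r_B=5):
    """Eq. (5): sigma_b^2 * exp(-|i-j|) for |i-j| <= r_B, zero elsewhere."""
    idx = np.arange(m)
    dist = np.abs(idx[:, None] - idx[None, :]).astype(float)
    return np.where(dist <= r_B, sigma_b2 * np.exp(-dist), 0.0)

def interp_operator(positions, m=40):
    """Linear-interpolation H on a cyclic grid (assumed layout: point i at position i)."""
    H = np.zeros((len(positions), m))
    for row, p in enumerate(positions):
        lower = int(np.floor(p)) % m
        upper = (lower + 1) % m
        w = p - np.floor(p)            # fractional distance past the lower grid point
        H[row, lower] += 1.0 - w
        H[row, upper] += w
    return H

def make_observations(x_true, n_obs, sigma_r=0.5, rng=None):
    """Random observation locations on [0, m) with Gaussian observation errors."""
    rng = np.random.default_rng() if rng is None else rng
    positions = rng.uniform(0.0, len(x_true), size=n_obs)
    H = interp_operator(positions, len(x_true))
    y_o = H @ x_true + rng.normal(0.0, sigma_r, size=n_obs)
    R = sigma_r**2 * np.eye(n_obs)
    return y_o, H, R
```

For example, `make_observations(x_true, n_obs=4)` would produce the $l = 4$ configuration, with $R$ diagonal because the observation errors are drawn independently.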

4. Results

We show in Fig. 1 that the standard LETKF algorithm performs well with a large ensemble size (e.g., 20, given that 40 would be full rank), but it fails due to catastrophic filter divergence (Harlim and Majda 2010) when using smaller ensembles (e.g., $k = 5$). This filter divergence is typical in our experiment setup for $k \le 5$ and is dependent on the observation locations and $\Delta t$. We see that for a large ensemble size, the standard LETKF algorithm is quite accurate. However, as the ensemble size decreases, the analysis solution degrades until the filter eventually diverges from the nature run. When implementing the Hybrid/Mean-LETKF, using $k = 5$ ensemble members, the filter recovers stability and has comparable accuracy to the standard LETKF with $k = 20$.

The energy for this system, $s^2$ as defined by Lorenz (2005, 2006), is simply the mean square of the system state across all grid points. For longer time periods, the total energy oscillates chaotically in a range from 40 to 90 and is tracked well by the standard LETKF analysis for ensemble sizes $k > 5$. Bishop and Satterfield (2013) showed that as the variance of the underlying true distribution increases, the effective random sample ensemble

FIG. 2. The total energy $s^2$ plotted for 2000 time steps (100 days) for the ensemble mean state in an analysis using $l = 4$ observations per time step. The standard LETKF is shown in gray ($k = 20$) and cyan ($k = 6$). Four different cases of standard LETKF ($k = 5$) are shown in red, each "blowing up" at a different time due to randomness in observation locations. The Hybrid/Mean-LETKF ($k = 5$), shown in blue, recovers the stability and accuracy of the standard LETKF ($k > 5$). The Hybrid/Covariance-LETKF is shown in violet ($k = 5$).

size decreases. In the standard LETKF ($k = 5$), $s^2$ "blows up" when the underlying variance increases, while both the Hybrid/Mean-LETKF and Hybrid/Covariance-LETKF using the same $k = 5$ members track closely with the standard LETKF ($k = 20$) (see Fig. 2).

We next examine the impact of varying observation coverage and ensemble size. We begin with the standard LETKF in Fig. 3, for which $k = 1$ represents the pure 3D-Var results. This figure contains results from $40 \times 40 = 1600$ runs similar to the previously described special case scenarios. We define three regimes within the parameter space: 1) the ensemble/hybrid method outperforms 3D-Var, 2) the ensemble/hybrid method fails, and 3) 3D-Var outperforms the ensemble/hybrid method. Our goal is to maximize the parameter space of regime 1 while simultaneously minimizing analysis error.

As shown in Fig. 4, the Hybrid/Covariance-LETKF is successful at stabilizing the filter for small ensemble sizes. For improved stability of PCG, we required an initial guess of $x = x^b$. With smaller values of $\alpha$ (e.g., $\alpha = 0.0$–$0.2$) for all ensemble sizes, the mean absolute analysis errors are close to that of the standard LETKF ($k > 5$). However, as the observation coverage decreases, the analysis errors increase. For larger values of $\alpha$ (e.g., $\alpha = 0.5$), the mean absolute analysis errors increase throughout regime 1 and converge toward the 3D-Var solution ($k = 1$) as $\alpha$ increases to 1.0.

The Hybrid/Mean-LETKF algorithm (Fig. 5) using $\alpha = 0.5$ retains much of the accuracy of the standard LETKF (with $k = 20$) while still using a small ($k = 5$) ensemble size and few ($l = 4$) observations. These results are found to hold even when driving the ensemble size down to $k = 3$ members. If the number of observations decreases further (e.g., to $l = 3$, with $k = 3$), however, then this hybrid undergoes the same filter divergence as the standard LETKF. For this hybrid method, as $\alpha$ decreases there is a gradual adjustment back to the standard LETKF solution: the mean absolute analysis errors decrease throughout regime 1 while the minimum observation count required for filter stability for $2 < k < 5$ steadily increases from $l = 3$ to 7 ($\alpha = 0.5$) to $l = 5$ to 18 ($\alpha = 0.2$), and finally to $l = 40$ ($\alpha = 0.0$). We note that these results obtained with a localized 3D-Var are similar when the 3D-Var correction is instead applied globally after LETKF or after a (nonlocalized) ensemble transform Kalman filter (ETKF), with comparable accuracy (not shown). The Hybrid/Mean-LETKF(b) has qualitatively similar results (Fig. 6).

While the $\alpha$ parameter scales the analysis errors with an apparent monotonic relationship between the standard

FIG. 3. Mean absolute analysis error for the standard LETKF using ensemble sizes of $k = 2$–$40$, and observation coverage for $l = 1$–$40$ randomly chosen throughout the domain. Results at $k = 1$ correspond to the standard 3D-Var and are identical in the subsequent figures. Empty squares indicate cases in which the Runge–Kutta ordinary differential equation (ODE) solver could not reach the required tolerance.

LETKF ($\alpha = 0.0$) and 3D-Var ($\alpha = 1.0$) for each hybrid method, the exact $\alpha$ values are not directly comparable across methods. All of the methods give a range of mean analysis errors varying from those found with the standard LETKF to approximately those found with 3D-Var. Therefore an appropriate set of parameter values should allow one to tune to any desired mean analysis error accuracy in that range. For example, the Hybrid/Covariance-LETKF with $\alpha = 0.1$ (Fig. 4) appears to have similar analysis errors to the Hybrid/Mean-LETKF (Fig. 5) with $\alpha = 0.5$ (with the exception of the cases with very low observation counts, e.g., $l < 4$, for which the Hybrid/Mean-LETKF has lower mean analysis errors).

Based on their experiments with an extended Kalman filter, Trevisan and Palatella (2011) hypothesized that in an ensemble approach, when observations are sufficiently dense and accurate so that error dynamics are approximately linear, the necessary and sufficient number of ensemble members is equal to the total number of positive and null Lyapunov exponents. Our experiments indicate that for LETKF fewer ensemble members are needed. This is due to localization and is in agreement with Ott et al. (2004). Using parameters similar to Trevisan and Palatella ($F = 8$, $\Delta t = 0.05$, $l = 20$, and $\sigma_r = 0.01$), we obtain that 9 ensemble members are required for our configuration of LETKF compared to their hypothesized 14 for a general EnKF. Using the Hybrid/Mean-LETKF, as few as two ensemble members can be employed.

The Lorenz-96 system is extensive, which implies that the number of positive Lyapunov exponents grows linearly with the system size (Pazo et al. 2008; Karimi and Paul 2010). Holding the number of observations fixed at $l = 20$, we vary the localization radius $r$ and the ensemble size $k$ from 1 to 20 (Fig. 7). The minimum ensemble size for stability of LETKF is $k = 15$ for $r = 20$ and decreases linearly to $k = 2$ for $r = 1$. Assigning the covariance localization in 3D-Var to match LETKF (i.e., $r \equiv r_B$), the Hybrid/Mean-LETKF eliminates the catastrophic filter divergence for all localization distances but exhibits increased mean errors that exceed the observation errors for short localization distances. However, when fixing the B-localization at $r_B = 5$ and applying 3D-Var globally to ensure a minimum dimension for the solution space, the Hybrid/Mean-LETKF has consistent results regardless of the localization radius $r$ used in LETKF.

Multiplicative covariance inflation gives limited benefit to the standard LETKF. As shown in Fig. 8, only a few cases with smaller ensemble sizes are afforded improvements in mean analysis error. When using sufficient observations ($l = 20$), the Hybrid/Mean-LETKF renders inflation unnecessary. However, when using fewer observations ($l = 4$), a small inflation (5%–10%)

FIG. 4. Mean absolute analysis error for the Hybrid/Covariance-LETKF for (a)–(c) $\alpha = 0.1$, 0.2, and 0.5, respectively. As $\alpha$ goes to 1.0, all ensemble sizes converge toward the 3D-Var accuracy ($k = 1$). As $\alpha$ goes to 0.0, all cases converge to the standard LETKF solution in Fig. 3.

FIG. 5. Mean absolute analysis error for the Hybrid/Mean-LETKF, for (a),(b) $\alpha = 0.2$ and 0.5, respectively. As $\alpha$ increases to 1.0, the analysis errors converge toward the 3D-Var accuracy ($k = 1$). As $\alpha$ goes to 0.0, the peak of regime 2 (dark red) at ($k = 2$) gradually increases, as does the accuracy in regime 1 (dark blue), until all cases converge to the standard LETKF solution in Fig. 3.

has either a positive or neutral effect on reducing the mean analysis error.

5. Conclusions

This research began with an investigation into the source of benefits arising from hybrid methods when variational techniques are added to an EnKF. We compared solutions from 3D-Var with LETKF, and showed the standard LETKF broke down when using small ensemble sizes. We then introduced two hybrid approaches. The first was the traditionally motivated Hybrid/Covariance-LETKF. The second used a new approach combining gain matrices from LETKF and 3D-Var, and was implemented via the Hybrid/Mean-LETKF algorithm, for which a simple 3D-Var is applied after completion of LETKF to adjust the ensemble mean in model space and to add stability to the filter for small ensemble sizes. While we demonstrated this approach with LETKF and 3D-Var, it is generally applicable to other data assimilation methods as well.

A larger value of the tuning parameter $\alpha$ enhanced the stability of LETKF at the cost of accuracy. Thus, in practice, a reasonable approach may be to begin with a larger $\alpha$ value and gradually decrease $\alpha$ as long as diagnostic metrics continue to improve. The optimal value for $\alpha$ is dependent on problem specifications, such as the ensemble size, observation coverage, and localization radius.

The filtering solution derived with LETKF is highly accurate when applied to the Lorenz-96 system when allowed a sufficient number of ensemble members. However, when using few members, there is catastrophic filter divergence. Because LETKF computes the analysis in the ensemble subspace, its stability is highly dependent

FIG. 6. Mean absolute analysis error for the Hybrid/Mean-LETKF(b) for (a),(b) $\alpha = 0.2$ and 0.5, respectively. As $\alpha$ approaches 1.0, the analysis errors converge to the 3D-Var accuracy.

FIG. 7. Mean absolute analysis error with the localization radius varying from $r = 1$ to 20 while holding the observation count fixed at $l = 20$ observations. Results are given for (a) LETKF, (b) the Hybrid/Mean-LETKF ($\alpha = 0.5$) with $r_B = r$, and (c) the Hybrid/Mean-LETKF ($\alpha = 0.5$) using 3D-Var applied globally with $r_B = 5$. Values at $k = 1$ show mean analysis errors for 3D-Var with $r_B = 5$. For LETKF, the minimum ensemble size increases linearly with increasing localization radius. The Hybrid/Mean-LETKF stabilizes the filter for all localization distances but increases error when the radius is small. By maintaining a minimum dimension for B-localization, the errors are reduced.

on the ensemble dimension. It was shown that for the primary configuration used in this study, regardless of observation coverage, five ensemble members were insufficient to prevent filter divergence. For a specified level of observation coverage ($l = 20$), this minimum ensemble size was shown to be a linear function of localization radius. While decreasing the localization radius helped stabilize LETKF for small ensemble sizes

FIG. 8. Mean absolute analysis error, with multiplicative variance inflation ranging from 0.9 (−10%) to 1.85 (85%), applied to (a) LETKF ($l = 20$), (b) Hybrid/Mean-LETKF ($l = 20$, $\alpha = 0.5$), (c) LETKF ($l = 4$), and (d) Hybrid/Mean-LETKF ($l = 4$, $\alpha = 0.5$). Values at $k = 1$ show mean analysis errors for 3D-Var, with no inflation applied to the $B$ matrix. For the standard LETKF using many observations ($l = 20$), a small inflation provides error reduction only for ensemble sizes $k = 5$–$7$. For the Hybrid/Mean-LETKF with many observations ($l = 20$), inflation gives no added benefit. When observations are reduced ($l = 4$), the Hybrid/Mean-LETKF permits smaller ensemble sizes than possible with the standard LETKF. For both methods, a small inflation (5%–10%) reduces analysis errors when the ensemble sizes are small ($k = 6$–$9$ for LETKF and $k = 3$–$6$ for the Hybrid/Mean-LETKF), and in some cases with larger ensemble sizes.

with Lorenz-96, in practice the minimum localization radius is constrained by the fidelity of the observing network and computational model. A larger localization radius is typically needed for continuity of the analysis field. Of interest is the local dimensionality of the unstable Lyapunov vectors relative to the size of the ensemble $k$ and the local model dimension $m_{\mathrm{loc}}$. Based on the work of Trevisan and Palatella (2011), we suspect the minimum ensemble size for the standard LETKF is directly related to the local dimensionality of the error growth.

Bishop and Satterfield (2013) found that if an ensemble-based estimate of covariance is undersampled, then a superior estimate can be obtained by combining that with a climatological estimate. Our results support that finding: the Hybrid/Mean-LETKF approach generated solutions that outperformed both 3D-Var and the standard LETKF for observation coverage with $2 < l < 10$ and ensemble size $1 < k < 5$. As has been reported for the traditional hybrid methods (Hamill and Snyder 2000; Wang et al. 2007a), we conclude that it is the computation of the analysis in the higher-dimensional model space that stabilizes the Hybrid/Mean-LETKF when using small ensemble sizes.

In operational settings, all practical ensemble sizes are small relative to the dimension of the state space. Both hybrid LETKF methods are well suited for applications using small ensemble sizes and limited observation coverage, the typical situation for global ocean data assimilation (Penny 2011; Penny et al. 2013) and coupled atmosphere–ocean data assimilation. In practice, selection of the Hybrid/Covariance-LETKF versus the Hybrid/Mean-LETKF would depend upon the ease of implementation based on the design of existing operational software.

Success with the Lorenz-96 model does not guarantee success with all models. However, because a similar approach by Kalnay and Toth (1994) was effective on both a Lorenz-63 model and a T62 National Meteorological Center (NMC) model, this gives a positive outlook for use of the Hybrid/Mean-LETKF with more realistic models. As one example, there has been success in preliminary applications of the Hybrid/Mean-LETKF algorithm to NCEP's operational Global Ocean Data Assimilation System (GODAS). As LETKF is already being used or prepared for use in operational environments in Italy, Germany, Brazil, Argentina, Japan, and the United States, the Hybrid/Mean-LETKF algorithm is a simple extension that can be adopted to enhance the performance of these LETKF-based data assimilation systems.

Acknowledgments. I gratefully acknowledge the thoughtful comments from unofficial reviewer Eugenia Kalnay, official reviewer Ross Hoffman, and two anonymous reviewers. Also, I thank Jim Carton, Steven Greybush, Brian Hunt, and Daryl Kleist for discussions that led to this work; David Behringer and NCEP for motivating the Hybrid-LETKF; and Craig Bishop for the helpful discussions on the general role of hybrid methods in data assimilation. This material is based upon work supported by the National Science Foundation under Grant OCE1233942.

(I 1 BHT R21 H)xaHybrid 5 a(xa 1 BHT R21 yo )

APPENDIX A Equivalence of the Hybrid Gain Matrix (Using b1 = 1, b2 = a, and b3 = 2a) and the Hybrid/Mean-LETKF Algorithm The Hybrid/Mean-LETKF analysis as defined in (18) is xaHybrid 5 axa 1 (1 2 a)xa ,

(A1)

where xa is defined by minimizing the variational equation, J(xa ) 5 (xa 2 xa )T B21 (xa 2 xa ) 1 (yo 2 Hxa )T R21 (yo 2 Hxa ) .

(A2)

By setting the derivative of J equal to zero, we can solve for xa: dJ 5 2(xa 2 xa )T B21 2 2(yo 2 Hxa )T R21 H 5 0 dxa (A3) 1 dJ T 5 B21 (xa 2 xa ) 2 HT R21 (yo 2 Hxa ) 5 0 2 dxa (A4) (B21 xa 2 B21 xa )2(HT R21 yo 2 HT R21 Hxa ) 5 0

(A5)

(B21 1 HT R21 H)xa 5 B21 xa 1 HT R21 yo

(A6)

(I 1 BHT R21 H)xa 5 xa 1 BHT R21 yo

(A7)

xa 5 (I 1 BHT R21 H)21 (xa 1 BHT R21 yo ) .

(A8)

Then, the hybrid analysis (A1) equals xaHybrid 5 a(I 1 BHT R21 H)21 (xa 1 BHT R21 yo ) 1 (1 2 a)xa

(A9)

1 (I 1 BHT R21 H)(1 2 a)xa (A10)

2148


5 (axa 1 aBHT R21 yo ) 1 (12 a)xa 1 BHT R21H(12 a)xa (A11) 5 xa 1 BHT R21 ayo 1 BHT R21 (1 2 a)Hxa

(A12)

5 xa 1 BHT R21 [ayo 1 (1 2 a)Hxa ]

(A13)

T

5 x 1 aBH R a

21

T

(y 2 Hx ) 1 BH R o

a

21

Hx

a

5 (I 1 BHT R21 H)xa 1 aBHT R21 (yo 2 Hxa ) .

(A14) (A15)

VOLUME 142

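So,

\begin{equation}
\mathbf{x}^a_{\mathrm{Hybrid}} = \mathbf{x}^a + (\mathbf{I} + \mathbf{B}\mathbf{H}^T\mathbf{R}^{-1}\mathbf{H})^{-1}\alpha\mathbf{B}\mathbf{H}^T\mathbf{R}^{-1}(\mathbf{y}^o - \mathbf{H}\mathbf{x}^a). \tag{A16}
\end{equation}

As in (8),

\begin{align}
\mathbf{K}^B &= \mathbf{B}\mathbf{H}^T(\mathbf{H}\mathbf{B}\mathbf{H}^T + \mathbf{R})^{-1} \tag{A17}\\
&= (\mathbf{I} + \mathbf{B}\mathbf{H}^T\mathbf{R}^{-1}\mathbf{H})^{-1}\mathbf{B}\mathbf{H}^T\mathbf{R}^{-1}. \tag{A18}
\end{align}

Then, since for LETKF

\begin{equation}
\mathbf{x}^a = \mathbf{x}^b + \mathbf{K}(\mathbf{y}^o - \mathbf{H}\mathbf{x}^b), \tag{A19}
\end{equation}

we can substitute $\mathbf{K}^B$ and $\mathbf{x}^a$ into (35) to get

\begin{equation}
\mathbf{x}^a_{\mathrm{Hybrid}} = \mathbf{x}^b + \mathbf{K}(\mathbf{y}^o - \mathbf{H}\mathbf{x}^b) + \alpha\mathbf{K}^B\{\mathbf{y}^o - \mathbf{H}[\mathbf{x}^b + \mathbf{K}(\mathbf{y}^o - \mathbf{H}\mathbf{x}^b)]\}. \tag{A20}
\end{equation}

Simplifying, we get

\begin{align}
\mathbf{x}^a_{\mathrm{Hybrid}} &= \mathbf{x}^b + \mathbf{K}(\mathbf{y}^o - \mathbf{H}\mathbf{x}^b) + \alpha\mathbf{K}^B[\mathbf{y}^o - \mathbf{H}\mathbf{x}^b - \mathbf{H}\mathbf{K}(\mathbf{y}^o - \mathbf{H}\mathbf{x}^b)] \tag{A21}\\
&= \mathbf{x}^b + \mathbf{K}(\mathbf{y}^o - \mathbf{H}\mathbf{x}^b) + \alpha\mathbf{K}^B(\mathbf{I} - \mathbf{H}\mathbf{K})(\mathbf{y}^o - \mathbf{H}\mathbf{x}^b) \tag{A22}\\
&= \mathbf{x}^b + [\mathbf{K} + \alpha\mathbf{K}^B(\mathbf{I} - \mathbf{H}\mathbf{K})](\mathbf{y}^o - \mathbf{H}\mathbf{x}^b). \tag{A23}
\end{align}

Then, if we define as in (9)

\begin{equation}
\hat{\mathbf{K}} = \mathbf{K} + \alpha\mathbf{K}^B(\mathbf{I} - \mathbf{H}\mathbf{K}), \tag{A24}
\end{equation}

we have that the hybrid analysis is simply given as in (10), with an observation innovation weighted by the modified gain matrix $\hat{\mathbf{K}}$,

\begin{equation}
\mathbf{x}^a_{\mathrm{Hybrid}} = \mathbf{x}^b + \hat{\mathbf{K}}(\mathbf{y}^o - \mathbf{H}\mathbf{x}^b). \tag{A25}
\end{equation}

APPENDIX B

Equivalence of the Hybrid Gain Matrix (Using $\beta_1 = (1 - \alpha)$, $\beta_2 = \alpha$, and $\beta_3 = 0$) and the Hybrid/Mean-LETKF(b) Algorithm

If instead of the analysis ensemble mean we assign the forecast ensemble mean as the background for 3D-Var, then analogous to (A2) we have

\begin{equation}
J(\mathbf{x}^a) = (\mathbf{x}^a - \mathbf{x}^b)^T\mathbf{B}^{-1}(\mathbf{x}^a - \mathbf{x}^b) + (\mathbf{y}^o - \mathbf{H}\mathbf{x}^a)^T\mathbf{R}^{-1}(\mathbf{y}^o - \mathbf{H}\mathbf{x}^a). \tag{B1}
\end{equation}

Similarly to (A3)–(A8), by setting the derivative of $J$ equal to zero, we can solve for $\mathbf{x}^a$:

\begin{equation}
\mathbf{x}^a = (\mathbf{I} + \mathbf{B}\mathbf{H}^T\mathbf{R}^{-1}\mathbf{H})^{-1}(\mathbf{x}^b + \mathbf{B}\mathbf{H}^T\mathbf{R}^{-1}\mathbf{y}^o). \tag{B2}
\end{equation}

Then the hybrid analysis (A1) equals

\begin{equation}
\mathbf{x}^a_{\mathrm{Hybrid}} = \alpha(\mathbf{I} + \mathbf{B}\mathbf{H}^T\mathbf{R}^{-1}\mathbf{H})^{-1}(\mathbf{x}^b + \mathbf{B}\mathbf{H}^T\mathbf{R}^{-1}\mathbf{y}^o) + (1-\alpha)\mathbf{x}^a. \tag{B3}
\end{equation}

Substituting in the definition of $\mathbf{x}^a$ from (A19), we have

\begin{equation}
\mathbf{x}^a_{\mathrm{Hybrid}} = \alpha(\mathbf{I} + \mathbf{B}\mathbf{H}^T\mathbf{R}^{-1}\mathbf{H})^{-1}(\mathbf{x}^b + \mathbf{B}\mathbf{H}^T\mathbf{R}^{-1}\mathbf{y}^o) + (1-\alpha)[\mathbf{x}^b + \mathbf{K}(\mathbf{y}^o - \mathbf{H}\mathbf{x}^b)]. \tag{B4}
\end{equation}

For notational convenience, let $\mathbf{D} = (\mathbf{I} + \mathbf{B}\mathbf{H}^T\mathbf{R}^{-1}\mathbf{H})$. Then (B4) can be simplified to

\begin{equation}
\mathbf{D}\mathbf{x}^a_{\mathrm{Hybrid}} = \alpha(\mathbf{x}^b + \mathbf{B}\mathbf{H}^T\mathbf{R}^{-1}\mathbf{y}^o) + (1-\alpha)\mathbf{D}[\mathbf{x}^b + \mathbf{K}(\mathbf{y}^o - \mathbf{H}\mathbf{x}^b)]. \tag{B5}
\end{equation}

Noting that $[\alpha\mathbf{I} + (1-\alpha)\mathbf{D}]\mathbf{x}^b = \mathbf{D}\mathbf{x}^b - \alpha\mathbf{B}\mathbf{H}^T\mathbf{R}^{-1}\mathbf{H}\mathbf{x}^b$, we have

\begin{equation}
\mathbf{x}^a_{\mathrm{Hybrid}} = \mathbf{x}^b + \alpha\mathbf{D}^{-1}\mathbf{B}\mathbf{H}^T\mathbf{R}^{-1}(\mathbf{y}^o - \mathbf{H}\mathbf{x}^b) + (1-\alpha)\mathbf{K}(\mathbf{y}^o - \mathbf{H}\mathbf{x}^b). \tag{B6}
\end{equation}

Using definition (A18) we can substitute $\mathbf{K}^B$ into (A16) to get

\begin{equation}
\mathbf{x}^a_{\mathrm{Hybrid}} = \mathbf{x}^b + [\alpha\mathbf{K}^B + (1-\alpha)\mathbf{K}](\mathbf{y}^o - \mathbf{H}\mathbf{x}^b). \tag{B7}
\end{equation}

Thus, we can define as in (6) with $\beta_1 = (1 - \alpha)$, $\beta_2 = \alpha$, and $\beta_3 = 0$ that

\begin{equation}
\hat{\mathbf{K}} = (1-\alpha)\mathbf{K} + \alpha\mathbf{K}^B. \tag{B8}
\end{equation}

Again, we have that the hybrid analysis is simply given as in (10), with an observation innovation weighted by the modified gain matrix $\hat{\mathbf{K}}$.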
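The algebra in both appendices can be checked numerically. The following is a minimal sketch in Python with NumPy, not code from the paper: the function names (`gain`, `three_dvar`, `check`), the random test matrices, and the use of a sample-covariance Kalman gain as a stand-in for the LETKF gain K are all assumptions made for illustration. The identities hold for any gain K, so the stand-in does not affect what is tested. The sketch checks the equality of (A17) and (A18), the Appendix A result (A25) with the gain (A24), and the Appendix B result (B8).

```python
import numpy as np


def gain(P, H, R):
    """Kalman-type gain P H^T (H P H^T + R)^{-1}, the form used in (A17)."""
    return P @ H.T @ np.linalg.inv(H @ P @ H.T + R)


def three_dvar(x_bg, y, B, H, R):
    """Minimizer of the 3D-Var cost for background x_bg, in the form of (B2)."""
    n = B.shape[0]
    BHtRinv = B @ H.T @ np.linalg.inv(R)
    D = np.eye(n) + BHtRinv @ H
    return np.linalg.solve(D, x_bg + BHtRinv @ y)


def check(n=6, p=3, k=4, alpha=0.3, seed=0):
    """Compare the two-step hybrid analyses with the hybrid-gain forms."""
    rng = np.random.default_rng(seed)
    H = rng.standard_normal((p, n))               # linear observation operator
    A = rng.standard_normal((n, n))
    B = A @ A.T + n * np.eye(n)                   # static covariance (SPD)
    R = np.diag(rng.uniform(0.5, 2.0, p))         # observation error covariance
    X = rng.standard_normal((n, k))               # hypothetical forecast ensemble
    Xp = X - X.mean(axis=1, keepdims=True)
    Pb = Xp @ Xp.T / (k - 1)                      # sample covariance (assumption)
    xb = X.mean(axis=1)                           # forecast ensemble mean
    y = rng.standard_normal(p)                    # synthetic observations

    K = gain(Pb, H, R)                            # stand-in for the LETKF gain
    KB = gain(B, H, R)                            # (A17)
    d = y - H @ xb                                # innovation
    xa = xb + K @ d                               # (A19)

    # (A18): alternative form of the 3D-Var gain
    BHtRinv = B @ H.T @ np.linalg.inv(R)
    KB_alt = np.linalg.solve(np.eye(n) + BHtRinv @ H, BHtRinv)

    # Appendix A: 3D-Var uses the analysis mean xa as its background
    hyb_a = alpha * three_dvar(xa, y, B, H, R) + (1 - alpha) * xa
    gain_a = K + alpha * KB @ (np.eye(p) - H @ K)     # (A24)

    # Appendix B: 3D-Var uses the forecast mean xb as its background
    hyb_b = alpha * three_dvar(xb, y, B, H, R) + (1 - alpha) * xa
    gain_b = (1 - alpha) * K + alpha * KB             # (B8)

    return {
        "push_through": bool(np.allclose(KB, KB_alt)),
        "appendix_a": bool(np.allclose(hyb_a, xb + gain_a @ d)),
        "appendix_b": bool(np.allclose(hyb_b, xb + gain_b @ d)),
    }


if __name__ == "__main__":
    print(check())
```

Because the ensemble here is smaller than the state dimension, the sample covariance is rank deficient, yet both equivalences still hold: only B and R need to be invertible for the derivations to go through.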
