Nonlinear Processes in Geophysics

Open Access

Nonlin. Processes Geophys., 20, 803–818, 2013 www.nonlin-processes-geophys.net/20/803/2013/ doi:10.5194/npg-20-803-2013 © Author(s) 2013. CC Attribution 3.0 License.

Joint state and parameter estimation with an iterative ensemble Kalman smoother

M. Bocquet¹,² and P. Sakov³

¹Université Paris-Est, CEREA joint laboratory École des Ponts ParisTech and EDF R&D, France
²INRIA, Paris Rocquencourt research centre, France
³Bureau of Meteorology, Melbourne, Australia

Correspondence to: M. Bocquet ([email protected]) Received: 4 June 2013 – Revised: 2 September 2013 – Accepted: 4 September 2013 – Published: 23 October 2013

Abstract. Both ensemble filtering and variational data assimilation methods have proven useful in the joint estimation of state variables and parameters of geophysical models. Yet, their respective benefits and drawbacks in this task are distinct. An ensemble variational method, known as the iterative ensemble Kalman smoother (IEnKS) has recently been introduced. It is based on an adjoint model-free variational, but flow-dependent, scheme. As such, the IEnKS is a candidate tool for joint state and parameter estimation that may inherit the benefits from both the ensemble filtering and variational approaches. In this study, an augmented state IEnKS is tested on its estimation of the forcing parameter of the Lorenz-95 model. Since joint state and parameter estimation is especially useful in applications where the forcings are uncertain but nevertheless determining, typically in atmospheric chemistry, the augmented state IEnKS is tested on a new low-order model that takes its meteorological part from the Lorenz-95 model, and its chemical part from the advection diffusion of a tracer. In these experiments, the IEnKS is compared to the ensemble Kalman filter, the ensemble Kalman smoother, and a 4D-Var, which are considered the methods of choice to solve these joint estimation problems. In this low-order model context, the IEnKS is shown to significantly outperform the other methods regardless of the length of the data assimilation window, and for present time analysis as well as retrospective analysis. Besides which, the performance of the IEnKS is even more striking on parameter estimation; getting close to the same performance with 4D-Var is likely to require both a long data assimilation window and a complex modeling of the background statistics.

1 Introduction

Data assimilation in geophysics is often concerned with the estimation of the state of the system (e.g. atmosphere, ocean). Yet, non-observed parameters of the model can also be seen as control variables. They can indirectly be estimated through the assimilation of observations. In such a context, data assimilation can be a powerful inverse modeling tool. With the progress in techniques as well as the rise in popularity of data assimilation in geosciences, this topic has become of increasing interest. Parameter estimation is useful because it can account for model error through a parametric representation of the uncertain processes, and could serve as a tool to enhance the system state estimation. For instance, it is now accepted that air quality forecasting can benefit considerably from the online estimation of forcing parameters. Parameter estimation is also a fundamental tool per se in the estimation of parameters which are often of physical or societal interest. For instance, again regarding air quality, data assimilation can help assess effective kinetic rates of interest to chemists, or it can help assess regulated pollutant emissions of interest to policy makers.

1.1 Data assimilation techniques for parameter estimation

As is the case with data assimilation for state estimation, two types of approach have been used for parameter estimation: filtering methods and variational methods. The estimation of parameters by the filtering approaches is based on the augmentation of the state vector with the parameter variables. If the state space has dimension M and if the number of parameters is P , then the augmented control


vector has dimension M + P. Through the assimilation of observations, the joint analysis of the state variables and the parameters aims at building covariances (or higher-order dependencies for non-Gaussian filters) between them; these are crucially needed because of the non-observability of most parameters. The augmented state principle can be used with any type of filter: extended Kalman filters (e.g., Kondrashov et al., 2008), ensemble Kalman filters (e.g., Aksoy et al., 2006; Wirth and Verron, 2008; Barbu et al., 2009), particle filters (e.g., Vossepoel and van Leeuwen, 2007; Weir et al., 2013), and stochastic sampling and genetic algorithms (e.g., Jackson et al., 2004; Liu et al., 2005; Bocquet, 2012; Posselt and Bishop, 2012). In an enlightening review, Ruiz et al. (2013) have discussed the use of ensemble Kalman filters (EnKFs) for parameter estimation. When the filtering method accounts for asynchronous observations by building covariances between parameter errors defined at distinct times, the method is usually referred to as a smoother (Evensen, 2003; Hunt et al., 2004; Sakov et al., 2010; Cosme et al., 2010).
The estimation of parameters with the variational approach is based on the explicit dependence of the cost function on not only the state variables, but also the parameters. If the dependence is not explicit, one should at least be able to compute the gradient of the cost function with respect to the parameters. The four-dimensional variational method, or 4D-Var (Le Dimet and Talagrand, 1986; Talagrand and Courtier, 1987; Rabier et al., 2000), has the distinct advantage of being a natural smoother since it works within a temporal window to assimilate asynchronous observations. However, it requires the use of the adjoint evolution model to compute gradients of the cost function. Computing the gradient with respect to the model parameters requires the same adjoint model, and also the extra effort of computing the explicit derivative of the cost function with respect to the parameters, in terms of the adjoint variables. It has been used for parameter estimation by, e.g., Pulido and Thuburn (2006), Bocquet (2012), and Kazantsev (2012).
This list of contributions to the field is far from exhaustive and merely illustrates some of the methodologies used in atmospheric, ocean and climate sciences. In particular, there is a vast literature in atmospheric chemistry dedicated to the inversion of sources of pollutants and tracers. The extended and ensemble Kalman filters and variational methods (3D-Var and 4D-Var) have been employed in this field for over two decades (Zhang et al., 2012, and references therein). Owing to the (quasi-)linearity of some chemical species, simpler four-dimensional smoothing analyses merely using a Best Linear Unbiased Estimator (BLUE) have also been extensively used to estimate sources.

1.2 The iterative ensemble Kalman smoother

The iterative ensemble Kalman smoother (IEnKS) has recently been proposed (Bocquet and Sakov, 2013) as an extension

of the iterative ensemble Kalman filter (Sakov et al., 2012; Bocquet and Sakov, 2012). It is meant to solve the variational problem of 4D-Var with the help of a 4D ensemble. As such, it is a 4D ensemble variational method of the type used in the work by Buehner et al. (2010), Chen and Oliver (2012) and Fairbairn et al. (2013), and has (more remote) connections with ensembles of variational methods (Raynaud et al., 2009; Bowler et al., 2013). It does not require the use of the adjoint observation and evolution models since the sensitivities are estimated with the ensemble (Gu and Oliver, 2007; Liu et al., 2008). Moreover, the IEnKS generates the posterior ensemble using Gaussian assumptions and forecasts the ensemble to the next update step in the same way an ensemble square root Kalman filter does. Note that the scheme is not a hybrid method since it does not combine two distinct methods.
Because the IEnKS fundamentally solves a variational problem, it may require iterations for the cost function minimization. The number of iterations depends on the nonlinearity of the system. This number is expected to be small (1 or 2) for weak nonlinearity (typical of synoptic scale meteorology). Using perfect model assumptions, Bocquet and Sakov (2013) have tested the IEnKS on two low-order models in different regimes representing different nonlinearities and lengths of the data assimilation window (DAW). The IEnKS (often significantly) outperforms the EnKF and the standard ensemble Kalman smoother (EnKS) in all these regimes, not only regarding the smoothing performance (retrospective state estimation) but also regarding the filtering performance (state estimation at present and future times). Here, we will show that the IEnKS also outperforms 4D-Var in this context. In addition, the IEnKS has been shown on these models to be able to handle long DAWs, especially when assimilating observations several times (in a mathematically consistent manner). Because the IEnKS offers the advantages of both filtering and variational methods, and because it is capable of operating on long DAWs, it has considerable potential as an efficient parameter estimation method.

1.3 Objective and outline

The objective of this article is to introduce a straightforward extension of the IEnKS to joint state and parameter estimation, and to test the potential of the approach on low-order models. The physical context is that of chaotic geophysical models, and of atmospheric chemical/tracer models, in which a joint state and parameter estimation is, in our opinion, a key to successful forecasts.
The algorithm of the IEnKS will be described in Sect. 2, in a compact but comprehensive manner. The method will then be generalized to joint state and parameter estimation. In Sect. 3, the capabilities of the IEnKS on the Lorenz-95 model (Lorenz and Emanuel, 1998) will be reported. Additional

tests will be performed: a comparison with the state-of-the-art EnKF and standard EnKS, as well as with a 4D-Var, and with a new cycling of the IEnKS DAWs. Then the IEnKS will be tested for joint state and parameter estimation on the Lorenz-95 model (Lorenz and Emanuel, 1998). In Sect. 4, an original extension of the Lorenz-95 model with the advection of a tracer will be introduced. It is meant to represent the dynamics of an online atmospheric chemistry model, or of meteorological models with a constituent such as moisture, with two unobserved parameters: the Lorenz-95 forcing parameter and the emission flux. The IEnKS, the EnKF/EnKS, and a 4D-Var will be tested and compared in this context. The results will be discussed in Sect. 5. Conclusions will be drawn in Sect. 6.

2 The iterative ensemble Kalman smoother for joint state and parameter estimation

2.1 The algorithm

A Bayesian derivation of the IEnKS can be found in Bocquet and Sakov (2013). However, we would like to introduce the IEnKS comprehensively in this article: reference to Bocquet and Sakov (2013) will only be made regarding details that are not directly relevant to this study. Here, we describe the algorithm with its main justifications, and then provide its pseudo-code.

2.1.1 The core algorithm

Observation vectors y ∈ R^d are assumed to be collected at every time step Δt. Time is discretized into the times t_k when the observations are collected. The number d of scalar observations within y can be time-dependent. The observations are related to the state vector through a possibly nonlinear, possibly time-dependent observation operator H_k. The observation errors are assumed to be Gaussian-distributed, unbiased, and uncorrelated in time, and to have an observation error covariance matrix R_k. The analysis step of the assimilation scheme is performed over a window of length LΔt in time units. Unless otherwise stated, the time index k is relative to present time. With this convention, present time is always t_L, so that the initial condition of the DAW is conveniently always t_0.
Let us first describe the update step. At t_0 (i.e. LΔt in the past), the background is obtained from an ensemble of N state vectors of R^M: x_{0,[1]}, ..., x_{0,[n]}, ..., x_{0,[N]}. Index 0 refers to time while [n] refers to the ensemble member index. They can be stored in a matrix E_0 = [x_{0,[1]}, ..., x_{0,[N]}] ∈ R^{M×N}. One can equivalently represent the ensemble with its mean

x̄_0 = (1/N) Σ_{n=1}^{N} x_{0,[n]}

and its anomaly matrix A_0 = [x_{0,[1]} − x̄_0, ..., x_{0,[N]} − x̄_0]. As in the ensemble Kalman filter, this background is approximated as a Gaussian distribution of mean x̄_0 and covariance matrix A_0 A_0^T / (N − 1), the first- and second-order


empirical moments of the ensemble. The background is rarely full rank since the anomalies of the ensemble span a vector space of dimension smaller than or equal to N − 1, and in a realistic context N ≪ M. Therefore, one solves for the analysis state vector x_0 in the ensemble space x̄_0 + Vec{x_{0,[1]} − x̄_0, ..., x_{0,[N]} − x̄_0}, which can be written x_0 = x̄_0 + A_0 w, where w ∈ R^N is a vector of coefficients in ensemble space.
The analysis of the IEnKS over [t_0, t_L] is obtained from a cost function. The restriction of this cost function in state space to the ensemble space yields

J̃(w) = (1/2)(N − 1) w^T w + (1/2) Σ_{k=1}^{L} β_k δ_k(w)^T R_k^{-1} δ_k(w),
δ_k(w) = y_k − H_k ∘ M_{k←0}(x_0^{(0)} + A_0 w).   (1)

The tilde symbol signifies that J̃ is a mathematical object defined in ensemble space. M_{k←0} is the possibly nonlinear transition operator from t_0 to t_k. The {β_k}_{1≤k≤L} are scalars in [0, 1] that weight the observations within the DAW. The choice of the β_k can be made mathematically consistent and can have dramatic consequences on the performance of the data assimilation system. We refer to Bocquet and Sakov (2013) for a justification and numerical tests. Nonetheless, the rationale for the choice of the {β_k}_{1≤k≤L} will be discussed later.
This cost function is iteratively minimized in ensemble space following the Gauss–Newton algorithm:

w^{(j+1)} = w^{(j)} − H̃_{(j)}^{-1} ∇J̃_{(j)}(w^{(j)}),   (2)

using the gradient ∇J̃_{(j)} and an approximate Hessian H̃_{(j)} of the cost function:

∇J̃_{(j)} = −Σ_{k=1}^{L} β_k Y_{k,(j)}^T R_k^{-1} [y_k − H_k ∘ M_{k←0}(x_0^{(j)})] + (N − 1) w^{(j)},   (3)

H̃_{(j)} = (N − 1) I_N + Σ_{k=1}^{L} β_k Y_{k,(j)}^T R_k^{-1} Y_{k,(j)},   (4)

x_0^{(j)} = x_0^{(0)} + A_0 w^{(j)}.   (5)

H̃_{(j)} is an approximation of the full Hessian because it disregards the contribution of the second-order derivatives of the innovation vectors δ_k(w) in the cost function. The notation (j) refers to the iteration index of the minimization. At the first iteration one sets w^{(0)} = 0. I_N is the identity matrix in ensemble space. Y_{k,(j)} = [H_k ∘ M_{k←0}]'_{x_0^{(j)}} A_0 is the tangent linear of the operator from ensemble space to the observation space. The estimation of this sensitivity using the ensemble is what allows one to avoid the use of the model adjoint. Two implementations, referred to as the transform and the bundle variants, have been put forward (Sakov et al., 2012; Bocquet and Sakov, 2012). With the bundle scheme, for instance, the


ensemble is rescaled closer to the mean trajectory by a factor ε. It is then propagated through the model and the observation operators, after which it is rescaled back by the inverse factor ε^{-1}. The operation reads

Y_{k,(j)} ≈ (1/ε) H_k ∘ M_{k←0}(x_0^{(j)} 1^T + ε A_0) (I_N − 11^T/N),   (6)

where 1 = (1, ..., 1)^T ∈ R^N.
Note that each iterative update Eq. (2) solves the inner quadratic variational problem

J̃_{(j)}(w) = (1/2)(N − 1) ‖w − w^{(j)}‖² + (1/2) Σ_{k=1}^{L} β_k ‖y_k − H_k ∘ M_{k←0}(x_0^{(j)}) − Y_{k,(j)}(w − w^{(j)})‖²_{R_k},   (7)

where ‖z‖²_G = z^T G^{-1} z.
The iteration is stopped when ‖w^{(j)} − w^{(j−1)}‖ becomes smaller than a predetermined threshold e. Let us denote w* the solution of the cost function minimization. The symbol * will be used with any quantity obtained at the minimum. Subsequently, a posterior ensemble can be generated at t_0:

E_0* = x_0* 1^T + √(N − 1) A_0 H̃_*^{-1/2} U,   (8)

where U is an orthogonal matrix that is arbitrary but satisfies U1 = 1 – meant to keep the posterior ensemble centered on the analysis – and x_0* = x̄_0 + A_0 w*.
The Gauss–Newton minimization scheme of Eq. (2) can easily be replaced by a quasi-Newton scheme that avoids the computation of the Hessian, or by a Levenberg–Marquardt algorithm that guarantees convergence of the minimization. These alternatives have been suggested and successfully tested in Bocquet and Sakov (2012). In the context of the standard models tested in Sects. 3 and 4, the nonlinearity is mild enough that a Levenberg–Marquardt scheme is unnecessary, and the Gauss–Newton scheme is very efficient.
This ends the part of the analysis step that is required to cycle the data assimilation scheme. An optional analysis step is required when a state estimation is desired at times t_1, ..., t_L, i.e. up to present time, or when a forecast to future times is desired. This additional step depends on the choice of the β_k and on whether the DAWs are overlapping. In the simplest case, when observations are assimilated once and only once, this subsequent analysis takes the form of a forecast of the mean state x_k* = M_{k←0}(x_0*), or a forecast of the full ensemble, E_k* = M_{k←0}(E_0*), if one is additionally interested in estimating forecast uncertainty.
During the forecast step of the scheme cycle, not to be confused with the forecast of the analysis step we just mentioned, the ensemble is propagated for SΔt, with S an integer:

E_S* = M_{S←0}(E_0*).   (9)

Fig. 1. Chaining of the SDA IEnKS cycles. The schematic illustrates the case L = 5, and a shift of S = 2 time intervals Δt is applied between two updates. The method performs a smoothing update throughout the window but only assimilates the newest observation vectors (those that have not already been assimilated), marked by black dots. Note that the time indices of the dates and the observations are absolute, not relative, for this schematic.

This ensemble tS willstep form the background next If the optionalatanalysis implied forecastingfor thethe ensemanalysis. ble to or beyond tS , then there is no need to forecast it again. typical chaining of form the analysis and forecast stepsnext is ThisA ensemble at tS will the background for the schematically displayed in Fig. 1. A pseudo-code of the analysis. IEnKS is displayed 1. Itand does not show A typical chaininginofalgorithm the analysis forecast stepstheis optional analysis step, since the cycling of data assimilation schematically displayed in Fig. 1. does not depend on it. It is the same as the one presented in A pseudo-code of the IEnKS is displayed in Algorithm 1. (Bocquet and Sakov, 2013), except that it is here given in the It does not show the optional analysis step, since the cycling general case 1 ≤ S ≤ L rather than S = 1 only. It accounts of data assimilation does not depend on it. It is the same as for the possible use of inflation (lines 20,21). the one presented in Bocquet and Sakov (2013), except that In summary, the IEnKS solves the variational problem of here it is given in the general case, 1 ≤ S ≤ L, rather than 4D-Var in the ensemble range. Because the variational probthe specific case S = 1. The pseudo-code accounts for the lem is solved in a reduced space, there is no need for the possible use of inflation (lines 20, 21). adjoint evolution and observation models. The IEnKS genIn summary, the IEnKS theperturbations variational problem of erates and propagates the solves posterior following 4D-Var in the ensemble range. Because the variational probthe scheme of the ensemble Kalman filter. As such, it uses lem is solved reduced sampled errorsinofathe day. space, there is no need for the adjoint evolution and observation models. The IEnKS generates propagates the posterior perturbations following 2.1.2 and Single and multiple assimilation of observations the scheme of the ensemble Kalman filter. As such, it uses sampled errors the day. There are someofdegrees of freedom in the choice of L, S and the {βk }1≤k≤L . Let us just mention a few legitimate choices. 2.1.2 Single multiple of that observations Firstly, for and any choice of assimilation L and S, such 1 ≤ S ≤ L, the most natural choice for the {βk }1≤k≤L is to set: βk = 1 There areLsome in the choice of way, L, S the and for k = − S +degrees 1,...L, of andfreedom βk = 0 otherwise. That {β } the . Let us just mention a few legitimate choices. k 1≤k≤Lare assimilated once and only once. We call it observations Firstly, L and S,scheme such that 1 ≤IEnKS). S ≤ L, Itthe most the singlefor dataany assimilation (SDA is sim{β } natural choice for the is β = 1 for k = L − S+ k 1≤k≤L k ple, and the optional analysis of the update step is merely a 1,forecast . . . L, and β = 0 otherwise. That way, the observations k analyzed state at t0 , or possibly a forecast are of the of assimilated once and only WeS call single the full ensemble from t0 .once. When = L,this thethe DAW s do data not assimilation scheme (SDA IEnKS). It is simple, and thedata opoverlap, while they do so if S < L. The chaining of the tional analysis of the update step is merely a forecast of assimilation cycles in the SDA case is displayed in Fig. 1. the analyzed state at tdata possibly a forecast of the 0 , orassimilation For very long windows, thefull use ensemble of multifrom t . When S = L, the DAWs do not overlap, but they do 0 ple assimilation (or splitting) of observations, denoted MDA soinwhen S < L. 
The chaining of the data assimilation cycles the following, can prove numerically efficient (Bocquet in the SDA case is displayed in Fig. 1.


Algorithm 1 A cycle of the lag-L / shift-S / MDA / bundle / Gauss–Newton IEnKS.
Require: t_L is present time. Transition model M_{k+1←k}, observation operators H_k at t_k. Algorithm parameters: ε, e, j_max. E_0, the ensemble at t_0; y_k, the observation at t_k. λ is the inflation factor. U is an orthogonal matrix in R^{N×N} satisfying U1 = 1. The β_k, 1 ≤ k ≤ L, are the observation weights within the DAW.
1:  j = 0, w = 0
2:  x_0^{(0)} = E_0 1/N
3:  A_0 = E_0 − x_0^{(0)} 1^T
4:  repeat
5:      x_0 = x_0^{(0)} + A_0 w
6:      E_0 = x_0 1^T + ε A_0
7:      for k = 1, ..., L do
8:          E_k = M_{k←k−1}(E_{k−1})
9:          ȳ_k = H_k(E_k) 1/N
10:         Y_k = (H_k(E_k) − ȳ_k 1^T)/ε
11:     end for
12:     ∇J̃ = (N − 1) w − Σ_{k=1}^{L} β_k Y_k^T R_k^{-1} (y_k − ȳ_k)
13:     H̃ = (N − 1) I_N + Σ_{k=1}^{L} β_k Y_k^T R_k^{-1} Y_k
14:     Solve H̃ Δw = ∇J̃
15:     w := w − Δw
16:     j := j + 1
17: until ‖Δw‖ ≤ e or j ≥ j_max
18: E_0 = x_0 1^T + √(N − 1) A_0 H̃^{-1/2} U
19: E_S = M_{S←0}(E_0)
20: x̄_S = E_S 1/N
21: E_S := x̄_S 1^T + λ (E_S − x̄_S 1^T)
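To make the pseudo-code concrete, the following is a minimal NumPy sketch of one such cycle for the bundle Gauss–Newton variant, written for the simple case S = L and U = I_N, with a single observation error covariance R used at all times. The function and argument names (ienks_bundle_cycle, model_step, obs_op) are illustrative and are not taken from the authors' implementation.

```python
import numpy as np

def ienks_bundle_cycle(E0, obs, model_step, obs_op, R_inv, beta,
                       eps=1e-4, tol=1e-3, jmax=10, infl=1.0):
    """One lag-L analysis-and-forecast cycle of a bundle Gauss-Newton IEnKS
    (a sketch following Algorithm 1, here with S = L; not the authors' code).

    E0         : (M, N) ensemble at the beginning of the window, t_0
    obs        : list of L observation vectors y_1..y_L (None if no obs at t_k)
    model_step : propagates an (M, N) ensemble column-wise by one Delta t
    obs_op     : observation operator H applied column-wise to an ensemble
    R_inv      : inverse observation error covariance matrix
    beta       : observation weights beta_1..beta_L within the window
    """
    M, N = E0.shape
    x0 = E0.mean(axis=1)                       # line 2: ensemble mean
    A0 = E0 - x0[:, None]                      # line 3: anomalies
    w = np.zeros(N)
    for _ in range(jmax):                      # Gauss-Newton loop, lines 4-17
        x = x0 + A0 @ w
        E = x[:, None] + eps * A0              # bundle rescaling by eps
        grad = (N - 1) * w
        hess = (N - 1) * np.eye(N)
        for k in range(len(obs)):              # propagate through the window
            E = model_step(E)
            if obs[k] is None or beta[k] == 0.0:
                continue
            HE = obs_op(E)
            y_mean = HE.mean(axis=1)
            Yk = (HE - y_mean[:, None]) / eps  # ensemble sensitivity, Eq. (6)
            grad -= beta[k] * Yk.T @ R_inv @ (obs[k] - y_mean)
            hess += beta[k] * Yk.T @ R_inv @ Yk
        dw = np.linalg.solve(hess, grad)       # line 14
        w -= dw                                # line 15
        if np.linalg.norm(dw) <= tol:
            break
    # Posterior ensemble at t_0, Eq. (8); symmetric square root so that U = I
    # keeps the ensemble centered on the analysis.
    vals, vecs = np.linalg.eigh(hess)
    T = vecs @ np.diag(vals ** -0.5) @ vecs.T
    E0_post = (x0 + A0 @ w)[:, None] + np.sqrt(N - 1) * A0 @ T
    ES = E0_post
    for _ in range(len(obs)):                  # forecast step, Eq. (9), here S = L
        ES = model_step(ES)
    xS = ES.mean(axis=1)
    ES = xS[:, None] + infl * (ES - xS[:, None])   # lines 20-21: inflation
    return E0_post, ES
```

In this sketch the ensemble is re-propagated through the whole window at every Gauss–Newton iteration, which is what makes the scheme adjoint-free: the sensitivities Y_k are read directly off the rescaled ensemble.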

For very long data assimilation windows, the use of multiple assimilation (or splitting) of observations, denoted MDA in the following, can prove numerically efficient (Bocquet and Sakov, 2013). An observation vector y is said to be assimilated with weight β (0 ≤ β ≤ 1) if the following Gaussian observation likelihood is used in the analysis:

p(y^β | x) = exp(−(β/2)(y − H(x))^T R^{-1}(y − H(x))) / √((2π/β)^d |R|),   (10)

where |R| is the determinant of R. The upper index of y^β refers to its partial assimilation with weight β. The prior errors attached to the several occurrences of one observation are chosen to be independent. In that light, the {β_k}_{1≤k≤L} are merely the weights of the observation vectors {y_k}_{1≤k≤L} within the DAW. Statistical consistency imposes that a unique observation vector is assimilated in such a way that the sum of all its weights in the data assimilation experiment is one. For instance, if 1 = S ≤ L, one requires Σ_{k=1}^{L} β_k = 1. In the more general case where the observation vectors have the same number of non-zero weights, L is a multiple of S: L = QS, where Q is an integer. As a result, consistency requires Σ_{q=0}^{Q−1} β_{Sq+l} = 1 with l = 1, ..., S.
In the MDA case, except for the SDA subcase, the optional analysis step is more complex since it requires re-weighting the observations within the DAW to obtain the correct analyses for states t_1 to t_L and beyond. More details that are not directly relevant to this study can be found in Bocquet and Sakov (2013).
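As a small illustration of this consistency constraint (a sketch, not taken from the paper), uniform MDA weights for a window of length L = QS can be generated and checked as follows:

```python
import numpy as np

def uniform_mda_weights(L, S):
    """Uniform observation weights beta_1..beta_L for an MDA IEnKS with shift S.

    Each observation vector is seen by L/S successive windows, so a uniform
    weight of S/L makes its weights sum to one over the whole experiment.
    """
    if L % S != 0:
        raise ValueError("L must be a multiple of S for uniform weights")
    return np.full(L, S / L)

beta = uniform_mda_weights(L=10, S=2)
# Sum over the Q = L/S windows that see a given observation (l = 1, ..., S):
print(beta.reshape(-1, 2).sum(axis=0))   # -> [1. 1.]
```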


Fig. 2. Chaining of the MDA IEnKS cycles. The schematic illustrates the case L = 5 and S = 2. The method performs a smoothing update throughout the window, potentially using all observations within the window (marked by black dots), except for the first observation vector, assumed to be already entirely assimilated. Note that the time indices of the dates and the observations are absolute, not relative, for this schematic.

Note that when the constraint Σ_{k=1}^{L} β_k = 1 is not satisfied, the underlying smoothing probability density function (pdf) will not be the one targeted, but, with well chosen {β_k}_{1≤k≤L}, it could be a power of it (Bocquet and Sakov, 2013).
These MDA approaches are mathematically consistent in the sense that they are demonstrated to be correct in the linear model, Gaussian statistics case. A heuristic argument based on Bayesian ideas justifies the use of the method in the nonlinear case (Bocquet and Sakov, 2013).
The chaining of the data assimilation cycles in the MDA case is displayed in Fig. 2. In the experimental Sects. 3 and 4, both SDA and MDA schemes will be used.

2.2 Augmented state formalism

We wish to estimate a set of model parameters θ ∈ R^P along with the state variables. To do so, the state space is augmented from x ∈ R^M to a vector

z = (x, θ)^T ∈ R^{M+P}   (11)

of the joint state and parameter space. From the mathematical point of view, the analysis step of the IEnKS is unchanged.
As is usual in a parameter estimation context, a forward model needs to be introduced for the parameters. For instance, it could be the persistence model (θ_{k+1} = θ_k), or some jittering, such as a Brownian motion, could be assumed (θ_{k+1} = θ_k + ε_k). Depending on the constraints on the parameters, this jittering could also be constrained.
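The augmentation itself is mechanical. A minimal sketch of an augmented forward step (illustrative names; model_step(x, theta) is a placeholder for the dynamical model, which may depend on the parameters) could read:

```python
import numpy as np

def augmented_step(z, model_step, n_params):
    """One forward step for an augmented state z = (x, theta) as in Eq. (11):
    the dynamical model acts on x, while the parameters follow the
    persistence model theta_{k+1} = theta_k."""
    x, theta = z[:-n_params], z[-n_params:]
    return np.concatenate([model_step(x, theta), theta])

def augmented_step_jitter(z, model_step, n_params, sigma, rng):
    """Variant with Brownian jittering of the parameters,
    theta_{k+1} = theta_k + eps_k, eps_k ~ N(0, sigma^2)."""
    x, theta = z[:-n_params], z[-n_params:]
    theta_new = theta + sigma * rng.standard_normal(n_params)
    return np.concatenate([model_step(x, theta), theta_new])
```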


Technically, there is nothing more in the joint state and parameter IEnKS than in the state IEnKS. As opposed to the EnKF and EnKS, the objective of the joint state and parameter IEnKS is not to build covariances to help estimate hidden parameters, but instead to minimize a cost function that depends on the full augmented state. In a strongly nonlinear context, this approach could prove superior to the standard EnKF and EnKS.
As mentioned in the introduction, the estimation of model parameters within 4D-Var requires the adjoint model. Besides, the computation of the derivative of the cost function with respect to the parameters in terms of the adjoint field can be tedious. Parameter estimation with the IEnKS avoids this time-consuming task. A potential advantage of the IEnKS over 4D-Var is that the errors of the day are by construction estimated within the IEnKS for all types of variables or parameters, whereas the 4D-Var modeling of background statistics of heterogeneous variables and parameters can be complex (see, for instance, Elbern et al. (2007), regarding the modeling of inter-species correlations in a 4D-Var applied to air quality, or Montmerle and Berre (2010) in a meteorological convective-scale context).
Similarly to state estimation, joint state and parameter estimation with the IEnKS in theory combines appealing features of both variational and ensemble Kalman filtering techniques. The purpose of the following numerical exploration is to investigate whether this holds true in experiments with low-order models.

3 Numerical experiments with the Lorenz-95 model

The Lorenz-95 one-dimensional model (Lorenz and Emanuel, 1998) represents a mid-latitude zonal circle of the global atmosphere. It has M = 40 variables {x_m}_{m=1,...,M}. Its dynamics is given by the following set of ordinary differential equations:

dx_m/dt = (x_{m+1} − x_{m−2}) x_{m−1} − x_m + F,   (12)

for m = 1, ..., M, and the domain is periodic (circle-like). F is chosen to be 8 so that the dynamics is chaotic and has 13 positive Lyapunov exponents. A time step of Δt = 0.05 is meant to represent a time interval of 6 h in the real atmosphere. Unless otherwise stated, the time interval between each observational update will be Δt = 0.05, meant to be representative of a data assimilation cycle of global meteorological models. With such a value for Δt, the data assimilation system is considered weakly nonlinear, leading to statistics of errors weakly diverging from Gaussianity. This model is integrated using the fourth-order Runge–Kutta scheme with a time step of 0.05.
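For concreteness, a minimal NumPy implementation of Eq. (12) and of its fourth-order Runge–Kutta integration might look as follows (the function names are illustrative, not the authors' code):

```python
import numpy as np

def lorenz95_tendency(x, F=8.0):
    """Right-hand side of Eq. (12) with periodic (circular) indexing."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt=0.05, F=8.0):
    """One fourth-order Runge-Kutta step of length dt."""
    k1 = lorenz95_tendency(x, F)
    k2 = lorenz95_tendency(x + 0.5 * dt * k1, F)
    k3 = lorenz95_tendency(x + 0.5 * dt * k2, F)
    k4 = lorenz95_tendency(x + dt * k3, F)
    return x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
```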


3.1 Setup

Twin experiments are conducted. The truth is represented by a free model run (nature run), meant to be tracked by the data assimilation system. The system is assumed to be fully observed (d = 40) every Δt, so that H_k = I_d, with the observation error covariance matrix R_k = I_d. The related synthetic observations are generated from the truth, and perturbed according to the same observation error prior.
The performance of a scheme is measured by the temporal mean of a root mean square difference between a state estimate (x^a) and the truth (x^t). Typically, one averages the following analysis root mean square error (RMSE)

RMSE = √( (1/M) Σ_{m=1}^{M} (x^a_m − x^t_m)² )   (13)

over the data assimilation cycles. When this RMSE concerns the system state at present time, i.e., the state at the end of the DAW, we call it the filtering RMSE. When this RMSE concerns the state defined LΔt in the past, i.e., at the beginning of the DAW, we call it the smoothing RMSE. All data assimilation runs will extend over 10^5 cycles after a burn-in period of 5 × 10^3 cycles. This guarantees a sufficient convergence of the error statistics.
Unless otherwise stated, the size of the ensemble used with the ensemble methods will be N = 20, which is greater than the size of the unstable subspace and, in the case of this model, makes localization unnecessary. In this context, we have chosen to implement the inflation using the finite-size counterparts of the filters/smoothers (Bocquet et al., 2011). For this model, except in quasi-linear conditions (Δt ∼ 0.01), this inflation leads to performances that are quantitatively very close to those of the same filter/smoother with optimally tuned uniform inflation (Bocquet et al., 2011; Bocquet and Sakov, 2012). In the following, methods like EnKF/IEnKS/EnKS should be understood as EnKF/IEnKS/EnKS with optimally tuned uniform inflation, and will actually be implemented with a single run of the finite-size variants, i.e. EnKF-N/IEnKS-N/EnKS-N, which is much more economical. Any reader not interested in implementing the finite-size IEnKS (whose pseudo-code is presented in Algorithm 2), or IEnKS-N, can alternatively optimally tune the uniform inflation of an EnKF/IEnKS/EnKS to attain very similar results.
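As an illustration of this setup, a sketch of the twin-experiment loop and of the RMSE of Eq. (13) could read as follows (illustrative code reusing the rk4_step function sketched above; the trivial "analysis" shown here simply copies the observations and would be replaced by any of the schemes compared in this section):

```python
import numpy as np

rng = np.random.default_rng(0)
M = 40
x_truth = 3.0 + rng.standard_normal(M)        # nature run initial condition (illustrative)

def rmse(x_est, x_true):
    """Analysis root mean square error of Eq. (13)."""
    return np.sqrt(np.mean((x_est - x_true) ** 2))

scores = []
for cycle in range(1000):                     # 1e5 cycles in the paper; shortened here
    x_truth = rk4_step(x_truth)               # propagate the truth by one Delta t
    y = x_truth + rng.standard_normal(M)      # synthetic observations, H = I, R = I
    x_analysis = y                            # stand-in for a real analysis step
    scores.append(rmse(x_analysis, x_truth))
print(np.mean(scores))                        # time-averaged analysis RMSE
```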

3.2 New experiments with the IEnKS

This section is meant to recall and extend to 4D-Var and the case S = L several numerical tests of Bocquet and Sakov (2013), before considering joint state and parameter estimation. The following five systems are compared:

– The standard ensemble Kalman smoother (EnKS), with S = 1. The standard ensemble Kalman smoother has been defined in Evensen and van Leeuwen (2000), Evensen (2003, 2009) and Cosme et al. (2012).

– The SDA IEnKS, S = 1.

– The MDA IEnKS, S = 1. The {β_k}_{1≤k≤L} are chosen to be uniform in the DAW and constant in time.

– The SDA IEnKS, with S equal to the length of the DAW, S = L, so that the DAWs do not overlap. This approach is meant to be computationally economical, and is much more economical than the quasi-static case S = 1, since there are no overlapping DAWs.

– 4D-Var with a shift S = 1, corresponding to overlapping DAWs and quasi-static conditions. The gradient is obtained by finite differences, which is affordable and precise enough in this small dimensional context. The performance of 4D-Var depends on the background statistics. Since the correlations in the Lorenz-95 system are rather short-ranged, the B-matrix is chosen diagonal. The performance of 4D-Var does not vary much if we introduce some correlation and off-diagonal terms. However, the scaling of the B-matrix is crucial in this context (Kalnay et al., 2007). The longer the DAW is, the smaller the scaling factor should be, since the first guess becomes more accurate. For each experiment, we tuned this scaling so as to obtain the best filtering analysis RMSE.

To avoid tuning inflation, the finite-size variants of the filters and smoothers are employed (SDA IEnKS-N, MDA IEnKS-N, EnKS-N). All EnKF and EnKS, and their finite-size variants, in this article are based on the ensemble transform square root Kalman filter (Bishop et al., 2001; Hunt et al., 2007; Bocquet et al., 2011).

Algorithm 2 A cycle of the lag-L / shift-S / MDA / bundle / Gauss–Newton IEnKS-N. Same as Algorithm 1, with the exception of the following lines (with ε_N = 1): the gradient and approximate Hessian of lines 12–13 become
∇J̃ = N w/(ε_N + w^T w) − Σ_{k=1}^{L} β_k Y_k^T R_k^{-1} (y_k − ȳ_k),
H̃ = N [(ε_N + w^T w) I_N − 2 w w^T]/(ε_N + w^T w)² + Σ_{k=1}^{L} β_k Y_k^T R_k^{-1} Y_k,
while the posterior ensemble generation and the forecast of the ensemble remain as in Algorithm 1.

These five data assimilation systems are compared in weakly nonlinear conditions (Δt = 0.05), chosen to roughly represent synoptic scale meteorology dynamics (Lorenz and Emanuel, 1998), and in more nonlinear conditions (Δt = 0.20 between updates). The time-averaged analysis RMSE is plotted in Fig. 3 for the former case, and in Fig. 4 for the latter case, as a function of the length of the DAW.

Fig. 3. Comparison of the filtering (left) and smoothing (right) performance of the SDA IEnKS, MDA IEnKS, the EnKS and 4D-Var, in weakly nonlinear conditions corresponding to Δt = 0.05.

Let us first notice that the filtering performance of the EnKS is, by construction, given by the EnKF, whatever the length of the DAWs. This explains why the filtering RMSE of the EnKS is constant, modulo statistical noise. When comparing the filtering performances of the EnKF/EnKS and 4D-Var, the conclusions of Kalnay et al. (2007) are reinforced: 4D-Var does not perform as well for short DAWs and performs better for long DAWs. In addition, we note that the same conclusion applies to the smoothing performance, even though the crossover point might be different.
Considering the filtering as well as the smoothing, the MDA IEnKS S = 1 significantly outperforms 4D-Var and the EnKF/EnKS in all regimes. The SDA IEnKS S = 1 also performs very well, but its performance wanes with longer DAWs, which is why the MDA IEnKS was introduced by Bocquet and Sakov (2013).
DAW length L

Fig. 4. Comparison Comparison of of the the filtering filtering (left) (left) and and smoothing smoothing (right) (right) performance performance of of the the SDA SDA IEnKS, IEnKS,MDA MDAIEnKS, IEnKS,the theEnKS EnKSand and4D-Var, 4D-Var,inin non-linearconditions conditionscorresponding correspondingto to1t ∆t= =0.20. 0.20. nonlinear

– The and ensemble (EnKF). Bocquet SakovKalman (2013). filter For very short DAWs (L = 1, 2 in the case 1t = 0.05), the performances of the SDA IEnKS – The ensemble Kalman smoother (EnKS) S = 1. S = 1 and MDA IEnKS S = 1 are equal (L = 1) or very close (L–=The 2). For intermediate lengths, the SDA IEnKS MDA IEnKS S =DAW 1. The {βk }1≤k≤L are chosen S = 1uniform can slightly MDA IEnKS in the outperform DAW and constant in time.S = 1. This is not surprising, since the SDA IEnKS algorithm is meant to – 4D-Var with S = 1. The background’s magnitude is be optimal for sufficiently short DAWs, whereas the MDA tuned so as to minimize the global (on all 41 extended IEnKS algorithm is only guaranteed to be optimal in linvariables) RMSE. ear/Gaussian conditions. ToPractically, avoid tuning the finite-size variants, in inflation, weakly nonlinear conditions (1tEnKF-N, = 0.05), EnKS-N, are employed. DAW length the IEnKSMDA S = 1IEnKS-N, only requires one to twoThe propagations of is varied in the MDAthe IEnKS 4D-Var cases up toshown L= the ensemble within DAW.and Consistently, it was 50,Bocquet before aand degradation of the performance in. In in Sakov (2013) that a linearizedsets variant of the the EnKS case,requiring the DAWone length is variedof upthe to ensemble L = 100, which algorithm, propagation within corresponds to the optimal performance for the the DAW to compute the sensitivity, performed justsmoothing as well in estimation of F by thenevertheless EnKS. these conditions. It is tempting to check whether is plotted in Fig. 5, over a windows 5 × 103thisThe costforcing can beparameter reduced by using non-overlapping cycle-long segment of the experiment. In the case the S = L, and performing analysis every L1t. This of would divide the cost of model runs by L, but this effect might nevertheless be offset by an higher number of iterations required for the analysis. Quite surprisingly, the SDA IEnKS S = L performs very well for DAWs of length smaller than 0.80 (about twice the doubling time of the Lorenz-95 model). It is useless beyond that length, which was to be expected since the background at the beginning of the DAW results from a long forecast within the DAW, as opposed to a forecast of only 1t in the quasistatic S = 1 case. In stronger nonlinear conditions, the variational methods (4D-Var and IEnKS) easily outperform the EnKF/EnKS. In particular, 4D-Var outperforms the EnKF/EnKS as soon as the the DAW reaches L = 2.

Nonlin. Processes Geophys., 20, 803–818, 2013

EnKS, the smoothing F (at the beginning of the 3.3 Joint state andestimator forcing Fforestimation DAW ) is plotted because it is better than the filtering estimate A Ftwin experiment is DAW conducted in athe situation where F is of (at the end of the ). Because persistence model unknown. The true model (nature run) has forcing F = 8. is assumed for F , the smoothing and the filtering estimates The model used for assimilation and forecast has the initial of F are the same for the IEnKS and 4D-Var. Because, in advalue Fthe=true 7. F is static, the smoothing and filtering RMSEs dition, In addition to the state the forcing should coincide. From Fig.variables, 5, it is clear that the parameter IEnKS sig-F will be estimated as well. Hence, the state vector x ∈ RM nificantly outperforms the EnKF and the EnKS. with = 40 will be extended the joint vector of size M+ TheM time-averaged analysisto root mean square errors 5 P = 41, with its 41st entry being the forcing parameter. The (RMSEs) are computed over a much longer run of 10 cycles. persistence model will be assumed for the evolution of the The scores for the state variables are reported in Fig. 6. The model parameter. filtering RMSEs (i.e. at present time) of the EnKF or of the Because theLfilters and smoothers used here are deterEnKS for any are, by construction, the same. Theall estimaministic, the only source of stochasticity to generate the varition of the forcing F is good enough so that the performance ability in F comes from the initialization of the ensemble. is indistinguishable from the EnKF performance in the case The forcing a member iseven initialized 7 + ε, where F = 8 parameter is known. ofNevertheless, in this toweakly where ε isregime, independently drawn distribution nonlinear the IEnKS withfrom L ≥ 1a normal outperforms them. of standard the deviation 0.1.Bocquet The augmented IEnKS will Confirming results of and Sakovstate (2013), the gap be compared to several augmented state alternatives. Specifically, we shall consider in this experiment:

– The ensemble Kalman filter (EnKF). – The ensemble Kalman smoother (EnKS) S = 1. – The MDA IEnKS S = 1. The {βk }1≤k≤L are chosen to be uniform in the DAW and constant in time. – 4D-Var with S = 1. The background’s magnitude is tuned so as to minimize the global (on all 41 extended variables) RMSE. To avoid tuning inflation, the finite-size variants – EnKF-N, EnKS-N, MDA IEnKS-N – are employed. The DAW length is varied in the MDA IEnKS and 4D-Var cases up to L = 50, before a degeneration of the performance sets in. In the EnKS case, the DAW length is varied up to L = 100, which www.nonlin-processes-geophys.net/20/803/2013/

timated by several filters and smoothers with an ensemble of size N = 20. The forcing of the true model is F = 8. The MDA IEnKS for L = 1,5,10 and 30 is compared to the EnKF and the EnKS (L = 50). The finite-size variants of these methods are used: they M. Bocquet and P. Sakov: State and parameter estimation with 811 dothe not IEnKS require inflation and perform, in this context, as well as with optimally tuned inflation. M. Bocquet, P. Sakov: State and parameter estimation with the IEnKS 9 0.128 EnKF-N EnKS-N L=50 IEnKF-N IEnKS-N L=5 IEnKS-N L=10 IEnKS-N L=30

0.30 0.064 0.25

Analysis RMSE (parameter F) Analysis RMSE (state variables)

Analysis of parameter F

8.10

0.032 0.20

8.05

0.016 0.15

0.125 0.008

8

0.10 0.004

7.95

4D-Var filtering 4D-Varsmoothing filtering/smoothing 4D-Var EnKF-N/EnKS-N filtering EnKF-N/EnKS-N filtering EnKS-N EnKS-Nsmoothing smoothing MDA IEnKS-N filtering MDA IEnKS-N filtering/smoothing MDA IEnKS-N smoothing

0.07 0.002 0.001 0.05

7.90 0

1000

2000

3000

4000

5000

10 1

Time

5

10

5

20

10

30 20

30

corresponds to the optimal performance for the smoothing estimation of F by the EnKS. The forcing parameter is plotted in Fig. 5, over a 5 × 10^3 cycle-long segment of the experiment. In the case of the EnKS, the smoothing estimator of F (at the beginning of the DAW) is plotted because it is better than the filtering estimate of F (at the end of the DAW). Because the persistence model is assumed for F, the smoothing and the filtering estimates of F are the same for the IEnKS and 4D-Var. In addition, because the true F is static, the smoothing and filtering RMSEs should coincide. From Fig. 5, it is clear that the IEnKS significantly outperforms the EnKF and the EnKS.

Fig. 5. Plot of the Lorenz-95 forcing parameter F as a function of the cycle index of the data assimilation experiment. F is estimated by several filters and smoothers with an ensemble of size N = 20. The forcing of the true model is F = 8. The MDA IEnKS for L = 1, 5, 10 and 30 is compared to the EnKF and the EnKS (L = 50). The finite-size variants of these methods are used: they do not require inflation and perform, in this context, as well as with optimally tuned inflation.

The time-averaged analysis root mean square errors (RMSEs) are computed over a much longer run of 10^5 cycles. The scores for the state variables are reported in Fig. 6. The filtering RMSEs (i.e., the RMSEs at present time) of the EnKF or EnKS are, by construction, the same for any L. The estimation of the forcing F is good enough that the performance is indistinguishable from the EnKF performance when F = 8 is known. Nevertheless, even in this weakly nonlinear regime, the IEnKS with L ≥ 1 outperforms the EnKF and the EnKS. Confirming the results of Bocquet and Sakov (2013), the gap in smoothing performance between the EnKS and the IEnKS significantly increases as L increases. In this weakly nonlinear regime, the number of iterations required by the IEnKS is close to one, and its performance equals that of the linearized IEnKS (Bocquet and Sakov, 2013).

Fig. 6. Root mean square errors for the analysis of the state vector at present time (filtering) or the retrospective analysis of the state vector (smoothing) for the EnKF, EnKS and the IEnKS, in the case of the Lorenz-95 model, with ∆t = 0.05.

Fig. 7. Root mean square errors for the analysis of F at present time (filtering) or the retrospective analysis of F (smoothing) for the EnKF, EnKS and the IEnKS, in the case of the Lorenz-95 model, with ∆t = 0.05.

The scores for the estimation of the forcing parameter are reported in Fig. 7. By construction, the filtering performance of the EnKF and the EnKS at any L is the same, approximately 0.018. The parameter smoothing RMSE for the EnKS is approximately 0.015 and is optimal for L ∼ 100. By construction, the analysis at present time and the retrospective analysis of F by the IEnKS are the same. Even in the case L = 1, the so-called iterative ensemble Kalman filter (IEnKF) outperforms the EnKS, with an RMSE of 0.013. With increasing L, this performance improves further, reaching an RMSE of 7.5 × 10^-4 for L = 50.

The estimation by 4D-Var only becomes better than that of the EnKF for DAWs of length L = 50. This counter-performance can only be explained by a poor specification of the error covariance matrix. Indeed, the scaling of the background error statistics for the state variables should be different from the scaling of the background error statistics for the parameter. However, the separate tuning of the scalings requires additional work that the IEnKS does not require. This hypothesis will be checked in Sect. 5.

4 Numerical experiments with a coupled Lorenz-95 – tracer model

In this section we introduce a simple extension of the Lorenz-95 model, with a tracer field advected by the Lorenz-95 field, which represents an advective wind. This is meant to test the ability of the IEnKS to carry out joint state and parameter estimation in the dynamical context of an online atmospheric chemical model, with heterogeneous variables.

4.1 Extending the Lorenz-95 model

We shall think of the variables x_m of the Lorenz-95 model as wind speed and direction variables defined on the circle. A tracer field c_{m+1/2}, m = 1, ..., M = 40, will be added to the model variables, for a total of 80 variables. These variables are defined on the circle using a C-grid, with the winds x_m and the fluxes Φ_m at the cell interfaces, and the concentrations c_{m+1/2} and the emissions E_{m+1/2} at the cell centers. A schematic of the grid is shown below:

x_{m-1}      c_{m-1/2}      x_m      c_{m+1/2}      x_{m+1}
Φ_{m-1}      E_{m-1/2}      Φ_m      E_{m+1/2}      Φ_{m+1}

The tracer is advected by the wind field of the Lorenz-95 model. We have chosen to use the simple Godunov upwind scheme, which is positive and conservative. It is quite diffusive, but this diffusion can be seen as a feature of the modeled physics. The equations read

\frac{\mathrm{d}x_m}{\mathrm{d}t} = (x_{m+1} - x_{m-2})\, x_{m-1} - x_m + F, \quad (14)

\frac{\mathrm{d}c_{m+1/2}}{\mathrm{d}t} = \Phi_m - \Phi_{m+1} - \lambda\, c_{m+1/2} + E_{m+1/2}, \quad (15)

where

\Phi_m = x_m\, c_{m-1/2} \quad \text{if } x_m \ge 0, \quad (16)

\Phi_m = x_m\, c_{m+1/2} \quad \text{if } x_m < 0. \quad (17)

The tracer is emitted on the whole domain, and the emission fluxes are denoted E_{m+1/2}. It is deposited on the whole domain, using a simple scavenging scheme parameterized by a scavenging ratio λ. A stationary point of the dynamics is x_m = F and c_{m+1/2} = E_{m+1/2}/λ. This provides orders of magnitude for the wind and concentration variables. For simplicity, the emission flux will be made constant and uniform: E_{m+1/2} ≡ E. Obviously, however, a more complex setting with urban/rural/sea emission types and a diurnal/nocturnal cycle could be chosen. The values of our reference simulation's parameters are λ = 0.1 and E = 1, so that the typical concentration value is 10.
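The scheme is simple enough to be reproduced in a few lines. The sketch below is our own code, not the authors'; the array indexing conventions are ours, and the time integrator (e.g., a standard fourth-order Runge-Kutta step with ∆t = 0.05) is left to the reader. It implements the coupled tendencies of Eqs. (14)-(17) with periodic boundary conditions:

import numpy as np

def coupled_tendency(x, c, F=8.0, E=1.0, lam=0.1):
    """Tendencies of the coupled Lorenz-95 / tracer model, Eqs. (14)-(17).

    x   : array (M,), wind variables x_m on the circle
    c   : array (M,), tracer concentrations c_{m+1/2}
    F   : Lorenz-95 forcing
    E   : uniform emission flux E_{m+1/2}
    lam : scavenging ratio lambda
    """
    # Lorenz-95 tendency, Eq. (14), with periodic boundary conditions
    dx = (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

    # Godunov upwind fluxes Phi_m at the interfaces, Eqs. (16)-(17):
    # Phi_m = x_m c_{m-1/2} if x_m >= 0, and x_m c_{m+1/2} otherwise
    phi = np.where(x >= 0.0, x * np.roll(c, 1), x * c)

    # Tracer tendency, Eq. (15): flux divergence, scavenging and emission
    dc = phi - np.roll(phi, -1) - lam * c + E
    return dx, dc

Starting the integration near the stationary point x_m = F and c_{m+1/2} = E/λ gives sensible initial orders of magnitude for both fields.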

The Courant–Friedrichs–Lewy (CFL) condition is almost always satisfied: the Lorenz-95 model variables |x_m| very rarely exceed 15 and, by construction, ∆t = 0.05 and ∆x = 1, so that CFL ≤ 0.75 < 1.

Free run simulations help one to understand some dynamical characteristics of the model. The model exhibits features of a realistic tracer model. For instance, consider two distinct trajectories of the model in which the wind fields are the same. It turns out that the concentrations c_{m+1/2} of the two trajectories converge to each other. The positive part of the Lyapunov spectrum of the model is consistently very close to that of the Lorenz-95 model, with 13 positive Lyapunov exponents, and a much broader negative part of the Lyapunov spectrum. However, we observed that the relaxation time of two such trajectories is quite long (typically τ = 10), so that it seems difficult to break the system down into fast and slow dynamics. A free run (after spin-up) is displayed in Fig. 8. The peaks of the tracer are correlated with the waves of the Lorenz-95 model, though not in an obvious way (see Sect. 5).

The causality and propagation of information in this model are special, and presumably similar to those of much more complex online atmospheric chemistry models. This impacts the effectiveness of data assimilation. For instance, measuring a tracer plume at t_0 (actually a peak in this one-dimensional context) does not enable one to detect a swift change in the local wind at t_0. Only future observations of the tracer concentrations will enable a diagnosis of this change in the local wind. As a consequence, variational schemes such as 4D-Var and the IEnKS, which work over larger DAWs, appear to be ideal tools in this context.

4.2 Numerical tests

We have performed data assimilation tests of the IEnKS using this model in order to estimate the winds and concentrations, and the unknown parameters F and E. Initially, we had carried out the same tests but estimating E and λ instead of F and E. Parameters E and λ are typical of the kind one would like to control in an atmospheric chemistry model to improve forecasts and re-analyses, when they are not themselves the focus of interest (Bocquet, 2012). The results were quite similar to those presented here. Yet, because deposition and emission are antagonistic processes, the inverse problem of estimating them jointly is very ill posed, requiring a specific prior distribution for those two parameters. In the absence of such a strongly constraining prior, 4D-Var's performance would be hampered. That is why we chose to estimate F and E instead.

One of the potential difficulties in data assimilation with this model is the positivity of the concentrations c_{m+1/2} ≥ 0 and of the parameters F ≥ 0 and E ≥ 0. This problem can be dealt with straightforwardly in 4D-Var, since the positivity of the variables can be enforced by the minimizer, or by a change of variables that is easy to implement in this context.

Fig. 8. Time evolution of the wind (top) and concentration (bottom) fields of the coupled Lorenz-95 – tracer model.

The problem is more severe with the EnKF, since the Best Linear Unbiased Estimator (BLUE) analysis and the ensemble generation that are at the heart of the method will generate negative concentrations or parameters. A simple but fairly effective trick is to perform clipping by setting all negative values of the analysis to zero; this, however, is suboptimal and could also induce imbalances and harmful positive biases. A more elegant solution is to perform an analytical anamorphosis (Cohn, 1997; Bocquet et al., 2010; Simon and Bertino, 2012), so that the BLUE analysis and the ensemble generation are carried out in a space where the variables are defined on R and their statistics are closer to Gaussian. For instance, in our case one could perform the state augmentation using the extended state vector

z = [x_1, ..., x_M, \ln c_{1/2}, ..., \ln c_{M-1/2}, \ln F, \ln E]^T. \quad (18)

Note that in practice this problem does not apply to F, which is well estimated and close to F = 8 because of the strong sensitivity of the model to F. The choice of ln(F) as the parameter to be estimated is only justified by the need for a homogeneous error metric for the two parameters.

Our numerical tests (EnKF as well as IEnKS) showed that the anamorphosis on the concentration variables is useless for improving precision, and can even lead to instability. This is at variance with the findings of Simon and Bertino (2012), who applied an anamorphosed analysis to a 1D ocean ecosystem model, and who also found benefit in using anamorphosis on the state variables. Choosing a more complex gamma or lognormal distribution for the anamorphosis function would avoid favoring large concentration values, as does the instability-prone logarithm anamorphosis (L. Bertino, personal communication, 2013). Aside from the choice of the anamorphosis function, this difference can also be explained by the fact that static anamorphosis is more efficient on distributions that are not too dynamical. In addition, we found that occurrences of negative concentrations in the analysis and the posterior ensemble are extremely rare in the present case. By contrast, we found that the anamorphosis on E is useful and avoids instabilities. Because parameters F and E are not observed, their anamorphosis is a mere change of variables (as in 4D-Var) that does not require much work. Therefore, in the following, the extended state vector will be

z = [x_1, ..., x_M, c_{1/2}, ..., c_{M-1/2}, \ln F, \ln E]^T. \quad (19)

A twin experiment, similar to that described in Sect. 3, is performed with ∆t = 0.05. The winds and the concentrations are fully observed, with R_d = I_d, d = M = 40, in the wind observation space as well as in the tracer concentration space. The observations are generated from the truth and perturbed according to these error statistics. All runs are performed over 10^5 cycles after a burn-in period of 5 × 10^3 cycles. The following methods are compared:

– The SDA IEnKS with S = 1.

– The MDA IEnKS with S = 1. The {β_k}_{1≤k≤L} are chosen to be uniform in the DAW and constant in time.

– The EnKS, with S = 1.

– 4D-Var with S = 1, corresponding to overlapping windows, i.e., quasi-static conditions. The scaling of the background is tuned so as to minimize the global RMSE (on all 82 extended variables).

To avoid tuning inflation, the finite-size variants are employed: SDA/MDA IEnKS-N and EnKS-N.
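To make the augmented state formalism of Eq. (19) concrete, here is a minimal sketch (with our own naming conventions, not the authors' code) of how the extended control vector can be assembled and propagated, with the dynamics acting on the physical variables and a persistence model carrying the log-transformed parameters through the forecast:

import numpy as np

M = 40  # number of wind variables (and of tracer cells)

def pack_state(x, c, F, E):
    """Build the extended control vector z = [x, c, ln F, ln E] of Eq. (19)."""
    return np.concatenate([x, c, [np.log(F)], [np.log(E)]])

def unpack_state(z):
    """Recover winds, concentrations and (positive) parameters from z."""
    x = z[:M]
    c = z[M:2 * M]
    F = np.exp(z[2 * M])      # the log transform guarantees F > 0
    E = np.exp(z[2 * M + 1])  # and E > 0 after any (ensemble) analysis
    return x, c, F, E

def forecast_member(z, model_step):
    """Propagate one ensemble member: dynamics for (x, c), persistence for (F, E)."""
    x, c, F, E = unpack_state(z)
    x, c = model_step(x, c, F, E)   # e.g. an RK4 step of the coupled model
    return pack_state(x, c, F, E)   # parameters are carried over unchanged

The exponential in unpack_state is the inverse anamorphosis of the parameters: any value produced by the analysis maps back to strictly positive F and E.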

Fig. 9. Mean filtering and smoothing analysis root mean square errors of the wind variables (left) and concentration variables (right) of the online tracer model, as a function of the DAW length for the IEnKS (finite-size variant), the EnKF/EnKS, and 4D-Var (with optimal inflation of the prior).

The time-averaged analysis RMSEs on the wind and concentration variables are plotted in Fig. 9 as a function of the DAW length. Both the mean filtering and smoothing RMSEs are reported. Again, the results are consistent with those of Kalnay et al. (2007). 4D-Var is not as precise as the EnKF/EnKS for short DAWs (L ≤ 20), but it outperforms the EnKF/IEnKF for large DAWs, in both filtering and smoothing. Moreover, the IEnKS significantly outperforms the EnKS/EnKF in all regimes, for both filtering and smoothing. In terms of performance, the difference between the SDA IEnKS and the MDA IEnKS is very similar to that reported in Sect. 3. However, the RMSE differences are much weaker, which may be explained by the doubled number of observations.

The RMSEs of the logarithm of the two parameters, i.e.,

\mathrm{RMSE} = \sqrt{\frac{1}{2}\left(\ln F^{a} - \ln F^{t}\right)^2 + \frac{1}{2}\left(\ln E^{a} - \ln E^{t}\right)^2}, \quad (20)

where F^t = 8 and E^t = 1, are plotted in Fig. 10 as a function of the DAW length. The filters and smoothers perform significantly better than 4D-Var. The EnKF/EnKS and 4D-Var remain quite far from the performance of the SDA and MDA IEnKS. This shows that smoothing over a large window and flow-dependent error statistics are both crucial for the parameter estimation.

Fig. 10. Mean filtering and smoothing analysis root mean square errors of the two parameters of the online tracer model, as a function of the DAW length for the IEnKS (finite-size variant), the EnKF/EnKS, and 4D-Var (with optimal inflation of the prior).

5 Discussion

One of the possible limitations of the IEnKS is the potentially large average number of iterations. The number of required ensemble propagations could be the most costly part of the algorithm for complex high-dimensional models. We have seen in Sect. 3.2, in the Lorenz-95 context with non-overlapping windows (S = L) and at moderate S (1 ≤ S ≤ 15), that the IEnKS performs well, and better than the EnKF/EnKS and 4D-Var. In the case of weak nonlinearity (∆t = 0.05), only one iteration of the minimization is required on average for the computation of the sensitivities Y_{k,(j)}. Additionally accounting for the propagation of the ensemble in the (ensemble) forecast step, an average of two propagations of the ensemble through the DAW is required. A further exploration of the computational performance of the IEnKS is out of the scope of this article, but it seems quite promising for the success of the IEnKS with complex models.

In the numerical experiments, parameters F and E were chosen to be static. This type of parameter is frequent and of great interest for geophysical systems. Yet, such static parameters make 4D-Var and the IEnKS with a large DAW ideal tools. When the parameters evolve in time, the variational methods may not perform as well as the EnKF, EnKS and IEnKS with smaller DAWs. In particular, the persistence model for the parameters becomes imperfect. We have repeated the same experiment as in Sect. 3.3, but with F varying in time according to a sinusoid and a step-wise function, within the interval [7.5, 8.5], with a period of one year (1456 time units of the Lorenz-95 model). Not only is model error made intrinsic by incorporating parameter F in the control variables, as has been done so far, but model error also becomes extrinsic because the assumed persistence model for F is wrong (permanently in the sinusoid case and intermittently in the step-wise case).
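The exact functional forms of the time-varying truths are not spelled out in the text; the following is only one plausible realization, consistent with the stated interval [7.5, 8.5] and the one-year period, given here to fix ideas:

import numpy as np

PERIOD = 1456.0  # one "year" in Lorenz-95 time units

def forcing_sinusoid(t):
    """Truth for F oscillating smoothly within [7.5, 8.5]."""
    return 8.0 + 0.5 * np.sin(2.0 * np.pi * t / PERIOD)

def forcing_stepwise(t):
    """Truth for F alternating between 7.5 and 8.5 every half period."""
    return 7.5 if (t % PERIOD) < PERIOD / 2.0 else 8.5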

Some results are displayed in Fig. 11. The evolution of the retrospective analysis of F is shown for the EnKF-N, the EnKS-N L = 50, the MDA IEnKS-N L = 50 S = 1, and 4D-Var L = 50 S = 1. The RMSEs are indicated in parentheses in the legends. Although the IEnKS-N L = 50 remains the best performer in both cases, the gap in performance is narrower, because of the incorrect persistence assumption within the DAW. Let us remark that, in these cases, the RMSE of the retrospective analysis of the IEnKS is different from the RMSE of the filtering analysis, because the truth that serves as a point of comparison changes within the DAW. Note also that, because of the imperfection of the persistence model, a multiplicative inflation of 1.01 of the ensemble anomalies has been applied to the finite-size methods, since they are not meant to intrinsically account for extrinsic model error (Bocquet et al., 2011), whereas the EnKF requires an inflation of 1.05 here to account for both model and sampling errors.

Fig. 11. Comparison of the retrospective analysis of F using the EnKF, the EnKS L = 50, 4D-Var L = 50 and the MDA IEnKS L = 50, when the truth (black dashed line) evolves as a sinusoid (left) and a step-wise function (right).


The last and main point of the discussion is dedicated to the improvement of the 4D-Var background statistics and its comparison to the IEnKS. The background error statistics that determine the prior of variational methods, such as 4D-Var and the IEnKS, have less impact with longer data assimilation windows. These methods estimate the background state by a forecast of the analysis or of the posterior ensemble. Nevertheless, in the context of our low-order models, the IEnKS outperforms 4D-Var, especially in the joint state and parameter case. Therefore, the difference should lie in the specification of the background error covariance matrix. It could be that the time dependence of the background error statistics remains essential in the limit of long DAWs. Or it could be that the climatological background statistics in our implementation of 4D-Var are poorly specified.

To explore those hypotheses, we derived climatological statistics of the errors on the initial state of the DAW, inferred from the SDA IEnKS-N. We first considered the Lorenz-95 model without parameter estimation. Since the system is statistically homogeneous, the error covariance matrix is circulant, so that it can be represented by a one-dimensional structure function that depends on the distance between sites on the circle. The structure function is plotted in Fig. 12 instead of the full correlation matrix. When L is varied, the differences are small, except in the case L = 1, which shows slightly modified next-to-nearest correlations. The related covariance matrix, defined up to an optimally tuned scaling parameter, is used in 4D-Var as a new prior in place of the identity matrix. The new 4D-Var scores barely change from those obtained with a covariance matrix proportional to the identity. This is consistent with the findings of Kalnay et al. (2007), who also tried, in a similar experimental context, to improve the performance of 4D-Var with a finer error covariance structure.

Fig. 12. Structure function of the correlation of the errors of the initial condition in the IEnKS, for several DAW lengths.
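Because of this homogeneity, such a climatological prior can be assembled directly from the one-dimensional structure function. The sketch below uses illustrative names of our own, with the overall scaling left as the tuning parameter mentioned above:

import numpy as np

def circulant_covariance(structure_fn, variance, M=40):
    """Build a circulant background covariance matrix B on a circle of M sites.

    structure_fn : 1-D array of length M giving the correlation as a
                   function of the (periodic) distance between two sites
    variance     : scalar scaling (tuned, e.g., to minimize the global RMSE)
    """
    B = np.empty((M, M))
    for i in range(M):
        for j in range(M):
            d = min(abs(i - j), M - abs(i - j))  # distance on the circle
            B[i, j] = variance * structure_fn[d]
    return B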


Fig. 13. Structure functions of the correlation of the errors of the initial condition from the IEnKS applied to the online tracer model. The structure functions for the wind correlation, for the concentration correlation, and for the cross-correlation between winds and concentrations are shown in the left panel. The full correlation matrix is displayed in the right panel.

In a second experiment, we derived the climatological statistics of the errors on the initial extended state vector, in the Lorenz-95 case where F is unknown and estimated. The error covariance matrix turns out to be almost identical to that of the case with a fixed F = 8. The only difference lies in the covariances that involve F. As expected, the correlation between the error on F and the error on any state variable is uniform. The covariance matrix proportional to the identity that was used in Sect. 3.3 is clearly not a good model for this case. Furthermore, the errors on F and on the state variables are not homogeneous, so that, again, choosing an error covariance matrix proportional to the identity matrix is not ideal. The climatological statistics of the errors have been inferred from the SDA IEnKS-N, with the state vector augmented to incorporate F. This was done for each L, because the ratio of the variance of the error on a typical state variable to the variance of the error on F is a non-uniform but increasing function of L. Using this procedure, we did not obtain a real improvement in the state variable RMSE. However, the precision of the parameter estimation was remarkably improved, to the level of that of the IEnKS-N. This shows that a fine specification of the background statistics is very helpful in the estimation of the static parameter F. Nevertheless, these statistics are static, and they do not significantly aid the estimation of the rapidly changing state variables.

In the last experiment, we applied the same procedure to the online tracer model based on the Lorenz-95 model. The error covariance matrix derived from the SDA IEnKS, for several L, entails significant covariances between the wind field and the concentration field. Again, because of the statistical homogeneity of the subsystems, one can represent the correlations of the winds and of the concentrations, and the cross-correlations between the winds and the concentrations, by structure functions. These structure functions, obtained from the IEnKS-N with L = 20, are displayed in Fig. 13. We believe that the structure function of the cross-correlations is non-symmetric because of the preferred orientation of the winds: the waves of the Lorenz-95 model preferentially travel westward and create tracer fronts on one preferred side of the wave, yielding a non-trivial cross-correlation structure function. For 4D-Var, results similar to those of the previous experiment were obtained. Using the climatological priors, the errors of the state variables barely decrease. The fine correlations that build up between the errors of the wind and concentration variables are dynamical, and seem to be of little use when averaged into a climatological background error covariance matrix. However, the parameters are much better estimated. Nevertheless, unlike in the previous experiment, they do not quite match the precision of the IEnKS: for instance, in the L = 1 case, the 4D-Var RMSE of the logarithm of the parameters is reduced from 1.5 × 10^-1 to 1.3 × 10^-2, but remains far from the 1.0 × 10^-3 of the IEnKS.
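For reference, structure functions such as those of Figs. 12 and 13 can be estimated from a record of error samples by averaging the correlations over all pairs of sites separated by the same distance on the circle. A possible sketch (our own, assuming one error field per row of the sample array) is:

import numpy as np

def structure_function(errors):
    """Correlation of the errors as a function of the distance on the circle.

    errors : array of shape (n_samples, M), one error field per row
    """
    n, M = errors.shape
    anom = errors - errors.mean(axis=0)
    cov = anom.T @ anom / (n - 1)
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)
    # average the correlation over all pairs of sites separated by distance d
    return np.array([np.mean([corr[i, (i + d) % M] for i in range(M)])
                     for d in range(M)])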


6 Conclusion

In this article, the iterative ensemble Kalman smoother (IEnKS) has been explored numerically. Using the Lorenz-95 low-order model, it has been compared to the ensemble Kalman filter (EnKF) and the standard ensemble Kalman smoother (EnKS). It has also been compared to a 4D-Var, for a wide range of data assimilation window (DAW) lengths. The IEnKS systematically outperformed the EnKF, EnKS and 4D-Var. This conclusion holds true even when the background error covariance matrix of 4D-Var is better specified and tuned.

The IEnKS has been extended to joint state and parameter estimation, using the augmented state formalism. Endowed with assets of both 4D-Var and ensemble Kalman methods, the IEnKS appeared to us as an ideal candidate method to tackle such problems. It was applied to the joint estimation of the Lorenz-95 state vector and its forcing parameter F. The IEnKS outperformed the EnKF, EnKS and 4D-Var for a wide range of lengths of the DAW. In addition, the estimation of F was shown to be even more precise than with the standard methods.

Motivated by future applications of the IEnKS to atmospheric chemistry models, where the estimation of the forcings is crucial, we introduced an extension of the Lorenz-95 model, adding a tracer field advected by the Lorenz-95 field. Key parameters of the tracer emission and deposition were also meant to be estimated. Again, the IEnKS managed to finely estimate the parameters without any tuning.

A better specification of the error covariance matrix of 4D-Var (obtained from the IEnKS) led to a spectacular improvement in the estimation of the static parameters. Yet, it did not help in the improvement of the joint estimation of the state variables. This stresses the importance of the time dependence of the error covariance matrix for rapidly varying variables. By contrast, the IEnKS completely avoids the need to build any background error covariance matrix.

One way to account for model error is to parameterize this error and estimate the related parameters, as was suggested in this study. Another way is to implement a weak-constraint formulation of the underlying variational problem. However, this formulation remains to be defined for the IEnKS, whereas it is already implemented in the standard EnKS.

Following this study, we are planning to test the IEnKS on a more complex low-order model with several reactive species, and to test the estimation of the concentration variables as well as of some parameters such as kinetic constants. If this is successful, our eventual plan is to implement the method on a high-dimensional air quality model. However, the definition of a satisfying implementation of localization in the IEnKS context will first be needed (work in progress).


Acknowledgements. The authors thank the editor, Olivier Talagrand, the reviewers, Laurent Bertino and Takemasa Miyoshi, for their useful comments and suggestions. The authors also thank the organizers of the International Conference on Ensemble Methods in Geophysical Sciences held in Toulouse, France, from the 12th to the 16th of November 2012. Edited by: O. Talagrand Reviewed by: L. Bertino and T. Miyoshi

References Aksoy, A., Zhang, F., and Nielsen-Gammon, J.: Ensemble-based simultaneous state and parameter estimation in a two-dimensional sea-breeze model, Mon. Weather Rev., 134, 2951–2969, 2006. Barbu, A. L., Segers, A. J., Schaap, M., Heemink, A. W., and Builtjes, P. J. H.: A multi-component data assimilation experiment directed to sulphur dioxide and sulphate over Europe, Atmos. Environ., 43, 1622–1631, 2009. Bishop, C. H., Etherton, B. J., and Majumdar, S. J.: Adaptive Sampling with the Ensemble Transform Kalman Filter. Part I: Theoretical Aspects, Mon. Weather Rev., 129, 420–436, 2001. Bocquet, M.: Parameter field estimation for atmospheric dispersion: Application to the Chernobyl accident using 4D-Var, Q. J. R. Meteorol. Soc., 138, 664–681, doi:10.1002/qj.961, 2012. Bocquet, M. and Sakov, P.: Combining inflation-free and iterative ensemble Kalman filters for strongly nonlinear systems, Nonlin. Processes Geophys., 19, 383–399, doi:10.5194/npg-19-3832012, 2012. Bocquet, M. and Sakov, P.: An iterative ensemble Kalman smoother, Q. J. R. Meteorol. Soc., doi:10.1002/qj.2236, in press, 2013. Bocquet, M., Pires, C. A., and Wu, L.: Beyond Gaussian statistical modeling in geophysical data assimilation, Mon. Weather Rev., 138, 2997–3023, 2010. Bocquet, M., Wu, L., and Chevallier, F.: Bayesian design of control space for optimal assimilation of observations. I: Consistent multiscale formalism, Q. J. R. Meteorol. Soc., 137, 1340–1356, doi:10.1002/qj.837, 2011. Bowler, N., Flowerdew, J., and Pring, S.: Tests of different flavours of EnKF on a simple model, Q. J. R. Meteorol. Soc., 139, 1505– 1519, doi:10.1002/qj.2055, 2013. Buehner, M., Houtekamer, P. L., Charette, C., Mitchell, H. L., and He, B.: Intercomparison of Variational Data Assimilation and the Ensemble Kalman Filter for Global Deterministic NWP. Part I: Description and Single-Observation Experiments, Mon. Weather Rev., 138, 1550–1566, 2010. Chen, Y. and Oliver, D. S.: Ensemble Randomized Maximum Likelihood Method as an Iterative Ensemble Smoother, Math. Geosci., 44, 1–26, 2012. Cohn, S. E.: An Introduction to Estimation Theory, J. Meteor. Soc. Japan, 75, 257–288, 1997. Cosme, E., Brankart, J.-M., Verron, J., Brasseur, P., and Krysta, M.: Implementation of a Reduced-rank, square-root smoother for ocean data assimilation, Ocean Model., 33, 87–100, 2010. Cosme, E., Verron, J., Brasseur, P., Blum, J., and Auroux, D.: Smoothing problems in a Bayesian framework and their linear Gaussian solutions, Mon. Weather Rev., 140, 683–695, 2012. Elbern, H., Strunk, A., Schmidt, H., and Talagrand, O.: Emission rate and chemical state estimation by 4-dimensional variational


inversion, Atmos. Chem. Phys., 7, 3749–3769, doi:10.5194/acp7-3749-2007, 2007. Evensen, G.: The Ensemble Kalman Filter: Theoretical Formulation and Practical Implementation, Ocean Dynam., 53, 343–367, 2003. Evensen, G.: Data Assimilation: The Ensemble Kalman Filter, Springer-Verlag, 2nd Edn., 2009. Evensen, G. and van Leeuwen, P. J.: An Ensemble Kalman Smoother for Nonlinear Dynamics, Mon. Weather Rev., 128, 1852–1867, 2000. Fairbairn, D., Pring, S. R., Lorenc, A. C., and Roulstone, I.: A comparison of 4DVar with ensemble data assimilation methods, Q. J. R. Meteorol. Soc., 0, 0–0, in press, 2013. Gu, Y. and Oliver, D. S.: An Iterative Ensemble Kalman Filter for Multiphase Fluid Flow Data Assimilation, SPE Journal, 12, 438– 446, 2007. Hunt, B. R., Kalnay, E., Kostelich, E. J., Ott, E., D. J. Patil, D. J., Sauer, T., Szunyogh, I., Yorke, J. A., and Zimin, A. V.: Fourdimensional ensemble Kalman filtering, Tellus A, 56, 273–277, 2004. Hunt, B. R., Kostelich, E. J., and Szunyogh, I.: Efficient data assimilation for spatiotemporal chaos: A local ensemble transform Kalman filter, Physica D, 230, 112–126, 2007. Jackson, C., Sen, M. K., and Stoffa, P. L.: An efficient stochastic Bayesian approach to optimal parameter and uncertainty estimation for climate model predictsions, J. Climate, 17, 2828–2841, 2004. Kalnay, E., Li, H., Miyoshi, T., Yang, S.-C., and Ballabrera-Poy, J.: 4D-Var or ensemble Kalman filter?, Tellus A, 59A, 758–773, 2007. Kazantsev, E.: Sensitivity of a shallow-water model to parameters, Nonlinear Anal. R. World Appl., 13, 1416–1428, doi:10.1016/j.nonrwa.2011.11.006, 2012. Kondrashov, D., Sun, C., and Ghil, M.: Data assimilation for a coupled ocean atmosphere model. Part II: Parameter estimation, Mon. Weather Rev., 5, 5062–5076, 2008. Le Dimet, F.-X. and Talagrand, O.: Variational algorithms for analysis and assimilation of meteorological observations: Theoretical aspects, Tellus A, 38, 97–110, 1986. Liu, C., Xiao, Q., and Wang, B.: An Ensemble-Based FourDimensional Variational Data Assimilation Scheme. Part I: Technical Formulation and Preliminary Test, Mon. Weather Rev., 132, 3363–3373, 2008. Liu, Y., Gupta, H. V., Sorooshian, S., Bastidas, L. A., and Shuttlewort, W. J.: Constraining land surface and atmospheric parameters of a locally coupled model using observational data, J. Hydrometeorol., 6, 156–172, 2005.


Lorenz, E. N. and Emmanuel, K. E.: Optimal sites for supplementary weather observations: simulation with a small model, J. Atmos. Sci., 55, 399–414, 1998. Montmerle, T. and Berre, L.: Diagnosis and formulation of heterogeneous background-error covariances at the mesoscale, Q. J. R. Meteorol. Soc., 136, 1408–1420, 2010. Posselt, D. J. and Bishop, C. H.: Nonlinear Parameter Estimation: Comparison of an Ensemble Kalman Smoother with a Markov Chain Monte Carlo Algorithm, Mon. Weather Rev., 140, 1957– 1974, 2012. Pulido, M. and Thuburn, J.: Gravity wave drag estimation from global analyses using variational data assimilation principles. Part II: A case study, Q. J. R. Meteorol. Soc., 132, 1527–1543, 2006. Rabier, F., Järvinen, H., Klinker, E., Mahfouf, J.-F., and Simmons, A.: The ECMWF operational implementation of fourdimensional variational assimilation. I: Experimental results with simplified physics, Q. J. R. Meteorol. Soc., 126, 1143–1170, 2000. Raynaud, L., Berre, L., and Desroziers, G.: Objective filtering of ensemble-based background-error variances, Q. J. R. Meteorol. Soc., 135, 1177–1199, 2009. Ruiz, J. J., Pulido, M., and Miyoshi, T.: Estimating model parameters with ensemble-based data assimilation: A Review, J. Meteorol. Soc. Japan, 91, 79–99, doi:10.2151/jmsj.2013-201, 2013. Sakov, P., Evensen, G., and Bertino, L.: Asynchronous data assimilation with the EnKF, Tellus A, 62, 24–29, 2010. Sakov, P., Oliver, D., and Bertino, L.: An iterative EnKF for strongly nonlinear systems, Mon. Weather Rev., 140, 1988–2004, 2012. Simon, E. and Bertino, L.: Gaussian anamorphosis extension of the DEnKF for combined state parameter estimation: application to a 1D ocean ecosystem model, J. Mar. Syst., 89, 1–18, 2012. Talagrand, O. and Courtier, P.: Variational Assimilation of Meteorological Observation with the Adjoint Vorticity Equation. I: Theory, Q. J. R. Meteorol. Soc., 113, 1311–1328, 1987. Vossepoel, F. C. and van Leeuwen, P. J.: Parameter Estimation Using a Particle Method: Inferring Mixing Coefficients from Sea Level Observations, Mon. Weather Rev., 135, 1006–1020, 2007. Weir, B., Miller, R. N., and Spitz, Y. H.: Implicit Estimation of Ecological Model Parameters, Bull. Math. Biol., 75, 223–257, doi:10.1007/s11538-012-9801-6, 2013. Wirth, A. and Verron, J.: Estimation of friction parameters and laws in 1.5D shallow-water gravity currents on the f-plane, by data assimilation, Ocean Dynam., 58, 247–257, 2008. Zhang, Y., Bocquet, M., Mallet, V., Seigneur, C., and Baklanov, A.: Real-Time Air Quality Forecasting, Part II: State of the Science, Current Research Needs, and Future Prospects, Atmos. Environ., 60, 656–676, doi:10.1016/j.atmosenv.2012.02.041, 2012.
