Dynamic Meteorology: Data Assimilation Methods - UCLA

APPLICATIONS OF ESTIMATION THEORY TO NUMERICAL WEATHER PREDICTION

M. Ghil*, S. Cohn, J. Tavantzis**, K. Bube, and E. Isaacson

Courant Institute of Mathematical Sciences, New York University, New York, New York 10012

* Also Laboratory for Atmospheric Sciences, NASA Goddard Space Flight Center, Greenbelt, MD 20771

** Department of Mathematics, New Jersey Institute of Technology, Newark, N.J. 07102

Abstract

Numerical weather prediction (NWP) is an initial-value problem for a system of nonlinear partial differential equations (PDEs) in which the initial values are known only incompletely and inaccurately. Data at initial time can be supplemented, however, by observations of the system distributed over a time interval preceding it. Estimation theory has been successful in approaching such problems for models governed by systems of ordinary differential equations and of linear PDEs. We develop methods of sequential estimation for NWP.

A model exhibiting many features of large-scale atmospheric flow important in NWP is the one governed by the shallow-fluid equations. We study first the estimation problem for a linearized version of these equations. The vector of observations corresponds to the different atmospheric quantities measured and space-time patterns associated with conventional and satellite-borne meteorological observing systems. A discrete Kalman-Bucy (K-B) filter is applied to a finite-difference version of the equations, which simulates the numerical models used in NWP.

The specific character of the equations' dynamics gives rise to the necessity of modifying the usual K-B filter. The modification consists in eliminating the high-frequency inertia-gravity waves which would otherwise be generated by the insertion of observational data. The modified filtering procedure developed here combines in an optimal way dynamic initialization (i.e., elimination of fast waves) and four-dimensional (space-time) assimilation of observational data, two procedures which traditionally have been carried out separately in NWP. Comparisons between the modified filter and the standard K-B filter have been made.

The matrix of weighting coefficients, or filter, applied to the observational corrections of state variables converges rapidly to an asymptotic, constant matrix. Using realistic values of observational noise and system noise, this convergence has been shown to occur in numerical experiments with the linear system studied; it has also been analyzed theoretically in a simplified, scalar case. The relatively rapid convergence of the filter in our simulations leads us to expect that the filter will be efficiently computable for operational NWP models and real observation patterns.

Our program calls for the study of the asymptotic filter's dependence on observation patterns, noise levels, and the system's dynamics. Furthermore, the covariance matrices of system noise and observational noise will be determined from the data themselves in the process of sequential estimation, rather than be assigned predetermined, heuristic values. Finally, the estimation procedure will be extended to the full, nonlinear shallow-fluid equations.

CONTENTS

1. Introduction

2. A Review of the State-Space Approach to Estimation

2.1 Statistical considerations in estimation: a simple illustration

2.2 Estimation for stochastic-dynamic systems

2.3 General remarks on the Kalman-Bucy (K-B) filter

3. Estimation for the Shallow-Water Equations

3.1 The equations

3.2 Discretization

3.3 The modified K-B filter

3.4 Observational pattern and choice of parameters

3.5 Previous work

4. Results

4.1 Estimation for a perfect, noise-free model

4.2 Estimation in the presence of system noise

4.3 Theoretical analysis of the scalar case

5. Concluding Remarks

References

Appendix A. List of Major Symbols


1. INTRODUCTION

One of the main reasons we cannot tell what the weather will be tomorrow is that we do not know what the weather is today. In other words, numerical weather prediction (NWP) is an initial-value problem for which initial data are not available in sufficient quantity and with sufficient accuracy.

Numerical forecasts are produced now routinely on a daily basis by a number of weather services in different countries. The models used in NWP are discretized versions of the partial differential equations (PDEs) governing large-scale atmospheric flow. The discretization is performed by finite differencing, finite element or spectral representations. The number of degrees of freedom of the discretized models is typically of the order of $10^5$-$10^6$. The spatial domain of the models is the entire globe or at least an entire hemisphere.

A large number of observations is made by the conventional, ground-based meteorological network, coordinated by the World Weather Watch (WWW). They consist of point values of temperature, humidity, pressure and horizontal velocity. These observations, of the order of 10[...] in number, are produced at the so-called synoptic times, 0000 GMT and 1200 GMT. It is customary therefore to choose a synoptic initial time for a numerical forecast.

Conventional observations are insufficient in number in order to determine the initial state of the model atmosphere. Furthermore, they are very unevenly distributed in space, being concentrated over the continents of the Northern Hemisphere, and much sparser over the oceans and over the Southern Hemisphere (Fig. 1).

A number of additional observations are made at the so-called subsynoptic times, 0600 GMT and 1800 GMT. A still larger number of observations, exceeding by now that given by the WWW network, is gathered in an essentially time-continuous manner from polar-orbiting satellites and other non-conventional measuring platforms (Fleming et al., 1979 a,b, and references therein). All these observations together form a rather bewildering array by their uneven distribution in space and time, as well as by their different error characteristics.

In order to obtain the best possible estimate of the model state at the initial instant, NWP centers use the data available over a time interval preceding that instant. The most common procedure to use such data, called updating, was suggested by Charney et al. (1969). The model is provided with the best available data at some preceding instant, e.g., 24 h or 48 h earlier, and is integrated forward in time. Additional data replace the model values as they become available: the model is updated. When the model integration reaches the initial instant for the next scheduled forecast, its guess of the initial state is blended with the data available at that instant to produce the desired estimate. Thus the model itself is used to assimilate the data available up to the initial instant.

Variations of this procedure, as well as other procedures for the four-dimensional (4-D), space-time assimilation of meteorological observations are reviewed in Bengtsson (1975). The blending of observations and model forecast values has been made most recently using in an explicit manner the error structure of the data (Phillips, 1976; Rutherford, 1972). This error structure is determined from past data and the resulting linear regression coefficients are computed once and for all, their constant values being used all along the assimilation cycle of the model (McPherson et al., 1979; Schlatter et al., 1976). A modification of this approach, which combines more intimately the dynamics of the model with the time-continuously changing observation patterns, appears in Ghil et al. (1979).

Clearly, a mathematical framework well suited for the 4-D assimilation problem of NWP is the state-space approach of estimation theory. It deals with the estimation of stochastic processes which are generated by randomly perturbed differential or difference equations. This approach was first formulated for processes governed by linear systems of ordinary differential equations (ODEs: Kalman, 1960; Kalman and Bucy, 1961); its results are widely known as the Kalman or the Kalman-Bucy (K-B) filter. We expect that applications of the K-B filter, with suitable modifications, to 4-D data assimilation will provide additional physical insight into this field's outstanding problems and eventually lead to better practical algorithms for solving them.

The purpose of the present article is to apply the K-B filter to the linearized shallow-water equations and learn as much as possible from this application about properties of the filter relevant to operational 4-D assimilation. This linear system already contains important features of the equations used in operational models, and our application should be instructive.

We present a brief review of the K-B filter in Section 2. The dynamical model and observing pattern studied are presented in Section 3, along with the modification of the K-B filter necessitated by the system's dynamics. Numerical results and some analytical ones follow in Section 4. A discussion of the results, comparison with operational practice and conclusions follow in Section 5.

2. A REVIEW OF THE STATE-SPACE APPROACH TO ESTIMATION

The intent of this section is to familiarize the reader with the ideas and methods of sequential estimation. Systematic, more or less rigorous expositions of the theory are in existence for the interested reader (Bucy and Joseph, 1968; Curtain and Pritchard, 1978; Davis, 1977; Gelb, 1974; Jazwinski, 1970). Here we shall stay on the purely formal, and hopefully intuitive, level.

2.1 Statistical Considerations in Estimation: A Simple Illustration

Given a quantity $x$, suppose that two independent measurements of this quantity, $x_1$ and $x_2$, are available. For instance $x$ could be the temperature in a room and $x_1$ and $x_2$ the readings of two thermometers placed in the room. In the absence of any additional information about $x$, it is natural to seek an estimate of $x$, $\hat{x}$ say, as a linear function of $x_1$ and $x_2$,

$$\hat{x} = a_1 x_1 + a_2 x_2. \tag{2.1a}$$

The function itself is called an estimator; the estimate is its value.

We wish to determine $a_1$ and $a_2$ so that the estimate $\hat{x}$ will be optimal in some sense. The conditions to achieve such optimality depend on the nature of the measurements. Let us assume first of all that there are no systematic errors in the measurements $x_1$ and $x_2$, i.e. that if we repeat our measurements many times then their average would equal the true value $x$. In terms of the expectation operator $E$, this is written as

$$E(x_1 - x) = E(x_2 - x) = 0,$$

and we say the measurements are unbiased. It is natural to require that the estimate $\hat{x}$ also be unbiased:

$$E(\hat{x} - x) = 0;$$

this requirement is equivalent to

$$a_1 + a_2 = 1, \tag{2.1b}$$

so that (2.1a) becomes

$$\hat{x} = x_1 + a_2 (x_2 - x_1). \tag{2.1c}$$

Next it is assumed that the measurement errors $x_1 - x$ and $x_2 - x$ are uncorrelated:

$$E(x_1 - x)(x_2 - x) = 0,$$

and that their variances, $\sigma_1^2$ and $\sigma_2^2$, are known from previous measurements, viz. from instrument calibration. Then the variance of the estimation error, $\hat{\sigma}^2 = E(\hat{x} - x)^2$, is given by

$$\hat{\sigma}^2 = a_1^2 \sigma_1^2 + a_2^2 \sigma_2^2. \tag{2.2}$$

Suppose for the moment that in addition to satisfying Eq. (2.1b), $a_1$ and $a_2$ are nonnegative but otherwise arbitrary; the linear combination in Eq. (2.1a) is then said to be convex. Convexity will imply that $\hat{x}$ always lies between $x_1$ and $x_2$, and furthermore that

$$\hat{\sigma}^2 \le \max\{\sigma_1^2, \sigma_2^2\}. \tag{2.3a}$$
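The error variance (2.2) of a convex combination can be explored numerically. The following is a minimal sketch, not part of the original text: the two variances are hypothetical values chosen for illustration, and the function name `error_variance` is an assumption. It scans the convex weights allowed by (2.1b), confirms the bound (2.3a), and locates the weight on the grid that gives the smallest error variance.

```python
def error_variance(a2, var1, var2):
    """Error variance (2.2) for weights a1 = 1 - a2 and a2."""
    a1 = 1.0 - a2  # unbiasedness constraint (2.1b)
    return a1 ** 2 * var1 + a2 ** 2 * var2


# Hypothetical error variances of the two thermometers (illustrative only).
var1, var2 = 0.25, 1.0

# Scan the convex weights 0 <= a2 <= 1 on a uniform grid.
weights = [i / 100.0 for i in range(101)]
variances = [error_variance(a2, var1, var2) for a2 in weights]

worst = max(variances)                     # never exceeds max(var1, var2)
best = min(variances)                      # smallest variance on the grid
best_a2 = weights[variances.index(best)]   # weight that achieves it

print("largest error variance :", worst)
print("smallest error variance:", best, "at a2 =", best_a2)
```

For these values the smallest variance on the grid falls below both measurement variances, which is the improvement one hopes to obtain from a second measurement.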

While property (2.3a) is reassuring, one should be able to do better. We expect to be able to achieve

$$\hat{\sigma}^2 \le \min\{\sigma_1^2, \sigma_2^2\}; \tag{2.3b}$$

otherwise there would be no point in making more than one measurement. The assumed knowledge of $\sigma_1^2$ and $\sigma_2^2$ will in fact yield (2.3b); it is accomplished by our optimality requirement.

This requirement can now be formulated precisely: $\hat{x}$ should be a minimum variance estimate, which in our case means that $\hat{\sigma}^2$ in Eq. (2.2) should be minimized with respect to $a_1$ and $a_2$, subject to (2.1b). The resulting optimal weights are

$$a_1 = \frac{\sigma_2^2}{\sigma_1^2 + \sigma_2^2} = \frac{\hat{\sigma}^2}{\sigma_1^2}, \tag{2.4a}$$

$$a_2 = \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2} = \frac{\hat{\sigma}^2}{\sigma_2^2}, \tag{2.4b}$$

where the optimal error variance $\hat{\sigma}^2$ is given by

$$\hat{\sigma}^{-2} = \sigma_1^{-2} + \sigma_2^{-2}. \tag{2.4c}$$

Notice that $a_1$ and $a_2$ are nonnegative, so that the optimal estimator, $\hat{x}$, is, in fact, convex. Moreover, the optimal weights (2.4a,b) satisfy the intuitive requirement that they should reflect our relative confidence in $x_1$ and $x_2$: if $\sigma_1$ is smaller than $\sigma_2$, for example, then $x_1$ is weighted more heavily. Notice also that formula (2.4c) for the optimal error variance immediately implies property (2.3b). In particular, when $\sigma_1 = \sigma_2 = \sigma$, one obtains $\hat{\sigma} = \sigma/\sqrt{2}$, which generalizes to $\hat{\sigma} = \sigma/\sqrt{N}$ for $N$ independent measurements of equal variance.

2.2 Estimation for Stochastic-Dynamic Systems

The purpose of sequential estimation theory for dynamic systems is to extend the simple ideas outlined above to the case in which the quantity $x$ of interest evolves in time according to a given (system of ordinary or partial) differential or difference equation(s). In this case, $x_1$ will represent the state of the system as determined from previous observations (measurements), while $x_2$ represents observations at the current time.

To stress the analogy, let us consider here a system of randomly perturbed difference equations for the state vector $x$:

$$x_{k+1} = \Psi x_k + \xi_k, \quad k = 0, 1, 2, \ldots; \tag{2.5a}$$

$x$ and $\xi$ have dimension $n$, and $\Psi$ is a constant $n \times n$ matrix. (See Appendix A for a list of recurring symbols.) In our application, $x_k$ will stand for the meteorological variables at time $k$ at the grid points of a global atmospheric prediction model; $\Psi$ stands for the finite-difference operator which advances the variables by one time step. The random vector sequence $\{\xi_k : k = 0, 1, 2, \ldots\}$ is assumed to be a white noise sequence with mean zero and covariance matrix $Q$,

$$E\xi_k = 0, \qquad E\xi_k \xi_\ell^T = Q\,\delta_{k\ell}. \tag{2.5b,c}$$

The transpose of a vector or matrix is indicated by $(\;)^T$ and $\delta_{k\ell}$ is the Kronecker delta, $\delta_{k\ell} = 0$ if $k \ne \ell$ and $\delta_{k\ell} = 1$ if $k = \ell$. The white noise $\xi$ represents dynamical and physical processes not described by the model $\Psi$, especially smaller-scale phenomena not resolved by the grid.

Suppose for the moment that an initial unbiased estimate $\hat{x}_0 = E x_0$ is available from observations at time zero, and that no further observations are available at later times. In this case the best estimate of $x_k$, $\hat{x}_k$ say, would be the mean of $x_k$, $\hat{x}_k = E x_k$, and is computed according to (2.5), by the recursion

$$\hat{x}_{k+1} = \Psi \hat{x}_k, \qquad \hat{x}_0 = E x_0;$$

this estimate can be improved if further observations become available.

We have seen in the introduction that the description of the atmospheric state at a given synoptic time from observations at that time is entirely inadequate and that we are interested in using observations at other times as well. This situation is modeled by assuming a continuous stream of observations

$$z_k = H x_k + \zeta_k, \quad k = 1, 2, 3, \ldots; \tag{2.6a}$$

$\zeta_k$ models observational errors and is also assumed to be a white noise sequence,

$$E\zeta_k = 0, \qquad E\zeta_k \zeta_\ell^T = R\,\delta_{k\ell}, \tag{2.6b,c}$$

uncorrelated with $\xi$,

$$E\xi_k \zeta_\ell^T = 0. \tag{2.6d}$$

The zero-mean assumptions (2.5b, 2.6b) are made for convenience only; they are not essential for the theory we describe here.

The observation vectors $z_k$ have dimension $p \le n$, and $H$ is a $p \times n$ matrix. This formulation describes in particular the meteorological situation in which the observations at any time $k$ are incomplete; measurements are made over some areas (continents) and not over others (oceans); at a given location some variables are measured and not others, e.g., geostationary satellites determine winds but not temperatures. Finally, some functional of the variables may be observed rather than a variable itself, e.g., polar-orbiting satellites measure radiances, which depend on vertical temperature profiles. Thus the entries of $H$ need not be only zero or one, i.e., $H$ is not necessarily a permutation matrix.

In fact, $H$, as well as $\Psi$, $Q$, and $R$, need not be constant in time; they are only taken constant here for simplicity. In particular, the rank of $H$, i.e., the effective dimension of $z_k$, can change from one time to the next: the largest number of observations are available at synoptic times, with fewer observations provided at subsynoptic times, and even fewer in between, at intermediate times.

The linearity assumption that $\Psi$ and $H$ are independent of $x$ is, however, important for the theory we shall use here. This assumption is essential in guaranteeing the optimality of the filtering algorithm (2.11) below. Extensions to the nonlinear case will be discussed in Sec. 5.

Having described our stochastic-dynamical model $(\Psi, H)$, Eqs. (2.5, 2.6), we are now in a position to make the connection with (2.1-2.4). Given an estimate $\hat{x}_k(+)$ based on all the observations up to and including the time $k$, the best prediction at time $k+1$, $\hat{x}_{k+1}(-)$, is simply

$$\hat{x}_{k+1}(-) = \Psi \hat{x}_k(+); \tag{2.7}$$

$\hat{x}_{k+1}(-)$ will be the analogue of $x_1$ in Section 2.1. The analogue of $x_2$ is the actual observation $z_{k+1}$ at time $k+1$. We wish to combine $\hat{x}_{k+1}(-)$ and $z_{k+1}$ in order to obtain an estimate $\hat{x}_{k+1}(+)$, which we require to be: (a) linear, (b) unbiased, and (c) optimal in some suitable sense.

In Sec. 2.1, we discussed the simple illustrative example of estimating the room temperature $x$ from the readings $x_1$ and $x_2$ of two thermometers. In that situation, the requirements (a) and (b) lead to formulae (2.1). The optimality condition was expressed in (2.4), which yields the minimum $\hat{\sigma}^2$ among all linear, unbiased estimators.

For our dynamic system $(\Psi, H)$, requirement (a) leads to the formula

$$\hat{x}_{k+1}(+) = \tilde{K}_{k+1} \hat{x}_{k+1}(-) + K_{k+1} z_{k+1}, \tag{2.8a}$$

which is analogous to (2.1a), while requirement (b) leads to

$$\tilde{K}_{k+1} = I - K_{k+1} H, \tag{2.8b}$$

which is analogous to (2.1b). Our estimator is therefore of the form

$$\hat{x}_{k+1}(+) = \hat{x}_{k+1}(-) + K_{k+1}\,(z_{k+1} - H \hat{x}_{k+1}(-)); \tag{2.9}$$

the analogy with (2.1c) is obvious. It remains for the gain matrix $K_{k+1}$ to reflect the relative uncertainties in $\hat{x}_{k+1}(-)$ and $z_{k+1}$. This will be achieved by imposing the optimality requirement (c), and will yield formulae analogous to (2.4).

In order to formulate the optimality criterion, we define first the estimation error covariance matrices

$$P_k(\pm) = E\,(\hat{x}_k(\pm) - x_k)(\hat{x}_k(\pm) - x_k)^T;$$

$P_{k+1}(-)$ is the counterpart of $\sigma_1^2$ and $P_{k+1}(+)$ is that of $\hat{\sigma}^2$. Using Eqs. (2.5) and (2.7), one finds that $P_k(+)$ is advanced by one time step to $P_{k+1}(-)$ according to

$$P_{k+1}(-) = \Psi P_k(+) \Psi^T + Q; \tag{2.10a}$$

the derivation of (2.10a) depends on the fact that $E\,\xi_k (\hat{x}_k(+) - x_k)^T = 0$ (cf. (2.5c)). Eqs. (2.6) and (2.9) imply that, in the presence of observations $z_{k+1}$, $P_{k+1}(+)$ is found from $P_{k+1}(-)$ by the formula

$$P_{k+1}(+) = (I - K_{k+1} H)\, P_{k+1}(-)\, (I - K_{k+1} H)^T + K_{k+1} R K_{k+1}^T. \tag{2.10b}$$

In the total absence of observations at time $k+1$, we have instead $\hat{x}_{k+1}(+) = \hat{x}_{k+1}(-)$ and $P_{k+1}(+) = P_{k+1}(-)$.

The estimation error covariance matrices implicitly contain all relevant information about the error structure of the current estimate.
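The two covariance steps can be sketched in a few lines of NumPy. This is an illustration only, under stated assumptions: the 2-by-2 system below is hypothetical, the function names are invented for the example, and Eq. (2.10b) is taken in the standard form $P(+) = (I - KH)P(-)(I - KH)^T + KRK^T$, which holds for an arbitrary gain $K$ in the estimator (2.9).

```python
import numpy as np


def predict_cov(P_plus, Psi, Q):
    """Advance the error covariance by one time step, Eq. (2.10a)."""
    return Psi @ P_plus @ Psi.T + Q


def update_cov(P_minus, K, H, R):
    """Error covariance after the update (2.9) with an arbitrary gain K
    (assumed form of Eq. (2.10b))."""
    A = np.eye(P_minus.shape[0]) - K @ H
    return A @ P_minus @ A.T + K @ R @ K.T


# Hypothetical two-variable system; only the first variable is observed.
Psi = np.array([[1.0, 0.1],
                [0.0, 1.0]])
Q = 0.01 * np.eye(2)
H = np.array([[1.0, 0.0]])
R = np.array([[0.5]])

P0 = np.eye(2)                      # initial error covariance P_0(+)
P_minus = predict_cov(P0, Psi, Q)   # forecast error covariance P_1(-)

K_zero = np.zeros((2, 1))           # zero gain: the observation is ignored
K_some = np.array([[0.5], [0.0]])   # an arbitrary, non-optimal gain

P_no_obs = update_cov(P_minus, K_zero, H, R)
P_some = update_cov(P_minus, K_some, H, R)

print("trace P(-)              :", np.trace(P_minus))
print("trace P(+), zero gain   :", np.trace(P_no_obs))
print("trace P(+), nonzero gain:", np.trace(P_some))
```

With zero gain the update leaves the covariance unchanged, which matches the no-observation case. Even the non-optimal gain used here reduces the trace of the error covariance for this particular example.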

The statistics of all errors committed up to and including time $k$ are accumulated in $P_k(+)$. Formula (2.10a) shows how this information is advanced to the next time step. For example, this equation determines how the presumably small errors committed over a continent, or other data-rich region, propagate over an ocean, or other data-sparse region. Equation (2.10b) then determines precisely the extent to which the estimate is improved by the new observations.

We have defined the estimation error covariance matrices and considered their changes in time. We are ready now to derive the optimal gain matrix by imposing the optimality requirement (c): it is required that $\hat{x}_{k+1}(+)$ be a minimum variance estimate in the sense that

$$J = E\,(\hat{x}_{k+1}(+) - x_{k+1})^T S\,(\hat{x}_{k+1}(+) - x_{k+1})$$

be minimized with respect to each element of $K_{k+1}$, for all symmetric, positive definite matrices $S$. In particular, for $S = I$, we see from the definition of $P_{k+1}(+)$ that the trace, or sum of the diagonal elements, of $P_{k+1}(+)$ is to be minimized. The trace of $P_{k+1}(+)$ is the expected mean-square (m-s) estimation error.

We recall that a symmetric matrix $S$, $S^T = S$, is positive definite if, for any vector $x \ne 0$, the scalar product $x^T S x > 0$. Since every such matrix has a factorization $S = C^T C$, one finds that in general

$$J = \operatorname{trace}\, C P_{k+1}(+) C^T.$$

Using $P_{k+1}(+)$ from Eq. (2.10b) and setting the derivative of $J$ with respect to each element of $K_{k+1}$ equal to zero, one finds that a unique, absolute minimum is attained at

$$K_{k+1} = P_{k+1}(-) H^T (H P_{k+1}(-) H^T + R)^{-1}.$$

This result is valid independently of $C$, and hence is the same for all possible positive definite matrices $S$. In other words, $K_{k+1}$ above minimizes simultaneously all reasonable measures, or norms, of the expected estimation error.

The formula above gives the so-called Kalman-Bucy (K-B) optimal gain matrix, or filter. Substituting this into (2.10b) yields the optimal error covariance matrix

$$P_{k+1}(+) = (I - K_{k+1} H)\, P_{k+1}(-).$$

Assuming the availability of an unbiased initial estimate

$$\hat{x}_0 = \hat{x}_0(+) = E x_0$$

and an initial estimation error covariance matrix

$$P_0 = P_0(+) = E\,(\hat{x}_0 - x_0)(\hat{x}_0 - x_0)^T,$$

the description of the Kalman filtering algorithm is now complete: for $k = 0, 1, 2, \ldots$, one computes in order

$$\hat{x}_{k+1}(-) = \Psi \hat{x}_k(+), \tag{2.11a}$$

$$P_{k+1}(-) = \Psi P_k(+) \Psi^T + Q, \tag{2.11b}$$

$$K_{k+1} = P_{k+1}(-) H^T (H P_{k+1}(-) H^T + R)^{-1}, \tag{2.11c}$$

$$P_{k+1}(+) = (I - K_{k+1} H)\, P_{k+1}(-), \tag{2.11d}$$

$$\hat{x}_{k+1}(+) = \hat{x}_{k+1}(-) + K_{k+1}\,(z_{k+1} - H \hat{x}_{k+1}(-)). \tag{2.11e}$$

In the absence of observations at time $k+1$, Eqs. (2.11c,d,e) are replaced by

$$K_{k+1} = 0, \tag{2.11c'}$$

$$P_{k+1}(+) = P_{k+1}(-), \tag{2.11d'}$$

$$\hat{x}_{k+1}(+) = \hat{x}_{k+1}(-). \tag{2.11e'}$$

Actually, the gain matrix sequence $\{K_k : k = 1, 2, 3, \ldots\}$ may be precomputed once and for all, i.e., for all realizations of the state and noise processes, $x_k$, $\xi_k$, $\zeta_k$. Indeed, (2.11b,c,d) do not depend on the estimates (2.11a,e). This is a result of the assumed linearity of the model $(\Psi, H)$.

To complete the analogy with our earlier illustrative example, notice that Eqs. (2.11c,d) can be rewritten as

$$P_{k+1}^{-1}(+) = P_{k+1}^{-1}(-) + H^T R^{-1} H, \tag{2.12a}$$

$$K_{k+1} = P_{k+1}(+) H^T R^{-1}. \tag{2.12b}$$

In our analogy, $(P_{k+1}(-), R, P_{k+1}(+), K_{k+1})$ correspond to $(\sigma_1^2, \sigma_2^2, \hat{\sigma}^2, a_2)$; Eqs. (2.12a,b) are analogous to (2.4c) and (2.4b), respectively.

Some intuitively appealing results follow, as in the simple example. For instance, Eq. (2.12a) implies that $P_{k+1}(+) \le P_{k+1}(-)$; the matrix inequality $A \le B$ means that $C = B - A$ is nonnegative semidefinite, $x^T C x \ge 0$ for all $x$. Eq. (2.12b) implies that if $R$ is small (large) then the observations $z_{k+1}$ are weighted more (less) heavily than the predictions $\hat{x}_{k+1}(-)$.

2.3 General Remarks on the K-B Filter

Before proceeding with the description of our dynamical and observational model $(\Psi, H)$, a number of theoretical remarks are in order.

Recall first that $\hat{x}_{k+1}(+)$ was chosen to be the optimal, minimum variance, unbiased estimate among all estimators of the form (2.8a). It is not clear that in computing $\hat{x}_{k+1}(+)$ all past observations $z_j$, $j = 1, 2, \ldots, k$, have been fully utilized. It can be shown, however (e.g. Jazwinski, 1970, Sec. 7.3), that our estimate is in fact the optimum unbiased estimate among all estimators which are linear combinations of all the available data

$$\hat{x}_{k+1} = \sum_j A_j z_j. \tag{2.13}$$

This wider optimality is due to assumptions (2.5c, 2.6c,d) that system errors and observational errors are uncorrelated in time. It can sometimes still be achieved without these assumptions (Jazwinski, 1970, Sec. 7.3, Examples 7.5-7.7).

Carrying the estimation error covariance matrices along in the computation makes it possible for the filtering algorithm to be sequential, or recursive: each observation is discarded as soon as it is processed. This sequential nature of the estimation makes the algorithm conceptually simple, as well as having great practical advantages. It is one of the major reasons for the broad applicability of Kalman filtering.

Another important feature of the filtering algorithm (2.11) is the fact that only first-order statistics, i.e., means, and second-order statistics, i.e., covariance matrices, of the vector processes of interest are involved. In other words, it suffices to know these statistics for the system noise $\xi$, observational noise $\zeta$, and initial error $\hat{x}_0 - x_0$. These will provide the estimate $\hat{x}_k$ as well as its error covariance $P_k$ at all future times. This property is important since in practice it is usually difficult to obtain even this much information about the random errors one wishes to filter: it is well nigh impossible to obtain more, i.e., to prescribe higher-order statistics. Moreover, the first-order and second-order statistics of the error processes can be determined adaptively, i.e., by the filtering algorithm itself (Chin, 1979, and references therein).

Gaussian processes are in fact completely determined by their first and second-order statistics. Furthermore, the Central Limit Theorem (e.g. Parzen, 1960, Sec. 8.5 and 10.4) states, in its various forms, that the superposition of a large number of random effects is approximately Gaussian, regardless of the distribution of the individual effects. It is reasonable, therefore, to expect our errors, which come from a large number of sources, to be approximately Gaussian. It follows that retention of only first and second-order statistics in the filtering algorithm (2.11) should be rather satisfactory. Actually, when $\xi$, $\zeta$ and $x_0$ are Gaussian, it is known (e.g. Jazwinski, 1970, Sec. 5.2) that the best possible nonlinear estimate of $x$, i.e., one which might depend nonlinearly on all the observations, is still our linear estimate $\hat{x}$.

The sequential nature of the filter implies in particular that, in the absence of further observations at times $k > N$, the best prediction is simply given by (2.11a, 2.11e'):

$$\hat{x}_{k+1}(-) = \Psi \hat{x}_k(+), \tag{2.14a}$$

$$\hat{x}_{k+1}(+) = \hat{x}_{k+1}(-), \quad k = N, N+1, N+2, \ldots. \tag{2.14b}$$

The covariance of this predictor is then given by (2.11b, 2.11d'). This corresponds roughly to what is done in operational practice: an initial state $\hat{x}_N$ is determined by 4-D data assimilation from all observations up to and including the synoptic time of interest. Then the forecast model is integrated forward in time from the initial state obtained, without further use of the data. We intend therefore to study only the assimilation and initialization problem, over an assimilation interval $k = 0, 1, 2, \ldots, N$.

The framework of estimation theory can provide also insight into the nature of the system noise $(\xi, Q)$ and ways to its determination. It could lead to improvements in modeling and hence forecasting, by helping to pinpoint deterministic components of $\xi$. This, however, is not our purpose here. With these remarks, we turn to the description of the dynamic system to which the theory outlined in this section will be applied.
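The filtering algorithm (2.11a-e) of Sec. 2.2 can be sketched as follows. This is a minimal illustration and not the authors' implementation: the damped-rotation model, the noise levels, the random seed and the function names are all hypothetical choices made for the example. The gains are computed first from (2.11b,c,d) alone, since those steps do not involve the data, and are then applied to one simulated realization of (2.5a), (2.6a).

```python
import numpy as np


def kalman_gains(Psi, H, Q, R, P0, n_steps):
    """Iterate (2.11b,c,d); neither estimates nor observations are needed."""
    identity = np.eye(P0.shape[0])
    P_plus, gains, covariances = P0, [], []
    for _ in range(n_steps):
        P_minus = Psi @ P_plus @ Psi.T + Q                        # (2.11b)
        K = P_minus @ H.T @ np.linalg.inv(H @ P_minus @ H.T + R)  # (2.11c)
        P_plus = (identity - K @ H) @ P_minus                     # (2.11d)
        gains.append(K)
        covariances.append((P_minus, P_plus))
    return gains, covariances


def kalman_estimates(Psi, H, gains, x0_hat, observations):
    """Apply (2.11a,e) with precomputed gains to observations z_1, z_2, ..."""
    x_plus, estimates = x0_hat, []
    for K, z in zip(gains, observations):
        x_minus = Psi @ x_plus                       # (2.11a)
        x_plus = x_minus + K @ (z - H @ x_minus)     # (2.11e)
        estimates.append(x_plus)
    return estimates


# Hypothetical damped-rotation model; only the first component is observed.
angle = 0.3
Psi = 0.95 * np.array([[np.cos(angle), np.sin(angle)],
                       [-np.sin(angle), np.cos(angle)]])
H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[0.25]])
P0 = np.eye(2)
n_steps = 400

gains, covariances = kalman_gains(Psi, H, Q, R, P0, n_steps)

# Simulate one realization of (2.5a), (2.6a) and filter it.
rng = np.random.default_rng(0)
x = rng.multivariate_normal(np.zeros(2), P0)   # true initial state, mean zero
observations = []
for _ in range(n_steps):
    x = Psi @ x + rng.multivariate_normal(np.zeros(2), Q)
    observations.append(H @ x + rng.multivariate_normal(np.zeros(1), R))
estimates = kalman_estimates(Psi, H, gains, np.zeros(2), observations)

P_minus, P_plus = covariances[-1]
print("gain change over the last step:", np.abs(gains[-1] - gains[-2]).max())
print("trace P(-), trace P(+):", np.trace(P_minus), np.trace(P_plus))
```

Separating the gain computation from the state update mirrors the remark that the gain sequence may be precomputed for all realizations. The final pair of covariances can also be used to check the equivalent forms (2.12a,b) numerically.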

3. ESTIMATION FOR THE SHALLOW-WATER EQUATIONS

3.1 The Equations

The shallow-water equations are a simple system whose solutions exhibit some of the properties of large-scale atmospheric flow. They have certain important characteristics in common with the more complicated, three-dimensional systems currently used in NWP models.

We shall study here a linear, spatially one-dimensional version of the equations, written in cartesian coordinates for a plane tangent to the Earth at latitude $\theta_0$:

$$u_t + U u_x + \phi_x - f v = 0, \tag{3.1a}$$

$$v_t + U v_x + f u = 0, \tag{3.1b}$$

$$\phi_t + U \phi_x + \Phi u_x - f U v = 0. \tag{3.1c}$$

The coordinate $x$ points eastward, in the zonal direction, along the circle of latitude $\theta = \theta_0$, while $y$ points northward, in the meridional direction; $u$ and $v$ are velocity components in the $x$ and $y$ directions, $f = 2\Omega \sin\theta_0$ is the Coriolis parameter, with $\Omega$ the angular velocity of the Earth. The geopotential $\phi = gh$ measures the deviation of the height $h + H$ of the free surface from its equilibrium value $H$, with $\Phi = gH$; $U$ is a constant zonal mean flow velocity. All quantities are independent of $y$.

These equations are derived from the full, nonlinear shallow-water equations on a tangent plane,

$$u_t + u u_x + v u_y + \phi_x - f v = 0, \tag{3.2a}$$

$$v_t + u v_x + v v_y + \phi_y + f u = 0, \tag{3.2b}$$

$$\phi_t + u \phi_x + v \phi_y + \phi\,(u_x + v_y) = 0, \tag{3.2c}$$

by linearization around a solution $(u = U, v = 0, \phi = \Phi)$ satisfying $fU = -\Phi_y = \text{const}$. We assume in this derivation that the perturbation quantities, i.e., the deviations from equilibrium values of $(u, v, \phi)$, do not depend on $y$.

It is advantageous to work with a constant-coefficient system, as long as the basic phenomena of interest are not obscured by this simplification. Hence $f$, $U$ and $\Phi$ in (3.1) are taken to be constants. The variation with latitude of the Coriolis parameter $f$, however, has an important effect on planetary flows. This so-called $\beta$-effect, $\beta = f_y(\theta_0) \ne 0$, is equivalent to the effect of bottom topography, $\Phi_y \ne 0$, in the tangent-plane approximation (Pedlosky, 1979, Sec. 3.17 and Ch. 6). The term $(-fUv)$ in (3.1c) introduces this effect into the solutions of the system considered, without sacrificing the simplicity of constant coefficients.

All the solutions $w = (u, v, \phi)$ of (3.1) can be expressed as a superposition of plane waves:

$$w = \tilde{w}\, e^{i\ell(x - c_\ell t)}, \tag{3.3a}$$

where $\ell$ is the wave number. For each $\ell$, the speed of the individual waves, $c_\ell$, is given by the dispersion relation

$$(c_\ell - U)^3 - \left(\Phi + \frac{f^2}{\ell^2}\right)(c_\ell - U) - \frac{f^2 U}{\ell^2} = 0. \tag{3.3b}$$

This is a cubic equation for $c_\ell$, which has three real roots, $c_\ell^{(m)}$, $m = 1, 2, 3$. It turns out that (3.1) has two types of solutions: slow waves, corresponding to

$$c_\ell^{(1)} = U - \frac{f^2 U}{f^2 + \ell^2 \Phi}, \tag{3.4a}$$

and fast waves, with

$$c_\ell^{(2,3)} = U \pm \left(\Phi + \frac{f^2}{\ell^2}\right)^{1/2} + \frac{f^2 U}{2\,(f^2 + \ell^2 \Phi)}. \tag{3.4b}$$

In the absence of the $\beta$-like term $(-fUv)$ in (3.1c), the last term in (3.4a), as well as in (3.4b), would not be present. The expressions in (3.4) are actually exact to first order in the small nondimensional parameter $U/\Phi^{1/2}$.

The slow waves are the meteorologically important ones, which correspond to the slow traveling of planetary waves on which synoptic weather systems are superimposed. Their speed is comparable to that of the mean zonal current, $U = O(10\ \text{m/sec})$, and they retrogress: their propagation relative to the mean flow is westward. These slow, retrogressing waves are named after Rossby, and they are an important feature of mid-latitude atmospheric dynamics. Their frequency is always smaller than $f = O(10^{-4}\ \text{sec}^{-1})$.

The speed of the fast waves is dominated by the second term in (3.4b), being $O(10^2\ \text{m/sec})$. They are called inertia-gravity waves, since they are the familiar gravity waves of shallow-water theory, for which $c_\ell = U \pm \Phi^{1/2}$, modified by the presence of the Coriolis force. Their dispersion and dissipation plays a role in the mechanism of geostrophic adjustment, which maintains the atmosphere in a quasi-geostrophic state, in which the Coriolis force nearly balances the pressure-gradient force. But they carry very little energy at any given time, and appear mostly as higher frequency oscillations superimposed on the meteorologically significant ones, i.e., as meteorological noise.

An important problem in NWP is the filtering of the fast waves in large-scale numerical forecasts in order to prevent their spurious growth to amplitudes larger than those found in the atmosphere. This filtering is different from that discussed in the previous section: the need for it stems from the two time scales of the deterministic motion itself, rather than from the presence of extraneous, random noise perturbing the deterministic motion. In particular, the fast waves can be eliminated or reduced at the initial time of the forecast by an initialization procedure (Bengtsson, 1975; Leith, 1980; and references therein). Once such an initialization has been performed, the fast waves will not grow excessively over periods of time comparable to the evolution time of the slow waves (Browning et al., 1979; Bube and Ghil, 1980).

Clearly, the two problems of: a) determining a solution to the forecast equations from a continuous stream of noisy data (4-D data assimilation) and b) rendering this solution as free of fast waves as possible (initialization) are related. The connection was stressed and a first step in the direction of their joint solution taken in Ghil (1980) and Leith (1980), among others.

We shall present in the sequel a systematic way of combining the two aspects of filtering within our framework. This will involve a modification of the standard K-B filter outlined in the previous section. Before proceeding with this modification, we shall discretize the equations. This corresponds to what is done in operational practice and will bring the problem to the form (2.5a).

3.2 Discretization

The discretization chosen for (3.1) is in terms of finite differences. Finite-difference models are still the most widely used in NWP. They also facilitate somewhat the assimilation of observations made in irregular patterns. Spectral models, on the other hand, have certain advantages with regard to the initialization aspect of our problem. An analysis similar to the present one should be easy to reproduce for spectral, finite-element or hybrid models.

The grid

$$x_j = j\,\Delta x,\quad j = 1, 2, \ldots, M, \qquad t_k = k\,\Delta t,\quad k = 0, 1, 2, \ldots \tag{3.5a}$$

is introduced, with $\mathbf{w}_k^j$ approximating $\mathbf{w}(j\Delta x, k\Delta t)$. Then the state vector $\mathbf{x}_k$ of Sec. 2 corresponds simply to $\mathbf{w}_k$, the compound vector of the $M$ grid values $\mathbf{w}_k^j$ (3.5b), so that $n = 3M$.

We used the Richtmyer two-step formulation of the Lax-Wendroff scheme (Richtmyer and Morton, 1967, Sec. 12.7 and 13.4). Let $A$ and $B$ denote the matrices of system (3.1),

$$\mathbf{w}_t = A\,\mathbf{w}_x + B\,\mathbf{w}.$$

The scheme can then be written as

$$\mathbf{w}_{k+1/2}^{j+1/2} = \tfrac{1}{2}\left(\mathbf{w}_k^{j+1} + \mathbf{w}_k^{j}\right) + \frac{\Delta t}{2}\,(\mathbf{w}_t)_k^{j+1/2}, \tag{3.6a}$$

$$\mathbf{w}_{k+1}^{j} = \mathbf{w}_k^{j} + \Delta t\,(\mathbf{w}_t)_{k+1/2}^{j} = \mathbf{w}_k^{j} + \frac{\Delta t}{\Delta x}\,A\left(\mathbf{w}_{k+1/2}^{j+1/2} - \mathbf{w}_{k+1/2}^{j-1/2}\right) + \frac{\Delta t}{2}\,B\left(\mathbf{w}_{k+1/2}^{j-1/2} + \mathbf{w}_{k+1/2}^{j+1/2}\right). \tag{3.6b}$$

Eqs. (3.6) define the dynamics $\Psi$ of our system for the state vector $\mathbf{x}_k$, which is given by (3.5) in terms of $\mathbf{w}_k$.
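As an illustration of the two-step scheme (3.6), the following is a minimal sketch in plain Python, not the authors' program. It assumes periodic boundary conditions in j, and it evaluates the half-step tendency in (3.6a) with a centered difference and a two-point average, by analogy with (3.6b). The names `matvec`, `rhs` and `richtmyer_step` are hypothetical, and the matrices `A` and `B` of system (3.1) must be supplied by the caller, since they are not reproduced here.

```python
def matvec(M, v):
    """Multiply a small dense matrix (given as a list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def rhs(A, B, w_left, w_right, dx):
    """Approximate w_t = A w_x + B w midway between two neighboring values."""
    diff = [(r - l) / dx for l, r in zip(w_left, w_right)]   # centered w_x
    mean = [0.5 * (l + r) for l, r in zip(w_left, w_right)]  # averaged w
    return [a + b for a, b in zip(matvec(A, diff), matvec(B, mean))]


def richtmyer_step(w, A, B, dx, dt):
    """Advance the periodic grid function w (a list of M vectors) by one dt."""
    M = len(w)
    # Step 1, cf. (3.6a): provisional values at points j+1/2, time k+1/2.
    half = []
    for j in range(M):
        wl, wr = w[j], w[(j + 1) % M]
        wt = rhs(A, B, wl, wr, dx)
        half.append([0.5 * (l + r) + 0.5 * dt * t
                     for l, r, t in zip(wl, wr, wt)])
    # Step 2, cf. (3.6b): full step using the values at j-1/2 and j+1/2.
    # half[j - 1] with j = 0 wraps around to the last entry (periodicity).
    new = []
    for j in range(M):
        wt = rhs(A, B, half[j - 1], half[j], dx)
        new.append([x + dt * t for x, t in zip(w[j], wt)])
    return new
```

Calling `richtmyer_step` repeatedly plays the role of applying the matrix $\Psi$ to the compound state vector; for a stable run the time step would have to respect the usual Courant-type restriction set by the fastest wave speed.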

The discrete system (3.6) has the same type of slow and fast plane wave solutions as the continuous system (3.1). Their dispersive properties are similar. The Lax-Wendroff (L-W) scheme was used because its numerical dissipation is very useful in our simple model in order to simulate the physical dissipation mechanisms active in the atmosphere. Such mechanisms are also present in more complex NWP models, and they are essential for geostrophic adjustment to occur.

The plane waves of the continuous system (3.1) are better approximated by those of the Richtmyer two-step version of the L-W scheme (3.6) than by those of the standard, one-step version. In particular, in the absence of the B-term, the slow, quasi-geostrophic wave of (3.6) satisfies $u_k^j = 0$, as that of (3.1) satisfies $u(x,t) = 0$. This turned out to be a useful check on the departure of solutions $\mathbf{w}_k^j$ from geostrophy.

3.3 The Modified K-B Filter

In the absence of any constraints on the dynamics, it is clear from Sec. 2 that the K-B filter corresponds to the solution of an optimization problem: obtaining a minimum variance estimator for the system $(\Psi, H)$. We have seen in Sec. 3.1 that, in the application of interest here, it is desirable to select among the solutions of the discrete evolution operator $\Psi$ defined by (3.5, 3.6) a special subset: solutions with fast components which are vanishing or small. Obtaining a best estimate in the presence of such a constraint corresponds to a constrained optimization problem. We shall study in this subsection an appropriate modification of the standard K-B filter described in Sec. 2.

All solutions of (3.6) can be represented as a superposition of plane waves, similar to (3.3a). For the purposes of this discussion it is actually more convenient to think of $\mathbf{x}_k$ in its physical interpretation $\mathbf{w}_k$, so that we write $\mathbf{w}_k^j \approx \mathbf{w}(j\Delta x, k\Delta t)$. For each wave number $\ell$, there is a slow wave $\mathbf{w}_\ell^{(1)} e^{i\ell j\Delta x}$ with speed $c_\ell^{(1)}$, and two fast waves $\mathbf{w}_\ell^{(2,3)} e^{i\ell j\Delta x}$ with speeds $c_\ell^{(2,3)}$ of opposite signs. The decay factors $\lambda_\ell^{(m)}$, $|\lambda_\ell^{(m)}| < 1$, are present due to the dissipation in the difference scheme (3.6).

Denote by $R$ (for Rossby waves) the span of the real parts of all compound n-vectors, $n = 3M$, of the form

$$\left(\mathbf{w}_\ell^{(m)} e^{i\ell\Delta x},\ \ldots,\ \mathbf{w}_\ell^{(m)} e^{i\ell M\Delta x}\right) \tag{3.7b}$$

for $m = 1$, as $\ell$ ranges over all possible wave numbers. $R$ is the slow wave space. The fast wave space is denoted by $G$ (for gravity waves), and is defined as the span of the real parts of all n-vectors of the form (3.7b) for $m = 2, 3$, as $\ell$ again ranges over all possible wave numbers.

The spaces $R$ and $G$ are subspaces of Euclidean n-space $E^n$, and they span $E^n$, $R \oplus G = E^n$. Furthermore, $R$ and $G$ are invariant under the matrix $\Psi$ defined by (3.6), i.e., they are invariant subspaces of the system's dynamics. Indeed, the eigenvectors of $\Psi$ are precisely the set of all vectors of the form (3.7b). Hence any vector $\mathbf{x}$ in $R$ will be advanced by the unperturbed system (3.6) to a vector $\Psi\mathbf{x}$ in $R$ after a time step $\Delta t$. Similarly, a vector $\mathbf{y}$ in $G$ will evolve to $\Psi\mathbf{y}$ in $G$. Notice, however, that $R$ and $G$ are not orthogonal to each other.

Given any n-vector $\mathbf{x}$, there is a unique vector $\mathbf{y}$ in $R$ which is closest to $\mathbf{x}$ in the sense that $\|\mathbf{y} - \mathbf{x}\|^2 = (\mathbf{y} - \mathbf{x})^T(\mathbf{y} - \mathbf{x})$ is minimized. This vector $\mathbf{y}$ is called the orthogonal projection of $\mathbf{x}$ onto $R$, and is denoted by $\mathbf{y} = \Pi\mathbf{x}$. The orthogonal projection operator $\Pi$ is a symmetric matrix, $\Pi^T = \Pi$, and satisfies $\Pi^2 = \Pi$. The orthogonal projection is found in $O(n \log n)$ arithmetic operations by performing three Fast Fourier Transforms (FFTs) on the vector $\mathbf{x}$ (one for each component $u$, $v$, $\phi$), then multiplying by a block diagonal matrix comprised of $M$ 3x3 blocks, then performing three inverse FFTs. We assume in the sequel that

$$\Pi\,\mathbf{w}_0 = \mathbf{w}_0, \tag{3.8}$$

i.e., that initialization has been performed and $\mathbf{w}_0$ lies in $R$ already. These concepts are discussed and similar notation is used in Leith (1980).

Having defined the operator $\Pi$, we are now ready to describe the modified filter. Let

$$\boldsymbol{\nu}_k = \mathbf{z}_k - H\,\mathbf{x}_k(-) \tag{3.9a}$$

denote the innovation vector at time k. This vector carries the new information contained in the observations at time k; $\mathbf{x}_k(-)$ carries all the previously accumulated information, as propagated by the dynamic system. The filtering step (2.11e) is now written

$$\mathbf{x}_k(+) = \mathbf{x}_k(-) + K_k\,\boldsymbol{\nu}_k. \tag{3.9b}$$

What is desired is that, for all k, $\mathbf{x}_{k+1}(+)$ lies in $R$. It is clear by inspection of Eqs. (2.11 a,e,e') that, under assumption (3.8), this will be the case provided that, for all k, $K_{k+1}\boldsymbol{\nu}_{k+1}$ lies in $R$. As the observation vector $\mathbf{z}_{k+1}$ is a noisy perturbation of the noisy true state $\mathbf{x}_{k+1}$, cf. Eqs. (2.5a, 2.6a), the correction vector $K_{k+1}\boldsymbol{\nu}_{k+1}$ does not, in general, lie in $R$.

What we seek, then, is a modified filter $K^*_{k+1}$, possibly depending on $\boldsymbol{\nu}_{k+1}$, which has the property that $K^*_{k+1}\boldsymbol{\nu}_{k+1}$ will lie in $R$. This filter is found by minimizing trace $P_{k+1}(+)$, as before, but subject now to the constraint that the correction vector lie in $R$. A Lagrange multiplier method was used to solve this constrained optimization problem. The result is simply that

$$K^*_{k+1} = \Pi\,K_{k+1}, \tag{3.10a}$$

independently of $\boldsymbol{\nu}_{k+1}$, where $K_{k+1}$ is the standard Kalman-Bucy filter.
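The projected correction can be illustrated on a toy problem. The sketch below is an assumption-laden illustration in plain Python, not the FFT-based procedure described in the text: it builds the orthogonal projector onto a small subspace by Gram-Schmidt orthonormalization and then applies a gain multiplied by that projector to an innovation. The function names, the three-dimensional "slow" subspace and the gain matrix used in the usage note are all hypothetical.

```python
def dot(a, b):
    """Euclidean inner product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt: an orthonormal basis for the span of the given vectors."""
    basis = []
    for v in vectors:
        u = list(v)
        for q in basis:
            c = dot(q, u)
            u = [ui - c * qi for ui, qi in zip(u, q)]
        norm = dot(u, u) ** 0.5
        if norm > tol:  # skip vectors already in the span
            basis.append([ui / norm for ui in u])
    return basis


def project(basis, x):
    """Orthogonal projection of x onto the span of an orthonormal basis."""
    y = [0.0] * len(x)
    for q in basis:
        c = dot(q, x)
        y = [yi + c * qi for yi, qi in zip(y, q)]
    return y


def pi_filter_update(x_minus, K, innovation, basis):
    """Return x(-) + Pi K nu, where Pi projects onto the span of `basis`."""
    correction = [dot(row, innovation) for row in K]  # K nu
    return [a + b for a, b in zip(x_minus, project(basis, correction))]
```

For example, with `basis = orthonormal_basis([[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]])`, a forecast `x_minus = [2.0, 3.0, 1.0]` that already lies in the subspace, and any 3x2 gain `K`, the updated state returned by `pi_filter_update` again lies in the subspace, which is the property the modified filter is designed to preserve.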

For the modified filter, therefore, we replace Eq. (2.11e) by

$$\mathbf{x}_{k+1}(+) = \mathbf{x}_{k+1}(-) + \Pi\,K_{k+1}\,\boldsymbol{\nu}_{k+1}. \tag{3.10b}$$

Since this filter is no longer optimal, we must also replace (2.11 d) by (2.10 b), in which $K_{k+1}$ is replaced by $\Pi K_{k+1}$.

In the sequel we shall call the modified algorithm the Π-filter, while the standard algorithm will be called the K-filter. We shall see in Sec. 4 that the Π-filter produces optimal estimates of the slow waves at the expense of estimation errors only slightly larger than the fast-wave contaminated estimates produced by the K-filter.

3.4 Observational Pattern and Choice of Parameters

In the present article we shall restrict ourselves to the study of a "classical" observational pattern, corresponding to the conventional meteorological upper-air network: all quantities $(u, v, \phi)$ are observed over "land", and none over the "ocean". This is only meant to serve as an illustration of K-B filtering in a meteorologically familiar situation. Clearly, the power of this approach lies in its ability to handle observations which are arbitrarily distributed in space and time.

Our physical domain is an interval of the x-axis of length 2L. It is meant to correspond to a circle of latitude near 45° N. Hence $f = 10^{-4}\ \mathrm{sec}^{-1}$ and $L = 14000$ km. The actual distribution of land and ocean at this latitude has been simplified to be 2-periodic, so that in each interval of length L, half is covered by ocean (Pacific or Atlantic), and half by land (North America or Eurasia). It is reasonable to consider, therefore, only 2-periodic solutions of (3.1), and consequently our computational domain is of length L. We consider the left half of the computational L-domain to be covered by land, and the right half to be covered by ocean, so that our observation matrix is $H = (I \;\; 0)$.

The mean flow about which (3.2) is linearized was taken to have $U = 20$ m/s and $\Phi = 3 \times 10^4\ \mathrm{m^2/s^2}$. The value of U is typical for mid-tropospheric flow at this latitude; $\Phi$ corresponds to an equivalent depth for a homogeneous atmosphere of $H \approx 3$ km, which gives realistic phase speeds for inertia-gravity waves. The slow waves in the solution of (3.1) which we wish to estimate will travel across the fundamental L-domain of one continent and one ocean in a time of approximately L/U; cf. (3.4a), the actual time, for our choice of L, U, $\Phi$ and f, is roughly 12 days.

For a single wave number $\ell$, initial data for the continuous system (3.1) which lead only to slow waves are, to first order in the small parameter: [display illegible in source]

[Figure: expected rms estimation error (Erms) versus time in days. Panels: a) Erms over land; b) Erms over the ocean; c) Erms over the entire L-domain.]

Components of the actual rms error in the estimation of an individual realization of our system, with given initial data $\mathbf{w}_0$ and a given sequence of observation errors (k = 1, 2, ...), appear in Fig. 3a. The corresponding components of relative m-s error, $\|\mathbf{w}_k - \hat{\mathbf{w}}_k\|^2 / \operatorname{trace} P_k$, where $\|\mathbf{y}\|$ is the length of the vector $\mathbf{y}$, are given in Fig. 3b. For a different random choice of $\mathbf{w}_0$ and of the observation error sequence, the same plots appear in Figs. 3c, 3d.

It is easy to see that the rms error in an individual realization decreases to the observational noise level in about the same time as its expected value (compare Fig. 3a and 3c with Fig. 2c). After that time, however, it continues to fluctuate near zero, rather than decaying monotonically to zero. Our experimental results (Figs. 3b,d) show that the spread of individual estimation errors about their expected value is not too large. It would be interesting, in general, to know a priori how large this spread is expected to be, and we intend to study this question further.

We show in Fig. 4 the actual time histories at a number of grid points for the estimated solution corresponding to the realization in Figs. 3a,b; $u$ is shown in Fig. 4a, $v$ in Fig. 4b and $\phi$ in Fig. 4c. We chose to show a point on the West Coast of the continent, labeled SF (for San Francisco), one on the East Coast, labeled NY (for New York), and one in the middle of the ocean, labeled HA (for Hawaii). Note that "Tokyo" = "New York" by periodicity.


Fig. 3 Components of the actual root-mean-square (rms) estimation error and the relative mean-square (m-s) estimation error, for two realizations of the experiment whose Erms errors appear in Fig. 2. The two realizations correspond to two different random choices of initial condition $\mathbf{w}_0$ and of observational noise, sampled from the same respective probability distributions, with covariance matrices $P_0$ and $R$, respectively. a) The rms errors for a realization of the run whose Erms errors are given in Fig. 2. The curves have the same labels, and are plotted on the same scale, as in Fig. 2. b) Relative m-s errors for the same realization. The three panels give the ratio of total m-s error over land, ocean, and the entire region, respectively, to the corresponding expected m-s error. c) Same as Fig. 3a, for a different realization. d) Same as Fig. 3b, for the run in Fig. 3c. Notice the difference between Fig. 3a and Fig. 3c: the actual rms estimation error in two realizations of the filtering process can be quite different. Figs. 3b and 3d show the variation of the relative m-s error: equality of absolute m-s error and expected m-s error corresponds to a value of 1 in these figures.

[Figure: time histories at the locations SF, HA and NY, plotted against time in days (0 to 10).]

Fig. 4 Time history of the estimated solution without system noise, at three locations, labeled HA (for Hawaii, a mid-ocean location), SF (for San Francisco, a West Coast location), and NY (for New York, an East Coast location). a) u-component of velocity; b) v-component of velocity; c) $\phi$, the geopotential. [Remainder of caption illegible in source.]

[Figure: time histories at the same locations, plotted against time in days (0 to 10).]

Same as Fig. 4, only for a run using our modified K-B filter, the Π-filter. The fast waves have been completely eliminated and the estimated solution is slowly varying.

[...] is due to its station, SL, being located [...]

[Fig. 7: influence functions for the stations SL (an interior continental location), SF and NY (see Fig. 4), obtained from the gain matrix K of the K-B algorithm, in nine panels 7a-7i. Panel titles surviving in the figure residue include PHI - PHI, PHI - U, U - PHI and V - U. The rest of the caption is illegible in the source.]

It is positive over land, becoming nearly zero at SF and NY and slightly negative out into the ocean. The relative small size and symmetry of [...] stations in the middle of a data-dense region: neighboring [...] receive almost equal weights and advection plays but a small role. [...] centered at SL is the [...] They are the only ones to have the latter properties. [...] its y-components were [...] system noise covariance matrix Q: [...]

In fact, the peak for the SF influence function is slightly higher than the NY peak. Moreover, the former is located one grid point West of SF, rather than at SF itself, while the NY peak is at NY. Both data density and advection thus play a role. It makes sense for the point upstream of SF to give even more weight to SF information than SF itself: SF is closer

[...] is simply $\Pi K_\infty$ [...] form of the Π-filter, which gave practically the same results as the Π-filter itself.

[Figure: time histories plotted against time in days (0 to 10).]

Fig. 10 Same as Fig. 9, but for a run using the Π-filter, rather than the K-filter. The fast oscillations have been eliminated.

4.3 Theoretical Analysis of the Scalar Case

In order to help explain some of the qualitative features of the numerical results in Sec. 4.1 and 4.2, we perform an asymptotic analysis of the filtering equations for a scalar state x. The matrices of interest will now actually be scalars, and we assume $\Psi \neq 0$, $H = 1$, $Q \geq 0$, $R > 0$, and $P_0 > 0$. The positivity assumptions on $Q$, $R$, and $P_0$ are due to the fact that they are variances. We assume furthermore that the observation is performed only every r time steps; in the numerical experiments reported herein we would have r = 24, as $\Delta t = 30$ min. and observations are taken at the standard 12 hour synoptic intervals.

The filtering algorithm (2.11) yields in this case

$$P_k(-) = \Psi^2 P_{k-1}(+) + Q, \tag{4.5a}$$

$$P_k(+) = \begin{cases} P_k(-), & k \neq jr, \\ P_k(-)\,R \,/\, \big(P_k(-) + R\big), & k = jr. \end{cases} \tag{4.5b}$$

From Eq. (4.5b), one immediately finds that

$$P_{jr}(+) < P_{jr}(-), \qquad P_{jr}(+) < R \tag{4.5c,d}$$

at observation times.

Defining

$$S_j = P_{jr}(+), \qquad j = 0, 1, 2, \ldots,$$

to be the estimation error variance after the jth observation, and the quantities

$$A = \Psi^{2r}, \qquad B = \sum_{p=0}^{r-1} \Psi^{2p},$$

Eqs. (4.5a,b) can be converted into a nonlinear difference equation for $S_j$:

$$S_j = g(S_{j-1}), \tag{4.6a}$$

where

$$g(s) = \frac{(As + BQ)\,R}{As + BQ + R}. \tag{4.6b}$$

To determine the asymptotic behavior of Eq. (4.6), we distinguish the cases $Q = 0$ and $Q > 0$. In the case of a perfect model, $Q = 0$, the solution of (4.6) may be written explicitly, for $A \neq 1$, as

$$S_j = \frac{A^j (A - 1)\, S_0 R}{A S_0 (A^j - 1) + (A - 1) R}.$$
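The reduction of the step-by-step variance equations (4.5a,b) to a recursion between observation times can be checked numerically. The following is a minimal sketch in plain Python; the function names `run_filter` and `g` and the parameter values suggested in the usage note are illustrative assumptions, not values taken from the paper's experiments.

```python
def run_filter(psi, Q, R, P0, r, n_obs):
    """Iterate the scalar variance equations (4.5a,b).

    Returns the list of analysis variances P(+) obtained right after each
    of the n_obs observations, one observation every r time steps.
    """
    P = P0
    S = []
    for k in range(1, n_obs * r + 1):
        P = psi ** 2 * P + Q        # forecast step, cf. (4.5a)
        if k % r == 0:              # observation time k = j r
            P = P * R / (P + R)     # analysis step, cf. (4.5b)
            S.append(P)
    return S


def g(s, psi, Q, R, r):
    """Map S_{j-1} to S_j, with A = psi^(2r) and B = sum of psi^(2p), p < r."""
    A = psi ** (2 * r)
    B = sum(psi ** (2 * p) for p in range(r))
    N = A * s + B * Q               # forecast variance just before observing
    return N * R / (N + R)
```

Starting from `s = P0` and applying `g` repeatedly should reproduce, up to rounding, the list returned by `run_filter` for the same parameters, for instance `psi = 0.98`, `Q = 0.02`, `R = 1.0`, `P0 = 4.0`, `r = 24`.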

Eq. (4.7c) states that for stable dynamics, i.e., $|\Psi| \leq 1$, the estimation error variance, and hence the filter $K_{jr} = S_j/R$, always tend to zero. System (3.1) itself is conservative, while our difference scheme (3.6) is dissipative. Hence all eigenvalues of the difference scheme matrix $\Psi$ have modulus less than or equal to unity. We are thus in the stable case (4.7c) and it is only to be expected that our estimation error covariance matrices $P_k$ and filter $K_k$ approach zero in the absence of system noise. This is in accordance with our discussion of Fig. 2 in Sec. 4.1.

We wish to determine now the asymptotic nature of $S_j$ in the case $Q > 0$, i.e., in case system noise is present. The quadratic equation

$$s = g(s) \tag{4.8}$$

has a positive discriminant, hence it has real roots. Its free term is negative, hence the roots are of opposite sign. Let $S_+$ denote the unique positive root of (4.8). Notice that

$$\frac{dg}{ds} = \frac{A R^2}{(As + BQ + R)^2} > 0.$$

Therefore $g(s)$ is a monotone increasing function of s, with $g(0) > 0$, while its derivative is monotone decreasing and tending to zero as $s \to +\infty$. It follows that the root $S_+$ of (4.8) is approached monotonically by the solutions of the recursion (4.6a) (Isaacson and Keller, 1966, Ch. 3.1, Fig. 2a),

$$S_j \to S_+ \quad \text{as } j \to \infty.$$

If $S_0$ is greater than (less than) $S_+$, then $S_j$ will decrease (increase) monotonically to $S_+$. Since $K_{jr} = S_j/R$, we have also

$$K_{jr} \to S_+/R \quad \text{as } j \to +\infty,$$

the convergence being monotone as well. This is in accordance with the monotone decrease of trace $P_{jr}(+)$ in Fig. 6; in fact $P_{jr}(-)$ decreases also monotonically. Furthermore, $S_+$ is independent of $S_0 = P_0$, and hence the asymptotic filter $K_\infty = S_+/R$ is independent of $P_0$ as well.

To find an approximate value for $S_+$, we now assume that $|\Psi| < 1$ and $r \gg 1$, so that $A = \Psi^{2r} \ll 1$. Then the quadratic term in Eq. (4.8) is negligible and we have, approximately,

$$S_+ = \frac{QR}{Q + (1 - \Psi^2) R}. \tag{4.9a}$$

It follows in general, by analogy with (2.3b) and (4.5d), that

$$S_+ \leq \min\{R,\ Q/(1 - \Psi^2)\},$$

at least approximately. In particular, when the observational error variance R is small enough, so that

$$(1 - \Psi^2) R \ll Q, \tag{4.9b}$$

then $S_+$ is roughly equal to R, and the size of Q has little influence. If, on the other hand, the observational error variance is large enough so that

$$(1 - \Psi^2) R \gg Q, \tag{4.9c}$$

then $S_+$ is roughly proportional to Q, and the size of R has little influence.

The two extreme cases (4.9b,c) explain much of the qualitative nature of the results in Sec. 4.2. Indeed, a matrix version of Eq. (4.9b) is satisfied over land, and we see that the expected m-s errors at observation times are approximately equal to the observational error variances. In fact, they are slightly smaller than R; this is in accordance with (4.5d), as well as (4.9a,b). It also agrees with operational experience, as stated in Sec. 3.4.

Over the ocean, we can write $R = \infty$, so that a matrix version of (4.9c) is satisfied. Experiments with different magnitudes of Q (not shown) have confirmed that the size of Q is indeed the determining factor in the size of P over the ocean. This also agrees with operational experience: analysis error in data-poor regions is essentially equal to the error in forecasts from one analysis time (synoptic or subsynoptic) to the next.

The asymptotic properties of P and K discussed above have counterparts in the full, vector-matrix case, under the assumptions of complete observability and complete controllability. These assumptions concern properties of the matrices $\Psi$ and H; they are satisfied for our model. The interested reader is referred to Bucy and Joseph (1968, Ch. 5) and to Jazwinski (1970, Sec. 7.6) for a full discussion of the general results.
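The asymptotic statements of this subsection can be checked numerically for the scalar case. The sketch below, in plain Python, computes the positive root of the quadratic s = g(s) and compares it with the approximation (4.9a) and with the two limiting regimes. The function names and all parameter values are illustrative assumptions; the quadratic coefficients are obtained by clearing the denominator in s = g(s).

```python
import math


def coefficients(psi, r):
    """A = psi^(2r) and B = sum_{p=0}^{r-1} psi^(2p)."""
    A = psi ** (2 * r)
    B = sum(psi ** (2 * p) for p in range(r))
    return A, B


def g(s, psi, Q, R, r):
    """One observation cycle of the variance recursion, S_j = g(S_{j-1})."""
    A, B = coefficients(psi, r)
    N = A * s + B * Q
    return N * R / (N + R)


def asymptotic_variance(psi, Q, R, r):
    """Positive root S+ of A s^2 + (B Q + R - A R) s - B Q R = 0."""
    A, B = coefficients(psi, r)
    b = B * Q + R - A * R
    c = -B * Q * R
    disc = b * b - 4.0 * A * c
    # Equivalent to (-b + sqrt(disc)) / (2 A), but well behaved as A -> 0.
    return -2.0 * c / (b + math.sqrt(disc))


def approx_variance(psi, Q, R):
    """Approximation (4.9a), assuming |psi| < 1 and r >> 1."""
    return Q * R / (Q + (1.0 - psi ** 2) * R)
```

With, say, `psi = 0.9` and `r = 24`, the exact root and the approximation agree closely; making R very small drives the root toward R, while making R very large drives it toward Q/(1 - psi^2), in line with the two regimes discussed above.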

5. CONCLUDING REMARKS

We have shown the ways in which the concepts and formalism of sequential estimation theory are relevant to the 4-D assimilation of meteorological data, by applying them to a simple model. The stochastic-dynamic model used for the illustration of the theory was governed by the linear shallow-water equations, including the β-effect of latitudinal changes in planetary vorticity. The dynamics of this model are similar to those of operational NWP models in that they admit as solutions slow, quasi-geostrophic Rossby waves, as well as fast inertia-gravity waves.

We have modified the standard Kalman-Bucy (K-B) filter in order to obtain optimal estimates of the slow, meteorologically significant waves, while eliminating entirely the fast, undesirable waves. In this way, our modified K-B filter achieves simultaneously the optimal 4-D assimilation of data and the initialization of model states for the purpose of noise-free forecasts.

It was shown that the optimal filter for the linear problem converges rapidly to an asymptotic matrix, the Wiener filter. Furthermore, the asymptotic filter (W-filter) performs nearly as well as the exact, time-varying filter (K-B filter). The Wiener filter depends on system dynamics $\Psi$, observational pattern H, system noise covariance Q and observational noise covariance R; it does not depend on the initial errors, $P_0$.

The rapid convergence and good performance of the W-filter for the linear problem hold hope for the filtering of nonlinear problems with similar dynamic and stochastic properties. Nonlinear estimation theory is not completely understood mathematically, at least not in a practically applicable form. The filter which is most widely used in engineering applications is the extended K-B filter (EKF: Bucy and Joseph, 1968, Ch. 8; Gelb, 1974, Ch. 6; Jazwinski, 1970, Ch. 6 and Ch. 9).

The principle of the EKF is simple: linearize the problem around an estimated state, and apply the corresponding linear filter over a time interval, T say, over which the solution of the nonlinear problem is not expected to change much. In large-scale NWP this time could equal 6 h to 24 h. After time T, relinearize around the new state and proceed. Clearly, T is limited by the dynamics of the system in the problem. When choosing T, a trade-off between accuracy and expediency has to be made.

The EKF gives good results when the true characteristic correlation time $\tau$ of the perturbations, which we model as white noise, i.e., $\tau = 0$, is actually short compared to T, $\tau \ll T$. This is certainly the case in NWP. Moreover, it appears from our experience with linear problems that: the asymptotic filter for each one of the successive linearizations will work sufficiently well, and there is no need to compute time-varying filters over a time interval T; the dependence of the W-filter on (linear) system dynamics $\Psi$ is rather weak.


We conclude that the W-filter for the succeeding T-interval will be easily computable from the W-filter valid over the preceding T-interval by a perturbation procedure.

It is more realistic in NWP to let the observation matrix H be a function of time, H = H(t), rather than a constant. Different observations are made at the synoptic times, 0000 GMT and 1200 GMT, the subsynoptic times, 0600 GMT and 1800 GMT, and in between. The distribution of observations, however, is not too far from being time-periodic. Let H*(t) be a periodic matrix function of time, with period 24 h, and assume that the actual observation matrix, H(t), differs from H*(t) at every t only by a matrix of small rank, i.e., only by the presence, absence or location of relatively few observations.

The asymptotic filter corresponding to $(\Psi(t), H^*(t))$, K*(t) say, should also be periodic, with a period of 24 h, rather than constant. We expect some easily computed modification of K*(t) to be a good approximation to the optimal filter for $(\Psi(t), H(t))$. We plan to study, therefore, time-periodic observation patterns and their modifications.

Sequential estimation accounts explicitly for the fact that the system whose state we wish to estimate is governed by certain dynamics. It is this aspect of the theory which distinguishes it from the so-called "optimal interpolation" currently used in operational NWP. The latter essentially assumes that the system obeys trivial dynamics, $\Psi = I$.

We saw already that taking into account the dynamics allowed us to unify the assimilation and initialization aspects of preparing initial data for numerical weather forecasts. Furthermore, it allows us to account in a systematic way for the advection of information from data-rich to data-poor areas, and of "negative information", i.e., large errors, from data-poor to data-rich regions.

In particular, the theory shows that weighting coefficients for observations should be skewed in the direction of the prevailing winds, with larger weights upstream; the amount of skewness should depend on average wind intensity, i.e., on season. Also, the weights used on the Western edge of the continents should be different from those used on the Eastern edge.

The present results on the anisotropy and inhomogeneity of estimation error structure in the zonal direction should also be supplemented by the results on inhomogeneity in the latitudinal direction of Ghil et al. (1980). It is this aspect of the theory which we expect to have the largest impact in terms of improving operational procedures.

One further aspect of the theory merits attention: the covariance matrices Q and R do not need to be prescribed a priori. They can be determined in the estimation process (e.g., Chin, 1979; Ohap and Stubberud, 1976) by using an adaptive filter. The determination of system noise would have important consequences for predictability theory, as well as for stochastic modifications of numerical schemes (Faller and Schemm, 1977). The determination of observational noise, eliminating the well known problem of "ground truth",

would also help greatly in improving operational objective

analysis and data assimilation.

Acknowledgement. The potential usefulness of estimation theory in meteorological applications was first pointed out to one of us (M.G.) by Edward S. Sarachik. It is a pleasure to acknowledge many useful discussions on linear and nonlinear filtering with George Papanicolaou. Practical aspects of the filters' eventual implementation in an operational environment were discussed with Wayman Baker, Mark Cane, Milton Halem and Eugenia Kalnay-Rivas. The final version of the manuscript benefited from comments by G. Cats, R. Daley, A. Hollingsworth, A. Lorenc and O. Talagrand, made during and after the Seminar. The manuscript could not have been prepared without the whole-hearted support of Constance Engle. Our work on this problem has been supported by NASA Grants NSG-5034 and NSG-5130, administered by the Laboratory for Atmospheric Sciences, Goddard Space Flight Center. Computations were carried out on the CDC 6600 of the Courant Mathematics and Computing Laboratory, New York University, under Contract DE-AC02-76ER03077 with the U. S. Department of Energy.

References

Bengtsson, L., 1975: 4-Dimensional Assimilation of Meteorological Observations, GARP Publications Series, No. 15, World Meteorological Organization - International Council of Scientific Unions, CH-1211 Geneva 20, Switzerland, 76 pp.

Bergman, K. H., 1979: Multivariate analysis of temperatures and winds using optimum interpolation, Mon. Wea. Rev. 107, 1423-1444.

Browning, G., A. Kasahara and H.-O. Kreiss, 1979: Initialization of the primitive equations by the bounded derivative method, NCAR Ms. 0501-79-4, National Center for Atmospheric Research, Boulder, Colo. 80307, 44 pp.

Bube, K. P., and M. Ghil, 1980: Assimilation of asynoptic data and the initialization problem, this volume.

Bucy, R. S., and P. D. Joseph, 1968: Filtering for Stochastic Processes with Applications to Guidance, Wiley-Interscience, New York, 195 pp.

Charney, J., M. Halem and R. Jastrow, 1969: Use of incomplete historical data to infer the present state of the atmosphere, J. Atmos. Sci. 26, 1160-1163.

Chin, L., 1979: Advances in adaptive filtering, in Control and Dynamic Systems, Vol. 15, C. T. Leondes (ed.), Academic Press, 278-356.

Curtain, R. F., and A. J. Pritchard, 1978: Infinite Dimensional Linear Systems Theory, Lecture Notes in Control and Information Sciences, Vol. 8, Springer-Verlag, New York, 97 pp.

Davis, M.H.A., 1977: Linear Estimation and Stochastic Control, Halsted Press, John Wiley and Sons, New York, 224 pp.

Faller, A. J., and C. E. Schemm, 1977: Statistical corrections to numerical prediction equations II, Mon. Wea. Rev., 105, 37-56.

Fleming, R. J., T. M. Kaneshige and W. E. McGovern, 1979a: The global weather experiment I. The observational phase through the first special observing period, Bull. Amer. Met. Soc., 60, 649-659.

______, and T. E. Bryan, 1979b: The global weather experiment II. The second special observing period, Bull. Amer. Met. Soc., 60, 1316-1322.

Gelb, A. (ed.), 1974: Applied Optimal Estimation, The M.I.T. Press, Cambridge, Mass., 374 pp.

Ghil, M., 1980: The compatible balancing approach to initialization, and four-dimensional data assimilation, Tellus, 32, 198-206.

______, R. C. Balgovind and E. Kalnay-Rivas, 1980: A stochastic-dynamic model for global atmospheric mass field statistics, NASA Tech. Memo. 82009, NASA Goddard Space Flight Center, Greenbelt, Md. 20771, 50 pp.

______, M. Halem and R. Atlas, 1979: Time-continuous assimilation of remote-sounding data and its effect on weather forecasting, Mon. Wea. Rev., 107, 140-171.

______, and R. Mosebach, 1978: Asynoptic variational method for satellite data assimilation, in Halem et al. (1978), pp. 3.32-3.49.

Halem, M., M. Ghil, R. Atlas, J. Susskind and W. J. Quirk, 1978: The GISS sounding temperature impact test, NASA Tech. Memo. 78063, NASA Goddard Space Flight Center, Greenbelt, Md. 20771, 421 pp. [NTIS N7831667].

Isaacson, E., and H. B. Keller, 1966: Analysis of Numerical Methods, Wiley, New York, 541 pp.

Jazwinski, A. H., 1970: Stochastic Processes and Filtering Theory, Academic Press, New York, 376 pp.

Jones, R. H., 1965a: Optimal estimation of initial conditions for numerical prediction, J. Atmos. Sci., 22, 658-663.

______, 1965b: An experiment in nonlinear prediction, J. Appl. Meteor., 4, 701-705.

Kalman, R. E., 1960: A new approach to linear filtering and prediction problems, Trans. ASME, Ser. D: J. Basic Eng., 82, 35-45.

______, and R. S. Bucy, 1961: New results in linear filtering and prediction theory, Trans. ASME, Ser. D: J. Basic Eng., 83, 95-108.

Leith, C. E., 1978: Objective methods for weather prediction, Ann. Rev. Fluid Mech., 10, 107-128.

______, 1980: Nonlinear normal mode initialization and quasi-geostrophic theory, J. Atmos. Sci., 37, 958-968.

Lorenz, E. N., 1969: The predictability of a flow which possesses many scales of motion, Tellus, 21, 289-307.

McPherson, R. D., K. H. Bergman, R. E. Kistler, G. E. Rasch and D. S. Gordon, 1979: The NMC operational global data assimilation system, Mon. Wea. Rev., 107, 1445-1461.

Miyakoda, K., and O. Talagrand, 1971: The assimilation of past data in dynamical analysis. I, Tellus, 23, 310-317.

Ohap, R. F., and A. R. Stubberud, 1976: Adaptive minimum variance estimation in discrete-time linear systems, in Control and Dynamic Systems, Vol. 12, C. T. Leondes (ed.), Academic Press, 583-624.

Palmén, E., and C. W. Newton, 1969: Atmospheric Circulation Systems, Academic Press, New York, 603 pp.

Parzen, E., 1960: Modern Probability Theory and Its Applications, Wiley, New York, 464 pp.

Pedlosky, J., 1979: Geophysical Fluid Dynamics, Springer-Verlag, New York, 624 pp.

Petersen, D. P., 1968: On the concept and implementation of sequential analysis for linear random fields, Tellus, 20, 673-686.

______, 1970: Algorithms for sequential and random observations, Meteorol. Mono., 11, 100-109.

______, 1973a: Transient suppression in optimal sequential analysis, J. Appl. Meteor., 12, 437-440.

______, 1973b: A comparison of the performance of quasi-optimal and conventional objective analysis schemes, J. Appl. Meteor., 12, 1093-1101.

______, 1973c: Static and dynamic constraints on the estimation of space-time covariance and wavenumber-frequency spectral fields, J. Atmos. Sci., 30, 1252-1266.

______, 1976: Linear sequential coding of random space-time fields, Info. Sci., 10, 217-241.

Phillips, N. A., 1971: Ability of the Tadjbakhsh method to assimilate temperature data in a meteorological system, J. Atmos. Sci., 28, 1325-1328.

______, 1976: The impact of synoptic observing and analysis systems on flow pattern forecasts, Bull. Amer. Met. Soc., 57, 1225-1240.

Richtmyer, R. D., and K. W. Morton, 1967: Difference Methods for Initial-Value Problems, 2nd ed., Wiley-Interscience, New York, 405 pp.

Rutherford, I. D., 1972: Data assimilation by statistical interpolation of forecast error fields, J. Atmos. Sci., 29, 809-815.

Schlatter, T. W., 1975: Some experiments with a multivariate statistical objective analysis scheme, Mon. Wea. Rev., 103, 246-257.

______, G. W. Branstator and L. G. Thiel, 1976: Testing a global multivariate statistical objective analysis scheme with observed data, Mon. Wea. Rev., 104, 765-783.

______, 1977: Reply (to Comments by H. J. Thiebaux on "Testing a global ..."), Mon. Wea. Rev., 105, 1465-1468.

Tadjbakhsh, I. G., 1969: Utilization of time-dependent data in running solution of initial value problems, J. Appl. Meteor., 8, 389-391.

Wiener, N., 1949: Extrapolation, Interpolation and Smoothing of Stationary Time Series with Engineering Applications, Wiley, New York, 163 pp.

Appendix A. List of Major Symbols

c^(m) (subscripted by the wave number)    for solutions of the continuous system (3.1), the three phase speeds, m = 1, 2, 3, corresponding to a given wave number, as defined by Eq. (3.3b). The same notation is used for the phase speeds of solutions of the discrete system (3.6), which agree with the phase speeds of solutions of the continuous system to order (Δt)^2.

E    expectation, or ensemble averaging, operator

f    Coriolis parameter, f = 2Ω sin 45° ≈ 10^-4 s^-1

G    fast wave subspace

g    gravitational acceleration constant of the Earth

H    observation matrix; defines the observed linear combinations of state variables

j    superscript indicating the spatial grid point, x = jΔx

K    Kalman gain matrix, defined by (2.11c)

k    subscript indicating the time, t = kΔt

L    length of the computational domain, about half the circumference of the Earth at 45°N

[symbol illegible]    wave number; the wave number times L/2π is an integer

M    number of grid points in the L-domain

n    number of state variables, n = 3M; the dimension of x

P_k(-)    estimation error covariance matrix just prior to observations at time k

P_k(+)    estimation error covariance matrix just after observations at time k

p    number of observations at a synoptic time; the dimension of z

Q    covariance matrix of the system noise

R    slow wave subspace

R    covariance matrix of the observational noise

t    time

U    mean zonal wind component in the linearization of (3.2)

u    perturbation zonal wind component

v    perturbation meridional wind component

v_0    initial amplitude of v

w    a solution (u, v, φ)^T of the continuous system (3.1)

w_k^j    (u_k^j, v_k^j, φ_k^j), a solution of the discrete system (3.6)

x    distance along the spatial L-domain

x_k    "true" atmospheric state given by the stochastic model (2.5)

x̂_k(-)    estimated atmospheric state just prior to observations at time k

x̂_k(+)    estimated atmospheric state just after observations at time k

z_k    vector of observations, given by (2.6)

[symbol illegible]    observational noise, or random part of the observation model (2.6)

[symbol illegible]    system noise, or random part of the atmospheric model (2.5)

Δt    temporal increment in the discrete system (3.6)

Δx    spatial increment in the discrete system (3.6)

Π    orthogonal projection operator onto the slow wave subspace R

Φ    mean geopotential at 45°N in the linearization of (3.2), i.e., g times the mean equivalent atmospheric height at 45°N

φ    perturbation geopotential

φ_0    initial amplitude of φ

Ψ    matrix defining the atmospheric dynamics, given for our model by Eqs. (3.6)

Ω    angular rate of rotation of the Earth
